Posts

Showing posts with the label Data Mining

Which Machine Learning Algorithm To Use?

Terminologies

We learnt a few machine learning terminologies and algorithms in this blog. Supervised means we rely on labelled training data. It is task driven, working towards a defined goal. Unsupervised means the training data is unlabelled. It is data driven, aiming to identify a pattern. Classification arranges data into classes/categories using a labelled dataset. Regression develops a model to predict continuous numerical values. Clustering separates an unlabelled dataset into clusters/groups of similar objects. Classification is a supervised learning task, while clustering is an unsupervised one. Regression is considered supervised learning because the model is trained using both the input features and the output labels, which can be numerical values. I will mention here that two other unsupervised approaches are: Association, to identify underlying relationships, and Dimension Reduction, to reduce the number of dimensions/features to make calculations simpler. I did not cover any methods on a
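To make the three main terms above concrete, here is a minimal Python sketch using only the standard library. The toy numbers, the function names (`fit_line`, `fit_centroids`, `classify`, `kmeans_1d`) and the choice of nearest-centroid and one-dimensional k-means are my own assumptions for illustration; they are not necessarily the methods covered in the posts. The point is only to show that the supervised functions receive labels or target values, while the clustering function receives the inputs alone.

```python
from statistics import mean


def fit_line(xs, ys):
    """Regression (supervised): least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def fit_centroids(xs, labels):
    """Classification (supervised): store the mean input value of each class."""
    return {lab: mean([x for x, l in zip(xs, labels) if l == lab])
            for lab in set(labels)}


def classify(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


def kmeans_1d(xs, k=2, iterations=10):
    """Clustering (unsupervised): 1-D k-means, no labels used. Assumes k >= 2."""
    lo, hi = min(xs), max(xs)
    # Start with centres spread evenly between the smallest and largest value.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centres[i]))
            groups[nearest].append(x)
        # Keep the old centre if a group ends up empty.
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return sorted(centres)


# Hypothetical toy data, invented for this sketch.
hours = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]                     # input feature
outcome = ["fail", "fail", "fail", "pass", "pass", "pass"]  # class labels
score = [3.0, 5.0, 7.0, 17.0, 19.0, 21.0]                   # numerical targets

line = fit_line(hours, score)                # supervised: inputs + numbers
centroids = fit_centroids(hours, outcome)    # supervised: inputs + labels
centres = kmeans_1d(hours, k=2)              # unsupervised: inputs only
```

Here `classify(centroids, 4.0)` picks whichever class centroid is closest to 4.0, whereas `kmeans_1d` only reports where the groups sit; naming those groups is left to the analyst.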

Data Mining or Machine Learning

I covered a number of statistical tests using Excel LAMBDA. The reason for using Excel LAMBDA was its ubiquity and undemanding learning curve. While there are more statistical inference tests, I only covered those that I commonly use. If, however, you think there are other common ones, please let me know. I would be interested as well.

Data Mining or Machine Learning

When I started data analysis, the term data mining made sense. The techniques used within data mining are intended to identify patterns within a data set. The problem came when I searched for more on a data mining topic: the results kept popping up under machine learning. Machine learning is the process of computers learning in a way that mimics human learning, or through algorithms. To accomplish this, machine learning uses data mining techniques, as the process requires identification of patterns. While there is a difference between data mining and machine learning, do not be surprised by the overlap or if you start wond