Posts

Showing posts with the label k-means

KMeans clustering - Finding your centre

Image
KMeans clustering is a method to partition your observations into k clusters with k centroids that describes the centre of each cluster . When given a new observation, it is part of a cluster if it is closest to the centroid of that cluster. The diagram above illustrates the k-means clustering concept. The KMeans approach starts by deciding the number of clusters you wish. Then you estimate where the centroids of each cluster might be located. The distance of each observation to each centroid is calculated. Then each observation is re-clustered to the closest centroid. For each new cluster, we re-calculate a new centroid by averaging the cluster data by each feature. We repeat this cycle until no further refinement is achieved. Since Excel LAMBDA does not have iterative loops, a recursive approach will be used. Implementing KMeans clustering in LAMBDA With k-means clustering we implement Predict  before  Fit . Predict Predict takes a list of observations array and a list of centr