week 7 – Monday – Shivakumar Pasem

In today’s class we discussed k-means clustering.

K-means clustering: K-means clustering is a widely used technique in statistical analysis and unsupervised machine learning for partitioning a dataset into distinct groups, or clusters, based on the similarity of data points. It is a straightforward and effective method for grouping data points with similar characteristics.

K-means clustering aims to divide a dataset into ‘K’ clusters, where each data point belongs to the cluster with the nearest mean (centroid). The ‘K’ in K-means represents the number of clusters, which is typically pre-specified. The algorithm iteratively refines the clusters by assigning data points to the nearest centroid and recalculating the centroid as the mean of the points in each cluster.
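The assign-and-update loop described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code used in class: the function name kmeans_1d, the made-up list of ages, and the way the starting centroids are picked (spread evenly across the sorted data) are all assumptions chosen to keep the example small and repeatable.

```python
def kmeans_1d(values, k, max_iter=100):
    """Basic k-means on a list of numbers; returns (centroids, labels)."""
    # Assumed initialisation: pick k starting centroids spread across the sorted data.
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid becomes the mean of the points assigned to it.
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Made-up ages, for illustration only.
ages = [22, 25, 27, 29, 41, 44, 46, 49, 52, 55, 58, 60]
centroids, labels = kmeans_1d(ages, k=3)
print(centroids)
print(labels)
```

Note that K has to be chosen before the loop starts, and that a different starting choice of centroids can lead to a different final grouping.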

In our project we are using k-means clustering to divide the dataset into a few clusters. Age is the variable used to form the clusters, so the data can be grouped into age ranges such as 20-30, 40-50, and 50-60. The difference in means between groups can then be calculated, and from that difference we can obtain a p-value.
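As a rough sketch of how a difference in group means can lead to a p-value, the example below runs a permutation test on two groups. The notes do not say which test the project uses, so the permutation test is an assumption, and so are the function name permutation_p_value, the group names, and the measurement values, which are invented purely for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test for the difference in means of two groups."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding in the sums.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical measurements for two age groups (invented values).
group_20_30 = [3.1, 2.8, 3.5, 3.0, 2.9]
group_50_60 = [4.2, 4.0, 4.6, 3.9, 4.4]
observed_diff, p_value = permutation_p_value(group_20_30, group_50_60)
print(observed_diff, p_value)
```

A small p-value here means that a difference in means at least as large as the observed one rarely appears when the group labels are shuffled at random.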
