
In K-Means clustering, what does the acronym WCSS stand for?
by anonymous | 12 views


WCSS stands for within-cluster sum of squares, a criterion also known as the inertia. The KMeans algorithm clusters data by trying to separate samples into n groups of equal variance, minimizing this criterion. The algorithm requires the number of clusters to be specified in advance. It scales well to a large number of samples and has been used across a wide range of application areas in many different fields.
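
To make the definition concrete, here is a minimal sketch of how the within-cluster sum of squares could be computed by hand: for each cluster, take the centroid (the coordinate-wise mean of its points) and add up the squared Euclidean distances from each point to that centroid. The function name wcss and the toy points and labels are made up for illustration; this is not the code of any particular library.

```python
def wcss(points, labels):
    """Within-cluster sum of squares for a list of points and their cluster labels."""
    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid of the cluster: coordinate-wise mean of its members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Add the squared Euclidean distance of every member to the centroid.
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


# Hypothetical toy data: two clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1, 1]

print(wcss(points, labels))  # prints 10.0 (2.0 from cluster 0 plus 8.0 from cluster 1)
```

K-means searches for the assignment of points to clusters that makes this number as small as possible for the chosen number of clusters.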

Inertia can be recognized as a measure of how internally coherent clusters are. It suffers from various drawbacks:

• Inertia makes the assumption that clusters are convex and isotropic, which is not always the case. It responds poorly to elongated clusters, or manifolds with irregular shapes.

• Inertia is not a normalized metric: we just know that lower values are better and zero is optimal. But in very high-dimensional spaces, Euclidean distances tend to become inflated (this is an instance of the so-called “curse of dimensionality”). Running a dimensionality reduction algorithm such as Principal component analysis (PCA) prior to k-means clustering can alleviate this problem and speed up the computations.
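
The point about inertia not being normalized can be illustrated with a small sketch. Under the assumption that the same toy data is expressed once in metres and once in centimetres (hypothetical values chosen for illustration), the clustering is identical but the inertia changes by a factor of 100 squared, so a raw inertia value says little on its own. The helper wcss is an illustrative function, not a library call.

```python
def wcss(points, labels):
    """Within-cluster sum of squares for a list of points and their cluster labels."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid of the cluster: coordinate-wise mean of its members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


# Hypothetical toy data measured in metres, with a fixed cluster assignment.
points_m = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1, 1]

# The same data expressed in centimetres: every coordinate is multiplied by 100.
points_cm = [tuple(100.0 * x for x in p) for p in points_m]

wcss_m = wcss(points_m, labels)
wcss_cm = wcss(points_cm, labels)

# Same clusters, same shape, but the inertia differs by a factor of 100 ** 2.
print(wcss_m, wcss_cm, wcss_cm / wcss_m)
```

For this reason inertia is mainly useful for comparing runs on the same data with the same preprocessing, for example when comparing different numbers of clusters.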

by Diamond (48,764 points)
