Clustering is the process of obtaining natural groups from the data. These are groups and not classes because, unlike classification, which analyzes data already labeled with a class, clustering analyzes unlabeled data in order to generate the label. The data are grouped on the principle of maximizing the similarity between the elements of a group while minimizing the similarity between different groups.
That is, groups are formed so that the objects within a group are very similar to each other and, at the same time, very different from the objects in any other group. Clustering algorithms are descriptive rather than predictive.
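The grouping principle can be illustrated with a minimal sketch. The text does not name a specific algorithm, so the example below assumes k-means (Lloyd's algorithm) as one common choice, with Euclidean distance standing in for dissimilarity. The data points, the function name `kmeans`, and the choice of the first `k` points as starting centroids are all hypothetical simplifications made for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Group points into k clusters with plain k-means (Lloyd's algorithm)."""
    # Simplifying assumption: start from the first k points.
    # Practical implementations usually pick starting centroids more carefully.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(xs) / len(members) for xs in zip(*members)
                )
    return labels, centroids


# Hypothetical unlabeled data: two visibly separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]

labels, centroids = kmeans(points, k=2)
print(labels)     # one generated label per point
print(centroids)  # one centroid per discovered group
```

No class labels are supplied at any point; the labels are produced by the algorithm itself, which is the sense in which clustering generates the label rather than learning from it.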
For example, a clustering algorithm may be used to find a group of borrowers whose characteristics resemble those of a borrower who is difficult to assess. If the algorithm finds an appropriate cluster for that borrower, the cluster's average default assessment may be used as an estimate of the borrower's own default assessment.
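The borrower example might be sketched as follows. Everything concrete here is an assumption made for illustration: the two features (debt-to-income ratio and years of credit history), the cluster contents, the default assessments, and the helper names `centroid` and `estimate_default`. The sketch also assumes the clusters have already been formed and that "appropriate cluster" means the one with the nearest centroid.

```python
import math

# Hypothetical borrowers, already grouped by a clustering algorithm.
# Each entry is (features, default_assessment), where
# features = (debt_to_income_ratio, years_of_credit_history).
clusters = {
    "A": [((0.10, 12.0), 0.02), ((0.15, 10.0), 0.03), ((0.12, 11.0), 0.04)],
    "B": [((0.60, 2.0), 0.20), ((0.55, 3.0), 0.25), ((0.65, 1.0), 0.30)],
}


def centroid(members):
    """Mean feature vector of a cluster's members."""
    features = [f for f, _ in members]
    return tuple(sum(xs) / len(features) for xs in zip(*features))


def estimate_default(borrower, clusters):
    """Return the nearest cluster's name and its average default assessment."""
    nearest = min(
        clusters,
        key=lambda name: math.dist(borrower, centroid(clusters[name])),
    )
    scores = [score for _, score in clusters[nearest]]
    return nearest, sum(scores) / len(scores)


# A hard-to-assess borrower described only by the two features.
name, estimate = estimate_default((0.58, 2.5), clusters)
print(name, round(estimate, 3))
```

The features are used unscaled to keep the sketch short, so the one with the larger numeric range dominates the distance. In practice the features would normally be standardized before distances are compared.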