Types of Clustering Methods: Overview and Quick Start R Code

Cluster analysis is one of the most important data mining methods for discovering knowledge in multidimensional data. The goal of clustering is to identify patterns or groups of similar objects within a data set of interest.
Each group contains observations with similar profiles according to specific criteria. Similarity between observations is defined using inter-observation distance measures, such as Euclidean and correlation-based distances.
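As a quick illustration of the two distance measures mentioned above, here is a minimal sketch in base R. It assumes the built-in USArrests data set as example data (not a data set prescribed by this guide) and standardizes the variables first, a common but optional step.

```r
# Example data: built-in USArrests (50 states, 4 numeric variables).
# Assumption: variables are standardized before computing distances.
df <- scale(USArrests)

# Euclidean distance between observations (rows)
d_euc <- dist(df, method = "euclidean")

# Correlation-based distance: 1 - Pearson correlation between rows.
# cor() correlates columns, so transpose to compare observations.
d_cor <- as.dist(1 - cor(t(df), method = "pearson"))

# Inspect the distances among the first three states
round(as.matrix(d_euc)[1:3, 1:3], 2)
round(as.matrix(d_cor)[1:3, 1:3], 2)
```

Euclidean distance groups observations with similar magnitudes, whereas the correlation-based distance treats two observations as close when their profiles rise and fall together, regardless of magnitude.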
In the literature, cluster analysis is also referred to as “pattern recognition” or “unsupervised machine learning”: “unsupervised” because we are not guided by a priori ideas of which variables or samples belong in which clusters, and “learning” because the algorithm “learns” how to cluster.
Cluster analysis is popular in many fields, including:
Note that it’s possible to cluster both observations (i.e., samples or individuals) and features (i.e., variables). Observations can be clustered on the basis of variables, and variables can be clustered on the basis of observations.
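The following sketch shows one way to cluster either observations or variables with the same base R functions: transposing the data matrix switches which of the two is being grouped. The USArrests data set, average linkage, and the numbers of groups are illustrative assumptions, not recommendations from this guide.

```r
# Example data: built-in USArrests, standardized (an assumption for this sketch)
df <- scale(USArrests)

# Cluster observations (rows): the 50 states
hc_obs <- hclust(dist(df), method = "average")

# Cluster variables (columns): transpose so that variables become rows
hc_var <- hclust(dist(t(df)), method = "average")

# Cut each tree into an arbitrary number of groups for illustration
groups_obs <- cutree(hc_obs, k = 4)
groups_var <- cutree(hc_var, k = 2)

table(groups_obs)  # number of states per cluster
groups_var         # cluster membership of each variable
```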
Here, we provide a practical guide to unsupervised machine learning or cluster analysis using R software.
This document contains 5 parts.
Part I. Cluster Analysis Basics:
Part II. Partitioning Clustering methods:
Part III. Hierarchical Clustering:
Part IV. Clustering Validation and Evaluation Strategies: