Nearest-neighbor chain algorithm: Difference between revisions

Content deleted Content added
top: repeatedly (repeatedly)
m link paths
Line 1:
In the theory of [[cluster analysis]], the '''nearest-neighbor chain algorithm''' is an [[algorithm]] that can be used to perform several types of [[agglomerative hierarchical clustering]], in which a hierarchy of clusters is created by repeatedly merging pairs of smaller clusters to form larger clusters. In particular it can be used for [[Ward's method]], [[complete-linkage clustering]], and [[single-linkage clustering]], which all work by repeatedly merging the closest two clusters under different definitions of the distance between clusters.
 
The main idea of the algorithm is to find pairs of clusters to merge by following [[Path (graph theory)|paths]] in the [[nearest neighbor graph]] of the clusters until the paths terminate in pairs of mutual nearest neighbors. That pair of clusters is then chosen as the pair to merge by the algorithm. The algorithm uses a [[Stack (abstract data type)|stack]] to find and merge mutually nearest pairs of clusters efficiently.
The resulting system of merges occurs in a different order than in a naive implementation of the same clustering algorithms, but can be shown to generate the same hierarchy of clusters.