Content deleted Content added
Neelgajare (talk | contribs) m grammar Tags: Reverted Visual edit |
Revert old vandalism |
||
Line 1:
{{good article}}
In the theory of [[cluster analysis]], the '''nearest-neighbor chain algorithm''' is an [[algorithm]] that can speed up several methods for [[agglomerative hierarchical clustering]]. These are methods that take a collection of points as input, and create a hierarchy of clusters of points by repeatedly merging pairs of smaller clusters to form larger clusters. The clustering methods that the nearest-neighbor chain algorithm can be used for include [[Ward's method]], [[complete-linkage clustering]], and [[single-linkage clustering]]; these all work by repeatedly merging the closest two clusters but use different definitions of the distance between clusters. The cluster distances for which the nearest-neighbor chain algorithm works are called ''reducible'' and are characterized by a simple inequality among certain cluster distances.
Line 92:
==Time and space analysis==
Each iteration of the loop performs a single search for the nearest neighbor of a cluster, and either adds one cluster to the stack or removes two clusters from it.
Each of these iterations may spend time scanning as many as {{math|''n'' − 1}} inter-cluster distances to find the nearest neighbor.
Line 202:
===Distances sensitive to merge order===
The above presentation explicitly disallowed distances sensitive to merge order. Indeed, allowing such distances can cause problems. In particular, there exist order-sensitive cluster distances which satisfy reducibility, but for which the above algorithm will return a hierarchy with suboptimal costs. Therefore, when cluster distances are defined by a recursive formula (as some of the ones discussed above are), care must be taken
| last=Müllner
| first=Daniel
|