Nearest-neighbor chain algorithm: Difference between revisions

Content deleted Content added
update murtagh 1983 url, add another footnote from it
Line 12:
In agglomerative clustering methods, the input also includes a distance function defined on the points, or a numerical measure of their dissimilarity.
The distance or dissimilarity should be symmetric: the distance between two points does not depend on which of them is considered first.
However, unlike the distances in a [[metric space]], it is not required to satisfy the [[triangle inequality]].<ref name="murtagh-tcj"/>
 
Next, the dissimilarity function is extended from pairs of points to pairs of clusters. Different clustering methods perform this extension in different ways. For instance, in the [[single-linkage clustering]] method, the distance between two clusters is defined to be the minimum distance between any two points from each cluster. Given this distance between clusters, a hierarchical clustering may be defined by a [[greedy algorithm]] that initially places each point in its own single-point cluster and then repeatedly forms a new cluster by merging the [[closest pair]] of clusters.<ref name="murtagh-tcj"/>
Line 47:
| volume = 26 | issue = 4 | pages = 354–359
| year = 1983
| url = http://thameswww.csmultiresolutions.rhul.ac.ukcom/~fionnstrule/old-articles/Survey_of_hierarchical_clustering_algorithms.pdf
| doi = 10.1093/comjnl/26.4.354}}.</ref>