Nearest-neighbor chain algorithm: Difference between revisions

Content deleted Content added
Background: +source
m Correctness: split long sentence
Line 101:
will in general perform its merges in a different order than the nearest-neighbor chain algorithm. Both of these properties depend on the specific choice of how to measure the distance between clusters.<ref name="murtagh-tcj"/>
 
The correctness of this algorithm relies on a property of its distance function called ''reducibility'',. This property was identified by {{harvtxt|Bruynooghe|1977}} in connection with an earlier clustering method that used mutual nearest neighbor pairs but not chains of nearest neighbors.<ref name="b77">{{citation|first=Michel|last=Bruynooghe|title=Méthodes nouvelles en classification automatique de données taxinomiqes nombreuses|journal=Statistique et Analyse des Données|volume=3|pages=24–42|year=1977|url=http://www.numdam.org/item?id=SAD_1977__2_3_24_0}}.</ref> A distance function {{mvar|d}} on clusters is defined to be reducible if, for every three clusters {{mvar|A}}, {{mvar|B}} and {{mvar|C}} in the greedy hierarchical clustering such that {{mvar|A}} and {{mvar|B}} are mutual nearest neighbors, the following inequality holds:<ref name="murtagh-tcj"/>
:{{math|''d''(''A'' ∪ ''B'', ''C'') ≥ min(d(''A'',''C''), d(''B'',''C''))}}.