Nearest-neighbor chain algorithm: Difference between revisions

Content deleted Content added
The algorithm: more on tie-breaking
m Correctness: split long sentence
Line 78:
:{{math|''d''(''A'' ∪ ''B'', ''C'') ≥ min(d(''A'',''C''), d(''B'',''C''))}}.
 
If a distance function has the reducibility property, then merging two clusters {{mvar|C}} and {{mvar|D}} can only cause the nearest neighbor of {{mvar|E}} to change if that nearest neighbor was one of {{mvar|C}} and {{mvar|D}}. This has two important consequences for the nearest neighbor chain algorithm:. firstFirst, it can be shown using this property that, at each step of the algorithm, the clusters on the stack {{mvar|S}} form a valid chain of nearest neighbors, because whenever a nearest neighbor becomes invalidated it is immediately removed from the stack.
 
Second, and even more importantly, it follows from this property that, if two clusters {{mvar|C}} and {{mvar|D}} both belong to the greedy hierarchical clustering, and are mutual nearest neighbors at any point in time, then they will be merged by the greedy clustering, for they must remain mutual nearest neighbors until they are merged. It follows that each mutual nearest neighbor pair found by the nearest neighbor chain algorithm is also a pair of clusters found by the greedy algorithm, and therefore that the nearest neighbor chain algorithm computes exactly the same clustering (although in a different order) as the greedy algorithm.