Hierarchical clustering: Difference between revisions

Citation bot (talk | contribs)
Added doi-access. | Use this bot. Report bugs. | Suggested by Jay8g | #UCB_toolbar
Tag: Reverted
Revert CITESPAM
Tags: Undo references removed
Line 18:
 
== Cluster Linkage ==
In order to decide which clusters should be combined (for agglomerative), or where a cluster should be split (for divisive), a measure of dissimilarity between sets of observations is required. In most methods of hierarchical clustering, this is achieved by use of an appropriate [[distance]] ''d'', such as the Euclidean distance, between ''single'' observations of the data set, and a linkage criterion, which specifies the dissimilarity of ''sets'' as a function of the pairwise distances of observations in the sets.<ref>{{Cite journal |last=Wani |first=Aasim Ayaz |date=2024-08-29 |title=Comprehensive analysis of clustering algorithms: exploring limitations and innovative solutions |url=https://peerj.com/articles/cs-2286/ |journal=PeerJ Computer Science |language=en |volume=10 |pages=e2286 |doi=10.7717/peerj-cs.2286 |doi-access=free |issn=2376-5992}}</ref> The choice of metric as well as linkage can have a major impact on the result of the clustering: the lower-level metric determines which objects are most [[similarity measure|similar]], whereas the linkage criterion influences the shape of the clusters. For example, complete-linkage tends to produce more spherical clusters than single-linkage.
 
The linkage criterion determines the distance between sets of observations as a function of the pairwise distances between observations.
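
As an illustration of how a linkage criterion is a function of pairwise distances, the following minimal sketch computes three common criteria (single, complete and average linkage) for two small sets of points. The function names and the sample sets <code>A</code> and <code>B</code> are hypothetical and chosen only for this example; the Euclidean distance <code>math.dist</code> stands in for the metric ''d''.

```python
import math
from itertools import product


def pairwise_distances(a, b, d=math.dist):
    """Return d(x, y) for every x in set a and every y in set b."""
    return [d(x, y) for x, y in product(a, b)]


def single_linkage(a, b, d=math.dist):
    """Smallest pairwise distance between the two sets."""
    return min(pairwise_distances(a, b, d))


def complete_linkage(a, b, d=math.dist):
    """Largest pairwise distance between the two sets."""
    return max(pairwise_distances(a, b, d))


def average_linkage(a, b, d=math.dist):
    """Mean of all pairwise distances between the two sets."""
    dists = pairwise_distances(a, b, d)
    return sum(dists) / len(dists)


# Hypothetical example data: two small sets of 2-D observations.
A = [(0.0, 0.0), (0.0, 1.0)]
B = [(3.0, 0.0), (4.0, 0.0)]

print(single_linkage(A, B))    # closest pair: (0, 0) and (3, 0)
print(complete_linkage(A, B))  # farthest pair: (0, 1) and (4, 0)
print(average_linkage(A, B))   # mean over all four pairs
```

Passing a different function as <code>d</code> changes the underlying metric without changing the linkage criterion, which mirrors the separation between the two choices described above.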