Hierarchical clustering: Difference between revisions

Content deleted Content added
AnomieBOT (talk | contribs)
Rescuing orphaned refs (":5" from rev 1299220194)
Citation bot (talk | contribs)
Add: bibcode, authors 1-1. Removed URL that duplicated identifier. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Abductive | Category:Cluster analysis algorithms | #UCB_Category 17/42
 
(3 intermediate revisions by 3 users not shown)
Line 5:
In [[data mining]] and [[statistics]], '''hierarchical clustering'''<ref name="HC">{{cite book |first=Frank |last=Nielsen | title=Introduction to HPC with MPI for Data Science | year=2016 | publisher=Springer |isbn=978-3-319-21903-5 |pages=195–211
|chapter=8. Hierarchical Clustering | url=https://www.springer.com/gp/book/9783319219028 |chapter-url=https://www.researchgate.net/publication/314700681 }}</ref> (also called '''hierarchical cluster analysis''' or '''HCA''') is a method of [[cluster analysis]] that seeks to build a [[hierarchy]] of clusters. Strategies for hierarchical clustering generally fall into two categories:
* '''Agglomerative''': Agglomerative clustering, often referred to as a "bottom-up" approach, begins with each data point as an individual cluster. At each step, the algorithm merges the two most similar clusters based on a chosen distance metric (e.g., Euclidean distance) and linkage criterion (e.g., single-linkage, complete-linkage).<ref name=":4">{{Cite journal |last1=Murtagh |first1=Fionn |last2=Contreras |first2=Pedro |date=2012 |title=Algorithms for hierarchical clustering: an overview |url=https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.53 |journal=WIREs Data Mining and Knowledge Discovery |language=en |volume=2 |issue=1 |pages=86–97 |doi=10.1002/widm.53 |issn=1942-4795|url-access=subscription }}</ref> This process continues until all data points are combined into a single cluster or a stopping criterion is met (see the illustrative sketch following this list). Agglomerative methods are more commonly used due to their simplicity and computational efficiency for small to medium-sized datasets.<ref>{{Cite journal |last=Mojena |first=R. |date=1977-04-01 |title=Hierarchical grouping methods and stopping rules: an evaluation |url=https://academic.oup.com/comjnl/article-lookup/doi/10.1093/comjnl/20.4.359 |journal=The Computer Journal |language=en |volume=20 |issue=4 |pages=359–363 |doi=10.1093/comjnl/20.4.359 |issn=0010-4620}}</ref>
* '''Divisive''': Divisive clustering, known as a "top-down" approach, starts with all data points in a single cluster and recursively splits the cluster into smaller ones. At each step, the algorithm selects a cluster and divides it into two or more subsets, often using a criterion such as maximizing the distance between resulting clusters. Divisive methods are less common but can be useful when the goal is to identify large, distinct clusters first.
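
The following is a minimal, illustrative Python sketch of the agglomerative procedure described above, assuming Euclidean distance and single linkage; the function names and toy data are examples rather than part of any standard library.

<syntaxhighlight lang="python">
# Minimal sketch of agglomerative ("bottom-up") clustering with single linkage.
# Not optimized: the naive search for the closest pair is O(n^2) per merge step.
from math import dist  # Euclidean distance between two points (Python 3.8+)

def single_linkage(a, b):
    """Dissimilarity of two clusters: the smallest pairwise point distance."""
    return min(dist(x, y) for x in a for y in b)

def agglomerative(points, k):
    """Repeatedly merge the two closest clusters until only k clusters remain."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] += clusters.pop(j)  # merge the closest pair into one cluster
    return clusters

data = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.1, 4.9), (9.0, 1.0)]
print(agglomerative(data, k=3))
# [[(1.0, 1.0), (1.2, 0.8)], [(5.0, 5.0), (5.1, 4.9)], [(9.0, 1.0)]]
</syntaxhighlight>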
 
Line 18:
 
== Cluster linkage ==
To decide which clusters should be combined (for agglomerative), or where a cluster should be split (for divisive), a measure of dissimilarity between sets of observations is required. In most methods of hierarchical clustering, this is achieved by using an appropriate [[distance]] ''d'', such as the Euclidean distance, between ''single'' observations of the data set, and a linkage criterion, which specifies the dissimilarity of ''sets'' as a function of the pairwise distances of observations in the sets. The choice of metric as well as linkage can have a major impact on the result of the clustering: the lower-level metric determines which objects are most [[similarity measure|similar]], whereas the linkage criterion influences the shape of the clusters.<ref name=":5">{{Cite journal |last=Wani |first=Aasim Ayaz |date=2024-08-29 |title=Comprehensive analysis of clustering algorithms: exploring limitations and innovative solutions |journal=PeerJ Computer Science |language=en |volume=10 |pages=e2286 |doi=10.7717/peerj-cs.2286 |issn=2376-5992 |pmc=11419652 |pmid=39314716 |doi-access=free}}</ref> For example, complete-linkage tends to produce more spherical clusters than single-linkage.
 
The linkage criterion determines the distance between sets of observations as a function of the pairwise distances between observations.
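
As an illustration of how the choice of linkage criterion changes the dissimilarity assigned to the same pair of clusters, the short sketch below compares single and complete linkage on two toy clusters; the data and variable names are only examples.

<syntaxhighlight lang="python">
# Compare single and complete linkage on the same pair of toy clusters.
from math import dist  # Euclidean distance between two points

A = [(0.0, 0.0), (0.0, 1.0)]
B = [(3.0, 0.0), (8.0, 0.0)]

pairwise = [dist(a, b) for a in A for b in B]

single_linkage = min(pairwise)    # distance of the closest pair of observations
complete_linkage = max(pairwise)  # distance of the farthest pair of observations

print(single_linkage)    # 3.0
print(complete_linkage)  # about 8.06
</syntaxhighlight>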
Line 76:
- \min_{m\in B} \sum_{y\in B} d(m,y)</math>
|-
|Medoid linkage<ref>{{Cite conference |last1=Miyamoto |first1=Sadaaki |last2=Kaizu |first2=Yousuke |last3=Endo |first3=Yasunori |date=2016 |title=Hierarchical and Non-Hierarchical Medoid Clustering Using Asymmetric Similarity Measures |url=https://ieeexplore.ieee.org/document/7801678 |conference=2016 Joint 8th International Conference on Soft Computing and Intelligent Systems (SCIS) and 17th International Symposium on Advanced Intelligent Systems (ISIS) |pages=400–403 |doi=10.1109/SCIS-ISIS.2016.0091|url-access=subscription }}</ref><ref>{{Cite conference |date=2016 |title=Visual Clutter Reduction through Hierarchy-based Projection of High-dimensional Labeled Data| conference=Graphics Interface |url=https://graphicsinterface.org/wp-content/uploads/gi2016-14.pdf | first1=Dominik|last1=Herr|first2=Qi|last2=Han|first3=Steffen|last3=Lohmann| first4=Thomas |last4=Ertl |access-date=2022-11-04 |website=Graphics Interface |language=en-CA |doi=10.20380/gi2016.14}}</ref>
|<math>d(m_A, m_B)</math> where <math>m_A</math>, <math>m_B</math> are the medoids of the previous clusters
|-
Line 86:
* The probability that candidate clusters spawn from the same distribution function (V-linkage).
* The product of in-degree and out-degree on a k-nearest-neighbour graph (graph degree linkage).<ref>{{Cite book|last1=Zhang|first1=Wei|last2=Wang|first2=Xiaogang|last3=Zhao|first3=Deli|last4=Tang|first4=Xiaoou|title=Computer Vision – ECCV 2012 |chapter=Graph Degree Linkage: Agglomerative Clustering on a Directed Graph |date=2012|editor-last=Fitzgibbon|editor-first=Andrew|editor2-last=Lazebnik|editor2-first=Svetlana|editor2-link= Svetlana Lazebnik |editor3-last=Perona|editor3-first=Pietro|editor4-last=Sato|editor4-first=Yoichi|editor5-last=Schmid|editor5-first=Cordelia|series=Lecture Notes in Computer Science|language=en|publisher=Springer Berlin Heidelberg|volume=7572|pages=428–441|doi=10.1007/978-3-642-33718-5_31|isbn=9783642337185|arxiv=1208.5092|bibcode=2012arXiv1208.5092Z|s2cid=14751}} See also: https://github.com/waynezhanghk/gacluster</ref>
* The increment of some cluster descriptor (i.e., a quantity defined for measuring the quality of a cluster) after merging two clusters.<ref>{{cite journal |first1=W. |last1=Zhang |first2=D. |last2=Zhao |first3=X. |last3=Wang |title=Agglomerative clustering via maximum incremental path integral |journal=Pattern Recognition |volume=46 |issue=11 |pages=3056–65 |date=2013 |doi=10.1016/j.patcog.2013.04.013 |citeseerx=10.1.1.719.5355 |bibcode=2013PatRe..46.3056Z}}</ref><ref>{{cite book |last1=Zhao |first1=D. |last2=Tang |first2=X. |chapter=Cyclizing clusters via zeta function of a graph |chapter-url= |title=NIPS'08: Proceedings of the 21st International Conference on Neural Information Processing Systems |date=2008 |isbn=9781605609492 |pages=1953–60 |publisher=Curran |citeseerx=10.1.1.945.1649}}</ref><ref>{{cite journal |first1=Y. |last1=Ma |first2=H. |last2=Derksen |first3=W. |last3=Hong |first4=J. |last4=Wright |title=Segmentation of Multivariate Mixed Data via Lossy Data Coding and Compression |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=29 |issue=9 |pages=1546–62 |date=2007 |doi=10.1109/TPAMI.2007.1085 |pmid=17627043 |bibcode=2007ITPAM..29.1546M |hdl=2142/99597 |s2cid=4591894 |hdl-access=free }}</ref>
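
As an illustrative sketch of the medoid linkage listed in the table above, the following computes the medoid of each cluster (the member with the smallest summed distance to all other members) and returns the distance between the two medoids; Euclidean distance and the toy data are assumptions made for the example.

<syntaxhighlight lang="python">
# Sketch of medoid linkage: the dissimilarity of two clusters is d(m_A, m_B),
# the distance between their medoids.
from math import dist  # Euclidean distance between two points

def medoid(cluster):
    """The member of the cluster minimising the summed distance to all members."""
    return min(cluster, key=lambda m: sum(dist(m, y) for y in cluster))

A = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
B = [(5.0, 5.0), (6.0, 5.0)]

print(dist(medoid(A), medoid(B)))  # d(m_A, m_B), about 7.07 here
</syntaxhighlight>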
 
== Agglomerative clustering example ==