Revision as of 11:07, 3 July 2023 by MewTheEditor (191 edits): →Divisive clustering: Added the DIANA algorithm.
Revision as of 21:13, 15 August 2023 by OAbot (bot, 643,717 edits): m Open access bot: doi updated in citation with #oabot.
Line 15:
== Complexity ==
The standard algorithm for '''hierarchical agglomerative clustering''' (HAC) has a [[time complexity]] of <math>\mathcal{O}(n^3)</math> and requires <math>\Omega(n^2)</math> memory, which makes it too slow for even medium-sized data sets. However, for some special cases, optimally efficient agglomerative methods (of complexity <math>\mathcal{O}(n^2)</math>) are known: '''SLINK'''<!--boldface per WP:R#PLA--><ref name="SLINK">{{cite journal | author=R. Sibson | title=SLINK: an optimally efficient algorithm for the single-link cluster method | journal=The Computer Journal | volume=16 | issue=1 | pages=30–34 | year=1973 | publisher=British Computer Society | url=http://www.cs.gsu.edu/~wkim/index_files/papers/sibson.pdf | doi=10.1093/comjnl/16.1.30| doi-access=free }}</ref> for [[Single-linkage clustering|single-linkage]] and CLINK<ref name="CLINK">{{cite journal | author=D. Defays | title=An efficient algorithm for a complete-link method | journal=The Computer Journal | volume=20 | issue=4 | pages=364–6 | year=1977 | publisher=British Computer Society | doi=10.1093/comjnl/20.4.364| doi-access=~~free~~ }}</ref> for [[complete-linkage clustering]]. With a [[heap (data structure)|heap]], the runtime of the general case can be reduced to <math>\mathcal{O}(n^2 \log n)</math>, an improvement on the aforementioned bound of <math>\mathcal{O}(n^3)</math>, at the cost of further increasing the memory requirements. In many cases, the memory overhead of this approach is too large to make it practically usable. Divisive clustering with an exhaustive search is <math>\mathcal{O}(2^n)</math>, but it is common to use faster heuristics to choose splits, such as [[k-means clustering|''k''-means]].
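To make the cubic bound concrete, the following is a minimal Python sketch of the naive agglomerative procedure, assuming one-dimensional points and single linkage for brevity. The function name <code>naive_single_linkage</code> and its return format are illustrative choices, not taken from the cited sources. It stores a full pairwise distance matrix (quadratic memory) and, for each of the ''n'' − 1 merges, scans all remaining pairs (quadratic work per merge), which gives the <math>\mathcal{O}(n^3)</math> total. It is not SLINK, which avoids the repeated scan.

```python
from itertools import combinations


def naive_single_linkage(points):
    """Naive agglomerative clustering of 1-D points with single linkage.

    Returns the merges in order as (members_a, members_b, distance),
    where members are tuples of original point indices.
    """
    n = len(points)
    # Quadratic memory: the full pairwise distance matrix.
    dist = [[abs(points[i] - points[j]) for j in range(n)] for i in range(n)]
    active = set(range(n))                 # clusters that still exist
    members = {i: [i] for i in range(n)}   # point indices in each cluster
    merges = []

    # n - 1 merge steps ...
    while len(active) > 1:
        # ... each scanning all remaining pairs, hence cubic time overall.
        a, b = min(combinations(sorted(active), 2),
                   key=lambda pair: dist[pair[0]][pair[1]])
        merges.append((tuple(members[a]), tuple(members[b]), dist[a][b]))

        # Single-linkage update: the distance from the merged cluster to
        # any other cluster is the smaller of the two old distances.
        for k in active:
            if k != a and k != b:
                dist[a][k] = dist[k][a] = min(dist[a][k], dist[b][k])

        # Cluster b is absorbed into cluster a.
        members[a] = members[a] + members[b]
        del members[b]
        active.remove(b)

    return merges
```

Calling <code>naive_single_linkage([0.0, 1.0, 5.0, 6.5])</code> first merges the two closest points, then the next closest pair, and finally joins the two resulting clusters. Replacing the full pair scan with a priority queue of candidate merges is the idea behind the <math>\mathcal{O}(n^2 \log n)</math> variant mentioned above, at the price of the extra memory for the queue.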

Hierarchical clustering: Difference between revisions