Revision as of 18:56, 5 July 2021 OAbot: Open access bot: doi added to citation with #oabot.		Revision as of 03:05, 1 September 2021 71.84.226.221: →The Monti consensus clustering algorithm: typo
Line 15: The Monti consensus clustering algorithm<ref>{{Cite journal|last1=Monti|first1=Stefano|last2=Tamayo|first2=Pablo|last3=Mesirov|first3=Jill|last4=Golub|first4=Todd|date=2003-07-01|title=Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data|journal=Machine Learning|language=en|volume=52|issue=1|pages=91–118|doi=10.1023/A:1023949509487|issn=1573-0565|doi-access=free}}</ref> is one of the most popular consensus clustering algorithms and is used to determine the number of clusters, <math>K</math>. Given a dataset of <math>N</math> points to cluster, the algorithm resamples and clusters the data for each candidate <math>K</math> and computes an <math>N \times N</math> consensus matrix, in which each element is the fraction of times two samples clustered together. A perfectly stable matrix would consist entirely of zeros and ones, meaning that every pair of samples either always or never clustered together across all resampling iterations. The relative stability of the consensus matrices can be used to infer the optimal <math>K</math>. More specifically, given a set of points to cluster, <math>D=\{e_1,e_2,\ldots,e_N\}</math>, let <math>D^1,D^2,\ldots,D^H</math> be the list of <math>H</math> perturbed (resampled) datasets of the original dataset <math>D</math>, and let <math>M^h</math> denote the <math>N \times N</math> connectivity matrix resulting from applying a clustering algorithm to the dataset <math>D^h</math>. The entries of <math>M^h</math> are defined as follows: <math>M^h(i,j)= \begin{cases} 1, & \text{if points } i \text{ and } j \text{ belong to the same cluster} \\ 0, & \text{otherwise} \end{cases}</math>
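
The connectivity matrix defined above, and the consensus matrix built from it, can be illustrated with a minimal Python sketch. The function names, the dictionary representation of a clustering run, and the example labels are hypothetical and chosen only for illustration. Dividing by the number of runs in which both points were sampled is an assumption about how "fraction of times" is normalised; the clustering algorithm itself is not shown.

```python
def connectivity_matrix(labels, n):
    """Return the n x n matrix M^h for one clustering run.

    `labels` maps a point index (0..n-1) to its cluster label; points
    missing from `labels` were not drawn into this resampled dataset.
    """
    m = [[0] * n for _ in range(n)]
    for i, ci in labels.items():
        for j, cj in labels.items():
            if ci == cj:
                m[i][j] = 1  # points i and j share a cluster in this run
    return m


def consensus_matrix(runs, n):
    """Average connectivity over runs, counting only runs that drew both points.

    Assumption: pairs never sampled together get a consensus value of 0.0.
    """
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for labels in runs:
        m = connectivity_matrix(labels, n)
        for i in labels:
            for j in labels:
                together[i][j] += m[i][j]
                sampled[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical cluster labels from H = 3 resampled runs over N = 4 points.
runs = [
    {0: "a", 1: "a", 2: "b"},          # point 3 not sampled
    {0: "a", 1: "b", 2: "b", 3: "b"},  # all points sampled
    {1: "x", 2: "y", 3: "y"},          # point 0 not sampled
]
consensus = consensus_matrix(runs, 4)
```

In this toy example, points 2 and 3 cluster together in every run that contains both, so their consensus entry is 1, while points 0 and 1 agree in only one of the two runs they share. Entries strictly between 0 and 1 indicate unstable pairs.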

Consensus clustering: Difference between revisions