m link Factor analysis using Find link |
Crudely fill 1 bare URL ref to website homepage, using title 'Home' |
Line 39:
==== K-means clustering ====
{{main|k-means clustering}}
K-means clustering is an algorithm for partitioning genes or samples into ''K'' groups based on their expression patterns. Grouping is done by minimizing the sum of squared distances between each data point and its corresponding cluster [[centroid]]. Thus the purpose of K-means clustering is to classify data based on similar expression.<ref>{{cite web |url=http://www.biostat.ucsf.edu/ |title=Home |website=biostat.ucsf.edu}}</ref> The K-means algorithm and some of its variants (including [[k-medoids]]) have been shown to produce good results for gene expression data (at least better than hierarchical clustering methods). Empirical comparisons of [[k-means]], [[k-medoids]], hierarchical methods, and different distance measures can be found in the literature.<ref name="Jaskowiak2014" /><ref name=Souto2011>{{cite journal|last1=de Souto|first1=Marcilio C. P.|last2=Costa|first2=Ivan G.|last3=de Araujo|first3=Daniel S. A.|last4=Ludermir|first4=Teresa B.|last5=Schliep|first5=Alexander|title=Clustering cancer gene expression data: a comparative study|journal=BMC Bioinformatics|volume=9|issue=1|pages=497|doi=10.1186/1471-2105-9-497|pmid=19038021|pmc=2632677|year=2008}}</ref>
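The minimization described above can be illustrated with a minimal sketch of the standard iterative (Lloyd-style) procedure, which alternates between assigning each profile to its nearest centroid and recomputing each centroid as the mean of its members. The function names (<code>k_means</code>, <code>within_cluster_ss</code>), the random initialization from the data, and the toy expression values are assumptions made for illustration only; they are not taken from the cited sources, and practical analyses would normally rely on an established statistics package.

```python
import random

def squared_distance(a, b):
    # Squared Euclidean distance between two expression profiles.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    # Coordinate-wise mean of a non-empty list of profiles.
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def k_means(profiles, k, max_iter=100, seed=0):
    # Assumption: centroids are initialized from k distinct profiles
    # chosen at random (other initialization schemes exist).
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the procedure has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return labels, centroids

def within_cluster_ss(profiles, labels, centroids):
    # The objective being minimized: the sum of squared distances
    # between each profile and the centroid of its cluster.
    return sum(squared_distance(p, centroids[lab])
               for p, lab in zip(profiles, labels))

# Hypothetical toy data: six genes measured in three samples.
profiles = [
    (1.0, 1.1, 0.9), (0.9, 1.0, 1.2), (1.1, 0.8, 1.0),
    (5.0, 5.2, 4.9), (4.8, 5.1, 5.0), (5.1, 4.9, 5.2),
]
labels, centroids = k_means(profiles, k=2)
objective = within_cluster_ss(profiles, labels, centroids)
```

On this toy data the three low-expression profiles and the three high-expression profiles end up in separate clusters. In general the procedure only finds a local minimum of the objective, so the result can depend on the initial centroids and on the choice of ''K''.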
===Pattern recognition===