Human genetic clustering: Difference between revisions

Millager (talk | contribs)
a bit more
Millager (talk | contribs)
pushing forward
Line 2:
'''Human genetic clustering''' refers to a wide range of scientific and statistical methods often used to characterize patterns and subgroups within studies of [[human genetic variation]].
 
Clustering studies are thought to be valuable for characterizing the general structure of genetic variation among human populations, with applications to the study of ancestral origins and evolutionary history and to personalized medicine. Since the mapping of the human genome, and with the availability of increasingly powerful analytic tools, cluster analyses have revealed a range of ancestral and migratory trends among human populations and individuals.<ref name=":0">{{Cite journal|last=Novembre|first=John|last2=Ramachandran|first2=Sohini|date=2011-09-22|title=Perspectives on Human Population Structure at the Cusp of the Sequencing Era|url=http://dx.doi.org/10.1146/annurev-genom-090810-183123|journal=Annual Review of Genomics and Human Genetics|volume=12|issue=1|pages=245–274|doi=10.1146/annurev-genom-090810-183123|issn=1527-8204}}</ref>
 
The practice of defining clusters of human populations is largely arbitrary and variable, depending on the sampled data, genetic markers, and statistical methods applied to their construction. Nevertheless, studies of human genetic clustering have been implicated in discussions of [[Race (human categorization)|race]], [[Ethnic group|ethnicity]], and [[scientific racism]], as some have controversially suggested that genetic clusters may represent genetically determined races.<ref>{{Cite journal|last=Jorde|first=Lynn B|last2=Wooding|first2=Stephen P|date=2004-10-26|title=Genetic variation, classification and 'race'|url=http://dx.doi.org/10.1038/ng1435|journal=Nature Genetics|volume=36|issue=S11|pages=S28–S33|doi=10.1038/ng1435|issn=1061-4036}}</ref><ref>{{Cite book|last=Marks|first=Jonathan|url=http://worldcat.org/oclc/1037867598|title=Is science racist?|isbn=978-0-7456-8925-8|oclc=1037867598}}</ref>
 
== Genetic clustering algorithms and methods ==
Since at least 2001, a wide range of methods have been developed to assess the structure of human populations with the use of genetic data. Most commonly, genetic clusters are derived by analysis of [[Single-nucleotide polymorphism|single-nucleotide polymorphisms]] (SNPs), although other genetic data can be analyzed as well. Models for genetic clustering also vary in the algorithms and programs used to process the data. Most methods for determining clusters can be categorized as '''model-based clustering methods''' or '''multidimensional summaries'''.<ref name=":0" /><ref>{{Cite journal|last=Lawson|first=Daniel John|last2=Falush|first2=Daniel|date=2012-09-22|title=Population Identification Using Genetic Data|url=http://dx.doi.org/10.1146/annurev-genom-082410-101510|journal=Annual Review of Genomics and Human Genetics|volume=13|issue=1|pages=337–361|doi=10.1146/annurev-genom-082410-101510|issn=1527-8204}}</ref> By processing a large number of SNPs (or other genetic marker data), both approaches operate by identifying shared patterns in individual SNPs or [[haplotype]] tracts that reveal ancestral genetic similarities. '''###add something about these being different but showing similar results, and cite Lawson & Falush.'''
 
=== Model-based clustering ===
Common model-based clustering algorithms include STRUCTURE, ADMIXTURE, and HAPMIX. These algorithms fit genetic data to an arbitrary or mathematically derived number of clusters, such that differences within clusters are minimized and differences between clusters are maximized. This clustering method is also referred to as "[[Genetic admixture|admixture]] inference," as individual genomes (or individuals within populations) can be characterized by the proportions of [[Allele|alleles]] linked to each cluster.<ref name=":0" />
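
The idea of describing an individual by the proportions of alleles linked to each cluster can be illustrated with a small sketch. It is not the implementation used by STRUCTURE, ADMIXTURE, or HAPMIX. It assumes the allele frequencies of each cluster are already known (real programs estimate them jointly with the proportions), it uses invented toy data, and the function name <code>estimate_admixture</code> is hypothetical. Under those assumptions, a simple expectation-maximization loop estimates one individual's cluster proportions:

```python
def estimate_admixture(genotype, freqs, n_iter=200):
    """Estimate one individual's cluster proportions by EM.

    genotype: list of 0/1/2 counts of a chosen allele at each SNP.
    freqs:    freqs[c][j] is the ASSUMED frequency of that allele in
              cluster c at SNP j (must lie strictly between 0 and 1).
    Hypothetical helper for illustration only.
    """
    n_clusters = len(freqs)
    n_snps = len(genotype)
    q = [1.0 / n_clusters] * n_clusters  # start from equal proportions

    for _ in range(n_iter):
        new_q = [0.0] * n_clusters
        for j, g in enumerate(genotype):
            # Probability of drawing the counted / the other allele,
            # given the current proportions q.
            p_alt = sum(q[c] * freqs[c][j] for c in range(n_clusters))
            p_ref = sum(q[c] * (1.0 - freqs[c][j]) for c in range(n_clusters))
            for c in range(n_clusters):
                # E-step: share of each allele copy attributed to cluster c.
                if g > 0:
                    new_q[c] += g * q[c] * freqs[c][j] / p_alt
                if g < 2:
                    new_q[c] += (2 - g) * q[c] * (1.0 - freqs[c][j]) / p_ref
        # M-step: average the attributions over all 2 * n_snps allele copies.
        q = [v / (2.0 * n_snps) for v in new_q]
    return q


# Toy allele frequencies for two made-up clusters at six SNPs.
toy_freqs = [
    [0.9, 0.8, 0.9, 0.1, 0.2, 0.1],  # cluster 0
    [0.1, 0.2, 0.1, 0.9, 0.8, 0.9],  # cluster 1
]

unmixed = estimate_admixture([2, 2, 2, 0, 0, 0], toy_freqs)
mixed = estimate_admixture([1, 1, 1, 1, 1, 1], toy_freqs)
print("unmixed individual:", [round(v, 3) for v in unmixed])
print("mixed individual:  ", [round(v, 3) for v in mixed])
```

With these toy inputs, the first individual is assigned almost entirely to cluster 0, while the second, heterozygous at every SNP, is split evenly between the two clusters. The number of clusters is fixed here by the shape of <code>toy_freqs</code>, which mirrors the point that the cluster count is chosen by the analyst or by a separate model-selection step.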
 
=== Multidimensional summary statistics ===
Where model-based clustering aims to characterize proportions of cluster