Human genetic clustering: Difference between revisions

'''Human genetic clustering''' refers to patterns of relative genetic similarity among human individuals and populations, as well as the wide range of scientific and statistical methods used to study this aspect of [[human genetic variation]].
 
Clustering studies are thought to be valuable for characterizing the general structure of genetic variation among human populations and for contributing to the study of ancestral origins, evolutionary history, and precision medicine. Since the mapping of the human genome, and with the availability of increasingly powerful analytic tools, [[Cluster analysis|cluster analyses]] have revealed a range of ancestral and migratory trends among human populations and individuals.<ref name=":02">{{Cite journal|last=Novembre|first=John|last2=Ramachandran|first2=Sohini|date=2011-09-22|title=Perspectives on Human Population Structure at the Cusp of the Sequencing Era|url=http://dx.doi.org/10.1146/annurev-genom-090810-183123|journal=Annual Review of Genomics and Human Genetics|volume=12|issue=1|pages=245–274|doi=10.1146/annurev-genom-090810-183123|issn=1527-8204}}</ref> Human genetic clusters tend to be organized by geographic ancestry, with divisions between clusters aligning largely with geographic barriers such as oceans or mountain ranges.<ref name=":32">{{Cite journal|last=Maglo|first=Koffi N.|last2=Mersha|first2=Tesfaye B.|last3=Martin|first3=Lisa J.|date=2016-02-17|title=Population Genomics and the Statistical Values of Race: An Interdisciplinary Perspective on the Biological Classification of Human Populations and Implications for Clinical Genetic Epidemiological Research|url=http://dx.doi.org/10.3389/fgene.2016.00022|journal=Frontiers in Genetics|volume=7|doi=10.3389/fgene.2016.00022|issn=1664-8021|doi-access=free}}</ref><ref name=":92">{{Cite journal|date=2012-10-29|editor-last=Goodman|editor-first=Alan H.|editor2-last=Moses|editor2-first=Yolanda T.|editor3-last=Jones|editor3-first=Joseph L.|title=Race|url=http://dx.doi.org/10.1002/9781118233023|doi=10.1002/9781118233023}}</ref> Clustering studies have been applied to global populations,<ref name=":102">{{Cite journal|last=Rosenberg|first=N. A.|date=2002-12-20|title=Genetic Structure of Human Populations|url=http://dx.doi.org/10.1126/science.1078311|journal=Science|volume=298|issue=5602|pages=2381–2385|doi=10.1126/science.1078311|issn=0036-8075}}</ref> as well as to population subsets like post-colonial North America.<ref name=":112">{{Cite journal|last=Han|first=Eunjung|last2=Carbonetto|first2=Peter|last3=Curtis|first3=Ross E.|last4=Wang|first4=Yong|last5=Granka|first5=Julie M.|last6=Byrnes|first6=Jake|last7=Noto|first7=Keith|last8=Kermany|first8=Amir R.|last9=Myres|first9=Natalie M.|last10=Barber|first10=Mathew J.|last11=Rand|first11=Kristin A.|date=2017-02-07|title=Clustering of 770,000 genomes reveals post-colonial population structure of North America|url=https://www.nature.com/articles/ncomms14238|journal=Nature Communications|language=en|volume=8|issue=1|pages=14238|doi=10.1038/ncomms14238|issn=2041-1723|doi-access=free}}</ref><ref name=":122">{{Cite journal|last=Jordan|first=I. King|last2=Rishishwar|first2=Lavanya|last3=Conley|first3=Andrew B.|date=September 2019|title=Native American admixture recapitulates population-specific migration and settlement of the continental United States|url=https://pubmed.ncbi.nlm.nih.gov/31545791/|journal=PLoS Genetics|volume=15|issue=9|pages=e1008225|doi=10.1371/journal.pgen.1008225|issn=1553-7404|pmc=6756731|pmid=31545791}}</ref> Notably, the practice of defining clusters among modern human populations is largely arbitrary and variable because of the continuous nature of human genotypes; although individual genetic markers can be used to produce smaller groups, there are no models that produce completely distinct subgroups when larger numbers of genetic markers are used.<ref name=":32" /><ref name=":52">{{Cite journal|last=Bamshad|first=Michael J.|last2=Olson|first2=Steve E.|date=December 2003|title=Does Race Exist?|url=http://dx.doi.org/10.1038/scientificamerican1203-78|journal=Scientific American|volume=289|issue=6|pages=78–85|doi=10.1038/scientificamerican1203-78|issn=0036-8733}}</ref><ref name=":22">{{Cite journal|last=Kalinowski|first=S T|date=2010-08-04|title=The computer program STRUCTURE does not reliably identify the main genetic clusters within species: simulations and implications for human population structure|url=http://dx.doi.org/10.1038/hdy.2010.95|journal=Heredity|volume=106|issue=4|pages=625–632|doi=10.1038/hdy.2010.95|issn=0018-067X|doi-access=free}}</ref>
 
Many studies of human genetic clustering have been drawn into discussions of [[Race (human categorization)|race]], [[Ethnic group|ethnicity]], and [[scientific racism]], as some have controversially suggested that genetically derived clusters may be understood as proof of genetically determined races.<ref name=":42">{{Cite journal|last=Jorde|first=Lynn B|last2=Wooding|first2=Stephen P|date=2004-10-26|title=Genetic variation, classification and 'race'|url=http://dx.doi.org/10.1038/ng1435|journal=Nature Genetics|volume=36|issue=S11|pages=S28–S33|doi=10.1038/ng1435|issn=1061-4036|doi-access=free}}</ref><ref>{{Cite book|last=Marks|first=Jonathan|url=http://worldcat.org/oclc/1037867598|title=Is science racist?|isbn=978-0-7456-8925-8|oclc=1037867598}}</ref> Although cluster analyses invariably organize humans (or groups of humans) into subgroups, debate is ongoing on how to interpret these genetic clusters with respect to race and its social and phenotypic features. Because only a small fraction of genetic variation separates human genotypes overall, genetic clustering approaches are highly dependent on the sampled data, genetic markers, and statistical methods used in their construction.
 
== Genetic clustering algorithms and methods ==
A wide range of methods have been developed to assess the structure of human populations with the use of genetic data. Early studies of within- and between-group genetic variation used physical phenotypes and blood groups, whereas modern genetic studies use genetic markers such as [[Alu element|Alu sequences]], [[Microsatellite|short tandem repeat polymorphisms]], and [[Single-nucleotide polymorphism|single nucleotide polymorphisms]] (SNPs), among others.<ref>{{Cite journal|last=Bamshad|first=Michael|last2=Wooding|first2=Stephen|last3=Salisbury|first3=Benjamin A.|last4=Stephens|first4=J. Claiborne|date=August 2004|title=Deconstructing the relationship between genetics and race|url=http://dx.doi.org/10.1038/nrg1401|journal=Nature Reviews Genetics|volume=5|issue=8|pages=598–609|doi=10.1038/nrg1401|issn=1471-0056}}</ref> Models for genetic clustering also vary by the algorithms and programs used to process the data. Most of the sophisticated methods for determining clusters can be categorized as '''model-based clustering methods''' (such as the algorithm STRUCTURE<ref name=":132">{{Cite journal|last=Pritchard|first=Jonathan K|last2=Stephens|first2=Matthew|last3=Donnelly|first3=Peter|date=2000-06-01|title=Inference of Population Structure Using Multilocus Genotype Data|url=https://doi.org/10.1093/genetics/155.2.945|journal=Genetics|volume=155|issue=2|pages=945–959|doi=10.1093/genetics/155.2.945|issn=1943-2631|doi-access=free}}</ref>) or '''multidimensional summaries''' (typically through principal component analysis).<ref name=":02" /><ref name=":14">{{Cite journal|last=Lawson|first=Daniel John|last2=Falush|first2=Daniel|date=2012-09-22|title=Population Identification Using Genetic Data|url=http://dx.doi.org/10.1146/annurev-genom-082410-101510|journal=Annual Review of Genomics and Human Genetics|volume=13|issue=1|pages=337–361|doi=10.1146/annurev-genom-082410-101510|issn=1527-8204}}</ref> Although they process large numbers of SNPs (or other genetic marker data) in different ways, both approaches to genetic clustering tend to converge on similar patterns by identifying similarities among SNPs and/or [[haplotype]] tracts to reveal ancestral genetic similarities.<ref name=":14" />
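
As a rough illustration of the multidimensional-summary approach, the following Python sketch applies principal component analysis to simulated SNP data. Everything in it is an assumption made for illustration: the two populations, their allele frequencies, the sample sizes, and the function names <code>simulate_genotypes</code> and <code>first_pc_scores</code> are hypothetical, and a plain power iteration stands in for the optimized routines that real analyses use. It is not the STRUCTURE algorithm, nor the procedure of any cited study.

```python
import random


def simulate_genotypes(freqs, n_individuals, rng):
    """Draw diploid genotypes: 0, 1 or 2 copies of the counted allele per SNP."""
    return [
        [sum(rng.random() < p for _ in range(2)) for p in freqs]
        for _ in range(n_individuals)
    ]


def first_pc_scores(rows, n_iter=50):
    """Project individuals onto the first principal component (power iteration)."""
    n, m = len(rows), len(rows[0])
    # Center each SNP column on its mean across individuals.
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in rows]
    # Repeatedly apply v <- X^T (X v) and renormalise; v approaches the
    # leading eigenvector of X^T X, i.e. the SNP loadings of PC1.
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(n_iter):
        scores = [sum(a * w for a, w in zip(row, v)) for row in x]
        v = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(w * w for w in v) ** 0.5
        v = [w / norm for w in v]
    return [sum(a * w for a, w in zip(row, v)) for row in x]


rng = random.Random(0)
n_snps = 300

# Hypothetical allele frequencies for two simulated populations, A and B.
freqs_a = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
freqs_b = [min(0.95, max(0.05, p + rng.uniform(-0.4, 0.4))) for p in freqs_a]

# 25 simulated individuals per population, stacked into one genotype matrix.
genotypes = simulate_genotypes(freqs_a, 25, rng) + simulate_genotypes(freqs_b, 25, rng)
pc1 = first_pc_scores(genotypes)

print("population A, PC1 range:", min(pc1[:25]), max(pc1[:25]))
print("population B, PC1 range:", min(pc1[25:]), max(pc1[25:]))
```

In this toy setting the two simulated groups fall on opposite sides of the first component because their allele frequencies were made to differ at many SNPs at once. Real human data show far more gradual, overlapping variation, so the result depends strongly on which individuals and markers are sampled.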
 
=== Model-based clustering ===
 
== Notable applications to human genetic data ==
Modern applications of genetic clustering methods to global-scale genetic data began with studies of the [[Human Genome Diversity Project]] (HGDP) data.<ref name=":02" /> These early HGDP studies, such as those by Rosenberg et al. (2002),<ref name=":102" /><ref>{{Cite journal|last=Rosenberg|first=Noah A|last2=Mahajan|first2=Saurabh|last3=Ramachandran|first3=Sohini|last4=Zhao|first4=Chengfeng|last5=Pritchard|first5=Jonathan K|last6=Feldman|first6=Marcus W|date=2005-12-09|title=Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure|url=http://dx.doi.org/10.1371/journal.pgen.0010070|journal=PLoS Genetics|volume=1|issue=6|pages=e70|doi=10.1371/journal.pgen.0010070|issn=1553-7404|doi-access=free}}</ref> contributed to theories of the serial founder effect and early human migration out of Africa. Clustering methods have also been applied to describe admixed continental populations.<ref name=":112" /><ref name=":122" /><ref>{{Cite journal|last=Leslie|first=Stephen|last2=Winney|first2=Bruce|last3=Hellenthal|first3=Garrett|last4=Davison|first4=Dan|last5=Boumertit|first5=Abdelhamid|last6=Day|first6=Tammy|last7=Hutnik|first7=Katarzyna|last8=Royrvik|first8=Ellen C.|last9=Cunliffe|first9=Barry|last10=Lawson|first10=Daniel J.|last11=Falush|first11=Daniel|date=March 2015|title=The fine-scale genetic structure of the British population|url=https://www.nature.com/articles/nature14230|journal=Nature|language=en|volume=519|issue=7543|pages=309–314|doi=10.1038/nature14230|issn=1476-4687|pmc=4632200}}</ref> Genetic clustering and HGDP studies have also contributed to methods for, and criticisms of, the [[Genealogical DNA test|genetic ancestry consumer testing]] industry.<ref>{{Cite journal|last=Royal|first=Charmaine D.|last2=Novembre|first2=John|last3=Fullerton|first3=Stephanie M.|last4=Goldstein|first4=David B.|last5=Long|first5=Jeffrey C.|last6=Bamshad|first6=Michael J.|last7=Clark|first7=Andrew G.|date=2010-05-14|title=Inferring Genetic Ancestry: Opportunities, Challenges, and Implications|url=https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2869013/|journal=American Journal of Human Genetics|volume=86|issue=5|pages=661–673|doi=10.1016/j.ajhg.2010.03.011|issn=0002-9297|pmc=2869013|pmid=20466090}}</ref>
 
A number of landmark genetic cluster studies have been conducted on global human populations since 2002, including the following: