Non-coding DNA: Difference between revisions

AnomieBOT (talk | contribs)
m Dating maintenance tags: {{Split}}
Junk DNA: "Material WP:SPLIT back out to Junk_DNA
Line 1:
{{Short description|DNA not coding for protein}}
 
{{split | Non-coding DNA | Junk DNA |date=March 2023}}
 
'''Non-coding DNA''' ('''ncDNA''') sequences are components of an organism's [[DNA]] that do not [[genetic code|encode]] [[protein]] sequences. Some non-coding DNA is [[Transcription (genetics)|transcribed]] into functional [[non-coding RNA]] molecules (e.g. [[transfer RNA]], [[microRNA]], [[Piwi-interacting RNA|piRNA]], [[ribosomal RNA]], and [[RNA interference|regulatory RNAs]]). Other functional regions of the non-coding DNA fraction include [[regulatory sequence]]s that control gene expression; [[scaffold attachment region]]s; [[origin of replication|origins of DNA replication]]; [[centromere]]s; and [[telomere]]s. Some non-coding regions appear to be mostly nonfunctional such as [[introns]], [[pseudogenes]], [[intergenic DNA]], and fragments of [[transposons]] and [[viruses]].
Line 109 ⟶ 107:
 
==Junk DNA==
{{Main|Junk DNA}}
 
Although many non-coding regions have biological function,<ref name="Costa non-coding32">{{cite book |title=Non-coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection |vauthors=Costa F |date=2012 |publisher=[[Caister Academic Press]] |isbn=978-1-904455-94-3 |veditors=Morris KV |chapter=7 Non-coding RNAs, Epigenomics, and Complexity in Human Cells}}{{page needed|date=June 2022}}</ref><ref name="Nessa32">{{cite book |title=Junk DNA: A Journey Through the Dark Matter of the Genome |vauthors=Carey M |date=2015 |publisher=Columbia University Press |isbn=978-0-231-17084-0 |author-link=Nessa Carey}}{{page needed|date=June 2022}}</ref> much of the non-coding DNA in most genomes does not have biological function and has been described as "junk DNA". Though exact definitions differ, this refers broadly to "any DNA sequence that does not play a functional role in development, physiology, or some other organism-level capacity."<ref name="PalazzoGregory20142">{{cite journal |vauthors=Palazzo AF, Gregory TR |date=May 2014 |title=The case for junk DNA |journal=PLOS Genetics |volume=10 |issue=5 |pages=e1004351 |doi=10.1371/journal.pgen.1004351 |pmc=4014423 |pmid=24809441}}</ref> The term is contentious because different definitions of what constitutes biological function lead to widely different estimates of the proportion of a genome that falls into the category.<ref name=":02">{{cite journal |last1=Palazzo |first1=A F |last2=Kejiou |first2=N S |year=2022 |title=Non-Darwinian Molecular Biology |journal=Front. Genet. |volume=13 |pages=831068 |doi=10.3389/fgene.2022.831068 |pmc=8888898 |pmid=35251134 |doi-access=free}}</ref><ref name=":13">{{cite journal |vauthors=Ponting CP, Hardison RC |date=November 2011 |title=What fraction of the human genome is functional? |journal=Genome Research |volume=21 |issue=11 |pages=1769–1776 |doi=10.1101/gr.116814.110 |pmc=3205562 |pmid=21875934}}</ref> In particular, the [[ENCODE]] project in the 2000s demonstrated detectable biochemical activity resulting from most parts of the genome ([[Transcription (biology)|transcription to RNA]], [[Transcription factor-binding site|transcription factor binding]], etc.).<ref name="eddy2">{{cite journal |author-link=Sean Eddy |vauthors=Eddy SR |date=November 2012 |title=The C-value paradox, junk DNA and ENCODE |journal=Current Biology |volume=22 |issue=21 |pages=R898–R899 |doi=10.1016/j.cub.2012.10.002 |pmid=23137679 |s2cid=28289437 |doi-access=free}}</ref><ref>{{Cite journal |last=Celniker |first=Susan E. |last2=Dillon |first2=Laura A. L. |last3=Gerstein |first3=Mark B. |last4=Gunsalus |first4=Kristin C. |last5=Henikoff |first5=Steven |last6=Karpen |first6=Gary H. |last7=Kellis |first7=Manolis |last8=Lai |first8=Eric C. |last9=Lieb |first9=Jason D. |last10=MacAlpine |first10=David M. |last11=Micklem |first11=Gos |last12=Piano |first12=Fabio |last13=Snyder |first13=Michael |last14=Stein |first14=Lincoln |last15=White |first15=Kevin P. |date=June 2009 |title=Unlocking the secrets of the genome |url=https://www.nature.com/articles/459927a |journal=Nature |language=en |volume=459 |issue=7249 |pages=927–930 |doi=10.1038/459927a |issn=1476-4687 |pmc=2843545 |pmid=19536255}}</ref> However, it has been less clear whether this biochemical activity is promiscuous activity in a noisy biological system or evolutionarily relevant biological function, and consequently whether that DNA counts as "junk" or not.<ref name=":13" /><ref name="eddy2" /><ref name=":12">{{cite journal |vauthors=Ponting CP, Hardison RC |date=November 2011 |title=What fraction of the human genome is functional? |journal=Genome Research |volume=21 |issue=11 |pages=1769–1776 |doi=10.1101/gr.116814.110 |pmc=3205562 |pmid=21875934}}</ref>
{{split | Non-coding DNA | Junk DNA |date=March 2023}}
 
Although many non-coding regions have biological function,<ref name="Costa non-coding3">{{cite book |title=Non-coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection |vauthors=Costa F |date=2012 |publisher=[[Caister Academic Press]] |isbn=978-1-904455-94-3 |veditors=Morris KV |chapter=7 Non-coding RNAs, Epigenomics, and Complexity in Human Cells}}{{page needed|date=June 2022}}</ref><ref name="Nessa3">{{cite book |title=Junk DNA: A Journey Through the Dark Matter of the Genome |vauthors=Carey M |date=2015 |publisher=Columbia University Press |isbn=978-0-231-17084-0 |author-link=Nessa Carey}}{{page needed|date=June 2022}}</ref> some portion of non-coding DNA has also been described as "junk DNA". Though exact definitions differ, this refers broadly to "any DNA sequence that does not play a functional role in development, physiology, or some other organism-level capacity."<ref name="PalazzoGregory2014" /> The term "junk DNA" was used in the 1960s,<ref name="PalazzoGregory2014" /><ref name="EhretdeHaller1963">{{cite journal | vauthors = Ehret CF, De Haller G | title = Origin, development, and maturation of organelles and organelle systems of the cell surface in Paramecium | journal = Journal of Ultrastructure Research | volume = 23 | pages = SUPPL6:1–SUPPL642 | date = October 1963 | pmid = 14073743 | doi = 10.1016/S0022-5320(63)80088-X }}</ref><ref name="Gregory Evolution Genome">{{cite book| veditors = Gregory TR |title=The Evolution of the Genome|date=2005|publisher=Elsevier |isbn=978-0-12-301463-4|pages=29–31|url=https://books.google.com/books?id=8HtPZP9VSiMC&dq=not+only+is+%22junk+dna%22+an+inappropriate+moniker&pg=PA30}}</ref> but it became widely known only in 1972 through a paper by [[Susumu Ohno]].<ref name="Ohno"/> Ohno noted that the [[mutational load]] from deleterious mutations placed an upper limit on the number of functional [[Locus (genetics)|loci]] that could be expected given a typical mutation rate. He hypothesized that mammalian genomes could not have more than 30,000 loci under selection before the "cost" from the mutational load would cause an inescapable decline in fitness, and eventually extinction.<ref name="Ohno" /> Similar calculations that consider nucleotides rather than gene loci reach a comparable conclusion: given the mutation rate, genome size, and population size, the functional portion of the human genome can be maintained only up to approximately 15%.<ref>{{cite journal
| last1= Graur | first1 = D
| title = An Upper Limit on the Functional Fraction of the Human Genome
| doi= 10.1093/gbe/evx121
| year = 2017
| journal = Genome Biol. Evol.
| volume = 9| number =7
| pages = 1880–1885
| pmid = 28854598
| pmc = 5570035
}}</ref> The presence of junk DNA also explained the observation that even closely related species can have widely (orders-of-magnitude) different genome sizes ([[C-value|C-value paradox]]).<ref name=eddy/>
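
The logic of the load argument can be illustrated with a rough calculation. The following is a minimal sketch and not the model used by Ohno or in the later nucleotide-based estimates: it assumes the classical multiplicative approximation, in which mean fitness relative to a mutation-free genotype is exp(−''U''), where ''U'' is the number of new deleterious mutations per diploid genome per generation. The mutation rate, genome size, and share of mutations in functional sequence that are deleterious are illustrative placeholder values, and the function names are hypothetical.

```python
import math


def deleterious_mutations_per_generation(functional_fraction,
                                         mutation_rate=1.2e-8,       # assumed: per nucleotide per generation
                                         diploid_genome_size=6.2e9,  # assumed: nucleotides in a diploid genome
                                         deleterious_share=0.4):     # assumed: share of hits in functional DNA that are harmful
    """Expected number of new deleterious mutations per diploid genome per generation (U)."""
    return (mutation_rate * diploid_genome_size
            * functional_fraction * deleterious_share)


def mean_relative_fitness(u):
    """Mean fitness relative to a mutation-free genotype under the exp(-U) approximation."""
    return math.exp(-u)


# Compare a small, an intermediate, and a very large functional fraction.
for fraction in (0.05, 0.15, 0.80):
    u = deleterious_mutations_per_generation(fraction)
    print(f"functional fraction {fraction:.0%}: U = {u:.2f}, "
          f"mean relative fitness = {mean_relative_fitness(u):.3g}")
```

Under these assumptions, ''U'' grows in proportion to the functional fraction while mean relative fitness falls exponentially, which is why load arguments place an upper bound on how much of a genome can be under selection. The bound obtained depends strongly on the assumed parameter values.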
 
=== Terminology ===
The term "junk DNA" is contentious, and different exact definitions (and associated methods) yield widely different estimates of its prevalence.<ref name=":0">{{cite journal |last1=Palazzo |first1=A F |last2=Kejiou |first2=N S |year=2022 |title=Non-Darwinian Molecular Biology |journal=Front. Genet. |volume=13 |pages=831068 |doi=10.3389/fgene.2022.831068 |pmc=8888898 |pmid=35251134 |doi-access=free}}</ref> Some authors assert that the term occurs mainly in [[popular science]] and is no longer used in serious research articles.<ref name="SA">{{cite journal |vauthors=Khajavinia A, Makalowski W |date=May 2007 |title=What is "junk" DNA, and what is it worth? |journal=Scientific American |volume=296 |issue=5 |pages=104 |bibcode= |doi=10.1038/scientificamerican0507-104 |pmid=17503549}}</ref> It has also been pointed out that the term 'junk' can imply that its accumulation is disadvantageous, whereas the majority of non-functional sequence is likely merely neutral.<ref>{{cite journal |last1=Brenner |first1=Sydney |date=September 1998 |title=Refuge of spandrels |journal=Current Biology |volume=8 |issue=19 |pages=R669 |doi=10.1016/s0960-9822(98)70427-0 |pmid=9776723 |doi-access=free |s2cid=2918533}}</ref> Strong reactions to the term "junk DNA" have also led some to recommend more neutral terminology, such as "nonfunctional DNA."<ref name="eddy" />
 
=== Measurement and estimates ===
Different methodologies rest on different implicit definitions and therefore yield different estimates of the non-functional fraction of the genome.<ref name=":0" />
 
For example, 20% of human genomic DNA shows no detectable biochemical activity,<ref name="Nature489p57" /> but [[comparative genomics]] methods estimate a nonfunctional fraction of 85–92%.<ref name=":1" /><ref name="kellis" /><ref name="Rands" /> Consequently, the proportion of a genome classed as junk DNA depends on the exact definition used. Each method has limitations: genetic approaches may miss functional elements that do not manifest physically on the organism; evolutionary approaches have difficulty producing accurate multispecies sequence alignments, since the genomes of even closely related species vary considerably; and biochemical signatures do not always signify a function.<ref name="kellis" /> Ultimately, genetic, evolutionary, and biochemical approaches can all be used in a complementary way to identify regions that may be functional in human biology and disease.<ref name="kellis" />
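
The nonfunctional fractions quoted above are the complements of the corresponding functional estimates reported in this section: at least 80% of the genome with detectable biochemical activity, and 8–15% under evolutionary constraint. A minimal sketch of that arithmetic follows; the function names are hypothetical and serve only as illustration.

```python
def nonfunctional_percent(functional_percent):
    """Complement of a functional estimate, in whole percent of the genome."""
    return 100 - functional_percent


def nonfunctional_range(functional_low, functional_high):
    """Turn a functional range (low, high) into the matching nonfunctional range."""
    # The highest functional estimate gives the lowest nonfunctional estimate.
    return (nonfunctional_percent(functional_high),
            nonfunctional_percent(functional_low))


# Biochemical definition: at least 80% of the genome shows activity.
print("no detectable activity:", nonfunctional_percent(80), "%")

# Evolutionary definition: 8-15% of the genome is under constraint.
low, high = nonfunctional_range(8, 15)
print("not under constraint:", low, "to", high, "%")
```

The two definitions applied to the same genome thus differ by more than a factor of four in the fraction labelled nonfunctional.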
 
==== Biochemical activity ====
Detectable biochemical activity (e.g. [[Transcription (biology)|transcription]], [[Transcription factor-binding site|transcription factor association]], [[chromatin structure]], and [[histone modification]]) was observed for at least 80% of human genomic DNA by the Encyclopedia of DNA Elements ([[ENCODE]]) project.<ref name="Nature489p57">{{cite journal | vauthors = Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, etal | collaboration = The ENCODE Project Consortium | title = An integrated encyclopedia of DNA elements in the human genome | journal = Nature | volume = 489 | issue = 7414 | pages = 57–74 | date = September 2012 | pmid = 22955616 | pmc = 3439153 | doi = 10.1038/nature11247 | bibcode = 2012Natur.489...57T }}</ref> This is an upper estimate of the functional portion of the human genome, since biochemical activity does not necessarily imply [[biological function]] or [[Natural selection|selective advantage]].<ref name="observer">{{cite news |url=https://www.theguardian.com/science/2013/feb/24/scientists-attacked-over-junk-dna-claim |title=Scientists attacked over claim that 'junk DNA' is vital to life | vauthors = McKie R |work=The Observer|date=24 February 2013 }}</ref><ref name="eddy">{{cite journal | vauthors = Eddy SR | title = The C-value paradox, junk DNA and ENCODE | journal = Current Biology | volume = 22 | issue = 21 | pages = R898–R899 | date = November 2012 | pmid = 23137679 | doi = 10.1016/j.cub.2012.10.002 | s2cid = 28289437 | author-link = Sean Eddy | doi-access = free }}</ref><ref name="doolittle2013">{{cite journal | vauthors = Doolittle WF | title = Is junk DNA bunk? A critique of ENCODE | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 110 | issue = 14 | pages = 5294–5300 | date = April 2013 | pmid = 23479647 | pmc = 3619371 | doi = 10.1073/pnas.1221376110 | author-link = W. Ford Doolittle | bibcode = 2013PNAS..110.5294D | doi-access = free }}</ref><ref name="PalazzoGregory2014">{{cite journal | vauthors = Palazzo AF, Gregory TR | title = The case for junk DNA | journal = PLOS Genetics | volume = 10 | issue = 5 | pages = e1004351 | date = May 2014 | pmid = 24809441 | pmc = 4014423 | doi = 10.1371/journal.pgen.1004351 }}</ref><ref name="graur">{{cite journal | vauthors = Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, Elhaik E | title = On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE | journal = Genome Biology and Evolution | volume = 5 | issue = 3 | pages = 578–590 | year = 2013 | pmid = 23431001 | pmc = 3622293 | doi = 10.1093/gbe/evt028 }}</ref> For example, transcription factor binding sites are short and can be found by chance over the whole genome,<ref>{{cite journal | vauthors = Lambert SA, Jolma A, Campitelli LF, Das PK, Yin Y, Albu M, Chen X, Taipale J, Hughes TR, Weirauch MT | display-authors = 6 | title = The Human Transcription Factors | journal = Cell | volume = 172 | issue = 4 | pages = 650–665 | date = February 2018 | pmid = 29425488 | doi = 10.1016/j.cell.2018.01.029 | s2cid = 3599827 | doi-access = free }}</ref> and 70% of transcribed sequences are present at below 1 transcript per cell and so may be spurious background transcription.<ref name="kellis" />
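
The point about short binding sites can be made concrete with a simple expectation. The sketch below is an illustration under stated assumptions, not a method taken from the cited sources: it treats the genome as a random sequence in which each of the four bases is equally likely and independent of its neighbours, assumes a genome length of about 3.1 billion nucleotides, and counts exact matches to a single fixed motif. The function name and the motif lengths are hypothetical.

```python
def expected_chance_matches(motif_length,
                            genome_length=3_100_000_000,  # assumed haploid genome length
                            both_strands=True):
    """Expected number of exact matches to one fixed motif in a random sequence.

    Assumes uniform, independent bases, so each position matches with
    probability (1/4) ** motif_length. Counting both strands doubles the
    total, which overcounts motifs equal to their own reverse complement.
    """
    positions = genome_length - motif_length + 1
    per_position_probability = 0.25 ** motif_length
    strands = 2 if both_strands else 1
    return strands * positions * per_position_probability


# Illustrative motif lengths only; real binding sites are also degenerate,
# which raises the number of chance matches further.
for length in (6, 8, 10, 12):
    print(f"{length}-nucleotide motif: about "
          f"{expected_chance_matches(length):,.0f} chance matches")
```

Because the match probability falls by a factor of four with each added nucleotide, short motifs are expected to occur by chance very many times in a genome of this size, so an observed binding event is weak evidence of function on its own.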
 
==== Genetic function ====
Contributing to the debate is the lack of consensus on what constitutes a "functional" element in the genome: geneticists, evolutionary biologists, and molecular biologists employ different approaches and definitions of "function",<ref name="kellis" /> often without making clear in the literature which meaning is intended.<ref>{{cite journal |last1=Linquist |first1=Stefan |last2=Doolittle |first2=W. Ford |last3=Palazzo |first3=Alexander F. |title=Getting clear about the F-word in genomics |journal=PLOS Genetics |date=1 April 2020 |volume=16 |issue=4 |pages=e1008702 |doi=10.1371/journal.pgen.1008702|pmid=32236092 |pmc=7153884 }}</ref> Due to the ambiguity in the terminology, there are different schools of thought over this matter.<ref>{{cite journal |last1=Doolittle |first1=W. Ford |title=We simply cannot go on being so vague about 'function' |journal=Genome Biology |date=December 2018 |volume=19 |issue=1 |pages=223 |doi=10.1186/s13059-018-1600-4|pmid=30563541 |pmc=6299606 |doi-access=free }}</ref>
 
However, widespread transcription and splicing in the human genome have been discussed as another indicator of genetic function, in addition to genomic conservation, which may miss poorly conserved functional sequences.<ref name="kellis" /> In addition, much of the apparent junk DNA is involved in [[epigenetic]] regulation and appears to be necessary for the development of complex organisms.<ref name="Nessa" /><ref name="extent functionality" /><ref name="Morris Epigenetics" />
 
Some critics have argued that functionality can only be assessed in reference to an appropriate [[null hypothesis]]. In this case, the null hypothesis would be that these parts of the genome are non-functional and have properties, be it on the basis of conservation or biochemical activity, that would be expected of such regions based on our general understanding of [[molecular evolution]] and [[biochemistry]]. According to these critics, until a region in question has been shown to have additional features, beyond what is expected of the null hypothesis, it should provisionally be labelled as non-functional.<ref name="PalazzoLee2015">{{cite journal | vauthors = Palazzo AF, Lee ES | title = Non-coding RNA: what is functional and what is junk? | journal = Frontiers in Genetics | volume = 6 | pages = 2 | year = 2015 | pmid = 25674102 | pmc = 4306305 | doi = 10.3389/fgene.2015.00002 | doi-access = free }}</ref>
 
===== Evolutionary impact =====
One indication of functionality of a genomic region is if that sequence has been maintained by purifying selection (or if mutating away the sequence is deleterious to the organism). Estimates for the functionally constrained fraction of the human genome based on evolutionary conservation using [[comparative genomics]] range between 8 and 15%.<ref name=":1">{{cite journal | vauthors = Ponting CP, Hardison RC | title = What fraction of the human genome is functional? | journal = Genome Research | volume = 21 | issue = 11 | pages = 1769–1776 | date = November 2011 | pmid = 21875934 | pmc = 3205562 | doi = 10.1101/gr.116814.110 }}</ref><ref name="kellis">{{cite journal | vauthors = Kellis M, Wold B, Snyder MP, Bernstein BE, Kundaje A, Marinov GK, Ward LD, Birney E, Crawford GE, Dekker J, Dunham I, Elnitski LL, Farnham PJ, Feingold EA, Gerstein M, Giddings MC, Gilbert DM, Gingeras TR, Green ED, Guigo R, Hubbard T, Kent J, Lieb JD, Myers RM, Pazin MJ, Ren B, Stamatoyannopoulos JA, Weng Z, White KP, Hardison RC | display-authors = 6 | title = Defining functional DNA elements in the human genome | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 111 | issue = 17 | pages = 6131–6138 | date = April 2014 | pmid = 24753594 | pmc = 4035993 | doi = 10.1073/pnas.1318948111 | doi-access = free | bibcode = 2014PNAS..111.6131K }}</ref><ref name="Rands">{{cite journal | vauthors = Rands CM, Meader S, Ponting CP, Lunter G | title = 8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage | journal = PLOS Genetics | volume = 10 | issue = 7 | pages = e1004525 | date = July 2014 | pmid = 25057982 | pmc = 4109858 | doi = 10.1371/journal.pgen.1004525 }}</ref> These may still be underestimates when lineage-specific constraints are included. However, others have argued against relying solely on estimates from comparative genomics because of its limited scope: non-coding DNA has been found to be involved in [[epigenetic]] activity and in complex [[gene regulatory network|networks of genetic interactions]], and is explored in [[evolutionary developmental biology]].<ref name="Nessa">{{cite book| vauthors = Carey M | author-link =Nessa Carey|title=Junk DNA: A Journey Through the Dark Matter of the Genome|date=2015|publisher=Columbia University Press|isbn=978-0-231-17084-0}}{{page needed|date=June 2022}}</ref><ref name="kellis" /><ref name="extent functionality">{{cite journal | vauthors = Liu G, Mattick JS, Taft RJ | title = A meta-analysis of the genomic and transcriptomic composition of complex life | journal = Cell Cycle | volume = 12 | issue = 13 | pages = 2061–2072 | date = July 2013 | pmc = 4685169 | doi = 10.1186/1877-6566-7-2 | pmid = 23759593 }}</ref><ref name="Morris Epigenetics">{{cite book | veditors = Morris K |title=Non-Coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection |date=2012 |publisher=Caister Academic Press |location=Norfolk, UK |isbn=978-1-904455-94-3}}{{page needed|date=June 2022}}</ref>
 
Biologically functional sequences may also have different evolutionary impacts on the sequence itself or the organism that it is found in. Much of the DNA in large genomes originates from [[selfish DNA|selfish]] amplification of [[transposable element]]s. Some of this sequence has biological function (transposition and self-replication in the host genome) but does not provide a selective advantage to the host organism.<ref name="Doolittle1980">{{cite journal |vauthors=Doolittle WF, Sapienza C |date=April 1980 |title=Selfish genes, the phenotype paradigm and genome evolution |journal=Nature |volume=284 |issue=5757 |pages=601–603 |bibcode=1980Natur.284..601D |doi=10.1038/284601a0 |pmid=6245369 |s2cid=4311366}}</ref>
 
An additional complication is that the large body of nonfunctional background transcripts produced by nonfunctional sequences can [[De novo gene birth|evolve into functional elements ''de novo'']].<ref>{{cite journal | vauthors = Palazzo AF, Koonin EV | title = Functional Long Non-coding RNAs Evolve from Junk Transcripts | journal = Cell | volume = 183 | issue = 5 | pages = 1151–1161 | date = November 2020 | pmid = 33068526 | doi = 10.1016/j.cell.2020.09.047 | s2cid = 222815635 | doi-access = free }}</ref><ref>{{cite journal | vauthors = Graur D, Zheng Y, Azevedo RB | title = An evolutionary classification of genomic function | journal = Genome Biology and Evolution | volume = 7 | issue = 3 | pages = 642–645 | date = January 2015 | pmid = 25635041 | pmc = 5322545 | doi = 10.1093/gbe/evv021 }}</ref> Therefore, a sequence fitting a strict definition of junk as having no biological function and no fitness effect can still have long-term evolutionary significance.<ref>{{Cite journal |last=Schmitz |first=Jonathan F. |last2=Ullrich |first2=Kristian K. |last3=Bornberg-Bauer |first3=Erich |date=2018-09-10 |title=Incipient de novo genes can evolve from frozen accidents that escaped rapid transcript turnover |url=https://www.nature.com/articles/s41559-018-0639-7 |journal=Nature Ecology & Evolution |language=en |volume=2 |issue=10 |pages=1626–1632 |doi=10.1038/s41559-018-0639-7 |issn=2397-334X}}</ref><ref>{{Cite journal |last=Neme |first=Rafik |last2=Tautz |first2=Diethard |date=2016-02-02 |title=Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence |url=https://elifesciences.org/articles/09977 |journal=eLife |language=en |volume=5 |pages=e09977 |doi=10.7554/eLife.09977 |issn=2050-084X}}</ref>
 
==Genome-wide association studies (GWAS) and non-coding DNA==