Non-coding DNA: Difference between revisions

Junk DNA: note that we're not just talking human genome here
consistent citation formatting
Line 8:
[[Genome size]] in eukaryotes can vary over a wide range, even between closely related species. This puzzling observation was originally known as the [[C-value | C-value Paradox]], where "C" refers to the haploid genome size.<ref>{{cite journal | vauthors = Thomas CA | title = The genetic organization of chromosomes | journal = Annual Review of Genetics | volume = 5 | pages = 237–256 | date = 1971 | pmid = 16097657 | doi = 10.1146/annurev.ge.05.120171.001321 }}</ref> The paradox was resolved with the discovery that most of the differences were due to the expansion and contraction of repetitive DNA and not the number of genes. Some researchers speculated that this repetitive DNA was mostly junk DNA. The reasons for the changes in genome size are still being worked out and this problem is called the C-value Enigma.<ref>{{ cite journal | vauthors = Elliott TA, Gregory TR | date = 2015 | title = What's in a genome? The C-value enigma and the evolution of eukaryotic genome content | journal = Phil. Trans. R. Soc. B | volume = 370 | issue = 1678 | pages = 20140331 | doi = 10.1098/rstb.2014.0331| pmid = 26323762 | pmc = 4571570 | s2cid = 12095046 }}</ref>
 
This led to the observation that the number of genes does not seem to correlate with perceived notions of complexity because the number of genes seems to be relatively constant, an issue termed the [[G-value paradox|G-value Paradox]].<ref>{{ cite journal | vauthors = Hahn MW, Wray GA | date = 2002 | title = The g-value paradox | journal = Evolution and Development | volume = 4 | issue = 2 | pages = 73–75 | doi = 10.1046/j.1525-142X.2002.01069.x| pmid = 12004964 | s2cid = 2810069 }}</ref> For example, the genome of the unicellular ''[[Polychaos dubium]]'' (formerly known as ''Amoeba dubia'') has been reported to contain more than 200 times the amount of DNA in humans (i.e. more than 600 billion [[genome size|pairs of bases]] vs a bit more than 3 billion in humans).<ref name=Gregory>{{cite journal | vauthors = Gregory TR, Hebert PD | title = The modulation of DNA content: proximate causes and ultimate consequences | journal = Genome Research | volume = 9 | issue = 4 | pages = 317–324 | date = April 1999 | pmid = 10207154 | doi = 10.1101/gr.9.4.317 | s2cid = 16791399 | doi-access = free }}</ref> The [[pufferfish]] ''[[Takifugu]] rubripes'' genome is only about one eighth the size of the human genome, yet seems to have a comparable number of genes. Genes take up about 30% of the pufferfish genome and the coding DNA is about 10%. (Non-coding DNA = 90%.) 
The reduced size of the pufferfish genome is due to a reduction in the length of introns and less repetitive DNA.<ref>{{ cite journal | vauthors = Aparicio S, Chapman J, Stupka E, Putnam N, Chia JM, Dehal P, Christoffels A, Rash S, Hoon S, Smit A | date = 2002 | title = Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes | journal = Science | volume = 297 | issue = 5585 | pages = 1301–1310 | doi = 10.1126/science.1072104| pmid = 12142439 | bibcode = 2002Sci...297.1301A | s2cid = 10310355 }}</ref><ref name="Ohno">{{cite journal | vauthors = Ohno S | title = So much 'junk' DNA in our genome | journal = Brookhaven Symposia in Biology | volume = 23 | pages = 366–370 | date = 1972 | pmid = 5065367 | oclc = 101819442 }}</ref>
 
''[[Utricularia gibba]]'', a [[bladderwort]] plant, has a very small nuclear genome (100.7 Mb) compared to most plants.<ref name = Ibarra-Laclette>{{ cite journal | vauthors = Ibarra-Laclette E, Lyons E, Hernández-Guzmán G, Pérez-Torres CA, Carretero-Paulet L, Chang TH, Lan T, Welch AJ, Juárez MJ, Simpson J, etal | date = 2013 | title = Architecture and evolution of a minute plant genome | journal = Nature | volume = 498 | issue = 7452 | pages = 94–98 | doi = 10.1038/nature12132| pmid = 23665961 | pmc = 4972453 | bibcode = 2013Natur.498...94I | s2cid = 18219754 }}</ref><ref name = Lan>{{ cite journal | vauthors = Lan T, Renner T, Ibarra-Laclette E, Farr KM, Chang TH, Cervantes-Pérez SA, Zheng C, Sankoff D, Tang H, Purbojati RW | date = 2017 | title = Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome | journal = Proceedings of the National Academy of Sciences | volume = 114 | issue = 22 | pages = E4435–E4441 | doi = 10.1073/pnas.1702072114| pmid = 28507139 | pmc = 5465930 | bibcode = 2017PNAS..114E4435L | doi-access = free }}</ref> It likely evolved from an ancestral genome that was 1,500 Mb in size.<ref name = Lan/> The bladderwort genome has roughly the same number of genes as other plants, but the total amount of coding DNA comes to about 30% of the genome.<ref name = Ibarra-Laclette/><ref name="Lan"/>
Line 14:
The remainder of the genome (70% non-coding DNA) consists of promoters and regulatory sequences that are shorter than those in other plant species.<ref name = Ibarra-Laclette/> The genes contain introns but there are fewer of them and they are smaller than the introns in other plant genomes.<ref name = Ibarra-Laclette/> There are noncoding genes, including many copies of ribosomal RNA genes.<ref name = Lan/> The genome also contains telomere sequences and centromeres as expected.<ref name = Lan/> Much of the repetitive DNA seen in other eukaryotes has been deleted from the bladderwort genome since that lineage split from those of other plants. About 59% of the bladderwort genome consists of transposon-related sequences but since the genome is so much smaller than other genomes, this represents a considerable reduction in the amount of this DNA.<ref name = Lan/> The authors of the original 2013 article note that claims of additional functional elements in the non-coding DNA of animals do not seem to apply to plant genomes.<ref name = Ibarra-Laclette/>
 
According to a New York Times piece, during the evolution of this species, "... genetic junk that didn’t serve a purpose was expunged, and the necessary stuff was kept."<ref>{{cite news | vauthors = Klein J | title = Genetic Tidying Up Made Humped Bladderworts Into Carnivorous Plants | url = https://www.nytimes.com/2017/05/19/science/humped-bladderwort-carnivorous-plant-genome.html | work = New York Times | date = 19 May 2017 | access-date = May 30, 2022}}</ref> According to Victor Albert of the University at Buffalo, the plant is able to expunge its so-called junk DNA and "have a perfectly good multicellular plant with lots of different cells, organs, tissue types and flowers, and you can do it without the junk. Junk is not needed."<ref>{{ cite press release | vauthors = Hsu C, Stolte D | date = May 13, 2013 | title = Carnivorous Plant Throws Out 'Junk' DNA | url = https://news.arizona.edu/story/carnivorous-plant-throws-out-junk-dna | location = Tucson, AZ, USA | publisher = University of Arizona | access-date = May 29, 2022}}</ref>
 
==Types of non-coding DNA sequences==
Line 26:
Typical classes of noncoding genes in eukaryotes include genes for [[small nuclear RNA]]s (snRNAs), [[small nucleolar RNA]]s (snoRNAs), [[microRNA]]s (miRNAs), [[Small interfering RNA|short interfering RNAs]] (siRNAs), [[Piwi-interacting RNA|PIWI-interacting RNAs]] (piRNAs), and [[Long non-coding RNA|long noncoding RNAs]] (lncRNAs). In addition, there are a number of unique RNA genes that produce catalytic RNAs.<ref>{{cite journal | vauthors=Cech TR, Steitz JA | title=The Noncoding RNA Revolution - Trashing Old Rules to Forge New Ones | journal=Cell|volume=157|pages=77–94|date=2014| issue=1 | doi=10.1016/j.cell.2014.03.008 | pmid=24679528 | s2cid=14852160 | doi-access=free }}</ref>
 
Noncoding genes account for only a few percent of prokaryotic genomes<ref>{{cite journal | vauthors = Rogozin IB, Makarova KS, Natale DA, Spiridonov AN, Tatusov RL, Wolf YI, Yin J, Koonin EV | display-authors = 6 | title = Congruent evolution of different classes of non-coding DNA in prokaryotic genomes | journal = Nucleic Acids Research | volume = 30 | issue = 19 | pages = 4264–4271 | date = October 2002 | pmid = 12364605 | pmc = 140549 | doi = 10.1093/nar/gkf549 }}</ref> but they can represent a vastly higher fraction in eukaryotic genomes.<ref>{{cite book |doi=10.1016/B978-0-12-800049-6.00171-2 |chapter=Adaptive Molecular Evolution: Detection Methods |title=Encyclopedia of Evolutionary Biology |year=2016 | vauthors = Bielawski JP, Jones C |pages=16–25 |isbn=978-0-12-800426-5 }}</ref> In humans, the noncoding genes take up at least 6% of the genome, largely because there are hundreds of copies of ribosomal RNA genes.{{citation needed|date=May 2022}} Protein-coding genes occupy about 38% of the genome, a fraction that is much higher than the coding region because genes contain large introns.{{citation needed|date=May 2022}}
 
The total number of noncoding genes in the human genome is controversial. Some scientists think that there are only about 5,000 noncoding genes while others believe that there may be more than 100,000 (see the article on [[Non-coding RNA]]). The difference is largely due to debate over the number of lncRNA genes.<ref>{{ cite journal | vauthors = Ponting CP, Haerty W | date = 2022 | title = Genome-Wide Analysis of Human Long Noncoding RNAs: A Provocative Review | journal = Annual Review of Genomics and Human Genetics | volume = 23 | pages = 153–172 | doi = 10.1146/annurev-genom-112921-123710| pmid = 35395170 | s2cid = 248049706 | doi-access = free }}</ref>
Line 63:
 
===Centromeres===
[[File:Human karyotype with bands and sub-bands.png|thumb|Schematic [[karyotype|karyogram]] of a human, showing an overview of the [[human genome]] on [[G banding]], wherein non-coding DNA is present at the centromeres (shown as a narrow segment of each chromosome), and also occurs to a greater extent in darker ([[GC-content|GC-poor]]) regions.<ref name=Romiguier2017>{{cite journal | vauthors = Romiguier J, Roux C | title = Analytical Biases Associated with GC-Content in Molecular Evolution | journal = Frontiers in Genetics | volume = 8 | pages = 16 | year = 2017 | pmid = 28261263 | pmc = 5309256 | doi = 10.3389/fgene.2017.00016 }}</ref><br>{{further|Karyotype}}]]
{{further|Centromere}}
 
Line 108:
==Junk DNA==
{{Main|Junk DNA}}
Although many non-coding regions have biological function,<ref name="Costa non-coding32">{{cite book |title=Non-coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection |vauthors=Costa F |date=2012 |publisher=[[Caister Academic Press]] |isbn=978-1-904455-94-3 |veditors=Morris KV |chapter=7 Non-coding RNAs, Epigenomics, and Complexity in Human Cells}}{{page needed|date=June 2022}}</ref><ref name="Nessa32">{{cite book |title=Junk DNA: A Journey Through the Dark Matter of the Genome |vauthors=Carey M |date=2015 |publisher=Columbia University Press |isbn=978-0-231-17084-0 |author-link=Nessa Carey}}{{page needed|date=June 2022}}</ref> some genomes contain sequence that does not have biological function and has been described as "Junk DNA". Though exact definitions differ, this refers broadly to "any DNA sequence that does not play a functional role in development, physiology, or some other organism-level capacity."<ref name="PalazzoGregory20142">{{cite journal | vauthors = Palazzo AF, Gregory TR | title = The case for junk DNA | journal = PLoS Genetics | volume = 10 | issue = 5 | pages = e1004351 | date = May 2014 | pmid = 24809441 | pmc = 4014423 | doi = 10.1371/journal.pgen.1004351 }}</ref> The amount of sequence that falls under this term varies widely between organisms.<ref name=":0" /><ref name=":12" /> The term itself has been contentious, as different definitions of what constitutes biological function lead to highly different estimates of what proportion of a genome falls into the category.<ref name=":02">{{cite journal | vauthors = Palazzo AF, Kejiou NS | title = Non-Darwinian Molecular Biology | journal = Frontiers in Genetics | volume = 13 | pages = 831068 | year = 2022 | pmid = 35251134 | pmc = 8888898 | doi = 10.3389/fgene.2022.831068 | doi-access = free }}</ref><ref name=":13">{{cite journal | vauthors = Ponting CP, Hardison RC | title = What fraction of the human genome is functional? | journal = Genome Research | volume = 21 | issue = 11 | pages = 1769–1776 | date = November 2011 | pmid = 21875934 | pmc = 3205562 | doi = 10.1101/gr.116814.110 }}</ref> In particular, the [[ENCODE]] project in the 2000s demonstrated detectable biochemical activity resulting from most parts of the genome ([[Transcription (biology)|transcription to RNA]], [[Transcription factor-binding site|transcription factor binding]], etc.).<ref name="eddy2">{{cite journal | vauthors = Eddy SR | title = The C-value paradox, junk DNA and ENCODE | journal = Current Biology | volume = 22 | issue = 21 | pages = R898–R899 | date = November 2012 | pmid = 23137679 | doi = 10.1016/j.cub.2012.10.002 | s2cid = 28289437 | doi-access = free | author-link = Sean Eddy }}</ref><ref>{{cite journal | vauthors = Celniker SE, Dillon LA, Gerstein MB, Gunsalus KC, Henikoff S, Karpen GH, Kellis M, Lai EC, Lieb JD, MacAlpine DM, Micklem G, Piano F, Snyder M, Stein L, White KP, Waterston RH | display-authors = 6 | title = Unlocking the secrets of the genome | journal = Nature | volume = 459 | issue = 7249 | pages = 927–930 | date = June 2009 | pmid = 19536255 | pmc = 2843545 | doi = 10.1038/459927a }}</ref> However, it has been less clear whether this biochemical activity is promiscuous activity in a noisy biological system or evolutionarily relevant biological function, and consequently whether that DNA counts as "junk".<ref name=":12">{{cite journal | vauthors = Ponting CP, Hardison RC | title = What fraction of the human genome is functional? | journal = Genome Research | volume = 21 | issue = 11 | pages = 1769–1776 | date = November 2011 | pmid = 21875934 | pmc = 3205562 | doi = 10.1101/gr.116814.110 }}</ref><ref name=":13" /><ref name="eddy2" />
 
==Genome-wide association studies (GWAS) and non-coding DNA==
Line 135:
{{Refbegin|32em}}
* {{cite book | vauthors = Bennett MD, Leitch IJ | year = 2005 | chapter = Genome size evolution in plants |chapter-url=https://books.google.com/books?id=8HtPZP9VSiMC&pg=PA89 | title = The Evolution of the Genome | veditors = Gregory TR | publisher = Elsevier | location = San Diego | pages = 89–162 |isbn=978-0-08-047052-8}}
* {{cite book |doi=10.1016/B978-012301463-4/50003-6 |chapter=Genome Size Evolution in Animals |title=The Evolution of the Genome |year=2005 | vauthors = Gregory TR |pages=3–87 |isbn=978-0-12-301463-4 }}
* {{cite journal | vauthors = Shabalina SA, Spiridonov NA | title = The mammalian transcriptome and the function of non-coding DNA sequences | journal = Genome Biology | volume = 5 | issue = 4 | pages = 105 | year = 2004 | pmid = 15059247 | pmc = 395773 | doi = 10.1186/gb-2004-5-4-105 }}
* {{cite journal | vauthors = Castillo-Davis CI | title = The evolution of noncoding DNA: how much junk, how much func? | journal = Trends in Genetics | volume = 21 | issue = 10 | pages = 533–536 | date = October 2005 | pmid = 16098630 | doi = 10.1016/j.tig.2005.08.001 }}