Revision as of 15:21, 17 March 2020 by Citation bot (m Alter: template type. Activated by User:AManWithNoPlan | All pages linked from User:AManWithNoPlan/sandbox2 | via #UCB_webform_linked)
Revision as of 07:54, 12 April 2020 by OAbot (m Open access bot: doi added to citation with #oabot.)
Line 13:
== Composition ==
[[File:Transitions-transversions.png|thumb|286x286px|'''Point mutation types:''' transitions (blue) are elevated compared to transversions (red) in GC-rich coding regions.<ref>(n.d.). Retrieved from <nowiki>https://www.differencebetween.com/wp-content/uploads/2017/03/Difference-Between-Transition-and-Transversion-3.png</nowiki></ref>]]
The evidence suggests that there is a general interdependence between base composition patterns and coding region availability.<ref>{{cite journal | vauthors = Lercher MJ, Urrutia AO, Pavlícek A, Hurst LD | title = A unification of mosaic structures in the human genome | journal = Human Molecular Genetics | volume = 12 | issue = 19 | pages = 2411–5 | date = October 2003 | pmid = 12915446 | doi = 10.1093/hmg/ddg251 | doi-access = free }}</ref> The coding region is thought to contain a higher [[GC-content]] than non-coding regions. Further research found that the longer the coding strand, the higher the GC-content. Short coding strands remain comparatively GC-poor, similar to the low GC-content of translational [[stop codon]]s such as TAG, TAA, and TGA.<ref>{{cite journal | vauthors = Oliver JL, Marín A | title = A relationship between GC content and coding-sequence length | journal = Journal of Molecular Evolution | volume = 43 | issue = 3 | pages = 216–23 | date = September 1996 | pmid = 8703087 | doi = 10.1007/pl00006080 | bibcode = 1996JMolE..43..216O }}</ref> GC-rich areas are also where the ratio of [[point mutation]] types is altered slightly: there are more [[Transition (genetics)|transitions]], which are changes from purine to purine or pyrimidine to pyrimidine, than [[transversion]]s, which are changes from purine to pyrimidine or pyrimidine to purine.
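The two quantities discussed here can be made concrete with a minimal Python sketch. It computes the GC-content of a sequence and classifies a single-base substitution as a transition or a transversion using the purine/pyrimidine rule stated above. The function names `gc_content` and `classify_substitution` are illustrative assumptions, not taken from any of the cited sources, and only the four unambiguous DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return sum(base in "GC" for base in seq) / len(seq)


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    # Transition: both bases are purines or both are pyrimidines.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` is a transition (purine to purine), while `classify_substitution("A", "C")` is a transversion. Applying `gc_content` to the stop codons TAG, TAA, and TGA gives values of at most one third, in line with their description as GC-poor.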
Transitions are less likely to change the encoded amino acid and more likely to remain a [[silent mutation]] (especially if they occur in the third [[nucleotide]] of a codon), which is usually beneficial to the organism during translation and protein formation.<ref>{{Cite web|url=http://rosalind.info/glossary/gene-coding-region/|title=ROSALIND {{!}} Glossary {{!}} Gene coding region|website=rosalind.info|access-date=2019-10-31}}</ref> This indicates that essential coding regions (gene-rich) are higher in GC-content and more stable and resistant to [[mutation]] than accessory and non-essential regions (gene-poor).<ref>{{cite journal | vauthors = Vinogradov AE | title = DNA helix: the importance of being GC-rich | journal = Nucleic Acids Research | volume = 31 | issue = 7 | pages = 1838–44 | date = April 2003 | pmid = 12654999 | pmc = 152811 | doi = 10.1093/nar/gkg296 }}</ref> However, it is still unclear whether this came about through neutral and random mutation or through a pattern of [[Natural selection|selection]].<ref>{{cite journal | vauthors = Bohlin J, Eldholm V, Pettersson JH, Brynildsrud O, Snipen L | title = The nucleotide composition of microbial genomes indicates differential patterns of selection on core and accessory genomes | journal = BMC Genomics | volume = 18 | issue = 1 | pages = 151 | date = February 2017 | pmid = 28187704 | pmc = 5303225 | doi = 10.1186/s12864-017-3543-7 }}</ref> There is also debate on whether the methods used to ascertain the relationship between GC-content and coding region, such as gene windows, are accurate and unbiased.<ref>{{cite journal | vauthors = Sémon M, Mouchiroud D, Duret L | title = Relationship between gene expression and GC-content in mammals: statistical significance and biological relevance | journal = Human Molecular Genetics | volume = 14 | issue = 3 | pages = 421–7 | date = February 2005 | pmid = 15590696 | doi = 10.1093/hmg/ddi038 | doi-access = free }}</ref>
== Structure and Function ==
Line 31:
[[Alkylation]] is one form of regulation of the coding region.<ref>{{cite journal | vauthors = Shinohara K, Sasaki S, Minoshima M, Bando T, Sugiyama H | title = Alkylation of template strand of coding region causes effective gene silencing | journal = Nucleic Acids Research | volume = 34 | issue = 4 | pages = 1189–95 | date = 2006-02-13 | pmid = 16500890 | pmc = 1383623 | doi = 10.1093/nar/gkl005 }}</ref> The gene that would have been transcribed can be silenced by targeting a specific sequence. The bases in this sequence are blocked using [[Alkyl|alkyl groups]], which create the [[Gene silencing|silencing]] effect.<ref>{{Cite web|url=http://www.informatics.jax.org/vocab/gene_ontology/GO:0006305|title=DNA alkylation Gene Ontology Term (GO:0006305)|website=www.informatics.jax.org|access-date=2019-10-30}}</ref> While the [[regulation of gene expression]] manages the abundance of RNA or protein made in a cell, the regulation of these mechanisms can be controlled by a [[regulatory sequence]] found before the [[open reading frame]] begins in a strand of DNA. The [[regulatory sequence]] then determines the location and time at which expression will occur for a protein coding region.<ref>{{Cite journal |last1=Shafee|first1=Thomas|last2=Lowe|first2=Rohan | name-list-format = vanc |date=2017 |title=Eukaryotic and prokaryotic gene structure|journal=WikiJournal of Medicine|volume=4|issue=1|doi=10.15347/wjm/2017.002|doi-access=free}}</ref> [[RNA splicing]] ultimately determines what part of the sequence becomes translated and expressed, and this process involves cutting out introns and putting together exons.
Where the RNA [[spliceosome]] cuts, however, is guided by the recognition of [[splice site]]s, in particular the 5' splice site, which is one of the substrates for the first step in splicing.<ref>{{cite journal | vauthors = Konarska MM | title = Recognition of the 5' splice site by the spliceosome | journal = Acta Biochimica Polonica | volume = 45 | issue = 4 | pages = 869–81 | date = 1998 | pmid = 10397335 | doi = 10.18388/abp.1998_4346 | doi-access = free }}</ref> The coding regions are within the exons, which become covalently joined together to form the [[mature messenger RNA]].
== Mutations ==
Line 54:
While identification of [[open reading frames]] within a DNA sequence is straightforward, identifying coding sequences is not, because the cell translates only a subset of all open reading frames to proteins.<ref>{{cite journal | vauthors = Furuno M, Kasukawa T, Saito R, Adachi J, Suzuki H, Baldarelli R, Hayashizaki Y, Okazaki Y | display-authors = 6 | title = CDS annotation in full-length cDNA sequence | journal = Genome Research | volume = 13 | issue = 6B | pages = 1478–87 | date = June 2003 | pmid = 12819146 | pmc = 403693 | doi = 10.1101/gr.1060303 | url = http://genome.cshlp.org/content/13/6b/1478.full.pdf+html | publisher = Cold Spring Harbor Laboratory Press }}</ref> Currently, CDS prediction relies on sampling and sequencing mRNA from cells, although determining which parts of a given mRNA are actually translated to protein remains a problem. CDS prediction is a subset of [[gene prediction]], the latter also including prediction of DNA sequences that code not only for protein but also for other functional elements such as RNA genes and regulatory sequences.
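To illustrate why identifying open reading frames is the straightforward part, the following Python sketch scans the three forward reading frames of a DNA string for a start codon followed by an in-frame stop codon. It rests on several assumptions: ATG as the only start codon, the standard stop codons TAA, TAG, and TGA, no reverse-strand search, and a hypothetical `min_codons` threshold. The function name `find_orfs` is illustrative and is not the method of any cited source.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, int]]:
    """Return (start, end, frame) for each forward-strand ORF.

    start is the index of the A in ATG, end is exclusive and includes the
    stop codon, and min_codons counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through complete codons in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume searching after this stop codon
    return orfs
```

Calling `find_orfs("CCATGAAATAGGG")` reports one candidate spanning `ATGAAATAG` in the third reading frame. A scan like this only lists candidates; it cannot say which of them the cell actually translates, which is the harder problem of CDS annotation described above.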
[[Overlapping gene|Gene overlapping]] occurs in both [[prokaryote]]s and [[eukaryote]]s, and relatively often in both DNA and RNA viruses, as an evolutionary advantage that reduces genome size while retaining the ability to produce various proteins from the available coding regions.<ref>{{cite journal | vauthors = Rogozin IB, Spiridonov AN, Sorokin AV, Wolf YI, Jordan IK, Tatusov RL, Koonin EV | title = Purifying and directional selection in overlapping prokaryotic genes | language = English | journal = Trends in Genetics | volume = 18 | issue = 5 | pages = 228–32 | date = May 2002 | pmid = 12047938 | doi = 10.1016/S0168-9525(02)02649-5 | url = https://www.cell.com/trends/genetics/abstract/S0168-9525(02)02649-5 }}</ref><ref>{{cite journal | vauthors = Chirico N, Vianelli A, Belshaw R | title = Why genes overlap in viruses | journal = Proceedings. Biological Sciences | volume = 277 | issue = 1701 | pages = 3809–17 | date = December 2010 | pmid = 20610432 | pmc = 2992710 | doi = 10.1098/rspb.2010.1052 }}</ref> For both DNA and RNA, [[Sequence alignment#Pairwise alignment|pairwise alignments]] can detect overlapping coding regions, including short [[open reading frame]]s in viruses, but they require a known coding strand against which the potential overlapping coding strand can be compared.<ref>{{cite journal | vauthors = Firth AE, Brown CM | title = Detecting overlapping coding sequences with pairwise alignments | journal = Bioinformatics | volume = 21 | issue = 3 | pages = 282–92 | date = February 2005 | pmid = 15347574 | doi = 10.1093/bioinformatics/bti007 | url = https://academic.oup.com/bioinformatics/article/21/3/282/237775 | doi-access = free }}</ref> An alternative method that uses single genome sequences does not require multiple genome sequences for comparison, but it needs an overlap of at least 50 nucleotides to be sensitive.<ref>{{cite journal | vauthors = Schlub TE, Buchmann JP, Holmes EC | title = A Simple Method to Detect Candidate Overlapping Genes in Viruses Using Single Genome Sequences | journal = Molecular Biology and Evolution | volume = 35 | issue = 10 | pages = 2572–2581 | date = October 2018 | pmid = 30099499 | pmc = 6188560 | doi = 10.1093/molbev/msy155 | editor-first = Harmit | editor-last = Malik }}</ref>
== See also ==

Coding region: Difference between revisions