→Coding sequence detection: Whole genome method |
Line 52:
== Coding sequence detection ==
[[File:Human karyotype with bands and sub-bands.png|thumb|Schematic [[karyotype|karyogram]] of a human, showing an overview of the [[human genome]] on [[G banding]] (which includes [[Giemsa]]-staining), wherein coding DNA regions occur to a greater extent in lighter ([[GC-content|GC-rich]]) regions.<ref name="pmid28261263">{{cite journal| author=Romiguier J, Roux C| title=Analytical Biases Associated with GC-Content in Molecular Evolution. | journal=Front Genet | year= 2017 | volume= 8 | issue= | pages= 16 | pmid=28261263 | doi=10.3389/fgene.2017.00016 | pmc=5309256 | url=https://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=28261263 }} </ref><br>{{further|Karyotype}}]]
While identifying [[open reading frames]] within a DNA sequence is straightforward, identifying coding sequences is not, because the cell translates only a subset of all open reading frames into proteins.<ref>{{cite journal | vauthors = Furuno M, Kasukawa T, Saito R, Adachi J, Suzuki H, Baldarelli R, Hayashizaki Y, Okazaki Y | display-authors = 6 | title = CDS annotation in full-length cDNA sequence | journal = Genome Research | volume = 13 | issue = 6B | pages = 1478–87 | date = June 2003 | pmid = 12819146 | pmc = 403693 | doi = 10.1101/gr.1060303 | publisher = Cold Spring Harbor Laboratory Press }}</ref> CDS prediction therefore currently relies on sampling and sequencing mRNA from cells, although this still leaves the problem of determining which parts of a given mRNA are actually translated into protein. CDS prediction is a subset of [[gene prediction]], which also covers the prediction of DNA sequences that code for other functional elements, such as RNA genes and regulatory sequences.
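
The contrast drawn above can be illustrated with a minimal Python sketch of open reading frame enumeration. It assumes the standard stop codons (TAA, TAG, TGA), ATG as the only start codon, and an unambiguous A/C/G/T sequence; the function names and the <code>min_codons</code> parameter are hypothetical and chosen only for illustration. The sketch reports every ATG-to-stop stretch in all six reading frames, and it cannot say which of them, if any, the cell actually translates.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons=1):
    """Yield (start, end) for each ATG...stop stretch in the three frames.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    codons before the stop codon (the start codon included).
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None  # look for the next ORF in this frame


def find_orfs(seq, min_codons=1):
    """List (strand, start, end, orf_sequence) over all six reading frames.

    Coordinates for the "-" strand refer to the reverse-complemented
    sequence, not to the input sequence.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for start, end in orfs_in_strand(s, min_codons):
            results.append((strand, start, end, s[start:end]))
    return results


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATGATAGCC"):
        print(orf)
```

Raising <code>min_codons</code> filters out short stretches that arise by chance, but any such length threshold is a heuristic: deciding which of the remaining candidates are true coding sequences still requires additional evidence, such as the mRNA sequencing described above.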