The linguistic complexity (LC) measure <ref>{{cite book| author=[http://evolution.haifa.ac.il/index.php/people/item/40-edward-n-trifonov-phd Edward N. Trifonov] |year=1990| book=Structure & Methods| title=Structure and Methods| series= Human Genome Initiative and DNA Recombination| volume=1| pages=69–77|chapter=Making sense of the human genome|publisher=Adenine Press, New York}}</ref>
When a [[nucleotide]] sequence is studied as a text written in a four-letter alphabet, the repetitiveness of that text, that is, the extensive repetition of some [[N-gram|N-grams (words)]], can be calculated and used as a measure of sequence complexity. Thus, the more complex a [[DNA_sequence|DNA sequence]], the richer its oligonucleotide vocabulary, whereas repetitious sequences have relatively lower complexities. We have recently improved the original algorithm described in (Trifonov 1990) without changing the essence of the linguistic complexity approach.
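The idea of vocabulary richness can be illustrated with a minimal Python sketch. It rests on an assumption the text above does not spell out: for each word length k, the vocabulary usage is taken as the number of distinct k-letter words observed divided by the maximum number possible for a sequence of that length, and the usages for all word lengths are multiplied together. The function names `vocabulary_usage` and `linguistic_complexity` are hypothetical, and the sketch is not the improved algorithm mentioned above.

```python
def vocabulary_usage(seq, k, alphabet_size=4):
    """Fraction of the possible k-letter words that actually occur in seq.

    Assumption: the maximum vocabulary is limited both by the alphabet
    (alphabet_size ** k) and by the number of windows of length k.
    """
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("word length k exceeds sequence length")
    observed = len({seq[i:i + k] for i in range(n_windows)})
    max_possible = min(alphabet_size ** k, n_windows)
    return observed / max_possible


def linguistic_complexity(seq, max_k=None, alphabet_size=4):
    """Product of vocabulary usages for word lengths 1..max_k (illustrative)."""
    seq = seq.upper()
    if max_k is None:
        max_k = len(seq)  # consider every word length up to the full sequence
    lc = 1.0
    for k in range(1, max_k + 1):
        lc *= vocabulary_usage(seq, k, alphabet_size)
    return lc


if __name__ == "__main__":
    # A repetitious sequence should score lower than a varied one.
    print(linguistic_complexity("AAAAAAAAAAAA"))
    print(linguistic_complexity("ACGTTGCAAGCT"))
```

Under these assumptions a homopolymer such as AAAA uses only one word at every length and scores far below a sequence such as ACGT, in which every window of every length is distinct.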