Revision as of 14:38, 7 March 2012 by Rkalendar (minor edit, no edit summary)		Revision as of 14:39, 7 March 2012 by Rkalendar (minor edit, no edit summary)
Line 1: The linguistic complexity (LC) measure <ref>{{cite book\| author=[http://evolution.haifa.ac.il/index.php/people/item/40-edward-n-trifonov-phd Edward N. Trifonov] \|year=1990\| book=Structure & Methods\| title=Structure and Methods\| series= Human Genome Initiative and DNA Recombination\| volume=1\| pages=69–77\|chapter=Making sense of the human genome\|publisher=Adenine Press, New York ~~[http://evolution.haifa.ac.il/index.php/people/item/40-edward-n-trifonov-phd Edward N. Trifonov Ph.D.]~~}}</ref> was introduced as a measure of the ‘vocabulary richness’ of a text. When a [[nucleotide]] sequence is studied as a text written in a four-letter alphabet, the repetitiveness of that text, that is, the extensive repetition of some [[N-gram\|N-grams (words)]], can be calculated and can serve as a measure of sequence complexity. Thus, the more complex a [[DNA_sequence\|DNA sequence]], the richer its oligonucleotide vocabulary, whereas repetitious sequences have lower complexity. We have recently improved the original algorithm described by Trifonov (1990) without changing the essence of the linguistic complexity approach.
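
The idea of vocabulary richness can be illustrated with a minimal Python sketch. It is not the improved algorithm mentioned above, whose details are not given here. It assumes one common formulation: for each word length k, the vocabulary usage is the number of distinct k-mers observed in the sequence divided by the maximum number possible for a sequence of that length, and LC is the product of these usages. The function names `vocabulary_usage` and `linguistic_complexity` and the parameters `alphabet_size` and `max_k` are hypothetical, chosen only for this illustration.

```python
def vocabulary_usage(seq, k, alphabet_size=4):
    """Fraction of the possible k-mer vocabulary actually used by seq.

    Assumption: the maximum vocabulary is limited both by the alphabet
    (alphabet_size ** k) and by the number of k-mer positions in seq.
    """
    n = len(seq)
    if k < 1 or k > n:
        raise ValueError("k must satisfy 1 <= k <= len(seq)")
    # Distinct words of length k found in the sequence
    observed = len({seq[i:i + k] for i in range(n - k + 1)})
    # Largest vocabulary a sequence of this length could contain
    max_possible = min(alphabet_size ** k, n - k + 1)
    return observed / max_possible


def linguistic_complexity(seq, max_k=None, alphabet_size=4):
    """Product of vocabulary usages for word lengths 1..max_k.

    Assumption: max_k defaults to the full sequence length, which is
    only practical for short sequences.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    if max_k is None:
        max_k = len(seq)
    complexity = 1.0
    for k in range(1, min(max_k, len(seq)) + 1):
        complexity *= vocabulary_usage(seq, k, alphabet_size)
    return complexity
```

Under these assumptions a repetitious sequence such as `"AAAAAAAA"` scores lower than a varied one such as `"ACGTTGCA"`, because it uses only a small fraction of the available words at each length.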

Linguistic sequence complexity: Difference between revisions