m Dating maintenance tags: {{What}} |
m Open access bot: doi updated in citation with #oabot. |
Line 9:
This formula differs from the original LC measure<ref name=Trifonov1990 /> in two respects: in the way vocabulary usage U<sub>i</sub> is calculated, and in the range of {{math|<var>i</var>}}, which runs only up to W rather than from 2 to N−1. This limitation on the range of U<sub>i</sub> makes the algorithm substantially more efficient without loss of power.<ref name=Gabrielian1999 />
Another modified version was used by Troyanskaya et al.,<ref name=TAKLB01>{{Cite journal | doi = 10.1093/bioinformatics/18.5.679| title = Sequence complexity profiles of prokaryotic genomic sequences: A fast algorithm for calculating linguistic complexity| journal = Bioinformatics| volume = 18| issue = 5| pages = 679–88| year = 2002| last1 = Troyanskaya | first1 = O. G.| last2 = Arbell | first2 = O.| last3 = Koren | first3 = Y.| last4 = Landau | first4 = G. M.| last5 = Bolshoy | first5 = A. | pmid=12050064| doi-access = free}}</ref>{{what|date=July 2023}} wherein linguistic complexity (LC) is defined as the ratio of the number of substrings of any length present in the string to the maximum possible number of substrings. For a sequence of length N over a four-letter alphabet, the maximum vocabulary over word sizes 1 to m can be calculated according to the simple formula <math>\sum_{k=1}^{m} \min(4^k, N-k+1)</math>.<ref name=TAKLB01 />
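The ratio described above can be illustrated with a minimal Python sketch. It is a brute-force illustration only, not the fast algorithm of the cited paper. The function names, the default four-letter alphabet, and the choice to count each distinct substring once are assumptions made for this example.

```python
def max_vocabulary(n, m, alphabet_size=4):
    """Upper bound on the number of distinct words of sizes 1..m
    in a sequence of length n (assumed alphabet size: 4, as for DNA)."""
    return sum(min(alphabet_size ** k, n - k + 1) for k in range(1, m + 1))


def observed_vocabulary(seq, m):
    """Number of distinct substrings of sizes 1..m actually present in seq.
    Brute force: every window of each size is collected into a set."""
    total = 0
    for k in range(1, m + 1):
        words = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        total += len(words)
    return total


def linguistic_complexity(seq, m=None, alphabet_size=4):
    """Ratio of observed vocabulary to maximum possible vocabulary.
    If m is not given, all word sizes up to len(seq) are used."""
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    m = n if m is None else min(m, n)
    return observed_vocabulary(seq, m) / max_vocabulary(n, m, alphabet_size)


if __name__ == "__main__":
    print(linguistic_complexity("ACGT"))  # every possible word is present
    print(linguistic_complexity("AAAA"))  # a homopolymer has low complexity
```

Under these assumptions, a sequence in which every window is a different word reaches a ratio of 1, while a repetitive sequence scores much lower. Enumerating all windows takes time that grows quickly with sequence length, so the sketch is only practical for short sequences.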
This sequence analysis complexity calculation can be used to search for conserved regions between compared sequences for the detection of low-complexity regions including simple sequence repeats, imperfect [[Direct repeat|direct]] or [[inverted repeat]]s, polypurine and polypyrimidine [[Triple-stranded DNA|triple-stranded DNA structures]], and four-stranded structures (such as [[G-quadruplex]]es).<ref name=Kalendar2011>{{Cite journal | last1 = Kalendar | first1 = R. | last2 = Lee | first2 = D. | last3 = Schulman | first3 = A. H. | doi = 10.1016/j.ygeno.2011.04.009 | title = Java web tools for PCR, in silico PCR, and oligonucleotide assembly and analysis | journal = Genomics | volume = 98 | issue = 2 | pages = 137–144 | year = 2011 | pmid = 21569836 | doi-access =
== References ==