Revision as of 06:14, 16 April 2018 edit John Douglas Huff (talk \| contribs) 32 edits ←Created page with '{{subst:AFC submission/draftnew}}<!-- Important, do not remove this line before article has been created. --> The Ruzzo-Tompa algorithm is a linear time alg...'		Revision as of 10:55, 16 April 2018 edit undo John Douglas Huff (talk \| contribs) 32 edits No edit summary Next edit →
Line 1: {{AFC submission\|t\|\|ts=20180416061329\|u=John Douglas Huff\|ns=118\|demo=}}<!-- Important, do not remove this line before article has been created. --> The [['''Ruzzo-Tompa algorithm]]''' is a linear time algorithm for finding all non-overlapping, contiguous, maximal scoring subsequences in a sequence of real numbers<ref>{{cite journal\|last1=Ruzzo\|first1=Walter L.\|last2=Martin\|first2=Tompa\|title=A Linear Time Algorithm for Finding All Maximal Scoring Subsequences\|journal=Proceedings. International Conference on Intelligent Systems for Molecular Biology\|date=1999\|pages=234-241\|pmid=10786306\|url=https://dl.acm.org/citation.cfm?id=660812\|ref=ruzzo-tompa}}</ref>. ~~There~~This algorithm is an ~~existing~~improvement ~~article~~over onpreviously ~~the~~known ~~[[Maximum~~quadratic ~~subarray~~time ~~problem]]~~algorithms. The ~~[[Ruzzo-Tompa]]~~maximum scoring subsequence from the set produced by the algorithm is ana ~~improvement~~solution ~~over~~to ~~previously~~the ~~known~~[[Maximum ~~quadratic~~subarray ~~time algorithms~~problem]]. The problem of find disjoint maximal subsequences is of practical importance in the analysis of [[DNA]]. Maximal subsequences algorithms have been used in the identification of transmembrane segments and the evaluation of [[sequence homology]]<ref>{{cite journal\|last1=Karlin\|first1=S\|last2=Altschul\|first2=SF\|title=Applications and statistics for multiple high-scoring segments in molecular sequences\|journal=Proceedings of the National Academy of Sciences of the United States of America\|date=Jun 15, 1993\|volume=90\|issue=12\|pages=5873-5877\|pmid=8390686}}</ref>. ==Algorithm== The standard implementation of the Ruzzo-Tompa algorithm runs in <math>O(n)</math> time and uses <math>O(n)</math> space, where <math>n</math> is the length of the list of scores. The description of the algorithm provided by Ruzzo and Tompa is as follows: The following [[Python (programming language)\|Python]] code implements the Ruzzo-Tompa algorithm: <source lang="python" line="1"> def RuzzoTompa(scores): k=0 total = 0; # Allocating arrays of size n I,L,R,Lidx = [[0]*len(scores) for _ in range(4)] for i,s in enumerate(scores): total += s if s > 0: I[k] = (i,i+1) Lidx[k] = i L[k] = total-s R[k] = total while(True): maxj = None for j in range(k-1,-1,-1): if L[j] < L[k]: maxj = j break; if maxj != None and R[maxj] < R[k]: I[maxj] = (Lidx[maxj],i+1) R[maxj] = total k = maxj else: k+=1 break; return [scores[I[l][0]:I[l][1]] for l in range(k)] </source>

Ruzzo–Tompa algorithm: Difference between revisions