Ruzzo–Tompa algorithm: Difference between revisions

Content deleted Content added
Citation bot (talk | contribs)
Alter: pages, url, journal. URLs might have been anonymized. Add: s2cid, isbn, year, authors 1-1. Removed proxy/dead URL that duplicated identifier. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Corvus florensis | #UCB_webform 2458/3499
Citation bot (talk | contribs)
Alter: title, template type. Add: chapter-url, chapter. Removed or converted URL. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | #UCB_CommandLine
Line 7:
The Ruzzo–Tompa algorithm has been used in [[Bioinformatics]] tools to study biological data. The problem of finding disjoint maximal subsequences is of practical importance in the analysis of [[DNA]]. Maximal subsequences algorithms have been used in the identification of transmembrane segments and the evaluation of [[sequence homology]].<ref name="karlin">{{cite journal|last1=Karlin|first1=S|last2=Altschul|first2=SF|title=Applications and statistics for multiple high-scoring segments in molecular sequences|journal=Proceedings of the National Academy of Sciences of the United States of America|date=Jun 15, 1993|volume=90|issue=12|pages=5873–5877|pmid=8390686|pmc=46825|doi=10.1073/pnas.90.12.5873|bibcode=1993PNAS...90.5873K|doi-access=free}}</ref>
 
The algorithm is used in [[sequence alignment]] which is used as a method of identifying similar [[DNA]], [[RNA]], or [[protein]] sequences.<ref>{{Cite journalbook |last1=Spouge |first1=John L. |last2=Mariño-Ramírez |first2=Leonardo |last3=Sheetlin |first3=Sergey L. |title=2012 IEEE 2nd International Conference on Computational Advances in Bio and medical Sciences (ICCABS) |chapter=The ruzzo-tompa algorithm can find the maximal paths in weighted, directed graphs on a one-dimensional lattice |chapter-url=https://ieeexplore.ieee.org/document/6182645 |journal=2012 IEEE 2nd International Conference on Computational Advances in Bio and Medical Sciences (ICCABS) |year=2012 |pages=1–6 |doi=10.1109/ICCABS.2012.6182645|isbn=978-1-4673-1321-6 |s2cid=14584619 }}</ref> Accounting for the ordering of pairs of high-scoring subsequences in two sequences creates better sequence alignments. This is because the biological model suggests that separate high-scoring subsequence pairs arise from insertions or deletions within a matching region. Requiring consistent ordering of high-scoring subsequence pairs increases their statistical significance.<ref name="karlin" />
 
===Web scraping===
Line 17:
==Problem definition==
 
The problem of finding all maximal subsequences is defined as follows: Given a list of real numbered scores <math>x_1,x_2,\ldots,x_n</math>, find the list of contiguous subsequences that gives the greatest total score, where the score of each subsequence <math>S_{i,j} = \sum_{i\leq k\leq j} x_k</math>. The subsequences must be disjoint (non-overlapping) and have a positive score.<ref>{{Cite journalbook |last1=Spouge |first1=John L. |last2=Mariño-Ramírez |first2=Leonardo |last3=Sheetlin |first3=Sergey L. |title=2012 IEEE 2nd International Conference on Computational Advances in Bio and medical Sciences (ICCABS) |chapter=The ruzzo-tompa algorithm can find the maximal paths in weighted, directed graphs on a one-dimensional lattice |chapter-url=https://ieeexplore.ieee.org/document/6182645 |journal=2012 IEEE 2nd International Conference on Computational Advances in Bio and Medical Sciences (ICCABS) |year=2012 |pages=1–6 |doi=10.1109/ICCABS.2012.6182645|isbn=978-1-4673-1321-6 |s2cid=14584619 }}</ref>
 
==Other algorithms==