Multiple sequence alignment: Difference between revisions

A set of methods to produce MSAs while reducing the errors inherent in progressive methods is classified as "iterative" because these methods work similarly to progressive methods but repeatedly realign the initial sequences as well as adding new sequences to the growing MSA. One reason progressive methods depend so strongly on a high-quality initial alignment is that these alignments are always incorporated into the final result: once a sequence has been aligned into the MSA, its alignment is not considered further. This approximation improves efficiency at the cost of accuracy. By contrast, iterative methods can return to previously calculated pairwise alignments or sub-MSAs incorporating subsets of the query sequences as a means of optimizing a general [[objective function]], such as an alignment score.<ref name="mount"/>
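
As an illustration of what such an objective function can look like, the following minimal sketch computes a sum-of-pairs score over the columns of an alignment. The function name <code>sum_of_pairs</code> and the match, mismatch, and gap values are assumptions chosen for the example; real packages typically use substitution matrices and affine gap penalties instead.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an MSA given as equal-length strings that use '-' for gaps.

    The scoring values are illustrative assumptions, not taken from any
    particular alignment program.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of the alignment must have the same length")

    score = 0
    for column in zip(*alignment):
        # Compare every unordered pair of characters within this column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs are ignored
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


if __name__ == "__main__":
    print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```

An iterative method would re-evaluate a score of this kind after each realignment step and keep the change only if the score improves.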
 
A variety of subtly different iteration methods have been implemented and made available in software packages; reviews and comparisons have been useful but generally refrain from choosing a "best" technique.<ref name="hirosawa">{{cite journal |vauthors=Hirosawa M, Totoki Y, Hoshida M, Ishikawa M | year = 1995 | title = Comprehensive study on iterative algorithms of multiple sequence alignment | journal = Comput Appl Biosci | volume = 11 | issue = 1| pages = 13–18 | pmid = 7796270 | doi=10.1093/bioinformatics/11.1.13}}</ref> The software package [http://www.genome.jp/tools/prrn/ PRRN/PRRP] uses a [[hill-climbing algorithm]] to optimize its alignment score<ref name="gotoh">{{cite journal | doi = 10.1006/jmbi.1996.0679 | author = Gotoh O | year = 1996 | title = Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments | journal = J Mol Biol | volume = 264 | issue = 4| pages = 823–38 | pmid = 8980688 }}</ref> and iteratively corrects both alignment weights and locally divergent or "gappy" regions of the growing MSA.<ref name="mount">Mount DM. (2004). Bioinformatics: Sequence and Genome Analysis 2nd ed. Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY.</ref> PRRP performs best when refining an alignment previously constructed by a faster method.<ref name="mount"/>
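
The general idea of hill climbing on an alignment score can be sketched as follows. This is a toy illustration and not PRRP's actual procedure: the move set (swapping one gap with a neighbouring residue), the scoring values, and the names <code>sp_score</code>, <code>neighbours</code>, and <code>hill_climb</code> are all assumptions made for the example.

```python
from itertools import combinations


def sp_score(rows, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score with illustrative (assumed) scoring values."""
    score = 0
    for column in zip(*rows):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue
            if a == "-" or b == "-":
                score += gap
            else:
                score += match if a == b else mismatch
    return score


def neighbours(rows):
    """Yield alignments that differ by swapping one gap with an adjacent residue.

    The swap keeps every row's length and residue order unchanged.
    """
    for i, row in enumerate(rows):
        for j in range(len(row) - 1):
            a, b = row[j], row[j + 1]
            if (a == "-") != (b == "-"):  # exactly one of the two is a gap
                new_row = row[:j] + b + a + row[j + 2:]
                yield rows[:i] + [new_row] + rows[i + 1:]


def hill_climb(rows, max_rounds=100):
    """Accept the first improving neighbour until no move raises the score."""
    rows = list(rows)
    best = sp_score(rows)
    for _ in range(max_rounds):
        for candidate in neighbours(rows):
            candidate_score = sp_score(candidate)
            if candidate_score > best:
                rows, best = candidate, candidate_score
                break
        else:
            break  # no neighbour improved the score: a local optimum
    return rows, best


if __name__ == "__main__":
    print(hill_climb(["ACGT-", "-ACGT"]))
```

Like any hill-climbing search, this stops at a local optimum, which is consistent with the observation that refinement works best when it starts from a reasonable alignment.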
 
Another iterative program, DIALIGN, takes an unusual approach of focusing narrowly on local alignments between sub-segments or [[sequence motif]]s without introducing a gap penalty.<ref name="brudno">{{cite journal | vauthors = Brudno M, Chapman M, Göttgens B, Batzoglou S, Morgenstern B | title = Fast and sensitive multiple alignment of large genomic sequences | journal = BMC Bioinformatics | volume = 4 | pages = 66 | date = December 2003 | pmid = 14693042 | pmc = 521198 | doi = 10.1186/1471-2105-4-66 | doi-access = free }}</ref> The alignment of individual motifs is then achieved with a matrix representation similar to a dot-matrix plot in a pairwise alignment. An alternative method that uses fast local alignments as anchor points or "seeds" for a slower global-alignment procedure is implemented in the [http://dialign.gobics.de/chaos-dialign-submission CHAOS/DIALIGN] suite.<ref name="brudno"/>
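
The dot-matrix view of gap-free local similarity can be illustrated with a short sketch that scans each diagonal of the comparison matrix for runs of identical residues. This is a deliberate simplification: DIALIGN weights segment pairs statistically rather than requiring exact matches, and the function name <code>diagonal_segments</code> and the <code>min_len</code> threshold are assumptions made for the example.

```python
def diagonal_segments(a, b, min_len=3):
    """Return (start_a, start_b, length) for gap-free runs of exact matches.

    Each diagonal of the len(a) x len(b) dot matrix is scanned once; a run
    is reported when it is at least min_len residues long.
    """
    segments = []
    for d in range(-(len(b) - 1), len(a)):
        # Diagonal d contains the cells (i, j) with i - j == d.
        i, j = (d, 0) if d >= 0 else (0, -d)
        run = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                run += 1
            else:
                if run >= min_len:
                    segments.append((i - run, j - run, run))
                run = 0
            i += 1
            j += 1
        if run >= min_len:  # a run that reaches the edge of the matrix
            segments.append((i - run, j - run, run))
    return segments


if __name__ == "__main__":
    print(diagonal_segments("TTACGTGG", "CACGTC"))
```

Segments found this way contain no gaps by construction, so no gap penalty is needed at this stage; unaligned residues between segments are simply left unmatched.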
 
A third popular iteration-based method called [[MUSCLE (alignment software)|MUSCLE]] (multiple sequence alignment by log-expectation) improves on progressive methods with a more accurate distance measure to assess the relatedness of two sequences.<ref name="edgar">{{cite journal | doi = 10.1093/nar/gkh340 | author = Edgar RC | year = 2004 | title = MUSCLE: multiple sequence alignment with high accuracy and high throughput | journal = Nucleic Acids Research | volume = 32 | issue = 5| pages = 1792–97 | pmid=15034147 | pmc=390337}}</ref> The distance measure is updated between iteration stages (although, in its original form, MUSCLE performed only two or three iterations, depending on whether refinement was enabled).
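
As one example of a distance that can be recomputed once an alignment exists, the sketch below derives the Kimura-corrected distance from the fractional identity of two aligned rows; the MUSCLE publication describes using a k-mer distance on unaligned sequences first and a Kimura distance in a later stage. The function names and the handling of edge cases here are assumptions made for the example, not MUSCLE's own code.

```python
import math


def fractional_identity(row_a, row_b):
    """Fraction of identical residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(row_a, row_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0  # assumed convention when the rows share no aligned residues
    return sum(x == y for x, y in pairs) / len(pairs)


def kimura_distance(row_a, row_b):
    """Kimura-corrected distance: -ln(1 - D - D^2 / 5), with D = 1 - identity."""
    d = 1.0 - fractional_identity(row_a, row_b)
    arg = 1.0 - d - d * d / 5.0
    if arg <= 0.0:
        return float("inf")  # correction is undefined for very divergent pairs
    return -math.log(arg)


if __name__ == "__main__":
    print(round(kimura_distance("ACGT", "ACGA"), 4))
```

Recomputing pairwise distances from the current alignment and rebuilding the guide tree from them is what allows the distance measure to change between stages.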