Multiple sequence alignment: Difference between revisions

{{Short description|Alignment of more than two molecular sequences}}
[[File:RPLP0 90 ClustalW aln.gif|right|thumb|575px|First 90 positions of a protein multiple sequence alignment of instances of the acidic ribosomal protein P0 (L10E) from several organisms. Generated with [[ClustalX]].]]
'''Multiple sequence alignment''' ('''MSA''') is the process or the result of [[sequence alignment]] of three or more [[biological sequence]]s, generally [[protein]], [[DNA]], or [[RNA]]. In many cases, the input set of query sequences is assumed to have an [[evolutionary]] relationship by which they share a linkage and are descended from a common ancestor. From the resulting MSA, sequence [[homology (biology)|homology]] can be inferred and [[molecular phylogeny|phylogenetic analysis]] can be conducted to assess the sequences' shared evolutionary origins. Visual depictions of the alignment, as in the image at right, illustrate [[mutation]] events such as point mutations (single [[amino acid]] or [[nucleotide]] changes) that appear as differing characters in a single alignment column, and insertion or deletion mutations ([[indel]]s or gaps) that appear as hyphens in one or more of the sequences in the alignment. Multiple sequence alignment is often used to assess sequence [[conservation (genetics)|conservation]] of [[protein domain]]s, [[tertiary structure|tertiary]] and [[secondary structure|secondary]] structures, and even individual amino acids or nucleotides.
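
As a rough illustration of how alignment columns are read, the following Python sketch summarises each column of a small alignment. The sequences are invented for the example and are not taken from the figure, and the helper name <code>column_summary</code> is hypothetical. It reports the most common residue in each column, the fraction of rows that carry it (a crude conservation measure, not one prescribed by any particular tool), and the number of gap characters.

```python
from collections import Counter

# Toy alignment: hypothetical sequences, one string per aligned row.
# A hyphen marks a gap (an inferred insertion or deletion).
alignment = [
    "MKV-LA",
    "MKVQLA",
    "MRV-LA",
]

def column_summary(rows):
    """Return (most common residue, fraction of rows carrying it, gap count) per column."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    summary = []
    for column in zip(*rows):  # iterate over alignment columns
        residues = [c for c in column if c != "-"]
        gaps = len(column) - len(residues)
        if residues:
            residue, count = Counter(residues).most_common(1)[0]
            summary.append((residue, count / len(column), gaps))
        else:
            # Column made up entirely of gaps.
            summary.append(("-", 0.0, gaps))
    return summary

for index, (residue, fraction, gaps) in enumerate(column_summary(alignment), start=1):
    print(f"column {index}: {residue} {fraction:.2f} gaps={gaps}")
```

In this toy input, the second column mixes K and R, which is how a point mutation shows up, and the fourth column holds a residue in only one row, which is how an indel shows up.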
 
Computational [[algorithm]]s are used to produce and analyse MSAs because sequences of biologically relevant length are difficult, and often intractable, to align by hand. MSAs require more sophisticated methodologies than [[sequence alignment|pairwise alignment]] because they are more [[Computational complexity theory|computationally complex]]. Most multiple sequence alignment programs use [[heuristic]] methods rather than [[global optimization]] because identifying the optimal alignment between more than a few sequences of moderate length is prohibitively computationally expensive. On the other hand, heuristic methods generally give no guarantee of solution quality, and heuristic solutions have been shown to often fall far short of the optimal solution on benchmark instances.<ref name="thompson2011">{{cite journal | doi = 10.1371/journal.pone.0018093|vauthors= Thompson JD, Linard B, Lecompte O, Poch O | year = 2011 | title = A comprehensive benchmark study of multiple sequence alignment methods: current challenges and future perspectives | journal = PLOS ONE | volume = 6 | issue = 3| pages = e18093| pmid = 21483869| pmc = 3069049|bibcode= 2011PLoSO...618093T |doi-access= free }}</ref><ref name="nuin2006" /><ref name="hosseininasab">{{cite journal | doi = 10.1287/ijoc.2019.0937 |vauthors=Hosseininasab A, van Hoeve WJ | year = 2019 | title = Exact Multiple Sequence Alignment by Synchronized Decision Diagrams | journal = INFORMS Journal on Computing |s2cid=109937203 }}</ref>
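
To give a sense of why exact alignment becomes expensive, the sketch below counts the work involved if the usual pairwise dynamic-programming table is extended directly to more sequences. This direct extension is an assumption made for illustration, as are the equal sequence lengths and the function names. Under it, the table has one axis per sequence, and each cell looks at every combination of "advance or stay" across the sequences except the one where none advance.

```python
def lattice_cells(num_sequences, length):
    """Cells in a dynamic-programming table with one axis per sequence.

    Assumes every sequence has the same length; each axis has
    length + 1 positions (including the empty prefix).
    """
    return (length + 1) ** num_sequences

def predecessors_per_cell(num_sequences):
    """Neighbouring cells consulted per interior cell.

    Each sequence either advances by one character or takes a gap,
    excluding the case where no sequence advances.
    """
    return 2 ** num_sequences - 1

for n in (2, 3, 5, 10):
    cells = lattice_cells(n, 100)
    steps = predecessors_per_cell(n)
    print(f"{n} sequences of length 100: {cells:.3e} cells, {steps} predecessors each")
```

Growth is exponential in the number of sequences: moving from two to three sequences of length 100 already raises the table from about ten thousand cells to about a million.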