Dasymutilla (talk | contribs) m wordage |
Citation bot (talk | contribs) Removed proxy/dead URL that duplicated identifier. | Use this bot. Report bugs. | #UCB_CommandLine |
Line 145:
Progressive, hierarchical, or tree methods generate a multiple sequence alignment by first aligning the most similar sequences and then adding successively less related sequences or groups to the alignment until the entire query set has been incorporated into the solution. The initial tree describing the sequence relatedness is based on pairwise comparisons that may include heuristic pairwise alignment methods similar to [[FASTA]]. Progressive alignment results depend on the choice of "most related" sequences and can therefore be sensitive to inaccuracies in the initial pairwise alignments. Most progressive multiple sequence alignment methods additionally weight the sequences in the query set according to their relatedness, which reduces the likelihood of making a poor choice of initial sequences and thus improves alignment accuracy.
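The merge order described above can be illustrated with a minimal Python sketch. It is not the procedure of any particular program: the function names <code>pairwise_distance</code> and <code>guide_order</code> are hypothetical, the similarity ratio from Python's standard <code>difflib.SequenceMatcher</code> is assumed as a crude stand-in for a real pairwise alignment score, and average-linkage clustering is assumed as the way to build the tree. The sketch only decides which sequences or groups would be joined first; it does not perform the alignment itself or apply sequence weighting.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_distance(a: str, b: str) -> float:
    """Crude distance in [0, 1]; an assumed stand-in for a pairwise alignment score."""
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def guide_order(seqs: dict[str, str]) -> list[tuple[tuple[str, ...], tuple[str, ...]]]:
    """Return the groups merged at each step, most similar pair first.

    Groups are tuples of sequence names; the distance between two groups is
    the average of the pairwise distances between their members.
    """
    # Distance for every unordered pair of sequence names.
    dist = {
        frozenset((x, y)): pairwise_distance(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)
    }

    def average_distance(c1: tuple[str, ...], c2: tuple[str, ...]) -> float:
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in seqs]  # every sequence starts as its own group
    merges = []
    while len(clusters) > 1:
        # Pick the two closest groups and join them.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: average_distance(*p))
        merges.append((c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


if __name__ == "__main__":
    example = {"a": "ACGTACGTAC", "b": "ACGTACGTAA", "c": "TTTTGGGGCC"}
    for left, right in guide_order(example):
        print(left, "+", right)
```

In this toy input the two near-identical sequences are joined first and the dissimilar one is added last, which is the step where an inaccurate initial pairwise comparison would propagate into the final alignment.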
Many variations of the [[Clustal]] progressive implementation<ref name=higgins>{{cite journal | journal=Gene | volume=73 | issue=1 | pages=237–44 | year=1988 | author=[[Desmond G. Higgins|Higgins DG]], Sharp PM | title=CLUSTAL: a package for performing multiple sequence alignment on a microcomputer | pmid=3243435 | doi = 10.1016/0378-1119(88)90330-7 }}</ref><ref name=thompson>{{cite journal | journal=Nucleic Acids Res | volume=22 | pages=4673–80 | year=1994 | author1=Thompson JD| author2-link=Desmond G. Higgins |author2= Higgins DG|author3= Gibson TJ. | title=CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice | pmid=7984417 |pmc=308517 |url=|doi=10.1093/nar/22.22.4673 | issue=22 }}</ref><ref name=chenna>{{cite journal | journal=Nucleic Acids Res | volume=31 | pages=3497–500 | year=2003 |author1=Chenna R |author2=Sugawara H |author3=Koike T |author4=Lopez R |author5=Gibson TJ |author6=Higgins DG |author7=Thompson JD. | title=Multiple sequence alignment with the Clustal series of programs | url= | pmid=12824352 | doi = 10.1093/nar/gkg500 | issue=13 | pmc=168907 }}</ref> are used for multiple sequence alignment, for phylogenetic tree construction, and as input for [[protein structure prediction]]. A slower but more accurate variant of the progressive method is known as [[T-Coffee]].<ref name=notredame>{{cite journal | journal=J Mol Biol | volume=302 | issue=1 | pages=205–17 | year=2000 | author1=Notredame C| author2-link=Desmond G. Higgins |author2= Higgins DG|author3= Heringa J. | title=T-Coffee: A novel method for fast and accurate multiple sequence alignment | pmid=10964570 | doi = 10.1006/jmbi.2000.4042 | s2cid=10189971
===Iterative methods===
Line 167:
===DALI===
The DALI method, or [[distance matrix]] alignment, is a fragment-based method for constructing structural alignments based on contact similarity patterns between successive hexapeptides in the query sequences.<ref name=holm>{{cite journal | journal=Science | volume=273 | pages=595–603 | year=1996 |author1=Holm L |author2=Sander C | title=Mapping the protein universe | pmid=8662544 | doi = 10.1126/science.273.5275.595 | issue=5275 | bibcode=1996Sci...273..595H | s2cid=7509134
===SSAP===
Line 187:
Methods of statistical significance estimation for gapped sequence alignments are available in the literature.<ref name="ortet"/><ref name=altschul>{{cite book|author1=Altschul SF |author2=Gish W |chapter=Local alignment statistics |title=Computer Methods for Macromolecular Sequence Analysis | year=1996| journal= Meth.Enz. | volume=266 | pages = 460–480|doi=10.1016/S0076-6879(96)66029-7|pmid=8743700 |series=Methods in Enzymology|isbn=9780121821678}}</ref><ref name=hartmann>{{cite journal| author=Hartmann AK| year=2002| title=Sampling rare events: statistics of local sequence alignments|
journal= Phys. Rev. E| volume=65| page=056102|doi=10.1103/PhysRevE.65.056102| pmid=12059642| issue=5|arxiv=cond-mat/0108201|bibcode=2002PhRvE..65e6102H| s2cid=193085
===Assessment of credibility===
Line 198:
==Other biological uses==
Sequenced RNA, such as [[expressed sequence tags]] and full-length mRNAs, can be aligned to a sequenced genome to locate genes and to obtain information about [[alternative splicing]]<ref>{{cite book |author1=Kim N |author2=Lee C |title=Bioinformatics |chapter=Bioinformatics Detection of Alternative Splicing |volume=452 |pages=179–97 |year=2008 |pmid=18566765 |doi=10.1007/978-1-60327-159-2_9 |series=Methods in Molecular Biology |isbn=978-1-58829-707-5}}</ref> and [[RNA editing]].<ref>{{cite journal |vauthors=Li JB, Levanon EY, Yoon JK, etal |title=Genome-wide identification of human RNA editing sites by parallel DNA capturing and sequencing |journal=Science |volume=324 |issue=5931 |pages=1210–3 |date=May 2009 |pmid=19478186 |doi=10.1126/science.1170995|bibcode=2009Sci...324.1210L |s2cid=31148824
==Non-biological uses==