List of sequence alignment software: Difference between revisions

OAbot (talk | contribs)
m Open access bot: url-access=subscription updated in citation with #oabot.
 
(21 intermediate revisions by 15 users not shown)
Line 11:
! Year
|-
| [[BLAST (biotechnology)|BLAST]]
| Local search with fast k-tuple heuristic (Basic Local Alignment Search Tool) || Both ||[[Stephen Altschul|Altschul SF]], [[Warren Gish|Gish W]], [[Webb Miller|Miller W]], [[Eugene Myers|Myers EW]], [[David J. Lipman|Lipman DJ]]<ref>{{Cite journal|author=Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ |title=Basic local alignment search tool |journal=Journal of Molecular Biology |volume=215 |issue=3 |pages=403–10 |date=October 1990 |pmid=2231712 |doi=10.1016/S0022-2836(05)80360-2|last2=Gish |last3=Miller |last4=Myers |last5=Lipman|s2cid=14441902 }}</ref> || 1990
|-
Line 18:
|-
| [[CS-BLAST]]
| Sequence-context specific BLAST, more sensitive than BLAST, FASTA, and SSEARCH. Position-specific iterative version CSI-BLAST more sensitive than PSI-BLAST || Protein || Angermüller C, Biegert A, Söding J<ref>{{Cite journal |last1= Angermüller |first1= C. |last2= Biegert |first2= A. |last3= Söding |first3= J. |title= Discriminative modelling of context-specific amino acid substitution probabilities |journal= Bioinformatics |volume= 28 |issue= 24 |pages= 3240–7|date=Dec 2012 |doi= 10.1093/bioinformatics/bts622 |pmid=23080114|doi-access= free |hdl= 11858/00-001M-0000-0015-8D22-F |hdl-access= free }}</ref>
|| 2013
|-
Line 44:
|-
| [[HH-suite]]
| Pairwise comparison of profile Hidden Markov models; very sensitive || Protein || Söding J<ref>{{Cite journal|author=Söding J |title=Protein homology detection by HMM-HMM comparison |journal=Bioinformatics |volume=21 |issue=7 |pages=951–60 |date=April 2005 |pmid=15531603 |doi=10.1093/bioinformatics/bti125|doi-access=free |hdl=11858/00-001M-0000-0017-EC7A-F |hdl-access=free }}</ref><ref>{{Cite journal|last1=Remmert|first1=Michael|last2=Biegert|first2=Andreas|last3=Hauser|first3=Andreas|last4=Söding|first4=Johannes|date=2011-12-25|title=HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment|journal=Nature Methods|volume=9|issue=2|pages=173–175|doi=10.1038/nmeth.1818|issn=1548-7105|pmid=22198341|hdl=11858/00-001M-0000-0015-8D56-A|s2cid=205420247|hdl-access=free}}</ref> ||2005/2012
|-
| IDF
Line 60:
|-
| MMseqs2
| Software suite to search and cluster huge sequence sets. Similar sensitivity to BLAST and PSI-BLAST but orders of magnitude faster || Protein || Steinegger M, Mirdita M, Galiez C, Söding J<ref>{{Cite journal|last1=Steinegger|first1=Martin|last2=Soeding|first2=Johannes|date=2017-10-16|journal=Nature Biotechnology|volume=35|issue=11|pages=1026–1028|title=MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets|url=https://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3988.html|doi=10.1038/nbt.3988|pmid=29035372|hdl=11858/00-001M-0000-002E-1967-3|s2cid=402352|hdl-access=free|url-access=subscription}}</ref> || 2017
|-
| USEARCH
Line 68:
|OpenCL Smith-Waterman on Altera's FPGA for Large Protein Databases
|Protein
|Rucci E, García C, Botella G, De Giusti A, Naiouf M, Prieto-Matías M<ref>{{Cite journal|last1=Rucci|first1=Enzo|last2=Garcia|first2=Carlos|last3=Botella|first3=Guillermo|last4=Giusti|first4=Armando E. De|last5=Naiouf|first5=Marcelo|last6=Prieto-Matias|first6=Manuel|date=2016-06-30|title=OSWALD: OpenCL Smith–Waterman on Altera's FPGA for Large Protein Databases|url=http://hpc.sagepub.com/content/early/2016/06/30/1094342016654215|journal=International Journal of High Performance Computing Applications|volume=32|issue=3|pages=337–350|doi=10.1177/1094342016654215|s2cid=212680914|issn=1094-3420|hdl=11336/48798|hdl-access=free}}</ref>
|2016
|-
Line 88:
| ScalaBLAST
| Highly parallel Scalable BLAST || Both || Oehmen et al.<ref>{{cite journal
|last1=Oehmen |first1=C.|last2= Nieplocha |first2=J. |title=ScalaBLAST: A scalable implementation of BLAST for high-performance data-intensive bioinformatics analysis|journal=IEEE Transactions on Parallel and Distributed Systems |volume=17 |issue=8 |pages=740–749 |date=August 2006
|doi=10.1109/TPDS.2006.112|s2cid=11122366|url=https://zenodo.org/record/1232261 }}</ref>||2011
|-
Line 107:
|-
| SWIMM
| Smith-Waterman implementation for Intel Multicore and Manycore architectures || Protein || Rucci E, García C, Botella G, De Giusti A, Naiouf M and Prieto-Matías M<ref>{{Cite journal|last1=Rucci|first1=Enzo|last2=García|first2=Carlos|last3=Botella|first3=Guillermo|last4=De Giusti|first4=Armando|last5=Naiouf|first5=Marcelo|last6=Prieto-Matías|first6=Manuel|date=2015-12-25|title=An energy-aware performance analysis of SWIMM: Smith–Waterman implementation on Intel's Multicore and Manycore architectures|journal=Concurrency and Computation: Practice and Experience|volume=27|issue=18|pages=5517–5537|doi=10.1002/cpe.3598|s2cid=42945406|issn=1532-0634|url=http://sedici.unlp.edu.ar/handle/10915/82869|hdl=11336/53930|hdl-access=free}}</ref>|| 2015
|-
| SWIMM2.0
Line 144:
| BLASTZ, LASTZ
| Seeded pattern-matching || Nucleotide || Local || Schwartz ''et al.''<ref>{{Cite journal| author=Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W| title=Human-mouse alignments with BLASTZ| journal=Genome Research |volume=13 |issue=1 |date=2003 |pages=103–107 |pmid=12529312 | pmc=430961| doi=10.1101/gr.809403| last2=Kent| last3=Smit| last4=Zhang| last5=Baertsch| last6=Hardison| last7=Haussler| last8=Miller}}</ref><ref>{{Cite thesis| author=Harris R S | year=2007| title=Improved pairwise alignment of genomic DNA}}</ref> || 2004,2009
|-
| [[CodonCode Aligner]]
| Fast pairwise and multi-sequence alignments with multiple algorithms. || Nucleotide || Both || CodonCode Corporation || 2003-2025
|-
| CUDAlign
| DNA sequence alignment of unrestricted size in single or multiple GPUs
|| Nucleotide || Local, SemiGlobal, Global || E. Sandes<ref>{{Cite journal|author=Sandes, Edans F. de O. |author2=de Melo, Alba Cristina M.A.|title=Retrieving Smith-Waterman Alignments with Optimizations for Megabase Biological Sequences Using GPU |journal=IEEE Transactions on Parallel and Distributed Systems|volume=24 |issue=5 | pages=1009–1021 |date=May 2013 |doi=10.1109/TPDS.2012.194}}</ref><ref>{{Cite conference|author=Sandes, Edans F. de O. |author2=Miranda, G. |author3=De Melo, A.C.M.A. |author4=Martorell, X. |author5=Ayguade, E.|title=CUDAlign 3.0: Parallel Biological Sequence Comparison in Large GPU Clusters |conference=Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on |page=160 |date=May 2014 |doi=10.1109/CCGrid.2014.18|hdl=2117/24766 |hdl-access=free }}</ref><ref>{{Cite conference|author=Sandes, Edans F. de O. |author2=Miranda, G. |author3=De Melo, A.C.M.A. |author4=Martorell, X. |author5=Ayguade, E.|title=Fine-grain Parallel Megabase Sequence Comparison with Multiple Heterogeneous GPUs |conference=Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming |pages=383–384 |date=August 2014 |doi=10.1145/2555243.2555280|hdl=2117/23094 |hdl-access=free }}</ref> || 2011-2015
|-
| DNADot
Line 188 ⟶ 191:
| NW-align
| Standard Needleman-Wunsch dynamic programming algorithm || Protein || Global || Y Zhang || 2012
|-
| mAlign
| modelling alignment; models the information content of the sequences || Nucleotide || Both || D. Powell, L. Allison and T. I. Dix || 2004
|-
| matcher
Line 217:
|-
| Path
| [[Smith-Waterman]] on [[protein]] back-[[translation (genetics)|translation]] [[Chart|graph]] (detects [[frameshift]]s at protein level) || Protein || Local || M. Gîrdea ''et al.''<ref>{{Cite journal |last1=Girdea |first1=M |last2=Noe |first2=L |last3=Kucherov |first3=G |title=Back-translation for discovering distant protein homologies in the presence of frameshift mutations |journal=Algorithms for Molecular Biology |volume=5 |issue=6 |page=6 |date=January 2010 |pmid=20047662 |pmc=2821327 |doi=10.1186/1748-7188-5-6 |doi-access=free }}</ref> || 2009
|-
| [[PatternHunter]]
Line 305:
| [[AMAP]]
| Sequence annealing || Both || Global || A. Schwartz and [[Lior Pachter|L. Pachter]] || 2006 ||
|-
| anon.
| fast, optimal alignment of three sequences using linear gap costs || Nucleotides || Global || D. Powell, L. Allison and T. I. Dix || 2000 ||
|-
| [[BAli-Phy]]
Line 318 ⟶ 315:
| Iterative alignment || Both || Local (preferred) || M. Brudno and B. Morgenstern || 2003 ||
|-
| [[Clustal]]W
| Progressive alignment || Both || Local or global || Thompson ''et al.'' || 1994 || {{free}}, [[GNU Lesser General Public License|LGPL]]
|-
| [[CodonCode Aligner]]
| Multi-alignment; Muscle, Clustal & Phrap support || Nucleotides || Local or global || P. Richterich ''et al.'' || 2003 (latest version 2024) ||
|-
| [[Compass]]
Line 360 ⟶ 357:
|-
| GUIDANCE
| Quality control and filtering of multiple sequence alignments || Both || Local or global || O. Penn ''et al.'' || 2010 (latest version 2015) ||
|-
| Kalign
| Progressive alignment || Both || Global || T. Lassmann || 2005 ||
|-
|MACSE
|Progressive-iterative alignment. Multiple alignment of coding sequences accounting for frameshifts and stop codons.
|Nucleotides
|Global
|V. Ranwez ''et al.''
|2011 (latest version v2.07, 2023)
|
|-
| [[MAFFT]]
Line 375 ⟶ 380:
|-
| MegAlign Pro (Lasergene Molecular Biology)
| Software to align DNA, RNA, protein, or DNA + protein sequences via pairwise and multiple sequence alignment algorithms including MUSCLE, Mauve, MAFFT, Clustal Omega, Jotun Hein, Wilbur-Lipman, Martinez Needleman-Wunsch, Lipman-Pearson and Dotplot analysis. || Both || Local or global ||[[DNASTAR]] || 1993-2023 ||
|-
| MSA
Line 390 ⟶ 395:
|-
| [[MUSCLE (alignment software)|MUSCLE]]
| Progressive-iterative alignment (v3), Probabilistic/consistency (v5) || Both || Local or global || R. Edgar || 2004 || Public domain
|-
| Opal
Line 520 ⟶ 525:
|-
| Shuffle-LAGAN
| Pairwise glocal alignment of completed genome regions || Nucleotide
|-
| SIBsim4, [[Sim4]]
Line 649 ⟶ 654:
| {{yes}}
| {{free}}, [[BSD licenses|BSD]]
|<ref name="WiltonEtAl2015">{{cite journal|last1=Wilton|first1=Richard|last2=Budavari|first2=Tamas|last3=Langmead|first3=Ben|last4=Wheelan|first4=Sarah J.|last5=Salzberg|first5=Steven L.|last6=Szalay|first6=Alexander S.|title=Arioc: high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space|journal=PeerJ|volume=3|pages=e808|year=2015|doi=10.7717/peerj.808|pmid=25780763|pmc=4358639 |doi-access=free }}</ref>
| 2015
|-
Line 720 ⟶ 725:
| {{yes}}, [[POSIX Threads]]
| {{free}}, [[Artistic License|Artistic]]
|<ref name="LangmeadTrapnell2009">{{cite journal|last1=Langmead|first1=Ben|last2=Trapnell|first2=Cole|last3=Pop|first3=Mihai|last4=Salzberg|first4=Steven L|title=Ultrafast and memory-efficient alignment of short DNA sequences to the human genome|journal=Genome Biology|volume=10|issue=3|year=2009|pages=R25|issn=1465-6906|doi=10.1186/gb-2009-10-3-r25|pmid=19261174|pmc=2690996 |doi-access=free }}</ref>
|2009
|-
Line 740 ⟶ 745:
| {{yes}}
| {{free}}, [[GNU General Public License|GPL]]
|<ref name="KerpedjievFrellsen2014">{{cite journal|last1=Kerpedjiev|first1=Peter|last2=Frellsen|first2=Jes|last3=Lindgreen|first3=Stinus|last4=Krogh|first4=Anders|title=Adaptable probabilistic mapping of short reads using position specific scoring matrices|journal=BMC Bioinformatics|volume=15|issue=1|year=2014|page=100|issn=1471-2105|doi=10.1186/1471-2105-15-100|pmid=24717095|pmc=4021105 |doi-access=free }}</ref>
| 2014
|-
Line 760 ⟶ 765:
| {{yes}}, [[Hadoop]] [[MapReduce]]
| {{free}}, [[Artistic License|Artistic]]
|
|
|-
| CodonCode Aligner
| Fast assembly, accurate consensus sequences with support for quality scores. Compare Contigs, Phred, Phrap, and Bowtie support. Build separate contigs for hundreds of different clones or a single contig with thousands of sequences.
|{{yes}}
|{{yes}}
|{{yes}}
|{{yes}}
|{{proprietary}}, [[Commercial software|commercial]]
|
|
Line 1,001 ⟶ 1,016:
|
|
| <ref name="RivalsEtAl2009">{{cite book|last1=Rivals|first1=Eric|last2=Salmela|first2=Leena|last3=Kiiskinen|first3=Petteri|last4=Kalsi|first4=Petri|last5=Tarhio|first5=Jorma|title=Algorithms in Bioinformatics |chapter=Mpscan: Fast Localisation of Multiple Reads in Genomes|journal=Algorithms in Bioinformatics|volume=5724|year=2009|pages= 246–260|doi=10.1007/978-3-642-04241-6_21|series=Lecture Notes in Computer Science|bibcode=2009LNCS.5724..246R|isbn=978-3-642-04240-9|citeseerx=10.1.1.156.928|s2cid=17187140 }}</ref>
| 2009
|-
Line 1,161 ⟶ 1,176:
| {{yes}}
| {{proprietary}}, [[freeware]] for noncommercial use
|<ref name="SearlsHoffmann2009">{{cite journal|last1=Searls|first1=David B.|last2=Hoffmann|first2=Steve|last3=Otto|first3=Christian|last4=Kurtz|first4=Stefan|last5=Sharma|first5=Cynthia M.|last6=Khaitovich|first6=Philipp|last7=Vogel|first7=Jörg|last8=Stadler|first8=Peter F.|last9=Hackermüller|first9=Jörg|title=Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures|journal=PLOS Computational Biology|volume=5|issue=9|year=2009|pages=e1000502|issn=1553-7358|doi=10.1371/journal.pcbi.1000502|pmid=19750212|pmc=2730575|bibcode=2009PLSCB...5E0502H |doi-access=free }}</ref>
|2009
|-
Line 1,190 ⟶ 1,205:
| {{yes}}
| {{yes}}, [[OpenMP]]
| {{free}}, [[BSD licenses|BSD]] derivative
|
<ref name="RumbleLacrouteDalcaFiumeSidowBrudno2009">{{cite journal|last1=Rumble|first1=Stephen M.|last2=Lacroute|first2=Phil|last3=Dalca|first3=Adrian V.|last4=Fiume|first4=Marc|last5=Sidow|first5=Arend|last6=Brudno|first6=Michael|title=SHRiMP: Accurate Mapping of Short Color-space Reads|journal=PLOS Computational Biology
|volume=5|issue=5|year=2009|pages=e1000386|pmid=19461883|pmc=2678294|doi=10.1371/journal.pcbi.1000386|bibcode=2009PLSCB...5E0386R |doi-access=free }}</ref>
<ref name="DavidDzambaListerIlieBrudno2011">{{cite journal|last1=David|first1=Matei|last2=Dzamba|first2=Misko|last3=Lister|first3=Dan|last4= Ilie|first4=Lucian|last5=Brudno|first5=Michael|title=SHRiMP2: Sensitive yet Practical Short Read Mapping|journal=Bioinformatics|volume=27|issue=7|year=2011|pages=1011–1012|pmid=21278192|doi=10.1093/bioinformatics/btr046|doi-access=free}}</ref>
| 2009-2011
Line 1,206 ⟶ 1,222:
|<ref name="MalhisButterfieldEsterJones2009">{{cite journal|last1=Malhis|first1=Nawar|last2=Butterfield|first2=Yaron S. N.|last3=Ester|first3=Martin|last4=Jones|first4=Steven J. M.|title=Slider – Maximum use of probability information for alignment of short sequence reads and SNP detection|journal=Bioinformatics
|volume=25|issue=1|year=2009|pages=6–13|pmid=18974170|pmc=2638935|doi=10.1093/bioinformatics/btn565}}</ref><ref name="MalhisJones2010">{{cite journal|last1=Malhis|first1=Nawar|last2=Jones|first2=Steven J. M.|title=High Quality SNP Calling Using Illumina Data at Shallow Coverage|journal=Bioinformatics
|volume=26|issue=8|year=2010|pages=1029–1035|pmid=20190250|doi=10.1093/bioinformatics/btq092|doi-access=free}}</ref>
| 2009-2010
|-
Line 1,216 ⟶ 1,232:
| {{yes}}, [[POSIX Threads]]; SOAP3, SOAP3-dp need GPU with [[CUDA]] support
| {{free}}, [[GNU General Public License|GPL]]
|<ref name="LiLi2008">{{cite journal|last1=Li|first1=R.|last2=Li|first2=Y.|last3=Kristiansen|first3=K.|last4=Wang|first4=J.|title=SOAP: short oligonucleotide alignment program|journal=Bioinformatics|volume=24|issue=5|year=2008|pages=713–714|issn=1367-4803|doi=10.1093/bioinformatics/btn025|pmid=18227114|doi-access=free}}</ref><ref name="LiYu2009">{{cite journal|last1=Li|first1=R.|last2=Yu|first2=C.|last3=Li|first3=Y.|last4=Lam|first4=T.-W.|last5=Yiu|first5=S.-M.|last6=Kristiansen|first6=K.|last7=Wang|first7=J.|title=SOAP2: an improved ultrafast tool for short read alignment|journal=Bioinformatics|volume=25|issue=15|year=2009|pages=1966–1967|issn=1367-4803|doi=10.1093/bioinformatics/btp336|pmid=19497933|doi-access=free}}</ref>
|
|-
Line 1,336 ⟶ 1,352:
{{Reflist}}
 
[[Category:Database-related lists|Seq]]
[[Category:Genetics-related lists|Sequence]]
[[Category:Lists of bioinformatics software|Sequence alignment software]]