Citation bot m Alter: isbn, template type. Add: pmid, citeseerx. Removed parameters. User-activated; Category:Bioinformatics.
GoingBatty m General fixes and manual cleanup, replaced: Open Source software → open-source software
Line 1:
{{
In [[bioinformatics]], a '''sequence alignment''' is a way of arranging the sequences of [[DNA]], [[RNA]], or protein to identify regions of similarity that may be a consequence of functional, [[structural biology|structural]], or [[evolution]]ary relationships between the sequences.<ref name=mount>{{cite book| author=Mount DM.| year=2004 | title=Bioinformatics: Sequence and Genome Analysis |edition=2nd | publisher= Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY. |isbn=978-0-87969-608-5}}</ref> Aligned sequences of [[nucleotide]] or [[amino acid]] residues are typically represented as rows within a [[matrix (mathematics)|matrix]]. Gaps are inserted between the [[Residue (chemistry)|residues]] so that identical or similar characters are aligned in successive columns.
Line 62:
Thus, the number of gaps in an alignment is usually reduced and residues and gaps are kept together, which typically makes more biological sense. The Gotoh algorithm implements affine gap costs by using three matrices.
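The three-matrix idea can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name <code>gotoh_score</code>, the scoring values, and the convention that a gap of length ''k'' scores <code>gap_open + (k - 1) * gap_extend</code> are all assumptions chosen for the example. It returns only the optimal global score and omits the traceback that would recover the alignment itself.

```python
NEG_INF = float("-inf")


def gotoh_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Optimal global alignment score under affine gap costs.

    Assumed convention: a gap of length k scores
    gap_open + (k - 1) * gap_extend. All default values are illustrative.
    """
    n, m = len(a), len(b)
    # M[i][j]: best score with a[i-1] aligned to b[j-1]
    # X[i][j]: best score with a[i-1] aligned to a gap
    # Y[i][j]: best score with b[j-1] aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Extend any state with a (mis)matched pair.
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap from M or Y, or extend an existing gap in X.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            # Open a new gap from M or X, or extend an existing gap in Y.
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)

    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # One gap of length 3 is cheaper than three separate gaps of length 1.
    print(gotoh_score("AAACCCGGG", "AAAGGG"))
```

Keeping separate matrices for "ends in a match", "ends in a gap in one sequence", and "ends in a gap in the other" is what lets the recurrence charge the opening penalty once and the cheaper extension penalty afterwards, while still running in time proportional to the product of the sequence lengths.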
Dynamic programming can be useful in aligning nucleotide to protein sequences, a task complicated by the need to take into account [[frameshift]] mutations (usually insertions or deletions). The framesearch method produces a series of global or local pairwise alignments between a query nucleotide sequence and a search set of protein sequences, or vice versa. Its ability to evaluate frameshifts offset by an arbitrary number of nucleotides makes the method useful for sequences containing large numbers of indels, which can be very difficult to align with more efficient heuristic methods. In practice, the method requires large amounts of computing power or a system whose architecture is specialized for dynamic programming. The [[BLAST]] and [[EMBOSS]] suites provide basic tools for creating translated alignments (though some of these approaches rely on side effects of the tools' sequence-searching capabilities). More general methods are available from both commercial sources, such as ''FrameSearch'', distributed as part of the [[Accelrys]] [[GCG (software)|GCG package]], and [[
The dynamic programming method is guaranteed to find an optimal alignment given a particular scoring function; however, identifying a good scoring function is often an empirical rather than a theoretical matter. Although dynamic programming is extensible to more than two sequences, it is prohibitively slow for large numbers of sequences or extremely long sequences.
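As a rough illustration of why the method scales poorly, assume ''k'' sequences of equal length ''n'' and the straightforward extension of the pairwise recurrence to a ''k''-dimensional table (an assumption made for this estimate; the cost of scoring each column is ignored). The table then has <math>(n+1)^k</math> cells, and each cell is computed from up to <math>2^k - 1</math> predecessor cells, one for every non-empty subset of sequences that advances by a residue:

<math>T(n, k) = O\left(2^k \, n^k\right)</math>

For <math>n = 100</math>, a pairwise alignment fills <math>101^2 = 10\,201</math> cells, whereas aligning five such sequences simultaneously would require <math>101^5 \approx 1.05 \times 10^{10}</math> cells.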
Line 164:
{{Use dmy dates|date=April 2017}}
[[Category:Bioinformatics]]
[[Category:Computational phylogenetics]]