Sequence alignment: Difference between revisions

Global and local alignments: rewriting the explanation of a partial alignment in a more understandable form
Line 34:
Sequence alignments can be stored in a wide variety of text-based file formats, many of which were originally developed in conjunction with a specific alignment program or implementation. Most web-based tools accept and produce only a limited number of formats, such as [[FASTA format]] and [[GenBank]] format, and their output is not easily editable. Several conversion programs that provide graphical and/or command-line interfaces are available {{Dead link|date=August 2009}}, such as [https://web.archive.org/web/20071024223546/http://bioweb.pasteur.fr/seqanal/interfaces/readseq.html READSEQ] and [[EMBOSS]]. Several programming packages also provide this conversion functionality, such as [[BioPython]], [[BioRuby]] and [[BioPerl]]. [[SAM (file format)|SAM/BAM files]] use the CIGAR (Compact Idiosyncratic Gapped Alignment Report) string format to represent the alignment of a sequence to a reference as a series of operations (e.g. match/mismatch, insertion, deletion).<ref>{{Cite web|url=https://samtools.github.io/hts-specs/SAMv1.pdf|title=Sequence Alignment/Map Format Specification|last=|first=|date=|website=|access-date=}}</ref>
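
As an illustration of the CIGAR encoding, the following minimal Python sketch splits a CIGAR string into its operations and counts how many reference and read bases they cover. The function names (<code>parse_cigar</code>, <code>reference_span</code>, <code>query_length</code>) are hypothetical and chosen for this example only. The operation letters and their reference/query-consuming behaviour follow the SAM specification. The sketch does not handle the <code>*</code> placeholder used when no CIGAR is available.

```python
import re

# One CIGAR operation: a run length followed by an operation letter
# defined in the SAM specification.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '3M1I2D' into (length, op) pairs."""
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    # Rebuilding the string catches characters the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def reference_span(cigar):
    """Reference bases covered: M, D, N, = and X consume the reference."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


def query_length(cigar):
    """Read bases described: M, I, S, = and X consume the query."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")
```

For the string <code>3M1I2D</code>, this yields three operations (three aligned positions, one inserted base, two deleted bases), covering five reference bases and four read bases.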
 
==Global and local alignments==
Global alignments, which attempt to align every residue in every sequence, are most useful when the sequences in the query set are similar and of roughly equal size. (Global alignments may still start and/or end in gaps.) A general global alignment technique is the [[Needleman–Wunsch algorithm]], which is based on dynamic programming. Local alignments are more useful for dissimilar sequences that are suspected to contain regions of similarity or similar sequence motifs within their larger sequence context. The [[Smith–Waterman algorithm]] is a general local alignment method based on the same dynamic programming scheme, with the additional freedom to start and end the alignment at any position in either sequence.<ref name="Polyanovsky2011"/>
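
The relationship between the two schemes can be sketched in a few lines of Python. This is a simplified illustration, not a reference implementation: the function name <code>alignment_score</code> and the scoring values (match +1, mismatch −1, linear gap penalty −1) are assumptions made for the example, and only the optimal score is computed, without the traceback that recovers the alignment itself.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1, local=False):
    """Best alignment score of a and b under a linear gap penalty.

    local=False fills the table in the Needleman-Wunsch (global) manner;
    local=True fills it in the Smith-Waterman (local) manner.
    The scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global: gaps at the start of either sequence are penalised.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if local:
                cell = max(cell, 0)     # local: an alignment may start here
                best = max(best, cell)  # local: an alignment may end here
            score[i][j] = cell

    # Global alignments must end in the bottom-right cell.
    return best if local else score[-1][-1]
```

With these assumed scores, <code>alignment_score("TTTGATTACATTT", "CCGATTACACC", local=True)</code> finds the shared region <code>GATTACA</code>, while the global score for the same pair is lower because the dissimilar flanking residues must also be aligned.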