Multiple sequence alignment: Difference between revisions

Hidden Markov models: refs to original papers on some methods
Motif finding: meme/mast refs
Line 48:
Blocks analysis is a method of motif finding that restricts motifs to ungapped regions in the alignment. Blocks can be generated from an MSA, or extracted from unaligned sequences using a precalculated set of common motifs derived from known gene families.<ref name="henikoff">Henikoff S, Henikoff JG. (1991). Automated assembly of protein blocks for database searching. ''Nucleic Acids Res'' 19:6565-72.</ref> Block scoring generally relies on the spacing of high-frequency characters rather than on the calculation of an explicit substitution matrix. A server for locating motifs in unaligned sequences is available at [http://blocks.fhcrc.org/ BLOCKS].
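
As a rough illustration of what "ungapped regions in the alignment" means, the sketch below scans an existing MSA for runs of columns in which no sequence carries a gap. It is only a simplified stand-in, not the block assembly procedure of the BLOCKS server; the function name <code>ungapped_blocks</code>, the <code>min_width</code> threshold, and the toy sequences are assumptions made for this example.

```python
def ungapped_blocks(alignment, min_width=3, gap="-"):
    """Return (start, end) column ranges, end exclusive, that contain no gaps.

    Hypothetical helper for illustration: it only finds gap-free column runs
    and does not score or assemble blocks the way a real blocks tool would.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    blocks = []
    start = None
    # Iterate one column past the end so a trailing run is closed properly.
    for col in range(length + 1):
        gap_free = col < length and all(row[col] != gap for row in alignment)
        if gap_free:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_width:
                blocks.append((start, col))
            start = None
    return blocks


# Toy alignment (made-up sequences, not taken from any database).
alignment = [
    "MKV-LAGTWE",
    "MKVALAG-WE",
    "MRV-LSGTWE",
]
blocks = ungapped_blocks(alignment)
segments = [[row[s:e] for row in alignment] for s, e in blocks]
```

With the default <code>min_width</code> of 3, the final two gap-free columns of the toy alignment are too short to be reported as a block.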
 
Statistical pattern-matching has been implemented using both the [[expectation-maximization algorithm]] and the [[Gibbs sampler]]. One of the most common motif-finding tools, known as MEME, uses expectation maximization and hidden Markov methods to generate motifs that are then used as search tools by its companion program MAST.<ref name="baileyelkan">Bailey TL, Elkan C. (1994). Fitting a mixture model by expectation maximization to discover motifs in biopolymers. ''Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology'', pp. 28-36, AAAI Press, Menlo Park, California.</ref><ref name="baileygribskov">Bailey TL, Gribskov M. (1998). Combining evidence using p-values: application to sequence homology searches. ''Bioinformatics'' 14:48-54.</ref> Both are available at [http://meme.sdsc.edu/meme/intro.html MEME/MAST].
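
The expectation-maximization idea can be sketched on a toy problem. The example below assumes exactly one motif occurrence per sequence, a fixed uniform background, and a single starting guess; it is a minimal sketch under those assumptions and does not reproduce MEME itself. The function name <code>em_motif</code>, the seed word, the pseudocount, and the DNA sequences are all hypothetical.

```python
ALPHABET = "ACGT"


def em_motif(seqs, seed, iters=20, pseudo=0.1):
    """Toy EM for one motif occurrence per sequence (illustrative only).

    Returns the position weight matrix and, for each sequence, the posterior
    probability of the motif starting at each position.
    """
    width = len(seed)
    # Start from a soft matrix that favours the seed letters (0.7 vs 0.1).
    pwm = [{a: (0.7 if a == ch else 0.1) for a in ALPHABET} for ch in seed]
    background = {a: 0.25 for a in ALPHABET}  # assumed uniform and fixed
    posteriors = []
    for _ in range(iters):
        # E-step: posterior over start positions from likelihood ratios.
        posteriors = []
        for s in seqs:
            scores = []
            for i in range(len(s) - width + 1):
                ratio = 1.0
                for j in range(width):
                    ratio *= pwm[j][s[i + j]] / background[s[i + j]]
                scores.append(ratio)
            total = sum(scores)
            posteriors.append([x / total for x in scores])
        # M-step: re-estimate each column from posterior-weighted counts.
        counts = [{a: pseudo for a in ALPHABET} for _ in range(width)]
        for s, post in zip(seqs, posteriors):
            for i, p in enumerate(post):
                for j in range(width):
                    counts[j][s[i + j]] += p
        pwm = [{a: col[a] / sum(col.values()) for a in ALPHABET} for col in counts]
    return pwm, posteriors


# Made-up sequences, each containing the word TATAAT in GC-rich flanks.
seqs = [
    "GCGCTATAATGCCG",
    "CCTATAATGGCGCG",
    "GGCGCGTATAATCC",
    "TATAATCGGCGGCC",
]
# The seed deliberately differs from the planted word at one position.
pwm, posteriors = em_motif(seqs, seed="TATTAT")
consensus = "".join(max(col, key=col.get) for col in pwm)
best_starts = [post.index(max(post)) for post in posteriors]
```

Because EM only converges to a local optimum, the result depends on the starting guess; practical tools usually try several starting points and keep the best-scoring model.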
 
==See also==