Sequence clustering: Difference between revisions

fmt ext lnks, rm 404 link
No edit summary
In [[bioinformatics]], '''sequence clustering''' [[algorithm]]s attempt to group sequences that are related in some way. The sequences can be of genomic, "transcriptomic" ([[EST (biology)|ESTs]]) or [[protein]] origin.
 
For proteins, [[Homology (biology)|homologous]] sequences are typically grouped into [[protein family|families]]. For EST data, clustering is important to group sequences originating from the same [[gene]] before the ESTs are assembled to reconstruct the original [[mRNA]].
 
Generally, the clustering algorithms perform single-linkage clustering: two sequences are linked when their similarity exceeds a particular threshold, and each cluster is formed by taking the [[transitive closure]] of these links. The similarity score is often based on [[sequence alignment]].
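
The single-linkage procedure described above can be illustrated with a minimal Python sketch. It is not taken from any particular clustering tool: the function names <code>identity_fraction</code> and <code>single_linkage_clusters</code> are hypothetical, and the position-wise identity score is only a stand-in for a real alignment-based similarity. The transitive closure is computed with a union-find (disjoint-set) structure.

```python
from itertools import combinations


def identity_fraction(a, b):
    """Toy similarity in [0, 1]: fraction of positions with the same residue.

    Assumption for illustration only; a real pipeline would derive the
    score from a sequence alignment.
    """
    longest = max(len(a), len(b), 1)
    return sum(x == y for x, y in zip(a, b)) / longest


def single_linkage_clusters(seqs, threshold, similarity=identity_fraction):
    """Group sequences by the transitive closure of 'similarity > threshold'."""
    parent = list(range(len(seqs)))  # union-find forest, one node per sequence

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair whose similarity is over the threshold.
    for i, j in combinations(range(len(seqs)), 2):
        if similarity(seqs[i], seqs[j]) > threshold:
            parent[find(i)] = find(j)

    # Collect the sequences that share a root into one cluster.
    clusters = {}
    for i, seq in enumerate(seqs):
        clusters.setdefault(find(i), []).append(seq)
    return list(clusters.values())


if __name__ == "__main__":
    example = ["AAAA", "AAAT", "AATT", "GGGG"]
    print(single_linkage_clusters(example, threshold=0.7))
```

With a threshold of 0.7, <code>AAAA</code> and <code>AATT</code> share only half of their positions and are not linked directly, yet they end up in the same cluster because each is linked to <code>AAAT</code>. This chaining is characteristic of single linkage. The sketch compares all pairs, which is quadratic in the number of sequences.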
 
== External links ==