In [[bioinformatics]], '''sequence clustering''' [[algorithm]]s attempt to group biological sequences that are related in some way. The sequences can be of genomic, transcriptomic ([[EST (biology)|ESTs]]), or [[protein]] origin.
For proteins, [[Homology (biology)|homologous]] sequences are typically grouped into [[protein family|families]]. For EST data, clustering is important for grouping sequences that originate from the same [[gene]] before the ESTs are assembled to reconstruct the original [[mRNA]].
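Generally, these algorithms perform single-linkage clustering: they construct the [[transitive closure]] of the relation that links any two sequences whose similarity exceeds a particular threshold, so two sequences fall into the same cluster whenever a chain of sufficiently similar sequences connects them. The similarity score is often based on [[sequence alignment]].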
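
The single-linkage approach described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular clustering tool. The function names <code>identity</code> and <code>single_linkage_clusters</code> are hypothetical, the similarity score is a toy position-by-position identity rather than a real alignment score, and treating a score equal to the threshold as a link is an assumption. The transitive closure is computed with a union-find structure over all pairs of sequences.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Toy similarity (an assumption, not an alignment score):
    fraction of matching positions, relative to the longer sequence."""
    if not a and not b:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def single_linkage_clusters(seqs, threshold, similarity=identity):
    """Group sequences into the connected components of the graph whose
    edges join pairs with similarity >= threshold (single linkage)."""
    parent = list(range(len(seqs)))  # union-find forest, one node per sequence

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once; merge the two clusters when the pair is similar.
    for i, j in combinations(range(len(seqs)), 2):
        if similarity(seqs[i], seqs[j]) >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Collect the members of each cluster by their root.
    clusters = {}
    for i, seq in enumerate(seqs):
        clusters.setdefault(find(i), []).append(seq)
    return list(clusters.values())
```

For example, with a threshold of 0.8 and the sequences <code>ACGTACGT</code>, <code>ACGTACGA</code>, <code>ACGAACGA</code> and <code>TTTTGGGG</code>, the first and third sequences have a toy identity of only 0.75 and are not linked directly, but each has an identity of 0.875 with the second, so all three end up in one cluster while <code>TTTTGGGG</code> forms its own. The all-pairs comparison is quadratic in the number of sequences, which is why this sketch is only suitable for small inputs.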
== External links ==