Revision as of 17:54, 21 May 2004 by Dmb000006 (minor edit, no edit summary)
Revision as of 04:40, 22 May 2004 by Lexor (minor edit; summary: wikifications: casing in sect titles, consolidate paras (WP:1SP
In [[bioinformatics]], '''sequence clustering''' [[algorithm]]s attempt to group sequences that are related in some way. The sequences can be of genomic, "transcriptomic" ([[expressed sequence tag|ESTs]]), or [[protein]] origin. For proteins, [[Homology (biology)|homologous]] sequences are typically grouped into [[protein family|families]]. For EST data, clustering is important to group sequences originating from the same [[gene]] before the ESTs are [[sequence assembly|assembled]] to reconstruct the original [[mRNA]]. Generally, the clustering algorithms use [[single linkage clustering]], constructing a [[transitive closure]] of sequences whose similarity exceeds a particular threshold. The similarity score is often based on [[sequence alignment]]. Sequence clustering is often used to make a [[Non redundant sequence|non-redundant]] set of sequences.

== External links ==

=== Sequence clustering packages ===
* [http://www.ebi.ac.uk/~holm/nrdb90 RDB90 and nrdb90.pl: a nonredundant sequence database]
* [http://www.ebi.ac.uk/research/cgg/tribe/ TribeMCL: a method for clustering proteins into related groups]
<!-- * [http://bio.cc/RSDB RSDB] broken link -->

=== Non-redundant sequence databases ===
* [http://www.fccc.edu/research/labs/dunbrack/pisces/ PISCES: A Protein Sequence Culling Server]
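
== Example sketch ==

The single-linkage approach described in the introduction can be illustrated with a minimal sketch. It assumes a toy similarity score (the fraction of identical positions, without gaps) in place of a real alignment-based score, and the names <code>toy_similarity</code> and <code>single_linkage_clusters</code> are hypothetical, not taken from any of the packages listed above. Sequences are linked whenever their pairwise similarity reaches the threshold, and a union-find structure builds the transitive closure of those links.

```python
from itertools import combinations


def toy_similarity(a: str, b: str) -> float:
    """Fraction of identical positions (ungapped).

    Assumption: a stand-in for an alignment-based score, for illustration only.
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def single_linkage_clusters(seqs, threshold, similarity=toy_similarity):
    """Group sequences by the transitive closure of 'similarity >= threshold'."""
    parent = list(range(len(seqs)))  # union-find forest, one node per sequence

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair whose similarity reaches the threshold.
    for i, j in combinations(range(len(seqs)), 2):
        if similarity(seqs[i], seqs[j]) >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Collect the members of each connected component.
    clusters = {}
    for i, seq in enumerate(seqs):
        clusters.setdefault(find(i), []).append(seq)
    return list(clusters.values())


example = ["ACGTACGT", "ACGTACGA", "ACGTACAA", "TTTTTTTT"]
clusters = single_linkage_clusters(example, threshold=0.8)
# One representative per cluster gives a simple non-redundant set.
representatives = [members[0] for members in clusters]
```

With a threshold of 0.8, the first three example sequences fall into one cluster even though the first and third are only 0.75 similar to each other, because both are linked through the second. This chaining is characteristic of single linkage. Comparing all pairs is quadratic in the number of sequences, so the sketch is only suitable for small inputs.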

Sequence clustering: Difference between revisions