Structural alignment: Difference between revisions

Cems2 (talk | contribs)
Line 92:
For every overlapping window of 7 consecutive residues, it computes the set of unit vectors giving the displacement direction between adjacent C-alpha atoms. All-against-all local motifs are compared based on the URMS score. These values become the pair alignment score entries for dynamic programming, which produces a seed pairwise residue alignment. The second phase uses a modified MaxSub algorithm: a single aligned pair of 7-residue windows, one in each protein, is used to orient the two full-length protein structures so as to superimpose just these 7 C-alpha atoms; then, in this orientation, it scans for any additional aligned pairs that are close in 3D. It re-orients the structures to superimpose this expanded set and iterates until no more pairs coincide in 3D. This process is restarted for every 7-residue window in the seed alignment. The output is the maximal number of atoms found from any of these initial seeds. This statistic is converted to a calibrated E-value for the similarity of the proteins.
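The first phase can be illustrated with a minimal sketch, which is not MAMMOTH's actual code. It assumes that the URMS score of two windows is the root-mean-square distance between their unit-vector sets after the best proper rotation, with no translation. The function names `unit_vectors`, `urms` and `window_score_matrix` are hypothetical, and the dynamic-programming step that consumes the matrix is omitted.

```python
import numpy as np

def unit_vectors(ca):
    """Unit displacement vectors between consecutive C-alpha coordinates.

    ca is an (n, 3) array; the result has shape (n - 1, 3).
    """
    d = np.diff(ca, axis=0)
    return d / np.linalg.norm(d, axis=1, keepdims=True)

def urms(u, v):
    """RMS distance between two equally sized unit-vector sets after the
    best proper rotation (assumed definition of the URMS score)."""
    h = u.T @ v                                   # 3x3 covariance matrix
    s = np.linalg.svd(h, compute_uv=False)        # singular values, descending
    # Maximal trace over proper rotations: flip the smallest singular
    # value when the unconstrained optimum would be a reflection.
    trace = s[0] + s[1] + (s[2] if np.linalg.det(h) >= 0 else -s[2])
    msd = (np.sum(u * u) + np.sum(v * v) - 2.0 * trace) / len(u)
    return float(np.sqrt(max(msd, 0.0)))          # clamp rounding noise

def window_score_matrix(ca_a, ca_b, w=7):
    """All-against-all URMS scores for windows of w residues
    (each window contributes w - 1 unit vectors)."""
    ua, ub = unit_vectors(ca_a), unit_vectors(ca_b)
    na, nb = len(ca_a) - w + 1, len(ca_b) - w + 1
    scores = np.empty((na, nb))
    for i in range(na):
        for j in range(nb):
            scores[i, j] = urms(ua[i:i + w - 1], ub[j:j + w - 1])
    return scores
```

Because only directions are compared, the score of a window against a rotated and translated copy of itself is zero up to rounding error, which is what makes it a purely local shape measure.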
 
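The seed-extension loop of the second phase can be sketched as follows, again as an illustration under stated assumptions and not the published implementation. Superposition uses the standard Kabsch method. The 4.0 Å distance cutoff, the iteration cap and the names `kabsch`, `extend_seed` and `best_seed` are hypothetical, and `pairs` stands for the seed alignment given as index pairs.

```python
import numpy as np

def kabsch(p, q):
    """Rotation r and translation t that best map points p onto q
    in the least-squares sense (apply as p @ r.T + t)."""
    pc, qc = p.mean(axis=0), q.mean(axis=0)
    u, _, vt = np.linalg.svd((p - pc).T @ (q - qc))
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0   # avoid reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return r, qc - r @ pc

def extend_seed(ca_a, ca_b, pairs, start, w=7, cutoff=4.0, max_iter=50):
    """Grow a superimposed set from the w aligned pairs starting at `start`.

    cutoff (in angstroms) and max_iter are illustrative assumptions.
    Returns the number of aligned pairs in the final set.
    """
    pairs = np.asarray(pairs)
    active = np.zeros(len(pairs), dtype=bool)
    active[start:start + w] = True
    for _ in range(max_iter):
        # Superimpose on the current set, then measure every aligned pair.
        r, t = kabsch(ca_a[pairs[active, 0]], ca_b[pairs[active, 1]])
        moved = ca_a[pairs[:, 0]] @ r.T + t
        dist = np.linalg.norm(moved - ca_b[pairs[:, 1]], axis=1)
        grown = active | (dist <= cutoff)
        if grown.sum() == active.sum():           # no new pairs coincide
            break
        active = grown
    return int(active.sum())

def best_seed(ca_a, ca_b, pairs, w=7, cutoff=4.0):
    """Restart the extension from every window and keep the largest set."""
    return max(extend_seed(ca_a, ca_b, pairs, s, w, cutoff)
               for s in range(len(pairs) - w + 1))
```

In this sketch the set only ever grows, so the loop is guaranteed to stop; a variant that also drops pairs which drift beyond the cutoff would be closer to the original MaxSub procedure.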
Mammoth makes no attempt to re-iterate the initial alignment or extend the high-quality subset. Therefore the seed alignment it displays cannot be fairly compared to DALI or TM-align, as it was formed simply as a heuristic to prune the search space. (It can be used if one wants an alignment based solely on local structure-motif similarity, agnostic of long-range rigid-body atomic alignment.) Because of that same parsimony, it is well over ten times faster than DALI, CE and TM-align.<ref name="foldclass">{{cite journal
|title=Efficient SCOP-fold classification and retrieval using index-based protein substructure alignments
|authors=Pin-Hao Chi, Bin Pang, Dmitry Korkin, Chi-Ren Shyu