Structural alignment

}}</ref>
 
For every overlapping window of 7 consecutive residues, it computes the set of unit vectors giving the displacement direction between adjacent C-alpha atoms. All local motifs are compared all-against-all using the URMS score. These values become the pairwise score entries for dynamic programming, which produces a seed pairwise residue alignment. The second phase uses a modified MaxSub algorithm: a single aligned pair of 7-residue windows, one from each protein, is used to orient the two full-length protein structures so that just these 7 C-alpha atoms are maximally superimposed; in this orientation it then scans for any additional aligned pairs that are close in 3D. It re-orients the structures to superimpose this expanded set and iterates until no more pairs coincide in 3D. This process is restarted for every 7-residue window in the seed alignment. The output is the maximal number of atoms found from any of these initial seeds. This statistic is converted to a calibrated E-value for the similarity of the proteins.
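
The first step described above (unit direction vectors for each overlapping 7-residue window) can be illustrated with a minimal Python sketch. It is not taken from the Mammoth source code: the function names `unit_vectors` and `window_motifs` are hypothetical, and the input is assumed to be a list of C-alpha coordinates as (x, y, z) tuples. The URMS comparison, the dynamic programming step and the MaxSub-style extension are not shown.

```python
import math

WINDOW = 7  # residues per window, as in the description above


def unit_vectors(ca_coords):
    """Return unit direction vectors between consecutive C-alpha positions.

    ca_coords is assumed to be a list of (x, y, z) tuples, one per residue.
    """
    vectors = []
    for (x0, y0, z0), (x1, y1, z1) in zip(ca_coords, ca_coords[1:]):
        dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm == 0.0:
            raise ValueError("consecutive C-alpha atoms coincide")
        vectors.append((dx / norm, dy / norm, dz / norm))
    return vectors


def window_motifs(ca_coords, window=WINDOW):
    """Return one motif per overlapping window of `window` residues.

    Each motif is the list of window - 1 unit vectors that connect the
    adjacent C-alpha atoms inside that window. Chains shorter than
    `window` residues yield an empty list.
    """
    vecs = unit_vectors(ca_coords)
    per_window = window - 1
    return [vecs[i:i + per_window] for i in range(len(vecs) - per_window + 1)]
```

For a chain of ''n'' residues (''n'' ≥ 7) this yields ''n'' − 6 motifs of 6 unit vectors each; in the scheme described above, every motif of one protein would then be scored against every motif of the other.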
 
Mammoth makes no attempt to re-iterate the initial alignment or extend the high-quality sub-subset. Therefore, the seed alignment it displays cannot be fairly compared to those of DALI or TM-align, as it was formed simply as a heuristic to prune the search space. (It can be used if one wants an alignment based solely on local structure-motif similarity, agnostic of long-range rigid-body atomic alignment.) Because of that same parsimony, it is well over ten times faster than DALI, CE and TM-align.<ref name="foldclass">{{cite journal
|title=Efficient SCOP-fold classification and retrieval using index-based protein substructure alignments
|authors=Pin-Hao Chi, Bin Pang, Dmitry Korkin, Chi-Ren Shyu