Substitution matrix: Difference between revisions

Line 57:
 
The BLOSUM62 matrix performs well at detecting similarities between distantly related sequences, and it is the default matrix in most recent alignment applications such as [[BLAST (biotechnology)|BLAST]].
 
The BLOSUM computer code written by Henikoff and Henikoff does not exactly match the description in their paper. Surprisingly, this commonly used "wrong" version has better search performance.<ref name=article>{{cite journal |last1=Styczynski |first1=Mark P |last2=Jensen |first2=Kyle L |last3=Rigoutsos |first3=Isidore |last4=Stephanopoulos |first4=Gregory |title=BLOSUM62 miscalculations improve search performance |journal=Nature Biotechnology |date=March 2008 |volume=26 |issue=3 |pages=274–275 |doi=10.1038/nbt0308-274 | pmid=18327232 |s2cid=205266180 }}</ref>
 
One issue with BLOSUM is that it describes only observed substitutions. This can be misleading because counting observed changes, which is equivalent to maximum parsimony, ignores the possibility of intermediate substitutions.<ref name="WAG original paper"/> As a result, BLOSUM is not a true substitution model, and its distances have no theoretical meaning as evolutionary distances. The authors of PMB (Probability Matrix from Blocks) use the observed differences in many BLOSUM matrices to estimate actual substitution frequencies, which they present as the PMB matrix.<ref>{{cite journal |last1=Veerassamy |first1=Shalini |last2=Smith |first2=Andrew |last3=Tillier |first3=Elisabeth R. M. |title=A Transition Probability Model for Amino Acid Substitutions from Blocks |journal=Journal of Computational Biology |date=December 2003 |volume=10 |issue=6 |pages=997–1010 |doi=10.1089/106652703322756195}}</ref>
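
The gap between observed and actual substitutions can be illustrated with a deliberately simple model. The sketch below is not the PMB method; it assumes, purely for illustration, that all 20 amino acids are equally frequent and that every substitution is equally likely (a 20-state analogue of the Jukes–Cantor model). The function names <code>observed_difference</code> and <code>corrected_distance</code> are hypothetical. Under these assumptions, the fraction of sites that appear different is always smaller than the actual number of substitutions per site, because repeated and reverse changes at the same site are not counted.

```python
import math

N_STATES = 20  # number of amino acids; equal frequencies are an assumption


def observed_difference(d, n=N_STATES):
    """Expected fraction of differing sites after d actual substitutions
    per site, under an equal-rates, equal-frequencies model (assumed)."""
    limit = (n - 1) / n  # saturation level: unrelated sequences differ this much
    return limit * (1.0 - math.exp(-d / limit))


def corrected_distance(p, n=N_STATES):
    """Invert observed_difference: estimate actual substitutions per site
    from an observed fraction p of differing sites."""
    limit = (n - 1) / n
    if not 0.0 <= p < limit:
        raise ValueError("p must satisfy 0 <= p < (n - 1) / n")
    return -limit * math.log(1.0 - p / limit)


if __name__ == "__main__":
    # The observed fraction falls further behind the actual count as d grows.
    for d in (0.1, 0.5, 1.0, 2.0):
        p = observed_difference(d)
        print(f"actual = {d:.2f}  observed = {p:.3f}  "
              f"recovered = {corrected_distance(p):.2f}")
```

Real amino acid substitutions are far from uniform, which is why matrices such as PMB are estimated from data rather than derived from a closed-form correction like this one.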
 
=== Differences between PAM and BLOSUM ===