Comparison of the PAM and BLOSUM Amino Acid Substitution Matrices.

Cold Spring Harbor Protocols (Impact Factor: 4.63). 01/2008; 2008:pdb.ip59. DOI: 10.1101/pdb.ip59
Source: PubMed

ABSTRACT: The choice of a scoring system, including scores for matches, mismatches, substitutions, insertions, and deletions, influences the alignment of both DNA and protein sequences. To score matches and mismatches in alignments of proteins, it is necessary to know how often one amino acid is substituted for another in related proteins. Percent accepted mutation (PAM) matrices list the likelihood of change from one amino acid to another in homologous protein sequences during evolution and thus are focused on tracking the evolutionary origins of proteins. In contrast, the blocks amino acid substitution matrices (BLOSUM) are based on scoring substitutions found over a range of evolutionary periods. There are important differences in the ways that the PAM and BLOSUM scoring matrices were derived. These differences, which are discussed in this article, should be appreciated when interpreting the results of protein sequence alignments obtained with these matrices.
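
As a rough illustration of how substitution frequencies become alignment scores, the sketch below turns counts of aligned residue pairs into rounded log-odds scores, in the spirit of the BLOSUM derivation. The two-letter alphabet, the counts, and the function name `log_odds_scores` are invented for this example and are not taken from the article; the half-bit scaling (`scale=2.0`) is an assumption modeled on the BLOSUM convention.

```python
import math

# Hypothetical counts of aligned residue pairs (unordered), invented for illustration.
pair_counts = {("A", "A"): 60, ("A", "S"): 20, ("S", "S"): 20}


def log_odds_scores(pair_counts, scale=2.0):
    """Return rounded log-odds scores: scale * log2(observed / expected)."""
    total = sum(pair_counts.values())
    # Observed frequency q_ij of each unordered pair.
    q = {pair: n / total for pair, n in pair_counts.items()}
    # Background frequency p_i: every pair contributes two residues.
    p = {}
    for (a, b), freq in q.items():
        p[a] = p.get(a, 0.0) + freq / 2
        p[b] = p.get(b, 0.0) + freq / 2
    scores = {}
    for (a, b), freq in q.items():
        # Expected frequency if residues paired at random.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(scale * math.log2(freq / expected))
    return scores


scores = log_odds_scores(pair_counts)
print(scores)
```

Pairs seen more often than chance predicts receive positive scores, and pairs seen less often receive negative ones. In this toy data the rarer residue S scores higher for a match than the common residue A.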

  • ABSTRACT: Methods are discussed that provide sensitive criteria for detection of weak sequence homologies. They are based on the Dayhoff relatedness odds amino acid exchange matrix and certain residue physical characteristics. The search procedure uses several residue probe lengths in comparing all possible segments of two protein sequences, and search plots are shown with peak values displayed over the entire search length. Alignments are automatically effected using the highest search matrix values and without the necessity of gap penalties. Tests for significance are derived from actual protein sequences rather than a random shuffling procedure.
    Journal of Molecular Biology 02/1987; 193(2):385-96. DOI:10.1016/0022-2836(87)90226-9 · 4.33 Impact Factor
  • ABSTRACT: In aligning homologous protein sequences, it is generally assumed that amino acid substitutions subsequent in time occur independently of amino acid substitutions previous in time, i.e. that patterns of mutation are similar at low and high sequence divergence. This assumption is examined here and shown to be incorrect in an interesting way. Separate mutation matrices were constructed for aligned protein sequence pairs at divergences ranging from 5 to 100 PAM units (point accepted mutations per 100 aligned positions). From these, the corresponding log-odds (Dayhoff) matrices, normalized to 250 PAM units, were constructed. The matrices show that the genetic code influences accepted point mutations strongly at early stages of divergence, while the chemical properties of the side chains dominate at more advanced stages.
    Protein engineering 12/1994; 7(11):1323-32. DOI:10.1093/protein/7.11.1323
  • ABSTRACT: We examined two extensive families of protein sequences using four different alignment schemes that employ various degrees of "weighting" in order to determine which approach is most sensitive in establishing relationships. All alignments used a similarity approach based on a general algorithm devised by Needleman and Wunsch. The approaches included a simple program, UM (unitary matrix), whereby only identities are scored; a scheme in which the genetic code is used as a basis for weighting (GC); another that employs a matrix based on structural similarity of amino acids taken together with the genetic basis of mutation (SG); and a fourth that uses the empirical log-odds matrix (LOM) developed by Dayhoff on the basis of observed amino acid replacements. The two sequence families examined were (a) nine different globins and (b) nine different tyrosine kinase-like proteins. It was assumed a priori that all members of a family share common ancestry. In cases where two sequences were more than 30% identical, alignments by all four methods were almost always the same. In cases where the percentage identity was less than 20%, however, there were often significant differences in the alignments. On the average, the Dayhoff LOM approach was the most effective in verifying distant relationships, as judged by an empirical "jumbling test." This was not universally the case, however, and in some instances the simple UM was actually as good or better. Trees constructed on the basis of the various alignments differed with regard to their limb lengths, but had essentially the same branching orders. We suggest some reasons for the different effectivenesses of the four approaches in the two different sequence settings, and offer some rules of thumb for assessing the significance of sequence relationships.
    Journal of Molecular Evolution 02/1985; 21(2):112-25. DOI:10.1007/BF02100085 · 1.86 Impact Factor
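
The independence assumption questioned in the second abstract above is what allows a matrix for one PAM unit to be extrapolated to larger distances by repeated multiplication. The following minimal sketch shows that extrapolation and the conversion to Dayhoff-style log-odds scores. The two-residue matrix `pam1`, the uniform `background` frequencies, and the helper names are made up for illustration; a real PAM1 matrix is 20 by 20 and estimated from closely related sequences.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


def log_odds(m, background):
    """Dayhoff-style scores: 10 * log10(P(i -> j) / background frequency of j)."""
    size = len(m)
    return [[10 * math.log10(m[i][j] / background[j]) for j in range(size)]
            for i in range(size)]


# Toy matrix: entry [i][j] is the probability that residue i is replaced by
# residue j over one PAM unit (1% change). Values are invented.
pam1 = [[0.99, 0.01],
        [0.01, 0.99]]
background = [0.5, 0.5]  # assumed uniform residue frequencies

# Extrapolation to 250 PAM units assumes each step is independent of the last.
pam250 = mat_pow(pam1, 250)
scores250 = log_odds(pam250, background)
print(pam250)
print(scores250)
```

In this toy case the extrapolated probabilities drift toward the background frequencies, so the log-odds scores shrink toward zero as divergence grows. Matrices built directly from sequence pairs at each divergence level, as in the second abstract, need not follow this simple pattern.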