Comparison of the PAM and BLOSUM amino acid substitution matrices

Cold Spring Harbor Protocols (Impact Factor: 4.63). 06/2008; 2008(6):pdb.ip59. DOI: 10.1101/pdb.ip59
Source: PubMed


INTRODUCTION: The choice of a scoring system, including scores for matches, mismatches, substitutions, insertions, and deletions, influences the alignment of both DNA and protein sequences. To score matches and mismatches in alignments of proteins, it is necessary to know how often one amino acid is substituted for another in related proteins. Percent accepted mutation (PAM) matrices list the likelihood of change from one amino acid to another in homologous protein sequences during evolution and thus are focused on tracking the evolutionary origins of proteins. In contrast, the blocks amino acid substitution matrices (BLOSUM) are based on scoring substitutions found over a range of evolutionary periods. There are important differences in the ways that the PAM and BLOSUM scoring matrices were derived. These differences, which are discussed in this article, should be appreciated when interpreting the results of protein sequence alignments obtained with these matrices.
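As a rough illustration of how substitution frequencies can be turned into alignment scores, the sketch below computes log-odds scores for a toy three-letter alphabet and sums them over an ungapped alignment. The pair frequencies, the names `PAIR_FREQ`, `background`, `log_odds`, and `score_alignment`, and the half-bit scaling are assumptions chosen for illustration; none of the numbers come from a real PAM or BLOSUM matrix.

```python
import math

# Hypothetical frequencies of aligned residue pairs in a toy three-letter
# alphabet (unordered pairs, summing to 1). Invented for illustration only.
PAIR_FREQ = {
    ("A", "A"): 0.30, ("L", "L"): 0.16, ("W", "W"): 0.12,
    ("A", "L"): 0.22, ("A", "W"): 0.12, ("L", "W"): 0.08,
}


def background(pair_freq):
    """Residue frequencies implied by the pair frequencies."""
    freq = {}
    for (a, b), q in pair_freq.items():
        # Each pair contributes half of its frequency to each of its members.
        freq[a] = freq.get(a, 0.0) + q / 2
        freq[b] = freq.get(b, 0.0) + q / 2
    return freq


def log_odds(a, b, pair_freq=PAIR_FREQ):
    """Rounded log-odds score (half-bit units) for aligning a with b."""
    p = background(pair_freq)
    observed = pair_freq[tuple(sorted((a, b)))]
    # Frequency expected by chance; unordered mismatches can occur two ways.
    expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
    return round(2 * math.log2(observed / expected))


def score_alignment(seq1, seq2):
    """Sum of substitution scores over an ungapped alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    return sum(log_odds(a, b) for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    for pair in sorted(PAIR_FREQ):
        print(pair, log_odds(*pair))
    print("AWL vs AWA:", score_alignment("AWL", "AWA"))
```

Pairs observed more often than chance predicts receive positive scores and rarer pairs receive negative ones, which is the general behavior both matrix families share even though their underlying data and derivations differ.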

  • "Generally, obtaining an efficient multiple alignment looks impossible when the sequences do not have enough similarity between them. Sequence alignment programs use a scoring matrix such as point accepted mutation (PAM) and BLOcks SUbstitution Matrix (BLOSUM) to generate a score for the alignment [11]. Some limitations of alignment-based approaches are [12] as follows."
    ABSTRACT: Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced protein and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amount of newly discovered proteins. The existing classification results are unsatisfactory due to a huge size of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth.
    Full-text · Article · Jun 2014
  • "In bioinformatics several matrices are available with the most popular being PAMxx and BLOSUMxx (Mount, 2008). PAMxx provides scores based on the observed frequencies of alignments in related proteins (xx meaning up to xx% of divergence between two genes i.e. xx=50), where identities are given the highest scores (frequently observed substitutions are given a positive score and rarely observed substitutions a negative score)."
    ABSTRACT: We present a novel approach to comparing saccadic eye movement sequences based on the Needleman-Wunsch algorithm used in bioinformatics to compare DNA sequences. In the proposed method, the saccade sequence is spatially and temporally binned and then recoded to create a sequence of letters that retains fixation location, time, and order information. The comparison of two letter sequences is made by maximizing the similarity score computed from a substitution matrix that provides the score for all letter pair substitutions and a penalty gap. The substitution matrix provides a meaningful link between each location coded by the individual letters. This link could be distance but could also encode any useful dimension, including perceptual or semantic space. We show, by using synthetic and behavioral data, the benefits of this method over existing methods. The ScanMatch toolbox for MATLAB is freely available online (
    Full-text · Article · Aug 2010 · Behavior Research Methods
  • ABSTRACT: INTRODUCTION: The original Dayhoff percent accepted mutation (PAM) matrices were developed based on a small number of protein sequences and an evolutionary model of protein change. By extrapolating from the observed changes at small evolutionary distances to large ones, it was possible to establish a PAM250 scoring matrix for sequences that were highly divergent. Another approach to finding a scoring matrix for divergent sequences is to start with a more divergent set of sequences and produce a scoring matrix from the substitutions found in those less-related sequences. The blocks amino acid substitution matrices (BLOSUM) scoring matrices were prepared this way. This article explains how BLOSUM scoring matrices were created and how they can best be used.
    No preview · Article · Jun 2008 · Cold Spring Harbor Protocols
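
The extrapolation from small to large evolutionary distances mentioned in the last abstract can be pictured as repeated multiplication of a one-step mutation probability matrix. The sketch below assumes an invented two-state, row-stochastic matrix `STEP` (each row lists the probabilities of a residue staying the same or changing in one step); it is not Dayhoff data, and `mat_mul` and `mat_pow` are hypothetical helper names.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(m, steps):
    """Multiply m by itself `steps` times, starting from the identity."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, m)
    return result


# Invented one-step mutation probabilities for a toy two-letter alphabet.
# Row i gives the probability that residue i stays or changes in one step.
STEP = [
    [0.99, 0.01],
    [0.02, 0.98],
]

# Extrapolate to 250 steps, by analogy with going from PAM1 to PAM250.
EXTRAPOLATED = mat_pow(STEP, 250)

if __name__ == "__main__":
    for row in EXTRAPOLATED:
        print([round(v, 4) for v in row])
```

After many steps the probability of a residue remaining unchanged falls well below its one-step value, which is why matrices extrapolated to large distances are suited to highly divergent sequences.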