Comparison of the PAM and BLOSUM amino acid substitution matrices

Cold Spring Harbor Protocols (Impact Factor: 4.63). 06/2008; 2008(6):pdb.ip59. DOI: 10.1101/pdb.ip59
Source: PubMed


INTRODUCTION: The choice of a scoring system, including scores for matches, mismatches, substitutions, insertions, and deletions, influences the alignment of both DNA and protein sequences. To score matches and mismatches in alignments of proteins, it is necessary to know how often one amino acid is substituted for another in related proteins. Percent accepted mutation (PAM) matrices list the likelihood of change from one amino acid to another in homologous protein sequences during evolution and thus are focused on tracking the evolutionary origins of proteins. In contrast, the blocks amino acid substitution matrices (BLOSUM) are based on scoring substitutions found over a range of evolutionary periods. There are important differences in the ways that the PAM and BLOSUM scoring matrices were derived. These differences, which are discussed in this article, should be appreciated when interpreting the results of protein sequence alignments obtained with these matrices.
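As a rough illustration of how substitution frequencies become alignment scores, the sketch below uses the log-odds form that both PAM and BLOSUM matrices share: the score compares how often a pair is observed in related proteins with how often it would be expected by chance. It is not taken from the article. The function names, the `scale` parameter, and all frequency and score values are hypothetical and chosen only for illustration.

```python
import math


def log_odds_score(pair_freq, freq_a, freq_b, scale=2.0):
    """Return scale * log2(observed / expected) for one aligned residue pair.

    pair_freq      -- observed frequency of the aligned pair (assumed input)
    freq_a, freq_b -- background frequencies of the two residues (assumed input)
    scale          -- hypothetical scaling factor; 2.0 gives half-bit units
    """
    expected = freq_a * freq_b  # chance of the pair if residues were independent
    return scale * math.log2(pair_freq / expected)


def score_alignment(seq1, seq2, matrix):
    """Sum substitution scores over an ungapped alignment of equal length.

    matrix maps (residue, residue) tuples to scores.
    """
    if len(seq1) != len(seq2):
        raise ValueError("ungapped alignment requires equal-length sequences")
    return sum(matrix[(a, b)] for a, b in zip(seq1, seq2))


# Toy numbers, not real PAM or BLOSUM data.
# A pair seen twice as often as chance predicts gets a positive score.
common = log_odds_score(pair_freq=0.01, freq_a=0.1, freq_b=0.05)
# A pair seen half as often as chance predicts gets a negative score.
rare = log_odds_score(pair_freq=0.0025, freq_a=0.1, freq_b=0.05)

# Toy two-letter substitution matrix (hypothetical scores).
toy_matrix = {
    ("L", "L"): 4, ("L", "I"): 2,
    ("I", "L"): 2, ("I", "I"): 4,
}
total = score_alignment("LIL", "LLI", toy_matrix)
```

In practice the scores would come from a published matrix rather than hand-entered values, and a full aligner would also apply penalties for insertions and deletions, which this sketch leaves out.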

  • "Generally, obtaining an efficient multiple alignment looks impossible when the sequences do not have enough similarity between them. Sequence alignment programs use a scoring matrix such as point accepted mutation (PAM) and BLOcks SUbstitution Matrix (BLOSUM) to generate a score for the alignment [11]. Some limitations of alignment-based approaches are [12] as follows."
    ABSTRACT: Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced protein and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amount of newly discovered proteins. The existing classification results are unsatisfactory due to a huge size of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth.
    06/2014; 2014(4):173869. DOI:10.1155/2014/173869
  • "In bioinformatics several matrices are available with the most popular being PAMxx and BLOSUMxx (Mount, 2008). PAMxx provides scores based on the observed frequencies of alignments in related proteins (xx meaning up to xx% of divergence between two genes i.e. xx=50), where identities are given the highest scores (frequently observed substitutions are given a positive score and rarely observed substitutions a negative score)."
    ABSTRACT: We present a novel approach to comparing saccadic eye movement sequences based on the Needleman-Wunsch algorithm used in bioinformatics to compare DNA sequences. In the proposed method, the saccade sequence is spatially and temporally binned and then recoded to create a sequence of letters that retains fixation location, time, and order information. The comparison of two letter sequences is made by maximizing the similarity score computed from a substitution matrix that provides the score for all letter pair substitutions and a penalty gap. The substitution matrix provides a meaningful link between each location coded by the individual letters. This link could be distance but could also encode any useful dimension, including perceptual or semantic space. We show, by using synthetic and behavioral data, the benefits of this method over existing methods. The ScanMatch toolbox for MATLAB is freely available online (
    Behavior Research Methods 08/2010; 42(3):692-700. DOI:10.3758/BRM.42.3.692 · 2.93 Impact Factor
  • ABSTRACT: INTRODUCTION: Comparing different amino acid scoring matrix-gap penalty combinations poses several problems. For example, the analysis often overlooks the purposes of different matrices; e.g., protein family or domain searching, evolutionary analysis, or structural alignment. In the past, gap penalties were usually not published or well known, thus throwing a level of uncertainty into the results. More recently, when investigators publish a new scoring matrix, they usually provide suitable choices for gap penalties that may be used for comparisons with other matrices. This article summarizes a number of reports that have examined combinations of alignment algorithm, scoring matrix, and gap penalties used to align sequences for various purposes.
    Cold Spring Harbor Protocols 06/2008; 2008(6):pdb.ip60. DOI:10.1101/pdb.ip60 · 4.63 Impact Factor