Comparison of the PAM and BLOSUM Amino Acid Substitution Matrices.
ABSTRACT: INTRODUCTION: The choice of a scoring system, including scores for matches, mismatches, substitutions, insertions, and deletions, influences the alignment of both DNA and protein sequences. To score matches and mismatches in alignments of proteins, it is necessary to know how often one amino acid is substituted for another in related proteins. Percent accepted mutation (PAM) matrices list the likelihood of change from one amino acid to another in homologous protein sequences during evolution and thus are focused on tracking the evolutionary origins of proteins. In contrast, the blocks amino acid substitution matrices (BLOSUM) are based on scoring substitutions found over a range of evolutionary periods. There are important differences in the ways that the PAM and BLOSUM scoring matrices were derived. These differences, which are discussed in this article, should be appreciated when interpreting the results of protein sequence alignments obtained with these matrices.
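As a rough illustration of the idea that substitution scores depend on how often one amino acid replaces another in related proteins, the sketch below computes a rounded log-odds score from an observed pair frequency and two background frequencies, then sums such scores over a gap-free alignment. It is a minimal sketch, not the derivation used for either matrix family: the function names, the two-letter alphabet, and all frequencies are hypothetical, and the scale factor of 2 (half-bit units) is an assumed convention.

```python
import math

def log_odds_score(observed, freq_a, freq_b, scale=2.0):
    """Rounded log-odds score in units of 1/scale bits.

    observed: frequency with which residues a and b are found aligned
              in alignments of related proteins (hypothetical input).
    freq_a, freq_b: background frequencies of residues a and b.
    """
    expected = freq_a * freq_b  # pairing frequency expected by chance
    return round(scale * math.log2(observed / expected))

def score_ungapped(seq1, seq2, matrix):
    """Sum substitution scores over two equal-length, gap-free sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    # The matrix is symmetric, so look up each pair in sorted order.
    return sum(matrix[tuple(sorted((a, b)))] for a, b in zip(seq1, seq2))

# Toy frequencies, illustrative only (not real PAM or BLOSUM data).
conserved = log_odds_score(observed=0.0625, freq_a=0.125, freq_b=0.125)
rare = log_odds_score(observed=0.00390625, freq_a=0.125, freq_b=0.125)

# Hypothetical two-letter alphabet standing in for the 20 amino acids.
toy_matrix = {("A", "A"): conserved, ("A", "B"): rare, ("B", "B"): conserved}
total = score_ungapped("AAB", "ABB", toy_matrix)
```

In this sketch a pair seen more often than chance gets a positive score and a pair seen less often gets a negative one, so the alignment total rewards substitutions that are common among related proteins.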
Source: Available from Jay Gerard Pedersen
Conference Paper: Malware Analysis using Bioinformatics Tools
ABSTRACT: When a new strain of computer malware is discovered, it triggers a meticulous process of analyzing its behavior and developing appropriate defenses. A systematic process that identifies regions of commonality and variability with known samples can ease the burden of malware analysis. We address this challenge using an interdisciplinary approach that applies biological sequence analysis methods to computer malware. Specifically, we have developed a method for classifying a digital artifact (possibly malware) based on its similarity to known digital artifacts (or known malware samples) using the methods and tools of bioinformatics. Our approach is analogous to classifications of biological sequences, which are routinely performed using online databases of known biological sequences.
The 2012 International Conference on Security & Management (SAM2012); 07/2012
ABSTRACT: Background: Not all proteins can be crystallized at a size and quality suitable for X-ray diffraction. This limitation opens a field for theoretical molecular and protein studies that represent molecules in 3D and provide spatial information for studying the interaction between ligands and macromolecular receptors. Materials and Methods: An in silico study was performed based on primary sequence analysis of six crystallized LuxS proteins from several bacteria. The 1J6X protein of Helicobacter pylori was selected for its similarity to the LuxS protein sequence of Porphyromonas gingivalis (P. gingivalis) strain W83 and used to produce a homology model of that protein with the Sybyl and MOE software. Docking was performed to assess the reproducibility of the model in a biological environment. Results: A model of the LuxS protein of P. gingivalis strain W83 was developed, which yields a proposed structure for the interaction between the protein and its natural ligand. The computationally generated model achieved the correct position and biological behavior according to the calculations performed. The docking showed a cavity in which the ligand adopted several positions with good results. Conclusions: A LuxS protein model was obtained and validated by different methods. This generated a 3D model of the LuxS protein of P. gingivalis strain W83 with biological reproducibility shown by means of molecular docking.
12/2012; 5(3):105-113. DOI:10.4067/S0719-01072012000300001
ABSTRACT: Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics are to store and manage biological data and to develop and analyze computational tools that enhance its understanding. The size of the data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of the large number of newly discovered proteins. The existing classification results are unsatisfactory because of the very large number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth.
06/2014; 2014:173869. DOI:10.1155/2014/173869