Enhanced genome annotation using structural profiles in the program 3D-PSSM

Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, 44 Lincoln's Inn Fields, London, WC2A 3PX, England.
Journal of Molecular Biology (Impact Factor: 3.96). 07/2000; 299(2):499-520. DOI: 10.1006/jmbi.2000.3741
Source: PubMed

ABSTRACT A method (three-dimensional position-specific scoring matrix, 3D-PSSM) to recognise remote protein sequence homologues is described. The method combines the power of multiple sequence profiles with knowledge of protein structure to provide enhanced recognition and thus functional assignment of newly sequenced genomes. The method uses structural alignments of homologous proteins of similar three-dimensional structure in the structural classification of proteins (SCOP) database to obtain a structural equivalence of residues. These equivalences are used to extend multiply aligned sequences obtained by standard sequence searches. The resulting large superfamily-based multiple alignment is converted into a PSSM. Combined with secondary structure matching and solvation potentials, 3D-PSSM can recognise structural and functional relationships beyond state-of-the-art sequence methods. In a cross-validated benchmark on 136 homologous relationships unambiguously undetectable by position-specific iterated basic local alignment search tool (PSI-Blast), 3D-PSSM can confidently assign 18 %. The method was applied to the remaining unassigned regions of the Mycoplasma genitalium genome and an additional 13 regions were assigned with 95 % confidence. 3D-PSSM is available to the community as a web server:
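
The step in which a multiple alignment is extended and then converted into a PSSM can be illustrated with a minimal sketch. This is not the 3D-PSSM scoring scheme, whose details the abstract does not give. It is a generic log-odds PSSM that assumes a uniform amino acid background and simple additive pseudocounts. The function names `build_pssm` and `score_ungapped` are hypothetical, and the toy sequences are invented for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    Assumptions (not taken from the paper): uniform background
    frequencies and additive pseudocounts; gap characters are ignored.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        # Count only standard residues; '-' and other symbols are skipped.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_ungapped(pssm, query):
    """Sum the per-column scores of a query with the same length as the PSSM."""
    if len(query) != len(pssm):
        raise ValueError("query length must match the number of PSSM columns")
    return sum(column[aa] for column, aa in zip(pssm, query))


# Toy data (invented): rows found by a sequence search, plus rows that a
# structural alignment would contribute as structurally equivalent residues.
sequence_hits = ["ACDK", "ACEK", "GCDK"]
structural_equivalents = ["AC-R", "SCDK"]

# Extending the alignment before building the profile mirrors, in spirit,
# the superfamily-based multiple alignment described in the abstract.
extended_alignment = sequence_hits + structural_equivalents
pssm = build_pssm(extended_alignment)
```

In this sketch a conserved column, such as the cysteine in the second position, receives a positive score for the conserved residue and negative scores for residues never observed there, so a query resembling the family scores higher than an unrelated one.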

  • ABSTRACT: During the maturation of extracellular proteins, disulfide bonds that chemically cross-link specific cysteines are often added to stabilize a protein or to join it covalently to other proteins. Disulfide formation, which requires a change in the covalent structure of the protein, occurs as the protein folds into its three-dimensional structure. In the eukaryotic endoplasmic reticulum and in the bacterial periplasm, an elaborate system of chaperones and folding catalysts ensures that disulfides connect the proper cysteines and that the folding protein does not make improper interactions. This review focuses specifically on one of these folding assistants, protein disulfide isomerase (PDI), an enzyme that catalyzes disulfide formation and isomerization and a chaperone that inhibits aggregation.
    Biochimica et Biophysica Acta (BBA) - Proteins & Proteomics 06/2004; 1699(1-2):35-44. DOI:10.1016/S1570-9639(04)00063-9 · 3.19 Impact Factor
  • ABSTRACT: One of the central problems in the post-genomic era is understanding the function of the myriad putative gene products suggested by genome sequencing projects. Computational approaches aimed at establishing relationships between proteins purely on the basis of their amino acid sequences provide a rapid and useful first step. Sequence analysis methods that use evolutionary information on protein families perform well in detecting remote homologues. The use of three-dimensional (3-D) structures provides a further edge in detecting distantly related proteins, as 3-D structures are better conserved than amino acid sequences. Also, in many cases, similarity in the fold of proteins corresponds to gross similarity in function. Hence, knowledge of 3-D structures has a profound influence on identifying the functions of newly discovered gene products. This review covers recent developments in this area of homology detection and its influence on computational genomics.
    Introduction
    A critical problem confronting the present era of genome revolution is the assignment of function and structure to newly discovered proteins. The exploding rate at which genomes are sequenced is a formidable challenge for experimental scientists who attempt biological and biochemical characterization of proteins. The rapid, preliminary assignment of protein function is therefore a challenging area, and several procedures and strategies have been developed in the last decade to keep pace with genome sequencing.
    Several effective experimental techniques aim to characterize proteins at the genomic scale. Microarray and protein expression profiles quantify transcription and translation of genes in an organism. Techniques such as mass spectrometry and 2D gel electrophoresis also serve as important tools for genome-wide characterization of gene products. Genome-wide yeast two-hybrid analysis is a powerful tool for studying interactions between proteins. While these techniques provide a variety of information about the genes encoded in a genome, biochemical or biological functional characterization is still unavailable for most proteins in various organisms.
    Preliminary characterization and assignment of protein function is often performed by relating newly discovered proteins to proteins whose structure and biochemical function are well known [1-6]. These similarities are deduced at the level of amino acid sequences by pair-wise string comparison of protein sequences, a process known as protein homology detection, which is a central tool in genomics. Further in-silico approaches to identifying interacting proteins include the Rosetta approach [7], comparison of phylogenetic profiles [8] and chromosomal localization of genes [9].