A highly accurate statistical approach for the prediction of transmembrane beta-barrels

Department of Biochemistry, Tulane University Health Sciences Center, New Orleans, LA 70112, USA.
Bioinformatics (Impact Factor: 4.62). 08/2010; 26(16):1965-74. DOI: 10.1093/bioinformatics/btq308
Source: PubMed

ABSTRACT Transmembrane beta-barrels (TMBBs) belong to a special structural class of proteins predominantly found in the outer membranes of Gram-negative bacteria, mitochondria and chloroplasts. TMBBs are surface-exposed proteins that perform a variety of functions ranging from nutrient acquisition to osmotic regulation. These properties suggest that TMBBs have great potential for use in vaccine or drug therapy development. However, membrane proteins such as TMBBs are notoriously difficult to identify and characterize using traditional experimental approaches, and current prediction methods are still unreliable.
A prediction method based on the physicochemical properties of experimentally characterized TMBB structures was developed to predict TMBB-encoding genes from genomic databases. The Freeman-Wimley prediction algorithm developed in this study has an accuracy of 99% and a Matthews correlation coefficient (MCC) of 0.748 when using the most efficient prediction criteria, which is better than any previously published algorithm.
The MS Windows-compatible application is available for download at
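
The abstract reports both accuracy and MCC, and the two can diverge sharply when the positive class (here, TMBBs) is rare in a genome. The following minimal Python sketch computes both metrics from a confusion matrix. The counts are hypothetical and chosen only to illustrate the effect; they are not taken from the paper, and the function names are illustrative.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (the usual convention
    for an undefined denominator).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): 50 true TMBBs among
# 10,000 proteins, a class imbalance typical of genome-wide screens.
tp, fn = 40, 10      # TMBBs found / missed
tn, fp = 9900, 50    # non-TMBBs correctly rejected / wrongly flagged

print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

With these illustrative counts the accuracy is above 99% while the MCC is far lower, which is why MCC is the more informative figure for rare-class prediction.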

  • Source
    ABSTRACT: Outer membrane proteins (OMPs) play important roles in bacterial cellular processes. Discriminating OMPs from proteins of other fold types helps in predicting their structures and in designing OMP-targeted drugs. In this paper, we developed a novel prediction method based on primary sequence features and support vector machine (SVM) algorithms. Discriminative features were extracted from protein sequences by combining sequence encoding based on grouped weights (EBGW), amino acid composition and biochemical properties. Feature subsets were screened using the F-score algorithm to train an SVM-based classifier, namely EBGW_OMP. The performance of EBGW_OMP was examined on a benchmark dataset of 1087 proteins. The results show that EBGW_OMP can discriminate OMPs from globular proteins, α-helical membrane proteins or non-OMPs with cross-validated accuracies of 98.0%, 97.6% and 97.9%, respectively, outperforming existing sequence-based methods. EBGW_OMP also successfully distinguished 681 out of 722 OMPs with 97.0% accuracy in another benchmark dataset of 2657 proteins. Genome-wide tests show that EBGW_OMP detects OMPs reliably and is worth considering for genomic OMP prediction. The web server that implements EBGW_OMP is freely accessible at OMP.
    2014 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB); 05/2014
  • Source
    ABSTRACT: Despite the increasing number of recently solved membrane protein structures, coverage of membrane protein fold space remains relatively sparse. This necessitates the use of computational strategies to investigate membrane protein structure, allowing us to further our understanding of how membrane proteins carry out their diverse range of functions, while aiding the development of novel predictive tools with which to probe uncharacterised folds. Analysis of known structures, the application of machine learning techniques, molecular dynamics simulations and protein structure prediction have enabled significant advances to be made in the field of membrane protein research. In this communication, the key bioinformatic methods that allow the characterisation of membrane proteins are reviewed, the tools available for the structural analysis of membrane proteins are presented and the contribution these tools have made to expanding our understanding of membrane protein structure, function and stability is discussed.
    Journal of Structural Biology 10/2011; 179(3):327-37. DOI:10.1016/j.jsb.2011.10.008 · 3.37 Impact Factor
  • Source
    ABSTRACT: Transmembrane beta barrel (TMB) proteins are found in the outer membranes of bacteria, mitochondria and chloroplasts. TMBs are involved in a variety of functions such as mediating flux of metabolites and active transport of siderophores, enzymes and structural proteins, and in the translocation across or insertion into membranes. We present here TMBHMM, a computational method based on a hidden Markov model for predicting the structural topology of putative TMBs from sequence. In addition to predicting transmembrane strands, TMBHMM also predicts the exposure status (i.e., exposed to the membrane or hidden in the protein structure) of the residues in the transmembrane region, which is a novel feature of the TMBHMM method. Furthermore, TMBHMM can also predict the membrane residues that are not part of beta barrel forming strands. The training of the TMBHMM was performed on a non-redundant data set of 19 TMBs. The self-consistency test yielded Q(2) accuracy of 0.87, Q(3) accuracy of 0.83, Matthews correlation coefficient of 0.74 and SOV for beta strand of 0.95. In this self-consistency test the method predicted 83% of transmembrane residues with correct exposure status. On an unseen, non-redundant test data set of 10 proteins, the 2-state and 3-state TMBHMM prediction accuracies are around 73% and 72%, respectively, and are comparable to other methods from the literature. The TMBHMM web server takes an amino acid sequence or a multiple sequence alignment as an input and predicts the exposure status and the structural topology as output. The TMBHMM web server is available under the tmbhmm tab at:
    Biochimica et Biophysica Acta 03/2011; 1814(5):664-70. DOI:10.1016/j.bbapap.2011.03.004 · 4.66 Impact Factor
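
The EBGW_OMP abstract above mentions screening feature subsets with an F-score before training the SVM, but does not give the formula. The sketch below assumes the Fisher-style F-score commonly used for SVM feature selection: the squared distances of the two class means from the overall mean, divided by the sum of the within-class sample variances. The function names and the toy data are hypothetical and are not taken from any of the listed papers.

```python
def f_score(pos, neg):
    """Fisher-style F-score of one feature (assumed definition).

    pos, neg: values of the feature in the positive and negative
    class. Each class needs at least two samples.
    """
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)

    # Between-class separation of the two class means.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # Sum of the within-class sample variances.
    within = (
        sum((x - mean_pos) ** 2 for x in pos) / (n_pos - 1)
        + sum((x - mean_neg) ** 2 for x in neg) / (n_neg - 1)
    )
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return (feature_index, score) pairs, highest F-score first."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append((j, f_score(pos, neg)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): feature 0 separates the classes,
# feature 1 has identical class means and carries no signal.
rows = [[1.0, 5.0], [1.2, 1.0], [0.8, 3.0],
        [3.0, 2.0], [3.2, 4.0], [2.8, 3.0]]
labels = [1, 1, 1, 0, 0, 0]

for index, score in rank_features(rows, labels):
    print(f"feature {index}: F = {score:.2f}")
```

Features with the highest scores would be kept for classifier training; where to cut the ranking is a tuning choice that the abstract does not specify.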

