Article

NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins.

School of Medicine and Health Sciences, Universidad del Rosario, Carrera 24 No. 63C-69, Bogotá DC, Colombia.
BMC Bioinformatics (impact factor: 2.75). 01/2011; 12:21. DOI:10.1186/1471-2105-12-21 pp.21
Source: PubMed

ABSTRACT Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes the implementation of a sequence-based classifier, denoted as NClassG+, for identifying non-classically secreted Gram-positive bacterial proteins.
Several feature-based classifiers were trained using different sequence transformation vectors (frequencies, dipeptides, physicochemical factors and position-specific scoring matrices, PSSM) and Support Vector Machines (SVMs) with linear, polynomial and Gaussian kernel functions. Nested k-fold cross-validation (CV) was applied to select the best models, using the inner CV loop to tune the model parameters and the outer CV loop to compute the error. The kernel functions, their parameters and the combinations of feature vectors were optimized using grid search.
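As an illustration of two of the sequence transformations named above, the following minimal sketch computes amino-acid frequencies and dipeptide frequencies for a single sequence. It is not the authors' code: the function names, the handling of non-standard residues and the toy sequence are assumptions made for this example, and the physicochemical and PSSM transformations are not covered.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aa_frequencies(seq):
    """Return a 20-dimensional vector of amino-acid relative frequencies."""
    seq = seq.upper()
    counts = dict.fromkeys(AMINO_ACIDS, 0)
    for residue in seq:
        if residue in counts:  # assumption: non-standard symbols (e.g. X) are skipped
            counts[residue] += 1
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def dipeptide_frequencies(seq):
    """Return a 400-dimensional vector of overlapping dipeptide frequencies."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs containing non-standard symbols are skipped
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[dp] / total if total else 0.0 for dp in DIPEPTIDES]


# Hypothetical toy sequence, not taken from the study's data set.
example = "MKKLLAAG"
features = aa_frequencies(example) + dipeptide_frequencies(example)
print(len(features))  # 420 = 20 single-residue + 400 dipeptide features
```

Concatenating the two vectors is one simple way to combine feature sets; the abstract states only that combinations of feature vectors were explored, not how they were assembled.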
The final model was tested against an independent set not previously seen by the model and showed better predictive performance than SecretomeP V2.0 and SecretP V2.0 for the identification of non-classically secreted proteins. NClassG+ is freely available on the web at http://www.biolisi.unal.edu.co/web-servers/nclassgpositive/.
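The nested cross-validation with grid search described in the abstract can be sketched as follows. This is a structural illustration only, under loud assumptions: a toy nearest-centroid classifier stands in for the SVM, the tuned hyperparameter is the order `p` of a Minkowski-style distance rather than a kernel parameter, the folds are not stratified, and the data are synthetic. All function names are hypothetical.

```python
import random


def k_folds(indices, k):
    """Split a list of indices into k roughly equal folds (round-robin)."""
    return [indices[i::k] for i in range(k)]


def train_centroids(X, y, idx):
    """Toy stand-in for the SVM: one mean vector per class."""
    sums, counts = {}, {}
    for i in idx:
        acc = sums.setdefault(y[i], [0.0] * len(X[i]))
        for j, value in enumerate(X[i]):
            acc[j] += value
        counts[y[i]] = counts.get(y[i], 0) + 1
    return {label: [v / counts[label] for v in s] for label, s in sums.items()}


def predict(model, x, p):
    """Assign x to the nearest centroid; p is the hyperparameter being tuned."""
    def dist(centroid):
        return sum(abs(a - b) ** p for a, b in zip(x, centroid))
    return min(model, key=lambda label: dist(model[label]))


def error_rate(X, y, train_idx, test_idx, p):
    """Train on train_idx and return the misclassification rate on test_idx."""
    model = train_centroids(X, y, train_idx)
    wrong = sum(predict(model, X[i], p) != y[i] for i in test_idx)
    return wrong / len(test_idx)


def nested_cv(X, y, grid, outer_k=5, inner_k=3, seed=0):
    """Inner loop picks the best grid value; outer loop estimates the error."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    outer = k_folds(indices, outer_k)
    outer_errors = []
    for o, test_idx in enumerate(outer):
        train_idx = [i for f, fold in enumerate(outer) if f != o for i in fold]
        inner = k_folds(train_idx, inner_k)

        def inner_error(p):
            # Mean validation error of grid value p over the inner folds.
            errs = []
            for v, val_idx in enumerate(inner):
                fit_idx = [i for f, fold in enumerate(inner) if f != v for i in fold]
                errs.append(error_rate(X, y, fit_idx, val_idx, p))
            return sum(errs) / len(errs)

        best_p = min(grid, key=inner_error)  # grid search on inner folds only
        # The outer test fold is never used for tuning, only for scoring.
        outer_errors.append(error_rate(X, y, train_idx, test_idx, best_p))
    return sum(outer_errors) / len(outer_errors)


# Synthetic, well-separated two-class data (not from the study).
X = [[0.01 * i, 0.0] for i in range(15)] + [[5.0 + 0.01 * i, 5.0] for i in range(15)]
y = [0] * 15 + [1] * 15
print(nested_cv(X, y, grid=[1, 2]))
```

The point of the design is that hyperparameters are chosen using the inner folds alone, so the error reported by the outer loop is not biased by the tuning step. In practice the stand-in classifier would be replaced by an SVM and the grid by kernel types and their parameters.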

  • Article: Locating proteins in the cell using TargetP, SignalP and related tools.
    ABSTRACT: Determining the subcellular localization of a protein is an important first step toward understanding its function. Here, we describe the properties of three well-known N-terminal sequence motifs directing proteins to the secretory pathway, mitochondria and chloroplasts, and sketch a brief history of methods to predict subcellular localization based on these sorting signals and other sequence properties. We then outline how to use a number of internet-accessible tools to arrive at a reliable subcellular localization prediction for eukaryotic and prokaryotic proteins. In particular, we provide detailed step-by-step instructions for the coupled use of the amino-acid sequence-based predictors TargetP, SignalP, ChloroP and TMHMM, which are all hosted at the Center for Biological Sequence Analysis, Technical University of Denmark. In addition, we describe and provide web references to other useful subcellular localization predictors. Finally, we discuss predictive performance measures in general and the performance of TargetP and SignalP in particular.
    Nature Protocols 02/2007; 2(4):953-71. · 8.36 Impact Factor
  • Article: Computational classification of classically secreted proteins.
    ABSTRACT: The ability to identify classically secreted proteins is an important component of targeted therapeutic studies and the discovery of circulating biomarkers. Here, we review some of the most recent programs available for the in silico prediction of secretory proteins, the performance of which is benchmarked with an independent set of annotated human proteins. The description of these programs and the results of this benchmarking provide insights into the most recently developed prediction programs, which will enable investigators to make more informed decisions about which program best addresses their research needs.
    Drug Discovery Today 04/2007; 12(5-6):234-40. · 6.83 Impact Factor
  • Article: Using neural networks for prediction of the subcellular location of proteins.
    ABSTRACT: Neural networks have been trained to predict the subcellular location of proteins in prokaryotic or eukaryotic cells from their amino acid composition. For three possible subcellular locations in prokaryotic organisms a prediction accuracy of 81% can be achieved. Assigning a reliability index, 33% of the predictions can be made with an accuracy of 91%. For eukaryotic proteins (excluding plant sequences) an overall prediction accuracy of 66% for four locations was achieved, with 33% of the sequences being predicted with an accuracy of 82% or better. With the subcellular location restricting a protein's possible function, this method should be a useful tool for the systematic analysis of genome data and is available via a server on the world wide web.
    Nucleic Acids Research 06/1998; 26(9):2230-6. · 8.03 Impact Factor


Keywords

classically secreted proteins
denoted
different sequence transformation vectors
feature-based classifiers
Gaussian kernel functions
Gram-positive bacteria
grid search
Kernel functions
Nested k-fold cross-validation
non-classically secreted Gram-positive bacterial proteins
non-classically secreted proteins
physicochemical factors
possible feature vectors
predictive methods
predictive performance
protein secretion mechanisms
PSSM
sequence-based classifier
Support Vector Machines
SVMs

Daniel Restrepo-Montoya