Computational prediction of human proteins that can be secreted into the bloodstream.

Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA 30602, USA.
Bioinformatics (Impact Factor: 4.62). 09/2008; 24(20):2370-5. DOI: 10.1093/bioinformatics/btn418
Source: PubMed

ABSTRACT We present a novel computational method for predicting which proteins from highly and abnormally expressed genes in diseased human tissues, such as cancers, can be secreted into the bloodstream, suggesting possible marker proteins for follow-up serum proteomic studies. A major challenge in tackling this problem is that our understanding of the downstream localization of proteins after they are secreted from cells is very limited and insufficient to provide useful hints about secretion into the bloodstream. To bypass this difficulty, we have taken a data mining approach: we first collected, through extensive literature searches, human proteins that previous proteomic studies detected as secreted into the bloodstream under various pathological conditions, and then asked: 'What do these secreted proteins have in common, in terms of their physical and chemical properties, amino acid sequence and structural features, that can be used to predict them?' We have identified a list of features relevant to protein secretion, such as signal peptides, transmembrane domains, glycosylation sites, disordered regions, secondary structural content, and hydrophobicity and polarity measures. Using these features, we have trained a support vector machine-based classifier to predict protein secretion into the bloodstream. On a large test set containing 98 secretory and 6601 non-secretory human proteins, our classifier achieved approximately 90% prediction sensitivity and approximately 98% prediction specificity. Several additional datasets were used to further assess the performance of our classifier. On a set of 122 proteins found to be of abnormally high abundance in human blood due to various cancers, our program predicted 62 as blood-secreted proteins. Applying our program to abnormally highly expressed genes in gastric cancer and lung cancer tissues, detected through microarray gene expression studies, we predicted 13 and 31 proteins, respectively, as blood-secreted, suggesting that they could serve as potential biomarkers for these two cancers. Our study demonstrates that our method can provide highly useful information to link genomic and proteomic studies for disease biomarker discovery. Our software can be accessed at http://csbl1.bmb.uga.edu/cgi-bin/Secretion/secretion.cgi.
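To make the reported test-set figures concrete, the sketch below converts the approximate sensitivity and specificity into confusion-matrix counts for a set of 98 secretory and 6601 non-secretory proteins. The abstract gives only approximate rates, so the counts are rounded estimates rather than reported results, and the helper names (`confusion_counts`, `precision`) are hypothetical. Under these assumptions, precision comes out much lower than specificity because negatives outnumber positives by roughly 67 to 1.

```python
def confusion_counts(n_pos, n_neg, sensitivity, specificity):
    """Estimate confusion-matrix counts from class sizes and rates.

    Counts are rounded to whole proteins, so they are approximations.
    """
    tp = round(n_pos * sensitivity)   # secretory proteins predicted secretory
    fn = n_pos - tp                   # secretory proteins missed
    tn = round(n_neg * specificity)   # non-secretory proteins correctly rejected
    fp = n_neg - tn                   # non-secretory proteins predicted secretory
    return tp, fn, tn, fp


def precision(tp, fp):
    """Fraction of positive predictions that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Class sizes and approximate rates taken from the abstract.
tp, fn, tn, fp = confusion_counts(98, 6601, sensitivity=0.90, specificity=0.98)
ppv = precision(tp, fp)

print(f"TP={tp} FN={fn} TN={tn} FP={fp} precision={ppv:.2f}")
```

With these inputs, about 88 true positives sit alongside about 132 false positives, which is one reason a candidate list produced this way would still call for follow-up experimental validation.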

  • ABSTRACT: Mineralising organisms are thought to have been evolving for some 542 million years to precisely control both inorganic compound formation and inhibition in aqueous conditions. While the exact mechanisms continue to be elusive, there is growing evidence that control of the amorphous state is critical to these processes. To address this issue, the ability of three abundant serum proteins to stabilise the amorphous state of several important biogenic calcium minerals is examined. After gaining an insight into the relative stabilising strength of these proteins, their effect on the crystallization of gold was explored, demonstrating a potential use of this approach for the synthesis of functional nano-sized materials.
    Advanced Functional Materials 05/2011; 21(15):2968 - 2977. · 10.44 Impact Factor
  • Ovarian Cancer - Basic Science Perspective, 02/2012; ISBN: 978-953-307-812-0
  • ABSTRACT: Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration, and some of them are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physicochemical and sequence features that we found to discriminate between human proteins known to be movable from circulation to saliva and proteins deemed not to be in saliva. A classifier was trained on these features using a support vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data (i.e., proteins predicted not to be in saliva) may not be highly reliable, we have also trained a ranking method based on the same features, aiming to rank the known salivary proteins from circulation highest among the proteins in the general background. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with information on differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer.
    PLoS ONE 11/2013; 8(11):e80211. · 3.53 Impact Factor
