ABSTRACT: Background:
Argonaute 2 (AGO2), a central component of the RNA-induced silencing complex, plays critical roles in cancer. We examined whether single nucleotide polymorphisms (SNPs) of AGO2 were related to the risk of nasopharyngeal carcinoma (NPC).
Twenty-five tag SNPs within AGO2 were genotyped in a Guangxi population consisting of 855 NPC patients and 1036 controls. The SNPs significantly associated with NPC were then replicated in a Guangdong population consisting of 996 NPC patients and 972 controls. Functional experiments were conducted to examine the biological roles of AGO2 in NPC.
A significantly increased risk of advanced lymph node metastasis of NPC was identified for the AGO2 rs3928672 GA + AA genotype compared with the GG genotype in both the Guangxi and Guangdong populations (combined odds ratio = 2.08, 95% confidence interval = 1.44-3.01, P = 8.60 × 10⁻⁵). Moreover, AGO2 protein expression levels in NPC tissues were higher in rs3928672 GA + AA genotype carriers than in GG genotype carriers (P = 0.041), and AGO2 was significantly over-expressed in NPC tissues compared with non-cancerous nasopharyngeal tissues (P = 0.011). In addition, AGO2 knockdown reduced cell proliferation, induced apoptosis, and inhibited migration of NPC cells. Furthermore, gene expression microarray analysis showed that genes altered following AGO2 knockdown were clustered in pathways relevant to tumorigenesis and metastasis.
Our findings suggest that the genetic polymorphism in AGO2 may be a risk factor for the advanced lymph node metastasis of NPC in Chinese populations, and AGO2 acts as an oncogene in the development of NPC.
BMC Cancer 11/2015; 15(1):862. DOI:10.1186/s12885-015-1895-4 · 3.36 Impact Factor
ABSTRACT: Although the "missing protein" is a temporary concept in C-HPP, the biological reasons behind their "missing" status could be an important clue in evolutionary studies. Here we classified missing-protein-encoding genes into two groups: the genes encoding PE2 proteins (with transcript evidence) and the genes encoding PE3/4 proteins (with no transcript evidence). These missing-protein-encoding genes are distributed unevenly among different chromosomes, chromosomal regions, and gene clusters. In terms of evolutionary features, PE3/4 genes tend to be young, located in nonhomologous chromosomal regions, and evolving at higher rates. Interestingly, the proportion of singletons in PE3/4 genes is higher than the proportion of singletons in all genes (background) and in OTCSGs (organ, tissue, cell type-specific genes). More importantly, most of the paralogous PE3/4 genes belong to the newly duplicated members of the paralogous gene groups, which mainly contribute to special biological functions, such as "smell perception". These functions are largely restricted to specific cell types, tissues, or developmental stages, acting as the new functional requirements that facilitated the emergence of the missing-protein-encoding genes during evolution. In addition, criteria for proteins with extremely special physical-chemical properties were first set up based on the properties of PE2 proteins, and the evolutionary characteristics of those proteins were explored. Overall, the evolutionary analyses of missing-protein-encoding genes are expected to be highly instructive for proteomics and functional studies in the future.
Journal of Proteome Research 11/2015; DOI:10.1021/acs.jproteome.5b00450 · 4.25 Impact Factor
ABSTRACT: Investigations of missing proteins (MPs) are being supported by many bioanalytical strategies. We proposed that proteogenomics of testis tissue was a feasible approach to identify more MPs because testis tissues have higher gene expression levels. Here, we combined proteomics and transcriptomics to survey gene expression in human testis tissues from three post-mortem individuals. Proteins were extracted and separated with glycine- and tricine-SDS-PAGE. A total of 9,597 protein groups were identified; of these, 166 protein groups were listed as MPs, including 138 groups (83.1%) with transcriptional evidence. A total of 2,948 proteins are designated as MPs, and 5.6% of these were identified in this study. The high incidence of MPs in testis tissue indicates that this is a rich resource for MPs. Functional category analysis revealed that the testis MPs are mainly involved in sexual reproduction and spermatogenesis. Some of the MPs are potentially involved in tumorigenesis in other tissues. Therefore, this proteogenomics analysis of individual testis tissues provides convincing evidence for the discovery of MPs. All mass spectrometry data from this study have been deposited in the ProteomeXchange (dataset identifier PXD002179, username: email@example.com, password: NFEv8D8P).
Journal of Proteome Research 08/2015; 14(9). DOI:10.1021/acs.jproteome.5b00435 · 4.25 Impact Factor
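The two percentages quoted in the testis proteogenomics abstract above follow directly from the counts it reports. A minimal arithmetic check, using only the numbers stated in the abstract (the variable names are illustrative):

```python
# Counts quoted in the abstract above.
identified_mp_groups = 166      # protein groups listed as missing proteins (MPs)
with_transcript_evidence = 138  # of those, groups with transcriptional evidence
total_designated_mps = 2948     # proteins designated as MPs overall

# Share of the identified MP groups that have transcript-level support.
share_with_transcripts = 100 * with_transcript_evidence / identified_mp_groups

# Share of all designated MPs that this study identified.
share_of_all_mps = 100 * identified_mp_groups / total_designated_mps

print(f"{share_with_transcripts:.1f}% of identified MPs have transcript evidence")
print(f"{share_of_all_mps:.1f}% of designated MPs were identified")
```

Both values round to the figures given in the abstract (83.1% and 5.6%).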
ABSTRACT: PTEN is one of the most frequently mutated tumour suppressors and reduction in PTEN protein stability also plays a role in tumorigenesis. Although several ubiquitin ligases for PTEN have been identified, the deubiquitylase for de-polyubiquitylation and stabilization of PTEN is less defined. Here, we report OTUD3 as a deubiquitylase of PTEN. OTUD3 interacts with, de-polyubiquitylates and stabilizes PTEN. Depletion of OTUD3 leads to the activation of Akt signalling, induction of cellular transformation and cancer metastasis. OTUD3 transgenic mice exhibit higher levels of the PTEN protein and are less prone to tumorigenesis. Reduction of OTUD3 expression, concomitant with decreased PTEN abundance, correlates with human breast cancer progression. Furthermore, we identified loss-of-function OTUD3 mutations in human cancers, which either abolish OTUD3 catalytic activity or attenuate the interaction with PTEN. These findings demonstrate that OTUD3 is an essential regulator of PTEN and that the OTUD3-PTEN signalling axis plays a critical role in tumour suppression.
ABSTRACT: As part of the Chromosome-Centric Human Proteome Project (C-HPP) mission, laboratories all over the world have tried to map all missing proteins (MPs) since 2012. Based on the first and second Chinese Chromosome Proteome Database (CCPD 1.0 and 2.0) studies, we developed systematic enrichment strategies to identify MPs that fell into four classes: (1) low molecular weight (LMW) proteins, (2) membrane proteins, (3) proteins that contained various post-translational modifications (PTMs), and (4) nucleic acid-associated proteins. Of 8845 proteins identified in 7 datasets, 79 proteins were classified as MPs. Among datasets derived from different enrichment strategies, the datasets for LMW and PTM yielded the most novel MPs. In addition, we found that some MPs were identified in multiple datasets, which implied that tandem enrichment methods might improve the ability to identify MPs. Moreover, low expression at the transcription level was the major reason these MPs were "missing"; however, MPs with higher expression levels also evaded identification, most likely due to other characteristics such as LMW, high hydrophobicity and PTM. By combining a stringent manual check of the MS2 spectra with peptide synthesis verification, we confirmed 30 MPs (neXtProt PE2~PE4) and 6 potential MPs (neXtProt PE5) with authentic MS evidence. By integrating our large-scale datasets of CCPD 2.0, the number of identified proteins has increased considerably beyond simulation saturation. Here, we show that special enrichment strategies can break through the data saturation bottleneck, which could increase the efficiency of MP identification in future C-HPP studies. All 7 datasets have been uploaded to ProteomeXchange with the identifier PXD002255.
Journal of Proteome Research 07/2015; 14(9). DOI:10.1021/acs.jproteome.5b00481 · 4.25 Impact Factor
ABSTRACT: This paper summarizes the recent activities of the Chromosome-Centric Human Proteome Project (C-HPP) consortium, which develops new technologies to identify yet-to-be annotated proteins (termed "missing proteins") in biological samples that lack sufficient experimental evidence at the protein level for confident protein identification. The C-HPP also aims to identify new protein forms that may be caused by genetic variability, post-translational modifications, and alternative splicing. Proteogenomic data integration forms the basis of the C-HPP's activities; therefore, we have summarized some of the key approaches and their roles in the project. We present new analytical technologies that improve the chemical space and lower detection limits, coupled with bioinformatics tools and some publicly available resources that can be used to improve data analysis or support the development of analytical assays. Most of this paper's contents have been compiled from posters, slides, and discussions presented in the series of C-HPP workshops held during 2014. All data (posters, presentations) used are available at the C-HPP Wiki (http://c-hpp.webhosting.rug.nl/) and in the supporting information.
Journal of Proteome Research 06/2015; 14(9). DOI:10.1021/pr5013009 · 4.25 Impact Factor
ABSTRACT: Multi-drug resistance is the main cause of treatment failure in cancer patients. How to identify molecules underlying drug resistance from multi-omics data remains a great challenge. Here, we introduce a data biased strategy, ProteinRank, to prioritize drug-resistance associated proteins in cancer cells. First, we identified differentially expressed proteins in Adriamycin and Vincristine resistant gastric cancer cells compared to their parental cells using iTRAQ combined with LC-MS/MS experiments, and then mapped them to the human protein-protein interaction network; second, we applied ProteinRank to analyze the whole network and rank proteins similar to known drug resistance related proteins. Cross validations demonstrated a better performance of ProteinRank compared to the method that does not use MS data. Further validations confirmed the altered expressions or activities of several top ranked proteins. Functional studies showed that PIM3 or CAV1 silencing was sufficient to reverse the drug resistance phenotype. These results indicate that ProteinRank can prioritize key proteins related to drug resistance in gastric cancer and provide important clues for cancer research.
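The abstract above does not spell out how ProteinRank scores the network, so the following is only a generic illustration of network-based prioritization, not the authors' algorithm. It sketches a random walk with restart, a common technique for ranking nodes by proximity to seed nodes. The protein labels, edges, seed weights, and the `restart` parameter are all hypothetical.

```python
def random_walk_with_restart(edges, seed_weights, restart=0.3, iterations=100):
    """Rank nodes of an undirected network by proximity to weighted seed nodes."""
    # Build an undirected adjacency list from the edge pairs.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)

    # Normalize the seed weights into a restart distribution that sums to 1.
    total = sum(seed_weights.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed in the network needs a positive weight")
    restart_vec = {n: seed_weights.get(n, 0.0) / total for n in nodes}

    # Iterate: each node passes its score evenly to its neighbors, and a
    # fraction `restart` of the total score jumps back to the seeds.
    scores = dict(restart_vec)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            incoming = sum(scores[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = (1 - restart) * incoming + restart * restart_vec[n]
        scores = updated
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy network; P1 stands in for a known resistance-related protein.
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P3"), ("P4", "P5")]
ranking = random_walk_with_restart(toy_edges, seed_weights={"P1": 1.0})
for protein, score in ranking:
    print(protein, round(score, 3))
```

In a setting like the one the abstract describes, the seed weights could in principle also reflect experimental evidence such as differential expression, but how ProteinRank actually uses the MS data is not stated here.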
ABSTRACT: SUMOylation has emerged as a new regulation mechanism for proteins involved in multiple physiological and pathological processes. However, the detailed function of SUMOylation in liver cancer is still elusive. Our study revealed that the SUMOylation-activating enzyme UBA2 was highly expressed in liver cancer cells and clinical samples. Silencing of UBA2 expression could suppress cell proliferation to some extent. To elucidate the function of UBA2, we used a large-scale proteomics strategy to identify SUMOylation targets in HepG2 cells. We characterized 827 potential SUMO1-modified proteins that were not present in the control samples. These proteins were enriched in gene expression processes. Twelve candidates were validated as SUMO1-modified proteins by immunoprecipitation-Western blotting. We further characterized an identified SUMOylated protein, TFII-I, and determined that TFII-I was modified by SUMO1 at K221 and K240. PIAS4 was an E3 ligase for TFII-I SUMOylation, and SENP2 was responsible for deSUMOylating TFII-I in HepG2 cells. SUMOylation reduced TFII-I binding to its repressor HDAC3 and thus promoted its transactivation. We further showed that SUMOylation was critical for TFII-I to promote cell proliferation and colony formation. Our findings contribute to understanding the role of SUMOylation in liver cancer development.
Journal of Proteome Research 04/2015; 14(6). DOI:10.1021/acs.jproteome.5b00062 · 4.25 Impact Factor
ABSTRACT: The Chromosome-centric Human Proteome Project (C-HPP) aims to catalog the genome-encoded proteins by the chromosome-by-chromosome strategy. As the C-HPP proceeds, the increasing requirements of data-intensive analysis for the MS/MS data pose a challenge to the proteomic community, especially to small laboratories lacking computational infrastructure. To address this challenge, we have updated the previous CAPER browser to a new version, CAPER 3.0, a scalable cloud-based system for data-intensive analysis of C-HPP datasets. CAPER 3.0 uses cloud computing technology to facilitate MS/MS-based peptide identification. In particular, it can use both public and private clouds, facilitating the analysis of C-HPP datasets. CAPER 3.0 provides a graphical user interface (GUI) to help users transfer data, configure jobs, track progress, and visualize the results comprehensively. These features enable users without programming expertise to easily conduct data-intensive analysis using CAPER 3.0. Here, we illustrate the usage of CAPER 3.0 with four specific mass spectral data-intensive problems: detecting novel peptides, identifying single amino acid variants (SAVs) derived from known missense mutations, identifying sample-specific SAVs and identifying exon-skipping events. CAPER 3.0 is available at http://prodigy.bprc.ac.cn/caper3.
Journal of Proteome Research 03/2015; 14(9). DOI:10.1021/pr501335w · 4.25 Impact Factor
ABSTRACT: Currently, major concerns about the safety and efficacy of RNA interference (RNAi)-based bone anabolic strategies still exist because of the lack of direct osteoblast-specific delivery systems for osteogenic siRNAs. Here we screened the aptamer CH6 by cell-SELEX, specifically targeting both rat and human osteoblasts, and then we developed CH6 aptamer-functionalized lipid nanoparticles (LNPs) encapsulating osteogenic pleckstrin homology domain-containing family O member 1 (Plekho1) siRNA (CH6-LNPs-siRNA). Our results showed that CH6 facilitated in vitro osteoblast-selective uptake of Plekho1 siRNA, mainly via macropinocytosis, and boosted in vivo osteoblast-specific Plekho1 gene silencing, which promoted bone formation, improved bone microarchitecture, increased bone mass and enhanced mechanical properties in both osteopenic and healthy rodents. These results indicate that osteoblast-specific aptamer-functionalized LNPs could act as a new RNAi-based bone anabolic strategy, advancing the targeted delivery selectivity of osteogenic siRNAs from the tissue level to the cellular level.
Nature Medicine 02/2015; 21(3). DOI:10.1038/nm.3791 · 27.36 Impact Factor
ABSTRACT: Inhaled xenobiotics such as the tobacco-specific carcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone are mainly metabolized by phase I oxidase cytochrome P450, family 2, subfamily A, polypeptide 13 (CYP2A13), phase II conjugate UDP glucuronosyltransferase 2 family, polypeptide B17 (UGT2B17), and phase III transporter ATP-binding cassette, subfamily B (MDR/TAP), member 1 (ABCB1), with genetic polymorphisms implicated in lung cancer. Their genetic interaction and pulmonary expression regulation are largely unknown. We analyzed joint association for CYP2A13 and ABCB1 polymorphisms in 2 independent lung cancer case populations (669 and 566 patients) and 1 common control population (749 subjects), and characterized the transacting function of the lung development-related transcription factor forkhead box A2 (FOXA2). We undertook FOXA2 overexpression and down-regulation in lung epithelial cell lines, analyzed the functional impact on the transactivation of CYP2A13, UGT2B17, and ABCB1, and measured the correlation of their expression in lung tissues. We found a substantial reduction in cancer risk (OR 0.39; 95% CI 0.25-0.61; P for interaction = 0.029) associated with combined genotypes for CYP2A13 R257C and a functionary regulatory variant in the cis element of ABCB1 synergistically targeted by GATA binding protein 6 and FOXA2. Genetic manipulation of FOXA2 consistently influenced its binding to and transactivation of the promoters of CYP2A13, UGT2B17, and ABCB1, whose mRNA and protein expressions were all consistently correlated with those of FOXA2 in both tumorous and normal lung tissues. We therefore establish FOXA2 as a core transcriptional modulator for pulmonary xenobiotic metabolic pathways and uncover an etiologically relevant interaction between CYP2A13 and ABCB1, furthering our understanding of the expression and function of the xenobiotic metabolism system.
The FASEB Journal 02/2015; 29(5). DOI:10.1096/fj.14-264580 · 5.04 Impact Factor
[Show abstract][Hide abstract] ABSTRACT: Previously isolated pathways screened from individual genes were investigated at either the transcriptional or translational level; however, the consistency between the pathways screened at the gene expression levels was obscure in metastatic human hepatocellular carcinoma (HCC). To elucidate this question, we performed a transcriptomic (16353 genes) and proteomic (7861 proteins) analysis simultaneously on six metastatic HCC cell lines against two non-metastatic HCC cell lines, with all HBV traceable and close genetic-backgrounds for a comparative study. The quantitative and integrated results showed that significant genes were screened differentially with 351 transcripts from the transcriptome and 304 proteins from the proteome, with limited overlapping genes (7%). However, we discovered that these discrete 351 transcripts and 304 proteins screened share extrusive significant-pathways/networks with a 77% overlap, including active TGF-β, RAS, NFκB, and Wnt, and inactive HNF4A, which are responsible for HCC metastasis. We conclude that the discrete, but significant genes predicted by either ome play intrinsically important roles in the linkage of responsible pathways shared by both omes in HCC metastasis. This article is protected by copyright. All rights reserved.
ABSTRACT: Motivation: The Anatomical Therapeutic Chemical (ATC) classification system, applied in almost all drug utilization studies, is currently the most widely recognized classification system for drugs. New drug entries are currently added to the system only on users' requests, which leaves its drug coverage seriously incomplete; bioinformatics prediction can help in this process.
Results: Here we propose a novel prediction model of drug-ATC code associations, using logistic regression to integrate multiple heterogeneous data sources including chemical structures, target proteins, gene expression, side-effects and chemical-chemical associations. The model obtains good performance for the prediction not only on ATC codes of unclassified drugs but also on new ATC codes of classified drugs assessed by cross-validation and independent test sets, and its efficacy exceeds previous methods. Further to facilitate the use, the model is developed into a user-friendly web service SPACE (Similarity-based Predictor of ATC CodE), which for each submitted compound, will give candidate ATC codes (ranked according to the decreasing probability_score predicted by the model) together with corresponding supporting evidence. This work not only contributes to knowing drugs’ therapeutic, pharmacological and chemical properties, but also provides clues for drug repositioning and side-effect discovery. In addition, the construction of the prediction model also provides a general framework for similarity-based data integration which is suitable for other drug-related studies such as target and side-effect prediction.
Availability: The web service SPACE is available at
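The SPACE abstract describes combining several similarity-based evidence sources with logistic regression and ranking candidate ATC codes by the predicted probability. The following is a minimal sketch of that general idea only. The feature names mirror the data sources listed in the abstract, but the weights, bias, candidate labels, and similarity values are invented for illustration and are not taken from the published model.

```python
import math

# One similarity feature per data source named in the abstract.
FEATURES = (
    "chemical_structure",
    "target_protein",
    "gene_expression",
    "side_effect",
    "chemical_association",
)


def probability_score(similarities, weights, bias):
    """Logistic model: sigmoid of a weighted sum of per-source similarities."""
    z = bias + sum(weights[name] * similarities[name] for name in FEATURES)
    return 1.0 / (1.0 + math.exp(-z))


def rank_candidates(candidates, weights, bias):
    """Return (code, probability) pairs sorted by decreasing probability."""
    scored = [
        (code, probability_score(sims, weights, bias))
        for code, sims in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical weights and bias; a real model would fit these on known drug-ATC pairs.
weights = {name: 1.5 for name in FEATURES}
bias = -4.0

# Hypothetical similarity of one query compound to two candidate ATC codes.
candidates = {
    "CODE_A": dict.fromkeys(FEATURES, 0.8),
    "CODE_B": dict.fromkeys(FEATURES, 0.2),
}

ranked = rank_candidates(candidates, weights, bias)
for code, prob in ranked:
    print(code, round(prob, 3))
```

Keeping one weight per data source makes it easy to see which evidence type drives a given prediction, which fits the abstract's point about returning supporting evidence alongside each candidate code.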
ABSTRACT: Integration of pathway and protein-protein interaction (PPI) data can provide more information that could lead to new biological insights. PPIs are usually represented by a simple binary model, whereas pathways are represented by more complicated models. We developed a series of rules for transforming protein interactions from the pathway model to the binary model, and the protein interactions from seven pathway databases, including PID, BioCarta, Reactome, NetPath, INOH, SPIKE and KEGG, were transformed based on these rules. These pathway-derived binary protein interactions were integrated with PPIs from five other PPI databases, including HPRD, IntAct, BioGRID, MINT and DIP, to develop an integrated dataset (named PathPPI). More detailed interaction type and modification information on protein interactions can be preserved in PathPPI than in other existing datasets. Comparison analysis results indicate that most of the interaction overlap values (O_AB) among these pathway databases were less than 5%, and these databases must be used conjunctively. The PathPPI data are provided at http://proteomeview.hupo.org.cn/PathPPI/PathPPI.html.
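The PathPPI abstract gives neither its transformation rules nor the exact definition of the overlap value, so the sketch below rests on two stated assumptions: a complex is expanded into all pairwise member interactions (one common convention, not necessarily the authors' rule), and overlap between two sets is measured as shared interactions divided by the union. All protein labels are placeholders.

```python
from itertools import combinations


def complex_to_binary(members):
    """Assumed rule: expand a complex into every unordered pair of its members."""
    return {frozenset(pair) for pair in combinations(sorted(set(members)), 2)}


def overlap(interactions_a, interactions_b):
    """Assumed overlap measure: shared interactions divided by the union."""
    union = interactions_a | interactions_b
    return len(interactions_a & interactions_b) / len(union) if union else 0.0


# Hypothetical pathway database A: a single three-member complex.
db_a = complex_to_binary(["A", "B", "C"])

# Hypothetical database B: two binary interactions, one of them shared with A.
db_b = {frozenset(("A", "B")), frozenset(("C", "D"))}

overlap_ab = overlap(db_a, db_b)
print(f"{len(db_a)} pairs derived from the complex; overlap = {overlap_ab:.2f}")
```

Storing each interaction as a `frozenset` makes the pair order-independent, so A-B and B-A count as the same interaction when sets from different databases are compared.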