Breast cancer subtyping from plasma proteins

BMC Medical Genomics (Impact Factor: 2.87). 01/2013; 6(1). DOI: 10.1186/1755-8794-6-S1-S6


Early detection of breast cancer in blood is both appealing clinically and challenging technically due to the disease's elusive nature and heterogeneity. Today, even though the major breast cancer subtypes have been characterized, i.e., luminal A, luminal B, HER2+, and basal-like, little is known about the heterogeneity of breast cancer in blood. Such knowledge could help to discover minimally invasive protein biomarkers with which clinical researchers can detect, classify, and monitor different breast cancer subtypes.

In this study, we performed an integrative pathway-assisted clustering analysis of breast cancer subtypes from plasma proteome samples collected from 80 patients diagnosed with breast cancer and 80 healthy women. First, four breast cancer subtypes and an additional unknown subtype (according to existing annotation) were determined based on pathology lab test results in primary tumors of enrolled patients. Next, we developed and applied four distance metrics, i.e., Protein Intensity, Q-Value, Pathway Profile, and Distance Score Function, to measure and characterize these cancer subtypes. Then, we developed a permutation test to evaluate the significance of protein level changes in each biological pathway for each breast cancer subtype, using q-values. Lastly, we developed a pathway-protein matrix for each of the four distance methods to estimate the distance between breast cancer subtypes, on which further Pathway Association Network analysis was performed.
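The permutation step described above can be illustrated with a minimal sketch. The abstract does not specify the test statistic or how q-values were computed, so everything here is an assumption made for illustration: the statistic (mean absolute difference in per-protein mean intensity), the Benjamini-Hochberg adjustment used as a stand-in for the q-value, and all function names.

```python
import random
from statistics import mean


def pathway_statistic(cases, controls):
    """Mean absolute case-vs-control difference over a pathway's proteins.

    Each sample is a list of intensities, one per protein in the pathway.
    This statistic is an illustrative assumption, not the authors' choice.
    """
    n_proteins = len(cases[0])
    diffs = []
    for j in range(n_proteins):
        case_mean = mean(sample[j] for sample in cases)
        control_mean = mean(sample[j] for sample in controls)
        diffs.append(abs(case_mean - control_mean))
    return mean(diffs)


def permutation_p_value(cases, controls, n_permutations=1000, seed=0):
    """Empirical p-value obtained by shuffling the case/control labels."""
    rng = random.Random(seed)
    observed = pathway_statistic(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        permuted = pathway_statistic(pooled[:n_cases], pooled[n_cases:])
        if permuted >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed stand-in for q-values)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

In such a setup, one p-value would be computed per pathway for a given subtype, and the adjustment would then be applied across all pathways tested for that subtype.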

We found that 1) the luminal subtypes (luminal A and luminal B) cluster together, as do the basal-group subtypes (basal-like and HER2+), and 2) luminal A and luminal B are closer to each other than basal-like and HER2+ are to each other. Our results were consistent with a recent independent breast cancer study from the Cancer Genome Atlas Network using genomic DNA copy number arrays, DNA methylation, exome sequencing, messenger RNA arrays, microRNA sequencing and reverse-phase protein arrays. Our results showed that changes of different breast cancer subtypes at the pathway level are more profound and less variable than those at the molecular level. Similar subtypes share distinct yet similar pathway activation networks, while dissimilar subtypes also differ at the level of pathway activation networks. The results also showed that the distance or similarity of cancer subtypes based on pathway analysis might provide further insight into the intrinsic relationship of breast cancer subtypes. We believe the integrative pathway-assisted proteomics analysis described here can become a model for reliable clustering or classification of other cancer subtypes.
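To make the notion of a pathway-based distance between subtypes concrete, here is a small sketch. It is not the paper's Pathway Profile or Distance Score Function, whose definitions are not given in the abstract. It assumes each group is summarized by one q-value per pathway, transforms these to -log10 scale, and compares groups by Euclidean distance. The group names and q-values are synthetic placeholders, not study data.

```python
import math
from itertools import combinations


def pathway_profile(q_values, floor=1e-10):
    """Map per-pathway q-values to -log10 scale (floor avoids log of zero)."""
    return [-math.log10(max(q, floor)) for q in q_values]


def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(profiles):
    """Distance for every unordered pair of groups, keyed by sorted names."""
    return {
        (a, b): euclidean(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def closest_pair(distances):
    """Pair of groups with the smallest distance."""
    return min(distances, key=distances.get)


# Synthetic q-values: three placeholder groups over four placeholder pathways.
toy_q = {
    "group_1": [0.001, 0.01, 0.5, 0.8],
    "group_2": [0.002, 0.02, 0.4, 0.9],
    "group_3": [0.7, 0.6, 0.001, 0.01],
}
profiles = {name: pathway_profile(q) for name, q in toy_q.items()}
distances = pairwise_distances(profiles)
print(closest_pair(distances))
```

A distance table of this kind is the usual input to hierarchical clustering, which is one way groupings such as those reported above could be derived.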

Available from: Jake Chen
  • Source
    • "... significantly enhance their ability to annotate genes from Omics results (Ganter and Giroux, 2008), interpret heterogeneous genetic study results (Hale et al., 2012), identify disease subtypes and progression (Hung, 2013; Zhang and Chen, 2013), select or prioritize drug targets (Sivachenko and Yuryev, 2007) and understand biological mechanisms (Chen et al., 2006; Murohashi et al., 2010). Heterogeneous bioinformatics tools have been developed to perform different aspects of GNPA (Huang da et al., 2009). "
    ABSTRACT: In this article, we described a new database framework to perform integrative "gene-set, network, and pathway analysis" (GNPA). In this framework, we integrated heterogeneous data on pathways, annotated lists, and gene-sets (PAGs) into a PAG electronic repository (PAGER). PAGs in the PAGER database are organized into P-type, A-type and G-type PAGs with a three-letter-code standard naming convention. The PAGER database currently compiles 44 313 genes from 5 species including human, 38 663 PAGs, 324 830 gene-gene relationships and 3 174 323 PAG-PAG regulatory relationships of two types: co-membership based and regulatory relationship based. To help users assess each PAG's biological relevance, we developed a cohesion measure called Cohesion Coefficient (CoCo), which is capable of disambiguating between biologically significant PAGs and random PAGs with an area-under-curve performance of 0.98. The PAGER database was set up to help users search and retrieve PAGs from its online web interface. PAGER enables advanced users to build PAG-PAG regulatory networks that provide complementary biological insights not found in gene set analysis or individual gene network analysis. We provide a case study using cancer functional genomics data sets to demonstrate how integrative GNPA helps improve network biology data coverage and therefore biological interpretability. The PAGER database can be accessed openly at jakechen@iupui.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
    Full-text · Article · Jun 2015 · Bioinformatics, i1–i8 · DOI: 10.1093/bioinformatics/btv265
  • Source
    • "Separately, signaling pathway impact analysis (SPIA) combines both functional evidence from classical enrichment analysis and topological evidence represented as a perturbation factor on a given pathway under a given condition (Tarca et al., 2009). Network analysis using partial network modules is also promising, e.g., developing pathway biomarkers from proteomic data (Zhang and Chen, 2010) and breast cancer subtyping from plasma proteins (Zhang and Chen, 2013). In all, these pathway/network analysis tools integrate network topological information at a limited scale, either at the protein interaction network level or at the network module level. "
    ABSTRACT: Proteomics is inherently a systems science that studies not only measured proteins and their expression in a cell, but also the interplay of proteins, protein complexes, signaling pathways, and network modules. Proteomics data have accumulated rapidly in recent years. However, Proteomics data are highly variable, with results being sensitive to data preparation methods, sample conditions, instrument types, and analytical methods. To address this challenge in Proteomics data analysis, we review common approaches developed to incorporate biological function and network topological information. We group existing tools into four categories: tools with basic functional information and few topological features (e.g., GO category analysis), tools with rich functional information and few topological features (e.g., GSEA), tools with basic functional information and rich topological features (e.g., Cytoscape), and tools with rich functional information and rich topological features (e.g., PathwayExpress). We review the general application potential of these tools to Proteomics. In addition, we review tools that can achieve automated learning of pathway modules and features, and tools that help perform integrated network visual analytics.
    Full-text · Article · Jun 2014 · Journal of Theoretical Biology
  • Source
    ABSTRACT: Breast cancer remains a significant scientific, clinical and societal challenge. This gap analysis has reviewed and critically assessed enduring issues and new challenges emerging from recent research, and proposes strategies for translating solutions into practice. More than 100 internationally recognised specialist breast cancer scientists, clinicians and healthcare professionals collaborated to address nine thematic areas: genetics, epigenetics and epidemiology; molecular pathology and cell biology; hormonal influences and endocrine therapy; imaging, detection and screening; current/novel therapies and biomarkers; drug resistance; metastasis, angiogenesis, circulating tumour cells, cancer 'stem' cells; risk and prevention; living with and managing breast cancer and its treatment. The groups developed summary papers through an iterative process which, following further appraisal from experts and patients, were melded into this summary account. The 10 major gaps identified were: (1) understanding the functions and contextual interactions of genetic and epigenetic changes in normal breast development and during malignant transformation; (2) how to implement sustainable lifestyle changes (diet, exercise and weight) and chemopreventive strategies; (3) the need for tailored screening approaches including clinically actionable tests; (4) enhancing knowledge of molecular drivers behind breast cancer subtypes, progression and metastasis; (5) understanding the molecular mechanisms of tumour heterogeneity, dormancy, de novo or acquired resistance and how to target key nodes in these dynamic processes; (6) developing validated markers for chemosensitivity and radiosensitivity; (7) understanding the optimal duration, sequencing and rational combinations of treatment for improved personalised therapy; (8) validating multimodality imaging biomarkers for minimally invasive diagnosis and monitoring of responses in primary and metastatic disease; (9) developing interventions and support to improve the survivorship experience; (10) a continuing need for clinical material for translational research derived from normal breast, blood, primary, relapsed, metastatic and drug-resistant cancers with expert bioinformatics support to maximise its utility. The proposed infrastructural enablers include enhanced resources to support clinically relevant in vitro and in vivo tumour models; improved access to appropriate, fully annotated clinical samples; extended biomarker discovery, validation and standardisation; and facilitated cross-discipline working. With resources to conduct further high-quality targeted research focusing on the gaps identified, increased knowledge translating into improved clinical care should be achievable within five years.
    Full-text · Article · Oct 2013 · Breast cancer research: BCR