ABSTRACT: Background: IntClust is a classification of breast cancer comprising ten subtypes based on molecular drivers identified through the integration of genomic and transcriptomic data from 1,000 breast tumors and validated in a further 1,000. We present a reliable method for subtyping breast tumors into the IntClust subtypes based on gene expression and demonstrate the clinical and biological validity of the IntClust classification.
Results: We developed a gene expression-based approach for classifying breast tumors into the ten IntClust subtypes by using the ensemble profile of the index discovery dataset. We evaluate this approach in 983 independent samples for which the combined copy-number and gene expression IntClust classification was available. Only 24 samples are discordantly classified. Next, we compile a consolidated external dataset composed of a further 7,544 breast tumors. We use our approach to classify all samples into the IntClust subtypes. All ten subtypes are observable in most studies at comparable frequencies. The IntClust subtypes are significantly associated with relapse-free survival and recapitulate patterns of survival observed previously. In studies of neo-adjuvant chemotherapy, IntClust reveals distinct patterns of chemosensitivity. Finally, patterns of expression of genomic drivers reported by TCGA are better explained by IntClust as compared to the PAM50 classifier.
Conclusions: IntClust subtypes are reproducible in a large meta-analysis, show clinical validity and best capture variation in genomic drivers. IntClust is a driver-based breast cancer classification and is likely to become increasingly relevant as more targeted biological therapies become available.
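The agreement figure in the abstract above (24 discordant calls among 983 independent samples) can be turned into a concordance rate with simple arithmetic. The sketch below is illustrative only: the function name `concordance` is hypothetical and is not part of any IntClust software.

```python
def concordance(total: int, discordant: int) -> float:
    """Return the fraction of samples on which two classifications agree."""
    if total <= 0 or not 0 <= discordant <= total:
        raise ValueError("need total > 0 and 0 <= discordant <= total")
    return (total - discordant) / total


# Figures quoted in the abstract: 983 independent samples, 24 discordant.
rate = concordance(total=983, discordant=24)
print(f"concordance = {rate:.1%}")  # prints "concordance = 97.6%"
```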
ABSTRACT: The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole genome sequencing data remain under-developed. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN.
ABSTRACT: The gut endocrine system is emerging as a central player in the control of appetite and glucose homeostasis, and as a rich source of peptides with therapeutic potential in the field of diabetes and obesity. In this study we have explored the physiology of insulin-like peptide 5 (Insl5), which we identified as a product of colonic enteroendocrine L-cells, better known for their secretion of glucagon-like peptide-1 and peptide YY. Intraperitoneally (i.p.) administered Insl5 increased food intake in wild-type mice but not in mice lacking the cognate receptor Rxfp4. Plasma Insl5 levels were elevated by fasting or prolonged calorie restriction, and declined with feeding. We conclude that Insl5 is an orexigenic hormone released from colonic L-cells, which promotes appetite during conditions of energy deprivation.
Proceedings of the National Academy of Sciences of the United States of America. 07/2014;
ABSTRACT: BRCA2 mutations are significantly associated with early onset breast cancer, and the tumour suppressing function of BRCA2 has been attributed to its involvement in homologous recombination (HR)-mediated DNA repair. In order to identify additional functions of BRCA2, we generated BRCA2-knockout HCT116 human colorectal carcinoma cells. Using genome-wide microarray analyses, we have discovered a link between the loss of BRCA2 and the up-regulation of a subset of interferon (IFN)-related genes, including APOBEC3F and APOBEC3G. The over-expression of IFN-related genes was confirmed in different human BRCA2−/− and mouse Brca2−/− tumour cell lines, and was independent of senescence and apoptosis. In isogenic wild type BRCA2 cells, we observed over-expression of IFN-related genes after treatment with DNA-damaging agents, and following ionizing radiation. Cells with endogenous DNA damage because of defective BRCA1 or RAD51 also exhibited over-expression of IFN-related genes. Transcriptional activity of the IFN-stimulated response element (ISRE) was increased in BRCA2 knockout cells, and the expression of BRCA2 greatly decreased IFN-α stimulated ISRE reporter activity, suggesting that BRCA2 directly represses the expression of IFN-related genes through the ISRE. Finally, the colony forming capacity of BRCA2 knockout cells was significantly reduced in the presence of either IFN-β or IFN-γ, suggesting that IFNs may have potential as therapeutic agents in cancer cells with BRCA2 mutations.
The Journal of Pathology 07/2014; · 7.59 Impact Factor
ABSTRACT: Hypomethylating agents are widely used in patients with myelodysplastic syndromes and unfit patients with acute myeloid leukemia. However, it is not well understood why only some patients respond to hypomethylating agents. We found previously that the effect of decitabine on hematopoietic stem cell viability differed between Mll5 wildtype and null cells. We therefore investigated the role of MLL5 expression levels on outcome of acute myeloid leukemia patients who were treated with decitabine. MLL5 above the median expression level predicted longer overall survival independent of DNMT3A mutation status in bivariate analysis (median overall survival for high vs. low MLL5 expression, 292 vs. 167 days, P=.026). In patients who received 3 or more courses of decitabine, high MLL5 expression and wildtype DNMT3A independently predicted improved overall survival (median overall survival for high vs. low MLL5 expression, 468 vs. 243 days, P=.012). In transformed murine cells, loss of Mll5 was associated with resistance to low-dose decitabine, less global DNA methylation in promoter regions, and reduced DNA demethylation upon decitabine treatment. Together, these data support our clinical observation of improved outcome in decitabine-treated patients who express MLL5 at high levels, and suggest a mechanistic role of MLL5 in the regulation of DNA methylation.
ABSTRACT: In breast cancer, the TP53 gene is frequently mutated and the mutations have been associated with poor prognosis. The prognostic impact of the different types of TP53 mutations across the different molecular subtypes is still poorly understood. Here, we characterize the spectrum and prognostic significance of TP53 mutations with respect to the PAM50 subtypes and Integrative Clusters (IC). Experimental design: TP53 mutation status was obtained for 1420 tumor samples from the METABRIC cohort by sequencing all coding exons using the Sanger method.
TP53 mutations were found in 28.3% of the tumors, conferring a worse overall and breast cancer specific survival (HR=2.03, 95%CI=1.65-2.48, p<0.001), and were also found to be an independent marker of poor prognosis in estrogen receptor positive cases (HR=1.86, 95%CI=1.39-2.49, p<0.001). The mutation spectrum of TP53 varied between the breast cancer subtypes, and individual alterations showed subtype specific association. TP53 mutations were associated with increased mortality in patients with Luminal B, HER2-enriched and Normal-like tumors, but not in patients with Luminal A and Basal-like tumors. Similar observations were made in ICs, where mutation associated with poorer outcome in IC1, IC4 and IC5. The combined effect of TP53 mutation, TP53 LOH and MDM2 amplification on mortality was additive.
This study reveals that TP53 mutations have different clinical relevance in molecular subtypes of breast cancer, and suggests diverse roles for TP53 in the biology underlying breast cancer development.
Clinical Cancer Research 05/2014; · 7.84 Impact Factor
ABSTRACT: Cancer evolves by mutation, with somatic reactivation of retrotransposons being one such mutational process. Germline retrotransposition can cause processed pseudogenes, but whether this occurs somatically has not been evaluated. Here we screen sequencing data from 660 cancer samples for somatically acquired pseudogenes. We find 42 events in 17 samples, especially non-small cell lung cancer (5/27) and colorectal cancer (2/11). Genomic features mirror those of germline LINE element retrotranspositions, with frequent target-site duplications (67%), consensus TTTTAA sites at insertion points, inverted rearrangements (21%), 5′ truncation (74%) and polyA tails (88%). Transcriptional consequences include expression of pseudogenes from UTRs or introns of target genes. In addition, a somatic pseudogene that integrated into the promoter and first exon of the tumour suppressor gene, MGA, abrogated expression from that allele. Thus, formation of processed pseudogenes represents a new class of mutation occurring during cancer development, with potentially diverse functional consequences depending on genomic context.
ABSTRACT: We introduce PyClone, a statistical model for inference of clonal population structures in cancers. PyClone is a Bayesian clustering method for grouping sets of deeply sequenced somatic mutations into putative clonal clusters while estimating their cellular prevalences and accounting for allelic imbalances introduced by segmental copy-number changes and normal-cell contamination. Single-cell sequencing validation demonstrates PyClone's accuracy.
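To illustrate why copy number and normal-cell contamination matter when estimating cellular prevalence, here is a minimal sketch of a commonly used point-estimate correction. It is not PyClone's Bayesian clustering model and does not use the PyClone API. It assumes a single known tumor copy number shared by all tumor cells at the locus, a known number of mutated copies, and a diploid normal genome; the function name and all parameters are hypothetical.

```python
def cellular_prevalence(vaf, purity, tumor_cn, mutated_copies=1, normal_cn=2):
    """Point estimate of the fraction of tumor cells carrying a mutation.

    Simplifying assumption: expected VAF =
        prevalence * purity * mutated_copies
        / (purity * tumor_cn + (1 - purity) * normal_cn)
    so the prevalence is obtained by inverting that relation.
    """
    if not 0 <= vaf <= 1:
        raise ValueError("vaf must lie in [0, 1]")
    if not 0 < purity <= 1:
        raise ValueError("purity must lie in (0, 1]")
    if mutated_copies < 1 or tumor_cn < mutated_copies:
        raise ValueError("need 1 <= mutated_copies <= tumor_cn")
    # Average number of copies of the locus per cell in the mixed sample.
    avg_copies = purity * tumor_cn + (1 - purity) * normal_cn
    prevalence = vaf * avg_copies / (purity * mutated_copies)
    # Sampling noise can push the estimate above 1; cap it there.
    return min(prevalence, 1.0)


# Hypothetical inputs: VAF 0.25 in an 80%-pure sample at a diploid locus.
phi = cellular_prevalence(vaf=0.25, purity=0.8, tumor_cn=2)
print(round(phi, 3))
```

A raw allele frequency of 0.25 would naively suggest that half of the cells carry a heterozygous mutation; correcting for 20% normal-cell contamination raises the estimate for the tumor cells themselves.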
ABSTRACT: Amplification of the EMSY gene in sporadic breast and ovarian cancers is a poor prognostic indicator. Although EMSY has been linked to transcriptional silencing, its mechanism of action is unknown. Here, we report that EMSY acts as an oncogene, causing the transformation of cells in vitro and potentiating tumor formation and metastatic features in vivo. We identify an inverse correlation between EMSY amplification and miR-31 expression, an antimetastatic microRNA, in the METABRIC cohort of human breast samples. Re-expression of miR-31 profoundly reduced cell migration, invasion, and colony-formation abilities of cells overexpressing EMSY or harboring EMSY amplification. We show that EMSY is recruited to the miR-31 promoter by the DNA binding factor ETS-1, and it represses miR-31 transcription by delivering the H3K4me3 demethylase JARID1b/PLU-1/KDM5B. Altogether, these results suggest a pathway underlying the role of EMSY in breast cancer and uncover potential diagnostic and therapeutic targets in sporadic breast cancer.
ABSTRACT: Cellular barcoding offers a powerful approach to characterize the growth and differentiation activity of large numbers of cotransplanted stem cells. Here, we describe a lentiviral genomic-barcoding and analysis strategy and its use to compare the clonal outputs of transplants of purified mouse and human basal mammary epithelial cells. We found that both sources of transplanted cells produced many bilineage mammary epithelial clones in primary recipients, although primary clones containing only one detectable mammary lineage were also common. Interestingly, regardless of the species of origin, many clones evident in secondary recipients were not detected in the primary hosts, and others changed from appearing luminal-restricted to appearing bilineage. This barcoding methodology has thus revealed conservation between mice and humans of a previously unknown diversity in the growth and differentiation activities of their basal mammary epithelial cells stimulated to grow in transplanted hosts.
ABSTRACT: Complex focal chromosomal rearrangements in cancer genomes, also called “firestorms”, can be scored from DNA copy number data. The complex arm-wise aberration index (CAAI) is a score that captures DNA copy number alterations that appear as focal complex events in tumors, and has potential prognostic value in breast cancer. This study aimed to validate this DNA-based prognostic index in breast cancer and test for the first time its potential prognostic value in ovarian cancer. Copy number alteration (CNA) data from 1950 breast carcinomas (METABRIC cohort) and 508 high-grade serous ovarian carcinomas (TCGA dataset) were analyzed. Cases were classified CAAI positive if at least one complex focal event was scored. Complex alterations were frequently localized on chromosome 8p (n = 159), 17q (n = 176) and 11q (n = 251). CAAI events on 11q were most frequent in estrogen receptor positive (ER+) cases and on 17q in estrogen receptor negative (ER−) cases. We found only a modest correlation between CAAI and the overall rate of genomic instability (GII) and number of breakpoints (r = 0.27 and r = 0.42, p < 0.001). Breast cancer specific survival (BCSS), overall survival (OS) and ovarian cancer progression free survival (PFS) were used as clinical end points in Cox proportional hazard model survival analyses. CAAI positive breast cancers (43%) had higher mortality: hazard ratio (HR) of 1.94 (95%CI, 1.62–2.32) for BCSS, and of 1.49 (95%CI, 1.30–1.71) for OS. Representations of the 70-gene and the 21-gene predictors were compared with CAAI in multivariable models and CAAI was independently significant with a Cox adjusted HR of 1.56 (95%CI, 1.23–1.99) for ER+ and 1.55 (95%CI, 1.11–2.18) for ER− disease. None of the expression-based predictors were prognostic in the ER− subset.
We found that a model including CAAI and the two expression-based prognostic signatures outperformed a model including the 21-gene and 70-gene signatures but excluding CAAI. Inclusion of CAAI in the clinical prognostication tool PREDICT significantly improved its performance. CAAI positive ovarian cancers (52%) also had worse prognosis: HRs of 1.3 (95%CI, 1.1–1.7) for PFS and 1.3 (95%CI, 1.1–1.6) for OS. This study validates CAAI as an independent predictor of survival in both ER+ and ER− breast cancer and reveals a significant prognostic value for CAAI in high-grade serous ovarian cancer.
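The hazard ratios quoted above come with 95% confidence intervals, which allows a quick sanity check. Assuming these are standard Wald intervals from a Cox model (an assumption; the abstract does not say how the intervals were computed), the interval is symmetric on the log scale, so the geometric mean of its bounds should be close to the reported HR, and its width gives the standard error of the log hazard ratio. The helper name `wald_summary` is hypothetical.

```python
import math


def wald_summary(hr, lower, upper, z_crit=1.96):
    """Summarize a hazard ratio and its 95% CI under a Wald-interval assumption.

    Returns (geometric midpoint of the CI, SE of log HR, z statistic).
    """
    if not 0 < lower < upper or hr <= 0:
        raise ValueError("need 0 < lower < upper and hr > 0")
    # A Wald CI is log(HR) +/- z_crit * SE, hence symmetric on the log scale.
    log_se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    midpoint = math.sqrt(lower * upper)
    z = math.log(hr) / log_se
    return midpoint, log_se, z


# BCSS figures quoted in the abstract: HR 1.94 (95%CI, 1.62-2.32).
midpoint, log_se, z = wald_summary(1.94, 1.62, 2.32)
print(f"midpoint={midpoint:.2f}, se(log HR)={log_se:.3f}, z={z:.1f}")
```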
ABSTRACT: Triple-negative breast cancers (TNBC) do not represent a single disease subgroup and are often aggressive breast cancers with poor prognoses. Unlike estrogen/progesterone receptor and HER2 (human epidermal growth factor receptor 2) breast cancers, which are responsive to targeted treatments, there is no effective targeted therapy for TNBC, although approximately 50% of patients respond to conventional chemotherapies, including taxanes, anthracyclines, cyclophosphamide, and platinum salts.
Content: Genomic studies have helped clarify some of the possible disease groupings that make up TNBC. We discuss the findings, including copy number-transcriptome analysis, whole genome sequencing, and exome sequencing, in terms of the biological properties and phenotypes that make up the constellation of TNBC. The relationships between subgroups defined by transcriptome and genome analysis are discussed.
Summary: TNBC is not a uniform molecular or disease entity but a constellation of variably well-defined biological properties whose relationship to each other is not understood. There is good support for the existence of a basal expression subtype, p53 mutated, high-genomic instability subtype of TNBC. This should be considered a distinct TNBC subtype. Other subtypes with variable degrees of supporting evidence exist within the nonbasal/p53wt (wild-type p53) TNBC, including a group of TNBC with PI3K (phosphoinositide 3-kinase) pathway activation that have better overall prognosis than the basal TNBC. Consistent molecular phenotyping of TNBC by whole genome sequencing, transcriptomics, and functional studies with patient-derived tumor xenograft models will be essential components in clinical and biological studies as means of resolving this heterogeneity.
ABSTRACT: Rhabdomyosarcoma (RMS) is the most common soft tissue sarcoma in children. Children with metastatic RMS have a five-year event-free survival of <30% and a recent trial of the topoisomerase I inhibitor irinotecan failed to improve outcome. We hypothesized that this resistance to irinotecan arose from overexpression of the DNA repair enzyme tyrosyl-DNA phosphodiesterase (Tdp1), which processes topoisomerase I-DNA complexes resulting from topoisomerase I inhibitor treatment. Using tissue microarrays and gene expression arrays, we found marked overexpression of Tdp1 protein and mRNA in RMS tumors and that knockdown of TDP1 or inhibition of poly (ADP-ribose) polymerase-1 (PARP-1), an enzyme in the same complex as Tdp1, sensitized RMS cell lines to analogues of irinotecan. Interestingly, although BRCA1 and BRCA2 mutations or altered expression were undetectable in RMS cell lines, TDP1 knockdown and PARP-1 inhibition alone were cytotoxic to some RMS cells, suggesting that they harbor genetic lesions of DNA repair that have synthetic lethal interactions with loss of Tdp1 or PARP1 function. Furthermore, culturing embryonal RMS cells in low-serum, low-glucose medium increased cytotoxicity of PARP-1 inhibition and was intrinsically cytotoxic to alveolar, though not embryonal RMS cells. We conclude therefore that TDP1 knockdown, PARP-1 inhibition and dietary restriction are considerations as components of RMS therapies.
Molecular Cancer Research 08/2013; · 4.35 Impact Factor
ABSTRACT: Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.
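The abstract above reports that ensembles built across submissions outperform individual models, but it does not say how the ensemble was formed. One simple, widely used way to combine risk predictions that live on different scales is to average each patient's rank across models. The sketch below shows that technique only as an illustration; it should not be read as the competition's actual method, and the function names and example scores are hypothetical.

```python
def to_ranks(scores):
    """Convert risk scores to 0-based ranks (lowest score gets rank 0).

    Ties are broken by original position, which is adequate for a sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for rank, idx in enumerate(order):
        ranks[idx] = float(rank)
    return ranks


def rank_average(model_scores):
    """Average per-patient ranks across several models' risk predictions."""
    if not model_scores:
        raise ValueError("need at least one model")
    n = len(model_scores[0])
    if any(len(scores) != n for scores in model_scores):
        raise ValueError("every model must score the same patients")
    per_model = [to_ranks(scores) for scores in model_scores]
    return [sum(r[i] for r in per_model) / len(per_model) for i in range(n)]


# Hypothetical risk scores from three models for the same three patients.
ensemble = rank_average([
    [0.1, 0.9, 0.5],
    [0.2, 0.7, 0.8],
    [0.3, 0.6, 0.4],
])
print(ensemble)
```

Working on ranks rather than raw scores means that no single model dominates the ensemble merely because its outputs span a wider numeric range.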
ABSTRACT: Although molecular prognostics in breast cancer are among the most successful examples of translating genomic analysis to clinical applications, optimal approaches to breast cancer clinical risk prediction remain controversial. The Sage Bionetworks-DREAM Breast Cancer Prognosis Challenge (BCC) is a crowdsourced research study for breast cancer prognostic modeling using genome-scale data. The BCC provided a community of data analysts with a common platform for data access and blinded evaluation of model accuracy in predicting breast cancer survival on the basis of gene expression data, copy number data, and clinical covariates. This approach offered the opportunity to assess whether a crowdsourced community Challenge would generate models of breast cancer prognosis commensurate with or exceeding current best-in-class approaches. The BCC comprised multiple rounds of blinded evaluations on held-out portions of data on 1981 patients, resulting in more than 1400 models submitted as open source code. Participants then retrained their models on the full data set of 1981 samples and submitted up to five models for validation in a newly generated data set of 184 breast cancer patients. Analysis of the BCC results suggests that the best-performing modeling strategy outperformed previously reported methods in blinded evaluations; model performance was consistent across several independent evaluations; and aggregating community-developed models achieved performance on par with the best-performing individual models.
Science translational medicine 04/2013; 5(181):181re1. · 10.76 Impact Factor
ABSTRACT: Breast cancer is a group of heterogeneous diseases that show substantial variation in their molecular and clinical characteristics. This heterogeneity poses significant challenges not only in breast cancer management, but also in studying the biology of the disease. Recently, rapid progress has been made in understanding the genomic diversity of breast cancer. These advances led to the characterisation of a new genome-driven integrated classification of breast cancer, which substantially refines the existing classification systems currently used. The novel classification integrates molecular information on the genomic and transcriptomic landscapes of breast cancer to define 10 integrative clusters, each associated with distinct clinical outcomes and providing new insights into the underlying biology and potential molecular drivers. These findings have profound implications for the individualisation of treatment approaches, bringing us a step closer to the realisation of personalised cancer management in breast cancer, and also provide a new framework for studying the underlying biology of each novel subtype.
ABSTRACT: Mixed Lineage Leukemia 5 (MLL5) is a histone methyltransferase that plays a key role in hematopoiesis, spermatogenesis and cell cycle progression. In addition to its catalytic domain, MLL5 contains a PHD finger domain, a protein module that is often involved in binding to the N-terminus of histone H3. Here we report the NMR solution structure of the MLL5 PHD domain showing a variant of the canonical PHD fold that combines conserved H3 binding features from several classes of other PHD domains (including an aromatic cage) along with a novel C-terminal α-helix, not previously seen. We further demonstrate that the PHD domain binds with similar affinity to histone H3 tail peptides di- and tri-methylated at lysine 4 (H3K4me2 and H3K4me3), the former being the putative product of the MLL5 catalytic reaction. This work establishes the PHD domain of MLL5 as a bona fide 'reader' domain of H3K4 methyl marks, suggesting that it may guide the spreading or further methylation of this site on chromatin.
PLoS ONE 01/2013; 8(10):e77020. · 3.53 Impact Factor
ABSTRACT: Simultaneous interrogation of tumor genomes and transcriptomes is underway in unprecedented global efforts. Yet, despite the essential need to separate driver mutations modulating gene expression networks from transcriptionally inert passenger mutations, robust computational methods to ascertain the impact of individual mutations on transcriptional networks are underdeveloped. We introduce a novel computational framework, DriverNet, to identify likely driver mutations by virtue of their effect on mRNA expression networks. Application to four cancer datasets reveals the prevalence of rare candidate driver mutations associated with disrupted transcriptional networks and a simultaneous modulation of oncogenic and metabolic networks, induced by copy number co-modification of adjacent oncogenic and metabolic drivers. DriverNet is available on Bioconductor or at http://compbio.bccrc.ca/software/drivernet/.