ABSTRACT: First-generation molecular profiles for human breast cancers have enabled the identification of features that can predict therapeutic response; however, little is known about how the various data types can best be combined to yield optimal predictors. Collections of breast cancer cell lines mirror many aspects of breast cancer molecular pathobiology, and measurements of their omic and biological therapeutic responses are well-suited for development of strategies to identify the most predictive molecular feature sets.
We used least squares-support vector machines and random forest algorithms to identify molecular features associated with responses of a collection of 70 breast cancer cell lines to 90 experimental or approved therapeutic agents. The datasets analyzed included measurements of copy number aberrations, mutations, gene and isoform expression, promoter methylation and protein expression. Transcriptional subtype contributed strongly to response predictors for 25% of compounds, and adding other molecular data types improved prediction for 65%. No single molecular dataset consistently out-performed the others, suggesting that therapeutic response is mediated at multiple levels in the genome. Response predictors were developed and applied to TCGA data, and were found to be present in subsets of those patient samples.
These results suggest that matching patients to treatments based on transcriptional subtype will improve response rates, and inclusion of additional features from other profiling data types may provide additional benefit. Further, we suggest a systems biology strategy for guiding clinical trials so that patient cohorts most likely to respond to new therapies may be more efficiently identified.
ABSTRACT: The human epidermal growth factor receptor (HER) family of tyrosine kinases is deregulated in multiple cancers either through amplification, overexpression, or mutation. ERBB3/HER3, the only member with an impaired kinase domain, although amplified or overexpressed in some cancers, has not been reported to carry oncogenic mutations. Here, we report the identification of ERBB3 somatic mutations in ∼11% of colon and gastric cancers. We found that the ERBB3 mutants transformed colonic and breast epithelial cells in a ligand-independent manner. However, the mutant ERBB3 oncogenic activity was dependent on kinase-active ERBB2. Furthermore, we found that anti-ERBB antibodies and small molecule inhibitors effectively blocked mutant ERBB3-mediated oncogenic signaling and disease progression in vivo.
Cancer Cell 05/2013; 23(5):603-17. · 25.29 Impact Factor
ABSTRACT: Small-cell lung cancer (SCLC) is an exceptionally aggressive disease with poor prognosis. Here, we obtained exome, transcriptome and copy-number alteration data from approximately 53 samples consisting of 36 primary human SCLC and normal tissue pairs and 17 matched SCLC and lymphoblastoid cell lines. We also obtained data for 4 primary tumors and 23 SCLC cell lines. We identified 22 significantly mutated genes in SCLC, including genes encoding kinases, G protein-coupled receptors and chromatin-modifying proteins. We found that several members of the SOX family of genes were mutated in SCLC. We also found SOX2 amplification in ∼27% of the samples. Suppression of SOX2 using shRNAs blocked proliferation of SOX2-amplified SCLC lines. RNA sequencing identified multiple fusion transcripts and a recurrent RLF-MYCL1 fusion. Silencing of MYCL1 in SCLC cell lines that had the RLF-MYCL1 fusion decreased cell proliferation. These data provide an in-depth view of the spectrum of genomic alterations in SCLC and identify several potential targets for therapeutic intervention.
ABSTRACT: Identifying and understanding changes in cancer genomes is essential for the development of targeted therapeutics. Here we analyse systematically more than 70 pairs of primary human colon tumours by applying next-generation sequencing to characterize their exomes, transcriptomes and copy-number alterations. We have identified 36,303 protein-altering somatic changes that include several new recurrent mutations in the Wnt pathway gene TCF7L2, chromatin-remodelling genes such as TET2 and TET3 and receptor tyrosine kinases including ERBB3. Our analysis for significantly mutated cancer genes identified 23 candidates, including the cell cycle checkpoint kinase ATM. Copy-number and RNA-seq data analysis identified amplifications and corresponding overexpression of IGF2 in a subset of colon tumours. Furthermore, using RNA-seq data we identified multiple fusion transcripts including recurrent gene fusions involving R-spondin family members RSPO2 and RSPO3 that together occur in 10% of colon tumours. The RSPO fusions were mutually exclusive with APC mutations, indicating that they probably have a role in the activation of Wnt signalling and tumorigenesis. Consistent with this we show that the RSPO fusion proteins were capable of potentiating Wnt signalling. The R-spondin gene fusions and several other gene mutations identified in this study provide new potential opportunities for therapeutic intervention in colon cancer.
ABSTRACT: Oncogenic mutations in PIK3CA, which encodes the phosphoinositide-3-kinase (PI3K) catalytic subunit p110α, occur in ∼25% of human breast cancers. In this study, we report the development of a knock-in mouse model for breast cancer where the endogenous Pik3ca allele was modified to allow tissue-specific conditional expression of a frequently found Pik3ca(H1047R) (Pik3ca(e20H1047R)) mutant allele. We found that activation of the latent Pik3ca(H1047R) allele resulted in breast tumors with multiple histological types. Whole-exome analysis of the Pik3ca(H1047R)-driven mammary tumors identified multiple mutations, including Trp53 mutations that appeared spontaneously during the development of adenocarcinoma and spindle cell tumors. Further, we used this model to test the efficacy of GDC-0941, a PI3K inhibitor in clinical development, and showed that the tumors respond to PI3K inhibition. Oncogene advance online publication, 27 February 2012; doi:10.1038/onc.2012.53.
Keywords: Pik3ca; H1047R; knock-in; mammary gland; Trp53; exome sequencing
ABSTRACT: Breast cancers are comprised of molecularly distinct subtypes that may respond differently to pathway-targeted therapies now under development. Collections of breast cancer cell lines mirror many of the molecular subtypes and pathways found in tumors, suggesting that treatment of cell lines with candidate therapeutic compounds can guide identification of associations between molecular subtypes, pathways, and drug response. In a test of 77 therapeutic compounds, nearly all drugs showed differential responses across these cell lines, and approximately one third showed subtype-, pathway-, and/or genomic aberration-specific responses. These observations suggest mechanisms of response and resistance and may inform efforts to develop molecular assays that predict clinical response.
Proceedings of the National Academy of Sciences 02/2012; 109(8):2724-9. · 9.74 Impact Factor
ABSTRACT: Timely intervention for cancer requires knowledge of its earliest genetic aberrations. Sequencing of tumors and their metastases reveals numerous abnormalities occurring late in progression. A means to temporally order aberrations in a single cancer, rather than inferring them from serially acquired samples, would define changes preceding even clinically evident disease. We integrate DNA sequence and copy number information to reconstruct the order of abnormalities as individual tumors evolve for 2 separate cancer types. We detect vast, unreported expansion of simple mutations sharply demarcated by recombinative loss of the second copy of TP53 in cutaneous squamous cell carcinomas (cSCC) and serous ovarian adenocarcinomas, in the former surpassing 50 mutations per megabase. In cSCCs, we also report diverse secondary mutations in known and novel oncogenic pathways, illustrating how such expanded mutagenesis directly promotes malignant progression. These results reframe paradigms in which TP53 mutation is required later, to bypass senescence induced by driver oncogenes.
Cancer Discovery 07/2011; 1(2):137-43. · 10.14 Impact Factor
ABSTRACT: A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients' lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology.
ABSTRACT: Protein isoforms produced by alternative splicing (AS) of many genes have been implicated in several aspects of cancer genesis and progression. These observations motivated a genome-wide assessment of AS in breast cancer. We accomplished this by measuring exon level expression in 31 breast cancer and nonmalignant immortalized cell lines representing luminal, basal, and claudin-low breast cancer subtypes using Affymetrix Human Junction Arrays. We analyzed these data using a computational pipeline specifically designed to detect AS with a low false-positive rate. This identified 181 splice events representing 156 genes as candidates for AS. Reverse transcription-PCR validation of a subset of predicted AS events confirmed 90%. Approximately half of the AS events were associated with basal, luminal, or claudin-low breast cancer subtypes. Exons involved in claudin-low subtype-specific AS were significantly associated with the presence of evolutionarily conserved binding motifs for the tissue-specific Fox2 splicing factor. Small interfering RNA knockdown of Fox2 confirmed the involvement of this splicing factor in subtype-specific AS. The subtype-specific AS detected in this study likely reflects the splicing pattern in the breast cancer progenitor cells in which the tumor arose and suggests the utility of assays for Fox-mediated AS in cancer subtype definition and early detection. These data also suggest the possibility of reducing the toxicity of protein-targeted breast cancer treatments by targeting protein isoforms that are not present in limiting normal tissues.
Molecular Cancer Research 07/2010; 8(7):961-74. · 4.35 Impact Factor
ABSTRACT: Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis.
Our two-step dimensionality reduction compressed the original data by roughly 90%, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differential methylation patterns between the basal and luminal subtypes.
Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
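The final step of the pipeline above, assessing whether a methylation-expression association is significant by permutation, can be illustrated with a minimal sketch. This is not the authors' method: the abstract ranks associations with a logistic regression, whereas the sketch swaps in a plain Pearson correlation as a simpler stand-in. The function names, the number of permutations, and the toy values are all hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def permutation_p_negative(methylation, expression, n_perm=2000, seed=0):
    """One-sided permutation p-value for a negative methylation-expression correlation.

    Expression values are shuffled across cell lines; the p-value is the share of
    shuffles whose correlation is at least as negative as the observed one.
    """
    rng = random.Random(seed)
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(methylation, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one methylation cluster and one gene across 8 cell lines.
methylation = [0.9, 0.8, 0.85, 0.1, 0.2, 0.15, 0.7, 0.3]
expression = [1.0, 1.5, 1.2, 8.0, 7.5, 8.2, 2.0, 6.5]

r, p = permutation_p_negative(methylation, expression)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

In a real analysis the same test would be repeated for every candidate methylation-expression pair, so the resulting p-values would also need a multiple-testing correction.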
ABSTRACT: microRNAs have been shown to be involved in different human cancers. We therefore performed expression profiling on a panel of pediatric tumors to identify cancer-specific microRNAs. We also investigated whether microRNAs are coregulated with their host genes.
We performed parallel microRNA and mRNA expression profiling on 57 tumor xenografts and cell lines representing 10 different pediatric solid tumors using microarrays. For those microRNAs that map to their host mRNA, we calculated correlations between them.
We found that the majority of cancer types clustered together based on their global microRNA expression profiles by unsupervised hierarchical clustering. Fourteen microRNAs were significantly differentially expressed between rhabdomyosarcoma and neuroblastoma, and 8 of them were validated in independent patient tumor samples. Exploration of the expression of microRNAs in relationship with their host genes showed that the expression for 43 of 68 (63%) microRNAs located inside known coding genes was significantly correlated with that of their host genes. Among these 43 microRNAs, 5 of 7 microRNAs in the OncomiR-1 cluster correlated significantly with their host gene MIRHG1 (P < 0.01). In addition, high expression of MIRHG1 was significantly associated with high stage and MYCN amplification in neuroblastoma tumors, and the expression level of MIRHG1 could predict the outcome of neuroblastoma patients independently from the current neuroblastoma risk-stratification in two independent patient cohorts.
Pediatric cancers express cancer-specific microRNAs. The high expression of the OncomiR-1 host gene MIRHG1 correlates with poor outcome for patients with neuroblastoma, indicating important oncogenic functions of this microRNA cluster in neuroblastoma biology.
Clinical Cancer Research 09/2009; 15(17):5560-8. · 7.84 Impact Factor
ABSTRACT: Genomic experiments produce multiple views of biological systems, among them DNA sequence and copy number variation, and mRNA and protein abundance. Understanding these systems requires integrated bioinformatic analysis. Public databases such as Ensembl provide relationships and mappings between the relevant sets of probe and target molecules. However, the relationships can be biologically complex and the content of the databases is dynamic. We demonstrate how to use the computational environment R to integrate and jointly analyze experimental datasets, employing BioMart web services to provide the molecule mappings. We also discuss typical problems that are encountered in making gene-to-transcript-to-protein mappings. The approach provides a flexible, programmable and reproducible basis for state-of-the-art bioinformatic data integration.
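One of the mapping problems mentioned above, that a probe may resolve to zero, one, or several proteins once it is chained through transcripts, can be shown with a small sketch. It is written in Python rather than R and does not call BioMart; the dictionaries stand in for mapping tables that would normally be downloaded, and every identifier and function name is hypothetical.

```python
# Hypothetical mapping tables; real ones would come from an annotation database.
probe_to_transcripts = {
    "probe1": ["txA1", "txA2"],  # probe matches two isoforms of the same gene
    "probe2": ["txA2"],          # probe matches exactly one isoform
    "probe3": ["txB1"],          # probe matches a transcript with no protein product
}
transcript_to_protein = {
    "txA1": "protA1",
    "txA2": "protA2",
    # "txB1" is deliberately absent, e.g. a non-coding transcript.
}


def probes_to_proteins(probe_map, protein_map):
    """Chain probe -> transcript -> protein, dropping transcripts without a protein."""
    return {
        probe: sorted({protein_map[t] for t in transcripts if t in protein_map})
        for probe, transcripts in probe_map.items()
    }


def classify(protein_lists):
    """Label each probe by how many distinct proteins it reaches."""
    labels = {0: "unmapped", 1: "unique"}
    return {
        probe: labels.get(len(found), "ambiguous")
        for probe, found in protein_lists.items()
    }


proteins = probes_to_proteins(probe_to_transcripts, transcript_to_protein)
status = classify(proteins)
for probe in sorted(status):
    print(probe, status[probe], proteins[probe])
```

Making the three cases explicit lets an analysis decide up front whether ambiguous probes are dropped, averaged, or kept as separate rows, instead of having a join silently duplicate or lose measurements.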
ABSTRACT: Biological studies involve a growing number of distinct high-throughput experiments to characterize samples of interest. There is a lack of methods to visualize these different genomic datasets in a versatile manner. In addition, genomic data analysis requires integrated visualization of experimental data along with constantly changing genomic annotation and statistical analyses.
We developed GenomeGraphs, as an add-on software package for the statistical programming environment R, to facilitate integrated visualization of genomic datasets. GenomeGraphs uses the biomaRt package to perform on-line annotation queries to Ensembl and translates these to gene/transcript structures in viewports of the grid graphics package. This allows genomic annotation to be plotted together with experimental data. GenomeGraphs can also be used to plot custom annotation tracks in combination with different experimental data types together in one plot using the same genomic coordinate system.
GenomeGraphs is a flexible and extensible software package which can be used to visualize a multitude of genomic datasets within the statistical programming environment R.
ABSTRACT: Human cancer cells typically harbour multiple chromosomal aberrations, nucleotide substitutions and epigenetic modifications that drive malignant transformation. The Cancer Genome Atlas (TCGA) pilot project aims to assess the value of large-scale multi-dimensional analysis of these molecular characteristics in human cancer and to provide the data rapidly to the research community. Here we report the interim integrative analysis of DNA copy number, gene expression and DNA methylation aberrations in 206 glioblastomas, the most common type of adult brain cancer, and nucleotide sequence aberrations in 91 of the 206 glioblastomas. This analysis provides new insights into the roles of ERBB2, NF1 and TP53, uncovers frequent mutations of the phosphatidylinositol-3-OH kinase regulatory subunit gene PIK3R1, and provides a network view of the pathways altered in the development of glioblastoma. Furthermore, integration of mutation, DNA methylation and clinical treatment data reveals a link between MGMT promoter methylation and a hypermutator phenotype consequent to mismatch repair deficiency in treated glioblastomas, an observation with potential clinical implications. Together, these findings establish the feasibility and power of TCGA, demonstrating that it can rapidly expand knowledge of the molecular basis of cancer.
ABSTRACT: The panel of 60 human cancer cell lines (the NCI-60) assembled by the National Cancer Institute for anticancer drug discovery is a widely used resource. We previously sequenced 24 cancer genes in those cell lines. Eleven of the genes were found to be mutated in three or more of the lines. Using a pharmacogenomic approach, we analyzed the relationship between drug activity and mutations in those 11 genes (APC, RB1, KRAS, NRAS, BRAF, PIK3CA, PTEN, STK11, MADH4, TP53, and CDKN2A). That analysis identified an association between mutation in BRAF and the antiproliferative potential of phenothiazine compounds. Phenothiazines have been used as antipsychotics and as adjunct antiemetics during cancer chemotherapy and more recently have been reported to have anticancer properties. However, to date, the anticancer mechanism of action of phenothiazines has not been elucidated. To follow up on the initial pharmacologic observations in the NCI-60 screen, we did pharmacologic experiments on 11 of the NCI-60 cell lines and, prospectively, on an additional 24 lines. The studies provide evidence that BRAF mutation (codon 600) in melanoma as opposed to RAS mutation is predictive of an increase in sensitivity to phenothiazines as determined by 3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium, inner salt assay (Wilcoxon P = 0.007). That pattern of increased sensitivity to phenothiazines based on the presence of codon 600 BRAF mutation may be unique to melanomas, as we do not observe it in a panel of colorectal cancers. The findings reported here have potential implications for the use of phenothiazines in the treatment of V600E BRAF mutant melanoma.
Molecular Cancer Therapeutics 06/2008; 7(6):1337-46. · 5.60 Impact Factor
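The abstract above compares drug sensitivity between BRAF-mutant and RAS-mutant lines with a Wilcoxon test. A minimal sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney U) test follows. It uses the normal approximation without tie or continuity corrections, which is an assumption of this sketch and not necessarily the variant the authors used. The group names and response values are hypothetical and are not data from the study.

```python
from statistics import NormalDist


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns the Mann-Whitney U statistic for group_a and the approximate p-value.
    Tied values receive average ranks; no tie or continuity correction is applied.
    """
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5
    z = (u_a - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_a, p_value


# Hypothetical growth-inhibition concentrations (lower = more sensitive).
braf_mutant = [2.1, 2.5, 3.0, 3.2, 3.8, 4.0]
ras_mutant = [4.5, 5.1, 5.6, 6.0, 6.3, 7.2]

u, p_value = rank_sum_test(braf_mutant, ras_mutant)
print(f"U = {u}, two-sided p = {p_value:.4f}")
```

With groups this small an exact test would normally be preferred; the normal approximation is used here only to keep the sketch short and dependency-free.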
ABSTRACT: Loss of 1p36 heterozygosity commonly occurs with MYCN amplification in neuroblastoma tumors, and both are associated with an aggressive phenotype. Database searches identified five microRNAs that map to the commonly deleted region of 1p36, and we hypothesized that the loss of one or more of these microRNAs contributes to the malignant phenotype of MYCN-amplified tumors. By bioinformatic analysis, we identified that three of the five microRNAs target MYCN and that, of these, miR-34a caused the most significant suppression of cell growth through increased apoptosis and decreased DNA synthesis in neuroblastoma cell lines with MYCN amplification. Quantitative RT-PCR showed that neuroblastoma tumors with 1p36 loss expressed lower levels of miR-34a than those with normal copies of 1p36. Furthermore, we demonstrated that MYCN is a direct target of miR-34a. Finally, using a series of mRNA expression profiling experiments, we identified other potential direct targets of miR-34a, and pathway analysis demonstrated that miR-34a suppresses cell-cycle genes and induces several neural-related genes. This study demonstrates one important regulatory role of miR-34a in cell growth and MYCN suppression in neuroblastoma.
ABSTRACT: ArrayExpress is a public microarray repository founded on the Minimum Information About a Microarray Experiment (MIAME) principles that stores MIAME-compliant gene expression data. Plant-based data sets represent approximately one-quarter of the experiments in ArrayExpress. The majority are based on Arabidopsis (Arabidopsis thaliana); however, there are other data sets based on Triticum aestivum, Hordeum vulgare, and Populus subsp. AtMIAMExpress is an open-source Web-based software application for the submission of Arabidopsis-based microarray data to ArrayExpress. AtMIAMExpress exports data in MAGE-ML format for upload to any MAGE-ML-compliant application, such as J-Express and ArrayExpress. It was designed as a tool for users with minimal bioinformatics expertise, has comprehensive help and user support, and represents a simple solution to meeting the MIAME guidelines for the Arabidopsis community. Plant data are queryable both in ArrayExpress and in the Data Warehouse databases, which support queries based on gene-centric and sample-centric annotation. The AtMIAMExpress submission tool is available at http://www.ebi.ac.uk/at-miamexpress/. The software is open source and is available from http://sourceforge.net/projects/miamexpress/. For information, contact firstname.lastname@example.org.