ABSTRACT: Portraying high-throughput genomics research as a wild frontier, Andrea Bild and colleagues use caricatures to highlight common pitfalls in genomic research and provide recommendations for navigating this terrain.
ABSTRACT: Over the past two decades, many biotechnology platforms have been developed for high-throughput gene expression profiling. However, because each platform is subject to technology-specific biases and produces distinct raw-data distributions, researchers have experienced difficulty in integrating data across platforms. Data integration is crucial to data-generating consortiums, researchers transitioning to newer profiling technologies, and individuals seeking to aggregate data across experiments. We address this need with our Universal exPression Code (UPC) approach, which corrects for platform-specific background noise using models that account for the genomic base composition and length of target regions; this approach also uses a mixture model to estimate whether a gene is active in a particular profiling sample. The latter produces standardized UPC values on a zero-to-one scale, so that they can be interpreted consistently, irrespective of profiling technology, thus enabling downstream analysis pipelines to be developed in a platform-agnostic manner. The UPC method can be applied to one- and two-channel expression microarrays and to next-generation sequencing data (RNA sequencing). Furthermore, UPCs are derived using information from within a given sample only; no ancillary samples are required at processing time. Thus, UPCs are suitable for personalized-medicine workflows where samples must be processed individually rather than in batches. In a variety of analyses and comparisons, UPCs perform comparably to other methods designed specifically for microarrays or RNA sequencing in most settings. Software for calculating UPCs is freely available at www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html.
Proceedings of the National Academy of Sciences 10/2013; · 9.74 Impact Factor
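The mixture-model idea in the UPC abstract above (estimating, within a single sample, whether each gene is active and reporting the result on a zero-to-one scale) can be illustrated with a small sketch. This is not the SCAN.UPC implementation: the two Gaussian components, the EM fitting loop, and the function name `active_probabilities` are all assumptions made for illustration, and the platform-specific background correction described in the abstract is omitted entirely.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def active_probabilities(values, iterations=100):
    """Fit a two-component Gaussian mixture by EM (an illustrative choice).

    Returns, for each value, the posterior probability of belonging to the
    higher-mean ("active") component, i.e. a score between 0 and 1.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = n // 2
    # Initialise the component means from the lower and upper halves.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    overall = sum(values) / n
    spread = math.sqrt(sum((v - overall) ** 2 for v in values) / n)
    sds = [max(spread, 1e-3), max(spread, 1e-3)]
    weights = [0.5, 0.5]
    post = [0.5] * n
    for _ in range(iterations):
        # E-step: posterior probability of the "active" component.
        post = []
        for v in values:
            p0 = weights[0] * normal_pdf(v, means[0], sds[0])
            p1 = weights[1] * normal_pdf(v, means[1], sds[1])
            total = p0 + p1
            post.append(p1 / total if total > 0 else 0.5)
        # M-step: update weights, means, and standard deviations.
        n1 = sum(post)
        n0 = n - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break
        means = [
            sum((1 - p) * v for p, v in zip(post, values)) / n0,
            sum(p * v for p, v in zip(post, values)) / n1,
        ]
        sds = [
            max(math.sqrt(sum((1 - p) * (v - means[0]) ** 2
                              for p, v in zip(post, values)) / n0), 1e-3),
            max(math.sqrt(sum(p * (v - means[1]) ** 2
                              for p, v in zip(post, values)) / n1), 1e-3),
        ]
        weights = [n0 / n, n1 / n]
    # Keep the convention that the returned score refers to the higher mean.
    if means[1] < means[0]:
        post = [1 - p for p in post]
    return post


if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated single sample: mostly background genes plus some active genes.
    sample = [rng.gauss(2.0, 0.5) for _ in range(20)] + \
             [rng.gauss(8.0, 1.0) for _ in range(10)]
    for value, score in list(zip(sample, active_probabilities(sample)))[:5]:
        print(f"expression={value:.2f}  score={score:.3f}")
```

Because the fit uses only the values from one sample, the sketch shares the single-sample property emphasised in the abstract, although the real method's models and bias corrections differ.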
ABSTRACT: Alterations in epigenetic marks, including methylation or acetylation, are common in human cancers. For many epigenetic pathways, however, direct measures of activity are unknown, making their role in various cancers difficult to assess. Gene expression signatures facilitate the examination of patterns of epigenetic pathway activation across and within human cancer types, allowing better understanding of the relationships between these pathways.
We used Bayesian regression to generate gene expression signatures from normal epithelial cells before and after epigenetic pathway activation. Signatures were applied to datasets from TCGA, GEO, CaArray, ArrayExpress, and the cancer cell line encyclopedia. For TCGA data, signature results were correlated with copy number variation and DNA methylation changes. GSEA was used to identify biologic pathways related to the signatures.
We developed and validated signatures reflecting downstream effects of enhancer of zeste homolog 2 (EZH2), histone deacetylase (HDAC) 1, HDAC4, sirtuin 1 (SIRT1), and DNA methyltransferase 2 (DNMT2). By applying these signatures to data from cancer cell lines and tumors in large public repositories, we identify those cancers that have the highest and lowest activation of each of these pathways. Highest EZH2 activation is seen in neuroblastoma, hepatocellular carcinoma, small cell lung cancer, and melanoma, while highest HDAC activity is seen in pharyngeal cancer, kidney cancer, and pancreatic cancer. Across all datasets studied, activation of both EZH2 and HDAC4 is significantly underrepresented. Using breast cancer and glioblastoma as examples to examine intrinsic subtypes of particular cancers, EZH2 activation was highest in luminal breast cancers and proneural glioblastomas, while HDAC4 activation was highest in basal breast cancer and mesenchymal glioblastoma. EZH2 and HDAC4 activation are associated with particular chromosome abnormalities: EZH2 activation with aberrations in genes from the TGF and phosphatidylinositol pathways and HDAC4 activation with aberrations in inflammatory and chemokine-related genes.
Gene expression patterns can reveal the activation level of epigenetic pathways. Epigenetic pathways define biologically relevant subsets of human cancers. EZH2 activation and HDAC4 activation correlate with growth factor signaling and inflammation, respectively, and represent two distinct states for cancer cells. This understanding may allow us to identify targetable drivers in these cancer subsets.
BMC Medical Genomics 09/2013; 6(1):35. · 3.47 Impact Factor
ABSTRACT: RATIONALE: Molecular phenotyping of COPD has been impeded in part by the difficulty in obtaining lung tissue samples from individuals with impaired lung function. OBJECTIVES: We sought to determine whether COPD-associated processes are reflected in gene-expression profiles of bronchial airway epithelial cells obtained via bronchoscopy. METHODS: Gene expression profiling of bronchial brushings obtained from 238 current and former smokers with and without COPD was performed using Affymetrix Human Gene 1.0 ST Arrays. MEASUREMENTS AND MAIN RESULTS: We identified 98 genes whose expression levels were associated with COPD status, FEV1% predicted, and FEV1/FVC. In silico analysis identified ATF4 as a potential transcriptional regulator of genes with COPD-associated airway expression, and ATF4 overexpression in airway epithelial cells in vitro recapitulates COPD-associated gene expression changes. Genes with COPD-associated expression in the bronchial airway epithelium had similarly altered expression profiles in prior studies performed on small-airway epithelium and lung parenchyma, suggesting that transcriptomic alterations in the bronchial airway epithelium reflect molecular events found at more distal sites of disease activity. Many of the airway COPD-associated gene expression changes revert toward baseline following therapy with the inhaled corticosteroid fluticasone in independent cohorts. CONCLUSIONS: Our findings demonstrate a molecular field of injury throughout the bronchial airway of active and former smokers with COPD that may be driven in part by ATF4 and is modifiable with therapy. Bronchial airway epithelium may therefore ultimately serve as a relatively accessible tissue in which to measure biomarkers of disease activity for guiding clinical management of COPD.
American Journal of Respiratory and Critical Care Medicine 03/2013; · 11.04 Impact Factor
ABSTRACT: Cigarette smoke produces a molecular "field of injury" in epithelial cells lining the respiratory tract. However, the specific signaling pathways that are altered in the airway of smokers and the signaling processes responsible for the transition from smoking-induced airway damage to lung cancer remain unknown. In this study, we use a genomic approach to study the signaling processes associated with tobacco smoke exposure and lung cancer. First, we developed and validated pathway-specific gene expression signatures in bronchial airway epithelium that reflect activation of signaling pathways relevant to tobacco exposure, including ATM, BCL2, GPX1, NOS2, IKBKB, and SIRT1. Using these profiles and four independent gene expression datasets, we found that SIRT1 activity is significantly up-regulated in cytologically normal airway epithelial cells from active smokers compared to non-smokers. In contrast, this activity is strikingly down-regulated in non-small cell lung cancer. This pattern of signaling modulation was unique to SIRT1, and down-regulation of SIRT1 activity is confined to tumors from smokers. Decreased activity of SIRT1 was validated using genomic analyses of mouse models of lung cancer and biochemical testing of SIRT1 activity in patient lung tumors. Together, our findings indicate a role of SIRT1 in response to smoke and a potential role in repressing lung cancer. Further, our findings suggest that the airway gene-expression signatures derived in this study can provide novel insights into signaling pathways altered in the "field of injury" induced by tobacco smoke and thus may impact strategies for prevention of tobacco-related lung cancer.
ABSTRACT: Gene-expression microarrays allow researchers to characterize biological phenomena in a high-throughput fashion but are subject to technological biases and inevitable variabilities that arise during sample collection and processing. Normalization techniques aim to correct such biases. Most existing methods require multiple samples to be processed in aggregate; consequently, each sample's output is influenced by other samples processed jointly. However, in personalized-medicine workflows, samples may arrive serially, so renormalizing all samples upon each new arrival would be impractical. We have developed Single Channel Array Normalization (SCAN), a single-sample technique that models the effects of probe-nucleotide composition on fluorescence intensity and corrects for such effects, dramatically increasing the signal-to-noise ratio within individual samples while decreasing variation across samples. In various benchmark comparisons, we show that SCAN performs as well as or better than competing methods yet has no dependence on external reference samples and can be applied to any single-channel microarray platform.
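The core idea in the SCAN abstract above, modelling how probe nucleotide composition affects intensity within one sample and then removing that effect, can be sketched as follows. This is a deliberately simplified stand-in, not the SCAN algorithm: it assumes a single linear trend of log2 intensity on GC fraction, fitted by ordinary least squares, and the function names `gc_fraction` and `correct_probe_composition` are hypothetical.

```python
import math


def gc_fraction(sequence):
    """Fraction of G and C bases in a probe sequence."""
    seq = sequence.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def correct_probe_composition(sequences, intensities):
    """Remove a linear GC-content trend from log2 intensities of one sample.

    Returns residuals scaled by the residual standard deviation, so the
    output depends only on probes from the sample being processed.
    """
    n = len(sequences)
    if n != len(intensities) or n < 3:
        raise ValueError("need at least three probes with matching intensities")
    x = [gc_fraction(s) for s in sequences]
    y = [math.log2(v) for v in intensities]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Ordinary least-squares fit of log2 intensity on GC fraction.
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    sd = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    if sd == 0:
        return [0.0] * n
    return [r / sd for r in residuals]


if __name__ == "__main__":
    # Toy 25-mer probes whose raw intensity rises with GC content.
    probes = ["A" * (25 - k) + "G" * k for k in range(0, 26, 5)]
    raw = [2 ** (5 + 4 * gc_fraction(p) + (0.2 if i % 2 else -0.2))
           for i, p in enumerate(probes)]
    for probe, value in zip(probes, correct_probe_composition(probes, raw)):
        print(f"GC={gc_fraction(probe):.2f}  corrected={value:+.2f}")
```

After correction, the residuals are uncorrelated with GC fraction by construction, which is the sense in which the composition effect has been removed in this toy setting.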
ABSTRACT: We leverage genomic and biochemical data to identify synergistic drug regimens for breast cancer. In order to study the mechanism of the histone deacetylase (HDAC) inhibitors valproic acid (VPA) and suberoylanilide hydroxamic acid (SAHA) in breast cancer, we generated and validated genomic profiles of drug response using a series of breast cancer cell lines sensitive to each drug. These genomic profiles were then used to model drug response in human breast tumors and show significant correlation between VPA and SAHA response profiles in multiple breast tumor data sets, highlighting their similar mechanism of action. The genes deregulated by VPA and SAHA converge on the cell cycle pathway (Bayes factors 5.21 and 5.94, respectively; P-values 10^-8.6 and 10^-9, respectively). In particular, VPA and SAHA upregulate key cyclin-dependent kinase (CDK) inhibitors. In two independent datasets, cancer cells treated with CDK inhibitors have similar gene expression profile changes to the cellular response to HDAC inhibitors. Together, these results led us to hypothesize that VPA and SAHA may interact synergistically with CDK inhibitors such as PD-033299. Experiments show that HDAC and CDK inhibitors have statistically significant synergy in both breast cancer cell lines and primary 3-dimensional cultures of cells from pleural effusions of patients. Therefore, synergistic relationships between HDAC and CDK inhibitors may provide an effective combinatorial regimen for breast cancer. Importantly, these studies provide an example of how genomic analysis of drug-response profiles can be used to design rational drug combinations for cancer treatment. The Pharmacogenomics Journal advance online publication, 15 November 2011; doi:10.1038/tpj.2011.48.
The Pharmacogenomics Journal 11/2011; · 5.13 Impact Factor
ABSTRACT: Identifying the best drug for each cancer patient requires an efficient individualized strategy. We present MATCH (Merging genomic and pharmacologic Analyses for Therapy CHoice), an approach using public genomic resources and drug testing of fresh tumor samples to link drugs to patients. Valproic acid (VPA) is highlighted as a proof-of-principle. In order to predict specific tumor types with high probability of drug sensitivity, we create drug response signatures using publicly available gene expression data and assess sensitivity in a data set of >40 cancer types. Next, we evaluate drug sensitivity in matched tumor and normal tissue and exclude cancer types that are no more sensitive than normal tissue. From these analyses, breast tumors are predicted to be sensitive to VPA. A meta-analysis across breast cancer data sets shows that aggressive subtypes are most likely to be sensitive to VPA, but all subtypes have sensitive tumors. MATCH predictions correlate significantly with growth inhibition in cancer cell lines and three-dimensional cultures of fresh tumor samples. MATCH accurately predicts reduction in tumor growth rate following VPA treatment in patient tumor xenografts. MATCH uses genomic analysis with in vitro testing of patient tumors to select optimal drug regimens before clinical trial initiation.
Molecular Systems Biology 07/2011; 7:513. · 11.34 Impact Factor
ABSTRACT: To the Editor: We would like to retract our article, "A Genomic Strategy to Refine Prognosis in Early-Stage Non-Small-Cell Lung Cancer,"(1) which was published in the Journal on August 10, 2006. Using a sample set from a study by the American College of Surgeons Oncology Group (ACOSOG) and a collection of samples from a study by the Cancer and Leukemia Group B (CALGB), we have tried and failed to reproduce results supporting the validation of the lung metagene model described in the article. We deeply regret the effect of this action on the work of other investigators.
New England Journal of Medicine 03/2011; 364(12):1176. · 51.66 Impact Factor
ABSTRACT: Although only a subset of smokers develop lung cancer, we cannot determine which smokers are at highest risk for cancer development, nor do we know the signaling pathways altered early in the process of tumorigenesis in these individuals. On the basis of the concept that cigarette smoke creates a molecular field of injury throughout the respiratory tract, this study explores oncogenic pathway deregulation in cytologically normal proximal airway epithelial cells of smokers at risk for lung cancer. We observed a significant increase in a genomic signature of phosphatidylinositol 3-kinase (PI3K) pathway activation in the cytologically normal bronchial airway of smokers with lung cancer and smokers with dysplastic lesions, suggesting that PI3K is activated in the proximal airway before tumorigenesis. Further, PI3K activity is decreased in the airway of high-risk smokers who had significant regression of dysplasia after treatment with the chemopreventive agent myo-inositol, and myo-inositol inhibits the PI3K pathway in vitro. These results suggest that deregulation of the PI3K pathway in the bronchial airway epithelium of smokers is an early, measurable, and reversible event in the development of lung cancer and that genomic profiling of these relatively accessible airway cells may enable personalized approaches to chemoprevention and therapy. Our work further suggests that additional lung cancer chemoprevention trials either targeting the PI3K pathway or measuring airway PI3K activation as an intermediate endpoint are warranted.
Science Translational Medicine 04/2010; 2(26):26ra25. · 10.76 Impact Factor
ABSTRACT: Perhaps the major challenge in developing more effective therapeutic strategies for the treatment of breast cancer patients is confronting the heterogeneity of the disease, recognizing that breast cancer is not one disease but multiple disorders with distinct underlying mechanisms. Gene-expression profiling studies have been used to dissect this complexity, and our previous studies identified a series of intrinsic subtypes of breast cancer that define distinct populations of patients with respect to survival. Additional work has also used signatures of oncogenic pathway deregulation to dissect breast cancer heterogeneity as well as to suggest therapeutic opportunities linked to pathway activation.
We used genomic analyses to identify relations between breast cancer subtypes, pathway deregulation, and drug sensitivity. For these studies, we use three independent breast cancer gene-expression data sets to measure an individual tumor phenotype. Correlations between pathway status and subtype are examined and linked to predictions for response to conventional chemotherapies.
We reveal patterns of pathway activation characteristic of each molecular breast cancer subtype, including within the more aggressive subtypes in which novel therapeutic opportunities are critically needed. Whereas some oncogenic pathways have high correlations to breast cancer subtype (RAS, CTNNB1, p53, HER1), others have high variability of activity within a specific subtype (MYC, E2F3, SRC), reflecting biology independent of common clinical factors. Additionally, we combined these analyses with predictions of sensitivity to commonly used cytotoxic chemotherapies to provide additional opportunities for therapeutics specific to the intrinsic subtype that might be better aligned with the characteristics of the individual patient.
Genomic analyses can be used to dissect the heterogeneity of breast cancer. We use an integrated analysis of breast cancer that combines independent methods of genomic analyses to highlight the complexity of signaling pathways underlying different breast cancer phenotypes and to identify optimal therapeutic opportunities.
Breast Cancer Research: BCR 08/2009; 11(4):R55. · 5.87 Impact Factor
ABSTRACT: Recent studies have emphasized the importance of pathway-specific interpretations for understanding the functional relevance of gene alterations in human cancers. Although signaling activities are often conceptualized as linear events, in reality, they reflect the activity of complex functional networks assembled from modules that each respond to input signals. To acquire a deeper understanding of this network structure, we developed an approach to deconstruct pathways into modules represented by gene expression signatures. Our studies confirm that they represent units of underlying biological activity linked to known biochemical pathway structures. Importantly, we show that these signaling modules provide tools to dissect the complexity of oncogenic states that define disease outcomes as well as response to pathway-specific therapeutics. We propose that this model of pathway structure constitutes a framework to study the processes by which information propagates through cellular networks and to elucidate the relationships of fundamental modules to cellular and clinical phenotypes.
ABSTRACT: The Emu-myc transgenic mouse has provided a valuable model for the study of B-cell lymphoma. Making use of gene expression analysis and, in particular, expression signatures of cell signaling pathway activation, we now show that several forms of B lymphoma can be identified in the Emu-myc mice associated with time of tumor onset. Furthermore, one form of Emu-myc tumor with pre-B character is shown to resemble human Burkitt lymphoma, whereas others exhibit more differentiated B-cell characteristics and show similarity with human diffuse large B-cell lymphoma in the pattern of gene expression, as well as oncogenic pathway activation. Importantly, we show that signatures of oncogenic pathway activity provide further dissection of the spectrum of diffuse large B-cell lymphoma, identifying a subset of patients who have very poor prognosis and could benefit from more aggressive or novel therapeutic strategies. Taken together, these studies provide insight into the complexity of the oncogenic process and a novel strategy for dissecting the heterogeneity of B lymphoma.
Cancer Research 11/2008; 68(20):8525-34. · 8.65 Impact Factor
ABSTRACT: The purpose of this study was to develop an integrated genomic-based approach to personalized treatment of patients with advanced-stage ovarian cancer. We have used gene expression profiles to identify patients likely to be resistant to primary platinum-based chemotherapy and also to identify alternate targeted therapeutic options for patients with de novo platinum-resistant disease.
A gene expression model that predicts response to platinum-based therapy was developed using a training set of 83 advanced-stage serous ovarian cancers and tested on a 36-sample external validation set. In parallel, expression signatures that define the status of oncogenic signaling pathways were evaluated in 119 primary ovarian cancers and 12 ovarian cancer cell lines. In an effort to increase chemotherapy sensitivity, pathways shown to be activated in platinum-resistant cancers were subject to targeted therapy in ovarian cancer cell lines.
Gene expression profiles identified patients with ovarian cancer likely to be resistant to primary platinum-based chemotherapy with greater than 80% accuracy. In patients with platinum-resistant disease, we identified expression signatures consistent with activation of Src and Rb/E2F pathways, components of which were successfully targeted to increase response in ovarian cancer cell lines.
We have defined a strategy for treatment of patients with advanced-stage ovarian cancer that uses therapeutic stratification based on predictions of response to chemotherapy, coupled with prediction of oncogenic pathway deregulation, as a method to direct the use of targeted agents.
Journal of Clinical Oncology 03/2007; 25(5):517-25. · 18.04 Impact Factor
ABSTRACT: Functions encoded by single genes in lower organisms are often represented by multiple related genes in the mammalian genome. An example is the retinoblastoma and E2F families of proteins that regulate transcription during the cell cycle. Analysis of gene function using germline mutations is often confounded by overlapping function resulting in compensation. Indeed, in cells deleted of the E2F1 or E2F3 genes, there is an increase in the expression of the other family member. To avoid complications of compensatory effects, we have used small-interfering RNAs that target individual E2F proteins to generate a temporary loss of E2F function. We find that both E2F1 and E2F3 are required for cells to enter the S phase from a quiescent state, whereas only E2F3 is necessary for the S phase in growing cells. We also find that the acute loss of E2F3 activity affects the expression of genes encoding DNA replication and mitotic activities, whereas loss of E2F1 affects a limited number of genes that are distinct from those regulated by E2F3. We conclude that the long-term loss of E2F activity does lead to compensation by other family members and that the analysis of acute loss of function reveals specific and distinct roles for these proteins.