ABSTRACT: Pluripotency, the ability of a cell to differentiate and give rise to all embryonic lineages, defines a small number of mammalian cell types such as embryonic stem (ES) cells. While it has been generally held that pluripotency is the product of a transcriptional regulatory network that activates and maintains the expression of key stem cell genes, accumulating evidence is pointing to a critical role for epigenetic processes in establishing and safeguarding the pluripotency of ES cells, as well as maintaining the identity of differentiated cell types. In order to better understand the role of epigenetic mechanisms in pluripotency, we have examined the dynamics of chromatin modifications genome-wide in human ES cells (hESCs) undergoing differentiation into a mesendodermal lineage. We found that chromatin modifications at promoters remain largely invariant during differentiation, except at a small number of promoters where a dynamic switch between acetylation and methylation at H3K27 marks the transition between activation and silencing of gene expression, suggesting a hierarchy in cell fate commitment over most differentially expressed genes. We also mapped over 50 000 potential enhancers, and observed much greater dynamics in chromatin modifications, especially H3K4me1 and H3K27ac, which correlate with expression of their potential target genes. Further analysis of these enhancers revealed potentially key transcriptional regulators of pluripotency and a chromatin signature indicative of a poised state that may confer developmental competence in hESCs. Our results provide new evidence supporting the role of chromatin modifications in defining enhancers and pluripotency.
Cell Research 08/2011; 21(10):1393-409. · 10.53 Impact Factor
ABSTRACT: The human body is composed of diverse cell types with distinct functions. Although it is known that lineage specification depends on cell-specific gene expression, which in turn is driven by promoters, enhancers, insulators and other cis-regulatory DNA sequences for each gene, the relative roles of these regulatory elements in this process are not clear. We have previously developed a chromatin-immunoprecipitation-based microarray method (ChIP-chip) to locate promoters, enhancers and insulators in the human genome. Here we use the same approach to identify these elements in multiple cell types and investigate their roles in cell-type-specific gene expression. We observed that the chromatin state at promoters and CTCF-binding at insulators is largely invariant across diverse cell types. In contrast, enhancers are marked with highly cell-type-specific histone modification patterns, strongly correlate to cell-type-specific gene expression programs on a global scale, and are functionally active in a cell-type-specific manner. Our results define over 55,000 potential transcriptional enhancers in the human genome, significantly expanding the current catalogue of human enhancers and highlighting the role of these elements in cell-type-specific gene expression.
ABSTRACT: The most widely used method for detecting genome-wide protein-DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first objective analysis of tiling array platforms, amplification procedures, and signal detection algorithms in a simulated ChIP-chip experiment. Mixtures of human genomic DNA and "spike-ins" comprised of nearly 100 human sequences at various concentrations were hybridized to four tiling array platforms by eight independent groups. Blind to the number of spike-ins, their locations, and the range of concentrations, each group made predictions of the spike-in locations. We found that microarray platform choice is not the primary determinant of overall performance. In fact, variation in performance between labs, protocols, and algorithms within the same array platform was greater than the variation in performance between array platforms. However, each array platform had unique performance characteristics that varied with tiling resolution and the number of replicates, which have implications for cost versus detection power. Long oligonucleotide arrays were slightly more sensitive at detecting very low enrichment. On all platforms, simple sequence repeats and genome redundancy tended to result in false positives. LM-PCR and WGA, the most popular sample amplification techniques, reproduced relative enrichment levels with high fidelity. Performance among signal detection algorithms was heavily dependent on array platform. The spike-in DNA samples and the data presented here provide a stable benchmark against which future ChIP platforms, protocol improvements, and analysis methods can be evaluated.
Genome Research 04/2008; 18(3):393-403. · 14.40 Impact Factor
ABSTRACT: Eukaryotic gene transcription is accompanied by acetylation and methylation of nucleosomes near promoters, but the locations and roles of histone modifications elsewhere in the genome remain unclear. We determined the chromatin modification states in high resolution along 30 Mb of the human genome and found that active promoters are marked by trimethylation of Lys4 of histone H3 (H3K4), whereas enhancers are marked by monomethylation, but not trimethylation, of H3K4. We developed computational algorithms using these distinct chromatin signatures to identify new regulatory elements, predicting over 200 promoters and 400 enhancers within the 30-Mb region. This approach accurately predicted the location and function of independently identified regulatory elements with high sensitivity and specificity and uncovered a novel functional enhancer for the carnitine transporter SLC22A5 (OCTN2). Our results give insight into the connections between chromatin modifications and transcriptional regulatory activity and provide a new tool for the functional annotation of the human genome.
ABSTRACT: Insulator elements affect gene expression by preventing the spread of heterochromatin and restricting transcriptional enhancers from activating unrelated promoters. In vertebrates, insulator function requires association with the CCCTC-binding factor (CTCF), a protein that recognizes long and diverse nucleotide sequences. While insulators are critical in gene regulation, only a few have been reported. Here, we describe 13,804 CTCF-binding sites in potential insulators of the human genome, discovered experimentally in primary human fibroblasts. Most of these sequences are located far from transcriptional start sites, with their distribution strongly correlated with genes. The majority of them fit a highly conserved consensus motif that is suitable for predicting possible CTCF-driven insulators in other vertebrate genomes. In addition, CTCF localization is largely invariant across different cell types. Our results provide a resource for investigating insulator function and possible other general and evolutionarily conserved activities of CTCF sites.
ABSTRACT: Cancer of the ovary confers the worst prognosis among women with gynecologic malignancies, underscoring the need to develop new biomarkers for detection of early disease, particularly those that can be readily monitored in the blood.
We developed an algorithm to identify secreted proteins encoded among approximately 22,500 genes on commercial oligonucleotide arrays and applied it to gene expression profiles of 67 stage I to IV serous papillary carcinomas and 9 crudely enriched normal ovarian tissues, to identify putative diagnostic markers. ELISAs were used to validate increased levels of secreted proteins in patient sera encoded by genes with differentially high expression.
We identified 275 genes predicted to encode secreted proteins with increased/decreased expression in ovarian cancers (<0.5- or >2-fold, P < 0.001). The serum levels of four of these proteins (matrix metalloproteinase-7, osteopontin, secretory leukoprotease inhibitor, and kallikrein 10) were significantly elevated in a series of 67 independent patients with serous ovarian carcinomas compared with 67 healthy controls (P < 0.001, Wilcoxon rank sum test). Optimized support vector machine classifiers with as few as two of these markers (osteopontin or kallikrein 10/matrix metalloproteinase-7) in combination with CA-125 yielded sensitivity and specificity values ranging from 96% to 98.7% and 99.7% to 100%, respectively, with the ability to discern early-stage disease from normal, healthy controls.
Our data suggest that this assay combination warrants further investigation as a multi-analyte diagnostic test for serous ovarian adenocarcinoma.
Clinical Cancer Research 02/2007; 13(2 Pt 1):458-66. · 7.84 Impact Factor
ABSTRACT: The mouse N-ethyl-N-nitrosourea (ENU) mutagenesis program at the Genomics Institute of the Novartis Research Foundation (GNF) uses MouseTRACS to analyze phenotype screens and manage animal husbandry. MouseTRACS is a Web-based laboratory informatics system that electronically records and organizes mouse colony operations, prints cage cards, tracks inventory, manages requests, and reports Institutional Animal Care and Use Committee (IACUC) protocol usage. For efficient phenotype screening, MouseTRACS identifies mutants, visualizes data, and maps mutations. It displays and integrates phenotype and genotype data using logarithm of odds (LOD) plots of genetic linkage between genotype and phenotype. More detailed mapping intervals show individual single nucleotide polymorphism (SNP) markers in the context of phenotype. In addition, dynamically generated pedigree diagrams and inventory reports linked to screening results summarize the inheritance pattern and the degree of penetrance. MouseTRACS displays screening data in tables and uses standard charts such as box plots, histograms, and scatter plots, as well as customized charts for clustered mice or cross-pedigree comparisons. In summary, MouseTRACS enables the efficient screening, analysis, and management of thousands of animals to find mutant mice and identify novel gene functions. MouseTRACS is available under an open source license at http://www.mousetracs.sourceforge.net.
ABSTRACT: Half of patients treated for locally advanced bladder cancer relapse with often fatal metastatic disease to the lung. We have recently shown that reduced expression of the GDP dissociation inhibitor, RhoGDI2, is associated with decreased survival of patients with advanced bladder cancer. However, the effectors by which RhoGDI2 affects metastasis are unknown. Here we use DNA microarrays to identify genes suppressed by RhoGDI2 reconstitution in lung metastatic bladder cancer cell lines. We identify such RNAs and focus only on those that also increase with tumor stage in human bladder cancer samples to discover only clinically relevant targets of RhoGDI2. Levels of endothelin-1 (ET-1), a potent vasoconstrictor, were affected by both RhoGDI2 reconstitution and tumor stage. To test the hypothesis that the endothelin axis is important in lung metastasis, lung metastatic bladder carcinoma cells were injected in mice treated with the endothelin receptor-specific antagonist, atrasentan, thereby blocking engagement of the up-regulated ET-1 ligand with its cognate receptor. Endothelin antagonism resulted in a dramatic reduction of lung metastases, similar to the effect of reexpressing RhoGDI2 in these metastatic cells. Taken together, these experiments show a novel approach of identifying therapeutic targets downstream of metastasis suppressor genes. The data also suggest that blockade of the ET-1 axis may prevent lung metastasis, a new therapeutic concept that warrants clinical evaluation.
Cancer Research 09/2005; 65(16):7320-7. · 8.65 Impact Factor
ABSTRACT: The tissue-specific pattern of mRNA expression can indicate important clues about gene function. High-density oligonucleotide arrays offer the opportunity to examine patterns of gene expression on a genome scale. Toward this end, we have designed custom arrays that interrogate the expression of the vast majority of protein-encoding human and mouse genes and have used them to profile a panel of 79 human and 61 mouse tissues. The resulting data set provides the expression patterns for thousands of predicted genes, as well as known and poorly characterized genes, from mice and humans. We have explored this data set for global trends in gene expression, evaluated commonly used lines of evidence in gene prediction methodologies, and investigated patterns indicative of chromosomal organization of transcription. We describe hundreds of regions of correlated transcription and show that some are subject to both tissue and parental allele-specific expression, suggesting a link between spatial expression and imprinting.
Proceedings of the National Academy of Sciences 05/2004; 101(16):6062-7. · 9.74 Impact Factor
ABSTRACT: A hallmark of most neurodegenerative diseases, including those caused by polyglutamine expansion, is the formation of ubiquitin (Ub)-positive protein aggregates in affected neurons. This finding suggests that the Ub system may be involved in common mechanisms underlying these otherwise unrelated diseases. Here we report the finding of ataxin-3 (Atx-3), whose mutation is implicated in the neurodegenerative disease spinocerebellar ataxia type 3, in a bioinformatics search of the human genome for components of the Ub system. We show that wild-type Atx-3 is a Ub-binding protein and that the interaction of Atx-3 with Ub is mediated by motifs homologous to those found in a proteasome subunit. Both wild-type Atx-3 and the otherwise unrelated Ub-binding protein p62/Sequestosome-1 have been shown to be sequestered into aggregates in affected neurons in several neurodegenerative diseases, but the mechanism for this recruitment has remained unclear. In this article, we show that functional Ub-binding motifs in Atx-3 and p62 proteins are required for the localization of both proteins into aggregates in a cell-based assay that recapitulates several features of polyglutamine disease. We propose that the Ub-mediated sequestration of essential Ub-binding protein(s) into aggregates may be a common mechanism contributing to the pathogenesis of neurodegenerative diseases.
Proceedings of the National Academy of Sciences 08/2003; 100(15):8892-7. · 9.74 Impact Factor
ABSTRACT: High-throughput gene expression profiling has become an important tool for investigating transcriptional activity in a variety of biological samples. To date, the vast majority of these experiments have focused on specific biological processes and perturbations. Here, we have generated and analyzed gene expression from a set of samples spanning a broad range of biological conditions. Specifically, we profiled gene expression from 91 human and mouse samples across a diverse array of tissues, organs, and cell lines. Because these samples predominantly come from the normal physiological state in the human and mouse, this dataset represents a preliminary, but substantial, description of the normal mammalian transcriptome. We have used this dataset to illustrate methods of mining these data, and to reveal insights into molecular and physiological gene function, mechanisms of transcriptional regulation, disease etiology, and comparative genomics. Finally, to allow the scientific community to use this resource, we have built a free and publicly accessible website (http://expression.gnf.org) that integrates data visualization and curation of current gene annotations.
Proceedings of the National Academy of Sciences 05/2002; 99(7):4465-70. · 9.74 Impact Factor
ABSTRACT: The specificity of tissue transcriptional activities may be understood via high-throughput gene-expression profiling of a significant fraction of the human and mouse transcriptomes