Timothy R Hughes

University of Helsinki, Helsinki, Province of Southern Finland, Finland

Publications (101) · 1285.21 Total impact

  • Article: A compendium of nucleosome and transcript profiles reveals determinants of chromatin architecture and transcription.
    ABSTRACT: Nucleosomes in all eukaryotes examined to date adopt a characteristic architecture within genes and play fundamental roles in regulating transcription, yet the identity and precise roles of many of the trans-acting factors responsible for the establishment and maintenance of this organization remain to be identified. We profiled a compendium of 50 yeast strains carrying conditional alleles or complete deletions of genes involved in transcriptional regulation, histone biology, and chromatin remodeling, as well as compounds that target transcription and histone deacetylases, to assess their respective roles in nucleosome positioning and transcription. We find that nucleosome patterning in genes is affected by many factors, including the CAF-1 complex, Spt10, and Spt21, in addition to previously reported remodeler ATPases and histone chaperones. Disruption of these factors or reductions in histone levels led genic nucleosomes to assume positions more consistent with their intrinsic sequence preferences, with pronounced and specific shifts of the +1 nucleosome relative to the transcription start site. These shifts of +1 nucleosomes appear to have functional consequences, as several affected genes in Ino80 mutants exhibited altered expression responses. Our parallel expression profiling compendium revealed extensive transcription changes in intergenic and antisense regions, most of which occur in regions with altered nucleosome occupancy and positioning. We show that the nucleosome-excluding transcription factors Reb1, Abf1, Tbf1, and Rsc3 suppress cryptic transcripts at their target promoters, while a combined analysis of nucleosome and expression profiles identified 36 novel transcripts that are normally repressed by Tup1/Cyc8. Our data confirm and extend the roles of chromatin remodelers and chaperones as major determinants of genic nucleosome positioning, and these data provide a valuable resource for future studies.
    PLoS Genetics 05/2013; 9(5):e1003479. · 8.69 Impact Factor
  • Article: DNA-Binding Specificities of Human Transcription Factors.
    ABSTRACT: Although the proteins that read the gene regulatory code, transcription factors (TFs), have been largely identified, it is not well known which sequences TFs can recognize. We have analyzed the sequence-specific binding of human TFs using high-throughput SELEX and ChIP sequencing. A total of 830 binding profiles were obtained, describing 239 distinctly different binding specificities. The models represent the majority of human TFs, approximately doubling the coverage compared to existing systematic studies. Our results reveal additional specificity determinants for a large number of factors for which a partial specificity was known, including a commonly observed A- or T-rich stretch that flanks the core motifs. Global analysis of the data revealed that homodimer orientation and spacing preferences, and base-stacking interactions, have a larger role in TF-DNA binding than previously appreciated. We further describe a binding model incorporating these features that is required to understand binding of TFs to DNA.
    Cell 01/2013; 152(1-2):327-339. · 32.40 Impact Factor
  • Article: The draft genome and transcriptome of Cannabis sativa.
    ABSTRACT: Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics.
    Genome biology 10/2011; 12(10):R102. · 6.63 Impact Factor
  • Article: An alternative splicing switch regulates embryonic stem cell pluripotency and reprogramming.
    ABSTRACT: Alternative splicing (AS) is a key process underlying the expansion of proteomic diversity and the regulation of gene expression. Here, we identify an evolutionarily conserved embryonic stem cell (ESC)-specific AS event that changes the DNA-binding preference of the forkhead family transcription factor FOXP1. We show that the ESC-specific isoform of FOXP1 stimulates the expression of transcription factor genes required for pluripotency, including OCT4, NANOG, NR5A2, and GDF3, while concomitantly repressing genes required for ESC differentiation. This isoform also promotes the maintenance of ESC pluripotency and contributes to efficient reprogramming of somatic cells into induced pluripotent stem cells. These results reveal a pivotal role for an AS event in the regulation of pluripotency through the control of critical ESC-specific transcriptional programs.
    Cell 09/2011; 147(1):132-46. · 32.40 Impact Factor
  • Article: Response to "the reality of pervasive transcription".
    PLoS Biology 07/2011; 9(7):e1001102. · 11.45 Impact Factor
  • Article: Sequence specificity is obtained from the majority of modular C2H2 zinc-finger arrays.
    ABSTRACT: C2H2 zinc fingers (C2H2-ZFs) are the most prevalent type of vertebrate DNA-binding domain, and typically appear in tandem arrays (ZFAs), with sequential C2H2-ZFs each contacting three (or more) sequential bases. C2H2-ZFs can be assembled in a modular fashion, providing one explanation for their remarkable evolutionary success. Given a set of modules with defined three-base specificities, modular assembly also presents a way to construct artificial proteins with specific DNA-binding preferences. However, a recent survey of a large number of three-finger ZFAs engineered by modular assembly reported high failure rates (∼70%), casting doubt on the generality of modular assembly. Here, we used protein-binding microarrays to analyze 28 ZFAs that failed in the aforementioned study. Most (17) preferred specific sequences, which in all but one case resembled the intended target sequence. Like natural ZFAs, the engineered ZFAs typically yielded degenerate motifs, binding dozens to hundreds of related individual sequences. Thus, the failure of these proteins in previous assays is not due to lack of sequence-specific DNA-binding activity. Our findings underscore the relevance of individual C2H2-ZF sequence specificities within tandem arrays, and support the general ability of modular assembly to produce ZFAs with sequence-specific DNA-binding activity.
    Nucleic Acids Research 02/2011; 39(11):4680-90. · 8.03 Impact Factor
  • Article: Jury remains out on simple models of transcription factor specificity.
    Quaid Morris, Martha L Bulyk, Timothy R Hughes
    Nature Biotechnology 01/2011; 29(6):483-4. · 29.50 Impact Factor
  • Article: Analysis of Escherichia coli RNase E and RNase III activity in vivo using tiling microarrays.
    ABSTRACT: Tiling microarrays have proven to be a valuable tool for gaining insights into the transcriptomes of microbial organisms grown under various nutritional or stress conditions. Here, we describe the use of such an array, constructed at the level of 20 nt resolution for the Escherichia coli MG1655 genome, to observe genome-wide changes in the steady-state RNA levels in mutants defective in either RNase E or RNase III. The array data were validated by comparison to previously published results for a variety of specific transcripts as well as independent northern analysis of additional mRNAs and sRNAs. In the absence of RNase E, 60% of the annotated coding sequences showed either increases or decreases in their steady-state levels. In contrast, only 12% of the coding sequences were affected in the absence of RNase III. Unexpectedly, many coding sequences showed decreased abundance in the RNase E mutant, while more than half of the annotated sRNAs showed changes in abundance. Furthermore, the steady-state levels of many transcripts showed overlapping effects of both ribonucleases. Data are also presented demonstrating how the arrays were used to identify potential new genes, RNase III cleavage sites and the direct or indirect control of specific biological pathways.
    Nucleic Acids Research 12/2010; 39(8):3188-203. · 8.03 Impact Factor
  • Article: Objective sequence-based subfamily classifications of mouse homeodomains reflect their in vitro DNA-binding preferences.
    ABSTRACT: Classifying proteins into subgroups with similar molecular function on the basis of sequence is an important step in deriving reliable functional annotations computationally. So far, however, available classification procedures have been evaluated against protein subgroups that are defined by experts using mainly qualitative descriptions of molecular function. Recently, in vitro DNA-binding preferences to all possible 8-nt DNA sequences have been measured for 178 mouse homeodomains using protein-binding microarrays, offering the unprecedented opportunity of evaluating the classification methods against quantitative measures of molecular function. To this end, we automatically derive homeodomain subtypes from the DNA-binding data and independently group the same domains using sequence information alone. We test five sequence-based methods, which use different sequence-similarity measures and algorithms to group sequences. Results show that methods that optimize the classification robustness reflect well the detailed functional specificity revealed by the experimental data. In some of these classifications, 73-83% of the subfamilies exactly correspond to, or are completely contained in, the function-based subtypes. Our findings demonstrate that certain sequence-based classifications are capable of yielding very specific molecular function annotations. The availability of quantitative descriptions of molecular function, such as DNA-binding data, will be a key factor in exploiting this potential in the future.
    Nucleic Acids Research 12/2010; 38(22):7927-42. · 8.03 Impact Factor
  • Article: Contribution of histone sequence preferences to nucleosome organization: proposed definitions and methodology.
    ABSTRACT: We propose definitions and procedures for comparing nucleosome maps and discuss current agreement and disagreement on the effect of histone sequence preferences on nucleosome organization in vivo.
    Genome biology 11/2010; 11(11):140. · 6.63 Impact Factor
  • Article: RBPDB: a database of RNA-binding specificities.
    ABSTRACT: The RNA-Binding Protein DataBase (RBPDB) is a collection of experimental observations of RNA-binding sites, both in vitro and in vivo, manually curated from primary literature. To build RBPDB, we performed a literature search for experimental binding data for all RNA-binding proteins (RBPs) with known RNA-binding domains in four metazoan species (human, mouse, fly and worm). In total, RBPDB contains binding data on 272 RBPs, including 71 that have motifs in position weight matrix format, and 36 sets of sequences of in vivo-bound transcripts from immunoprecipitation experiments. The database is accessible by a web interface which allows browsing by domain or by organism, searching and export of records, and bulk data downloads. Users can also use RBPDB to scan sequences for RBP-binding sites. RBPDB is freely available, without registration, at http://rbpdb.ccbr.utoronto.ca/.
    Nucleic Acids Research 10/2010; 39(Database issue):D301-8. · 8.03 Impact Factor
  • Article: Nucleosome sequence preferences influence in vivo nucleosome organization.
    Nature Structural & Molecular Biology 08/2010; 17(8):918-20; author reply 920-2. · 12.71 Impact Factor
  • Article: FuncBase: a resource for quantitative gene function annotation.
    ABSTRACT: SUMMARY: Computational gene function prediction can serve to focus experimental resources on high-priority experimental tasks. FuncBase is a web resource for viewing quantitative machine learning-based gene function annotations. Quantitative annotations of genes, including fungal and mammalian genes, with Gene Ontology terms are accompanied by a community feedback system. Evidence underlying function annotations is shown. For example, a custom Cytoscape viewer shows functional linkage graphs relevant to the gene or function of interest. FuncBase provides links to external resources, and may be accessed directly or via links from species-specific databases. AVAILABILITY: FuncBase as well as all underlying data and annotations are freely available via http://func.med.harvard.edu/
    Bioinformatics 07/2010; 26(14):1806-7. · 5.47 Impact Factor
  • Article: Genome-wide analysis of ETS-family DNA-binding in vitro and in vivo.
    ABSTRACT: Members of the large ETS family of transcription factors (TFs) have highly similar DNA-binding domains (DBDs), yet they have diverse functions and activities in physiology and oncogenesis. Some differences in DNA-binding preferences within this family have been described, but they have not been analysed systematically, and their contributions to targeting remain largely uncharacterized. We report here the DNA-binding profiles for all human and mouse ETS factors, which we generated using two different methods: a high-throughput microwell-based TF DNA-binding specificity assay, and protein-binding microarrays (PBMs). Both approaches reveal that the ETS-binding profiles cluster into four distinct classes, and that all ETS factors linked to cancer, ERG, ETV1, ETV4 and FLI1, fall into just one of these classes. We identify amino-acid residues that are critical for the differences in specificity between all the classes, and confirm the specificities in vivo using chromatin immunoprecipitation followed by sequencing (ChIP-seq) for a member of each class. The results indicate that even relatively small differences in in vitro binding specificity of a TF contribute to site selectivity in vivo.
    The EMBO Journal 07/2010; 29(13):2147-60. · 9.20 Impact Factor
  • Article: Most "dark matter" transcripts are associated with known genes.
    ABSTRACT: A series of reports over the last few years have indicated that a much larger portion of the mammalian genome is transcribed than can be accounted for by currently annotated genes, but the quantity and nature of these additional transcripts remains unclear. Here, we have used data from single- and paired-end RNA-Seq and tiling arrays to assess the quantity and composition of transcripts in PolyA+ RNA from human and mouse tissues. Relative to tiling arrays, RNA-Seq identifies many fewer transcribed regions ("seqfrags") outside known exons and ncRNAs. Most nonexonic seqfrags are in introns, raising the possibility that they are fragments of pre-mRNAs. The chromosomal locations of the majority of intergenic seqfrags in RNA-Seq data are near known genes, consistent with alternative cleavage and polyadenylation site usage, promoter- and terminator-associated transcripts, or new alternative exons; indeed, reads that bridge splice sites identified 4,544 new exons, affecting 3,554 genes. Most of the remaining seqfrags correspond to either single reads that display characteristics of random sampling from a low-level background or several thousand small transcripts (median length = 111 bp) present at higher levels, which also tend to display sequence conservation and originate from regions with open chromatin. We conclude that, while there are bona fide new intergenic transcripts, their number and abundance is generally low in comparison to known exons, and the genome is not as pervasively transcribed as previously reported.
    PLoS Biology 05/2010; 8(5):e1000371. · 11.45 Impact Factor
  • Article: Conservation and regulatory associations of a wide affinity range of mouse transcription factor binding sites.
    ABSTRACT: Sequence-specific binding by transcription factors (TFs) interprets regulatory information encoded in the genome. Using recently published universal protein binding microarray (PBM) data on the in vitro DNA binding preferences of these proteins for all possible 8-base-pair sequences, we examined the evolutionary conservation and enrichment within putative regulatory regions of the binding sequences of a diverse library of 104 nonredundant mouse TFs spanning 22 different DNA-binding domain structural classes. We found that not only high affinity binding sites, but also numerous moderate and low affinity binding sites, are under negative selection in the mouse genome. These 8-mers occur preferentially in putative regulatory regions of the mouse genome, including CpG islands and non-exonic ultraconserved elements (UCEs). Of TFs whose PBM "bound" 8-mers are enriched within sets of tissue-specific UCEs, many are expressed in the same tissue(s) as the UCE-driven gene expression. Phylogenetically conserved motif occurrences of various TFs were also enriched in the noncoding sequence surrounding numerous gene sets corresponding to Gene Ontology categories and tissue-specific gene expression clusters, suggesting involvement in transcriptional regulation of those genes. Altogether, our results indicate that many of the sequences bound by these proteins in vitro, including lower affinity DNA sequences, are likely to be functionally important in vivo. This study not only provides an initial analysis of the potential regulatory associations of 104 mouse TFs, but also presents an approach for the functional analysis of TFs from any other metazoan genome as their DNA binding preferences are determined by PBMs or other technologies.
    Genomics 04/2010; 95(4):185-95. · 3.02 Impact Factor
  • Article: Multiplexed massively parallel SELEX for characterization of human transcription factor binding specificities.
    ABSTRACT: The genetic code, the binding specificity of all transfer-RNAs, defines how protein primary structure is determined by DNA sequence. DNA also dictates when and where proteins are expressed, and this information is encoded in a pattern of specific sequence motifs that are recognized by transcription factors. However, the DNA-binding specificity is only known for a small fraction of the approximately 1400 human transcription factors (TFs). We describe here a high-throughput method for analyzing transcription factor binding specificity that is based on systematic evolution of ligands by exponential enrichment (SELEX) and massively parallel sequencing. The method is optimized for analysis of large numbers of TFs in parallel through the use of affinity-tagged proteins, barcoded selection oligonucleotides, and multiplexed sequencing. Data are analyzed by a new bioinformatic platform that uses the hundreds of thousands of sequencing reads obtained to control the quality of the experiments and to generate binding motifs for the TFs. The described technology allows higher throughput and identification of much longer binding profiles than current microarray-based methods. In addition, as our method is based on proteins expressed in mammalian cells, it can also be used to characterize DNA-binding preferences of full-length proteins or proteins requiring post-translational modifications. We validate the method by determining binding specificities of 14 different classes of TFs and by confirming the specificities for NFATC1 and RFX3 using ChIP-seq. Our results reveal unexpected dimeric modes of binding for several factors that were thought to preferentially bind DNA as monomers.
    Genome Research 04/2010; 20(6):861-73. · 13.61 Impact Factor
  • Article: Conserved expression without conserved regulatory sequence: the more things change, the more they stay the same.
    Matthew T Weirauch, Timothy R Hughes
    ABSTRACT: Regulatory regions with similar transcriptional output often have little overt sequence similarity, both within and between genomes. Although cis- and trans-regulatory changes can contribute to sequence divergence without dramatically altering gene expression outputs, heterologous DNA often functions similarly in organisms that share little regulatory sequence similarities (e.g. human DNA in fish), indicating that trans-regulatory mechanisms tend to diverge more slowly and can accommodate a variety of cis-regulatory configurations. This capacity to 'tinker' with regulatory DNA probably relates to the complexity, robustness and evolvability of regulatory systems, but cause-and-effect relationships among evolutionary processes and properties of regulatory systems remain a topic of debate. The challenge of understanding the concrete mechanisms underlying cis-regulatory evolution - including the conservation of function without the conservation of sequence - relates to the challenge of understanding the function of regulatory systems in general. Currently, we are largely unable to recognize functionally similar regulatory DNA.
    Trends in Genetics 02/2010; 26(2):66-74. · 10.06 Impact Factor
  • Article: Dramatic changes in transcription factor binding over evolutionary time.
    Matthew T Weirauch, Timothy R Hughes
    ABSTRACT: A recent study reveals a surprisingly high degree of change in the occupancy patterns of two transcription factors in the livers of five vertebrates.
    Genome biology 01/2010; 11(6):122. · 6.63 Impact Factor

Institutions

  • 2010–2013
    • University of Helsinki
      Helsinki, Province of Southern Finland, Finland
  • 2002–2011
    • University of Toronto
      • Banting and Best Department of Medical Research
      • Department of Molecular Genetics
      Toronto, Ontario, Canada
  • 2009–2010
    • Weizmann Institute of Science
      • Department of Computer Science and Applied Mathematics
      Israel
  • 2008
    • Harvard University
      • Department of Medicine, Brigham and Women's Hospital
      Boston, MA, USA
  • 2006
    • University of California, San Francisco
      • Department of Cellular and Molecular Pharmacology
      San Francisco, CA, USA