Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters.

Genetics Genomics and Bioinformatics Graduate Program, University of California, Riverside, CA 92521, USA.
Gene (Impact Factor: 2.14). 04/2007; 389(1):52-65. DOI: 10.1016/j.gene.2006.09.029
Source: PubMed


The core promoter of eukaryotic genes is the minimal DNA region that recruits the basal transcription machinery to direct efficient and accurate transcription initiation. The fraction of human and yeast genes that contain specific core promoter elements such as the TATA box and the initiator (INR) remains unclear and core promoter motifs specific for TATA-less genes remain to be identified. Here, we present genome-scale computational analyses indicating that approximately 76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1-binding sites. We further identify two motifs - M3 (SCGGAAGY) and M22 (TGCGCANK) - that occur preferentially in human TATA-less core promoters. About 24% of human genes have a TATA-like element and their promoters are generally AT-rich; however, only approximately 10% of these TATA-containing promoters have the canonical TATA box (TATAWAWR). In contrast, approximately 46% of human core promoters contain the consensus INR (YYANWYY) and approximately 30% are INR-containing TATA-less genes. Significantly, approximately 46% of human promoters lack both TATA-like and consensus INR elements. Surprisingly, mammalian-type INR sequences are present - and tend to cluster - in the transcription start site (TSS) region of approximately 40% of yeast core promoters and the frequency of specific core promoter types appears to be conserved in yeast and human genomes. Gene Ontology analyses reveal that TATA-less genes in humans, as in yeast, are frequently involved in basic "housekeeping" processes, while TATA-containing genes are more often highly regulated, such as by biotic or stress stimuli. These results reveal unexpected similarities in the occurrence of specific core promoter types and in their associated biological processes in yeast and humans and point to novel vertebrate-specific DNA motifs that might play a selective role in TATA-independent transcription.
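The consensus sequences quoted in the abstract (TATAWAWR, YYANWYY, SCGGAAGY, TGCGCANK) use IUPAC degenerate nucleotide codes. As a minimal sketch of how such consensus strings can be matched against a promoter sequence, the Python below expands each code into a regular-expression character class and reports all match positions, including overlapping ones. The function names (`consensus_to_regex`, `find_motif`, `scan`) and the example sequences are hypothetical, only the forward strand is searched, and no positional window relative to the TSS is applied; this is an illustration, not the authors' pipeline.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Consensus strings as quoted in the abstract; the dictionary keys are
# illustrative labels chosen for this sketch.
MOTIFS = {
    "TATA": "TATAWAWR",
    "INR": "YYANWYY",
    "M3": "SCGGAAGY",
    "M22": "TGCGCANK",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regex."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # Wrap in a lookahead so overlapping occurrences are all reported.
    return re.compile("(?=(" + body + "))")


def find_motif(consensus, sequence):
    """Return 0-based start positions of the consensus on the forward strand."""
    pattern = consensus_to_regex(consensus)
    return [m.start() for m in pattern.finditer(sequence.upper())]


def scan(sequence):
    """Map each motif label to its list of match positions in the sequence."""
    return {name: find_motif(c, sequence) for name, c in MOTIFS.items()}


if __name__ == "__main__":
    # Made-up sequence containing one TATA-like match.
    print(scan("GGTATAAAAGCC"))
```

A promoter could then be labelled, for example, "INR-containing TATA-less" when `scan` returns matches for `INR` but none for `TATA`; any real analysis would also need to restrict matches to the expected position relative to the TSS.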



Available from: Ernest Martinez,
  • Source
    • "The identification of DNA binding sites for transcription factors (motifs) is important for a complete understanding of co-regulation of gene expression, but still remains to be quite challenging to achieve. Two approaches dominate motif-finding algorithms: (1) the word-based way [1] [2] [3] that relies on exhaustive enumeration or counting frequencies and (2) the probabilistic way [4] [5] [6] that relies on optimizing a scalar-based scoring matrix [7] [8], which is visualized conveniently by a sequence "
    ABSTRACT: The conventional way of identifying possible motif sequences in a DNA strand is to use a representative scalar weight matrix to search for well-matching substring alignments. However, this approach, which relies solely on match alignment information, is susceptible to a high number of ambiguous sites or false positives if the motif sequences are not well conserved. A significant amount of time is then required to verify these sites for the suggested motifs. Hence, this paper proposes using mismatch alignment information in addition to match alignment information for DNA motif searching. The objective is to reduce the number of ambiguous false positives encountered in DNA motif searching, thereby making the process more efficient for biologists to use.
    Procedia Computer Science 12/2015; 51(1):602-609. DOI:10.1016/j.procs.2015.05.328
  • Source
    • "Moreover, analysis of promoter sequences (400 bp upstream of and 100 bp downstream from the TSS) of the up-regulated genes identified binding sites for factors regulated by p38 kinase activity such as JUN, JUND, ATF1, and CREB1 (Fig. 2C; Tan et al. 1996; Gao et al. 2013). In addition, we found an enrichment for the TATA box motif in promoter sequences of the upregulated genes (Fig. 2C), in agreement with the well-established role of this motif in the regulation of signal-inducible transcription (Basehoar et al. 2004; Yang et al. 2007). "
    ABSTRACT: Selectivity of transcriptional responses to extracellular cues is reflected by the deposition of stimulus-specific chromatin marks. Although histone H3 phosphorylation is a target of numerous signaling pathways, its role in transcriptional regulation remains poorly understood. Here we report for the first time a genome-wide analysis of H3S28ph in the mammalian system in the context of stress signaling. We found that this mark targets as many as 50% of all stress-induced genes, underlining its importance in signal-induced transcription. By combining ChIP-seq, RNA-seq and spectrometry we identified the factors involved in biological interpretation of this histone modification. We found that MSK1/2-mediated phosphorylation of H3S28 at stress-responsive promoters contributes to the dissociation of HDAC co-repressor complexes and thereby to enhanced local histone acetylation and subsequent transcriptional activation of stress-induced genes. Our data reveal a novel function of the H3S28ph mark in the activation of mammalian genes in response to MAP kinase pathway activation.
    Genome Research 08/2014; 24(11). DOI:10.1101/gr.176255.114 · 14.63 Impact Factor
  • Source
    • "The TATA box is typically the main recognition sequence for RNA polII at the transcription start site [17]. In addition, most genes also contain initiator sequences that can fulfill the role of the promoter [18], [19]. The consensus initiator sequence is 5′-C/T-C/T-A+1-N-T/A-C/T-C/T-C/T, and it is not as highly conserved as the TATA box [20]. "
    ABSTRACT: Human trefoil factor 3 (hTFF3) is a small-molecule peptide with potential medicinal value. Its main pharmacological function is to alleviate gastrointestinal mucosal injuries caused by various factors and promote the repair of damaged mucosa. However, how its transcription is regulated is not yet known. The aim of this study was to clone the hTFF3 gene promoter region, identify the core promoter and any transcription factors that bind to the promoter, and begin to clarify the regulation of its expression. The 5' flanking sequence of the hTFF3 gene was cloned from human whole blood genomic DNA by PCR. Truncated promoter fragments of different lengths were cloned and inserted into the pGL3-Basic vector to determine the position of the core hTFF3 promoter. The transcription element maintaining basic transcriptional activity was assessed by mutation techniques. Protein-DNA interactions were analyzed by chromatin immunoprecipitation (ChIP). RNA interference and gene over-expression were performed to assay the effect of the transcription factor on hTFF3 expression. The results showed that approximately 1,826 bp of the fragment upstream of hTFF3 was successfully amplified, and its core promoter region was determined to be from -300 bp to -280 bp through analysis of truncated mutants. Mutation analysis confirmed that the sequence required to maintain basic transcriptional activity was accurately positioned from -300 bp to -296 bp. Bioinformatic analysis indicated that this area contained an Sp1 binding site. Sp1 binding to the hTFF3 promoter was confirmed by ChIP experiments. Sp1 over-expression and interference experiments showed that Sp1 enhanced the transcriptional activity of the hTFF3 promoter and increased hTFF3 expression. This study demonstrated that Sp1 plays an important role in maintaining the transcription of hTFF3.
    PLoS ONE 04/2014; 9(4):e95562. DOI:10.1371/journal.pone.0095562 · 3.23 Impact Factor