SCPD: A promoter database of the yeast Saccharomyces cerevisiae

Cold Spring Harbor Laboratory, PO Box 100, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.
Bioinformatics (Impact Factor: 4.98). 07/1999; 15(7-8):607-11. DOI: 10.1093/bioinformatics/15.7.607
Source: PubMed


To facilitate a systematic, genome-scale study of the promoters and transcriptional regulatory cis-elements of the yeast Saccharomyces cerevisiae, we have developed a comprehensive yeast-specific promoter database, SCPD.
Currently, SCPD contains 580 experimentally mapped transcription factor (TF) binding sites and 425 transcriptional start sites (TSS) as its primary data entries. It also contains relevant binding affinity and expression data where available. In addition to mechanisms for retrieving promoter information (including sequence) and a data submission form, SCPD provides some simple but useful tools for promoter sequence analysis.
SCPD can be accessed online. The database is continually updated.
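
As an illustration of the kind of simple promoter sequence analysis mentioned above, the following is a minimal sketch of scanning a promoter sequence for matches to a TF binding-site consensus on both strands. It is not SCPD's own tool: the function names (`find_sites`, `reverse_complement`), the example sequence, and the consensus string are all hypothetical and chosen only for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(promoter, consensus):
    """Find consensus matches on both strands (hypothetical helper).

    Returns a sorted list of (start, strand, matched_sequence) tuples,
    where start is the 0-based leftmost position on the forward strand.
    """
    promoter = promoter.upper()
    k = len(consensus)
    n = len(promoter)
    # A lookahead group lets overlapping matches be reported.
    pattern = re.compile(
        "(?=(" + "".join(IUPAC[c] for c in consensus.upper()) + "))"
    )
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(promoter)]
    for m in pattern.finditer(reverse_complement(promoter)):
        # Convert the reverse-strand offset to a forward-strand start.
        hits.append((n - m.start() - k, "-", m.group(1)))
    return sorted(hits)


# Illustrative input only; not a sequence taken from SCPD.
sites = find_sites("GGTATAAAGGCC", "TATAWA")
```

Note that a palindromic consensus is reported once per strand at the same position; a real analysis would usually replace exact consensus matching with a scoring matrix built from the mapped sites.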

  • Source
    • "In the construction of synthetic hybrid promoters, truncated CYC1 and LEU2 promoters have been used as core promoters (Blazeck et al., 2012). However, these core promoters contain multiple TATA binding sites (Zhu and Zhang, 1999) and are unsuitable for synthetically introducing operator sites for repressor binding. Operators for TetR and LacI have previously been inserted downstream of the TATA boxes of GAL1, GAL10, and ADH1 promoters (Blake et al., 2003), where introduction of operator sites need to be carried out for each core promoter. "
    ABSTRACT: Expression of heterologous proteins in metabolic engineering endeavors can be detrimental to host cells due to increased usage of cellular resources. Dynamic controls, where protein expression can be triggered on-demand, are effective for the engineering and optimization of bio-catalysts towards robust cell growth and enhanced biochemical productivity. Here, we describe the development and characterization of AND-gate dynamic controllers in Saccharomyces cerevisiae which combine two dynamic control strategies, inducible promoters and sensing-regulation. These dynamic controllers were constructed based on synthetic hybrid promoters. Promoter enhancer sequences were fused to a synthetic GAL1 core promoter containing DNA binding sites for the binding of a repressor that reduced DNA affinity upon interaction with key intermediates in a biochemical pathway. As fatty acids are key intermediates for production of fatty alcohols, fatty acid esters, alkenes and alkanes, which are advanced biofuels, we used the fatty acid responsive FadR repressor and its operator sequence to demonstrate the functionality of the dynamic controllers. We established that the synthetic GAL1 core promoter can be used as a modular promoter part for constructing synthetic hybrid promoters and conferring fatty acid inducibility. We further showed the performance of the AND-gate dynamic controllers, where two inputs (fatty acid and copper presence / phosphate starvation) were required to switch the AND-gate ON. This work provides a convenient platform for constructing AND-gate dynamic controllers, i.e. promoters that combine inducible functionality with regulation of protein expression levels upon detection of key intermediates towards the engineering and optimization of bio-catalytic yeast cells. Biotechnol. Bioeng. © 2013 Wiley Periodicals, Inc.
    Preview · Article · Jan 2014 · Biotechnology and Bioengineering
  • Source
    • "Upstream regions of genes in each TC and LC were tested for motif enrichment with significance threshold p-value < 10⁻³ (Additional file 2: Figure S3B). Tomtom (v4.8.1) [74] was used to detect similarity to yeast motifs (databases MacIsaac_v1 [75] and SCPD [76]; E < 1 and motif length ≤ 9) and to identify S. cerevisiae motif-associated proteins (ScAPs). F. graminearum homologues of ScAPs were identified (BLASTp; E < 10⁻⁶ and matched regions covering ≥ 30% of the query or target sequence). "
    ABSTRACT: Genes for the production of a broad range of fungal secondary metabolites are frequently colinear. The prevalence of such gene clusters was systematically examined across the genome of the cereal pathogen Fusarium graminearum. The topological structure of transcriptional networks was also examined to investigate control mechanisms for mycotoxin biosynthesis and other processes. The genes associated with transcriptional processes were identified, and the genomic location of transcription-associated proteins (TAPs) analyzed in conjunction with the locations of genes exhibiting similar expression patterns. Highly conserved TAPs reside in regions of chromosomes with very low or no recombination, contrasting with putative regulator genes. Co-expression group profiles were used to define positionally clustered genes and a number of members of these clusters encode proteins participating in secondary metabolism. Gene expression profiles suggest there is an abundance of condition-specific transcriptional regulation. Analysis of the promoter regions of co-expressed genes showed enrichment for conserved DNA-sequence motifs. Potential global transcription factors recognising these motifs contain distinct sets of DNA-binding domains (DBDs) from those present in local regulators. Proteins associated with basal transcriptional functions are encoded by genes enriched in regions of the genome with low recombination. Systematic searches revealed dispersed and compact clusters of co-expressed genes, often containing a transcription factor, and typically containing genes involved in biosynthetic pathways. Transcriptional networks exhibit a layered structure in which the position in the hierarchy of a regulator is closely linked to the DBD structural class.
    Full-text · Article · Jun 2013 · BMC Systems Biology
  • Source
    • "Except for Saccharomyces cerevisiae, the Markov chain of a species was learned from promoter sequences in the UCSC Genome Browser database [39]. For Saccharomyces cerevisiae, the promoter sequences were retrieved from the SCPD [40] using the yeast gene list in euGenes [41]. "
    ABSTRACT: Background: Scientists routinely scan DNA sequences for transcription factor (TF) binding sites (TFBSs). Most of the available tools rely on position-specific scoring matrices (PSSMs) constructed from aligned binding sites. Because of the resolutions of assays used to obtain TFBSs, databases such as TRANSFAC, ORegAnno and PAZAR store unaligned variable-length DNA segments containing binding sites of a TF. These DNA segments need to be aligned to build a PSSM. While the TRANSFAC database provides scoring matrices for TFs, nearly 78% of the TFs in the public release do not have matrices available. As work on TFBS alignment algorithms has been limited, it is highly desirable to have an alignment algorithm tailored to TFBSs. Results: We designed a novel algorithm named LASAGNA, which is aware of the lengths of input TFBSs and utilizes position dependence. Results on 189 TFs of 5 species in the TRANSFAC database showed that our method significantly outperformed ClustalW2 and MEME. We further compared a PSSM method dependent on LASAGNA to an alignment-free TFBS search method. Results on 89 TFs whose binding sites can be located in genomes showed that our method is significantly more precise at fixed recall rates. Finally, we described LASAGNA-ChIP, a more sophisticated version for ChIP (Chromatin immunoprecipitation) experiments. Under the one-per-sequence model, it showed comparable performance with MEME in discovering motifs in ChIP-seq peak sequences. Conclusions: We conclude that the LASAGNA algorithm is simple and effective in aligning variable-length binding sites. It has been integrated into a user-friendly webtool for TFBS search and visualization called LASAGNA-Search. The tool currently stores precomputed PSSM models for 189 TFs and 133 TFs built from TFBSs in the TRANSFAC Public database (release 7.0) and the ORegAnno database (08Nov10 dump), respectively. The webtool is available at
    Full-text · Article · Mar 2013 · BMC Bioinformatics