SCPD: a promoter database of the yeast Saccharomyces cerevisiae.

Cold Spring Harbor Laboratory, PO Box 100, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.
Bioinformatics (Impact Factor: 4.62). 07/1999; 15(7-8):607-11. DOI: 10.1093/bioinformatics/15.7.607
Source: PubMed

ABSTRACT To facilitate a systematic, genome-scale study of the promoters and transcriptional regulatory cis-elements of the yeast Saccharomyces cerevisiae, we have developed a comprehensive yeast-specific promoter database, SCPD.
Currently SCPD contains 580 experimentally mapped transcription factor (TF) binding sites and 425 transcriptional start sites (TSS) as its primary data entries. It also contains relevant binding affinity and expression data where available. In addition to mechanisms for promoter information (including sequence) retrieval and a data submission form, SCPD also provides some simple but useful tools for promoter sequence analysis.
SCPD can be accessed from the URL The database is continually updated.

  • ABSTRACT: We consider the problem of identifying motifs, which abstracts the task of finding short conserved sites in genomic DNA. The planted (l, d)-motif problem (PMP) is the mathematical abstraction of this problem: find a substring of length l that occurs in each s_i in a set of input sequences S = {s_1, s_2, ..., s_t} with at most d substitutions. Our proposed algorithm combines the voting algorithm and a pattern matching algorithm to find exact motifs. The combined algorithm runs the voting algorithm on t′ sequences, t′ < t, and then applies pattern matching to the output of the voting algorithm and the remaining t − t′ sequences. Two values of t′ are calculated. The first makes the running time of our proposed algorithm less than that of the voting algorithm. The second makes the running time of our proposed algorithm minimal. We show that our proposed algorithm is faster than the voting algorithm by testing both algorithms on simulated data from (9, d ≤ 2) to (19, d ≤ 7). Finally, we test the performance of the combined algorithm on realistic biological data.
    Mathematics in Computer Science 01/2013; 7(4).
  • ABSTRACT: Identifying repeated factors that occur in a string of letters, or common factors that occur in a set of strings, is an important task in computer science and biology. Such patterns are called motifs, and the process of identifying them is called motif extraction. In biology, motif extraction constitutes a fundamental step in understanding regulation of gene expression. State-of-the-art tools for motif extraction have their own constraints. Most of these tools are designed only for single motif extraction; structured motifs additionally allow for distance intervals between their single motif components. Moreover, motif extraction from large-scale datasets, for instance large-scale ChIP-Seq datasets, cannot be performed by current tools. Other constraints include high time and/or space complexity for identifying long motifs with higher error thresholds.
    BMC Bioinformatics 07/2014; 15(1):235. · 2.67 Impact Factor
  • IEEE Transactions on Neural Networks and Learning Systems 10/2013; 24(10):1677-1688. · 4.37 Impact Factor
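
The first related abstract above describes running a voting algorithm on t′ of the t input sequences and then pattern matching the surviving candidates against the remaining t − t′ sequences. The following is a minimal Python sketch of that idea, not the authors' implementation: the function names (`neighbors`, `vote`, `combined`), the DNA alphabet, and the brute-force neighbor enumeration are all assumptions made for illustration, and no attempt is made to choose t′ optimally.

```python
from itertools import combinations, product

ALPHABET = "ACGT"  # assumed DNA alphabet


def neighbors(lmer, d):
    """Return every string within Hamming distance d of lmer."""
    result = {lmer}
    for k in range(1, d + 1):
        for positions in combinations(range(len(lmer)), k):
            for letters in product(ALPHABET, repeat=k):
                chars = list(lmer)
                for pos, ch in zip(positions, letters):
                    chars[pos] = ch
                result.add("".join(chars))
    return result


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, seq, d):
    """True if some window of seq is within d substitutions of candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def vote(seqs, l, d):
    """Voting step: keep candidates that receive a vote from every sequence.

    Each l-mer in a sequence votes for all of its d-neighbors; a sequence
    gives a candidate at most one vote, so intersecting the per-sequence
    vote sets is the same as requiring len(seqs) votes.
    """
    candidates = None
    for seq in seqs:
        voted = set()
        for i in range(len(seq) - l + 1):
            voted |= neighbors(seq[i:i + l], d)
        candidates = voted if candidates is None else candidates & voted
    return candidates or set()


def combined(seqs, l, d, t_prime):
    """Vote on the first t_prime sequences, then pattern match the rest."""
    candidates = vote(seqs[:t_prime], l, d)
    return {
        c for c in candidates
        if all(occurs_within(c, s, d) for s in seqs[t_prime:])
    }
```

Under these assumptions, `combined(seqs, l, d, t_prime)` with 1 ≤ t_prime ≤ len(seqs) returns the same motif set as `vote(seqs, l, d)`; only the split of work between the two steps changes.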

