The Candida Genome Database incorporates multiple Candida species: multispecies search and analysis tools with curated gene and protein information for Candida albicans and Candida glabrata

Department of Genetics, Stanford University Medical School, Stanford, CA 94305-5120, USA.
Nucleic Acids Research (Impact Factor: 9.11). 11/2011; 40(Database issue):D667-74. DOI: 10.1093/nar/gkr945
Source: PubMed


The Candida Genome Database (CGD, http://www is an internet-based resource that provides centralized access to genomic sequence data and manually curated functional information about genes and proteins of the fungal pathogen Candida albicans and other Candida species. As the scope of Candida research, and the number of sequenced strains and related species, has grown in recent years, the need for expanded genomic resources has also grown. To answer this need, CGD has expanded beyond storing data solely for C. albicans, now integrating data from multiple species. Herein we describe the incorporation of this multispecies information, which includes curated gene information and the reference sequence for C. glabrata, as well as orthology relationships that interconnect Locus Summary pages, allowing easy navigation between genes of C. albicans and C. glabrata. These orthology relationships are also used to predict GO annotations of their products. We have also added protein information pages that display domains, structural information and physicochemical properties; bibliographic pages highlighting important topic areas in Candida biology; and a laboratory strain lineage page that describes the lineage of commonly used laboratory strains. All of these data are freely available at http://www We welcome feedback from the research community at [email protected]

    • "S. pombe genome information (Assembly 16) was taken from the PomBase database (Wood et al. 2012); BS locations were calculated based on the position specific scoring matrix (PSSM) information extracted from the Sanger Institute, which is based on the original full genome sequencing from (Wood et al. 2002). A. nidulans (FGSC A4) and C. albicans (SC5314 Assembly 21) genome information was taken from the Aspergillus Genome Database (AspGD) (Arnaud et al. 2012) and Candida Genome Database (CGD) (Inglis et al. 2012), respectively; BS locations were calculated based on the fungal BS consensus sequence (CURAY). We used only introns taken from coding sequence genes and excluded 5′ UTR and 3′ UTR introns."
    ABSTRACT: RNA splicing is the central process of intron removal in eukaryotes known to regulate various cellular functions such as growth, development, and response to external signals. The canonical sequences indicating the splicing sites needed for intronic boundary recognition are well known. However, the roles and evolution of the local folding of intronic and exonic sequence features adjacent to splice sites have yet to be thoroughly studied. Here, focusing on four fungi (Saccharomyces cerevisiae, Schizosaccharomyces pombe, Aspergillus nidulans, and Candida albicans), we performed for the first time a comprehensive high-resolution study aimed at characterizing the encoding of intronic splicing efficiency in pre-mRNA transcripts and its effect on intron evolution. Our analysis supports the conjecture that pre-mRNA local folding strength at intronic boundaries is under selective pressure, as it significantly affects splicing efficiency. Specifically, we show that in the immediate region of 12-30 nucleotides (nt) surrounding the intronic donor site there is a preference for weak pre-mRNA folding; similarly, in the region of 15-33 nt surrounding the acceptor and branch sites there is a preference for weak pre-mRNA folding. We also show that in most cases there is a preference for strong pre-mRNA folding further away from intronic splice sites. In addition, we demonstrate that these signals are not associated with gene-specific functions, and they correlate with splicing efficiency measurements (r = 0.77, P = 2.98 × 10^-21) and with expression levels of the corresponding genes (P = 1.24 × 10^-19). We suggest that pre-mRNA folding strength in the above-mentioned regions has a direct effect on splicing efficiency by improving the recognition of intronic boundaries. These new discoveries are contributory steps toward a broader understanding of splicing regulation and intronic/transcript evolution.
    RNA 08/2015; DOI:10.1261/rna.051268.115 · 4.94 Impact Factor
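
The excerpt above mentions locating branch sites (BS) from the fungal consensus sequence CURAY. As a minimal sketch only, and not the cited authors' pipeline, the consensus can be expanded with the standard IUPAC codes (R = A or G, Y = C or U) and searched for in an intron sequence. The function name `find_curay_sites` and the example sequence are hypothetical, and accepting T in place of U (for DNA-alphabet input) is an assumption of this sketch.

```python
import re

# CURAY expanded with IUPAC codes: R = A/G, Y = C/U.
# Assumption of this sketch: T is accepted wherever U is, so DNA input works too.
# The lookahead makes overlapping matches visible (e.g. CTAACTAAC has two).
CURAY = re.compile(r"(?=(C[UT][AG]A[CUT]))")


def find_curay_sites(intron):
    """Return (0-based start, matched 5-mer) for every CURAY match in the intron."""
    seq = intron.upper()
    return [(m.start(), m.group(1)) for m in CURAY.finditer(seq)]


# Hypothetical intron fragment, for illustration only.
sites = find_curay_sites("GTATGTTTTACTAACAAAG")
print(sites)
```

This only lists candidate positions. How a single branch site is chosen when an intron contains several matches is not stated in the excerpt and is left open here.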
    • "Resampled statistics were obtained via 10,000 random iterations, and a significance threshold of 0.0005 (predicted to select ~3 false-positive hits per sample) was chosen to identify differentially expressed genes. Gene Ontology (GO) enrichment analysis was performed with the CGD Gene Ontology Term Finder (Inglis et al. 2012), with P values corresponding to Bonferroni-corrected hypergeometric test P values. The magnitude of the overlap between two gene lists was evaluated as the ratio of the intersection over the union of the two lists. "
    ABSTRACT: Candida albicans is the most important fungal pathogen of humans, causing severe infections especially in nosocomial and immunocompromised settings. However, it is also the most prevalent fungus of the normal human microbiome, where it shares its habitat with hundreds of trillions of other microbial cells. Despite weak organic acids (WOAs) being among the most abundant metabolites produced by bacterial microbiota, little is known about their effect on C. albicans. Here we employed a sequencing-based profiling strategy to systematically investigate the transcriptional stress response of C. albicans to lactic, acetic, propionic and butyric acid at several time points after treatment. Our data reveal a complex transcriptional response, with individual WOAs triggering unique gene expression profiles and with important differences between acute and chronic exposure. In spite of these dissimilarities, we found significant overlaps between the gene expression changes induced by each WOA, which led us to uncover a core transcriptional response that was largely unrelated to other previously published C. albicans transcriptional stress responses. Genes commonly up-regulated by WOAs were enriched in several iron transporters, which was associated with an overall decrease in intracellular iron concentrations. Moreover, chronic exposure to any WOA led to down-regulation of RNA synthesis and ribosome biogenesis genes, which resulted in significant reduction of total RNA levels and of ribosomal RNA in particular. In conclusion, this study suggests that GI microbiota might directly influence C. albicans physiology via production of WOAs, with possible implications of how this fungus interacts with its host in both health and disease.
    G3-Genes Genomes Genetics 01/2015; 5(4). DOI:10.1534/g3.114.015941 · 3.20 Impact Factor
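
The excerpt above evaluates the overlap between two gene lists as the ratio of their intersection over their union. A minimal sketch of that ratio follows. The function name `overlap_ratio` and the placeholder gene identifiers are hypothetical, and returning 0.0 for two empty lists is a convention chosen for this sketch, not something stated in the source.

```python
def overlap_ratio(list_a, list_b):
    """Size of the intersection divided by size of the union of two gene lists."""
    set_a, set_b = set(list_a), set(list_b)  # duplicates within a list are ignored
    union = set_a | set_b
    if not union:
        return 0.0  # convention for this sketch: two empty lists have no overlap
    return len(set_a & set_b) / len(union)


# Placeholder identifiers, for illustration only.
list_1 = ["geneA", "geneB", "geneC", "geneD"]
list_2 = ["geneA", "geneD", "geneE"]
print(overlap_ratio(list_1, list_2))  # 2 shared genes out of 5 distinct genes
```

The ratio ranges from 0 (no shared genes) to 1 (identical lists) and does not depend on the order of the two arguments.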
    • "This allows bPeaks to automatically calculate the proportion of peaks in promoter regions of genes (default value is 800 bp before start codon ATG). Annotations were collected from the Saccharomyces Genome Database (SGD) (Cherry et al., 2012), the Candida Genome Database (Inglis et al., 2012) and the Genolevures database (Sherman et al., 2009). For other organisms, the user can specify the boundaries of any genomic element (gene, promoter, non-coding elements, etc.) and use bPeaks to identify peaks that fall in each category."
    ABSTRACT: Peak calling is a critical step in ChIPseq data analysis. Choosing the correct algorithm as well as optimized parameters for a specific biological system is an essential task. In this article, we present an original peak calling method (bPeaks) specifically designed to detect transcription factor (TF) binding sites in small eukaryotic genomes, such as in yeasts. As TF interactions with DNA are strong and generate high binding signals, bPeaks uses simple parameters to compare the sequences (reads) obtained from the IP (immunoprecipitation) with those from the control DNA (input). Because yeasts have small genomes (<20 Mb), our program has the advantage of using ChIPseq information at the single nucleotide level and can explore, in a reasonable computational time, results obtained with different sets of parameter values. Graphical outputs and text files are provided to rapidly assess the relevance of the detected peaks. Taking advantage of the simple promoter structure in yeasts, additional functions were implemented in bPeaks to automatically assign the peaks to promoter regions and retrieve peak coordinates on the DNA sequence for further predictions of regulatory motifs enriched in the list of peaks. Applications of the bPeaks program to three different ChIPseq datasets from Saccharomyces cerevisiae, Candida albicans and Candida glabrata are presented. Each time, bPeaks correctly predicted the DNA binding sequence of the studied TF and provided relevant lists of peaks. The bioinformatics tool bPeaks is freely distributed to academic users. Supplementary data together with detailed tutorials are available online.
    Yeast 10/2014; 31(10). DOI:10.1002/yea.3031 · 1.63 Impact Factor