de Lichtenberg, U. et al. Comparison of computational methods for the identification of cell cycle-regulated genes. Bioinformatics 21, 1164-1171

Technical University of Denmark, Lyngby, Capital Region, Denmark
Bioinformatics (Impact Factor: 4.98). 04/2005; 21(7):1164-71. DOI: 10.1093/bioinformatics/bti093
Source: PubMed


MOTIVATION: DNA microarrays have been used extensively to study the cell cycle transcription programme in a number of model organisms. The Saccharomyces cerevisiae data in particular have been subjected to a wide range of bioinformatics analysis methods, aimed at identifying the correct and complete set of periodically expressed genes. RESULTS: Here, we provide the first thorough benchmark of such methods, surprisingly revealing that most new and more mathematically advanced methods actually perform worse than the analysis published with the original microarray data sets. We show that this loss of accuracy specifically affects methods that only model the shape of the expression profile without taking into account the magnitude of regulation. We present a simple permutation-based method that performs better than most existing methods.
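
The abstract does not describe the permutation-based method in detail, so the following is only a minimal sketch of the general idea and not the authors' published procedure. It assumes a periodicity score defined as the magnitude of the Fourier component at a known cell cycle period, and it estimates a p-value by shuffling the time points of a single expression profile. The function names, the `period` parameter, and the use of the standard deviation as a "magnitude of regulation" score are all illustrative assumptions.

```python
import math
import random
import statistics


def fourier_score(values, times, period):
    """Magnitude of the Fourier component at the assumed period (hypothetical score)."""
    omega = 2.0 * math.pi / period
    sin_part = sum(v * math.sin(omega * t) for v, t in zip(values, times))
    cos_part = sum(v * math.cos(omega * t) for v, t in zip(values, times))
    return math.hypot(sin_part, cos_part)


def regulation_score(values):
    """Spread of the profile, used here as a stand-in for magnitude of regulation."""
    return statistics.pstdev(values)


def permutation_p_value(values, times, period, n_perm=1000, seed=0):
    """Fraction of time-shuffled profiles that score at least as high as the observed one."""
    rng = random.Random(seed)
    observed = fourier_score(values, times, period)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys temporal order, keeps the value distribution
        if fourier_score(shuffled, times, period) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (not from the paper): two cycles of an assumed 60-minute period.
times = list(range(0, 120, 10))
periodic = [math.sin(2.0 * math.pi * t / 60.0) for t in times]
flat = [1.0] * len(times)

p_periodic = permutation_p_value(periodic, times, period=60.0)
p_flat = permutation_p_value(flat, times, period=60.0)
```

Shuffling preserves the distribution of expression values, so the periodicity score alone says nothing about how strongly a gene is regulated. In line with the abstract's point about shape-only methods, a practical ranking would presumably combine it with a magnitude measure such as `regulation_score`.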



Available from: Søren Brunak, Aug 17, 2015
    • "Statistical comparisons of gene expression (ChIP assay, luciferase activity) were performed using SPSS software v.21 (IBM) and a two-tailed Student's t-test, and diurnal rhythms in mRNA or protein were analyzed using the JTK_Cycle nonparametric algorithm as described on the CircaDB database ([19]–[21]). For JTK_Cycle the p-value, phase, and amplitude were evaluated, but not period, as this requires at least 3 cycles of data [35]."
    ABSTRACT: Circadian rhythms are important for healthy cardiovascular physiology and are regulated at the molecular level by a circadian clock mechanism. We and others previously demonstrated that 9-13% of the cardiac transcriptome is rhythmic over 24 h daily cycles; the heart is genetically a different organ day versus night. However, which rhythmic mRNAs are regulated by the circadian mechanism is not known. Here, we used open access bioinformatics databases to identify 94 transcripts with expression profiles characteristic of CLOCK and BMAL1 targeted genes, using the CircaDB website and JTK_Cycle. Moreover, 22 were highly expressed in the heart as determined by the BioGPS website. Furthermore, 5 heart-enriched genes had human/mouse conserved CLOCK:BMAL1 promoter binding sites (E-boxes), as determined by UCSC table browser, circadian mammalian promoter/enhancer database PEDB, and the European Bioinformatics Institute alignment tool (EMBOSS). Lastly, we validated findings by demonstrating that Titin cap (Tcap, telethonin) was targeted by transcriptional activators CLOCK and BMAL1 by showing 1) Tcap mRNA and TCAP protein had a diurnal rhythm in murine heart; 2) cardiac Tcap mRNA was rhythmic in animals kept in constant darkness; 3) Tcap and control Per2 mRNA expression and cyclic amplitude were blunted in ClockΔ19/Δ19 hearts; 4) BMAL1 bound to the Tcap promoter by ChIP assay; 5) BMAL1 bound to Tcap promoter E-boxes by biotinylated oligonucleotide assay; and 6) CLOCK and BMAL1 induced Tcap expression by luciferase reporter assay. Thus, this study identifies circadian regulated genes in silico, with validation of Tcap, a critical regulator of cardiac Z-disc sarcomeric structure and function.
    Full-text · Article · Aug 2014 · PLoS ONE
    • "There are two principal approaches to the identification of periodically expressed genes, non‐parametric (model‐free) approaches (Spellman et al, 1998; Wichert et al, 2004) and parametric (model‐based) methods (Tavazoie et al, 1999; Johansson et al, 2003; Lu et al, 2004; Guo et al, 2013). A successful screening method needs to account for measurement noise and outliers, and ideally provides a smoothed, error‐corrected estimate of the expression time course (de Lichtenberg et al, 2005a). Additionally, it has to account for the loss of synchronization of cells along the time course, which is caused by variability in progression through the cell cycle. "
    ABSTRACT: During the cell cycle, the levels of hundreds of mRNAs change in a periodic manner, but how this is achieved by alterations in the rates of mRNA synthesis and degradation has not been studied systematically. Here, we used metabolic RNA labeling and comparative dynamic transcriptome analysis (cDTA) to derive mRNA synthesis and degradation rates every 5 min during three cell cycle periods of the yeast Saccharomyces cerevisiae. A novel statistical model identified 479 genes that show periodic changes in mRNA synthesis and generally also periodic changes in their mRNA degradation rates. Peaks of mRNA degradation generally follow peaks of mRNA synthesis, resulting in sharp and high peaks of mRNA levels at defined times during the cell cycle. Whereas the timing of mRNA synthesis is set by upstream DNA motifs and their associated transcription factors (TFs), the synthesis rate of a periodically expressed gene is apparently set by its core promoter.
    Full-text · Article · Jan 2014 · Molecular Systems Biology
    • "While cross-validation of expression measurements can be used to discover methodological problems [18], the lack of diurnal expression datasets from alternative techniques, such as RNA-Seq, impedes such verification in the case of cyanobacteria. These observations raise the question, how normalization and other preprocessing steps affect commonly used descriptors for periodic expression, e.g., the number of oscillating genes (by tests of significance of oscillation) and the circadian phase of peak transcript levels [11,19,20]. Such phase information is usually used to derive a temporal order of the observed processes. "
    ABSTRACT: Background: The transcriptomes of several cyanobacterial strains have been shown to exhibit diurnal oscillation patterns, reflecting the diurnal phototrophic lifestyle of the organisms. The analysis of such genome-wide transcriptional oscillations is often facilitated by the use of clustering algorithms in conjunction with a number of pre-processing steps. Biological interpretation is usually focussed on the time and phase of expression of the resulting groups of genes. However, the use of microarray technology in such studies requires the normalization of pre-processing data, with unclear impact on the qualitative and quantitative features of the derived information on the number of oscillating transcripts and their respective phases. Results: A microarray based evaluation of diurnal expression in the cyanobacterium Synechocystis sp. PCC 6803 is presented. As expected, the temporal expression patterns reveal strong oscillations in transcript abundance. We compare the Fourier transformation-based expression phase before and after the application of quantile normalization, median polishing, cyclical LOESS, and least oscillating set (LOS) normalization. Whereas LOS normalization mostly preserves the phases of the raw data, the remaining methods introduce systematic biases. In particular, quantile normalization is found to introduce a phase-shift of 180°, effectively changing night-expressed genes into day-expressed ones. Comparison of a large number of clustering results of differently normalized data shows that the normalization method determines the result. Subsequent steps, such as the choice of data transformation, similarity measure, and clustering algorithm, only play minor roles. We find that the standardization and the DFT transformation are favorable for the clustering of time series in contrast to the 12 m transformation. We use the cluster-wise functional enrichment of a clustering derived by LOS normalization, clustering using flowClust, and DFT transformation to derive the diurnal biological program of Synechocystis sp. Conclusion: Application of quantile normalization, median polishing, and also cyclic LOESS normalization of the presented cyanobacterial dataset leads to increased numbers of oscillating genes and the systematic shift of the expression phase. The LOS normalization minimizes the observed detrimental effects. As previous analyses employed a variety of different normalization methods, a direct comparison of results must be treated with caution.
    Full-text · Article · Apr 2013 · BMC Bioinformatics