Comparison of computational methods for the identification of cell cycle-regulated genes

Technical University of Denmark, Lyngby, Capital Region, Denmark
Bioinformatics (Impact Factor: 4.62). 04/2005; 21(7):1164-71. DOI: 10.1093/bioinformatics/bti093
Source: PubMed

ABSTRACT
MOTIVATION: DNA microarrays have been used extensively to study the cell cycle transcription programme in a number of model organisms. The Saccharomyces cerevisiae data in particular have been subjected to a wide range of bioinformatics analysis methods, aimed at identifying the correct and complete set of periodically expressed genes.
RESULTS: Here, we provide the first thorough benchmark of such methods, surprisingly revealing that most new and more mathematically advanced methods actually perform worse than the analysis published with the original microarray data sets. We show that this loss of accuracy specifically affects methods that only model the shape of the expression profile without taking into account the magnitude of regulation. We present a simple permutation-based method that performs better than most existing methods.
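The abstract does not spell out how the permutation-based method works, so the following is only a minimal sketch of the general idea of a permutation test for periodicity, not the authors' implementation. It assumes a single expression profile sampled at known time points and a known cell cycle period; the function names `fourier_score` and `permutation_p_value`, the Fourier-amplitude score, and all parameter values are illustrative assumptions. Note that this sketch scores only the periodic shape of the profile and does not include the magnitude of regulation that the abstract identifies as important.

```python
import math
import random


def fourier_score(values, times, period):
    """Amplitude of the Fourier component of `values` at the given period."""
    omega = 2.0 * math.pi / period
    s = sum(v * math.sin(omega * t) for v, t in zip(values, times))
    c = sum(v * math.cos(omega * t) for v, t in zip(values, times))
    return math.hypot(s, c)


def permutation_p_value(values, times, period, n_perm=1000, seed=0):
    """Empirical p-value for periodicity (hypothetical helper).

    The time order of the profile is shuffled n_perm times; the p-value is
    the fraction of shuffled profiles whose Fourier score is at least as
    large as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]  # remove the constant offset
    observed = fourier_score(centered, times, period)

    shuffled = centered[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys temporal structure, keeps values
        if fourier_score(shuffled, times, period) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic example: two cycles of a 60-minute oscillation, sampled
    # every 10 minutes (values chosen purely for illustration).
    times = list(range(0, 120, 10))
    profile = [math.sin(2 * math.pi * t / 60) for t in times]
    print(permutation_p_value(profile, times, period=60))
```

Shuffling the time points keeps the distribution of expression values intact while destroying their order, so the resulting p-value reflects only how unusual the temporal pattern is. A score for the magnitude of regulation would have to be computed separately and combined with it.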

  • ABSTRACT: Detecting periodicity signals from time-series microarray data is commonly used to facilitate the understanding of the critical roles and underlying mechanisms of regulatory transcriptomes. However, time-series microarray data are noisy, and how the temporal data structure affects the performance of periodicity detection has remained elusive. We present a novel method based on empirical mode decomposition (EMD) to examine this effect. We applied EMD to a yeast microarray dataset and extracted a series of intrinsic mode function (IMF) oscillations from the time-series data. Our analysis indicated that many periodically expressed genes might have been under-detected in the original analysis because of interference between decomposed IMF oscillations. By validating with a protein complex coexpression analysis, we found that 56 genes were newly determined to be periodic. We demonstrated that EMD can be used in combination with existing periodicity detection methods to improve their performance. This approach can be applied to other time-series microarray studies.
    PLoS ONE 11/2014; 9(11):e111719. DOI:10.1371/journal.pone.0111719 · 3.53 Impact Factor
  • ABSTRACT: Oscillations play a significant role in biological systems, with many examples in the fast, ultradian, circadian, circalunar, and yearly time domains. However, determining periodicity in such data can be problematic. There are a number of computational methods to identify the periodic components in large datasets, such as signal-to-noise based Fourier decomposition, Fisher's g-test and autocorrelation. However, the available methods assume a sinusoidal model and do not attempt to quantify the waveform shape or the presence of multiple periodicities, both of which provide vital clues in determining the underlying dynamics. Here, we developed a Fourier-based measure that generates a de-noised waveform from multiple significant frequencies. This waveform is then correlated with the raw data from the respiratory oscillation found in yeast, to provide oscillation statistics including waveform metrics and multi-periods. The method is compared and contrasted with commonly used statistics. Moreover, we show the utility of the program in the analysis of noisy datasets and of other high-throughput data, such as metabolomics and flow cytometry.
    Frontiers in Cell and Developmental Biology 08/2014; 2:40. DOI:10.3389/fcell.2014.00040

