A permutation-based multiple testing method for time-course microarray experiments

Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, North Carolina 27710, USA.
BMC Bioinformatics (Impact Factor: 2.67). 10/2009; 10:336. DOI: 10.1186/1471-2105-10-336

ABSTRACT Time-course microarray experiments are widely used to study the temporal profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course microarray studies that can be applied to discovering genes whose expression trajectories change over time within a single biological group, or those that follow different time trajectories among multiple groups. They estimated the expression trajectories of each gene using natural cubic splines under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness of fit test statistic to quantify the discrepancy. The null distribution of the statistic was approximated through a bootstrap method. Gene expression levels in microarray data are often correlated in complicated ways. Accurate type I error control that adjusts for multiple testing requires the joint null distribution of the test statistics for a large number of genes. For this purpose, permutation methods have been widely used because of their computational ease and intuitive interpretation.
In this paper, we propose a permutation-based multiple testing procedure based on the test statistic used by Storey et al. (2005). We also propose an efficient algorithm for computing it. Extensive simulations are conducted to investigate the performance of the permutation-based multiple testing procedure. The application of the proposed method is illustrated using the Caenorhabditis elegans dauer developmental data.
Our method is computationally efficient and applicable for identifying genes whose expression levels are time-dependent in a single biological group and for identifying the genes for which the time-profile depends on the group in a multi-group setting.
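
The idea of permutation-based adjustment can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single biological group, uses the single-step maxT adjustment as one common way to control the family-wise error rate, and, to stay dependency-free, replaces the natural cubic spline fit under the alternative with per-time-point means. The function names `f_stat` and `maxt_adjusted_pvalues` are hypothetical.

```python
import random


def f_stat(y, times):
    """Goodness-of-fit statistic (SS0 - SS1) / SS1 for one gene.

    SS0: residual sum of squares under a flat (no time-course) mean.
    SS1: residual sum of squares under per-time-point means
         (an assumed stand-in for the natural cubic spline fit).
    """
    grand = sum(y) / len(y)
    ss0 = sum((v - grand) ** 2 for v in y)
    groups = {}
    for v, t in zip(y, times):
        groups.setdefault(t, []).append(v)
    ss1 = 0.0
    for vals in groups.values():
        m = sum(vals) / len(vals)
        ss1 += sum((v - m) ** 2 for v in vals)
    if ss1 == 0:
        return 0.0 if ss0 == 0 else float("inf")
    return (ss0 - ss1) / ss1


def maxt_adjusted_pvalues(expr, times, n_perm=1000, seed=0):
    """Single-step maxT adjusted p-values.

    expr:  list of genes, each a list of expression values (one per array).
    times: time label of each array.
    Each permutation shuffles the time labels once and applies the same
    shuffle to every gene, so the correlation among genes is preserved.
    """
    rng = random.Random(seed)
    observed = [f_stat(y, times) for y in expr]
    exceed = [0] * len(expr)
    perm = list(range(len(times)))
    for _ in range(n_perm):
        rng.shuffle(perm)
        perm_times = [times[i] for i in perm]
        max_stat = max(f_stat(y, perm_times) for y in expr)
        for g, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[g] += 1
    # Add-one correction keeps every adjusted p-value strictly positive.
    return [(c + 1) / (n_perm + 1) for c in exceed]


if __name__ == "__main__":
    times = [0] * 4 + [1] * 4 + [2] * 4
    trended = [0.0, 0.1, -0.1, 0.0, 5.0, 5.1, 4.9, 5.0, 10.0, 10.1, 9.9, 10.0]
    flat = [0, 1, 0, 1] * 3
    print(maxt_adjusted_pvalues([trended, flat], times, n_perm=200))
```

Because the maximum statistic is taken over all genes within each permutation, the adjustment reflects the joint null distribution rather than treating genes as independent.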

  • Source
    ABSTRACT: Mature microRNAs (miRNAs) are small endogenous non-coding RNAs 18-25 nt in length. They program the RNA Induced Silencing Complex (RISC) to make it inhibit either messenger RNAs or promoter DNAs. We have found that the mean abundance of miRNAs in Arabidopsis is correlated with the abundance of DRYD tetranucleotides near the 3'-end and the abundance of WRHB tetranucleotides in the center of the miRNA sequence. Based on this correlation, we have estimated miRNA abundances in seven organs of this plant, namely: inflorescences, stems, siliques, seedlings, roots, cauline, and rosette leaves. We have also found that the mean affinity of miRNAs for two proteins in the Argonaute family (Ago2 and Ago3) in man is correlated with the abundance of YRHB tetranucleotides near the 3'-end and that the preference of miRNAs for Ago2 is correlated with the abundance of RHHK tetranucleotides in the center of the miRNA sequence. This allowed us to obtain statistically significant estimates of miRNA abundances in human embryonic kidney cells, HEK293T. These findings in relation to two taxonomically distant entities (man and Arabidopsis) fit one another like pieces of a jigsaw puzzle, which allowed us to heuristically generalize them and state that the miRNA abundance in the human brain may be determined by the abundance of YRHB and RHHK tetranucleotides in these miRNAs.
    Frontiers in Genetics 07/2013; 4:122. DOI:10.3389/fgene.2013.00122
  • Source
    ABSTRACT: RNA-seq is becoming the de facto standard approach for transcriptome analysis as its cost continues to fall. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulation of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model time-course RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.
    03/2013; 2013:203681. DOI:10.1155/2013/203681
  • Source
    ABSTRACT: Mature miRNAs, 20-24 nt in length, are endogenous sncRNAs. They program RISC to regulate the functioning of mRNAs with sites complementary to these miRNAs. When the Ago3 protein is present in human RISC, miRNAs direct inhibition of translation, whereas when Ago2 is in RISC, mRNA cleavage in the middle of the miRNA/mRNA heteroduplex is also possible. Using the ACTIVITY system, which we developed earlier, we analyzed published data on miRNA affinity for the human Ago2 and Ago3 proteins. We found that miRNA affinity for both Ago2 and Ago3 increases with the abundance of YRHB tetranucleotides near the 3'-end of these miRNAs (r = 0.613, alpha < 0.025). We also found that the tendency of miRNAs to bind Ago2 rather than Ago3 increases with the abundance of RHHK tetranucleotides near the miRNA center (r = 0.501, alpha < 0.05). Using these two findings, we proposed two formulas to predict the affinity of an arbitrary miRNA for the Ago2 and Ago3 proteins based on its YRHB and RHHK abundances. We thereby made reliable predictions of miRNA affinity for these proteins in RISC for both canonical (alpha < 0.00025) and non-canonical (alpha < 0.05) miRNAs, as compared with independent experimental data.
    Molecular Biology 04/2011; 45(2):366-75. · 0.74 Impact Factor
