Estimation of False Discovery Rates in Multiple Testing: Application to Gene Microarray Data
ABSTRACT Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. If R denotes the number of rejections (genes declared significant) and V denotes the number of false rejections, then V/R, for R > 0, is the proportion of falsely rejected hypotheses. This paper proposes a model for the distribution of the number of rejections R and for the conditional distribution of V given R, denoted V | R. Under the independence assumption, R is distributed as a convolution of two binomials and V | R follows a noncentral hypergeometric distribution. Under an equicorrelated model, the distributions are more complex and are also derived. Five false discovery rate error measures are considered: FDR = E(V/R), pFDR = E(V/R | R > 0) (positive FDR), cFDR = E(V/R | R = r) (conditional FDR), mFDR = E(V)/E(R) (marginal FDR), and eFDR = E(V)/r (empirical FDR). The pFDR, cFDR, and mFDR are shown to be equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. We present a parametric procedure and a bootstrap procedure to estimate the FDRs. Monte Carlo simulations were conducted to evaluate the performance of these two methods. The bootstrap procedure appears to perform reasonably well, even when the alternative hypotheses are correlated (rho = 0.25). An example from a toxicogenomic microarray experiment is presented for illustration.
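The independence case described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's parametric or bootstrap estimator; it only simulates V as Binomial(m0, alpha) and the true rejections S as Binomial(m - m0, power), so that R = V + S is a convolution of two binomials, and then averages to approximate FDR, pFDR, and mFDR. The function name, the values of m, m0, alpha, and power, and the convention that V/R counts as 0 when R = 0 are all assumptions made for illustration.

```python
import random


def simulate_fdr_measures(m=200, m0=160, alpha=0.05, power=0.6,
                          reps=2000, seed=0):
    """Monte Carlo approximations of FDR, pFDR and mFDR under independence.

    Assumed setup (hypothetical values, not taken from the paper):
      m      total number of hypotheses
      m0     number of true null hypotheses
      alpha  per-test rejection probability under the null
      power  per-test rejection probability under the alternative
    """
    rng = random.Random(seed)
    sum_ratio = 0.0   # accumulates V/R, counting V/R as 0 when R = 0
    sum_v = 0         # accumulates V, for E(V)
    sum_r = 0         # accumulates R, for E(R)
    n_positive = 0    # number of replicates with R > 0

    for _ in range(reps):
        # V ~ Binomial(m0, alpha): false rejections among the true nulls
        v = sum(rng.random() < alpha for _ in range(m0))
        # S ~ Binomial(m - m0, power): correct rejections among alternatives
        s = sum(rng.random() < power for _ in range(m - m0))
        r = v + s     # total rejections: convolution of the two binomials
        sum_v += v
        sum_r += r
        if r > 0:
            sum_ratio += v / r
            n_positive += 1

    return {
        "FDR": sum_ratio / reps,                                   # E(V/R)
        "pFDR": sum_ratio / n_positive if n_positive else float("nan"),
        "mFDR": sum_v / sum_r if sum_r else float("nan"),          # E(V)/E(R)
        "P(R>0)": n_positive / reps,
    }


if __name__ == "__main__":
    for name, value in simulate_fdr_measures().items():
        print(f"{name}: {value:.4f}")
```

Under these assumed settings E(V) = 8 and E(R) = 32, so the simulated mFDR should be close to 0.25, and the identity FDR = pFDR × P(R > 0) holds by construction.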
ABSTRACT: Gene expression profiling of whole blood is showing great promise for the discovery of novel biomarkers for colorectal cancer (CRC) detection. Given the relatively low incidence rate of CRC in the general population, most blood samples collected prior to a colonoscopy were confirmed to be noncancerous afterward. Previous studies have relied on blood samples collected after a colonoscopy to reach the sufficient number of CRC cases. The present study aimed to determine the colonoscopy-induced variability in the blood transcriptome and its potential impact on biomarker discovery.
Journal of Cancer Research and Clinical Oncology 10/2014; 141(4). DOI:10.1007/s00432-014-1837-6 · 3.01 Impact Factor
ABSTRACT: A semi-parametric density ratio method which borrows strength from two or more samples can be applied to moving windows of variable size in cluster detection. The method requires neither prior knowledge of the underlying distribution nor the number of cases before scanning. In this paper, the semi-parametric cluster detection procedure combined with control of the false discovery rate (FDR) for multiple testing is studied. Simulations using Kulldorff's Northeastern benchmark data show that, for binary data, the semi-parametric method and Kulldorff's method perform similarly. When the data are not binary, the semi-parametric methodology still works in many cases, but Kulldorff's method requires the choice of a correct probability model, namely the correct scan statistic, in order to achieve power comparable to that achieved by the semi-parametric method. Kulldorff's method with an inappropriate probability model may lose power.
ABSTRACT: Early growth is connected to a key link between embryonic development and aging. In this paper, liver gene expression profiles were assayed at postnatal day 22 and week 16 of age. Meanwhile another independent animal experiment and cell culture were carried out for validation. Significance analysis of microarrays, qPCR verification, drug induction/inhibition assays, and metabonomics indicated that alpha-2u globulin (extracellular region)-socs2 (-SH2-containing signals/receptor tyrosine kinases)-ppp2r2a/pik3c3 (MAPK signaling)-hsd3b5/cav2 (metabolism/organization) plays a vital role in early development. Taken together, early development of male rats is ECR and MAPK-mediated coordination of cancer-like growth and negative regulations. Our data represent the first comprehensive description of early individual development, which could be a valuable basis for understanding the functioning of the gene interaction network of infant development.
BioMed Research International 01/2014; 2014:850802. DOI:10.1155/2014/850802 · 2.71 Impact Factor