Weighted multiple hypothesis testing procedures.

University of Alabama at Birmingham, Birmingham, AL 35294, USA.
Statistical Applications in Genetics and Molecular Biology (Impact Factor: 1.52). 02/2009; 8(1):Article23. DOI: 10.2202/1544-6115.1437
Source: PubMed

ABSTRACT Multiple hypothesis testing is commonly used in genome research such as genome-wide studies and gene expression data analysis (Lin, 2005). The widely used Bonferroni procedure controls the family-wise error rate (FWER) for multiple hypothesis testing, but has limited statistical power as the number of hypotheses tested increases. The power of multiple testing procedures can be increased by using weighted p-values (Genovese et al., 2006). The weights for the p-values can be estimated by using certain prior information. Wasserman and Roeder (2006) described a weighted Bonferroni procedure, which incorporates weighted p-values into the Bonferroni procedure, and Rubin et al. (2006) and Wasserman and Roeder (2006) estimated the optimal weights that maximize the power of the weighted Bonferroni procedure under the assumption that the means of the test statistics in the multiple testing are known (these weights are called optimal Bonferroni weights). This weighted Bonferroni procedure controls FWER and can have higher power than the Bonferroni procedure, especially when the optimal Bonferroni weights are used. To further improve the power of the weighted Bonferroni procedure, we first propose a weighted Sidák procedure that incorporates weighted p-values into the Sidák procedure, and then we estimate the optimal weights that maximize the average power of the weighted Sidák procedure under the assumption that the means of the test statistics in the multiple testing are known (these weights are called optimal Sidák weights). This weighted Sidák procedure can have higher power than the weighted Bonferroni procedure. Second, we develop a generalized sequential (GS) Sidák procedure that incorporates weighted p-values into the sequential Sidák procedure (Scherrer, 1984). This GS Sidák procedure is an extension of, and has higher power than, the GS Bonferroni procedure of Holm (1979). Finally, under the assumption that the means of the test statistics in the multiple testing are known, we incorporate the optimal Sidák weights and the optimal Bonferroni weights into the GS Sidák procedure and the GS Bonferroni procedure, respectively. Theoretical proof and/or simulation studies show that the GS Sidák procedure can have higher power than the GS Bonferroni procedure when their corresponding optimal weights are used, and that both of these GS procedures can have much higher power than the weighted Sidák and the weighted Bonferroni procedures. All proposed procedures control the FWER well and are useful when prior information is available to estimate the weights.
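
As a rough illustration of the single-step weighted procedures mentioned in the abstract, the sketch below compares the two rejection rules side by side. The abstract does not spell out the rules, so the forms used here are assumptions: weights are non-negative and rescaled to average 1, the weighted Bonferroni rule rejects hypothesis i when p_i <= alpha * w_i / m, and the weighted Sidák rule rejects when p_i <= 1 - (1 - alpha)^(w_i / m), which is commonly justified for independent tests. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def _normalize(weights):
    # Rescale non-negative weights so that they average 1 (sum to m).
    w = np.asarray(weights, dtype=float)
    return w * w.size / w.sum()


def weighted_bonferroni(pvals, weights, alpha=0.05):
    # Assumed rule: reject H_i when p_i <= alpha * w_i / m.
    p = np.asarray(pvals, dtype=float)
    w = _normalize(weights)
    return p <= alpha * w / p.size


def weighted_sidak(pvals, weights, alpha=0.05):
    # Assumed rule: reject H_i when p_i <= 1 - (1 - alpha) ** (w_i / m).
    p = np.asarray(pvals, dtype=float)
    w = _normalize(weights)
    return p <= 1.0 - (1.0 - alpha) ** (w / p.size)


if __name__ == "__main__":
    pvals = [0.0127, 0.2, 0.5, 0.9]
    weights = [1.0, 1.0, 1.0, 1.0]
    print("weighted Bonferroni:", weighted_bonferroni(pvals, weights))
    print("weighted Sidák:     ", weighted_sidak(pvals, weights))
```

Under these assumed rules, each Sidák threshold is at least as large as the matching Bonferroni threshold whenever w_i / m <= 1, which is consistent with the abstract's statement that the weighted Sidák procedure can have higher power.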

  • ABSTRACT: Multiple comparisons or multiple testing has been viewed as a thorny issue in genetic association studies aiming to detect disease-associated genetic variants from a large number of genotyped variants. We alleviate the problem of multiple comparisons by proposing a hierarchical modeling approach that is fundamentally different from the existing methods. The proposed hierarchical models simultaneously fit as many variables as possible and shrink unimportant effects towards zero. Thus, the hierarchical models yield more efficient estimates of parameters than the traditional methods that analyze genetic variants separately, and also coherently address the multiple comparisons problem by greatly reducing the effective number of genetic effects and the number of statistically "significant" effects. We develop a method for computing the effective number of genetic effects in hierarchical generalized linear models, and propose a new adjustment for multiple comparisons, the hierarchical Bonferroni correction, based on the effective number of genetic effects. Our approach not only increases the power to detect disease-associated variants but also controls the Type I error. We illustrate and evaluate our method with real and simulated data sets from genetic association studies. The method has been implemented in our freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/).
    Statistical Applications in Genetics and Molecular Biology 02/2014; 13(1):35-48. DOI:10.1515/sagmb-2012-0040 · 1.52 Impact Factor
  • ABSTRACT: Proneural NEUROG2 (neurogenin 2 [Ngn2]) is essential for neuronal commitment, cell cycle withdrawal, and neuronal differentiation. Although NEUROG2's influence on neuronal commitment and differentiation is beginning to be clarified, its role in cell cycle withdrawal remains unknown. We therefore set out to investigate the molecular mechanisms by which NEUROG2 induces cell cycle arrest during spinal neurogenesis. We developed a large-scale chicken embryo strategy, designed to find gene networks modified at the onset of NEUROG2 expression, and thereby we identified those involved in controlling the cell cycle. NEUROG2 activation leads to a rapid decrease of a subset of cell cycle regulators acting at G1 and S phases, including CCND1, CCNE1/2, and CCNA2 but not CCND2. The use of NEUROG2VP16 and NEUROG2EnR, acting as the constitutive activator and repressor, respectively, indicates that NEUROG2 indirectly represses CCND1 and CCNE2 but opens the possibility that CCNE2 is also repressed by a direct mechanism. We demonstrated by phenotypic analysis that this rapid repression of cyclins prevents S phase entry of neuronal precursors, thus favoring cell cycle exit. We also showed that cell cycle exit can be uncoupled from neuronal differentiation and that during normal development NEUROG2 is in charge of tightly coordinating these two processes.
    Molecular and Cellular Biology 04/2012; 32(13):2596-607. DOI:10.1128/MCB.06745-11 · 5.04 Impact Factor
  • ABSTRACT: After genetic regions have been identified in genomewide association studies (GWAS), investigators often follow up with more targeted investigations of specific regions. These investigations typically are based on single nucleotide polymorphisms (SNPs) with dense coverage of a region. Methods are thus needed to test the hypothesis of any association in given genetic regions. Several approaches for combining P-values obtained from individual SNP hypothesis tests are available. We recently proposed a sequential procedure for testing the global null hypothesis of no association in a region. When this global null hypothesis is rejected, this method provides a list of significant hypotheses and has weak control of the family-wise error rate. In this paper, we devise a permutation-based version of the test that accounts for correlations of tests based on SNPs in the same genetic region. Based on simulated data, the method has correct control of the type I error rate and higher or comparable power to other tests.
    Genetic Epidemiology 12/2013; 37(8). DOI:10.1002/gepi.21755 · 2.95 Impact Factor
