Optimal methods for meta-analysis of genome-wide association studies.

Department of Health Research and Policy, Stanford University, Stanford, California, USA.
Genetic Epidemiology (Impact Factor: 2.95). 09/2011; 35(7):581-91. DOI: 10.1002/gepi.20603
Source: PubMed

ABSTRACT Meta-analysis of genome-wide association studies involves testing single nucleotide polymorphisms (SNPs) using summary statistics that are weighted sums of site-specific score or Wald statistics. This approach avoids having to pool individual-level data. We describe the weights that maximize the power of the summary statistics. For small effect-sizes, any choice of weights yields summary Wald and score statistics with the same power, and the optimal weights are proportional to the square roots of the sites' Fisher information for the SNP's regression coefficient. When SNP effect size is constant across sites, the optimal summary Wald statistic is the well-known inverse-variance-weighted combination of estimated regression coefficients, divided by its standard deviation. We give simple approximations to the optimal weights for various phenotypes, and show that weights proportional to the square roots of study sizes are suboptimal for data from case-control studies with varying case-control ratios, for quantitative trait data when the trait variance differs across sites, for count data when the site-specific mean counts differ, and for survival data with different proportions of failing subjects. Simulations suggest that weights that accommodate intersite variation in imputation error give little power gain compared to those obtained ignoring imputation uncertainties. We note advantages to combining site-specific score statistics, and we show how they can be used to assess effect-size heterogeneity across sites. The utility of the summary score statistic is illustrated by application to a meta-analysis of schizophrenia data in which only site-specific P-values and directions of association are available.
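The two summary statistics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, each site's Fisher information for the SNP coefficient is assumed to be estimated by 1/se², and the back-transformation from a P-value and a direction to a signed Z-score assumes two-sided P-values from approximately normal test statistics.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def inverse_variance_wald(betas, ses):
    """Inverse-variance-weighted estimate, its standard error, and the Wald z.

    betas, ses: site-specific coefficient estimates and standard errors.
    Assumes a common effect size across sites.
    """
    weights = [1.0 / se ** 2 for se in ses]  # estimated information per site
    total = sum(weights)
    beta_hat = sum(w * b for w, b in zip(weights, betas)) / total
    se_hat = math.sqrt(1.0 / total)
    return beta_hat, se_hat, beta_hat / se_hat


def weighted_z(z_scores, weights):
    """Weighted sum of site-specific z statistics, scaled to unit variance."""
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / math.sqrt(sum(w * w for w in weights))


def z_from_p(p_two_sided, direction):
    """Signed z-score from a two-sided P-value (0 < p <= 1) and a direction.

    direction: any positive number for a risk-increasing allele,
    any negative number for a protective one.
    """
    magnitude = _STD_NORMAL.inv_cdf(1.0 - p_two_sided / 2.0)
    return math.copysign(magnitude, direction)


def two_sided_p(z):
    """Two-sided P-value for a standard normal statistic."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))
```

With weights equal to 1/se (the square roots of the estimated information) and z = beta/se at each site, `weighted_z` returns the same value as the Wald z from `inverse_variance_wald`. When only P-values and directions are reported, `z_from_p` recovers signed Z-scores that can be passed to `weighted_z` with whatever weights the available study descriptors support.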

  • ABSTRACT: Understanding the effects of gene-environment interaction on complex human diseases or traits in genome-wide association studies (GWAS) can help uncover novel genes and identify environmental hazards that influence only certain genetically susceptible groups. Thus there is a pressing need to develop efficient and powerful interaction analysis methods. In this paper, we propose a novel meta-analysis method of gene-environment interaction, based on meta-regression (MR-M&I). Compared with existing meta-analysis methods, MR-M&I allows for heterogeneity in the environmental factor (E) by dividing the subjects in each study into groups according to the distribution of E. Moreover, it can readily estimate linear or non-linear interactions, and thus it is more generally applicable to different scenarios. We use numerical examples to demonstrate the performance of MR-M&I and compare it with two commonly used methods in current GWAS. The results show that MR-M&I is more powerful than the other methods.
    2012 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS); 01/2012
  • ABSTRACT: Genome-wide association studies (GWAS) have created heightened interest in understanding the effects of gene-environment interaction on complex human diseases or traits. Applying methods for analyzing such interaction can help uncover novel genes and identify environmental hazards that influence only certain genetically susceptible groups. However, the number of interaction analysis methods is still limited, so there is a need to develop more efficient and powerful methods. In this paper, we propose two novel meta-analysis methods of studying gene-environment interaction, based on meta-regression of estimated genetic effects on the environmental factor. The two methods can perform joint analysis of a single nucleotide polymorphism's (SNP) main and interaction effects, or analyze only the effect of the interaction. They can readily estimate any linear or non-linear interactions by simply modifying the gene-environment regression function. Thus, they are efficient methods to be applied to different scenarios. We use numerical examples to demonstrate the performance of our methods. We also compare them with two other methods commonly used in current GWAS, i.e., meta-analysis of SNP main effects (MAIN) and joint meta-analysis of SNP main and interaction effects (JMA). The results show that our methods are more powerful than MAIN when the interaction effect exists, and are comparable to JMA in the linear or quadratic interaction cases. In the numerical examples, we also investigate how the number of the divided groups and the sample size of the studies affect the performance of our methods.
    IEEE Transactions on NanoBioscience 12/2013; 12(4):354-362. · 1.77 Impact Factor
  • ABSTRACT: For analysis of the main effects of SNPs, meta-analysis of summary results from individual studies has been shown to provide results comparable to those of “mega-analysis”, which jointly analyzes the pooled participant data from the available studies. This fact revolutionized the genetic analysis of complex traits through large GWAS consortia. Investigations of gene-environment (G×E) interactions are on the rise since they can potentially explain a part of the missing heritability and identify individuals at high risk for disease. However, for analysis of gene-environment interactions, it is not known whether these methods yield comparable results. In this empirical study, we report that the results from both methods were largely consistent for all four tests: the standard 1 degree of freedom (df) test of main effect only, the 1 df test of the main effect (in the presence of interaction effect), the 1 df test of the interaction effect, and the joint 2 df test of main and interaction effects. They provided similar effect size and standard error estimates, leading to comparable P-values. The genomic inflation factors and the number of SNPs with various thresholds were also comparable between the two approaches. Mega-analysis is not always feasible, especially in very large and diverse consortia, since pooling of raw data may be limited by the terms of the informed consent. Our study illustrates that meta-analysis can also be an effective approach for identifying interactions. To our knowledge, this is the first report investigating meta- versus mega-analyses for interactions.
    Genetic Epidemiology 04/2014; 38(4). · 2.95 Impact Factor

