Article

The Use of Imputed Values in the Meta-Analysis of Genome-Wide Association Studies

Cancer Prevention Program, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
Genetic Epidemiology (Impact Factor: 2.6). 11/2011; 35(7):597-605. DOI: 10.1002/gepi.20608
Source: PubMed

ABSTRACT

In genome-wide association studies (GWAS), it is common practice to impute the genotypes of untyped single nucleotide polymorphisms (SNPs) by exploiting the linkage disequilibrium structure among SNPs. The use of imputed genotypes improves genome coverage and makes it possible to perform meta-analyses that combine results from studies genotyped on different platforms. A popular way of using imputed data is the "expectation-substitution" method, which treats the imputed dosage as if it were the true genotype. In current practice, the estimates given by the expectation-substitution method are usually combined in meta-analysis using an inverse variance weighting (IVM) scheme. However, the IVM is not optimal, because the estimates given by the expectation-substitution method are generally biased. The optimal weight is, in fact, proportional to the inverse variance and the expected value of the effect size estimates. We show both theoretically and numerically that the bias of the estimates is very small under the practical conditions of low effect sizes in GWAS. This finding validates the use of the expectation-substitution method and shows that the inverse variance is a good approximation of the optimal weight. Through simulation, we compared the power of the IVM method with that of several other methods, including the optimal weight, the regular z-score meta-analysis, and a recently proposed "imputation aware" meta-analysis method (Zaitlen and Eskin [2010] Genet Epidemiol 34:537-542). Our results show that the performance of the inverse variance weight is always indistinguishable from that of the optimal weight and similar to or better than that of the other two methods.
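
As a rough illustration of two ingredients named in the abstract, the following Python sketch computes an expected dosage from imputed genotype probabilities and combines per-study effect estimates with fixed-effect inverse variance weights. The function names and the example numbers are hypothetical and do not come from the article; the sketch does not implement the optimal weight or the "imputation aware" method.

```python
import math


def expected_dosage(p_het, p_hom_alt):
    """Expected allele count (0 to 2) from imputed genotype probabilities.

    p_het is the posterior probability of carrying one copy of the coded
    allele, p_hom_alt the probability of carrying two copies. Under
    expectation-substitution this value stands in for the true genotype.
    """
    return 1.0 * p_het + 2.0 * p_hom_alt


def inverse_variance_meta(betas, std_errors):
    """Fixed-effect inverse variance weighted combination of study estimates.

    Each study is weighted by 1 / SE^2. Returns the pooled effect estimate,
    its standard error, and the corresponding z-score.
    """
    if len(betas) != len(std_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


if __name__ == "__main__":
    # Hypothetical per-study log odds ratios and standard errors for one SNP.
    beta, se, z = inverse_variance_meta([0.10, 0.30], [0.10, 0.10])
    print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.2f}")
    print(f"dosage={expected_dosage(0.5, 0.25):.2f}")
```

With equal standard errors, as in the usage lines, the pooled estimate reduces to the simple average of the study estimates; studies with smaller standard errors would otherwise pull the pooled value toward their own estimate.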

Available from: Shuo Jiao
  • Source
    • "Excluding poorly genotyped variants from only a subset of individuals introduces an unequal sample size across sites, making the downstream statistics more complex. Commonly, this is overcome via the imputation of missing data [48], in which the state of an un-genotyped marker is inferred from the haplotypes of the other individuals. This approach may be valid when data is missing due to technical reasons (low coverage sequencing or poor hybridization to genotyping arrays); however, it is likely to mis-infer the correct state if more than two alleles are present at a site, which will occur whenever SVs and CNVs overlap a SNP."
    ABSTRACT: Over the last 10 years, high-density SNP arrays and DNA re-sequencing have illuminated the majority of the genotypic space for a number of organisms, including humans, maize, rice and Arabidopsis. For any researcher willing to define and score a phenotype across many individuals, Genome Wide Association Studies (GWAS) present a powerful tool to reconnect this trait back to its underlying genetics. In this review we discuss the biological and statistical considerations that underpin a successful analysis or otherwise. The relevance of biological factors including effect size, sample size, genetic heterogeneity, genomic confounding, linkage disequilibrium and spurious association, and statistical tools to account for these are presented. GWAS can offer a valuable first insight into trait architecture or candidate loci for subsequent validation.
    Full-text · Article · Jul 2013 · Plant Methods
  • ABSTRACT: The genetic traits that result in autoimmune diseases represent complicating factors in explicating the molecular and cellular elements of autoimmune responses and how these responses can be overcome or manipulated. This article focuses on the major non-major histocompatibility complex genes that have been found to be linked to autoimmune diseases. A given gene may associate with a number of autoimmune diseases and, conversely, a given disease may link to a number of common autoimmune disease (AD) genes. Collaboration and interaction among genes and the number of diseases that develop and the extensive risk factors shared among ADs further complicate the outcome. This article describes the various relationships between gene regions associated with multiple ADs and the complexity of those relationships.
    No preview · Article · Jan 2012 · Critical Reviews in Immunology
  • ABSTRACT: The genetic traits that result in autoimmune diseases represent complicating factors in explicating the molecular and cellular elements of autoimmune responses and how these responses can be overcome or manipulated. This article focuses on the major non-major histocompatibility complex genes that have been found to be linked to autoimmune diseases. A given gene may associate with a number of autoimmune diseases and, conversely, a given disease may link to a number of common autoimmune disease (AD) genes. Collaboration and interaction among genes and the number of diseases that develop and the extensive risk factors shared among ADs further complicate the outcome. This article describes the various relationships between gene regions associated with multiple ADs and the complexity of those relationships.
    No preview · Article · Nov 2012 · Critical Reviews in Immunology