Balding DJ. A tutorial on statistical methods for population association studies. Nat Rev Genet 2006; 7: 781-791

Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, Norfolk Place, London W2 1PG, UK.
Nature Reviews Genetics (Impact Factor: 36.98). 11/2006; 7(10):781-91. DOI: 10.1038/nrg1916
Source: PubMed

ABSTRACT Although genetic association studies have been with us for many years, even for the simplest analyses there is little consensus on the most appropriate statistical procedures. Here I give an overview of statistical approaches to population association studies, including preliminary analyses (Hardy-Weinberg equilibrium testing, inference of phase and missing data, and SNP tagging), and single-SNP and multipoint tests for association. My goal is to outline the key methods with a brief discussion of problems (population structure and multiple testing), avenues for solutions and some ongoing developments.
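As an illustration of the Hardy-Weinberg equilibrium testing named in the abstract, here is a minimal sketch of a Pearson chi-square test for one biallelic SNP. The function name `hwe_chi_square` and the genotype counts are hypothetical, chosen for this example only. The Pearson test is one common choice and is not necessarily the procedure the tutorial recommends; exact tests are often preferred when genotype counts are small.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test of Hardy-Weinberg equilibrium (1 degree of freedom).

    Arguments are the observed counts of the three genotypes AA, AB and BB.
    Returns the chi-square statistic and its p-value.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")

    # Estimated frequency of allele A from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts with a deficit of heterozygotes.
    chi2, p_value = hwe_chi_square(40, 20, 40)
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

In quality control, a very small p-value at a SNP is usually treated as a flag for possible genotyping error rather than as evidence about disease association.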

237 Reads
    • "For example, if m = 100,000, it is expected that about 5,000 false positive associations are observed by chance even none of SNPs is disease-related. Thus, multiple comparison is an important consideration in GWAS analysis, and must be handled appropriately [18]."
    ABSTRACT: In the past few years, genome-wide association study (GWAS) has made great successes in identifying genetic susceptibility loci underlying many complex diseases and traits. The findings provide important genetic insights into understanding pathogenesis of diseases. In this paper, we present an overview of widely used approaches and strategies for analysis of GWAS, offered a general consideration to deal with GWAS data. The issues regarding data quality control, population structure, association analysis, multiple comparison and visual presentation of GWAS results are discussed; other advanced topics including the issue of missing heritability, meta-analysis, set-based association analysis, copy number variation analysis and GWAS cohort analysis are also briefly introduced. © 2015 the Journal of Biomedical Research. All rights reserved.
    07/2015; 29(4):285-97. DOI:10.7555/JBR.29.20140007
    • "However, the allele frequencies in some of these SNPs display substantial differences between different continental populations (Supplementary material, Table A2), driving to biases in the analyses due to stratification created by population structure patterns and genetic admixture events (Balding, 2006), such structure become an important confounder covariate in regression analyses and must be addressed in detail. "
    ABSTRACT: The wide variation in severity displayed during Dengue Virus (DENV) infection may be influenced by host susceptibility. In several epidemiological approaches, differences in disease outcomes have been found between some ethnic groups, suggesting that human genetic background has an important role in disease severity. In the Caribbean, it has been reported that populations of African descent present considerable less frequency of severe forms compared with Mestizo and White self-reported groups. Admixed populations offer advantages for genetic epidemiology studies due to variation and distribution of alleles, such as those involved in disease susceptibility, as well to provide explanations of individual variability in clinical outcomes. The current study analysed three Colombian populations, which like most of Latin American populations, are made up of the product of complex admixture processes between European, Native American and African ancestors; having as a main goal to assess the effect of genetic ancestry, estimated with 30 Ancestry Informative Markers (AIMs), on DENV infection severity. We found that African Ancestry has a protective effect against severe outcomes under several systems of clinical classification: Severe Dengue (OR: 0.963 for every 1% increase in African Ancestry, 95% confidence interval (0.934 - 0.993), p-value: 0.016), Dengue Haemorrhagic Fever (OR: 0.969, 95% CI (0.947 - 0.991), p-value: 0.006), and occurrence of haemorrhages (OR: 0.971, 95% CI (0.952 - 0.989), p-value: 0.002). Conversely, decrease from 100% to 0% African ancestry significantly increases the chance of severe outcomes: OR is 44-fold for Severe Dengue, 24-fold for Dengue Haemorrhagic Fever, and 20-fold for occurrence of haemorrhages. Furthermore, several warning signs also showed statistically significant association given more evidences in specific stages of DENV infection. These results provide consistent evidence in order to infer statistical models providing a framework for future genetic epidemiology and clinical studies.
    Infection Genetics and Evolution 10/2014; 27. DOI:10.1016/j.meegid.2014.07.003 · 3.02 Impact Factor
    • "To address these aims, we investigated the gene region by requesting the genotype and phenotype data from two publicly available data sets and collected additional fine map and expression data from TwinsUK, since our method requires original, rather than summary published data. All genomic data were screened for quality control and tested for population stratification using standard procedures (Balding, 2006) for the following samples: "
    ABSTRACT: Numerous functional studies have implicated PARL in relation to type 2 diabetes (T2D). We hypothesised that conflicting human association studies may be due to neighbouring causal variants being in linkage disequilibrium (LD) with PARL. We conducted a comprehensive candidate gene study of the extended LD genomic region that includes PARL and transporter ABCC5 using three data sets (two European and one African American), in relation to healthy glycaemic variation, visceral fat accumulation and T2D disease. We observed no evidence for previously reported T2D association with Val262Leu or PARL using array and fine-map genomic and expression data. By contrast, we observed strong evidence of T2D association with ABCC5 (intron 26) for European and African American samples (P = 3E−07) and with ABCC5 adipose expression in Europeans [odds ratio (OR) = 3.8, P = 2E−04]. The genomic location estimate for the ABCC5 functional variant, associated with all phenotypes and expression data (P = 1E−11), was identical for all samples (at Chr3q 185,136 kb B36), indicating that the risk variant is an expression quantitative trait locus (eQTL) with increased expression conferring risk of disease. That the association with T2D is observed in populations of disparate ancestry suggests the variant is a ubiquitous risk factor for T2D.
    Annals of Human Genetics 09/2014; 78(5). DOI:10.1111/ahg.12072 · 2.21 Impact Factor
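The multiple-testing arithmetic in the first citing excerpt above (about 5,000 false positives among 100,000 SNPs) can be reproduced with a short sketch. The per-test significance level of 0.05 is an assumption inferred from those two figures; the excerpt does not state it. The Bonferroni threshold is shown as one standard correction, not as the method any of the cited papers necessarily used, and both function names are hypothetical.

```python
def expected_false_positives(m, alpha=0.05):
    """Expected number of false positives when all m null hypotheses are true.

    alpha is the per-test significance level (assumed, not given in the excerpt).
    """
    return m * alpha


def bonferroni_threshold(m, alpha=0.05):
    """Per-test p-value threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    m = 100_000  # number of SNPs tested, as in the excerpt
    print(f"expected false positives: {expected_false_positives(m):.0f}")
    print(f"Bonferroni per-SNP threshold: {bonferroni_threshold(m):.1e}")
```

Bonferroni treats the tests as independent, so it tends to be conservative when neighbouring SNPs are correlated through linkage disequilibrium.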