Two-stage designs for gene-disease association studies with sample size constraints.

Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA.
Biometrics (Impact Factor: 1.52). 10/2004; 60(3):589-97. DOI: 10.1111/j.0006-341X.2004.00207.x
Source: PubMed

ABSTRACT Gene-disease association studies based on case-control designs may often be used to identify candidate polymorphisms (markers) conferring disease risk. If a large number of markers are studied, genotyping all markers on all samples is inefficient in resource utilization. Here, we propose an alternative two-stage method to identify disease-susceptibility markers. In the first stage all markers are evaluated on a fraction of the available subjects. The most promising markers are then evaluated on the remaining individuals in Stage 2. This approach can be cost effective since markers unlikely to be associated with the disease can be eliminated in the first stage. Using simulations we show that, when the markers are independent and when they are correlated, the two-stage approach provides a substantial reduction in the total number of marker evaluations for a minimal loss of power. The power of the two-stage approach is evaluated when a single marker is associated with the disease, and in the presence of multiple disease-susceptibility markers. As a general guideline, the simulations over a wide range of parametric configurations indicate that evaluating all the markers on 50% of the individuals in Stage 1 and evaluating the most promising 10% of the markers on the remaining individuals in Stage 2 provides near-optimal power while resulting in a 45% decrease in the total number of marker evaluations.
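The 45% figure in the guideline above can be checked with simple arithmetic: if all markers are typed on a fraction of the subjects in Stage 1, and only a fraction of the markers are typed on the remaining subjects in Stage 2, the total number of marker evaluations relative to a one-stage design is 0.5 + 0.1 × 0.5 = 0.55. The short Python sketch below works this through. The function names and the example cohort of 1000 subjects and 1000 markers are illustrative assumptions, not taken from the paper, and the sketch counts evaluations only; it does not estimate power.

```python
def two_stage_evaluations(n_subjects, n_markers, stage1_fraction, marker_fraction):
    """Count marker evaluations in a two-stage design (illustrative helper).

    Stage 1: every marker is genotyped on a fraction of the subjects.
    Stage 2: only the most promising markers are genotyped on the rest.
    """
    n_stage1 = round(n_subjects * stage1_fraction)   # subjects used in Stage 1
    n_stage2 = n_subjects - n_stage1                 # remaining subjects
    m_stage2 = round(n_markers * marker_fraction)    # markers carried forward
    return n_markers * n_stage1 + m_stage2 * n_stage2


def reduction_vs_one_stage(n_subjects, n_markers, stage1_fraction, marker_fraction):
    """Fractional saving relative to genotyping all markers on all subjects."""
    one_stage = n_subjects * n_markers
    two_stage = two_stage_evaluations(
        n_subjects, n_markers, stage1_fraction, marker_fraction
    )
    return 1 - two_stage / one_stage


if __name__ == "__main__":
    # Hypothetical cohort: 1000 subjects, 1000 markers (assumed numbers).
    saving = reduction_vs_one_stage(1000, 1000, stage1_fraction=0.5, marker_fraction=0.1)
    print(f"Reduction in marker evaluations: {saving:.0%}")
```

The saving depends only on the two fractions, not on the cohort size, so the same 45% holds for any number of subjects and markers up to rounding.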

  • ABSTRACT: Genome-wide association studies (GWAS) incur large phenotyping and genotyping costs. A two-stage design can reduce genotyping costs efficiently: in the first stage, candidate disease-associated SNPs are detected, and in the second stage these associations are tested at a reliable significance level. Because fewer SNPs are genotyped in the second stage, the genotyping costs are lower than those of a one-stage design. Modern genotyping technologies use 96- and 384-well plates, so the number of individuals should be a multiple of the well-plate size. Monte Carlo simulation was used to find the optimal number of well plates and the critical values for the first and second stages. We also found that the costs are inversely related to the Kullback-Leibler divergence between the case and control distributions under the alternative hypothesis.
    Applied Methods of Statistical Analysis. Simulations and Statistical Inference (AMSA 2013) International Conference, Novosibirsk, Russia; 09/2013
  • ABSTRACT: Publisher’s description: This book covers the statistical models and methods that are used to understand human genetics, following the historical and recent developments of human genetics. Starting with Mendel’s first experiments to genome-wide association studies, the book describes how genetic information can be incorporated into statistical models to discover disease genes. All commonly used approaches in statistical genetics (e.g. aggregation analysis, segregation, linkage analysis, etc), are used, but the focus of the book is modern approaches to association analysis. Numerous examples illustrate key points throughout the text, both of Mendelian and complex genetic disorders. The intended audience is statisticians, biostatisticians, epidemiologists and quantitatively-oriented geneticists and health scientists wanting to learn about statistical methods for genetic analysis, whether to better analyze genetic data, or to pursue research in methodology. A background in intermediate level statistical methods is required. The authors include few mathematical derivations, and the exercises provide problems for students with a broad range of skill levels. No background in genetics is assumed.
  • ABSTRACT: Aims: To determine whether placental IGF1R, IGFBP3, INSR and IGF1 DNA methylation and mRNA levels were dysregulated when exposed to maternal impaired glucose tolerance (IGT) and investigate whether the epigenetic profile is associated with feto-placental developmental markers. Patients & methods: The IGT diagnosis was made according to the WHO criteria (IGT: n = 34; normal glucose tolerance [NGT]: n = 106). DNA methylation and mRNA levels were quantified using bisulfite pyrosequencing and qRT-PCR, respectively. Results: IGF1R and IGFBP3 DNA methylation levels were lower in placentas exposed to IGT compared with NGT (-4.3%; p = 0.021 and -2.5%; p = 0.006 respectively) and correlated with 2-h post-oral glucose tolerance test (OGTT) glycemia (r = -0.23; p = 0.010 and r = -0.20; p = 0.028, respectively). IGF1R mRNA levels were associated with newborns' growth markers (e.g., birth weight; r = 0.20; p = 0.032). Conclusion: These results support the growth-promoting role of the IGF system in placental/fetal development and suggest that the IGF1R and IGFBP3 DNA methylation profiles are dysregulated in IGT, potentially affecting the fetal metabolic programming.
    Epigenomics 04/2014; 6(2):193-207. · 2.43 Impact Factor