Correcting for Measurement Error in Individual Ancestry Estimates in Structured Association Tests

Center for Public Health Genomics, Department of Biostatistical Sciences, Division of Public Health Services, Wake Forest University Health Sciences, Winston-Salem, North Carolina 27101, USA.
Genetics (Impact Factor: 5.96). 08/2007; 176(3):1823-33. DOI: 10.1534/genetics.107.075408
Source: PubMed


We present theoretical explanations and show through simulation that the individual admixture proportion estimates obtained by using ancestry informative markers should be seen as an error-contaminated measurement of the underlying individual ancestry proportion. These estimates can be used in structured association tests as a control variable to limit type I error inflation or reduce the loss of power due to population stratification observed in studies of admixed populations. However, the inclusion of such error-containing variables as covariates in regression models can bias parameter estimates and reduce the ability to control for the confounding effect of admixture in genetic association tests. Measurement error correction methods offer a way to overcome this problem but require an a priori estimate of the measurement error variance. We show how an upper bound of this variance can be obtained, present four measurement error correction methods that are applicable to this problem, and conduct a simulation study to compare their utility in the case where the admixed population results from intermating between two ancestral populations. Our results show that the quadratic measurement error correction (QMEC) method performs better than the other methods and maintains the type I error at its nominal level.
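The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the QMEC method and is not taken from the paper: it assumes classical additive error with a known variance and applies a simple moment-based correction to an ordinary least-squares fit. The names `simulate` and `fit`, the allele frequencies, the effect size, and the error standard deviation are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=50000, sigma_u=0.15, seed=0):
    """Simulate a two-population admixture setting (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    a = rng.beta(2.0, 2.0, size=n)              # true individual ancestry proportion
    # Error-contaminated ancestry estimate (additive error; may leave [0, 1]).
    w = a + rng.normal(0.0, sigma_u, size=n)
    p = 0.8 * a + 0.2 * (1.0 - a)               # allele frequency depends on ancestry
    g = rng.binomial(2, p).astype(float)        # genotype with NO causal effect
    y = 2.0 * a + rng.normal(0.0, 1.0, size=n)  # phenotype depends on ancestry only
    return y, g, w, a


def fit(y, g, ancestry, error_var=0.0):
    """Least-squares slopes for (genotype, ancestry).

    If error_var > 0, subtract it from the variance of the ancestry covariate
    before solving the normal equations (moment-based correction).
    """
    x = np.column_stack([g, ancestry])
    xc = x - x.mean(axis=0)                     # centre covariates
    yc = y - y.mean()                           # centre outcome
    n = len(y)
    sxx = xc.T @ xc / n                         # covariate covariance matrix
    sxx[1, 1] -= error_var                      # remove measurement error variance
    sxy = xc.T @ yc / n                         # covariate-outcome covariances
    return np.linalg.solve(sxx, sxy)


if __name__ == "__main__":
    y, g, w, a = simulate()
    print("naive     (genotype, ancestry):", fit(y, g, w))
    print("corrected (genotype, ancestry):", fit(y, g, w, error_var=0.15 ** 2))
    print("oracle    (genotype, ancestry):", fit(y, g, a))
```

In this setup the genotype has no true effect, so a genotype slope that stays away from zero in the naive fit reflects residual confounding by ancestry. The correction here uses the exact error variance; in practice only an estimate or an upper bound would be available, which is the situation the abstract addresses.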



Available from: Jose Fernandez, Feb 26, 2014
    • "The admixture proportions of each population sample were estimated by the ADMIX software version 2.0 (23,24). The parental population allele frequencies used were described by Shriver et al. (18). "
    ABSTRACT: Suppressor of cytokine signaling 3, myxovirus resistance protein and osteopontin gene polymorphisms may influence the therapeutic response in patients with chronic hepatitis C, and an association with IL28 might increase the power to predict sustained virologic response. Our aims were to evaluate the association between myxovirus resistance protein, osteopontin and suppressor of cytokine signaling 3 gene polymorphisms in combination with IL28B and to assess the therapy response in hepatitis C patients treated with pegylated-interferon plus ribavirin. Myxovirus resistance protein, osteopontin, suppressor of cytokine signaling 3 and IL28B polymorphisms were analyzed by PCR-restriction fragment length polymorphism, direct sequencing and real-time PCR. Ancestry was determined using genetic markers. We analyzed 181 individuals, including 52 who were sustained virologic responders. The protective genotype frequencies among the sustained virologic response group were as follows: the G/G suppressor of cytokine signaling 3 (rs4969170) (62.2%); T/T osteopontin (rs2853744) (60%); T/T osteopontin (rs11730582) (64.3%); and the G/T myxovirus resistance protein (rs2071430) genotype (54%). The patients who had ≥3 of the protective genotypes from the myxovirus resistance protein, the suppressor of cytokine signaling 3 and osteopontin had a greater than 90% probability of achieving a sustained response (p<0.0001). The C/C IL28B genotype was present in 58.8% of the subjects in this group. The sustained virological response rates increased to 85.7% and 91.7% by analyzing C/C IL28B with the T/T osteopontin genotype at rs11730582 and the G/G suppressor of cytokine signaling 3 genotype, respectively. Genetic ancestry analysis revealed an admixed population. 
    Hepatitis C genotype 1 patients who were responders to interferon-based therapy had a high frequency of multiple protective polymorphisms in the myxovirus resistance protein, osteopontin and suppressor of cytokine signaling 3 genes. The combined analysis of the suppressor of cytokine signaling 3 and IL28B genotypes more effectively predicted sustained virologic response than IL28B analysis alone.
    Clinics (São Paulo, Brazil) 10/2013; 68(10):1325-32. DOI:10.6061/clinics/2013(10)06 · 1.19 Impact Factor
    • "The SAT approach is commonly used to correct for population stratification and admixture in genetic association tests. Recently published papers have shown that SAT approaches may fail to provide nominal type I errors for various reasons, including measurement error in the estimation of the confounding effect [34,35] and cases where the estimated confounder captures an insufficient fraction of phenotypic variation [36]. However, we restricted our attention to older and better-known SAT approaches [33,37-40]. "
    ABSTRACT: Questions remain regarding the utility of self-reported ethnicity (SRE) in genetic and epidemiologic research. It is not clear whether conditioning on SRE provides adequate protection from inflated type I error rates due to population stratification and admixture. We address this question using data obtained from the Multi-Ethnic Study of Atherosclerosis (MESA), which enrolled individuals from 4 self-reported ethnic groups. We compare the agreement between SRE and genetic based measures of ancestry (GBMA), and conduct simulation studies based on observed MESA data to evaluate the performance of each measure under various conditions. Four clusters are identified using 96 ancestry informative markers. Three of these clusters are well delineated, but 30% of the self-reported Hispanic-Americans are misclassified. We also found that MESA SRE provides type I error rates that are consistent with the nominal levels. More extensive simulations revealed that this finding is likely due to the multi-ethnic nature of the MESA. Finally, we describe situations where SRE may perform as well as a GBMA in controlling the effect of population stratification and admixture in association tests. The performance of SRE as a control variable in genetic association tests is more nuanced than previously thought, and may have more value than it is currently credited with, especially when smaller replication studies are being considered in multi-ethnic samples.
    BMC Genetics 03/2011; 12(1):28. DOI:10.1186/1471-2156-12-28 · 2.40 Impact Factor
    • "Considering each phenotype and each genetic model separately, we applied a Bonferroni multiple correction to the marker association tests; a p-value cut-off of 3.6 × 10⁻⁴ keeps the nominal type I error rate at 0.05. To determine the extent to which measurement error in admixture estimates could skew the results, we applied the method described by Divers et al. [46]. Basically, we obtained an estimate of the measurement error covariance and applied the simulation extrapolation (SimEx) algorithm [47] to retest for association between each marker and phenotype, for each mode of inheritance model. Analyses were carried out with PLINK [48], SAS 9.1 software (SAS Institute, Cary, NC, USA) and R [49]. "
    ABSTRACT: Type 2 diabetes represents an increasing health burden. Its prevalence is rising among younger age groups and differs among racial/ethnic groups. Little is known about its genetic basis, including whether there is a genetic basis for racial/ethnic disparities. We examined a multi-ethnic sample of 253 healthy children to evaluate associations between insulin-related phenotypes and 142 ancestry-informative markers (AIMs), while adjusting for sex, age, Tanner stage, genetic admixture, total body fat, height and socio-economic status. We also evaluated the effect of measurement errors in the estimation of the individual ancestry proportions on the regression results. We found that European genetic admixture is positively associated with insulin sensitivity (S_I), and negatively associated with the acute insulin response to glucose, fasting insulin levels and the homeostasis model assessment of insulin resistance. Our analysis revealed associations between individual AIMs on chromosomes 2, 8 and 15 and these phenotypes. Most notably, marker rs3287 at chromosome 2p21 was found to be associated with S_I (p = 5.8 × 10⁻⁵). This marker may be in admixture linkage disequilibrium with nearby loci (THADA and BCL11A) that previously have been reported to be associated with diabetes and diabetes-related phenotypes in several genome-wide association and linkage studies. Our results provide further evidence that variation in the 2p21 region containing THADA and BCL11A is associated with type 2 diabetes. Importantly, we have implicated this region in the early development of diabetes-related phenotypes, and in the genetic aetiology of population differences in these phenotypes.
    Human genomics 01/2011; 5(2):79-89. DOI:10.1186/1479-7364-5-2-79 · 2.15 Impact Factor