Informed Conditioning on Clinical Covariates Increases Power in Case-Control Association Studies

Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
PLoS Genetics (Impact Factor: 7.53). 11/2012; 8(11):e1003032. DOI: 10.1371/journal.pgen.1003032
Source: PubMed


Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low-BMI cases are larger than those estimated from high-BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene × covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = 1×10⁻⁹). The improvement varied across diseases with a 16% median increase in χ² test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.
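The intuition behind conditioning on a covariate under a liability threshold model can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a simplified model in which liability equals a known covariate effect plus a standard normal residual, and disease occurs when liability exceeds a threshold set by the prevalence among individuals whose covariate effect is zero. The function names, the parameter `covariate_effect`, and the numeric values are hypothetical. The sketch computes the expected residual liability of a sample given its case status and covariate, which is the kind of quantity a test could then correlate with genotype.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and variance 1


def liability_threshold(prevalence):
    """Threshold t with P(residual > t) = prevalence for a N(0, 1) residual.

    Assumption: `prevalence` refers to individuals whose covariate effect is 0.
    """
    return STD_NORMAL.inv_cdf(1.0 - prevalence)


def posterior_mean_residual(is_case, covariate_effect, prevalence):
    """Expected residual liability given case status and covariate.

    Assumed model (illustrative only): liability = covariate_effect + eps,
    eps ~ N(0, 1), and an individual is a case when liability > threshold.
    """
    # The residual must exceed (case) or fall below (control) this cutoff.
    cutoff = liability_threshold(prevalence) - covariate_effect
    density = STD_NORMAL.pdf(cutoff)
    lower_mass = STD_NORMAL.cdf(cutoff)
    if is_case:
        # Mean of a standard normal truncated to (cutoff, +inf).
        return density / (1.0 - lower_mass)
    # Mean of a standard normal truncated to (-inf, cutoff).
    return -density / lower_mass


if __name__ == "__main__":
    prevalence = 0.08  # hypothetical value, not taken from the paper
    for label, effect in (("low-BMI", -0.5), ("high-BMI", 0.5)):
        case_mean = posterior_mean_residual(True, effect, prevalence)
        control_mean = posterior_mean_residual(False, effect, prevalence)
        print(f"{label}: case {case_mean:.3f}, control {control_mean:.3f}")
```

In this toy model a case with a protective covariate value (here labelled low-BMI) has a higher expected residual liability than a case with a risk-increasing value, which is consistent with the abstract's observation that odds ratios estimated from low-BMI type 2 diabetes cases are larger than those from high-BMI cases.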


