Robust linear regression methods in association studies.

Department of Mathematics, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal.
Bioinformatics (Impact Factor: 5.47). 01/2011; 27(6):815-21. DOI: 10.1093/bioinformatics/btr006
Source: PubMed

ABSTRACT It is well known that data deficiencies, such as coding/rounding errors, outliers or missing values, may lead to misleading results for many statistical methods. Robust statistical methods are designed to accommodate certain types of those deficiencies, allowing for reliable results under various conditions. We analyze statistical tests for detecting associations between individual genomic variations (SNPs) and quantitative traits when deviations from the normality assumption are observed. We consider the classical analysis of variance tests for the parameters of the appropriate linear model and a robust version of those tests based on M-regression. We then compare their empirical power and level using simulated data with several degrees of contamination.
Data normality is nothing but a mathematical convenience. In practice, experiments usually yield data with non-conforming observations. In the presence of this type of data, classical least squares statistical methods perform poorly, giving biased estimates, raising the number of spurious associations and often failing to detect true ones. We show, through a simulation study and a real data example, that the robust methodology can be more powerful and thus more adequate for association studies than the classical approach.
The code of the robustified version of function lmekin() from the R package kinship is provided as Supplementary Material.
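
The contrast the abstract draws between least squares and M-regression can be illustrated with a small sketch. This is not the authors' robustified lmekin(), which is written in R and fits a mixed model; it is a plain Python illustration under stated assumptions: a single SNP coded 0/1/2, a simple linear model, a Huber loss with tuning constant c = 1.345 (the abstract only says "M-regression"), a scale estimate of median(|residual|)/0.6745, and simulated data in which four observations are shifted by hand. All function names are hypothetical, and the sketch only compares slope estimates; it does not compute the tests studied in the paper.

```python
import random
import statistics


def wls_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_line(x, y, c=1.345, max_iter=100, tol=1e-10):
    """Huber M-estimate of (a, b) by iteratively reweighted least squares.

    Assumption: Huber loss with c = 1.345 and a residual scale re-estimated
    at every iteration; the paper's exact choices are not given in the abstract.
    """
    a, b = wls_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(max_iter):
        res = [yi - a - b * xi for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal data
        scale = statistics.median([abs(r) for r in res]) / 0.6745
        if scale == 0.0:
            break
        # Huber weights: 1 for small residuals, c*scale/|r| for large ones
        w = [min(1.0, c * scale / abs(r)) if r != 0.0 else 1.0 for r in res]
        a_new, b_new = wls_line(x, y, w)
        shift = abs(a_new - a) + abs(b_new - b)
        a, b = a_new, b_new
        if shift < tol:
            break
    return a, b


def demo(seed=1):
    """Simulate a trait with true SNP effect 0.5 and contaminate four values."""
    rng = random.Random(seed)
    genotype = [i % 3 for i in range(60)]  # 20 individuals per genotype
    trait = [1.0 + 0.5 * g + rng.gauss(0.0, 0.2) for g in genotype]
    # Hand-made contamination: shift four carriers of genotype 2 downwards
    for i in [i for i, g in enumerate(genotype) if g == 2][:4]:
        trait[i] -= 5.0
    _, b_ols = wls_line(genotype, trait, [1.0] * len(trait))
    _, b_rob = huber_line(genotype, trait)
    return b_ols, b_rob


b_ols, b_rob = demo()
print(f"least-squares slope: {b_ols:.3f}")
print(f"Huber M-regression slope: {b_rob:.3f}")
```

In this simulated setting the four shifted observations pull the least-squares slope well away from the true effect of 0.5, while the Huber fit down-weights them and stays closer to it. That is the mechanism behind the bias and loss of power that the abstract attributes to classical methods under contamination.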

  • ABSTRACT: BACKGROUND: IgE is both a marker and mediator of allergic inflammation. Despite reported differences in serum total IgE levels by race-ethnicity, African American and Latino subjects have not been well represented in genetic studies of total IgE. OBJECTIVE: We sought to identify the genetic predictors of serum total IgE levels. METHODS: We used genome-wide association data from 4292 subjects (2469 African Americans, 1564 European Americans, and 259 Latinos) in the EVE Asthma Genetics Consortium. Tests for association were performed within each cohort by race-ethnic group (ie, African American, Latino, and European American) and asthma status. The resulting P values were meta-analyzed, accounting for sample size and direction of effect. Top single nucleotide polymorphism associations from the meta-analysis were reassessed in 6 additional cohorts comprising 5767 subjects. RESULTS: We identified 10 unique regions in which the combined association statistic was associated with total serum IgE levels (P < 5.0 × 10^-6) and the minor allele frequency was 5% or greater in 2 or more population groups. Variant rs9469220, corresponding to HLA-DQB1, was the single nucleotide polymorphism most significantly associated with serum total IgE levels when assessed in both the replication cohorts and the discovery and replication sets combined (P = .007 and 2.45 × 10^-7, respectively). In addition, findings from earlier genome-wide association studies were also validated in the current meta-analysis. CONCLUSION: This meta-analysis independently identified a variant near HLA-DQB1 as a predictor of total serum IgE levels in multiple race-ethnic groups. This study also extends and confirms the findings of earlier genome-wide association analyses in African American and Latino subjects.
    The Journal of Allergy and Clinical Immunology 11/2012 · 12.05 Impact Factor
  • ABSTRACT: Increased postprandial lipid (PPL) response to dietary fat intake is a heritable risk factor for cardiovascular disease (CVD). Variability in postprandial lipids results from the complex interplay of dietary and genetic factors. We hypothesized that detailed lipid profiles (eg, sterols and fatty acids) may help elucidate specific genetic and dietary pathways contributing to the PPL response.
    PLoS ONE 01/2014; 9(6):e99509. · 3.53 Impact Factor
  • ABSTRACT: One recent study indicates a significant association between certain single nucleotide polymorphisms (SNPs) in the genomic sequence of feline p53 and feline injection-site sarcoma (FISS). The aim of this study was to investigate the correlation between a specific nucleotide insertion in the p53 gene and FISS in a German cat population. Blood samples from 150 German cats were allocated to a control group consisting of 100 healthy cats and a FISS group consisting of 50 cats with FISS. All blood samples were examined for the presence of the SNP in the p53 gene. The T-insertion at SNP 3 was found in 20.0% of the cats in the FISS group and 19.2% of the cats in the control group. No statistically significant difference was observed in allelic distribution between the two groups. Further investigations are necessary to determine the association of SNPs in the feline p53 gene and the occurrence of FISS.
    Veterinary and Comparative Oncology 08/2012 · 1.56 Impact Factor