Beyond Missing Heritability: Prediction of Complex Traits

Department of Biostatistics, University of Alabama at Birmingham, Alabama, United States of America.
PLoS Genetics (Impact Factor: 8.17). 04/2011; 7(4):e1002051. DOI: 10.1371/journal.pgen.1002051
Source: PubMed

ABSTRACT Despite rapid advances in genomic technology, our ability to account for phenotypic variation using genetic information remains limited for many traits. This has unfortunately resulted in limited application of genetic data towards preventive and personalized medicine, one of the primary impetuses of genome-wide association studies. Recently, a large proportion of the "missing heritability" for human height was statistically explained by modeling thousands of single nucleotide polymorphisms concurrently. However, it is currently unclear how gains in explained genetic variance will translate to the prediction of yet-to-be observed phenotypes. Using data from the Framingham Heart Study, we explore the genomic prediction of human height in training and validation samples while varying the statistical approach used, the number of SNPs included in the model, the validation scheme, and the number of subjects used to train the model. In our training datasets, we are able to explain a large proportion of the variation in height (h² up to 0.83, R² up to 0.96). However, the proportion of variance accounted for in validation samples is much smaller (ranging from 0.15 to 0.36 depending on the degree of familial information used in the training dataset). While such R² values vastly exceed what has been previously reported using a reduced number of pre-selected markers (<0.10), given the heritability of the trait (~0.80), substantial room for improvement remains.

  • ABSTRACT: The aim of this study was to separate marked additive genetic variability for three quantitative traits in chickens into components associated with classes of minor allele frequency (MAF), individual chromosomes and marker density using the genomewide complex trait analysis (GCTA) approach. Data were from 1351 chickens measured for body weight (BW), ultrasound of breast muscle (BM) and hen house egg production (HHP), each bird with 354,364 SNP genotypes. Estimates of variance components show that SNPs on commercially available genotyping chips marked a large amount of genetic variability for all three traits. The estimated proportion of total variation tagged by all autosomal SNPs was 0.30 (SE 0.04) for BW, 0.33 (SE 0.04) for BM, and 0.19 (SE 0.05) for HHP. We found that a substantial proportion of this variation was explained by low frequency variants (MAF <0.20) for BW and BM, and variants with MAF 0.10–0.30 for HHP. The marked genetic variance explained by each chromosome was linearly related to its length (R² = 0.60) for BW and BM. However, for HHP, there was no linear relationship between estimates of variance and length of the chromosome (R² = 0.01). Our results suggest that the contribution of SNPs to marked additive genetic variability is dependent on the allele frequency spectrum. For the sample of birds analysed, it was found that increasing marker density beyond 100K SNPs did not capture additional additive genetic variance.
    Journal of Animal Breeding and Genetics 06/2014; DOI:10.1111/jbg.12079 · 2.06 Impact Factor
  • ABSTRACT: Accurate genetic prediction of quantitative traits related to complex disease risk would have potential clinical impact, so investigation of statistical methodology to improve predictive performance is important. We compare a simple approach of polygenic scores using top ranking single nucleotide polymorphisms (SNPs) to a set of shrinkage models, namely Ridge Regression, Lasso and Hyper-Lasso. These penalised regression methods analyse all genotyped SNPs simultaneously, potentially including much larger sets of SNPs in the models, not only those with the smallest P values. We compare the accuracy of these models for predicting low-density lipoprotein (LDL) and high-density lipoprotein (HDL) cholesterol, two lipid traits of clinical relevance, in the Whitehall II and British Women's Health and Heart Study cohorts, using SNPs from the HumanCVD BeadChip. For gene scores, the most accurate predictions arise from multivariate weighted scores and include only a small number of SNPs, identified as top hits by the HumanCVD BeadChip. Furthermore, there was little benefit from including external results from published sets of SNPs. We found that shrinkage approaches rarely improved significantly on gene score results. Genetic predictive performance is trait specific, depending on the heritability and genetic architecture of the trait, and is limited by the training data sample size. Our results for lipid traits suggest no current benefit of more complex methods over existing gene score methods. Instead, the most important choice for the prediction model is the number of SNPs and selection of the most predictive SNPs to include. However, further comparisons, in larger samples and for other phenotypes, would still be of interest.
    Genetic Epidemiology 01/2014; 38(1). DOI:10.1002/gepi.21777 · 2.95 Impact Factor
  • ABSTRACT: LEPR, MC4R, IGF2 and PRKAG3 are genes with known effects on fat content and distribution in pig carcass and pork. In a study performed with Duroc × Landrace/Large White pigs, we have found that IGF2 has strong additive effects on several carcass conformational traits and on fatty acid composition in several anatomical locations. MC4R shows additive effects on saturated fatty acid content in several muscles. On the other hand, almost no additive effect has been found for PRKAG3 and very few for LEPR. In this work, no dominant effect has been found for any of the four genes. Using a Bayesian Lasso approach, we have now been able to find first-order epistatic (mainly dominant-additive) effects between LEPR and PRKAG3 for intramuscular fat content and for saturated fatty acid content in L. dorsii, B. femoralis, Ps. major and whole ham. The presence of interactions between genes in the shaping of traits of such importance as intramuscular fat content and composition highlights the complexity of heritable traits and the difficulty of gene-assisted selection for such traits.
    Animal Genetics 09/2013; 45(1). DOI:10.1111/age.12091 · 2.21 Impact Factor
