Using Extended Genealogy to Estimate Components of Heritability for 23 Quantitative and Dichotomous Traits

The University of Queensland, Australia
PLoS Genetics (Impact Factor: 7.53). 05/2013; 9(5):e1003520. DOI: 10.1371/journal.pgen.1003520
Source: PubMed


Important knowledge about the determinants of complex human phenotypes can be obtained from the estimation of heritability, the fraction of phenotypic variation in a population that is determined by genetic factors. Here, we make use of extensive phenotype data in Iceland, long-range phased genotypes, and a population-wide genealogical database to examine the heritability of 11 quantitative and 12 dichotomous phenotypes in a sample of 38,167 individuals. Most previous estimates of heritability are derived from family-based approaches such as twin studies, which may be biased upwards by epistatic interactions or shared environment. Our estimates of heritability, based on both closely and distantly related pairs of individuals, are significantly lower than those from previous studies. We examine phenotypic correlations across a range of relationships, from siblings to first cousins, and find that the excess phenotypic correlation in these related individuals is predominantly due to shared environment as opposed to dominance or epistasis. We also develop a new method to jointly estimate narrow-sense heritability and the heritability explained by genotyped SNPs. Unlike existing methods, this approach permits the use of information from both closely and distantly related pairs of individuals, thereby reducing the variance of estimates of heritability explained by genotyped SNPs while preventing upward bias. Our results show that common SNPs explain a larger proportion of the heritability than previously thought, with SNPs present on Illumina 300K genotyping arrays explaining more than half of the heritability for the 23 phenotypes examined in this study. Much of the remaining heritability is likely to be due to rare alleles that are not captured by standard genotyping arrays.
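The comparison of phenotypic correlations across relationship classes can be illustrated with a minimal sketch. It assumes the standard purely additive model, in which the expected phenotypic correlation between two relatives is their coefficient of relationship multiplied by the narrow-sense heritability, with no shared environment, dominance, or epistasis. The function names, the heritability value, and the observed correlations are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' estimation method.

```python
# Coefficients of relationship: expected fraction of the genome shared
# identical by descent for each class of relative pair.
RELATEDNESS = {
    "full siblings": 0.5,
    "half siblings": 0.25,
    "first cousins": 0.125,
}


def expected_additive_correlation(h2, relatedness):
    """Expected phenotypic correlation under a purely additive model
    with no shared environment: relatedness * narrow-sense heritability."""
    return relatedness * h2


def excess_correlation(observed, h2, relatedness):
    """Observed correlation minus the additive expectation. A positive
    value points to shared environment or non-additive genetic effects."""
    return observed - expected_additive_correlation(h2, relatedness)


# Hypothetical inputs, for illustration only (not values from the paper).
h2 = 0.4
observed = {
    "full siblings": 0.30,
    "half siblings": 0.12,
    "first cousins": 0.05,
}

for pair, r in RELATEDNESS.items():
    expected = expected_additive_correlation(h2, r)
    excess = excess_correlation(observed[pair], h2, r)
    print(f"{pair}: expected={expected:.3f}, excess={excess:.3f}")
```

In this toy setting, an excess that is largest for siblings and shrinks quickly for more distant relatives would be more consistent with shared environment than with additive effects alone, since distant relatives rarely share a household.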

  • Source
    • "Before embarking on Genome Wide Association (GWA) projects, the heritability of complex traits is often assessed in twin and family studies, or, more recently, assessed based on common single nucleotide polymorphisms (SNPs). Such SNP-based heritability can be estimated when genetic similarities between distantly related individuals are summarized in a genetic relatedness matrix, which then is used to predict their phenotype similarity (Visscher et al. 2010; Lubke et al. 2012; Lee et al. 2012; Zaitlen et al. 2013). This technique, known as genomic-relatedness-matrix restricted maximum likelihood (GREML; Benjamin et al. 2012), is implemented, for example, in the software package GCTA (Genome-wide Complex Trait Analysis; Yang et al. 2011). "
    ABSTRACT: Combining genotype data across cohorts increases power to estimate the heritability due to common single nucleotide polymorphisms (SNPs), based on analyzing a Genetic Relationship Matrix (GRM). However, the combination of SNP data across multiple cohorts may lead to stratification when, for example, different genotyping platforms are used. In the current study, we address issues of combining SNP data from different cohorts, the Netherlands Twin Register (NTR) and the Generation R (GENR) study. Both cohorts include children of Northern European Dutch background (N = 3102 and 2826, respectively) who were genotyped on different platforms. We explore imputation and phasing as a tool and compare three GRM-building strategies, when data from two cohorts are (1) just combined, (2) pre-combined and cross-platform imputed and (3) cross-platform imputed and post-combined. We test these three strategies with data on childhood height for unrelated individuals (N = 3124, average age 6.7 years) to explore their effect on SNP-heritability estimates and compare results to those obtained from the independent studies. All combination strategies result in SNP-heritability estimates with a standard error smaller than those of the independent studies. We did not observe a significant difference in estimates of SNP-heritability based on various cross-platform imputed GRMs. SNP-heritability of childhood height was on average estimated as 0.50 (SE = 0.10). Introducing cohort as a covariate resulted in a drop of ≈2%. Principal components (PCs) adjustment resulted in SNP-heritability estimates of about 0.39 (SE = 0.11). Strikingly, we did not find a significant difference between cross-platform imputed and combined GRMs. All estimates were significant regardless of the use of PC adjustment.
Based on these analyses we conclude that imputation with a reference set helps to increase power to estimate SNP-heritability by combining cohorts of the same ethnicity genotyped on different platforms. However, important factors should be taken into account such as remaining cohort stratification after imputation and/or phenotypic heterogeneity between and within cohorts. Whether one should use imputation, or just combine the genotype data, depends on the number of overlapping SNPs in relation to the total number of genotyped SNPs for both cohorts, and their ability to tag all the genetic variance related to the specific trait of interest.
    Behavior Genetics 06/2015; 45(5). DOI:10.1007/s10519-015-9725-7 · 3.21 Impact Factor
  • Source
    • "Ten previously published GWASs investigated cross-sectional BMI and reported a genome-wide significant association for a variant within or around the fat mass and obesity-associated protein (FTO) gene [13]–[15], [21], [22], [36]–[40]. Visual inspection of our data does not show convincing evidence for a strong signal in the FTO region (Figure 1). "
    ABSTRACT: Many health outcomes are influenced by a person's body mass index, as well as by the trajectory of body mass index through a lifetime. Although previous research has established that body mass index related traits are influenced by genetics, the relationship between these traits and genetics has not been well characterized in people of South Asian ancestry. To begin to characterize this relationship, we analyzed the association between common genetic variation and five phenotypes related to body mass index in a population-based sample of 5,354 Bangladeshi adults. We discovered a significant association between SNV rs347313 (intron of NOS1AP) and change in body mass index in women over two years. In a linear mixed model, the G allele was associated with an increase of 0.25 kg/m² in body mass index over two years (p-value of 2.3 × 10⁻⁸). We also estimated the heritability of these phenotypes from our genotype data. We found significant estimates of heritability for all of the body mass index-related phenotypes. Our study evaluated the genetic determinants of body mass index related phenotypes for the first time in South Asians. The results suggest that these phenotypes are heritable and some of this heritability is driven by variation that differs from those previously reported. We also provide evidence that the genetic etiology of body mass index related traits may differ by ancestry, sex, and environment, and consequently that these factors should be considered when assessing the genetic determinants of the risk of body mass index-related disease.
    PLoS ONE 08/2014; 9(8):e105062. DOI:10.1371/journal.pone.0105062 · 3.23 Impact Factor
  • Source
    ABSTRACT: Height has an extremely polygenic pattern of inheritance. Genome-wide association studies (GWAS) have revealed hundreds of common variants that are associated with human height at genome-wide levels of significance. However, only a small fraction of phenotypic variation can be explained by the aggregate of these common variants. In a large study of African-American men and women (n = 14,419), we genotyped and analyzed 966,578 autosomal SNPs across the entire genome using a linear mixed model variance components approach implemented in the program GCTA (Yang et al., Nat Genet 2010), and estimated an additive heritability of 44.7% (SE: 3.7%) for this phenotype in a sample of evidently unrelated individuals. While this estimated value is similar to that given by Yang et al. in their analyses, we remain concerned about two related issues: (1) whether in the complete absence of hidden relatedness, variance components methods have adequate power to estimate heritability when a very large number of SNPs are used in the analysis; and (2) whether estimation of heritability may be biased, in real studies, by low levels of residual hidden relatedness. We addressed the first question in a semi-analytic fashion by directly simulating the distribution of the score statistic for a test of zero heritability with and without low levels of relatedness. The second question was addressed by a very careful comparison of the behavior of estimated heritability for both observed (self-reported) height and simulated phenotypes compared to imputation R² as a function of the number of SNPs used in the analysis.
These simulations help to address the important question about whether today's GWAS SNPs will remain useful for imputing causal variants that are discovered using very large sample sizes in future studies of height, or whether the causal variants themselves will need to be genotyped de novo in order to build a prediction model that ultimately captures a large fraction of the variability of height, and by implication other complex phenotypes. Our overall conclusions are that when study sizes are quite large (5,000 or so) the additive heritability estimate for height is not apparently biased upwards using the linear mixed model; however there is evidence in our simulation that a very large number of causal variants (many thousands) each with very small effect on phenotypic variance will need to be discovered to fill the gap between the heritability explained by known versus unknown causal variants. We conclude that today's GWAS data will remain useful in the future for causal variant prediction, but that finding the causal variants that need to be predicted may be extremely laborious.
    PLoS ONE 06/2015; 10(6):e0131106. DOI:10.1371/journal.pone.0131106 · 3.23 Impact Factor