ABSTRACT: Disease incidences increase with age, but the molecular characteristics of ageing that lead to increased disease susceptibility remain inadequately understood. Here we perform a whole-blood gene expression meta-analysis in 14,983 individuals of European ancestry (including replication) and identify 1,497 genes that are differentially expressed with chronological age. The age-associated genes do not harbor more age-associated CpG-methylation sites than other genes, but are instead enriched for the presence of potentially functional CpG-methylation sites in enhancer and insulator regions that associate with both chronological age and gene expression levels. We further used the gene expression profiles to calculate the 'transcriptomic age' of an individual, and show that differences between transcriptomic age and chronological age are associated with biological features linked to ageing, such as blood pressure, cholesterol levels, fasting glucose, and body mass index. The transcriptomic prediction model adds biological relevance and complements existing epigenetic prediction models, and can be used by others to calculate transcriptomic age in external cohorts.
ABSTRACT: Inbreeding depression refers to lower fitness among offspring of genetic relatives. This reduced fitness is caused by the inheritance of two identical chromosomal segments (autozygosity) across the genome, which may expose the effects of (partially) recessive deleterious mutations. Even among outbred populations, autozygosity can occur to varying degrees due to cryptic relatedness between parents. Using dense genome-wide single-nucleotide polymorphism (SNP) data, we examined the degree to which autozygosity is associated with measured cognitive ability in an unselected sample of 4854 participants of European ancestry. We used runs of homozygosity (multiple homozygous SNPs in a row) to estimate autozygous tracts across the genome. We found that increased levels of autozygosity predicted lower general cognitive ability, and estimate a drop of 0.6 s.d. among the offspring of first cousins (P=0.003–0.02 depending on the model). This effect came predominantly from long and rare autozygous tracts, which theory predicts are more likely to be deleterious than short and common tracts. Association mapping of autozygous tracts did not reveal any specific regions that were predictive beyond chance after correcting for multiple testing genome wide. The observed effect size is consistent with studies of cognitive decline among offspring of known consanguineous relationships. These findings suggest a role for multiple recessive or partially recessive alleles in general cognitive ability, and that alleles decreasing general cognitive ability have been selected against over evolutionary time.
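The idea of a run of homozygosity described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the 0/1/2 genotype coding, and the `min_snps` cutoff are all assumptions made for illustration, and real tools typically also consider physical length, SNP density, and tolerance for occasional heterozygous or missing calls.

```python
def runs_of_homozygosity(genotypes, min_snps=3):
    """Find stretches of consecutive homozygous SNPs.

    genotypes: sequence of minor-allele counts per SNP, in genomic order
               (0 or 2 = homozygous, 1 = heterozygous; anything else,
               such as None for a missing call, also ends a run).
    min_snps:  hypothetical minimum run length, in SNPs.
    Returns a list of (start, end) index pairs, with end exclusive.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i  # a new homozygous stretch begins here
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes)))
    return runs


# Toy genotype vector (made-up data, not from the study).
example = [0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 2]
roh = runs_of_homozygosity(example, min_snps=3)

# Fraction of SNPs covered by runs, a crude SNP-count analogue of
# a genome-wide autozygosity estimate.
f_roh = sum(end - start for start, end in roh) / len(example)
```

In this toy vector the two-SNP stretch at positions 4 and 5 is discarded because it is shorter than `min_snps`, which mirrors the abstract's point that long tracts are the informative ones.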
ABSTRACT: Background
DNA methylation levels change with age. Recent studies have identified biomarkers of chronological age based on DNA methylation levels. It is not yet known whether DNA methylation age captures aspects of biological age.
Here we test whether the difference between DNA methylation age and chronological age (Δage) predicts all-cause mortality in later life. Δage was calculated in four longitudinal cohorts of older people. Meta-analysis of proportional hazards models from the four cohorts was used to determine the association between Δage and mortality. A 5-year higher Δage is associated with a 21% higher mortality risk, adjusting for age and sex. After further adjustments for childhood IQ, education, social class, hypertension, diabetes, cardiovascular disease, and APOE e4 status, there is a 16% increased mortality risk for those with a 5-year higher Δage. A pedigree-based heritability analysis of Δage was conducted in a separate cohort. The heritability of Δage was 0.43.
DNA methylation-derived measures of accelerated aging are heritable traits that predict mortality independently of health status, lifestyle factors, and known genetic factors.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0584-6) contains supplementary material, which is available to authorized users.
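The 5-year effect sizes quoted in this abstract can be re-expressed for other increments of Δage. The sketch below assumes the usual proportional hazards form, in which the log hazard is linear in Δage; the abstract reports only the 5-year figures, so the per-year and 10-year values are derived under that assumption and are not results from the paper. The function names are hypothetical.

```python
import math


def delta_age(dnam_age, chronological_age):
    """Δage as defined in the abstract: DNA methylation age minus chronological age."""
    return dnam_age - chronological_age


def rescale_hazard_ratio(hr, from_years, to_years):
    """Re-express a hazard ratio reported per `from_years` of Δage as a
    hazard ratio per `to_years`, assuming a log-linear effect (assumption)."""
    beta_per_year = math.log(hr) / from_years  # log hazard ratio per year
    return math.exp(beta_per_year * to_years)


# 21% higher mortality risk per 5-year higher Δage (age- and sex-adjusted).
hr_per_year = rescale_hazard_ratio(1.21, from_years=5, to_years=1)
hr_per_decade = rescale_hazard_ratio(1.21, from_years=5, to_years=10)

print(f"per 1 year:   {hr_per_year:.4f}")
print(f"per 10 years: {hr_per_decade:.4f}")
```

Under this assumption a 10-year higher Δage corresponds to the 5-year hazard ratio squared, because hazard ratios multiply rather than add.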
ABSTRACT: Monozygotic (MZ) twins form an important system for the study of biological plasticity in humans. While MZ twins are generally considered to be genetically identical, a number of studies have emerged that have demonstrated copy-number differences within a twin pair, particularly in those discordant for disease. The rate of autosomal copy-number variation (CNV) discordance within MZ twin pairs was investigated using a population sample of 376 twin pairs genotyped on Illumina Human610-Quad arrays. After CNV calling using both QuantiSNP and PennCNV followed by manual annotation, only a single CNV difference was observed within the MZ twin pairs, being a 130 KB duplication of chromosome 5. Five other potential discordant CNV were called by the software, but excluded based on manual annotation of the regions. It is concluded that large CNV discordance is rare within MZ twin pairs, indicating that any CNV difference found within phenotypically discordant MZ twin pairs has a high probability of containing the causal gene(s) involved.
Twin Research and Human Genetics 01/2015; 18(01):1-6. DOI:10.1017/thg.2014.85 · 2.30 Impact Factor
ABSTRACT: Replying to A. R. Wood et al. Nature 514, http://dx.doi.org/10.1038/nature13691 (2014). We thank Wood et al. for their interesting observations and although their proposed mechanism does not explain all our reported results, we acknowledge that alternative mechanisms could be behind the observation of epistatic signals. Although we replicate our results in large, independent samples, 19/30 of our reported interactions (Table 1 in ref. 2), Wood et al. do not replicate in the InCHIANTI data set (n = 450) at a type-I error rate of 0.05/30 = 0.002, including none of our reported cis-trans interactions. Having insufficient data to replicate the discovery interactions makes it problematic to draw firm conclusions on the reported cis-trans effects.
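The replication threshold quoted above (0.05/30, reported as 0.002) is a Bonferroni correction for 30 tests. A minimal Python sketch of that arithmetic follows; the function name and the p-values are hypothetical and are not data from either study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    type-I error rate at `alpha` across `n_tests` tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 30 reported interactions tested at a family-wise rate of 0.05.
threshold = bonferroni_threshold(0.05, 30)
print(f"threshold = {threshold:.5f}")  # rounds to 0.002 at three decimals

# Hypothetical replication p-values, for illustration only.
p_values = [0.0004, 0.0031, 0.04, 0.0016]
replicated = [p for p in p_values if p < threshold]
```

Note that the unrounded threshold is slightly below 0.002, so a p-value between the two would pass the rounded figure but fail the exact one.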
ABSTRACT: Epigenetic mechanisms such as DNA methylation (DNAm) are essential for regulation of gene expression. DNAm is dynamic, influenced by both environmental and genetic factors. Epigenetic drift is the divergence of the epigenome as a function of age due to stochastic changes in methylation. Here we show that epigenetic drift may be constrained at many CpGs across the human genome by DNA sequence variation and by lifetime environmental exposures. We estimate repeatability of DNAm at 234,811 autosomal CpGs in whole blood using longitudinal data (2-3 repeated measurements) on 478 older people from two Scottish birth cohorts, the Lothian Birth Cohorts of 1921 and 1936. Median age was 79 years and 70 years, and the follow-up period was ~10 years and ~6 years, respectively. We compare this to methylation heritability estimated in the Brisbane Systems Genomics Study, a cross-sectional study of 117 families (offspring median age 13 years; parent median age 46 years). CpG repeatability in older people was highly correlated (0.68) with heritability estimated in younger people. Highly heritable sites had strong underlying cis-genetic effects. 37 and 1687 autosomal CpGs were associated with smoking and sex, respectively. Both sets were strongly enriched for high repeatability. Sex-associated CpGs were also strongly enriched for high heritability. Our results show that a large number of CpGs across the genome, as a result of environmental and/or genetic constraints, have stable DNAm variation over the human lifetime. Moreover, at a number of CpGs, most variation in the population is due to genetic factors, despite some sites being highly modifiable by the environment.
Genome Research 09/2014; 24(11). DOI:10.1101/gr.176933.114 · 14.63 Impact Factor
ABSTRACT: Background
Despite the important role DNA methylation plays in transcriptional regulation, the transgenerational inheritance of DNA methylation is not well understood. The genetic heritability of DNA methylation has been estimated using twin pairs, although concern has been expressed about whether the underlying assumption of equal common environmental effects is applicable, due to intrauterine differences between monozygotic and dizygotic twins. We estimate the heritability of DNA methylation in peripheral blood leukocytes using the Illumina HumanMethylation450 array in a family-based sample of 614 people from 117 families, allowing comparison both within and across generations.
The correlations from the various available relative pairs indicate that on average the similarity in DNA methylation between relatives is predominantly due to genetic effects with any common environmental or zygotic effects being limited. The average heritability of DNA methylation measured at probes with no known SNPs is estimated as 0.187. The ten most heritable methylation probes were investigated with a genome-wide association study, all showing highly statistically significant cis mQTLs. Further investigation of one of these cis mQTL, found in the MHC region of chromosome 6, showed the most significantly associated SNP was also associated with over 200 other DNA methylation probes in this region and the gene expression level of 9 genes.
The majority of transgenerational similarity in DNA methylation is attributable to genetic effects, and approximately 20% of individual differences in DNA methylation in the population are caused by DNA sequence variation that is not located within CpG sites.
ABSTRACT: Epistasis is the phenomenon whereby one polymorphism's effect on a trait depends on other polymorphisms present in the genome. The extent to which epistasis influences complex traits and contributes to their variation is a fundamental question in evolution and human genetics. Although often demonstrated in artificial gene manipulation studies in model organisms, and some examples have been reported in other species, few examples exist for epistasis among natural polymorphisms in human traits. Its absence from empirical findings may simply be due to low incidence in the genetic control of complex traits, but an alternative view is that it has previously been too technically challenging to detect owing to statistical and computational issues. Here we show, using advanced computation and a gene expression study design, that many instances of epistasis are found between common single nucleotide polymorphisms (SNPs). In a cohort of 846 individuals with 7,339 gene expression levels measured in peripheral blood, we found 501 significant pairwise interactions between common SNPs influencing the expression of 238 genes (P < 2.91 × 10⁻¹⁶). Replication of these interactions in two independent data sets showed both concordance of direction of epistatic effects (P = 5.56 × 10⁻³¹) and enrichment of interaction P values, with 30 being significant at a conservative threshold of P < 9.98 × 10⁻⁵. Forty-four of the genetic interactions are located within 5 megabases of regions of known physical chromosome interactions (P = 1.8 × 10⁻¹⁰). Epistatic networks of three SNPs or more influence the expression levels of 129 genes, whereby one cis-acting SNP is modulated by several trans-acting SNPs. For example, MBNL1 is influenced by an additive effect at rs13069559, which itself is masked by trans-SNPs on 14 different chromosomes, with nearly identical genotype-phenotype maps for each cis-trans interaction. This study presents the first evidence, to our knowledge, for many instances of segregating common polymorphisms interacting to influence human traits.
ABSTRACT: Principal components analysis has been employed in gene expression studies to correct for population substructure, batch and environmental effects. This method typically involves the removal of variation contained in as many as 50 principal components (PCs), which can constitute a large proportion of total variation present in the data. Each PC, however, can detect many sources of variation including gene expression networks and genetic variation influencing transcript levels. We demonstrate that PCs generated from gene expression data can simultaneously contain both genetic and non-genetic factors. From heritability estimates we show that all PCs contain a considerable portion of genetic variation whilst non-genetic artifacts such as batch effects were associated to varying degrees with the first 60 PCs. These PCs demonstrate an enrichment of biological pathways including core immune function and metabolic pathways. The use of PC correction in two independent datasets resulted in a reduction in the number of cis- and trans-eQTLs detected. Comparisons of PC and linear model correction revealed that PC correction was not as efficient at removing known batch effects and had a higher penalty on genetic variation. Therefore, this study highlights the danger of eliminating biologically relevant data when employing PC correction in gene expression data.
ABSTRACT: While genome-wide association studies (GWAS) have been successful in identifying a large number of variants associated with disease, the challenge of locating the underlying causal loci remains. Sequencing of case and control DNA pools provides an inexpensive method for assessing all variation in a genomic region surrounding a significant GWAS result. However, individual variants need to be ranked in terms of the strength of their association to disease in order to prioritise follow-up by individual genotyping. A simple method for testing for case-control association in sequence data from DNA pools is presented that allows the partitioning of the variance in allele frequency estimates into components due to the sampling of chromosomes from the pool during sequencing, sampling individuals from the population and unequal contribution from individuals during pool construction. The utility of this method is demonstrated on a sequence from the alcohol dehydrogenase (ADH) gene cluster on a case-control sample for heavy alcohol consumption.
PLoS ONE 06/2013; 8(6):e65410. DOI:10.1371/journal.pone.0065410 · 3.23 Impact Factor
ABSTRACT: Our understanding of major depressive disorder (MDD) has focused on the influence of genetic variation and environmental risk factors. Growing evidence suggests the additional role of epigenetic mechanisms influencing susceptibility for complex traits. DNA sequence within discordant monozygotic twin (MZT) pairs is virtually identical; thus, they represent a powerful design for studying the contribution of epigenetic factors to disease liability. The aim of this study was to investigate whether specific methylation profiles in white blood cells could contribute to the aetiology of MDD. Participants were drawn from the Queensland Twin Registry and comprised 12 MZT pairs discordant for MDD and 12 MZT pairs concordant for no MDD and low neuroticism. Bisulphite treatment and genome-wide interrogation of differentially methylated CpG sites using the Illumina Human Methylation 450 BeadChip were performed in WBC-derived DNA. No overall difference in mean global methylation between cases and their unaffected co-twins was found; however, the difference in females was significant (P=0.005). The difference in variance across all probes between affected and unaffected twins was highly significant (P<2.2 × 10⁻¹⁶), with 52.4% of probes having higher variance in cases (binomial P-value<2.2 × 10⁻¹⁶). No significant differences in methylation were observed between discordant MZT pairs and their matched concordant MZT (permutation minimum P=0.11) at any individual probe. Larger samples are likely to be needed to identify true associations between methylation differences at specific CpG sites.
ABSTRACT: There is increasing evidence that heritable variation in gene expression underlies genetic variation in susceptibility to disease. Therefore, a comprehensive understanding of the similarity between relatives for transcript variation is warranted, in particular the dissection of phenotypic variation into additive and non-additive genetic factors and shared environmental effects. We conducted a gene expression study in blood samples of 862 individuals from 312 nuclear families containing MZ or DZ twin pairs using both pedigree and genotype information. From a pedigree analysis we show that the vast majority of genetic variation across 17,994 probes is additive, although non-additive genetic variation is identified for 960 transcripts. For 180 of the 960 transcripts with non-additive genetic variation, we identify expression quantitative trait loci (eQTL) with dominance effects in a sample of 339 unrelated individuals and replicate 31% of these associations in an independent sample of 139 unrelated individuals. Over-dominance was detected and replicated for a trans association between rs12313805 and ETV6, located 4 Mb apart on chromosome 12. Surprisingly, only 17 probes exhibit significant levels of common environmental effects, suggesting that environmental and lifestyle factors common to a family do not affect expression variation for most transcripts, at least those measured in blood. Consistent with the genetic architecture of common diseases, gene expression is predominantly additive, but a minority of transcripts display non-additive effects.
ABSTRACT: There is increasing evidence for the role of rare copy-number variation (CNV) in the development of neuropsychiatric disorders. It is likely that such variants also have an effect on the variation of cognition in what is considered the "normal" phenotypic range. The role of rare CNV (>20 KB in length; frequency <5%) on general cognitive ability is investigated in a sample of 800 individuals (mean age = 16.5, SD = 1.2) using copy-number variants called from the Illumina 610K SNP genotyping array with the software QuantiSNP. We assessed three measures of CNV burden (total CNV length, number of CNV and average CNV length) for both deletions and duplications in combination and separately. No correlation was found between any of the measures of CNV burden and IQ, or when comparing the top and bottom 10% of the sample for IQ, both on a genome-wide scale and at individual positions across the genome.
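The three burden measures named in this abstract (total CNV length, number of CNV, average CNV length) can be computed per individual as in the Python sketch below. The `CNV` record, the function name, and the example calls are hypothetical; only the 20 kb length cutoff comes from the abstract, and the <5% frequency filter is omitted because it needs population-level data.

```python
from dataclasses import dataclass


@dataclass
class CNV:
    start: int  # start position in base pairs
    end: int    # end position in base pairs
    kind: str   # "del" for deletion, "dup" for duplication


def cnv_burden(cnvs, kind=None, min_length=20_000):
    """Summarise one individual's CNV calls.

    kind=None combines deletions and duplications; "del" or "dup"
    restricts the summary to one class. Calls no longer than
    `min_length` base pairs are dropped.
    """
    lengths = [
        c.end - c.start
        for c in cnvs
        if (kind is None or c.kind == kind) and (c.end - c.start) > min_length
    ]
    count = len(lengths)
    total = sum(lengths)
    return {
        "total_length": total,
        "count": count,
        "mean_length": total / count if count else 0.0,
    }


# Made-up calls for one individual (illustration only).
calls = [
    CNV(1_000_000, 1_050_000, "del"),  # 50 kb deletion
    CNV(2_000_000, 2_010_000, "dup"),  # 10 kb duplication, below the cutoff
    CNV(5_000_000, 5_130_000, "dup"),  # 130 kb duplication
]

combined = cnv_burden(calls)
deletions_only = cnv_burden(calls, kind="del")
duplications_only = cnv_burden(calls, kind="dup")
```

Each summary could then be correlated with a trait such as IQ across individuals, which is the comparison the abstract describes.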
ABSTRACT: In a previous study, we detected a 6p25-p24 region linked to schizophrenia in families with high composite cognitive deficit (CD) scores, a quantitative trait integrating multiple cognitive measures. Association mapping of a 10 Mb interval identified a 260 kb region with a cluster of single-nucleotide polymorphisms (SNPs) significantly associated with CD scores and memory performance. The region contains two colocalising genes, LYRM4 and FARS2, both encoding mitochondrial proteins. The two tagging SNPs with strongest evidence of association were located around the overlapping putative promoters, with rs2224391 predicted to alter a transcription factor binding site (TFBS). Sequencing the promoter region identified 22 SNPs, many predicted to affect TFBSs, in a tight linkage disequilibrium block. Luciferase reporter assays confirmed promoter activity in the predicted promoter region, and demonstrated marked downregulation of expression in the LYRM4 direction under the haplotype comprising the minor alleles of promoter SNPs, which however is not driven by rs2224391. Experimental evidence from LYRM4 expression in lymphoblasts, gel-shift assays and modelling of DNA breathing dynamics pointed to two adjacent promoter SNPs, rs7752203-rs4141761, as the functional variants affecting expression. Their C-G alleles were associated with higher transcriptional activity and preferential binding of nuclear proteins, whereas the G-A combination had opposite effects and was associated with poor memory and high CD scores. LYRM4 is a eukaryote-specific component of the mitochondrial biogenesis of Fe-S clusters, essential cofactors in multiple processes, including oxidative phosphorylation. LYRM4 downregulation may be one of the mechanisms involved in inefficient oxidative phosphorylation and oxidative stress, increasingly recognised as contributors to schizophrenia pathogenesis.
Molecular Psychiatry advance online publication, 4 October 2011; doi:10.1038/mp.2011.129.
ABSTRACT: To identify common genetic variants that predispose to caffeine-induced insomnia and to test whether genes whose expression changes in the presence of caffeine are enriched for association with caffeine-induced insomnia.
A hypothesis-free, genome-wide association study.
Community-based sample of Australian twins from the Australian Twin Registry.
After removal of individuals who said that they do not drink coffee, a total of 2,402 individuals from 1,470 families in the Australian Twin Registry provided both phenotype and genotype information.
A dichotomized scale based on whether participants reported ever or never experiencing caffeine-induced insomnia. A factor score based on responses to a number of questions regarding normal sleep habits was included as a covariate in the analysis. More than 2 million common single nucleotide polymorphisms (SNPs) were tested for association with caffeine-induced insomnia. No SNPs reached the genome-wide significance threshold. In the analysis that did not include the insomnia factor score as a covariate, the most significant SNP identified was an intronic SNP in the PRIMA1 gene (P = 1.4 × 10⁻⁶, odds ratio = 0.68 [0.53 - 0.89]). An intergenic SNP near the GBP4 gene on chromosome 1 was the most significant upon inclusion of the insomnia factor score into the model (P = 1.9 × 10⁻⁶, odds ratio = 0.70 [0.62 - 0.78]). A previously identified association with a polymorphism in the ADORA2A gene was replicated.
Several genes have been identified in the study as potentially influencing caffeine-induced insomnia. They will require replication in another sample. The results may have implications for understanding the biologic mechanisms underlying insomnia.