Meta-analysis of the heritability of human traits based on fifty years of twin studies

Abstract

Despite a century of research on complex traits in humans, the relative importance and specific nature of the influences of genes and environment on human traits remain controversial. We report a meta-analysis of twin correlations and reported variance components for 17,804 traits from 2,748 publications including 14,558,903 partly dependent twin pairs, virtually all published twin studies of complex traits. Estimates of heritability cluster strongly within functional domains, and across all traits the reported heritability is 49%. For a majority (69%) of traits, the observed twin correlations are consistent with a simple and parsimonious model where twin resemblance is solely due to additive genetic variation. The data are inconsistent with substantial influences from shared environment or non-additive genetic variation. This study provides the most comprehensive analysis of the causes of individual differences in human traits thus far and will guide future gene-mapping efforts. All the results can be visualized using the MaTCH webtool.
Nature Genetics ADVANCE ONLINE PUBLICATION
ARTICLES
Insight into the nature of observed variation in human traits is impor-
tant in medicine, psychology, social sciences and evolutionary biology.
It has gained new relevance with both the ability to map genes for
human traits and the availability of large, collaborative data sets to do
so on an extensive and comprehensive scale. Individual differences
in human traits have been studied for more than a century, yet the
causes of variation in human traits remain uncertain and controver-
sial. Specifically, the partitioning of observed variability into underly-
ing genetic and environmental sources and the relative importance of
additive and non-additive genetic variation are continually debated1–5.
Recent results from large-scale genome-wide association studies
(GWAS) show that many genetic variants contribute to the variation
in complex traits and that effect sizes are typically small6,7. However,
the sum of the variance explained by the detected variants is much
smaller than the reported heritability of the trait4,6–10. This ‘missing
heritability’ has led some investigators to conclude that non-additive
variation must be important4,11. Although the presence of gene-gene
interaction has been demonstrated empirically5,12–17, little is known
about its relative contribution to observed variation18.
In this study, our aim is twofold. First, we analyze empirical esti-
mates of the relative contributions of genes and environment for
virtually all human traits investigated in the past 50 years. Second, we
assess empirical evidence for the presence and relative importance of
non-additive genetic influences on all human traits studied. We rely
on classical twin studies, as the twin design has been used widely
to disentangle the relative contributions of genes and environment,
across a variety of human traits. The classical twin design is based
on contrasting the trait resemblance of monozygotic and dizygotic
twin pairs. Monozygotic twins are genetically identical, and dizygotic
twins are genetically full siblings. We show that, for a majority of traits
(69%), the observed statistics are consistent with a simple and parsi-
monious model where the observed variation is solely due to additive
genetic variation. The data are inconsistent with a substantial influence
from shared environment or non-additive genetic variation. We also
show that estimates of heritability cluster strongly within functional
domains, and across all traits the reported heritability is 49%. Our
results are based on a meta-analysis of twin correlations and reported
variance components for 17,804 traits from 2,748 publications includ-
ing 14,558,903 partly dependent twin pairs, virtually all twin studies of
complex traits published between 1958 and 2012. This study provides
the most comprehensive analysis of the causes of individual differences
in human traits thus far and will guide future gene-mapping efforts. All
results can be visualized with the accompanying MaTCH webtool.
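
The logic of the classical twin design described above can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: it assumes the standard quantitative-genetic expectations that monozygotic pairs share all additive (A) and dominance (D) variance, dizygotic pairs share one half of A and one quarter of D, and both share the common environment (C) fully. The function name and arguments are hypothetical.

```python
def expected_twin_correlations(a2, c2=0.0, d2=0.0):
    """Return (rho_MZ, rho_DZ) implied by standardized variance components.

    a2, c2, d2 are proportions of variance due to additive genetic,
    shared environmental and non-additive (dominance) genetic effects.
    Illustrative helper, assuming the textbook twin-model expectations.
    """
    if min(a2, c2, d2) < 0 or a2 + c2 + d2 > 1 + 1e-9:
        raise ValueError("components must be non-negative and sum to at most 1")
    rho_mz = a2 + d2 + c2                # MZ twins share A, D and C completely
    rho_dz = 0.5 * a2 + 0.25 * d2 + c2   # DZ twins share half of A, a quarter of D
    return rho_mz, rho_dz


# A purely additive trait gives a twofold MZ:DZ ratio.
additive_only = expected_twin_correlations(0.6)
# Adding shared environment pulls the ratio below two.
with_shared_env = expected_twin_correlations(0.4, c2=0.2)
```

Under these assumptions a purely additive trait yields a monozygotic correlation exactly twice the dizygotic one, which is the pattern the paper tests for.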
RESULTS
The distribution of studied traits is nonrandom
We systematically retrieved published classical twin studies in which
observed variation in human traits was partitioned into genetic and
Tinca J C Polderman1,10, Beben Benyamin2,10, Christiaan A de Leeuw1,3, Patrick F Sullivan4–6,
Arjen van Bochoven7, Peter M Visscher2,8,11 & Danielle Posthuma1,9,11
1Department of Complex Trait Genetics, VU University, Center for Neurogenomics and Cognitive Research, Amsterdam, the Netherlands. 2Queensland Brain Institute,
University of Queensland, Brisbane, Queensland, Australia. 3Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, the
Netherlands. 4Center for Psychiatric Genomics, Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, USA. 5Department of Psychiatry,
University of North Carolina, Chapel Hill, North Carolina, USA. 6Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
7Faculty of Sciences, VU University, Amsterdam, the Netherlands. 8University of Queensland Diamantina Institute, Translational Research Institute, Brisbane,
Queensland, Australia. 9Department of Clinical Genetics, VU University Medical Center, Neuroscience Campus Amsterdam, Amsterdam, the Netherlands.
10These authors contributed equally to this work. 11These authors jointly supervised this work. Correspondence should be addressed to D.P. (d.posthuma@vu.nl).
Received 13 February; accepted 1 April; published online 18 May 2015; doi:10.1038/ng.3285
environmental influences. For each study, we collected reported twin
correlations for continuous traits and contingency tables for dichoto-
mous traits, estimates from genetic model-fitting and study character-
istics (sample size, population, age cohort and ascertainment scheme)
(Supplementary Table 1 and Supplementary Note). We manually
classified the investigated traits using the chapter and subchapter
levels of the International Classification of Functioning, Disability and
Health (ICF) or the International Statistical Classification of Diseases
and Related Health Problems (ICD-10) (Online Methods). ICD-10
and ICF subchapter levels refer to actual diseases (for example, atopic
dermatitis) and traits (for example, temperament and personality
functions), respectively. We identified 2,748 relevant twin studies,
published between 1958 and 2012. Half of these were published after
2004, with sample sizes per study in 2012 of around 1,000 twin pairs
(Supplementary Table 2). Each study could report on multiple traits
measured in one or several samples. These 2,748 studies reported on
17,804 traits. Twin subjects came from 39 different countries, with
a large proportion of studies (34%) based on US twin samples. The
continents of South America (0.5%), Africa (0.2%) and Asia (5%) were
heavily under-represented (Fig. 1a,b and Supplementary Table 3).
The total number of studied twins was 14,558,903 partly dependent
pairs, or 2,247,128 when correcting for reporting on multiple traits
per study. The majority of studies (59%) were based on the adult
population (aged 18–64 years), although the sample sizes available
for studies of the elderly population (aged 65 years or older) were
the largest (Supplementary Table 4). Authorship network analyses
showed that 61 communities of authors wrote the 2,748 published
studies. The 11 largest authorship communities contained >65 authors
and could be mapped back to the main international twin registries,
such as the Vietnam Era Twin Registry, the Finnish Twin Cohort and
the Swedish Twin Registry (Supplementary Fig. 1).
The investigated traits fell into 28 general trait domains. The dis-
tribution of the traits evaluated in twin studies was highly skewed,
with 51% of studies focusing on traits classified under the psychi-
atric, metabolic and cognitive domains, whereas traits classified
under the developmental, connective tissue and infection domains
together accounted for less than 1% of all investigated traits (Fig. 1c
and Supplementary Tables 5–7). The ten most investigated traits
were temperament and personality functions, weight maintenance
functions, general metabolic functions, depressive episode, higher-
level cognitive functions, conduct disorders, mental and behavioral
disorders due to use of alcohol, anxiety disorders, height and mental
and behavioral disorders due to use of tobacco. Collectively, these
traits accounted for 59% of all investigated traits.
[Figure 1 panels. (a) Number of investigated traits, scale 1 to 6,746. (b) Average number of twin pairs per study, scale 10 to 2,104. (c) Studies (n), axis 0 to 1,200, for the 28 trait domains: Connective tissue, Developmental, Infection, Aging, Mortality, Neoplasms, Cell, Hematological, Social values, Muscular, Gastrointestinal, Nutritional, Dermatological, Social interactions, Ear, nose, throat, Immunological, Activities, Ophthalmological, Respiratory, Reproduction, Endocrine, Environment, Cardiovascular, Skeletal, Neurological, Cognitive, Metabolic, Psychiatric. Insets: Continuous 89%, Dichotomous 10%, Dichotomous, ascertained 1%; Other, non-disease 75%, Symptoms of disease 17%, Disease 8%. (d) rMZ, rDZ, h2 and c2 against log10(n pairs): R2 = 0.009468 (rMZ), R2 = 0.0003032 (rDZ), R2 = 0.001651 (h2), R2 = 0.005001 (c2).]
Figure 1 Distribution of the investigated traits in virtually all twin studies published between 1958 and 2012. (a) The number of investigated
traits in classical twin studies across all countries. (b) The average number of twin pairs included per study across countries. (c) The number of
investigated traits according to functional trait domain and trait characteristic (inset). (d) Monozygotic and dizygotic twin correlations and reported
estimates of h2 and c2 as a function of sample size. Contour lines indicate the density of the data in that region. The lines are colored by ‘heat’
from blue to red, indicating increasing data density.
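
Panel d relates each estimate to sample size and summarizes the relationship with an R2 value. A minimal sketch of how such an R2 could be computed is given below, assuming an ordinary least-squares fit of the estimate on log10 of the number of twin pairs; the function name and the toy inputs are hypothetical and are not taken from the study data.

```python
import math


def r_squared_vs_log_n(estimates, n_pairs):
    """R^2 of a simple linear regression of estimates on log10(n_pairs).

    For a single predictor, R^2 equals the squared Pearson correlation,
    so it is computed here from sums of squares and cross-products.
    """
    x = [math.log10(n) for n in n_pairs]
    y = list(estimates)
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three (estimate, n_pairs) observations")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("estimates and sample sizes must both vary")
    return sxy ** 2 / (sxx * syy)


# Toy inputs (hypothetical): estimates that do not trend with sample size.
toy_r2 = r_squared_vs_log_n([0.5, 0.3, 0.5], [10, 100, 1000])
```

An R2 near zero, as reported in the figure, means that sample size explains almost none of the variation in the estimates, which is what one expects in the absence of small-study effects.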
Equal contribution of genes and environment
We did not find evidence of systematic publication bias as a function of
sample size (for example, where studies based on relatively small sam-
ples were only published when larger effects were reported) (Fig. 1d,
Supplementary Figs. 2–6 and Supplementary Tables 8–11). We calculated
the weighted averages of correlations for monozygotic (rMZ)
and dizygotic (rDZ) twins and of the reported estimates of the relative
contributions of genetic and environmental influences to the investi-
gated traits using a random-effects meta-analytic model to allow for
heterogeneity across different studies (Supplementary Tables 12–15).
The meta-analyses of all traits yielded an average rMZ of 0.636 (s.e.m. =
0.002) and an average rDZ of 0.339 (s.e.m. = 0.003). The reported
heritability (h2) across all traits was 0.488 (s.e.m. = 0.004), and the
reported estimate of shared environmental effects (c2) was 0.174
(s.e.m. = 0.004) (Fig. 2a,b, Table 1 and Supplementary Fig. 7).
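
The text states that a random-effects model was used but does not name the estimator in this section. The sketch below therefore assumes the common DerSimonian-Laird method-of-moments estimator purely for illustration; the function name, argument names and toy numbers are hypothetical, and the authors' actual procedure (described in their Online Methods) may differ, for example in how correlations are transformed before pooling.

```python
def random_effects_mean(estimates, variances):
    """Pool study estimates with a DerSimonian-Laird random-effects model.

    Returns (pooled_estimate, standard_error, tau2), where tau2 is the
    estimated between-study variance. Assumed estimator, for illustration.
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need matching lists with at least two studies")
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q measures dispersion around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    k = len(estimates)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Toy inputs (hypothetical): two precise but discordant studies.
pooled, se, tau2 = random_effects_mean([0.2, 0.8], [0.001, 0.001])
```

When studies disagree by more than their sampling error allows, tau2 becomes positive and the pooled standard error widens, which is why a random-effects model is the cautious choice across thousands of heterogeneous traits.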
Estimates of h2 and c2 cluster in functional domains
We found that heritability estimates clustered in functional domains,
with the largest heritability estimates for traits classified under the
ophthalmological domain (h2 = 0.712, s.e.m. = 0.041), followed by
the ear, nose and throat (h2 = 0.637, s.e.m. = 0.064), dermatological
(h2 = 0.604, s.e.m. = 0.043) and skeletal (h2 = 0.591, s.e.m. = 0.018)
domains. The lowest heritability estimates were for traits in the
environment, reproduction and social values domains (Fig. 2d and
Supplementary Table 16). All weighted averages of h2 across
>500 distinct traits had a mean greater than zero (Supplementary
Tables 17–24). The lowest reported heritability for a specific trait was
for gene expression, with an estimated h2 = 0.055 (s.e.m. = 0.026) and
an estimated c2 of 0.736 (s.e.m. = 0.033) (but note that these trait aver-
ages are based on reported estimates of variance components derived
from only 20 data points reporting on the expression levels of 20 genes;
Supplementary Table 21). We found the largest influence of c2 for
traits in the cell domain (c2 = 0.674, s.e.m. = 0.048), followed by traits
in the infection (c2 = 0.351, s.e.m. = 0.153), hematological (c2 = 0.324,
s.e.m. = 0.090), endocrine (c2 = 0.322, s.e.m. = 0.050), reproduction
(c2 = 0.320, s.e.m. = 0.061), social values (c2 = 0.271, s.e.m. = 0.032),
environment (c2 = 0.269, s.e.m. = 0.020) and skeletal (c2 = 0.265,
s.e.m. = 0.019) domains (Fig. 2d and Supplementary Table 16).
[Figure 2 panels. (a) Histogram of MZ and DZ correlations: Correlation (r) against Frequency. (b) Scatter of rMZ and rDZ, R2 = 0.4167. (c) rMZ, rMZM, rMZF, rDZ, rDZSS, rDZM, rDZF and rDOS (top) and h2 and c2 overall, same sex, male and female (bottom) for ages 0–11, 12–17, 18–64 and 65+ years. (d) The same statistics for all traits and for the domains Skeletal, Reproduction, Ear, nose, throat, Metabolic, Ophthalmological, Dermatological, Respiratory, Neurological, Cognitive, Activities, Cardiovascular, Endocrine, Psychiatric, Environment, Gastrointestinal, Social values, Nutritional and Social interactions.]
Figure 2 Twin correlations and heritabilities for all human traits studied. (a) Distribution of rMZ and rDZ estimates across the traits investigated in
2,748 twin studies published between 1958 and 2012. rMZ estimates are based on 9,568 traits and 2,563,628 partly dependent twin pairs; rDZ
estimates are based on 5,220 traits and 2,606,252 partly dependent twin pairs (Table 1). (b) Relationship between rMZ and rDZ, using all 5,185 traits
for which both were reported. (c) Random-effects meta-analytic estimates of twin correlations (top) and reported variance components (bottom) across
all traits separately for four age cohorts. Error bars, standard errors. (d) Random-effects meta-analytic estimates of twin correlations (top) and reported
variance components (bottom) across all traits, and within functional domains for which data on all correlations and variance components were
available. Error bars, standard errors.
Heterogeneity of twin correlations across sex and age
Across all traits, the weighted averages of twin correlations and
reported h2 and c2 values did not show evidence of heterogeneity by
sex, although there was some evidence for lower correlation in oppo-
site-sex twin pairs in comparison to same-sex dizygotic twin pairs
(Table 1 and Supplementary Note). The data showed a decrease in
monozygotic and dizygotic twin resemblance after adolescence and
an accompanying decrease in the estimates of both h2 and c2 (Fig. 2c
and Supplementary Table 15).
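
As a rough illustration of the opposite-sex versus same-sex comparison, the sketch below contrasts two pooled correlations using the estimates and standard errors listed in Table 1 (rDZSS = 0.345, s.e.m. = 0.003; rDOS = 0.302, s.e.m. = 0.005). It assumes the two estimates are independent, which the paper itself cautions is not strictly true because twin pairs are partly dependent across traits and studies, so the result is indicative only. The function name is hypothetical.

```python
import math


def z_difference(est_a, se_a, est_b, se_b):
    """z statistic for the difference between two estimates.

    Assumes the two estimates are independent, so their sampling
    variances add. This is a simplification for illustration.
    """
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Same-sex dizygotic versus opposite-sex dizygotic correlations (Table 1).
z_ss_vs_os = z_difference(0.345, 0.003, 0.302, 0.005)
```

With these inputs the statistic is about 7.4, in line with the statement that opposite-sex pairs are somewhat less alike than same-sex dizygotic pairs.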
In the top 20 most investigated traits for twin correlations, the
weighted estimates did not show consistent evidence for heterogene-
ity by sex, with rMZM (monozygotic male twin correlation) and rMZF
(monozygotic female twin correlation) as well as rDZM (dizygotic
male twin correlation) and rDZF (dizygotic female twin correlation)
remarkably similar across the majority of the top 20 specific traits
investigated (Fig. 3), although for several traits the correlations for
opposite-sex twin pairs were lower than the estimates for same-sex
twin pairs, mostly after age 11 years (for example, for weight mainte-
nance functions, functions of brain, mental and behavioral disorders
due to the use of alcohol, and mental and behavioral disorders due
to the use of tobacco). Heterogeneity of weighted twin correlations
by age was more prominent than heterogeneity by sex (Fig. 3).
For example, when considering rMZ, we note that for most of the
top 20 investigated traits the estimate tended to decrease with age,
especially after adolescence, a trend that was generally mirrored
in the rDZ estimates (Fig. 3). Meta-analysis results across all traits
across different countries are provided in Supplementary Table 25.
Model-fitting and selection leads to underestimation of h2
Falconer’s equations can be used to calculate ĥ2 and ĉ2 on the basis
of twin correlations18. In these equations, ĥ2 = 2 × (rMZ − rDZ) and
ĉ2 = 2 × rDZ − rMZ. When these are applied using the weighted averages
of rMZ and rDZ, we find an ĥ2 estimate of 2 × (0.636 − 0.339) = 0.593 and
a ĉ2 estimate of 2 × 0.339 − 0.636 = 0.042 (Table 1 and Supplementary
Fig. 8). We note that the ĥ2 estimate based on twin correlations is
larger than the weighted average of reported h2 values (Supplementary
Figs. 9 and 10). As a consequence, the ĉ2 estimate based on twin
correlations is lower than the weighted average of the reported c2
component. To test whether this discrepancy was due to a bias in
studies reporting only twin correlations or only variance compo-
nents, we conducted the meta-analysis only on studies reporting both.
This analysis yielded similar estimates with a similar discrepancy
(Supplementary Table 13), ruling out the explanation that twin cor-
relations may have been reported on traits that happened to be more
heritable than traits for which the estimates of variance components
were reported. Through theory, we show that such a discrepancy can
arise when individual studies represent a mixture of traits that follow
a pattern of rMZ > 2rDZ and rMZ < 2rDZ and where the choice of fitting
a model that includes shared environment or non-additive genetic
influences is based on the observed pattern of twin correlations
(Supplementary Note). More specifically, because c2 and non-additive
genetic influences cannot be estimated simultaneously from twin
correlations, an ACE model (for additive genetic (A), common envi-
ronmental (C), and error or non-shared environmental (E) influences)
is fitted to the data if 2rDZ − rMZ > 0. In contrast, if 2rDZ − rMZ < 0,
an ADE model, including non-additive genetic (D) instead of com-
mon environmental (C) influences, is selected. This leads to sampling
bias in estimating h2 from the full model. We show (Supplementary
Table 26) that such per-study choices cause bias and can lead to a 10%
downward bias in the reported estimates of h2 in comparison to those
based on twin correlations, consistent with the observed discrepancy
between our meta-analysis of variance component estimates calculated
from twin correlations and the reported variance components.
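
The mechanism described in this section can be illustrated with a toy simulation. The sketch below is not a reproduction of Supplementary Table 26: it assumes a purely additive trait with a true heritability of 0.6, draws sample correlations with the Fisher z approximation, and applies a simplified per-study rule in which an ACE model reports h2 = 2(rMZ − rDZ) and an ADE model reports broad-sense h2 = rMZ. All function names, the choice of broad-sense heritability in the ADE branch, and the simulation settings are assumptions made for illustration.

```python
import math
import random


def falconer(r_mz, r_dz):
    """Falconer-style estimates (h2, c2) from twin correlations."""
    return 2.0 * (r_mz - r_dz), 2.0 * r_dz - r_mz


def sample_r(rho, n_pairs, rng):
    """Draw a sample correlation using the Fisher z approximation."""
    z = math.atanh(rho) + rng.gauss(0.0, 1.0 / math.sqrt(n_pairs - 3))
    return math.tanh(z)


def reported_h2(r_mz, r_dz):
    """Toy model-selection rule driven by the observed correlations."""
    if 2.0 * r_dz - r_mz > 0:
        return 2.0 * (r_mz - r_dz)  # ACE chosen: A = 2(rMZ - rDZ)
    return r_mz                     # ADE chosen: A + D = rMZ (assumed broad sense)


def simulate(n_studies=2000, n_pairs=200, h2_true=0.6, seed=1):
    """Return (mean Falconer h2, mean 'reported' h2) over simulated studies."""
    rng = random.Random(seed)
    falconer_sum = 0.0
    reported_sum = 0.0
    for _ in range(n_studies):
        r_mz = sample_r(h2_true, n_pairs, rng)        # additive truth: rho_MZ = h2
        r_dz = sample_r(h2_true / 2.0, n_pairs, rng)  # additive truth: rho_DZ = h2 / 2
        falconer_sum += falconer(r_mz, r_dz)[0]
        reported_sum += reported_h2(r_mz, r_dz)
    return falconer_sum / n_studies, reported_sum / n_studies


# Pooled correlations from the text: rounded inputs give about 0.594 and 0.042.
h2_hat, c2_hat = falconer(0.636, 0.339)
mean_falconer, mean_reported = simulate()
```

Applied to the rounded pooled correlations, the Falconer function gives roughly 0.594 and 0.042, close to the 0.593 and 0.042 quoted in the text. In the simulation, whenever sampling noise pushes a study into the ADE branch the reported value (rMZ) is smaller than the Falconer value, so the average of the reported values falls below the average Falconer estimate even though the true model is purely additive.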
Overall twin correlations imply a simple additive model
There may be many causes of similarities and differences within
monozygotic and dizygotic twin pairs, and these are typically inter-
preted in terms of additive or non-additive genetic influences and
shared or non-shared environmental influences19. Yet, there are
essentially only two estimable and testable variance components of
interest in the twin design. Therefore, inference from classical twin
studies on all underlying, unobserved sources of variation that lead
to the resemblance between relatives is limited. However, there are
two simple and parsimonious hypotheses that can be tested across
traits from estimated correlation coefficients for monozygotic twin
pairs (rMZ) and dizygotic twin pairs (rDZ). The first is that the correlations
for the monozygotic and dizygotic twin populations (ρMZ and ρDZ) are the same,
consistent with twin resemblance being solely due to non-genetic factors. The
second hypothesis involves a twofold ratio of ρMZ to ρDZ, consistent with twin
resemblance being
solely due to additive genetic variation. Across-trait consistency
with either of these hypotheses is not a proof of these simple models
but would provide an extremely parsimonious model against which
other experimental designs (for example, DNA sequence–based
studies) should be tested. For the vast majority of traits (84%),
Table 1 Weighted means of twin correlations and variance
components across all human traits investigated in a classical twin
study and published between 1958 and 2012
Statistic                   Estimate (s.e.m.)   n traits   n pairs
rMZ                         0.636 (0.002)       9,568      2,563,627
rMZM                        0.617 (0.004)       4,518      1,070,962
rMZF                        0.626 (0.004)       4,360      1,171,841
rDZ                         0.339 (0.003)       5,220      2,606,252
rDZSS                       0.345 (0.003)       6,108      1,752,952
rDZM                        0.321 (0.003)       4,412      1,039,238
rDZF                        0.342 (0.004)       4,255      1,068,562
rDOS                        0.302 (0.005)       2,342      898,610
h2                          0.488 (0.004)       2,929      4,341,721
h2 (same sex)               0.471 (0.005)       1,795      1,187,837
h2 (male)                   0.465 (0.005)       2,095      1,732,622
h2 (female)                 0.472 (0.005)       1,957      1,539,582
c2                          0.174 (0.004)       2,771      4,272,318
c2 (same sex)               0.189 (0.005)       1,769      1,185,116
c2 (male)                   0.157 (0.004)       1,988      1,519,148
c2 (female)                 0.169 (0.005)       1,925      1,516,192
2(rMZ − rDZ)                0.593 (0.008)       9,568      5,169,879
2(rMZ − rDZ) (same sex)     0.581 (0.008)       9,568      4,316,578
2(rMZ − rDZ) (male)         0.593 (0.010)       4,518      2,110,200
2(rMZ − rDZ) (female)       0.569 (0.010)       4,360      2,240,403
2rDZ − rMZ                  0.042 (0.007)       9,568      5,169,879
2rDZ − rMZ (same sex)       0.055 (0.006)       9,568      4,316,578
2rDZ − rMZ (male)           0.025 (0.008)       4,518      2,110,200
2rDZ − rMZ (female)         0.057 (0.008)       4,360      2,240,403
r, correlation; MZ, monozygotic twins; DZ, dizygotic twins; MZM, monozygotic twins,
male; MZF, monozygotic twins, female; DZSS, dizygotic twins, same sex; DZM, dizygotic
twins, both male; DZF, dizygotic twins, both female; DOS, dizygotic twins, opposite sex;
h2, heritability; c2, proportion of variance due to shared environmental variation;
estimate, estimate based on random-effects meta-analysis; n traits, number of
investigated traits; n pairs, number of dependent twin pairs. The pairs are not
independent, as the same or an overlapping sample of twins may have been used
for multiple traits and across multiple studies.
we found that monozygotic twin correlations were larger than dizy-
gotic twin correlations. Using the weighted estimates of rMZ and rDZ
across all traits, we showed that, on average, 2rDZ − rMZ = 0.042
(s.e.m. = 0.007) (Table 1), which is very close to a twofold differ-
ence in the correlation of monozygotic twins relative to dizygotic
twins (Supplementary Figs. 11 and 12). The proportion of single
[Figure 3 panels: Age 0–11 years, Age 12–17 years, Age 18–64 years and Age 65+ years. Each panel shows rMZ, rMZM, rMZF, rDZ, rDZSS, rDZM, rDZF and rDOS (axis up to 1.0) for the top 20 most investigated traits listed in Table 2.]
Figure 3 Twin correlations for the top 20 most investigated specific traits by age and sex. Alc., alcohol; dis., disorders; depr., depressive; endocr.,
endocrine; imm., immunological; funct., functions; maint., maintenance; metab., metabolic; ment. beh., mental and behavioral; spec. personal.,
specific personality; temp. pers., temperament and personality; tob., tobacco; r, correlation; MZ, monozygotic twins; DZ, dizygotic twins; M, males;
F, females; SS, same-sex pairs only; DOS, dizygotic opposite-sex pairs. Inclusion for the top 20 most investigated traits was conditional on the reporting
of rMZ and rDZ. Empty cells denote insufficient information available to calculate weighted estimates; error bars, standard errors. We note that estimates
and graphs for all specific traits are available from the online MaTCH webtool.
Table 2 Weighted means of twin correlations and proportion (π0) of studies that are consistent with a model where trait resemblance is
solely due to additive genetic variation for the main trait domains and the top 20 most investigated traits
Columns: trait; π0 (n traits, estimate); rMZ (estimate (s.e.m.), n traits, n pairs); rDZ (estimate (s.e.m.), n traits, n pairs)
All traits 5,185 0.69 0.636 (0.002) 9,568 2,563,628 0.339 (0.003) 5,220 2,606,252
General trait domains
Activities 62 0.35 0.570 (0.019) 118 58,227 0.340 (0.022) 63 55,864
Cardiovascular 267 0.95 0.564 (0.008) 380 41,669 0.295 (0.010) 268 25,544
Cell 54 0.59 0.722 (0.022) 72 3,188 0.523 (0.043) 54 1,667
Cognitive 450 0.57 0.646 (0.007) 931 288,867 0.371 (0.010) 454 304,720
Dermatological 74 0.45 0.729 (0.025) 109 19,509 0.402 (0.017) 75 23,245
Ear, nose, throat 165 0.97 0.760 (0.013) 200 27,882 0.332 (0.015) 172 14,222
Endocrine 108 0.69 0.555 (0.017) 162 10,112 0.387 (0.022) 110 9,140
Environment 145 0.50 0.551 (0.014) 295 120,606 0.396 (0.017) 145 99,137
Gastrointestinal 32 0.59 0.551 (0.024) 64 10,982 0.274 (0.028) 39 28,431
Hematological 19 0.65 0.764 (0.023) 50 5,541 0.560 (0.032) 19 3,218
Immunological 230 0.67 0.608 (0.012) 280 18,051 0.357 (0.013) 231 36,075
Metabolic 464 0.60 0.746 (0.005) 912 210,189 0.405 (0.008) 464 197,921
Neurological 702 1.00 0.685 (0.005) 1,751 129,076 0.289 (0.006) 705 89,103
Nutritional 110 0.72 0.479 (0.016) 205 75,751 0.220 (0.015) 110 79,188
Ophthalmological 106 0.87 0.730 (0.017) 199 26,139 0.385 (0.017) 106 16,189
Psychiatric 1,778 0.62 0.552 (0.004) 2,865 1,232,382 0.306 (0.005) 1,781 1,374,817
Reproduction 16 0.44 0.767 (0.034) 34 12,130 0.333 (0.063) 16 27,879
Respiratory 125 0.74 0.697 (0.018) 184 34,443 0.325 (0.019) 127 51,150
Skeletal 190 0.51 0.830 (0.008) 395 111,282 0.504 (0.012) 191 113,080
Social interactions 24 0.63 0.338 (0.017) 146 43,501 0.267 (0.041) 24 22,764
Social values 45 0.69 0.489 (0.030) 120 52,492 0.414 (0.062) 45 28,071
Top 20 investigated traits for rMZ and rDZ
Blood pressure functions 110 0.93 0.581 (0.010) 179 20,621 0.307 (0.013) 110 11,620
Conduct disorder 216 0.41 0.663 (0.009) 289 147,974 0.408 (0.010) 216 192,651
Depressive episode 115 0.60 0.454 (0.014) 173 98,315 0.253 (0.015) 115 121,936
Endocrine gland functions 92 0.72 0.538 (0.017) 139 8,533 0.382 (0.025) 92 7,295
Food 110 0.72 0.479 (0.016) 205 75,751 0.220 (0.015) 110 79,188
Functions of brain 594 0.99 0.676 (0.006) 1,010 69,722 0.287 (0.006) 594 58,951
General metabolic functions 219 0.69 0.682 (0.007) 462 62,108 0.371 (0.010) 219 58,338
Heart functions 140 1.00 0.529 (0.009) 174 15,070 0.268 (0.011) 140 11,109
Height 87 0.29 0.908 (0.005) 128 53,076 0.543 (0.008) 87 68,358
Higher-level cognitive functions 188 0.44 0.710 (0.009) 419 152,197 0.441 (0.016) 188 158,626
Hyperkinetic disorders 100 0.37 0.651 (0.013) 144 86,450 0.260 (0.016) 100 121,139
Immunological system functions 223 0.67 0.606 (0.012) 276 16,703 0.357 (0.013) 223 32,964
Mental and behavioral disorders due to the use of alcohol 100 0.36 0.630 (0.015) 158 94,477 0.409 (0.020) 101 94,196
Mental and behavioral disorders due to the use of tobacco 70 0.47 0.719 (0.016) 110 51,102 0.468 (0.022) 72 34,186
Other anxiety disorders 145 0.29 0.548 (0.013) 191 105,902 0.327 (0.016) 145 153,730
Specific personality disorders 140 0.93 0.448 (0.009) 162 41,460 0.225 (0.007) 140 33,681
Structure of the eyeball 86 0.91 0.735 (0.022) 121 19,276 0.370 (0.019) 86 13,580
Structure of mouth 117 0.89 0.819 (0.010) 127 7,769 0.399 (0.012) 119 8,493
Temperament and personality functions 568 0.84 0.470 (0.008) 1,134 334,190 0.234 (0.010) 568 296,114
Weight maintenance functions 215 0.48 0.810 (0.005) 391 141,152 0.437 (0.010) 215 134,867
General trait domain categories with <10 entries for π0 were excluded. For definitions of abbreviations, see Table 1.
Inclusion of the top 20 investigated traits was conditional on the reporting of rMZ and rDZ.
studies in which the pattern of twin correlations was consistent
with the null hypothesis that 2rDZ = rMZ was 69%. This observed
pattern of twin correlations is consistent with a simple and parsimoni-
ous underlying model of the absence of environmental effects shared
by twin pairs and the presence of genetic effects that are entirely
due to additive genetic variation (Table 2). This remarkable fitting
of the data with a simple mode of family resemblance is inconsistent
with the hypothesis that a substantial part of variation in human
traits is due to shared environmental variation or to substantial
non-additive genetic variation.
Most specific traits follow an additive genetic model
Although across all traits 69% of studies showed a pattern of monozy-
gotic and dizygotic twin correlations consistent with an rMZ that was
exactly twice the rDZ, this finding is not necessarily representative of
the majority of studies in functional domains or for every specific trait
(that is, at the ICD-10 or ICF subchapter level). We thus calculated the
proportion of studies consistent with 2rDZ = rMZ within functional
domains and for each specific trait and found that traits consistent
with this hypothesis tended to cluster in specific functional domains
(Supplementary Tables 27–29). A pattern of twin correlations
consistent with 2rDZ = rMZ was most prominent for traits included
in the neurological, ear, nose and throat, cardiovascular and oph-
thalmological domains, with 99.5%, 97%, 95% and 87% of studies,
respectively, being consistent with a model where all resemblance
was entirely due to additive genetic variance. In only 3 of 28 general
trait domains were most studies inconsistent with this model. These
domains were activities (35%), reproduction (44%) and dermatologi-
cal (45%) (Table 2 and Supplementary Table 27). Of the 59 specific
traits (ICD-10 or ICF subchapter classifications) for which we had
sufficient information to calculate the proportion of studies consistent
with 2rDZ = rMZ, 21 traits showed a proportion less than 0.50, whereas
for the remaining 38 traits the majority of individual studies were con-
sistent with 2rDZ = rMZ (Supplementary Table 29). Of the top 20 most
investigated specific traits, we found that for 12 traits the majority
of individual studies were consistent with a model where variance
was solely due to additive genetic variance and non-shared environ-
mental variance, whereas the pattern of monozygotic and dizygotic
twin correlations was inconsistent with this model for 8 traits, sug-
gesting that, apart from additive genetic influences and non-shared
environmental influences, either or both non-additive genetic influ-
ences and shared environmental influences are needed to explain the
observed pattern of twin correlations (Table 2). These eight traits were
conduct disorders, height, higher-level cognitive functions, hyper-
kinetic disorders, mental and behavioral disorders due to the use of
alcohol, mental and behavioral disorders due to the use of tobacco,
other anxiety disorders and weight maintenance functions. For all
eight traits, meta-analyses on reported variance components resulted
in a weighted estimate of reported shared environmental influences
that was statistically different from zero (Supplementary Table 21).
Comparison of weighted twin correlations for these specific traits
resulted in positive estimates of 2rDZ − rMZ, except for hyperkinetic
disorders, where 2rDZ − rMZ was −0.130 (s.e.m. = 0.034) on the basis
of 144 individual reports and 207,589 twin pairs, which suggests the
influence of non-additive genetic variation for this trait or any other
source of variation that leads to a disproportionate similarity among
monozygotic twin pairs.
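As an illustration of the comparison above, a minimal sketch follows. It assumes the standard expectations of the classical twin design, in which rMZ = a2 + c2 and rDZ = a2/2 + c2, so that 2 × (rMZ − rDZ) estimates the additive genetic share and 2rDZ − rMZ the shared environmental share. The function name is hypothetical, and the inputs are the rounded weighted correlations from Table 2, so standard errors are not reproduced.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> tuple[float, float]:
    """Return (h2, c2) point estimates from twin correlations.

    Assumes the classical twin model: rMZ = a2 + c2 and rDZ = a2/2 + c2.
    """
    h2 = 2.0 * (r_mz - r_dz)  # additive genetic share
    c2 = 2.0 * r_dz - r_mz    # shared environment; a negative value hints at non-additivity
    return h2, c2


# Rounded weighted correlations (rMZ, rDZ) taken from Table 2.
table2 = {
    "Hyperkinetic disorders": (0.651, 0.260),
    "Height": (0.908, 0.543),
}

for trait, (r_mz, r_dz) in table2.items():
    h2, c2 = falconer_estimates(r_mz, r_dz)
    print(f"{trait}: 2(rMZ - rDZ) = {h2:.3f}, 2rDZ - rMZ = {c2:.3f}")
```

With these rounded inputs, 2rDZ − rMZ is about −0.131 for hyperkinetic disorders and about 0.178 for height. The small difference from the reported −0.130 presumably reflects rounding of the tabulated correlations.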
DISCUSSION
We have conducted a meta-analysis of virtually all twin studies pub-
lished in the past 50 years, on a wide range of traits and reporting
on more than 14 million twin pairs across 39 different countries.
Our results provide compelling evidence that all human traits are
heritable: not one trait had a weighted heritability estimate of zero.
The relative influences of genes and environment are not randomly
distributed across all traits but cluster in functional domains. In
general, we showed that reported estimates of variance components
from model-fitting can underestimate the true trait heritability, when
compared with heritability based on twin correlations. Roughly
two-thirds of traits show a pattern of monozygotic and dizygotic
twin correlations that is consistent with a simple model whereby trait
resemblance is solely due to additive genetic variation. This implies
that, for the majority of complex traits, causal genetic variants can be
detected using a simple additive genetic model.
Approximately one-third of traits did not follow the simple pattern
of a twofold ratio of monozygotic to dizygotic correlations. For these
traits, a simple additive genetic model does not sufficiently describe
the population variance. An incorrect assumption about narrow-sense
heritability (the proportion of total phenotypic variation due to addi-
tive genetic variation) can lead to a mismatch between the results from
gene-finding studies and previous expectations. If the pattern of twin
correlations is consistent with a substantial contribution from shared
environmental factors, as we find for conduct disorders, religion and
spirituality, and education, then gene-mapping studies may yield dis-
appointing results. If the cause of departure from a simple additive
genetic model is the existence of non-additive genetic variation, as
is, for example, suggested by the average twin correlations for recur-
rent depressive disorder, hyperkinetic disorders and atopic dermatitis,
then it may be tempting to fit non-additive models in gene-mapping
studies (for example, GWAS or sequencing studies). However, the
statistical power of such scans is extremely low owing to the many
non-additive models that can be fitted (for example, within-locus
dominance versus between-locus additive-by-additive effects) and the
penalty incurred by multiple testing. Our current results signal traits
for which an additive model cannot be assumed. For most of these
traits, dizygotic twin correlations are higher than half the monozygotic
twin correlations, suggesting that shared environmental effects are
causing the deviation from a simple additive genetic model. Yet, data
from twin pairs only do not provide sufficient information to resolve
the actual causes of deviation from a simple additive genetic model.
More detailed studies may identify the likely causes of such deviation
and may as such uncover epidemiological or biological factors
that drive family resemblance. To make stronger inferences about the
causes underlying resemblance between relatives for traits that deviate
from the additive genetic model, additional data are required, for
example, from large population samples with extensive phenotypic
and DNA sequence information, detailed measures of environmental
exposures and larger pedigrees including non-twin relationships.
We note that our inference is based on twin studies published
between 1958 and 2012 and that it generally applies to complex traits
but does not necessarily generalize to mendelian subtypes of traits.
Most mendelian traits are rare in the population and are therefore not
studied by researchers of twins because they cannot ascertain enough
affected twin pairs to reliably estimate genetic parameters. In the rare
case where sufficient numbers of affected twin pairs were available,
the mendelian subtypes were analyzed together with the subtypes of
the same trait that were due to common causes, as it was unknown
whether the studied trait was a mendelian subtype. If the traits we
studied were in fact a mix of mendelian and complex subtypes,
our inference would be biased away from our main result because
mendelian diseases tend to be dominant or recessive, not additive. In
addition, there may be heterogeneity in measurement errors between
studies for the same trait and between traits. A test-retest correlation
would quantify measurement error when contrasted with a correla-
tion between monozygotic twins, but few twin studies report such
correlations in the same papers that estimate heritability.
Our results provide the most comprehensive empirical overview
of the relative contributions of genes and environment to all human
traits that have been studied in twins thus far, which can guide and
serve as a reference for future gene-mapping efforts.
URLs. ICF classification, http://apps.who.int/classifications/
icfbrowser/; ICD-10 classification, http://apps.who.int/classifications/
icd10/browse/2010/en. The data used for this manuscript have been
integrated in a web application, where user-specified selections of
traits can be made to apply the analyses presented in this work. The
web application is called MaTCH (Meta-analysis of Twin Correlations
and Heritability) and is accessible via http://match.ctglab.nl/; Gephi,
http://gephi.github.io/.
METHODS
Methods and any associated references are available in the online
version of the paper.
Note: Any Supplementary Information and Source Data files are available in the
online version of the paper.
ACKNOWLEDGMENTS
We would like to thank M. Frantsen, M.P. Roeling, R. Lee and D.M. DeCristo for
their contribution to collecting the full texts of selected twin studies and data entry.
This work was funded by the Netherlands Organization for Scientific Research
(NWO VICI 453-14-005, NWO Complexity 645-000-003), by the Australian
Research Council (DP130102666) and by the Australian National Health and
Medical Research Council (APP613601).
AUTHOR CONTRIBUTIONS
D.P., B.B., P.F.S. and P.M.V. performed the analyses. D.P. conceived the study.
D.P., T.J.C.P. and P.M.V. designed the study. T.J.C.P. and D.P. collected and entered
the data. D.P. and P.F.S. categorized traits according to standard classifications.
A.v.B. and C.A.d.L. checked data entries, and checked and wrote statistical scripts.
A.v.B. designed and programmed the webtool. D.P., T.J.C.P. and P.M.V. wrote the
manuscript. All authors discussed the results and commented on the manuscript.
COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.
Reprints and permissions information is available online at http://www.nature.com/
reprints/index.html.
1. Moore, J.H. Analysis of gene-gene interactions. Curr. Protoc. Hum. Genet. Chapter
1, Unit 1.14 (2004).
2. Hill, W.G., Goddard, M.E. & Visscher, P.M. Data and theory point to mainly additive
genetic variance for complex traits. PLoS Genet. 4, e1000008 (2008).
3. Traynor, B.J. & Singleton, A.B. Nature versus nurture: death of a dogma, and the
road ahead. Neuron 68, 196–200 (2010).
4. Zuk, O., Hechter, E., Sunyaev, S.R. & Lander, E.S. The mystery of missing
heritability: genetic interactions create phantom heritability. Proc. Natl. Acad. Sci.
USA 109, 1193–1198 (2012).
5. Phillips, P.C. Epistasis—the essential role of gene interactions in the structure and
evolution of genetic systems. Nat. Rev. Genet. 9, 855–867 (2008).
6. Visscher, P.M., Brown, M.A., McCarthy, M.I. & Yang, J. Five years of GWAS discovery.
Am. J. Hum. Genet. 90, 7–24 (2012).
7. Manolio, T.A. et al. Finding the missing heritability of complex diseases. Nature
461, 747–753 (2009).
8. Stranger, B.E., Stahl, E.A. & Raj, T. Progress and promise of genome-wide association
studies for human complex trait genetics. Genetics 187, 367–383 (2011).
9. Maher, B. Personal genomes: the case of the missing heritability. Nature 456,
18–21 (2008).
10. Eichler, E.E. et al. Missing heritability and strategies for finding the underlying
causes of complex disease. Nat. Rev. Genet. 11, 446–450 (2010).
11. Nelson, R.M., Pettersson, M.E. & Carlborg, Ö. A century after Fisher: time for a
new paradigm in quantitative genetics. Trends Genet. 29, 669–676 (2013).
12. Barker, J.S. Inter-locus interactions: a review of experimental evidence. Theor.
Popul. Biol. 16, 323–346 (1979).
13. Cockerham, C.C. An extension of the concept of partitioning hereditary variance for
analysis of covariances among relatives when epistasis is present. Genetics 39,
859–882 (1954).
14. Cockerham, C.C. in Statistical Genetics and Plant Breeding 53–94 (Nat. Acad. Sci.
Nat. Res. Council Publ., 1963).
15. Kempthorne, O. On the covariances between relatives under selfing with general
epistacy. Proc. R. Soc. Lond. B Biol. Sci. 145, 100–108 (1956).
16. Crow, J.F. & Kimura, M. An Introduction To Population Genetics Theory (Harper
and Row, 1970).
17. Carlborg, O. & Haley, C.S. Epistasis: too often neglected in complex trait studies?
Nat. Rev. Genet. 5, 618–625 (2004).
18. Falconer, D.S. & Mackay, T.F.C. Quantitative Genetics (Longman Group, 1996).
19. Lynch, M. & Walsh, B. Genetics and Analysis of Quantitative Traits (Sinauer
Associates, 1998).
Nature GeNetics
doi:10.1038/ng.3285
ONLINE METHODS
Identifying relevant studies. We searched PubMed for studies published
between 1 January 1900 and 31 December 2012 that provided twin correlations,
concordance rates, and a heritability estimate (h2) or an estimate of
shared environmental influences (c2), using monozygotic and dizygotic twins.
The following search term was used:
(“English”[Language] AND (“1900/01/01”[Date - Publication]: “2012/12/31”
[Date - Publication]) AND twin AND “journal article”[Publication Type]
AND “humans”[Filter] AND (heritability[Title/Abstract] OR “genetic
influence”[Title/Abstract] OR “environmental influence”[Title/Abstract] OR
“genetic factors”[Title/Abstract] OR “environmental factors”[Title/Abstract])
AND “journal article”[Publication Type]) NOT review[Title] NOT review
[Publication Type])
The search was run on 31 January 2013 and again on 29 April 2013, which
yielded an additional 44 publications, with the difference likely due to keywords
or tags that had been added to publications in the intervening period.
The last PubMed search yielded 4,388 unique studies. From these, we
deleted studies that were not relevant for the current purpose using the fol-
lowing exclusion criteria: (i) studies with only monozygotic twins available;
(ii) studies with no heritability estimates, twin correlations or concordances
available; (iii) review studies; (iv) meta-analyses; and (v) multivariate stud-
ies that provided information on completely overlapping traits and samples
with previously published univariate studies. Some studies investigating h2
for the brain (for example, voxel-based brain measures) were not included for
practical purposes. These studies typically presented their results in graphs
with color-coded point estimates of heritability mapped onto the brain. Such
estimates could not be quantified, and these studies were thus not included.
From the remaining 2,748 studies, we were able to retrieve the full text
from all but 5 papers (99.8%). For the studies without full-text availability,
we included relevant information based on the abstract. An overview of
authors and journals and a full reference list of all 2,748 studies are provided
in Supplementary Tables 30–32.
Primary information obtained from each study. From every study, we
retrieved basic information on the PubMed ID, the authors, the trait as named
in the study and the year of publication. In addition, the following informa-
tion was retrieved:
Country of origin of the study population. We used standard ISO country
names, and where possible data entry was done separately for each country
investigated in the study.
Age group of the study population. The study population was classified
into four age cohorts on the basis of the average age of the included
sample: age >0 and <12; age ≥12 and <18; age ≥18 and <65; and age ≥65.
Monozygotic and dizygotic twin correlations. Twin correlations were
entered as provided in the study and could be calculated as intraclass,
Pearson, polychoric or tetrachoric correlations or on the basis of least-
squares or maximum-likelihood estimates. When available, we entered the
twin correlations separately for males and females (i.e., monozygotic male
(MZM), monozygotic female (MZF), dizygotic male (DZM), dizygotic
female (DZF) and dizygotic opposite-sex (DOS) pairs). If correlations
were not available for males and females separately, we entered the MZ and
DZ correlations, i.e., the correlations based on both sexes. In cases where
it was clear that the dizygotic correlation was based on same-sex twins
only, we entered the dizygotic same-sex (DZSS) correlation.
Estimates of heritability (h2) and shared environmental component (c2),
under the full ACE (or ADE) model. We entered ‘h2_FULL’ and ‘c2_FULL’,
on the basis of estimates under the full ACE (including additive genetic
and shared and non-shared environmental influences) or ADE (including
additive and non-additive genetic and non-shared environmental influences)
model. When an ACE model was fitted, the estimate for A was
entered in ‘h2_FULL’ and the estimate for C was entered in ‘c2_FULL’.
When an ADE model was fitted, the estimates of A and D were summed
and entered for ‘h2_FULL’ and zero was entered for ‘c2_FULL’. When estimates
were provided separately for males and females, they were entered
separately. In the case of multivariate analyses, univariate estimates were
always preferred to allow comparison across studies.
Estimates of heritability (h2) and shared environmental component (c2),
under the best-fitting ACE (or ADE) model. We entered ‘h2_BEST’ and
‘c2_BEST’, on the basis of estimates under the best-fitting ACE or ADE
model as provided in the study. When an ACE model was the best-fitting
model, the estimate for A was entered in ‘h2_BEST’ and the estimate for
C was entered in ‘c2_BEST’. When an ADE model was the best-fitting
model, the estimates of A and D were summed and entered for ‘h2_BEST’
and zero was entered for ‘c2_BEST’. When estimates were provided
separately for males and females, they were entered separately. In the
case of multivariate analyses, univariate estimates were always preferred
to allow comparison across studies. In cases where estimates for the
best-fitting model were not directly provided but information available
in the paper indicated that the best-fitting model was AE (or CE or E),
we entered zero for ‘c2_BEST’ and missing for ‘h2_BEST’ (when the
best-fitting model was an AE model), missing for ‘c2_BEST’ and zero
for ‘h2_BEST’ (when the best-fitting model was a CE model), and zero
for ‘c2_BEST’ and zero for ‘h2_BEST’ (when the best-fitting model was
described to be an E model).
The total number of twin pairs as used for each entered estimate.
Whether the study was a classical twin study. All 2,748 studies in the
database included monozygotic and dizygotic twins. However, a classi-
cal twin study was defined as a study that involved only reared-together
monozygotic and dizygotic twins. From studies that included siblings,
extended families, adoptees or reared-apart twins, only estimates based on
the reared-together twin sample were used for the meta-analyses. Most of
the non-classical twin studies did provide twin correlations for the classical
twin design and were thus included in the meta-analysis for twin correla-
tions. When A and C estimates were based on extended twin designs, they
were excluded from the meta-analyses.
The method used for estimating the variance components. We entered the
statistical method used for estimating the variance components, which
included, for example, ANOVA, Bayesian, maximum-likelihood (ML),
DeFries-Fulker regression, least-squares (LS) or intrapair differences.
We also listed a dichotomized version of this indicating whether the
method used was ‘ML or LS’ or ‘not ML or LS’, for all other methods.
In the meta-analyses for h2 and c2 estimates, only those based on maxi-
mum likelihood or least squares were included.
Whether the trait was dichotomous or continuous. Traits measured
as 0 or 1, as well as traits measured on a quantitative scale but dichot-
omized before analysis, were listed as dichotomous. All other traits,
including ordinal traits, were listed as ‘not dichotomous’ and treated
as continuous.
Whether the study involved ascertainment for the trait. When the trait
under investigation was the same trait that was used to select probands,
the study was listed as ‘ascertained’.
Number of concordant and discordant pairs. In cases of dichotomous
traits, the total numbers of pairs for discordant and concordant affected
pairs were entered separately for each zygosity. In cases of dichotomous
traits that were not ascertained, the number of concordant unaffected
pairs was also entered.
Prevalence. In cases of dichotomous traits, the population prevalence, sep-
arately for monozygotic and dizygotic twins when available, was entered.
Prevalence was based on what was provided in the study or was calculated
using (2c + d)/2n, where c is the number of concordant affected pairs,
d is the number of discordant pairs and n is the total number of pairs in
non-ascertained traits.
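The prevalence formula (2c + d)/2n can be sketched as a small helper. The function and argument names are hypothetical, the example counts are invented for illustration, and the input validation is an addition not described in the text.

```python
def prevalence_from_pairs(concordant_affected: int, discordant: int, total_pairs: int) -> float:
    """Prevalence among individuals in non-ascertained twin pairs: (2c + d) / 2n."""
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    if concordant_affected + discordant > total_pairs:
        raise ValueError("affected pairs cannot exceed the total number of pairs")
    # Each concordant affected pair contributes two affected twins and each
    # discordant pair one; n pairs contain 2n individuals.
    return (2 * concordant_affected + discordant) / (2 * total_pairs)


# Hypothetical counts, not taken from any study in the database.
print(prevalence_from_pairs(concordant_affected=30, discordant=140, total_pairs=1000))
```

Here 30 concordant affected and 140 discordant pairs out of 1,000 give 200 affected individuals among 2,000, a prevalence of 0.1.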
Thus, where available, the statistics in Supplementary
Table 1 were obtained for each trait reported on in every study. When the
five basic twin correlations were available (rMZM, rMZF, rDZM, rDZF and rDOS),
we calculated rMZ, rDZSS and rDZ, using the weighted average via Fisher
z transformation and using sample size as weights. In situations where rMZM
(or rMZF) was exactly 1 (or −1), we subtracted (or added in the case of −1)
0.00001 to the correlation to ensure non-problematic Fisher z transformation.
Sample sizes of MZ, DZSS and DZ were obtained by summing the sample
sizes of MZM and MZF, of DZM and DZF, and of DZM, DZF and
DOS, respectively. Estimates of h2 and c2 were calculated across sex as the
n-weighted average across the separate male and female estimates, when
available. For the number of concordant and discordant pairs, MZ, DZ and
DZSS were calculated on the basis of the numbers available for MZM, MZF,
DZM, DZF and DOS. Prevalences for pooled entries were calculated as
an n-weighted average.
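The pooling of sex-specific correlations described above can be sketched as follows. This is a minimal illustration rather than the original script: the function name and the example values are hypothetical, and the number of pairs is used directly as the weight, as stated in the text.

```python
import math


def pool_correlations(groups):
    """n-weighted average of correlations on the Fisher z scale.

    groups: iterable of (r, n_pairs) tuples, for example for MZM and MZF.
    Returns (pooled_r, total_n).
    """
    eps = 0.00001
    z_sum = 0.0
    n_sum = 0
    for r, n in groups:
        # Pull exact +1 or -1 inwards so that the Fisher z transform stays finite.
        if r >= 1.0:
            r = 1.0 - eps
        elif r <= -1.0:
            r = -1.0 + eps
        z_sum += n * math.atanh(r)  # Fisher z = atanh(r)
        n_sum += n
    return math.tanh(z_sum / n_sum), n_sum


# Hypothetical example: rMZ from 200 MZM pairs (r = 0.60) and 300 MZF pairs (r = 0.70).
r_mz, n_mz = pool_correlations([(0.60, 200), (0.70, 300)])
print(round(r_mz, 3), n_mz)
```

The pooled value lies between the two inputs and closer to the group with more pairs. The same helper applies to rDZSS (DZM and DZF) and rDZ (DZM, DZF and DOS).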
Data entry checks. Studies were entered and cross-checked for obvious typos
by T.J.C.P. and D.P. After initial data entry and initial cross-checking, all data
points were manually checked (D.P.) by looking up the entered values in the
original paper. In addition, automatic checks were run (D.P. and B.B.) to
identify typos, strange outliers or obvious mistakes. These checks included:
(i) identifying highly unlikely values (clear typos, for example, correlation of 120);
(ii) testing whether the sum of h2 and c2 was <100; (iii) testing for strange
discrepancies between estimates from the full and best-fitting models; and
(iv) checking for outliers on the basis of extreme sample size and extreme
χ2 values for rejecting the null hypothesis that either 2 × (rMZ − rDZ) or
2 × rDZ − rMZ is equal to zero.
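Checks (i) and (ii) lend themselves to simple automated rules. The sketch below is hypothetical: the dictionary layout and the keys ‘rMZ’ and ‘rDZ’ are assumptions, and h2 and c2 are assumed to be stored as percentages, as the threshold of 100 in the text suggests. It is not the script used for the study.

```python
def flag_entry(entry: dict) -> list[str]:
    """Return a list of human-readable problems found in one database entry."""
    problems = []
    # Check (i): correlations must lie in [-1, 1]; a value such as 120 is a typo.
    for key in ("rMZ", "rDZ"):
        r = entry.get(key)
        if r is not None and not -1.0 <= r <= 1.0:
            problems.append(f"{key}={r} is outside [-1, 1]")
    # Check (ii): variance components (assumed to be in percent) cannot sum to more than 100.
    h2, c2 = entry.get("h2_FULL"), entry.get("c2_FULL")
    if h2 is not None and c2 is not None and h2 + c2 > 100:
        problems.append(f"h2_FULL + c2_FULL = {h2 + c2} exceeds 100")
    return problems


# Hypothetical entries for illustration only.
print(flag_entry({"rMZ": 120, "rDZ": 0.3}))
print(flag_entry({"rMZ": 0.6, "rDZ": 0.3, "h2_FULL": 60, "c2_FULL": 10}))
```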
Classification of traits. After data entry, all traits were manually classified
using the ICF. The ICF is the framework of the World Health Organization
(WHO) for health and disability and provides the conceptual basis for the
definition, measurement and policy formulations for health and disability.
It is a universal classification of disability and health for use in health and
health-related sectors. ICF belongs to the WHO family of international
classifications, the best-known member of which is the ICD-10. ICD-10
provides an etiological framework for the classification of diseases, disorders
and other health conditions, whereas ICF classifies functioning and disability
associated with health conditions. The ICD-10 and ICF are therefore comple-
mentary (see URLs).
Most traits investigated in twin studies concern healthy functioning (for
example, cognitive function, social attitudes, body height and personality)
and were classified according to ICF. In cases where the studied traits were
diseases or symptoms of disease, ICD-10 was used. Traits were given two
hierarchical classifications corresponding to the ICF or ICD-10 hierarchical
structure, using the chapter structure (for example, b1) and the level directly
under the chapter (for example, b110), which corresponds to the code for the
actual disease (ICD-10) or trait (ICF).
Six new classes at the chapter level and 17 new classes at the subchapter
level were created to accommodate traits that could not be classified under
either ICF or ICD-10. For the chapter level, the created classes were cell, func-
tion of DNA, functions of the nervous system, medication effects, mortality
and structure of DNA. For the subchapter level, the classes created were
all-cause mortality, cell cycle, cell growth, diazepam effects, expression,
function of brain, gene expression, height, methylation, mortality from heart
disease, mtDNA, physical appearance, receptor binding, sister chromatid
exchange, structure of DNA, telomeres and X inactivation.
In addition to the two standard ICF or ICD-10 classification levels, we
added a general classification of functional trait domains. We thus classified
all traits using a 3-level scheme that included 28 broad, functional
domains, 54 ICF or ICD-10 chapter-level classes and 313 subchapter-level
classes. A small proportion of studied traits (<0.1%) could not be classified
meaningfully on the chapter level (two traits) or the subchapter level
(three traits). There were 326 unique combinations across the 3 levels of
trait categorization (Supplementary Table 33). All analyses were conducted
on all entries of each of the three levels of classification. In addition,
we analyzed all traits together. Although this is unspecific in terms of
diseases or traits, it provides a general overview of the relationship between
monozygotic and dizygotic twin correlations and shows general patterns
of, for example, sex and age differences. The most specific level was the
subchapter level, which was the actual ICD-10 diagnosis or a similar
ICF classification for normal functioning, reflecting specific traits such
as cleft lip, hyperkinetic disorders or higher-level cognitive function.
As researchers do not necessarily adhere to the ICD-10 or ICF trait nomenclature,
traits with the same subchapter classification could have different
trait names in the original study: for example, for higher-level cognitive
function, the original studies included the trait names total IQ score,
cognitive ability, intelligence or ‘g’.
Tests for publication bias. Publication bias can occur when studies that report
relatively large heritability estimates or high twin correlations are more likely
to be submitted and/or accepted for publication than studies that report more
modest effects. Such a publication bias would lead to an overestimation of the
true twin correlations or the true heritability and environmental estimates.
We used several standard statistical tools to aid in identifying and quantify-
ing possible publication bias, including inspection of funnel plots, Begg and
Mazumdar’s test20, Egger’s regression test21 and Rosenthal’s fail-safe N22.
Meta-analysis methods of twin correlations and variance components. We
used the DerSimonian-Laird (DSL) random-effects meta-analytical approach
with correlation coefficients as effect sizes, as described by Schulze23 and
implemented in the R package metacor. This function transforms a correlation
to its Fisher z value with corresponding standard error before the meta-analy-
sis. This method is preferred over conducting a meta-analysis directly on the
correlations because the standard error of a twin correlation is a function of not
only sample size but also the correlation itself, with larger correlations having a
smaller standard error. This can cause problems in a meta-analysis, as it would
lead to the larger correlations appearing more precise and being assigned more
weight in the analysis, irrespective of sample size. To avoid this problem, the
DSL method transforms correlations to the Fisher’s z metric, whose standard
error is determined solely by sample size. All n-weighted computations were
thus performed using Fisher’s z metric, and the results were converted back
to correlations for interpretation.
The random-effects approach allows for heterogeneity of the true twin cor-
relations across different studies. That is, rather than assuming that there is one
true level of twin correlation, the random-effects model allows a distribution of
true correlations. The combined effect of the random-effects model represents
the mean of the population of true correlations. For computational reasons,
correlations of −1 and 1 were converted to −0.99999 and 0.99999 before meta-
analysis. We set a threshold of at least five pairs of twins available per estimate
and at least two studies available per category. Meta-analyses were conducted
for each category of all three levels of classification.
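The random-effects pooling described above can be illustrated with a compact DerSimonian-Laird sketch on the Fisher z scale. It is not a reproduction of the metacor implementation: the function name is hypothetical, and the sampling variance of z is assumed to be the usual large-sample value 1/(n − 3).

```python
import math


def dsl_pooled_correlation(correlations, n_pairs):
    """DerSimonian-Laird random-effects pooling of correlations via Fisher z.

    Assumes var(z) = 1 / (n - 3) per study. Returns (pooled_r, se_z, tau2).
    """
    eps = 0.00001
    z, v = [], []
    for r, n in zip(correlations, n_pairs):
        if n < 5:
            continue  # threshold of at least five twin pairs per estimate
        r = max(min(r, 1.0 - eps), -1.0 + eps)  # convert +/-1 to +/-0.99999
        z.append(math.atanh(r))
        v.append(1.0 / (n - 3))
    k = len(z)
    if k < 2:
        raise ValueError("at least two studies are required per category")

    w = [1.0 / vi for vi in v]  # fixed-effect (inverse-variance) weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at zero

    w_re = [1.0 / (vi + tau2) for vi in v]  # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)
    se_z = math.sqrt(1.0 / sum(w_re))
    return math.tanh(z_re), se_z, tau2


# Hypothetical study-level correlations and numbers of pairs.
print(dsl_pooled_correlation([0.55, 0.62, 0.48], [250, 900, 120]))
```

Because the weights depend only on n and on the between-study variance, a large correlation does not receive extra weight simply for being large, which is the motivation given above for working on the z scale.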
We note that twin samples used in different publications were not independ-
ent. For example, studies using Australian twins are predominantly based on
twins from the Australian Twin Registry. These studies sometimes include dif-
ferent subsamples but may also include completely overlapping samples used
to investigate different traits. As participants are anonymous, it is not possible
to determine the extent of overlap in the studies included in our analyses. We
thus assumed independence of samples in the meta-analyses. This assumption
leads to an underestimation of the variance of weighted estimates and
an overestimation of their precision. We expect that the dependency of study
samples is lowest at the specific level of the ICD-10 or ICF subchapters and
highest for the general functional domains.
Meta-analysis for dichotomous, non-ascertained traits. In the DSL ran-
dom-effects model, the standard error of a correlation is calculated on the
basis of the provided n (pairs). The estimated standard errors for continuous
traits are correct, but for dichotomous traits the resulting standard error is
incorrect. That is because twin correlations for non-ascertained, dichotomous
traits are typically based on three categories of pairs: concordant, unaffected
(CON−), discordant (DIS) and concordant, affected (CON+) pairs. Whereas
the total number of participating pairs is the sum of these, the information
that determines the twin correlation and its significance is mostly derived
from the latter two categories. For non-ascertained, dichotomous traits, we
calculated the study-specific tetrachoric twin correlation on the basis of the
contingency table (i.e., CON−, DIS and CON+ pairs), under the assumption
that the dichotomous traits represent latent variables that follow a bivariate
normal distribution24. We used a maximum-likelihood estimator described
by Olsson25, implemented in the R function polycor, to calculate the study-
specific twin correlation and its standard error. As our meta-analysis required
twin correlation and sample size (not standard error) as input and we wanted
to be able to pool across continuous and dichotomous traits, we calculated
the ‘effective’ number of pairs on the basis of the obtained standard error.
The effective number of pairs was defined as the number of pairs that produced
the exact same standard error within the DSL meta-analyses as the standard
error obtained from the contingency table.
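The conversion from a standard error to an effective number of pairs can be sketched as follows. This is a minimal sketch under a stated assumption: that the meta-analysis works on Fisher z-transformed correlations with sampling variance 1/(n − 3). The text above does not specify the standard-error formula used within the DSL model, so the function name and the formula are illustrative, not the procedure actually applied.

```python
def effective_pairs(r: float, se_r: float) -> float:
    """Number of pairs that would reproduce a given standard error of r.

    Assumption (not taken from the source): the standard error on the
    Fisher z scale is 1 / sqrt(n - 3).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if se_r <= 0.0:
        raise ValueError("se_r must be positive")
    # Delta method: move the standard error of r onto the Fisher z scale.
    se_z = se_r / (1.0 - r * r)
    # Invert se_z = 1 / sqrt(n - 3) to obtain n.
    return 1.0 / (se_z * se_z) + 3.0


if __name__ == "__main__":
    # A tetrachoric correlation of 0.5 with a standard error of 0.075.
    print(effective_pairs(0.5, 0.075))
```

Under this assumption, a correlation estimated less precisely than a continuous trait with the same number of pairs receives a smaller effective n, and so carries less weight when pooled.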
Meta-analysis for dichotomous, ascertained traits. For ascertained traits, it
was not possible to calculate the twin correlations and standard errors on the
basis of the contingency table, as the traits included only pairs with at least
one proband. Without information on the number of concordant, unaffected
pairs, the prevalence of the affected status would be required to calculate a
twin correlation. We used the algorithms derived from Falconer26 and Smith27.
Again, for practical purposes, calculated standard errors were transformed to
an effective number of pairs for use in the DSL meta-analysis.
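To show how prevalence enters such a calculation, the sketch below implements Falconer's first-order liability-threshold approximation. It assumes the inputs are the population prevalence and the risk to co-twins of probands (for example, a probandwise concordance rate). The function and parameter names are hypothetical, and the algorithms actually used may differ in detail, for instance in Smith's refinements.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal liability distribution


def falconer_liability_correlation(prevalence: float, cotwin_risk: float) -> float:
    """Approximate twin correlation in liability (Falconer's approximation).

    prevalence  -- proportion affected in the general population
    cotwin_risk -- proportion affected among co-twins of probands
    """
    if not (0.0 < prevalence < 1.0 and 0.0 < cotwin_risk < 1.0):
        raise ValueError("both proportions must lie strictly between 0 and 1")
    # Liability thresholds implied by the two proportions.
    x_general = _STD_NORMAL.inv_cdf(1.0 - prevalence)
    x_relatives = _STD_NORMAL.inv_cdf(1.0 - cotwin_risk)
    # Mean liability of affected individuals in the general population.
    mean_affected = _STD_NORMAL.pdf(x_general) / prevalence
    return (x_general - x_relatives) / mean_affected


if __name__ == "__main__":
    # Prevalence of 1% with 40% of probands' co-twins affected.
    print(round(falconer_liability_correlation(0.01, 0.40), 3))
```

Because this is a first-order approximation, the result is not guaranteed to stay within [−1, 1] for extreme inputs.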
Proportion of studies consistent with specific hypotheses. We estimated
the proportion of studies that were consistent with $H_0\colon 2 \times (r_{MZ} - r_{DZ}) = 0$
($\pi_{0(h)}$) and the proportion of observations consistent with $H_0\colon 2 \times r_{DZ} - r_{MZ} = 0$
($\pi_{0(c)}$), using the Jiang and Doerge method28, as well as the q-value
method29.
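The idea behind estimating a proportion of true nulls can be sketched with the simplest fixed-threshold form of Storey's estimator. It assumes one P value per study for the hypothesis in question, and the function name and the threshold `lam` are illustrative choices. The q-value method typically smooths the estimate over a range of thresholds, and the Jiang and Doerge method is a different estimator that is not shown here.

```python
from typing import Sequence


def storey_pi0(p_values: Sequence[float], lam: float = 0.5) -> float:
    """Estimate the proportion of true null hypotheses at a fixed threshold.

    P values above `lam` are assumed to come almost entirely from true
    nulls, which are uniformly distributed on [0, 1].
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one P value is required")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must satisfy 0 <= lam < 1")
    n_above = sum(1 for p in p_values if p > lam)
    # Rescale the count above lam by the width of the interval (lam, 1].
    return min(1.0, n_above / (m * (1.0 - lam)))


if __name__ == "__main__":
    nulls = [(i + 0.5) / 100 for i in range(100)]  # evenly spread, null-like
    signals = [1e-4] * 100                         # clearly non-null
    print(storey_pi0(nulls + signals))
```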
Authorship network analysis. We used the approach more fully described
previously30. Briefly, we retrieved from PubMed the full Medline listing
for all twin studies included in this meta-analysis using NCBI eutils. The
output was parsed to capture the names of all authors. The twin study author
list was manually reviewed to resolve clear inconsistencies in the spelling
of the names of authors between publications. GEPHI was used to construct
a network to understand twin study publication patterns. For clarity, we
removed individuals who had published only one paper (i.e., we required
authorship on ≥2 papers). The substructure of the network was investigated
by estimating community membership modules using the Louvain method31
implemented in GEPHI.
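The graph construction and filtering step can be sketched in plain Python. This is a minimal illustration that assumes author names have already been parsed and harmonized into one list per paper. The function name is hypothetical, and community detection with the Louvain method (done in GEPHI in the analysis above) is not reproduced.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple


def coauthor_network(
    papers: List[List[str]], min_papers: int = 2
) -> Tuple[Set[str], Dict[Tuple[str, str], int]]:
    """Build a co-authorship graph, keeping authors with >= min_papers papers.

    Returns the retained authors (nodes) and a mapping from each
    alphabetically ordered author pair (edge) to its number of shared papers.
    """
    # Count papers per author, ignoring repeated names within one paper.
    paper_counts = Counter(a for authors in papers for a in set(authors))
    nodes = {a for a, n in paper_counts.items() if n >= min_papers}
    edges: Counter = Counter()
    for authors in papers:
        retained = sorted(set(authors) & nodes)
        # Every pair of retained co-authors on a paper shares one edge count.
        for pair in combinations(retained, 2):
            edges[pair] += 1
    return nodes, dict(edges)


if __name__ == "__main__":
    example = [["A", "B", "C"], ["A", "B"], ["B", "D"]]
    print(coauthor_network(example))
```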
20. Begg, C.B. & Mazumdar, M. Operating characteristics of a rank correlation test for
publication bias. Biometrics 50, 1088–1101 (1994).
21. Egger, M., Davey Smith, G., Schneider, M. & Minder, C. Bias in meta-analysis
detected by a simple, graphical test. Br. Med. J. 315, 629–634 (1997).
22. Rosenthal, R. The file drawer problem and tolerance for null results. Psychol. Bull.
86, 91–106 (1979).
23. Schulze, R. Meta-Analysis: A Comparison Of Approaches (Hogrefe & Huber,
2004).
24. Drasgow, F. in Encyclopedia of Statistical Sciences (eds. Kotz, S., Read, C.B.,
Balakrishnan, N. & Vidakovic, B.) Vol. 7, 68–74 (John Wiley & Sons, 2006).
25. Olsson, U. Maximum likelihood estimation of the polychoric correlation coefficient.
Psychometrika 44, 443–460 (1979).
26. Falconer, D.S. The inheritance of liability to certain diseases, estimated from the
incidence among relatives. Ann. Hum. Genet. 29, 51–76 (1965).
27. Smith, C. Concordance in twins: methods and interpretation. Am. J. Hum. Genet.
26, 454–466 (1974).
28. Jiang, H. & Doerge, R.W. Estimating the proportion of true null hypotheses for
multiple comparisons. Cancer Inform. 6, 25–32 (2008).
29. Storey, J.D. & Tibshirani, R. Statistical significance for genomewide studies.
Proc. Natl. Acad. Sci. USA 100, 9440–9445 (2003).
30. Bulik-Sullivan, B.K. & Sullivan, P.F. The authorship network of genome-wide
association studies. Nat. Genet. 44, 113 (2012).
31. Blondel, V., Guillaume, J., Lambiotte, R. & Lefebvre, E. Fast unfolding of community
hierarchies in large networks. J. Stat. Mech. 10, P10008 (2008).