Genome-wide survey implicates the influence of copy number variants (CNVs) in the development of early-onset bipolar disorder.
ABSTRACT We used genome-wide single nucleotide polymorphism (SNP) data to search for the presence of copy number variants (CNVs) in 882 patients with bipolar disorder (BD) and 872 population-based controls. A total of 291 (33%) patients had an early age-at-onset ≤21 years (AO≤21 years). We systematically filtered for CNVs that cover at least 30 consecutive SNPs and which directly affect at least one RefSeq gene. We tested whether (a) the genome-wide burden of these filtered CNVs differed between patients and controls and whether (b) the frequency of specific CNVs differed between patients and controls. Genome-wide burden analyses revealed that the frequency and size of CNVs did not differ substantially between the total samples of BD patients and controls. However, separate analysis of patients with AO≤21 years and AO>21 years showed that the frequency of microduplications was significantly higher (P=0.0004) and the average size of singleton microdeletions was significantly larger (P=0.0056) in patients with AO≤21 years compared with controls. A search for specific BD-associated CNVs identified two common CNVs: (a) a 160 kb microduplication on 10q11 was overrepresented in AO≤21 years patients (9.62%) compared with controls (3.67%, P=0.0005) and (b) a 248 kb microduplication on 6q27 was overrepresented in the AO≤21 years subgroup (5.84%) compared with controls (2.52%, P=0.0039). These data suggest that CNVs have an influence on the development of early-onset, but not later-onset BD. Our study provides further support for previous hypotheses of an etiological difference between early-onset and later-onset BD.
ABSTRACT: Objectives: Copy number variants (CNVs) have been shown to affect susceptibility for neuropsychiatric disorders. To date, studies implicating the serotonergic system in complex conditions have just focused on single nucleotide polymorphisms (SNPs). We therefore sought to identify novel common genetic copy number polymorphisms affecting genes of the serotonergic system, and to assess their putative role in bipolar affective disorder (BPAD) and major depressive disorder (MDD). Methods: A selection of 41 genes of the serotonergic system encoding receptors, the serotonin transporter, metabolic enzymes and chaperones were investigated using a paired-end mapping (PEM) approach on next-generation sequencing data from the pilot project of the 1000 Genomes Project. For association testing, 593 patients with MDD, 1,145 patients with BPAD, and 1,738 healthy controls were included in the study. Results: PEM led to the identification of a microdeletion in the gene encoding tryptophan hydroxylase 2 (TPH2), affecting an amygdala- and hippocampus-specific isoform. It was not associated with BPAD or MDD using a case–control association approach. Conclusions: We did not find evidence for a role of the TPH2 microdeletion in the pathoetiology of affective disorders. Further studies examining its putative role in behavioral traits regulated by the limbic system are warranted. Bipolar Disorders 04/2014; 4.62 Impact Factor
ABSTRACT: Background Age-of-onset (AO) is increasingly used in molecular genetics of bipolar I disorder (BP-I) as a phenotypic specifier with the goal of reducing genetic heterogeneity. However, questions regarding the cut-off age for defining early onset (EO), as well as the number of onset groups characterizing BP-I have emerged over the last decade with no definite conclusion. The aims of this paper are: 1) to see whether a mixture of three distributions better describes the AO of BP-I than a mixture of two distributions in different independent samples; 2) to compare the morbid risk (MR) for BP-I and for major affective disorders and schizophrenia in first degree relatives of BP-I probands by proband onset group derived from commingling analysis, since the MR to relatives is a trait with strong genetic background. Methods We applied commingling (admixture) analysis to the AO of three BP-I samples from Romania (n=621), Germany (n=882), and Poland (n=354). Subsequently, the morbid risk (MR) for BP-I and for major psychoses (BP-I, BP-II, Mdd-UP, schizoaffective disorders, schizophrenia) was estimated in first degree relatives by proband AO-group derived from admixture analysis in the Romanian sample. Results In the three independent samples and in the combined sample two- and three-AO-group distributions fitted the empirical data equally well. The upper EO limit varied between 21 and 25 years from sample to sample. The MR for both BP-I and for all major psychoses was similar in first degree relatives of EO probands (AO≤21) and in relatives of intermediate-onset probands (AO=22–34). Significant MR differences appeared only when comparing the EO group to the late-onset (LO) group (AO>34). Similar to Mdd-UP and schizophrenia, a significant MR decrease in proband first degree relatives was visible after proband AO of 34 years. 
Under the three-AO-group classification the MR for both BP-I and all major psychoses in first degree relatives did not differ by relative sex in any proband AO-group. Under the two-AO-group classification female relatives of LO probands (AO>24) had a significantly higher MR for all major psychoses than male relatives, while there was no sex difference for the relatives of EO probands. Limitations MR was not computed in the German and Polish samples because family data were not available and 34% of the relatives of the Romanian probands were not available for direct interview. Conclusion Similar to other clinical traits, the MR for major psychoses to relatives failed to support a three-AO-group classification in BP-I suggesting that this is not more useful for the molecular analysis than a two-AO-group classification. Journal of Affective Disorders. 01/2014; 168:197–204.
ABSTRACT: An increased rate of de novo copy number variants (CNVs) has been found in schizophrenia (SZ), autism and developmental delay. An increased rate has also been reported in bipolar affective disorder (BD). Here, in a larger BD sample, we aimed to replicate these findings and compare de novo CNVs between SZ and BD. We used Illumina microarrays to genotype 368 BD probands, 76 SZ probands and all their parents. CNVs were called by PennCNV and filtered for frequency (<1%) and size (>10 kb). Putative de novo CNVs were validated with the z-score algorithm, manual inspection of log R ratios, and qPCR probes. We found 15 de novo CNVs in BD (4.1% rate) and six in SZ (7.9% rate). Combining results with previous studies and using a cut-off of >100 kb, the rate of de novo CNVs in BD was intermediate between controls and SZ: 1.5% in controls, 2.2% in BD and 4.3% in SZ. Only the differences between SZ and BD and SZ and controls were significant. The median size of de novo CNVs in BD (448 kb) was also intermediate between SZ (613 kb) and controls (338 kb), but only the comparison between SZ and controls was significant. Only one de novo CNV in BD was in a confirmed SZ locus (16p11.2). Sporadic or early onset cases were not more likely to have de novo CNVs. We conclude that de novo CNVs play a smaller role in BD compared to SZ. Patients with a positive family history can also harbour de novo mutations. Human Molecular Genetics 07/2014; 7.69 Impact Factor
Genome-wide survey implicates the influence of copy number variants
(CNVs) in the development of early-onset bipolar disorder
L Priebe1,2, F Degenhardt1,2, S Herms1,2, B Haenisch1,2, M Mattheisen1,2,3, V Nieratschker4, M
Weingarten1,2, S Witt4, R Breuer4, T Paul4, M Alblas1,2, S Moebus5, M Lathrop6, M Leboyer7,8,9, S
Schreiber10, M Grigoroiu-Serbanescu11, W Maier12, P Propping2, M Rietschel4, MM Nöthen1,2, S
Cichon1,2,13,14, TW Mühleisen1,2,14
1Department of Genomics, Life and Brain Center, University of Bonn, Bonn, Germany
2Institute of Human Genetics, University of Bonn, Bonn, Germany
3Institute for Medical Biometry, Informatics, and Epidemiology, University of Bonn, Bonn, Germany
4Central Institute of Mental Health, Division of Genetic Epidemiology in Psychiatry, Mannheim, Germany
5Institute for Medical Informatics, Biometry and Epidemiology, University Clinic Essen, Essen, Germany
6Centre National de Génotypage, Evry Cedex, France
7INSERM U-513, Faculté de Médecine, Créteil, France
8University of Paris, Faculty of Medicine, IRF10, Créteil, France.
9AP-HP, Albert Chenevier and Henri Mondor Hospitals, Department of Psychiatry, Créteil, France
10Institute of Clinical Molecular Biology, Christian Albrechts University, Kiel, Germany
11Biometric Psychiatric Genetics Research Unit, Alexandru Obregia Clinical Psychiatric Hospital,
12Department of Psychiatry, University of Bonn, Bonn, Germany
13Institute of Neuroscience and Medicine (INM-1), Structural and Functional Organization of the
Brain, Genomic Imaging, Research Center Juelich, Juelich, Germany
14These authors contributed equally to this work
Correspondence to: Prof Dr S Cichon, Department of Genomics, Institute of Human Genetics, Life
and Brain Center, University of Bonn, Sigmund-Freud-Str. 25, D-53127 Bonn, Germany.
We used genome-wide SNP data to search for the presence of CNVs in 882 patients with bipolar
disorder (BD) and 872 population-based controls. A total of 291 (33%) patients had an early age-at-
onset ≤21 years (AO≤21y). We systematically filtered for CNVs that cover at least 30 consecutive
SNPs and which directly affect at least one RefSeq gene. We tested whether: (a) the genome-wide
burden of these filtered CNVs differed between patients and controls, and (b) the frequency of
specific CNVs differed between patients and controls. Genome-wide burden analyses revealed that
the frequency and size of CNVs did not differ substantially between the total samples of BD
patients and controls. However, separate analysis of patients with AO≤21y and AO>21y showed
that the frequency of microduplications was significantly higher (P=0.0004) and the average size of
singleton microdeletions was significantly larger (P=0.0056) in patients with AO≤21y compared to
controls. A search for specific BD-associated CNVs identified two common CNVs: (a) a 160 kb
microduplication on 10q11 was overrepresented in AO≤21y patients (9.62%) compared to controls
(3.67%; P=0.0005), and (b) a 248 kb microduplication on 6q27 was overrepresented in the AO≤21y
subgroup (5.84%) compared to controls (2.52%, P=0.0039). These data suggest that CNVs have an
influence on the development of early-onset, but not later-onset BD. Our study provides further
support for previous hypotheses of an etiological difference between early-onset and later-onset BD.
Keywords: bipolar disorder; copy number variant; genome-wide burden; early age-at-onset;
Running title: SNP-based search for CNVs in bipolar disorder
Bipolar disorder (BD) is a common, severe mood disorder that is characterized by recurring
episodes of extreme exaltation (mania) and depression. Mood symptoms are often accompanied by
disturbances in thinking and behavior. BD has a lifetime risk in the general population of 0.5-1.5%.1
Twin, family, and adoption studies have provided strong evidence for a genetic predisposition to
BD.2,3 Heritability has been estimated to be between 60% and 85%.4
Changes in the copy number of submicroscopic chromosomal segments, known as copy number
variants (CNVs), are a major component of the difference between human genomes.5 Based on their
frequency, rare (<1%) and common (>1%) CNVs can be distinguished. Several recent studies have
shown a strong influence of rare CNVs on the development of neuropsychiatric phenotypes such as
autism6,7 and schizophrenia.8-10 To date, only a limited number of studies have been published for
BD.11-15 Lachman et al. investigated a mixed cohort of Caucasian patients and controls from the
Czech Republic and the United States, and found that microdeletions and microduplications,
affecting the gene glycogen synthase kinase 3 beta (GSK3B) were significantly increased in
patients.11 Zhang et al. investigated singleton microdeletions (i.e. those occurring only once in the
total dataset of patients and controls) of >100 kb in their European American sample of 1 001 BD
patients and 1 034 controls, and found that they were overrepresented in patients.12 Recently, Yang
et al. published a study of a three-generation Old Order Amish pedigree with segregating affective
disorder.13 They reported that a set of four CNVs on chromosomes 6q27, 9q21, 12p13, and 15q11
were enriched in affected family members, and that these altered the expression of neuronal genes.
Grozeva et al.14 screened a sample of 1 868 patients with BD and 2 938 controls for large (>100 kb)
and rare (found in <1% of the population) CNVs. No specific CNV was associated with BD, and
the authors found no increased genome-wide burden of CNVs in patients compared to controls. The
Wellcome Trust Case Control Consortium (WTCCC15) investigated common CNVs (found in >5%
of the population) of >0.5 kb in a sample of 2 007 patients with BD and 3 000 controls, and found
no association between CNVs and the disease.
In the present study, we screened the genome-wide SNP data of 882 patients with BD and 872
population-based controls for predicted common and rare CNVs using more than 540 000
autosomal and X-chromosomal markers. All study participants were of German descent. We tested
both the overall group of BD patients and a subgroup with an age-at-onset of ≤21 years since
several studies have suggested that early-onset BD patients may represent a clinically and
genetically more homogeneous subtype of BD. Clinical studies have demonstrated that early-onset
BD is a more severe form of the disorder that is characterized by frequent psychotic features, more
mixed episodes, greater psychiatric co-morbidity, and poorer response to prophylactic lithium
treatment.16-18 Familial aggregation is more pronounced in relatives of early-onset BD patients than
in relatives of later-onset BD patients.16-20 Finally, the findings of several studies have suggested the
existence of an intra-familial correlation for age-at-onset among bipolar siblings,21,22 and a
segregation analysis has shown that BD is transmitted differentially in early- and later-onset BD
To control for the number of technical artifacts, we developed a stringent protocol for quality
control (QC) and filtering. On the basis of these data, we conducted statistical tests for the genome-
wide burden of CNVs and for all specific common and rare CNVs that were found to be associated
Materials and Methods:
Unrelated patients with a clinical history of bipolar disorder (post QC: type I, n=767; type II,
n=102; not otherwise specified, n=13) were recruited at two centers: the Central Institute of Mental
Health, Mannheim, and the Department of Psychiatry and Psychotherapy of the University of Bonn.
The study was approved by the Institutional Review Boards, and all patients provided written
informed consent prior to inclusion. DSM-IV life-time diagnoses of BD were assigned using a
consensus best-estimate procedure that was based on all available information including the
findings of a structured SCID-I interview.23 The same set of instruments was used by both centers.24
Age-at-onset (AO) was defined as the age at which the first DSM-IV-criteria episode of either
depression or mania had occurred. Post QC, the mean age of patients at the time of recruitment was
44.03 years with a standard deviation (SD) of 13.41, and the mean age-at-onset was 27.90 years
(SD=11.28; median=24; mode=19). The AO distribution of the total sample deviated to the right of
the Gaussian distribution (Kolmogorov-Smirnov Z=5.10; P<0.001; positive skewness=1.10). The
male/female ratio was 0.47.
Determining the cut-off point for early age-at-onset
Despite extensive debate over the past decade, no consensus has yet been reached concerning the
cut-off point for the definition of early and late AO in BD. Authors who have applied the same
expectation-maximization algorithm to different samples have described divergent cut-offs for the
definition of the early AO. Some have reported that a three-AO-group distribution best fitted their
data25,26 and others a two-AO-group distribution.27 This demonstrates that the results of an
admixture analysis are sample-dependent.
We therefore performed a commingling analysis in the present sample before selecting the cut-off
point for the definition of early AO. We used the SEGREG-subroutine of the software S.A.G.E.
(version 6.1, http://darwin.cwru.edu/sage/).28 Commingling analysis reveals the distribution mixture
of a trait through segregation analysis while allowing for ascertainment correction. Class D models
are used as the regressive models in commingling analysis, which assume that the trait under
investigation in the study probands is not conditional upon the trait in antecedent family members.
The model that fitted the data best was selected on the basis of the smallest value of the Akaike
information criterion (AIC).
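The model comparison step can be illustrated with a minimal Python sketch. It is not the SEGREG commingling analysis used in the study: it only evaluates the log-likelihood of a Gaussian mixture with fixed parameters and compares candidate models by the Akaike criterion (AIC = 2k - 2 ln L). The ages and the mixture parameters below are hypothetical values chosen for illustration, and the function names are assumptions.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_loglik(data, weights, means, sds):
    """Log-likelihood of the data under a Gaussian mixture with fixed parameters."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))
        total += math.log(density)
    return total

def aic(loglik, n_params):
    """Akaike criterion: lower values indicate a better fit/complexity trade-off."""
    return 2 * n_params - 2 * loglik

# Hypothetical ages at onset, for illustration only (not study data).
ages = [15, 17, 18, 19, 19, 20, 22, 24, 28, 31, 35, 38, 42, 47, 55]

# Hypothetical parameter values; in practice they would be estimated from the data.
one_group = mixture_loglik(ages, [1.0], [28.0], [11.0])
two_group = mixture_loglik(ages, [0.5, 0.5], [19.0, 37.0], [3.0, 10.0])

# Free parameters: a mean and an SD per component, plus (components - 1) weights.
scores = {"one-group": aic(one_group, 2), "two-group": aic(two_group, 5)}
best = min(scores, key=scores.get)
print(scores, best)
```

The model with the smallest score would be retained; with real data the parameters of each candidate mixture would first be fitted, for example by expectation-maximization.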
Although the two-AO-group and the three-AO-group models had fitted our data equally well in a
preliminary analysis of a larger sample,29 the best model in the present sample was a two-AO-group
distribution. Prior to QC, the mean AO in the early onset group was 20.67 years (SD=8.40) and the
mean AO in the late onset group was 33.20 years (SD=10.30). To select patients with a clear early
AO, we only selected patients with an AO that was lower than or equal to the rounded mean of the early
onset group, i.e. age 21 (AO≤21y). In this AO≤21y subgroup, the mean AO was 17.44 years
Prior to QC, our overall sample consisted of 957 patients and 880 controls. Post QC, this was
reduced to 882 patients and 872 controls. A total of 291 patients remained in the AO≤21y subgroup
and for these patients the mean age-at-recruitment was 38.71 years (SD=12.88), the mean AO was
17.54 years (SD=2.52), and the male/female ratio was 0.43. In the later-onset subgroup with an
AO>21 years (AO>21y, n=591), the mean age-at-recruitment was 46.84 years (SD=12.87), the
mean AO was 33.20 years (SD=10.30), and the male/female ratio was 0.49.
Controls were drawn from two population-based epidemiological studies: (a) Population-based
Recruitment of Patients and Controls for the Analysis (PopGen, n=497)30 from Schleswig-Holstein
(Northern Germany); and (b) the Heinz Nixdorf Recall study (Risk Factors, Evaluation of Coronary
Calcification, and Lifestyle; HNR, n=383)31 from Essen, Bochum, and Mülheim a. d. Ruhr (Ruhr
area). Post QC, the mean age at recruitment of the control group was 47.98 (SD=11.42), and the
male/female ratio was 0.51. All patients and controls reported that they were of German descent.
DNA extraction, genotyping, and quality control
Venous blood samples were collected from all patients and controls. Lymphocyte DNA was
extracted either by salting out with saturated sodium chloride solution32 or by a Chemagic Magnetic
Separation Module I (Chemagen, Baesweiler, Germany) according to the manufacturer's instructions.
Individuals were genotyped using Illumina's HumanHap550v3 (HH550) or Human610-Quadv1
(H610Q) BeadArrays. The genotype data had been generated as part of a genome-wide association
study of BD (Cichon, Mühleisen et al., unpublished data). The H610Q chip contains approximately
60 000 more probes and SNPs than the HH550 array. The majority of the excess content represents
non-polymorphic CNV probes, which leads to an excess of CNV calls for individuals genotyped on
the H610Q chip (data not shown). To avoid such a bias, we only analyzed SNPs that are present on
both chips, i.e. a total of 541 524 SNPs (post QC).
Prior to computational CNV prediction, stringent QC criteria were applied to the genotype data at
both the marker and the individual level. SNPs with a call rate of <98% were excluded. Individuals
were excluded for the following reasons: (a) DNA call rate <97% (20 patients); (b) differences
between X-chromosomally inferred and phenotypic sex (six patients); (c) DNA sample doublets
identified by identity-by-state estimates (defined as IBS=2.0, two controls); (d) relatedness of
individuals (1.6≤IBS<2.0, no individual excluded); and (e) population outliers according to
multidimensional scaling with HapMap phase 2 (one patient and three controls were excluded prior to
the present CNV study).
CNVs were predicted using the program QuantiSNP (version 1.1,
http://www.well.ox.ac.uk/QuantiSNP).33 The algorithm implemented in QuantiSNP uses an
Objective Bayes Hidden-Markov Model to estimate the copy number. To evaluate the presence of a
CNV, QuantiSNP uses the normalized intensity data (i.e. log R ratio) and allele frequency data (i.e.
B allele frequency) of each SNP. Both values were calculated by Illumina's BeadStudio Genotyping
module (version 3.3.7, http://www.illumina.com/pages.ilmn?ID=169). Individuals were excluded if
the SD of their log R ratio exceeded 0.36 or the SD of their B allele frequency exceeded 0.12
(22 patients, six controls). We employed the normalization
procedure for local GC content implemented in QuantiSNP to improve the accuracy of detection.
The Log Bayes Factor (LBF) was computed for each CNV. This factor indicates the confidence of
each predicted CNV, with higher values indicating higher statistical reliability.
To minimize the number of false-positive CNV calls, we only considered CNVs that had an LBF ≥30,
spanned a minimum of 30 consecutive SNPs, and directly affected at least one RefSeq
gene.34 After applying these filters, we also excluded individuals with more than seven CNVs, since
they were extreme outliers in terms of the number of CNV events (27 patients). Following quality
control, 1 044 CNV calls from a total of 882 BD patients and 872 controls were statistically
analyzed. A total of 291 of the patients (33%) had an AO≤21 years, and 591 patients had an AO>21
years.
To confirm our QuantiSNP-based CNV results, we additionally screened our dataset with
PennCNV.35 Both PennCNV and QuantiSNP apply a Hidden-Markov Model to estimate the copy
number of an individual. They also take the log R ratio and B allele frequency of each SNP into
account, and correct for GC content. The CNV data of both algorithms are available upon request.
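The filtering criteria described above (LBF ≥30, at least 30 consecutive SNPs, at least one affected RefSeq gene, and exclusion of individuals with more than seven remaining CNVs) can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the `CnvCall` record, its field names, and the example calls are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class CnvCall:
    """Hypothetical record for one predicted CNV (field names are assumptions)."""
    sample_id: str
    log_bayes_factor: float
    n_snps: int
    n_refseq_genes: int

MIN_LBF = 30              # minimum Log Bayes Factor, as stated in the text
MIN_SNPS = 30             # minimum number of consecutive SNPs
MAX_CNVS_PER_SAMPLE = 7   # individuals with more CNVs are treated as outliers

def filter_calls(calls):
    """Apply call-level filters, then drop all calls from outlier individuals."""
    kept = [
        c for c in calls
        if c.log_bayes_factor >= MIN_LBF
        and c.n_snps >= MIN_SNPS
        and c.n_refseq_genes >= 1
    ]
    per_sample = Counter(c.sample_id for c in kept)
    outliers = {s for s, n in per_sample.items() if n > MAX_CNVS_PER_SAMPLE}
    return [c for c in kept if c.sample_id not in outliers], outliers

# Hypothetical example calls (not study data).
calls = [
    CnvCall("S1", 45.2, 52, 1),   # passes all filters
    CnvCall("S1", 12.0, 80, 2),   # fails the LBF threshold
    CnvCall("S2", 60.0, 12, 1),   # spans too few SNPs
    CnvCall("S3", 33.0, 30, 0),   # affects no RefSeq gene
] + [CnvCall("S4", 50.0, 40, 1) for _ in range(8)]  # eight calls: outlier sample

kept, outliers = filter_calls(calls)
print(len(kept), sorted(outliers))
```

Counting CNVs per individual after the call-level filters mirrors the order described in the text, where the outlier exclusion follows the LBF, SNP, and gene filters.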
Statistical analysis of CNV burden and specific CNVs
All association tests for genome-wide CNV burden and association of specific CNVs were
performed using PLINK (version 1.06, http://pngu.mgh.harvard.edu/purcell/plink/).36 We conducted
the burden tests for CNV frequency (PROP, RATE) as well as for CNV length (TOTKB, AVGKB),
i.e. the total number of CNVs in patients vs. controls (RATE); proportion of individuals with one or
more CNVs in patients vs. controls (PROP); total length spanned by CNVs per individual in the
patient group vs. the control group (TOTKB), and average size of CNVs per individual in the
patient group vs. the control group (AVGKB). All P-values were generated using 50 000
permutations.
We defined three major comparison groups for the genome-wide burden analyses:
1. All patients vs. all controls
2. AO≤21y patients vs. controls
3. AO>21y patients vs. controls
We tested each of these three comparison groups for six different categories of CNVs:
1. All CNVs
2. All microduplications
3. All microdeletions
4. All singleton CNVs
5. Singleton microduplications
6. Singleton microdeletions
We performed a total of four different burden tests (RATE, PROP, TOTKB, and AVGKB) in 18 test
groups, i.e. three major comparison groups for the six different categories of CNVs as described
above, resulting in a total of 72 tests for association between CNV burden and BD. To account for
all 72 tests, we also applied Bonferroni's method, although this procedure may be too conservative
given that the tests were not independent of each other.
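The logic of a permutation-based burden test can be sketched for the PROP statistic (proportion of individuals with at least one CNV). The sketch below is not PLINK's implementation. It assumes a one-sided test (patients greater than controls) and uses the common (hits + 1) / (permutations + 1) estimator of the empirical P-value. The toy data and function names are hypothetical.

```python
import random

def carrier_proportion_diff(has_cnv, is_case):
    """Proportion of CNV carriers among cases minus that among controls."""
    case_flags = [h for h, c in zip(has_cnv, is_case) if c]
    control_flags = [h for h, c in zip(has_cnv, is_case) if not c]
    return sum(case_flags) / len(case_flags) - sum(control_flags) / len(control_flags)

def permutation_p_value(has_cnv, is_case, n_perm=50_000, seed=0):
    """One-sided empirical P-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = carrier_proportion_diff(has_cnv, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # keeps the numbers of cases and controls fixed
        if carrier_proportion_diff(has_cnv, labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 10 cases, all carriers; 10 controls, no carriers.
has_cnv = [True] * 10 + [False] * 10
is_case = [True] * 10 + [False] * 10

# A small number of permutations keeps the example fast; the study used 50 000.
p = permutation_p_value(has_cnv, is_case, n_perm=2_000)
print(p)
```

The RATE, TOTKB, and AVGKB tests follow the same scheme with a different per-group statistic (number of CNVs, total length, or average length per individual).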
In addition, we monitored the distribution of CNVs in chromosomal regions 1q21, 2p16, 7q34-36,
15q11, 15q13, 16p11, 17p12, and 22q11 which have previously been reported to be associated with
a variety of neuropsychiatric disorders (Table 2). Since the borders of these regions are known, we
relaxed our filter criteria, i.e. we included all CNVs with LBF≥10 which were visually inspected by
two independent investigators regardless of the number of affected SNPs. Association tests for these
CNVs were performed using Fisher's exact test.
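For readers unfamiliar with Fisher's exact test, a minimal standard-library sketch for a 2x2 table (carriers and non-carriers in patients and controls) is given below. It sums the hypergeometric probabilities of all tables that are no more likely than the observed one, which is one common definition of the two-sided P-value. The function name and the example table are illustrative assumptions, not study data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: carriers and non-carriers among patients;
    c, d: carriers and non-carriers among controls.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k carriers among patients, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)

# Illustrative table: 3 of 4 patients and 1 of 4 controls carry the variant.
print(fisher_exact_two_sided(3, 1, 1, 3))  # 34/70, about 0.486
```

An exact test is a natural choice for rare CNVs because expected cell counts are too small for a chi-square approximation.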
Verification of specific CNVs
All CNVs identified by QuantiSNP and PennCNV that had been found to be associated with
neuropsychiatric disorders in previous studies, or which were located within chromosomal regions
associated with BD, were visually inspected using Illumina's GenomeStudio.
The specific CNVs that were found to be associated with BD in the present study (6q27 and 10q11)
were verified by quantitative real-time PCR (qPCR) using TaqMan Copy Number Assays (Applied
Biosystem, Foster City, CA, USA). We confirmed each CNV carrier by qPCR and also tested non-
CNV carriers (as defined by QuantiSNP and PennCNV) to detect possible CNV carriers who had
not been identified by QuantiSNP and PennCNV. The status of all CNV carriers was confirmed by
qPCR, and no CNV carriers were detected among the putative non-CNV carriers tested. Copy
numbers were calculated using the ΔΔCt method implemented in the CopyCaller Software (v1.0,
Analysis of pathways and biological processes
We analyzed whether genes affected by CNVs were enriched in certain pathways or biological
processes using the web-based Ingenuity Pathways Analysis platform (IPA, version
8.0, http://www.ingenuity.com). IPA is based on functional annotation and molecular interactions.
Gene lists were assembled using RefSeq genes that are affected by the CNVs identified in the
burden analysis (microduplications, singleton deletions). Lists were uploaded into IPA and
investigated using the "core analysis" function and default settings. In the functional analysis,
biological functions were grouped into different categories from the Ingenuity Knowledge Base. To
calculate the statistical significance of pathways and biological processes assigned to gene sets, P-
values of the Fisher's exact test were corrected by the Benjamini-Hochberg method.
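The Benjamini-Hochberg correction mentioned above can be illustrated with a short Python sketch. It shows the standard step-up adjustment and is not IPA's internal code. The input P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values for four functional categories.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

Categories whose adjusted P-value falls below the chosen false discovery rate (for example 0.05) would be reported as enriched.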
We systematically analyzed our samples for significant genome-wide differences in the distribution
of all CNVs between patients and controls (genome-wide burden tests) as well as for a significant
overrepresentation of specific CNVs in patients or controls.
General description of the CNV dataset
Following the baseline QC of SNP data, QuantiSNP identified a total of 124 146 putative CNVs in
the initial sample of 957 patients and 880 controls (an average of 67.6 CNVs per individual).
Following the application of all QC filters, 1 044 potential CNVs remained in the filtered sample of
882 BD patients and 872 controls (an average of 0.59 CNVs per individual).
We examined the distribution of the number of CNVs per individual and identified a total of 27
extreme outliers in terms of CNV observations, with more than seven CNVs being detected in each
individual. The samples of a total of 24 of these individuals were clustered in the outer rows of the
same 96-well plate, and thus these findings are likely to represent plate effects. These individuals
were excluded from the downstream analyses.
Association analysis for genome-wide CNV burden
Overall, 10 of the 72 genome-wide burden tests revealed nominally significant differences in CNV
burden between patients and controls. In the following, we provide a detailed description of the
most important findings, as outlined in Table 1.
All patients vs. controls. In the total sample of 882 BD patients and 872 controls, the genome-wide
burden of singleton microdeletions showed nominally significant association for two out of the four
tests performed. The average total length of all singleton microdeletions per individual was 487.6
kb in patients compared to 265.1 kb in controls (TOTKB: P=0.014). The average size per singleton
microdeletion was 472.6 kb in patients and 249.3 kb in controls (AVGKB: P=0.014). The PROP-
and RATE-tests generated no significant P-values.
Patients with an early AO vs. controls. When comparing the 291 AO≤21y patients with all controls,
we again found that singleton microdeletions in patients were, on average, larger (661.7 kb in
AO≤21y patients vs. 249.3 kb in controls, AVGKB: P=0.0056) and spanned longer chromosomal
regions per individual than in controls (679.6 kb in AO≤21y patients vs. 261.2 kb in controls,
TOTKB: P=0.0084). Furthermore, we observed that the total proportion of individuals with at least
one microduplication (44.3% in AO≤21y patients vs. 33.1% in controls, PROP: P=0.00040), at least
one CNV (52.9% in AO≤21y patients vs. 42.3% in controls, PROP: P=0.00092), or at least one
singleton microduplication (17.2% in AO≤21y patients vs. 11.9% in controls, PROP: P=0.017) was
significantly higher in this BD subgroup. The PROP test P-value for microduplications, which was
the most significant of all of the burden analyses, withstood correction for multiple testing with the
Bonferroni method which accounted for the number of all tests performed in this study (n=72,
Padjusted=0.029). However, there were no significant differences in the proportion of individuals who
carried either microdeletions or singleton microdeletions.
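The Bonferroni-adjusted value reported above follows directly from the number of tests. The short Python check below uses only figures stated in the text; the variable names are illustrative.

```python
# Four burden tests x three comparison groups x six CNV categories.
n_tests = 4 * 3 * 6

p_raw = 0.00040  # PROP test for microduplications, AO<=21y patients vs. controls

# Bonferroni: multiply the raw P-value by the number of tests, capped at 1.
p_adjusted = min(1.0, p_raw * n_tests)

print(n_tests, round(p_adjusted, 3))  # 72 0.029
```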
Patients with AO>21y vs. controls. In the third test group, we analyzed the burden of CNVs in the
AO>21y subgroup (n=591) vs. all controls. These tests revealed no significant differences in CNV
burden between patients and controls.
Association analyses of specific CNVs
We identified two common microduplications that were significantly overrepresented in patients
compared to controls: (a) a 248 kb microduplication on chromosome 6q27, and (b) a 160 kb
microduplication on chromosome 10q11 (Figure 1).
The 10q11 microduplication (Figure 1a) was observed in 53 patients (6.01%) and in 32 controls
(3.67%, P=0.035, OR [95%-CI] =1.53 [1.05-2.72]). Of these 53 patients, 28 belong to the AO≤21y
subgroup (9.62% in AO≤21y patients vs. 3.67% in controls, P=0.00052, OR [95%-CI] =2.79 [1.59-
4.89]). Following genome-wide correction using permutation, this P-value remained significant
(n=50 000; Padjusted=0.032). In view of the genetic marker resolution provided by our approach, all
observed microduplications at the 10q11 locus appear to have the same breakpoints (length
approximated 160 kb, chr10:47.01-47.17 Mb, NCBI build 36), and carry 30 consecutive SNPs. This
CNV covers the complete anthrax toxin receptor-like gene (ANTXRL).
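The odds ratio for the AO≤21y subgroup can be recomputed from the carrier counts given above (28 of 291 patients, 32 of 872 controls). The Python sketch below also derives a 95% confidence interval with the Woolf (log odds ratio) method. That choice is an assumption, since the text does not state how the interval was obtained, so the bounds may differ slightly from the reported ones.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for the table [[a, b], [c, d]]."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Carrier counts taken from the text.
carriers_patients, n_patients = 28, 291
carriers_controls, n_controls = 32, 872

or_value, ci_low, ci_high = odds_ratio_with_ci(
    carriers_patients, n_patients - carriers_patients,
    carriers_controls, n_controls - carriers_controls,
)
print(round(or_value, 2))  # 2.79
print(ci_low, ci_high)
```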
The microduplication on chromosome 6q27 (Figure 1b) was detected in 17 patients from the
AO≤21y subgroup (5.84%) and in 22 controls (2.52%, P=0.0039, OR [95%-CI] =2.40 [1.18-4.80]).
There were no significant differences in distribution when all BD patients were compared to
controls. In CNV carriers, slight differences in CNV size were observed, with a shared overlap of
around 248 kb (chr6:168.09-168.33 Mb, NCBI build 36) that was covered by 110 adjacent SNPs. Three
genes are affected by this microduplication: (a) kinesin family member 25 (KIF25), (b) FERM
domain containing 1 (FRMD1), and (c) parts of the 3' terminus of mixed-lineage leukemia
translocated to 4 (MLLT4). The gene dapper, antagonist of beta-catenin, homolog 2 (DACT2) lies
around 115 kb downstream from this common microduplication. DACT2 participates in the WNT
signaling pathway37,38 which is known to regulate neurogenesis and neuroprotection.
In a follow-up analysis, we tested whether patients carrying either the common CNV on 6q27 or
the common CNV on 10q11 showed differences in sex distribution or family history of psychiatric disorder
compared to patients who were non-carriers. No significant associations were observed for either
phenotypic item (data not shown).
Specific CNVs at loci previously associated with psychiatric disorder
We tested whether CNVs in one of eight genomic regions that have previously been reported to be
associated with neuropsychiatric disorders (1q21, 2p16, 7q34-36, 15q11, 15q13, 16p11, 17p12 and
22q11) were overrepresented in our BD patients. CNVs in these regions were not significantly
overrepresented (Table 2). However, the power of our sample to find significant association with
these rare CNVs was low.
Pathways and biological processes impacted by CNVs in early-onset patients
To further characterize the two top association findings of our burden analyses in AO≤21y patients
(Table 1), we used IPA to explore possible functional relationships between genes covered by either
microduplications or singleton microdeletions. In the analysis of the 46 genes hit by singleton
microdeletions, the five most significant biological processes (i.e. "drug metabolism",
"lipid metabolism", "molecular transport", "small molecule biochemistry" and "endocrine system
development and function") were enriched by the presence of the genes SLCO1B1 and SLCO1B3,
which are both located on 12p12. These were covered by a singleton microdeletion in one AO≤21y
patient. These categories were therefore omitted from data interpretation. Further top process
categories that were significantly overrepresented due to several input genes being hit by singleton
microdeletions included "endocrine system disorder", "genetic disorder", "metabolic disease",
"immunological disease," and "infectious disease" (Table 3). The same functional categories were
found to be enriched when analyzing the 287 genes affected by microduplications in AO≤21y
patients (Table 4) although there was no overlap in affected genes between the two gene lists,
except for MAD1L1.
No significant result was obtained for canonical pathways in IPA after correction for multiple
testing (data not shown). This was probably due to the limited number of genes introduced into the
In the present study, we investigated a large sample of patients and controls of German descent for
the presence of CNVs that may be involved in the development of BD. Our analyses included tests
to compare the genome-wide burden of CNVs (frequency and size) between patients and controls as
well as tests to detect specific common and rare CNVs that were significantly overrepresented in
either patients or controls. Since clinical and formal genetic studies have suggested that BD with an
AO≤21y may be genetically distinct from BD with an AO >21 years,16-22 we also analyzed these
subgroups separately. The basis for the statistical analyses was provided by high-quality CNV
prediction (QuantiSNP) using the SNP intensity data of 1 754 individuals and verification
(PennCNV, visual inspection, and TaqMan for specific CNVs). Parameters were specified to ensure
that only relatively large CNVs (detected by ≥30 consecutive SNPs) with high statistical confidence
(QuantiSNP: log Bayes factor ≥30; PennCNV: confidence value ≥30) passed QC. Using such
stringent criteria is always a trade-off between sensitivity and quality of data. We have chosen our
criteria to reduce type I error and are aware that this may lead to an increase of type II error at the
Our genome-wide burden tests (RATE and PROP) provided no evidence that the overall number of
singleton CNVs (both microdeletions and microduplications) overlapping with RefSeq genes was
enriched in the total sample of BD patients compared with controls. The TOTKB and AVGKB tests
demonstrated that the length of singleton microdeletions differed significantly between patients and
controls (Table 1). Singleton microdeletions in patients were approximately twice the size (490 kb)
of those in controls (260 kb; AVGKB P=0.014). Separate analyses of the AO≤21y subgroup and the
AO>21y subgroup demonstrated that this effect was mainly attributable to the AO≤21y subgroup, in
which the average size of microdeletions was approximately 680 kb (AVGKB P=0.0056), and to a
lesser extent to the AO>21y subgroup, in which the average size of singleton microdeletions (348
kb) was not significantly larger than in controls (AVGKB P=0.17; data not shown). The most
significant finding of our burden analyses was that the total number of patients with at least one
microduplication (both common as well as singleton microduplications) was significantly higher in
the AO≤21y group, but not in the AO>21y subgroup (PROP, P=0.0004). The burden results suggest
that both a higher CNV load and a larger CNV size are associated with BD in patients with an
AO≤21y. The effect in the early-onset group was attributable to longer singleton deletions as well
as to a higher frequency of microduplications (common and rare). Analysis of either the overall
sample or the AO>21y sample alone produced only marginal evidence that the genome-wide burden
of CNVs plays a role in disease development.
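
To make the four burden measures concrete, the following sketch computes RATE, PROP, TOTKB and AVGKB for one group and derives a one-sided permutation P-value by shuffling case/control labels. It is an illustration only: the per-person definitions of TOTKB and AVGKB (averaged over carriers) are simplifying assumptions, the function names are hypothetical, and the CNV lengths are toy values, not study data.

```python
import random
from statistics import mean


def burden_stats(per_person):
    """Summarise one group; per_person holds one list of CNV lengths (kb) per individual."""
    n = len(per_person)
    carriers = [p for p in per_person if p]
    n_cnvs = sum(len(p) for p in per_person)
    return {
        "RATE": n_cnvs / n,          # CNVs per individual
        "PROP": len(carriers) / n,   # proportion of individuals with >=1 CNV
        # ASSUMPTION: length measures are averaged over carriers only
        "TOTKB": mean(sum(p) for p in carriers) if carriers else 0.0,
        "AVGKB": mean(mean(p) for p in carriers) if carriers else 0.0,
    }


def permutation_p(cases, controls, stat, n_perm=2000, seed=0):
    """One-sided P-value for 'cases > controls' obtained by permuting group labels."""
    rng = random.Random(seed)
    observed = burden_stats(cases)[stat] - burden_stats(controls)[stat]
    pooled = cases + controls
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_cases = pooled[:len(cases)]
        perm_controls = pooled[len(cases):]
        diff = burden_stats(perm_cases)[stat] - burden_stats(perm_controls)[stat]
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): one inner list of CNV lengths in kb per individual
toy_cases = [[120.0, 300.0], [], [680.0], [150.0], []]
toy_controls = [[100.0], [], [], [260.0], [], []]

case_stats = burden_stats(toy_cases)
p_prop = permutation_p(toy_cases, toy_controls, "PROP")
print(case_stats, p_prop)
```

With only a handful of toy individuals the resulting P-value carries no meaning. The sketch is meant to show how frequency-based measures (RATE, PROP) and size-based measures (TOTKB, AVGKB) can diverge, as they do for microdeletions in the results above.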
These results show a similar trend to those of Zhang et al.12 who reported a significant
overrepresentation of singleton deletions of more than 100 kb in their BD cases (16.2%) compared
to controls (12.3%; P=0.007). Interestingly, they also found that this effect was more pronounced in
an early AO form of BD (age of mania onset ≤18 years). This is consistent with our observation of
the presence of longer singleton deletions in early-onset BD patients. However, Zhang et al.12 found
no evidence of a higher frequency of microduplications in their early-onset mania subsample.
Unfortunately, a more exact comparison of our data with that of Zhang et al.12 is limited by the
methodological differences between the two studies: Zhang et al.12 used Affymetrix 6.0 arrays
whereas we used Illumina HH550/H610Q arrays; Zhang et al.12 performed CNV detection with
Birdsuite whereas we used QuantiSNP and PennCNV; Zhang et al.12 took all CNVs above a certain
size threshold into account whereas we filtered for CNVs that overlapped with at least 30
consecutive SNPs and RefSeq genes.
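
The filtering criteria of the present study (at least 30 consecutive SNPs, a QuantiSNP log Bayes factor of at least 30, and overlap with at least one RefSeq gene) can be sketched as a simple predicate. This is an illustrative sketch under stated assumptions: the `CnvCall` record, the function names, the gene interval `GENE_A`, and the log Bayes factor values are all hypothetical. Only the 10q11 coordinates and SNP count are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class CnvCall:
    sample_id: str
    chrom: str
    start: int               # bp
    end: int                 # bp
    n_snps: int              # consecutive SNPs supporting the call
    log_bayes_factor: float


def overlaps_gene(call, genes):
    """True if the call overlaps any (chrom, start, end, name) gene interval."""
    return any(
        call.chrom == g_chrom and call.start <= g_end and g_start <= call.end
        for g_chrom, g_start, g_end, _name in genes
    )


def passes_qc(call, genes, min_snps=30, min_log_bf=30.0):
    """Apply the size, confidence, and gene-overlap filters described in the text."""
    return (
        call.n_snps >= min_snps
        and call.log_bayes_factor >= min_log_bf
        and overlaps_gene(call, genes)
    )


# HYPOTHETICAL gene interval; real RefSeq coordinates would be loaded from an annotation file
genes = [("chr10", 47_050_000, 47_120_000, "GENE_A")]

# Coordinates and SNP count follow the 10q11 CNV in the text; the log Bayes factor is made up
call_10q11 = CnvCall("S1", "chr10", 47_010_000, 47_170_000, 30, 45.0)
small_call = CnvCall("S2", "chr10", 47_060_000, 47_070_000, 8, 60.0)

kept = [c.sample_id for c in (call_10q11, small_call) if passes_qc(c, genes)]
print(kept)
```

Raising `min_snps` or `min_log_bf` trades sensitivity for specificity, which is the type I versus type II error trade-off noted in the discussion of the QC criteria.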
Another recent study by Yang et al.13 found no evidence for any significant association between the
average number and size of CNVs and affective disorders (including BD and major depression).
However, their study investigated a single three-generation Old Order Amish family, and their main
focus was upon the identification of CNVs that co-segregated with disease status across
generations. The limited number of affected (n=19) and unaffected family members (n=32) and
their genetic relatedness, as well as the broader phenotype definition used may have resulted in
limited power to investigate the genome-wide burden aspects of CNVs in BD.
Two recent genome-wide studies of CNVs in BD have been published (Grozeva et al.14,
WTCCC15), which investigated common and rare CNVs. They tested for association between BD
and specific CNVs and CNVs at previously reported loci, as well as for genome-wide burden.
Neither of these two studies found evidence for the involvement of CNVs in disease development.
The majority of patients and controls investigated by Grozeva et al.14 were also analyzed by the
WTCCC15, and thus the results of the two studies are not entirely independent of each other. Major
methodological differences exist between these two studies and our study, which hamper any direct
comparison of the results. Firstly, different arrays were used to screen for CNVs (Grozeva et al.14:
Affymetrix GeneChip Human Mapping 500K Array Set; WTCCC15: Agilent Comparative Genomic
Hybridization arrays). Secondly, Grozeva et al.14 searched for rare CNVs (MAF <1%) only,
whereas the WTCCC15 investigated common CNVs (MAF >5%) only. The present study did not
have any restrictions with regard to CNV frequency, and the frequencies of the two specific BD-
associated CNVs on 10q11 (3.67% in controls) and 6q27 (2.52% in controls) are within a frequency
range that was not investigated by Grozeva et al.14 or the WTCCC.15 Another important difference
is the separate analyses of AO subgroups in the present study, which suggest that CNVs play a role
in early onset, but not in later onset BD. Since the aforementioned studies did not take AO into
account, it is unclear whether such an effect was present in their BD samples. Nonetheless, all
studies, including the present study, are in agreement that CNVs do not appear to influence BD
when AO is not taken into account.
In a subsequent step of our genome-wide burden analysis, we performed an IPA to investigate
whether the genes affected by microduplications (n=287) or singleton microdeletions (n=46) in
AO≤21y patients were significantly enriched for biological processes or pathways. When removing
single gene-based enriched categories, the top five significantly overrepresented biological function
categories were the same for both the microduplications and the microdeletions gene lists, i.e. the
disorder and disease processes "endocrine system disorder", "genetic disorder", "metabolic disease",
"immunological disease," and "infectious disease". Follow-up studies in independent samples are
clearly necessary to confirm that biological functions within these disease categories are involved in
the pathophysiology of (early-onset) BD. One interesting gene with prior suggestive evidence for
association with BD, mitotic arrest deficient-like 1 (MAD1L1, 7p22.3), was affected by a singleton
microdeletion as well as a singleton microduplication in early-onset BD patients. Support for the
involvement of this locus was provided by a recent genome-wide association study (GWAS) of BD,
in which two MAD1L1 SNPs (rs11764590 and rs10278591; r²=0.7) were the second and third best
results in the meta-analysis step (P=1.28×10⁻⁷ and P=1.81×10⁻⁵, respectively, unpublished data).
MAD1L1 is a mitotic spindle assembly checkpoint protein. Homozygous knockout of MAD1L1 in
mice confers embryonic lethality, indicating that MAD1L1 plays an essential role during embryonic
Two genes affected by microduplications in early AO patients, epidermal growth factor receptor
(EGFR, 7p11.2, one patient) and nucleoredoxin (NXN, 17p13.3, one patient, two controls), were
identified as susceptibility genes for BD by two recent GWAS.40,41 EGFR rs17172438 (P=3.26×10⁻⁵,
OR=1.32) and EGFR rs729969 (P=3.30×10⁻⁵, OR=1.36) were ranked among the top 20 findings
in the GWAS by Sklar et al.40 EGFR and its ligands are cell signaling molecules that mediate
diverse downstream cellular functions, including cell proliferation and differentiation. In a further
GWAS, Baum et al.41 found that NXN rs2360111 (P=0.0003, OR=1.23) was associated with BD.
The NXN gene encodes the protein nucleoredoxin which is involved in the inhibition of the WNT
Another aim of the present study was to identify specific BD-associated CNVs. We found that the
frequency of two relatively common CNVs on 6q27 and 10q11 differed significantly between
patients and controls. Common microduplications located on chromosome 10q11 showed
association in the overall sample (P=0.035, OR=1.53). Again, this effect was much stronger in the
AO≤21y subgroup (P=0.00052, OR=2.79). The other microduplication on chromosome 6q27 was
overrepresented in the AO≤21y subgroup (P=0.0039, OR=2.40), but not in the overall sample.
Interestingly, the 6q27 CNV region was implicated in BD in a three-generation Old Order Amish
pedigree.13 The 6q27 region was one of four regions that were enriched in affected pedigree
members and which had an effect on the expression of genes within or near the rearrangement. The
identification of the 6q27 region in two independent studies is clearly of interest, although further
independent studies are required to support this finding. One limitation which prevents stronger
conclusions being drawn from the Yang et al. study is that there was only modest co-segregation of
the 6q27 CNV with BD. Thus the possibility that this relatively common variant segregated
within the family independently of disease cannot be excluded.
In summary, the present study found evidence for a significant association between BD and
microdeletions and microduplications. Although the frequency of microdeletions was not
significantly higher in patients compared to controls, the size of singleton microdeletions was
significantly larger. Our data also suggest that a higher frequency of microduplications is implicated
in disease development. We found both the genome-wide burden of microduplications and
common specific microduplications on 6q27 and 10q11 to be enriched in BD patients. A further
important finding of our study is that CNVs were strongly associated with BD in patients with an AO ≤21
years, but not in patients with an AO >21 years. In the overall BD sample, only a very weak
association with CNVs was detected. Our results support findings from previous clinical and
formal genetic studies that early and later onset BD may represent genetically distinct forms of the
disease. Future studies of CNVs in BD should therefore take the AO of their patient samples into