Accurately Assessing the Risk of Schizophrenia
Conferred by Rare Copy-Number Variation Affecting
Genes with Brain Function
Soumya Raychaudhuri1,2,3, Joshua M. Korn1,2,4, Steven A. McCarroll1,5, The International Schizophrenia
Consortium", David Altshuler1,2,5,6, Pamela Sklar7,8,9, Shaun Purcell8,9*, Mark J. Daly1,2*
1Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America, 2Center for Human Genetic Research, Massachusetts
General Hospital, Boston, Massachusetts, United States of America, 3Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital, Boston,
Massachusetts, United States of America, 4Graduate Program in Biophysics, Harvard University, Cambridge, Massachusetts, United States of America, 5Department of
Genetics, Harvard Medical School, Boston, Massachusetts, United States of America, 6Department of Molecular Biology, Massachusetts General Hospital, Boston,
Massachusetts, United States of America, 7Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts, United States of America, 8Psychiatric and
Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America, 9Stanley Center for Psychiatric Research, Broad
Institute, Cambridge, Massachusetts, United States of America
Investigators have linked rare copy number variants (CNVs) to neuropsychiatric diseases, such as schizophrenia. One
hypothesis is that CNV events cause disease by affecting genes with specific brain functions. Under these circumstances, we
expect that CNV events in cases should impact brain-function genes more frequently than those events in controls. Previous
publications have applied “pathway” analyses to genes within neuropsychiatric case CNVs to show enrichment for brain-
functions. While such analyses have been suggestive, they often have not rigorously compared the rates of CNVs impacting
genes with brain function in cases to controls, and therefore do not address important confounders such as the large size of
brain genes and overall differences in rates and sizes of CNVs. To demonstrate the potential impact of confounders, we
genotyped rare CNV events in 2,415 unaffected controls with Affymetrix 6.0; we then applied standard pathway analyses
using four sets of brain-function genes and observed an apparently highly significant enrichment for each set. The
enrichment is simply driven by the large size of brain-function genes. Instead, we propose a case-control statistical test, cnv-
enrichment-test, to compare the rate of CNVs impacting specific gene sets in cases versus controls. With simulations, we
demonstrate that cnv-enrichment-test is robust to case-control differences in CNV size, CNV rate, and systematic differences
in gene size. Finally, we apply cnv-enrichment-test to rare CNV events published by the International Schizophrenia
Consortium (ISC). This approach reveals nominal evidence of case-association in the neuronal-activity and learning gene sets,
but not the other two examined gene sets. The neuronal-activity genes have been associated in a separate set of
schizophrenia cases and controls; however, testing in independent samples is necessary to definitively confirm this
association. Our method is implemented in the PLINK software package.
Citation: Raychaudhuri S, Korn JM, McCarroll SA, The International Schizophrenia Consortium, Altshuler D, et al. (2010) Accurately Assessing the Risk of
Schizophrenia Conferred by Rare Copy-Number Variation Affecting Genes with Brain Function. PLoS Genet 6(9): e1001097. doi:10.1371/journal.pgen.1001097
Editor: David B. Allison, University of Alabama at Birmingham, United States of America
Received March 11, 2010; Accepted July 27, 2010; Published September 9, 2010
Copyright: © 2010 Raychaudhuri et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: For this work, SR is supported by a K08 National Institutes of Health (NIH) career development award (AR055688). MJD is supported by NIH grants
through the U01 (HG004171, DK62432) and R01 (DK083756-1, DK64869) mechanisms. The MIGen study was funded by the NIH-NHLBI (R01HL087676)
and a grant from the National Center for Research Resources. The funders had no role in study design, data collection and analysis, decision to publish, or
preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: email@example.com (MJD); firstname.lastname@example.org (SP)
" Please see Acknowledgements for consortium authorship.
Multiple recent studies have demonstrated a convincing and
statistically significant excess of rare CNVs in individuals affected
by schizophrenia compared to unaffected individuals [1–4].
Similar observations have now been made separately in autism
[5–7] and bipolar disorder . However, it is typically not readily
evident which individual CNV events are pathogenic since (1)
many rare events are seen in the general population and the excess
in cases is relatively modest and (2) individual events are too rare
to demonstrate definitive association in realistically sized patient
collections. Hence, it is challenging to translate these rare CNV
events into a clear understanding of disease pathology. To identify
candidate genes for follow-up, investigators have employed
statistical tests of gene set enrichment, originally developed as an
effective approach to interpret gene expression data .
Practically, these analyses identify functional gene sets or
‘pathways’ that are over-represented among those genes affected
by case CNVs compared to unaffected genes [1,8,10,11], often
relying on online resources such as Panther , Ingenuity
Pathway Analysis (Ingenuity Systems, www.ingenuity.com), and
Gene Ontology .
For example, gene set enrichment analyses by Walsh et al.
suggested that rare CNVs in schizophrenia preferentially disrupt
PLoS Genetics | www.plosgenetics.org 1 September 2010 | Volume 6 | Issue 9 | e1001097
those genes with neuro-developmental functions ; Zhang et al.
similarly reported that rare CNVs in bipolar disorder preferen-
tially overlap genes involved in behavior and learning . More
recently Glessner et al. reported that genes affected by rare and
common CNVs in autism are also involved in brain function .
While these initial results are highly promising, the gene set
enrichment statistical framework as applied to copy number
variation is critically limited and potentially confounded.
The key analytical question in this setting is whether an event
that impacts a set of genes or a pathway increases disease risk
compared to events that do not impact that pathway. Under the
hypothesis that events affecting a specific brain-function
pathway are pathogenic, the rate of those events affecting the
brain-function pathway should indeed be greater in cases than
in controls – ideally fully explaining the observed genome-wide
differences in case and control event rates. An alternative
possibility is that the increased rate and size of CNVs in cases
represents a mutational syndrome or genomic instability, and
that they are not in themselves pathogenic. Under that
possibility, case events should not preferentially impact any
particular gene set; however, differences in size and rate might
The gene set enrichment approach commonly used to address this
question falls short on two counts: (1) it does not rigorously
compare case event rates to control event rates, and (2) since it
examines affected genes rather than events, it does not accurately
account for the fact that multiple genes might be contributed by a
single event or that a single gene may be affected by multiple
events. Here we propose a straightforward statistical test that
explicitly compares the rate of CNVs impacting a specific gene set
in cases to the rate in controls, while carefully accounting for
background differences in CNV rate and size.
A possible consequence of not rigorously comparing event rates
in cases to controls is that sets consisting of genes that are more
frequently affected by CNVs might spuriously appear to be highly
enriched in cases, but also will be highly enriched in controls.
Examples of such genes include large genes spanning a massive
portion of the genome or those whose functions are highly
redundant or non-essential. This is a particularly important issue
for neuropsychiatric disease considering the reportedly large size of
genes with brain function. Multiple studies of CNVs in the general
population have reported enrichment for neuro-physiological genes
[14,15] – suggesting that brain-function genes may be susceptible to
CNV events in general, possibly due to their large size or other
factors. In fact, published events in neuropsychiatric disease studies
often implicate large genes (see Table S1). Others have already
noted that gene size itself can bias pathway enrichment analyses in
other contexts, such as annotating non-coding elements for function
[16,17]. In particular, Taher et al noted that randomly selected
in “development”, “cell-adhesion”, and “nervous system
development” . Some of the published disease studies attempted to
address this issue indirectly by applying similar analyses to control
events and note the lack of statistical evidence of enrichment for
brain function gene sets [1,8]; however, control events are typically
fewer and smaller, implicating many fewer genes, and therefore
simply comparing the statistical significance of gene set enrichment
results in cases and controls is not adequate.
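The gene-size confound described above can be illustrated with a small simulation. The sketch below is a toy model, not the analysis performed in this study: it assumes a single 250 Mb chromosome with 2,000 evenly spaced genes, every fifth one a 50 kb "brain gene" and the rest 10 kb, and it drops 1,000 CNVs of a fixed 20 kb length uniformly at random with no case-control distinction at all. The function names, the CNV count, and the CNV length are illustrative assumptions. A one-sided Fisher's exact test on the resulting gene-level table still tends to report strong "enrichment" of brain genes, purely because they are larger targets.

```python
import math
import random

def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    tail = sum(math.comb(col1, k) * math.comb(total - col1, row1 - k)
               for k in range(a, min(row1, col1) + 1))
    return tail / math.comb(total, row1)

def simulate_random_cnvs(seed=1, n_genes=2000, chrom_len=250_000_000,
                         n_cnvs=1000, cnv_len=20_000):
    """Drop CNVs uniformly at random and tabulate which genes they overlap."""
    rng = random.Random(seed)
    spacing = chrom_len // n_genes          # one gene per 125 kb slot
    genes = []                              # (start, end, is_brain)
    for i in range(n_genes):
        is_brain = i % 5 == 0               # every fifth gene is a "brain gene"
        size = 50_000 if is_brain else 10_000
        genes.append((i * spacing, i * spacing + size, is_brain))
    hit = [False] * n_genes
    for _ in range(n_cnvs):
        start = rng.randrange(chrom_len - cnv_len)
        end = start + cnv_len
        # Only genes in the slots touched by the CNV can overlap it.
        for i in range(start // spacing, min(n_genes - 1, end // spacing) + 1):
            g_start, g_end, _ = genes[i]
            if g_start < end and start < g_end:
                hit[i] = True
    a = sum(h for h, g in zip(hit, genes) if g[2])          # brain, hit
    b = sum(not h for h, g in zip(hit, genes) if g[2])      # brain, not hit
    c = sum(h for h, g in zip(hit, genes) if not g[2])      # other, hit
    d = sum(not h for h, g in zip(hit, genes) if not g[2])  # other, not hit
    return a, b, c, d

if __name__ == "__main__":
    a, b, c, d = simulate_random_cnvs()
    print("brain genes hit:", a, "of", a + b)
    print("other genes hit:", c, "of", c + d)
    print("one-sided Fisher p:", fisher_one_sided(a, b, c, d))
```

Because no disease model enters this simulation, any small p-value it produces reflects target size alone, which is the reason a gene-level enrichment result in cases cannot be interpreted without an explicit comparison to controls.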
One possible consequence of examining genes rather than the
events they occur in is that individual large events contribute many
genes and may skew the analysis much more than smaller events,
and cause spurious findings. This is of particular concern since
genes with common function can often cluster together on the
genome and a single event in one individual affecting a cluster of
related genes can naively appear to implicate an entire pathway
[18,19]. One interesting example is the reported enrichment of
psychological disorder genes in the Zhang et al data set (see Table S1);
11 of the 16 deleted psychological disorder genes are in the 22q11.21
region observed in two individuals in the data set . These genes
are possibly annotated as psychological disorder genes since rare
deletions in 22q11.21 have long been observed among schizo-
phrenia cases . Removal of the two individuals with 22q11.21
events eliminates any enrichment for the psychological disorder gene
set – suggesting that there is little evidence that this particular set is
necessarily relevant to disease outside of the 22q11.21 region. Of
course, at least one gene in this region is pathogenic, but it is
unlikely that >10 genes in this region are, or that in aggregate
they define a key pathogenic set.
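The difference between counting genes and counting events can be made concrete with a toy data set. The sketch below uses hypothetical gene and individual identifiers (G1 to G16, P1 to P7) arranged to mirror the pattern described above: 16 affected genes in a set, 11 of which sit in one region deleted in two individuals. The data layout and function names are illustrative assumptions, not the format used in the study.

```python
def genes_hit(events, gene_set):
    """Distinct gene-set genes overlapped by at least one event."""
    return {g for ev in events for g in ev["genes"] if g in gene_set}

def events_hitting(events, gene_set):
    """Events that overlap at least one gene-set gene."""
    return [ev for ev in events if gene_set & ev["genes"]]

# Hypothetical data: 16 genes in the set, 11 of them clustered in one region.
gene_set = {f"G{i}" for i in range(1, 17)}
cluster = {f"G{i}" for i in range(1, 12)}

events = [
    {"person": "P1", "genes": set(cluster)},   # large deletion over the cluster
    {"person": "P2", "genes": set(cluster)},   # same region, second individual
] + [
    # Five small events, each hitting one gene outside the cluster (G12..G16).
    {"person": f"P{i}", "genes": {f"G{i + 9}"}} for i in range(3, 8)
]

without_cluster = [ev for ev in events if ev["person"] not in {"P1", "P2"}]

if __name__ == "__main__":
    print("genes hit, all events:", len(genes_hit(events, gene_set)))
    print("events hitting the set:", len(events_hitting(events, gene_set)))
    print("genes hit without P1/P2:", len(genes_hit(without_cluster, gene_set)))
```

A gene-level analysis sees 16 affected genes, while an event-level analysis sees only seven events, two of which account for most of the gene count.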
A second possible consequence of examining genes and not the
events they occur in is that genes affected in both cases and
controls, but at different rates, are not properly accounted for. For
instance, a critical gene affected by many pathogenic events
contributes equally to a gene set enrichment analysis as a gene
sporadically affected by a single event. One interesting example is
the NRXN1 gene, a large gene that plays an important role in
synaptic development . Since CNV events affecting NRXN1
have been observed in both schizophrenic cases and controls, they
would contribute equally to a pathway analysis of case events as
they would to one of control events. However, the rate of
functional events observed in cases is significantly higher than in
controls; pathway-based approaches could be bolstered if methods
explicitly took into account these differences between case and
control event rates for genes of interest.
Here, we describe a case-control statistical test, cnv-enrichment-test,
to explicitly compare the rate of CNVs impacting specific gene
sets in cases to controls. We show how cnv-enrichment-test is robust to
even extreme biases in gene size and case-control differences in
CNV rate and size. We also demonstrate how standard gene set
enrichment approaches are often confounded under realistic
scenarios by gene size and other gene structural features; we
demonstrate these confounders in a set of 2,415 controls
genotyped for rare single-event deletions. We finally apply the
cnv-enrichment-test to examine genes with brain function within a
large dataset of CNVs identified in schizophrenia cases and
controls published by the International Schizophrenia Consortium
(ISC)  and demonstrate nominal evidence of association for
previously described gene sets.
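One way a case-control test of this kind could be framed is as a logistic regression of case status on the number of gene-set genes hit by each individual's CNVs, with the individual's CNV count as a covariate. The sketch below is an assumption-laden illustration of that idea and should not be read as the model implemented in PLINK: the covariate choice, the simulated rates, and all function names are hypothetical, and a covariate for average CNV size could be added in the same way. Data are simulated under the null (gene-set hits depend only on how many CNVs a person carries), with cases carrying more CNVs than controls.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit; returns coefficients and asymptotic standard errors."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    # Standard errors from the diagonal of the inverse information matrix.
    se = [math.sqrt(solve(hess, [1.0 if k == j else 0.0 for k in range(p)])[j])
          for j in range(p)]
    return beta, se

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def simulate(seed=7, n_per_group=1000):
    """Null data: cases carry more CNVs, but each CNV hits the set equally often."""
    rng = random.Random(seed)
    X, y = [], []
    for status, p_cnv in ((0, 0.25), (1, 0.5)):   # assumed per-slot CNV rates
        for _ in range(n_per_group):
            n_cnv = sum(rng.random() < p_cnv for _ in range(4))
            hits = sum(rng.random() < 0.2 for _ in range(n_cnv))
            X.append([1.0, float(hits), float(n_cnv)])  # intercept, hits, count
            y.append(status)
    return X, y

if __name__ == "__main__":
    X, y = simulate()
    naive_beta, naive_se = fit_logistic([[r[0], r[1]] for r in X], y)
    beta, se = fit_logistic(X, y)
    print("unadjusted z for gene-set hits:", naive_beta[1] / naive_se[1])
    print("adjusted z for gene-set hits:  ", beta[1] / se[1])
    print("adjusted two-sided p:          ", two_sided_p(beta[1] / se[1]))
```

In this setup the unadjusted coefficient for gene-set hits is large only because cases carry more CNVs; once CNV count enters the model, the coefficient for hits should be close to zero, which is the behavior one would want from a test that is robust to case-control differences in CNV rate.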
Specific rare deletion and duplication events in the genome
have now been shown to be associated with neuropsychi-
atric diseases such as 16p11.2 to autism and 22q11.21 to
schizophrenia. However, controversy remains as to whether
rare events impacting certain pathways as a group increase
the risk of disease, and if so, what those pathways are. Other
studies have used standard gene-set enrichment approaches
to demonstrate that events discovered in cases contain
more genes in neuro-developmental pathways than would
be expected by chance. However, these analyses do not
explicitly compare the relative enrichment in cases to any
enrichment that may also be present in controls. Therefore,
they can be confounded by the large size of brain genes or
by larger size or frequency of CNVs in cases. Here we
propose a case-control statistical test to assess whether a
key pathway is differentially impacted by CNVs in cases
compared to controls. Our approach is robust to skewed
gene sizes and case-control differences in CNV rate and size.
Pathway Analyses of Genes Affected by Rare CNVs
to determine whether any enrichment is general to all genes, or
specific to a subset of genes.
a) Enrichment of genic CNVs
./plink --cfile mycnv
b) Enrichment of pathway genes CNVs, relative to all
./plink --cfile mycnv
c) Enrichment of pathway genes CNVs, relative to all
./plink --cfile mycnv
./plink --cfile my-genic-cnv
The usual modifiers (to define intersection differently, allow for a
certain kb border around each gene, filter on CNV size, type or
frequency, etc) are all available. Under all circumstances, 2-sided
asymptotic p-values are returned. Alternatively, permutation testing
can be applied and 1-sided empirical p-values are returned (positive
enrichment in cases, based on estimated regression coefficient).
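The logic of a 1-sided empirical p-value from label permutation can be sketched as follows. This is a simplified illustration rather than the PLINK procedure: the test statistic here is the case-minus-control difference in mean gene-set hit counts, used as a stand-in for the estimated regression coefficient mentioned above, and the function name and defaults are hypothetical.

```python
import random

def empirical_p(case_values, control_values, n_perm=999, seed=0):
    """One-sided empirical p-value for 'cases have a larger mean than controls'."""
    rng = random.Random(seed)
    n_case = len(case_values)
    pooled = list(case_values) + list(control_values)

    def stat(values):
        # First n_case entries play the role of cases, the rest of controls.
        cases, controls = values[:n_case], values[n_case:]
        return sum(cases) / len(cases) - sum(controls) / len(controls)

    observed = stat(pooled)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                 # permute case/control labels
        if stat(pooled) >= observed:
            as_extreme += 1
    # Count the observed labeling itself so the p-value is never zero.
    return (as_extreme + 1) / (n_perm + 1)

if __name__ == "__main__":
    cases = [2, 1, 3, 0, 2, 1, 2, 3]       # toy gene-set hit counts per person
    controls = [0, 1, 0, 1, 0, 0, 1, 0]
    print("empirical 1-sided p:", empirical_p(cases, controls))
```

Only positive enrichment in cases counts as extreme, so a depletion in cases would give a p-value near 1 rather than a small one.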
For additional information consult the PLINK website (http://
pngu.mgh.harvard.edu/purcell/plink/), the resources subsection
(gene list) (http://pngu.mgh.harvard.edu/purcell/plink/res.shtml),
or the CNV file format subsection (http://pngu.mgh.harvard.edu/
Features predicting whether a gene overlaps a CNV
in the meta-controls. A. Here we plot the size distribution of the genes
that are not deleted (n=14,027, blue) and the genes that are
deleted (n=538, red) separately for the meta-controls. Deleted
genes are larger with a median of 66 kb compared to genes not
deleted with a median of 27 kb. Medians and inter-quartile ranges
are indicated with the boxes, while the range indicates the 2.5 to
97.5 percentiles for both distributions. B. We plot the
intronic fraction score as a function of gene size. Larger genes tend
to have potentially greater proportions of events that could be fully
intronic. Red points indicate deleted genes while blue points
indicate the remainder. C. Here we plot the local gene density, i.e.,
the number of other nearby genes overlapped by a CNV as a
function of gene size. Events overlapping large genes tend not to
overlap other nearby genes. Red points indicate deleted genes
while blue points indicate the remainder.
Found at: doi:10.1371/journal.pgen.1001097.s001 (1.85 MB TIF)
Gene size of brain genes highlighted in three CNV-
association studies. Here we list affected genes within gene sets
highlighted in three neuropsychiatric disease studies. In the first
column we list the study, in the next two columns we list the
functional gene sets and their source. In the fourth and fifth
columns we list the genes and their sizes. In the final column we list
the mean size. Many of the genes highlighted in all three studies
are very large genes. *For the Walsh et al. study, these genes were
compiled from multiple brain function gene sets.
Found at: doi:10.1371/journal.pgen.1001097.s002 (0.09 MB
Simulated distribution of CNV rate and size in cases
and controls. We tested different statistical models as outlined in
Materials and Methods (M0–M4) for false positive associations
under each of five extreme scenarios (S0–S4) outlined in the above
table. For a single hypothetical chromosome, 250 Mb in length, we
placed 2000 evenly-spaced, non-overlapping genes. Every fifth
gene was designated as a “brain gene”; brain genes were set to be
considerably larger than other genes (50 kb versus 10 kb). In all
scenarios we simulated CNV data for 2000 cases and 2000
controls, specifying a mean CNV size of either 60 kb or 100 kb
(range 10 kb to 150 kb, standard deviation 30 kb) and a CNV rate per
individual. Under the first scenario, S0, there were no differences
between cases and controls in the rate and size of CNVs: we
therefore expected all methods to give appropriate type I error
rates here. Under S1, the rate of CNVs was higher in cases. Under
S2, the average CNV size was smaller in cases. Under S3, cases
had a greater number, and larger, CNVs than controls. Under S4,
cases had a greater number, but smaller, CNVs than controls.
Found at: doi:10.1371/journal.pgen.1001097.s003 (0.04 MB
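The CNV sampling described for these simulated scenarios could be sketched along the following lines. The caption gives the size distribution (mean 60 kb or 100 kb, standard deviation 30 kb, range 10 kb to 150 kb) but not the per-individual CNV rates, so the rates used below (1.0 for controls, 1.5 for cases) are placeholders, as are the Poisson model for CNV counts, the uniform placement, and all function names. The example mimics the direction of scenario S3, in which cases carry more and larger CNVs.

```python
import math
import random

def sample_cnv_size(rng, mean, sd=30_000, lo=10_000, hi=150_000):
    """Draw a CNV length from a normal distribution truncated to [lo, hi]."""
    while True:
        size = rng.gauss(mean, sd)
        if lo <= size <= hi:
            return int(size)

def sample_poisson(rng, lam):
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_group(rng, n_people, cnv_rate, mean_size, chrom_len=250_000_000):
    """Return one list of (start, end) CNVs per simulated individual."""
    people = []
    for _ in range(n_people):
        cnvs = []
        for _ in range(sample_poisson(rng, cnv_rate)):
            size = sample_cnv_size(rng, mean_size)
            start = rng.randrange(chrom_len - size)   # uniform placement
            cnvs.append((start, start + size))
        people.append(cnvs)
    return people

def summarize(people):
    """Mean CNVs per individual and mean CNV length across the group."""
    sizes = [end - start for cnvs in people for start, end in cnvs]
    return len(sizes) / len(people), sum(sizes) / len(sizes)

if __name__ == "__main__":
    rng = random.Random(42)
    cases = simulate_group(rng, 2000, cnv_rate=1.5, mean_size=100_000)
    controls = simulate_group(rng, 2000, cnv_rate=1.0, mean_size=60_000)
    print("cases    (rate, mean size):", summarize(cases))
    print("controls (rate, mean size):", summarize(controls))
```

The other scenarios would follow by changing the rate and mean-size arguments for each group; the simulated CNVs can then be intersected with a gene layout to count affected genes per individual.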
Collections examined in this study. Our study
examined affected and unaffected individuals from the ISC,
Walsh et al., and Zhang et al. We also used unaffected populations
from four separate studies (meta-controls). For each study we list
the number of samples, the genotyping technology used to identify
CNVs (Representational Oligonucleotide Microarray Analysis
(ROMA), Affymetrix 5.0 (5.0) or Affymetrix 6.0 (6.0)), the number
of observed events, how we defined a ‘rare’ event, their size, and
the number of genes affected by those events.
Found at: doi:10.1371/journal.pgen.1001097.s004 (0.06 MB
Deletion events in meta-controls called in four
separate populations. Meta-control rare deletion events were
called based on Affymetrix 6.0 arrays. For each of the four
collections we list the number of samples, the number of rare
deletions >20 kb and the ratio of deletions to samples, the number
of rare deletions >100 kb and the ratio of deletions to samples,
and finally the median event size.
Found at: doi:10.1371/journal.pgen.1001097.s005 (0.04 MB
Gene ontology codes with the largest genes are
enriched in meta-controls. Here we list 13 GO codes with an
average gene length >200 kb and their descriptions in the first
four columns. In the next three columns we list the number of
genes for each code overlapping rare deletions in meta-controls,
the odds ratio, and the statistical significance. All enrichment
analysis p-values are calculated with Fisher’s exact test; enrichment
is calculated for both disrupted and deleted genes. In the
final three columns we list the number of genes disrupted by rare
deletions in meta-controls, the odds ratio, and the statistical
significance.
Found at: doi:10.1371/journal.pgen.1001097.s006 (0.06 MB
CNVprop scores of genes.
Found at: doi:10.1371/journal.pgen.1001097.s007 (0.66 MB
We thank Christopher Cotsapas, Paul de Bakker, and Benjamin Voight
for helpful comments and criticism. We also thank Aylwin Ng and Ramnik
Xavier for assistance in gene expression analysis. We thank (1) The
Myocardial Infarction Genetics Consortium (MIGen) study for the use of
their genotype data as control data in our study; (2) Johanna Seddon and
the Progression of AMD Study, AMD Registry Study, Family Study of
AMD, The US Twin Study of AMD, and the Age-Related Eye Disease
Study (AREDS) for use of genotype data from their healthy controls in our
study; (3) Phil de Jager, David Hafler and the Multiple Sclerosis
collaborative for use of genotype data from their healthy controls recruited
at Brigham and Women’s Hospital; and (4) the GAIN collaborative for use
of genotype data from their healthy controls in our study.
The International Schizophrenia Consortium.
Shaun Purcell, Jennifer Stone, Sarah Bergen, Colm O’Dushlaine,
Douglas Ruderfer, Pamela Sklar, Broad Institute, Cambridge, USA and
Massachusetts General Hospital, Boston, USA; Edward Scolnick, Kim-
berly Chambert, Broad Institute, Boston, USA; Michael O’Donovan,
George Kirov, Nick Craddock, Peter Holmans, Nigel Williams, Lucy
Georgieva, Ivan Nikolov, Nadine Norton, H Williams, Draga Toncheva,
Vihra Milanova, Michael Owen, Cardiff University, Cardiff, UK;
Christina Hultman, Paul Lichtenstein, Emma Thelander, Patrick Sullivan,
Karolinska Institutet, Stockholm, Sweden and University of North
Carolina Chapel Hill, Chapel Hill, USA; Derek Morris, Elaine Kenny,
John Waddington, Michael Gill, Aiden Corvin, Trinity College Dublin,
Dublin, Ireland; Andrew McQuillin, Khalid Choudhury, Susmita Datta,
Jonathan Pimm, Srinivasa Thirumalai, Vinay Puri, Robert Krasucki, Jacob
Lawrence, Digby Quested, Nicholas Bass, David Curtis, Hugh Gurling,
University College London, London, UK; Caroline Crombie, Gillian
Fraser, Noelle Kwan, Nicholas Walker, David St. Clair, University of
Aberdeen, Aberdeen, UK; Douglas Blackwood, Walter Muir, Kevin
McGhee, Alan Maclean, Margaret Van Beck, University of Edinburgh,
Edinburgh, UK; Peter Visscher, Stuart Macgregor, Naomi Wray,
Queensland Institute of Medical Research, Brisbane, Australia; Michele
T. Pato, Helena Medeiros, Frank Middleton, Celia Carvalho, Christopher
Morley, Ayman Fanous, David Conti, James Knowles, Carlos Paz
Ferreira, Antonio Macedo, M. Helena Azevedo, Carlos N. Pato, University
of Southern California, Los Angeles, CA, USA.
Conceived and designed the experiments: SR SAM DA PS SP MJD.
Performed the experiments: SR JMK. Analyzed the data: SR JMK SAM
SP MJD. Contributed reagents/materials/analysis tools: JMK PS. Wrote
the paper: SR DA PS SP MJD.
1. Walsh T, McClellan JM, McCarthy SE, Addington AM, Pierce SB, et al. (2008)
Rare structural variants disrupt multiple genes in neurodevelopmental pathways
in schizophrenia. Science 320: 539–543.
2. International Schizophrenia Consortium (2008) Rare chromosomal deletions
and duplications increase risk of schizophrenia. Nature 455: 237–241.
3. Stefansson H, Rujescu D, Cichon S, Pietilainen OP, Ingason A, et al. (2008)
Large recurrent microdeletions associated with schizophrenia. Nature 455:
4. Xu B, Roos JL, Levy S, van Rensburg EJ, Gogos JA, et al. (2008) Strong
association of de novo copy number mutations with sporadic schizophrenia. Nat
Genet 40: 880–885.
5. Marshall CR, Noor A, Vincent JB, Lionel AC, Feuk L, et al. (2008) Structural
variation of chromosomes in autism spectrum disorder. Am J Hum Genet 82:
6. Sebat J, Lakshmi B, Malhotra D, Troge J, Lese-Martin C, et al. (2007) Strong
association of de novo copy number mutations with autism. Science 316:
7. Weiss LA, Shen Y, Korn JM, Arking DE, Miller DT, et al. (2008) Association
between Microdeletion and Microduplication at 16p11.2 and Autism.
N Engl J Med 358: 667–675.
8. Zhang D, Cheng L, Qian Y, Alliey-Rodriguez N, Kelsoe JR, et al. (2009)
Singleton deletions throughout the genome increase risk of bipolar disorder. Mol
Psychiatry 14: 376–380.
9. Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, et al. (2003)
PGC-1alpha-responsive genes involved in oxidative phosphorylation are
coordinately downregulated in human diabetes. Nat Genet 34: 267–273.
10. Elia J, Gai X, Xie HM, Perin JC, Geiger E, et al. (2010) Rare structural variants
found in attention-deficit hyperactivity disorder are preferentially associated with
neurodevelopmental genes. Mol Psychiatry 5: 637–646.
11. Glessner JT, Wang K, Cai G, Korvatska O, Kim CE, et al. (2009) Autism
genome-wide copy number variation reveals ubiquitin and neuronal genes.
Nature 459: 569–573.
12. Mi H, Lazareva-Ulitsky B, Loo R, Kejariwal A, Vandergriff J, et al. (2005) The
PANTHER database of protein families, subfamilies, functions and pathways.
Nucleic Acids Res 33: D284–288.
13. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, et al. (2000) Gene
ontology: tool for the unification of biology. The Gene Ontology Consortium.
Nat Genet 25: 25–29.
14. Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, et al. (2006) Global
variation in copy number in the human genome. Nature 444: 444–454.
15. Yim SH, Kim TM, Hu HJ, Kim JH, Kim BJ, et al. (2010) Copy number
variations in East-Asian population and their evolutionary and functional
implications. Hum Mol Genet 19: 1001–1008.
16. Stanley SM, Bailey TL, Mattick JS (2006) GONOME: measuring correlations
between GO terms and genomic positions. BMC Bioinformatics 7: 94.
17. Taher L, Ovcharenko I (2009) Variable locus length in the human genome leads
to ascertainment bias in functional inference for non-coding elements.
Bioinformatics 25: 578–584.
18. Kanehisa M, Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes.
Nucleic Acids Res 28: 27–30.
19. Cohen BA, Mitra RD, Hughes JD, Church GM (2000) A computational analysis
of whole-genome expression data reveals chromosomal domains of gene
expression. Nat Genet 26: 183–186.
20. Rujescu D, Ingason A, Cichon S, Pietilainen OP, Barnes MR, et al. (2009)
Disruption of the neurexin 1 gene is associated with schizophrenia. Hum Mol
Genet 18: 988–996.
21. The Wellcome Trust (2007) Genome-wide association study of 14,000 cases of
seven common diseases and 3,000 shared controls. Nature 447: 661–678.
22. Barnes C, Plagnol V, Fitzgerald T, Redon R, Marchini J, et al. (2008) A robust
statistical method for case-control association testing with copy number
variation. Nat Genet 40: 1245–1252.
23. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, et al. (2007)
PLINK: a tool set for whole-genome association and population-based linkage
analyses. Am J Hum Genet 81: 559–575.
24. De Jager PL, Jia X, Wang J, de Bakker PI, Ottoboni L, et al. (2009) Meta-
analysis of genome scans and replication identify CD6, IRF8 and TNFRSF1A as
new multiple sclerosis susceptibility loci. Nat Genet 41: 776–782.
25. Kathiresan S, Voight BF, Purcell S, Musunuru K, Ardissino D, et al. (2009)
Genome-wide association of early-onset myocardial infarction with single
nucleotide polymorphisms and copy number variants. Nat Genet 41: 334–341.
26. Neale BM, Fagerness J, Reynolds R, Sobrin L, Parker M, et al. (2010) Genome-
wide association study of advanced age-related macular degeneration identifies a
role of the hepatic lipase gene (LIPC). Proc Natl Acad Sci U S A 107:
27. Wang K, Zhang H, Kugathasan S, Annese V, Bradfield JP, et al. (2009) Diverse
genome-wide association studies associate the IL12/IL23 pathway with Crohn
Disease. Am J Hum Genet 84: 399–405.
28. Roth RB, Hevezi P, Lee J, Willhite D, Lechner SM, et al. (2006) Gene
expression analyses reveal molecular relationships among 20 regions of the
human CNS. Neurogenetics 7: 67–80.
29. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, et al. (2003) Summaries
of Affymetrix GeneChip probe level data. Nucleic Acids Res 31: e15.
30. Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, et al. (2008)
Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res 36: D13–21.
31. Korn JM, Kuruvilla FG, McCarroll SA, Wysoker A, Nemesh J, et al. (2008)
Integrated genotype calling and association analysis of SNPs, common copy
number polymorphisms and rare CNVs. Nat Genet 40: 1253–1260.
32. McCarroll SA, Kuruvilla FG, Korn JM, Cawley S, Nemesh J, et al. (2008)
Integrated detection and population-genetic analysis of SNPs and copy number
variation. Nat Genet 40: 1253–1260.