Getting from Genes to Function in Lung Disease:
An NHLBI Workshop Report
Carole Ober1, Atul J. Butte2, Jack A. Elias3, A. Jake Lusis4, Weiniu Gan5,
Susan Banks-Schlegel6, David Schwartz7
1 Department of Human Genetics, The University of Chicago, Chicago, IL; 2
Departments of Pediatrics and Medicine, Stanford University School of Medicine,
Stanford, CA; 3 Department of Internal Medicine, Yale University School of Medicine,
New Haven, CT; 4 Department of Medicine, University of California, Los Angeles, CA; 5,6
Division of Lung Diseases, National Heart, Lung, and Blood Institute, Bethesda, MD;
7Department of Medicine, National Jewish Health, Denver, CO
All authors contributed equally to this article
Susan Banks-Schlegel, Ph.D.
Airway Biology and Disease Branch
Division of Lung Diseases
National Heart, Lung, and Blood Institute/NIH
Two Rockledge Center, Suite 10042
6701 Rockledge Drive, MSC 7952
Bethesda, MD 20892-7952
AJRCCM Articles in Press. Published on June 17, 2010 as doi:10.1164/rccm.201002-0180PP
Copyright (C) 2010 by the American Thoracic Society.
Sponsored by the Division of Lung Diseases, National Heart, Lung, and Blood Institute,
National Institutes of Health, Department of Health and Human Services
Running Head: Genes to Function in Lung Disease
Descriptor Number: 6.07
Word Count: Text Excluding Abstract and References= 2705
At a Glance Commentary
1) Scientific knowledge on the Subject: Functional characterization of genes (and
associated variants) involved in lung disease using integrated approaches is essential
for advancing understanding of lung health and disease and developing new
approaches to prevention, diagnosis, prognosis, treatment, and cure of the disease.
2) What this study adds to the field: This NHLBI workshop summary identifies critical
gaps in knowledge concerning gene discovery, systems genetics, and functional
characterization of genetic variation involved in the pathogenesis of lung disease in
human and model systems. Experts in genetics, genomics, proteomics, networks, and
informatics were brought together with experts in pulmonary biology and medicine with
the goal of cross-fertilization and generation of needed next-step initiatives that can be
translated into research needs and opportunities in this important scientific area that
transcends all lung disease areas.
Genome-wide association studies (GWAS) have revealed novel genes and pathways
involved in lung disease, many of which are potential targets for therapy. However,
despite numerous successes, a large proportion of the genetic variance in disease risk
remains unexplained, and the function of the associated genetic variations identified by
GWAS and the mechanisms by which they alter individual risk for disease or
pathogenesis are still largely unknown. The National Heart, Lung, and Blood Institute
(NHLBI) convened a two-day workshop to address these shortcomings and to make
recommendations for future research areas that will move the scientific community
beyond gene discovery. Topics of individual sessions ranged from data integration and
systems genetics to functional validation of genetic variations in humans and model
systems. There was broad consensus among the participants for five high priority areas
for future research, including the following: (1) integrated approaches to characterize
the function of genetic variations, (2) studies on the role of environment and
mechanisms of transcriptional and post-transcriptional regulation, (3) development of
model systems to study gene function in complex biological systems, (4) comparative
phenomic studies across lung diseases, and (5) training in and applications of
bioinformatic approaches for comprehensive mining of existing data sets. Lastly, it was
agreed that future research on lung diseases should integrate approaches across
“-omic” technologies and include ethnically/racially diverse populations in human
studies of lung disease, whenever possible.
Word Count: 230
Key Words: genetics, epigenetics, genomics, bioinformatics, lung disease
Recent results of genome-wide association studies for asthma (1-6), chronic obstructive
pulmonary disease (COPD) (7), sarcoidosis (8), idiopathic pulmonary fibrosis (IPF) (9),
and other lung-relevant phenotypes (10-13) have highlighted both the power and
shortcomings of this approach (14-16). While novel genes or genetic variants have been
identified and, therefore, implicated in disease pathogenesis, a large proportion of the
genetic variance in each case remains unexplained. Moreover, among those genes and
genetic variations implicated in disease pathogenesis by the genome-wide association
studies approach, the function of those variations and the mechanisms by which they
contribute to disease pathogenesis are still largely unknown.
To address these shortcomings and consider future directions for the genetic
dissection of complex lung phenotypes, the Division of Lung Diseases of the National Heart, Lung,
and Blood Institute (NHLBI) sponsored a two-day workshop, “Getting from Genes to
Function in Lung Disease”, on September 3rd-4th, 2009. Two overview presentations by
Dr. Carole Ober (University of Chicago) on the genetics of lung disease and by Dr.
Ronald Crystal (Weill Cornell Medical College) on lung disease phenotypes emphasized
both the complex etiology of lung diseases (Fig. 1) and common pathogenic features
across lung diseases. They suggested that additional insights may be gleaned from
studying common features of diseases with different disease endpoints and emphasized
the need for better phenotyping in studies of lung disease and development. Dr. Eric
Schadt (Pacific Biosciences) then reviewed genomics and systems biology approaches
to studying complex phenotypes. The workshop then covered topics on data integration
(A. Butte, Chair), systems genetics (A. Lusis, Chair), functional validation in model
organisms (J. Elias, Chair) and in humans (D. Schwartz, Chair), and translational and
integrative genomics in clinical settings (C. Ober, Chair). The workshop attendees
participated in lively discussions on these topics with the ultimate goal of defining a
series of recommendations to the Institute on future research initiatives that will fill in our
gaps of knowledge of the genetic basis of lung disease by applying integrated state-of-
the-art “-omic” approaches.
Session 1: Data Integration
In the opening scientific session of the workshop, Dr. William Cookson (Imperial
College, London) spoke on “Integration of Genetics and Gene Expression Profiling in
Lung Disease”. Dr. Cookson reviewed expression quantitative trait loci (eQTL) mapping,
the increasingly popular analytic method for integrating gene expression measurements
and genetic measurements made from the same samples or individuals. In this
approach, gene expression levels are treated as quantitative traits and mapped to
chromosomal loci, and Dr. Cookson showed how this method has been used to study
traits related to asthma. However, Dr. Cookson argued that a single measurement of
gene expression in a single tissue or cell type may not be sufficient, and time-series
measurements and measurements in multiple tissues may be more informative. As a
cautionary tale, Dr. Cookson illustrated how eQTL mapping results differ when different
genotyping platforms are used. Because probes differ between array platforms, the
results of studies using different platforms will also differ, and these differences must be
taken into account when interpreting results. He pointed out, however, that this will only
be an issue until RNA sequencing (RNAseq), which is already the preferred approach for
measuring transcript levels, becomes more widely available.
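The idea of treating an expression level as a quantitative trait can be illustrated with a
minimal sketch, which is not the analysis presented at the workshop. It regresses the
expression of one transcript on allele dosage (0, 1, or 2 copies of a variant) and reports
the slope and the Pearson correlation. The function name `eqtl_association` and the toy
values are assumptions made for illustration only; a real eQTL scan would repeat this test
across many variant-transcript pairs and correct for multiple testing and covariates.

```python
from math import sqrt


def eqtl_association(dosages, expression):
    """Regress expression on allele dosage; return (slope, Pearson r).

    Hypothetical helper for illustration: one variant, one transcript.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched vectors with at least 3 samples")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has no variance")
    slope = sxy / sxx            # expression change per additional allele
    r = sxy / sqrt(sxx * syy)    # strength of the linear association
    return slope, r


# Invented toy data: six individuals, dosage of one SNP, one transcript level.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, r = eqtl_association(dosages, expression)
```

In this toy example expression rises by about one unit per allele, which is the kind of
signal an eQTL analysis is designed to detect.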
Dr. Atul Butte (Stanford University) next spoke on “Exploring Systems Medicine
Using Translational Bioinformatics”. Dr. Butte focused on the highly-enabling nature of
publicly available molecular measurements, and showed that measurements from more
than 300,000 gene expression microarrays and DNA samples from tens of thousands of
individuals can be downloaded from the National Center for Biotechnology Information
(NCBI). Dr. Butte illustrated how these kinds of measurements can be used for
integrative genomics, giving case examples of their use in finding novel biomarkers
for solid organ transplant rejection and in discovering novel ligands
and receptors associated with type 2 diabetes mellitus. His final points warned against
the development of reference repositories and bioinformatics methodology as end
goals, but instead promoted investments in the application of computational
methodologies on available data to further the development of diagnostics and
The final speaker in this session, Dr. Dan Roden (Vanderbilt University Medical
Center), spoke on “Integration of the Health System into Genetic and Genomic Data”.
Dr. Roden showed Vanderbilt’s impressive progress in obtaining DNA samples from tens
of thousands of patients within its health system and linking these de-identified
samples to a de-identified copy of each patient’s electronic medical record, a
system called BioVU. With over 60,000 samples already obtained, Dr. Roden illustrated
the next, harder challenge: determining each patient’s “medical phenotype” from the
mostly text-based records that describe them. Obtaining DNA from this many patients will
enable phenotype-wide association studies in the future, which Dr. Roden defined as
the search for differences in specific medical phenotypes that correspond to variance at
a genetic locus. This novel approach was contrasted with genome-wide association
studies, in which variance in one narrowly defined phenotype is correlated against
genotype at every locus to identify associations.
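The contrast between the two designs can be made concrete with a small sketch. It is an
illustration under stated assumptions and not the BioVU methodology: a single variant is
held fixed and every recorded phenotype is scanned for enrichment among carriers, whereas a
GWAS would hold one phenotype fixed and scan every variant. The function names, patient
identifiers, and phenotype labels are invented, and the odds ratio uses a 0.5 continuity
correction so that empty cells do not cause division by zero.

```python
def odds_ratio(carriers, cases, all_patients):
    """Odds ratio of a phenotype in carriers vs. non-carriers (0.5 correction)."""
    non_carriers = all_patients - carriers
    a = len(carriers & cases)        # carriers with the phenotype
    b = len(carriers - cases)        # carriers without it
    c = len(non_carriers & cases)    # non-carriers with the phenotype
    d = len(non_carriers - cases)    # non-carriers without it
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def phenotype_scan(carriers, phenotypes, all_patients):
    """Rank every phenotype by its odds ratio for one fixed variant."""
    results = {
        name: odds_ratio(carriers, cases, all_patients)
        for name, cases in phenotypes.items()
    }
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


# Invented toy cohort: 20 patients, the first 10 carry the variant.
all_patients = set(range(20))
carriers = set(range(10))
phenotypes = {
    "asthma": {0, 1, 2, 3, 4, 5, 10},   # enriched among carriers
    "gout": {0, 10, 11},                # not enriched
}
ranked = phenotype_scan(carriers, phenotypes, all_patients)
```

In practice the phenotype sets would have to be derived from the medical record, which is
the challenge described above, and each test would need adjustment for multiple comparisons.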
Discussant Dr. Joe (Skip) Garcia (The University of Illinois, Chicago) closed the
session with a discussion of the current barriers to fully integrating medical and biological
data, and on suggestions for overcoming these limitations in the future. One point that
was raised in discussion was that we might use as models the organized structure
around data storage and curation that has already been implemented in the National
Cancer Institute (NCI), with its Cancer Biomedical Informatics Grid (caBIG), and the
National Human Genome Research Institute (NHGRI), with The Cancer Genome Atlas
(TCGA). Whether the NHLBI should consider building a similar infrastructure (e.g.,
“pulmBIG”) was debated. The laudable history of the investments that NHLBI has
made in genomics was also discussed, including the Programs for Genomic Applications
(PGA), which developed several reference technologies and data sets that are now
widely used. Lastly, the need to fund junior faculty and trainees in these new data-
driven, computationally-intensive areas of research was also viewed as an important
Session 2: Systems Genetics
Dr. Nancy Cox (The University of Chicago) led off the session on systems genetics with a
talk on “Using Genome-Wide Association Studies and Integrative Network Approaches
to Identify Genes and Pathways in Lung Disease”. She indicated that, while genome-wide
association studies have clearly been successful in identifying novel genes for lung
diseases, the findings thus far have explained a relatively small fraction of the
heritability of the diseases and, most likely, all results to date are the “low-hanging fruit”.
Moreover, she emphasized that relatively little biology has come from genome-wide
association studies of lung, as well as other, diseases. The remainder of her talk
focused on the use of genome-wide expression data to complement standard genome-wide
association studies. She showed how expression array data could be used to map,
in silico, loci controlling gene expression in cis or in trans and how eQTL analysis can
help prioritize candidate genes in regions identified by genome-wide association
studies. She pointed out that this information will also enhance our understanding of the
underlying biology of the disease/trait, and that eQTLs provide new ways of looking for
gene-gene and even gene-environment interactions.
Dr. A. Jake Lusis (UCLA) built upon this by speaking on “Systems Genetics
Approaches to Complex Traits”. The goal of systems biology is to define all the
elements present in a given system and to create an interaction network between these
components so that the behavior of the system can be explained under specified
conditions. Systems genetics is a form of systems biology in which the perturbations
used to construct the biologic network are common variations in the population. The
elements, or nodes, in the network correspond to molecular phenotypes such as
transcript, metabolite, or protein levels. Dr. Lusis described two systems being studied
in his lab using systems genetics. The first consists of a series of primary endothelial
cells studied both before and after treatment with oxidized phospholipids to model
inflammation that occurs in atherosclerosis. The cells were then examined for global
transcript levels using microarrays and the individuals were genotyped using a high
density SNP array. This allowed the identification of eQTL and the modeling of biologic
networks using co-expression analysis. The networks, and predictions made from the
networks, were validated using siRNA knockdown. Dr. Lusis described a second system
that consisted of 100 inbred strains of mice, termed the Hybrid Mouse Diversity Panel
(HMDP), which allows relatively high resolution mapping in mice using association
rather than linkage. Because the inbred strains are permanent, the mice can be
characterized for many different molecular and clinical traits, and then readily mapped
using standard mapping approaches.
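Co-expression analysis of the kind mentioned above can be sketched in its simplest form,
assuming a plain correlation threshold rather than whatever network method was used in the
studies described. Each transcript is a node, and an edge is drawn between two transcripts
when their levels are strongly correlated across individuals. The gene labels, expression
values, and the 0.8 cutoff are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sequence has no variance")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def coexpression_edges(expr, threshold=0.8):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expr), 2):
        r = pearson(expr[gene_a], expr[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Invented transcript levels measured in five individuals.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.5],   # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],    # unrelated to the other two
}
edges = coexpression_edges(expr)
```

Only the geneA-geneB pair passes the threshold here. Predictions read off such a network
would still need experimental validation, for example by the siRNA knockdown described above.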
Discussant Dr. Scott Weiss (Harvard Medical School) summarized the highlights
of this session and led a discussion on future directions in applying systems genetics to
Session 3a: Functional Validation in Model Systems
The first speaker in this session, Dr. Marcelo Nobrega (The University of Chicago),
spoke on “In Vivo Platforms to Follow-up Noncoding Variants Emerging from Genome-
Wide Association Studies”. The experimental follow-up of noncoding variants that map
to gene deserts represents one of the most challenging aspects emerging from multiple
GWAS. Focusing specifically on the hypothesis that functional noncoding variants often
disrupt cis-regulatory elements, he described experimental platforms to identify these
distant long-range regulatory sequences and infer the functional impact of variants
within them. He showed how a strategy to convert Bacterial Artificial Chromosomes
(BACs) into enhancer trapping systems allows for the characterization of regulatory
landscapes using in vivo enhancer assays in mice and zebrafish. Dr. Nobrega showed
how this strategy uncovered multiple enhancers in the gene desert surrounding the
TBX20 gene. By showing that the expression of TBX20 was abrogated following the
deletion of specific enhancers, he demonstrated that this strategy successfully uncovers
most cis-acting elements at a locus. He concluded by applying these principles to
identify a human prostate-specific enhancer in a gene desert, and showing that this
enhancer contains a SNP that has been associated with prostate cancer in multiple
GWAS and confers allele-specific in vivo activity to this enhancer.
Dr. Jack A. Elias (Yale University) spoke on “Genetic Manipulation of Mice to
Define Functionality Relevant to Human Lung Disease”. He discussed the need to go
from genetic associations to biologic clarification, pathway identification, and therapeutic
target validation; and he outlined the murine approaches that can be used to achieve
these goals. Dr. Elias emphasized five basic questions: What does the gene do? What
pathways does it use? How does a given genetic variant alter expression or function
and relate to the pathogenesis of the associated disease? What is the role of the gene
in development or early life origins of disease? Are the protein or its regulators viable
therapeutic targets? The uses and limitations of null mutations (systemic or
tissue-localized, constitutive or inducible), over-expressing transgenic approaches, and
knock-in approaches were addressed. Limiting issues included the need for (i) easier ways of
generating mutant animals, (ii) banks of affordable mutant animals, ES cells, and
tissue-targeted Cre recombinase transgenic mice, (iii) integrated analytic approaches to these
models, and (iv) improved methodologies to address issues relating to microRNA,
glycobiology, and epigenetics. The need for better models of human disease that will
allow more accurate extrapolations from mice to man and for an iterative approach that
combines findings from mice, cells and human investigations was emphasized.
Session 3b: Functional Validation in Humans
Dr. Donata Vercelli discussed the “Functional Dissection of Human Genetic Variants, Or
Finding the Mechanisms that Link Genotype to Phenotype”. She focused on functional
studies of genetic variants associated with human complex diseases using asthma as a
case in point, and emphasized the need for mechanistic studies that go beyond mere
associations to define how complex disease-associated variants dysregulate gene
expression and function. This knowledge is in turn necessary to identify targets for
effective preventive and treatment strategies. She criticized the reductionist in vitro
approaches that are commonly used as being artificial and unable to model interactions
among multiple gene variants within complex haplotypes, or gene-gene and gene-
environment interactions. Dr. Vercelli convincingly argued that in vivo models relying on
mice that carry wild type or polymorphic human haplotypes as transgenes or knock-in
alleles provide a workable solution to this problem. Her studies on mice transgenic for
the entire human Th2 locus on chromosome 5q (including IL13, one of the most robust
asthma/allergy susceptibility genes) show that the human genes are appropriately
regulated, both transcriptionally and epigenetically, in this model. Thus, the functional
impact of polymorphic haplotypes on the expression of these genes, and the biological
events they control in response to relevant stimuli, can be rigorously tested in vivo.
Importantly, these models can be expanded to study gene-gene and gene-environment
interactions, including response to drugs (i.e., pharmacogenetics).
Dr. David Schwartz (National Jewish Health) elaborated on this theme by
pointing out that most human diseases are caused by genetic variation and
environmental exposures, and that including epigenetic marks in genetic studies would,
in part, account for this interaction. Moreover, he demonstrated that when biology is
conserved across evolution, such as seen in innate immunity, model organisms (such
as flies, worms, and yeast) represent ideal biological systems to exploit to discover
novel genes and mechanisms involved in these common biological processes. Finally,
Dr. Schwartz indicated that genetic, genomic, and molecular profiles can be used to
define human disease, identify disease earlier, understand the dynamic biology of
disease within an individual, and individualize therapy and prognosis.
Discussant Dr. Fernando Martinez (University of Arizona) closed the session on
functional validation by further highlighting the importance of considering developmental
stage and environmental exposures in functional studies.
Session 4: Integration Across Genetic, Genomic, and Systems Approaches
In the final session of the workshop, Dr. Damien Chaussabel (Baylor Institute for
Immunology Research) discussed “Translational Human Immunology”, an approach for
integrating patient-based clinical studies and trials with high throughput genomic and
proteomic profiling, flow cytometry, and high resolution cellular studies to better classify
human diseases, with the ultimate goal of improved (customized) therapeutics. He
discussed the challenges that arise from integrating data from these various sources, as
well as from investigators on different continents and using different formats. Once
integrated, however, these data can be mined in ways that provide novel insights into
Dr. Joseph Loscalzo (Harvard Medical School) followed by speaking on “Human
Disease Classification in the Post-Genomic Era: A Complex Systems Approach”,
arguing against a reductionist approach to understanding human disease. Instead, he
proposed using quantitative approaches to examine complex biological systems that
comprise networks and simultaneously consider ensembles of models. Ultimately, this
approach would enable the definition of pathophenotypes (clinical syndromes and
disease) that are determined by genetic, protein, cellular, and environmental
components of a network. Both speakers provided elegant examples of applying
sophisticated bioinformatic approaches that integrate across data sets to characterize
disease processes and to personalize therapeutic approaches.
Prioritizing Variants or Genes for Functional Studies
A road map for moving from gene discovery to function, biology, and discovery of
therapeutic targets is illustrated in Figure 2. In this section we focus on the early steps in
this journey that require decisions on which variants/genes identified in a GWAS should
be considered for functional studies.
It was widely recognized by the speakers and attendees at the conference that
prioritizing specific variants or genes themselves for functional studies is a first critical
step in moving beyond genetic association to function. Often a GWAS identifies many
variants or genes as potential candidates for lung health or disease that cannot be
further prioritized based on the statistical evidence of association. Because studies
elucidating the function and biology of associated variants or the biology of newly
discovered genes can be costly and time-consuming, prioritization of these GWAS
discoveries for further study is an important and challenging first step.
There were two major themes that emerged from these discussions. First, in
selecting variants or genes for functional studies, it is important to consider the
robustness of the association. Although consideration should be given to the strength of
the association, the strongest signal in any one GWAS may not always be the most
robust association overall. Therefore, associations with particular variants, or even with
different variants within the same gene, that replicate broadly across studies should be
given highest priority. Often, these will not be the strongest associations in any one
study, but the consistent evidence for association in many different studies (e.g., as
revealed in a meta-analysis) would further suggest that the variant and gene have main
effects on the phenotype, are less likely influenced by gene-gene or gene-environment
interactions, and are most likely to be true associations.
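The point that consistent replication can outweigh a single strong signal can be
illustrated with an inverse-variance, fixed-effect meta-analysis. This is a standard
technique chosen here for illustration and was not specified by the workshop. Each study
contributes an effect estimate and its standard error, and the pooled z-score reflects the
combined evidence. The function name and the per-study numbers are invented.

```python
from math import sqrt


def fixed_effect_meta(estimates):
    """Pool (beta, standard_error) pairs by inverse-variance weighting.

    Returns (pooled_beta, pooled_se, z).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Invented example: variant A has one very strong study that does not replicate;
# variant B has a moderate effect that replicates in all three studies.
variant_a = [(0.50, 0.10), (0.02, 0.10), (-0.03, 0.10)]
variant_b = [(0.25, 0.10), (0.25, 0.10), (0.25, 0.10)]

beta_a, se_a, z_a = fixed_effect_meta(variant_a)
beta_b, se_b, z_b = fixed_effect_meta(variant_b)

# Strongest single-study z-score for each variant, for comparison.
best_single_a = max(beta / se for beta, se in variant_a)
best_single_b = max(beta / se for beta, se in variant_b)
```

Variant A has the stronger signal in any one study, yet variant B has the larger pooled
z-score, so B would be prioritized under the criterion described above.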
Second, the large body of publicly available data on gene expression
(including eQTLs) and the bioinformatic tools for predicting the potential functionality of
genetic variants allow in silico studies of function or putative function that could both
provide additional confidence in the association and inform the types of functional studies to
consider. In this same vein, incorporating systems genetics, biological networks, data
mining, and predicted structural changes can provide context for newly discovered
genes and motivate specific functional studies at a relatively low cost. Using in silico
approaches for prioritizing variants or genes after a GWAS is particularly important in
situations in which replication studies are not available, as may be the case for rare
phenotypes or phenotypes that may be costly to obtain on a large number of patients
(e.g., through imaging studies). In those cases, integrating bioinformatic approaches
that include complementary data sets, such as gene expression networks and eQTL
mapping, could reveal plausible biological pathways for the newly discovered gene and
motivate subsequent functional studies.
The participants in the workshop identified specific high priority areas for future research
directions (Table 1), which fell into five broad categories: 1) Functional characterization
of genes involved in the pathogenesis of lung disease and of their associated genetic
variations, 2) Identification and characterization of interactions between genes and
between genes and environment that impact lung development and pathology, 3) More
sophisticated phenotyping (phenomics) that incorporates genetic, molecular, cellular
and/or physiologic biomarkers (transcriptomics, proteomics, metabolomics, etc.) and
imaging, 4) Better and more comprehensive mining of existing data, and 5) High-throughput
biological screens in embryonic stem cells and model organisms that focus on gene
targets identified in genome-wide association and linkage studies. It was further
recommended that future research related to each of these categories should be
addressed using integrated approaches that include more than one of the following:
“-omic” technologies, systems genetics/biology and pathway/network analysis, animal
models of lung disease, human studies in ethnically/racially diverse populations,
multiple lung diseases, and mining of data from existing resources. The latter could include,
but is not limited to, data from clinical trial cohorts and population-based cohorts,
studies of transcriptional and proteomic profiling in relevant tissues, mouse models of
lung disease, and repositories of banked tissues. Lastly, there was a strong consensus
for the need to train young investigators to use bioinformatic tools that will allow the
mining of existing data sets to address hypotheses on the pathogenesis of lung disease
and to make lung-related data sets more accessible to the community at large.
1. Hancock, D. B., I. Romieu, M. Shi, J. J. Sienra-Monge, H. Wu, G. Y. Chiu, H. Li, B.
E. del Rio-Navarro, S. A. Willis-Owen, S. T. Weiss, B. A. Raby, H. Gao, C. Eng, R.
Chapela, E. G. Burchard, H. Tang, P. F. Sullivan, and S. J. London. 2009. Genome-
wide association study implicates chromosome 9q21.31 as a susceptibility locus for
asthma in Mexican children. PLoS Genet 5(8):e1000623.
2. Himes, B. E., G. M. Hunninghake, J. W. Baurley, N. M. Rafaels, P. Sleiman, D. P.
Strachan, J. B. Wilk, S. A. Willis-Owen, B. Klanderman, J. Lasky-Su, R. Lazarus, A.
J. Murphy, M. E. Soto-Quiros, L. Avila, T. Beaty, R. A. Mathias, I. Ruczinski, K. C.
Barnes, J. C. Celedon, W. O. Cookson, W. J. Gauderman, F. D. Gilliland, H.
Hakonarson, C. Lange, M. F. Moffatt, G. T. O'Connor, B. A. Raby, E. K. Silverman,
and S. T. Weiss. 2009. Genome-wide association analysis identifies PDE4D as an
asthma-susceptibility gene. Am J Hum Genet 84(5):581-93.
3. Li, X., T. D. Howard, S. L. Zheng, T. Haselkorn, S. P. Peters, D. A. Meyers, and E.
R. Bleecker. 2010. Genome-wide association study of asthma identifies RAD50-IL13
and HLA-DR/DQ regions. J Allergy Clin Immunol 125(2):328-335 e11.
4. Mathias, R. A., A. V. Grant, N. Rafaels, T. Hand, L. Gao, C. Vergara, Y. J. Tsai, M.
Yang, M. Campbell, C. Foster, P. Gao, A. Togias, N. N. Hansel, G. Diette, N. F.
Adkinson, M. C. Liu, M. Faruque, G. M. Dunston, H. R. Watson, M. B. Bracken, J.
Hoh, P. Maul, T. Maul, A. E. Jedlicka, T. Murray, J. B. Hetmanski, R. Ashworth, C.
M. Ongaco, K. N. Hetrick, K. F. Doheny, E. W. Pugh, C. N. Rotimi, J. Ford, C. Eng,
E. G. Burchard, P. M. Sleiman, H. Hakonarson, E. Forno, B. A. Raby, S. T. Weiss,
A. F. Scott, M. Kabesch, L. Liang, G. Abecasis, M. F. Moffatt, W. O. Cookson, I.
Ruczinski, T. H. Beaty, and K. C. Barnes. 2010. A genome-wide association study
on African-ancestry populations for asthma. J Allergy Clin Immunol 125(2):336-346
5. Moffatt, M. F., M. Kabesch, L. Liang, A. L. Dixon, D. Strachan, S. Heath, M. Depner,
A. von Berg, A. Bufe, E. Rietschel, A. Heinzmann, B. Simma, T. Frischer, S. A.
Willis-Owen, K. C. Wong, T. Illig, C. Vogelberg, S. K. Weiland, E. von Mutius, G. R.
Abecasis, M. Farrall, I. G. Gut, G. M. Lathrop, and W. O. Cookson. 2007. Genetic
variants regulating ORMDL3 expression contribute to the risk of childhood asthma.
6. Sleiman, P. M., J. Flory, M. Imielinski, J. P. Bradfield, K. Annaiah, S. A. Willis-Owen,
K. Wang, N. M. Rafaels, S. Michel, K. Bonnelykke, H. Zhang, C. E. Kim, E. C.
Frackelton, J. T. Glessner, C. Hou, F. G. Otieno, E. Santa, K. Thomas, R. M. Smith,
W. R. Glaberson, M. Garris, R. M. Chiavacci, T. H. Beaty, I. Ruczinski, J. M. Orange,
J. Allen, J. M. Spergel, R. Grundmeier, R. A. Mathias, J. D. Christie, E. von Mutius,
W. O. Cookson, M. Kabesch, M. F. Moffatt, M. M. Grunstein, K. C. Barnes, M.
Devoto, M. Magnusson, H. Li, S. F. Grant, H. Bisgaard, and H. Hakonarson. 2010.
Variants of DENND1B associated with asthma in children. N Engl J Med 362(1):36-
7. Pillai, S. G., D. Ge, G. Zhu, X. Kong, K. V. Shianna, A. C. Need, S. Feng, C. P.
Hersh, P. Bakke, A. Gulsvik, A. Ruppert, K. C. Lodrup Carlsen, A. Roses, W.
Anderson, S. I. Rennard, D. A. Lomas, E. K. Silverman, and D. B. Goldstein. 2009. A
genome-wide association study in chronic obstructive pulmonary disease (COPD):
identification of two major susceptibility loci. PLoS Genet 5(3):e1000421.
8. Hofmann, S., A. Franke, A. Fischer, G. Jacobs, M. Nothnagel, K. I. Gaede, M.
Schurmann, J. Muller-Quernheim, M. Krawczak, P. Rosenstiel, and S. Schreiber.
2008. Genome-wide association study identifies ANXA11 as a new susceptibility
locus for sarcoidosis. Nat Genet 40(9):1103-6.
9. Mushiroda, T., S. Wattanapokayakit, A. Takahashi, T. Nukiwa, S. Kudoh, T. Ogura,
H. Taniguchi, M. Kubo, N. Kamatani, and Y. Nakamura. 2008. A genome-wide
association study identifies an association of a common variant in TERT with
susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45(10):654-6.
10. Gu, Y., I. T. Harley, L. B. Henderson, B. J. Aronow, I. Vietor, L. A. Huber, J. B.
Harley, J. R. Kilpatrick, C. D. Langefeld, A. H. Williams, A. G. Jegga, J. Chen, M.
Wills-Karp, S. H. Arshad, S. L. Ewart, C. L. Thio, L. M. Flick, M. D. Filippi, H. L.
Grimes, M. L. Drumm, G. R. Cutting, M. R. Knowles, and C. L. Karp. 2009.
Identification of IFRD1 as a modifier gene for cystic fibrosis lung disease. Nature
11. Gudbjartsson, D. F., U. S. Bjornsdottir, E. Halapi, A. Helgadottir, P. Sulem, G. M.
Jonsdottir, G. Thorleifsson, H. Helgadottir, V. Steinthorsdottir, H. Stefansson, C.
Williams, J. Hui, J. Beilby, N. M. Warrington, A. James, L. J. Palmer, G. H.
Koppelman, A. Heinzmann, M. Krueger, H. M. Boezen, A. Wheatley, J. Altmuller, H.
D. Shin, S. T. Uh, H. S. Cheong, B. Jonsdottir, D. Gislason, C. S. Park, L. M.
Rasmussen, C. Porsbjerg, J. W. Hansen, V. Backer, T. Werge, C. Janson, U. B.
Jonsson, M. C. Ng, J. Chan, W. Y. So, R. Ma, S. H. Shah, C. B. Granger, A. A.
Quyyumi, A. I. Levey, V. Vaccarino, M. P. Reilly, D. J. Rader, M. J. Williams, A. M.
van Rij, G. T. Jones, E. Trabetti, G. Malerba, P. F. Pignatti, A. Boner, L.
Pescollderungg, D. Girelli, O. Olivieri, N. Martinelli, B. R. Ludviksson, D.
Ludviksdottir, G. I. Eyjolfsson, D. Arnar, G. Thorgeirsson, K. Deichmann, P. J.
Thompson, M. Wjst, I. P. Hall, D. S. Postma, T. Gislason, J. Gulcher, A. Kong, I.
Jonsdottir, U. Thorsteinsdottir, and K. Stefansson. 2009. Sequence variants affecting
eosinophil numbers associate with asthma and myocardial infarction. Nat Genet
12. Weidinger, S., C. Gieger, E. Rodriguez, H. Baurecht, M. Mempel, N. Klopp, H.
Gohlke, S. Wagenpfeil, M. Ollert, J. Ring, H. Behrendt, J. Heinrich, N. Novak, T.
Bieber, U. Kramer, D. Berdel, A. von Berg, C. P. Bauer, O. Herbarth, S. Koletzko, H.
Prokisch, D. Mehta, T. Meitinger, M. Depner, E. von Mutius, L. Liang, M. Moffatt, W.
Cookson, M. Kabesch, H. E. Wichmann, and T. Illig. 2008. Genome-wide scan on
total serum IgE levels identifies FCER1A as novel susceptibility locus. PLoS Genet
13. Wilk, J. B., T. H. Chen, D. J. Gottlieb, R. E. Walter, M. W. Nagle, B. J. Brandler, R.
H. Myers, I. B. Borecki, E. K. Silverman, S. T. Weiss, and G. T. O'Connor. 2009. A
genome-wide association study of pulmonary function measures in the Framingham
Heart Study. PLoS Genet 5(3):e1000429.
14. Goldstein, D. B. 2009. Common genetic variation and human traits. N Engl J Med
15. Hirschhorn, J. N. 2009. Genomewide association studies--illuminating biologic
pathways. N Engl J Med 360(17):1699-701.
16. Kraft, P., and D. J. Hunter. 2009. Genetic risk prediction--are we there yet? N Engl J
Figure 1. Lung diseases are complex phenotypes that are influenced by both
environmental exposures and genotype. The timing of environmental exposures during
development is likely critical for many lung diseases. The diseases shown span a
spectrum from those with a single “major” gene and high penetrance (cystic fibrosis, α1-
antitrypsin deficiency) to those with major environmental triggers and low penetrance
(acute lung injury and acute respiratory distress syndrome). Three of the more common
lung diseases (asthma, COPD, and sarcoidosis) have heritabilities of approximately
50%, indicating that neither genotype nor environment is sufficient to cause disease and
that both play major roles in disease susceptibility. ALI, acute lung injury; ARDS, acute
respiratory distress syndrome; IPF, idiopathic pulmonary fibrosis; COPD, chronic
obstructive pulmonary disease; A1AT, α1-antitrypsin deficiency; CF, cystic fibrosis; ETS,
environmental tobacco smoke.
Figure 2. Roadmap from Gene to Function in Lung Diseases. Variation that is
associated with a phenotype of interest can be identified by positional cloning (following
linkage), GWAS, or by direct sequencing. In silico bioinformatic approaches help both to
prioritize variants or genes for functional studies and to put results into broader
biological contexts. Subsequent studies to elucidate the functional effects of the
variation and the biology of the gene/pathway follow, via many approaches that will be
determined by the type of variation discovered (coding vs. noncoding, synonymous vs.
nonsynonymous, genic vs. intergenic, etc.) and in silico predictions, but which will
always benefit from combining multiple approaches. A comprehensive understanding of
function and biology will lead to the discovery of therapeutic targets for lung disease,
biomarkers, and presymptomatic diagnosis. The latter will then feed back to a better
understanding and revised definitions of phenotypes. In this context, we see the need
for forging new collaborations and building diverse multi-disciplinary teams that
integrate genetics, molecular biology, cell physiology, and bioinformatics to bear on this
Table 1. High priority areas for future research directions.
1. Functionally characterize genes (and variation) identified by genome-wide
association studies using integrated approaches across scientific disciplines.
2. Studies on the role of environment and mechanisms of transcriptional
(eQTLs, epigenetic modifications) and post-transcriptional (proteomics,
3. Development of model systems (including mice, flies, zebrafish, worms, and
yeast) to understand how genes function in complex biological systems.
4. Comparative phenomics: the search for common markers of lung disease
(markers defined comprehensively to include molecular, developmental,
cellular, physiologic, anatomic, imaging, etc.).
5. Integrative approaches to identifying disease genes using existing datasets,
including the development of training programs and/or post-graduate courses
for training pulmonary scientists in bioinformatic approaches to data mining.
Conflict of Interest
None of the authors has a financial relationship with a commercial entity that has an
interest in the subject of this manuscript.
• Atul Butte, M.D., Ph.D., Stanford University, Stanford, CA
• Jack Elias, M.D., Yale University, New Haven, CT
• Aldons Jake Lusis, Ph.D., University of California, Los Angeles, CA
• Carole Ober, Ph.D., University of Chicago, Chicago, IL
• David Schwartz, M.D., National Jewish Health, Denver, CO
• Michael J. Bamshad, M.D., University of Washington, Seattle, WA
• Kathleen Barnes, Ph.D., Johns Hopkins University, Baltimore, MD
• Eugene Bleecker, M.D., Wake Forest University, Winston-Salem, NC
• Pat Brooks, Pacific Biosciences, Menlo Park, CA
• Esteban G. Burchard, M.D., M.P.H., University of California, San Francisco, CA
• Damien Chaussabel, Ph.D., Baylor Research Institute, Houston, TX
• Bohao Chen, M.D., University of Chicago, Chicago, IL
• Geoffrey L. Chupp, M.D., Yale University, New Haven, CT
• F. Sessions Cole, M.D., Washington University School of Medicine, St. Louis,
• William Cookson, M.D., D.Phil., FRCP, National Heart and Lung Institute, Royal
Brompton Campus, Imperial College of London, London, UK
• David B. Corry, M.D., Baylor College of Medicine, Houston, TX
• Nancy Cox, Ph.D., University of Chicago, Chicago, IL
• James D. Crapo, M.D., National Jewish Health, Denver, CO
• Ronald G. Crystal, M.D., Cornell University, New York, NY
• Joe (Skip) G. Garcia, M.D., University of Illinois, Chicago, IL
• Frank Gilliland, M.D., Ph.D., University of Southern California, Los Angeles, CA
• Hakon Hakonarson, M.D., Ph.D., Children’s Hospital of Philadelphia,
• Howard J. Huang, M.D., Washington University School of Medicine, St. Louis,
• Naftali Kaminski, M.D., University of Pittsburgh School of Medicine, Pittsburgh,
• Michael Knowles, M.D., University of North Carolina, Chapel Hill, Chapel Hill, NC
• Abigail Lara, M.D., University of Colorado, Denver, CO
• Stephanie London, M.D., Dr.P.H., National Institute of Environmental Health
Sciences, Research Triangle Park, NC
• Joseph Loscalzo, M.D., Ph.D., Brigham and Women’s Hospital, Boston, MA
• James E. Loyd, Ph.D., Vanderbilt University School of Medicine, Nashville, TN
• Fernando D. Martinez, M.D., University of Arizona, Tucson, AZ
• Nuala Meyer, M.D., University of Pennsylvania School of Medicine, Philadelphia,
• Deborah A. Meyers, Ph.D., Wake Forest University, Winston-Salem, NC
• Deborah Nickerson, Ph.D., University of Washington, Seattle, WA
• Dan Nicolae, Ph.D., University of Chicago, Chicago, IL
• Marcelo A. Nobrega, M.D., Ph.D., University of Chicago, Chicago, IL
• Rudy Pascual, M.D., Wake Forest University, Winston-Salem, NC
• Vinicio de Jesus Perez, M.D., Ph.D., Stanford University School of Medicine,
• Diego A. Preciado, M.D., Ph.D., Children’s National Medical Center, Washington,
• Benjamin Raby, M.D., Brigham and Women’s Hospital, Boston, MA
• Dan M. Roden, M.D., Vanderbilt University School of Medicine, Nashville, TN
• Eric Schadt, Ph.D., Pacific Biosciences, Menlo Park, CA
• Sunita Sharma, M.D., M.P.H., Channing Laboratory, Harvard Medical School, Boston, MA
• Edwin Silverman, M.D., Ph.D., Brigham and Women’s Hospital, Boston, MA
• Avrum Spira, M.D., M.Sc., Boston University School of Medicine, Boston, MA
• Donata Vercelli, M.D., University of Arizona, Tucson, AZ
• Scott T. Weiss, M.D., M.S., Brigham and Women’s Hospital, Boston, MA
• Marsha Wills-Karp, Ph.D., Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
• Prescott G. Woodruff, M.D., M.P.H., University of California, San Francisco, CA
• Fred Wright, Ph.D., University of North Carolina, Chapel Hill, Chapel Hill, NC
• Mark M. Wurfel, M.D., Ph.D., University of Washington, Seattle, WA
• John R. Yates, Ph.D., The Scripps Research Institute, La Jolla, CA
• Fei Zou, Ph.D., University of North Carolina, Chapel Hill, Chapel Hill, NC
• Susan Banks-Schlegel, Ph.D., Division of Lung Diseases, NHLBI, Bethesda, MD
• Sandra Colombini-Hatch, M.D., Division of Lung Diseases, NHLBI, Bethesda,
• Weiniu Gan, Ph.D., Division of Lung Diseases, NHLBI, Bethesda, MD
• Dorothy Gail, Ph.D., Division of Lung Diseases, NHLBI, Bethesda, MD
• James P. Kiley, Ph.D., Division of Lung Diseases, NHLBI, Bethesda, MD
• Alan M. Michelson, M.D., Ph.D., Office of the Director, NHLBI, Bethesda, MD