ABSTRACT: Genome-wide association studies have previously identified 23 genetic loci associated with circulating fibrinogen concentration. These studies used HapMap imputation and did not examine the X chromosome. 1000 Genomes imputation provides better coverage of uncommon variants, and includes indels. We conducted a genome-wide association analysis of 34 studies imputed to the 1000 Genomes Project reference panel and including ∼120,000 participants of European ancestry (95,806 participants with data on the X chromosome). Approximately 10.7 million SNPs and 1.2 million indels were examined. We identified 41 genome-wide significant fibrinogen loci of which 18 were newly identified. There were no genome-wide significant signals on the X chromosome. The lead variants of 5 significant loci were indels. We further identified 6 additional independent signals, including 3 rare variants, at two previously characterized loci: FGB and IRF1. Together the 41 loci explain 3% of the variance in plasma fibrinogen concentration.
Human Molecular Genetics 11/2015; DOI:10.1093/hmg/ddv454 · 6.39 Impact Factor
ABSTRACT: BACKGROUND: Studies of related individuals have consistently demonstrated notable familial aggregation of cancer. We aim to estimate the heritability and genetic correlation attributable to the additive effects of common single-nucleotide polymorphisms (SNPs) for cancer at 13 anatomical sites. METHODS: Between 2007 and 2014, the US National Cancer Institute generated data from genome-wide association studies (GWAS) for 49 492 cancer case patients and 34 131 control patients. We apply novel mixed model methodology (GCTA) to these GWAS data to estimate the heritability of individual cancers, as well as the proportion of heritability attributable to cigarette smoking in smoking-related cancers, and the genetic correlation between pairs of cancers. RESULTS: GWAS heritability was statistically significant at nearly all sites, with the estimates of array-based heritability, h_l^2, on the liability threshold (LT) scale ranging from 0.05 to 0.38. Estimating the combined heritability of multiple smoking characteristics, we calculate that at least 24% (95% confidence interval [CI] = 14% to 37%) and 7% (95% CI = 4% to 11%) of the heritability for lung and bladder cancer, respectively, can be attributed to genetic determinants of smoking. Most pairs of cancers studied did not show evidence of strong genetic correlation. We found only four pairs of cancers with marginally statistically significant correlations, specifically kidney and testes (rho = 0.73, SE = 0.28), diffuse large B-cell lymphoma (DLBCL) and pediatric osteosarcoma (rho = 0.53, SE = 0.21), DLBCL and chronic lymphocytic leukemia (CLL) (rho = 0.51, SE = 0.18), and bladder and lung (rho = 0.35, SE = 0.14).
Correlation analysis also indicates that the genetic architecture of lung cancer differs between a smoking population of European ancestry and a nonsmoking Asian population, allowing for the possibility that the genetic etiology for the same disease can vary by population and environmental exposures. CONCLUSION: Our results provide important insights into the genetic architecture of cancers and suggest new avenues for investigation.
JNCI Journal of the National Cancer Institute 10/2015; 107(12). DOI:10.1093/jnci/djv279 · 12.58 Impact Factor
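The abstract above reports four genetic correlations with standard errors and describes them as marginally statistically significant. As a rough check, the sketch below turns each estimate into a z-score and a two-sided p-value. It assumes a Wald test with a normal approximation, which is an assumption here: the abstract does not say which test the authors used. The function name `wald_p` is illustrative.

```python
import math

def wald_p(estimate, se):
    """Return (z, two-sided p) for H0: estimate = 0 under a normal approximation."""
    z = estimate / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return z, math.erfc(abs(z) / math.sqrt(2))

# (rho, SE) pairs exactly as reported in the abstract.
pairs = {
    "kidney / testes": (0.73, 0.28),
    "DLBCL / pediatric osteosarcoma": (0.53, 0.21),
    "DLBCL / CLL": (0.51, 0.18),
    "bladder / lung": (0.35, 0.14),
}

results = {name: wald_p(rho, se) for name, (rho, se) in pairs.items()}
for name, (z, p) in results.items():
    print(f"{name}: z = {z:.2f}, p = {p:.4f}")
```

Under this assumption every pair has a z-score between 2.5 and roughly 2.8, which fits the abstract's "marginally" wording: nominally below 0.05, but not by a wide margin.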
ABSTRACT: In the genomic era, group association tests are of great interest. Due to the overwhelming number of individual genomic features, the power of testing for association of a single genomic feature at a time is often very small, as are the effect sizes for most features. Many methods have been proposed to test association of a trait with a group of features within a functional unit as a whole, e.g. all SNPs in a gene, yet few of these methods account for the fact that generally a substantial proportion of the features are not associated with the trait. In this paper, we propose to model the association for each feature in the group as a mixture of features with no association and features with non-zero associations to explicitly account for the possibility that a fraction of features may not be associated with the trait while other features in the group are. The feature-level associations are first estimated by generalized linear models; the sequence of these estimated associations is then modeled by a hidden Markov chain. To test for global association, we develop a modified likelihood ratio test based on a log-likelihood function that ignores higher order dependency plus a penalty term. We derive the asymptotic distribution of the likelihood ratio test under the null hypothesis. Furthermore, we obtain the posterior probability of association for each feature, which provides evidence of feature-level association and is useful for potential follow-up studies. In simulations and data application, we show that our proposed method performs well when compared with existing group association tests, especially when only a few features are associated with the outcome.
ABSTRACT: Menopause timing has a substantial impact on infertility and risk of disease, including breast cancer, but the underlying mechanisms are poorly understood. We report a dual strategy in ∼70,000 women to identify common and low-frequency protein-coding variation associated with age at natural menopause (ANM). We identified 44 regions with common variants, including two regions harboring additional rare missense alleles of large effect. We found enrichment of signals in or near genes involved in delayed puberty, highlighting the first molecular links between the onset and end of reproductive lifespan. Pathway analyses identified major association with DNA damage response (DDR) genes, including the first common coding variant in BRCA1 associated with any complex trait. Mendelian randomization analyses supported a causal effect of later ANM on breast cancer risk (∼6% increase in risk per year; P = 3 × 10^-14), likely mediated by prolonged sex hormone exposure rather than DDR mechanisms.
ABSTRACT: Background:
Esophageal adenocarcinoma (EA) is among the leading causes of cancer mortality, especially in developed countries. A high level of somatic copy number alterations (CNAs) accumulates over the decades in the progression from Barrett's esophagus, the precursor lesion, to EA. Accurate identification of somatic CNAs is essential to understand cancer development. Many studies have been conducted for the detection of CNA in EA using microarrays. Next-generation sequencing (NGS) technologies are believed to have advantages in sensitivity and accuracy to detect CNA, yet no NGS-based CNA detection in EA has been reported.
In this study, we analyzed whole-exome (WES) and whole-genome sequencing (WGS) data for detecting CNA from a published large-scale genomic study of EA. Two specific comparisons were conducted. First, the recurrent CNAs based on WGS and WES data from 145 EA samples were compared to those found in five previous microarray-based studies. We found that the majority of the previously identified regions were also detected in this study. Interestingly, some novel amplifications and deletions were discovered using the NGS data. In particular, SKI and PRKCZ detected in a deletion region are involved in transforming growth factor-β pathway, suggesting the potential utility of novel biomarkers for EA. Second, we compared CNAs detected in WGS and WES data from the same 15 EA samples. No large-scale CNA was identified statistically more frequently by WES or WGS, while more focal-scale CNAs were detected by WGS than by WES.
Our results suggest that NGS can replace microarrays to detect CNA in EA. WGS is superior to WES in that it can offer finer resolution for the detection, though if the interest is on recurrent CNAs, WES can be preferable to WGS for its cost-effectiveness.
Human genomics 09/2015; 9(1):22. DOI:10.1186/s40246-015-0044-0 · 2.15 Impact Factor
ABSTRACT: The extent to which low-frequency (minor allele frequency (MAF) between 1% and 5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10), a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., P_meta = 2 × 10^-14), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10^-11; n_cases = 98,742 and n_controls = 409,511). Using an En1cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., P_meta = 1 × 10^-11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants.
These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.
ABSTRACT: Under suitable assumptions and by exploiting the independence between inherited genetic susceptibility and treatment assignment, the case-only design yields efficient estimates for subgroup treatment effects and gene-treatment interaction in a Cox model. However, it cannot provide estimates of the genetic main effect and baseline hazards, which are necessary to compute the absolute disease risk. For two-arm, placebo-controlled trials with rare failure time endpoints, we consider augmenting the case-only design with random samples of controls from both arms, as in the classical case-cohort sampling scheme, or with a random sample of controls from the active treatment arm only. The latter design is motivated by vaccine trials for cost-effective use of resources and specimens so that host genetics and vaccine-induced immune responses can be studied simultaneously in a bigger set of participants. We show that these designs can identify all parameters in a Cox model and that the efficient case-only estimator can be incorporated in a two-step plug-in procedure. Results in simulations and a data example suggest that incorporating case-only estimators in the classical case-cohort design improves the precision of all estimated parameters; sampling controls only in the active treatment arm attains a similar level of efficiency.
ABSTRACT: Genetic susceptibility to colorectal cancer is caused by rare pathogenic mutations and common genetic variants that contribute to familial risk. Here we report the results of a two-stage association study with 18,299 cases of colorectal cancer and 19,656 controls, with follow-up of the most statistically significant genetic loci in 4,725 cases and 9,969 controls from two Asian consortia. We describe six new susceptibility loci reaching a genome-wide threshold of P < 5.0 × 10^-8. These findings provide additional insight into the underlying biological mechanisms of colorectal cancer and demonstrate the scientific value of large consortia-based genetic epidemiology studies.
ABSTRACT: Stroke is the second leading cause of death and the third leading cause of years of life lost. Genetic factors contribute to stroke prevalence, and candidate gene and genome-wide association studies (GWAS) have identified variants associated with ischemic stroke risk. These variants often have small effects without obvious biological significance. Exome sequencing may discover predicted protein-altering variants with a potentially large effect on ischemic stroke risk.
Objective: To investigate the contribution of rare and common genetic variants to ischemic stroke risk by targeting the protein-coding regions of the human genome.
The National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) analyzed approximately 6000 participants from numerous cohorts of European and African ancestry. For discovery, 365 cases of ischemic stroke (small-vessel and large-vessel subtypes) and 809 European ancestry controls were sequenced; for replication, 47 affected sibpairs concordant for stroke subtype and an African American case-control series were sequenced, with 1672 cases and 4509 European ancestry controls genotyped. The ESP's exome sequencing and genotyping started on January 1, 2010, and continued through June 30, 2012. Analyses were conducted on the full data set between July 12, 2012, and July 13, 2013.
Main outcomes: Discovery of new variants or genes contributing to ischemic stroke risk and subtype (primary analysis) and determination of support for protein-coding variants contributing to risk in previously published candidate genes (secondary analysis).
We identified 2 novel genes associated with an increased risk of ischemic stroke: a protein-coding variant in PDE4DIP (rs1778155; odds ratio, 2.15; P = 2.63 × 10^-8) with an intracellular signal transduction mechanism and in ACOT4 (rs35724886; odds ratio, 2.04; P = 1.24 × 10^-7) with a fatty acid metabolism mechanism; confirmation of PDE4DIP was observed in affected sibpair families with large-vessel stroke subtype and in African Americans. Replication of protein-coding variants in candidate genes was observed for 2 previously reported GWAS associations: ZFHX3 (cardioembolic stroke) and ABCA1 (large-vessel stroke).
Exome sequencing discovered 2 novel genes and mechanisms, PDE4DIP and ACOT4, associated with increased risk for ischemic stroke. In addition, ZFHX3 and ABCA1 were discovered to have protein-coding variants associated with ischemic stroke. These results suggest that genetic variation in novel pathways contributes to ischemic stroke risk and serves as a target for prediction, prevention, and therapy.
ABSTRACT: Results from the Women's Health Initiative (WHI) clinical trials (CT) demonstrated no increase in the risk of lung cancer in postmenopausal women treated with hormone therapy. We conducted a joint analysis of the WHI observational study data and CT data to further explore the association between estrogen and estrogen-related reproductive factors and lung cancer risk.
Reproductive history, oral contraceptive (OC) use, and postmenopausal hormone therapy (HT) were evaluated in 160,855 women with known HT exposures. Follow-up for lung cancer was through September 17, 2012; 2,467 incident lung cancer cases were ascertained, with median follow-up of 14 years.
For all lung cancers, women with previous use of estrogen plus progestin of < 5 years (HR = 0.84; 95% CI 0.71-0.99) were at reduced risk. A limited number of reproductive factors demonstrated associations with risk. There was a trend towards decreased risk with increasing age at menopause (p_trend = 0.04) and a trend towards increased risk with increasing number of live births (p_trend = 0.03). Reduced risk of non-small cell lung cancer was associated with age 20-29 at first live birth. Risk estimates varied with smoking history, years of HT use and previous bilateral oophorectomy.
Indirect measures of estrogen exposure to lung tissue, as used in this study, provide only weak evidence for an association between reproductive history or HT use and risk of lung cancer. More detailed mechanistic studies and evaluation of risk factors in conjunction with ER expression in the lung should continue, as a role for estrogen cannot be ruled out and may hold potential for prevention and treatment strategies.
Journal of thoracic oncology: official publication of the International Association for the Study of Lung Cancer 04/2015; 10(7). DOI:10.1097/JTO.0000000000000558 · 5.28 Impact Factor
ABSTRACT: Several regions of the genome show pleiotropic associations with multiple cancers. We sought to evaluate whether 181 single-nucleotide polymorphisms previously associated with various cancers in genome-wide association studies were also associated with melanoma risk.
We evaluated 2,131 melanoma cases and 20,353 controls from three studies in the Population Architecture using Genomics and Epidemiology (PAGE) study (EAGLE-BioVU, MEC, WHI) and two collaborating studies (HPFS, NHS). Overall and sex-stratified analyses were performed across studies.
We observed statistically significant associations with melanoma for two lung cancer SNPs in the TERT-CLPTM1L locus (Bonferroni-corrected p < 2.8 × 10^-4), replicating known pleiotropic effects at this locus. In sex-stratified analyses, we also observed a potential male-specific association between prostate cancer risk variant rs12418451 and melanoma risk (OR = 1.22, p = 8.0 × 10^-4). No other variants in our study were associated with melanoma after multiple comparisons adjustment (p > 2.8 × 10^-4).
We provide confirmatory evidence of pleiotropic associations with melanoma for two SNPs previously associated with lung cancer, and provide suggestive evidence for a male-specific association with melanoma for prostate cancer variant rs12418451. This SNP is located near TPCN2, an ion transport gene containing SNPs which have been previously associated with hair pigmentation but not melanoma risk. Previous evidence provides biological plausibility for this association, and suggests a complex interplay between ion transport, pigmentation, and melanoma risk that may vary by sex. If confirmed, these pleiotropic relationships may help elucidate shared molecular pathways between cancers and related phenotypes.
PLoS ONE 03/2015; 10(3):e0120491. DOI:10.1371/journal.pone.0120491 · 3.23 Impact Factor
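The Bonferroni threshold quoted in the melanoma abstract above can be reproduced from the number of SNPs tested. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Bonferroni correction: divide the family-wise error rate by the number of tests.
n_snps = 181    # SNPs evaluated, as stated in the abstract
alpha = 0.05    # assumed family-wise error rate (not given in the abstract)

threshold = alpha / n_snps
print(f"Bonferroni threshold: {threshold:.2e}")

# Reported p-value for the male-specific rs12418451 association.
p_rs12418451 = 8.0e-4
passes = p_rs12418451 < threshold
print(f"rs12418451 passes the corrected threshold: {passes}")
```

Rounded to two significant figures, 0.05 / 181 gives the 2.8 × 10^-4 cutoff quoted in the abstract. The rs12418451 p-value of 8.0 × 10^-4 does not fall below it, which is consistent with the authors describing that association as potential and suggestive.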
ABSTRACT: Obesity is heritable and predisposes to many diseases. To understand the genetic basis of obesity better, here we conduct a genome-wide association study and Metabochip meta-analysis of body mass index (BMI), a measure commonly used to define obesity and assess adiposity, in up to 339,224 individuals. This analysis identifies 97 BMI-associated loci (P < 5 × 10^-8), 56 of which are novel. Five loci demonstrate clear evidence of several independent association signals, and many loci have significant effects on other metabolic phenotypes. The 97 loci account for ∼2.7% of BMI variation, and genome-wide estimates suggest that common variation accounts for >20% of BMI variation. Pathway analyses provide strong support for a role of the central nervous system in obesity susceptibility and implicate new genes and pathways, including those related to synaptic function, glutamate signalling, insulin secretion/action, energy metabolism, lipid biology and adipogenesis.