ABSTRACT: Lung cancer is the leading cause of cancer death, disproportionately affecting African-Americans. Prior studies have reported specific genetic markers linked to both smoking quantity and risk of lung cancer in multiple ethnic/racial groups. Investigators analyzed associations between 28 polymorphisms and average cigarettes smoked per day (CPD) in 7156 African-American females and examined interactions between the top polymorphisms and CPD in a cohort of African-American males and females (1078 lung cancer cases and 822 healthy controls). The results suggested that six polymorphisms within one genomic region increased lung cancer risk in African-Americans, and that the increase was most pronounced in light smokers.
ABSTRACT: Candidate gene and genome-wide association studies (GWAS) have identified 15 independent genomic regions associated with bladder cancer risk. In search for additional susceptibility variants, we followed up on four promising single nucleotide polymorphisms (SNPs) that had not achieved genome-wide significance in 6,911 cases and 11,814 controls (rs6104690, rs4510656, rs5003154 and rs4907479, P < 1×10⁻⁶), using additional data from existing GWAS datasets and targeted genotyping for studies that did not have GWAS data. In a combined analysis, which included data on up to 15,058 cases and 286,270 controls, two SNPs achieved genome-wide statistical significance: rs6104690 in a gene desert at 20p12.2 (P = 2.19×10⁻¹¹) and rs4907479 within the MCF2L gene at 13q34 (P = 3.3×10⁻¹⁰). Imputation and fine-mapping analyses were performed in these two regions for a subset of 5,551 bladder cancer cases and 10,242 controls. Analyses at the 13q34 region suggest a single signal marked by rs4907479. In contrast, we detected two signals in the 20p12.2 region - the first signal is marked by rs6104690 and the second signal is marked by two moderately correlated SNPs (r² = 0.53), rs6108803 and the previously reported rs62185668. The second 20p12.2 signal is more strongly associated with the risk of muscle-invasive (T2-T4 stage) compared to non-muscle-invasive (Ta, T1 stage) bladder cancer (case-case P < 0.02 for both rs62185668 and rs6108803). Functional analyses are needed to explore the biological mechanisms underlying these novel genetic associations with risk for bladder cancer.
Article · Jan 2016 · Human Molecular Genetics
ABSTRACT: For African American or Hispanic women, the extent to which clinical breast cancer risk prediction models are improved by including information on susceptibility single nucleotide polymorphisms (SNPs) is unknown, even though these women comprise increasing proportions of the US population and represent a large proportion of the world's population. We studied 7539 African American and 3363 Hispanic women from the Women's Health Initiative. The age-adjusted 5-year risks from the BCRAT and IBIS risk prediction models were measured and combined with a risk score based on >70 independent susceptibility SNPs. Logistic regression, adjusting for age group, was used to estimate risk associations with log-transformed age-adjusted 5-year risks. Discrimination was measured by the odds ratio (OR) per standard deviation (SD) and the area under the receiver operating characteristic curve (AUC). When considered alone, the ORs for African American women were 1.28 for BCRAT and 1.04 for IBIS. When combined with the SNP risk score (OR 1.23), the corresponding ORs were 1.39 and 1.22. For Hispanic women the corresponding ORs were 1.25 for BCRAT and 1.15 for IBIS. When combined with the SNP risk score (OR 1.39), the corresponding ORs were 1.48 and 1.42. There was no evidence that any of the combined models were not well calibrated. Including information on known breast cancer susceptibility loci provides approximately 10% and 19% improvement in risk prediction using BCRAT for African Americans and Hispanics, respectively. The corresponding figures for IBIS are approximately 18% and 26%, respectively.
Article · Nov 2015 · Breast Cancer Research and Treatment
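The abstract above reports discrimination both as an odds ratio per standard deviation and as an AUC. As a rough illustration of how the two scales relate, the sketch below converts an OR per SD into the AUC it would imply if the log-risk score were normally distributed with equal variance in cases and controls. That binormal assumption and the function name `auc_from_or_per_sd` are ours, not the study's, so the printed values are not the AUCs the authors reported.

```python
from math import log, sqrt
from statistics import NormalDist


def auc_from_or_per_sd(or_per_sd: float) -> float:
    """AUC implied by an OR per SD under a binormal, equal-variance model (assumption)."""
    if or_per_sd <= 0:
        raise ValueError("odds ratio must be positive")
    # Under the assumed model, log(OR per SD) equals the case-control mean
    # difference in SD units (d), and AUC = Phi(d / sqrt(2)).
    d = log(or_per_sd)
    return NormalDist().cdf(d / sqrt(2))


# ORs per SD quoted in the abstract for BCRAT in African American women,
# alone (1.28) and combined with the SNP risk score (1.39).
for label, odds_ratio in [("BCRAT alone", 1.28), ("BCRAT + SNP score", 1.39)]:
    print(f"{label}: implied AUC ~ {auc_from_or_per_sd(odds_ratio):.3f}")
```

An OR per SD of 1.0 maps to an AUC of 0.5 (no discrimination), which is a quick sanity check on the conversion.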
ABSTRACT: Genome-wide association studies have previously identified 23 genetic loci associated with circulating fibrinogen concentration. These studies used HapMap imputation and did not examine the X-chromosome. 1000 Genomes imputation provides better coverage of uncommon variants, and includes indels. We conducted a genome-wide association analysis of 34 studies imputed to the 1000 Genomes Project reference panel and including ∼120 000 participants of European ancestry (95 806 participants with data on the X-chromosome). Approximately 10.7 million single-nucleotide polymorphisms and 1.2 million indels were examined. We identified 41 genome-wide significant fibrinogen loci; of which, 18 were newly identified. There were no genome-wide significant signals on the X-chromosome. The lead variants of five significant loci were indels. We further identified six additional independent signals, including three rare variants, at two previously characterized loci: FGB and IRF1. Together the 41 loci explain 3% of the variance in plasma fibrinogen concentration.
Article · Nov 2015 · Human Molecular Genetics
ABSTRACT: Background: Studies of related individuals have consistently demonstrated notable familial aggregation of cancer. We aim to estimate the heritability and genetic correlation attributable to the additive effects of common single-nucleotide polymorphisms (SNPs) for cancer at 13 anatomical sites.
Article · Oct 2015 · JNCI Journal of the National Cancer Institute
ABSTRACT: In the genomic era, group association tests are of great interest. Due to the overwhelming number of individual genomic features, the power of testing for association of a single genomic feature at a time is often very small, as are the effect sizes for most features. Many methods have been proposed to test association of a trait with a group of features within a functional unit as a whole, e.g. all SNPs in a gene, yet few of these methods account for the fact that generally a substantial proportion of the features are not associated with the trait. In this paper, we propose to model the association for each feature in the group as a mixture of features with no association and features with non-zero associations, to explicitly account for the possibility that a fraction of features may not be associated with the trait while other features in the group are. The feature-level associations are first estimated by generalized linear models; the sequence of these estimated associations is then modeled by a hidden Markov chain. To test for global association, we develop a modified likelihood ratio test based on a log-likelihood function that ignores higher order dependency plus a penalty term. We derive the asymptotic distribution of the likelihood ratio test under the null hypothesis. Furthermore, we obtain the posterior probability of association for each feature, which provides evidence of feature-level association and is useful for potential follow-up studies. In simulations and data application, we show that our proposed method performs well when compared with existing group association tests, especially when only a few features are associated with the outcome.
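To make the idea of a feature-level posterior probability of association concrete, here is a minimal sketch of a two-component mixture applied to a single z-score. It is deliberately simpler than the method the abstract describes: it treats features as independent and so ignores the hidden Markov chain and the penalized likelihood ratio test. The function name and the parameters `pi0` (null proportion) and `tau` (spread of non-null effects) are hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def posterior_association(z: float, pi0: float = 0.9, tau: float = 3.0) -> float:
    """Posterior probability that a feature is associated, given its z-score.

    Assumed model (illustrative only): with probability pi0 the feature is
    null and z ~ N(0, 1); otherwise z ~ N(0, 1 + tau**2).
    """
    if not 0.0 < pi0 < 1.0:
        raise ValueError("pi0 must lie strictly between 0 and 1")
    null_density = NormalDist(0.0, 1.0).pdf(z)
    alt_density = NormalDist(0.0, (1.0 + tau ** 2) ** 0.5).pdf(z)
    # Bayes' rule over the two mixture components.
    weighted_alt = (1.0 - pi0) * alt_density
    return weighted_alt / (pi0 * null_density + weighted_alt)


for z in [0.0, 2.0, 4.0]:
    print(f"z = {z:.1f}: posterior = {posterior_association(z):.3f}")
```

Under this toy model the posterior rises with |z|, so features with small test statistics are effectively discounted rather than diluting the group-level signal.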
ABSTRACT: Menopause timing has a substantial impact on infertility and risk of disease, including breast cancer, but the underlying mechanisms are poorly understood. We report a dual strategy in ∼70,000 women to identify common and low-frequency protein-coding variation associated with age at natural menopause (ANM). We identified 44 regions with common variants, including two regions harboring additional rare missense alleles of large effect. We found enrichment of signals in or near genes involved in delayed puberty, highlighting the first molecular links between the onset and end of reproductive lifespan. Pathway analyses identified major association with DNA damage response (DDR) genes, including the first common coding variant in BRCA1 associated with any complex trait. Mendelian randomization analyses supported a causal effect of later ANM on breast cancer risk (∼6% increase in risk per year; P = 3 × 10⁻¹⁴), likely mediated by prolonged sex hormone exposure rather than DDR mechanisms.
ABSTRACT: Background: Esophageal adenocarcinoma (EA) is among the leading causes of cancer mortality, especially in developed countries. A high level of somatic copy number alterations (CNAs) accumulates over the decades in the progression from Barrett's esophagus, the precursor lesion, to EA. Accurate identification of somatic CNAs is essential to understand cancer development. Many studies have been conducted for the detection of CNA in EA using microarrays. Next-generation sequencing (NGS) technologies are believed to have advantages in sensitivity and accuracy to detect CNA, yet no NGS-based CNA detection in EA has been reported.
In this study, we analyzed whole-exome (WES) and whole-genome sequencing (WGS) data for detecting CNA from a published large-scale genomic study of EA. Two specific comparisons were conducted. First, the recurrent CNAs based on WGS and WES data from 145 EA samples were compared to those found in five previous microarray-based studies. We found that the majority of the previously identified regions were also detected in this study. Interestingly, some novel amplifications and deletions were discovered using the NGS data. In particular, SKI and PRKCZ detected in a deletion region are involved in transforming growth factor-β pathway, suggesting the potential utility of novel biomarkers for EA. Second, we compared CNAs detected in WGS and WES data from the same 15 EA samples. No large-scale CNA was identified statistically more frequently by WES or WGS, while more focal-scale CNAs were detected by WGS than by WES.
Our results suggest that NGS can replace microarrays to detect CNA in EA. WGS is superior to WES in that it can offer finer resolution for the detection, though if the interest is on recurrent CNAs, WES can be preferable to WGS for its cost-effectiveness.
ABSTRACT: The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., P_meta = 2 × 10⁻¹⁴), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10⁻¹¹; n_cases = 98,742 and n_controls = 409,511). Using an En1cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., P_meta = 1 × 10⁻¹¹). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants.
These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.
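The frequency classes used in the abstract above can be written down as a small helper. This is only a sketch: the abstract defines rare (MAF ≤ 1%) and low-frequency (MAF between 1% and 5%), while labelling everything above 5% as "common" and assigning exactly 5% to the low-frequency class are conventions we assume here. The function name `classify_maf` is hypothetical.

```python
def classify_maf(maf: float) -> str:
    """Assign a variant to a frequency class from its minor allele frequency."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must be between 0 and 0.5")
    if maf <= 0.01:
        return "rare"           # MAF <= 1%, as defined in the abstract
    if maf <= 0.05:
        return "low-frequency"  # 1% < MAF <= 5% (boundary handling assumed)
    return "common"             # MAF > 5% (assumed convention)


# The two lead variants quoted in the abstract.
for rsid, maf in [("rs11692564", 0.016), ("rs148771817", 0.012)]:
    print(rsid, classify_maf(maf))
```

Both quoted variants fall in the low-frequency class, which matches how the abstract describes them.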
ABSTRACT: Under suitable assumptions and by exploiting the independence between inherited genetic susceptibility and treatment assignment, the case-only design yields efficient estimates for subgroup treatment effects and gene-treatment interaction in a Cox model. However, it cannot provide estimates of the genetic main effect and baseline hazards, which are necessary to compute the absolute disease risk. For two-arm, placebo-controlled trials with rare failure time endpoints, we consider augmenting the case-only design with random samples of controls from both arms, as in the classical case-cohort sampling scheme, or with a random sample of controls from the active treatment arm only. The latter design is motivated by vaccine trials for cost-effective use of resources and specimens so that host genetics and vaccine-induced immune responses can be studied simultaneously in a bigger set of participants. We show that these designs can identify all parameters in a Cox model and that the efficient case-only estimator can be incorporated in a two-step plug-in procedure. Results in simulations and a data example suggest that incorporating case-only estimators in the classical case-cohort design improves the precision of all estimated parameters; sampling controls only in the active treatment arm attains a similar level of efficiency.
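The core of the case-only idea can be illustrated with a 2×2 table of cases cross-classified by genotype and treatment arm. When genotype and treatment assignment are independent in the trial population (as randomization ensures) and the endpoint is rare, the genotype-by-arm odds ratio among cases approximates the multiplicative gene-treatment interaction. The sketch below is that textbook odds-ratio analogue with a Woolf-type confidence interval; it is not the Cox-model estimator or the two-step plug-in procedure from the abstract, and the function name and counts are hypothetical.

```python
from math import exp, log, sqrt


def case_only_interaction_or(carrier_treated, carrier_placebo,
                             noncarrier_treated, noncarrier_placebo, z=1.96):
    """Case-only estimate of the gene-treatment interaction odds ratio.

    All four arguments are counts of cases. Returns (OR, lower, upper)
    using a Woolf (log-scale) confidence interval.
    """
    counts = (carrier_treated, carrier_placebo, noncarrier_treated, noncarrier_placebo)
    if min(counts) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (carrier_treated * noncarrier_placebo) / (carrier_placebo * noncarrier_treated)
    se_log_or = sqrt(sum(1.0 / c for c in counts))
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts: carriers are under-represented among treated cases.
estimate, lower, upper = case_only_interaction_or(20, 40, 40, 40)
print(f"interaction OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interaction OR below 1 in this layout would suggest that the treatment is more protective in carriers than in non-carriers. As the abstract notes, cases alone say nothing about the genetic main effect or baseline hazards, which is why the authors add sampled controls.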
ABSTRACT: Genetic susceptibility to colorectal cancer is caused by rare pathogenic mutations and common genetic variants that contribute to familial risk. Here we report the results of a two-stage association study with 18,299 cases of colorectal cancer and 19,656 controls, with follow-up of the most statistically significant genetic loci in 4,725 cases and 9,969 controls from two Asian consortia. We describe six new susceptibility loci reaching a genome-wide threshold of P < 5.0 × 10⁻⁸. These findings provide additional insight into the underlying biological mechanisms of colorectal cancer and demonstrate the scientific value of large consortia-based genetic epidemiology studies.
Article · Jul 2015 · Nature Communications
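The genome-wide threshold quoted in the abstract above is commonly motivated as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent common-variant tests. The sketch below shows that arithmetic; the one-million figure is the usual convention and is our assumption, not a number taken from the study, and the helper name is hypothetical.

```python
FAMILY_WISE_ALPHA = 0.05
ASSUMED_INDEPENDENT_TESTS = 1_000_000  # conventional figure, not from the study

# Bonferroni: divide the family-wise error rate by the number of tests.
threshold = FAMILY_WISE_ALPHA / ASSUMED_INDEPENDENT_TESTS


def is_genome_wide_significant(p_value: float, cutoff: float = threshold) -> bool:
    """True when a P value falls below the assumed genome-wide cutoff."""
    return p_value < cutoff


print(f"threshold = {threshold:.1e}")
print(is_genome_wide_significant(1e-9), is_genome_wide_significant(1e-7))
```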
ABSTRACT: Stroke is the second leading cause of death and the third leading cause of years of life lost. Genetic factors contribute to stroke prevalence, and candidate gene and genome-wide association studies (GWAS) have identified variants associated with ischemic stroke risk. These variants often have small effects without obvious biological significance. Exome sequencing may discover predicted protein-altering variants with a potentially large effect on ischemic stroke risk.
To investigate the contribution of rare and common genetic variants to ischemic stroke risk by targeting the protein-coding regions of the human genome.
The National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) analyzed approximately 6000 participants from numerous cohorts of European and African ancestry. For discovery, 365 cases of ischemic stroke (small-vessel and large-vessel subtypes) and 809 European ancestry controls were sequenced; for replication, 47 affected sibpairs concordant for stroke subtype and an African American case-control series were sequenced, with 1672 cases and 4509 European ancestry controls genotyped. The ESP's exome sequencing and genotyping started on January 1, 2010, and continued through June 30, 2012. Analyses were conducted on the full data set between July 12, 2012, and July 13, 2013.
Discovery of new variants or genes contributing to ischemic stroke risk and subtype (primary analysis) and determination of support for protein-coding variants contributing to risk in previously published candidate genes (secondary analysis).
We identified 2 novel genes associated with an increased risk of ischemic stroke: a protein-coding variant in PDE4DIP (rs1778155; odds ratio, 2.15; P = 2.63 × 10⁻⁸) with an intracellular signal transduction mechanism and in ACOT4 (rs35724886; odds ratio, 2.04; P = 1.24 × 10⁻⁷) with a fatty acid metabolism mechanism; confirmation of PDE4DIP was observed in affected sibpair families with large-vessel stroke subtype and in African Americans. Replication of protein-coding variants in candidate genes was observed for 2 previously reported GWAS associations: ZFHX3 (cardioembolic stroke) and ABCA1 (large-vessel stroke).
Exome sequencing discovered 2 novel genes and mechanisms, PDE4DIP and ACOT4, associated with increased risk for ischemic stroke. In addition, ZFHX3 and ABCA1 were discovered to have protein-coding variants associated with ischemic stroke. These results suggest that genetic variation in novel pathways contributes to ischemic stroke risk and serves as a target for prediction, prevention, and therapy.
ABSTRACT: Results from the Women's Health Initiative (WHI) clinical trials (CT) demonstrated no increase in the risk of lung cancer in postmenopausal women treated with hormone therapy. We conducted a joint analysis of the WHI observational study data and CT data to further explore the association between estrogen and estrogen-related reproductive factors and lung cancer risk.
Reproductive history, oral contraceptive (OC) use, and postmenopausal hormone therapy (HT) were evaluated in 160,855 women with known HT exposures. Follow-up for lung cancer was through September 17, 2012; 2,467 incident lung cancer cases were ascertained, with median follow-up of 14 years.
For all lung cancers, women with previous use of estrogen plus progestin of < 5 years (HR=0.84; 95% CI 0.71-0.99) were at reduced risk. A limited number of reproductive factors demonstrated associations with risk. There was a trend towards decreased risk with increasing age at menopause (ptrend=0.04) and a trend towards increased risk with increasing number of live births (ptrend=0.03). Reduced risk of non-small cell lung cancer was associated with age 20-29 at first live birth. Risk estimates varied with smoking history, years of HT use and previous bilateral oophorectomy.
Indirect measures of estrogen exposure to lung tissue, as used in this study, provide only weak evidence for an association between reproductive history or HT use and risk of lung cancer. More detailed mechanistic studies and evaluation of risk factors in conjunction with ER expression in the lung should continue as a role for estrogen can't be ruled out and may hold potential for prevention and treatment strategies.
Article · Apr 2015 · Journal of thoracic oncology: official publication of the International Association for the Study of Lung Cancer
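The hazard ratio quoted in the abstract above (HR = 0.84; 95% CI 0.71-0.99) comes with an interval but no P value. Assuming the interval is a symmetric Wald interval on the log scale, the standard error and an approximate two-sided P value can be recovered as sketched below. The function name is hypothetical, and because the published figures are rounded the result is only approximate.

```python
from math import log
from statistics import NormalDist


def wald_from_ci(estimate: float, lower: float, upper: float, level: float = 0.95):
    """Recover (SE of log ratio, z statistic, two-sided P) from a ratio and its CI.

    Assumes the interval was built as exp(log(estimate) +/- z_crit * SE).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(estimate) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p_value


se, z, p_value = wald_from_ci(0.84, 0.71, 0.99)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, approximate P = {p_value:.3f}")
```

An upper limit just below 1 corresponds to a P value just below 0.05, consistent with the abstract's description of the overall evidence as weak.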