# Nilanjan Chatterjee's research while affiliated with University of Freiburg and other places

## Publications (619)

Background Genome-wide association studies (GWAS) have identified multiple common breast cancer susceptibility variants. Many of these variants have differential associations by estrogen receptor (ER) status, but how these variants relate with other tumor features and intrinsic molecular subtypes is unclear. Methods Among 106,571 invasive breast c...
Large scale genetic association studies have identified many trait-associated variants and understanding the role of these variants in the downstream regulation of gene-expressions can uncover important mediating biological mechanisms. Here we propose ARCHIE, a summary statistic based sparse canonical correlation analysis method to identify sets of...
Genome-wide association studies (GWAS) have been performed to identify host genetic factors for a range of phenotypes, including for infectious diseases. The use of population-based common controls from biobanks and extensive consortiums is a valuable resource to increase sample sizes in the identification of associated loci with minimal additional...
Investigations into the causal underpinnings of disease processes can be aided by the incorporation of genetic information. Genetic studies require populations varied in both ancestry and prevalent disease in order to optimize discovery and ensure generalizability of findings to the global population. Here, we report the genetic determinants of the...
Background Reproductive factors have been shown to be differentially associated with risk of estrogen receptor (ER) positive and ER-negative breast cancer. However, their associations with intrinsic-like subtypes are less clear. Methods Analyses included up to 23,353 cases, and 71,072 controls pooled from 31 population-based case-control or cohort...
Background: The majority of female lung cancer cases in Asia are never-smokers with distinct risk factor profiles. Given the high burden of disease in this population, there is an increasing need to improve the understanding of lung cancer. Current risk models for lung cancer focus on active smokers and individuals of European ancestry. Therefore,...
Genome-wide association studies (GWAS) have found widespread evidence of pleiotropy, but characterization of global patterns of pleiotropy remains highly incomplete due to insufficient power of current approaches. We develop fastASSET, an extension of the method ASSET, to allow computationally efficient detection of variant-level pleiotropic associa...
We studied whether a polygenic score for reduced kidney function developed from population-based studies was associated with adverse outcomes among persons with chronic kidney disease. The polygenic score was significantly associated with incident kidney failure, major adverse cardiovascular outcomes and overall mortality while adjusting for age, s...
Background: Risk estimates for women carrying germline mutations in breast cancer susceptibility genes are mainly based on studies of European ancestry women. Methods: We investigated associations between pathogenic variants (PV) in 34 genes with breast cancer risk in 871 cases (307 estrogen receptor (ER)-positive, 321 ER-negative, and 243 ER-un...
Background Cohort collaborations often require meta-analysis of exposure-outcome association estimates across cohorts as an alternative to pooling individual-level data that requires a laborious process of data harmonization on individual-level data. However, it is likely that important confounders are not all measured uniformly across the cohorts...
BACKGROUND AND AIMS Over 10% of the adult population worldwide is affected by chronic kidney disease (CKD). CKD is associated with an increased risk of kidney failure (KF), cardiovascular events and mortality. CKD is defined and staged by estimated glomerular filtration rate (eGFR), the most common measure of kidney function. Genome-wide associatio...
Improved understanding of genetic regulation of the proteome can facilitate identification of the causal mechanisms for complex traits. We analyzed data on 4,657 plasma proteins from 7,213 European American (EA) and 1,871 African American (AA) individuals from the Atherosclerosis Risk in Communities study, and further replicated findings on 467 AA...
Public health strategies aimed at disease prevention or early detection and intervention have the potential to advance human health worldwide. However, their success depends on the identification of risk factors that underlie disease burden in the general population. Genome-wide association studies (GWAS) have implicated thousands of single-nucleot...
Polygenic risk scores are becoming increasingly predictive of complex traits, but subpar performance in non-European populations raises concerns about their potential clinical applications. We develop a powerful and scalable method to calculate PRS using GWAS summary-statistics from multi-ancestry training samples by integrating multiple techniques...
Introduction: Molecular mechanisms underlying the benefits of healthy dietary patterns on chronic diseases are poorly understood. Identifying protein biomarkers of healthy diets can help us characterize biological pathways influenced by diet quality and confirm the validity of established healthy dietary patterns. Objectives: To identify protein bi...
Background Rare pathogenic variants in cardiomyopathy (CM) genes can predispose to cardiac remodeling or fibrosis. We studied the carrier status for such variants in adults without clinical cardiovascular disease (CVD) in whom cardiac MRI (CMR)-derived measures of myocardial fibrosis were obtained in the Multi-Ethnic Study of Atherosclerosis (MESA)...
Metabolomics genome-wide association studies (GWAS) help outline the genetic contribution to human metabolism. However, studies to date have focused on relatively healthy, population-based samples of White individuals. Here, we conducted a GWAS of 537 blood metabolites measured in the Chronic Renal Insufficiency Cohort (CRIC) Study, with separate ana...
Physical inactivity (PA) is an important risk factor for a wide range of diseases. Previous genome-wide association studies (GWAS), based on self-reported data or a small number of phenotypes derived from accelerometry, have identified a limited number of genetic loci associated with habitual PA and provided evidence for involvement of central nerv...
Importance The risk of airflow limitation and chronic obstructive pulmonary disease (COPD) is influenced by combinations of cigarette smoking and genetic susceptibility, yet it remains unclear whether gene-by-smoking interactions are associated with quantitative measures of lung function. Objective To assess the interaction of cigarette smoking an...
Identifying cancer driver genes is essential for understanding the mechanisms of carcinogenesis and designing therapeutic strategies. Although driver genes have been identified for many cancer types, it is still not clear whether the selection pressure of driver genes is homogeneous across cancer subtypes. We propose a statistical framework MutScot...
Introduction: Even though various risk estimators are widely used to predict atherosclerosis from subclinical levels to hard CHD, there is a remarkable proportion of low-risk individuals with inordinately high coronary artery calcification (CAC) or with hard CHD events. Rare pathogenic variants (<0.1%) in the atherosclerosis gene panel with larger...
Two-phase designs can reduce the cost of epidemiological studies by limiting the ascertainment of expensive covariates and/or exposures to an efficiently selected subset (phase-II) of a larger (phase-I) study. Efficient analysis of the resulting dataset combining disparate information from phase-I and phase-II, however, can be complex. Most of the...
Background: In India, as elsewhere, the incidence of gall-bladder cancer (GBC) is substantially higher in women than in men. Yet, the relevance of reproductive factors to GBC remains poorly understood. Methods: We used logistic regression adjusted for age, education and area to examine associations between reproductive factors and GBC risk, usin...
Background: Genome-wide association studies (GWAS) have revealed numerous loci for kidney function (estimated glomerular filtration rate, eGFR). The relationship of polygenic predictors of eGFR, risk of incident adverse kidney outcomes, and the plasma proteome is not known. Methods: We developed a genome-wide polygenic risk score (PRS) for eGFR by...
Background Proteomic profiling may allow identification of plasma proteins that associate with subsequent changes in kidney function, elucidating biologic processes underlying the development and progression of CKD. Methods We quantified the association between 4877 plasma proteins and a composite outcome of ESKD or decline in eGFR by ≥50% among 94...
Background: Genomic regions that confer susceptibility for bladder cancer have provided important insights into the mechanisms of this disease. Sixteen genomic regions harboring bladder cancer susceptibility loci have been reported to date. To identify additional loci associated with bladder cancer risk, we conducted a meta-analysis including data...
Evaluating gene by environment (G$\times$E) interaction under an additive risk model (i.e. additive interaction) has gained wider attention. Recently, statistical tests have been proposed for detecting additive interaction that utilize an assumption on G-E independence to boost power, which do not rely on restrictive genetic models such as dominant...
The plasma proteomic changes that precede the onset of dementia could yield insights into disease biology and highlight new biomarkers and avenues for intervention. We quantified 4,877 plasma proteins in nondemented older adults in the Atherosclerosis Risk in Communities cohort and performed a proteome-wide association study of dementia risk over f...
Noninvasive multicancer liquid biopsy tests are rapidly emerging for early detection, but implementation will require risk stratification to enhance risk-benefit balance. Using data from the UK Biobank study, we trained and validated a model that allows estimation of absolute risk of developing at least one of the eight common cancers among women....
Improved understanding of the proteome can facilitate the identification of causal mechanisms for complex traits. We conducted a comprehensive analysis of the common variant cis-regulatory genetic architecture of 4,665 plasma proteins from 7,213 European Americans (EA) and 1,871 African Americans (AA) from the Atherosclerosis Risk in Communities (...
Importance: Risk of airflow limitation and Chronic Obstructive Pulmonary Disease (COPD) is influenced by combinations of cigarette smoking and genetic susceptibility, yet it remains unclear whether gene-by-smoking interactions contribute to quantitative measures of lung function. Objective: Determine whether smoking modifies the effect of a polygen...
Background Rigorous evaluation of the calibration and discrimination of breast-cancer risk-prediction models in prospective cohorts is critical for applications under clinical guidelines. We comprehensively evaluated an integrated model incorporating classical risk factors and a 313-variant polygenic risk score (PRS) to predict breast-cancer risk....
Improved understanding of the genetic architecture of the proteome through studies of larger sample size, ethnic diversity, and advanced methods can facilitate the identification of causal mechanisms for complex traits. We conducted a comprehensive analysis of the common variant cis-regulatory genetic architecture of 4,665 plasma proteins or prote...
Polygenic risk scores (PRSs), which often aggregate results from genome-wide association studies, can bridge the gap between initial discovery efforts and clinical applications for the estimation of disease risk using genetics. However, there is notable heterogeneity in the application and reporting of these risk scores, which hinders the translati...
Physical activity (PA) is an important risk factor for a wide range of diseases. Previous genome-wide association studies (GWAS), based on self-reported data or a small number of phenotypes derived from accelerometry, have identified a limited number of genetic loci associated with habitual PA and provided evidence for involvement of central nervou...
Background The Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) and the Tyrer-Cuzick breast cancer risk prediction models are commonly used in clinical practice and have recently been extended to include polygenic risk scores (PRS). In addition, BOADICEA has also been extended to include reproductive and...
Germline variation and smoking are independently associated with pancreatic ductal adenocarcinoma (PDAC). We conducted genome-wide smoking interaction analysis of PDAC using genotype data from four previous genome-wide association studies in individuals of European ancestry (7,937 cases and 11,774 controls). Examination of expression quantitative t...
Mendelian Randomization (MR) analysis is increasingly popular for testing the causal effect of exposures on disease outcomes using data from genome-wide association studies. In some settings, the underlying exposure, such as systemic inflammation, may not be directly observable, but measurements can be available on multiple biomarkers, or other t...
We previously identified 10 lung adenocarcinoma susceptibility loci in a genome-wide association study (GWAS) conducted in the Female Lung Cancer Consortium in Asia (FLCCA), the largest genomic study of lung cancer among never-smoking women to date. Furthermore, household coal use for cooking and heating has been linked to lung cancer in Asia, espe...
Reducing COVID-19 burden for populations will require equitable and effective risk-based allocations of scarce preventive resources, including vaccinations¹. To aid in this effort, we developed a general population risk calculator for COVID-19 mortality based on various sociodemographic factors and pre-existing conditions for the US population, com...
Lung cancer is the leading cause of cancer-related death globally. An improved risk stratification strategy can increase efficiency of low-dose CT (LDCT) screening. Here we assessed whether an individual's genetic background has clinical utility for risk stratification in the context of LDCT screening. On the basis of 13,119 patients with lung cancer...
Background Previous studies have often evaluated methods for Mendelian randomization (MR) analysis based on simulations that do not adequately reflect the data-generating mechanisms in genome-wide association studies (GWAS) and there are often discrepancies in the performance of MR methods in simulations and real data sets. Methods We use a simula...
There is increasing evidence that pleiotropy, the association of multiple traits with the same genetic variants/loci, is a very common phenomenon. Cross-phenotype association tests are often used to jointly analyze multiple traits from a genome-wide association study (GWAS). The underlying methods, however, are often designed to test the global nul...
Background: Recent clinical guidelines support intensive blood pressure (BP) treatment targets. However, observational data suggest that excessive diastolic BP (DBP) lowering might increase the risk of myocardial infarction (MI); reflecting a J- or U-shaped relationship. Methods: We analyzed 47,407 participants from 5 cohorts (median age 60 years)....
Background: Past history of gallstones is associated with increased risk of gallbladder cancer (GBC) in observational studies. We conducted complementary observational and Mendelian Randomization (MR) analyses to determine whether history of gallstones is causally related to development of GBC in an Indian population. Methods: To investigate ass...
Acquired mutations are pervasive across normal tissues. However, understanding of the processes that drive transformation of certain clones to cancer is limited. Here we study this phenomenon in the context of clonal hematopoiesis (CH) and the development of therapy-related myeloid neoplasms (tMNs). We find that mutations are selected differentiall...
Background: Objective measures of physical activity (PA) derived from wrist-worn accelerometers are compared with traditional risk factors in terms of mortality prediction performance in the UK Biobank. Methods: A subset of participants in the UK Biobank study wore a tri-axial wrist-worn accelerometer in a free-living environment for up to 7 day...
Genome-wide association studies (GWAS) have revealed numerous loci for kidney function (estimated glomerular filtration rate, eGFR). The relationship of polygenic predictors of eGFR, risk of incident adverse kidney outcomes, and the plasma proteome is not known. We developed a genome-wide polygenic risk score (PRS) using a weighted average of 1.2 m...
Several statistical methods have been proposed for testing gene(G)-environment(E) interactions under additive risk models using genome-wide association study data. However, these approaches have strong assumptions on underlying genetic models such as dominant or recessive effects that are known to be less robust when the true genetic model is unkno...
Recent studies among healthy individuals show evidence of somatic mutations in leukemia-associated genes, referred to as clonal hematopoiesis (CH). To determine the relationship between CH and oncologic therapy we collected sequential blood samples from 525 cancer patients (median sampling interval time = 23 months, range: 6-53 months) of whom 61...
Introduction: Pancreatic cancer is the seventh leading cause of cancer death worldwide with pancreatic ductal adenocarcinoma (PDAC) being the most common subtype (>90%). Inherited genetic changes and cigarette smoking are established independent risk factors of PDAC. Methods: We conducted a genome-wide gene by smoking interaction analysis of PDAC r...
Background: The Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) and the Tyrer-Cuzick breast cancer risk prediction models are commonly used in clinical practice and have recently been extended to include polygenic risk scores (PRS). In addition, BOADICEA has also been extended to include reproductive and...
While genome-wide association studies have identified susceptibility variants for numerous traits, their combined utility for predicting broad measures of health, such as mortality, remains poorly understood. We used data from the UK Biobank to combine polygenic risk scores (PRS) for 13 diseases and 12 mortality risk factors into sex-specific compo...
The 2017 American College of Cardiology/American Heart Association guideline defines hypertension as a blood pressure ≥130/80 mm Hg, whereas the 2018 European Society of Cardiology (ESC) and 2019 National Institute for Health and Care Excellence (NICE) guidelines use a ≥140/90 mm Hg threshold. Our objective was to study the associations between iso...
Genome-wide association studies (GWAS) have led to the identification of hundreds of susceptibility loci across cancers, but the impact of further studies remains uncertain. Here we analyse summary-level data from GWAS of European ancestry across fourteen cancer sites to estimate the number of common susceptibility variants (polygenicity) and under...
Background: Obesity and diabetes are major modifiable risk factors for pancreatic cancer. Interactions between genetic variants and diabetes/obesity have not previously been comprehensively investigated in pancreatic cancer at the genome-wide level. Methods: We conducted a gene-environment interaction (GxE) analysis including 8,255 cases and 11,...
Breast cancer susceptibility variants frequently show heterogeneity in associations by tumor subtype¹⁻³. To identify novel loci, we performed a genome-wide association study including 133,384 breast cancer cases and 113,789 controls, plus 18,908 BRCA1 mutation carriers (9,414 with breast cancer) of European ancestry, using both standard and novel m...
A variety of predisposing factors have been associated with serious illness and death from COVID-19. Understanding the distribution of risks associated with these factors by local communities can provide important opportunities for targeting interventions. We characterize the distribution of risk for COVID-19 mortality for populations at large acro...
Polygenic risk scores (PRS), often aggregating the results from genome-wide association studies, can bridge the gap between the initial variant discovery efforts and disease risk estimation for clinical applications. However, there is remarkable heterogeneity in the reporting of these risk scores due to a lack of adherence to reporting standards an...
Large-scale genome-wide association (GWAS) studies provide opportunities for developing genetic risk prediction models that have the potential to improve disease prevention, intervention or treatment. The key step is to develop polygenic risk score (PRS) models with high predictive performance for a given disease, which typically requires a large t...
Purpose: The Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) and the Tyrer-Cuzick breast cancer models have recently been extended to include polygenic risk scores (PRS). In addition, BOADICEA has also been extended to include reproductive and lifestyle factors, which were already part of Tyrer-Cuzick mo...
We evaluated the joint associations between a new 313-variant PRS (PRS313) and questionnaire-based breast cancer risk factors for women of European ancestry, using 72,284 cases and 80,354 controls from the Breast Cancer Association Consortium. Interactions were evaluated using standard logistic regression, and a newly developed case-only method, fo...
Blood pressure and kidney function have a bidirectional relation. Hypertension has long been considered as a risk factor for kidney function decline. However, whether intensive blood pressure control could promote kidney health has been uncertain. The kidney is known to have a major role in affecting blood pressure through sodium extraction and r...
Summary There is increasing evidence that pleiotropy, the association of multiple traits with the same genetic variants/loci, is a very common phenomenon. Cross-phenotype association tests are often used to jointly analyze multiple traits from a GWAS. The underlying methods, however, are often designed to test the global null hypothesis that there...
Background: Independent validation of risk prediction models in prospective cohorts is required for risk-stratified cancer prevention. Such studies often have a two-phase design, where information on expensive biomarkers is ascertained in a nested sub-study of the original cohort. Methods: We propose a simple approach for evaluating model discr...