Kumardeep Chaudhary
Icahn School of Medicine at Mount Sinai | MSSM · Department of Genetics and Genomic Sciences

PhD

About

126
Publications
34,634
Reads
5,320
Citations
Additional affiliations
January 2018 - present
Icahn School of Medicine at Mount Sinai
Position
  • PostDoc Position
Description
  • Health informatics, EHR, Genomics, Personalized Medicine, Deep Learning, Subtype Identification, Genotype-Phenotype discoveries
April 2016 - December 2017
University of Hawaii Cancer Center
Position
  • PostDoc Position
Description
  • Translational bioinformatics and NGS data analyses, Deep learning, Survival Analysis, Driver mutations, Hepatocellular Carcinoma (HCC), Drug Repurposing
September 2015 - January 2016
Institute of Microbial Technology
Position
  • Fellow
Description
  • Cancer Bioinformatics, Machine Learning, Database, Prediction Web Server, Peptide informatics, Cheminformatics, Immunoinformatics

Publications (126)
Article
Full-text available
We report a genome-wide association study (GWAS) of coronary artery disease (CAD) incorporating nearly a quarter of a million cases, in which existing studies are integrated with data from cohorts of white, Black and Hispanic individuals from the Million Veteran Program. We document near equivalent heritability of CAD across multiple ancestral grou...
Article
Full-text available
Data science has been an invaluable part of the COVID-19 pandemic response with multiple applications, ranging from tracking viral evolution to understanding vaccine effectiveness. Asymptomatic breakthrough infections have been a major problem in assessing vaccine effectiveness in populations globally. Serological discrimination of vaccine resp...
Article
Genetic risk for coronary artery disease (CAD) is commonly measured with polygenic risk scores (PRS); yet, the relationship of atherosclerotic burden with PRS in healthy individuals not at high clinical risk for CAD (i.e., without a high pooled cohort equations [PCE] score) is unknown. Here, we implemented a novel recall-by-PRS strategy to measure...
Article
Full-text available
Immunization is expected to confer protection against infection and severe disease for vaccinees while reducing risks to unimmunized populations by inhibiting transmission. Here, based on serial serological studies of an observational cohort of healthcare workers, we show that during a Severe Acute Respiratory Syndrome Coronavirus 2 Delta-variant o...
Article
Full-text available
Data science has been an invaluable part of the COVID-19 pandemic response with multiple applications, ranging from tracking viral evolution to understanding the effectiveness of interventions. Asymptomatic breakthrough infections have been a major problem during the ongoing global surge of the Delta variant. Serological discrimination of vaccine res...
Article
BACKGROUND: Clinical features from electronic health records (EHRs) can be used to build a complementary tool to predict coronary artery disease (CAD) susceptibility. OBJECTIVES: The purpose of this study was to determine whether an EHR score can improve CAD prediction and reclassification 1 year before diagnosis, beyond conventional clinical guid...
Article
Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches. Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes. Design, setting, and participants: This cohort study included 72 434 individual...
Preprint
Full-text available
Data science has been an invaluable part of the COVID-19 pandemic response with multiple applications, ranging from tracking viral evolution to understanding the effectiveness of interventions. Asymptomatic breakthrough infections have been a major problem during the ongoing global surge of the Delta variant. Serological discrimination of vaccine res...
Article
Full-text available
The genetic makeup of an individual contributes to susceptibility and response to viral infection. While environmental, clinical and social factors play a role in exposure to SARS-CoV-2 and COVID-19 disease severity¹,², host genetics may also be important. Identifying host-specific genetic factors may reveal biological mechanisms of therapeutic rel...
Preprint
Full-text available
Acute kidney injury (AKI) is a known complication of COVID-19 and is associated with an increased risk of in-hospital mortality. Unbiased proteomics using longitudinally collected biological specimens can lead to improved risk stratification and discover pathophysiological mechanisms. Using longitudinal measurements of ∼4000 plasma proteins in two...
Article
Identifying whether a given genetic mutation results in a gene product with increased (gain-of-function; GOF) or diminished (loss-of-function; LOF) activity is an important step toward understanding disease mechanisms because they may result in markedly different clinical phenotypes. Here, we generated an extensive database of documented germline G...
Article
Background: Despite advances in cardiovascular disease and risk factor management, mortality from ischemic heart failure (HF) in patients with coronary artery disease (CAD) remains high. Given the partial role of genetics in HF and lack of reliable risk stratification tools, we developed and validated a polygenic risk score for HF in patients with C...
Preprint
Immunization is expected to confer protection against infection and severe disease for vaccinees, while reducing risks to unimmunized populations by inhibiting transmission. Here, based on serial serological studies, we show that during a severe SARS-CoV-2 Delta-variant outbreak in Delhi, 25.3% (95% CI 16.9-35.2) of previously uninfected, ChAdOx1-nC...
Article
Full-text available
Multi-omics data are good resources for prognosis and survival prediction; however, these are difficult to integrate computationally. We introduce DeepProg, a novel ensemble framework of deep-learning and machine-learning approaches that robustly predicts patient survival subtypes using multi-omics data. It identifies two optimal survival subtypes...
Preprint
Full-text available
Diabetic kidney disease (DKD) is considered partially hereditary, but the genetic factors underlying disease remain largely unknown. A key barrier to our understanding stems from its heterogeneity, and likely polygenic etiology. Proteinuric and non-proteinuric DKD are two sub-classes of DKD, defined by high urinary albumin-to-creatinine ratio (UACR...
Article
Background: Acute Kidney Injury treated with dialysis initiation is a common complication of COVID-19 infection among hospitalized patients. However, dialysis supplies and personnel are often limited. Methods: Using data from adult hospitalized COVID-19 patients from five hospitals from the Mount Sinai Health System who were admitted from March 10t...
Article
Biobanks with exomes linked to electronic health records (EHRs) enable the study of genetic pleiotropy between rare variants and seemingly disparate diseases. We performed robust clinical phenotyping of rare, putatively deleterious variants (loss‐of‐function [LoF] and deleterious missense variants) in ERCC6, a gene implicated in inherited retinal d...
Preprint
Full-text available
The genetic makeup of an individual contributes to susceptibility and response to viral infection. While environmental, clinical and social factors play a role in exposure to SARS-CoV-2 and COVID-19 disease severity, host genetics may also be important. Identifying host-specific genetic factors may indicate biological mechanisms of therapeutic relevanc...
Preprint
Full-text available
A major goal of genomic medicine is to quantify the disease risk of genetic variants. Here, we report the penetrance of 37,772 clinically relevant variants (including those reported in ClinVar ¹ and of loss-of-function consequence) for 197 diseases in an analysis of exome sequence data for 72,434 individuals over five ancestries and six decades of...
Article
Full-text available
Diabetic retinopathy (DR) is a common consequence in type 2 diabetes (T2D) and a leading cause of blindness in working-age adults. Yet, its genetic predisposition is largely unknown. Here, we examined the polygenic architecture underlying DR by deriving and assessing a genome-wide polygenic risk score (PRS) for DR. We evaluated the PRS in 6079 indi...
Article
Background/aims: Acute kidney injury (AKI) in critically ill patients is common, and continuous renal replacement therapy (CRRT) is a preferred mode of renal replacement therapy (RRT) in hemodynamically unstable patients. Prediction of clinical outcomes in patients on CRRT is challenging. We utilized several approaches to predict RRT-free survival...
Preprint
Full-text available
Coronary artery disease (CAD) is a leading cause of death, yet its genetic determinants are not fully elucidated. We report a multi-ethnic genome-wide association study of CAD involving nearly a quarter of a million cases, incorporating the largest cohorts to date of Whites, Blacks, and Hispanics from the Million Veteran Program with existing studi...
Article
Full-text available
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the associated Coronavirus Disease 2019 (COVID-19) are a public health emergency. Acute kidney injury (AKI) is a common complication in hospitalized patients with COVID-19, although the mechanisms underlying AKI remain unclear. There may be a direct effect of SARS-CoV-2 virus on...
Article
Full-text available
More than 800 million people in the world suffer from chronic kidney disease (CKD). Genome-wide association studies (GWAS) have identified hundreds of loci where genetic variants are associated with kidney function; however, causal genes and pathways for CKD remain unknown. Here, we performed integration of kidney function GWAS and human kidney–spe...
Article
Full-text available
The clinical impact of rare loss-of-function variants has yet to be determined for most genes. Integration of DNA sequencing data with electronic health records (EHRs) could enhance our understanding of the contribution of rare genetic variation to human disease¹. By leveraging 10,900 whole-exome sequences linked to EHR data in the Penn Medicine Bi...
Preprint
Full-text available
Gain-of-function (GOF) and loss-of-function (LOF) mutations in the same gene may result in markedly different clinical phenotypes and hence require different therapeutic treatments. Identifying the functional consequences of mutations is an important step toward understanding disease mechanisms. While there are numerous computational tools (e.g., C...
Article
Background and objectives: Sepsis-associated AKI is a heterogeneous clinical entity. We aimed to agnostically identify sepsis-associated AKI subphenotypes using deep learning on routinely collected data in electronic health records. Design, setting, participants, & measurements: We used the Medical Information Mart for Intensive Care III database, w...
Article
Full-text available
Background: Early reports indicate that AKI is common among patients with coronavirus disease 2019 (COVID-19) and associated with worse outcomes. However, AKI among hospitalized patients with COVID-19 in the United States is not well described. Methods: This retrospective, observational study involved a review of data from electronic health reco...
Article
Introduction: A previous study demonstrated that the surface area‐normalized standard Kt/V (SAstdKt/V) was more strongly associated with mortality than standard Kt/V (stdKt/V). This study investigates the association of SAstdKt/V and stdKt/V with mortality, anemia, and hypoalbuminemia in a larger patient cohort with a longer follow‐up period. Methods: We i...
Preprint
Full-text available
Background: Tamoxifen is the most commonly used endocrine therapy (ET) for Breast Cancer (BC) patients expressing estrogen receptors (ER), representing almost 70% of all cases. However, one third of early stage BC patients demonstrate endocrine resistance to tamoxifen over the initial five-year treatment period, prompting significant research effor...
Article
Full-text available
While the past decade has seen meaningful improvements in clinical outcomes for multiple myeloma patients, a subset of patients does not benefit from current therapeutics for unclear reasons. Many gene expression-based models of risk have been developed, but each model uses a different combination of genes and often involves assaying many genes mak...
Article
Full-text available
Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze developments of methods to infer protein levels from other omics measurements, we leveraged crowdsourcing via the NCI-CPTAC DREAM proteo...
Article
Background: Polygenic risk scores (PRS) for coronary artery disease (CAD) identify high-risk individuals more likely to benefit from primary prevention statin therapy. Whether polygenic CAD risk is captured by conventional paradigms for assessing clinical cardiovascular risk remains unclear. Objectives: This study sought to intersect polygenic risk...
Article
Urinary tract stones have high heritability, indicating a strong genetic component. However, genome-wide association studies (GWAS) have uncovered only a few genome-wide significant single nucleotide polymorphisms (SNPs). Polygenic risk scores (PRS) sum the cumulative effect of many SNPs and shed light on the underlying genetic architecture. Using GWAS summ...
Preprint
Full-text available
Importance: Preliminary reports indicate that acute kidney injury (AKI) is common in coronavirus disease (COVID)-19 patients and is associated with worse outcomes. AKI in hospitalized COVID-19 patients in the United States is not well-described. Objective: To provide information about frequency, outcomes and recovery associated with AKI and dialysi...
Article
Full-text available
Symptoms are common in patients on maintenance hemodialysis but identification is challenging. New informatics approaches including natural language processing (NLP) can be utilized to identify symptoms from narrative clinical documentation. Here we utilized NLP to identify seven patient symptoms from notes of maintenance hemodialysis patients of t...
Preprint
Background: Prognosis (survival) prediction of patients is important for disease management. Multi-omics data are good resources for survival prediction; however, they are difficult to integrate computationally. Results: We introduce DeepProg, a new computational framework that robustly predicts patient survival subtypes based on multiple types of omic data....
Article
Full-text available
Background and objectives: Hypernatremia is common in hospitalized, critically ill patients. Although there are no clear guidelines on sodium correction rate for hypernatremia, some studies suggest a reduction rate not to exceed 0.5 mmol/L per hour. However, the data supporting this recommendation and the optimal rate of hypernatremia correction i...
Article
Full-text available
The response to respiratory viruses varies substantially between individuals, and there are currently no known molecular predictors from the early stages of infection. Here we conduct a community-based analysis to determine whether pre- or early post-exposure molecular factors could predict physiologic responses to viral exposure. Using peripheral...
Article
Although driver genes in hepatocellular carcinoma (HCC) have been investigated in various previous genetic studies, the prevalence of key driver genes among heterogeneous populations is unknown. Moreover, the phenotypic associations of these driver genes are poorly understood. This report aims to reveal the phenotypic impacts of a group of consensus dr...
Article
Full-text available
Background: Evidence in the literature strongly supports the potential of immunomodulatory peptides for use as vaccine adjuvants. All the mechanisms by which vaccine adjuvants produce immunostimulatory effects directly or indirectly stimulate antigen-presenting cells (APCs). While numerous methods have been developed in the past for predicting B cell and T...
Article
Full-text available
We propose an unsupervised multi-omics integration pipeline, using a deep-learning autoencoder algorithm, to predict the survival subtypes in bladder cancer (BC). We used the TCGA dataset comprising mRNA, miRNA and methylation data to infer two survival subtypes. We then constructed a supervised classification model to predict the survival subgroups of any ne...
Preprint
Full-text available
Urine-based cancer biomarkers offer numerous advantages over other biomarkers and play a crucial role in cancer management. In this study, we develop proteomics-based prediction models that use urine samples to discriminate patients with oncological disorders of the urinary tract from healthy controls. The dataset us...
Article
Full-text available
This paper describes in silico models developed using a wide range of peptide features for predicting antifungal peptides (AFPs). Our analyses indicate that certain types of residues (e.g., C, G, H, K, R, Y) are more abundant in AFPs. The positional residue preference analysis reveals the prominence of particular types of residues (e.g., R, V, K)...
Preprint
Full-text available
Background: Evidence in the literature strongly supports the potential of immunomodulatory peptides for use as vaccine adjuvants. All the mechanisms by which vaccine adjuvants produce immunostimulatory effects directly or indirectly stimulate Antigen Presenting Cells (APCs). While numerous methods have been developed in the past for predicting B-cell and T-c...
Article
Full-text available
Metabolomics holds promise as a new technology to diagnose highly heterogeneous diseases. Conventionally, metabolomics data analysis for diagnosis is done using various statistical and machine-learning-based classification methods. However, it remains unknown if the deep neural network, a class of increasingly popular machine learning methods,...
Article
Full-text available
Identifying robust survival subgroups of hepatocellular carcinoma (HCC) will significantly improve patient care. Currently, efforts to integrate multi-omics data to explicitly predict HCC survival from multiple patient cohorts are lacking. To fill this gap, we present a deep learning (DL)-based model of HCC that robustly differentiates surviva...
Chapter
Advances in the knowledge of various roles played by non-coding RNAs have stimulated the application of RNA molecules as therapeutics. Among these molecules, miRNA, siRNA, and CRISPR-Cas9 associated gRNA have been identified as the most potent RNA molecule classes with diverse therapeutic applications. One of the major limitations of RNA-based ther...