
Jun Ding
McGill University | McGill · Department of Medicine
Doctor of Philosophy
71 Publications
6,410 Reads
1,032 Citations
Publications (71)
The outbreak of the COVID-19 pandemic caused catastrophic socioeconomic consequences and fundamentally reshaped the lives of billions across the globe. Our current understanding of the relationships between clinical variables (demographics, symptoms, follow-up symptoms, comorbidities, treatments, lab results, complications, and other clinical measu...
Resident-tissue macrophages (RTM) arise from embryonic precursors [1,2], yet developmental signals shaping their longevity remain largely unknown. Here we demonstrated in mice genetically deficient in 12/15-LOX (Alox15-/-) that neonatal neutrophil-derived 12-HETE was required for self-renewal and maintenance of alveolar macrophages (AM) during lung de...
The ever-increasing availability of single-cell transcriptomic data offers unrivaled opportunities to profile cellular states in various biological processes at high resolution, which has brought substantial advancements in understanding complex mechanisms underlying a large variety of bioprocesses. As limited by the protocol and technology, single...
UNIFAN is an unsupervised cell type annotation tool for single-cell RNA sequencing data (scRNA-seq). Given single-cell expression data as input, UNIFAN outputs cell clusters as well as annotations for each cluster. The clustering process utilizes information on pathways and biological processes and these are also used to annotate the resulting clus...
One of the first steps in the analysis of single-cell RNA sequencing data (scRNA-seq) is the assignment of cell types. While a number of supervised methods have been developed for this, in most cases such assignment is performed by first clustering cells in low-dimensional space and then assigning cell types to different clusters. To overcome noise...
Cell type assignment is a major challenge for all types of high throughput single cell data. In many cases such assignment requires the repeated manual use of external and complementary data sources. To improve the ability to uniformly assign cell types across large consortia, platforms and modalities, we developed Cellar, a software tool that prov...
A major advantage of single cell RNA-sequencing (scRNA-Seq) data is the ability to reconstruct continuous ordering and trajectories for cells. Here we present TraSig, a computational method for improving the inference of cell-cell interactions in scRNA-Seq studies that utilizes the dynamic information to identify significant ligand-receptor pairs w...
Dysregulation of the balance between pro-inflammatory and anti-inflammatory macrophages has a key function in the pathogenesis of Duchenne muscular dystrophy (DMD), a fatal genetic disease. We postulate that an evolutionarily ancient protective mechanism against infection, known as trained immunity, drives pathological inflammation in DMD. Here we...
Methods for profiling genes at the single-cell level have revolutionized our ability to study several biological processes and systems including development, differentiation, response programmes and disease progression. In many of these studies, cells are profiled over time in order to infer dynamic changes in cell states and types, sets of express...
Patients with COPD may be at an increased risk for severe illness from COVID-19 because of upregulation of ACE2, the entry receptor for SARS-CoV-2. Chronic exposure to cigarette smoke, the main risk factor for COPD, increases pulmonary ACE2. How ACE2 expression is controlled is not known but may involve HuR, an RNA binding protein that increases prote...
One of the first steps in the analysis of single cell RNA-Sequencing data (scRNA-Seq) is the assignment of cell types. While a number of supervised methods have been developed for this, in most cases such assignment is performed by first clustering cells in low-dimensional space and then assigning cell types to different clusters. To overcome noise...
Single-cell technologies are revolutionizing the ability of researchers to infer the causes and results of biological processes. Although several studies of pluripotent cell differentiation have recently utilized single-cell sequencing data, other aspects related to the optimization of differentiation protocols, their validation, robustness, and us...
A major advantage of single cell RNA-Sequencing (scRNA-Seq) data is the ability to reconstruct continuous ordering and trajectories for cells. To date, such ordering was mainly used to group cells and to infer interactions within cells. Here we present TraSig, a computational method for improving the inference of cell-cell interactions in scRNA-Seq...
Making sense of single-cell data requires various computational efforts such as clustering, visualization and gene regulatory network inference, often addressed by different methods. DeepSEM provides an all-in-one solution.
Several recent technologies and platforms enable the profiling of various molecular signals at the single-cell level. A key question for all studies using such data is the assignment of cell types. To improve the ability to correctly assign cell types in single and multi-omics sequencing and imaging single-cell studies, we developed Cellar. This in...
Abnormal coagulation and an increased risk of thrombosis are features of severe COVID-19, with parallels proposed with hemophagocytic lymphohistiocytosis (HLH), a life-threatening condition associated with hyperinflammation. The presence of HLH was described in severely ill patients during the H1N1 influenza epidemic, presenting with pulmonary vascul...
Emphysema, a component of chronic obstructive pulmonary disease (COPD), is characterized by irreversible alveolar destruction that results in a progressive decline in lung function. This alveolar destruction is caused by cigarette smoke, the most important risk factor for COPD. Only 15%‐20% of smokers develop COPD, suggesting that unknown factors c...
The COVID-19 pandemic is associated with severe pneumonia and acute respiratory distress syndrome leading to death in susceptible individuals. For those who recover, post-COVID-19 complications may include development of pulmonary fibrosis. Factors contributing to disease severity or development of complications are not known. Using computational an...
Motivation: Recent technological advances enable the profiling of spatial single cell expression data. Such data presents a unique opportunity to study cell-cell interactions and the signaling genes that mediate them. However, most current methods for the analysis of this data focus on unsupervised descriptive modeling, making it hard to identify k...
The vast majority of biological processes are dynamic, changing over time. Several studies profile high throughput time series data and use it for analyzing and modeling various biological processes. In this review, we focus on data, methods, and analysis for reconstructing dynamic regulatory network models from high throughput time series datasets...
Several molecular datasets have been recently compiled to characterize the activity of SARS-CoV-2 within human cells. Here we extend computational methods to integrate several different types of sequence, functional and interaction data to reconstruct networks and pathways activated by the virus in host cells. We identify the key proteins in these...
Methods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments, while at the same time provide inf...
Alveolar epithelial type 2 cells (AEC2s) are the facultative progenitors responsible for maintaining lung alveoli throughout life but are difficult to isolate from patients. Here, we engineer AEC2s from human pluripotent stem cells (PSCs) in vitro and use time-series single-cell RNA sequencing with lentiviral barcoding to profile the kinetics of th...
microRNAs play essential roles in RNA silencing and regulating gene expression, and their involvement has been demonstrated in normal and pathological processes. microRNAs function by binding to the target mRNA molecule. The dysfunction of microRNAs has been associated with the development and progression of many diseases. To better understand the...
To develop a systems biology model of fibrosis progression within the human lung we performed RNAseq and microRNA analysis on 95 samples obtained from 10 idiopathic pulmonary fibrosis (IPF) and 6 control lungs. Extent of fibrosis in each sample was assessed by microCT measured alveolar surface density (ASD) and confirmed by histology. Regulatory ge...
One million patients with congenital heart disease (CHD) live in the United States. They have a lifelong risk of developing heart failure. Current concepts do not sufficiently address mechanisms of heart failure development specifically for these patients. Here, analysis of heart tissue from an infant with tetralogy of Fallot with pulmonary stenosi...
One million patients with congenital heart disease (CHD) live in the US. They have a lifelong risk of developing heart failure. Current concepts do not sufficiently address mechanisms of heart failure development specifically for these patients. We show that cardiomyocyte cytokinesis failure is increased in tetralogy of Fallot with pulmonary stenos...
Alveolar epithelial type 2 cells (AEC2s) are the facultative progenitors responsible for maintaining lung alveoli throughout life, yet are difficult to access from patients for biomedical research or lung regeneration applications. Here we engineer AEC2s from human induced pluripotent stem cells (iPSCs) in vitro and use single cell RNA sequencing (...
A comprehensive understanding of the dynamic regulatory networks that govern postnatal alveolar lung development is still lacking. To construct such a model, we profiled mRNA, microRNA, DNA methylation, and proteomics of developing murine alveoli isolated by laser capture microdissection at 14 predetermined time points. We developed a detailed comp...
Several recent studies focus on the inference of developmental and response trajectories from single cell RNA-Seq (scRNA-Seq) data. A number of computational methods, often referred to as pseudo-time ordering, have been developed for this task. Recently, CRISPR has also been used to reconstruct lineage trees by inserting random mutations. However,...
Cardiac differentiation of human pluripotent stem cells (hPSCs) requires orchestration of dynamic gene regulatory networks during stepwise fate transitions but often generates immature cell types that do not fully recapitulate properties of their adult counterparts, suggesting incomplete activation of key transcriptional networks. We performed exte...
Differentiation into diverse cell lineages requires the orchestration of gene regulatory networks guiding diverse cell fate choices. Utilizing human pluripotent stem cells, we measured expression dynamics of 17,718 genes from 43,168 cells across five time points over a thirty-day time course of in vitro cardiac-directed differentiation. Unsupervise...
The Dynamic Regulatory Events Miner (DREM) software reconstructs dynamic regulatory networks by integrating static protein-DNA interaction data with time series gene expression data. In recent years, several additional types of high-throughput time series data have been profiled when studying biological processes including time series miRNA express...
Supporting methods and results.
This file provides the detailed method description and also the supporting results.
(PDF)
An example of using single-cell RNA-seq data in iDREM.
(A) The single-cell RNA-seq data. (B) Cluster the cells into different sub-types based on the expression profile. (C) Identify the signature genes (marker genes) for each cell type. (D) Intersect the marker genes (of specific cell-type) with the predicted paths/nodes in iDREM model to identify...
Sankey Diagram for model II.
The Sankey Diagram shows the GO functions and regulators associated with each of the predicted paths.
(PDF)
Sankey Diagram for model III.
The Sankey Diagram shows the GO functions and regulators associated with each of the predicted paths.
(PDF)
Mouse microglia development time points used in this paper.
(PDF)
Supported regulating factors predicted by iDREM.
(PDF)
Regulator comparison for models using different sets of input data.
(PDF)
iDREM visualization configuration panels.
(A) Global config, which can be used to customize the visualizations (e.g. background color, node color, visualization size). (B) Regulator panel, which can be used to visualize the gene/miRNA expression. (C) Enrichment panel, which can be used to find the enriched paths/nodes in iDREM model for any given inp...
iDREM interactive visualization.
This figure shows the interactive visualization for the microglia data used in the study.
(PDF)
Sankey Diagram for model I.
The Sankey Diagram shows the GO functions and regulators associated with each of the predicted paths.
(PDF)
Sankey Diagram for model IV.
The Sankey Diagram shows the GO functions and regulators associated with each of the predicted paths.
(PDF)
Predicted paths for models I, II, III, IV.
I: miRNA and mRNA expression data only; II: the data used by I plus the time series proteomics data; III: the data used by I plus the time series methylation data; IV: all data presented in the study.
(PDF)
Top GO terms associated with each path.
(PDF)
Generating detailed and accurate organogenesis models using single cell RNA-seq data remains a major challenge. Current methods have relied primarily on the assumption that descendant cells are similar to their parents in terms of gene expression levels. These assumptions do not always hold for in-vivo studies which often include infrequently sampled...
Motivation: The identification of microRNA (miRNA) target sites is important. In the past decade, dozens of computational methods have been developed to predict miRNA target sites. Despite their existence, rarely does a method consider the well-known competition and cooperation among miRNAs when attempting to discover target sites. To fill this gap,...
Motivation: Profiling of genome-wide DNA methylation is now routinely performed when studying development, cancer and several other biological processes. Although whole-genome bisulfite sequencing provides high-quality methylation measurements at the resolution of nucleotides, it is relatively costly and so several studies have used alternative me...
Transcriptional and chromatin regulations mediate the liver response to nutrient availability. The role of chromatin factors involved in hormonal regulation in response to fasting is not fully understood. We have identified SETDB2, a glucocorticoid-induced putative epigenetic modifier, as a positive regulator of GR-mediated gene activation in liver...
Motivation: The identification of microRNA (miRNA) target sites is fundamentally important for studying gene regulation. There are dozens of computational methods available for miRNA target site prediction. Despite their existence, we still cannot reliably identify miRNA target sites, partially due to our limited understanding of the characteristi...
MicroRNAs (miRNAs) play critical roles in gene regulation. Although it is well known that multiple miRNAs may work as miRNA modules to synergistically regulate common target mRNAs, the understanding of miRNA modules is still in its infancy.
We employed the recently generated high throughput experimental data to study miRNA modules. We predicted 181...
The identification of transcription factor binding motifs is important for the study of gene transcriptional regulation. The chromatin immunoprecipitation (ChIP), followed by massive parallel sequencing (ChIP-seq) experiments, provides an unprecedented opportunity to discover binding motifs. Computational methods have been developed to identify mot...
The identification of nuclear-encoded chloroplast proteins is important for the understanding of their functions and their interaction in chloroplasts. Despite various endeavors in predicting these proteins, there is still room for developing novel computational methods for further improving the prediction accuracy. Here we developed a novel comput...
We have developed a novel approach called ChIPModule to systematically discover transcription factors and their cofactors from ChIP-seq data. Given a ChIP-seq dataset and the binding patterns of a large number of transcription factors, ChIPModule can efficiently identify groups of transcription factors, whose binding sites significantly co-occur in...
Chlamydomonas reinhardtii is one of the most important microalgae model organisms and has been widely studied toward the understanding of chloroplast functions and various cellular processes. Further exploitation of C. reinhardtii as a model system to elucidate various molecular mechanisms and pathways requires systematic study of gene regulation....
Chloroplasts play critical roles in land plant cells. Despite their importance and the availability of at least 200 sequenced chloroplast genomes, the number of known DNA regulatory sequences in chloroplast genomes is limited. In this paper, we designed computational methods to systematically study putative DNA regulatory sequences in intergenic r...
The identification of cis-regulatory modules (CRMs) can greatly advance our understanding of gene regulatory mechanisms. Despite the existence of binding sites of more than three transcription factors (TFs) in a CRM, studies in plants often consider only the cooccurrence of binding sites of one or two TFs. In addition, CRM studies in plants are lim...
For RFID tags, a Novel Tag Anti-collision Algorithm with Grouping (TAAG) is proposed. It divides tags into groups and adopts a deterministic method to identify tags within each group. TAAG estimates the total number of tags in the system from the group identification results and then adjusts the grouping method accordingly. The performance of the proposed TAAG alg...