Carlo De Donno’s research while affiliated with Technical University of Munich and other places


Publications (11)


Early life adversity shapes social subordination and cell type-specific transcriptomic patterning in the ventral hippocampus
  • Article
  • Full-text available

December 2023 · 132 Reads · 19 Citations

Science Advances

[...] · Alon Chen

Adverse events in early life can modulate the response to additional stressors later in life and increase the risk of developing psychiatric disorders. The underlying molecular mechanisms responsible for these effects remain unclear. Here, we uncover that early life adversity (ELA) in mice leads to social subordination. Using single-cell RNA sequencing (scRNA-seq), we identified cell type–specific changes in the transcriptional state of glutamatergic and GABAergic neurons in the ventral hippocampus of ELA mice after exposure to acute social stress in adulthood. These findings were reflected by an alteration in excitatory and inhibitory synaptic transmission induced by ELA in response to acute social stress. Finally, enhancing the inhibitory network function through transient diazepam treatment during an early developmental sensitive period reversed the ELA-induced social subordination. Collectively, this study significantly advances our understanding of the molecular, physiological, and behavioral alterations induced by ELA, uncovering a previously unknown cell type–specific vulnerability to ELA.


scPoli enables learning cell-level and sample-level representations
a, scPoli reference building: the model integrates different datasets and learns condition embeddings for each integrated study and a set of cell type prototypes. b, scPoli reference mapping: the model weights are frozen (in gray) and a new set of condition embeddings is added to the model. Cell type labels are transferred from the closest prototype in the latent space. Example of a standard workflow using scPoli on multiple pancreas datasets. c,d, Uniform manifold approximation and projection (UMAP) of the raw data to be integrated in a reference (13,093 cells), showing cell types (c) and studies (d) by color. e,f, Integrated reference data colored by cell type (e) and study (f). g, A total of 3,289 query cells (celseq and celseq2 studies) are projected onto the reference data in the reference mapping step. UMAPs show in color the query cells and in gray the reference cells. Reference cell type prototypes are shown in bigger circles with a black edge. Unlabeled prototypes are shown in bigger gray circles with black edges. The accuracy of the label transfer is 80%. h, Cells are colored by study of origin after reference mapping. The model achieves a mean integration score of 0.86. i, Outcome of the label transfer step from reference to query. j, PCA of the condition embeddings learned by scPoli.
scPoli reaches state-of-the-art performance on data integration and label transfer
a, Mean integration score obtained using the benchmarked models on different datasets. The bars on the right show the average results across datasets. b, Overall scores across datasets for biological conservation and batch correction performance of the benchmarked models. c, Weighted F1 scores achieved by each model when classifying query cells on the various datasets. d, Weighted F1 score and the overall integration score of the models capable of both data integration and label transfer. e, Macro-averaged F1 query classification scores achieved by each model on the various datasets. f, Macro-averaged F1 score and the overall integration score of the models capable of both data integration and label transfer.
scPoli performs interpretable integration and query-to-reference mapping on the HLCA
a,b, Uniform manifold approximation and projection (UMAP) of the integrated HLCA core after reference building; cells are color coded by their study of origin (a) and by cell type (b). DC, dendritic cell. EC, endothelial cell. NK, natural killer cell. c, Comparison of integration performance yielded by scPoli and scANVI. d,e, Visualization of the first two PCs obtained with a PCA of the sample embeddings learned from the reference data. Samples are color coded by their original study (d) and by sample type (e). f, UMAP of the joint query and reference datasets after query-to-reference mapping for a healthy query. Reference cells are shown in light gray, query cells are colored by the predicted cell type and unknown cells are shown in dark gray. Legend for the predicted cells is shared with b. Reference prototypes are shown as bigger dots with a black border, and are colored by cell type. g, UMAP of the integrated object with uncertainties in color. Reference cells are shown in gray. Labeled cell type prototypes are shown with bigger dots with a black border.
scPoli allows classification of disease state for unlabeled samples
a, Uniform manifold approximation and projection (UMAP) of Su et al. dataset after integration. Unlabeled cells are shown colored by the predicted cell type. Labeled cell type prototypes are shown with bigger dots with a black border. DC, dendritic cell. NK, natural killer cell. b, UMAP of the integrated dataset colored by patient. Data from 30 random samples out of 270 are shown to simplify the visualization. c, PCA of the labeled sample embeddings obtained with scPoli colored by disease state. d, Unlabeled sample embeddings are shown in PCA space colored by their predicted disease state on top of the reference sample embeddings in gray. e, Comparison of classification accuracy and F1 score obtained on scPoli embeddings and average gene and scVI latent expression vectors for each sample (pseudobulks). Data points are obtained from cross-validation (n = 5). Data are presented as mean values ± standard error of the mean.
scPoli sample embeddings capture technical variation and can guide data integration workflows
a,b, Uniform manifold approximation and projections (UMAPs) of Schulte-Schrepping et al. dataset consisting of healthy and COVID-19 PBMC samples after sample-level integration. Cells are colored by cell type (a) and experiment (b), respectively. DC, dendritic cell. NK, natural killer cell. HSPC, hematopoietic stem and progenitor cell. CTL, cytotoxic T cell. TCM, central memory T cell. TEM, effector memory T cell. ILC, innate lymphoid cell. MAIT, mucosal-associated invariant T cell. c–f, Sample embeddings obtained with scPoli colored by experiment (legend shared with b) (c), cohort (d) and disease (e on principal components 1 and 2 and f on principal components 2 and 3). For each covariate the association with the first and second PCs is displayed.


Population-level integration of single-cell datasets enables multi-scale analysis across samples

October 2023 · 152 Reads · 65 Citations

Nature Methods

The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open-world learner that incorporates generative models to learn sample and cell representations for data integration, label transfer and reference mapping. We applied scPoli on population-level atlases of lung and peripheral blood mononuclear cells, the latter consisting of 7.8 million cells across 2,375 samples. We demonstrate that scPoli can explain sample-level biological and technical variations using sample embeddings revealing genes associated with batch effects and biological effects. scPoli is further applicable to single-cell sequencing assay for transposase-accessible chromatin and cross-species datasets, offering insights into chromatin accessibility and comparative genomics. We envision scPoli becoming an important tool for population-level single-cell data integration facilitating atlas use but also interpretation by means of multi-scale analyses.


An integrated cell atlas of the lung in health and disease

June 2023 · 882 Reads · 403 Citations

Nature Medicine

Single-cell technologies have transformed our understanding of human tissues. Yet, studies typically capture only a limited number of donors and disagree on cell type definitions. Integrating many single-cell datasets can address these limitations of individual studies and capture the variability present in the population. Here we present the integrated Human Lung Cell Atlas (HLCA), combining 49 datasets of the human respiratory system into a single atlas spanning over 2.4 million cells from 486 individuals. The HLCA presents a consensus cell type re-annotation with matching marker genes, including annotations of rare and previously undescribed cell types. Leveraging the number and diversity of individuals in the HLCA, we identify gene modules that are associated with demographic covariates such as age, sex and body mass index, as well as gene modules changing expression along the proximal-to-distal axis of the bronchial tree. Mapping new data to the HLCA enables rapid data annotation and interpretation. Using the HLCA as a reference for the study of disease, we identify shared cell states across multiple lung diseases, including SPP1⁺ profibrotic monocyte-derived macrophages in COVID-19, pulmonary fibrosis and lung carcinoma. Overall, the HLCA serves as an example for the development and use of large-scale, cross-dataset organ atlases within the Human Cell Atlas.


Predicting cellular responses to complex perturbations in high-throughput screens

May 2023 · 96 Reads · 166 Citations

Molecular Systems Biology

Recent advances in multiplexed single-cell transcriptomics experiments facilitate the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally unfeasible. Therefore, computational methods are needed to predict, interpret, and prioritize perturbations. Here, we present the compositional perturbation autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA learns to in silico predict transcriptional perturbation response at the single-cell level for unseen dosages, cell types, time points, and species. Using newly generated single-cell drug combination data, we validate that CPA can predict unseen drug combinations while outperforming baseline models. Additionally, the architecture's modularity enables incorporating the chemical representation of the drugs, allowing the prediction of cellular response to completely unseen drugs. Furthermore, CPA is also applicable to genetic combinatorial screens. We demonstrate this by imputing in silico 5,329 missing combinations (97.6% of all possibilities) in a single-cell Perturb-seq experiment with diverse genetic interactions. We envision CPA will facilitate efficient experimental design and hypothesis generation by enabling in silico response prediction at the single-cell level and thus accelerate therapeutic applications using single-cell technologies.



Figure 1: scPoli enables learning cell-level and sample-level representations. (a) scPoli reference building: the model integrates different datasets and learns conditional embeddings for each integrated study and a set of cell type prototypes. (b) scPoli reference mapping: the model weights are frozen (in grey) and a new set of conditional embeddings is added to the model. Cell type labels are transferred from the closest prototype in the latent space. Example of a standard workflow using scPoli on multiple pancreas datasets. (c) UMAP of the raw data to be integrated in a reference (13,093 cells), showing cell types and (d) studies by color. (e, f) Integrated reference data. (g) 3,289 query cells (celseq and celseq2 studies) are projected onto the reference data in the reference mapping step. UMAPs show in color the query cells and in grey the reference cells. Reference cell type prototypes are shown in bigger circles with a black edge. The accuracy of the label transfer is 80%. (h) Cells are colored by study of origin after reference mapping. The model achieves a mean integration score of 0.86. (i) Outcome of the label transfer step from reference to query. (j) Principal component analysis (PCA) of the conditional embeddings learned by scPoli.
Figure 2: scPoli reaches state-of-the-art performance on data integration and label transfer. (a) Mean integration score obtained using the benchmarked models on different datasets. The barplots on the right show the average results across datasets. (b) Overall scores across datasets for biological conservation and batch correction performance of the benchmarked models. (c) Weighted F1 scores achieved by each model when classifying query cells on the various datasets. (d) Weighted F1 score and the overall integration score of the models capable of both data integration and label transfer. (e) Macro-averaged F1 query classification scores achieved by each model on the various datasets. (f) Macro-averaged F1 score and the overall integration score of the models capable of both data integration and label transfer.
Figure 5: scPoli sample embeddings capture technical variation and can guide data integration workflows. (a, b) UMAPs of Schulte-Schrepping et al. dataset consisting of healthy and COVID-19 PBMC samples after sample-level integration. Cells are colored by cell type and experiment, respectively. (c) Sample embeddings obtained with scPoli colored by experiment (legend shared with (b)), (d) cohort and (e, f) disease. For each covariate the association with the first and second principal components is displayed.
Population-level integration of single-cell datasets enables multi-scale analysis across samples

November 2022 · 414 Reads · 8 Citations

The increasing generation of population-level single-cell atlases with hundreds or thousands of samples has the potential to link demographic and technical metadata with high-resolution cellular and tissue data in homeostasis and disease. Constructing such comprehensive references requires large-scale integration of heterogeneous cohorts with varying metadata capturing demographic and technical information. Here, we present single-cell population level integration (scPoli), a semi-supervised conditional deep generative model for data integration, label transfer and query-to-reference mapping. Unlike other models, scPoli learns both sample and cell representations, is aware of cell-type annotations and can integrate and annotate newly generated query datasets while providing an uncertainty mechanism to identify unknown populations. We extensively evaluated the method and showed its advantages over existing approaches. We applied scPoli to two population-level atlases of lung and peripheral blood mononuclear cells (PBMCs), the latter consisting of roughly 8 million cells across 2,375 samples. We demonstrate that scPoli allows atlas-level integration and automatic reference mapping with label transfer. It can explain sample-level biological and technical variations such as disease, anatomical location and assay by means of its novel sample embeddings. We use these embeddings to explore sample-level metadata, enable automatic sample classification and guide a data integration workflow. scPoli also enables simultaneous sample-level and cell-level analysis of gene expression patterns, revealing genes associated with batch effects and the main axes of between-sample variation. We envision scPoli becoming an important tool for population-level single-cell data integration facilitating atlas use but also interpretation by means of multi-scale analyses.
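The label transfer and uncertainty mechanism described in this abstract can be illustrated with a minimal sketch. It assumes that each cell type prototype is simply the mean latent vector of its reference cells, that distance is Euclidean, and that a fixed cutoff flags unknown cells. The function names, the `max_distance` threshold and the toy data are hypothetical and are not taken from scPoli's implementation or API.

```python
import numpy as np


def build_prototypes(latent, labels):
    """Return (names, centroids): one mean latent vector per cell type."""
    labels = np.asarray(labels)
    names = sorted(set(labels.tolist()))
    centroids = np.stack([latent[labels == name].mean(axis=0) for name in names])
    return names, centroids


def transfer_labels(query_latent, names, centroids, max_distance):
    """Label each query cell with its closest prototype.

    The distance to the closest prototype doubles as an uncertainty score;
    cells farther away than `max_distance` (a hypothetical cutoff) are
    reported as "unknown".
    """
    # Pairwise Euclidean distances, shape (n_query, n_prototypes).
    diffs = query_latent[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    nearest = dists.argmin(axis=1)
    uncertainty = dists.min(axis=1)
    predicted = [
        names[i] if d <= max_distance else "unknown"
        for i, d in zip(nearest, uncertainty)
    ]
    return predicted, uncertainty


# Toy 2-D latent space standing in for a learned embedding.
reference_latent = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
reference_labels = ["alpha", "alpha", "beta", "beta"]
names, centroids = build_prototypes(reference_latent, reference_labels)

query_latent = np.array([[0.1, 0.1], [4.9, 5.0], [20.0, -20.0]])
predicted, uncertainty = transfer_labels(
    query_latent, names, centroids, max_distance=1.0
)
print(predicted)
```

In this toy setup the third query cell lies far from both prototypes, so it is flagged as unknown rather than forced into one of the reference cell types.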


Predicting Cellular Responses with Variational Causal Inference and Refined Relational Information

September 2022 · 18 Reads · 1 Citation

Predicting the responses of a cell under perturbations may bring important benefits to drug discovery and personalized therapeutics. In this work, we propose a novel graph variational Bayesian causal inference framework to predict a cell's gene expressions under counterfactual perturbations (perturbations that this cell did not factually receive), leveraging information representing biological knowledge in the form of gene regulatory networks (GRNs) to aid individualized cellular response predictions. Aiming at a data-adaptive GRN, we also developed an adjacency matrix updating technique for graph convolutional networks and used it to refine GRNs during pre-training, which generated more insights on gene relations and enhanced model performance. Additionally, we propose a robust estimator within our framework for the asymptotically efficient estimation of marginal perturbation effect, which is yet to be carried out in previous works. With extensive experiments, we exhibited the advantage of our approach over state-of-the-art deep learning models for individual response prediction.


Ketamine exerts its sustained antidepressant effects via cell-type-specific regulation of Kcnq2

May 2022 · 251 Reads · 61 Citations

Neuron

A single sub-anesthetic dose of ketamine produces a rapid and sustained antidepressant response, yet the molecular mechanisms responsible for this remain unclear. Here, we identified cell-type-specific transcriptional signatures associated with a sustained ketamine response in mice. Most interestingly, we identified the Kcnq2 gene as an important downstream regulator of ketamine action in glutamatergic neurons of the ventral hippocampus. We validated these findings through a series of complementary molecular, electrophysiological, cellular, pharmacological, behavioral, and functional experiments. We demonstrated that adjunctive treatment with retigabine, a KCNQ activator, augments ketamine’s antidepressant-like effects in mice. Intriguingly, these effects are ketamine specific, as they do not modulate a response to classical antidepressants, such as escitalopram. These findings significantly advance our understanding of the mechanisms underlying the sustained antidepressant effects of ketamine, with important clinical implications.


Figure 1. Human Lung Cell Atlas study overview. Harmonized cell annotations, raw count data, harmonized patient and sample metadata, and sample anatomical locations encoded into a common coordinate framework were collected and generated as input for the Human Lung Cell Atlas (HLCA) core (left). After integration of the core datasets, the atlas was extended by mapping 34 additional datasets, including disease samples, to the HLCA core, bringing the total number of cells in the extended HLCA to 2.2 million. The HLCA core provides detailed consensus cell annotations with matched consensus cell type markers (right, top), gene modules associated with technical, demographic, and anatomical covariates in various cellular identities (right, middle), GWAS-based association of lung conditions with cell types (right, middle), a reference projection model to annotate new data (right, middle) and discover new cell types, transitional cell states, and disease-associated cell states (right, bottom).
An integrated cell atlas of the human lung in health and disease

March 2022 · 370 Reads

Organ- and body-scale cell atlases have the potential to transform our understanding of human biology. To capture the variability present in the population, these atlases must include diverse demographics such as age and ethnicity from both healthy and diseased individuals. The growth in both size and number of single-cell datasets, combined with recent advances in computational techniques, for the first time makes it possible to generate such comprehensive large-scale atlases through integration of multiple datasets. Here, we present the integrated Human Lung Cell Atlas (HLCA) combining 46 datasets of the human respiratory system into a single atlas spanning over 2.2 million cells from 444 individuals across health and disease. The HLCA contains a consensus re-annotation of published and newly generated datasets, resolving under- or misannotation of 59% of cells in the original datasets. The HLCA enables recovery of rare cell types, provides consensus marker genes for each cell type, and uncovers gene modules associated with demographic covariates and anatomical location within the respiratory system. To facilitate the use of the HLCA as a reference for single-cell lung research and allow rapid analysis of new data, we provide an interactive web portal to project datasets onto the HLCA. Finally, we demonstrate the value of the HLCA reference for interpreting disease-associated changes. Thus, the HLCA outlines a roadmap for the development and use of organ-scale cell atlases within the Human Cell Atlas.


Figure 1 | Interpretable single-cell perturbation modeling using a compositional perturbation autoencoder (CPA). (a) Given a matrix of gene expression per cell together with annotated, potentially quantitative perturbations d and other covariates such as cell line, patient or species, CPA learns the combined perturbation response for a single cell. It encodes the gene expression using a neural network into a lower-dimensional latent space that is eventually decoded back to an approximate gene expression vector, aimed to be as close as possible to the original one. To make the latent space interpretable in terms of perturbations and covariates, the encoded gene expression vector is first mapped to a 'basal state' by feeding the signal to discriminators to remove any signal from perturbations and covariates. The basal state is then composed with perturbations and covariates (with potentially reweighted dosages) to reconstruct the gene expression. All encoder, decoder and discriminator weights as well as the perturbation and covariate dictionaries are learned during training. (b) Features of CPA are interpreted via plotting of the two learned dictionaries, interpolating covariate-specific dose response curves and predicting novel unseen drug combinations.
Compositional perturbation autoencoder for single-cell response modeling

April 2021 · 437 Reads · 21 Citations

Recent advances in multiplexing single-cell transcriptomics across experiments are enabling the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally unfeasible, so computational methods are needed to predict, interpret and prioritize perturbations. Here, we present the Compositional Perturbation Autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA encodes and learns transcriptional drug response across different cell types, doses, and drug combinations. The model produces easy-to-interpret embeddings for drugs and cell types, allowing drug similarity analysis and predictions for unseen dosages and drug combinations. We show CPA accurately models single-cell perturbations across compounds, dosages, species, and time. We further demonstrate that CPA predicts combinatorial genetic interactions of several types, implying it captures features that distinguish different interaction programs. Finally, we demonstrate CPA allows in silico generation of 5,329 missing combinations (97.6% of all possibilities) with diverse genetic interactions. We envision our model will facilitate efficient experimental design by enabling in silico response prediction at the single-cell level.
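The composition of a basal state with drug and covariate embeddings can be sketched as follows. This is an illustration only: it assumes the composition is additive in latent space, uses random vectors in place of learned embeddings, and uses `log1p` as a stand-in for the dosage reweighting. The names `drug_embeddings`, `covariate_embeddings`, `dose_scaler` and `compose_latent` are hypothetical and do not come from the CPA codebase; the encoder, decoder and discriminators are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 4

# Random stand-ins for the learned perturbation and covariate dictionaries.
drug_embeddings = {
    "drug_a": rng.normal(size=latent_dim),
    "drug_b": rng.normal(size=latent_dim),
}
covariate_embeddings = {"cell_line_1": rng.normal(size=latent_dim)}


def dose_scaler(dose):
    """Hypothetical monotone dose reweighting; zero dose contributes nothing."""
    return np.log1p(dose)


def compose_latent(basal, perturbations, covariate):
    """Add dose-scaled drug embeddings and a covariate embedding to a basal state.

    `perturbations` maps drug name -> dose. Additive composition is an
    assumption made for this sketch.
    """
    z = basal.copy()
    for drug, dose in perturbations.items():
        z = z + dose_scaler(dose) * drug_embeddings[drug]
    return z + covariate_embeddings[covariate]


basal_state = rng.normal(size=latent_dim)

# Each drug alone, then a combination assembled from the same dictionaries.
z_a = compose_latent(basal_state, {"drug_a": 1.0}, "cell_line_1")
z_b = compose_latent(basal_state, {"drug_b": 0.5}, "cell_line_1")
z_ab = compose_latent(basal_state, {"drug_a": 1.0, "drug_b": 0.5}, "cell_line_1")
z_control = compose_latent(basal_state, {}, "cell_line_1")
```

Because each drug contributes an independent term, a combination that never appeared together in training can still be assembled from the individual embeddings, and each drug's embedding can be inspected on its own.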


Citations (8)


... Neurologically, researchers studying ELA note reductions in cortical thickness, the volume of gray matter, total brain volume, and reduced volumes in the prefrontal cortex, hippocampus, and amygdala (Adedayo et al., 2023; Arnold, 2012; Kos et al., 2023). These highly connected areas of the brain function to process short and long-term memory while also processing emotional reactions. ...

Reference:

The Archeology of Adoption: Tracing the Journey from Birth Through Adoption Using Pre-Adoptive Artifacts
Early life adversity shapes social subordination and cell type-specific transcriptomic patterning in the ventral hippocampus

Science Advances

... A total of 14 scRNA-seq datasets were retrieved from multiple studies from the pancreas and PBMC tissues. The batch-wise concatenated matrix was obtained from De Donno et al. [37]. The integrated pancreas dataset was comprised of 16 382 cells and 13 cell types. ...

Population-level integration of single-cell datasets enables multi-scale analysis across samples

Nature Methods

... The cellular architecture of the human lung is now mapped with remarkable precision, based upon multiple single cell RNA-seq analyses (Montoro et al. 2018; Plasschaert et al. 2018; Vieira Braga et al. 2019; Deprez et al. 2020; Travaglini et al. 2020; Sikkema et al. 2023) and the combined efforts of several international consortia including the Human Cell Atlas (https://www.humancellatlas. ...

An integrated cell atlas of the lung in health and disease

Nature Medicine

... A cell's gene expression profile is a function of multiple attributes such as tissue of origin, its surrounding microenvironment, biometric factors of the donor, and experimental technical variables, as well as clinical variables such as treatments or infections. Single-cell sequencing has enabled profiling gene expression at single-cell resolution in a high-throughput manner 1,2, but it still remains challenging to analyze and interpret millions of cells at the same time due to the simultaneous and intertwined effects of the aforementioned attributes on gene expression, which can be hard to disentangle 3. Such a disentangled representation of single-cell data would be more interpretable and would facilitate understanding of biological mechanisms 4,5, and if combined with causal modeling, would also allow for better counterfactual predictions 6. ...

Predicting cellular responses to complex perturbations in high-throughput screens
  • Citing Article
  • May 2023

Molecular Systems Biology

... Furthermore, the latent space of VAE models serves as a continuous, compact approximation of the underlying distribution of growth curves which can be used for downstream machine learning tasks, including generative tasks such as predicting growth dynamics from parameter sets or initial conditions. Already, exploiting the "latent structure" of datasets through such regularized autoencoder representations is the critical first stage for most modern ML techniques spanning audio processing, single-cell multi-omics, and computer vision [29–36]. Here we demonstrated that these representation learning methods hold similar promise for the study and engineering of microbial community dynamics. ...

Population-level integration of single-cell datasets enables multi-scale analysis across samples

... These markers have been reported to be involved in (2R,6R)-HNK's mechanisms of behavioural actions (Fukumoto et al., 2019; Lumsden et al., 2019; Zanos et al., 2016, 2023). While the hippocampus has traditionally been perceived primarily as a structure responsible for cognition/memory, an increasing body of evidence suggests that it also plays a role in regulating emotional, reward behaviours and stress responses (Fanselow & Dong, 2010; LeGates et al., 2018). The choice of the ventral over dorsal hippocampus stems from evidence demonstrating that activation of the ventral, but not dorsal, hippocampus can bidirectionally regulate depressive-like behaviours in response to ketamine (Lopez et al., 2022; Rawat et al., 2022; Yamada & Jinno, 2019). ...

Ketamine exerts its sustained antidepressant effects via cell-type-specific regulation of Kcnq2
  • Citing Article
  • May 2022

Neuron

... Understanding cellular responses to perturbations is crucial for biomedical applications and drug design, as it helps identify gene-gene interactions across different cell types and potential drug targets 50. Using Perturb-seq 51,52 data resources to train models for modeling cellular response to perturbations is a key task of computational biology [53–55]. We combined the scFoundation with an advanced model called GEARS 53 for predicting the single-cell-resolution perturbation. ...

Compositional perturbation autoencoder for single-cell response modeling

... Single cell RNA profiling has been performed to distinguish different hypothalamic neural populations (Chen et al., 2017;Romanov et al., 2017;Steuernagel et al., 2022). Studies have also revealed the molecular profile of PVN neurons (Lewis et al., 2020;Lopez et al., 2021;Romanov et al., 2015;Short et al., 2023;Son et al., 2021;Xu et al., 2020). These studies generally identify the major neural classes noted above. ...

Single-cell molecular profiling of all three components of the HPA axis reveals adrenal ABCB1 as a regulator of stress adaptation

Science Advances