Fabian J Theis

Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH) | HZM · Institute of Computational Biology

Dr.

1,002
Publications
222,365
Reads
56,901
Citations
Additional affiliations
May 2013 - present
Technical University of Munich
Position
  • Professor
January 2012 - present
Tel Aviv University
Position
  • Tel-Aviv University
January 2009 - December 2012
Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)

Publications (1,002)
Preprint
Full-text available
Single-cell data generation techniques have provided valuable insights into the intricate nature of cellular heterogeneity. However, effectively unraveling subtle variations within a specific gene set of interest, while mitigating the confounding presence of higher-order variability, remains challenging. To address this, we propose scARE, a novel e...
Preprint
Full-text available
Tissue phenotypes such as metabolic states, inflammation, and tumor properties are functions of molecular states of cells that constitute the tissue. Recent spatial molecular profiling assays measure tissue architecture motifs in a molecular and often unbiased way and thus can explain some aspects of emergence of these phenotypes. Here, we characte...
Preprint
Full-text available
Single-cell proteomics aims to characterize biological function and heterogeneity at the level of proteins in an unbiased manner. It is currently limited in proteomic depth, throughput and robustness, a challenge that we address here by a streamlined multiplexed workflow using data-independent acquisition (mDIA). We demonstrate automated and comple...
Preprint
Full-text available
The increasing generation of population-level single-cell atlases with hundreds or thousands of samples has the potential to link demographic and technical metadata with high-resolution cellular and tissue data in homeostasis and disease. Constructing such comprehensive references requires large-scale integration of heterogeneous cohorts with varyi...
Article
Full-text available
Background: Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view of biological systems, and their integration is crucial for gaining insights into complex regulatory processes. These data can be used to explain disease-related genetic variants by linking them to intermediate...
Article
Full-text available
Models of intercellular communication in tissues are based on molecular profiles of dissociated cells, are limited to receptor–ligand signaling and ignore spatial proximity in situ. We present node-centric expression modeling, a method based on graph neural networks that estimates the effects of niche composition on gene expression in an unbiased m...
Preprint
Full-text available
Therapeutic antibodies are widely used to treat severe diseases. Most of them alter immune cells and act within the immunological synapse; an essential cell-to-cell interaction to direct the humoral immune response. Although many antibody designs are generated and evaluated, a high-throughput tool for systematic antibody characterization and predic...
Preprint
Full-text available
Flow and mass cytometry data are commonly analyzed via manual gating strategies, which require prior knowledge, expertise and time. With increasingly complex experiments with many parameters and samples, traditional manual flow and mass cytometry data analysis becomes cumbersome if not inefficient. At the same time, computational tools developed fo...
Chapter
Optical coherence tomography (OCT) imaging from different camera devices causes challenging domain shifts and can cause a severe drop in accuracy for machine learning models. In this work, we introduce a minimal noise adaptation method based on a singular value decomposition (SVDNA) to overcome the domain gap between target domains from three diffe...
Preprint
Full-text available
Optical coherence tomography (OCT) imaging from different camera devices causes challenging domain shifts and can cause a severe drop in accuracy for machine learning models. In this work, we introduce a minimal noise adaptation method based on a singular value decomposition (SVDNA) to overcome the domain gap between target domains from three diffe...
Article
Full-text available
Despite the therapeutic promise of direct reprogramming, basic principles concerning fate erasure and the mechanisms to resolve cell identity conflicts remain unclear. To tackle these fundamental questions, we established a single-cell protocol for the simultaneous analysis of multiple cell fate conversion events based on combinatorial and traceabl...
Article
Full-text available
Background: Despite the well-known detrimental effects of cigarette smoke (CS), little is known about the complex gene expression dynamics in the early stages after exposure. This study aims to investigate early transcriptomic responses following CS exposure of airway epithelial cells in culture and compare these to those found in human CS exposure...
Preprint
Full-text available
Targeted spatial transcriptomics methods capture the topology of cell types and states in tissues at single-cell and subcellular resolution by measuring the expression of a predefined set of genes. The selection of an optimal set of probed genes is crucial for capturing and interpreting the spatial signals present in a tissue. However, current sel...
Article
Full-text available
Parkinson’s disease (PD) as a progressive neurodegenerative disorder arises from multiple genetic and environmental factors. However, underlying pathological mechanisms remain poorly understood. Using multiplexed single-cell transcriptomics, we analyze human neural precursor cells (hNPCs) from sporadic PD (sPD) patients. Alterations in gene express...
Preprint
Full-text available
RNA velocity has been rapidly adopted to guide the interpretation of transcriptional dynamics in snapshot single-cell transcriptomics data. Current approaches for estimating and analyzing RNA velocity can empirically reveal complex dynamics but lack effective strategies for quantifying the uncertainty of the estimate and its overall applicability t...
Preprint
Full-text available
In psychiatric disorders, common and rare genetic variants cause widespread dysfunction of cells and their interactions, especially in the prefrontal cortex, giving rise to psychiatric symptoms. To better understand these processes, we traced the effects of common and rare genetics, and cumulative disease risk scores, to their molecular footprints...
Article
Full-text available
Pathogenic variants in genes that cause dilated cardiomyopathy (DCM) and arrhythmogenic cardiomyopathy (ACM) convey high risks for the development of heart failure through unknown mechanisms. Using single-nucleus RNA sequencing, we characterized the transcriptome of 880,000 nuclei from 18 control and 61 failing, nonischemic human hearts with pathog...
Article
Full-text available
During pancreas development endocrine cells leave the ductal epithelium to form the islets of Langerhans, but the morphogenetic mechanisms are incompletely understood. Here, we identify the Ca2+-independent atypical Synaptotagmin-13 (Syt13) as a key regulator of endocrine cell egression and islet formation. We detect specific upregulation of the Sy...
Article
Full-text available
The adult zebrafish heart has a high capacity for regeneration following injury. However, the composition of the regenerative niche has remained largely elusive. Here, we dissected the diversity of activated cell states in the regenerating zebrafish heart based on single-cell transcriptomics and spatiotemporal analysis. We observed the emergence of...
Preprint
Full-text available
Fibroblast to myofibroblast conversion is a major driver of tissue remodeling in organ fibrosis. Several distinct lineages of fibroblasts support homeostatic tissue niche functions, yet, specific activation states and phenotypic trajectories of fibroblasts during injury and repair have remained unclear. Here, we combined spatial transcriptomics, lo...
Preprint
Full-text available
Single-cell multimodal profiling provides a high-resolution view of cellular information. Recently, multimodal profiling approaches have been coupled with CRISPR technologies to perform pooled screens of single or combinatorial perturbations. This opens the possibility of exploring the massive space of combinatorial perturbations and their regulato...
Preprint
Single-cell genomics is now producing an ever-increasing number of datasets that, when integrated, could provide large-scale reference atlases of tissue in health as well as disease. Such large-scale atlases not only increase the scale and generalizability of analyses but also enable combining the knowledge generated by individual studies. Specific...
Preprint
Purpose: To determine real-life quantitative changes in OCT biomarkers in a large set of treatment-naive patients undergoing anti-VEGF therapy. For this purpose, we devised a novel deep learning-based semantic segmentation algorithm providing, to the best of our knowledge, the first benchmark results for automatic segmentation of 11 OCT features inc...
Article
Full-text available
Plasmacytoid and conventional dendritic cells (pDC and cDC) are generated from progenitor cells in the bone marrow and commitment to pDCs or cDC subtypes may occur in earlier and later progenitor stages. Cells within the CD11c+MHCII−/loSiglec-H+CCR9lo DC precursor fraction of the mouse bone marrow generate both pDCs and cDCs. Here we investigate th...
Preprint
Highly specific and efficient drugs have been developed during the last two decades to treat non-communicable chronic inflammatory skin diseases (ncISD). Due to their specificity, these drugs call for precise diagnostic measures to assign the most effective treatment to each patient. Diagnosis, however, is complicated by the complex pathog...
Preprint
Full-text available
Highly multiplexed quantitative subcellular imaging holds enormous promise for understanding how spatial context shapes the activity of our genome and its products at multiple scales. Yet unbiased analysis of subcellular organisation across experimental conditions remains challenging, because differences in molecular profiles between conditions con...
Preprint
Full-text available
Single-cell ATAC-sequencing (scATAC-seq) coverage in regulatory regions is typically binarized as an indicator of open chromatin. However, the implications of scATAC-seq data binarization have not systematically been assessed. Here, we show that the goodness-of-fit of existing models and their applications, including clustering, cell type identific...
Article
Full-text available
Disease recovery dynamics are often difficult to assess, as patients display heterogeneous recovery courses. To model recovery dynamics, exemplified by severe COVID-19, we apply a computational scheme on longitudinally sampled blood transcriptomes, generating recovery states, which we then link to cellular and molecular mechanisms, presenting a fra...
Article
A single sub-anesthetic dose of ketamine produces a rapid and sustained antidepressant response, yet the molecular mechanisms responsible for this remain unclear. Here, we identified cell-type-specific transcriptional signatures associated with a sustained ketamine response in mice. Most interestingly, we identified the Kcnq2 gene as an important d...
Article
Full-text available
Direct reprogramming based on genetic factors represents a promising strategy to replace lost cells in degenerative diseases such as Parkinson's disease. For this, we developed a knock-in mouse line carrying a dual dCas9 transactivator system (dCAM) allowing the conditional in vivo activation of endogenous genes. To enable a translational applicatio...
Preprint
Full-text available
Single-cell multimodal omics technologies provide a holistic approach to study cellular decision making. Yet, learning from multimodal data is complicated because of missing and incomplete reference samples, nonoverlapping features and batch effects between datasets. To integrate and provide a unified view of multi-modal datasets, we propose Multig...
Preprint
Full-text available
Organ- and body-scale cell atlases have the potential to transform our understanding of human biology. To capture the variability present in the population, these atlases must include diverse demographics such as age and ethnicity from both healthy and diseased individuals. The growth in both size and number of single-cell datasets, combined with r...
Article
Full-text available
Single-cell technologies are revolutionizing biology but are today mainly limited to imaging and deep sequencing. However, proteins are the main drivers of cellular function and in-depth characterization of individual cells by mass spectrometry (MS)-based proteomics would thus be highly valuable and complementary. Here, we develop a robust workflow...
Preprint
Full-text available
The increasing availability of large-scale single-cell datasets has enabled the detailed description of cell states across multiple biological conditions and perturbations. In parallel, recent advances in unsupervised machine learning, particularly in transfer learning, have enabled fast and scalable mapping of these new single-cell datasets onto r...
Article
Full-text available
Computational trajectory inference enables the reconstruction of cell state dynamics from single-cell RNA sequencing experiments. However, trajectory inference requires that the direction of a biological process is known, largely limiting its application to differentiating systems in normal development. Here, we present CellRank ( https://cellrank....
Article
Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. To maximize the biological insights obtained using these techniques, it is critical to both clearly articulate the key biological questions in spatial analy...
Article
Full-text available
The Human Cell Atlas (HCA) consortium aims to establish an atlas of all organs in the healthy human body at single-cell resolution to increase our understanding of basic biological processes that govern development, physiology and anatomy, and to accelerate diagnosis and treatment of disease. The lung biological network of the HCA aims to generate...
Article
Full-text available
Maldevelopment of the pharyngeal endoderm, an embryonic tissue critical for patterning of the pharyngeal region and ensuing organogenesis, ultimately contributes to several classes of human developmental syndromes and disorders. Such syndromes are characterized by a spectrum of phenotypes that currently cannot be fully explained by known mutations...
Article
Full-text available
Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query data...
Article
Full-text available
Single-cell atlases often include samples that span locations, laboratories and conditions, leading to complex, nested batch effects in data. Thus, joint analysis of atlas datasets requires reliable data integration. To guide integration method choice, we benchmarked 68 method and preprocessing combinations on 85 batches of gene expression, chromat...
Preprint
Full-text available
The meninges of the brain are an important component of neuroinflammatory response. Diverse immune cells move from the calvaria marrow into the dura mater via recently discovered skull-meninges connections (SMCs). However, how the calvaria bone marrow is different from the other bones and whether and how it contributes to human diseases remain unkn...
Preprint
Full-text available
anndata is a Python package for handling annotated data matrices in memory and on disk (github.com/theislab/anndata), positioned between pandas and xarray. anndata offers a broad range of computationally efficient features including, among others, sparse data support, lazy operations, and a PyTorch interface. Statement of need: Generating insight...
Article
Full-text available
Single-cell RNA-seq has revolutionized transcriptomics by providing cell type resolution for differential gene expression and expression quantitative trait loci (eQTL) analyses. However, efficient power analysis methods for single-cell data and inter-individual comparisons are lacking. Here, we present scPower, a statistical framework for the desig...
Preprint
Full-text available
Spatial molecular profiling of complex tissues is essential to investigate cellular function in physiological and pathological states. However, methods for molecular analysis of biological specimens imaged in 3D as a whole are lacking. Here, we present DISCO-MS, a technology combining whole-organ imaging, deep learning-based image analysis, and ult...
Article
Full-text available
Spatial molecular profiling of complex tissues is essential to investigate cellular function in physiological and pathological states. However, methods for molecular analysis of biological specimens imaged in 3D as a whole are lacking. Here, we present DISCO-MS, a technology combining whole-organ imaging, deep learning-based image analysis, and ult...
Article
Full-text available
Background: Single-cell metabolic studies bring new insights into cellular function, which can often not be captured on other omics layers. Metabolic information has wide applicability, such as for the study of cellular heterogeneity or for the understanding of drug mechanisms and biomarker development. However, metabolic measurements on single-cell...
Article
COVID-19-induced ‘acute respiratory distress syndrome’ (ARDS) is associated with prolonged respiratory failure and high mortality, but the mechanistic basis of lung injury remains incompletely understood. Here, we analyzed pulmonary immune responses and lung pathology in two cohorts of patients with COVID-19 ARDS using functional single cell genomi...
Article
Full-text available
Recent years have seen a revolution in single-cell RNA-sequencing (scRNA-seq) technologies, datasets, and analysis methods. Since 2016, the scRNA-tools database has cataloged software tools for analyzing scRNA-seq data. With the number of tools in the database passing 1000, we provide an update on the state of the project and the field. This data s...
Article
Full-text available
Objective: A fine-tuned balance of glucocorticoid receptor (GR) activation is essential for organ formation, with disturbances influencing many health outcomes. In utero, glucocorticoids have been linked to brain-related negative outcomes, with unclear underlying mechanisms, especially regarding cell-type-specific effects. An in vitro model of fet...
Preprint
Full-text available
Neuroinflammation after stroke is characterized by the activation of resident microglia and the invasion of circulating leukocytes into the brain. Although lymphocytes infiltrate the brain in small numbers, they have been consistently demonstrated to be the most potent leukocyte subpopulation contributing to secondary inflammatory brain injury. Howe...
Preprint
Full-text available
Learning robust representations can help uncover underlying biological variation in scRNA-seq data. Disentangled representation learning is one approach to obtain such informative as well as interpretable representations. Here, we learn disentangled representations of scRNA-seq data using β-variational autoencoder (β-VAE) and apply the model for out-o...
Article
Full-text available
EpiScanpy is a toolkit for the analysis of single-cell epigenomic data, namely single-cell DNA methylation and single-cell ATAC-seq data. To address the modality specific challenges from epigenomics data, epiScanpy quantifies the epigenome using multiple feature space constructions and builds a nearest neighbour graph using epigenomic distance betw...
Article
Full-text available
Objective: The effectiveness of bariatric surgery in restoring β-cell function has been described in type-2 diabetes (T2D) patients and animal models for years, whereas the mechanistic underpinnings are largely unknown. The possibility of vertical sleeve gastrectomy (VSG) to rescue far-progressed, clinically-relevant T2D and to promote β-cell reco...
Article
Full-text available
Excess nutrient uptake and altered hormone secretion in the gut contribute to a systemic energy imbalance, which causes obesity and an increased risk of type 2 diabetes and colorectal cancer. This functional maladaptation is thought to emerge at the level of the intestinal stem cells (ISCs). However, it is not clear how an obesogenic diet affects I...
Preprint
Full-text available
Epithelial cell egression is important for organ development, but also drives cancer metastasis. A better understanding of the pancreatic epithelial morphogenetic programs that generate the islets of Langerhans could aid diabetes therapy. Here we identify the Ca2+-independent atypical Synaptotagmin 13 (Syt13) as a key driver of endocrine cell egression and islet...
Article
Full-text available
Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained model...