About
Publications: 132
Reads: 15,393
Citations: 2,963
Additional affiliations
November 2015 - present
October 2004 - March 2008
September 2014 - October 2015
Publications (132)
Background
Single-cell sequencing can provide novel insights into the understanding and treatment of diseases. In cancer, for example, intratumor heterogeneity is a major cause of treatment resistance and relapse. Although technological progress has substantially increased the throughput of sequenced cells, single-cell sequencing remains cost and l...
Myelodysplastic neoplasia (MDS) and acute myeloid leukemia (AML) share common clinical and genetic features. With increasing knowledge on genomic drivers, morphology-based definitions are being increasingly replaced by molecular definitions in the classification systems of both AML and MDS. Due to similarities of genetic drivers, a growing body of...
Deep single-cell multi-omic profiling offers a promising approach to understand and overcome drug resistance in relapsed or refractory (rr) acute myeloid leukemia (AML). Here, we combine single-cell ex vivo drug profiling (pharmacoscopy) with single-cell and bulk DNA, RNA, and protein analyses, alongside clinical data from 21 rrAML patients. Unsupe...
Motivation
Multimodal profiling strategies promise to produce more informative insights into biomedical cohorts via the integration of the information each modality contributes. To perform this integration, however, the development of novel analytical strategies is needed. Multimodal profiling strategies often come at the expense of lower sample nu...
Causal Bayesian networks are widely used tools for summarising the dependencies between variables and elucidating their putative causal relationships. Learning networks from data is computationally hard in general. The current state-of-the-art approaches for exact causal discovery are integer linear programming over the underlying space of directed...
Acute myeloid leukemia (AML) has a poor prognosis and a heterogeneous mutation landscape. Although common mutations are well-studied, little research has characterized how the sequence of mutations relates to clinical features. Using published, single-cell DNA sequencing data from three institutions, we compared clonal evolution patterns in AML to...
Understanding the complex background of cancer requires genotype-phenotype information in single-cell resolution. Here, we perform long-read single-cell RNA sequencing (scRNA-seq) on clinical samples from three ovarian cancer patients presenting with omental metastasis and increase the PacBio sequencing depth to 12,000 reads per cell. Our approach...
Myeloid malignancies exhibit considerable heterogeneity with overlapping clinical and genetic features among different subtypes. Current classification schemes, predominantly based on clinical features, fall short of capturing the complex genomic landscapes of these malignancies. Here, we present a data-driven approach that integrates mutational fe...
The swift advancements in single-cell DNA sequencing (scDNA-seq) have enabled quantitative assessment of genetic content in individual cells, allowing downstream analyses at the single-cell resolution. This technology considerably facilitates cancer research, yet its underlying power has not been fully exploited. Specifically, computational methods...
Reconstructing the history of somatic DNA alterations can help understand the evolution of a tumor and predict its resistance to treatment. Single-cell DNA sequencing (scDNAseq) can be used to investigate clonal heterogeneity and to inform phylogeny reconstruction. However, most existing phylogenetic methods for scDNAseq data are designed either fo...
Cell lineages accumulate somatic mutations during organismal development, potentially leading to pathological states. The rate of somatic evolution within a cell population can vary due to multiple factors, including selection, a change in the mutation rate, or differences in the microenvironment. Here, we developed a statistical test called the Po...
Background:
Sexual abuse and bullying are associated with poor mental health in adulthood. We previously established a clear relationship between bullying and symptoms of psychosis. Similarly, we would expect sexual abuse to be linked to the emergence of psychotic symptoms, through effects on negative affect.
Method:
We analysed English data fro...
Cancer progression is an evolutionary process shaped by both deterministic and stochastic forces. Multi-region and single-cell sequencing of tumors enable high-resolution reconstruction of the mutational history of each tumor and highlight the extensive diversity across tumors and patients. Resolving the interactions among mutations and recovering...
Gaussian Process Networks (GPNs) are a class of directed graphical models which employ Gaussian processes as priors for the conditional expectation of each variable given its parents in the network. The model allows describing continuous joint distributions in a compact but flexible manner with minimal parametric assumptions on the dependencies bet...
Identifying cell types based on expression profiles is a pillar of single cell analysis. Existing machine-learning methods identify predictive features from annotated training data, which are often not available in early-stage studies. This can lead to overfitting and inferior performance when applied to new data. To address these challenges we pre...
Introduction: Acute myeloid leukemia (AML) has a poor prognosis, despite aggressive therapies and a recent expansion in the array of treatments. Treatment is often determined by mutations, which risk-stratify a patient’s leukemia or can identify mutations that serve as therapeutic targets. Although common mutations in AML have been extensively stud...
The R package BiDAG implements Markov chain Monte Carlo (MCMC) methods for structure learning and sampling of Bayesian networks. The package includes tools to search for a maximum a posteriori (MAP) graph and to sample graphs from the posterior distribution given the data. A new hybrid approach to structure learning enables inference in large graph...
We present SIEVE, a statistical method for the joint inference of somatic variants and cell phylogeny under the finite-sites assumption from single-cell DNA sequencing. SIEVE leverages raw read counts for all nucleotides and corrects the acquisition bias of branch lengths. In our simulations, SIEVE outperforms other methods in phylogenetic reconstr...
Comprehensive molecular characterization of cancer subtypes is essential for predicting clinical outcomes and searching for personalized treatments. We present bnClustOmics, a statistical model and computational tool for multi-omics unsupervised clustering, which serves a dual purpose: Clustering patient samples based on a Bayesian network mixture...
Motivation
Tumours evolve as heterogeneous populations of cells, which may be distinguished by different genomic aberrations. The resulting intra-tumour heterogeneity plays an important role in cancer patient relapse and treatment failure, so that obtaining a clear understanding of each patient's tumour composition and evolutionary history is key f...
How tumors evolve affects cancer progression, therapy response, and relapse. However, whether tumor evolution is driven primarily by selectively advantageous or neutral mutations remains under debate. Resolving this controversy has so far been limited by the use of bulk sequencing data. Here, we leverage the high resolution of single-cell DNA seque...
Dynamic Bayesian networks (DBNs) can be used for the discovery of gene regulatory networks (GRNs) from time series gene expression data. Here, we suggest a strategy for learning DBNs from gene expression data by employing a Bayesian approach that is scalable to large networks and is targeted at learning models with high predictive accuracy. Our fra...
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful technique to decipher tissue composition at the single-cell level and to inform on disease mechanisms, tumor heterogeneity, and the state of the immune microenvironment. Although multiple methods for the computational analysis of scRNA-seq data exist, their application in a clinical s...
Adjusting for covariates is a well-established method to estimate the total causal effect of an exposure variable on an outcome of interest. Depending on the causal structure of the mechanism under study, there may be different adjustment sets, equally valid from a theoretical perspective, leading to identical causal effects. However, in practice,...
Describing the causal relations governing a system is a fundamental task in many scientific fields, ideally addressed by experimental studies. However, obtaining data under intervention scenarios may not always be feasible, while discovering causal relations from purely observational data is notoriously challenging. In certain settings, such as gen...
Single-cell DNA sequencing (scDNA-seq) has enabled the identification of single nucleotide somatic variants and the reconstruction of cell phylogenies. However, statistical phylogenetic models for cell phylogeny reconstruction from raw sequencing data are still in their infancy. Here we present SIEVE (SIngle-cell EVolution Explorer), a statistical...
Tumours evolve as heterogeneous populations of cells, which may be distinguished by different genomic aberrations. The resulting intra-tumour heterogeneity plays an important role in cancer patient relapse and treatment failure, so that obtaining a clear understanding of each patient's tumour composition and evolutionary history is key for personal...
Reconstructing the history of somatic DNA alterations that occurred in a tumour can help understand its evolution and predict its resistance to treatment. Single-cell DNA sequencing (scDNAseq) can be used to investigate clonal heterogeneity and to inform phylogeny reconstruction. However, existing phylogenetic methods for scDNAseq data are designed...
Cancer progression is an evolutionary process shaped by both deterministic and stochastic forces. Recent advances in multi-region sequencing, single-cell sequencing, and phylogenetic tree inference empower more precise reconstruction of the mutational history of each tumor. At the same time, the increased resolution also highlights the extensive di...
Bayesian networks are probabilistic graphical models widely employed to understand dependencies in high dimensional data, and even to facilitate causal discovery. Learning the underlying network structure, which is encoded as a directed acyclic graph (DAG), is highly challenging mainly due to the vast number of possible networks in combination with...
Dynamic Bayesian networks (DBNs) can be used for the discovery of gene regulatory networks from time series gene expression data. Here, we suggest a strategy for learning DBNs from gene expression data by employing a Bayesian approach that is scalable to large networks and is targeted at learning models with high predictive accuracy. Our framework...
While learning the graphical structure of Bayesian networks from observational data is key to describing and helping understand data generating processes in complex applications, the task poses considerable challenges due to its computational complexity. The directed acyclic graph (DAG) representing a Bayesian network model is generally not identif...
Inference of the marginal probability distribution is defined as the calculation of the probability of a subset of the variables and is relevant for handling missing data and hidden variables. While inference of the marginal probability distribution is crucial for various problems in machine learning and statistics, its exact computation is general...
Tumour progression is an evolutionary process in which different clones evolve over time, leading to intra-tumour heterogeneity. Interactions between clones can affect tumour evolution and hence disease progression and treatment outcome. Intra-tumoural pairs of mutations that are overrepresented in a co-occurring or clonally exclusive fashion over...
Cancer progression is an evolutionary process shaped by both deterministic and stochastic forces. Multi-region and single-cell sequencing of tumors empower high-resolution reconstruction of the mutational history of each tumor. At the same time, it also highlights the extensive diversity across tumors and patients. Resolving the interactions among...
Although combination antiretroviral therapies seem to be effective at controlling HIV-1 infections regardless of the viral subtype, there is increasing evidence for subtype-specific drug resistance mutations. The order and rates at which resistance mutations accumulate in different subtypes also remain poorly understood. Most of this knowledge is d...
Background:
Recent network models propose that mutual interaction between symptoms has an important bearing on the onset of schizophrenic disorder. In particular, cross-sectional studies suggest that affective symptoms may influence the emergence of psychotic symptoms. However, longitudinal analysis offers a more compelling test for causation: the...
Describing the relationship between the variables in a study domain and modelling the data generating mechanism is a fundamental problem in many empirical sciences. Probabilistic graphical models are one common approach to tackle the problem. Learning the graphical structure is computationally challenging and a fervent area of current research with...
We present an in-depth study of the universal correlations of scattering-matrix entries required in the framework of nonstationary many-body scattering of noninteracting indistinguishable particles where the incoming states are localized wave packets. Contrary to the stationary case, the emergence of universal signatures of chaotic dynamics in dyna...
Tumour progression is an evolutionary process in which different clones evolve over time, leading to intra-tumour heterogeneity. Interactions between clones can affect tumour evolution and hence disease progression and treatment outcome. Pairs of mutations that are overrepresented in a clonally exclusive fashion over a cohort of patient samples may...
Intra-tumour heterogeneity is the molecular hallmark of renal cancer, and the molecular tumour composition determines the treatment outcome of renal cancer patients. In renal cancer tumourigenesis, in general, different tumour clones evolve over time. We analysed intra-tumour heterogeneity and subclonal mutation patterns in 178 tumour samples obtai...
The application and integration of molecular profiling technologies create novel opportunities for personalized medicine. Here, we introduce the Tumor Profiler Study, an observational trial combining a prospective diagnostic approach to assess the relevance of in-depth tumor profiling to support clinical decision-making with an exploratory approach...
Bayesian networks are a powerful framework for studying the dependency structure of variables in a complex system. The problem of learning Bayesian networks is tightly associated with the given data type. Ordinal data, such as stages of cancer, rating scale survey questions, and letter grades for exams, are ubiquitous in applied research. However,...
Motivation
Recent technological advances have led to an increase in the production and availability of single-cell data. The ability to integrate a set of multi-technology measurements would allow the identification of biologically or clinically meaningful observations through the unification of the perspectives afforded by each technology. In most...
A Correction to this paper has been published: https://doi.org/10.1038/s41467-020-19902-7
SARS-CoV-2, the virus responsible for the current COVID-19 pandemic, is evolving into different genetic variants by accumulating mutations as it spreads globally. In addition to this diversity of consensus genomes across patients, RNA viruses can also display genetic diversity within individual hosts, and co-existing viral variants may affect disea...
Clonal diversity is a consequence of cancer cell evolution driven by Darwinian selection. Precise characterization of clonal architecture is essential to understand the evolutionary history of tumor development and its association with treatment resistance. Here, using single-cell DNA sequencing, we report the clonal architecture and mutational h...
Although combination antiretroviral therapies seem to be effective at controlling HIV-1 infections regardless of the viral subtype, there is increasing evidence for subtype-specific drug resistance mutations. The order and rates at which resistance mutations accumulate in different subtypes also remain poorly understood. Here, we present a methodolo...
Germinal centers (GCs) are specialized compartments within the secondary lymphoid organs where B cells proliferate, differentiate, and mutate their antibody genes in response to the presence of foreign antigens. Through the GC lifespan, interclonal competition between B cells leads to increased affinity of the B cell receptors for antigens accompan...
Copy number alterations are driving forces of tumour development and the emergence of intra-tumour heterogeneity. A comprehensive picture of these genomic aberrations is therefore essential for the development of personalised and precise cancer diagnostics and therapies. Single-cell sequencing offers the highest resolution for copy number profiling...
Adjusting for covariates is a well-established method to estimate the total causal effect of an exposure variable on an outcome of interest. Depending on the causal structure of the mechanism under study, there may be different adjustment sets, equally valid from a theoretical perspective, leading to identical causal effects. However, in practice, w...
Recent technological advances allow profiling of tumor samples to an unparalleled level with respect to molecular and spatial composition as well as treatment response. We describe a prospective, observational clinical study performed within the Tumor Profiler (TuPro) Consortium that aims to show the extent to which such comprehensive information l...
One of the pervasive features of cancer is the diversity of mutations found in malignant cells within the same tumor; a phenomenon called clonal diversity or intratumor heterogeneity. Clonal diversity allows tumors to adapt to the selective pressure of treatment and likely contributes to the development of treatment resistance and cancer recurrence...
Next-generation sequencing of DNA and RNA obtained from liquid biopsies of cancer patients may reveal important insights into disease progression and metastasis formation, and it holds the promise to enable new methods for noninvasive screening and clinical decision support. However, implementing liquid biopsy sequencing protocols is challenged by...
Understanding the clonal architecture and evolutionary history of a tumour poses one of the key challenges to overcome treatment failure due to resistant cell populations. Previously, studies on subclonal tumour evolution have been primarily based on bulk sequencing and in some recent cases on single-cell sequencing data. Either data type alone has...
Background:
Extensive DNA sequencing has led to an unprecedented view of the diversity of individual genomes and their evolution among patients with clear cell renal cell carcinoma (ccRCC).
Objective:
To understand subclonal architecture and dynamics of patient-derived two-dimensional (2D) and three-dimensional (3D) ccRCC models in vitro, in ord...
In the post-genomic era of big data in biology, computational approaches to integrate multiple heterogeneous data sets become increasingly important. Despite the availability of large amounts of omics data, the prioritisation of genes relevant for a specific functional pathway based on genetic screening experiments remains a challenging task. Here...
Reconstructing the evolution of tumors is a key aspect towards the identification of appropriate cancer therapies. The task is challenging because tumors evolve as heterogeneous cell populations. Single-cell sequencing holds the promise of resolving the heterogeneity of tumors; however, it has its own challenges including elevated error rates, alle...
Large-scale genomic data highlight the complexity and diversity of the molecular changes that drive cancer progression. Statistical analysis of cancer data from different tissues can guide drug repositioning as well as the design of targeted treatments. Here, we develop an improved Bayesian network model for tumour mutational profiles and apply it...