Daniele Ramazzotti

  • Doctor of Philosophy
  • Professor (Associate) at Università degli Studi di Milano-Bicocca

About

  • Publications: 69
  • Reads: 21,607
  • Citations: 2,585
Introduction
Daniele Ramazzotti works at the Department of Medicine and Surgery at the University of Milano-Bicocca, Milan, Italy. His research interests involve bioinformatics, biostatistics, cancer genomics, and evolution. With a focus on developing computational methods to analyze large-scale biological data, he aims to uncover new insights into cancer progression and resistance mechanisms. His interdisciplinary approach bridges biology, medicine, and data science to drive innovation in cancer research.
Current institution
Università degli Studi di Milano-Bicocca
Current position
  • Professor (Associate)
Additional affiliations
March 2022 - February 2025
Università degli Studi di Milano-Bicocca
Position
  • Tenure track researcher
July 2019 - February 2022
Università degli Studi di Milano-Bicocca
Position
  • PostDoc Position
May 2016 - June 2019
Stanford University
Position
  • PostDoc Position
Education
November 2012 - February 2016
Università degli Studi di Milano-Bicocca
Field of study
  • Computer Science

Publications

Article
Full-text available
Recurring sequences of genomic alterations occurring across patients can highlight repeated evolutionary processes with significant implications for predicting cancer progression. Leveraging the ever-increasing availability of cancer omics data, here we unveil cancer’s evolutionary signatures tied to distinct disease outcomes, representing “favored...
Article
Full-text available
The dominant mutational signature in colorectal cancer genomes is C > T deamination (COSMIC Signature 1) and, in a small subgroup, mismatch repair signature (COSMIC signatures 6 and 44). Mutations in common colorectal cancer driver genes are often not consistent with those signatures. Here we perform whole-genome sequencing of normal colon crypts f...
Article
Full-text available
SETBP1 mutations are found in various clonal myeloid disorders. However, it is unclear whether they can initiate leukemia, as SETBP1 mutations typically appear as later events during oncogenesis. To answer this question, we generated a mouse model expressing mutated SETBP1 in hematopoietic tissue: this model showed profound alterations in the diffe...
Article
Full-text available
The patterns by which primary tumors spread to metastatic sites remain poorly understood. Here, we define patterns of metastatic seeding in prostate cancer (PCa) using a novel injection-based mouse model — EvoCaP (Evolution in Cancer of the Prostate), featuring aggressive metastatic cancer to bone, liver, lungs, and lymph nodes. To define migration...
Article
Full-text available
Cancer evolution lays the groundwork for predictive oncology. Testing evolutionary metrics requires quantitative measurements in controlled clinical trials. We mapped genomic intratumor heterogeneity in locally advanced prostate cancer using 642 samples from 114 individuals enrolled in clinical trials with a 12-year median follow-up. We concomitant...
Article
As genes tend to be co-regulated as gene modules, feature selection in machine learning (ML) on gene expression data can be challenged by the complexity of gene regulation. Here, we present a protocol for reconciling differences in classifier features identified using different ML approaches. We describe steps for loading the PathwaySpace R package...
Article
Full-text available
Breast cancer (BC) is a highly heterogeneous disease with diverse molecular subtypes, which complicates prognosis and treatment. In this study, we performed a multi-omics clustering analysis using the Cancer Integration via MultIkernel LeaRning (CIMLR) method on a large BC dataset from The Cancer Genome Atlas (TCGA) to identify key prognostic bioma...
Article
Full-text available
Molecular subtypes, such as those defined by The Cancer Genome Atlas (TCGA), delineate a cancer’s underlying biology, offering hope of informing a patient’s prognosis and treatment plan. However, most approaches used in the discovery of subtypes are not suitable for assigning subtype labels to new cancer specimens from other studies or clinical trials. Here...
Article
Background: Unstable hemoglobins are caused by single amino acid substitutions in the HBB gene, often affecting key histidine residues, leading to protein destabilization and hemolytic crises. In contrast, long HBB variants, exceeding 20 bp, are rare and associated with a β-thalassemia phenotype due to disrupted α-β chain interactions. We describe a...
Article
Full-text available
Background: Anaplastic lymphoma kinase (ALK) plays a role in the development of lymphoma, lung cancer and neuroblastoma. While tyrosine kinase inhibitors (TKIs) have improved treatment outcomes, relapse remains a challenge due to on-target mutations and off-target resistance mechanisms. ALK-positive (ALK+) tumors can evade the immune system, partly...
Article
Full-text available
Background: Anaplastic Large Cell Lymphoma (ALCL) is a rare and aggressive T-cell lymphoma, classified into ALK-positive and ALK-negative subtypes, based on the presence of chromosomal translocations involving the ALK gene. The current standard of treatment for ALCL is polychemotherapy, with a high overall survival rate. However, a subset of patient...
Article
Full-text available
Background: Copy number alterations (CNAs) are genetic changes commonly found in cancer that involve different regions of the genome and impact cancer progression by affecting gene expression and genomic stability. Computational techniques can analyze copy number data obtained from high-throughput sequencing platforms, and various tools visualize an...
Article
Full-text available
A group of 27 patients diagnosed with metastatic triple-negative breast cancer (mTNBC) was randomly divided into two groups that underwent different lines of metronomic treatment (mCHT). The first group (N = 14) received first-line mCHT and showed a higher overall survival rate than the second group (N = 13), which underwent second-line mCHT. Analy...
Article
Full-text available
Polycythemia Vera (PV) is typically caused by V617F or exon 12 JAK2 mutations. Little is known about Polycythemia cases where no JAK2 variants can be detected, and no other causes identified. This condition is defined as idiopathic erythrocytosis (IE). We evaluated clinical-laboratory parameters of a cohort of 56 IE patients and we determined their...
Article
Full-text available
In a first‐of‐its‐kind study, we assessed the capabilities of large language models (LLMs) in making complex decisions in haematopoietic stem cell transplantation. The evaluation covered not only Generative Pre‐trained Transformer 4 (GPT‐4) but also other artificial intelligence models: PaLM 2 and Llama‐2. Using detailed haem...
Article
Full-text available
Mantle-cell lymphoma (MCL) is a B-cell non-Hodgkin Lymphoma (NHL) with a poor prognosis, at high risk of relapse after conventional treatment. MCL-associated tumour microenvironment (TME) is characterized by M2-like tumour-associated macrophages (TAMs), able to interact with cancer cells, providing tumour survival and resistance to immuno-chemother...
Article
Full-text available
Cancer patients show heterogeneous phenotypes and very different outcomes and responses even to common treatments, such as standard chemotherapy. This state-of-affairs has motivated the need for the comprehensive characterization of cancer phenotypes and fueled the generation of large omics datasets, comprising multiple omics data reported for the...
Conference Paper
Full-text available
In recent years, many algorithmic strategies have been developed to exploit single-cell mutational profiles generated via sequencing experiments of cancer samples and return reliable models of cancer evolution. Here, we introduce the COB-tree algorithm, which summarizes the solutions explored by state-of-the-art methods for clonal tree inference, t...
Article
Full-text available
Background: Longitudinal single-cell sequencing experiments of patient-derived models are increasingly employed to investigate cancer evolution. In this context, robust computational methods are needed to properly exploit the mutational profiles of single cells generated via variant calling, in order to reconstruct the evolutionary history of a tumo...
Article
Full-text available
Recent investigations have improved our understanding of the molecular aberrations supporting Waldenström Macroglobulinemia (WM) biology; however, whether the immune microenvironment contributes to WM pathogenesis remains unanswered. We first showed how a transgenic murine model of human-like lymphoplasmacytic lymphoma/WM exhibits an increased numb...
Article
Full-text available
We present a large-scale analysis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) substitutions, considering 1,585,456 high-quality raw sequencing samples, aimed at investigating the existence and quantifying the effect of mutational processes causing mutations in SARS-CoV-2 genomes when interacting with the human host. As a result,...
Article
Full-text available
Activation-induced cytidine deaminase, AICDA or AID, is a driver of somatic hypermutation and class-switch recombination in immunoglobulins. In addition, this deaminase belonging to the APOBEC family may have off-target effects genome-wide, but its effects at pan-cancer level are not well elucidated. Here, we used different pan-cancer datasets, tot...
Article
Full-text available
Colorectal malignancies are a leading cause of cancer-related death¹ and have undergone extensive genomic study²,³. However, DNA mutations alone do not fully explain malignant transformation⁴⁻⁷. Here we investigate the co-evolution of the genome and epigenome of colorectal tumours at single-clone resolution using spatial multi-omic profiling of ind...
Article
Full-text available
Genetic and epigenetic variation, together with transcriptional plasticity, contribute to intratumour heterogeneity¹. The interplay of these biological processes and their respective contributions to tumour evolution remain unknown. Here we show that intratumour genetic ancestry only infrequently affects gene expression traits and subclonal evoluti...
Article
Full-text available
We outline the features of the R package SparseSignatures and its application to determine the signatures contributing to mutation profiles of tumor samples. We describe installation details and illustrate a step-by-step approach to (1) prepare the data for signature analysis, (2) determine the optimal parameters, and (3) employ them to determine t...
Article
Full-text available
A key task of genomic surveillance of infectious viral diseases lies in the early detection of dangerous variants. Unexpected help to this end is provided by the analysis of deep sequencing data of viral samples, which are typically discarded after creating consensus sequences. Such analysis allows one to detect intra-host low-frequency mutations,...
Article
Full-text available
Many large national and transnational studies have been dedicated to the analysis of SARS-CoV-2 genome, most of which focused on missense and nonsense mutations. However, approximately 30% of the SARS-CoV-2 variants are synonymous, therefore changing the target codon without affecting the corresponding protein sequence. By performing a large-scale...
Article
Full-text available
The rise of longitudinal single-cell sequencing experiments on patient-derived cell cultures, xenografts and organoids is opening new opportunities to track cancer evolution, assess the efficacy of therapies and identify resistant subclones. We introduce LACE, the first algorithmic framework that processes single-cell mutational profiles from samp...
Article
Full-text available
We describe the procedures to perform the following: (1) the de novo discovery of mutational signatures from raw sequencing data of viral samples and (2) the association of existing viral mutational signatures to the samples of a given dataset. The goal is to identify and characterize the nucleotide substitution patterns related to the mutational p...
Article
Full-text available
Motivation: Driver (epi)genomic alterations underlie the positive selection of cancer subpopulations, which promotes drug resistance and relapse. Even though substantial heterogeneity is witnessed in most cancer types, mutation accumulation patterns can be regularly found and can be exploited to reconstruct predictive models of cancer evolution. Ye...
Article
Full-text available
Bayesian Networks have been widely used in the last decades in many fields, to describe statistical dependencies among random variables. In general, learning the structure of such models is a problem with considerable theoretical interest that poses many challenges. On the one hand, it is a well-known NP-complete problem, practically hardened by the...
Article
Full-text available
Cancer is the result of mutagenic processes that can be inferred from tumor genomes by analyzing rate spectra of point mutations, or “mutational signatures”. Here we present SparseSignatures, a novel framework to extract signatures from somatic point mutation data. Our approach incorporates a user-specified background signature, employs regularizat...
Article
Full-text available
We introduce VERSO, a two-step framework for the characterization of viral evolution from sequencing data of viral genomes, which improves over phylogenomic approaches for consensus sequences. VERSO exploits an efficient algorithmic strategy to return robust phylogenies from clonal variant profiles, also in conditions of sampling limitations. It th...
Article
Full-text available
To dissect the mechanisms underlying the inflation of variants in the SARS-CoV-2 genome, we present one of the largest up-to-date analyses of intra-host genomic diversity, which reveals that most samples present heterogeneous genomic architectures, due to the interplay between host-related mutational processes and transmission dynamics. The deconvo...
Article
Full-text available
Learning the structure of dependencies among multiple random variables is a problem of considerable theoretical and practical interest. Within the context of Bayesian Networks, a practical and surprisingly successful solution to this learning problem is achieved by adopting score-functions optimisation schema, augmented with multiple restarts to av...
Article
Full-text available
Atypical chronic myeloid leukemia (aCML) is a BCR-ABL1-negative clonal disorder, which belongs to the myelodysplastic/myeloproliferative group. This disease is characterized by recurrent somatic mutations in SETBP1, ASXL1 and ETNK1 genes, as well as high genetic heterogeneity, thus posing a great therapeutic challenge. To provide a comprehensive g...
Article
Full-text available
Patients admitted to the intensive care unit frequently have anemia and impaired renal function, but often lack historical blood results to contextualize the acuteness of these findings. Using data available within two hours of ICU admission, we developed machine learning models that accurately (AUC 0.86–0.89) classify an individual patient’s basel...
Article
Full-text available
The metabolic processes related to the synthesis of the molecules needed for a new round of cell division underlie the complex behaviour of cell populations in multi-cellular systems, such as tissues and organs, whereas their deregulation can lead to pathological states, such as cancer. Even within genetically homogeneous populations, complex dynam...
Article
Full-text available
Background. A large number of algorithms are being developed to reconstruct evolutionary models of individual tumours from genome sequencing data. Most methods can analyze multiple samples collected either through bulk multi-region sequencing experiments or the sequencing of individual cancer cells. However, the same method can rarely support both d...
Article
Full-text available
Background: Critically ill patients may die despite invasive intervention. In this study, we examine trends in the application of two such treatments over a decade, namely, endotracheal ventilation and the administration of vasopressors and inotropes, as well as the impact of these trends on survival durations in patients who die within a month of ICU admi...
Conference Paper
Full-text available
The increasing availability of sequencing data of cancer samples is fueling the development of algorithmic strategies to investigate tumor heterogeneity and infer reliable models of cancer evolution. We here build on previous work on cancer progression inference from genomic alteration data, to deliver two distinct Cytoscape-based applications,...
Article
Full-text available
Structural learning of Bayesian Networks (BNs) is an NP-hard problem, which is further complicated by many theoretical issues, such as the I-equivalence among different structures. In this work, we focus on a specific subclass of BNs, named Suppes-Bayes Causal Networks (SBCNs), which include specific structural constraints based on Suppes’ probabili...
Article
Full-text available
Outcomes for cancer patients vary greatly even within the same tumor type, and characterization of molecular subtypes of cancer holds important promise for improving prognosis and personalized treatment. This promise has motivated recent efforts to produce large amounts of multidimensional genomic (‘multi-omic’) data, but current algorithms still f...
Article
Full-text available
One of the most challenging tasks when adopting Bayesian Networks (BNs) is learning their structure from data. This task is complicated by the huge search space of possible solutions, and by the fact that the problem is NP-hard. Hence, full enumeration of all the possible solutions is not always feasible and approximations are often requ...
Article
Full-text available
Background. Germline mutations in the BRCA1 and BRCA2 genes predispose carriers to breast and ovarian cancer, and there remains a need to identify the specific genomic mechanisms by which cancer evolves in these patients. Here we present a systematic genomic analysis of breast tumors with BRCA1 and BRCA2 mutations. Methods. We analyzed genomic da...
Article
Full-text available
Recurrent successions of genomic changes, both within and between patients, reflect repeated evolutionary processes that are valuable for the anticipation of cancer progression. Multi-region sequencing allows the temporal order of some genomic changes in a tumor to be inferred, but the robust identification of repeated evolution across patients rem...
Article
Full-text available
Over the past decades, both critical care and cancer care have improved substantially. Due to increased cancer-specific survival, we hypothesized that both the number of cancer patients admitted to the ICU and overall survival have increased since the turn of the millennium. MIMIC-III, a freely accessible critical care database of Beth Israel Deaconess...
Conference Paper
Full-text available
Mastering the dynamics of social influence requires separating, in a database of information propagation traces, the genuine causal processes from temporal correlation, homophily and other spurious causes. However, most of the studies to characterize social influence and, in general, most data-science analyses focus on correlations, statistical ind...
Article
Full-text available
Identification of modules in molecular networks is at the core of many current analysis methods in biomedical research. However, how well different approaches identify disease-relevant modules in different types of gene and protein networks remains poorly understood. We launched the “Disease Module Identification DREAM Challenge”, an open competiti...
Article
Full-text available
The most recent financial upheavals have cast doubt on the adequacy of some of the conventional quantitative risk management strategies, such as VaR (Value at Risk), in many common situations. Consequently, there has been an increasing need for verisimilar financial stress testing, namely simulating and analyzing financial portfolios in extreme, a...
Article
Full-text available
Motivation: We here present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogeneous samples. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualizati...
Article
Full-text available
The complicated, evolving landscape of cancer mutations poses a formidable challenge to identify cancer genes among the large lists of mutations typically generated in NGS experiments. The ability to prioritize these variants is therefore of paramount importance. To address this issue we developed OncoScore, a text-mining tool that ranks genes acco...
Article
Full-text available
We present single-cell interpretation via multikernel learning (SIMLR), an analytic framework and software which learns a similarity measure from single-cell RNA-seq data in order to perform dimension reduction, clustering and visualization. On seven published data sets, we benchmark SIMLR against state-of-the-art methods. We show that SIMLR is sca...
Article
Full-text available
Discrimination discovery from data is an important task aiming at identifying patterns of illegal and unethical discriminatory activities against protected-by-law groups, e.g., ethnic minorities. While any legally-valid proof of discrimination requires evidence of causality, the state-of-the-art methods are essentially correlation-based, albeit, as...
Article
Full-text available
Models of cancer progression provide insights on the order of accumulation of genetic alterations during cancer development. Algorithms to infer such models from the currently available mutational profiles collected from different cancer patients (cross-sectional data) have been defined in the literature since the late 90s. These algorithms differ in th...
Chapter
Full-text available
Learning Objectives
  • Understand the requirements for a “clean” database that is “tidy” and ready for use in statistical analysis.
  • Understand the steps of cleaning raw data, integrating data, reducing and reshaping data.
  • Be able to apply basic techniques for dealing with common problems with raw data including missing data, inconsistent data, a...
Conference Paper
Full-text available
The emergence and development of cancer is a consequence of the accumulation over time of genomic mutations involving a specific set of genes, which provides the cancer clones with a functional selective advantage. In this work, we model the order of accumulation of such mutations during the progression, which eventually leads to the disease, by me...
Article
Full-text available
Significance: A causality-based machine learning Pipeline for Cancer Inference (PiCnIc) is introduced to infer the underlying somatic evolution of ensembles of tumors from next-generation sequencing data. PiCnIc combines techniques for sample stratification, driver selection, and identification of fitness-equivalent exclusive alterations to exploit...
Article
Full-text available
Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes, with respect to wildtype conditions. Cancer and HIV are two common examples of such diseases, where the mutational load in the cancerous/viral population increases over time. In these cases, selective pressures are often observed along with...
Thesis
Full-text available
Recently, there has been a resurgence of interest in rigorous algorithms for the inference of cancer progression from genomic data. The motivations are manifold: (i) growing NGS and single cell data from cancer patients, (ii) need for novel Data Science and Machine Learning algorithms to infer models of cancer progression, and (iii) a desire to und...
Conference Paper
Full-text available
Gene and protein networks are very important to model complex large-scale systems in molecular biology. Inferring or reverse-engineering such networks can be defined as the process of identifying gene/protein interactions from experimental data through computational analysis. However, this task is typically complicated by the enormously large scale...
Article
Full-text available
Motivation: We introduce TRONCO (TRanslational ONCOlogy), an open-source R package that implements the state-of-the-art algorithms for the inference of cancer progression models from (epi)genomic mutational profiles. TRONCO can be used to extract population-level models describing the trends of accumulation of alterations in a cohort of cross-secti...
Article
Full-text available
We devise a novel inference algorithm to effectively solve the cancer progression model reconstruction problem. Our empirical analysis of the accuracy and convergence rate of our algorithm, CAncer PRogression Inference (CAPRI), shows that it outperforms the state-of-the-art algorithms addressing similar problems. Motivation: Several cancer-related...
Thesis
Full-text available
This thesis conducts an observational study into whether diuretics should be administered to ICU patients with sepsis when length of stay in the ICU and 30-day post-hospital mortality are considered. The central contribution of the thesis is a stepwise, reusable software-based approach for examining the outcome of treatment vs no-treatment decision...
Article
Full-text available
Existing techniques to reconstruct tree models of progression for accumulative processes, such as cancer, seek to estimate causation by combining correlation and a frequentist notion of temporal priority. In this paper, we define a novel theoretical framework called CAPRESE (CAncer PRogression Extraction with Single Edges) to reconstruct such model...
Conference Paper
Full-text available
The Spatial Processes package enables an explicit definition of a spatial environment on top of the normal dynamic modeling SBML capabilities. The possibility of an explicit representation of spatial dynamics increases the representation power of SBML. In this work we used those new SBML features to define an extensive model of colonic crypts compo...
