Alex Graudenzi

Università degli Studi di Milano-Bicocca | UNIMIB · Department of Informatics, Systems and Communication (DISCo)

PhD
Tenure track researcher @unimib || Data science + AI + bioinformatics for cancer and viral evolution

About

120
Publications
10,079
Reads
1,166
Citations
Introduction
Tenure Track Researcher in Bioinformatics and Health Data Science at Univ. of Milan-Bicocca. Co-head of the Data and Computational Biology Group of the Univ. of Milan-Bicocca. Director of the Lake Como Workshop and School on Cancer, Development and Complexity. Author of 60+ publications in international journals and conference proceedings. H-index: 15; 650+ citations. Main fields: data science, artificial intelligence, bioinformatics, computational biology, complex systems, cancer/viral evolution.

Publications (120)
Article
Full-text available
We outline the features of the R package SparseSignatures and its application to determine the signatures contributing to mutation profiles of tumor samples. We describe installation details and illustrate a step-by-step approach to (1) prepare the data for signature analysis, (2) determine the optimal parameters, and (3) employ them to determine t...
Article
Full-text available
Background The combined effects of biological variability and measurement-related errors on cancer sequencing data remain largely unexplored. However, the spatio-temporal simulation of multi-cellular systems provides a powerful instrument to address this issue. In particular, efficient algorithmic frameworks are needed to overcome the harsh trade-o...
Article
Full-text available
A key task of genomic surveillance of infectious viral diseases lies in the early detection of dangerous variants. Unexpected help to this end is provided by the analysis of deep sequencing data of viral samples, which are typically discarded after creating consensus sequences. Such analysis allows one to detect intra-host low-frequency mutations,...
Article
Full-text available
Checkpoint inhibitors (CPIs) are routinely employed in relapsed/refractory classical Hodgkin lymphoma. Nonetheless, persistent long‐term responses are uncommon, and one‐third of patients are refractory. Several reports have suggested that treatment with CPIs may re‐sensitize patients to chemotherapy, however there is no consensus on the optimal che...
Article
Full-text available
Many large national and transnational studies have been dedicated to the analysis of the SARS-CoV-2 genome, most of which focused on missense and nonsense mutations. However, approximately 30% of the SARS-CoV-2 variants are synonymous, therefore changing the target codon without affecting the corresponding protein sequence. By performing a large-scale...
Article
Full-text available
The rise of longitudinal single-cell sequencing experiments on patient-derived cell cultures, xenografts and organoids is opening new opportunities to track cancer evolution, assess the efficacy of therapies and identify resistant subclones. We introduce LACE, the first algorithmic framework that processes single-cell mutational profiles from samp...
Preprint
One of the key challenges in Deep Learning is the definition of effective strategies for the detection of adversarial examples. To this end, we propose a novel approach named Ensemble Adversarial Detector (EAD) for the identification of adversarial examples, in a standard multiclass classification scenario. EAD combines multiple detectors that expl...
Article
Full-text available
We describe the procedures to perform the following: (1) the de novo discovery of mutational signatures from raw sequencing data of viral samples and (2) the association of existing viral mutational signatures to the samples of a given dataset. The goal is to identify and characterize the nucleotide substitution patterns related to the mutational p...
Article
Full-text available
Motivation: Driver (epi)genomic alterations underlie the positive selection of cancer subpopulations, which promotes drug resistance and relapse. Even though substantial heterogeneity is witnessed in most cancer types, mutation accumulation patterns can be regularly found and can be exploited to reconstruct predictive models of cancer evolution. Ye...
Preprint
Full-text available
Algorithmic strategies for the spatio-temporal simulation of multi-cellular systems are crucial to generate synthetic datasets for bioinformatics tools benchmarking, as well as to investigate experimental hypotheses on real-world systems in a variety of in-silico scenarios. In particular, efficient algorithms are needed to overcome the harsh trade-...
Article
The strong nonlinearity of large and highly connected reaction systems, such as metabolic networks, hampers the determination of variations in reaction fluxes from variations in species abundances, when comparing different steady states of a given system. We hypothesize that patterns in species abundance variations exist that mainly depend on the s...
Article
Background The increasing availability of omics data collected from patients affected by severe pathologies, such as cancer, is fostering the development of data science methods for their analysis. Introduction The combination of data integration and machine learning approaches can provide new powerful instruments to tackle the complexity of cance...
Article
Matters Arising from: Sharma, A., Cao, E.Y., Kumar, V. et al. Longitudinal single-cell RNA sequencing of patient-derived primary cells reveals drug-induced infidelity in stem cell hierarchy. Nat Commun 9, 4931 (2018). https://doi.org/10.1038/s41467-018-07261-3. In Sharma, A. et al. Nat Commun 9, 4931 (2018) the authors employ longitudinal single-ce...
Preprint
Abstract Matters Arising from: Sharma, A., Cao, E.Y., Kumar, V. et al. Longitudinal single-cell RNA sequencing of patient-derived primary cells reveals drug-induced infidelity in stem cell hierarchy. Nat Commun 9, 4931 (2018). https://doi.org/10.1038/s41467-018-07261-3. In Sharma, A. et al. Nat Commun 9, 4931 (2018) the authors employ longitudi...
Article
Full-text available
We introduce VERSO, a two-step framework for the characterization of viral evolution from sequencing data of viral genomes, which improves over phylogenomic approaches for consensus sequences. VERSO exploits an efficient algorithmic strategy to return robust phylogenies from clonal variant profiles, also in conditions of sampling limitations. It th...
Article
Full-text available
To dissect the mechanisms underlying the inflation of variants in the SARS-CoV-2 genome, we present one of the largest up-to-date analyses of intra-host genomic diversity, which reveals that most samples present heterogeneous genomic architectures, due to the interplay between host-related mutational processes and transmission dynamics. The deconvo...
Chapter
The increasing availability of sequencing data of cancer samples is fueling the development of algorithmic strategies to investigate tumor heterogeneity and infer reliable models of cancer evolution. We here build up on previous works on cancer progression inference from genomic alteration data, to deliver two distinct Cytoscape-based applications,...
Chapter
FBCA (Flux Balance Cellular Automata) has been recently proposed as a new multi-scale modeling framework to represent the spatial dynamics of multi-cellular systems, while simultaneously taking into account the metabolic activity of individual cells. Preliminary results have revealed the potentialities of the framework in enabling to identify and a...
Article
Motivation The advancements of single-cell sequencing methods have paved the way for the characterization of cellular states at unprecedented resolution, revolutionizing the investigation on complex biological systems. Yet, single-cell sequencing experiments are hindered by several technical issues, which cause output data to be noisy, impacting t...
Chapter
Many systems in nature, society and technology are composed of numerous interacting parts. Very often these dynamics lead to the formation of medium-level structures, whose detection could allow a high-level description of the dynamical organization of the system itself, and thus to its understanding. In this work we apply this idea to the “cancer...
Preprint
Abstract To dissect the mechanisms underlying the observed inflation of variants in SARS-CoV-2 genome, we present the largest up-to-date analysis of intra-host genomic diversity, which reveals that the majority of samples present a complex sublineage architecture, due to the interplay between host-related mutational processes and transmission dyna...
Article
Full-text available
One of the key challenges in current cancer research is the development of computational strategies to support clinicians in the identification of successful personalized treatments. Control theory might be an effective approach to this end, as proven by the long-established application to therapy design and testing. In this respect, we here introd...
Preprint
Full-text available
We introduce VERSO, a two-step framework for the characterization of viral evolution from sequencing data of viral genomes, which improves over phylogenomic approaches for consensus sequences. VERSO exploits an efficient algorithmic strategy to return robust phylogenies from clonal variant profiles, also in conditions of sampling limitations. It th...
Article
Full-text available
We present MaREA4Galaxy, a user-friendly tool that allows a user to characterize and to graphically compare groups of samples with different transcriptional regulation of metabolism, as estimated from cross-sectional RNA-seq data. The tool is available as plug-in for the widely-used Galaxy platform for comparative genomics and bioinformatics analys...
Preprint
Full-text available
The current understanding of deep neural networks can only partially explain how input structure, network parameters and optimization algorithms jointly contribute to achieve the strong generalization power that is typically observed in many real-world applications. In order to improve the comprehension and interpretability of deep neural networks,...
Preprint
Full-text available
The rise of longitudinal single-cell sequencing experiments on patient-derived cell cultures, xenografts and organoids is opening new opportunities to track cancer evolution in single tumors and to investigate intra-tumor heterogeneity. This is particularly relevant when assessing the efficacy of therapies over time on the clonal composition of a t...
Chapter
The current understanding of deep neural networks can only partially explain how input structure, network parameters and optimization algorithms jointly contribute to achieve the strong generalization power that is typically observed in many real-world applications. In order to improve the comprehension and interpretability of deep neural networks,...
Article
Full-text available
The metabolic processes related to the synthesis of the molecules needed for a new round of cell division underlie the complex behaviour of cell populations in multi-cellular systems, such as tissues and organs, whereas their deregulation can lead to pathological states, such as cancer. Even within genetically homogeneous populations, complex dynam...
Preprint
The increasing availability of sequencing data of cancer samples is fueling the development of algorithmic strategies to investigate tumor heterogeneity and infer reliable models of cancer evolution. We here build up on previous works on cancer progression inference from genomic alteration data, to deliver two distinct Cytoscape-based applications...
Preprint
Full-text available
One of the key challenges in current cancer research is the development of reliable methods for the definition of personalized therapeutic strategies, based on increasingly available experimental data on single patients. To this end, methods from control theory can be effectively employed on patient-specific pharmacokinetic and pharmacodynamic mode...
Chapter
Understanding the synchronization, either induced or spontaneous, of cell growth, division and proliferation in a cell culture is an important topic in molecular biology and biotechnology. Metabolic processes related to the synthesis of all the molecules needed for a new round of cell division are the basic underlying phenomena responsible for the...
Article
Full-text available
Background. A large number of algorithms is being developed to reconstruct evolutionary models of individual tumours from genome sequencing data. Most methods can analyze multiple samples collected either through bulk multi-region sequencing experiments or the sequencing of individual cancer cells. However, rarely the same method can support both d...
Article
Full-text available
Author summary Cytotoxicity of chemotherapeutic agents and resistance to targeted treatments are the main reasons why cancer is still one of the top causes of death. As tumor cells are intrinsically resistant to therapies that target signaling pathways, targeting the metabolic hallmarks of cancer holds promise for more incisive treatments. Regretta...
Data
Sensitivity of scFBA results to ϵ for LCPT45 dataset. A) Left: histogram of biomass produced by each single cell when ϵ = 0. Right: Total biomass produced by the population of cells as a function of ϵ. The inset reports the same curve zoomed in on low ϵ values. B) Clustergram (distance metric: euclidean) of the effect of single gene deletions perfo...
Data
Clustering of transcripts vs. fluxes. A) H358 dataset. Clustergram (distance metric: euclidean) of the transcripts of the metabolic genes included in metabolic network (left) and of the metabolic fluxes predicted by scFBA (middle). Right panel: elbow analysis comparing cluster errors for k ∈ {1, ⋯, 20} (k-means clustering) in both transcripts (blue...
Data
scFBA computation time. The linear relationship between the time for an FBA (and thus a scFBA) optimization and the size of the network is well established. We estimated the computation time required to perform a complete model reconstruction, from a template metabolic network to a population model with RASs integrated, for different number of cell...
Data
Comparison of the fluxes predicted by scFBA, GIMME and iMAT with respect to LCPT45 dataset. (XLSX)
Data
scFBA vs. popFBA. A) Dataset H358. Variability of the fraction of the biomass synthesis flux (logarithmic scale) for each cell over the population growth rate (left panel) before (purple) and after data integration (green). Effect of gene deletion (bars in right panel) on the population growth rate before (popFBA), after data integration (scFBA), a...
Data
Description of sensitivity of scFBA results to ϵ. (PDF)
Data
Evaluation of clustering goodness. (PDF)
Data
Comparison of the fluxes of the two main clusters in Fig 3A-middle. (XLSX)
Conference Paper
Full-text available
The increasing availability of sequencing data of cancer samples is fueling the development of algorithmic strategies to investigate tumor heterogeneity and infer reliable models of cancer evolution. We here build up on previous works on cancer progression inference from genomic alteration data, to deliver two distinct Cytoscape-based applications,...
Article
Full-text available
Structural learning of Bayesian Networks (BNs) is an NP-hard problem, which is further complicated by many theoretical issues, such as the I-equivalence among different structures. In this work, we focus on a specific subclass of BNs, named Suppes-Bayes Causal Networks (SBCNs), which include specific structural constraints based on Suppes’ probabili...
Article
Effective stratification of cancer patients on the basis of their molecular make-up is a key open challenge. Given the altered and heterogenous nature of cancer metabolism, we here propose to use the overall expression of central carbon metabolism as biomarker to characterize groups of patients with important characteristics, such as response to ad...
Article
Full-text available
Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes, with respect to wild-type conditions. Cancer and HIV are 2 common examples of such diseases, where the mutational load in the cancerous/viral population increases over time. In these cases, selective pressures are often observed along with c...
Chapter
One of the critical issues when adopting Bayesian networks (BNs) to model dependencies among random variables is to “learn” their structure. This is a well-known NP-hard problem in its most general and classical formulation, which is furthermore complicated by known pitfalls such as the issue of I-equivalence among different structures. In this wor...
Preprint
Full-text available
Motivation Metabolic reprogramming is a general feature of cancer cells. Regrettably, the comprehensive quantification of metabolites in biological specimens does not promptly translate into knowledge on the utilization of metabolic pathways. Computational models hold the promise to bridge this gap, by estimating fluxes across metabolic pathways. Y...
Preprint
Full-text available
The characterization of the metabolic deregulations that distinguish cancer phenotypes, and which might be effectively targeted by ad-hoc strategies, is a key open challenge. To this end, we here introduce MaREA (Metabolic Reaction Enrichment Analysis), a computational pipeline that processes cross-sectional RNAseq data to identify the metabolic re...
Article
Full-text available
It is well known that tumors originating from the same tissue have different prognosis and sensitivity to treatments. Over the last decade, cancer genomics consortia like the Cancer Genome Atlas (TCGA) have been generating thousands of cross-sectional data, for thousands of human primary tumors originated from various tissues. Thanks to that public...
Conference Paper
Full-text available
It is well known that tumors originating from the same tissue have different prognosis and sensitivity to treatments, depending on their molecular features. Over the last decade, cancer genomics consortia like the Cancer Genome Atlas (TCGA; https://cancergenome.nih.gov) have been generating thousands of cross-sectional data, spanning from genetic a...
Technical Report
Phylogenetic methods are routinely used to quantify intra tumor heterogeneity (ITH) from multi-sample sequencing of individual tumors. These methods can deconvolve clonal or mutational structures, but sometimes require several complex technical assumptions. Here, we present a simple computational framework (Temporal oRder of Individual Tumors, TRaI...
Conference Paper
One of the critical issues when adopting Bayesian networks (BNs) to model dependencies among random variables is to "learn" their structure. This is a well-known NP-hard problem in its most general and classical formulation, which is furthermore complicated by known pitfalls such as the issue of I-equivalence among different structures. In this wor...
Preprint
Full-text available
Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes, with respect to wildtype conditions. Cancer and HIV are two common examples of such diseases, where the mutational load in the cancerous/viral population increases over time. In these cases, selective pressures are often observed along with...
Article
Cancer heterogeneity represents a major hurdle in the development of effective theranostic strategies, as it prevents to devise unique and maximally efficient diagnostic, prognostic and therapeutic procedures even for patients affected by the same tumor type. Computational techniques can nowadays leverage the huge and ever increasing amount of (epi...
Article
Full-text available
Models of cancer progression provide insights on the order of accumulation of genetic alterations during cancer development. Algorithms to infer such models from the currently available mutational profiles collected from different cancer patients (cross-sectional data) have been defined in the literature since the late 1990s. These algorithms differ in th...
Article
Full-text available
Gene Regulatory Networks (GRNs) control many biological systems, but how such network coordination is shaped is still unknown. GRNs can be subdivided into basic connections that describe how the network members interact e.g., co-expression, physical interaction, co-localization, genetic influence, pathways, and shared protein domains. The important...
Article
Full-text available
The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics...
Article
Understanding the dynamical evolution of cancer, with the final goal of developing effective techniques for diagnosis, prediction and treatment is one of the main challenges of modern biosciences. In this paper we approach the temporal ordering reconstruction problem, which refers to the temporal sorting of a collection of static biological data. T...
Code
Full-text available
SpidermiR: An R/Bioconductor package for integrative network analysis with miRNA data
Article
Full-text available
Several diseases related to cell proliferation are characterized by the accumulation of somatic DNA changes, with respect to wildtype conditions. Cancer and HIV are two common examples of such diseases, where the mutational load in the cancerous/viral population increases over time. In these cases, selective pressures are often observed along with...
Preprint
Full-text available
Motivation We introduce TRONCO (TRanslational ONCOlogy), an open-source R package that implements the state-of-the-art algorithms for the inference of cancer progression models from (epi)genomic mutational profiles. TRONCO can be used to extract population-level models describing the trends of accumulation of alterations in a cohort of cross-sectio...