Fabio Stella
Università degli Studi di Milano-Bicocca | UNIMIB · Department of Informatics, Systems and Communication (DISCo)
PhD Computational Mathematics and Operations Research
About
131
Publications
41,989
Reads
1,613
Citations
Introduction
My main skills are data mining and text mining. My main contributions concern Bayesian networks, continuous time Bayesian networks, and continuous time Bayesian network classifiers. I have also contributed to topic models, feed-forward neural networks, and Bayesian confidence propagation neural networks.
I developed the Pathway titled Introduction to Data Mining
https://learn.eduopen.org/eduopen/pathway_details.php?specialid=23
Additional affiliations
January 1993 - January 1998
February 1998 - December 2020
Publications (131)
Background: In recent decades, the increasing number of adolescent and young adult (AYA) survivors of breast cancer (BC) has highlighted the cardiotoxic role of cancer therapies, making cardiovascular diseases (CVDs) among the most frequent, although rare, long-term sequelae. Leveraging innovative artificial intelligence (AI) tools and real-world...
Randomised clinical trials to study treatment effects may be infeasible for several reasons: we often resort to analysing observational healthcare data instead. Still, we must ensure the validity and interpretability of the causal relationships discovered using machine learning to support clinical decision-making. This is particularly important in...
Causality is receiving increasing attention by the artificial intelligence and machine learning communities. This paper gives an example of modelling a recommender system problem using causal graphs. Specifically, we approached the causal discovery task to learn a causal graph by combining observational data from an open-source dataset with prior k...
Causality is receiving increasing attention in the Recommendation Systems (RSs) community, which has realised that RSs could greatly benefit from causality to transform accurate predictions into effective and explainable decisions. Indeed, the RS literature has repeatedly highlighted that, in real-world scenarios, recommendation algorithms suffer m...
The artifacts affecting electroencephalographic (EEG) signals may undermine the correct interpretation of neural data that are used in a variety of applications spanning from diagnosis support systems to recreational brain-computer interfaces. Therefore, removing or, at least, reducing the noise content with respect to the actual brain activity dat...
This study conducts a comprehensive analysis of deep reinforcement learning (DRL) algorithms applied to supply chain inventory management (SCIM), which consists of a sequential decision-making problem focused on determining the optimal quantity of products to produce and ship across multiple capacitated local warehouses over a specific time horizon...
This paper deals with the problem of optimising bids and budgets of a set of digital advertising campaigns. We improve on the current state of the art by introducing support for multi-ad group marketing campaigns and developing a highly data efficient parametric contextual bandit. The bandit, which exploits domain knowledge to reduce the exploratio...
Objective
We aimed to develop a machine learning model to infer OCEAN traits from text.
Background
The psycholexical approach allows retrieving information about personality traits from human language. However, it has rarely been applied because of methodological and practical issues that current computational advancements could overcome.
Method...
The preprint is published here: https://osf.io/preprints/psyarxiv/9t5ep
OBJECTIVE – We aimed to develop a machine-learning model to infer OCEAN traits from text.
BACKGROUND – The psycholexical approach allows retrieving information about personality traits from human language. However, it has rarely been applied because of methodological and prac...
Over the last decades, many prognostic models based on artificial intelligence techniques have been used to provide detailed predictions in healthcare. Unfortunately, the real-world observational data used to train and validate these models are almost always affected by biases that can strongly impact the outcomes validity: two examples are values...
We introduce a novel heuristic designed to address the supply chain inventory management problem in the context of a two-echelon divergent supply chain. The proposed heuristic advances the current state-of-the-art by combining deep reinforcement learning with multi-stage stochastic programming. In particular, deep reinforcement learning is employed...
Causal inference for testing clinical hypotheses from observational data presents many difficulties because the underlying data-generating model and the associated causal graph are not usually available. Furthermore, observational data may contain missing values, which impact the recovery of the causal graph by causal discovery algorithms: a crucia...
Causality is gaining more and more attention in the machine learning community and consequently also in recommender systems research. The limitations of learning offline from observed data are widely recognized; however, applying debiasing strategies like Inverse Propensity Weighting does not always solve the problem of making wrong estimates. This...
Recommender systems were created to support users in situations of information overload. However, users are consciously or unconsciously influenced by many factors when making decisions, and the recommender must account for these to be effective. In this work, we use a causal graph to investigate the influence of different factors on the user’s dec...
We approached the causal discovery task in the recommender system domain to learn a causal graph by combining observational data provided by a meta-search booking platform for online hotel search with prior knowledge made available by domain experts. The results show that it is possible to learn a causal graph coherent with previous findings in the...
Interacting systems of events may exhibit cascading behavior where events tend to be temporally clustered. While the cascades themselves may be obvious from the data, it is important to understand which states of the system trigger them. For this purpose, we propose a modeling framework based on continuous-time Bayesian networks (CTBNs) to analyze...
A key challenge in computer vision and deep learning is the definition of robust strategies for the detection of adversarial examples. In this work, we propose the adoption of ensemble approaches to leverage the effectiveness of multiple detectors in exploiting distinct properties of the input data. To this end, the ENsemble Adversarial Detector (E...
Understanding the laws that govern a phenomenon is the core of scientific progress. This is especially true when the goal is to model the interplay between different aspects in a causal fashion. Indeed, causal inference itself is specifically designed to quantify the underlying relationships that connect a cause to its effect. Causal discovery is a...
Assessing the pre-operative risk of lymph node metastases in endometrial cancer patients is a complex and challenging task. In principle, machine learning and deep learning models are flexible and expressive enough to capture the dynamics of clinical risk assessment. However, in this setting we are limited to observational data with quality issues,...
Reproducibility is a main principle in science and fundamental to ensure scientific progress. However, many recent works point out that there are widespread deficiencies for this aspect in the AI field, making the reproducibility of results impractical or even impossible. We therefore studied the state of reproducibility support on the topic of Rei...
Individual-specific networks, defined as networks of nodes and connecting edges that are specific to an individual, are promising tools for precision medicine. When such networks are biological, interpretation of functional modules at an individual level becomes possible. An under-investigated problem is relevance or “significance” assessment of ea...
Learning the structure of continuous-time Bayesian networks directly from data has traditionally been performed using score-based structure learning algorithms. Only recently has a constraint-based method been proposed, proving to be more suitable under specific settings, as in modelling systems with variables having more than two states. As a resu...
This paper deals with the problem of optimising bids and budgets of a digital advertising portfolio. We improve on the current state of the art by introducing support for multi-ad group marketing campaigns and developing a highly data efficient parametric contextual bandit. The bandit, which exploits domain knowledge to reduce the exploration space...
Purpose – Many incoming requests for quotation usually compete for the attention of accommodation service provider staff on a daily basis, while some of them might deserve more priority than others.
Design – This research is therefore based on the correspondence history of a large booking management system that examines the features of quotation re...
Recommender Systems were created to support users in situations of information overload. However, users are consciously or unconsciously influenced by several factors in their decision-making. We analysed a historical dataset from a meta-search booking platform with the aim of exploring how these factors influence user choices in the context of onl...
Post-harvest diseases are one of the main causes of economic losses in the apple fruit production sector. Therefore, this paper presents an application of a knowledge-based expert system to diagnose post-harvest diseases of apple. Specifically, we detail the process of domain knowledge elicitation for constructing a Bayesian network reasoning sys...
In this study, we analyze and compare the performance of state-of-the-art deep reinforcement learning algorithms for solving the supply chain inventory management problem. This complex sequential decision-making problem consists of determining the optimal quantity of products to be produced and shipped across different warehouses over a given time...
Expense optimisation for online marketing is a relevant and challenging task. In particular, the problem of splitting daily budget among campaigns, together with the problem of setting bids for the auctions that regulate ad appearance, have been recently cast as a multi-armed bandit problem. However, at the current state of the art several shortcom...
Post-harvest diseases of apple can cause considerable economic losses. Thus, we developed DSSApple, an interactive web-based decision support system, that helps users to diagnose post-harvest diseases of domesticated apple based on observed macroscopic symptoms on fruit. Specifically, DSSApple is designed as a two-stream hybrid diagnostic tool, tha...
One of the key challenges in Deep Learning is the definition of effective strategies for the detection of adversarial examples. To this end, we propose a novel approach named Ensemble Adversarial Detector (EAD) for the identification of adversarial examples, in a standard multiclass classification scenario. EAD combines multiple detectors that expl...
This article presents the development of an expert system to support the diagnosis of post-harvest diseases of stored apples. We propose a picture-based and conversational interaction with users, where sampled images depicting symptoms of apples with known diseases are presented to users to elicit their feedback on perceived similarities in order t...
This paper presents a case study of a knowledge-based recommender system capable of diagnosing post-harvest diseases of apples. It describes the process of knowledge elicitation and construction of a Bayesian Network reasoning system as well as its evaluation with three different types of studies involving diseased apples. The ground truth of disease...
Dynamic Bayesian networks have been well explored in the literature as discrete-time models: however, their continuous-time extensions have seen comparatively little attention. In this paper, we propose the first implementation of a constraint-based algorithm for learning the structure of continuous-time Bayesian networks. We discuss the different...
The Multi-Armed Bandit (MAB) problem has been extensively studied in order to address real-world challenges related to sequential decision making. In this setting, an agent selects the best action to be performed at time-step t, based on the past rewards received by the environment. This formulation implicitly assumes that the expected payoff for e...
Post-harvest diseases of apple are one of the major issues in the apple production sector, causing severe economic losses to producers. Thus, we developed DSSApple, a picture-based decision support system able to help users in the diagnosis of apple diseases. Specifically, this paper addresses the problem of sequentially optimizing...
Incomplete data are a common feature in many domains, from clinical trials to industrial applications. Bayesian networks (BNs) are often used in these domains because of their graphical and causal interpretations. BN parameter learning from incomplete data is usually implemented with the Expectation-Maximisation algorithm (EM), which computes the r...
Systems biology approaches are extensively used to model and reverse-engineer gene regulatory networks from experimental data. Indoleamine 2,3-dioxygenases (IDOs)—belonging in the heme dioxygenase family—degrade l-tryptophan to kynurenines. These enzymes are also responsible for the de novo synthesis of nicotinamide adenine dinucleotide (NAD+). As...
Dynamic Bayesian networks have been well explored in the literature as discrete-time models; however, their continuous-time extensions have seen comparatively little attention. In this paper, we propose the first constraint-based algorithm for learning the structure of continuous-time Bayesian networks. We discuss the different statistical tests an...
The current understanding of deep neural networks can only partially explain how input structure, network parameters and optimization algorithms jointly contribute to achieve the strong generalization power that is typically observed in many real-world applications. In order to improve the comprehension and interpretability of deep neural networks,...
Emerging studies in the deep learning community focus on techniques aimed at identifying which part of a graph is suitable for making better decisions and contributes most to accurate inference. This line of research (i.e., “attentional mechanisms” for graphs) can be applied effectively in all those situations in which it is not trivial to capture d...
A necessary step in the development of artificial intelligence is to enable a machine to represent how the world works, building an internal structure from data. This structure should hold a good trade-off between expressive power and querying efficiency. Bayesian networks have proven to be an effective and versatile tool for the task at hand. They...
Background:
Recently, mobile devices, such as smartphones, have been introduced into healthcare research to substitute paper diaries as data-collection tools in the home environment. Such devices support collecting patient data at different time points over a long period, resulting in clinical time-series data with high temporal complexity, such a...
This chapter builds on a dataset where online users of a spa platform participated in an online idea contest providing free-text descriptions of their proposals for spa services. A panel of domain experts annotated these idea descriptions with a score for their innovativeness that serves as ground truth for machine learning experiments. Thus, the c...
This demo presents a conversational navigation approach for a diagnostic application of postharvest diseases of apple with the goal to educate users on the diagnosed diseases as well as to recommend consequences for the storage facility and what action to take for the next growing period. It thus builds on earlier works on picture-based navigation...
The aquifer of the Oltrepò Pavese plain (northern Italy) is affected by paleo-saltwater intrusions that pose a contamination risk to water wells. The report first briefly describes how the presence of saline water can be predicted using geophysical investigations (electrical resistivity tomography or electromagnetic surveys) and a machine learning...
Non-stationary continuous time Bayesian networks are introduced. They allow the parents set of each node in a continuous time Bayesian network to change over time. Structural learning of nonstationary continuous time Bayesian networks is developed under different knowledge settings. A macroeconomic dataset is used to assess the effectiveness of lea...
The application of statistical classification methods is investigated—in comparison also to spatial interpolation methods—for predicting the acceptability of well-water quality in a situation where an effective quantitative model of the hydrogeological system under consideration cannot be developed. In the example area in northern Italy, in particu...
The deep learning wave is propagating through many research areas and communities. In recent years it has quickly spread to Recommendation Systems, a research area which aims to recommend items to users. Indeed, many deep learning models and architectures have been proposed for Recommendation Systems to improve collaborative filtering and content...
When looking for recently published scientific papers, a researcher usually focuses on the topics related to her/his scientific interests. The task of a recommender system is to provide a list of unseen papers that match these topics. The core idea of this paper is to leverage the latent topics of interest in the publications of the researchers, an...
Most evaluations of novel algorithmic contributions assess their accuracy in predicting what was withheld in an offline evaluation scenario. However, several doubts have been raised that standard offline evaluation practices are not appropriate to select the best algorithm for field deployment. The goal of this work is therefore to compare the offl...
Physical activity (PA) is considered one of the most important factors for the prevention and management of non-communicable diseases (NCDs). Mobile technologies offer several opportunities for supporting PA, especially if combined with psychological aspects, model-based reasoning systems and personalized human computer interaction. This still on-g...
Non-stationary continuous time Bayesian networks are introduced. They allow the parents set of each node to change over continuous time. Three settings are developed for learning non-stationary continuous time Bayesian networks from data: known transition times, known number of epochs and unknown number of epochs. A score function for each setting...
Recommendation of scientific papers is a task aimed to support researchers in accessing relevant articles from a large pool of unseen articles. When writing a paper, a researcher focuses on the topics related to her/his scientific domain, by using a technical language.
The core idea of this paper is to exploit the topics related to the researchers...
T helper 17 (TH17) cells represent a pivotal adaptive cell subset involved in multiple immune disorders in mammalian species. Deciphering the molecular interactions regulating TH17 cell differentiation is particularly critical for novel drug target discovery designed to control maladaptive inflammatory conditions. Using continuous time Bayesian net...
User generated content in general and textual reviews in particular constitute a vast source of information for the decision making of tourists and management and are therefore a key component for e-tourism. This paper provides a description of the topic model method with a particular application focus on the tourism domain. It therefore contribute...
User generated content in general and textual reviews in particular constitute a vast source of information for the decision making of tourists and management and are therefore a key component for e-tourism. This paper explores different application scenarios for the topic model method to process these textual reviews in order to provide accurate d...
The detection of design patterns is a useful activity giving support to the comprehension and maintenance of software systems. Many approaches and tools have been proposed in the literature providing different results. In this paper, we extend a previous work regarding the application of machine learning techniques for design pattern detection, by...
Background
Dynamic aspects of gene regulatory networks are typically investigated by measuring system variables at multiple time points. Current state-of-the-art computational approaches for reconstructing gene networks directly build on such data, making a strong assumption that the system evolves in a synchronous fashion at fixed points in time....