
Francesco Folino
Italian National Research Council | CNR · Institute for High Performance Computing and Networking ICAR
Ph.D. in Computer Science
About
68
Publications
6,715
Reads
1,037
Citations
Introduction
Additional affiliations
August 2010 - present
September 2003 - December 2006
Education
November 2003 - November 2006
Publications (68)
In recent years, link prediction in complex networks has attracted ever-increasing attention from the scientific community. In this paper we apply link prediction models to a very challenging scenario: predicting the onset of future diseases on the basis of the current health status of patients. To this purpose, a comorbidity network where nodes...
Process Mining techniques exploit the information stored in the execution log of a process to extract some high-level process model, useful for analysis or design tasks. Most of these techniques focus on "structural" aspects of the process, in that they only consider what elementary activities were executed and in which ordering. Hence, any other "...
An approach for disease prediction that combines clustering, Markov models and association analysis techniques is proposed. Patient medical records are first clustered, and then a Markov model is generated for each cluster to predict the illnesses likely to affect a patient in the future. However, when the probability of the...
Modeling behavioral aspects of business processes is a hard and costly task, which usually requires heavy intervention of business experts. This explains the increasing attention given to process mining techniques, which automatically extract behavioral process models from log data. In the case of complex processes, however, the models identified b...
Detecting deviant traces in business process logs is a crucial task in modern organizations due to the detrimental effect of certain deviant behaviors (e.g., attacks, frauds, faults). Training a Deviance Detection Model (DDM) only over labeled traces with supervised learning methods is ill-suited to real-life contexts where a small fraction of the traces are...
Predicting the final outcome of an ongoing process instance is a key problem in many real-life contexts. This problem has been addressed mainly by discovering a prediction model by using traditional machine learning methods and, more recently, deep learning methods, exploiting the supervision coming from outcome-class labels associated with histori...
A Correction to this paper has been published: 10.1007/s13740-021-00121-2
The ever-increasing attention of process mining (PM) research to the logs of low structured processes and of non-process-aware systems (e.g., ERP, IoT systems) poses a number of challenges. Indeed, in such cases, the risk of obtaining low-quality results is rather high, and great effort is needed to carry out a PM project, most of which is usually...
Classification-oriented Machine Learning methods are a precious tool, in modern Intrusion Detection Systems (IDSs), for discriminating between suspected intrusion attacks and normal behaviors. Many recent proposals in this field leveraged Deep Neural Network (DNN) methods, capable of learning effective hierarchical data representations automaticall...
Traditionally, Expert Systems have found a natural application in the behavioral analysis of processes. In fact, they have proved effective in the tasks of interpreting the data collected during the process executions and of analyzing these data with the aim of diagnosing/detecting anomalies. In this context, we focus on log data generated by execu...
Mining deviances from expected behaviors in process logs is a relevant problem in modern organizations, owing to their negative impact in terms of monetary/reputation losses. Most proposals to deviance mining combine the extraction of behavioral features from log traces with the induction of standard classifiers. Difficulties in capturing the multi...
Process Mining (PM) is meant to extract knowledge on the behavior of business processes from historical log data. Lately, increasing attention has been paid to Predictive Process Monitoring, a field of PM that tries to extend process monitoring systems with prediction capabilities. Several current proposals in the literature...
The ever-increasing attention of Process Mining (PM) research to the logs of lowly-structured processes and of non-process-aware systems (e.g., ERP, IoT systems) poses several challenges stemming from the lower quality that these logs have, concerning the precision, completeness and abstraction with which they describe the activities performed. In...
Process Discovery techniques, which allow graph-like models to be extracted from large process logs, are a valuable means for grasping a summarized view of real business processes’ behaviors. If augmented with statistics on process performances (e.g., processing times), such models help study the evolution of process performances across different processing...
Current approaches to the security-oriented classification of process log traces can be split into two categories: (i) example-driven methods, that induce a classifier from annotated example traces; (ii) model-driven methods, based on checking the conformance of each test trace to security-breach models defined by experts. These categories are orth...
Business Process Intelligence (BPI) and Process Mining, two very active areas of research, share a great interest in the issue of discovering an effective Deviance Detection Model (DDM), computed via accessing log data. The DDM allows us to understand whether novel instances of the target business process are deviant or not, thu...
In many application contexts, a business process' executions are subject to performance constraints expressed in an aggregated form, usually over predefined time windows, and detecting a likely violation of such a constraint in advance could help undertake corrective measures for preventing it. This paper illustrates a prediction-aware event proces...
Monitoring the performances of a business process is a key issue in many organizations, especially when the process must comply with predefined performance constraints. In such a case, empowering the monitoring system with prediction capabilities would allow us to know in advance a constraint violation, and possibly trigger corrective measures to e...
Increasing attention has been paid to the problem of explaining and analyzing "deviant cases" generated by a business process, i.e. instances of the process that diverged from prescribed/expected behavior (e.g. frauds, faults, SLA violations). In many real settings, such cases are labelled with a numerical deviance measure, and the analyst wants to...
Increasing attention has been paid to the detection and analysis of “deviant” instances of a business process that are connected with some kind of “hidden” undesired behavior (e.g. frauds and faults). In particular, several recent works faced the problem of inducing a binary classification model (here named deviance detection model) that can discri...
This paper presents a framework for analyzing and predicting the performances of a business process, based on historical data gathered during its past enactments. The framework hinges on an inductive-learning technique for discovering a special kind of predictive process models, which can support the run-time prediction of a given performance measu...
Increasing attention has been paid of late to the problem of detecting and explaining “deviant” process instances, i.e. instances diverging from normal/desired outcomes (e.g., frauds, faults, SLA violations), based on log data. Current solutions make it possible to discriminate between deviant and normal instances, by combining the extraction of (sequence-bas...
Predicting the fix time (i.e. the time needed to eventually solve a case) is a key task in an issue tracking system, which has attracted the attention of data-mining researchers in recent years. Traditional approaches only try to forecast the overall fix time of a case when it is reported, without updating this preliminary estimate as long as the case...
Process discovery techniques are a precious tool for analyzing the real behavior of a business process. However, their direct application to lowly structured logs may yield unreadable and inaccurate models. Current solutions rely on event abstraction or trace clustering, and assume that log events refer to well-defined (possibly low-level) process...
Process discovery (i.e. the automated induction of a behavioral process model from execution logs) is an important tool for business process analysts/managers, who can exploit the extracted knowledge in key process improvement and (re-)design tasks. Unfortunately, when directly applied to the logs of complex and/or lowly-structured processes, such...
This paper presents a framework for analyzing and predicting the performances of a business process, based on historical data gathered during its past enactments. The framework hinges on an inductive-learning technique for discovering a special kind of predictive process models, which can support the run-time prediction of a given performance measu...
Process Mining techniques have been gaining attention, especially as concerns the discovery of predictive process models. Traditionally focused on workflows, they usually assume that process tasks are clearly specified, and referred to in the logs. This limits however their application to many real-life BPM environments (e.g. issue tracking systems...
Fix-time prediction is a key task in bug tracking systems, which was recently faced through predictive data mining approaches, trying to estimate the time needed to solve a case, at the very moment when it is reported. And yet, the actions performed on a bug, along its life, can help refine the prediction of its (remaining) fix-time, by leveraging...
DynamicNet, an effective and efficient algorithm for supporting community evolution detection in time-evolving information networks is presented and experimentally evaluated in this paper. DynamicNet introduces a graph-based model-theoretic approach to represent time-evolving information networks, and to capture how they change over time. A central...
Predicting run-time performances is a hot issue in ticket resolution processes. Recent efforts to take account of the sequence of resolution steps suggest that predictive Process Mining (PM) techniques could be applied in this field, if suitably adapted to the peculiarities of ticket systems. In particular, the performances of a ticket instance u...
In this paper, we propose a framework for representing, modeling and mining time-evolving information networks. Our framework introduces a graph-based model-theoretic approach to represent such networks and how they change over time. Also, we provide a method for supporting matching-based community evolution detection in time-evolving information n...
This paper presents a novel approach to the discovery of predictive process models, which are meant to support the run-time prediction of some performance indicator (e.g., the remaining processing time) on new ongoing process instances. To this purpose, we combine a series of data mining techniques (ranging from pattern mining, to non-parametric regres...
The discovery of evolving communities in dynamic networks is an important research topic that poses challenging tasks. Evolutionary clustering is a recent framework for clustering dynamic networks that introduces the concept of temporal smoothness inside the community structure detection method. Evolutionary-based clustering approaches try to maxim...
Process Mining techniques have been gaining attention, owing to their potentiality to extract compact process models from massive logs. Traditionally focused on workflows, they often assume that process tasks are clearly specified, and referred to in the logs. This limits however their application to many real-life BPM environments (e.g. issue tr...
The discovery of predictive models for process performances is an emerging topic, which poses a series of difficulties when considering complex and flexible processes, whose behaviour tends to change over time depending on context factors. We try to face such a situation by proposing a predictive-clustering approach, where different context-related...
Discovering predictive models for run-time support is an emerging topic in Process Mining research, which can effectively help optimize business process enactments. However, making accurate estimates is not easy especially when considering fine-grain performance measures (e.g., processing times) on a complex and flexible business process, where per...
A prominent goal of process mining is to automatically build a model explaining all the episodes recorded in the log of some transactional system. Whenever the process to be mined is complex and highly-flexible, however, equipping all the traces with just one model might lead to mixing different usage scenarios, thereby resulting in a spaghetti-lik...
An approach for disease prediction that combines clustering, Markov models and association analysis techniques is proposed. Patient medical records are clustered and a Markov model for each cluster is generated to predict the illnesses likely to affect a patient in the future. However, when the probability of the most likely state...
A recommendation engine for disease prediction that combines clustering and association analysis techniques is proposed. The system produces local prediction models, specialized on subgroups of similar patients by using the past patient medical history, to determine the set of possible illnesses an individual could develop. Each model is generated...
A prediction model that exploits the patient's past medical history to determine the risk of individuals developing future diseases is proposed. The model is generated by using the set of frequent diseases that appear together in the same patient. The illnesses likely to affect a patient in the future are obtained by considering the ite...
Bi-clustering, i.e., simultaneously clustering two types of objects based on their correlations, has been studied actively in the last few years, by virtue of its impact on several relevant applications, such as text mining, collaborative filtering, and gene expression analysis. In particular, many research efforts were recently spent on extending...
The discovery of evolving communities in dynamic networks is an important research topic that poses challenging tasks. Previous evolutionary based clustering methods try to maximize cluster accuracy, with respect to incoming data of the current time step, and minimize clustering drift from one time step to the successive one. In order to optimize b...
A knowledge-based framework for supporting and analyzing loosely-structured collaborative processes (LSCPs) is presented in this paper. The framework takes advantage of a number of knowledge representation, management and processing capabilities, including recent process mining techniques. In order to support the enactment, analysis and optimiza...
A multiobjective genetic algorithm for detecting communities in dynamic networks, i.e., networks that evolve over time, is proposed. The approach leverages the concept of evolutionary clustering, assuming that abrupt changes of community structure in short time periods are not desirable. The algorithm correctly detects communities and it is show...
Process-oriented systems have been increasingly attracting data mining researchers, mainly due to the advantages that the application of inductive process mining techniques to log data could open to both the analysis of complex processes and the design of new process models. However, the actual impact of process mining in the industry is endangered...
A novel approach for reconciling tuples stored as free text into an existing attribute schema is proposed. The basic idea is to subject the available text to progressive classification, i.e., a multi-stage classification scheme where, at each intermediate stage, a classifier is learnt that analyzes the textual fragments not reconciled at the end of...
The “internetworked” enterprise domain poses a challenge to IT researchers, due to the complexity and dynamicity of the collaboration processes that typically must be supported in such a scenario. A major issue in this context, where several entities are possibly involved that cooperate according to continuously evolving schemes, is to develop suitab...
We propose a data warehousing architecture for effective risk analysis in a banking scenario. The core of the architecture consists of two data mining tools for improving the quality of consolidated data during the acquisition process. Specifically, we deal with schema reconciliation, i.e. segmentation of a string sequence according to fixed attrib...
We propose a hierarchical, model-based co-clustering framework for handling high-dimensional datasets. The technique views the dataset as a joint probability distribution over row and column variables. Our approach starts by initially clustering rows in a dataset, where each cluster is characterized by a different probability distribution. Subseque...
We propose an incremental algorithm for discovering clusters of duplicate tuples in large databases. The core of the approach is the usage of an indexing technique which, for any newly arrived tuple mu, allows efficient retrieval of a set of tuples in the database which are most similar to mu, and which are likely to refer to the same real-world...
A novel approach for reconciling tuples stored as free text into an existing attribute schema is proposed. The basic idea is to subject the available text to progressive classification, i.e., a multi-stage classification scheme where, at each intermediate stage, a classifier is learnt that analyzes the textual fragments not reconciled at the end of...
We propose an incremental algorithm for clustering duplicate tuples in large databases, which allows any new tuple t to be assigned to the cluster containing the database tuples which are most similar to t (and hence are likely to refer to the same real-world entity t is associated with). The core of the approach is a hash-based indexing technique that...
We present a novel personalization engine that provides individualized access to Web contents/services by means of data mining techniques. It associates adaptive content delivery and navigation support with form filling, a functionality that captures the typical interaction of a user with a Web service, in order to automatically fill in its form fi...
Projects
Project (1)