About
159 publications
15,070 reads
1,317 citations
Publications (159)
Efficient and effective service delivery in Public Administration (PA) relies on the development and utilization of key performance indicators (KPIs) for evaluating and measuring performance. This paper presents an innovative framework for KPI construction within performance evaluation systems, leveraging Random Forest algorithms and variable impor...
COVID-19 has significantly impacted individuals, communities, and countries worldwide. These effects include health, economic, social, educational, political, and environmental impacts. The development of COVID-19 vaccines was crucial for disease control and monitoring, yet the threat still looms large. Vaccine recommender syste...
Artificial Intelligence (AI) encompasses a variety of methods and algorithms that have found applications across numerous domains over time. The increasing complexity and abundance of data in healthcare have spurred investigations into utilizing AI techniques within the medical field, leading to promising avenues for fostering innovation, facilitat...
With the rise of social media, individuals face challenges in decision-making due to the abundance of options available. Recommender Systems (RSs) leverage Artificial Intelligence (AI) to provide users with personalized suggestions aligned with their preferences and interests. This study presents a systematic review of AI-based Recommender Systems,...
Since the spread of the coronavirus disease in 2019 (hereafter referred to as COVID-19), millions of people worldwide have been affected by the pandemic, which has significantly changed our habits in various ways. Efforts to eradicate the disease were greatly helped by unprecedentedly fast vaccine development along with strict preventive measures a...
The literature offers extensive material on alternatives to the classical methods of judgement and on their intrinsic limitations. One of the most relevant modern models for dealing with voting system dynamics is Majority Judgement. It was created with the aim of reducing polarization of the electorate in modern demo...
The development of a vaccine to control COVID-19 is the need of the hour. Immunity against the coronavirus depends heavily on vaccine distribution. Unfortunately, vaccine hesitancy is another major challenge worldwide. Therefore, it is necessary to analyze and understand public opinion about COVID-19 vaccines. In this era of...
Huge volumes of fake news are continuously posted by malicious users with fraudulent goals, leading to very negative effects on individuals and society and posing continuous threats to democracy, justice, and public trust. This is particularly relevant on social media platforms (e.g., Facebook, Twitter, Snapchat), due t...
The rise of cybercrime observed in recent years calls for more efficient and effective data exploration and analysis tools. In this respect, the need to support advanced analytics on activity logs and real-time data is driving data scientists' interest in designing and implementing scalable cybersecurity solutions. However, when data science algor...
The world is facing health concerns due to the wide spread of COVID-19. For this reason, the development of a vaccine is the need of the hour. The wider the vaccine distribution, the higher the immunity against the coronavirus. Therefore, there is a need to analyse people’s sentiment toward the vaccine campaign. Today, social media is a rich source of data where...
The whole world is facing health challenges due to the wide spread of the COVID-19 pandemic. To control the spread of COVID-19, the development of a vaccine is the need of the hour. Considering the importance of vaccines, many industries have invested effort in vaccine development. Higher immunity against COVID-19 can be achieved by high intake of...
The pervasive diffusion of Social Networks (SN) has produced an unprecedented amount of heterogeneous data. Thus, traditional approaches quickly became impractical for real-life applications due to the intrinsic properties of this data: the large amount of user-generated content (text, video, image, and audio), data heterogeneity, and high generation rate. More in deta...
In this paper, a new architecture named Sigma is proposed (uppercase Sigma usually denotes a sum, and this architecture combines Lambda and Kappa) to provide a solution for building a complete Big Data system, interactive and scalable, using a variety of tools and techniques. The architecture has been tested in a real-life scenario an...
The uncontrolled growth of fake news creation and dissemination observed in recent years poses continuous threats to democracy, justice, and public trust. This problem has significantly driven the efforts of both academia and industry to develop more accurate fake news detection strategies. Early detection of fake news is crucial, however...
The problem of exchanging data, even incomplete and heterogeneous data, has been deeply investigated in recent years. The approaches proposed so far are quite rigid, as they refer to a fixed schema and/or are based on a deductive approach that uses a fixed set of (mapping) rules. In this paper we describe a smart data exch...
The Big Data paradigm is driving both research and industry efforts, calling for new approaches in many areas of computer science. In this paper, we show how semantic similarity search for natural language texts can be leveraged in the biomedical domain through word embedding models obtained with the word2vec algorithm, exploiting a specifically developed Big Data archite...
In this paper we propose an architecture specifically devoted to the analysis of huge natural language biomedical textual collections, with the purpose of searching for semantic similarity in order to obtain useful hints for effective simulation that could help physicians in diagnosis tasks. We leverage Word Embedding models trained with word2vec a...
This book discusses the challenges facing current research in knowledge discovery and data mining posed by the huge volumes of complex data now gathered in various real-world applications (e.g., business process monitoring, cybersecurity, medicine, language processing, and remote sensing). The book consists of 14 chapters covering the latest resear...
This book constitutes the refereed post-conference proceedings of the 8th International Workshop on New Frontiers in Mining Complex Patterns, NFMCP 2019, held in conjunction with ECML-PKDD 2019 in Würzburg, Germany, in September 2019.
The workshop focused on the latest developments in the analysis of complex and massive data sources, such as blogs,...
In this paper we address the problem of analyzing biomedical data collections with the purpose of searching for semantic similarity among textual documents. In detail, we leverage word embedding models obtained with the word2vec algorithm and a specific Big Data architecture for their management, defining an approach that permits the retrieval of sem...
The problem of exchanging data, even incomplete and heterogeneous data, has been deeply investigated in recent years. The approaches proposed so far are quite rigid, as they refer to a fixed schema and/or are based on a deductive approach that uses a fixed set of (mapping) rules. In this paper, we propose HIKE (Highly Inte...
The Big Data paradigm has recently come onto the scene in a pervasive manner. Sifting through massive amounts of this kind of data, parsing them, transferring them from a source to a target database, and analyzing them to improve business decision-making processes is too complex for traditional approaches. In this respect, there have been recent pr...
The increasing complexity of new malware and the constant refinement of detection mechanisms are driving malware writers to rethink the malware development process. In this respect, compilers play a key role and can be used to implement evasion techniques able to defeat even the new generation of detection algorithms. In this paper we provide an ov...
The data posting framework introduced in [8] adapts the well-known Data Exchange techniques to the new Big Data management and analysis challenges found in real-world scenarios. Although it is expressive enough, it requires the ability to use count constraints and may be difficult for a non-expert user. Moreover, the data posting prob...
Computational techniques, from both a software and a hardware viewpoint, are growing at impressive rates, leading to the development of projects whose complexity can be quite challenging, e.g., biomedical simulations. Meeting such high demand can be hard in many contexts for technical and economic reasons. A good trade-off can...
The current era of Big Data [7] has forced both researchers and industries to rethink the computational solutions for analyzing massive data. In fact, a great deal of attention has been devoted to the design of new algorithms for analyzing information available from Twitter, Google, Facebook, and Wikipedia, just to cite a few of the main big data p...
Health Information Systems (HIS) can offer patients great benefits in terms of quality of care and reduction in costs; thus, many organizations are starting initiatives to develop such systems in their domain. Many national and international organizations have developed their own HIS in accordance with their needs and accordin...
Big Data, as a new paradigm, has forced both researchers and industries to rethink data management techniques, which have become inadequate in many contexts. Indeed, we deal every day with huge amounts of collected data about user suggestions and searches. These data require new advanced analysis strategies to be devised in order to profitably leverag...
Currently, many emerging computer science applications call for collaborative solutions to complex projects that require huge amounts of computing resources to be completed, e.g., physical science simulation and big data analysis. Many approaches have been proposed for high-performance computing, designing a task partitioning strategy able to assign pi...
The need to support advanced analytics on Big Data is driving data scientists' interest toward massively parallel distributed systems and software platforms, such as Map-Reduce and Spark, that make their scalable utilization possible. However, when complex data mining algorithms are required, their fully scalable deployment on such platforms faces a...
Computer Science is a relatively young discipline, but in the last two decades advances in hardware technology and software engineering have induced notable changes in the way users interact with computers. In particular, several processes involving data have changed in a radical manner. As a matter of fact, the amount of data stored in reposito...
This book offers readers a comprehensive guide to the evolution of the database field from its earliest stages up to the present—and from classical relational database management systems to the current Big Data metaphor. In particular, it gathers the most significant research from the Italian database community that had relevant intersections with...
Process mining methods have been proven effective in turning historical log data into actionable process knowledge. However, most of them work under the assumption that the events reported in the logs can be easily mapped to well-defined process activities, which are the terms in which analysts are used to reasoning about the processes’ behaviors. We here...
Social network analysis is driving both research and industrial efforts, as the outcomes of this activity are relevant both from a purely theoretical point of view and for the potential market advantages they can provide to companies. Indeed, there is a growing number of applications that call for user (social) intervention with the aim of helping e...
Due to the emerging Big Data paradigm, driven by the increasing availability of intelligent services easily accessible by a large number of users (e.g., social networks), traditional data management techniques are inadequate in many real-life scenarios. In particular, the availability of huge amounts of data pertaining to user social interactions,...
Technology becomes more and more advanced every day, from both the software and the hardware perspective. Brand new devices, more powerful and capable than the generation preceding them, are steadily released. Everybody owns laptops, smartphones and many other devices with great compute capabilities, able to easily solve problems a few years ago...
Advances in computational techniques, from both a software and a hardware viewpoint, have led to the development of projects whose complexity can be quite challenging, e.g., biomedical simulations. To deal with the increased demand for computational power, many collaborative approaches have been proposed in order to apply proper partitioning s...
Watermarking digital content is a very common approach leveraged by creators of copyrighted digital data to embed fingerprints into their data. The rationale of such operation is to mark each copy of the data in order to uniquely identify it. These watermarks are embedded in a suitable way to prevent their stripping or modification by users for ill...
A centroid-based clustering algorithm is proposed that works in a totally unsupervised fashion and is significantly faster and more accurate than existing algorithms. The algorithm, named CLUBS (for CLustering Using Binary Splitting), achieves these results by combining features of hierarchical and partition-based algorithms. Thus, CLUBS consists o...
This book features a collection of revised and significantly extended versions of the papers accepted for presentation at the 5th International Workshop on New Frontiers in Mining Complex Patterns, NFMCP 2016, held in conjunction with ECML-PKDD 2016 in Riva del Garda, Italy, in September 2016. The book is composed of five parts: feature selection a...
Information management in healthcare is experiencing a great revolution. After the impressive progress made by private organizations in digitizing medical data, the federal government and other public stakeholders have also started to use healthcare data for analysis purposes in order to extract actionable knowledge. In this p...
The Big Data paradigm is currently the leading paradigm for data production and management. As a matter of fact, new information is generated at high rates in specialized fields (e.g., cybersecurity scenarios). As a result, the events to be studied may occur at rates that are too fast to be effectively analyzed in real time. For example, in order to...
Log analysis and querying have recently received renewed interest from the research community, as the effective understanding of process behavior is crucial for improving business process management. Indeed, currently available log querying tools are not completely satisfactory, especially from the viewpoint of ease of use. As a matter of fact, th...
Due to the increasing availability of huge amounts of data, traditional data management techniques prove inadequate in many real-life scenarios. Furthermore, the heterogeneity and high speed of this data require suitable data storage and management tools to be designed from scratch. In this paper, we describe a framework tailored for analyzing user in...
Information management in healthcare is experiencing a great revolution. After the impressive progress made by private organizations in digitizing medical data, the federal government and other public stakeholders have also started to use healthcare data for analysis purposes in order to extract actionable knowledge. In this p...
This book constitutes the thoroughly refereed post-conference proceedings of the 4th International Workshop on New Frontiers in Mining Complex Patterns, NFMCP 2015, held in conjunction with ECML-PKDD 2015 in Porto, Portugal, in September 2015.
The 15 revised full papers presented together with one invited talk were carefully reviewed and selected f...
We consider the scenario where the executions of different business processes are traced into a log, where each trace describes a process instance as a sequence of low-level events (representing basic kinds of operations). In this context, we address a novel problem: given a description of the processes’ behaviors in terms of high-level activities...
The issue of devising efficient and effective solutions for supporting the analysis of process logs has recently received great attention from the research community, as effectively accomplishing any business process management task requires understanding the behavior of the processes. In this paper, we propose a new framework supporting the analys...
Predicting the output power of renewable energy production plants distributed over a wide territory is a valuable goal, both for marketing and energy management purposes. The Vi-POC (Virtual Power Operating Center) project aims at designing and implementing a prototype able to achieve this goal. Due to the heterogeneity and the high volum...
The increasing availability of large process log repositories calls for efficient solutions for their analysis. In this regard, a novel specialized compression technique for process logs is proposed that builds a synopsis supporting fast estimation of aggregate queries, which are of crucial importance in exploratory and high-level analysis tasks...
Predicting the output power of renewable energy production plants distributed on a wide territory is a valuable goal, both for marketing and energy management purposes. In this paper, we describe Vi-POC (Virtual Power Operating Center) – a distributed system for storing huge amounts of data, gathered from energy production plants and weather predic...
In this paper we propose an end-to-end framework that allows efficient analysis of trajectory streams. Our approach consists of several steps. First, we apply a partitioning strategy to incoming streams of trajectories in order to reduce the trajectory size and represent trajectories using a suitable data structure. After the enc...
Recent advances in genomic technologies and the availability of large-scale datasets call for the development of advanced data analysis techniques, such as data mining and statistical analysis, to name a few. A main goal in understanding cell mechanisms is to explain the relationship among genes and related molecular processes through the combin...
The problem of accurately predicting energy production from renewable sources has recently received increasing attention from both the industrial and the research communities. It presents several challenges, such as coping with the rate at which data are provided by sensors, the heterogeneity of the data collected, power plant efficiency, as well as...
In this paper, we study the problem of mining for frequent trajectories, which is crucial in many application scenarios, such as vehicle traffic management, hand-off in cellular networks, and supply chain management. We approach this problem as that of mining for frequent sequential patterns. Our approach consists of a partitioning strategy for incomin...
The recent developments in technologies and life sciences have paved the way to complex interactions among entities in distributed and heterogeneous environments. As a result, an enormous amount of valuable information is available, spanning from structured to multimedia and spatial or spatio-temporal data. The data mining research community has be...
Nowadays, almost all kinds of electronic devices (e.g., smartphones and GPS devices) leave traces of their movements. Thus, the huge number of these “tiny” data sources leads to the generation of massive streams of geo-referenced data. As a matter of fact, the effective analysis of such amounts of data is challenging, since the possibility...
The problem of accurately predicting energy production from renewable sources has recently received increasing attention from both the industrial and the research communities. It presents several challenges, such as coping with the high rate at which data are provided by sensors, the heterogeneity of the data collected, power plant efficiency, as we...