Diego Reforgiato Recupero

University of Cagliari | UNICA · mathematics and computer science

Professor

269
Publications
105,573
Reads
5,096
Citations

Publications (269)
Article
Full-text available
Model interpretability is essential in machine learning, particularly for applications in critical fields like healthcare, where understanding model decisions is paramount. While SHAP (SHapley Additive exPlanations) has proven to be a robust tool for explaining machine learning predictions, its high computational cost limits its practicality for re...
Preprint
Full-text available
Large language models (LLMs) have shown promising capabilities in healthcare analysis but face several challenges like hallucinations, parroting, and bias manifestation. These challenges are exacerbated in complex, sensitive, and low-resource domains. Therefore, in this work we introduce IC-AnnoMI, an expert-annotated motivational interviewing (MI)...
Article
Full-text available
Machine and Deep Learning methods are widely adopted to predict corporate bankruptcy events for their effectiveness. Bankruptcy prediction is commonly modeled as a binary classification task over accounting data where the positive label is associated with companies with a high likelihood of bankruptcy and the negative label with a low risk of failu...
Preprint
Full-text available
Numerous methods and pipelines have recently emerged for the automatic extraction of knowledge graphs from documents such as scientific publications and patents. However, adapting these methods to incorporate alternative text sources like micro-blogging posts and news has proven challenging as they struggle to model open-domain entities and relatio...
Conference Paper
Full-text available
Several techniques and workflows have emerged recently for automatically extracting knowledge graphs from documents like scientific articles and patents. However, adapting these approaches to integrate alternative text sources such as micro-blogging posts and news and to model open-domain entities and relationships commonly found in these sources i...
Article
Full-text available
Online platforms have become the primary means for travellers to search, compare, and book accommodations for their trips. Consequently, online platforms and revenue managers must acquire a comprehensive understanding of these dynamics to formulate competitive and appealing offerings. Recent advancements in natural language processing, specifical...
Article
Full-text available
The labor market is a dynamic and rapidly evolving environment. Job positions that require expertise in various sectors often lead candidates to question their suitability. Therefore, it is crucial to furnish them with relevant, accurate, and timely information. In this article, we introduce a knowledge plug-in for existing conversational agents de...
Article
Full-text available
Numerous methods and pipelines have recently emerged for the automatic extraction of knowledge graphs from documents such as scientific publications and patents. However, adapting these methods to incorporate alternative text sources like micro-blogging posts and news has proven challenging as they struggle to model open-domain entities and relatio...
Conference Paper
Full-text available
This paper explores the growing importance of Environmental, Social, and Governance (ESG) criteria in financial assessments and conducts an AI-driven analysis of ESG concepts' evolution from 1980 to 2022. Focusing on media sources from the United States and the United Kingdom, the study utilizes the Dow Jones News Article dataset for a comprehensiv...
Article
Full-text available
A primary concern in the realm of mechanical engineering is to ensure the efficient and effective data entry of hardware devices. Fasteners are mechanical tools that rigidly connect or affix two surfaces or objects together. They are small and often different fasteners might look similar; it is therefore a long and prone-to-risk procedure to manual...
Article
Full-text available
The use of machine learning in healthcare has the potential to improve patient outcomes as well as broaden the reach and affordability of healthcare. The history of other application areas indicates that strong benchmarks are essential for the development of intelligent systems. We present Personal Health Interfaces Leveraging HUman-MAchine Natura...
Article
Full-text available
Generating synthetic data is a complex task that necessitates accurately replicating the statistical and mathematical properties of the original data elements. In sectors such as finance, utilizing and disseminating real data for research or model development can pose substantial privacy risks owing to the inclusion of sensitive information. Additi...
Article
Full-text available
In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scient...
Article
Full-text available
Industries such as construction and business companies are becoming increasingly digitized. The amount of data to be monitored and processed has increased significantly since the advent of the Internet of Things and the massive use of sensors. In addition to the data from these sensors, large amounts of data that require specific handling and proce...
Article
Full-text available
In recent years, the significance of Environmental, Social, and Governance criteria in assessing financial investments has grown significantly. This paper presents an AI-driven analysis of ESG concepts and their evolution from 1980 to 2022, with a specific focus on media sources from the United States and the United Kingdom. The primary data source...
Article
Full-text available
In video analysis, collecting and labeling data can be time- and resource-consuming. Synthetic data augmentation is a promising solution to the data scarcity problem. In this paper, we present an approach to generate synthetic videos for action recognition using Unity, the popular game engine. The synthetic videos are generated with hi...
Article
Full-text available
In today’s rapidly evolving labor market, the emergence of new roles and the decline of traditional ones have led to a complex landscape of job titles and skill requirements. This complexity often causes ambiguity and confusion, affecting both novices and experienced professionals. To address this, extensive international efforts have produced refe...
Chapter
Full-text available
The crucial task of analysing the complex dynamics of the research landscape and uncovering the latest insights from the scientific literature is of paramount importance to researchers, governments, and commercial organizations. Springer Nature, one of the leading academic publishers worldwide, plays a significant role in this domain and regularly...
Chapter
Full-text available
The labor market is a key part of an economy. Several existing online platforms allow the upload of resumes and the search for a job. One of their limitations, however, is that obtaining the best opportunity can be hard because certain jobs need some experiences, abilities, and features that an applicant might not know. The recent diffusion and emp...
Chapter
Improvements in global wealth and well-being result in an increase in the life expectancy of the population and, consequently, determine an increase in the number of people with physical or mental impairments. This slice of the population needs daily assistance and monitoring to live a safe and productive life. For this reason, several researchers...
Article
Full-text available
Value at risk is a statistic used to anticipate the largest possible losses over a specific time frame and within some level of confidence, usually 95% or 99%. For risk management and regulators, it offers a solution for trustworthy quantitative risk management tools. VaR has become the most widely used and accepted indicator of downside risk. Toda...
Article
Full-text available
Both the operational phase and embodied emissions that are introduced during the construction phase through the manufacture, sourcing, and installation of the building's materials and components are significant contributors to carbon emissions from the built environment. It is essential to change the current design and (re)construction processes in...
Article
Full-text available
Modern financial markets produce massive datasets that need to be analysed using new modelling techniques like those from (deep) Machine Learning and Artificial Intelligence. The common goal of these techniques is to forecast the behaviour of the market, which can be translated into various classification tasks, such as predicting th...
Preprint
Full-text available
Understanding the relationship between the composition of a research team and the potential impact of their research papers is crucial as it can steer the development of new science policies for improving the research enterprise. Numerous studies assess how the characteristics and diversity of research teams can influence their performance across s...
Article
Music is an extremely subjective art form whose commodification via the recording industry in the 20th century has led to an increasingly subdivided set of genre labels that attempt to organize musical styles into definite categories. Music psychology has been studying the processes through which music is perceived, created, responded to, and incor...
Conference Paper
Full-text available
Understanding the relationship between the composition of a research team and the potential impact of their research papers is crucial as it can steer the development of new science policies for improving the research enterprise. Numerous studies assess how the characteristics and diversity of research teams can influence their performance across s...
Conference Paper
Natural Language Processing (NLP) is crucial to perform recommendations of items that can be only described by natural language. However, NLP usage within recommendation modules is difficult and usually requires a relevant initial effort, thus limiting its widespread adoption. To overcome this limitation, we introduce FORESEE, a novel architecture...
Article
Full-text available
Research on the analysis of counselling conversations through natural language processing methods has seen remarkable growth in recent years. However, the potential of this field is still greatly limited by the lack of access to publicly available therapy dialogues, especially those with expert annotations, but it has been alleviated thanks to the...
Chapter
The last few decades have witnessed the increasing deployment of digital technologies in the urban environment with the goal of creating improved services to citizens especially related to their safety. This motivation, enabled by the widespread evolution of cutting edge technologies within the Artificial Intelligence, Internet of Things, and Compu...
Article
Full-text available
In the last few years, chatbots have become mainstream solutions adopted in a variety of domains for automatizing communication at scale. In the same period, knowledge graphs have attracted significant attention from business and academia as robust and scalable representations of information. In the scientific and academic research domain, they are...
Article
Full-text available
The tourism and hospitality sectors have become increasingly important in the last few years and the companies operating in this field are constantly challenged with providing new innovative services. At the same time, (big-) data has become the “new oil” of this century and Knowledge Graphs are emerging as the most natural way to collect, refine,...
Article
Full-text available
In this paper, we propose an innovative tool able to enrich cultural and creative spots (gems, hereinafter) extracted from the European Commission Cultural Gems portal, by suggesting relevant keywords (tags) and YouTube videos (represented with proper thumbnails). On the one hand, the system queries the YouTube search portal, selects the videos...
Article
Full-text available
Machine learning techniques have recently become the norm for detecting patterns in financial markets. However, relying solely on machine learning algorithms for decision-making can have negative consequences, especially in a critical domain such as the financial one. On the other hand, it is well-known that transforming data into actionable insigh...
Article
Full-text available
Human-centricity is the core value behind the evolution of manufacturing towards Industry 5.0. Nevertheless, there is a lack of architecture that considers safety, trustworthiness, and human-centricity at its core. Therefore, we propose an architecture that integrates Artificial Intelligence (Active Learning, Forecasting, Explainable Artificial Int...
Article
Full-text available
In the last few years, we have witnessed the emergence of several knowledge graphs that explicitly describe research knowledge with the aim of enabling intelligent systems for supporting and accelerating the scientific process. These resources typically characterize a set of entities in this space (e.g., tasks, methods, evaluation techniques, prote...
Chapter
Full-text available
Research publishing companies need to constantly monitor and compare scientific journals and conferences in order to inform critical business and editorial decisions. Semantic Web and Knowledge Graph technologies are natural solutions since they allow these companies to integrate, represent, and analyse a large quantity of information from heteroge...
Article
Full-text available
Science communication has a number of bottlenecks that include the rising number of published research papers and its non-machine-accessible and document-based paradigm, which makes the exploration, reading, and reuse of research outcomes rather inefficient. Recently, Knowledge Graphs (KG), i.e., semantic interlinked networks of entities, have been...
Conference Paper
In recent years, we saw the emergence of several approaches for producing machine-readable, semantically rich, interlinked descriptions of the content of research publications, typically encoded as knowledge graphs. A common limitation of these solutions is that they address a low number of articles, either because they rely on human experts to sum...
Article
Full-text available
Lexicons have risen as alternative resources to common supervised methods for classification or regression in different domains (e.g., Sentiment Analysis). These resources (especially lexical) lack important domain context and it is not possible to tune/edit/improve them depending on new domains and data. With the exponential production of data...
Article
Full-text available
This paper investigates a new method to simulate pedestrian crowd movement in a large and complex virtual environment, representing a public space such as a shopping mall. To demonstrate pedestrian dynamics, we consider groups of pedestrians of different size, sharing a crowded environment. A pedestrian has its own characteristics, such as gender,...
Article
Full-text available
This open-source tool, written in Python, referred to as XAI StatArb, implements a machine learning approach (ML) powered by eXplainable Artificial Intelligence techniques integrated into a statistical arbitrage trading pipeline. Specifically, given a set of stocks and their raw financial information, the tool aims at forecasting the next day’s ret...
Conference Paper
Full-text available
Research on natural language processing for counselling dialogue analysis has seen substantial development in recent years, but access to this area remains extremely limited due to the lack of publicly available expert-annotated therapy conversations. In this work, we introduce AnnoMI, the first publicly and freely accessible dataset of professiona...
Article
Nowadays, video-sharing portals’ popularity has entailed massive growth in data uploads over the Internet. For several applications (e.g., browsing, retrieval, or recommendation of videos), dealing with vast data volumes has become a critical issue. In a video-sharing scenario, the devising of tools and infrastructures able to completely satisfy us...
Article
Full-text available
The availability of frameworks and applications in the robotic domain has fostered, in recent years, the adoption of robots in daily life activities. Many of these activities include robot teleoperation, i.e., controlling the robot's movements remotely. Virtual Reality (VR) has demonstrated its effectiveness in lowering the skill barrier for such a...
Preprint
Full-text available
Human-centricity is the core value behind the evolution of manufacturing towards Industry 5.0. Nevertheless, there is a lack of architecture that considers safety, trustworthiness, and human-centricity at its core. Therefore, we propose an architecture that integrates Artificial Intelligence (Active Learning, Forecasting, Explainable Artificial Int...
Article
Full-text available
In most Computer Vision applications, Deep Learning models achieve state-of-the-art performances. One drawback of Deep Learning is the large amount of data needed to train the models. Unfortunately, in many applications, data are difficult or expensive to collect. Data augmentation can alleviate the problem, generating new data from a smaller initi...
Chapter
Full-text available
One of the most important steps when employing machine learning approaches is the feature engineering process. It plays a key role in the identification of features that can effectively help model the given classification or regression task. This process is usually not trivial and it might lead to the development of handcrafted features. Withi...
Article
Full-text available
Scientific conferences are essential for developing active research communities, promoting the cross-pollination of ideas and technologies, bridging between academia and industry, and disseminating new findings. Analyzing and monitoring scientific conferences is thus crucial for all users who need to take informed decisions in this space. However,...
Article
Full-text available
The use of superior algorithms and complex architectures in language models has successfully imparted human-like abilities to machines for specific tasks. But two significant constraints, the available training data size and the understanding of domain-specific context, keep pre-trained language models from achieving optimal and reliable performance....
Article
Full-text available
Academia and industry share a complex, multifaceted, and symbiotic relationship. Analysing the knowledge flow between them, understanding which directions have the biggest potential, and discovering the best strategies to harmonise their efforts is a critical task for several stakeholders. Research publications and patents are an ideal medium to an...