
Diego Reforgiato RecuperoUniversità degli studi di Cagliari | UNICA · mathematics and computer science
Diego Reforgiato Recupero
Professor
About
224
Publications
75,585
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
3,383
Citations
Citations since 2017
Introduction
Diego Reforgiato Recupero is professor at the University of Cagliari. His short bio can be found here
https://www.unica.it/unica/it/ateneo_s07_ss01_sss01.page?contentId=SHD30988
Publications
Publications (224)
Natural Language Processing (NLP) is crucial to perform recommendations of items that can be only described by natural language. However, NLP usage within recommendation modules is difficult and usually requires a relevant initial effort, thus limiting its widespread adoption. To overcome this limitation, we introduce FORESEE, a novel architecture...
Research on the analysis of counselling conversations through natural language processing methods has seen remarkable growth in recent years. However, the potential of this field is still greatly limited by the lack of access to publicly available therapy dialogues, especially those with expert annotations, but it has been alleviated thanks to the...
In the last few years, chatbots have become mainstream solutions adopted in a variety of domains for automatizing communication at scale. In the same period, knowledge graphs have attracted significant attention from business and academia as robust and scalable representations of information. In the scientific and academic research domain, they are...
In this paper, we propose an innovative tool able to enrich cultural and creative spots (gems , hereinafter) extracted from the European Commission Cultural Gems portal, by suggesting relevant keywords ( tags ) and YouTube videos (represented with proper thumbnails ). On the one hand, the system queries the YouTube search portal, selects the videos...
Machine learning techniques have recently become the norm for detecting patterns in financial markets. However, relying solely on machine learning algorithms for decision-making can have negative consequences, especially in a critical domain such as the financial one. On the other hand, it is well-known that transforming data into actionable insigh...
Human-centricity is the core value behind the evolution of manufacturing towards Industry 5.0. Nevertheless, there is a lack of architecture that considers safety, trustworthiness, and human-centricity at its core. Therefore, we propose an architecture that integrates Artificial Intelligence (Active Learning, Forecasting, Explainable Artificial Int...
In the last few years, we have witnessed the emergence of several knowledge graphs that explicitly describe research knowledge with the aim of enabling intelligent systems for supporting and accelerating the scientific process. These resources typically characterize a set of entities in this space (e.g., tasks, methods, evaluation techniques, prote...
Research publishing companies need to constantly monitor and compare scientific journals and conferences in order to inform critical business and editorial decisions. Semantic Web and Knowledge Graph technologies are natural solutions since they allow these companies to integrate, represent, and analyse a large quantity of information from heteroge...
Science communication has a number of bottlenecks that include the rising number of published research papers and its non-machine-accessible and document-based paradigm, which makes the exploration, reading, and reuse of research outcomes rather inefficient. Recently, Knowledge Graphs (KG), i.e., semantic interlinked networks of entities, have been...
In recent years, we saw the emergence of several approaches for producing machine-readable, semantically rich, interlinked descriptions of the content of research publications, typically encoded as knowledge graphs. A common limitation of these solutions is that they address a low number of articles, either because they rely on human experts to sum...
Lexicons have risen as alternative resources to common supervised methods for classification or regression in different domains (e.g., Sentiment Analysis). These resources (especially lexical) lack of important domain context and it is not possible to tune/edit/improve them depending on new domains and data. With the exponential production of data...
This paper investigates a new method to simulate pedestrian crowd movement in a large and complex virtual environment, representing a public space such as a shopping mall. To demonstrate pedestrian dynamics, we consider groups of pedestrians of different size, sharing a crowded environment. A pedestrian has its own characteristics, such as gender,...
This open-source tool, written in Python, referred to as XAI StatArb, implements a machine learning approach (ML) powered by eXplainable Artificial Intelligence techniques integrated into a statistical arbitrage trading pipeline. Specifically, given a set of stocks and their raw financial information, the tool aims at forecasting the next day’s ret...
Research on natural language processing for counselling dialogue analysis has seen substantial development in recent years, but access to this area remains extremely limited due to the lack of publicly available expert-annotated therapy conversations. In this work, we introduce AnnoMI, the first publicly and freely accessible dataset of professiona...
Nowadays, video-sharing portals’ popularity has entailed massive growth in data uploads over the Internet. For several applications (e.g., browsing, retrieval, or recommendation of videos), dealing with vast data volumes has become a critical issue. In a video-sharing scenario, the devising of tools and infrastructures able to completely satisfy us...
The availability of frameworks and applications in the robotic domain fostered in the last years a spread in the adoption of robots in daily life activities. Many of these activities include the robot teleoperation, i.e. controlling its movements remotely. Virtual Reality (VR) demonstrated its effectiveness in lowering the skill barrier for such a...
Human-centricity is the core value behind the evolution of manufacturing towards Industry 5.0. Nevertheless, there is a lack of architecture that considers safety, trustworthiness, and human-centricity at its core. Therefore, we propose an architecture that integrates Artificial Intelligence (Active Learning, Forecasting, Explainable Artificial Int...
In most Computer Vision applications, Deep Learning models achieve state-of-the-art performances. One drawback of Deep Learning is the large amount of data needed to train the models. Unfortunately, in many applications, data are difficult or expensive to collect. Data augmentation can alleviate the problem, generating new data from a smaller initi...
One of the most important steps when employing machine learning approaches is the feature engineering process. It plays a key role in the identification that features that can effectively help modeling the given classification or regression task. This process is usually not trivial and it might lead to the development of handcrafted features. Withi...
Scientific conferences are essential for developing active research communities, promoting the cross-pollination of ideas and technologies, bridging between academia and industry, and disseminating new findings. Analyzing and monitoring scientific conferences is thus crucial for all users who need to take informed decisions in this space. However,...
The use of superior algorithms and complex architectures in language models have successfully imparted human-like abilities to machines for specific tasks. But two significant constraints, the available training data size and the understanding of domain-specific context, hamper the pre-trained language models from optimal and reliable performance....
Academia and industry share a complex, multifaceted, and symbiotic relationship. Analysing the knowledge flow between them, understanding which directions have the biggest potential, and discovering the best strategies to harmonise their efforts is a critical task for several stakeholders. Research publications and patents are an ideal medium to an...
We propose a new method for simulating pedestrian crowd movement in a virtual environment. A crowd consists of groups of different number of people with different attributes such as gender, age, position, velocity, and energy. Each group has its own intention used to generate a trajectory for each pedestrian navigating in the virtual environment. A...
In the last decades, modern societies are experiencing an increasing adoption of interconnected smart devices. This revolution involves not only canonical devices such as smartphones and tablets, but also simple objects like light bulbs. Named as the Internet of Things (IoT), this ever-growing scenario offers enormous opportunities in many areas of...
A lot of people have neuromuscular problems that affect their lives leading them to lose an important degree of autonomy in their daily activities. When their disabilities do not involve speech disorders, robotic wheelchairs with voice assistant technologies may provide appropriate human–robot interaction for them. Given the wide improvement and di...
In this paper, we investigate the multi‐goal path planning problem to find the shortest collision‐free path connecting a given set of goal points located in the robot working environment. This problem combines two sub‐problems: first, optimize the sequence of the goal points located in the free workspace; second, compute the shortest collision‐free...
There is a lack of a single architecture specification that addresses the needs of trusted and secure Artificial Intelligence systems with humans in the loop, such as human-centered manufacturing systems at the core of the evolution towards Industry 5.0. To realize this, we propose an architecture that integrates forecasts, Explainable Artificial I...
Knowledge graphs (KGs) are widely used for modeling scholarly communication, performing scientometric analyses, and supporting a variety of intelligent services to explore the literature and predict research dynamics. However, they often suffer from incompleteness (e.g., missing affiliations, references, research topics), leading to a reduced scope...
The incompleteness of Knowledge Graphs (KGs) is a crucial issue affecting the quality of AI-based services. In the scholarly domain, KGs describing research publications typically lack important information, hindering our ability to analyse and predict research dynamics. In recent years, link prediction approaches based on Knowledge Graph Embedding...
This chapter is an introduction to the use of data science technologies in the fields of economics and finance. The recent explosion in computation and information technology in the past decade has made available vast amounts of data in various domains, which has been referred to as Big Data . In economics and finance, in particular, tapping into t...
This open access book covers the use of data science, including advanced machine learning, big data analytics, Semantic Web technologies, natural language processing, social media analysis, time series analysis, among others, for applications in economics and finance. In addition, it shows some successful applications of advanced data science solut...
The incompleteness of Knowledge Graphs (KGs) is a crucial issue affecting the quality of AI-based services. In the scholarly domain, KGs describing research publications typically lack important information, hindering our ability to analyse and predict research dynamics. In recent years, link prediction approaches based on Knowledge Graph Embedding...
Today, we are seeing an ever-increasing number of clinical notes that contain clinical results, images, and textual descriptions of patient's health state. All these data can be analyzed and employed to cater novel services that can help people and domain experts with their common healthcare tasks. However, many technologies such as Deep Learning a...
Empathetic response from the therapist is key to the success of clinical psychotherapy, especially motivational interviewing. Previous work on computational modelling of empathy in motivational interviewing has focused on offline, session-level assessment of therapist empathy, where empathy captures all efforts that the therapist makes to understan...
The credit scoring models are aimed to assess the capability of refunding a loan by assessing user reliability in several financial contexts, representing a crucial instrument for a large number of financial operators such as banks. Literature solutions offer many approaches designed to evaluate users' reliability on the basis of information about...
In the current age of overwhelming information and massive production of textual data on the Web, Event Detection has become an increasingly important task in various application domains. Several research branches have been developed to tackle the problem from different perspectives, including Natural Language Processing and Big Data analysis, with...
This paper introduces , a new semantic role labeling method that transforms a text into a frame-oriented knowledge graph. It performs dependency parsing, identifies the words that evoke lexical frames, locates the roles and fillers for each frame, runs coercion techniques, and formalizes the results as a knowledge graph. This formal representation...
WordNet lexical-database groups English words into sets of synonyms called "synsets." Synsets are utilized for several applications in the field of text-mining. However, they were also open to criticism because although, in reality, not all the members of a synset represent the meaning of that synset with the same degree, in practice, they are cons...
There is a lack of a single architecture specification that addresses the needs of trusted and secure Artificial Intelligence systems with humans in the loop, such as human-centered manufacturing systems at the core of the evolution towards Industry 5.0. To realize this, we propose an architecture that integrates forecasts, Explainable Artificial I...
Today, increasing numbers of people are interacting online and a lot of textual comments
are being produced due to the explosion of online communication. However, a paramount inconvenience within online environments is that comments that are shared within digital platforms can hide hazards, such as fake news, insults, harassment, and, more in gener...
Local energy communities (LECs) are structures based on the collaboration of neighbouring prosumers for suiting their energy requests. Prosumers are users participating in the community, which are able to produce energy rather than just consuming it. These communities have the purpose to incentivize usage of renewable energy. Inside them, it is pos...
The electric power grid undergoes a transformation, with many consumers becoming both producers and consumers of electricity. This transformation poses challenges to the existing grid as it was not designed to have reverse power flows. Local energy communities are effective in addressing those issues and engaging grid users to play an active role i...
The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is related to the fact that its analysis has become difficult due to the high volume of published papers for which manual effort for annotations and management is required. Novel technological infrastructures are needed to hel...
In recent years, machine learning algorithms have been successfully employed to leverage the potential of identifying hidden patterns of financial market behavior and, consequently, have become a land of opportunities for financial applications such as algorithmic trading. In this paper, we propose a statistical arbitrage trading strategy with two...
In this manuscript, we propose a Machine Learning approach to tackle a binary classification problem whose goal is to predict the magnitude (high or low) of future stock price variations for individual companies of the S&P 500 index. Sets of lexicons are generated from globally published articles with the goal of identifying the most impactful word...
The stock market forecasting is one of the most challenging application of machine learning, as its historical data are naturally noisy and unstable. Most of the successful approaches act in a supervised manner, labeling training data as being of positive or negative moments of the market. However, training machine learning classifiers in such a wa...
The adoption of computer-aided stock trading methods is gaining popularity in recent years, mainly because of their ability to process efficiently past information through machine learning to predict future market behavior. Several approaches have been proposed to this task, with the most effective ones using fusion of a pile of classifiers decisio...
In this paper we present a mixture of technologies tailored for e-learning related to the Deep Learning, Sentiment Analysis, and Semantic Web domains, which we have employed to show four different use cases that we have validated in the field of Human-Robot Interaction. The approach has been designed using Zora, a humanoid robot that can be easily...
The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is related to the fact that its analysis has become difficult due to the high volume of published papers for which manual effort for annotations and management is required. Novel technological infrastructures are needed to hel...
The past decade has seen an explosion of the amount of digital information generated within the healthcare domain. Digital data exist in the form of images, video, speech, transcripts, electronic health records, clinical records, and free-text. Analysis and interpretation of healthcare data is a daunting task, and it demands a great deal of time, r...
Financial forecasting represents a challenging task, mainly due to the irregularity of the market, high fluctuations and noise of the involved data, as well as collateral phenomena including investor mood and mass psychology. In recent years, many researchers focused their work on predicting the performance of the market by exploiting novel Machine...
Abstract Unhealthy behaviors regarding nutrition are a global risk for health. Therefore, the healthiness of an individual’s nutrition should be monitored in the medium and long term. A powerful tool for monitoring nutrition is a food diary; i.e., a daily list of food taken by the individual, together with portion information. Unfortunately, frail...
The anomaly-based Intrusion Detection Systems (IDSs) represent one of the most efficient methods in countering the intrusion attempts against the ever growing number of network-based services. Despite the central role they play, their effectiveness is jeopardized by a series of problems that reduce the IDS effectiveness in a real-world context, mai...
Scientific knowledge has been traditionally disseminated and preserved through research articles published in journals, conference proceedings , and online archives. However, this article-centric paradigm has been often criticized for not allowing to automatically process, categorize , and reason on this knowledge. An alternative vision is to gener...