Ruihai Dong

  • Doctor of Philosophy (PhD)
  • Lecturer at University College Dublin

About

98
Publications
31,108
Reads
1,718
Citations
Introduction
Opinionated product recommendation combines the notions of similarity and opinion polarity during the recommendation process by adding product features and sentiment information, extracted from user-generated product reviews, to a traditional case-based recommender system.
Current institution
University College Dublin
Current position
  • Lecturer

Publications (98)
Preprint
Full-text available
Deep learning has achieved significant breakthroughs in medical imaging, but these advancements are often dependent on large, well-annotated datasets. However, obtaining such datasets poses a significant challenge, as it requires time-consuming and labor-intensive annotations from medical experts. Consequently, there is growing interest in learning...
Preprint
Full-text available
Pre-trained transformer models have shown great promise in various natural language processing tasks, including personalized news recommendations. To harness the power of these models, we introduce Transformers4NewsRec, a new Python framework built on the Transformers library. This framework is designed to unify and compare the performance...
Preprint
Accurate and robust stock trend forecasting has been a crucial and challenging task, as stock price changes are influenced by multiple factors. Graph neural network-based methods have recently achieved remarkable success in this domain by constructing stock relationship graphs that reflect internal factors and relationships between stocks. However,...
Preprint
Full-text available
Societal risk emanating from how recommender algorithms disseminate content online is now well documented. Emergent regulation aims to mitigate this risk through ethical audits and enabling new research on the social impact of algorithms. However, there is currently a need for tools and methods that enable such evaluation. This paper presents ARTAI...
Preprint
Full-text available
Representation learning has emerged as a powerful paradigm for extracting valuable latent features from complex, high-dimensional data. In financial domains, learning informative representations for assets can be used for tasks like sector classification, and risk management. However, the complex and stochastic nature of financial markets poses uni...
Article
News recommender systems (NRS) have been widely applied for online news websites to help users find relevant articles based on their interests. Recent methods have demonstrated considerable success in terms of recommendation performance. However, the lack of explanation for these recommendations can lead to mistrust among users and lack of acceptan...
Article
Full-text available
Image emotion classification (IEC) aims to extract the abstract emotions evoked in images. Recently, language-supervised methods such as contrastive language-image pretraining (CLIP) have demonstrated superior performance in image understanding. However, the underexplored task of IEC presents three major challenges: a tremendous training objective...
Article
Full-text available
The negative effects of media bias, such as affecting readers’ perceptions and influencing their social decisions, have been widely identified by social scientists. The cumulative impact of the combination of media bias and personalised news recommendation systems, however, has remained largely unstudied, particularly in real-world news recommendat...
Article
Full-text available
Geological fault detection is a critical aspect of geological exploitation and oil-gas exploration. The automation of fault detection can significantly reduce the dependence on expert labeling. Current prevailing methods often treat fault detection as a semantic segmentation task using the Convolutional Neural Network (CNN). However, CNNs emphasize...
Chapter
Full-text available
Spectrum analysis systems in online water quality testing are designed to detect types and concentrations of pollutants and enable regulatory agencies to respond promptly to pollution incidents. However, spectral data-based testing devices suffer from complex noise patterns when deployed in non-laboratory environments. To make the analysis model ap...
Chapter
Media bias has significant negative effects, such as influencing elections and shaping people’s perceptions. However, the relationship between media bias and personalised news recommendation algorithms (widely adopted by many news platforms) remains unclear. In this study, we describe a novel framework that simulates user interactions with recommen...
Preprint
Full-text available
Spectrum analysis systems in online water quality testing are designed to detect types and concentrations of pollutants and enable regulatory agencies to respond promptly to pollution incidents. However, spectral data-based testing devices suffer from complex noise patterns when deployed in non-laboratory environments. To make the analysis model ap...
Chapter
The financial domain has proven to be a fertile source of challenging machine learning problems across a variety of tasks including prediction, clustering, and classification. Researchers can access an abundance of time-series data and even modest performance improvements can be translated into significant additional value. In this work, we conside...
Preprint
Full-text available
Precisely recommending candidate news articles to users has always been a core challenge for personalized news recommendation systems. Most recent works primarily focus on using advanced natural language processing techniques to extract semantic information from rich textual data, employing content-based methods derived from local historical news....
Conference Paper
Full-text available
Identifying meaningful and actionable relationships between the price movements of financial assets is a challenging but important problem for many financial tasks, from portfolio optimization to sector classification. However, recent machine learning research often focuses on price forecasting, neglecting the understanding and modelling of asset r...
Preprint
In recent years, many recommender systems have utilized textual data for topic extraction to enhance interpretability. However, our findings reveal a noticeable deficiency in the coherence of keywords within topics, resulting in low explainability of the model. This paper introduces a novel approach called entropy regularization to address the issu...
Preprint
Full-text available
News recommender systems (NRS) have been widely applied for online news websites to help users find relevant articles based on their interests. Recent methods have demonstrated considerable success in terms of recommendation performance. However, the lack of explanation for these recommendations can lead to mistrust among users and lack of acceptan...
Article
In recent years, human activity recognition (HAR) methods have developed rapidly. However, most existing methods are based on a single input data modality and suffer from accuracy and robustness issues. In this paper, we present a novel multi-modal HAR architecture which fuses signals from both RGB visual data and Inertial Measurement Units (IMU) data....
Preprint
There have been growing concerns regarding the out-of-domain generalization ability of natural language processing (NLP) models, particularly in question-answering (QA) tasks. Current synthesized data augmentation methods for QA are hampered by increased training costs. To address this issue, we propose a novel approach that combines prompting meth...
Preprint
Full-text available
The financial domain has proven to be a fertile source of challenging machine learning problems across a variety of tasks including prediction, clustering, and classification. Researchers can access an abundance of time-series data and even modest performance improvements can be translated into significant additional value. In this work, we conside...
Article
Full-text available
As deep learning (DL) models have been successfully applied to various image processing tasks, DL models, particularly convolutional neural networks (CNN), have been introduced into the geosciences to assist geologists in faster seismic interpretation. However, the generalization of DL-based fault interpretation is a challenge. When applied to seis...
Preprint
Full-text available
Industry classification schemes provide a taxonomy for segmenting companies based on their business activities. They are relied upon in industry and academia as an integral component of many types of financial and economic analysis. However, even modern classification schemes have failed to embrace the era of big data and remain a largely subjectiv...
Article
Financial forecasting has been an important and active area of machine learning research because of the challenges it presents and the potential rewards that even minor improvements in prediction accuracy or forecasting may entail. Traditionally, financial forecasting has heavily relied on quantitative indicators and metrics derived from structured...
Conference Paper
Full-text available
Financial forecasting has been an important and active area of machine learning research because of the challenges it presents and the potential rewards that even minor improvements in prediction accuracy or forecasting may entail. Traditionally, financial forecasting has heavily relied on quantitative indicators and metrics derived from structure...
Preprint
Many recent deep learning-based solutions have widely adopted the attention-based mechanism in various tasks of the NLP discipline. However, the inherent characteristics of deep learning models and the flexibility of the attention mechanism increase the models' complexity, thus leading to challenges in model explainability. In this paper, to addres...
Preprint
Full-text available
Identifying meaningful relationships between the price movements of financial assets is a challenging but important problem in a variety of financial applications. However with recent research, particularly those using machine learning and deep learning techniques, focused mostly on price forecasting, the literature investigating the modelling of a...
Preprint
Full-text available
Financial forecasting has been an important and active area of machine learning research because of the challenges it presents and the potential rewards that even minor improvements in prediction accuracy or forecasting may entail. Traditionally, financial forecasting has heavily relied on quantitative indicators and metrics derived from structured...
Chapter
Full-text available
As deep learning (DL) technologies have developed rapidly, many new techniques have become available for recommender systems. Yet, there is very little research addressing how users’ feedback for particular items (such as ratings) can affect recommendations. This feedback can assist in building more fine-grained user profiles, as not all raw clicks...
Article
Full-text available
Despite many recent advances, state-of-the-art recommender systems still struggle to achieve good performance with sparse datasets. To address the sparsity issue, transfer learning techniques have been investigated for recommender systems, but they tend to impose strict constraints on the content and structure of the data in the source and target d...
Article
Full-text available
Seismic interpretation is a fundamental approach for obtaining static and dynamic information about subsurface reservoirs, such as geological faults/salt bodies and associated fluid types and distribution. Due to the exponential growth in seismic data volume and considerable uncertainty in manual interpretation, deep learning (DL) algorithms have b...
Article
Full-text available
Wearable sensor-based HAR (human activity recognition) is a popular human activity perception method. However, due to the lack of a unified human activity model, the number and positions of sensors in the existing wearable HAR systems are not the same, which affects the promotion and application. In this paper, an information gain-based human activ...
Chapter
Forecasting stock returns is a challenging problem due to the highly stochastic nature of the market and the vast array of factors and events that can influence trading volume and prices. Nevertheless it has proven to be an attractive target for machine learning research because of the potential for even modest levels of prediction accuracy to deli...
Chapter
Full-text available
Ancient oracle bone inscriptions (OBIs) are important Chinese cultural artefacts, which are difficult and time-consuming to decipher even by the most expert paleographers and, as a result, a large proportion of excavated OBIs remain unidentified. In practice, OBIs are deciphered by translating between different writing systems; Chinese writing syst...
Article
Full-text available
Explainable recommendations have drawn more attention from both academia and industry recently, because they can help users better understand recommendations (i.e., why some particular items are recommended), therefore improving the persuasiveness of the recommender system and users’ satisfaction. However, little work has been done to provide expla...
Preprint
Full-text available
Leveraging unlabelled data through weak or distant supervision is a compelling approach to developing more effective text classification models. This paper proposes a simple but effective data augmentation method, which leverages the idea of pseudo-labelling to select samples from noisy distant supervision annotation datasets. The result shows that...
Preprint
Full-text available
Forecasting stock returns is a challenging problem due to the highly stochastic nature of the market and the vast array of factors and events that can influence trading volume and prices. Nevertheless it has proven to be an attractive target for machine learning research because of the potential for even modest levels of prediction accuracy to deli...
Conference Paper
Full-text available
While state-of-the-art NLP models have achieved excellent performance on a wide range of tasks in recent years, important questions are being raised about their robustness and their underlying sensitivity to systematic biases that may exist in their training and test data. Such issues come to be manifest in performance problems when faced...
Preprint
Full-text available
While state-of-the-art NLP models have achieved excellent performance on a wide range of tasks in recent years, important questions are being raised about their robustness and their underlying sensitivity to systematic biases that may exist in their training and test data. Such issues come to be manifest in performance problems when faced...
Preprint
Full-text available
The explosion in the sheer magnitude and complexity of financial news data in recent years makes it increasingly challenging for investment analysts to extract valuable insights and perform analysis. We propose FactCheck in finance, a web-based news aggregator with deep learning models, to provide analysts with a holistic view of important financia...
Article
Full-text available
Traditional Recommender Systems (RS) use central servers to collect user data, compute user profiles and train global recommendation models. Central computation of RS models has great results in performance because the models are trained using all the available information and the full user profiles. However, centralised RS require users to share t...
Article
Full-text available
The lack of large-scale open-source expert-labelled seismic datasets is one of the barriers to applying today’s AI techniques to automatic fault recognition tasks. The dataset present in this article consists of a large number of processed seismic images and their corresponding fault annotations. The processed seismic images, which are originally f...
Chapter
Full-text available
Traditional recommender systems are usually single-shot systems, lacking real-time dialog with customers. Using dialog as an interactive method can more accurately capture user preferences and enhance system transparency. However, building such a goal-oriented dialog system faces many challenges as the system itself needs to collaborate with var...
Article
Full-text available
With the explosive growth in seismic data acquisition and the successful application of deep convolutional neural networks (DCNN) to various image processing tasks within multidisciplinary fields, many researchers have begun to research DCNN based automatic seismic interpretation techniques. Due to the vast number of parameters considered in deep n...
Preprint
Full-text available
Multi-label text classification (MLTC) is an attractive and challenging task in natural language processing (NLP). Compared with single-label text classification, MLTC has a wider range of applications in practice. In this paper, we propose a heterogeneous graph convolutional network model to solve the MLTC problem by modeling tokens and labels as...
Chapter
With the increasing number of images containing rich emotional information in social media and the urgent demand for faster and more accurate image emotional information mining, some researchers have begun to pay attention to image emotion classification research. However, most of the work focuses on the complex model design, neglecting the proper...
Chapter
This paper focuses on the problem of inconsistent predictions of modern convolutional neural networks (CNN) at patch (i.e. sub-image) boundaries. Because graphics processing unit (GPU) resources are limited, image tiling and stitching countermeasures have been applied to most megapixel images, that is, cutting images into overlapping tiles as CNN input,...
Preprint
Full-text available
Corporate mergers and acquisitions (M&A) account for billions of dollars of investment globally every year, and offer an interesting and challenging domain for artificial intelligence. However, in these highly sensitive domains, it is crucial to not only have a highly robust and accurate model, but be able to generate useful explanations to garner...
Article
Full-text available
A fall detection module is an important component of community-based care for the elderly to reduce their health risk. It requires accurate detection while remaining energy efficient. In order to meet the above requirements, a sensing module-integrated energy-efficient sensor was developed which can sense and cache the data of human activ...
Conference Paper
Full-text available
The volatility forecasting task refers to predicting the amount of variability in the price of a financial asset over a certain period. It is an important mechanism for evaluating the risk associated with an asset and, as such, is of significant theoretical and practical importance in financial analysis. While classical approaches have framed this...
Article
Autonomous vehicles have become a hot spot of the automotive industry, and many cities have claimed that autonomous vehicles should be capable of recognizing gestures used by traffic police. Traditional traffic police gesture recognition methods rely on depth sensors or wearable devices, which limits their availability in the domain of the intelligent v...
Conference Paper
Full-text available
The Financial and Economic Attitudes Revealed by Search (FEARS) index reflects the attention and sentiment of public investors and is an important factor for predicting stock price return. In this paper, we take into account the semantics of the FEARS search terms by leveraging the Bidirectional Encoder Representations from Transformers (BERT), and fu...
Preprint
Full-text available
It has been shown that financial news leads to the fluctuation of stock prices. However, previous work on news-driven financial market prediction focused only on predicting stock price movement without providing an explanation. In this paper, we propose a dual-layer attention-based neural network to address this issue. In the initial stage, we intr...
Conference Paper
Full-text available
We describe a novel, multi-task recommendation model, which jointly learns to perform rating prediction and recommendation explanation by combining matrix factorization, for rating prediction, and adversarial sequence to sequence learning for explanation generation. The result is evaluated using real-world datasets to demonstrate improved rating pr...
Conference Paper
Full-text available
Collaborative filtering (CF) is a common recommendation approach that relies on user-item ratings. However, the natural sparsity of user-item rating data can be problematic in many domains and settings, limiting the ability to generate accurate predictions and effective recommendations. Moreover, in some CF approaches latent features are often used...
Conference Paper
Full-text available
In this paper, we introduce a novel recommendation model, which harnesses a convolutional neural network to mine meaningful information from customer reviews and integrates it seamlessly with a matrix factorization algorithm. It is a valid method for improving the transparency of CF algorithms.
Conference Paper
Full-text available
We propose a multi-level attention-based neural network for relation extraction based on the work of Lin et al. to alleviate the problem of wrong labelling in distant supervision. In this paper, we first adopt gated recurrent units to represent the semantic information. Then, we introduce a customized multi-level attention mechanism, which is expec...
Conference Paper
Full-text available
User-generated reviews are a plentiful source of user opinions and interests and can play an important role in a range of artificial intelligence contexts, particularly when it comes to recommender systems. In this paper, we describe how natural language processing and opinion mining techniques can be used to automatically mine useful recommendatio...
Conference Paper
Full-text available
The explosion of user-generated content, especially tweets and customer reviews, makes it possible to build sentiment lexicons automatically by harnessing the consistency between the content and its accompanying emotional signal, either explicitly or implicitly. In this work we describe novel techniques for automatically producing domain specific sent...
Conference Paper
Full-text available
E-commerce recommender systems seek out matches between customers and items in order to help customers discover more relevant and satisfying products and to increase the conversion rate of browsers to buyers. To do this, a recommender system must learn about the likes and dislikes of customers/users as well as the advantages and disadvantages (pros...
Conference Paper
Full-text available
To help users discover relevant products and items recommender systems must learn about the likes and dislikes of users and the pros and cons of items. In this paper, we present a novel approach to building rich feature-based user profiles and item descriptions by mining user-generated reviews. We show how this information can be integrated into re...
Article
Full-text available
In the world of recommender systems, so-called content-based methods are an important approach that rely on the availability of detailed product or item descriptions to drive the recommendation process. For example, recommendations can be generated for a target user by selecting unseen products that are similar to the products that the target user...
Article
Full-text available
User-generated reviews are now plentiful online and they have proven to be a valuable source of real user opinions and real user experiences. In this chapter we consider recent work that seeks to extract topics, opinions, and sentiment from review text that is unstructured and often noisy. We describe and evaluate a number of practical case-studies...
Conference Paper
Full-text available
In this paper we build on recent work on case-based product recommendation focused on generating rich product descriptions for use in a recommendation context by mining user-generated reviews. This is in contrast to conventional case-based approaches which tend to rely on case descriptions that are based on available meta-data or catalog descriptio...
Conference Paper
In this paper we build on recent work on case-based product recommendation focused on generating rich product descriptions for use in a recommendation context by mining user-generated reviews. This is in contrast to conventional case-based approaches which tend to rely on case descriptions that are based on available meta-data or catalog descriptio...
Conference Paper
Full-text available
This paper describes a novel approach to product recommendation that is based on opinionated product descriptions that are automatically mined from user-generated product reviews. We present a recommendation ranking strategy that combines similarity and sentiment to suggest products that are similar but superior to a query product according to the...
Conference Paper
Full-text available
Automatically identifying informative reviews is increasingly important given the rapid growth of user generated reviews on sites like Amazon and TripAdvisor. In this paper, we describe and evaluate techniques for identifying and recommending helpful product reviews using a combination of review features, including topical and sentiment information...
Data
Full-text available
Case-based reasoning (CBR) attempts to reuse past experiences to solve new problems. CBR ideas are commonplace in recommendation systems, which rely on the similarity between product queries and a case base of product cases. But, the relationship between CBR and many of these recommenders can be tenuous: the idea that product cases made up of st...
Conference Paper
Full-text available
In this paper we describe a novel approach to case-based product recommendation. It is novel because it does not leverage the usual static, feature-based, purely similarity-driven approaches of traditional case-based recommenders. Instead we harness experiential cases, which are automatically mined from user generated reviews, and we use these as t...
Conference Paper
Full-text available
Supplementing product information with user-generated content such as ratings and reviews can help to convert browsers into buyers. As a result this type of content is now front and centre for many major e-commerce sites such as Amazon. We believe that this type of content can provide a rich source of valuable information that is useful for a varie...
Conference Paper
Full-text available
User generated reviews are now a familiar and valuable part of most e-commerce sites since high quality reviews are known to influence purchasing decisions. In this paper we describe work on the Reviewer's Assistant (RA), which is a recommendation system that is designed to help users to write better reviews. It does this by suggesting relevant to...
Conference Paper
Full-text available
Today, online reviews for products and services have become an important class of user-generated content and they play a valuable role for countless online businesses by helping to convert casual browsers into informed and satisfied buyers. As users gravitate towards sites that offer insightful and objective reviews, the ability to source helpful...
Conference Paper
Full-text available
User generated reviews are now a familiar and valuable part of most e-commerce sites since high quality reviews are known to influence purchasing decisions. In this demonstration we describe work on the Reviewer's Assistant (RA), which is a recommendation system that is designed to help users to write better quality reviews. It does this by suggest...
Conference Paper
Full-text available
Today, online reviews for products and services have become an important class of user-generated content and they play a valuable role for countless online businesses by helping to convert casual browsers into informed and satisfied buyers. In many respects, the content of user reviews is every bit as important as the catalog content that describes...
Conference Paper
Full-text available
User opinions and reviews are an important part of the modern web and all major e-commerce sites typically provide their users with the ability to provide and access customer reviews across their product catalog. The importance of reviews has driven the need to improve the review quality by providing interactive support for the reviewer and we will...
Conference Paper
Full-text available
User opinions and reviews are an important part of the modern web and all major e-commerce sites typically provide their users with the ability to provide and access customer reviews across their product catalog. Indeed this has become a vital part of the service provided by sites like Amazon and TripAdvisor, so much so that many of us will routine...
