Hao Xu

Jilin University | JUT · College of Computer Science & Technology

PhD


80
Publications
10,006
Reads
665
Citations


Publications (80)
Chapter
Full-text available
Deep learning algorithms perform poorly on long-tailed datasets because there is insufficient data in the tail classes to recover their original distribution, resulting in an under-representation of the tail classes in the model. In this work, we propose H2T-FAST, a Head-to-Tail Feature Augmentation method by Style Transfer to improve the performance...
Preprint
Optical character recognition (OCR) methods have been applied to diverse tasks, e.g., street view text recognition and document analysis. Recently, zero-shot OCR has piqued the interest of the research community because it considers a practical OCR scenario with unbalanced data distribution. However, there is a lack of benchmarks for evaluating suc...
Conference Paper
The long-tail effect is a common issue that limits the performance of deep learning models on real-world datasets. Character image datasets are also affected by such unbalanced data distribution due to differences in character usage frequency. Thus, current character recognition methods are limited when applied in the real world, especially for the...
Article
Full-text available
The complexity and non-Euclidean structure of graph data hinders the development of data augmentation methods similar to those in computer vision. In this paper, we propose a feature augmentation method for graph nodes based on topological regularization, in which topological structure information is introduced into an end-to-end model to promote b...
Article
Full-text available
Human activity recognition (HAR) plays a central role in ubiquitous computing applications such as health monitoring. In the real world, it is impractical to perform reliably and consistently over time across a population of individuals due to the cross-individual variation in human behavior. Existing transfer learning algorithms suffer the challen...
Article
Full-text available
In today’s multilingual lexical databases, the majority of the world’s languages are under-represented. Beyond a mere issue of resource incompleteness, we show that existing lexical databases have structural limitations that result in a reduced expressivity on culturally-specific words and in mapping them across languages. In particular, the lexica...
Article
Full-text available
The ubiquity of smartphones equipped with multiple sensors has provided the possibility of automatically recognizing human activity, which can benefit intelligent applications such as smart homes, health monitoring, and aging care. However, there are two major barriers to deploying an activity recognition model in real-world scenarios. Firstly,...
Preprint
Full-text available
Mood inference with mobile sensing data has been studied in ubicomp literature over the last decade. This inference enables context-aware and personalized user experiences in general mobile apps and valuable feedback and interventions in mobile health apps. However, even though model generalization issues have been highlighted in many studies, the...
Article
Full-text available
Session-based recommendation predicts the next item of interest to a user based on an anonymous user–item interaction sequence. However, most existing methods focus on capturing sequential signals or item-transition patterns within the current session while ignoring potential collaborative behaviors among different users from other sessions that coul...
Technical Report
Full-text available
This paper describes a dataset collected at the end of 2020 and in the summer of 2021 as part of the WeNet project, a Horizon 2020 funded project that aims at developing a diversity-aware, machine-mediated paradigm for social interactions. The aim of the survey was to measure aspects of diversity based on social practices and related daily behaviour...
Article
Full-text available
Aspect-based sentiment analysis is a fine-grained sentiment analysis task that identifies the sentiment polarity of different aspects in a sentence. Recently, several studies have used graph convolution networks (GCN) to obtain the relationship between aspects and context words with the dependency tree of sentences. However, errors introduced by th...
Article
The rapid development of online social media makes Abusive Language Detection (ALD) a hot topic in the field of affective computing. However, most methods for ALD in social networks do not take into account the interactive relationships among user posts and simply regard ALD as a task of text context representation learning. To solve this proble...
Article
Full-text available
Multi-label text classification has attracted wide attention from scholars due to its contribution to practical applications. One of the key challenges in multi-label text classification is how to extract and leverage the correlation among labels. However, it is quite challenging to directly model the correlations among labels in a complex and unknown l...
Article
Learning node representations in graphs is a widespread problem in node classification and link prediction. Current research has focused on static heterogeneous, homogeneous networks, and dynamic homogeneous networks. However, many existing graphs, such as citation and social networks, are heterogeneous. Therefore, it is still a great challenge to...
Preprint
Degraded images commonly exist in the general sources of character images, leading to unsatisfactory character recognition results. Existing methods have dedicated efforts to restoring degraded character images. However, the denoising results obtained by these methods do not appear to improve character recognition performance. This is mainly becaus...
Preprint
Constructing high-quality character image datasets is challenging because real-world images are often affected by image degradation. There are limitations when applying current image restoration methods to such real-world character images, since (i) the categories of noise in character images are different from those in general images; (ii) real-wo...
Preprint
The long-tail effect is a common issue that limits the performance of deep learning models on real-world datasets. Character image dataset development is also affected by such unbalanced data distribution due to differences in character usage frequency. Thus, current character recognition methods are limited when applying to real-world datasets, in...
Article
Full-text available
Cross-lingual document retrieval, which aims to take a query in one language to retrieve relevant documents in another, has attracted strong research interest in the last decades. Most studies on this task start with cross-lingual comparisons at the word level and then represent documents via word embeddings, which leads to insufficient structure i...
Conference Paper
Full-text available
Federated Learning (FL) is an emerging privacy-aware machine learning technique that applies successfully to the collaborative learning of global models for Human Activity Recognition (HAR). As of now, the applications of FL for HAR assume that the data associated with diverse individuals follow the same distribution. However, this assumption is im...
Article
Full-text available
Human Activity Recognition (HAR) plays an important role in the field of ubiquitous computing, which can benefit various human-centric applications such as smart homes, health monitoring, and aging systems. Human Activity Recognition mainly leverages smartphones and wearable devices to collect sensory signals labeled with activity annotations and tr...
Article
Full-text available
A knowledge graph (KG) helps to improve the accuracy, diversity, and interpretability of recommender systems. KGs have been applied in recommendation systems, exploiting graph neural networks (GNNs), but most existing recommendation models based on GNNs ignore the influence of node types and the loss of information during aggregation. In this paper,...
Article
Full-text available
The rapid development of online social media makes abuse detection a hot topic in the field of emotional computing. However, most natural language processing (NLP) methods only focus on linguistic features of posts and ignore the influence of users’ emotions. To tackle the problem, we propose a multitask framework combining abuse detection and emot...
Article
Full-text available
Graph neural networks (GNNs) can deal with complex network structures and model complex syntax structures in natural languages, which makes GNN outstanding in text classification tasks. However, most graph neural network approaches don’t seem to take full advantage of the topological gains from document graphs. In this paper, we propose a topologic...
Article
Full-text available
The popularity of information technology has given rise to a growing interest in smart education and has provided the possibility of combining online and offline education. Knowledge graphs, an effective technology for knowledge representation and management, have been successfully utilized to manage massive educational resources. However, the exis...
Article
Full-text available
Short text classification is an important problem of natural language processing (NLP), and graph neural networks (GNNs) have been successfully used to solve different NLP problems. However, few studies employ GNN for short text classification, and most of the existing graph-based models ignore sequential information (e.g., word orders) in each doc...
Article
Few-shot classification aims at recognizing novel categories from low data regimes based on prior knowledge. However, the existing methods for few-shot scene classification have limitations on using few annotated data and do not fully consider the intra-class samples with classification targets in different sizes, which lead to poor feature represe...
Preprint
Full-text available
One of the key problems in multi-label text classification is how to take advantage of the correlation among labels. However, it is very challenging to directly model the correlations among labels in a complex and unknown label space. In this paper, we propose a Label Mask multi-label text classification model (LM-MTC), which is inspired by the ide...
Article
Personality detection based on user-generated text content analysis has a significant impact on information science, for instance, information seeking. Existing deep learning-based approaches, however, have two major limitations. Firstly, they extract only keywords for personality detection and lack the analysis of sentiment information and psychol...
Preprint
Full-text available
The complexity and non-Euclidean structure of graph data hinder the development of data augmentation methods similar to those in computer vision. In this paper, we propose a feature augmentation method for graph nodes based on topological regularization, in which topological structure information is introduced into an end-to-end model. Specifically, w...
Preprint
Full-text available
Applications like personal assistants need to be aware of the user's context, e.g., where they are, what they are doing, and with whom. Context information is usually inferred from sensor data, like GPS sensors and accelerometers on the user's smartphone. This prediction task is known as context recognition. A well-defined context model is fundament...
Article
Full-text available
Literature has indicated that negative emotions may lead students to disengagement in teaching activities. Furthermore, the contagion of negative emotion is similar to infectious disease diffusion that drives more students into negative emotions. However, few methods have been brought forward to intervene in negative emotional contagion in real tim...
Article
Full-text available
Knowledge graphs (KGs) have been proven to be effective for improving the performance of recommender systems. KGs can store rich side information and relieve the data sparsity problem. There are many linked attributes between entity pairs (e.g., items and users) in KGs, which can be called multiple-step relation paths. Existing methods do not suffi...
Article
E-learners face a large amount of fragmented learning content during e-learning. How to extract and organize this learning content is the key to achieving the established learning target, especially for non-experts. Reasonably arranging the order of the learning objects to generate a well-defined learning path can help the e-learner complete the le...
Chapter
With the improvement of living standards, people are paying more attention to healthcare, but there is still a long way to go to improve healthcare. A usable, intelligent aided diagnosis measure can be helpful for people to achieve daily health management. Several studies suggested that tongue features can directly reflect a person’s physical state...
Article
Full-text available
Scientific retrieval systems need to be given domain search terms for searching publications. However, as natural language, search terms provided by users are often fuzzy and limited, and some relevant terms are always overlooked in searching. Meanwhile, users always desire to be given domain-related keywords to enlighten themselves what other terms...
Conference Paper
With the improvement of people's living standards, people pay more and more attention to healthcare, in which a healthy diet plays an important role. Therefore, a scientific knowledge management method about healthy diet which can integrate heterogeneous information from different sources and formats is urgently needed to reduce the information gap...
Article
Full-text available
In the past 40 years, with the changes to dietary structure and the dramatic increase in the consumption of meat products in developing countries, especially in China, encouraging populations to maintain their previous healthy eating patterns will have health, environmental, and economic co-benefits. Healthy diet education plays an important role i...
Article
Full-text available
Improving health awareness is essential to health and healthcare sustainability. How to arouse attention to the health of people and encourage them to attend to healthcare progress so that we can reduce the costs of promoting healthcare by achieving more with less effort remains to be explored. In this paper, we provide a simplified health manageme...
Article
Full-text available
In recent years, with the rapid growth of science and innovation, plenty of constantly-updated scientific achievements containing innovative knowledge can be acquired and used to solve problems. However, most undergraduate students and non-researchers cannot use them efficiently. In traditional teacher-centric education, education for sustainabilit...
Article
Full-text available
The understanding of the structure of knowledge is an essential step of education. Although teachers offer the information foundation and relationship among knowledge points, there are still few methods to encourage students to explore the structure of knowledge by themselves outside of classes. This paper explores the gamification method and the k...
Article
Full-text available
Owing to the impact of economic globalization, the talent construction of key disciplines in science and technology should be administrated with humanism. An analysis of existing articles shows that the research of talent development mainly relates to the following aspects: cultivating objectives, cultivator, cultivation way, and evaluation criteria...
Article
Serous ovarian carcinoma (SOC) is one of the most life-threatening types of gynecological malignancy, but the pathogenesis of SOC remains unknown. Previous studies have indicated that differentially expressed genes and microRNAs (miRNAs) serve important functions in SOC. However, genes and miRNAs are identified in a disperse form, and limited infor...
Article
Considering emerging technologies of the intelligent space and wireless sensor network, a business architecture, system architecture and technology architecture are proposed for the mission planning system of a space sensor network. The business architecture of a mission planning system is constructed from application flows, such as those of task m...
Article
Evaluating the quality of scientific publications with dual hesitant fuzzy information gives rise to the problem of multiple-attribute decision making. Motivated by the ideal of Hamacher aggregation operators and the Choquet integral, we have developed a dual hesitant fuzzy Hamacher correlated average (DHFHCA) operator. Using this operator, we have...
Article
Full-text available
Scientific and technological papers play a fundamental role in the scientific and technological innovation of countries. The quality control of scientific and technological articles is vital to the journals and management of personnel. This paper investigates multiple attribute decision-making problems with the application of hesitant fuzzy uncerta...
Article
Purpose – This paper aims to propose an entity-based scientific metadata schema, i.e. Scientific Knowledge Object (SKO) Types. During the past 50 years, many metadata schemas have been developed in a variety of disciplines. However, current scientific metadata schemas focus on describing data, but not entities. They are descriptive, but few of them...
Article
Purpose – This paper applies the knowledge-based genetic algorithm to solve the optimization problem in complex products technological processes. Design/methodology/approach – The knowledge-based genetic algorithm (KGA) is defined as a hybrid genetic algorithm (GA) that combines the GA model with the knowledge model. The GA model searches the fea...
Conference Paper
Full-text available
As the amount of networking information production keeps growing, big data analysis has been widely used in the field of education. In recent years, teacher-student interactions on the Web have been increasing, which generates a lot of data. The use of big data analysis methods to analyze these data has recently become a popular research topic. In this paper,...
Article
We live in a society of rapid technological development and use social networking sites almost every day, but the main purpose of these social networks is not sharing knowledge. Within the university campus, sharing knowledge and learning from experience is very important. Joomla! is an internationally renowned content management system, whic...
Article
During the past fifty years, many metadata schemas have been developed in a variety of disciplines. However, current scientific metadata schemas focus on describing data, but not entities. They are descriptive, but few of them are structural and administrative. SKO Types is an entity-oriented theory for representing and linking Scientific Knowledge...
Article
Scientific publishing is currently undergoing significant paradigm shifts, as it makes the transition from print to electronic format, from subscribers only to open access and from static information to a dynamic knowledge space. In this paper, we investigate four tremendously promising online publishing systems and projects as a short review of st...
Article
We believe that semantic annotation will undermine the traditional way of reading and knowledge dissemination. In this paper, we introduce the SKOTeX, whose name is derived from LaTeX and BibTeX, respectively an editing tool which enables users to generate semantic enriched documentation, and a file format that specifies sets of annotating commands...
Article
During the past fifty years, many metadata schemas have been developed in a variety of disciplines. In this paper, we delve into five state-of-the-art metadata schemas that are widely used in scientific publishing areas and most related to our research, i.e. Dublin Core, LOM, BibTeX, Schema.org and SKO Types.
Article
Scientific discourses have obviously enhanced their accessibility and reusability in response to the development of Semantic Web technologies. A handful of representation models of discourse representation have been proposed during these years for semantic search and strategy reading. In this paper, we delineate the relationships that operate betwe...
Article
Full-text available
We live in a social and technology-driven world, and academic life is no different. Although there are several off-the-shelf social networking sites, such as Renren and Weibo, universities aim to create academic engagement networks that specifically foster communication and collaboration among students and faculties, and ultimately support advanced...
Article
Full-text available
The data transmission dynamic scheduling is a process that allocates the ground stations and available time windows to the data transmission tasks dynamically for improving the resource utilization. A novel heuristic is proposed to solve the data transmission dynamic scheduling problem. The characteristic of this heuristic is the dynamic hybridizat...
Article
In this paper, a hybrid approach is proposed to improve the precision of particle-size-distribution retrieval. This approach includes a Mie-Matrix-Optimizing Method (MMOM) and a combined model. The MMOM method is applied to improve the condition number of the Mie matrix. The combined model is employed to approximate the particle size distributions. The prop...
Article
Simulation optimization studies the problem of optimizing simulation-based objectives. Simulation optimization is a new and hot topic in the field of system simulation and operational research. To improve the search efficiency, this paper presents a hybrid approach that combines a genetic algorithm and a local optimization technique for simulation opt...
Article
The Extended Capacitated Arc Routing Problem (ECARP) is a challenging vehicle routing problem with numerous real-world applications. We propose an improved evolutionary approach to cope with the ECARP in this research. The exploitation of heuristic information characterizes our approach. Two kinds of heuristic information, Arc Assignment Priority I...
Conference Paper
Only a few models or frameworks for making situated use of advanced technology in scientific publication scenarios are available. Moreover, most of the existing prototypes and applications lack specific expertise and semantics. The approach we proposed in our project is based on some fundamental theories especially specified for managing scientific k...
Article
A traditional online scientific discourse always lacks a semantically structured representation. In practice, the logical structure of a discourse itself is usually hidden within the publication's content and kept in mind by the authors, but hardly ever articulated explicitly to readers. Such a barrier makes it difficult for readers to navigate...
Conference Paper
Writing scientific discourses and publishing academic results are integral parts of a researcher’s daily professional life. Although tremendous magic has been brought by the advancement of digital library technologies and social networking services, there are still no off-the-shelf utilities for strategic writing, reading and even publishing. In this...
Article
With the increase in Quality of Service (QoS) requirements raised by various kinds of Internet services, the efficiency of QoS support has become more and more crucial. An optimization approach to the Internet quality of service based on an improved genetic algorithm is proposed in this paper. In this improved genetic algorithm, the initial popul...
Article
Full-text available
Web technology is revolutionizing the way diverse scientific knowledge is produced and disseminated. In the past few years, a handful of discourse representation models have been proposed for the externalization of the rhetoric and argumentation captured within scientific publications. However, there hasn’t been a unified interoperable pattern that...
Conference Paper
Writing and publishing scientific papers is an integral part of a researcher’s life for knowledge dissemination. In China, most on-line publishing today remains an exhibition of electronic facsimiles of traditional articles without being in step with the advancement of the Semantic Web. Our project aims to propose a pattern-based approach for scientific...
Chapter
With the advancement of digital library techniques and open access services, more and more off-the-shelf utilities for managing scientific publications are emerging and widely used. Nevertheless, most online articles of today remain the electronic facsimiles of the traditional linear structured papers lacking semantics and interlinked knowl...
Article
Email has become one of the fastest and most economical forms of communication. Email is also one of the most ubiquitous and pervasive applications used on a daily basis by millions of people worldwide. However, the increase in email users has resulted in a dramatic increase in spam emails during the past few years. This paper proposes a new spam f...
Conference Paper
Navigation online is one of the most common daily experiences in the research communities. Although researchers could get more and more benefits from the development of Web Technologies, most online discourses today are still the electronic facsimiles of traditional linear structured articles. In this paper, we propose a pattern-based representatio...
Article
Full-text available
Scientific papers and scientific conferences are still, despite the emergence of several new dissemination technologies, the de-facto standard in which scientific knowledge is consumed and discussed. While there is no shortage of services and platforms that aid this process (e.g. scholarly search engines, websites, blogs, conference management prog...
Chapter
Managing ubiquitous scientific knowledge is a part of daily life for scholars, while it also becomes a hot topic in the Semantic Web research community. In this paper, we propose a SKO Types framework aiming to facilitate managing ubiquitous Scientific Knowledge Objects (SKO) driven by semantic authoring, modularization, annotation and search. SKO...