About
Publications: 80
Reads: 10,006
Citations: 665
Publications (80)
Ziyao Meng Xue Gu Qiang Shen- [...]
Hao Xu
Deep learning algorithms perform poorly on long-tailed datasets because there is insufficient data in the tail classes to recover its original distribution, resulting in an under-representation of the tail classes in the model. In this work, we propose H2T-FAST, a Head-to-Tail Feature Augmentation method by Style Transfer to improve the performance...
Optical character recognition (OCR) methods have been applied to diverse tasks, e.g., street view text recognition and document analysis. Recently, zero-shot OCR has piqued the interest of the research community because it considers a practical OCR scenario with unbalanced data distribution. However, there is a lack of benchmarks for evaluating suc...
The long-tail effect is a common issue that limits the performance of deep learning models on real-world datasets. Character image datasets are also affected by such unbalanced data distribution due to differences in character usage frequency. Thus, current character recognition methods are limited when applied in the real world, especially for the...
The complexity and non-Euclidean structure of graph data hinders the development of data augmentation methods similar to those in computer vision. In this paper, we propose a feature augmentation method for graph nodes based on topological regularization, in which topological structure information is introduced into an end-to-end model to promote b...
Human activity recognition (HAR) plays a central role in ubiquitous computing applications such as health monitoring. In the real world, it is impractical to perform reliably and consistently over time across a population of individuals due to the cross-individual variation in human behavior. Existing transfer learning algorithms suffer the challen...
In today’s multilingual lexical databases, the majority of the world’s languages are under-represented. Beyond a mere issue of resource incompleteness, we show that existing lexical databases have structural limitations that result in a reduced expressivity on culturally-specific words and in mapping them across languages. In particular, the lexica...
The ubiquity of smartphones equipped with multiple sensors has provided the possibility of automatically recognizing human activity, which can benefit intelligent applications such as smart homes, health monitoring, and aging care. However, there are two major barriers to deploying an activity recognition model in real-world scenarios. Firstly,...
Mood inference with mobile sensing data has been studied in ubicomp literature over the last decade. This inference enables context-aware and personalized user experiences in general mobile apps and valuable feedback and interventions in mobile health apps. However, even though model generalization issues have been highlighted in many studies, the...
The session-based recommendation predicts the next user’s interest item based on an anonymous user–item interaction sequence. However, most existing methods focus on capturing sequential signals or item-transition patterns within the current session while ignoring potential collaborative behaviors among different users from other sessions that coul...
This paper describes a dataset collected at the end of 2020 and in the summer of 2021 as part of the WeNet project, a Horizon 2020 funded project that aims at developing a diversity-aware, machine-mediated paradigm for social interactions. The aim of the survey was to measure aspects of diversity based on social practices and related daily behaviour...
Aspect-based sentiment analysis is a fine-grained sentiment analysis task that identifies the sentiment polarity of different aspects in a sentence. Recently, several studies have used graph convolution networks (GCN) to obtain the relationship between aspects and context words with the dependency tree of sentences. However, errors introduced by th...
The rapid development of online social media makes Abusive Language Detection (ALD) a hot topic in the field of affective computing. However, most methods for ALD in social networks do not take into account the interactive relationships among user posts, which simply regard ALD as a task of text context representation learning. To solve this proble...
Multi-label text classification has attracted wide attention from scholars due to its contribution to practical applications. One of the key challenges in multi-label text classification is how to extract and leverage the correlation among labels. However, it is quite challenging to directly model the correlations among labels in a complex and unknown l...
Learning node representations in graphs is a widespread problem in node classification and link prediction. Current research has focused on static heterogeneous, homogeneous networks, and dynamic homogeneous networks. However, many existing graphs, such as citation and social networks, are heterogeneous. Therefore, it is still a great challenge to...
Degraded images commonly exist in the general sources of character images, leading to unsatisfactory character recognition results. Existing methods have dedicated efforts to restoring degraded character images. However, the denoising results obtained by these methods do not appear to improve character recognition performance. This is mainly becaus...
Constructing high-quality character image datasets is challenging because real-world images are often affected by image degradation. There are limitations when applying current image restoration methods to such real-world character images, since (i) the categories of noise in character images are different from those in general images; (ii) real-wo...
The long-tail effect is a common issue that limits the performance of deep learning models on real-world datasets. Character image dataset development is also affected by such unbalanced data distribution due to differences in character usage frequency. Thus, current character recognition methods are limited when applying to real-world datasets, in...
Cross-lingual document retrieval, which aims to take a query in one language to retrieve relevant documents in another, has attracted strong research interest in the last decades. Most studies on this task start with cross-lingual comparisons at the word level and then represent documents via word embeddings, which leads to insufficient structure i...
Federated Learning (FL) is an emerging privacy-aware machine learning technique that applies successfully to the collaborative learning of global models for Human Activity Recognition (HAR). As of now, the applications of FL for HAR assume that the data associated with diverse individuals follow the same distribution. However, this assumption is im...
Human Activity Recognition (HAR) plays an important role in the field of ubiquitous computing, which can benefit various human-centric applications such as smart homes, health monitoring, and aging systems. Human Activity Recognition mainly leverages smartphones and wearable devices to collect sensory signals labeled with activity annotations and tr...
A knowledge graph (KG) helps to improve the accuracy, diversity, and interpretability of a recommender system. KGs have been applied in recommendation systems, exploiting graph neural networks (GNNs), but most existing recommendation models based on GNNs ignore the influence of node types and the loss of information during aggregation. In this paper,...
The rapid development of online social media makes abuse detection a hot topic in the field of emotional computing. However, most natural language processing (NLP) methods only focus on linguistic features of posts and ignore the influence of users’ emotions. To tackle the problem, we propose a multitask framework combining abuse detection and emot...
Graph neural networks (GNNs) can deal with complex network structures and model complex syntax structures in natural languages, which makes GNN outstanding in text classification tasks. However, most graph neural network approaches don’t seem to take full advantage of the topological gains from document graphs. In this paper, we propose a topologic...
Nan Li Qiang Shen Rui Song- [...]
Hao Xu
The popularity of information technology has given rise to a growing interest in smart education and has provided the possibility of combining online and offline education. Knowledge graphs, an effective technology for knowledge representation and management, have been successfully utilized to manage massive educational resources. However, the exis...
Short text classification is an important problem of natural language processing (NLP), and graph neural networks (GNNs) have been successfully used to solve different NLP problems. However, few studies employ GNN for short text classification, and most of the existing graph-based models ignore sequential information (e.g., word orders) in each doc...
Few-shot classification aims at recognizing novel categories from low data regimes based on prior knowledge. However, the existing methods for few-shot scene classification have limitations on using few annotated data and do not fully consider the intra-class samples with classification targets in different sizes, which lead to poor feature represe...
One of the key problems in multi-label text classification is how to take advantage of the correlation among labels. However, it is very challenging to directly model the correlations among labels in a complex and unknown label space. In this paper, we propose a Label Mask multi-label text classification model (LM-MTC), which is inspired by the ide...
Personality detection based on user-generated text content analysis has a significant impact on information science, for instance, information seeking. Existing deep learning-based approaches, however, have two major limitations. Firstly, they extract only keywords for personality detection and lack the analysis of sentiment information and psychol...
The complexity and non-Euclidean structure of graph data hinder the development of data augmentation methods similar to those in computer vision. In this paper, we propose a feature augmentation method for graph nodes based on topological regularization, in which topological structure information is introduced into end-to-end model. Specifically, w...
Applications like personal assistants need to be aware of the user's context, e.g., where they are, what they are doing, and with whom. Context information is usually inferred from sensor data, like GPS sensors and accelerometers on the user's smartphone. This prediction task is known as context recognition. A well-defined context model is fundament...
Literature has indicated that negative emotions may lead students to disengagement in teaching activities. Furthermore, the contagion of negative emotion is similar to infectious disease diffusion that drives more students into negative emotions. However, few methods have been brought forward to intervene in negative emotional contagion in real tim...
Knowledge graphs (KGs) have been proven to be effective for improving the performance of recommender systems. KGs can store rich side information and relieve the data sparsity problem. There are many linked attributes between entity pairs (e.g., items and users) in KGs, which can be called multiple-step relation paths. Existing methods do not suffi...
E-learners face a large amount of fragmented learning content during e-learning. How to extract and organize this learning content is the key to achieving the established learning target, especially for non-experts. Reasonably arranging the order of the learning objects to generate a well-defined learning path can help the e-learner complete the le...
With the improvement of living standards, people are paying more attention to healthcare, but there is still a long way to go to improve healthcare. A usable, intelligent aided diagnosis measure can be helpful for people to achieve daily health management. Several studies suggested that tongue features can directly reflect a person’s physical state...
Scientific retrieval systems need to be given domain search terms for searching publications. However, as natural language, the search terms provided by users are often fuzzy and limited, and some relevant terms are always overlooked in searching. Meanwhile, users always desire to be given domain related keywords to enlighten themselves what other terms...
With the improvement of people's living standards, people pay more and more attention to healthcare, in which a healthy diet plays an important role. Therefore, a scientific knowledge management method about healthy diet which can integrate heterogeneous information from different sources and formats is urgently needed to reduce the information gap...
In the past 40 years, with the changes to dietary structure and the dramatic increase in the consumption of meat products in developing countries, especially in China, encouraging populations to maintain their previous healthy eating patterns will have health, environmental, and economic co-benefits. Healthy diet education plays an important role i...
Improving health awareness is essential to health and healthcare sustainability. How to arouse attention to the health of people and encourage them to attend to healthcare progress so that we can reduce the costs of promoting healthcare by achieving more with less effort remains to be explored. In this paper, we provide a simplified health manageme...
In recent years, with the rapid growth of science and innovation, plenty of constantly-updated scientific achievements containing innovative knowledge can be acquired and used to solve problems. However, most undergraduate students and non-researchers cannot use them efficiently. In traditional teacher-centric education, education for sustainabilit...
The understanding of the structure of knowledge is an essential step of education. Although teachers offer the information foundation and relationship among knowledge points, there are still few methods to encourage students to explore the structure of knowledge by themselves outside of classes. This paper explores the gamification method and the k...
Due to the impact of economic globalization, the talent construction of key disciplines in science and technology should be administrated with humanism. An analysis of existing articles shows that the research of talent development mainly relates to the following aspects: cultivating objectives, cultivator, cultivation way, and evaluation criteria...
Serous ovarian carcinoma (SOC) is one of the most life-threatening types of gynecological malignancy, but the pathogenesis of SOC remains unknown. Previous studies have indicated that differentially expressed genes and microRNAs (miRNAs) serve important functions in SOC. However, genes and miRNAs are identified in a dispersed form, and limited infor...
Considering emerging technologies of the intelligent space and wireless sensor network, a business architecture, system architecture and technology architecture are proposed for the mission planning system of a space sensor network. The business architecture of a mission planning system is constructed from application flows, such as those of task m...
Evaluating the quality of scientific publications with dual hesitant fuzzy information gives rise to the problem of multiple-attribute decision making. Motivated by the idea of Hamacher aggregation operators and the Choquet integral, we have developed a dual hesitant fuzzy Hamacher correlated average (DHFHCA) operator. Using this operator, we have...
Scientific and technological papers play a fundamental role in the scientific and technological innovation of countries. The quality control of scientific and technological articles is vital to the journals and management of personnel. This paper investigates multiple attribute decision-making problems with the application of hesitant fuzzy uncerta...
Purpose – This paper aims to propose an entity-based scientific metadata schema, i.e. Scientific Knowledge Object (SKO) Types. During the past 50 years, many metadata schemas have been developed in a variety of disciplines. However, current scientific metadata schemas focus on describing data, but not entities. They are descriptive, but few of them...
Purpose – This paper applies the knowledge-based genetic algorithm to solve the optimization problem in complex products technological processes.
Design/methodology/approach – The knowledge-based genetic algorithm (KGA) is defined as a hybrid genetic algorithm (GA) which combined the GA model with the knowledge model. The GA model searches the fea...
As the amount of networking information production keeps growing, big data analysis has been widely used in the field of education. In recent years, teacher-student interactions on the Web have been increasing, which generates a lot of data. The use of big data analysis methods to analyze these data has recently become a popular research topic. In this paper,...
We live in a society of rapid technological development and use social networking sites almost every day, but the main purpose of these social networks is not sharing knowledge. Within the university campus, sharing knowledge and learning from experience is very important. Joomla! is an internationally renowned content management system, whic...
During the past fifty years, many metadata schemas have been developed in a variety of disciplines. However, current scientific metadata schemas focus on describing data, but not entities. They are descriptive, but few of them are structural and administrative. SKO Types is an entity-oriented theory for representing and linking Scientific Knowledge...
Scientific publishing is currently undergoing significant paradigm shifts, as it makes the transition from print to electronic format, from subscribers only to open access and from static information to a dynamic knowledge space. In this paper, we investigate four tremendously promising online publishing systems and projects as a short review of st...
We believe that semantic annotation will undermine the traditional way of reading and knowledge dissemination. In this paper, we introduce the SKOTeX, whose name is derived from LaTeX and BibTeX, respectively an editing tool which enables users to generate semantic enriched documentation, and a file format that specifies sets of annotating commands...
During the past fifty years, many metadata schemas have been developed in a variety of disciplines. In this paper, we delve into five state-of-the-art metadata schemas that are widely used in scientific publishing areas and most related to our research, i.e. Dublin Core, LOM, BibTeX, Schema.org and SKO Types.
Scientific discourses have obviously enhanced their accessibility and reusability in response to the development of Semantic Web technologies. A handful of representation models of discourse representation have been proposed during these years for semantic search and strategy reading. In this paper, we delineate the relationships that operate betwe...
We live in a social and technology-driven world, and academic life is no different. Although there are several off-the-shelf social networking sites, such as Renren and Weibo, universities aim to create academic engagement networks that specifically foster communication and collaboration among students and faculties, and ultimately support advanced...
The data transmission dynamic scheduling is a process that allocates the ground stations and available time windows to the data transmission tasks dynamically for improving the resource utilization. A novel heuristic is proposed to solve the data transmission dynamic scheduling problem. The characteristic of this heuristic is the dynamic hybridizat...
In this paper, a hybrid approach is proposed to improve the precision of particle-size-distribution retrieval. This approach includes a Mie-Matrix-Optimizing Method (MMOM) and a combined model. The MMOM method is applied to improve the condition number of the Mie matrix. The combined model is employed to approximate the particle size distributions. The prop...
Simulation optimization studies the problem of optimizing simulation-based objectives. Simulation optimization is a new and hot topic in the field of system simulation and operational research. To improve the search efficiency, this paper presents a hybrid approach that combines a genetic algorithm with a local optimization technique for simulation opt...
The Extended Capacitated Arc Routing Problem (ECARP) is a challenging vehicle routing problem with numerous real-world applications. We propose an improved evolutionary approach to cope with the ECARP in this research. The exploitation of heuristic information characterizes our approach. Two kinds of heuristic information, Arc Assignment Priority I...
Only a few models or frameworks for making situated use of advanced technology in scientific publication scenarios are available. Moreover, most of the existing prototypes and applications lack specific expertise and semantics. The approach we proposed in our project is based on some fundamental theories especially specified for managing scientific k...
A traditional online scientific discourse always lacks a semantically structured representation. In practice, the logical structure of a discourse itself is usually hidden within the publication's content and kept in mind by the authors, but hardly ever articulated explicitly to readers. Such a barrier makes it difficult for readers to navigate...
Writing scientific discourses and publishing academic results are integral parts of a researcher’s daily professional life. Although tremendous magic has been brought by the advancement of digital library technologies and social networking services, there are still no off-the-shelf utilities for strategic writing, reading and even publishing. In this...
With the increase in Quality of Service (QoS) requirements raised by various kinds of Internet services, the efficiency of QoS support has become more and more crucial. An optimization approach to the Internet quality of service based on an improved genetic algorithm is proposed in this paper. In this improved genetic algorithm, the initial popul...
Web technology is revolutionizing the way diverse scientific knowledge is produced and disseminated. In the past few years, a handful of discourse representation models have been proposed for the externalization of the rhetoric and argumentation captured within scientific publications. However, there hasn’t been a unified interoperable pattern that...
Writing and publishing scientific papers is an integral part of a researcher’s life for knowledge dissemination. In China, most on-line publishing today remains an exhibition of electronic facsimiles of traditional articles without being in step with the advancement of the Semantic Web. Our project aims to propose a pattern-based approach for scientific...
With the advancement of digital library techniques and open access services, more and more off-the-shelf utilities for managing scientific publications are emerging and widely used. Nevertheless, most online articles of today remain the electronic facsimiles of the traditional linear structured papers, lacking semantics and interlinked knowl...
Email has become one of the fastest and most economical forms of communication. Email is also one of the most ubiquitous and pervasive applications used on a daily basis by millions of people worldwide. However, the increase in email users has resulted in a dramatic increase in spam emails during the past few years. This paper proposes a new spam f...
Navigation online is one of the most common daily experiences in the research communities. Although researchers could get more and more benefits from the development of Web Technologies, most online discourses today are still the electronic facsimiles of traditional linear structured articles. In this paper, we propose a pattern-based representatio...
Scientific papers and scientific conferences are still, despite the emergence of several new dissemination technologies, the de-facto standard in which scientific knowledge is consumed and discussed. While there is no shortage of services and platforms that aid this process (e.g. scholarly search engines, websites, blogs, conference management prog...
Managing ubiquitous scientific knowledge is a part of daily life for scholars, while it also becomes a hot topic in the Semantic Web research community. In this paper, we propose a SKO Types framework aiming to facilitate managing ubiquitous Scientific Knowledge Objects (SKO) driven by semantic authoring, modularization, annotation and search. SKO...