Han Liu

Shenzhen University · College of Computer Science and Software Engineering

BSc, MSc, PhD


114
Publications
45,614
Reads
1,544
Citations
Introduction
I am an Assistant Professor in Machine Learning at Shenzhen University. I received a BSc in Computing from the University of Portsmouth in 2011, an MSc in Software Engineering from the University of Southampton in 2012, and a PhD in Machine Learning from the University of Portsmouth in 2015. My research interests include data mining, machine learning, neural networks, rule-based systems, intelligent systems, fuzzy systems, and big data analytics.
Additional affiliations
June 2020 - May 2022
Shenzhen University
Position
  • Research Associate
Description
  • Working as a member of the machine learning research group and undertaking research tasks on deep symbolic learning.
June 2017 - April 2020
Cardiff University
Position
  • Research Associate
Description
  • Working as a member of the Social Data Science Lab and undertaking the project entitled "Centre for Cyberhate Research and Policy" in the context of machine learning and text mining.
June 2016 - April 2017
University of Portsmouth
Position
  • Research Associate
Description
  • Working as a member of the computational intelligence research group and undertaking specific tasks on research projects.
Education
February 2013 - November 2015
University of Portsmouth
Field of study
  • Machine Learning
September 2011 - September 2012
University of Southampton
Field of study
  • Software Engineering
September 2009 - July 2011
University of Portsmouth
Field of study
  • Computing


Publications (114)
Conference Paper
Full-text available
In traditional machine learning, classifier training is typically undertaken in the setting of single-task learning, so the trained classifier can discriminate between different classes. However, this must be based on the assumption that different classes are mutually exclusive. In real applications, the above assumption does not always hold. Fo...
Article
Full-text available
Classification is a special type of machine learning task, which is essentially achieved by training a classifier that can be used to classify new instances. In order to train a high performance classifier, it is crucial to extract representative features from raw data, such as text and images. In reality, instances could be highly diverse even if...
Article
Full-text available
Rule learning approaches, which essentially aim to generate a decision tree or a set of “if-then” rules, have been popularly used in practice for automatically building rule-based models for prediction tasks, e.g., classification and regression. The key strength of rule-based models is their ability to interpret how an output is obtained given an i...
Article
Full-text available
Binary Decomposition can be adopted in ordered and unordered ways. Inspired by the case that label order information can be exploited to improve the ordinal classification performance through adopting an ordered decomposition strategy, this paper explores whether the effectiveness of binary decomposition in nominal classification tasks can be impro...
Article
Full-text available
Sentiment analysis is a very popular application area of text mining and machine learning. The popular methods include Support Vector Machine, Naive Bayes, Decision Trees and Deep Neural Networks. However, these methods generally belong to discriminative learning, which aims to distinguish one class from others with a clear-cut outcome, under the p...
Article
Full-text available
Hateful individuals and groups have increasingly been using the Internet to express their ideas, spread their beliefs, and recruit new members. Understanding the network characteristics of these hateful groups could help understand individuals’ exposure to hate and derive intervention strategies to mitigate the dangers of such networks by disrupt...
Conference Paper
Full-text available
Machine learning has become a popular approach for automatic detection of specific patterns. However, each learning algorithm could have its own advantages and disadvantages for dealing with special types of data, e.g. heuristic algorithms could generally lead to the production of biased classifiers, especially when learning from a small data sampl...
Conference Paper
Full-text available
Rule learning has been a popular machine learning branch owing to the advantage of such learning approaches in model interpretability. Since rule learning approaches are generally sensitive to the changes of training data, i.e., a slight modification of the training data may lead to a large variation on the performance of prediction, it results in...
Article
Due to the rapid development of electronic journals, selecting appropriate journals to publish research papers has become a significant challenge to researchers. Sometimes, even a high-quality paper may get rejected from the editor due to the mismatch between the topic of the paper and the scope of the journal. To address this issue, we present a f...
Article
Full-text available
Quantifying the uncertainty of supervised learning models plays an important role in making more reliable predictions. Epistemic uncertainty, which usually is due to insufficient knowledge about the model, can be reduced by collecting more data or refining the learning models. Over the last few years, scholars have proposed many epistemic uncertain...
Article
Full-text available
The presence of missing data is a challenging issue in processing real-world datasets. It is necessary to improve the data quality by imputing the missing values so that effective learning from data can be achieved. Recently, deep learning has become the most powerful type of machine learning techniques, which can be used for discovering the hidden...
Article
Full-text available
In classification tasks, unlabeled data bring the uncertainty in the learning process, which may result in the degradation of the performance. In this paper, we propose a novel semi-supervised Inception Neural Network Ensemble based architecture to achieve missing label imputation. The main idea of the proposed architecture is to use smaller ensemb...
Preprint
Full-text available
Quantifying the uncertainty of supervised learning models plays an important role in making more reliable predictions. Epistemic uncertainty, which usually is due to insufficient knowledge about the model, can be reduced by collecting more data or refining the learning models. Over the last few years, scholars have proposed many epistemic uncertain...
Article
Although decision trees have been widely applied to different security related applications, their security has not been investigated extensively in an adversarial environment. This work aims to study the robustness of classical decision tree (DT) and Fuzzy decision tree (FDT) under evasion attack that manipulate the features in order to mislead th...
Article
Full-text available
The presence of missing data is a common and pivotal issue, which generally leads to a serious decrease of data quality and thus indicates the necessity to effectively handle missing data. In this paper, we propose a missing value imputation approach driven by Fuzzy C-Mean clustering to improve the classification accuracy by referring only to the k...
Article
Full-text available
In real life scenarios, classification problems with the characteristics of monotonicity constraints and imbalanced class distribution widely exist. However, at present, the research on this kind of problem is still rare. Traditional algorithms designed only for monotonic classification and imbalanced classification are not available for monotonic imbal...
Article
Full-text available
In this research, we propose two Particle Swarm Optimisation (PSO) variants to undertake feature selection tasks. The aim is to overcome two major shortcomings of the original PSO model, i.e., premature convergence and weak exploitation around the near optimal solutions. The first proposed PSO variant incorporates four key operations, including a m...
Article
Full-text available
Data-driven process monitoring methods have attracted great attention because they can provide an efficient way to cope with the industrial process without the need of first-principle models. The local information that results from the reaction process often influences the process monitoring result. Unfortunately, this local information...
Article
Full-text available
Due to the rapid development of human–computer interaction, affective computing has attracted more and more attention in recent years. In emotion recognition, Electroencephalogram (EEG) signals are easier to be recorded than other physiological experiments and are not easily camouflaged. Because of the high dimensional nature of EEG data and the di...
Article
Full-text available
Deep learning (DL) has emerged as a powerful image processing technique that learns the features of the data and produces state-of-the-art prediction results. The decade from 2010 to 2020 is a real revival of DL, which has come to a turning point in history. In image classification, many deep learning networks have been proposed by scholars, and ea...
Conference Paper
Full-text available
With the recent booming of artificial intelligence (AI), particularly deep learning techniques, digital healthcare is one of the prevalent areas that could gain benefits from AI-enabled functionality. This research presents a novel AI-enabled Internet of Things (IoT) device operating from the ESP-8266 platform capable of assisting those who suffer...
Article
Full-text available
The use of skeleton data for human posture recognition is a key research topic in the human-computer interaction field. To improve the accuracy of human posture recognition, a new algorithm based on multiple features and rule learning is proposed in this paper. Firstly, a 219-dimensional vector that includes angle features and distance features is...
Article
Full-text available
Shape recognition is a fundamental problem and a special type of image classification, where each shape is considered as a class. Current approaches to shape recognition mainly focus on designing low-level shape descriptors, and classify them using some machine learning approaches. In order to achieve effective learning of shape features, it is ess...
Article
Full-text available
In recent years, the increasing prevalence of hate speech in social media has been considered as a serious problem worldwide. Many governments and organizations have made significant investment in hate speech detection techniques, which have also attracted the attention of the scientific community. Although plenty of literature focusing on this iss...
Article
Full-text available
Handwritten digits recognition has been treated as a multi-class classification problem in the machine learning context, where each of the ten digits (0-9) is viewed as a class and the machine learning task is essentially to train a classifier that can effectively discriminate the ten classes. In practice, it is very usual that the performance of a...
Article
Full-text available
In this article, we conduct a comprehensive study of online antagonistic content related to Jewish identity posted on Twitter between October 2015 and October 2016 by UK-based users. We trained a scalable supervised machine learning classifier to identify antisemitic content to reveal patterns of online antisemitism perpetration at the source. We b...
Conference Paper
Full-text available
This paper presents a system developed during our participation (team name: scmhl5) in the TRAC-2 Shared Task on aggression identification. In particular, we participated in English Sub-task A on three-class classification ('Overtly Aggressive', 'Covertly Aggressive' and 'Non-aggressive') and English Sub-task B on binary classification for Misogyni...
Preprint
Full-text available
With the recent booming of artificial intelligence (AI), particularly deep learning techniques, digital healthcare is one of the prevalent areas that could gain benefits from AI-enabled functionality. This research presents a novel AI-enabled Internet of Things (IoT) device operating from the ESP-8266 platform capable of assisting those who suffer...
Chapter
As web data evolves, new technological challenges arise and one of the contributing factors to these challenges is the online social networks. Although they have some benefits, their negative impact on vulnerable users such as the spread of suicidal ideation is concerning. As such, it is vital to fine tune the approaches and techniques in order to...
Book
This book covers virtually all aspects of image formation in medical imaging, including systems based on ionizing radiation (x-rays, gamma rays) and non-ionizing techniques (ultrasound, optical, thermal, magnetic resonance, and magnetic particle imaging) alike. In addition, it discusses the development and application of computer-aided detection an...
Article
Full-text available
Handwritten character recognition has been profoundly studied for many years in the field of pattern recognition. Due to its vast practical applications and financial implications, handwritten character recognition is still an important research area. In this research, the Handwritten Ethiopian Character Recognition (HECR) dataset has been prepared...
Conference Paper
Full-text available
Due to the presence of shades of grey in reality, the fuzzy set theory has been popularly used for dealing with real-life classification problems in the setting of machine learning. In particular, fuzzy rule-based systems are trained using real-life data for classifying new instances that involve fuzziness, imprecision and uncertainty. Traditional a...
Article
Full-text available
Offensive or antagonistic language targeted at individuals and social groups based on their personal characteristics (also known as cyber hate speech or cyberhate) has been frequently posted and widely circulated via the World Wide Web. This can be considered as a key risk factor for individual and societal tension surrounding regional instability....
Article
Full-text available
In this research, we propose two variants of the Firefly Algorithm (FA), namely inward intensified exploration FA (IIEFA) and compound intensified exploration FA (CIEFA), for undertaking the obstinate problems of initialization sensitivity and local optima traps of the K-means clustering model. To enhance the capability of both exploitation and exp...
Article
Full-text available
Due to the vast and rapid increase in the size of data, machine learning has become an increasingly popular approach to data classification, which can be done by training a single classifier or a group of classifiers. A single classifier is typically learned by using a standard algorithm, such as C4.5. Due to the fact that each of the standard lea...
Article
Full-text available
The authors regret that summary statistics reported in the original published version were incorrect. All summary statistics, including figures 4-7, have been corrected in the online and print version of the paper. © The Author(s) 2019. Published by Oxford University Press on behalf of the Centre for Crime and Justice Studies (ISTD).
Article
Full-text available
Rule learning is a special type of machine learning approach, and its key advantage is the generation of interpretable models, which provides a transparent process of showing how an input is mapped to an output. Traditional rule learning algorithms are typically based on Boolean logic for inducing rule antecedents, which are very effective for tr...
Article
Full-text available
National governments now recognize online hate speech as a pernicious social problem. In the wake of political votes and terror attacks, hate incidents online and offline are known to peak in tandem. This article examines whether an association exists between both forms of hate, independent of ‘trigger’ events. Using Computational Criminology that...
Conference Paper
Full-text available
Image classification is a special type of applied machine learning task, where each image can be treated as an instance if there is only one target object that belongs to a specific class and needs to be recognized from an image. In the case of recognizing multiple target objects from an image, the image classification task can be formulated as im...
Conference Paper
Full-text available
Image classification is a special type of classification task in the setting of supervised machine learning. In general, in order to achieve good performance of image classification, it is important to select high quality features for training classifiers. However, different instances of images would usually present very diverse features even if t...
Conference Paper
Full-text available
In recent years, targeted sentiment analysis has received great attention as a fine-grained sentiment analysis. Determining the sentiment polarity of a specific target in a sentence is the main task. This paper proposes a multi-channel convolutional neural network (MCL-CNN) for targeted sentiment classification. Our approach can not only paralleliz...
Conference Paper
Full-text available
Dissolved oxygen of aquaculture is an important measure of the quality of culture environment and how aquatic products have been grown. In the machine learning context, the above measure can be achieved by defining a regression problem, which aims at numerical prediction of the dissolved oxygen status. In general, the vast majority of popular machi...
Conference Paper
Full-text available
Binocular stereo matching aims to obtain disparities from two very close views. Existing stereo matching methods may cause false matching when there is much image noise and many disparity discontinuities. This paper proposes a novel binocular stereo matching algorithm based on SAD and improved Census transformation. We first perform improved Census tra...
Conference Paper
Full-text available
Face sketch recognition identifies the face photo from a large face sketch dataset. Some traditional methods are typically used to reduce the modality gap between face photos and sketches and gain excellent recognition rate based on a pseudo image which is synthesized using the corresponding face photo. However, these methods cannot obtain better hi...
Article
Full-text available
Rule learning is one of the most popular types of machine learning approaches, which typically follow two main strategies: 'divide and conquer' and 'separate and conquer'. The former strategy is aimed at induction of rules in the form of a decision tree, whereas the latter one is aimed at direct induction of if-then rules. Due to the case that the...
Article
Full-text available
In machine learning tasks, it is essential for a data set to be partitioned into a training set and a test set in a specific ratio. In this context, the training set is used for learning a model for making predictions on new instances, whereas the test set is used for evaluating the prediction accuracy of a model on new instances. In the context of...
Article
Full-text available
Rule learning is a popular branch of machine learning, which can provide accurate and interpretable classification results. In general, two main strategies of rule learning are referred to as 'divide and conquer' and 'separate and conquer'. Decision tree generation that follows the former strategy has a serious drawback, which is known as the repl...
Preprint
Full-text available
Sentiment analysis is a very popular application area of text mining and machine learning. The popular methods include Support Vector Machine, Naive Bayes, Decision Trees and Deep Neural Networks. However, these methods generally belong to discriminative learning, which aims to distinguish one class from others with a clear-cut outcome, under the p...
Conference Paper
Full-text available
Classification is a popular task of supervised machine learning, which can be achieved by training a single classifier or a group of classifiers. In general, the performance of each traditional learning algorithm which leads to the production of a single classifier is varied on different data sets, i.e., each learning algorithm may produce good cla...
Article
Full-text available
In traditional machine learning, classification is typically undertaken in the way of discriminative learning by using probabilistic approaches, i.e. learning a classifier that discriminates one class from other classes. The above learning strategy is mainly due to the assumption that different classes are mutually exclusive and each instance is cl...
Conference Paper
Full-text available
This paper presents an occlusion robust tracking (ORT) method for multiple faces tracking. Given a video having multiple faces, we firstly detect faces in the first frame using the off-the-shelf face detector, and then extract wavelet packet transform (WPT) coefficients and color features from the detected faces, finally we design a back propagation (BP...
Conference Paper
Full-text available
Feature selection is typically employed before or in conjunction with classification algorithms to reduce the feature dimensionality and improve the classification performance, as well as reduce processing time. While particular approaches have been developed for feature selection, such as filter and wrapper approaches, some algorithms perform feat...
Conference Paper
Full-text available
For this study, we used the Doc2Vec embedding approach for feature extraction, with the context window size of 2, minimum word frequency of 2, sampling rate of 0.001, learning rate of 0.025, minimum learning rate of 1.0E-4, 200 layers, batch size of 10000 and 40 epochs. Distributed Memory (DM) is used as the embedding learning algorithm with the ne...
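For readers who want to reproduce a comparable setup, the hyperparameters listed in the last abstract above can be collected into a single configuration object. The following is a minimal sketch, not the authors' code: the class name `Doc2VecSettings`, its field names, and the mapping to gensim-style keyword names are all assumptions made for illustration. In particular, reading "200 layers" as an embedding dimensionality of 200 is a guess, and the reported batch size is kept as a plain field with no mapping because its meaning depends on the toolkit the authors used. The truncated part of the abstract is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc2VecSettings:
    """Hyperparameter values as reported in the abstract (field names are hypothetical)."""
    window: int = 2                    # context window size
    min_word_frequency: int = 2        # words seen fewer times are ignored
    sampling_rate: float = 0.001       # sub-sampling rate for frequent words
    learning_rate: float = 0.025       # initial learning rate
    min_learning_rate: float = 1.0e-4  # floor for the decayed learning rate
    layers: int = 200                  # ASSUMPTION: interpreted as vector size
    batch_size: int = 10000            # kept for reference only, not mapped below
    epochs: int = 40
    distributed_memory: bool = True    # DM embedding learning algorithm

    def to_gensim_kwargs(self) -> dict:
        # ASSUMPTION: an illustrative mapping onto gensim's Doc2Vec keyword
        # names; the abstract does not say which library was used.
        return {
            "vector_size": self.layers,
            "window": self.window,
            "min_count": self.min_word_frequency,
            "sample": self.sampling_rate,
            "alpha": self.learning_rate,
            "min_alpha": self.min_learning_rate,
            "epochs": self.epochs,
            "dm": 1 if self.distributed_memory else 0,
        }


SETTINGS = Doc2VecSettings()
GENSIM_KWARGS = SETTINGS.to_gensim_kwargs()
```

If gensim is installed, these keyword arguments could be passed along with a list of tagged documents, e.g. `Doc2Vec(documents=tagged_docs, **GENSIM_KWARGS)`, where `tagged_docs` is a hypothetical list of `TaggedDocument` objects. Results would not necessarily match the paper, since toolkit defaults differ.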