About
73 Publications
29,373 Reads
1,564 Citations
Publications (73)
This work addresses a soft-clustered vehicle routing problem that extends the classical capacitated vehicle routing problem with one additional constraint, that is, customers are partitioned into clusters and all customers of the same cluster must be served by the same vehicle. Its potential applications include parcel delivery in courier companies...
This work studies a soft-clustered capacitated arc routing problem that extends the classical capacitated arc routing problem with an important constraint. The problem has a set of required edges (e.g., the streets to be serviced) that are partitioned into clusters. The constraint ensures that all required edges of the same cluster are served by th...
Rank aggregation combines the preference rankings of multiple alternatives from different voters into a single consensus ranking, providing a useful model for a variety of practical applications but posing a computationally challenging problem. In this paper, we provide an effective hybrid evolutionary ranking algorithm to solve the rank aggregatio...
This study considers a well-known critical node detection problem that aims to minimize a pairwise connectivity measure of an undirected graph via the removal of a subset of nodes (referred to as critical nodes) subject to a cardinality constraint. Potential applications include epidemic control, emergency response, vulnerability assessment, carbon...
The capacitated electric vehicle routing problem (CEVRP) extends the traditional vehicle routing problem by simultaneously considering the service order of the customers and the recharging schedules of the vehicles. Due to its NP-hard nature, we decompose the original problem into two sub-problems: a capacitated vehicle routing problem (CVRP) and a...
The $k$-vertex cut ($k$-VC) problem belongs to the family of critical node detection problems and aims to find a minimum subset of vertices whose removal decomposes a graph into at least $k$ connected components. It is an important NP-hard problem with various real-world applications, e.g., vulnerability assessment, carbon emissions tracki...
A bottleneck-minimized colored traveling salesman problem is an important variant of colored traveling salesman problems. It is useful for handling planning problems with partially overlapping workspaces, such as scheduling transportation resources for the timely delivery of goods. In this work, we propose an efficient bi-trajectory hybrid search m...
Detecting critical nodes in sparse graphs is important in a variety of application domains, such as network vulnerability assessment, epidemic control, and drug design. The critical node problem (CNP) aims to find a set of critical nodes from a network whose deletion maximally degrades the pairwise connectivity of the residual network. Due to its g...
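As an illustration of the objective described in this abstract, the sketch below computes the pairwise connectivity of the residual graph, taken here in its standard sense: the number of node pairs still joined by a path after a set of nodes is deleted. The function name `pairwise_connectivity`, the adjacency-dictionary representation, and the toy path graph are assumptions made for this example. They are not taken from the paper, and this is not the authors' algorithm, only an evaluation of the measure.

```python
from collections import deque


def pairwise_connectivity(adj, removed):
    """Count node pairs still connected by a path after deleting `removed`.

    `adj` maps each node to a list of its neighbours (undirected graph).
    A connected component of size s contributes s * (s - 1) / 2 pairs.
    """
    removed = set(removed)
    seen = set()
    total = 0
    for start in adj:
        if start in removed or start in seen:
            continue
        # Breadth-first search to measure the component containing `start`.
        size = 0
        queue = deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in removed and v not in seen:
                    seen.add(v)
                    queue.append(v)
        total += size * (size - 1) // 2
    return total


# Hypothetical toy instance: the path graph 0-1-2-3-4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

print(pairwise_connectivity(path, []))   # all 5 nodes in one component
print(pairwise_connectivity(path, [2]))  # middle node removed
print(pairwise_connectivity(path, [0]))  # end node removed
```

On this toy path, deleting the middle node leaves two components of two nodes each, while deleting an end node leaves one component of four nodes, so the middle node is the more critical of the two under this measure.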
Rank aggregation combines the preference rankings of multiple alternatives from different voters into a single consensus ranking, providing a useful model for a variety of practical applications, but posing a computationally challenging problem. In this paper, we provide an effective hybrid evolutionary ranking algorithm to solve the rank aggregati...
This study considers a well-known critical node detection problem that aims to minimize a pairwise connectivity measure of an undirected graph via the removal of a subset of nodes (referred to as critical nodes) subject to a cardinality constraint. Potential applications include epidemic control, emergency response, vulnerability assessment, carb...
The distance-based critical node problem involves identifying a subset of nodes in a graph such that the removal of these nodes leads to a residual graph with the minimum distance-based connectivity. Due to its NP-hard nature, solving this problem using exact algorithms has proved to be highly challenging. Moreover, existing heuristic algorithms ar...
The $\alpha$-separator problem ($\alpha$-SP) consists of finding the minimum set of vertices whose removal separates the network into multiple different connected components with fewer than a limited number of vertices in each component, which belongs to the family of critical node detection problems. The $\alpha$-SP problem is an importan...
A coloring traveling salesman problem (CTSP) generalizes the well-known multiple traveling salesman problem, where colors are used to differentiate salesmen’s accessibility to the individual cities to be visited. As a useful model for a variety of complex scheduling problems, CTSP is computationally challenging. In this paper, we propose a Multi-ne...
This paper considers a category of classification problems where the samples of different classes are not represented equally. It arises in a variety of application areas and has been widely studied in pattern recognition. This paper focuses on enhancing the original data representation by combining the gravitation-based method with multiple empiri...
We present frequent pattern-based search (FPBS) that combines data mining and optimization. FPBS is a general-purpose method that unifies data mining and optimization within the population-based search framework. The method emphasizes the relevance of a modular- and component-based approach, making it applicable to optimization problems by instanti...
Finding an optimal set of critical nodes in a complex network has been a long-standing problem in the fields of both artificial intelligence and operations research. Potential applications include epidemic control, network security, carbon emission monitoring, emergency response, drug design, and vulnerability assessment. In this work, we consider...
Rank aggregation aims to combine the preference rankings of a number of alternatives from different voters into a single consensus ranking. Although it is a useful model for a variety of practical applications, it is a computationally challenging problem. In this paper, we propose an effective hybrid evolutionary ranking algorithm to solve the rank ag...
Detecting critical nodes in sparse networks is important in a variety of application domains. A Critical Node Problem (CNP) aims to find a set of critical nodes from a network whose deletion maximally degrades the pairwise connectivity of the residual network. Due to its general NP-hard nature, state-of-the-art CNP solutions are based on heuristic...
Electronic health records (EHRs) in hospital information systems contain patients’ diagnoses and treatments, so EHRs are essential to clinical data mining. Of all the tasks in the mining process, Chinese word segmentation (CWS) is a fundamental and important one, and most state-of-the-art methods greatly rely on large scale of manually annotated da...
Yan Jin, Bowen Xiong, Kun He, [...], Yi Zhou
Maximal Clique Enumeration (MCE) is a fundamental and challenging problem in various graph theory and network applications. Numerous algorithms have been proposed in the past decades; however, only a few of them focus on improving the practical efficiency in large graphs. To this end, we propose an efficient algorithm called FACEN based on the Bron–K...
Population-based memetic algorithms have been successfully applied to solve many difficult combinatorial problems. Often, a population of fixed size is used in such algorithms to record some best solutions sampled during the search. However, given the particular features of the problem instance under consideration, a population of variable size wou...
Identifying critical nodes is an efficient way to analyze and understand the properties, structures, and functions of complex networks, but it is a challenging NP-hard problem. This paper studies a node-weighted version of the critical node problem (NWCNP) that involves minimizing the pairwise connectivity measure of a given node-weighted graph via...
In real scenes, humans can easily infer their positions and distances from other objects with their own eyes. To give robots the same visual ability, this paper presents an unsupervised OnionNet framework, including LeafNet and ParachuteNet, for single-view depth prediction and camera pose estimation. In OnionNet, for speeding up OnionNet’...
Enriching a terminology base (TB) is an important and continuous process, since formal terms can be renamed and new term aliases emerge all the time. As a potential supplement for TB enrichment, the electronic health record (EHR) is a fundamental source for clinical research and practice. The task to align the set of external terms in EHRs to TB can be...
Clinical Named Entity Recognition (CNER) aims to automatically identify clinical terminologies in Electronic Health Records (EHRs), which is a fundamental and crucial step for clinical research. Training a high-performance model for CNER usually requires a large number of EHRs with high-quality labels. However, labeling EHRs, especially Chinese...
Background: Electronic health records (EHRs) provide possibilities to improve patient care and facilitate clinical research. However, there are many challenges faced by the applications of EHRs, such as temporality, high dimensionality, sparseness, noise, random error and systematic bias. In particular, temporal information is difficult to effecti...
Named entities composed of multiple consecutive words frequently occur in domain-specific knowledge graphs. In general, these named entities are composable and extensible, such as names of symptoms and diseases in the medical domain. To distinguish them from general entities, we refer to them as compound entities, and try to identify hypernymy relations between...
Heart Failure (HF) is one of the most common causes of hospitalization and is burdened by short-term (in-hospital) and long-term (6 to 12 month) mortality. Accurate prediction of HF mortality plays a critical role in evaluating early treatment effects. However, due to the lack of a simple and effective prediction model, mortality prediction of HF i...
By dividing the original data set into several subsets, Multiple Partial Empirical Kernel Learning (MPEKL) constructs multiple kernel matrices corresponding to the subsets, and these kernel matrices are decomposed to provide the explicit kernel functions. Then, the instances in the original data set are mapped into multiple kernel spaces, which p...
Critical node problems involve identifying a subset of critical nodes from an undirected graph whose removal optimizes a pre-defined measure over the residual graph. As useful models for a variety of practical applications, these problems are computationally challenging. In this paper, we study the classic critical node problem (CNP) and...
Population-based memetic algorithms have been successfully applied to solve many difficult combinatorial problems. Often, a population of fixed size was used in such algorithms to record some best solutions sampled during the search. However, given the particular features of the problem instance under consideration, a population of variable size wo...
Enriching existing medical terminology knowledge bases (KBs) is an important and never-ending task for clinical research because new terminology aliases may be continually added and standard terminologies may be renamed. In this paper, we propose a novel automatic terminology enriching approach to supplement a set of terminologies to KBs. Speci...
Entity and relation extraction is a necessary step in structuring medical text. However, the feature extraction ability of the bidirectional long short-term memory network in the existing model does not achieve the best effect. At the same time, the language model has achieved excellent results in more and more natural language processing tasks....
Clinical text structuring is a critical and fundamental task for clinical research. Traditional methods such as task-specific end-to-end models and pipeline models usually suffer from a lack of datasets and from error propagation. In this paper, we present a question answering based clinical text structuring (QA-CTS) task to unify different specific tas...
The electronic health record is an important source for clinical research and applications, and errors inevitably occur in the data, which could cause severe harm to both patients and hospital services. One such error is the mismatch between diagnoses and prescriptions, which we refer to as 'medication anomaly' in the paper, and clinicians u...
Clinical named entity recognition (CNER) is a fundamental and crucial task for clinical and translational research. In recent years, deep learning methods have achieved significant success in CNER tasks. However, these methods depend greatly on recurrent neural networks (RNNs), which maintain a vector of hidden activations that are propagated through...
Coronary artery disease (CAD) is one of the leading causes of cardiovascular disease deaths. CAD progresses rapidly and, if not diagnosed and treated at an early stage, may eventually lead to irreversible death of the heart muscle. Invasive coronary arteriography is the gold standard technique for CAD diagnosis. Coronary arteriography...
Named entities are usually composable and extensible. Typical examples are names of symptoms and diseases in medical areas. To distinguish these entities from general entities, we name them compound entities. In this paper, we present an attention-based Bi-GRU-CapsNet model to detect hypernymy relationships between compound entities. Our model consi...
Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research. In recent years, deep learning methods have achieved significant success in CNER tasks. Howev...
Electronic Health Records (EHRs) provide possibilities to improve patient care and facilitate clinical research. However, there are many challenges faced by the applications of EHRs, such as temporality, high dimensionality, sparseness, noise, random error, and systematic bias. In particular, temporal patient information is difficult to effectively...
Named entities are usually composable and extensible. Typical examples are names of symptoms and diseases in medical areas. To distinguish these entities from general entities, we name them compound entities. In this paper, we present an attention-based Bi-GRU-CapsNet model to detect hypernymy relationships between compound entities. Our model consi...
Since 2008, a regional medical health platform has been built for managing electronic health records of top public hospitals in Shanghai. However, public hospitals often use different names for the same laboratory examination item (or lab indicator) in this regional platform, which seriously hinders the interconnection and sharing of medical i...
Clinical named entity recognition aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research. In recent years, deep neural networks have achieved significant success in named entity recognitio...
Medical activities, such as diagnoses, medicine treatments, and laboratory tests, as well as temporal relations between these activities, are the basic concepts in clinical research. However, existing relational data models for electronic medical records (EMRs) lack explicit and accurate semantic definitions of these concepts. This leads to the inconve...
Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research. In recent years, deep learning methods have achieved significant success in CNER tasks. Howev...
Coronary artery disease (CAD) is one of the leading causes of cardiovascular disease deaths. CAD progresses rapidly and, if not diagnosed and treated at an early stage, may eventually lead to irreversible death of the heart muscle. Invasive coronary arteriography is the gold standard technique for CAD diagnosis. Coronary arteriography...
Label ranking aims to learn a mapping from instances to rankings over a finite number of predefined labels. Random forest is a powerful method and one of the most successful general-purpose machine learning algorithms of modern times. In the literature, there appears to be no research yet on applying random forest to label ranking. In this paper,...
Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research. In recent years, deep neural networks have achieved significant success in named entity rec...
This paper presents an improved probability learning based local search algorithm for the well-known graph coloring problem. The algorithm iterates through three distinct phases: a starting coloring generation phase based on a probability matrix, a heuristic coloring improvement phase and a learning based probability updating phase. The method main...
This paper presents a hybrid approach called frequent pattern based search that combines data mining and optimization. The proposed method uses a data mining procedure to mine frequent patterns from a set of high-quality solutions collected from previous search, and the mined frequent patterns are then employed to build starting solutions that are...
The critical node problem (CNP) aims to identify a subset of critical nodes in an undirected graph such that removing these critical nodes minimizes the pairwise node connectivity over the residual graph. CNP has various applications; however, it is computationally challenging. This paper introduces FastCNP, a fast heuristic algorithm for solving t...
Given a set of $n$ elements separated by a pairwise distance matrix, the minimum differential dispersion problem (Min-Diff DP) aims to identify a subset of $m$ elements ($m < n$) such that the difference between the maximum sum and the minimum sum of the inter-element distances between any two chosen elements is minimized. We propose an effective itera...
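To make the Min-Diff DP objective concrete, the sketch below evaluates it for a chosen subset: for each selected element it sums the distances to the other selected elements, then returns the gap between the largest and smallest of those sums. The function name `min_diff_objective`, the 4-element distance matrix, and the brute-force enumeration are assumptions introduced for illustration. Exhaustive search is only workable on tiny instances and is not the iterated search method the abstract refers to.

```python
from itertools import combinations


def min_diff_objective(dist, subset):
    """Return max_i S(i) - min_i S(i) over the chosen elements, where
    S(i) is the sum of distances from i to every other chosen element."""
    sums = [sum(dist[i][j] for j in subset if j != i) for i in subset]
    return max(sums) - min(sums)


# Hypothetical symmetric distance matrix for n = 4 elements.
dist = [
    [0, 1, 4, 2],
    [1, 0, 3, 5],
    [4, 3, 0, 6],
    [2, 5, 6, 0],
]

m = 3
# Exhaustive search over all subsets of size m (feasible only for tiny n).
best_subset = min(
    combinations(range(len(dist)), m),
    key=lambda s: min_diff_objective(dist, s),
)
best_value = min_diff_objective(dist, best_subset)
print(best_subset, best_value)
```

For the subset (0, 1, 2) the per-element sums are 5, 4, and 7, giving an objective value of 3, which is also the smallest value over all subsets of size 3 in this toy matrix.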
Critical node problems involve identifying a subset of critical nodes from an undirected graph whose removal optimizes a pre-defined measure over the residual graph. As useful models for a variety of practical applications, these problems are computationally challenging. In this paper, we study the classic critical node problem (CNP) and...
As a commonly used model for a variety of practical applications, the maximum diversity problem (MDP) is computationally challenging. In this paper, we present an opposition-based memetic algorithm (OBMA) for solving MDP, which integrates the concept of opposition-based learning (OBL) into the well-known memetic search framework. OBMA explores both candidate...
Label ranking aims to learn a mapping from instances to rankings over a finite number of predefined labels. Random forest is a powerful and one of the most successful general-purpose machine learning algorithms of modern times. In this paper, we present a powerful random forest label ranking method which uses random decision trees to retrieve neare...
Given a set of $n$ elements separated by a pairwise distance matrix, the minimum differential dispersion problem (Min-Diff DP) aims to identify a subset of $m$ elements ($m < n$) such that the difference between the maximum sum and the minimum sum of the inter-element distances between any two chosen elements is minimized. We propose an effective itera...
Grouping problems aim to partition a set of items into multiple mutually disjoint subsets according to some specific criterion and constraints. Grouping problems cover a large class of important combinatorial optimization problems that are generally computationally difficult. In this paper, we propose a general solution approach for grouping proble...
Grouping problems aim to partition a set of items into multiple mutually disjoint subsets according to some specific criterion and constraints. Grouping problems cover a large class of important combinatorial optimization problems that are generally computationally difficult. In this paper, we propose a general solution approach for grouping proble...
Label ranking studies the issue of learning a model that maps instances to rankings over a finite set of predefined labels. To reduce the memory and time costs of training and prediction, we propose a novel approach to the label ranking problem based on the Gaussian mixture model. The key idea of the approach is to divide the...
The evaluation of classifiers' performance plays a critical role in the construction and selection of classification models. Although many performance metrics have been proposed in the machine learning community, no general guidelines are available among practitioners regarding which metric to select for evaluating a classifier's performance. In this...
The correct selection of performance metrics is one of the key issues in evaluating a classifier's performance. Although many performance metrics have been proposed and used in the machine learning community, there is no consensus among practitioners regarding which metric to choose for evaluating a classifier's performance. In this pa...
The problem of learning label rankings is receiving increasing attention from the machine learning and data mining communities. Its goal is to learn a mapping from instances to rankings over a finite number of labels. In this paper, we give an overview of the state of the art in the area of label ranking and provide a basic taxonomy of the...