About
222 Publications
28,927 Reads
3,644 Citations
Publications (222)
Time-series forecasting is a challenging task that requires high accuracy and efficiency. Hybrid models that combine decomposition algorithms with multiple individual models have demonstrated promising results for forecasting performance. However, these models also face the issues of high computational cost and time consumption when dealing with mu...
In the field of complex networks, identifying important nodes is of great importance both in theoretical and practical applications. Compared with the important node identification of the static network, the important node identification of the temporal network is a more urgent problem to solve since most complex networks in reality change with tim...
To clean and correct abnormal information in domain-oriented knowledge bases (KBs) such as DBpedia automatically is one of the focuses of large KB correction. It is of paramount importance to improve the accuracy of different application systems, such as Q&A systems, which are based on these KBs. In this paper, a triples correction assessment (TCA)...
The identification of multiple influential nodes that influence the structure or function of a complex network has attracted much attention in recent years. Distinguished from individual significant nodes, the problem of overlapping spheres of influence among influential nodes becomes a key factor that hinders their identification. Most approaches...
Siamese network based trackers have achieved outstanding performance in visual object tracking, which in essence is the application of the efficient cross-correlation as the matching function. However, it is experimentally found that the cross-correlation based matching function is difficult to generate accurate tracking results in some challenging...
Suffering from the inefficiency of deeper and wider networks, most remarkable super-resolution algorithms cannot be easily applied to real-world scenarios, especially resource-constrained devices. In this paper, to concentrate on fewer parameters and faster inference, an end-to-end Wavelet-based Transformer for Image Super-resolution (WTSR) is prop...
Unsupervised Domain Adaptation (UDA) addresses the problem of alleviating the effect of domain shift. Common UDA methods all require labelled source samples. However, in some real application scenarios, such as Federated Learning, the source data is inaccessible due to data privacy or intellectual property, and only a pre-trained source model and...
The action recognition backbone has continued to advance. The two-stream method based on Convolutional Neural Networks (CNNs) usually pays more attention to the video’s local features and ignores global information because of the limitation of Convolution kernels. Transformer based on attention mechanism is adopted to capture global information, wh...
Compressed sensing has captured considerable attention of researchers in the past decades. In this paper, with the aid of the powerful null space property, some deterministic recovery conditions are established for the previous ℓ1-ℓ1 method and the ℓ1-ℓ2 method to guarantee the exact s...
Identifying multiple influential spreaders, which relates to finding k (k > 1) nodes with the most significant influence, is of great importance both in theoretical and practical applications. It is usually formulated as a node-ranking problem and addressed by sorting spreaders’ influence as measured based on the topological structure of interacti...
In ensemble-based unsupervised outlier detection, the lack of ground truth makes the combination of basic outlier detectors a challenging task. The existing outlier ensembles usually use certain fusion rules (majority voting, averaging) to aggregate base detectors, which results in relatively low model accuracy and robustness. To overcome this prob...
Domain-oriented knowledge bases (KBs) such as DBpedia and YAGO are largely constructed by applying a set of predefined extraction rules to the semi-structured contents of Wikipedia articles. Although both of these large-scale KBs achieve very high average precision values (above 95% for DBpedia (The estimated precision of those statements is 95% in...
Online signature analysis can be widely applied in e-security and health. The latest method combines the Sigma-Lognormal model and visual feedback to extract the kinematic and spatial parameters of online signatures, but the model still does not perform well in complex handwriting signatures. Inaccurate parameters cannot reveal health information a...
Identifying the nodes that play significant roles in the epidemic spreading process has attracted extensive attention in recent years. A few centrality measures, such as temporal degree and temporal closeness centrality, have been proposed to quantify node importance based on the topological structure of social contact networks. Most methods estimate...
As a new bio-inspired algorithm, the Physarum-based algorithm has shown great performance for solving complex computational problems. More and more researchers try to use the algorithm to solve some network optimization problems. Although the Physarum-based algorithm can figure out these problems correctly and accurately, the convergence speed of P...
Compressed sensing has captured considerable attention of researchers in the past decades. In this paper, with the aid of the powerful null space property, some deterministic recovery conditions are established for the previous ℓ1-ℓ1 and ℓ1-ℓ2 methods to guarantee the exact sparse recovery when the side information of the desired signal is av...
Sufficient training data typically are required to train learning models. However, due to the expensive manual process for labeling a large number of samples, the amount of available training data is always limited (real data). Generative Adversarial Network (GAN) has good performance in generating artificial samples (generated data), the generated...
Feature selection (FS) is one of the most powerful techniques to cope with the curse of dimensionality. In the study, a new filter approach to feature selection based on distance correlation is presented (DCFS, for short), which keeps the model-free advantage without any pre-specified parameters. Our method consists of two steps: hard step (forward...
Despite the growing prominence of generative adversarial networks (GANs), improving the performance of GANs is still a challenging problem. To this end, a combination method for training GANs is proposed by coupling spectral normalization with a zero-centered gradient penalty technique (the penalty is done on the inner function of Sigmoid function...
Multi-label active learning (MAL) aims to learn an accurate multi-label classifier by selecting which examples (or example-label pairs) will be annotated and reducing query effort. MAL is more complicated, since one example can be associated with a set of non-exclusive labels and the annotator has to scrutinize the whole example and label space to...
This book constitutes the refereed proceedings of the 5th International School on Engineering Trustworthy Software Systems, SETSS 2019, held in Chongqing, China, in April 2019.
The five chapters in this volume provide lectures on leading-edge research in methods and tools for use in computer system engineering. The topics covered in these chapters...
Yonghou He, Bo Chen, Yuanxi Li, [...], Li Tao
The characteristics of patient arrivals and service utilization are the theoretical foundation for modeling and simulating healthcare service systems. However, some commonly acknowledged characteristics of outpatient departments (e.g., the Gaussian distribution of the patient numbers, or the exponential distribution of diagnosis time) may be doubte...
Clustering is a fundamental data exploration task which aims at discovering the hidden grouping structure in the data. The traditional clustering methods typically compute a single partition. However, there often exist different and equally meaningful clusterings in complex data. To solve this issue, multiple clustering approaches have emerged with...
Active surveillance, which aims at detecting and controlling infectious diseases at an early stage, is essential to prevent the spread of infections, protect people’s health, and promote social good. One difficult problem in active surveillance is how to intelligently sample a small group of nodes as sentinels from a large number of individuals for...
Developing effective and efficient small-scale data classification methods is very challenging in the digital age. Recent research has shown that deep forest achieves a considerable increase in classification accuracy compared with general methods, especially when the training set is small. However, the standard deep forest may experience over-f...
The graph coloring problem (GCP) is a classical combinatorial optimization problem and has many applications in industry. Many algorithms have been proposed for solving GCP. However, insufficient efficiency and unreliable stability still limit their performance. Aiming to overcome these shortcomings, a Physarum-based ant colony optimization for sol...
Multi-view Multi-instance Multi-label Learning (M3L) deals with complex objects encompassing diverse instances, represented with different feature views, and annotated with multiple labels. Existing M3L solutions only partially explore the inter or intra relations between objects (or bags), instances, and labels, which can convey important contextu...
We look at a recent expansion of Physarum research from inspiring biomimetic algorithms to serving as a model organism in the evolutionary study of perception, memory, learning, and decision making.
Multi-view Multi-instance Multi-label Learning(M3L) deals with complex objects encompassing diverse instances, represented with different feature views, and annotated with multiple labels. Existing M3L solutions only partially explore the inter or intra relations between objects (or bags), instances, and labels, which can convey important contextua...
The uncertain capacitated arc routing problem is of great significance for its wide applications in the real world. In the uncertain capacitated arc routing problem, variables such as task demands and travel costs are realised in real time. This may cause the predefined solution to become ineffective and/or infeasible. There are two main challenges...
Designing effective transport networks can be considered as one of the most debated problems in the area of computational intelligence. Some nature-inspired algorithms have shown excellent abilities in the adaptive network construction. In this aspect, a unique creature, called Physarum, has exhibited the computing capacity to optimize protoplas...
The problem of human-agent negotiation is still not well understood, mainly because human players are not fully rational from game theory’s perspective and thus the interaction in such context is hard to model using traditional ways. This paper proposes a novel strategy for complex human-agent negotiation – that is – multiple issues, unknown oppone...
This volume contains lectures on leading-edge research in methods and tools for use in computer system engineering, presented at the 4th International School on Engineering Trustworthy Software Systems, SETSS 2018, held in April 2018 at Southwest University in Chongqing, China.
The five chapters in this volume provide an overview of research in the frontier...
Subcellular location information is important for understanding protein functions. Considerable effort has been made toward the precise prediction of protein subcellular location. However, the feature representation of protein sequences, a fundamental step in most existing computational methods, is still a challenging task. In this...
The processes of urbanization and the expansion of urban scale are growing rapidly in China. Therefore, it is important to determine the influencing factors of the traffic flow. In order to evaluate the impact of internal and external flows, the area inside the Five Ring of Beijing is divided into several square-shaped areas. Each segment is define...
Current effort on multi-label learning generally assumes that the given labels are noise-free. However, obtaining noise-free labels is quite difficult and often impractical. In this paper, we study how to identify a subset of relevant labels from a set of candidate ones given as annotations to instances, and introduce a matrix factorization based m...
Multi-label learning aims at assigning a set of appropriate labels to multi-label samples. Although it has been successfully applied in various domains in recent years, most multi-label learning methods require sufficient labeled training samples, because of the large number of possible label sets. Co-training, as an important branch of semi-superv...
Learning from multi-view multi-label data has wide applications. There are two main challenges of this learning task: incomplete views and missing (weak) labels. The former assumes that views may not include all data objects. The weak label setting implies that only a subset of relevant labels are provided for training objects while other labels ar...
The vehicle routing problem (VRP) is a classic combinatorial optimization problem and has many applications in industry. Solutions of VRP have significant impact on logistic cost. In most VRP models, the shortest distance is used as the objective function, which is not the case in many real-world applications. To this end, a VRP model with fixed and fue...
Physarum polycephalum, a single-celled, multinucleate slime mould, is a seemingly simple organism, yet it exhibits quasi-intelligent behaviour during extension, foraging, and as it adapts to dynamic environments. For these reasons, Physarum is an attractive target for modelling with the underlying goal to uncover the physiological mechanisms behind...
To solve the dynamic Sylvester equation in the presence of additive noise, a novel recurrent neural network (NRNN) with finite-time convergence and excellent robustness is proposed and analyzed in this paper. As compared with the design process of Zhang neural network (ZNN), the proposed NRNN is based on an ingenious integral design formula activated...
In order to characterize the structural features of public transportation systems (PTSs) in mountain cities, this paper systematically examines the robustness of urban transportation networks in Chongqing, a famous inland mountain city in China, and reveals the community feature of mobility patterns of residents. First, according to the transportat...
Frequent pattern mining is an effective approach for spatiotemporal association analysis of mobile trajectory big data in data-driven intelligent transportation systems. While existing parallel algorithms have been successfully applied to frequent pattern mining of large-scale trajectory data, two major challenges are how to overcome the inherent d...
Network immunization is an effective strategy for restraining virus spreading in computer networks and rumor propagation in social networks. Currently, many strategies have been proposed based on topological structures of networks, such as degree-based and betweenness-based network immunization strategies. However, these studies assume that nodes in a...
Motivated by the recently emerged ℓ1-2 method for sparse signal recovery, in this study, the authors make an ongoing effort to extend this methodology to the setting of block sparsity, which directly leads to the proposed ℓ2/ℓ1-2 method for block-sparse signal recovery. Some theoretical results are derived to guarantee the validity of proposed m...
With the development of intelligent transportation systems, the estimation of traffic flow in urban areas has attracted great attention from researchers. Timely and accurate travel information for urban residents could assist users in planning their travel strategies and improve the operational efficiency of intelligent transportation systems. C...
Identifying the most influential nodes in computer networks is an important issue in preventing the spread of computer viruses. In order to quantify the importance of nodes in the spreading of computer viruses, various centrality measures have been developed under an assumption of a static network. These measures have limitations in that many netwo...
Network traffic classification plays a significant role in cyber security applications and management scenarios. Conventional statistical classification techniques rely on the assumption that clean labelled samples are available for building classification models. However, in the big data era, mislabelled training data commonly exist due to the int...
Load-shedding is an intentional reduction approach which can maintain the stability of a microgrid system effectively. Recent studies have shown that a load-shedding problem can be solved by formulating it as a 0/1 knapsack problem (KP). Although approximate solutions of 0/1 KP can be given by ant colony optimization (ACO) algorithms, adopting them...
Fei Hu, Li Li, Zili Zhang, [...], Xiao-Fei Xu
With the explosion of online communication and publication, texts become obtainable via forums, chat messages, blogs, book reviews and movie reviews. Usually, these texts are very short and noisy, without sufficient statistical signals or enough information for good semantic analysis. Traditional natural language processing methods such as Bow-of...
Community detection, an effective tool to analyze and understand network data, has received increasing attention in recent years. One of the most popular methods of detecting community structure is to find the division with the maximal modularity. However, modularity maximization is an NP-complete problem. In the field of swarm intelligence...
The robustness is one of the primary characteristics of a real system, which impacts the function and performance of the system. Many real systems in our real world can be formulated as complex networks. It is a feasible method to estimate the robustness of real systems from the perspective of complex networks. The robustness evaluation is one of t...
Community mining is a vital problem for complex network analysis. Markov chain-based algorithms are known to be easy to implement and have provided promising solutions for community mining. Existing Markov clustering algorithms have been optimized from the aspects of parallelization and penalty strategy. However, the dynamic process for enlarging...
Community mining is a powerful tool for discovering the knowledge of networks and has wide application. Modularity is one of the most popular measures for evaluating the efficiency of community divisions. However, modularity maximization is an NP-complete problem. As an effective optimization algorithm for solving NP-complete problems, ant...
Physarum polycephalum is a unicellular and multi-headed slime mold, which can form highly efficient networks connecting spatially separated food sources in the process of foraging. Such adaptive networks exhibit a unique characteristic in which network length and fault tolerance are appropriately balanced. Based on the biological observations, the fo...
A mobile ad hoc network is a popular kind of self-configuring network, in which multicast routing under quality of service constraints is a significant challenge. Many researchers have shown that this problem can be formulated as an NP-complete problem and have proposed swarm-based intelligent algorithms to find the optimal solution, such as...
Cloud manufacturing (CMfg) has drawn extensive attention from the industrial community and academia. Quality of service (QoS)-aware service composition is critical to the on-demand use of distributed manufacturing resources and capabilities in CMfg systems. However, most previous work plainly composed composite services by the approach of one-to-one m...
This volume contains a record of some of the lectures and seminars delivered at the Second International School on Engineering Trustworthy Software Systems (SETSS 2016), held in March/April 2016 at Southwest University in Chongqing, China.
The six contributions included in this volume provide an overview of leading-edge research in methods and tool...
Community detection is a crucial and essential problem in the structure analytics of complex networks, which can help us understand and predict the characteristics and functions of complex networks. Many methods, ranging from the optimization-based algorithms to the heuristic-based algorithms, have been proposed for solving such a problem. Due to t...
By embedding an ℓp-norm noise constraint for p ≥ 2 into the recently emerged ℓ1-2 method, in this letter, we study theoretically and numerically an ℓ1-2/ℓp method for recovery of general noisy signals from highly coherent measurement matrices. In particular, the obtained theoretical results not only improve the condition deduced in [1] for Gaussian...
Several technical indicators have been proposed to assess the impact of authors and institutions. Here, we combine the h-index and the PageRank algorithm to do away with some of the individual limitations of these two indices. Most importantly, we aim to take into account value differences between citations, evaluating the citation sources by defini...
Gene Ontology (GO) provides GO annotations (GOA) that associate gene products with GO terms that summarize their cellular, molecular and functional aspects in the context of biological pathways. GO Consortium (GOC) resorts to various quality assurances to ensure the correctness of annotations. Due to resource limitations, only a small portion of a...
We propose an efficient bio-inspired algorithm for the design of optimal supply chain networks in competitive oligopoly markets. The firms compete in the manufacture, storage and distribution of a product to several markets. Each firm aims at maximisation of its own profit by optimising the design capacity and product flow in the supply chain. We model...
Spam, also known as unsolicited bulk e-mail (UBE), has recently become a serious threat that negatively impacts the usability of legitimate mails. In this article, an evidential spam-filtering framework is proposed. As a useful tool to handle uncertainty, the Dempster–Shafer theory of evidence (D–S theory) is integrated into the proposed approach....
Ant colony optimization (ACO) algorithms often have a lower search efficiency for solving travelling salesman problems (TSPs). To address this shortcoming, this paper proposes a universal self-adaptive control strategy of population size for ACO algorithms. By decreasing the number of ants dynamically based on the optimal solutions obtained from...
Community mining is a crucial and essential problem in complex networks analysis. Many algorithms have been proposed for solving this problem. However, weak robustness and low accuracy still limit their efficiency. Aiming to overcome these shortcomings, this paper proposes a general Physarum-based computational framework for community minin...
The era of big data brings new challenges to the network traffic technique that is an essential tool for network management and security. To deal with the problems of dynamic ports and encrypted payload in traditional port-based and payload-based methods, the state-of-the-art method employs flow statistical features and machine learning techniques t...
The Bi-objective Traveling Salesman Problem (bTSP) is an important problem in operations research, and its solutions can be widely applied in the real world. Many Multi-objective Ant Colony Optimization algorithms (MOACOs) have been proposed to solve bTSPs. However, most MOACOs suffer from premature convergence. This paper proposes an optimization strat...
Since the appearance of slime mould-inspired network design applications, it has attracted the attention of many researchers from all over the world. In this chapter, we provide an overview of a variety of slime mould-inspired applications on graph-optimization problems. We will focus more on the mathematical model inspired by slime mould, develop...
In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system incl...
The courses of SETSS 2014 aim to improve the understanding of the relation between theory and practice in software engineering, to contribute to narrowing the gap between them. This volume contains the lecture notes of the five courses and materials of one seminar. The common themes of the courses include the design and use of theories, techniques...
Identifying influential spreaders in networks, which contributes to optimizing the use of available resources and efficient spreading of information, is of great theoretical significance and practical value. The random-walk-based algorithm LeaderRank has been shown to be an effective and efficient method for recognizing leaders in social networks, which e...
Accurate and timely traffic flow prediction is crucial to proactive traffic management and control in data-driven intelligent transportation systems (D2ITS), which has attracted great research interest in the last few years. In this paper, we propose a Spatial-Temporal Weighted K-Nearest Neighbor model, named STW-KNN, in a general MapReduce framewo...
Identifying influential nodes is of theoretical significance in many domains. Although many methods have been proposed to solve this problem, their evaluations are under single-source attack in scale-free networks. Meanwhile, some studies have speculated that combinations of some methods may achieve better results. In order to evalu...
The 0/1 Knapsack Problem (KP), which is a classical NP-complete problem, has been widely applied to solving many real world problems. Ant system (AS), as one of the earliest ant colony optimization (ACO) algorithms, provides approximate solutions to 0/1 KPs. However, there are some shortcomings such as low efficiency and premature convergence in mo...
Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving traffic subarea division problem...
High-throughput experimental techniques provide a wide variety of heterogeneous proteomic data sources. To exploit the information spread across multiple sources for protein function prediction, these data sources are transformed into kernels and then integrated into a composite kernel. Several methods first optimize the weights on these kernels to...
NP-hard problems exist in many real world applications. Ant colony optimization (ACO) algorithms can provide approximate solutions for those NP-hard problems, but the performance of ACO algorithms is significantly reduced due to premature convergence and weak robustness, etc. With these observations in mind, this paper proposes a Physarum-based phe...
Physarum can form a highly efficient and robust network in the process of foraging. The vacant-particle model with shrinkage (VP-S model), which captures the relationship between the movement of Physarum and the process of network formation, can construct a network with a good balance between exploration and exploitation. In this paper,...
As a typical NP-complete problem, the 0/1 Knapsack Problem (KP) has been widely applied in many domains for solving practical problems. Although ant colony optimization (ACO) algorithms can obtain approximate solutions to 0/1 KP, there exist some shortcomings such as low convergence rate, premature convergence and weak robustness. In order to get...
Research on Physarum polycephalum shows that methods inspired by this primitive unicellular organism can construct efficient networks and solve some complex problems in graph theory. Current models simulating the intelligent behavior of Physarum are mainly based on Hagen–Poiseuille Law and Kirchhoff Law, reaction–diffusion, Cellular Automaton and...