Jun He

Aberystwyth University, Aberystwyth, Wales, United Kingdom

Publications (67) · 46.42 Total Impact Points

  • Source
    ABSTRACT: Some experimental investigations have shown that evolutionary algorithms (EAs) are efficient for the minimum label spanning tree (MLST) problem. However, little is known about this in theory. As one step towards this issue, we theoretically analyze the performance of the (1+1) EA, a simple version of EAs, and of a multi-objective evolutionary algorithm called GSEMO on the MLST problem. We reveal that for the MLST$_{b}$ problem the (1+1) EA and GSEMO achieve a $\frac{b+1}{2}$-approximation ratio in expected time polynomial in $n$, the number of nodes, and $k$, the number of labels. We also show that GSEMO achieves a $(2\ln n)$-approximation ratio for the MLST problem in expected time polynomial in $n$ and $k$. At the same time, we show that the (1+1) EA and GSEMO outperform local search algorithms on three instances of the MLST problem. We also construct an instance on which GSEMO outperforms the (1+1) EA.
    09/2014;
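    A minimal Python sketch of the (1+1) EA analysed in the entry above, under assumed choices that are mine rather than the paper's: labels are encoded as a bit string, mutation flips each bit with probability 1/k, and the fitness to minimise is the pair (number of connected components of the subgraph induced by the selected labels, number of selected labels), compared lexicographically.

      import random

      def count_components(n, edges, mask):
          # union-find over the n nodes, using only edges whose label is selected
          parent = list(range(n))
          def find(v):
              while parent[v] != v:
                  parent[v] = parent[parent[v]]
                  v = parent[v]
              return v
          components = n
          for u, v, label in edges:
              if mask[label]:
                  ru, rv = find(u), find(v)
                  if ru != rv:
                      parent[ru] = rv
                      components -= 1
          return components

      def one_plus_one_ea_mlst(n, edges, k, generations=10000):
          x = [random.random() < 0.5 for _ in range(k)]
          fitness = lambda m: (count_components(n, edges, m), sum(m))
          fx = fitness(x)
          for _ in range(generations):
              y = [b != (random.random() < 1.0 / k) for b in x]   # flip each bit with prob 1/k
              fy = fitness(y)
              if fy <= fx:                                        # accept if not worse
                  x, fx = y, fy
          return x, fx

      # toy instance: 4 nodes, edges given as (u, v, label) with labels 0..2
      edges = [(0, 1, 0), (1, 2, 0), (2, 3, 1), (0, 3, 2)]
      print(one_plus_one_ea_mlst(4, edges, k=3))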
  • Source
    ABSTRACT: Evolutionary algorithms are well suited for solving the knapsack problem. Some empirical studies claim that evolutionary algorithms can produce good solutions to the 0-1 knapsack problem. Nonetheless, few rigorous investigations address the quality of solutions that evolutionary algorithms may produce for the knapsack problem. The current paper focuses on a theoretical investigation of three types of (N+1) evolutionary algorithms that exploit bitwise mutation, truncation selection and different repair methods for the 0-1 knapsack problem. It assesses the solution quality in terms of the approximation ratio. Our work indicates that the solutions produced by pure strategy and mixed strategy evolutionary algorithms may be arbitrarily bad. Nevertheless, the evolutionary algorithm using helper objectives may produce 1/2-approximation solutions to the 0-1 knapsack problem.
    04/2014;
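    A rough Python sketch of a repair-based (1+1)-style EA for the 0-1 knapsack problem in the spirit of the entry above. The repair rule shown, dropping the lowest profit/weight items until the capacity constraint holds, is an assumption; the paper studies (N+1) populations with truncation selection and several repair variants.

      import random

      def repair(x, profits, weights, capacity):
          # drop items with the smallest profit/weight ratio until the weight limit holds
          y = list(x)
          total = sum(weights[i] for i in range(len(y)) if y[i])
          for i in sorted(range(len(y)), key=lambda j: profits[j] / weights[j]):
              if total <= capacity:
                  break
              if y[i]:
                  y[i] = 0
                  total -= weights[i]
          return y

      def ea_knapsack(profits, weights, capacity, generations=5000):
          # (1+1)-style EA: bitwise mutation, repair, keep the offspring if not worse
          n = len(profits)
          value = lambda s: sum(profits[i] for i in range(n) if s[i])
          x = repair([random.randint(0, 1) for _ in range(n)], profits, weights, capacity)
          for _ in range(generations):
              y = [b ^ (random.random() < 1.0 / n) for b in x]    # flip each bit with prob 1/n
              y = repair(y, profits, weights, capacity)
              if value(y) >= value(x):
                  x = y
          return x, value(x)

      print(ea_knapsack(profits=[10, 7, 5, 3], weights=[6, 4, 3, 2], capacity=9))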
  •
    ABSTRACT: The 0-1 knapsack problem is a well-known combinatorial optimisation problem. Approximation algorithms have been designed for solving it and they return provably good solutions within polynomial time. On the other hand, genetic algorithms are well suited for solving the knapsack problem and they find reasonably good solutions quickly. A naturally arising question is whether genetic algorithms are able to find solutions as good as those of approximation algorithms. This paper presents a novel multi-objective optimisation genetic algorithm for solving the 0-1 knapsack problem. Experimental results show that the new algorithm outperforms its rivals: the greedy algorithm, the mixed strategy genetic algorithm, and the greedy algorithm + mixed strategy genetic algorithm.
    04/2014;
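    For comparison with the greedy rival mentioned above, the classical density-ordered greedy for the 0-1 knapsack problem, taken together with the best single item, is a 1/2-approximation. A short illustrative sketch (my own, not the paper's code):

      def greedy_knapsack(profits, weights, capacity):
          # pack items by profit/weight density, then compare with the best single item:
          # the better of the two packings is a 1/2-approximation of the optimum
          n = len(profits)
          order = sorted(range(n), key=lambda i: profits[i] / weights[i], reverse=True)
          load, value = 0, 0
          for i in order:
              if load + weights[i] <= capacity:
                  load += weights[i]
                  value += profits[i]
          best_single = max((profits[i] for i in range(n) if weights[i] <= capacity), default=0)
          return max(value, best_single)

      print(greedy_knapsack([10, 7, 5, 3], [6, 4, 3, 2], capacity=9))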
  • Source
    Jun He, Feidun He, Xin Yao
    ABSTRACT: The convergence, convergence rate and expected hitting time play fundamental roles in the analysis of randomised search heuristics. This paper presents a unified Markov chain approach to studying them. Using this approach, sufficient and necessary conditions for convergence in distribution are established. Then the average convergence rate is introduced to randomised search heuristics and its lower and upper bounds are derived. Finally, novel average drift analysis and backward drift analysis are proposed for bounding the expected hitting time. A computational study is also conducted to investigate the convergence, convergence rate and expected hitting time. The theoretical study is a priori and general, while the computational study is a posteriori and case-specific.
    12/2013;
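    An illustrative Monte-Carlo estimate of the average convergence rate for a (1+1) EA on OneMax, assuming the definition $R_t = 1 - (d_t/d_0)^{1/t}$ with $d_t$ the expected distance to the optimal set after $t$ generations; this is my reading of the quantity named above, offered only to make the notion concrete, not the paper's formulation.

      import random

      def average_convergence_rate(n=20, generations=100, runs=200):
          # empirical estimate for a (1+1) EA on OneMax; distance = number of zero bits
          d0, dt = 0.0, 0.0
          for _ in range(runs):
              x = [random.randint(0, 1) for _ in range(n)]
              d0 += n - sum(x)
              for _ in range(generations):
                  y = [b ^ (random.random() < 1.0 / n) for b in x]
                  if sum(y) >= sum(x):
                      x = y
              dt += n - sum(x)
          d0, dt = d0 / runs, dt / runs
          return 1.0 - (dt / d0) ** (1.0 / generations) if dt > 0 else 1.0

      print(average_convergence_rate())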
  • Source
    Boris Mitavskiy, Jun He
    ABSTRACT: In the current work we introduce a novel estimation of distribution algorithm to tackle a hard combinatorial optimization problem, namely the single-machine scheduling problem with uncertain delivery times. The majority of existing research coping with optimization problems in uncertain environments aims at finding a single, sufficiently robust solution so that random noise and unpredictable circumstances have the least possible detrimental effect on solution quality. The measures of robustness are usually based on various kinds of empirically designed averaging techniques. In contrast to previous work, our algorithm aims at finding a collection of robust schedules that allow for more informative decision making. The notion of robustness is measured quantitatively in terms of the classical mathematical notion of a norm on a vector space. We provide theoretical insight into the relationship between the properties of the probability distribution over the uncertain delivery times and the robustness quality of the schedules produced by the algorithm after a polynomial runtime, in terms of approximation ratios.
    12/2013;
  • Source
    ABSTRACT: Drift analysis is a useful tool for estimating upper and lower bounds on the runtime of evolutionary algorithms. A new form of drift analysis, called average drift analysis, is introduced in this paper. It imposes a weaker requirement than point-wise drift analysis does. Point-wise drift theorems are a corollary of average drift theorems, so average drift analysis is more powerful than point-wise drift analysis. To demonstrate this advantage, we choose (1+N) evolutionary algorithms on linear-like functions as a case study. Using average drift analysis, an exact bound on the runtime of the algorithm is derived and the cut-off point of population scalability is then obtained.
    08/2013;
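    For reference, the point-wise condition that average drift analysis weakens is the classical additive drift theorem: if a distance function $d(\cdot) \ge 0$ satisfies $E[d(X_t) - d(X_{t+1}) \mid X_t = x] \ge c > 0$ for every non-optimal state $x$, then the expected hitting time of the optimal set satisfies $E[T \mid X_0] \le d(X_0)/c$. Average drift analysis, as described above, asks only that such a drift bound hold on average over the distribution of $X_t$ rather than for every individual state (my paraphrase of the weaker requirement).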
  • Source
    Boris Mitavskiy, Jun He
    ABSTRACT: A popular current research trend deals with expanding Monte-Carlo tree search sampling methodologies to environments with uncertainty and incomplete information. Recently a finite population version of the Geiringer theorem with nonhomologous recombination has been adapted to the setting of Monte-Carlo tree search to cope with randomness and incomplete information by exploiting the intrinsic similarities within the state space of the problem. The only limitation of the new theorem is that the similarity relation was assumed to be an equivalence relation on the set of states. In the current paper we lift this limitation by allowing the similarity relation to be modeled in terms of an arbitrary set cover of the set of state-action pairs.
    05/2013;
  • Source
    Boris Mitavskiy, Jun He
    ABSTRACT: Hybrid and mixed strategy EAs have become rather popular for tackling various complex and NP-hard optimization problems. While empirical evidence suggests that such algorithms are successful in practice, rather little theoretical support for their success is available, let alone a solid mathematical foundation that would provide guidance towards the efficient design of this type of EA. In the current paper we develop a rigorous mathematical framework that suggests such designs based on generalized schema theory, fitness levels and drift analysis. An example application to one of the classical NP-hard problems, the single-machine scheduling problem, is presented.
    05/2013;
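    One ingredient named above, the fitness-level method, gives a handy worked bound: partition the search space into sets $A_1, \dots, A_m$ ordered by increasing fitness, with $A_m$ containing only optima; if, from any point in $A_i$, an elitist EA reaches some higher level in one generation with probability at least $s_i$, then its expected hitting time of the optimum is at most $\sum_{i=1}^{m-1} 1/s_i$. This is the standard form of the argument, stated here only as background; the paper's generalized framework combines it with schema theory and drift analysis.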
  • Source
    ABSTRACT: The classical Geiringer theorem addresses the limiting frequency of occurrence of various alleles after repeated application of crossover. It has been adapted to the setting of evolutionary algorithms and, more recently, to reinforcement learning and the Monte-Carlo tree search methodology to cope with the rather challenging question of action evaluation at chance nodes. The theorem motivates novel dynamic parallel algorithms that are explicitly described in the current paper for the first time. The algorithms involve independent agents traversing a dynamically constructed directed graph that may contain loops. A rather elegant and profound category-theoretic model of cognition in biological neural networks, developed over the last thirty years by the well-known French mathematician Andree Ehresmann jointly with the neurosurgeon Jan Paul Vanbremeersch, provides a hint at the connection between such algorithms and Hebbian learning.
    Natural Computing 05/2013; · 0.68 Impact Factor
  • Source
    ABSTRACT: In pure strategy meta-heuristics, only one search strategy is applied at all times. In mixed strategy meta-heuristics, one search strategy is chosen from a strategy pool with some probability at each step and then applied. An example is the classical genetic algorithm, where either a mutation or a crossover operator is chosen with some probability at each step. The aim of this paper is to compare the performance of mixed strategy and pure strategy meta-heuristic algorithms. First, an experimental study is carried out; the results demonstrate that mixed strategy evolutionary algorithms may outperform pure strategy evolutionary algorithms on the 0-1 knapsack problem in up to 77.8% of instances. Then a Complementary Strategy Theorem is rigorously proven for applying mixed strategy at the population level. The theorem asserts that, given two meta-heuristic algorithms, one using pure strategy 1 and the other pure strategy 2, strategy 2 being complementary to strategy 1 is a sufficient and necessary condition for the existence of a mixed strategy meta-heuristic, derived from these two pure strategies, whose expected number of generations to find an optimal solution is no more than that of pure strategy 1 for any initial population, and less than that of pure strategy 1 for some initial population.
    03/2013;
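    A minimal Python sketch of the mixed-strategy scheme described above: each generation, one mutation operator is drawn from a strategy pool according to fixed probabilities. The operator pool and probabilities below are illustrative assumptions, not the paper's setup.

      import random

      def mixed_strategy_ea(fitness, n, operators, probs, generations=10000):
          # (1+1) EA that picks one operator from the pool per generation
          x = [random.randint(0, 1) for _ in range(n)]
          for _ in range(generations):
              op = random.choices(operators, weights=probs, k=1)[0]
              y = op(x)
              if fitness(y) >= fitness(x):
                  x = y
          return x

      def bitwise_mutation(x):
          n = len(x)
          return [b ^ (random.random() < 1.0 / n) for b in x]

      def one_bit_mutation(x):
          y = list(x)
          y[random.randrange(len(y))] ^= 1
          return y

      onemax = lambda x: sum(x)
      best = mixed_strategy_ea(onemax, n=30,
                               operators=[bitwise_mutation, one_bit_mutation],
                               probs=[0.5, 0.5])
      print(onemax(best))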
  •
    ABSTRACT: The particle swarm optimization algorithm has been used for solving multi-objective optimization problems over the last decade. The algorithm has a capacity for fast convergence; however, its exploratory capability needs to be enriched. One way of overcoming this disadvantage is to add mutation operator(s) to particle swarm optimization algorithms. Since single-point mutation is good at global exploration, in this paper a new coevolutionary algorithm is proposed which combines single-point mutation and particle swarm optimization. The two operators cooperate under the framework of mixed strategy evolutionary algorithms. The proposed algorithm is validated on a benchmark test set and compared with classical multi-objective optimization evolutionary algorithms such as NSGA2, SPEA2 and CMOPSO. Simulation results show that the new algorithm not only guarantees fast convergence and a uniform distribution of solutions, but also has the advantages of stability and robustness.
    Proceedings of the 7th international conference on Rough Sets and Knowledge Technology; 08/2012
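    A rough single-objective Python sketch of the idea above: a standard particle swarm update with an occasional single-point mutation for extra exploration. The paper's algorithm is multi-objective and coordinates the two operators under a mixed-strategy framework, so this is only a simplified illustration with assumed parameter values.

      import random

      def mutated_pso(f, dim, bounds, swarm_size=20, iters=200,
                      w=0.7, c1=1.5, c2=1.5, mutation_prob=0.1):
          # particle swarm optimisation (minimisation) with occasional single-point mutation
          lo, hi = bounds
          pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
          vel = [[0.0] * dim for _ in range(swarm_size)]
          pbest = [p[:] for p in pos]
          pbest_val = [f(p) for p in pos]
          g = min(range(swarm_size), key=lambda i: pbest_val[i])
          gbest, gbest_val = pbest[g][:], pbest_val[g]
          for _ in range(iters):
              for i in range(swarm_size):
                  for d in range(dim):
                      r1, r2 = random.random(), random.random()
                      vel[i][d] = (w * vel[i][d]
                                   + c1 * r1 * (pbest[i][d] - pos[i][d])
                                   + c2 * r2 * (gbest[d] - pos[i][d]))
                      pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
                  if random.random() < mutation_prob:       # single-point mutation
                      d = random.randrange(dim)
                      pos[i][d] = random.uniform(lo, hi)
                  val = f(pos[i])
                  if val < pbest_val[i]:
                      pbest[i], pbest_val[i] = pos[i][:], val
                      if val < gbest_val:
                          gbest, gbest_val = pos[i][:], val
          return gbest, gbest_val

      sphere = lambda x: sum(v * v for v in x)
      print(mutated_pso(sphere, dim=5, bounds=(-5.0, 5.0)))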
  • Source
    ABSTRACT: The hardness of fitness functions is an important research issue in evolutionary computation. In theory, the study of the hardness of fitness functions can help understand the capability of evolutionary algorithms (EAs). In practice, the study may provide a guideline for the design of benchmarks. The aim of this paper is to answer the question: what are the easiest and hardest fitness functions with respect to an EA, and how can such functions be constructed? In the paper, the easiest and hardest fitness functions are constructed for any given elitist (1+1) EA, within any class of fitness functions to be maximised that share the same optima. In terms of the time-fitness landscape, unimodal functions are the easiest and deceptive functions are the hardest. The paper also reveals that a fitness function that is easiest for one EA may become the hardest for another EA, and vice versa.
    03/2012;
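    Two textbook functions illustrate the unimodal/deceptive dichotomy mentioned above. These standard examples are mine; the paper constructs its own easiest and hardest functions for a given EA.

      def onemax(x):
          # unimodal: fitness rises with every additional 1-bit, guiding the EA towards the optimum
          return sum(x)

      def trap(x):
          # deceptive: all-zeros looks increasingly attractive, but the global optimum is all-ones
          n, ones = len(x), sum(x)
          return n + 1 if ones == n else n - ones

      x = [1, 0, 1, 1]
      print(onemax(x), trap(x))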
  • Source
    Boris Mitavskiy, Jun He
    ABSTRACT: Nowadays hybrid evolutionary algorithms, i.e., heuristic search algorithms combining several mutation operators, some of which stochastically implement a well-known technique designed for the specific problem in question while others play the role of random search, have become rather popular for tackling various NP-hard optimization problems. While empirical studies demonstrate that hybrid evolutionary algorithms are frequently successful at finding solutions with fitness sufficiently close to the optimal, far fewer articles address their computational complexity in a mathematically rigorous fashion. This paper is devoted to a mathematically motivated design and analysis of a parameterized family of evolutionary algorithms which provides a polynomial time approximation scheme for one of the well-known NP-hard combinatorial optimization problems, namely the single-machine scheduling problem without precedence constraints. The authors hope that the techniques and ideas developed in this article may be applied in many other situations.
    02/2012;
  • Source
    ABSTRACT: Mixed strategy EAs aim to integrate several mutation operators into a single algorithm. However, little theoretical analysis has been carried out to answer the question of whether and when the performance of mixed strategy EAs is better than that of pure strategy EAs. In theory, the performance of EAs can be measured by the asymptotic convergence rate and asymptotic hitting time. In this paper, it is proven that, given a mixed strategy (1+1) EA consisting of several mutation operators, its performance (asymptotic convergence rate and asymptotic hitting time) is not worse than that of the worst pure strategy (1+1) EA using one of these mutation operators; furthermore, if the mutation operators are mutually complementary, then it is possible to design a mixed strategy (1+1) EA whose performance is better than that of any pure strategy (1+1) EA using a single mutation operator.
    12/2011;
  • Source
    ABSTRACT: Population-based evolutionary algorithms (EAs) have been widely applied to solve various optimization problems. The question of how the performance of a population-based EA depends on the population size arises naturally. The performance of an EA may be evaluated by different measures, such as the average convergence rate to the optimal set per generation or the expected number of generations to encounter an optimal solution for the first time. Population scalability is the performance ratio between a benchmark EA and another EA using identical genetic operators but a larger population size. Although intuitively the performance of an EA may improve if its population size increases, currently there exist only a few case studies for simple fitness functions. This paper aims at providing a general study for discrete optimisation. A novel approach is introduced to analyse population scalability using the fundamental matrix. The following two contributions summarise the major results of the current article. (1) We demonstrate rigorously that, for elitist EAs with identical global mutation, using a larger population size always increases the average rate of convergence to the optimal set; yet, sometimes, the expected number of generations needed to find an optimal solution (measured by either the maximal value or the average value) may increase, rather than decrease. (2) We establish sufficient and/or necessary conditions for superlinear scalability, that is, when the average convergence rate of a $(\mu+\mu)$ EA (where $\mu\ge2$) is more than $\mu$ times that of a $(1+1)$ EA.
    08/2011;
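    A toy illustration of the fundamental-matrix machinery used in the analysis above: for an absorbing Markov chain with transition matrix $Q$ restricted to the transient (non-optimal) states, the fundamental matrix is $N = (I - Q)^{-1}$ and its row sums give the expected number of generations to reach the optimal set from each transient state. The numbers below are arbitrary.

      import numpy as np

      # transition probabilities among two transient (non-optimal) states of a toy chain
      Q = np.array([[0.5, 0.3],
                    [0.2, 0.6]])
      N = np.linalg.inv(np.eye(len(Q)) - Q)   # fundamental matrix N = (I - Q)^{-1}
      hitting_times = N.sum(axis=1)           # expected generations to reach the optimal set
      print(hitting_times)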
  • Source
    Jun He, Tianshi Chen
    ABSTRACT: Population-based Random Search (RS) algorithms, such as Evolutionary Algorithms (EAs), Ant Colony Optimization (ACO), Artificial Immune Systems (AIS) and Particle Swarm Optimization (PSO), have been widely applied to solving discrete optimization problems. A common belief in this area is that the performance of a population-based RS algorithm may improve as its population size increases. The term population scalability is used to describe the relationship between the performance of RS algorithms and their population size. Although understanding population scalability is important for designing efficient RS algorithms, few theoretical results about population scalability exist so far. Among those limited results, most are case studies, e.g. simple RS algorithms on simple problems. In contrast, this paper aims at providing a general study. A large family of RS algorithms, called ARS, is investigated in the paper. The main contribution of this paper is to introduce a novel approach based on the fundamental matrix for analyzing population scalability. The performance of ARS is measured by a new index: the spectral radius of the fundamental matrix. Through analyzing the fundamental matrix associated with ARS, several general results are proven: (1) increasing the population size may increase population scalability; (2) no superlinear scalability is available on any regular monotonic fitness landscape; (3) potential superlinear scalability may exist on deceptive fitness landscapes; (4) "bridgeable point" and "diversity preservation" are two necessary conditions for superlinear scalability on all fitness landscapes; and (5) a "road through bridges" is a sufficient condition for superlinear scalability.
    01/2011;
  • Source
    ABSTRACT: The main aim of randomized search heuristics is to produce good approximations of optimal solutions within a small amount of time. In contrast to numerous experimental results, there are only a few theoretical explorations of this subject. We consider the approximation ability of randomized search heuristics for the class of covering problems and compare single-objective and multi-objective models for such problems. For the VertexCover problem, we point out situations where the multi-objective model leads to a fast construction of optimal solutions while, in the single-objective case, no good approximation can be achieved in expected polynomial time. Examining the more general SetCover problem, we show that optimal solutions can be approximated within a logarithmic factor of the size of the ground set using the multi-objective approach, while the approximation quality obtainable by the single-objective approach in expected polynomial time may be arbitrarily bad.
    Evolutionary Computation 01/2010; 18(4):617-33. · 2.11 Impact Factor
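    For context on the logarithmic factor above, the classical greedy SetCover algorithm achieves an approximation ratio of the same order. A short sketch (the instance below is arbitrary):

      def greedy_set_cover(universe, subsets):
          # repeatedly pick the subset covering the most uncovered elements;
          # this achieves an O(log n) approximation ratio for SetCover
          uncovered = set(universe)
          chosen = []
          while uncovered:
              best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
              if not uncovered & subsets[best]:
                  break                      # remaining elements cannot be covered
              chosen.append(best)
              uncovered -= subsets[best]
          return chosen

      universe = range(1, 8)
      subsets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 6, 7}]
      print(greedy_set_cover(universe, subsets))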
  • Source
    ABSTRACT: To exploit an evolutionary algorithm’s performance to the full extent, the selection scheme should be chosen carefully. Empirically, it is commonly acknowledged that low selection pressure can prevent an evolutionary algorithm from premature convergence, and is thereby more suitable for wide-gap problems. However, there are few theoretical time complexity studies that actually give the conditions under which a high or a low selection pressure is better. In this paper, we provide a rigorous time complexity analysis showing that low selection pressure is better for the wide-gap problems with two optima.
    Theoretical Computer Science. 01/2010;
  • Source
    ABSTRACT: Vertex cover is one of the best known NP-hard combinatorial optimization problems. Experimental work has claimed that evolutionary algorithms (EAs) perform fairly well for the problem and can compete with problem-specific ones. A theoretical analysis that explains these empirical results is presented concerning the random local search algorithm and the (1+1)-EA. Since it is not expected that an algorithm can solve the vertex cover problem in polynomial time, a worst case approximation analysis is carried out for the two considered algorithms and comparisons with the best known problem-specific ones are presented. By studying instance classes of the problem, general results are derived. Although arbitrarily bad approximation ratios of the (1+1)-EA can be proved for a bipartite instance class, the same algorithm can quickly find the minimum cover of the graph when a restart strategy is used. Instance classes where multiple runs cannot considerably improve the performance of the (1+1)-EA are considered and the characteristics of the graphs that make the optimization task hard for the algorithm are investigated and highlighted. An instance class is designed to prove that the (1+1)-EA cannot guarantee better solutions than the state-of-the-art algorithm for vertex cover if worst cases are considered. In particular, a lower bound for the worst case approximation ratio, slightly less than two, is proved. Nevertheless, there are subclasses of the vertex cover problem for which the (1+1)-EA is efficient. It is proved that if the vertex degree is at most two, then the algorithm can solve the problem in polynomial time.
    IEEE Transactions on Evolutionary Computation 11/2009; · 4.81 Impact Factor
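    A minimal Python sketch of a (1+1) EA for vertex cover in the spirit of the entry above, under one common (assumed) fitness formulation: the quantity to minimise counts uncovered edges with a large penalty plus the number of selected vertices. This is not necessarily the exact fitness analysed in the paper.

      import random

      def one_plus_one_ea_vertex_cover(n, edges, generations=10000):
          # bit string selects vertices; uncovered edges dominate the fitness
          def fitness(x):
              uncovered = sum(1 for u, v in edges if not (x[u] or x[v]))
              return uncovered * (n + 1) + sum(x)
          x = [random.randint(0, 1) for _ in range(n)]
          for _ in range(generations):
              y = [b ^ (random.random() < 1.0 / n) for b in x]   # flip each bit with prob 1/n
              if fitness(y) <= fitness(x):
                  x = y
          return x, fitness(x)

      # toy path graph on 4 vertices
      print(one_plus_one_ea_vertex_cover(4, [(0, 1), (1, 2), (2, 3)]))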

Publication Stats

824 Citations
46.42 Total Impact Points

Institutions

  • 2011–2013
    • Aberystwyth University
      • Department of Computer Science
      Aberystwyth, Wales, United Kingdom
  • 2009
    • University of Science and Technology of China
      • School of Computer Science and Technology
      Hefei, Anhui Sheng, China
    • University of Wales
      • Department of Computer Science
      Cardiff, WLS, United Kingdom
  • 2002–2009
    • University of Birmingham
      • Centre of Excellence for Research in Computational Intelligence and Applications (CERCIA)
      • School of Computer Science
      Birmingham, ENG, United Kingdom
  • 1999–2009
    • Beijing Jiaotong University
      • School of Computer and Information Technology
      • Department of Computer Science
      Beijing, China
  • 2004–2007
    • South China University of Technology
      • School of Computer Science and Engineering
      Guangzhou, Guangdong, China
    • Wuhan University
      • State Key Lab of Software Engineering
      Wuhan, Hubei, China