Question
Asked 5th Jul, 2013

What is the optimal/recommended population size for differential evolution?

At CEC2013, a presenter said that Storn and Price recommended a population size of 10 times the number of dimensions -- e.g. population size = 100 for a ten-dimensional problem. However, the only reference to DE in the presenter's paper is the original 1995 tech report, and this report only lists the population sizes actually used (and they vary).
Does anyone know of this 10x population size recommendation (and have the correct reference)? Does anyone else have references to recommended population sizes for DE?

Most recent answer

22nd Dec, 2020
Evgenii Sopov
Siberian Institute of Applied System Analysis named after A.N. Antamoshkin
I think the question is ill-posed: there are many factors to be considered. For example, if you have a limited budget of function evaluations (FEVs), you need to tune the exploration/exploitation trade-off. More generations lead to better convergence, while a larger population guarantees better exploration in the early generations. With a limited budget, you cannot have both.
For many real-world problems, it is a good idea to distribute FEVs equally between generations and population size (40/50 or 50/40 is better than 10/200 or 100/20); see the sketch below.
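A minimal sketch of this trade-off; the total budget of 2000 FEVs is a hypothetical figure, chosen so that the splits mentioned above come out exactly:

```python
# Illustrative only: splitting a fixed budget of function evaluations (FEVs)
# between population size (NP) and number of generations.
BUDGET = 2000  # hypothetical total FEVs

for np_size in (10, 40, 50, 100):
    generations = BUDGET // np_size
    print(f"NP = {np_size:3d} -> {generations:3d} generations")

# A balanced split (NP = 40-50 here) leaves room for both exploration
# (population size) and convergence (number of generations).
```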

All Answers (34)

8th Jul, 2013
Birkan Can
University of Limerick
Population size in evolutionary algorithms needs to be large enough to initialise the search with a rich set of solutions. You may need to modulate the minimum size to cope with drift, premature convergence, etc. Also see http://www.cs.mun.ca/~tingh/papers/GPEM10.pdf.
1 Recommendation
8th Jul, 2013
Ioannis T. Christou
Athens Information Technology
Dear Stephen, the paper recommending the 10*D rule is the following:
R. Storn: "On the usage of differential evolution for function optimization", Biennial Conf. of the North American Fuzzy Information Processing Society, 1996.
Now, regarding the effectiveness of this rule, and of DE in general: I have personally found that for unconstrained (or box-constrained) NLPs with "hard" functions such as those used in the CEC2005 tests, or the non-differentiable functions LND1...LND10 (see the article by Brimberg, Hansen, and Mladenović, "Continuous Optimization by Variable Neighborhood Search", a chapter in the "Encyclopedia of Operations Research and Management Science"), DE does not perform very well; in fact, it takes many more function evaluations to achieve results comparable to those of more sophisticated heuristics. The parameters that seem to matter most are px (the crossover rate, usually set around 0.8) and w (the weight factor).
But above all, remember the "No Free Lunch Theorem": there is no "best choice" of parameters for all (or even most) of your test instances.
2 Recommendations
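To give the 10*D rule some context, here is a minimal sketch of classic DE/rand/1/bin in Python; the F value and the stopping rule are illustrative assumptions, with CR set near the 0.8 mentioned above:

```python
import numpy as np

def de_rand_1_bin(f, bounds, F=0.5, CR=0.8, max_gens=200, seed=0):
    """Classic DE/rand/1/bin using the 10*D population-size rule of thumb."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    d = lo.size
    NP = 10 * d  # the 10*D rule of thumb discussed in this thread
    pop = rng.uniform(lo, hi, size=(NP, d))
    fit = np.apply_along_axis(f, 1, pop)
    for _ in range(max_gens):
        for i in range(NP):
            # pick three mutually distinct donors, all different from i
            r1, r2, r3 = rng.choice([j for j in range(NP) if j != i],
                                    size=3, replace=False)
            mutant = pop[r1] + F * (pop[r2] - pop[r3])
            # binomial crossover with one gene guaranteed from the mutant
            cross = rng.random(d) < CR
            cross[rng.integers(d)] = True
            trial = np.clip(np.where(cross, mutant, pop[i]), lo, hi)
            f_trial = f(trial)
            if f_trial <= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = trial, f_trial
    best = int(fit.argmin())
    return pop[best], fit[best]

# e.g. minimise the sphere function in 10 dimensions (so NP = 100)
x_best, f_best = de_rand_1_bin(lambda x: float(np.sum(x ** 2)),
                               ([-5.0] * 10, [5.0] * 10))
```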
8th Jul, 2013
Scott John Turner
Canterbury Christ Church University
Good question! I have always used the 10x criterion as a rule of thumb, but I have lost the reference it originally came from. I recently read an interesting paper that looked at some of the other issues around population size, etc.: https://www.researchgate.net/publication/220862320_Initial_Population_for_Genetic_Algorithms_A_Metric_Approach
8th Jul, 2013
Dr. Nekuri Naveen
University of Hyderabad
Dear Stephen Chen, from my experience I would recommend a population size between 30 and 60; 30 has worked best for me. Also, as Ioannis Christou said, there is no "free lunch".
8th Jul, 2013
Bahram Zaeri
Superna - Ottawa
I think it is highly problem-dependent, and there is no exact rule to calculate it. It depends on many factors such as problem structure, number of dimensions, search space restrictions, etc.
8th Jul, 2013
Aleš Zamuda
University of Maribor
Dear Stephen,
NP = 100 is a good initial guess for DE. However, for a small budget of function evaluations, a smaller NP can be beneficial. We have also found that it may help to have a bigger population at the start and a smaller one at the end, i.e. to reduce the population size gradually during evolution (see the sketch below). This also performed well on the recent CEC2011 real-world problem set:
A. Zamuda, J. Brest. Population Reduction Differential Evolution with Multiple Mutation Strategies in Real World Industry Challenges. Artificial Intelligence and Soft Computing -- ICAISC 2012, 2012, vol. 7269/2012, pp. 154-161. DOI 10.1007/978-3-642-29353-5_18.
A bigger population can also be divided into smaller subpopulations, using an island model with cooperation. There are also quite a few DE extensions with small populations (e.g. NP = 10) now. When citing work on DE parameters, please prefer recent papers where possible, since many reported settings are context-specific findings, and no single answer withstands the NFL theorem.
Best Regards.
1 Recommendation
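A minimal sketch of the gradual population-reduction idea; this simple linear schedule is an illustrative simplification, not the reduction rule of the cited paper:

```python
def population_schedule(np_init, np_min, total_gens):
    """Linearly shrink the population size from np_init down to np_min.

    Illustrative only: the cited population-reduction DE papers use their
    own (different) reduction rules.
    """
    sizes = []
    for g in range(total_gens):
        frac = g / max(total_gens - 1, 1)
        sizes.append(round(np_init + frac * (np_min - np_init)))
    return sizes

# Start large for early exploration, end small for late exploitation:
print(population_schedule(np_init=100, np_min=10, total_gens=10))
# -> [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
```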
8th Jul, 2013
H.E. Lehtihet
Dear Stephen. You can do a Google search using the following:
"empirical study on the effect of population size on differential evolution algorithm".
You might also be interested in some of the work on DE with an adaptive/dynamic population size.
1 Recommendation
8th Jul, 2013
Stephen Chen
York University
Hi Everyone,
Thanks for all of your answers. For relatively small problems, most people recommend popsize >> dimensionality, but this doesn't scale well to problems with a very large number of dimensions. We're developing a new technique (Minimum Population Search) that is designed from the beginning to scale to very high-dimensional problems (1000+). As part of our work, we're also studying the effects of population size for PSO, DE, and UMDA as a function of dimensionality.
In general, PSO and DE scale badly because they work best with popsize >> dimensionality. Thanks for the extra insight, and we'll keep working away at it.
Cheers,
Stephen
8th Jul, 2013
H.E. Lehtihet
Dear Stephen, I think this is going to be a useful work because, at least for DE, the effect of pop size is not well understood in the case of very high dim problems. Good luck
9th Jul, 2013
Hernan Eduardo Aguirre
Shinshu University
Dear Stephen,
I assume you are referring to single-objective optimisation, but just in case, you should be aware that population size depends strongly on the number of objective functions in your problem, and more specifically on the number of true Pareto-optimal solutions. This is general, independent of the algorithm.
Cheers,
9th Jul, 2013
Ayça Altay
Rutgers, The State University of New Jersey
It depends on many factors: problem size, search space, literally the number of parameters of the algorithm. There are no hard and fast rules. For DE, 30-50 is very common, and with that size the algorithm produces results in a short time; 100 is also common. The thing is, you want to produce a good result in a short time, so I suggest you test the population sizes as well (see the sketch below). For example, if you have a minimization problem whose exact solution you do not know, test it for 30, then 50, then 70, and so on. If you see a clear improvement in results, keep increasing; if it seems to be converging, stop. You can always find your problem's own population size.
That said, when I was at a conference, a professor told me that using 100 was a kind of cheating: if the algorithm gives good results for 30-50, that is fine, but raising it to 100 forces the algorithm to provide a good result, which would be bending the algorithm rather than letting it act naturally.
1 Recommendation
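A minimal sketch of that trial-and-error procedure; run_de is a hypothetical callable that runs the optimiser with a given population size and returns the best objective value found (lower is better), and the tolerance is an arbitrary choice:

```python
def sweep_population_sizes(run_de, sizes=(30, 50, 70, 100), tol=1e-3):
    """Grow the population size until results stop clearly improving.

    run_de(np_size) is a hypothetical interface: it should run the
    optimiser and return the best objective value found.
    """
    best = float("inf")
    chosen = sizes[0]
    for np_size in sizes:
        value = run_de(np_size)
        print(f"NP = {np_size}: best f = {value:.6g}")
        if best - value > tol:  # clear improvement: keep growing NP
            best, chosen = value, np_size
        else:                   # no clear gain: stop increasing
            break
    return chosen
```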
9th Jul, 2013
Konstantinos B Baltzis
Aristotle University of Thessaloniki
You should also note that a large population size may affect the ability of the algorithm to find the correct search direction. In general, I also suggest an initial value around 100.
10th Jul, 2013
Borhan Kazimipour
Monash University (Australia)
Firstly, the population size depends on the search landscape, the DE strategy, and the parameter settings. If your implementation converges fast, you can always increase the population size. In theory, increasing the population size has no adverse effect on the performance of DE; on the contrary, it increases the chance of finding better results and converging to the global optimum.
In practice, however, increasing the population size may hurt the final results. When the parameter settings prevent the algorithm from fully converging, or when the computational budget is fixed, increasing the population size may leave the population non-converged. In other words, if we increase the population size while the computational budget is fixed, we have to decrease the number of iterations, and this may result in early termination. We discussed this phenomenon in our recent publication on large-scale optimization: B. Kazimipour, X. Li and A. K. Qin, "Initialization Methods for Large Scale Global Optimization", in IEEE Congress on Evolutionary Computation (CEC 2013), IEEE, 2013.
3 Recommendations
10th Jul, 2013
Pablo García-Sánchez
University of Granada
Apart from the good responses in this thread, I would suggest using 2^n population sizes (e.g. 64, 128, 256 instead of 50, 100, 200) because they are easier to divide into several subpopulations if you need it (for other comparisons, island models, etc.); see the sketch below.
1 Recommendation
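A minimal sketch of why power-of-two sizes are convenient; the island count and dimensions here are arbitrary examples:

```python
import numpy as np

def split_into_islands(pop, n_islands):
    """Split a population into equally sized subpopulations (islands)."""
    assert len(pop) % n_islands == 0, "population must divide evenly"
    return np.split(pop, n_islands)

# NP = 128 (2^7) divides evenly by any power-of-two island count:
pop = np.random.default_rng(0).random((128, 10))  # 128 individuals, d = 10
islands = split_into_islands(pop, 4)              # four islands of 32 each
```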
11th Jul, 2013
Nattavud Pimpa
Mahidol University
It depends on the parameters.
12th Jul, 2013
Halil Karahan
Pamukkale University
Population size depends on many factors: the problem, the search space, and the selection of the adjustable parameters of the algorithm. There are no general rules, but for DE, (5-10) times the number of dimensions is very common; in this case the algorithm produces results in a short time. We discussed this phenomenon in our recent publication, a discussion of "Estimation of Nonlinear Muskingum Model Parameters Using Differential Evolution" by Dong-Mei Xu, Lin Qiu and Shou-Yu Chen; see Karahan (2011) for the reference.
1 Recommendation
14th Jul, 2013
Jeerayut Wetweerapong
Khon Kaen University
I agree with many here that population size is a dependent parameter. There are many essential factors contributing to the success of a search algorithm. For hard problems, key factors need to be balanced properly in order to get a good solution within the time or number of iterations allowed. Simply using a larger population won't help: it increases the chance of finding better solutions, but at the same time it uses more function evaluations and time to intensify the search.
15th Jul, 2013
Elena Niculina Dragoi
Gheorghe Asachi Technical University of Iasi, Faculty of Chemical Engineering and Environmental Protection
From my experience with large non-linear problems (with multiple local minima) of dimensionality (D) around 1000, the best solutions are obtained with a population of 250-500 individuals. Although this works in my case (neural network determination for chemical engineering processes), we must take into consideration that each problem has its own characteristics. The (5-10)*D rule may work for problems with a small D, but using a large population when D is already large is resource-consuming, and that is a big problem.
2 Recommendations
24th Jul, 2013
Xinchao Zhao
Beijing University of Posts and Telecommunications
I think population size is not too critical; it depends mainly on the difficulty of the problem, and your experience needs experimental verification. More attention should be paid to the generation strategy, which is the key part of an algorithm. For normal problems: 30-100; for large-scale optimization: 200-500. It needs some trials on your part.
25th Jul, 2013
Shi Cheng
Shaanxi Normal University
I don't think "a population size of 10 times the number of dimensions" is always right for optimization. Even for single-objective optimization, different problems have different properties, such as scalable/non-scalable or dimension-dependent/independent, just to name a few. We can't treat all problems the same.
A large population size may affect the optimization results, but there is no significant relationship among the population size, the dimensionality of the problem, and the optimization results.
In the paper "A large population size can be unhelpful in evolutionary algorithms" (Tianshi Chen, Ke Tang, Guoliang Chen, Xin Yao, Theoretical Computer Science 436 (2012) 54-70), Chen et al. discuss the setting of population size in evolutionary algorithms; those conclusions may also hold for swarm intelligence.
8 Recommendations
25th Jul, 2013
Stephen Chen
York University
Hi Everyone,
Thank you all for your useful comments and discussion. A key aspect of our research is to study the relationship between population size and problem dimensionality. Ideally, we would want population size >> dimensionality to afford full exploration of all aspects of the search space. However, due to time constraints, we often end up with population size > dimensionality for low-dimensional problems, and population size < dimensionality for high-dimensional problems.
Our current research is trying to study/understand the effects on search techniques when population size < dimensionality, and hopefully to provide insight into how our new technique (Minimum Population Search) can address these issues. If we get anything interesting in our results, I'll be sure to post a follow-up!
5th Aug, 2013
R. Russell Rhinehart
R. Russell Rhinehart Co - www.r3eda.com
I agree with the 10 per DV (decision variable) dimension rule. Too small a population can converge early on a slope and will not provide adequate initial surface exploration; too large a population takes an excessive number of function evaluations to converge. I am not aware of any fundamental analysis that indicates how this balance is affected by the number of players. My support of the 10-per-dimension rule is simply based on my experience with problems of 1 to 80 decision variables.
1 Recommendation
14th Mar, 2015
Stephen Chen
York University
Thank you to everyone for a productive discussion. The ideas became an important part of the attached paper.
Due to the super-linear growth in convergence time, it is generally infeasible to have population sizes larger than the dimensionality for d >= 100. For lower-dimensional problems, it is useful to have p > d to improve diversity. For high-dimensional problems, p ~= 0.5d seems to be the most effective (see the sketch below).
8 Recommendations
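A minimal sketch of the guideline above as a helper function; the p > d case is rendered here as 2*d, which is one illustrative reading of the thread rather than a rule from the paper:

```python
def suggested_population_size(d):
    """Population-size heuristic paraphrased from this discussion.

    Illustrative only: for low-dimensional problems, p > d helps
    diversity (rendered here as 2*d, an assumption); for high-dimensional
    problems, p ~= 0.5*d was reported as most effective.
    """
    if d < 100:
        return 2 * d
    return d // 2

print(suggested_population_size(10))    # -> 20
print(suggested_population_size(1000))  # -> 500
```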
16th Nov, 2015
Mahamad Nabab Alam
National Institute of Technology, Warangal
Normally, a higher population size produces good results for DE, but a very large population size may not, so a compromise needs to be made. No general rule can guarantee reaching the global optimum by fixing the population size.
So comprehensive testing is required to fix the population size for each problem in DE.
1 Recommendation
20th Mar, 2019
Erik Cuevas
University of Guadalajara
This depends on the optimization problem at hand. However, the minimum population size is three individuals.
2 Recommendations
23rd Apr, 2019
Erik Cuevas
University of Guadalajara
This is a requirement of the DE approach.
2 Recommendations
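For context on where this minimum comes from, here is an illustrative sketch of the DE/rand/1 mutation, which needs three mutually distinct donor vectors; in the classic formulation the donors must also differ from the target, which implies NP >= 4:

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_1_mutation(pop, i, F=0.5):
    """DE/rand/1 mutation: v = x_r1 + F * (x_r2 - x_r3).

    The three donor indices are mutually distinct and (in the classic
    formulation) also different from the target index i, so the
    population must hold at least four individuals.
    """
    donors = [j for j in range(len(pop)) if j != i]
    r1, r2, r3 = rng.choice(donors, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])
```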
11th May, 2019
Erik Cuevas
University of Guadalajara
Something that also applies to other methods is the following condition:
with more individuals, the algorithm's performance improves; however, the computational cost increases.
2 Recommendations
7th Aug, 2019
Jeerayut Wetweerapong
Khon Kaen University
It depends on all the important components of the algorithm: the other control parameters (dimensionality, scaling factor, crossover rate), the mutation strategies used, any other incorporated techniques, and the nature of the problems to be solved (e.g. highly non-separable or highly multimodal). With a well-designed diversification strategy, smaller population sizes can be used.
19th Dec, 2020
Marco Antonio Campos Benvenga
Universidade Paulista
Considering that we are talking about a probabilistic optimization method, the bigger the population, the better the chances of finding the optimal result; but it comes at a computational cost. In my experience in this area, the search strategy of the algorithm matters more. Some hybrid metaheuristic algorithms have shown good results on optimization problems. Best regards!
20th Dec, 2020
Rana Muhammad Adnan
Hohai University
It will vary according to the dataset.
6 Recommendations
