I am benchmarking an algorithm using famous nonlinear continuous test functions such as Griewank or Rosenbrock. Most of these functions have a minimum value of 0. Previous articles report a success rate but do not define success. In many books and some articles, however, the success rate is defined as (the number of successful runs) / (total number of runs). It has also drawn my attention that, apart from the success rate, I rarely see much statistical data on the results, such as the average or standard deviation of the trials. My question is: which result qualifies as successful? I understand that the exact optimum counts as a success, but are 10^-6, 10^-27, or 10^-100 successful? To what degree is the algorithm successful? Do the programming language and the dimension of the problem play a role?
Generally, there are two ways to define a meaningful success rate. First, you can set an empirical target value for your objective function to reach; because of problem diversity, though, a single target may not work for every problem. Second, you can define a convergence measure. This is more complicated but more effective: after a number of iterations in which the algorithm generates no better solution, you conclude that the best achievable solution has been found or that the algorithm has stagnated. You can also add a mechanism to restart the algorithm automatically.
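For instance, a minimal Python sketch of such a stagnation check (the `step` function is a hypothetical stand-in for one iteration of whatever optimizer is being tested):

```python
def run_with_stagnation_stop(step, x0, stall_limit=30, max_iters=10_000):
    """Stop when the best objective does not improve for `stall_limit`
    consecutive iterations; `step(x)` returns (candidate, objective)."""
    best_x, best_f = x0, float("inf")
    stall = 0
    for _ in range(max_iters):
        x, f = step(best_x)
        if f < best_f:               # an improvement resets the stagnation counter
            best_x, best_f, stall = x, f, 0
        else:
            stall += 1
        if stall >= stall_limit:     # stagnant: stop here, or restart instead
            break
    return best_x, best_f
```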
The computer's floating-point precision sets the maximum accuracy of the algorithm. Beyond that, I do not understand a success rate unless the method is something like a genetic algorithm, or anything else with a random component. As far as a deterministic algorithm is concerned, it must converge to the solution given certain conditions. In that sense there is no success or failure: it simply converges or it does not.
For each test function, we run the algorithm, say, 100 times and compare each result to the theoretical (or known) optimum with a given accuracy, i.e. %accuracy = (result - theoretical result) / (theoretical result) * 100 (provided the theoretical result is not 0; otherwise we do not divide by it).
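In code, that accuracy check might look like this (a sketch; falling back to the raw deviation when the optimum is 0 is one common choice, not the only one):

```python
def percent_accuracy(result, theoretical):
    """Relative error in percent vs. the known optimum; if the optimum
    is 0 we cannot divide by it, so return the raw deviation instead."""
    if theoretical != 0:
        return (result - theoretical) / theoretical * 100.0
    return result
```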
Ayca: There are a variety of appropriate measures for the success rate of an optimizer, and there are alternate measures of "goodness". The optimizer stops iterating when the convergence criterion is satisfied (perhaps small changes to the decision variables). However, optimizers stop in the proximity of the minimum (or maximum) as they progressively approach it and begin to make changes smaller than the user-defined threshold. Rarely would an optimizer stop exactly on the minimum.

In test problems you know the true value of the global optimum, and some are structured to have a value of zero. But in general the truth cannot be known (if the optimum were known, you would not be running an optimizer). Further, in many applications there are local minima that trap optimizers in the vicinity of a not-global minimum.

As a measure of performance, I would look at the probability of the optimizer finding a solution in the vicinity of the global optimum. Run the optimizer many times and count the number of times a solution is in the proximity of the global. "Proximity" will depend on the precision that the convergence criterion yields. The probability of finding the global will be binomially distributed (a run either does or does not find it). You will want to run a large enough number of independent trials (randomly initialized) so that you can make statistically confident statements about the "success rate". The standard deviation of a binomial probability is sigma = sqrt(p*q/N), where p is the probability of success, q = 1 - p, and N is the number of independent trials. If you want the 2-sigma limits on your measured p to be small (perhaps 0.1p), then N = 400(1-p)/p. As you run trials you will begin to be able to estimate p, and can then specify the number of trials.
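Those two formulas translate directly into code; here is a small sketch (the 0.1p precision target and the example p are illustrative only):

```python
import math

def binomial_sigma(p, n_trials):
    """Standard deviation of a binomial success probability: sqrt(p*q/N)."""
    return math.sqrt(p * (1 - p) / n_trials)

def trials_for_precision(p):
    """Trials needed so the 2-sigma band on the measured p is about 0.1*p,
    from 2*sqrt(p*(1-p)/N) = 0.1*p  =>  N = 400*(1-p)/p."""
    return math.ceil(400 * (1 - p) / p)

print(trials_for_precision(0.4))   # 600 trials for an estimated p of 0.4
print(binomial_sigma(0.4, 600))    # ~0.02, i.e. 2-sigma ~ 0.04 = 0.1*p
```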
But just as important as finding the global optimum is the amount of work required to find it. I like using the average number of function evaluations (ANOFE) as the indicator of work, where a function evaluation includes the computation of function derivatives.
If one optimizer has a high probability of finding the global but requires an excessive ANOFE, and another has a lower probability but a small ANOFE, it might be best to run the low-ANOFE optimizer extra times to have the same chance of finding the global, but with fewer total function evaluations.
In the paper Rhinehart, R. R., M. Su, and U. Manimegalai-Sridhar, “Leapfrogging and Synoptic Leapfrogging: a new optimization approach”, Computers and Chemical Engineering, Vol. 40, 11 May 2012, pp. 67-81, we derive and use a formula to compare the NOFE of optimizers normalized to the same probability of finding the global.
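The exact normalization is in the paper; the underlying idea can be sketched as follows (this restart-based version is a paraphrase for illustration, not a quotation of the paper's formula): if a single run finds the global with probability p, then k = ln(1 - P_target) / ln(1 - p) independent restarts reach an overall probability P_target, and the work to compare is k * ANOFE.

```python
import math

def runs_needed(p_single, p_target=0.99):
    """Restarts needed so P(find global at least once) >= p_target,
    from 1 - (1 - p)^k >= p_target."""
    return math.ceil(math.log(1 - p_target) / math.log(1 - p_single))

def normalized_nofe(anofe, p_single, p_target=0.99):
    """Expected total function evaluations once optimizers are
    normalized to the same probability of finding the global."""
    return runs_needed(p_single, p_target) * anofe

# a cheap, less reliable optimizer vs. an expensive, reliable one
print(normalized_nofe(2_000, 0.3))   # 13 runs -> 26,000 evaluations
print(normalized_nofe(15_000, 0.9))  #  2 runs -> 30,000 evaluations
```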
Thank you all for your help. I have gained some good insights, yet I am still a little confused. I am testing a multi-swarm PSO algorithm and trying to run experiments that may demonstrate the robustness of the algorithm, as in this one.
Optimum solutions to all of these benchmark problems are already known, so that is how I know how close I am to an optimum solution. I ran the algorithm 150 times, so I have enough statistics; I am just not quite sure how to interpret them. I am definitely using ANOFE; it is one of my performance indicators, as are the average, standard deviation, minimum, and maximum of the results. I have already run t-tests and ANOVA showing which algorithm's results are better under two stopping criteria: i) stop after a predetermined number of iterations; ii) stop when there is no improvement in the result for 30 consecutive iterations.
However, both of these studies use the term "success rate", a percentage given by (the number of successful runs) / (total number of runs).
For example, for a function with an optimum value of 0, if I ran the algorithm 150 times and the minimum, maximum, and mean are all 0, meaning I found the exact optimum in every run, then I would certainly be 100% successful. What I am trying to find out is this: for a 20-dimensional problem with an optimum value of 0, if I found 10^-30, does that trial count as a success or not? What is success? Is it only the exact optimum? Finding 10^-5 could be a success for a 100-dimensional problem whose range goes from 0 to 1000000000, but it might not be a success for a 2-dimensional problem that ranges from 0 to 10. Or might it?
Here, thresholds are defined for success: if a run of the algorithm passes a problem-specific threshold, it is a success; otherwise it is a failure. But the thresholds look rather intuitive to me, and I am not sure this would apply to every study.
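As a concrete version of that convention (the threshold itself remains the problem-specific, admittedly intuitive ingredient):

```python
def success_rate(best_values, f_opt, threshold):
    """Fraction of runs whose best objective lands within a
    problem-specific threshold of the known optimum f_opt."""
    successes = sum(1 for f in best_values if f - f_opt <= threshold)
    return successes / len(best_values)

# e.g. 150 runs on a zero-optimum benchmark, counting f <= 1e-6 as success:
# rate = success_rate(results, f_opt=0.0, threshold=1e-6)
```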
Great. I teach an optimization applications class and have the students explore attributes similar to those you are studying. I prefer direct-search approaches (PS, Hooke-Jeeves, Nelder-Mead, Leapfrogging) over gradient-based approaches (Newton, Levenberg-Marquardt, GRG, BFGS) because direct-search algorithms can handle surface discontinuities. So I am happy to see your investigation of PS. However, PS is computationally expensive: at each iteration every particle explores a new trial solution, meaning that all of the particles (including those not near the global, which will be drawn elsewhere) are wasting function evaluations. To me, PS can be justified only when there are hard-to-find global optima and the broad surface exploration increases the probability of finding the global. So, if you are investigating PS, be sure to choose test functions that have multiple optima. Leapfrogging (like many other memetic algorithms) has the same broad surface exploration as PS, but since only the worst player is relocated, it offers a significant improvement in NOFE over PS.
Numerical optimizers cannot find the exact optimum; iterations stop when the solution is close enough to satisfy the convergence criterion. When your optimizer returns an exact value of zero, I suspect it is an artifact of display truncation: it truncated something like 1E-123 to zero.
Yes, and the same convergence criterion on a high-dimensional application will permit the proximity to be less precise.
Accordingly, "success" in finding the global should be based on proximity to the global. There are several ways to evaluate proximity. If you run many trials from independent initializations, sort the results, and plot the sorted rank vs. the OF value, you'll have a Cumulative Distribution Function, CDF(OF). There should be a clear distinction whether a solution is in the proximity of the global or is at a second-best OF value. You can use the CDF to choose a discriminating OF value.
Alternately, you can use a sensitivity analysis to extrapolate the residual that might remain when the DVs meet the convergence criterion: e_OF = sum_i |∂OF/∂DV_i| * e_DV_i.
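A sketch of that extrapolation using central finite differences for the partials (the step size h and the use of a single e_dv for all variables are assumptions for illustration):

```python
import numpy as np

def residual_bound(f, x, e_dv, h=1e-6):
    """Estimate e_OF = sum_i |dOF/dDV_i| * e_DV_i at the returned
    solution x, with partials taken by central finite differences."""
    x = np.asarray(x, dtype=float)
    grad = np.empty_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = h
        grad[i] = (f(x + d) - f(x - d)) / (2 * h)
    return float(np.sum(np.abs(grad) * e_dv))
```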
Differences beyond 10^-5 may be insignificant for comparison. Why don't you also try test functions with different input/output ranges? It could also be helpful to derive statistical bounds at each iteration as an alternative to multiple runs; I believe Gunter Rudolph has some work on this type of convergence.