Available at http://pvamu.edu/aam
Appl. Appl. Math. ISSN: 1932-9466
Vol. 7, Issue 1 (June 2012), pp. 226 - 247
Applications and Applied Mathematics: An International Journal (AAM)
A New CG-Algorithm with Self-Scaling VM-Update for Unconstrained Optimization
Abbas Y. Al-Bayati
College of Basic Education Telafer
University of Mosul, Mosul-Iraq
profabbaslbayati@yahoo.com
Ivan S. Latif
Department of Mathematics
University of Salahaddin, Erbil-Iraq
ivansubhi2001@yahoo.com
Received: February 09, 2011; Accepted: February 23, 2012
Abstract
In this paper, a new combined extended Conjugate-Gradient (CG) and Variable-Metric (VM) method is proposed for solving unconstrained large-scale numerical optimization problems. The basic idea is to choose a combination of the current gradient and some previous search directions as a new search direction, updated by Al-Bayati's SCVM-method, and to fit a new step-size parameter using Armijo Inexact Line Searches (ILS). The method is based on the ILS, and its numerical properties are discussed using different non-linear test functions with various dimensions. The global convergence property of the new algorithm is investigated under a few weak conditions. Numerical experiments show that the new algorithm converges faster and is superior to some other similar methods in many situations.
Keywords: Unconstrained Optimization, Gradient Related Method, Self-Scaling VM-Method,
Inexact Line Searches
MSC 2010: 49M07, 49M10, 90C06, 65K
1. Introduction
Consider an unconstrained optimization problem:

$$\min f(x), \qquad x \in \mathbb{R}^n, \qquad (1)$$

where $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is a continuously differentiable function on $\mathbb{R}^n$, the n-dimensional Euclidean space; n may be very large. Most of the well-known iterative algorithms for solving (1) take the form

$$x_{k+1} = x_k + \alpha_k d_k, \qquad (2)$$

where $d_k$ is a search direction and $\alpha_k$ is a positive step-size along that direction. This class of methods is called line search gradient methods. If $x_k$ is the current iterative point, then we denote $\nabla f(x_k)$ by $g_k$, $f(x_k)$ by $f_k$, and $f(x)$ by $f$, respectively. If we take $d_k=-g_k$, the corresponding method is the Steepest Descent (SD) method, one of the simplest gradient methods, which has wide applications in large-scale optimization; see Nocedal and Wright (1999). Generally, the CG-method is a useful technique for solving large-scale nonlinear problems because it avoids the computation and storage of matrices associated with the Hessian of the objective function. The CG-method has the form

$$d_{k+1}=\begin{cases}-g_{k+1}, & \text{if } k=0,\\ -g_{k+1}+\beta_k d_k, & \text{if } k>0,\end{cases} \qquad (3)$$
where $\beta_k$ is a scalar parameter that determines the particular CG-method; see, for example, Crowder and Wolfe (1972), Dai and Yuan (1996, 1999) and Fletcher and Reeves (1964). Well-known choices of $\beta_k$ satisfy

$$\beta_k^{FR}=\frac{\|g_{k+1}\|^2}{\|g_k\|^2},\qquad \beta_k^{PR}=\frac{g_{k+1}^T(g_{k+1}-g_k)}{\|g_k\|^2},\qquad \beta_k^{HS}=\frac{g_{k+1}^T(g_{k+1}-g_k)}{d_k^T(g_{k+1}-g_k)}, \qquad (4)$$

which correspond, respectively, to the FR (Fletcher and Reeves, 1964), PR (Polak and Ribiere, 1969) and HS (Hestenes and Stiefel, 1952) methods. A CG-method with Exact Line Searches (ELS) has finite convergence when it is used to minimize a strictly convex quadratic function; see, for example, Al-Bayati and Al-Assady (1986). However, if the objective function is not quadratic or ELS is not used, a CG-method no longer terminates finitely, and global convergence may fail for non-quadratic objective functions. Similarly, Miele and Cantrell (1969) studied the memory gradient method for (1); namely, if $x_0$ is an initial point and $d_0=-g_0$, the method can be stated as

$$x_{k+1}=x_k+v_k,\qquad v_k=-\alpha_k g_k+\beta_k v_{k-1}, \qquad (5)$$

where $\alpha_k$ and $\beta_k$ are scalars chosen at each step so as to yield the greatest decrease in the function $f$. Cantrell (1969) showed that the memory gradient method and the FR-CG method are identical in the particular case of a quadratic function. Cragg and Levy (1969) proposed a super-memory gradient method whose search direction is defined by

$$d_k=-g_k+\sum_{i=1}^{k-1}\beta_i d_i, \qquad (6)$$

where $x_0$ is an initial point and $d_0=-g_0$. Wolfe and Viazminsky (1976) also investigated a super-memory descent method for (1), in which the step satisfies

$$f\!\left(x_k+\sum_{i=1}^{m}v_k^{(i)}p_{k-i}\right)=\min_{v^{(1)},\dots,v^{(m)}}f\!\left(x_k+\sum_{i=1}^{m}v^{(i)}p_{k-i}\right), \qquad (7a)$$

where

$$v_k=\sum_{i=1}^{m}v_k^{(i)}p_{k-i}, \qquad (7b)$$

$m$ is a fixed positive integer, and

$$p_k=-g_k\neq 0. \qquad (8)$$
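For illustration only (this code is not part of the original paper, whose experiments were programmed in Fortran), the following Python sketch shows one way to evaluate the CG recursion (3) with the classical beta-choices of (4); the function names are our own and the arguments are assumed to be NumPy vectors.

```python
import numpy as np

def cg_beta(g_new, g_old, d_old, variant="FR"):
    """Conjugacy parameter beta_k from equation (4)."""
    if variant == "FR":          # Fletcher-Reeves
        return (g_new @ g_new) / (g_old @ g_old)
    if variant == "PR":          # Polak-Ribiere
        return (g_new @ (g_new - g_old)) / (g_old @ g_old)
    if variant == "HS":          # Hestenes-Stiefel
        return (g_new @ (g_new - g_old)) / (d_old @ (g_new - g_old))
    raise ValueError(f"unknown variant: {variant}")

def cg_direction(g_new, g_old=None, d_old=None, variant="FR"):
    """Search direction of equation (3): steepest descent on the first
    iteration, otherwise -g_{k+1} + beta_k * d_k."""
    if g_old is None or d_old is None:   # first iteration
        return -g_new
    return -g_new + cg_beta(g_new, g_old, d_old, variant) * d_old
```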
Both the memory and super-memory gradient methods are more efficient than the CG and SD methods when the amount of computation and storage required by the latter is taken into account. Shi and Shen (2004) combined the CG-method and the super-memory descent method to form a new gradient method, which may be more effective than the standard CG-method for solving large-scale optimization problems, as follows:

$$f\!\left(x_k+\alpha_k d_k(\beta_k^{(1)},\dots,\beta_k^{(m)})\right)=\min\left\{f\!\left(x_k+\alpha\,d_k(\beta^{(1)},\dots,\beta^{(m)})\right):\ \alpha>0,\ \beta^{(i)}\in\Big[\tfrac{\lambda_k^{(i)}}{2},\lambda_k^{(i)}\Big]\ (i=2,\dots,m)\right\}, \qquad (9a)$$

where

$$x_{k+1}=x_k+\alpha_k d_k(\beta_k^{(1)},\dots,\beta_k^{(m)}). \qquad (9b)$$

We denote $d_k(\beta_k^{(1)},\dots,\beta_k^{(m)})$ by $d_k$ throughout this paper; $m$ is a fixed positive integer and $\alpha_k$ is a scalar chosen by a line search procedure. The theoretical and practical merits of the Quasi-Newton (QN) family of methods for unconstrained optimization have been systematically explored since the classic paper of Fletcher and Powell, which analyzed Davidon's VM-method. In 1970 the self-scaling VM-algorithms were introduced, showing significant improvement in efficiency over earlier methods.
Recently, Al-Bayati and Latif (2008) proposed a new three-term preconditioned gradient memory method. Their method subsumes some other families of nonlinear preconditioned gradient memory methods as subfamilies, with Powell's restart criterion and inexact Armijo line searches. Their search direction was defined by

$$d_k^{B\&L}=\begin{cases}-H_k g_k, & \text{if } k=0,\\ -H_k g_k+\alpha_k H_k d_{k-1}+\beta_k d_{k-1}, & \text{if } k>0,\end{cases} \qquad (10)$$

where $\alpha_k$ is a step-size defined by an inexact Armijo line search procedure and $\beta_k$ is the conjugacy parameter. Al-Bayati et al. (2009) introduced two versions of CG-algorithms. Their search directions are defined by

$$d_{k+1}^{v_1}=\begin{cases}-g_{k+1}, & \text{if } k=0,\\ -g_{k+1}+\beta_k^{v_1}d_k, & \text{if } k>0,\end{cases}\qquad \beta_k^{v_1}=\left(1-\frac{s_k^T y_k}{y_k^T y_k}\right)\frac{g_{k+1}^T y_k}{s_k^T y_k}, \qquad (11a)$$

and

$$d_{k+1}^{v_2}=\begin{cases}-g_{k+1}, & \text{if } k=0,\\ -g_{k+1}+\beta_k^{v_2}d_k, & \text{if } k>0,\end{cases}\qquad \beta_k^{v_2}=\left(1-\frac{s_k^T y_k}{y_k^T y_k}\right)\frac{g_{k+1}^T y_k}{d_k^T y_k}-\frac{s_k^T g_{k+1}}{d_k^T y_k}, \qquad (11b)$$

where the directions in (11) have been proved to be sufficient descent directions.
Also, Zhang et al. (2009) modified the Dai-Liao (DL) CG-method with a three-term search direction as follows:

$$d_{k+1}=\begin{cases}-g_{k+1}, & \text{if } k=0,\\ -g_{k+1}+\beta_k^{DL}d_k-\theta_k\,(y_k-t\,s_k), & \text{if } k>0,\end{cases} \qquad (12a)$$

where $\theta_k=g_{k+1}^T d_k/(d_k^T y_k)$ and $\beta_k^{DL}$ is defined by

$$\beta_k^{DL}=\frac{g_{k+1}^T(y_k-t\,s_k)}{d_k^T y_k},\qquad t\geq 0. \qquad (12b)$$

They show that the sufficient descent condition also holds even when no line search is used, that is,

$$g_{k+1}^T d_{k+1}=-\|g_{k+1}\|^2. \qquad (13)$$
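As a purely illustrative check (not from the paper), the following Python sketch builds the three-term direction (12a)-(12b) as reconstructed above and verifies numerically that the descent identity (13) holds; the parameter value t = 0.1 and the random test data are assumptions.

```python
import numpy as np

def zhang_three_term_direction(g_new, d_old, s_old, y_old, t=0.1):
    """Three-term Dai-Liao style direction of (12a)-(12b); t >= 0 is the
    Dai-Liao parameter (the value here is illustrative)."""
    dty = d_old @ y_old
    beta_dl = (g_new @ (y_old - t * s_old)) / dty
    theta = (g_new @ d_old) / dty
    return -g_new + beta_dl * d_old - theta * (y_old - t * s_old)

# Numerical check of the sufficient descent identity (13): g^T d = -||g||^2.
rng = np.random.default_rng(0)
g, d, s, y = (rng.normal(size=6) for _ in range(4))
d_new = zhang_three_term_direction(g, d, s, y)
assert np.isclose(g @ d_new, -(g @ g))
```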
In order to achieve a global convergence result, Grippo and Lucidi (1997) proposed the following line search: for given constants $\delta>0$, $\tau>0$ and $\sigma\in(0,1)$, let

$$\alpha_k=\max\left\{\sigma^j\,\frac{\tau\,|g_k^T d_k|}{\|d_k\|^2}\ :\ j=0,1,\dots\right\}, \qquad (14)$$

which satisfies

$$f(x_k+\alpha_k d_k)\leq f(x_k)-\delta\,\alpha_k^2\,\|d_k\|^2. \qquad (15)$$

This line search may be preferred to the classical Armijo rule because it yields a greater reduction of the objective function; introducing this line search rule into the present framework may be taken as an open problem.
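A minimal sketch of this backtracking rule, under the reconstruction of (14)-(15) above; the function name and all parameter values are illustrative assumptions rather than the authors' implementation, and the vectors are assumed to be NumPy arrays.

```python
import numpy as np

def grippo_lucidi_step(f, x, g, d, tau=1.0, sigma=0.5, delta=1e-3, max_j=50):
    """Line search (14)-(15): try alpha_j = sigma**j * tau * |g^T d| / ||d||^2
    for j = 0, 1, ... and accept the first alpha_j satisfying
    f(x + alpha*d) <= f(x) - delta * alpha**2 * ||d||^2."""
    dd = d @ d
    base = tau * abs(g @ d) / dd
    fx = f(x)
    alpha = base
    for j in range(max_j):
        alpha = (sigma ** j) * base
        if f(x + alpha * d) <= fx - delta * alpha ** 2 * dd:
            return alpha
    return alpha  # smallest trial step if the test never passed

# Example on a simple quadratic, assuming NumPy vectors:
f = lambda z: 0.5 * float(z @ z)
x = np.array([1.0, -2.0]); g = x.copy(); d = -g
print(grippo_lucidi_step(f, x, g, d))
```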
In this paper, a new gradient-related algorithm combined with a VM-update for solving large-scale unconstrained optimization problems is proposed. The new algorithm is an ILS method modified with a VM-algorithm. The basic idea is to choose a combination of the current gradient and some previous search directions, together with Al-Bayati's self-scaling VM-update, which is based on a two-parameter family of rank-two updating formulae. The algorithm is compared with similar published algorithms and may be more effective than standard conjugate-related algorithms, e.g., Nazareth (1977), and other VM-algorithms. The global rate of convergence is investigated under diverse weak conditions. Numerical experiments show that the new algorithm converges more stably and is superior to other similar methods.
2. Shi-Shen Algorithm

Shi and Shen (2004) made the following assumptions:

$S_1$: The objective function $f$ has a lower bound on the level set $L_0=\{x\in\mathbb{R}^n : f(x)\leq f(x_0)\}$, where $x_0$ is an available initial point.

$S_2$: The gradient $g(x)=\nabla f(x)$ of $f(x)$ is Lipschitz continuous in an open convex set $B$ which contains $L_0$, i.e., there exists a constant $L>0$ such that

$$\|g(x)-g(y)\|\leq L\,\|x-y\|,\qquad \forall\, x,y\in B.$$

$S_3$: The gradient $g(x)$ is uniformly continuous in an open convex set $B$ containing $L_0$.

Obviously, Assumption ($S_2$) implies ($S_3$).
As is well known, a key step in devising an algorithm for unconstrained optimization problems is to choose an available search direction $d_k$ and a suitable step-size $\alpha_k$ at each iteration. Certainly, if we choose a search direction $d_k$ satisfying

$$g_k^T d_k<0, \qquad (16)$$

then we obtain a descent direction. Generally, we demand that

$$g_k^T d_k\leq -c\,\|g_k\|^2, \qquad (17a)$$

which is called the sufficient descent condition, where $c>0$. Furthermore, if

$$-g_k^T d_k\geq \tau\,\|g_k\|\,\|d_k\|, \qquad (17b)$$

then many descent algorithms are convergent under the above condition; it is called an angle condition, or a gradient-related condition.
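The two conditions above are easy to test in code; the following small helpers (illustrative only, with assumed constants c and tau) check (17a) and (17b) for given NumPy vectors g and d.

```python
import numpy as np

def is_sufficient_descent(g, d, c=1e-4):
    """Sufficient descent condition (17a): g^T d <= -c * ||g||^2."""
    return g @ d <= -c * (g @ g)

def is_gradient_related(g, d, tau=1e-4):
    """Angle (gradient-related) condition (17b): -g^T d >= tau * ||g|| * ||d||."""
    return -(g @ d) >= tau * np.linalg.norm(g) * np.linalg.norm(d)
```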
Definition 2.1. Bertsekas (1982)

Let $\{x_k\}$ be a sequence generated by the gradient method (2). We say that the sequence $\{d_k\}$ is uniformly gradient related to $\{x_k\}$ if, for every convergent subsequence $\{x_k\}_{k\in K}$ for which

$$\lim_{k\to\infty,\;k\in K} g_k\neq 0, \qquad (18a)$$

we have

$$0<\liminf_{k\in K}\,|g_k^T d_k|,\qquad \limsup_{k\in K}\,\|d_k\|<\infty. \qquad (18b)$$

Equivalently, $\{d_k\}$ is uniformly gradient related if, whenever a subsequence $\{g_k\}$ tends to a non-zero vector, the corresponding subsequence of directions $d_k$ is bounded and does not tend to be orthogonal to $g_k$. Moreover, we must choose a line search rule to find the step-size along the search direction at each iteration.
Lemma 2.1. Bertsekas (1982)

Let $\{x_k\}$ be a sequence generated by a gradient method, and assume that $\{d_k\}$ is uniformly gradient related and $\alpha_k$ is chosen by the minimization rule or the limited minimization rule. Then every limit point of $\{x_k\}$ is a critical point $x^*$, i.e., $g(x^*)=0$.

As to the parameters in the algorithm, we choose $\beta_k^{(i)}\in\left[\tfrac{\lambda_k^{(i)}}{2},\lambda_k^{(i)}\right]$ $(i=2,\dots,m)$ for solving large-scale optimization problems as defined in (9). To get the algorithm to converge more quickly, we take

$$\lambda_k^{(i)}=\begin{cases}\dfrac{\|g_k\|^2}{2m\,|g_k^T d_{k-i+1}|}, & \text{if } g_k^T d_{k-i+1}\neq 0,\\[2mm] \dfrac{\|g_k\|^2}{2m}, & \text{if } g_k^T d_{k-i+1}=0,\end{cases}\qquad \text{for } i=1,2,\dots,m. \qquad (19)$$

Now, it is easy to prove that

$$0\leq\sum_{i=2}^{m}\beta_k^{(i)}\,|g_k^T d_{k-i+1}|\leq\tfrac{1}{2}\,\|g_k\|^2$$

and

$$g_k^T d_k=-\|g_k\|^2+\sum_{i=2}^{m}\beta_k^{(i)}\,g_k^T d_{k-i+1}\leq-\tfrac{1}{2}\,\|g_k\|^2<0.$$

The details may be found in Crowder and Wolfe (1972).
Algorithm 2.1. Shi-Shen

Let $0<\sigma_1\leq\sigma_2<1$, fix an integer $m\geq 2$, choose $x_1\in\mathbb{R}^n$, set $k=1$, and let $\varepsilon>0$ be a small tolerance parameter; then:

Step 1. If $\|g_k\|\leq\varepsilon$, then stop.

Step 2. Set $x_{k+1}=x_k+\alpha_k d_k(\beta_k^{(1)},\dots,\beta_k^{(m)})$, where

$$d_k(\beta_k^{(1)},\dots,\beta_k^{(m)})=\begin{cases}-g_k, & \text{if } k\leq m-1,\\ -g_k+\displaystyle\sum_{i=2}^{m}\beta_k^{(i)} d_{k-i+1}, & \text{if } k\geq m,\end{cases}$$

$$\beta_k^{(i)}\in\left[\tfrac{\lambda_k^{(i)}}{2},\lambda_k^{(i)}\right],\qquad i=2,\dots,m,$$

$$\lambda_k^{(i)}=\begin{cases}\dfrac{\|g_k\|^2}{2m\,|g_k^T d_{k-i+1}|}, & \text{if } g_k^T d_{k-i+1}\neq 0,\\[2mm] \dfrac{\|g_k\|^2}{2m}, & \text{if } g_k^T d_{k-i+1}=0,\end{cases}\qquad i=1,2,\dots,m,$$

and

$$\beta_k^{(1)}=\dfrac{\|g_k\|^2}{2m\,\left|g_k^T\!\left(g_k-\sum_{i=2}^{m}\beta_k^{(i)} d_{k-i+1}\right)\right|},\qquad i=2,\dots,m.$$

The scalar $\alpha_k$ in Step 2 is chosen by a cubic line search; see Bunday (1984). (An illustrative sketch of this direction computation is given after the algorithm.)

Step 3. If the available storage is exceeded, then employ a restart option [Zoutendijk (1970)], either with $k=n$ or when Powell's criterion $|g_{k+1}^T g_k|\geq 0.2\,\|g_{k+1}\|^2$ holds.

Step 4. Set $k=k+1$ and go to Step 1.
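The following Python sketch (illustrative only, not the authors' Fortran code) computes the Step-2 direction under the reconstruction above; choosing each beta_k^(i) as a fixed fraction of lambda_k^(i) and storing the previous directions in a Python list are assumptions, and beta_k^(1) is not needed for the direction itself in this sketch.

```python
import numpy as np

def shi_shen_direction(g, prev_dirs, m):
    """Step-2 direction of Algorithm 2.1 (as reconstructed above):
    d_k = -g_k + sum_{i=2..m} beta_k^(i) * d_{k-i+1}, with beta_k^(i)
    chosen inside [lambda_k^(i)/2, lambda_k^(i)].
    prev_dirs holds the previous directions, most recent last."""
    if len(prev_dirs) < m - 1:               # k <= m-1: steepest descent
        return -g
    gg = g @ g
    d = -g.astype(float)
    for i in range(2, m + 1):
        d_prev = prev_dirs[-(i - 1)]         # d_{k-i+1}
        gtd = g @ d_prev
        lam = gg / (2 * m * abs(gtd)) if gtd != 0 else gg / (2 * m)
        d += 0.75 * lam * d_prev             # any beta in [lam/2, lam]
    return d
```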
3. A New Proposed Algorithm for Solving Problem (1)

In this section we choose a line search rule to find the best step-size parameter along the search direction at each iteration. In fact, we can use the generalized Armijo line search rule implemented in Luenberger (1989):

Set scalars $S$, $\beta$ and $\sigma_1$ with $\beta\in(0,1)$, $\sigma_1\in(0,1)$ and $S>0$. Let $\alpha_k$ be the largest $\alpha$ in $\{S,\beta S,\beta^2 S,\dots\}$ such that

$$f(x_k+\alpha d_k)-f(x_k)\leq\sigma_1\,\alpha\,g_k^T d_k. \qquad (20)$$

Choosing the parameter $\beta$ is important for the implementation of the line search method. If $\beta$ is too large, the line search process may be too slow, while if $\beta$ is too small, the trial steps shrink too fast and an acceptable step-size may be lost; we should therefore choose a suitable step-size at each iteration.
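A minimal sketch of the generalized Armijo rule (20); the constant values S = 1, beta = 0.5 and sigma_1 = 1e-4 are illustrative assumptions, and the vectors are assumed to be NumPy arrays.

```python
import numpy as np

def armijo_step(f, x, g, d, S=1.0, beta=0.5, sigma1=1e-4, max_j=60):
    """Generalized Armijo rule (20): the largest alpha in {S, beta*S, beta^2*S, ...}
    with f(x + alpha*d) - f(x) <= sigma1 * alpha * g^T d."""
    fx, gtd = f(x), g @ d
    alpha = S
    for _ in range(max_j):
        if f(x + alpha * d) - fx <= sigma1 * alpha * gtd:
            return alpha
        alpha *= beta
    return alpha

# Example on a quadratic, assuming NumPy vectors:
f = lambda z: float(z @ z)
x = np.array([2.0, -1.0]); g = 2.0 * x; d = -g
print(armijo_step(f, x, g, d))
```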
4. New Method

In order to increase the efficiency of Algorithm 2.1, an extended Armijo line search rule, as given in Cantrell (1969), is used to find the best value of the step-size, and a new hybrid algorithm is formed which combines the search direction of the Shi-Shen Algorithm 2.1 with Al-Bayati's (1991) self-scaling VM-update, as shown below.

Let $\sigma_1\in(0,1)$, $\beta\in(0,1)$, $S>0$, fix an integer $m\geq 2$, set $k=1$, choose $x_1\in\mathbb{R}^n$, let $H_1$ be any positive definite matrix (usually $H_1=I$), and let $\varepsilon>0$ be a small tolerance parameter.

Algorithm 4.1.

Step 1. If $\|g_k\|\leq\varepsilon$, then stop.

Step 2. Set $x_{k+1}=x_k+\alpha_k d_k(\beta_k^{(1)},\dots,\beta_k^{(m)})$, where

$$d_k(\beta_k^{(1)},\dots,\beta_k^{(m)})=\begin{cases}-H_k g_k, & \text{if } k\leq m-1,\\ -H_k\!\left\{g_k-\displaystyle\sum_{i=2}^{m}\beta_k^{(i)} d_{k-i+1}\right\}, & \text{if } k\geq m,\end{cases}$$

$$\beta_k^{(i)}\in\left[\tfrac{\lambda_k^{(i)}}{2},\lambda_k^{(i)}\right],\qquad i=2,\dots,m,$$

$$\lambda_k^{(i)}=\begin{cases}\dfrac{\|g_k\|^2}{2m\,|g_k^T d_{k-i+1}|}, & \text{if } g_k^T d_{k-i+1}\neq 0,\\[2mm] \dfrac{\|g_k\|^2}{2m}, & \text{if } g_k^T d_{k-i+1}=0,\end{cases}\qquad i=1,2,\dots,m,$$

and

$$\beta_k^{(1)}=\dfrac{\|g_k\|^2}{2m\,\left|g_k^T\!\left(g_k-\sum_{i=2}^{m}\beta_k^{(i)} d_{k-i+1}\right)\right|},\qquad i=2,\dots,m.$$

The scalar $\alpha_k$ in Step 2 is chosen by the Armijo line search rule defined in (20), and $H_k$ is the matrix produced by the Al-Bayati (1991) VM-update of Step 3 (see the sketch after this algorithm).

Step 3. Update $H_k$ by

$$H_{k+1}=\left(H_k-\frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k}+w_k w_k^T\right)\rho_k+\frac{v_k v_k^T}{v_k^T y_k},$$

with

$$v_k=x_{k+1}-x_k,\qquad y_k=g_{k+1}-g_k,$$

$$w_k=\left(y_k^T H_k y_k\right)^{1/2}\left[\frac{v_k}{v_k^T y_k}-\frac{H_k y_k}{y_k^T H_k y_k}\right],\qquad \rho_k=\frac{v_k^T y_k}{y_k^T H_k y_k}.$$

Step 4. If the available storage is exceeded, then employ a restart option, either with $k=n$ or when $|g_{k+1}^T g_k|\geq 0.2\,\|g_{k+1}\|^2$.

Step 5. Set $k=k+1$ and go to Step 2.
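The sketch below codes the Step-3 update as reconstructed above; the exact placement of the scaling factor rho_k is part of the reconstruction and should not be read as a verified transcription of Al-Bayati (1991).

```python
import numpy as np

def al_bayati_vm_update(H, v, y):
    """Self-scaling VM-update of Step 3 (reconstruction):
        H+  = rho * (H - H y y^T H / (y^T H y) + w w^T) + v v^T / (v^T y),
        w   = sqrt(y^T H y) * (v / (v^T y) - H y / (y^T H y)),
        rho = (v^T y) / (y^T H y).
    Requires v^T y > 0 and a positive definite H."""
    Hy = H @ y
    yHy = y @ Hy
    vty = v @ y
    w = np.sqrt(yHy) * (v / vty - Hy / yHy)
    rho = vty / yHy
    return rho * (H - np.outer(Hy, Hy) / yHy + np.outer(w, w)) + np.outer(v, v) / vty
```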
Now to ensure that the new algorithm has a super-linear convergence, let us consider the
following theorems:
Theorem 4.1.

If ($S_1$) and ($S_2$) hold and the new Algorithm 4.1 generates an infinite sequence $\{x_k\}$, then

$$\sum_{k\geq m}\frac{\|g_k\|^4}{\tau_k}<\infty, \qquad (21a)$$

where

$$\tau_k=\max_{2\leq i\leq m}\left(\|g_k\|^2,\ \|d_{k-i+1}\|^2\right). \qquad (21b)$$

Proof:

Since $\{f_k\}$ is a decreasing sequence and has a lower bound on the level set $L_0$, it is a convergent sequence. Moreover, since $H_{k+1}$ also has a global rate of convergence [see Al-Bayati (1991) for the details], Lemma 2.1 shows that (21) holds; hence the new algorithm has a super-linear convergence rate, and the proof is complete.
Theorem 4.2.

If the conditions of Theorem 4.1 hold, then either $\lim_{k\to\infty}\|g_k\|=0$ or $\{x_k\}$ is unbounded.

Proof:

If $\lim_{k\to\infty}\|g_k\|\neq 0$, then there exist an infinite subset $K_0\subseteq\{m,m+1,\dots\}$ and $\varepsilon>0$ such that

$$\|g_k\|\geq\varepsilon,\qquad k\in K_0. \qquad (22)$$

Thus,

$$\sum_{k\in K_0}\frac{\|g_k\|^4}{\tau_k}\geq\sum_{k\in K_0}\frac{\varepsilon^4}{\tau_k}. \qquad (23)$$

By Theorem 4.1, and for $k=1$, we obtain

$$\|d_k\|^2\leq\max_{1\leq i\leq k}\left\{\|g_i\|^2\right\}. \qquad (24)$$

Now, if $k\leq m$, then the conclusion is obvious; if $k>m$, it follows by an induction process. Hence we have

$$\sum_{k\in K_0}\frac{\varepsilon^4}{\tau_k}\leq\sum_{k\geq m}\frac{\|g_k\|^4}{\tau_k}<\infty. \qquad (25)$$

Then there exists at least one $i_0$, $2\leq i_0\leq m$, such that

$$\lim_{k\to\infty,\;k\in K_0}\|d_{k-i_0+1}\|=\infty, \qquad (26)$$

and hence $\{x_k\}$ is unbounded.
5. Numerical Results

Comparative tests were performed with seventy-eight well-known test functions (twenty-six functions, each in three different versions), which are specified in the Appendix. All the results shown in Table 1 were obtained with newly-programmed Fortran routines employing double precision. The comparative performance of the algorithms is assessed in the usual way, by considering both the total number of function evaluations (NOF) and the total number of iterations (NOI). In each case the convergence criterion is $\|g_k\|\leq 1\times 10^{-5}$. The cubic fitting technique, published in its original form by Bunday (1984), is used as the common line search subprogram for the Shi-Shen algorithm, while the Armijo line search procedure defined in (20) is used for our new proposed algorithm.

Each test function was solved using the following two algorithms:

(1) The original algorithm published by Shi and Shen, denoted Shi-Shen (Algorithm 2.1).
(2) The new proposed algorithm, denoted New (Algorithm 4.1).

The numerical results in Table 1 compare the New and the Shi-Shen algorithms for different dimensions of the test functions, while Table 2 gives the overall totals. The details of the percentage improvements in NOI and NOF are given in Table 3. The important point is that the new algorithm needs fewer iterations and fewer evaluations of $f(x)$ and $g(x)$ than the standard Shi-Shen algorithm in many situations, especially for large-scale unconstrained optimization problems, when the iterative process reaches the same precision. The new proposed algorithm also uses less CPU time than the Shi-Shen method, although CPU times are not reported here. Moreover, the Shi-Shen algorithm may fail in some cases, while the new method always converged to the minimum points in our tests. The new algorithm therefore also seems suitable for solving ill-conditioned and large-scale minimization problems.
Table 1. Comparison between the New (Algorithm 4.1) and the Shi-Shen (Algorithm 2.1) algorithms for three relations between m and n. Each cell is NOI(NOF); the five columns per algorithm correspond to n = 12, 36, 360, 1080 and 4320.

No. Test function / Case / Shi-Shen: n = 12, 36, 360, 1080, 4320 / New: n = 12, 36, 360, 1080, 4320

1. EX-Tridiagonal-1
   m<n: 57(142) 54(118) 58(118) 58(118) 59(120) | 16(31) 16(19) 16(20) 15(19) 16(20)
   m=n: 53(113) 54(112) 54(112) 58(118) 58(118) | 16(28) 16(19) 16(20) 15(19) 16(20)
   m>n: 53(109) 54(111) 58(118) 58(118) 58(118) | 15(19) 16(19) 16(20) 15(19) 16(20)

2. EX-Three Exponential
   m<n: 26(61) 22(59) 22(45) 22(45) 21(43) | 4(7) 5(8) 5(8) 5(8) 5(8)
   m=n: 26(55) 21(43) 22(45) 22(45) 21(43) | 4(7) 5(8) 5(8) 5(8) 5(8)
   m>n: 26(53) 21(43) 22(45) 22(45) 21(43) | 4(7) 5(8) 5(8) 5(8) 5(8)

3. Matrix Rom (Diagonal 5)
   m<n: 2(7) 2(7) 2(7) 2(7) 2(7) | 3(5) 4(6) 4(6) 4(6) 4(6)
   m=n: 2(7) 2(7) 2(7) 2(7) 2(7) | 3(5) 4(6) 4(6) 4(6) 4(6)
   m>n: 2(7) 2(7) 2(7) 2(7) 2(7) | 3(5) 4(6) 4(6) 4(6) 4(6)

4. EX-Freud & Roth
   m<n: 6645(13735) 311(707) 1940(5341) 1956(5745) 2040(6086) | 20(26) 21(27) 21(27) 21(27) 21(27)
   m=n: 512(1455) 9265(18877) 10504(21132) 10882(21869) 11324(22743) | 20(26) 21(27) 21(27) 21(27) 21(27)
   m>n: 9440(718975) 9792(19697) 10528(21151) 10880(21855) 11324(22743) | 20(26) 21(27) 21(27) 21(27) 21(27)

5. GEN-Tridiagonal-1
   m<n: 58(145) 71(230) 58(127) 58(139) 52(135) | 16(39) 17(29) 16(20) 15(19) 16(20)
   m=n: 58(123) 56(115) 58(127) 58(127) 52(135) | 16(28) 16(19) 16(20) 15(19) 16(20)
   m>n: 48(99) 26(75) 58(127) 58(127) 52(135) | 15(19) 16(19) 16(20) 15(19) 16(20)

6. Diagonal 4
   m<n: 7(15) 7(15) 9(19) 9(19) 9(19) | 8(12) 8(12) 8(12) 8(12) 8(12)
   m=n: 7(15) 7(15) 9(19) 9(19) 9(19) | 8(12) 8(12) 8(12) 8(12) 8(12)
   m>n: 7(15) 7(15) 9(19) 9(19) 9(19) | 8(12) 8(12) 8(12) 8(12) 8(12)

7. Dqdrtic
   m<n: 1260(3036) 1371(3807) 925(2509) 1025(2987) 1305(3827) | 20(25) 16(21) 12(17) 12(17) 13(18)
   m=n: 1260(2626) 1371(2781) 925(1853) 1025(2051) 1305(2611) | 20(25) 16(21) 12(17) 12(17) 13(18)
   m>n: 2521(1260) 1371(2743) 925(1851) 1025(2051) 1305(2611) | 20(25) 16(21) 12(17) 12(17) 13(18)

8. Denschnb
   m<n: 11(28) 12(31) 13(29) 13(29) 14(31) | 8(20) 9(18) 8(11) 8(11) 8(11)
   m=n: 11(25) 12(27) 13(29) 13(29) 14(31) | 7(10) 8(11) 8(11) 8(11) 8(11)
   m>n: 11(25) 12(27) 13(29) 13(29) 14(31) | 7(10) 8(11) 8(11) 8(11) 8(11)

9. GEN-Quartic GQ1
   m<n: 11(27) 11(27) 11(24) 11(24) 11(24) | 9(12) 12(15) 14(17) 14(17) 14(17)
   m=n: 11(24) 11(27) 11(24) 11(24) 11(24) | 9(12) 12(15) 14(17) 14(17) 14(17)
   m>n: 11(24) 11(27) 11(24) 11(24) 11(24) | 9(12) 12(15) 14(17) 14(17) 14(17)

10. Diagonal 8
   m<n: 4(10) 4(10) 4(10) 4(10) 4(10) | 4(7) 4(7) 5(8) 5(8) 5(8)
   m=n: 4(10) 4(10) 4(10) 4(10) 4(10) | 4(7) 4(7) 5(8) 5(8) 5(8)
   m>n: 4(10) 4(10) 4(10) 4(10) 4(10) | 4(7) 4(7) 5(8) 5(8) 5(8)

11. Full Hessian
   m<n: 25(76) 31(160) 17(78) 15(75) 21(121) | 4(8) 3(7) 3(8) 3(9) 3(9)
   m=n: 25(68) 31(93) 17(78) 15(75) 21(121) | 4(8) 3(7) 3(8) 3(9) 3(9)
   m>n: 25(64) 31(93) 17(78) 15(75) 21(121) | 4(8) 3(7) 3(8) 3(9) 3(9)

12. GEN-Powell
   m<n: 85(208) 88(252) 104(307) 119(239) 125(251) | 50(56) 50(56) 50(56) 50(56) 50(56)
   m=n: 109(229) 107(217) 115(231) 119(239) 125(521) | 50(56) 50(56) 50(56) 50(56) 50(56)
   m>n: 101(203) 107(215) 115(231) 119(239) 125(251) | 50(56) 50(56) 50(56) 50(56) 50(56)

13. GEN-Rosen
   m<n: 48(117) 37(130) 39(138) 39(138) 41(145) | 52(66) 53(67) 54(68) 54(68) 54(68)
   m=n: 830(1757) 864(1779) 954(1937) 998(2023) 1052(2131) | 52(66) 53(67) 54(68) 54(68) 54(68)
   m>n: 820(1667) 864(1755) 954(1935) 998(2023) 1052(2131) | 52(66) 53(67) 54(68) 54(68) 54(68)

14. Non-Diagonal
   m<n: 23(63) 44(138) 73(222) 23(63) 23(63) | 11(15) 11(15) 11(15) 11(15) 11(15)
   m=n: 469(1024) 129(302) 373(803) 385(826) 385(826) | 11(15) 11(15) 11(15) 11(15) 11(15)
   m>n: 407(902) 359(777) 373(802) 385(826) 385(826) | 11(15) 11(15) 11(15) 11(15) 11(15)

15. GEN-Wolfe
   m<n: 102(440) 197(544) 206(530) 212(540) 216(560) | 14(17) 16(26) 15(18) 15(18) 16(19)
   m=n: 182(388) 197(400) 206(413) 212(425) 216(433) | 14(17) 16(26) 15(18) 15(18) 16(19)
   m>n: 102(365) 197(395) 206(413) 212(425) 216(433) | 14(17) 16(26) 15(18) 15(18) 16(19)

16. GEN-Strait
   m<n: 9(20) 9(20) 11(23) 11(23) 11(23) | 10(15) 10(15) 10(15) 10(15) 10(15)
   m=n: 9(19) 9(19) 11(23) 11(23) 11(23) | 10(15) 10(15) 10(15) 10(15) 10(15)
   m>n: 9(19) 9(19) 11(23) 11(23) 11(23) | 10(15) 10(15) 10(15) 10(15) 10(15)

17. GEN-Recipe
   m<n: 69(204) 73(267) 81(203) 85(213) 89(223) | 11(14) 11(14) 12(15) 12(15) 13(16)
   m=n: 69(183) 73(187) 81(203) 85(213) 89(223) | 11(14) 11(14) 12(15) 12(15) 13(16)
   m>n: 69(173) 73(183) 81(203) 85(213) 89(223) | 11(14) 11(14) 12(15) 12(15) 13(16)

18. Non-diagonal (Shanno-78)
   m<n: 30(71) 10(24) 102(219) 292(789) 946(2751) | 34(44) 13(19) 17(24) 17(25) 17(25)
   m=n: 30(65) 10(22) 102(206) 292(586) 946(2751) | 34(44) 13(19) 17(24) 17(25) 17(25)
   m>n: 30(65) 10(22) 102(206) 292(586) 946(2751) | 34(44) 13(19) 17(24) 17(25) 17(25)

19. EX-Tridiagonal-2
   m<n: 83(194) 102(203) 102(218) 103(221) 99(212) | 13(26) 19(22) 19(22) 19(22) 19(22)
   m=n: 83(173) 102(207) 102(205) 105(211) 98(198) | 13(25) 19(22) 19(22) 19(22) 19(22)
   m>n: 83(167) 102(205) 102(205) 105(211) 98(198) | 13(16) 19(22) 19(22) 19(22) 19(22)

20. GEN-Beale
   m<n: 656(1597) 586(1625) 509(1350) 535(1517) 565(1607) | 20(23) 20(23) 22(25) 22(25) 22(25)
   m=n: 503(1049) 459(931) 509(1020) 535(1071) 565(1131) | 20(23) 20(23) 22(25) 22(25) 22(25)
   m>n: 433(867) 457(915) 509(1019) 535(1071) 565(1131) | 20(23) 20(23) 22(25) 22(25) 22(25)

21. EX-Block-Diagonal BD2
   m<n: 16(39) 18(48) 18(38) 20(42) 20(42) | 37(120) 69(79) 72(76) 75(79) 78(82)
   m=n: 16(35) 18(48) 18(38) 20(42) 20(42) | 90(145) 69(79) 72(76) 75(79) 78(82)
   m>n: 16(34) 18(38) 18(38) 20(42) 20(42) | 64(68) 66(70) 72(76) 75(79) 78(82)

22. Diagonal 7
   m<n: 4(11) 5(13) 5(13) 5(13) 5(13) | 8(23) 7(12) 8(13) 8(13) 8(13)
   m=n: 4(11) 5(13) 5(13) 5(13) 5(13) | 7(12) 7(12) 8(13) 8(13) 8(13)
   m>n: 4(11) 5(13) 5(13) 5(13) 5(13) | 7(12) 7(12) 8(13) 8(13) 8(13)

23. Cosine (CUTE)
   m<n: 9(21) 10(24) 10(27) 11(28) 13(31) | 22(46) 13(25) 22(27) 75(87) 12(16)
   m=n: 9(20) 10(22) 10(27) 11(28) 13(31) | 20(26) 13(17) 22(27) 75(87) 12(16)
   m>n: 9(20) 10(22) 10(27) 11(28) 13(31) | 20(26) 13(17) 22(27) 75(87) 12(16)

24. EX-Himmelblau
   m<n: 3(7) 4(9) 4(9) 4(9) 4(9) | 8(17) 9(18) 9(18) 9(18) 9(18)
   m=n: 3(7) 4(9) 4(9) 4(9) 4(9) | 8(17) 9(18) 9(18) 9(18) 9(18)
   m>n: 3(7) 4(9) 4(9) 4(9) 4(9) | 8(17) 9(18) 9(18) 9(18) 9(18)

25. Raydan 2
   m<n: 2(5) 2(5) 2(5) 2(5) 2(5) | 5(7) 5(7) 6(8) 6(8) 6(8)
   m=n: 2(5) 2(5) 2(5) 2(5) 2(5) | 5(7) 5(7) 6(8) 6(8) 6(8)
   m>n: 2(5) 2(5) 2(5) 2(5) 2(5) | 5(7) 5(7) 6(8) 6(8) 6(8)

26. Diagonal 6
   m<n: 2(7) 2(7) 2(7) 2(7) 2(7) | 5(7) 5(7) 6(8) 6(8) 6(8)
   m=n: 2(7) 2(7) 2(7) 2(7) 2(7) | 5(7) 5(7) 6(8) 6(8) 6(8)
   m>n: 2(7) 2(7) 2(7) 2(7) 2(7) | 5(7) 5(7) 6(8) 6(8) 6(8)
Table 2. Comparison between the New (Algorithm 4.1) and Shi-Shen (Algorithm 2.1) algorithms using the totals over all test functions. Each cell is NOI(NOF) for n = 12, 36, 360, 1080, 4320.

Case   Shi-Shen: n = 12, 36, 360, 1080, 4320                                    New: n = 12, 36, 360, 1080, 4320
m<n    9243(20286)  3083(8480)   4327(11616)  4636(13045)  5699(16365)  |  412(686)   426(563)   447(573)   499(626)   444(562)
m=n    4289(9493)   12825(26275) 14113(28576) 14885(30095) 16354(34206) |  461(657)   426(549)   447(562)   499(625)   444(562)
m>n    14238(725153) 13550(27428) 14141(28595) 14883(30081) 16354(33936) |  433(553)   423(540)   447(562)   499(633)   444(562)
Total over 26x3 = 78 test functions:
       27770(754932) 29458(62183) 32581(68787) 34404(73221) 38407(84507) |  1306(1896) 1275(1663) 1341(1686) 1497(1875) 1332(1686)
Table 3. Percentage performance of the new proposed algorithm (4.1) against the Shi-Shen algorithm (2.1), taking the Shi-Shen algorithm as 100%, in both NOI and NOF.

n       Case    NOI (%)   NOF (%)
12      m<n     95.50     96.62
12      m=n     89.25     93.08
12      m>n     96.96     99.92
12      Total   95.30     99.75
36      m<n     86.18     93.36
36      m=n     96.68     97.91
36      m>n     96.88     98.03
36      Total   95.67     97.33
360     m<n     89.67     95.07
360     m=n     96.83     98.03
360     m>n     96.83     98.03
360     Total   95.88     97.55
1080    m<n     89.24     95.20
1080    m=n     96.65     97.92
1080    m>n     96.65     97.89
1080    Total   95.65     97.44
4320    m<n     92.21     96.57
4320    m=n     97.29     98.36
4320    m>n     97.29     98.34
4320    Total   96.53     98.01

6. Conclusions and Discussions

In this paper, a new combined gradient-related and VM-algorithm for solving large-scale unconstrained optimization problems is proposed. The new algorithm is a kind of Armijo line search method. The basic idea is to choose a combination of the current gradient and some previous search directions, updated by Al-Bayati's (1991) self-scaling VM-formula, as the new search direction, and to find a step-size by using the Armijo ILS. Using more information at the current iterative step improves the performance of the algorithm and accelerates the gradient-related iteration so that fewer iterations are needed. The gradient-related framework also makes it possible to analyze the global convergence property of the new algorithm. Numerical experiments show that the new algorithm converges faster and is more efficient than the standard Shi-Shen algorithm in many situations, and it is also expected to handle ill-conditioned problems well. Clearly, there is a wide range of improvement percentages against the standard Shi-Shen algorithm; namely, the new algorithm achieves about 53% NOI and 75% NOF improvements over the Shi-Shen algorithm for n = 12. These improvements remain clear for n = 36, 360 and 1080, and finally, for n = 4320 the new algorithm achieves about 64% NOI and 80% NOF improvements against the Shi-Shen (2004) CG-algorithm.
REFERENCES
Al-Bayati, A. (1991). A new family of self-scaling VM-algorithms, Journal of Education and
Science, Mosul University, Iraq, 12, 25-54.
Al-Bayati, A. and Al-Assady, N. (1986). Conjugate Gradient Methods, Technical Research
Report, No. (1/86), School of Computer Studies, Leeds University, U.K.
Al-Bayati, A. Y. and Latif, I. S. (2008). A new three-term preconditioned gradient memory
algorithm for nonlinear optimization problems, American Journal of Applied Science,
Science Publication, New York, 4(2), 81-87.
Al-Bayati A. Y., Salim, A. J. and Abbo, K.K. (2009). Two versions of CG-algorithms based on
conjugacy conditions for unconstrained optimization, American Journal of Economic and
Business administration, Science Publication, USA, 1(2), 97-104.
Bertsekas, D. (1982). Constrained Optimization and Lagrange Multiplier Methods, Academic
Press, New York, USA.
Bunday, B. (1984). Basic Optimization Methods, Edward Arnold, Bedford Square, London, U.K.
Cantrell, J. (1969). On the relation between the memory gradient and the Fletcher-Reeves
method, JOTA, 4(1), 67-71.
Crowder, H. and Wolfe, P. (1972). Linear convergence of the conjugate gradient methods, IBM
Journal of Research and Development, 16, 431-433.
Cragg, E. and Levy, A. (1969). Study on a super memory gradient method for the minimization
of function, JOTA, 4(3), 191-205.
Dai, Y. and Yuan, Y. (1999). Nonlinear Conjugate Gradient Methods, Shanghai Science and
Technology Press, Shanghai.
Dai, Y. and Yuan, Y. (1996). Convergence properties of the Fletcher-Reeves method, IMA. J. of
Numerical Analysis, 16, 155-164.
Fletcher, R. and Reeves, C. (1964). Function minimization by conjugate gradients, Computer
Journal, 7, 149-154.
Grippo, L. and Lucidi, S. (1997). A globally convergent version of the Polak-Ribiere
conjugate gradient method, Mathematical Programming, 78 (3), 375–391.
Hestenes, M. and Stiefel, E. (1952). Methods of conjugate gradients for solving linear Systems,
J. Res. Nat Bureau stand., 29, 409-430.
Luenberger, D. (1989). Linear and Nonlinear Programming, 2nd Edition, Addison-Wesley,
Reading, MA, USA.
Miele, A. and Cantrell, J. (1969). Study on a memory gradient method for the minimization of
function, JOTA, 3(6), 459-470.
Nazareth, L. (1977). A Conjugate direction algorithm for unconstrained minimization without
line searches, JOTA, 23, 373-387.
Nocedal, J. (1980). Updating quasi-Newton matrices with limited storage, Mathematics of
Computation, 35, 773-782.
Nocedal, J. and Wright, S. (1999). Numerical Optimization, Springer Series in Operations
Research, Springer Verlag, New York, USA.
Nocedal, J. (2005). Unconstrained Optimization Test Functions. Research Institute for
Informatics, Center for Advanced Modeling and Optimization, Bucharest, Romania.
Polak, E. and Ribiere, G. (1969). Note sur la convergence de méthodes de directions
conjuguées, Revue Française d'Informatique et de Recherche Opérationnelle, 16, 35-43.
Shi, Z. and Shen, J. (2004). A gradient-related algorithm with inexact line searches, J.
Computational and Applied Mathematics, 170, 349-370.
Wolfe, M. and Viazminsky, C. (1976). Super memory descent methods for unconstrained
minimization, JOTA., 18(4), 455-468.
Zhang J., Xiao, Y. and Wei, Z. (2009). Nonlinear conjugate gradient methods with sufficient
descent condition for large scale unconstrained optimization, Mathematical Problems in
Engineering, DOI: 10.1155/2009/243290.
Zoutendijk, G. (1970). Nonlinear Programming: Computational Methods in Integer and
Nonlinear Programming, J. Abadie, ed., North-Holland, Amsterdam.
APPENDIX

All the test functions used in this paper are taken from the general literature; see Nocedal (1980, 2005).

1. Extended Tridiagonal-1 Function:
$$f(x)=\sum_{i=1}^{n/2}\left[(x_{2i-1}+x_{2i}-3)^2+(x_{2i-1}-x_{2i}+1)^4\right],\qquad x_0=[2,2,\dots,2].$$

2. Extended Three Exponential Terms Function:
$$f(x)=\sum_{i=1}^{n/2}\left[\exp(x_{2i-1}+3x_{2i}-0.1)+\exp(x_{2i-1}-3x_{2i}-0.1)+\exp(-x_{2i-1}-0.1)\right],\qquad x_0=[0.1,0.1,\dots,0.1].$$

3. Diagonal 5 Function (Matrix Rom):
$$f(x)=\sum_{i=1}^{n}\log\left(\exp(x_i)+\exp(-x_i)\right),\qquad x_0=[1.1,\dots,1.1].$$

4. Extended Freudenstein and Roth Function:
$$f(x)=\sum_{i=1}^{n/2}\left[\left(-13+x_{2i-1}+((5-x_{2i})x_{2i}-2)x_{2i}\right)^2+\left(-29+x_{2i-1}+((x_{2i}+1)x_{2i}-14)x_{2i}\right)^2\right],\qquad x_0=[0.5,-2,\dots,0.5,-2].$$

5. Generalized Tridiagonal-1 Function:
$$f(x)=\sum_{i=1}^{n-1}\left[(x_i+x_{i+1}-3)^2+(x_i-x_{i+1}+1)^4\right],\qquad x_0=[2,2,\dots,2].$$

6. Diagonal 4 Function:
$$f(x)=\sum_{i=1}^{n/2}\frac{1}{2}\left(x_{2i-1}^2+c\,x_{2i}^2\right),\qquad c=100,\qquad x_0=[1,1,\dots,1].$$

7. Dqdrtic Function (CUTE):
$$f(x)=\sum_{i=1}^{n-2}\left(x_i^2+c\,x_{i+1}^2+d\,x_{i+2}^2\right),\qquad c=100,\ d=100,\qquad x_0=[3,3,\dots,3].$$

8. Extended Denschnb Function (CUTE):
$$f(x)=\sum_{i=1}^{n/2}\left[(x_{2i-1}-2)^2+(x_{2i-1}-2)^2x_{2i}^2+(x_{2i}+1)^2\right],\qquad x_0=[0.1,0.1,\dots,0.1].$$

9. Generalized Quartic Function GQ1:
$$f(x)=\sum_{i=1}^{n-1}\left[x_i^2+(x_{i+1}+x_i^2)^2\right],\qquad x_0=[1,1,\dots,1].$$

10. Diagonal 8 Function:
$$f(x)=\sum_{i=1}^{n}\left(x_i\exp(x_i)-2x_i-x_i^2\right),\qquad x_0=[1,1,\dots,1].$$

11. Full Hessian Function:
$$f(x)=\left(\sum_{i=1}^{n}x_i\right)^2+\sum_{i=1}^{n}\left(x_i\exp(x_i)-2x_i-x_i^2\right),\qquad x_0=[1,1,\dots,1].$$

12. Generalized Powell Function:
$$f(x)=\sum_{i=1}^{n/3}\left[3-\frac{1}{1+(x_{3i-2}-x_{3i-1})^2}-\sin\!\left(\frac{\pi x_{3i-1}x_{3i}}{2}\right)-\exp\!\left(-\left(\frac{x_{3i-2}+x_{3i}}{x_{3i-1}}-2\right)^2\right)\right],\qquad x_0=[0,1,2,\dots,0,1,2].$$

13. Generalized Rosenbrock (Banana) Function:
$$f(x)=\sum_{i=1}^{n/2}\left[100\,(x_{2i}-x_{2i-1}^2)^2+(1-x_{2i-1})^2\right],\qquad x_0=[-1.2,1,\dots,-1.2,1].$$

14. Generalized Non-diagonal Function:
$$f(x)=\sum_{i=2}^{n}\left[100\,(x_1-x_i^2)^2+(1-x_i)^2\right],\qquad x_0=[-1,\dots,-1].$$

15. Generalized Wolfe Function:
$$f(x)=\left(-x_1\left(3-\tfrac{x_1}{2}\right)+2x_2-1\right)^2+\sum_{i=2}^{n-1}\left(x_{i-1}-x_i\left(3-\tfrac{x_i}{2}\right)+2x_{i+1}-1\right)^2+\left(x_{n-1}-x_n\left(3-\tfrac{x_n}{2}\right)-1\right)^2,\qquad x_0=[-1,\dots,-1].$$

16. Generalized Strait Function:
$$f(x)=\sum_{i=1}^{n/2}\left[(x_{2i-1}^2-x_{2i})^2+100\,(1-x_{2i-1})^2\right],\qquad x_0=[2,\dots,2].$$

17. Generalized Recipe Function:
$$f(x)=\sum_{i=1}^{n/3}\left[(x_{3i-2}-5)^2+x_{3i-1}^2+\frac{x_{3i}^2}{(x_{3i-1}-x_{3i-2})^2}\right],\qquad x_0=[2,5,1,\dots,2,5,1].$$

18. Non-diagonal (Shanno-78) Function (CUTE):
$$f(x)=(x_1-1)^2+\sum_{i=2}^{n}100\,(x_1-x_{i-1}^2)^2,\qquad x_0=[-1,-1,\dots,-1].$$

19. Extended Tridiagonal-2 Function:
$$f(x)=\sum_{i=1}^{n-1}\left[(x_i x_{i+1}-1)^2+c\,(x_i+1)(x_{i+1}+1)\right],\qquad c=0.1,\qquad x_0=[1,1,\dots,1].$$

20. Generalized Beale Function:
$$f(x)=\sum_{i=1}^{n/2}\left[\left(1.5-x_{2i-1}(1-x_{2i})\right)^2+\left(2.25-x_{2i-1}(1-x_{2i}^2)\right)^2+\left(2.625-x_{2i-1}(1-x_{2i}^3)\right)^2\right],\qquad x_0=[1,1,\dots,1,1].$$

21. Extended Block-Diagonal BD2 Function:
$$f(x)=\sum_{i=1}^{n/2}\left[(x_{2i-1}^2+x_{2i}^2-2)^2+\left(\exp(x_{2i-1}-1)-x_{2i}\right)^2\right],\qquad x_0=[1.5,2,\dots,1.5,2].$$

22. Diagonal 7 Function:
$$f(x)=\sum_{i=1}^{n}\left(\exp(x_i)-2x_i-x_i^2\right),\qquad x_0=[1,1,\dots,1,1].$$

23. Cosine Function (CUTE):
$$f(x)=\sum_{i=1}^{n-1}\cos\left(x_i^2-0.5\,x_{i+1}\right),\qquad x_0=[1,1,\dots,1,1].$$

24. Extended Himmelblau Function:
$$f(x)=\sum_{i=1}^{n/2}\left[(x_{2i-1}^2+x_{2i}-11)^2+(x_{2i-1}+x_{2i}^2-7)^2\right],\qquad x_0=[1.1,1.1,\dots,1.1,1.1].$$

25. Raydan 2 Function:
$$f(x)=\sum_{i=1}^{n}\left(\exp(x_i)-x_i\right),\qquad x_0=[1,1,\dots,1,1].$$

26. Diagonal 6 Function:
$$f(x)=\sum_{i=1}^{n}\left(\exp(x_i)-(1+x_i)\right),\qquad x_0=[1,1,\dots,1,1].$$
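To show how the appendix functions translate into code, here are two of them (illustrative only, with our own function names) together with their listed starting points; NumPy is assumed.

```python
import numpy as np

def generalized_rosenbrock(x):
    """Appendix function 13 (pairwise generalized Rosenbrock)."""
    odd, even = x[0::2], x[1::2]
    return float(np.sum(100.0 * (even - odd ** 2) ** 2 + (1.0 - odd) ** 2))

def extended_himmelblau(x):
    """Appendix function 24 (Extended Himmelblau)."""
    odd, even = x[0::2], x[1::2]
    return float(np.sum((odd ** 2 + even - 11.0) ** 2 + (odd + even ** 2 - 7.0) ** 2))

# Starting points as listed in the appendix (n = 10 for illustration):
x0_rosen = np.tile([-1.2, 1.0], 5)
x0_himmel = np.full(10, 1.1)
print(generalized_rosenbrock(x0_rosen), extended_himmelblau(x0_himmel))
```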