Globally convergent modified Perry's conjugate gradient method

Keywords: Unconstrained optimization; Conjugate gradient method; Sufficient descent property; Line search; Global convergence
Applied Mathematics and Computation xxx (2012) xxx–xxx
Ioannis E. Livieris ⇑, Panagiotis Pintelas
Department of Mathematics, University of Patras, GR 265-00, Greece
Educational Software Development Laboratory, Department of Mathematics, University of Patras, GR 265-00, Greece
0096-3003/$ - see front matter © 2012 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2012.02.076
⇑ Corresponding author at: Department of Mathematics, University of Patras, GR 265-00, Greece. E-mail address: [email protected] (I.E. Livieris).
Please cite this article in press as: I.E. Livieris, P. Pintelas, Globally convergent modified Perry's conjugate gradient method, Appl. Math. Comput. (2012), doi:10.1016/j.amc.2012.02.076
Abstract

Conjugate gradient methods are probably the most famous iterative methods for solving large scale optimization problems in scientific and engineering computation, characterized by the simplicity of their iteration and their low memory requirements. In this paper, we propose a new conjugate gradient method which is based on the MBFGS secant condition by modifying Perry's method. Our proposed method ensures sufficient descent independent of the accuracy of the line search and it is globally convergent under some assumptions. Numerical experiments are also presented.

© 2012 Elsevier Inc. All rights reserved.
1. Introduction
Let us consider the unconstrained optimization problem
\min f(x), \quad x \in \mathbb{R}^n,  (1.1)

where $f : \mathbb{R}^n \to \mathbb{R}$ is a smooth nonlinear function and its gradient is denoted by $g(x) = \nabla f(x)$. Iterative methods are usually applied to deal with this problem by generating a sequence of points $\{x_k\}$, starting from an initial point $x_0 \in \mathbb{R}^n$, using the recurrence

x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, \ldots,  (1.2)
where $\alpha_k > 0$ is the stepsize obtained by some line search and $d_k$ is the search direction. Conjugate gradient methods are probably the most famous iterative methods for solving the optimization problem (1.1), especially when the dimension is large, due to the simplicity of their iteration and their low memory requirements. These methods define the search direction by

d_k = \begin{cases} -g_0, & \text{if } k = 0, \\ -g_k + \beta_k d_{k-1}, & \text{otherwise}, \end{cases}  (1.3)

where $g_k = g(x_k)$. Conjugate gradient methods differ in their way of defining the scalar parameter $\beta_k$. In the literature, several choices for $\beta_k$ have been proposed, giving rise to distinct conjugate gradient methods. The most well known conjugate gradient methods are the Hestenes–Stiefel (HS) method [19], the Fletcher–Reeves (FR) method [11], the Polak–Ribière (PR) method [27] and Perry's (P) method [26]. The update parameters of these methods are respectively specified as follows:

\beta_k^{HS} = \frac{g_k^T y_{k-1}}{y_{k-1}^T d_{k-1}}, \quad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PR} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{P} = \frac{g_k^T (y_{k-1} - s_{k-1})}{y_{k-1}^T d_{k-1}},
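As a concrete illustration, the four classical update parameters can be evaluated directly from the current and previous gradients, the previous direction and the previous step; the following NumPy sketch is ours (function and variable names are not from the paper):

```python
import numpy as np

def classical_betas(g_new, g_old, d_old, s_old):
    """Classical CG update parameters with y_{k-1} = g_k - g_{k-1}."""
    y = g_new - g_old
    beta_hs = (g_new @ y) / (y @ d_old)            # Hestenes-Stiefel
    beta_fr = (g_new @ g_new) / (g_old @ g_old)    # Fletcher-Reeves
    beta_pr = (g_new @ y) / (g_old @ g_old)        # Polak-Ribiere
    beta_p  = (g_new @ (y - s_old)) / (y @ d_old)  # Perry
    return beta_hs, beta_fr, beta_pr, beta_p
```

Note that the HS and Perry denominators require $y_{k-1}^T d_{k-1} \ne 0$, which the Wolfe line search guarantees.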
where $s_{k-1} = x_k - x_{k-1}$, $y_{k-1} = g_k - g_{k-1}$ and $\|\cdot\|$ denotes the Euclidean norm.
In the literature, much effort has been devoted to the global convergence analysis of conjugate gradient methods, which is usually based on mild conditions referring to the Lipschitz and boundedness assumptions and is closely connected with the sufficient descent property

g_k^T d_k \le -c \|g_k\|^2,  (1.4)
where $c > 0$ is a positive constant. The global convergence of the FR method was established using both exact [41] and inexact [1] line searches on general functions. The PR and the HS methods may be trapped and cycle infinitely without approaching a solution, which implies that neither is globally convergent for general functions in certain circumstances (see Powell's counterexample [29]). Nevertheless, both methods are preferred to the FR method for their numerical performance, since they have the remarkable property of performing a restart after encountering a bad direction. Motivated by Powell's work [30], Gilbert and Nocedal [15] conducted an elegant analysis and established that the PR method is globally convergent if $\beta_k^{PR}$ is restricted to be nonnegative under the sufficient descent condition (1.4). This theoretical result is very interesting and this globalization technique has been extended to other conjugate gradient methods, see for instance [6,18]. Perry's method is based on a quasi-Newton philosophy since it satisfies the secant equation and has been considered to be one of the most efficient conjugate gradient methods in the context of unconstrained minimization [2,35]. However, a global convergence result for general functions has not been established yet. We refer to the books [6,25], the survey paper [18] and the references therein for the numerical performance and the convergence properties of conjugate gradient methods. During the last decade, much effort has been devoted to developing new conjugate gradient methods which are not only globally convergent for general functions but also computationally superior to classical methods; these fall into two classes.
The first class utilizes second order information to accelerate conjugate gradient methods based on modified secant equations (see [12,13,20,21,34]). Dai and Liao [5] proposed a conjugate gradient method by exploiting a new conjugacy condition based on the standard secant equation. Motivated by their work, Zhou and Zhang [40] proposed a modification of the Dai–Liao method which is based on the MBFGS condition [20,21]. Li et al. [22] proposed some conjugate gradient methods which are based on a modified secant equation [34]. In more recent works, Ford et al. [14] proposed a multi-step conjugate gradient method that is based on the multi-step quasi-Newton methods proposed in [12,13]. Under proper conditions, these methods are globally convergent and sometimes their numerical performance is superior to classical conjugate gradient methods. However, these methods are not guaranteed to generate descent directions; therefore restarts are employed in their analysis and implementation in order to guarantee convergence.
The second class focuses on developing conjugate gradient methods which ensure the sufficient descent property (1.4). Independently, Dai and Yuan [7] and Hager and Zhang [17] proposed conjugate gradient methods, obtained by modifying the update parameter $\beta_k$, which generate descent directions under the Wolfe line search conditions. Moreover, an important feature of their works is that they established the global convergence of their methods for general functions.
Quite recently, similar to the spectral gradient method [2], Zhang et al. [39] considered a different approach: to modify the search direction such that it satisfies the sufficient descent condition $g_k^T d_k = -\|g_k\|^2$. More specifically, they proposed a modification of the FR method in the following way:

d_k = -\left(1 + \beta_k^{FR} \frac{g_k^T d_{k-1}}{\|g_k\|^2}\right) g_k + \beta_k^{FR} d_{k-1}.  (1.5)

Their method reduces to the classical FR method in case the line search is exact. An attractive property of their method is that $g_k^T d_k = -\|g_k\|^2$ holds independently of the performed line search and the choice of $\beta_k$. Moreover, if $\beta_k$ in Eq. (1.5) is specified by another existing conjugate gradient formula, we obtain the corresponding modified conjugate gradient method. Along this line, many related conjugate gradient methods have been extensively studied [4,8,10,23,37,38] which possess global convergence for general functions and are also computationally competitive with classical methods.
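The key algebraic fact behind this family of directions is that the descent identity $g_k^T d_k = -\|g_k\|^2$ holds for any scalar $\beta$, since the $\beta$-dependent terms cancel exactly. A minimal sketch (names are ours) that can be checked numerically:

```python
import numpy as np

def zhang_direction(g, d_prev, beta):
    """Direction of Zhang et al.: d = -(1 + beta * g.d_prev / ||g||^2) g + beta * d_prev.
    Expanding g.T @ d cancels the beta terms and leaves exactly -||g||^2."""
    return -(1.0 + beta * (g @ d_prev) / (g @ g)) * g + beta * d_prev
```

Any choice of `beta` (FR, PR, or otherwise) yields a descent direction under this construction.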
In this work, we propose a conjugate gradient method which can be regarded as a modified version of Perry's method [26]. Our proposed method ensures sufficient descent using any line search and it is based on the MBFGS secant equation [20,21]. Under suitable conditions, we establish the global convergence of our proposed method provided that the line search satisfies the Wolfe conditions. Our numerical results demonstrate that our proposed method is promising.
The remainder of this paper is organized as follows: In Section 2, we present our proposed conjugate gradient method and in Section 3, we present its global convergence analysis. The numerical experiments are reported in Section 4 using the performance profiles of Dolan and Moré [9]. Finally, Section 5 presents our concluding remarks and our proposals for future research.
2. Modified Perry’s conjugate gradient method
Li and Fukushima [20,21] made a modification on the standard BFGS method and developed a modified BFGS (MBFGS) method which is globally convergent without a convexity assumption on the objective function $f$. Their method satisfies the following secant condition

B_k s_{k-1} = z_{k-1},  (2.1)
where $B_k$ is the Hessian approximation and

z_{k-1} = y_{k-1} + h_k \|g_{k-1}\|^r s_{k-1},  (2.2)

where $r > 0$ and $h_k > 0$ ($h_k$ is slightly different from that of [20], where $r = 1$) is defined by

h_k = t + \max\left\{ -\frac{s_{k-1}^T y_{k-1}}{\|s_{k-1}\|^2}, 0 \right\} \|g_{k-1}\|^{-r},  (2.3)
where $t$ is a positive constant. By taking these into consideration, we propose a modification of Perry's formula $\beta_k^{P}$ as follows:

\beta_k^{MP} = \frac{g_k^T (z_{k-1} - s_{k-1})}{z_{k-1}^T d_{k-1}},  (2.4)

where $z_{k-1}$ is defined by (2.2) and (2.3). Notice that, for a general function, the MBFGS secant Eqs. (2.1)–(2.3) ensure that the denominator $z_{k-1}^T d_{k-1}$ in (2.4) is always positive, independent of the performed line search; thus formula (2.4) is well defined. In order to guarantee that our proposed method generates descent directions, we exploit the idea of the modified FR method [39]. More specifically, let the search direction be defined by
d_k = -\left(1 + \beta_k^{MP} \frac{g_k^T d_{k-1}}{\|g_k\|^2}\right) g_k + \beta_k^{MP} d_{k-1}.  (2.5)

It is easy to see that the condition

g_k^T d_k = -\|g_k\|^2,  (2.6)

holds, using any line search. At this point, we present our proposed modified Perry's conjugate gradient algorithm (MP-CG).
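Putting (2.2)–(2.5) together, one direction computation can be sketched as follows (a simplified illustration with our own helper names; $s_{k-1}$ is assumed to come from a positive step along $d_{k-1}$, so that $z_{k-1}^T d_{k-1} > 0$):

```python
import numpy as np

def mp_direction(g, g_prev, d_prev, s_prev, t=1e-4, r=1.0):
    """Modified Perry direction: h_k by (2.3), z_{k-1} by (2.2),
    beta_k^MP by (2.4) and d_k by (2.5)."""
    y = g - g_prev
    gn = np.linalg.norm(g_prev)
    h = t + max(-(s_prev @ y) / (s_prev @ s_prev), 0.0) * gn ** (-r)
    z = y + h * gn ** r * s_prev                 # MBFGS vector (2.2)
    beta = (g @ (z - s_prev)) / (z @ d_prev)     # (2.4)
    return -(1.0 + beta * (g @ d_prev) / (g @ g)) * g + beta * d_prev
```

Whatever line search produced the step, the returned direction satisfies property (2.6).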
Algorithm 2.1 (MP-CG).
Step 1: Initiate $x_0 \in \mathbb{R}^n$ and $0 < \sigma_1 < \sigma_2 < 1$; set $k = 0$.
Step 2: If $\|g_k\| = 0$, then terminate; otherwise go to the next step.
Step 3: Compute the descent direction $d_k$ by Eq. (2.5).
Step 4: Determine a stepsize $\alpha_k$ using the Wolfe line search:

f(x_k + \alpha_k d_k) - f(x_k) \le \sigma_1 \alpha_k g_k^T d_k,  (2.7)
g(x_k + \alpha_k d_k)^T d_k \ge \sigma_2 g_k^T d_k.  (2.8)

Step 5: Let $x_{k+1} = x_k + \alpha_k d_k$.
Step 6: Set $k = k + 1$ and go to Step 2.
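A minimal end-to-end sketch of Algorithm 2.1 in NumPy follows. This is our own illustrative implementation, not the Fortran code used in Section 4; the Wolfe conditions (2.7)–(2.8) are enforced by a simple bracketing/bisection line search:

```python
import numpy as np

def wolfe_search(f, grad, x, d, sigma1=1e-4, sigma2=0.9, max_bisect=60):
    """Simple bracketing/bisection line search enforcing (2.7)-(2.8)."""
    lo, hi, alpha = 0.0, np.inf, 1.0
    fx, gd = f(x), grad(x) @ d
    for _ in range(max_bisect):
        if f(x + alpha * d) > fx + sigma1 * alpha * gd:   # (2.7) violated
            hi = alpha
        elif grad(x + alpha * d) @ d < sigma2 * gd:       # (2.8) violated
            lo = alpha
        else:
            return alpha
        alpha = 0.5 * (lo + hi) if np.isfinite(hi) else 2.0 * alpha
    return alpha

def mp_cg(f, grad, x0, t=1e-4, r=1.0, tol=1e-6, max_iter=500):
    """Sketch of Algorithm 2.1 (MP-CG); parameter names follow the paper."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                                # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g, np.inf) <= tol:              # Step 2
            break
        alpha = wolfe_search(f, grad, x, d)               # Step 4
        s = alpha * d
        x, g_prev = x + s, g                              # Step 5
        g = grad(x)
        if np.linalg.norm(g, np.inf) <= tol:
            break
        y = g - g_prev
        gn = np.linalg.norm(g_prev)
        h = t + max(-(s @ y) / (s @ s), 0.0) * gn ** (-r) # (2.3)
        z = y + h * gn ** r * s                           # (2.2)
        beta = (g @ (z - s)) / (z @ d)                    # (2.4)
        d = -(1.0 + beta * (g @ d) / (g @ g)) * g + beta * d  # (2.5)
    return x
```

On a strongly convex quadratic the iterates converge rapidly, in line with Theorem 3.1 below for uniformly convex functions.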
3. Global convergence analysis
In order to present the global convergence analysis, we make the following assumptions on the objective function $f$, which have often been used in the literature [18,6,25] to establish the global convergence of conjugate gradient methods.
Assumption 1. The level set $\mathcal{L} = \{x \in \mathbb{R}^n \mid f(x) \le f(x_0)\}$ is bounded; namely, there exists a positive constant $B > 0$ such that

\|x\| \le B, \quad \forall x \in \mathcal{L}.  (3.1)
Assumption 2. In some neighborhood $\mathcal{N}$ of $\mathcal{L}$, $f$ is differentiable and its gradient $g$ is Lipschitz continuous, i.e., there exists a positive constant $L > 0$ such that

\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in \mathcal{N}.  (3.2)
Since $\{f_k\}$ is a decreasing sequence, it is clear that the sequence $\{x_k\}$ generated by Algorithm MP-CG is contained in $\mathcal{L}$. In addition, it follows directly from Assumptions 1 and 2 that there exists a positive constant $c > 0$ such that

\|g(x)\| \le c, \quad \forall x \in \mathcal{L}.  (3.3)

In Algorithm 2.1, since the line search satisfies the Wolfe conditions (2.7) and (2.8), it immediately follows that $y_{k-1}^T s_{k-1} > 0$ for all $k > 0$; thus $z_{k-1}$ reduces to

z_{k-1} = y_{k-1} + t \|g_{k-1}\|^r s_{k-1}.  (3.4)
Utilizing this with Assumption 2 and relation (3.3), we can easily obtain the following lemma whose proof is omitted.
Lemma 3.1. Suppose that Assumptions 1 and 2 hold. Let fxkg be generated by Algorithm MP-CG, then we have
\|z_{k-1}\| \le (L + t c^r) \|s_{k-1}\|.  (3.5)
Any conjugate gradient method implemented with a line search that satisfies the Wolfe conditions (2.7) and (2.8) possesses the following property, called the Zoutendijk condition [41], which is often used to prove global convergence of conjugate gradient methods.
Lemma 3.2. Suppose that Assumptions 1 and 2 hold. Consider any conjugate gradient method of the form (1.2) where $d_k$ satisfies $g_k^T d_k < 0$ and $\alpha_k$ satisfies the Wolfe line search conditions (2.7) and (2.8); then

\sum_{k \ge 0} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < +\infty.  (3.6)
Clearly, by substituting (2.6) in Zoutendijk's condition (3.6), we obtain the following inequality

\sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} < +\infty,  (3.7)

which is useful in showing the global convergence of our proposed method. In the following, we establish the global convergence of Algorithm MP-CG for uniformly convex functions.
Theorem 3.1. Suppose that Assumptions 1 and 2 hold and $f$ is uniformly convex; namely, there exists a positive constant $c > 0$ such that

(g(x) - g(y))^T (x - y) \ge c \|x - y\|^2, \quad \forall x, y \in \mathcal{L}.  (3.8)

If $\{x_k\}$ is obtained by Algorithm MP-CG, then we have either $g_k = 0$ for some $k$ or

\liminf_{k \to \infty} \|g_k\| = 0.
Proof. By the convexity assumption (3.8) and Eq. (3.4), we have

z_{k-1}^T d_{k-1} = y_{k-1}^T d_{k-1} + t \|g_{k-1}\|^r s_{k-1}^T d_{k-1} \ge c \alpha_{k-1} \|d_{k-1}\|^2.  (3.9)
Combining the previous inequality with relations (3.1), (3.2) and (3.5), we obtain

|\beta_k^{MP}| = \left| \frac{g_k^T (z_{k-1} - s_{k-1})}{z_{k-1}^T d_{k-1}} \right| \le \frac{\|g_k\| \left( \|z_{k-1}\| + \|s_{k-1}\| \right)}{|z_{k-1}^T d_{k-1}|} \le \frac{L + t c^r + 1}{c} \cdot \frac{\|g_k\|}{\|d_{k-1}\|}.
Therefore, by the definition of the search direction in Eq. (2.5), we have

\|d_k\| \le \|g_k\| + |\beta_k^{MP}| \frac{|g_k^T d_{k-1}|}{\|g_k\|^2} \|g_k\| + |\beta_k^{MP}| \|d_{k-1}\| \le \|g_k\| + 2 |\beta_k^{MP}| \|d_{k-1}\| \le \left(1 + 2\,\frac{L + t c^r + 1}{c}\right) \|g_k\|.

Inserting this upper bound for $d_k$ in Eq. (3.7) yields $\sum_{k \ge 0} \|g_k\|^2 < \infty$, which completes the proof. □
Subsequently, in order to ensure global convergence for general functions, similarly to Gilbert and Nocedal [15], we restrict the update parameter to be nonnegative, namely

\beta_k^{MP+} = \max\left\{ \frac{g_k^T (z_{k-1} - s_{k-1})}{z_{k-1}^T d_{k-1}}, 0 \right\}.  (3.10)

Moreover, for simplicity, in case the update parameter $\beta_k$ in Algorithm 2.1 is computed by Eq. (3.10), we refer to it as Algorithm MP+-CG.
Next, we establish the global convergence of Algorithm MP+-CG for general nonlinear functions. The following lemma shows that $\beta_k^{MP}$ will be small when the step $s_{k-1}$ is small, which implies that Algorithm MP-CG prevents the inefficient behavior of the jamming phenomenon [28], present in the FR method, from occurring. This property is similar to, but slightly different from, Property (*), which was derived by Gilbert and Nocedal [15].

Lemma 3.3. Suppose that Assumptions 1 and 2 hold. Let $\{x_k\}$ and $\{d_k\}$ be generated by Algorithm MP-CG. If there exists a positive constant $\mu > 0$ such that

\|g_k\| \ge \mu, \quad \text{for all } k \ge 0,  (3.11)

then there exist constants $b > 1$ and $\lambda > 0$ such that for all $k \ge 1$
|\beta_k^{MP}| \le b  (3.12)

and

\|s_{k-1}\| \le \lambda \;\Rightarrow\; |\beta_k^{MP}| \le \frac{1}{b}.  (3.13)
Proof. From Eqs. (2.6), (2.8) and (3.4), we have

z_{k-1}^T d_{k-1} = y_{k-1}^T d_{k-1} + t \|g_{k-1}\|^r s_{k-1}^T d_{k-1} \ge (\sigma_2 - 1)\, g_{k-1}^T d_{k-1} \ge (1 - \sigma_2) \|g_{k-1}\|^2.
Using this with (3.3), (3.5) and (3.11), we obtain

|\beta_k^{MP}| = \left| \frac{g_k^T (z_{k-1} - s_{k-1})}{z_{k-1}^T d_{k-1}} \right| \le \frac{\|g_k\| \left( \|z_{k-1}\| + \|s_{k-1}\| \right)}{|z_{k-1}^T d_{k-1}|} \le \frac{c (L + t c^r + 1)}{(1 - \sigma_2) \mu^2} \|s_{k-1}\| =: C \|s_{k-1}\|.  (3.14)

Therefore, by setting $b := \max\{2, 2CB\}$ and $\lambda := 1/(Cb)$, we have that relations (3.12) and (3.13) hold, which completes our proof. □
From the definition of $\beta_k^{MP+}$ in Eq. (3.10), it immediately follows that $|\beta_k^{MP+}| \le |\beta_k^{MP}|$ for all $k > 0$. Therefore, we can obtain the same result for Algorithm MP+-CG as in Lemma 3.3.
Subsequently, we present a lemma for the search direction which shows that, asymptotically, the search directions change slowly.
Lemma 3.4. Suppose that Assumptions 1 and 2 hold. Let $\{x_k\}$ and $\{d_k\}$ be generated by Algorithm MP+-CG. If there exists a positive constant $\mu > 0$ such that Eq. (3.11) holds, then $d_k \ne 0$ and

\sum_{k \ge 1} \|w_k - w_{k-1}\|^2 < \infty,

where $w_k = d_k / \|d_k\|$.
Proof. Firstly, note that $d_k \ne 0$, for otherwise (2.6) would imply $g_k = 0$. Therefore, $w_k$ is well defined. Now, let us define

r_k := \frac{t_k}{\|d_k\|} \quad \text{and} \quad \delta_k := \beta_k^{MP+} \frac{\|d_{k-1}\|}{\|d_k\|},  (3.15)

where

t_k = -\left(1 + \beta_k^{MP+} \frac{g_k^T d_{k-1}}{\|g_k\|^2}\right) g_k.

Then, by Eq. (2.5), we have

w_k = r_k + \delta_k w_{k-1}.  (3.16)

Using this relation with the identity $\|w_k\| = \|w_{k-1}\| = 1$, we obtain

\|r_k\| = \|w_k - \delta_k w_{k-1}\| = \|w_{k-1} - \delta_k w_k\|.

Moreover, using this with the condition $\delta_k \ge 0$ and the triangle inequality, we get

\|w_k - w_{k-1}\| \le (1 + \delta_k) \|w_k - w_{k-1}\| = \|(w_k - \delta_k w_{k-1}) - (w_{k-1} - \delta_k w_k)\| \le 2 \|r_k\|.  (3.17)
Moreover, we have

\|t_k\| = \left\| \left(1 + \beta_k^{MP+} \frac{g_k^T d_{k-1}}{\|g_k\|^2}\right) g_k \right\| \le \left(1 + \left| \frac{g_k^T (z_{k-1} - s_{k-1})}{z_{k-1}^T d_{k-1}} \right| \frac{|g_k^T d_{k-1}|}{\|g_k\|^2} \right) \|g_k\| \le \|g_k\| + \left( \|z_{k-1}\| + \|s_{k-1}\| \right) \frac{|g_k^T d_{k-1}|}{|z_{k-1}^T d_{k-1}|}

\le c + (L + t c^r + 1) B \max\left\{ \frac{\sigma_2}{1 - \sigma_2}, 1 \right\} =: D.
Therefore, using this relation with (3.7), we obtain

\sum_{k \ge 1} \|r_k\|^2 \le \sum_{k \ge 1} \frac{\|t_k\|^2}{\|d_k\|^2} = \sum_{k \ge 1} \frac{\|t_k\|^2}{\|g_k\|^4} \cdot \frac{\|g_k\|^4}{\|d_k\|^2} \le \frac{D^2}{\mu^4} \sum_{k \ge 1} \frac{\|g_k\|^4}{\|d_k\|^2} < +\infty,

which together with (3.17) completes the proof. □
Next, by making use of Lemmas 3.3 and 3.4, we establish the global convergence theorem for Algorithm MP+-CG. The proof of the following theorem is similar to that of Theorem 3.2 in [17].
Theorem 3.2. Suppose that Assumptions 1 and 2 hold. If $\{x_k\}$ is obtained by Algorithm MP+-CG, then we have

\liminf_{k \to \infty} \|g_k\| = 0.

Proof. We proceed by contradiction. Suppose that there exists a positive constant $\mu > 0$ such that for all $k \ge 0$

\|g_k\| \ge \mu.
The proof is divided into the following two steps.

Step I. A bound on the steps $s_k$. Let $\Delta$ be a positive integer, chosen large enough that

\Delta \ge 4BC,

where $B$ and $C$ are defined in (3.1) and (3.14), respectively. For any $l > k \ge k_0$ with $l - k \le \Delta$, following the same proof as Case II of Theorem 3.2 in [17], we get

\sum_{j=k}^{l-1} \|s_j\| < 2B.
Step II. A bound on the directions $d_l$ determined by Eq. (2.5). It follows from (2.5) that

d_l = -g_l + \beta_l^{MP+} \left( I - \frac{g_l g_l^T}{\|g_l\|^2} \right) d_{l-1}.  (3.21)

Since $g_l$ is orthogonal to $\left( I - \frac{g_l g_l^T}{\|g_l\|^2} \right) d_{l-1}$ and $I - \frac{g_l g_l^T}{\|g_l\|^2}$ is a projection matrix, we have from (3.1), (3.3), (3.14) and (3.21) that

\|d_l\|^2 \le \|g_l\|^2 + |\beta_l^{MP+}|^2 \|d_{l-1}\|^2 \le c^2 + C^2 \|s_{l-1}\|^2 \|d_{l-1}\|^2.

Now, the remaining argument is standard, in the same way as Case III of Theorem 3.2 in [17]; thus we omit it. This completes the proof. □
4. Numerical experiments
In this section, we report some numerical results in order to compare the performance of our proposed conjugate gradient method MP+-CG with that of the CG-DESCENT method [17] and the PR+ method [15].
The implementation code was written in Fortran and compiled with ifort on a PC (2.66 GHz Quad-Core processor, 4 GB RAM) running the Linux operating system. The CG-DESCENT code is coauthored by Hager and Zhang and was obtained from Hager's web page,1 and the PR+ code is coauthored by Liu, Nocedal and Waltz and was obtained from Nocedal's web page.2 In our experiments, we use the condition $\|g_k\|_\infty \le 10^{-6}$ as stopping criterion. For our proposed method, we set parameter $t = 10^{-4}$ and $r = 1$ if $\|g_k\| \ge 1$; otherwise $r = 3$, as in [40]. We selected 111 problems from the CUTEr [3] library that have also been tested by Hager and Zhang [17]. The problem ncb20 was excluded from our experimental analysis because it gives the ''insufficient space'' error when evaluated by any tested algorithm. Table 1 reports the numerical results, giving the problem names and their dimensions, the total number of iterations (Iter), the total number of function evaluations (FcEv), the total number of gradient evaluations (GradEv) and the CPU time (Time) in seconds. Moreover, ''Failed'' means that the method failed to converge with the prescribed accuracy, i.e. $\|g_k\|_\infty \le 10^{-6}$. For convenience, we give the meanings of these methods in Table 1.
• ''CG-DESCENT'' stands for the CG-DESCENT method [17] implemented with the Wolfe line search conditions (2.7) and (2.8) with $\sigma_1 = 0.1$ and $\sigma_2 = 0.9$. The other parameters are set as default.
• ''PR+'' stands for the PR+ method of Gilbert and Nocedal [15] implemented with the line search proposed in [24].
• ''MP+'' stands for Algorithm MP+-CG implemented with the same line search as CG-DESCENT.
All algorithms were evaluated using the performance profiles proposed by Dolan and Moré [9], which provide a wealth of information such as solver efficiency, robustness and probability of success in compact form. The use of performance profiles eliminates the influence of a small number of problems on the benchmarking process and the sensitivity of results associated with the ranking of solvers [9]. The performance profile plots the fraction $P$ of problems for which any given method is within a factor $\tau$ of the best solver. The left side of each plot shows the percentage of the problems for which a method is the fastest (efficiency), while the right side gives the percentage of the problems that were successfully solved by each method (robustness). Figs. 1–4 show the performance profiles relative to function evaluations, gradient evaluations, number of iterations and CPU time, respectively.
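For readers who wish to reproduce the comparison methodology, a performance profile in the sense of Dolan and Moré can be computed from a cost matrix as in the following sketch (our own minimal implementation; `np.inf` marks a solver failure):

```python
import numpy as np

def performance_profile(costs, taus):
    """costs[i, j]: cost (e.g. CPU time) of solver j on problem i,
    np.inf on failure.  Returns P[s, j] = fraction of problems that
    solver j solves within a factor taus[s] of the best solver."""
    best = costs.min(axis=1, keepdims=True)   # best cost per problem
    ratios = costs / best                     # performance ratios
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])
```

Plotting the rows of the returned array against $\tau$ (often on a log scale, as in Figs. 1–4) gives the familiar profile curves: the value at $\tau = 1$ is the efficiency and the limiting value for large $\tau$ is the robustness of each solver.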
Fig. 1. Log10 scaled performance profiles of conjugate gradient methods CG-DESCENT, PR+ and MP+ based on function evaluations.
Fig. 2. Log10 scaled performance profiles of conjugate gradient methods CG-DESCENT, PR+ and MP+ based on gradient evaluations.
Clearly, Figs. 1–4 show that our proposed method MP+ exhibits the best overall performance, since it has the highest probability of being the optimal solver, followed by CG-DESCENT, relative to all performance metrics. PR+ exhibits the worst performance, solving only 73.8% of the test problems successfully, while MP+ and CG-DESCENT solve 96.3% of the test problems successfully. Additionally, it is worth noticing that MP+ solves about 51.4% and 55.9% of the test problems with the least number of function evaluations and gradient evaluations, respectively, while CG-DESCENT solves about 31.5% and 44.1% of the test problems in the same situation. Moreover, Figs. 3 and 4 show that MP+ has the best performance with respect to the number of iterations and CPU time, since it corresponds to the top curves.
Fig. 3. Log10 scaled performance profiles of conjugate gradient methods CG-DESCENT, PR+ and MP+ based on number of iterations.
Fig. 4. Log10 scaled performance profiles of conjugate gradient methods CG-DESCENT, PR+ and MP+ based on CPU time.
5. Conclusions & future research
In this paper, we proposed a conjugate gradient method which is based on the MBFGS secant condition by modifying Perry's method. An important property of our proposed method is that it ensures sufficient descent using any line search. Under proper conditions, we established that our proposed method is globally convergent for general functions under the Wolfe line search. The presented numerical results illustrated the efficiency and robustness of our proposed method.
Our future work will concentrate on studying the convergence properties and numerical performance of our proposed method using different inexact line searches [16,31–33,36].
References
[1] M. Al-Baali, Descent property and global convergence of the Fletcher–Reeves method with inexact line search, IMA Journal of Numerical Analysis 5 (1985) 121–124.
[2] E.G. Birgin, J.M. Martínez, A spectral conjugate gradient method for unconstrained optimization, Applied Mathematics and Optimization 43 (1999) 117–128.
[3] I. Bongartz, A. Conn, N. Gould, P. Toint, CUTE: constrained and unconstrained testing environments, ACM Transactions on Mathematical Software 21 (1995) 123–160.
[4] W. Chen, Q. Liu, Sufficient descent nonlinear conjugate gradient methods with conjugacy condition, Numerical Algorithms 53 (2010) 113–131.
[5] Y.H. Dai, L.Z. Liao, New conjugacy conditions and related nonlinear conjugate gradient methods, Applied Mathematics and Optimization 43 (2001) 87–101.
[6] Y.H. Dai, Y.X. Yuan, Nonlinear Conjugate Gradient Methods, Shanghai Scientific and Technical Publishers, Shanghai, 2000.
[7] Y.H. Dai, Y.X. Yuan, A nonlinear conjugate gradient method with a strong global convergence property, SIAM Journal on Optimization 10 (2000) 177–182.
[8] Z. Dai, B.S. Tian, Global convergence of some modified PRP nonlinear conjugate gradient methods, Optimization Letters (2010) 1–16.
[9] E. Dolan, J.J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming 91 (2002) 201–213.
[10] S.Q. Du, Y.Y. Chen, Global convergence of a modified spectral FR conjugate gradient method, Applied Mathematics and Computation 202 (2) (2008) 766–770.
[11] R. Fletcher, C.M. Reeves, Function minimization by conjugate gradients, Computer Journal 7 (1964) 149–154.
[12] J.A. Ford, I.A. Moghrabi, Multi-step quasi-Newton methods for optimization, Journal of Computational and Applied Mathematics 50 (1994) 305–323.
[13] J.A. Ford, I.A. Moghrabi, Using function-values in multi-step quasi-Newton methods, Journal of Computational and Applied Mathematics 66 (1996) 201–211.
[14] J.A. Ford, Y. Narushima, H. Yabe, Multi-step nonlinear conjugate gradient methods for unconstrained minimization, Computational Optimization and Applications 40 (2008) 191–216.
[15] J.C. Gilbert, J. Nocedal, Global convergence properties of conjugate gradient methods for optimization, SIAM Journal on Optimization 2 (1) (1992) 21–42.
[16] L. Grippo, S. Lucidi, A globally convergent version of the Polak–Ribière conjugate gradient method, Mathematical Programming 78 (3) (1997) 375–391.
[17] W.W. Hager, H. Zhang, A new conjugate gradient method with guaranteed descent and an efficient line search, SIAM Journal on Optimization 16 (2005) 170–192.
[18] W.W. Hager, H. Zhang, A survey of nonlinear conjugate gradient methods, Pacific Journal of Optimization 2 (2006) 35–58.
[19] M.R. Hestenes, E. Stiefel, Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards 49 (1952) 409–436.
[20] D.H. Li, M. Fukushima, A modified BFGS method and its global convergence in nonconvex minimization, Journal of Computational and Applied Mathematics 129 (2001) 15–35.
[21] D.H. Li, M. Fukushima, On the global convergence of the BFGS method for nonconvex unconstrained optimization problems, SIAM Journal on Optimization 11 (2001) 1054–1064.
[22] G. Li, C. Tang, Z. Wei, New conjugacy condition and related new conjugate gradient methods for unconstrained optimization, Journal of Computational and Applied Mathematics 202 (2007) 523–539.
[23] A. Lu, H. Liu, X. Zheng, W. Cong, A variant spectral-type FR conjugate gradient method and its global convergence, Applied Mathematics and Computation 217 (12) (2011) 5547–5552.
[24] J.J. Moré, D. Thuente, Line search algorithms with guaranteed sufficient decrease, ACM Transactions on Mathematical Software 20 (1994) 286–307.
[25] J. Nocedal, S.J. Wright, Numerical Optimization, Springer-Verlag, New York, 1999.
[26] A. Perry, A modified conjugate gradient algorithm, Operations Research 26 (1978) 1073–1078.
[27] E. Polak, G. Ribière, Note sur la convergence de méthodes de directions conjuguées, Revue Française d'Informatique et de Recherche Opérationnelle 16 (1969) 35–43.
[28] M.J.D. Powell, Restart procedures for the conjugate gradient method, Mathematical Programming 12 (1977) 241–254.
[29] M.J.D. Powell, Nonconvex minimization calculations and the conjugate gradient method, Lecture Notes in Mathematics, vol. 1066, Springer-Verlag, Berlin, 1984, pp. 122–141.
[30] M.J.D. Powell, Convergence properties of algorithms for nonlinear optimization, SIAM Review 28 (1986) 487–500.
[31] Z.J. Shi, J. Shen, Convergence of PRP method with new nonmonotone line search, Applied Mathematics and Computation 181 (1) (2006) 423–431.
[32] Z.J. Shi, S. Wang, Z. Xu, The convergence of conjugate gradient method with nonmonotone line search, Applied Mathematics and Computation 217 (5) (2010) 1921–1932.
[33] C.Y. Wang, Y.Y. Chen, S.Q. Du, Further insight into the Shamanskii modification of Newton method, Applied Mathematics and Computation 180 (2006) 46–52.
[34] Z. Wei, G. Yu, G. Yuan, Z. Lian, The superlinear convergence of a modified BFGS-type method for unconstrained optimization, Computational Optimization and Applications 29 (2004) 315–332.
[35] G. Yu, L. Guan, W. Chen, Spectral conjugate gradient methods with sufficient descent property for large-scale unconstrained optimization, Optimization Methods and Software 23 (2) (2008) 275–293.
[36] G. Yu, L. Guan, Z. Wei, Globally convergent Polak–Ribière–Polyak conjugate gradient methods under a modified Wolfe line search, Applied Mathematics and Computation 215 (8) (2009) 3082–3090.
[37] L. Zhang, Two modified Dai–Yuan nonlinear conjugate gradient methods, Numerical Algorithms 50 (2009) 1–16.
[38] L. Zhang, W. Zhou, Two descent hybrid conjugate gradient methods for optimization, Journal of Computational and Applied Mathematics 216 (2008) 164–251.
[39] L. Zhang, W. Zhou, D. Li, Global convergence of a modified Fletcher–Reeves conjugate gradient method with Armijo-type line search, Numerische Mathematik 104 (2006) 561–572.
[40] W. Zhou, L. Zhang, A nonlinear conjugate gradient method based on the MBFGS secant condition, Optimization Methods and Software 21 (5) (2006) 707–714.
[41] G. Zoutendijk, Nonlinear programming, in: J. Abadie (Ed.), Integer and Nonlinear Programming, North-Holland, Amsterdam, 1970, pp. 37–86.