A GENERAL DESCENT FRAMEWORK FOR THE MONOTONE VARIATIONAL INEQUALITY PROBLEM

JIA HAO WU, MICHAEL FLORIAN, PATRICE MARCOTTE
Centre de recherche sur les transports, Université de Montréal
1993

Abstract

We present a framework for descent algorithms that solve the monotone variational inequality problem $VIP_v$, which consists in finding a solution $v \in \Omega$ which satisfies $s(v)^T(u - v) \ge 0$ for all $u \in \Omega$. This unified framework includes, as special cases, some well known iterative methods and equivalent optimization formulations. A descent method is developed for an equivalent general optimization formulation and a proof of its convergence is given. Based on this unified algorithmic framework, we show that a variant of the descent method where each subproblem is only solved approximately is globally convergent under certain conditions.

Key words: Variational Inequalities, Descent Methods, Optimization.

1 Introduction

In this paper we consider the variational inequality problem (VIP) that consists in finding a vector $v$ in $\Omega$ such that

$(VIP_v)$ $\quad s(v)^T(u - v) \ge 0$ for all $u \in \Omega$, (1)

where $u$, $v$ are vectors in $R^n$ and $s$ is a mapping from the closed convex set $\Omega \subseteq R^n$ into $R^n$. Through the linear change of variables

$v = \Delta h, \qquad S(h) = \Delta^T s(\Delta h),$ (2)

$VIP_v$ can be put in the equivalent form

$(VIP_h)$ $\quad s(\Delta h)^T \Delta (g - h) \ge 0$ for all $g \in \Lambda$, (3)

where $\Lambda$ denotes the corresponding feasible set in the $h$-variables. It is well known that, when $\nabla s(v)$ is symmetric and positive definite on $\Omega$, $VIP_v$ corresponds to the optimality condition of a convex program; if $\nabla s(v)$ does not satisfy this condition, then there is usually no such convex program. Both $VIP_v$ and $VIP_h$ have been widely used to model network equilibrium problems arising in transportation systems analysis (Smith, 1979, 1983; Dafermos, 1980; Bertsekas and Gafni, 1982; Florian and Spiess, 1984; Florian, 1986; Marcotte and Guélat, 1988) and regional science (Florian and Los, 1982; Nagurney, 1987; Dafermos and Nagurney, 1987). The study of variational inequalities draws on nonlinear optimization methodology, the solution of nonlinear equations, and fixed point theory. Two main approaches to solving $VIP_v$ or $VIP_h$ have emerged in the recent literature: the iterative approach, where the solution is obtained by solving a sequence of simpler problems (optimization problems, linear or nonlinear complementarity problems, as in Pang and Chan, 1982), and the solution of an equivalent optimization problem, where the variational inequality is solved as a mathematical program (differentiable or nondifferentiable, as in Nguyen and Dupuis, 1984; Marcotte, 1985; Dussault and Marcotte, 1985). The first approach draws on the theory of the iterative solution of nonlinear systems of equations (Ortega and Rheinboldt, 1970) and includes algorithms based on SOR, Gauss-Seidel, Newton, linearized Jacobi and projection methods.

Among others, Bertsekas and Gafni (1982) present a projection method for $VIP_h$, and Dafermos (1983) presents a general iterative scheme that includes the projection method for $VIP_v$ as a special case. Nguyen and Dupuis (1984) propose a cutting plane approach for $VIP_v$. Marcotte (1985) proposes an algorithm for the solution of an equivalent nondifferentiable optimization formulation of $VIP_v$ for monotone $s$, while Marcotte and Dussault (1989) solve an equivalent optimization formulation of $VIP_v$ by solving a sequence of linear programs. Fukushima (1986) proposes a modified projection method for $VIP_v$, where each iteration requires a projection onto a supporting half space rather than onto the feasible set itself. Harker (1988) developed an acceleration method for the Jacobi and projection algorithms for solving monotone $VIP_v$ which is reminiscent of the PARTAN variant of primal optimization algorithms. See Harker and Pang

(1990) for a comprehensive survey of the subject. A significant contribution to the reformulation of $VIP_v$ as an equivalent differentiable optimization problem is that of Fukushima (1989), who develops a continuously differentiable but nonconvex equivalent optimization formulation for $VIP_v$ with asymmetric $\nabla s(v)$, by using the projection operator. We shall refer to this formulation as the projection based optimization model (PBOM). Motivated by Fukushima's work, Wu, Florian and Marcotte (1990) consider a more general reformulation for $VIP_h$, which we refer to as the linearized Jacobi based optimization model (LJBOM). Pang (1990) studies the Newton method for solving a system of B-differentiable equations and applies the resulting method to nonlinear complementarity, variational inequality and nonlinear programming problems. In this paper, we consider only $VIP_v$ and present a general equivalent optimization formulation for it. Based on this formulation, a general descent framework is developed and a proof of its convergence is given. We then show that the symmetrized Newton method, quasi-Newton methods, the linearized Jacobi method, the projection method, as well as others, are various realizations of this general algorithmic framework. In addition, we show that a variant of the descent method, where each subproblem is only solved approximately, is globally convergent under certain conditions. The theoretical development parallels that of Migdalas (1990) for the solution of convex optimization problems by a regularized Frank-Wolfe (1956) algorithm. Although we only consider $VIP_v$ in this paper, it is important to note that the results may be extended to $VIP_h$, as analysed in Wu, Florian and Marcotte (1990). This paper is organized as follows. In section 2, we give some definitions and assumptions. In section 3, a general equivalent optimization formulation is presented for $VIP_v$, together with some of its properties. Then, a general descent method is developed for this formulation. Section 4 is devoted to an analysis of the relationship with some existing algorithms, while, in section 5, we consider a possible way of applying an approximation strategy to the general framework in order to derive an approximate algorithm. Section 6 provides some conclusions.

2 Definitions and assumptions

In this section, we give some definitions and assumptions that are used in our analysis. The mapping $s : \Omega \to R^n$ is said to be monotone on $\Omega$ if

$(s(v') - s(v''))^T(v' - v'') \ge 0$ for all $v', v'' \in \Omega$; (4)

strictly monotone on $\Omega$ if strict inequality holds in (4) for $v' \ne v''$; and strongly monotone on $\Omega$ if there is a positive $\mu$ such that

$(s(v') - s(v''))^T(v' - v'') \ge \mu \|v' - v''\|_2^2$ for all $v', v'' \in \Omega$. (5)

It is well known (Ortega and Rheinboldt, 1970) that, if $s$ is continuously differentiable and the Jacobian matrix $\nabla s(v)$ is positive definite on $\Omega$, with

$(v' - v'')^T \nabla s(v)(v' - v'') \ge \mu \|v' - v''\|_2^2$ for all $v, v', v'' \in \Omega$, $v' \ne v''$, (6)

then $s$ is strongly monotone on $\Omega$ with modulus $\mu$ equal to the minimum eigenvalue of the symmetric part of $\nabla s(v)$ over $\Omega$, which we denote $\lambda_{\min}$. Note that $\nabla s(v)$ need not be symmetric in (6).

The mapping $s$ is Lipschitz continuous on $\Omega$ if there is a constant $L_s$ such that

$\|s(v') - s(v'')\|_2 \le L_s \|v' - v''\|_2$ for all $v', v'' \in \Omega$. (7)

Unless otherwise stated, $s$ will be assumed strongly monotone on $\Omega$ throughout this paper.

3 A general equivalent optimization formulation

In this section, we give a general equivalent optimization formulation for $VIP_v$ and discuss its properties. For $\alpha$ positive, consider the optimization problem

$\min_{v \in \Omega} f(v) = -s(v)^T(H(v) - v) - \frac{1}{\alpha}\varphi(H(v), v)$ (8)

where $H(v) \in \Gamma(v) = \arg\min_{u \in \Omega} \{ s(v)^T(u - v) + \frac{1}{\alpha}\varphi(u, v) \}$ and $\varphi(u, v) : \Omega \times \Omega \to R$ has the following properties:

it is continuously differentiable on $\Omega \times \Omega$; (9)

it is nonnegative on $\Omega \times \Omega$; (10)

it is uniformly strictly convex in $u$ on $\Omega$, (11)
i.e., there exists a positive number $\beta$ such that $\varphi(u'', v) - \varphi(u', v) \ge \nabla_u \varphi(u', v)^T(u'' - u') + \beta\|u'' - u'\|^2$ for all $u'$, $u''$, $v$ in $\Omega$;

$\varphi(u, v) = 0 \iff u = v$; (12)

$\nabla_u \varphi(u, v)$ is uniformly Lipschitz continuous on $\Omega$, (13)
i.e., there exists a constant $M$ such that $\|\nabla_u \varphi(u'', v) - \nabla_u \varphi(u', v)\|_2 \le M\|u'' - u'\|_2$ for all $u'$, $u''$, $v$ in $\Omega$.

Note that if $\alpha = \infty$, then $\Gamma$ may be a point-to-set mapping. The following possible forms of $\varphi(u, v)$ satisfy (9)-(13):

$\varphi(u, v) = \frac{1}{2}\|u - v\|_2^2$ (14)

$\varphi(u, v) = \frac{1}{2}(u - v)^T B (u - v)$ (15)

$\varphi(u, v) = \frac{1}{2}(u - v)^T B(v)(u - v)$ (16)

where $B$ and $B(v)$ are symmetric positive definite matrices. Note that $\varphi(u, v)$ is a proximity measure between $u$ and $v$.

Now, consider the following subproblem of (8) for a fixed $v$:

$\min_{u \in \Omega} \psi(u, v) = s(v)^T(u - v) + \frac{1}{\alpha}\varphi(u, v)$. (17)

It is clear that program (17) is strongly convex, and hence well defined, for any positive finite $\alpha$. Furthermore, $\psi(u, v)$ is also a proximity measure between $u$ and $v$. If $\alpha = \infty$, then the solution of the linear program (17) may be non-unique.

Proposition 3.1 Any $H(v)$ in $\Gamma(v)$ satisfies

$[s(v) + \frac{1}{\alpha}\nabla_u \varphi(H(v), v)]^T(u - H(v)) \ge 0$ for all $u \in \Omega$. (19)

Proof. The variational inequality (19) expresses the first order optimality condition of the convex (linear if $\alpha = \infty$) program (17). □

Proposition 3.2 $\Gamma$ admits a fixed point $v \in \Omega$, i.e., $v \in \Gamma(v)$.

Proof. Since $\Gamma$ is an upper semicontinuous point-to-set mapping (continuous if $\alpha$ is finite), the result follows from Kakutani's fixed point theorem (1941). □

Proposition 3.3 A vector $v$ solves the variational inequality problem $VIP_v$ if and only if it is a fixed point of the mapping $\Gamma$, i.e., $v \in \Gamma(v)$.

Proof. See Lemma 1 in Dafermos (1983). □

Proposition 3.4 We have:

$f(v)$ is Lipschitz continuous on $\Omega$; (20)

$\partial f(v) = \mathrm{Co}\{\, s(v) - \nabla s(v)^T(u - v) - \frac{1}{\alpha}\nabla_v \varphi(u, v) : u \in \Gamma(v) \,\}$, (21)

where $\Gamma(v)$ is the solution set of subproblem (17); and

$f(v)$ is differentiable at $v$ if $0 < \alpha < \infty$. (22)

Proof. See Clarke (1975) for (20) and (21). For (22), note that $0 < \alpha < \infty$ ensures uniqueness of $H(v)$, and differentiability then follows from Danskin's theorem (1966). □

Theorem 3.1 $f(v) \ge 0$ for all $v \in \Omega$, and $f(v) = 0$, i.e., $v$ solves (8), if and only if $v$ solves $VIP_v$.

Proof. Since $f(v)$ is the negative of the optimal value of subproblem (17), which is less than or equal to zero (take $u = v$), it is clear that $f(v) \ge 0$ for every $v \in \Omega$. For the second part, suppose first that $v$ solves $VIP_v$. Then, by Proposition 3.3, we have $v \in \Gamma(v)$, which implies $f(v) = 0$.

Conversely, suppose that $f(v) = 0$. It then follows from the definition of $f$ that

$s(v)^T(H(v) - v) = -\frac{1}{\alpha}\varphi(H(v), v) \le 0$. (23)

Also, letting $u = v$ in (19), rearranging terms and using the uniform strict convexity (11), we obtain

$s(v)^T(H(v) - v) \le -\frac{1}{\alpha}\nabla_u \varphi(H(v), v)^T(H(v) - v) \le -\frac{1}{\alpha}\varphi(H(v), v) - \frac{\beta}{\alpha}\|H(v) - v\|_2^2$. (24)

Combining (23) and (24) yields $\|H(v) - v\| = 0$, i.e., $v = H(v)$. By Proposition 3.3, $v$ is then a solution of $VIP_v$. □

Proposition 3.5 The function $f$ has bounded level sets.

Proof. Let $v^*$ be the solution of $VIP_v$ and set $u = v^*$ in (19). We obtain

$[s(v) + \frac{1}{\alpha}\nabla_u \varphi(H(v), v)]^T(H(v) - v^*) \le 0$. (25)

However, we also have

$s(v^*)^T(v^* - H(v)) \le 0$. (26)

Adding (25) and (26), we obtain

$[\frac{1}{\alpha}\nabla_u \varphi(H(v), v) + s(v) - s(v^*)]^T(H(v) - v^*) \le 0$,

which can be rewritten as

$[\frac{1}{\alpha}\nabla_u \varphi(H(v), v) + s(v) - s(v^*)]^T(H(v) - v) + [\frac{1}{\alpha}\nabla_u \varphi(H(v), v) + s(v) - s(v^*)]^T(v - v^*) \le 0$

or

$(s(v) - s(v^*))^T(v - v^*) \le -\frac{1}{\alpha}\nabla_u \varphi(H(v), v)^T(H(v) - v) - (s(v) - s(v^*))^T(H(v) - v) - \frac{1}{\alpha}\nabla_u \varphi(H(v), v)^T(v - v^*)$.

This implies in turn

$\mu\|v - v^*\|_2^2 \le \frac{1}{\alpha}[\varphi(v, v) - \varphi(H(v), v)] + L_s\|v - v^*\|_2 \cdot \|H(v) - v\|_2 + \frac{1}{\alpha}\|v - v^*\|_2 \cdot \|\nabla_u \varphi(H(v), v) - \nabla_u \varphi(v, v)\|_2$

(since $\nabla_u \varphi(v, v) = 0$)

$\le (L_s + \frac{M}{\alpha})\|v - v^*\|_2 \cdot \|H(v) - v\|_2$

(since $\varphi(v, v) = 0$ and $\varphi(H(v), v) \ge 0$). Hence

$\|v - v^*\|_2 \le \frac{L_s + M/\alpha}{\mu}\|H(v) - v\|_2$. (27)

We also have, from (19) with $u = v$,

$[s(v) + \frac{1}{\alpha}\nabla_u \varphi(H(v), v)]^T(v - H(v)) \ge 0$,

which implies

$\frac{1}{\alpha}\nabla_u \varphi(H(v), v)^T(H(v) - v) \le -s(v)^T(H(v) - v)$,

$\frac{1}{\alpha}[\varphi(H(v), v) - \varphi(v, v) + \beta\|H(v) - v\|_2^2] \le -s(v)^T(H(v) - v)$,

and

$f(v) \ge \frac{\beta}{\alpha}\|H(v) - v\|_2^2$ (since $\varphi(v, v) = 0$). (28)

Grouping (27) and (28) yields

$f(v) \ge \frac{\beta}{\alpha}\,\frac{\mu^2}{(L_s + M/\alpha)^2}\,\|v - v^*\|_2^2$,

which implies in turn the desired result. □
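To fix ideas, here is a minimal numerical sketch (our own illustration; the data are hypothetical and not taken from the paper): with $\varphi$ of the form (14) and $\Omega$ a box, $H(v)$ is simply the Euclidean projection of $v - \alpha s(v)$ onto $\Omega$, and the gap function $f$ of (8) is nonnegative, as Theorem 3.1 asserts.

```python
import numpy as np

# Hypothetical test problem: Omega = [0, 2]^2, phi(u, v) = 0.5 ||u - v||^2
# (form (14)), and an affine map s(v) = A v + b whose Jacobian A is
# asymmetric but positive definite.
lo, hi, alpha = 0.0, 2.0, 0.5
A = np.array([[4.0, 1.0], [-1.0, 3.0]])
b = np.array([-2.0, -3.0])

def s(v):
    return A @ v + b

def H(v):
    # Subproblem (17) with form (14): H(v) = proj_Omega(v - alpha * s(v)).
    return np.clip(v - alpha * s(v), lo, hi)

def f(v):
    # Gap function (8): minus the optimal value of subproblem (17).
    u = H(v)
    return -(s(v) @ (u - v)) - (0.5 / alpha) * np.dot(u - v, u - v)

v = np.array([1.0, 1.0])
assert f(v) >= 0.0   # Theorem 3.1: f is nonnegative on Omega
```

At $v = (1, 1)$ this evaluates $H(v)$ by a single clipping operation; the same construction underlies the projection-type realizations of section 4.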

Proposition 3.6 Assume that $s$ is strongly monotone on $\Omega$ and that $\nabla_u \varphi(u, v)^T(u - v) = -\nabla_v \varphi(u, v)^T(u - v)$. Suppose that $v$ is not a solution of $VIP_v$. Then there exists $u \in \Gamma(v)$ such that $d = u - v$ is a descent direction for $f$ at $v$, i.e., $f'(v; d) < 0$. More precisely,

$f'(v; d) \le -f(v)$ if $\alpha = \infty$; (29)

$f'(v; d) \le -\lambda_{\min}\|d\|_2^2$ if $0 < \alpha < \infty$. (30)

Proof. Case 1 ($\alpha = \infty$): see Marcotte (1985).

Case 2 ($0 < \alpha < \infty$): we have

$f'(v; d) = -(u - v)^T\nabla s(v)(u - v) + [s(v) - \frac{1}{\alpha}\nabla_v \varphi(u, v)]^T(u - v)$
$\le -\lambda_{\min}\|d\|_2^2 + [s(v) + \frac{1}{\alpha}\nabla_u \varphi(u, v)]^T(u - v)$
$= -\lambda_{\min}\|d\|_2^2 - [s(v) + \frac{1}{\alpha}\nabla_u \varphi(u, v)]^T(v - u)$
$\le -\lambda_{\min}\|d\|_2^2$ by (19). □

If the form (14) or (15) of $\varphi(u, v)$ is used, then the second condition of Proposition 3.6 is satisfied.

Proposition 3.7 Assume that $s$ is strongly monotone on $\Omega$ and, if the form (16) of $\varphi(u, v)$ is used, that the matrix $\nabla B(v)(H(v) - v)$ is positive definite on $\Omega$ with minimum eigenvalue $\nu_{\min}$. If $v$ is not a solution of $VIP_v$, then there exists $u \in \Gamma(v)$ such that $d = u - v$ is a descent direction for $f$ at $v$, i.e., $f'(v; d) < 0$. More precisely:

$f'(v; d) \le -f(v)$ if $\alpha = \infty$; (31)

$f'(v; d) \le -(\lambda_{\min} + \frac{\nu_{\min}}{2\alpha})\|d\|_2^2$ if $0 < \alpha < \infty$. (32)

Proof. Case 1 ($\alpha = \infty$): see Marcotte (1985).

Case 2 ($0 < \alpha < \infty$): we have

$f'(v; d) = (u - v)^T[s(v) - (\nabla s(v) - \frac{B(v)}{\alpha})^T(u - v)] - \frac{1}{2\alpha}(u - v)^T[\nabla B(v)(u - v)](u - v)$
$= [s(v) + \frac{B(v)}{\alpha}(u - v)]^T(u - v) - (u - v)^T\nabla s(v)(u - v)$

$\quad - \frac{1}{2\alpha}(u - v)^T[\nabla B(v)(u - v)](u - v)$
$\le -(u - v)^T\nabla s(v)(u - v) - \frac{1}{2\alpha}(u - v)^T[\nabla B(v)(u - v)](u - v)$ by (19)
$\le -(\lambda_{\min} + \frac{\nu_{\min}}{2\alpha})\|d\|_2^2$. □

In practice, we may expect the form (16) to provide a steeper descent direction if the matrix $B(v)$ is suitably chosen.

Proposition 3.8 Assume that $\nabla s(v)$ is positive definite on $\Omega$ and that $\nabla_u \varphi(u, v) = -\nabla_v \varphi(u, v)$. Then the only stationary point of $f$ is the unique solution of $VIP_v$; i.e.,

$f'(v^*; v - v^*) \ge 0$ for all $v \in \Omega$

implies that $v^*$ is the unique solution of $VIP_v$.

Proof. First we prove that any stationary point of $f$ is a solution of $VIP_v$. Suppose that $f'(v^*; v - v^*) \ge 0$ holds for all $v \in \Omega$. Then

$\sup_{u \in \Gamma(v^*)} (v - v^*)^T[s(v^*) - \nabla s(v^*)^T(u - v^*) - \frac{1}{\alpha}\nabla_v \varphi(u, v^*)] \ge 0$.

Letting $v = u$ and rearranging, we obtain, from (19),

$(u - v^*)^T\nabla s(v^*)(u - v^*) \le [s(v^*) + \frac{1}{\alpha}\nabla_u \varphi(u, v^*)]^T(u - v^*) \le 0$,

which implies $u = v^*$, since $\nabla s(v)$ is positive definite on $\Omega$. By Proposition 3.3, we conclude that $v^*$ is a solution of $VIP_v$. It is well known, however, that if $s$ is strongly monotone on $\Omega$, then $VIP_v$ has a unique solution. Suppose now that there are two stationary points. Then, as in the first part of the proof, both must be solutions of $VIP_v$, which is a contradiction. □

Proposition 3.9 Assume that $s$ is monotone on $\Omega$ and $0 < \alpha < \infty$. If $v^*$ is a solution of $VIP_v$, then $v^*$ is a stationary point of $f$ over $\Omega$.

Proof. Since $0 < \alpha < \infty$ and (13) is satisfied, subproblem (17) is well defined and its solution is unique, i.e., $u = H(v^*) = v^*$. We have, for every $v \in \Omega$,

$f'(v^*; v - v^*) = (v - v^*)^T[s(v^*) - \nabla s(v^*)^T(u - v^*) - \frac{1}{\alpha}\nabla_v \varphi(u, v^*)] = s(v^*)^T(v - v^*)$.

Since $s(v^*)^T(v - v^*) \ge 0$, we have $f'(v^*; v - v^*) \ge 0$. □
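The descent property can also be checked numerically. A minimal sketch on the hypothetical box problem used earlier (our own illustration, with $\varphi$ of form (14)): a one-sided finite difference of $f$ along $d = H(v) - v$ stays below the bound $-\lambda_{\min}\|d\|_2^2$ of (30).

```python
import numpy as np

# Hypothetical test problem: Omega = [0, 2]^2, phi of form (14), alpha = 0.5.
lo, hi, alpha = 0.0, 2.0, 0.5
A = np.array([[4.0, 1.0], [-1.0, 3.0]])
b = np.array([-2.0, -3.0])

# lambda_min: smallest eigenvalue of the symmetric part of the Jacobian A.
lam_min = np.linalg.eigvalsh((A + A.T) / 2).min()

def s(v): return A @ v + b
def H(v): return np.clip(v - alpha * s(v), lo, hi)
def f(v):
    u = H(v)
    return -(s(v) @ (u - v)) - (0.5 / alpha) * np.dot(u - v, u - v)

v = np.array([1.5, 0.5])
d = H(v) - v
t = 1e-7
# One-sided difference approximates the directional derivative f'(v; d).
slope = (f(v + t * d) - f(v)) / t
assert slope <= -lam_min * np.dot(d, d) + 1e-4   # bound (30)
```

Here $\lambda_{\min} = 3$ and $\|d\|_2^2 = 4.5$, so the bound allows any slope below $-13.5$; the observed slope is steeper, as Proposition 3.6 only gives an upper estimate.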

Proposition 3.10 Assume that $s$ is strongly monotone on $\Omega$, $0 < \alpha < \infty$, the form (16) of $\varphi(u, v)$ is used, and $\nabla B(v)(H(v) - v)$ is positive semidefinite on $\Omega$. Then $v^*$ is a solution of $VIP_v$ if and only if $v^*$ is a stationary point of $f$ over $\Omega$.

Proof. In this case, subproblem (17) is well defined and its solution is unique. Suppose that $v^*$ is a solution of $VIP_v$. Then $H(v^*) = v^*$ and

$f'(v^*; v - v^*) = (v - v^*)^T[s(v^*) - (\nabla s(v^*) - \frac{B(v^*)}{\alpha})^T(H(v^*) - v^*) - \frac{1}{2\alpha}(H(v^*) - v^*)^T\nabla B(v^*)(H(v^*) - v^*)] = s(v^*)^T(v - v^*) \ge 0$ for all $v \in \Omega$.

Conversely, suppose that

$f'(v^*; v - v^*) \ge 0$ for all $v \in \Omega$.

Letting $v = H(v^*)$ in this expression and rearranging, we obtain

$(H(v^*) - v^*)^T\nabla s(v^*)(H(v^*) - v^*) + \frac{1}{2\alpha}(H(v^*) - v^*)^T[\nabla B(v^*)(H(v^*) - v^*)](H(v^*) - v^*) \le [s(v^*) + \frac{B(v^*)}{\alpha}(H(v^*) - v^*)]^T(H(v^*) - v^*) \le 0$.

This implies that $H(v^*) = v^*$, since $\nabla s(v^*)$ is positive definite on $\Omega$ and $\nabla B(v^*)(H(v^*) - v^*)$ is positive semidefinite on $\Omega$. By Proposition 3.3, $v^*$ is a solution of $VIP_v$. □

Following the analysis above, we introduce the general descent framework for $VIP_v$.

Proposition 3.11 Let $\{v^l\}$ be the sequence generated by the iteration

$v^{l+1} = v^l + t_l d^l, \qquad l = 0, 1, 2, \ldots$ (33)

where $d^l = u^l - v^l$ ($u^l \in \Gamma(v^l)$) is a descent direction for $f$ at $v^l$ and $t_l \in [0, 1]$ is determined by the exact line search

$f(v^l + t_l d^l) = \min\{f(v^l + t d^l) \mid 0 \le t \le 1\}$. (34)

Assume that $\nabla_u \varphi(u, v) = -\nabla_v \varphi(u, v)$, or that $\nabla B(v)(H(v) - v)$ is positive definite on $\Omega$. Then, for any starting point $v^0 \in \Omega$, the generated sequence $\{v^l\}$ lies in $\Omega$ and converges to the unique solution of $VIP_v$.

Proof. Since the points $v^l$ and $H(v^l) = v^l + d^l$ both belong to $\Omega$ and since $0 \le t_l \le 1$, it follows from the convexity of $\Omega$ that the sequence $\{v^l\}$ lies in $\Omega$. Also, since $f$ has bounded level sets (Proposition 3.5), the generated sequence is bounded. To prove convergence

of the algorithm, we use Zangwill's Convergence Theorem A (Zangwill, 1969). Condition 1 of the theorem clearly holds. Condition 2 (the descent property) is ensured by Propositions 3.5 and 3.6. Now consider Condition 3, i.e., upper semicontinuity of the algorithmic mapping

$A(v^l) = M(v^l) D(v^l)$,

where $M$ is the line search mapping and $D$ is the direction mapping, i.e., $D(v^l) = (v^l, d^l) = (v^l, u^l - v^l)$ with $u^l \in \Gamma(v^l)$.

We first study the direction mapping. Let $v^l \to v^\infty$ and $d^l \to d^\infty$ over a subsequence. Then

$u^l \to u^\infty = v^\infty + d^\infty$.

To prove that $D$ is upper semicontinuous, it suffices to establish that $u^\infty \in \Gamma(v^\infty)$; then, because $d^\infty = u^\infty - v^\infty$, we have $(v^\infty, d^\infty) \in D(v^\infty)$. Since all $u^l$ are feasible, $u^\infty$ is also feasible. Moreover, by the definition of $u^l$, we have

$[s(v^l) + \frac{1}{\alpha}\nabla_u \varphi(u^l, v^l)]^T(u - u^l) \ge 0$ for all $u \in \Omega$.

Since $s(v) + \frac{1}{\alpha}\nabla_u \varphi(u, v)$ is continuous, we obtain, taking limits in the above inequality,

$[s(v^\infty) + \frac{1}{\alpha}\nabla_u \varphi(u^\infty, v^\infty)]^T(u - u^\infty) \ge 0$ for all $u \in \Omega$.

Hence $u^\infty \in \Gamma(v^\infty)$ and $D$ is upper semicontinuous. It is well known that exact line searches such as (34) induce upper semicontinuity of $M$; in this case, $A$ is also upper semicontinuous. We conclude that the algorithmic map is upper semicontinuous and, consequently, $f(v^l) \to 0$. By Theorem 3.1, this implies that $\{v^l\}$ converges to a solution of $VIP_v$, which is unique since $s$ is strongly monotone. All other subsequences must also converge to this solution. □

Remark. The preceding proposition remains valid if the condition of strong monotonicity of $s$ is replaced by the condition that $s$ be strictly monotone on $\Omega$ and $\Omega$ be compact.

We note that the proposed algorithm is difficult to implement, due to the fact that each evaluation of $f$ involves an exact minimization, which is difficult to perform unless $\Omega$ has very special structure. The exact line search is even more costly. An alternative approach is to carry out an inexact line search, such as that provided by Armijo's stepsize rule (Ortega and Rheinboldt, 1970).
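A minimal sketch of iteration (33), on the hypothetical box problem used above (our own illustrative data), replacing the exact line search (34) by an Armijo-type backtracking step:

```python
import numpy as np

# Hypothetical test problem: Omega = [0, 2]^2, phi of form (14), alpha = 0.25,
# s(v) = A v + b with asymmetric positive definite A and an interior solution.
lo, hi, alpha = 0.0, 2.0, 0.25
A = np.array([[4.0, 1.0], [-1.0, 3.0]])
b = np.array([-2.0, -3.0])

def s(v): return A @ v + b
def H(v): return np.clip(v - alpha * s(v), lo, hi)
def f(v):
    u = H(v)
    return -(s(v) @ (u - v)) - (0.5 / alpha) * np.dot(u - v, u - v)

v = np.array([2.0, 0.0])
for _ in range(1000):
    d = H(v) - v                       # descent direction (Proposition 3.6)
    if np.linalg.norm(d) < 1e-10:
        break                          # H(v) = v: fixed point, hence solution
    t = 1.0
    # Armijo backtracking in place of the exact line search (34):
    while f(v + t * d) > f(v) - 1e-4 * t * np.dot(d, d):
        t *= 0.5
    v = v + t * d

# The iterates approach the interior solution, where s(v*) = 0 here.
v_star = np.linalg.solve(A, -b)
```

Each outer iteration strictly decreases the gap function $f$, and the loop stops once $d = H(v) - v$ vanishes, which by Proposition 3.3 characterizes the solution.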

If $\varphi$ is of the form (15), then the method is the PBOM (see Fukushima, 1989). If $\varphi$ takes the form (16), we shall refer to the resulting algorithm as the linearized Jacobi based optimization method. When $\alpha = \infty$, the existence of a descent direction is ensured by the results of Propositions 3.5 and 3.6. The actual computation of the descent direction in this case is not trivial, as studied by Bertsekas and Mitter (1973) and Marcotte (1985); it involves the solution of a suitable mathematical programming problem. We close this section with the following equivalence results.

Proposition 3.12 A solution $v$ of $VIP_v$ can be obtained by solving any one of the following equivalent problems:

1) $s(v)^T(u - v) \ge 0$ for all $u \in \Omega$;

2) $\max_{v \in \Omega} \min_{u \in \Omega} s(v)^T(u - v) + \frac{1}{\alpha}\varphi(u, v)$;

3) $\min_{v \in \Omega} -s(v)^T(u - v) - \frac{1}{\alpha}\varphi(u, v)$
subject to $u \in \arg\min_{u \in \Omega} s(v)^T(u - v) + \frac{1}{\alpha}\varphi(u, v)$;

4) $\max_{v \in \Omega} \min_{u \in \Omega} s(v)^T(u - v)$;

5) $\min_{v \in \Omega} -s(v)^T(u - v)$
subject to $u \in \arg\min_{u \in \Omega} s(v)^T(u - v)$.

From this proposition, we note that the variational inequality problem is a special case of the max-min problem, while the max-min problem is a special case of the bilevel problem.

4 Various realizations of the general algorithmic framework

In this section, we consider various realizations of the general descent framework suggested by Proposition 3.11. This analysis serves as a bridge connecting algorithms of both the iterative and the optimization approaches. Sufficient conditions for the convergence of the following algorithms are not presented here; the interested reader is referred to the corresponding references.

Proposition 4.1 If $0 < \alpha < \infty$, $t_l = 1$, and $\varphi$ is given by

$\varphi(u, v) = \frac{1}{2}(u - v)^T B(v)(u - v)$,

then the algorithm corresponds to a family of iterative methods already discussed. As particular instances, we obtain:

1) the symmetrized Newton method, where $B(v) = \frac{1}{2}(\nabla s(v) + \nabla s(v)^T)$ (Hammond, 1984);
2) the quasi-Newton methods, where $B(v)$ is a symmetric positive definite approximation of $\nabla s(v)$ (Josephy, 1979);
3) the linearized Jacobi method, where $B(v) = \mathrm{diag}(\nabla s(v))$ (Wu, Florian and Marcotte, 1990);
4) the projection method, where $B(v)$ is a constant symmetric positive definite matrix $B$ (Dafermos, 1980; Bertsekas and Gafni, 1982).

Note that these four methods are not descent algorithms in nature.

Proposition 4.2 If $\nabla s(v)$ is symmetric, then $VIP_v$ is equivalent to a convex minimization program. If the form (14), (15) or (16) of $\varphi(u, v)$ is used and $0 < \alpha < \infty$, then the algorithm reduces to Migdalas's regularized Frank-Wolfe method.

Proof. See Migdalas (1990). □

Proposition 4.3 Assume that $\nabla s(v) = D$, where $D$ is symmetric, positive definite and constant. If $\varphi(u, v)$ is defined by (15), where $B$ is diagonal with positive entries, and furthermore $\Omega$ consists of one linear constraint and bounded variables, then the algorithm is equivalent to the Dussault-Ferland-Lemaire (DFL) method.

Proof. See Dussault, Ferland and Lemaire (1986). In this case, $VIP_v$ represents the first-order optimality condition of a convex quadratic program

$\min_{v \in \Omega} g(v) = \frac{1}{2}v^T D v + c^T v$,

where $D$ is a symmetric positive definite matrix, $c$ is a vector and $s(v) = Dv + c$. In DFL, given a feasible $v$, a separable subproblem of the form

$(SS)$ $\quad \min_{u \in \Omega} \frac{1}{2}u^T B u + C^T u$,

where $C = c + (D - B)v$, is solved. Thus we need only show that this separable subproblem is equivalent to the subproblem defined in (17) when the form (15) of $\varphi(u, v)$ is used and $\alpha = 1$. Consider the corresponding objective function in (17):

$s(v)^T(u - v) + \frac{1}{2}(u - v)^T B(u - v) = (Dv + c)^T(u - v) + \frac{1}{2}(u^T B u + v^T B v - 2u^T B v)$
$= \frac{1}{2}u^T B u + [c + (D - B)v]^T u + [\frac{1}{2}v^T B v - s(v)^T v]$
$= \frac{1}{2}u^T B u + C^T u + \text{constant}$,

which is equivalent to the separable problem $(SS)$. □
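As an illustration of realization 1), a sketch on a hypothetical affine problem $s(v) = Av + b$ whose solution is interior, so that the subproblem constraints are inactive and the symmetrized Newton step with $\alpha = 1$ and $t_l = 1$ takes the closed form $u = v - B^{-1}s(v)$:

```python
import numpy as np

# Hypothetical affine test data (not from the paper).
A = np.array([[4.0, 1.0], [-1.0, 3.0]])   # asymmetric, positive definite
b = np.array([-2.0, -3.0])
B = (A + A.T) / 2                         # symmetrized Jacobian, realization 1)

def s(v):
    return A @ v + b

v = np.array([1.0, 1.0])
for _ in range(50):
    # Subproblem (17) with phi of form (16), alpha = 1, constraints inactive:
    v = v - np.linalg.solve(B, s(v))      # u = H(v), taken with t_l = 1

# For an interior solution, VIP_v reduces to s(v*) = 0.
v_star = np.linalg.solve(A, -b)
```

The iteration matrix $I - B^{-1}A$ has spectral radius well below one for this data, so the fixed-point iteration converges rapidly even though, as noted above, it is not a descent method in nature.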

Proposition 4.4 If $\alpha = 1$ and the form (15) of $\varphi$ is used, then the algorithm reduces to Fukushima's algorithm.

Proof. Fukushima (1989) considers the form (15). □

5 An inexact method

As indicated in Proposition 3.11, a convex optimization subproblem must be solved at each iteration. In general, this problem cannot be solved exactly. In this section we therefore propose an inexact scheme that retains the global convergence property of the previous algorithm. The inexact scheme may offer a good trade-off between the amount of work required per iteration and the total number of iterations carried out. In the following, we restrict our analysis to the case $0 < \alpha < \infty$. Define

$\epsilon_l = \min\{\frac{1}{l}, \eta\|d^l\|_2^2\}$ (35)

where $\epsilon_l$ is the error term at iteration $l$, $d^l$ is the descent direction at iteration $l$ and $\eta$ is a positive parameter. Let $\hat{u}$ satisfy the inequality

$[s(v^l) + \frac{1}{\alpha}\nabla_u \varphi(\hat{u}, v^l)]^T(v - \hat{u}) \ge -\epsilon_l$ for all $v \in \Omega$, (36)

i.e., $\hat{u}$ is an $\epsilon_l$-optimal solution of subproblem (17).

Proposition 5.1 Let $u = H(v^l)$ and $d^l = \hat{u} - v^l$. If $\nabla_u \varphi$ is strongly monotone in $u$, i.e., there exists a positive constant $\gamma$ such that

$[\nabla_u \varphi(u', v) - \nabla_u \varphi(u'', v)]^T(u' - u'') \ge \gamma\|u' - u''\|_2^2$, (37)

then we have

$\|u - \hat{u}\|_2 \le \sqrt{\frac{\alpha\eta}{\gamma}}\,\|d^l\|_2$. (38)

Proof. By definition, $u$ satisfies

$[s(v^l) + \frac{1}{\alpha}\nabla_u \varphi(u, v^l)]^T(v - u) \ge 0$ for all $v \in \Omega$. (39)

Letting $v = u$ in (36) and $v = \hat{u}$ in (39), and adding the two inequalities, we obtain, after rearranging terms,

$[\nabla_u \varphi(u, v^l) - \nabla_u \varphi(\hat{u}, v^l)]^T(u - \hat{u}) \le \alpha\epsilon_l$,

so that, using (37),

$\gamma\|u - \hat{u}\|_2^2 \le \alpha\epsilon_l$

$\le \alpha\eta\|d^l\|_2^2$ by the definition of $\epsilon_l$. Hence

$\|u - \hat{u}\|_2 \le \sqrt{\frac{\alpha\eta}{\gamma}}\,\|d^l\|_2$. □

Proposition 5.2 Assume that:

1) $0 < \alpha < \infty$; (40)
2) $\nabla_u \varphi(u, v) = -\nabla_v \varphi(u, v)$; (41)
3) $\nabla_u \varphi$ is strongly monotone on $\Omega$, with constant $\gamma$; (42)
4) $\nabla_u \varphi$ is Lipschitz continuous on $\Omega$, with Lipschitz constant $M$;
5) $\eta + (\lambda_{\max} + \frac{M}{\alpha})\sqrt{\frac{\alpha\eta}{\gamma}} < \lambda_{\min}$, (43)

where $\lambda_{\min}$ and $\lambda_{\max}$ are the minimum and maximum eigenvalues of $\nabla s(v)$ over $\Omega$, respectively. If $d^l = \hat{u} - v^l \ne 0$, where $\hat{u}$ satisfies (36), then $d^l$ is a descent direction for $f$ at $v^l$, i.e., $f'(v^l; d^l) < 0$.

Proof. Let $u = H(v^l)$. We have

$f'(v^l; d^l) = [s(v^l) - \nabla s(v^l)^T(u - v^l) - \frac{1}{\alpha}\nabla_v \varphi(u, v^l)]^T(\hat{u} - v^l)$
$= [s(v^l) - \frac{1}{\alpha}\nabla_v \varphi(u, v^l)]^T(\hat{u} - v^l) - (u - v^l)^T\nabla s(v^l)(\hat{u} - v^l)$
$= A_1 + A_2$,

where

$A_1 = [s(v^l) - \frac{1}{\alpha}\nabla_v \varphi(u, v^l)]^T(\hat{u} - v^l)$,
$A_2 = -(u - v^l)^T\nabla s(v^l)(\hat{u} - v^l)$.

Consider $A_1$:

$A_1 = [s(v^l) - \frac{1}{\alpha}\nabla_v \varphi(u, v^l)]^T(\hat{u} - v^l)$

$= [s(v^l) + \frac{1}{\alpha}\nabla_u \varphi(u, v^l)]^T(\hat{u} - v^l)$ by (41)
$= [s(v^l) + \frac{1}{\alpha}\nabla_u \varphi(\hat{u}, v^l)]^T(\hat{u} - v^l) + \frac{1}{\alpha}[\nabla_u \varphi(u, v^l) - \nabla_u \varphi(\hat{u}, v^l)]^T(\hat{u} - v^l)$
$\le \epsilon_l + \frac{1}{\alpha}[\nabla_u \varphi(u, v^l) - \nabla_u \varphi(\hat{u}, v^l)]^T(\hat{u} - v^l)$ by (36)
$\le \eta\|d^l\|_2^2 + \frac{M}{\alpha}\|u - \hat{u}\|_2 \cdot \|d^l\|_2$
$\le (\eta + \frac{M}{\alpha}\sqrt{\frac{\alpha\eta}{\gamma}})\|d^l\|_2^2$ by (38).

Now consider $A_2$:

$A_2 = -(u - v^l)^T\nabla s(v^l)(\hat{u} - v^l)$
$= -(\hat{u} - v^l)^T\nabla s(v^l)(\hat{u} - v^l) + (\hat{u} - u)^T\nabla s(v^l)(\hat{u} - v^l)$
$\le -\lambda_{\min}\|d^l\|_2^2 + \lambda_{\max}\|u - \hat{u}\|_2 \cdot \|d^l\|_2$
$\le (-\lambda_{\min} + \lambda_{\max}\sqrt{\frac{\alpha\eta}{\gamma}})\|d^l\|_2^2$ by (38).

Thus

$f'(v^l; d^l) \le (\eta + (\frac{M}{\alpha} + \lambda_{\max})\sqrt{\frac{\alpha\eta}{\gamma}} - \lambda_{\min})\|d^l\|_2^2 < 0$ by (43). □

Proposition 5.3 Assume that:
1) $\varphi(u, v) = \frac{1}{2}(u - v)^T B(v)(u - v)$;
2) $\nabla B(v)(u - v)$ is positive definite on $\Omega$, with eigenvalues bounded below by $\nu_{\min} > 0$ and above by $\nu_{\max}$;
3) (37) is satisfied, with constant $\gamma$;
4) $B(v)$ is a symmetric positive definite matrix with maximum eigenvalue $\beta_{\max}$;
5) $0 < \alpha < \infty$;
6) $\eta + (\frac{\beta_{\max}}{\alpha} + \lambda_{\max} + \frac{\nu_{\max}}{2\alpha})\sqrt{\frac{\alpha\eta}{\gamma}} < \lambda_{\min} + \frac{\nu_{\min}}{2\alpha}$.

If $d^l = \hat{u} - v^l \ne 0$, where $\hat{u}$ satisfies (36), then $d^l$ is a descent direction for $f$ at $v^l$, i.e., $f'(v^l; d^l) < 0$.

Proof. Let $u = H(v^l)$. We have

$f'(v^l; d^l) = [s(v^l) - (\nabla s(v^l) - \frac{B(v^l)}{\alpha})^T(u - v^l) - \frac{1}{2\alpha}(u - v^l)^T\nabla B(v^l)(u - v^l)]^T(\hat{u} - v^l)$
$= [s(v^l) + \frac{B(v^l)}{\alpha}(u - v^l)]^T(\hat{u} - v^l) - (u - v^l)^T\nabla s(v^l)(\hat{u} - v^l) - \frac{1}{2\alpha}[(u - v^l)^T\nabla B(v^l)(u - v^l)]^T(\hat{u} - v^l)$.

Writing $\hat{u} - v^l = (u - v^l) + (\hat{u} - u)$ and $u - v^l = (\hat{u} - v^l) - (\hat{u} - u)$, and using (36), (38) and assumption 2), we obtain

$f'(v^l; d^l) \le \epsilon_l - \lambda_{\min}\|d^l\|_2^2 - \frac{\nu_{\min}}{2\alpha}\|d^l\|_2^2 + (\frac{\beta_{\max}}{\alpha} + \lambda_{\max} + \frac{\nu_{\max}}{2\alpha})\|u - \hat{u}\|_2 \cdot \|d^l\|_2$
$\le [\eta - \lambda_{\min} - \frac{\nu_{\min}}{2\alpha} + (\frac{\beta_{\max}}{\alpha} + \lambda_{\max} + \frac{\nu_{\max}}{2\alpha})\sqrt{\frac{\alpha\eta}{\gamma}}]\,\|d^l\|_2^2$

$< 0$ by assumption 6). □

An important issue is how to find efficiently a point $\hat{u}$ satisfying (36). As already noted, if $\epsilon_l = 0$, then condition (36) is the optimality condition of the subproblem

$\min_{u \in \Omega} \psi(u, v^l)$. (44)

Let its optimal solution be $u^*$. By the convexity and differentiability of $\psi$, $\hat{u}$ is an $\epsilon$-optimal solution of (44) if $\psi(u^*, v^l) + \epsilon \ge \psi(\hat{u}, v^l)$.

Motivated by the network equilibrium problem, where $\Omega$ is polyhedral and linear programs are easily solved, we use the linear approximation (Frank-Wolfe) method to solve (44). The linear approximation problem at a given $\bar{u}$ can be stated as

$\omega = \min_{u \in \Omega} \nabla_u \psi(\bar{u}, v^l)^T(u - \bar{u}) = \min_{u \in \Omega} [s(v^l) + \frac{1}{\alpha}\nabla_u \varphi(\bar{u}, v^l)]^T(u - \bar{u})$, (45)

whose solution yields either an extreme point or a recession direction. In either case, denote by $\bar{d}$ the associated search direction and find

$\theta^* = \arg\min_{0 \le \theta \le 1} \psi(\bar{u} + \theta\bar{d}, v^l)$. (46)

Let $\bar{u} \leftarrow \bar{u} + \theta^*\bar{d}$ and repeat the above procedure. Subproblem (17) is thus solved by performing a sequence of linear programs, and we obtain:

ALGORITHM A

Step 1 (Initialization) Find $v^0 \in \Omega$ and set $l = 0$.
Step 2 (Subproblem)
Step 2.1 (Initialization) Let $u^0 = v^l$ and set $k = 0$.
Step 2.2 (Linear program) $\omega = \min_{u \in \Omega} \nabla_u \psi(u^k, v^l)^T(u - u^k)$; let the optimal solution be $\bar{u}$.
Step 2.3 (Test) Let $\epsilon_l = \min\{\frac{1}{l}, \eta\|u^k - v^l\|_2^2\}$. If $\omega \ge -\epsilon_l$, then set $\hat{u} = u^k$ and go to Step 3; else set $\bar{d} = \bar{u} - u^k$.
Step 2.4 (Line search) $\theta^* = \arg\min_{0 \le \theta \le 1} \psi(u^k + \theta\bar{d}, v^l)$.
Step 2.5 (Update) $u^{k+1} = u^k + \theta^*\bar{d}$, $k = k + 1$. Go to Step 2.2.

Step 3 (Descent direction) $d^l = \hat{u} - v^l$.
Step 4 (Stopping criterion) If $\|d^l\|_2 = 0$, then stop; the inexact solution is $\hat{u}$.
Step 5 (Line search) $t_l = \arg\min_{0 \le t \le 1} f(v^l + t d^l)$.
Step 6 (Update) $v^{l+1} = v^l + t_l d^l$, $l = l + 1$. Go to Step 2.

Proposition 5.4 Assume that $\Omega$ is compact and convex, the mapping $s$ is continuously differentiable, $\nabla s(v)$ is positive definite on $\Omega$ and either $\nabla_u \varphi(u, v) = -\nabla_v \varphi(u, v)$ or $\nabla B(v)(u - v)$ is positive definite on $\Omega$. If the conditions of Propositions 5.1, 5.2 and 5.3 are satisfied, then, from an arbitrary point $v^0 \in \Omega$, the sequence $\{v^l\}$ generated by Algorithm A lies in $\Omega$ and converges to the unique solution of $VIP_v$.

Proof. The proof is similar to that of Proposition 3.11, once we show that Step 2 of Algorithm A is well defined and convergent. The method for solving the subproblem in Step 2 is simply an application of the Frank-Wolfe method. Since $\varphi$ is continuously differentiable and convex in $u$, and $\Omega$ is closed and bounded, the sequence $\{u^k\}$ generated in Step 2 converges globally to an $\epsilon$-optimal solution of (44) (see Martos, 1975). □

Remark. If $\nabla s(v)$ is symmetric for all $v \in \Omega$, then Algorithm A reduces to Migdalas's method (1990).

6 Conclusion

In this paper, we developed a general descent framework for the monotone variational inequality problem $VIP_v$ and analysed its relationship with some existing algorithms. We provided an inexact algorithm and proved that it converges globally under certain conditions. It should be noted that a decomposition approach can be applied to Algorithm A: in the case of the network equilibrium problem, the linear program in Step 2.2 can be decomposed into a set of independent linear subprograms, one per origin-destination pair. This decomposability is well suited to parallel processing; an obvious way to implement a parallel version is to assign one of the linear subprograms to each processor. Notice that this feature is not shared by Marcotte and Dussault's algorithm (1989), which requires the solution of a sequence of linear programs in the primal-dual space of variables.
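The inner Frank-Wolfe loop of Step 2 can be sketched as follows, under assumptions that are ours, not the paper's: $\Omega$ is the unit simplex (so the linear program of Step 2.2 is solved by inspecting the gradient components), $\varphi$ is the form (14), $\alpha = 1$, and the Step 2.3 tolerance is frozen at a fixed $\epsilon$ instead of the rule (35).

```python
import numpy as np

# Hypothetical test data: s(v) = A v + b with positive definite sym. part.
alpha = 1.0
A = np.array([[3.0, 1.0], [-1.0, 2.0]])
b = np.array([0.5, -0.5])

def s(v): return A @ v + b

def psi(u, v):                          # subproblem objective (17), form (14)
    return s(v) @ (u - v) + (0.5 / alpha) * np.dot(u - v, u - v)

def grad_psi(u, v):
    return s(v) + (u - v) / alpha

def inner_fw(v, eps, max_k=1000):
    """Frank-Wolfe loop of Step 2: returns an eps-optimal u for (44)."""
    u = v.copy()
    for _ in range(max_k):
        g = grad_psi(u, v)
        vertex = np.zeros_like(u)       # Step 2.2: LP over the simplex is
        vertex[np.argmin(g)] = 1.0      # solved by the best vertex
        omega = g @ (vertex - u)
        if omega >= -eps:               # Step 2.3 test
            return u
        d = vertex - u                  # search direction
        # Step 2.4: exact line search for the quadratic psi along d.
        theta = min(1.0, -(g @ d) / (np.dot(d, d) / alpha))
        u = u + theta * d               # Step 2.5
    return u

v = np.array([0.5, 0.5])                # a feasible point of the simplex
u_hat = inner_fw(v, eps=1e-6)
```

On this data the loop stops after two passes, returning the exact minimizer of (17); in general it returns only an $\epsilon$-optimal point, which is all that (36) requires.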

Page 20: A general descent framework for the monotone variational inequality problem

References[1] Auslender, A., Optimisation. M�ethodes num�eriques, Masson, Paris (1976).[2] Bard, J.F. and Falk, J.E., \An explicit solution to the multi-level programming problem ",Computers and Operations Research 9 (1982) 77-100.[3] Bertsekas, D.P. and Gafni, E.M., \Projection methods for variational inequalities with ap-plication to the tra�c assignment problem", Mathematical Programming Study 17 (1982)139-159.[4] Bertsekas, D.P. and Mitter, S.K., \A descent numerical method for optimization problemswith nondi�erentiable cost functionals", SIAM J. on Control 11 (1973) 637-652.[5] Clarke, F.H., \Generalized gradients and applications", Transactions of the American Mathe-matical Society 205 (1975) 247-262.[6] Dafermos, S., \Tra�c equilibrium and variational inequalities", Transportation Science 14(1980) 42-54.[7] Dafermos, S., \An iterative scheme for variational inequalities", Mathematical Programming26 (1983) 40-47.[8] Dafermos, S. and Nagurney, A., \Oligopolistic and competitive behavior of spatially separatedmarket", Regional Science and Urban Economics 17 (1987).[9] Danskin, J.M., \The theory of max-min with applications", SIAM J. Appl. Math. 14 (1966)641-664.[10] Dussault, J-P., Ferland, J.A. and Lemaire, B., \Convex quadratic programming with oneconstraint and bounded variable", Mathematical Programming 36 (1986) 90-104.[11] Florian, M., \Nonlinear cost network models on transportation analysis", Mathematical Pro-gramming Study 26 (1986) 167-196.[12] Florian, M. and Los, M., \A new look at static spatial price equilibrium model", RegionalScience and Urban Economy 12 (1982) 374-389.[13] Florian, M. and Spiess, H., \The convergence of diagonalization algorithms for asymmetricnetwork equilibrium problems", Transportation Research 16B (1982) 447-483.[14] Friesz, T.L., \Transportation network equilibrium, design and aggregation: key developmentsand research opportunities", Transportation Research 19A (1985) 413-427.20

[15] Fukushima, M., "A relaxed projection method for variational inequalities", Mathematical Programming 35 (1986) 58-70.
[16] Fukushima, M., "Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems", Technical Report 89007, Dept. of Applied Math. and Physics, Kyoto University (1989).
[17] Hammond, J.H., "Solving asymmetric variational inequality problems and systems of equations with generalized nonlinear programming algorithms", Ph.D. dissertation, Dept. of Math., MIT (1984).
[18] Harker, P.T., "Accelerating the convergence of the diagonalization and projection algorithms for finite-dimensional variational inequalities", Mathematical Programming 41 (1988) 29-59.
[19] Harker, P.T. and Pang, J.S., "Finite-dimensional variational inequality and nonlinear complementarity problems: a survey of theory, algorithms and applications", Mathematical Programming 48 (1990) 161-220.
[20] Josephy, N.H., "Quasi-Newton methods for generalized equations", Technical Report, Math. Research Center, University of Wisconsin (June 1979).
[21] Kakutani, S., "A generalization of Brouwer's fixed point theorem", Duke Mathematical Journal 8 (1941) 457-459.
[22] Kinderlehrer, D. and Stampacchia, G., An introduction to variational inequalities and their applications, Academic Press (1980).
[23] Lancaster, P. and Tismenetsky, M., The theory of matrices, Academic Press (1985).
[24] Marcotte, P., "A new algorithm for solving variational inequalities, with application to the traffic assignment problem", Mathematical Programming 33 (1985) 339-351.
[25] Marcotte, P. and Guélat, J., "Adaptation of a modified Newton method for solving the asymmetric traffic equilibrium problem", Transportation Science 22 (1988) 112-124.
[26] Marcotte, P. and Dussault, J-P., "A sequential linear programming algorithm for solving monotone variational inequalities", SIAM J. Control and Optimization 27 (1989) 1260-1278.
[27] Marron, M.J., Numerical Analysis, Macmillan Publishing Company (1987).
[28] Martos, B., Nonlinear Programming Theory and Methods, North Holland, Amsterdam (1975).
[29] Migdalas, A., "A regularization of the Frank-Wolfe method", LiTH-MAT-R-90-10, Dept. of Math., Linköping Institute of Technology (1990).
[30] Nagurney, A.B., "Competitive equilibrium problems, variational inequalities and regional science", Journal of Regional Science 27 (1987) 503-517.
[31] Nguyen, S. and Dupuis, C., "An efficient method for computing traffic equilibria in networks with asymmetric transportation costs", Transportation Science 18 (1984) 185-202.
[32] Ortega, J.M. and Rheinboldt, W.C., Iterative solution of nonlinear equations in several variables, Academic Press (1970).
[33] Pang, J.S. and Chan, D., "Iterative methods for variational and complementarity problems", Mathematical Programming 24 (1982) 284-313.
[34] Pang, J.S. and Yu, C.S., "Linearized simplicial decomposition methods for computing traffic equilibria on networks", Networks 14 (1984) 427-438.
[35] Pang, J.S., "Newton's method for B-differentiable equations", Mathematics of Operations Research 15 (1990) 311-341.
[36] Smith, M.J., "Existence, uniqueness and stability of traffic equilibria", Transportation Research 13B (1979) 295-304.
[37] Smith, M.J., "The existence and calculation of traffic equilibria", Transportation Research 17B (1983) 291-303.
[38] Stewart, G.W., Introduction to matrix computations, Academic Press (1973).
[39] Wu, J.H., Florian, M. and Marcotte, P., "A new optimization formulation of the variational inequality with application to the traffic equilibrium", Publication 722, Centre de recherche sur les transports, Université de Montréal (1990).
[40] Zangwill, W.I., Nonlinear programming, Prentice-Hall (1967).