
European Journal of Operational Research 108 (1998) 653-670

Theory and Methodology

Computational assessment of distributed decomposition methods 1 for stochastic linear programs

Hercules Vladimirou 2

School of Economics & Management, Department of Public and Business Administration, University of Cyprus, 75 Kallipoleos Street, P.O. Box 537, CY-1678 Nicosia, Cyprus

Received 20 July 1996; accepted 9 May 1997

Abstract

Incorporating uncertainty in optimization models gives rise to large, structured mathematical programs. Decomposition procedures are well-suited for parallelization, thus providing a promising venue for solving large stochastic programs arising in diverse practical applications. This paper presents an adaptation of decomposition methods for execution on distributed computing systems. Both a regularized decomposition algorithm and the traditional linear decomposition algorithm are implemented for execution on distributed multiprocessors. Computational results on an IBM SP2 multiprocessor system are reported to demonstrate the comparative performance of the methods on a number of test cases. © 1998 Elsevier Science B.V.

Keywords: Decomposition; Parallel algorithms; Distributed computing; Stochastic programming

I. Introduction

Stochastic linear programs with recourse provide an effective modeling paradigm for diverse sequential decision problems that include uncertain parameters. Such problems arise, for example, in situations in which the technological factors that govern a problem exhibit variability (e.g., yield of production processes), or in situations in which the primary determinants of current decisions do not become known until sometime in the future (e.g., return of investments, demand for products). Typically, a distribution of the uncertain parameters is postulated and stochastic programs are formulated to determine optimal decisions on the basis of the distribution.

1 The author's work is supported by Directorate General III, Industry, of the European Union under project INCO-DC No. 951139, and by a research grant from the University of Cyprus.

2 Fax: +357-2-339-063.

0377-2217/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII: S0377-2217(97)00222-1

Discrete distributions are usually used, whereby uncertainty is modeled by a representative set of scenarios. In problems with many independent, discretely distributed random factors causing stochasticity in the model's data, the number of scenarios grows exponentially with the number of uncertain parameters. Sampling techniques are employed in order to depict uncertainty by a manageable number of scenarios; the same practice is commonly applied when the uncertain parameters follow continuous distributions. Still, a large number of scenarios is usually needed to adequately capture uncertainty [1]. Hence, stochastic programs are typically very large and require specialized solution methods. Substantial research activity has been directed towards the development of effective solution methods for stochastic programs in the last thirty years; for overviews of algorithmic developments in the field, see [2-4].

This paper focuses on two-stage stochastic linear programs with recourse. These programs address the following sequential decision making framework:

Some decisions are made at present under uncertainty about future outcomes. When uncertainty is resolved, that is, when a realization of the uncertain parameters is observed, further corrective (recourse) actions can be taken to accommodate the conditions of the scenario that materializes. While the current decisions must take into account the possible future scenarios, they must be scenario-invariant; this nonanticipativity condition imposes the logical requirement that current decisions cannot depend on hindsight. On the other hand, the recourse decisions are contingent on the realization of the uncertain parameters, as well as on the initial decisions.

This modeling framework is appropriate in diverse practical applications. Applications of stochastic programs with recourse include: management of asset/liability portfolios [5-8], energy planning [9,10], resource acquisition [11], and management of natural resources [12]. Stochastic programs have also been applied to problems in production, logistics and distribution, telecommunications, and finance. Various applications are reported in [13,14].

Stochastic programs produce large, structured optimization programs which require specialized solution algorithms. Decomposition methods were among the first to be applied. Decomposition for two-stage stochastic linear programs was introduced by Van Slyke and Wets [15]. Variants of the method have been developed to improve computational performance or to exploit special problem forms. This has been a fertile strand of research since the early days of stochastic programming; for overviews see [2,4].

The appeal of decomposition lies in the fact that it solves a problem by iteratively solving a sequence of manageable subproblems. A masterproblem coordinates the iterative procedure. With the advent of parallel multiprocessors, decomposition methods have attracted renewed attention, since independent subproblems can be solved concurrently on multiple processors. Parallel decomposition methods provide the computational capabilities to solve large-scale stochastic programs.

This paper describes an adaptation of decomposition methods for distributed computing systems. Studies on parallel linear decomposition are also reported in [16-18,8]. Notably, Dantzig et al. [17] demonstrated that by exploiting the merits of importance sampling within a distributed decomposition scheme it is possible to solve large stochastic programs. Ruszczyński [19] describes the principles of distributed decomposition methods for stochastic programs. This study considers a distributed implementation of regularized decomposition [20,21], in addition to linear decomposition, for comparison. Multicut implementations of both decomposition algorithms were developed and tested. A state-of-the-art simplex-based code was used to solve the subproblems, and the masterproblem in the case of linear decomposition. A primal-dual path-following interior point method was employed to solve the quadratic masterproblem in regularized decomposition. Both codes were tested on an IBM SP2 distributed multiprocessor system, providing a basis for a comparative assessment. This paper discusses the results of the computational experiments.

The primary findings can be summarized as follows: (1) higher speedups are typically achieved as the number of scenarios in a problem increases; (2) the use of a general purpose interior point algorithm increases the serial step for solving the masterproblem in regularized decomposition and adversely affects the speedup; (3) regularization can significantly reduce the number of iterations in the decomposition procedure and thus attain better overall performance than linear decomposition on large and difficult problems, even though it may exhibit lower speedup in parallel execution. The results are discussed in the last sections of the paper. Directions to overcome the observed limitations of coarse-grain, distributed implementations of decomposition are also addressed.

The rest of the paper is organized as follows. Section 2 formulates the two-stage stochastic linear program with recourse. Section 3 reviews the basic principles of decomposition and presents the distributed decomposition algorithms. Section 4 describes the computational experiments and summarizes the numerical results. Section 5 summarizes the conclusions of this study and suggests directions for further research.

2. Two-stage stochastic linear programs

The two-stage stochastic linear program with recourse is stated as follows:

minimize    F(x) = c^T x + Q(x)                                          (1)

subject to  x ∈ X_0 = {x : Ax = b, x ≥ 0},                               (2)

where

Q(x) = Σ_{s=1}^{S} Q_s(x),                                               (3)

Q_s(x) = min { q_s^T y : W_s y = h_s − T_s x, y ≥ 0 },   s = 1, …, S,    (4)

       = max { π_s^T (h_s − T_s x) : π_s^T W_s ≤ q_s },  s = 1, …, S.    (5)

This formulation corresponds to the sequential decision framework described in Section 1. The variables x denote the scenario-invariant, first-stage decisions, which are determined so as to minimize the direct cost c^T x and the expected recourse cost Q(x). For a given first-stage decision vector x, the recourse cost Q_s(x) for each scenario s is computed by solving the second-stage problem in (4). Q_s(·) is an extended real-valued function, taking the value +∞ if the recourse problem (4) is infeasible. Uncertainty is modeled by means of a finite set of scenarios for the data of the second-stage problem. The realizations (q_s, h_s, T_s, W_s), s = 1, …, S, constitute the discrete scenarios. For simplicity we have assumed that the probability of each scenario is incorporated by scaling the corresponding recourse cost vector q_s. Eq. (3) computes the expectation of the recourse cost under all scenarios.

The m1 × n1 matrix A and the vector b ∈ R^{m1} specify the deterministic first-stage constraints in (2); c ∈ R^{n1} is the vector of direct first-stage costs. For each scenario s, the recourse actions in the second-stage problem (4) are governed by the corresponding m2 × n2 recourse matrix W_s; the technology matrix T_s has dimensions m2 × n1, h_s is an m2-dimensional vector, while the recourse cost is determined by the vector q_s ∈ R^{n2}. The decision vectors x ∈ R^{n1} and y ∈ R^{n2} are dimensioned conformably. Thus, the problem has n1 first-stage and n2 second-stage decision variables, respectively. Excluding bounds on the variables, the problem has m1 first-stage constraints in (2), and m2 second-stage constraints for each recourse problem in (4).
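To make the second-stage machinery concrete, the sketch below evaluates Q_s(x) of (4) for one invented scenario; the data, and the use of scipy's LP solver in place of the simplex routines used in the paper, are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def recourse_cost(q, W, h, T, x):
    """Q_s(x) of subproblem (4): min q^T y s.t. W y = h - T x, y >= 0;
    returns +inf when the subproblem has no feasible recourse."""
    res = linprog(q, A_eq=W, b_eq=h - T @ x,
                  bounds=[(0, None)] * len(q), method="highs")
    return res.fun if res.status == 0 else np.inf

# one invented scenario: m2 = 1 constraint, n2 = 2 recourse variables
q = np.array([1.0, 3.0]); W = np.array([[1.0, 1.0]])
T = np.array([[1.0]]);    h = np.array([4.0])
```

At x = 1 the subproblem is feasible with Q_s(x) = 3; at x = 5 the right-hand side h − Tx becomes negative and the function returns +∞, illustrating the extended real-valued Q_s(·) described above.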

Since the uncertain parameters follow a discrete distribution, the stochastic program (1)-(4) can be reformulated as the following deterministic equivalent linear program with a dual block-angular structure (see [22]).

minimize    c^T x + q_1^T y_1 + q_2^T y_2 + … + q_S^T y_S

subject to  A x = b,
            T_s x + W_s y_s = h_s,   s = 1, 2, …, S,                     (6)
            x ≥ 0,   y_s ≥ 0,        s = 1, 2, …, S.


This formulation reveals the large size of the stochastic program. There is one set of second-stage decision variables and one set of second-stage constraints for every scenario; the problem has (m1 + S·m2) constraints and (n1 + S·n2) variables.
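The dual block-angular structure of (6) can be assembled mechanically from the scenario data. The sketch below, a two-scenario toy instance invented for illustration, builds the deterministic-equivalent constraint matrix and solves it directly with scipy (an assumption of this sketch, not the paper's solver).

```python
import numpy as np
from scipy.optimize import linprog

# Tiny invented data: S = 2 scenarios, n1 = 2 first-stage variables,
# one second-stage variable per scenario; probabilities folded into q_s.
c  = np.array([1.0, 0.0])
A  = np.array([[1.0, 1.0]]); b = np.array([1.0])
Ts = [np.array([[1.0, 2.0]]), np.array([[1.0, 2.0]])]
Ws = [np.array([[1.0]]),      np.array([[1.0]])]
hs = [np.array([3.0]),        np.array([2.0])]
qs = [np.array([0.5]),        np.array([0.5])]

S, (m1, n1) = len(Ts), A.shape
m2, n2 = Ws[0].shape

# Assemble the block-angular matrix of (6):
#   [ A   0   0 ]           [ b  ]
#   [ T1  W1  0 ] (x, y) =  [ h1 ]
#   [ T2  0  W2 ]           [ h2 ]
M = np.zeros((m1 + S * m2, n1 + S * n2))
M[:m1, :n1] = A
for s in range(S):
    M[m1 + s*m2 : m1 + (s+1)*m2, :n1] = Ts[s]
    M[m1 + s*m2 : m1 + (s+1)*m2, n1 + s*n2 : n1 + (s+1)*n2] = Ws[s]
rhs  = np.concatenate([b] + hs)
cost = np.concatenate([c] + qs)

res = linprog(cost, A_eq=M, b_eq=rhs,
              bounds=[(0, None)] * len(cost), method="highs")
```

The assembled matrix has m1 + S·m2 rows and n1 + S·n2 columns, matching the size count above; for realistic scenario counts this direct approach quickly becomes impractical, which motivates the decomposition methods that follow.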

One class of solution methods has focused on specializations of the simplex algorithm that capitalize on the block structure of the problem to devise efficient pivoting strategies and compact forms of the basis inverse (e.g., see [2]). A recent strand of research develops efficient matrix factorization techniques to exploit the special structure of (6) within interior point methods [23-26]. These algorithms become increasingly competitive as problems grow larger, and have proved scalable in parallel computing environments, thus enabling the solution of very large problems on parallel computers with many processors [26].

We turn our attention next to decomposition methods.

3. Decomposition methods

3.1. Fundamental principles

Decomposition for stochastic programs derives from the observation that the recourse cost functions Q_s(x) are convex and polyhedral (piecewise linear). The expected recourse cost Q(x) is also convex and polyhedral when the uncertain parameters follow a finite, discrete distribution [22]. Based on these observations, the implicit form of Q(x) in (3) and (4) can be restated in terms of outer linearizations of the recourse functions. From convexity of Q_s(x), it follows that a linear approximation is a lower support:

Q_s(x) ≥ Q_s(x^k) + ∂Q_s(x^k)(x − x^k)
       = ∂Q_s(x^k) x + [Q_s(x^k) − ∂Q_s(x^k) x^k],                       (7)

where Q_s(x^k) is the value and ∂Q_s(x^k) is a subgradient of the recourse cost at a trial solution x^k, respectively. By duality, linear supports at x^k are obtained from the solution of the second-stage subproblems (4) as follows:

Q_s(x^k) = q_s^T y_s^k = (π_s^k)^T (h_s − T_s x^k),                      (8)

∂Q_s(x^k) = −(π_s^k)^T T_s,                                              (9)

where y_s^k ∈ R^{n2} and π_s^k ∈ R^{m2} are the optimal primal and dual solutions, respectively, of the second-stage subproblem (4) at the trial solution x^k. By substitution of (8) and (9) in (7), it follows that an optimality cut (i.e., a lower linear support of a recourse cost function Q_s(x) at a trial solution x^k) can be specified as

Q_s(x) ≥ −E_s^k x + e_s^k,  where
E_s^k = −∂Q_s(x^k) = (π_s^k)^T T_s,                                      (10)
e_s^k = Q_s(x^k) − ∂Q_s(x^k) x^k = (π_s^k)^T h_s.

A subproblem (4) may be infeasible for some trial solution x^k. The domains of finiteness of the recourse cost functions (X_s = {x : Q_s(x) < ∞}) are convex, polyhedral sets [22,27] and can be expressed in terms of a finite set J_s of outer linear approximations (feasibility cuts) of the form:

X_s ⊆ {x : D_s^j x ≥ d_s^j,  j ∈ J_s}.                                   (11)
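Following (8)-(10), an optimality cut can be generated from the dual form (5) of a subproblem. The sketch below, with invented one-scenario data, solves (5) directly to obtain π_s^k, forms (E_s^k, e_s^k), and checks the lower-support property; solving the dual as an explicit LP is a device of this sketch.

```python
import numpy as np
from scipy.optimize import linprog

# invented one-scenario data (m2 = 1 row, n2 = 2 recourse variables)
q = np.array([1.0, 3.0]); W = np.array([[1.0, 1.0]])
T = np.array([[1.0]]);    h = np.array([4.0])

def optimality_cut(q, W, T, h, xk):
    """Solve the dual subproblem (5) at x^k and return (E, e) of the
    optimality cut (10): Q_s(x) >= -E x + e."""
    rhs = h - T @ xk
    dual = linprog(-rhs, A_ub=W.T, b_ub=q,
                   bounds=[(None, None)] * len(h), method="highs")
    pi = dual.x                      # optimal dual prices pi_s^k
    return pi @ T, pi @ h            # E = (pi^k)^T T_s,  e = (pi^k)^T h_s

xk = np.array([1.0])
E, e = optimality_cut(q, W, T, h, xk)

def Q(x):  # primal recourse cost (4), used here to check the cut
    return linprog(q, A_eq=W, b_eq=h - T @ np.atleast_1d(x),
                   bounds=[(0, None)] * len(q), method="highs").fun
```

On this data the cut reads Q_s(x) ≥ 4 − x; it is tight at the trial point x^k and supports Q_s from below elsewhere.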

Feasibility cuts at a trial solution x^k are constructed from the dual prices σ_s^k ∈ R^{m2} of the equality constraints of the second-stage feasibility subproblems

min { e^T y^+ + e^T y^− : W_s y_s + y^+ − y^− = h_s − T_s x^k,
      y_s ≥ 0,  y^+ ≥ 0,  y^− ≥ 0 },                                     (12)

where e ∈ R^{m2} is a vector of ones. Induced feasibility cuts (11) are constructed according to

D_s^k x ≥ d_s^k,  where  D_s^k = (σ_s^k)^T T_s,  d_s^k = (σ_s^k)^T h_s.  (13)

Thus, at a trial first-stage solution x^k, a second-stage subproblem (4) may return either an optimality cut (10) or a feasibility cut (13).
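The phase-one construction can be sketched by solving the LP dual of (12): a multiplier vector σ with σ^T W ≤ 0 and components in [−1, 1] that makes σ^T (h − T x^k) positive certifies infeasibility of (4) and yields cut (13). Obtaining σ from this explicit dual is a device of the sketch (the paper reads the phase-one duals from the simplex solver, see Section 3.3); all data are invented.

```python
import numpy as np
from scipy.optimize import linprog

def feasibility_cut(W, T, h, xk):
    """Dual of phase-one problem (12): maximize sigma^T (h - T x^k) subject to
    sigma^T W <= 0 and -1 <= sigma <= 1. A positive optimum certifies that
    subproblem (4) is infeasible and yields cut (13): D x >= d."""
    rhs = h - T @ xk
    res = linprog(-rhs, A_ub=W.T, b_ub=np.zeros(W.shape[1]),
                  bounds=[(-1.0, 1.0)] * W.shape[0], method="highs")
    sigma = res.x
    if -res.fun <= 1e-9:             # optimum 0: subproblem is feasible
        return None
    return sigma @ T, sigma @ h      # D = (sigma^k)^T T_s,  d = (sigma^k)^T h_s

# invented data: W y = h - T x with y >= 0 is infeasible at x^k = 2
W = np.array([[1.0]]); T = np.array([[1.0]]); h = np.array([1.0])
cut = feasibility_cut(W, T, h, np.array([2.0]))
D, d = cut
```

Here the returned cut reads −x ≥ −1, i.e. x ≤ 1, which is exactly the region where the recourse problem is feasible; at a feasible point such as x^k = 0.5 the function returns None.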

Decomposition algorithms for stochastic programs [15,28] are outer linearization procedures. They iteratively apply the following general procedure to solve the problem: Given a trial solution (proposal) x^k for the first-stage variables, solve the second-stage recourse problems in (4) for each scenario in order to generate: (i) feasibility cuts (13) determining the domains of feasible recourse X_s, and (ii) optimality cuts (10) that are lower linear approximations of Q_s(x). Append the cuts to the masterproblem and solve it to obtain an improved first-stage solution. The trial solutions x^k are obtained from the masterproblem:

minimize    (1/2ρ) ‖x − ξ^k‖² + c^T x + Σ_{s=1}^{S} θ_s                  (14)

subject to  A x = b,
            E_s^j x + θ_s ≥ e_s^j,   s = 1, …, S,   j ∈ K_s^o,
            D_s^j x ≥ d_s^j,         s = 1, …, S,   j ∈ K_s^f,           (15)
            x ≥ 0,

where ρ > 0 is a scaling parameter, θ_s represents an estimate of the optimal recourse cost Q_s(x) under scenario s, ξ^k ∈ R^{n1} is a regularizing estimate (approximation) of the optimal first-stage decision vector, and K_s^o, K_s^f are subsequences of the iteration sequence K = {1, 2, …} corresponding to the iterations at which the subproblem for scenario s produced an optimality cut (10) or an induced feasibility cut (13), respectively.

The masterproblem above is stated for the case of regularized decomposition. A distinguishing difference of regularized decomposition from the traditional, linear decomposition algorithm is the inclusion of a quadratic regularizing term in the masterproblem objective (14). Regularized decomposition was introduced by Ruszczyński [20]. The regularizing term in the objective stabilizes the master. Regularization also enables the derivation of an a priori upper bound on the number of necessary cuts, making it possible to reliably delete inactive cuts during execution if the masterproblem grows significantly. The principles of regularized decomposition and its finite convergence were established in [20]; numerical results were reported in [21,29].

3.2. Regularized decomposition

The main features of regularized decomposition are summarized as follows:

Step 1: Set the iteration counter k = 0. Given parameters ρ > 0, 0 < γ < 1, and an estimate ξ^k ∈ R^{n1} for the first-stage solution, solve the regularized quadratic masterproblem (14) and (15). If the masterproblem is infeasible, then stop (the problem is infeasible); else continue. The masterproblem solution yields a first-stage proposal x^k and estimates θ_s^k of the recourse costs. Compute F̃^k = c^T x^k + Σ_{s=1}^{S} θ_s^k. If F̃^k = F(ξ^k), then stop (optimal solution found); else continue.

Step 2: Delete from the masterproblem cuts that are inactive at the current solution, so that the master contains at most n1 + S constraints.

Step 3: For each scenario s = 1, …, S solve the corresponding subproblem (4) at x^k.

(a) If the subproblem is infeasible at x^k, return to the master a feasibility cut (13);

(b) If the subproblem solves optimally and Q_s(x^k) > θ_s^k, then return to the master an optimality cut (10).

Step 4: If any of the subproblems returned infeasible in Step 3, set ξ^{k+1} = ξ^k and go to Step 6.

Step 5: Compute F(x^k) = c^T x^k + Σ_{s=1}^{S} Q_s(x^k). If F(x^k) = F̃^k, or if F(x^k) ≤ γF(ξ^k) + (1 − γ)F̃^k and exactly n1 + S constraints were active at the last solution of the masterproblem, then update the regularizing point ξ^{k+1} = x^k; otherwise set ξ^{k+1} = ξ^k.

Step 6: Increment the iteration counter k by one and repeat from Step 1.

Note: As long as some subproblem s has returned infeasible for all previous proposals x, there is no optimality cut corresponding to this subproblem in the master (i.e., K_s^o = ∅). The corresponding variable θ_s is set to −∞. Also, until all subproblems have returned optimal for at least one proposal, the regularizing term is ignored in the master and F(ξ^k) = ∞ (i.e., linear decomposition is employed until all subproblems return feasible at some trial solution x^k, which becomes the first applicable regularizing point ξ). Of course, regularized decomposition can be warm-started if a feasible first-stage solution estimate ξ is available.

The suitability of this decomposition scheme for parallel processing is evident from the following observation: given a first-stage trial solution x^k, the recourse subproblems (4) decompose across scenarios. Hence, the subproblems in Step 3 can be solved concurrently on multiple processors, with the masterproblem coordinating the iterative decomposition scheme.
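As an illustration of the overall loop, here is a minimal serial sketch of the linear (unregularized) multicut variant of Steps 1-6, run on an invented two-scenario instance whose recourse is feasible throughout the first-stage region (so no feasibility cuts arise); the crude lower bound on the θ_s variables, used to keep the first masterproblem bounded, is our own device rather than anything from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# invented two-scenario data; probabilities folded into q_s as in Section 2
c = np.array([1.0, 0.0]); A = np.array([[1.0, 1.0]]); b = np.array([1.0])
qs = [np.array([0.5])] * 2
Ws = [np.array([[1.0]])] * 2
Ts = [np.array([[1.0, 2.0]])] * 2
hs = [np.array([3.0]), np.array([2.0])]
S, n1 = len(hs), len(c)

def subproblem(s, x):
    """Dual form (5) of subproblem s: returns Q_s(x) and cut data (E, e) of (10)."""
    rhs = hs[s] - Ts[s] @ x
    res = linprog(-rhs, A_ub=Ws[s].T, b_ub=qs[s],
                  bounds=[(None, None)] * len(hs[s]), method="highs")
    pi = res.x
    return pi @ rhs, pi @ Ts[s], pi @ hs[s]

cuts = [[] for _ in range(S)]    # per-scenario optimality cuts (multicut)
THETA_LB = -100.0                # crude bound so the first master is bounded
for k in range(50):
    # master over z = (x, theta_1..theta_S): min c^T x + sum_s theta_s
    A_ub, b_ub = [], []
    for s in range(S):
        for E, e in cuts[s]:     # E x + theta_s >= e  ->  -E x - theta_s <= -e
            row = np.zeros(n1 + S); row[:n1] = -E; row[n1 + s] = -1.0
            A_ub.append(row); b_ub.append(-e)
    cost = np.concatenate([c, np.ones(S)])
    Aeq = np.hstack([A, np.zeros((A.shape[0], S))])
    bounds = [(0, None)] * n1 + [(THETA_LB, None)] * S
    m = linprog(cost, A_ub=np.array(A_ub) if A_ub else None,
                b_ub=np.array(b_ub) if b_ub else None,
                A_eq=Aeq, b_eq=b, bounds=bounds, method="highs")
    x, theta = m.x[:n1], m.x[n1:]
    done = True
    for s in range(S):
        Qs, E, e = subproblem(s, x)
        if Qs > theta[s] + 1e-8:     # cut violated: append it
            cuts[s].append((E, e)); done = False
    if done:
        break

F_opt = c @ x + sum(subproblem(s, x)[0] for s in range(S))
```

Regularized decomposition follows the same skeleton but adds the quadratic proximal term to the master objective, turning the master into a QP (solved with LOQO in the paper).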

The description above concerns regularized decomposition. The traditional, linear decomposition proceeds along the same lines, but simplifies at the following points: (a) a linear masterproblem (14) and (15) is solved in Step 1 without the quadratic regularizing term; (b) Steps 2, 4 and 5 are omitted since no regularization is used.

3.3. Distributed implementations

We have developed distributed implementations of the regularized and the linear decomposition algorithms described above. The implementations are based on the master-slave framework of distributed computations. On a computer system with N processors the algorithms partition and solve a two-stage stochastic program as outlined below.

Master process:

Step M1: Read the input data. Partition the scenario set into subsets S_i ⊆ S, i = 1, …, N (∪_{i=1}^{N} S_i = S; S_i ∩ S_j = ∅ for i ≠ j).

Step M2: Spawn N distributed processes. Transmit to each process i = 1, …, N the data (q_s, h_s, T_s, W_s) for its assigned scenarios s ∈ S_i.

Step M3: Set up the initial masterproblem, and initialize the iteration counter k = 0.

Step M4 (Execute Steps 1 and 2 of the algorithm): Solve the masterproblem. If the termination criteria are met, broadcast a termination signal and exit. Otherwise, transmit the trial solution x^k and the (partially aggregated) recourse cost estimates θ_i^k to each distributed process i = 1, …, N. If desired, delete inactive cuts.

Step M5 (Corresponds to Step 3): Receive asynchronously the responses of the distributed processes, which arrive in random order. Append the new feasibility and optimality cuts to the masterproblem.

Step M6 (Execute Steps 4-6 of the algorithm): Check the conditions for resetting the regularization point; set ξ^{k+1}. Increment the iteration counter k and repeat from Step M4.

Distributed process i (i = 1, …, N):

Step S1: Receive from the master process the data (q_s, h_s, T_s, W_s) for its assigned scenarios s ∈ S_i. Set up the scenario subproblems.

Step S2: If a termination signal is received, exit; else receive from the master process a trial solution x^k and recourse cost estimate θ_i^k.

Step S3: For each scenario s ∈ S_i: modify the right-hand side vector h_s − T_s x^k and solve the subproblem.

(S3a): If the subproblem is infeasible, send to the master process the data for a feasibility cut (13), D_s^k = (σ_s^k)^T T_s, d_s^k = (σ_s^k)^T h_s, and go to Step S5.

(S3b): If the subproblem solves optimally, record the optimal value Q_s(x^k) and the optimal dual prices π_s^k.

Step S4: If Σ_{s∈S_i} Q_s(x^k) > θ_i^k, send to the master process the data for an optimality cut, E_i^k x + θ_i ≥ e_i^k, where E_i^k = Σ_{s∈S_i} (π_s^k)^T T_s and e_i^k = Σ_{s∈S_i} (π_s^k)^T h_s; also send the aggregate recourse cost Σ_{s∈S_i} Q_s(x^k).

Step S5: Increment the iteration counter k and repeat from Step S2.

In the above implementation the subproblems are solved by a number (N) of distributed processes. The master process coordinates and controls the execution of the algorithm. Note that this is a multicut decomposition; one cut per distributed process may be added to the masterproblem at each iteration. As the number of scenarios is typically much larger than the number of processors, this is not the full multicut approach of Birge and Louveaux [28]. Instead, multiple scenarios s ∈ S_i are grouped together in each distributed process i, which provides optimality cuts representing linear approximations to the partially aggregated recourse cost function Σ_{s∈S_i} Q_s(x). This approach was chosen so as to minimize the communication of a distributed process with the master process and reduce the serial execution step, by returning a single cut in each iteration instead of a cut for every one of its assigned scenarios. However, the masterproblem utilizes all of these cuts to obtain a better approximation of the recourse cost, instead of aggregating them into a single cut as in the original L-shaped method [15].
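The partial aggregation in Step S4 is just a sum of per-scenario cut coefficients. A small sketch with hypothetical dual solutions (all values invented):

```python
import numpy as np

# hypothetical dual solutions pi_s and data (T_s, h_s) for the scenarios
# assigned to one distributed process i
pis = [np.array([0.5]), np.array([0.25]), np.array([1.0])]
Ts  = [np.array([[1.0, 2.0]])] * 3
hs  = [np.array([3.0]), np.array([2.0]), np.array([4.0])]

# partially aggregated optimality cut for process i (Step S4):
#   E_i^k = sum_{s in S_i} (pi_s^k)^T T_s,  e_i^k = sum_{s in S_i} (pi_s^k)^T h_s,
# giving the single cut E_i^k x + theta_i >= e_i^k sent to the master
E_i = sum(pi @ T for pi, T in zip(pis, Ts))
e_i = sum(pi @ h for pi, h in zip(pis, hs))
```

Only one (E_i^k, e_i^k) pair per process crosses the network each iteration, which is the communication saving discussed above.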


A distributed process interrupts and transmits to the master a feasibility cut at the first infeasible subproblem it encounters for a given proposal x^k (Step S3a). This aims to reduce the turnaround time in distributed processes when infeasible scenarios are encountered, by returning feasibility cuts as soon as they are detected. The simplex-based routines of OSL [30] which solve the subproblems readily provide the phase-one dual prices σ_s^k if a subproblem (4) turns out infeasible. So, a feasibility cut (13) is determined without having to explicitly reformulate and solve the feasibility subproblem (12), as would have been necessary with an interior point solver.

Step S4 (Step 3 in the initial description) indicates that dominated (redundant) optimality cuts are not appended. Appended optimality cuts correspond to new facets of the polyhedral recourse cost Σ_{s∈S_i} Q_s(x) which are violated at the previous masterproblem solution x^k. Similarly, feasibility cuts are not satisfied at the previous proposal. Therefore, the surplus variables of newly appended cuts should be introduced in the basis of the updated masterproblem. We explicitly introduce to the final basis of the previous masterproblem the surplus variables of new cuts to specify a primal-feasible starting basis for the next masterproblem. This is an effective warm-starting scheme for the masterproblem, which constitutes the serial, coordination step in distributed decomposition.

We implemented the distributed algorithms in Fortran codes. We used the facilities of the PVM library [31] for the process-control and communication needs of the distributed implementations. We solve the subproblems using the simplex-based solvers in OSL. In the linear decomposition algorithm we apply the same solvers to the masterproblem as well. For the quadratic masterproblems in regularized decomposition we use LOQO [32], a state-of-the-art primal-dual interior point solver.

Our target multiprocessing environment was a distributed system with a moderate number of processors. The issue of load balancing is addressed by partitioning the scenario set as equally as possible among the available processors. However, even subproblems of identical size do not necessarily take the same time to solve with a simplex-based algorithm. Hence, totally even load distribution cannot be assured.
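A minimal sketch of such an even partition (the paper only states that the set is split as equally as possible; this contiguous scheme is one plausible choice):

```python
def partition(S, N):
    """Split scenario indices 0..S-1 into N contiguous subsets whose
    sizes differ by at most one (a sketch of 'as equal as possible')."""
    base, extra = divmod(S, N)
    sets, start = [], 0
    for i in range(N):
        size = base + (1 if i < extra else 0)
        sets.append(list(range(start, start + size)))
        start += size
    return sets
```

For example, 10 scenarios over 3 processes yields subsets of sizes 4, 3 and 3; as noted above, equal cardinality still does not guarantee equal solve times.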

In each distributed process i = 1, 2, …, N, the entire group of its assigned scenarios (s ∈ S_i) can be solved as a single, aggregate linear program with block-diagonal structure; recall the structure of problem (6) with the vectors T_s x^k moved to the right-hand side of the constraints. Alternatively, the individual scenario subproblems can be solved sequentially, as stated in the algorithmic description. We tested both approaches and elected to adopt the latter. This choice is discussed in Section 4. Warm starts were always used when solving the subproblems.

4. Computational experiments

The distributed regularized decomposition algorithm and its linear counterpart were tested on an IBM SP2 multiprocessor at the University of Geneva. This system has 15 RISC processors, running AIX 3.2, which are interconnected via a high-speed crossbar communication link. Unfortunately, it was not possible to conduct the computational experiments on a totally dedicated system. Activity on the processors and network traffic were beyond the user's control. However, special care was exercised to ensure that the distributed processes executed on symmetric, idle processors in order to minimize the effects of external load on timing results. Higher speedups could potentially be achieved on the same system when executing in a fully dedicated mode.

All timings reported represent elapsed wall-clock times. All test problems were solved to an accuracy of relative duality gap |(F(ξ^k) − F̃^k) / F̃^k| ≤ 10^{-8}, which is a stringent tolerance for decomposition methods.

4.1. Test problems

The test problems are taken from a library of standard stochastic programs [33]. They correspond to problems ranging from capacity expansion and production scheduling to telecommunications network design. In these problems the uncertain parameters are the right-hand side vectors h_s of the second-stage constraints; other problem data are deterministic. The characteristics of several test problems are summarized in Tables 1 and 2.

Table 1
Summary of test problem core sizes and structures

Problem    First-stage           Second-stage
           Rows m1   Cols n1     Rows m2   Cols n2
SCSD8        10        70          20       140
SCRS8        28        37          28        38
SCTAP1       30        48          60        96
SCAGR7       15        20          38        40
SCFXM1       92       114         148       225
SEN           1        89         175       706

Table 2
Sample test problem sizes (deterministic equivalents)

Problem       Scenarios   Constraints   Variables   Non-zeroes
SCSD8.432        432         8650        60 550      190 210
SCRS8.512        512       14 364        19 493       50 241
SCTAP1.480       480       28 830        46 128      169 068
SCAGR7.32         32         1231          1300         4074
SCAGR7.64         64         2447          2580         8106
SCAGR7.432       432       16 431        17 300       54 474
SCAGR7.864       864       32 847        34 580      108 906
SCFXM1.64         64         9564        14 514       53 799
SCFXM1.128       128       19 036        28 914      106 919
SCFXM1.256       256       37 980        57 714      213 159
SEN.16            16         2801        11 385       38 057
SEN.32            32         5601        22 681       76 025
SEN.64            64       11 201        45 273      151 961
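The deterministic-equivalent sizes in Table 2 follow from the core dimensions in Table 1 via the counts (m1 + S·m2) constraints and (n1 + S·n2) variables given in Section 2; a quick check:

```python
# core dimensions (m1, n1, m2, n2) for a few problems, as listed in Table 1
core = {
    "SCSD8":  (10, 70, 20, 140),
    "SCAGR7": (15, 20, 38, 40),
    "SEN":    (1, 89, 175, 706),
}

def det_equiv_size(name, S):
    """(constraints, variables) of the deterministic equivalent (6)
    with S scenarios: (m1 + S*m2, n1 + S*n2)."""
    m1, n1, m2, n2 = core[name]
    return m1 + S * m2, n1 + S * n2
```

For example, SCSD8 with 432 scenarios gives 10 + 432·20 = 8650 constraints and 70 + 432·140 = 60 550 variables, matching the SCSD8.432 row of Table 2.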

4.2. Numerical results

We examine first the effects of regularization. Table 3 shows the number of iterations and the number of cuts generated with linear and with regularized decomposition, respectively, using different numbers of processors. Clearly, regularization can reduce the number of iterations and accelerate convergence. This benefit is more pronounced in problems that prove difficult for decomposition, requiring a large number of iterations. In particular, the SEN problem has a single first-stage constraint and requires a very large number of optimality cuts in linear decomposition to identify an effective set of supports for the expected recourse function Q(x); regularized decomposition proves very effective in this situation. Moreover, as the number of scenarios in a problem increases, the required number of iterations is more stable with regularized decomposition than with its linear counterpart.

In all the problems tested, regularized decomposition seldom took more iterations than linear decomposition; problem SCAGR7 in Table 3 is one such exception. Even there, the differences in iteration counts between the two methods were fairly minor. This behavior is attributed to over-regularization, which can result in some "degenerate" steps: iterations in which most of the subproblems remain on the same facet of the recourse cost functions Q_s(x) and thus return no new cuts. This occurs when a higher than necessary weight is placed on the quadratic regularizing term in the objective of the master, i.e., when a comparatively small scaling parameter p is used for the regularizing penalty, which in turn restricts the step in the first-stage proposal x. In such situations it is helpful to reduce the weight on the regularizing term (i.e., to increase the value of the parameter p). Convergence performance can improve by devising effective heuristics to dynamically adjust the scaling factor of the regularizing penalty during execution. Through experimentation on various test problems, we selected the following strategies to dynamically adjust the regularizing parameter p:
• if no new cuts are appended, set p ← 100p for the next iteration;
• if F(x^k) > γ F(ξ^k) + (1 − γ) F̂^k, set p ← 0.25p;
• if F(x^k) < (1 − γ) F(ξ^k) + γ F̂^k, set p ← 10p,
where F̂^k denotes the value of the master's piecewise-linear model at x^k and ξ^k is the current regularizing center. The initial values of the parameters are p = 1 and γ = 0.7. These settings proved generally effective and are the tactics used for all the runs reported in Table 3.
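The adjustment rules for the regularizing parameter p can be collected into a small helper. This is only an illustrative sketch: the function and argument names are ours, and the model value passed as F_model stands in for the master's predicted objective at the trial point.

```python
def adjust_rho(rho, F_x, F_xi, F_model, gamma=0.7, new_cuts=True):
    """Heuristic update of the regularizing scale rho (the paper's p).

    F_x:      actual objective value at the trial point x^k
    F_xi:     objective value at the current regularizing center xi^k
    F_model:  the master's predicted value at x^k
    """
    if not new_cuts:
        return 100.0 * rho   # degenerate step: loosen the regularization
    if F_x > gamma * F_xi + (1.0 - gamma) * F_model:
        return 0.25 * rho    # poor progress: tighten the regularization
    if F_x < (1.0 - gamma) * F_xi + gamma * F_model:
        return 10.0 * rho    # good progress: loosen the regularization
    return rho               # otherwise leave the parameter unchanged
```

With gamma = 0.7 and rho = 1, a trial value well above the combination of center and model values tightens the penalty, while a value well below it loosens the penalty.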

On the choice between a multicut versus a single aggregate cut update of the masterproblem, our results do not provide strong evidence to support an unequivocal preference between the two approaches. In our test cases there is no clear trend between the number of iterations and the number of partially aggregated cuts appended to the master per iteration. This can be seen in Table 3 from the number of iterations in runs with different numbers of processors, corresponding to different levels of disaggregation in optimality cuts. A cut per iteration is the minimum communication requirement for each distributed process. We utilize these partially aggregated cuts in the masterproblem instead of further aggregating them to a single cut. This approach is motivated by the fact that the worst-case bound on the number of iterations is lower for a multicut decomposition than for a single-cut decomposition [28].

However, we did not implement a full multicut approach (i.e., one cut per scenario), as our test results did not suggest a compelling reason to do so. There is no evidence in our experiments that the number of iterations could be further reduced with a full multicut approach. In a full multicut approach each distributed process would have returned a cut for each of its assigned subproblems, thus increasing the communication overhead as well as the size of the masterproblem. Both of these effects would increase the algorithms' serial bottleneck with a consequent reduction in speedup. We note, however, that in other studies [21,29] the use of a full multicut approach in regularized decomposition proved beneficial for some problems. Partially aggregated multicut strategies, as we employ here, were not tested in those situations to determine the degree of disaggregation of cuts that is generally appropriate; this choice could be problem dependent.
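The partial aggregation each distributed process performs can be sketched as follows. This assumes, as is standard for L-shaped optimality cuts, that a scenario cut has the form theta >= g_s·x + c_s and that a process combines the cuts of its assigned scenarios by probability weighting; the function name is illustrative.

```python
def aggregate_cuts(subgradients, intercepts, probabilities):
    """Combine per-scenario optimality cuts (theta >= g_s.x + c_s) into a
    single probability-weighted cut for one distributed process.

    Returns the pair (g, c) describing the aggregate cut theta >= g.x + c.
    """
    n = len(subgradients[0])
    # weighted sum of the scenario subgradients, component by component
    g = [sum(p * gs[j] for p, gs in zip(probabilities, subgradients))
         for j in range(n)]
    # weighted sum of the scenario intercepts
    c = sum(p * cs for p, cs in zip(probabilities, intercepts))
    return g, c
```

A full multicut scheme would skip this aggregation and return every (g_s, c_s) pair, at the cost of more communication and a larger masterproblem.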

Table 4 summarizes the effect of our partial multicut approach on the computational effort of the masterproblem. This table reports the average workload for the masterproblem per decomposition iteration, expressed also as a percentage of the total serial solution time on a single processor. These results indicate the proportion of the serial part of the algorithm, which bounds the achievable speedup in distributed execution. This bound is a consequence of Amdahl's law [34], which states that an algorithm's attainable speedup S_N on a parallel system with N processors is bounded by

S_N = t_1 / t_N ≤ 1 / (σ + (1 − σ)/N),   (16)

where t_1 is the solution time for a problem on a single processor, t_N the solution time for the same problem on a system with N identical processors, and σ the serial fraction of execution (i.e., the proportion of t_1 that must always execute serially, irrespective of the number of available processors).

Obviously, Amdahl's upper bound on speedup is attainable only if the parallelizable part of execution can be divided perfectly equally among the processors. Perfect load balance among processors, however, often cannot be achieved in the decomposition algorithms.
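Amdahl's bound is a one-line computation; the sketch below simply evaluates the right-hand side of (16) for a given serial fraction and processor count.

```python
def amdahl_bound(sigma, N):
    """Upper bound on speedup with serial fraction sigma on N processors,
    i.e. 1 / (sigma + (1 - sigma)/N)."""
    return 1.0 / (sigma + (1.0 - sigma) / N)
```

For instance, a fully parallel workload (sigma = 0) admits speedup N, while as N grows the bound approaches 1/sigma, which is the scalability limit discussed below.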

In Table 4 we observe that the additional workload to solve the larger masterproblems in our multicut approaches, in comparison to the single aggregate cut approach, is rather small in most cases and should not have a significant effect on the speedup of coarse-grain, distributed decomposition executing on systems with a moderate number of processors. However, as more cuts are added to the master per iteration and, especially, as more iterations are required to solve a problem (e.g., problem SCFXM1), the proportional workload of the serial masterproblem and serial communications increases. This increase in the serial bottleneck of the decomposition algorithms is reflected in lower speedups, as S_N is a decreasing function of σ.

Table 3
Comparison of linear and regularized decomposition - iterations and cuts
(Feas./Opt./Total: feasibility, optimality, and total cuts added to the masterproblem)

                         Linear decomposition           Regularized decomposition
Problem      Processors  Iter.  Feas.  Opt.   Total     Iter.  Feas.  Opt.   Total
SCSD8.432    1           2      0      2      2         2      0      2      2
             2           2      0      4      4         2      0      4      4
             4           2      0      8      8         2      0      8      8
             8           2      0      16     16        2      0      16     16
SCTAP1.480   1           3      0      2      2         3      0      2      2
             2           3      0      3      3         3      0      3      3
             4           3      0      7      7         3      0      7      7
             8           3      0      12     12        3      0      12     12
SCRS8.512    1           2      0      2      2         2      0      2      2
             2           2      0      4      4         2      0      4      4
             4           2      0      8      8         2      0      8      8
             8           2      0      16     16        2      0      16     16
             10          2      0      19     19        (not solved with this option)
SCAGR7.32    1           13     4      9      13        14     3      8      11
             2           13     8      18     26        13     6      14     20
             4           9      12     20     32        14     12     20     32
             8           9      24     34     58        13     24     30     54
SCAGR7.64    1           13     4      9      13        17     3      7      10
             2           13     8      18     26        13     6      14     20
             4           13     16     26     52        14     12     32     44
             8           9      24     40     64        14     24     40     64
SCAGR7.432   1           9      3      6      9         5      3      1      4
             2           16     8      22     30        19     8      22     30
             4           13     12     40     52        16     12     44     56
             8           11     24     60     84        15     24     66     90
SCAGR7.864   1           9      3      5      8         9      3      5      8
             2           16     8      24     32        16     8      24     32
             4           14     12     44     56        21     12     64     76
             8           14     24     84     108       21     24     114    138
             10          7      30     30     60        (not solved with this option)
SCFXM1.64    1           45     32     12     44        40     26     10     36
             2           46     66     24     90        43     56     24     80
             4           45     128    48     176       38     100    40     140
             8           45     256    96     352       40     184    112    296
SCFXM1.128   1           56     43     12     55        43     29     11     40
             2           47     68     24     92        44     58     26     84
             4           47     136    48     184       40     106    42     148
             8           52     312    96     408       46     214    128    342
SCFXM1.256   1           56     43     12     55        43     29     11     40
             2           56     86     24     110       49     64     26     90
             4           53     160    48     208       46     122    48     170
             8           47     272    96     368       42     194    118    312
SEN.16       1           392    0      392    392       16     0      12     12
             2           202    0      404    404       15     0      22     22
             4           180    0      720    720       15     0      42     42
             8           148    0      1183   1183      16     0      84     84
SEN.32       1           153    0      153    153       16     0      12     12
             2           163    0      326    326       16     0      24     24
             4           311    0      1244   1244      15     0      54     54
             8           285    0      2276   2276      15     0      98     98
SEN.64       1           400    (not completed to tolerances)  19  0  15  15
             2           281    0      562    562       18     0      28     28
             4           188    0      752    752       19     0      62     62
             8           204    0      1631   1631      19     0      102    102

The issues of load balancing in the concurrent solution of subproblems on parallel computers and the reduction of the serial bottleneck of the master should be carefully addressed in synchronous distributed implementations that use a large number of processors; recall from (16) that lim_{N→∞} S_N = 1/σ. Similarly, the combined effects on speedup of the larger masterproblem and of the additional communication overhead, when each distributed process returns multiple cuts per iteration, should be taken into account if a full multicut approach is adopted. These factors affect the choice of an appropriate degree for disaggregating cuts in the masterproblem when the decomposition methods are implemented on different multiprocessor architectures.

Table 4 further indicates that the proportional workload of the serial masterproblem decreases as problem size increases with the addition of more scenarios. Therefore, higher speedups are achievable as the number of scenarios in a problem increases. This is indeed observed in our experiments, as shown in Fig. 1.

Moreover, the results in Table 4 indicate a somewhat higher proportional workload in the masterproblem for regularized decomposition in comparison to linear decomposition. Hence, relatively lower speedups are typically achieved with regularized decomposition. This difference in behavior is due to the different approaches used to solve the masterproblem in the two algorithms. In linear decomposition we use an effective restarting procedure for the masterproblem, as explained in Section 3.3; the masterproblem then solves quickly with the simplex algorithms in OSL. In the regularized decomposition algorithm, however, we use the primal-dual interior point code LOQO to solve the quadratic masterproblem, for which we have no effective restart mechanism. Each masterproblem is solved anew with the interior point code, including the symbolic factorization of its constraint matrix. This sacrifice in efficiency for the masterproblem is reflected in a reduction of speedup. Clearly, an alternative procedure for solving the quadratic masterproblem that is amenable to effective restarts, such as the active set approach proposed by Ruszczyński [20,21], could potentially reduce the workload for the master with consequent improvements in speedup.

Table 4
Relative masterproblem workload in linear and regularized decomposition
(Average masterproblem workload per iteration)

                         Linear decomposition            Regularized decomposition
Problem      Processors  Time (ms)  % of serial          Time (ms)  % of serial
                                    solution time                   solution time
SCAGR7.32    1           0.007      0.001                0.015      0.002
             2           0.011      0.002                0.024      0.004
             4           0.013      0.002                0.025      0.004
             8           0.017      0.003                0.032      0.005
SCAGR7.64    1           0.008      0.001                0.019      0.001
             2           0.009      0.001                0.042      0.003
             4           0.010      0.001                0.049      0.004
             8           0.014      0.001                0.072      0.005
SCAGR7.432   1           0.013      0.0002               0.023      0.0006
             2           0.014      0.0002               0.034      0.0009
             4           0.019      0.0003               0.042      0.0011
             8           0.021      0.0003               0.053      0.0014
SCAGR7.864   1           0.013      0.0001               0.025      0.0002
             2           0.016      0.0001               0.035      0.0003
             4           0.025      0.0002               0.041      0.0003
             8           0.032      0.0002               0.043      0.0003
SCFXM1.64    1           24.870     3.610                40.061     4.713
             2           25.345     3.679                45.628     5.368
             4           27.377     3.974                68.816     8.096
             8           27.701     4.021                71.672     8.432
SCFXM1.128   1           24.095     2.014                43.775     2.689
             2           22.037     1.842                47.975     2.947
             4           22.982     1.921                63.766     3.917
             8           24.861     2.078                65.736     4.038
SCFXM1.256   1           24.763     1.146                43.929     1.359
             2           24.828     1.149                50.555     1.564
             4           27.579     1.273                62.257     1.926
             8           32.272     1.341                68.883     2.131

Table 5 summarizes the solution times and speedups observed with linear and regularized decomposition on the SP2 multiprocessor for a representative set of test problems. The solution time represents elapsed wall-clock time. It includes the communication overhead for transmitting proposals and cuts between the master process and the distributed subproblem processes; time for input reading and setup of the distributed processes is excluded. In order to account for the differences in iterations when the number of distributed processes, and hence the number of cuts per iteration, varies, we compute speedup in terms of the average solution time per iteration.

We observe a trend for higher speedup at a given level of parallelism as the number of scenarios in a problem increases, as is evident in the results of Table 5 and in Fig. 1. This is due to the fact that, while the size of the masterproblem remains unchanged, the workload for solving the larger number of scenario subproblems increases. As the parallelizable proportion of the workload increases relative to the computational effort in the serial masterproblem, higher speedups are observed.

Fig. 1. Observed speedup (per iteration) of distributed linear and regularized decomposition executing on N = 1, 2, 4 and 8 processors.

Table 5
Comparison of total solution times and speedup (per iteration) with linear and regularized decomposition

                         Linear decomposition         Regularized decomposition
Problem      Processors  Time (s)   Speedup           Time (s)   Speedup
SCAGR7.32    1           8.001      1.00              8.994      1.00
             2           4.652      1.72              4.942      1.69
             4           1.684      3.29              2.828      3.18
             8           0.968      5.72              1.538      5.43
SCAGR7.64    1           16.998     1.00              23.000     1.00
             2           9.713      1.75              10.407     1.69
             4           4.956      3.43              5.757      3.29
             8           2.025      5.81              3.215      5.47
SCAGR7.432   1           56.997     1.00              18.998     1.00
             2           58.234     1.74              42.466     1.70
             4           23.658     3.48              17.828     3.41
             8           11.649     5.98              10.017     5.69
SCAGR7.864   1           118.989    1.00              126.005    1.00
             2           118.177    1.79              130.999    1.71
             4           50.850     3.64              64.892     3.45
             8           29.950     6.18              49.121     5.99
SCRS8.512    1           23.996     1.00              26.002     1.00
             2           13.041     1.84              14.055     1.85
             4           6.996      3.43              8.001      3.25
             8           3.641      6.59              4.025      6.46
SCSD8.432    1           21.992     1.00              22.008     1.00
             2           12.017     1.83              13.022     1.69
             4           6.449      3.41              6.772      3.25
             8           3.690      5.96              4.046      5.44
SCTAP1.480   1           23.002     1.00              27.022     1.00
             2           13.144     1.75              15.441     1.75
             4           6.970      3.30              8.264      3.27
             8           3.905      5.89              5.340      5.06
SCFXM1.64    1           31.001     1.00              34.000     1.00
             2           18.532     1.71              21.756     1.68
             4           10.098     3.07              10.731     3.01
             8           6.250      4.96              7.219      4.71
SCFXM1.128   1           66.997     1.00              70.001     1.00
             2           32.503     1.73              42.892     1.67
             4           17.794     3.16              21.142     3.08
             8           12.543     4.96              15.377     4.87
SCFXM1.256   1           121.007    1.00              138.995    1.00
             2           65.058     1.86              85.156     1.86
             4           34.916     3.28              46.906     3.17
             8           19.308     5.26              26.362     5.15
SEN.16       1           366.008    1.00              15.030     1.00
             2           110.296    1.71              8.752      1.61
             4           61.115     2.75              5.124      2.75
             8           29.846     4.63              3.246      4.63
SEN.32       1           228.003    1.00              28.007     1.00
             2           139.601    1.74              17.182     1.63
             4           167.313    2.77              9.479      2.77
             8           74.773     5.68              5.610      4.68
SEN.64       1           n/a        -                 51.080     1.00
             2           1147.001   -                 29.688     1.63
             4           459.005    -                 18.308     2.79
             8           182.994    -                 10.915     4.68

Table 5 also indicates that regularized decomposition exhibits somewhat lower speedups compared to linear decomposition. As explained earlier, this is mainly a consequence of the increased solution time of the quadratic masterproblem with the interior point method, for which we have no effective restart mechanism.

Speedup in a synchronous, coarse-grain, distributed decomposition is affected by several factors, namely, load balancing in the distributed processes, the proportion of serial execution for the masterproblem, and the overhead of communications between the master process and the subproblem processes. Load balancing can be affected by the partitioning scheme and the procedures used to solve the subproblems. In our partitioning scheme the subsets of scenarios assigned to any pair of processors can never differ by more than one scenario. All scenario subproblems have equal sizes and identical structures. Still, they do not necessarily take identical times to solve with the simplex algorithm.
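The partitioning scheme described above, in which the subsets assigned to any pair of processors differ by at most one scenario, can be sketched as follows; the function name and the use of index lists are illustrative.

```python
def partition_scenarios(num_scenarios, num_processes):
    """Split scenario indices 0..num_scenarios-1 into contiguous subsets
    whose sizes differ by at most one."""
    base, extra = divmod(num_scenarios, num_processes)
    subsets, start = [], 0
    for i in range(num_processes):
        # the first `extra` processes receive one additional scenario
        size = base + (1 if i < extra else 0)
        subsets.append(list(range(start, start + size)))
        start += size
    return subsets
```

For example, 10 scenarios on 4 processes yield subsets of sizes 3, 3, 2 and 2, so no pair of processes differs by more than one scenario.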

The solution method for the subproblems can also influence speedup measures. For example, each process i could construct and solve a single bundle subproblem composed of all the scenarios s ∈ S_i assigned to the process. The size of the bundle subproblem in each process decreases as more processes are added. We tested this approach in preliminary experiments. We employed a built-in block-detection and decomposition facility in OSL [30] (routine EKKLPDC) to solve the aggregate subproblems in the first iteration. Subsequently, we restarted each aggregate subproblem from its optimal basis in the previous iteration. This approach demonstrated good speedup as the number of processors was increased. Even superlinear speedup was observed in some test cases, due to the fact that each of the distributed aggregate subproblems could sometimes solve substantially faster than the reduction in its size implied, compared to the larger aggregate subproblems in a coarser partition (when fewer processes are used). Clearly, the apparent speedup in this approach is not just due to parallelism; it depends on the relative complexity of the aggregate subproblems with respect to their size, and on the ability of the solver to detect and exploit any special structure in the subproblems.

However, the above approach was inferior, in terms of overall solution time, to the approach we finally adopted, even though it implied higher speedup in some instances. In our implementations, each distributed process sequentially solves its assigned scenario subproblems. We restart the simplex solver for each subproblem with its corresponding optimal basis from the previous iteration. Alternatively, we could restart each scenario subproblem in a distributed process from the optimal basis of the last-solved subproblem, thus reducing the memory requirements by storing a single basis instead of a separate basis for each scenario. This is always possible since all scenario subproblems have identical dimensions. We experimented with this approach as well and found that it does not provide a tangible advantage over the method that warm-starts each subproblem from its own previous optimal basis.

This should not be surprising. Recall from (4) that only the right-hand side vector h_s − T_s x^k of a subproblem s changes between successive iterations, due to the updating of the first-stage proposal x^k. As the algorithm progresses, the differences between the first-stage proposals of successive iterations should be expected to diminish. Hence, the optimal basis of a subproblem in one iteration should be near optimal in the next iteration and, consequently, few simplex iterations should be required to solve a subproblem that is warm-started from its optimal basis of the previous iteration. However, the differences in the data of different scenarios (q_s, h_s, T_s, W_s) can be more substantial, and the optimal basis of a subproblem should not necessarily be expected to be close to that of a different subproblem. Thus, more simplex iterations may generally be required when subproblems are warm-started from the optimal basis of a "neighboring" subproblem at the same iteration. This is indeed what the computational results seem to imply.
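The warm-starting bookkeeping of the adopted scheme can be sketched as follows. This is only a sketch of the control flow: solve_with_basis is a placeholder for a simplex solve (e.g., through OSL), not an actual solver API, and the cut and basis objects are opaque here.

```python
def solve_subproblems(scenarios, x_k, solve_with_basis, bases):
    """Solve each assigned scenario subproblem, warm-starting it from its
    own optimal basis of the previous iteration (the adopted strategy).

    `bases` maps scenario -> last optimal basis (absent on the first
    iteration); `solve_with_basis(s, x_k, basis)` is a placeholder for a
    simplex solve returning (cut, optimal_basis).
    """
    cuts = []
    for s in scenarios:
        cut, basis = solve_with_basis(s, x_k, bases.get(s))
        bases[s] = basis   # store one basis per scenario for the next pass
        cuts.append(cut)
    return cuts
```

The memory-saving alternative discussed above would keep a single shared basis per process instead of the per-scenario dictionary, trading storage for potentially more simplex iterations.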

5. Conclusions

This paper presented adaptations of a regularized and a linear decomposition method for execution on distributed multiprocessors. Both distributed algorithms were benchmarked on an SP2 multiprocessor using a set of standard stochastic programs.

Regularized decomposition demonstrates the potential to substantially reduce the number of iterations (and cuts) in difficult problems that include a large number of first-stage variables. Hence, regularization is an effective way to accelerate the convergence of decomposition on difficult problems, while on easier problems it performs comparably to its linear counterpart, without an observable degradation in performance.

Our experiments demonstrate that both algorithms can parallelize effectively on a system with a moderate number of processors. Efficient solution procedures for the masterproblem, including effective restart mechanisms, become essential to sustain increasing speedups as the number of processors increases. In this respect, the use of a general-purpose interior point code to solve the masterproblem increases the proportion of the serial coordination step and, consequently, reduces the achievable speedup. Although the speedup reduction on a system with a small number of processors is not substantial, there is clearly an adverse effect on the scalability of the algorithm to systems with a larger number of processors.

Synchronous implementations of distributed decomposition inevitably face a limitation to their potential for scalability, as dictated by Amdahl's law. The following issues need to be resolved in order to attain scalability: (1) effective load balancing for the distributed subproblem processes, and (2) a significant reduction in the serial coordination step of the masterproblem.

The first issue points to the adoption of data-parallel procedures for solving the subproblems, whose run-time complexity is fairly constant for problems of similar size. Nielsen and Zenios [18] demonstrated that interior point methods are suitable in this respect for achieving load balance in the concurrent solution of the subproblems. However, there are currently no effective restart mechanisms for interior point methods. Hence, the computational complexity of the repeated solution of the subproblems could increase even if data-parallel interior point solvers improve load balancing.

Efficient solvers and effective restarts are obviously necessary to reduce the serial bottleneck of the masterproblem. Still, it is doubtful whether these alone would sufficiently resolve the second issue. One possible direction would be to employ methods that can exploit parallelism in solving the masterproblem. Another would be the adoption of asynchronous decomposition procedures, in which the masterproblem is solved concurrently with the subproblems as soon as some new cuts arrive, without waiting for all subproblems to respond as in a synchronous approach. The slower processes would then not constantly dictate the progress of the algorithm. The potential of asynchronous decomposition methods has not yet been examined.

However, as long as the obstacle of the serial masterproblem remains, distributed decomposition methods cannot be expected to achieve scalability on systems with a large number of processors. The performance and scalability potential of various parallel algorithms for stochastic programs, including distributed decomposition methods, is examined in a comparative study by Vladimirou and Zenios [35].

Acknowledgements

The computational experiments for this study were carried out on an IBM SP2 multiprocessor at the University of Geneva, Switzerland. The author would like to thank Professor Jean-Philippe Vial, Co-Director of LOGILAB at the University of Geneva, for arranging access to the SP2, and Bob Matthews for his assistance on technical matters concerning the use of the SP2.

References

[1] A. Prékopa, Stochastic Programming, Kluwer Academic Publishers, Dordrecht, 1995.

[2] R.J-B. Wets, Large-scale linear programming techniques, in: Y. Ermoliev, R.J-B. Wets (Eds.), Numerical Techniques for Stochastic Optimization, Springer, Berlin, 1988, pp. 65-94.

[3] R.J-B. Wets, Stochastic programming, in: A.H.G. Rinnooy Kan, M.J. Todd (Eds.), Handbooks in Operations Research and Management Science: Optimization, North-Holland, Amsterdam, 1989, pp. 573-629.

[4] P. Kall, S.W. Wallace, Stochastic Programming, Wiley, Chichester, UK, 1994.

[5] M.I. Kusy, W.T. Ziemba, A bank asset and liability management model, Operations Research 34 (3) (1986) 356-376.

[6] J.M. Mulvey, H. Vladimirou, Stochastic network programming for financial planning problems, Management Science 38 (11) (1992) 1642-1664.

[7] S.A. Zenios, A model for portfolio management with mortgage-backed securities, Annals of Operations Research 43 (1993) 337-356.

[8] R.S. Hiller, J. Eckstein, Stochastic dedication: Designing fixed income portfolios using massively parallel Benders decomposition, Management Science 39 (11) (1993) 1422-1438.

[9] M.V.F. Pereira, L.M.V.G. Pinto, Multistage stochastic optimization applied to energy planning, Mathematical Programming 52 (1991) 359-375.

[10] J. Jacobs, G. Freeman, J. Grygier, D. Morton, G. Schultz, K. Staschus, J. Stedinger, SOCRATES: A system for scheduling hydroelectric generation under uncertainty, Annals of Operations Research 59 (1995) 99-134.

[11] D. Bienstock, J.F. Shapiro, Optimizing resource acquisition decisions by stochastic programming, Management Science 34 (2) (1988) 215-229.

[12] L. Somlyody, R.J-B. Wets, Stochastic optimization models for lake eutrophication management, Operations Research 36 (5) (1988) 660-681.

[13] H. Vladimirou, R.J-B. Wets, S.A. Zenios (Eds.), Models for planning under uncertainty, Annals of Operations Research 59 (1995).

[14] J. Dupačová, Multistage stochastic programs: The state-of-the-art and selected bibliography, Kybernetika 31 (1994) 151-174.

[15] R. Van Slyke, R.J-B. Wets, L-shaped linear programs with applications to optimal control and stochastic programming, SIAM Journal on Applied Mathematics 17 (1969) 638-663.

[16] K.A. Ariyawansa, D.D. Hudson, Performance of a benchmark parallel implementation of the Van Slyke and Wets algorithm for two-stage stochastic programs on the Sequent/Balance, Concurrency: Practice and Experience 3 (1991) 109-128.

[17] G.B. Dantzig, J.K. Ho, G. Infanger, Solving stochastic linear programs on a hypercube multicomputer, Report SOL 91-10, Department of Operations Research, Stanford University, Stanford, CA (1991).

[18] S.S. Nielsen, S.A. Zenios, Scalable parallel Benders decomposition for stochastic linear programming, Working Paper, Department of Management Science and Information Systems, University of Texas, Austin, TX (1994).

[19] A. Ruszczyński, Parallel decomposition of multistage stochastic programming problems, Mathematical Programming 58 (1993) 201-228.

[20] A. Ruszczyński, A regularized decomposition method for minimizing a sum of polyhedral functions, Mathematical Programming 35 (1986) 309-333.

[21] A. Ruszczyński, Regularized decomposition of stochastic programs: Algorithmic techniques and numerical results, Working Paper WP-93-21, IIASA, Laxenburg, Austria (1993).

[22] R.J-B. Wets, Stochastic programs with fixed recourse: The equivalent deterministic program, SIAM Review 16 (1974) 309-339.

[23] J.R. Birge, L. Qi, Computing block-angular Karmarkar projections with applications to stochastic programming, Management Science 34 (12) (1988) 1472-1479.

[24] J.R. Birge, D. Holmes, Efficient solution of two-stage stochastic linear programs using interior point methods, Computational Optimization and Applications 1 (1992) 245-276.

[25] E.R. Jessup, D. Yang, S.A. Zenios, Parallel factorization of structured matrices arising in stochastic programming, SIAM Journal on Optimization 4 (4) (1994) 833-846.


[26] D. Yang, S.A. Zenios, A scalable parallel interior point algorithm for stochastic linear programming and robust optimization, Computational Optimization and Applications 7 (1997) 143-158.

[27] R.J-B. Wets, Programming under uncertainty: The solution set, SIAM Journal on Applied Mathematics 14 (5) (1966) 1143-1151.

[28] J.R. Birge, F.V. Louveaux, A multicut algorithm for two- stage stochastic linear programs, European Journal of Operational Research 34 (1988) 384-392.

[29] A. Ruszczyński, A. Swietanowski, On the regularized decomposition method for two-stage stochastic linear programs, Working Paper WP-96-14, IIASA, Laxenburg, Austria (1996).

[30] IBM, Optimization Subroutine Library: Guide and Reference, Release 2, Document SC23-0519-02, Kingston, NY (1991).

[31] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, V. Sunderam, PVM 3 User's Guide and Reference Manual, Technical Report ORNL/TM-12187, Oak Ridge National Laboratory, Oak Ridge, TN (1993).

[32] R.J. Vanderbei, LOQO User's Manual, Technical Report SOR-92-5, Department of Civil Engineering and Operations Research, Princeton University, Princeton, NJ (1992).

[33] D. Holmes, A collection of stochastic programming problems, Technical Report 94-11, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI (1994).

[34] G. Amdahl, Validity of the single processor approach to achieving large scale computing capabilities, AFIPS Conference Proceedings, vol. 30, 1967, pp. 483-485.

[35] H. Vladimirou, S.A. Zenios, Scalable parallel computations for large-scale stochastic programming, Annals of Operations Research (forthcoming special volume on Parallel Optimization).