Dynamic Load Balancing in Parallel and Distributed Networks
by Random Matchings
(Extended Abstract)
Bhaskar Ghosh*
Abstract
The fundamental problems in dynamic load balancing
and job scheduling in parallel and distributed comput-
ers involve moving load between processors. In this
paper, we consider a new model for load movement
in synchronous parallel and distributed machines. In
each step of our model, each processor can transfer
load to at most one neighbor; also, any amount of
load can be moved along a communication link be-
tween two processors in one step. This is a reason-
able model for load movement in significant classes of
dynamic load balancing problems.
We derive efficient algorithms for a number of task
reallocation problems under our model of load move-
ment. These include dynamic load balancing on processor networks, adaptive mesh re-partitioning such as
those in finite element methods, and progressive job
migration under dynamic generation and consump-
tion of load.
To obtain the above-mentioned results, we intro-
duce and solve the abstract problem of Incremental
Weight Migration (IWM) on arbitrary graphs. Our
main result is a simple, randomized, algorithm for this
problem which provably results in asymptotically op-
timal convergence towards the state where weights on
the nodes of the graph are all equal. This algorithm
*Department of Computer Science, Yale University, P. O. Box 208285, New Haven, CT 06520. Internet: [email protected].
Research supported by ONR under grant number 491-J-1576
and a Yale/IBM joint study.
†Courant Institute of Mathematical Sciences, New York
University, 251 Mercer Street, New York, NY 10012-1185, USA;
[email protected], (212) 998-3061. The research of this author was supported in part by NSF/DARPA under grant number
CCR-89-06949 and by NSF under grant number CCR-91-03953.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
SPAA 94 - 6/94 Cape May, N.J., USA © 1994 ACM 0-89791-671-9/94/0006..$3.50
S. Muthukrishnan†
utilizes an appropriate random set of edges forming
a matching. Our algorithm for the IWM problem is
used in deriving efficient algorithms for all the prob-
lems mentioned above.
Our results are very general. The algorithms we derive are local, and hence, scalable. They work for
arbitrary load distributions and for networks of arbitrary topology which can possibly undergo link failures. Of independent interest is our proof technique which we use to lower bound the convergence of our algorithms in terms of the eigenstructure of the underlying graph.

Finally, we present preliminary experimental results analyzing issues in load balancing related to our algorithms.
1 Introduction
Consider the following scenario of dynamic load balancing in a distributed setting. An application pro-
gram is running on a distributed network of arbitrary
topology comprising a large number of processors.
Each processor has a load of independent tasks to be executed. The distribution of tasks is dynamically determined; that is, the specific application program running on the machine cannot be developed with a priori estimates of the load distribution. The task of dynamic load balancing is to reallocate the tasks so that each processor has nearly the same amount of load. Of course, in natural settings, the scenario is substantially more demanding in that the tasks might be dynamically generated or consumed in each step and
additionally, the underlying topology might change owing to failures in communication links. Besides load balancing, scenarios such as the one above occur in several other guises, for example, in job scheduling, adaptive mesh partitioning, and resource allocation problems. (These guises will be explained more clearly later with examples.) In each of these guises, equitable load redistribution is critical for efficient implementation of algorithms on both distributed and parallel computers.
Standard models for dynamic load balancing make
the following assumptions. (See for example [AA+93,
LM93, R91].) In one time step, each processor can
migrate load to any (possibly all) of the other processors (possibly including non-neighbors). Also, at most one unit of load can be transferred across any link in a step. Under this model, dynamic load balancing has been extensively studied. The main focus of this paper concerns the amount of parallelism in this standard model. Specifically,
1. The standard model overestimates available parallelism in communication with neighbors. In practice, in each time step, each processor can communicate with only one other processor. That is, the communication with a set of neighboring processors is inherently sequential.
2. The standard model underestimates available parallelism in edge capacity. With increasingly high-bandwidth networks becoming available, a large amount of data can be transferred across a link in one time step. Therefore, it is reasonable to assume that several units of tasks can be migrated in the same message across a link in one step, provided moving each task incurs movement of only a reasonable amount of data. Indeed, there are large classes of important dynamic load balancing problems in which the tasks have small associated data space (examples are provided in Section 1.3).
Motivated by these observations, we study dynamic load balancing (and other guises in which it comes up) in distributed networks with unbounded edge capacity under the restriction that each processor can migrate load to at most one of its neighbors in one step. Our approach is to identify an abstract problem which we call the Incremental Weight Migration Problem on arbitrary graphs. This can be thought of as a single step of load migration across the entire network in parallel on our model. Our main result is an asymptotically optimal algorithm for this problem on our model. We utilize this in deriving efficient algorithms for many other problems, including dynamic load balancing, adaptive mesh partitioning, and dynamic job scheduling.

Our algorithms employ only local control and data communication; hence, they are scalable. Also, our results are very general, in that they hold for networks of arbitrary topology which can possibly undergo link failures during the execution of the algorithm.
1.1 Problem and Model
Incremental Weight Migration (IWM) Prob-
lem. Consider an undirected connected agent graph G = (V, E) with n vertices in which each vertex v_i, representing an agent A_i, has weight w_i. The potential Φ of the graph is defined to be (Σ_i w_i²) − n·w̄², where n is the number of nodes in G and w̄ = Σ_i w_i / n is the average weight.¹ The IWM problem is to determine a set of matching edges M (that is, no pair of edges in M shares an endpoint) and to specify, for each edge in M, a relocation of weights on its endpoints across the edge, such that the drop in the potential function is the maximum.
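As a quick illustration (ours, not the paper's), the potential can be computed directly from its definition; a minimal Python sketch, where the name `potential` and the list-based weight representation are our own choices:

```python
def potential(w):
    """Phi = (sum_i w_i^2) - n * wbar^2: the squared Euclidean
    distance between w and the perfectly balanced weight vector."""
    n = len(w)
    wbar = sum(w) / n
    return sum(x * x for x in w) - n * wbar * wbar
```

Note that this agrees with the alternative form `sum((x - wbar)**2 for x in w)`, and is zero exactly when all weights equal the average.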
Model. We note three characteristics of the model in the definition of the problem above. First, weight movements are local, i.e., any portion of the weight on a node which is moved ends up at a neighbor of that node. Second, any portion of the weight on an endpoint of an edge can be moved across the edge in one step. Third, each agent is involved in weight transfer with at most one of its neighbors.
Desirable Algorithmic Features. Naturally, we would like an algorithm of reasonable computational expense. This is particularly relevant since, traditionally, load balancing and scheduling algorithms trade off performance for running time. Importantly, we would like our algorithm to be fully local and distributed. This is critical because algorithms which need and rely on global information are expensive; additionally, they may not work when links in the underlying network graph fail, as may happen in practice for distributed networks.
1.2 Our Main Results
Our performance measure is the ratio of the drop in potential to the original potential; we denote this as the convergence factor. We assume that the agents (processors) work in lock-step.
A. Real Weight Case. First, consider the case when the weights are real. This implies that the weight on each agent can be subdivided to arbitrary precision. In this case, we design a simple, completely local, randomized algorithm for the IWM problem which has an expected convergence factor of at least c·λ₂/d. Here, λ₂ is the second smallest eigenvalue of the Laplacian of the agent graph G, d is the degree of G, and c is a constant (0 < c ≤ 1), independent of G.

It is easy to show that there exists an agent graph and an assignment of weights to its vertices such that no algorithm (even one which is randomized and which has global information) can have a convergence factor greater than c₂·λ₂/d, where c₂ is a constant. Therefore, our algorithm is asymptotically optimal.
¹Our potential function Φ is the square of the Euclidean distance between the weight vector ω = (w₁, …, w_n)ᵀ and the load-balanced vector (w̄, w̄, …, w̄)ᵀ. Note that Φ ≥ 0. The potential becomes zero only when the weights on the agents are all equal to w̄.
B. Discrete Weight Case. In real applications, the weights are discrete, i.e., each w_i is a collection of w_i unit tasks and a unit task cannot be divided further. We show that a simple modification of the algorithm in Case A also works when the weights are discrete. Its expected convergence factor is at least c·λ₂/d, where c is a constant (0 < c ≤ 1), provided the initial potential is Ω(n⁴).² This algorithm has an optimal convergence factor as well.

The discrete case is intrinsically harder than the real weight case when the potential is small (o(n⁴)), since we can show the following: there exists an agent graph and an assignment of weights to the agent nodes such that no algorithm can have a convergence factor of more than ε·λ₂/d, for any constant ε. In contrast, the algorithm in Case A has a convergence factor of Ω(λ₂/2d).³ On the other hand, when Φ = o(n⁴), we can show that our algorithm reduces Φ by an additive term rather than by a multiplicative factor.
Remark 1. It is worth noting that although weights are moved only along a subset (matching) of edges, the convergence bounds are in terms of global properties of the graph, namely, λ₂ and d. Note that for any connected graph, 0 < λ₂/(2d) ≤ 1. Thus, our algorithms guarantee a positive fractional (possibly non-constant) decrease in the potential for any connected graph.
Remark 2. The parameter λ₂ reflects the connectivity of the underlying graph. For a line graph, a d-dimensional mesh, a hypercube, a d-regular expander, and a clique on n nodes, the fraction λ₂/d roughly equals Θ(1/n²), Θ(1/n^{2/d}), Θ(1/log n), Θ(1), and Θ(1), respectively.
Remark 3. Our algorithms are extremely efficient, since each agent takes only O(d) time for control. Also, each agent performs data transfer across at most one edge in each time step.
Remark 4. The performance of our algorithm is guaranteed even if some edges disappear between successive time steps; in this case, d represents the degree of the graph at the beginning of the algorithm and λ₂ represents the second smallest eigenvalue of the Laplacian of the graph that remains at the end of the algorithm. The proof of this claim is omitted in this paper.
²In fact, we prove this convergence factor when the initial potential is Ω(dn/λ₂). Since λ₂ = Ω(1/n²) and d < n for any connected graph, it follows that the claimed convergence factor holds when the initial potential is Ω(n⁴).

³Throughout this paper, we use Ω(f), for a given function f, to denote cf for some constant c.
1.3 Applications and Our Other Re-
sults
We utilize our algorithm for the IWM problem to derive efficient algorithms for a variety of task reallocation problems. In what follows, we briefly introduce three major applications, and defer the other applications to our detailed paper.
1. We provide an efficient and first-known completely analyzable algorithm in our model for dynamic load balancing on arbitrary networks under possible link failure.

2. We provide the first-known analytical convergence result for abstract dynamic re-partitioning problems (e.g., re-partitioning an adaptively changing mesh such as those used in finite element methods).

3. We initiate a new paradigm of progressive task scheduling, in which, even under dynamic generation and consumption of load, at each task scheduling step a fractional progress is made towards the load-balanced state.
Dynamic Load Balancing. Given a processor network with arbitrary discrete weights on the vertices, the dynamic load balancing problem is to move the loads so as to have nearly the same amount of load on each processor. Dynamic load balancing has been studied in a number of settings. Almost all research has focused on algorithms for specific topologies and/or relied on global routing phases. A class of such research has involved performance analysis of load balancing algorithms by simulations [LMR91]. Among analytical results, load balancing for specific topologies under statistical assumptions on input load distributions has been studied [HCT89]. For arbitrary initial load distributions, load balancing has been studied in special topologies such as Counting Networks [AHS91, HLS92], Hypercubes [P89], Meshes [HT93], and Expanders [PU89]. These algorithms do not extend to arbitrary or dynamically changing topologies. For dynamically changing topologies, load balancing has been studied under assumptions on the pattern of failures for specific architectures [R89, AB92].
For arbitrary topologies, under the assumption that one load unit can be migrated across each edge in parallel and that each processor can communicate with all its neighbors in one step, [AA+93] presents an algorithm for dynamic load balancing which takes O(Δ·log(nΔ)/p) steps to approximately balance the loads. The approximation is within an additive term of d × diameter(G). Here p is the vertex expansion of the graph and Δ = max_i (w_i − w̄), where w̄ = Σ_i w_i / n. Their algorithm is optimal up to a log(nΔ) factor.
In our model of dynamic load balancing, load can be moved along only one edge from a processor in a time step, and there is no restriction on the amount of load that can be moved. This is an appropriate model when each task has a small associated data space; therefore, several tasks can be communicated in one time step. This is true for a large class of problems like fine-grain programs which spawn processes dynamically [GH89, K88], real-time data fusion problems [CA87, FG91], and game tree searches [F93].
Clearly, the dynamic load balancing problem for discrete weights can be solved by applying our algorithm for the IWM problem repeatedly. This algorithm balances load approximately in O((d/λ₂)(log Φ₀ + dn)) invocations of our algorithm for the IWM problem, where Φ₀ is the initial potential. The load balancing is approximate in the sense that our algorithm stops when, for each edge (i, j), |w_i − w_j| ≤ 1. Our algorithm works for arbitrary topologies under possible failure of links connecting the processors.
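One way the repeated-invocation scheme for discrete weights might be realized is sketched below. This is our own illustrative code, not the paper's: the names `discrete_lr_step` and `balance`, the edge-list graph format, and the cap on iterations are all assumptions. Each step picks a random matching as in the paper's Algorithm LR and moves ⌊(w_i − w_j)/2⌋ whole tasks across each matched edge, stopping once every edge satisfies |w_i − w_j| ≤ 1:

```python
import random

def discrete_lr_step(edges, degree, w, rng):
    """One discrete IWM step: random matching, then integral equalization."""
    p = 4 * degree
    # Step 1a: each edge is a candidate independently with probability 1/p.
    candidates = [e for e in edges if rng.random() < 1.0 / p]
    # Step 1b: keep only candidates sharing no endpoint with another candidate.
    count = {}
    for u, v in candidates:
        count[u] = count.get(u, 0) + 1
        count[v] = count.get(v, 0) + 1
    matching = [(u, v) for u, v in candidates if count[u] == 1 and count[v] == 1]
    # Step 2: move whole tasks only; a unit task cannot be divided further.
    for u, v in matching:
        i, j = (u, v) if w[u] >= w[v] else (v, u)
        delta = (w[i] - w[j]) // 2
        w[i] -= delta
        w[j] += delta

def balance(edges, degree, w, rng, max_steps=100000):
    """Repeat until every edge (i, j) satisfies |w_i - w_j| <= 1."""
    for step in range(max_steps):
        if all(abs(w[u] - w[v]) <= 1 for u, v in edges):
            return step
        discrete_lr_step(edges, degree, w, rng)
    return max_steps
```

Since an unbalanced matched edge always strictly lowers the potential and balanced edges are left untouched, the loop terminates with probability 1 on any connected graph.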
We remark that Cybenko [C89] considered a stronger version of our model by additionally allowing each processor to transfer load to all its neighbors in one time step. This work is of mathematical interest since it considers only the case when the weights are real.
Problem Re-Partitioning. Abstractly, assume each node in the agent graph corresponds to a partition or sub-domain of a global data domain. Each node in the agent graph is mapped to a processor. Due to dynamic computations at each processor, the sub-domains get refined, leading to a load imbalance in the size of each sub-domain. Re-partitioning of the domain becomes necessary to achieve load balance.

Such applications come up in various forms in the use of adaptive finite-element and finite-difference methods using either locally adaptive meshes or order of approximation, for example in h-p finite element methods, which are common in mechanical engineering and visualization software. In adaptive-mesh terminology, the agent graph representing sub-domain connectivity information is called the quotient graph. Achieving balanced sub-domains usually involves shifting the boundaries of adjoining sub-domains (i.e., across edges in the quotient graph) so as to equalize the data points in each sub-domain. Further references on these areas can be found in [BB87, HT93, W91].
Clearly, our algorithm for the IWM problem can be used repeatedly on the quotient graph to solve the problem of mesh partitioning. Note, however, that the actual migration of data points as determined by the application of our algorithm can be performed on the underlying architecture by either local communication (if adjoining sub-domains have been mapped to adjoining processors) or by non-local routing (if adjoining sub-domains have been mapped to non-adjoining processors).
Progressive Dynamic Task Scheduling. Consider a segment of a distributed execution, where tasks are generated and consumed in each step in an unpredictable manner at various nodes as the computation proceeds. We are required to schedule the tasks in each step by moving them to underloaded or idle processors so as to increase the throughput. This scenario arises in general-purpose distributed computing [LK87, NX+85] as well as in specific applications such as parallel branch-and-bound search on game trees [KZ88] and dynamic tree embedding on distributed or parallel architectures [LN+89, R91].
We initiate a new paradigm for these problems. For motivation, note that there are broadly two paradigms for task scheduling in this scenario. In one paradigm, the scheduling guarantees that each processor has at least one task to execute at the end of the step. In the other paradigm, the scheduling guarantees that all processors have roughly the same number of tasks at the end of the step. It is easy to see that in both these paradigms, there exists a sequence of load generation and consumption that forces any algorithm to resort to load movement between two non-neighboring processors to satisfy the guarantee on load distribution. Load movement between processors which are not neighbors is an expensive operation. We advocate the approach of restricting algorithms to only perform load movements between neighbors, but requiring a guarantee of reasonable progress towards the load-balanced state. In our case, the reasonable progress is a decrease in the distance to the load-balanced state (formalized as a potential function) by a multiplicative factor. We refer to this as progressive dynamic task scheduling.
Using our algorithm for the IWM problem once each step in this case of dynamic load migration, with loads generated and consumed each step, we can guarantee that the potential drops by at least a λ₂/(16d) factor in the expected case in each step, as long as the potential is large. Note that our algorithm is the first known algorithm to make such a guarantee. In a related work [AA+93] (under the weaker assumption that tasks are not dynamically generated or consumed), a distance function (different from ours) is used to measure the progress towards the load-balanced state over several steps. However, they cannot guarantee a fractional decrease in the distance in every step, since their argument involves amortization of the decrease in the distance over several steps.
1.4 Our Techniques
For intuition, consider solving the IWM problem with real weights. Given a graph which is not balanced,
we can always pick an edge (i, j) and equalize the weights across its endpoints. This provably decreases Φ, since the reduction in Φ is (w_i² + w_j²) − 2·((w_i + w_j)/2)², which is (w_i − w_j)²/2 ≥ 0. We speed up this process by balancing along a matching set of edges in parallel. Note that a set of matching edges can be obtained in several ways. For example, edge-coloring the input graph gives us a set of matchings, where each color defines a matching. Alternatively, given graph G, we can explicitly compute the matching which gives the maximum potential drop. All these schemes require expensive computation of global information; also, they may not work when some edges disappear.
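The per-edge potential drop from equalizing one edge is easy to check numerically; a tiny sketch under our own naming (`equalize_drop` is not from the paper):

```python
def equalize_drop(wi, wj):
    """Drop in Phi when one edge's endpoint weights are averaged."""
    before = wi * wi + wj * wj
    avg = (wi + wj) / 2.0
    after = 2.0 * avg * avg
    return before - after  # algebraically equals (wi - wj)**2 / 2
```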
In our algorithm, we choose a random set of matching edges locally. The manner in which the random matching is chosen ensures that there is a global lower bound on the probability of each edge appearing in the matching. This property ensures global convergence bounds. For choosing such a random matching, we draw upon intuition from the very sparse phase in the evolution of random graphs [B87].
There appears to be some connection between our techniques for analyzing our algorithm and those used in analyzing the rapid-mixing properties of Markov chains [M89, AA+93]. However, no formal connection is known to us.
1.5 Organization of the Paper
The IWM problem for real weights is solved in Section 3. We extend this solution in Section 4 to the case when the weights are discrete. In Section 5 we demonstrate one of the applications, namely, dynamic load balancing. The rest of the applications are omitted in this paper. In Section 6 we present some preliminary experimental results from implementations of our algorithms.
2 Preliminaries
Consider an undirected connected agent graph G = (V, E) with n vertices and maximum degree d. Each vertex v_i represents an agent A_i and has weight w_i. We denote the distribution of weights w_i on the nodes of G by the weight vector ω. The potential Φ of the graph is defined to be (Σ_i w_i²) − n·w̄², where n is the number of nodes in G and w̄ = Σ_i w_i / n is the average weight. Given G and ω and any algorithm for IWM, let the potentials before and after the invocation of the algorithm be Φ and Φ′. Then the convergence factor for this algorithm is defined to be (Φ − Φ′)/Φ.
Let A denote the adjacency matrix of G. Define a matrix D = (d_{i,j}), where d_{i,j} = 0 if i ≠ j, and d_{i,i} is the degree of agent i. The matrix L = D − A is the Laplacian matrix of G. The eigenvalues of L are 0 = λ₁ ≤ λ₂ ≤ … ≤ λ_n.
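For concreteness, the Laplacian can be assembled directly from the edge list; a minimal sketch of our own (the helper name `laplacian` and plain nested lists, rather than any matrix library, are our choices):

```python
def laplacian(n, edges):
    """L = D - A for an undirected graph on vertices 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1          # D: each endpoint's degree grows by one
        L[v][v] += 1
        L[u][v] -= 1          # -A: off-diagonal entries for the edge
        L[v][u] -= 1
    return L
```

Every row of L sums to zero, which is exactly the statement that v₁ = (1, 1, …, 1)ᵀ is an eigenvector of L with eigenvalue λ₁ = 0.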
Fact 1. G is a connected graph if and only if λ₂ > 0. It can be shown that for any connected graph with n vertices, λ₂ = Ω(1/n²).

Fact 2. From the Courant-Fischer Minimax Theorem, it follows that [MP92]

    λ₂ = min_{z ⊥ v₁} (zᵀ L z) / (zᵀ z),

where v₁ = (1, 1, …, 1)ᵀ is the eigenvector corresponding to λ₁ = 0 and z ⊥ v₁ means that the vector z is orthogonal to v₁.
3 Algorithm for IWM (Real
Weights)
In this section we present a local randomized algorithm for the IWM problem with real weights. Given a graph G and weight vector ω, Algorithm LocalRandom (LR) works as follows:

1. Pick a random matching M in G as follows:

   a. Each edge e is independently put in M with probability 1/p (p will be fixed later).

   b. Each edge (u, v) removes itself from M if (w, u) or (w, v) is in M for some w ∈ V.

2. For each edge (i, j) ∈ M (assuming without loss of generality w_i ≥ w_j), move (w_i − w_j)/2 load units from agent i to agent j.
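A direct transcription of Algorithm LR for real weights might look as follows. This is our own illustrative Python, not the paper's code: the function name `lr_step` and the edge-list graph representation are assumptions, and we fix p = 4d as in the analysis of Lemma 1 below:

```python
import random

def lr_step(edges, degree, w, rng):
    """One step of Algorithm LocalRandom (LR) with real weights."""
    p = 4 * degree
    # Step 1a: each edge enters M independently with probability 1/p.
    candidates = [e for e in edges if rng.random() < 1.0 / p]
    # Step 1b: an edge survives only if neither endpoint is shared
    # with another candidate edge, leaving a matching.
    count = {}
    for u, v in candidates:
        count[u] = count.get(u, 0) + 1
        count[v] = count.get(v, 0) + 1
    matching = [(u, v) for u, v in candidates if count[u] == 1 and count[v] == 1]
    # Step 2: equalize the weights across each matched edge.
    for u, v in matching:
        avg = (w[u] + w[v]) / 2.0
        w[u] = w[v] = avg
    return matching
```

Each call mutates `w` in place, conserves the total load, and never increases the potential, since each equalization drops Φ by (w_u − w_v)²/2.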
Lemma 1 For each edge e = (u, v), Pr( Algorithm LR picks e in M ) ≥ 1/(8d).
Proof. Fix an edge e = (u, v). Since e has at most 2(d − 1) adjacent edges, each a candidate with probability 1/p,

    Pr( e is in M in Step 1a and removed in Step 1b ) ≤ (1/p) · (2(d − 1)/p).

Therefore,

    Pr( e in M after Step 1b )
      = Pr( e in M after Step 1a and it is not removed in Step 1b )
      = Pr( e in M after Step 1a ) − Pr( e is in M after Step 1a and removed in Step 1b )
      ≥ 1/p − 2(d − 1)/p² = (p − 2d + 2)/p².

Now set p = 4d. Then, Pr( e in M after Step 1b ) ≥ (2d + 2)/(16d²) ≥ 1/(8d). ∎
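As a sanity check (ours, not the paper's), the bound can be observed empirically by sampling Step 1 many times on a small graph. For a star on four vertices, d = 3 and p = 12, and the lemma promises each edge a probability of at least 1/24; in fact, a star edge survives exactly when neither of the other two edges is a candidate, i.e., with probability (1/12)(11/12)² = 121/1728:

```python
import random

def sample_matching(edges, p, rng):
    """Steps 1a-1b of Algorithm LR: candidate edges, then conflict removal."""
    candidates = [e for e in edges if rng.random() < 1.0 / p]
    count = {}
    for u, v in candidates:
        count[u] = count.get(u, 0) + 1
        count[v] = count.get(v, 0) + 1
    return [(u, v) for u, v in candidates if count[u] == 1 and count[v] == 1]

rng = random.Random(0)
star = [(0, 1), (0, 2), (0, 3)]   # K_{1,3}: centre 0, so d = 3 and p = 4d = 12
d, p, trials = 3, 12, 100000
hits = sum(1 for _ in range(trials) if (0, 1) in sample_matching(star, p, rng))
estimate = hits / trials          # empirical Pr( edge (0,1) ends up in M )
```

The empirical frequency comfortably clears the 1/(8d) floor, as the lemma guarantees.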
Theorem 1 For any connected graph G and weight vector ω, the expected value of the convergence factor c_LR of Algorithm LR is at least λ₂/(16d).

Proof: Let ΔΦ be the drop in the total potential of G due to Algorithm LR. For each edge (i, j), let Δ_{i,j} be the drop in potential due to weight equalization between agents i and j if (i, j) is in the matching picked by LR.