
Using extremal optimization for Java program initial placement in clusters of JVMs

E. Laskowski1, M. Tudruj1,3, I. De Falco2, U. Scafuri2, E. Tarantino2, R. Olejnik4

1 Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
2 Institute of High Performance Computing and Networking, ICAR-CNR, Naples, Italy
3 Polish-Japanese Institute of Information Technology, Warsaw, Poland
4 Computer Science Laboratory of Lille, University of Science and Technology of Lille, France

{laskowsk, tudruj}@[email protected]


ASTEC 2009 2

Contents

• Motivation

• The ProActive environment

• Metrics

• Extremal Optimization

• Experimental results

• Conclusions


Motivation

• Efficient load balancing on Grid platform
• Distribution management:
  – load metrics: CPU queue length, resource utilization, response time
  – communications metrics: transferred data volume, message exchange frequency.
• Balancing strategies:
  – optimization of initial distribution of components of an application (initial object deployment)
  – dynamic load balancing (migration of objects).


ProActive

• ProActive: a Java-based framework for cluster and Grid computing
  – a Java API + tools
  – Desktop→SMP→LAN→Cluster→Grid
• Application model:
  – Remote Mobile Objects, Group Communications
  – Asynchronous Communications with synchronization (Futures mechanism)
  – OO SPMD, Migration, Web Services, Grid support
• Various protocols: rmi, ssh, LSF, Globus


Active Objects

• An Active Object is a standard Java object with an attached thread:
  – asynchronous method calls
  – wait by necessity.
• Active Objects are created on nodes of the parallel system:
  – the deployment is specified by an external XML description and/or API calls
  – the location is completely transparent to the client.
• Active Objects are mobile.
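
The behaviour of an active object (an attached thread, asynchronous calls, wait by necessity) can be illustrated with a small Python analogy. This is not the ProActive API: the class ActiveObject, its call and stop methods, and the Counter example are hypothetical names chosen for this sketch, and Python's concurrent.futures.Future stands in for ProActive futures.

```python
import queue
import threading
from concurrent.futures import Future


class ActiveObject:
    """Wraps a plain object and serves its method calls on one dedicated thread."""

    def __init__(self, target):
        self._target = target
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def call(self, method_name, *args):
        """Asynchronous call: enqueue the request and return a future at once."""
        future = Future()
        self._requests.put((method_name, args, future))
        return future

    def _serve(self):
        # The attached thread processes requests one at a time, in arrival order.
        while True:
            method_name, args, future = self._requests.get()
            if method_name is None:  # shutdown marker used by stop()
                future.set_result(None)
                return
            try:
                future.set_result(getattr(self._target, method_name)(*args))
            except Exception as exc:
                future.set_exception(exc)

    def stop(self):
        """Ask the serving thread to finish and wait for it."""
        self.call(None).result()
        self._thread.join()


class Counter:
    """An ordinary object; it knows nothing about threads or futures."""

    def __init__(self):
        self.value = 0

    def add(self, amount):
        self.value += amount
        return self.value


active = ActiveObject(Counter())
f1 = active.call("add", 2)  # returns immediately with a future
f2 = active.call("add", 3)  # queued behind the first request
total = f2.result()         # wait by necessity: block only when the value is needed
active.stop()
```

The caller blocks only at result(), which is the point where the value is actually needed.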


Active Objects


Distributed multi-threaded model


Initial deployment optimization steps

• Measure the properties of the environment: CPU power and availability, network utilization.
• Execute a program for some representative data (data sample):
  – carry out the measurements of the number of mutual method calls and data volume
  – create a method call graph (a DAG) with the use of method dependency graph and measured data.
• Find the optimal mapping of the graph
• Deploy and run the application in ProActive


Observation

• Load monitoring (system observation):
  – predicts workstation load and network utilization
  – principle: the average idle time that threads pass to the OS is directly related to the CPU load
  – the maximal number of RMI calls per second.
• Object monitoring (application observation):
  – gives the intensity of communication between active objects
  – principle: the number of method calls between active objects and the volume of serialized data.


Application observation

• Application objects:
  – global objects (ProActive's active objects)
  – local objects (traditional Java objects)
• Only global (active) objects are observed
• Observed items:
  – quantity of objects’ work
  – intensity of communications between objects
• Counting the method invocation number (remote communication)


Invocation counters

[Figure: an Active Object G with its local objects. Counters are attached to the input invocations, the output local invocations and the output global invocations.]
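
A minimal sketch of such invocation counters, written in Python for illustration. The names InvocationCounters and Monitor and their methods are hypothetical and do not come from ProActive, and pickle stands in for Java serialization, so the byte counts are only indicative.

```python
import pickle
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class InvocationCounters:
    input_calls: int = 0          # calls received by the active object
    output_local_calls: int = 0   # calls it makes to its own local objects
    output_global_calls: int = 0  # calls it makes to other active objects


class Monitor:
    """Per-object counters plus call counts and data volume per (caller, callee) pair."""

    def __init__(self):
        self.counters = defaultdict(InvocationCounters)
        self.pair_calls = defaultdict(int)
        self.pair_bytes = defaultdict(int)

    def local_call(self, owner):
        self.counters[owner].output_local_calls += 1

    def global_call(self, caller, callee, *args):
        self.counters[caller].output_global_calls += 1
        self.counters[callee].input_calls += 1
        self.pair_calls[(caller, callee)] += 1
        # Size of the serialized arguments approximates the transferred volume.
        self.pair_bytes[(caller, callee)] += len(pickle.dumps(args))


monitor = Monitor()
monitor.global_call("G1", "G2", [1, 2, 3])
monitor.global_call("G1", "G2", "hello")
monitor.global_call("G2", "G1", 42)
monitor.local_call("G1")
```

The pairwise counts and volumes are the kind of data from which edge weights of an application graph could be derived.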


The system description

• The system consists of N computing resources (nodes)
• The state of system resources:
  – node power αi: the number of instructions computed per time unit on the node i
  – average load of each node ℓi(Δt) in a particular time span Δt: ℓi(Δt) ranges in [0.0, 1.0], where 0.0 means a node with no load and 1.0 a node loaded at 100%
  – network bandwidth βij: the communication bandwidth between the pair of nodes i and j.


The application

• An application is subdivided into P communicating subtasks, two possible models:
  – an application is described by an undirected weighted graph Gtig = {P, E} (the TIG model)
  – an application is described by a weighted directed acyclic graph Gdag = {P, E} (the DAG model)
  – it is possible to translate DAG into TIG
• Parameters of a subtask k:
  – the number of instructions γk to be executed
  – the amount of communications ψkm to be performed with the other m-th subtask
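
The slides do not say how a DAG is translated into a TIG. One plausible translation, shown here as a sketch under that assumption, drops the edge directions (and with them the precedence constraints) and sums the communication volumes between each pair of subtasks. The function name dag_to_tig and the sample volumes are hypothetical.

```python
from collections import defaultdict


def dag_to_tig(dag_edges):
    """Merge directed edges (k, m, volume) into undirected edges {k, m} -> total volume."""
    tig = defaultdict(float)
    for k, m, volume in dag_edges:
        # frozenset makes (k, m) and (m, k) the same key.
        tig[frozenset((k, m))] += volume
    return dict(tig)


# Directed edges of a small call graph; subtask 0 sends to subtask 1 twice.
dag = [(0, 1, 100.0), (0, 2, 50.0), (1, 2, 25.0), (0, 1, 20.0)]
tig = dag_to_tig(dag)
```

Each key of tig is an unordered pair of subtasks and each value plays the role of ψkm.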


EO algorithm

• An introductory optimization algorithm determines an initial distribution of application components on JVMs located on Grid nodes
• The problem is to assign each subtask to one node in the grid in a way that the execution of the application task is as efficient as possible
• The optimal mapping of application tasks onto the nodes in a heterogeneous environment is NP-hard
• So, we use the Extremal Optimization algorithm for mapping of tasks to nodes


The principle of the EO

• Extremal Optimization is a co-evolutionary algorithm proposed by Boettcher and Percus in 1999
• EO works with one single solution S made of a given number of components si; each component is a variable of the problem, is thought of as a species of the ecosystem, and is assigned a fitness value φi
• Two fitness functions are used, one for the variables and one for the global solution.


The outline of the EO

• an initial random solution S is generated and its fitness Φ(S) is computed
• repeat the following until a termination criterion becomes satisfied:
  – the fitness value φi is computed for each of the components si
  – the worst variable (in terms of φi) is randomly updated, so that the solution is transformed into another solution S’ belonging to its neighborhood Neigh(S)


Problems and improvements

• The basic EO leads to a deterministic process, which can get stuck in a local optimum
• To avoid this behavior, Boettcher and Percus introduced a probabilistic version of EO (τ-EO):
  – the variables are ranked in increasing order of fitness values (for the minimization problem)
  – a probability distribution over the ranks k is considered for a given value of the parameter τ: $p_k \sim k^{-\tau}$, $1 \le k \le n$
  – at each update a rank k is selected according to pk and the variable si with i = π(k) randomly changes its state.
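
The rank selection step can be sketched in Python as follows. The function names rank_probabilities and select_rank are hypothetical; the only thing taken from the slide is that rank k is drawn with probability proportional to k^(−τ).

```python
import random


def rank_probabilities(n, tau):
    """Normalized probabilities p_k for ranks k = 1..n, with p_k proportional to k**(-tau)."""
    weights = [k ** -tau for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def select_rank(n, tau, rng=random):
    """Draw a rank k in 1..n according to p_k."""
    weights = [k ** -tau for k in range(1, n + 1)]
    return rng.choices(range(1, n + 1), weights=weights, k=1)[0]


probs = rank_probabilities(5, 3.0)
rank = select_rank(5, 3.0, random.Random(1))
```

With τ = 3.0 most of the probability mass falls on rank 1, while every other rank keeps a non-zero chance of being picked. As τ grows the selection approaches the basic EO, and τ = 0 gives a uniform random choice.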


Pseudocode of the τ-EO algorithm (for a minimization problem)

• The only algorithm parameters are:
  – the maximum number of iterations Niter
  – the probabilistic selection value τ
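
The pseudocode figure itself is not reproduced here. The following Python sketch shows a τ-EO loop on a toy placement problem and is not the authors' implementation. Its assumptions: nodes are identical, the global fitness Φ is the largest node load, the local fitness φi of a task is the load of the node hosting it (a cost, so the costliest task gets rank 1), and the function name tau_eo and the sample data are hypothetical.

```python
import random


def tau_eo(work, n_nodes, n_iter, tau, seed=0):
    """Toy tau-EO: place tasks on identical nodes, minimizing the largest node load."""
    rng = random.Random(seed)
    p = len(work)
    ranks = range(1, p + 1)
    weights = [k ** -tau for k in ranks]  # p_k proportional to k**(-tau)

    def node_loads(mapping):
        loads = [0.0] * n_nodes
        for task, node in enumerate(mapping):
            loads[node] += work[task]
        return loads

    current = [rng.randrange(n_nodes) for _ in range(p)]  # random initial solution S
    best, best_fitness = current[:], max(node_loads(current))

    for _ in range(n_iter):
        loads = node_loads(current)
        # Sort tasks by local fitness, costliest first, so order[k - 1] plays the role of pi(k).
        order = sorted(range(p), key=lambda i: loads[current[i]], reverse=True)
        k = rng.choices(ranks, weights=weights, k=1)[0]
        task = order[k - 1]
        # The selected variable takes a random state; the move is always accepted.
        current[task] = rng.randrange(n_nodes)
        fitness = max(node_loads(current))  # global fitness of the new solution
        if fitness < best_fitness:
            best, best_fitness = current[:], fitness
    return best, best_fitness


work = [4, 3, 3, 2, 2, 2]
best_mapping, best_value = tau_eo(work, n_nodes=2, n_iter=2000, tau=3.0)
```

The current solution is allowed to get worse; only the best-so-far solution is remembered, which is what lets the search leave local optima.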


Encoding of mapping problem

• A mapping solution is represented by a vector μ of P integers ranging in the interval [1,N]
• In this example the first subtask of the task is placed on the grid node denoted with the number 12, the second on grid node 7, and so on.

Sub-task:   1    2    3    4    …    P-2   P-1   P
Node:       12   7    5    4    …    24    9     18


Fitness of a solution

Fitness function of a mapping solution:

where
  – $\theta_{ij}^{comp}$ is the computation time needed to execute the subtask i on the node j to which it is assigned by the proposed mapping solution
  – $\theta_{ij}^{comm}$ is the communication time requested to execute the subtask i on the node j to which it is assigned by the proposed mapping solution
  – $\theta_{ij} = \theta_{ij}^{comp} + \theta_{ij}^{comm}$ is the total time needed to execute the subtask i on the node j to which it is assigned by the proposed mapping solution.
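
The slide does not give the formulas for the two time components, so the Python sketch below is built on labeled assumptions rather than on the authors' definitions: computation time is taken as γi / (αj · (1 − ℓj)), communication time as the sum of ψim / βjq over partners placed on a different node q, communication inside one node is treated as free, and the global value is taken as the largest per-node sum of θ. The function name mapping_fitness and all sample numbers except γ = 90,000 and ψ = 100 are hypothetical, and nodes are indexed from 0 here.

```python
def mapping_fitness(mapping, gamma, psi, alpha, load, beta):
    """Return (per-subtask times theta, assumed global value) for a mapping vector.

    mapping[i] is the node of subtask i, gamma[i] its instruction count,
    psi[(i, m)] the data exchanged by subtasks i and m, alpha[j] the node power,
    load[j] the average load in [0, 1), beta[j][q] the bandwidth between nodes j and q.
    """
    theta = []
    for i, j in enumerate(mapping):
        # Assumed model: the loaded fraction of the node is unavailable to the subtask.
        comp = gamma[i] / (alpha[j] * (1.0 - load[j]))
        comm = 0.0
        for (a, b), volume in psi.items():
            if i in (a, b):
                other = b if a == i else a
                q = mapping[other]
                if q != j:  # assumed: same-node communication costs nothing
                    comm += volume / beta[j][q]
        theta.append(comp + comm)
    per_node = {}
    for i, j in enumerate(mapping):
        per_node[j] = per_node.get(j, 0.0) + theta[i]
    return theta, max(per_node.values())


gamma = [90000, 90000]              # instructions per subtask
psi = {(0, 1): 100.0}               # data exchanged by the pair
alpha = [3000.0, 3000.0, 1000.0]    # hypothetical node powers
load = [0.0, 0.0, 0.5]              # hypothetical average loads
beta = [[100.0] * 3 for _ in range(3)]  # hypothetical uniform bandwidth

theta_split, value_split = mapping_fitness([0, 1], gamma, psi, alpha, load, beta)
theta_same, value_same = mapping_fitness([0, 0], gamma, psi, alpha, load, beta)
theta_slow, value_slow = mapping_fitness([0, 2], gamma, psi, alpha, load, beta)
```

Under these assumptions, placing the pair on two fast unloaded nodes gives the smallest value, while sharing one node or using the slower, loaded node gives a larger one.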


Experimental results

• τ-EO parameter setting:
  – Niter = 200,000
  – τ = 3.0
  – 20 runs on each problem
• Two experiments reported in the presentation:
  – a simulated execution of an application in a test grid (10 sites, 184 nodes with different power and load)
  – an optimization of a ProActive application in a cluster (7 homogeneous, two-core nodes).


Experiment 1 – the test grid

A grid with 10 sites, a total of 184 nodes


Experiment 1 - features of the nodes

• Average loads: ℓi(Δt) = 0.0 for all nodes except:
  – ℓi(Δt) = 0.5 for each i ∈ [22, …, 31]
  – ℓi(Δt) = 0.5 for each i ∈ [42, …, 47]
• The most powerful nodes are the first 22 of A and the first 10 of B

[Figure: two bar charts. Node power (axis from 0 to 3000) for the node ranges 0-21, 22-31, 32-41, 42-47, 48-55, 56-71, 72-103, 104-123, 124-139, 140-147, 148-163, 164-183. Bandwidth (axis from 0 to 2000) for the node ranges 0-31, 32-47, 48-55, 56-71, 72-103, 104-123, 124-139, 140-147, 148-163, 164-183.]


Experiment 1 – the application

• P = 30 nodes

[Figure: the application graph with two groups of tasks, G1 (T0, …, Ti, …, T14) and G2 (T15, …, Ti+15, …, T29). Each task has γk = 90,000 MI; each communication edge between the two groups has ψkm = 100 Mbit.]


Experiment 1 – the result

• The optimal allocation entails both the use of the most powerful nodes and the distribution of the communicating tasks in pairs on the same site, so that communications are faster (only intrasite, no intersite)
• The solution allocates 11 task pairs on the 22 unloaded nodes in A and the remaining 4 pairs on 8 unloaded nodes in B:

2 22 41 12 20 17 21 13 35 16 18 4 14 39 40

23 1 37 9 11 3 8 10 38 19 5 15 6 33 32


Experiment 2 – the cluster

• A cluster:
  – 7 homogeneous two-core nodes
  – Gigabit Ethernet LAN
  – average extra load ℓi(Δt) = 0.0 for all nodes
  – each node has a Sun JVM installed and an ssh agent
• The scenario of the experiment:
  1. CPU power, load and network utilization monitoring
  2. application parameters' measuring (using the sample data)
  3. mapping optimization and the final run.


Experiment 2 – the application

• A ProActive Java multi-threaded application, working according to the DAG model
• 58 nodes
• the DAG is executed in the loop (200 iterations)


Experiment 2 – the result

Since the nodes are homogeneous and have no extra load, the EO mapping balanced the amount of computations assigned to each node:

Node nb    Amount of computations
0          1999
1          2005
2          1993
3          2004
4          2002
5          1980
6          1994
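
How well the computations are balanced can be checked with a few lines of arithmetic on the table values. The imbalance measure used here (largest deviation from the mean, relative to the mean) is a choice made for this illustration, not a metric from the presentation.

```python
# Amount of computations per node, copied from the table.
computations = {0: 1999, 1: 2005, 2: 1993, 3: 2004, 4: 2002, 5: 1980, 6: 1994}

total = sum(computations.values())
mean = total / len(computations)
max_deviation = max(abs(value - mean) for value in computations.values())
relative_deviation = max_deviation / mean
```

The largest deviation comes from node 5 and stays below 1% of the mean.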


Typical evolution of τ-EO on a mapping problem

Evolution of the best-so-far value is shown on the left, and both best-so-far and current solutions for the first 200 iterations are shown on the right


Conclusions

• Extremal Optimization has been proposed as a viable approach to the mapping of the tasks making up an application in grid environments

• The unique feature of the presented approach is the ability to deal with different load of nodes and the diversity in network bandwidths

• τ-EO shows two very interesting features when compared to other optimization tools based on Evolutionary Algorithms (e.g. Differential Evolution):

– a much higher speed
– its ability to provide stable solutions.
