Using extremal optimization for Java program initial placement in clusters of JVMs
E. Laskowski 1, M. Tudruj 1,3, I. De Falco 2, U. Scafuri 2, E. Tarantino 2, R. Olejnik 4
1 Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
2 Institute of High Performance Computing and Networking, ICAR-CNR, Naples, Italy
3 Polish-Japanese Institute of Information Technology, Warsaw, Poland
4 Computer Science Laboratory of Lille, University of Science and Technology of Lille, France
{laskowsk, tudruj}@ipipan.waw.pl, [email protected], Richard.Olejnik@lifl.fr
• the average idle time that threads pass to the OS is directly related to the CPU load
• the maximal number of RMI calls per second.
Object monitoring (application observation):
• gives the intensity of communication between active objects
• principle: the number of method calls between active objects and the volume of serialized data.
ASTEC 2009 10
Application observation
Application objects:
• global objects (ProActive's active objects)
• local objects (traditional Java objects)
Only global (active) objects are observed
Observed items:
• quantity of objects' work
• intensity of communications between objects
Counting the method invocation number (remote communication)
Invocation counters
[Figure: G, an Active Object, with its local objects. Counters are attached to the input invocations, the output local invocations and the output global invocations of G.]
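The counting scheme in the figure can be sketched as follows. This is only an illustration in Python, not ProActive code: the class `InvocationCounters`, its method names and the target name `"H"` are all hypothetical, and real instrumentation would hook into the middleware's remote-call path.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class InvocationCounters:
    """Hypothetical per-active-object counters (all names are illustrative)."""
    input_calls: int = 0          # invocations received by the active object
    output_local_calls: int = 0   # calls to its own local objects
    output_global_calls: Counter = field(default_factory=Counter)  # calls per remote target
    output_global_bytes: Counter = field(default_factory=Counter)  # serialized bytes per target

    def record_input(self) -> None:
        self.input_calls += 1

    def record_local(self) -> None:
        self.output_local_calls += 1

    def record_global(self, target: str, serialized_bytes: int) -> None:
        self.output_global_calls[target] += 1
        self.output_global_bytes[target] += serialized_bytes

    def intensity(self, target: str) -> Tuple[int, int]:
        """Return (number of calls, serialized bytes) sent to `target`."""
        return self.output_global_calls[target], self.output_global_bytes[target]


g = InvocationCounters()
g.record_input()
g.record_local()
g.record_global("H", 2048)
g.record_global("H", 1024)
```

Keeping both the call count and the serialized volume per target matches the two quantities named earlier as the basis of communication intensity.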
The system description
The system consists of N computing resources (nodes)
The state of system resources:
• node power αi: the number of instructions computed per time unit on the node i
• average load of each node ℓi(Δt) in a particular time span Δt: ℓi(Δt) ranges in [0.0, 1.0], where 0.0 means a node with no load and 1.0 a node loaded at 100%
• network bandwidth βij: the communication bandwidth between the pair of nodes i and j.
The application
An application is subdivided into P communicating subtasks; two models are possible:
• an application is described by an undirected weighted graph Gtig = {P, E} (the TIG model)
• an application is described by a weighted directed acyclic graph Gdag = {P, E} (the DAG model)
• it is possible to translate a DAG into a TIG
Parameters of a subtask k:
• the number of instructions γk to be executed
• the amount of communications ψkm to be performed with the m-th subtask
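The slides do not say how a DAG is translated into a TIG. One plausible reading, sketched below as an assumption, is to drop the edge directions and keep the communication amounts, which discards the precedence information. The function name `dag_to_tig` and the sample edge weights are hypothetical.

```python
from collections import defaultdict


def dag_to_tig(dag_edges):
    """Turn directed edges {(k, m): amount} into undirected edges {(low, high): amount}.

    Assumption: direction is simply dropped; amounts on the same pair are summed.
    """
    tig = defaultdict(float)
    for (k, m), amount in dag_edges.items():
        tig[(min(k, m), max(k, m))] += amount
    return dict(tig)


# Hypothetical three-subtask DAG with communication amounts psi_km.
dag = {(0, 1): 100.0, (0, 2): 50.0, (1, 2): 25.0}
tig = dag_to_tig(dag)
```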
EO algorithm
An introductory optimization algorithm determines an initial distribution of application components on JVMs located on Grid nodes
The problem is to assign each subtask to one node in the grid so that the execution of the application is as efficient as possible
• the optimal mapping of application tasks onto the nodes of a heterogeneous environment is NP-hard
• so we use the Extremal Optimization algorithm for mapping tasks to nodes
The principle of the EO
Extremal Optimization is a co-evolutionary algorithm proposed by Boettcher and Percus in 1999
EO works with a single solution S made of a given number of components si; each component is a variable of the problem, is thought of as a species of the ecosystem, and is assigned a fitness value φi
Two fitness functions are used: one for the variables and one for the global solution.
The outline of the EO
• an initial random solution S is generated and its fitness Φ(S) is computed
• repeat the following until a termination criterion is satisfied:
– the fitness value φi is computed for each of the components si
– the worst variable (in terms of φi) is randomly updated, so that the solution is transformed into another solution S' belonging to its neighborhood Neigh(S)
Problems and improvements
The basic EO leads to a deterministic process, so it can get stuck in a local optimum
To avoid this behavior, Boettcher and Percus introduced a probabilistic version of EO (τ-EO):
• the variables are ranked in increasing order of fitness values (for the minimization problem); π denotes the resulting permutation of the variable indices
• a probability distribution over the ranks k is considered for a given value of the parameter τ: pk ∝ k^(−τ), 1 ≤ k ≤ n
• at each update a rank k is selected according to pk, and the variable si with i = π(k) randomly changes its state.
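The slide gives the rank distribution only up to proportionality. Assuming the usual normalization over the n ranks, it can be written as:

```latex
p_k = \frac{k^{-\tau}}{\sum_{j=1}^{n} j^{-\tau}}, \qquad 1 \le k \le n
```

As τ approaches 0 every rank becomes equally likely, and as τ grows the choice concentrates on rank 1, which recovers the basic EO. As a rough check under this normalization, τ = 3.0 with n = 30 gives p1 of about 0.83.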
Pseudocode of the τ-EO algorithm (for a minimization problem)
The only algorithm parameters are:
• the maximum number of iterations Niter
• the probabilistic selection value τ
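The pseudocode figure did not survive in this transcript. The following Python sketch follows the steps described on the preceding slides and is not the authors' pseudocode. It assumes that a low local fitness marks a bad variable (so rank 1 is the worst), that a move always changes the selected variable, and that the best-so-far solution is kept on the side. The toy problem (`TARGET`, `toy_local_fitness`, `toy_cost`) is hypothetical and only exercises the loop.

```python
import random


def tau_eo(n_vars, n_values, local_fitness, cost, n_iter, tau, rng):
    """Sketch of tau-EO: minimize `cost`; a low `local_fitness` marks a bad variable."""
    current = [rng.randrange(n_values) for _ in range(n_vars)]  # random initial S
    best, best_cost = current[:], cost(current)
    ranks = range(1, n_vars + 1)
    weights = [k ** -tau for k in ranks]  # unnormalized p_k ~ k^(-tau)
    for _ in range(n_iter):
        # Permutation pi: variable indices sorted by increasing local fitness,
        # so rank 1 holds the worst variable.
        pi = sorted(range(n_vars), key=lambda i: local_fitness(current, i))
        k = rng.choices(ranks, weights=weights)[0]
        i = pi[k - 1]
        # Unconditional move: draw a new state different from the old one.
        new_value = rng.randrange(n_values - 1)
        if new_value >= current[i]:
            new_value += 1
        current[i] = new_value
        c = cost(current)
        if c < best_cost:  # remember the best-so-far solution
            best, best_cost = current[:], c
    return best, best_cost


# Toy problem (hypothetical): match a hidden target vector.
TARGET = [3, 1, 4, 1, 5, 2, 0, 3]


def toy_local_fitness(solution, i):
    return -abs(solution[i] - TARGET[i])  # 0 is best, more negative is worse


def toy_cost(solution):
    return sum(abs(s - t) for s, t in zip(solution, TARGET))


best, best_cost = tau_eo(len(TARGET), 6, toy_local_fitness, toy_cost,
                         n_iter=2000, tau=3.0, rng=random.Random(0))
```

Note that the current solution is allowed to get worse after a move; only the separately stored best-so-far solution is monotone.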
Encoding of mapping problem
A mapping solution is represented by a vector μ of P integers ranging in the interval [1,N]
In this example the first subtask of the task is placed on the grid node denoted with the number 12, the second on grid node 7, and so on.
Sub-task:  1    2    3    4    …
Node:      12   7    5    4    …
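A minimal sketch of this encoding, assuming a plain Python list for μ with 1-based node numbers as on the slide. The helper names `subtasks_per_node` and `random_neighbor` are hypothetical; the first two entries of `mu` follow the slide's example and the remaining ones are illustrative.

```python
import random
from collections import defaultdict


def subtasks_per_node(mu):
    """Group 1-based subtask numbers by the node each one is mapped to."""
    placement = defaultdict(list)
    for subtask, node in enumerate(mu, start=1):
        placement[node].append(subtask)
    return dict(placement)


def random_neighbor(mu, n_nodes, position, rng):
    """Copy mu and move the subtask at `position` to a different node in [1, n_nodes]."""
    new_node = rng.randint(1, n_nodes - 1)
    if new_node >= mu[position]:
        new_node += 1  # skip the current node so the mapping really changes
    neighbor = list(mu)
    neighbor[position] = new_node
    return neighbor


mu = [12, 7, 5, 4]  # subtask 1 on node 12, subtask 2 on node 7, ...
neighbor = random_neighbor(mu, n_nodes=184, position=0, rng=random.Random(1))
```

Changing a single entry of μ is one natural choice of neighborhood for the EO update, since each entry is one variable of the problem.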
Fitness of a solution
Fitness function of a mapping solution:
where
• θij^comp is the computation time needed to execute the subtask i on the node j to which it is assigned by the proposed mapping solution
• θij^comm is the communication time required to execute the subtask i on the node j to which it is assigned by the proposed mapping solution
• θij = θij^comp + θij^comm is the total time needed to execute the subtask i on the node j to which it is assigned by the proposed mapping solution.
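The formulas for the two time components are not legible in this transcript. The sketch below shows one way they could be computed from the quantities defined earlier (γ, ψ, α, ℓ, β). Both formulas are assumptions: computation time as instructions divided by the unloaded share of node power, and communication time as data volume divided by the bandwidth between the two hosting nodes. Node indices are 0-based in the code, and the units (MI, MI/s, Mbit, Mbit/s) are also assumed.

```python
def subtask_times(mu, gamma, psi, alpha, load, beta):
    """Return theta_i for every subtask under mapping mu (0-based node indices).

    Assumed formulas (not taken from the slide):
      comp = gamma[i] / (alpha[j] * (1 - load[j]))
      comm = sum over m of psi[i][m] / beta[j][mu[m]]
    """
    thetas = []
    for i, j in enumerate(mu):
        available = alpha[j] * (1.0 - load[j])
        if available <= 0.0:
            raise ValueError(f"node {j} has no spare capacity")
        comp = gamma[i] / available
        comm = sum(psi[i][m] / beta[j][mu[m]]
                   for m in range(len(mu)) if psi[i][m] > 0)
        thetas.append(comp + comm)
    return thetas


# Hypothetical two-node system and two-subtask application.
alpha = [1000.0, 500.0]                     # node power (MI/s assumed)
load = [0.0, 0.5]                           # average load in [0.0, 1.0]
beta = [[1000.0, 100.0], [100.0, 1000.0]]   # bandwidth (Mbit/s assumed); diagonal = same node
gamma = [90000.0, 90000.0]                  # instructions per subtask (MI)
psi = [[0.0, 100.0], [100.0, 0.0]]          # communication amounts (Mbit)

same_node = subtask_times([0, 0], gamma, psi, alpha, load, beta)
split = subtask_times([0, 1], gamma, psi, alpha, load, beta)
```

In this toy setting, placing both subtasks on the fast unloaded node gives 90.1 time units each, while moving the second one to the slower, half-loaded node raises its time to 361. How the per-subtask times are combined into the global fitness Φ is left open here, because the slide's formula is missing.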
Experimental results
τ-EO parameter setting:
• Niter = 200,000
• τ = 3.0
• 20 runs on each problem
Two experiments reported in the presentation:
• a simulated execution of an application in a test grid (10 sites, 184 nodes with different power and load)
• an optimization of a ProActive application in a cluster (7 homogeneous, two-core nodes).
Experiment 1 – the test grid
A grid with 10 sites, a total of 184 nodes
Experiment 1 - features of the nodes
Average loads: ℓi(Δt) = 0.0 for all nodes except:
• ℓi(Δt) = 0.5 for each i ∈ [22, …, 31]
• ℓi(Δt) = 0.5 for each i ∈ [42, …, 47]
The most powerful nodes are the first 22 of A and the first 10 of B
[Figure: two bar charts over node index ranges. Power: ranges 0-21, 22-31, 32-41, 42-47, 48-55, 56-71, 72-103, 104-123, 124-139, 140-147, 148-163, 164-183; vertical axis from 0 to 3000. Bandwidth: ranges 0-31, 32-47, 48-55, 56-71, 72-103, 104-123, 124-139, 140-147, 148-163, 164-183; vertical axis from 0 to 2000.]
Experiment 1 – the application
P=30 nodes
[Figure: the task graph, with two groups of subtasks, G1 (T0 … T14) and G2 (T15 … T29). Each subtask has γk = 90,000 MI; edges labeled ψkm = 100 Mbit connect Ti with Ti+15.]
Experiment 1 – the result
The optimal allocation entails both the use of the most powerful nodes and the placement of each pair of communicating tasks on the same site, so that communications are faster (only intrasite, no intersite)
The solution allocates 11 task pairs on the 22 unloaded nodes in A and the remaining 4 pairs on 8 unloaded nodes in B:
2 22 41 12 20 17 21 13 35 16 18 4 14 39 40
23 1 37 9 11 3 8 10 38 19 5 15 6 33 32
Experiment 2 – the cluster
A cluster:
• 7 homogeneous two-core nodes
• Gigabit Ethernet LAN
• average extra load ℓi(Δt) = 0.0 for all nodes
• each node has a Sun JVM installed and an ssh agent
The scenario of the experiment:
1. CPU power, load and network utilization monitoring
2. measuring of application parameters (using the sample data)
3. mapping optimization and the final run.
Experiment 2 – the application
A ProActive Java multi-threaded application, working according to the DAG model:
• 58 nodes
• the DAG is executed in a loop (200 iterations)
Experiment 2 – the result
Since the nodes are homogeneous and carry no extra load, the EO mapping balanced the amount of computations assigned to each node:
Node nb   Amount of computations
0         1999
1         2005
2         1993
3         2004
4         2002
5         1980
6         1994
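How even this distribution is can be checked with a few lines of arithmetic. The sketch below uses only the figures from the table; the variable names are illustrative.

```python
# Amount of computations per node, as listed in the table.
computations = {0: 1999, 1: 2005, 2: 1993, 3: 2004, 4: 2002, 5: 1980, 6: 1994}

total = sum(computations.values())
mean = total / len(computations)
spread = max(computations.values()) - min(computations.values())
relative_spread = spread / mean  # gap between most and least loaded node, relative to the mean
```

The gap between the most and the least loaded node is 25 units, roughly 1.25% of the mean.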
Typical evolution of τ-EO on a mapping problem
Evolution of the best-so-far value is shown on the left, and both best-so-far and current solutions for the first 200 iterations are shown on the right
Conclusions
• Extremal Optimization has been proposed as a viable approach to the mapping of the tasks making up an application in grid environments
• The unique feature of the presented approach is the ability to deal with different load of nodes and the diversity in network bandwidths
• τ-EO shows two very interesting features when compared to other optimization tools based on Evolutionary Algorithms (e.g. Differential Evolution):
– a much higher speed
– its ability to provide stable solutions.