UAB Dynamic Tuning of Master/Worker Applications Anna Morajko, Paola Caymes Scutari, Tomàs Margalef, Eduardo Cesar, Joan Sorribes and Emilio Luque Universitat Autònoma de Barcelona Paradyn/Condor Week 2005 March 2005

Jan 04, 2016



UAB

Dynamic Tuning of Master/Worker Applications

Anna Morajko, Paola Caymes Scutari, Tomàs Margalef, Eduardo Cesar, Joan Sorribes and Emilio Luque

Universitat Autònoma de Barcelona

Paradyn/Condor Week 2005, March 2005


Outline
• Introduction
• MATE
• Number of workers
• Data distribution
• Conclusions


Introduction: Application performance

The main goal of parallel/distributed applications: solve a given problem as fast as possible

Performance is one of the most important issues

Developers must optimize application performance to provide efficient and useful applications


Introduction (II)

Difficulties in finding bottlenecks and determining their solutions for parallel/distributed applications

Many tasks that cooperate with each other

Application behavior may change depending on the input data or the environment

Difficult task especially for non-expert users


MATE: Monitoring, Analysis and Tuning Environment

Dynamic automatic tuning of parallel/distributed applications

[Figure: the MATE tuning loop. Diagram labels: User, Application development, Source, Application, Execution, Events, Monitoring, Instrumentation, Performance data, Performance analysis, Problem / Solution, Tuning, Modifications, Tool, DynInst.]


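MATE (II)

[Figure: MATE architecture across three machines. Labels: a pvmd daemon and an AC on the machines that run the application tasks (Task1, Task2, Task3, each with a DMLib), and the Analyzer; arrows labeled instr., modif., and events.]

Components: Application Controller (AC), Dynamic Monitoring Library (DMLib), Analyzer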


Analyzer
• Carries out the application performance analysis
• Detects problems “on the fly” and requests changes


Application Controller (AC)
• Controls the execution of the application
• Has a Monitor module to manage instrumentation via DynInst and gather execution information
• Has a Tuner module to perform tuning via DynInst


Dynamic Monitoring Library (DMLib)
• Facilitates the instrumentation and data collection
• Responsible for registration of events


MATE (III)

Automatic performance analysis on the fly
• Find bottlenecks among events by applying a performance model
• Find solutions that overcome the bottlenecks

The Analyzer is provided with application knowledge about performance problems
• Information related to one problem is called a tuning technique
• A tuning technique describes a complete performance optimization scenario


MATE (IV)

Each tuning technique is implemented in MATE as a “tunlet”

A tunlet is a C/C++ library dynamically loaded into the Analyzer process
• measure points: what events are needed
• performance model: how to determine bottlenecks and solutions
• tuning actions/points/synchronization: what to change, where, and when

[Figure: the Analyzer hosting a tunlet made of measure points, a performance model, and tuning point, action, and sync.]
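
The three parts of a tunlet can be pictured as a small data structure. The sketch below is only an illustration in Python (MATE tunlets are C/C++ libraries); every name in it (MeasurePoint, TuningAction, Tunlet, toy_model, the metric keys, the sync point label) is hypothetical, and the rule inside toy_model is a placeholder, not one of MATE's performance models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class MeasurePoint:
    """An event the tunlet needs: which function, and entry or exit."""
    function: str
    place: str


@dataclass(frozen=True)
class TuningAction:
    """What to change, to which value, and when the change may be applied."""
    variable: str
    value: float
    sync_point: str


@dataclass
class Tunlet:
    """Hypothetical container for the three parts of a tuning technique."""
    name: str
    measure_points: List[MeasurePoint]
    performance_model: Callable[[Dict[str, float]], Optional[TuningAction]]


def toy_model(metrics: Dict[str, float]) -> Optional[TuningAction]:
    # Placeholder rule, not MATE's model: request one more worker
    # when the master was idle for more than half of the iteration.
    if metrics["master_idle_fraction"] > 0.5:
        return TuningAction("numworkers", metrics["numworkers"] + 1, "iteration_start")
    return None


tunlet = Tunlet(
    name="number_of_workers",
    measure_points=[MeasurePoint("send", "entry"), MeasurePoint("receive", "exit")],
    performance_model=toy_model,
)
action = tunlet.performance_model({"master_idle_fraction": 0.8, "numworkers": 4})
```

Keeping the model as a plain function of the collected measurements mirrors the split on the slide: the measure points say what to collect, the model decides, and the returned action says what to change and when.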


MATE (V)

[Figure: internal structure of the Analyzer. Components: Event Collector thread, Event Repository, Application model, Controller, DTAPI, tunlets, and AC Proxy. Inputs: events from the DMLibs and metadata from the ACs, via TCP/IP. Outputs: instrumentation requests to the monitor and tuning requests to the tuner, via TCP/IP.]


Number of Workers

Master/Worker paradigm
• Easy-to-understand concept, but with some bottlenecks
• Example: inadequate number of workers
  • Too few workers: the master is idle
  • Too many workers: more communication

[Figure: one master connected to four workers.]


Number of Workers (II)

[Figure: execution trace of a homogeneous Master/Worker application (homogeneous message size and worker execution time), showing the master and the workers.]

Sending the tasks to the workers:
  if tl > λ*vi then n*tl + λ*vi
  else tl + sum(i = 0 .. n-1) λ*vi

Where...
• tl = latency
• λ = inverse bandwidth
• vi = size of tasks sent to worker i, in bytes
• n = current number of workers in the application

Processing the tasks:
  tci

Where...
• tci = time that worker i spends processing a task

Returning the results to the master:
  tl + λ*vm

Where...
• tl = latency
• λ = inverse bandwidth
• vm = size of results sent back to the master


Number of Workers (III)

[Equations: expressions for the total iteration time Tt in terms of n, tl, λ, V, p and Tc, given by cases that compare tl with λ*p*V/n, from which the optimal number of workers follows.]

  Nopt = sqrt((λ*V + Tc) / tl)


Number of Workers (IV)


Number of Workers: Tunlet

Measure points:
• The amount of data sent to the workers and received by the master
• The total computational time of workers
• The network overhead and bandwidth

[Figure: timelines of Machine A (master) and Machine B (worker), with events recorded at the entry and exit of each send and receive call.]
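
As a rough illustration of how such entry/exit events could yield one of the measure points, the sketch below sums a worker's computation time as the gaps between leaving a receive and entering the next send. This pairing rule, the event tuple layout, and the function name are assumptions made for the example; the slides do not say how MATE derives the value.

```python
def total_compute_time(events):
    """Sum the time a worker spends between receive (exit) and send (entry).

    events: time-ordered list of (timestamp, call, place) tuples for one worker,
    where call is "send" or "receive" and place is "entry" or "exit".
    """
    total = 0.0
    started = None  # timestamp at which the current task was received
    for timestamp, call, place in events:
        if call == "receive" and place == "exit":
            started = timestamp
        elif call == "send" and place == "entry" and started is not None:
            total += timestamp - started
            started = None
    return total


worker_events = [
    (0.0, "receive", "entry"), (1.0, "receive", "exit"),
    (4.0, "send", "entry"), (4.5, "send", "exit"),
    (5.0, "receive", "entry"), (6.0, "receive", "exit"),
    (8.0, "send", "entry"), (8.5, "send", "exit"),
]
compute_time = total_compute_time(worker_events)
```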


Number of Workers: Tunlet (II)

Performance function: calculation of the optimal number of workers:

  Nopt = sqrt((λ*V + Tc) / tl)

Tuning actions: change the value of “numworkers” to add or remove as many workers as needed
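
A minimal numeric sketch of the performance function, assuming it has the form Nopt = sqrt((λ*V + Tc) / tl), with V taken as the total data volume exchanged and Tc as the total computation time of the workers (a reading suggested by the measure points, not stated on the slide). The function name, parameter names, and units are illustrative only.

```python
import math


def optimal_workers(latency, inverse_bandwidth, data_volume, compute_time):
    """Evaluate Nopt = sqrt((lambda * V + Tc) / tl) under the assumptions above.

    latency: tl, in seconds
    inverse_bandwidth: lambda, in seconds per byte
    data_volume: V, in bytes
    compute_time: Tc, in seconds
    """
    if latency <= 0:
        raise ValueError("latency must be positive")
    return math.sqrt((inverse_bandwidth * data_volume + compute_time) / latency)


n_opt = optimal_workers(
    latency=0.001, inverse_bandwidth=1e-7, data_volume=1e6, compute_time=9.9
)
```

With these example inputs (1 ms latency, 0.1 s of transfer time, 9.9 s of computation) the result is about 100 workers. A tuner would still have to round the value and clamp it to the machines actually available.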


Experimentation

Example application
• Forest Fire Propagation simulator – Xfire
• Compute-intensive application
• Master/Worker
• Simulation of the fireline propagation
• Calculates the next position of the fireline considering the current fireline position, weather factors, vegetation, etc.

Platform
• Cluster of Pentium 4, 1.8 GHz, SuSE Linux 8.0, connected by a 100 Mb/s network


Experimentation (II)

Load in the system
• We designed different external load patterns
• They simulate the system’s time-sharing
• They allow us to reproduce experiments

Case studies
• Xfire executed with different fixed numbers of workers without any tuning, introducing external loads
• Xfire executed under MATE, introducing external loads


Experimentation (III)

[Figure: bar chart of execution time (sec., axis from 0 to 1400) for each case study: Xfire with a fixed number of workers (1, 2, and every even number from 4 to 26) and Xfire under MATE (Xf+MATE), which starts with 1 worker and adapts the number.]

Note that...

• Execution time of Xfire under MATE is close to the best execution times obtained.

• Resources devoted to the application under MATE are used only when they are really needed.


Experimentation (IV)

Statically, the model fits

Dynamically, there are some problems
• Nopt could be extremely high
• Computation power added or removed may not be significant compared with the previous computational power

Solution
• Find a “reasonable” number of workers that defines a trade-off between resource utilization and execution time.


Data Distribution

Imbalance problem:
• Heterogeneous computing and communication powers
• Varying amount of distributed work

[Figure: master and worker timelines comparing an unbalanced iteration with a balanced iteration.]


Data Distribution (II)

Goal:
• Minimize the idle time by balancing the work among the processes, considering the efficiency of the machines

Performance model
• Factoring Scheduling method
• Work is divided into different-size tuples according to the factor

Work size (N)   Number of workers (P)   Factor (f)   Tuples
1000            2                       1            500, 500
1000            2                       0.5          250, 250, 125, 125, 63, 63, 32, 32, 16, 16, 8, 8, 4, 4, 2, 2, 1, 1
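
The way a factor turns a work size into tuples can be sketched as follows. The rule used here, where each round hands every worker ceil(f * remaining / P) units, is an assumption chosen to match the shape of the table rows; the function and parameter names are hypothetical. Because this variant tracks the remaining work, its tuples always sum exactly to N, so individual values can differ slightly from the rounded ones in the table.

```python
import math


def factoring_tuples(work_size, workers, factor):
    """Split work_size units into rounds of `workers` equal tuples.

    In each round every worker is offered ceil(factor * remaining / workers)
    units, so tuples shrink from round to round when factor < 1.
    """
    if workers < 1 or not 0 < factor <= 1:
        raise ValueError("need workers >= 1 and 0 < factor <= 1")
    remaining = work_size
    tuples = []
    while remaining > 0:
        size = math.ceil(remaining * factor / workers)
        for _ in range(workers):
            take = min(size, remaining)  # never hand out more than is left
            if take == 0:
                break
            tuples.append(take)
            remaining -= take
    return tuples


one_round = factoring_tuples(1000, 2, 1.0)
halving = factoring_tuples(1000, 2, 0.5)
```

With f = 1 all the work goes out in a single round of two tuples of 500 units; with f = 0.5 the first two rounds are 250, 250 and 125, 125, and later rounds keep shrinking, which leaves small tuples available to even out differences between workers at the end of the iteration.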


Data Distribution: Tunlet

Measure points:
• The work unit processing time
• The latency and bandwidth

Performance function: calculation of the factor
• The Analyzer simulates the execution considering different factors
• Finally, it decides the best factor
• Currently we are working on an analytical model to determine the factor

Tuning actions:
• Change the value of “TheFactorF”
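
One way to picture "simulate the execution with different factors and keep the best" is the toy simulation below. It is a sketch under several assumptions that the slides do not confirm: tuples follow the rule ceil(f * remaining / P) per round, each tuple goes to whichever worker becomes free first, every tuple costs a fixed latency plus the worker's per-unit processing time, and the best factor is the one with the shortest simulated iteration. All function names are hypothetical.

```python
import heapq
import math


def factoring_tuples(work_size, workers, factor):
    """Rounds of `workers` tuples, each of ceil(factor * remaining / workers) units."""
    remaining, tuples = work_size, []
    while remaining > 0:
        size = math.ceil(remaining * factor / workers)
        for _ in range(workers):
            take = min(size, remaining)
            if take == 0:
                break
            tuples.append(take)
            remaining -= take
    return tuples


def simulate_iteration(tuples, unit_times, latency):
    """Return the simulated iteration time for the given tuples.

    unit_times[w] is the measured time worker w needs per work unit.
    """
    free_at = [(0.0, w) for w in range(len(unit_times))]  # (time free, worker)
    heapq.heapify(free_at)
    finish = 0.0
    for size in tuples:
        now, w = heapq.heappop(free_at)  # first worker to become free
        done = now + latency + size * unit_times[w]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, w))
    return finish


def best_factor(work_size, unit_times, latency, candidates):
    """Pick the candidate factor with the shortest simulated iteration."""
    return min(
        candidates,
        key=lambda f: simulate_iteration(
            factoring_tuples(work_size, len(unit_times), f), unit_times, latency
        ),
    )


# Two workers, the second three times slower than the first.
single_round = simulate_iteration(factoring_tuples(1000, 2, 1.0), [1.0, 3.0], 0.0)
chosen = best_factor(1000, [1.0, 3.0], 0.0, [1.0, 0.5])
```

In this example a single round (f = 1) leaves the fast worker idle while the slow one finishes its 500 units, so the simulation prefers f = 0.5. A larger latency shifts the balance back toward fewer, bigger tuples.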


Experimentation

Example application
• Forest Fire Propagation simulator – Xfire

Platform
• Cluster of Pentium 4, 1.8 GHz, SuSE Linux 8.0, connected by a 100 Mb/s network


Experimentation (II)

Load in the system
• We designed different external load patterns
• They simulate the system’s time-sharing
• They permit us to reproduce experiments

Study cases
• Xfire executed without any tuning
• Xfire, introducing controlled variable external loads
• Xfire executed under MATE, introducing variable external loads


Experimentation (III)

Note that…

• Introduction of an extra load increases the execution time.

• Execution with MATE corrects the factor value to improve the execution time

[Figure: execution time (sec., axis from 0 to 18000) versus number of workers (1, 2, 4, 8, 16, 30) for three series: Xfire, Xfire+Load, and Xfire+Load+MATE.]


Conclusions and open lines

Conclusions
• The prototype environment, MATE, automatically monitors, analyses and tunes running applications
• Practical experiments conducted with MATE and parallel/distributed applications show that it automatically adapts application behavior to existing conditions at run time
• In particular, MATE is able to tune Master/Worker applications and overcome two possible bottlenecks: number of workers and data distribution
• Dynamic tuning works and is applicable, effective and useful under certain conditions.


Open Lines

Determining the “reasonable” number of workers.

Considering interaction between different tunlets.

Providing the system with other tuning techniques.


Thank you…