Dynamic Monitoring and Tuning in Multicluster Environment
Genaro Costa, Anna Morajko, Paola Caymes Scutari, Tomàs Margalef and Emilio Luque
Universitat Autònoma de Barcelona
Paradyn Week 2006, March 2006
Outline
Introduction
Multicluster Systems
Applications on Wide Systems
MATE
New Requirements
Design
Conclusions
Introduction: system performance
New problems require more computing power. Performance is a key issue.
New wide systems are built over existing resources, and the user does not fully control where the application will run.
Reaching high performance and efficiency on these wide systems becomes more difficult.
Introduction (II)
To reach performance goals, users need to find and solve bottlenecks.
Dynamic Monitoring and Tuning is a promising approach.
Because system properties change dynamically, efficient resource use is hard to achieve even for expert users.
Multicluster Systems
New systems are built from existing resources. Examples are networks of workstations (NOW) and heterogeneous NOW (HNOW) linked by multistage network interconnections.
Intra-cluster communications have different latencies than inter-cluster communications.
Generally, multiclusters are built of clusters (homogeneous or heterogeneous) interconnected by a WAN.
Multicluster Systems (II)
Each cluster can have its own scheduler and can be exposed either through a head node or through all of its nodes.
[Figure: clusters A, B and C, each running its own scheduler (Condor/LSF/PBS); a job is submitted to cluster B through its head node.]
Applications on Wide Systems
Hierarchical Master/Worker Applications
Wide systems raise the possibility of performance bottlenecks:
Load imbalance problems
Inefficient resource use
Non-deterministic inter-cluster bandwidth
[Figure: a master in cluster A dispatches work to its local workers and to a sub-master in cluster B, which serves its own workers. The sub-master exploits data locality: common data are transmitted between clusters only once.]
Applications on Wide Systems (II)
Hierarchical Master/Worker Applications
The master sees a sub-master as a single node of high computing power.
Work distribution from the master to a sub-master should be based on:
Available bandwidth
Computing power
These characteristics may behave dynamically.
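As a minimal sketch of such a policy, the shares could be made proportional to each sub-master's measured computing power scaled by the bandwidth of its link (the function name and the multiplicative weighting are assumptions for illustration, not MATE's actual model):

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Fraction of the total work each sub-master should receive,
// proportional to measured computing power times available
// bandwidth (illustrative weighting, not the real model).
std::vector<double> work_shares(const std::vector<double>& power,
                                const std::vector<double>& bandwidth) {
    std::vector<double> w(power.size());
    for (std::size_t i = 0; i < w.size(); ++i)
        w[i] = power[i] * bandwidth[i];          // combined weight
    double total = std::accumulate(w.begin(), w.end(), 0.0);
    for (double& x : w) x /= total;              // normalize to fractions
    return w;
}
```

Since both inputs vary at run time, the shares would be recomputed whenever the monitor delivers new measurements.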
MATE: Monitoring, Analysis and Tuning Environment
Dynamic automatic tuning of parallel/distributed applications.
[Figure: MATE's closed loop. The user develops the application source; during execution, the tool instruments the application with DynInst, collects events (monitoring), runs performance analysis on the performance data to map problems to solutions, and applies modifications back to the running application (tuning).]
MATE (II)
[Figure: MATE deployed across three machines. On each machine an AC instruments the local tasks (Task1-Task3); each task's DMLib sends events to the Analyzer, which returns modification requests to the ACs.]
MATE components: Application Controller (AC), Dynamic Monitoring Library (DMLib), and Analyzer.
MATE (III)
Each tuning technique is implemented in MATE as a “tunlet”, a C/C++ library dynamically loaded into the Analyzer process. A tunlet defines:
measure points – what events are needed
performance model – how to determine bottlenecks and solutions
tuning actions/points/synchronization – what to change, where, and when
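The three responsibilities above could be expressed as an interface like the following sketch (the class and method names are illustrative, not MATE's real DTAPI; the toy imbalance threshold is likewise an assumption):

```cpp
#include <string>
#include <vector>

// Illustrative event record delivered by the monitoring side.
struct Event { int task_id; std::string point; double value; };

// Hypothetical tunlet interface mirroring the three parts a
// tunlet must define: measure points, model, tuning action.
class Tunlet {
public:
    virtual ~Tunlet() = default;
    // measure points: which events the Analyzer must collect
    virtual std::vector<std::string> measurePoints() const = 0;
    // performance model: detect a bottleneck from collected events
    virtual bool detectBottleneck(const std::vector<Event>& evts) = 0;
    // tuning action: what to change, where and when
    virtual void applyTuning(int task_id) = 0;
};

// Toy concrete tunlet: flags load imbalance when the slowest
// task takes more than twice as long as the fastest one.
class ImbalanceTunlet : public Tunlet {
public:
    std::vector<std::string> measurePoints() const override {
        return {"iteration_time"};
    }
    bool detectBottleneck(const std::vector<Event>& evts) override {
        if (evts.empty()) return false;
        double lo = evts[0].value, hi = evts[0].value;
        for (const Event& e : evts) {
            if (e.value < lo) lo = e.value;
            if (e.value > hi) hi = e.value;
        }
        return hi > 2.0 * lo;
    }
    void applyTuning(int) override { /* e.g., move work away from the slow task */ }
};
```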
[Figure: the Analyzer hosts tunlets through the DTAPI; each tunlet bundles its performance model, measure points, and tuning point/action/synchronization.]
New Requirements
Transparent process tracking: the AC should follow application processes to any cluster.
Lower inter-cluster instrumentation communication overhead: inter-cluster links generally have higher latency and lower bandwidth.
Transparent process tracking
System service: a machine or cluster can run MATE as a daemon that detects the startup of new processes.
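One simple way to realize such detection, sketched under the assumption of a polling daemon (the real service could instead use OS-specific hooks), is to diff two snapshots of the process table and attach the AC to every PID that appears:

```cpp
#include <set>
#include <vector>

// Returns the PIDs present in `after` but not in `before`;
// the daemon would attach an AC to each of these new processes.
std::vector<int> new_pids(const std::set<int>& before,
                          const std::set<int>& after) {
    std::vector<int> started;
    for (int pid : after)
        if (before.count(pid) == 0)   // present now, absent before
            started.push_back(pid);
    return started;
}
```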
[Figure: on each MATE-enabled machine, the AC detects the startup of a new task, attaches to it through the DMLib, takes control of the new ‘Task’, and subscribes it to the Analyzer, which then receives its information.]
DESIGN
Transparent process tracking
Application plug-in: the AC can be packaged together with the application binary.
[Figure: at job submission, the AC packaged with the binary starts on the remote machine, detects the new ‘Task’, uses DynInst to create the DMLib inside it and take control, and subscribes the new ‘Task’ to the Analyzer.]
DESIGN (II)
Lower communication overhead: smart event collection
A full application trace may generate substantial overhead.
Event aggregation: remote trace events should be aggregated into trace-event abstractions, saving bandwidth.
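A minimal sketch of such an abstraction, assuming a summary of count/sum/min/max is enough for the remote performance model (the field choice is an assumption, not MATE's actual aggregation format):

```cpp
#include <cstddef>

// Summary sent across the inter-cluster link instead of
// every raw event value (illustrative abstraction).
struct EventSummary {
    std::size_t count = 0;
    double sum = 0, min = 0, max = 0;
};

// Collapse a batch of raw event values into one summary.
EventSummary aggregate(const double* values, std::size_t n) {
    EventSummary s;
    for (std::size_t i = 0; i < n; ++i) {
        double v = values[i];
        if (s.count == 0 || v < s.min) s.min = v;
        if (s.count == 0 || v > s.max) s.max = v;
        s.sum += v;
        ++s.count;
    }
    return s;
}
```

One summary then crosses the slow link per batch instead of n raw events, cutting instrumentation traffic roughly n-fold.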
Inter-cluster trace event routing
DESIGN (III)
Analyzer Approaches
Centralized: requires modifying tunlets to distinguish the instrumentation data of local application processes.
Hierarchical: requires splitting tunlets into local tunlets and global tunlets.
Distributed: requires that tunlet instances located on different Analyzer instances cooperate to tune an application.
Lower communication overhead (II)
Centralized Analyzer approach
[Figure: a single Analyzer in cluster A serves both clusters; the ACs and tasks on machines A1-A3 report to it directly, while those on machines B1-B3 in cluster B forward their events through an Event Router over the inter-cluster link.]
DESIGN (IV)
Hierarchical Analyzer approach
[Figure: each cluster runs a Local Analyzer that performs local performance-model analysis over its cluster's ACs and tasks; the Local Analyzers forward abstract events to a Global Analyzer.]
DESIGN (V)
Distributed Monitoring, Analysis and Tuning Environment
Distributed Analyzer approach
[Figure: each cluster runs its own Analyzer over the local ACs and tasks; tunlet instances in the different Analyzers cooperate across the inter-cluster link.]
DESIGN (VI)
Conclusions and future work
Conclusions
The instrumentation traffic added to inter-cluster communication must be minimal.
Process tracking enables MATE for multicluster systems.
The centralized Analyzer approach is simplest for tunlet developers but does not scale.
The distributed Analyzer approach scales but requires a different, model-based analysis.
Conclusions and future work (II)
Future Work
Development of new tunlets for the distributed and hierarchical Analyzer approaches.
Tuning based only on local instrumentation data.
Semantics of aggregation for instrumentation events.
Patterns of distributed tunlet cooperation.
Scenarios of distributed Analyzer cooperation in multiclusters.
Thank you…