Scheduling in Heterogeneous Grid Environments: The Effects of Data Migration
Leonid Oliker, Hongzhang Shan, Future Technology Group, Lawrence Berkeley Research Laboratory
Warren Smith, Rupak Biswas, NASA Advanced Supercomputing Division, NASA Ames Research Center
• Difficult to schedule and manage efficiently
  – Autonomy (local scheduler)
  – Heterogeneity
  – Lack of perfect global information
  – Conflicting requirements between users and system administrators
Current Status
• Grid Initiatives
  – Global Grid Forum, NASA Information Power Grid, TeraGrid
  – Multi-disciplinary applications, remote visualization, co-scheduling, distributed data mining, parameter studies
  – Job migration
    · Improve time-to-solution
    · Avoid dependency on a single resource provider
    · Optimize application mapping to the target architecture
    · But what are the tradeoffs of data migration?
Our Contributions
• Interaction between grid scheduler and local scheduler
• Architecture: distributed, centralized, and ideal
• Real workloads
• Performance metrics
• Job migration overhead
• Superscheduler scalability
• Fault tolerance
• Multi-resource requirements
Distributed Architecture
[Figure: distributed architecture. Each site's local environment comprises a compute server (PE … PE), a local queue, and a local scheduler; the grid environment adds a grid scheduler, middleware, and a grid queue. Jobs and status info flow between the local and grid schedulers over the communication infrastructure.]
Interaction between Grid and Local Schedulers
[Figure: the grid scheduler exchanges job requests (JR) and AWT & CRU status with each local scheduler; jobs move between the local queue and the middleware grid queue.]
• If AWT < threshold: the job stays in the local queue
• Else: considered for migration
  – Sender-Initiated (S-I)
  – Receiver-Initiated (R-I)
  – Symmetrically-Initiated (Sy-I)
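The migration decision above can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the threshold value, field names, and the tie-breaking rule in the sender-initiated target selection are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class SiteStatus:
    name: str
    awt: float  # average wait time (AWT) reported by the local scheduler, seconds
    cru: float  # current resource utilization (CRU), 0.0-1.0

def considered_for_migration(local_awt: float, threshold: float) -> bool:
    # If AWT is below the threshold the job stays in the local queue;
    # otherwise it is considered for migration.
    return local_awt >= threshold

def sender_initiated_target(remote_sites: list[SiteStatus]) -> SiteStatus:
    # S-I sketch: the overloaded sender pushes the job to the remote site
    # with the lowest utilization (ties broken by lowest AWT).
    return min(remote_sites, key=lambda s: (s.cru, s.awt))
```

Receiver-initiated (R-I) would invert the roles, with lightly loaded sites pulling work; symmetrically-initiated (Sy-I) combines both.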
• Each local site network has a peak bandwidth of 800 Mb/s (gigabit Ethernet LAN)
• External network has 40 Mb/s available point-to-point (high-performance WAN)
• Assume all data transfers share the network equally (network contention is modeled)
• Assume performance is linearly related to CPU speed
• Assume users pre-compiled code for each of the heterogeneous platforms
• Systems located at Lawrence Berkeley Laboratory, NASA Ames Research Center, Lawrence Livermore Laboratory, San Diego Supercomputing Center
• Data volume info not available; assume volume is correlated to volume of work
  – B is the number of KBytes per work unit (CPU × runtime)
  – Our best estimate is B = 1 KB for each CPU second of application execution
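A back-of-the-envelope sketch of these assumptions, assuming B = 1 KB per CPU-second of work and the 40 Mb/s WAN being split equally among concurrent transfers (the function names and the simple equal-share contention model are illustrative, not from the paper):

```python
def migrated_volume_kb(cpus: int, runtime_s: float, b_kb: float = 1.0) -> float:
    # Data volume correlated with work: B KBytes per CPU-second (CPU * runtime).
    return b_kb * cpus * runtime_s

def wan_transfer_time_s(volume_kb: float, concurrent_transfers: int = 1,
                        wan_mbps: float = 40.0) -> float:
    # Transfer time over the point-to-point WAN, with the available
    # bandwidth shared equally among concurrent transfers.
    effective_mbps = wan_mbps / concurrent_transfers
    bits = volume_kb * 1024 * 8
    return bits / (effective_mbps * 1e6)
```

For example, a 64-CPU job running one hour migrates about 230 MB under B = 1, which takes well under a minute on an uncontended 40 Mb/s link, consistent with the small DMOH values reported below for the B workload.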
Scheduling Policy
• Large potential gain from using a grid superscheduler
  – Reduced average wait time by 25X compared with the local scheme!
• Sender-Initiated performance comparable to Centralized
• Inverse relationship between migration metrics (FOJM, FDVM) and timing metrics (NAWT, NART)
• Very small fraction of response time spent moving data (DMOH)
[Chart: normalized values (0.0–1.0) of NAWT, NART, FOJM, FDVM, and DMOH for the S-I, R-I, Sy-I, Centralized, and Local schemes; 12 sites, Workload B.]
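The metric acronyms used throughout these charts can be sketched as follows. The exact definitions and normalization baselines are assumptions inferred from the acronym expansions (normalized average wait/response time, fraction of jobs/data volume migrated, data migration overhead), not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class JobRecord:
    wait_s: float     # time spent queued
    run_s: float      # execution time
    migrate_s: float  # time spent moving data (0 if not migrated)
    data_kb: float    # data volume associated with the job
    migrated: bool

def metrics(jobs: list[JobRecord], baseline_awt_s: float,
            baseline_art_s: float) -> dict:
    n = len(jobs)
    awt = sum(j.wait_s for j in jobs) / n
    art = sum(j.wait_s + j.migrate_s + j.run_s for j in jobs) / n
    total_resp = sum(j.wait_s + j.migrate_s + j.run_s for j in jobs)
    return {
        "NAWT": awt / baseline_awt_s,                    # normalized avg wait time
        "NART": art / baseline_art_s,                    # normalized avg response time
        "FOJM": sum(j.migrated for j in jobs) / n,       # fraction of jobs migrated
        "FDVM": sum(j.data_kb for j in jobs if j.migrated)
                / sum(j.data_kb for j in jobs),          # fraction of data volume migrated
        "DMOH": sum(j.migrate_s for j in jobs) / total_resp,  # migration overhead share
    }
```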
Data Migration Sensitivity
• NAWT for 100B is almost 8X that of B; NART is 50% higher
• DMOH increases to 28% and 44% for 10B and 100B, respectively
• As B increases, the data volume migrated (FDVM) decreases due to the increasing overhead
• FOJM is inconsistent because it measures the number of jobs, NOT the data volume
[Chart: normalized values (0.0–1.0) of NAWT, NART, FOJM, FDVM, and DMOH for data volumes 0B, 0.1B, B, 10B, and 100B; Sender-Initiated, 12 sites.]
Site Number Sensitivity
• 0.1B causes no site sensitivity
• 10B has a noticeable effect as the number of sites decreases from 12 to 3:
  – Decrease in time (NAWT, NART) due to the increase in network bandwidth
  – Increase in the fraction of data volume migrated (FDVM)
  – 40% increase in the fraction of response time spent moving data (DMOH)
[Chart: normalized values (0.0–1.0) of NAWT, NART, FOJM, FDVM, and DMOH for 0.1B and 10B at 12, 6, and 3 sites; Sender-Initiated.]
Communication Oblivious Scheduling
• For 10B, if the data migration cost is not considered in the scheduling algorithm:
  – NART increases 14X and 40X for 12 sites and 3 sites, respectively
  – NAWT increases 28X and 43X for 12 sites and 3 sites, respectively
  – DMOH is over 96%! (only 3% for the B data set)
  – 16% of all jobs are blocked from executing while waiting for data, compared with practically 0% for communication-aware scheduling