Page 1: Grid Simulator with Production Scheduling Algorithms

Grid Simulator with Production Scheduling Algorithms

Miroslav Ruda (1), Hana Rudová (2)

(1) Institute of Computer Science, Masaryk University
(2) Faculty of Informatics, Masaryk University

Cracow, 2007

Outline
- Motivation
- Simulator with Virtual Machines
- Experimental Testbed
- Preemption
- Conclusion

Page 2: Grid Simulator with Production Scheduling Algorithms

Scheduling Algorithms

Algorithms in Grid simulators
- SimGrid, GridSim, GSSIM, Alea
- development and testing of new algorithms
- comparison of algorithms

Algorithms in production systems
- PBSPro, SGE, Maui, Moab
- in simulators: approximated with FIFO (with backfilling)
- hard to reimplement
  - many rules, features, bugs
  - closed source, algorithms not published

Example: www.excludus.com
Information about Excludus "Grid Optimizer"
- uses innovative real-time scheduling algorithm
- dynamic adaptive scheduling beyond traditional workload managers

Page 5: Grid Simulator with Production Scheduling Algorithms

Simulator with production resource management system

Experiments with PBSPro
- different setup of PBS scheduler
  - queues, priorities, backfilling, ...
- different setup of worker nodes
  - number of nodes per queue, ...
- modifications of PBS scheduler
- inclusion of virtual machines into PBS

Future: new scheduler

Page 6: Grid Simulator with Production Scheduling Algorithms

Simulator with Virtual Machines

- Worker nodes represented by virtual machines
- Standard PBS Server and Scheduler
  - running on a dedicated server
- Standard PBS Mom
  - running within each virtual machine
- Sleep jobs
  - no CPU/memory consumption

Page 7: Grid Simulator with Production Scheduling Algorithms

Workloads

Real workloads
- Czech Grid METACentrum
- extracted from PBS accounting
- 2005-2007

Jobs submitted with the same requirements
- on worker nodes
- to the same queues
- with original owners, ...

Time reduction
- configurable reduce factor (600)
- applied to expected and real wall-clock time
- applied to job arrival time
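The time reduction can be sketched in a few lines of bash. This is only an illustration: the function name `scale_time` and the sample job values are invented here, and integer division is assumed because that is what bash `$(( ))` arithmetic does.

```bash
#!/bin/bash
reduce=600    # reduce factor from the slide

# scale_time SECONDS: divide a real duration by the reduce factor
# (integer division, as in bash arithmetic expansion)
scale_time() {
    echo $(( $1 / reduce ))
}

# Hypothetical job record, values invented for illustration
arrival_gap=7200    # 2 h since the previous submission
requested=86400     # 24 h requested wall-clock time
actual=43200        # 12 h real wall-clock time

echo "gap=$(scale_time $arrival_gap)s walltime=$(scale_time $requested)s run=$(scale_time $actual)s"
# prints: gap=12s walltime=144s run=72s
```

With integer division, any duration shorter than the reduce factor (here, under 600 seconds) scales to 0 seconds.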

Page 8: Grid Simulator with Production Scheduling Algorithms

Vserver based virtual machines

- One kernel space - very lightweight
- Similar to Linux chroot or BSD jail, with better protection
- Access limits
  - standard: filesystem
  - added: processes, network devices, ...
- No hardware emulation, no paravirtualisation
  - no performance penalty
- Copy On Write filesystem
  - one RO root filesystem, with RW overlay filesystem
- System daemons
  - running only once in hosting environment

Page 9: Grid Simulator with Production Scheduling Algorithms

Experimental Testbed

Current workloads (year 2007)
- January 4,700 jobs, March 14,000 jobs, Jan-March 70,000 jobs

150 Vserver domains
- 16-core AMD machine (can use more physical machines)
- represents 300 nodes (can be extended)

COW filesystem
- 300 MB for one system installation
- 12 GB used to represent 150 virtual machines

Virtual machine: only PBS Mom + sshd

Submission of all jobs
- without any sleep takes less than 10 minutes

Reduce factor 600
- 1 month -> 1.5 hours, 1 year -> ≤ 1 day
- reasonably small simulation overhead
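As a rough cross-check of these figures, the sketch below applies the reduce factor to a calendar month and year and compares full system copies with the reported COW disk use. The helper name `sim_minutes`, the 30-day month, and 1 GB = 1024 MB are assumptions made for this example.

```bash
#!/bin/bash
reduce=600    # reduce factor from the slide

# sim_minutes DAYS: simulated duration in minutes for DAYS of real workload
sim_minutes() {
    echo $(( $1 * 24 * 3600 / reduce / 60 ))
}

echo "1 month (30 days): $(sim_minutes 30) min"     # 72 min
echo "1 year (365 days): $(sim_minutes 365) min"    # 876 min, about 14.6 hours

# Disk use: 150 full 300 MB installations vs. the 12 GB reported with COW
echo "150 full copies: $(( 150 * 300 )) MB"         # 45000 MB
echo "per VM with COW: $(( 12 * 1024 / 150 )) MB"   # 81 MB
```

Pure scaling gives about 72 minutes for a 30-day month, a little under the 1.5 hours quoted on the slide; the slides do not break down the difference. The yearly figure of about 14.6 hours is consistent with "≤ 1 day".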

Page 10: Grid Simulator with Production Scheduling Algorithms

Evaluation Criteria

Standard monitoring during simulation run
- number of running/waiting/done jobs
- number of used nodes

Analysis of accounting data
- Weighted Response Time (WRT)
- Weighted SlowDown Time (WSD)
- Weighted Wait Time (WWT)
- metrics per user, queue
- also structured by number of nodes used by job

Page 11: Grid Simulator with Production Scheduling Algorithms

Experimental Results for March Workload

Figure: number of running jobs and number of used worker nodes. First simulation with "starvation support", second without.

Figure: number of finished jobs.

Page 12: Grid Simulator with Production Scheduling Algorithms

Experimental Results for Parallel Jobs I.

Figures: number of running jobs; number of finished jobs; number of finished parallel jobs.

Page 13: Grid Simulator with Production Scheduling Algorithms

Experimental Results for Parallel Jobs II.

Wait time based on number of nodes used by job

Nodes     A     B     C     D     E
    1   166   326   261    99   272
    2    40    44    25   244    28
    4   308   337   229   522   352
    8   339   513   252   425   584
   12   434   580   393   615   853
   16   617   343   524   694   355
   28   817   361   758   884   430
   32   820   676   761   857   747
   40  1052   937  1020   808  1048

A: one queue
B: one queue, starvation support
C: two queues
D: two queues, strict FIFO
E: two queues, starvation

Page 14: Grid Simulator with Production Scheduling Algorithms

Jobs with Preemption

Motivation: better support of parallel jobs or priorities

Two virtual machines running on a physical machine
- first machine: standard jobs
- second machine: privileged/parallel jobs

Magrathea allows
- several VMs running on a single computer
- jobs submitted directly to VMs

When a job is started in the privileged domain, Magrathea
- suspends the job in the standard domain (if needed)
- gives almost all CPU/memory resources to the privileged domain (but the standard domain is still running)

Support of simulator
- Magrathea installed on simulated machines too
- sleep jobs must respect preemption

Page 15: Grid Simulator with Production Scheduling Algorithms

Conclusion & Future Work

New Grid simulator
- Inclusion of production resource management system
- New experiments: PBSPro (and other algorithms)
- Novel proposal with virtual machines
- New experiments: scheduling with Magrathea

Future work
- Study: limits of the simulator
  - efficient scheduling algorithms needed
  - cannot use actual load on machines
  - monitoring issues
- New scheduler

Page 16: Grid Simulator with Production Scheduling Algorithms

Standard submit script in simulator

#!/bin/bash
reduce=600    # reduce factor

sleep $(($SIMSLEEP/$reduce))    # gap in workload

# the same node requirements as the original job
sudo -u $SIMUSER qsub -q $SIMQUEUE \
    -l nodes=$SIMNODESL \
    -l walltime=$(($SIMREQL/$reduce)) <<EOF
# sleep instead of real job
sleep $(($SIMWALL/$reduce))
EOF

Page 17: Grid Simulator with Production Scheduling Algorithms

Preemption in simulator

reduce=600
sleep $(($SIMSLEEP/$reduce))
sudo -u $SIMUSER qsub -q sim$SIMQUEUE \
    -l nodes=$SIMNODESL \
    -l walltime=$(($SIMREQL/$reduce)) <<EOF
sleeptime=$(($SIMWALL/$reduce))
# \$ keeps these expansions for the job itself, not the submitting shell
while [ \$sleeptime -gt 0 ]; do
    sleep \$sleeptime
    # check how long the job has been preempted
    sleeptime=\$(magrathea-preempted-time)
done
EOF

Page 18: Grid Simulator with Production Scheduling Algorithms

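Weighted Response Time, SlowDown, and Wait Time

\[ SA_j = \mathit{reqResources}_j \times (\mathit{endTime}_j - \mathit{startTime}_j) \]

\[ \mathit{TotalSA} = \sum_{j \in \mathit{Jobs}} SA_j \]

\[ SD_j = \frac{\mathit{endTime}_j - \mathit{submitTime}_j}{\mathit{runtime}_j} \]

\[ \mathit{WRT} = \frac{\sum_{j \in \mathit{Jobs}} SA_j \times (\mathit{endTime}_j - \mathit{submitTime}_j)}{\mathit{TotalSA}} \]

\[ \mathit{WSD} = \frac{\sum_{j \in \mathit{Jobs}} SA_j \times SD_j}{\mathit{TotalSA}} \]

\[ \mathit{WWT} = \frac{\sum_{j \in \mathit{Jobs}} SA_j \times (\mathit{startTime}_j - \mathit{submitTime}_j)}{\mathit{TotalSA}} \]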
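A minimal sketch of how these three metrics could be computed with awk follows. The input format (one job per line: `reqResources submitTime startTime endTime`) is hypothetical and is not the PBS accounting format. The runtime of a job is assumed to be its end time minus its start time, which the slide does not state explicitly.

```bash
#!/bin/bash
# weighted_metrics: read "reqResources submitTime startTime endTime" lines
# on stdin and print WRT, WSD and WWT weighted by each job's SA.
weighted_metrics() {
    awk '
    {
        res = $1; submit = $2; start = $3; fin = $4
        runtime = fin - start            # assumption: runtime = end - start
        if (runtime <= 0) next           # slowdown is undefined without runtime
        sa = res * runtime               # SA_j
        total_sa += sa
        wrt += sa * (fin - submit)
        wsd += sa * (fin - submit) / runtime
        wwt += sa * (start - submit)
    }
    END {
        if (total_sa > 0)
            printf "WRT=%.2f WSD=%.2f WWT=%.2f\n",
                   wrt / total_sa, wsd / total_sa, wwt / total_sa
    }'
}

# Two invented jobs: 1 node waiting 10 s, 4 nodes waiting 100 s, both run 100 s
printf '1 0 10 110\n4 0 100 200\n' | weighted_metrics
# prints: WRT=182.00 WSD=1.82 WWT=82.00
```

In this example the 4-node job has four times the SA of the 1-node job, so its longer wait dominates all three weighted averages.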