Grid Simulator with Production Scheduling Algorithms Miroslav Ruda, Hana Rudová Motivation Simulator with Virtual Machines Experimental Testbed Preemption Conclusion Grid Simulator with Production Scheduling Algorithms Miroslav Ruda 1 Hana Rudová 2 1 Institute of Computer Science Masaryk University 2 Faculty of Informatics Masaryk University Cracow, 2007
Grid Simulator with Production Scheduling Algorithms
Grid Simulator with Production Scheduling Algorithms
Miroslav Ruda, Hana Rudová
Motivation
Simulator with Virtual Machines
Experimental Testbed
Preemption
Conclusion
Grid Simulator with Production Scheduling Algorithms
Miroslav Ruda (1), Hana Rudová (2)
(1) Institute of Computer Science, Masaryk University
(2) Faculty of Informatics, Masaryk University
Cracow, 2007
Scheduling Algorithms
- Algorithms in Grid simulators
  - SimGrid, GridSim, GSSIM, Alea
  - development and testing of new algorithms
  - comparison of algorithms
- Algorithms in production systems
  - PBSPro, SGE, Maui, Moab
  - in simulators: approximated with FIFO (with backfilling)
  - hard to reimplement
    - many rules, features, bugs
    - closed source, algorithms not published
Example: www.excludus.com
- Information about Excludus "Grid Optimizer"
  - uses innovative real-time scheduling algorithm
  - dynamic adaptive scheduling beyond traditional workload managers
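The "FIFO (with backfilling)" approximation mentioned above can be sketched as a single scheduling pass. This is a minimal illustration of EASY-style backfilling, not the algorithm of PBSPro or any other production system; the `Job` type, the `fifo_backfill` function, and the use of user-estimated walltimes are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int       # number of nodes requested
    walltime: float  # user-estimated runtime (assumed to be known)


def fifo_backfill(queue, running, free_nodes, now=0.0):
    """One scheduling pass; returns names of jobs started at time `now`.

    queue:   waiting jobs in FIFO order
    running: list of (end_time, nodes) pairs for jobs already running
    """
    queue = list(queue)
    running = list(running)
    started = []

    # Step 1: plain FIFO. Start jobs from the head of the queue while they fit.
    while queue and queue[0].nodes <= free_nodes:
        job = queue.pop(0)
        free_nodes -= job.nodes
        running.append((now + job.walltime, job.nodes))
        started.append(job.name)
    if not queue:
        return started

    # Step 2: reservation for the blocked head job. Find the earliest time
    # (the "shadow time") at which enough nodes will be free for it.
    head = queue[0]
    available = free_nodes
    shadow_time = float("inf")
    for end_time, nodes in sorted(running):
        available += nodes
        if available >= head.nodes:
            shadow_time = end_time
            break
    extra = available - head.nodes  # nodes the head job will not need

    # Step 3: backfill. A later job may start now if it fits and does not
    # delay the head job: it ends before the shadow time, or it uses only
    # nodes that remain spare once the head job starts.
    for job in queue[1:]:
        if job.nodes > free_nodes:
            continue
        ends_in_time = now + job.walltime <= shadow_time
        if ends_in_time or job.nodes <= extra:
            free_nodes -= job.nodes
            if not ends_in_time:
                extra -= job.nodes
            started.append(job.name)
    return started


# Hypothetical example: 4 nodes in total, 2 of them busy until t=10.
waiting = [Job("A", 4, 5.0), Job("B", 2, 8.0), Job("C", 1, 20.0)]
print(fifo_backfill(waiting, running=[(10.0, 2)], free_nodes=2))
```

In this example job A is blocked, job B is backfilled because it ends before A's reservation, and job C is held back. The rules, priorities, and exceptions that production schedulers layer on top of such a pass are what makes them hard to reimplement.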

- No hardware emulation, no paravirtualisation
  - no performance penalty
- Copy On Write filesystem
  - one RO root filesystem, with RW overlay filesystem
- System daemons
  - running only once in hosting environment
Experimental Testbed
- Current workloads (year 2007)
  - January: 4,700 jobs; March: 14,000 jobs; January to March: 70,000 jobs
- 150 Vserver domains
  - one 16-core AMD machine (more physical machines can be used)
  - represents 300 nodes (can be extended)
- COW filesystem
  - 300 MB for one system installation
  - 12 GB used to represent 150 virtual machines
- Virtual machine: only PBS Mom + sshd
- Submission of all jobs without any sleep takes less than 10 minutes
- Reduction factor 600
  - 1 month -> 1.5 hours, 1 year -> ≤ 1 day
  - reasonably small simulation overhead
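The effect of the factor 600 can be checked with a few lines of arithmetic. The sketch below assumes that every real duration (including the sleep time that stands in for a job's runtime) is simply divided by the factor; the function name `simulated_seconds` is hypothetical and no submission or scheduling overhead is modelled.

```python
REDUCTION_FACTOR = 600  # value quoted on the slide

HOUR = 3600
DAY = 24 * HOUR


def simulated_seconds(real_seconds, factor=REDUCTION_FACTOR):
    """Wall-clock time needed to replay `real_seconds` of workload."""
    return real_seconds / factor


month_hours = simulated_seconds(30 * DAY) / HOUR   # 30-day month
year_hours = simulated_seconds(365 * DAY) / HOUR   # 365-day year
two_hour_job = simulated_seconds(2 * HOUR)         # sleep time for a 2 h job

print(f"1 month -> {month_hours:.1f} h")
print(f"1 year  -> {year_hours:.1f} h")
print(f"2 h job -> {two_hour_job:.0f} s of sleep")
```

Pure division gives about 1.2 hours for a 30-day month, somewhat below the 1.5 hours quoted above, and about 14.6 hours for a year, which is consistent with "≤ 1 day".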
Evaluation Criteria
- Standard monitoring during simulation run
  - number of running/waiting/done jobs
  - number of used nodes
- Analysis of accounting data
  - Weighted Response Time (WRT)
  - Weighted SlowDown Time (WSD)
  - Weighted Wait Time (WWT)
  - metrics per user, queue
  - also structured by number of nodes used by job
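The slides name the three weighted metrics without defining them. The sketch below shows one common way to compute such metrics from accounting records, under a loudly stated assumption: each job is weighted by its resource consumption (nodes × runtime). The weighting, the `JobRecord` fields, and the `weighted_metrics` function are illustrative choices, not definitions taken from the presentation.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    submit: float  # submission time
    start: float   # start of execution
    end: float     # end of execution
    nodes: int     # nodes used by the job


def weighted_metrics(records):
    """Return WRT, WSD and WWT, weighting each job by nodes * runtime.

    The weight is an assumption made for this sketch.
    """
    total_weight = wrt = wsd = wwt = 0.0
    for r in records:
        runtime = r.end - r.start
        if runtime <= 0:
            continue  # skip jobs with no measurable runtime
        weight = r.nodes * runtime
        response = r.end - r.submit   # wait time + runtime
        wait = r.start - r.submit
        total_weight += weight
        wrt += weight * response
        wsd += weight * (response / runtime)  # slowdown
        wwt += weight * wait
    if total_weight == 0:
        raise ValueError("no jobs with positive runtime")
    return {
        "WRT": wrt / total_weight,
        "WSD": wsd / total_weight,
        "WWT": wwt / total_weight,
    }


# Hypothetical accounting data: two jobs submitted at t=0.
example = [
    JobRecord(submit=0, start=0, end=10, nodes=1),
    JobRecord(submit=0, start=10, end=20, nodes=2),
]
print(weighted_metrics(example))
```

Grouping the records by user, queue, or node count before calling the function would give the per-user, per-queue, and per-size views listed above.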
Experimental Results for March Workload
Figure: number of running jobs and number of used worker nodes. First simulation with "starvation support", second without.
Figure: number of finished jobs.
Experimental Results for Parallel Jobs I.
Figure panels: number of running jobs; number of finished jobs; number of finished parallel jobs.
Experimental Results for Parallel Jobs II.
Wait time based on number of nodes used by job

Nodes      A      B      C      D      E
    1    166    326    261     99    272
    2     40     44     25    244     28
    4    308    337    229    522    352
    8    339    513    252    425    584
   12    434    580    393    615    853
   16    617    343    524    694    355
   28    817    361    758    884    430
   32    820    676    761    857    747
   40   1052    937   1020    808   1048

A: one queue
B: one queue, starvation support
C: two queues
D: two queues, strict FIFO
E: two queues, starvation
Jobs with Preemption
- Motivation: better support of parallel jobs or priorities
- Two virtual machines running on a physical machine
  - first machine: standard jobs
  - second machine: privileged/parallel jobs
- Magrathea allows
  - several VMs running on a single computer
  - jobs submitted directly to VMs
- When a job is started in the privileged domain, Magrathea
  - suspends the job in the standard domain (if needed)
  - gives almost all CPU/memory resources to the privileged domain (but the standard domain is still running)
- Support in the simulator
  - Magrathea installed on simulated machines too
  - sleep jobs must respect preemption
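The requirement that sleep jobs respect preemption can be illustrated as follows: a plain sleep keeps counting wall-clock time while its domain is suspended, so the stand-in job has to track how much of its runtime is still outstanding. The class below is a hypothetical sketch of that bookkeeping; it is not Magrathea's interface, and the names `PreemptibleSleepJob`, `start`, `suspend`, `resume`, and `finish_time` are invented for this example.

```python
class PreemptibleSleepJob:
    """Stand-in job that 'runs' for `duration` seconds of unsuspended time."""

    def __init__(self, duration):
        self.remaining = float(duration)  # runtime still to be consumed
        self.running_since = None         # None while suspended or not started

    def start(self, now):
        self.running_since = now

    def suspend(self, now):
        """Called when the standard domain is preempted."""
        if self.running_since is not None:
            elapsed = now - self.running_since
            self.remaining = max(0.0, self.remaining - elapsed)
            self.running_since = None

    def resume(self, now):
        """Called when the privileged job ends and resources are returned."""
        if self.running_since is None:
            self.running_since = now

    def finish_time(self):
        """Expected completion time, or None while the job is suspended."""
        if self.running_since is None:
            return None
        return self.running_since + self.remaining


# Hypothetical timeline: a 100 s job is preempted from t=30 to t=50.
job = PreemptibleSleepJob(100)
job.start(0)
job.suspend(30)
job.resume(50)
print(job.finish_time())
```

The suspended interval is not counted as runtime, so the job finishes 20 seconds later than it would have without preemption.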
Conclusion & Future Work
- New Grid simulator
  - Inclusion of production resource management system
  - New experiments: PBSPro (and other algorithms)
  - Novel proposal with virtual machines
  - New experiments: scheduling with Magrathea
- Future work
  - Study: limits of the simulator
    - efficient scheduling algorithms needed
    - cannot use actual load on machines
    - monitoring issues