Providing Scientific Software as a Service in Consideration of Service Level Agreements
Oliver Niehörster(1), André Brinkmann(1), Georg Birkenheuer(1), Sonja Herres-Pawlis(2), Julia Niehörster(3), Jens Krüger(2), Brigitta Elsässer(2), Lars Packschies(4)
(1) Paderborn Center for Parallel Computing, Universität Paderborn, Germany
(2) Department Chemie, Universität Paderborn, Germany
(3) Department Agricultural Sciences, Universität Hohenheim, Germany
(4) Universität zu Köln, Germany
27.06.22
Scientific SaaS
• Provider advantages
  – Horizontal & vertical scaling of virtual machines
  – Better resource utilization
  – Snapshot & live migration
  – High availability, possible error recovery
[Figure: Job & SLA submitted to the Cloud]
Service Stack
• Today: concentration on business applications
  – Online word processors, content management, customer relationship management, human resource management
• Our contribution: support for scientific applications
  – Gaussian, Gromacs, MoE, NWChem, …
Agenda
• Motivation
• Architecture
• Scientific SaaS Survey
• Results
Architecture Stack
• Scientific SaaS
  – Optimal provisioning to fulfill SLAs
  – Finding performance functions and dynamic job scheduling
• Cloud/IaaS
  – Eucalyptus implements the EC2 API
  – Consideration of local resource management systems
  – Analysis of different mapping policies (Greedy, RR, Green-IT, ...)
• Virtualization
  – Virtualization API to handle different hypervisors
  – Extensions of libvirt to handle Hyper-V and ESX
• HPC Hardware
  – High-performance interconnects (like InfiniBand)
  – Analysis of VMM-bypass, MR-IOV, SR-IOV
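The mapping policies named for the Cloud/IaaS layer can be illustrated with a small sketch. The slides do not define the policies, so this assumes that "Greedy" means first-fit packing and "RR" means round robin over the hosts. The `Host` class, the host names, and the core counts are hypothetical and are not taken from Eucalyptus.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cores: int                                # total cores on the physical host
    vms: list = field(default_factory=list)   # (vm_name, cores) pairs placed here

    @property
    def free(self) -> int:
        return self.cores - sum(c for _, c in self.vms)


def place_greedy(hosts, requests):
    """First-fit (assumed meaning of 'Greedy'): use the first host with room."""
    for vm, cores in requests:
        target = next((h for h in hosts if h.free >= cores), None)
        if target is None:
            raise RuntimeError(f"no capacity for {vm}")
        target.vms.append((vm, cores))
    return hosts


def place_round_robin(hosts, requests):
    """Round robin: start the search one host after the previous placement."""
    start = 0
    for vm, cores in requests:
        for offset in range(len(hosts)):
            idx = (start + offset) % len(hosts)
            if hosts[idx].free >= cores:
                hosts[idx].vms.append((vm, cores))
                start = idx + 1
                break
        else:
            raise RuntimeError(f"no capacity for {vm}")
    return hosts


def make_hosts():
    # Hypothetical three-node cluster with 8 cores per node.
    return [Host("node1", 8), Host("node2", 8), Host("node3", 8)]


requests = [("vm-a", 2), ("vm-b", 2), ("vm-c", 2)]
greedy = place_greedy(make_hosts(), requests)
rr = place_round_robin(make_hosts(), requests)
print([len(h.vms) for h in greedy])  # [3, 0, 0]
print([len(h.vms) for h in rr])      # [1, 1, 1]
```

Under these assumptions, first-fit consolidates the virtual machines on one host, which leaves the other hosts idle, while round robin spreads the load evenly.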
Is this approach applicable to scientific applications?
How do scientific applications behave?
Aim of the survey
• Can classical SLAs be used to provide information about scientific applications?
  – Can we estimate the temporal behaviour of the applications?
  – Can we support disaster recovery?
• Can we use virtualisation capabilities to increase scheduling efficiency?
  – Resizing of virtual machines?
  – Changing the number of nodes?
Application Parallelism
• How does the application scale (max. CPUs, etc.)?
  – Parallelism is the user's decision, up to 1024 CPUs
• How does parallelism behave over time?
  – No change in parallelism
• Is it possible to add nodes online?
  – No!
• Is the resource demand time dependent?
  – Application dependent: constant or increasing over time
• High internode or I/O communication?
  – All are compute intensive, most are I/O intensive
Application Input
• Are interactive user inputs possible during the calculation?
  – No
• Is all input available initially, or are there workflow dependencies?
  – Available initially
• How much data does the application use?
  – Several MB to several GB
Application Progress Indication
• Does the application provide progress information?
  – Progress is often not available
  – If available:
    • Gromacs: reliable after 1000 iterations
    • R: not linear
    • PlabSoft, SAS, ASRemL: unreliable
• Estimate the end time from the history of earlier runs
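The two estimation ideas on this slide can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the history is a list of runtimes of comparable past jobs and uses their median, and it assumes progress is linear in the step count once a warm-up of 1000 steps has passed (the Gromacs figure above). The function names and sample numbers are hypothetical.

```python
from statistics import median


def estimate_total_runtime(history_seconds):
    """Estimate a job's total runtime as the median of comparable past runs."""
    if not history_seconds:
        raise ValueError("no history available")
    return median(history_seconds)


def extrapolate_remaining(elapsed_seconds, steps_done, steps_total, warmup_steps=1000):
    """Linearly extrapolate the remaining time; None while still in warm-up."""
    if steps_done < warmup_steps or steps_done == 0:
        return None  # progress reports are treated as unreliable this early
    return elapsed_seconds * (steps_total - steps_done) / steps_done


history = [3600.0, 4200.0, 3900.0]               # made-up runtimes in seconds
print(estimate_total_runtime(history))            # 3900.0
print(extrapolate_remaining(120.0, 500, 10000))   # None
print(extrapolate_remaining(300.0, 2000, 10000))  # 1200.0
```

The linear extrapolation would not suit applications whose progress is reported as non-linear (R on this slide); for those, only the history-based estimate applies.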
Checkpoint and Restart
• Is application-specific checkpointing possible?
  – Unknown for some applications
    • PlabSoft, SAS, ASRemL
  – Manually started
    • Gromacs
  – Automatically started
    • R, Gaussian, NWChem
• Can a checkpoint continue computation with more nodes?
  – If available, then possible
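As a generic illustration of application-level checkpoint and restart, the sketch below periodically saves the state of a toy computation and resumes from the last saved state after an interruption. It does not show how Gromacs, Gaussian, NWChem, or R implement checkpointing, and it runs in a single process, so it says nothing about restarting with more nodes. The `run` function, the `job.ckpt` file name, and the `checkpoint_every` and `stop_after` parameters are hypothetical.

```python
import json
import os
import tempfile


def run(total_steps, ckpt_path, checkpoint_every=100, stop_after=None):
    """Sum 0..total_steps-1 with periodic checkpoints; resume if one exists."""
    state = {"step": 0, "acc": 0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as fh:
            state = json.load(fh)          # restart from the last saved step
    while state["step"] < total_steps:
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            tmp = ckpt_path + ".tmp"
            with open(tmp, "w") as fh:
                json.dump(state, fh)
            os.replace(tmp, ckpt_path)     # rename so readers never see a partial file
        if stop_after is not None and state["step"] >= stop_after:
            return None                    # simulate the job being interrupted
    return state["acc"]


ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt")
first = run(1000, ckpt, stop_after=250)    # interrupted; last checkpoint is at step 200
result = run(1000, ckpt)                   # resumes from step 200, not from zero
print(first, result)                       # None 499500
```

The work done between the last checkpoint and the interruption (steps 200 to 249 here) is repeated after the restart, which is the usual trade-off when choosing the checkpoint interval.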
Summary & Conclusion
• Summary
  – Scientific applications are batch jobs
  – Halting problem: determining when an application finishes is not possible in every case
  – No online progress indication
  – Virtualisation