AN INGENIOUS APPROACH FOR IMPROVING TURNAROUND TIME OF GRID JOBS WITH RESOURCE ASSURANCE AND ALLOCATION MECHANISM
Shikha Mehrotra, Centre for Development of Advanced Computing (CDAC), Bangalore, India
IEEE HPEC'12, 10-12 September 2012
Outline
• Indian National Grid GARUDA
• Need for reservation in grid
• Approach followed in realizing reservation in the Garuda grid
  – Architecture
  – Features
• Performance analysis
  – Job flow in Garuda grid
  – Performance metrics
  – Turnaround time of grid jobs
  – Case study
    • Turnaround time without reservation
    • Turnaround time with reservation
    • Data analysis
    • Results
• Conclusion
Grid Computing
• Distributed Computing taken to the next level
• Aggregation of resources from many participants (generally geographically distributed)
  – Compute resources
  – Data resources
  – Special instruments (telescopes, microscopes, and so on)
• Unified, seamless access to these resources
  – Analogous to the “Power Grid”
India’s National Grid Computing Initiative: GARUDA
Motivation
• To collaborate on research and engineering of technologies, architectures, standards and applications in grid computing
• To contribute to the aggregation of resources in the grid
Production infrastructure with
• Gigabit networking backbone (NKN)
• Large HPC computing resources
• Massive storage
• Tools and services for unified access
Currently connects more than 60 institutions
• Academic and research labs
• Spans 17 cities of India
• Supports 10 Virtual Organizations
  – Bioinformatics, seismic engineering, climate modeling, drug discovery, ...
Problem Statement
• As demand for resources grows, managing jobs and allocating resources to them becomes increasingly difficult. As a result, most jobs sit in the queued state, waiting for a resource to become free.
Our Approach
• Reduce waiting time
• Solution: advance reservation of resources
  – An advance reservation is a reservation that a user or administrator can request and the scheduler can create.
  – It guarantees the availability of resources at a specified future time slot.
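The guarantee above implies that the scheduler accepts a new reservation only when enough capacity remains free for the whole requested slot. The following is a minimal sketch of such an admission check, not the Garuda implementation; the function name `can_reserve`, the tuple layout, and the single shared `capacity` figure are all assumptions made for illustration.

```python
from datetime import datetime


def can_reserve(existing, start, end, count, capacity):
    """Return True if `count` units fit in [start, end) alongside `existing`.

    `existing` is a list of (start, end, count) tuples for reservations that
    were already accepted (hypothetical layout, half-open time intervals).
    """
    if end <= start or count < 1:
        raise ValueError("need end > start and count >= 1")
    # Usage is piecewise constant and can only rise when a reservation
    # starts, so it is enough to check the requested start time plus every
    # existing start time that falls inside the requested window.
    check_points = [start] + [s for s, _, _ in existing if start < s < end]
    for t in check_points:
        in_use = sum(c for s, e, c in existing if s <= t < e)
        if in_use + count > capacity:
            return False
    return True


# Hypothetical example: an 8-node cluster with 6 nodes booked from 09:00 to 11:00.
booked = [(datetime(2012, 9, 10, 9), datetime(2012, 9, 10, 11), 6)]
fits = can_reserve(booked, datetime(2012, 9, 10, 10), datetime(2012, 9, 10, 12), 2, 8)
print(fits)
```

A request that passes this check can be recorded immediately, which is what lets the job start at its slot instead of waiting in the queue.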
Compute Reservation
• An advance reservation is essentially defined by the following:
– A start time, defined using the standard date-time format
– An end time, either defined using the standard date-time format or computed from the start time plus a duration value
– The number and type of resources to be reserved
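The three fields listed above can be sketched as a small record type. This is an illustrative sketch only: the class name `ComputeReservation`, its field names, and the `"cpu_core"` resource type are assumptions, not part of the Garuda reservation API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ComputeReservation:
    """Hypothetical record for an advance reservation request."""
    start: datetime
    resource_type: str
    resource_count: int
    end: Optional[datetime] = None          # give either end ...
    duration: Optional[timedelta] = None    # ... or duration, not both

    def __post_init__(self):
        if (self.end is None) == (self.duration is None):
            raise ValueError("specify exactly one of end or duration")
        if self.end is None:
            # End time computed from the start time plus a duration value.
            self.end = self.start + self.duration
        else:
            self.duration = self.end - self.start
        if self.end <= self.start:
            raise ValueError("end must be later than start")
        if self.resource_count < 1:
            raise ValueError("resource_count must be at least 1")


# Usage: reserve 16 units for two hours; the end time is derived.
request = ComputeReservation(
    start=datetime(2012, 9, 10, 9, 0),
    resource_type="cpu_core",
    resource_count=16,
    duration=timedelta(hours=2),
)
print(request.end.isoformat())
```

Storing both the end time and the duration after validation keeps later overlap checks simple, whichever form the user supplied.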
Garuda Reservation Architecture
Components shown in the architecture diagram:
• Reservation replica DB
• Local resource manager
• Reservation manager and scheduler
• Garuda LRM reservation component
• Garuda middleware reservation component
• Globus middleware
• GridWay meta-scheduler
• Garuda grid-level reservation component
• Reservation DB
• Failover
• API
• Commands
• Applications
Garuda Reservation Features
• Advance and immediate reservation of resources across multiple clusters
  – Ensures resource availability
  – GSI-based reservation: Garuda Reservation
  – Grid reservation failover mechanism
  – Application Programming Interface
  – Intelligent resource allocation based on QoS parameters
  – Virtual Organization support
  – Avoids resource under-utilization
  – Integration with the GridWay meta-scheduler and Globus middleware
Performance Analysis
Performance Metrics
• Mean waiting time
• Execution time
• Turnaround time
Turnaround Time
• Turnaround time: the total time between the submission of a job (a program, process, thread, or, in Linux terminology, task) for execution and the return of its complete output to the user.
[Diagram: turnaround time runs from job submission by the user to the return of the job output to the user]
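Read literally, the definition above is a difference between two timestamps. A minimal sketch, assuming both timestamps are recorded on the same clock; the function and parameter names are hypothetical:

```python
from datetime import datetime, timedelta


def turnaround_time(submitted_at: datetime, output_returned_at: datetime) -> timedelta:
    """Time from job submission to the return of the complete output."""
    if output_returned_at < submitted_at:
        raise ValueError("output cannot be returned before submission")
    return output_returned_at - submitted_at


# Hypothetical timestamps for one job.
elapsed = turnaround_time(
    datetime(2012, 9, 10, 9, 0, 0),
    datetime(2012, 9, 10, 9, 8, 32),
)
print(elapsed)
```

Measured this way, turnaround time covers queue waiting and execution as well as any other delay between submission and delivery of the output.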
Turn-around time without reservation
Job Set     Waiting   Execution   Turnaround
Job Set 1   0:04:00   0:17:16     0:22:02
Job Set 2   0:06:00   0:17:27     0:24:14
Job Set 3   0:44:00   0:18:31     1:02:49
Job Set 4   1:11:00   0:17:27     1:38:42
Job Set 5   1:20:00   0:18:26     1:37:41
Turn-around time with reservation
Job Set     Waiting   Execution   Turnaround
Job Set 1   0:00:09   0:08:03     0:08:32
Job Set 2   0:00:09   0:08:05     0:08:35
Job Set 3   0:00:09   0:08:07     0:08:37
Job Set 4   0:00:09   0:08:05     0:08:37
Job Set 5   0:00:08   0:07:15     0:07:45
Comparison of Turnaround times
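One way to compare the two cases numerically is to average the turnaround columns of the two case-study tables. The sketch below copies those turnaround values verbatim; the helper names and the choice of the arithmetic mean as the summary statistic are assumptions of this example, not something stated in the slides.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def mean_seconds(values) -> float:
    """Arithmetic mean of a list of 'h:mm:ss' strings, in seconds."""
    secs = [to_seconds(v) for v in values]
    return sum(secs) / len(secs)


# Turnaround columns for Job Sets 1 to 5, copied from the two tables.
without_reservation = ["0:22:02", "0:24:14", "1:02:49", "1:38:42", "1:37:41"]
with_reservation = ["0:08:32", "0:08:35", "0:08:37", "0:08:37", "0:07:45"]

mean_without = mean_seconds(without_reservation)
mean_with = mean_seconds(with_reservation)
reduction = 1 - mean_with / mean_without

print(f"mean turnaround without reservation: {mean_without:.1f} s")
print(f"mean turnaround with reservation:    {mean_with:.1f} s")
print(f"relative reduction:                  {reduction:.1%}")
```

For these ten values the mean turnaround falls from roughly 61 minutes to about 8.4 minutes, a reduction of about 86 percent.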
Conclusion
• Guarantees the availability of resources
• Nearly eliminates waiting time (8 to 9 seconds per job set in the case study)
• Reduces turnaround time considerably
• Integrates well with the grid middleware
• Built for the production infrastructure
• The case-study analysis shows encouraging results