MyGrid: A User-Centric Approach for Grid Computing Walfredo Cirne Universidade Federal da Paraíba

Dec 14, 2015


MyGrid: A User-Centric Approach for Grid Computing

Walfredo Cirne Universidade Federal da Paraíba

High-Performance Computing

• High-Performance Computing means running faster than the typical machine du jour

• Unbeatable price/performance of microprocessors has killed specialized high-performance machines

• Therefore, parallelism is currently the way to do High-Performance Computing
  – Parallel supercomputers

Solving a Real Problem

• I had hundreds of thousands of independent simulations to run

• Parallel supercomputers are typically
  – hard to get access to
  – slow (too much time in the queue)

• Since my simulations were independent, I had the perfect application for the Computational Grid

Grid Computing

• Grid Computing aims to enable the execution of parallel applications over processors that are:
  – Geographically distributed
  – Under multiple administrative domains
  – Not dedicated

• The potential for resource gathering is enormous
  – “Let's run over the Internet”

Grid Applications

• Not all applications can benefit from the Grid

• Loosely coupled applications match the Grid characteristics much better than tightly coupled applications

State of the Art in Grid Computing

• Most services are provided by the Grid Infrastructure
  – Naming, remote execution/task control, security, etc.

• Scheduling is done at the application level

• Globus

• “Virtual Organizations”

Back to the Real Problem

• I had hundreds of thousands of independent simulations to run

• I was working in a top research lab in Grid Computing

• I could not manage to use the Grid

• It is hard to get the Grid Infrastructure Software installed everywhere

The Motivation for MyGrid

• Users of loosely coupled applications could benefit from the Grid now

• However, they don't run on the Grid today because the Grid Infrastructure is not widely deployed

• What if we build a solution at the user level? That is, a solution that does not depend upon installed infrastructure?

MyGrid

• MyGrid is a framework to build infrastructure-independent grid applications

• The user provides:
  – A description of her Grid
  – A way to do remote execution and file transfer
  – “The application”

• MyGrid provides:
  – Grid abstractions
  – Scheduling

MyGrid Goals

• open = does not require a particular infrastructure

• self-installable = does not require manual installation on a given machine

• extensible = simple to add refinements

• complete = covers the whole production cycle

MyGrid Concepts

• Job = set of independent tasks
  – Tasks have three pieces: init, remote and final

• Home machine and Grid machine

• Grid abstractions
  – remote execution
  – file transfer
  – playpen
  – mirroring

Defining My Personal Grid

bagre.dsc.ufpb.br
dsc, linux
ssh %machine %command
scp %localdir/%file %machine:%remotedir
scp %machine:%remotedir/%file %localdir

traira.dsc.ufpb.br
dsc, linux
ssh %machine %command
scp %localdir/%file %machine:%remotedir
scp %machine:%remotedir/%file %localdir

quidam.ucsd.edu
cse, linux
ssh %machine %command
scp %localdir/%file %machine:%remotedir
scp %machine:%remotedir/%file %localdir
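
The command templates in the grid description can be read as strings with `%name` placeholders that are filled in before each remote execution or file transfer. The following is a minimal Python sketch of that substitution, not MyGrid's actual code: the `ENTRY` dictionary, its keys (`remote_exec`, `put`, `get`), the `expand` helper, and the `/tmp/playpen` directory are all hypothetical names introduced for illustration.

```python
import re

# Hypothetical in-memory form of one grid-description entry from the slide.
ENTRY = {
    "machine": "bagre.dsc.ufpb.br",
    "remote_exec": "ssh %machine %command",
    "put": "scp %localdir/%file %machine:%remotedir",
    "get": "scp %machine:%remotedir/%file %localdir",
}


def expand(template, **values):
    """Replace every %name placeholder in the template with values[name].

    A placeholder with no matching value raises KeyError, so a typo in the
    grid description is reported instead of being passed to the shell.
    """
    return re.sub(r"%(\w+)", lambda match: values[match.group(1)], template)


# Build the command that would run one remote piece of a task.
run = expand(
    ENTRY["remote_exec"],
    machine=ENTRY["machine"],
    command="java Fat 3 18655 34789789798 output-1",
)

# Build the command that would copy a file to the grid machine.
put = expand(
    ENTRY["put"],
    machine=ENTRY["machine"],
    localdir=".",
    file="Fat.class",
    remotedir="/tmp/playpen",  # assumed playpen location, for illustration only
)

print(run)
print(put)
```

Keeping the templates as plain strings means the user can swap `ssh`/`scp` for any other remote execution and file transfer mechanism without changing the code that uses them.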

Factoring with MyGrid

• Fatora n generates tasks, init, remote_i, and collect

• User runs mygrid.ui.AddTask < tasks

• tasks

task:
init= init
remote= remote1
final= collect
processor= linux
playpensize= 0
cost = 1
task:
init= init
remote= remote2
…
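
To make the structure of the tasks file concrete, here is a small Python sketch that reads such a description into a list of dictionaries. It assumes a line-oriented layout (a `task:` line opening each record, followed by `key= value` lines), which is inferred from the slide rather than taken from a MyGrid specification; `parse_tasks` and `SAMPLE` are hypothetical names.

```python
def parse_tasks(text):
    """Parse a task description into a list of {key: value} dictionaries.

    Assumed layout: a line "task:" starts a new record, and each following
    "key= value" line adds one attribute to the most recent record.
    """
    tasks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line == "task:":
            tasks.append({})
        elif "=" in line and tasks:
            key, value = line.split("=", 1)
            tasks[-1][key.strip()] = value.strip()
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return tasks


SAMPLE = """\
task:
init= init
remote= remote1
final= collect
processor= linux
playpensize= 0
cost = 1
task:
init= init
remote= remote2
"""

tasks = parse_tasks(SAMPLE)
for number, task in enumerate(tasks, start=1):
    print(number, task)
```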

Factoring with MyGrid

• init
  java mygrid.ui.MyGridUI p $PROC ./Fat.class $PLAYPEN

• remote1
  java Fat 3 18655 34789789798 output-$TASK

• remote2
  java Fat 18655 37307 34789789798 output-$TASK

• collect
  java mygrid.ui.MyGridUI g $PROC "" $PLAYPEN saida-$TASK .

Running a MyGrid Task

[Figure: diagram of a Home Machine (Task Manager, User Agent Server, home's tasks) and a Grid Machine (User Agent Daemon, grid's task). Labeled steps: add-task (1); (2); remote exec (3); playpen, file xfer, and remote exec (3a); (3b); (3c); task-done (4).]

User Agent

• User Agent provides the grid abstractions

• User Agent Daemon runs on grid machines

• User Agent Server runs on home machines

• The Daemon and the Server rely upon public-key cryptography to authenticate each other

Self-Installation

• We are working on having MyGrid install and start up User Agents everywhere

• The user provides a way to do remote execution and file transfer to make that possible

Scheduling in MyGrid

• Grid scheduling is application dependent and effort intensive

• Most people don't want to spend months writing good schedulers for their applications

• MyGrid provides a sensible default scheduler
  – The user can of course replace the default scheduler

Default Scheduler

• How to provide good performance with no knowledge about the application or the current state of the Grid?
  – The key is to avoid having the job wait for a task that runs on a slow/loaded machine

• Task replication is our answer to this problem
  – Task replication is only done when the job has no other tasks left to assign
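
The effect of replication can be illustrated with a small simulation. The Python sketch below is not MyGrid's scheduler; it is a toy model under stated assumptions: machine speeds are fixed, a task's run time is its cost divided by the machine's speed, the first copy of a task to finish wins, and the remaining copies are cancelled immediately. The function name `simulate` and the sample numbers are hypothetical.

```python
import heapq


def simulate(task_costs, machine_speeds, replicate=True):
    """Return the job makespan under a simple work-queue scheduler.

    A free machine takes the next unassigned task. If replicate is True and
    no unassigned task is left, the free machine starts a copy of a task that
    is still running elsewhere (fewest copies first).
    """
    pending = list(range(len(task_costs)))[::-1]  # pop() yields task 0 first
    done = set()
    running = {}   # task -> set of machines currently running a copy of it
    events = []    # min-heap of (finish_time, machine, task)
    now = 0.0

    def start(machine, task):
        running.setdefault(task, set()).add(machine)
        finish = now + task_costs[task] / machine_speeds[machine]
        heapq.heappush(events, (finish, machine, task))

    def dispatch(machine):
        if pending:
            start(machine, pending.pop())
        elif replicate and running:
            # Replicate only when the job has no other tasks left to assign.
            task = min(running, key=lambda t: (len(running[t]), t))
            start(machine, task)

    for machine in range(len(machine_speeds)):
        dispatch(machine)

    while events:
        finish, machine, task = heapq.heappop(events)
        if task in done:
            continue  # stale event: another copy already finished this task
        now = finish
        done.add(task)
        # Cancel the other copies and put every freed machine back to work.
        for freed in sorted(running.pop(task)):
            dispatch(freed)
    return now


costs = [10, 10, 10, 10]     # four equal tasks
speeds = [1.0, 1.0, 0.25]    # the third machine is four times slower

without = simulate(costs, speeds, replicate=False)
with_replication = simulate(costs, speeds, replicate=True)
print(without, with_replication)
```

In this example the job without replication waits for the task stuck on the slow machine, while with replication an idle fast machine finishes a copy of that task first. The price is extra resource consumption, since some work is done more than once.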

Preliminary Results

• During a 40-day period, we ran 600,000 simulations using 178 processors located in 6 different administrative domains widely spread in the USA

• MyGrid took 16.7 days to run the simulations

• My desktop machine would have taken 5.3 years to do so

• Speed-up is 115.8 for 178 processors
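
The speed-up figure follows from the two run times above, and dividing it by the processor count gives the parallel efficiency. A short Python check, assuming 365 days per year (the slide does not say how years were converted to days):

```python
DAYS_PER_YEAR = 365  # assumption: conversion factor not stated on the slide

sequential_days = 5.3 * DAYS_PER_YEAR  # estimated time on the desktop machine
grid_days = 16.7                       # time MyGrid took
processors = 178

speedup = sequential_days / grid_days
efficiency = speedup / processors

print(f"speed-up   = {speedup:.1f}")
print(f"efficiency = {efficiency:.2f}")
```

Under that assumption the computed speed-up rounds to the reported 115.8, and the efficiency is about 0.65, that is, each processor contributed on average roughly two thirds of a dedicated desktop machine.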

Conclusions

• Running Grid Applications at the user level is a viable strategy

• Bag-of-tasks parallel applications can currently benefit from the Grid

• Is “upperware” the way to go for new middleware development?

Future Work

• Turn MyGrid into production-quality software

• Investigate the impact of task replication on resource consumption

• Develop a default scheduler for data-intensive applications
  – Such a scheduler should try to minimize data movement