The Gridbus Toolkit for Building and Deploying eScience Applications on Utility Grids Rajkumar Buyya Fellow of Grid Computing Grid Computing and Distributed Systems (GRIDS) Lab. Dept. of Computer Science and Software Engineering The University of Melbourne, Australia www.gridbus.org

Dec 27, 2015


Andrew Henry

The Gridbus Toolkit for Building and Deploying eScience Applications on Utility Grids

Rajkumar Buyya

Fellow of Grid Computing

Grid Computing and Distributed Systems (GRIDS) Lab.
Dept. of Computer Science and Software Engineering
The University of Melbourne, Australia

www.gridbus.org


Outline

Introduction to eScience and Challenges
Introduction to the Gridbus Project
An Overview of Gridbus Components
Grid Service Broker
Architecture Design and Implementation
Scheduling Algorithms
BioGrid Demo OR Performance Evaluation
A Case Study in High Energy Physics
Economy-based Scheduling in Data Grids

Summary


Prominent Grid Drivers: Emerging eScience and eBusiness Apps

Next-generation experiments, simulations, sensors, satellites, and even people and businesses are creating a flood of data. They all involve numerous experts and resources from multiple organizations in synthesis, modeling, simulation, analysis, and interpretation.

Life Sciences: Digital Biology
Finance: Portfolio analysis
Newswire & data mining: Natural language engineering
Astronomy
Internet & Ecommerce
High Energy Physics
Brain Activity Analysis
Quantum Chemistry

Data rate label on the slide: ~PBytes/sec


E-Science Elements

Distributed instruments

Distributed computation

Distributed data

Peers sharing ideas and collaborative interpretation of data/results

E-Scientist

Remote Visualization

Data & Compute Service


Grids have Emerged as Scalable Cyberinfrastructure for e-Science Applications

[Figure: an Application, Grid Resource Brokers, a Grid Information Service, a database, and Grid resources R1 to RN (R1, R2, R3, R4, R5, R6 shown).]


Type of Services Modern Grids Offer

Computational Services (Computational Grid): CPU cycles, e.g., SETI@Home, NASA IPG, TeraGrid, I-Grid, …
Data Services (Data Grid): data replication, management, and secure access, e.g., LHC Grid, Napster
Application Services (ASP Grid): access to remote software/libraries and license management, e.g., NetSolve
Information Services (Information Grid): extraction and presentation of data with meaning
Knowledge Services (Knowledge Grid): the way knowledge is acquired and managed, e.g., data mining
Utility Computing Services (Utility Grid): towards market-based Grid computing; leasing and delivering Grid services as ICT utilities


Grid Challenges

Security

Resource Allocation & Scheduling

Data locality

Network Management

System Management

Resource Discovery

Uniform Access

Computational Economy

Application Construction


Some Grid Initiatives Worldwide

Australia: Nimrod-G, Gridbus, DISCWorld, GrangeNet, APACGrid, ARC eResearch?
Brazil: OurGrid, EasyGrid, LNCC-Grid, + many others
China: ChinaGrid (education), CNGrid (application)
Europe: UK eScience, EU Grids, and many more...
India: I-Grid
Japan: NAREGI
Korea: N*Grid
Singapore: NGP
USA: Globus, NASA IPG, AccessGrid, TeraGrid, Cyberinfrastructure, and many more...
Industry Initiatives: IBM On Demand Computing, HP Adaptive Computing, Sun N1, Microsoft .NET, Oracle 10g, Infosys Business Grid, StorageTek Grid, and many more
Public Forums: Global Grid Forum, Australian Grid Forum
Conferences: CCGrid, Grid, P2P, HPDC

http://www.gridcomputing.com

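Funding callouts on the slide (not matched to individual initiatives here): 1.3 billion over 3 yrs; 1 billion over 5 yrs; 450 million over 5 yrs; 486 million over 5 yrs; 1.3 billion (Rs); 27 million; 2? billion; 120 million over 5 yrs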


The Gridbus Project @ Melbourne: Enable Leasing of ICT Services on Demand

[Figure labels: WWG (World Wide Grid!), On Demand Utility Computing, Gridbus, Distributed Data]


The Gridbus Project: http://www.gridbus.org

A multi-institutional “Open Source” R&D project with a focus on:
Architecture, specification, and open source reference implementation.
Service-oriented Grid, utility computing, and distributed data and computation economy.
Scaling from desktops, clusters, cluster federations, and enterprise Grids to global Grids.

Grid Market Directory and Web Services
Grid Bank: Accounting and Transaction Management
Visual Tools for Creation of Distributed Applications
Workflow Composition and Deployment Services
Data Grid Brokering and Grid Economy Services
Data Replication Strategies
GridSim Toolkit: Enhanced to support Data Grid, Reservation, etc.
Libra: Economic Cluster Scheduler; Coupling of Clusters and Computational Economy
Alchemi: Harnessing .NET/Windows-based Resources
WWG: Global Data Intensive Grid Testbed
Application Enabler Projects: High-Energy Physics, Astronomy, Brain Activity Analysis (Osaka U.), Natural Language Processing, Portfolio Analysis (Spain), BioGrid (WEHI, via APACGrid), SensorGrid (NICTA), Medical Imaging (HFI)

Supported by:


Grid Economy: Methodology for Sustained Resource Sharing and Managing Supply-and-Demand for Resources


New challenges of Grid Economy

Grid Service Providers (GSPs):
How do I decide service pricing models?
How do I specify them?
How do I translate them into resource allocations?
How do I enforce them?
How do I advertise & attract consumers?
How do I do accounting and handle payments?
…

Grid Service Consumers (GSCs):
How do I decide expenses?
How do I express QoS requirements?
How do I trade between timeframe & cost?
How do I map jobs to resources to meet my QoS needs?
…

They need mechanisms and technologies for value expression, value translation, and value enforcement.


GRACE: Service Oriented Grid Architecture

GRid Architecture for Computational Economy (GRACE)


GRACE: A Reference Service-Oriented Grid Architecture for Computational Economies

[Figure: GRACE architecture. Labels shown:]
Grid Consumer: Applications, Programming Environments
Grid Resource Broker: Grid Explorer, Schedule Advisor, Trade Manager, Job Control Agent, Deployment Agent
Grid Middleware Services: Information Service, Health Monitor, Job Exec, Info?, Secure, Trading, QoS, Storage, Sign-on, Grid Market Services, Grid Bank, Data Catalogue
Grid Service Providers (Grid Node 1 to Grid Node N): Trade Server, Pricing Algorithms, Accounting, Resource Reservation, Resource Allocation, Misc. services, resources R1, R2, … Rm


Gridbus and Complementary Grid Technologies – realizing GRACE

[Figure: layered view of Gridbus and complementary technologies. Labels shown, grouped by layer:]
Grid Applications: Portals, Collaboratories, Science, Commerce, Engineering, …
User-Level Middleware (Grid Tools): Grid Brokers (Nimrod-G, Gridbus Data Broker), X-Parameter Sweep Lang., MPI, Workflow, Workflow Engine, ExcelGrid, Gridscape
Core Grid Middleware: Globus, Unicore, Alchemi, NorduGrid, XGrid, …
Grid Fabric Software: AIX, Solaris, Windows, Linux, IRIX, OSF1, Mac, .NET, JVM, Condor, SGE, PBS, Tomcat, Libra
Grid Fabric Hardware: Worldwide Grid
Grid Economy: GridBank, Grid Exchange & Federation, Grid Market Directory, Grid Storage Economy
Other labels: PDB, CDB, GRIDSIM


Gridbus Technologies

Application Construction Tools: Visual Parametric Modeller (VPM)
Grid Economy Services:
Grid Market Directory: a registry for publication of GSPs and their services (VO/VE)
Grid Bank: a Grid accounting service
Grid Trading Services
Data Grid Service Broker: QoS-based scheduling of distributed data-oriented apps on global Grids
Grid Workflow Management System
Gridscape: interactive Grid testbed portal generator
G-monitor: Grid application execution management portal
GridSim: a Grid simulation toolkit
Libra: economy-based cluster scheduling


Alchemi: .NET-based Enterprise Grid Platform & Web Services

[Figure: Alchemi Users, the Alchemi Manager, and Alchemi Worker Agents, linked over the Internet, with Web Services interfaces.]

• SETI@Home-like model
• General purpose
• Dedicated/non-dedicated workers
• Role-based security
• .NET and Web Services
• C# implementation
• GridThread and Job model programming
• Easy to set up and use


On Demand Assembly of Services: Putting Them All Together

[Figure: on-demand assembly of services. Components shown: Data Source (instruments/distributed sources), Data Replicator (GDMP), Data Catalogue, ASP Catalogue, Grid Info Service, Grid Market Directory, Gridbus GridBank (GSP accounting service), Visual Application Composer, Application Code, Grid Resource Broker, and Grid Service Providers (e.g., UofM, VPAC, IBM, CERN), each offering CPUs or PEs through a Grid Service (Globus or Alchemi), a Grid Trade Server (GTS), and a Cluster Scheduler. Arrows are numbered 1 to 12; recoverable arrow labels include Explore, data, Job, Results, Results + Cost Info, and Bill.]


Creation and Operation of Virtual Enterprises

Grid Market Directory

Grid Bank


A Market-Oriented Grid Environment

[Figure, shown in two builds: a Resource Broker (RB), the Grid Market Directory (GMD), a Grid Info. Service (replaced by a Grid Bank Service in the second build), and several Grid Service Providers (GSPs), each running a Grid Trade Server (GTS).]

Messages shown in the figure:
“Solve this in 5 hrs for $20”
“Register me as GSP”
“Give me list of GSPs & price?”
“Service available?” (asked of several GTSs)
(RB selects GSPs)


Grid Market Infrastructure

Grids need to provide an infrastructure that supports:
(a) the creation of one or more GMP registries;
(b) contributors registering themselves as GSPs, along with the resources/application services that they wish to provide;
(c) GSPs publishing themselves in one or more GMPs, along with service prices; and
(d) Grid resource brokers discovering resources/services and their attributes (e.g., access price and usage constraints) that meet user QoS requirements.
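The four requirements above can be pictured with a small in-memory registry. This is only a sketch under stated assumptions: the class names, method names, provider names, and prices are hypothetical and are not the Grid Market Directory's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    provider: str   # GSP name (hypothetical)
    service: str    # service type, e.g. "compute"
    price: float    # access price, illustrative unit


@dataclass
class MarketRegistry:
    """Hypothetical in-memory registry; not the Grid Market Directory API."""
    offers: list = field(default_factory=list)

    def publish(self, provider, service, price):
        # Requirements (b) and (c): a GSP registers and publishes a priced service.
        self.offers.append(Offer(provider, service, price))

    def discover(self, service, max_price):
        # Requirement (d): a broker finds services within the user's price limit.
        matches = [o for o in self.offers
                   if o.service == service and o.price <= max_price]
        return sorted(matches, key=lambda o: o.price)


registry = MarketRegistry()                 # requirement (a): create a registry
registry.publish("GSP-A", "compute", 2)
registry.publish("GSP-B", "compute", 6)
registry.publish("GSP-C", "compute", 4)
affordable = [o.provider for o in registry.discover("compute", max_price=4)]
```

Returning offers sorted by price keeps the broker's later cost-based selection simple; a real directory would also carry usage constraints and other attributes.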


Grid Bank: Grid Transactions Authorization, Accounting, & Payment Infrastructure

[Figure: GridBank interactions. Grid Service Consumer (GSC) side: User, Applications, Grid Resource Broker (GRB), GridBank Payment Module. Grid Service Provider (GSP) side: Grid Trade Server, GridBank Charging Module, Grid Agent, Grid Resource Meter, resources R1 to R4. Also shown: GridBank Server. Labelled flows: Establish Service Costs; Usage Agreement; GridCheque; Deploy Agent and Submit Jobs; Resource Usage; GridCheque + Resource Usage (GSC account charge).]


Grid Applications: Composition and Deployment – A Broker Perspective

Nimrod-G Broker: A Grid Broker for Computational Grids

Gridbus Broker: A Grid Service Broker for Data Grids


Grid Applications and Parametric Computing

Bioinformatics: Drug Design / Protein Modelling
Sensitivity experiments on smog formation
Natural Language Engineering
Ecological Modelling: Control Strategies for Cattle Tick
Electronic CAD: Field Programmable Gate Arrays
Computer Graphics: Ray Tracing
High Energy Physics: Searching for Rare Events
Finance: Investment Risk Analysis
VLSI Design: SPICE Simulations
Aerospace: Wing Design
Network Simulation
Automobile: Crash Simulation
Data Mining
Civil Engineering: Building Design
Astrophysics


Thesis

Build a task farming application (parameter sweep or bag of tasks) and execute it on the Grid within “T” hours or earlier, at a cost not exceeding $M.

Three options/solutions, ranging from manual to automated:
Use pure Globus commands
Build your own distributed application and scheduler
Use the Gridbus Resource Broker to compose and schedule


The Gridbus Grid Service Broker for Data Grid Applications

Builds on the Nimrod-G Computational Grid Broker and Computational Economy [Buyya, Abramson, Giddy, Monash University, 1999-2001] and extends its notion for Data and Service Grids.


Grid Service Broker (GSB)

A resource broker for scheduling task farming data Grid applications with static or dynamic parameter sweeps on global Grids.

It uses the computational economy paradigm for optimal selection of computational and data services depending on their quality, cost, and availability, and on users’ QoS requirements (deadline, budget, and time/cost optimisation).

Key Features:
A single window to manage & control experiment
Programmable Task Farming Engine
Resource Discovery and Resource Trading
Optimal Data Source Discovery
Scheduling & Predictions
Generic Dispatcher & Grid Agents
Transportation of data & sharing of results
Accounting


Gridbus Broker at a Glance

[Figure labels: Home Node/Portal; Gridbus Broker; Globus job manager with fork() and batch() (PBS, Condor, SGE); Alchemi; Unicore gateway with fork() and batch() (PBS, Condor, Alchemi); Gridbus agent; Data Store; Access Technology (GridFTP, SRB); Data Catalog; Credential Repository (MyProxy)]


Gridbus Broker Architecture

[Figure: Gridbus Broker architecture. Gridbus Clients submit bag-of-tasks applications as (App, T, $, Opt). Broker components: Gridbus Farming Engine, Record Keeper, Grid Explorer (GE), Schedule Advisor (Data Grid Scheduler), Trading Manager (TM), Grid Dispatcher. External services: Grid Info Server (GIS, NWS), Data Catalog, Trade Servers (TS), Grid Middleware. Node types in the legend: Globus enabled node, Alchemi enabled node, Unicore enabled node, Data Node. RM: Local Resource Manager, TS: Trade Server.]


Gridbus Services for eScience applications

Application Development Environment:
XML-based language for composition of task farming (legacy) applications as parameter sweep applications.
Task farming APIs for new applications.
Web APIs (e.g., Portlets) for Grid portal development.
Workflow interface and Gridbus-enabled workflow engine.

Resource Allocation and Scheduling:
Dynamic discovery of optimal computational and data nodes that meet user QoS requirements.

Hide low-level Grid middleware interfaces:
Globus, Alchemi, Unicore, NorduGrid, XGrid, etc.


Gridbus Broker: XML file

<parameter name="X" type="integer">
  <domain>
    <range>
      <value from="1" to="10"/>
      <interval type="step">1</interval>
    </range>
  </domain>
</parameter>
<parameter name="Y" type="integer">
  <domain>
    <single>
      <value>1</value>
    </single>
  </domain>
</parameter>
<task>
  <type>main</type>
  <copy>
    <source location="local" file="calc.$OS"/>
    <destination location="node" file="calc"/>
  </copy>
  <execute location="node">
    <command>./calc $X $Y</command>
  </execute>
  <copy>
    <source location="node" file="output"/>
    <destination location="local" file="output.$jobname"/>
  </copy>
</task>
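As a rough illustration of how a parameter sweep like the one above turns into individual jobs, the sketch below enumerates X (1 to 10, step 1) and Y (the single value 1) and substitutes them into the command template. The function name and the job naming scheme (`j1`, `j2`, ...) are assumptions made for this example; this is not the broker's own expansion code.

```python
from itertools import product
from string import Template


def expand_sweep(parameters, command_template):
    """Return one command line per combination of parameter values."""
    names = list(parameters)
    template = Template(command_template)
    commands = []
    combos = product(*(parameters[name] for name in names))
    for index, values in enumerate(combos, start=1):
        bindings = dict(zip(names, values))
        bindings["jobname"] = f"j{index}"  # hypothetical naming scheme
        commands.append(template.substitute(bindings))
    return commands


# X: range 1..10 with step 1; Y: the single value 1, as in the XML above.
parameters = {"X": range(1, 11), "Y": [1]}
jobs = expand_sweep(parameters, "./calc $X $Y")
```

Each entry in `jobs` corresponds to one task instance; with a second ranged parameter the job count would be the product of the two range sizes.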


Portal-based Access to Grid Broker for Launching and Steering Applications

Grid Broker

World-Wide Grid


Figure 3 : Logging into the portal.

Drug Design Made Easy!


Excel Plugin to Access Gridbus Services

Excel

ExcelGrid Add-In

ExcelGrid Runner

ExcelGridJob

ExcelGrid Middleware

Gridbus Broker

Enterprise Grid



Adaptive Scheduling Steps

[Figure: a scheduling loop with these steps:]
Discover Resources
Establish Rates
Compose & Schedule
Distribute Jobs
Evaluate & Reschedule
Meet requirements? Remaining Jobs, Deadline, & Budget?
Discover More Resources
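A toy version of such an adaptive loop is sketched below, assuming time advances in fixed rounds and each resource is described by a hypothetical tuple (name, jobs per round, price per job). It is a simplified illustration of the discover, schedule, evaluate, and rediscover cycle, not the broker's scheduler.

```python
def adaptive_schedule(total_jobs, deadline_rounds, budget, discover):
    """Run a toy adaptive loop; discover(round_no) returns the known resources."""
    remaining, spent, rounds = total_jobs, 0, 0
    resources = discover(rounds)                 # discover resources, establish rates
    while remaining > 0 and rounds < deadline_rounds:
        # Compose and distribute: cheapest resources get work first.
        for _name, rate, price in sorted(resources, key=lambda r: r[2]):
            affordable = (budget - spent) // price
            batch = min(rate, remaining, affordable)
            remaining -= batch
            spent += batch * price
        rounds += 1
        # Evaluate: can the current resources finish the rest before the deadline?
        capacity = sum(r[1] for r in resources) * (deadline_rounds - rounds)
        if remaining > capacity:
            resources = discover(rounds)         # discover more resources
    return remaining, spent, rounds


def discover_more(round_no):
    """Hypothetical discovery: a faster, pricier resource appears after round 0."""
    base = [("a", 2, 1)]
    return base if round_no == 0 else base + [("b", 5, 3)]


result = adaptive_schedule(total_jobs=20, deadline_rounds=4, budget=100,
                           discover=discover_more)
static = adaptive_schedule(20, 4, 100, lambda round_no: [("a", 2, 1)])
```

In this run `result` is `(0, 44, 4)`: all 20 jobs finish in 4 rounds at a cost of 44, while `static` (no extra resources) leaves 12 jobs unfinished.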


Deadline (D) and Budget (B) Constrained Scheduling Algorithms

Algorithm | Execution Time (D) | Execution Cost (B) | Compute Grid | Data Grid
Cost Opt | Limited by D | Minimize | Yes | Yes
Cost-Time Opt | Minimize if possible | Minimize | Yes |
Time Opt | Minimize | Limited by B | Yes | Yes
Conservative-Time Opt | Minimize | Limited by B; jobs have guaranteed minimum budget | Yes |
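To make the difference between the Cost Opt and Time Opt rows concrete, here is a simplified greedy sketch. It assumes each resource runs jobs one at a time and is described by a hypothetical pair (seconds per job, cost per job); the resource names and numbers are invented, and these heuristics only approximate the spirit of the algorithms named in the table.

```python
def cost_opt(n_jobs, deadline, resources):
    """Fill the cheapest resources first, each up to what fits in the deadline."""
    plan, left = {}, n_jobs
    for name, (secs, _cost) in sorted(resources.items(), key=lambda kv: kv[1][1]):
        take = min(left, int(deadline // secs))
        if take:
            plan[name] = take
            left -= take
    if left:
        raise ValueError("deadline cannot be met")
    return plan


def time_opt(n_jobs, budget, resources):
    """Give each job to the earliest-finishing resource it can afford."""
    plan = {name: 0 for name in resources}
    left_budget = budget
    for done in range(n_jobs):
        per_job = left_budget / (n_jobs - done)   # budget share for this job
        candidates = [(secs * (plan[name] + 1), name)
                      for name, (secs, cost) in resources.items()
                      if cost <= per_job]
        if not candidates:
            raise ValueError("budget cannot be met")
        _finish, best = min(candidates)
        plan[best] += 1
        left_budget -= resources[best][1]
    return plan


resources = {"cheap": (60, 2), "fast": (20, 6)}   # hypothetical resources
tight_deadline = cost_opt(10, deadline=300, resources=resources)
loose_deadline = cost_opt(10, deadline=600, resources=resources)
big_budget = time_opt(10, budget=60, resources=resources)
small_budget = time_opt(10, budget=20, resources=resources)
```

With a loose deadline, cost optimisation keeps all work on the cheap resource; with a large budget, time optimisation shifts most jobs to the fast one.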


Sample Applications of Gridbus Broker

Molecular Docking (WEHI): drug discovery
Brain Activity Analysis (Osaka University): neuroscience studies
Natural Language Engineering (Melbourne NLP): indexing of newswire data
High Energy Physics (School of Physics, Melbourne): Belle experiment data analysis
Finance, Portfolio Analysis (U. Comp. Madrid, Spain): investment risk analysis
Astronomy: Australian Virtual Observatory
Spreadsheet Processing: Microsoft Excel


Economy-based Data Grid Scheduling

High Energy Physics as eScience Application Case Study


Case Study: High Energy Physics

What is High Energy Physics (HEP)?
Study of the fundamental constituents of matter and forces.
High Energy Physics: using high energies enables the probing of smaller distances/structures and study in early-universe-like environments.
Particle Physics: quanta of matter/forces and their properties.

The Belle Experiment
KEK B-Factory, Japan.
Investigating fundamental violation of symmetry in nature (Charge Parity), which may help explain the universal matter-antimatter imbalance.
Collaboration: 400 people, 50 institutes.
100s of TB of data currently.


Case Study: Event Simulation and Analysis

B0->D*+D*-Ks

• Simulation and Analysis Package: Belle Analysis Software Framework (BASF)
• Experiment in 2 parts: generation of simulated data and analysis of the distributed data
• Only the analysis is discussed here


Australian Belle Data Grid Testbed

[Figure: Australian Belle Data Grid testbed. Central services: Grid Service Broker, Replica Catalog, NWS NameServer, Certificate Authority. Other labels: Virtual Organization, Analysis Request, Analysis Results, AARNET. Sites: GRIDS Lab, University of Melbourne; Dept. of Physics, University of Melbourne; Dept. of Physics, University of Sydney; ANU, Canberra; Dept. of Computer Science, University of Adelaide. Each site node runs an NWS Sensor, GridFTP, GRIS, and a Globus Gatekeeper. Four nodes are Dual Intel Xeon 2.8 GHz with 2 GB RAM; one is an Intel Pentium 2.0 GHz with 512 MB RAM.]


Case Study: Input File for Analysis

parameter jobf Gridfile lfn:/users/winton/fsimddks/fsimdata*.mdst;
task main
    copy runme.grid2 node:runme.grid2
    node:execute ./runme.grid2 $jobf $jobname
endtask

• A dynamic parameter is defined to describe an input data file.

• The logical file name points to the location in the replica catalog that contains a mapping to where the physical files are present.

100 data files (30MB each) were equally distributed among the five nodes
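The role of the logical file name can be sketched as a lookup: the wildcard pattern is matched against the catalog's logical names, and each match yields one job plus its physical replicas. The catalog contents, replica URLs, and job naming below are hypothetical and only illustrate the idea; they do not reproduce the testbed's replica catalog.

```python
from fnmatch import fnmatchcase

# Hypothetical replica catalog: logical file name -> physical replica URLs.
REPLICA_CATALOG = {
    "/users/winton/fsimddks/fsimdata1.mdst":
        ["gsiftp://belle.anu.edu.au/data/fsimdata1.mdst"],
    "/users/winton/fsimddks/fsimdata2.mdst":
        ["gsiftp://belle.physics.usyd.edu.au/data/fsimdata2.mdst"],
    "/users/winton/other/readme.txt":
        ["gsiftp://belle.cs.mu.oz.au/data/readme.txt"],
}


def expand_gridfile(pattern, catalog):
    """Return (jobname, logical name, replicas) for each matching logical file."""
    if not pattern.startswith("lfn:"):
        raise ValueError("expected a logical file name (lfn:...)")
    glob = pattern[len("lfn:"):]
    matches = sorted(name for name in catalog if fnmatchcase(name, glob))
    return [(f"j{i}", name, catalog[name])
            for i, name in enumerate(matches, start=1)]


jobs = expand_gridfile("lfn:/users/winton/fsimddks/fsimdata*.mdst",
                       REPLICA_CATALOG)
```

Because the parameter is resolved at run time, the number of jobs follows from whatever files the catalog holds when the experiment starts.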


Resources Used and their Service Price

Organization | Node details | Role | Cost (in G$/CPU-sec)
CS, UniMelb | belle.cs.mu.oz.au; 4 CPU, 2 GB RAM, 40 GB HD, Linux | Broker host, Data host, NWS server | N.A. (not used as a compute resource)
Physics, UniMelb | fleagle.ph.unimelb.edu.au; 1 CPU, 512 MB RAM, 40 GB HD, Linux | Replica Catalog host, Data host, Compute resource, NWS sensor | 2
CS, University of Adelaide | belle.cs.adelaide.edu.au; 4 CPU (only 1 available), 2 GB RAM, 40 GB HD, Linux | Data host, NWS sensor | N.A. (not used as a compute resource)
ANU, Canberra | belle.anu.edu.au; 4 CPU, 2 GB RAM, 40 GB HD, Linux | Data host, Compute resource, NWS sensor | 4
Dept of Physics, USyd | belle.physics.usyd.edu.au; 4 CPU (only 1 available), 2 GB RAM, 40 GB HD, Linux | Data host, Compute resource, NWS sensor | 4
VPAC, Melbourne | brecca-2.vpac.org; 180 node cluster (only head node used), Linux | Compute resource, NWS sensor | 6


Network Cost (in Grid $/Currency!)

NETWORK COSTS BETWEEN THE DATA HOSTS AND THE COMPUTE RESOURCES (IN G$ PER MB)

Data Node \ Compute Node | ANU | UniMelb Physics | Sydney Physics | VPAC
ANU | 0 | 34.0 | 31.0 | 38.0
Adelaide CS | 34.0 | 36.0 | 31.0 | 33.0
UniMelb Physics | 40.0 | 0 | 32.0 | 39.0
UniMelb CS | 36.0 | 30.0 | 33.0 | 37.0
Sydney Physics | 35.0 | 33.0 | 0 | 37.0
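One way to read the network cost table is as a lookup for the cheapest compute node to which a file held on a given data node can be shipped. The sketch below encodes the table values as a nested dictionary (rows are data nodes, columns are compute nodes) and uses only their relative ordering; the function name is an assumption made for this example.

```python
# Network cost table: data node -> {compute node: cost in G$ per MB}.
NETWORK_COST = {
    "ANU": {"ANU": 0.0, "UniMelb Physics": 34.0,
            "Sydney Physics": 31.0, "VPAC": 38.0},
    "Adelaide CS": {"ANU": 34.0, "UniMelb Physics": 36.0,
                    "Sydney Physics": 31.0, "VPAC": 33.0},
    "UniMelb Physics": {"ANU": 40.0, "UniMelb Physics": 0.0,
                        "Sydney Physics": 32.0, "VPAC": 39.0},
    "UniMelb CS": {"ANU": 36.0, "UniMelb Physics": 30.0,
                   "Sydney Physics": 33.0, "VPAC": 37.0},
    "Sydney Physics": {"ANU": 35.0, "UniMelb Physics": 33.0,
                       "Sydney Physics": 0.0, "VPAC": 37.0},
}


def cheapest_compute_node(data_node):
    """Return (compute node, cost) with the lowest network cost for this data node."""
    costs = NETWORK_COST[data_node]
    best = min(costs, key=costs.get)
    return best, costs[best]


local = cheapest_compute_node("ANU")
from_adelaide = cheapest_compute_node("Adelaide CS")
from_unimelb_cs = cheapest_compute_node("UniMelb CS")
```

A scheduler would weigh this network cost against the compute price of each node rather than use it alone; files on ANU, UniMelb Physics, and Sydney Physics are free to process locally.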


Deploying Application Scenario

A data grid scenario with 100 jobs, each accessing remote data of ~30 MB.

Deadline: 3 hrs. Budget: G$ 60K.
Scheduling optimisation scenarios: Minimise Time, Minimise Cost.

Results:

SUMMARY OF EVALUATION RESULTS

Scheduling strategy | Total Time Taken (mins.) | Compute Cost (G$) | Data Cost (G$) | Total Cost (G$)
Cost Minimization | 71.07 | 26865 | 7560 | 34425
Time Minimization | 48.5 | 50938 | 7452 | 58390
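The reported figures can be checked against the stated constraints with a few lines of arithmetic. The sketch below uses only the numbers from the summary above (deadline 3 hrs, budget G$ 60K); the variable names are chosen for this example.

```python
DEADLINE_MINS = 3 * 60      # deadline: 3 hrs
BUDGET = 60_000             # budget: G$ 60K

# strategy -> (total time in mins, compute cost, data cost, total cost)
results = {
    "Cost Minimization": (71.07, 26865, 7560, 34425),
    "Time Minimization": (48.5, 50938, 7452, 58390),
}

checks = {}
for strategy, (mins, compute, data, total) in results.items():
    checks[strategy] = {
        "total_is_sum": compute + data == total,
        "within_deadline": mins <= DEADLINE_MINS,
        "within_budget": total <= BUDGET,
    }

# Trade-off between the two strategies.
extra_cost = results["Time Minimization"][3] - results["Cost Minimization"][3]
time_saved = round(results["Cost Minimization"][0]
                   - results["Time Minimization"][0], 2)
```

Both strategies stay within the deadline and budget; time minimization finishes about 22.57 minutes sooner at an extra cost of G$ 23965.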


Time Minimization in Data Grids

[Figure: number of jobs completed (y-axis, 0 to 80) against time in mins. (x-axis, 1 to 42), one series per compute resource: fleagle.ph.unimelb.edu.au, belle.anu.edu.au, belle.physics.usyd.edu.au, brecca-2.vpac.org]


Results : Cost Minimization in Data Grids

[Figure: number of jobs completed (y-axis, 0 to 100) against time in mins. (x-axis, 1 to 63), one series per compute resource: fleagle.ph.unimelb.edu.au, belle.anu.edu.au, belle.physics.usyd.edu.au, brecca-2.vpac.org]


Observation

Organization | Node details | Cost (in G$/CPU-sec) | Total Jobs Executed (Time) | Total Jobs Executed (Cost)
CS, UniMelb | belle.cs.mu.oz.au; 4 CPU, 2 GB RAM, 40 GB HD, Linux | N.A. (not used as a compute resource) | -- | --
Physics, UniMelb | fleagle.ph.unimelb.edu.au; 1 CPU, 512 MB RAM, 40 GB HD, Linux | 2 | 3 | 94
CS, University of Adelaide | belle.cs.adelaide.edu.au; 4 CPU (only 1 available), 2 GB RAM, 40 GB HD, Linux | N.A. (not used as a compute resource) | -- | --
ANU, Canberra | belle.anu.edu.au; 4 CPU, 2 GB RAM, 40 GB HD, Linux | 4 | 2 | 2
Dept of Physics, USyd | belle.physics.usyd.edu.au; 4 CPU (only 1 available), 2 GB RAM, 40 GB HD, Linux | 4 | 72 | 2
VPAC, Melbourne | brecca-2.vpac.org; 180 node cluster (only head node used), Linux | 6 | 23 | 2


Grid Workflow Management System and Broker Services

[Figure: Grid workflow management system. Front ends: Workflow Planner, Application Composition, …, Scientific Portal, which pass a workflow description & QoS to the Workflow Enactment Engine. Engine components: Workflow Submission Handler, Workflow Language Parser (Tasks, Parameters, Dependencies), Workflow Scheduler, Resource Discovery, Dispatcher, Data Movement, Database. External services: GMD, Replica Catalog, Info Service (MDS), Gridbus Broker, Globus, Web services. Data transfer: HTTP, GridFTP.]


The GridSim Toolkit: A Java-based Tool for Grid Scheduling Simulations

[Figure: GridSim layered architecture. Labels shown, from the bottom layer up:]
Distributed Resources: PCs, Workstations, Clusters, SMPs, ...
Virtual Machine (Java, cJVM, RMI)
Basic Discrete Event Simulation Infrastructure: SimJava, Distributed SimJava
GridSim Toolkit: Application Modeling, Resource Entities, Information Services, Job Management, Resource Allocation, Statistics
Resource Modeling and Simulation (with time- and space-shared schedulers): Single CPU, SMPs, Clusters, Load Pattern, Network, Reservation
Grid Resource Brokers or Schedulers' Simulation
Application, User, and Grid Scenario Input and Results: Application Configuration, Resource Configuration, Visual Modeler, Grid Scenario, Output
Callout: Add your own policy for resource allocation


Selected GridSim Users - 2002


Summary and Conclusion

Introduced requirements for an eScience application

Demonstrated suitability of Grid computing as Cyberinfrastructure for eScience and eBusiness.

Grids exploit synergies that result from cooperation of autonomous entities: resource sharing, dynamic provisioning, and aggregation at global level.

Grids allow users to dynamically lease Grid services at runtime based on their quality, cost, availability, and users' QoS requirements: delivering ICT services as computing utilities.

Grids offer enormous opportunities for realizing eScience and eBusiness at global level.


Any Questions ?

http://www.gridbus.org