The Gridbus Toolkit for Building and Deploying eScience Applications on Utility Grids Rajkumar Buyya Fellow of Grid Computing Grid Computing and Distributed Systems (GRIDS) Lab. Dept. of Computer Science and Software Engineering The University of Melbourne, Australia www.gridbus.org
The Gridbus Toolkit for Building and Deploying eScience Applications on Utility Grids
Rajkumar Buyya
Fellow of Grid Computing
Grid Computing and Distributed Systems (GRIDS) Lab., Dept. of Computer Science and Software Engineering, The University of Melbourne, Australia
www.gridbus.org
2
Outline
Introduction to eScience and Challenges
Introduction to the Gridbus Project
An Overview of Gridbus Components
Grid Service Broker
Architecture, Design and Implementation
Scheduling Algorithms
BioGrid Demo OR Performance Evaluation
A Case Study in High Energy Physics
Economy-based Scheduling in Data Grids
Summary
3
Prominent Grid Drivers: Emerging eScience and eBusiness Apps
Next-generation experiments, simulations, sensors, satellites, and even people and businesses are creating a flood of data. They all involve numerous experts and resources from multiple organizations in synthesis, modeling, simulation, analysis, and interpretation.
Life Sciences Digital Biology
Finance: Portfolio analysis
~PBytes/sec
Newswire & data mining: Natural language engineering
Astronomy
Internet & Ecommerce
High Energy Physics Brain Activity Analysis
Quantum Chemistry
4
E-Science Elements
Distributed instruments
Distributed computation
Distributed data
Peers sharing ideas and collaborative interpretation of data/results
E-Scientist
Remote Visualization
Data & Compute Service
5
Grids have Emerged as Scalable Cyberinfrastructure for e-Science Applications
[Figure labels: Application; Resource Broker; Grid Resource Broker; Grid Information Service; database; resources R1, R2, R3, R4, R5, R6, … RN]
6
Type of Services Modern Grids Offer
Computational Services – CPU cycles: SETI@Home, NASA IPG, TeraGrid, I-Grid, …
Data Services: data replication, management, secure access (LHC Grid/Napster)
Application Services: access to remote software/libraries and license management (NetSolve)
Information Services: extraction and presentation of data with meaning
Knowledge Services: the way knowledge is acquired and managed (data mining)
Utility Computing Services: towards market-based Grid computing, leasing and delivering Grid services as ICT utilities
Computational Grid
Data Grid
ASP Grid
Information Grid
Knowledge Grid
Utility Grid
7
Grid Challenges
Security
Resource Allocation & Scheduling
Data locality
Network Management
System Management
Resource Discovery
Uniform Access
Computational Economy
Application Construction
8
Some Grid Initiatives Worldwide
Australia: Nimrod-G, Gridbus, DISCWorld, GrangeNet, APACGrid, ARC eResearch?
Brazil: OurGrid, EasyGrid, LNCC-Grid + many others
China: ChinaGrid (Education), CNGrid (application)
Europe: UK eScience, EU Grids, and many more...
India: I-Grid
Japan: NAREGI
Korea: N*Grid
Singapore: NGP
USA: Globus, NASA IPG, AccessGrid, TeraGrid, Cyberinfrastructure, and many more...
Industry Initiatives: IBM On Demand Computing, HP Adaptive Computing, Sun N1, Microsoft .NET, Oracle 10g, Infosys Business Grid, StorageTek Grid, and many more
Public Forums: Global Grid Forum, Australian Grid Forum
Conferences: CCGrid, Grid, P2P, HPDC
http://www.gridcomputing.com
[Funding figures annotated on the slide: 1.3 billion – 3 yrs; 1 billion – 5 yrs; 450 million – 5 yrs; 486 million – 5 yrs; 1.3 billion (Rs); 27 million; 2? billion; 120 million – 5 yrs]
9
The Gridbus Project @ Melbourne: Enable Leasing of ICT Services on Demand
[Figure labels: WWG; World Wide Grid!; On Demand Utility Computing; Gridbus; Distributed Data]
10
The Gridbus Project: http://www.gridbus.org
A multi-institutional “Open Source” R&D Project with focus on:
• Architecture, Specification, and Open Source Reference Implementation.
• Service-Oriented Grid, Utility Computing & Distributed Data and Computation Economy.
• Scaling from Desktops, Clusters, Cluster Federation, Enterprise Grids to Global Grids.
Grid Market Directory and Web Services
Grid Bank: Accounting and Transaction Management
Visual Tools for Creation of Distributed Applications
Workflow Composition and Deployment Services
Data Grid Brokering and Grid Economy Services
Data Replication Strategies
GridSim Toolkit: Enhanced to support Data Grid, Reservation, etc.
Libra: Economic Cluster Scheduler
Coupling of Clusters and Computational Economy
Alchemi: Harnessing .NET/Windows-based Resources
WWG: Global Data Intensive Grid Testbed
Application Enabler Projects:
Grid Economy: Methodology for Sustained Resource Sharing and Managing Supply-and-Demand for Resources
12
New challenges of Grid Economy
Grid Service Providers (GSPs)
• How do I decide service pricing models?
• How do I specify them?
• How do I translate them into resource allocations?
• How do I enforce them?
• How do I advertise & attract consumers?
• How do I do accounting and handle payments?
• …
Grid Service Consumers (GSCs)
• How do I decide expenses?
• How do I express QoS requirements?
• How do I trade between timeframe & cost?
• How do I map jobs to resources to meet my QoS needs?
• …
They need mechanisms and technologies for value expression, value translation, and value enforcement.
GRACE: Service Oriented Grid Architecture
GRid Architecture for Computational Economy (GRACE)
14
GRACE: A Reference Service-Oriented Grid Architecture for Computational Economies
[Figure labels: Grid Consumer; Applications; Programming Environments; Grid Resource Broker; Grid Explorer; Schedule Advisor; Trade Manager; Job Control Agent; Deployment Agent; Grid Middleware Services; Information Service; Health Monitor; Grid Market Services; JobExec; Info ?; Secure; Trading; QoS; Storage; Sign-on; Grid Bank; Data Catalogue; Grid Service Providers; Grid Node 1 … Grid Node N; Trade Server; Resource Allocation; Resource Reservation; Pricing Algorithms; Accounting; Misc. services; resources R1, R2, … Rm]
15
Gridbus and Complementary Grid Technologies – realizing GRACE
Alchemi: .NET-based Enterprise Grid Platform & Web Services
[Figure labels: Alchemi Users; Alchemi Manager; Alchemi Worker Agent; Web Services; Internet]
• SETI@Home-like model
• General purpose
• Dedicated/non-dedicated workers
• Role-based security
• .NET and Web Services
• C# implementation
• GridThread and Job model programming
• Easy to set up and use
18
On Demand Assembly of Services: Putting Them All Together
[Figure labels: Data Source (instruments/distributed sources); Data Replicator (GDMP); ASP Catalogue; Data Catalogue; Grid Info Service; Grid Market Directory; Gridbus GridBank; GSP (Accounting Service); GSP (e.g., UofM); GSP (e.g., VPAC); GSP (e.g., IBM); Grid Service Provider (GSP) (e.g., CERN); PE; CPU or PE; Cluster Scheduler; Grid Service (GS) (Globus); Alchemi; GTS; Grid Resource Broker; Visual Application Composer; Application Code; Explore; Data; Job; Results; Results + Cost Info; Bill. Interaction steps are numbered 1 to 12 on the slide.]
Creation and Operation of Virtual Enterprises
Grid Market Directory
Grid Bank
20
A Market-Oriented Grid Environment
[Figure labels: Resource Broker (RB); Grid Market Directory (GMD); Grid Info. Service; Grid Bank Service; several GTS (Grid Trade Server) instances, each at a GSP (Grid Service Provider)]
Messages shown in the figure:
“Solve this in 5 hrs for $20”
“register me as GSP”
“Give me list of GSPs & price?”
“service available?”
(RB selects GSPs)
21
Grid Market Infrastructure
Grids need to provide an infrastructure that supports:
(a) the creation of one or more GMP registries;
(b) the contributors to register themselves as GSPs along with their resources/application services that they wish to provide;
(c) GSPs to publish themselves in one or more GMPs along with service prices; and
(d) Grid resource brokers to discover resources/services and their attributes (e.g., access price and usage constraints) that meet user QoS requirements.
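The register, publish, and discover steps in (b) to (d) can be illustrated with a toy in-memory registry. This is only a sketch under stated assumptions: the class and method names (`MarketDirectory`, `register_provider`, `publish`, `discover`), the provider names, and the price unit are hypothetical and are not the Grid Market Directory's actual interface, which the slides do not show.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    provider: str      # name of the GSP offering the service (hypothetical)
    name: str          # service or resource name
    price: float       # access price; the unit is an assumption
    attributes: dict = field(default_factory=dict)


class MarketDirectory:
    """Toy registry; a real directory would be a networked service."""

    def __init__(self):
        self._providers = set()
        self._services = []

    def register_provider(self, provider):
        # (b) a contributor registers itself as a GSP
        self._providers.add(provider)

    def publish(self, service):
        # (c) only registered GSPs may publish services with prices
        if service.provider not in self._providers:
            raise ValueError(f"{service.provider} is not a registered GSP")
        self._services.append(service)

    def discover(self, max_price, **required):
        # (d) a broker filters by price and by required attribute values,
        # cheapest match first
        matches = [
            s for s in self._services
            if s.price <= max_price
            and all(s.attributes.get(k) == v for k, v in required.items())
        ]
        return sorted(matches, key=lambda s: s.price)


directory = MarketDirectory()
directory.register_provider("GSP-A")
directory.register_provider("GSP-B")
directory.publish(Service("GSP-A", "cluster", 4.0, {"middleware": "Globus"}))
directory.publish(Service("GSP-B", "desktop-pool", 1.5, {"middleware": "Alchemi"}))
directory.publish(Service("GSP-B", "cluster", 6.0, {"middleware": "Globus"}))

matches = directory.discover(max_price=5.0, middleware="Globus")
print([(s.provider, s.name, s.price) for s in matches])
```

Keeping discovery as a filter over published attributes means the broker, not the directory, decides which QoS constraints matter for a given request.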
Civil Engineering: Building Design
Astrophysics
25
Thesis
Build a task farming application (parameter sweep or bag of tasks) and execute it on the Grid within “T” hours or earlier, at a cost not exceeding $M.
Manual
Automated
Three Options/Solutions:
• Using pure Globus commands
• Build your own Distributed App & Scheduler
• Use Gridbus Resource Broker to compose and schedule
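To make "parameter sweep" concrete, the sketch below expands a command template into one independent task per combination of parameter values. The function name `expand_sweep`, the `simulate` command, and its parameters are hypothetical illustrations and are not taken from the Gridbus broker or its application description language.

```python
from itertools import product


def expand_sweep(command_template, parameters):
    """Return one command string per combination of parameter values."""
    names = sorted(parameters)  # fixed order so the output is reproducible
    tasks = []
    for values in product(*(parameters[n] for n in names)):
        binding = dict(zip(names, values))
        tasks.append(command_template.format(**binding))
    return tasks


tasks = expand_sweep(
    "simulate --energy {energy} --seed {seed}",  # hypothetical command
    {"energy": [10, 20, 30], "seed": [1, 2]},
)
for t in tasks:
    print(t)
```

Each resulting task is independent of the others, which is what lets a broker farm them out to different Grid resources in any order.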
The Gridbus Grid Service Broker for Data Grid Applications
Builds on the Nimrod-G Computational Grid Broker and Computational Economy [Buyya, Abramson, Giddy, Monash University, 1999-2001]
And
Extends its notion for Data and Service Grids
27
Grid Service Broker (GSB)
A resource broker for scheduling task farming data Grid applications with static or dynamic parameter sweeps on global Grids.
It uses the computational economy paradigm for optimal selection of computational and data services depending on their quality, cost, and availability, and users’ QoS requirements (deadline, budget, & T/C optimisation).
Key Features
• A single window to manage & control experiment
• Programmable Task Farming Engine
• Resource Discovery and Resource Trading
• Optimal Data Source Discovery
• Scheduling & Predictions
• Generic Dispatcher & Grid Agents
• Transportation of data & sharing of results
• Accounting
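The deadline and budget constraints mentioned above can be illustrated with a simple greedy, cost-first allocation. This is a minimal sketch, not the broker's actual scheduling algorithm: it assumes a flat, positive price per job, a constant runtime per job on each resource, and sequential execution on each resource; the names `Resource` and `cost_optimised_plan` and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cost_per_job: float   # assumed flat, positive price per job
    secs_per_job: float   # assumed constant runtime per job


def cost_optimised_plan(n_jobs, resources, deadline_secs, budget):
    """Greedy sketch: fill the cheapest resources first, limited by how many
    jobs each can finish before the deadline and by the remaining budget."""
    plan, remaining, spent = {}, n_jobs, 0.0
    for r in sorted(resources, key=lambda r: r.cost_per_job):
        if remaining == 0:
            break
        capacity = int(deadline_secs // r.secs_per_job)       # jobs done in time
        affordable = int((budget - spent) // r.cost_per_job)  # jobs we can pay for
        take = min(remaining, capacity, affordable)
        if take > 0:
            plan[r.name] = take
            spent += take * r.cost_per_job
            remaining -= take
    return plan, spent, remaining


resources = [
    Resource("fast", cost_per_job=5.0, secs_per_job=100.0),
    Resource("cheap", cost_per_job=1.0, secs_per_job=600.0),
    Resource("mid", cost_per_job=2.0, secs_per_job=300.0),
]
plan, spent, unscheduled = cost_optimised_plan(
    n_jobs=20, resources=resources, deadline_secs=3600.0, budget=40.0
)
print(plan, spent, unscheduled)
```

A non-zero third return value means the request cannot be met under both constraints, in which case the user would have to relax the deadline or raise the budget.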
28
Gridbus Broker at a Glance
[Figure labels: Home Node/Portal; Gridbus Broker; Gateway; Globus; Alchemi; Unicore; Job manager; fork(); batch(); PBS; Condor; SGE; Data Store; Access Technology; GridFTP; SRB; Gridbus agent; Data Catalog; Credential Repository; MyProxy]
29
Gridbus Broker Architecture
[Figure labels: Gridbus Client (three instances); input "App, T, $, Opt"; Gridbus Farming Engine; Schedule Advisor; Trading Manager; Record Keeper; Grid Explorer; Grid Dispatcher; Grid Info Server; Grid Middleware; GE; GIS, NWS; TM; TS; RM & TS; Data Catalog; Data Node; (Data Grid Scheduler); (Bag of Tasks Applications)]
Legend: RM: Local Resource Manager, TS: Trade Server. Node types shown: Globus enabled node, Alchemi enabled node, Unicore enabled node.
30
Gridbus Services for eScience applications
Application Development Environment:
• XML-based language for composition of task farming (legacy) applications as parameter sweep applications.
• Task farming APIs for new applications.
• Web APIs (e.g., Portlets) for Grid portal development.
• Workflow interface and Gridbus-enabled workflow engine.
Resource Allocation and Scheduling
• Dynamic discovery of optimal computational and data nodes that meet user QoS requirements.
Hide Low-Level Grid Middleware interfaces
High Energy Physics as eScience Application Case Study
39
Case Study: High Energy Physics
What is High Energy Physics (HEP)?
• Study of the fundamental constituents of matter and forces.
• High Energy Physics: using high energies enables the probing of smaller distances/structures and study in early-universe-like environments.
• Particle Physics: quanta of matter/forces and their properties.
The Belle Experiment
• KEK B-Factory, Japan
• Investigating fundamental violation of symmetry in nature (Charge Parity), which may help explain the universal matter-antimatter imbalance.
Collaboration
• 400 people, 50 institutes
• 100s of TB of data currently
40
Case Study: Event Simulation and Analysis
B0->D*+D*-Ks
• Simulation and Analysis Package: Belle Analysis Software Framework (BASF)
• Experiment in 2 parts: Generation of Simulated Data and Analysis of the distributed data
• Only the Analysis is discussed here