The Gridbus Middleware: Building and Managing Utility Grids for Powering e-Science and e-Business Applications
Dr. Rajkumar Buyya
Grid Computing and Distributed Systems (GRIDS) Laboratory
Dept. of Computer Science and Software Engineering
The University of Melbourne, Australia
www.gridbus.org
Outline
• Introduction to E-Science
– Collaborative Science & Challenges
• Introduction to Grid Computing
– Defining Grids, Services, Challenges, Middleware Solutions
• Service-Oriented Grid Architecture and Gridbus Solutions
– Market-based Management, GMD, Grid Bank, Alchemi
• Grid Service Broker
– Architecture, Design and Implementation
• Performance Evaluation: Experiments in Creation and Deployment of Applications on Global Grids
– A Case Study in High Energy Physics
• Summary and Conclusion
Big Science Problems & Collaborative Research
Next-generation experiments, simulations, sensors, satellites, even people and businesses are creating a flood of data. They all involve numerous experts and resources from multiple organizations in synthesis, modeling, simulation, analysis, and interpretation.
• Life Sciences: Digital Biology
• Finance: Portfolio analysis
• Newswire & data mining: Natural language engineering
~PBytes/sec
Grid Computing Solution: (1) providing Cyberinfrastructure for e-Science; (2) delivering IT services as the 5th utility
E-Science, E-Business, E-Government, E-Health, E-Education, …
What is a Grid? [Buyya et al.]
A type of parallel and distributed system that enables the sharing, exchange, selection, & aggregation of geographically distributed “autonomous” resources:
• USA: Globus, GridSec, AccessGrid, TeraGrid, Cyberinfrastructure, and many more...
• Industry Initiatives: IBM On Demand Computing, HP Adaptive Computing, Sun N1, Microsoft .NET, Oracle 10g, Infosys Enterprise Grid, Satyam Business Grid, StorageTek Grid, and many more
• Public Forums: Open Grid Forum, Australian Grid Forum
• Conferences:
[Figure: layered Grid architecture, listed from top to bottom]
• Grid applications: Web Portals, Applications
• User-Level Middleware (user level)
– Grid programming environment and tools: Languages, API, libraries, compilers, parallelization tools
– Resource management and scheduling
• Core Middleware (system level)
– Job submission, info services, Storage access, Trading, Accounting, License
– Security Services: Authentication, Single sign-on, secure communication
• Adaptive Management (side label)
Open-Source Grid Middleware Projects
The Gridbus Project @ Melbourne: Enable Leasing of ICT Services on Demand
WWG
Pushes Grid computing into mainstream computing
Gridbus
The Gridbus Project @ GRIDS Lab, The University of Melbourne: Toolkit for Creating and Deploying e-Research Applications on Utility Grids
http://www.gridbus.org
• Gridbus is an “open source” Grid R&D project with a focus on Grid Economy, Utility Grids, and Service-Oriented Computing.
– Grid Bank: Accounting and Transaction Management
– Visual Tools for Creation of Distributed Applications
– Grid Service Broker and Scheduling
– Workflow Management Engine
– GridSim Toolkit
– Libra: SLA-based Resource Allocation
What do Grid players want?
• Grid Consumers
– Execute jobs for solving problems of varying size and complexity
– Benefit by utilizing distributed resources wisely
– Trade off timeframe and cost
– Strategy: minimise expenses
• Grid Providers
– Contribute resources for executing consumer jobs
– Benefit by maximizing resource utilisation
– Trade off local requirements & market opportunity
– Strategy: maximise return on investment
What do Grid players require?
They need tools and technologies that help them in value expression, value translation, and value enforcement.
• Grid Service Consumers (GSCs):
– How do I express QoS requirements?
– How do I trade between timeframe & cost?
– How do I map jobs to resources to meet my QoS needs?
– How do I manage Grid dynamics and get my work done?
– …
• Grid Service Providers (GSPs):
– How do I decide service pricing models?
– How do I specify them?
– How do I translate them into resource allocations?
– How do I enforce them?
– How do I advertise & attract consumers?
– How do I do accounting and handle payments?
– …
Solution 1: Service Oriented Architecture (SOA)
A SOA is a contractual architecture for offering and consuming software as services.
There are three entities that make up an SOA: service provider, service registry, and service consumer (also known as service requestor).
The functions or tasks that the service provider offers, along with other functional and technical information required for consumption, are defined in
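The provider, registry, and consumer roles can be illustrated with a tiny in-memory publish/find/bind loop. This is only a sketch under stated assumptions: `ServiceDescription`, `ServiceRegistry`, the `square` service, and the price attribute are hypothetical names chosen for illustration, not part of Gridbus, the Grid Market Directory, or any Web Services toolkit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """What a provider publishes (all fields are illustrative assumptions)."""
    name: str
    provider: str
    price_per_call: float
    endpoint: Callable[[int], int]


class ServiceRegistry:
    """Registry role: providers publish, consumers look up by service name."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[ServiceDescription]] = {}

    def publish(self, description: ServiceDescription) -> None:
        self._entries.setdefault(description.name, []).append(description)

    def find(self, name: str) -> List[ServiceDescription]:
        return list(self._entries.get(name, []))


# Provider role: two hypothetical providers publish the same service.
registry = ServiceRegistry()
registry.publish(ServiceDescription("square", "provider-a", 2.0, lambda x: x * x))
registry.publish(ServiceDescription("square", "provider-b", 1.5, lambda x: x * x))

# Consumer role: find the offers, pick the cheapest, then bind and invoke it.
offers = registry.find("square")
offer = min(offers, key=lambda d: d.price_per_call)
result = offer.endpoint(7)
print(offer.provider, result)
```

The consumer never hard-codes a provider; it only knows the service name and selects among whatever the registry returns.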
On Demand Assembly of Services: Putting Them All Together
[Figure: interaction steps numbered 1 to 12 among the following components]
• Visual Application Composer, Application Code
• Grid Resource Broker
• ASP Catalogue, Data Catalogue, Grid Info Service, Grid Market Directory
• GSP (Accounting Service): Gridbus GridBank
• GSP (e.g., UofM), GSP (e.g., VPAC), GSP (e.g., IBM), each with a PE (CPU or PE)
• Grid Service (GS): Globus, Alchemi
• GTS, Cluster Scheduler
• Arrow labels: Explore data, Job, Results, Results + Cost Info, Bill
Alchemi: .NET-based Enterprise Grid Platform & Web Services
[Figure labels: Alchemi Users, Alchemi Manager, Alchemi Worker Agents, Internet, Web Services]
• SETI@Home-like model
• General purpose
• Dedicated/non-dedicated workers
• Role-based security
• .NET and Web Services
• C# implementation
• GridThread and Job model programming
• Easy to set up and use
• Widely in use!
Some Users of Alchemi
• Tier Technologies, USA: Large-scale document processing using the Alchemi framework
• CSIRO, Australia: Natural Resource Modeling
• The Friedrich Miescher Institute (FMI) for Biomedical Research, Switzerland: Patterns of transcription factors in mammalian genes
• Satyam Computers Applied Research Laboratory, India: Micro-array data processing using the Alchemi framework
• The University of Sao Paulo, Brazil: The Alchemi Executor as a Windows Service
• stochastix GmbH, Germany: Serving clients in the international banking/finance sector
• Many users in universities: see next for an example.
Students' project gives old computers new life - 1/25/2005
Grid Service Broker (GSB)
A resource broker for scheduling task-farming data Grid applications with static or dynamic parameter sweeps on global Grids.
It uses the computational economy paradigm for optimal selection of computational and data services depending on their quality, cost, and availability, and on users’ QoS requirements (deadline, budget, and T/C optimisation).
Key Features
• A single window to manage & control experiment
• Programmable Task Farming Engine
• Resource Discovery and Resource Trading
• Optimal Data Source Discovery
• Scheduling & Predictions
• Generic Dispatcher & Grid Agents
• Transportation of data & sharing of results
• Accounting
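To make the deadline and budget constraints concrete, here is a minimal greedy sketch of cost-optimised job allocation. It is not the Gridbus broker's actual scheduling algorithm: the `Resource` type, the per-job run-time estimates, and all prices are hypothetical, and the sketch assumes each resource runs its jobs one after another.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    name: str
    price_per_sec: int   # G$ per CPU second (illustrative value)
    secs_per_job: int    # estimated run time of one job (assumption)


def cost_optimised_plan(resources: List[Resource], n_jobs: int,
                        deadline: int, budget: int) -> Optional[Dict[str, int]]:
    """Fill the cheapest resources first; return None if QoS cannot be met."""
    plan: Dict[str, int] = {}
    remaining = n_jobs
    total_cost = 0
    # Order resources by the cost of running a single job on them.
    for r in sorted(resources, key=lambda r: r.price_per_sec * r.secs_per_job):
        if remaining == 0:
            break
        capacity = deadline // r.secs_per_job  # jobs that finish before the deadline
        take = min(capacity, remaining)
        if take == 0:
            continue
        plan[r.name] = take
        total_cost += take * r.price_per_sec * r.secs_per_job
        remaining -= take
    if remaining > 0 or total_cost > budget:
        return None  # deadline or budget cannot be satisfied
    return plan


resources = [
    Resource("cheap", price_per_sec=1, secs_per_job=50),
    Resource("mid", price_per_sec=2, secs_per_job=30),
    Resource("fast", price_per_sec=4, secs_per_job=20),
]
plan = cost_optimised_plan(resources, n_jobs=10, deadline=120, budget=1000)
print(plan)
```

Tightening either constraint makes the function return `None`, which is the point at which a real broker would have to renegotiate the deadline or budget with the user.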
Gridbus Broker Architecture
[Figure: Gridbus Broker architecture. Component labels:]
• Gridbus Clients, with user inputs: App, T, $, Opt
• Gridbus Farming Engine, Schedule Advisor, Trading Manager, Record Keeper, Grid Explorer, Grid Dispatcher
• Annotations: (Bag of Tasks Applications), (Data Grid Scheduler)
• Grid Info Server, Data Catalog, Data Node
• Grid Middleware
• Link labels: GE, GIS, NWS, TM, TS, RM & TS
• Node types: Globus enabled node, Alchemi enabled node, Unicore enabled node
• Legend: RM: Local Resource Manager, TS: Trade Server
Gridbus Broker and Remote Service Access Enablers
[Figure labels:]
• Home Node/Portal: Gridbus Broker, Portlets; fork(), batch() (PBS, Condor, SGE, Alchemi, XGrid)
• Credential Repository: MyProxy
• Data Catalog
• Data Store; Access Technology: GridFTP, SRB
• Globus: Job manager; fork(), batch() (PBS, Condor, SGE); Gridbus agent
• SSH: fork(), batch() (PBS, Condor, SGE, XGrid); Gridbus agent
• Alchemi
• Unicore, Gateway
Gridbus Services for eScience applications
• Application Development Environment:
– XML-based language for composition of task farming (legacy) applications as parameter sweep applications.
– Task Farming APIs for new applications.
– Web APIs (e.g., Portlets) for Grid portal development.
– Threads-based programming interface.
– Workflow interface and Gridbus-enabled workflow engine.
• Resource Allocation and Scheduling:
– Dynamic discovery of optimal computational and data nodes that meet user QoS requirements.
• Hide low-level Grid middleware interfaces:
– Globus (v2, v4), SRB, Alchemi, Unicore, and ssh-based access to local/remote resources managed by XGrid, Condor, SGE.
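The idea of running a legacy program as a parameter sweep can be sketched in a few lines: one command template plus a set of parameter ranges expands into independent jobs. This illustration deliberately does not reproduce the broker's XML-based language; the `expand_sweep` helper, the `./analyse` command, and the parameter names are all hypothetical.

```python
from itertools import product
from string import Template
from typing import Dict, List, Sequence


def expand_sweep(command_template: str,
                 parameters: Dict[str, Sequence]) -> List[str]:
    """Return one command line per combination of parameter values."""
    names = list(parameters)
    jobs = []
    # Cartesian product over all parameter value lists.
    for values in product(*(parameters[n] for n in names)):
        binding = dict(zip(names, values))
        jobs.append(Template(command_template).substitute(binding))
    return jobs


# Hypothetical legacy executable swept over two parameters (3 x 2 = 6 jobs).
jobs = expand_sweep(
    "./analyse --energy $energy --seed $seed",
    {"energy": [10, 20, 30], "seed": [1, 2]},
)
for job in jobs:
    print(job)
```

Each generated command is independent of the others, which is what lets a broker farm the jobs out to whichever resources meet the user's QoS requirements.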
Detection of patterns of transcription factors in mammalian genes
Figure 3: Logging into the portal.
Drug Design Made Easy!
Excel Plugin to Access Gridbus Services
[Figure labels: Excel, ExcelGrid Add-In, ExcelGrid Runner, ExcelGridJob, ExcelGrid Middleware, Gridbus Broker, Enterprise Grid]
Outline
• Introduction to the University of Melbourne, GRIDS Lab, and Opportunities
• Recap of the First Lecture
– What are Grids, Challenges, Middleware Solutions
• Service-Oriented Grid Architecture and Gridbus Solutions
– Market-based Management, GMD, Grid Bank, Alchemi
• Grid Service Broker
– Architecture, Design and Implementation
• Performance Evaluation: Experiments in Creation and Deployment of Applications on Global Grids
– A Case Study in High Energy Physics
• Summary and Conclusion
Case Study: High Energy Physics and Data Grid
• The Belle Experiment
– KEK B-Factory, Japan
– Investigating fundamental violation of symmetry in nature (Charge Parity), which may help explain “why do we have more matter than antimatter in the universe?”, that is, the imbalance of matter and antimatter in the universe.
• Collaboration
– 1000 people, 50 institutes
• 100s of TB of data currently
Case Study: Event Simulation and Analysis
B0 -> D*+ D*- Ks
• Simulation and Analysis Package: Belle Analysis Software Framework (BASF)
• Experiment in 2 parts: generation of simulated data and analysis of the distributed data
Analyzed 100 data files (30 MB each) that were distributed among the five nodes within the Australian Belle Data Grid platform.
Australian Belle Data Grid Testbed
[Figure: testbed layout. Labels:]
• Grid Service Broker, Replica Catalog, NWS NameServer, Certificate Authority, Virtual Organization, AARNET
• Arrow labels: Analysis Request, Analysis Results
• Sites: GRIDS Lab, University of Melbourne; Dept. of Physics, University of Sydney; ANU, Canberra; Dept. of Computer Science, University of Adelaide; Dept. of Physics, University of Melbourne; VPAC, Melbourne
• Services labelled on each resource node: NWS Sensor, GridFTP, GRIS, Globus Gatekeeper
• Node hardware labels: Dual Intel Xeon 2.8 GHz, 2 GB RAM (four nodes); Intel Pentium 2.0 GHz, 512 MB RAM (one node)
Belle Data Grid (GSP CPU Service Price: G$/sec)
[Figure: the Australian Belle Data Grid testbed annotated with CPU service prices. Price labels: NA, G$4, G$4, G$6, G$2. One node is marked "Datanode".]
Belle Data Grid (Bandwidth Price: G$/MB)
[Figure: the Australian Belle Data Grid testbed annotated with CPU service prices (NA, G$4, G$4, G$6, G$2) and bandwidth prices on the network links: 34, 31, 38, 31, 30, 33, 36, 32. One node is marked "Datanode".]
Deploying Application Scenario
A data grid scenario with 100 jobs, each accessing ~30 MB of remote data.
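A rough worked example shows how CPU prices (G$/sec) and bandwidth prices (G$/MB) could combine for this scenario. The additive cost model, the 120-second run time, the node labels, and the pairing of a CPU price with a link price are all assumptions made for illustration; only the job count, the ~30 MB data size, and the individual price values appear in the slides.

```python
N_JOBS = 100          # from the scenario
DATA_MB_PER_JOB = 30  # from the scenario (approximate)
CPU_SECS = 120        # hypothetical run time per job


def job_cost(cpu_secs: int, cpu_price: int, data_mb: int, bandwidth_price: int) -> int:
    """Assumed model: CPU charge (G$/sec) plus data transfer charge (G$/MB)."""
    return cpu_secs * cpu_price + data_mb * bandwidth_price


# Total volume of remote data touched by the whole experiment.
total_data_mb = N_JOBS * DATA_MB_PER_JOB

# Hypothetical candidates: (label, CPU price in G$/sec, link price in G$/MB).
candidates = [("node-a", 4, 31), ("node-b", 2, 38)]

costs = {
    name: job_cost(CPU_SECS, cpu_price, DATA_MB_PER_JOB, link_price)
    for name, cpu_price, link_price in candidates
}
best = min(costs, key=costs.get)
print(total_data_mb, costs, best)
```

Under these assumed numbers the node with the more expensive link is still cheaper overall because its CPU price is lower, which is why a broker has to weigh compute and data transfer costs together.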