Grid Computing and the Globus Toolkit
Jennifer M. Schopf
Argonne National Lab
National eScience Centre
http://www.mcs.anl.gov/~jms/Talks/
The size and/or complexity of the problem requires that people in several organizations collaborate and share computing resources, data, instruments
The Role of the Globus Toolkit
A collection of solutions to problems that come up frequently when building collaborative distributed applications
Heterogeneity
– A focus, in particular, on overcoming heterogeneity for application developers
Standards
– We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF)
– GT also includes reference implementations of new/proposed standards in these organizations
Globus is an Hour Glass
Local sites have their own policies, installs – heterogeneity!
– Queuing systems, monitors, network protocols, etc.
Globus unifies
– Build on Web services
– Use WS-RF, WS-Notification to represent/access state
– Common management abstractions & interfaces
[Figure: hourglass diagram; labels: Higher-Level Services and Users, Standard GT4 Interfaces, Local heterogeneity]
On April 29, 2005 the Globus Alliance released the finest version of the Globus Toolkit to date!
Don’t take our word for it! Read the UK eScience Evaluation of GT4
www.nesc.ac.uk/technical_papers/UKeS-2005-03.pdf (Reachable from www.globus.org, under “News”)
Globus is Grid Infrastructure
Software for Grid infrastructure
– Service enable new & existing resources
– E.g., GRAM on computer, GridFTP on storage system, custom application service
– Uniform abstractions & mechanisms
Tools to build applications that exploit Grid infrastructure
– Registries, security, data management, …
Open source & open standards
– Each empowers the other
Enabler of a rich tool & service ecosystem
Globus is a Building Block
Basic components for grid functionality
Highest-level services are often application specific; we let applications concentrate there
Easier to reuse than to reinvent
– Compatibility with other Grid systems comes for free
We provide basic infrastructure to get you one step closer
Globus is a Tool
A Grid development environment
– Develop new OGSA-compliant Web Services
– Develop applications using Java or C/C++ Grid APIs
– Secure applications using basic security mechanisms
A set of basic Grid functionality
– Services and clients
– Libraries
– Development tools and examples
The prerequisites for many Grid community tools
GT Domain Areas
Core runtime
– Infrastructure for building new services
Security
– Apply uniform policy across distinct systems
Current documentation is significantly more detailed than earlier versions
– http://www.globus.org/toolkit/docs/4.0/
Tutorials available for those of you building a new service
– http://www-unix.globus.org/toolkit/tutorials/BAS/
Globus® Toolkit 4: Programming Java Services (The Morgan Kaufmann Series in Networking), by Borja Sotomayor, Lisa Childers (Available through Amazon, £19.99 or $20)
Verify your prereqs!
Security – check spellings and permissions
Globus is system software – plan accordingly
Now that you’ve done your installation… Let’s talk about what you get!
Globus Toolkit: Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 component diagram (www.globus.org). Domain areas: Security, Data Mgmt, Execution Mgmt, Info Services, Common Runtime. Components shown: Authentication Authorization, Community Authorization, Delegation, Credential Mgmt, GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location, Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol, Index, Trigger, WebMDS, Java Runtime, C Runtime, Python Runtime]
GT4 Web Services Runtime
Supports both GT (GRAM, RFT, Delegation, etc.) & user-developed services
Redesign to enhance scalability, modularity, performance, usability
Leverages existing WS standards
– WS-I Basic Profile: WSDL, SOAP, etc.
– WS-Security, WS-Addressing
Adds support for emerging WS standards
– WS-Resource Framework, WS-Notification
Java, Python, & C hosting environments
– Java is standard Apache
What does Core give you?
Reference implementation of WSRF and WS-N functions
Naming and bindings (basis for virtualization)
– Every resource can be uniquely referenced and has one or more associated services for interacting
Lifecycle (basis for resilient state management)
– Resources created by svcs following a factory pattern
– Resource destroyed immediately or scheduled
Information model (basis for monitoring & discovery)
– Resource properties associated with resources
– Operations for querying and setting this info
– Asynchronous notification of changes to properties
Service groups (basis for registries & collective svcs)
– Group membership rules and membership management
Base fault type
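The naming, lifecycle, and information-model ideas above can be illustrated with a toy in-memory model. This is only a sketch: it is not the GT4 or WSRF API, and every class, method, and key name here (`Resource`, `ResourceFactory`, `sweep`, `resource-1`) is a hypothetical stand-in chosen for illustration.

```python
import itertools
import time


class Resource:
    """A stateful resource with properties, a lifetime, and change subscribers."""

    def __init__(self, key, properties, lifetime_s):
        self.key = key                                  # unique reference (naming)
        self.properties = dict(properties)              # resource properties
        self.termination_time = time.time() + lifetime_s
        self.subscribers = []

    def get_property(self, name):
        return self.properties[name]

    def set_property(self, name, value):
        self.properties[name] = value
        # Notify every subscriber of the change (notification of property changes).
        for callback in self.subscribers:
            callback(self.key, name, value)

    def subscribe(self, callback):
        self.subscribers.append(callback)


class ResourceFactory:
    """Creates resources (factory pattern) and destroys them now or at expiry."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.resources = {}

    def create(self, properties, lifetime_s=60.0):
        key = f"resource-{next(self._ids)}"
        self.resources[key] = Resource(key, properties, lifetime_s)
        return key

    def destroy(self, key):
        # Immediate destruction.
        del self.resources[key]

    def sweep(self, now=None):
        # Scheduled destruction: drop every resource whose lifetime has passed.
        now = time.time() if now is None else now
        expired = [k for k, r in self.resources.items() if r.termination_time <= now]
        for k in expired:
            del self.resources[k]
        return expired


factory = ResourceFactory()
key = factory.create({"status": "idle"}, lifetime_s=30.0)
events = []
factory.resources[key].subscribe(lambda k, n, v: events.append((k, n, v)))
factory.resources[key].set_property("status", "busy")
```

In a real deployment the container would deliver notifications over the network; the synchronous callback here only stands in for that step.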
Apache Axis Web Services Container
Good news for Java WS developers: GT4.0 works with standard Axis* and Tomcat*
– GT provides Axis-loadable libraries, handlers
– Includes useful behaviors such as inspection, notification, lifetime mgmt (WSRF)
– Others implement GRAM, etc.
Major Globus contributions to Apache
– ~50% of WS-Addressing code
– ~15% of WS-Security code
– Many bug fixes
– WSRF code a possible next contribution
* Modulo Axis and Tomcat release cycle issues
[Figure labels: Axis, Security, Addressing, GT bits, App bits]
GT4 Web Services Runtime
[Figure: GT4 container diagram. Labels: User Applications; Custom Web Services; Custom WSRF Web Services; GT4 WSRF Web Services; WS-Addressing, WSRF, WS-Notification; WSDL, SOAP, WS-Security; Registry Administration; GT4 Container]
GetRP Test
Distributed client and service on same LAN (times in milliseconds)
[Figure: bar charts of GetRP times for GT4 - Java, GT4 - C, pyGridWare, WSRF::Lite, and WSRF.NET under three settings: No Security, X509 Signing, HTTPS. Bar labels in extraction order: 10.05, 2.34, 25.57, 17.1, 8.23, 181.96, 14.8, 140.5, 81.39, N/A, 11.46, 2.85, 12.91, 55.6, 149.67]
Globus Security
Control access to shared services
– Address autonomous management, e.g., different policy in different work-groups
Support multi-user collaborations
– Federate through mutually trusted services
– Local policy authorities rule
Allow users and application communities to set up dynamic trust domains
– Personal/VO collection of resources working
A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks
– FTP with well-defined extensions
– Uses basic Grid security (control and data channels)
– Multiple data channels for parallel transfers
– Partial file transfers
– Third-party (direct server-to-server) transfers
– Reusable data channels
– Command pipelining
GGF recommendation GFD.20
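Two of the features above, partial file transfers and multiple data channels, come down to moving independent byte ranges and reassembling them by offset. The sketch below shows that idea only; it does not speak the GridFTP protocol, and `split_ranges` and `reassemble` are hypothetical helper names.

```python
def split_ranges(file_size, channels):
    """Return (offset, length) pairs covering [0, file_size) across the channels."""
    if channels < 1:
        raise ValueError("need at least one channel")
    base, extra = divmod(file_size, channels)
    ranges = []
    offset = 0
    for i in range(channels):
        # The first `extra` channels carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def reassemble(chunks, file_size):
    """Place (offset, data) chunks, arriving in any order, into one buffer."""
    buf = bytearray(file_size)
    for offset, data in chunks:
        buf[offset:offset + len(data)] = data
    return bytes(buf)


payload = bytes(range(10))
ranges = split_ranges(len(payload), 4)
# Simulate channels finishing out of order by reversing the chunk list.
chunks = [(off, payload[off:off + n]) for off, n in reversed(ranges)]
restored = reassemble(chunks, len(payload))
```

Because every chunk carries its own offset, arrival order does not matter, which is what lets several channels run in parallel.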
GridFTP in GT4
100% Globus code
– No licensing issues
– Stable, extensible
IPv6 Support
XIO for different transports
Striping multi-Gb/sec wide area transport
Pluggable
– Front-end: e.g., future WS control channel
– Back-end: e.g., HPSS, cluster file systems
– Transfer: e.g., UDP, NetBLT transport
[Figure: Bandwidth vs. Striping, disk-to-disk on TeraGrid. X axis: Degree of Striping (0 to 70); Y axis: Bandwidth (Mbps, 0 to 20000); one series each for # Stream = 1, 2, 4, 8, 16, 32]
Striped Server
Multiple nodes work together and act as a single GridFTP server
An underlying parallel file system allows all nodes to see the same file system and must deliver good performance (usually the limiting factor in transfer speed)
– I.e., NFS does not cut it
Each node then moves (reads or writes) only the pieces of the file that it is responsible for
This allows multiple levels of parallelism: CPU, bus, NIC, disk, etc.
– Critical if you want to achieve better than 1 Gb/s without breaking the bank
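To make "each node moves only the pieces it is responsible for" concrete, here is a minimal sketch of one possible block layout. Round-robin assignment and the `stripe_layout` name are assumptions for illustration; the slides do not say how a striped server actually partitions a file.

```python
def stripe_layout(file_size, block_size, nodes):
    """Map each node index to the (offset, length) blocks it is responsible for."""
    layout = {n: [] for n in range(nodes)}
    for index, offset in enumerate(range(0, file_size, block_size)):
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        # Assumed policy: hand out blocks to nodes in round-robin order.
        layout[index % nodes].append((offset, length))
    return layout


layout = stripe_layout(file_size=10, block_size=3, nodes=2)
```

Each node can then read or write its own blocks through the shared parallel file system without coordinating on the data itself.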
Striped GridFTP Service
A distributed GridFTP service that runs on a storage cluster
– Every node of the cluster is used to transfer data into/out of the cluster
– Head node coordinates transfers
Multiple NICs/internal busses lead to very high performance
– Maximizes use of Gbit+ WANs
Parallel Transfer: fully utilizes bandwidth of network interface on single nodes
Striped Transfer: fully utilizes bandwidth of Gb+ WAN using multiple nodes
[Figure: transfer diagram; two blocks labeled Parallel Filesystem]
MODE E
SPAS (Listen) - returns list of host:port pairs
STOR <FileName>
MODE E
SPOR (Connect) - connect to the host-port pairs
RETR <FileName>
– Provide your own services, client-side support, and data-related functionality
Monitoring and Discovery System (MDS4)
Grid-level monitoring system
– Aid user/agent to identify host(s) on which to run an application
– Warn on errors
Uses standard interfaces to provide publishing of data, discovery, and data access, including subscription/notification
– WS-ResourceProperties, WS-BaseNotification, WS-ServiceGroup
Functions as an hourglass to provide a common interface to lower-level monitoring tools
[Figure: MDS4 hourglass diagram. Labels: Information Users: Schedulers, Portals, Warning Systems, etc.; Standard Schemas (GLUE schema, e.g.); WS standard interfaces for subscription, registration, notification; Cluster monitors (Ganglia, Hawkeye, Clumon, and Nagios); Services (GRAM, RFT, RLS); Queuing systems (PBS, LSF, Torque)]
MDS4 Components
Information providers
– Monitoring is a part of every WSRF service
– Non-WS services can also be used
Higher level services
– Index Service – a way to aggregate data
– Trigger Service – a way to be notified of changes
– Both built on common aggregator framework
Clients
– WebMDS
All of the tools are schema-agnostic, but interoperability needs a well-understood common language
Information Providers
Data sources for the higher-level services
Some are built into services
– Any WSRF-compliant service publishes some data automatically
– WS-RF gives us standard Query/Subscribe/Notify interfaces
– GT4 services: ServiceMetaDataInfo element includes start time, version, and service type name
– Most of them also publish additional useful information as resource properties
Information Providers (2)
Other sources of data
– Any executables
– Other (non-WS) services
– Interface to another archive or data store
– File scraping
Just need to produce a valid XML document
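Since the only requirement stated above is a valid XML document, an executable information provider can be very small. The sketch below is an illustration, not an MDS4 interface: the element names (`HostInfo`, `Name`, `CpuCount`) and the host name are hypothetical and do not follow any particular schema such as GLUE.

```python
import os
import xml.etree.ElementTree as ET


def host_info_document(hostname, cpu_count):
    """Build a small, well-formed XML document describing one host."""
    root = ET.Element("HostInfo")                      # hypothetical element names
    ET.SubElement(root, "Name").text = hostname
    ET.SubElement(root, "CpuCount").text = str(cpu_count)
    return ET.tostring(root, encoding="unicode")


doc = host_info_document("node01.example.org", os.cpu_count() or 1)
# Parsing the result back confirms that the document is well-formed XML.
parsed = ET.fromstring(doc)
```

A script like this could print `doc` to standard output; how the output is registered with a higher-level service is outside what this sketch covers.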
Information Providers: GT4 Services
Reliable File Transfer Service (RFT)
– Service status data, number of active transfers, transfer status, information about the resource running the service
Community Authorization Service (CAS)
– Identifies the VO served by the service instance
Replica Location Service (RLS)
– Note: not a WS
– Location of replicas on physical storage systems (based on user registrations) for later queries
Information Providers: Cluster and Queue Data
Interfaces to Hawkeye, Ganglia, CluMon, Nagios
– Basic host data (name, ID), processor information, memory size, OS name and version, file system data, processor load data
– Some condor/cluster specific data
– This can also be done for sub-clusters, not just at the host level
Interfaces to PBS, Torque, LSF
– Queue information, number of CPUs available and free, job count information, some memory statistics and host info for head node of cluster
Higher-Level Services
Index Service
– Caching registry
Trigger Service
– Warn on error conditions
Archive Service
– Database store for history (in development)
All of these have common needs, and are built on a common framework
Common Aggregator Framework
Basic framework for higher-level functions
– Subscribe to Information Provider(s)
– Do some action
– Present standard interfaces
Aggregator Framework Features
1) Common configuration mechanism
– Specify what data to get, and from where
2) Self cleaning
– Services have lifetimes that must be refreshed
3) Soft consistency model
– Published information is recent, but not guaranteed to be the absolute latest
4) Schema Neutral
– Valid XML document needed only
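The self-cleaning and soft-consistency features can be sketched as a cache whose registrations expire unless refreshed. This is a toy model under stated assumptions, not the MDS4 aggregator code: the `Aggregator` class, its method names, and the explicit `now` arguments are all hypothetical.

```python
class Aggregator:
    """Caches the last document from each registered provider."""

    def __init__(self):
        # name -> {"fetch": callable, "expires": float, "last": str or None}
        self.entries = {}

    def register(self, name, fetch, lifetime_s, now):
        # Configuration: what data to get (fetch) and for how long.
        self.entries[name] = {"fetch": fetch, "expires": now + lifetime_s, "last": None}

    def refresh(self, name, lifetime_s, now):
        # A registration stays alive only if someone keeps refreshing it.
        self.entries[name]["expires"] = now + lifetime_s

    def poll(self, now):
        # Self cleaning: drop registrations whose lifetime has run out.
        self.entries = {n: e for n, e in self.entries.items() if e["expires"] > now}
        for entry in self.entries.values():
            entry["last"] = entry["fetch"]()

    def query(self, name):
        # Soft consistency: return the value from the last poll, not a live read.
        return self.entries[name]["last"]


agg = Aggregator()
agg.register("a", lambda: "<Load>0.5</Load>", lifetime_s=10, now=0)
agg.register("b", lambda: "<Load>0.9</Load>", lifetime_s=10, now=0)
agg.poll(now=5)
agg.refresh("a", lifetime_s=10, now=5)   # only "a" is refreshed
agg.poll(now=12)                          # "b" expired at t=10 and is dropped
remaining = sorted(agg.entries)
```

The aggregator never inspects the document contents, which mirrors the schema-neutral point: any string that is valid XML would be cached the same way.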
MDS4 Index Service
Index Service is both registry and cache
– Datatype and data provider info, like a registry (UDDI)
– Last value of data, like a cache
In memory default approach
– DB backing store currently being developed to allow for very large indexes
Can be set up for a site or set of sites, a specific set of project data, or for user-specific data only
Can be a multi-rooted hierarchy
– No *global* index
MDS4 Trigger Service
Subscribe to a set of resource properties
Evaluate that data against a set of pre-configured conditions (triggers)
When a condition matches, action occurs
– Email is sent to pre-defined address
– Website updated
Similar functionality in Hawkeye
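The evaluate-then-act loop described for the Trigger Service can be sketched in a few lines. This is an illustration only: the `Trigger` type, the property names (`load`, `free_disk_gb`), and the in-memory `outbox` standing in for email are all hypothetical, not the MDS4 configuration format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Trigger:
    name: str
    condition: Callable[[dict], bool]   # pre-configured condition
    action: Callable[[dict], None]      # what to do when it matches


def evaluate(triggers, properties):
    """Run every trigger against the latest property values; return names that fired."""
    fired = []
    for trigger in triggers:
        if trigger.condition(properties):
            trigger.action(properties)
            fired.append(trigger.name)
    return fired


outbox = []  # stand-in for "email is sent to pre-defined address"
triggers = [
    Trigger("high-load", lambda p: p["load"] > 0.9,
            lambda p: outbox.append(f"load is {p['load']}")),
    Trigger("disk-full", lambda p: p["free_disk_gb"] < 1,
            lambda p: outbox.append("disk nearly full")),
]
fired = evaluate(triggers, {"load": 0.95, "free_disk_gb": 40})
```

In practice the property values would arrive through a subscription rather than being passed in by hand.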
WebMDS User Interface
Web-based interface to WSRF resource property information
User-friendly front-end to Index Service
Uses standard resource property requests to query resource property data
XSLT transforms to format and display them
Customized pages are simply done by using HTML form options and creating your own XSLT transforms
768, 800 Nodes connected via 1Gb/s network
Each data point is average of 8 minutes
– Ran for 10 mins but first 2 spent getting clients up and running
– Error bars are SD over 8 mins
Experiments by Ioan Raicu, U of Chicago, using DiPerf
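The averaging described above (discard the 2 warm-up minutes, then report mean and SD over the remaining 8) can be sketched as follows. The sample values are synthetic placeholders, not measurements from these experiments, and the choice of sample standard deviation over population standard deviation is an assumption, since the slide does not say which was used.

```python
import statistics


def summarize(per_minute, warmup_minutes=2):
    """Drop the warm-up minutes, then return (mean, sample SD) of the rest."""
    steady = per_minute[warmup_minutes:]
    return statistics.mean(steady), statistics.stdev(steady)


# Synthetic per-minute values for a 10-minute run (illustrative only).
samples = [1.0, 2.0, 10.0, 12.0, 11.0, 9.0, 10.0, 12.0, 8.0, 8.0]
mean, sd = summarize(samples)
```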
The Globus Ecosystem
Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc.
– GT4 being the latest version
A larger Globus ecosystem of open source and proprietary components provides complementary components
– A growing list of components
These components can be combined to produce solutions to Grid problems
– We’re building a list of such solutions
Many Tools Build on, or Can Contribute to, GT4-Based Grids
Condor-G, DAGman
MPICH-G2
GRMS
Nimrod-G
Ninf-G
Open Grid Computing Env.
Commodity Grid Toolkit
GriPhyN Virtual Data System
Virtual Data Toolkit
GridXpert Synergy
Platform Globus Toolkit
VOMS
PERMIS
GT4IDE
Sun Grid Engine
PBS scheduler
LSF scheduler
GridBus
TeraGrid CTSS
NEES
IBM Grid Toolbox
…
Global Community
Example Solutions
Portal-based User Reg. System (PURSE)
VO Management Registration Service
Service Monitoring Service
TeraGrid TGCP Tool
Lightweight Data Replicator
GriPhyN Virtual Data System
Condor-G
The Condor Project @ U Wisconsin Madison develops software for high-throughput computing on collections of distributed compute resources
Condor-G is an interface to GRAM created by the Condor team that allows users to submit jobs to GRAM servers
MPICH-G2
MPICH-G2, developed at Northern Illinois University and Argonne National Lab, is a grid-enabled implementation of the MPI v1.1 standard
MPICH-G2 is implemented using the pre-WS GRAM component in GT4; integration with GT4 WS GRAM is expected in the near future
SRB
SRB is a package from SDSC providing a uniform interface for connecting to network-based heterogeneous data resources
GT4’s GridFTP includes an interface to SRB data sources, and vice versa
Tell Us About Your Grid Tools & Solutions
We list links to related projects on the “Related Software” page of the Globus Toolkit web site: www.globus.org/toolkit/tools/
“Solutions” are documented on the Globus web site: www.globus.org/solutions/
If we’ve got details wrong or you have a GT4-related tool to list on our website, please send mail to [email protected]
The Globus Commitment to Open Source
Globus was first established as an open source project in 1996
The Globus Toolkit is open source to:
– allow for inspection
> for consideration in standardization processes
– encourage adoption
> in pursuit of ubiquity and interoperability
– encourage contributions
> harness the expertise of the community
The Globus Toolkit is distributed under the (BSD-style) Apache License version 2
The Future: Structure
NSF Community Driven Improvement of Globus Software (CDIGS) project
– 5 years of funding for GT enhancement
GlobDev http://dev.globus.org
– Globus Development Environment
Why is Globus Software Open Source?
To allow for inspection
– For consideration in standardization processes
To encourage adoption
– In pursuit of ubiquity and interoperability
To encourage contributions
– Harness the expertise of the community
Open Contribution
But distributing code under an open source license does not guarantee open development!
Open development requires open processes
So we have created dev.globus to facilitate contributions
– http://dev.globus.org/
Governance Model
Based on Apache Jakarta
– Individual development efforts organized as projects
– Consensus-based decision making
Control over each project in the hands of its most active and respected contributors (committers)
Globus Management Committee (GMC) providing overall guidance and conflict resolution
Common Infrastructure
Code repositories (CVS, SVN)
Mailing lists
– *-dev, *-user, *-announce, *-commit for every project
Issue tracking (bugzilla)
– Including roadmap info for future development
Wikis
Known interactions for people accessing your project
Sample
http://dev.globus.org/wiki/GRAM
Technology Projects
Common runtime projects
– C Core Utilities, C WS Core, CoG jglobus, Core WS Schema, Java WS Core, XIO
Data projects
– GridFTP, Reliable File Transfer, Replica Location, Data Replication
Security Projects
– C Security, CAS/SAML Utilities, Delegation Service, MyProxy
Non-Technology Projects
Distribution Projects
– Globus Toolkit Distribution
– Process was used for April 4.0.2 release
Documentation Projects
– GT Release Manuals
Incubation Projects
– Incubation management project
– And any new projects wanting to join
Incubator Process in dev.globus
Entry point for new Globus projects
Incubator Management Project (IMP)
– Oversees incubator process from first contact to becoming a Globus project
– Quarterly reviews of current projects
– Process being debugged by “Incubator Pioneers”
http://dev.globus.org/wiki/IncubatorDraft
Incubator Process (1 of 3)
Project proposes itself as a Candidate
– A proposed name for the project;
– A proposed project chair, with contact info;
– A list of the proposed committers for the project;
– An overview of the aims of the project;
– An overview of any current user base or user community, if applicable;
– An overview of how the project relates to other parts of Globus;
– A summary of why the project would enhance and benefit Globus.
Incubator Process (2 of 3)
IMP meet, discuss, and accept project as a ProtoProject
– ProtoProject now part of the Incubator framework
– Get assigned a Mentor to help
> Member of IMP
> Bridge between Globus and new ProtoProject
– Opportunity to get up to speed on Globus Development process
Incubator Process (3 of 3)
Quarterly reviews by IMP determine
– Stay a ProtoProject
– Retire
– Escalate to a full Globus project
Escalation when ProtoProject passes checklist
– Legal
– Meritocracy
– Alignment/Synergy
– Infrastructure
You Can Begin Participating in Globus Development Today!
Monitor and comment on Globus development discussions; recent threads include:
– GT Backward Compatibility ([email protected])