Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks

Mar 27, 2015

Page 1: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Grid Computing and the Globus Toolkit

Jennifer M. Schopf
Argonne National Lab
National eScience Centre
http://www.mcs.anl.gov/~jms/Talks/

Page 2: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

What is a Grid?

Resource sharing
– Computers, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation, payment, …
Coordinated problem solving
– Beyond client-server: distributed data analysis, computation, collaboration, …
Dynamic, multi-institutional virtual orgs
– Community overlays on classic org structures
– Large or small, static or dynamic

Page 3: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Why Is This Hard or Different?

Lack of central control
– Where things run
– When they run
Shared resources
– Contention, variability
Communication
– Different sites imply different sys admins, users, institutional goals, and often “strong personalities”

Page 4: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

So Why Do It?

Computations that need to be done within a time limit
Data that can’t fit on one site
Data owned by multiple sites
Applications that need to be run bigger, faster, more

Page 5: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

What Kinds of Applications?

Computation intensive
– Interactive simulation (climate modeling)
– Large-scale simulation and analysis (galaxy formation, gravity waves, event simulation)
– Engineering (parameter studies, linked models)
Data intensive
– Experimental data analysis (e.g., physics)
– Image & sensor analysis (astronomy, climate)
Distributed collaboration
– Online instrumentation (microscopes, x-ray)
– Remote visualization (climate studies, biology)
– Engineering (large-scale structural testing)

Page 6: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Key Common Feature

The size and/or complexity of the problem requires that people in several organizations collaborate and share computing resources, data, instruments

Page 7: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

The Role of the Globus Toolkit

A collection of solutions to problems that come up frequently when building collaborative distributed applications
Heterogeneity
– A focus, in particular, on overcoming heterogeneity for application developers
Standards
– We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF)
– GT also includes reference implementations of new/proposed standards in these organizations

Page 8: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus is an Hour Glass

Local sites have their own policies, installs – heterogeneity!
– Queuing systems, monitors, network protocols, etc.
Globus unifies
– Build on Web services
– Use WS-RF, WS-Notification to represent/access state
– Common management abstractions & interfaces
[Figure: hourglass diagram. Labels: Higher-Level Services and Users; Standard GT4 Interfaces; Local heterogeneity]

Page 9: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

On April 29, 2005 the Globus Alliance released the finest version of the Globus Toolkit to date!

Don’t take our word for it! Read the UK eScience Evaluation of GT4
www.nesc.ac.uk/technical_papers/UKeS-2005-03.pdf (Reachable from www.globus.org, under “News”)

Page 10: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus is Grid Infrastructure

Software for Grid infrastructure
– Service enable new & existing resources
– E.g., GRAM on computer, GridFTP on storage system, custom application service
– Uniform abstractions & mechanisms
Tools to build applications that exploit Grid infrastructure
– Registries, security, data management, …
Open source & open standards
– Each empowers the other
Enabler of a rich tool & service ecosystem

Page 11: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus is a Building Block

Basic components for grid functionality
Highest-level services are often application specific; we let applications concentrate there
Easier to reuse than to reinvent
– Compatibility with other Grid systems comes for free
We provide basic infrastructure to get you one step closer

Page 12: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus is a Tool

A Grid development environment
– Develop new OGSA-compliant Web Services
– Develop applications using Java or C/C++ Grid APIs
– Secure applications using basic security mechanisms
A set of basic Grid functionality
– Services and clients
– Libraries
– Development tools and examples
The prerequisites for many Grid community tools

Page 13: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT Domain Areas

Core runtime
– Infrastructure for building new services
Security
– Apply uniform policy across distinct systems
Execution management
– Provision, deploy, & manage services
Data management
– Discover, transfer, & access large data
Monitoring
– Discover & monitor dynamic services

Page 14: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component diagram (www.globus.org)]
Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location
Security: Authentication Authorization, Community Authorization, Delegation, Credential Mgmt
Common Runtime: Java Runtime, C Runtime, Python Runtime
Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
Info Services: Index, Trigger, WebMDS

Page 15: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Our Goals for GT4

Usability, reliability, scalability, …
– Web service components have quality equal or superior to pre-WS components
– Documentation at acceptable quality level
Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and Apache platform
– WS-I Basic Profile compliant
– WS-I Basic Security Profile compliant
New components, platforms, languages
– And links to larger Globus ecosystem

Page 16: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

WSRF vs XML/SOAP

The definition of WSRF means that the Grid and Web services communities can move forward on a common base
Why Not Just Use XML/SOAP?
– WSRF and WS-N are just XML and SOAP
– WSRF and WS-N are just Web services
Benefits of following the specs:
– These patterns represent best practices that have been learned in many Grid applications
– There is a community behind them
– Why reinvent the wheel?
– Standards facilitate interoperability

Page 17: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT2 vs GT4

Pre-WS Globus is in GT4 release
– Both WS and pre-WS components (à la 2.4.3) are shipped
– These do NOT interact, but both can run on the same resource independently
Basic functionality is the same
– Run a job
– Transfer a file
– Monitoring
– Security
Code base is completely different

Page 18: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Why Use GT4?

Performance and reliability
– Literally millions of tests and queries run against GT4 services
Scalability
– Many lessons learned from GT2 have been addressed in GT4
Support
– This is our active code base, much more attention
Additional functionality
– New features are here
– Additional GRAM interfaces to schedulers, MDS Trigger service, GridFTP protocol interfaces, etc.
Easier to contribute to

Page 19: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

4.0 is not a typical “.0” release, but the culmination of months of testing

[Figure: release timeline. Legend: CVS trunk; stable release branch; development release; stable release. Releases shown: 3.0.0, 3.0.1, 3.0.2, 3.2.0, 3.2.1, 3.3.0, 3.9.0, 3.9.1, 3.9.2, 3.9.3, 3.9.4, 3.9.5, 4.0.0, 4.0.1, 4.0.2]

Page 20: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Versioning and Support

Versioning
– Evens are production (4.0.x, 4.2.x)
– Odds are development (4.1.x)
We support this version and the one previous
– Currently we’re at 4.0.2 so we support 3.2 and 4.0
– There is also a 4.1.0 development release
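
The even/odd rule above can be sketched in a few lines of Python. This is only an illustration of the convention stated on the slide; the function names and the default tuple of supported series are assumptions made for the example, not part of any Globus tool.

```python
def release_series(version: str) -> str:
    """Return the 'major.minor' series of a version string such as '4.0.2'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def release_kind(version: str) -> str:
    """Even minor numbers are production releases, odd ones are development."""
    minor = int(version.split(".")[1])
    return "production" if minor % 2 == 0 else "development"


def is_supported(version: str, supported_series=("3.2", "4.0")) -> bool:
    """Supported series as stated on the slide: the current one and the one before."""
    return release_series(version) in supported_series


if __name__ == "__main__":
    for v in ("3.2.1", "3.9.5", "4.0.2", "4.1.0"):
        print(v, release_kind(v), "supported" if is_supported(v) else "unsupported")
```

The supported series are passed in as data because they change with every stable release.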

Page 21: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Several Possible Next Versions

4.0.3 – stable release
– 100% same interfaces, bug fixes only
– Perhaps in the fall?
4.1.1 – development release
– New functionality
– Likely 6-10 weeks?
4.2 – stable release
– When 4.1 has “enough” new functionality, and is stable
5.0 – substantial code base change
– With any luck, not for years :)

Page 22: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Testing Overview

Nightly builds and tests
TestGrid at USC/ISI
– Stand up services for several weeks
– Perform stress tests
TestGrid at LBNL
– Focus on WS Core performance and interoperability tests
Performance and reliability testing is a major focus
– Component-specific approaches mostly
Calls for Community Testing near release time – we welcome new testing help!

Page 23: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Tested Platforms

Debian
Fedora Core
FreeBSD
HP/UX
IBM AIX
Red Hat
Sun Solaris
SGI Altix (IA64 running Red Hat)
SuSE Linux
Tru64 Unix
Apple MacOS X (no binaries)
Windows – Java components only
List of binaries and known platform-specific install bugs at http://www.globus.org/toolkit/docs/4.0/admin/docbook/ch03.html

Page 24: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Documentation Overview

Current documentation is significantly more detailed than earlier versions
– http://www.globus.org/toolkit/docs/4.0/
Tutorials available for those of you building a new service
– http://www-unix.globus.org/toolkit/tutorials/BAS/
Globus® Toolkit 4: Programming Java Services (The Morgan Kaufmann Series in Networking), by Borja Sotomayor, Lisa Childers (Available through Amazon, £19.99 or $20)

Page 25: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Grid Packaging Technology (GPT)

Collection of XML-based packaging tools
– Straightforward definition of complex dependency and compatibility relationships between packages
– Way for developers to define the packaging data and include it as part of their source code distribution
– Automatic generation of binary packages
Developer tools
– Convert a source distribution into a GPT package
– Patch-n-build capability similar to RPM spec files so you can retain your own build system if needed
User tools
– Enable collections of packages to be built and/or installed
– Package manager for those systems that don't have one
Developed at NCSA

Page 26: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Installation in a nutshell

Quickstart guide is very useful
– http://www.globus.org/toolkit/docs/4.0/admin/docbook/quickstart.html
Verify your prereqs!
Security – check spellings and permissions
Globus is system software – plan accordingly

Page 27: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Now that you’ve done your installation… Let’s talk about what you get!

Page 28: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component diagram (www.globus.org), repeated as a section divider]

Page 29: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 Web Services Runtime

Supports both GT (GRAM, RFT, Delegation, etc.) & user-developed services
Redesign to enhance scalability, modularity, performance, usability
Leverages existing WS standards
– WS-I Basic Profile: WSDL, SOAP, etc.
– WS-Security, WS-Addressing
Adds support for emerging WS standards
– WS-Resource Framework, WS-Notification
Java, Python, & C hosting environments
– Java is standard Apache

Page 30: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

What does Core give you?

Reference implementation of WSRF and WS-N functions
Naming and bindings (basis for virtualization)
– Every resource can be uniquely referenced and has one or more associated services for interacting
Lifecycle (basis for resilient state management)
– Resources created by svcs following a factory pattern
– Resource destroyed immediately or scheduled
Information model (basis for monitoring & discovery)
– Resource properties associated with resources
– Operations for querying and setting this info
– Asynchronous notification of changes to properties
Service groups (basis for registries & collective svcs)
– Group membership rules and membership management
Base fault type
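
The lifecycle and information-model ideas above can be illustrated with a toy, in-memory Python sketch. It is not the GT4 Core API: the class and method names are hypothetical, notification is delivered synchronously for simplicity (the slide describes asynchronous notification), and time is passed in as a plain number.

```python
import itertools


class Resource:
    """A stateful resource with named properties and change listeners."""

    def __init__(self, key, properties):
        self.key = key
        self.properties = dict(properties)
        self.termination_time = None  # None means no scheduled destruction
        self.listeners = []

    def set_property(self, name, value):
        # Update the property, then notify every subscriber of the change.
        self.properties[name] = value
        for listener in self.listeners:
            listener(self.key, name, value)


class ResourceFactory:
    """Creates uniquely keyed resources and manages their lifetime."""

    def __init__(self):
        self._resources = {}
        self._counter = itertools.count(1)

    def create(self, **properties):
        key = f"resource-{next(self._counter)}"
        self._resources[key] = Resource(key, properties)
        return key

    def lookup(self, key):
        return self._resources[key]

    def destroy(self, key):
        # Immediate destruction.
        del self._resources[key]

    def schedule_destroy(self, key, when):
        # Scheduled destruction: the resource is removed by a later sweep.
        self._resources[key].termination_time = when

    def sweep(self, now):
        expired = [k for k, r in self._resources.items()
                   if r.termination_time is not None and r.termination_time <= now]
        for k in expired:
            self.destroy(k)
        return expired
```

A caller would create a resource, append a callback to its `listeners`, and call `sweep` periodically with the current time to honor scheduled destruction.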

Page 31: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Apache Axis Web Services Container

Good news for Java WS developers: GT4.0 works with standard Axis* and Tomcat*
– GT provides Axis-loadable libraries, handlers
– Includes useful behaviors such as inspection, notification, lifetime mgmt (WSRF)
– Others implement GRAM, etc.
Major Globus contributions to Apache
– ~50% of WS-Addressing code
– ~15% of WS-Security code
– Many bug fixes
– WSRF code a possible next contribution
* Modulo Axis and Tomcat release cycle issues
[Figure labels: Axis; Security; Addressing; GT bits; App bits]

Page 32: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 Web Services Runtime

[Figure: GT4 container diagram. Labels: User Applications; Custom Web Services; Custom WSRF Web Services; GT4 WSRF Web Services; WS-Addressing, WSRF, WS-Notification; WSDL, SOAP, WS-Security; Registry; Administration; GT4 Container]

Page 33: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GetRP Test

Distributed client and service on same LAN (times in milliseconds)
[Figure: bar chart comparing GT4 - Java, GT4 - C, pyGridWare, WSRF::Lite, and WSRF.NET under three security settings: No Security, X509 Signing, HTTPS]
Bar labels (ms), in extraction order: 10.05, 2.34, 25.57, 17.1, 8.23, 181.96, 14.8, 140.5, 81.39, N/A, 11.46, 2.85, 12.91, 55.6, 149.67

Page 34: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component diagram (www.globus.org), repeated as a section divider]

Page 35: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus Security

Control access to shared services
– Address autonomous management, e.g., different policy in different work-groups
Support multi-user collaborations
– Federate through mutually trusted services
– Local policy authorities rule
Allow users and application communities to set up dynamic trust domains
– Personal/VO collection of resources working together based on trust of user/VO

Page 36: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Virtual Organization (VO) Concept

[Figure: two organizations and a virtual community drawn from their people and resources]
Organizations A and B: Compute Server C1, Compute Server C2, Compute Server C3; File server F1 (disks A and B); Person A (Faculty), Person B (Staff), Person C (Student), Person D (Staff), Person E (Faculty), Person F (Faculty)
Virtual Community C: Person A (Principal Investigator), Person B (Administrator), Person D (Researcher), Person E (Researcher); Compute Server C1'; File server F1 (disk A)
VO for each application or workload
Carve out and configure resources for a particular use and set of users

Page 37: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 Security

[Figure: security flow between Users, VO, Services (running on user’s behalf), and Compute Center. Labels: Rights; Rights’; Access; Local policy on VO identity or attribute authority; CAS or VOMS issuing SAML or X.509 ACs; SSL/WS-Security with Proxy Certificates; Authz Callout: SAML, XACML; KCA; MyProxy]

Page 38: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT Authorization Framework

Page 39: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 Security

Public-key-based authentication
Transport- and message-level authentication
Extensible authorization framework based on Web services standards
– SAML-based authorization callout
– Integrated policy decision engine
  > XACML policy language, per-operation policies, pluggable
Credential management service
– MyProxy (One time password support)
Community Authorization Service
Standalone delegation service
Ability to map between Grid and local identity
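
The last point, mapping a Grid identity to a local account, can be sketched as a small lookup table. The slides do not show the file format, so the layout below (a quoted certificate subject followed by comma-separated local account names, in the style of a grid-mapfile) is an assumption, and the sample names are invented for the example.

```python
import shlex


def parse_gridmap(text: str) -> dict:
    """Parse lines of the form: "<certificate subject>" account[,account...]"""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honors the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {raw!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


def local_account(mapping: dict, subject: str) -> str:
    """Return the first local account mapped to a subject (KeyError if unmapped)."""
    return mapping[subject][0]


SAMPLE = '''
# hypothetical entries for illustration only
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=Sam Roe" sroe,shared
'''

if __name__ == "__main__":
    table = parse_gridmap(SAMPLE)
    print(local_account(table, "/O=Grid/OU=Example/CN=Jane Doe"))
```

An unmapped subject raises `KeyError`, which mirrors the idea that local policy decides who gets an account.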

Page 40: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Security Tools

Basic Grid Security Mechanisms
Certificate Generation Tools
Certificate Management Tools
– Getting users “registered” to use a Grid
– Getting Grid credentials to wherever they’re needed in the system
Authorization/Access Control Tools
– Storing and providing access to system-wide authorization information

Page 41: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Other Security Services Include …

MyProxy
– Simplified credential management
– Web portal integration
– Single-sign-on support
KCA & kx.509
– Bridging into/out-of Kerberos domains
SimpleCA
– Online credential generation
PERMIS
– Authorization service callout

Page 42: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

A Cautionary Note

Grid security mechanisms are tedious to set up
– If exposed to users, hand-holding is usually required
– These mechanisms can be hidden entirely from end users, but still used behind the scenes
These mechanisms exist for good reasons.
– Many useful things can be done without Grid security
– It is unlikely that an ambitious project could go into production operation without security like this
– Most successful projects end up using Grid security, but using it in ways that end users don’t see much

Page 43: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component diagram (www.globus.org), repeated as a section divider]

Page 44: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Execution Management (GRAM)

Common WS interface to schedulers
– Unix, Condor, LSF, PBS, SGE, …
More generally: interface for process execution management
– Lay down execution environment
– Stage data
– Monitor & manage lifecycle
– Kill it, clean up
A basis for application-driven provisioning
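
The idea of one interface in front of many schedulers can be sketched as an adapter pattern. The Python below is an illustration only: `SchedulerAdapter` and `ForkAdapter` are hypothetical names, not GRAM classes, and the single concrete adapter simply forks a local process rather than talking to Condor, LSF, PBS, or SGE.

```python
import subprocess
import sys
from abc import ABC, abstractmethod
from typing import Dict, List


class SchedulerAdapter(ABC):
    """Uniform job-control operations, independent of the local scheduler."""

    @abstractmethod
    def submit(self, argv: List[str]) -> str:
        """Start a job and return an opaque job identifier."""

    @abstractmethod
    def wait(self, job_id: str) -> int:
        """Block until the job ends and return its exit code."""

    @abstractmethod
    def cancel(self, job_id: str) -> None:
        """Kill the job."""


class ForkAdapter(SchedulerAdapter):
    """Runs jobs as plain local processes (no queueing at all)."""

    def __init__(self) -> None:
        self._procs: Dict[str, subprocess.Popen] = {}

    def submit(self, argv: List[str]) -> str:
        proc = subprocess.Popen(argv)
        job_id = str(proc.pid)
        self._procs[job_id] = proc
        return job_id

    def wait(self, job_id: str) -> int:
        return self._procs[job_id].wait()

    def cancel(self, job_id: str) -> None:
        self._procs[job_id].terminate()


if __name__ == "__main__":
    adapter: SchedulerAdapter = ForkAdapter()
    job = adapter.submit([sys.executable, "-c", "print('hello from a job')"])
    print("exit code:", adapter.wait(job))
```

Client code depends only on `SchedulerAdapter`, so a different back end could be swapped in without changing callers.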

Page 45: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GRAM - Basic Job Submission and Control Service

A uniform service interface for remote job submission and control
– Includes file staging and I/O management
– Includes reliability features
– Supports basic Grid security mechanisms
– Available in Pre-WS and WS
GRAM is not a scheduler.
– No scheduling
– No metascheduling/brokering
– Often used as a front-end to schedulers, and often used to simplify metaschedulers/brokers

Page 46: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 WS GRAM

2nd-generation WS implementation optimized for performance, flexibility, stability, scalability
Streamlined critical path
– Use only what you need
Flexible credential management
– Credential cache & delegation service
GridFTP & RFT used for data operations
– Data staging & streaming output
– Eliminates redundant GASS code

Page 47: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GRAM

Intended for jobs where arbitrary programs, stateful monitoring, credential management, and file staging are important
If the application is lightweight, with modest input/output, it may be a better candidate for hosting directly as a WSRF service

Page 48: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 WS GRAM Architecture

[Figure: WS GRAM architecture. Service host(s) and compute element(s): GT4 Java Container (GRAM services, Delegation, RFT File Transfer), sudo, GRAM adapter, SEG, local scheduler, user job, GridFTP. Remote storage element(s): GridFTP. Arrows: Client job functions; Delegate; Transfer request; FTP control; FTP data; Local job control; Job events]

Page 49: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 WS GRAM Architecture

[Figure: WS GRAM architecture diagram, as on the previous slide]
Delegated credential can be: made available to the application

Page 50: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 WS GRAM Architecture

[Figure: WS GRAM architecture diagram, as on the previous slide]
Delegated credential can be: used to authenticate with RFT

Page 51: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 WS GRAM Architecture

[Figure: WS GRAM architecture diagram, as on the previous slide]
Delegated credential can be: used to authenticate with GridFTP

Page 52: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Submitting a Sample Job

Specify a remote host with -F

globusrun-ws -submit -F host2 -c /bin/true

The return code will be the job’s exit code if supported by the scheduler

Page 53: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Data Staging and Streaming

Simplest stage-in/stage-out example is stdout/stderr

globusrun-ws -S -s -c /bin/date

-S is short for “-submit”
-s is short for “-streaming”
The output will be sent back to the terminal; control will not return until the job is done

Page 54: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Resource Specification Language

For more complicated jobs, we’ll use RSL to specify the job

<job>
  <executable>/bin/echo</executable>
  <argument>this is an example_string </argument>
  <argument>Globus was here</argument>
  <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
  <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>

Page 55: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Resource Specification Language

<job>
  <executable>/bin/echo</executable>
  <directory>/tmp</directory>
  <argument>12</argument>
  <environment><name>PI</name> <value>3.141</value></environment>
  <stdin>/dev/null</stdin>
  <stdout>stdout</stdout>
  <stderr>stderr</stderr>
</job>
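
A job description like the one above can also be generated programmatically. The Python sketch below uses only the element names that appear on these slides; the helper name `build_job` is hypothetical, and the output is not validated against the GRAM schema, so treat it as an illustration rather than a reference.

```python
import xml.etree.ElementTree as ET


def build_job(executable, arguments=(), directory=None, environment=None,
              stdin=None, stdout=None, stderr=None) -> str:
    """Return an RSL-style <job> document as a string."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    if directory:
        ET.SubElement(job, "directory").text = directory
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    for name, value in (environment or {}).items():
        env = ET.SubElement(job, "environment")
        ET.SubElement(env, "name").text = name
        ET.SubElement(env, "value").text = value
    # Emit the standard streams last, in the order used on the slide.
    for tag, path in (("stdin", stdin), ("stdout", stdout), ("stderr", stderr)):
        if path:
            ET.SubElement(job, tag).text = path
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_job("/bin/echo", ["12"], directory="/tmp",
                    environment={"PI": "3.141"},
                    stdin="/dev/null", stdout="stdout", stderr="stderr"))
```

Building the document with an XML library rather than string concatenation keeps special characters in arguments properly escaped.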

Page 56: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

At Most Once Submission

You may specify a UUID with your job submission

If you’re not sure the submission worked, you may submit the job again with the same UUID

If the job has already been submitted, the new submission will have no effect

If you do not specify a UUID, one will be generated for you
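
The behavior described above is an idempotent submit keyed by a client-chosen identifier. A minimal in-memory Python sketch follows; `JobService` is a hypothetical stand-in for the remote service, not a GRAM class, and it ignores persistence and concurrency.

```python
import uuid
from typing import Optional


class JobService:
    """Toy service that accepts each submission identifier at most once."""

    def __init__(self) -> None:
        self._jobs = {}

    def submit(self, description: str, submission_id: Optional[str] = None) -> str:
        if submission_id is None:
            # No identifier supplied: generate one for the caller.
            submission_id = str(uuid.uuid4())
        if submission_id not in self._jobs:
            # First time this identifier is seen: record the job.
            self._jobs[submission_id] = description
        # A repeated identifier has no further effect.
        return submission_id

    def job_count(self) -> int:
        return len(self._jobs)


if __name__ == "__main__":
    service = JobService()
    my_id = str(uuid.uuid4())
    service.submit("/bin/true", my_id)
    service.submit("/bin/true", my_id)  # retry after an uncertain first attempt
    print(service.job_count())
```

Choosing the identifier on the client side is what makes a retry safe when the first reply is lost.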

Page 57: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Batch Submission

Your client does not have to stay attached to the execution of the job
-batch will disconnect from the job and output an EPR
– You may redirect the EPR to a file with -o
Use the EPR file with -monitor or -status
You may also kill the job using -kill

Page 58: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Specifying Scheduler Options

RSL lets you specify various scheduler options
– what queue to submit to
– which project to select for accounting
– max CPU and wallclock time to spend
– min/max memory required
All defined online under the schema document for GRAM

Page 59: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

WS GRAM Performance

Time to submit a basic GRAM job
– Pre-WS GRAM: < 1 second
– WS GRAM: 2 seconds
Concurrent jobs
– Pre-WS GRAM: 300 jobs
– WS GRAM: 32,000 jobs

Page 60: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component diagram (www.globus.org), repeated as a section divider]

Page 61: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GT4 Data Management

Stage/move large data to/from nodes
– GridFTP, Reliable File Transfer (RFT)
– Alone, and integrated with GRAM
Locate data of interest
– Replica Location Service (RLS)
Replicate data for performance/reliability
– Data Replication Service (DRS)
Provide access to diverse data sources
– File systems, parallel file systems, hierarchical storage: GridFTP
– Databases: OGSA DAI

Page 62: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

GridFTP

A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks
– FTP with well-defined extensions
– Uses basic Grid security (control and data channels)
– Multiple data channels for parallel transfers
– Partial file transfers
– Third-party (direct server-to-server) transfers
– Reusable data channels
– Command pipelining
GGF recommendation GFD.20

Page 63: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

101

GridFTP in GT4

100% Globus code
– No licensing issues
– Stable, extensible
IPv6 Support
XIO for different transports
Striping for multi-Gb/sec wide area transport
Pluggable
– Front-end: e.g., future WS control channel
– Back-end: e.g., HPSS, cluster file systems
– Transfer: e.g., UDP, NetBLT transport

[Figure: Bandwidth vs. Striping, disk-to-disk on TeraGrid. Bandwidth (Mbps, 0 to 20000) plotted against degree of striping (0 to 70), one curve each for 1, 2, 4, 8, 16, and 32 streams.]

Page 64: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

102

Striped Server

Multiple nodes work together and act as a single GridFTP server
An underlying parallel file system allows all nodes to see the same file system and must deliver good performance (usually the limiting factor in transfer speed)
– I.e., NFS does not cut it
Each node then moves (reads or writes) only the pieces of the file that it is responsible for
This allows multiple levels of parallelism: CPU, bus, NIC, disk, etc.
– Critical if you want to achieve better than 1 Gb/s without breaking the bank

Page 65: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

103

Striped GridFTP Service

A distributed GridFTP service that runs on a storage cluster
– Every node of the cluster is used to transfer data into/out of the cluster
– Head node coordinates transfers
Multiple NICs/internal busses lead to very high performance
– Maximizes use of Gbit+ WANs

Parallel Transfer: fully utilizes bandwidth of network interface on single nodes.
Striped Transfer: fully utilizes bandwidth of Gb+ WAN using multiple nodes.

[Figure: two storage clusters, each backed by a Parallel Filesystem]

Page 66: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

104

GridFTP Striped Transfer (18-Nov-03)

Receiving side: MODE E; SPAS (Listen) - returns list of host:port pairs; STOR <FileName>
Sending side: MODE E; SPOR (Connect) - connect to the host-port pairs; RETR <FileName>

[Figure: Host X sends a 16-block file to Hosts A-D; Hosts Y and Z also appear in the diagram]
Host A: Blocks 1, 5, 9, 13
Host B: Blocks 2, 6, 10, 14
Host C: Blocks 3, 7, 11, 15
Host D: Blocks 4, 8, 12, 16
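
The block-to-host layout in the striped transfer diagram (blocks 1, 5, 9, 13 on Host A, and so on) is a round-robin assignment. A minimal Python sketch of that layout follows; the function name `assign_blocks` is hypothetical, and a real GridFTP server decides its own layout, so treat this only as an illustration of the pattern in the figure.

```python
from collections import defaultdict


def assign_blocks(num_blocks, hosts):
    """Map 1-based block numbers to hosts in round-robin order."""
    layout = defaultdict(list)
    for block in range(1, num_blocks + 1):
        # Block 1 goes to the first host, block 2 to the second, and so on,
        # wrapping around once every host has received a block.
        host = hosts[(block - 1) % len(hosts)]
        layout[host].append(block)
    return dict(layout)


layout = assign_blocks(16, ["Host A", "Host B", "Host C", "Host D"])
for host, blocks in layout.items():
    print(host, blocks)
```

Each host then reads or writes only its own blocks, which is what lets the nodes work in parallel.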

Page 67: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

105

Typical Approach (without XIO)

[Figure: the Application talks to each I/O target through a different API: POSIX IO for Disk, a Protocol API for each Network Protocol, and a Proprietary API for a Special Device.]

Page 68: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

106

Globus XIO Approach

[Figure: the Application talks only to Globus XIO, which reaches Disk, each Network Protocol, and a Special Device through a Driver per target.]

Page 69: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

107

Drivers

Make 1 API do many types of IO
Specific drivers for specific protocols/devices
Transform
– Manipulate or examine data
– Do not move data outside of process space
– Compression, Security, Logging
Transport
– Moves data across a wire
– TCP, UDP, File IO, Device IO
– Typically move data outside of process space

Page 70: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

108

Stack

Transport
– Exactly one per stack
– Must be on the bottom
Transform
– Zero or many per stack
Control flows from user to the top of the stack, to the transport driver.

Example Driver Stack (top to bottom): Compression, Logging, TCP
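
To make the stack idea concrete, here is a small Python sketch of transform drivers layered over a single transport. It is not the Globus XIO API (which is a C library); every class name here is hypothetical, and the transport is an in-memory stand-in rather than TCP.

```python
import zlib


class MemoryTransport:
    """Stand-in transport driver: records bytes instead of using a wire."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)


class CompressionTransform:
    """Transform driver: compresses data, then passes it down the stack."""

    def __init__(self, below):
        self.below = below

    def write(self, data):
        self.below.write(zlib.compress(data))


class LoggingTransform:
    """Transform driver: records the size of each write, then passes it down."""

    def __init__(self, below, log):
        self.below = below
        self.log = log

    def write(self, data):
        self.log.append(len(data))
        self.below.write(data)


def build_stack(transport, log):
    # Top to bottom: Compression -> Logging -> transport (exactly one, at the bottom).
    return CompressionTransform(LoggingTransform(transport, log))


transport = MemoryTransport()
log = []
stack = build_stack(transport, log)
payload = b"grid " * 100
stack.write(payload)
```

The application only ever calls `stack.write`; swapping the transport or adding a transform changes how the stack is built, not the calling code.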

Page 71: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

109

Copying Files (in a nutshell)

globus-url-copy [options] srcURL dstURL
(guc below is shorthand for globus-url-copy)

guc gsiftp://localhost/foo file:///bar
– Client/server, using FTP stream mode
guc -vb -dbg -tcp-bs 1048576 -p 8 gsiftp://localhost/foo gsiftp://localhost/bar
– 3rd party transfer, MODE E
guc https://host.domain.edu/foo ftp://host.domain.gov/bar
– From secure http to ftp server

Page 72: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

110

The Options: Improving Performance

-p (parallelism or number of streams)
– Rule of thumb: 4-8, start with 4
-tcp-bs (TCP buffer size)
– Use either ping or traceroute to determine the RTT between hosts
– Buffer size (bytes) = BW (Mb/s) * RTT (ms) * 1000 / 8 / (parallelism value - 1)
– If that is still too complicated, use 2 MB
-vb if you want performance feedback
-dbg if you have trouble
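
The buffer-size rule of thumb is easy to get wrong by a factor of 1000, so here is a short Python sketch that applies it and assembles the matching command line. The helper names are hypothetical, and the handling of a parallelism of 1 (where the formula would divide by zero) is an assumption added here, not something the slide specifies. Only the flags shown on the slides are used, and the command is built but not run.

```python
def tcp_buffer_bytes(bandwidth_mbps, rtt_ms, parallelism):
    """Rule of thumb: BW (Mb/s) * RTT (ms) * 1000 / 8 / (parallelism - 1)."""
    # Mb/s * ms gives kilobits; * 1000 gives bits; / 8 gives bytes.
    bdp_bytes = bandwidth_mbps * rtt_ms * 1000 / 8
    # Assumption: with a single stream, use the full bandwidth-delay product.
    divisor = max(parallelism - 1, 1)
    return int(bdp_bytes / divisor)


def guc_args(src, dst, bandwidth_mbps, rtt_ms, parallelism=4):
    """Build (but do not run) a globus-url-copy command line."""
    buf = tcp_buffer_bytes(bandwidth_mbps, rtt_ms, parallelism)
    return ["globus-url-copy", "-vb", "-tcp-bs", str(buf),
            "-p", str(parallelism), src, dst]


# Example: a 1000 Mb/s path with a 50 ms round trip and 4 streams.
args = guc_args("gsiftp://localhost/foo", "gsiftp://localhost/bar", 1000, 50)
print(" ".join(args))
```

For this example the result is about 2 MB per stream, which lines up with the fallback value the slide suggests.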

Page 73: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

111

Tuning GridFTP

There are many ways to tune performance

Two sources of data are
http://www.globus.org/toolkit/docs/4.0/data/gridftp/rn01re01.html
http://www.nsf-middleware.org/OnTheGrid/2004-09-MaxGridFTP.pdf

Page 74: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

116

RFT - File Transfer Queuing

A WSRF service for queuing file transfer requests
– Server-to-server transfers
– Checkpointing for restarts
– Database back-end for failovers
Allows clients to request transfers and then “disappear”
– No need to manage the transfer
– Status monitoring available if desired
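
The queue-plus-database idea can be sketched in a few lines of Python. This is an illustration of the pattern only, not RFT's implementation or interface: the `TransferQueue` class, the table layout, and the host names are all hypothetical, the copy step is a stub, and checkpointing here is per file rather than within a file.

```python
import sqlite3


class TransferQueue:
    """Transfer requests persisted in a database so a restart can resume them."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS transfers ("
            "id INTEGER PRIMARY KEY, src TEXT, dst TEXT, "
            "status TEXT DEFAULT 'pending')"
        )
        conn.commit()

    def submit(self, src, dst):
        cur = self.conn.execute(
            "INSERT INTO transfers (src, dst) VALUES (?, ?)", (src, dst))
        self.conn.commit()
        return cur.lastrowid  # the client can keep this id and "disappear"

    def run_pending(self, copy):
        rows = self.conn.execute(
            "SELECT id, src, dst FROM transfers WHERE status != 'done' ORDER BY id"
        ).fetchall()
        for transfer_id, src, dst in rows:
            try:
                copy(src, dst)
                status = "done"
            except OSError:
                status = "failed"  # left in the table for a later retry
            self.conn.execute(
                "UPDATE transfers SET status = ? WHERE id = ?", (status, transfer_id))
            self.conn.commit()

    def status(self, transfer_id):
        row = self.conn.execute(
            "SELECT status FROM transfers WHERE id = ?", (transfer_id,)).fetchone()
        return row[0]


conn = sqlite3.connect(":memory:")
queue = TransferQueue(conn)
first = queue.submit("gsiftp://hostA/foo", "gsiftp://hostB/foo")
second = queue.submit("gsiftp://hostA/bar", "gsiftp://hostB/bar")

attempts = []


def flaky_copy(src, dst):
    """Stub copy that fails the first time it is asked to move 'bar'."""
    attempts.append(src)
    if src.endswith("bar") and attempts.count(src) == 1:
        raise OSError("simulated network failure")


queue.run_pending(flaky_copy)      # 'bar' fails on the first pass
restarted = TransferQueue(conn)    # a "restarted" service re-reads the same database
restarted.run_pending(flaky_copy)  # only the unfinished transfer is retried
```

Because the state lives in the database rather than in the client, status can be queried at any time with the id returned by `submit`.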

Page 75: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

117

Reliable File Transfer: Third Party Transfer

[Figure: an RFT Client exchanges SOAP messages with the RFT Service and optionally receives notifications. Two GridFTP servers are shown, each with a Protocol Interpreter, Master DSI, Slave DSI, and IPC Receiver joined by an IPC link; data moves between the servers over data channels.]

Fire-and-forget transfer
Web services interface
Many files & directories
Integrated failure recovery
Has transferred 900K files

Page 76: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

118

Replica Location Service

Identify location of files via logical to physical name map
Distributed indexing of names, fault tolerant update protocols
GT4 version scalable & stable
Managing ~40 million files across ~10 sites

[Figure labels: Index, Index]

Local DB    Update send (secs)    Bloom filter (secs)    Bloom filter (bits)
10K         <1                    2                      1 M
1 M         2                     24                     10 M
5 M         7                     175                    50 M
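
The table refers to Bloom filters, which give a compact summary of the names held in a local catalog. The Python sketch below shows the general technique only; it makes no claim about how RLS builds or sizes its filters, and the class, the logical name, and the URL are hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array; membership tests may give false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name):
        # Derive several bit positions per name by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{name}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, name):
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, name):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(name))


# Logical-to-physical name map held by one site (hypothetical entry).
local_catalog = {
    "lfn://experiment/run-001.dat": ["gsiftp://site-a.example.org/data/run-001.dat"],
}

# The summary sent to an index is the bit array, not the full list of names.
summary = BloomFilter(num_bits=1000, num_hashes=3)
for logical_name in local_catalog:
    summary.add(logical_name)
```

An index holding only `summary` can say "this site may have the file", and the full logical-to-physical lookup then happens at the site itself.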

Page 77: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

119

Reliable Wide Area Data Replication

LIGO Gravitational Wave Observatory
– Replicating >1 Terabyte/day to 8 sites
– >30 million replicas so far
– MTBF = 1 month

[Figure: map of replication sites, including Cardiff, AEI/Golm, and Birmingham]

www.globus.org/solutions

Page 78: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

121

OGSA-DAI

Provide service-based access to structured data resources as part of Globus

Specify a selection of interfaces tailored to various styles of data access—starting with relational and XML

Page 79: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

122

The OGSA-DAI Framework

[Figure: an Application uses the Client Toolkit to reach an OGSA-DAI service. Inside the service, an Engine runs Activities (SQLQuery, XPath, readFile, XSLT, GZip, GridFTP) against Data Resources (JDBC, XMLDB, File), which front databases such as MySQL, DB2, SQL Server, and Xindice, and files such as SWISSPROT.]

Page 80: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

124

OGSA-DAI: A Framework for Building Applications

Supports data access, insert and update
– Relational: MySQL, Oracle, DB2, SQL Server, Postgres
– XML: Xindice, eXist
– Files: CSV, BinX, EMBL, OMIM, SWISSPROT, …
Supports data delivery
– SOAP over HTTP
– FTP; GridFTP
– E-mail
– Inter-service
Supports data transformation
– XSLT
– ZIP; GZIP
Supports security
– X.509 certificate based security

Page 81: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

125

OGSA-DAI: Other Features

A framework for building data clients
– Client toolkit library for application developers
A framework for developing functionality
– Extend existing activities, or implement your own
– Mix and match activities to provide functionality you need
Highly extensible
– Customise our out-of-the-box product
– Provide your own services, client-side support, and data-related functionality

Page 82: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

128

Globus Toolkit v4: Open Source Grid Infrastructure (www.globus.org)

[Figure: GT4 component overview, repeated as a section divider]

Page 83: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

129

Monitoring and Discovery System (MDS4)

Grid-level monitoring system
– Aid user/agent to identify host(s) on which to run an application
– Warn on errors
Uses standard interfaces to provide publishing of data, discovery, and data access, including subscription/notification
– WS-ResourceProperties, WS-BaseNotification, WS-ServiceGroup
Functions as an hourglass to provide a common interface to lower-level monitoring tools

Page 84: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

130

[Figure: MDS4 as an hourglass]
Information Users: Schedulers, Portals, Warning Systems, etc.
WS standard interfaces for subscription, registration, notification
Standard Schemas (GLUE schema, e.g.)
Cluster monitors (Ganglia, Hawkeye, Clumon, and Nagios)
Services (GRAM, RFT, RLS)
Queuing systems (PBS, LSF, Torque)

Page 85: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

131

MDS4 Components

Information providers
– Monitoring is a part of every WSRF service
– Non-WS services can also be used
Higher level services
– Index Service: a way to aggregate data
– Trigger Service: a way to be notified of changes
– Both built on common aggregator framework
Clients
– WebMDS
All of the tools are schema-agnostic, but interoperability needs a well-understood common language

Page 86: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

132

Information Providers

Data sources for the higher-level services
Some are built into services
– Any WSRF-compliant service publishes some data automatically
– WS-RF gives us standard Query/Subscribe/Notify interfaces
– GT4 services: ServiceMetaDataInfo element includes start time, version, and service type name
– Most of them also publish additional useful information as resource properties

Page 87: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

133

Information Providers (2)

Other sources of data
– Any executables
– Other (non-WS) services
– Interface to another archive or data store
– File scraping
Just need to produce a valid XML document
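
Since the only requirement stated is a valid XML document, an executable information provider can be very small. The Python sketch below prints a few host facts as XML; the element names (`HostInfo`, `Name`, and so on) are made up for illustration and do not follow the GLUE schema or any MDS4 convention.

```python
import os
import platform
import xml.etree.ElementTree as ET


def build_document():
    """Return a small, well-formed XML document describing this host."""
    root = ET.Element("HostInfo")  # hypothetical element names throughout
    ET.SubElement(root, "Name").text = platform.node()
    ET.SubElement(root, "OperatingSystem").text = platform.system()
    ET.SubElement(root, "ProcessorCount").text = str(os.cpu_count() or 1)
    return ET.tostring(root, encoding="unicode")


document = build_document()
print(document)
```

Building the document with an XML library rather than string concatenation keeps it well-formed even when a value contains characters such as `<` or `&`.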

Page 88: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

134

Information Providers: GT4 Services

Reliable File Transfer Service (RFT)
– Service status data, number of active transfers, transfer status, information about the resource running the service
Community Authorization Service (CAS)
– Identifies the VO served by the service instance
Replica Location Service (RLS)
– Note: not a WS
– Location of replicas on physical storage systems (based on user registrations) for later queries

Page 89: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

135

Information Providers: Cluster and Queue Data

Interfaces to Hawkeye, Ganglia, CluMon, Nagios
– Basic host data (name, ID), processor information, memory size, OS name and version, file system data, processor load data
– Some Condor/cluster specific data
– This can also be done for sub-clusters, not just at the host level
Interfaces to PBS, Torque, LSF
– Queue information, number of CPUs available and free, job count information, some memory statistics and host info for head node of cluster

Page 90: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

136

Higher-Level Services

Index Service
– Caching registry
Trigger Service
– Warn on error conditions
Archive Service
– Database store for history (in development)
All of these have common needs, and are built on a common framework

Page 91: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

137

Common Aggregator Framework

Basic framework for higher-level functions
– Subscribe to Information Provider(s)
– Do some action
– Present standard interfaces

Page 92: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

138

Aggregator Framework Features

1) Common configuration mechanism
– Specify what data to get, and from where
2) Self cleaning
– Services have lifetimes that must be refreshed
3) Soft consistency model
– Published information is recent, but not guaranteed to be the absolute latest
4) Schema neutral
– Only a valid XML document is needed
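
The self-cleaning and soft-consistency features can be illustrated with a registry whose entries expire unless refreshed. This Python sketch is a generic soft-state pattern, not the aggregator framework's code; the class name, entry names, and data are hypothetical, and the clock is injected so the example does not need to sleep.

```python
import time


class SoftStateRegistry:
    """Registrations carry a lifetime and vanish unless they are refreshed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (data, expiry time)

    def register(self, name, data, lifetime_s):
        # Registering again with the same name acts as a refresh.
        self._entries[name] = (data, self._clock() + lifetime_s)

    def sweep(self):
        """Drop expired entries and return their names."""
        now = self._clock()
        expired = [name for name, (_, expiry) in self._entries.items() if expiry <= now]
        for name in expired:
            del self._entries[name]
        return expired

    def lookup(self, name):
        self.sweep()
        entry = self._entries.get(name)
        # The value is the last one published: recent, not necessarily current.
        return entry[0] if entry else None


now = [0.0]  # fake clock, advanced by hand
registry = SoftStateRegistry(clock=lambda: now[0])
registry.register("gram@site-a", {"free_cpus": 12}, lifetime_s=60)
registry.register("rft@site-b", {"active_transfers": 3}, lifetime_s=60)

now[0] = 45.0
registry.register("gram@site-a", {"free_cpus": 9}, lifetime_s=60)  # refresh

now[0] = 90.0  # site-b never refreshed, so its entry has expired
```

A service that stops refreshing simply ages out, so nobody has to deregister it explicitly.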

Page 93: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

139

MDS4 Index Service

Index Service is both registry and cache
– Datatype and data provider info, like a registry (UDDI)
– Last value of data, like a cache
In memory default approach
– DB backing store currently being developed to allow for very large indexes
Can be set up for a site or set of sites, a specific set of project data, or for user-specific data only
Can be a multi-rooted hierarchy
– No *global* index

Page 94: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

140

MDS4 Trigger Service

Subscribe to a set of resource properties
Evaluate that data against a set of pre-configured conditions (triggers)
When a condition matches, action occurs
– Email is sent to pre-defined address
– Website updated
Similar functionality in Hawkeye
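
The subscribe, evaluate, act cycle can be sketched as a list of (name, condition, action) triples checked against incoming data. This is a generic illustration in Python, not the Trigger Service's configuration format; the property names, thresholds, and the `send_email` stand-in are hypothetical.

```python
def evaluate_triggers(properties, triggers):
    """Run each trigger's action when its condition matches; return fired names."""
    fired = []
    for name, condition, action in triggers:
        if condition(properties):
            action(name, properties)
            fired.append(name)
    return fired


outbox = []


def send_email(trigger_name, properties):
    # Stand-in for a real mail action: record the message instead of sending it.
    outbox.append(f"[{trigger_name}] host={properties['host']} load={properties['load']}")


triggers = [
    ("high-load", lambda p: p["load"] > 0.9, send_email),
    ("disk-low", lambda p: p["free_disk_gb"] < 5, send_email),
]

# One update as it might arrive from a subscription (made-up values).
fired = evaluate_triggers(
    {"host": "node01", "load": 0.95, "free_disk_gb": 120}, triggers)
```

Keeping conditions and actions as data means new triggers can be added by configuration rather than by changing the evaluation loop.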

Page 95: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

141

WebMDS User Interface

Web-based interface to WSRF resource property information
User-friendly front-end to Index Service
Uses standard resource property requests to query resource property data
XSLT transforms to format and display them
Customized pages can be created simply by using HTML form options and writing your own XSLT transforms
Sample page:
– http://mds.globus.org:8080/webmds/webmds?info=indexinfo&xsl=servicegroupxsl

Page 96: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

142

Page 97: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

143

Working with TeraGrid

Large US project across 9 different sites
– Different hardware, queuing systems and lower level monitoring packages
Starting to explore MetaScheduling approaches
– GRMS (Poznan)
– W. Smith (TACC)
– K. Yoshimoto (SDSC)
– User Portal
Need a common source of data with a standard interface for basic scheduling info

Page 98: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

144

Data Collected

Provide data at the subcluster level
– Sys admin defines a subcluster, we query one node of it to dynamically retrieve relevant data
Can also list per-host details
Interfaces to Ganglia, Hawkeye, CluMon, and Nagios available now
– Other cluster monitoring systems can write into a .html file that we then scrape
Also collect basic queuing data, some TeraGrid specific attributes

Page 99: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

145

Page 100: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

146

Scalability Experiments

MDS index
– Dual 2.4 GHz Xeon processors, 3.5 GB RAM
– Sizes: 1, 10, 25, 50, 100
Clients
– 20 nodes, also dual 2.6 GHz Xeon, 3.5 GB RAM
– 1, 2, 3, 4, 5, 6, 7, 8, 16, 32, 64, 128, 256, 384, 512, 640, 768, 800
Nodes connected via 1 Gb/s network
Each data point is average of 8 minutes
– Ran for 10 mins but first 2 spent getting clients up and running
– Error bars are SD over 8 mins
Experiments by Ioan Raicu, U of Chicago, using DiPerf

Page 101: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

147

Page 102: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

149

Globus Toolkit v4: Open Source Grid Infrastructure (www.globus.org)

[Figure: GT4 component overview, repeated as a section divider]

Page 103: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

150

The Globus Ecosystem

Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc.
– GT4 being the latest version
A larger Globus ecosystem of open source and proprietary components provides complementary functionality
– A growing list of components
These components can be combined to produce solutions to Grid problems
– We’re building a list of such solutions

Page 104: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

151

Many Tools Build on, or Can Contribute to, GT4-Based Grids

Condor-G, DAGman
MPICH-G2
GRMS
Nimrod-G
Ninf-G
Open Grid Computing Env.
Commodity Grid Toolkit
GriPhyN Virtual Data System
Virtual Data Toolkit
GridXpert Synergy
Platform Globus Toolkit
VOMS
PERMIS
GT4IDE
Sun Grid Engine
PBS scheduler
LSF scheduler
GridBus
TeraGrid CTSS
NEES
IBM Grid Toolbox
…

Page 105: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

152

Global Community

Page 106: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

155

Example Solutions

Portal-based User Reg. System (PURSE)
VO Management Registration Service
Service Monitoring Service
TeraGrid TGCP Tool
Lightweight Data Replicator
GriPhyN Virtual Data System

Page 107: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

156

Condor-G

The Condor Project @ U Wisconsin Madison develops software for high-throughput computing on collections of distributed compute resources

Condor-G is an interface to GRAM created by the Condor team that allows users to submit jobs to GRAM servers

Page 108: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

159

MPICH-G2

MPICH-G2, developed at Northern Illinois University and Argonne National Lab, is a grid-enabled implementation of the MPI v1.1 standard

MPICH-G2 is implemented using the pre-WS GRAM component in GT4; integration with GT4 WS GRAM is expected in the near future

Page 109: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

163

SRB

SRB is a package from SDSC providing a uniform interface for connecting to network-based heterogeneous data resources

GT4’s GridFTP includes an interface to SRB data sources, and vice versa

Page 110: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

165

Tell Us About Your Grid Tools & Solutions

We list links to related projects on the “Related Software” page of the Globus Toolkit web site: www.globus.org/toolkit/tools/

“Solutions” are documented on the Globus web site: www.globus.org/solutions/

If we’ve got details wrong or you have a GT4-related tool to list on our website, please send mail to [email protected]

Page 111: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

166

The Globus Commitment to Open Source

Globus was first established as an open source project in 1996
The Globus Toolkit is open source to:
– allow for inspection
> for consideration in standardization processes
– encourage adoption
> in pursuit of ubiquity and interoperability
– encourage contributions
> harness the expertise of the community
The Globus Toolkit is distributed under the (BSD-style) Apache License version 2

Page 112: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

167

The Future: Structure

NSF Community Driven Improvement of Globus Software (CDIGS) project
– 5 years of funding for GT enhancement
GlobDev http://dev.globus.org
– Globus Development Environment

Page 113: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

168

Why is Globus Software Open Source?

To allow for inspection
– For consideration in standardization processes
To encourage adoption
– In pursuit of ubiquity and interoperability
To encourage contributions
– Harness the expertise of the community

Page 114: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

169

Open Contribution

But distributing code under an open source license does not guarantee open development!
Open development requires open processes
So we have created dev.globus to facilitate contributions
– http://dev.globus.org/

Page 115: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

170

Governance Model

Based on Apache Jakarta
– Individual development efforts organized as projects
– Consensus-based decision making
Control over each project in the hands of its most active and respected contributors (committers)
Globus Management Committee (GMC) providing overall guidance and conflict resolution

Page 116: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

171

Common Infrastructure

Code repositories (CVS, SVN)
Mailing lists
– *-dev, *-user, *-announce, *-commit for every project
Issue tracking (bugzilla)
– Including roadmap info for future development
Wikis
Known interactions for people accessing your project

Page 117: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

172

Sample

http://dev.globus.org/wiki/GRAM

Page 118: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

173

Page 119: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

174

Page 120: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

175

Page 121: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

176

Page 122: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

177

Technology Projects

Common runtime projects
– C Core Utilities, C WS Core, CoG jglobus, Core WS Schema, Java WS Core, XIO
Data projects
– GridFTP, Reliable File Transfer, Replica Location, Data Replication
Execution projects
– GRAM, Dynamic accounts, Virtual workspaces
Information services projects
– MDS4
Security projects
– C Security, CAS/SAML Utilities, Delegation Service, MyProxy

Page 123: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

178

Non-Technology Projects

Distribution projects
– Globus Toolkit Distribution
– Process was used for April 4.0.2 release
Documentation projects
– GT Release Manuals
Incubation projects
– Incubation management project
– And any new projects wanting to join

Page 124: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

179

Incubator Process in dev.globus

Entry point for new Globus projects
Incubator Management Project (IMP)
– Oversees incubator process from first contact to becoming a Globus project
– Quarterly reviews of current projects
– Process being debugged by “Incubator Pioneers”

http://dev.globus.org/wiki/IncubatorDraft

Page 125: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

180

Incubator Process (1 of 3)

Project proposes itself as a Candidate
– A proposed name for the project;
– A proposed project chair, with contact info;
– A list of the proposed committers for the project;
– An overview of the aims of the project;
– An overview of any current user base or user community, if applicable;
– An overview of how the project relates to other parts of Globus;
– A summary of why the project would enhance and benefit Globus.

Page 126: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

181

Incubator Process (2 of 3)

IMP meets, discusses, and accepts the project as a ProtoProject
– ProtoProject is now part of the Incubator framework
– Gets assigned a Mentor to help
> Member of IMP
> Bridge between Globus and new ProtoProject
– Opportunity to get up to speed on the Globus development process

Page 127: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

182

Incubator Process (3 of 3)

Quarterly reviews by IMP determine– Stay a ProtoProject

– Retire

– Escalate to a full Globus project

Escalation when ProtoProject passes checklist– Legal

– Meritocracy

– Alignment/Synergy

– Infrastructure

Page 128: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

184

You Can Begin Participating in Globus Development Today!

Monitor and comment on Globus development discussions; recent threads include:
– GT Backward Compatibility ([email protected])
– 4.2 Wish List for GRAM ([email protected])
Submit bug fixes and other contributions
Start your own Globus project!

Page 129: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

186

The Future: Content

We now have a solid and extremely powerful Web services base
Next, we will build an expanded open source Grid infrastructure
– Virtualization
– New services for provisioning, data management, security, VO management
– End-user tools for application development
– Etc., etc.
And of course responding to user requests for other short-term needs

Page 130: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

189

Opportunities for Collaboration

Use of Globus software
– Feedback & involvement in design
Development of new Globus components
– Help an existing project
> New information providers to enable use of GT to manage an entire Grid
> GRAM interfaces, etc.
– Become an incubator project
New applications and tools
– E.g., Grid operations, emergency response, ecogrid, bioinformatics, …

Page 131: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

191

Credits

Globus Toolkit v4 is the work of many talented Globus Alliance members, at
– Argonne Natl. Lab & U. Chicago
– USC Information Sciences Institute
– National Center for Supercomputing Applns
– U. Edinburgh
– Swedish PDC
– Univa Corporation
– Other contributors at other institutions
Supported by DOE, NSF, UK EPSRC, and other sources

Page 132: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

192

Some Useful Pointers

UK ETF GT4 report
http://www.nesc.ac.uk/technical_papers/UKeS-2005-03.pdf
GRAM command line man page
http://www.globus.org/toolkit/docs/4.0/execution/wsgram/rn01re01.html
GridFTP command line man page
http://www.globus.org/toolkit/docs/4.0/data/gridftp/rn01re01.html
MDS4 TG site
http://snipurl.com/j24r

Page 133: Grid Computing and the Globus Toolkit Jennifer M. Schopf Argonne National Lab National eScience Centre jms/Talks/

193

For More Information

Jennifer Schopf
– [email protected]
– www.mcs.anl.gov/~jms
Globus Alliance
– www.globus.org
GlobusWORLD 2006
– September 10-14
– Washington, D.C.
2nd Edition
– www.mkp.com/grid2