“Grid-enabling” applications. ITCS 4146/5146 Grid Computing, 2007, UNC-Charlotte, B. Wilkinson. March 27, 2007

“Grid-enabling” applications

Jan 15, 2016

Page 1: “Grid-enabling” applications

14.1

“Grid-enabling” applications

ITCS 4146/5146 Grid Computing, 2007, UNC-Charlotte, B. Wilkinson. March 27, 2007

Page 2: “Grid-enabling” applications

14.2

“Grid-enabling”

• A poorly defined and understood term!

• One simple definition:

– Being able to execute an application on a grid platform, using the distributed resources available on that platform.

Page 3: “Grid-enabling” applications

14.3

Another definition from the literature:

“Turning an existing application, installed on a Grid resource, into a service and generating the application-specific user interfaces to use that application through a web portal.”1

This definition assumes a portal interface and the use of services.

1 From: "A Service-Oriented, Scalable Approach to Grid-Enabling of Legacy Scientific Applications" by Sanjeepan, Vivekananthan; Matsunaga, Andrea; Zhu, Liping; Lam, Herman; Fortes, Jose A.B. Proc. of 2005 Int. Conf. on Web Services (ICWS-2005), Orlando, Florida, pp. 553-560, 11-15 July 2005.

Page 4: “Grid-enabling” applications

14.4

How does one do “Grid-enabling”?

• Still an open question in the research domain, with no standard approach.

Here we will describe various approaches.

Page 5: “Grid-enabling” applications

14.5

Simple “grid-enabling”: first step

• Simply running an application on a grid resource.

• Might just mean making sure the executable and input files are available to the application.

• Not exactly making the most of the grid platform!

Page 6: “Grid-enabling” applications

14.6

Best types of applications for grid-enabling

• One homogeneous application that needs to be executed multiple times with different arguments (“parameter sweep”) – perfect

• Computationally intensive: a high 'compute time' to 'communication time' ratio

• An MPI type parallel application with minimal message-passing between grid sites
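A parameter sweep can be sketched as follows. This is only a minimal local illustration: `simulate`, its `density` and `depth` parameters, and the cost formula are hypothetical stand-ins for a real application, and a thread pool stands in for the grid. On a grid platform each call would instead become a separate job on some resource.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(density, depth):
    """Hypothetical stand-in for the real application; returns a made-up cost."""
    return density * depth


def parameter_sweep(densities, depths, workers=4):
    """Run `simulate` once per (density, depth) combination.

    The runs do not depend on each other, which is what makes a
    parameter sweep easy to spread over many machines.
    """
    combos = list(product(densities, depths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda args: simulate(*args), combos))
    return dict(zip(combos, results))


if __name__ == "__main__":
    table = parameter_sweep([1.0, 2.0], [10, 20, 30])
    for params, cost in sorted(table.items()):
        print(params, cost)
```

Because no run needs data from any other run, there is no communication between jobs, which is why the slide calls this case "perfect".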

Page 7: “Grid-enabling” applications

14.7

Parameter Sweep Examples

• Molecular biologist (drug designer) looking for compounds in large chemical data sets that best dock with a particular protein

• Geologist looking at change in density and depth of ore-body and overlying rock’s density to optimise cost and production

• Aerospace engineer understanding role of geometry parameters in aerodynamic design and optimization process

• High energy physicist investigating origin of mass by analyzing petabytes of data generated by high-energy accelerators such as the LHC (Large Hadron Collider)

• Neuroscientist performing brain activity analysis by conducting pair-wise cross-correlation analysis of MEG (Magneto-EncephaloGraphy) sensor data

Source: Alchemi project.

Page 8: “Grid-enabling” applications

14.8

Grid-enabling MPI programs

• Globus version of MPI available to run MPI jobs across a grid (MPICH-G2).

http://www.globus.org/grid_software/computation/mpich-g2.php

Message passing can cross sites:

Page 9: “Grid-enabling” applications

14.9

MPICH-G2 programs

• Ideally one can simply run the MPI job unmodified across the grid.

• However, it is not that simple.

Page 10: “Grid-enabling” applications

14.10

Problems:

• Firewalls: Need to accommodate firewalls by opening up ports

• Job Schedulers: Each site has its own independent local job scheduler, which means one cannot guarantee that the MPI processes at the different sites will all be running at the same time so that they can communicate. (This issue does not seem to be mentioned in the MPICH-G2 documentation.)

• Latency: Message delays between sites (over the Internet) are much larger and more variable than within a single site.
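The effect of latency can be illustrated with a very rough model in which total time is compute time plus the number of messages times the per-message delay. The function names and every number below are assumptions chosen for illustration, not measurements, and the model ignores bandwidth, overlap of communication with computation, and variability.

```python
def estimated_runtime(compute_s, n_messages, latency_s):
    """Rough model: total time = compute time + messages * per-message delay."""
    return compute_s + n_messages * latency_s


def compute_to_comm_ratio(compute_s, n_messages, latency_s):
    """Ratio of compute time to communication time under the same model."""
    return compute_s / (n_messages * latency_s)


if __name__ == "__main__":
    # Illustrative, assumed values only (not measurements).
    compute_s = 100.0
    n_messages = 10_000
    lan_latency_s = 0.0001   # assumed delay inside one cluster
    wan_latency_s = 0.05     # assumed delay between sites over the Internet

    for name, latency in [("single site", lan_latency_s),
                          ("across sites", wan_latency_s)]:
        total = estimated_runtime(compute_s, n_messages, latency)
        ratio = compute_to_comm_ratio(compute_s, n_messages, latency)
        print(f"{name}: total {total:.1f} s, compute/comm ratio {ratio:.1f}")
```

Under these assumed figures the same program goes from being dominated by computation to being dominated by communication once its messages cross sites, which is why minimal message-passing between grid sites matters.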

Page 11: “Grid-enabling” applications

14.11

http://www.ngpp.ngp.org.sg/

Page 12: “Grid-enabling” applications

14.12

More advanced “grid-enabling”

Some strategies:

1. Using Globus and Grid service APIs

2. Using Grid wrappers to form services

3. Higher-level toolkits

Page 13: “Grid-enabling” applications

14.13

1. Using Globus APIs

Globus provides a suite of services that have APIs1 (C and Java interfaces) that could be called from the application.

1 API: An application programming interface is a source code interface that a computer system or program library provides in order to support requests for services to be made of it by a computer program. http://en.wikipedia.org/wiki/API

Page 14: “Grid-enabling” applications

14.14

Examples

• GridFTP for high performance file transfers.

• MDS (Monitoring and Discovery Service) for resource monitoring and discovery. Provides information about available grid resources and their status.

• RLS (Replica Location Service): maintains and provides access to mapping information from logical names for data items to target names - a database that maps logical file names or file aliases to physical locations.

• GASS – Global Access to Secondary Storage: Provides mechanisms for transferring data to and from a remote HTTP, FTP, or GASS server. Condor-G uses GASS to transfer the executable, stdin, stdout, and stderr to/from the remote resource.

Page 15: “Grid-enabling” applications

Globus Services

[Figure: Globus Toolkit components arranged in columns (Security, Data Management, Execution Management, Information Services, Common Runtime) and two rows (Web Services Components, Non-WS Components), with a legend marking GT2, GT3 and GT4. Components shown: Pre-WS Authentication Authorization; GridFTP; Grid Resource Allocation Mgmt (Pre-WS GRAM); Monitoring & Discovery System (MDS2); C Common Libraries; WS Authentication Authorization; Reliable File Transfer; OGSA-DAI [Tech Preview]; Grid Resource Allocation Mgmt (WS GRAM); Monitoring & Discovery System (MDS4); Java WS Core; Community Authorization Service; Replica Location Service; XIO; Credential Management; Python WS Core [contribution]; C WS Core; Community Scheduler Framework [contribution]; Delegation Service.]

Page 16: “Grid-enabling” applications

14.16

GridFTP

• Built on FTP using separation of data and control channels

• Provides features for

– Large data transfers

– Secure transfers

– Fast transfers

– Reliable transfers

– Third party transfers

• Not a web service

– RFT (Reliable File Transfer) service provides a WS-level interface

Page 17: “Grid-enabling” applications

14.17

Third party transfers

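PI = FTP Protocol Interpreter, DTP = FTP Data Transfer Process

[Figure: third-party transfer. The client holds a control channel to the PI of each of two servers; the data channel runs directly between the DTPs of the two servers.]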

Page 18: “Grid-enabling” applications

14.18

Performing a third-party transfer

1. Client establishes control channel with server

2. Using control channel, client sets up transfer parameters and requests data channel creation

3. Data channel established

4. Client sends transfer command over control channel

5. Data transfer starts through data channel

Either client or server can send.
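The steps above can be sketched with a toy in-memory model. `ToyServer`, its command names (`CONNECT`, `OPEN_DATA_CHANNEL`, `SEND`) and `third_party_transfer` are all hypothetical and are not the FTP or GridFTP command set. The point is only that commands travel on the control channels while the file data moves directly between the two servers.

```python
class ToyServer:
    """In-memory stand-in for an FTP-style server (hypothetical, not GridFTP)."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})
        self.peer = None      # other end of the data channel, once set up
        self.log = []         # commands received on the control channel

    def control(self, command, *args):
        """Control channel: receive one command from the client."""
        self.log.append(command)
        return getattr(self, "_cmd_" + command.lower())(*args)

    def _cmd_connect(self):
        return "ready"

    def _cmd_open_data_channel(self, peer):
        self.peer = peer
        return "data channel open"

    def _cmd_send(self, filename):
        # Data channel: bytes go straight to the peer server.
        self.peer.files[filename] = self.files[filename]
        return "transfer complete"


def third_party_transfer(source, dest, filename):
    """Client-side orchestration; file data never passes through the client."""
    source.control("CONNECT")                   # step 1: control channels
    dest.control("CONNECT")
    source.control("OPEN_DATA_CHANNEL", dest)   # steps 2-3: set up data channel
    dest.control("OPEN_DATA_CHANNEL", source)
    return source.control("SEND", filename)     # steps 4-5: command, then data


if __name__ == "__main__":
    site_a = ToyServer("A", {"results.dat": b"payload"})
    site_b = ToyServer("B")
    print(third_party_transfer(site_a, site_b, "results.dat"))
    print(site_b.files)
```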

Page 19: “Grid-enabling” applications

14.19

Parallel transfers and striping

• Using multiple (virtual) connections for transfer

– Same external network

– Speed improvement possible, but limited by network card

• Striping

– a version of parallel transfers that can use separate hardware interfaces

– Implemented in GT 4.
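The idea behind a parallel transfer can be sketched by dividing the data into byte ranges and moving each range on its own stream. This is a local illustration only: threads copying slices of a buffer stand in for network connections, and `split_ranges` and `parallel_copy` are hypothetical names, not part of GridFTP.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Divide `size` bytes into `streams` contiguous (start, end) ranges."""
    base, extra = divmod(size, streams)
    ranges, start = [], 0
    for i in range(streams):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def parallel_copy(data, streams=4):
    """Copy `data` using several concurrent 'streams', one byte range each."""
    out = bytearray(len(data))

    def copy_range(bounds):
        start, end = bounds
        out[start:end] = data[start:end]   # each stream writes its own slice

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(copy_range, split_ranges(len(data), streams)))
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(256)) * 4
    print(split_ranges(len(payload), 4))
    print(parallel_copy(payload) == payload)
```

In this sketch all streams share one machine, which corresponds to the "same external network" case; striping would correspond to the ranges being served through separate hardware interfaces.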

Page 20: “Grid-enabling” applications

14.20

GridFTP and RFT

[Figure: GridFTP and RFT. Labels: WS client; RFT service (Java); client API (Java); two XIO-based (C) GridFTP servers; a control channel to each GridFTP server; a data channel between the two GridFTP servers.]

From Gridwise

Page 21: “Grid-enabling” applications

14.21

GT 4 Replica Location Service

• Identify location of files via logical to physical name map

• Distributed indexing of names, fault tolerant update protocols

[Figure: replica location indexes. Credit: I. Foster]
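A logical-to-physical name map can be pictured as a small lookup table. The class below is a toy, single-process illustration: `ToyReplicaCatalog`, its methods, the logical name and the example URLs are all hypothetical, it is not the RLS API, and it has none of the distributed indexing or fault tolerance mentioned above.

```python
class ToyReplicaCatalog:
    """Toy logical-to-physical name map (hypothetical; not the RLS API)."""

    def __init__(self):
        self._replicas = {}   # logical name -> list of physical locations

    def register(self, logical_name, physical_url):
        """Record one more physical copy of a logical file."""
        locations = self._replicas.setdefault(logical_name, [])
        if physical_url not in locations:
            locations.append(physical_url)

    def lookup(self, logical_name):
        """Return every known physical location (empty list if unknown)."""
        return list(self._replicas.get(logical_name, []))


if __name__ == "__main__":
    catalog = ToyReplicaCatalog()
    # Made-up host names, for illustration only.
    catalog.register("run42/output.dat", "gsiftp://site-a.example.org/data/output.dat")
    catalog.register("run42/output.dat", "gsiftp://site-b.example.org/cache/output.dat")
    print(catalog.lookup("run42/output.dat"))
```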

Page 22: “Grid-enabling” applications

14.22

Monitoring and Discovery

• WSRF provides common mechanisms for monitoring and discovering a service.

• Every GT 4 service is discoverable

Page 23: “Grid-enabling” applications

14.23

2. Grid service wrapper approach

Providing a wrapper to make it possible to access an application as a grid service

[Figure: a request arrives at the grid service, which wraps the application.]

One of our guest speakers (Joel Hollingsworth) will discuss this in more detail
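The wrapper idea can be sketched in a few lines. This is not a Globus or WSRF grid service: `WrappedApplication` and `invoke` are hypothetical names, and a one-line Python script stands in for the legacy executable. The sketch shows only the core of a wrapper, which turns a request into a run of an unmodified program and returns its output.

```python
import subprocess
import sys


class WrappedApplication:
    """Expose a command-line program through a single `invoke` method."""

    def __init__(self, command):
        self.command = list(command)   # e.g. ["/path/to/legacy_app"]

    def invoke(self, *args, stdin_text=""):
        """Handle one request: run the program and return its results."""
        done = subprocess.run(
            self.command + [str(a) for a in args],
            input=stdin_text,
            capture_output=True,
            text=True,
        )
        return {
            "exit_code": done.returncode,
            "stdout": done.stdout,
            "stderr": done.stderr,
        }


if __name__ == "__main__":
    # Stand-in "legacy application": a one-line script that adds its arguments.
    legacy = [sys.executable, "-c",
              "import sys; print(sum(float(x) for x in sys.argv[1:]))"]
    service = WrappedApplication(legacy)
    print(service.invoke(1.5, 2.5))
```

The application itself is untouched; only the wrapper would need to know how requests arrive and how results are returned.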

Page 24: “Grid-enabling” applications

14.24

3. Higher-level toolkits

• Objective is to provide a suite of APIs that are system independent and that hide the underlying grid structure, even the fact that Globus or any other lower-level grid middleware is being used.

• Examples: Grid Application Toolkit (GAT)

Page 25: “Grid-enabling” applications

14.25

Grid Application Toolkit (GAT)

• APIs for developing and executing portable grid applications that are independent of the underlying grid infrastructure and available services

• GAT APIs used by application to access grid services

• Essentially wrapper code that hides Globus API.
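The general pattern of hiding the middleware behind a portable API can be sketched as below. This is not GAT's actual API: `FileAdaptor`, `LocalAdaptor` and `GridFile` are hypothetical names, and the only back end shown copies on the local file system. A real toolkit would supply back ends for Globus and other middleware behind the same interface.

```python
from abc import ABC, abstractmethod
import os
import shutil
import tempfile


class FileAdaptor(ABC):
    """Hypothetical back-end interface; one subclass per middleware."""

    @abstractmethod
    def copy(self, source, destination):
        ...


class LocalAdaptor(FileAdaptor):
    """Back end that copies on the local file system."""

    def copy(self, source, destination):
        shutil.copyfile(source, destination)


class GridFile:
    """Application-facing API: the caller never names the middleware."""

    def __init__(self, adaptors):
        self.adaptors = list(adaptors)

    def copy(self, source, destination):
        errors = []
        for adaptor in self.adaptors:      # try each back end in turn
            try:
                adaptor.copy(source, destination)
                return type(adaptor).__name__
            except Exception as exc:
                errors.append(exc)
        raise RuntimeError(f"no adaptor could copy {source}: {errors}")


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "input.txt")
    dst = os.path.join(workdir, "copy.txt")
    with open(src, "w") as handle:
        handle.write("hello grid")
    print(GridFile([LocalAdaptor()]).copy(src, dst))
```

The application code calls only `GridFile.copy`, so swapping or adding back ends does not require changing the application.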

Page 26: “Grid-enabling” applications

14.26

Page 27: “Grid-enabling” applications

14.27

Deploying legacy code

• For the most part, people want to re-use their existing high performance code.

• Several projects to make this easier.

Example

GriddLeS: Grid Enabling Legacy Software http://www.csse.monash.edu.au/~davida/griddles/

Page 28: “Grid-enabling” applications

14.28

Uses GAT

Page 29: “Grid-enabling” applications

14.29

Page 30: “Grid-enabling” applications

14.30

Other tools

Page 31: “Grid-enabling” applications

14.31

Data Grids

Data integration

• Data integration is the capability to link different datasets together, thereby enabling users to interact with them as if they were a single, unified and homogeneous resource.
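As a small local illustration of the idea, the sketch below presents a relational table and a CSV file as one stream of rows. The table name `readings`, the column names and the sample values are all made up for the example, and nothing here uses grid middleware.

```python
import csv
import io
import sqlite3


def unified_view(db_conn, csv_text):
    """Yield (name, value) rows from two differently stored datasets."""
    # Source 1: a relational table.
    query = "SELECT name, value FROM readings ORDER BY name"
    for name, value in db_conn.execute(query):
        yield name, float(value)
    # Source 2: CSV text that uses different column names.
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield row["sensor"], float(row["reading"])


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (name TEXT, value REAL)")
    conn.execute("INSERT INTO readings VALUES ('s1', 1.5)")
    csv_text = "sensor,reading\ns2,2.5\n"
    for row in unified_view(conn, csv_text):
        print(row)
```

The caller sees one uniform sequence of rows and does not need to know which source each row came from.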

Page 32: “Grid-enabling” applications

14.32

OGSA-DAI Project: Open Grid Services Architecture Data Access and Integration

Aim of the OGSA-DAI project is to develop middleware to assist with access and integration of data from separate sources via the grid.

http://www.ogsadai.org.uk/

Page 33: “Grid-enabling” applications

14.33

Grid-enabling a data resource using OGSA

• “ … Placing it behind wrapper middleware for the Grid, e.g., OGSA-DAI. …

• Once a data resource is Grid-enabled, its availability can be easily advertised in registries where advanced Grid middleware will know to find them and learn of their specific usage conditions for both access and update, as the case may be. ”

http://www.ncess.ac.uk/learning/tutorials/datagrids/grid_en/why_grid_en_important/what_grid_en_involves/

Page 34: “Grid-enabling” applications

14.34

http://www.ncess.ac.uk/learning/tutorials/datagrids/grid_en/why_grid_en_important/what_grid_en_involves/

Page 35: “Grid-enabling” applications

14.35

OGSA-DAI Architecture

Page 36: “Grid-enabling” applications

14.36

End of formal lecture materials in course!!

Page 37: “Grid-enabling” applications

14.37

What Next

Mini-project: Will be discussed Thursday March 29th, 2007.

PLEASE BE SURE TO ATTEND THIS CLASS

Actually, the mini-project will not start until April, after the MPI assignment, but next week we have a guest presentation.