Grid System Issues MSI-CI 2 Meeting June 29 2006 Geoffrey Fox Computer Science, Informatics, Physics Pervasive Technology Laboratories Indiana University Bloomington IN 47401 [email protected] http://www.infomall.org

Grid System Issues MSI-CI 2 Meeting

Jan 14, 2016


Transcript
Page 1: Grid System Issues MSI-CI 2  Meeting


Grid System Issues

MSI-CI2 Meeting

June 29 2006

Geoffrey Fox

Computer Science, Informatics, Physics

Pervasive Technology Laboratories

Indiana University Bloomington IN 47401

[email protected]

http://www.infomall.org

Page 2: Grid System Issues MSI-CI 2  Meeting

Topics Covered
General Issues: Relation to P2P
Types of Grids
Why use Service Oriented Architectures
Multi-core Chips
All the world’s services
Workflow
Metadata and State
Workflow
Sensors and Filters
SOAP MPI and Communication Performance
Grids of Grids
Community Tools

Page 3: Grid System Issues MSI-CI 2  Meeting

Web services
Web Services build loosely coupled, distributed applications (wrapping existing codes and databases) based on SOA (service-oriented architecture) principles.
Web Services interact by exchanging messages in SOAP format.
The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.

[Figure: resources (databases, humans, programs, computational resources, devices); service logic (BPEL, Java, .NET); message processing (SOAP and WSDL); SOAP messages of the form <env:Envelope> <env:Header> ... </env:Header> <env:Body> ... </env:Body> </env:Envelope>]
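The envelope structure on this slide can be made concrete with a minimal Python sketch. It uses only the standard library and the SOAP 1.2 envelope namespace; the operation name, the parameter names, and the `urn:example:catalog` service namespace are hypothetical and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# SOAP 1.2 envelope namespace
SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"


def build_envelope(operation, params, service_ns="urn:example:catalog"):
    """Build a SOAP envelope with an empty Header and one request in the Body.

    `service_ns` is a made-up namespace used only for illustration.
    """
    ET.register_namespace("env", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(request, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_body(xml_text):
    """Return (operation, params) from the first element inside the Body."""
    root = ET.fromstring(xml_text)
    request = root.find(f"{{{SOAP_ENV}}}Body")[0]
    operation = request.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in request}
    return operation, params


message = build_envelope("GetPrice", {"item": "widget"})
print(message)
print(read_body(message))
```

A real deployment would normally generate this plumbing from the WSDL contract rather than build the XML by hand; the sketch only shows what travels on the wire.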

Page 4: Grid System Issues MSI-CI 2  Meeting

A typical Web Service
In principle, services can be written in any language (Fortran .. Java .. Perl .. Python), and the interfaces can be method calls, Java RMI messages, CGI Web invocations, or totally compiled away (inlining).
The simplest implementations involve XML messages (SOAP) and programs written in net-friendly languages like Java and Python.

[Figure: Web Services labeled Payment (Credit Card), Warehouse (Shipping control), Security, and Catalog, together with a Portal Service, joined through WSDL interfaces]

Page 5: Grid System Issues MSI-CI 2  Meeting

Philosophy of Web Service Grids
Much of Distributed Computing was built by natural extensions of computing models developed for sequential machines
This leads to the distributed object (DO) model represented by Java and CORBA
• RPC (Remote Procedure Call) or RMI (Remote Method Invocation) for Java
Key people think this is not a good idea as it scales badly and ties distributed entities together too tightly
• Distributed Objects Replaced by Services
Note CORBA was considered too complicated in both organization and proposed infrastructure
• and Java was considered as “tightly coupled to Sun”
• So there were other reasons to discard
Thus replace distributed objects by services connected by “one-way” messages and not by request-response messages
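The contrast between one-way messages and request-response calls can be sketched in a few lines of Python. This is an illustrative toy, not anything from the talk: the two services, their inboxes, and the message fields are all hypothetical, and in-process queues stand in for a real messaging layer.

```python
import queue
import threading


def run_service(inbox, handler):
    """Consume one-way messages until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        handler(message)


def demo():
    orders_inbox = queue.Queue()
    shipping_inbox = queue.Queue()
    shipped = []

    def order_handler(message):
        # The order service never waits for a reply; it forwards a new message.
        shipping_inbox.put({"type": "ship", "item": message["item"]})

    def shipping_handler(message):
        shipped.append(message["item"])

    orders = threading.Thread(target=run_service, args=(orders_inbox, order_handler))
    shipping = threading.Thread(target=run_service, args=(shipping_inbox, shipping_handler))
    orders.start()
    shipping.start()

    for item in ["disk", "cable"]:
        orders_inbox.put({"type": "order", "item": item})  # fire and forget

    orders_inbox.put(None)    # stop the order service once its inbox drains
    orders.join()
    shipping_inbox.put(None)  # then stop the shipping service
    shipping.join()
    return shipped


print(demo())
```

The sender only knows the inbox it writes to, which is the loose coupling the slide argues for; a remote method call would instead block on, and depend on, the receiver's interface.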

Page 6: Grid System Issues MSI-CI 2  Meeting

Some ideas to Remember
Grids are managed Web Services exchanging Messages
P2P Networks are differently managed and architected services exchanging messages
Any computer operation involves messages; not all these messages can be isolated
• With services all messages are explicit and can be examined
Grid Services extend WS-* Web Service Specifications
Web Service container replaces computer
Service replaces process
A stream is an ordered set of messages
Service Internet replaces Internet: messages replace packets
(Sub)Grids replace Libraries

Page 7: Grid System Issues MSI-CI 2  Meeting

Internet Scale Distributed Services
Grids use Internet technology and are distinguished by managing or organizing sets of network connected resources
• Classic Web allows independent one-to-one access to individual resources
• Grids integrate together and manage multiple Internet-connected resources: People, Sensors, computers, data systems
Organization can be explicit as in
• TeraGrid which federates many supercomputers;
• Information Retrieval Grid which federates multiple data resources;
• CrisisGrid which federates first responders, commanders, sensors, GIS, (Tsunami) simulations, science/public data
Organization can be implicit as in Internet resources such as curated databases and simulation resources that “harmonize a community”

Page 8: Grid System Issues MSI-CI 2  Meeting

Typical Grid Architecture
[Figure: boxes labeled Raw (HPC) Resources, Middleware, Database, Portal Services, System Services (several), Application Service, User Services, and “Core” Grid]
Each Blob is a Computer Program!

Page 9: Grid System Issues MSI-CI 2  Meeting

Classic Grid Architecture
[Figure: tiers labeled Resources (Database, Database, Netsolve, Computing); Middle Tier: Brokers, Service Providers (Security, Collaboration, Composition, Content Access); Clients: Users and Devices]
Middle Tier becomes Web Services

Page 10: Grid System Issues MSI-CI 2  Meeting

Peer to Peer Grid
[Figure: Peers and Databases with User Facing Web Service Interfaces and Service Facing Web Service Interfaces, joined by Event/Message Brokers]
A democratic organization

Page 11: Grid System Issues MSI-CI 2  Meeting

Different Visions of the Grid
e-Science or Cyberinfrastructure are virtual organization Grids supporting global distributed engineering and science research (note sensors, instruments and people are all distributed)
Utility Computing or X-on-demand (X=data, computer ..) is a major computer Industry interest in Grids and this is a key part of enterprise or campus Grids
The Skype (Kazaa) VOIP system is a Peer-to-peer Grid (and VRVS/GlobalMMCS-like Internet A/V conferencing systems are Collaboration Grids)
DoD’s vision of Network Centric Computing can be considered a Grid (linking sensors, warfighters, commanders, backend resources) and they are building the GIG (Global Information Grid)
Commercial 3G cell phones and the DoD ad-hoc network initiative are forming mobile Grids
Grids support universal Globalization in life, fun, research, business

Page 12: Grid System Issues MSI-CI 2  Meeting

e-moreorlessanything and the Grid
e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.
• The growing use of outsourcing is one example
e-Science is the similar vision for scientific research with international participation in large accelerators, satellites or distributed gene analyses.
The Grid integrates the best of the Web, traditional enterprise software, high performance computing and Peer-to-peer systems to provide the information technology e-infrastructure for e-moreorlessanything.
A deluge of data of unprecedented and inevitable size must be managed and understood.
People, computers, data and instruments must be linked.
On demand assignment of experts, computers, networks and storage resources must be supported

Page 13: Grid System Issues MSI-CI 2  Meeting

e-Defense and e-Crisis
Grids support Command and Control and provide Global Situational Awareness
• Link commanders and frontline troops to themselves and to archival and real-time data; link to what-if simulations
• Dynamic heterogeneous wired and wireless networks
• Security and fault tolerance essential
System of Systems; Grid of Grids
• The command and information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid
• Grids must be heterogeneous and federated
Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support

Page 14: Grid System Issues MSI-CI 2  Meeting


1962 Licklider’s Vision

“Lick had this concept – all of the stuff linked together throughout the world, that you can use a remote computer, get data from a remote computer, or use lots of computers in your job.”

Larry Roberts – Principal Architect of the ARPANET

Page 15: Grid System Issues MSI-CI 2  Meeting


What is e-Science?

‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’

John Taylor, Director General of Research Councils, UK Office of Science and Technology

e-Science is about developing tools and technologies that allow scientists to do ‘faster, better or different’ research

Page 16: Grid System Issues MSI-CI 2  Meeting

Some Important Styles of Grids
Computational Grids were the origin of the concepts and link computers across the globe; high latency stops this from being used as a parallel machine
• Typically Compute/File Grids where information (messages) is exchanged by writing and reading files
Knowledge and Information Grids link sensors and information repositories as in Virtual Observatories or BioInformatics
Education Grids link teachers, learners, parents as a VO with learning tools, distant lectures etc.
e-Science Grids link multidisciplinary researchers across laboratories and universities
Community Grids focus on Grids involving large numbers of peers rather than on linking major resources; they link Grid and Peer-to-peer network concepts
Semantic Grid links the Grid and AI communities with Semantic Web (ontology/meta-data enriched resources) and Agent concepts
Collaboration Grids support the linkage of multiple people and electronic resources (often peer-to-peer architecture)

Page 17: Grid System Issues MSI-CI 2  Meeting

Types of Computing Grids
Running “Pleasingly Parallel Jobs” as in United Devices, Entropia (Desktop Grid) “cycle stealing systems”
Can be managed (“inside” the enterprise as in Condor) or more informal (as in SETI@Home)
Computing-on-demand in Industry where jobs spawned are perhaps very large (SAP, Oracle …)
Support distributed file systems as in Legion (Avaki), Globus with (web-enhanced) UNIX programming paradigm
• Particle Physics will run some 30,000 simultaneous jobs
Distributed Simulation HLA style Grids (some work)
Linking Supercomputers as in TeraGrid
Pipelined applications linking data/instruments, compute, visualization
Seamless Access where Grid portals allow one to choose one of multiple resources with a common interface
Parallel Computing typically NOT suited for a Grid (latency)

Page 18: Grid System Issues MSI-CI 2  Meeting

Old Style Metacomputing Grid
[Figure: Large Scale Parallel Computers and Large Disks together with imaging instruments, computational resources, large-scale databases, data acquisition and analysis, advanced visualization, and Analysis and Visualization]
Spread a single large Problem over multiple supercomputers

Page 19: Grid System Issues MSI-CI 2  Meeting

Utility and Service Computing
An important business application of Grids is believed to be utility computing
Namely support a pool of computers to be assigned as needed to take up extra demand
• Pool shared between multiple applications
Natural architecture is not a cluster of computers connected to each other but rather a “Farm of Grid Services” connected to the Internet and supporting services such as
• Web Servers
• Financial Modeling
• Run SAP
• Data-mining
• Simulation response to crisis like forest fire or earthquake
• Media Servers for Video-over-IP
Note classic Supercomputer use is to allow full access to do “anything” via ssh etc.
• In the service model, one pre-configures services for all programs and you access a portal to run a job with fewer security issues

Page 20: Grid System Issues MSI-CI 2  Meeting

GOSC Timeline
UK National Grid Service, Grid Operation Support Centre: Web Services based National Grid Infrastructure
[Figure: quarterly timeline for 2004 to 2006 with milestones EGEE gLite alpha release, gLite release 1, EGEE gLite release, OMII release (twice), OGSA-DAI, NGS Production Service, NGS Expansion (Bristol, Cardiff…), NGS Expansion, WS plan, NGS WS Service, WS2 plan, NGS WS Service 2]

Page 21: Grid System Issues MSI-CI 2  Meeting

Towards an International Grid Infrastructure
[Figure: US TeraGrid and UK NGS sites (Leeds, PSC, SDSC, UCL, NCSA, Manchester, Oxford, RAL) connected through Starlight (Chicago), Netherlight (Amsterdam), and UKLight; legend: Computation, Network PoP, Service Registry, Steering clients; SC05: local laptops in Seattle and UK]
All sites connected by production network (not all shown)

Page 22: Grid System Issues MSI-CI 2  Meeting

UNIVERSITY OF CALIFORNIA, SAN DIEGO

SAN DIEGO SUPERCOMPUTER CENTER

Fran Berman

Cyberinfrastructure At Home

• BOINC (Berkeley Open Infrastructure for Network Computing) (http://boinc.berkeley.edu)

• Climateprediction.net: study climate change

• Einstein@home: search for gravitational signals emitted by pulsars

• LHC@home: improve the design of the CERN LHC particle accelerator

• Predictor@home: investigate protein-related diseases

• Rosetta@home: help researchers develop cures for human diseases

• SETI@home: Look for radio evidence of extraterrestrial life

• Etc.

SETI@Home averages 138 TFLOPS on 100,000’s of

computers in 100’s of countries

Arecibo telescope

Page 23: Grid System Issues MSI-CI 2  Meeting

climateprediction.net

Since September 2003:
• 95,000 registered participants in 150 countries
• Donated 8,000 years of computer time
• Completed 100,000 simulations of over 4M model years

Page 24: Grid System Issues MSI-CI 2  Meeting

Information/Knowledge Grids
Distributed (10’s to 1000’s) data sources (instruments, file systems, curated databases …)
Data Deluge: 1 (now) to 100’s petabytes/year (2012)
• Moore’s law for Sensors
Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data
Needs decision support front end with “what-if” simulations
Metadata (provenance) critical to annotate data
Integrate across experiments as in multi-wavelength astronomy
Data Deluge comes from pixels/year available
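One way to picture "filters assigned dynamically" with provenance metadata is a registry that picks a filter from the kind of data arriving. The Python sketch below is purely illustrative: the kind names, the two toy filters, and the provenance fields are assumptions, not part of any Grid toolkit named in the slides.

```python
# Hypothetical registry mapping a data kind to the filter that handles it.
FILTERS = {}


def register_filter(kind):
    """Decorator that registers a filter function for one kind of data."""
    def decorator(fn):
        FILTERS[kind] = fn
        return fn
    return decorator


@register_filter("telescope_image")
def threshold_pixels(pixels):
    # Toy stand-in for image processing: keep only bright pixels.
    return [p for p in pixels if p >= 10]


@register_filter("gene_reads")
def count_bases(reads):
    # Toy stand-in for a sequencing algorithm: count bases over all reads.
    counts = {}
    for read in reads:
        for base in read:
            counts[base] = counts.get(base, 0) + 1
    return counts


def process(item):
    """Choose the filter on demand and attach provenance to the result."""
    fn = FILTERS[item["kind"]]
    return {
        "result": fn(item["data"]),
        "provenance": {"source": item["source"], "filter": fn.__name__},
    }


print(process({"kind": "telescope_image", "source": "scope-1", "data": [3, 12, 40]}))
print(process({"kind": "gene_reads", "source": "lab-7", "data": ["ACG", "GGA"]}))
```

Recording which filter ran on which source is the minimal form of the provenance the slide calls critical.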

Page 25: Grid System Issues MSI-CI 2  Meeting

Data Deluged Science
In the past, we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new algorithms and new ways of computing
Data assimilation was not central to HPCC
DoE ASCI set up because didn’t want test data!
Now particle physics will get 100 petabytes from CERN
• Nuclear physics (Jefferson Lab) in same situation
• Use around 30,000 CPU’s simultaneously 24X7
Weather, climate, solid earth (EarthScope)
Bioinformatics curated databases (Biocomplexity only 1000’s of data points at present)
Virtual Observatory and SkyServer in Astronomy
Environmental Sensor nets

Page 26: Grid System Issues MSI-CI 2  Meeting

Data Deluged Science Computing Paradigm
[Figure: diagram with nodes Data, Information, Ideas, Simulation, and Model; labels Assimilation, Reasoning, and Datamining; regions Computational Science and Informatics]

Page 27: Grid System Issues MSI-CI 2  Meeting

Grid of Grids: Research Grid and Education Grid
[Figure: SERVOGrid (Research) with Repositories and Federated Databases, Sensors with Streaming Data and Field Trip Data, Data Filter Services, Discovery Services, Research Simulations, and an Analysis and Visualization Portal; Customization Services lead From Research to Education (Education Grid, Computer Farm); component grids labeled GIS Grid, Sensor Grid, Database Grid, Compute Grid]

Page 28: Grid System Issues MSI-CI 2  Meeting

SERVOGrid Requirements
Seamless Access to Data repositories and large scale computers
Integration of multiple data sources including sensors, databases, file systems with analysis system
• Including filtered OGSA-DAI (Grid database access)
Rich meta-data generation and access with SERVOGrid specific Schema extending openGIS (Geography as a Web service) standards and using Semantic Grid
Portals with component model for user interfaces and web control of all capabilities
Collaboration to support world-wide work
Basic Grid tools: workflow and notification
NOT metacomputing

Page 29: Grid System Issues MSI-CI 2  Meeting

Community Tools
e-mail and list-serves are oldest and best used
Kazaa, Instant Messengers, Skype, Napster, BitTorrent for P2P Collaboration: text, audio-video conferencing, files
del.icio.us, Connotea, Citeulike manage shared bookmarks
hotornot.com or similar sites allow you to create community resources and share them
Writely, Wikis and Blogs are powerful specialized shared document systems
ConferenceXP and WebEx share general applications
Google Scholar tells you who has cited your papers while publisher sites tell you about co-authors
Note sharing resources creates (implicit) communities
• Social network tools study graphs to both define communities and extract their properties

Page 30: Grid System Issues MSI-CI 2  Meeting

Why use SOA’s
Globalization of applications: Life, Fun, Research, Business, Defense as an International collaborative activity
Globalization of Software Production: Software components including open-source made everywhere
Interoperability: in interfaces and protocol (messages) requires Web Services as only broadly supported SOA
Anti-Performance: if Moore’s law gives you a factor X, then use √X for performance, √X for improved lifecycle (re-use)
Software Engineering: Software paradigms are ways of “packaging” modules/components/objects/methods/subroutines. Services have minimal coupling and best re-use (lowest performance). 1962 Fortran easier re-use than 2006 Java
Multicore chips: requires pervasive concurrency without side effects. Even Microsoft must be able to use 32-128 way parallelism on a chip over next 5 years
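The “Anti-Performance” rule above is simple arithmetic, and a short Python sketch makes the split explicit. The 18-month doubling period in `moore_factor` is an assumption chosen for illustration; the slide itself gives no figure for X.

```python
import math


def moore_factor(years, doubling_months=18.0):
    """Raw improvement factor X, assuming one doubling per `doubling_months`.

    The 18-month default is an illustrative assumption, not a number from the talk.
    """
    return 2.0 ** (years * 12.0 / doubling_months)


def split_gain(x):
    """Split a factor X into equal shares: sqrt(X) for performance, sqrt(X) for lifecycle."""
    share = math.sqrt(x)
    return share, share


performance, lifecycle = split_gain(100.0)
print(performance, lifecycle)

five_year = moore_factor(5)
print(round(five_year, 2), [round(s, 2) for s in split_gain(five_year)])
```

The two shares multiply back to X, so nothing is lost; half of the hardware gain (on a logarithmic scale) is simply spent on re-use and maintainability instead of raw speed.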

Page 31: Grid System Issues MSI-CI 2  Meeting

Intel Fall 2005 Multicore Roadmap

March 2006 Sun T1000 8 core Server at <$6,000

Page 32: Grid System Issues MSI-CI 2  Meeting

Performance Per Transistor

Performance data from uP vendors
Transistor count excludes on-chip caches
Performance normalized by clock rate
Conclusion: Simplest is best! (250K Transistor CPU)
[Figure: two log-log plots, Normalized SPECINTS and Normalized SPECFLTS (0.1 to 10) against Millions of Transistors (CPU) (0.1 to 10). Source: Peter Kogge 1997]

Page 33: Grid System Issues MSI-CI 2  Meeting

The Grid and Web Service Institutional Hierarchy
Must set standards to get interoperability
1: Container and Run Time (Hosting) Environment (Apache Axis, .NET etc.)
2: System Services and Features (WS-* from OASIS/W3C/Industry): Handlers like WS-RM, Security, UDDI Registry
3: Generally Useful Services and Features (OGSA and other GGF, W3C) such as “Collaborate”, “Access a Database” or “Submit a Job”
4: Application or Community of Interest (CoI) Specific Services such as “Map Services”, “Run BLAST” or “Simulate a Missile”
[Figure labels: OGSA GS-* and some WS-* (GGF/W3C/….); XGSP (Collab); WS-* from OASIS/W3C/Industry; Apache Axis, .NET etc.; XBML, XTCE, VOTABLE, CML, CellML]

Page 34: Grid System Issues MSI-CI 2  Meeting

Sources of Grid Technology
Grids support distributed collaboratories or virtual organizations integrating concepts from
• The Web
• Agents
• Distributed Objects (CORBA Java/Jini COM)
• Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
• Peer-to-peer Networks
With perhaps the Web and P2P networks being the most important for “Information Grids” and Globus for “Compute/File Grids”

Page 35: Grid System Issues MSI-CI 2  Meeting

The Essence of Grid Technology?
We will start from the Web view and assert that basic paradigm is
Meta-data rich Web Services communicating via messages
These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3/4 (Globus Toolkit 3/4)
• These are the distributed equivalent of operating system functions as in UNIX Shell
• Called Hosting Environment or platform
W3C standard WSDL defines IDL (Interface standard) for Web Services

Page 36: Grid System Issues MSI-CI 2  Meeting

What is Happening?
Grid ideas are being developed in (at least) four communities
• Web Service: W3C, OASIS, (DMTF)
• Global Grid Forum (High Performance Computing, e-Science)
• Enterprise Grid Alliance (Commercial “Grid Forum” with a near term focus) merged with GGF to make Open Grid Forum
Service Standards are being debated
Grid Operational Infrastructure is being deployed
Grid Architecture and core software being developed
• Apache has several important projects as do academia; large and small companies
Particular System Services are being developed “centrally”: OGSA framework for this in GGF; WS-* for OASIS/W3C/Microsoft-IBM
Lots of fields are setting domain specific standards and building domain specific services
USA started but now Europe is probably in the lead and Asia will soon catch USA if momentum (roughly zero for USA) continues

Page 37: Grid System Issues MSI-CI 2  Meeting

What do Grids Add?
Grids use all of the Web Services
They address management and deployment of large distributed systems of services
• Internet Scale Distributed Services
• I will use Grid more simply as a composable coordinated collection of services
They address security and management issues of virtual organizations crossing multiple administrative domains
GGF is developing specific services of relevance including job management, many aspects of data and scheduling
• Not much on sensors, real-time, P2P
GGF has a good process for developing new higher level specifications

Page 38: Grid System Issues MSI-CI 2  Meeting

Technical Activities of Note
Look at different styles of Grids such as Autonomic (Robust Reliable Resilient)
New Grid architectures hard due to investment required
Program the Grid: Workflow
Access the Grid: Portals, Grid Computing Environments
Critical Services Such as
• Security: build message based not connection based
• Notification: event services
• Metadata: Use Semantic Web, provenance
• Fabric and Service Management
• Databases and repositories: instruments, sensors
• Computing: Submit job, scheduling, distributed file systems
• Visualization, Computational Steering
• Network performance
[Side labels: Low Level WS-*; High Level e.g. OGSA]

Page 39: Grid System Issues MSI-CI 2  Meeting

What do Web Services Prescribe?
• They specify interfaces for system services (and generally useful services like database)
• They specify an interface language (WSDL) for all services
• They develop containers and frameworks to use to host services
• They specify a message format (SOAP) for ALL messages that defines both application and system actions precisely
• They imply a process be started to define domain specific services
• There are multiple competing activities from Microsoft and IBM to Apache, and IU (for example) developing system and application services
• Unlike for RTI and CORBA, services from different vendors should interoperate
[Figure: a message with headers H1 to H4 and a Body; Container Handlers F1 to F4 perform Container System Processing in front of the Service]
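The container-and-handlers picture can be sketched as a chain of functions that each consume one header before the body reaches the service. This Python toy is an assumption-laden illustration: the `token` and `sequence` header names, the two handlers, and the `Container` class are hypothetical and do not model Apache Axis or .NET APIs.

```python
def security_handler(message, log):
    """Consume a (hypothetical) token header and reject unknown callers."""
    token = message["headers"].pop("token", None)
    if token != "secret":
        raise PermissionError("bad token")
    log.append("authenticated")


def reliability_handler(message, log):
    """Consume a (hypothetical) sequence header and record an acknowledgement."""
    seq = message["headers"].pop("sequence", None)
    if seq is not None:
        log.append(f"ack {seq}")


class Container:
    """Runs system handlers over the headers, then hands the body to the service."""

    def __init__(self, handlers, service):
        self.handlers = handlers
        self.service = service
        self.log = []

    def dispatch(self, message):
        for handler in self.handlers:
            handler(message, self.log)
        # The service sees only the application payload.
        return self.service(message["body"])


container = Container([security_handler, reliability_handler], lambda body: body.upper())
reply = container.dispatch({"headers": {"token": "secret", "sequence": 7}, "body": "run blast"})
print(reply, container.log)
```

Keeping system concerns in handlers is what lets the same service run unchanged when, say, the reliability or security mechanism is swapped.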

Page 40: Grid System Issues MSI-CI 2  Meeting

Plethora of Standards
Java is very powerful partly due to its many “frameworks” that generalize libraries e.g.
• Java Media Framework
• Java Database Connectivity JDBC
Web Services have a corresponding collection of specifications that represent critical features of the distributed operating system for “Grids of Simple Services”
• About 60 WS-* specifications introduced in last 2-3 years
• These are low level with higher level standards such as access database (OGSA-DAI) or “Submit a job” built on top of these
Many battles both between standards bodies and between companies as each tries to set standards they consider best; thus there are multiple standards for many of the key Web Service functionalities
Microsoft a key player and stands to benefit as Web Services open up enterprise software space to all participants
• e.g. MQSeries (IBM) and Tibco have to change their messaging systems to support new open standards

Page 41: Grid System Issues MSI-CI 2  Meeting

The Ten areas covered by the 60 core WS-* Specifications
WS-* Specification Area | Examples
1: Core Service Model | XML, WSDL, SOAP
2: Service Internet | WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MTOM
3: Notification | WS-Notification, WS-Eventing (Publish-Subscribe)
4: Workflow and Transactions | BPEL, WS-Choreography, WS-Coordination
5: Security | WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation
6: Service Discovery | UDDI, WS-Discovery
7: System Metadata and State | WSRF, WS-MetadataExchange, WS-Context
8: Management | WSDM, WS-Management, WS-Transfer
9: Policy and Agreements | WS-Policy, WS-Agreement
10: Portals and User Interfaces | WSRP (Remote Portlets)

Page 42: Grid System Issues MSI-CI 2  Meeting

Activities in Global Grid Forum Working Groups
GGF Area | GS-* and OGSA Standards Activities
1: Architecture | High Level Resource/Service Naming (level 2 of slide 6), Integrated Grid Architecture
2: Applications | Software Interfaces to Grid, Grid Remote Procedure Call, Checkpointing and Recovery, Interoperability to Job Submittal services, Information Retrieval
3: Compute | Job Submission, Basic Execution Services, Service Level Agreements for Resource use and reservation, Distributed Scheduling
4: Data | Database and File Grid access, Grid FTP, Storage Management, Data replication, Binary data specification and interface, High-level publish/subscribe, Transaction management
5: Infrastructure | Network measurements, Role of IPv6 and high performance networking, Data transport
6: Management | Resource/Service configuration, deployment and lifetime, Usage records and access, Grid economy model
7: Security | Authorization, P2P and Firewall Issues, Trusted Computing

Page 43: Grid System Issues MSI-CI 2  Meeting

Net-Centric Core Enterprise Services
Core Enterprise Services | Service Functionality
NCES1: Enterprise Services Management (ESM) | including life-cycle management
NCES2: Information Assurance (IA)/Security | Supports confidentiality, integrity and availability. Implies reliability and autonomic features
NCES3: Messaging | Synchronous or asynchronous cases
NCES4: Discovery | Searching data and services
NCES5: Mediation | Includes translation, aggregation, integration, correlation, fusion, brokering publication, and other transformations for services and data. Possibly agents
NCES6: Collaboration | Provision and control of sharing with emphasis on synchronous real-time services
NCES7: User Assistance | Includes automated and manual methods of optimizing the user GIG experience (user agent)
NCES8: Storage | Retention, organization and disposition of all forms of data
NCES9: Application | Provisioning, operations and maintenance of applications.

Page 44: Grid System Issues MSI-CI 2  Meeting

The Core Features/Service Areas I
Service or Feature | WS-* | GS-* | NCES (DoD) | Comments
A: Broad Principles
FS1: Use SOA: Service Oriented Arch. | WS1 | - | - | Core Service Architecture, Build Grids on Web Services. Industry best practice
FS2: Grid of Grids | - | - | - | Distinctive Strategy for legacy subsystems and modular architecture
B: Core Services
FS3: Service Internet, Messaging | WS2 | - | NCES3 | Streams/Sensors. Team
FS4: Notification | WS3 | - | NCES3 | JMS, MQSeries.
FS5: Workflow | WS4 | - | NCES5 | Grid Programming
FS6: Security | WS5 | GS7 | NCES2 | Grid-Shib, Permis Liberty Alliance ...
FS7: Discovery | WS6 | - | NCES4 | UDDI
FS8: System Metadata & State | WS7 | - | - | Globus MDS, Semantic Grid, WS-Context
FS9: Management | WS8 | GS6 | NCES1 | CIM
FS10: Policy | WS9 | - | - | ECS

Page 45: Grid System Issues MSI-CI 2  Meeting

The Core Feature/Service Areas II
Service or Feature | WS-* | GS-* | NCES | Comments
B: Core Services (Continued)
FS11: Portals and User assistance | WS10 | - | NCES7 | Portlets JSR168, NCES Capability Interfaces
FS12: Computing | - | GS3 | - | -
FS13: Data and Storage | - | GS4 | NCES8 | NCOW Data Strategy; Federation at data/information layer major research area; CGL leading role
FS14: Information | - | GS4 | - | JBI for DoD, WFS for OGC
FS15: Applications and User Services | - | GS2 | NCES9 | Standalone Services; Proxies for jobs
FS16: Resources and Infrastructure | - | GS5 | - | Ad-hoc networks
FS17: Collaboration and Virtual Organizations | - | GS7 | NCES6 | XGSP, Shared Web Service ports
FS18: Scheduling and matching of Services and Resources | - | GS3 | - | Current work only addresses scheduling “batch jobs”. Need networks and services

Page 46: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 1
• 1) Core Service Architecture

• XSD XML Schema (W3C Recommendation) V1.0 February 1998, V1.1 February 2004

• WSDL 1.1 Web Services Description Language Version 1.1, (W3C note) March 2001

• WSDL 2.0 Web Services Description Language Version 2.0, (W3C under development) March 2004

• SOAP 1.1 (W3C Note) V1.1 Note May 2000

• SOAP 1.2 (W3C Recommendation) June 24 2003

Page 47: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 2
• 2) Service Internet including messaging
• WS-Addressing Web Services Addressing (BEA, IBM, Microsoft, SAP, Sun) in W3C consideration August 2004
• WS-MessageDelivery Web Services Message Delivery (W3C Submission by Oracle, Sun ..) April 2004
• WS-Reliability Web Services Reliable Messaging (OASIS Web Services Reliable Messaging TC) March 2004
• WS-RM Web Services Reliable Messaging (BEA, IBM, Microsoft, Tibco) v0.992 February 2005 linked to WS-Reliability in OASIS as Web Services Reliable Exchange (WS-RX)
• WS-RM Policy Web Services Reliable Messaging Policy Assertion (BEA, IBM, Microsoft, Tibco) March 2006
• WS-RX Web Services Reliable Exchange (Many members) integrating previous reliability specifications
• SOAP MTOM SOAP Message Transmission Optimization Mechanism (W3C) June 2004
• SOAP-over-UDP Binding of SOAP to UDP (Microsoft, BEA …) September 2004
• Many obsolete specifications like WS-Routing and Referral SOAP Routing Protocol (Microsoft) October 2001

Page 48: Grid System Issues MSI-CI 2  Meeting

Layered Architecture for Web Services and Grids
[Figure: layer stack]
Service layers: Application Specific Grids; Generally Useful Services and Grids; Workflow WSFL/BPEL; Service Management (“Context etc.”); Service Discovery (UDDI) / Information; Service Internet Transport Protocol; Service Interfaces WSDL; Base Hosting Environment
Bit level Internet (OSI Stack) layers: Protocol HTTP FTP DNS …; Presentation XDR …; Session SSH …; Transport TCP UDP …; Network IP …; Data Link / Physical
Side labels: Higher Level Services; Service Context; Service Internet; Bit level Internet

Page 49: Grid System Issues MSI-CI 2  Meeting

WS-* implies the Service Internet

We have the classic (CISCO, Juniper ….) Internet routing the flood of ordinary packets in OSI stack architecture

Web Services build the “Service Internet” or IOI (Internet on Internet) with

• Routing via WS-Addressing not IP header

• Fault Tolerance (WS-RM not TCP)

• Security (WS-Security/SecureConversation not IPSec/SSL)

• Data Transmission by WS-Transfer not HTTP

• Information Services (UDDI/WS-Context not DNS/Configuration files)

• At message/web service level and not packet/IP address level

Software-based Service Internet possible as computers “fast”

Familiar from Peer-to-peer networks and built as a software overlay network defining Grid (analogy is VPN)

SOAP Header contains all information needed for the “Service Internet” (Grid Operating System) with SOAP Body containing information for Grid application service
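The split between header and body can be sketched in a few lines of Python. This is only an illustration: the endpoint, action URI and `RunSimulation` payload are hypothetical, and only three WS-Addressing header elements are shown, using the SOAP 1.2 and WS-Addressing 1.0 namespaces.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"   # SOAP 1.2 envelope namespace
WSA_NS = "http://www.w3.org/2005/08/addressing"       # WS-Addressing 1.0 namespace


def build_envelope(to, action, message_id, payload):
    """Put routing information in the Header and application data in the Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # "Service Internet" information: where the message goes and what it requests.
    ET.SubElement(header, f"{{{WSA_NS}}}To").text = to
    ET.SubElement(header, f"{{{WSA_NS}}}Action").text = action
    ET.SubElement(header, f"{{{WSA_NS}}}MessageID").text = message_id

    # Application information: opaque to the messaging infrastructure.
    body.append(payload)
    return envelope


payload = ET.Element("RunSimulation")                  # hypothetical application element
payload.set("model", "flood")
envelope = build_envelope(
    to="http://example.org/services/flow-model",       # hypothetical endpoint
    action="http://example.org/actions/RunSimulation", # hypothetical action URI
    message_id="urn:uuid:00000000-0000-0000-0000-000000000001",
    payload=payload,
)
xml_text = ET.tostring(envelope, encoding="unicode")
```

An intermediary only needs to read the `Header` children to route the message; the `Body` is handed to the application service untouched.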

Page 50: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 3

• 3) Notification and high-level publish/subscribe information dissemination

• WS-Eventing Web Services Eventing (BEA, Microsoft, TIBCO) August 2004

• WS-EventNotification (HP, IBM, Intel, Microsoft) March 2006 uses resources to manage subscriptions

• WS-Notification Framework for Web Services Notification with WS-Topics, WS-BaseNotification, and WS-BrokeredNotification (OASIS) OASIS Web Services Notification TC Set up March 2004

• JMS Java Message Service V1.1 March 2002

• Different from using publish-subscribe to robustly support messaging between Web services

– Bind SOAP to JMS or MQSeries

Page 51: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 4

• 4) Coordination and Workflow, Transactions and Contextualization

• BPEL Business Process Execution Language for Web Services (OASIS) V1.1 May 2003 with V2.0 under development

• WS-CDL Web Services Choreography Description Language (W3C) V1.0 Working Draft 17 December 2004

• WSCI (W3C) Web Service Choreography Interface V1.0 (W3C Note from BEA, Intalio, SAP, Sun, Yahoo)

• WSCL Web Services Conversation Language (W3C Note) HP March 2002

• Workflow is general linkage between services; transactions are a critical special case

• Concept of workflow generalizes traditional workflow processes in business

• Many competing workflow implementations and standards; many implementations “reject” current standards

Page 52: Grid System Issues MSI-CI 2  Meeting

Role of Workflow

Programming SOAP and Web Services (the Grid): Workflow describes linkage between services

As distributed, linkage must be by messages

Linkage is two-way and has both control and data

Apply to multi-disciplinary, multi-scale linkage, multi-program linkage, link visualization to simulation, GIS to simulations and visualization filters to each other

Microsoft-IBM specification BPEL is current preferred Web Service XML specification of workflow

[Figure: Service-1, Service-2 and Service-3 linked by workflow]
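A toy sketch of workflow as message linkage between services, in Python. The three services, the message layout (a `control` field plus a `data` field) and the `run_workflow` helper are all illustrative assumptions, not BPEL; only the forward direction of the linkage is shown.

```python
from collections import deque


def service_1(message):
    # Hypothetical service: appends its name to the data it was sent.
    return {"control": "next", "data": message["data"] + ["service_1"]}


def service_2(message):
    return {"control": "next", "data": message["data"] + ["service_2"]}


def service_3(message):
    return {"control": "done", "data": message["data"] + ["service_3"]}


def run_workflow(links, start, first_message):
    """Deliver each service's output message to the services linked after it."""
    queue = deque([(start, first_message)])
    trace = []
    result = None
    while queue:
        service, message = queue.popleft()
        result = service(message)              # the only coupling is the message
        trace.append(service.__name__)
        for successor in links.get(service, []):
            queue.append((successor, result))
    return trace, result


# The workflow is just the description of the links.
links = {service_1: [service_2], service_2: [service_3]}
trace, result = run_workflow(links, service_1, {"control": "start", "data": []})
```

Each message carries both control (`control`) and data (`data`), and changing the workflow means changing `links`, not the services.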

Page 53: Grid System Issues MSI-CI 2  Meeting

Example workflow

Here a sensor feeds a data-mining application (We are extending data-mining in DoD applications with Grossman from UIC)

The data-mining application drives a visualization

Page 54: Grid System Issues MSI-CI 2  Meeting

Example Flood Simulation workflow

[Figure: Data Archives, Runoff Models and Flow Models linked by SOAP Messages and Events. GIS Grid Services Link Distributed Data and Applications]

Page 55: Grid System Issues MSI-CI 2  Meeting

SERVOGrid Codes, Relationships

[Figure: linked codes: Elastic Dislocation, Pattern Recognizers, Fault Model BEM, Viscoelastic Layered BEM, Viscoelastic FEM, Elastic Dislocation Inversion]

This linkage called Workflow in Grid/Web Service parlance

Page 56: Grid System Issues MSI-CI 2  Meeting

Two-level Programming I

• The Web Service (Grid) paradigm implicitly assumes a two-level Programming Model

• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies

– C++ Java or Fortran Monte Carlo module

– Data streaming from a sensor or Satellite

– Specialized (JDBC) database access

• Such services accept and produce data from users, files and databases

• The Grid is built by coordinating such services assuming we have solved problem of programming the service

[Figure: Service exchanging Data]

Page 57: Grid System Issues MSI-CI 2  Meeting

Two-level Programming II

The Grid is discussing the composition of distributed services with the runtime interfaces to Grid as opposed to UNIX pipes/data streams

Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs

Such interpretative environments are the single processor analog of Grid Programming

Some projects like GrADS from Rice University are looking at integration between service and composition levels but dominant effort looks at each level separately

[Figure: Service1, Service2, Service3 and Service4 composed together]

Page 58: Grid System Issues MSI-CI 2  Meeting

3 Layer Programming Model

Level 1 Programming inside services: Application expressed in Java Fortran C++ MPI etc.

Level 2 Programming choosing services by virtualization: Application Semantics (Metadata, Ontology) Semantic Grid

Level 3 Grid Programming composing multiple services: Service Workflow, Transactions, Mediation

Substantial work in UK e-Science program, international semantic web community

[Figure: Web Service 1, WS 2, … WS N-1, Web Service N on WS-* Infrastructure]

Page 59: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 4-Continued

• 4) Transactions, Business Processes and Contextualization

• WS-CAF Web Services Composite Application Framework including WS-CTX, WS-CF and WS-TXM below (OASIS Web Services Composite Application Framework TC)

• WS-CTX Web Services Context (OASIS Web Services Composite Application Framework TC) V0.9.2 July 2005

• WS-CF Web Services Coordination Framework (OASIS Web Services Composite Application Framework TC) V0.1 April 2005

• WS-TXM Web Services Transaction Management (OASIS Web Services Composite Application Framework TC) including WS-ACID (V0.1 May 2005), WS-BP (Business Process V0.1 May 2005), WS-LRA (Long running action V0.1 May 2005)

• WS-Coordination Web Services Coordination (BEA, IBM, Microsoft) November 2004

• WS-AtomicTransaction Web Services Atomic Transaction (BEA, IBM, Microsoft) November 2004

• WS-BusinessActivity Web Services Business Activity Framework (BEA, IBM, Microsoft) November 2004

• BTP Business Transaction Protocol (OASIS) May 2002 with V1.1 November 2004

• ebXML BPSS Business Process (OASIS) with V2.0.1 pre-Committee Draft review 17 July 2005

Page 60: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 5

• 5) Security Frameworks and Core Specifications

• WS-Security 2004 Web Services Security: SOAP Message Security (OASIS) Standard March 2004.

• WS-I Basic Security Profile V1.0 Web Services Interoperability Organization Working Group Draft May 15 2005

• WS-Security Username Token Profile Web Services Security Username Token Profile V1.0 OASIS Standard, March 2004

• WS-Security X.509 Certificate Token Profile Web Services Security X.509 Certificate Token Profile OASIS Standard, March 2004

• WS-Security REL Profile Web Services Security Rights Expression Language (REL) Token Profile OASIS Standard: 19 December 2004

• WS-I REL Token Profile V1.0 Web Services Interoperability Organization Working Group Draft 13 May 2005

• WS-Security Kerberos Web Services Security Kerberos Binding (Microsoft) December 2003

• Web-SSO Web Single Sign-On Interoperability Profile (Microsoft, Sun) April 2005

• Web-SSO-Mex Web Single Sign-On Metadata Exchange Protocol (Microsoft, Sun) April 2005

• WS-SecurityPolicy Web Services Security Policy Language (IBM, Microsoft, RSA, Verisign) V1.1 July 2005

Page 61: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 5 - Contd

• 5) Security Capabilities

• WS-Trust Web Services Trust Language (BEA, IBM, Microsoft, RSA, Verisign …) February 2005

• WS-SecureConversation Web Services Secure Conversation Language (BEA, IBM, Microsoft, RSA, Verisign …) February 2005

• WS-Federation Web Services Federation Language (BEA, IBM, Microsoft, RSA, Verisign) July 2003

• WS-Federation Active Requestor Profile Web Services Federation Language Active Requestor Profile V 1.0 (BEA, IBM, Microsoft, RSA, Verisign) July 8, 2003

• WS-Federation Passive Requestor Profile Web Services Federation Language Passive Requestor Profile V 1.0 (BEA, IBM, Microsoft, RSA, Verisign) July 8, 2003

• WS-Authorization is being developed by IBM and Microsoft and will build on WS-Trust to describe how access to particular web services is specified and managed.

• WS-Privacy is being developed by IBM and Microsoft and will build on WS-Policy to describe the binding of privacy policies to Web services and their exchanged data.

Page 62: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 5 - Contd

• 5) Security Languages

• SAML Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0 OASIS Standard, 15 March 2005

• WS-Security SAML Token Profile Web Services Security SAML Token Profile OASIS Standard, 1 December 2004

• WS-I SAML Token Profile V1.0 Web Services Interoperability Organization Working Group Draft 13 May 2005

• XACML eXtensible Access Control Markup Language (OASIS) V2.0 1 February 2005

Page 63: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 6

• 6) Service Discovery

• UDDI (Broadly Supported OASIS Standard) V3 August 2003

• WS-Discovery Web services Dynamic Discovery (Microsoft, BEA, Intel …) February 2004

• WS-IL Web Services Inspection Language, (IBM, Microsoft) November 2001

• Note WS-Context as a metadata catalog and WS-Management Catalog are examples of related services

• There are many UDDI extensions

Page 64: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 7

• 7) Metadata and State

• RDF Resource Description Framework (W3C) Set of recommendations expanded from original February 1999 standard

• DAML+OIL combining DAML (Darpa Agent Markup Language) and OIL (Ontology Inference Layer) (W3C) Note December 2001

• OWL Web Ontology Language (W3C) Recommendation February 2004

• WS-MetadataExchange 1.1 Web Services Metadata Exchange (HP, IBM, Intel, Microsoft) March 2006

• ASAP Asynchronous Service Access Protocol (OASIS) with V1.0 working draft 2B December 11 2004

• WS-GAF Web Service Grid Application Framework (Arjuna, Newcastle University) August 2003

• WBEM Web-Based Enterprise Management including CIM (Common Information Model) from DMTF (Distributed Management Task Force) 2004-2005

Page 65: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 7

• 7) Metadata and State: Resource Framework

• WS-RF Web Services Resource Framework (OASIS) including

• WS-Resource Framework Web Services Resource 1.2 (OASIS) Public Review Draft 01, 10 June 2005

• WS-ResourceProperties Web Services Resource Properties V1.2 Public Review Draft 01, 10 June 2005

• WS-ResourceLifetime Web Services Resource Lifetime V1.2 Public Review Draft 01, 13 June 2005

• WS-ServiceGroup Web Services Service Group V1.2 Public Review Draft 01, 10 June 2005

• WS-BaseFaults Web Services Base Faults V1.2 Public Review Draft 01, June 13, 2005

Page 66: Grid System Issues MSI-CI 2  Meeting

Metadata and Service Context

Consider a collection of services working together

• Workflow tells you how to specify service interaction but more basically there is shared information or context specifying/controlling collection

WS-RF and WS-GAF have different approaches to contextualization – supplying a common “context” which at its simplest is a token to represent state

More generally core shared information includes dynamic service metadata and the equivalent of configuration information.

One can support such a common context either as pool of messages or as message-based access to a “database” (Context Service)

Two services linked by a stream are perhaps simplest example of a collection of services needing context

Note that there is a tension between storing metadata in messages and services.

• This is shared versus distributed memory debate in parallel computing

Page 67: Grid System Issues MSI-CI 2  Meeting

Stateful Interactions

There are (at least) four approaches to specifying state

• OGSI uses factories to generate separate services for each session in standard distributed object fashion

• Globus GT-4 and WSRF use metadata of a resource to identify state associated with particular session

• WS-GAF uses WS-Context to provide abstract context defining state. Has strength and weakness that reveals less about nature of session

• WS-I+ “Pure Web Service” leaves state specification to the application – e.g. put a context in the SOAP body

I think we should smile and write a great metadata service hiding all these different models for state and metadata
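One way to picture such a metadata service is a small facade that hands out a context token and hides where the token travels. The sketch below is an in-memory stand-in with hypothetical names (`ContextService`, `find_context`); it does not implement WS-Context, WSRF or OGSI.

```python
import uuid


class ContextService:
    """In-memory stand-in for a context (metadata) service."""

    def __init__(self):
        self._contexts = {}

    def create(self, **metadata):
        token = str(uuid.uuid4())              # the token represents the session state
        self._contexts[token] = dict(metadata)
        return token

    def get(self, token):
        return dict(self._contexts[token])

    def update(self, token, **metadata):
        self._contexts[token].update(metadata)


def message_with_context_in_header(token, payload):
    # Infrastructure-managed style: the token travels in the header.
    return {"header": {"context": token}, "body": {"payload": payload}}


def message_with_context_in_body(token, payload):
    # "Pure Web Service" style: the application puts the token in the body.
    return {"header": {}, "body": {"context": token, "payload": payload}}


def find_context(message):
    """Hide from the caller which of the two styles the sender used."""
    return message["header"].get("context") or message["body"].get("context")


contexts = ContextService()
token = contexts.create(session="flood-run-1", step=0)     # hypothetical session metadata
contexts.update(token, step=1)
in_header = message_with_context_in_header(token, "data")
in_body = message_with_context_in_body(token, "data")
```

Both messages resolve to the same stored state, so a service written against `find_context` and `ContextService` does not care which state model produced the message.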

Page 68: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 8

• 8) Management – original OASIS

• WS-DistributedManagement Web Services Distributed Management Framework with MUWS and MOWS below (OASIS)

• WSDM-MUWS Web Services Distributed Management: Management Using Web Services (OASIS) OASIS Standard March 9 2005

• WSDM-MOWS Web Services Distributed Management: Management of Web Services (OASIS) OASIS Standard March 9 2005

Page 69: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 8 - Contd

• 8) Management: Microsoft Converged Stack

• WS-Management Web Services for Management (Microsoft, Intel, Sun …) August 2005

• WS-Management Catalog The WS-Management Catalog (Microsoft, Intel, Sun …) August 2005

• WS-ResourceTransfer Web Service Resource Transfer (HP, IBM, Intel, Microsoft) March 2006

• WS-Transfer Web Service Transfer (Microsoft, BEA, Sonic Software etc.) September 2004

• WS-TransferAddendum Extensions to Web Service Transfer (HP, IBM, Intel, Microsoft) March 2006

• WS-Enumeration Web Service Enumeration (Microsoft, BEA, Sonic Software etc.) September 2004

Page 70: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 9

• 9) General Service Characteristics

• WS-PolicyFramework Web Services Policy Framework (BEA, IBM, Microsoft, SAP …) September 2004

• WS-PolicyAttachment Web Services Policy Attachment (BEA, IBM, Microsoft, SAP …) September 2004

• WS-PolicyAssertions Web Services Policy Assertions Language (BEA, IBM, Microsoft, SAP) 18 December 2002 (Superseded by WS-PolicyFramework)

• WS-Agreement Web Services Agreement Specification (GGF under development) 9 August 2004

Page 71: Grid System Issues MSI-CI 2  Meeting

A List of Web Services 10

• 10) User Interfaces

• WSRP Web Services for Remote Portlets (OASIS) OASIS Standard August 2003

• JSR168: JSR-000168 Portlet Specification for Java binding (Java Community Process) October 2003

• WSRP specifies the client-service protocol while JSR168 specifies how portlets are implemented for each supported service user-facing Web service ports inside aggregating portals like JetSpeed, GridSphere or uPortal

Page 72: Grid System Issues MSI-CI 2  Meeting

WS-I Interoperability

Critical underpinning of Grids and Web Services is the gradually growing set of specifications in the Web Service Interoperability Profiles

Web Services Interoperability (WS-I) Interoperability Profile 1.0a (http://www.ws-i.org) gives us XSD, WSDL1.1, SOAP1.1, UDDI in basic profile and parts of WS-Security in their first security profile.

We imagine the “60 Specifications” being checked out and evolved in the cauldron of the real world and occasionally best practice identifies a new specification to be added to WS-I which gradually increases in scope

• Note only 4.5 out of 60 specifications have “made it” in this definition

Page 73: Grid System Issues MSI-CI 2  Meeting

[Figure: Grid of Grids of services exchanging SOAP Messages. Legend: SS = Sensor Service, FS = Filter Service, OS = Other Service, MD = MetaData. Other elements: Database, Portal, Another Service, Another Grid. Data flows from Raw Data to Data to Information to Knowledge to Wisdom to Decisions]

Page 74: Grid System Issues MSI-CI 2  Meeting

Semantic Grid and Services

Implications of SOA (Service Oriented Architectures) for SG (Semantic Grid)

• Build services to implement SG

Implications of SG for SOA

• Build metadata rich systems of services using SG

Services receive data in SOAP messages, manipulate it and produce transformed data as further messages

Meta-data is carried in SOAP messages

Meta-data controls processing and transport of SOAP Messages

Knowledge is created from data by services

The Grid enhances Web services with semantically rich system and application specific management

One must exploit and work around the different approaches to meta-data and their manipulation in Web Services

Page 75: Grid System Issues MSI-CI 2  Meeting

Structure of SOAP Messages

SOAP Messages have System information in the header including WS-Policy based meta-data defining processing options

• Processed by Handlers

Application data and meta-data is the body (controversies here!)

• Processed by the Service itself

Some meta-data like WS-RF is logically “only in messages”

Other like that in WS-Context or the SRB are stored in logical equivalent of XML databases

We only need to preserve semantic structure (XML/SOAP Infoset) so transport in fast XML and store in efficient relational databases

[Figure: message with headers H1 to H4 and Body passing through Container Handlers F1 to F4 to the Service; Container Workflow]
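The division of labour between handlers and the service can be sketched as a simple chain. Everything here is a toy assumption: the header fields (`token`, `encoding`), the two handlers and the `container` function are made up for illustration and do not correspond to any particular SOAP container API.

```python
def security_handler(header, body):
    # Hypothetical check driven only by header meta-data.
    if header.get("token") != "secret":
        raise PermissionError("bad token")
    return body


def decode_handler(header, body):
    # Toy "encoding" for illustration: the header says the body was reversed.
    if header.get("encoding") == "reversed":
        return body[::-1]
    return body


def service(body):
    # The service sees only the application data.
    return body.upper()


def container(message, handlers, service):
    """Run each handler using the header, then pass the body to the service."""
    body = message["body"]
    for handler in handlers:
        body = handler(message["header"], body)
    return service(body)


reply = container(
    {"header": {"token": "secret", "encoding": "reversed"}, "body": "olleh"},
    [security_handler, decode_handler],
    service,
)
```

Adding a capability such as logging or reliability means adding a handler to the list; the service code is unchanged.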

Page 76: Grid System Issues MSI-CI 2  Meeting

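Support for Messages

Optimize XML representation and transport protocol

[Figure: StdXML passes through Filter1 to give XML’ and through Filter2 to give XML’’; after transport the inverse filters Filter2⁻¹ and Filter1⁻¹ restore XML’ and StdXML. A Database (WS-Context) is used to Choose Invertible Filter and Choose Protocol. Filters Preserve Infoset]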
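A minimal sketch of a pair of invertible filters in Python. The filters chosen here (UTF-8 encoding and `zlib` compression) are stand-ins picked for illustration, not Fast Infoset or the filters used in the work described, and `same_infoset` is a deliberately crude comparison.

```python
import zlib
import xml.etree.ElementTree as ET


def filter1(xml_text):
    return xml_text.encode("utf-8")            # StdXML -> XML'


def filter1_inverse(data):
    return data.decode("utf-8")                # XML' -> StdXML


def filter2(data):
    return zlib.compress(data)                 # XML' -> XML''


def filter2_inverse(data):
    return zlib.decompress(data)               # XML'' -> XML'


def same_infoset(a, b):
    """Crude Infoset check: same tags, attributes, text and children."""
    if (a.tag, a.attrib, (a.text or "").strip()) != (b.tag, b.attrib, (b.text or "").strip()):
        return False
    return len(a) == len(b) and all(same_infoset(x, y) for x, y in zip(a, b))


std_xml = "<reading sensor='s1'><value>42</value><value>43</value></reading>"
on_the_wire = filter2(filter1(std_xml))                        # what is transported
restored = filter1_inverse(filter2_inverse(on_the_wire))       # what the receiver sees
```

Because each filter has an exact inverse, the receiver recovers a document with the same Infoset, whatever representation was used on the wire.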

Page 77: Grid System Issues MSI-CI 2  Meeting

FI (Fast Infoset=Binary XML) v Traditional XML Messages

[Figure: Transfer Time Comparison. x-axis: # Of Features Per Message; y-axis: Time (ms), 0 to 700; series: Transfer - FI, Transfer - XML]

Page 78: Grid System Issues MSI-CI 2  Meeting

PDA to Web Service Optimized Communication

[Figure: x-axis: Number Of Messages Per Session, 0 to 35; y-axis: Total Session Time (sec), 0 to 140; series: HHFR: 16 String Per Message, SOAP: 16 String Per Message]

Page 79: Grid System Issues MSI-CI 2  Meeting

Requirements for MPI Messaging

MPI and SOAP Messaging both send data from a source to a destination

• MPI supports multicast (broadcast) communication;

• MPI specifies destination and a context (in comm parameter)

• MPI specifies data to send

• MPI has a tag to allow flexibility in processing in source processor

• MPI has calls to understand context (number of processors etc.)

MPI requires very low latency and high bandwidth so that tcomm/tcalc is at most 10

• BlueGene/L has bandwidth between 0.25 and 3 Gigabytes/sec/node and latency of about 5 microseconds

• Latency determined so Message Size/Bandwidth > Latency

[Figure: timeline alternating tcalc, tcomm, tcalc]
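The BlueGene/L numbers can be turned into a break-even message size. The sketch below reads "Message Size/Bandwidth > Latency" as: a message should be large enough that its transfer time exceeds the fixed latency. That reading, the function names, and taking a Gigabyte as 10^9 bytes are assumptions. Units are microseconds and bytes per microsecond so the arithmetic stays in integers.

```python
def break_even_message_bytes(latency_us, bandwidth_bytes_per_us):
    """Message size at which transfer time equals the fixed latency."""
    return latency_us * bandwidth_bytes_per_us


def transfer_time_us(message_bytes, latency_us, bandwidth_bytes_per_us):
    """Simple cost model: fixed latency plus size divided by bandwidth."""
    return latency_us + message_bytes / bandwidth_bytes_per_us


LATENCY_US = 5           # about 5 microseconds
LOW_BANDWIDTH = 250      # 0.25 Gigabytes/sec = 250 bytes per microsecond
HIGH_BANDWIDTH = 3000    # 3 Gigabytes/sec = 3000 bytes per microsecond

small = break_even_message_bytes(LATENCY_US, LOW_BANDWIDTH)
large = break_even_message_bytes(LATENCY_US, HIGH_BANDWIDTH)
```

Under these assumptions messages below roughly 1 to 15 kilobytes are dominated by latency rather than bandwidth.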

Page 80: Grid System Issues MSI-CI 2  Meeting

Requirements for SOAP Messaging

Web Services have many of the same requirements as MPI with two differences where MPI is more stringent than SOAP

• Latencies are inevitably 1 (local) to 100 milliseconds which is 200 to 20,000 times that of BlueGene/L

1) 0.000001 ms – CPU does a calculation

2) 0.001 to 0.01 ms – MPI latency

3) 1 to 10 ms – wake-up a thread or process

4) 10 to 1000 ms – Internet delay

• Bandwidths for many business applications are low as one just needs to send enough information for ATM and Bank to define transactions

SOAP has MUCH greater flexibility in areas like security, fault-tolerance, “virtualizing addressing” because one can run a lot of software in 100 milliseconds

• Typically takes 1-3 milliseconds to gobble up a modest message in Java and “add value”
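The ratios quoted above follow from the BlueGene/L latency of about 5 microseconds given on the MPI slide. A short check in Python, with hypothetical function names; the second function is a rough illustration of how many 1-3 millisecond processing steps fit into a 100 millisecond budget.

```python
BLUEGENE_LATENCY_US = 5     # about 5 microseconds, from the MPI requirements slide


def latency_ratio(soap_latency_ms, mpi_latency_us=BLUEGENE_LATENCY_US):
    """How many times larger a SOAP latency is than the MPI latency."""
    return (soap_latency_ms * 1000) // mpi_latency_us


def processing_steps(budget_ms, cost_per_step_ms):
    """How many message-processing steps fit in a latency budget."""
    return budget_ms // cost_per_step_ms


local_ratio = latency_ratio(1)       # local Web Service call, 1 ms
remote_ratio = latency_ratio(100)    # distant Web Service call, 100 ms
```

So a 100 millisecond interaction leaves room for tens of handler steps at 1-3 milliseconds each, which is where the extra flexibility comes from.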

Page 81: Grid System Issues MSI-CI 2  Meeting

Structure of SOAP

SOAP defines a very obvious message structure with a header and a body just like email

The header contains information used by the “Internet operating system”

• Destination, Source, Routing, Context, Sequence Number …

The message body is partly further information used by the operating system and partly information for application when it is not looked at by “operating system” except to encrypt, compress it etc.

• Note WS-Security supports separate encryption for different parts of a document

Much discussion in field revolves around what is referenced in header

This structure makes it possible to define VERY Sophisticated messaging

Page 82: Grid System Issues MSI-CI 2  Meeting

MPI and SOAP Integration

Note SOAP specifies format and, through WSDL, interfaces

MPI only specifies interface and so interoperability between different MPIs requires additional work

• IMPI http://impi.nist.gov/IMPI/

Pervasive networks can support high bandwidth (Terabits/sec soon) but latency issue is not resolvable in general way

Can combine MPI interfaces with SOAP messaging but I don’t think this has been done

Just as walking, cars, planes, phones coexist with different properties, so SOAP and MPI are both good and should be used where appropriate

Page 83: Grid System Issues MSI-CI 2  Meeting

When is a High Performance Computer?

We might wish to consider three classes of multi-node computers

1) Classic MPP with microsecond latency and scalable internode bandwidth (tcomm/tcalc ~ 10 or so)

2) Classic Cluster which can vary from configurations like 1) to 3) but typically have millisecond latency and modest bandwidth

3) Classic Grid or distributed systems of computers around the network

• Latencies of inter-node communication – 100’s of milliseconds but can have good bandwidth

All have same peak CPU performance but synchronization costs increase as one goes from 1) to 3)

Cost of system (dollars per gigaflop) decreases by factors of 2 at each step from 1) to 2) to 3)

One should NOT use classic MPP if class 2) or 3) suffices unless some security or data issues dominate over cost-performance

One should not use a Grid as a true parallel computer – it can link parallel computers together for convenient access etc.

Page 84: Grid System Issues MSI-CI 2  Meeting

Linking Modules

From method based to RPC to message based to event-based publish-subscribe Message Oriented Middleware

[Figure: three ways of linking. Module A and Module B linked by Method Calls, .001 to 1 millisecond (closely coupled Java/Python …). Service A and Service B linked by Messages, 0.1 to 1000 millisecond latency (Coarse Grain Service Model). Service A as Publisher (Post Events) and Service B as “Listener” (Subscribe to Events) linked through a Message Queue in the Sky]
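The last step of that progression, event-based publish-subscribe, can be sketched with a toy in-process broker. The `Broker` class and topic name are hypothetical; a real Message Oriented Middleware system would add queuing, persistence and network transport.

```python
from collections import defaultdict


class Broker:
    """Toy 'message queue in the sky': publishers and listeners never meet."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, listener):
        self._subscribers[topic].append(listener)

    def publish(self, topic, event):
        # Deliver the event to every listener on the topic; return how many got it.
        listeners = self._subscribers[topic]
        for listener in listeners:
            listener(event)
        return len(listeners)


broker = Broker()
received = []
broker.subscribe("sensor/flood", received.append)     # Service B: the "Listener"
delivered = broker.publish("sensor/flood", {"level": 3})   # Service A: the Publisher
```

The publisher knows only the topic, so listeners can be added or removed without touching Service A.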

Page 85: Grid System Issues MSI-CI 2  Meeting

What is a Simple Service?

Take any system – it has multiple functionalities

• We can implement each functionality as an independent distributed service

• Or we can bundle multiple functionalities in a single service

Whether functionality is an independent service or one of many method calls into a “glob of software”, we can always make them as Web services by converting interface to WSDL

Simple services are gotten by taking functionalities and making as small as possible subject to “rule of millisecond”

• Distributed services incur messaging overhead of one (local) to 100’s (far apart) of milliseconds to use message rather than method call

• Use scripting or compiled integration of functionalities ONLY when require <1 millisecond interaction latency

Apache web site has many (pre Web Service) projects that are multiple functionalities presented as (Java) globs and NOT (Java) Simple Services

• Makes it hard to integrate sharing common security, user profile, file access .. services
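The “rule of millisecond” amounts to a one-line decision. The function name and the returned labels below are illustrative assumptions; the 1 millisecond threshold is the one stated on the slide.

```python
def choose_linkage(required_latency_ms):
    """Pick how two functionalities should be linked under the rule of the millisecond."""
    if required_latency_ms < 1:
        # Too tight for messaging: integrate by method call inside one service.
        return "method call"
    # Messaging overhead of a millisecond or more is affordable: keep services separate.
    return "message between simple services"


tight = choose_linkage(0.01)     # e.g. an inner loop calling a library routine
loose = choose_linkage(50)       # e.g. a portal asking a remote service for a result
```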

Page 86: Grid System Issues MSI-CI 2  Meeting

Grids of Grids of Simple Services

• Link via methods, messages, streams

• Services and Grids are linked by messages

• Internally to service, functionalities are linked by methods

• A simple service is the smallest Grid

• We are familiar with method-linked hierarchy: Lines of Code, Methods, Objects, Programs, Packages

[Figure: Overlay and Compose Grids of Grids. Methods, Services, Component Grids; CPUs, Clusters, MPPs, Compute Resource Grids; Databases, Federated Databases, Sensor, Sensor Nets, Data Resource Grids]

Page 87: Grid System Issues MSI-CI 2  Meeting

Component Grids?

So we build collections of Web Services which we package as component Grids

• Visualization Grid

• Sensor Grid

• Utility Computing Grid

• Collaboration Grid

• Earthquake Simulation Grid

• Control Room Grid

• Crisis Management Grid

• Drug Discovery Grid

• Bioinformatics Sequence Analysis Grid

• Intelligence Data-mining Grid

We build bigger Grids by composing component Grids using the Service Internet

Page 88: Grid System Issues MSI-CI 2  Meeting

Typical use of Grid Messaging in NASA

[Figure: Datamining Grid, Sensor Grid implementing using NB, GIS Grid, linked by NB]

Page 89: Grid System Issues MSI-CI 2  Meeting

[Figure: layered Grid of Grids. Physical Network (monitored by FS16). Core Low Level Grid Services: 3: Messaging, 4: Notification, 5: Workflow, 6: Security, 7: Discovery, 8: Metadata, 9: Management, 10: Policy, 18: Scheduling. Services: 11: Portals, 12: Computing, 13: Data Access/Storage, 14: Information Instrument/Sensor, 17: Collaboration. 15: Application Services: Screening Tools, Quantum Calculations; Sequencing Tools, Biocomplexity Simulations. Domain Specific Grids/Services: BioInformatics Grid, Chemical Informatics Grid, …]

Using the Grid of Grids and Core Services to build multiple application grids re-using common components.

Page 90: Grid System Issues MSI-CI 2  Meeting

Critical Infrastructure (CI) Grids built as Grids of Grids

[Figure: Flood CIGrid, Gas CIGrid, … Electricity CIGrid …, with Flood Services and Filters and Gas Services and Filters, built on Portals, Visualization Grid, Collaboration Grid, Sensor Grid, Compute Grid, GIS Grid, and on Core Grid Services: Registry, Metadata, Data Access/Storage, Security, Workflow, Notification, Messaging, over the Physical Network]

Page 91: Grid System Issues MSI-CI 2  Meeting

Mediation and Transformation in a Grid of Grids and Simple Services

[Figure: three boxes, each a Subgrid or service with Ports and Internal Interfaces, joined through Messaging by Mediation and Transformation Services; External facing Interfaces]

Page 92: Grid System Issues MSI-CI 2  Meeting

Why can we build better software?

In 1962 I was punching holes in cards and paper tape to persuade tiny slow computers to manipulate words in memory to string together instructions like a = b + c

Now computers are much faster and languages are better but not a lot better

• I suspect I would only be a factor of 2 or so faster programming the same program today

However A B C can now be resources (Bank records, Drugs, Games, Supernova) and + can be a service composition

• Objects were insufficient as they distributed ordinary programs; services express distributed independent entities (communication time very different inter and intra computers)

• Services are essential for reliable modular programming

Page 93: Grid System Issues MSI-CI 2  Meeting

What’s wrong with old programs

They were made of instructions, methods, subroutines and libraries thereof

Languages (Java, C++) encouraged spaghetti programming that linked parts of programs together

• This leads to efficient but unmaintainable software

However now computers and networks are several orders of magnitude faster

• Optimize for modularity and maintainability and rarely if ever optimize for performance

Old programs have the wrong optimization and by construction are hard to maintain/change

Page 94: Grid System Issues MSI-CI 2  Meeting

Old and New Software Regime

Web Services, Grids and P2P systems are built with

• The new software model: independent entities connected by explicit messages

All computer entities are actually connected by some form of message (traveling on bus or from memory to register) but often implicit

• And they support the distributed services and resources needed for global science, fun and business

• Google, Amazon, Yahoo and perhaps Microsoft and Electronic Arts can exploit this model

Old programs have the old architecture and cannot be modified

• At best can wrap partial functionalities as services and use as a black box

• IBM, Oracle and the old Enterprise software companies have this noose around their necks

Page 95: Grid System Issues MSI-CI 2  Meeting

Delicious Applications

http://del.icio.us purchased by Yahoo for ~$30M

http://www.CiteULike.org

http://www.connotea.org (Nature)

http://www.bibsonomy.org/

• Associate metadata with Bookmarks specified by URL’s, DOI’s (Digital Object Identifiers)

• Users add comments and keywords (called tags)

• Users are linked together into groups (communities)

• Information such as title and authors extracted automatically from some sites (PubMed, ACM, IEEE, Wiley etc.)

• Bibtex like additional information

This is de facto Semantic Web – remarkable for its simplicity
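The simplicity is visible in how little data structure the tagging model needs. The sketch below is an assumption about the general shape of such a service, not the schema of del.icio.us, CiteULike or Connotea; the users and URLs are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    uri: str                                   # URL or DOI identifying the resource
    user: str
    tags: set = field(default_factory=set)     # keywords chosen freely by the user
    comment: str = ""


def uris_with_tag(bookmarks, tag):
    """All resources that any user labelled with the tag."""
    return [b.uri for b in bookmarks if tag in b.tags]


def users_sharing_tag(bookmarks, tag):
    """Users who used the tag: a starting point for forming a community."""
    return sorted({b.user for b in bookmarks if tag in b.tags})


bookmarks = [
    Bookmark("http://example.org/paper-1", "alice", {"grid", "workflow"}, "good overview"),
    Bookmark("doi:10.1000/182", "bob", {"grid"}),
    Bookmark("http://example.org/paper-2", "bob", {"chemistry"}),
]
grid_uris = uris_with_tag(bookmarks, "grid")
grid_users = users_sharing_tag(bookmarks, "grid")
```

There is no ontology and no schema beyond URI, user, tags and comment, yet shared tags already link resources and people.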

Page 96: Grid System Issues MSI-CI 2  Meeting


Connotea

Page 97: Grid System Issues MSI-CI 2  Meeting


Connotea queried by SERVOGrid

Page 98: Grid System Issues MSI-CI 2  Meeting

Provenance and Delicious ????

???? is any field such as chemistry

All ???? Data should be associated with provenance that describes its lineage

• How and when it was created

• Compiler options used in simulation

• ????XMLfrontendedDatabase query used on what ????GridNodes

Provenance produced by computer automatically and/or by user

All ????Data can and should be labeled by a URI such as cicc://ciccnodenumber.xx.yy.whathaveyou

We can use del.icio.us style interface to annotate ????Data with missing provenance and user comments of any type (describing quality of data or a keyword relating different data etc.)
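A provenance record of this kind might look like the following sketch. The field names, the example values and the `annotate` method are assumptions for illustration; the URI is the placeholder quoted above, kept as written.

```python
from dataclasses import dataclass, field


@dataclass
class Provenance:
    created_by: str                  # program or person that produced the data
    created_at: str                  # when it was created (ISO 8601 string)
    compiler_options: str = ""       # options used to build the simulation
    query: str = ""                  # database query that selected the inputs


@dataclass
class DataItem:
    uri: str
    provenance: Provenance
    annotations: list = field(default_factory=list)    # (user, note) pairs

    def annotate(self, user, note):
        """del.icio.us style annotation: any user may add a tag or comment."""
        self.annotations.append((user, note))


item = DataItem(
    uri="cicc://ciccnodenumber.xx.yy.whathaveyou",      # placeholder URI from the slide
    provenance=Provenance(
        created_by="simulation",                        # hypothetical values
        created_at="2006-06-29T00:00:00Z",
        compiler_options="-O2",
    ),
)
item.annotate("alice", "quality: good")
```

The machine-generated part (`Provenance`) and the user-supplied part (`annotations`) hang off the same URI, which is what lets a tagging interface fill in missing provenance later.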

Page 99: Grid System Issues MSI-CI 2  Meeting

9999

Semantic Scholar Grid
Citeseer and Google Scholar scour the Internet and analyze documents for incidental metadata
• Title, author and institution of documents
• Citations with their own metadata allowing one to match to other documents
These capabilities are sure to become more powerful and to be extended
• Give “Citation Index” in real time
• Tell you all authors of all papers that cite a paper that cites you etc. (Note it’s a small world so don’t go too far in link analysis)
• Tell you all citations of all papers in a workshop
Such high value tools will appear on “publisher” sites of future (or else publishers will disappear)
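The link-analysis queries listed on this slide can be illustrated with a toy in-memory citation graph. This is a sketch, not how Citeseer or Google Scholar work internally; the class, its methods, and the paper ids are all hypothetical. The `hops` limit reflects the slide's warning not to go too far in link analysis.

```python
from collections import defaultdict


class CitationIndex:
    """Toy citation graph; paper ids and author names are illustrative only."""

    def __init__(self):
        self.authors = {}                     # paper id -> list of authors
        self.cited_by = defaultdict(set)      # paper id -> ids of papers citing it

    def add_paper(self, paper_id, authors, cites=()):
        self.authors[paper_id] = list(authors)
        for target in cites:
            self.cited_by[target].add(paper_id)

    def citation_count(self, paper_id):
        """A simple 'citation index': number of direct citations."""
        return len(self.cited_by.get(paper_id, ()))

    def citers_within(self, paper_id, hops):
        """Papers that reach `paper_id` through at most `hops` citation links."""
        seen, frontier = set(), {paper_id}
        for _ in range(hops):
            reached = {c for p in frontier for c in self.cited_by.get(p, ())}
            frontier = reached - seen - {paper_id}
            seen |= frontier
        return seen

    def authors_citing(self, paper_id, hops=2):
        """Authors of every paper within `hops` citation links of `paper_id`."""
        papers = self.citers_within(paper_id, hops)
        return {a for p in papers for a in self.authors.get(p, ())}
```

With `hops=2` the last method returns the authors of papers that cite you together with the authors of papers that cite those papers.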

Page 100: Grid System Issues MSI-CI 2  Meeting


OSCAR2 Chemistry Document analysis
It detects “magic” chemical strings in text and then
• Stores them as metadata associated with document
• Queries ChemInformatics repositories to tell you lots of information about identified compounds
• Tells you which other documents have this compound
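The detect, store, and cross-reference steps above can be sketched with a toy formula matcher. The slide does not describe OSCAR's actual recognition method, so the regular expression, the small element list, and the function names below are stand-in assumptions, far cruder than a real chemistry text miner.

```python
import re
from collections import defaultdict

# Small illustrative subset of element symbols; a real tool would use the full table.
ELEMENTS = {"H", "C", "N", "O", "S", "P", "Na", "Cl", "K", "Ca", "Fe"}

# Two or more (symbol, optional count) groups, e.g. H2O, NaCl, C6H12O6.
CANDIDATE = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def find_formulas(text):
    """Return formula-like strings whose symbols are all in ELEMENTS."""
    found = set()
    for match in CANDIDATE.finditer(text):
        candidate = match.group(0)
        symbols = [symbol for symbol, _count in TOKEN.findall(candidate)]
        if all(symbol in ELEMENTS for symbol in symbols):
            found.add(candidate)
    return found


def build_index(documents):
    """Map each detected formula to the ids of the documents that mention it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for formula in find_formulas(text):
            index[formula].add(doc_id)
    return index
```

The index answers the "which other documents have this compound" question; the repository lookup step is left out because the slide does not say which services are queried.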

Page 101: Grid System Issues MSI-CI 2  Meeting


???? Version of OSCAR
Some of the ????Nodes will store metadata associated with ????Data – including documents
• Note documents could be anywhere on the Internet – the ????Node may choose to store (a copy of) document or just its metadata
• Note all ????Nodes are federated i.e. there is no “one central” store of any type of data
Metadata will be user annotations including tags, Citeseer style citation information for all scientific fields
Then each scientific field has its own version of OSCAR tuned to extract natural metadata for science – for Earthquake science this is GML and Chemistry is CML …

Page 102: Grid System Issues MSI-CI 2  Meeting


Document-enhanced Research Grid
[Figure. Diagram labels:
Existing Document-based Research Tools (with Existing User Interface): Google Scholar, Manuscript Central, Science.gov, Windows Live Academic Search, Citeseer, CMT Conference Management, etc.
Web service Wrappers
New Document-enhanced Research Tools (with Integration/Enhancement User Interface)
Community Tools: CiteULike, Connotea, Del.icio.us, Bibsonomy, Biolicious
Generic Document Tools: MyResearch Database, Bibliographic Database, Export: RSS, Bibtex, Endnote etc.
PubChem, PubMed
Traditional Cyberinfrastructure]

Page 103: Grid System Issues MSI-CI 2  Meeting


Integration Framework of Tools
[Figure. Diagram labels:
SSG Domain-1 Web service … SSG Domain-N Web service
Tool-1 Del.icio.us, Tool-2 Connotea, Tool-3 CiteULike, Tool-N e.g. CiteSeer
Native UI-1, Native UI-4, Native UI-3, Native UI-N
Integrated User Interface UI
Gateway WS-1, Gateway WS-2, Gateway WS-3, Gateway WS-N
SSG MD Store]
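One way to read this diagram is that each tool sits behind its own gateway Web service, which normalizes the tool's records before they reach a shared metadata store. The sketch below illustrates that reading only. Every class name and every "native" record shape is an assumption made for the example; none of it reflects the real interfaces of Del.icio.us, Connotea, or the SSG services.

```python
from abc import ABC, abstractmethod


class Gateway(ABC):
    """Hypothetical per-tool gateway: maps a native record to a common form."""
    tool_name = "unknown"

    @abstractmethod
    def normalize(self, native):
        """Return a dict with a 'uri' string and a 'tags' set."""


class DeliciousStyleGateway(Gateway):
    tool_name = "Del.icio.us"

    def normalize(self, native):
        # Assumed native shape: {"href": ..., "tag": "space separated tags"}
        return {"uri": native["href"], "tags": set(native["tag"].split())}


class ConnoteaStyleGateway(Gateway):
    tool_name = "Connotea"

    def normalize(self, native):
        # Assumed native shape: {"link": ..., "subjects": [...]}
        return {"uri": native["link"], "tags": set(native["subjects"])}


class MetadataStore:
    """Stand-in for the SSG MD Store: merges tags per URI across tools."""

    def __init__(self):
        self.records = {}

    def ingest(self, gateway, native):
        item = gateway.normalize(native)
        record = self.records.setdefault(
            item["uri"], {"tags": set(), "sources": set()}
        )
        record["tags"] |= item["tags"]
        record["sources"].add(gateway.tool_name)
        return record
```

An integrated user interface would then read from the store rather than from each tool's native interface, and supporting another tool means writing one more gateway class.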