Introduction to Grids for CTS05 GlobalMMCS Tutorial
CTS05, St. Louis, May 17 2005
Geoffrey Fox, CTO Anabas Corporation and Computer Science, Informatics, Physics, Pervasive Technology Laboratories, Indiana University, Bloomington IN 47401
[email protected] http://www.infomall.org

Summary
This presentation describes Grids as they are being developed to support commercial enterprise and research (e-Science) applications
We explain the Web Service Grid architecture
We develop the Grid of Grids concept and suggest that one important subgrid is a Collaboration Grid
We contrast typical (today’s Grid) “compute/file” operation with the streaming data Grids needed to support both collaboration and sensor Grids
We discuss important component services in a Collaboration Grid
We contrast DoD’s Network Centric Computing and its GiG architecture with current Grid technologies
We show how to make application Web Services collaborative

Internet Scale Distributed Services
Grids use Internet technology and are distinguished by managing or organizing sets of network-connected resources
• The classic Web allows independent one-to-one access to individual resources
• Grids integrate and manage multiple Internet-connected resources: people, sensors, computers, data systems
Organization can be explicit, as in
• TeraGrid, which federates many supercomputers
• Deep Web Technologies IR Grid, which federates multiple data resources
• CrisisGrid, which federates first responders, commanders, sensors, GIS, (Tsunami) simulations, science/public data
Organization can be implicit, as in Internet resources such as curated databases and simulation resources that “harmonize a community”

Different Visions of the Grid
Grid just refers to the technologies
• Or Grids represent the full system/applications
DoD’s vision of Network Centric Computing can be considered a Grid (linking sensors, warfighters, commanders, backend resources), and DoD is building the GIG (Global Information Grid)
Utility Computing or X-on-demand (X = data, computer ..) is the major computer industry interest in Grids, and this is a key part of enterprise or campus Grids
e-Science or Cyberinfrastructure are virtual organization Grids supporting global distributed science (note that sensors, instruments and people are all distributed)
The Skype (Kazaa) VOIP system is a Peer-to-peer Grid (and VRVS/GlobalMMCS-like Internet A/V conferencing systems are Collaboration Grids)
Commercial 3G cell phones and the DoD ad-hoc network initiative are forming mobile Grids

e-moreorlessanything and the Grid
e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.
• The growing use of outsourcing is one example
e-Science is the similar vision for scientific research, with international participation in large accelerators, satellites or distributed gene analyses.
The Grid integrates the best of the Web, traditional enterprise software, high performance computing and Peer-to-peer systems to provide the information technology e-infrastructure for e-moreorlessanything.
A deluge of data of unprecedented and inevitable size must be managed and understood.
People, computers, data and instruments must be linked.
On-demand assignment of experts, computers, networks and storage resources must be supported.

e-Defense and e-Crisis
Grids support Command and Control and provide Global Situational Awareness
• Link commanders and frontline troops to each other and to archival and real-time data; link to what-if simulations
• Dynamic heterogeneous wired and wireless networks
• Security and fault tolerance essential
System of Systems; Grid of Grids
• The command and information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid
• Grids must be heterogeneous and federated
Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support

Some Important Styles of Grids
Computational Grids were the origin of the concepts and link computers across the globe – high latency stops this from being used as a parallel machine
• Typically Compute/File Grids where information (messages) is exchanged by writing and reading files
Knowledge and Information Grids link sensors and information repositories as in Virtual Observatories or BioInformatics
Education Grids link teachers, learners, parents as a VO with learning tools, distant lectures etc.
e-Science Grids link multidisciplinary researchers across laboratories and universities
Community Grids focus on Grids involving large numbers of peers rather than on linking major resources – links Grid and Peer-to-peer network concepts
Semantic Grid links the Grid and AI communities with Semantic Web (ontology/meta-data enriched resources) and Agent concepts
Collaboration Grids support the linkage of multiple people and electronic resources (often peer-to-peer architecture)

Types of Computing Grids
Running “Pleasingly Parallel Jobs” as in United Devices, Entropia (Desktop Grid) “cycle stealing systems”
Can be managed (“inside” the enterprise as in Condor) or more informal (as in SETI@Home)
Computing-on-demand in industry, where the jobs spawned are perhaps very large (SAP, Oracle …)
Support distributed file systems as in Legion (Avaki), Globus with (web-enhanced) UNIX programming paradigm
• Particle Physics will run some 30,000 simultaneous jobs
Linking supercomputers as in TeraGrid
Pipelined applications linking data/instruments, compute, visualization
Seamless Access, where Grid portals allow one to choose one of multiple resources with a common interface

Utility and Service Computing
An important business application of Grids is believed to be utility computing
Namely, support a pool of computers to be assigned as needed to take up extra demand
• Pool shared between multiple applications
Natural architecture is not a cluster of computers connected to each other but rather a “Farm of Grid Services” connected to the Internet and supporting services such as
• Web Servers
• Financial Modeling
• Run SAP
• Data-mining
• Simulation response to a crisis like a forest fire or earthquake
• Media Servers for Video-over-IP
Note classic Supercomputer use is to allow full access to do “anything” via ssh etc.
• In the service model, one pre-configures services for all programs and you access a portal to run a job, with fewer security issues

Information/Knowledge Grids
Distributed data sources, 10’s to 1000’s of them (instruments, file systems, curated databases …)
Data Deluge: 1 (now) to 100’s of petabytes/year (2012)
• Moore’s law for Sensors
Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data
Needs decision support front end with “what-if” simulations
Metadata (provenance) critical to annotate data
Integrate across experiments as in multi-wavelength astronomy
Data Deluge comes from pixels/year available

[Figure: Grid of Grids: Research Grid and Education Grid. Labels: SERVOGrid, Education Grid, Computer Farm, Databases, Repositories / Federated Databases, Sensors, Streaming Data, Field Trip Data, Data Filter Services, Research Simulations, Analysis and Visualization Portal, Discovery Services, Customization Services (From Research to Education)]

Sources of Grid Technology
Grids support distributed collaboratories or virtual organizations, integrating concepts from
• The Web
• Agents
• Distributed Objects (CORBA, Java/Jini, COM)
• Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
• Peer-to-peer Networks
With perhaps the Web and P2P networks being the most important for “Information Grids” and Globus for “Compute/File Grids”

The Essence of Grid Technology?
We will start from the Web view and assert that the basic paradigm is
• Meta-data rich Web Services communicating via messages
These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3/4 (Globus Toolkit 3/4)
• These are the distributed equivalent of operating system functions as in UNIX Shell
• Called Hosting Environment or platform
W3C standard WSDL defines IDL (Interface standard) for Web Services

Meta-data
Meta-data is usually thought of as “data about data”
The Semantic Web is at its simplest considered as adding meta-data to web pages
For example, a hospital web page has meta-data giving its location, phone number and specialties, which can be used to automate Google-style searches and so allow planning of disease/accident treatment from the web
Modern trend (Semantic Grid) is meta-data about web services, e.g. specifying details of interface and usage
• Such as that a bioinformatics service is free or that its input bandwidth is limited
Provenance – history and ownership – of data is very important

A typical Web Service
In principle, services can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be method calls, Java RMI messages, CGI Web invocations, or totally compiled away (inlining)
The simplest implementations involve XML messages (SOAP) and programs written in net-friendly languages like Java and Python
[Figure: Web Services connected through WSDL interfaces. Labels: Payment (Credit Card), Warehouse, Shipping control, Security, Catalog, Portal Service]

Typical Grid Architecture
[Figure: Typical Grid Architecture. Labels: Raw (HPC) Resources, Middleware, Database, Portal Services, System Services, Application Service, User Services, “Core” Grid. Each blob is a computer program!]

Classic Grid Architecture
[Figure: Classic Grid Architecture. Labels: Clients (Users and Devices); Middle Tier (Brokers, Service Providers; Security, Collaboration, Composition, Content Access); Resources (Databases, Netsolve, Computing). Middle Tier becomes Web Services]

Peer to Peer Grid
[Figure: Peer to Peer Grid, a democratic organization. Labels: Peers, Databases, Event/Message Brokers, User Facing Web Service Interfaces, Service Facing Web Service Interfaces]

What is Happening?
Grid ideas are being developed in (at least) four communities
• Web Service – W3C, OASIS, (DMTF)
• Grid Forum (High Performance Computing, e-Science)
• Enterprise Grid Alliance (Commercial “Grid Forum” with a near term focus)
Service Standards are being debated
Grid Operational Infrastructure is being deployed
Grid Architecture and core software being developed
• Apache has several important projects, as do academia and large and small companies
Particular System Services are being developed “centrally” – OGSA framework for this in GGF; WS-* for OASIS/W3C/Microsoft-IBM
Lots of fields are setting domain specific standards and building domain specific services
USA started but now Europe is probably in the lead and Asia will soon catch USA if momentum (roughly zero for USA) continues

Technical Activities of Note
Look at different styles of Grids such as Autonomic (Robust Reliable Resilient)
New Grid architectures hard due to investment required
Program the Grid – Workflow
Access the Grid – Portals, Grid Computing Environments
Critical Services such as
• Security – build message based not connection based
• Notification – event services
• Metadata – Use Semantic Web, provenance
• Fabric and Service Management
• Databases and repositories – instruments, sensors
• Computing – Submit job, scheduling, distributed file systems
• Visualization, Computational Steering
• Network performance
[Figure labels: Low Level: WS-*; High Level: e.g. OGSA]

Web services
Web Services build loosely-coupled, distributed applications (wrapping existing codes and databases) based on SOA (service oriented architecture) principles.
Web Services interact by exchanging messages in SOAP format.
The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.
[Figure: layers of a Web Service. Labels: resources (Databases, Humans, Programs, Computational resources, Devices); service logic (BPEL, Java, .NET); message processing (SOAP and WSDL); SOAP messages of the form <env:Envelope> <env:Header> ... </env:Header> <env:Body> ... </env:Body> </env:Envelope>]
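
As a minimal sketch of the envelope shape shown on this slide, the Python below builds and then parses a SOAP message with a Header and a Body. It assumes the SOAP 1.1 envelope namespace (the slide shows only the `env:` prefix), and the `MessageID` header entry and `SubmitJob` payload are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: SOAP 1.1 envelope namespace; the slide only shows the "env:" prefix.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("env", SOAP_ENV)


def build_envelope(header_fields, body_element):
    """Return a SOAP envelope (Header plus Body) serialized as a string."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    for name, value in header_fields.items():
        # Hypothetical un-namespaced header entries, for illustration only.
        ET.SubElement(header, name).text = value
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(body_element)
    return ET.tostring(envelope, encoding="unicode")


def parse_envelope(xml_text):
    """Split an envelope into (header dict, first child element of the Body)."""
    root = ET.fromstring(xml_text)
    header = root.find(f"{{{SOAP_ENV}}}Header")
    body = root.find(f"{{{SOAP_ENV}}}Body")
    fields = {child.tag: child.text for child in header}
    return fields, body[0]


# Hypothetical application payload carried in the Body.
request = ET.Element("SubmitJob")
request.set("name", "demo")

message = build_envelope({"MessageID": "msg-1"}, request)
fields, payload = parse_envelope(message)
```

The service logic only ever sees `payload`; everything the messaging layer needs travels in `fields`, which is the separation the slide's layers suggest.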

Philosophy of Web Service Grids
Much of Distributed Computing was built by natural extensions of computing models developed for sequential machines
This leads to the distributed object (DO) model represented by Java and CORBA
• RPC (Remote Procedure Call) or RMI (Remote Method Invocation) for Java
Key people think this is not a good idea as it scales badly and ties distributed entities together too tightly
• Distributed Objects replaced by Services
Note CORBA was considered too complicated in both organization and proposed infrastructure
• and Java was considered as “tightly coupled to Sun”
• So there were other reasons to discard
Thus replace distributed objects by services connected by “one-way” messages and not by request-response messages

Plethora of Standards
Java is very powerful partly due to its many “frameworks” that generalize libraries, e.g.
• Java Media Framework
• Java Database Connectivity JDBC
Web Services have a corresponding collection of specifications that represent critical features of the distributed operating systems for “Grids of Simple Services”
• About 60 WS-* specifications introduced in last 2-3 years
• These are low level, with higher level standards such as access database (OGSA-DAI) or “Submit a job” built on top of these
Many battles both between standards bodies and between companies as each tries to set the standards it considers best; thus there are multiple standards for many of the key Web Service functionalities
Microsoft is a key player and stands to benefit as Web Services open up the enterprise software space to all participants
• e.g. MQSeries (IBM) and Tibco have to change their messaging systems to support new open standards

WS-I Interoperability
Critical underpinning of Grids and Web Services is the gradually growing set of specifications in the Web Service Interoperability Profiles
Web Services Interoperability (WS-I) Interoperability Profile 1.0a (http://www.ws-i.org) gives us XSD, WSDL 1.1, SOAP 1.1 and UDDI in the basic profile and parts of WS-Security in the first security profile.
We imagine the “60 Specifications” being checked out and evolved in the cauldron of the real world; occasionally best practice identifies a new specification to be added to WS-I, which gradually increases in scope
• Note only 4.5 out of 60 specifications have “made it” in this
[Figure labels: Higher Level Services; Service Context; Service Interfaces (WSDL); Service Discovery (UDDI) / Information; Service Internet; Transport Protocol]

WS-* implies the Service Internet
We have the classic (CISCO, Juniper ….) Internet routing the flood of ordinary packets in OSI stack architecture
Web Services build the “Service Internet” or IOI (Internet on Internet) with
• Routing via WS-Addressing not IP header
• Fault Tolerance (WS-RM not TCP)
• Security (WS-Security/SecureConversation not IPSec/SSL)
• Data Transmission by WS-Transfer not HTTP
• Information Services (UDDI/WS-Context not DNS/Configuration files)
• At message/web service level and not packet/IP address level
Software-based Service Internet possible as computers “fast”
Familiar from Peer-to-peer networks and built as a software overlay network defining Grid (analogy is VPN)
SOAP Header contains all information needed for the “Service Internet” (Grid Operating System) with SOAP Body containing information for Grid application service
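
The header/body split can be sketched as a toy overlay that routes on a logical service address found in the message header and never inspects the body. This is an illustration only, not WS-Addressing itself: the `ServiceInternet` class, the dictionary message layout and the `urn:example:` addresses are all hypothetical.

```python
class ServiceInternet:
    """Toy software overlay that routes by logical service name, not IP address."""

    def __init__(self):
        self.services = {}     # logical address -> handler callable
        self.undelivered = []  # messages whose target is not registered

    def register(self, address, handler):
        self.services[address] = handler

    def send(self, message):
        # Routing uses only the header; the body is opaque to the overlay.
        target = message["header"]["to"]
        handler = self.services.get(target)
        if handler is None:
            self.undelivered.append(message)
            return False
        handler(message["body"])
        return True


received = []
net = ServiceInternet()
net.register("urn:example:job-service", received.append)

delivered = net.send(
    {"header": {"to": "urn:example:job-service"}, "body": "run simulation"}
)
lost = net.send({"header": {"to": "urn:example:unknown"}, "body": "hello"})
```

Because the application service receives only the body, the overlay could add reliability or security handling in `send` without the service changing, which is the sense in which the header acts as the "Grid Operating System" part of the message.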

Consequences of Rule of the Millisecond
Useful to remember critical time scales
• 1) 0.000001 ms – CPU does a calculation
• 2a) 0.001 to 0.01 ms – Parallel Computing MPI latency
• 2b) 0.001 to 0.01 ms – Overhead of a Method Call
• 3) 1 ms – wake-up a thread or process
• 4) 10 to 1000 ms – Internet delay
2a), 4) implies geographically distributed metacomputing can’t in general compete with parallel systems
3) << 4) implies a software overlay network is possible without significant overhead
• We need to explain why it adds value of course!
2b) versus 3) and 4) describes regions where method and message based programming paradigms are important
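
The two comparisons can be made concrete with a little arithmetic on the time scales listed above. The sketch below is only a worked calculation; the constant and function names are illustrative, and the numbers are the ranges from the list.

```python
# Time scales from the list above, in milliseconds, as (low, high) ranges.
MPI_LATENCY = (0.001, 0.01)      # 2a) parallel computing MPI latency
THREAD_WAKEUP = (1.0, 1.0)       # 3) wake up a thread or process
INTERNET_DELAY = (10.0, 1000.0)  # 4) Internet delay


def ratio_range(numerator, denominator):
    """Smallest and largest possible ratio of two (low, high) ranges."""
    return (numerator[0] / denominator[1], numerator[1] / denominator[0])


# 2a) versus 4): how many times slower wide-area messaging is than MPI.
internet_vs_mpi = ratio_range(INTERNET_DELAY, MPI_LATENCY)

# 3) versus 4): software overlay cost as a fraction of the Internet delay.
overlay_fraction = ratio_range(THREAD_WAKEUP, INTERNET_DELAY)
```

Wide-area latency comes out roughly a thousand to a million times the MPI latency, which is why metacomputing cannot in general compete with a parallel machine, while a 1 ms software hop is between about 0.1% and 10% of the Internet delay it rides on.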

What is a Simple Service?
Take any system – it has multiple functionalities
• We can implement each functionality as an independent distributed service
• Or we can bundle multiple functionalities in a single service
Whether a functionality is an independent service or one of many method calls into a “glob of software”, we can always make it a Web service by converting the interface to WSDL
Simple services are obtained by taking functionalities and making them as small as possible subject to the “rule of the millisecond”
• Distributed services incur a messaging overhead of one (local) to 100’s (far apart) of milliseconds to use a message rather than a method call
• Use scripting or compiled integration of functionalities ONLY when you require <1 millisecond interaction latency
Apache web site has many (pre Web Service) projects that are multiple functionalities presented as (Java) globs and NOT (Java) Simple Services
• Makes it hard to integrate sharing common security, user profile, file access .. services
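
One way to read the rule of the millisecond is as a per-interaction decision: keep two functionalities in the same service (method call) only when they must interact in under 1 ms, and otherwise split them into simple services linked by messages. The sketch below encodes that reading; the function name, the interaction names and their latency figures are hypothetical.

```python
def choose_coupling(required_latency_ms):
    """Apply the rule of the millisecond to one interaction between functionalities."""
    if required_latency_ms < 1.0:
        return "method call"  # bundle both functionalities in one service
    return "message"          # separate simple services linked by messages


# Hypothetical interactions with the latency each one can tolerate, in ms.
interactions = {
    "inner-loop numerical kernel": 0.01,
    "user profile lookup": 50.0,
    "security check": 5.0,
}

plan = {name: choose_coupling(ms) for name, ms in interactions.items()}
```

Under these assumed figures only the numerical kernel stays as a method call; the profile and security functions become shared services, which is the kind of integration the Apache “glob” example makes hard.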

Grids of Grids of Simple Services
• Link via methods, messages, streams
• Services and Grids are linked by messages
• Internally to a service, functionalities are linked by methods
• A simple service is the smallest Grid
• We are familiar with the method-linked hierarchy: Lines of Code → Methods → Objects → Programs → Packages
[Figure: Overlay and Compose Grids of Grids. Methods → Services → Component Grids; CPUs → Clusters, MPPs → Compute Resource Grids; Databases → Federated Databases and Sensor → Sensor Nets → Data Resource Grids]

Component Grids?
So we build collections of Web Services which we
We build bigger Grids by composing component Grids using the Service Internet

Critical Infrastructure (CI) Grids built as Grids of Grids
[Figure labels: Flood CIGrid, Gas CIGrid, … Electricity CIGrid …; Flood Services and Filters, Gas Services and Filters; Collaboration Grid, Visualization Grid, Sensor Grid, Compute Grid, GIS Grid; Core Grid Services: Portals, Registry, Metadata, Data Access/Storage, Security, Workflow, Notification, Messaging; Physical Network]

The WS-* Infrastructure
Core Grid Services build on and/or extend the 60 or so WS-* Infrastructure specifications which define
• Container Model, XML, WSDL …
• Service Internet ((Reliable) Messaging, Addressing) including extensions for high performance transport and representation. This is a natural basis for streaming applications
• Service Discovery
• Workflow and Transactions
• Security
• Metadata and State including lifetime
• Notification
• Policy, Agreements
• Management (service interactions)
• Portals and User Interfaces

catalogs, Semantic Grid, Provenance
Higher level security with fine grain authorization, session level security etc.
Higher level execution services building on workflow and including job management for simulations
Infrastructure services giving common interfaces to heterogeneous resources such as storage and computers
Data and Information services including federated databases and use of CIM
Self and distributed (resource) management for autonomic features, configuration
Not OGSA: Collaboration, Sensors, Visualization, GIS

Relation to GiG Architecture
GiG and NCOW (Net-Centric Operations and Warfare) define services and laud SOA but don’t seem to use either industry or GGF “stacks”
Note Grids and “Service Oriented Computing” are placed as part of computing infrastructure – I think this is inappropriate
Identified features can be mapped to previous Grid service categories
TTV Emerging Technology Categories
• Information Assurance
• Data Strategy
• Policy-Based Management
• Mission-Specific Applications
• Information Modeling
• Computing Infrastructure
• Transport Infrastructure
• Service Oriented Computing*
• Autonomous Computing*
• Grid Computing*
• Collaborative Computing*
* Sub-categories of Computing Infrastructure

NCOW Reference Model
All of these areas and their defined sub-areas are naturally defined as services

NCES: Network Centric Enterprise Services I

NCES: Network Centric Enterprise Services II

Implications for Collaboration Grids
As with all Grids, we will use a SOA, identify what core Grid (WS) services one needs, and build on top of this
• Core collaboration interface specification XGSP
• Common collaboration services such as session management and secure software multicast
• Customized collaboration services in particular domains
Support asynchronous and synchronous collaboration
• Most Grids naturally support asynchronous sharing
Need to see how to link to existing SIP and H323 capabilities
Need to examine current monolithic collaboration architectures and divide them into simple services
• MCU becomes multiple services

Collaboration and Web Services
Collaboration has
a) A mechanism to set up members (people, devices) of a “collaborative session”
b) Shared generic tools such as text chat, white boards, audio-video conferencing
c) Shared applications such as Web Pages, PowerPoint, Visualization, maps, (medical) instruments ….
b) and c) are “just shared objects” where objects could be Web Services but rarely are at the moment
• We can port objects to Web Services and build a general approach for making Web services collaborative
a) is a “Service” which is set up in many different ways (H323, SIP, JXTA are standards supported by multiple implementations) – we should make it a WS

Shared Event Collaboration
All collaboration is about sharing events defining state changes
• Audio/Video conferencing shares events specifying audio or video in compressed form
• Shared display shares events corresponding to changes in the pixels of a frame buffer
• Instant Messengers share updates to text message streams
• Microsoft events for shared PowerPoint (file replicated between clients) as in Access Grid
Finite State Change NOT Finite State Machine architecture
Using Web services allows one to expose update events of all kinds as message streams
Need a publish/subscribe approach to share messages (NB, NaradaBrokering) plus a system to control the “session” – who is collaborating and the rules
• XGSP is an XML protocol for controlling collaboration, building on H323 and SIP

Web Services and M-MVC
Web Services are naturally M-MVC – Message based Model View Controller with
• Model is Web Service
• Controller is Messages (NaradaBrokering)
• View is rendering
[Figure: Web Service (application or Model) with a Resource Facing Port and a User Facing Port (Input port, Output port); a Portal aggregates WS user-facing fragments (WSRP and JSR168 Portlets) for the View on desktop, handheld or phone. Port labels: R F I O]
[Figure: Explicit message-based Publish/Subscribe MVC model with the Broker as Controller. Labels: Model (Subscribe UI event, Publish rendering); View (Publish UI event, Subscribe rendering)]
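
A minimal sketch of the publish/subscribe MVC pattern described on this slide, assuming an in-process broker as a stand-in for NaradaBrokering (this is not its API). The `Broker`, `Model` and `View` classes and the topic names `ui-event` and `rendering` are hypothetical.

```python
from collections import defaultdict


class Broker:
    """In-process publish/subscribe broker playing the Controller role."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in list(self.subscribers[topic]):
            callback(event)


class Model:
    """Application state; it reacts only to messages, never to direct calls."""

    def __init__(self, broker):
        self.clicks = 0
        self.broker = broker
        broker.subscribe("ui-event", self.on_ui_event)

    def on_ui_event(self, event):
        if event == "mousedown":
            self.clicks += 1
        self.broker.publish("rendering", f"clicks: {self.clicks}")


class View:
    """Publishes UI events and shows whatever rendering it receives."""

    def __init__(self, broker):
        self.display = ""
        self.broker = broker
        broker.subscribe("rendering", self.on_rendering)

    def on_rendering(self, rendering):
        self.display = rendering

    def user_input(self, event):
        self.broker.publish("ui-event", event)


broker = Broker()
model = Model(broker)
view = View(broker)
view.user_input("mousedown")
view.user_input("mousemove")
view.user_input("mousedown")
```

The View and Model hold no reference to each other, only to the broker, so either side could be moved to another machine or replicated without changing the other.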

Desktop and Web Services with MMVC
Most desktop applications are in fact roughly MVC, with the controller formed by “system interrupts” and with View and Model communicating through a “post an event” and define a “listener” programming mode
We propose to integrate desktop and Web Service approaches by systematic use of MMVC and NaradaBrokering
Allows easier porting to diverse clients and automatic collaboration
Attractive for next generation of Linux desktop clients
We have demonstrated this for an SVG (Scalable Vector Graphics) Browser, OpenOffice and PowerPoint
“Glob” programming style makes hard

[Figure: three histograms titled “Distribution of the mean of mousedown events”, “Distribution of the mean of mouseup events” and “Distribution of the mean of mousemove events”. x-axis: mean, in milliseconds; y-axis: number of events in 0.5 millisecond bins. Series: NB on Model, NB on View, NB on ripvanwinkle. 15 runs each, split over 3 days]

SM-MV Collaboration
[Figure: Shared Output port, Single Model, Multiple View (SM-MV) Collaborative Web Service. Labels: SVG DOM Model as Web Service; NaradaBrokering; master client View (SVG); other client Views (SVG); Share output port; XGSP Session Control]

MM-MV Collaboration
[Figure: Shared Input port, Multiple Model, Multiple View (MM-MV) Collaborative Web Service]
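
The contrast between the two collaboration styles can be sketched as follows, under one reading of the slide titles: in SM-MV a single model's output is shared by every view, while in MM-MV each participant runs its own model and the input events are shared. The broker is an in-process stand-in, not NaradaBrokering, and all class, function and topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process publish/subscribe broker (hypothetical stand-in)."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in list(self.subscribers[topic]):
            callback(event)


class Model:
    """Counts input events and publishes its state on an output topic."""

    def __init__(self, broker, input_topic, output_topic):
        self.count = 0
        self.broker = broker
        self.output_topic = output_topic
        broker.subscribe(input_topic, self.on_input)

    def on_input(self, event):
        self.count += 1
        self.broker.publish(self.output_topic, self.count)


class View:
    """Shows the latest value received on its output topic."""

    def __init__(self, broker, output_topic):
        self.shown = None
        broker.subscribe(output_topic, self.on_output)

    def on_output(self, value):
        self.shown = value


def sm_mv(n_views):
    """Shared output port: one model, every view subscribes to its output."""
    broker = Broker()
    models = [Model(broker, "input", "shared-output")]
    views = [View(broker, "shared-output") for _ in range(n_views)]
    broker.publish("input", "mousedown")  # event from the master client
    return models, views


def mm_mv(n_clients):
    """Shared input port: one model per client, all fed the same input events."""
    broker = Broker()
    models = [Model(broker, "shared-input", f"output-{i}") for i in range(n_clients)]
    views = [View(broker, f"output-{i}") for i in range(n_clients)]
    broker.publish("shared-input", "mousedown")  # event from the master client
    return models, views


sm_models, sm_views = sm_mv(3)
mm_models, mm_views = mm_mv(3)
```

Both calls leave every view showing the same value; the difference is that SM-MV ships rendered output from one model to all clients, while MM-MV ships the smaller input events and relies on each replica model computing the same state.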