Grid System Issues
MSI-CI2 Meeting
June 29 2006
Geoffrey Fox
Computer Science, Informatics, Physics
Pervasive Technology Laboratories
Indiana University Bloomington IN 47401
http://www.infomall.org
Topics Covered
General Issues:
• Relation to P2P
• Types of Grids
• Why use Service Oriented Architectures
• Multi-core Chips
• All the world’s services
• Workflow
• Metadata and State
• Workflow
• Sensors and Filters
• SOAP MPI and Communication Performance
• Grids of Grids
• Community Tools
Web services
Web Services build loosely-coupled, distributed applications (wrapping existing codes and databases) based on SOA (service oriented architecture) principles.
Web Services interact by exchanging messages in SOAP format.
The contracts for the message exchanges that implement those interactions are described via WSDL interfaces.
Figure labels: resources (Databases, Humans, Programs, Computational resources, Devices); service logic (BPEL, Java, .NET); message processing (SOAP and WSDL); SOAP messages of the form <env:Envelope> <env:Header> ... </env:Header> <env:Body> ... </env:Body> </env:Envelope>
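The envelope structure named on this slide can be sketched in a few lines. This is a minimal illustration only: it assumes the SOAP 1.2 envelope namespace, and the application namespace `urn:example:grid`, the `SubmitJob` operation and its parameters are hypothetical names that do not come from the slides.

```python
import xml.etree.ElementTree as ET

# SOAP 1.2 envelope namespace (W3C).
SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
# Hypothetical application namespace, used only for this sketch.
APP_NS = "urn:example:grid"


def build_envelope(operation, params):
    """Return a serialized SOAP envelope with an empty Header and one Body element."""
    ET.register_namespace("env", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")   # system information would go here
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical request: ask a job service to run a program on 4 nodes.
message = build_envelope("SubmitJob", {"executable": "simulate", "nodes": 4})
print(message.decode("utf-8"))
```

The receiving service needs only the message and the WSDL contract, not the sender's code, which is the loose coupling the slide describes.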
A typical Web Service
In principle, services can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be method calls, Java RMI Messages, CGI Web invocations, totally compiled away (inlining)
The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python
Figure labels: Payment (Credit Card), Warehouse (Shipping control), Security, Catalog, Portal Service; Web Services connected through WSDL interfaces
Philosophy of Web Service Grids
Much of Distributed Computing was built by natural extensions of computing models developed for sequential machines
This leads to the distributed object (DO) model represented by Java and CORBA
• RPC (Remote Procedure Call) or RMI (Remote Method Invocation) for Java
Key people think this is not a good idea as it scales badly and ties distributed entities together too tightly
• Distributed Objects Replaced by Services
Note CORBA was considered too complicated in both organization and proposed infrastructure
• and Java was considered as “tightly coupled to Sun”
• So there were other reasons to discard
Thus replace distributed objects by services connected by “one-way” messages and not by request-response messages
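A minimal sketch of the “one-way” message style, assuming in-process queues stand in for the network. The `Service` class, the service names and the `None` stop sentinel are all hypothetical choices for this illustration, not anything prescribed by the slides.

```python
import queue
import threading


class Service:
    """A service that owns an inbox and reacts to one-way messages."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()
        self.log = []

    def send(self, target, payload):
        # One-way: enqueue the message and return at once; no reply is awaited.
        target.inbox.put({"from": self.name, "payload": payload})

    def run(self):
        while True:
            msg = self.inbox.get()
            if msg["payload"] is None:   # sentinel that stops this sketch
                break
            self.log.append(msg)


producer = Service("simulation")
consumer = Service("visualization")
worker = threading.Thread(target=consumer.run)
worker.start()
for step in range(3):
    producer.send(consumer, {"step": step})
producer.send(consumer, None)
worker.join()
```

The sender never blocks on the receiver, so the two sides share only the message format. A request-response call would instead tie the caller's progress to the callee, which is the tight coupling the slide argues against.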
Some ideas to Remember
Grids are managed Web Services exchanging Messages
P2P Networks are differently managed and architected services exchanging messages
Any computer operation involves messages; not all these messages can be isolated
• With services all messages are explicit and can be examined
Grid Services extend WS-* Web Service Specifications
Web Service container replaces computer
Service replaces process
A stream is an ordered set of messages
Service Internet replaces Internet: messages replace packets
(Sub)Grids replace Libraries
Internet Scale Distributed Services
Grids use Internet technology and are distinguished by managing or organizing sets of network connected resources
• Classic Web allows independent one-to-one access to individual resources
• Grids integrate together and manage multiple Internet-connected resources: People, Sensors, computers, data systems
Organization can be explicit as in
• TeraGrid which federates many supercomputers;
• Information Retrieval Grid which federates multiple data resources;
• CrisisGrid which federates first responders, commanders, sensors, GIS, (Tsunami) simulations, science/public data
Organization can be implicit as in Internet resources such as curated databases and simulation resources that “harmonize a community”
Typical Grid Architecture
Figure labels: Raw (HPC) Resources, Middleware, Database, Portal Services, System Services (repeated), Application Service, User Services, “Core” Grid
Each Blob is a Computer Program!
Classic Grid Architecture
Figure labels: Resources (Database, Database, Netsolve, Computing); Middle Tier Brokers and Service Providers (Security, Collaboration, Composition, Content Access); Clients (Users and Devices)
Middle Tier becomes Web Services
Peer to Peer Grid
A democratic organization
Figure labels: Database, Database, Peers, Peers, User Facing Web Service Interfaces, Service Facing Web Service Interfaces, Event/Message Brokers (three)
Different Visions of the Grid
e-Science or Cyberinfrastructure are virtual organization Grids supporting global distributed engineering and science research (note sensors, instruments and people are all distributed)
Utility Computing or X-on-demand (X=data, computer ..) is a major computer Industry interest in Grids and this is a key part of enterprise or campus Grids
Skype (Kazaa) VOIP system is a Peer-to-peer Grid (and VRVS/GlobalMMCS like Internet A/V conferencing are Collaboration Grids)
DoD’s vision of Network Centric Computing can be considered a Grid (linking sensors, warfighters, commanders, backend resources) and they are building the GIG (Global Information Grid)
Commercial 3G Cell-phones and DoD ad-hoc network initiative are forming mobile Grids
Grids support universal Globalization in life, fun, research, business
e-moreorlessanything and the Grid
e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world.
• The growing use of outsourcing is one example
e-Science is the similar vision for scientific research with international participation in large accelerators, satellites or distributed gene analyses.
The Grid integrates the best of the Web, traditional enterprise software, high performance computing and Peer-to-peer systems to provide the information technology e-infrastructure for e-moreorlessanything.
A deluge of data of unprecedented and inevitable size must be managed and understood.
People, computers, data and instruments must be linked.
On demand assignment of experts, computers, networks and storage resources must be supported
e-Defense and e-Crisis
Grids support Command and Control and provide Global Situational Awareness
• Link commanders and frontline troops to themselves and to archival and real-time data; link to what-if simulations
• Dynamic heterogeneous wired and wireless networks
• Security and fault tolerance essential
System of Systems; Grid of Grids
• The command and information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid
• Grids must be heterogeneous and federated
Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support
1962 Licklider’s Vision
“Lick had this concept – all of the stuff linked together throughout the world, that you can use a remote computer, get data from a remote computer, or use lots of computers in your job.”
Larry Roberts – Principal Architect of the ARPANET
What is e-Science?
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
John Taylor
Director General of Research Councils
UK, Office of Science and Technology
e-Science is about developing tools and technologies that allow scientists to do ‘faster, better or different’ research
Some Important Styles of Grids
Computational Grids were the origin of the concepts and link computers across the globe – high latency stops this from being used as a parallel machine
• Typically Compute/File Grids where information (messages) is exchanged by writing and reading files
Knowledge and Information Grids link sensors and information repositories as in Virtual Observatories or BioInformatics
Education Grids link teachers, learners, parents as a VO with learning tools, distant lectures etc.
e-Science Grids link multidisciplinary researchers across laboratories and universities
Community Grids focus on Grids involving large numbers of peers rather than focusing on linking major resources – links Grid and Peer-to-peer network concepts
Semantic Grid links Grid and AI community with Semantic web (ontology/meta-data enriched resources) and Agent concepts
Collaboration Grids support the linkage of multiple people and electronic resources (often peer-to-peer architecture)
Types of Computing Grids
Running “Pleasingly Parallel Jobs” as in United Devices, Entropia (Desktop Grid) “cycle stealing systems”
Can be managed (“inside” the enterprise as in Condor) or more informal (as in SETI@Home)
Computing-on-demand in Industry where jobs spawned are perhaps very large (SAP, Oracle …)
Support distributed file systems as in Legion (Avaki), Globus with (web-enhanced) UNIX programming paradigm
• Particle Physics will run some 30,000 simultaneous jobs
Distributed Simulation HLA style Grids (some work)
Linking Supercomputers as in TeraGrid
Pipelined applications linking data/instruments, compute, visualization
Seamless Access where Grid portals allow one to choose one of multiple resources with a common interface
Parallel Computing typically NOT suited for a Grid (latency)
Old Style Metacomputing Grid
Spread a single large Problem over multiple supercomputers
Figure labels: Large Scale Parallel Computers, Large Disks, Imaging Instruments, Computational Resources, Large-Scale Databases, Data Acquisition, Analysis, Advanced Visualization, Analysis and Visualization
Utility and Service Computing
An important business application of Grids is believed to be utility computing
Namely support a pool of computers to be assigned as needed to take-up extra demand
• Pool shared between multiple applications
Natural architecture is not a cluster of computers connected to each other but rather a “Farm of Grid Services” connected to Internet and supporting services such as
• Web Servers
• Financial Modeling
• Run SAP
• Data-mining
• Simulation response to crisis like forest fire or earthquake
• Media Servers for Video-over-IP
Note classic Supercomputer use is to allow full access to do “anything” via ssh etc.
• In service model, one pre-configures services for all programs and you access portal to run job with fewer security issues
GOSC Timeline
UK National Grid Service, Grid Operation Support Centre: Web Services based National Grid Infrastructure
Figure: quarterly timeline, 2004 to 2006, with milestones: EGEE gLite alpha release; gLite release 1; OMII release; NGS Expansion (Bristol, Cardiff…); OGSA-DAI; WS plan; NGS Production Service; NGS WS Service; EGEE gLite release; OMII Release; NGS Expansion; WS2 plan; NGS WS Service 2
Towards an International Grid Infrastructure
Figure labels: Computation; Network PoP; Service Registry; Steering clients; Starlight (Chicago); Netherlight (Amsterdam); UKLight; US TeraGrid (PSC, SDSC, NCSA); UK NGS (Leeds, UCL, Manchester, Oxford, RAL); SC05; Local laptops in Seattle and UK
All sites connected by production network (not all shown)
UNIVERSITY OF CALIFORNIA, SAN DIEGO
SAN DIEGO SUPERCOMPUTER CENTER
Fran Berman
Cyberinfrastructure At Home
• BOINC (Berkeley Open Infrastructure for Network Computing) (http://boinc.berkeley.edu)
• Climateprediction.net: study climate change
• Einstein@home: search for gravitational signals emitted by pulsars
• LHC@home: improve the design of the CERN LHC particle accelerator
• Predictor@home: investigate protein-related diseases
• Rosetta@home: help researchers develop cures for human diseases
• SETI@home: Look for radio evidence of extraterrestrial life
• Etc.
SETI@Home averages 138 TFLOPS on 100,000’s of computers in 100’s of countries
Figure: Arecibo telescope
climateprediction.net
Since September 2003:
• 95,000 registered participants in 150 countries
• Donated 8,000 years of computer time
• Completed 100,000 simulations of over 4M model years
Information/Knowledge Grids
Distributed (10’s to 1000’s) of data sources (instruments, file systems, curated databases …)
Data Deluge: 1 (now) to 100’s petabytes/year (2012)
• Moore’s law for Sensors
Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data
Needs decision support front end with “what-if” simulations
Metadata (provenance) critical to annotate data
Integrate across experiments as in multi-wavelength astronomy
Data Deluge comes from pixels/year available
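A minimal sketch of filters assigned on demand with provenance recorded alongside the data. Everything here is hypothetical: the registry, the two toy filters and the sample values stand in for real services such as image processing on telescope data.

```python
# Registry of available filters, looked up by name at run time.
FILTERS = {}


def register(name):
    """Decorator that adds a filter function to the registry under `name`."""
    def wrap(fn):
        FILTERS[name] = fn
        return fn
    return wrap


@register("threshold")
def threshold(samples):
    # Keep only samples at or above a fixed, illustrative cut of 0.5.
    return [s for s in samples if s >= 0.5]


@register("normalize")
def normalize(samples):
    # Scale samples so that the largest one becomes 1.0.
    peak = max(samples, default=0.0)
    return [s / peak for s in samples] if peak else list(samples)


def apply_filters(samples, names):
    """Apply the named filters in order and record which ones were used."""
    provenance = []
    for name in names:
        samples = FILTERS[name](samples)
        provenance.append(name)
    return samples, provenance


result, provenance = apply_filters([0.2, 0.5, 1.0, 2.0], ["threshold", "normalize"])
```

Because filters are chosen by name per request, a new filter can be added without touching the data source, and the provenance list is the kind of metadata the slide calls critical for annotating data.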
Data Deluged Science
In the past, we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new algorithms and new ways of computing
Data assimilation was not central to HPCC
DoE ASCI set up because didn’t want test data!
Now particle physics will get 100 petabytes from CERN
• Nuclear physics (Jefferson Lab) in same situation
• Use around 30,000 CPU’s simultaneously 24X7
Weather, climate, solid earth (EarthScope)
Bioinformatics curated databases (Biocomplexity only 1000’s of data points at present)
Virtual Observatory and SkyServer in Astronomy
Environmental Sensor nets
Data Deluged Science Computing Paradigm
Figure labels: Data, Information, Ideas, Simulation, Model, Assimilation, Reasoning, Datamining, Computational Science, Informatics
Grid of Grids: Research Grid and Education Grid
Figure labels: SERVOGrid, Education Grid, GIS Grid, Sensor Grid, Database Grid, Compute Grid, Computer Farm; Database, Database, Repositories, Federated Databases, Sensors, Streaming Data, Field Trip Data, Data Filter Services, Research Simulations, Analysis and Visualization Portal, Discovery Services, Customization Services; Research, Education, From Research to Education
SERVOGrid Requirements
Seamless Access to Data repositories and large scale computers
Integration of multiple data sources including sensors, databases, file systems with analysis system
• Including filtered OGSA-DAI (Grid database access)
Rich meta-data generation and access with SERVOGrid specific Schema extending openGIS (Geography as a Web service) standards and using Semantic Grid
Portals with component model for user interfaces and web control of all capabilities
Collaboration to support world-wide work
Basic Grid tools: workflow and notification
NOT metacomputing
Community Tools
e-mail and list-serves are oldest and best used
Kazaa, Instant Messengers, Skype, Napster, BitTorrent for P2P Collaboration – text, audio-video conferencing, files
del.icio.us, Connotea, Citeulike manage shared bookmarks
hotornot.com or similar sites allow you to create community resources and share them
Writely, Wikis and Blogs are powerful specialized shared document systems
ConferenceXP and WebEx share general applications
Google Scholar tells you who has cited your papers while publisher sites tell you about co-authors
Note sharing resources creates (implicit) communities
• Social network tools study graphs to both define communities and extract their properties
Why use SOA’s
Globalization of applications: Life, Fun, Research, Business, Defense as an International collaborative activity
Globalization of Software Production: Software components including open-source made everywhere
Interoperability: in interfaces and protocol (messages) requires Web Services as only broadly supported SOA
Anti-Performance: if Moore’s law gives you a factor X, then use √X for performance, √X for improved lifecycle (re-use)
Software Engineering: Software paradigms are ways of “packaging” modules/components/objects/methods/subroutines. Services have minimal coupling and best re-use (lowest performance). 1962 Fortran easier re-use than 2006 Java
Multicore chips: requires pervasive concurrency without side effects. Even Microsoft must be able to use 32-128 way parallelism on a chip over next 5 years
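The Anti-Performance rule can be read as a simple multiplicative split: √X · √X = X. A small worked example, assuming a purely illustrative hardware gain of X = 100 (the figure is not from the slides):

```python
import math


def split_gain(x):
    """Split a hardware gain X into two equal multiplicative shares of sqrt(X)."""
    share = math.sqrt(x)
    return share, share


# Hypothetical figure: a 100x gain delivered by Moore's law.
performance, lifecycle = split_gain(100.0)
print(f"performance x{performance:g}, lifecycle (re-use) budget x{lifecycle:g}")
```

Under this reading a service architecture that runs ten times slower than hand-tuned code still leaves a tenfold net speedup, with the other factor of ten spent on looser coupling and re-use.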
Intel Fall 2005 Multicore Roadmap
March 2006 Sun T1000 8 core Server at <$6,000
Performance Per Transistor
Performance data from uP vendors
Transistor count excludes on-chip caches
Performance normalized by clock rate
Conclusion: Simplest is best! (250K Transistor CPU)
Figure: two log-log plots, Normalized SPECINTS and Normalized SPECFLTS (0.1 to 10) against Millions of Transistors (CPU) (0.1 to 10)
Peter Kogge 1997
The Grid and Web Service Institutional Hierarchy
Must set standards to get interoperability
4: Application or Community of Interest (CoI) Specific Services such as “Map Services”, “Run BLAST” or “Simulate a Missile” (XBML, XTCE, VOTABLE, CML, CellML)
3: Generally Useful Services and Features (OGSA and other GGF, W3C) such as “Collaborate”, “Access a Database” or “Submit a Job” (OGSA GS-* and some WS-*, GGF/W3C/…., XGSP (Collab))
2: System Services and Features (WS-* from OASIS/W3C/Industry): Handlers like WS-RM, Security, UDDI Registry
1: Container and Run Time (Hosting) Environment (Apache Axis, .NET etc.)
Sources of Grid Technology
Grids support distributed collaboratories or virtual organizations integrating concepts from
• The Web
• Agents
• Distributed Objects (CORBA Java/Jini COM)
• Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
• Peer-to-peer Networks
With perhaps the Web and P2P networks being the most important for “Information Grids” and Globus for “Compute/File Grids”
The Essence of Grid Technology?
We will start from the Web view and assert that basic paradigm is
Meta-data rich Web Services communicating via messages
These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3/4 (Globus Toolkit 3/4)
• These are the distributed equivalent of operating system functions as in UNIX Shell
• Called Hosting Environment or platform
W3C standard WSDL defines IDL (Interface standard) for Web Services
What is Happening?
Grid ideas are being developed in (at least) four communities
• Web Service – W3C, OASIS, (DMTF)
• Global Grid Forum (High Performance Computing, e-Science)
• Enterprise Grid Alliance (Commercial “Grid Forum” with a near term focus) merged with GGF to make Open Grid Forum
Service Standards are being debated
Grid Operational Infrastructure is being deployed
Grid Architecture and core software being developed
• Apache has several important projects as do academia; large and small companies
Particular System Services are being developed “centrally” – OGSA framework for this in GGF; WS-* for OASIS/W3C/Microsoft-IBM
Lots of fields are setting domain specific standards and building domain specific services
USA started but now Europe is probably in the lead and Asia will soon catch USA if momentum (roughly zero for USA) continues
What do Grids Add?
Grids use all of the Web Services
They address management and deployment of large distributed systems of services
• Internet Scale Distributed Services
• I will use Grid more simply as a composable coordinated collection of services
They address security and management issues of virtual organizations crossing multiple administrative domains
GGF is developing specific services of relevance including job management, many aspects of data and scheduling
• Not much on sensors, real-time, P2P
GGF has a good process for developing new higher level specifications
Technical Activities of Note
Look at different styles of Grids such as Autonomic (Robust Reliable Resilient)
New Grid architectures hard due to investment required
Program the Grid – Workflow
Access the Grid – Portals, Grid Computing Environments
Critical Services Such as
• Security – build message based not connection based
• Notification – event services
• Metadata – Use Semantic Web, provenance
• Fabric and Service Management
• Databases and repositories – instruments, sensors
• Computing – Submit job, scheduling, distributed file systems
• Visualization, Computational Steering
• Network performance
Figure labels: Low Level WS-*; High Level e.g. OGSA
What do Web Services Prescribe?
• They specify interfaces for system services (and generally useful services like database)
• They specify an interface language (WSDL) for all services
• They develop containers and frameworks to use to host services
• They specify a message format (SOAP) for ALL messages that defines both application and system actions precisely
• They imply a process be started to define domain specific services
• There are multiple competing activities from Microsoft and IBM to Apache, and IU (for example) developing system and application services
• Unlike for RTI and CORBA, services from different vendors should interoperate
Figure labels: H1 H2 H3 H4 Body; F1 F2 F3 F4; Service; Container Handlers; Container System Processing
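A minimal sketch of container handlers processing message headers before the body reaches the service, as the figure labels on this slide suggest. The `Message` and `Container` classes, the two handlers and the header names are hypothetical; real containers such as Apache Axis define their own handler interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    headers: dict
    body: str
    trace: list = field(default_factory=list)


def security_handler(msg):
    # Hypothetical check: reject messages that carry no token header.
    if "token" not in msg.headers:
        raise PermissionError("missing security header")
    msg.trace.append("security")
    return msg


def reliability_handler(msg):
    # Hypothetical stand-in for reliable-messaging style sequencing.
    msg.trace.append(f"sequence={msg.headers.get('sequence', 0)}")
    return msg


def echo_service(body):
    # The application service sees only the body.
    return body.upper()


class Container:
    """Runs each handler over the headers, then passes the body to the service."""

    def __init__(self, handlers, service):
        self.handlers = handlers
        self.service = service

    def dispatch(self, msg):
        for handler in self.handlers:
            msg = handler(msg)
        return self.service(msg.body)


container = Container([security_handler, reliability_handler], echo_service)
request = Message(headers={"token": "abc", "sequence": 7}, body="run blast")
reply = container.dispatch(request)
```

System concerns live in the handlers and application logic lives in the service, so either side can be replaced without changing the other.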
Plethora of Standards
Java is very powerful partly due to its many “frameworks” that generalize libraries e.g.
• Java Media Framework
• Java Database Connectivity JDBC
Web Services have a corresponding collection of specifications that represent critical features of the distributed operating systems for “Grids of Simple Services”
• About 60 WS-* specifications introduced in last 2-3 years
• These are low level with higher level standards such as access database (OGSA-DAI) or “Submit a job” built on top of these
Many battles both between standard bodies and between companies as each tries to set standards they consider best; thus there are multiple standards for many of the key Web Service functionalities
Microsoft is a key player and stands to benefit as Web Services open up enterprise software space to all participants
• e.g. MQSeries (IBM) and Tibco have to change their messaging systems to support new open standards
The Ten areas covered by the 60 core WS-* Specifications
WS-* Specification Area | Examples
1: Core Service Model | XML, WSDL, SOAP
2: Service Internet | WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MTOM
3: Notification | WS-Notification, WS-Eventing (Publish-Subscribe)
4: Workflow and Transactions | BPEL, WS-Choreography, WS-Coordination
5: Security | WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation
6: Service Discovery | UDDI, WS-Discovery
7: System Metadata and State | WSRF, WS-MetadataExchange, WS-Context
8: Management | WSDM, WS-Management, WS-Transfer
9: Policy and Agreements | WS-Policy, WS-Agreement
10: Portals and User Interfaces | WSRP (Remote Portlets)
Activities in Global Grid Forum Working Groups
GGF Area | GS-* and OGSA Standards Activities
1: Architecture | High Level Resource/Service Naming (level 2 of slide 6), Integrated Grid Architecture
2: Applications | Software Interfaces to Grid, Grid Remote Procedure Call, Checkpointing and Recovery, Interoperability to Job Submittal services, Information Retrieval
3: Compute | Job Submission, Basic Execution Services, Service Level Agreements for Resource use and reservation, Distributed Scheduling
4: Data | Database and File Grid access, Grid FTP, Storage Management, Data replication, Binary data specification and interface, High-level publish/subscribe, Transaction management
5: Infrastructure | Network measurements, Role of IPv6 and high performance networking, Data transport
6: Management | Resource/Service configuration, deployment and lifetime, Usage records and access, Grid economy model
7: Security | Authorization, P2P and Firewall Issues, Trusted Computing
Net-Centric Core Enterprise Services
Core Enterprise Services | Service Functionality
NCES1: Enterprise Services Management (ESM) | including life-cycle management
NCES2: Information Assurance (IA)/Security | Supports confidentiality, integrity and availability. Implies reliability and autonomic features
NCES3: Messaging | Synchronous or asynchronous cases
NCES4: Discovery | Searching data and services
NCES5: Mediation | Includes translation, aggregation, integration, correlation, fusion, brokering publication, and other transformations for services and data. Possibly agents
NCES6: Collaboration | Provision and control of sharing with emphasis on synchronous real-time services
NCES7: User Assistance | Includes automated and manual methods of optimizing the user GiG experience (user agent)
NCES8: Storage | Retention, organization and disposition of all forms of data
NCES9: Application | Provisioning, operations and maintenance of applications.
The Core Features/Service Areas I
Service or Feature | WS-* | GS-* | NCES (DoD) | Comments
A: Broad Principles
FS1: Use SOA: Service Oriented Arch. | WS1 | | | Core Service Architecture, Build Grids on Web Services. Industry best practice
FS2: Grid of Grids | | | | Distinctive Strategy for legacy subsystems and modular architecture
B: Core Services
FS3: Service Internet, Messaging | WS2 | | NCES3 | Streams/Sensors. Team
FS4: Notification | WS3 | | NCES3 | JMS, MQSeries.
FS5: Workflow | WS4 | | NCES5 | Grid Programming
FS6: Security | WS5 | GS7 | NCES2 | Grid-Shib, Permis Liberty Alliance ...
FS7: Discovery | WS6 | | NCES4 | UDDI
FS8: System Metadata & State | WS7 | | | Globus MDS, Semantic Grid, WS-Context
FS9: Management | WS8 | GS6 | NCES1 | CIM
FS10: Policy | WS9 | | ECS |
The Core Feature/Service Areas II
Service or Feature | WS-* | GS-* | NCES | Comments
B: Core Services (Continued)
FS11: Portals and User assistance | WS10 | | NCES7 | Portlets JSR168, NCES Capability Interfaces
FS12: Computing | | GS3 | |
FS13: Data and Storage | | GS4 | NCES8 | NCOW Data Strategy. Federation at data/information layer major research area; CGL leading role
FS14: Information | | GS4 | | JBI for DoD, WFS for OGC
FS15: Applications and User Services | | GS2 | NCES9 | Standalone Services. Proxies for jobs
FS16: Resources and Infrastructure | | GS5 | | Ad-hoc networks
FS17: Collaboration and Virtual Organizations | | GS7 | NCES6 | XGSP, Shared Web Service ports
FS18: Scheduling and matching of Services and Resources | | GS3 | | Current work only addresses scheduling “batch jobs”. Need networks and services
A List of Web Services 1
• 1) Core Service Architecture
• XSD XML Schema (W3C Recommendation) V1.0 February 1998, V1.1 February 2004
• WSDL 1.1 Web Services Description Language Version 1.1, (W3C note) March 2001
• WSDL 2.0 Web Services Description Language Version 2.0, (W3C under development) March 2004
• SOAP 1.1 (W3C Note) V1.1 Note May 2000
• SOAP 1.2 (W3C Recommendation) June 24 2003
A List of Web Services 2
• 2) Service Internet including messaging
• WS-Addressing Web Services Addressing (BEA, IBM, Microsoft, SAP, Sun) in W3C consideration August 2004
• WS-MessageDelivery Web Services Message Delivery (W3C Submission by Oracle, Sun ..) April 2004
• WS-Reliability Web Services Reliable Messaging (OASIS Web Services Reliable Messaging TC) March 2004
• WS-RM Web Services Reliable Messaging (BEA, IBM, Microsoft, Tibco) v0.992 February 2005 linked to WS-Reliability in OASIS as Web Services Reliable Exchange (WS-RX)
• WS-RM Policy Web Services Reliable Messaging Policy Assertion (BEA, IBM, Microsoft, Tibco) March 2006
• WS-RX Web Services Reliable Exchange (Many members) integrating previous reliability specifications
• SOAP MTOM SOAP Message Transmission Optimization Mechanism (W3C) June 2004
• SOAP-over-UDP Binding of SOAP to UDP (Microsoft, BEA …) September 2004
• Many obsolete specifications like WS-Routing and Referral SOAP Routing Protocol (Microsoft) October 2001
Layered Architecture for Web Services and Grids
Figure layers, top to bottom:
• Application Specific Grids
• Generally Useful Services and Grids
• Workflow WSFL/BPEL
• Service Management (“Context etc.”)
• Service Discovery (UDDI) / Information
• Service Internet Transport Protocol
• Service Interfaces WSDL
• Base Hosting Environment
• Protocol HTTP FTP DNS …
• Presentation XDR …
• Session SSH …
• Transport TCP UDP …
• Network IP …
• Data Link / Physical
Side labels: Higher Level Services, Service Context, Service Internet, Bit level Internet (OSI Stack)
WS-* implies the Service Internet
We have the classic (CISCO, Juniper ….) Internet routing the flood of ordinary packets in OSI stack architecture
Web Services build the “Service Internet” or IOI (Internet on Internet) with
• Routing via WS-Addressing not IP header
• Fault Tolerance (WS-RM not TCP)
• Security (WS-Security/SecureConversation not IPSec/SSL)
• Data Transmission by WS-Transfer not HTTP
• Information Services (UDDI/WS-Context not DNS/Configuration files)
• At message/web service level and not packet/IP address level
Software-based Service Internet possible as computers “fast”
Familiar from Peer-to-peer networks and built as a software overlay network defining Grid (analogy is VPN)
SOAP Header contains all information needed for the “Service Internet” (Grid Operating System) with SOAP Body containing information for Grid application service
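A minimal sketch of routing at the message level: the destination is read from a WS-Addressing `To` header inside the SOAP Header, not from an IP address. It assumes the SOAP 1.2 and WS-Addressing 1.0 namespaces; the `urn:example:...` endpoint names, the action URI and the routing table are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "env": "http://www.w3.org/2003/05/soap-envelope",   # SOAP 1.2
    "wsa": "http://www.w3.org/2005/08/addressing",      # WS-Addressing 1.0
}


def make_message(to, action, payload):
    """Build a SOAP message whose header says where it should be delivered."""
    env = ET.Element(f"{{{NS['env']}}}Envelope")
    header = ET.SubElement(env, f"{{{NS['env']}}}Header")
    ET.SubElement(header, f"{{{NS['wsa']}}}To").text = to
    ET.SubElement(header, f"{{{NS['wsa']}}}Action").text = action
    ET.SubElement(env, f"{{{NS['env']}}}Body").text = payload
    return ET.tostring(env)


# Hypothetical overlay routing table: logical endpoint name -> service.
ROUTES = {
    "urn:example:services/catalog": lambda text: f"catalog got {text}",
    "urn:example:services/payment": lambda text: f"payment got {text}",
}


def route(raw):
    """Deliver the body to whichever service the header names."""
    env = ET.fromstring(raw)
    to = env.find("env:Header/wsa:To", NS).text
    body = env.find("env:Body", NS).text
    return ROUTES[to](body)


reply = route(make_message("urn:example:services/catalog",
                           "urn:example:actions/Lookup", "item-42"))
```

The router inspects only the header, so the same overlay can carry any application body, which matches the header/body split described on this slide.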
A List of Web Services 3
• 3) Notification and high-level publish/subscribe information dissemination
• WS-Eventing Web Services Eventing (BEA, Microsoft, TIBCO) August 2004
• WS-EventNotification (HP, IBM, Intel, Microsoft) March 2006 uses resources to manage subscriptions
• WS-Notification Framework for Web Services Notification with WS-Topics, WS-BaseNotification, and WS-BrokeredNotification (OASIS) OASIS Web Services Notification TC Set up March 2004
• JMS Java Message Service V1.1 March 2002
• Different from using publish-subscribe to robustly support messaging between Web services
– Bind SOAP to JMS or MQSeries
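A minimal sketch of topic-based publish/subscribe, the pattern behind the notification specifications listed on this slide. The `Broker` class and topic names are hypothetical and run in one process; none of the WS-* wire formats is modeled.

```python
from collections import defaultdict


class Broker:
    """Delivers each published event to every subscriber of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(event)
        return len(callbacks)   # number of deliveries made


broker = Broker()
alerts, archive = [], []
broker.subscribe("sensor/seismic", alerts.append)
broker.subscribe("sensor/seismic", archive.append)

delivered = broker.publish("sensor/seismic", {"magnitude": 5.1})
ignored = broker.publish("sensor/tide", {"level": 2})   # no subscribers
```

Publishers and subscribers know only the topic name, not each other, so sources and consumers can be added independently.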
A List of Web Services 4
• 4) Coordination and Workflow, Transactions and Contextualization
• BPEL Business Process Execution Language for Web Services (OASIS) V1.1 May 2003 with V2.0 under development
• WS-CDL Web Services Choreography Language (W3C) V1.0 Working Draft 17 December 2004
• WSCI (W3C) Web Service Choreography Interface V1.0 (W3C Note from BEA, Intalio, SAP, Sun, Yahoo)
• WSCL Web Services Conversation Language (W3C Note) HP March 2002
• Workflow is general linkage between services; transactions are a critical special case
• Concept of workflow generalizes traditional workflow processes in business
• Many competing workflow implementations and standards; many implementations “reject” current standards
Role of Workflow
Programming SOAP and Web Services (the Grid): Workflow describes linkage between services
As distributed, linkage must be by messages
Linkage is two-way and has both control and data
Apply to multi-disciplinary, multi-scale linkage, multi-program linkage, link visualization to simulation, GIS to simulations and visualization filters to each other
Microsoft-IBM specification BPEL is current preferred Web Service XML specification of workflow
Figure labels: Service-1, Service-2, Service-3
Example workflow
Here a sensor feeds a data-mining application (We are extending data-mining in DoD applications with Grossman from UIC)
The data-mining application drives a visualization
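The sensor, data-mining and visualization chain can be sketched as three stages linked only by the messages that flow between them. This is an illustration under stated assumptions: the stages are local generators, not remote services, and the anomaly rule (a reading more than twice the mean of earlier readings) is a hypothetical stand-in for a real data-mining algorithm.

```python
def sensor(readings):
    """Emit one message per raw reading."""
    for value in readings:
        yield {"value": value}


def data_mining(messages):
    """Flag readings that exceed twice the mean of the readings seen so far."""
    seen = []
    for msg in messages:
        baseline = sum(seen) / len(seen) if seen else msg["value"]
        yield {**msg, "anomaly": msg["value"] > 2 * baseline}
        seen.append(msg["value"])


def visualization(messages):
    """Render each message as '!' for an anomaly and '.' otherwise."""
    return ["!" if m["anomaly"] else "." for m in messages]


frames = visualization(data_mining(sensor([1.0, 1.0, 5.0, 1.0])))
```

Each stage consumes and produces messages without knowing what sits upstream or downstream, so any stage could be replaced by a remote service with the same message format.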
Example Flood Simulation workflow
GIS Grid Services Link Distributed Data and Applications
Figure labels: Data Archives, Runoff Model, Flow Model, SOAP Messages And Events
SERVOGrid Codes, Relationships
Figure labels: Elastic Dislocation, Pattern Recognizers, Fault Model BEM, Viscoelastic Layered BEM, Viscoelastic FEM, Elastic Dislocation Inversion
This linkage called Workflow in Grid/Web Service parlance
56
Two-level Programming I
• The Web Service (Grid) paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
– C++, Java or Fortran Monte Carlo module
– Data streaming from a sensor or satellite
– Specialized (JDBC) database access
• Such services accept and produce data from users, files and databases
• The Grid is built by coordinating such services, assuming we have solved the problem of programming the service
[Figure: Service, Data]
57
Two-level Programming II
The Grid is discussing the composition of distributed services with the runtime interfaces to the Grid as opposed to UNIX pipes/data streams
Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
Such interpretative environments are the single processor analog of Grid Programming
Some projects like GrADS from Rice University are looking at integration between service and composition levels but the dominant effort looks at each level separately
[Figure: Service1, Service2, Service3 and Service4 linked together]
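The single-processor analog mentioned above, a script composing core programs through data streams, can be sketched as follows. This is an illustrative assumption rather than anything from the talk: the two "core programs" are tiny Python one-liners invented for the example, and they are run as separate processes so that the only linkage between them is the stream.

```python
import subprocess
import sys

# Two hypothetical "core programs", each run as its own process.
GENERATE = "for i in range(1, 6): print(i)"
SQUARE = "import sys\nfor line in sys.stdin: print(int(line) ** 2)"

def run_program(source, stdin_text=""):
    """Run one core program, feeding it a text stream and returning its output."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout

def pipeline():
    # The composition level: the output stream of one program feeds the next.
    return run_program(SQUARE, run_program(GENERATE))

squares = pipeline().split()
```

Grid programming replaces the local pipe with messages between remote services, but the separation between writing the programs and composing them is the same.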
58
3 Layer Programming Model
Level 1 Programming: inside services. Application expressed in Java, Fortran, C++, MPI etc.
Level 2 Programming: choosing services by virtualization. Application Semantics (Metadata, Ontology), Semantic Grid
Level 3 Grid Programming: composing multiple services. Service Workflow, Transactions, Mediation
[Figure: Web Service 1, WS 2, ..., WS N-1, Web Service N on top of WS-* Infrastructure]
Substantial work in UK e-Science program, international semantic web community
59
A List of Web Services 4-Continued
• 4) Transactions, Business Processes and Contextualization
• WS-CAF Web Services Composite Application Framework including WS-CTX, WS-CF and WS-TXM below (OASIS Web Services Composite Application Framework TC)
• WS-CTX Web Services Context (OASIS Web Services Composite Application Framework TC) V0.9.2 July 2005
• WS-CF Web Services Coordination Framework (OASIS Web Services Composite Application Framework TC) V0.1 April 2005
• WS-TXM Web Services Transaction Management (OASIS Web Services Composite Application Framework TC) including WS-ACID (V0.1 May 2005), WS-BP (Business Process V0.1 May 2005), WS-LRA (Long running action V0.1 May 2005)
• WS-Coordination Web Services Coordination (BEA, IBM, Microsoft) November 2004
• WS-AtomicTransaction Web Services Atomic Transaction (BEA, IBM, Microsoft) November 2004
• WS-BusinessActivity Web Services Business Activity Framework (BEA, IBM, Microsoft) November 2004
• BTP Business Transaction Protocol (OASIS) May 2002 with V1.1 November 2004
• ebXML BPSS Business Process (OASIS) with V2.0.1 pre-Committee Draft review 17 July 2005
60
A List of Web Services 5
• 5) Security Frameworks and Core Specifications
• WS-Security 2004 Web Services Security: SOAP Message Security (OASIS) Standard March 2004
• WS-I Basic Security Profile V1.0 Web Services Interoperability Organization Working Group Draft May 15 2005
• WS-Security Username Token Profile Web Services Security Username Token Profile V1.0 OASIS Standard, March 2004
• WS-Security X.509 Certificate Token Profile Web Services Security X.509 Certificate Token Profile OASIS Standard, March 2004
• WS-Security REL Profile Web Services Security Rights Expression Language (REL) Token Profile OASIS Standard: 19 December 2004
• WS-I REL Token Profile V1.0 Web Services Interoperability Organization Working Group Draft 13 May 2005
• WS-Security Kerberos Web Services Security Kerberos Binding (Microsoft) December 2003
• Web-SSO Web Single Sign-On Interoperability Profile (Microsoft, Sun) April 2005
• Web-SSO-Mex Web Single Sign-On Metadata Exchange Protocol (Microsoft, Sun) April 2005
• WS-SecurityPolicy Web Services Security Policy Language (IBM, Microsoft, RSA, Verisign) V1.1 July 2005
61
A List of Web Services 5 - Contd
• 5) Security Capabilities
• WS-Trust Web Services Trust Language (BEA, IBM, Microsoft, RSA, Verisign …) February 2005
• WS-SecureConversation Web Services Secure Conversation Language (BEA, IBM, Microsoft, RSA, Verisign …) February 2005
• WS-Federation Web Services Federation Language (BEA, IBM, Microsoft, RSA, Verisign) July 2003
• WS-Federation Active Requestor Profile Web Services Federation Language Active Requestor Profile V 1.0 (BEA, IBM, Microsoft, RSA, Verisign) July 8, 2003
• WS-Federation Passive Requestor Profile Web Services Federation Language Passive Requestor Profile V 1.0 (BEA, IBM, Microsoft, RSA, Verisign) July 8, 2003
• WS-Authorization is being developed by IBM and Microsoft and will build on WS-Trust to describe how access to particular web services is specified and managed.
• WS-Privacy is being developed by IBM and Microsoft and will build on WS-Policy to describe the binding of privacy policies to Web services and their exchanged data.
62
A List of Web Services 5 - Contd
• 5) Security Languages
• SAML Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0 OASIS Standard, 15 March 2005
• WS-Security SAML Token Profile Web Services Security SAML Token Profile OASIS Standard, 1 December 2004
• WS-I SAML Token Profile V1.0 Web Services Interoperability Organization Working Group Draft 13 May 2005
• XACML eXtensible Access Control Markup Language (OASIS) V2.0 1 February 2005
63
A List of Web Services 6
• 6) Service Discovery
• UDDI (Broadly Supported OASIS Standard) V3 August 2003
• WS-Discovery Web services Dynamic Discovery (Microsoft, BEA, Intel …) February 2004
• WS-IL Web Services Inspection Language, (IBM, Microsoft) November 2001
• Note WS-Context as a metadata catalog and WS-Management Catalog are examples of related services
• There are many UDDI extensions
64
A List of Web Services 7
• 7) Metadata and State
• RDF Resource Description Framework (W3C) Set of recommendations expanded from original February 1999 standard
• DAML+OIL combining DAML (Darpa Agent Markup Language) and OIL (Ontology Inference Layer) (W3C) Note December 2001
• OWL Web Ontology Language (W3C) Recommendation February 2004
• WS-MetadataExchange 1.1 Web Services Metadata Exchange (HP, IBM, Intel, Microsoft) March 2006
• ASAP Asynchronous Service Access Protocol (OASIS) with V1.0 working draft 2B December 11 2004
• WS-GAF Web Service Grid Application Framework (Arjuna, Newcastle University) August 2003
• WBEM Web-Based Enterprise Management including CIM (Common Information Model) from DMTF (Distributed Management Task Force) 2004-2005
65
A List of Web Services 7
• 7) Metadata and State: Resource Framework
• WS-RF Web Services Resource Framework (OASIS) including
• WS-Resource Framework Web Services Resource 1.2 (OASIS) Public Review Draft 01, 10 June 2005
• WS-ResourceProperties Web Services Resource Properties V1.2 Public Review Draft 01, 10 June 2005
• WS-ResourceLifetime Web Services Resource Lifetime V1.2 Public Review Draft 01, 13 June 2005
• WS-ServiceGroup Web Services Service Group V1.2 Public Review Draft 01, 10 June 2005
• WS-BaseFaults Web Services Base Faults V1.2 Public Review Draft 01, June 13, 2005
66
Metadata and Service Context
Consider a collection of services working together
• Workflow tells you how to specify service interaction but more basically there is shared information or context specifying/controlling the collection
WS-RF and WS-GAF have different approaches to contextualization – supplying a common “context” which at its simplest is a token to represent state
More generally, core shared information includes dynamic service metadata and the equivalent of configuration information.
One can support such a common context either as a pool of messages or as message-based access to a “database” (Context Service)
Two services linked by a stream are perhaps the simplest example of a collection of services needing context
Note that there is a tension between storing metadata in messages and in services.
• This is the shared versus distributed memory debate in parallel computing
67
Stateful Interactions
There are (at least) four approaches to specifying state
• OGSI uses factories to generate separate services for each session in standard distributed object fashion
• Globus GT-4 and WSRF use metadata of a resource to identify state associated with a particular session
• WS-GAF uses WS-Context to provide abstract context defining state. This has the strength and weakness that it reveals less about the nature of the session
• WS-I+ “Pure Web Service” leaves state specification to the application – e.g. put a context in the SOAP body
I think we should smile and write a great metadata service hiding all these different models for state and metadata
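The last approach, putting a context in the SOAP body, can be sketched with the Python standard library. This is only an illustration under assumptions: the `urn:example:context` namespace and the `Context` and `Payload` element names are hypothetical and belong to no specification; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CTX_NS = "urn:example:context"  # hypothetical application namespace

def build_message(session_token, payload_text):
    """Build a SOAP envelope whose body carries an application-defined context token."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    context = ET.SubElement(body, f"{{{CTX_NS}}}Context")
    context.text = session_token
    payload = ET.SubElement(body, f"{{{CTX_NS}}}Payload")
    payload.text = payload_text
    return ET.tostring(envelope, encoding="unicode")

def read_context(message_text):
    """Recover the context token so the service can look up the session state."""
    envelope = ET.fromstring(message_text)
    return envelope.find(f"{{{SOAP_NS}}}Body/{{{CTX_NS}}}Context").text

message = build_message("session-42", "run simulation")
token = read_context(message)
```

The service stays stateless in itself; whatever it needs to remember is keyed by the token that travels with each message.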
68
A List of Web Services 8
• 8) Management – original OASIS
• WS-DistributedManagement Web Services Distributed Management Framework with MUWS and MOWS below (OASIS)
• WSDM-MUWS Web Services Distributed Management: Management Using Web Services (OASIS) OASIS Standard March 9 2005
• WSDM-MOWS Web Services Distributed Management: Management of Web Services (OASIS) OASIS Standard March 9 2005
69
A List of Web Services 8 - Contd
• 8) Management: Microsoft Converged Stack
• WS-Management Web Services for Management (Microsoft, Intel, Sun …) August 2005
• WS-Management Catalog The WS-Management Catalog (Microsoft, Intel, Sun …) August 2005
• WS-ResourceTransfer Web Service Resource Transfer (HP, IBM, Intel, Microsoft) March 2006
• WS-Transfer Web Service Transfer (Microsoft, BEA, Sonic Software etc.) September 2004
• WS-TransferAddendum Extensions to Web Service Transfer (HP, IBM, Intel, Microsoft) March 2006
• WS-Enumeration Web Service Enumeration (Microsoft, BEA, Sonic Software etc.) September 2004
70
A List of Web Services 9
• 9) General Service Characteristics
• WS-PolicyFramework Web Services Policy Framework (BEA, IBM, Microsoft, SAP …) September 2004
• WS-PolicyAttachment Web Services Policy Attachment (BEA, IBM, Microsoft, SAP …) September 2004
• WS-PolicyAssertions Web Services Policy Assertions Language (BEA, IBM, Microsoft, SAP) 18 December 2002 (Superseded by WS-PolicyFramework)
• WS-Agreement Web Services Agreement Specification (GGF under development) 9 August 2004
71
A List of Web Services 10
• 10) User Interfaces
• WSRP Web Services for Remote Portlets (OASIS) OASIS Standard August 2003
• JSR168: JSR-000168 Portlet Specification for Java binding (Java Community Process) October 2003
• WSRP specifies the client-service protocol, while JSR168 specifies how the user-facing Web service ports of each supported service are implemented as portlets inside aggregating portals like JetSpeed, GridSphere or uPortal
72
WS-I Interoperability
Critical underpinning of Grids and Web Services is the gradually growing set of specifications in the Web Service Interoperability Profiles
Web Services Interoperability (WS-I) Interoperability Profile 1.0a (http://www.ws-i.org) gives us XSD, WSDL1.1, SOAP1.1, UDDI in the basic profile and parts of WS-Security in their first security profile.
We imagine the “60 Specifications” being checked out and evolved in the cauldron of the real world, and occasionally best practice identifies a new specification to be added to WS-I, which gradually increases in scope
• Note only 4.5 out of 60 specifications have “made it” in this definition
73
[Figure: a Grid of services exchanging SOAP Messages. Sensor Services (SS), Filter Services (FS), Other Services (OS) and MetaData services (MD), together with a Database, a Portal and links to other Grids and services, carry Raw Data through Data, Information, Knowledge and Wisdom to Decisions.]
74
Semantic Grid and Services
Implications of SOA (Service Oriented Architectures) for SG (Semantic Grid)
• Build services to implement SG
Implications of SG for SOA
• Build metadata rich systems of services using SG
Services receive data in SOAP messages, manipulate it and produce transformed data as further messages
Meta-data is carried in SOAP messages
Meta-data controls processing and transport of SOAP Messages
Knowledge is created from data by services
The Grid enhances Web services with semantically rich system and application specific management
One must exploit and work around the different approaches to meta-data and their manipulation in Web Services
75
Structure of SOAP Messages
SOAP Messages have System information in the header including WS-Policy based meta-data defining processing options
• Processed by Handlers
Application data and meta-data is in the body (controversies here!)
• Processed by the Service itself
Some meta-data like WS-RF is logically “only in messages”
Other meta-data, like that in WS-Context or the SRB, is stored in the logical equivalent of XML databases
We only need to preserve semantic structure (XML/SOAP Infoset), so we can transport in fast XML and store in efficient relational databases
[Figure: a message with headers H1 to H4 and a Body passes through container handlers F1 to F4 to the Service; Container Handlers, Container Workflow]
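The split between handlers that process the header and a service that processes the body can be sketched as a small handler chain. This is a hedged illustration, not any container's real API: the message is a plain dictionary, and the handler names and the `To` and `Sequence` header fields are assumptions made up for the example.

```python
# Hypothetical sketch: handlers read only the header; the service reads only the body.

def route_handler(headers, context):
    # Copies the destination named in the header into the processing context.
    context["destination"] = headers["To"]

def sequence_handler(headers, context):
    # Records the sequence number so later stages could detect gaps or reordering.
    context["sequence"] = int(headers["Sequence"])

def echo_service(body, context):
    # The service itself only ever sees the body plus what the handlers extracted.
    return f"{context['destination']}:{context['sequence']}:{body.upper()}"

def run_container(message, handlers, service):
    """Run every handler over the header, then hand the body to the service."""
    context = {}
    for handler in handlers:
        handler(message["headers"], context)
    return service(message["body"], context)

message = {"headers": {"To": "svc", "Sequence": "7"}, "body": "hello"}
result = run_container(message, [route_handler, sequence_handler], echo_service)
```

Because handlers never touch the body, they can be added or reordered by the container without changing the service.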
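76
Support for Messages
Optimize XML representation and transport protocol
[Figure: standard XML passes through Filter1 to give XML’ and through Filter2 to give XML’’; the inverse filters Filter2^-1 and Filter1^-1 recover standard XML. A Database (WS-Context) is used to choose the invertible filter and to choose the protocol.]
Filters Preserve Infoset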
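An invertible filter that preserves the Infoset can be sketched as below. The sketch uses zlib compression purely as a stand-in for a binary XML encoding; it is not Fast Infoset, and the function names are hypothetical. The equivalence check relies on `xml.etree.ElementTree.canonicalize`, which needs Python 3.8 or later.

```python
import zlib
import xml.etree.ElementTree as ET

def filter1(xml_text):
    """Forward filter: turn standard XML text into a compact byte representation."""
    return zlib.compress(xml_text.encode("utf-8"))

def filter1_inverse(blob):
    """Inverse filter: recover the XML text from the compact representation."""
    return zlib.decompress(blob).decode("utf-8")

def same_infoset(xml_a, xml_b):
    """Compare two documents by canonical form rather than by raw characters."""
    return ET.canonicalize(xml_a) == ET.canonicalize(xml_b)

document = '<m b="2" a="1"><v>3</v></m>'
restored = filter1_inverse(filter1(document))
```

What has to survive the round trip is the information in the document, not its exact bytes, which is why the comparison is made on the canonical form.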
77
FI (Fast Infoset=Binary XML) v Traditional XML Messages
[Figure: Transfer Time Comparison. Time (ms, axis 0 to 700) against # Of Features Per Message, for Transfer - FI and Transfer - XML.]
78
PDA to Web Service Optimized Communication
[Figure: Total Session Time (sec, axis 0 to 140) against Number Of Messages Per Session (axis 0 to 35), for HHFR: 16 String Per Message and SOAP: 16 String Per Message.]
79
Requirements for MPI Messaging
MPI and SOAP Messaging both send data from a source to a destination
• MPI supports multicast (broadcast) communication
• MPI specifies destination and a context (in comm parameter)
• MPI specifies data to send
• MPI has a tag to allow flexibility in processing in source processor
• MPI has calls to understand context (number of processors etc.)
MPI requires very low latency and high bandwidth so that tcomm/tcalc is at most 10
• BlueGene/L has bandwidth between 0.25 and 3 Gigabytes/sec/node and latency of about 5 microseconds
• Latency determined so Message Size/Bandwidth > Latency
[Figure: timeline of tcomm and tcalc intervals]
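The condition Message Size/Bandwidth > Latency can be turned into a quick estimate of the message size above which transfer time, rather than latency, dominates. The function below is a hypothetical helper for that arithmetic, using the BlueGene/L figures quoted above and taking a Gigabyte as 10^9 bytes.

```python
def min_message_bytes(latency_s, bandwidth_bytes_per_s):
    """Message size at which transfer time (size / bandwidth) equals the latency."""
    return latency_s * bandwidth_bytes_per_s

LATENCY_S = 5e-6        # about 5 microseconds
LOW_BANDWIDTH = 0.25e9  # 0.25 Gigabytes/sec/node
HIGH_BANDWIDTH = 3e9    # 3 Gigabytes/sec/node

low = min_message_bytes(LATENCY_S, LOW_BANDWIDTH)    # roughly 1.25 kilobytes
high = min_message_bytes(LATENCY_S, HIGH_BANDWIDTH)  # roughly 15 kilobytes
```

On these numbers, messages of more than roughly 1 to 15 kilobytes satisfy the condition; shorter messages are dominated by latency.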
80
Requirements for SOAP Messaging
Web Services have much the same requirements as MPI, with two differences where MPI is more stringent than SOAP
• Latencies are inevitably 1 (local) to 100 milliseconds, which is 200 to 20,000 times that of BlueGene/L
 1) 0.000001 ms – CPU does a calculation
 2) 0.001 to 0.01 ms – MPI latency
 3) 1 to 10 ms – wake-up a thread or process
 4) 10 to 1000 ms – Internet delay
• Bandwidths for many business applications are low, as one just needs to send enough information for ATM and Bank to define transactions
SOAP has MUCH greater flexibility in areas like security, fault-tolerance, “virtualizing addressing” because one can run a lot of software in 100 milliseconds
• Typically takes 1-3 milliseconds to gobble up a modest message in Java and “add value”
81
Structure of SOAP
SOAP defines a very obvious message structure with a header and a body just like email
The header contains information used by the “Internet operating system”
• Destination, Source, Routing, Context, Sequence Number …
The message body is partly further information used by the operating system and partly information for the application; the latter is not looked at by the “operating system” except to encrypt it, compress it etc.
• Note WS-Security supports separate encryption for different parts of a document
Much discussion in the field revolves around what is referenced in the header
This structure makes it possible to define VERY sophisticated messaging
82
MPI and SOAP Integration
Note SOAP specifies the message format and, through WSDL, the interfaces
MPI only specifies the interface, and so interoperability between different MPIs requires additional work
• IMPI http://impi.nist.gov/IMPI/
Pervasive networks can support high bandwidth (Terabits/sec soon) but the latency issue is not resolvable in a general way
Can combine MPI interfaces with SOAP messaging but I don’t think this has been done
Just as walking, cars, planes, phones coexist with different properties; so SOAP and MPI are both good and should be used where appropriate
83
When is a High Performance Computer?
We might wish to consider three classes of multi-node computers
1) Classic MPP with microsecond latency and scalable internode bandwidth (tcomm/tcalc ~ 10 or so)
2) Classic Cluster, which can vary from configurations like 1) to 3) but typically has millisecond latency and modest bandwidth
3) Classic Grid or distributed systems of computers around the network
• Latencies of inter-node communication – 100’s of milliseconds, but can have good bandwidth
All have the same peak CPU performance but synchronization costs increase as one goes from 1) to 3)
Cost of system (dollars per gigaflop) decreases by factors of 2 at each step from 1) to 2) to 3)
One should NOT use classic MPP if class 2) or 3) suffices unless some security or data issue dominates over cost-performance
One should not use a Grid as a true parallel computer – it can link parallel computers together for convenient access etc.
84
Linking Modules
From method based to RPC to message based to event-based publish-subscribe Message Oriented Middleware
[Figure: three ways to link modules. Module A and Module B linked by Method Calls, .001 to 1 millisecond (closely coupled Java/Python …). Service A and Service B linked by Messages, 0.1 to 1000 millisecond latency (Coarse Grain Service Model). Service A and Service B linked through a “Message Queue in the Sky”: a Publisher posts events and a “Listener” subscribes to events.]
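The event-based publish-subscribe style at the end of this progression can be sketched with a tiny in-process broker. This is only an illustration of the pattern: the `Broker` class, the topic names and the event fields are hypothetical, and a real Message Oriented Middleware would deliver events across the network rather than by direct calls.

```python
from collections import defaultdict

class Broker:
    """Minimal stand-in for a message queue: routes posted events to subscribed listeners."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, listener):
        # A listener is any callable that accepts one event.
        self._subscribers[topic].append(listener)

    def publish(self, topic, event):
        # The publisher does not know who, if anyone, is listening.
        listeners = self._subscribers[topic]
        for listener in listeners:
            listener(event)
        return len(listeners)

received = []
broker = Broker()
broker.subscribe("flood", received.append)
delivered = broker.publish("flood", {"level": 3})
undelivered = broker.publish("gas", {"pressure": 1})
```

Publisher and listener share only a topic name, so either side can be replaced or duplicated without the other noticing.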
85
What is a Simple Service?
Take any system – it has multiple functionalities
• We can implement each functionality as an independent distributed service
• Or we can bundle multiple functionalities in a single service
Whether functionality is an independent service or one of many method calls into a “glob of software”, we can always make them Web services by converting the interface to WSDL
Simple services are obtained by taking functionalities and making them as small as possible subject to the “rule of millisecond”
• Distributed services incur messaging overhead of one (local) to 100’s (far apart) of milliseconds to use a message rather than a method call
• Use scripting or compiled integration of functionalities ONLY when you require <1 millisecond interaction latency
Apache web site has many (pre Web Service) projects that are multiple functionalities presented as (Java) globs and NOT (Java) Simple Services
• Makes it hard to integrate sharing common security, user profile, file access .. services
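The “rule of millisecond” amounts to a one-line decision, sketched below. The function name and the returned labels are hypothetical; the 1 millisecond threshold is the one stated in the bullets above.

```python
def choose_linkage(required_latency_ms):
    """Apply the rule of millisecond to one interaction between two functionalities."""
    if required_latency_ms < 1.0:
        # Too tight for messaging: integrate by method call inside one service.
        return "method call"
    # Otherwise split the functionality out and link it by messages.
    return "service message"

tight = choose_linkage(0.01)   # an inner-loop interaction
loose = choose_linkage(50.0)   # an interaction that can tolerate network delay
```

Applied to every interaction in a system, this picks out the smallest pieces that can still be separate services.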
86
Grids of Grids of Simple Services
• Link via methods, messages, streams
• Services and Grids are linked by messages
• Internally to a service, functionalities are linked by methods
• A simple service is the smallest Grid
• We are familiar with the method-linked hierarchy: Lines of Code, Methods, Objects, Programs, Packages
[Figure: Overlay and Compose Grids of Grids. Methods, Services, Component Grids; CPUs, Clusters, MPPs, Compute Resource Grids; Databases, Federated Databases, Sensor, Sensor Nets, Data Resource Grids.]
87
Component Grids?
So we build collections of Web Services which we package as component Grids
• Visualization Grid
• Sensor Grid
• Utility Computing Grid
• Collaboration Grid
• Earthquake Simulation Grid
• Control Room Grid
• Crisis Management Grid
• Drug Discovery Grid
• Bioinformatics Sequence Analysis Grid
• Intelligence Data-mining Grid
We build bigger Grids by composing component Grids using the Service Internet
88
Typical use of Grid Messaging in NASA
[Figure: Datamining Grid, Sensor Grid implemented using NB, and GIS Grid, linked through NB]
89
[Figure: Grid of Grids built on a Physical Network (monitored by FS16). Core Low Level Grid Services: 3: Messaging, 4: Notification, 5: Workflow, 6: Security, 7: Discovery, 8: Metadata, 9: Management, 10: Policy, 18: Scheduling. Services: 11: Portals, 12: Computing, 13: Data Access/Storage, 14: Information, Instrument/Sensor, 17: Collaboration. Domain Specific Grids/Services: BioInformatics Grid (15: Application Services: Sequencing Tools, Biocomplexity Simulations), Chemical Informatics Grid (15: Application Services: Screening Tools, Quantum Calculations), …]
Using the Grid of Grids and Core Services to build multiple application grids re-using common components.
90
Critical Infrastructure (CI) Grids built as Grids of Grids
[Figure: layered stack. Physical Network; Core Grid Services (Registry, Metadata, Data Access/Storage, Security, Workflow, Notification, Messaging); Portals, Visualization Grid, Collaboration Grid, Sensor Grid, Compute Grid, GIS Grid; Flood Services and Filters, Gas Services and Filters; Flood CIGrid, Gas CIGrid, … Electricity CIGrid …]
91
Mediation and Transformation in a Grid of Grids and Simple Services
[Figure: three Subgrids or services, each with Ports, Internal Interfaces and External facing Interfaces, linked by Messaging through Mediation and Transformation Services]
Why can we build better software?
In 1962 I was punching holes in cards and paper tape to persuade tiny slow computers to manipulate words in memory to string together instructions like a = b + c
Now computers are much faster and languages are better but not a lot better
• I suspect I would only be a factor of 2 or so faster programming the same program today
However A, B, C can now be resources (Bank records, Drugs, Games, Supernova) and + can be a service composition
• Objects were insufficient as they distributed ordinary programs; services express distributed independent entities (communication time very different inter and intra computers)
• Services are essential for reliable modular programming
What’s wrong with old programs
They were made of instructions, methods, subroutines and libraries thereof
Languages (Java, C++) encouraged spaghetti programming that linked parts of programs together
• This leads to efficient but unmaintainable software
However now computers and networks are several orders of magnitude faster
• Optimize for modularity and maintainability and rarely if ever optimize for performance
Old programs have the wrong optimization and by construction are hard to maintain/change
Old and New Software Regime
Web Services, Grids and P2P systems are built with
• The new software model: independent entities connected by explicit messages
 All computer entities are actually connected by some form of message (traveling on a bus or from memory to register) but it is often implicit
• And they support the distributed services and resources needed for global science, fun and business
• Google, Amazon, Yahoo and perhaps Microsoft and Electronic Arts can exploit this model
Old programs have the old architecture and cannot be modified
• At best can wrap partial functionalities as services and use as a black box
• IBM, Oracle and the old Enterprise software companies have this noose around their necks
95
Delicious Applications
http://del.icio.us purchased by Yahoo for ~$30M
http://www.CiteULike.org
http://www.connotea.org (Nature)
http://www.bibsonomy.org/
• Associate metadata with Bookmarks specified by URL’s, DOI’s (Digital Object Identifiers)
• Users add comments and keywords (called tags)
• Users are linked together into groups (communities)
• Information such as title and authors extracted automatically from some sites (PubMed, ACM, IEEE, Wiley etc.)
• Bibtex like additional information
This is de facto Semantic Web – remarkable for its simplicity
96
Connotea
97
Connotea queried by SERVOGrid
98
Provenance and Delicious ????
???? is any field such as chemistry
All ???? Data should be associated with provenance that describes its lineage
• How and when it was created
• Compiler options used in simulation
• ????XMLfrontendedDatabase query used on what ????GridNodes
Provenance produced by computer automatically and/or by user
All ????Data can and should be labeled by a URI such as cicc://ciccnodenumber.xx.yy.whathaveyou
We can use a del.icio.us style interface to annotate ????Data with missing provenance and user comments of any type (describing quality of data or a keyword relating different data etc.)
99
Semantic Scholar Grid
Citeseer and Google Scholar scour the Internet and analyze documents for incidental metadata
• Title, author and institution of documents
• Citations with their own metadata allowing one to match to other documents
These capabilities are sure to become more powerful and to be extended
• Give “Citation Index” in real time
• Tell you all authors of all papers that cite a paper that cites you etc. (Note it’s a small world so don’t go too far in link analysis)
• Tell you all citations of all papers in a workshop
Such high value tools will appear on “publisher” sites of the future (or else publishers will disappear)
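The query “all authors of all papers that cite a paper that cites you” is a two-hop walk over a citation graph, which can be sketched as below. The data structures and the toy paper and author names are invented for illustration; a real tool would draw them from harvested citation metadata.

```python
def authors_two_hops(my_papers, cited_by, authors_of):
    """Authors of papers that cite a paper that cites one of my papers."""
    first_hop = set()
    for paper in my_papers:
        first_hop |= cited_by.get(paper, set())
    second_hop = set()
    for paper in first_hop:
        second_hop |= cited_by.get(paper, set())
    authors = set()
    for paper in second_hop:
        authors |= authors_of.get(paper, set())
    return authors

# Toy data: P1 is "my" paper; P2 cites P1; P3 and P4 cite P2.
cited_by = {"P1": {"P2"}, "P2": {"P3", "P4"}}
authors_of = {"P2": {"Ann"}, "P3": {"Bob"}, "P4": {"Cat", "Bob"}}
found = authors_two_hops({"P1"}, cited_by, authors_of)
```

Each extra hop multiplies the set of papers reached, which is the small-world caveat in the bullet above.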
100
OSCAR2 Chemistry Document analysis
It detects “magic” chemical strings in text and then
• Stores them as metadata associated with document
• Queries ChemInformatics repositories to tell you lots of information about identified compounds
• Tells you which other documents have this compound
101
???? Version of OSCAR
Some of the ???? Nodes will store metadata associated with ????Data – including documents
• Note documents could be anywhere on the Internet – the ???? Node may choose to store (a copy of) document or just its metadata
• Note all ????Nodes are federated i.e. there is no “one central” store of any type of data
Metadata will be user annotations including tags, Citeseer style citation information for all scientific fields
Then each scientific field has its own version of OSCAR tuned to extract natural metadata for science – for Earthquake science this is GML and Chemistry is CML …
102
[Figure: Document-enhanced Research Grid. Existing Document-based Research Tools (Google Scholar, Manuscript Central, Science.gov, Windows Live Academic Search, Citeseer, CMT Conference Management etc.) with their Existing User Interface are given Web service Wrappers. Community Tools (CiteULike, Connotea, Del.icio.us, Bibsonomy, Biolicious), Generic Document Tools (MyResearch Database, Bibliographic Database, Export: RSS, Bibtex, Endnote etc.), PubChem, PubMed and Traditional Cyberinfrastructure support New Document-enhanced Research Tools behind an Integration/Enhancement User Interface.]
103
[Figure: Integration Framework of Tools. Tool-1 Del.icio.us, Tool-2 Connotea, Tool-3 CiteULike and Tool-N (e.g. CiteSeer), each with its Native UI, connect through Gateway WS-1 to Gateway WS-N to the SSG MD Store, the SSG Domain-1 to Domain-N Web services and an Integrated User Interface.]