Manchester Computing Supercomputing, Visualization & e-Science Stephen Pickles, Andrew Porter, Robin Pinning & Rob Haines <[email protected]> http://www.realitygrid.org Royal Society, Tuesday 15 June, 2004 RealityGrid RealityGrid Software Infrastructure: Achievements and Prospects
Transcript
Stephen Pickles, Andrew Porter, Robin Pinning & Rob Haines
Software Infrastructure: Achievements and Prospects
RealityGrid Annual Workshop, 15/6/2004
Outline
Review
– How we got here
Status
– Where we are today
Prospects
– Where we’re going
Review
How we got here
The pieces
Fast track
– Computational Steering Library and tools (MC)
– On-line Visualization (MC)
– Web portal (EPCC)
– Human-Computer Interfaces (HCI)
Deep track
– Performance Control (CNC)
– Resource management, component frameworks (IC)
– Instruments: LUSI, XMT (not this talk)
This talk will emphasise fast track work.
Design philosophies
Grid-enabled
Component-based and service-oriented
– plug in and compose new components and services, from partners and third parties
Independence and modularity
– to minimize dependencies on third-party software
• Should be able to steer locally without any Grid middleware
– to facilitate parallel development within project
Integration and/or interoperability
– Things should work together
Respect autonomy of application owners
– Prefer light-weight instrumentation of application codes to wholesale re-factoring
– Same source (or binary) should work with or without steering
Dynamism and adaptability
– Attach/detach steering client from running application
– Adapt to prevailing conditions
Intuitive and appropriate user interfaces
Historical Context – Messages from above
In 2002, we were told “use Globus, SRB or Condor”.
Then we were told “Web services are OK too”.
Then the Open Grid Services Architecture (OGSA) effort was announced.
OGSA would be based on the Open Grid Services Infrastructure (OGSI), and specifications began in earnest with (it seemed) overwhelming industrial support.
“You must be on an OGSA-convergence track. You must use e-Science certificates.”
GT3 appears in 2003. Some people build GT3 services. No-one builds production grids based on GT3.
Early in 2004, we hear “OGSI was a great success. OGSI is dead. Long live WS-RF. GT3 is obsolescent.”
2002 - Enter Grid Services
OGSI brought the hope of convergence between Web services (technology of choice for business process integration) and Grid computing.
It offered state, 2-level naming (GSH, GSR), lifetime management, and infrastructure support for common patterns (factories, registries, notification)…
With Dave Snelling, we experimented with a UNICORE-based OGSI prototype (pre-dating GT3 preview).
First “Fast Track” Demonstration
Jens Harting at UK e-Science All Hands Meeting, September 2002
“Fast Track” Steering Demo, UK e-Science AHM 2002
[Diagram labels]
– Bezier, SGI Onyx @ Manchester: VTK + VizServer
– Dirac, SGI Onyx @ QMUL: LB3D with RealityGrid Steering API
– Laptop, SHU Conference Centre: VizServer client, Steering GUI
– UNICORE Gateway and NJS, Manchester
– UNICORE Gateway and NJS, QMUL
– Firewall
– Flows: SGI OpenGL VizServer, Simulation Data, Steering (XML)
– The Mind Electric GLUE web service hosting environment with OGSA extensions
– Single sign-on using UK e-Science digital certificates
Steering architecture in 2002
Communication modes:
• Shared file system
• Files moved by UNICORE daemon
• GLOBUS-IO
[Diagram labels: Client, Simulation, Visualization (x2), Steering library (x3), data transfer]
Dilemma
Wanted to separate steering from job management
Architecture was brittle and firewall unfriendly
– Client needed to know too much about application deployment
– Direct connection between client and simulation is problematic when client is mobile
OGSI’s lifetime management, registries, language neutrality and notification seemed ideal for steering
– (ended up not using OGSI notification for firewall reasons)
But all “production” grids were based on Globus Toolkit version 2 (GT2)
Serendipity – OGSI::Lite
Mark Mc Keown’s OGSI::Lite started life as a spare time exercise to understand Web services, then OGSI.
Soon became a near-complete OGSI implementation.
Minimal pre-requisites (Perl and SOAP::Lite) meant we could deploy it trivially in user space when the job is run. Only need permission to listen on a port. (This would be highly non-trivial using the deep stack of GT3.)
So we could have our OGSI cake and eat it on a GT2 grid.
Our steering architecture quickly got a middle-tier implemented in OGSI::Lite.
The Architecture of Steering
[Diagram labels: Steering client, Registry, Steering GS (x2, OGSI middle tier), Simulation, Visualization (x2), Steering library, Display (x3). Labelled interactions: publish, find, bind, connect, data transfer (Globus-IO).]
components start independently and attach/detach dynamically
multiple clients: Qt/C++, .NET on PocketPC, GridSphere Portlet (Java)
remote visualization through SGI VizServer, Chromium, and/or streamed to Access Grid
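The publish / find / bind interactions named on this slide can be illustrated with a small in-memory sketch. Everything in it is hypothetical: the `Registry` and `SteeringService` classes, their methods and the handle strings are invented for illustration and are not the OGSI::Lite or Steering Grid Service interfaces. It only shows how components that start independently could locate each other through a registry and then attach or detach.

```python
class SteeringService:
    """Hypothetical stand-in for a Steering Grid Service (middle tier)."""

    def __init__(self, handle, application):
        self.handle = handle            # would be a service handle/URL in a real grid
        self.application = application  # name of the steered component
        self.clients = set()

    def attach(self, client_name):
        self.clients.add(client_name)

    def detach(self, client_name):
        self.clients.discard(client_name)


class Registry:
    """Hypothetical registry: components publish services, clients find them."""

    def __init__(self):
        self._services = {}

    def publish(self, service):
        self._services[service.handle] = service

    def unpublish(self, handle):
        self._services.pop(handle, None)

    def find(self, application):
        return [s for s in self._services.values() if s.application == application]


registry = Registry()
# Simulation and visualization start independently; each publishes a service.
registry.publish(SteeringService("sgs-1", "LB3D"))
registry.publish(SteeringService("sgs-2", "visualization"))

# A steering client finds the simulation's service, binds to it, then detaches.
sim_service = registry.find("LB3D")[0]
sim_service.attach("qt-steerer")
attached = sorted(sim_service.clients)
sim_service.detach("qt-steerer")
```

Because the client only ever talks to the registry and the middle-tier service, it needs no knowledge of where the simulation was actually deployed.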
The TeraGyroid Project
Funding from EPSRC (UK) & NSF (USA)
Ran LB3D across UK e-Science Grid and US TeraGrid
Study of defect dynamics in liquid crystalline surfactant systems using lattice-Boltzmann methods
Featured world’s largest Lattice Boltzmann simulation
TRICEPS was the HPC-Challenge aspect of this work
– Transcontinental RealityGrids for Interactive Collaborative Exploration of Parameter Space
– “most innovative data-intensive application” at SC’03
Later picked up ISC 2004 award in the “Integrated Data and Information Management” category
More in Richard Blake’s talk
New for TeraGyroid
Access Grid integration
use of Chromium to complement VizServer
job migration based on malleable checkpoints
user friendly “wizard” to drive job launching and migration
support for parameter space exploration through checkpoint trees
– also implemented in OGSI::Lite
– services thrown together for TeraGyroid have been upgraded in flight
– still running 8 months later
file transfer service
– to get around issues with systems homed on two networks
port forwarding (Stephen Booth, EPCC)
– to work around lack of public IP address on compute nodes (e.g. HPCx)
Checkpoint trees and parameter space exploration
Initial condition: Random water/surfactant mixture.
Self-assembly starts.
Rewind and restart from checkpoint.
Lamellar phase: surfactant bilayers between water layers.
Cubic micellar phase, low surfactant density gradient.
Cubic micellar phase, high surfactant density gradient.
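The rewind-and-restart idea behind a checkpoint tree can be sketched as a small tree data structure. This is an illustrative assumption, not the project's Checkpoint Metadata Tree service: the `Checkpoint` class, the `surfactant_density` parameter name and its values are all made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Checkpoint:
    """One node of a hypothetical checkpoint tree."""
    label: str
    params: Dict[str, float]
    parent: Optional["Checkpoint"] = None
    children: List["Checkpoint"] = field(default_factory=list)

    def branch(self, label, **changed):
        """Restart from this checkpoint with some parameters changed."""
        child = Checkpoint(label, {**self.params, **changed}, parent=self)
        self.children.append(child)
        return child

    def lineage(self):
        """Labels from the root down to this checkpoint."""
        node, labels = self, []
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return labels[::-1]


root = Checkpoint("initial mixture", {"surfactant_density": 0.2})
assembling = root.branch("self-assembly")
# Rewind to the same checkpoint twice and explore two parameter values.
low = assembling.branch("run A", surfactant_density=0.1)
high = assembling.branch("run B", surfactant_density=0.4)
```

Each branch records the parameters it was restarted with, so the path from the root to any leaf documents how that region of parameter space was reached.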
Access Grid integration - SC Global
TeraGyroid Testbed
[Network diagram]
– Sites: PSC, ANL, NCSA, Phoenix, Caltech, SDSC, UCL, Daresbury, Manchester
– Networks: Starlight (Chicago), Netherlight (Amsterdam), BT provision, SJ4, MB-NG, production network
– Legend: Visualization, Computation, Network PoP, Access Grid node, Service Registry, Dual-homed system
– Link speeds: 10 Gbps, 2 x 1 Gbps
EPSRC e-Science Meeting 2004
Multiple steering clients driving same simulation
– Qt client on laptop
– .NET client on PDA
• Simon Nee (Loughborough)
– Web client
• GridSphere Portlet
• Access through web browser
• Matthew Egbert (EPCC)
– not all at same time
– significant achievement in terms of OGSI interoperability
Collaborative steering prototype
– using ICENI and client proxy
– Java bindings to client side of steering library (JNI)
– Gary Kong (LeSC)
Public Release – April 2004
Steering Library released as version 1.1
– version 1.0 was project internal
– very liberal open source license (FreeBSD)
– API specification version 1.1
– Library (C and Fortran90 bindings)
– Tools, including Qt steerer
– User Manual
– Examples
Available for download at: http://www.sve.man.ac.uk/Research/AtoZ/RealityGrid/
Globus-IO replaced by vanilla sockets
– major simplification to build process
– only way to complete integration of NAMD and VMD into RealityGrid
Status
Where we are today
Steering library
We instrument (add "knobs" and "dials" to) simulation codes through a steering library, written in C
– Bindings in Fortran90, C/C++ (complete) and Java (partial)
Library features:
– Pause/resume
– Checkpoint and restart
– Set values of steerable parameters (parameter steer)
– Report values of monitored (read-only) parameters (parameter watch)
– Emit "samples" to remote systems for e.g. on-line visualization
– Consume "samples" from remote systems for e.g. resetting boundary conditions
– Automatic emit/consume with steerable frequency
– No restrictions on parallelisation paradigm
You only implement what you need
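As a rough illustration of what light-weight instrumentation of a time-stepping code might look like, here is a minimal sketch in Python. It does not use the RealityGrid steering library (which is written in C); the `Steerer` class, its `control` method, the command names and the `coupling` parameter are all hypothetical. The point is only the pattern: one call per time step that applies queued commands and reports monitored values.

```python
from collections import deque


class Steerer:
    """Hypothetical, in-process stand-in for a steering library."""

    def __init__(self):
        self.steerable = {}      # parameters a client may change
        self.monitored = {}      # read-only values reported to clients
        self.commands = deque()  # commands queued by a client
        self.paused = False
        self.checkpoints = []

    def control(self, step, state):
        """Called once per time step: apply queued commands, report status."""
        while self.commands:
            name, arg = self.commands.popleft()
            if name == "set":
                self.steerable.update(arg)
            elif name == "pause":
                self.paused = True
            elif name == "resume":
                self.paused = False
            elif name == "checkpoint":
                self.checkpoints.append((step, dict(state)))
            elif name == "stop":
                return False
        self.monitored["step"] = step
        return True


steer = Steerer()
steer.steerable["coupling"] = 1.0
state = {"energy": 0.0}

# A client queues commands; in this sketch they are queued before the run.
steer.commands.extend([("set", {"coupling": 2.0}), ("checkpoint", None)])

for step in range(5):
    if not steer.control(step, state):
        break
    if steer.paused:
        continue  # a real code would wait here rather than skip the step
    state["energy"] += steer.steerable["coupling"]
```

With no client attached the command queue stays empty and the loop runs unchanged, which mirrors the goal that the same source should work with or without steering.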
Qt Steering client
Built using C++ and Qt
Attaches to any steerable RealityGrid application
Discovers what commands are supported
Discovers steerable & monitored parameters
Constructs appropriate widgets on the fly
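Constructing widgets from discovered metadata can be sketched without any GUI toolkit. The record layout below (`name`, `type`, `steerable`, `min`, `max`) and the widget names are assumptions made for illustration; the real client is written in C++ with Qt and its internals are not shown on the slide.

```python
def widget_for(param):
    """Choose a widget description for one parameter record (hypothetical schema)."""
    if not param["steerable"]:
        return {"widget": "label", "name": param["name"]}  # monitored: read-only
    if param["type"] == "int":
        return {"widget": "spinbox", "name": param["name"]}
    if param["type"] == "float":
        return {"widget": "slider", "name": param["name"],
                "min": param.get("min"), "max": param.get("max")}
    return {"widget": "textbox", "name": param["name"]}


def build_panel(commands, params):
    """One button per supported command, one widget per discovered parameter."""
    buttons = [{"widget": "button", "name": c} for c in commands]
    return buttons + [widget_for(p) for p in params]


discovered_commands = ["pause", "resume", "checkpoint"]
discovered_params = [
    {"name": "temperature", "type": "float", "steerable": True, "min": 0.0, "max": 2.0},
    {"name": "output_freq", "type": "int", "steerable": True},
    {"name": "step", "type": "int", "steerable": False},
]
panel = build_panel(discovered_commands, discovered_params)
```

Because the panel is derived entirely from what the application reports, the same client can attach to applications it has never seen before.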
On-line visualisation
Fast track uses open source VTK for on-line visualisation
– Simple GUI built with Tk/Tcl, polls for new data to refresh image
– Some in-built parallelism
– extended to use the steering library
– AVS-format data supported
– XDR-format data for sample transfer between platforms
– Volume render (parallel)
– Isosurface
– Hedgehog
– Cut-plane
New work on atom-centric meshes for Steve Kenny
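To show why XDR suits sample transfer between platforms, here is a minimal sketch of XDR-style encoding of a float array using only Python's `struct` module. XDR fixes the byte order (big-endian) and uses 4-byte units, so sender and receiver need not share an architecture. The function names are hypothetical and this is not the steering library's actual wire format.

```python
import struct


def xdr_pack_floats(values):
    """Encode a variable-length array of 32-bit floats in XDR style:
    a big-endian unsigned count followed by big-endian IEEE 754 singles."""
    return struct.pack(f">I{len(values)}f", len(values), *values)


def xdr_unpack_floats(data):
    """Decode the array produced by xdr_pack_floats."""
    (count,) = struct.unpack_from(">I", data, 0)
    return list(struct.unpack_from(f">{count}f", data, 4))


sample = [0.5, -1.25, 3.0]   # values exactly representable in 32 bits
wire = xdr_pack_floats(sample)
decoded = xdr_unpack_floats(wire)
```

The encoded form is 4 bytes of count plus 4 bytes per value, regardless of the native byte order of the machine that produced it.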
OGSI is dead. Long live WS-RF!
WS-ResourceFramework preserves most OGSI ideas in a way which is friendlier (less abusive) to Web services.
Open Middleware Infrastructure Institute (OMII) has a conservative roadmap based on Web services.
– WS-I plus as little else as possible
UK National Grid Service is aligned with EGEE.
– This means Globus Toolkit version 2 for at least 12 months.
WS-RF (and WS-Notification) are moving targets.
What does this mean for us?
Our response to WS-RF
We must be able to exploit the grids that exist
– GT4 is unlikely to be stable and widely deployed in lifetime of RealityGrid
OGSI::Lite works fine for us, so continue to use it for now.
In time, WS-RF may be appropriate.
– seems indicated for the Steering Grid Service, which is a very dynamic thing
– optional for persistent services such as Checkpoint Metadata Tree and Registry. These could be implemented in plain Web services.
WSRF::Lite is already an option
– prototype released within a few weeks of first publication of WS-RF drafts
– featured in WS-RF interop fest in April, and interop demo at GGF 11 last week
Standards, generally
Very slow progress on Advance Reservation
– RealityGrid requires co-allocation of compute, viz, AG resources at time to suit the humans
– LSF, PBS(Pro), SGE now support it, but not accessible through middleware
– GRAAP-WG at GGF is bogged down in WS-Agreement and has yet to address protocols and apply them to Advance Reservation problem
Practical WS-RF interoperability will require coherent, global security strategy for Web services, and a delegation model
– not clear that GT4 interoperability is the driver.
– GT3 and GT4 security has never been on the standards table
– what is GSI-SecureConversation anyway?
OGSA itself is a massive undertaking and will not settle in RealityGrid’s lifetime
RealityGrid is a provider of use case drivers for GRAAP, GridCPR, OGSA, SAGA (and other) groups in GGF
Prospects
Where we’re going
Steering
Plans
Tabbed steerer (work in progress)
– single client tabs between multiple steerable simulations
– required for thermodynamic integration work using NAMD
Steering of multi-component simulations (coupled models)
– requires metadata about component interactions and schedule
Quantitative study of the overhead of steering and on-line visualization
Support use of steering within project
Final release of steering library, toolkit and documentation
Significant Gap - Security!!!
– contingent on additional funding for WSRF::Lite
– and coherent global security strategy for Web services
Steering - Wishlist
Port of steering services to WS-RF
– probably in a follow-on project
Provenance of steering and parameter space exploration
Collaborative steering
– i.e. support simultaneous connection of multiple clients
Scripted steering
– Breakpoints ( IF (temperature > TOO_HOT) THEN … )
– Replay of previous steering actions
Integration of steering into selected MVEs
– entirely feasible, but can’t do them all
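The two scripted-steering wishes, breakpoints and replay, can be sketched together on a toy model. This is speculative illustration only: the `simulate` function, the `heating` parameter, the threshold value and the log format are all invented, and the slide does not say how such scripting would be implemented.

```python
TOO_HOT = 1.5  # hypothetical threshold


def simulate(n_steps, breakpoints=(), replay=()):
    """Toy run: 'temperature' rises by 'heating' each step.

    breakpoints: (condition, changes) pairs checked after every step.
    replay: (step, changes) pairs recorded by an earlier run.
    Returns the final state and the log of steering actions taken.
    """
    state = {"temperature": 1.0, "heating": 0.25}
    scheduled = {}
    for step, changes in replay:
        scheduled.setdefault(step, []).append(changes)
    log = []
    for step in range(n_steps):
        state["temperature"] += state["heating"]
        for changes in scheduled.get(step, []):
            state.update(changes)
            log.append((step, changes))
        for condition, changes in breakpoints:
            if condition(state):
                state.update(changes)
                log.append((step, changes))
    return state, log


# IF (temperature > TOO_HOT) THEN switch the heating off.
rule = (lambda s: s["temperature"] > TOO_HOT and s["heating"] > 0, {"heating": 0.0})
final, actions = simulate(6, breakpoints=[rule])

# Replaying the recorded actions, without the rule, reproduces the run.
replayed, _ = simulate(6, replay=actions)
```

Logging every steering action with the step at which it was applied is what makes replay possible, and the same log would be a natural starting point for provenance.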
Standardisation of Steering
Opportunities:
Standardise an API for computational steering
Standardise the WSDL of the Steering Grid Service
These could be input to the GGF research group “Simple APIs for Grid Applications” (SAGA-RG)
“Thin visualization”
– delivered to PDA or Web browser
– thumbnails in checkpoint tree
Possibilities
Use of *-ray from Utah
AVS module for streaming to Access Grid
VizServer integration:
– Put GSI authentication into VizServer PAM when released
– Liaison with Platform and SGI regarding use of VizServer API for Advance Reservation of graphics pipes
Launching and packaging
Plans
Continue to improve usability
Reduce deployment overhead
– wizard can now work with Java CoG kit
• easier to deploy than Globus client bundles
Possibilities
Integrate RLS or SRB into checkpoint tree
Pick up Web service approaches to job submission
HCI
Plans
Update of HCI Audit report in light of experiences
Journal paper on the HCI of TeraGyroid
.NET client
– deployable demonstrator with renderings on PDA and Windows laptop
Identified activities, off critical path, for PhD student
VizServer QoS experiments with MB-NG or UK-Light
Thin visualization for PDAs and Web portals
Portal
Currently provides Web client for steering
– GridSphere portlet communicates with Steering Grid Service via SOAP
Prototype portlet for checkpoint tree browsing
Little resource (2-3 PM) remains for second phase of portal work.
Plans
Finish checkpoint tree browsing
Incorporate use of registry for simulation discovery
Hope to inherit JSR168 portlets for job launching and monitoring
limited visualization capability
– slice of scalar field
– subject to resources
Resource Management – Deep Track
Advance Reservation
– proof of concept using SGE 6.0
Implemented within Job Submission Web Service separated from ICENI
– using Job Definition Markup Language (JDML)
• which is evolving into Job Submission Description Language (JSDL) through Global Grid Forum JSDL working group
– designed to support plug-in of other job submission systems
• e.g. Globus, gsi-ssh, UNICORE, LSF,...
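A plug-in design for job submission back ends might look like the following sketch. It is an assumption about the general shape of such a design, not the actual Job Submission Web Service: the `Submitter` interface, the `EchoSubmitter` stand-in and the job dictionary are hypothetical, and no real Globus, UNICORE or SGE calls are made.

```python
from abc import ABC, abstractmethod


class Submitter(ABC):
    """Hypothetical plug-in interface for one job submission system."""

    @abstractmethod
    def submit(self, job):
        """Submit a job description and return a job identifier string."""


class EchoSubmitter(Submitter):
    """Stand-in back end that only records what it would have launched."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.submitted = []

    def submit(self, job):
        self.submitted.append(job)
        return f"{self.prefix}-{len(self.submitted)}"


class JobSubmissionService:
    """Front end that dispatches to whichever plug-in was registered by name."""

    def __init__(self):
        self._backends = {}

    def register(self, name, backend):
        self._backends[name] = backend

    def submit(self, backend_name, job):
        if backend_name not in self._backends:
            raise KeyError(f"no plug-in registered for {backend_name!r}")
        return self._backends[backend_name].submit(job)


service = JobSubmissionService()
service.register("sge", EchoSubmitter("sge"))
service.register("unicore", EchoSubmitter("unicore"))

job = {"executable": "lb3d", "processors": 64}  # hypothetical job description
job_id = service.submit("unicore", job)
```

Supporting another submission system then means writing one new `Submitter` subclass and registering it, with no change to the callers.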
ICENI integration – Deep Track
[Diagram labels: Application, Steering library, Steering GS; flows: Control, Status, Data in / Data out]
Technical report on feasibility of integrating fast-track steerable binary (with associated SGS) as an ICENI component
If practical, do it.
Performance Control – Deep Track
Performance Control of coupled models
– working with HybridMD code and Bespoke Framework Generator (BFG)
– outcomes: technology demonstrator & research papers
– deployment in production is unlikely
Performance prediction of same
– Steering of BFG-coupled models
Integration of PERCO and ICENI is not likely
Generalised malleable-checkpoint library is unlikely
– major undertaking, re-inventing SRS from UTK
– application specific alternatives always possible for those that need it
Proven to be possible to support steering or PERCO through a common API
– which simplifies instrumentation of application codes
– but doing both at the same time leads to frighteningly complex interactions
Conclusions
We will not solve everything during the lifetime of RealityGrid
We must be ruthless about what we do and do not undertake
Partners
Academic
– University College London
– Queen Mary, University of London
– Imperial College
– University of Manchester
– University of Edinburgh
– University of Oxford
– University of Loughborough
Industrial
– Schlumberger
– Edward Jenner Institute for Vaccine Research
– Silicon Graphics Inc
– Computation for Science Consortium
– Advanced Visual Systems
– Fujitsu
– BT Exact
Bringing Science and Supercomputers Together
http://www.sve.man.ac.uk
SVE @ Manchester Computing