WebFlow: Web Interface for Computational Modules
presented by
Tomasz Haupt
Northeast Parallel Architectures Center at Syracuse University
This project is sponsored by the U.S. Army Corps of Engineers Waterways Experimental Station MSRC (Vicksburg, MS) under the DoD Modernization Program, Programming Environment and Training
Authors
• Erol Akarsu (*)
• Geoffrey Fox
• Tomasz Haupt
• Alexey Kalinichenko (*)
• Kang-Seok Kim (*)
• Praveen Sheethalnath (*)
• Choon-Han Youn
(*) student
Synergistic projects (led by W. Furmanski) at NPAC:
FMS, Object Web, HLA, JWORB
http://bombay.npac.syr.edu/fms
also Tango
Agenda
• Part I: Introduction (25 min)
• Part II: WebFlow Design (45 min)
• Part III: WebFlow Security (20 min)
• 10:30-10:45 break
• Part IV: WebFlow Applications (75 min)
• 12:00-1:30 lunch
• Part V: How to Write WebFlow modules (15 min)
• Part VI: LMS details, Demos, Discussion
Part I
Introduction
WebFlow Mission
• seamless access to remote resources
– through a Web based user interface
– customized application GUI
• high-level user friendly visual programming and runtime environment for HPDC
• portable system based on industry standards and commodity software components
Remote Resources
Front-End
FRONT-END: high-level user friendly
- visual programming and authoring tools
- application GUI
RESOURCES: all hardware and software components needed to complete the user task, including, but not limited to, compute engines from workstations to supercomputers, storage, databases, instruments, codes, libraries, and licenses.
Desktop/Laptop
Seamless Access
Seamless Access
• Create an illusion that all resources needed to complete the user tasks are available locally.
• In particular, an authorized user can allocate the resources she needs without explicit login to the host controlling the resources.
• An analogy: an NFS-mounted disk or a network printer.
Examples:
• WebSubmit (NIST)
• TeraWeb (NCS, Inc.)
• CCM PSE (OSC)
• many others
WebBrowser
SP-2 O2K
CGI CGI
SSL
Disadvantage: - client/server based on custom protocol over CGI
Example: Globus
Advantages:
- platform independent mini-language (RSL) for specification of resources
- can be layered on top of different schedulers
- enables interoperability between resources (can allocate many resources at a time, file transfer, monitoring, etc.)
Disadvantage: - a bag of low level tools
GRAM Client
Gatekeeper Gatekeeper Gatekeeper
Contact address
Resource Language Specification
MDS Directory Service
GSS-API
Towards a complete solution ...
PSE: problem description (physics, chemistry, ...)
Task description: I need 64 nodes of SP-2 at Argonne to run my MPI-based executable “a.out” you can find in “/tmp/users/haupt” on marylin.npac.syr.edu. In addition, I need any idle workstation with jdk1.1 installed. Make sure that the output of my a.out is transferred to that workstation.
Middle-Tier: map the user’s task description onto the resource specification; this may include resource discovery, and other services
Resource Specification
Resource Allocation: run, transfer data, run
Remote Resources
Front-End Front-End
Middle-Tier
Resource Specification
Abstract Task Specification
We need a third tier!
Target Architecture
Middle-Tier
Resource Specification
Abstract Task Specification
Problem Solving Environments
OO Visual Authoring Tools
Data-Flow Visual Authoring
Custom Application GUI
Other
WebFlow
Back-End Resources
Middle-Tier
Resource Spec.
A. Task Spec.
PSE, OO, DataFlow, Custom GUI, Other
WebFlow
DATORR, Alliance
DATORR, Alliance
HPCC: Globus
Other as needed
DBMS: JDBC
Small tasks: Java user codes
https, IIOP/SECIOP
Example: IPSE
Under development
NCSA Alliance
Example: LMS
Example of a custom GUI: LMS Front-End
Navigate and choose an existing application to solve the problem at hand. Import all necessary data.
Retrieve data
Pre/post-processing
Run simulations
Select host
Select model
Set parameters
Run
PSE Example: CCM IPSE
Ken Flurchick, http://www.osc.edu/~kenf/Gateway
1. Define your problem
2. Identify resources (software and hardware)
3. Create input file
4. Run your application
5. Analyze results
http://www-fp.mcs.anl.gov/~gregor/datorr/ (soon to become http://www.datorr.org)
WebFlow design
• Object Oriented, follows JavaBeans model
– everything is an object
– objects interact through events
Object A (event source)
Object B (event target)
Fire event E
Method M() {…}
Firing event E by object A causes invocation of method M of object B. The association of event E and method M is achieved by an event registration mechanism. An event is also an object and it carries data.
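The source/target relationship described here can be sketched in plain Java, in the JavaBeans listener style. This is only an illustration: the class names, the listener interface, and the payload field are assumptions made for the example, not WebFlow's actual API.

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.List;

class EventSketch {
    /** Event E: an object that carries data (hypothetical payload field). */
    static class DoneEvent extends EventObject {
        final String payload;
        DoneEvent(Object source, String payload) {
            super(source);
            this.payload = payload;
        }
    }

    /** The listener interface declares method M. */
    interface DoneListener {
        void onDone(DoneEvent e);
    }

    /** Object A: the event source. Registration associates E with M. */
    static class Source {
        private final List<DoneListener> listeners = new ArrayList<>();

        void addDoneListener(DoneListener l) {
            listeners.add(l);
        }

        void fire(String payload) {
            DoneEvent e = new DoneEvent(this, payload);
            for (DoneListener l : listeners) {
                l.onDone(e); // firing E invokes M on every registered target
            }
        }
    }

    /** Object B: the event target. */
    static class Target implements DoneListener {
        String received;

        @Override
        public void onDone(DoneEvent e) {
            received = e.payload;
        }
    }

    public static void main(String[] args) {
        Source a = new Source();
        Target b = new Target();
        a.addDoneListener(b);
        a.fire("output.dat");
        System.out.println("target received: " + b.received);
    }
}
```

Here both objects live in one address space; the CORBA slides that follow address the case where they do not.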
A few words about CORBA
(a digression)
more information on Java, CORBA, Distributed Objects:
• Typically WebFlow objects live in different address spaces. We use CORBA to invoke methods of the remote objects.
Object A (event source)
Object B (event target)
Fire event E
Method M() {…}
ORB
How is this possible?
ORB2
Object A (event source)
Object B (event target)
Fire event E
Method M() {…}
ORB1
IIOP
- Objects A and B are CORBA objects (thus not Java objects)
- Objects are defined in IDL (Interface Definition Language)
- IDL definitions are compiled using a (Java) IDL compiler
- The IDL compiler generates new classes to be used by the Java compiler (javac) instead of the original ones, on both the client and server side
- The IDL compiler generates either classes to be extended, or interfaces to be implemented
* two modules:
  - runEdys
  - runCasc2d
* one event:
  - DoneEvent
They will be added to package WebFlow.lms
We need more flexibility...
• WebFlow objects are developed independently of each other (reusable modules): we cannot assume that the event source knows anything about the event target and vice versa
Event binding
addEventListener
rmEventListener
fireEvent(E,M)
method M
Event Source
Event Target
Adapter
Event
ORB
binding table
DII, DSI
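A minimal sketch of the binding-table idea follows, assuming a single JVM and Java reflection as a stand-in for CORBA's dynamic invocation (DII/DSI). The method names attachEvent, detachEvent, and fireEvent and their signatures are simplified assumptions, not the WebFlow interface; the point is only that neither the source nor the target refers to the other's type.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class BindingTableSketch {
    /** One row of the binding table: which method of which target to call. */
    private static final class Binding {
        final Object target;
        final String methodName;

        Binding(Object target, String methodName) {
            this.target = target;
            this.methodName = methodName;
        }
    }

    // source object -> event name -> bound (target, method) pairs
    private final Map<Object, Map<String, List<Binding>>> table = new IdentityHashMap<>();

    void attachEvent(Object source, String event, Object target, String methodName) {
        table.computeIfAbsent(source, s -> new HashMap<>())
             .computeIfAbsent(event, e -> new ArrayList<>())
             .add(new Binding(target, methodName));
    }

    void detachEvent(Object source, String event, Object target) {
        lookup(source, event).removeIf(b -> b.target == target);
    }

    /** Delivers the event data to every bound method; returns the delivery count. */
    int fireEvent(Object source, String event, Object data) {
        int delivered = 0;
        for (Binding b : new ArrayList<Binding>(lookup(source, event))) {
            try {
                // The method is resolved by name at call time, so the source
                // needs no compile-time knowledge of the target's class.
                Method m = b.target.getClass().getMethod(b.methodName, Object.class);
                m.invoke(b.target, data);
                delivered++;
            } catch (ReflectiveOperationException ex) {
                throw new IllegalStateException("cannot deliver " + event, ex);
            }
        }
        return delivered;
    }

    private List<Binding> lookup(Object source, String event) {
        Map<String, List<Binding>> events = table.get(source);
        List<Binding> bound = (events == null) ? null : events.get(event);
        return (bound == null) ? new ArrayList<Binding>() : bound;
    }

    /** Hypothetical target module that records what it was called with. */
    public static class Recorder {
        public final List<Object> seen = new ArrayList<>();

        public void run(Object data) {
            seen.add(data);
        }
    }

    public static void main(String[] args) {
        BindingTableSketch bindings = new BindingTableSketch();
        Object source = new Object();
        Recorder target = new Recorder();
        bindings.attachEvent(source, "Casc2dDone", target, "run");
        System.out.println("delivered: " + bindings.fireEvent(source, "Casc2dDone", "new data"));
    }
}
```

Keeping the table outside both modules is what lets independently developed modules be wired together after the fact.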
Controlling a module
Applet
Module Controls
Proxy Module
Module
ActionButton1
ActionButton2
….
IIOP
Another complication: Java sandbox!
Adding a remote module
Local Host
Add module
Module Factory
Proxy Module
Remote Host
FE
request
Add module
Module Factory
Module
Back to WebFlow design
WebFlow Server
• The WebFlow server is a container object, a.k.a. context; in fact, it implements the Java BeanContext interface (Java 1.2)
• The BeanContext acts as a logical container for JavaBeans (“WebFlow modules and services”) and BeanContexts.
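The nesting of contexts can be illustrated with the standard java.beans.beancontext.BeanContextSupport class. This is a sketch under assumptions: the three-level layout (server, user, application) follows the slides, but the use of BeanContextSupport itself and of plain strings as stand-ins for modules is chosen only for the example and says nothing about how WebFlow implements its contexts.

```java
import java.beans.beancontext.BeanContext;
import java.beans.beancontext.BeanContextChild;
import java.beans.beancontext.BeanContextSupport;

class ContextHierarchySketch {
    /** Builds server -> user -> application -> modules; returns the application context. */
    static BeanContextSupport buildApplication() {
        BeanContextSupport server = new BeanContextSupport();      // stands in for the WebFlow server
        BeanContextSupport user = new BeanContextSupport();        // user context
        BeanContextSupport application = new BeanContextSupport(); // application context
        server.add(user);        // a BeanContext can hold other BeanContexts
        user.add(application);
        application.add("runEdys");    // modules are represented by plain strings here
        application.add("runCasc2d");
        return application;
    }

    /** Counts how many containers enclose the given context. */
    static int depth(BeanContextChild child) {
        int depth = 0;
        for (BeanContext parent = child.getBeanContext(); parent != null;
                parent = parent.getBeanContext()) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        BeanContextSupport application = buildApplication();
        System.out.println("modules in application: " + application.size());
        System.out.println("enclosing contexts: " + depth(application));
    }
}
```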
User 1 User 2
Application 1
Application 2
App 2, App 1
WebFlow Server
WebFlow server is given by a hierarchy of containers and components
WebFlow server hosts users and services
Each user maintains a number of applications composed of custom modules and common services
WebFlow Services
CORBA Based Middle-Tier
Mesh of WebFlow Servers implemented as CORBA objects that manage and coordinate distributed computation.
Front End
Gatekeeper: Authentication, Authorization
WebFlow Context Hierarchy
Master Server (Gatekeeper)
Slave Server
Slave Server
User Context
Application Context
Module
Slave Server Proxy
Gatekeeper
Services User Modules
Data Flow Front-End
Middle-Tier modules serve as proxies of Back-End Services
OO Front-End
User Space Definition and Task Specification
Metacomputing Services
Back-End Resources
Modules
• Similar to JavaBeans
– full power of Java (or C++) to implement functionality
– can encapsulate legacy applications
• May serve as Proxies
– JDBC
– metacomputing services (such as Globus)
– schedulers (such as PBS, CONDOR, etc)
Services
• Services are modules provided by the system that offer generic functionality
– job services (submit, monitor, kill, ... a job)
– file services (edit, copy, move, … a file)
– XML parser
– database access
– mass storage access
– ...
The Run Job module is a proxy module. It generates the RSL on the fly and submits the job for execution using the globusrun function.
The module knows only the executable name, its location, and its arguments/parameters.
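As a rough illustration of generating RSL on the fly, the sketch below assembles a job description string from the three things the module is said to know. The helper name toRsl and the choice of the executable, directory, count, and arguments attributes are assumptions made for the example; quoting of special characters is ignored, and the actual submission through globusrun is not shown.

```java
class RslSketch {
    /**
     * Builds a conjunction of RSL (attribute=value) relations.
     * Values are inserted verbatim, so this sketch only suits simple
     * values without spaces or parentheses.
     */
    static String toRsl(String executable, String directory, int count, String... arguments) {
        StringBuilder rsl = new StringBuilder("&");
        rsl.append("(executable=").append(executable).append(')');
        rsl.append("(directory=").append(directory).append(')');
        rsl.append("(count=").append(count).append(')');
        if (arguments.length > 0) {
            // multiple arguments are written as a space-separated sequence
            rsl.append("(arguments=").append(String.join(" ", arguments)).append(')');
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        // The task from the introduction: 64 nodes running a.out
        System.out.println(toRsl("a.out", "/tmp/users/haupt", 64));
    }
}
```

Building the string inside a proxy module keeps the front-end free of any Globus-specific syntax.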
WebFlow over Globus
• In order to run WebFlow over Globus there must be at least one WebFlow node capable of executing Globus commands, such as globusrun
• Jobs that require computational power of massively parallel computers are directed to the Globus domain, while other jobs can be launched on much more modest platforms, such as the user’s desktop or even a laptop running Windows NT.
Bridge between WebFlow and Globus
Part III
WebFlow Security
(design)
Secure Access: terminology
• Access Control (or Authorization)
– Assurance that the person or computer at the other end of the session is permitted to do what he asks for
• Authentication
– Assurance that the resource (human or machine) at the other end of the session is what it claims to be
• Integrity
– Assurance that the information that arrives is the same as when it was sent
• Accountability (or non-repudiation)
– Assurance that any transaction that takes place can subsequently be proved to have taken place
• Privacy
– Assurance that sensitive information is not visible to an eavesdropper (usually achieved using encryption)
Secure Access
• Mutual authentication of servers and users
– Certificates, Kerberos/SecurID
• Access control
– Full autonomy of the resources owner(s)
– Akenti
• Privacy
• Integrity
SECIOP
Security Model
Front End Applet
https
authentication & authorization
Gatekeeper
delegation
Stakeholders
HPCC resources
GSSAPI, GSSAPI
Layer 1: secure Web
Layer 2: secure CORBA
Layer 3: Secure access to resources
Policies defined by resource owners
https (SSL), AKENTI
CORBA security service
GSSAPI (Globus)
Distributed Objects are less secure
• can play both client and server
– in client/server you trust the server, but not the clients
• evolve continually
– objects delegate parts of their implementation to the other objects (also dynamically composed at runtime). Because of subclassing, the implementation of an object may change over time
• interactions are not well defined
– because of encapsulation, you cannot understand all the interactions between objects
• are polymorphic (ideal for Trojan horses!)
• can scale without limit
– how do you manage the access rights to millions of servers?
• are very dynamic
CORBA security is built into ORB
Secure Communications
Authentication
Client
User
Encryption, Audit, Authorization
Server
Encryption
Credentials
Object Adapter
ORB
Authentication
• A principal is authenticated once by ORB and given a set of credentials, including one or more roles, privileges, and an authenticated ID.
• An authenticated ID is automatically propagated by a secure ORB; it is part of the caller context
Principal Credentials
Current
Client Server
set_credentials get_attributes
authenticate
Privilege Delegation
• No delegation
– The intermediary uses its own credentials
• Simple delegation
– The intermediary impersonates the client
• Composite delegation
– The intermediary uses both
[Diagram: Client, Target, and Target Object chains for the three delegation modes, connected over IIOP]
CORBA access model
• Based on a trusted ORB model: you must trust that your ORB will enforce the access policy on the server resource
• The ORB determines: if this client on behalf of this principal can do this operation on this object
• Server uses Access Control Lists (ACL) to control user access
Principal Role Rights Operation
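The principal, role, rights, operation chain above can be sketched as three lookup tables. This is an illustrative model only, not the CORBA security service API: the class, its method names, and the example principal and right names are all assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AclSketch {
    private final Map<String, Set<String>> rolesByPrincipal = new HashMap<>();
    private final Map<String, Set<String>> rightsByRole = new HashMap<>();
    private final Map<String, String> rightByOperation = new HashMap<>();

    void grantRole(String principal, String role) {
        rolesByPrincipal.computeIfAbsent(principal, p -> new HashSet<>()).add(role);
    }

    void grantRight(String role, String right) {
        rightsByRole.computeIfAbsent(role, r -> new HashSet<>()).add(right);
    }

    /** Declares which right an operation requires. */
    void require(String operation, String right) {
        rightByOperation.put(operation, right);
    }

    /** True if any role of the principal carries the right the operation requires. */
    boolean permits(String principal, String operation) {
        String needed = rightByOperation.get(operation);
        if (needed == null) {
            return false; // operations without a policy entry are denied
        }
        Set<String> roles = rolesByPrincipal.getOrDefault(principal, Collections.<String>emptySet());
        for (String role : roles) {
            Set<String> rights = rightsByRole.getOrDefault(role, Collections.<String>emptySet());
            if (rights.contains(needed)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AclSketch acl = new AclSketch();
        acl.grantRole("alice", "developer");
        acl.grantRight("developer", "use");
        acl.require("submitJob", "use");
        System.out.println("alice may submitJob: " + acl.permits("alice", "submitJob"));
        System.out.println("bob may submitJob: " + acl.permits("bob", "submitJob"));
    }
}
```

In the trusted-ORB model the check would run inside the ORB before the call reaches the object, rather than in application code as here.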
Mary Thompson, http://www-itg.lbl.gov/security/Akenti/DOE2000/sld014.htm
Part IV
WebFlow Applications
• Applications vary by the functionality of their Front-Ends
– Front-End Applications
  • must be pre-installed
  • run fast, no restrictions
– Front-End Applets
  • no installation, but may take time to download
  • sandbox restrictions apply, unless signed
• Applications vary by how they are composed from modules
– statically
  • can be prepared in the Middle-Tier
– dynamically
  • the user composes them from reusable components
• The modules can interact with each other in different ways:
– through events (object oriented approach)
– through ports (data flow model)
– through message passing
• Applications vary on how the Front-End interacts with the Middle-Tier
– A complete task description is sent to the middle-tier
  • composed of reusable modules
  • predefined
– Objects are added to the user context one at a time, and Front-End keeps their references
Landscape Management System
LMS Objectives
To develop a web based system that implements a “navigate-and-choose” paradigm and allows the end user to:
– Select (a set of) computational modules that provide answers to the problem at hand
– Retrieve input data sets from remote sources
– Use adequate (remote) computational resources
– Visualize and analyze output data on the local host
Anytime, anywhere, using any platform (e.g., a laptop PC connected to the Internet)
LMS: Changes in Vegetation
A decision maker (the end user of the system) wants to evaluate changes in vegetation in a geographical region over a long time period caused by short term disturbances such as a fire or human activity.
One of the critical parameters of the vegetation model (EDYS) is soil condition at the time of the disturbance.
This in turn is dominated by rainfall that possibly occurs at that time (CASC2D simulation).
Input data for the simulations are available from the Internet, such as Digital Elevation Models (DEM) from the USGS web site, or from custom databases (species characteristics).
LMS: Changes in Vegetation
Data retrieval
Data preprocessing
Simulation: two interacting codes, EDYS and CASC2D
Visualization
WMS
EDYS, CASC2D
DEM, Land Use, Soil Texture, Vegetation
EDYS: vegetation model
CASC2D: watershed model
WMS: Watershed Modeling System
LMS Front End
Data retrieval
Data pre- and post-processing
Simulations
Data Retrieval
The data wizard allows the user to interactively select the data and download them to the local machine. The raw data are then fed to the WMS system, launched from the browser, to generate input files for simulations.
Launching coupled simulations on different Back-End computational resources
Select host
Select model
Set parameters
Run
WMS based Visualizations
The results of the simulations are sent back to the Front-End, and can be visualized using tools included in the WMS package
Implementation of LMS
• Front-End (client) is a Java application
– Data wizard, EDYS and WMS are run locally
• “navigate and choose” - no interactive composition of applications
– EDYS, CASC2D, EDYS and CASC2D
• modules exchange data through message passing mediated by WebFlow
• client keeps the module references
Running LMS
[Diagram: the client (lms.class, Data wizard, WMS) and the WebFlow servers. On WinNT: master, slave with the runEdys module, Web Server. On UNIX: slave with the runCasc2d module, exeCasc2d, Web Server. Legend: WebFlow modules]
To run LMS
• Start web servers on both machines
• Start master on WinNT
• Start slave on WinNT
• Start slave on UNIX
• Start client (Java lms) on WinNT
Client code
try {
    // add modules
    p1 = slaveNT.addNewModule("runEdys");      // as defined in conf.file
    runEdys re = runEdysHelper.narrow(p1);
    p2 = slaveUNIX.addNewModule("runCasc2d");  // as defined in conf.file
    runCasc2d rc = runCasc2dHelper.narrow(p2);
    // bind events
    master.attachEvent(p2, "Casc2dDone", "Casc2dDone", p1, "run");
    master.attachEvent(p1, "EdysStarted", "EdysStarted", p2, "run");
    master.attachEvent(p1, "EdysDone", "EdysDone", p2, "runAgain");
    // invoke methods of runCasc2dImp
    rc.run();
} catch (COMM_FAILURE ex) {
    System.err.println(ex.getMessage());
    System.exit(1);
}
1. Start runCasc2d.
2. Casc2d starts in a new thread, uploads data to its web server, and sends a “done” event to Edys.
3. Casc2d waits for new data from Edys; Edys downloads data and runs until the first rain event.
4. Casc2d waits for data; Edys uploads data, sends a “done” event, and quits.
5. runCasc2d fetches data from the remote web server.
6. Casc2d detects new data and resumes execution.
7. Casc2d completes the rain event and writes new data; runCasc2d detects the new data and sends an event to Edys.
8. Edys fetches data from the remote web server and starts; Casc2d waits for new data.
9. This cycle is repeated until all rain events are processed.
10. Casc2d quits; the final run of Edys begins.
11. Edys terminates. All data are on the WinNT side and can be visualized using WMS tools.
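The alternation between the two codes in the run sequence above can be mimicked with two threads that hand control back and forth. This is a toy model under assumptions: the queues stand in for WebFlow events, no data is transferred, and the class and message names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.SynchronousQueue;

class CoupledRunSketch {
    /** Simulates the Casc2d/Edys hand-off for the given number of rain events. */
    static List<String> simulate(int rainEvents) {
        List<String> log = Collections.synchronizedList(new ArrayList<String>());
        SynchronousQueue<String> toEdys = new SynchronousQueue<>();   // "Casc2dDone" events
        SynchronousQueue<String> toCasc2d = new SynchronousQueue<>(); // "EdysDone" events

        Thread casc2d = new Thread(() -> {
            try {
                log.add("casc2d: initial data uploaded");
                toEdys.put("Casc2dDone");
                for (int i = 1; i <= rainEvents; i++) {
                    toCasc2d.take();                    // wait for new data from Edys
                    log.add("casc2d: rain event " + i);
                    toEdys.put("Casc2dDone");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread edys = new Thread(() -> {
            try {
                for (int i = 1; i <= rainEvents; i++) {
                    toEdys.take();                      // wait for data from Casc2d
                    log.add("edys: run until rain event " + i);
                    toCasc2d.put("EdysDone");
                }
                toEdys.take();                          // Casc2d has finished its last event
                log.add("edys: final run");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        casc2d.start();
        edys.start();
        try {
            casc2d.join();
            edys.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : simulate(2)) {
            System.out.println(line);
        }
    }
}
```

Each log entry is written before the hand-off that wakes the other thread, so the entries alternate strictly between the two codes.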
Quantum Simulations
QS: WebFlow implementation
Implementation of QS
• Front-End (client) is a Java applet
• applications are created dynamically from pre-existing modules
• modules exchange data through ports (data flow model)
• server keeps the module references; the references are published on a web site
QS: Front-End
Building an application
XML
A visual representation is converted into an XML document
XML service
Web Server
save
parse
ApplContext
Generates Java code to add modules to ApplContext
Publishes IOR
- object oriented approach
- implementation:
  - CORBA based Middle-Tier
  - bean-box type API
  - JDBC proxy modules
- Web interface to store data in DB in variable format
- Data transfer from DB to a visualization engine
- Coordinates transformations on a remote server
- Launching simulations on remote hosts with interactive input
Building an application
Applet ApplicationContext
Netscape ORB
ORBacus ORB
IIOP
List of servers
List of modules
List of events
List of methods
E M
Add module
Attach Event
local remote
Adapter LLM
IPSE/Gateway Project
Services User Modules
Back-End Resources
Front-End
Back-End services comprise Tier 3.
Tier 1 is a high-level Front-End for visual programming
Distributed object-based, scalable, and reusable Web server and Object broker
Middleware forms Tier 2
Multi-tier Architecture of Gateway
• Master Server is started by administrator
– command line
– administrator page
• Slave Server is started by administrator
– command line
– administrator page
• User Context is created by Servlet
– Slave server method
– Security
• Application Context is created by User
– User Context method
• Modules are added by User
– Application Context method
Starting Gateway
Slave Server
User Context
Slave Server
Initialization of a session
Portal Page
Secure Web Server
Mutual authentication
start
AKENTI
Credentials
Globus Cert.
Front End Applet
WebFlow Server
User Context
Netscape’s ORB
ORBacus ORB
IIOP
Middle-Tier is given by a mesh of WebFlow Servers that manage and coordinate distributed computation.
• WebFlow applications are composed of independent reusable modules
• Modules are written by module developers who have only limited knowledge of the system on which the modules will run.
• The WebFlow system hides module management and coordination functions
Summary of features
• Single Web-based access via Gateway portal
• Security based on standards: https, PKI, secure ORB, GSSAPI (SSL/Kerberos5)
• Access policies controlled by stakeholders
• WebFlow API allows implementation of many different front-ends
• Modern three-tier architecture (distributed objects)
• Access to HPCC through metacomputing services
How to use WebFlow
• A production version is being developed within Gateway project (ASC/OSC)
– first release: Jun’99 (with security features, and a subset of services)
– Beta release: Sept’99
– release 1.0: Nov-Dec’99 (SC’99)
• A preliminary version is available now
• I am looking for WebFlow applications