SCIENCE GATEWAYS Suresh Marru Mark Miller Nancy Wilkins-Diehr
• I stole & mashed together slides from:
– Nancy Wilkins-Diehr – XSEDE 12 Gateways overview
– Henry Neeman, University of Oklahoma – Supercomputing in Plain English
– Dennis Gannon – eScience Lectures
– Grid animations were adapted from resources at the White Rose Grid eScience Centre: http://www.wrgrid.org.uk/Resources/Presentations.php
– Matt McKenzie, NICS, ORNL – XSEDE 12 Tutorial: Practical issues in running a gateway
I have been here since Monday and I am convinced …
o Supercomputing is all about size and speed, and XSEDE can potentially quench my computational hunger
o Parallelism is a good way of dividing and conquering my problem
o I can use batch queues and HPC machines effectively and run my applications
o I have realized that SDSC consultants and sys-admins are the nicest people, and you can go back to them for further help
It's Thursday, so why am I not on the beach yet?
• You have developed, tuned, or enhanced your applications, but would like to know how to share & execute them in familiar, simpler ways
• Science Gateways, an integral part of Cyberinfrastructure, will solve all your problems. Really? Nah!! But they:
• Will help you use or build simple web interfaces to complex apps
• Will absorb the nitty-gritty details of emerging technologies
[Figure: anatomy of a batch system – a batch manager/login node feeds three queues (Queue-A, Queue-B, Queue-C), whose job slots are spread across the cluster's worker nodes]
A batch manager weighs incoming jobs (JOB X, JOB Y, JOB Z, JOB U, JOB O, JOB N, …) against:
§ Queues
§ Policies
§ Priorities
§ Shares/Tickets
§ Resources
§ Users/Projects
How to run a job on Trestles/Gordon
What does a Batch Manager do?
• Dynamic Resource Management
– Job scheduling
– Resource monitoring
– Policy administration
– User authentication and access control
– Accounting and reporting
Key Feature: Match Making
The batch manager matches a JOB against:
l the system: system characteristics, system status, resources
l the job: job policies, resources
l the user: user policies, groups, roles, departments, projects
then performs selection and scheduling.
How to interface with Batch Managers
A batch manager can be driven through graphical interfaces (including a browser), the command line, or programmatic APIs.
What is Grid Computing?
• The grid vision is of "virtual computing" (+ information services to locate computation and storage resources)
– Compare the web: "virtual documents" (+ search engines to locate them)
• MOTIVATION: collaboration through sharing resources (and expertise) to expand the horizons of
– Research
– Commerce – engineering, …
– Public service – health, environment, …
[Figure: grid middleware connects visualisation workstations, mobile access, supercomputers and PC clusters, data storage, sensors, and experiments over the Internet – the "power grid" metaphor]
Ian Foster's Grid Checklist
• A Grid is a system that:
– Coordinates resources that are not subject to centralized control
– Uses standard, open, general-purpose protocols and interfaces
– Delivers non-trivial qualities of service
The Grid Middleware Stack (top to bottom):
• Grid Application (often includes a Portal)
• Workflow system (explicit or ad-hoc)
• Grid Security Infrastructure, Job Management, Data Management, Grid Information Services
• Core Globus Services
• Standard Network Protocols and Web Services
Grid Middleware glues the grid together
• A short, intuitive definition: the software that glues together different clusters into a grid, taking into consideration the socio-political side of things (such as common policies on who can use what, how much, and what for)
Grid middleware components
• Job management
• Storage management
• Information services
• Security
GridFTP Data Movement
http://www.loni.org
LONI HPC Enablement Workshop – LaTech University, October 23, 2008
• Basic Transfer: one control channel, several parallel data channels
• Third-party Transfer: control channels to each server, several parallel data channels between the servers
• Striped Transfer: control channels to each server on one node, several parallel data channels between the servers, with data channels spread across nodes
Introduction to Grid Computing
Single Sign-on
• Important for complex applications that need to use Grid resources
– Enables easy coordination of varied resources
– Enables automation of processes
– Allows remote processes and resources to act on the user's behalf
– Authentication and Delegation
Grid Certificates
• Every user and service on the Grid is identified via a certificate, which contains information vital to identifying and authenticating the user or service.
• Grid certificates are based on standard PKI infrastructure and grant a set of privileges from one resource to another.
• Grid certificates provide dynamic delegation, dynamic entities, and repeated authentication.
[Figure: a grid certificate and the chain of proxies derived from it]
Internals of Grid Certificates
• A Grid certificate in X.509 format contains:
– Entity's qualified name
– Entity's public key
– Name of the issuing CA
– Signature of the issuing CA
– Validity dates (start and end dates)
• The Grid certificate associates the public key with a qualified Distinguished Name (DN).
• The DN is a unique identifier and is composed of:
– Person's name (Common Name, CN)
– Institution (Organization, O)
– Country (C)
– Example DN: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru
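The slash-separated DN format above can be parsed mechanically. A minimal sketch (it assumes attribute values contain no "/", which holds for the examples in these slides):

```python
def parse_dn(dn):
    """Split a slash-separated DN like /C=US/O=.../CN=... into fields.
    Repeated attributes (e.g. the extra CNs appended to a proxy's DN)
    are collected into a list, in order."""
    fields = {}
    for part in dn.strip("/").split("/"):
        key, _, value = part.partition("=")
        fields.setdefault(key, []).append(value)
    return fields

if __name__ == "__main__":
    dn = "/C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru"
    proxy_dn = dn + "/CN=1156762317"
    print(parse_dn(dn))
    print(parse_dn(proxy_dn)["CN"])  # the proxy carries a second CN component
```

Real GSI tooling does this with full X.509 parsing; the sketch only shows why the DN is usable as a unique, structured identifier.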
Grid Security Infrastructure
• How do we delegate our identity to a remote agent/program to act on our behalf?
– GSI solution: create a proxy certificate
• A new public-private key pair that is to be used only for a limited time. Never use this key pair again.
• Give this proxy cert and its private key to a trusted agent to work on your behalf
[Figure: the user's certificate ("My name is: Suresh Marru; my public key is: …", signed by a trusted CA) signs a short-lived proxy ("My name is: Suresh Marru's proxy; my public key is: …; do not use after 5:00 pm today", signed by Suresh Marru), whose new private key is handed to the agent]
Why Use Proxy Certificates?
• A certificate usually lasts a year
– If it's stolen, it's still good for the rest of the year
• unless it's revoked by being placed on a certificate revocation list (CRL)
– and your utility actually checks the CRL, with any frequency
• A proxy certificate usually lasts 12 hours
– Minimizes the possible mischief
Grid Proxy
• Temporary instance of a certificate
• Has an identical subject to the grid certificate, except that a unique number is added to the proxied credential's DN.
• Example:
– DN of the certificate: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru
– DN of the proxy to the certificate: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru/CN=1156762317
– DN of the proxy to the proxy: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru/CN=1156762317/CN=2136876587
• Proxies have a much shorter lifetime than certificates and are hence safer to transmit.
• A proxy can be created using Grid client software.
MyProxy Repository
• MyProxy is an online credential repository.
• Instead of users having to use grid software to create proxies of their certificates, they can delegate the task to MyProxy repositories.
• Only short-lived X.509 proxies are issued; long-lived private keys are safely stored and never given out.
• Avoids the need to copy certificate and key files between machines, improving both security and ease of use.
Local Resource Managers (LRM)
• Compute resources have a local resource manager (LRM) that controls:
– Who is allowed to run jobs
– How jobs run on a specific resource
– The order and location of jobs
• Example policy:
– Each cluster node can run one job.
– If there are more jobs, they must wait in a queue
• Examples: PBS, LSF, Condor
Local Resource Manager: a batch scheduler for running jobs on a computing cluster
• Popular LRMs include:
– PBS – Portable Batch System
– LSF – Load Sharing Facility
– SGE – Sun Grid Engine
– Condor – originally for cycle scavenging, Condor has evolved into a comprehensive system for managing computing
• LRMs execute on the cluster's head node
• The simplest LRM allows you to "fork" jobs quickly
– Runs on the head node (gatekeeper) for fast utility functions
– No queuing (but this is emerging, to "throttle" heavy loads)
• In GRAM, each LRM is handled with a "job manager"
GRAM: Globus Resource Allocation Manager
• GRAM provides a standardised interface to submit jobs to LRMs.
• Clients submit a job request to GRAM
• GRAM translates it into something a(ny) LRM can understand: the same job request can be used for many different kinds of LRM
Job Management on a Grid
[Figure: a user submits through GRAM, which dispatches to Condor at Site A, PBS at Site B, LSF at Site C, and fork at Site D across the Grid]
GRAM’s abiliIes
• Given a job specificaIon:
– Creates an environment for the job – Stages files to and from the environment – Submits a job to a local resource manager – Monitors a job – Sends noIficaIons of the job state change – Streams a job’s stdout/err during execuIon
GRAM components
[Figure: globus-job-run on the submitting machine (e.g. the user's workstation) contacts the gatekeeper over the Internet; the gatekeeper spawns job managers, which hand jobs to the LRM (e.g. Condor, PBS, LSF) for execution on the worker nodes/CPUs]
Submitting a job with GRAM
• globus-job-run command:
  $ globus-job-run workshop1.ci.uchicago.edu /bin/hostname
– Runs '/bin/hostname' on the resource workshop1.ci.uchicago.edu
• We don't care what LRM is used on 'workshop1': this command works with any LRM.
Resource Specification Language (RSL)
• Attribute & value pairings
• GRAM attributes: executable, arguments, count, directory, maxtime, jobtype, project, …
– A different way to describe a PBS script
• The LRM interprets the RSL attributes to manage the GRAM request
• RSL:
  & (project=TG-STA110014S)
    (jobtype=mpi)
    (directory=/lustre/scratch/mmcken6/apoa1/)
    (count=24)
    (executable=/lustre/scratch/mmcken6/namd2)
    (arguments=apoa1.namd)
• Equivalent PBS script:
  #!/bin/bash
  #PBS -l size=24
  #PBS -A TG-STA110014S
  cd /lustre/scratch/mmcken6/apoa1/
  aprun -n 24 /lustre/scratch/mmcken6/namd2 apoa1.namd
More on jobtype
• jobtype=single
– A solitary, non-parallel job
– The most flexible option
• One can set aprun/mpirun/another executable and its various arguments
• '&(executable=aprun)(arguments="-n" "24" "helloworld")(directory=/lustre/scratch/user)(jobtype=single)(count=24)(maxtime=10)(project=my_allocation)'
• jobtype=mpi
– Generates a PBS script for MPI jobs
• jobtype=multiple
– Parallel applications that do not depend on MPI
– Check with your friendly site contact on how this is implemented
Putting it all together: a Portal or Gateway
• Step 0 (one-time gateway community setup): a community account and grid certificate are established for the gateway
• Step 1: the end user authenticates to the gateway interface with a username and password
• Steps 2, 3, …: the gateway server sends a proxy along with the job-submit or file-transfer request to the compute servers, and relays job status and output back to the user
Workflow management systems
• Orchestration of many resources over long time periods
– Very complex to do manually; workflow automates this effort
• Enable restart of long-running scripts
• Write scripts in a manner that's location-independent: run anywhere
– A higher level of abstraction gives increased portability of the workflow script (over ad-hoc scripting)
Gateways: Democratizing Science
[Figure: rising abstractions increase the impact on the number of users: from developing & tuning applications, to deploying middleware and registering applications, to science gateways]
Download the XBaya Workflow GUI
• Go to http://airavata.org
• Click on Wiki (the second link on the left)
• Click on "SDSC Summer Institute Tutorial"
Science Gateways: Enabling & Democratizing Scientific Research
[Figure: gateways link knowledge and expertise, computational resources, scientific instruments, algorithms and models, archived data and metadata, and advanced science tools]
On-Demand Grid Computing
LEAD: an Integrated, Scalable Geosciences Framework (one byproduct: workflow orchestration for an on-demand, real-time, dynamically-adaptive system)
[Figure: streaming observations detect storms forming; a forecast model and data mining refine the forecast and steer the instruments]
Envisioned by a multi-disciplinary team from OU, IU, NCSA, Unidata, UAH, Howard, Millersville, Colorado State, and RENCI
Science Gateways: Holistic System Integration
[Figure: a gateway portal (portal server) ties together data storage, application services, a compute engine, instrument data, a metadata catalog, data brokering and data management services, a workflow engine with workflow graphs, a provenance collection service, an event notification bus, and fault tolerance & scheduling]
Anatomy of a Science Gateway
• Gateway User Interface
  • Web portals
  • Desktop clients
  • Social/collaboration capabilities
• Security infrastructure
• Analysis & visualization capabilities
• Workflow execution framework
  • Application abstraction
  • Workflow construction & enactment
  • Compute resource management
  • Scheduling
  • Messaging system
• Data management
• Provenance collection
Case Study: Dark Energy Survey
• Long-running code: depending on the simulation box size, L-Gadget can run for 3 to 5 days using more than 1024 cores on TACC Ranger.
• TACC policies: TACC's job scheduling policy does not allow jobs to run for more than 24 hours in the normal queue or 48 hours in the long queue.
• Do-while construct: restart support is needed in the workflow; a do-while construct was developed to address this need.
• Data size and file transfer challenges: L-Gadget produces ~10 TB for large DES simulation boxes in system scratch, so data needs to be moved to persistent storage ASAP.
• File system issues: more than 10,000 lightcone files perform continuous file I/O. Ranger has one Lustre metadata server serving 300 I/O nodes. Sometimes the metadata server can't find these lightcone files, which causes simulations to stop. We have wasted ~50k SUs this month struggling with I/O issues, and were advised to use MPI I/O.
Figure: processing steps to build a synthetic galaxy catalog. The XBaya workflow currently controls the top-most element (N-body simulations), which consists of methods to sample a cosmological power spectrum (ps), generate an initial set of particles (ic), and evolve the particles forward in time with Gadget (N-body). The remaining methods are run manually on distributed resources.
Case Study: ParamChem
• ParamChem researchers try to optimize the geometry of new molecules, which may or may not converge within a given time or number of steps.
• Contributing factors range from mathematical convergence issues in solutions of partial integro-differential equations to the potential shallowness of an energy landscape.
• The intermediate outputs from model iterations can be used to determine convergence.
Complex graph executions, with support for long-running and interactive executions, address these non-deterministic convergence problems.
NextGen Workflow Systems: the Need for Interactivity Across Layers
• Scientific workflow systems and compiled workflow languages have focused on modeling, scheduling, data movement, dynamic service creation, and monitoring of workflows.
• Building on these foundations, we extend them into an interactive and flexible workflow system.
• Features include:
  • interactive ways of intervening in and steering the workflow execution
  • an interpreted workflow execution model
  • a high-level instruction set
  • the flexibility to execute an individual workflow activity and wait for further analysis
Interactivity, contd.
• Deviations during workflow execution that do not affect the structure of the workflow:
  • dynamic change of workflow inputs; workflow rerun (interpreted workflow execution model)
  • dynamic change of the point of execution; workflow smart rerun
  • fault handling and exception models
• Deviations that change the workflow DAG during runtime:
  • reconfiguration of an activity
  • dynamic addition of activities to the workflow
  • dynamic removal or replacement of an activity in the workflow
Execution Patterns: Parametric Sweeps (Dot vs Cartesian)
[Figure: two sweep patterns over a three-stage pipeline A, B, C, with pruned computation.
Dot: Level 0 runs 4 instances × 4, giving 16 outputs; Level 1 runs 2 instances × (4×4), giving 32 outputs; Level 2 runs 1 instance × (32×32), giving 1024 outputs.
Cartesian: Level 0 runs 4×4 instances, giving 16 outputs; Level 1 runs 2×16 instances, giving 32 outputs; Level 2 runs 1×256 instances, giving 256 outputs.]
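The two sweep patterns correspond to element-wise pairing versus the full cross product of parameter lists. In Python terms this is just zip versus itertools.product (the parameter names are made-up examples):

```python
from itertools import product

# Dot-product sweep: parameter lists are paired element-wise (zip), so two
# lists of N values yield N runs. Cartesian sweep: every combination is
# generated (product), so two lists of N values yield N*N runs.

temps = [280, 290, 300, 310]       # hypothetical sweep parameter 1
pressures = [1, 2, 4, 8]           # hypothetical sweep parameter 2

dot_runs = list(zip(temps, pressures))            # 4 runs: (280,1), (290,2), ...
cartesian_runs = list(product(temps, pressures))  # 16 runs: every (temp, pressure)

if __name__ == "__main__":
    print(len(dot_runs), "dot runs vs", len(cartesian_runs), "cartesian runs")
```

This is why the Cartesian pattern's instance counts multiply level by level in the figure, while the dot pattern's stay linear in the list length.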
Why Apache for Gateway Software?
• The Apache Software Foundation is a neutral playing field
– A 501(c)(3) non-profit organization.
– Designed to encourage competitors to collaborate on foundational software.
– Includes a legal team for legal issues.
• The Foundation itself is sustainable
– Incorporated in 1999
– Multiple sponsors (Yahoo, Microsoft, Google, AMD, Facebook, IBM, …)
• Proven governance models
– Projects are run by Project Management Committees.
– New projects must go through incubation.
• Provides the social infrastructure for building communities.
• Opportunities to collaborate with other Apache projects outside the usual CI world.
Open Grid/Gateway Computing Environments (OGCE)
[Figure: under NMI & SDCI funding, OGCE re-engineers, generalizes, builds, tests, and releases gateway software originating in LEAD, GridChem, and the TeraGrid User Portal (OVP/RST/MIG), serving:
• Atmospheric Science – LEAD, OLAM
• Molecular Chemistry – GridChem, ParamChem, OREChem
• Bio Physics – Ultrascan
• Bio Informatics – BioVLAB, mCpG
• Astronomy – ODI, DES-SimWG
• Nuclear Physics – LCCI
Projects in the pipe: QuakeSim, VLAB, Einstein Gateway]
Apache Airavata: Assisting in Building Science Gateways
Science Gateways enable and support communities of users associated with a scientific discipline to use cyberinfrastructure through a common interface that is configured for optimal use.
[Figure: the end user reaches cyberinfrastructure through a science gateway]
Apache Software Foundation: Beyond Open Source, Open Community
• Transparency
  • Decision-making and actions are observable
  • Events of interest are published and recorded
  • Transparency invites collaboration
• Meritocratic governance
  • Influence on decisions is based on merit
  • Merit is earned in public
  • Community-based governance
• Community
  • Common interest, community interest, common experience
  • "Community before code"
• Collaboration
  • Systems supporting communication and coordination: repositories, trackers, forums, build tools
  • You can reuse what you can see and influence
  • More eyeballs mean better quality
Apache Airavata
• Airavata is an open source framework that enables a user to build Science Gateways.
• It is used to compose, manage, execute, and monitor distributed applications and workflows on computational resources.
[Figure: Apache Airavata sits between the science communities and their science gateways, which in turn use cyberinfrastructure]
Airavata Features
• A graphical user interface to construct, execute, control, manage, and reuse scientific workflows.
• Desktop tools and browser-based web interface components to manage applications, workflows, and generated data.
• Sophisticated server-side tools to register, schedule, and manage scientific applications on high-performance computational resources.
• The ability to interface and interoperate with various external (third-party) data, workflow, and provenance management tools.
Airavata Stakeholders
• Gateway End Users
• Gateway Developers
• Core Developers
[Figure: core developers work on Apache Airavata itself; gateway developers build science gateways on top of it; gateway end users in the science communities reach cyberinfrastructure through those gateways]
A Peek into Apache Airavata
[Figure: the graphical workflow client (XBaya) talks through the Client API to the server side, which comprises a repository (JackRabbit) behind a Registry API, a workflow enactment engine (Workflow Interpreter), a generic application toolkit (GFac), a messaging service (WS-Messenger), and a scheduler with resource info services, all fronting the cyberinfrastructure]
XSEDE ECSS Science Gateways Program
• Mission/purpose
– Science Gateways enable communities of users associated with a common discipline to use computational resources through a familiar, simpler interface.
– The mission of the Extended Support for Science Gateways (ESSGW) group is to provide Extended Collaborative Support to existing and new scientific communities in developing, enhancing, and maintaining Science Gateways that make effective use of XSEDE computational resources.
– Outreach to potential communities and help fostering new gateways.
– Engage the gateway community through forums & discussions.
ECSS Gateway Examples
• Implementation of new workflows for automation of scientific processes
• Incorporation of new visualization methods
• Innovative scheduling implementations
• Integration of XSEDE resources into a portal or Science Gateway
• Moving data from a gateway to XSEDE resources
• Bridging campus resources with XSEDE through a gateway
Be in the loop:
• [email protected] mailing list
• Send email to [email protected]
– with "subscribe gateways" in the body of the message
• Email Suresh Marru ([email protected]) or Nancy Wilkins-Diehr ([email protected])
• Apache Airavata – http://airavata.org