SCIENCE GATEWAYS Suresh Marru Mark Miller Nancy Wilkins-Diehr

Feb 27, 2023

Page 1: SCIENCE GATEWAYS

SCIENCE  GATEWAYS  

Suresh Marru Mark Miller

Nancy Wilkins-Diehr

Page 2: SCIENCE GATEWAYS

• I stole and mashed together slides from:
  – Nancy Wilkins-Diehr – XSEDE 12 Gateways overview
  – Henry Neeman, University of Oklahoma - Supercomputing in Plain English
  – Dennis Gannon – eScience Lectures
  – Grid animations were adopted from resources at the White Rose Grid eScience Center: http://www.wrgrid.org.uk/Resources/Presentations.php
  – Matt McKenzie, NICS, ORNL – XSEDE 12 Tutorial: Practical issues in running a gateway

Page 3: SCIENCE GATEWAYS

HPC/Grid Computing & XSEDE Science Gateways

Page 4: SCIENCE GATEWAYS

CIPRES  Science  Gateway  

Page 5: SCIENCE GATEWAYS

Interested in building or using a gateway? Hands-on Tutorial

Page 6: SCIENCE GATEWAYS

ECSS  Science  Gateways  

Where  to  go  from  here  

Page 7: SCIENCE GATEWAYS

I have been here since Monday and I am convinced …
o Supercomputing is all about size and speed, and XSEDE can potentially quench my computational hunger
o Parallelism is a good way of dividing and conquering my problem
o I can use batch queues and HPC machines effectively to run my applications
o SDSC consultants and sysadmins are the nicest people, and I can go back to them for further help

Page 8: SCIENCE GATEWAYS

It's Thursday, so why am I not on the beach yet?
• You have developed, tuned, or enhanced your application but would like to know how to share and execute it in familiar and simpler ways
• Science Gateways, an integral part of cyberinfrastructure, will solve all your problems. Really? Nah! But they:
  • Will help you use or build simple web interfaces to complex apps
  • Will absorb the nitty-gritty details of emerging technologies

Page 9: SCIENCE GATEWAYS

WHY  SCIENCE  GATEWAYS  

Page 10: SCIENCE GATEWAYS

How to run a job on Trestles/Gordon

[Diagram: jobs (JOB X, Y, Z, U, O, N) are submitted to the Batch Manager/Login Node, which places them in Queue-A, Queue-B, and Queue-C and dispatches them to slots on the worker nodes. The batch manager works with:]
§ Queues
§ Policies
§ Priorities
§ Share/Tickets
§ Resources
§ Users/Projects

Page 11: SCIENCE GATEWAYS

What  does  a  Batch  Manager  do  

• Dynamic Resource Management
  – Job scheduling
  – Resource monitoring
  – Policy administration
  – User authentication and access control
  – Accounting and reporting

Page 12: SCIENCE GATEWAYS

Key Feature: Match Making

[Diagram: selection and scheduling match each JOB against three sets of constraints.]
• System: system characteristics, system status, resources
• Job: job policies, resources
• User: user policies, groups, roles, departments, projects
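The matchmaking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` and `Job` classes, the `ALLOWED_QUEUES` policy table, and the first-fit rule are all hypothetical and do not describe how any particular batch manager is implemented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cores: int
    queue: str


@dataclass
class Job:
    name: str
    user: str
    cores: int
    queue: str


# Hypothetical user policy: which queues each user may submit to.
ALLOWED_QUEUES = {"alice": {"A", "B"}, "bob": {"C"}}


def match(job, nodes):
    """Return the first node satisfying user policy and job requirements, else None."""
    if job.queue not in ALLOWED_QUEUES.get(job.user, set()):
        return None  # user policy rejects the job outright
    for node in nodes:
        # System status: the node must serve the queue and have enough free cores.
        if node.queue == job.queue and node.free_cores >= job.cores:
            return node
    return None


nodes = [Node("worker1", 4, "A"), Node("worker2", 16, "B"), Node("worker3", 8, "C")]
```

A job that passes the user policy but finds no node with enough free cores gets `None` here; a real batch manager would leave it waiting in the queue instead.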

Page 13: SCIENCE GATEWAYS

How  to  interface  with  Batch  Managers  

[Diagram: ways to reach the Batch Manager]
• Graphical interfaces
• Command line
• Programmatic APIs
• Browser

Page 14: SCIENCE GATEWAYS

BACKGROUND  CONTEXT:  GRID  COMPUTING  

Page 15: SCIENCE GATEWAYS

What is Grid Computing?
• The grid vision is of “virtual computing” (+ information services to locate computation and storage resources)
  – Compare the web: “virtual documents” (+ search engine to locate them)
• MOTIVATION: collaboration through sharing resources (and expertise) to expand horizons of
  – Research
  – Commerce: engineering, …
  – Public service: health, environment, …

Page 16: SCIENCE GATEWAYS

Power Grid Metaphor

[Diagram: GRID MIDDLEWARE connects the following:]
• Visualising
• Workstation
• Mobile access
• Supercomputer, PC cluster
• Data storage, sensors, experiments
• Internet, networks

Page 17: SCIENCE GATEWAYS

Ian Foster’s Grid Checklist
• A Grid is a system that:
  – Coordinates resources that are not subject to centralized control
  – Uses standard, open, general-purpose protocols and interfaces
  – Delivers non-trivial qualities of service

Page 18: SCIENCE GATEWAYS

The Grid Middleware Stack

[Diagram labels:]
• Grid Security Infrastructure
• Job Management
• Data Management
• Grid Information Services
• Core Globus Services
• Standard Network Protocols and Web Services
• Workflow system (explicit or ad-hoc)
• Grid Application (often includes a Portal)

Page 19: SCIENCE GATEWAYS

Grid Middleware glues the grid together
• A short, intuitive definition: the software that glues together different clusters into a grid, taking into consideration the socio-political side of things (such as common policies on who can use what, how much, and what for)

Page 20: SCIENCE GATEWAYS

Grid middleware components
• Job management
• Storage management
• Information Services
• Security

Page 21: SCIENCE GATEWAYS

GridFTP Data Movement
[Slide source: LONI HPC Enablement Workshop, LaTech University, October 23, 2008, http://www.loni.org]
• Basic Transfer: one control channel, several parallel data channels
• Third-party Transfer: control channels to each server, several parallel data channels between servers
• Striped Transfer: control channels to each server on one node, several parallel data channels between servers, and data channels spread across nodes
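To see why several parallel data channels help, it is enough to look at how a file could be divided among them. The sketch below is an assumption-laden illustration: `split_ranges` is a hypothetical helper, and it does not reproduce the GridFTP wire protocol, only the idea of giving each channel its own contiguous byte range.

```python
def split_ranges(file_size, channels):
    """Divide [0, file_size) into `channels` contiguous byte ranges of near-equal size."""
    base, extra = divmod(file_size, channels)
    ranges = []
    start = 0
    for i in range(channels):
        # The first `extra` channels carry one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges
```

In a striped transfer the same partitioning idea would apply one level up, with ranges assigned to different nodes rather than to channels on a single node.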

Page 22: SCIENCE GATEWAYS

Introduction to Grid Computing

Single Sign-on
• Important for complex applications that need to use Grid resources
  – Enables easy coordination of varied resources
  – Enables automation of processes
  – Allows remote processes and resources to act on user’s behalf
  – Authentication and delegation

Page 23: SCIENCE GATEWAYS

Grid Certificate
• Every user and service on the Grid is identified via a certificate, which contains information vital to identifying and authenticating the user or service.
• Grid certificates are based on standard public key infrastructure (PKI) and convey a set of privileges from one resource to another.
• Grid certificates provide the features of dynamic delegation, dynamic entities and repeated authentication.

[Diagram: a Grid Certificate and proxies of the certificate]

Page 24: SCIENCE GATEWAYS

Internals of Grid Certificates
• A Grid certificate in X.509 certificate format contains:
  – Entity’s qualified name
  – Entity’s public key
  – Name of the issuing CA
  – Signature of issuing CA
  – Validity dates (start and end dates)
• The Grid certificate associates the public key with a qualified Distinguished Name (DN).
• DN is a unique identifier and is composed of:
  – Person's name (Common Name or CN)
  – Institution (Organization O)
  – Country (C)
  – Example DN: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru
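The structure of a DN is easy to see by splitting the example into its components. The helper below is a minimal sketch, assuming the slash-separated form shown on the slide and values that contain no "/" character; `parse_dn` is a hypothetical name, not part of any Grid toolkit.

```python
def parse_dn(dn):
    """Split a slash-separated DN such as '/C=US/O=Org/CN=Name' into (key, value) pairs."""
    parts = []
    for component in dn.strip("/").split("/"):
        # Each component looks like KEY=VALUE; split at the first '=' only.
        key, _, value = component.partition("=")
        parts.append((key, value))
    return parts


example = "/C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru"
fields = parse_dn(example)
```

The result is kept as an ordered list rather than a dict because a DN may repeat a key, as proxy DNs do with CN.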

Page 25: SCIENCE GATEWAYS

Grid Security Infrastructure
• How do we delegate our identity to a remote agent/program to act on our behalf?
  – GSI solution: create a proxy certificate
    • A new public-private key pair that is to be used only for a limited time. Never use this key pair again.
    • Give this proxy cert and its private key to a trusted agent to work on your behalf

[Diagram: two credentials]
• Certificate: "My name is: Suresh Marru. My public key is: (my pub key)." Signed by a trusted CA.
• Proxy certificate: "My name is: Suresh Marru’s proxy. My public key is: (my new pub key). Do not use after: 5:00 pm today." Signed by Suresh Marru. It travels with my new private key.

Page 26: SCIENCE GATEWAYS

Why Use Proxy Certificates?
• A certificate usually lasts a year
  – If it’s stolen, it’s still good for the rest of the year
    • unless it’s revoked by being placed on a certificate revocation list (CRL)
  – And your utility actually checks the CRL.
    » With any frequency
• A proxy certificate usually lasts 12 hours
  – Minimizes the possible mischief
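The difference in exposure can be put in numbers. The sketch below assumes no revocation check at all, which is the worst case the slide warns about; the issue and theft times are made-up illustration values and `exposure` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def exposure(issued, lifetime, stolen_at):
    """Time a stolen credential stays usable, assuming nobody checks a CRL."""
    expires = issued + lifetime
    return max(expires - stolen_at, timedelta(0))


issued = datetime(2012, 8, 6, 9, 0)        # hypothetical issue time
stolen_at = issued + timedelta(hours=1)    # hypothetical theft one hour later

cert_window = exposure(issued, timedelta(days=365), stolen_at)
proxy_window = exposure(issued, timedelta(hours=12), stolen_at)
```

With these values the stolen proxy is useful for 11 more hours, while the stolen certificate stays useful for almost the whole year.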

Page 27: SCIENCE GATEWAYS

Grid Proxy
• Temporary instance of a certificate
• Has an identical subject to the grid certificate. A unique number is added to the proxied credential’s DN.
• Example:
  – DN of the certificate: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru
  – DN of the proxy to the certificate: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru/CN=1156762317
  – DN of the proxy to the proxy: /C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru/CN=1156762317/CN=2136876587
• Proxies have a much shorter lifetime than certificates and are hence safer to transmit.
• A proxy can be created using Grid client software.
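The DN naming rule for proxies can be sketched as string manipulation. This is only an illustration of the naming convention in the example above: `proxy_dn` and `is_proxy_of` are hypothetical helpers, and real proxy validation verifies signatures and validity dates rather than comparing names.

```python
import random


def proxy_dn(parent_dn, serial=None):
    """Derive a proxy DN by appending a numeric CN component to the parent DN."""
    if serial is None:
        serial = random.randrange(1, 2**31)  # stand-in for the unique number
    return f"{parent_dn}/CN={serial}"


def is_proxy_of(candidate_dn, cert_dn):
    """True if candidate_dn is cert_dn followed by one or more purely numeric CN components."""
    if not candidate_dn.startswith(cert_dn + "/"):
        return False
    suffix = candidate_dn[len(cert_dn) + 1:]
    return all(c.startswith("CN=") and c[3:].isdigit() for c in suffix.split("/"))


cert = "/C=US/O=National Center for Supercomputing Applications/CN=Suresh Marru"
first_proxy = proxy_dn(cert, 1156762317)
second_proxy = proxy_dn(first_proxy, 2136876587)
```

Because each delegation step only appends a component, the original identity stays readable at the front of every proxy DN in the chain.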

Page 28: SCIENCE GATEWAYS

MyProxy Repository
• MyProxy is an online credential repository.
• Instead of users having to use grid software to create proxies of their certificates, they can delegate the task to MyProxy repositories.
• Only short-lived X.509 proxies are issued; long-lived private keys are safely stored and never given out.
• Avoids the need to copy certificate and key files between machines, which is both more secure and easier to use.

Page 29: SCIENCE GATEWAYS

Local Resource Managers (LRM)
• Compute resources have a local resource manager (LRM) that controls:
  – Who is allowed to run jobs
  – How jobs run on a specific resource
  – The order and location of jobs
• Example policy:
  – Each cluster node can run one job.
  – If there are more jobs, then they must wait in a queue
• Examples: PBS, LSF, Condor
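The example policy (one job per node, the rest wait) is small enough to write down directly. This is a toy sketch with hypothetical job and node names, not the scheduling logic of PBS, LSF, or Condor.

```python
from collections import deque


def schedule(jobs, nodes):
    """Assign one job per node in arrival order; return (running, waiting)."""
    queue = deque(jobs)
    running = {}
    for node in nodes:
        if not queue:
            break  # more nodes than jobs: remaining nodes stay idle
        running[node] = queue.popleft()
    return running, list(queue)


running, waiting = schedule(["JOB X", "JOB Y", "JOB Z"], ["node1", "node2"])
```

Here two jobs start immediately and the third waits until a node frees up.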

Page 30: SCIENCE GATEWAYS

Local Resource Manager: a batch scheduler for running jobs on a computing cluster
• Popular LRMs include:
  – PBS – Portable Batch System
  – LSF – Load Sharing Facility
  – SGE – Sun Grid Engine
  – Condor – Originally for cycle scavenging, Condor has evolved into a comprehensive system for managing computing
• LRMs execute on the cluster’s head node
• Simplest LRM allows you to “fork” jobs quickly
  – Runs on the head node (gatekeeper) for fast utility functions
  – No queuing (but this is emerging to “throttle” heavy loads)
• In GRAM, each LRM is handled with a “job manager”

Page 31: SCIENCE GATEWAYS

GRAM: Globus Resource Allocation Manager
• GRAM provides a standardised interface to submit jobs to LRMs.
• Clients submit a job request to GRAM
• GRAM translates it into something a(ny) LRM can understand, so the same job request can be used for many different kinds of LRM

Page 32: SCIENCE GATEWAYS

Job Management on a Grid

[Diagram: a user submits jobs through GRAM to the Grid. Sites A, B, C, and D run different LRMs: Condor, PBS, LSF, and fork.]

Page 33: SCIENCE GATEWAYS

GRAM’s abilities
• Given a job specification:
  – Creates an environment for the job
  – Stages files to and from the environment
  – Submits a job to a local resource manager
  – Monitors a job
  – Sends notifications of the job state change
  – Streams a job’s stdout/err during execution

Page 34: SCIENCE GATEWAYS

GRAM components

[Diagram: the submitting machine (e.g. user's workstation) runs globus-job-run and connects over the Internet to the Gatekeeper. Jobmanagers sit between the Gatekeeper and the LRM (e.g. Condor, PBS, LSF), which runs jobs on the worker nodes/CPUs.]

Page 35: SCIENCE GATEWAYS

Submitting a job with GRAM
• globus-job-run command
  $ globus-job-run workshop1.ci.uchicago.edu /bin/hostname
  – Run '/bin/hostname' on the resource workshop1.ci.uchicago.edu
• We don't care what LRM is used on ‘workshop1’. This command works with any LRM.

Page 36: SCIENCE GATEWAYS

Resource Specification Language (RSL)
• Attribute & value pairings
• GRAM attributes: executable, arguments, count, directory, maxtime, jobtype, project, …
  – A different way to describe a PBS script
• LRM interprets the RSL attributes to manage the GRAM request

RSL:
  & (project=TG-STA110014S)
    (jobtype=mpi)
    (directory=/lustre/scratch/mmcken6/apoa1/)
    (count=24)
    (executable=/lustre/scratch/mmcken6/namd2)
    (arguments=apoa1.namd)

Equivalent PBS script:
  #!/bin/bash
  #PBS -l size=24
  #PBS -A TG-STA110014S
  cd /lustre/scratch/mmcken6/apoa1/
  aprun -n 24 /lustre/scratch/mmcken6/namd2 apoa1.namd
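The correspondence between the RSL request and the PBS script on this slide can be sketched as a small translation. This is an illustration only: `parse_rsl` and `to_pbs` are hypothetical helpers that handle just the simple `(name=value)` form shown here, and the generated script mirrors this slide's example (including its site-specific `size=` and `aprun` lines) rather than what a GRAM job manager actually emits.

```python
import re


def parse_rsl(rsl):
    """Extract (name=value) pairs from a simple RSL string into a dict."""
    return dict(re.findall(r"\(\s*(\w+)\s*=\s*([^)]*?)\s*\)", rsl))


def to_pbs(attrs):
    """Render the attributes as a PBS script shaped like the one on the slide."""
    lines = [
        "#!/bin/bash",
        f"#PBS -l size={attrs['count']}",
        f"#PBS -A {attrs['project']}",
        f"cd {attrs['directory']}",
        f"aprun -n {attrs['count']} {attrs['executable']} {attrs['arguments']}",
    ]
    return "\n".join(lines)


rsl = (
    "&(project=TG-STA110014S)(jobtype=mpi)"
    "(directory=/lustre/scratch/mmcken6/apoa1/)(count=24)"
    "(executable=/lustre/scratch/mmcken6/namd2)(arguments=apoa1.namd)"
)
attrs = parse_rsl(rsl)
script = to_pbs(attrs)
```

Swapping `to_pbs` for a renderer that targets another LRM, while keeping `parse_rsl` unchanged, is the point GRAM makes: one job description, many back ends.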

Page 37: SCIENCE GATEWAYS

More on Jobtype
• Jobtype=single
  – This is a solitary, non-parallel job
  – Most flexible option
    • One can set aprun/mpirun/other exe and its various arguments
    • '&(executable=aprun)(arguments="-n" "24" "helloworld")(directory=/lustre/scratch/user)(jobtype=single)(count=24)(maxtime=10)(project=my_allocation)'
• Jobtype=mpi
  – Generates a PBS script for MPI jobs
• Jobtype=multiple
  – Parallel applications that do not depend on MPI
  – Check with your site on how this is implemented
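A gateway usually assembles such RSL strings programmatically. The helper below is a minimal sketch, assuming the flat `(name=value)` form used in the jobtype=single example; `to_rsl` is a hypothetical function, and list values are rendered as space-separated quoted strings to match the `arguments` attribute shown above.

```python
def to_rsl(attrs):
    """Render a dict as an RSL string: '&' followed by (name=value) pairs."""

    def fmt(value):
        # Lists become space-separated quoted items, e.g. "-n" "24" "helloworld".
        if isinstance(value, (list, tuple)):
            return " ".join(f'"{v}"' for v in value)
        return str(value)

    return "&" + "".join(f"({name}={fmt(value)})" for name, value in attrs.items())


request = to_rsl(
    {
        "executable": "aprun",
        "arguments": ["-n", "24", "helloworld"],
        "directory": "/lustre/scratch/user",
        "jobtype": "single",
        "count": 24,
        "maxtime": 10,
        "project": "my_allocation",
    }
)
```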

Page 38: SCIENCE GATEWAYS

Putting it all together: Portal or a Gateway

[Diagram: flow between the gateway interface, the gateway server, and the compute servers]
• Step 0: one-time gateway community setup (community account, grid certificate)
• Step 1: gateway authentication (username, password)
• Steps 2, 3, …: job submit or file transfer request; the proxy and job request go to the compute servers, and job status and output come back

Page 39: SCIENCE GATEWAYS

Workflow management systems
• Orchestration of many resources over long time periods
  – Very complex to do manually; workflow automates this effort
• Enables restart of long running scripts
• Write scripts in a manner that’s location-independent: run anywhere
  – Higher level of abstraction gives increased portability of the workflow script (over ad-hoc scripting)
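The restart capability can be illustrated with a tiny step runner that remembers which steps have finished. This is a sketch under stated assumptions: the step names and the simulated failure are invented, and a real workflow system would persist the `completed` record rather than keep it in memory.

```python
def run_workflow(steps, completed):
    """Run steps in order, skipping names already in `completed`; return names run now."""
    ran = []
    for name, action in steps:
        if name in completed:
            continue  # finished in an earlier attempt, do not repeat
        action()  # may raise; steps completed so far stay recorded
        completed.add(name)
        ran.append(name)
    return ran


attempts = {"ic": 0}


def flaky_ic():
    """Hypothetical step that fails on its first attempt only."""
    attempts["ic"] += 1
    if attempts["ic"] == 1:
        raise RuntimeError("simulated node failure")


steps = [("ps", lambda: None), ("ic", flaky_ic), ("nbody", lambda: None)]
completed = set()

try:
    first = run_workflow(steps, completed)
except RuntimeError:
    first = None  # the first attempt stopped at the failing step

second = run_workflow(steps, completed)  # restart resumes after the last good step
```

On the restart, only the failed step and the steps after it run again.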

Page 40: SCIENCE GATEWAYS

Gateways: Democratizing Science

[Figure: axes "Abstractions" and "Impact on number of Users"; stages: Develop & Tune Applications; Deploy Middleware, Register Applications; Science Gateways]

Page 41: SCIENCE GATEWAYS

THAT’S PROMISING. CAN WE LOOK AT ONE OF THE GATEWAYS IN ACTION? CIPRES GATEWAY – MARK MILLER

Page 42: SCIENCE GATEWAYS

HANDS-ON: LET’S TRY TO RUN THE TUTORIAL APPLICATIONS FROM A GATEWAY

Page 43: SCIENCE GATEWAYS

Download XBaya Workflow GUI
• Go to http://airavata.org
• Click on Wiki (the second link on the left)
• Click on “SDSC Summer Institute Tutorial”

Page 44: SCIENCE GATEWAYS


Knowledge and Expertise

Computational Resources

Scientific Instruments

Algorithms and Models

Archived Data and Metadata

Advanced Science Tools

Science Gateways: Enabling & Democratizing Scientific Research

Page 45: SCIENCE GATEWAYS

On-Demand Grid Computing

LEAD: an Integrated, Scalable Geosciences Framework (one of the byproducts: Workflow Orchestration for On-Demand, Real-Time, Dynamically-Adaptive Systems)

[Diagram labels: Streaming Observations, Storms Forming, Forecast Model, Data Mining, Refine Forecast, Instrument Steering]

Envisioned by a multi-disciplinary team from OU, IU, NCSA, Unidata, UAH, Howard, Millersville, Colorado State, RENCI

Page 46: SCIENCE GATEWAYS

Science Gateways: Holistic System Integration

[Diagram components:]
• Gateway Portal, Portal server
• Application services
• Compute Engine
• Data Storage
• Instrument, Data Catalog, Metadata catalog
• Data Brokering service, Data Management Service
• Workflow Engine, Workflow graphs
• Provenance Collection service
• Event Notification Bus
• Fault Tolerance & scheduler

Page 47: SCIENCE GATEWAYS

Anatomy of a Science Gateway
• Gateway User Interface
  • Web Portals
  • Desktop Clients
  • Social/Collaboration Capabilities
• Security Infrastructure
• Analyses & Visualization Capabilities
• Workflow Execution Framework
  • Application Abstraction
  • Workflow Construction & Enactment
  • Compute Resource Management
  • Scheduling
  • Messaging System
• Data Management
• Provenance Collection

Page 48: SCIENCE GATEWAYS

Case Study: Dark Energy Survey
• Long-running code: depending on the simulation box size, L-gadget can run for 3 to 5 days using more than 1024 cores on TACC Ranger.
• TACC policies: the TACC job scheduling policy does not allow jobs to run for more than 24 hours in the normal queue or 48 hours in the long queue.
• Do-while construct: restart support is needed in the workflow. A do-while construct was developed to address this need.
• Data size and file transfer challenges: L-gadget produces 10 TB for large DES simulation boxes in system scratch, so the data needs to be moved to persistent storage ASAP.
• File system issues: more than 10,000 lightcone files are doing continuous file I/O. Ranger has one Lustre metadata server to serve 300 I/O nodes. Sometimes the metadata server can’t find these lightcone files, which makes the simulations stop. We have wasted ~50k SU this month struggling with I/O issues and getting a recommendation to use MPI I/O.

Figure: Processing steps to build a synthetic galaxy catalog. The XBaya workflow currently controls the top-most element (N-body simulations), which consists of methods to sample a cosmological power spectrum (ps), generate an initial set of particles (ic), and evolve the particles forward in time with Gadget (N-body). The remaining methods are run manually on distributed resources.
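The arithmetic behind the do-while construct is simple: a run longer than the queue limit has to be split into several submissions, each restarting from the previous checkpoint. The sketch below only counts submissions; it assumes perfect checkpointing and ignores queue wait time, and both function names are hypothetical.

```python
import math


def submissions_needed(total_hours, queue_limit_hours):
    """Closed-form count of submissions needed to cover the whole run."""
    return math.ceil(total_hours / queue_limit_hours)


def run_with_restarts(total_hours, queue_limit_hours):
    """Do-while loop: always submit once, then resubmit until the work is done."""
    done = 0.0
    submissions = 0
    while True:
        # Each submission advances the run by at most one queue limit.
        done += min(queue_limit_hours, total_hours - done)
        submissions += 1
        if done >= total_hours:
            break
    return submissions
```

For a 5-day (120-hour) run this gives 5 submissions in the 24-hour normal queue and 3 in the 48-hour long queue.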

Page 49: SCIENCE GATEWAYS

Case Study: ParamChem
• ParamChem researchers try to optimize the geometry of new molecules, and the optimization may or may not converge within a given time or number of steps.
• Contributing factors range from mathematical convergence issues in solving partial integro-differential equations to the potential shallowness of an energy landscape.
• The intermediate outputs from model iterations can be used to determine convergence.

Complex graph executions with support for long-running and interactive executions are needed to address non-deterministic convergence problems.
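Checking intermediate outputs for convergence could look like the sketch below. The criterion used here (the last few successive energy changes all fall below a tolerance) is an assumption chosen for illustration, not ParamChem's actual test, and the `tol` and `window` defaults are arbitrary.

```python
def converged(energies, tol=1e-6, window=3):
    """True if the last `window` successive energy changes are all below `tol`."""
    if len(energies) < window + 1:
        return False  # not enough iterations to judge yet
    recent = energies[-(window + 1):]
    return all(abs(b - a) < tol for a, b in zip(recent, recent[1:]))
```

A workflow could call such a check after each iteration and decide interactively whether to continue, stop, or restart with different inputs.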

Page 50: SCIENCE GATEWAYS

NextGen Workflow Systems: Need for Interactivity Across Layers
• Scientific workflow systems and compiled workflow languages have focused on modeling, scheduling, data movement, dynamic service creation and monitoring of workflows.
• Building on these foundations, we extend to interactive and flexible workflow systems.
• Features include:
  • interactive ways of interfering with and steering the workflow execution
  • interpreted workflow execution model
  • high-level instruction set
  • flexibility to execute an individual workflow activity and wait for further analysis.

Page 51: SCIENCE GATEWAYS

Interactivity Contd.
• Derivations during workflow execution that do not affect the structure of the workflow
  • Dynamic change of workflow inputs, workflow rerun. Interpreted workflow execution model.
  • Dynamic change in point of execution, workflow smart rerun.
  • Fault handling and exception models.
• Derivations that change the workflow DAG during runtime
  • Reconfiguration of an activity.
  • Dynamic addition of activities to the workflow.
  • Dynamic removal or replacement of an activity in the workflow.
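Changing the DAG at runtime can be sketched with a plain dependency dictionary. Everything here is hypothetical: the activity names, the helper functions, and the choice to reconnect dependents when an activity is removed. It uses Python's standard `graphlib` (3.9+) to recompute the execution order and does not reflect how any particular workflow engine stores its graph.

```python
from graphlib import TopologicalSorter


def add_activity(dag, name, depends_on):
    """Add a new activity that runs after the given dependencies."""
    dag[name] = set(depends_on)


def remove_activity(dag, name):
    """Remove an activity and reconnect its dependents to its dependencies."""
    deps = dag.pop(name)
    for other in dag.values():
        if name in other:
            other.discard(name)
            other.update(deps)


def replace_activity(dag, old, new):
    """Swap one activity for another, keeping all edges in place."""
    dag[new] = dag.pop(old)
    for other in dag.values():
        if old in other:
            other.discard(old)
            other.add(new)


def execution_order(dag):
    """Return one valid execution order for the current DAG."""
    return list(TopologicalSorter(dag).static_order())


# Each key maps an activity to the set of activities it depends on.
dag = {"prepare": set(), "simulate": {"prepare"}, "analyze": {"simulate"}}
```

An interpreted execution model makes such edits practical, because the engine looks up the next activity from the current graph instead of a precompiled plan.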

Page 52: SCIENCE GATEWAYS

Execution Patterns: Parametric Sweeps, Dot vs Cartesian

[Figure, first panel (inputs A, B, C; includes a pruned computation): Level 0: 4 instances × 4 → 16 outputs; Level 1: 2 instances × (4×4) → 32 outputs; Level 2: 1 instance × (32×32) → 1024 outputs]
[Figure, second panel (inputs A, B, C): Level 0: 4×4 instances → 16 outputs; Level 1: 2×16 instances → 32 outputs; Level 2: 1×256 instances → 256 outputs]
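The two sweep patterns differ in how input lists are combined. The sketch below assumes the usual meaning of the terms: a dot sweep pairs inputs position by position, while a Cartesian sweep runs every combination. The function names and sample values are illustrative.

```python
from itertools import product


def dot_sweep(*params):
    """Pair the i-th value of every parameter: n runs for n values each."""
    return list(zip(*params))


def cartesian_sweep(*params):
    """Combine every value with every other: n * m * ... runs."""
    return list(product(*params))


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
dot_runs = dot_sweep(a, b)
cartesian_runs = cartesian_sweep(a, b)
```

With four values per input, the dot sweep launches 4 instances and the Cartesian sweep launches 16, which is why the choice matters so much for the number of outputs at each workflow level.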

Page 53: SCIENCE GATEWAYS

Why Apache for Gateway Software?
• Apache Software Foundation is a neutral playing field
  – 501(c)(3) non-profit organization.
  – Designed to encourage competitors to collaborate on foundational software.
  – Includes a legal cell for legal issues.
• Foundation itself is sustainable
  – Incorporated in 1999
  – Multiple sponsors (Yahoo, Microsoft, Google, AMD, Facebook, IBM, …)
• Proven governance models
  – Projects are run by Project Management Committees.
  – New projects must go through incubation.
• Provides the social infrastructure for building communities.
• Opportunities to collaborate with other Apache projects outside the usual CI world.

Page 54: SCIENCE GATEWAYS

ParamChem Gateway

with Airavata

Page 55: SCIENCE GATEWAYS

Open Grid/Gateway Computing Environments (OGCE)

[Diagram: LEAD, GridChem, and the TeraGrid User Portal feed into OGCE (NMI & SDCI funding), which re-engineers, generalizes, builds, tests, and releases. Domains and projects shown:]
• Atmospheric Science: LEAD, OLAM
• Molecular Chemistry: GridChem, ParamChem, OREChem
• Bio Informatics: BioVLAB, mCpG
• Astronomy: ODI, DES-SimWG
• Nuclear Physics: LCCI
• Projects in the pipe: QuakeSim, VLAB, Einstein Gateway
• Other labels: Bio Physics; Ultrascan; OVP/RST/MIG

Page 56: SCIENCE GATEWAYS

Apache  Airavata:    Assisting in building Science Gateways

Science Gateways enable and support communities of users associated with a scientific discipline to use cyber infrastructure through a common interface that is configured for optimal use.

Cyber Infrastructure

Science Gateway

End User

Page 57: SCIENCE GATEWAYS

Apache Software Foundation: Beyond Open Source, Open Community
• Transparency
  • Decision-making and actions are observable
  • Events of interest are published and recorded
  • Transparency invites collaboration
• Meritocratic Governance
  • Influence on decisions is based on merit
  • Merit is earned in public
  • Community based governance
• Community
  • Common interest, Community interest, Common experience
  • “Community before code”
• Collaboration
  • Systems supporting communication and coordination: repositories, trackers, forums, build tools
  • You can reuse what you can see and influence
  • More eyeballs means better quality

Page 58: SCIENCE GATEWAYS

Apache Airavata
• Airavata is an open source framework which enables a user to build Science Gateways.
• It is used to compose, manage, execute and monitor distributed applications and workflows on computational resources.

[Diagram: Apache Airavata sits beneath Science Gateways, which connect Science Communities to Cyber Infrastructure.]

Page 59: SCIENCE GATEWAYS

Airavata Features
• Graphical user interface to construct, execute, control, manage and reuse scientific workflows.
• Desktop tools and browser-based web interface components to manage applications, workflows and generated data.
• Sophisticated server-side tools to register, schedule and manage scientific applications on high performance computational resources.
• Ability to interface and interoperate with various external (third party) data, workflow and provenance management tools.

Page 60: SCIENCE GATEWAYS

Airavata Stakeholders
• Gateway End Users
• Gateway Developers
• Core Developers

[Diagram: the same Airavata picture (Apache Airavata, Science Gateways, Science Communities, Cyber Infrastructure), annotated with Core Developer, Gateway Developer, and Gateway End User.]

Page 61: SCIENCE GATEWAYS

Peek into Apache Airavata

[Architecture diagram components:]
• Client: Graphical Workflow Client [XBaya]
• Client API
• Registry API, Repository [JackRabbit]
• Workflow Enactment Engine [Workflow Interpreter]
• Generic Application Toolkit [GFac]
• Messaging Service [WS-Messenger]
• Scheduler
• Resource Info Services
• Cyber Infrastructure

Page 62: SCIENCE GATEWAYS

https://www.xsede.org/gateways-overview

Page 63: SCIENCE GATEWAYS

XSEDE ECSS Science Gateways Program

• Mission/purpose
  – Science Gateways enable communities of users associated with a common discipline to use computational resources through a familiar and simpler interface.
  – The mission of the Extended Support for Science Gateways (ESSGW) group is to provide Extended Collaborative Support to existing and new scientific communities in developing, enhancing and maintaining Science Gateways that use XSEDE computational resources effectively.
  – Reach out to potential communities and help foster new gateways.
  – Engage the gateway community through forums & discussions.

Page 64: SCIENCE GATEWAYS

ECSS Gateway Examples
• Implementation of new workflows for automation of scientific processes
• Incorporation of new visualization methods
• Innovative scheduling implementation
• Integration of XSEDE resources into a portal or Science Gateway
• Move data from gateway to XSEDE resources
• Bridge Campus Resources with XSEDE through a gateway

Page 65: SCIENCE GATEWAYS

Be in the loop:
• [email protected] Mailing List
• Send email to [email protected]
  – with "subscribe gateways" in the body of the message
• Email Suresh Marru ([email protected]) or Nancy Wilkins-Diehr ([email protected])
• Apache Airavata - http://airavata.org

Page 66: SCIENCE GATEWAYS

ADVANCED HANDS-ON: LET’S TRY TO BUILD A GATEWAY FROM SCRATCH