The SBGrid Science Portal: An integrated environment for protein structure studies. Ian Stokes-Rees, Harvard Medical School. eScience 2012, Chicago, October 2012

SBGrid Science Portal - eScience 2012

May 11, 2015


Ian Stokes-Rees

The SBGrid Science Portal provides multi-modal access to computational infrastructure, data storage, and data analysis tools for the structural biology community. It incorporates features not previously seen in cyberinfrastructure science gateways. It enables researchers to securely share a computational study area, including large volumes of data and active computational workflows. A rich identity management system has been developed that simplifies federated access to US national cyberinfrastructure, distributed data storage, and high performance file transfer tools. It integrates components from the Virtual Data Toolkit, Condor, glideinWMS, the Globus Toolkit and Globus Online, the FreeIPA identity management system, Apache web server, and the Django web framework.
Transcript
Page 1: SBGrid Science Portal - eScience 2012

The SBGrid Science Portal: An integrated environment for protein structure studies

Ian Stokes-Rees, Harvard Medical School

eScience 2012, Chicago, October 2012

Page 7: SBGrid Science Portal - eScience 2012

j.mp/esci12-sbgrid [email protected]

What’s interesting about another Science Portal?

✦ Interface modalities
  • Web forms, RESTful interfaces, command line

✦ Access model
  • Browser SSO, X.509, LDAP, .htaccess, GACL

✦ Identity management
  • Streamlined grid account creation

✦ Computational capability
  • local, cluster, and grid computing

✦ Data management
  • Web (HTTP), scp, GridFTP, GlobusOnline
  • Tiered staging of data

Page 10: SBGrid Science Portal - eScience 2012

I’m still skeptical. What about Taverna, GridSphere, Galaxy, or HubZero?

✦ All great if
  • the portal or application plugin already exists; and
  • the application workflows closely match your requirements

✦ Not-so-great if
  • you have to implement a new portal on top of one of those frameworks
  • you want to adapt the workflow
  • your data model changes
  • you want to add a new application
  • you want to explore the data in an unanticipated way
  • command-line access is also important to you
  • you are working with others

Page 12: SBGrid Science Portal - eScience 2012

Outline

✦ Community
  • Who the SBGrid Science Portal is meant to serve

✦ Objectives
  • What was the vision for the Science Portal

✦ Implementation
  • Software and service architectures

✦ Security, Collaboration, and IdM
  • ... or “How I learned to stop worrying and love X.509”

✦ Data
  • Tiered data distribution model

Page 13: SBGrid Science Portal - eScience 2012

Community

[Map of SBGrid member labs, listed by institution]

Rice University: E. Nikonowicz, Y. Shamoo, Y.J. Tao
CalTech: P. Bjorkman, W. Clemons, G. Jensen, D. Rees
Stanford: A. Brunger, K. Garcia, T. Jardetzky
UCSF: JJ Miranda, Y. Cheng
UC Davis: H. Stahlberg
UCSD: T. Nakagawa, H. Viadiu
WesternU: M. Swairjo
U. Washington: T. Gonen
Washington U. School of Med.: T. Ellenberger, D. Fremont
Vanderbilt Center for Structural Biology: W. Chazin, B. Eichman, M. Egli, B. Lacy, M. Ohi, C. Sanders, B. Spiller, M. Stone, M. Waterman
Rosalind Franklin: D. Harrison
Harvard and Affiliates: N. Beglova, S. Blacklow, B. Chen, J. Chou, J. Clardy, M. Eck, B. Furie, R. Gaudet, M. Grant, S.C. Harrison, J. Hogle, D. Jeruzalmi, D. Kahne, T. Kirchhausen, A. Leschziner, K. Miller, A. Rao, T. Rapoport, M. Samso, P. Sliz, T. Springer, G. Verdine, G. Wagner, L. Walensky, S. Walker, T. Walz, J. Wang, S. Wong
Cornell U.: R. Cerione, B. Crane, S. Ealick, M. Jin, A. Ke, R. Oswald, C. Parrish, H. Sondermann
NE-CAT
Brandeis U.: N. Grigorieff
Tufts U.: K. Heldwein
UMass Medical: W. Royer
NIH: M. Mayer
U. Maryland: E. Toth
Yale U.: T. Boggon, D. Braddock, Y. Ha, E. Lolis, K. Reinisch, J. Schlessinger, F. Sigworth, F. Zhou
Columbia U.: Q. Fan
Rockefeller U.: R. MacKinnon
Thomas Jefferson: J. Williams

Not Pictured: University of Toronto: L. Howell, E. Pai, F. Sicheri; NHRI (Taiwan): G. Liou; Trinity College, Dublin: Amir Khan

Page 15: SBGrid Science Portal - eScience 2012

Structural Biology: Study of Protein Structure and Function

[Figure scale labels: 1 mm, 10 nm, 400 m]

• Shared scientific data collection facility
• Data intensive (10-100 GB/day)

Page 16: SBGrid Science Portal - eScience 2012

Consortium By The Numbers

✦ ~200 member labs
  • representing about 1500 users

✦ ~200 software packages
  • multi-platform (Linux, OS X)
  • multi-version

✦ 4 FTE staff

✦ Automated software distribution
  • 80 GB for full package
  • rsync+ssh for updates

✦ Everything “Just Works”
  • So labs are happy to renew membership and refer friends
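The rsync+ssh update mechanism listed above could be driven by a small script along the following lines. This is only a sketch under stated assumptions: the host name and both paths are hypothetical placeholders rather than SBGrid's actual layout, and the function just builds the command instead of running it.

```python
import shlex


def build_rsync_command(remote_host, remote_path, local_path, dry_run=True):
    """Build an rsync-over-ssh command that mirrors a software tree.

    The host and paths passed in are hypothetical placeholders.
    """
    cmd = [
        "rsync",
        "-a",          # archive mode: preserve permissions, times, symlinks
        "-z",          # compress data in transit
        "--delete",    # remove local files that were removed upstream
        "-e", "ssh",   # use ssh as the transport
    ]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    # Trailing slashes make rsync copy directory contents, not the directory itself.
    cmd.append(f"{remote_host}:{remote_path.rstrip('/')}/")
    cmd.append(f"{local_path.rstrip('/')}/")
    return cmd


command = build_rsync_command("software.example.org", "/export/programs", "/opt/programs")
print(shlex.join(command))
```

Defaulting to a dry run is a deliberate choice for a tree of this size: it lets a lab check what an update would change before any data moves.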

Page 17: SBGrid Science Portal - eScience 2012

Boston Life Sciences Hub

• Biomedical researchers
• Government agencies
• Life sciences
• Universities
• Hospitals

[Map label: Tufts University School of Medicine]

Page 29: SBGrid Science Portal - eScience 2012

Hug a Life Scientist!

✦ Let them know you care ...
  • ... because the software we give them doesn’t
  • ... and neither do the systems we subject them to
  • ... but to be fair, a lot of the pain is self-inflicted

✦ SBGrid came into existence to fill the tech void/pain experienced by structural biologists

✦ Started with providing reliable compiled software

✦ Expanded into
  • training events and workshops
  • best practice guides
  • shared computational infrastructure (clusters! OSG! GlobusOnline!)
  • web-based collaborative computational and data services

Page 30: SBGrid Science Portal - eScience 2012

Objectives

A. Extensible infrastructure to facilitate development and deployment of novel computational workflows

B. Web-accessible environment for collaborative, compute and data intensive science

Page 39: SBGrid Science Portal - eScience 2012

Objectives (explained)

✦ Pareto Principle
  • 80% of the time users are happy with basic web form interface to standard application workflow and canned result analysis
  • 20% of the effort to address these routine cases
  • Science Portals are a big win over cumbersome and complex Fortran code

✦ Corollary to Pareto Principle
  • 20% of the time users want or need customized application workflow and/or result analysis
  • 80% of the effort to make possible
  • But it is rare that anyone knows in advance which side (80 or 20) they are on

Page 50: SBGrid Science Portal - eScience 2012

My Experience and Perspective

✦ The really interesting stuff happens
  • in the unpredictable 20%

✦ Innovative analytical strategies require
  • an ability to rapidly adjust workflow and data analysis

✦ You’re stuffed
  • if workflow and data are tightly coupled to portal framework

✦ Collaboration is critical:
  • you need to be able to share your work (securely)
  • the web is the obvious (only!) way anyone wants to do this

Audience Participation

Page 51: SBGrid Science Portal - eScience 2012

Implementation and Architecture

Page 52: SBGrid Science Portal - eScience 2012

Front End Interface

✦ Django (Python) web framework
✦ Apache web server
✦ Per-user protected jobs and data
✦ WebDAV to data
✦ ssh access possible
✦ Richer access control in development

Page 53: SBGrid Science Portal - eScience 2012

Results Visualization and Analysis

Page 54: SBGrid Science Portal - eScience 2012

NoSQL hierarchical document store

✦ The SBGrid Portal’s leading workflow:
  • 100,000 jobs
  • 300,000 output files
  • 20-100k CPU-hours

✦ Need a good way to store data
  • flexible data format
  • flexible analysis output
  • fine grained, user-driven access control
  • parallel access
  • remote access

✦ high capacity non-relational hierarchical storage
  • ????

Page 56: SBGrid Science Portal - eScience 2012

Operating Systems are Pretty Good

✦ File systems work well
  • organize data carefully (hierarchically)
  • include meta-data (mod_cern_meta, file system)
  • serve intelligently via multiple protocols (http, gridftp)
  • leverage POSIX ownerships (user, group, other, r/w)
  • leverage user, group, and volume quotas
  • storage management and backups are easier
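The file-system points above can be illustrated with a minimal sketch that lays out one job directory per user and study, sets POSIX permissions, and keeps metadata next to the data. The directory scheme, the user and study names, and the JSON sidecar file are all assumptions made for illustration; in particular the sidecar is not the mod_cern_meta format.

```python
import json
import os
import stat
import tempfile


def create_job_dir(root, user, study, job_id, metadata):
    """Create root/user/study/job_id with owner rwx and group r-x permissions."""
    job_dir = os.path.join(root, user, study, job_id)
    os.makedirs(job_dir, exist_ok=True)
    # Owner: rwx, group: r-x, other: none. chmod is explicit because
    # makedirs applies the process umask to the mode it is given.
    os.chmod(job_dir, stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP)
    # Hypothetical sidecar file holding metadata beside the data it describes.
    meta_path = os.path.join(job_dir, "job.meta.json")
    with open(meta_path, "w") as handle:
        json.dump(metadata, handle, indent=2)
    return job_dir


root = tempfile.mkdtemp()
job_dir = create_job_dir(root, "alice", "study-01", "job-0001", {"application": "example"})
mode = stat.S_IMODE(os.stat(job_dir).st_mode)
print(job_dir, oct(mode))
```

Because the layout is ordinary directories and files, the same tree can be exposed over HTTP, GridFTP, or a shell without a translation layer, which is the point the slide is making.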

Page 57: SBGrid Science Portal - eScience 2012

Operating Systems are Pretty Good

✦ Process management works well
  • execute as the actual user, where possible
  • setuid, su, ssh, suexec, and gsexec can all help with this
  • process accounting is your friend! (pacct)
  • leverage ulimit for process resource limits
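As a rough illustration of the ulimit point, the sketch below starts a child process with a capped CPU time using Python's standard `resource` and `subprocess` modules (POSIX only). The 5-second limit and the probe command are arbitrary choices for the example, and running the child as a different user is not shown because it requires privileges.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(argv, cpu_seconds):
    """Run argv in a child process whose CPU time is capped (POSIX only)."""

    def apply_limits():
        # Runs in the child just before exec; comparable to `ulimit -t`.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(
        argv,
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        check=False,
    )


# The child reports the soft CPU limit it actually sees.
probe = [
    sys.executable,
    "-c",
    "import resource; print(resource.getrlimit(resource.RLIMIT_CPU)[0])",
]
result = run_with_cpu_limit(probe, cpu_seconds=5)
print(result.stdout.strip())
```

Setting the limit in the child rather than in the portal process keeps the restriction scoped to the job being launched.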

Page 59: SBGrid Science Portal - eScience 2012

Same data served by web and available from command line

Page 60: SBGrid Science Portal - eScience 2012

Open Science Grid
http://opensciencegrid.org

✦ US National Cyberinfrastructure
✦ Primarily used for high energy physics computing
✦ 80 sites
✦ O(1e5) job slots
✦ O(1e6) core-hours per day
✦ PB scale aggregate storage

5,073,293 hours (~570 years)
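The scale figures above can be sanity-checked with a few lines of arithmetic. The sketch assumes a 365-day year and treats the O(1e5) job slots as exactly 100,000, so the results are order-of-magnitude checks that land in the same range as the slide's rounded numbers.

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years

cpu_hours = 5_073_293
cpu_years = cpu_hours / HOURS_PER_YEAR
print(f"{cpu_hours:,} hours is about {cpu_years:.0f} CPU-years")

# Upper bound on daily throughput if every one of ~1e5 job slots ran all day.
job_slots = 100_000
max_core_hours_per_day = job_slots * 24
print(f"{max_core_hours_per_day:,} core-hours per day at full occupancy")
```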

Page 61: SBGrid Science Portal - eScience 2012

Service Architecture

[Architecture diagram. Labels recovered from the figure, grouped under the diagram's own headings where the grouping is evident.]

SBGrid Science Portal @ Harvard Medical School
• interfaces: Apache, GridSite, Django, Sage Math, R-Studio, shell CLI
• ID mgmt: FreeIPA, LDAP, VOMS, GUMS, GACL
• data: fileserver, SQL DB, scp, GridFTP, SRM, WebDAV
• computation: cluster, Condor, Cycle Server, VDT, Globus
• monitoring: Ganglia, Nagios, RSV, pacct

External services and sites
• DOEGrids CA @ Lawrence Berkeley Labs
• MyProxy @ NCSA, UIUC
• Gratia Acct'ing @ FermiLab
• GlobusOnline @ Argonne
• Monitoring @ Indiana
• UC San Diego
• glideinWMS factory
• Open Science Grid: GUMS, GridFTP + Hadoop, glideinWMS

User: data, computations

Page 68: SBGrid Science Portal - eScience 2012

SBGrid Portal: Current Status

✦ 262 users (lifetime), 72 active in past quarter

✦ 2.4 million hours on OSG last 12 months

✦ Seamless data sharing from web to ssh?
  • requires NFSv4 to allow >12 POSIX groups/user
  • suexec or gsexec possibility

✦ Account integration
  • PAM (ssh/command line) + web through FreeIPA LDAP
  • prototype of X.509 + VOMS + MyProxy (next section!)

✦ Collaboration
  • shared secret (password)
  • manual .htaccess or .gacl

Page 69: SBGrid Science Portal - eScience 2012

Identity Management*

* or “How I learned to stop worrying and love X.509”

Page 73: SBGrid Science Portal - eScience 2012

Big Picture

✦ Federated environment requires
  • federated identity management
  • trusted identity providers (“roots of trust”)

✦ Collaboration requires
  • user-driven capacity to form cross-organization user groups (aka “Virtual Organizations”)
  • roles (or at least privilege levels) within VO

✦ State of Play
  • InCommon will get us part way there (waiting on adoption!)
  • OpenID nice for users, but no trust or delegated perms
  • X.509 process and details still tough for end user
  • SSH keys lack standard root of trust and roles

Page 74: SBGrid Science Portal - eScience 2012

X.509 Digital Certificates

✦ Analogy to a passport:
  • Application form
  • Sponsor’s attestation
  • Consular services
    • verification of application, sponsor, and accompanying identification and eligibility documents
  • Passport issuing office

✦ Portable, digital passport
  • fixed and secure user identifiers
    • name, email, home institution
  • signed by widely trusted issuer
  • time limited
  • ISO standard
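To make the passport analogy concrete, here is a toy model of the properties listed above: fixed identifiers, a trusted issuer, and a limited validity period. It is a sketch only: the `Certificate` class and all field values are hypothetical, and no real X.509 parsing or signature verification is performed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Certificate:
    """Toy stand-in for the identity fields of an X.509 certificate."""
    subject_name: str
    email: str
    institution: str
    issuer: str
    not_before: datetime
    not_after: datetime

    def is_valid_at(self, moment, trusted_issuers):
        """True if the issuer is trusted and moment is inside the validity window."""
        in_window = self.not_before <= moment <= self.not_after
        return in_window and self.issuer in trusted_issuers


issued = datetime(2012, 10, 1, tzinfo=timezone.utc)
cert = Certificate(
    subject_name="Example User",
    email="user@example.org",
    institution="Example University",
    issuer="Example CA",
    not_before=issued,
    not_after=issued + timedelta(days=365),
)
trusted = {"Example CA"}
print(cert.is_valid_at(issued + timedelta(days=30), trusted))   # inside the window
print(cert.is_valid_at(issued + timedelta(days=400), trusted))  # after expiry
```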

Page 75: SBGrid Science Portal - eScience 2012

X.509 Challenges

✦ Lots of “humans in the loop” to get usable cert
  • Registration Agent, Sponsor, VO Manager, User

✦ Awkward working with X.509 certs
  • multiple formats
  • proxy certs and VOMS ACs
  • proxy servers (MyProxy)
  • expiry (of proxy, of base cert, of VO membership)
  • browser integration and import process
  • CA cert chain
  • digital token needs to be available on all devices
    • particularly challenging for phones and tablets
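The expiry bullet above hides a practical wrinkle: a credential stops being usable as soon as the earliest of its several lifetimes ends. A minimal sketch of that rule follows; the function name and the example dates are assumptions chosen for illustration.

```python
from datetime import datetime, timezone


def effective_expiry(proxy_expiry, base_cert_expiry, vo_membership_expiry):
    """Return (name, time) of whichever credential expires first."""
    expiries = {
        "proxy": proxy_expiry,
        "base certificate": base_cert_expiry,
        "VO membership": vo_membership_expiry,
    }
    name = min(expiries, key=expiries.get)
    return name, expiries[name]


utc = timezone.utc
name, when = effective_expiry(
    proxy_expiry=datetime(2012, 10, 9, 12, 0, tzinfo=utc),
    base_cert_expiry=datetime(2013, 10, 1, tzinfo=utc),
    vo_membership_expiry=datetime(2013, 3, 1, tzinfo=utc),
)
print(name, when.isoformat())
```

Reporting which credential is the limiting one, and not only the time, is what lets a portal tell the user what actually needs renewing.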

Page 81: SBGrid Science Portal - eScience 2012

X.509 Nirvana (ours at least)

✦ User never sees X.509 anything
  • unless they want to

✦ X.509 request + VO membership + account creation completed in one step by one person
  • single step for user
  • single step for one administrator

✦ Goodbye passphrases (and forgotten passphrases)
  • hold private key in LDAP and use LDAP authentication to access

✦ Automate everything
  • login (web or command line) triggers X.509 proxy request with (default) VOMS AC, and loading to MyProxy server

✦ VO Management System run by users
  • Users need to be able to self-manage their (sub-) VOs
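The "automate everything" step above can be pictured as a hook that runs after a successful login. The sketch below shows only the order of operations: plain dictionaries stand in for LDAP and the MyProxy server, the `Proxy` class and `on_login_success` function are hypothetical, and no real cryptography or grid middleware is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Proxy:
    """Toy proxy credential; a real one would be an X.509 proxy certificate."""
    owner: str
    key_id: str
    attributes: list = field(default_factory=list)


def on_login_success(username, key_store, proxy_store, default_vo):
    """Run the credential steps that a successful login would trigger.

    key_store and proxy_store are dicts standing in for LDAP and MyProxy.
    """
    # 1. The authenticated session unlocks the key held on the user's behalf.
    key_id = key_store[username]
    # 2. Derive a short-lived proxy from that key (placeholder, no crypto).
    proxy = Proxy(owner=username, key_id=key_id)
    # 3. Attach the default VO attribute, standing in for a VOMS AC.
    proxy.attributes.append(f"vo:{default_vo}")
    # 4. Store the proxy where grid services can later retrieve it.
    proxy_store[username] = proxy
    return proxy


key_store = {"alice": "key-id-0001"}  # stand-in for keys held in LDAP
proxy_store = {}                      # stand-in for a MyProxy server
proxy = on_login_success("alice", key_store, proxy_store, default_vo="example-vo")
print(proxy)
```

The user-visible action is just the login; every certificate-related step happens behind it, which is the behaviour the slide describes as the goal.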

Page 87: SBGrid Science Portal - eScience 2012

Addressing Certificate Problems

[Sequence diagram. Actors: RA Agent, CA, Sponsor, User. Stage labels: U1, U2a, S1, R1, R2. Axis: time.]

Messages shown in the diagram:
• request signed cert
• return tracking number
• notify agents
• review request
• verify user eligibility
• confirm eligibility
• approve cert
• notify availability
• generate cert key pair
• sign cert
• retrieve cert
• export signed cert key pair

Timeline:
• T0 = late Saturday night lab session
• T+40h = mid-Monday response
• T+60h = early-Tuesday response
• T+66h = late-Tuesday response
• T+70h = late-Tuesday STAGE 1
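The offsets in the timeline above can be turned into wall-clock times with a short calculation. The starting point of 23:00 on Saturday 6 October 2012 is an assumption picked to match "late Saturday night"; the slides give no exact time.

```python
from datetime import datetime, timedelta

# Assumed starting point: 23:00 on Saturday 6 October 2012 (hypothetical).
t0 = datetime(2012, 10, 6, 23, 0)

# Offsets in hours and the labels used on the slide.
milestones = {
    0: "lab session",
    40: "response",
    60: "response",
    66: "response",
    70: "STAGE 1",
}

timeline = {}
for offset_hours, label in milestones.items():
    moment = t0 + timedelta(hours=offset_hours)
    timeline[offset_hours] = moment.strftime("%A %H:%M")
    print(f"T+{offset_hours:>3}h  {timeline[offset_hours]}  {label}")
```

Under that assumption the first stage alone spans from Saturday night to Tuesday evening, almost all of it waiting on people and very little on computation.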

Page 91: SBGrid Science Portal - eScience 2012

VO (Group) Membership Registration

[Sequence diagram. Actors: VO Admin, VOMS, Sponsor, User. Stage labels: U2b, S2, V1, V2. Axis: time.]

Messages shown in the diagram:
• notify admin
• confirm eligibility
• approve membership, groups, and roles
• notify
• verify user eligibility
• present cert to request membership by DN
• add DN to VOMS
• request VOMS AC
• return VOMS AC
• add AC to proxy cert
• request VO groups and roles

Timeline:
• T+82h = mid-Wednesday ask “What next?”
• T+95h = early-Thursday response (time zone!)
• T+100h = early-Thursday response

Page 92: SBGrid Science Portal - eScience 2012

!"#$%&'(#!")*# *+,(-,.# /-0.#

(,123#4%&'(#

5,(6.&#07'8'9'7':3#

4++.,;0#&0&90.-<'+=#8.,>+-=#4(%#.,70-#

(,123#

;0.'23#>-0.#07'8'9'7':3#

+.0-0(:#50.:#:,#.0?>0-:#&0&90.-<'+#93#@A#

4%%#@A#:,#!")*#

.0?>0-:#!")*#$B#

.0:>.(#!")*#$B# 4%%#$B#:,#+.,C3#50.:#

.0?>0-:#!"#8.,>+-#4(%#.,70-# U2b

S2

V1

V2

time

VO  (Group)  Membership  Registration

T+82h  =  mid-­‐Wednesdayask  “What  next?”

T+95h  =  early-­‐Thursdayresponse  (time  zone!)

T+100h  =  early-­‐Thursdayresponse

T+105h  =  mid-­‐Thursdayresponse

Page 93: SBGrid Science Portal - eScience 2012

VO (Group) Membership Registration

Actors: VO Admin, VOMS, Sponsor, User
Steps: U2b, S2, V1, V2

• notify admin
• confirm eligibility
• approve membership, groups, and roles
• notify
• verify user eligibility
• present cert to request membership by DN
• add DN to VOMS
• request VOMS AC
• return VOMS AC
• add AC to proxy cert
• request VO groups and roles

time

T+82h = mid-Wednesday ask “What next?”
T+95h = early-Thursday response (time zone!)
T+100h = early-Thursday response
T+105h = mid-Thursday response
T+105h = 4.5 days waiting

Page 97: SBGrid Science Portal - eScience 2012

Actors: Portal, CA, Sponsor, User, VOMS, LDAP, VO Admin, MyProxy
Steps: U1, U2*, S1*, A1a, A1b

• verify eligibility
• confirm eligibility
• generate cert key pair
• set retrieval serial number
• request portal account
• verification email sent
• email verified
• notify agents
• create local acct
• sign cert
• User may now access federated resources
• set VO rights
• notify availability
• portal login
• request signed cert
• return tracking number
• approve cert
• account ready notification
• request signed certificate
• return signed certificate
  1 - pair signed cert into PKCS#12 file
  2 - create local proxy cert
• register proxy cert with MyProxy
• User may now access web portal

time

T0 = late Saturday night lab session
T+40h = mid-Monday response
T+40h = 1.7 day wait
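The waiting times on these timelines are elapsed hours divided by 24. A minimal sketch of that arithmetic, using only the T+ offsets shown on the slides; the function and variable names are illustrative and not from the talk:

```python
def hours_to_days(hours: float) -> float:
    """Convert an elapsed time in hours to days."""
    return hours / 24.0


# Offsets (hours after T0) taken from the slide timelines.
cert_and_vo_wait_h = 105  # certificate request followed by VO registration
portal_wait_h = 40        # portal-mediated account and certificate request

cert_and_vo_days = hours_to_days(cert_and_vo_wait_h)
portal_days = hours_to_days(portal_wait_h)
ratio = cert_and_vo_wait_h / portal_wait_h

print(f"certificate + VO registration: {cert_and_vo_days:.1f} days")
print(f"portal: {portal_days:.1f} days")
print(f"ratio: {ratio:.1f}x")
```

By this calculation 105 h is about 4.4 days, which the slide rounds to 4.5, and 40 h is about 1.7 days.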

Page 99: SBGrid Science Portal - eScience 2012

Data Tiers - Scoping

• VO-wide: all sites, admin managed, very stable
• User archive: single site, user managed, very stable, 10+ GB
• User project: all sites, user managed, 1-10 weeks, 1-3 GB
• User static: all sites, user managed, indefinite, 10 MB
• Job set: all sites, infrastructure managed, 1-10 days, 0.1-1 GB
• Job: direct to worker node, infrastructure managed, 1 day, <10 MB
• Job indirect: to worker node via UCSD, infrastructure managed, 1 day, <10 GB
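The tiers above can be read as a small lookup table. The sketch below is one hypothetical way to encode them and ask which tiers could hold a dataset of a given size. The class, field, and function names are assumptions made for illustration; sizes use 1 GB = 1000 MB, upper bounds are treated as inclusive, and tiers for which the slide gives no upper bound (VO-wide, User archive) are left unbounded.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataTier:
    name: str
    scope: str
    managed_by: str
    lifetime: str
    max_mb: Optional[float]  # None where the slide gives no upper bound


# Values transcribed from the slide; 1 GB = 1000 MB is an assumption.
TIERS = [
    DataTier("VO-wide", "all sites", "admin", "very stable", None),
    DataTier("User archive", "single site", "user", "very stable", None),
    DataTier("User project", "all sites", "user", "1-10 weeks", 3000),
    DataTier("User static", "all sites", "user", "indefinite", 10),
    DataTier("Job set", "all sites", "infrastructure", "1-10 days", 1000),
    DataTier("Job", "direct to worker node", "infrastructure", "1 day", 10),
    DataTier("Job indirect", "worker node via UCSD", "infrastructure", "1 day", 10000),
]


def candidate_tiers(size_mb: float) -> List[str]:
    """Names of tiers whose upper bound (treated as inclusive) admits size_mb."""
    return [t.name for t in TIERS if t.max_mb is None or size_mb <= t.max_mb]


# A 5 GB dataset is too large for the project, static, job-set, and job tiers.
print(candidate_tiers(5000))  # ['VO-wide', 'User archive', 'Job indirect']
```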

Page 100: SBGrid Science Portal - eScience 2012

About 2 PB with 40 front-end servers for high-bandwidth parallel file transfer

Data Movement

• scp (users)
• rsync (VO-wide)
• grid-ftp (UCSD)
• curl (WNs)
• cp (NFS)
• htcp (secure web)
• http(s) (web)

Page 101: SBGrid Science Portal - eScience 2012

Globus Online: High Performance Reliable 3rd Party File Transfer

Diagram endpoints: portal, cluster, desktop, laptop, lab fileserver, data collection facility

• GUMS: DN to user mapping
• VOMS: VO membership
• Certificate Authority: root of trust
• Globus Online: file transfer service

Page 106: SBGrid Science Portal - eScience 2012

Diagram: SBGrid Science Portal, laptop, desktop, lab fileserver, facility fileserver

Ryan, a postdoc in the Frank Lab at Columbia

Access NRAMM facilities securely and transfer data back to home institute

/data/columbia/frank
/nfs/data/rsmith
/Users/Ryan

Ryan applies for an account at the SBGrid Science Portal

• automated X.509 application
• automated Globus Online application/X.509 linking (wish list!)
• verification of lab membership

Page 110: SBGrid Science Portal - eScience 2012

Diagram: SBGrid Science Portal, laptop, desktop, lab fileserver, facility fileserver

Ryan, a postdoc in the Frank Lab at Columbia

Access NRAMM facilities securely and transfer data back to home institute

/data/columbia/frank
/nfs/data/rsmith
/Users/Ryan

• request access to NRAMM facility using credential held by SBGrid
• check SBGrid for Ryan’s group membership
• in Frank Lab, so grant access to files

Page 115: SBGrid Science Portal - eScience 2012

Diagram: SBGrid Science Portal, laptop, desktop, lab fileserver, facility fileserver

Ryan, a postdoc in the Frank Lab at Columbia

Access NRAMM facilities securely and transfer data back to home institute

/data/columbia/frank
/nfs/data/rsmith
/Users/Ryan

Use Globus Online to manage transfer from NRAMM back to lab

• initiate transfer at NRAMM
• transfer data to lab
• notify user of completion

Page 116: SBGrid Science Portal - eScience 2012

Diagram labels: Tier 1, Tier 2, Tier 3

• data collection
• 6 month staging storage
• 10 TB per group permanent archive
• /stage/sliz
• /stage/murphy
• /data/murphy
• /data/deacon
• Sliz lab
• ~andy
• ~sue
• ~scott
• NE-CAT beamline at APS
• SBGrid Science Portal
• Globus Online
• Harvard
• "Scott" from the Sliz lab
• VOMS
• /data/sliz
• /embarg/2010/
• /embarg/2011/
• /public/2009/
• 10 PB public archive
• general public
• WWW

Local accounts within lab infrastructure

Shared (lab level) accounts at facility

User can directly access lab or facility data from laptop

Public access available to archived data through web interface

Embargo policy to hold deposited data for agreed time

Tiered storage

VO management

Page 121: SBGrid Science Portal - eScience 2012

Data at Shared Scientific Facilities

✦ SBGrid
• manages all user account creation and credential management
• hosts MyProxy, VOMS, GridFTP, and user interfaces

✦ Facility
• knows about lab groups
  • e.g. “Harrison”, “Sliz”
• delegates knowledge of group membership to SBGrid VOMS
  • facility can poll VOMS for list of current members
• uses X.509 for user identification
• deploys GridFTP server

✦ Lab group
• designates group manager that adds/removes individuals
• deploys GridFTP server or Globus Connect client

✦ Individual
• username/password to access facility and lab storage
• Globus Connect for personal GridFTP server to laptop
• Globus Online web interface to “drive” transfers
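The split of responsibilities above (the facility identifies users by X.509 DN and delegates group membership to the SBGrid VOMS, which it can poll) can be sketched as follows. This is an illustration under stated assumptions, not the SBGrid implementation: `fetch_members` is a hypothetical stand-in for a VOMS query, the `Facility` class is invented for the example, and the DNs are made up.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical stand-in for the membership lists held by the SBGrid VOMS.
# A real facility would poll the VOMS server; these DNs are invented examples.
EXAMPLE_MEMBERSHIP: Dict[str, Set[str]] = {
    "Harrison": {"/DC=org/DC=example/CN=Alice Example"},
    "Sliz": {"/DC=org/DC=example/CN=Scott Example"},
}


def fetch_members(group: str) -> Set[str]:
    """Return the current member DNs for a lab group (hard-coded here)."""
    return set(EXAMPLE_MEMBERSHIP.get(group, set()))


class Facility:
    """Knows lab groups by name only; membership is polled, not stored locally."""

    def __init__(self, groups: Iterable[str], poll: Callable[[str], Set[str]]):
        self.groups = list(groups)
        self.poll = poll
        self.members: Dict[str, Set[str]] = {}

    def refresh(self) -> None:
        # Poll the membership service for each known lab group.
        self.members = {g: self.poll(g) for g in self.groups}

    def may_access(self, user_dn: str, group: str) -> bool:
        # The user is identified by certificate DN; access follows membership.
        return user_dn in self.members.get(group, set())


facility = Facility(["Harrison", "Sliz"], fetch_members)
facility.refresh()
scott = "/DC=org/DC=example/CN=Scott Example"
print(facility.may_access(scott, "Sliz"))      # True
print(facility.may_access(scott, "Harrison"))  # False
```

Because the facility only caches what it polls, a group manager adding or removing a member takes effect at the facility on the next refresh, without any facility-side account administration.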

Page 122: SBGrid Science Portal - eScience 2012

Summary

✦ Don’t discount unpredictable 20%
• need flexibility to innovate and explore (data and comp)

✦ “Last mile” challenge
• to the desktop
• to the laptop

✦ Unified and simplified identity management
• centralized set of credentials for each person
• tight links to CA/X.509, LDAP, MyProxy and VOMS

✦ Empower collaborations to self-manage

✦ Shift of focus from “compute” to “data”
• for users
• for facilities where data is the main challenge

Page 123: SBGrid Science Portal - eScience 2012

Q&A and Acknowledgements

✦ Piotr Sliz
• Supervisor and PI at Harvard Medical School
• Chair of SBGrid Consortium

✦ SBGrid Science Portal
• Daniel O’Donovan, Meghan Porter-Mahoney, Mick Timoney

✦ SBGrid System Administrators
• Ian Levesque, Peter Doherty, Steve Jahl

✦ Facility Collaborators
• Frank Murphy (NE-CAT/APS)
• Ashley Deacon (JCSG/SLAC)

✦ Globus Online Team
• Steve Tuecke, Ian Foster, Rachana Ananthakrishnan, Raj Kettimuthu

✦ OSG Collaborators
• Ruth Pordes, Director of OSG, for championing SBGrid
• Terrence Martin, for UCSD HDFS support
• Steve Timm and Keith Chadwick (FNAL) for helping resolve OSG problems