Widening the number of e-Infrastructure users with Science Gateways and Identity Federations Giuseppe Andronico INFN -

Jan 18, 2018

Shauna Lee

Transcript

Widening the number of e-Infrastructure users with Science Gateways and Identity Federations
Giuseppe Andronico, INFN - Italy
Workshop on Science Applications And Infrastructure In Clouds And Grids
Oxford, 15-16 March 2012

Outline
- Introduction and driving considerations
- The Science Gateway paradigm:
  - Architecture
  - Authentication and Authorisation Schema
  - Access workflow
  - Grid transaction model
  - The Authentication process
- Use cases and statistics
- The forthcoming Cloud Engine
- Summary and conclusions

Path to technology uptake
The Rogers "bell-shape" curve: Rogers, E. M. (1962), "Diffusion of Innovations", Glencoe: Free Press.

IT acceptance model: the Web
Davis, F. D. (1989), "Perceived usefulness, perceived ease of use, and user acceptance of information technology", MIS Quarterly 13(3): 319-340
Diagram labels: Development of web browsers; The World Wide Web

The evolution leap in web browsers

The eResearch2020 report (http://www.eresearch2020.eu/eResearch%20Brochure%20EN.pdf)
Some barriers in the adoption of Grids:
- Changes on Grids mean changes on applications
- Time required to adapt usual workflows
- Lack of structure to support anonymous access
- Download and installation of applications
- Interface
- Slow to get to compared to other resources
- Difficult to use in the beginning
- Time spent to get the application compiled and running

Using Grids is not straightforward
Users have to cope with complex security procedures, execution scripts, job description languages, command line based interfaces and lack of standards. This makes the learning curve very steep and keeps non-IT experts away.

Another consideration
Diagram labels: VRCs; # of users
There is a huge number of non-IT experts out there who do not belong to any constituted Virtual Research Community. How can we attract them?

I have a dream
Can we increase the number of potential grid users by a factor of 1,000? Or even by a factor of 25,000 and more?

A new paradigm: the Science Gateway
"A Science Gateway is a community-developed set of tools, applications, and data that is integrated via a portal or a suite of applications, usually in a graphical user interface, that is further customized to meet the needs of a specific community." (TeraGrid)

IT acceptance model: the Grid
Davis, F. D. (1989), "Perceived usefulness, perceived ease of use, and user acceptance of information technology", MIS Quarterly 13(3): 319-340
Diagram labels: Development of Science Gateway; Requirement for sustainability

Primary requirement: building Science Gateways should be like playing with Lego bricks
Diagram labels: Sc. Gtwy A; Sc. Gtwy B; Sc. Gtwy C; Sc. Gtwy D; Sc. Gtwy E
- Standards
- Simplicity
- Ease of use
- Re-usability

Our reference model
- Science Gateway
- Embedded Applications: App. 1, App. 2, ... App. N
- Users from different organisations having different roles and privileges: Administrator, Power User, Basic User
- Standard-based (SAGA) middleware-independent Grid Engine

AuthN & AuthZ Schema
Diagram labels: Authorisation; Authentication; Science Gateway; GrIDP (catch-all); IDPCT (catch-all); IDP_y; LDAP; Social Networks Bridge IdP; Register to a Service; 2. Sign in

The Grid IDentity Pool (GrIDP) (http://gridp.ct.infn.it)
This is a catch-all Identity Federation

eduGAIN (www.edugain.org)
All the Science Gateways are registered as Service Providers of eduGAIN

Grid Engine
Diagram labels: Users Tracking DB; Science GW Interface; SAGA/JSAGA API; Job Engine; Data Engine; Users Track & Monit.;
Science GW 1; Science GW 2; Science GW 3; Grid MWs; Liferay Portlets; eToken Server; Catania Grid Engine. Status labels: DONE; By end of April; By mid April; DONE

Job Engine - Architecture
Diagram labels: Job Queue; WT (Worker Threads for Job Submission); WT (Worker Threads for Status Checking); Job Submission; Job Check Status / Get Output; USER TRACKING DB; MONITORING MODULE; GRID INFRASTRUCTURE(S)

Job Engine - Features
The Job Engine has been designed with the following features in mind:

Feature | Description | Status
Middleware Independent | Capacity to submit jobs to resources running different middleware | DONE
Easiness | Create code to run applications on the grid in a very short time | DONE
Scalability | Manage a huge number of parallel job submissions, fully exploiting the HW of the machine where the Job Engine is installed | DONE
Performance | Have a good response time | DONE
Accounting & Auditing | Register every grid operation performed by the users | DONE
Fault Tolerance | Hide middleware failures from end users | ALMOST DONE
Workflow | Provide a way to easily create and run workflows | IN PROGRESS

Job Engine Scalability
- 40,000 jobs submitted in parallel!
- Submission time scales linearly with the number of jobs
- More than 10,000 jobs an hour
Chart labels: Time to submit 10,000 jobs (h); Job submission time (h)

Job Engine Middleware interoperability
- Both sequential and MPI-enabled jobs successfully executed
- Tests with Globus planned

Job Engine Accounting & Auditing
A powerful accounting & auditing system is included in the Job Engine. It is fully compliant with the EGI VO Portal Policy and the EGI Grid Security Traceability and Logging Policy.
The following values are stored in the DB for each job submitted:
- User ID
- Job Submission timestamp
- Job Done timestamp
- Application name
- Job ID
- Robot certificate ID
- VO name
- Execution site (name, latitude, longitude)

Catania Science Gateways in numbers
Chart label: overall usage (arb.
units)

Data Engine Requirements
- A file browser shows Grid files in a tree
- The file system exposed by the Science Gateway is virtual
- Easy transfers from/to the Grid (through the SG at the moment) are done in a few clicks
- Users do not need to care about how and where their files are really located

Data Engine Usage Workflow
Components: eTokenServer; User Track. DB; DOGS DB
Steps shown in the diagram:
- Sign in
- 2. Upload request
- 3. Proxy request
- 4. Proxy transfer
- 5. File Upload
- 6. Update DB
- 7. Upload on Grid
- 7. Tracking

DOGS: Data On Grid Services
Back-end implementation
- JSAGA API used to transfer data from/to storage elements
- Hibernate to manage the VFS collecting information on files stored on the Grid; any changes/actions in the user view affect the VFS
- MySQL as underlying RDBMS
- An additional component has been developed in order to keep track of each transaction in the users tracking DB

DOGS: Data On Grid Services
Front-end implementation
- A portlet has been created, with access provided only to federated users with given roles and privileges
- The portlet view component includes elFinder, a web-based file manager developed in JavaScript using jQuery UI for a dynamic and user-friendly interface

Data Engine in action (1/2)

Data Engine in action (2/2)
Share to be added soon

Summary of standards adopted
The framework for Science Gateways developed at Catania is fully web-based and adopts official worldwide standards and protocols, through their most common implementations. These are:
- The JSR 168 and JSR 286 standards (also known as "portlet 1.0" and "portlet 2.0" standards)
- The OASIS Security Assertion Markup Language (SAML) standard and its Shibboleth and SimpleSAMLphp implementations
- The Lightweight Directory Access Protocol, and its OpenLDAP implementation
- The Cryptographic Token Interface Standard (PKCS#11) and its Cryptoki implementation
- The Open Grid Forum (OGF) Simple API
  for Grid Applications (SAGA) standard and its JSAGA implementation

Science Gateways in action: e-Culture Science INDICATE
(INDICATE Review, Roberto Barbera, Lyon, 20/09/)
- Use the HTTPS interface of Storage Elements
- Important for large-size files
Thanks to the collaboration with

Science Gateways in action: EUMEDGRID

Science Gateways in action: GISELA

Science Gateways in action: DECIDE

The CHAIN Application Database (www.chain-project.eu/applications)
Project-specific Science Gateways can be accessed from the CHAIN AppDB

The forthcoming Cloud Engine
Diagram labels: Cloud Engine; Users Tracking DB; OCCI API; Users Track & Monit.; Cloud App 1; Cloud App 2; Cloud App N; Cloud MW; Cloud Gateway

Virtual execution environment: CLEVER
- Host Management Layer: the Host Manager (HM) monitors physical resources and allocates VEs
- Cluster Management Layer: the Cluster Manager (CM) monitors the overall state of the cluster and coordinates the HMs
- External components: XMPP Server and Distributed Database
- XMPP advantages: host presence, open standard
- No central point of failure: fault tolerance mechanism with multiple CM instances
Diagram label: AWS

Summary and conclusions
- e-Infrastructures can be very beneficial platforms (especially for cultural heritage), provided they are really easy to use
- Science Gateways with support for Identity Federations and Social Networks can revolutionize the way Grid infrastructures are used, hugely widening their potential user base, especially among non-IT experts and citizen scientists
- The adoption of standards (JSR 286, SAGA, SAML, etc.)
  represents a concrete investment towards sustainability
- By design, the components of our Science Gateways (the portlets, our Lego bricks) have maximum re-usability and, indeed, they have already been adopted in/by several projects (CHAIN, DECIDE, EarthServer, EUMEDGRID-Support, GISELA, INDICATE, etc.)
- If you want to integrate your applications in our Science Gateways, or simply enable your websites with our authentication tools, please contact me at

Thank you
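
Illustrative sketch of the Job Engine pattern
The Job Engine slides describe a job queue drained by worker threads, with every operation recorded in the users tracking DB. The Python sketch below is one way to picture that pattern. It is not the Catania Grid Engine code, which the slides describe as built on the SAGA/JSAGA API. The names `JobRecord`, `fake_submit` and `run_job_engine` are hypothetical, the middleware call is a stub, and the in-memory list only stands in for the tracking DB. The record fields follow the values listed on the Accounting & Auditing slide.

```python
import queue
import threading
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class JobRecord:
    """One accounting row; fields follow the Accounting & Auditing slide."""
    user_id: str
    application: str
    vo_name: str
    robot_cert_id: str
    job_id: Optional[str] = None
    submitted_at: Optional[float] = None
    done_at: Optional[float] = None   # left unset: this sketch covers submission only
    site_name: Optional[str] = None   # left unset: no real execution site here


def fake_submit(record: JobRecord) -> str:
    """Hypothetical stand-in for a real middleware submission call."""
    return f"job-{record.user_id}-{record.application}"


def run_job_engine(
    jobs: List[JobRecord],
    submit: Callable[[JobRecord], str] = fake_submit,
    workers: int = 4,
) -> List[JobRecord]:
    """Drain a job queue with worker threads and log each submission."""
    pending: "queue.Queue[JobRecord]" = queue.Queue()
    for job in jobs:
        pending.put(job)

    tracking_db: List[JobRecord] = []   # stand-in for the users tracking DB
    db_lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                record = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, worker thread exits
            record.submitted_at = time.time()
            record.job_id = submit(record)
            with db_lock:
                tracking_db.append(record)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return tracking_db


if __name__ == "__main__":
    demo_jobs = [
        JobRecord(user_id=f"user{i}", application="demo-app",
                  vo_name="demo-vo", robot_cert_id="robot-1")
        for i in range(100)
    ]
    print(len(run_job_engine(demo_jobs)), "jobs recorded")
```

Separating the queue from the worker pool is what lets the number of submission threads be tuned to the host machine, which is the property the Scalability feature refers to. A real engine would also need the status-checking workers and failure handling shown on the architecture slide.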