Collaborative eScience: Evolving Approaches
Charles Severance
Rutgers CyberInfrastructure Meeting, April 4, 2006
www.dr-chuck.com
May 10, 2015
2. Outline
- A look back at the past 15 years
- Putting the "collab" in Collaborative eScience
- The current tools of Collaborative eScience
  - Collaboration
  - Portals
  - Repository
- Reflecting on 15 years of experience
  - What is wrong with Middleware?
  - Authorization and Authentication - Are we there yet?
- A future eScience case study
3. The Founding Concepts
- Scientific Domain
- Groups of People
- Common User Interface
- Data Sharing
  - In the moment
  - Long-term
- Experimental Equipment
- Compute
- Visualization
4. Over 15 Years of Collaborative eScience
[Timeline figure, 1991 - 2007: UARC/SPARC, Globus Toolkit, WorkTools, CHEF, Sakai, NEESGrid, NEESIT, OGCE Grid Portal, SciGate, ?]
5. What was SPARC? Before UARC..
6. What was SPARC? UARC/SPARC
7. SPARC 2/2001
- 600 users
- 800 data sources
8. SPARC Software
- Written from scratch
- No Middleware
- No Portal Technology
- Three rewrites over 10 years
  - NextStep
  - Java Applets with server support
  - Browser based - kind of like a portal
- At the end, in 2001 - it was ready for another rewrite
9. Keys to SPARC Success
- Ten years of solid funding
- Team consistency
- Long enough to learn from mistakes
- Long-term relationship between IT folks and scientists - evolved over time - relationship was grey
- Software rewritten several times over life of project based on evolving user needs and experience with each version of the program
- Portion of effort was invested in evaluation of usability - feedback to developers
10. After SPARC: Now What?
- Getting people together is an important part of collaborative eScience
  - WorkTools - Based on Lotus Notes
  - CHEF - Collaborative framework - Based on Java and Jetspeed
  - Sakai - Collaboration and Learning Environment - Java
- Critical point: Collaborative software is only one component of eScience
- UM Focus: Building reusable user interface technologies for the people part of collaborative eScience
11. WorkTools
- WorkTools - The organic single-server approach - if you build it (and give away free accounts), they will come
- Over 9000 users (2000 active) at the end of 2003
12. 
CompreHensive CollaborativE Framework (CHEF)
- Fall 2001: CHEF development begins
- Generalized extensible framework for building collaboratories
- Funded internally at UM
- All Java - Open Source
  - Jakarta Jetspeed Portal
  - Jakarta Tomcat Servlet Container
  - Jakarta Turbine Service Container
- Build community of developers through workshops and outreach
13. CHEF Applications
- CourseTools Next Generation
- WorkTools Next Generation
- NEESGrid
- NSF National Middleware Grid Portal
14. NEESGrid - The Equipment Network for Earthquake Engineering Simulation
- NSF funded
- NCSA, ANL, USC/ISI, UM, USC, Berkeley, MSU
15. CHEF-Based NEESGrid Software
16. Overall Data Modeling Efforts
[Diagram: NEES site specifications database (Site A, Site B, Site C; equipment, people); project description (experiments, trials); domain-specific specimen models (tsunami, shake table, geotech, centrifuge); common elements (units, sensors, descriptions); data / observations]
17. Capturing Video and Data
[Diagram: PTZ/USB still camera control, capture gateway, BT848 video frames, DAQ data capture, DT clients (video, data, simulation), DT main system, simulation coordinator, Site A, Site B]
18. Data Monitoring Tools
[Diagram: still image / camera control, camera control gateway, DT main system, thumbnail, Creare viewers]
19. What's in a name?
Sakai is named after Hiroyuki Sakai of the Food Channel television program Iron Chef. Hiroyuki is renowned for his fusion of French and Japanese cuisine.
20. Sakai General Collaborative Tools
- Announcements
- Assignments
- Chat Room
- Threaded Discussion
- Drop Box
- Email Archive
- Message Of The Day
- News/RSS
- Preferences
- Resources
- Schedule
- Web Content
- Worksite Setup
- WebDAV
21. Requirements Overlap
[Diagram: overlapping requirements of Teaching and Learning, Physics Research Collaboration, and Earthquake Research Collaboration. Labels: Grid Computing, Quizzes, Visualization, Grading Tools, Syllabus, SCORM, Data Repository, Chat, Discussion, Resources, Large Data Libraries]
22. Sakai: Product Placement
[Diagram: Teaching and Learning; Collaboration and eResearch]
23. 
Additional General Collaboration Tools Under Development
- Wiki based on Radeox
- Blog
- Shared Display
- Shared Whiteboard
- Multicast Audio
- Multicast Video
These are works-in-progress by members of the Sakai eResearch community. There are no dates for release.
24. NMI / OGCE
- www.ogce.org
- NSF National Middleware Initiative
- Indiana, UTexas, ANL, UM, NCSA
25. Chalk Talk: School of Portals (2004)
[Diagram: NEES 1.1, NEES 3.0, CHEF, OGCE 1.1, OGCE 1.2, ?, OGCE 2, Jetspeed, Sakai, GridPort 2, GridPort 3, GridPort Alliance, uPortal, GridSphere, XCAT; phases: Competition, Collaboration, Convergence]
26. Chalk Talk: School of eScience Portals (2006)
[Diagram: NEES 1.1, GridSphere, CHEF, OGCE 1.1, OGCE 2, Jetspeed, SciGate, ?, Sakai, GridPort 2, GridPort 3, SciDoc, uPortal, GridPort Alliance, XCAT; phases: Competition, Collaboration, Convergence]
27. Integration and Applications
[Architecture diagram: Atlas, ITER, and CMS applications; administration and users; Atlas Portal and ITER Portal configurations; portal, experiment, process, and desktop gateways; gateway technologies including Sakai, SRB, Opal, Clarens, Globus, Knowledge Store, metadata, control, security, and identity components; SciGate over petascale production compute and data resources (BlueGene, ORNL)]
28. The Ecology of Collaborative eScience
29. Scope of Collaborative E-Science
- ..composing and orchestrating many technologies
- ..interoperability is key
[Diagram: Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
30. User Interface for Collaborative E-Science
Portals are an excellent technology for building a federated user interface across these disparate components using standards like JSR-168.
[Diagram: Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
31. Portals may only be an intermediate step in the process..
[Diagram: Desktop Applications, Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
32. 
Focus of Sakai Activity in eScience
Sakai is focused primarily on integration with portals and working closely with data repositories.
[Diagram: Portal Technology (annotated "Discuss First"), Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
33. Collaboration vs. Portal
Collaboration:
- Basic organization is about the shape of the people and groups
- Customization based on the group leaders
- New groups form quickly and organically
- Doing one thing at a time - chat, upload - perhaps multiple active windows on a desktop
- Very interactive
- Think of navigation as picking a tool or switching from one class to another
- Think Application
Portal:
- Basic organization is about the thing it represents - Teragrid, NVO
- Site customization is based on the resource owners
- Sometimes there is an individual customization aspect
- Many small rectangles to provide a great deal of information on a single screen
- Portals think of rectangles operating independently - like windows
- Think Dashboard
34. Sakai Portlet Version 0.2
Features:
- Tree View
- Gallery View
- Proxy portlets
- Source in SVN
- Configurable via properties file
Tools:
- Announcements (sakai.announcements)
- Assignments (sakai.assignment)
- Chat Room (sakai.chat)
- Discussion (sakai.discussion)
- Gradebook (sakai.gradebook.tool)
- Email Archive (sakai.mailbox)
- Membership (sakai.membership)
- Message Forums (sakai.messageforums)
- Preferences Tool (sakai.preferences)
- Presentation (sakai.presentation)
- Profile (sakai.profile)
- Resources (sakai.resources)
- Wiki (sakai.rwiki)
- Tests & Quizzes (sakai.samigo)
- Roster (sakai.site.roster)
- Schedule (sakai.schedule)
- Site Info (sakai.siteinfo)
- Syllabus (sakai.syllabus)
35. Sakai JSR-168 Portlet
- Web Services are used to log in to Sakai, establish a session, and retrieve a list of Sakai Sites, Pages, and Tools
- The portlet is 100% stock JSR-168
- Works in Pluto, uPortal, and GridSphere
36. Three Variations
- Display the Sakai gallery - all of Sakai except for the login and branding.
- Retrieve the hierarchy of sites, pages, and tools, display it in a tree view within the portlet, and show selected tools/pages in an iframe within the portlet
- Proxy tool placement for a particular Sakai tool such as sakai.preferences
37. Sakai Gallery View
38. How Gallery Works
[Diagram: Sakai Portlet running in uPortal, Pluto, or GridSphere; Login via Sakai Web Svcs; /portal/gallery served by the Charon portal in Sakai]
39. Sakai Tree View
40. How Tree View Works
[Diagram: Sakai Portlet running in uPortal, Pluto, or GridSphere; Login and ToolList via Sakai Web Svcs; /portal/page/FF96 served by the Charon portal in Sakai]
41. Sakai Proxy Tool
42. Proxy Tool Selection
43. How Proxy Portlet Works
[Diagram, steps 1 and 2: Sakai Portlet running in uPortal, Pluto, or GridSphere; Login and SiteList via Sakai Web Svcs; /portal/page/FF96 served by the Charon portal in Sakai]
44. SakaiSite.getToolsDom
[Sample response; the XML tags were lost in extraction. Values in document order:]
  http://localhost:8080/portal
  http://localhost:8080
  http://localhost:8080/gallery
  My Workspace
  ~csev
  http://localhost:8080/portal/worksite/~csev
  af54f077-42d8-4922-80e3-59c158af2a9a
  Home
  http://localhost:8080/portal/page/af54f077-42d8-4922-80e3-59c158af2a9a
  b7b19ad1-9053-4826-00f0-3a964cd20f77
  Message of the Day
  sakai.motd
  http://localhost:8080/portal/tool/b7b19ad1-9053-4826-00f0-3a964cd20f77
  85971b6b-e74e-40eb-80cb-93058368813c
  My Workspace Information
  sakai.iframe.myworkspace
  http://localhost:8080/portal/tool/85971b6b-e74e-40eb-80cb-93058368813c
- New WS method is upwards compatible with
45. Sakai Repository Integration Approach
46. Focus of Sakai Activity in eScience
Sakai is focused primarily on integration with portals and working closely with data repositories.
[Diagram: Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository (annotated "Now Discuss"), Knowledge Tools, Compute]
47. Collaboration vs. Repository
Collaboration:
- Many different systems may be active at the same time
- Systems evolve, improve, and are often replaced every few years
- Systems focused on the dynamic needs of users and applications
- Thousands of simultaneous online users
- Performance tuning
- Must be very easy to use; almost unnoticeable
- Used informally hundreds of times per day per user
- Think E-Mail
Repository:
- Generally one system for the area
- Long-term strategic choice for institution
- System focused on accessing, indexing, curation, and storage
- Millions of high quality objects properly indexed
- Data and metadata quality
- Must enforce standards and workflow to ensure data quality
- Most use is very purposeful: search, publish, add value
- Think Library
48. Inbound Object Flow
- Sakai: create and use in native form
- Prepare for storage: preparation for storage may include cleanup, conversion, copyright clearance, and other workflow steps.
- Ingest: the DR establishes a data model for site objects. The CLE hands sites to the DR. The DR may have to do model or content cleanup before completing the ingest process.
- DR: curate, convert, update and maintain over time
- The lens or disseminator understands the data model and is capable of rendering the objects. The lens is part of the DR.
[Diagram: Sakai, DR, Search, View, Reuse, Index, Lens, Store, Data Model, Ingest]
49. Outbound Object Flow
Sakai can find and re-use objects in the repository.
[Diagram: Sakai, DR, Search, View, Reuse, Index, Lens, Data Model]
50. Sakai and Repositories Going Forward
- Instead of solving the problem by creating a single DR technology that is a superset - which might take years
- Focus on data portability between systems - reduce the impedance mismatch (or needed conversion between systems)
- RDF enables object portability across systems, languages, and technologies
51. 
Sakai Repository Approach
- Move Sakai and other collaboration systems toward RDF
  - Experiment with using RDF as native storage format
  - High Performance RDF - Fedora testing - 180M tuples - complex queries - 70ms
- Move data repositories toward RDF
  - Move from schema-based stovepipe objects to OWL/RDF based objects with referential integrity
- Explore dimensions of portability of disseminator / lenses - this is an important research area
- Get started immediately.
52. Fedora Images
53. Some Reflections
54. Where is the Middleware?
- ..composing and orchestrating many technologies
- ..interoperability is key
[Diagram: Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
55. Is Middleware The Universal Connector?
[Diagram: Middleware at the center, connecting Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
56. The Universal Connectors
- tcp/ip
- http/https
- web services
[Diagram: Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
57. Is Middleware inside each application?
[Diagram: Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
58. Middleware is simply another component - used as needed
[Diagram: Middleware shown alongside Portal Technology, Data Sources, Collaborative Tools, Identity, ACL, Shared Data Repository, Knowledge Tools, Compute]
59. Identity and Access Control: A very important function of Middleware
[Diagram: Identity and ACL annotated "Let's Talk about This"; Middleware, Portal Technology, Data Sources, Collaborative Tools, Shared Data Repository, Knowledge Tools, Compute]
60. Chalk Talk: Identity and Access Control
[Diagram: CAS, ??, Shibboleth, MyProxy, GridShib, Globus, Identity, K.X509, ACL, Kerberos, ???, LDAP, Cosign, PubCookie; phases: Competition, Collaboration, Convergence]
61. Identity and ACL: Goal State
- One server - one software distribution
- Virtual Organization Software
- Supports all protocols
  - Globus Certificate Authority
  - Shibboleth
  - LDAP
  - MyProxy
  - Kerberos
- Who will do this? Who will fund this? Who can get these competitors to cooperate?
62. 
AUTHN/AUTHZ Meetings
63. My eScience Fantasy
64. The prerequisites
- My net worth is $5B (I give myself grants)
- I encounter some tech-savvy scientists in a field who are using technology to do world-class research
- They have never been visited by any other computer scientist
- They are working in groups of 1-30 geographically distributed around the world
- They all work on a beach with Internet2 connections and wide-open wireless and favourable exchange rates
65. [Diagram: stovepipes A - F: Experiments, Compute, Remote Observation, Data Models, eDocuments (Vol 1 - Vol 4), Tutorials]
66. Step 1: Visit The Scientists
- Understand what they are doing and how they are doing it
- Ask them how they would like to improve it.
- Show each application to other scientists. Ask the other scientists how they would improve it.
- Help each group improve their work - help them using whatever technology they are currently using
67. Step 2: Add some technology
- Install the super-multi-protocol Virtual Organization software and provide a NOC for the VO software - identity and simple attributes
- Install Sakai - point it at the VO software for identity - add icon at the top of Sakai
- Give each scientist an account in the VO
- Give each effort in the field a site within Sakai
68. Heart Study Collaboratory
[Screen mock-up: Login; site tabs My Workspace, A, B, C, D, E, Open Forum; tools Home, Chat, Resources, Tutorials, Site B, Mail List, Live Meetings]
69. Step 2: Use the VO
- For those who want to protect their information, help them add SSO to their sites, backed by the VO service
- Since it is multi-protocol - likely there will be no modification of the underlying science code - only a server configuration change
[Diagram: Identity, ACL]
70. [Diagram: stovepipes A - F (Experiments, Compute, Remote Observation, Data Models, eDocuments Vol 1 - Vol 4, Tutorials) alongside the Heart Study Collaboratory site and the Identity / ACL service]
71. 
Step 4: Unique Identifier Service
- Come up with a way for any member of the VO to get a unique identifier
- Demand some information (build a little data model)
  - Person's name and organization (implicit from request)
  - What kind of thing this will represent (experiment, document, image series)
  - Simple description
  - Keyword/value extensions
- Build a simple way to request and retrieve these through a simple web service - capture implicit metadata from request (when, IP address, etc). Make sure it works from Perl!
- Encourage community to start marking stuff with these identifiers in their stovepipes
72. Step 5: Data Models
- Begin to work with subsets of the field to try to find common data models across stovepipes
- Start simple - use very simple RDF - human readable
- Broaden / deepen model slowly - explore variations
- Define simple file-system pattern for storing metadata associated with a file and/or a directory
73. Step 6: A Backup-Style Repo
- Build a data repository which will function as a backup
- Basic idea - each time you get an identifier - this enables backup space - any data and/or metadata can be uploaded under that particular identifier and left in the repository
- Make the repo multi-protocol: FTP, DAV, Web-Service with attachments, GridFTP, etc.
- Make it so there can be a network of cooperating repositories
74. [Diagram: Local Repo and Central Repo, GUID Service, and Identity / ACL service added to stovepipes A - F (Experiments, Compute, Remote Observation, Data Models, eDocuments Vol 1 - Vol 4, Tutorials) and the Heart Study Collaboratory site]
75. 
Year 4 and on
- Once the basic stovepipes have been brought in from the cold and made part of a community with no harm, the next steps are to begin to work cross-stovepipe
  - Evolve data models to be far richer with many variants
  - Build value-added tools that are aware of the data models and are usable across stovepipes
- Teach the community to build and share tools - gently encourage development standards - Java / JSR-168 perhaps
- Most important: Always listen to the users
76. Science at the center of eScience
- Start at the center and work outwards
- Apply technology when the users will see it as a win
[Diagram: Science at the center; surrounding labels: Scientists, Priority, Connect, Communicate, Enhance, Data Storage, Repositories, Data Models, New Tools, New Technologies, New Approaches]
77. Conclusion
- Many years ago, eScience had science as its main focus
- Custom approaches resulted in too many unique solutions
- Computer scientists began a search for the magic bullet - each group found a different magic bullet
- Each group now competes for mind share (and funding) to be the one true magic bullet
78. Conclusion (cont)
- One way to solve the problem of many competing technologies is to form super groups which unify the technologies
- No single technology gets to claim they are the one (Middleware is not in the middle)
- Each technology needs to become a drop-in service/component which is available for use only when appropriate
- Once we can get past looking at the technologies as the main focus, we get back to science as the main focus
79. Let's remember why we started this whole field in the first place
- Scientific Domain
- Groups of People
- Common User Interface
- Data Sharing
  - In the moment
  - Long-term
- Experimental Equipment
- Compute
- Visualization
To download: www.dr-chuck.com, Chuck's Talks
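
Appendix sketch: the "little data model" from Step 4 (Unique Identifier Service) could look roughly like the following. This is only an illustration under stated assumptions, not part of the talk: the class and field names (IdentifierRecord, IdentifierService, remote_addr, and so on) are hypothetical, UUIDs are assumed as the identifier format, and the web-service transport the slide asks for is left out so that only the issue/retrieve logic and the implicit metadata capture are shown.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class IdentifierRecord:
    """One issued identifier plus the small data model from Step 4 (names are hypothetical)."""
    identifier: str       # the unique identifier handed back to the requester
    person: str           # implicit from the (authenticated) request
    organization: str     # implicit from the (authenticated) request
    kind: str             # e.g. "experiment", "document", "image series"
    description: str      # simple description
    extensions: dict = field(default_factory=dict)  # keyword/value extensions
    issued_at: str = ""   # implicit metadata: when the request was made (UTC, ISO 8601)
    remote_addr: str = "" # implicit metadata: requester IP address


class IdentifierService:
    """In-memory stand-in for the identifier service; a real one would sit behind a web service."""

    def __init__(self):
        self._records = {}

    def request(self, person, organization, kind, description,
                remote_addr, extensions=None):
        # The slide says to "demand some information", so reject empty requests.
        if not kind or not description:
            raise ValueError("kind and description are required")
        record = IdentifierRecord(
            identifier=str(uuid.uuid4()),  # assumption: a random UUID serves as the identifier
            person=person,
            organization=organization,
            kind=kind,
            description=description,
            extensions=dict(extensions or {}),
            issued_at=datetime.now(timezone.utc).isoformat(),
            remote_addr=remote_addr,
        )
        self._records[record.identifier] = record
        return record

    def retrieve(self, identifier):
        # Returns None when the identifier was never issued by this service.
        return self._records.get(identifier)


if __name__ == "__main__":
    service = IdentifierService()
    rec = service.request(
        person="Pat Example",
        organization="Example University",
        kind="experiment",
        description="Trial run of a hypothetical experiment",
        remote_addr="192.0.2.10",
        extensions={"instrument": "camera-1"},
    )
    print(rec.identifier, rec.kind, rec.issued_at)
```

Keeping the record this small follows the slide's advice to start simple; a scientist could then mark files in an existing stovepipe with rec.identifier, and the same identifier could later name the backup space described in Step 6.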