The Open Gateway Computing Environment: Experiences Developing Tools for Scientific Communities in the Apache Software Foundation
Marlon Pierce, Indiana University
March 7, 2012
March 7, 2012: The Competition for Your Attention
IU Science Gateway Group Overview
• IU Science Gateway Group members
– Marlon Pierce: Project Leader
– Suresh Marru: Principal Software Architect
– Raminder Singh, Chathura Herath, Yu Ma, Lahiru Gunathilake: Senior team members
– Research assistants and interns
• NSF SDCI funding of Open Gateway Computing Environments project
– TACC (M. Dahan), SDSC (N. Wilkins-Diehr), SDSU (M. Thomas), NCSA (S. Pamidighantam), UIUC (S. Wang), Purdue (C. Song), UTHSCSA (E. Brookes)
• We participate in two Apache incubators
– Apache Rave: http://incubator.apache.org/rave/
– Apache Airavata: http://www.airavata.org
Web user interfaces to Grids, Clouds, and other scientific resources.
Scientific workflow composition and execution.
[Figure: Cyberinfrastructure Layers]
• Layers: User Interfaces, Gateway Software, Resource Middleware, Compute Resources
• Interfaces between layers: Web/Gadget Interfaces, Gateway Abstraction Interfaces
• Components named in the figure: Web/Gadget Container, Web Enabled Desktop Applications, User Management, Auditing & Reporting, Fault Tolerance, Application Abstractions, Workflow System, Information Services, Monitoring, Registry, Security, Provenance & Metadata Management, Cloud Interfaces, Grid Middleware, SSH & Resource Managers, Computational Clouds, Computational Grids, Local Resources
• Color coding: Dependent resource provider components; Complementary Gateway Components; OGCE Gateway Components
Collaborations
Collaborating Team | Scientific Field
GridChem (Sudhakar Pamidighantam, NCSA) | Computational Chemistry
ParamChem (Alex MacKerell, Sudhakar Pamidighantam, Michael Sheetz et al.) | Molecular Sciences
WIYN Consortium One Degree Imager (Pat Knezek, NOAO) | Astronomy
OLAM (Craig Mattocks, University of Miami) | Atmospheric and Environmental Modeling
UltraScan (Borries Demeler, University of Texas Health Science Center) | Experimental Biophysics
LCCI (James Vary, Iowa State) | Computational Nuclear Physics
Dark Energy Survey Simulation Working Group (August Evrard et al.) | Astrophysics, Astronomy
VLab (Renata Wentzcovitch, University of Minnesota) | Planetary Materials
Key Problems for Science Gateway, Cyberinfrastructure Software
• Reusability
– Reuse or write your own gateway software?
• Sustainability
– The reason to reuse.
– Cyberinfrastructure Software Sustainability and Reusability: Report from an NSF-funded workshop
– https://scholarworks.iu.edu/dspace/handle/2022/6701
• Governance
– How are design decisions made?
– Who decides if the software is suitable for release?
– How do you handle contributions?
– How do you add people to the development and project management teams?
OGCE Funds Software Lifecycle
Governance: Open Community Software
• More than SourceForge, GitHub, Google Code, etc.
– Those provide excellent Web tools to help developers.
• Here we are concerned with community building.
– Diverse community of developers increases probability of reusability and sustainability
– But diverse communities require governance
– Get governance right, and sustainability and reuse will follow.
Some Open Model Examples in CI
• NSF-funded CDIGS project
– http://confluence.globus.org/display/CDIGS/CDIGS+Home+Page
• HUBzero Consortium, Sakai Foundation, Kuali Foundation
– Institutional level organizations
• Eclipse Parallel Tools Platform
– Jay Alameda, NCSA: NSF SI2 funding to develop HPC tools workbench
– http://www.eclipse.org/ptp/
• Enzo Project
– Excellent talk at TG11 by Prof. Brian O’Shea on their open community efforts
– http://enzo-project.org/
• Apache Software Foundation
– OODT and Tika Data Management Projects at NASA JPL
Two Apache Software Foundation Case Studies
Apache Rave and Airavata Incubators
Apache Airavata
• Science Gateway software framework to
– Compose, manage, execute, and monitor computational workflows on Grids and Clouds
– Web service abstractions to legacy command line-driven scientific applications
– Modular software framework to use as individual components or as an integrated solution.
• More Information
– Airavata Web Site: airavata.org
– Developer Mailing Lists: airavata-
Apache Airavata High Level Overview
Example Workflow: Nuclear Physics
Courtesy of collaboration with Prof. James Vary and team, Iowa State
Apache Rave
• Open Community Software for Enterprise Social Networking, Shareable Web Components, and Science Gateways
• Founding members:
– Mitre Software
– SURFnet
– Hippo Software
– Indiana University
• More information
– Project Website: http://incubator.apache.org/rave/
– Mailing List: [email protected]
Gadget Dashboard View
Gadget Store View
Rave Building Blocks
• Rave is implemented in JavaScript, Java with Spring MVC
– Bean initialization specified in XML configuration files.
– Inversion of Control makes it easy to swap out implementations.
– Disciplined MVC through Java annotations
• Builds on Apache Shindig and Wookie
– Provide layout management, user management, administration tools, production backend data systems, etc.
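The inversion-of-control point can be illustrated without Spring. What follows is a minimal plain-Java sketch, not Rave code: the names WidgetStore, InMemoryWidgetStore, GatewayWidgetStore, and DashboardController are hypothetical. In Rave the equivalent wiring is declared in Spring XML configuration rather than written by hand in main().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; none of these are Rave classes.
final class IocSketch {

    /** The contract the controller depends on. */
    interface WidgetStore {
        List<String> listWidgets();
    }

    /** A default implementation. */
    static final class InMemoryWidgetStore implements WidgetStore {
        @Override
        public List<String> listWidgets() {
            return Arrays.asList("clock", "weather");
        }
    }

    /** An alternative implementation that a gateway might swap in. */
    static final class GatewayWidgetStore implements WidgetStore {
        @Override
        public List<String> listWidgets() {
            return Arrays.asList("job-monitor", "workflow-composer");
        }
    }

    /** Depends only on the interface; the implementation is handed in from outside. */
    static final class DashboardController {
        private final WidgetStore store;

        DashboardController(WidgetStore store) {
            this.store = store;
        }

        String render() {
            return String.join(", ", store.listWidgets());
        }
    }

    public static void main(String[] args) {
        // Swapping the implementation requires no change to DashboardController.
        System.out.println(new DashboardController(new InMemoryWidgetStore()).render());
        System.out.println(new DashboardController(new GatewayWidgetStore()).render());
    }
}
```

Because the controller never names a concrete class, replacing one store with another is a wiring change only, which is the property the slide attributes to Rave's XML bean configuration.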
Extending Rave for Science Gateways
• Two constraints
– Must work out of the box
– But must be flexible for developers to adapt it.
• Rave is designed to be extended.
– Good design (interfaces, easily pluggable implementations) and code organization are required.
– It helps to have a diverse, distributed developer community
– How can you work on it if we can’t work on it?
• Rave is also packaged so that you can extend it without touching the source tree.
• GCE11 paper presented 3 case studies for Science Gateways
Rave Extension General Steps
• Download and install Rave’s source
– “mvn clean install” puts JARs, WARs, and POMs into your local Apache Maven repository.
– Only if building from a snapshot.
• Create a new Apache Maven project
– You’ll need rave-portal-dependencies POM in your <dependencies/>.
– Include any configuration files that you would like to modify.
– Include the source code for your extensions.
The Apache Way and Science Gateways
Why Apache for Gateway Software?
• Apache Software Foundation is a neutral playing field
– 501(c)(3) non-profit organization.
– Designed to encourage competitors to collaborate on foundational software.
– Includes a legal cell for legal issues.
• Provides the social infrastructure for building communities.
• Opportunities to collaborate with other Apache projects outside the usual CI world.
• Foundation itself is sustainable
– Incorporated in 1999
– Multiple sponsors (Yahoo, Microsoft, Google, AMD, Facebook, IBM, …)
• Proven governance models
– Projects are run by Project Management Committees.
– New projects must go through incubation.
The Apache Way
• Projects start as incubators with 1 champion and several mentors.
– Making good choices is very important
• Graduation ultimately is judged by the Apache community.
– +1/-1 votes on the incubator list
• Good, open engineering practices required
– DEV mailing list design discussions, issue tracking
– Jira contributions
– Important decisions are voted on
• Properly packaged code
– Build out of the box
– Releases are signed
– Licenses, disclaimers, notices, change logs, etc.
– Releases are voted
• Developer diversity
– Three or more unconnected developers
– Price is giving up sole ownership, replace with meritocracy
Apache and Science Gateways
• Apache rewards projects for cross-pollination.
– Connecting with complementary Apache projects strengthens both sides.
– New requirements, new development methods
• Apache methods foster sustainability
– Building communities of developers, not just users
– Key merit criterion
• Apache methods provide governance
– Incubators learn best practices from mentors
– Open, democratic procedures
– Processes for adding new committers and management
– Ex: Releases are peer-reviewed and voted on.
– All communications are archived.
Apache Contributions Aren’t Just Software
• Apache committers and Project Management Committee members aren’t just code writers.
• Successful communities also include
– Important users
– Project evangelists
– Content providers: documentation, tutorials
– Testers, requirements providers, and constructive complainers (using Jira and mailing lists)
– Anything else that needs doing.
How To Get Involved
• Join the DEV mailing lists.
• Grab the software and start complaining.
• Post Jira tickets
– Add your patches to Jira if you want to solve a problem.
– Request a review
– Frequent patch submission is the best way to get voted in as a committer.
Case Study: GridShib and Community Credentials
• XSEDE Science Gateways use shared community credentials when accessing backend resources.
– Many portal users map to one community account.
• GridShib adds attributes to grid credentials
– Gateway membership, originating IP address, user email, creation time, etc.
• For Rave, we’ll have to change the User service implementation to support this.
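The many-users-to-one-account mapping can be modeled as plain data. The sketch below is an illustration only and does not call any GridShib library: the class CommunityCredential, the method forPortalUser, and the attribute key names are all hypothetical, chosen to mirror the attributes listed on the slide.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical data model; it does not use or imitate the GridShib API.
final class CommunityCredentialSketch {

    /** One shared community account plus per-user attributes for auditing. */
    static final class CommunityCredential {
        private final String communityAccount;
        private final Map<String, String> attributes;

        CommunityCredential(String communityAccount, Map<String, String> attributes) {
            this.communityAccount = communityAccount;
            this.attributes = Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
        }

        String communityAccount() { return communityAccount; }
        Map<String, String> attributes() { return attributes; }
    }

    /** Builds a credential description for one portal user (attribute keys are assumed names). */
    static CommunityCredential forPortalUser(String communityAccount, String gateway,
                                             String email, String originatingIp, Instant created) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("gatewayMembership", gateway);
        attrs.put("originatingIp", originatingIp);
        attrs.put("userEmail", email);
        attrs.put("creationTime", created.toString());
        return new CommunityCredential(communityAccount, attrs);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        // Two different portal users share the same backend account.
        CommunityCredential a = forPortalUser("gateway-acct", "ExampleGateway",
                "alice@example.org", "192.0.2.10", now);
        CommunityCredential b = forPortalUser("gateway-acct", "ExampleGateway",
                "bob@example.org", "192.0.2.20", now);
        System.out.println(a.communityAccount() + " " + a.attributes());
        System.out.println(b.communityAccount() + " " + b.attributes());
    }
}
```

The attributes are what let a resource provider tell the two users apart even though both jobs arrive under the same community account.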
GridShib Step By Step
• Install Rave in your Maven repo.
• Create a Maven project with standard directory layout for WAR packaging
• Create a new user service (ComUserService) for obtaining a community credential and adding GridShib attributes.
• Replace applicationContext-security.xml with your version
• In the XML, replace the default UserService with ComUserService.
• Place all GridShib resources in src/main/resources
• Place web.xml in src/main/webapp/WEB-INF
– You’ll need an additional listener to get the IP address.
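The shape of the ComUserService step can be sketched in plain Java. This is a rough sketch under stated assumptions, not the actual extension: the UserService interface here is a simplified stand-in and not Rave's real interface, DefaultUserService and ClientAddressHolder are hypothetical, and the holder only imitates what a servlet listener registered in web.xml would do (record the client address for the current request thread).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; only the name ComUserService comes from the slides.
final class ComUserServiceSketch {

    /** Simplified stand-in for a portal user service; NOT Rave's actual interface. */
    interface UserService {
        Map<String, String> loadUser(String username);
    }

    /** Stand-in for the default service that the XML configuration would normally wire in. */
    static final class DefaultUserService implements UserService {
        @Override
        public Map<String, String> loadUser(String username) {
            Map<String, String> user = new LinkedHashMap<>();
            user.put("username", username);
            return user;
        }
    }

    /** Per-thread client address; a request listener would set it and clear it around each request. */
    static final class ClientAddressHolder {
        private static final ThreadLocal<String> ADDRESS = new ThreadLocal<>();

        static void set(String address) { ADDRESS.set(address); }
        static String get() { return ADDRESS.get(); }
        static void clear() { ADDRESS.remove(); }
    }

    /** Wraps another user service and adds community-credential information. */
    static final class ComUserService implements UserService {
        private final UserService delegate;
        private final String communityAccount;

        ComUserService(UserService delegate, String communityAccount) {
            this.delegate = delegate;
            this.communityAccount = communityAccount;
        }

        @Override
        public Map<String, String> loadUser(String username) {
            Map<String, String> user = new LinkedHashMap<>(delegate.loadUser(username));
            user.put("communityAccount", communityAccount);
            String ip = ClientAddressHolder.get();
            user.put("originatingIp", ip == null ? "unknown" : ip);
            return user;
        }
    }

    public static void main(String[] args) {
        UserService service = new ComUserService(new DefaultUserService(), "gateway-acct");
        ClientAddressHolder.set("192.0.2.10"); // what the listener would do at request start
        try {
            System.out.println(service.loadUser("alice"));
        } finally {
            ClientAddressHolder.clear();       // and at request end
        }
    }
}
```

Callers that depend only on the UserService contract are unaffected by the swap, which is why the slides can confine the change to the security XML file.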