The Open Gateway Computing Environment: Experiences Developing Tools for Scientific Communities in the Apache Software Foundation Marlon Pierce Indiana.

Post on 25-Dec-2015

The Open Gateway Computing Environment: Experiences Developing Tools for Scientific Communities in the Apache Software Foundation

Marlon Pierce
Indiana University
March 7, 2012

March 7, 2012: The Competition for Your Attention

IU Science Gateway Group Overview

• IU Science Gateway Group members
  – Marlon Pierce: Project Leader
  – Suresh Marru: Principal Software Architect
  – Raminder Singh, Chathura Herath, Yu Ma, Lahiru Gunathilake: Senior team members
  – Research assistants and interns
• NSF SDCI funding of Open Gateway Computing Environments project
  – TACC (M. Dahan), SDSC (N. Wilkins-Diehr), SDSU (M. Thomas), NCSA (S. Pamidighantam), UIUC (S. Wang), Purdue (C. Song), UTHSCSA (E. Brookes)
• We participate in two Apache incubators
  – Apache Rave: http://incubator.apache.org/rave/
  – Apache Airavata: http://www.airavata.org

Web user interfaces to Grids, Clouds, and other scientific resources.

Scientific workflow composition and execution.

[Figure: Cyberinfrastructure Layers. Layers shown: Gateway Software (user interfaces, Web/gadget container, Web-enabled desktop applications, user management, auditing and reporting, fault tolerance, application abstractions, workflow system, information services, monitoring, registry, security, provenance and metadata management), Resource Middleware (cloud interfaces, grid middleware, SSH and resource managers), and Compute Resources (computational clouds, computational grids, local resources). Color coding: dependent resource provider components, complementary gateway components, OGCE gateway components.]

Collaborations

Collaborating Team | Scientific Field
GridChem (Sudhakar Pamidighantam, NCSA) | Computational Chemistry
ParamChem (Alex Mackerell, Sudhakar Pamidighantam, Michael Sheetz et al.) | Molecular Sciences
WIYN Consortium One Degree Imager (Pat Knezek, NOAO) | Astronomy
OLAM (Craig Mattocks, University of Miami) | Atmospheric and Environmental Modeling
UltraScan (Borries Demeler, University of Texas Health Science Center) | Experimental Biophysics
LCCI (James Vary, Iowa State) | Computational Nuclear Physics
Dark Energy Survey Simulation Working Group (August Evrard et al.) | Astrophysics, Astronomy
VLab (Renata Wentzcovitch, University of Minnesota) | Planetary Materials

Key Problems for Science Gateway, Cyberinfrastructure Software

• Reusability
  – Reuse or write your own gateway software?
• Sustainability
  – The reason to reuse.
  – Cyberinfrastructure Software Sustainability and Reusability: Report from an NSF-funded workshop
    • https://scholarworks.iu.edu/dspace/handle/2022/6701
• Governance
  – How are design decisions made?
  – Who decides if the software is suitable for release?
  – How do you handle contributions?
  – How do you add people to the development and project management teams?

OGCE Funds Software Lifecycle

Governance: Open Community Software

• More than SourceForge, GitHub, Google Code, etc.
  – Those provide excellent Web tools to help developers.
• Here we are concerned with community building.
  – Diverse community of developers increases probability of reusability and sustainability
  – But diverse communities require governance
  – Get governance right, and sustainability and reuse will follow.

Some Open Model Examples in CI

• NSF-funded CDIGS project
  – http://confluence.globus.org/display/CDIGS/CDIGS+Home+Page
• HUBzero Consortium, Sakai Foundation, Kuali Foundation
  – Institutional level organizations
• Eclipse Parallel Tools Platform
  – Jay Alameda, NCSA: NSF SI2 funding to develop HPC tools workbench
  – http://www.eclipse.org/ptp/
• Enzo Project
  – Excellent talk at TG11 by Prof. Brian O’Shea on their open community efforts
  – http://enzo-project.org/
• Apache Software Foundation
  – OODT and Tika Data Management Projects at NASA JPL

Two Apache Software Foundation Case Studies

Apache Rave and Airavata Incubators

Apache Airavata

• Science Gateway software framework to
  – Compose, manage, execute, and monitor computational workflows on Grids and Clouds
  – Web service abstractions to legacy command line-driven scientific applications
  – Modular software framework to use as individual components or as an integrated solution.
• More Information
  – Airavata Web Site: airavata.org
  – Developer Mailing Lists: airavata-dev@incubator.apache.org
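The idea of putting a service-style abstraction in front of a legacy command-line application can be illustrated with a minimal Java sketch. This is not Airavata's API: the class name `CommandLineApplication`, the executable path, and the flags are all hypothetical, and the sketch only shows the general pattern of describing a program once and then invoking it with per-run inputs.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper around a legacy command-line program (not an Airavata class). */
final class CommandLineApplication {
    private final String executable;
    private final List<String> fixedFlags;

    CommandLineApplication(String executable, List<String> fixedFlags) {
        this.executable = executable;
        this.fixedFlags = new ArrayList<>(fixedFlags);
    }

    /** Builds the full argument vector: executable, fixed flags, then per-run inputs. */
    List<String> buildCommand(List<String> inputs) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        command.addAll(fixedFlags);
        command.addAll(inputs);
        return command;
    }

    /** Runs the program on the local machine and returns its exit code. */
    int run(List<String> inputs) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(inputs)).inheritIO().start();
        return process.waitFor();
    }
}

final class CommandLineApplicationDemo {
    public static void main(String[] args) {
        // The path and flags are placeholders chosen for illustration only.
        CommandLineApplication solver =
                new CommandLineApplication("/opt/apps/solver", Arrays.asList("--threads", "4"));
        System.out.println(solver.buildCommand(Arrays.asList("input.dat")));
    }
}
```

A gateway would typically expose something like `run` behind a Web service and submit the command to a Grid or Cloud resource rather than the local machine; the local `ProcessBuilder` call here is a stand-in for that step.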

Apache Airavata High Level Overview

Example Workflow: Nuclear Physics

Courtesy of collaboration with Prof. James Vary and team, Iowa State

Apache Rave

• Open Community Software for Enterprise Social Networking, Shareable Web Components, and Science Gateways
• Founding members:
  • Mitre Software
  • SURFnet
  • Hippo Software
  • Indiana University
• More information
  • Project Website: http://incubator.apache.org/rave/
  • Mailing List: rave-dev@incubator.apache.org

Gadget Dashboard View

Gadget Store View

Rave Building Blocks

• Rave is implemented in JavaScript and Java with Spring MVC
  – Bean initialization specified in XML configuration files.
  – Inversion of Control makes it easy to swap out implementations.
  – Disciplined MVC through Java annotations
• Builds on Apache Shindig and Wookie
  – Provide layout management, user management, administration tools, production backend data systems, etc.
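To make the "swap out implementations" point concrete, here is a minimal, framework-free Java sketch of Inversion of Control through constructor injection. It deliberately uses no Spring classes, and every name in it (`UserLookup`, `InMemoryUserLookup`, `FixedUserLookup`, `ProfileController`) is hypothetical rather than taken from Rave. In Rave, per the slide, the equivalent wiring is declared in XML configuration files instead of in Java code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical service interface; the controller depends only on this. */
interface UserLookup {
    String displayName(String userId);
}

/** Default implementation backed by an in-memory map. */
final class InMemoryUserLookup implements UserLookup {
    private final Map<String, String> names = new HashMap<>();

    void register(String userId, String name) {
        names.put(userId, name);
    }

    @Override
    public String displayName(String userId) {
        return names.getOrDefault(userId, "unknown user");
    }
}

/** Alternative implementation that answers with the same name for everyone. */
final class FixedUserLookup implements UserLookup {
    private final String name;

    FixedUserLookup(String name) {
        this.name = name;
    }

    @Override
    public String displayName(String userId) {
        return name;
    }
}

/** The controller never constructs its dependency; whoever wires it decides. */
final class ProfileController {
    private final UserLookup users;

    ProfileController(UserLookup users) {
        this.users = users;
    }

    String greeting(String userId) {
        return "Welcome, " + users.displayName(userId);
    }
}

final class IocDemo {
    public static void main(String[] args) {
        InMemoryUserLookup lookup = new InMemoryUserLookup();
        lookup.register("u1", "Ada");
        // Swapping the implementation requires no change to ProfileController.
        System.out.println(new ProfileController(lookup).greeting("u1"));
        System.out.println(new ProfileController(new FixedUserLookup("Guest")).greeting("u1"));
    }
}
```

Because `ProfileController` sees only the interface, replacing the lookup is a wiring change rather than a code change, which is the property the slide attributes to Inversion of Control.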

Extending Rave for Science Gateways

• Two constraints
  – Must work out of the box
  – But must be flexible for developers to adapt it.
• Rave is designed to be extended.
  – Good design (interfaces, easily pluggable implementations) and code organization are required.
  – It helps to have a diverse, distributed developer community
    • How can you work on it if we can’t work on it?
• Rave is also packaged so that you can extend it without touching the source tree.
• GCE11 paper presented 3 case studies for Science Gateways

Rave Extension General Steps

• Download and install Rave’s source
  – “mvn clean install” puts JARs, WARs, and POMs into your local Apache Maven repository.
  – Only if building from a snapshot.
• Create a new Apache Maven project
  – You’ll need rave-portal-dependencies POM in your <dependencies/>.
  – Include any configuration files that you would like to modify.
  – Include the source code for your extensions.

The Apache Way and Science Gateways

Why Apache for Gateway Software?

• Apache Software Foundation is a neutral playing field
  – 501(c)(3) non-profit organization.
  – Designed to encourage competitors to collaborate on foundational software.
  – Includes a legal cell for legal issues.
• Provides the social infrastructure for building communities.
• Opportunities to collaborate with other Apache projects outside the usual CI world.
• Foundation itself is sustainable
  – Incorporated in 1999
  – Multiple sponsors (Yahoo, Microsoft, Google, AMD, Facebook, IBM, …)
• Proven governance models
  – Projects are run by Project Management Committees.
  – New projects must go through incubation.

The Apache Way

• Projects start as incubators with 1 champion and several mentors.
  – Making good choices is very important
• Graduation ultimately is judged by the Apache community.
  – +1/-1 votes on the incubator list
• Good, open engineering practices required
  – DEV mailing list design discussions, issue tracking
  – Jira contributions
  – Important decisions are voted on
• Properly packaged code
  – Build out of the box
  – Releases are signed
  – Licenses, disclaimers, notices, change logs, etc.
  – Releases are voted on
• Developer diversity
  – Three or more unconnected developers
  – The price is giving up sole ownership, which is replaced by meritocracy
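As a rough illustration of how a +1/-1 release vote is tallied, the sketch below encodes the majority-approval rule that the ASF documents for releases: at least three binding +1 votes, and more +1 than -1 votes. The class and method names are hypothetical, and deciding which votes count as binding (for example, for a project still in the incubator) is outside the sketch.

```java
/** Hypothetical helper that tallies binding votes on a release candidate. */
final class ReleaseVote {
    private ReleaseVote() {
    }

    /**
     * Each entry is one binding vote: +1 (approve), -1 (reject), or 0 (abstain).
     * Returns true when there are at least three +1 votes and more +1 than -1 votes.
     */
    static boolean passes(int[] bindingVotes) {
        int plus = 0;
        int minus = 0;
        for (int vote : bindingVotes) {
            if (vote > 0) {
                plus++;
            } else if (vote < 0) {
                minus++;
            }
        }
        return plus >= 3 && plus > minus;
    }
}

final class ReleaseVoteDemo {
    public static void main(String[] args) {
        System.out.println(ReleaseVote.passes(new int[] {1, 1, 1, -1, 0}));
        System.out.println(ReleaseVote.passes(new int[] {1, 1}));
    }
}
```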

Apache and Science Gateways

• Apache rewards projects for cross-pollination.
  – Connecting with complementary Apache projects strengthens both sides.
  – New requirements, new development methods
• Apache methods foster sustainability
  – Building communities of developers, not just users
  – Key merit criterion
• Apache methods provide governance
  – Incubators learn best practices from mentors
  – Open, democratic procedures
  – Processes for adding new committers and management
  – Ex: Releases are peer-reviewed and voted on.
  – All communications are archived.

Apache Contributions Aren’t Just Software

• Apache committers and Project Management Committee members aren’t just code writers.
• Successful communities also include
  – Important users
  – Project evangelists
  – Content providers: documentation, tutorials
  – Testers, requirements providers, and constructive complainers
    • Using Jira and mailing lists
  – Anything else that needs doing.

How To Get Involved

• Join the DEV mailing lists.
• Grab the software and start complaining.
• Post Jira tickets
  – Add your patches to Jira if you want to solve a problem.
  – Request a review
  – Frequent patch submission is the best way to get voted in as a committer.

Case Study: GridShib and Community Credentials

• XSEDE Science Gateways use shared community credentials when accessing backend resources.
  – Many portal users map to one community account.
• GridShib adds attributes to grid credentials
  – Gateway membership, originating IP address, user email, creation time, etc.
• For Rave, we’ll have to change the User service implementation to support this.

GridShib Step By Step

• Install Rave in your Maven repo.
• Create a Maven project with standard directory layout for WAR packaging
• Create a new user service (ComUserService) for obtaining a community credential and adding GridShib attributes.
• Replace applicationContext-security.xml with your version
• In the XML, replace the default UserService with ComUserService.
• Place all GridShib resources in src/main/resources
• Place web.xml in src/main/webapp/WEB-INF
  – You’ll need an additional listener to get the IP address.
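The role of the replacement user service in the steps above can be sketched in plain Java. Only the name `ComUserService` comes from the slides; its methods, the `CommunityCredential` type, and the attribute keys are assumptions made for illustration, and the sketch calls neither the Rave nor the GridShib API. It shows the two ideas from the case study: many portal users map to one community account, and per-user attributes travel with the credential.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical value object: one shared account plus per-user attributes. */
final class CommunityCredential {
    final String communityAccount;
    final Map<String, String> attributes;

    CommunityCredential(String communityAccount, Map<String, String> attributes) {
        this.communityAccount = communityAccount;
        this.attributes = Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
    }
}

/** Sketch of a user service that maps every portal user to one community account. */
final class ComUserService {
    private final String gatewayName;
    private final String communityAccount;

    ComUserService(String gatewayName, String communityAccount) {
        this.gatewayName = gatewayName;
        this.communityAccount = communityAccount;
    }

    /**
     * Builds a credential description for one portal user. In a servlet
     * container, remoteIp would come from the incoming request, which is
     * why the steps above call for an additional listener in web.xml.
     */
    CommunityCredential credentialFor(String portalUser, String email,
                                      String remoteIp, Instant createdAt) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("gateway", gatewayName);          // gateway membership
        attributes.put("portalUser", portalUser);
        attributes.put("email", email);                  // user email
        attributes.put("originatingIp", remoteIp);       // originating IP address
        attributes.put("creationTime", createdAt.toString());
        return new CommunityCredential(communityAccount, attributes);
    }
}

final class ComUserServiceDemo {
    public static void main(String[] args) {
        // All values below are placeholders.
        ComUserService service = new ComUserService("example-gateway", "community01");
        CommunityCredential credential = service.credentialFor(
                "alice", "alice@example.org", "192.0.2.10", Instant.now());
        System.out.println(credential.communityAccount + " " + credential.attributes);
    }
}
```

A real implementation would obtain an actual grid credential and embed the attributes in it; this sketch stops at assembling the data that would be embedded.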
