Learning Open Source through GSOC
Post on 16-Dec-2014
Apache Software Foundation Indiana University
Science Gateways, Open Source & Google Summer of Code
Suresh Marru
Acknowledgements
Apache Software Foundation (ASF)
Extreme Science and Engineering Discovery Environment (XSEDE)
Science Gateways Group, Pervasive Technology Institute, Indiana University (SGG)
Credits to ….
Science Gateways Group @ IU
Marlon Pierce (Group Lead), Amila Jayasekara, Chathuri Wimalasena, Heshan Suriyaachchi, Jun Wang, Lahiru Gunathilake, Raminder Singh, Saminda Wijeratne, Suresh Marru, Viknes Balasubramanee, Yu (Marie) Ma
Apache Airavata
What will you hear today?
Science Gateways: Web 2.0, Social Networking, Grid & Cloud Computing, Big Data, everything-as-a-service -- churned into real-world scientific research.
Open Source: Hack into open source projects – a good way to cherish doing what you like as opposed to what you have to.
Google Summer of Code: Reward yourself with $5000 while making a case for future employment & graduate school admissions.
Outline
Google Summer of Code
Apache Software Foundation
Getting your way in Open Source
What are Science Gateways?
Interested? Next Steps……
www.google-melange.org
www.google-melange.com
What is Google Summer of Code?
Google Summer of Code is a program designed to encourage college student participation in open source software development.
Key Goals of GSOC
• Inspire young developers to begin participating in open source development
• Provide students in computer science and related fields the opportunity to do work related to their academic pursuits during the summer
• Give students more exposure to real-world software development scenarios (e.g. distributed development, software licensing questions, mailing list etiquette, etc.)
• Get more open source code created and released for the benefit of all
• Help open source projects identify and bring in new developers and committers
GSoC in numbers: Countries
GSoC Top Schools
GSoC in numbers: Students
The number of students maxed out and stabilized at around 1200.
This is not expected to grow in the near future, which is understandable. Still, thank you Google!!
GSoC Win-Win Perspective
• Project Perspective:
  o Paid software developer for the summer.
  o Attracting a new member into the project community.
• Student Perspective:
  o Opportunity to gain (open source) software development experience.
  o Good payment for rewarding work.
  o Ability to network and become known within a structured, distributed setting.
What to look for in a project?
Can you engage with the project (not just the mentor)? Can they guide you with tutorials and hand-hold early on?
For instance, will you get to experience the “Apache Way”?
Is the project welcoming and appreciative?
Is there mileage for your extra effort with long-term commitments?
Key Success: Integrated Cross Apache Projects
• Whirr API
Success Story from Apache Airavata Student: Milinda Pathirage
Core Contributions beyond GSOC
Milinda realized he could execute his GSoC project, but he also had great ideas on how to fundamentally improve the Airavata architecture to make future extensions easy.
The developer community agreed to the new architecture: simple, easy extensibility.
Airavata has adopted his proposed new architecture.
Enhanced Airavata Architecture
Diagram labels: Global InHandlers, Application specific InHandlers, Provider specific InHandlers, Provider Logic, Provider specific OutHandlers, Application specific OutHandlers, Global OutHandlers, Job Execution Context
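A minimal Python sketch of the handler pipeline labeled in the architecture diagram above. It assumes in-handlers run before the provider logic, out-handlers run after it, and a job execution context is passed through every stage. The handler ordering and every name below are illustrative assumptions, not Airavata's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class JobExecutionContext:
    """Hypothetical context object passed through every handler."""
    inputs: dict
    outputs: dict = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


Handler = Callable[[JobExecutionContext], None]


def make_handler(name: str) -> Handler:
    """Build a handler that only records that it ran."""
    def handler(ctx: JobExecutionContext) -> None:
        ctx.trace.append(name)
    return handler


def provider_logic(ctx: JobExecutionContext) -> None:
    """Stand-in for the provider that actually runs the job."""
    ctx.trace.append("provider")
    ctx.outputs["result"] = sum(ctx.inputs.values())


def run_job(ctx, in_handlers, provider, out_handlers):
    """Run the in-handlers, then the provider, then the out-handlers."""
    for handler in in_handlers:
        handler(ctx)
    provider(ctx)
    for handler in out_handlers:
        handler(ctx)
    return ctx


# Assumed ordering: global, application, provider on the way in; reversed on the way out.
in_chain = [make_handler(n) for n in ("global-in", "app-in", "provider-in")]
out_chain = [make_handler(n) for n in ("provider-out", "app-out", "global-out")]
ctx = run_job(JobExecutionContext(inputs={"a": 1, "b": 2}),
              in_chain, provider_logic, out_chain)
print(ctx.trace)
print(ctx.outputs)
```

Adding a new extension in this sketch means appending one more handler to a chain, without touching the provider logic.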
Pick what motivates you
• Harness your skills and interests
• If possible, pick a project that is relevant and “required” by aligning it with your academic curriculum
  o As a final year (research) project
  o As a Masters-level research project
• Create an interesting and challenging research problem
• Sense of satisfaction and achievements
  o Research publications
  o Presentations at ApacheCon and similar conferences
  o Committership
What does a good mentor look for?
• Free & Paid Contributions – the reality
• Long term participant in the project (not a software developer for ~3 months)
• Accomplish meaningful research-oriented goals either within the project or cross-cutting projects.
• Teach open source/community participation to the next generation workforce
What Is Cyberinfrastructure?
“Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible.” – Craig Stewart, Indiana University
Knowledge and Expertise
Computational Resources
Scientific Instruments
Algorithms and Models
Archived Data and Metadata
Advanced Science Tools
Science Gateways: Enabling & Democratizing Scientific Research
On-Demand Grid Computing
Dynamic Adaptive Cyberinfrastructure - Reacting to real-time weather
Streaming Observations
Storms Forming
Forecast Model
Data Mining
Refine forecast
Instrument Steering
Envisioned by a multi-disciplinary team from OU, IU, NCSA, Unidata, UAH, Howard, Millersville, Colorado State, RENCI
Anatomy of a Science Gateway
Gateway User Interface: Web Portals, Desktop Clients, Social/Collaboration Capabilities
Security Infrastructure, Analyses & Visualization Capabilities, Workflow Execution Framework
Application Abstraction, Workflow Construction & Enactment, Compute Resource Management, Scheduling, Messaging System
Data Management, Provenance Collection
Science Gateways: Enabling & Democratizing Scientific Research
Science Gateways enable and support communities of users associated with a scientific discipline to use cyberinfrastructure through a common interface that is configured for optimal use.
XSEDE Vision
The eXtreme Science and Engineering Discovery Environment (XSEDE):
• enhances the productivity of scientists and engineers by providing them with new and innovative capabilities
• and thus facilitates scientific discovery while enabling transformational science/engineering and innovative educational programs
https://www.xsede.org/gateways-overview
Today, there are approximately 35 gateways using XSEDE
The Apache Software Foundation
Apache software powers 65% of web sites worldwide
501(c)(3) non-profit foundation
Reasons for creating ASF
• Create legal entity
• Protect contributors from liability
• Protect Apache assets
Membership: individual
Apache Incubator
Governance and Staffing
• Board of Directors
• Project Management Committees
• ASF Members
• Committers
• Contributors
Funding
• All-volunteer staffing/development resources
• Donations
• Corporate investment
Apache Way: Beyond Open Source, Open Community
• Transparency
  o Decision-making and actions are observable
  o Events of interest are published and recorded
  o Transparency invites collaboration
• Meritocratic Governance
  o Influence on decisions is based on merit
  o Merit is earned in public
  o Community based governance
• Community
  o Common interest, Community interest, Common experience
  o “Community before code”
• Collaboration
  o Systems supporting communication and coordination: repositories, trackers, forums, build tools
  o You can reuse what you can see and influence
  o More eyeballs means better quality
Apache Organization
• Apache is a meritocratic organization
  – Merit does not expire. You earn your keep and your credentials
• Start out as Contributor
  – Patches, mailing list comments, testing, documentation, etc.
  – No commit access
• Move on to Committer
  – Commit access, evolve the code
• PMC Members
  – Have binding VOTEs on releases/personnel
• Officer (VP, Project)
  – PMC Chair
• ASF Member
  – Have binding VOTE in the state of the foundation
  – Elect Board of Directors
• Director
  – Oversight of projects, foundation activities
Our experience with Apache ..
• Give up control and get back contributions.
• Being in Apache by itself doesn’t guarantee sustainability but opens doors for sustainability.
• Google Summer of Code has brought in students, increased documentation, identified confined projects.
• Do not have to worry about getting sued by Oracle for using Java APIs. Standing behind a shield of expert lawyers.
• Companies make in-kind contributions; some have concrete plans, some are just evangelizing. Both are good.
• Today’s cyberinfrastructure ecosystem is not in a funding situation to work on parallel independent implementations.
• Shared implementation is hard to achieve, but well thought-out architectures can achieve it.
• Also encourage multiple implementations and let the communities sort it out. The winner sustains. Example: Apache Axis2, Apache CXF
Apache Contributions Aren’t Just Software
• Apache committers and PMC members aren’t just code writers.
• Successful communities also include
  – Important users
  – Project evangelists
  – Content providers: documentation, tutorials
  – Testers, requirements providers, and constructive complainers
    • Using Jira and mailing lists
  – Anything else that needs doing.
Apache Airavata
http://airavata.apache.org
Science Gateways with Airavata
Diagram labels: End Users, Gateway Developer, Scientific Application, Core Developer, Apache Airavata API, Workflow Interpreter, Application Factory, Message Box, Registry, Computational Resources
Apache Airavata Components
Component: Description
XBaya: Workflow graphical composition tool.
Registry Service: Insert and access application, host machine, workflow, and provenance data.
Workflow Interpreter Service: Execute the workflow on one or more resources.
Application Factory Service (GFAC): Manages the execution of an application in a workflow.
Messaging System: WS-Notification and WS-Eventing compliant publish/subscribe messaging system for workflow events.
Airavata API: Single wrapping client to provide higher level programming interfaces.
Key Airavata Features
Graphical user interface to construct, execute, control, manage and reuse scientific workflows.
Desktop tools and browser-based web interface components to manage applications, workflows and generated data.
Sophisticated server-side tools to register, schedule and manage scientific applications on high performance computational resources.
Ability to interface and interoperate with various external (third party) data, workflow and provenance management tools.
A Classic Scientific Workflow
• Workflows are composite applications built out of independent parts.
  o Parts are executables wrapped as network accessible services
• The classic example is that codes A, B, and C need to be executed in a specific sequence.
  o A, B, C: parallel codes compiled and executable on a cluster, supercomputer, etc. by schedulers.
  o A, B, and C do not need to be co-located
  o A, B, and C may be sequential or parallel
  o A, B, and C may have data or control dependencies
  o Data may need to be staged in and out
• Some variations on ABC:
  o Conditional execution branches
  o Dynamic execution resource binding
  o Iterations (Do-while, For-Each) over all or parts of the sequence
  o Triggers, events, data streams
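The classic A, B, C case can be sketched in a few lines of Python, assuming each code is a plain function and dependencies are declared as a mapping from a task to the tasks it needs. The task bodies and names are made up for illustration; a real workflow system would submit jobs to remote resources instead of calling functions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def code_a(_deps):
    """Hypothetical first code: produces an initial value."""
    return 2


def code_b(deps):
    """Hypothetical second code: consumes the output of A."""
    return deps["A"] * 10


def code_c(deps):
    """Hypothetical third code: consumes the outputs of A and B."""
    return deps["A"] + deps["B"]


TASKS = {"A": code_a, "B": code_b, "C": code_c}
# Each task maps to the set of tasks whose output it depends on.
DEPENDS_ON = {"A": set(), "B": {"A"}, "C": {"A", "B"}}


def run_workflow(tasks, depends_on):
    """Run every task after all of its dependencies have finished."""
    results = {}
    order = []
    for name in TopologicalSorter(depends_on).static_order():
        inputs = {dep: results[dep] for dep in depends_on[name]}
        results[name] = tasks[name](inputs)
        order.append(name)
    return order, results


order, results = run_workflow(TASKS, DEPENDS_ON)
print(order, results)
```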
Challenges in Scientific Workflows
• Accommodating a wide range of execution patterns
  o Iterations: for-each, do-while, dot and Cartesian products
  o Interactivity, adaptivity, non-determinism
• Accommodating errors and uncertainties
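To make the two for-each flavors concrete, here is a small Python sketch. It assumes "dot product" means pairing the i-th items of each input list and "Cartesian product" means running every combination. The function `run_model` and its parameters are hypothetical stand-ins for one application invocation.

```python
from itertools import product


def run_model(temperature, pressure):
    """Hypothetical stand-in for a single application run."""
    return f"T={temperature},P={pressure}"


def for_each_dot(temperatures, pressures):
    """Dot product: pair the i-th temperature with the i-th pressure."""
    if len(temperatures) != len(pressures):
        raise ValueError("dot-product iteration needs equal-length inputs")
    return [run_model(t, p) for t, p in zip(temperatures, pressures)]


def for_each_cartesian(temperatures, pressures):
    """Cartesian product: run every temperature with every pressure."""
    return [run_model(t, p) for t, p in product(temperatures, pressures)]


dot_runs = for_each_dot([280, 300], [1, 2])
cartesian_runs = for_each_cartesian([280, 300], [1, 2])
print(len(dot_runs), len(cartesian_runs))
```

With two values per input, the dot product yields two runs while the Cartesian product yields four, which is why the choice matters for parameter studies.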
NextGen Workflow Systems: Need for Interactivity Across Layers
Scientific workflow systems and compiled workflow languages have focused on modeling, scheduling, data movement, dynamic service creation and monitoring of workflows.
Building on these foundations, Airavata extends to an interactive and flexible workflow system.
Airavata workflow features include:
• interactive ways of interfering with and steering the workflow execution
• interpreted workflow execution model
• high level instruction set
• flexibility to execute an individual workflow activity and wait for further analysis.
Interactivity Contd.
• Deviations during workflow execution that do not affect the structure of the workflow
  o Dynamic change of workflow inputs, workflow rerun.
  o Interpreted workflow execution model.
  o Dynamic change in point of execution, workflow smart rerun.
  o Fault handling and exception models.
• Deviations that change the workflow DAG during runtime
  o Reconfiguration of an activity.
  o Dynamic addition of activities to the workflow.
  o Dynamic removal or replacement of an activity in the workflow.
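One way to picture an interpreted, steerable execution model is a loop that runs a single activity per step and lets the user change the remaining plan in between. The Python sketch below is only an illustration under that assumption; the class and method names are hypothetical and do not come from Airavata.

```python
class WorkflowInterpreter:
    """Hypothetical interpreter that runs one activity per step."""

    def __init__(self, activities, value):
        self.pending = list(activities)  # (name, function) pairs still to run
        self.value = value               # data flowing between activities
        self.history = []                # (name, output) of finished activities

    def step(self):
        """Run the next activity; return True while activities remain."""
        if not self.pending:
            return False
        name, func = self.pending.pop(0)
        self.value = func(self.value)
        self.history.append((name, self.value))
        return bool(self.pending)

    def add_activity(self, name, func, position=0):
        """Insert a new activity into the pending part of the workflow."""
        self.pending.insert(position, (name, func))


interp = WorkflowInterpreter(
    [("double", lambda x: x * 2), ("increment", lambda x: x + 1)],
    value=5,
)
interp.step()  # run only "double", then pause for inspection
# After looking at the intermediate value, add a step before "increment".
interp.add_activity("square", lambda x: x * x)
while interp.step():
    pass
print(interp.history)
```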
Interactivity
• Mathematical uncertainty:
  o PDEs from domain problems do not have analytical solutions, so numerical methods are used to find solutions.
  o These solvers may not converge, depending on the method, PDE system, initial conditions and expected output tolerances.
  o Statistical techniques lead to nondeterministic results.
  o Closer observation of computational output ensures acceptability of results.
• Domain uncertainty:
  o Scenarios of running against a range of parameter values in an attempt to find the most appropriate input set.
  o Initial execution provides an estimate of the accuracy of the inputs and facilitates further refinement.
  o Outputs are diverse and nondeterministic.
• Resource uncertainty:
  o Failures in distributed systems are the norm rather than the exception.
  o Transient failures can be retried if the computation is side-effect free/idempotent.
  o Persistent failures require migration.
• Real-time model refinement:
  o Real-time event processing systems do not have data available prior to initialization of the model.
  o Models evolve over time and can take advantage of more and more events as they become available.
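The resource uncertainty points above (retry transient failures, migrate on persistent ones) can be sketched as follows. This is a hedged Python illustration: the exception classes, resource names and the `flaky_job` function are invented for the example, and it assumes the job is idempotent so that retrying is safe. A real system would also wait between attempts.

```python
class TransientError(Exception):
    """Hypothetical failure that may succeed on retry."""


class PersistentError(Exception):
    """Hypothetical failure that will not go away on this resource."""


def run_with_retry(job, resources, max_attempts=3):
    """Retry transient failures; move to the next resource otherwise."""
    for resource in resources:
        for _attempt in range(max_attempts):
            try:
                return job(resource)
            except TransientError:
                continue  # same resource, try again
            except PersistentError:
                break     # give up on this resource and migrate
    raise RuntimeError("job failed on every resource")


calls = []


def flaky_job(resource):
    """Fails permanently on cluster-a, once transiently on cluster-b."""
    calls.append(resource)
    if resource == "cluster-a":
        raise PersistentError("cluster-a is down")
    if calls.count(resource) < 2:
        raise TransientError("temporary network glitch")
    return f"done on {resource}"


result = run_with_retry(flaky_job, ["cluster-a", "cluster-b"])
print(result, calls)
```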
Illustrating Interactivity
Apache Airavata in Action
Domain: Description
Astronomy: Image processing pipeline for One Degree Imager instrument on XSEDE
Astrophysics: Supporting workflow of Dark Energy Survey simulations working group on XSEDE
Bioinformatics: Supported workflow executions on Amazon EC2 for BioVLAB project
Biophysics: Manage large scale data analysis of analytical ultracentrifugation experiments on XSEDE and campus resources
Computational Chemistry: Manage workflows to support computational chemistry parameter studies for ParamChem.org on XSEDE
Nuclear Physics: Workflows for nuclear structure calculations using Leadership Class Configuration Interaction (LCCI) computations on DOE resources
How to crack GSoC?
1. Engage Early
2. Familiarize Yourself with Projects
3. Propose Ideas
4. Win, Code, Earn… Cherish!!!
Be Part of the project Community
• Play with different popular open source software ..
• Experiment with the emerging technologies …
• Learn & Engage with a multidisciplinary community..
Be pro-active instead of being reactive:
come up with your own ideas
Join the mailing list
Google Group - sgw-gsoc-discuss: https://groups.google.com/d/forum/sgw-gsoc-discuss
Need more info – smarru@apache.org