Hortonworks Engineering Apache Training
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Introduction To Apache
A non-profit foundation that manages open source projects
Many top level projects (TLPs)
– Covering many spaces such as big data, web serving, and J2EE
– Example TLPs: Hadoop, Spark, Tomcat, httpd
– Each one is managed by a Project Management Committee (PMC)
– Each project has a fair amount of room to make its own culture and rules
Apache stresses community over code
– The goal is to build communities that produce and support software
New projects enter Apache through the Incubator
– Mentors train the project in the “Apache Way”
– Participant votes are not binding; they must be approved by mentors, so releases and similar steps take longer
Apache Roles
Users – the millions who download and use Apache projects
Contributors – those who contribute to an Apache project
– Not only developers; contributing tests and documentation, finding bugs, and helping others all count
Committers – those who have write permission for the project
Project Management Committee (PMC)
– Determines committers and other PMC members
– Approves releases
– Assures the project is operating in the “Apache Way”
– Reports regularly to the Apache Board
Apache Members – shareholders and caretakers of the foundation itself
– Can serve in the Incubator, on the Board, or in various other roles
– Hortonworks employs a number of Apache members
Apache Board – governs Apache
The “Apache Way” – No Corporate Affiliations
Each of us works in Apache as an individual, NOT as a Hortonworks employee
Hortonworks as such has no voice at Apache
This means we cannot tell our committers etc. how to vote, act, etc. in Apache
Hortonworks does not control the Apache release process, so it cannot commit to its customers that it will achieve any particular outcome in Apache; we can commit to work with them for the good of the community
Hortonworks is an Apache sponsor
The “Apache Way” - Meritocracy
People are recognized in Apache for “just doing it”, leading by doing rather than by talking
Contributors who consistently make quality contributions and work well with the project typically are voted in as committers by the PMC
Committers who continue to contribute, work well with the team, and show they can help guide the project typically are voted in as PMC members by the PMC
Committers and PMC members who consistently contribute across projects or in positions of leadership in the foundation are voted in as Apache members by the other members
The “Apache Way” - Collaboration
The mailing lists are the official record of Apache interactions
– “If it didn’t happen on the lists, it didn’t happen”
– JIRA counts, as all JIRA comments are copied to the mailing lists
– All lists are archived
Every project has a set of lists
– dev – for those developing the project, including feature discussions and releases
– private – only for things that must be private; this list is accessible only by PMC and Apache members
– user – for users of a project; new projects often don’t have this
– Some projects have commits, issues, or security lists as well
Everything in Apache that can be public must be public
– Only personnel, security, or other delicate matters should be discussed on private lists
Decisions are reached by consensus, not by force
– Decisions such as releases and adding new committers or PMC members are voted on, with votes open for 3+ days to assure everyone has a chance to see them
– Votes should be used mainly to formalize consensus
Licenses
All Apache projects use the Apache License
The Apache License is corporate friendly: users can copy, repackage, and even sell Apache licensed software
Some licenses are copyleft, meaning that if you make any changes or additions to the code you must also open source your changes/additions (e.g. GPL)
When contributing to Apache, if you are contributing code from another source you must make sure it has a compatible license and evaluate whether to update the LICENSE and NOTICE files (see FAQ)
Apache License != Apache software; anyone can use the license
Why Apache?
Apache is a known and respected brand in the open source world
Developing open source software owned by a 3rd party gives our customers confidence in the longevity and availability of our software
Contributing our software to Apache assures the fastest possible market adoption, thus increasing the potential market for Hortonworks
Working in Apache allows us to collaborate with customers, partners, and competitors to develop the best possible software
The Importance of Healthy Apache Communities
It is crucial to Hortonworks that the Apache communities we participate in are healthy
A healthy community provides users, testers, and contributors
A healthy community assures our users that they are getting a true open source, Apache driven, community owned project, not a thin foil for Hortonworks proprietary software
A healthy community means that the software we develop will continue to grow and thrive
Which Hat Are You Wearing?
If you are a committer, PMC member, etc. you have responsibilities to Apache
Your Apache responsibilities are neither more nor less binding than your Hortonworks responsibilities; you must play both roles
You represent Apache to Hortonworks
– Sometimes your co-workers, customers, partners etc. won’t understand Apache and its ways – your role is to be an ambassador
Best Practices
Make sure all off-list discussions regarding a project are posted to the list
– Non-Hortonworkers and remote Hortonworkers should be able to participate fully
Discuss changes fully in JIRA (or PR or review request or whatever your project uses)
– Avoid the pattern: 1) open a JIRA with minimal description; 2) post a patch; 3) resolve the JIRA, all within 5 min
Participate in the mailing lists in constructive ways
– Be professional in your interactions at Apache, even when others aren’t
Make sure all involved Hortonworkers (QE, support, docs) are represented in Apache
– Avoid funneling interactions with Apache through one or a few people
Review patches from outside Hortonworks
– Give good, constructive feedback
Respond positively to feedback from community members
– This includes making changes in your code when their concerns are valid
If you are on the PMC, make sure you review and promote deserving contributors to committers and committers to PMC members regardless of their affiliation
Schedule time to make Apache releases and review outside contributions
Don’ts
Announce security holes (CVEs) that have been fixed in HDP but not in Apache
Put Hortonworks specific stuff into Apache projects
– Documentation that points to HDP docs
– Code that works only in an HDP release, not in the Apache release
Share Hortonworks’ customer information on Apache JIRAs, mailing lists, etc.
– This means removing customer application info and data from bug reports before posting them
Share Hortonworks’ corporate plans
Succumb to group think
– Others outside Hortonworks will have good ideas and constructive feedback
Wear your Hortonworks hat on the Apache lists
– Avoid things like “We at Hortonworks think...”
– The fact that you’re a manager or an architect or a founder here doesn’t mean anything in Apache
Tell other Hortonworks committers and PMC members how to vote
– You can ask them to review something and vote on it
Release Hortonworks releases with version numbers not yet released in Apache
– e.g. Hortonworks can’t release something it calls Hive 3.0 before Apache releases Hive 3.0
Trademarks
Trademark law requires us to honor Apache’s trademarks
Some of Apache’s project names are registered, some are not
– e.g. Apache Hadoop® is registered, Apache Pig™ isn’t
– All are claimed by Apache as trademarks
When referring to Apache projects in your documentation, slides, talks, etc. the first reference should include “Apache X”; after that it can just be “X”
We cannot say things that imply we own, drive, or control an Apache project; examples of what not to do:
– Hortonworks Hadoop
– Hortonworks, the masters of Hadoop
– Hortonworks, the Hadoop company
We can brag all we want about our contributions to the Apache projects
If you are on a PMC, Apache depends on you to enforce the trademarks for that project
– including (and from a HWX perspective especially) if you see Hortonworks infringing on the trademarks
See the Apache Trademark page for full details
Wait, what if...
We’ve built a FAQ on Apache issues to answer questions (internal link)
If you have questions about Apache, ask one of the many Apache members who work at Hortonworks:
– Alan Gates, Arun Murthy, Ashutosh Chauhan, Bikas Saha, Billie Rinaldi, Daniel Dai, Devaraj Das, Enis Soztutar, Hitesh Shah, Jitendra Pandey, Joe Witt, Josh Elser, Julian Hyde, Mahadev Konar, Nicholas Sze, Owen O’Malley, Siddarth Seth, Steve Loughran, Taylor Goetz, Thejas Nair, Vinod Vavilapalli
If you have questions about corporate policy, ask your manager