Feb 17, 2017
FAIL FAST, FAIL OFTEN
Gordon Haff @ghaff, Technology Evangelist
William Henry @ipbabble, DevOps Strategy Lead
13 July 2016
FAILURE
FAILURE
FAILURE
ALSO FAILURE
FAILURES HAVE CONSEQUENCES
THE INESCAPABLE CONCLUSION?
DON'T FAIL
DON'T FAIL
FAIL WELL
Experiment by Peter Skillman, former VP of design at Palm
WHAT HE LEARNED
Kindergarteners do not spend 15 minutes in a bunch of status transactions trying to figure out who is going to be CEO of Spaghetti Corporation.
They don't sit around talking about the problem. They just start building to determine what works and what doesn't.
SOFTWARE = GREAT MATCH FOR FAILING WELL
FIVE PRINCIPLES: THE RIGHT
scope
approach
workflow
incentives
culture
THE RIGHT SCOPE
Constrain the impact of failure
Enable experimentation
Stop cascading of failures
Make deployments incremental, frequent, and routine events
Generally decouple activities and decisions from each other
Small, autonomous, bounded context services
SMALL
Two pizza teams
Well-defined functional units
Organized around business capabilities (Conway's Law)
AUTONOMOUS
Implementation changes can happen independently of other services
Data and functionality exposed only through service calls over the network
Designed to be externalizable
No back-doors
THE RIGHT APPROACH
Continuously experiment, iterate, and improve
It's about the process
Identify mistakes early
Establish safety nets
Fail and move on
THE PROCESS
Involves people and communication
The most effective processes have continuous communication: think Scrum and Kanban
Allows for collaboration that can identify failures before they happen
Allows for feedback to continuously improve and cultivate growth
Provides transparency
DEV LESSONS: BREAKING CODE VIOLENTLY
Build in violent failures to highlight issues
C/C++ lessons:
Sanity check using assertions
Invariant checks
If ever I'm here in the code and these conditions aren't met, then I have no business being here. Something is wrong and I should fail violently.
Involves tracing through the failure
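The slide refers to C/C++ `assert`; the same idea can be sketched in Python. This is a minimal illustration, not code from the talk: the `Account` class, its `withdraw` method, and the non-negative-balance invariant are all hypothetical.

```python
class Account:
    """Hypothetical account, used only to illustrate invariant checks."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self) -> None:
        # Assumed invariant: the balance is never negative. If it is, the
        # program has no business continuing, so fail loudly right here.
        assert self.balance >= 0, f"invariant violated: balance={self.balance}"

    def withdraw(self, amount: int) -> None:
        # Sanity check on the caller: a non-positive amount means a bug
        # upstream, not a condition to recover from quietly.
        assert amount > 0, f"withdraw called with amount={amount}"
        self.balance -= amount
        self._check_invariant()
```

An uncaught `AssertionError` stops the program with a traceback pointing at the failed check, which is where tracing through the failure starts. Note that Python skips `assert` statements when run with `-O`, much as C disables `assert` when `NDEBUG` is defined.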
AUTOMATED REGRESSION TESTING
As products and services evolved, we discovered that maintaining tests and incrementally adding new ones was valuable
These tests were, and still are, most often based on failures and bugs actually experienced
Scripts were developed to run nightly builds against various developer changes to test for regression
Testing tools evolved - proprietary and open source
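A regression test of the kind described above might look like the following sketch, using Python's standard `unittest` module. The `parse_version` function and the bug it once had are hypothetical, invented only to show a test pinned to a past failure.

```python
import unittest
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Hypothetical function under test: turn '1.10.0' into (1, 10, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


class VersionRegressionTests(unittest.TestCase):
    def test_two_digit_component_compares_numerically(self):
        # Hypothetical past bug: versions were compared as strings,
        # so "1.10" sorted before "1.9". This test keeps it fixed.
        self.assertGreater(parse_version("1.10"), parse_version("1.9"))

    def test_surrounding_whitespace_is_ignored(self):
        # Hypothetical past bug: a trailing newline broke parsing.
        self.assertEqual(parse_version(" 2.0.1\n"), (2, 0, 1))


def run_regression_suite() -> bool:
    """Run the suite the way a nightly script might; True means no regressions."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(VersionRegressionTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A nightly job could call `run_regression_suite()` after each build and flag the build when it returns `False`.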
OPS LESSONS: CHAOS MONKEY
Test robustness of recovery using failure
Platform should provide uninterrupted services to the customer
Therefore:
It should always recover in an acceptable amount of time
We should have random failures to ensure that changes have not regressed or caused new recovery problems
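The idea of injecting random failures and checking recovery can be sketched as a toy simulation. This is not the Chaos Monkey tool itself: `ServicePool`, `chaos_run`, the one-restart-per-tick supervisor, and the tick-based recovery limit are all assumptions made for illustration.

```python
import random


class ServicePool:
    """Hypothetical pool of interchangeable service instances."""

    def __init__(self, size: int) -> None:
        self.desired = size
        self.healthy = set(range(size))

    def kill(self, instance: int) -> None:
        self.healthy.discard(instance)

    def is_serving(self) -> bool:
        # Assumption: service is uninterrupted while any instance is healthy.
        return len(self.healthy) > 0

    def recover_one(self) -> None:
        # Stand-in for a supervisor that restarts one missing instance per tick.
        missing = set(range(self.desired)) - self.healthy
        if missing:
            self.healthy.add(min(missing))


def chaos_run(pool: ServicePool, rounds: int, max_recovery_ticks: int,
              seed: int = 0) -> bool:
    """Kill a random instance each round; report whether service stayed up
    and the pool returned to full strength within the allowed ticks."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    for _ in range(rounds):
        victim = rng.choice(sorted(pool.healthy))
        pool.kill(victim)
        if not pool.is_serving():
            return False  # customer-visible outage
        ticks = 0
        while len(pool.healthy) < pool.desired:
            pool.recover_one()
            ticks += 1
            if ticks > max_recovery_ticks:
                return False  # recovery took too long
    return True
```

Running `chaos_run` after every change turns "recovery still works" into a check that can regress visibly rather than silently.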
http://understeer.hatenablog.com/entry/2012/02/29/224629
THE RIGHT WORKFLOW
Repeatably automate for consistency
Goal is repeatable automation
Toyota's yellow cord
Initially pipelines may be very different
Different tools
Traditional vs. cloud native
It's a journey
Consolidation evolves naturally
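One reading of the yellow cord in a software pipeline is that any stage can stop the line, and nothing downstream runs until the problem is fixed. The sketch below assumes that reading; `run_pipeline` and the stage callables are hypothetical and not tied to any particular CI/CD tool.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (overall_success, log). Later stages are never called
    once one stage has failed.
    """
    log: List[str] = []
    for name, step in stages:
        if not step():
            log.append(f"{name}: FAILED")
            return False, log
        log.append(f"{name}: ok")
    return True, log
```

Because the same ordered list of stages runs on every commit, the result is repeatable: a given change either passes every stage or stops at a named one.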
DESIRABLE ENTERPRISE CI/CD WORKFLOW
[Diagram labels: myRepo, ProjectRepo, CI, Commit, Push, Pass/Fail, Local Test, BuildRepo, CD, ReleaseRepo, Monitor, 3rd Party. Stages: Build, Test, Review/Approve, Deliver, Deploy]
CI/CD PIPELINE TOOLSET
CI/CD Workflow UI
Gerrit
OPS LESSONS: RED/GREEN
Configuration as code has built-in failure
[Diagram labels: Continuous Integration / Continuous Deployment; Image & Package & Metadata Repository; src repo; Dev./Build, QA, Production; in OHC; Events]
THE RIGHT INCENTIVES
Align rewards and behavior with desirable outcomes
Incentives (advancement, money, recognition) need to reward trust, cooperation, and innovation
Peer reward systems also valuable
Individual has control over their own success
But people still have responsibility for their actions
THE RIGHT CULTURE
Build systems and organizations that allow for failing well
Transparency
Even good decisions can have bad outcomes
Innovation is inherently risky
Cut losses (avoid the sunk cost fallacy)
This is why open source is so successful!
BUT CULTURE ISN'T SOMETHING YOU JUST CHANGE
Lack of an agreed-to model of what the right culture looks like
Different organizations require different behaviors
Culture change is difficult to measure and quantify
Culture is very hard to impose
Culture is an output, not an input
CULTURE IS:
emergent
pervasive
the keystone
THANK YOU
CREDITS
Tacoma Narrows Bridge: Barney Elliott; The Camera Shop - Screenshot taken from 16MM Kodachrome motion picture film by Barney Elliott.
Time cover: Time, Inc.
Wipeout, Flickr/CC: https://www.flickr.com/photos/andymorffew/15843725192
Marshmallow challenge: http://marshmallowchallenge.com/Welcome.html
Linux Collaboration Summit: Linux Foundation.
Two pizzas: Flickr/CC https://www.flickr.com/photos/dongkwan/283076601
Frog: Kathy CC/Flickr https://flic.kr/p/b9fFV
Square peg Flickr/CC: https://www.flickr.com/photos/epublicist/3546059144/