Engineering Velocity: Continuous Delivery at Netflix Dianne Marsh SATURN 2014

Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Oct 17, 2014

At Netflix, we realize that there’s a tension between the availability of our service and our speed of innovation. If we move slowly, we can be very available -- but that’s not a good business proposition. If we move super fast, we risk downtime -- and that might annoy our customers. But what if we could increase our velocity without significantly impacting availability? How can we shift that curve so that we’re moving faster without dropping any of those coveted 9’s? How can we engineer velocity by weaving together tooling and culture with software development to expose and elevate highly effective practices?

This talk describes various components of Netflix’s continuous delivery platform -- much of which is available in open source. I’ll show how these pieces fit together and allow us to build scaffolding so that we’re comfortable with software developers making the decision to push the button for prod deployment -- and that helps them to recover if necessary. As a result, we can run fast, trusting our tooling and our culture. I’ll also describe how we test our resiliency through simulating failure, unleashing the monkeys (Simian Army) on our production environment. Because if you’re afraid of cute little monkeys, imagine how afraid you’ll be of a production environment that offers those same risks but doesn’t give you an opportunity to test your response to those dangers.

Throughout this talk, I hope that you will challenge yourself to consider how your company can "shift the curve" through tooling and achieve a high-velocity environment without negatively impacting reliability.
Transcript
Page 1: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Engineering Velocity: Continuous Delivery at Netflix

Dianne Marsh SATURN 2014

Page 2: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

en-gi-neer-ing + ve-loc-i-ty: applying science and technology to designing and building speed into a system

Page 3: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Availability vs. Rate of Change

[Figure: availability (in 9’s, axis from 0 to 6) plotted against rate of change (axis ticks 0, 10, 100, 1000)]
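
The "9’s" on the availability axis can be read as a yearly downtime budget. A minimal sketch of that arithmetic, assuming a 365-day year and the usual reading of N nines as an availability of 1 - 10^-N (the function name is illustrative, not from the talk):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: ignore leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget, in minutes, for `nines` nines of availability.

    For example, nines=3 means 99.9% availability.
    """
    unavailability = 10 ** (-nines)  # fraction of the year the service may be down
    return unavailability * MINUTES_PER_YEAR


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes/year")
```

Under these assumptions, three nines leaves roughly 525.6 minutes (about 8.8 hours) per year, and each additional nine cuts the budget by a factor of ten.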

Page 4: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Shift the Curve

[Figure: availability (in 9’s, axis from 0 to 6) plotted against rate of change (axis ticks 0, 10, 100, 1000, 10000)]

Page 5: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

http://www.slideshare.net/reed2001/culture-1798664

Page 6: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Manager’s Role

Context, not Control

Loosely coupled, Tightly aligned

And hire well!

Page 7: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Get out of the Way

Freedom to Innovate

Page 8: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Support Experimentation

How We Built a Predictive Autoscaling Engine

http://techblog.netflix.com/2013/11/scryer-netflixs-predictive-auto-scaling.html

Page 9: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Support Independent Paths of Exploration

Don’t Prematurely Optimize!

Page 10: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Blameless Culture

Page 11: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Developers Deploy Their Code

Run What You Wrote

• Rapid Innovation

• Rapid Detection

• Rapid Response

= Freedom + Responsibility

Page 12: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Support with Tools

Page 13: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Jenkins Job DSL

Configuration as Code

Groovy Script

Scripts go in Version Control

http://www.slideshare.net/quidryan/configuration-as-code

Page 14: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Aminator

Create AMI from Base AMI

Image contains service and everything needed to run it

Unit of Deployment for Test and Prod

Abstracts Cloud Details

http://techblog.netflix.com/2013/03/ami-creation-with-aminator.html

Page 15: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Asgard

Deploys Netflix to the Cloud

Red/Black push

Developed to address delays in rollback

http://www.infoq.com/presentations/asgard

Page 16: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Red/Black Push

• Scale up new instances

• Run canary analysis

• Turn on traffic to new ASG

• Turn off traffic to old ASG

• Wait … analyze … continue
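
The steps above can be sketched as a small simulation. This is not Asgard's code or API: the `Group` class and the two check callbacks are hypothetical stand-ins, and the sketch assumes the old ASG's instances stay up after cutover so that rollback is only a traffic switch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Group:
    """Hypothetical stand-in for an auto scaling group (ASG)."""
    name: str
    instances: int = 0
    serving: bool = False


def red_black_push(
    old: Group,
    new: Group,
    canary_ok: Callable[[Group], bool],
    healthy_after_cutover: Callable[[Group], bool],
) -> Group:
    """Run a simulated red/black push and return the group left serving traffic."""
    new.instances = old.instances       # scale up new instances
    if not canary_ok(new):              # run canary analysis
        new.instances = 0               # abandon the push; old never stopped serving
        return old
    new.serving = True                  # turn on traffic to new ASG
    old.serving = False                 # turn off traffic to old ASG (instances kept)
    if not healthy_after_cutover(new):  # wait ... analyze
        old.serving = True              # rollback: switch traffic back
        new.serving = False
        return old
    return new                          # continue; old group can be retired later


if __name__ == "__main__":
    old = Group("service-v001", instances=3, serving=True)
    new = Group("service-v002")
    winner = red_black_push(old, new, lambda g: True, lambda g: True)
    print(f"serving: {winner.name}")
```

Because the old group is only taken out of traffic rather than torn down, a bad push is undone without waiting for instances to start.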

Page 17: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Workflow

Continuous Delivery Engine

Judges between Stages

Represent Best Practices

http://techblog.netflix.com/2013/09/glisten-groovy-way-to-use-amazons.html
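
One way to picture "judges between stages" is a pipeline that only advances when a check approves the stage that just ran. The sketch below is a generic illustration under that assumption; it does not use Glisten or Amazon's workflow service, and all names are hypothetical.

```python
from typing import Callable, List, Tuple

Stage = Callable[[], None]  # does the work of one stage
Judge = Callable[[], bool]  # decides whether the pipeline may move on


def run_pipeline(steps: List[Tuple[str, Stage, Judge]]) -> List[str]:
    """Run stages in order; stop at the first stage whose judge rejects it.

    Returns the names of the stages that ran and were approved.
    """
    approved = []
    for name, stage, judge in steps:
        stage()
        if not judge():
            break
        approved.append(name)
    return approved


if __name__ == "__main__":
    steps = [
        ("build", lambda: None, lambda: True),
        ("canary", lambda: None, lambda: False),  # judge rejects the canary
        ("prod", lambda: None, lambda: True),
    ]
    print(run_pipeline(steps))
```

Encoding the judge next to each stage is one way for tooling to carry a team's best practices instead of relying on each developer to remember them.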

Page 18: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

One Click Deployment?

Page 19: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Regional Isolation

Limit Impact of Human Error

• Stagger Deployments?

• Canary Testing per Region?

Know your Service!

Page 20: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Multi-Region Consistency

Build Tooling to:

• Schedule Deployments

• Prefer Off-Peak

• Choose Next Available Region

• Provide Visibility by Region
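
A minimal sketch of how tooling might "prefer off-peak" and "choose next available region". The peak windows below are invented placeholders, not Netflix traffic data, and the function names are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical peak windows as (start_hour, end_hour) in UTC; end is exclusive.
PEAK_HOURS_UTC: Dict[str, Tuple[int, int]] = {
    "us-east-1": (22, 4),   # window wraps past midnight
    "us-west-2": (1, 7),
    "eu-west-1": (17, 23),
}


def is_peak(region: str, hour_utc: int) -> bool:
    """True if `hour_utc` falls inside the region's peak window."""
    start, end = PEAK_HOURS_UTC[region]
    if start <= end:
        return start <= hour_utc < end
    return hour_utc >= start or hour_utc < end  # window crosses midnight


def next_available_region(hour_utc: int, already_deployed: Set[str]) -> Optional[str]:
    """Pick the first region that is off-peak and has not been deployed to yet."""
    for region in PEAK_HOURS_UTC:
        if region not in already_deployed and not is_peak(region, hour_utc):
            return region
    return None  # nothing eligible right now; try again later


if __name__ == "__main__":
    print(next_available_region(hour_utc=2, already_deployed=set()))
```

Deploying one region at a time in this way also staggers the rollout, which limits how far a mistake can spread before it is noticed.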

Page 21: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Simian Army

• Chaos Monkey

• Latency Monkey

• Conformity Monkey

• Janitor Monkey (and more)

http://www.infoq.com/presentations/netflix-resiliency-failure-cloud

Page 22: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Chaos Monkey

Kills Running Instances

• Simulates failures inherent to running in the cloud

• In Production
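
The idea can be illustrated against a simulated fleet. This is not the Simian Army implementation: the fleet is a plain dictionary, "terminating" just removes an ID from a list, and the probability parameter is an assumption made for the sketch.

```python
import random
from typing import Dict, List


def unleash(groups: Dict[str, List[str]], probability: float,
            rng: random.Random) -> List[str]:
    """Terminate at most one random instance per group; return the victims.

    `groups` maps a group name to its running instance IDs and is modified
    in place, standing in for a real cloud API call.
    """
    victims = []
    for instances in groups.values():
        if instances and rng.random() < probability:
            victim = rng.choice(instances)
            instances.remove(victim)  # simulated termination
            victims.append(victim)
    return victims


if __name__ == "__main__":
    fleet = {"api": ["i-001", "i-002", "i-003"], "web": ["i-101", "i-102"]}
    killed = unleash(fleet, probability=0.5, rng=random.Random())
    print(f"terminated: {killed}")
```

A service that keeps working while this runs has shown it tolerates the loss of any single instance, which is the failure the cloud will eventually cause anyway.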

Page 23: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Latency Monkey

Introduces Latency between services
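
As an illustration only, injected latency can be modeled as a wrapper that delays a call to another service by a random amount. The decorator below is a hypothetical sketch, not how Latency Monkey itself is built; the `sleep` parameter exists so the delay can be replaced in tests.

```python
import functools
import random
import time


def with_injected_latency(max_delay_s: float, rng=random, sleep=time.sleep):
    """Decorator that delays each call by a random time in [0, max_delay_s]."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            sleep(rng.uniform(0, max_delay_s))  # simulate a slow dependency
            return func(*args, **kwargs)
        return wrapper
    return decorator


@with_injected_latency(max_delay_s=0.2)
def fetch_recommendations(user_id: str) -> list:
    """Hypothetical call to a downstream service."""
    return [f"title-for-{user_id}"]


if __name__ == "__main__":
    started = time.monotonic()
    print(fetch_recommendations("user-1"))
    print(f"took {time.monotonic() - started:.3f}s")
```

Callers that set timeouts and fall back gracefully should behave sensibly under this wrapper; callers that assume an instant reply will not.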

Page 24: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Conformity Monkey

Have Deployments Diverged?

• Balance Regional Consistency with Regional Isolation

• Build Best Practices into Tooling and Reporting
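
The question "have deployments diverged?" can be sketched as a comparison of what each region is running. This covers only that one check and is not Conformity Monkey's rule set; the region names, version strings, and majority rule are assumptions for the example.

```python
from collections import Counter
from typing import Dict


def diverged_regions(deployed: Dict[str, str]) -> Dict[str, str]:
    """Return the regions whose version differs from the most common one."""
    if not deployed:
        return {}
    majority_version, _ = Counter(deployed.values()).most_common(1)[0]
    return {
        region: version
        for region, version in deployed.items()
        if version != majority_version
    }


if __name__ == "__main__":
    current = {"us-east-1": "1.4.2", "us-west-2": "1.4.2", "eu-west-1": "1.4.1"}
    print(diverged_regions(current))
```

A report like this gives visibility by region without forcing every region to change at once, which is the balance between consistency and isolation that the slide describes.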

Page 25: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Janitor Monkey

Reduce Cognitive Load and Cost

• Remove unused instances

• Uniform way to clean up
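
A minimal sketch of a uniform cleanup rule, assuming an instance counts as unused once it is out of service and has been idle longer than a retention period. The `Instance` fields and the seven-day default are hypothetical, not Janitor Monkey's actual rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Instance:
    """Hypothetical record describing one cloud instance."""
    instance_id: str
    in_use: bool
    last_used: datetime


def cleanup_candidates(instances: List[Instance], now: datetime,
                       retention: timedelta = timedelta(days=7)) -> List[str]:
    """Return IDs of instances that are unused and idle past the retention period."""
    return [
        inst.instance_id
        for inst in instances
        if not inst.in_use and now - inst.last_used > retention
    ]


if __name__ == "__main__":
    now = datetime(2014, 5, 1)
    fleet = [
        Instance("i-old", in_use=False, last_used=datetime(2014, 4, 1)),
        Instance("i-recent", in_use=False, last_used=datetime(2014, 4, 29)),
        Instance("i-live", in_use=True, last_used=datetime(2014, 3, 1)),
    ]
    print(cleanup_candidates(fleet, now))
```

Applying one rule everywhere means nobody has to remember which leftovers are safe to delete, which is where the reduction in cognitive load comes from.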

Page 26: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Shifting the Curve with Tooling

• Value Self-Service

• Test Everywhere

• Awareness of Multiple Regions

• Best Practices Represented in Tooling

• Recover Quickly and Easily

• Be Cloud Native

Page 27: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Shifting the Curve with Culture

• Context not Control

• Freedom to Experiment

• Blameless Culture

Page 28: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

ArsTechnica, November 2012

“As the number of applications and the scale of the campaign's AWS infrastructure use climbed, the DevOps team shifted to using Asgard -- an open-source tool developed by Netflix to manage cloud deployments.”

Page 29: Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

Thanks!

Dianne Marsh (@dmarsh)

[email protected]