Principles and Practices in Continuous Deployment Mike Brittain Engineering Director, Etsy @mikebrittain mikebrittain.com/talks

Principles and Practices in Continuous Deployment at Etsy

Aug 23, 2014


Engineering

Mike Brittain

Presented at ALM Forum 2014.

Like what you've read? We're frequently hiring for a variety of engineering roles at Etsy. If you're interested, drop me a line or send me your resume: [email protected].

http://www.etsy.com/careers
Transcript
Page 1: Principles and Practices in Continuous Deployment at Etsy

Principles and Practices in Continuous Deployment

Mike Brittain

Engineering Director, Etsy

@mikebrittain mikebrittain.com/talks

Page 2: Principles and Practices in Continuous Deployment at Etsy
Page 3: Principles and Practices in Continuous Deployment at Etsy

“Continuous Deployment”

The process by which our team deploys software changes to production services over 30 times per day.

Page 4: Principles and Practices in Continuous Deployment at Etsy

Where we started

Principles for our Engineering team

Continuous Deployment

Business case

Page 5: Principles and Practices in Continuous Deployment at Etsy

Five years ago…

2-3 weeks of code changes

Release and rollback plans

Traffic and infrastructure management (Ops)

6-14 hours

Page 6: Principles and Practices in Continuous Deployment at Etsy

Five years ago…

“Deployment Army”

Stressful, especially when things go wrong

Long days and late nights

Scheduled downtime

Page 7: Principles and Practices in Continuous Deployment at Etsy

pro·duc·tion [pruh-duhk-shuhn] (n)

1. This complex system of application code, distributed services, servers, networking gear, etc., upon which we’re going to try to carefully apply a complicated set of changes and hope that nothing goes wrong. Cross your fingers… here goes.

Page 8: Principles and Practices in Continuous Deployment at Etsy

Software for large-scale web sites has been traditionally written by one group of people, then released and operated by a different group.

These two groups have very different levels of visibility into how the software works.

Page 9: Principles and Practices in Continuous Deployment at Etsy

Stagnation

Page 10: Principles and Practices in Continuous Deployment at Etsy

“…frequent and prolonged outages.”

- 2010 Capacity Plan

Page 11: Principles and Practices in Continuous Deployment at Etsy

First, Principles.

Page 12: Principles and Practices in Continuous Deployment at Etsy

Innovate or die

Page 13: Principles and Practices in Continuous Deployment at Etsy

Innovate or die

Resolve scaling hurdles

Page 14: Principles and Practices in Continuous Deployment at Etsy

Innovate or die

Resolve scaling hurdles

Mean-time-to-recovery

Page 15: Principles and Practices in Continuous Deployment at Etsy

“Quality is not just testing pre-release. It also includes our adaptability and response time.”

- Jeff Sussna at ALM Forum, 2014

Page 16: Principles and Practices in Continuous Deployment at Etsy

Innovate or die

Resolve scaling hurdles

Mean-time-to-recovery

Page 17: Principles and Practices in Continuous Deployment at Etsy

Innovate or die

Resolve scaling hurdles

Mean-time-to-recovery

Healthy and talented engineering team

Page 18: Principles and Practices in Continuous Deployment at Etsy

Autonomy, Mastery, Purpose.

“Drive: The surprising truth about what motivates us.” ~Dan Pink, at RSA http://youtu.be/u6XAPnuFjJc

Page 19: Principles and Practices in Continuous Deployment at Etsy

Innovate or die

Resolve scaling hurdles

Mean-time-to-recovery

Healthy and talented engineering team

Stop stressing about releases

Page 20: Principles and Practices in Continuous Deployment at Etsy

First, Principles.

Page 21: Principles and Practices in Continuous Deployment at Etsy

http://timothyfitz.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day/

In a software release process Fail Fast means releasing undeployed code as fast as possible, instead of waiting for a weekly release to break.

Page 22: Principles and Practices in Continuous Deployment at Etsy

http://youtu.be/LdOe18KhtT4

Page 23: Principles and Practices in Continuous Deployment at Etsy

SECRET WEAPON: Hired as VP, Tech-Ops at Etsy

Page 24: Principles and Practices in Continuous Deployment at Etsy

Continuous Deployment ~ vs ~ Continuous Delivery

Page 25: Principles and Practices in Continuous Deployment at Etsy

Frequent check-ins directly to mainline.

Continuous Deployment: ✓; Continuous Delivery: ✓

Page 26: Principles and Practices in Continuous Deployment at Etsy

Continuous Integration and Automated tests.

Continuous Deployment: ✓; Continuous Delivery: ✓

Page 27: Principles and Practices in Continuous Deployment at Etsy

Keep the build green. We’re always ready to release.

Continuous Deployment: ✓; Continuous Delivery: ✓

Page 28: Principles and Practices in Continuous Deployment at Etsy

“One button” deploys.

Continuous Deployment: ✓; Continuous Delivery: ✓

Page 29: Principles and Practices in Continuous Deployment at Etsy
Page 30: Principles and Practices in Continuous Deployment at Etsy

Business dictates when a build is deployed.

Continuous Deployment: (blank); Continuous Delivery: ✓

Page 31: Principles and Practices in Continuous Deployment at Etsy

Every passing build is deployed to production.

Continuous Deployment: ✓; Continuous Delivery: (blank)

Page 32: Principles and Practices in Continuous Deployment at Etsy

All enhancements are gated by Config Flags. (“Branch in code”)

Continuous Deployment: ✓; Continuous Delivery: ?

Page 33: Principles and Practices in Continuous Deployment at Etsy

Most of the builds we deploy are “dark” changes.

Page 34: Principles and Practices in Continuous Deployment at Etsy

CSS rules and properties

Copy in templates (e.g. typos)

New, un-referenced code (e.g. classes, funcs, templates)

Code paths behind disabled config flags

etc…
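A minimal sketch of a “dark” change gated by a config flag, as one way to picture the “branch in code” idea. The flag table, `is_enabled`, and `render_checkout_button` are hypothetical names for illustration (the flag names are borrowed from a later slide); this is not Etsy's actual configuration API.

```python
# Hypothetical flag table; the structure is an assumption for illustration.
CONFIG_FLAGS = {
    "checkout_blue_button": {"enabled": False},
    "listing_css_refactor": {"enabled": True},
}


def is_enabled(flag_name, flags=CONFIG_FLAGS):
    """Return True only for flags that exist and are switched on."""
    return bool(flags.get(flag_name, {}).get("enabled", False))


def render_checkout_button(flags=CONFIG_FLAGS):
    """Render the checkout button, branching in code on the flag."""
    # The new code path is deployed to production but stays dark
    # until the flag is enabled.
    if is_enabled("checkout_blue_button", flags):
        return "<button class='blue'>Check out</button>"
    return "<button>Check out</button>"
```

With the flag disabled, deploying this code changes nothing for visitors; flipping the flag is a separate, small config change.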

Page 35: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Feedback

Source: http://en.wikipedia.org/wiki/Continuous_delivery

Page 36: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Feedback (repeated for each check-in)

Source: http://en.wikipedia.org/wiki/Continuous_delivery

Page 37: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Feedback, Approval (repeated for each check-in)

Source: http://en.wikipedia.org/wiki/Continuous_delivery

Page 38: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Feedback, Approval (repeated for each check-in)

Environments: Dev / Integration, Staging, Production

Page 39: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Feedback, Approval (repeated for each check-in)

Environments: Dev / Integration, Staging, Production

Assumptions:

Staging is a perfect reflection of Production, with respect to hardware, configurations, data, overall load, capacity, etc.

Deploy process is infallible.

Page 40: Principles and Practices in Continuous Deployment at Etsy

“What do you mean, ‘it’s not working in production?’ I TESTED IT BEFORE WE RELEASED!”

Page 41: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Approval (repeated for six check-ins)

Environments: Dev / Integration, Staging, Production

Page 42: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Approval (repeated for six check-ins), plus one Feedback arrow

Environments: Dev / Integration, Staging, Production

Page 43: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Approval (repeated for six check-ins), plus one Feedback arrow

Environments: Dev / Integration, Staging, Production

“Because you’re integrating so frequently, there is significantly less back-tracking to discover where things went wrong, so you can spend more time building features.”

- ThoughtWorks

http://www.thoughtworks.com/continuous-integration

Page 44: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Approval (repeated for six check-ins), plus one Feedback arrow

Environments: Dev / Integration, Staging, Production

Where’s the bug?

In one of the numerous check-ins?

Missing unit tests?

Missing automated UA tests?

Missing manual UA tests?

Page 45: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Approval (repeated for six check-ins), plus one Feedback arrow

Environments: Dev / Integration, Staging, Production

Where’s the bug?

In one of the numerous check-ins?

Missing unit tests?

Missing automated UA tests?

Missing manual UA tests?

Data out of sync?

Server configurations out of sync?

Capacity vs. current load?

Deployment script?

Page 46: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Delivery release pipeline]

Stages: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Release

Arrows: Check in, Trigger, Approval (repeated for each check-in), plus one Feedback arrow

Environments: Dev / Integration, Staging, Production

How will we know when something is wrong in production?

How long will it take to resolve the issue?

Page 47: Principles and Practices in Continuous Deployment at Etsy

We aim to reduce fundamental surprise in every release.

Page 48: Principles and Practices in Continuous Deployment at Etsy

Furthermore, we optimize for detecting and recovering from failures quickly.

Page 49: Principles and Practices in Continuous Deployment at Etsy

Pre-production validation

Code deployed to de-pooled application (web) servers touching prod services and databases.

Smoke tests

Integration tests

Functional tests

User-Acceptance (ad hoc)
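A minimal sketch of what a smoke test against a de-pooled web server could look like, assuming plain HTTP checks are enough. The function names, the example hostname, and the paths are assumptions for illustration and are not taken from the talk.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; keep the code.
        return err.code
    except (URLError, OSError):
        return None


def run_smoke_tests(base_url, paths, fetch_status=http_status):
    """Request each path and return a list of (path, status) failures."""
    failures = []
    for path in paths:
        status = fetch_status(base_url + path)
        if status != 200:
            failures.append((path, status))
    return failures
```

A call such as `run_smoke_tests("http://preprod.example.com", ["/", "/search"])` (hypothetical host) returns an empty list when every page answers 200. Passing `fetch_status` as a parameter keeps the check testable without network access.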

Page 50: Principles and Practices in Continuous Deployment at Etsy

Production validation

Exactly the same server configs, services and data as pre-prod, but this is where we introduce application code to live traffic.

Page 51: Principles and Practices in Continuous Deployment at Etsy

Production validation

Exactly the same server configs, services and data as pre-prod, but this is where we introduce application code to live traffic.

Smoke tests (esp. over public hostnames)

User-Acceptance testing behind config flags

Gratuitous monitoring

Customer support and forums

Page 52: Principles and Practices in Continuous Deployment at Etsy

Imagine that we’ll write 50K LOC/month

Single release

- Few opportunities for failure

- Wide surface area (50,000 LOC)

- High MTTR

- All of the bugs we’ve written

Many releases

- More opportunities for failure

- Narrow surface area (< 100 LOC)

- Low MTTR

- A fraction of the bugs we’ve written per release
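A rough arithmetic check of the figures on this slide, as a sketch. The 50K LOC/month and the 30 deploys per day come from the talk; the 20 working days per month is an assumption made only for this calculation.

```python
LOC_PER_MONTH = 50_000        # from the slide: 50K LOC/month
DEPLOYS_PER_DAY = 30          # from the talk: "over 30 times per day"
WORKING_DAYS_PER_MONTH = 20   # assumption, not stated in the talk

# Number of releases the monthly code volume is spread across.
deploys_per_month = DEPLOYS_PER_DAY * WORKING_DAYS_PER_MONTH

# Average size of a single release under these assumptions.
average_loc_per_deploy = LOC_PER_MONTH / deploys_per_month

print(deploys_per_month)                  # 600
print(round(average_loc_per_deploy, 1))   # 83.3
```

Under these assumptions the average deploy carries roughly 83 lines, which is consistent with the "< 100 LOC" surface area quoted for many releases.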

Page 53: Principles and Practices in Continuous Deployment at Etsy

Monitoring

Page 54: Principles and Practices in Continuous Deployment at Etsy

Monitoring

PHP Warnings

Bug Reports and Help Requests

Page 55: Principles and Practices in Continuous Deployment at Etsy

Deploy logs
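One way deploy logs and monitoring data might be combined is to compare warning counts just before and just after each deploy. The sketch below assumes timestamps in epoch seconds; the function names, the five-minute window, and the 2x threshold are all hypothetical choices, not details from the talk.

```python
def warnings_around_deploy(deploy_time, warning_times, window=300):
    """Count warnings in the window before and the window after a deploy."""
    before = sum(
        1 for t in warning_times if deploy_time - window <= t < deploy_time
    )
    after = sum(
        1 for t in warning_times if deploy_time <= t < deploy_time + window
    )
    return before, after


def looks_suspicious(deploy_time, warning_times, window=300, ratio=2.0):
    """Flag a deploy when warnings after it exceed ratio times the baseline."""
    before, after = warnings_around_deploy(deploy_time, warning_times, window)
    # Use a floor of 1 so that a quiet baseline does not flag a single warning.
    return after > ratio * max(before, 1)
```

Because each deploy is small, a flagged deploy points at a short list of changes to inspect or revert.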

Page 56: Principles and Practices in Continuous Deployment at Etsy

Post-Mortems

Page 57: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Deployment release pipeline]

Stage labels: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Smoke Tests, Release, Deploy (Prod), Monitoring and Automated Alerts

Arrows: Check in, Trigger, Feedback, Approval

Page 58: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Deployment release pipeline]

Stage labels: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Smoke Tests, Release, Deploy (Prod), Monitoring and Automated Alerts

Arrows: Check in, Trigger, Feedback, Approval

Environments: Dev, Pre-Production (“Princess”), Production

CI

Page 59: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Deployment release pipeline]

Stage labels: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Smoke Tests, Release, Deploy (Prod), Monitoring and Automated Alerts

Arrows: Check in, Trigger, Feedback, Approval (additional Approval and Feedback arrows added)

Environments: Dev, Pre-Production (“Princess”), Production

CI

Page 60: Principles and Practices in Continuous Deployment at Etsy

[Diagram: Continuous Deployment release pipeline]

Stage labels: Dev Team, Version Control, Build & Unit Tests, Automated Acceptance Tests, User Acceptance Tests, Smoke Tests, Release, Deploy (Prod), Monitoring and Automated Alerts

Arrows: Check in, Trigger, Feedback, Approval (additional Approval and Feedback arrows added)

Environments: Dev, Pre-Production (“Princess”), Production

CI

Page 61: Principles and Practices in Continuous Deployment at Etsy

“Allow buttons properly to inherit color from their parent node.”

Page 62: Principles and Practices in Continuous Deployment at Etsy
Page 63: Principles and Practices in Continuous Deployment at Etsy
Page 64: Principles and Practices in Continuous Deployment at Etsy

Five years ago…

2-3 weeks of code changes

Release and rollback plans

Traffic and infrastructure management (Ops)

6-14 hours

Page 65: Principles and Practices in Continuous Deployment at Etsy

Five years ago…

“Deployment Army”

Stressful, especially when things go wrong

Long days and late nights

Scheduled downtime

Page 66: Principles and Practices in Continuous Deployment at Etsy

Why do we do this?

Page 67: Principles and Practices in Continuous Deployment at Etsy

Innovate or die.

Resolve scaling hurdles.

Mean-time-to-recovery.

Healthy and talented engineering team.

Stop stressing about releases.

Page 68: Principles and Practices in Continuous Deployment at Etsy

Innovate or die.

Resolve scaling hurdles.

Mean-time-to-recovery.

Healthy and talented engineering team.

Stop stressing about releases.

Page 69: Principles and Practices in Continuous Deployment at Etsy

Admin-launch and whitelist

Ramp-up public traffic
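A minimal sketch of how an admin launch, a whitelist, and a percentage ramp-up could be combined in a single flag check. The flag fields (`admin`, `whitelist`, `percent`), the user fields, and the hashing scheme are assumptions for illustration, not Etsy's implementation.

```python
import hashlib


def bucket(user_id, flag_name):
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def flag_enabled_for(user, flag):
    """Decide whether a flag is on for this user."""
    # Admin launch: staff see the feature first.
    if flag.get("admin") and user.get("is_admin"):
        return True
    # Whitelist: explicitly named users see it next.
    if user["id"] in flag.get("whitelist", ()):
        return True
    # Ramp-up: a growing percentage of public traffic.
    return bucket(user["id"], flag["name"]) < flag.get("percent", 0)
```

Hashing the flag name together with the user id keeps each user's experience stable across requests while giving different flags independent buckets. Raising `percent` from 0 towards 100 ramps up public traffic without another code deploy.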

Page 70: Principles and Practices in Continuous Deployment at Etsy

mainline

header_redesign

search_filter_custom_orders

checkout_blue_button

listing_css_refactor

Page 71: Principles and Practices in Continuous Deployment at Etsy

www.etsy.com

beta01.etsy.com

beta02.etsy.com

beta03.etsy.com

Page 72: Principles and Practices in Continuous Deployment at Etsy

www.etsy.com

beta01.etsy.com

beta02.etsy.com

beta03.etsy.com

Page 73: Principles and Practices in Continuous Deployment at Etsy

Analytics connected to config names

date":"30\/Mar\/2014:12:49:48","locale_currency_code":"USD","pref_language":"en-US","region":"US","detected_currency_code":"USD","detected_language":"en-US","detected_region":"US","accept-languages":"en-US","cdn-provider":"","isMobileDevice":"0","isMobileSupported":"0","isMobileRequestIgnoreCookie":"0","isTabletSupported":"0","isTouch":"0","isEtsyApp":"0","isPreviewRequest":"0","isChromeInstantRequest":"0","isMozPrefetchRequest":"0","listing_ids":[104073511,130604774,159651433,155451607,160523743,124025232,95186610,82967340,114692884,114767467,117266897,157579748],"scheduled_modules_content_ids":[10808052776,10256029946],"primary_event":"1",".event_source":"web",".event_logger":"frontend","php_ab_test_names":"translation_profiler.profiling;translation_profiler.logging;translation_profiler.backend_event_logging;footer_redesign_20131201;international.languages.el;international.languages.ja;international.languages.no;international.languages.pl;international.languages.ro;international.languages.tr;simplified_locale_experience;full_site_ssl;admin_toolbar;enabled_locale_subdirectories;affiliates.publishing.user_publishers;buyer_invites_recipients;home_improvement;home_improvement.new_homepage;authoritative_items;refactored_footer;conversations.rejuvination;contextual_homepage_recs.global;css_from_www;shrinkray.css;csrf_nonce_refactor.allow_colon;csrf_nonce_refactor.reverse_order;csrf_nonce_refactor.no_enc
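A minimal sketch of how analytics events could be tied back to config names, assuming each event is a parsed JSON object with a semicolon-separated `php_ab_test_names` field as in the excerpt above. The helper names are hypothetical.

```python
from collections import Counter


def config_names(event):
    """Return the list of config names recorded on one event."""
    raw = event.get("php_ab_test_names", "")
    return [name for name in raw.split(";") if name]


def events_per_config(events):
    """Count how many events were logged with each config name active."""
    counts = Counter()
    for event in events:
        # A set guards against a name being listed twice on one event.
        counts.update(set(config_names(event)))
    return counts
```

Grouping events this way lets any metric, such as page views or purchases, be split by the flags that were active when the event was logged.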

Page 74: Principles and Practices in Continuous Deployment at Etsy

Catapult

Page 75: Principles and Practices in Continuous Deployment at Etsy

Observed impact

Time series data for duration of the experiment

Page 76: Principles and Practices in Continuous Deployment at Etsy

Observed impact

Time series data for duration of the experiment

Page 77: Principles and Practices in Continuous Deployment at Etsy

Frank

Product Manager

Page 78: Principles and Practices in Continuous Deployment at Etsy

“I want to find out whether buyers will favor a single price for the product that includes shipping.”

https://www.etsy.com/shop/lucra

Page 79: Principles and Practices in Continuous Deployment at Etsy

Eligibility requirements:

- Must be first page of visit

- Buyer & seller in same region

- etc…
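The two stated requirements can be sketched as a small eligibility check of the kind that might live in a controller. The field names (`page_number`, `region`) are hypothetical, and the requirements elided by "etc…" are deliberately not guessed at.

```python
def is_eligible(visit, buyer, seller):
    """Return True when a page view qualifies for the experiment.

    Only the two requirements listed on the slide are checked.
    """
    # Must be the first page of the visit.
    first_page = visit["page_number"] == 1
    # Buyer and seller must be in the same region.
    same_region = buyer["region"] == seller["region"]
    return first_page and same_region
```

Visitors who fail the check simply see the existing experience, so the experiment code stays confined to a single branch.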

Page 80: Principles and Practices in Continuous Deployment at Etsy

Time: < 8 hours

Staff: One

Design, config flag (disabled), eligibility code in controller, template code, CSS, code review, automated tests, deployed code, config flag enabled.

Page 81: Principles and Practices in Continuous Deployment at Etsy

We do not bundle the item price and shipping cost together today.

https://www.etsy.com/shop/lucra

Page 82: Principles and Practices in Continuous Deployment at Etsy
Page 83: Principles and Practices in Continuous Deployment at Etsy

Ambitious Product Goal

Page 84: Principles and Practices in Continuous Deployment at Etsy

Ambitious Product Goal

Monolithic: Building and measuring many things at once.

Page 85: Principles and Practices in Continuous Deployment at Etsy

Ambitious Product Goal

Monolithic: Building and measuring many things at once.

Iterative: One thing at a time, our design goal is always in sight.

Page 86: Principles and Practices in Continuous Deployment at Etsy

Time: < 8 hours

Staff: One

Design, config flag (disabled), eligibility code in controller, template code, CSS, code review, automated tests, deployed code, config flag enabled.

Deployed Deployed

Page 87: Principles and Practices in Continuous Deployment at Etsy

Is this for me?

Page 88: Principles and Practices in Continuous Deployment at Etsy

http://timothyfitz.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day/

“Maybe this is just viable for a single developer … your site will be down. A lot.”

Page 89: Principles and Practices in Continuous Deployment at Etsy

etsystatus.com

Page 90: Principles and Practices in Continuous Deployment at Etsy

@mikebrittain

[Chart: Deployments per day, from the very end of 2009 to today. Series: app code, config files]

Page 91: Principles and Practices in Continuous Deployment at Etsy

$1.35 Billion: Goods sold in 2013

60+ Million: Unique visitors per month

175+ Committers, everyone deploys

http://www.etsy.com/blog/news/2013/etsy-statistics-december-2012-weather-report/

Items by anjaysdesigns, betwixxt, OneStarLeatherGoods, mediumcontrol, TheDesignPallet

Page 92: Principles and Practices in Continuous Deployment at Etsy
Page 93: Principles and Practices in Continuous Deployment at Etsy

Thank you.

Mike Brittain

Engineering Director, Etsy

@mikebrittain mikebrittain.com/talks