Production Ready Services at Netflix

Feb 16, 2017

Jonah Horowitz
Transcript
Page 1: Production Ready Services at Netflix

@jonahhorowitz

Production Ready Services

Increasing Microservice Availability at Netflix

Page 2: Production Ready Services at Netflix

Netflix Architecture

Page 4: Production Ready Services at Netflix

Determining Microservice Availability

[Diagram: Service A and Service B, with a question mark]
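
One way to put a number on the question this slide poses is to count how many calls from one service to another succeed. The sketch below is an illustration only, not the method used at Netflix: the `CallStats` shape, the `availability` function, and the sample counts are all hypothetical, and it assumes availability is measured as the success ratio seen by the caller.

```python
from dataclasses import dataclass


@dataclass
class CallStats:
    """Request counts for calls from one service to another (hypothetical shape)."""
    caller: str
    callee: str
    total: int
    failed: int


def availability(stats: CallStats) -> float:
    """Return the fraction of calls that succeeded, as seen by the caller."""
    if stats.total == 0:
        # Assumption: a window with no traffic counts as fully available.
        return 1.0
    return (stats.total - stats.failed) / stats.total


# Made-up sample numbers, for illustration only.
stats = CallStats(caller="Service A", callee="Service B", total=10_000, failed=25)
print(f"{stats.callee} as seen by {stats.caller}: {availability(stats):.4%}")
```

Measuring from the caller's side captures failures the callee never sees, such as timeouts and connection errors, which is one reason a per-dependency view can differ from a service's own metrics.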

Page 5: Production Ready Services at Netflix

Microservice Availability Report

Page 6: Production Ready Services at Netflix

Microservice Availability Report

Page 7: Production Ready Services at Netflix

Production Ready

Page 11: Production Ready Services at Netflix

Characteristics of a Production Ready Service

● Reliable
    ○ Chaos Monkey Enabled
    ○ Automated Deployment Pipelines
    ○ Automated Canary Analysis
● Scalable
    ○ Proactive autoscaling
    ○ Reactive autoscaling
● Performant
    ○ Automated Canary Analysis
    ○ Proper Instance Type
    ○ Well-Tuned GC
    ○ Well-Tuned connections, threads (ezconfig)
● Monitored
    ○ High Quality Alerts
        ■ Upstream Failure %
        ■ Downstream Failure %
    ○ Dashboards
        ■ Monitoring Releases
        ■ Troubleshooting
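
The failure-percentage alerts listed under "Monitored" could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the alerting system described in the talk: the function names, the 1% threshold, and the minimum-request floor are all hypothetical choices.

```python
def failure_percent(failed: int, total: int) -> float:
    """Return failed requests as a percentage of all requests."""
    if total == 0:
        return 0.0
    return 100.0 * failed / total


def should_alert(
    failed: int,
    total: int,
    threshold_percent: float = 1.0,  # hypothetical threshold
    min_requests: int = 100,         # hypothetical floor to avoid noisy alerts
) -> bool:
    """Alert only when there is enough traffic and the failure rate is too high."""
    if total < min_requests:
        return False
    return failure_percent(failed, total) > threshold_percent


# The same check can be applied to each direction of traffic separately.
print(should_alert(failed=30, total=1000))  # 3% failure rate on enough traffic
print(should_alert(failed=5, total=10))     # too little traffic to judge
```

Requiring a minimum request count is one common way to keep such an alert from firing on a handful of failed calls during quiet periods.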

Page 12: Production Ready Services at Netflix

Team Engagements

Page 13: Production Ready Services at Netflix

Drill Sergeant

Automated Checks

● Deployment Pipelines
● Operating System
● Canary Analysis
● Chaos Monkey

Manual Checks

● Alerts
● Dashboards
● Autoscaling Rules (for now)
● Java/Tomcat/Apache Tuning (for now)
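
The split between automated and manual checks can be pictured as a small checklist runner. The sketch below is purely illustrative and does not reflect how Drill Sergeant is implemented: the `Service` dictionary, its field names, and `run_checks` are hypothetical; only the check names come from the slide.

```python
from typing import Callable, Dict, List

# Hypothetical service description; the field names are illustrative only.
Service = Dict[str, bool]

# Checks that a tool could evaluate without human involvement.
AUTOMATED_CHECKS: Dict[str, Callable[[Service], bool]] = {
    "Deployment Pipelines": lambda s: s.get("has_pipeline", False),
    "Operating System": lambda s: s.get("os_current", False),
    "Canary Analysis": lambda s: s.get("canary_enabled", False),
    "Chaos Monkey": lambda s: s.get("chaos_monkey_enabled", False),
}

# Checks that still need a person to review them.
MANUAL_CHECKS: List[str] = [
    "Alerts",
    "Dashboards",
    "Autoscaling Rules",
    "Java/Tomcat/Apache Tuning",
]


def run_checks(service: Service) -> Dict[str, str]:
    """Evaluate automated checks and flag manual ones for review."""
    report = {
        name: "pass" if check(service) else "fail"
        for name, check in AUTOMATED_CHECKS.items()
    }
    for name in MANUAL_CHECKS:
        report[name] = "needs manual review"
    return report


report = run_checks({"has_pipeline": True, "canary_enabled": True})
for name, status in report.items():
    print(f"{name}: {status}")
```

Keeping manual items in the same report makes it straightforward to move a check into the automated table once it can be evaluated programmatically.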

Page 14: Production Ready Services at Netflix

Jonah Horowitz
Senior Site Reliability Engineer

@jonahhorowitz

https://netflix.github.io/
https://jobs.netflix.com/

Velocity Ignite Talk tomorrow: 7:00pm – 8:30pm, Santa Clara Convention Center