Not everything that happens in Vegas stays in Vegas
AWS UK UG #8: Not everything that happens in Vegas stays in Vegas

Jan 15, 2015

Peter Mounce

 
DevOps

or “getting devs to be on call for what they ship” :-)

Netflix development

Priorities

1. Speed of innovation

2. Availability

3. Running costs

a. “It’ll cost what it ends up costing”

In practice, they found that holding to the first two ended up costing far less than otherwise expected.

Riot Games + League of Legends

Cloud == ideal for MMOs. Solve launch issues.

● Chef gets used a lot here.

○ talked about their evolution with it, lessons learned

● What sucked?

○ 25 minute bootstrap runs

○ External dependencies (including S3)

○ Duplicating application deployment recipes

● Golden masters and immutable servers simplify your life drastically.

● “If you’re doing Chef without Berkshelf you’re doing it wrong”

● Make it easy to throw up new things

Testing in production

Netflix, Riot, Kickstarter - they all do this.

At scale.

Netflix

● 10s to 100s of code pushes per day

● 1000s to 100,000s of config changes per day

○ they tune their A/B testing constantly

Of course, they also have the instrumentation to react to this.
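
The gap between code pushes and config changes above can be sketched as follows. This is a minimal illustration, not Netflix's actual mechanism: the experiment name `new_homepage`, the `treatment_percent` key and both function names are hypothetical. The idea is that the A/B split lives in plain data, so it can be tuned without shipping code, while each user keeps a stable bucket.

```python
import hashlib

# Hypothetical experiment config: plain data, so in this sketch it can be
# changed far more often than code is pushed.
EXPERIMENT_CONFIG = {"new_homepage": {"treatment_percent": 10}}


def bucket(user_id: str, experiment: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for one experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def variant(user_id: str, experiment: str, config: dict = EXPERIMENT_CONFIG) -> str:
    """Return "treatment" or "control" according to the configured split."""
    percent = config[experiment]["treatment_percent"]
    return "treatment" if bucket(user_id, experiment) < percent else "control"
```

Because the bucket depends only on the user and the experiment name, raising `treatment_percent` adds users to the treatment group without moving anyone who was already in it.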

How’re other people doing DevOps?

Good news - we’re at the “more sophisticated” end of the spectrum.

Every “cloud native” company was doing this.

Things other people did better:

● “Golden master” AMIs

● Immutable instances

● Absolute ownership of vertical slices

● Config management (Chef/Puppet) featured prominently

● Extensive monitoring+logs+visibility == “table stakes”

○ for developers!

● Easy to throw up new things

● Run many small, simple, collaborating things

Who? Riot Games, Netflix, change.org, Kickstarter

Logging aggregation is important

Lots of third-party companies offer centralized logging services; there's a huge appetite for logging and monitoring.

● http://logentries.com/

● http://www.loggly.com/

● http://papertrailapp.com/

● https://www.splunkstorm.com/tour

● http://www.datadoghq.com/

● DIY - Lumberjacking slides

DEMO: Monitoring & Logging

https://app.datadoghq.com/infrastructure

● Tag Metrics, awesome Metric discoverability

● CloudWatch integration

○ I never knew I could see ELB metrics :-)

● Alarms are integrated

● You can template Dashboards

https://papertrailapp.com/

● Can search, save searches, and alert on searches

● No alerts on patterns

● Archive to S3 / Push to Redshift

Logging aggregation is FOR DEVELOPERS!!!

Saves lots of time when you’re on call.
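
To make "alerts on searches" concrete, here is a minimal local sketch of the idea. It is not Papertrail's API: `SavedSearch`, `run_saved_search` and `check_alert` are hypothetical names, and the query is assumed to be a plain substring match.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SavedSearch:
    name: str       # label shown in the alert
    query: str      # plain substring to look for in each log line
    threshold: int  # alert when at least this many lines match


def run_saved_search(search: SavedSearch, lines: List[str]) -> List[str]:
    """Return the log lines that contain the saved search's query."""
    return [line for line in lines if search.query in line]


def check_alert(search: SavedSearch, lines: List[str]) -> Optional[str]:
    """Return an alert message if enough lines match, otherwise None."""
    matches = run_saved_search(search, lines)
    if len(matches) >= search.threshold:
        return f"ALERT {search.name}: {len(matches)} matching lines"
    return None
```

When you are on call, the useful part is that the same saved query serves both the ad hoc search and the alert, so there is only one definition of "checkout errors" to maintain.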

Loggly Session

Benefit of logging as a service.

● When your infrastructure is in trouble, you do not want to have your logging and analytics system on the same infrastructure.

AWS services that Loggly could use:

● Kafka + Storm vs Kinesis

● Elasticsearch vs CloudSearch

Predictive Analytics using Storm, Hadoop, R and AWS

http://www.youtube.com/watch?v=6Sl3eBmDheE

Loggly Session

● Provisioned IOPS solve all issues :)

● ELBs do not perform well with an extremely high volume of requests.

● DNS round robin is a very good basic load balancing solution.

● Cassandra works very well for application data.

● Cassandra does not work well as a queue system; it is hard to track the order of events.

● Keep the architecture simple.
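
The DNS round robin point above can be illustrated with a small sketch. It assumes one hostname resolves to several addresses and simulates the rotation on the client side; real DNS round robin is normally done by the DNS server rotating the records it returns. `resolve_all` and `round_robin` are hypothetical helper names, and the addresses in the usage note come from the reserved documentation range.

```python
import itertools
import socket
from typing import Iterator, List


def resolve_all(hostname: str, port: int) -> List[str]:
    """Return the distinct IP addresses the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    addresses: List[str] = []
    for info in infos:
        address = info[4][0]  # sockaddr is the fifth field; the IP comes first
        if address not in addresses:
            addresses.append(address)
    return addresses


def round_robin(addresses: List[str]) -> Iterator[str]:
    """Yield the addresses in rotation, one per request, forever."""
    return itertools.cycle(addresses)
```

For example, `servers = round_robin(["192.0.2.10", "192.0.2.11"])` followed by repeated `next(servers)` calls alternates between the two addresses. Note that nothing here checks health: a dead address keeps receiving its share of requests, which is the usual trade-off of this approach.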

Large Scale Load Testing on AWS

Many types of load

● Load testing

○ (running a marathon): predict future load and plan in advance

● Stress testing

○ Break things (figure out limits), prepare mitigation plans

● Resilience test

○ Figure out how many parts of the architecture you can lose and still operate

● Performance test

○ How do latency and throughput change as the load increases?
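
As a rough illustration of the performance test question, the sketch below steps up concurrency and records throughput and latency at each step. It is a toy: `fake_request` is a hypothetical stand-in for a call to the system under test, and a real large-scale test would drive load from many machines rather than from threads in one process.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request() -> None:
    """Hypothetical stand-in for a real request to the system under test."""
    time.sleep(0.01)


def timed_call(request) -> float:
    """Run one request and return its latency in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def run_step(request, concurrency: int, total_requests: int) -> dict:
    """Issue total_requests calls using `concurrency` worker threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(request), range(total_requests)))
    elapsed = time.perf_counter() - started
    return {
        "concurrency": concurrency,
        "throughput_rps": total_requests / elapsed,
        "mean_latency_s": sum(latencies) / len(latencies),
        "max_latency_s": max(latencies),
    }


if __name__ == "__main__":
    # Ramp the load and watch how throughput and latency move together.
    for concurrency in (1, 2, 4, 8):
        print(run_step(fake_request, concurrency, total_requests=40))
```

With a real target, the interesting point is where throughput stops rising while latency keeps climbing; that knee is a reasonable first estimate of the limit a stress test would then push past.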

Phased rollout and measure

● Load Testing is necessary but not sufficient.

○ Deploy to alpha cluster.

○ The release cycle is important: phased deployment, one box, then monitor and ramp up.

○ Monitor performance and behaviour; look at 99% of the traffic, not at the average.

● Netflix records 1.2 billion metrics per day

○ 5-minute SLA
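
The "one box, monitor and ramp up" advice can be sketched as a simple gate. This reads "99% of the traffic" as the 99th percentile of latency, which is an assumption about what the speaker meant. The function names, the traffic steps and the latency limit are all hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def next_rollout_step(current_percent, latencies_ms, p99_limit_ms, steps=(1, 5, 25, 50, 100)):
    """Return the next traffic percentage, or 0 (roll back) if p99 exceeds the limit."""
    if percentile(latencies_ms, 99) > p99_limit_ms:
        return 0
    for step in steps:
        if step > current_percent:
            return step
    return 100  # already fully rolled out


if __name__ == "__main__":
    # 98 fast requests and 2 very slow ones: the mean hides the slow tail.
    samples = [20] * 98 + [900] * 2
    print(sum(samples) / len(samples))            # mean latency in ms
    print(percentile(samples, 99))                # 99th percentile in ms
    print(next_rollout_step(5, samples, p99_limit_ms=200))
```

In this made-up sample the mean stays well under the 200 ms limit while the 99th percentile does not, so a gate on the average would have ramped up a release that is hurting a slice of real users.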

Gameday

We took part in the AWS Gameday

http://www.awsgameday.com/whatisgameday.html

Inspired by the 2012 Obama For America DevOps and Amazon.com ops teams

● Build an Autoscaling application

● Exchange administrative IAM credentials with the other team

● Break your opponent's systems

● Restore your system

● Lessons learned

Who would be interested if we ran this?

It needs a full day, ~ 6 hours.

Weekday?

Weekend?

Twitter: @petemounce