Performances on Amazon AWS - Cloud Day 2012 Xebia

BHASKAR SUNKARA AND PETER ABRAMS

PERFORMANCE ON AMAZON AWS

INTRODUCTION

•  Founded in April 2008 in San Francisco – Venture Funded
•  Founding Principles
   •  The move to the cloud presents a new set of challenges
   •  New world - constant change (infrastructure, architecture, code)
   •  Existing management solutions are not designed for constant change
•  AppDynamics Value - Enable teams to operate business-critical applications in clouds and guarantee service performance
•  Working with Netflix since October 2009
   •  Oct. 2009 – 150 servers in private data center
   •  May 2012 – 50 servers in data center, 8,000 servers in EC2
•  AppDynamics is Netflix's primary SLA management tool

AGENDA

Differences between Cloud and Physical Datacenter

Performance on AWS

Case Study on AWS

AppDynamics

CLOUD / PHYSICAL DATACENTER: THINGS HAVE CHANGED

EVERYTHING IS SHARED

•  Shared/Virtualized Infrastructure
•  Shared services

Shared Services

The biggest public cloud!

•  S3
•  SQS
•  SDB
•  EBS
•  EMR
•  …

INFRASTRUCTURE

•  Machines come and go
•  High rate of change
•  Capacity is much cheaper
•  Capacity can be both increased and decreased
   •  In minutes
•  Cannot use physical dependencies anymore
   •  E.g. static IP mapping between services
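One way to avoid static IP mapping is to have callers resolve a service by name at call time, so instances can be replaced without touching caller configuration. The sketch below is only an illustration under assumptions: the `ServiceRegistry` class, the service name, and the addresses are hypothetical and are not an AWS or AppDynamics API.

```python
class ServiceRegistry:
    """In-memory map of service name -> live addresses (hypothetical)."""

    def __init__(self):
        self._instances = {}  # service name -> list of addresses
        self._cursors = {}    # service name -> round-robin counter

    def register(self, service, address):
        """Called when a new instance comes up."""
        self._instances.setdefault(service, []).append(address)

    def deregister(self, service, address):
        """Called when an instance goes away; unknown addresses are ignored."""
        addresses = self._instances.get(service, [])
        if address in addresses:
            addresses.remove(address)

    def resolve(self, service):
        """Return one live address for the service, rotating round-robin."""
        addresses = self._instances.get(service)
        if not addresses:
            raise LookupError(f"no live instances for {service!r}")
        position = self._cursors.get(service, 0) % len(addresses)
        self._cursors[service] = position + 1
        return addresses[position]
```

Callers hold only the service name; the set of addresses behind it can change every few minutes without any caller noticing.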

PERFORMANCE MONITORING

•  Traditional monitoring: measure
   •  CPU and other hardware metrics
   •  Code metrics – individual methods etc.
   •  Scrape logs for errors etc.
   •  Configured by hand

•  Cloud monitoring - datacenter tools are a big pain!
   •  You were measuring CPU metrics for a bunch of machines
   •  Now those are gone, and the new ones are up
   •  Who is going to refresh your dashboards?
   •  Who is going to clean up the dead instance data?

GOOD PERFORMANCE ON AWS?

(Re)architect your app to
•  Work on Amazon!
•  Take advantage of all that it provides
•  Be careful with shared services!

Pick the right performance monitoring tools!

Let's not forget managing capacity/cost!

APPLICATION ARCHITECTURE

•  Distributed
•  Service Oriented
•  Horizontally Scalable

AWS PERFORMANCE FACTSHEET - FROM A HEAVY DUTY USER: NETFLIX

IF YOU ARE USING SHARED SERVICES

•  Measure service performance in isolation
•  Stress test the hell out of shared service calls
   •  At minimum, double your peak load!
•  Look for common patterns out there
   •  e.g. SimpleDB needs a cache frontend
•  Avoid badly performing shared services
   •  EBS?
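The "cache frontend" pattern can be sketched as a read-through cache with a time-to-live placed in front of a slow shared-service lookup. This is a minimal illustration, not SimpleDB client code: `ReadThroughCache` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever call actually reaches the shared service.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory; hit the backing service on a miss."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable(key) -> value, the slow shared call
        self._ttl = ttl_seconds  # how long a cached value stays fresh
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                   # fresh hit, no remote call
        value = self._fetch(key)              # miss or expired: go remote
        self._entries[key] = (now + self._ttl, value)
        return value
```

The time-to-live bounds how stale a read can be; choosing it is a trade-off between load on the shared service and freshness.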

PERFORMANCE MONITORING

ESTABLISH CRITERIA TO PICK THE RIGHT TOOLS

1. HAS TO BE SERVICE ORIENTED

•  Primarily monitor services, not infrastructure!
•  Focus on the application SLAs
•  Focus on the end user experience

Process Service Order

§  Response times
§  Load
§  Error rates
§  Trends
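As a rough sketch of service-oriented measurement, the per-service numbers listed above (load, response time, error rate) can be derived from request records rather than from host metrics. The `Request` record and the `summarize` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    service: str        # e.g. "process_order"
    duration_ms: float  # response time of this request
    ok: bool            # False if the request ended in an error


def summarize(requests, window_seconds):
    """Return load, average response time and error rate per service."""
    grouped = {}
    for request in requests:
        grouped.setdefault(request.service, []).append(request)

    summary = {}
    for service, items in grouped.items():
        count = len(items)
        summary[service] = {
            "load_per_second": count / window_seconds,
            "avg_response_ms": sum(r.duration_ms for r in items) / count,
            "error_rate": sum(1 for r in items if not r.ok) / count,
        }
    return summary
```

Trends then come from comparing these summaries across successive time windows.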

2. HAS TO BE DISTRIBUTED

•  Tools need to measure health of tiers
•  Measuring individual servers does not make sense
•  Services are horizontally scalable

[Figure: a tier shown with three instances (ec2-1 to ec2-3) and then with five (ec2-1 to ec2-5)]

You need to know how the cluster/tier performs in terms of average utilization
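A minimal sketch of tier-level aggregation, assuming each sample is a `(tier, instance_id, cpu_percent)` tuple; the tuple layout and the `tier_utilization` name are assumptions made for this example.

```python
from statistics import mean


def tier_utilization(samples):
    """Collapse per-instance CPU samples into one figure per tier."""
    by_tier = {}
    for tier, _instance_id, cpu_percent in samples:
        by_tier.setdefault(tier, []).append(cpu_percent)
    return {
        tier: {"instances": len(values), "avg_cpu_percent": mean(values)}
        for tier, values in by_tier.items()
    }
```

The instance id is deliberately discarded: when the tier grows from three machines to five, the result still has the same shape and stays comparable over time.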

3. HAS TO KEEP UP WITH RATE OF CHANGE

•  Keep up with machines going up/down
   •  Nodes are transient
•  Provide a clean view of the current state
   •  Clean up dead instances/services
•  Maintain a baseline of how the overall tier does

[Figure: instances ec2-1 to ec2-3 shown alongside instances ec2-22 to ec2-24]
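Cleaning up dead instances can be sketched as a heartbeat check: any instance that has not reported within a timeout is dropped from the current view. The `prune_dead_instances` function, the heartbeat map, and the 300-second default are assumptions for illustration only.

```python
def prune_dead_instances(last_seen, now, timeout_seconds=300.0):
    """Split instances into those still reporting and those gone silent.

    last_seen maps instance id -> timestamp of its latest heartbeat.
    Returns (live, dead): a dict of live instances and a sorted list of dead ids.
    """
    live = {
        instance: seen
        for instance, seen in last_seen.items()
        if now - seen <= timeout_seconds
    }
    dead = sorted(set(last_seen) - set(live))
    return live, dead
```

Dashboards built from `live` always reflect the current machines, while the tier baseline is kept separately so it survives instance turnover.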

4. CROSS SERVICE TRACING

•  Becomes absolutely necessary for truly distributed apps
•  Should be able to drill down across services within the context of a single user request
•  Should be able to analyze code in every service
•  Should be able to point out the impact of using shared services
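One common way to follow a single user request across services is to propagate a correlation id on every call and tag each timing record with it. The sketch below assumes a hypothetical `X-Trace-Id` header and in-memory span records; it does not describe how AppDynamics implements tracing.

```python
import uuid

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name


def ensure_trace_id(headers):
    """Return a copy of headers that carries a trace id, reusing any incoming one."""
    outgoing = dict(headers)
    outgoing.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    return outgoing


def record_span(spans, headers, service, duration_ms):
    """Store how long one service spent on the request identified by headers."""
    spans.append({
        "trace_id": headers[TRACE_HEADER],
        "service": service,
        "duration_ms": duration_ms,
    })


def trace_breakdown(spans, trace_id):
    """Time spent per service within one user request."""
    breakdown = {}
    for span in spans:
        if span["trace_id"] == trace_id:
            service = span["service"]
            breakdown[service] = breakdown.get(service, 0.0) + span["duration_ms"]
    return breakdown
```

Recording shared-service calls as spans of their own makes their share of the total request time visible in the same breakdown.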

CROSS SERVICE TRACING

IMPACT OF SHARED SERVICES

5. AUTODISCOVERY / LOW CONFIGURATION MAINTENANCE

•  Cannot have configuration-based discovery of new instances/services
   •  Baking into AMIs etc.
   •  Should auto-discover new tiers/services
•  Cannot have code-level configuration
   •  Difficult to maintain with agile development

MANAGING COST – CASE STUDY

MANAGING COST - EISMANN

•  Managing Capacity == Cost
   •  The cloud isn't free!
•  Eismann
   •  Frozen food delivery vendor in Germany
   •  In production on AWS
   •  Has variable capacity based on usage hours
•  Use application-level SLAs to determine capacity
   •  E.g. Process Order volume == capacity of services on AWS
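Sizing capacity from an application-level figure such as order volume can be sketched as simple arithmetic. The per-instance throughput, the headroom percentage, and the `instances_needed` function are illustrative assumptions; the slides do not give Eismann's actual numbers or rules.

```python
def instances_needed(orders_per_hour, orders_per_instance_hour,
                     headroom_percent=20, min_instances=1):
    """Instances required to process the order volume with some headroom.

    Integer arithmetic keeps the ceiling division exact.
    """
    if orders_per_instance_hour <= 0:
        raise ValueError("orders_per_instance_hour must be positive")
    demand = orders_per_hour * (100 + headroom_percent)
    capacity = orders_per_instance_hour * 100
    required = -(-demand // capacity)  # ceiling division
    return max(min_instances, required)
```

Re-evaluating this as order volume changes through the day lets capacity, and therefore cost, follow actual usage hours instead of the daily peak.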

WHAT IS APPDYNAMICS?

•  Fundamentally built for the cloud
•  Handles constant change of infrastructure
•  Service-oriented SLA management
•  Detailed, actionable information on service performance for engineers, architects and operations
•  Zero to low configuration
•  No code configuration needed for visibility

Did I mention Eismann is fully deployed on AppDynamics and uses us for automatically managing capacity and SLAs?
