Performance Variability of Production Cloud Services
by Allen Metcalfe, Josh Mercer, Aaron Wagner, and Spence Southard
Discussion
1. Background
2. Problem
3. Cloud Services
4. Quick Statistics Review
5. Benchmarking Results
6. Critical Analysis
7. Areas of Future Research
8. Summary
Background: Cloud Computing
● Owned, operated, and maintained by an independent vendor.
● Services usually sold as:
o Infrastructure as a Service
o Platform as a Service
o Software as a Service
● Deployed as many VMs operating on one physical machine.
Goals of Cloud Computing
● Performance
● Cost
● Flexibility/Scalability
Background Scenario
A small mobile app development company is expanding its application by harnessing the additional compute and storage capacity of traditional computers where smart devices fall short.
Two choices:
1. Maintain dedicated servers
2. Obtain servers through a cloud vendor
Problem: Performance Variability
● Dependability:
o Machine downtime
● I/O Sharing:
o Contention for disk writes
● Performance Stability:
o Overutilization of resources
HealthCare.gov (Still loading)
Problem Analysis
How does variance impact performance?
How do we define what is an acceptable level
of performance variance?
What trends and seasonal factors may
contribute to large scale variance?
Research Overview
● At the time of the study, no other investigation into cloud performance variance existed.
● Study the long-term variability of performance for production cloud systems:
o Amazon Web Services
o Google App Engine
Google App Engine (GAE)
● Python, Java, PHP, Go
Amazon Web Services (AWS)
● Python, Java, PHP, Ruby, .NET
Cloud Providers & Services Tested
Google App Engine
● Python runtime environment
● Datastore storage
● Memcache caches query results
● URL Fetch issues HTTP/HTTPS requests
Amazon
● EC2 virtual machines
● S3 storage
● SDB database
● SQS message queue processing
● FPS payment processing
All tests run using CloudStatus.com
Google App Engine - Test Parameters
● Google Run Service
o Calculate Fibonacci
● Google Datastore Service
o Create Time
o Read Time
o Delete Time
● Google Memcache Service
o Get Time
o Put Time
o Response Time
● Google URL Fetch Service
o Response Time (api.facebook.com, api.hi5.com, api.myspace.com, ebay.com, s3.amazonaws.com, and paypal.com)
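The URL Fetch probes above time an HTTP round trip to each target site. As a rough illustration (not CloudStatus's actual probe), such a response-time measurement can be sketched with Python's standard library; the function name is hypothetical:

```python
# Hypothetical sketch of a URL-fetch response-time probe, using only
# the Python standard library.
import time
import urllib.request

def fetch_response_time(url, timeout=10):
    """Return wall-clock seconds taken to issue a GET and read the body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.perf_counter() - start
```

A monitoring service would run a probe like this on a fixed schedule and record the samples for later aggregation.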
Amazon Web Services - Test Parameters
● Amazon EC2
o Deployment Latency
● Amazon S3
o Get Throughput
o Put Throughput
● Amazon SDB
o Query Response Time
o Update Latency
● Amazon SQS
o Average Lag Time
● Amazon FPS
o Response Time
Statistics Review: Quartiles
Quartiles:
● Q1: the value below which 25% of measurements fall
● Q2 (the median): the value below which 50% of measurements fall
● Q3: the value below which 75% of measurements fall
● IQR (interquartile range): the distance Q3 - Q1
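These definitions can be checked with Python's standard library; the sample data below is made up for illustration:

```python
# Sketch: computing quartiles and the IQR with the standard library.
from statistics import quantiles

samples = [12, 15, 11, 14, 90, 13, 12, 16, 14, 13]

# quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(samples, n=4)  # 12.0, 13.5, 15.25 for this data
iqr = q3 - q1                         # interquartile range: 3.25
```

Note that the single outlier (90) barely moves the quartiles, which is why quartile-based measures are preferred over the mean when characterizing noisy performance data.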
Statistics Review: Measuring Variability
Two heuristic measures of variability (used because no industry-standard measures exist):
1. If the mean deviates from the median by more than 10% of the IQR, the values skew toward one end of the range.
2. If the median is less than half of the IQR (i.e., Q2 < 0.5 × IQR), most measurements are highly variable.
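A minimal sketch of these two heuristics, assuming the thresholds stated on the slide (the function name is hypothetical):

```python
# Sketch: the two variability heuristics from the slide.
from statistics import mean, quantiles

def variability_flags(samples):
    """Return (skewed, highly_variable) per the slide's two heuristics."""
    q1, q2, q3 = quantiles(samples, n=4)
    iqr = q3 - q1
    skewed = abs(mean(samples) - q2) > 0.10 * iqr  # heuristic 1
    highly_variable = q2 < 0.5 * iqr               # heuristic 2
    return skewed, highly_variable
```

For a flat series both flags stay False; a series dominated by a few large values trips both.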
EC2 Deployment Time
Amazon Get EU HI Hourly
Amazon Get EU HI Monthly
Amazon Get US HI Monthly
Amazon SDB Update Time
Amazon SQS Query Time
Amazon FPS Payment Time
Google Python Run
Google Datastore
Google Memcache
Google URL Fetch
Performance penalty scenarios
● Researchers generated hypothetical use-case models comparing cloud performance against traditional parallel processing environments.
● Simulations are based on the data from the previous graphs and show a penalty factor versus traditional environments at the measured user load.
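The slides do not give the penalty-factor formula; one common reading (assumed here, not taken from the study) is the ratio of cloud completion time to a traditional environment's completion time at each measured load level:

```python
# Hypothetical sketch: penalty factor as the ratio of cloud completion
# time to traditional completion time, one value per load level.
def penalty_factors(cloud_times, traditional_times):
    """Element-wise ratio; values above 1.0 mean the cloud is slower."""
    return [c / t for c, t in zip(cloud_times, traditional_times)]
```

Under this reading, a penalty factor of 2.0 means the cloud service took twice as long as the traditional environment at that load.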
Amazon FPS penalty for payment processing
Amazon SDB for Social Games
Google Datastore for Social Games
Critical Analysis
● This research should be repeated each year.
● The authors' conclusion is limited to variance caused by high user load.
● Vendors do not publish their capacity increases, making inferences difficult.
Summary
● Cloud services may incur high performance variability due to:
o System size
o Workload variability
o Virtualization overhead
o Resource-time sharing
● Seasonal cloud variance is present,
exhibiting yearly and daily patterns
● Performance variability varies greatly across
application types
Future Research
● Repeated research to verify seasonal
variance
● Techniques for minimizing variance
● Dynamic cloud resizing
Any Questions?
Slides available at:
http://bit.ly/1vdpM6B