Evaluating Caching and Storage Options on the Amazon Web Services Cloud

Gagan Agrawal, Ohio State University - Columbus, OH

David Chiu, Washington State University - Vancouver, WA

Presented by Smita Vijayakumar, Juniper Networks

Mar 27, 2015




Outline

‣ Introduction to Cloud Computing

‣ Background on AWS and Motivation

‣ Cost and Performance Evaluation

‣ Conclusion


Cloud Computing Paradigm

Algorithms & Data

Cloud “Utility” Providers: Amazon AWS, Azure, Cloudera, Google App Engine

Consumers: Companies, labs, schools, et al.

Processed Results


Promises of Cloud Computing

Allows us to consolidate machines and outsource computation and storage

Pay-as-you-go Computing

“Infinite” compute resources and storage



A Motivating Example

‣ A service-oriented system that answers queries from a similar domain

‣ Intermediate and final results can be cached and reused for future queries

‣ Often present in workflow applications
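
The reuse idea above can be sketched in a few lines of Python. This is only an illustration, not the authors' system: `ResultCache` and `slow_square` are hypothetical names, and the "expensive" computation is a stand-in for a real workflow step.

```python
from typing import Callable, Dict, Hashable


class ResultCache:
    """Keeps results of earlier queries so repeated queries skip recomputation."""

    def __init__(self, compute: Callable[[Hashable], object]) -> None:
        self._compute = compute  # the expensive service call (hypothetical)
        self._store: Dict[Hashable, object] = {}
        self.hits = 0
        self.misses = 0

    def query(self, key: Hashable) -> object:
        if key in self._store:
            # Cache hit: reuse the stored result.
            self.hits += 1
            return self._store[key]
        # Cache miss: compute once, then keep the result for future queries.
        self.misses += 1
        result = self._compute(key)
        self._store[key] = result
        return result


def slow_square(x):
    # Stand-in for an expensive workflow step.
    return x * x


cache = ResultCache(slow_square)
answers = [cache.query(k) for k in (3, 4, 3, 3)]
```

Where the `_store` dictionary lives (memory, local disk, EBS, or S3) is exactly the choice the rest of the talk evaluates.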


Leveraging the Cloud for Storage

‣ Store and Cache Intermediate and Final Results in the Cloud

‣ The Cloud has many options for data storage

• Memory

• Disks

• Network Disks

• Highly Available Persistent Storage

‣ Each option involves several tradeoffs


Amazon Web Services (AWS)

‣ A Case Study: AWS has emerged as one of the most widely used Cloud platforms

‣ We consider caching and storage performance in three AWS Services:

• Elastic Compute Cloud (EC2) Machine instances

• Simple Storage Service (S3)

• Elastic Block Store (EBS)


AWS Services: EC2

‣ Elastic Compute Cloud (EC2)

• Access to virtualized machines with varying capabilities (e.g., CPU cores, memory, disk space) depending on price.

Instance Type   CPU                                        Memory   Disk    I/O
Small           1 virtual core                             1.7GB    160GB   medium
XLarge          4 virtual cores (x 2 compute units each)   15.0GB   1.7TB   high


AWS Services: EBS

‣ Elastic Block Store (EBS)

• Persisted network disks.

• Must be mounted onto an EC2 machine before use.

• Users must initially specify a fixed size and format the volume with an appropriate file system.


AWS Services: S3

‣ Simple Storage Service (S3)

• Simple FTP-style API: GET, PUT, etc.

• Highly available, reliable, and durable storage (but slower)

• “Infinite capacity”

• Not required to be used with EC2 machines.

• Very inexpensive.


Costs of AWS Services


Tradeoffs Per Application and Service

‣ Caching in-core (EC2-Memory)

• Fast, but expensive

• Small; may need extra logic to coordinate a set of EC2 nodes

• Data is volatile


Tradeoffs Per Application and Service

‣ Caching on local disk (EC2-Disk)

• Much slower than memory

• Much more space

• Data is still volatile


Tradeoffs Per Application and Service

‣ Caching on Elastic Block Store (EC2-EBS)

• Possibly slower than disk

• Volume size is initially configured by application users

• Data is persisted


Tradeoffs Per Application and Service

‣ Caching on S3

• Slowest option, but most reliable

• No bound on size

• Data is persisted
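
The four options and their tradeoffs can be collected into a small lookup structure. The sketch below only restates the bullet points from these slides; the class name `CacheOption`, its field names, and the helper `persistent_options` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheOption:
    name: str
    persistent: bool   # False means the data is volatile
    size_bound: str    # capacity note taken from the slides
    speed_note: str    # speed note taken from the slides


OPTIONS = [
    CacheOption("EC2-Memory", False, "small", "fast, but expensive"),
    CacheOption("EC2-Disk", False, "much more space than memory", "much slower than memory"),
    CacheOption("EC2-EBS", True, "volume size configured up front", "possibly slower than disk"),
    CacheOption("S3", True, "no bound", "slowest, but most reliable"),
]


def persistent_options(options=OPTIONS):
    """Return the names of the options whose data survives instance loss."""
    return [o.name for o in options if o.persistent]


names = persistent_options()
```

Filtering on `persistent` leaves EC2-EBS and S3, the two candidates compared later for the persistent cache.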



Experimental Application

‣ Geospatial Application: Land Elevation Change

• In general, 2 large matrices (DEM files) are retrieved, and their difference is returned

‣ 500 unique requests

‣ Requests are issued randomly

‣ Eviction not considered (we assume the cache/storage configuration is being used to store all results)
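
A rough simulation of this setup, under stated assumptions: tiny 2x2 grids stand in for the large DEM files, the request stream is drawn uniformly at random, and the function names (`elevation_change`, `simulate`) are hypothetical. It is a sketch of the workload shape, not the experiment's actual code.

```python
import random


def elevation_change(dem_a, dem_b):
    """Element-wise difference of two equally sized elevation grids."""
    return [[b - a for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(dem_a, dem_b)]


def simulate(num_unique=500, num_requests=2000, seed=0):
    """Issue random requests against a cache that never evicts."""
    rng = random.Random(seed)
    cache = {}
    hits = 0
    for _ in range(num_requests):
        request_id = rng.randrange(num_unique)
        if request_id in cache:
            hits += 1
            continue
        # Tiny hypothetical grids stand in for the two retrieved DEM files.
        dem_old = [[request_id, 0], [0, request_id]]
        dem_new = [[request_id + 1, 2], [3, request_id + 4]]
        cache[request_id] = elevation_change(dem_old, dem_new)
    return hits, len(cache)


hits, cached = simulate()
```

Because nothing is evicted, every miss adds exactly one entry, so the cache can never hold more than the 500 unique results.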


Performance

‣ We use 4 different DEM data sizes to test performance:

• 1KB, 1MB, 5MB, 50MB

‣ With 500 unique requests, a full cache would hold

• 500KB, 500MB, 2.5GB, 25GB
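
The full-cache sizes follow from multiplying each result size by the 500 unique requests. A short check, assuming decimal units (1MB = 1000KB); the names below are illustrative only.

```python
NUM_UNIQUE_REQUESTS = 500

# Result sizes in KB, assuming decimal units (1MB = 1000KB).
SIZES_KB = {"1KB": 1, "1MB": 1_000, "5MB": 5_000, "50MB": 50_000}


def full_cache_kb(result_kb, num_results=NUM_UNIQUE_REQUESTS):
    """Total space needed when every unique result is kept."""
    return result_kb * num_results


totals = {label: full_cache_kb(kb) for label, kb in SIZES_KB.items()}
# totals["1MB"] is 500_000 KB (500MB); totals["50MB"] is 25_000_000 KB (25GB)
```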

[Performance charts for 1KB, 1MB, 5MB, and 50MB DEM sizes]


Cost Analysis

‣ We next assess costs versus performance

‣ Performance is measured as relative speedup over the baseline DEM process execution, shown in Table 2

‣ We project costs and speedup over 2000 and 200000 requests
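
One plausible reading of the two quantities used in the following slides is sketched below. The exact definitions are assumptions (speedup as baseline time over cached time, and monthly cost divided by speedup), and the numbers are made up for illustration; they are not results from the study.

```python
def speedup(baseline_seconds, cached_seconds):
    """Relative speedup of the cached execution over the uncached baseline."""
    if cached_seconds <= 0:
        raise ValueError("cached_seconds must be positive")
    return baseline_seconds / cached_seconds


def cost_per_unit_speedup(monthly_cost_usd, speedup_value):
    """Dollars paid per unit of speedup; lower is better."""
    if speedup_value <= 0:
        raise ValueError("speedup_value must be positive")
    return monthly_cost_usd / speedup_value


# Hypothetical numbers, for illustration only.
example_speedup = speedup(baseline_seconds=20.0, cached_seconds=5.0)
example_cost = cost_per_unit_speedup(monthly_cost_usd=60.0, speedup_value=example_speedup)
```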


Monthly Costs for Volatile Cache (1MB)

200000 I/O Requests outside of AWS

2000 I/O Requests outside of AWS

Cost per unit speedup is low when requests are high.

I/O costs are still low because of small data size

Speedup: 3.5 3.26 3.6 3.6 267 28 347 180.5


Monthly Costs for Volatile Cache (50MB)

200000 I/O Requests outside of AWS

2000 I/O Requests outside of AWS

Costs are now dominated by I/O due to large data size

In terms of performance, it makes more sense to use xlarge for large data sizes

Speedup: 2.9 3.3 16.05 31.66

A small instance makes better economic sense for a small number of requests


Monthly Costs for Persistent Cache (1MB)

200000 I/O Requests outside of AWS

2000 I/O Requests outside of AWS

S3 makes better economic sense than EBS-based instances

Speedup: 3.4 3.62 3.58 30 13.6 134

S3 performance is comparable for a cache with small I/O requests


Monthly Costs for Persistent Cache (50MB)

200000 I/O Requests outside of AWS

2000 I/O Requests outside of AWS

Interesting: even with the low cost of S3, it still makes sense to use xlarge when I/O requests are high

Speedup: 2.59 2.74 3.19 6.4 11.09 22.66

S3 is still comparable, and makes better economic sense than EBS-based instances



Summary (1)

‣ For smaller data (<= 5MB)

• If request rate is low: use a small instance, on-disk

• If request rate is high: use a small instance, in-memory

• Although I/O is slow, the cost of using a small instance is very low

‣ If persistence is needed,

• Use S3, and avoid EBS


Summary (2)

‣ For larger data (>= 50MB and large cache sizes)

• Use xlarge instances

• Higher I/O rates

• Larger memory and disk capacity

‣ EBS may be considered in conjunction with XLarge instances for persistence

‣ If performance is not an issue, but persistence and costs are, use S3
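
The two summary slides can be read as a small decision procedure. The sketch below is one possible encoding, not the authors' tool: the function name `recommend`, its parameters, and the returned labels are hypothetical, and sizes between 5MB and 50MB return `None` because the slides give no guidance for that range.

```python
def recommend(data_mb, high_request_rate, need_persistence, performance_matters=True):
    """Map the summary rules onto a storage recommendation (illustrative only)."""
    # Persistence and cost matter, performance does not: use S3.
    if need_persistence and not performance_matters:
        return "S3"
    if data_mb <= 5:
        # Smaller data: S3 for persistence, otherwise a small instance.
        if need_persistence:
            return "S3"
        if high_request_rate:
            return "small instance, in-memory"
        return "small instance, on-disk"
    if data_mb >= 50:
        # Larger data: xlarge instances, optionally paired with EBS.
        if need_persistence:
            return "xlarge instance (consider EBS for persistence)"
        return "xlarge instance"
    # The slides do not cover sizes strictly between 5MB and 50MB.
    return None


choice = recommend(data_mb=1, high_request_rate=True, need_persistence=False)
```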


Conclusion

‣ The Cloud offers many viable options for data storage and caching

‣ We evaluated the cost-performance tradeoffs of these various options, and determined a roadmap for making clear decisions on resource usage


Thank you

‣ Questions and Comments?

• David Chiu - [email protected]

• Gagan Agrawal - [email protected]