Ten^H^H^H Many Cloud App Design Patterns

Post on 17-May-2015

DESCRIPTION

What kind of design patterns are useful for applications adopting the cloud? How can apps achieve the scalability and availability promised by the cloud? Presentation from Interop 2011 Enterprise Cloud Summit.

Transcript

Ten Cloud Design Patterns

Shlomo Swidler, Founder

Orchestratus

2 of 61

Shlomo Swidler

• Founder, Orchestratus
  – Strategic and technical IT consulting
  – Customers include:
• Cloud Developer Tips blog: http://shlomoswidler.com/
• Among top community-ranked contributors to Amazon Web Services discussion forums

Ten^H^H^H Many Cloud Application Design Patterns

Shlomo Swidler, Founder

Orchestratus

6 of 61

What is a Design Pattern?

• A reusable recipe for building (software) systems that solve a particular problem.
• AKA Architectural Pattern
• A pattern balances three things:
  – Goal: meets affirmative requirements
  – Available resources: can be implemented
  – Constraints: does not violate negative requirements

10 of 61

Challenges Faced by Apps in the Cloud

• Application Scalability
  – Cloud promises rapid (de)provisioning of resources.
  – How do you tap into that to create scalable systems?
• Application Availability
  – Underlying resource failures happen… usually more frequently than in traditional data centers.
  – How do you overcome that to create highly available systems?

11 of 61

The Scalability Challenge

• Scalability: Handle more (or fewer) requests
  – It’s not Performance (handle requests faster)
  – It’s not Availability (tolerate failures)
• But improving Scalability often improves Availability

12 of 61

The Scalability Challenge

• Two different components to scale:
  – State (inputs, data store, output)
  – Behavior (business logic)
• Any non-trivial application has both.
• Scaling one component means scaling the other, too.

13 of 61

App Scalability Patterns for State

• Data Grids
• Distributed Caching
• HTTP Caching
  – Reverse Proxy
  – CDN
• Concurrency
  – Message-Passing
  – Dataflow
  – Software Transactional Memory
  – Shared-State
• Partitioning
• CAP theorem: Data Consistency
  – Eventually Consistent
  – Atomic Data
• DB Strategies
  – RDBMS
    • Denormalization
    • Sharding
  – NoSQL
    • Key-Value store
    • Document store
    • Data Structure store
    • Graph database
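Partitioning and sharding from the list above can be illustrated with a minimal sketch. It assumes the simplest scheme, a stable hash of the key taken modulo the shard count; the names `shard_for` and `partition` are hypothetical and do not come from the talk. Real systems often use consistent hashing or range partitioning instead, so that changing the shard count moves fewer keys.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is the same on every run.
    return int.from_bytes(digest[:8], "big") % shard_count


def partition(keys, shard_count):
    """Group keys by the shard that would store them."""
    shards = {index: [] for index in range(shard_count)}
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


if __name__ == "__main__":
    for index, members in partition(["alice", "bob", "carol", "dave"], 3).items():
        print(index, members)
```

Each shard can then be served by its own data store instance, which is how state gets scaled out alongside behavior.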

14 of 61

App Scalability Patterns for Behavior

• Compute Grids
• Event-Driven Architecture
  – Messaging
  – Actors
  – Enterprise Service Bus
  – Domain Events
  – Event Stream Processing
  – Event Sourcing
  – Command & Query Responsibility Segregation (CQRS)
• Load Balancing
  – Round-robin
  – Random
  – Weighted
  – Dynamic
• Parallel Computing
  – Master/Worker
  – Fork/Join
  – MapReduce
  – SPMD
  – Loop Parallelism
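Two of the load-balancing strategies listed above, round-robin and weighted, can be sketched in a few lines. This is an illustration only: the backend names are made up, and the functions `round_robin` and `weighted_choice` are hypothetical helpers rather than any load balancer's API.

```python
import itertools
import random


def round_robin(backends):
    """Return an endless iterator that hands out backends in rotating order."""
    return itertools.cycle(backends)


def weighted_choice(weights, rng=random):
    """Pick one backend with probability proportional to its weight.

    `weights` maps backend name -> non-negative weight (assumed input shape).
    """
    names = list(weights)
    return rng.choices(names, weights=[weights[name] for name in names], k=1)[0]


if __name__ == "__main__":
    rotation = round_robin(["app-1", "app-2", "app-3"])
    print([next(rotation) for _ in range(5)])
    print(weighted_choice({"big-box": 3, "small-box": 1}))
```

A dynamic strategy would adjust the weights at run time, for example from observed latency or open connection counts.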

15 of 61

The Availability Challenge

• Availability: Tolerate failures
• Traditional IT focuses on increasing MTTF
  – Mean Time to Failure
• Cloud IT focuses on reducing MTTR
  – Mean Time to Recovery
• What follows are four availability scenarios: [low, high] × [MTTF, MTTR]

17 of 61

Availability and MTTF, MTTR

[Chart: up/down timelines (Up 1, Down 1, … Up 7) for four scenarios: < MTTF, > MTTR; < MTTF, < MTTR; > MTTF, < MTTR; > MTTF, > MTTR. Uptime values shown: 53%, 86%, 69%, 30%. Callouts: Traditional IT, Cloud, Cloud done wrong.]
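How MTTF and MTTR combine into an uptime figure can be sketched with the standard steady-state formula, availability = MTTF / (MTTF + MTTR). The numbers below are hypothetical and are not the ones behind the chart; they only show that cutting recovery time can buy as much uptime as stretching time to failure.

```python
def availability(mttf: float, mttr: float) -> float:
    """Steady-state fraction of time a system is up.

    mttf: mean time to failure, mttr: mean time to recovery (same time unit).
    """
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("mttf and mttr must be non-negative and not both zero")
    return mttf / (mttf + mttr)


if __name__ == "__main__":
    # Hypothetical figures, in hours.
    print(round(availability(100, 10), 3))  # baseline: 0.909
    print(round(availability(200, 10), 3))  # double the MTTF: 0.952
    print(round(availability(100, 5), 3))   # halve the MTTR: 0.952
```

In this example, doubling MTTF and halving MTTR give the same availability, which is why reducing MTTR is a reasonable focus when failures cannot be prevented.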

22 of 61

Design Patterns for Availability

• Pattern: Replication
• Pattern: Fail-Over
• Often used together.

23 of 61

Availability Pattern: Fail-Over

Source: Michael Nygard

In practice, fail-over is not this simple.

26 of 61

Availability Pattern: Fail-Over with Fail-Back

Source: Michael Nygard
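The control flow of fail-over with fail-back can be sketched as below. This is a deliberately naive illustration, not the design shown in the source diagrams: the class `FailoverRouter` and the `is_healthy` callback are hypothetical, and the sketch ignores state replication, standby health, and flapping, which is where the real difficulty lies.

```python
class FailoverRouter:
    """Send traffic to a primary node, fall over to a standby, then fail back."""

    def __init__(self, primary, standby, is_healthy):
        self.primary = primary
        self.standby = standby
        self.is_healthy = is_healthy  # assumed callable: node -> bool
        self.active = primary

    def route(self):
        """Return the node that should serve the next request."""
        if self.active == self.primary:
            if not self.is_healthy(self.primary):
                self.active = self.standby  # fail-over
        elif self.is_healthy(self.primary):
            self.active = self.primary  # fail-back once the primary recovers
        return self.active


if __name__ == "__main__":
    health = {"primary": True, "standby": True}
    router = FailoverRouter("primary", "standby", lambda node: health[node])
    print(router.route())
    health["primary"] = False
    print(router.route())
    health["primary"] = True
    print(router.route())
```

The health check is the interesting design choice: too sensitive and the router flaps between nodes, too lenient and MTTR grows.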

27 of 61

Availability’s Nemesis

• Single Points of Failure

SPOT the SPOF*

*Single Point of Failure

29 of 61

Spot the SPOF: 1

[Diagram: Internet; Cloud; one App on one App Instance]

30 of 61

Spot the SPOF: 1b

[Diagram: Internet; Cloud; one App on one App Instance]

32 of 61

Spot the SPOF: 2

[Diagram: Internet; Cloud; Elastic IP Address; two Apps, each on its own App Instance; fail-over]

Might work…
• Until you need more App instances
• Or until another SPOF fails…

34 of 61

Spot the SPOF: 2a

[Diagram: Internet; Cloud; one Load Balancer Instance (LB); two Apps]

36 of 61

Spot the SPOF: 3

[Diagram: Internet; Cloud; one Availability Zone; Elastic IP Address; two LBs with replicated configuration and fail-over; two Apps]

38 of 61

Spot the SPOF: 4

[Diagram: Internet; Cloud; one Availability Zone; Elastic Load Balancer (ELB, "Magic"); two Apps]

40 of 61

Spot the SPOF: 5

[Diagram: Internet; one Region; two Availability Zones, each with an LB and two Apps; Elastic IP Address; replicated configuration; fail-over]

42 of 61

Spot the SPOF: 6

[Diagram: Internet; one Region; Elastic Load Balancer (ELB, "Magic"); two Availability Zones, each with two Apps]

44 of 61

Spot the SPOF: 7 / 7a

[Diagram: Internet; two Regions; four Availability Zones, each with an LB and two Apps]

Elastic IPs are single-region only.

47 of 61

Spot the SPOF: 7b

[Diagram: Internet; one ELB; two Regions; four Availability Zones, each with two Apps]

ELB is single-region only.

49 of 61

Spot the SPOF: 7c

[Diagram: Internet; DNS; two Regions; two ELBs; four Availability Zones, each with two Apps]

ELB can’t do that: multiple CNAMEs violate RFC 2181.

51 of 61

Spot the SPOF: 7d

[Diagram: Internet; DNS; two Regions; four Availability Zones, each with an LB and two Apps; callout: Cloud Provider]

53 of 61

Spot the SPOF: 8

[Diagram: Internet; DNS; AWS with two Regions and four Availability Zones, each with an LB and two Apps; Rackspace with an LB and two Apps]

and... Ops staff and fail-over mechanism

58 of 61

Availability: Ensure Redundancies

• Physical
• Virtual resource (instance, disk, etc.)
• Availability zone
• Region
• Provider
• Human (ops staff)
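Why redundancy at each of these levels helps can be shown with a short calculation. The sketch assumes failures are independent, which real outages often are not (a region-wide failure takes out every zone in it), and the availability figures are hypothetical. The helper names `parallel` and `serial` are made up for this example.

```python
from math import prod


def parallel(availabilities):
    """Availability of redundant replicas: up if at least one replica is up."""
    return 1 - prod(1 - a for a in availabilities)


def serial(availabilities):
    """Availability of a chain of layers: up only if every layer is up."""
    return prod(availabilities)


if __name__ == "__main__":
    # Hypothetical: each instance is up 90% of the time.
    print(round(parallel([0.9]), 4))        # one instance: 0.9
    print(round(parallel([0.9, 0.9]), 4))   # two redundant instances: 0.99
    # A redundant app tier behind a single 95% load balancer.
    print(round(serial([0.95, parallel([0.9, 0.9])]), 4))  # 0.9405
```

The last line shows the SPOF effect: the redundant tier is capped by the single component in front of it.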

59 of 61

Availability Best Practice: Chaos Monkey

• AKA Error Injection Testing
  – Forcibly create fault conditions in your cloud components.
  – Kill instances, detach disks, screw up DNS, etc.
• Automate recovery from the errors.
• The team gets really good at reducing MTTR, increasing availability!
• Popularized by Netflix, who run it on their live environment.
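The inject-then-recover loop can be sketched against an in-memory stand-in for a pool of instances. This is a toy simulation, not Netflix's tool and not any cloud provider's API: `Pool`, `inject_fault`, and the instance ids are all hypothetical.

```python
import random


class Pool:
    """In-memory stand-in for a group of cloud instances (hypothetical)."""

    def __init__(self, desired):
        self.desired = desired
        self.instances = {f"i-{n}" for n in range(desired)}
        self._next_id = desired

    def kill(self, instance_id):
        """Simulate an instance failure."""
        self.instances.discard(instance_id)

    def heal(self):
        """Automated recovery: launch replacements until the pool is full again."""
        launched = 0
        while len(self.instances) < self.desired:
            self.instances.add(f"i-{self._next_id}")
            self._next_id += 1
            launched += 1
        return launched


def inject_fault(pool, rng):
    """Kill one randomly chosen instance and return its id."""
    victim = rng.choice(sorted(pool.instances))
    pool.kill(victim)
    return victim


if __name__ == "__main__":
    pool = Pool(desired=3)
    victim = inject_fault(pool, random.Random())
    print("killed", victim, "remaining", len(pool.instances))
    print("relaunched", pool.heal(), "now", len(pool.instances))
```

Against real infrastructure, the same loop would measure how long `heal` takes, since that duration is the MTTR the team is trying to drive down.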

60 of 61

For more on Designing for Availability, Scalability

• Jonas Bonér, Scalability, Availability, Stability Patterns: http://slidesha.re/cK3NJv
• George Reese, The AWS Outage: The Cloud’s Shining Moment: http://oreil.ly/eKCGG9
• John Ciancutti of Netflix, 5 Lessons We’ve Learned Using AWS: http://bit.ly/h8rU8b

Ten^H^H^H Many Cloud Application Design Patterns

Shlomo Swidler, Founder

Orchestratus
shlomo@orchestratus.com

@ShlomoSwidler

Thank you!
