Architecting for the Cloud Seth Proctor, CTO @technicallyseth
Transcript
Page 1: New york-breakfast-seminar

Architecting for the Cloud

Seth Proctor, CTO @technicallyseth

Page 2: New york-breakfast-seminar

What’s unique about “cloud”?

Page 3: New york-breakfast-seminar

Cloud architecture

- On-demand
  - Scale-out for capacity & availability
  - Public infrastructure; dynamic provisioning
- Flexible
  - Commodity
  - Hybrid (public & private)
- Simple
  - Monitoring & management
  - Platform APIs and automation
- Resilient

Page 4: New york-breakfast-seminar

Why a different architecture?

- Greater capacity
- Cost-effectiveness
- Higher availability and better failure-handling
- Lower latencies for global deployment

Page 5: New york-breakfast-seminar

Challenges

- Distribution brings challenges
  - Failures happen frequently
  - More difficult to get a global view
  - Security & data lifecycle are harder
  - Everything else about “distributed computing”
- Still, we can scale most layers
  - Load-balancers & name services at the top
  - Horizontally-scaled app servers
  - Caches & CDNs for content
  - Redundant disks and object stores

Page 6: New york-breakfast-seminar

Scaling the database is the real challenge

Page 7: New york-breakfast-seminar

Migrating to “the cloud”

- Modern architectures are commodity, on-demand and virtualized
- Enterprise applications need availability and transactions
- If cloud-scaling breaks global consistency, migration isn’t possible

Page 8: New york-breakfast-seminar

Traditional database design

- RDBMS architectures start at the disk
  - Vertical scale follows
  - Caching helps, but often breaks consistency
  - HA systems become very expensive
- Schema & operation are hard to evolve
  - Hard to harness commodity infrastructure
  - Not designed to scale out
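
The point that caching helps but often breaks consistency can be shown with a minimal sketch. The class names `Database` and `CachedReader` are hypothetical; they stand in for any read cache placed in front of a database with no invalidation, and are not taken from the talk.

```python
class Database:
    """Hypothetical in-memory stand-in for the system of record."""

    def __init__(self):
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class CachedReader:
    """Read-through cache with no invalidation: fast, but it can go stale."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        # Only the first read of a key reaches the database.
        if key not in self.cache:
            self.cache[key] = self.db.read(key)
        return self.cache[key]


db = Database()
reader = CachedReader(db)

db.write("balance", 100)
first = reader.read("balance")   # loaded from the database, then cached
db.write("balance", 50)          # this update bypasses the cache
stale = reader.read("balance")   # served from the cache
fresh = db.read("balance")       # read directly from the database
```

Here `stale` and `fresh` disagree after the second write. Closing that gap takes invalidation or coordination between the cache and the database, which is where the cost and complexity come from.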

Page 9: New york-breakfast-seminar

Common options

- Replication
  - Active-passive or (gulp) multi-master
  - Replicated data but visible delays & conflict
- Sharding
  - Split one database into many sub-sets
  - More capacity but hard to evolve and relate
- Abandon consistency
  - Push correctness & conflict to the application
  - Simpler core architecture but painful for applications and hard to reconcile failures
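
As an illustration of the sharding option, here is a minimal sketch of hash-based routing. `ShardedStore` and its methods are hypothetical names, and real sharded deployments add rebalancing, replication and failure handling that are left out here.

```python
import hashlib


class ShardedStore:
    """Splits one logical key space across several independent dicts."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash keeps a key on the same shard across processes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        # A lookup by key touches exactly one shard.
        return self.shards[self._shard_for(key)].get(key)

    def scan(self, predicate):
        # Anything not addressed by key has to visit every shard.
        results = []
        for shard in self.shards:
            results.extend(v for v in shard.values() if predicate(v))
        return results


store = ShardedStore(shard_count=4)
for i in range(100):
    store.put(f"user-{i}", {"id": i, "region": "eu" if i % 2 else "us"})

eu_users = store.scan(lambda row: row["region"] == "eu")
```

Lookups by key scale with the number of shards, but `scan` shows why related data is hard to query: it fans out to all shards. Changing `shard_count` also remaps most keys, which is one reason a sharded layout is hard to evolve.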

Page 10: New york-breakfast-seminar

Consistency

Page 11: New york-breakfast-seminar

Side-effects

- Applications are tied to deployment
  - A key motivator for dev-ops
  - Complex for on-demand changes, failures
- More, independent pieces
  - Harder to interpret failures
  - Complexity

Page 12: New york-breakfast-seminar

Global operation

- Many motivations
  - Disaster recovery
  - Lower latency for distributed users
  - Data access & storage residency rules
- Trade-offs between latencies and safety
  - Storage may be a separate concern from interaction

Page 13: New york-breakfast-seminar

The database is not the disk

Page 14: New york-breakfast-seminar

Evolution of “operational”

- Hybrid Tasks
  - Hybrid analytics for real-time insight
  - Document & SQL for flexibility
  - Graph views to ask hypothetical questions and model and track lifecycle
- These imply that …
  - Data-sets are larger
  - Access and cache patterns are variable
  - Latencies have more impact

Page 15: New york-breakfast-seminar

Global requirements

- Global Operation
  - Active in multiple locations & globally consistent
- Data Residence & Governance
  - Where is your data, on disk and in memory?
- SLAs as Policy for Automation
  - Reactive or proactive
  - Resilient
- Multiple Models

Page 16: New york-breakfast-seminar

Cloud is the evolution from “client-server” to “distributed”

Page 17: New york-breakfast-seminar

Distributed Database Designs

| Approach | Shared Disk | Shared-Nothing/Sharded | Synchronous Replication | Durable Distributed Cache |
| --- | --- | --- | --- | --- |
| Key Idea | Sharing a file system. | Independent databases for disjoint subsets of data. | Committing data transactionally to multiple locations before returning. | Replicating data in memory on-demand. |
| Topology | (diagram) | (diagram) | (diagram) | (diagram) |
| Example | Oracle RAC, DB2 Pure Scale | MySQL Cluster and most NoSQL/NewSQL solutions | Google F1 | |
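
The key idea behind synchronous replication, committing to multiple locations before returning, can be sketched with a simplified two-phase scheme. This is an assumption-laden illustration, not how any product in the table works: `Replica` and `replicated_commit` are hypothetical names, and real systems also handle timeouts, coordinator failure and recovery.

```python
class Replica:
    """Hypothetical storage location that can stage and apply writes."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.data = {}
        self.pending = {}

    def prepare(self, txn_id, writes):
        # An unreachable replica refuses, which forces the caller to abort.
        if not self.available:
            return False
        self.pending[txn_id] = dict(writes)
        return True

    def commit(self, txn_id):
        self.data.update(self.pending.pop(txn_id))

    def abort(self, txn_id):
        self.pending.pop(txn_id, None)


def replicated_commit(replicas, txn_id, writes):
    """Return True only after every replica has applied the writes."""
    prepared = []
    for replica in replicas:
        if replica.prepare(txn_id, writes):
            prepared.append(replica)
        else:
            # Undo the staged writes everywhere so no location diverges.
            for staged in prepared:
                staged.abort(txn_id)
            return False
    for replica in replicas:
        replica.commit(txn_id)
    return True


ny, london = Replica("ny"), Replica("london")
ok = replicated_commit([ny, london], "t1", {"x": 1})

london.available = False
failed = replicated_commit([ny, london], "t2", {"x": 2})
```

The caller waits for the slowest location on every commit, and one unavailable location blocks writes. That is the trade-off between latency and safety that this approach accepts.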

Page 18: New york-breakfast-seminar

A Durable, Distributed Cache

- Caching puts a database in-memory
  - Optimizations focus on memory, not disk
  - Caches are transient, on-demand and hierarchical by nature
- Distribution means independence
  - Equivalent peers that coordinate to provide a single logical entity
  - Drives service resiliency
- Durability provides safety
  - Decisions about replication, location and resource allocation are operational
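
To make the three ideas above concrete, here is a toy sketch of peers that cache objects in memory on demand and write through to a durable store. It is an illustration under stated assumptions, not NuoDB's implementation: `DurableStore`, `Peer` and their methods are hypothetical, and concurrency, transactions and failures are ignored.

```python
class DurableStore:
    """Hypothetical durable backing store (a disk or object store)."""

    def __init__(self):
        self.objects = {}

    def save(self, key, value):
        self.objects[key] = value

    def load(self, key):
        return self.objects[key]


class Peer:
    """Equivalent peer holding a transient, on-demand cache of objects."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.memory = {}
        self.peers = []

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        # Prefer another peer's memory; fall back to the durable store.
        for peer in self.peers:
            if key in peer.memory:
                value = peer.memory[key]
                break
        else:
            value = self.store.load(key)
        self.memory[key] = value
        return value

    def put(self, key, value):
        self.store.save(key, value)        # durability first
        self.memory[key] = value
        for peer in self.peers:
            if key in peer.memory:
                peer.memory[key] = value   # keep cached copies in step

    def evict(self, key):
        # Caches are transient: dropping an entry loses no data.
        self.memory.pop(key, None)


store = DurableStore()
a, b = Peer("a", store), Peer("b", store)
a.peers, b.peers = [b], [a]

a.put("row:1", "v1")
from_b = b.get("row:1")      # copied on demand from a's memory
a.put("row:1", "v2")         # also refreshes b's cached copy
a.evict("row:1")
after_evict = a.get("row:1")
```

Any peer can serve any key, and evicting an entry is always safe because the durable store holds the committed value. In this sketch, where data lives in memory is a runtime decision and not a schema decision.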

Page 19: New york-breakfast-seminar

Peer to Peer Architecture

[Diagram: peer-to-peer architecture]

- P: NuoDB database peer process
- Provisioned, manageable resources
- Peer-to-peer communications
- Storage: S3, disk, ...
- Clients: SQL client, management client
- Components: SQL front-end, SQL optimizer, transaction handling, object caching, object coordination, durability

Page 20: New york-breakfast-seminar

NuoDB is designed for

- Global operations
- On-demand capacity
- Continuous availability
- Policy-driven deployment
- Multi-tenancy
- Multiple models

Page 21: New york-breakfast-seminar

NuoDB is a single, logical service with global consistency

Page 22: New york-breakfast-seminar

Scaling YCSB

Page 23: New york-breakfast-seminar

Scaling DBT-2

Page 24: New york-breakfast-seminar

Summary

- Look for distributed architectures with on-demand capabilities
- Layer & abstract to support evolution and react gracefully to failures
- Assume your needs will evolve; plan with scale in mind

Page 25: New york-breakfast-seminar

http://dev.nuodb.com

Page 26: New york-breakfast-seminar

Questions?