Scaling your Database in the Cloud Fall 2010 Presented by: Cory Isaacson, CEO CodeFutures Corporation http://www.dbshards.com

Scaling Your Database In The Cloud

Jan 15, 2015



Cory Isaacson

These slides were presented at Cloud Expo West 2010, covering what it takes to scale your databases in the cloud -- keeping them fully reliable as well.
Transcript
Page 1: Scaling Your Database In The Cloud

Scaling your Database in the Cloud

Fall 2010

Presented by: Cory Isaacson, CEO CodeFutures Corporation http://www.dbshards.com

Page 2: Scaling Your Database In The Cloud

Introduction

  Who I am
    Cory Isaacson, CEO of CodeFutures
    Providers of dbShards
    Author of Software Pipelines

  Partnerships:
    Rightscale
      The leading Cloud Management Platform
    myCloudWatcher
      Provider of cloud monitoring and management services

  Leaders in scalability, performance, high-availability and database solutions for the cloud…
    …based on real-world experience with dozens of cloud-based applications
    …social networking, gaming, data collection, mobile, analytics

  Objective is to provide useful experience you can apply to scaling (and managing) your database tier…
    …especially for high volume applications

Page 3: Scaling Your Database In The Cloud

Challenges of cloud computing

  Cloud provides highly attractive service environment
    Flexible, scales with need (up or down)
    No need for dedicated IT staff, fixed facility costs
    Pay-as-you-go model

  Cloud services occasionally fail
    Partial network outages
    Server failures…
      …by their nature cloud servers are “transient”
    Disk volume issues

  Cloud-based resources are constrained
    CPU
    I/O Rates…
      …the “Cloud I/O Barrier”

Page 4: Scaling Your Database In The Cloud

Typical Application Architecture

Page 5: Scaling Your Database In The Cloud

Scaling in the Cloud

  Scaling Load Balancers is easy
    Stateless routing to app server
    Can add redundant Load Balancers if needed
    DNS round-robin or intelligent routing for larger sites
    If one goes down…
      …failover to another

  Scaling Application Servers is easy
    Stateless
    Add or remove servers as need dictates
    If one goes down…
      …failover to another

Page 6: Scaling Your Database In The Cloud

Scaling in the Cloud

  Scaling the Database tier is hard
    “Stateful” by definition (and necessity…)
    Large, integrated data sets…
      10s of GBs to TBs (or more…)
      Difficult to move, reload
    I/O dependent…
      …adversely affected by cloud service failures
      …and slow cloud I/O
    If one goes down…
      …ouch!

  Databases form the “last mile” of true application scalability
    Initially simple optimization produces the best result
    Implement a follow-on scalability strategy for long-term performance goals…
      …plus a high-availability strategy is a must

Page 7: Scaling Your Database In The Cloud

All CPUs wait at the same speed…

The Cloud I/O Barrier

Page 8: Scaling Your Database In The Cloud

More Database scalability challenges

  Databases have many other challenges that limit scalability
    ACID compliance…
      …especially Consistency
      …user contention

  Operational challenges
    Failover
      Planned, unplanned
    Maintenance
      Index rebuild
      Restore
      Space reclamation
    Lifecycle
      Reliable Backup/Restore
      Monitoring
      Application Updates
      Management

Page 9: Scaling Your Database In The Cloud

Database slowdown is not linear…

[Figure: Database Load Curve. Load time plotted against data file size (GB), with an exponential trend line.]

GB      Load Time (Min)
0.9     1
1.3     2.5
3.5     11.7
39.0    10 days…
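The figures in the table can be converted to a per-GB cost to make the non-linearity concrete. The sketch below is illustrative arithmetic only. It uses the four data points from the slide and treats "10 days…" as exactly 10 days (14,400 minutes), which is an assumption because the slide gives only an approximate figure. The names `LOAD_TIMES` and `minutes_per_gb` are invented for this example.

```python
# Data points from the slide: (data file size in GB, load time in minutes).
# Assumption: "10 days…" is taken as exactly 10 * 24 * 60 minutes.
LOAD_TIMES = [(0.9, 1.0), (1.3, 2.5), (3.5, 11.7), (39.0, 10 * 24 * 60)]


def minutes_per_gb(points):
    """Return the load cost per GB for each (size_gb, minutes) point."""
    return [minutes / size_gb for size_gb, minutes in points]


if __name__ == "__main__":
    for (size_gb, _), rate in zip(LOAD_TIMES, minutes_per_gb(LOAD_TIMES)):
        print(f"{size_gb:5.1f} GB -> {rate:7.1f} min per GB")
```

If load time grew linearly with size, the per-GB figure would stay flat. With these numbers it climbs from about 1.1 minutes per GB at 0.9 GB to roughly 369 minutes per GB at 39 GB.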

Page 10: Scaling Your Database In The Cloud

Challenges apply to all types of databases

  Traditional RDBMS (MySQL, Postgres, Oracle…)
    I/O bound
    Multi-user, lock contention
    High-availability
    Lifecycle management

  In-memory Databases (NoSQL, Caching, Specialty…)
    Reliability
    Limits of a single server
      …and a single thread
    Data dumps to disk
    High-availability
    Lifecycle Management

  No matter what the technology, big databases are hard to manage…
    …elastic scaling is a real challenge
    …degradation from growth in size and volume is a certainty

  Application-specific database requirements add to the challenge
    …database design is key…
    …balance performance vs. convenience vs. data size

Page 11: Scaling Your Database In The Cloud

The Laws of Databases

  Law #1: Small Databases are fast…   Law #2: Big Databases are slow…   Law #3: Keep databases small

Page 12: Scaling Your Database In The Cloud

What is the answer?

  Database sharding is the only effective method for achieving scale, elasticity, reliability and easy management…  …regardless of your database technology

Page 13: Scaling Your Database In The Cloud

What is Database Sharding?

  “Horizontal partitioning is a database design principle whereby rows of a database table are held separately... Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.” Wikipedia

Page 14: Scaling Your Database In The Cloud

The key to Database Sharding…

Page 15: Scaling Your Database In The Cloud

Database Sharding Architecture

Page 16: Scaling Your Database In The Cloud

Database Sharding Architecture

Page 17: Scaling Your Database In The Cloud

Database Sharding… the results

Page 18: Scaling Your Database In The Cloud

Why does Database Sharding work?

  Maximize CPU/Memory per database instance…  …as compared to database size

  No contention between servers  Locking, disk, memory, CPU

  Allows for intelligent parallel processing…  …Go Fish queries across shards

  Keep CPUs busy and productive
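As a rough illustration of the parallel processing point, the sketch below sends the same query to every shard at once and combines the partial answers. It assumes that a "Go Fish" query means asking all shards the same question. The shards are plain Python lists standing in for database servers, and `SHARDS`, `query_shard` and `go_fish` are hypothetical names, not part of any product API.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in shards: each list plays the role of one database server.
SHARDS = [
    [{"user_id": 1, "score": 40}, {"user_id": 4, "score": 95}],
    [{"user_id": 2, "score": 70}],
    [{"user_id": 3, "score": 88}, {"user_id": 6, "score": 12}],
]


def query_shard(rows, min_score):
    """Hypothetical per-shard query: the filter a real shard would run itself."""
    return [row for row in rows if row["score"] >= min_score]


def go_fish(shards, min_score):
    """Ask every shard the same question in parallel and combine the answers."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(query_shard, shards, [min_score] * len(shards))
        return [row for partial in partials for row in partial]


if __name__ == "__main__":
    print(go_fish(SHARDS, 70))
```

Each shard only scans its own small data set, so no shard waits on another's locks, disk or memory. In a real deployment the per-shard function would issue the query over that shard's own connection.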

Page 19: Scaling Your Database In The Cloud

Breaking the Cloud I/O Barrier

Page 20: Scaling Your Database In The Cloud

Database types

  Traditional RDBMS
    Proven
    SQL language…
      …although ORM can change that
    Durable, transactional, dependable…
      …but inflexible

  NoSQL
    Typically in-memory…
      …the real speed advantage
    Various flavors…
      Key/Value Store
      Document database
      Hybrid

Page 21: Scaling Your Database In The Cloud

Black box vs. Application-Aware Sharding

  Both utilize sharding on a data key…
    …typically modulus on a value or consistent hash

  Black box sharding is automatic…
    …attempts to evenly shard across all available servers
    …no developer visibility or control
    …can work acceptably for simple, non-indexed NoSQL data stores
    …easily supports single-row/object results

  Application-Aware sharding is defined by the developer…
    …selective sharding of large data
    …explicit developer control and visibility
    …tunable as the database grows and matures
    …more efficient for result sets and searchable queries
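The two key-routing schemes named on this slide, modulus on a value and consistent hash, can be sketched as follows. This is a minimal illustration, not the routing code of any particular product. The helper names (`stable_hash`, `modulus_shard`, `ConsistentHashRing`), the use of MD5 as the hash, and the 64 virtual nodes per shard are all assumptions made for the example.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def modulus_shard(key: str, shard_count: int) -> int:
    """Modulus routing: shard index is the key hash modulo the shard count."""
    return stable_hash(key) % shard_count


class ConsistentHashRing:
    """Consistent hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, shards, vnodes=64):
        # Each shard owns several points on the ring to even out the spread.
        self._points = sorted(
            (stable_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._hashes = [point_hash for point_hash, _ in self._points]

    def shard_for(self, key: str) -> str:
        """Return the shard owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._points)
        return self._points[idx][1]


if __name__ == "__main__":
    ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
    print(modulus_shard("user:42", 4), ring.shard_for("user:42"))
```

The design difference shows up when the shard count changes. With modulus routing, changing `shard_count` reassigns most keys. With the ring, adding a shard only moves the keys that now fall to the new shard's points, and every other key stays where it was.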

Page 22: Scaling Your Database In The Cloud

How Database Sharding works for NoSQL

Page 23: Scaling Your Database In The Cloud

How Database Sharding works for SQL

Global Tables

Primary Shard Table

Shard Child Tables
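The slide's diagram is not in the transcript, so the sketch below is only one plausible reading of the three table types. It assumes that global tables are copied to every shard, that the primary shard table is routed by its shard key, and that child rows follow the key of their parent row. The schema (`country`, `customer`, `customer_order`), the shard count and the function name are hypothetical.

```python
SHARD_COUNT = 4  # assumed number of shards for the example

# Hypothetical schema: which role each table plays in the sharding scheme.
TABLE_KINDS = {
    "country": "global",         # small reference data, assumed present on all shards
    "customer": "primary",       # primary shard table, routed by customer_id
    "customer_order": "child",   # child table, routed by its parent's customer_id
}


def target_shards(table, row):
    """Return the list of shard indexes that should store this row."""
    kind = TABLE_KINDS[table]
    if kind == "global":
        return list(range(SHARD_COUNT))
    # Primary and child rows share the same shard key, so related rows
    # land on the same shard and can be joined locally.
    return [row["customer_id"] % SHARD_COUNT]


if __name__ == "__main__":
    print(target_shards("country", {"code": "US"}))
    print(target_shards("customer_order", {"order_id": 7, "customer_id": 10}))
```

Keeping a customer and that customer's orders on one shard is what lets ordinary SQL joins keep working inside a shard.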

Page 24: Scaling Your Database In The Cloud

How Database Sharding works for SQL

Page 25: Scaling Your Database In The Cloud

What about Cross-Shard result sets?

Page 26: Scaling Your Database In The Cloud

More on Cross-Shard result sets…

  Black Box approach requires “scatter gather” for multi-row/object result set…  …common with NoSQL engines  …forces use of denormalized lists  …must be maintained by developer/application code  …Map Reduce processing helps with this (non-realtime)

  Application-Aware provides access to meaningful result sets of related data…  …aggregation, sort easier to perform  …logical search operations more natural
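To make the aggregation and sort points concrete, here is a minimal sketch of combining per-shard results into one cross-shard result set. It assumes each shard has already sorted its own rows by `score` in descending order and returned its own row count. The function names and row layout are invented for the example.

```python
import heapq
from itertools import islice


def top_n_across_shards(sorted_partials, n):
    """Merge per-shard lists (each sorted by score, descending) and keep the top n."""
    merged = heapq.merge(
        *sorted_partials, key=lambda row: row["score"], reverse=True
    )
    return list(islice(merged, n))


def count_across_shards(partial_counts):
    """A cross-shard COUNT is the sum of the per-shard counts."""
    return sum(partial_counts)


if __name__ == "__main__":
    partials = [
        [{"id": 4, "score": 95}, {"id": 1, "score": 40}],
        [{"id": 2, "score": 70}],
        [{"id": 3, "score": 88}, {"id": 6, "score": 12}],
    ]
    print(top_n_across_shards(partials, 3))
    print(count_across_shards([len(p) for p in partials]))
```

Because each shard does its own sorting and counting, the merge step only touches the rows it actually returns, which is why sorts and aggregates stay cheap when the shards hold meaningful, related result sets.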

Page 27: Scaling Your Database In The Cloud

What about High-Availability?

  Can you afford to take your databases offline:  …for scheduled maintenance?  …for unplanned failure?  …can you accept some lost transactions?

  By definition Database Sharding adds failure points to the data tier

  A proven High-Availability strategy is a must…  …system outages…especially in the cloud  …planned maintenance…a necessity for all database engines

Page 28: Scaling Your Database In The Cloud

Traditional asynch replication is inadequate…

  Replication “Lag” can result in lost transactions
  No rapid failover for continuous operation
  Maintenance difficult in production environments
  Used by many NoSQL engines
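The lost-transaction risk from replication lag can be shown with a toy model. This is a simplified simulation, not the behavior of any specific database engine: transactions are list entries, "shipping" copies them to the replica, and the class and method names are hypothetical.

```python
class AsyncReplicatedPrimary:
    """Toy model of a primary that acknowledges writes before replicating them."""

    def __init__(self):
        self.primary = []   # transactions committed on the primary
        self.replica = []   # transactions the replica has received
        self.pending = []   # committed and acknowledged, but not yet shipped

    def write(self, txn):
        """Commit on the primary and acknowledge the client immediately."""
        self.primary.append(txn)
        self.pending.append(txn)

    def ship(self, count):
        """Replicate up to `count` pending transactions (this is the lag)."""
        shipped, self.pending = self.pending[:count], self.pending[count:]
        self.replica.extend(shipped)

    def fail_primary(self):
        """Primary dies: whatever was still pending never reaches the replica."""
        lost = list(self.pending)
        self.pending.clear()
        return lost


if __name__ == "__main__":
    db = AsyncReplicatedPrimary()
    for txn in range(1, 6):
        db.write(txn)
    db.ship(3)
    print("replica has:", db.replica, "lost on failure:", db.fail_primary())
```

Every client in this model was told its write succeeded, yet the transactions still in `pending` at failure time exist nowhere after failover. A scheme that acknowledges only after a second instance has the write avoids this, at the cost of waiting for that second instance.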

Page 29: Scaling Your Database In The Cloud

Database Sharding and High-Availability

  Need a minimum of 2 active-active instances per shard…
    …3 active-active instances for in-memory databases

  Support for…
    …fail-down
    …failover
    …maintenance
    …automated backup/restore
    …monitoring

Page 30: Scaling Your Database In The Cloud

Database Sharding…elastic shards

Page 31: Scaling Your Database In The Cloud

Database Sharding…elastic shards

  Expand the number of shards…  …divide a single shard into N new shards

  Contract the number of shards…  …consolidate N shards into a single shard

  Due to large data sizes, this takes time…  …regardless of data architecture

  Requires High-Availability to ensure no downtime…  …perform scaling on live replica
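Expanding and contracting can be sketched as two small operations on a shard's rows. This is an illustration under simple assumptions: rows are dictionaries, the shard key is an integer `user_id`, and the split uses modulus on that key. `split_shard` and `consolidate_shards` are hypothetical names.

```python
def split_shard(rows, n, key="user_id"):
    """Divide one shard's rows into n new shards using modulus on the shard key."""
    new_shards = [[] for _ in range(n)]
    for row in rows:
        new_shards[row[key] % n].append(row)
    return new_shards


def consolidate_shards(shards):
    """Merge the rows of several shards back into a single shard."""
    return [row for shard in shards for row in shard]


if __name__ == "__main__":
    original = [{"user_id": i} for i in range(10)]
    parts = split_shard(original, 3)
    print([len(part) for part in parts])
    print(len(consolidate_shards(parts)))
```

Both functions have to touch every row, which is the reason the slide notes that resizing takes time with large data sets. Running the copy against a live replica, as the slide suggests, lets the original shard keep serving traffic while the new shards are built.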

Page 32: Scaling Your Database In The Cloud

Database Sharding…the future

  Ability to leverage proven database engines…   …SQL   …NoSQL   …Caching

  Allow developers to select the best database engine for a given set of application requirements…   …seamless context-switching within the application   …use the API of choice

  Improved management…   …monitoring   …configuration “on-the-fly”   …dynamic elastic shards based on demand

Page 33: Scaling Your Database In The Cloud

Database Sharding summary

  Database Sharding is the most effective tool for scaling your database tier in the cloud…   …or anywhere

  Use Database Sharding for any type of engine…   …SQL   …NoSQL   …Cache

  Use the best engine for a given application requirement…   …avoid the “one-size-fits-all” trap   …each engine has its strengths to capitalize on

  Ensure your High-Availability is proven and bulletproof…   …especially critical on the cloud   …must support failure and maintenance

Page 34: Scaling Your Database In The Cloud

Questions/Answers

Cory Isaacson CodeFutures Corporation [email protected] http://www.dbshards.com