Scaling MySQL: Benefits of Automatic Data Distribution

Webinar: Scaling MySQL: Benefits of Automatic Data Distribution

December 13, 2012

Agenda

1. Who We Are

2. The Scalability Problem

3. Benefits of Automatic Data Distribution

4. Customer ROI/Case Studies

5. Q & A

(please type questions directly into the GoToWebinar side panel)

Who We Are

Presenters:

• Paul Campaniello, VP of Global Marketing. A 25-year technology veteran with marketing experience at Mendix, Lumigent, Savantis and Precise.

• Doron Levari, Founder. A technologist and long-time veteran of the database industry. Prior to founding ScaleBase, Doron was CEO of Aluna.

Pain Points – The Scalability Problem

• Thousands of new online and mobile apps launching every day

• Demand climbs for these apps and databases can’t keep up

• Apps must provide uninterrupted access and availability

• Database performance and scalability are critical

Big Data = Big Scaling Needs

The 451 Group & Teradata

Big Data = Transactions + Interactions + Observations

[Figure: nested tiers labeled ERP, CRM, WEB and BIG DATA, with a size scale of Megabytes, Gigabytes, Terabytes and Petabytes and an axis labeled "Increasing Data Variety and Complexity". Data types shown: Purchase Detail, Purchase Record, Payment Record, Segmentation, Offer Details, Customer Touches, Support Contacts, Web Logs, Offer History, A/B Testing, Dynamic Pricing, Affiliate Networks, Search Marketing, Behavioral Targeting, Dynamic Funnels, Sensors/RFID/Devices, User Click Stream, Mobile Web, Sentiment, User Generated Content, Social Interactions & Feeds, Spatial & GPS Coordinates, External Demographics, Business Data Feeds, HD Video, Audio, Images, Speech to Text, Product/Service Logs, SMS/MMS.]

Scalability Pain

[Figure: infrastructure cost ($) over time, comparing Predicted Demand, Traditional Hardware, Actual Demand and Dynamic Scaling. Annotations: "Large Capital Expenditure", "Opportunity Cost", "You just lost customers".]

Ongoing “Scaling MySQL” Series

• August 16 & September 20, 2012

– Scaling MySQL: Scale Up versus Scale Out

• October 23, 2012

– Methods and Challenges to Scale Out MySQL

• Today

– Benefits of Automatic Data Distribution

• January 17, 2013

– Catch-22 of Read-Write Splitting

The Database Engine is the Bottleneck...

• Every write operation is at least 4 write operations inside the DB:

– Data segment

– Index segment

– Undo segment

– Transaction log

• And multiple activities in the DB engine's memory:

– Buffer management

– Locking

– Thread locks/semaphores

– Recovery tasks

Now multiply by 10 TB accessed by 10,000 concurrent sessions

COI – Customer, Order, Item

CUSTOMER

C_ID  NAME     LOCATION  RANK
1     John     MA        10
2     James    AL        9
3     Peter    CA        10
4     Chris    FL        8
5     Oliver   MA        9
6     Allan    MA        9
7     Janette  CA        8
8     David    MD        10

ORDER

O_ID  C_ID  DATE
1     1     2012-02-01
2     1     2012-02-01
3     2     2012-02-01
4     6     2012-02-01
5     6     2012-02-01
6     8     2012-02-01

ORDER_ITEM

OI_ID  O_ID  QUANT  I_ID
1      1     3      1
2      1     6      2
3      2     4      1
4      2     2      2
5      2     1      5
6      3     1      1
7      3     6      5
8      4     8      3
9      4     9      4
10     5     2      6
11     6     1      5

ITEM

I_ID  NAME
1     iPhone
2     iPad
3     iPad Mini
4     Kindle
5     Kindle Fire
6     Galaxy S3
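To make the COI model concrete, the four tables can be loaded into a throwaway database and queried. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for MySQL (an assumption, so the example runs anywhere). The column types and the two queries are illustrative and are not taken from the webinar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
conn.executescript("""
CREATE TABLE CUSTOMER (C_ID INTEGER PRIMARY KEY, NAME TEXT, LOCATION TEXT, "RANK" INTEGER);
CREATE TABLE "ORDER" (O_ID INTEGER PRIMARY KEY, C_ID INTEGER, "DATE" TEXT);
CREATE TABLE ITEM (I_ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE ORDER_ITEM (OI_ID INTEGER PRIMARY KEY, O_ID INTEGER, QUANT INTEGER, I_ID INTEGER);
""")

# Sample rows copied from the slide.
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?, ?)", [
    (1, "John", "MA", 10), (2, "James", "AL", 9), (3, "Peter", "CA", 10),
    (4, "Chris", "FL", 8), (5, "Oliver", "MA", 9), (6, "Allan", "MA", 9),
    (7, "Janette", "CA", 8), (8, "David", "MD", 10),
])
conn.executemany('INSERT INTO "ORDER" VALUES (?, ?, ?)', [
    (1, 1, "2012-02-01"), (2, 1, "2012-02-01"), (3, 2, "2012-02-01"),
    (4, 6, "2012-02-01"), (5, 6, "2012-02-01"), (6, 8, "2012-02-01"),
])
conn.executemany("INSERT INTO ITEM VALUES (?, ?)", [
    (1, "iPhone"), (2, "iPad"), (3, "iPad Mini"),
    (4, "Kindle"), (5, "Kindle Fire"), (6, "Galaxy S3"),
])
conn.executemany("INSERT INTO ORDER_ITEM VALUES (?, ?, ?, ?)", [
    (1, 1, 3, 1), (2, 1, 6, 2), (3, 2, 4, 1), (4, 2, 2, 2), (5, 2, 1, 5),
    (6, 3, 1, 1), (7, 3, 6, 5), (8, 4, 8, 3), (9, 4, 9, 4), (10, 5, 2, 6),
    (11, 6, 1, 5),
])

# Illustrative query 1: customers with rank 9 and up.
top_customers = [row[0] for row in conn.execute(
    'SELECT NAME FROM CUSTOMER WHERE "RANK" >= 9 ORDER BY C_ID')]

# Illustrative query 2: a join across CUSTOMER, ORDER and ORDER_ITEM.
units_per_customer = conn.execute("""
    SELECT c.NAME, SUM(oi.QUANT)
    FROM CUSTOMER c
    JOIN "ORDER" o ON o.C_ID = c.C_ID
    JOIN ORDER_ITEM oi ON oi.O_ID = o.O_ID
    GROUP BY c.C_ID, c.NAME
    ORDER BY c.C_ID
""").fetchall()

print(top_customers)
print(units_per_customer)
```

On a single database both queries are trivial. They become interesting once the rows are spread over several databases.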

Requirements

• Every day:

• Updates

– 30,000 new customers

– 1,000,000 new orders, average of 5 items per order

– Items catalog is updated once a day, nightly at 11 pm

• Queries

– Top customers (rank 9 and up)

– New orders, joins across the board…

Throughput

Latency
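The daily figures above translate into a row-insert rate with some simple arithmetic. The sketch below is a back-of-the-envelope estimate only: it assumes the load is spread evenly over 24 hours (real traffic will peak well above the average), and it borrows the "at least 4 write operations" multiplier from the bottleneck slide as a lower bound.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures from the requirements slide.
new_customers_per_day = 30_000
new_orders_per_day = 1_000_000
avg_items_per_order = 5

# Each order carries about 5 ORDER_ITEM rows.
order_items_per_day = new_orders_per_day * avg_items_per_order

# Total rows inserted per day across CUSTOMER, ORDER and ORDER_ITEM.
rows_inserted_per_day = (
    new_customers_per_day + new_orders_per_day + order_items_per_day
)

# Average rate, assuming an even spread over the day (assumption).
avg_inserts_per_second = rows_inserted_per_day / SECONDS_PER_DAY

# Lower bound on internal writes: data, index, undo and transaction log.
min_internal_writes_per_day = rows_inserted_per_day * 4

print(f"ORDER_ITEM rows per day: {order_items_per_day:,}")
print(f"Rows inserted per day:   {rows_inserted_per_day:,}")
print(f"Average inserts/second:  {avg_inserts_per_second:.1f}")
print(f"Internal writes per day: at least {min_internal_writes_per_day:,}")
```

That is roughly 6 million inserted rows a day, or about 70 per second on average, before counting the queries that run against the same tables.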

Splitting the data

• CUSTOMER – random (hash)

• ORDER – derivative (C_ID)

• ORDER_ITEM – transitive (O_ID -> C_ID)

• ITEM – global table
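The four placement policies can be sketched as routing functions. This is a minimal illustration, not the ScaleBase implementation: the function names are hypothetical, and a plain modulo on C_ID stands in for the "random (hash)" policy (an assumption chosen because it reproduces the three-database layout of the sample data).

```python
NUM_DATABASES = 3

# Parent lookups taken from the sample data: O_ID -> C_ID and OI_ID -> O_ID.
ORDER_CUSTOMER = {1: 1, 2: 1, 3: 2, 4: 6, 5: 6, 6: 8}
ORDER_ITEM_ORDER = {1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3,
                    7: 3, 8: 4, 9: 4, 10: 5, 11: 6}


def db_for_customer(c_id):
    """CUSTOMER: hash placement (here a simple modulo, as an assumption)."""
    return (c_id - 1) % NUM_DATABASES + 1


def db_for_order(o_id):
    """ORDER: derivative, follows its customer via C_ID."""
    return db_for_customer(ORDER_CUSTOMER[o_id])


def db_for_order_item(oi_id):
    """ORDER_ITEM: transitive, follows its order and so its customer."""
    return db_for_order(ORDER_ITEM_ORDER[oi_id])


def dbs_for_item(i_id):
    """ITEM: global table, a full copy lives on every database."""
    return list(range(1, NUM_DATABASES + 1))


# Build the resulting placement for the sample rows.
placement = {db: {"customers": [], "orders": [], "order_items": []}
             for db in range(1, NUM_DATABASES + 1)}
for c_id in range(1, 9):
    placement[db_for_customer(c_id)]["customers"].append(c_id)
for o_id in ORDER_CUSTOMER:
    placement[db_for_order(o_id)]["orders"].append(o_id)
for oi_id in ORDER_ITEM_ORDER:
    placement[db_for_order_item(oi_id)]["order_items"].append(oi_id)

for db, rows in placement.items():
    print(f"DB - {db}: {rows}")
```

Because orders and order items follow their customer, a join from CUSTOMER through ORDER to ORDER_ITEM never has to leave one database, and the replicated ITEM table keeps the last join local as well.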

Sliced Database


DB - 1

CUSTOMER
C_ID  NAME     LOCATION  RANK
1     John     MA        10
4     Chris    FL        8
7     Janette  CA        8

ORDER
O_ID  C_ID  DATE
1     1     2012-02-01
2     1     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
1      1     3      1
2      1     6      2
3      2     4      1
4      2     2      2
5      2     1      5

ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3

DB - 2

CUSTOMER
C_ID  NAME     LOCATION  RANK
2     James    AL        9
5     Oliver   MA        9
8     David    MD        10

ORDER
O_ID  C_ID  DATE
3     2     2012-02-01
6     8     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
6      3     1      1
7      3     6      5
11     6     1      5

ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3

DB - 3

CUSTOMER
C_ID  NAME     LOCATION  RANK
3     Peter    CA        10
6     Allan    MA        9

ORDER
O_ID  C_ID  DATE
4     6     2012-02-01
5     6     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
8      4     8      3
9      4     9      4
10     5     2      6

ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3

Requirements

• Every day:

• Updates

– 30,000 new customers

– 1,000,000 new orders, average of 5 items per order

– Items catalog is updated once a day, nightly at 11 pm

• Queries

– Top customers (rank 9 and up)

– New orders, joins across the board…

Throughput

Distribution

Parallelism

Latency

Automatic Data Distribution

• The ultimate way to scale

• Provides significant performance improvements

• The only way to really improve both reads and writes

• Good for scaling high session-volume reads and writes

• Good for scaling high data-volume reads and writes

• Home-grown implementations have drawbacks
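The slides do not list the drawbacks of home-grown implementations. One commonly encountered example, offered here as an assumption rather than the presenters' own point, is that a naive modulo placement forces most rows to move whenever a database is added. The sketch below counts the keys that change database when a hypothetical cluster grows from 3 to 4 databases.

```python
def db_for_key(key, num_databases):
    """Naive home-grown placement: key modulo the number of databases."""
    return key % num_databases


def keys_that_move(keys, old_count, new_count):
    """Keys whose database changes when the cluster is resized."""
    return [k for k in keys
            if db_for_key(k, old_count) != db_for_key(k, new_count)]


keys = range(12_000)  # hypothetical key space
moved = keys_that_move(keys, old_count=3, new_count=4)
fraction_moved = len(moved) / len(keys)

print(f"{len(moved):,} of {len(keys):,} keys move ({fraction_moved:.0%})")
```

With this scheme three quarters of the keys relocate for a single added database, which is why resizing a hand-rolled sharded setup tends to mean planned downtime or careful migration tooling.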

Scale Out Features and Benefits

Feature: Benefit

• Parallel query execution: Great performance of cross-db queries & maintenance commands

• Query result aggregation: Support of sophisticated cross-db queries, even with ORDER BY, GROUP BY, LIMIT, aggregate functions…

• Online data redistribution: Flexibility (no need to over-provision); no downtime

• 100% compatible MySQL proxy: Applications unmodified; standard MySQL tools and interfaces

• MySQL databases untouched: Data is safe within MySQL (InnoDB/MyISAM/any)

• Data distribution review and analysis: Optimization of data distribution policy

• Data consistency verifier: Validate system-wide data consistency

• Real-time monitoring and alerts: Simplify management, reduce TCO
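Parallel query execution and result aggregation can be illustrated with a small scatter-gather sketch. It is a toy model under stated assumptions: in-memory lists stand in for the three MySQL databases, the function names are hypothetical, and nothing here reflects how ScaleBase is implemented. Each database answers the query for its own rows in parallel, and a coordinator merges the partial results for ORDER BY with LIMIT and sums the partial counts for GROUP BY.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
import heapq

# CUSTOMER rows (C_ID, NAME, LOCATION, RANK) as placed on DB-1..DB-3.
SHARDS = [
    [(1, "John", "MA", 10), (4, "Chris", "FL", 8), (7, "Janette", "CA", 8)],
    [(2, "James", "AL", 9), (5, "Oliver", "MA", 9), (8, "David", "MD", 10)],
    [(3, "Peter", "CA", 10), (6, "Allan", "MA", 9)],
]


def sort_key(row):
    """ORDER BY RANK DESC, C_ID ASC."""
    return (-row[3], row[0])


def top_on_shard(rows, min_rank, limit):
    """Local part: filter, sort and cut to LIMIT on one database."""
    hits = [r for r in rows if r[3] >= min_rank]
    return sorted(hits, key=sort_key)[:limit]


def top_customers(min_rank, limit):
    """Run the local part on every shard in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(
            lambda rows: top_on_shard(rows, min_rank, limit), SHARDS))
    # Each partial list is already sorted, so a k-way merge is enough.
    return list(islice(heapq.merge(*partials, key=sort_key), limit))


def customers_per_location():
    """GROUP BY LOCATION with COUNT(*): sum the per-shard counts."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(
            lambda rows: Counter(r[2] for r in rows), SHARDS))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


top3 = top_customers(min_rank=9, limit=3)
by_location = customers_per_location()
print(top3)
print(by_location)
```

Pushing the filter, sort and LIMIT down to each database keeps the merged result small. Aggregates such as COUNT and SUM combine by simple addition, while an average would need the per-shard sums and counts rather than per-shard averages.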

Scale Out Provides Immediate & Tangible Value

[Figure: two Application Servers, BI and Management clients, and four databases (Database A to Database D), each paired with a standby (Standby A to Standby D).]

Typical Scale Out (ScaleBase) Deployment

[Figure: deployment diagram showing two Application Servers, BI and Management clients; the ScaleBase Data Traffic Manager and ScaleBase Central Management; and Database A to Database D, each paired with a standby (Standby A to Standby D).]

Choose Your Scale-out Path

[Figure: chart with "# of concurrent sessions" on one axis and "Database Size" on the other. Regions: "1 DB? Good for me!", "Read/Write Splitting" and "Data Distribution".]
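For the read/write splitting path, the core idea can be sketched as a statement router. This is a deliberately simplified illustration with hypothetical names: it assumes a single primary with two replicas and classifies statements by their leading keyword, which a real proxy would do far more carefully (transactions, session state, replication lag).

```python
import itertools


class ReadWriteSplitter:
    """Toy router: plain SELECTs go to replicas, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        statement = sql.strip().rstrip(";").strip().upper()
        # SELECT ... FOR UPDATE takes locks, so treat it as a write.
        is_plain_read = (statement.startswith("SELECT")
                         and not statement.endswith("FOR UPDATE"))
        return next(self._replicas) if is_plain_read else self.primary


router = ReadWriteSplitter("primary", ["replica-1", "replica-2"])
routes = [router.route(sql) for sql in [
    "SELECT NAME FROM CUSTOMER WHERE C_ID = 1",
    "UPDATE CUSTOMER SET LOCATION = 'NY' WHERE C_ID = 1",
    "SELECT * FROM ITEM",
    "SELECT * FROM CUSTOMER WHERE C_ID = 1 FOR UPDATE;",
    "SELECT COUNT(*) FROM CUSTOMER",
]]
print(routes)
```

This spreads read sessions across replicas but leaves every write on one database, and a read issued right after a write may reach a replica that has not yet applied it. Data distribution addresses the write side as well.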

Scaling Out Achieves Unlimited Scalability

[Figure: chart of throughput against Number of Databases. Series: Throughput (TPM), Total DB Size (MB), # Connections.]

Number of Databases  Throughput (TPM)
1                    6000
2                    12000
4                    24000
6                    36000
8                    48000
10                   60000
14                   84000

Other plotted values: 500, 500, 1000, 1500, 1500, 2000, 2500
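Reading the chart's data labels as throughput per database count (an assumption about how the seven labels pair with the seven x-axis values), a quick check shows that the reported throughput grows linearly with the number of databases.

```python
# Values read from the chart; the pairing is an assumption.
databases = [1, 2, 4, 6, 8, 10, 14]
throughput_tpm = [6000, 12000, 24000, 36000, 48000, 60000, 84000]

# Throughput contributed by each database at every cluster size.
per_database = [tpm / n for n, tpm in zip(databases, throughput_tpm)]

# Overall speedup from the smallest to the largest configuration.
speedup = throughput_tpm[-1] / throughput_tpm[0]

for n, tpm, each in zip(databases, throughput_tpm, per_database):
    print(f"{n:>2} databases: {tpm:>6} TPM ({each:.0f} TPM per database)")
print(f"Speedup from 1 to 14 databases: {speedup:.0f}x")
```

Every configuration works out to 6000 TPM per database, so 14 databases deliver 14 times the throughput of one in this benchmark.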

Detailed Scale Out Case Studies

Nokia

• Device Apps App

• Availability

• Scalability

• Geo-clustering

• 100 Apps

• 300 MySQL DB

Solar Edge

• Next Gen Monitoring App

• Massive Scale

• Monitors real time data from thousands of distributed systems

Mozilla

• New Product/ Next Gen App/ AppStore

• Scalability

• Geo-sharding

AppDynamics

• Next gen APM company

• Scalability for the Netflix implementation

Summary

• Database scalability is a significant problem

– App explosion, Big Data, Mobile

• Scale Up helps somewhat, but Scale Out provides a long-term, cost-effective solution

• ScaleBase has an effective Scale Out solution with a proven ROI

– Improves performance & requires NO changes to your existing infrastructure

• Choose your scale-out path...

– The ScaleBase platform enables you to start with R/W splitting and grow into automatic data distribution

Questions (please enter directly into the GTW side panel)

617.630.2800

www.ScaleBase.com

Thank You
