Webinar: Scaling MySQL Benefits of Automatic Data Distribution December 13, 2012
Jan 26, 2015
2
Agenda
1. Who We Are
2. The Scalability Problem
3. Benefits of Automatic Data Distribution
4. Customer ROI/Case Studies
5. Q & A
(please type questions directly into the GoToWebinar side panel)
3
Who We Are
Presenters:
Paul Campaniello, VP of Global Marketing. A 25-year technology veteran with marketing experience at Mendix, Lumigent, Savantis and Precise.
Doron Levari, Founder. A technologist and long-time veteran of the database industry. Prior to founding ScaleBase, Doron was CEO of Aluna.
4
Pain Points – The Scalability Problem
• Thousands of new online and mobile apps launching every day
• Demand climbs for these apps and databases can’t keep up
• Apps must provide uninterrupted access and availability
• Database performance and scalability are critical
5
Big Data = Big Scaling Needs
The 451 Group & Teradata
Big Data = Transactions + Interactions + Observations
[Figure: data volume (megabytes, gigabytes, terabytes, petabytes) plotted against increasing data variety and complexity. Layers, from smallest to largest: ERP, CRM, Web, Big Data.]
Example data types, in order of increasing variety and complexity: Purchase Detail, Purchase Record, Payment Record, Segmentation, Offer Details, Customer Touches, Support Contacts, Web Logs, Offer History, A/B Testing, Dynamic Pricing, Affiliate Networks, Search Marketing, Behavioral Targeting, Dynamic Funnels, Sensors/RFID/Devices, User Click Stream, Mobile Web, Sentiment, User Generated Content, Social Interactions & Feeds, Spatial & GPS Coordinates, External Demographics, Business Data Feeds, HD Video/Audio/Images, Speech to Text, Product/Service Logs, SMS/MMS
6
Scalability Pain
[Figure: infrastructure cost ($) over time. Curves: Predicted Demand, Actual Demand, Traditional Hardware, Dynamic Scaling. Annotations: Large Capital Expenditure, Opportunity Cost, "You just lost customers".]
7
Ongoing “Scaling MySQL” Series
• August 16 & September 20, 2012
– Scaling MySQL: ScaleUp versus Scale Out
• October 23, 2012
– Methods and challenges to Scale out MySQL
• Today
– Benefits of Automatic Data Distribution
• January 17, 2013
– Catch 22 of read-write splitting
8
The Database Engine is the Bottleneck...
• Every write operation is at least 4 write operations inside the DB:
– Data segment
– Index segment
– Undo segment
– Transaction log
• And multiple activities in the DB engine memory:
– Buffer management
– Locking
– Thread locks/semaphores
– Recovery tasks
9
The Database Engine is the Bottleneck
Now multiply by 10 TB accessed by 10,000 concurrent sessions
10
COI – Customer, Order, Item
CUSTOMER
C_ID  NAME     LOCATION  RANK
1     John     MA        10
2     James    AL        9
3     Peter    CA        10
4     Chris    FL        8
5     Oliver   MA        9
6     Allan    MA        9
7     Janette  CA        8
8     David    MD        10

ORDER
O_ID  C_ID  DATE
1     1     2012-02-01
2     1     2012-02-01
3     2     2012-02-01
4     6     2012-02-01
5     6     2012-02-01
6     8     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
1      1     3      1
2      1     6      2
3      2     4      1
4      2     2      2
5      2     1      5
6      3     1      1
7      3     6      5
8      4     8      3
9      4     9      4
10     5     2      6
11     6     1      5

ITEM
I_ID  NAME
1     iPhone
2     iPad
3     iPad Mini
4     Kindle
5     Kindle Fire
6     Galaxy S3
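The COI model above can be tried out on a single database before any splitting. The following is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for MySQL. The lowercase table and column names, the table name `orders` (used because ORDER is a reserved word in SQL), and the helper functions `build_coi_database` and `order_lines` are assumptions made for illustration; only the sample rows come from the slide.

```python
import sqlite3


def build_coi_database() -> sqlite3.Connection:
    """Create an in-memory database holding the sample COI rows from the slide."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (c_id INTEGER PRIMARY KEY, name TEXT, location TEXT, rank INTEGER);
        CREATE TABLE orders (o_id INTEGER PRIMARY KEY, c_id INTEGER, date TEXT);
        CREATE TABLE item (i_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE order_item (oi_id INTEGER PRIMARY KEY, o_id INTEGER, quant INTEGER, i_id INTEGER);
        """
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", [
        (1, "John", "MA", 10), (2, "James", "AL", 9), (3, "Peter", "CA", 10),
        (4, "Chris", "FL", 8), (5, "Oliver", "MA", 9), (6, "Allan", "MA", 9),
        (7, "Janette", "CA", 8), (8, "David", "MD", 10),
    ])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
        (1, 1, "2012-02-01"), (2, 1, "2012-02-01"), (3, 2, "2012-02-01"),
        (4, 6, "2012-02-01"), (5, 6, "2012-02-01"), (6, 8, "2012-02-01"),
    ])
    conn.executemany("INSERT INTO item VALUES (?, ?)", [
        (1, "iPhone"), (2, "iPad"), (3, "iPad Mini"),
        (4, "Kindle"), (5, "Kindle Fire"), (6, "Galaxy S3"),
    ])
    conn.executemany("INSERT INTO order_item VALUES (?, ?, ?, ?)", [
        (1, 1, 3, 1), (2, 1, 6, 2), (3, 2, 4, 1), (4, 2, 2, 2), (5, 2, 1, 5),
        (6, 3, 1, 1), (7, 3, 6, 5), (8, 4, 8, 3), (9, 4, 9, 4), (10, 5, 2, 6),
        (11, 6, 1, 5),
    ])
    return conn


def order_lines(conn: sqlite3.Connection, o_id: int) -> list:
    """Join all four tables to list (customer, item, quantity) for one order."""
    return conn.execute(
        """
        SELECT c.name, i.name, oi.quant
        FROM order_item AS oi
        JOIN orders AS o ON o.o_id = oi.o_id
        JOIN customer AS c ON c.c_id = o.c_id
        JOIN item AS i ON i.i_id = oi.i_id
        WHERE o.o_id = ?
        ORDER BY oi.oi_id
        """,
        (o_id,),
    ).fetchall()


conn = build_coi_database()
print(order_lines(conn, 2))
```

A join like this touches every table in the model, which is why the way rows are placed across databases matters once the data no longer fits on one server.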
11
Requirements
• Every day:
• Updates
– 30,000 new customers
– 1,000,000 new orders, average of 5 items per order
– Item catalog is updated once a day, nightly at 11 pm
• Queries
– Top customers (rank 9 and up)
– New orders, joins across the board…
Throughput
Latency
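To put the daily update volumes into perspective, a rough back-of-the-envelope calculation helps. The sketch below assumes that each new customer, order, and order item is one row insert, that load is spread evenly over 24 hours (real traffic will peak well above the average), and that each insert costs the minimum of 4 internal writes listed on the "Database Engine is the Bottleneck" slide. The function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

NEW_CUSTOMERS_PER_DAY = 30_000
NEW_ORDERS_PER_DAY = 1_000_000
ITEMS_PER_ORDER = 5            # average, from the requirements
INTERNAL_WRITES_PER_ROW = 4    # data, index, undo, transaction log (lower bound)


def daily_row_inserts() -> int:
    """Rows inserted per day across CUSTOMER, ORDER and ORDER_ITEM."""
    order_items = NEW_ORDERS_PER_DAY * ITEMS_PER_ORDER
    return NEW_CUSTOMERS_PER_DAY + NEW_ORDERS_PER_DAY + order_items


def average_rate(rows_per_day: int) -> float:
    """Average rows per second, assuming an even spread over the day."""
    return rows_per_day / SECONDS_PER_DAY


rows = daily_row_inserts()
print(rows)                                                     # 6030000
print(round(average_rate(rows), 1))                             # 69.8
print(round(average_rate(rows) * INTERNAL_WRITES_PER_ROW, 1))   # 279.2
```

Even as a daily average this is about 70 row inserts per second, or roughly 280 internal writes per second, before any queries are served.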
12
Splitting the data
• CUSTOMER – random (hash)
• ORDER – derivative (C_ID)
• ORDER_ITEM – transitive (O_ID -> C_ID)
• ITEM – global table
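The four policies can be sketched as a small routing function. This is an illustration only, not ScaleBase's actual algorithm: a simple modulo on C_ID stands in for the "random (hash)" policy, chosen because it reproduces a three-database layout for the sample data, and the two lookup dictionaries stand in for the foreign keys that a real system would resolve from the query or the data. All function names are hypothetical.

```python
NUM_DATABASES = 3

# Foreign-key lookups taken from the sample COI rows (hypothetical in-memory stand-ins).
ORDER_TO_CUSTOMER = {1: 1, 2: 1, 3: 2, 4: 6, 5: 6, 6: 8}
ORDER_ITEM_TO_ORDER = {1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3, 7: 3, 8: 4, 9: 4, 10: 5, 11: 6}


def db_for_customer(c_id: int) -> int:
    """CUSTOMER: hash policy (here a plain modulo, an assumption)."""
    return (c_id - 1) % NUM_DATABASES + 1


def db_for_order(o_id: int) -> int:
    """ORDER: derivative policy, the order lives with its customer."""
    return db_for_customer(ORDER_TO_CUSTOMER[o_id])


def db_for_order_item(oi_id: int) -> int:
    """ORDER_ITEM: transitive policy, O_ID -> C_ID -> database."""
    return db_for_order(ORDER_ITEM_TO_ORDER[oi_id])


def dbs_for_item(i_id: int) -> list:
    """ITEM: global table, every database holds a full copy."""
    return list(range(1, NUM_DATABASES + 1))


print(db_for_customer(7))      # 1
print(db_for_order(6))         # 2 (order 6 belongs to customer 8)
print(db_for_order_item(10))   # 3 (order item 10 -> order 5 -> customer 6)
print(dbs_for_item(4))         # [1, 2, 3]
```

Because orders and order items follow their customer, a join between CUSTOMER, ORDER and ORDER_ITEM for one customer never has to leave a single database, and the global ITEM table is available locally everywhere.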
13
Sliced Database
DB - 1

CUSTOMER
C_ID  NAME     LOCATION  RANK
1     John     MA        10
4     Chris    FL        8
7     Janette  CA        8

ORDER
O_ID  C_ID  DATE
1     1     2012-02-01
2     1     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
1      1     3      1
2      1     6      2
3      2     4      1
4      2     2      2
5      2     1      5

ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3

DB - 2

CUSTOMER
C_ID  NAME    LOCATION  RANK
2     James   AL        9
5     Oliver  MA        9
8     David   MD        10

ORDER
O_ID  C_ID  DATE
3     2     2012-02-01
6     8     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
6      3     1      1
7      3     6      5
11     6     1      5

ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3

DB - 3

CUSTOMER
C_ID  NAME   LOCATION  RANK
3     Peter  CA        10
6     Allan  MA        9

ORDER
O_ID  C_ID  DATE
4     6     2012-02-01
5     6     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
8      4     8      3
9      4     9      4
10     5     2      6

ITEM
I_ID  NAME
1     iPhone
…     …
6     Galaxy S3
14
Requirements
• Every day:
• Updates
– 30,000 new customers
– 1,000,000 new orders, average of 5 items per order
– Item catalog is updated once a day, nightly at 11 pm
• Queries
– Top customers (rank 9 and up)
– New orders, joins across the board…
Throughput
Distribution
Parallelism
Latency
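With the data sliced, a query such as "top customers, rank 9 and up" can run on every database at once and the partial results can be merged. The sketch below shows that scatter-gather pattern under simplifying assumptions: plain Python lists stand in for the three MySQL databases, a filter function stands in for the SQL statement each database would execute, and the names `top_customers_on_db` and `top_customers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# CUSTOMER rows per database as (C_ID, NAME, LOCATION, RANK), following the sliced layout.
CUSTOMERS_BY_DB = {
    "DB-1": [(1, "John", "MA", 10), (4, "Chris", "FL", 8), (7, "Janette", "CA", 8)],
    "DB-2": [(2, "James", "AL", 9), (5, "Oliver", "MA", 9), (8, "David", "MD", 10)],
    "DB-3": [(3, "Peter", "CA", 10), (6, "Allan", "MA", 9)],
}


def top_customers_on_db(rows: list, min_rank: int) -> list:
    """Stand-in for running the rank filter on one database."""
    return [row for row in rows if row[3] >= min_rank]


def top_customers(min_rank: int = 9) -> list:
    """Scatter the query to all databases in parallel, then gather and sort."""
    with ThreadPoolExecutor(max_workers=len(CUSTOMERS_BY_DB)) as pool:
        partials = pool.map(
            lambda rows: top_customers_on_db(rows, min_rank),
            CUSTOMERS_BY_DB.values(),
        )
        merged = [row for partial in partials for row in partial]
    # Final ordering happens once, on the merged result: highest rank first, then C_ID.
    return sorted(merged, key=lambda row: (-row[3], row[0]))


print([name for _, name, _, _ in top_customers()])
# ['John', 'Peter', 'David', 'James', 'Oliver', 'Allan']
```

Each database scans only its own slice, so the work per database shrinks as databases are added, while the merge step stays small because it only sees rows that already passed the filter.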
15
Automatic Data Distribution
• The ultimate way to scale
• Provides significant performance improvements
• The only way to truly improve both reads and writes
• Good for scaling high session-volume reads and writes
• Good for scaling high data-volume reads and writes
• Home-grown implementations have drawbacks
16
Scale Out Features and Benefits
Feature | Benefit
Parallel query execution | Great performance of cross-db queries & maintenance commands
Query result aggregation | Support of sophisticated cross-db queries, even with ORDER BY, GROUP BY, LIMIT, aggregate functions…
Online data redistribution | Flexibility: no need to over-provision; no downtime
100% compatible MySQL proxy | Applications unmodified; standard MySQL tools and interfaces
MySQL databases untouched | Data is safe within MySQL (InnoDB/MyISAM/any)
Data distribution review and analysis | Optimization of data distribution policy
Data consistency verifier | Validate system-wide data consistency
Real-time monitoring and alerts | Simplify management, reduce TCO
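Query result aggregation deserves a closer look, because cross-db GROUP BY and aggregate functions cannot simply be concatenated. The following sketch, which is an illustration and not ScaleBase's implementation, merges per-database partial results for a "customers per location" count with ORDER BY and LIMIT, and for an average rank. The data layout and the function names are assumptions; the rows are the sample customers split across three databases.

```python
from collections import Counter

# (LOCATION, RANK) of each customer, grouped by the database that holds the row.
CUSTOMERS_BY_DB = {
    "DB-1": [("MA", 10), ("FL", 8), ("CA", 8)],
    "DB-2": [("AL", 9), ("MA", 9), ("MD", 10)],
    "DB-3": [("CA", 10), ("MA", 9)],
}


def partial_group_count(rows: list) -> Counter:
    """Per-database GROUP BY location with COUNT(*)."""
    return Counter(location for location, _ in rows)


def partial_sum_count(rows: list) -> tuple:
    """Per-database SUM(rank) and COUNT(*), the pieces needed for a global AVG."""
    return sum(rank for _, rank in rows), len(rows)


def customers_per_location(limit: int) -> list:
    """Merge partial counts, then apply ORDER BY count DESC, location ASC and LIMIT."""
    total = Counter()
    for rows in CUSTOMERS_BY_DB.values():
        total.update(partial_group_count(rows))
    ordered = sorted(total.items(), key=lambda pair: (-pair[1], pair[0]))
    return ordered[:limit]


def average_rank() -> float:
    """Global AVG from partial sums and counts (not an average of averages)."""
    partials = [partial_sum_count(rows) for rows in CUSTOMERS_BY_DB.values()]
    return sum(s for s, _ in partials) / sum(n for _, n in partials)


print(customers_per_location(2))   # [('MA', 3), ('CA', 2)]
print(average_rank())              # 9.125
```

The average is the instructive case: averaging the three per-database averages would give about 9.17, whereas combining sums and counts gives the correct 9.125, so the aggregation layer has to rewrite AVG into SUM and COUNT before merging.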
17
Scale Out Provides Immediate & Tangible Value
[Diagram: two Application Servers, BI, and Management clients, alongside Databases A through D, each paired with its own standby (Standby A through D).]
18
Typical Scale Out (ScaleBase) Deployment
[Diagram: two Application Servers, BI, and Management clients; ScaleBase Data Traffic Manager and ScaleBase Central Management; Databases A through D, each paired with its own standby (Standby A through D).]
19
Choose Your Scale-out Path
[Chart: Database Size plotted against # of concurrent sessions. Regions: "1 DB? Good for me!", Read/Write Splitting, Data Distribution.]
20
Scaling Out Achieves Unlimited Scalability
[Chart: Throughput versus Number of Databases. Throughput axis from 0 to 160000. Legend: Throughput (TPM), Total DB Size (MB), # Connections.]
Number of Databases                      1     2      4      6      8      10     14
Throughput (TPM)                         6000  12000  24000  36000  48000  60000  84000
Second series (legend label not clear)   500   500    1000   1500   1500   2000   2500
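The chart's data labels can be checked for linear scaling with a few lines of arithmetic. The sketch below assumes that the series running from 6000 to 84000 is the throughput in TPM and that it pairs with the database counts 1, 2, 4, 6, 8, 10 and 14 in order; the function names are illustrative.

```python
DATABASES = [1, 2, 4, 6, 8, 10, 14]
THROUGHPUT_TPM = [6000, 12000, 24000, 36000, 48000, 60000, 84000]  # assumed pairing


def per_database_throughput(databases: list, throughput: list) -> list:
    """Throughput divided by the number of databases at each data point."""
    return [tpm / count for count, tpm in zip(databases, throughput)]


def scaling_efficiency(databases: list, throughput: list) -> list:
    """Measured throughput relative to perfect linear scaling from the first point."""
    baseline = throughput[0] / databases[0]
    return [tpm / (count * baseline) for count, tpm in zip(databases, throughput)]


print(per_database_throughput(DATABASES, THROUGHPUT_TPM))
# [6000.0, 6000.0, 6000.0, 6000.0, 6000.0, 6000.0, 6000.0]
print(scaling_efficiency(DATABASES, THROUGHPUT_TPM))
# [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

Under that reading, every added database contributes the same 6000 TPM, which is to say throughput scales linearly across the measured range of 1 to 14 databases.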
21
Detailed Scale Out Case Studies
Nokia
• Device Apps App
• Availability
• Scalability
• Geo-clustering
• 100 Apps
• 300 MySQL DB
Solar Edge
• Next Gen Monitoring App
• Massive Scale
• Monitors real time data from thousands of distributed systems
Mozilla
• New Product/ Next Gen App/ AppStore
• Scalability
• Geo-sharding
AppDynamics
• Next gen APM company
• Scalability for the Netflix implementation
22
Summary
• Database scalability is a significant problem
– App explosion, Big Data, Mobile
• Scale Up helps somewhat, but Scale Out provides a long-term, cost-effective solution
• ScaleBase has an effective Scale Out solution with a proven ROI
– Improves performance & requires NO changes to your existing infrastructure
• Choose your scale-out path...
– The ScaleBase platform enables you to start with R/W splitting and grow into automatic data distribution
23
Questions (please enter directly into the GTW side panel)
617.630.2800
www.ScaleBase.com
24
Thank You