MySQL High Availability and Disaster Recovery Featuring Continuent
Robert Hodges, January 2015
© 2015 VMware Inc. All rights reserved.
Jul 17, 2015
Continuent Quick Introduction
History
• 2004: Continuent established in USA
• 2009: 3rd-generation Continuent Tungsten (aka VMware Continuent) ships
• 2014: 100+ customers running business-critical applications
• Oct 2014: Acquisition by VMware; now part of the vCloud Air Business Unit
• Oct 2015: Continuent solutions available through VMware sales
Products
Industry-leading clustering and replication for open source DBMS
• Clustering: commercial-grade HA, performance scaling, and data management for MySQL
• Replication: flexible, high-performance data movement
Continuent Facts
Business-Critical Deployment Examples
• High Availability for MySQL: largest cluster deployment performs 800M+ transactions/day on 275 TB of relational data
• Business Continuity: cross-site cluster topologies widely deployed, including primary/DR and multi-master
• High Performance Replication: largest installations transfer billions of transactions daily using high-speed, parallel replication
• Heterogeneous Integration: customers replicate from MySQL to Oracle, Hadoop, Redshift, Vertica, and others
• Real-time Analytics: optimized data loading for data warehouses, with deployments of up to 200 MySQL masters feeding Hadoop
The Dream: multiple, active DBMS servers with identical data over distance
• High Availability
• Updates propagated immediately to all servers
• Transparent read/write to any server
• High Performance
Synchronous multi-master clusters claim to deliver on the dream
[Diagram: updates to table foo (id=1, data=6; id=1, data=5; id=7, data=25) pass through an ordering step that yields the sequence [1] id=1, data=6, [2] id=1, data=5, [3] id=7, data=25.]
Synchronous multi-master introduces new problems
[Diagram: the same updates to table foo pass through the ordering step as [1] id=1, data=6, [2] id=1, data=5, [3] id=7, data=25. The conflicting update id=1, data=5 is REJECTED.]
…That grow as data scale in volume and distance
• Transaction failures due to conflicts
• Operations like SELECT FOR UPDATE not supported
• Slow writes due to synchronous messaging
• Large transactions lock system or cause failures
• Cross-site replication is unstable
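The rejected update in the diagram can be illustrated with a small sketch. It assumes a simplified first-committer-wins certification scheme: every transaction records the row version it read, transactions are applied in one global order, and a transaction is rejected when its row changed after it was read. The names `Txn` and `certify` are hypothetical, and real synchronous multi-master products differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    seqno: int         # position in the global order
    row_id: int        # primary key in table foo
    data: int          # new value for the row
    read_version: int  # row version seen when the transaction started


def certify(txns):
    """Apply transactions in global order; reject writes based on stale reads."""
    versions = {}   # row_id -> current row version
    table = {}      # row_id -> data
    rejected = []   # seqnos of rejected transactions
    for t in sorted(txns, key=lambda t: t.seqno):
        current = versions.get(t.row_id, 0)
        if t.read_version != current:
            # Another transaction changed the row first: conflict.
            rejected.append(t.seqno)
            continue
        table[t.row_id] = t.data
        versions[t.row_id] = current + 1
    return table, rejected


# The three updates from the slide, all started against unmodified rows.
txns = [Txn(1, 1, 6, 0), Txn(2, 1, 5, 0), Txn(3, 7, 25, 0)]
table, rejected = certify(txns)
print(table, rejected)
```

Transactions [1] and [2] both target id=1, so the second one in the global order fails, and the application has to handle that failure itself.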
Can master/slave clusters offer the same benefits?
• High Availability?
• Updates propagated immediately?
• Transparent read/write to any server?
• High Performance?
Continuent Clustering: HA, DR and Performance Scaling
Benefits
• 24x7 data access
• SQL load balancing
• Simple management
• Off-the-shelf MySQL
[Diagram: two application stacks, each with a Continuent Connector, in front of db1 (master), db2 (slave), and db3 (slave). Each database node runs a Manager and a Replicator.]
Continuent clusters add HA and scaling without taking features away
Continuent Connector operates as an intelligent proxy to the DBMS
• Any MySQL client can connect
• Connector initiates connections on behalf of client to the DBMS
[Diagram: the application talks to the Connector, which talks to a MySQL master and two MySQL slaves using the MySQL protocol (COM_QUERY, COM_INIT_DB, COM_DROP_DB, …).]
Connector minimizes overhead from proxying
• Pass-through operation after connection
• Full transparency and low overhead for clients
[Diagram: the application sends a COM_QUERY packet (SELECT * FROM foo) through the Connector to MySQL and receives an OK packet with a result set of 1 row.]
Continuent SmartScale provides session load balancing
• Initial write goes to master
• Reads go to replicas if it is safe to do so
[Diagram: the application connects and inserts data through the Connector, which tracks the binlog position for session "X". The initial write is committed on the master; neither slave has received it yet.]
Continuent SmartScale provides session load balancing
• Auto-commit reads are eligible to go to a slave
• Reads stay on master until a slave catches up
[Diagram: the application selects data through the Connector (session "X" binlog position). The write is committed on the master; one slave has not received it and the other has received but not applied it, so the read goes to the master and not to a slave.]
Continuent SmartScale provides session load balancing
• Reads go to slave when it has caught up with master
• Session tags may be schema name or supplied by application
[Diagram: the application selects data through the Connector (session "X" binlog position). The write is committed on the master; one slave has received but not applied it and the other has received and applied it, so the read goes to the slave that has applied it.]
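The three SmartScale slides can be summarized in a short sketch. It is an illustration under stated assumptions, not the Connector's actual implementation: the binlog position is reduced to an integer counter, and the `Connector` class with its `write`, `slave_applied`, and `read` methods is hypothetical.

```python
class Connector:
    """Toy model of session-aware read/write routing."""

    def __init__(self, slaves):
        self.master_pos = 0                          # last committed position on master
        self.applied = {name: 0 for name in slaves}  # slave -> last applied position
        self.session_pos = {}                        # session tag -> position of last write

    def write(self, session):
        # Writes always go to the master; remember where this session wrote.
        self.master_pos += 1
        self.session_pos[session] = self.master_pos
        return "master"

    def slave_applied(self, slave, position):
        # Called when a slave reports that it has applied up to `position`.
        self.applied[slave] = position

    def read(self, session, autocommit=True):
        # Only auto-commit reads are eligible to leave the master.
        if not autocommit:
            return "master"
        needed = self.session_pos.get(session, 0)
        for slave, pos in self.applied.items():
            if pos >= needed:      # slave has caught up with this session's write
                return slave
        return "master"            # no slave is safe yet


conn = Connector(["db2", "db3"])
first = conn.write("X")            # initial write goes to master
before = conn.read("X")            # slaves have not applied the write
conn.slave_applied("db3", 1)       # db3 catches up
after = conn.read("X")             # read can now go to db3
print(first, before, after)
```

Tracking a position per session means a session never reads data older than its own last write, while other sessions remain free to use any slave.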
Connectors can be configured to support different levels of service
[Diagram: two Continuent Connectors, one running in SmartScale mode and one in Strict Consistency mode, in front of a master and two slaves. Each database node runs a Manager and a Replicator.]
Continuent clusters automatically monitor all cluster nodes for failure
[Diagram: a Continuent Connector in front of a master and two slaves.]
Cluster rules fail over master if DBMS no longer accepts network connections
[Diagram: a Continuent Connector in front of a failed master (marked X) and two slaves.]
1. Detect non-responsive node
2. Halt incoming connections
3. Find and promote most up-to-date slave
Failed nodes can be reprovisioned from a backup with a single management command
[Diagram: a Continuent Connector in front of the new master, a slave, and the old master (marked X), which is now a shunned node.]
4. Administrator inspects and recovers old master
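The failover steps can be sketched as follows. This is a minimal illustration, not the cluster manager's rule set: the `cluster` dictionary layout, the `fail_over` function, and the `is_alive` callback are all hypothetical, and resuming connections after promotion is an assumption the slides do not state.

```python
def fail_over(cluster, is_alive):
    """Promote the most up-to-date live slave if the master is unreachable."""
    nodes = cluster["nodes"]
    master = next(name for name, n in nodes.items() if n["role"] == "master")

    # 1. Detect non-responsive node.
    if is_alive(master):
        return master

    # 2. Halt incoming connections while the topology changes.
    cluster["accepting_connections"] = False

    # 3. Find and promote the most up-to-date slave.
    candidates = [name for name, n in nodes.items()
                  if n["role"] == "slave" and is_alive(name)]
    if not candidates:
        raise RuntimeError("no live slave to promote")
    new_master = max(candidates, key=lambda name: nodes[name]["position"])
    nodes[new_master]["role"] = "master"

    # The old master is shunned until an administrator recovers it (step 4).
    nodes[master]["role"] = "shunned"
    cluster["accepting_connections"] = True  # assumption: resume after promotion
    return new_master


cluster = {
    "accepting_connections": True,
    "nodes": {
        "db1": {"role": "master", "position": 100},
        "db2": {"role": "slave", "position": 98},
        "db3": {"role": "slave", "position": 100},
    },
}
promoted = fail_over(cluster, is_alive=lambda name: name != "db1")
print(promoted, cluster["nodes"]["db1"]["role"])
```

Choosing the slave with the highest applied position keeps the amount of data missing on the new master as small as possible.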
Continuent clusters support zero-downtime maintenance operations from parameter changes to app upgrade
• Task: change the InnoDB log file size
• Problem: requires a mysqld restart, hence can cause application downtime
• Constraint: avoid application-visible restart
• Solution: upgrade nodes in succession
Rolling maintenance proceeds node-by-node starting with slaves and proceeding to master
Slave upgrade
• Shun slave
• Resize journal, restart mysqld
• Return node to cluster
• Discard and reprovision on failure
• Repeat for remaining slave(s)
Switch
• Switch master to promote an upgraded slave
Master upgrade
• Upgrade old master
• Maintenance is now done!
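The rolling procedure can be sketched as an ordered plan. This is only an outline of the sequence on the slide: the `rolling_maintenance` function, its step strings, and the `upgrade` callback are hypothetical and do not correspond to real cluster management commands.

```python
def rolling_maintenance(nodes, master, upgrade):
    """Upgrade slaves first, switch the master role, then upgrade the old master.

    `upgrade(node)` performs the maintenance and returns True on success.
    Returns the new master and the ordered list of steps taken.
    """
    steps = []

    def maintain(node):
        steps.append(f"shun {node}")
        ok = upgrade(node)
        # On failure the node is discarded and reprovisioned instead.
        steps.append(f"upgrade {node}" if ok else f"reprovision {node}")
        steps.append(f"return {node}")
        return ok

    upgraded = []
    for node in nodes:
        if node != master and maintain(node):
            upgraded.append(node)

    if not upgraded:
        raise RuntimeError("no upgraded slave available to promote")

    new_master = upgraded[0]
    steps.append(f"switch {master} -> {new_master}")
    maintain(master)  # the old master is now a slave and can be restarted safely
    return new_master, steps


new_master, steps = rolling_maintenance(
    ["db1", "db2", "db3"], master="db1", upgrade=lambda node: True
)
for step in steps:
    print(step)
```

Because the master role moves before the old master is restarted, applications only ever see a switch, never a mysqld restart on the node they are writing to.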
Size and transaction activity on business data depend on many factors
[Chart: SaaS Datasets -- Size of Top 1000 Customers. X-axis: customers; Y-axis: dataset size in gigabytes (0 to 1400). Median=2.6GB, 99th percentile=290GB, Max=1214GB.]
Source: Statistics provided by Continuent customer
DBMS workloads are correspondingly varied
• Complex queries
• Large batch operations
• Small online transactions
• Analytic reports
[Diagram: a master and two slaves, each running a Manager and a Replicator.]
Asynchronous replication decouples transaction processing on master and slave DBMS nodes
[Diagram: on the master, a Replicator reads the MySQL DBMS logs and writes transactions plus metadata to its THL. The slave's Replicator downloads transactions via the network into its own THL (transactions + metadata) and applies them to the slave MySQL using JDBC.]
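The decoupling shown in the diagram can be sketched with two in-memory logs. This is a simplified model, not the replicator's code: the THL is a Python list of (seqno, change) pairs, the slave database is a dictionary rather than a JDBC connection, and the `Replicator`, `extract`, `download`, and `apply` names are hypothetical.

```python
class Replicator:
    """Holds a THL: an ordered list of (seqno, change) events."""

    def __init__(self):
        self.thl = []

    def last_seqno(self):
        return self.thl[-1][0] if self.thl else -1


def extract(master_log, master_rep):
    # Master side: copy new entries from the DBMS log into the THL.
    for seqno in range(master_rep.last_seqno() + 1, len(master_log)):
        master_rep.thl.append((seqno, master_log[seqno]))


def download(master_rep, slave_rep):
    # Slave side: fetch events the slave has not stored yet.
    # Slicing works because seqno equals the list index in this model.
    slave_rep.thl.extend(master_rep.thl[slave_rep.last_seqno() + 1:])


def apply(slave_rep, slave_db, applied_seqno):
    # Slave side: apply stored events in order; return the new position.
    for seqno, (key, value) in slave_rep.thl[applied_seqno + 1:]:
        slave_db[key] = value
        applied_seqno = seqno
    return applied_seqno


master_log = [("a", 1), ("b", 2)]
master_rep, slave_rep, slave_db = Replicator(), Replicator(), {}

extract(master_log, master_rep)
download(master_rep, slave_rep)
applied = apply(slave_rep, slave_db, -1)

master_log.append(("a", 3))     # master commits without waiting for the slave
lagging_value = slave_db["a"]   # slave still has the old value

extract(master_log, master_rep)
download(master_rep, slave_rep)
applied = apply(slave_rep, slave_db, applied)
print(lagging_value, slave_db, applied)
```

The master commit never blocks on the slave, and the slave catches up from its own stored log whenever it is able to.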
Parallel apply maximizes DBMS I/O bandwidth when updating replicas
[Diagram: slave Replicator pipeline. Three stages, each made up of Extract, Filter, and Apply steps, connect the master replicator to the THL (transactions + metadata), the THL to a parallel queue, and the parallel queue to the slave. The final stage runs several Extract/Filter/Apply channels in parallel.]
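A minimal sketch of the parallel queue idea follows. It assumes that transactions can be partitioned by a shard key, such as a schema name, so that transactions for the same shard always land on the same channel and keep their order. That partitioning rule, the `parallel_apply` function, and the channel count are assumptions for illustration, not the replicator's actual design.

```python
import queue
import threading
import zlib


def parallel_apply(transactions, channels=3):
    """Apply (seqno, shard) transactions on several channels at once.

    Returns a dict mapping each shard to the seqnos in the order applied.
    """
    queues = [queue.Queue() for _ in range(channels)]
    applied = {}
    lock = threading.Lock()

    def worker(q):
        while True:
            item = q.get()
            if item is None:       # sentinel: no more work for this channel
                break
            seqno, shard = item
            with lock:             # stands in for applying to the DBMS
                applied.setdefault(shard, []).append(seqno)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()

    # Same shard -> same channel, so per-shard order is preserved.
    for seqno, shard in transactions:
        channel = zlib.crc32(shard.encode()) % channels
        queues[channel].put((seqno, shard))

    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return applied


txns = [(i, f"customer{i % 4}") for i in range(20)]
result = parallel_apply(txns)
print(result)
```

Transactions for different shards may be applied in any relative order, which is what lets several channels use the slave's I/O bandwidth at the same time.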
Continuent Disaster Recovery creates composite clusters that span sites and are ready for immediate failover
[Diagram: composite cluster. The SJC master service (a master and two slaves) feeds the NYC slave service (a relay and two slaves) through cross-region replication (async master/slave). Each site has a Continuent Connector.]
Continuent multi-master, cross-site clusters operate as independent, active clusters at two or more remote sites
[Diagram: the SJC service and the NYC service each have a master and two slaves. The two masters are linked by cross-region replication (async multi-master). Each site has a Continuent Connector.]
The same replication mechanism supports real-time loading of data warehouses
[Diagram: the SJC service (a master and two slaves behind a Continuent Connector) replicates to a Hadoop cluster.]
Master/slave clustering is a robust technology for enterprise data management!
• Very High Availability
• Updates propagated without cost to applications
• Transparent connectivity with full SQL semantics
• Very High Performance
Continuent offers…
• Highly available clusters of off-the-shelf MySQL servers
• Zero-downtime maintenance and upgrade
• High performance regardless of data volume or distance
• Replication over regions to DR sites as well as non-MySQL data warehouses
For more information, contact us:
• Robert Noyes, Alliance Manager, USA & Canada, [email protected], +1 (650) 575-0958
• Philippe Bernard, Alliance Manager, EMEA & APAC, [email protected], +41 79 347 1385
• Eero Teerikorpi, Sr. Director, Strategic Alliances, [email protected], +1 (408) 431-3305