Scaling MySQL using multi master synchronous replication
Marco “the Grinch” Tusa
Percona Live London 2013

Scaling with sync_replication using Galera and EC2

Jan 27, 2015


Marco Tusa

Challenging architecture design, and proof of concept on a real case study using a synchronous replication solution.
The customer asked me to investigate and design a MySQL architecture to support his application serving shops around the globe.
Scale out and scale in based on sales seasons.
Transcript
Page 1: Scaling with sync_replication using Galera and EC2

Scaling MySQL using multi master

synchronous replication

Marco “the Grinch” Tusa

Percona Live London 2013

Page 2: Scaling with sync_replication using Galera and EC2

About Me Marco “The Grinch”

• Former Pythian cluster technical leader

• Former MySQL AB PS (EMEA)

• Love programming

• History of religions

• Ski; Snowboard; scuba diving; Mountain trekking

Introduction

Page 3: Scaling with sync_replication using Galera and EC2

Agenda • Customer requirements

• Installation and initial setup

• Applying the customer scenario to solution

• Numbers and considerations

• Scaling out test and efforts

• Scaling in test and efforts

• Geographic distribution

Introduction

Page 4: Scaling with sync_replication using Galera and EC2

Many Galera Talks

• Percona XtraDB Cluster in a Nutshell: Hands-on Tutorial (Tutorial, Monday)

• Galera Cluster 3.0 New Features. Seppo Jaakola (Presentation, Tuesday)

• How to Understand Galera Replication. Alexey Yurchenko (Presentation, Tuesday)

Introduction

Page 5: Scaling with sync_replication using Galera and EC2

A journey started 2 years ago • First work done as POC in November 2011

• First implementation in production January 2012

• Many more after

• Last done: 12 clusters of 5 nodes with 18 to 24 application servers attached

Introduction

Page 6: Scaling with sync_replication using Galera and EC2

Historical real-life case
The customer mentioned the need to scale for writes.
My first thought went to NDB.
The customer had specific constraints:

• Amazon EC2;

• No huge instances (medium preferred);

• Number of instances increases during peak seasons;

• Number of instances must be reduced during regular periods;

• Customer uses InnoDB as the storage engine in his current platform and will not change;

Customer requirements

Page 7: Scaling with sync_replication using Galera and EC2

Refine the customer requirements
Challenging architecture design, and proof of concept on a real case study using a synchronous solution.
The customer asked us to investigate and design a MySQL architecture to support his application serving shops around the globe.
Scale out and scale in based on sales seasons. We will share our experience, presenting the results of our POC.

High level outline

Customer numbers:

• Range of operations/s from 20 to 30,000 (5,000 inserts/sec)

• Selects/Inserts 70/30 %

• Application servers from 2 to ∞

• MySQL servers from 2 to ∞

• Operation from 20 bytes to max 1Mb (text)

• Data set dimension 40GB (old data is archived every year)

• Geographic distribution (3 -> 5 zones), partial dataset

Customer requirements

Page 8: Scaling with sync_replication using Galera and EC2

My Motto Use the right tool for the job

Customer requirements

Page 9: Scaling with sync_replication using Galera and EC2

Scaling Up vs. Out

Scaling Up Model • Requires more investment • Not flexible and not a good fit with MySQL

Scaling Out Model • Scales by small investments • Flexible • Fits the MySQL model (HA, load balancing etc.)

Page 10: Scaling with sync_replication using Galera and EC2

Scaling Reads vs. Writes
• Read: the easy case in MySQL, if the percentage of writes is low

• Write: • Classic replication does not work • Single process • No data consistency check • Parallel replication by schema is not the solution • Semi-synchronous replication is not the solution either


Page 11: Scaling with sync_replication using Galera and EC2

Synchronous Replication in MySQL

Galera replication • Virtually synchronous • No data consistency check (optimistic locking) • Data replicated on commit • Uses InnoDB

MySQL Cluster, i.e. NDB Cluster

• Really synchronous

• Data distribution and internal partitioning

• The only real solution giving you 99.999% uptime (max ~5 minutes of downtime per year)

• NDB Cluster is more than a simple storage engine (use the API if you can)

Options Overview

Page 12: Scaling with sync_replication using Galera and EC2

Choosing the solution Did I say I ♥ NDB Cluster?

– But not a good fit here because:

• EC2 dimension (1 CPU, 3.7GB RAM);

• Customer does not want to change from InnoDB;

• Need to train the developers to get the maximum out of it;

– Galera could be a better fit because:

• It fits in the EC2 dimension;

• Uses InnoDB;

• No additional knowledge needed when developing the solution;

Options Overview

Page 13: Scaling with sync_replication using Galera and EC2

Architecture Design

Final architecture simple and powerful

Page 14: Scaling with sync_replication using Galera and EC2

Architecture Design

EC2 small

instance

EC2 medium

instance

MySQL instance

geographically

distributed

Load Balancer distributing

request in RR

Application layer

in the cloud

Data

layer in

the cloud

Architecture AWS blocks

Page 15: Scaling with sync_replication using Galera and EC2

Instances EC2 Web servers • Small instance • Local EBS

Data servers • Medium instance, 1 CPU 3.7GB RAM • 1 EBS for the OS • 6 EBS in RAID0 for data

Be ready to scale OUT • Create an AMI • Update the AMI on a regular basis

Architecture EC2 blocks

Page 16: Scaling with sync_replication using Galera and EC2

Why not ephemeral storage
• RAID0 across 6 EBS volumes performs faster (buffered reads);

• The RAID approach mitigates possible temporary degradation of a single volume;

• Ephemeral is … ephemeral: all data gets lost;

Numbers from a rough comparison:

(ebs) Timing buffered disk reads: 768 MB in 3.09 seconds = 248.15 MB/sec

(eph) Timing buffered disk reads: 296 MB in 3.01 seconds = 98.38 MB/sec

(ebs) Timing O_DIRECT disk reads: 814 MB in 3.20 seconds = 254.29 MB/sec

(eph) Timing O_DIRECT disk reads: 2072 MB in 3.00 seconds = 689.71 MB/sec

Architecture Installation and numbers

Page 17: Scaling with sync_replication using Galera and EC2

Why not ephemeral storage (cont.)

Architecture Installation and numbers

Page 18: Scaling with sync_replication using Galera and EC2

Why not ephemeral storage (cont.)

Architecture Installation and numbers

Page 19: Scaling with sync_replication using Galera and EC2

Why not ephemeral storage (cont.)

Architecture Installation and numbers

Page 20: Scaling with sync_replication using Galera and EC2

Storage on EC2

Architecture Installation and numbers

Multiple EBS in RAID0, or use Provisioned IOPS

Amazon EBS Standard volumes:

$0.10 per GB-month of provisioned storage

$0.10 per 1 million I/O requests

Amazon EBS Provisioned IOPS volumes:

$0.125 per GB-month of provisioned storage

$0.10 per provisioned IOPS-month

Page 21: Scaling with sync_replication using Galera and EC2

Instances EC2: how we configured the EBS.

• Use the Amazon EC2 API Tools (http://aws.amazon.com/developertools/351)

• Create 6 EBS volumes and attach them to the running instance

• Build the RAID0 array as root:

sudo mdadm --verbose --create /dev/md0 --level=0 --chunk=256 --raid-devices=6 /dev/xvdg1 /dev/xvdg2 /dev/xvdg3 /dev/xvdg4 /dev/xvdg5 /dev/xvdg6
echo 'DEVICE /dev/xvdg1 /dev/xvdg2 /dev/xvdg3 /dev/xvdg4 /dev/xvdg5 /dev/xvdg6' | sudo tee -a /etc/mdadm.conf
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf

• Create an LVM volume on top, to allow easy growth of the data size

• Format using ext3 (no journaling)

• Mount it using noatime,nodiratime

• Run hdparm -t [--direct] <device> to check it works properly

Installation

Page 22: Scaling with sync_replication using Galera and EC2

Instances EC2 (cont.) You can install MySQL using RPM or, if you want an easier life and faster upgrades (or downgrades), do:

• Create a directory like /opt/mysql_templates

• Get the MySQL binary distribution and expand it into /opt/mysql_templates

• Create a symbolic link /usr/local/mysql pointing to the version you want to use

• Create symbolic links in /usr/bin as well, e.g.: for bin in /usr/local/mysql/bin/*; do ln -s "$bin" /usr/bin/$(basename "$bin"); done
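The template/symlink scheme above can be wrapped in a small helper; a minimal sketch (the function name and argument layout are mine, not from the slides):

```shell
# switch_mysql_version TEMPLATES VERSION PREFIX BINDIR
# Repoint the "active version" symlink (e.g. /usr/local/mysql) at the
# chosen template directory, then refresh the per-binary links in BINDIR.
switch_mysql_version() {
    templates="$1"; version="$2"; prefix="$3"; bindir="$4"
    ln -sfn "$templates/$version" "$prefix"        # /usr/local/mysql -> template
    for bin in "$prefix"/bin/*; do                 # link every binary (mysqld, mysql, ...)
        ln -sfn "$bin" "$bindir/$(basename "$bin")"
    done
}
```

Switching, or rolling back, is then a single call, e.g. switch_mysql_version /opt/mysql_templates <version-dir> /usr/local/mysql /usr/bin.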

Installation

Page 23: Scaling with sync_replication using Galera and EC2

Create the AMI

• Once the machines were ready and standardized:

o Create an AMI for the MySQL/Galera data node;

o Create an AMI for the application node;

• The AMIs will be used for expanding the cluster and/or in case of crashes.

Installation

Page 24: Scaling with sync_replication using Galera and EC2

Problem in tuning - MySQL

MySQL optimal configuration for the environment

• Correct Buffer pool, InnoDB log size;

• Dirty pages;

• Innodb write/read threads;

• Binary logs (no binary logs unless you really need them);

• Doublewrite buffer;

• Innodb Flush log TRX commit & concurrency;

Setup

Page 25: Scaling with sync_replication using Galera and EC2

Problem in tuning - Galera Galera optimal configuration for the environment

• evs.send_window Maximum messages in replication at a time

• evs.user_send_window Maximum data messages in replication

at a time

• wsrep_slave_threads, the number of threads Galera uses to apply the events in the local receive queue

• gcache.size

• Flow Control

• Network/keep alive settings and WAN replication

Setup

Page 26: Scaling with sync_replication using Galera and EC2

Applying the customer scenario How I did the tests, and what I used.

Stresstool (my own development, Java) • Multi-thread approach (each thread a connection);

• Configurable number of master tables;

• Configurable number of child tables;

• Variable (random) number of tables in joins;

• Can set the ratio between R/W/D threads;

• Tables can have any data type combination;

• Inserts can be done single or batch;

• Reads can be done by RANGE, IN, equality;

• Operations by set of commands, not single SQL;

Test application

Page 27: Scaling with sync_replication using Galera and EC2

Applying the customer scenario (cont.)

How I did the tests.

• Application side

• I gradually increased the number of threads per running Stresstool instance, then increased the number of instances.

• Data layer

• Start with 3 MySQL;

• Up to 7 nodes;

• Level of request

• From 2 Application blocks to 12;

• From 4 threads per “Application” block;

• To 64 threads per “Application” block (768 total);

Test application

Page 28: Scaling with sync_replication using Galera and EC2

Numbers Table with numbers (writes) for a 3-node cluster and bad replication traffic

Bad commit behavior

Page 29: Scaling with sync_replication using Galera and EC2

Numbers in Galera replication What happened to the replication?

Bad commit behavior

Page 30: Scaling with sync_replication using Galera and EC2

Changes in replication settings The problem was in commit efficiency & Flow Control. Reviewing the Galera documentation, I chose to change:

• evs.send_window=1024 (maximum packets in replication at a time);

• evs.user_send_window=1024 (maximum data packets in replication at a time);

• wsrep_slave_threads=48;
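In my.cnf those changes look like the fragment below (a sketch; anything not named on the slide, such as spacing and grouping, is my assumption):

```ini
# Galera tuning chosen after the flow-control analysis
wsrep_slave_threads = 48
wsrep_provider_options = "evs.send_window=1024; evs.user_send_window=1024"
```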

Bad commit behavior

Page 31: Scaling with sync_replication using Galera and EC2

Numbers after changes (cont.) Table with numbers (writes) for 3-5-7 nodes and increasing traffic

Using MySQL 5.5

Page 32: Scaling with sync_replication using Galera and EC2

Numbers after changes (cont.) Table with numbers (writes) for 3-5-7 nodes and increasing traffic

Using MySQL 5.5

Page 33: Scaling with sync_replication using Galera and EC2

Other problems… This is what happens when one node starts to have issues:

Tests & numbers

Page 34: Scaling with sync_replication using Galera and EC2

Numbers After changes (cont.) Rebuild the node, re-attach it to the cluster and the status is:

Tests & numbers

Page 35: Scaling with sync_replication using Galera and EC2

Numbers After changes (cont.) Going further and removing Binary log writes:

Tests & numbers

Page 36: Scaling with sync_replication using Galera and EC2

Numbers for reads Selects for 3 to 7 node clusters and increasing traffic

Tests & numbers

Page 37: Scaling with sync_replication using Galera and EC2

Many related metrics From 4 to 92 threads

Tests & numbers Real HW

Page 38: Scaling with sync_replication using Galera and EC2

FC on real HW From 4 to 92 threads

Tests & numbers Real HW

Page 39: Scaling with sync_replication using Galera and EC2

How to scale OUT The effort to scale out is: • Launch a new instance from the AMI (IST sync if gcache.size is big enough, otherwise SST);

• Always add nodes so as to keep an ODD number of nodes;

• Modify the my.cnf to match the server ID and the IP of the master node;

• Start MySQL;

• Include the node IP in the list of active IPs of the load balancer;

• The whole operation normally takes less than 30 minutes.
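The my.cnf step can be scripted when launching from the AMI; a minimal sketch (the function name and the example values are mine; it assumes server-id and wsrep_cluster_address lines already exist in the file):

```shell
# configure_node MYCNF SERVER_ID CLUSTER_IP
# Rewrite the per-node settings in a freshly launched AMI copy:
# a unique server-id, and the address of an existing node to join through.
configure_node() {
    mycnf="$1"; server_id="$2"; cluster_ip="$3"
    sed -i \
        -e "s/^server-id *=.*/server-id=${server_id}/" \
        -e "s|^wsrep_cluster_address *=.*|wsrep_cluster_address=gcomm://${cluster_ip}|" \
        "$mycnf"
}
```

For example, configure_node /etc/my.cnf 4 10.0.0.11 (example id and IP), then start MySQL and add the node to the load balancer.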

Scaling

Page 40: Scaling with sync_replication using Galera and EC2

How to scale IN The effort to scale in is minimal:

• Remove the data node's IP from the load balancer (HAProxy);

• Stop MySQL;

• Stop/terminate the instance.

Scaling

Page 41: Scaling with sync_replication using Galera and EC2

How to Backup:

If using provisioned IOPS and one single volume contains all the data, a snapshot is fine.

Otherwise I like Jay's solution:

http://www.mysqlperformanceblog.com/2013/10/08/taking-backups-percona-xtradb-cluster-without-stalls-flow-control/

Using wsrep_desync=ON during the backup
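The desync trick can be wrapped around any backup command; a sketch (the wrapper name is mine, and the MYSQL override exists only to make the sketch testable):

```shell
# Run a backup without stalling the cluster: desync the node first so it
# stops sending flow-control messages, run the backup, then resync.
MYSQL="${MYSQL:-mysql}"            # override to point at a specific client/host

desync_backup() {
    $MYSQL -e "SET GLOBAL wsrep_desync=ON"
    "$@"                           # the actual backup command goes here
    status=$?
    $MYSQL -e "SET GLOBAL wsrep_desync=OFF"
    return $status
}
```

For example, desync_backup with your usual Xtrabackup invocation as the argument.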

Page 42: Scaling with sync_replication using Galera and EC2

Failover and HA With MySQL and Galera, unless there is an issue, all the nodes contain the same data.

A failover of the whole service is therefore not necessary.

Cluster in good health Cluster with failed node

So the failover is mainly an operation at the load balancer (HAProxy works great), plus adding another new instance (from the AMI).

Page 43: Scaling with sync_replication using Galera and EC2

Geographic distribution With Galera it is possible to set the cluster to replicate across Amazon's zones.

I have tested the implementation with 3 geographic locations:

• Master location (1 to 7 nodes);

• First distributed location (1 node, up to 3 on failover);

• Second distributed location (1 node, up to 3 on failover);

• No significant delay was reported while the distributed nodes remained passive.

Good parameters to play with: keepalive_period, inactive_check_period, suspect_timeout, inactive_timeout, install_timeout
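On the provider side those parameters live in wsrep_provider_options; a sketch with illustrative WAN-leaning values (the numbers are my assumptions, not the slides'):

```ini
# Relaxed EVS timeouts for cross-zone/WAN links (example values)
wsrep_provider_options = "evs.keepalive_period=PT3S; evs.inactive_check_period=PT10S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; evs.install_timeout=PT1M"
```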

Geographic distribution

Page 44: Scaling with sync_replication using Galera and EC2

Problems with Galera During the tests we faced the following issues:

• MySQL data node crash, auto restart, recovery (Galera in a loop)

• A node falls behind the cluster: the replica is not fully synchronous, so the local queue becomes too long, slowing down the whole cluster

• A node acting slowly after restart, with no apparent issue and no way to make it behave as it did before the clean shutdown; this happened randomly, possibly also an Amazon issue.

The fastest solution: build another node and attach it in place of the failing one.

Conclusions

Page 45: Scaling with sync_replication using Galera and EC2

Did we match the expectations? Our numbers were:

• From 1,200 to ~10,000 (~3,000 in prod) inserts/sec

• 27,000 reads/sec with 7 nodes

• From 2 to 12 Application servers (with 768 request/sec)

• EC2 medium 1 CPU and 3.7GB!!

o In Prod Large 7.5GB 2 CPU.

I would say mission accomplished!

Conclusions

Page 46: Scaling with sync_replication using Galera and EC2

Considerations about the solution Pro

• Flexible;

• Uses a well-known storage engine;

• Once tuned, it is “stable” (if the Cloud permits it);

Cons

• !WAS! A new technology, not included in an official cycle of development;

• Sometimes fails without a clear indication of why, but it is getting better;

• Replication is still not fully synchronous (on write/commit);

Conclusions

Page 48: Scaling with sync_replication using Galera and EC2

Q&A

Page 49: Scaling with sync_replication using Galera and EC2

Thank you To contact Me

[email protected]

[email protected]

To follow me

http://www.tusacentral.net/

https://www.facebook.com/marco.tusa.94

@marcotusa

http://it.linkedin.com/in/marcotusa/

Conclusions