MySQL 5.6 Replication for Admins Kristian Köhntopp
How async replication works
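• Master (Server ID: x): writes changes to the Binlog; a Connection Thread serves it to the slave.
• Slave (Server ID: y): logs in to the Master.
• IO Thread: copies the master's Binlog into the Relay Log.
• SQL Thread: applies the Relay Log to the tables.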
Setting up a master
• --server_id = x
• --binlog_format = STATEMENT|ROW|MIXED
• --expire_logs_days = n
• --log_bin = name -- Restart required
• SHOW MASTER STATUS
• SHOW MASTER LOGS
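Since the slave logs in to the master, the master also needs an account with the REPLICATION SLAVE privilege. A minimal sketch, assuming a hypothetical account name `repl` and a placeholder password (neither is from the slides), followed by a check that the options above are active after the restart:

```sql
-- Hypothetical replication account; narrow the host part ('%') in practice.
CREATE USER 'repl'@'%' IDENTIFIED BY 'change-me';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- Confirm the master options took effect after the restart.
SELECT @@global.server_id,
       @@global.log_bin,
       @@global.binlog_format,
       @@global.expire_logs_days;

-- Current binlog file and position, then the list of all binlog files.
SHOW MASTER STATUS;
SHOW MASTER LOGS;
```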
Making a full backup
• Slave start position:
• A consistent backup associated with a binlog position.
• mysqldump --master-data=2 OR
• mylvmbackup
Setting up a slave
• --server-id = y
• Recover from backup.
• Slave is now at the backup's binlog position.
• CHANGE MASTER TO MASTER_HOST, MASTER_PORT, MASTER_USER, MASTER_PASSWORD, MASTER_LOG_FILE, MASTER_LOG_POS
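Written out, the statement might look like the sketch below. Host name, account, password and binlog coordinates are placeholders, not values from the slides. With `mysqldump --master-data=2`, the real file name and position appear as a commented-out CHANGE MASTER TO line near the top of the dump.

```sql
-- All values are placeholders; take MASTER_LOG_FILE and MASTER_LOG_POS
-- from the backup that was restored onto this slave.
CHANGE MASTER TO
  MASTER_HOST     = 'master.example.com',
  MASTER_PORT     = 3306,
  MASTER_USER     = 'repl',
  MASTER_PASSWORD = 'change-me',
  MASTER_LOG_FILE = 'binlog.000042',
  MASTER_LOG_POS  = 120;
```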
Starting the slave
• SHOW SLAVE STATUS\G
• START SLAVE IO_THREAD;
• SHOW SLAVE STATUS\G -- error.log
• START SLAVE SQL_THREAD;
• SHOW SLAVE STATUS\G
Replication stuck
• set global sql_slave_skip_counter = 1;
• start slave;
• But you need to understand what happened.
• Percona Toolkit: pt-table-checksum, pt-table-sync
Binlog Management
• max-binlog-size
• FLUSH LOGS
• SHOW MASTER LOGS
• PURGE MASTER LOGS BEFORE now() - INTERVAL 3 DAY
• expire_logs_days
Debugging Replication
• mysqlbinlog --help
• --start-datetime, --stop-datetime
• useful only for zooming in, time is a binlog position!
• --verbose is not what you think (-v -vv)
• BINLOG statement decoder
Filtering the Binlog
• Do not use binlog-do/ignore-db
• incomplete binlog = no roll forward
• Do not use replicate-do/ignore-db, replicate-do/ignore-table.
• Do use replicate-wild-do/ignore-table.
• Do not mix do and ignore, if at all possible.
Unslaving
• STOP SLAVE
• maybe START SLAVE UNTIL MASTER_LOG_FILE = …, MASTER_LOG_POS = …
• RESET SLAVE
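The sequence above can be sketched as follows, assuming placeholder binlog coordinates. The START SLAVE UNTIL step is only needed when the slave should stop at a defined master position rather than wherever it happens to be.

```sql
-- Stop replication first; START SLAVE UNTIL needs a stopped SQL thread.
STOP SLAVE;

-- Optional: apply events up to a chosen master position (placeholder values).
START SLAVE UNTIL MASTER_LOG_FILE = 'binlog.000042', MASTER_LOG_POS = 120;

-- Check that Slave_SQL_Running is "No" and that Exec_Master_Log_Pos
-- has reached the requested position before going on.
SHOW SLAVE STATUS;

-- Stop the remaining IO thread and discard the replication state.
STOP SLAVE;
RESET SLAVE;
-- RESET SLAVE ALL additionally clears the connection parameters held in memory.
```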
Files
• On the master:
• binlog.index, binlog.nnnnnn
• On the slave:
• master.info
• relay.info
• relay.nnnnnn
Binlog Formats: SBR vs. RBR
• “There is only SBR.”
• binlog_format = ROW|MIXED|STATEMENT
• generates: BINLOG “<base64>”
• contains: before-image, after-image
• undebuggable, so 5.6 adds binlog_rows_query_log_events = 1
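A minimal sketch of turning this on at runtime; to survive a restart, the same settings also belong in my.cnf. Global changes apply to sessions opened afterwards, not to connections that already exist.

```sql
-- Switch to row-based logging for new sessions.
SET GLOBAL binlog_format = 'ROW';

-- Also write the original statement text into the binlog as an
-- informational event next to the row images (5.6 and later).
SET GLOBAL binlog_rows_query_log_events = 1;

-- Verify both settings.
SELECT @@global.binlog_format, @@global.binlog_rows_query_log_events;
```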
RBR Binlog Sizes
• Typical usage: RBR is 33% to 50% size of SBR.
• RBR size escalates when BLOB types are part of replication.
• max_allowed_packet problem (when FULL is used)
• slave-max-allowed-packet = (2x normal size)
• 5.6: binlog_row_image = FULL|NOBLOB|MINIMAL
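As a sketch of the two knobs above: the 32 MB figure is a placeholder that assumes a max_allowed_packet of 16 MB, so check the real value first and adjust.

```sql
-- On the master: log only the columns needed to identify and change a row.
SET GLOBAL binlog_row_image = 'MINIMAL';

-- On the slave: look at the normal packet limit ...
SELECT @@global.max_allowed_packet;

-- ... and allow replication threads about twice that size
-- (placeholder: 32 MB, assuming max_allowed_packet is 16 MB).
SET GLOBAL slave_max_allowed_packet = 33554432;
```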
Alternate RBR uses
• RBR decoding with “mysqlbinlog -v -v”
• strict format, easily parseable, complete list of changes to the server
• Good for change auditing, change extraction.
• Low overhead, avoids triggers.
Reliability
• Starting with 5.6:
• Binlog CRC32 Checksums included, never checked.
• master_verify_checksum = 1
• slave_sql_verify_checksum=1
• Always: sync_binlog = 1
• fast, when group commit is on.
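These settings can be changed at runtime, as in the sketch below; add them to my.cnf as well so they survive a restart.

```sql
-- Which checksum algorithm is written into the binlog.
SELECT @@global.binlog_checksum;

-- On the master: verify checksums when reading events back from the binlog.
SET GLOBAL master_verify_checksum = 1;

-- On the slave: have the SQL thread verify checksums from the relay log.
SET GLOBAL slave_sql_verify_checksum = 1;

-- Sync the binlog to disk at every commit.
SET GLOBAL sync_binlog = 1;
```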
Reliability
• Slave using tables: fast w/ group commit.
• master-info-repository=TABLE
• relay-log-info-repository=TABLE
• Hostnames are part of filenames.
• Makes cloning hard. Configure away!
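A sketch of switching a running 5.6 slave to table repositories. The slave threads have to be stopped for the change; the same options should also go into my.cnf.

```sql
STOP SLAVE;

-- Keep replication state in tables instead of master.info / relay.info.
SET GLOBAL master_info_repository    = 'TABLE';
SET GLOBAL relay_log_info_repository = 'TABLE';

START SLAVE;

-- The state is now visible in the mysql schema.
SELECT Host, Master_log_name, Master_log_pos
  FROM mysql.slave_master_info;
SELECT Relay_log_name, Relay_log_pos, Master_log_name, Master_log_pos
  FROM mysql.slave_relay_log_info;
```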
Semi-Synchronous Replication SSR
• enabled & one ssr slave connects
• after commit wait for ack or timeout
• if timeout, async, until SSR catches up and acks
Pre-SSR workarounds
• Heartbeat tables
• INSERT INTO heartbeat VALUES (name, timestamp)
• master_pos_wait()
• limited to single level hierarchies
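Both workarounds can be sketched like this. The table layout, the name `master1` and the binlog coordinates are assumptions for illustration; the lag figure is only meaningful when master and slave clocks are synchronized.

```sql
-- On the master: a hypothetical heartbeat table, one row per writer.
CREATE TABLE IF NOT EXISTS heartbeat (
  name VARCHAR(64) NOT NULL PRIMARY KEY,
  ts   DATETIME    NOT NULL
) ENGINE = InnoDB;

-- On the master, run periodically (for example once per second).
REPLACE INTO heartbeat (name, ts) VALUES ('master1', NOW());

-- On the slave: approximate replication delay in seconds.
SELECT name, TIMESTAMPDIFF(SECOND, ts, NOW()) AS lag_seconds
  FROM heartbeat;

-- On the slave: block until the given master position has been applied,
-- for at most 10 seconds. Returns -1 on timeout and NULL if the SQL
-- thread is not running.
SELECT MASTER_POS_WAIT('binlog.000042', 120, 10);
```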
SSR background
• it makes it much more likely that fewer transactions will be lost when a master disappears
• it throttles busy clients to run no faster than master-slave networking
• http://www.mysqlperformanceblog.com/2012/01/19/how-does-semisynchronous-mysql-replication-work/
SSR enhancement
• Google/Percona/Maria 10 SSR enhancement:
• SSR send first, local commit afterwards
• Slow in Oracle MySQL due to global txn lock, fast in Maria 10
SSR Master
• INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
• show plugins;
• set global rpl_semi_sync_master_enabled = 1; # my.cnf
• set global rpl_semi_sync_master_timeout = 5000;
• # millis, add to my.cnf
SSR slave
• INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
• show plugins;
• set global rpl_semi_sync_slave_enabled = 1; # my.cnf
• stop slave io_thread;
• start slave io_thread;
SSR checking
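• show global status like 'Rpl_semi_sync_master_clients';
• show global status like 'Rpl_semi_sync_master_status';
• show global status like 'Rpl_semi_sync_slave_status';
• these are status variables, not system variables, so @@global.… does not work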
MariaDB Multisource Replication MSR
• Consolidate multiple non-overlapping masters into one slave
• each master connection is named; all slave-to-master config and variables are duplicated per connection
• set @@default_master_connection = …
• https://mariadb.com/kb/en/mariadb/mariadb-documentation/replication-cluster-multi-master/replication/multi-source-replication/
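A sketch of a MariaDB 10 slave with two named connections. The connection names `shop` and `crm`, the hosts, the account and the binlog coordinates are all placeholders.

```sql
-- One CHANGE MASTER per named connection (MariaDB 10 syntax).
CHANGE MASTER 'shop' TO
  MASTER_HOST = 'master-shop.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'change-me',
  MASTER_LOG_FILE = 'binlog.000042',
  MASTER_LOG_POS = 120;

CHANGE MASTER 'crm' TO
  MASTER_HOST = 'master-crm.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'change-me',
  MASTER_LOG_FILE = 'binlog.000007',
  MASTER_LOG_POS = 120;

-- Start and inspect all connections at once.
START ALL SLAVES;
SHOW ALL SLAVES STATUS;

-- Plain replication commands act on the default connection,
-- which is selected per session.
SET @@default_master_connection = 'shop';
SHOW SLAVE STATUS;
```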
MariaDB 10 Parallel Replication
• Record degree of parallelism on master
• Replay in parallel on slave
• min(#sql_thds, master_parallelism)
• Faster slave when master more busy.
• Delay transactions on master to force parallelism
• http://kristiannielsen.livejournal.com/18435.html
PR config
• slave-parallel-threads = 12
• slave_parallel_max_queued = <buffer in bytes>
• execution is still running in-order
• ordering late in commit stage
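On a MariaDB 10 slave these can be set at runtime, roughly as below. The queue size is a placeholder value; the thread count can only be changed while the slave is stopped.

```sql
STOP SLAVE;

-- Size of the worker pool used for applying events in parallel.
SET GLOBAL slave_parallel_threads = 12;

-- Read-ahead buffer per worker thread, in bytes (placeholder value).
SET GLOBAL slave_parallel_max_queued = 131072;

START SLAVE;
```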
PR config
• On master, force parallel:
• binlog_commit_wait_count = 100
• binlog_commit_wait_usec = 100000
• Also:
• innodb_flush_log_at_trx_commit = 1
• sync_binlog = 1
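The master-side settings as a runtime sketch for MariaDB 10, using the values from the slide. Note that the InnoDB variable is spelled `innodb_flush_log_at_trx_commit`.

```sql
-- Hold a commit until 100 transactions have queued up ...
SET GLOBAL binlog_commit_wait_count = 100;
-- ... or until 100000 microseconds (0.1 s) have passed, whichever comes first.
SET GLOBAL binlog_commit_wait_usec = 100000;

-- Durable settings; group commit keeps their cost acceptable.
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
SET GLOBAL sync_binlog = 1;
```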
PR out-of-order execution
• Domain-ID
• application controlled
• set session gtid_domain_id = 1
• alter table t add index i(i)
• set session gtid_domain_id = 0
Upcoming technology
• GTID
• SSR + GTID = easier master HA
• Really?
• Galera
• proper “Multi-Master”
• for small values of Multi-Master
Upcoming technology
• All new replication technology requires:
• Clean setup, defined as:
• pure InnoDB, full ACID, RBR, clean transactions, QC off, log to file (not table), innodb_autoinc_lock_mode = 2, innodb_doublewrite = 1, no replication filters, sensible txn size (1000-10000 rows, < 50MB), PRIMARY KEYS on all tables
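Two of these requirements can be checked with information_schema queries. The sketch below lists tables that are not InnoDB and tables without a primary key; the list of excluded system schemas is an assumption and may need adjusting.

```sql
-- Tables that are not InnoDB.
SELECT table_schema, table_name, engine
  FROM information_schema.tables
 WHERE table_type = 'BASE TABLE'
   AND engine <> 'InnoDB'
   AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');

-- Tables without a PRIMARY KEY.
SELECT t.table_schema, t.table_name
  FROM information_schema.tables AS t
  LEFT JOIN information_schema.table_constraints AS c
    ON c.table_schema = t.table_schema
   AND c.table_name = t.table_name
   AND c.constraint_type = 'PRIMARY KEY'
 WHERE t.table_type = 'BASE TABLE'
   AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
   AND c.constraint_name IS NULL;
```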