MySQL GTID Implementation, Maintenance, and Best Practices
Brian Cain (Dropbox)
Gillian Gunson (GitHub)
Mark Filipi (SurveyMonkey)
Agenda
❏ Intros
❏ Concepts
  ❏ Replication overview
  ❏ GTID Intro
❏ Implementation
❏ Maintenance
❏ New 5.7 Features and Advanced Concepts
About Mark
• Works at SurveyMonkey
• From Kansas
• Formerly of PalominoDB and Garmin and preschool
About Gillian
• Senior Infrastructure Engineer at GitHub
• From Vancouver, BC, Canada
• Formerly of Okta, PalominoDB, Oracle, Disney
About Brian
• Database Engineer, MySQL SRE at Dropbox
• From Seattle
• Formerly of PalominoDB, Zappos, EMusic, etc
• Also from Kansas
Tutorial Setup (Hour 2)
❏ Collect a DigitalOcean droplet host access card from the front
❏ Connect to wireless and ssh to the droplet host (server1)
❏ Confirm you can ssh to server2 and server3 from server1
❏ Confirm replication is running
[Diagram: server1 (master) replicating to server2 (replica) and server3 (replica)]
Concepts
Traditional replication primer and introduction to GTID
Traditional MySQL replication primer
❏ Standard topologies
❏ SHOW MASTER STATUS
❏ SHOW SLAVE STATUS
Standard topologies
[Diagram 1: server1 (master) replicating to server2 (replica) and server3 (replica)]
[Diagram 2: server1 (master) replicating to server2 (relay), which replicates to server3 (replica)]
SHOW MASTER STATUS
markf@db-wfcore03-ro [(none)]> show master status;
+------------------+-----------+--------------+------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+-----------+--------------+------------------+
| mysql-bin.000695 | 264631170 |              |                  |
+------------------+-----------+--------------+------------------+
1 row in set (0.08 sec)
SHOW SLAVE STATUS
markf@db-wfcore03-ro [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.10.23.17
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000751
Read_Master_Log_Pos: 67329044
Relay_Log_File: mysqld-relay-bin.000403
Relay_Log_Pos: 67329189
Relay_Master_Log_File: mysql-bin.000751
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 67329044
Relay_Log_Space: 67329388
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.08 sec)
SHOW SLAVE STATUS
Master information and running status
Slave_IO_State: Waiting for master to send event
Master_Host: 10.10.23.17
Master_User: repl
Master_Port: 3306
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
❏ Slave_IO_State: state of the replication IO thread, i.e. whether logs are being pulled from the master
❏ Master_Host: the configured master database host
❏ Master_User: MySQL user configured for replication
❏ Master_Port: DB port being connected to (3306 is the default)
❏ Slave_IO_Running, Slave_SQL_Running: the two replication threads, one reading from the master, the other executing SQL on the replica
❏ Seconds_Behind_Master: seconds between the timestamp in the binlog and the time on the replica
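The running-status fields above are the ones a monitoring check usually reads. The following is a minimal Python sketch, not part of the original tutorial, that parses `show slave status\G` text and applies a simple health rule. The function names and the 30-second lag threshold are hypothetical, and multi-line values (such as a wrapped Executed_Gtid_Set) are not handled.

```python
def parse_vertical_status(text):
    """Parse `show slave status\\G` style output into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        # Skip the row banner and the "N row in set" footer.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields, max_lag_seconds=30):
    """True when both threads run and lag is within the assumed threshold."""
    lag = fields.get("Seconds_Behind_Master", "NULL")
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and lag.isdigit()  # NULL means the lag is unknown
        and int(lag) <= max_lag_seconds
    )


SAMPLE = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.23.17
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
1 row in set (0.08 sec)
"""

status = parse_vertical_status(SAMPLE)
print(status["Master_Host"], replication_healthy(status))
```

Treating a NULL Seconds_Behind_Master as unhealthy is a design choice of this sketch: a stopped SQL thread reports NULL rather than a number.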
SHOW SLAVE STATUS
Log file information
Master_Log_File: mysql-bin.000751
Read_Master_Log_Pos: 67329044
Relay_Log_File: mysqld-relay-bin.000403
Relay_Log_Pos: 67329189
Relay_Master_Log_File: mysql-bin.000751
Exec_Master_Log_Pos: 67329044
Relay_Log_Space: 67329388
❏ Master_Log_File, Read_Master_Log_Pos: binary log file on the master, and the position the IO thread has read up to
❏ Relay_Log_File, Relay_Log_Pos: position in the relay log on the replica
❏ Relay_Master_Log_File, Exec_Master_Log_Pos: position in the master's binary log that the SQL thread has executed on the replica
SHOW SLAVE STATUS
SSL information
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
SHOW SLAVE STATUS
Filtering information
Until_Condition: None
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Filtered replication settings -- Use with caution
Defining GTID
- Global Transaction IDentifier
- source_id:transaction_id
  - e200c55b-7832-11e5-9d51-00259082ca78:1
- source_id - normally the server_uuid of the master
- transaction_id - sequential integer (starts at 1) representing the order in which a transaction was committed on the source
Binary Log Contents
Standard replication
# at 2637016
#160307 15:05:42 server id 3031 end_log_pos 2637016 Table_map: `C0070735`.`FormStats` mapped to number 139874072
#160307 15:05:42 server id 3031 end_log_pos 2637088 Update_rows: table id 139874072 flags: STMT_END_F
BINLOG '
RgneVhPXCwAAOAAAANg8KAAABhPVggAAAEACUMwMDcwMzczNQAJm9ybVN0YXRzAAQDDAMDAAA=
RgneVhjXCwAASAAAACA9KAAABhPVggAAAEABP//8AdPAACwPPLvVIAACgBAAAMAAAA8AdPAACwPPLvVRIAACkBAAAMAAAA
'/*!*/;
### UPDATE C0070735.FormStats
### WHERE
###   @1=20231
###   @2=2016-03-07 15:00:00
###   @3=296
###   @4=12
### SET
###   @1=20231
###   @2=2016-03-07 15:00:00
###   @3=297
###   @4=12
# at 2637088
#160307 15:05:42 server id 3031 end_log_pos 2637115 Xid = 2004736153
COMMIT/*!*/;
Binary Log Contents
GTID Enabled
#160323 14:37:48 server id 168433453 end_log_pos 41956 CRC32 0xee79822d GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '81b0bb5e-f004-11e5-aaa3-b8ca3a676681:100'/*!*/;
# at 41956
#160323 14:37:48 server id 168433453 end_log_pos 42033 CRC32 0xdac047b0 Query thread_id=4611 exec_time=0 error_code=0
SET TIMESTAMP=1458769068/*!*/;
BEGIN
/*!*/;
# at 42033
#160323 14:37:48 server id 168433453 end_log_pos 42094 CRC32 0xd3c70a01 Table_map: `C01840587`.`FormStats` mapped to number 74
# at 42094
#160323 14:37:48 server id 168433453 end_log_pos 42166 CRC32 0xa031417e Update_rows: table id 74 flags: STMT_END_F
BINLOG '
rAzzVhMtFwoKPQAAAG6kAAAAAEoAAAAAAUMwMTg0MDU4NwAJRm9ybVN0YXRzAAQDEgMDAQAAAQrH0w==
rAzzVh8tFwoKSAAAALakAAAAAEoAAA///wBAAAAJmY7uAAAgAAAAIAAADwBAAAAJmY7uAAAwAAAAIAAAB+QTGg
'/*!*/;
### UPDATE `C01840587`.`FormStats`
### WHERE
###   @1=4
###   @2='2016-03-23 14:00:00'
###   @3=2
SHOW SLAVE STATUS
NEW GTID Information
Master_UUID: 81b0bb5e-f004-11e5-aaa3-b8ca3a676681
Retrieved_Gtid_Set: 81b0bb5e-f004-11e5-aaa3-b8ca3a676681:1-51
Executed_Gtid_Set: 81b0bb5e-f004-11e5-aaa3-b8ca3a676681:1-51
GTID Sets & Related Variables
- Rather than just a single transaction_id, an interval is given
- A range of transactions
  - e200c55b-7832-11e5-9d51-00259082ca78:1-1234
- Two ranges with a gap (intervals from the same source are separated by colons; commas separate different sources)
  - e200c55b-7832-11e5-9d51-00259082ca78:1-1234:1236-1240
- Commonly used variables related to GTID
  - server_uuid
  - enforce_gtid_consistency
  - gtid_mode
  - gtid_next
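To make the set notation concrete, here is a minimal Python sketch (not from the original slides) that parses a GTID set string into per-source intervals. It assumes the standard textual form, where intervals of one source are colon-separated and different sources are comma-separated. The helper names are hypothetical.

```python
def parse_gtid_set(gtid_set):
    """Return {source_uuid: [(start, end), ...]} for a GTID set string."""
    result = {}
    # SHOW SLAVE STATUS wraps long sets across lines, so drop newlines first.
    for uuid_set in gtid_set.replace("\n", "").split(","):
        uuid_set = uuid_set.strip()
        if not uuid_set:
            continue
        uuid, *intervals = uuid_set.split(":")
        ranges = result.setdefault(uuid, [])
        for interval in intervals:
            start, _, end = interval.strip().partition("-")
            # A lone transaction_id such as "5" is the interval 5-5.
            ranges.append((int(start), int(end or start)))
    return result


def count_transactions(parsed):
    """Total number of transactions covered by a parsed GTID set."""
    return sum(end - start + 1
               for ranges in parsed.values()
               for start, end in ranges)


parsed = parse_gtid_set("e200c55b-7832-11e5-9d51-00259082ca78:1-1234:1236-1240")
print(parsed)
print(count_transactions(parsed))
```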
GTID vs Binlog Position
- Traditional replication coordinates
  - MASTER_LOG_FILE
  - MASTER_LOG_POS
- GTID replication coordinates
  - MASTER_AUTO_POSITION
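The practical difference between the two coordinate styles shows up in the CHANGE MASTER TO statement. The sketch below is an illustrative Python helper (the function name is hypothetical) that builds either form. It does no SQL escaping, so it is only suitable for trusted input.

```python
def change_master_sql(host, log_file=None, log_pos=None):
    """Build CHANGE MASTER TO using GTID auto-positioning or file/position."""
    if (log_file is None) != (log_pos is None):
        raise ValueError("log_file and log_pos must be given together")
    options = [f"MASTER_HOST='{host}'"]
    if log_file is None:
        # GTID coordinates: the replica works out what it is missing.
        options.append("MASTER_AUTO_POSITION=1")
    else:
        # Traditional coordinates: the operator supplies file and offset.
        options.append(f"MASTER_LOG_FILE='{log_file}'")
        options.append(f"MASTER_LOG_POS={int(log_pos)}")
    return "CHANGE MASTER TO " + ", ".join(options) + ";"


print(change_master_sql("server1"))
print(change_master_sql("server1", "mysql-bin.000751", 67329044))
```

With GTIDs the statement is the same for every replica, whereas file/position values have to be looked up for each repoint.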
Enabling GTIDs in Oracle MySQL 5.6
- Guarantee the master and replica(s) are in sync by enabling read_only on the master and allowing replication to catch up
- Shut down the master and replica(s)
- Add the following to my.cnf
  - enforce-gtid-consistency
  - gtid-mode = ON
  - skip-slave-start
  - log-slave-updates
  - read-only = 1
- Start the master and replica(s)
- Start replication
  CHANGE MASTER TO MASTER_HOST='server1', MASTER_AUTO_POSITION=1; START SLAVE;
- Disable read_only on the master and remove read-only from my.cnf
Potential Replication Conflicts
- Traditional replication requires unique server_id values in a cluster
- GTID replication also requires unique sources (server_uuid)
  - data_dir/auto.cnf contains the server_uuid value
  - When the server starts, it generates a new server_uuid and saves it in auto.cnf if the file is not found
- Beware cloning of replicas
  - The auto.cnf file will be copied as well and needs to be removed prior to server start
  - This is similar to the process of changing the server_id in my.cnf
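A cloned auto.cnf is easy to miss, so a quick comparison across hosts can help. The following Python sketch is an illustration only: it assumes the contents of each host's auto.cnf have already been collected (how they are fetched is left out), and that the file uses the usual `[auto]` section with a `server-uuid` key. The function names and sample values are hypothetical.

```python
import configparser
from collections import defaultdict


def read_server_uuid(auto_cnf_text):
    """Extract server-uuid from the text of an auto.cnf file."""
    parser = configparser.ConfigParser()
    parser.read_string(auto_cnf_text)
    return parser["auto"]["server-uuid"]


def find_duplicate_uuids(auto_cnf_by_host):
    """Map each server_uuid used by more than one host to those hosts."""
    hosts_by_uuid = defaultdict(list)
    for host, text in auto_cnf_by_host.items():
        hosts_by_uuid[read_server_uuid(text)].append(host)
    return {uuid: sorted(hosts)
            for uuid, hosts in hosts_by_uuid.items() if len(hosts) > 1}


# Illustrative input: server3 was cloned from server2 with auto.cnf left in place.
collected = {
    "server1": "[auto]\nserver-uuid=c866b7ac-d0e4-11e5-9faf-020a6fe2a217\n",
    "server2": "[auto]\nserver-uuid=cc83d91e-d0e4-11e5-9faf-02cddc874cbb\n",
    "server3": "[auto]\nserver-uuid=cc83d91e-d0e4-11e5-9faf-02cddc874cbb\n",
}
duplicates = find_duplicate_uuids(collected)
print(duplicates)
```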
Implementation
Enabling GTID and making topology changes
Connection setup
❏ See https://goo.gl/NVKBL3 for all the commands being run
  ■ Start in the “Prep work” tab
❏ ssh to your provided host in 3 terminal windows/tabs/panes
  ■ this will be server1
  ■ ssh directly to server2 and server3 in the other windows
❏ Use mysql (no password) to connect to local mysql instance
A note about conventions
❏ A lot of this is BAD PRACTICE
  ■ root user, no password
  ■ Hacky, incorrect SQL for fixes
  ■ Slow progress between steps
  ■ Not worrying about errant writes between steps
  ■ Making deliberate mistakes
Starting topology
[Diagram: server1 (master) replicating to server2 (replica) and server3 (replica)]
MySQL instance information
❏ Ubuntu on DigitalOcean droplet
❏ Percona Server 5.6.32-78.0
❏ Important file locations:
  ■ MySQL config file: /etc/mysql/my.cnf
  ■ datadir: /var/lib/mysql/
  ■ Binary logs: /var/lib/mysql/mysql-bin.00000x
  ■ Relay logs: /var/lib/mysql/relay-bin.00000x
❏ Restart mysqld via service mysql start/stop/restart
MySQL instance: replication configuration
❏ Regular non-GTID replication
❏ binlog_format=ROW
  ■ mysqlbinlog --no-defaults --base64-output=DECODE-ROWS -vvv [binlog]
❏ Some variables already set:
  ■ skip-slave-start
  ■ log-bin, log-slave-updates
Important GTID variables
❏ gtid_mode
  ■ Static variable (requires restart)
  ■ If ON, requires log-bin, log-slave-updates, enforce-gtid-consistency also set
  ■ “Disabled” by gtid_deployment_step
❏ gtid_deployment_step
  ■ Percona-specific dynamic variable
  ■ Used as a temporary setting on replicas
  ■ ON means:
    ● can replicate non-GTID binary log events from master
    ● direct writes won’t have GTIDs
Important GTID variables (cont.)
❏ master_auto_position
  ■ Tells server to use GTID replication protocol
    ● Needed for simplified server failover/repointing
  ■ Set in CHANGE MASTER statement instead of master_log_file, master_log_pos
  ■ Shown as Auto_Position in SHOW SLAVE STATUS
  ■ Setting to 1 tells server to only replicate GTID events
Important GTID variables (cont.)
❏ Setting both gtid_deployment_step=ON and master_auto_position=1 will result in silently ignored non-GTID events
■ server3 will be used to demonstrate this
Prep work: general steps
❏ Use mysqlslap and inserts to generate writes between steps
❏ Edit the [mysqld] section in /etc/mysql/my.cnf on all 3 servers:
    enforce-gtid-consistency = 1
    gtid-mode = ON
❏ Restart server2 and server3 (service mysql restart)
❏ Set these variables on server2 and server3:
    SET GLOBAL gtid_deployment_step=ON;
    SET GLOBAL super_read_only=ON;
Prep work: results
❏ non-GTID writes to server1 still replicating properly
server1 (master): gtid_mode=OFF, gtid_deployment_step=OFF
server2 (replica): gtid_mode=ON, gtid_deployment_step=ON
server3 (replica): gtid_mode=ON, gtid_deployment_step=ON
Enable GTIDs: Topology change
❏ Failover to server2
❏ Intentional breaking/fixing of replication
Before: server1 (master), server2 (replica), server3 (replica)
After: server1 (replica), server2 (master), server3 (replica)
Important GTID status info variables
❏ Retrieved_Gtid_Set
  ■ All GTIDs received from the master
  ■ Resets on:
    ● CHANGE MASTER
    ● RESET SLAVE
    ● server restart (if relay-log-recovery is on)
❏ Executed_Gtid_Set
  ■ All GTIDs written to binary log
  ■ Same value seen in:
    ● SHOW MASTER STATUS
    ● SHOW SLAVE STATUS
    ● gtid_executed variable
GTID slave status
(root@server2) [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: No
...
Master_UUID: cc83d91e-d0e4-11e5-9faf-02cddc874cbb
...
Retrieved_Gtid_Set: cc83d91e-d0e4-11e5-9faf-02cddc874cbb:1-107
Executed_Gtid_Set: c866b7ac-d0e4-11e5-9faf-020a6fe2a217:1-442,
cc83d91e-d0e4-11e5-9faf-02cddc874cbb:1-107
Auto_Position: 0
GTID master status
(root@server2) [(none)]> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000002
Position: 1976131
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: c866b7ac-d0e4-11e5-9faf-020a6fe2a217:1-442,
cc83d91e-d0e4-11e5-9faf-02cddc874cbb:1-107
1 row in set (0.00 sec)
Enable GTIDs: general steps
❏ Set server1 to read-only
❏ Point server1 to server2
❏ Repoint server3 to server2 (incorrectly)
❏ Test server2 to server3 replication (broken)
❏ Fix server3 replication
❏ Fix server1 replication
Enable GTIDs: breaking things
server1 (replica): gtid_mode=OFF
server2 (master): gtid_mode=ON, gtid_deployment_step=ON
server3 (replica): gtid_mode=ON, gtid_deployment_step=ON, master_auto_position=1
Diagram labels on the replication links: error; silently dropped writes
Promote server1: Topology change
❏ Failover back to original topology
Before: server1 (replica), server2 (master), server3 (replica)
After: server1 (master), server2 (replica), server3 (replica)
Promote server1: general steps
❏ Set server2 to read-only
❏ Repoint server2 to server1
❏ Repoint server3 to server1
❏ Turn off read-only on server1
❏ Turn off replication on server1
Server2 relay to server3: Topology change
❏ Repoint server3 to server2
Before: server1 (master), server2 (replica), server3 (replica)
After: server1 (master), server2 (relay), server3 (replica)
Server2 relay to server3: non-GTID long way
❏ server3
  ■ STOP SLAVE
❏ server2
  ■ wait until replication is ahead of server3
  ■ FLUSH TABLES; FLUSH TABLES WITH READ LOCK; SHOW MASTER STATUS; SHOW SLAVE STATUS\G UNLOCK TABLES;
❏ server3
  ■ START SLAVE UNTIL [recorded server2 slave file/position]
  ■ wait until SHOW SLAVE STATUS says Slave_SQL_Running: No
  ■ CHANGE MASTER TO [recorded server2 master file/position]
  ■ START SLAVE
Server2 relay to server3: with GTIDs and master_auto_position=1
❏ server3
  ■ stop slave;
  ■ CHANGE MASTER TO master_host='server2';
  ■ start slave;
Maintenance
Maintenance
❏ Determine the currently writing master
❏ GTID set gaps
❏ Finding transactions in the binary logs
❏ Fixing transactions with gtid_next
❏ Faking and skipping transactions
Currently writing master
❏ Master_UUID can be misleading as to who is writing
server1 (master)
  server_uuid: bd933998-f2c5-11e5-bc9a-021b71e877a3
  Executed_Gtid_Set: bcc3d83d-f2c5-11e5-bc9a-029d985ea7a3:1-2,
                     bd933998-f2c5-11e5-bc9a-021b71e877a3:1-111
server2 (relay)
  server_uuid: bcc3d83d-f2c5-11e5-bc9a-029d985ea7a3
  Master_UUID: bd933998-f2c5-11e5-bc9a-021b71e877a3
server3 (replica)
  server_uuid: be2ed142-f2c5-11e5-bc9a-0274358cd201
  Master_UUID: bcc3d83d-f2c5-11e5-bc9a-029d985ea7a3
GTID set gaps
❏ Gaps within a GTID set occur when
  ❏ slave_parallel_workers > 1
  ❏ A transaction is missing
Executed_Gtid_Set
bcc3d83d-f2c5-11e5-bc9a-029d985ea7a3:1-2,
bd933998-f2c5-11e5-bc9a-021b71e877a3:1-111:113-120
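Spotting a gap by eye gets harder as sets grow. Here is a minimal Python sketch (function names are hypothetical) that reports the missing transaction ranges for one source in an Executed_Gtid_Set. It only looks between intervals, so IDs missing before the first interval (for example purged ones) are not reported.

```python
def find_gaps(intervals):
    """Return missing (start, end) ranges between GTID intervals."""
    gaps = []
    ordered = sorted(intervals)
    # Compare each interval with the one that follows it.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


def gaps_in_uuid_set(uuid_set):
    """Parse 'uuid:a-b:c-d' and return (uuid, list of gaps)."""
    uuid, *intervals = uuid_set.strip().split(":")
    parsed = []
    for interval in intervals:
        start, _, end = interval.partition("-")
        parsed.append((int(start), int(end or start)))
    return uuid, find_gaps(parsed)


uuid, gaps = gaps_in_uuid_set("bd933998-f2c5-11e5-bc9a-021b71e877a3:1-111:113-120")
print(uuid, gaps)
```

For the set shown on the slide, the only missing transaction is number 112 from source bd933998-f2c5-11e5-bc9a-021b71e877a3.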
Finding transactions
❏ mysqlbinlog
  ❏ --include-gtids
  ❏ --exclude-gtids
❏ Beware transactions without gtid_next (gtid_mode=OFF)
mysqlbinlog --no-defaults -vvv --base64-output=DECODE-ROWS --include-gtids='bd933998-f2c5-11e5-bc9a-021b71e877a3:112' /var/lib/mysql/mysql-bin.000002
SET @@SESSION.GTID_NEXT= 'bd933998-f2c5-11e5-bc9a-021b71e877a3:112'/*!*/;
Fixing transactions with gtid_next
❏ Accidental writes on a replica happen
  ❏ Who hasn’t forgotten to set sql_log_bin=0?
❏ Apply DDL to replicas for very large tables then promote
❏ Realign GTID sets to match the recorded direct write to the replica
ALTER TABLE mysqlslap.t1 ADD COLUMN newcol3 varchar(128);
Faking and skipping transactions
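❏ What if the change on the replica was already fixed or is irrelevant?
❏ How to skip a transaction
  ❏ sql_slave_skip_counter (cannot be used when gtid_mode=ON; commit an empty transaction with the GTID instead)
set gtid_next='xxx_gtid_xxx'; BEGIN; COMMIT; set gtid_next='AUTOMATIC';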
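When several transactions need to be faked, generating the statements avoids typos in the GTIDs. This is a minimal Python sketch (the helper name is hypothetical) that emits one empty transaction per GTID and then returns gtid_next to AUTOMATIC. It only builds the SQL text; stopping and restarting replication around it is left to the operator.

```python
def skip_statements(source_uuid, first_id, last_id=None):
    """Build SQL that commits an empty transaction for each GTID in a range."""
    last_id = first_id if last_id is None else last_id
    statements = []
    for txn_id in range(first_id, last_id + 1):
        # Claim the GTID, then commit nothing under it.
        statements.append(f"SET gtid_next='{source_uuid}:{txn_id}';")
        statements.append("BEGIN; COMMIT;")
    # Hand GTID assignment back to the server afterwards.
    statements.append("SET gtid_next='AUTOMATIC';")
    return statements


for statement in skip_statements("bd933998-f2c5-11e5-bc9a-021b71e877a3", 112):
    print(statement)
```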
Advanced Concepts
Things to investigate
Advanced Concepts
❏ GTID variables
❏ binlog_gtid_simple_recovery
❏ GTID Set functions
❏ START SLAVE UNTIL …
❏ SHOW SLAVE STATUS NONBLOCKING
GTID Variables
❏ gtid_executed
  ❏ Same information as seen in SHOW MASTER/SLAVE STATUS
❏ gtid_purged
  ❏ Subset of gtid_executed that is no longer in the binary logs
binlog_gtid_simple_recovery
❏ Controls how binary logs are iterated over when MySQL starts
❏ When set to TRUE
  ❏ gtid_executed is set based on Gtid_log_event in the newest binary log file
  ❏ gtid_purged is set based on Previous_gtids_log_event in the oldest file
❏ When FALSE
  ❏ Both variables are computed by iterating through the binary logs from newest to oldest (gtid_executed) and oldest to newest (gtid_purged)
  ❏ Used when there may be transactions without GTIDs prior to enabling gtid_mode or setting gtid_purged
GTID Set functions
❏ GTID_SUBSET(subset,set)
  ❏ Return true if subset is in the set
❏ GTID_SUBTRACT(set,subset)
  ❏ Return what is in set less the subset
  ❏ Know what is still available on the master or difference between binary logs
select gtid_subtract(@@global.gtid_executed,@@global.gtid_purged)
❏ WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(gtid_set,timeout)
  ❏ Wait for a gtid_set to complete or timeout, returning a count of transactions completed
❏ MASTER_POS_WAIT(log_name,log_pos,timeout)
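To show what GTID_SUBSET and GTID_SUBTRACT compute, here is a small Python emulation. It is a sketch for reasoning about sets offline, not the server's implementation: it expands every interval into individual IDs, which is fine for examples but wasteful for large sets. All function names are hypothetical.

```python
def expand(gtid_set):
    """Expand a GTID set string into a set of (uuid, transaction_id) pairs."""
    members = set()
    for uuid_set in gtid_set.replace("\n", "").split(","):
        if not uuid_set.strip():
            continue
        uuid, *intervals = uuid_set.strip().split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            for txn_id in range(int(start), int(end or start) + 1):
                members.add((uuid, txn_id))
    return members


def collapse(members):
    """Format (uuid, transaction_id) pairs back into a GTID set string."""
    by_uuid = {}
    for uuid, txn_id in sorted(members):
        runs = by_uuid.setdefault(uuid, [])
        if runs and runs[-1][1] == txn_id - 1:
            runs[-1][1] = txn_id          # extend the current interval
        else:
            runs.append([txn_id, txn_id])  # start a new interval
    parts = []
    for uuid, runs in by_uuid.items():
        spans = [str(a) if a == b else f"{a}-{b}" for a, b in runs]
        parts.append(":".join([uuid] + spans))
    return ",".join(parts)


def gtid_subset(subset, gtid_set):
    """Emulates GTID_SUBSET: is every GTID in subset also in gtid_set?"""
    return expand(subset) <= expand(gtid_set)


def gtid_subtract(gtid_set, subset):
    """Emulates GTID_SUBTRACT: GTIDs in gtid_set that are not in subset."""
    return collapse(expand(gtid_set) - expand(subset))


executed = "bd933998-f2c5-11e5-bc9a-021b71e877a3:1-120"
purged = "bd933998-f2c5-11e5-bc9a-021b71e877a3:1-50"
print(gtid_subset(purged, executed))
print(gtid_subtract(executed, purged))
```

Subtracting gtid_purged from gtid_executed, as in the select statement on the slide, leaves the GTIDs that are still present in the binary logs.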
START SLAVE UNTIL ...
❏ SQL_BEFORE_GTIDS
❏ SQL_AFTER_GTIDS
❏ SQL_AFTER_MTS_GAPS
  ❏ Reverting slave_parallel_workers to 0
SHOW SLAVE STATUS NONBLOCKING
❏ Issuing STOP SLAVE creates a global lock on other SLAVE commands
❏ Lock remains while the last replication event group finishes
❏ rpl_stop_slave_timeout controls how long stop slave will wait (defaults to 1 YEAR)
❏ Issuing SHOW SLAVE STATUS will wait until the STOP SLAVE unlocks
❏ SHOW SLAVE STATUS NONBLOCKING allows bypassing the lock
❏ 5.7 no longer blocks on SHOW SLAVE STATUS
New 5.7 Features
Things to look forward to
New items or changes in 5.7
❏ Enabling GTID online
❏ mysql.gtid_executed table
❏ Use performance_schema to view MTR details
Enabling GTID online
❏ gtid_mode and enforce_gtid_consistency are now dynamic
❏ No longer requires restarts or promotions
❏ enforce_gtid_consistency - WARN -> ON
❏ gtid_mode - OFF <-> OFF_PERMISSIVE <-> ON_PERMISSIVE <-> ON
❏ No longer requires log_bin and log_slave_updates
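Because gtid_mode in 5.7 moves along the chain one value at a time, the sequence of statements between any two modes can be derived mechanically. The Python sketch below (function name hypothetical) lists them in order. It only produces the SET statements; a real rollout also involves checks between steps, such as setting enforce_gtid_consistency and waiting for anonymous transactions to drain, which are not modelled here.

```python
GTID_MODES = ["OFF", "OFF_PERMISSIVE", "ON_PERMISSIVE", "ON"]


def gtid_mode_steps(current, target):
    """List the SET statements that move gtid_mode one step at a time."""
    i, j = GTID_MODES.index(current), GTID_MODES.index(target)
    step = 1 if j >= i else -1
    # Walk from the neighbour of `current` up to and including `target`.
    return [f"SET GLOBAL gtid_mode = {GTID_MODES[k]};"
            for k in range(i + step, j + step, step)]


for statement in gtid_mode_steps("OFF", "ON"):
    print(statement)
```

The same function covers rolling back, for example from ON to OFF, since the chain on the slide is bidirectional.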
mysql.gtid_executed table
❏ Stores the source uuid and start/end intervals for executed transactions (GTID set)
❏ The table is compressed for contiguous GTID sets depending on log_bin
  ❏ If log_bin is OFF, then gtid_executed_compression_period (named executed_gtids_compression_period in early 5.7 releases) determines how many transactions are executed before compressing
  ❏ If log_bin is ON, then the table is compressed during each binary log rotation
❏ Give us your feedback!
  ❏ https://www.surveymonkey.com/r/PLAM16GTID
See us again
❏ Mark Filipi
  ❏ Using Ansible to Manage MySQL
    4 October 5:20 PM - 06:10 PM in Lausanne
❏ Gillian Gunson and Brian Cain
  ❏ At the bar
Thank you!