MYSQL PERFORMANCE TUNING Arnaud Adant June 21 2017 The London MySQL
WHO I AM
• 3.5 years @ Oracle MySQL as support engineer
  – Filed hundreds of bugs and feature requests
• InnoDB
• Optimizer
• High availability
• Performance
• Joined Jump Trading in 2015
AGENDA
• Introduction
• Mostly performance tuning tips
  • this presentation is an update of the 50 tips presentation done while at Oracle for MySQL 5.6
• Questions
TOP PERFORMANCE ISSUES
• Bad SQL queries : 90 % of the time
• Long running idle transactions
  • high history_list_length
  • MDL
• Replication lag
  • no primary key and row based replication
• Wrong / bad configuration
  • small buffer pool, query cache, small redo logs
• OS, hardware
PERFORMANCE CHECK LIST
• #1 Monitor the queries …
  • tune the apps
• #2 Monitor replication
  • tune replication
• #3 Monitor the status variables
  • tune the config
• #4 Monitor the OS
  • tune the OS
• #5 When in doubt, benchmark
THE APP (DEFINITION)
• Everything that is not shipped with the database
  • schema
  • queries
  • user connections
  • the app data in data files
  • …
TUNING PROCESS
[Diagram: cycle of Monitor / measure, Tune, and Config / design. Labels: replication, status variables, mysqld, backup, the app, config variables, benchmark, OS, app design, data files.]
CHANGE PROCESS
• One change at a time, test on dev = prod, then deploy
[Diagram: Development: Change, Test, Monitor; if OK, deploy to Production and Monitor.]
TUNING PROCESS
1. Config
  • OS
  • Config variables
  • App design
  • Benchmark
2. Monitor
  • App / status variables / replication / mysqld / OS
3. Tune using the change process
  • App
  • Config variables
  • Table spaces
  • OS
CONFIG / OS 1/2
• RAM
• CPU
• Storage
• OS
• OS limits
• Battery backed disk cache
• Memory allocator
• CPU affinity
CONFIG / OS 2/2
• I/O scheduler
• File system
• Mount options
• Disk configuration
• NUMA
RAM
• The active data should fit in the buffer pool
• MySQL connections and caches take memory
• ECC RAM recommended
• Enable huge pages for large buffer pools : large-pages
• Extra RAM for
  • FS cache
  • Monitoring
  • RAM disk (tmpfs)
• RAM is split over NUMA nodes
CPU
• Fast CPU for single threaded performance
• Recent servers have 32 to 80 cores.
• Enable hyper-threading
• MySQL 5.7 scales to 64 cores
• Minimize the number of sockets !
CPU
http://dimitrik.free.fr/blog/archives/2016/02/mysql-performance-scalability-on-oltp_rw-benchmark-with-mysql-57.html
STORAGE
• Good for IO bound loads
• HDD for sequential reads and writes
• Bus-attached SSD for random reads and writes
  • NVMe : Non-Volatile Memory Express
• Big SATA or other disk for log files
• Several disks !
• If striping, mind the stripe size and alignment
• Lifetime of low end SSDs is a concern
OPERATING SYSTEM
• Linux !
  • pick a modern distribution
• also works well on Windows, FreeBSD, MacOS, Solaris
OS LIMITS (LINUX)
• Max open files per process
  • ulimit -n
  • limits the number of file handles (connections, open tables, …)
• Max threads per user
  • ulimit -u
  • limits the number of threads (connections, event scheduler, shutdown)
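The two limits above can also be read programmatically. A minimal sketch using Python's standard resource module; note that it reports the limits of the process running the script, not those of mysqld (for a running server, /proc/<pid>/limits is the place to look). The helper name describe_limit is illustrative only.

```python
import resource


def describe_limit(name, limit_id):
    """Return a one-line summary of the soft and hard value of a resource limit."""
    soft, hard = resource.getrlimit(limit_id)

    def fmt(value):
        # RLIM_INFINITY is how the OS reports "unlimited".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    # Equivalent of `ulimit -n` and `ulimit -u` for the current process.
    print(describe_limit("max open files", resource.RLIMIT_NOFILE))
    print(describe_limit("max user processes", resource.RLIMIT_NPROC))
```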
BATTERY BACKED DISK CACHE
• Usually faster fsyncs
  • InnoDB redo logs
  • binary logs
  • data files
• Crash safety
• Durability (ACID)
• Applies to SSD
MEMORY ALLOCATOR
• jemalloc is a good malloc replacement
  [mysqld_safe]
  malloc-lib=/usr/lib64/libjemalloc.so.1
  • Default with MariaDB
• tcmalloc shipped on Linux with MySQL
  [mysqld_safe]
  malloc-lib=tcmalloc
CPU USAGE TUNING
• taskset command on Linux for core assignment (-p targets a running process)
  • taskset -pc 1-4 `pidof mysqld`
  • taskset -pc 1,2,3,4 `pidof mysqld`
• niceness
  [mysqld_safe]
  nice=-20
  • also in /etc/security/limits.conf
• CPU governor
  • set to performance
FILE SYSTEM
• ext4 best choice for speed and ease of use
  • fsyncs a bit slower than ext3
  • more reliable
• xfs is excellent (for experts only)
  • With innodb_flush_method = O_DIRECT
  • less stable recently
• ext3 is also a good choice
MOUNT OPTIONS
• ext4 (rw,noatime,nodiratime,nobarrier,data=ordered)*
• xfs (rw,noatime,nodiratime,nobarrier,logbufs=8,logbsize=32k)
• SSD specific
  • innodb_use_trim
  • innodb_page_size = 4K
  • innodb_flush_neighbors = 0
* use nobarrier only if you have a battery backed disk cache
I/O SCHEDULER
• deadline is generally the best I/O scheduler
  • echo deadline > /sys/block/{DEVICE-NAME}/queue/scheduler
• the best value is hardware and workload specific
  • noop on high end controller (SSD, good RAID card …)
  • deadline otherwise
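To check which scheduler is currently active, the same sysfs file can be read back: the kernel lists the available schedulers and puts the active one in square brackets. A small sketch under that assumption; the function names and the "sda" device name are illustrative only.

```python
def active_scheduler(scheduler_line):
    """Return the bracketed entry, e.g. 'noop [deadline] cfq' -> 'deadline'."""
    for token in scheduler_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_active_scheduler(device):
    """Read the active scheduler of a block device such as 'sda' (hypothetical name)."""
    with open(f"/sys/block/{device}/queue/scheduler") as handle:
        return active_scheduler(handle.read())


if __name__ == "__main__":
    print(active_scheduler("noop [deadline] cfq"))
```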
DISK CONFIGURATION
• everything on one disk is killing performance
• several disks (RAID)
• or distribute the data files on different disks
  • data files (ibd files)
  • main InnoDB table space ibdata
  • redo logs
  • undo logs (if separate)
  • binary logs
NUMA
• NUMA architecture is the norm nowadays for MySQL servers
• Problem : RAM is allocated to CPU sockets
• The buffer pool should be distributed across the RAM of every node
• Percona, MariaDB and MySQL >= 5.6 support NUMA
• The usual trick is using mysqld_safe
  • flush the FS cache --flush-caches
  • use NUMA interleave --numa-interleave
NUMA
• From 10.1, 5.7, --innodb-numa-interleave
• Allocates the buffer pool at startup
• “Replaces” innodb_buffer_pool_populate
• Default off
• “For the innodb_numa_interleave option to be available, MySQL must be compiled on a NUMA-enabled Linux system”
• MariaDB : only custom builds MDEV-12924
CONFIG / VARIABLES 1/2
• Buffer pool size
• Query cache off
• Use a thread pool
• Cache the tables
• Cache threads
• Per thread memory usage
• Default storage engine
• Buffer pool contention
CONFIG / VARIABLES 2/2
• Large redo logs
• IO capacity
• InnoDB flushing
• Thread concurrency
• InnoDB table spaces
• Transaction isolation
• Replication : sync_binlog
• Replication : parallel threads
• Connector configuration
INNODB BUFFER POOL SIZE
• innodb_buffer_pool_size
• Not too large for the data
• Do not swap !
• Beware of out-of-memory crashes if swapping is disabled
• Active data <= innodb_buffer_pool_size <= 0.8 * RAM
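The sizing rule on this slide is a simple double inequality. A sketch of it as a check, with all sizes in bytes; the function name and the sample figures are illustrative only, and the size of the active data set has to be estimated separately.

```python
GIB = 1024 ** 3


def buffer_pool_size_ok(active_data_bytes, buffer_pool_bytes, ram_bytes):
    """Check: active data <= innodb_buffer_pool_size <= 0.8 * RAM."""
    return active_data_bytes <= buffer_pool_bytes <= 0.8 * ram_bytes


if __name__ == "__main__":
    # Hypothetical server: 64 GiB of RAM, about 40 GiB of active data.
    print(buffer_pool_size_ok(40 * GIB, 48 * GIB, 64 * GIB))  # within both bounds
    print(buffer_pool_size_ok(40 * GIB, 60 * GIB, 64 * GIB))  # above 0.8 * RAM
```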
DISABLE THE QUERY CACHE
• Single threaded cache, removed in 8.0
• Only if threads_running <= 4
• Becomes fragmented
• Cache should be in the App !
• Off by default from 5.6
• query_cache_type = 0
• query_cache_size = 0
• Problem if qcache_free_blocks > 50k
ENABLE THE THREAD POOL
• https://www.percona.com/blog/2014/01/23/percona-server-improve-scalability-percona-thread-pool/
• Stabilize TPS for high concurrency
• Useful if threads_running > hardware threads
• Decrease context switches
• Several connections for one execution thread
• Acts as a Speed Limiter
• MySQL commercial, Percona, MariaDB
TABLE CACHING
• table_open_cache
  • not too small, not too big
  • opened_tables / sec
• table_definition_cache
  • do not forget to increase
  • opened_table_definitions / sec
• table_open_cache_instances = 8 or 16 (MySQL and Percona only)
• innodb_open_files
• metadata_locks_hash_instances = 256 (no longer an issue in 5.7)
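Rates such as opened_tables / sec come from sampling a cumulative status counter twice and dividing the difference by the interval. A minimal sketch; the function name and the sample values are illustrative, and the two counter values are assumed to come from SHOW GLOBAL STATUS taken interval_seconds apart.

```python
def rate_per_second(before, after, interval_seconds):
    """Per-second rate of a cumulative counter sampled twice."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (after - before) / interval_seconds


if __name__ == "__main__":
    # Hypothetical Opened_tables samples taken 60 seconds apart.
    print(rate_per_second(1000, 1600, 60))
```

The same helper applies to any counter quoted per second or per uptime in this deck, for example threads_created or created_tmp_disk_tables.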
THREAD CACHING
• Thread creation is expensive, so caching is important
• thread_cache_size
  • decreases threads_created rate
  • capped by max user processes (see OS limits)
• 5.7.2 refactors this code
• Default value is calculated in 5.7
• http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_thread_cache_size
INNODB STORAGE ENGINE
• Should be the default storage engine
• Do not use MyISAM unless you know what you are doing
• The most advanced storage engine is InnoDB
• Scalable
• Temporary tables use InnoDB in 5.7
  • the memory engine is still used when using less than tmp_table_size
BUFFER POOL CONTENTION
• innodb_buffer_pool_instances >= 8
• Reduce rows_examined / sec (see Bug #68079)
• 8 is the default value in 5.6 !
• innodb_spin_wait_delay = 96 on high concurrency
• Use read only transactions when possible
LARGE REDO LOGS
• Redo logs defer the expensive changes to the data files
• Recovery time is no longer an issue
• innodb_log_file_size = 2047M before 5.6
• innodb_log_file_size >= 2047M from 5.6
• Bigger is better for write QPS stability
• You want to avoid “furious flushing”
• innodb_log_files_in_group = 2 is usually fine
IO CAPACITY
• IO capacity should mirror device IO capacity in IOPS
• innodb_io_capacity should be larger for SSD
• Impacts flushing
• In 5.6, innodb_lru_scan_depth is per buffer pool instance
  • so innodb_lru_scan_depth = innodb_io_capacity / innodb_buffer_pool_instances
• Default innodb_io_capacity_max = max(2000, 2 * innodb_io_capacity)
• From 5.7, innodb_page_cleaners, default = 4
PER THREAD MEMORY USAGE
• Each thread allocates memory
• estimate = max_used_connections * (
    read_buffer_size + read_rnd_buffer_size + join_buffer_size + sort_buffer_size + binlog_cache_size + thread_stack + 2 * net_buffer_length … + n * tmp_table_size where n >= 0 )
• For a more accurate measure, check the performance_schema memory metrics or, in MariaDB, select * from information_schema.processlist.
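The estimate above is plain arithmetic over a handful of system variables. A sketch of it follows; the function name, the dictionary layout and the sample values are illustrative (not recommended settings), and the terms the slide elides with "…" are left out rather than guessed.

```python
# Variables summed once per thread, as listed on the slide.
PER_THREAD_BUFFERS = (
    "read_buffer_size",
    "read_rnd_buffer_size",
    "join_buffer_size",
    "sort_buffer_size",
    "binlog_cache_size",
    "thread_stack",
)


def estimate_thread_memory(variables, max_used_connections, tmp_tables_per_thread=0):
    """Rough upper estimate in bytes; tmp_tables_per_thread is the slide's n (n >= 0)."""
    per_thread = sum(variables[name] for name in PER_THREAD_BUFFERS)
    per_thread += 2 * variables["net_buffer_length"]
    per_thread += tmp_tables_per_thread * variables["tmp_table_size"]
    return max_used_connections * per_thread


# Hypothetical values in bytes, for illustration only.
SAMPLE_VARIABLES = {
    "read_buffer_size": 131072,
    "read_rnd_buffer_size": 262144,
    "join_buffer_size": 262144,
    "sort_buffer_size": 262144,
    "binlog_cache_size": 32768,
    "thread_stack": 262144,
    "net_buffer_length": 16384,
    "tmp_table_size": 16777216,
}

if __name__ == "__main__":
    total = estimate_thread_memory(SAMPLE_VARIABLES, max_used_connections=100)
    print(f"{total / 1024 ** 2:.1f} MiB")
```

Because buffers such as join_buffer_size can be allocated more than once per query, this stays an estimate; the performance_schema memory instrumentation mentioned on the slide is the more accurate source.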
THREAD CONCURRENCY
• No thread pool :
  • innodb_thread_concurrency = 16 - 32 in 5.5
  • innodb_thread_concurrency = 36 in 5.6
  • align to HW threads if less than 32 cores
• Thread pool :
  • innodb_thread_concurrency = 0 (unlimited) is fine
• innodb_concurrency_tickets : higher for OLAP, lower for OLTP
INNODB FLUSHING
• Redo logs :
  • innodb_flush_log_at_trx_commit = 1 // best durability
  • innodb_flush_log_at_trx_commit = 2 // better performance
  • innodb_flush_log_at_trx_commit = 0 // best performance
• Data files only :
  • innodb_flush_method = O_DIRECT // Linux, skips the FS cache
• Increase innodb_adaptive_flushing_lwm (fast disk)
INNODB_FILE_PER_TABLE
• Default value : 1 table = 1 table space = 1 ibd file
• Not so good for small tables.
• Good for large tables.
• Default from 5.6; before that, all tables lived in the system table space.
• From 5.7, a user defined table space can host several tables.
• http://dev.mysql.com/doc/refman/5.7/en/create-tablespace.html
TRANSACTION ISOLATION
• Default = REPEATABLE-READ
• Oracle database, Sybase ASE default is READ-COMMITTED
• Less locking, no gap locks, less overhead
• transaction-isolation = REPEATABLE-READ
• If you enable READ-COMMITTED, make sure binlog_format=ROW.
REPLICATION DURABILITY
• Replication is crash safe from 5.6
• Replication state was stored in files
• Now stored in InnoDB tables in the mysql schema
• However the binary logs are stored on disk :
  • sync_binlog = 1
  • no reason not to use from 5.6
MULTI-THREADED REPLICATION
• MariaDB : slave_parallel_threads
• MySQL / Percona : slave_parallel_workers
• > 1 when lag is a concern
• For a recent comparison, see https://www.percona.com/live/data-performance-conference-2016/sites/default/files/slides/2016-04-21_plmce_mysql_parallel_replication-inventory_use-case_and_limitations.pdf
CONNECTOR TUNING
• Connectors can also be tuned.
• JDBC property for maximum performance :
  • useConfigs=maxPerformance
  • Use if the server configuration is stable
  • Removes frequent
    • SHOW COLLATION
    • SHOW GLOBAL VARIABLES
• Fast validation query : /* ping */
APPLICATION DESIGN 1/2
• Schema design
• Indexes at the right place
• Remove redundant indexes
• Reduce rows examined
• Reduce sent rows
• Minimize locking
• Minimize temporary tables (on disk)
• Minimize sorting on disk
APPLICATION DESIGN 2/2
• Avoid long running transactions
• Close prepared statements
• Close idle connections
• Do not use the information_schema in your app
• Views may not be good for performance
  – temporary tables (on disk)
• Replace truncate with drop table / create table
• Tune the replication thread
• Cache data in the app
• Scale out, shard
SCHEMA DESIGN
• create a PK for each table !
• integer primary keys
• avoid varchar, composite for PK
• latin1 vs. utf8 vs. utf8mb4
• the smallest varchar for a column
• smallest data type for a column
• keep the number of partitions low (< 100, optimal performance < 10) in 5.6. No longer an issue in 5.7.
• use compression for blob / text data types
INDEXES
• Fast path to data
• B-TREE, R-TREE (spatial), full text
• for sorting / grouping
  • without temporary table
• covering indexes
  • contain all the selected data
  • save access to full record
  • reduce random reads
REDUNDANT INDEXES
• Not good for write performance
  • duplicated data
  • resources to update
  • confuse the optimizer
• Use SYS schema views
  • schema_unused_indexes
REDUCE ROWS_EXAMINED
• Rows read from the storage engines
• Rows_examined
  • slow query log
  • P_S statement digests
  • SYS schema
• Handler%
  • show session status where variable_name like 'Handler%' or variable_name like '%tmp%';
• optimize if rows_examined > 10 * rows_sent
• usually due to missing indexes
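The 10x rule of thumb above is easy to apply to figures taken from the slow query log or the statement digests. A minimal sketch; the function name and the sample numbers are illustrative, and the ratio is exposed as a parameter because 10 is a guideline rather than a hard limit.

```python
def needs_optimization(rows_examined, rows_sent, ratio=10):
    """Flag a statement when rows_examined > ratio * rows_sent."""
    return rows_examined > ratio * rows_sent


if __name__ == "__main__":
    # Hypothetical statement: 50,000 rows read to return 10 rows.
    print(needs_optimization(50000, 10))
```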
REDUCE ROWS_SENT
• Found in the slow query log, the SYS schema
• Number of rows that are returned by queries to the clients
• rows_sent <= rows_examined
• Network / CPU expensive
• Client compression can help.
• Usually bad design.
• Use caching or LIMIT for UI
• No human can parse 10k rows
REDUCE LOCKING
• Locking has a performance impact because locks are kept in memory
• Can be seen in show engine innodb status
• UPDATE, SELECT FOR UPDATE, DELETE, INSERT SELECT
• Use a PK ref, UK ref to lock
• Avoid large index range and table scans
• Reduce rows_examined for locking SQL
• Commit when possible
TEMPORARY TABLES (ON DISK)
• Large temporary tables on disk
  • handler_write (handler_tmp_write, handler_tmp_update in MariaDB)
  • created_tmp_disk_tables
  • monitor tmpdir usage
• Frequent temporary tables on disk
  • High created_tmp_disk_tables / uptime
  • show global status like '%tmp%';
• Available in the SYS schema, select * from sys.statement_analysis
MIND THE SORT (ON DISK)
• Key status variable is :
  • sort_merge_passes : a status variable, also available at the session level
  • if it occurs often, you can try to increase sort_buffer_size
  • find the query and fix it with an index if possible
• http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
LONG RUNNING (IDLE) TRANSACTIONS
• Usually the oldest transactions in show engine innodb status
• High history_list_length (a status variable in MariaDB)
• Prevent the purge
• Decrease performance
• Can also prevent schema changes (due to MDL locks)
• Can block backups (FTWRL, see MDEV-12620)
CLOSE PREPARED STATEMENTS
• com_stmt_prepare - com_stmt_close ~= 0
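The check above is the difference of two status counters. A sketch that takes the output of SHOW GLOBAL STATUS as a dictionary (values arrive as strings from most connectors, hence the int conversion); the function name and the sample figures are illustrative only.

```python
def unclosed_prepared_statements(status):
    """Com_stmt_prepare - Com_stmt_close; a steadily growing value suggests a leak."""
    return int(status["Com_stmt_prepare"]) - int(status["Com_stmt_close"])


if __name__ == "__main__":
    # Hypothetical counters from SHOW GLOBAL STATUS.
    print(unclosed_prepared_statements({"Com_stmt_prepare": "1200", "Com_stmt_close": "1198"}))
```

A small positive value is expected while statements are in use; what matters is whether the difference keeps growing between samples.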
CLOSE IDLE CONNECTIONS
• Idle connections consume resources
• Either refresh or disconnect them
MISCELLANEOUS
• Do not use the information_schema in your App
• Replace truncate with drop table / create table due to
  • Bug #68184 Truncate table causes innodb stalls, fixed in MySQL 8.0
• Views may not be good for performance
  • temporary tables (on disk)
  • joining other views
• Scale out, shard
CACHE DATA IN THE APP
• Good for CPU / IO
• Cache the immutable or the expiring !
  • referential data
  • memcached / redis
• Query cache can be disabled
• Identify frequent statements
  • perl mysqldumpslow.pl -s c slow60s.log
  • pt-query-digest
• Possibly use proxies such as ProxySQL, MaxScale
MONITOR THE REPLICATION THREADS
• Slow query log with
  • log-slow-slave-statements is now dynamic (Bug #59860) from 5.6.11
  • MySQL and Percona only
• Performance_schema >= 5.6.14
• binlog_format = ROW
• show global status like 'Handler%'
• SYS (on table scans), detect tables not having PK
• In case of issue with a particular table, check the MDL locks.
  • MariaDB and 5.7 have a way to see them
MONITOR / MAINTAIN
• Monitor the database and OS
• Mine the slow query log
• Use the performance_schema and SYS schema
• Backup the database
• Upgrade regularly
MONITOR THE DATABASE
• essential for any professional DBA
• part of the tuning and change processes
• alerts
• graphs
• availability and SLAs
• the effect of tuning
• query analysis
MINE THE SLOW QUERY LOG
• Dynamic collection
• The right interval
• Top queries
  • pt-query-digest
• Sort by query time desc
  • perl mysqldumpslow.pl -s t slow.log
• Sort by rows_examined desc
• Top queries at the 60s range
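When the dedicated tools are not at hand, the two sort orders above can be approximated with a few lines of script. The sketch below assumes the standard slow query log layout, where each entry carries a "# Query_time: … Lock_time: … Rows_sent: … Rows_examined: …" line followed by the statement; extended formats with extra fields may need a different pattern. All names and the sample log are illustrative.

```python
import re

# Matches the statistics line of a standard slow query log entry.
STATS = re.compile(
    r"^# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: [\d.]+\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_slow_log(lines):
    """Return one dict per entry with query_time, rows_sent, rows_examined and sql."""
    entries, current = [], None
    for line in lines:
        match = STATS.match(line)
        if match:
            current = {
                "query_time": float(match["query_time"]),
                "rows_sent": int(match["rows_sent"]),
                "rows_examined": int(match["rows_examined"]),
                "sql": "",
            }
            entries.append(current)
        elif current is not None and not line.startswith(("#", "SET timestamp=")):
            # Statement text, possibly spread over several lines.
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return entries


def top_queries(entries, key="query_time", limit=10):
    """Sort entries by the given field, descending."""
    return sorted(entries, key=lambda entry: entry[key], reverse=True)[:limit]


# Made-up log excerpt for illustration.
SAMPLE = """\
# Time: 2017-06-21T10:00:00.000000Z
# User@Host: app[app] @ localhost []  Id:     5
# Query_time: 0.200000  Lock_time: 0.000050 Rows_sent: 10  Rows_examined: 10
SET timestamp=1498039200;
SELECT id FROM customers LIMIT 10;
# Time: 2017-06-21T10:00:01.000000Z
# User@Host: app[app] @ localhost []  Id:     6
# Query_time: 2.500000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 500000
SET timestamp=1498039201;
SELECT * FROM orders WHERE customer_id = 42;
""".splitlines()

if __name__ == "__main__":
    for entry in top_queries(parse_slow_log(SAMPLE), key="rows_examined"):
        print(entry["rows_examined"], entry["query_time"], entry["sql"])
```

Unlike the digest tools, this does not normalize statements into fingerprints, so identical queries with different literals are listed separately.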
USE THE SYS SCHEMA
• the SYS schema:
  • good entry point
  • ready to use views
  • IO / latency / waits / statement digests
  • ideal for dev and staging
• https://github.com/mysql/mysql-sys
• sys 5.6 works with MariaDB 10.1
• overhead is acceptable in most cases (5% for P_S)
BACKUP THE DATABASE
• Backup is always required !
• Method depends on your business : logical vs. physical
• Verify the backup
• Decrease the overhead on prod
  • LVM has an overhead
  • mysqldump eats MySQL resources
  • mysqlbackup / xtrabackup copy the data files and verify them (in parallel)
OPTIMIZE THE DATABASE
• Fragmentation has an impact on performance
  • internal fragmentation (inside table spaces)
  • external fragmentation (on the file system)
• OPTIMIZE TABLE fixes it (see also Bug #57583)
  • can be done online
• MariaDB has the iterative defragmentation patch from Facebook / Kakao.
  • https://mariadb.com/kb/en/defragmenting-innodb-tablespaces/
MONITORING FRAGMENTATION
• There is no general formula
  • except for fixed length records
• create table t_defrag like t; insert into t_defrag select * from t limit 20000;
• Fragmentation if Avg_row_length(t) > Avg_row_length(t_defrag)
• Avg_row_length from show table status
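The comparison above can be turned into a small helper once both Avg_row_length values have been read from show table status. A sketch only: the function names and figures are illustrative, and since the sample table is built from the first 20000 rows, the result is an indication rather than a measurement.

```python
def fragmentation_ratio(avg_row_length_t, avg_row_length_defrag):
    """Ratio of the live table's Avg_row_length to that of the freshly built copy."""
    if avg_row_length_defrag <= 0:
        raise ValueError("avg_row_length_defrag must be positive")
    return avg_row_length_t / avg_row_length_defrag


def looks_fragmented(avg_row_length_t, avg_row_length_defrag):
    """Slide rule: fragmentation if Avg_row_length(t) > Avg_row_length(t_defrag)."""
    return avg_row_length_t > avg_row_length_defrag


if __name__ == "__main__":
    # Hypothetical Avg_row_length values for t and t_defrag.
    print(looks_fragmented(180, 120), fragmentation_ratio(180, 120))
```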
UPGRADE POLICY
• Security vulnerability fixes
• Bug fixes
• Performance improvements
• Ready for the next GA
• Never upgrade without testing (see change process)
  • can be automated
  • the DBA is also a QA engineer !
POINTERS
• The MySQL manual
• bugs.mysql.com
• Percona’s blog
• https://planet.mysql.com/
• 50 tips to improve MySQL performance (2014)
• Dimitrik’s latest slides and blog
• Bill Karwin’s Tuning MySQL: It’s about performance (2015)
• Bill Karwin’s SQL anti-patterns
• https://www.percona.com/live/17/resources/slides are freely available !!!
• FOSDEM videos https://video.fosdem.org/2017/H.1309/
• MariaDB
• My blog : http://aadant.com/blog
QUESTIONS?