Page 1
@ChrisAntognini antognini.ch/blog
Christian Antognini
Query Optimizer – MySQL vs. PostgreSQL
Percona Live, Frankfurt (DE), 7 November 2018
Page 2
@ChrisAntognini
Query Optimizer – MySQL vs. PostgreSQL, 2018-11-07
Senior principal consultant, trainer and partner at Trivadis
[email protected]
http://antognini.ch
Focus: get the most out of database engines
Logical and physical database design
Query optimizer
Application performance management
Author of Troubleshooting Oracle Performance (Apress, 2008/14)
OakTable Network, Oracle ACE Director
Page 3
Agenda
1. Introduction
2. Controlling the Query Optimizer
3. Statistics about Data
4. Data Dictionary Metadata
5. Single-Table Access Paths
6. Joins and Sub-queries
7. Conclusion
8. References
Page 4
Introduction
Page 5
Compared Products
MySQL Community Server 8.0.13
Release date: 10 October 2018
Only the InnoDB engine is covered
PostgreSQL 11.0
Release date: 18 October 2018
Page 6
Disclaimer
No performance tests were performed
No comparison between the execution plans generated by the two query optimizers was performed
To compare and evaluate the two query optimizers, only the availability of key features and the ability of each query optimizer to correctly recognize common data patterns were considered
Page 7
Inputs Considered to Produce an Execution Plan
Inputs to the query optimizer:
SQL statement
Configuration
Metadata
Object statistics
Runtime information?
Others?
Output of the query optimizer:
Execution plan
Page 8
How Is Data Stored?
MySQL: InnoDB uses a B-tree index (the clustered index)
PostgreSQL: Heap table
Page 9
Controlling the Query Optimizer
Page 10
Configuration
MySQL:
Three system variables control the behavior of the query optimizer
  Limit the number of evaluated plans (2)
  Control specific features (1 variable for 21 features)
The default (system) values can be overridden at session and statement level
PostgreSQL:
45 parameters control the behavior of the query optimizer
  Limit the number of evaluated plans (2)
  Control specific features (25)
  Control the genetic optimizer (7)
The default (system) values can be overridden at session level
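The scopes listed on this slide can be illustrated with a short sketch. The table t and its columns are hypothetical, and the chosen values are arbitrary; the variable, hint and parameter names are the ones documented for MySQL 8.0 and PostgreSQL 11.

```sql
-- MySQL: optimizer_switch bundles the feature flags; change one at session level
SET SESSION optimizer_switch = 'index_merge=off';

-- MySQL: the same change for a single statement, via the SET_VAR optimizer hint
SELECT /*+ SET_VAR(optimizer_switch = 'index_merge=off') */ *
FROM t
WHERE a = 1 OR b = 2;

-- MySQL: the two variables that bound the plan search
SET SESSION optimizer_search_depth = 4;
SET SESSION optimizer_prune_level = 1;

-- PostgreSQL: disable one planner feature for the current session
SET enable_seqscan = off;

-- PostgreSQL: SET LOCAL limits a change to the current transaction,
-- which is the closest equivalent to a statement-level setting
BEGIN;
SET LOCAL join_collapse_limit = 1;
SELECT * FROM t WHERE a = 1;
COMMIT;
```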
Page 11
Configuration – Cost Model
MySQL:
A cost model database contains cost estimate information for a number of operations (8)
The default values can be changed at the system level only
PostgreSQL:
A number of parameters provide cost estimate information for a number of operations (11)
The default (system) values can be overridden at session level
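A minimal sketch of how the two cost models might be adjusted. The values are illustrative only, not recommendations; the table and parameter names are those of MySQL 8.0 and PostgreSQL 11.

```sql
-- MySQL: cost constants are rows in mysql.server_cost and mysql.engine_cost;
-- a NULL cost_value means that the built-in default is used
UPDATE mysql.engine_cost
SET cost_value = 2.0
WHERE cost_name = 'io_block_read_cost';

-- MySQL: reload the cost tables; only sessions started afterwards see the change
FLUSH OPTIMIZER_COSTS;

-- PostgreSQL: cost constants are ordinary parameters, settable per session
SET random_page_cost = 1.1;
SET cpu_tuple_cost = 0.02;
```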
Page 12
Hints
MySQL:
SELECT statement modifiers (4)
  Impact statement syntax
Index hints (3)
  Impact statement syntax
  Cause an error when the index is missing
Optimizer hints (23)
  Similar to Oracle Database hints
  Global, query block and object-level
  Cause a warning when the syntax is wrong
PostgreSQL:
Not available (hints are available in EDB Advanced Server)
hints_modifiers.sql
hints_index.sql
hints_optimizer.sql
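One example per MySQL hint category, as a sketch. The tables t1 and t2, their columns and the index i_t1_a are hypothetical; the modifier and hint names are documented MySQL 8.0 syntax.

```sql
-- SELECT statement modifier: join the tables in the order they are written
SELECT STRAIGHT_JOIN *
FROM t1 JOIN t2 ON t1.id = t2.t1_id;

-- Index hint: the statement fails if index i_t1_a does not exist
SELECT *
FROM t1 FORCE INDEX (i_t1_a)
WHERE a = 1;

-- Optimizer hints: one global, one query-block-level, one object-level
SELECT /*+ MAX_EXECUTION_TIME(1000) JOIN_ORDER(t1, t2) NO_INDEX_MERGE(t1) */ *
FROM t1 JOIN t2 ON t1.id = t2.t1_id
WHERE t1.a = 1 OR t1.b = 2;
```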
Page 13
Statistics about Data
Page 14
Gathering Statistics
MySQL:
The ANALYZE TABLE statement gathers and, by default, stores statistics in the data dictionary
By default, an asynchronous automatic statistics recalculation takes place
Persistent (default) as well as non-persistent statistics exist
PostgreSQL:
The ANALYZE statement gathers and stores statistics in the data dictionary
By default, the autovacuum daemon recalculates the statistics of modified tables
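A sketch of manual statistics gathering and of the settings behind the defaults mentioned on this slide. The table t is hypothetical.

```sql
-- MySQL: gather statistics manually
ANALYZE TABLE t;

-- MySQL: per-table control of persistence and automatic recalculation
ALTER TABLE t STATS_PERSISTENT = 1, STATS_AUTO_RECALC = 1;

-- MySQL: the corresponding global defaults
SHOW GLOBAL VARIABLES LIKE 'innodb_stats_%';

-- PostgreSQL: gather statistics manually
ANALYZE t;

-- PostgreSQL: when were statistics last gathered, manually or by autovacuum?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 't';
```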
Page 15
Table Statistics
MySQL:
Clustered index size (pages)
Number of rows
PostgreSQL:
Table size (pages)
Number of rows
Number of pages marked all-visible
statistics.sql
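The table statistics listed above can be inspected as follows. This is a sketch: the schema name test and the table t are hypothetical, and the MySQL query assumes persistent statistics.

```sql
-- MySQL: number of rows and clustered index size (in pages)
SELECT n_rows, clustered_index_size
FROM mysql.innodb_table_stats
WHERE database_name = 'test' AND table_name = 't';

-- PostgreSQL: number of rows, table size (in pages), all-visible pages
SELECT reltuples, relpages, relallvisible
FROM pg_class
WHERE relname = 't';
```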
Page 16
Column Statistics
MySQL:
Data distribution (optional)
  Including fraction of entries that are null
PostgreSQL:
Fraction of values that are null
Average column width (bytes)
Number of distinct values
Statistical correlation between physical and logical row ordering
Data distribution (optional)
Most common values and their frequency (optional)
statistics.sql
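A sketch of how column statistics might be gathered and inspected. The schema test, the table t and the column c are hypothetical, and the bucket count is arbitrary.

```sql
-- MySQL: histograms are optional and must be requested explicitly
ANALYZE TABLE t UPDATE HISTOGRAM ON c WITH 32 BUCKETS;

-- MySQL: the histogram is stored as a JSON document
SELECT histogram
FROM information_schema.column_statistics
WHERE schema_name = 'test' AND table_name = 't' AND column_name = 'c';

-- PostgreSQL: all column statistics are exposed through the pg_stats view
SELECT null_frac, avg_width, n_distinct, correlation,
       most_common_vals, most_common_freqs, histogram_bounds
FROM pg_stats
WHERE tablename = 't' AND attname = 'c';
```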
Page 17
Cross-Column Statistics
MySQL:
Not available
PostgreSQL:
Functional dependencies (optional)
Number of distinct values (optional)
statistics.sql
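In PostgreSQL, both kinds of cross-column statistics are requested with CREATE STATISTICS. A minimal sketch, assuming a hypothetical table t with correlated columns a and b; the catalog columns shown are those of PostgreSQL 11.

```sql
-- Request functional dependencies and n-distinct counts for the column pair
CREATE STATISTICS t_a_b_stats (dependencies, ndistinct) ON a, b FROM t;

-- The extended statistics are computed by the next ANALYZE
ANALYZE t;

-- Inspect what was gathered
SELECT stxname, stxkind, stxndistinct, stxdependencies
FROM pg_statistic_ext
WHERE stxname = 't_a_b_stats';
```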
Page 18
Index Statistics
MySQL:
Index size (pages)
Number of leaf pages
Number of distinct keys
  Several values are stored, one per key prefix
  E.g. for index “a,b,c”: “a”, “a,b”, “a,b,c”, “a,b,c,PK”
PostgreSQL:
Index size (pages)
Number of indexed rows
statistics.sql
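The index statistics can be inspected as follows. This is a sketch: the schema test, the table t and the index i_t_abc are hypothetical, and the MySQL query assumes persistent statistics.

```sql
-- MySQL: one row per statistic; the distinct-key counts per key prefix
-- appear as n_diff_pfx01, n_diff_pfx02, ...
SELECT index_name, stat_name, stat_value, stat_description
FROM mysql.innodb_index_stats
WHERE database_name = 'test' AND table_name = 't';

-- PostgreSQL: index size (in pages) and number of indexed rows
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'i_t_abc';
```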
Page 19
Data Dictionary Metadata
Page 20
Constraints – Primary Key and Unique Key
MySQL:
Because of the clustered index, PK has precedence over other indexes
Predicates based on UK take precedence over non-UK indexes
Equality predicates based on PK/UK are probed
PostgreSQL:
No particular precedence is given to predicates based on PK/UK
Statistical correlation between physical and logical row ordering determines which index is used
constraints_pk_uk.sql
Page 21
Constraints – Foreign Key
MySQL:
No use of FK constraints to eliminate lossless joins has been observed
PostgreSQL:
No use of FK constraints to eliminate lossless joins has been observed
constraints_fk.sql
Page 22
Constraints – NOT NULL
MySQL:
NOT NULL constraints are used to verify the validity of predicates
PostgreSQL:
By default, the usage of NOT NULL constraints to verify the validity of predicates is enabled for specific cases only
  constraint_exclusion = partition
Statistics are used instead
constraints_not_null.sql
Page 23
Constraints – CHECK
MySQL:
No usage of CHECK constraints to verify the validity of predicates has been observed
Statistics are used instead
PostgreSQL:
By default, the usage of CHECK constraints to verify the validity of predicates is enabled for specific cases only
  constraint_exclusion = partition
Statistics are used instead
constraints_check.sql
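For PostgreSQL, the effect of constraint_exclusion on both NOT NULL and CHECK constraints can be explored with a sketch like the following. The table t is hypothetical; comparing the execution plans before and after changing the parameter shows whether the contradicting predicates are recognized.

```sql
-- PostgreSQL: a table with a NOT NULL and a CHECK constraint on column n
CREATE TABLE t (
  id integer PRIMARY KEY,
  n  integer NOT NULL CHECK (n > 0)
);

-- The default value is "partition"
SHOW constraint_exclusion;

-- Plans with the default setting
EXPLAIN SELECT * FROM t WHERE n IS NULL;
EXPLAIN SELECT * FROM t WHERE n < 0;

-- Let the planner examine constraints for all tables, then compare the plans
SET constraint_exclusion = on;
EXPLAIN SELECT * FROM t WHERE n IS NULL;
EXPLAIN SELECT * FROM t WHERE n < 0;
```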
Page 24
Single-Table Access Paths
Page 25
Available Index Types
MySQL:
Supported index types
  B-tree (default)
  R-tree (for spatial indexes)
Indexes can be created on expressions and, for string columns, on the leading part of column values
B-tree indexes store NULL values
Support for invisible indexes
PostgreSQL:
Supported index types
  B-tree (default)
  Hash, GiST, SP-GiST, GIN, BRIN
Indexes can be created on expressions as well as on a subset of the rows
B-tree indexes store NULL values and support non-key columns
indexes_expression.sql
indexes_invisible.sql
indexes_non_key.sql
indexes_nulls.sql
indexes_partial.sql
indexes_prefix.sql
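The index features listed on this slide correspond to the following syntax. This is a sketch: the table t, its columns and the index names are hypothetical.

```sql
-- MySQL: index on an expression (note the extra pair of parentheses)
CREATE INDEX i_t_upper ON t ((UPPER(c)));

-- MySQL: index on the first 10 characters of a string column
CREATE INDEX i_t_prefix ON t (c(10));

-- MySQL: hide an index from the query optimizer without dropping it
ALTER TABLE t ALTER INDEX i_t_prefix INVISIBLE;

-- PostgreSQL: index on an expression
CREATE INDEX i_t_upper_pg ON t (upper(c));

-- PostgreSQL: partial index, covering only a subset of the rows
CREATE INDEX i_t_partial ON t (a) WHERE status = 'X';

-- PostgreSQL 11: B-tree index with a non-key column
CREATE INDEX i_t_incl ON t (a) INCLUDE (b);
```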
Page 26
Optimization of ORDER BY, MIN and MAX
MySQL:
B-tree indexes can be used to optimize ORDER BY, MIN and MAX
Index scans can be performed in both directions
Keys are stored according to the specified order
  NULLS FIRST/LAST not supported
PostgreSQL:
B-tree indexes can be used to optimize ORDER BY, MIN and MAX
Index scans can be performed in both directions
Keys are stored according to the specified order
  NULLS FIRST/LAST supported
indexes_order_by.sql
indexes_min_max.sql
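A sketch of index key ordering in both products. The table t, its column a and the index names are hypothetical.

```sql
-- MySQL 8.0: keys are stored in descending order
CREATE INDEX i_t_a_desc ON t (a DESC);
SELECT * FROM t ORDER BY a DESC LIMIT 10;
SELECT MIN(a), MAX(a) FROM t;

-- PostgreSQL: the position of NULLs can be specified as well
CREATE INDEX i_t_a_desc_nl ON t (a DESC NULLS LAST);
SELECT * FROM t ORDER BY a DESC NULLS LAST LIMIT 10;
```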
Page 27
Merging Indexes
MySQL:
Two or more B-tree indexes can be merged at runtime to evaluate multiple predicates combined with AND or OR
PostgreSQL:
When appropriate, B-tree indexes are dynamically converted to bitmaps in memory
One use of this feature is to merge indexes to evaluate multiple predicates combined with AND or OR
indexes_merge.sql
Page 28
(Declarative) Partitioning
MySQL:
Available methods:
  Multi-column range and list
  Single-column hash
  Sub-partitioning by hash
Only local indexes (incl. PK/UK)
FK not supported
Partition pruning
PostgreSQL:
Available methods:
  Multi-column range and hash
  Single-column list
  Sub-partitioning by range/hash/list
Only local indexes (incl. PK/UK)
FK cannot reference a partitioned table
Partition pruning
partitioning_hash.sql
partitioning_list.sql
partitioning_range.sql
partitioning_sub.sql
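A minimal sketch of range partitioning in both products, with a query whose predicate allows partition pruning. The table and partition names are hypothetical.

```sql
-- MySQL: range partitioning on a DATE column
CREATE TABLE t_mysql (
  id INT NOT NULL,
  d  DATE NOT NULL
)
PARTITION BY RANGE COLUMNS (d) (
  PARTITION p2017 VALUES LESS THAN ('2018-01-01'),
  PARTITION p2018 VALUES LESS THAN ('2019-01-01')
);

-- MySQL: the partitions column of the EXPLAIN output lists the accessed partitions
EXPLAIN SELECT * FROM t_mysql WHERE d = '2018-06-01';

-- PostgreSQL: declarative range partitioning; each partition is a table of its own
CREATE TABLE t_pg (
  id integer NOT NULL,
  d  date NOT NULL
) PARTITION BY RANGE (d);

CREATE TABLE t_pg_2017 PARTITION OF t_pg
  FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
CREATE TABLE t_pg_2018 PARTITION OF t_pg
  FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');

-- PostgreSQL: the plan shows which partitions are scanned
EXPLAIN SELECT * FROM t_pg WHERE d = DATE '2018-06-01';
```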
Page 29
Joins and Sub-queries
Page 30
Available Kinds of Joins
MySQL:
Available join methods
  (Block) Nested loops join
  (Hash join available in MariaDB)
Bushy plans are considered only when no other possibility exists
Full outer joins are not supported
PostgreSQL:
Available join methods
  Nested loops join
  Hash join
  Merge join
Bushy plans are considered
Full outer joins are supported and optimized with hash/merge joins
joins_methods.sql
joins_order.sql
joins_syntax.sql
joins_bushy.sql
Page 31
Sub-queries in WHERE Clause
MySQL:
Simple sub-queries that are not correctly optimized were observed
For optimal performance a rewrite might be necessary
Problematic cases observed
  Correlated NOT IN
  Correlated (NOT) EXISTS
PostgreSQL:
Simple sub-queries that are not correctly optimized were observed
For optimal performance a rewrite might be necessary
Problematic cases observed
  Correlated (NOT) IN
  Uncorrelated NOT IN
joins_subqueries.sql
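As an illustration of the kind of rewrite mentioned on this slide, an uncorrelated NOT IN can be expressed as a NOT EXISTS. This is only a sketch: the tables t1 and t2 are hypothetical, the two forms are equivalent only when neither compared column can contain NULL, and whether the rewrite improves the plan has to be checked with EXPLAIN for each product and version.

```sql
-- Original form: uncorrelated NOT IN
SELECT *
FROM t1
WHERE t1.id NOT IN (SELECT t2.t1_id FROM t2);

-- Possible rewrite; same result only if t1.id and t2.t1_id are NOT NULL
SELECT *
FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.t1_id = t1.id);
```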
Page 32
Conclusion
Page 33
Summary
MySQL:
Good configuration capabilities
Hints available
Fairly good object statistics
Metadata only partially used
Good indexing capabilities
Average partition capabilities
Limited join capabilities
PostgreSQL:
Good configuration capabilities
Hints missing
Very good object statistics
Metadata only partially used
Good indexing capabilities
Average partition capabilities
Good join capabilities
Page 34
Core Messages
The query optimizer of PostgreSQL is more advanced than that of MySQL
In general, the query optimizer of MySQL does a good job only with transactional workloads; that of PostgreSQL is also suitable for analytical workloads
Page 35
References
Page 36
References (1)
MySQL:
MySQL 8.0 Reference Manual
The Unofficial MySQL 8.0 Optimizer Guide
MySQL Internals
MySQL Server Blog
PostgreSQL:
PostgreSQL 11 Documentation
Planner source code “readme”
PostgreSQL Wiki
Page 37
References (2)
The verification scripts I wrote are available on GitHub
How Well a Query Optimizer Handles Subqueries?
Page 38
Q&A