Preparse Query Rewrite Plugins: New SQL syntax for fun & performance
September 13, 2016
Sveta Smirnova
• Introducing new SQL syntax
• Working with results
• Variables
• Summary
Table of Contents
Introducing new SQL syntax
• From mascots and from humans
• It cannot make toast
• It does not support some syntax
MySQL often gets blamed
• FILTER clause in MySQL on my home machine
• But not in the user manual
Or does it?
• With 181 lines of code
• Including comments!
• And the new Query Rewrite Plugin interface
How is it done?
• First introduced in version 5.7.5
• Was available at MySQL Labs
• Two types of plugins
  • Pre-parse
  • Post-parse
A little bit of history
• Part of Audit plugin interface
• Step in at
  • MYSQL_AUDIT_GENERAL_ALL
  • MYSQL_AUDIT_CONNECTION_ALL
  • MYSQL_AUDIT_PARSE_ALL
    • MYSQL_AUDIT_PARSE_PREPARSE
    • MYSQL_AUDIT_PARSE_POSTPARSE
  • MYSQL_AUDIT_AUTHORIZATION_ALL
  • ...
Today
#include <mysql/plugin.h>
#include <mysql/plugin_audit.h> // Audit plugin declaration
...
static MYSQL_PLUGIN plugin_info_ptr; // Pointer to the plugin
...
static int filter_plugin_init(MYSQL_PLUGIN plugin_ref); // Plugin initialization
...
static int filter(MYSQL_THD thd, mysql_event_class_t event_class,
                  const void *event); // Entry point for MYSQL_AUDIT_PARSE_PREPARSE
...
static st_mysql_audit filter_plugin_descriptor;
...
mysql_declare_plugin(filter_plugin);
Plugin skeleton
static st_mysql_audit filter_plugin_descriptor= {
  MYSQL_AUDIT_INTERFACE_VERSION, /* interface version */
  NULL, /* release_thd() is not used */
  filter, /* implements FILTER */
  // You can also use MYSQL_AUDIT_PARSE_ALL
  { 0, 0, (unsigned long) MYSQL_AUDIT_PARSE_PREPARSE,}
};
Plugin descriptor
mysql_declare_plugin(filter_plugin)
{
MYSQL_AUDIT_PLUGIN,
&filter_plugin_descriptor,
"filter_plugin",
"Sveta Smirnova",
"FILTER SQL:2003 support for MySQL",
PLUGIN_LICENSE_GPL,
filter_plugin_init,
NULL, /* filter_plugin_deinit - TODO */
0x0001, /* version 0.0.1 */
NULL, /* status variables */
NULL, /* system variables */
NULL, /* config options */
0, /* flags */
}
mysql_declare_plugin_end;
Plugin declaration
#include <my_thread.h> // my_thread_handle needed by mysql_memory.h
#include <mysql/psi/mysql_memory.h>
...
static PSI_memory_key key_memory_filter;
static PSI_memory_info all_rewrite_memory[]=
{
{ &key_memory_filter, "filter", 0 }
};
static int filter_plugin_init(MYSQL_PLUGIN plugin_ref)
{
plugin_info_ptr= plugin_ref;
const char* category= "sql";
int count;
count= array_elements(all_rewrite_memory);
mysql_memory_register(category, all_rewrite_memory, count);
return 0; /* success */
}
Memory management for plugins
<filter clause> ::=
FILTER <left paren> WHERE <search condition> <right paren>
(10.9 <aggregate function>, 5WD-02-Foundation-2003-09.pdf, p.505)
Only for aggregate functions:
<computational operation> ::=
AVG | MAX | MIN | SUM | EVERY | ANY
| SOME | COUNT | STDDEV_POP | STDDEV_SAMP
| VAR_SAMP | VAR_POP | COLLECT | FUSION | INTERSECTION
<set quantifier> ::=
DISTINCT
| ALL
MySQL only supports
COUNT | AVG | SUM | MAX | MIN
| STDDEV_POP | STDDEV_SAMP
| VAR_SAMP | VAR_POP
SQL:2003
• FILTER is practically
  CASE WHEN foo THEN bar ELSE NULL END
• So we only need to catch
  FUNCTION(var) FILTER(WHERE foo)
• And replace it with CASE
Implementing FILTER clause
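The rewrite described on this slide can be tried outside the server. The following is a minimal sketch, not the plugin's actual code: the function name rewrite_filter and both regular expressions are assumptions made for illustration. It special-cases COUNT(*) and otherwise moves the aggregated expression into a CASE. It does not handle conditions that contain parentheses, and it will also rewrite matching text inside string literals.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the plugin): rewrites
//   FUNC(expr) FILTER (WHERE cond)
// into
//   FUNC(CASE WHEN cond THEN expr ELSE NULL END)
std::string rewrite_filter(const std::string &query)
{
  using std::regex;
  const auto flags= regex::ECMAScript | regex::icase;

  // COUNT(*) has no value expression, so count 1 for every matching row
  static const regex count_star(
      R"((COUNT)\(\s*\*\s*\)\s*FILTER\s*\(\s*WHERE\s+([^)]+?)\s*\))", flags);
  // Aggregate functions listed on the SQL:2003 slide as supported by MySQL
  static const regex generic(
      R"((COUNT|AVG|SUM|MAX|MIN|STDDEV_POP|STDDEV_SAMP|VAR_SAMP|VAR_POP))"
      R"(\(\s*([^)]+?)\s*\)\s*FILTER\s*\(\s*WHERE\s+([^)]+?)\s*\))", flags);

  // First pass: COUNT(*); second pass: everything else
  std::string result= std::regex_replace(
      query, count_star, "$1(CASE WHEN $2 THEN 1 ELSE NULL END)");
  return std::regex_replace(
      result, generic, "$1(CASE WHEN $3 THEN $2 ELSE NULL END)");
}
```

Calling rewrite_filter on a query without a FILTER clause returns it unchanged, so the function is safe to apply to every incoming statement.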
static int filter(MYSQL_THD thd, // MySQL Thread object
                  mysql_event_class_t event_class, // Class of the event
                  const void *event // Event itself
                 )
{
  const struct mysql_event_parse *event_parse=
    static_cast<const struct mysql_event_parse *>(event);
  if (event_parse->event_subclass != MYSQL_AUDIT_PARSE_PREPARSE)
    return 0;
  string subject= event_parse->query.str; // Original query
  string rewritten_query;
  // requires std::regex and GCC 4.9+
  // raw string literals keep the regex backslashes intact;
  // adjacent literals are concatenated by the compiler, no "+" needed
  regex filter_clause_star(R"((COUNT)\((\s*\*\s*)\)\s+)"
                           R"(FILTER\s*\(\s*WHERE\s+([^\)]+)\s*\))",
                           regex::ECMAScript | regex::icase);
  rewritten_query= regex_replace(subject, filter_clause_star,
                                 "$1(CASE WHEN $3 THEN 1 ELSE NULL END)");
  ...
Catching up the query
void _rewrite_query(const void *event,
                    const struct mysql_event_parse *event_parse,
                    char const* new_query
                   )
{
  size_t length= strlen(new_query); // computed once, reused below
  char *rewritten_query= static_cast<char *>(my_malloc(
      key_memory_filter, length + 1, MYF(0)));
  strncpy(rewritten_query, new_query, length);
  rewritten_query[length]= '\0';
  event_parse->rewritten_query->str= rewritten_query; // Rewritten query
  event_parse->rewritten_query->length= length;
  // You must set this flag to inform MySQL Server that the query was rewritten
  *((int *)event_parse->flags)|=
      (int)MYSQL_AUDIT_PARSE_REWRITE_PLUGIN_QUERY_REWRITTEN;
}
Rewritten query
Working with results
• Playing with syntax is fun
• But can we introduce something more MySQL-ish?
Can we do better?
• MySQL 5.7 has Optimizer Hints
  SELECT /*+ NO_RANGE_OPTIMIZATION(t3 PRIMARY, f2_idx) */ f1
  FROM t3 WHERE f1 > 30 AND f1 < 33;
  SELECT /*+ BKA(t1) NO_BKA(t2) */ * FROM t1 INNER JOIN t2 WHERE ...;
  SELECT /*+ NO_ICP(t1, t2) */ * FROM t1 INNER JOIN t2 WHERE ...;
  SELECT /*+ SEMIJOIN(FIRSTMATCH, LOOSESCAN) */ * FROM t1 ...;
  EXPLAIN SELECT /*+ NO_ICP(t1) */ * FROM t1 WHERE ...;
• But sometimes thread-specific buffers affect query execution
• Common workaround exists:
  SET tmp_table_size=1073741824;
  SELECT * FROM t1 INNER JOIN t2 WHERE ...;
  SET tmp_table_size=DEFAULT;
• Workaround requires processing result set of each of these statements
• Percona Server has SET STATEMENT
  mysql> SET STATEMENT max_statement_time=1000 FOR SELECT user FROM user;
  +------------------+
  | user             |
  +------------------+
  | foo              |
  | root             |
  ...
• SET STATEMENT not supported by forks
  mysql> SET STATEMENT max_statement_time=1000 FOR SELECT user FROM user;
  ERROR 1064 (42000): You have an error in your SQL syntax; check the manual
  that corresponds to your MySQL server version for the right syntax to use near
  'max_statement_time=1000 FOR SELECT user FROM user' at line 1
• This is why I extended optimizer hint syntax
  SELECT /*+ join_buffer_size=16384 */ f1 FROM t3 WHERE f1 > 30 AND f1 < 33;
  SELECT /*+ tmp_table_size=1073741824 BKA(t1) NO_BKA(t2) */ *
  FROM t1 INNER JOIN t2 WHERE ...;
Custom hint plugin
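To show how the extended hint syntax could be recognized, here is a minimal standalone sketch. It is an illustration under assumptions, not the plugin's code: the function name parse_custom_hints and both regular expressions are hypothetical. It collects name=value pairs from the first /*+ ... */ comment and ignores regular optimizer hints such as BKA(t1), which contain no "=".

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper (not part of the plugin): returns the
// "name=value" pairs found inside the first /*+ ... */ hint comment.
std::map<std::string, unsigned long long> parse_custom_hints(
    const std::string &query)
{
  std::map<std::string, unsigned long long> hints;

  // [\s\S] also matches newlines, so multi-line hint comments work
  static const std::regex hint_comment(R"(/\*\+([\s\S]*?)\*/)");
  static const std::regex assignment(R"(([A-Za-z_]+)\s*=\s*(\d+))");

  std::smatch comment;
  if (!std::regex_search(query, comment, hint_comment))
    return hints; // no hint comment at all

  const std::string body= comment[1].str();
  for (std::sregex_iterator it(body.begin(), body.end(), assignment), end;
       it != end; ++it)
  {
    // std::stoull throws std::out_of_range if the number does not fit
    hints[(*it)[1].str()]= std::stoull((*it)[2].str());
  }
  return hints;
}
```

A caller would then check each returned name against the list of variables it is willing to change and ignore the rest.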
• For Custom Hints we need to:
  • Store previous values of thread variables we are going to modify
  • Modify variables
  • Revert them back before sending result
New features
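The three steps above can be sketched without any server headers. Everything in this example is a stand-in chosen for illustration: thread_id_t, hint_t, session_vars, apply_hint and revert_hints are hypothetical names, and the mutex is an addition of this sketch because a map shared between connection threads needs some protection.

```cpp
#include <map>
#include <mutex>
#include <utility>

// Stand-ins for server types (assumptions for this sketch only)
typedef unsigned long thread_id_t;
enum hint_t { JOIN_BUFFER_SIZE, TMP_TABLE_SIZE };
struct session_vars
{
  unsigned long long join_buff_size;
  unsigned long long tmp_table_size;
};

static std::map<thread_id_t, std::map<hint_t, unsigned long long> > saved_values;
static std::mutex saved_values_mutex;

// Maps a hint to the session variable it controls
static unsigned long long &variable_for(session_vars &vars, hint_t hint)
{
  return hint == JOIN_BUFFER_SIZE ? vars.join_buff_size : vars.tmp_table_size;
}

// Steps 1 and 2: remember the old value, then apply the hinted one
void apply_hint(thread_id_t id, session_vars &vars, hint_t hint,
                unsigned long long value)
{
  std::lock_guard<std::mutex> guard(saved_values_mutex);
  unsigned long long &target= variable_for(vars, hint);
  // insert() keeps the first saved value if the same hint is applied twice
  saved_values[id].insert(std::make_pair(hint, target));
  target= value;
}

// Step 3: restore everything saved for this thread and forget it
void revert_hints(thread_id_t id, session_vars &vars)
{
  std::lock_guard<std::mutex> guard(saved_values_mutex);
  auto found= saved_values.find(id);
  if (found == saved_values.end())
    return; // nothing was modified for this thread
  for (const auto &entry : found->second)
    variable_for(vars, entry.first)= entry.second;
  saved_values.erase(found);
}
```

Saving with insert() rather than operator[] is a deliberate choice here: if a hint is applied twice before the revert, the value that gets restored is still the original one.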
// map to store modified variables
static map <my_thread_id, map<supported_hints_t, ulonglong> > modified_variables;
...
/* The job */
static int custom_hint(MYSQL_THD thd, mysql_event_class_t event_class,
                       const void *event)
{
  ...
  // If we have a match, create a map to store the thread variables
  std::map<supported_hints_t, ulonglong> current;
  ...
  // After processing variables, store them in the modified_variables map
  modified_variables[thd->thread_id()]= current;
  ...
Store previous values
• Since we have access to MYSQL_THD this is easy:
  switch(get_hint_switch(ssm[1]))
  {
  case JOIN_BUFFER_SIZE:
    current[JOIN_BUFFER_SIZE]= thd->variables.join_buff_size;
    thd->variables.join_buff_size= stoull(ssm[2]);
    break;
  case TMP_TABLE_SIZE:
    current[TMP_TABLE_SIZE]= thd->variables.tmp_table_size;
    thd->variables.tmp_table_size= stoull(ssm[2]);
    break;
  ...
Modify variables
• First we need to specify what we need: MYSQL_AUDIT_GENERAL_RESULT
  static st_mysql_audit custom_hint_plugin_descriptor= {
    MYSQL_AUDIT_INTERFACE_VERSION, /* interface version */
    NULL,
    custom_hint, /* implements custom hints */
    { (unsigned long) MYSQL_AUDIT_GENERAL_RESULT, 0,
      (unsigned long) MYSQL_AUDIT_PARSE_PREPARSE,
    }
  };
• Then revert variables before sending result
  if (event_general->event_subclass == MYSQL_AUDIT_GENERAL_RESULT)
  {
    map<my_thread_id, map<supported_hints_t, ulonglong> >::iterator
      current= modified_variables.find(thd->thread_id());
    if (current != modified_variables.end())
    {
      for (map<supported_hints_t, ulonglong>::iterator it=
           current->second.begin(); it!= current->second.end(); ++it)
      {
        switch(it->first)
        {
        case JOIN_BUFFER_SIZE:
          thd->variables.join_buff_size= it->second;
          break;
        ...
• And, finally, erase stored values for current thread:
  modified_variables.erase(current);
Revert variables back
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select count(*), sum(c) from
-> (select s, count(s) c from joinit where i < 1000000 group by s) t;
+----------+--------+
| count(*) | sum(c) |
+----------+--------+
| 737882 | 737882 |
+----------+--------+
1 row in set (24.70 sec)
mysql> show status like 'Created_tmp_disk_tables';
+-------------------------+-------+
| Variable_name | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 2 | -- 2 temporary tables on disk
+-------------------------+-------+
1 row in set (0.00 sec)
Before Custom Hint Plugin
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select /*+ tmp_table_size=134217728 max_heap_table_size=134217728 */
-> count(*), sum(c) from
-> (select s, count(s) c from joinit where i < 1000000 group by s) t;
+----------+--------+
| count(*) | sum(c) |
+----------+--------+
| 737882 | 737882 |
+----------+--------+
1 row in set, 2 warnings (6.21 sec) -- 4 times speed gain!
mysql> show status like 'Created_tmp_disk_tables';
+-------------------------+-------+
| Variable_name | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 0 | -- No disk-based temporary table!
+-------------------------+-------+
1 row in set (0.00 sec)
Custom Hint Plugin at work
Variables
• Very simple syntax
  mysql> BACKUP SERVER;
  +---------------------------------+
  | Backup finished with status OK! |
  +---------------------------------+
  | Backup finished with status OK! |
  +---------------------------------+
  1 row in set, 1 warning (42.92 sec)
• Supports many tools
  • mysqldump
  • mysqlpump
  • mysqlbackup
  • xtrabackup
• Needs to pass options
  • Credentials
  • Backup directory
  • Custom
BACKUP DATABASE plugin
• We have access to
  • MYSQL_THD->security_context
    thd->security_context()->user().str
    Password still has to be in the configuration file, under the [client] or [toolname] section
  • System variables
    Since we are interested in backing up the local server we will use mysqld_unix_port
Customization: credentials
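As an illustration of how these pieces could come together, the sketch below builds a mysqldump command line from a user name, a socket path and a backup directory. It is an assumption-laden example, not the plugin's code: shell_quote, build_mysqldump_command and the file name backup.sql are hypothetical. No password is passed, in line with the slide: the tool is expected to read it from the [client] or [mysqldump] section of the configuration file.

```cpp
#include <string>

// Hypothetical helper: wraps a value in single quotes for a POSIX shell,
// escaping any single quote inside it
static std::string shell_quote(const std::string &value)
{
  std::string quoted= "'";
  for (char c : value)
  {
    if (c == '\'')
      quoted+= "'\\''"; // close quote, escaped quote, reopen quote
    else
      quoted+= c;
  }
  quoted+= "'";
  return quoted;
}

// Hypothetical helper: command line for dumping the local server.
// user would come from the security context, socket from mysqld_unix_port.
std::string build_mysqldump_command(const std::string &user,
                                    const std::string &socket,
                                    const std::string &backup_dir)
{
  return "mysqldump --user=" + shell_quote(user) +
         " --socket=" + shell_quote(socket) +
         " --all-databases --result-file=" +
         shell_quote(backup_dir + "/backup.sql");
}
```

Quoting every value matters because user names and directory names are not under the plugin's control and may contain spaces or shell metacharacters.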
• Global variables - Example only!
  static MYSQL_SYSVAR_STR(backup_dir, backup_dir_value, PLUGIN_VAR_MEMALLOC,
                          "Default directory...", NULL, NULL, NULL);
  static MYSQL_SYSVAR_ENUM(backup_tool, backup_tool_name,
                           PLUGIN_VAR_RQCMDARG,
                           "Backup tool. Possible values: mysqldump|mysqlbackup",
                           NULL, NULL,
                           MYSQLDUMP, &supported_tools_typelib);
• Thread variables
  static MYSQL_THDVAR_STR(backup_dir, PLUGIN_VAR_MEMALLOC,
                          "Default directory...", NULL, NULL, NULL);
  static MYSQL_THDVAR_ENUM(backup_tool, PLUGIN_VAR_RQCMDARG,
                           "Backup tool. Possible values: mysqldump|mysqlbackup|mysqlpump",
                           NULL, NULL,
                           MYSQLDUMP, &supported_tools_typelib);
  ...
• Add to plugin declaration
  static struct st_mysql_sys_var *mysqlbackup_plugin_sys_vars[] = {
    MYSQL_SYSVAR(backup_dir),
    ...
    MYSQL_SYSVAR(backup_tool_options),
    NULL
  };
  mysql_declare_plugin(mysqlbackup_plugin) {
    MYSQL_AUDIT_PLUGIN,
    &mysqlbackup_plugin_descriptor,
    "mysqlbackup_plugin",
    ...
    NULL, /* status variables */
    mysqlbackup_plugin_sys_vars, /* system variables */
    ...
Customization: variables
Summary
• Custom locks
• Access to thread- and server-specific variables
• Fine control at multiple steps of query execution
• More
More possibilities
• https://github.com/svetasmirnova/
  • filter plugin
  • custom hint plugin
  • mysqlbackup plugin
Code
• MySQL source dir/plugin
  • rewriter
  • rewrite example
• Writing Audit Plugins manual
• MySQL Services for Plugins manual
More information
When: October 3-5, 2016
Where: Amsterdam, Netherlands
Percona Live Europe Open Source Database Conference is the premier event for the diverse and active open source community, as well as businesses that develop and use open source software.
Use promo code ParisMeetup to get 25 euros off.
Sponsorship opportunities are available as well.
Join us at Percona Live Europe
???
Place for your questions
http://www.slideshare.net/SvetaSmirnova
https://twitter.com/svetsmirnova
Thank you!