MySQL: Indexing for Better Performance

Aug 27, 2014

Best practices for creating indexes that increase query speed and optimize database performance
By: Jehad Keriaki

DBA

What is an Index

A data structure that improves the speed of data retrieval from databases.

Why Would We Use Indexes

Speed, Speed, and Speed

Constraints (Uniqueness)

IO Optimization

MAX, MIN

Sorting, Grouping

Index Types

Primary Key (PK), Unique, Key

Primary Key vs Unique: a unique index can contain NULL values, a primary key cannot

InnoDB tables are clustered based on the PK

Types (Algorithm)

B-Tree, R-Tree, Hash, Full-text

R-Tree: Geo-spatial

Hash: MEMORY engine only, fast for equality, whole key is used, no range

Full-text: For MyISAM, and as of 5.6 for InnoDB too.

SELECT * FROM table1 WHERE MATCH(description) AGAINST ('toshiba')

Things to know: boolean mode, with query expansion, stop words, short words, 50% rule

A better choice would be to use a search server like Sphinx

Types (Algorithm) [cont'd]

B-Tree:

For comparison operations (<, >, =, etc.)

Range (BETWEEN)

LIKE, which is a special case of range when the pattern ends with % (e.g. 'abc%')

It is the DEFAULT in MySQL

In B-Tree, data are stored in the leaf nodes

Types (Structure)

One column

Multi-Column [composite]

Partial [prefix]

Any one of them can be "Covering Index", except 'partial'

What Indexes to Create?

PK is a must

Best to be unsigned [smallest int] auto increment

PK and InnoDB (Clustered)

InnoDB tables are clustered based on PKs

Each secondary index has the PK in it. Example: INDEX(name) is in fact (name, id)

AVOID long PKs. Why? Because the PK is stored in every secondary index, a long PK makes all of them larger.

AVOID md5(), uuid(), etc. as PK values: they are long and not sequential.
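
The PK advice above can be sketched as a table definition. This is only a minimal illustration: the table name pk_demo, the index name, and the column sizes are assumptions and do not come from the slides.

```sql
-- Hypothetical table: a short, sequential PK keeps the clustered index compact.
CREATE TABLE pk_demo (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- small, unsigned, sequential
  name VARCHAR(100) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_name (name)                       -- InnoDB stores this as (name, id)
) ENGINE=InnoDB;
```

Since idx_name carries id in every entry, a 4-byte INT keeps that index much smaller than a 32-character md5() string used as the PK would.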

MyISAM and InnoDB

In MyISAM:

Index entry tells the physical offset of the row in the data file

In InnoDB:

PK index has the data.

Secondary indexes store PK as a pointer.

Key on field F is (F, PK) - good for sorting and covering index

Cardinality and Selectivity

Cardinality: Number of distinct values

Selectivity: Cardinality / total number of rows

What values are better: the closer the selectivity is to 1, the more rows the index filters out

Optimize Stats Update
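
The two definitions above can be measured directly. The sketch below uses the employee table and last_name column from the later slides. Reading "Stats Update" as a reference to refreshing index statistics with ANALYZE TABLE is an assumption.

```sql
-- Selectivity of one column: distinct values divided by total rows (0 to 1).
SELECT COUNT(DISTINCT last_name)            AS cardinality,
       COUNT(*)                             AS total_rows,
       COUNT(DISTINCT last_name) / COUNT(*) AS selectivity
FROM employee;

-- Refresh the index statistics that the optimizer relies on.
ANALYZE TABLE employee;

-- Inspect the estimated cardinality of each index on the table.
SHOW INDEX FROM employee;
```

The Cardinality column reported by SHOW INDEX is an estimate, so it may differ from the exact COUNT(DISTINCT ...) result.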

One Column Index

This index is on one column only

Query example: SELECT * FROM employee WHERE first_name LIKE 'stephane';

Index solution: ALTER TABLE employee ADD INDEX (first_name);

Notes:

Index the first n characters of char/varchar/text fields

Do not wrap the indexed column in a function, because the index then cannot be used. E.g.

WHERE md5(field)='1bc29b36f623ba82aaf6724fd3b16718'
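
One possible way around the function problem is to store the computed value in its own indexed column. This is a sketch of that workaround, not something the slides prescribe; table1, field_md5, and idx_field_md5 are hypothetical names.

```sql
-- Hypothetical names: table1, field_md5, idx_field_md5.
ALTER TABLE table1
  ADD COLUMN field_md5 CHAR(32),
  ADD INDEX idx_field_md5 (field_md5);

-- Populate once; the application must keep the column in sync on later writes.
UPDATE table1 SET field_md5 = MD5(field);

-- The comparison is now against a bare indexed column, so the index is usable.
SELECT * FROM table1
WHERE field_md5 = '1bc29b36f623ba82aaf6724fd3b16718';
```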

Multi Column Index

What is it: Index that involves more than one column.

Higher cardinality field goes first, with exceptions.

What the 'leftmost' term means: INDEX (A, B, C) can serve conditions on (A), (A, B), or (A, B, C), but not on (B) or (C) alone.

Query example: SELECT * FROM employee WHERE department = 5 AND last_name LIKE 'tran';

Index solution: ALTER TABLE employee ADD INDEX (last_name, department);

{WHY NOT (department, last_name)??}
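
The leftmost rule for INDEX (A, B, C) can be tried out with EXPLAIN. The sketch below assumes a hypothetical table leftmost_demo; the payload column is there only so that SELECT * is never answered from the index alone.

```sql
-- Hypothetical table used only to illustrate INDEX (A, B, C).
CREATE TABLE leftmost_demo (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  a       INT NOT NULL,
  b       INT NOT NULL,
  c       INT NOT NULL,
  payload VARCHAR(100),            -- not indexed
  PRIMARY KEY (id),
  INDEX idx_abc (a, b, c)
) ENGINE=InnoDB;

-- These conditions form a leftmost prefix, so idx_abc can be used for lookup.
EXPLAIN SELECT * FROM leftmost_demo WHERE a = 1;
EXPLAIN SELECT * FROM leftmost_demo WHERE a = 1 AND b = 2;
EXPLAIN SELECT * FROM leftmost_demo WHERE a = 1 AND b = 2 AND c = 3;

-- These skip column a, so idx_abc cannot be used to look the rows up.
EXPLAIN SELECT * FROM leftmost_demo WHERE b = 2;
EXPLAIN SELECT * FROM leftmost_demo WHERE b = 2 AND c = 3;
```

On an empty or very small table the optimizer may choose a different plan, so the comparison is more telling once the table holds a realistic number of rows.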

Multi Column Index [Cont’d]

Query example: SELECT * FROM employee WHERE department = 5 AND hiring_date>='2014-01-01';

Index solution: ALTER TABLE employee ADD INDEX (department, hiring_date);

Notes:

Should it be (hiring_date, department)? Is this an exception?

Order of columns IS important

WILL NOT USE THE INDEX: SELECT * FROM employee WHERE hiring_date>='2014-01-01';

Partial Index

What is it: Index on the first n characters of a field.

Query example (email: varchar(255)): SELECT * FROM users WHERE email like '[email protected]';

Index solution: ALTER TABLE users ADD INDEX (email(12));

vs

ALTER TABLE users ADD INDEX (email);

Notes:

Saves space, efficient writing, same performance

SELECT COUNT(DISTINCT(LEFT(field, 20))) FROM table

85% threshold? 90% maybe?
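
One way to choose the prefix length is to compare several candidates in a single query. The sketch below assumes the 85% to 90% threshold means the share of the full column's distinct values that the prefix preserves; that reading is an assumption. Lengths 12 and 20 appear in the slide, and 8 is added purely for comparison.

```sql
-- Share of the full column's distinct values that each prefix length preserves.
SELECT COUNT(DISTINCT LEFT(email, 8))  / COUNT(DISTINCT email) AS prefix_8,
       COUNT(DISTINCT LEFT(email, 12)) / COUNT(DISTINCT email) AS prefix_12,
       COUNT(DISTINCT LEFT(email, 20)) / COUNT(DISTINCT email) AS prefix_20
FROM users;
```

Under that reading, the shortest prefix whose ratio reaches the chosen threshold would be the candidate for ADD INDEX (email(n)).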

Joins and Indexes

Linking two or more tables to get related rows

Query example: SELECT employee.first_name, employee.last_name FROM department INNER JOIN employee ON department.id = employee.department WHERE department.location='MTL';

Index solution: ALTER TABLE department ADD INDEX (location);

ALTER TABLE employee ADD INDEX (department);

Notes: The join could be on a non-indexed field of department, but an index has to exist on the employee field used in the join (employee.department)

Multiple Indexes OR Multi-Col Index

What is it:

Multiple indexes: ALTER TABLE table1 ADD INDEX(field1), ADD INDEX(field2)

Multi-col index: ALTER TABLE table1 ADD INDEX(field1, field2)

Query example:

WHERE field1=1 OR field2=2 [multiple indexes]

WHERE field1=1 AND field2=2 [multi-col index]
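
The two choices can be compared side by side with EXPLAIN. This is a sketch under assumptions: table1 is a placeholder table, and the index names are hypothetical.

```sql
-- Two single-column indexes: each side of the OR can use its own index
-- (MySQL may combine them with an index merge).
ALTER TABLE table1 ADD INDEX idx_field1 (field1), ADD INDEX idx_field2 (field2);
EXPLAIN SELECT * FROM table1 WHERE field1 = 1 OR field2 = 2;

-- One composite index: both AND conditions are resolved in a single lookup.
ALTER TABLE table1 ADD INDEX idx_field1_field2 (field1, field2);
EXPLAIN SELECT * FROM table1 WHERE field1 = 1 AND field2 = 2;
```

Once the composite index exists, idx_field1 duplicates its leftmost column, so in practice only one of the two would normally be kept.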

Covering Index

When the index has the required data, there is no need to read the table's data!

Example: employee(id, first_name, last_name, email, phone, hiring_date)

SELECT email FROM employee WHERE phone='123456789';

ALTER TABLE employee ADD INDEX(phone, email);

min(), max() functions use the index only.
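
Whether a query is covered can be checked with EXPLAIN. The sketch below reuses the employee example above; the index name idx_phone_email is an assumption.

```sql
ALTER TABLE employee ADD INDEX idx_phone_email (phone, email);

-- Both columns come from the index; look for "Using index" in the Extra column.
EXPLAIN SELECT email FROM employee WHERE phone = '123456789';

-- last_name is not in the index, so the row itself has to be read.
EXPLAIN SELECT email, last_name FROM employee WHERE phone = '123456789';
```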

Covering Index - Note

Only in InnoDB: myindex(col1,col2)

SELECT col1 FROM table1 WHERE col2 = 200 <<-- will use index

SELECT * FROM table1 WHERE col2 = 200 <<-- will NOT use index.

ICP (Index Condition Pushdown) [5.6]

Lets the optimizer check a condition in the index instead of checking it in the table's data.

employee(id, first_name, last_name, department, phone, email, address)

INDEX(department, email)

SELECT * FROM employee WHERE department=5 AND email LIKE '%@beta.example%' [and address LIKE '%montreal%'];

Instead of stopping at department and then using WHERE to check email against the table's data, it checks in the index whether the 2nd condition is satisfied, and only if it is does it fetch the row from the table
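
The effect can be observed by comparing plans with ICP switched on and off. This is a sketch assuming MySQL 5.6 or later and the employee table above; the index name idx_department_email is an assumption.

```sql
ALTER TABLE employee ADD INDEX idx_department_email (department, email);

-- With ICP enabled, Extra is expected to show "Using index condition".
EXPLAIN SELECT * FROM employee
WHERE department = 5 AND email LIKE '%@beta.example%';

-- ICP can be switched off for the session to compare the two plans.
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM employee
WHERE department = 5 AND email LIKE '%@beta.example%';

-- Restore the default afterwards.
SET optimizer_switch = 'index_condition_pushdown=on';
```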

Using Index for Sorting

ORDER BY x (index on x)

WHERE x ORDER BY y (index on x, y)

WHERE x ORDER BY x DESC, y DESC (index on x, y)

WHERE x ORDER BY x ASC, y DESC (Can't use index: mixed sort directions)

Exceptions

E.g. a date index combined with another, lower-cardinality field.

Status or Gender special cases

Overhead of indexing

IO: Each DML operation will modify the indexes

Disk space

More indexes => Higher possibility of deadlock

ABOUT EXPLAIN

It lets us know the plan of query execution

What index would be used, if any

Rows to be scanned

QUESTIONS & EXAMPLES

mysql> explain select * from md_table where id=50000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: md_table
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra:
1 row in set (0.00 sec)

mysql> explain select id from md_table where id=50000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: md_table
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra: Using index
1 row in set (0.00 sec)

mysql> explain select id from md_table where hashed_id='1017bfd4673955ffee4641ad3d481b1c'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: md_table
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 100000
        Extra: Using where
1 row in set (0.00 sec)

mysql> alter table md_table add index (hashed_id(15));
Query OK, 100000 rows affected (0.77 sec)
Records: 100000  Duplicates: 0  Warnings: 0

mysql> explain select id from md_table where hashed_id='1017bfd4673955ffee4641ad3d481b1c'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: md_table
         type: ref
possible_keys: hashed_id
          key: hashed_id
      key_len: 46
          ref: const
         rows: 1
        Extra: Using where
1 row in set (0.01 sec)