SGBD Practice 1

SGBD Practice - Alexandru Ioan Cuza Universityvcosmin/pagini/resurse... · Structured Query Language) 2. 3 Achieving Optimum Performance for Executing SQL Queries in Online Transaction

May 15, 2020


dariahiddleston

SGBD Practice


What will we study?

• The course is practical, with almost nothing theoretical.

• We will have 3 main topics:

- Optimizing a query by indexing;

- Transactions (a topic left over from the DB course);

- NoSQL – a short introduction.

• In laboratory: PL/SQL (Procedural Language / Structured Query Language)


Achieving Optimum Performance for Executing SQL Queries in Online Transaction Processing and in Data Warehouses (master's dissertation, Lazăr L.)


http://use-the-index-luke.com/


Syntactic & Semantic

• We can think of a query as an English sentence that tells the DB server what it has to do:

SELECT fname
FROM students
WHERE lname = 'Jackson'

… what if the answer comes back after 15 seconds?


Behind every application error there are two human errors*

• Most of the time, the programmer who writes the query does not care how it will be executed.

• They are usually convinced that the DBMS (SGBD) itself is slow…

• Solution? Simple: stop using Oracle and switch to something else (MySQL, PostgreSQL or SQL Server), because we heard it works faster than Oracle.

*One of them is blaming the computer.


Actually…

• The only thing that programmers should know about databases is how to index the data.

• The most important information is how the application will access the data.

• Nobody but the programmer knows the path the data takes!


…ToC… (about indexing)

• What indexes look like

• WHERE-clause

• Performance and scalability

• JOIN

• Clustering

• Sorting and grouping data

• Partial Results

• INSERT, UPDATE, DELETE

Index anatomy

• “An index makes the query fast” - how fast?

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |     1 |     9 |     5   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| STUDENTS |     1 |     9 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
-------------------------
   1 - filter("LNAME"='Jackson')


Index anatomy

• “An index makes the query fast” (5x ?)

------------------------------------------------------------------------------
| Id  | Operation        | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |           |     1 |     9 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| STD_LNAME |     1 |     9 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------
   1 - access("LNAME"='Jackson')


Index anatomy

• An index is a structure* distinct from the table itself. It is created with CREATE INDEX. To list your indexes in Oracle:

select index_name from user_indexes;

• It needs its own space on the HDD and points to the records in the actual table. Much like the table of contents of a book, an index carries a certain degree of redundancy. Some go as far as 100% redundancy: SQL Server and MySQL + InnoDB use Index-Organized Tables [IOT].

* We will give some details; most of them you already know from last semester's DB course.


Index Anatomy

• Looking something up in an index is like searching for a number in a phonebook.

• A DB index has to be reorganized quickly, because a DB sees far more changes than a phonebook:

[insert / update / delete]

• It has to be maintained without moving large amounts of data.


Accessing data in a “sorted” file

• Suppose we have 1,000,000 records, each taking the same amount of space on the HDD [e.g. … FoxPro].

• Binary search => log2(1,000,000) ≈ 20 reads

• An HDD spinning at 7200 RPM completes one rotation in 60/7200 s = 0.008333… s = 8.33 ms

• For a Seagate ST3500320NS, track-to-track seek time = 0.8 ms

https://en.wikipedia.org/wiki/B-tree


Accessing data in a “sorted” file

• Seeking (0.8 ms) plus reading (8.33 ms) can add up to about 10 ms, and that is for a single read. But we need 20 reads to finish searching the file.

• 20 reads = 200 ms = 0.2 s

• That might not look like much and, because some of the data lie in the same area of the HDD, they will probably be read without moving the head to another location. Let us say 0.15 s.

https://en.wikipedia.org/wiki/B-tree


Accessing data in a “sorted” file

• What if we had to search for 10 values? The time needed would be 1.5 seconds.

• What about 100 values? 15 seconds?!?!

• And what if the data are not even sorted…
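The arithmetic on these slides can be reproduced in a few lines. This is only a sketch: the helper name is made up for the illustration, the 10 ms per read is the rounded figure used on the slide, and the 0.15 s per lookup is the slide's own optimistic estimate rather than something the code derives.

```python
import math

def binary_search_reads(n_records: int) -> int:
    """Worst-case number of reads for a binary search over n_records."""
    return math.ceil(math.log2(n_records))

ROTATION_MS = 60 / 7200 * 1000   # one full rotation at 7200 RPM, in ms
SEEK_MS = 0.8                    # track-to-track seek time quoted on the slide
READ_MS = 10                     # seek + rotation, rounded up as on the slide
SLIDE_ESTIMATE_S = 0.15          # the slide's optimistic time per lookup

reads = binary_search_reads(1_000_000)
lookup_s = reads * READ_MS / 1000

print(reads, "reads per lookup")
print(round(ROTATION_MS, 2), "ms per rotation")
print(lookup_s, "s per lookup in the worst case")
print(round(100 * SLIDE_ESTIMATE_S, 2), "s for 100 lookups at the slide's estimate")
```

The point of the exercise is that the number of reads, not the speed of a single read, dominates the total time.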


How do I get smaller times?

log2(1,000,000)

We need to change this: the base 2 of the logarithm.


Index anatomy

• How does it work? It is based on:

- a search tree;

- a doubly linked list.

• The tree used to access the data is a B+ tree (a variant of the B-tree that keeps the data in the leaves).

• The list is used to reach the next entries (after the first one has been found through the tree).

For a long time it was unclear what the "B" in the name represented. Candidates discussed in public were "Boeing", "Bayer", "Balanced", "Bushy" and others. In 2013, when the B-Tree had just turned 40, Ed McCreight revealed in an interview that they intentionally never published an answer to this question. They were thinking about many of these options themselves at the time and decided to just leave it an open question. http://sqlity.net/en/2445/b-plus-tree/


To make it easier to talk about indexes…

• We leave out one of the pointers in each node of the tree: every pointer shown carries the highest value of the node it points to!

• The leaves are also drawn with duplicates (in reality there are buckets of values)!

• In reality, many DBs use a doubly linked list, so that they can search in both directions.


Index Anatomy

[Figure callout: in reality, here is a bucket of 27s.]

Index Anatomy

• Because the values in the leaves are not uniformly distributed, we need the B-tree to reach a particular value.


Index Anatomy

The highest value in the next node equals the value of the pointer that leads to it.


Index Anatomy

• A B+ tree is a balanced tree!

• A B+ tree is not a binary tree!

• Once it is created, the DB keeps the index balanced (on insert/delete/update).

• The B+ tree helps the DB reach a leaf quickly.

• How fast? [first power of indexing]
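To get a feeling for "how fast", the sketch below counts how many tree levels are needed to reach any one of 1,000,000 entries for a few fan-outs (entries per node). The fan-out values and the function name are assumptions chosen for the illustration; real databases pick the fan-out from the block size and the key size.

```python
def levels_needed(n_entries: int, fan_out: int) -> int:
    """Smallest number of levels such that fan_out ** levels covers n_entries."""
    levels = 1
    while fan_out ** levels < n_entries:
        levels += 1
    return levels

# fan_out = 2 corresponds to the binary search from the earlier slides.
depths = {fan_out: levels_needed(1_000_000, fan_out) for fan_out in (2, 10, 100, 1000)}
for fan_out, depth in depths.items():
    print(f"fan-out {fan_out:>4}: {depth} levels")
```

Because every level costs roughly one read, a wide node turns the 20 reads of a binary search into a handful.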


How is the B+ tree balanced?

• This topic was discussed in the DB course.

• The basic idea: when a node is full and we want to add another value, it splits into two sibling nodes and adds a value to its parent.

• Splitting propagates upward until a parent node has room for one more value. If no such node exists, the root itself is split and the B+ tree grows by one level.
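The split step can be sketched on a single leaf. This is a toy under stated assumptions: the capacity of 4 keys in the usage lines is arbitrary, the function name is hypothetical, and the separator follows the convention used on these slides (the parent keeps the highest value of the left node). Propagating the split to the parent is left out.

```python
import bisect

def insert_into_leaf(leaf: list[int], key: int, capacity: int):
    """Insert key into a sorted leaf (modified in place); split it on overflow.

    Returns (left, right, separator). right and separator are None
    when the leaf still had room and no split was needed.
    """
    bisect.insort(leaf, key)
    if len(leaf) <= capacity:
        return leaf, None, None
    mid = len(leaf) // 2
    left, right = leaf[:mid], leaf[mid:]
    # The parent receives the highest value of the left sibling.
    return left, right, left[-1]

print(insert_into_leaf([10, 20], 15, capacity=4))          # room left, no split
print(insert_into_leaf([10, 20, 30, 40], 25, capacity=4))  # overflow, split in two
```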


Index Anatomy

• There is a myth that an index becomes less effective over time. This is false, because the B+ tree always stays balanced.


Index Anatomy

• What makes an index slow?

• Accessing the bucket!

Index Anatomy

• Why would the DB be slow when using indexes?

• After the rowID is found, the data still has to be fetched from the table.

• 3 steps:

- Traversing the B+ tree [time: O(log(n))]

- Following the leaf nodes to the wanted entries [O(n)]

- Accessing the table [depends on HDD speed]
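The three steps can be mimicked with ordinary Python structures. Everything here is a hypothetical stand-in: a dictionary plays the table, a sorted list of (key, rowid) pairs plays the chain of leaf nodes, and a binary search replaces the real tree descent.

```python
import bisect

# Toy "table": rowid -> (lname, fname). The names are made up.
table = {1: ("Jackson", "Michael"), 2: ("Smith", "Anna"),
         3: ("Jackson", "Janet"), 4: ("Brown", "Tom")}
# Toy "index": (lname, rowid) pairs kept in sorted order, like the leaf chain.
index = sorted((row[0], rowid) for rowid, row in table.items())

def lookup(lname: str) -> list[tuple[str, str]]:
    # Step 1: tree traversal (binary search stands in for the B+ tree descent).
    pos = bisect.bisect_left(index, (lname,))
    # Step 2: follow the leaf chain while the key still matches.
    rowids = []
    while pos < len(index) and index[pos][0] == lname:
        rowids.append(index[pos][1])
        pos += 1
    # Step 3: fetch each row from the table by its rowid.
    return [table[rowid] for rowid in rowids]

print(lookup("Jackson"))
```

Step 1 is cheap no matter how large the table is; steps 2 and 3 grow with the number of matching entries, which is where the time goes.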


Index Anatomy

• It is a mistake to believe that the index is slow because the B+ tree is no longer balanced.

• The developer can ask the DB how the query will be executed.


Index Anatomy

• 3 types of operations:

INDEX UNIQUE SCAN

INDEX RANGE SCAN

TABLE ACCESS BY INDEX ROWID

• The highest cost: INDEX RANGE SCAN.

• If more than one row matches, TABLE ACCESS might become a problem (especially when the rows are not on the same track of the HDD).


Execution Plan

• To ask Oracle how it will execute a query, prefix the query with EXPLAIN PLAN FOR.

• To obtain the answer as a table, we then run the following command:

SELECT * FROM TABLE(dbms_xplan.display);
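For readers without an Oracle instance at hand, the same idea can be tried with SQLite through Python's standard sqlite3 module. Note the assumptions: SQLite uses EXPLAIN QUERY PLAN rather than EXPLAIN PLAN FOR, its wording differs from Oracle's, and the table contents and the plan helper are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, lname TEXT, fname TEXT)")
conn.executemany("INSERT INTO students (lname, fname) VALUES (?, ?)",
                 [("Jackson", "Michael"), ("Smith", "Anna"), ("Brown", "Tom")])

def plan(sql: str) -> str:
    """Return the plan steps SQLite reports for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)   # column 3 holds the description

query = "SELECT fname FROM students WHERE lname = 'Jackson'"
before = plan(query)                           # no index on lname yet
conn.execute("CREATE INDEX std_lname ON students (lname)")
after = plan(query)                            # the new index is now available

print(before)
print(after)
```

The first plan reports a scan of the whole table, the second a search through std_lname, which mirrors the TABLE ACCESS FULL versus INDEX RANGE SCAN plans shown earlier.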


WHERE clause


• WHERE defines the selection criterion: it tells the DB which data are wanted, so that the entire table does not have to be returned (which would take time). That is why it has the greatest influence on how quickly the data are returned.

• WHERE is the clause that actually makes use of the indexes to get a faster result.


WHERE clause

CREATE TABLE students (
    id INT PRIMARY KEY,
    lname VARCHAR2(15) NOT NULL,
    fname VARCHAR2(30) NOT NULL,
    dob DATE,
    email VARCHAR2(40), … (LAB)
);

The PRIMARY KEY constraint automatically creates an index.

…we add 1025 students.


WHERE clause

SELECT lname, fname
FROM students
WHERE id = 300

Is a range scan any better? Can we get a range scan at all when using the primary key index with an equality in the WHERE clause?


WHERE clause

SELECT lname, fname
FROM students
WHERE id BETWEEN 200 AND 210
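The two queries can be compared side by side. This is a sketch with SQLite standing in for Oracle, so the plan text is SQLite's own (Oracle would report INDEX UNIQUE SCAN and INDEX RANGE SCAN); the generated names and the plan helper are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, lname TEXT NOT NULL)")
# 1025 placeholder students, matching the count mentioned on the slide.
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [(i, f"name{i}") for i in range(1, 1026)])

def plan(sql: str) -> str:
    return "; ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

eq_sql = "SELECT lname FROM students WHERE id = 300"
range_sql = "SELECT lname FROM students WHERE id BETWEEN 200 AND 210"

eq_rows = conn.execute(eq_sql).fetchall()
range_rows = conn.execute(range_sql).fetchall()

print(len(eq_rows), plan(eq_sql))         # equality on a unique key: at most one row
print(len(range_rows), plan(range_sql))   # range: consecutive entries are walked
```

Both queries go through the primary key, but only the second has to keep following entries after the first match.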


Composite indexes

• Sometimes we want our index to contain two columns:

CREATE UNIQUE INDEX idx_grades ON
grades(id_student, id_course);

• Instead of a single value, each node holds a combination of the two. The same goes for the doubly linked list. Once the search reaches the entries for the given student ID, it continues within them by id_course.
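The ordering of a two-column index can be pictured as a sorted list of pairs. The sketch below uses made-up (id_student, id_course) values and hypothetical helper names; it is an analogy for the leaf order, not how a database stores its index.

```python
import bisect

# Made-up (id_student, id_course) pairs, kept sorted like the index leaves.
entries = sorted([(300, 2), (299, 1), (300, 1), (301, 1), (300, 3), (299, 4)])

def by_student(id_student: int) -> list[tuple[int, int]]:
    """Lookup on the leading column: the matches form one contiguous slice."""
    lo = bisect.bisect_left(entries, (id_student,))
    hi = bisect.bisect_left(entries, (id_student + 1,))
    return entries[lo:hi]

def by_course(id_course: int) -> list[tuple[int, int]]:
    """Lookup on the second column only: every entry has to be inspected."""
    return [e for e in entries if e[1] == id_course]

print(by_student(300))   # neighbours in the sorted order
print(by_course(1))      # spread over the whole list
```

Entries for one student sit next to each other, while entries for one course are spread out, which is why the column order of the index matters.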


Looking up 300,2 (for example)


Composite indexes

• When the combination of the two indexed fields is unique, we can create the index with CREATE UNIQUE INDEX ….

• What if we now want to search by only one of the fields? Two cases:

Case 1: Searching by id_student

Case 2: Searching by id_course

[Figures: Case 1, id_student; Case 2, id_course]

Composite indexes

• Suppose the index is a phonebook and we want everybody with a certain last name (e.g. Jackson). Is it possible to search on the first field of the index?

• Suppose the index is a phonebook and we want everybody with a certain first name (e.g. Michael). Is it possible to search on the second field of the index?


Composite indexes

Look at how much the CPU consumption has increased!

And this is only a small scenario.


Find all the rows containing grades for the student with ID=300.

Find all the rows with grades from the course with ID=1.


Composite indexes

• We can easily see that the value 1 for id_course is scattered throughout the index. That is why searching with this index is not efficient.

• How do we do it better?

• We build an index on the same columns, but in a different order:

DROP INDEX idx_grades;

CREATE UNIQUE INDEX idx_grades
ON grades(id_course, id_student);


Composite indexes

• The most important thing when creating a composite index is to know which data we can access through it.

• If we index 3 fields, the index can be used by queries on the 1st field, on the 1st + 2nd fields, or on all three fields. NO OTHER COMBINATION WILL BE ABLE TO USE THE INDEX.

• However, the conditions can be written in the order 2+1 or 2+3+1 (the optimizer knows that the AND operation is commutative).
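The leading-column rule can be checked with a quick experiment. The sketch uses SQLite instead of Oracle and a two-column index; the grade column and the plan helper are assumptions added so that the queries have something to select, and other optimizers may have extra tricks (such as skip scans) that this toy setup does not trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (id_student INT, id_course INT, grade INT)")
conn.execute("CREATE UNIQUE INDEX idx_grades ON grades (id_student, id_course)")

def plan(sql: str) -> str:
    return "; ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

first_only = plan("SELECT grade FROM grades WHERE id_student = 300")
both_swapped = plan("SELECT grade FROM grades WHERE id_course = 2 AND id_student = 300")
second_only = plan("SELECT grade FROM grades WHERE id_course = 2")

for label, p in (("1st field", first_only),
                 ("2nd AND 1st", both_swapped),
                 ("2nd field only", second_only)):
    print(f"{label:>15}: {p}")
```

The first two plans search through idx_grades regardless of the order in which the conditions are written; the third falls back to scanning the table.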


WHERE clause

• When possible, a single index is better than 2 separate indexes (keeping 1 tree balanced is cheaper than keeping 2 trees balanced).

• To make a composite index efficient, we also have to know how the data is accessed, and usually only the programmer knows that.


Slow indexes

• Changing an index can affect the entire application (e.g. now none of the queries that filter only by id_student will be able to use the index).


• What if we build 2 different indexes, one on each field? Can both be used? If not, which one will be?


Slow indexes

• The Query Optimizer

- Cost-based optimizers (CBO)

- Rule-based optimizers (RBO)


Statistics

• The CBO uses statistics about the DB (e.g. about columns, tables, indexes). For a table, for example, it can keep statistics on:

- the highest and lowest value,

- the number of distinct values in a column,

- the number of NULL values,

- the data distribution (histogram),

- how many rows and blocks the table uses.


Statistics

• The CBO uses statistics about the DB (e.g. about columns, tables, indexes). For an index, for example, it can keep statistics on:

- how deep the B+ tree is,

- the number of leaves,

- the number of distinct values in the index,

- the "clustering" factor (whether rows with equal or close values are stored in the same location on the HDD).

• Using an index is not always the best idea.

Function-based indexes

Functions

• Suppose we want to search by last_name.

Functions

• Obviously, this search will be faster if we create an index:

CREATE INDEX emp_name ON employees (last_name);

Functions

• What happens if I want a case-insensitive search?

• For such a search, even though we have an index built on the last_name column, it will be ignored [why? example]

[maybe by using a different collation?!]*

• Because the DB does not know the result of a function call in advance, the function has to be called for every row.

*SQL Server and MySQL do not, by default, distinguish between upper and lower case when sorting the data in indexes.

Functions

SELECT * FROM employees
WHERE UPPER(last_name) = UPPER('winand');

The DB realizes that it is more efficient to evaluate the function once for the constant value than to do so for every row.

Functions

• How does the DB see the query?

SELECT * FROM employees
WHERE BLACKBOX(...) = 'WINAND';

• Note, however, that the right-hand side of the expression is evaluated only once. In fact, the filter was applied as

UPPER("last_name")='WINAND'

Functions

• The index will be rebuilt on UPPER(last_name):

DROP INDEX emp_name;

CREATE INDEX emp_up_name
ON employees (UPPER(last_name));
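The effect of the rebuilt index can be observed in a small experiment. The sketch below uses SQLite, which also supports indexes on expressions, as a stand-in for Oracle; the table layout, the sample row and the plan helper are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees ("
             "employee_id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)")
conn.execute("INSERT INTO employees (last_name, first_name) VALUES ('Winand', 'Markus')")

def plan(sql: str) -> str:
    return "; ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM employees WHERE UPPER(last_name) = UPPER('winand')"

conn.execute("CREATE INDEX emp_name ON employees (last_name)")
with_plain_index = plan(query)   # the index on last_name cannot serve UPPER(last_name)

conn.execute("DROP INDEX emp_name")
conn.execute("CREATE INDEX emp_up_name ON employees (UPPER(last_name))")
with_fbi = plan(query)           # the indexed expression matches the WHERE clause

print(with_plain_index)
print(with_fbi)
print(conn.execute(query).fetchall())
```

With the plain index the whole table is scanned; once the index is built on the same expression the query uses, the lookup goes through emp_up_name.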


Functions - function-based index (FBI)

SELECT * FROM employees
WHERE UPPER(last_name) = UPPER('winand');

Functions

• Instead of putting the column value directly into the index, an FBI stores the value returned by the function.

• For this reason the function must always return the same value: only deterministic functions are allowed.

• Do not build an FBI on functions that return random values or that use the system date to compute something. [days until xmas]

Functions

• There are no reserved keywords or special optimizations for FBIs (other than those already explained).

• Object-relational mapping tools (ORM tools) sometimes inject a case-conversion function (upper / lower) on their own. For example, Hibernate converts everything to lower case.

• You can write deterministic stored functions to be used in an FBI. getAge ?!?!

Functions: do not index EVERYTHING

• Why? There is no point in building an index on lower if you already have one on upper. In fact, if there is a bijective function between the way the data is indexed and the way you want to query the database, rewrite the query instead: it is certainly possible!

• Try to unify the access paths so that they can be used by several queries.

• It is better to put indexes on the original data than on functions applied to it.

Dynamic parameters

Dynamic parameters (bind parameters, bind variables)

• They are an alternative way of passing data to the database.

• Instead of writing the values directly into the query, placeholders such as ?, :name or @name are used, and the actual values are passed through the API call.

• It is "OK" to put the values directly into the query, but dynamic parameters have some advantages.

Dynamic parameters (bind parameters, bind variables)

• Advantages of using dynamic parameters:

- Security [prevents SQL injection]

- Performance [forces the QO to use the same execution plan]

Dynamic parameters (bind parameters, bind variables)

• Security: prevents SQL injection*

statement = "SELECT * FROM users WHERE name ='" + userName + "';"

If userName is set to: ' or '1'='1

If userName is set to: a';DROP TABLE users; SELECT * FROM userinfo WHERE 't' = 't

* http://en.wikipedia.org/wiki/SQL_injection
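The first attack string can be tried safely against an in-memory database. This is a minimal sketch using Python's sqlite3 module; the users table, its secret column and the sample rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a-secret"), ("bob", "b-secret")])

user_name = "' or '1'='1"    # the malicious input from the slide

# Unsafe: the input becomes part of the SQL text and changes its meaning.
unsafe_sql = "SELECT * FROM users WHERE name ='" + user_name + "';"
leaked = conn.execute(unsafe_sql).fetchall()

# Safe: the input travels separately as a bind parameter and stays a plain value.
safe = conn.execute("SELECT * FROM users WHERE name = ?", (user_name,)).fetchall()

print(len(leaked))   # every row matches, because '1'='1' is always true
print(len(safe))     # nobody is literally named ' or '1'='1
```

With concatenation the condition collapses to something that is always true; with a bind parameter the same string is only compared against the name column.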


Dynamic parameters (bind parameters, bind variables)

• Advantages of using dynamic parameters:

- Security

- Performance

• Performance: databases (Oracle, SQL Server) can cache the execution plans they considered efficient, but ONLY if the queries are EXACTLY the same. Sending different values as literals (not as dynamic parameters) produces different queries.
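Why the exact text matters can be shown with a toy cache. This is not how Oracle or SQL Server implement their plan caches; it is a hypothetical dictionary keyed by the SQL string, which is enough to show the effect of literals versus a bind parameter.

```python
plan_cache: dict[str, str] = {}
optimizer_runs = 0

def get_plan(sql_text: str) -> str:
    """Toy plan cache keyed by the exact SQL text (a stand-in, not a real optimizer)."""
    global optimizer_runs
    if sql_text not in plan_cache:
        optimizer_runs += 1                   # the expensive optimization happens here
        plan_cache[sql_text] = f"plan #{optimizer_runs}"
    return plan_cache[sql_text]

# Literal values: every distinct value yields a different SQL text.
for student_id in (300, 301, 302):
    get_plan(f"SELECT * FROM grades WHERE id_student = {student_id}")
runs_with_literals = optimizer_runs

# Bind parameter: the text is identical, whatever value is bound later.
for student_id in (300, 301, 302):
    get_plan("SELECT * FROM grades WHERE id_student = :id")
runs_with_bind = optimizer_runs - runs_with_literals

print(runs_with_literals, runs_with_bind)
```

Three literal queries trigger three optimizations, while the three executions with a bind parameter share one cached plan.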


Si, reamintesc….

78

Dynamic parameters (bind parameters, bind variables)

• With bind parameters, both queries are processed with the same plan. Is that better?

• Without the actual values, the optimizer runs the plan that would be most efficient if the values supplied for subsidiary_id were uniformly distributed. [Careful: this refers to the values in the query, not the values in the table!]

Dynamic parameters (bind parameters, bind variables)

• The query optimizer is like a compiler:

- if values are passed as constants, it uses them as constants;

- if values are bound, it sees them as uninitialized variables and treats them as such.

• Then why would it work better when the values are not known in advance?

Dynamic parameters (bind parameters, bind variables)

• When the value is sent as a literal, the query optimizer builds several plans, decides which one is best, and only then executes it. In that time a pre-established plan, although less efficient, might already have finished executing the query.

• Using literal (fixed) values is like recompiling the program every time.

Dynamic parameters (bind parameters, bind variables)

• Whoever binds the variables (the programmer) can make the query efficient: use bind parameters for all variables EXCEPT those that are meant to influence the execution plan.

• In all reality, there are only a few cases in which the actual values affect the execution plan. You should therefore use bind parameters if in doubt—just to prevent SQL injections.

Dynamic parameters (bind parameters, bind variables) - Java example:

Without bind parameters:

// JDBC classes come from java.sql
int subsidiary_id = 20;
Statement command = connection.createStatement();
ResultSet rows = command.executeQuery(
    "select first_name, last_name" +
    " from employees" +
    " where subsidiary_id = " + subsidiary_id);

Dynamic parameters (bind parameters, bind variables) - Java example:

With bind parameters:

int subsidiary_id = 20;
PreparedStatement command = connection.prepareStatement(
    "select first_name, last_name" +
    " from employees" +
    " where subsidiary_id = ?");
command.setInt(1, subsidiary_id);   // repeated for each parameter
ResultSet rows = command.executeQuery();

http://use-the-index-luke.com/sql/where-clause/bind-parameters - C#, PHP, Perl, Java, Ruby

Dynamic parameters (bind parameters, bind variables) - Ruby:

Without bind parameters:

dbh.execute("select first_name, last_name" +
            " from employees" +
            " where subsidiary_id = #{subsidiary_id}");

With bind parameters:

sth = dbh.prepare("select first_name, last_name" +
                  " from employees" +
                  " where subsidiary_id = ?");
sth.execute(subsidiary_id);

Dynamic parameters (bind parameters, bind variables)

• The question mark marks a position. It is identified by 1, 2, 3… (its position) when the parameters are actually sent.

• Some databases also accept named parameters such as "@id" (instead of ? and 1, 2…).

Dynamic parameters (bind parameters, bind variables)

• Bind parameters cannot change the structure of the query (Ruby-like pseudocode, this does NOT work):

String sql = prepare("SELECT * FROM ? WHERE ?");
sql.execute('employees', 'employee_id = 1');

To change the structure of the query: dynamic SQL.
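
A minimal sketch of this limitation, assuming Python's sqlite3 module as the database: binding a table name is rejected, while building the statement text dynamically (from a fixed list of allowed tables) and binding only the value works. ALLOWED_TABLES and select_by_id are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, last_name TEXT)")
conn.execute("INSERT INTO employees VALUES (1, 'Winand')")

# A bind parameter can only stand for a value, so binding a table name fails.
try:
    conn.execute("SELECT * FROM ? WHERE employee_id = ?", ("employees", 1))
    structure_bound = True
except sqlite3.OperationalError:
    structure_bound = False

# Dynamic SQL: the table comes from a fixed whitelist, the value is still bound.
ALLOWED_TABLES = {"employees"}

def select_by_id(table, employee_id):
    if table not in ALLOWED_TABLES:
        raise ValueError("unknown table: " + table)
    sql = "SELECT employee_id, last_name FROM " + table + " WHERE employee_id = ?"
    return conn.execute(sql, (employee_id,)).fetchall()

rows = select_by_id("employees", 1)
print(structure_bound, rows)
```

Restricting the structural parts to a whitelist is one way to keep dynamic SQL from reopening the injection problem.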

Range searches

Q: Suppose we have two columns, one with very many distinct values and the other with very many identical values. Which one do we put first in the index?

[phone book: last names are more diverse than first names]

Range searches

• They are expressed with the operators <, > or with BETWEEN.

• The biggest problem of a range search is the traversal of the leaf nodes.

• Ranges should be kept as small as possible. The questions to ask:

- where does an index scan begin?

- where does it end?

Range searches

SELECT first_name, last_name, date_of_birth
  FROM employees
 WHERE date_of_birth >= TO_DATE(?, 'YYYY-MM-DD')   -- start
   AND date_of_birth <= TO_DATE(?, 'YYYY-MM-DD')   -- end

Range searches

SELECT first_name, last_name, date_of_birth
  FROM employees
 WHERE date_of_birth >= TO_DATE(?, 'YYYY-MM-DD')
   AND date_of_birth <= TO_DATE(?, 'YYYY-MM-DD')
   AND subsidiary_id = ?                           -- ???

Range searches

• The ideal index covers both columns.

• In which order?

[Figure labels: 1 January 1971, 9 January 1971; Sub_id = 27]

[Figure labels: Sub_id = 27; 1 January 1971, 9 January 1971]

Range searches

Rule: index the equality column first, and then the range column!

It is not necessarily good to put the most diverse column in first position.
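
The rule can be observed in a query plan. The sketch below assumes SQLite (through Python's sqlite3 module) instead of Oracle, stores dates as text, and uses a hypothetical index name emp_idx; it prints the plan for the same query under both column orders.

```python
import sqlite3

QUERY = """
    SELECT first_name, last_name
      FROM employees
     WHERE date_of_birth >= ? AND date_of_birth <= ?
       AND subsidiary_id = ?
"""

def plan_for(index_columns):
    """Return SQLite's plan text for QUERY with a single index on index_columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (subsidiary_id INTEGER, "
                 "date_of_birth TEXT, first_name TEXT, last_name TEXT)")
    conn.execute("CREATE INDEX emp_idx ON employees (" + index_columns + ")")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY,
                        ("1971-01-01", "1971-01-09", 27)).fetchall()
    # The last field of each row is the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)

equality_first = plan_for("subsidiary_id, date_of_birth")
range_first = plan_for("date_of_birth, subsidiary_id")
print(equality_first)
print(range_first)
```

With the equality column first, the plan should list subsidiary_id together with the date range as access conditions. With the range column first, only the date range can delimit the scan and subsidiary_id is left as a filter.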

Range searches

• It also depends on the range being searched (for very large ranges the reverse order might be more efficient).

• The column with the most distinct values does not have to come first in the index; see the previous case, with only 30 IDs and 365 birthdays (× years).

• Both index orders matched 13 records.

Range searches

After the range was found, the data was filtered…

So the first column in the index is…?

Range searches

• Access: indicates where the search range begins and where it ends.

• Filter: takes a range and selects only the rows that satisfy a condition.

• If we change the order of the columns in the index:

(subsidiary_id, date_of_birth)

Range searches

• The BETWEEN operator is equivalent to a range search that includes the boundaries of the range.

DATE_OF_BIRTH BETWEEN '01-JAN-71' AND '10-JAN-71'

is equivalent to:

DATE_OF_BIRTH >= '01-JAN-71' AND DATE_OF_BIRTH <= '10-JAN-71'

LIKE

• The LIKE operator can have unwanted repercussions on a query (even when indexes are used).

• Some queries that use LIKE can rely on indexes, others cannot. The difference lies in the position of the % character.

LIKE

SELECT first_name, last_name, date_of_birth
  FROM employees
 WHERE UPPER(last_name) LIKE 'WIN%D'

LIKE

• Only the characters before the first % can be used in the index-based search. The remaining characters are used to filter the results obtained.

• If, in the previous example, we put an index on UPPER(last_name), the way a query is processed depends on where the % character is placed.
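
The split between the access part and the filter part of a LIKE pattern can be sketched in a few lines of Python. The helper split_like_pattern is hypothetical and ignores ESCAPE clauses; it also treats the single-character wildcard _ as ending the usable prefix.

```python
def split_like_pattern(pattern):
    """Split a LIKE pattern into (access_prefix, filter_rest) at the first wildcard."""
    for position, char in enumerate(pattern):
        if char in "%_":
            return pattern[:position], pattern[position:]
    return pattern, ""  # no wildcard: the whole pattern can drive the index access

for pattern in ("WIN%D", "WI%ND", "W%AND", "%WI%D"):
    prefix, rest = split_like_pattern(pattern)
    print(pattern, "access:", repr(prefix), "filter:", repr(rest))
```

The longer the prefix before the first wildcard, the narrower the index range that has to be scanned; a pattern that starts with % leaves an empty prefix, so the index cannot narrow the search at all.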

LIKE

• What happens if the LIKE has the form LIKE '%WI%D'?

LIKE

• Avoid expressions that begin with %.

• In theory, the position of % determines how the expression is searched. In practice, if bind parameters are used, there is no telling how the query optimizer will choose to proceed: as if the search term began with %, or as if it did not?

LIKE

• If we have to search for a word inside a text, it does not matter whether the word is sent as a bind parameter or hard-coded in the query. The search will be of the form %word% anyway. Still, by using bind parameters we at least avoid SQL injection.

LIKE

• To "optimize" searches with a LIKE clause, another indexed field can be used on purpose (if it is known that the range returned by that index will contain the text matched by the LIKE parameter anyway).

Q: How could you still index in order to optimize a search whose clause is:

LIKE '%WINAND'

Index merging

Bitmap indexes

Index merging

• Is it better to use one index per column, or indexes over several columns?

• Let us study the following query:

SELECT first_name, last_name, date_of_birth
  FROM employees
 WHERE UPPER(last_name) < ?
   AND date_of_birth < ?

Index merging

• No matter how the problem is turned around, no index can be built that places both the name and the date of birth in one compact range (of consecutive entries) [unless all mothers happened to name their children born in July "Iulian"].

• An index over both columns does not help us very much. Still, if we build it, we put the column with the most distinct values in first position.

Why?

Index merging

• A second option is to use a separate index for each column and let the QO decide how to use them [this alone may take more time, because it can create twice as many plans].

• Tip: a search through a single index is faster than one that involves two indexes.

Index merging

• The data warehouse (DW) is the system that has to deal with all ad-hoc queries.

• Because it cannot rely on a classic index, it uses a special type of index: the bitmap index (similar to the way the squares of a chessboard are indexed).

• It works somewhat better than no index at all (but not better than an index that is used efficiently).

Index merging

• Disadvantages of bitmap indexes:

- ridiculously long times for insert/delete/update operations;

- no concurrent writes (although in a DW the writes can be executed serially).

In online applications, bitmap indexes are useless.

• Sometimes the DB converts access trees (B-trees) into bitmaps temporarily in order to execute a query (the bitmaps are not stored). This is a desperate solution of the QO that costs CPU and RAM.

Partial indexes

Indexing NULL

• Let us analyze the query:

SELECT message
  FROM messages
 WHERE processed = 'N'
   AND receiver = ?

• It fetches all unread mails (for example). How would you index? [both conditions use =]

• We could create an index of the form:

CREATE INDEX messages_todo ON messages (receiver, processed)

• Note that processed splits the table into two categories: processed messages and unprocessed messages.

Partial indexes

• Some DBs allow partial indexing. This means that the index is created only over certain rows of the table.

CREATE INDEX messages_todo
    ON messages (receiver)
 WHERE processed = 'N'

Note: this does not work in Oracle…
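
Since the statement above is not accepted by Oracle, the following sketch tries it on SQLite, which is assumed here only because it supports partial indexes and ships with Python. The three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (receiver INTEGER, processed TEXT, message TEXT)")
conn.executemany("INSERT INTO messages VALUES (?, ?, ?)",
                 [(1, "Y", "old"), (1, "N", "new"), (2, "N", "other")])

# Partial index: only rows with processed = 'N' get an index entry.
conn.execute("CREATE INDEX messages_todo ON messages (receiver) "
             "WHERE processed = 'N'")

sql = "SELECT message FROM messages WHERE processed = 'N' AND receiver = ?"
plan = " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)))
rows = conn.execute(sql, (1,)).fetchall()
print(plan)
print(rows)
```

The query repeats the index condition processed = 'N' literally, which is what allows the planner to prove that the partial index contains every row the query needs.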

Partial indexes

• What happens when the following code is executed:

SELECT message
  FROM messages
 WHERE processed = 'N';

Partial indexes

• The newly built index is reduced both vertically (it has fewer rows) and horizontally (it no longer has to hold the "processed" column).

• Its size may stay constant (for example, there are always about 500 unread mails) even though the number of rows in the DB keeps growing.

NULL in Oracle

• What is NULL in Oracle?

• First of all, "IS NULL" must be used, not "= NULL".

• NULL does not always conform to the standard (it should mean the absence of data).

• Oracle treats an empty string as NULL?!?! (in fact it treats as NULL anything it does not know, does not understand, or that does not exist).

NULL in Oracle

• Moreover, Oracle treats NULL as an empty string:

NULL in Oracle

• If we created an index on a column X and then add a record that has NULL for X, that record is not indexed.

NULL in Oracle

INSERT INTO employees (subsidiary_id, employee_id, first_name, last_name, phone_number)
VALUES (?, ?, ?, ?, ?);

Because the date of birth is not inserted, it will be NULL.

• The new row will not be indexed:

SELECT first_name, last_name
  FROM employees
 WHERE date_of_birth IS NULL

Result: TABLE ACCESS FULL

alter table employees modify (date_of_birth null);

Indexing NULL in Oracle

CREATE INDEX demo_null ON employees (subsidiary_id, date_of_birth);

• And then:

SELECT first_name, last_name
  FROM employees
 WHERE subsidiary_id = ?
   AND date_of_birth IS NULL

[Callout on the slide: NOT NULL]

Indexing NULL in Oracle

• Both predicates are used!

Indexing NULL in Oracle

• When we index a field that might be NULL, a field that is NOT NULL must be added to make sure those rows are indexed too (a constant can be added as well, for example '1'):

DROP INDEX emp_dob;

CREATE INDEX emp_dob ON employees (date_of_birth, '1');

NOT NULL constraints…

[Callout on the figure: "This one is NOT NULL"]

Indexing NULL in Oracle

• Without a NOT NULL constraint on last_name (which is used in the index), the index is unusable.

• The optimizer assumes there may be a case in which both fields are NULL, and such a row is not stored in the index.

Indexing NULL in Oracle

• A user-defined function is assumed to be able to return NULL (whether or not it actually can).

• Certain Oracle functions are known to return NULL when their input is NULL (for example the upper function).

Indexing NULL in Oracle

[Callouts on the figures of these slides:]

- In the optimizer's opinion, both can be NULL, even though id is NOT NULL.

- We tell it explicitly that we are not interested in the rows where the function returns NULL.

- Or we tell it that this field is always NOT NULL, and we use the column in the index.

- If last_name is NOT NULL to begin with, the optimizer knows that upper(last_name) is NOT NULL as well.

Emulating partial indexes in Oracle

CREATE INDEX messages_todo
    ON messages (receiver)
 WHERE processed = 'N'

• We need a function that returns NULL every time the message has been processed.

Emulating partial indexes in Oracle

CREATE OR REPLACE FUNCTION pi_processed(processed CHAR, receiver NUMBER)
RETURN NUMBER
DETERMINISTIC   -- needed so that the function can be used in an index
AS
BEGIN
   IF processed IN ('N') THEN
      RETURN receiver;
   ELSE
      RETURN NULL;
   END IF;
END;
/

Indexing NULL in Oracle

Because it knows that a value will arrive here, the QO builds a single plan (with the index). If the value had been NULL, it would have been tested with "IS NULL".

Obfuscated conditions

Obfuscation methods - numeric strings

• These are numbers stored in text columns.

• Although impractical, an index can be used on such a string (the index is built on the character string):

SELECT ... FROM ... WHERE numeric_string = '42'

• If a search like the following were made:

SELECT ... FROM ... WHERE numeric_string = 42

• Some DBMSs report an error (PostgreSQL), while others perform a conversion like this:

SELECT ... FROM ... WHERE TO_NUMBER(numeric_string) = 42

Will it use the index (which was built on the character string)?!

Obfuscation methods - numeric strings

• The problem is that we should not convert the character string from the table but rather the number (because the index is on the string):

SELECT ... FROM ... WHERE numeric_string = TO_CHAR(42)

• Why doesn't the database do the conversion this way? Because the data in the table could be stored as '42' but also as '042' or '0042', which are different as character strings but represent the same number.

Obfuscation methods - numeric strings

• The conversion goes from strings to numbers because '42' and '042' have the same value once converted. The number 42, however, cannot be seen as both '42' and '042' when it is converted to a string.

• The difference is not only one of performance but also one of semantics.
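
The semantic difference can be checked with plain Python, used here only to illustrate the two conversion directions; the list stored stands in for the values of a hypothetical numeric_string column.

```python
stored = ["42", "042", "0042", "43"]

# Converting the column values to numbers: every spelling of 42 matches,
# but an index built on the strings cannot be used for this comparison.
matches_as_number = [s for s in stored if int(s) == 42]

# Converting the constant to a string: an index on the strings could be used,
# but only the exact spelling '42' matches.
matches_as_string = [s for s in stored if s == str(42)]

print(matches_as_number)
print(matches_as_string)
```

The two comparisons return different rows, which is why the database cannot silently swap one conversion for the other.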

Obfuscation methods - numeric strings

• Using numeric strings in a table is problematic (for example, because something other than a number can be stored).

• Rule: use numeric data types to store numbers.

Obfuscation methods - Dates

• A date includes a time component.

• TRUNC(DATE) sets the time to midnight.

SELECT ... FROM sales
 WHERE TRUNC(sale_date) = TRUNC(sysdate - INTERVAL '1' DAY)

This will not use an index on sale_date, because for the optimizer TRUNC is a black box. The index would have to be built on the function:

CREATE INDEX index_name ON table_name (TRUNC(sale_date))

Obfuscation methods - Dates

• It is better to put indexes on the original data (and not on functions).

• If we do so, the same index can be used to search for yesterday's sales as well as for the sales of the last week, of the last month, or of month N.

Obfuscation methods - Dates

SELECT ... FROM sales
 WHERE DATE_FORMAT(sale_date, "%Y-%M") = DATE_FORMAT(now(), "%Y-%M")

• This searches for the sales of the current month. Faster is:

SELECT ... FROM sales
 WHERE sale_date BETWEEN month_begin(?) AND month_end(?)

Obfuscation methods - Dates

• Rule: write period queries as explicit range conditions (even if the period is a single day).

sale_date >= TRUNC(sysdate) AND
sale_date < TRUNC(sysdate + INTERVAL '1' DAY)
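
The effect of the rule can be observed in a query plan. The sketch assumes SQLite through Python's sqlite3 module, with dates stored as ISO text and SQLite's date() function playing the role of TRUNC; sales_date and the helper plan are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
conn.execute("CREATE INDEX sales_date ON sales (sale_date)")

def plan(sql, params):
    """Return SQLite's human-readable plan for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Function wrapped around the indexed column: the index cannot delimit the scan.
wrapped = plan("SELECT amount FROM sales WHERE date(sale_date) = ?",
               ("1970-01-01",))

# Explicit half-open range on the raw column: the index gives start and end.
explicit = plan("SELECT amount FROM sales WHERE sale_date >= ? AND sale_date < ?",
                ("1970-01-01 00:00:00", "1970-01-02 00:00:00"))

print(wrapped)
print(explicit)
```

The first plan should be a scan of the whole table, the second a search through the index on sale_date.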

Obfuscation methods - Dates

• Another problem appears when date types are compared with character strings:

SELECT ... FROM sales
 WHERE TO_CHAR(sale_date, 'YYYY-MM-DD') = '1970-01-01'

• The problem is (again) the conversion of the column that holds the date.

• People are under the impression that bind parameters must be numbers or characters. In fact, they can even be of type java.util.Date.

Obfuscation methods - Dates

• If you cannot pass a Date object as a parameter, at least do not convert the column (which prevents the index from being used). Better:

SELECT ... FROM sales
 WHERE sale_date = TO_DATE('1970-01-01', 'YYYY-MM-DD')

The index is on sale_date.

The date can be given either as a literal string or as a bind parameter sent as a string.

Obfuscation methods - Dates

• When sale_date contains a time component, it is better to use ranges:

SELECT ... FROM sales
 WHERE sale_date >= TO_DATE('1970-01-01', 'YYYY-MM-DD')
   AND sale_date <  TO_DATE('1970-01-01', 'YYYY-MM-DD') + INTERVAL '1' DAY

Another obfuscation: sale_date LIKE SYSDATE

Obfuscation Methods - Math

• Can we create an index that the following query can actually use?

SELECT numeric_number FROM table_name
WHERE numeric_number - 1000 > ?

• And for this one?

SELECT a, b FROM table_name
WHERE 3*a + 5 = b

Obfuscation Methods - Math

• Normally it is NOT a good idea to make the DBMS solve equations. For the DBMS, even the following query results in a full scan:

SELECT numeric_number FROM table_name
WHERE numeric_number + 0 > ?

Even if there is an index on numeric_number, there is no index on its sum with 0!

• Still, we could index the expression as follows:

CREATE INDEX math ON table_name (3*a - b)

SELECT a, b FROM table_name
WHERE 3*a - b = -5;
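The behaviour described on these slides can be reproduced in a small sketch. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption; SQLite also supports indexes on expressions), and the index name idx_number is hypothetical. The sketch compares the plan of each obfuscated query with the plan of its rearranged form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (numeric_number INTEGER, a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX idx_number ON table_name (numeric_number)")  # hypothetical name
conn.execute("CREATE INDEX math ON table_name (3*a - b)")               # index from the slide

def plan(sql, params=()):
    """Return the text of the first step of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return rows[0][-1]

# Arithmetic on the column side: the index cannot be used for a range search.
hidden = plan(
    "SELECT numeric_number FROM table_name WHERE numeric_number - 1000 > ?", (5,))
# The same condition with the arithmetic moved to the parameter side.
visible = plan(
    "SELECT numeric_number FROM table_name WHERE numeric_number > ? + 1000", (5,))
# The equation as first written does not match the indexed expression...
unsolved = plan("SELECT a, b FROM table_name WHERE 3*a + 5 = b")
# ...while the rearranged equation matches the expression index "math".
solved = plan("SELECT a, b FROM table_name WHERE 3*a - b = -5")

for label, text in [("hidden", hidden), ("visible", visible),
                    ("unsolved", unsolved), ("solved", solved)]:
    print(label, "->", text)
```

The database does not rearrange the equation for us: the query has to be written in exactly the form that was indexed.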

Obfuscation Methods – “Smart logic”

SELECT first_name, last_name, subsidiary_id, employee_id
FROM employees
WHERE ( subsidiary_id = :sub_id OR :sub_id IS NULL )
  AND ( employee_id = :emp_id OR :emp_id IS NULL )
  AND ( UPPER(last_name) = :name OR :name IS NULL )

Obfuscation Methods – “Smart logic”

• When one of the filters is not wanted, NULL is passed in its dynamic parameter.

• The database does not know which of the filters will be NULL, so it must assume that any of them can be NULL => TABLE ACCESS FULL + filter (even if indexes exist).

• The problem is that the query optimizer (QO) must find an execution plan that covers all cases (including the one where all parameters are NULL), because the same plan is reused for every execution of a query with dynamic parameters.

Obfuscation Methods – “Smart logic”

• The solution is to tell the database exactly what we need and nothing more:

SELECT first_name, last_name, subsidiary_id, employee_id
FROM employees
WHERE UPPER(last_name) = :name

• The problem arises from the shared execution plan used for dynamic parameters.
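One way to "tell the database only what we need" is to assemble the SQL text in the application while still passing the values as bind parameters. The helper below is a hypothetical sketch (the function name and its keyword arguments are not from the slides); it only builds the statement and the parameter dictionary and does not execute anything.

```python
def build_employee_query(sub_id=None, emp_id=None, name=None):
    """Build a query that contains only the filters actually requested.

    Values are still passed as bind parameters; only the SQL text varies,
    so each filter combination can get its own execution plan.
    """
    conditions = []
    params = {}
    if sub_id is not None:
        conditions.append("subsidiary_id = :sub_id")
        params["sub_id"] = sub_id
    if emp_id is not None:
        conditions.append("employee_id = :emp_id")
        params["emp_id"] = emp_id
    if name is not None:
        # The caller is expected to pass the name already in upper case.
        conditions.append("UPPER(last_name) = :name")
        params["name"] = name
    sql = ("SELECT first_name, last_name, subsidiary_id, employee_id "
           "FROM employees")
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params

sql, params = build_employee_query(name="WINAND")
print(sql)
print(params)
```

With at most three optional filters there are only eight distinct statements, so the plan cache is not flooded, and none of them contains an "OR ... IS NULL" branch.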

Performance - Data Volume

Don't ask a DBA to help you move furniture. They've been known to drop tables…

Data Volume

• A query becomes slower as the amount of data in the database grows.

• How large is the performance impact if the data volume doubles?

• How can we improve it?

Data Volume

• The query under analysis:

SELECT count(*) FROM scale_data
WHERE section = ? AND id2 = ?

• The section column controls the data volume: the larger the section value, the larger the volume of data selected.

• We consider two indexes: index1 and index2.

Data Volume

• The query under analysis:

SELECT count(*) FROM scale_data
WHERE section = ? AND id2 = ?

• Small section value – index1, then index2

Data Volume

• Scalability describes how performance depends on factors such as the volume of data.

Data Volume

• index1 – twice the initial time

• index2 – 20 times the initial time

Data Volume

• The response time of a query depends on several factors. Data volume is one of them.

• A query that performs well during testing will not necessarily perform well in production.

• What causes the difference between index1 and index2?

Both execution plans look identical:

Data Volume

• What affects an index lookup?

- the table access

- scanning a wide index range

• Neither plan shows a table access via the index (TABLE ACCESS BY INDEX ROWID).

• One of the scanned ranges is larger than the other… we need the “predicate information” to see why:

Data Volume

• Can you tell how each index was built, given the execution plans?

Data Volume

• Can you tell how each index was built, given the execution plans?

• CREATE INDEX scale_slow ON scale_data (section, id1, id2);

• CREATE INDEX scale_fast ON scale_data (section, id2, id1);

The id1 column is added only to keep both indexes the same size (so that nobody concludes that scale_fast is faster merely because it contains fewer columns).
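The effect of the column order can be made visible with a small sketch. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption), and the SQLite-specific INDEXED BY clause to force each index in turn. The plan text plays the role of the "predicate information": it lists only the columns that delimit the scanned index range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scale_data (section INTEGER, id1 INTEGER, id2 INTEGER)")
conn.execute("CREATE INDEX scale_slow ON scale_data (section, id1, id2)")
conn.execute("CREATE INDEX scale_fast ON scale_data (section, id2, id1)")

def plan(index_name):
    """Return the plan text when SQLite is forced to use the given index."""
    sql = ("EXPLAIN QUERY PLAN SELECT count(*) FROM scale_data "
           f"INDEXED BY {index_name} WHERE section = ? AND id2 = ?")
    return conn.execute(sql, (1, 2)).fetchall()[0][-1]

slow_plan = plan("scale_slow")  # range is limited by section only
fast_plan = plan("scale_fast")  # range is limited by section and id2
print(slow_plan)
print(fast_plan)
```

With scale_slow, id2 sits behind id1 in the index, so it can only be used as a filter on every entry of the section; the scanned range grows with the section. With scale_fast, both conditions narrow the range itself.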

System Load

• Defining an index that we consider good for our queries does not guarantee that the QO will use it.

• SQL Server Management Studio shows the predicate only as a tooltip.

System Load

• As a rule, the number of accesses grows together with the number of records.

• The number of accesses is another parameter that enters into scalability.

System Load

• If initially there was a single access, then in the same scenario with 1 to 25 concurrent queries the response time increases:

System Load

• This means that even if we have the entire production database and test everything on it, there is still a chance that in reality, because of the large number of queries, it will run much slower.

• Note: careful attention to the execution plan matters more than superficial benchmarks (such as those in SQL Server Management Studio).

System Load

• We might expect the more powerful production hardware to handle the system better. In fact, in the development environment there is practically no latency, which is not the case in production (where access can be delayed by the network).

• http://blog.fatalmind.com/2009/12/22/latency-security-vs-performance/

• http://jamesgolick.com/2010/10/27/we-are-experiencing-too-much-load-lets-add-a-new-server..html

Response Time + Throughput

• More powerful hardware is not faster; it can only handle more load (highway analogy).

• Single-core vs. multi-core processors (when a single task is involved).

• Horizontal scaling (adding processors) has the same effect.

• Improving response time requires an efficient tree (even in NoSQL).

Response Time

• Proper indexing turns the search into a B-tree traversal that takes logarithmic time.

• NoSQL-based systems appear to have solved the performance problem through horizontal scaling [analogous to partial indexes where each partition is stored on a different machine].

• This scalability is nevertheless limited to write operations, under a model called “eventual consistency” [Consistency / Availability / Partition tolerance = CAP theorem] http://en.wikipedia.org/wiki/CAP_theorem
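To give a feeling for what "logarithmic time" means for a B-tree lookup, the sketch below counts how many tree levels are needed for a given number of index entries. The fanout of 100 entries per node is an illustrative assumption, not a figure from the slides; real values depend on block size and key length.

```python
def btree_levels(entries, fanout):
    """Number of tree levels needed to address `entries` keys.

    Each level multiplies the number of addressable entries by `fanout`.
    """
    levels, capacity = 1, fanout
    while capacity < entries:
        capacity *= fanout
        levels += 1
    return levels

# Assumed fanout of 100; each step below multiplies the data volume by 100.
for entries in (1_000_000, 100_000_000, 10_000_000_000):
    print(entries, btree_levels(entries, fanout=100))
```

Under this assumption, a hundredfold increase in data adds a single level to the traversal, which is why the tree traversal itself is rarely what makes an indexed query slow.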

Response Time

• More hardware usually does not improve the system's response time.

• HDD latency [the problem appears when data is gathered from different locations on the disk, for example during a JOIN operation]. SSD?

“Facts”

• Performance has two dimensions: response time and throughput.

• More hardware will typically not improve query response time.

• Proper indexing is the best way to improve query response time.

Join

An SQL query walks into a bar and sees two tables.

He walks up to them and asks ’Can I join you?’

— Source: Unknown

Join

• A join transforms data from a normalized model into a denormalized one that serves a specific purpose.

• Sensitive to disk latency (and fragmentation).

Join

• Reducing response times = proper indexing

• All join algorithms process only two tables at a time (then the result is joined with the third, and so on).

• The results of one join are passed to the next join operation without being stored.

• The order in which the joins are performed affects response speed. [10, 30, 5, 60]

• The QO tries all permutations of the join order.

• The more tables there are, the more plans must be evaluated. [how many?]

Join

• The more tables there are, the more plans must be evaluated = O(n!)

• This is not a problem when dynamic parameters are used [Why?]
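The O(n!) growth can be illustrated by simply enumerating join orders. The sketch below counts only the orderings of the tables; the table names are hypothetical, and real optimizers prune this search space rather than costing every permutation.

```python
from itertools import permutations
from math import factorial

tables = ["t1", "t2", "t3", "t4"]  # hypothetical table names
orders = list(permutations(tables))
print(len(orders), "join orders for", len(tables), "tables")

# Number of orderings as the number of joined tables grows.
for n in (2, 4, 8, 12):
    print(n, factorial(n))
```

Evaluating that many candidates on every execution would be expensive, which is why reusing a cached plan for a query with dynamic parameters is attractive.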

Join – Nested Loops (anti-pattern)

• It works like two queries: the outer one fetches a set of rows from one table, and the inner one takes each of those rows and fetches the corresponding information from the second table.

• Nested selects can be used to simulate the nested loops algorithm [network latency, ease of implementation, object-relational mapping (N+1 selects)].
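A minimal sketch of the N+1 selects pattern next to the equivalent single join, using Python's sqlite3 module with an in-memory database. The schema is a simplified assumption (the eur_value column and the sales_emp index are illustrative names), and the statement counters are maintained by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY,
                        employee_id INTEGER, eur_value INTEGER);
    CREATE INDEX sales_emp ON sales (employee_id);
    INSERT INTO employees VALUES (1, 'WINAND'), (2, 'SMITH'), (3, 'JONES');
    INSERT INTO sales VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);
""")

def nested_selects():
    """N+1 pattern: one outer query plus one inner query per employee."""
    statements = 1
    result = []
    employees = conn.execute(
        "SELECT employee_id, last_name FROM employees "
        "ORDER BY employee_id").fetchall()
    for emp_id, last_name in employees:
        statements += 1  # one extra round trip for every outer row
        inner = conn.execute(
            "SELECT eur_value FROM sales WHERE employee_id = ? "
            "ORDER BY sale_id", (emp_id,))
        for (value,) in inner:
            result.append((last_name, value))
    return result, statements

def single_join():
    """Same rows with one statement; the database does the looping."""
    rows = conn.execute(
        "SELECT e.last_name, s.eur_value FROM employees e "
        "JOIN sales s ON s.employee_id = e.employee_id "
        "ORDER BY e.employee_id, s.sale_id").fetchall()
    return rows, 1

nested_rows, nested_count = nested_selects()
joined_rows, joined_count = single_join()
print(nested_count, joined_count, nested_rows == joined_rows)
```

Against an in-memory database the extra statements cost little; over a network each one adds a round trip, which is where the pattern becomes expensive.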

Join – nested selects [PHP; Java and Perl versions on “luke…”]

Join – nested selects

Which indexes would you create to make the execution faster?

Join – nested selects

• The database executes the join exactly as in the previous example. Indexing for nested loops is similar to indexing for the earlier selects:

1. A function-based index (FBI) on UPPER(last_name)

2. A concatenated index on subsidiary_id, employee_id

Join – nested selects

• However, inside the database there is no network latency.

• Also, inside the database the intermediate data is not transferred (it is piped within the database).

• Tip: execute JOINs in the database, not in Java/PHP/Perl or another language (ORM).

There you go: PL/SQL style ;)

Join – nested selects

• Most ORMs support SQL joins.

• Eager fetching is probably the most important option (it also fetches the sales table, via a join, whenever employees are queried).

• However, eager fetching is not appropriate when only the employees table is needed (it also brings in irrelevant data): sales are not needed to build an employee phone book.

• A static configuration is not a good solution.

Join – Hash join

• Avoids the repeated B-tree traversal of the inner query (as in nested loops) by building a hash table for the candidate records.

• A hash join improves when fewer columns are selected.

• Index the independent predicates in the WHERE clause to improve performance (they determine the rows from which the hash table is built).
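The idea behind the algorithm can be sketched in a few lines of plain Python. This is an illustration of the build and probe phases on in-memory lists, not how any particular database implements it; the sample rows and the eur_value column are made up for the example.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Join two lists of dicts on `key` using an in-memory hash table."""
    # Build phase: hash one input (ideally the smaller one) exactly once.
    table = defaultdict(list)
    for row in build_rows:
        table[key(row)].append(row)
    # Probe phase: one hash lookup per row of the other input,
    # instead of one B-tree traversal per row as in nested loops.
    joined = []
    for row in probe_rows:
        for match in table.get(key(row), []):
            joined.append({**match, **row})
    return joined

employees = [
    {"subsidiary_id": 1, "employee_id": 1, "last_name": "WINAND"},
    {"subsidiary_id": 1, "employee_id": 2, "last_name": "SMITH"},
]
sales = [
    {"subsidiary_id": 1, "employee_id": 1, "eur_value": 10},
    {"subsidiary_id": 1, "employee_id": 1, "eur_value": 20},
    {"subsidiary_id": 2, "employee_id": 1, "eur_value": 99},
]

def join_key(row):
    """Composite join key, mirroring a join on both id columns."""
    return (row["subsidiary_id"], row["employee_id"])

result = hash_join(employees, sales, join_key)
for row in result:
    print(row)
```

The hash table holds every selected column of the build side, which is why selecting fewer columns and filtering the build input early both reduce its size.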

Join – Hash join

SELECT * FROM sales s
JOIN employees e
  ON (s.subsidiary_id = e.subsidiary_id
  AND s.employee_id = e.employee_id)
WHERE s.sale_date > trunc(sysdate) - INTERVAL '6' MONTH

Join – Hash join

• Indexing the predicates used in the join does not improve hash join performance!

• An index that could be used is one on sale_date.

• What would the plan look like if the index were used?

Join – Hash join

• The order of the conditions in the join does not affect speed (for nested loops it did).

Bibliography (online)

• http://use-the-index-luke.com/

(the book can also be bought in PDF format, but it contains nothing beyond what is on the site)
