Query Compilation: Parsing, Logical Query Plan. Source: our textbook, slides by Hector Garcia-Molina

Dec 20, 2015

Page 1

Query Compilation

Parsing
Logical Query Plan

Source: our textbook, slides by Hector Garcia-Molina

Page 2

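Query compilation pipeline (flow diagram):

SQL query
  -> parse -> parse tree
  -> convert -> logical query plan
  -> apply laws -> "improved" l.q.p.
  -> estimate result sizes -> l.q.p. + sizes
  -> consider physical plans -> {P1, P2, ...}
  -> estimate costs -> {(P1,C1), (P2,C2), ...}
  -> pick best -> Pi
  -> execute -> answer

Additional input to the estimation: statistics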

Page 3

Outline
- Convert SQL query to a parse tree
- Semantic checking: attributes, relation names, types
- Convert to a logical query plan (relational algebra expression)
  - deal with subqueries
- Improve the logical query plan
  - use algebraic transformations
  - group together certain operators
  - evaluate logical plan based on estimated size of relations
- Convert to a physical query plan
  - search the space of physical plans
  - choose order of operations
  - complete the physical query plan

Page 4

Parsing
- Goal is to convert a text string containing a query into a parse tree data structure:
  - leaves form the text string (broken into lexical elements)
  - internal nodes are syntactic categories
- Uses standard algorithmic techniques from compilers
  - given a grammar for the language (e.g., SQL), process the string and build the tree

Page 5

Example: SQL query

    SELECT title
    FROM StarsIn
    WHERE starName IN (
        SELECT name
        FROM MovieStar
        WHERE birthdate LIKE '%1960'
    );

(Find the movies with stars born in 1960)

Assume we have a simplified grammar for SQL.

Page 6

Example: Parse Tree

<Query>
  <SFW>
    SELECT
    <SelList>
      <Attribute>
        title
    FROM
    <FromList>
      <RelName>
        StarsIn
    WHERE
    <Condition>
      <Tuple>
        <Attribute>
          starName
      IN
      <Query>
        (
        <Query>
          <SFW>
            SELECT
            <SelList>
              <Attribute>
                name
            FROM
            <FromList>
              <RelName>
                MovieStar
            WHERE
            <Condition>
              <Attribute>
                birthDate
              LIKE
              <Pattern>
                '%1960'
        )
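The parsing step can be illustrated with a small recursive-descent parser. This is only a sketch under assumptions: the grammar below is a hypothetical simplification that accepts one attribute, one relation, and a condition of the form `attr IN (query)` or `attr LIKE 'pattern'`; the names `tokenize`, `Parser`, and `parse` are made up for this example and are not taken from the textbook or any real SQL engine.

```python
import re

KEYWORDS = {"SELECT", "FROM", "WHERE", "IN", "LIKE"}
# A token is a quoted pattern, a word, or one punctuation character.
TOKEN = re.compile(r"\s*('[^']*'|\w+|[(),;=])")

def tokenize(text):
    """Break the query string into lexical elements (the leaves of the tree)."""
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        token = match.group(1)
        tokens.append(token.upper() if token.upper() in KEYWORDS else token)
        pos = match.end()
    return tokens

class Parser:
    """Recursive descent over the assumed grammar; nodes are nested tuples."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    def name(self, category):
        token = self.take()
        if token in KEYWORDS or not re.fullmatch(r"\w+", token):
            raise SyntaxError(f"expected a name, got {token!r}")
        return (category, token)

    def query(self):
        # <Query> ::= ( <Query> ) | <SFW>
        if self.peek() == "(":
            return ("<Query>", self.take("("), self.query(), self.take(")"))
        return ("<Query>", self.sfw())

    def sfw(self):
        # <SFW> ::= SELECT <SelList> FROM <FromList> WHERE <Condition>
        return ("<SFW>",
                self.take("SELECT"), ("<SelList>", self.name("<Attribute>")),
                self.take("FROM"), ("<FromList>", self.name("<RelName>")),
                self.take("WHERE"), self.condition())

    def condition(self):
        # <Condition> ::= <Tuple> IN <Query> | <Attribute> LIKE <Pattern>
        attribute = self.name("<Attribute>")
        if self.peek() == "IN":
            return ("<Condition>", ("<Tuple>", attribute),
                    self.take("IN"), self.query())
        self.take("LIKE")
        pattern = self.take()
        if not pattern.startswith("'"):
            raise SyntaxError(f"expected a quoted pattern, got {pattern!r}")
        return ("<Condition>", attribute, "LIKE", ("<Pattern>", pattern))

def parse(text):
    # A trailing ';' is left unconsumed by this sketch.
    return Parser(tokenize(text)).query()

QUERY = """SELECT title FROM StarsIn WHERE starName IN (
    SELECT name FROM MovieStar WHERE birthdate LIKE '%1960');"""
tree = parse(QUERY)
```

Internal nodes carry the syntactic category as their first element, and the leaves are the tokens themselves, which mirrors the shape of the parse tree on this slide.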

Page 7

The Preprocessor
- replaces each reference to a view with a parse (sub)-tree that describes the view (i.e., a query)
- does semantic checking:
  - are relations and views mentioned in the schema?
  - are attributes mentioned in the current scope?
  - are attribute types correct?
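The first two semantic checks can be sketched as a lookup against a schema catalog. The `SCHEMA` dictionary and the function `check_scope` are hypothetical names introduced for illustration; the relation schemas follow the examples used in these slides, and type checking is left out of the sketch.

```python
# Hypothetical catalog: relation name -> set of attribute names.
SCHEMA = {
    "StarsIn": {"title", "year", "starName"},
    "MovieStar": {"name", "addr", "gender", "birthdate"},
}

def check_scope(relations, attributes, schema=SCHEMA):
    """Return a list of semantic errors for one SELECT-FROM-WHERE scope."""
    errors = [f"unknown relation: {r}" for r in relations if r not in schema]
    known = [r for r in relations if r in schema]
    for attr in attributes:
        owners = [r for r in known if attr in schema[r]]
        if not owners:
            errors.append(f"attribute not in scope: {attr}")
        elif len(owners) > 1:
            errors.append(f"ambiguous attribute: {attr}")
    return errors

ok = check_scope(["StarsIn"], ["title", "starName"])
bad = check_scope(["StarsIn"], ["name"])
```

Here `ok` is empty, while `bad` reports that `name` is not an attribute of any relation in the FROM list of that scope.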


Page 9

Convert Parse Tree to Relational Algebra

Complete algorithm depends on specific grammar, which determines forms of the parse trees

Here we give a flavor of the approach

Page 10

Conversion

Suppose there are no subqueries.

SELECT att-list FROM rel-list WHERE cond

is converted into

PROJ_att-list(SELECT_cond(PRODUCT(rel-list))), or

$\pi_{\text{att-list}}(\sigma_{\text{cond}}(\times(\text{rel-list})))$
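The conversion rule for a subquery-free query can be sketched as a function that builds a nested-tuple expression tree. The tuple encoding and the name `sfw_to_algebra` are assumptions made for this example only.

```python
def sfw_to_algebra(att_list, rel_list, cond):
    """Build PROJ_att-list(SELECT_cond(PRODUCT(rel-list))) as nested tuples."""
    plan = rel_list[0]
    for rel in rel_list[1:]:
        # Fold the FROM list into a left-deep chain of binary products.
        plan = ("product", plan, rel)
    plan = ("select", cond, plan)
    return ("project", tuple(att_list), plan)

plan = sfw_to_algebra(
    ["movieTitle"],
    ["StarsIn", "MovieStar"],
    "starName = name AND birthdate LIKE '%1960'",
)
```

The condition is kept as an opaque string here; a fuller implementation would keep it as a tree so that later rewrites can split it.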

Page 11

    SELECT movieTitle
    FROM StarsIn, MovieStar
    WHERE starName = name AND birthdate LIKE '%1960';

<Query>
  <SFW>
    SELECT
    <SelList>
      <Attribute>
        movieTitle
    FROM
    <FromList>
      <RelName>
        StarsIn
      ,
      <FromList>
        <RelName>
          MovieStar
    WHERE
    <Condition>
      <Condition>
        <Attribute>
          starName
        =
        <Attribute>
          name
      AND
      <Condition>
        <Attribute>
          birthdate
        LIKE
        <Pattern>
          '%1960'

Page 12

Equivalent Algebraic Expression Tree

$\pi_{\text{movieTitle}}$
  $\sigma_{\text{starName = name AND birthdate LIKE '\%1960'}}$
    $\times$
      StarsIn
      MovieStar

Page 13

Handling Subqueries

Recall the (equivalent) query:

    SELECT title
    FROM StarsIn
    WHERE starName IN (
        SELECT name
        FROM MovieStar
        WHERE birthdate LIKE '%1960'
    );

Use an intermediate format called two-argument selection

Page 14

Example: Two-Argument Selection

$\pi_{\text{title}}$
  $\sigma$ (two-argument selection)
    StarsIn
    <condition>
      <tuple>
        <attribute>
          starName
      IN
      $\pi_{\text{name}}$
        $\sigma_{\text{birthdate LIKE '\%1960'}}$
          MovieStar

Page 15

Converting Two-Argument Selection

To continue the conversion, we need rules for replacing two-argument selection with a relational algebra expression

Different rules depending on the nature of the subquery

Here we show an example for the IN operator and an uncorrelated subquery (the subquery computes a relation independent of the tuple being tested)

Page 16

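Rules for IN

The two-argument selection

$\sigma$
  R
  <Condition>
    t IN S

is replaced by

$\sigma_C$
  $\times$
    R
    S

C is the condition that equates attributes in t with corresponding attributes in S. (In the textbook's version of this rule, duplicate elimination $\delta$ is applied to S before the product, so that a tuple of R is not repeated when S contains several equal matching tuples.)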
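The rule can be checked on toy data. The sketch below assumes R is a list of dictionaries and S a list of tuples; the function names are hypothetical. It compares direct evaluation of `t IN S` against the product-then-select form, and shows why duplicates in S have to be removed first.

```python
def select_in(R, t_attrs, S):
    """Direct evaluation: keep r when its t-attributes form a tuple in S."""
    members = set(S)
    return [r for r in R if tuple(r[a] for a in t_attrs) in members]

def select_product(R, t_attrs, S, dedup=True):
    """Rewritten form: sigma_C over the product of R and S, keeping R's columns.

    With dedup=True, S is first reduced to its distinct tuples (delta).
    """
    right = list(dict.fromkeys(S)) if dedup else list(S)
    return [r for r in R for s in right
            if tuple(r[a] for a in t_attrs) == s]  # condition C

stars_in = [
    {"title": "A", "starName": "X"},
    {"title": "B", "starName": "Y"},
    {"title": "C", "starName": "X"},
]
names_1960 = [("X",), ("X",), ("Z",)]  # subquery result with a duplicate

direct = select_in(stars_in, ["starName"], names_1960)
rewritten = select_product(stars_in, ["starName"], names_1960)
without_delta = select_product(stars_in, ["starName"], names_1960, dedup=False)
```

On this data `direct` and `rewritten` agree, while `without_delta` returns each matching row twice because `("X",)` occurs twice in S.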

Page 17

Example: Logical Query Plan

$\pi_{\text{title}}$
  $\bowtie_{\text{starName = name}}$
    StarsIn
    $\pi_{\text{name}}$
      $\sigma_{\text{birthdate LIKE '\%1960'}}$
        MovieStar

Page 18

What if Subquery is Correlated?

A subquery is correlated when, for example, it refers to the current tuple of the outer scope that is being tested

More complicated to deal with, since the subquery cannot be translated in isolation

Need to incorporate external attributes in the translation

Some details are in the textbook


Page 20

Improving the Logical Query Plan

There are numerous algebraic laws concerning relational algebra operations

By applying them to a logical query plan judiciously, we can get an equivalent query plan that can be executed more efficiently

Next we'll survey some of these laws

Page 21

Associative and Commutative Operations

- product
- natural join
- set and bag union
- set and bag intersection

associative: (A op B) op C = A op (B op C)

commutative: A op B = B op A

Page 22

Laws Involving Selection

Selections usually reduce the size of the relation

Usually good to do selections early, i.e., "push them down the tree"

Also can be helpful to break up a complex selection into parts

Page 23

Selection Splitting

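$\sigma_{C_1 \text{ AND } C_2}(R) = \sigma_{C_1}(\sigma_{C_2}(R))$

$\sigma_{C_1 \text{ OR } C_2}(R) = (\sigma_{C_1}(R)) \cup_{set} (\sigma_{C_2}(R))$, if R is a set

$\sigma_{C_1}(\sigma_{C_2}(R)) = \sigma_{C_2}(\sigma_{C_1}(R))$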
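The three splitting laws can be checked on a small relation. In this sketch a relation is a Python list of integers and a condition is a predicate; `sigma` is a hypothetical helper written for the example. The last comparison shows why the OR law needs set union: with bag union, tuples that satisfy both conditions are counted twice.

```python
def sigma(cond, relation):
    """Selection: keep the tuples that satisfy cond (order and duplicates kept)."""
    return [t for t in relation if cond(t)]

R = [1, 2, 3, 4, 5, 6]
c1 = lambda t: t > 2
c2 = lambda t: t % 2 == 0

# AND: one selection equals a cascade of two, in either order.
and_direct = sigma(lambda t: c1(t) and c2(t), R)
and_cascade = sigma(c1, sigma(c2, R))
and_swapped = sigma(c2, sigma(c1, R))

# OR: equal to the set union of the two selections when R is a set.
or_direct = set(sigma(lambda t: c1(t) or c2(t), R))
or_set_union = set(sigma(c1, R)) | set(sigma(c2, R))

# With bag union (list concatenation) the sizes no longer match.
or_bag_union = sigma(c1, R) + sigma(c2, R)
```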

Page 24

Selection and Binary Operators

- Must push selection to both arguments:
  - $\sigma_C(R \cup S) = \sigma_C(R) \cup \sigma_C(S)$
- Must push to first arg, optional for 2nd:
  - $\sigma_C(R - S) = \sigma_C(R) - S$
  - $\sigma_C(R - S) = \sigma_C(R) - \sigma_C(S)$
- Push to at least one arg with all attributes mentioned in C:
  - product, natural join, theta join, intersection
  - e.g., $\sigma_C(R \times S) = \sigma_C(R) \times S$, if R has all the atts in C
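The union and difference laws can be checked under bag semantics with `collections.Counter`, where `+` acts as bag union and `-` as bag difference. This is a sketch on made-up data; `sigma` is a hypothetical helper, not part of any library.

```python
from collections import Counter

def sigma(cond, bag):
    """Selection on a bag: keep qualifying tuples with their multiplicities."""
    return Counter({t: n for t, n in bag.items() if cond(t)})

R = Counter({1: 2, 2: 1, 5: 3})
S = Counter({1: 1, 5: 1, 7: 2})
C = lambda t: t > 1

union_lhs = sigma(C, R + S)
union_rhs = sigma(C, R) + sigma(C, S)

diff_lhs = sigma(C, R - S)
diff_first_only = sigma(C, R) - S
diff_both = sigma(C, R) - sigma(C, S)

# Pushing the selection only into the second argument is NOT valid.
diff_second_only = R - sigma(C, S)
```

The three difference expressions agree, whereas `diff_second_only` still contains tuples that fail C.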

Page 25

Pushing Selection Up the Tree

Suppose we have relations

    StarsIn(title, year, starName)
    Movie(title, year, len, inColor, studioName)

and a view

    CREATE VIEW MoviesOf1996 AS
        SELECT *
        FROM Movie
        WHERE year = 1996;

and the query

    SELECT starName, studioName
    FROM MoviesOf1996 NATURAL JOIN StarsIn;

Page 26

The Straightforward Tree

$\pi_{\text{starName, studioName}}$
  $\bowtie$
    $\sigma_{\text{year = 1996}}$
      Movie
    StarsIn

Remember the rule $\sigma_C(R \bowtie S) = \sigma_C(R) \bowtie S$?

Page 27

The Improved Logical Query Plan

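Starting tree:

$\pi_{\text{starName, studioName}}$
  $\bowtie$
    $\sigma_{\text{year = 1996}}$
      Movie
    StarsIn

Push selection up tree:

$\pi_{\text{starName, studioName}}$
  $\sigma_{\text{year = 1996}}$
    $\bowtie$
      Movie
      StarsIn

Push selection down tree:

$\pi_{\text{starName, studioName}}$
  $\bowtie$
    $\sigma_{\text{year = 1996}}$
      Movie
    $\sigma_{\text{year = 1996}}$
      StarsIn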
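The three plans can be compared on toy data. In this sketch relations are lists of dictionaries with made-up rows, and `natural_join`, `select`, and `project` are hypothetical helpers written for the example (the `len` and `inColor` attributes of Movie are omitted for brevity).

```python
def natural_join(R, S):
    """Join rows that agree on every attribute name they share."""
    return [{**r, **s} for r in R for s in S
            if all(r[a] == s[a] for a in r.keys() & s.keys())]

def select(cond, R):
    return [t for t in R if cond(t)]

def project(attrs, R):
    return [tuple(t[a] for a in attrs) for t in R]

movie = [
    {"title": "M1", "year": 1996, "studioName": "Fox"},
    {"title": "M2", "year": 1995, "studioName": "MGM"},
]
stars_in = [
    {"title": "M1", "year": 1996, "starName": "A"},
    {"title": "M2", "year": 1995, "starName": "B"},
    {"title": "M1", "year": 1996, "starName": "C"},
]
is_1996 = lambda t: t["year"] == 1996
out = ["starName", "studioName"]

# Starting tree: selection on Movie only (from the view definition).
start = project(out, natural_join(select(is_1996, movie), stars_in))
# Selection pushed up above the join.
pushed_up = project(out, select(is_1996, natural_join(movie, stars_in)))
# Selection pushed down to both join arguments.
pushed_down = project(
    out, natural_join(select(is_1996, movie), select(is_1996, stars_in)))
```

All three plans return the same rows; the last one filters StarsIn before the join, so the join sees fewer tuples on both sides.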

Page 28

Laws Involving Projections
- Consider adding in additional projections
- Adding a projection lower in the tree can improve performance, since often tuple size is reduced
  - Usually not as helpful as pushing selections down
- If a projection is inserted in the tree, then none of the eliminated attributes can appear above this point in the tree
  - Ex: $\pi_L(R \times S) = \pi_L(\pi_M(R) \times \pi_N(S))$, where M (resp. N) is all attributes of R (resp. S) that are used in L
- Another example: $\pi_L(R \cup_{bag} S) = \pi_L(R) \cup_{bag} \pi_L(S)$
  - But watch out for set union!
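The warning about set union can be made concrete. The sketch below assumes bag projection (duplicates are kept) and models bags as `collections.Counter` objects; `project` and `set_union` are hypothetical helpers for this example.

```python
from collections import Counter

def project(bag, positions):
    """Bag projection: keep the listed columns, adding up multiplicities."""
    out = Counter()
    for row, count in bag.items():
        out[tuple(row[i] for i in positions)] += count
    return out

def set_union(r, s):
    """Set union: every distinct tuple of either input appears exactly once."""
    return Counter(set(r) | set(s))

R = Counter([(1, 2)])
S = Counter([(1, 3)])

# Bag union: projecting before or after the union gives the same bag.
bag_after = project(R + S, [0])
bag_before = project(R, [0]) + project(S, [0])

# Set union: the two orders disagree on the multiplicity of (1,).
set_after = project(set_union(R, S), [0])
set_before = set_union(project(R, [0]), project(S, [0]))
```

The tuples `(1, 2)` and `(1, 3)` are distinct, so set union keeps both and the later projection yields `(1,)` twice; projecting first makes them equal, and set union then keeps only one copy.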

Page 29

Push Projection Below Selection?

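Rule: $\pi_L(\sigma_C(R)) = \pi_L(\sigma_C(\pi_M(R)))$, where M is all attributes used by L or C

But is it a good idea?

    SELECT starName FROM StarsIn WHERE movieYear = 1996;

Without the extra projection:

$\pi_{\text{starName}}$
  $\sigma_{\text{movieYear = 1996}}$
    StarsIn

With the projection pushed below the selection:

$\pi_{\text{starName}}$
  $\sigma_{\text{movieYear = 1996}}$
    $\pi_{\text{starName, movieYear}}$
      StarsIn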

Page 30

Joins and Products
- Recall from the definitions of relational algebra:
  - $R \bowtie_C S = \sigma_C(R \times S)$ (theta join)
  - $R \bowtie S = \pi_L(\sigma_C(R \times S))$ (natural join), where C equates same-name attributes in R and S, and L includes all attributes of R and S dropping duplicates
- To improve a logical query plan, replace a product followed by a selection with a join
  - Join algorithms are usually faster than doing product followed by selection
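To see why a join operator can beat product followed by selection, compare the two on an equality condition. The hash-based join below is one common join algorithm, chosen here only as an illustration; the slides do not name a specific algorithm, and the function names and data are made up.

```python
from collections import defaultdict

def product_then_select(R, S, r_key, s_key):
    """sigma_C(R x S): examines every pair of tuples."""
    return [r + s for r in R for s in S if r[r_key] == s[s_key]]

def hash_join(R, S, r_key, s_key):
    """Equijoin: index S on its join column, then probe once per tuple of R."""
    index = defaultdict(list)
    for s in S:
        index[s[s_key]].append(s)
    return [r + s for r in R for s in index.get(r[r_key], [])]

movie_star = [("X", "1960"), ("Y", "1970")]              # (name, birthdate)
stars_in = [("M1", 1996, "X"), ("M2", 1995, "Y"),
            ("M3", 1996, "X"), ("M4", 1990, "Z")]        # (title, year, starName)

slow = product_then_select(movie_star, stars_in, 0, 2)
fast = hash_join(movie_star, stars_in, 0, 2)
```

Both produce the same tuples, but the first examines all `len(R) * len(S)` pairs while the second touches each input tuple once plus the matches it emits.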

Page 31

Duplicate Elimination
- Moving $\delta$ down the tree is potentially beneficial as it can reduce the size of intermediate relations
- $\delta$ can be eliminated if its argument has no duplicates:
  - a relation with a primary key
  - a relation resulting from a grouping operator
- Legal to push $\delta$ through product, join, selection, and bag intersection
  - Ex: $\delta(R \times S) = \delta(R) \times \delta(S)$
- Cannot push $\delta$ through bag union, bag difference or projection
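Both the legal case (product) and an illegal case (bag union) can be checked with bags modeled as `collections.Counter`. The helpers `delta` and `product` are hypothetical names for this sketch.

```python
from collections import Counter

def delta(bag):
    """Duplicate elimination: every distinct tuple gets multiplicity 1."""
    return Counter(set(bag))

def product(r, s):
    """Bag product: concatenate tuples and multiply multiplicities."""
    return Counter({a + b: m * n for a, m in r.items() for b, n in s.items()})

R = Counter({(1,): 2, (2,): 1})
S = Counter({("a",): 3})
T = Counter({(1,): 1})

# Legal: delta can be pushed through a product.
product_lhs = delta(product(R, S))
product_rhs = product(delta(R), delta(S))

# Not legal: pushing delta through bag union leaves a duplicate behind.
union_lhs = delta(R + T)
union_rhs = delta(R) + delta(T)
```

In the union case the tuple `(1,)` occurs in both inputs, so eliminating duplicates on each side separately still leaves two copies after the union.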

Page 32

Grouping and Aggregation

- Since $\gamma$ produces no duplicates: $\delta(\gamma_L(R)) = \gamma_L(R)$
- Get rid of useless attributes: $\gamma_L(R) = \gamma_L(\pi_M(R))$, where M contains all attributes in L
- If L contains only MIN and MAX: $\gamma_L(R) = \gamma_L(\delta(R))$
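The MIN/MAX law can be checked on a small bag, and the same data shows why it does not extend to SUM. The `group` helper and the sample rows are assumptions for this sketch.

```python
def group(rows, key, agg, value):
    """gamma: group rows by column `key` and aggregate column `value` with agg."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[value])
    return {k: agg(v) for k, v in groups.items()}

def delta(rows):
    """Duplicate elimination that keeps first occurrences in order."""
    return list(dict.fromkeys(rows))

rows = [(1996, 10), (1996, 10), (1996, 7), (1995, 3)]  # (year, value)

max_plain = group(rows, 0, max, 1)
max_dedup = group(delta(rows), 0, max, 1)

sum_plain = group(rows, 0, sum, 1)
sum_dedup = group(delta(rows), 0, sum, 1)
```

MAX ignores how often a value occurs, so removing duplicates first is harmless; SUM depends on multiplicities, so the two sums differ.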

Page 33

Example

Suppose we have the relations

    MovieStar(name, addr, gender, birthdate)
    StarsIn(title, year, starName)

and we want to find the youngest star to appear in a movie for each year:

    SELECT year, MAX(birthdate)
    FROM MovieStar, StarsIn
    WHERE name = starName
    GROUP BY year;

$\gamma_{\text{year, MAX(birthdate)}}$
  $\sigma_{\text{name = starName}}$
    $\times$
      MovieStar
      StarsIn

Page 34

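Example cont'd

Initial plan:

$\gamma_{\text{year, MAX(birthdate)}}$
  $\sigma_{\text{name = starName}}$
    $\times$
      MovieStar
      StarsIn

Selection and product combined into a join, with a projection added:

$\gamma_{\text{year, MAX(birthdate)}}$
  $\pi_{\text{year, birthdate}}$
    $\bowtie_{\text{name = starName}}$
      MovieStar
      StarsIn

Projections pushed below the join:

$\gamma_{\text{year, MAX(birthdate)}}$
  $\pi_{\text{year, birthdate}}$
    $\bowtie_{\text{name = starName}}$
      $\pi_{\text{birthdate, name}}$
        MovieStar
      $\pi_{\text{year, starName}}$
        StarsIn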

Page 35

Summary of LQP Improvements

- Selections:
  - push down tree as far as possible
  - if condition is an AND, split and push separately
  - sometimes need to push up before pushing down
- Projections:
  - can be pushed down
  - new ones can be added (but be careful)
- Duplicate elimination: sometimes can be removed
- Selection/product combinations: can sometimes be replaced with join


Page 37

Grouping Assoc/Comm Operators

Group together adjacent joins, adjacent unions, and adjacent intersections as siblings in the tree

Sets up the logical QP for future optimization when physical QP is constructed: determine best order for doing a sequence of joins (or unions or intersections)

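[Figure: a tree of nested binary unions over A, B, C, D, E, F is replaced by a single union node with the six children A, B, C, D, E, F]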
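The grouping step can be sketched as a tree rewrite that merges adjacent nodes of the same operator into one n-ary node. The nested-tuple encoding, the function name `flatten`, and the particular nesting of the sample tree are assumptions for this example; the slide's figure does not survive well enough to recover its exact shape.

```python
def flatten(tree, op):
    """Make the operands of adjacent `op` nodes siblings under one n-ary node."""
    if isinstance(tree, str):
        return tree  # a leaf relation
    node_op, *children = tree
    children = [flatten(child, op) for child in children]
    if node_op != op:
        return (node_op, *children)
    merged = []
    for child in children:
        if isinstance(child, tuple) and child[0] == op:
            merged.extend(child[1:])  # lift grandchildren up one level
        else:
            merged.append(child)
    return (op, *merged)

nested = ("union",
          ("union", "A", ("union", "B", "C")),
          ("union", "D", ("union", "E", "F")))
flat = flatten(nested, "union")

mixed = ("union", ("join", "A", "B"), ("union", "C", "D"))
flat_mixed = flatten(mixed, "union")
```

Only nodes with the same operator are merged, so the join inside `mixed` stays a separate subtree. The physical planner is then free to pick any order for the grouped operands.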

Page 38

Evaluating Logical Query Plans

The transformations discussed so far intuitively seem like good ideas

But how can we evaluate them more scientifically?

Estimate the sizes of relations; this is also helpful in evaluating physical query plans

Coming up next…