5/2/2011
Query Processing and
Optimization
Introduction
• Users are expected to write "efficient" queries, but they do not always do so!
– Users typically do not have enough information about the database to write efficient queries, e.g., no information on table sizes
– Users cannot know whether a query is efficient without knowing how the DBMS's query processor works
• The DBMS's job is to optimize the user's query by:
– Converting the query to an internal representation (tree or graph)
– Evaluating the costs of several possible ways of executing the query and finding the best one
Steps in Query Processing
SQL query → Query Parsing → Parse Tree → Query Optimization → Execution Plan → Code Generation → Code → Runtime DB Processor → Result
Example parse tree: Join of Employee and Project
Example execution plan: Join Employee and Project using hash join, …
Query Processing
Query in a high-level language
→ Scanning, Parsing, & Validating → Intermediate form of query
→ QUERY OPTIMIZER → Execution Plan
→ Query Code Generator → Code to execute the query
→ Runtime DB Processor → Result of query
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Basic Steps in Query Processing
• Parsing and translation
– Translate the query into its internal form. This is then translated into relational algebra.
– Parser checks syntax, verifies relations
• Evaluation
– The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.
Query Processing
• Consider the query:
select balance
from account
where balance<2500
• Can be translated into either of the following RA expressions:
σ_{balance<2500}(∏_{balance}(account))
∏_{balance}(σ_{balance<2500}(account))
• The RA expressions are equivalent
Query Processing
• Each relational algebra operation can be evaluated using one of several different algorithms
– Correspondingly, a relational-algebra expression can be evaluated in many ways.
• An annotated expression specifying a detailed evaluation strategy is called an evaluation plan
– E.g., can use an index on balance to find accounts with balance < 2500,
– or can perform a complete relation scan and discard accounts with balance ≥ 2500
Query Optimization
• Amongst all equivalent evaluation plans, choose the one with the lowest cost.
– Cost is estimated using statistical information from the database catalog
• e.g. number of tuples in each relation, size of tuples, etc.
• First we need to learn:
– How to measure query costs
– Algorithms for evaluating relational algebra operations
– How to combine algorithms for individual operations in order to evaluate a complete expression
– How to optimize queries, that is, how to find an evaluation plan with the lowest estimated cost
Measures of Query Cost
• Cost is generally measured as total elapsed time for answering a query
– Many factors contribute to time cost
• disk accesses, CPU, or even network communication
• Typically disk access is the predominant cost, and is also relatively easy to estimate. It is measured by taking into account
– Number of seeks * average-seek-cost
+ Number of blocks read * average-block-read-cost
+ Number of blocks written * average-block-write-cost
• Cost to write a block is greater than cost to read a block
– data is read back after being written to ensure that the write was successful
– Assumption: single disk
• Can modify formulae for multiple disks/RAID arrays
• Or just use single-disk formulae, but interpret them as measuring resource consumption instead of time
Measures of Query Cost (Cont.)
• For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures
– tT: time to transfer one block
– tS: time for one seek
– Cost for b block transfers plus S seeks: b * tT + S * tS
• We ignore CPU costs for simplicity
– Real systems do take CPU cost into account
• We do not include the cost of writing output to disk in our cost formulae
• Several algorithms can reduce disk I/O by using extra buffer space
– Amount of real memory available for buffering depends on other concurrent queries and OS processes, and is known only during execution
• We often use worst-case estimates, assuming only the minimum amount of memory needed for the operation is available
• Required data may already be buffer resident, avoiding disk I/O
– But this is hard to take into account for cost estimation
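The cost measure b * tT + S * tS can be tried out with a few lines of Python. This is only a sketch: the timing constants and the two access patterns below are made-up illustration values, not figures from the slides.

```python
def io_cost(block_transfers, seeks, t_transfer, t_seek):
    """Estimated I/O time for b block transfers plus S seeks: b * tT + S * tS."""
    return block_transfers * t_transfer + seeks * t_seek

# Hypothetical device timings in microseconds (assumptions for illustration only).
T_TRANSFER_US = 100    # tT: time to transfer one block
T_SEEK_US = 4000       # tS: time for one seek

# A sequential scan of 400 blocks needs one seek and 400 transfers.
full_scan_cost = io_cost(400, 1, T_TRANSFER_US, T_SEEK_US)

# Fetching 10 scattered blocks needs one seek per block.
scattered_cost = io_cost(10, 10, T_TRANSFER_US, T_SEEK_US)

print(full_scan_cost, scattered_cost)
```

With these assumed timings the two access patterns cost almost the same, which shows why seeks have to be counted separately from block transfers.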
Statistics and Catalogs
• For each Table
– Table name, file name (or some identifier) & file structure (e.g., heap file)
– Attribute name and type of each attribute
– Index name of each index
– Integrity constraints
• For each Index
– Index name & the structure (e.g., B+ tree)
– Search key attributes
• For each View
– View name & definition
Statistics and Catalogs
• Cardinality: NTuples(R) for each table R
• Size: NPages(R) for each table R
• Index Cardinality: Number of distinct key values NKeys(I) for each index I
• Index Size: INPages(I) for each index I
• For a B+ tree index, INPages is the number of leaf pages
• Index Height: Number of non-leaf levels IHeight(I) for each tree index I
• Index Range: ILow(I) & IHigh(I), the lowest and highest key values in index I
Statistics and Catalogs
• Catalogs updated periodically
– Updating whenever data changes is too expensive
• More detailed information (e.g., histograms of the values in some field) is sometimes stored.
Operator Evaluation
Algorithms for evaluating relational operators use some simple ideas extensively:
– Indexing: If a selection or join condition is specified, use an index to examine just the tuples that satisfy the condition.
– Iteration: Sometimes, faster to scan all tuples even if there is an index. (And sometimes, we can scan the data entries in an index instead of the table itself.)
– Partitioning: By using sorting or hashing, we can partition the input tuples and replace an expensive operation by similar operations on smaller inputs.
Access Paths
• An access path is a method of retrieving tuples:
• File scan, or index that matches a selection (in the query)
• A tree index matches (a conjunction of) terms that involve only attributes in a prefix of the search key.
• E.g., a tree index on <a, b, c> matches the selections a=5 AND b=3, and a=5 AND b>6, but not b=3.
• A hash index matches (a conjunction of) terms that has a term of the form attribute = value for every attribute in the search key of the index.
• E.g., a hash index on <a, b, c> matches a=5 AND b=3 AND c=5; but it does not match b=3, or a=5 AND b=3, or a>5 AND b=3 AND c=5.
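The two matching rules can be sketched as small Python predicates. The representation is an assumption made for illustration: a selection is a list of (attribute, operator) pairs that are ANDed together, and the whole conjunction is tested as a unit.

```python
def tree_index_matches(search_key, conditions):
    """True if the attributes used in the conjunction form a prefix of the search key."""
    attrs = {attr for attr, _op in conditions}
    if not attrs:
        return False
    prefix = search_key[:len(attrs)]
    return attrs == set(prefix)

def hash_index_matches(search_key, conditions):
    """True if every search-key attribute has an equality term and nothing else is used."""
    all_attrs = {attr for attr, _op in conditions}
    eq_attrs = {attr for attr, op in conditions if op == "="}
    return all_attrs == eq_attrs and eq_attrs == set(search_key)

KEY = ["a", "b", "c"]  # index on <a, b, c>

print(tree_index_matches(KEY, [("a", "="), ("b", "=")]))              # a=5 AND b=3
print(tree_index_matches(KEY, [("b", "=")]))                          # b=3
print(hash_index_matches(KEY, [("a", "="), ("b", "="), ("c", "=")]))  # a=5 AND b=3 AND c=5
print(hash_index_matches(KEY, [("a", ">"), ("b", "="), ("c", "=")]))  # a>5 AND b=3 AND c=5
```

The constant values (5, 3, 6) are left out because only the attribute names and comparison operators decide whether an index matches.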
Access Paths
• Selectivity: Number of pages retrieved (Index + data) to retrieve all desired tuples
• Using the most selective access path minimizes the cost of data retrieval
• Reduction Factor:
• Each conjunct is a filter
• The fraction of tuples satisfying a given conjunct is called the reduction factor
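One common way to use reduction factors is to multiply them together to estimate how many tuples survive all conjuncts. The sketch below assumes the conjuncts are statistically independent, which the slides do not state; the table size and factors are hypothetical.

```python
from fractions import Fraction
from math import prod

def estimated_result_size(n_tuples, reduction_factors):
    """Estimate surviving tuples as NTuples times the product of the reduction factors.

    Assumes the conjuncts filter independently of one another.
    """
    return n_tuples * prod(reduction_factors, start=Fraction(1))

# Hypothetical numbers: 100,000 tuples, one conjunct keeps 1 in 100, another keeps half.
size = estimated_result_size(100_000, [Fraction(1, 100), Fraction(1, 2)])
print(size)
```

Exact fractions are used instead of floats so that the arithmetic in the example stays exact.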
Query Optimization
• Techniques used by a DBMS to process, optimize, and execute high-level queries
• A high-level query is
– Scanned
– Parsed
– Validated
Query Optimization
• The query optimizer would now choose an execution plan for each query block
• Note that the inner block needs to be evaluated only once to produce the maximum salary
• Uncorrelated nested query
• It is much harder to optimize correlated nested query where a tuple variable from the outer block appears in the where clause of the inner block
Select S.sname
From Sailors S
Where exists (select *
from Reserves R
where R.bid=103
AND R.sid=S.sid)
A Word about *
• All we want to do is check that a qualifying row exists; we do not actually need to retrieve any columns from the row
Select S.sname
From Sailors S
Where exists (select *
from Reserves R
where R.bid=103
AND R.sid=S.sid)
Select count (*)
From Sailors S
Select count (distinct S.sname)
From Sailors S
If COUNT does not include DISTINCT, the above two queries give the same result
COUNT (*) is a better querying style since it is immediately clear that all records contribute to the total count
Query Optimization
• Given a relational algebra expression, how do we transform it into a more efficient one?
• Use the query tree as a tool to rearrange
the operations of the relational algebra
expression
Query Optimization
• RDBMS query optimizers are very complex pieces of software
• Typically represent 40-50 man years of development effort!!
Query Optimization
• SQL queries translated into Relational Algebra & then optimized
• Two main techniques for optimization
• Heuristic based
» Ordering the operations in a query execution strategy
» Works for most cases but not guaranteed for all possible cases
• Cost based
» Systematically estimating the cost of different execution strategies and choosing the execution plan with the lowest cost estimate
• Both combined in a typical query optimizer
Query Optimization
• Query is essentially treated as a σ-∏-►◄ algebra expression
• Remaining operations are carried out on the result of the σ-∏-►◄ expression
• Optimizing an RA expression involves:
• Enumerating alternative plans for evaluating the expression (NOT ALL possible plans)
• Estimating the cost of each enumerated plan and choosing the plan with the lowest estimated cost
Query Evaluation Plans
• A QEP consists of an extended RA tree
• Additional annotations at each node indicating the access method to use for each table and the implementation method to use for each relational operator
Structure and Execution of a Query Tree
• A query tree is a tree structure that
corresponds to a relational algebra expression
by representing the input relations as leaf
nodes and the relational algebra operations as
internal nodes of the tree
• An execution of the query tree consists of
executing an internal node operation whenever
its operands are available and then replacing
that internal node by the relation that results
from executing the operation
Query Optimization: Example
SELECT S.sname
FROM Reserves R, Sailors S
WHERE R.sid=S.sid AND
R.bid=100 AND S.rating>5
RA Expression: ∏sname (σ bid=100 ∧ rating>5 (R ►◄sid=sid S))
RA Tree (root first):
∏ sname
σ bid=100 ∧ rating>5
►◄ sid=sid
Reserves   Sailors
Plan (root first):
∏ sname (On-the-fly)
σ bid=100 ∧ rating>5 (On-the-fly)
►◄ sid=sid (Simple Nested Loops)
Reserves   Sailors
The Schema:
Sailors (sid, sname, rating, age) 50 Bytes
Reserves (sid, bid, day, rname) 40 Bytes
Interpreting the TREE
Tree partially specifies how to evaluate the query
• First compute join between Reserves & Sailors
• Then the selections
• Finally the projection
RA Tree (root first): ∏ sname, then σ bid=100 ∧ rating>5, then ►◄ sid=sid, with leaves Reserves and Sailors
Interpreting the TREE
Decide on the implementation of each operation involved
• Page-oriented simple nested loops join between Reserves & Sailors with Reserves as the outer table
• Apply selections & projections to each tuple in the result of the join as it is produced
• The result of the join before the selections and projections is never stored in its entirety
• Convention: Outer table is the left child of the operator
Plan (root first):
∏ sname (On-the-fly)
σ bid=100 ∧ rating>5 (On-the-fly)
►◄ sid=sid (Simple Nested Loops)
Reserves (File Scan)   Sailors (File Scan)
Heuristics for Optimizing a Query
• A query may have several equivalent
query trees
• A query parser generates a standard canonical query tree from a SQL query
– Cartesian products are applied first (FROM)
– then the conditions (WHERE)
– and finally the projection (SELECT)
Heuristics for Optimizing a Query
select ProjNo, DeptNo, EmpName, Address, Birthdate
from Project, Department, Employee
where ProjLocation='Stafford' and
MgrNo=EmpNo and
Department.DeptNo=Employee.DeptNo
Canonical query tree (root first):
∏ ProjNo, DeptNo, EmpName, Address, Birthdate
σ ProjLocation='Stafford' AND MgrNo=EmpNo AND DeptNo=DeptNo
× (× (Project, Department), Employee)
The query optimizer transforms this canonical query into an efficient final query
Find the names of employees born after 1957 who work on a project named 'Aquarius'
select EmpName
from Employee, WorksOn, Project
where ProjName='Aquarius' AND
Project.ProjNo=WorksOn.ProjNo AND
Employee.EmpNo = WorksOn.EmpNo AND
Birthdate > 'DEC-31-1957'
WorksOn (EmpNo, ProjNo, Hours)
Canonical query tree (root first):
∏ EmpName
σ ProjName='Aquarius' AND Project.ProjNo=WorksOn.ProjNo AND Employee.EmpNo=WorksOn.EmpNo AND Birthdate > 'DEC-31-1957'
× (× (Employee, WorksOn), Project)
Example
Push all the conditions as far down the tree as possible
∏ EmpName (σ ProjNo=ProjNo ((σ EmpNo=EmpNo (σ Birthdate > 'dec-31-1957' (Employee) × WorksOn)) × σ ProjName='Aquarius' (Project)))
Expensive due to large size of Employee
Example
Rearrange join sequence according to estimates of relation sizes
∏ EmpName (σ EmpNo=EmpNo ((σ ProjNo=ProjNo (σ ProjName='Aquarius' (Project) × WorksOn)) × σ Birthdate > 'dec-31-1957' (Employee)))
Only need ProjNo attribute from Project and WorksOn
Only need EmpNo attribute from Employee and WorksOn, and EmpName from Employee
Example
Replace cross products and selection sequence with a join operation
∏ EmpName ((σ ProjName='Aquarius' (Project) ►◄ ProjNo=ProjNo WorksOn) ►◄ EmpNo=EmpNo σ Birthdate > 'dec-31-1957' (Employee))
Example
Push projection as far down the query tree as possible
∏ EmpName ((∏ EmpNo (∏ ProjNo (σ ProjName='Aquarius' (Project)) ►◄ ProjNo=ProjNo ∏ EmpNo, ProjNo (WorksOn))) ►◄ EmpNo=EmpNo ∏ EmpNo, EmpName (σ Birthdate > 'dec-31-1957' (Employee)))
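The effect of the rewriting steps in this example can be checked on toy data. The sketch below is not how a DBMS implements these operators; it uses naive list-based versions of σ, ∏ and ×, and the three small relations (with qualified attribute names such as "E.EmpNo") are invented for illustration.

```python
from itertools import product

def select(pred, rel):
    """σ: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]

def project(attrs, rel):
    """∏: keep only the listed attributes and drop duplicate rows."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

def cross(r, s):
    """×: every pairing of a tuple of r with a tuple of s."""
    return [{**a, **b} for a, b in product(r, s)]

# Hypothetical toy data; dates are ISO strings so that string comparison orders them.
EMPLOYEE = [
    {"E.EmpNo": 1, "E.EmpName": "Ann", "E.Birthdate": "1960-05-01"},
    {"E.EmpNo": 2, "E.EmpName": "Bob", "E.Birthdate": "1950-01-01"},
    {"E.EmpNo": 3, "E.EmpName": "Cy", "E.Birthdate": "1970-03-03"},
]
WORKS_ON = [
    {"W.EmpNo": 1, "W.ProjNo": 10},
    {"W.EmpNo": 2, "W.ProjNo": 10},
    {"W.EmpNo": 3, "W.ProjNo": 20},
]
PROJECT = [
    {"P.ProjNo": 10, "P.ProjName": "Aquarius"},
    {"P.ProjNo": 20, "P.ProjName": "Orion"},
]

# Canonical tree: all cross products first, then one big selection, then the projection.
everything = cross(cross(EMPLOYEE, WORKS_ON), PROJECT)
canonical = project(["E.EmpName"], select(
    lambda t: t["P.ProjName"] == "Aquarius"
    and t["P.ProjNo"] == t["W.ProjNo"]
    and t["E.EmpNo"] == t["W.EmpNo"]
    and t["E.Birthdate"] > "1957-12-31",
    everything))

# Rewritten tree: selections pushed down, most restrictive relation (Project) joined first.
aquarius = select(lambda t: t["P.ProjName"] == "Aquarius", PROJECT)
on_aquarius = select(lambda t: t["P.ProjNo"] == t["W.ProjNo"], cross(aquarius, WORKS_ON))
born_late = select(lambda t: t["E.Birthdate"] > "1957-12-31", EMPLOYEE)
candidates = cross(on_aquarius, born_late)
optimized = project(["E.EmpName"], select(
    lambda t: t["E.EmpNo"] == t["W.EmpNo"], candidates))

print(canonical == optimized, len(everything), len(candidates))
```

Both plans return the same answer, but the largest intermediate result shrinks from `len(everything)` tuples to `len(candidates)` tuples, which is the point of pushing selections down and reordering the joins.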
Transformation Rules
1. Cascade of σ: A conjunctive selection condition can be broken up into a cascade (sequence) of individual σ operations:
• σ_{c1 AND c2 AND ... AND cn}(R) ≡ σ_{c1}(σ_{c2}(...(σ_{cn}(R))...))
2. Commutativity of σ:
• σ_{c1}(σ_{c2}(R)) ≡ σ_{c2}(σ_{c1}(R))
3. Cascade of ∏:
• ∏_{List1}(∏_{List2}(...(∏_{Listn}(R))...)) ≡ ∏_{List1}(R)
if List1 is included in List2…Listn; result is null if List1 is not in any of List2…Listn
4. Commuting σ with ∏: if the selection condition c involves only attributes that are in the projection list List1
• ∏_{List1}(σ_{c}(R)) ≡ σ_{c}(∏_{List1}(R))
5. Commutativity of JOIN or ×:
• R ►◄ S ≡ S ►◄ R
• R × S ≡ S × R
6. Commuting σ with JOIN: if all the attributes in the selection condition c involve only the attributes of one of the relations being joined, say, R
• σ_{c}(R ►◄ S) ≡ (σ_{c}(R)) ►◄ S
7. Commuting ∏ with JOIN: if List can be separated into List1 and List2 involving only attributes from R and S, respectively, and the join condition c involves only attributes in List:
• ∏_{List}(R ►◄_{c} S) ≡ (∏_{List1}(R)) ►◄_{c} (∏_{List2}(S))
8. Commuting set operations: ∪ and ∩ are commutative
9. JOIN, ×, ∪, ∩ are associative
10. σ distributes over ∪, ∩, −
• σ_{c}(R ∪ S) ≡ σ_{c}(R) ∪ σ_{c}(S)
11. ∏ distributes over ∪
• ∏_{List}(R ∪ S) ≡ ∏_{List}(R) ∪ ∏_{List}(S)
Heuristic Algebraic Optimization
• Use rule 1 to break up any σ operation with conjunctive conditions into a sequence of σ operations
• Use rules 2, 4, 6, and 10 concerning commutativity of σ with other operations to move each σ operation as far down the query tree as possible, based on the attributes in the σ condition
• Use rule 9 concerning associativity of binary operations to rearrange the leaf nodes of the tree so that the leaf node relations with the most restrictive σ operations are executed first
• Combine sequences of Cartesian product and σ operation representing a join condition into single JOIN operations
• Use rules 3, 4, 7, and 11 concerning the cascading of ∏ and commuting ∏ with other operations to break down a ∏ and move the projection attributes down the tree as far as possible
• Identify subtrees that represent groups of operations that can be executed by a single algorithm (select/join followed by project)
Pipelined Evaluation
• Motivation
– A query is mapped into a sequence of operations.
– Each execution of an operation produces a temporary result.
– Generating and saving temporary files on disk is time consuming and expensive.
• Alternative:
– Avoid constructing temporary results as much as possible.
– Pipeline the data through multiple operations: pass the result of a previous operator to the next without waiting to complete the previous operation.
Pipelined Evaluation
• The result of one operator is sometimes pipelined to another operator without creating a temporary table to hold the intermediate result
• The output of R ►◄S is pipelined into the selections & projections that follow
• Cost of writing out the intermediate result & reading it back in can be significant
• Temporary table: Materialized Tuples
Pipelined Evaluation
• Consider a selection query in which only a part of the selection condition matches an index
• 2 instances of selection operator
– Matching (primary) part of the selection condition
– Rest
• Pipelining: apply the second selection to each tuple in the result of the primary selection as it is produced, adding tuples that qualify to the final result
• When the input to a unary operator is pipelined into it, we say that the operator is applied on-the-fly
Pipelined Evaluation
• Result tuples of first join pipelined into join with C
• Conceptually, the evaluation is initiated from the root, & the node joining A & B produces tuples as and when they are requested from their parent node
Join tree: (A ►◄ B) ►◄ C
Estimation of the Size of Joins
• The Cartesian product r × s contains nr * ns tuples; each tuple occupies sr + ss bytes.
• If R ∩ S = ∅, then r ►◄ s is the same as r × s.
• If R ∩ S is a key for R, then a tuple of s will join with at most one tuple from r; therefore, the number of tuples in r ►◄ s is no greater than the number of tuples in s.
• If R ∩ S in S is a foreign key in S referencing R, then the number of tuples in r ►◄ s is exactly the same as the number of tuples in s.
• The case for R ∩ S being a foreign key referencing S is symmetric.
Example of Size Estimation
• In the example query depositor ►◄ customer, customer-name in depositor is a foreign key of customer; hence, the result has exactly ndepositor tuples, which is 5000.
• Data: R = Customer, S = Depositor
ncustomer = 10,000
fcustomer = 25
bcustomer = 10000/25 = 400
ndepositor = 5,000
fdepositor = 50
bdepositor = 5000/50 = 100
Estimation of the size of Joins
• If R ∩ S = {A} is not a key for R or S:
If we assume that every tuple t in R produces tuples in R ►◄ S, the number of tuples in R ►◄ S is estimated to be:
(nr * ns) / V(A, s)
where V(A, s) is the number of distinct values of A in s, so each tuple of r matches about ns / V(A, s) tuples of s
• If the reverse is true, the estimate obtained will be:
(nr * ns) / V(A, r)
• The lower of these two estimates is probably the more accurate one.
Estimation of the size of Joins
• Compute the size estimates for depositor ►◄ customer without using information about foreign keys:
– ncustomer = 10,000
ndepositor = 5,000
V(customer-name, depositor) = 2500
V(customer-name, customer) = 10000
– The two estimates are 5000 * 10000/2500 = 20,000 and 5000 * 10000/10000 = 5000
– We choose the lower estimate, which, in this case, is the same as our earlier computation using foreign keys.
• The depositor relation has 5,000 tuples but only 2,500 distinct depositors, so each depositor has two accounts on average
• customer-name is unique in customer
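The two estimates above can be reproduced with a short Python function. The function name and argument names are made up for this sketch; the numbers are the ones given on the slide.

```python
def join_size_estimates(n_r, n_s, v_a_r, v_a_s):
    """Return both estimates for |r JOIN s| on attribute A and the lower of the two.

    v_a_r and v_a_s are V(A, r) and V(A, s), the distinct-value counts of A.
    """
    using_s = n_r * n_s / v_a_s   # every tuple of r matches ns / V(A, s) tuples of s
    using_r = n_r * n_s / v_a_r   # every tuple of s matches nr / V(A, r) tuples of r
    return using_s, using_r, min(using_s, using_r)

# r = customer, s = depositor, A = customer-name
estimates = join_size_estimates(n_r=10_000, n_s=5_000, v_a_r=10_000, v_a_s=2_500)
print(estimates)
```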
Nested-Loop Join
• Compute the theta join, r ►◄θ s
for each tuple tr in r do begin
  for each tuple ts in s do begin
    test pair (tr, ts) to see if they satisfy the join condition θ
    if they do, add tr · ts to the result.
  end
end
• r is called the outer relation and s the inner relation of the join.
• Requires no indices and can be used with any kind of join condition.
• Expensive since it examines every pair of tuples in the two relations.
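The pseudocode above translates almost line for line into Python. In this sketch relations are plain lists of tuples held in memory and the join condition is passed in as a function; the sample tuples are invented.

```python
def nested_loop_join(r, s, theta):
    """Yield tr + ts for every pair (tr, ts) that satisfies the join condition theta.

    r is the outer relation and s the inner one; s is re-scanned once per tuple of r,
    so it must be re-iterable (a list here).
    """
    for t_r in r:
        for t_s in s:
            if theta(t_r, t_s):
                yield t_r + t_s

# Hypothetical sample data: depositor(customer_name, account), customer(customer_name, city)
depositor = [("Hayes", "A-102"), ("Jones", "A-217"), ("Hayes", "A-305")]
customer = [("Hayes", "Harrison"), ("Jones", "Brooklyn"), ("Smith", "Rye")]

result = list(nested_loop_join(depositor, customer, lambda d, c: d[0] == c[0]))
print(result)
```

Because any Python function can serve as `theta`, the same loop works for inequality or other non-equality conditions, which is what "can be used with any kind of join condition" refers to.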
Cost of Nested-Loop Join
• If there is enough memory to hold only one block of each relation, the estimated cost is nr * bs + br disk accesses
– br disk accesses to load r into the buffer
– For each tuple in r, s has to be read into the buffer: bs disk accesses
– br = no. of blocks in r, bs = no. of blocks in s
• If the smaller relation fits entirely in memory, use it as the inner relation. This reduces the cost estimate to br + bs disk accesses.
– br + bs is the minimum possible cost to read r and s once
– Putting both relations in memory won't reduce the cost further
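Plugging the customer and depositor statistics from the size-estimation example into these formulas shows how much the choice of outer relation matters. The function below is only a sketch of the two formulas on this slide; its name and parameters are assumptions.

```python
def nested_loop_cost(n_outer, b_outer, b_inner, inner_fits_in_memory=False):
    """Disk accesses for a nested-loop join.

    Worst case (one buffer block per relation): nr * bs + br.
    If the inner relation fits in memory: br + bs.
    """
    if inner_fits_in_memory:
        return b_outer + b_inner
    return n_outer * b_inner + b_outer

# customer: 10,000 tuples in 400 blocks; depositor: 5,000 tuples in 100 blocks
depositor_outer = nested_loop_cost(n_outer=5_000, b_outer=100, b_inner=400)
customer_outer = nested_loop_cost(n_outer=10_000, b_outer=400, b_inner=100)
inner_in_memory = nested_loop_cost(10_000, 400, 100, inner_fits_in_memory=True)

print(depositor_outer, customer_outer, inner_in_memory)
```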
[Figure: Pictorial Depiction of Equivalence Rules]
Evaluation of Expressions
• Alternatives for evaluating an entire expression tree
– Materialization: generate results of an expression whose inputs
are relations or are already computed, materialize (store) it on
disk.
– Pipelining: pass on tuples to parent operations even as an
operation is being executed
Materialization
• Materialized evaluation: evaluate one operation at a time, starting at the lowest level. Use intermediate results materialized into temporary relations to evaluate next-level operations.
• E.g., for the expression ∏_{name}(σ_{building="Watson"}(department) ►◄ instructor), compute and store σ_{building="Watson"}(department), then compute and store its join with instructor, and finally compute the projection on name.
Materialization (Cont.)
• Materialized evaluation is always applicable
• Cost of writing results to disk and reading them back can be quite
high
– Our cost formulas for operations ignore cost of writing results to
disk, so
• Overall cost = Sum of costs of individual operations +
cost of writing intermediate results to disk
• Double buffering: use two output buffers for each operation, when
one is full write it to disk while the other is getting filled
– Allows overlap of disk writes with computation and reduces
execution time
Pipelining
• Pipelined evaluation: evaluate several operations simultaneously, passing the results of one operation on to the next.
• E.g., in the previous expression tree, don't store the result of σ_{building="Watson"}(department)
– instead, pass tuples directly to the join. Similarly, don't store the result of the join, pass tuples directly to the projection.
• Much cheaper than materialization: no need to store a temporary relation to disk.
• Pipelining may not always be possible, e.g., sort, hash-join.
• For pipelining to be effective, use evaluation algorithms that generate output tuples even as tuples are received for inputs to the operation.
• Pipelines can be executed in two ways: demand driven and producer driven
Pipelining
• In demand-driven or lazy evaluation
– system repeatedly requests next tuple from top-level operation
– Each operation requests next tuple from children operations as required, in order to output its next tuple
– In between calls, operation has to maintain "state" so it knows what to return next
• In producer-driven or eager pipelining
– Operators produce tuples eagerly and pass them up to their
parents
• Buffer maintained between operators, child puts tuples in
buffer, parent removes tuples from buffer
• if buffer is full, child waits till there is space in the buffer, and
then generates more tuples
– System schedules operations that have space in output buffer
and can process more input tuples
• Alternative name: pull and push models of pipelining
Pipelining (Cont.)
• Implementation of demand-driven pipelining
– Each operation is implemented as an iterator implementing the following operations
• open()
– E.g. file scan: initialize file scan
» state: pointer to beginning of file
– E.g. merge join: sort relations;
» state: pointers to beginning of sorted relations
• next()
– E.g. for file scan: Output next tuple, and advance and store file pointer
– E.g. for merge join: continue with merge from earlier state till next output tuple is found. Save pointers as iterator state.
• close()
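The open()/next()/close() interface can be sketched in Python as follows. The class names, the in-memory "file", and the sample account rows are all assumptions for illustration; the plan at the bottom corresponds to the earlier query select balance from account where balance<2500.

```python
class FileScan:
    """Leaf iterator over an in-memory list standing in for a file."""
    def __init__(self, rows):
        self.rows = rows
        self.pos = None
    def open(self):
        self.pos = 0                      # state: pointer to the beginning of the file
    def next(self):
        if self.pos >= len(self.rows):
            return None                   # end of input
        row = self.rows[self.pos]
        self.pos += 1                     # advance and store the file pointer
        return row
    def close(self):
        self.pos = None

class Select:
    """Applies a predicate on the fly to tuples pulled from its child."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def open(self):
        self.child.open()
    def next(self):
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None
    def close(self):
        self.child.close()

class Project:
    """Keeps only the listed column positions of each tuple pulled from its child."""
    def __init__(self, child, columns):
        self.child, self.columns = child, columns
    def open(self):
        self.child.open()
    def next(self):
        row = self.child.next()
        return None if row is None else tuple(row[i] for i in self.columns)
    def close(self):
        self.child.close()

def run(plan):
    """Demand-driven execution: repeatedly request the next tuple from the root."""
    plan.open()
    out = []
    while (row := plan.next()) is not None:
        out.append(row)
    plan.close()
    return out

# Hypothetical account(account_number, balance) rows.
account = [("A-101", 500), ("A-102", 3000), ("A-103", 2499)]
plan = Project(Select(FileScan(account), lambda t: t[1] < 2500), columns=[1])
print(run(plan))
```

No intermediate relation is ever built: each call to `next()` on the root pulls one tuple at a time through the selection from the scan.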
Evaluation Algorithms for Pipelining
• Some algorithms are not able to output results even as they get input tuples
– E.g. merge join, or hash join
– intermediate results written to disk and then read back
• Algorithm variants to generate (at least some) results on the fly, as
input tuples are read in
– E.g. hybrid hash join generates output tuples even as probe relation
tuples in the in-memory partition (partition 0) are read in
– Double-pipelined join technique: Hybrid hash join, modified to
buffer partition 0 tuples of both relations in-memory, reading them
as they become available, and output results of any matches
between partition 0 tuples
• When a new r0 tuple is found, match it with existing s0 tuples,
output matches, and save it in r0
• Symmetrically for s0 tuples
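The matching logic for partition 0 can be sketched as a small generator. This is a simplification under stated assumptions: everything is kept in memory, the disk-resident partitions of the hybrid hash join are ignored, and the input is modelled as a single stream of ('r', tuple) or ('s', tuple) arrivals.

```python
from collections import defaultdict

def double_pipelined_join(arrivals, key_r, key_s):
    """Emit join results as soon as a matching pair of tuples has arrived.

    arrivals: iterable of ('r', tuple) or ('s', tuple) in arrival order.
    key_r / key_s: functions extracting the join key from a tuple of r or s.
    """
    r0 = defaultdict(list)   # in-memory hash table of r tuples seen so far
    s0 = defaultdict(list)   # in-memory hash table of s tuples seen so far
    for side, t in arrivals:
        if side == "r":
            k = key_r(t)
            for match in s0.get(k, []):   # match the new r tuple with existing s tuples
                yield (t, match)
            r0[k].append(t)               # then save it in r0
        else:
            k = key_s(t)
            for match in r0.get(k, []):   # symmetrically for s tuples
                yield (match, t)
            s0[k].append(t)

# Hypothetical interleaved arrival order.
stream = [("r", (1, "a")), ("s", (1, "x")), ("s", (2, "y")), ("r", (2, "b")), ("r", (1, "c"))]
joined = list(double_pipelined_join(stream, key_r=lambda t: t[0], key_s=lambda t: t[0]))
print(joined)
```

Each matching pair is output exactly once, at the moment the later of its two tuples arrives.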
Pipelined Evaluation
• Motivation– A query is mapped into a sequence of operations.
– Each execution of an operation produces a temporary result.
– Generating and saving temporary files on disk is time consuming and expensive.
• Alternative:– Avoid constructing temporary results as much as
possible.
– Pipeline the data through multiple operations - pass the result of a previous operator to the next without waiting to complete the previous operation.
Pipelined Evaluation
• The result of one operator is sometimes pipelined to another operator without creating a temporary table to hold the intermediate result
• The output of R ►◄S is pipelined into the selections & projections that follow
• Cost of writing out the intermediate result & reading it back in can be significant
• Temporary table: Materialized Tuples
5/2/2011
19
Pipelined Evaluation
• Consider a selection query in which only a part of the selection condition matches an index
• 2 instances of selection operator– Matching (primary) part of the selection condition
– Rest
• Pipelining: apply the second selection to each tuple in the result of the primary selection as it is produced & adding tuples that qualify to the final result
• When the input to a unary operator is pipelined into it, we say that the operator is applied on-the-fly
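A minimal sketch of an on-the-fly selection, using Python generators so that each tuple flows through both selections without an intermediate table. The index is modelled as a plain dictionary, and the relation, attribute names and data are hypothetical, chosen only for illustration.

```python
def index_select(index, key):
    """Primary selection: yield the tuples found through an index lookup."""
    yield from index.get(key, [])

def select_on_the_fly(tuples, predicate):
    """Residual selection, applied to each tuple as it is produced."""
    for t in tuples:
        if predicate(t):
            yield t

# Hypothetical index on dept; tuples are (name, dept, salary).
dept_index = {
    "sales": [("ann", "sales", 50), ("bob", "sales", 70)],
    "ops": [("cy", "ops", 65)],
}

# Condition: dept = 'sales' AND salary > 60.
# Only the dept part matches the index; the salary part is the rest.
primary = index_select(dept_index, "sales")
result = list(select_on_the_fly(primary, lambda t: t[2] > 60))
```

Nothing is materialized between the two selections: the second generator pulls one tuple at a time from the first.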
Pipelined Evaluation
• Result tuples of first join pipelined into join with C
• Conceptually, the evaluation is initiated from the root, & the node joining A & B produces tuples as and when they are requested by its parent node
[Figure: join tree for (A ►◄ B) ►◄ C]
Statistical Information for Cost Estimation
• nr: number of tuples in a relation r.
• br: number of blocks containing tuples of r.
• lr: size of a tuple of r.
• fr: blocking factor of r — i.e., the number of tuples of
r that fit into one block.
• V(A, r): number of distinct values that appear in r for attribute A; same as the size of $\Pi_A(r)$.
• If tuples of r are stored together physically in a file, then: $b_r = \lceil n_r / f_r \rceil$
Histograms
• Histogram on attribute age of relation person
• Equi-width histograms
• Equi-depth histograms
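The difference between the two histogram types can be sketched as follows. This is an illustration under assumptions: the function names and the sample ages are hypothetical, and the equi-width version assumes the values are not all identical (otherwise the bucket width would be zero).

```python
def equi_width_histogram(values, num_buckets):
    """Buckets cover equal-sized value ranges; the counts per bucket vary."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets  # assumes hi > lo
    counts = [0] * num_buckets
    for v in values:
        # Clamp so that the maximum value falls into the last bucket.
        i = min(int((v - lo) / width), num_buckets - 1)
        counts[i] += 1
    return counts

def equi_depth_boundaries(values, num_buckets):
    """Boundaries chosen so each bucket holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // num_buckets] for i in range(1, num_buckets)]

# Hypothetical ages from a person relation, skewed towards young people.
ages = [20, 21, 22, 23, 24, 25, 30, 40, 60, 80]
width_counts = equi_width_histogram(ages, 3)
depth_bounds = equi_depth_boundaries(ages, 3)
```

With skewed data, the equi-width buckets end up with very uneven counts, while the equi-depth boundaries crowd together where the values are dense.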
Estimation of the Size of Joins
• The Cartesian product r × s contains nr * ns tuples; each tuple occupies sr + ss bytes (the tuple sizes of r and s).
• If R ∩ S = ∅, then r ►◄ s is the same as r × s.
• If R ∩ S is a key for R, then a tuple of s will join with at most one tuple from r; therefore, the number of tuples in r ►◄ s is no greater than the number of tuples in s.
• If R ∩ S is a foreign key in S referencing R, then the number of tuples in r ►◄ s is exactly the same as the number of tuples in s.
• The case for R ∩ S being a foreign key referencing S is symmetric.
[Figure: matching tuples between R and S]
Example of Size Estimation
• In the example query depositor ►◄ customer, customer-name in depositor is a foreign key referencing customer; hence, the result has exactly ndepositor tuples, which is 5000.
• Data: R = Customer, S = Depositor
ncustomer = 10,000
fcustomer = 25
bcustomer = 10000/25 = 400
ndepositor = 5,000
fdepositor = 50
bdepositor = 5000/50 = 100
Estimation of the size of Joins
• If R ∩ S = {A} is not a key for R or S:
If we assume that every tuple t in R produces tuples in R ►◄ S, the number of tuples in R ►◄ S is estimated to be:
$\frac{n_r \cdot n_s}{V(A, s)}$
• If the reverse is true, the estimates obtained will be:
$\frac{n_r \cdot n_s}{V(A, r)}$
• The lower of these two estimates is probably the more accurate one.
[Figure: each tuple of R matches about $n_s / V(A, s)$ tuples of S, where V(A, s) is the number of distinct values of A in s]
Estimation of the size of Joins
• Compute the size estimates for depositor ►◄ customer without using information about foreign keys:
– ncustomer = 10,000
ndepositor = 5,000
V(customer-name, depositor) = 2500
V(customer-name, customer) = 10000
– The two estimates are 5000 * 10000/2500 = 20,000 and 5000 * 10000/10000 = 5000
– We choose the lower estimate, which, in this case, is the same as our earlier computation using foreign keys.
– Note: the depositor relation has 5,000 tuples but only 2,500 distinct depositors, so each depositor has two accounts on average.
– Note: customer-name is unique in customer.
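The two estimates and the choice of the lower one can be reproduced with a short sketch. The function name estimate_join_size is an assumption for illustration; the numbers are the depositor and customer statistics from the example.

```python
def estimate_join_size(n_r, n_s, v_a_r, v_a_s):
    """Estimate |r join s| on attribute A when A is not known to be a key.

    Computes n_r * n_s / V(A, s) and n_r * n_s / V(A, r) and returns
    the lower of the two, as suggested in the slides.
    """
    estimate_via_s = n_r * n_s / v_a_s
    estimate_via_r = n_r * n_s / v_a_r
    return min(estimate_via_s, estimate_via_r)

# r = depositor, s = customer, A = customer-name
size = estimate_join_size(n_r=5000, n_s=10000, v_a_r=2500, v_a_s=10000)
```

Here the two candidates are 20,000 and 5,000, so the function returns 5,000.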
Nested-Loop Join
• Compute the theta join, r ►◄θ s
for each tuple tr in r do begin
    for each tuple ts in s do begin
        test pair (tr, ts) to see if they satisfy the join condition θ
        if they do, add tr · ts to the result.
    end
end
• r is called the outer relation and s the inner relation of the join.
• Requires no indices and can be used with any kind of join condition.
• Expensive since it examines every pair of tuples in the two relations.
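The pseudocode above translates almost line for line into Python. This is a tuple-at-a-time sketch over in-memory lists, so it shows the logic but not the disk accesses; the function name and sample data are hypothetical.

```python
def nested_loop_join(r, s, theta):
    """Theta join of r (outer) and s (inner) by examining every pair of tuples."""
    result = []
    for t_r in r:                 # outer relation
        for t_s in s:             # inner relation, rescanned for each t_r
            if theta(t_r, t_s):   # any join condition works, no index needed
                result.append(t_r + t_s)  # concatenation tr · ts
    return result

# Hypothetical data: depositor(customer_name, account), customer(customer_name, city)
depositor = [("alice", 101), ("alice", 102), ("carol", 103)]
customer = [("alice", "NYC"), ("bob", "LA")]
joined = nested_loop_join(depositor, customer, lambda d, c: d[0] == c[0])
```

Because theta is an arbitrary function of the two tuples, the same code handles equality, inequality or range conditions.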
Cost of Nested-Loop Join
• If there is enough memory to hold only one block of each relation, the estimated cost is nr * bs + br disk accesses
– br disk accesses to load r into the buffer (br = no. of blocks in r)
– For each tuple in r, s has to be read into the buffer: bs disk accesses (bs = no. of blocks in s)
• If the smaller relation fits entirely in memory, use it as the inner relation. This reduces the cost estimate to br + bs disk accesses.
– br + bs is the minimum possible cost to read R and S once
– Putting both relations in memory won’t reduce the cost further
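As a worked example, the cost formula can be applied to the depositor and customer statistics given earlier (ndepositor = 5,000, bdepositor = 100, ncustomer = 10,000, bcustomer = 400). The function name and its parameters are assumptions for this sketch.

```python
def nested_loop_cost(n_outer, b_outer, b_inner, inner_fits_in_memory=False):
    """Estimated disk accesses for a nested-loop join.

    Worst case (one buffer block per relation): n_outer * b_inner + b_outer.
    If the inner relation fits in memory: b_outer + b_inner.
    """
    if inner_fits_in_memory:
        return b_outer + b_inner
    return n_outer * b_inner + b_outer

# depositor as the outer relation, customer as the inner relation
cost_depositor_outer = nested_loop_cost(n_outer=5000, b_outer=100, b_inner=400)
# customer as the outer relation, depositor as the inner relation
cost_customer_outer = nested_loop_cost(n_outer=10000, b_outer=400, b_inner=100)
# smaller relation (depositor) held entirely in memory as the inner relation
cost_in_memory = nested_loop_cost(10000, 400, 100, inner_fits_in_memory=True)
```

The three results are 2,000,100, 1,000,400 and 500 disk accesses, which shows how much the choice of outer relation and the available memory matter.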
Selection Size Estimation
• $\sigma_{A=v}(r)$
• nr / V(A,r) : number of records that will satisfy the selection
• Equality condition on a key attribute: size estimate = 1
• $\sigma_{A \le v}(r)$ (case of $\sigma_{A \ge v}(r)$ is symmetric)
– Let c denote the estimated number of tuples satisfying the condition.
– If min(A,r) and max(A,r) are available in catalog
• c = 0 if v < min(A,r)
• c = $n_r \cdot \frac{v - \min(A,r)}{\max(A,r) - \min(A,r)}$ otherwise
– If histograms available, can refine above estimate
– In absence of statistical information c is assumed to be nr / 2.
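The selection estimates can be written as two small functions. This is a sketch under assumptions: the function names are hypothetical, values are assumed to be spread uniformly between min(A,r) and max(A,r), and the cap at nr for v at or above the maximum is a guard added here, not something stated on the slide.

```python
def estimate_equality(n_r, v_a_r):
    """Estimated size of an equality selection A = v: n_r / V(A, r)."""
    return n_r / v_a_r

def estimate_less_equal(n_r, v, min_a=None, max_a=None):
    """Estimated size of a range selection A <= v."""
    if min_a is None or max_a is None:
        return n_r / 2   # no statistics available
    if v < min_a:
        return 0
    if v >= max_a:
        return n_r       # added guard: the estimate cannot exceed n_r
    return n_r * (v - min_a) / (max_a - min_a)

# Hypothetical relation: 10,000 tuples, A ranges from 0 to 100 with 50 distinct values.
eq_size = estimate_equality(10000, 50)
range_size = estimate_less_equal(10000, 25, min_a=0, max_a=100)
```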
Size Estimation of Complex Selections
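• The selectivity of a condition $\theta_i$ is the probability that a tuple in the relation r satisfies $\theta_i$.
– If si is the number of satisfying tuples in r, the selectivity of $\theta_i$ is given by si / nr.
• Conjunction: $\sigma_{\theta_1 \wedge \theta_2 \wedge \dots \wedge \theta_n}(r)$. Assuming independence, the estimate of tuples in the result is:
$n_r \cdot \frac{s_1 \cdot s_2 \cdots s_n}{n_r^n}$
• Disjunction: $\sigma_{\theta_1 \vee \theta_2 \vee \dots \vee \theta_n}(r)$. Estimated number of tuples:
$n_r \cdot \left(1 - \left(1 - \frac{s_1}{n_r}\right)\left(1 - \frac{s_2}{n_r}\right) \cdots \left(1 - \frac{s_n}{n_r}\right)\right)$
• Negation: $\sigma_{\neg\theta}(r)$. Estimated number of tuples:
$n_r - \mathrm{size}(\sigma_{\theta}(r))$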
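A small sketch of the three estimates, assuming the individual conditions are independent. The function names and the sample numbers are hypothetical; sizes holds the si values, i.e. the number of tuples satisfying each condition on its own.

```python
import math

def conjunction_size(n_r, sizes):
    """n_r times the product of the individual selectivities s_i / n_r."""
    return n_r * math.prod(s / n_r for s in sizes)

def disjunction_size(n_r, sizes):
    """n_r times the probability that at least one condition holds."""
    return n_r * (1 - math.prod(1 - s / n_r for s in sizes))

def negation_size(n_r, size):
    """Tuples that do not satisfy the condition: n_r - size."""
    return n_r - size

# Hypothetical relation of 1,000 tuples; two conditions satisfied by 100 and 500 tuples.
both = conjunction_size(1000, [100, 500])    # about 50
either = disjunction_size(1000, [100, 500])  # about 550
neither_first = negation_size(1000, 100)     # 900
```

The results are floating-point values, so they should be compared with a tolerance rather than with exact equality.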
Heuristic Optimization
• Cost-based optimization is expensive
• Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion.
• Heuristic optimization transforms the query-tree by using a set of rules that typically (but not in all cases) improve execution performance:
– Perform selection early (reduces the number of tuples)
– Perform projection early (reduces the number of attributes)
– Perform most restrictive selection and join operations before other similar operations.
• Some systems use only heuristics, others combine heuristics with partial cost-based optimization.
Heuristic Optimization
Perform selection operations as early as possible
– A heuristic optimizer would use this rule without finding out whether the cost is reduced by this transformation
– Does it always work?
– Consider this:
σθ (A ►◄B)
Heuristic Optimization
Perform selection operations as early as possible
σθ (A ►◄B)
– Condition θ only refers to attributes in B
– Selection can definitely be performed before the join
– A is extremely small as compared to B
– Index on the join attribute of B
– No index on the attributes used by θ
– Is it a good idea to push the selection before the join?
Heuristic Optimization
Perform selection operations as early as possible
σθ (A ►◄B)
– Performing the selection early, i.e. directly on B, would require a scan of all tuples in B
– It is probably cheaper to compute the join using the index and then reject the tuples that fail the selection
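A rough back-of-the-envelope comparison makes the point concrete. Everything here is an assumption for illustration: the sizes of A and B, the cost of one index lookup, and the simplified cost model (an indexed nested-loop join costs the blocks of A plus one index lookup per tuple of A, and the on-the-fly selection adds no disk accesses). The first function gives only a lower bound, since it ignores the work of the join itself.

```python
def cost_select_then_join(b_a, b_b):
    """Lower bound when the selection is pushed down: scan all of B, read A."""
    return b_b + b_a

def cost_index_join_then_select(b_a, n_a, lookup_cost):
    """Read A once, probe the index on B for each tuple of A,
    then apply the selection to the join result on the fly."""
    return b_a + n_a * lookup_cost

# Hypothetical sizes: A is tiny, B is huge.
b_a, n_a = 4, 100   # A: 100 tuples in 4 blocks
b_b = 100_000       # B: 100,000 blocks
lookup_cost = 4     # assumed disk accesses per index lookup on B

pushed_down = cost_select_then_join(b_a, b_b)
join_first = cost_index_join_then_select(b_a, n_a, lookup_cost)
```

Under these assumed numbers, pushing the selection down costs at least 100,004 disk accesses, against 404 for joining first, so the heuristic does not pay off in this case.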
Heuristic Optimization
Perform projection operations as early as possible
– Projection operation, like the selection operation, reduces the size of relations
– Whenever we need to generate a temporary relation, it is advantageous to apply immediately any projections that are possible
Heuristic Optimization
Perform selections earlier than projections
– Selections have the potential of reducing the size of a relation greatly
– Selections enable the use of indices to access tuples
Heuristic Optimization
– Heuristics reorder an initial query-tree representation in such a way that the operations that reduce the size of the intermediate results are applied first
– Early selections reduce the number of tuples
– Early projections reduce the number of attributes
– Heuristic transformations also restructure the tree so that the system performs the most restrictive selection and join operations before other similar operations
SYSTEM R Optimizer
Current relational query optimizers have been greatly influenced by choices made in the design of IBM’s System R query optimizer:
– Use of statistics about DB instance to
estimate the cost of a QEP
– Consider only plans with binary joins in which
the inner relation is a base relation
• This heuristic greatly reduces the no. of alternative
plans that must be considered
SYSTEM R Optimizer
– Focus optimization on the class of SQL queries without nesting & treat nested queries in a relatively ad-hoc way
– Do not perform duplicate elimination for projections except as a final step when required by a DISTINCT clause
– Avoid Cartesian products
– Use a cost model that accounts for CPU costs as well as I/O costs
– Consider only left-deep plans
Left-Deep Plans
Left Deep Join Trees
• In left-deep join trees, the right-hand-side input