Query Processing Presented by Aung S. Win

Feb 25, 2016

Page 1: Query Processing

Query Processing

Presented by

Aung S. Win

Page 2: Query Processing

Objectives

- Query processing and optimization.
- Static versus dynamic query optimization.
- How a query is decomposed and semantically analyzed.
- How to create a relational algebra tree to represent a query.
- The rules of equivalence for the relational algebra operations.

Page 3: Query Processing

(Cont.)

- Heuristic transformation rules.
- The types of database statistics required to estimate the cost of operations.
- The different strategies for implementing the relational algebra operations.
- The difference between materialization and pipelining.
- The advantages of left-deep trees.

Page 4: Query Processing

Query Processing

The activities involved in retrieving data from the database.

The aims of query processing are:
(1) to transform a query written in a high-level language into a correct and efficient execution strategy expressed in a low-level language;
(2) to execute the strategy to retrieve the required data.

Page 5: Query Processing

(Cont.)

Query processing can be divided into four main phases: decomposition, optimization, code generation, and execution.

Page 6: Query Processing

Query Decomposition

The aims of query decomposition are:
(1) to transform a high-level query into a relational algebra query;
(2) to check that the query is syntactically and semantically correct.

Page 7: Query Processing

(Cont.)

The typical stages of query decomposition are analysis, normalization, semantic analysis, simplification, and query restructuring.

Page 8: Query Processing

Analysis

The query is lexically and syntactically analyzed using the techniques of programming language compilers.

Verifies that the relations and attributes specified in the query are defined in the system catalog.

Verifies that any operations applied to database objects are appropriate for the object type.

Page 9: Query Processing

(Cont.)

On completion of the analysis, the high-level query has been transformed into some internal representation (query tree) that is more suitable for processing.

[Figure: a query tree, showing the root, the intermediate operations, and the leaves]
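
A query tree of this kind can be sketched in code as nested nodes. The following is a minimal illustration, not the representation used by any particular DBMS. The class names `Relation` and `Operation`, the helper functions, and the sample Staff/Branch query are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Relation:
    """Leaf node: a base relation named in the query."""
    name: str


@dataclass
class Operation:
    """Non-leaf node: a relational algebra operation applied to its children."""
    op: str                      # e.g. "SELECT", "PROJECT", "JOIN"
    detail: str                  # predicate or attribute list
    children: List["Node"] = field(default_factory=list)


Node = Union[Relation, Operation]


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, root first."""
    pad = "  " * depth
    if isinstance(node, Relation):
        return [f"{pad}{node.name}"]
    lines = [f"{pad}{node.op}[{node.detail}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def leaves(node: Node) -> List[str]:
    """Collect the base relations at the leaves, left to right."""
    if isinstance(node, Relation):
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


# Hypothetical query: names of staff who work at branches in London.
tree = Operation("PROJECT", "staffName", [
    Operation("SELECT", "city = 'London'", [
        Operation("JOIN", "Staff.branchNo = Branch.branchNo",
                  [Relation("Staff"), Relation("Branch")]),
    ]),
])

print("\n".join(render(tree)))
```

The root is the final Projection, the Selection and Join are the intermediate operations, and `leaves(tree)` returns the two base relations.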

Page 10: Query Processing

Normalization

Converts the query into a normalized form that can be more easily manipulated.

There are two different normal forms, conjunctive normal form and disjunctive normal form.

Page 11: Query Processing

Conjunctive normal form

A sequence of conjuncts that are connected with the AND (∧) operator. Each conjunct contains one or more terms connected by the OR (∨) operator.

For example:

(position = 'Manager' ∨ salary > 20000) ∧ branchNo = 'B003'

Page 12: Query Processing

Disjunctive normal form

A sequence of disjuncts that are connected with the OR (∨) operator. Each disjunct contains one or more terms connected by the AND (∧) operator.

For example:

(position = 'Manager' ∧ branchNo = 'B003') ∨ (salary > 20000 ∧ branchNo = 'B003')
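
The conjunctive and disjunctive examples describe the same condition: distributing AND over OR turns one into the other. As a quick check, the sketch below treats the three simple terms as booleans and compares both forms over every combination. The function and parameter names are made up for this illustration.

```python
from itertools import product


def cnf(is_manager: bool, high_salary: bool, at_b003: bool) -> bool:
    # (position = 'Manager' OR salary > 20000) AND branchNo = 'B003'
    return (is_manager or high_salary) and at_b003


def dnf(is_manager: bool, high_salary: bool, at_b003: bool) -> bool:
    # (position = 'Manager' AND branchNo = 'B003') OR
    # (salary > 20000 AND branchNo = 'B003')
    return (is_manager and at_b003) or (high_salary and at_b003)


# Try every combination of truth values for the three simple terms.
equivalent = all(
    cnf(*values) == dnf(*values)
    for values in product([False, True], repeat=3)
)
print(equivalent)
```

An exhaustive truth table is only practical for a handful of terms, but it is enough to confirm that the two normal forms here are interchangeable.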

Page 13: Query Processing

Semantic analysis

The objective is to reject normalized queries that are incorrectly formulated or contradictory.

Page 14: Query Processing

Simplification

The objectives are to detect redundant qualifications, eliminate common subexpressions, and transform the query to a semantically equivalent but more easily and efficiently computed form.

Access restrictions, view definitions, and integrity constraints are considered at this stage.

Page 15: Query Processing

Query restructuring

The final stage of query decomposition. The query is restructured to provide a more efficient implementation.

Page 16: Query Processing

Query optimization

The activity of choosing an efficient execution strategy for processing a query.

An important aspect of query processing is query optimization.

The aim of query optimization is to choose, from the equivalent execution strategies, the one that minimizes resource usage.

Page 17: Query Processing

(Cont.)

Every method of query optimization depends on database statistics.

The statistics cover information about relations, attributes, and indexes.

Keeping the statistics current can be problematic. If the DBMS updates the statistics every time a tuple is inserted, updated, or deleted, this would have a significant impact on performance during peak periods.

Page 18: Query Processing

(Cont.)

An alternative approach is to update the statistics on a periodic basis, for example nightly, or whenever the system is idle.

Page 19: Query Processing

Dynamic query optimization

Advantage: all information required to select an optimum strategy is up to date.

Disadvantage: the performance of the query is affected because the query has to be parsed, validated, and optimized every time before it can be executed.

Page 20: Query Processing

Static query optimization

The query is parsed, validated, and optimized once, an approach similar to that taken by a compiler for a programming language.

Advantages:
(1) The runtime overhead is removed.
(2) More time is available to evaluate a larger number of execution strategies.

Page 21: Query Processing

(Cont.)

Disadvantage: the execution strategy that is chosen as being optimal when the query is compiled may no longer be optimal when the query is run.

Page 22: Query Processing

Transformation Rules for the Relational Algebra Operations

By applying transformation rules, we can transform one relational algebra expression into an equivalent expression that is more efficient.

There are twelve rules that can be used to restructure the relational algebra tree generated during query decomposition.

Page 23: Query Processing

Heuristic rules

Many DBMSs use heuristics to determine strategies for query processing.

Heuristic rules include:
- performing Selections and Projections as early as possible.
- combining a Cartesian product with a subsequent Selection whose predicate represents a join condition into a Join operation.

Page 24: Query Processing

(Cont.)

- using the associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most restrictive Selections are executed first.
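
To see why these heuristics help, the sketch below runs the same query two ways on toy in-memory data and compares the size of the intermediate result. The relations, attribute values, and function names are hypothetical, and the dictionary lookup assumes `branchNo` is unique in `branch`.

```python
from itertools import product

# Hypothetical toy relations (not from the slides).
staff = [
    {"staffNo": "S1", "position": "Manager", "branchNo": "B003"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B003"},
    {"staffNo": "S3", "position": "Manager", "branchNo": "B005"},
    {"staffNo": "S4", "position": "Assistant", "branchNo": "B007"},
]
branch = [
    {"branchNo": "B003", "city": "Glasgow"},
    {"branchNo": "B005", "city": "London"},
    {"branchNo": "B007", "city": "Aberdeen"},
]


def naive_plan():
    """Cartesian product first, then one selection over all pairs."""
    pairs = list(product(staff, branch))          # intermediate result
    result = [
        (s, b) for s, b in pairs
        if s["branchNo"] == b["branchNo"] and s["position"] == "Manager"
    ]
    return len(pairs), result


def heuristic_plan():
    """Selection first, then product and join predicate merged into a join."""
    managers = [s for s in staff if s["position"] == "Manager"]  # intermediate
    by_branch = {b["branchNo"]: b for b in branch}  # assumes unique branchNo
    result = [
        (s, by_branch[s["branchNo"]])
        for s in managers if s["branchNo"] in by_branch
    ]
    return len(managers), result


naive_size, naive_result = naive_plan()
heur_size, heur_result = heuristic_plan()
print(naive_size, heur_size)
print(naive_result == heur_result)
```

Both plans return the same two manager/branch pairs, but the naive plan builds all 12 staff/branch combinations first, while the heuristic plan carries only the 2 selected staff tuples into the join.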

Page 25: Query Processing

Cost estimation

Depends on statistical information held in the system catalog.

Typical statistics include the cardinality of each base relation, the number of blocks required to store a relation, the number of distinct values for each attribute, the selection cardinality of each attribute, and the number of levels in each multilevel index.
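
As a rough sketch of how such statistics feed an estimate, the example below derives the number of blocks and the selection cardinality of an attribute from a few catalog figures. It assumes attribute values are uniformly distributed and measures cost in block reads. The class name, field names, and figures are illustrative, not taken from a real system catalog.

```python
import math
from dataclasses import dataclass


@dataclass
class RelationStats:
    n_tuples: int          # cardinality of the relation
    blocking_factor: int   # tuples that fit in one block
    n_distinct: dict       # attribute name -> number of distinct values

    @property
    def n_blocks(self) -> int:
        """Blocks needed to store the relation."""
        return math.ceil(self.n_tuples / self.blocking_factor)

    def selection_cardinality(self, attribute: str) -> float:
        """Average tuples matching attribute = value, assuming uniform values."""
        return self.n_tuples / self.n_distinct[attribute]


def linear_search_cost(stats: RelationStats) -> int:
    """Block reads to scan the whole relation for a non-key equality test."""
    return stats.n_blocks


# Illustrative figures for a Staff relation.
staff = RelationStats(
    n_tuples=3000,
    blocking_factor=30,
    n_distinct={"branchNo": 500, "position": 10},
)

print(staff.n_blocks)
print(staff.selection_cardinality("branchNo"))
print(staff.selection_cardinality("position"))
print(linear_search_cost(staff))
```

With these figures, an equality test on `branchNo` is expected to match about 6 tuples and one on `position` about 300, which is the kind of difference an optimizer uses to decide which Selection to apply first.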

Page 26: Query Processing

Join operation

The main strategies for implementing the join operation are:
- Block nested loop join
- Indexed nested loop join
- Sort-merge join
- Hash join
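
To give a feel for how two of these strategies differ, here is a simplified in-memory sketch of a nested loop join and a hash join. A real DBMS works on disk blocks and buffers rather than Python lists, so this only shows the core idea. The function names and sample data are assumptions for the example.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """Compare every outer tuple with every inner tuple."""
    return [
        {**o, **i}
        for o in outer
        for i in inner
        if o[key] == i[key]
    ]


def hash_join(build, probe, key):
    """Hash the build input on the join key, then probe it once per tuple."""
    buckets = defaultdict(list)
    for b in build:
        buckets[b[key]].append(b)
    return [
        {**p, **b}
        for p in probe
        for b in buckets.get(p[key], [])
    ]


# Hypothetical sample data; S4 has no matching branch.
staff = [
    {"staffNo": "S1", "branchNo": "B003"},
    {"staffNo": "S2", "branchNo": "B005"},
    {"staffNo": "S3", "branchNo": "B003"},
    {"staffNo": "S4", "branchNo": "B009"},
]
branch = [
    {"branchNo": "B003", "city": "Glasgow"},
    {"branchNo": "B005", "city": "London"},
    {"branchNo": "B007", "city": "Aberdeen"},
]

nl_result = nested_loop_join(staff, branch, "branchNo")
hj_result = hash_join(branch, staff, "branchNo")
print(nl_result == hj_result)
print(len(nl_result))
```

Both functions produce the same joined tuples. The nested loop version performs one comparison per pair of tuples, while the hash version reads each input once, which is why hash join is usually attractive for equality joins on large inputs.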

Page 27: Query Processing

Pipelining

In materialization, the output of one operation is stored in a temporary relation for processing by the next operation.

An alternative approach is to pipeline the results of one operation to another operation without creating a temporary relation to hold the intermediate result.

Pipelining saves the cost of creating temporary relations and reading the results back in again.
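
The contrast between the two approaches can be sketched with Python lists (materialization) and generators (pipelining). This is an analogy rather than a model of a real executor. The operator names `scan`, `select`, and `project` and the sample data are assumptions for the example.

```python
def scan(rows):
    """Produce tuples one at a time."""
    for row in rows:
        yield row


def select(rows, predicate):
    """Pipelined selection: passes each qualifying tuple straight on."""
    for row in rows:
        if predicate(row):
            yield row


def project(rows, attributes):
    """Pipelined projection: keeps only the named attributes."""
    for row in rows:
        yield {a: row[a] for a in attributes}


# Hypothetical sample data.
staff = [
    {"staffNo": "S1", "salary": 30000},
    {"staffNo": "S2", "salary": 12000},
    {"staffNo": "S3", "salary": 24000},
]

# Materialization: each step builds a full temporary list.
temp1 = [r for r in staff if r["salary"] > 20000]
temp2 = [{"staffNo": r["staffNo"]} for r in temp1]

# Pipelining: operators are chained, and each tuple flows through
# selection and projection without an intermediate list being stored.
pipeline = project(
    select(scan(staff), lambda r: r["salary"] > 20000),
    ["staffNo"],
)
pipelined = list(pipeline)

print(pipelined == temp2)
```

Both versions return the same tuples. In the pipelined version nothing is computed until `list(pipeline)` pulls tuples through the chain, and no equivalent of `temp1` is ever held in full.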

Page 28: Query Processing

Left-deep trees

A relational algebra tree where the right-hand relation is always a base relation.

Advantages: reducing the search space for the optimum strategy and allowing the query optimizer to be based on dynamic programming techniques.
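
The reduction in search space can be made concrete by counting join orders. The sketch below assumes that the left and right inputs of a join are distinguished, that any pair of relations may be joined, and that the choice of join algorithm is ignored. Under those assumptions there are n! left-deep orders for n relations and (2(n-1))!/(n-1)! join trees of any shape. The function names are made up for this illustration.

```python
from math import factorial


def left_deep_orders(n: int) -> int:
    """Join orders when the right input is always a base relation: n!."""
    return factorial(n)


def all_join_trees(n: int) -> int:
    """All binary join trees over n relations, any shape: (2(n-1))! / (n-1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


for n in (2, 3, 4, 5, 6):
    print(n, left_deep_orders(n), all_join_trees(n))
```

For six relations this gives 720 left-deep orders against 30240 trees overall, so restricting the optimizer to left-deep trees leaves far fewer strategies to cost.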