8/14/2019 DB2 SQL TUNING.ppt
1/53
8/14/2019 DB2 SQL TUNING.ppt
2/53
Topics of Discussion
General Recommendation
Predicates Evaluation
Filter Factor
Column Correlation
Writing Subquery
Special Techniques to influence access path Using DB2 EXPLAIN
Different Access Types
8/14/2019 DB2 SQL TUNING.ppt
3/53
Topics of Discussion
Join Methods
DB2 Data and Index page prefetch
Sorting of Data and RIDs
View Merge/Materialization
Query Parallelism
DB2 V5 Features
8/14/2019 DB2 SQL TUNING.ppt
4/53
General Recommendation
Make Sure
Queries are as simple as possible
Unused rows are not fetched. Filtering to be doneby DB2 not in the application program.
Unused columns are not selected
There is no unnecessary ORDER BY or GROUPBY Clause
Use page level locking and try to minimize lock
duration.
Big tables should be handled with care.
8/14/2019 DB2 SQL TUNING.ppt
5/53
General Recommendation
Try to use indexable predicates wherever possible
Do not code redundant predicates
Make sure that declared length of host variable isnot greater than length attribute of data column.
If there are efficient indexes available on the
tables in the subquery, co-related subquery willperform better. Otherwise no co related subquery
will perform better.
If there are multiple subqueries, make sure that
they are ordered in efficient manner.
8/14/2019 DB2 SQL TUNING.ppt
6/53
Predicate
Predicates are found on WHERE, ON and
HAVING clause of the SQL. ON predicates are
applied first, then WHERE predicates and afterdata access HAVING predicates are applied.
Predicates on HAVING clause are not used when
accessing data
Predicate types have great impact on choosing
DB2 access path.
General Predicate Types
- Subquery, Equal, Range, IN-List and NOT
8/14/2019 DB2 SQL TUNING.ppt
7/53
Predicate
Predicates are also categorized into Indexable and
non-Indexable predicates.
Predicates are classified into Stage 1 and Stage 2predicates depending on when they are used
during query evaluation.
For Outer join, predicates on ON clause are
treated as stage 2 predicates. And most of the
other predicates are applied after JOIN as stage 2
predicates. Predicates on table expression can be
evaluated before join as stage 1 predicate
8/14/2019 DB2 SQL TUNING.ppt
8/53
Order of Predicate Evaluation
Predicates are evaluated in following sequence :
Indexable matching predicates are accessed first
Next is Indexable non-matching index. Processingis called index screening
All Indexable predicates are stage 1 predicates
After data page access, other stage 1 predicates areapplied
Finally stage 2 predicates are applied in returned
data rows
8/14/2019 DB2 SQL TUNING.ppt
9/53
Order of Predicate Evaluation
Predicate evaluation within each stage :
All equal predicates are applied first
Range predicates are next
All other predicates at the last
After both of the sets of rule are applied,predicates are evaluated in the order, they appear
in the query.
8/14/2019 DB2 SQL TUNING.ppt
10/53
Filtering
DB2 will apply the most restrictive predicate first
to reduce the processing at next stage for stage 1
predicates. It is better to code predicate with high filtering
factor first
DB2 evaluates the filter factor based on catalog
information in SYSCOLUMNS and
SYSCOLDIST tables.
8/14/2019 DB2 SQL TUNING.ppt
11/53
Filtering
If distribution of column value is not available in
catalog table SYSIBM.SYSCOLDIST, DB2
assumes normal distribution for filtering. Make sure the catalog tables are updated either
manually or running RUNSTATS.
If there is no information available from the
catalog tables, DB2 assumes default filter factor.
DB2 uses default filter factor for predicates in
static SQLs using host variables.
8/14/2019 DB2 SQL TUNING.ppt
12/53
Column Correlation
Two columns of a table are said to be co-related if
they are depended on each other.
DB2 might not determine optimum access path,table order or join method when query uses highly
correlated columns.
Column correlation makes query cheaper than
actually they are.
Run RUNSTATS to update the catalog tables with
correct correlation to help DB2 to find actual
filtering factor.
8/14/2019 DB2 SQL TUNING.ppt
13/53
Using host variable effectively
If static SQL has host variables, DB2 might not
select the optimum access path as it uses default
filter factor for the predicates using host variable. There are two ways to change the access path for
the query that contains host variables
- Using REOPT(VARS) option to change the
access path at run time.
- Rewrite the SQL in a different way.
8/14/2019 DB2 SQL TUNING.ppt
14/53
Writing Subquery
Subquery could be correlated and non-correlated
For Correlated subquery, for each row, returned
from the outer query, the subquery is evaluated. For non-correlated subquery, Inner query is
evaluated first and then the outer query
DB2 sometime can transform the subquery to join
and sometime application programmer has to do it
for better performance
Any subquery could be transformed to a join
8/14/2019 DB2 SQL TUNING.ppt
15/53
Writing Subquery
If you use columns from the both the tables, better
to use JOIN
Guidelines for Writing efficient subquery : If there are efficient indexes available on the tables in the
subquery, then a correlated subquery is likely to be the
most efficient kind of subquery.
If there are no efficient indexes available on the tables inthe subquery, then a non-correlated subquery would likely
perform better.
If there are multiple subqueries in any parent query, make
sure that the subqueries are ordered in the most efficientmanner.
8/14/2019 DB2 SQL TUNING.ppt
16/53
Some Special Techniques
OPTIMIZE OF n ROWS
Reducing the number of matching columns for
index scan Adding extra local predicates
Changing inner join to outer join
Updating Catalog Statistics
8/14/2019 DB2 SQL TUNING.ppt
17/53
OPTIMIZE FOR n ROWS
DB2 Chooses the access path that minimizes the
response time for retrieving the first few rows
Using OPTIMIZE FOR does not stop the useraccessing whole result set.
This is not useful when DB2 has to gather whole
result set before returning the first n rows.
8/14/2019 DB2 SQL TUNING.ppt
18/53
Influencing access path
DB2 evaluates the access path based on
information available in catalog tables
Wrong catalog information or unavailable cataloginformation may result in selection of wrong
access path
Wrong access path could be because of wrong
index selection
It also could be of index selection where
tablespace scan is effective
8/14/2019 DB2 SQL TUNING.ppt
19/53
8/14/2019 DB2 SQL TUNING.ppt
20/53
Other factors
Adding extra predicate may influence in selection
of join method
If you have extra predicate, Nested loop join maybe selected as DB2 assumes that filter factor will
be high. The proper type of predicate to add is
WHERE T1.C1 = T1.C1
Hybrid join is a costlier method. Outer join does
not use hybrid join. So If hybrid join is used by
DB2, convert inner join to outer join and add extra
predicates to removes unneeded rows.
8/14/2019 DB2 SQL TUNING.ppt
21/53
Updating catalog tables
Access path based on catalog column values
Catalog tables could be updated manually or
running RUNSTATS with appropriate options Tables which are frequently changed, access
method on them may suffer as statistics are not
reflected
Running RUNSTATS is a costly process. So
catalog statistics manually should be updated.
It is necessary to rebind the static SQLs after
catalog statistics update
8/14/2019 DB2 SQL TUNING.ppt
22/53
DB2 EXPLAIN AND TUNING
EXPLAIN is a monitoring tool that produces
information about a plan, package, or SQL
statement when it is bound. The output appears ina user-supplied table called PLAN_TABLE
It helps you to do the following
Design databases, indexes, and application
programs
Determine when to rebind an application
Determine the access path chosen for a query
8/14/2019 DB2 SQL TUNING.ppt
23/53
DB2 EXPLAIN OUTPUT
Explain output is stored in PLAN_TABLE
Each plan is identified by APPLNAME column
Each package is identified by PROGNAME,COLLID and VERSION
In each package, you might have multiple SQLs
and each is identified by QUERYNO
For each query will be evaluated in multiple stages
and each stage is identified by QBLOCKNO and
PLANNO
8/14/2019 DB2 SQL TUNING.ppt
24/53
Type of Access
Tablespace Scan (ACCESSTYPE = R)
Index scan
Index scan can categorized into
Index Only Access (INDEXONLY = Y)
Multiple index Scan
(ACCESSTYPE=M,MI,MU,MX) Matching index scan (MATCHCOLS > 0)
Non-Matching index scan ( MATCHCOLS = 0)
One fetch access (ACCESSTYPE= I1)
8/14/2019 DB2 SQL TUNING.ppt
25/53
Tablespace scan (ACCESSTYPE=R)
A matching index scan is not possible because an
index is not available, or there are no predicates to
match the index columns. high percentage of the rows in the table is
returned. In this case an index is not really useful,
because most rows need to be read anyway.
The indexes that have matching predicates have
low cluster ratios and are therefore efficient only
for small amounts of data.
Sequential prefetch is used (PREFETCH=S)
8/14/2019 DB2 SQL TUNING.ppt
26/53
Using Index
Index to be defined should be solely based on how
does application fetch data
Proper definition of index will avoid sort You need to trade cost of defining index and
performance.
8/14/2019 DB2 SQL TUNING.ppt
27/53
Matching Index Scan
Match index scan provide filtering
This is possible if predicates are specified on
either the leading or all of the index key columns If degree of filtering is high. Matching index scan
is efficient
MATCHCOLS will provide the number of
matching columns
If there are more than one index, DB2 will use IX
with most restrictive filtering for matching index
scan
8/14/2019 DB2 SQL TUNING.ppt
28/53
Non-Matching IX scan
This is also called Index Screening
DB2 select index screening when predicates are
specified on index key columns but are not part ofthe matching columns
Index screening predicates improve the index
access by reducing the number of rows that
qualify while searching the index
MATCHCOLS = 0 and ACCESSTYPE = I
8/14/2019 DB2 SQL TUNING.ppt
29/53
IN LIST Index Scan
An IN-list index scan is a special case of the
matching index scan, in which a single indexable
IN predicate is used as a matching equal predicate. PLAN TABLE shows MATCHCOLS > 0 and
ACCESSTYPE = N
8/14/2019 DB2 SQL TUNING.ppt
30/53
Multiple Index Scan
Multiple index access uses more than one index to
access a table
It is a good access path when No single indexprovides efficient access OR A combination of
index accesses provides efficient access.
LIST Sequential prefetch is used as RIDs are
collected from each index scan
ACCESSTYPE = M,MI,MU,MX and
PREFETCH = L
Same index also may be scanned more than ones
8/14/2019 DB2 SQL TUNING.ppt
31/53
One Fetch Access
One-fetch index access requires retrieving only
one row. It is the best possible access path if
available.One-fetch index access is a possible when :
There is only one table in the query.
The column function is either MIN or MAX and There is
an ascending index column for MIN, and a descendingindex column for MAX.
Either no predicate or all predicates are matching
predicates for the index. And There is no GROUP BY.
8/14/2019 DB2 SQL TUNING.ppt
32/53
Index Only access
With index-only access, the access path does not
require any data pages because the access
information is available in the index Because the index is almost always smaller than
the table itself, an index-only access path usually
processes the data efficiently.
ACCESSTYPE = I AND INDEXONLY = Y
8/14/2019 DB2 SQL TUNING.ppt
33/53
JOIN
A join operation retrieves rows from more than
one table and combines them. The operation
specifies at least two tables, but they need not bedistinct.
Application joins are called inner join, left outer
join, right outer join and full outer join
DB2 internally uses three types of join method -Nested loop join, Merge Scan Join and Hybrid
Join
Hybrid join is not used for OUTER join.
8/14/2019 DB2 SQL TUNING.ppt
34/53
Nested Loop Join (METHOD =1)
Initially two tables are picked out and one is used
as inner table and other one as composite table.
For each row in outer table, inner table is scannedfor matching rows in inner(New) table. And the
composite table is prepared. This composite table
is used as outer table at the next stage.
Process continues until and unless all the tableshave been selected.
Composite table is sorted when order of join
columns on both the table are not same.
8/14/2019 DB2 SQL TUNING.ppt
35/53
Nested Loop Join (Method = 1)
Nested loop join is efficient when
Outer table is small. Predicates with small filter
factor reduces no of qualifying rows in outer table. The number of data pages accessed in inner table
is also small.
Highly clustered index available on join columns
of the inner table.
This join method is efficient when filtering for
both the tables(Outer and inner) is high.
8/14/2019 DB2 SQL TUNING.ppt
36/53
Merge Scan Join (Method = 2)
DB2 scans both the tables in order of join column
If there is no efficient index to provide the order,
DB2 might sort the either or both the tables. DB2 reads a row from the outer table and keep on
reading the inner table as long as a match is there.
When there is no match, DB2 reads another row
from outer table.
If outer table has a new value, DB2 searches ahead
in the inner table.
8/14/2019 DB2 SQL TUNING.ppt
37/53
Merge Scan Join (Method = 2)
Merge scan is used when :
Qualifying rows of inner and outer tables are large
and join predicates also does not provide muchfiltering
Tables are large and have no indexes with
matching columns
8/14/2019 DB2 SQL TUNING.ppt
38/53
Hybrid Join(Method=4)
It is only used for Inner Join and requires an index
on the join column of inner table.
Join the outer table with RIDs from the index onthe inner table. Index of the inner table is scanned
for each row in outer table.
Sort the data on RID orders and retrieve the data
from inner table using list prefetch
Concatenates data from inner table to form the
resultant table.
8/14/2019 DB2 SQL TUNING.ppt
39/53
Hybrid Join(Method=4)
Hybrid join is used often when a non-clustered
index available on join column of the inner table
and there are duplicate qualifying rows on outertable.
Hybrid join handles are duplicates in the outer
table as inner table is scanned only ones for each
set of duplicate values. Prefetch method is LIST SEQUENTIAL
8/14/2019 DB2 SQL TUNING.ppt
40/53
8/14/2019 DB2 SQL TUNING.ppt
41/53
8/14/2019 DB2 SQL TUNING.ppt
42/53
Sequential Detection
If DB2 does not choose prefetch at bind time, it
can sometimes do that at execution time. The
method is called sequential detection. If a table is accessed repeatedly using the same
statement (SQL in a do-while loop), the data or
index leaf pages of the table can be accessed
sequentially. DB2 can use this technique if it did not choose
sequential prefetch at bind time because of an
inaccurate estimate of the no of pages to be
8/14/2019 DB2 SQL TUNING.ppt
43/53
Sorting of data
Sort can happen on a new table or on the
composite table
Sort is required by ORDER BY or GROUP BYclause. (SORTC_GROUPBY/SORTC_ORDERBY = Y).
Sort is required to remove duplicates while
DISTINCT or UNION is used. (SORTC_UNIQ=Y)
During Nested loop and Hybrid join, composite
table is sorted and Merge scan join, both of the
tables might be sorted to make join efficient.(SORTN_JOIN/SORTC_JOIN=Y)
8/14/2019 DB2 SQL TUNING.ppt
44/53
Sorting of data
Sort is need for subquery processing. Result of the
subquery is sorted and put into the work file for
later reference by parent query. DB2 sorts RIDs into ascending page number order
in order to perform list prefetch. This sort is very
fast and is done totally in memory
If sort is required during CURSOR processing, itis done during OPEN CURSOR. Once cursor is
closed and opened, sort is to be performed again.
8/14/2019 DB2 SQL TUNING.ppt
45/53
View Merge
If query is using view, view name ultimately will
be resolved to table name. This process is called
view merge. the statement that references the view is combined
with the subselect that defined the view. This
combination creates a logically equivalent
statement. This equivalent statement is executedagainst the database
8/14/2019 DB2 SQL TUNING.ppt
46/53
View Materialization
Views can not be merged if view definition
involves column functions. In that case view
materialization is required. Done in two stages :
1) The view's defining subselect is executed against
the database and the results are placed in a
temporary copy of a result table.
2) The view's referencing statement is then executed
against the temporary copy of the result table to
obtain the intended result
8/14/2019 DB2 SQL TUNING.ppt
47/53
Query Parallelism
When DB2 plans to access data from a table or
index in a partitioned table space, it can initiate
multiple parallel operations to reduce the response
time for data or processor-intensive queries.
Two types of parallelism -
1) Query I/O parallelism - Manages concurrent I/O
request for a single query.
2) Query CP parallelism - Enables multitasking
within a single query. Query is broken into parts
and processed.
8/14/2019 DB2 SQL TUNING.ppt
48/53
Query Parallelism
Parallel processing is enabled using
DEGREE(ANY) on BIND and REBIND for static
SET CURRENT DEGREE = ANY for dynamic The virtual buffer pool parallel sequential
threshold (VPPSEQT) value must be large enough
to provide adequate buffer pool space for parallel
processing
Degree = 1 disables parallel processing
8/14/2019 DB2 SQL TUNING.ppt
49/53
8/14/2019 DB2 SQL TUNING.ppt
50/53
Tools
Tools for Performance Analysis
- DB2 PM
- CANDLES OMEGAMON- SMF/RMF Data
- DB2 TRACE
Tools for Access Path Analysis- DB2 EXPLAIN
- VISUAL EXPLAIN
8/14/2019 DB2 SQL TUNING.ppt
51/53
Risks
There is no GOLDEN RULE for DB2 SQL tuning
Wrong Analysis of performance Data and access
method information may led to more performanceoverhead
While tuning SQL in test environment, the person
should keep in mind that amount of data and DB2
sub-system setup are not same.
Person with good knowledge of DB2 should be
involved with tuning activity.
8/14/2019 DB2 SQL TUNING.ppt
52/53
BP Buffer PoolCP Central Processor
CPC Central Processing CageDASD Direct Access Storage Device
DB2 PM DB2 Performance Monitor
DBD Database Descriptor DS Dataset
IX Index
LDS Linear VSAM DatasetRI Referential Integrity
RID Row Identifier
RMF Resource Monitoring facilitySMF System Monitoring Facility
SQL Structured Query Language
T TableTS Tablespace
VSAM Virtual Storage Access Method
GLOSSARY
8/14/2019 DB2 SQL TUNING.ppt
53/53