DB2 V9 Overview - OOWidgets V9 Overview.pdf · DB2 for z/OS Into the Future Delivering Customer Value V7 V8 DB2 9 2001 2004 2007 20xx DB2 X Ongoing themes: Performance Scalability

zOS Performance

10/3/2008 © 2006 IBM Corporation

DB2 V9 Overview

Bill [email protected]

ATS


Disclaimer

The information in this document has not been submitted to any formal IBM review and is distributed on an "as is" basis without any warranty expressed or implied. Use of this information or the implementation of any of these techniques is a user responsibility and depends on the user's ability to evaluate and integrate them into the user's operational environment. While each item may have been reviewed for accuracy in a specific situation there is no guarantee the same or similar results may be achieved elsewhere.


V9 Introduction

• Most of this material was extracted, with their permission, from conference presentations by the following individuals:
  – Peggy Zagelow
  – Terry Purcell
  – Jim Teng


DB2 for z/OS Into the Future: Delivering Customer Value

[Figure: release timeline]
• V7 (2001)
• V8 (2004): 64 bit, data definition on demand
• DB2 9 (2007): pureXML™
• DB2 X (20xx)

Ongoing themes:
• Performance, Scalability
• Reliability, Availability, Serviceability
• Security, Productivity
• Application Development
• SQL, XML, SOA


DB2 9 for z/OS: Addressing Corporate Data Goals

Simplification, Reduced TCO
• Native SQL procedures
• Index compression
• Partition By Growth tables
• Cloned tables
• Volume based backup / recovery

Dynamic Warehousing
• Many SQL improvements
• Dynamic index ANDing
• Histogram statistics
• New built-in OLAP expressions
• Optimization Service Center

SOA Enablement
• pureXML
• Optimistic locking for WebSphere
• LOB performance, usability

Workload Consolidation
• More online schema changes
• Online REBUILD INDEX
• Trusted context and ROLEs
• Parallel Sysplex clustering improvements


DB2 9: Another feature-rich release

• SHRLEVEL(REFERENCE) for REORG of LOB table spaces
• Online RENAME COLUMN
• Online RENAME INDEX
• Online CHECK DATA & CHECK LOB
• Faster REORG by intra-REORG parallelism
• More online REORG by eliminating BUILD2 phase
• LOB Locks reduction
• Online REBUILD INDEX
• Renaming SCHEMA
• Renaming VCAT
• Tape support for BACKUP and RESTORE SYSTEM utilities
• Recovery of individual table spaces and indexes from volume-level backups
• Enhanced STOGROUP definition
• Preserving consistency when recovering individual objects to a prior point in time
• Spatial Support
• Text Search
• Reordered Row Format (RRF)
• Global query optimization
• Generalizing sparse index and in-memory data caching method
• Autonomic reoptimization
• Logging enhancements
• LOBs Network Flow Optimization
• NOT LOGGED table spaces
• Index on expressions
• Universal table spaces
• Partition-by-growth table spaces
• APPEND option at insert
• Autonomic index page split
• Different index page sizes
• Faster and more automatic DB2 restart
• MODIFY RECOVERY enhancements
• RLF improvements for remote application servers such as SAP
• Thin DB2 Connect client
• Virtual storage constraint relief
• Index compression
• DECIMAL FLOAT
• BIGINT
• VARBINARY & BINARY
• MERGE statement
• FETCH CONTINUE
• SELECT FROM UPDATE / DELETE / MERGE
• Enhanced CURRENT SCHEMA
• Automatic creation of database objects
• Modify early code without requiring an IPL
• Utilities CPU reduction
• Temporary space consolidation
• Removing more reasons for ‘soft’ outages
• Conditional restart enhancement: automatic search for the appropriate checkpoint
• ALTER column default
• Object recovery from BACKUP SYSTEM
• . . .


Version 9 deprecated or removed function

• Private protocol → DRDA (new help in DSNTP2DP)
• Plans containing DBRMs, ACQUIRE(ALLOCATE) → packages, ACQUIRE(USE)
• XML Extender → new XML type
• DB2 MQ XML user-defined functions and stored procedures → use the new XML functions
• msys for Setup DB2 Customization Center removed → install panels
• Simple table spaces → segmented or partitioned by growth
• Loading DSNHDECP directly → use the new interface


Planned DB2 9 Post GA Deliveries

• New XMLTABLE and XMLCAST functions: APAR PK51573
• Storage class zparm for online CHECK utilities: APAR PK41711
• Spatial data type enhancements: about 30 more functions
• Trusted context enhancements: APAR PK44617
• ALTER TABLE ALTER COLUMN SET, DROP DEFAULT


Agenda

• Memory Management
• Optimization Service Center
• Native SQL Procedures
• Plan Stability
• REOPT AUTO
• Histogram Statistics
• Global Query Optimization
• Generalized sparse index and in-memory data cache
• Dynamic Index ANDing
• Indexing Enhancements
• Larger pre-fetch and deferred write quantities
• Page Range Processing
• Automatic buffer pool management
• Misc Optimization Enhancements
• Spatial Support
• Text Search Support


Memory Management – V8


Engine Capacity Increase over time based on LSPR tables

[Figure: chart of MIPS per engine (axis 0 to 1200) by model: G2 Turbo, G3, G4, G5, G5 Turbo, G6, G6 Turbo, z900-1xx, z900-2xx, z990, z9, z10. Labelled data points: 20.75, 230.55, 979.85. Assuming z9-701 = 602 MIPS.]


CEC Capacity Increase over time based on LSPR tables: MIPS with maximum engines

[Figure: chart by model (G2 Turbo, G3, G4, G5, G5 Turbo, G6, G6 Turbo, z900-1xx, z900-2xx, z990, z9, z10) with two series: Max Engines (axis "# Engines", 0 to 70) and Max MIPS (axis "MIPS", 0 to 35,000). Labelled data points: 170.61, 3054.83.]


DBM1 Virtual Storage Constraint Relief

Above 2GB:
• Buffer pool
• Buffer control blocks
• Sort pool
• RID pool
• EDM DBD pool
• Global dynamic statement cache
• Castout engine work area
• Compression dictionary
• LOB

Below 2GB:
• Thread and stack storage
• Other EDM pool
• Local dynamic statement cache


Summary of EDM related storage in V8

Above 2GB: Global DSC, DBD
Below 2GB: SKCT/SKPT, CT/PT, Local DSC


Estimation of V8 below 2GB DBM1 use

• Average estimates
  – Thread storage: +40 to 90% (40% for system, 40 to 90% for user thread)
  – Stack storage: +100%
  – Local Dynamic statement cache: +60%
  – EDM pool: roughly the same
  – Global DSC control block: -70%
  – RID pool: -90%
• Most customers get some to good relief, but a small percentage of customers may see a small percentage increase in DBM1 use below 2GB


64-bit Buffer Pools

• Max BP size is lifted to 1TB
  – Maximum applies to a single buffer pool or to the sum of all buffer pools
  – The actual maximum = the REAL storage available
  – Always allocated above 2GB
  – Castout buffers and Buffer Control Blocks are also allocated above 2GB
• Data space pools and Hiperpools are eliminated
  – Simplifying DB2 system management


64-bit Buffer Pools ...

• PGFIX = YES option to long-term page fix buffers in real storage (i.e. virtual = real)
  – Use where I/O rate is high
  – Must have real storage available to back the pool
  – Up to 10% CPU saving
  – Independent setting by buffer pool
  – Issues DSNB541I and ignores PGFIX = YES if the total active BP storage > 80% of REAL storage
  – PGFIX option is enabled in V8 CM Mode
• PGFIX = NO (which is the default)
  – Needs to do page fix/free for each I/O or each GBP operation


64-bit Buffer Pools ...

• Need to have sufficient real storage to back BP
  – Paging I/Os will affect performance
  – Issue DSNB536I if the total active BP storage > REAL storage capacity
  – Issue DSNB610I if the total active BP storage > 2 x REAL storage
    • Adjust BP size downward or use the minimum size
  – Issue DSNB508I if the total BP size > 1 TB
• Default BP0 size raised from 2000 to 20000
• Default BP8K0 size changed from 0 to 1000
• Default BP16K0 size changed from 0 to 500
• Default BP32K size raised from 24 to 250


Batching of GBP Writes and Castout Reads

• Objectives
  – Write/Castout multiple pages in a single CF operation
  – Reduce traffic to and from CF
  – Improved data sharing performance for most workloads, especially for batch update
    • Workloads that update large numbers of pages for GBP-dependent objects
    • Reduce DBM1's CPU time and CF link utilization due to fewer CF messages
• What are the prerequisites?
  – New commands in z/OS 1.4
  – CF microcode shipped with CF LEVEL 12


Pages Written to GBP at Phase 1 instead of Phase 2

• Some Tx managers spawn other transactions at syncpoint
• Spawned Tx could encounter "record not found" if it tries to read originating Tx's update from another member (rare but a few customers have reported it)
• IMMEDWRITE NO will now write pages at commit phase 1
  – ZPARM IMMEDWRI(PH1) option removed
  – BIND IMMEDWRITE(PH1) option kept for compatibility
• Equivalent performance for Ph1 vs. Ph2 writes
• CPU cost to write pages to GBP being transferred from MSTR SRB to allied TCB
  – Included in the class 2 accounting CPU time


Memory Management – V9


DBM1 Virtual Storage below 2GB

• 3 major storage areas still below 2GB in V8
  – EDM pool containing SKCT/SKPT and CT/PT
  – Local dynamic statement cache
  – Thread and stack storage
• EDM pool in DB2 9 (needs rebind)
  – SKCT/SKPT moved above 2GB
  – A portion (close to 30%) of CT/PT moved above 2GB
  – Average estimated reduction of 60% but there is a wide fluctuation from 20 to 90%


DBM1 Virtual Storage below 2GB …

• Local dynamic statement cache
  – Rough estimation of V9 = 50% of V8
• User thread storage, System thread storage, Stack storage
  – Current expectation of less than 10% difference overall
• V8 PK20800 8/07: Display Thread(*) Service(Storage) for agent-local virtual storage, real storage, and auxiliary storage
• Potential reduction can range from 0 to 300MB depending on how much thread/stack storage is used


Summary of EDM related storage in V8 & V9

             V8            V9
Above 2GB    Global DSC    Global DSC
             DBD           DBD
                           SKCT/SKPT
                           Portion of CT/PT
                           Portion of local DSC
Below 2GB    SKCT/SKPT     Remaining portion of CT/PT (70%)
             CT/PT         Remaining portion of Local DSC (50%)
             Local DSC


Virtual and Real Storage

• Real storage: if everything under user control (buffer/storage pool size, number of concurrent threads, etc.) is kept constant,
  – 5 to 25% increase in overall real storage from V7 to V8, primarily depending on active buffer pool size
  – Less than 10% from V8 to V9
• DDF virtual storage below 2GB
  – 15 to 40% reduction in V9
    • via shared storage between DDF and DBM1 above 2GB (CM)


Optimization Service Center


Optimization Service Center (OSC)

• V8 introduced “Intelligent Visual Explain”
  – Externalizing “hidden plan table” optimizer cost details
  – Stats Advisor enhancement recommends when to collect stats
  – Limited to single query only
• V9 provides a more extensive “Optimization Service Center”
  – All the features of Visual Explain / Statistics Advisor
  – Single query or workload
  – Plus query monitor
  – Advisors: Statistics (Index and Query Design in OE)


OSC Welcome Page


Native SQL Procedures


Native SQL Procedures in DB2 9

• What is an SQL procedure?
  – A stored procedure that contains only SQL statements.
  – May use SQL control statements to write the logic part of the program (WHILE, IF, etc.)
  – SQL Procedural Language or SQL PL


External and Native SQL procedures

• External SQL procedures (from V5 on)
  – Generated C program which runs in a WLM environment
• Native SQL procedures (from V9)
  – The SQL procedure logic runs in the DBM1 address space


Benefits of native SQL procedures

• Enhanced SQL PL support
  – Better Family Compatibility and Standards Compliance
  – Enhanced Portability
• Support for the Application Development Lifecycle
  – Versioning
  – Debugging
  – Deployment
  – Security and the management of source code
• Enhanced Performance
• Enhanced Usability
• Reduced cost of ownership: eligible for redirect to zIIP when invoked remotely, at DDF redirect percentage


Comparison of external and native SQL procedures

              External                                 Native
Preparation   Multi-step, requires C compiler          Single step DDL
Execution     Requires WLM environment, load module    Runs entirely within the DB2 engine


Native SQL Procedures Performance considerations

• Execution within DB2 engine
• SQL PL Compiler Transformation technology
• Optimized access to program variables throughout the SQL program
• Execution from remote thread eligible for zIIP at same percentage as DDF SRB
• Storage utilization optimized: 64-bit, LOB locators, program variables


Native SQL Procedures Summary

– New Syntax and data types

– Designed with the application development lifecycle in mind

– Native SQL procedures executed entirely in DBM1, not in WLM-managed stored procedures address space.

• Execution from remote thread eligible for zIIP at same percentage as DDF SRB

– No C compiler requirement


Plan Stability


Plan Stability Overview

• Ability to back up your static SQL packages
• At REBIND
  – Save old copies of packages in Catalog/Directory
  – Switch back to previous or original version
• Two flavors
  – BASIC
    • 2 copies: Current and Previous
  – EXTENDED
    • 3 copies: Current, Previous, Original
  – Default controlled by a ZPARM
  – Also supported as REBIND options


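Plan Stability - BASIC support

[Figure: movement of package copies for two commands]

REBIND … PLANMGMT(BASIC)
• Existing previous copy: deleted
• Current copy: moved to previous copy
• Incoming copy: becomes the current copy

REBIND … SWITCH(PREVIOUS)
• Current copy: moved to previous copy
• Previous copy: moved to current copy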


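Plan Stability - EXTENDED support

[Figure: movement of package copies for two commands]

REBIND … PLANMGMT(EXTENDED)
• Existing previous copy: deleted
• Current copy: moved to previous copy, and cloned to original copy if no original copy exists yet
• Incoming copy: becomes the current copy

REBIND … SWITCH(ORIGINAL)
• Existing previous copy: deleted
• Current copy: moved to previous copy
• Original copy: cloned to current copy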


Plan Stability Notes

• REBIND PACKAGE …
  – PLANMGMT(BASIC)
    • 2 copies: Current and Previous
  – PLANMGMT(EXTENDED)
    • 3 copies: Current, Previous, Original
• REBIND PACKAGE …
  – SWITCH(PREVIOUS)
    • Switch between current & previous
  – SWITCH(ORIGINAL)
    • Switch between current & original
• Most bind options can be changed at REBIND
  – But a few must be the same …
• FREE PACKAGE …
  – PLANMGMTSCOPE(ALL): free package completely
  – PLANMGMTSCOPE(INACTIVE): free old copies
• Catalog support
  – SYSPACKAGE reflects active copy
  – SYSPACKDEP reflects dependencies of all copies
  – Other catalogs (SYSPKSYSTEM, …) reflect metadata for all copies
• Invalidation and Auto Bind
  – Each copy invalidated separately
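
The copy handling described in the Plan Stability slides can be sketched as a small simulation. This is only an illustrative model in Python, not DB2 code: the `PackageCopies` class, its method names, and the version labels are hypothetical, and the rule that the original copy is only cloned when none exists yet is an assumption about the EXTENDED behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PackageCopies:
    """Hypothetical model of the copies kept for one package."""
    current: Optional[str] = None
    previous: Optional[str] = None
    original: Optional[str] = None

    def rebind(self, incoming: str, planmgmt: str = "EXTENDED") -> None:
        # EXTENDED: keep the first current copy as the original (assumption).
        if planmgmt == "EXTENDED" and self.original is None:
            self.original = self.current
        # The old previous copy is discarded; current moves to previous.
        self.previous = self.current
        self.current = incoming

    def switch(self, target: str) -> None:
        if target == "PREVIOUS":
            if self.previous is None:
                raise ValueError("no previous copy to switch to")
            # Current and previous trade places.
            self.current, self.previous = self.previous, self.current
        elif target == "ORIGINAL":
            if self.original is None:
                raise ValueError("no original copy to switch to")
            # Current moves to previous; the original is cloned to current.
            self.previous = self.current
            self.current = self.original
        else:
            raise ValueError(f"unknown switch target: {target}")


pkg = PackageCopies(current="v1")
pkg.rebind("v2")          # original=v1, previous=v1, current=v2
pkg.rebind("v3")          # original=v1, previous=v2, current=v3
pkg.switch("PREVIOUS")    # current=v2, previous=v3
pkg.switch("ORIGINAL")    # current=v1, previous=v2
```

The model keeps at most three labels per package, which mirrors the "Current, Previous, Original" wording of the slides.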


REOPT enhancement for dynamic SQL

• V8 REOPT options
  – Dynamic SQL
    • REOPT(NONE, ONCE, ALWAYS)
  – Static SQL
    • REOPT(NONE, ALWAYS)
• V9 Addition for Dynamic SQL
  – Bind option REOPT(AUTO)


Dynamic SQL REOPT - AUTO

• For dynamic SQL with parameter markers
  – DB2 will automatically reoptimize the SQL when
    • Filtering of one or more of the predicates changes dramatically
      – Such that table join sequence or index selection may change
    • Some statistics cached to improve performance of runtime check
  – Newly generated access path will replace the current one in the statement cache.
• First optimization is the same as REOPT(ONCE)
  – Followed by analysis of the values supplied at each execution of the statement


Histogram Statistics


RUNSTATS Histogram Statistics

• RUNSTATS will produce equal-depth histogram
  – Each quantile (range) will have approx same number of rows
    • Not same number of values
  – Another term is range frequency
• Example
  – 1, 3, 3, 4, 4, 6, 7, 8, 9, 10, 12, 15 (sequenced)
  – Let's cut that into 3 quantiles:
    • 1, 3, 3, 4, 4 | 6, 7, 8, 9 | 10, 12, 15

Seq No   Low Value   High Value   Cardinality   Frequency
1        1           4            3             5/12
2        6           9            4             4/12
3        10          15           3             3/12
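
The quantile table for this example can be reproduced with a short Python sketch. It is a simplified illustration, not the RUNSTATS algorithm: the function name `equal_depth_histogram` and the rule "fill each quantile to the target depth, then extend it so equal values stay together" are assumptions chosen to match the slide's example.

```python
from fractions import Fraction


def equal_depth_histogram(values, num_quantiles):
    """Return (seq_no, low, high, cardinality, frequency) per quantile."""
    data = sorted(values)
    total = len(data)
    target = -(-total // num_quantiles)  # ceiling of rows per quantile
    quantiles = []
    start = 0
    while start < total:
        if len(quantiles) == num_quantiles - 1:
            end = total  # the last quantile takes whatever is left
        else:
            end = min(start + target, total)
            # Keep rows with the same value in the same quantile.
            while end < total and data[end] == data[end - 1]:
                end += 1
        chunk = data[start:end]
        quantiles.append((
            len(quantiles) + 1,           # sequence number
            chunk[0],                     # low value
            chunk[-1],                    # high value
            len(set(chunk)),              # distinct values in the range
            Fraction(len(chunk), total),  # share of all rows
        ))
        start = end
    return quantiles


rows = [1, 3, 3, 4, 4, 6, 7, 8, 9, 10, 12, 15]
for quantile in equal_depth_histogram(rows, 3):
    print(quantile)
```

Note that `Fraction` reduces 4/12 and 3/12 to 1/3 and 1/4 when printed; the values are the same as in the table.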


RUNSTATS Histogram Statistics Notes

• RUNSTATS
  – Maximum 100 quantiles for a column
  – Rows with the same column value WILL be in the same quantile
  – Quantiles will be similar size but:
    • Will try to avoid big gaps between quantiles
    • Highvalue and lowvalue may have separate quantiles
    • Null WILL have a separate quantile
• Supports column groups as well as single columns
• Think “frequencies” for high cardinality columns


Histogram Statistics Example

• SAP uses INTEGER (or VARCHAR) for YEAR-MONTH
  – Assuming data for 2006 & 2007
    • FF = (high-value – low-value) / (high2key – low2key)
    • FF = (200612 – 200601) / (200711 – 200602)
  – 10% of rows estimated to return

WHERE YEARMONTH BETWEEN 200601 AND 200612

[Figure: "Data Distribution - Even Distribution": row counts (axis 0 to 450,000) by Year/Month from 200601 to 200712.]

Data assumed as evenly distributed between low and high range
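
The 10% estimate follows directly from the filter factor formula on this slide. A minimal Python check of the arithmetic (the function name `between_filter_factor` is hypothetical; only the formula and the four values come from the slide):

```python
def between_filter_factor(low_value, high_value, low2key, high2key):
    """Filter factor for BETWEEN when values are assumed evenly spread."""
    return (high_value - low_value) / (high2key - low2key)


# WHERE YEARMONTH BETWEEN 200601 AND 200612, data for 2006 and 2007
ff = between_filter_factor(200601, 200612, low2key=200602, high2key=200711)
print(f"FF = 11/109 = {ff:.4f}")  # roughly 10% of the rows
```

Because the integer encoding leaves a large empty range between 200612 and 200701, the interpolation treats one full year of data as about a tenth of the value range.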


Histogram Statistics Example

• Example (cont.)
  – Data only exists in ranges 200601-12 & 200701-12
    • Collect via histograms
  – 45% of rows estimated to return

WHERE YEARMONTH BETWEEN 200601 AND 200612

[Figure: "Data Distribution - Histograms": row counts (axis 0 to 1,400,000) by Year/Month for the ranges 2006 01-12 and 2007 01-12. No data between 200613 & 200700.]


Global Query Optimization


Problem Scenario 1

• V8, Large Non-correlated subquery is materialized*

SELECT * FROM SMALL_TABLE A
WHERE A.C1 IN
  (SELECT B.C1 FROM BIG_TABLE B)

  – “BIG_TABLE” is scanned and put into workfile
  – “SMALL_TABLE” is joined with the workfile

• V9 may rewrite non-correlated subquery to correlated
  – Much more efficient if scan / materialisation of BIG_TABLE was avoided
  – Allows matching index access on BIG_TABLE

SELECT * FROM SMALL_TABLE A
WHERE EXISTS
  (SELECT 1 FROM BIG_TABLE B WHERE B.C1 = A.C1)

* Assumes subquery is not transformed to join
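
The two forms of the query are interchangeable only because they return the same rows. The sketch below checks that equivalence with Python's built-in `sqlite3` module. It uses SQLite rather than DB2, so it says nothing about DB2 access paths; the sample data and the index name `BIG_IX` are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SMALL_TABLE (C1 INTEGER);
    CREATE TABLE BIG_TABLE (C1 INTEGER);
    CREATE INDEX BIG_IX ON BIG_TABLE (C1);
""")
conn.executemany("INSERT INTO SMALL_TABLE VALUES (?)", [(1,), (5,), (999999,)])
conn.executemany("INSERT INTO BIG_TABLE VALUES (?)", [(i,) for i in range(1000)])

# Non-correlated form, as written by the application.
non_correlated = conn.execute("""
    SELECT * FROM SMALL_TABLE A
    WHERE A.C1 IN (SELECT B.C1 FROM BIG_TABLE B)
    ORDER BY A.C1
""").fetchall()

# Correlated form, as the optimizer may rewrite it.
correlated = conn.execute("""
    SELECT * FROM SMALL_TABLE A
    WHERE EXISTS (SELECT 1 FROM BIG_TABLE B WHERE B.C1 = A.C1)
    ORDER BY A.C1
""").fetchall()

print(non_correlated, correlated)
```

With the correlated form, each of the three rows in SMALL_TABLE can probe the index on BIG_TABLE instead of forcing a full scan of BIG_TABLE, which is the benefit the slide describes.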


Problem Scenario 2

• V8, Large outer table scanned rather than using matching index access*

SELECT * FROM BIG_TABLE A
WHERE EXISTS
  (SELECT 1 FROM SMALL_TABLE B WHERE A.C1 = B.C1)

  – “BIG_TABLE” is scanned to obtain A.C1 value
  – “SMALL_TABLE” gets matching index access

• V9 may rewrite correlated subquery to non-correlated

SELECT * FROM BIG_TABLE A
WHERE A.C1 IN
  (SELECT B.C1 FROM SMALL_TABLE B)

  – “SMALL_TABLE” scanned and put in workfile
  – Allows more efficient matching index access on BIG_TABLE

* Assumes subquery is not transformed to join


Virtual Tables

• A new way to internally represent subqueries
  – Represented as a Virtual table
    • Allows subquery to be considered in different join sequences
    • May or may not represent a workfile
    • Apply only to subqueries that cannot be transformed to joins

Correlated or non-correlated? ... I shouldn’t have to care!


EXPLAIN Output

• Additional row for materialized “Virtual Table”
  – Table type is "W" for "Workfile".
    • Name includes an indicator of the subquery QB number
      – Example: “DSNVT(02)”
  – Non-materialized Virtual tables will not be shown in EXPLAIN output.
• Additional column PARENT_PLANNO
  – Used with PARENT_QBLOCKNO to connect child QB to parent
  – V8 only contains PARENT_QBNO
    • Not possible to distinguish which plan step the child tasks belong to.


EXPLAIN – Non-correlated subquery

SELECT * FROM T1 WHERE T1.C2 IN
  (SELECT T2.C2 FROM T2, T3 WHERE T2.C1 = T3.C1)

QBNO  PLAN-NO  METHOD  TNAME      AC-TYPE  MC  AC-NAME   SC-JN  PAR_QB  PAR_PNO  QB_TYPE  TB_TYPE
1     1        0       DSNVT(02)  R        0             N      0       0        SELECT   W
1     2        1       T1         I        1   T1_IX_C2  Y      0       0        SELECT   T
2     1        0       T2         R        0             N      1       1        NCOSUB   T
2     2        1       T3         I        1   T3_X_C1   N      1       1        NCOSUB   T

• Represented as a join in QB 1
• Which plan step does the subquery belong to?
  – PARENT_PLANNO = 1 and PARENT_QBNO = 1
    • Thus the row corresponding to QBNO=1, PLANNO=1 is the parent row.


EXPLAIN – Correlated subquery

SELECT * FROM T1 WHERE EXISTS
  (SELECT 1 FROM T2, T3 WHERE T2.C1 = T3.C1 AND T2.C2 = T1.C2)

QBNO  PLAN-NO  METHOD  TNAME  AC-TYPE  MC  AC-NAME   PAR_QB  PAR_PNO  QB_TYPE  TB_TYPE
1     1        0       T1     R        0             0       0        SELECT   T
2     1        1       T2     I        1   T2_IX_C2  1       1        CORSUB   T
2     2        1       T3     I        1   T3_IX_C1  1       1        CORSUB   T

• Using same example from previous slide
  – Although optimizer has rewritten to correlated


Generalizing Sparse Index and In-Memory Data Cache


Pre-V9 Sparse Index & in-memory data cache

• V4 introduced sparse index
  – for non-correlated subquery workfiles
• V7 extended sparse index
  – for the materialized work files within star join
• V8 replaced sparse index
  – with in-memory data caching for star join
    • Runtime fallback to sparse index when memory is insufficient


How does Sparse Index work?

[Figure: nested loop join (NLJ) of T1 to workfile T2 (WF) on t1.c = t2.c. The workfile is sorted in t2.c order. The sparse index, also sorted in t2.c order, holds Key and RID entries. A binary search of the sparse index looks up the “approximate” location of the qualified key.]

• Sparse index may be a subset of workfile (WF)
  – Example: WF may have 10,000 entries
    • Sparse index may have enough space (240K) for 1,000 entries
    • Sparse index is “binary searched” to find target location of search key
    • At most 10 WF entries are scanned
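
The 10,000-entry example can be sketched in Python with the standard `bisect` module. This is a toy model under stated assumptions, not DB2's implementation: the workfile is a sorted list of unique keys, the sparse index keeps every tenth key with its position, and the function names are hypothetical.

```python
from bisect import bisect_right


def build_sparse_index(workfile_keys, max_entries):
    """Keep every step-th key of the sorted workfile, with its position."""
    step = -(-len(workfile_keys) // max_entries)  # ceiling division
    positions = list(range(0, len(workfile_keys), step))
    keys = [workfile_keys[p] for p in positions]
    return keys, positions, step


def probe(workfile_keys, sparse, search_key):
    """Binary-search the sparse index, then scan a short workfile range."""
    keys, positions, step = sparse
    slot = bisect_right(keys, search_key) - 1
    if slot < 0:
        return [], 0  # search key is lower than every workfile key
    start = positions[slot]
    matches, scanned = [], 0
    for pos in range(start, min(start + step, len(workfile_keys))):
        scanned += 1
        if workfile_keys[pos] == search_key:
            matches.append(pos)
        elif workfile_keys[pos] > search_key:
            break  # workfile is sorted, nothing further can match
    return matches, scanned


workfile = list(range(0, 20000, 2))          # 10,000 sorted, unique keys
sparse = build_sparse_index(workfile, 1000)  # 1,000 sparse index entries
matches, scanned = probe(workfile, sparse, 4242)
print(matches, scanned)
```

Each sparse index entry covers 10 workfile entries, so a probe never scans more than 10 of them, which matches the bound given on the slide.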


Extending Data Caching in DB2 9

• In-memory data caching is extended to non-star join
• V9 will use a local pool above the bar
  – Instead of a global pool used in V8 star join
  – Data caching storage management will be associated with each thread
    • Which can reduce the potential storage contention
• New ZPARM MXDTCACH
  – specifies the maximum extent, in MB, for data caching per thread.


Benefit of Data Caching

• All tables lacking an index on join column(s):
  – Temporary tables
  – Table expressions
  – Materialized views
  – ... any table
• V9 also supports multi-column sparse index


Dynamic Index ANDing for Star Schema


Dynamic Index ANDing Challenge

• Filtering may come from multiple dimensions
  – Creating multi-column indexes to support the best combinations is difficult

[Figure: star schema with fact table F surrounded by dimension tables D1, D2, D3, D4, D5.]


Index ANDing – Pre-Fact

• Pre-fact table access
  – Filtering may not be (truly) known until runtime

[Figure: filtering dimensions D1, D3, and D5 are accessed in parallel; each is joined to its respective fact table (F) index to build RID list 1, RID list 2, and RID list 3.]

✘ Runtime optimizer may terminate parallel leg(s) which provide poor filtering at runtime


Index ANDing – Fact and Post-Fact

• Fact table access
  – Intersect filtering RID lists
  – Access fact table
    • From RID list
• Post fact table
  – Join back to dimension tables

[Figure: remaining RID lists (RID list 2 and RID list 3) are “ANDed” (intersected) using parallelism into RID list 2/3.]

Final RID list used for parallel fact table access
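
The pre-fact and fact steps can be sketched in Python as "drop the legs that filter poorly, then intersect what remains". This is a simplified model only: the function name `and_rid_lists` and the `max_leg_fraction` threshold are hypothetical, and how DB2 actually decides to terminate a leg at runtime is not described in the slides.

```python
def and_rid_lists(rid_lists, fact_rows, max_leg_fraction):
    """Intersect the RID lists of the legs that filter well enough."""
    kept = [
        set(rids) for rids in rid_lists
        # A leg that qualifies too many fact rows is terminated (assumption).
        if len(rids) / fact_rows <= max_leg_fraction
    ]
    if not kept:
        return None  # no useful leg survived; another access path is needed
    return sorted(set.intersection(*kept))


fact_rows = 100
leg_d1 = list(range(0, 100, 2))  # qualifies 50% of the fact table
leg_d3 = list(range(0, 100, 5))  # qualifies 20% of the fact table
leg_d5 = list(range(0, 95))      # qualifies 95%: poor filtering

final_rids = and_rid_lists([leg_d1, leg_d3, leg_d5], fact_rows, 0.6)
print(final_rids)
```

Here the third leg is dropped, so the final list is the intersection of the first two legs, and the D5 predicate would be applied when joining back to the dimension table.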


V8 RID Pool failure = TS Scan

[Figure: during RID processing (RID list and SORT), a physical or logical resource constraint causes the access path to fall back to a tablespace scan.]


V9 RID Pool Fallback Plan

[Figure: when RID list processing (SORT) hits a physical or logical resource constraint, the fallback plan writes pairs of join result RIDs into a workfile; the next portion of RIDs is then retrieved and sorted.]


Dynamic Index ANDing Highlights

• Pre-fact table filtering
  – Filtering dimensions accessed concurrently
• Runtime optimization
  – Terminate poorly filtering legs at runtime
• More aggressive parallelism
• Fallback to workfile for RID pool failure
  – Instead of tablespace scan (r-scan)


Indexing Enhancements


Index on Expression

• DB2 9 supports “index on expression”
  – Can turn a stage 2 predicate into indexable

SELECT * FROM CUSTOMERS WHERE YEAR(BIRTHDATE) = 1971

CREATE INDEX ADMF001.CUSTIX3 ON ADMF001.CUSTOMERS
  (YEAR(BIRTHDATE) ASC)

• Previous FF = 1/25
• Now, RUNSTATS collects frequencies: improved FF accuracy


Index Enhancement - Tracking Usage

• Additional indexes require overhead for
  – Utilities
    • REORG, RUNSTATS, LOAD etc.
  – Data maintenance
    • INSERT, UPDATE, DELETE
  – Disk storage
  – Optimization time
    • Increases optimizer’s choices
• But identifying unused indexes is a difficult task
  – Especially in a dynamic SQL environment


Tracking Index Usage

• RTS records the index last used date.
  – SYSINDEXSPACESTATS.LASTUSED
    • Updated once in a 24 hour period
      – RTS service task updates at 1st externalization interval (set by STATSINT) after 12PM.
    • If the index is used by DB2, update occurs.
    • If the index was not used, no update.
• "Used" is defined by DB2 as:
  – As an access path for query or fetch.
  – For searched UPDATE / DELETE SQL statement.
  – As a primary index for referential integrity.
  – To support foreign key access


Larger pre-fetch and deferred write quantities


Buffer Pool adjusting

• If the buffer pool is adjusted, the result will be just as though an ALTER BUFFERPOOL VPSIZE command had been issued
  – The new size is stored by DB2 in the BSDS
• If the buffer pool is deallocated (e.g. because DB2 is being stopped) it will subsequently be reallocated at its most recently allocated size.
• Example
  – If BPOOL is adjusted from 800 MB to 900 MB
  – Then DB2 is stopped and restarted
  – BPOOL will be subsequently allocated at 900 MB


What if the BPOOL is manually altered?

• If a buffer pool's size is manually altered (via the ALTER BUFFERPOOL VPSIZE command), it is deregistered and then registered at the new size.
• Example
  – BPOOL registered at 800 MB
  – Altered to a size of 1000 MB
  – Then after the alteration has completed, DB2 deregisters and re-registers the buffer pool at 1000 MB with a new min of 750 MB and a new max of 1250 MB.
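
The new minimum and maximum in this example are the registered size minus and plus 25%. A minimal Python sketch of that arithmetic (the function name `autosize_bounds` is hypothetical; the 25% figure and the sizes come from the slides):

```python
def autosize_bounds(registered_mb, percent=25):
    """Return (min, max) size in MB for an automatically managed pool."""
    delta = registered_mb * percent / 100
    return registered_mb - delta, registered_mb + delta


print(autosize_bounds(1000))  # pool re-registered at 1000 MB
print(autosize_bounds(800))   # pool registered at 800 MB
```

The second call gives 600 MB and 1000 MB, which lines up with the 600MB minimum and roughly 1GB maximum shown for BP1 on the WLM registration slide.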


AUTOSIZE option

• DB2 will increase or decrease the size of a given buffer pool by up to 25% of the originally allocated size.
• By default, automatic buffer pool adjustment is turned off.
• It can be activated via a new AUTOSIZE(YES) option on the ALTER BUFFERPOOL command.
• Once activated, it can be deactivated by ALTER BUFFERPOOL(bpname) AUTOSIZE(NO).
• The AUTOSIZE attribute is added to the DISPLAY BUFFERPOOL output.


Prefetch and Deferred Write Quantity

• Bigger prefetch and deferred write quantity for bigger buffer pool
  – Max of 128KB (V8) -> 256KB (V9) in SQL table scan
  – 256KB (V8) -> 512KB (V9) in utility
  – +36% MB/sec in non-striped prefetch
  – +47% in 2-striped prefetch -> more effective striping
• “Bigger buffer pool”
  – For sequential prefetch, if VPSEQT*VPSIZE > 160MB for SQL, 320MB for utility
  – For deferred write, if VPSIZE > 160MB for SQL, 320MB for utility


Dynamic Prefetch & Preformat

• Replace all sequential prefetch, except in tablespace scan, with dynamic prefetch in SQL calls
  – Up to 50% faster
  – Dynamic prefetch is more intelligent and robust
• Bigger preformatting quantity and trigger ahead
  – From 2 (V8) to 16 (V9) cylinders if >16 cyl allocation
  – 27% faster Insert in one measurement


Workfile Buffer Pools

• Heavier use of 32K workfile BP instead of 4K BP
  – V9 tries to use 32K BP for bigger record size to gain improved performance, especially I/O time
    • Less workfile space and faster I/O
  – Recommendation
    • Assign bigger 32K workfile BP
    • Allocate more 32K workfile datasets
    • If 4K workfile BP activity is significantly less, corresponding BP size and workfile datasets can be reduced.


Page Range Processing


Limiting the Partitions Accessed

• With DPSIs or tablespace scan of partitioned tablespace
  – beneficial to avoid accessing partitions with no qualifying rows
• Done using page range screening
  – V8 support for local predicates on the leading partitioning key(s)
• Reduces qualified rows read without indexing

[Figure: Table T1, Partition 1, ...]

SELECT SUM(GROSS_SALES) FROM T1
WHERE T1.MONTH = ?
AND T1.STOR_ID = ?


Page Range Screening Enhancements

• DB2 9 introduces two page range screening enhancements:
  – Join predicates
  – Non-matching predicates

SELECT SUM(GROSS_SALES) FROM T1
WHERE T1.MONTH = ?
AND T1.STOR_ID = ?


Page Range Screening with Join Predicates

• V8
  – All DPSI index parts accessed
    • page range screening for local predicates only
• V9
  – 1 DPSI index part on B accessed for each join row from A
    • join predicate(s) used for page range screening
  – 10X performance improvement in DB2 9 Redbook example

SELECT *
FROM TABLEA A, TABLEB B
WHERE A.COL001 = B.COL001   -- non-indexed partition key
AND A.COL004 = B.COL004     -- DPSI key

PARTITION BY (COL001 ASC)
 (PART 1 VALUES('00000100'),
  PART 2 VALUES('00000200'),
  PART 3 VALUES('00000300'),
  ...
  PART 999 VALUES('00099900'),
  PART 1000 VALUES('00100000'))


Page Range Screening with Non-matching Predicates

• V8, page range screening only applies to leading limit key(s)
  – 1000 DPSI parts must be probed
• V9, since only COL002 = ‘00000001’ is required,
  – page range screening can be applied on 2nd limit key,
  – only 20 DPSI parts are probed (1 in every 50 parts)
    • 16X performance improvement in DB2 9 Redbook example

SELECT SUM(COL008)
FROM TABLEA
WHERE COL002 = '00000001'   -- non-indexed 2nd part key
AND COL004 = '00000001'     -- DPSI key

PARTITION BY (COL001 ASC, COL002 ASC)
 (PART 1 VALUES('00000100','00000002'),
  PART 2 VALUES('00000100','00000004'),
  PART 3 VALUES('00000100','00000006'),
  .....
  PART 50 VALUES('00000100','99999999'),
  PART 51 VALUES('00000200','00000002'),
  .....
  PART 1000 VALUES('99999999','99999999'))
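
The "1 in every 50 parts" figure can be checked with a small Python model of limit-key screening. This is a sketch under loud assumptions, not DB2 logic: the limit keys are generated to follow the pattern on the slide (20 leading-key values with 50 parts each, where the slide elides most of them), and the screening rule in `part_may_contain` is a simplification written for this example.

```python
def build_limit_keys(groups=20, parts_per_group=50):
    """Hypothetical limit keys following the PARTITION BY pattern shown."""
    limits = []
    for group in range(1, groups + 1):
        col001 = f"{group * 100:08d}"
        for j in range(1, parts_per_group + 1):
            col002 = "99999999" if j == parts_per_group else f"{2 * j:08d}"
            limits.append((col001, col002))
    return limits


def part_may_contain(prev_limit, limit, col002_value):
    """Can this part hold a row with the given COL002 value?"""
    if prev_limit is not None and prev_limit[0] == limit[0]:
        # Same leading key as the previous part: COL002 must fall in range.
        return prev_limit[1] < col002_value <= limit[1]
    # First part of a new leading-key range: any COL002 value is possible.
    return True


limits = build_limit_keys()
probed = [
    part_no
    for part_no, limit in enumerate(limits, start=1)
    if part_may_contain(limits[part_no - 2] if part_no > 1 else None,
                        limit, "00000001")
]
print(len(limits), len(probed), probed[:3])
```

Only the first part of each group of 50 survives the screening for COL002 = '00000001', giving 20 parts to probe instead of 1000.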


Automatic buffer pool management


Automatic buffer pool management

• Only the size attribute of the buffer pool.
• Can be enabled or disabled at the individual buffer pool level.
• Automatic management entails the following:
  – DB2 registers the BPOOL with WLM
  – DB2 provides sizing information to WLM
  – DB2 communicates to WLM each time allied agents encounter delays
  – DB2 periodically reports BPOOL size and random read hit ratios to WLM


DB2 Registers BPOOL to WLM

[Figure: DB2 registers buffer pool BP1 (800MB) with WLM through the IWM4MREG service.]

Trigger
• ALTER BPOOL AUTOSIZE(YES)
• BPOOL allocation
• Automatic management set ON
(DB2 deregisters when deallocated or altered OFF)

Name      BP1
Current   800MB
Min       600MB
Max       1GB


DB2 communication to WLM

• Communicated to WLM: each time an allied agent encounters a delay caused by a random Get Page having to wait for read I/O.
• The following cases are not communicated to WLM:
  – Prefetch I/O
  – Wait for I/O on a sequential GetPage
  – Group buffer pool reads


Periodic reporting

[Figure: for each buffer pool (BP0, BP1, BP2, BP7), DB2 sends a periodic report to WLM through a data collection exit (one for each pool). The report contains buffer pool sizes and hit ratio for random reads.]

WLM then:
1. Plots size and hit ratio over time.
2. Projects effects of changing the size.


Misc Optimization Enhancements


Sort Avoidance Improvements

• Improved sort avoidance for DISTINCT
  – From V9, DISTINCT can avoid sort using duplicate index
    • Before V9, DISTINCT required unique index to avoid sort
• Sort avoidance for GROUP BY
  – Order of GROUP BY columns re-arranged to match index
  – E.g. index on C1, C2
    • GROUP BY C2, C1
      – Sort required in V8
      – Sort avoided in V9


Sort Improvements

• Reduced workfile usage for very small sorts
  – Final sort step requiring 1 page will NOT allocate workfile
• More efficient sort with FETCH FIRST clause
  – V8 and prior,
    • Sort would continue to completion
    • Then return only the requested ‘n’ rows
  – From V9,
    • If the requested ‘n’ rows will fit into a 32K page,
      – As the data is scanned,
        > Only the top ‘n’ rows are kept in memory
        > Order of the rows is tracked
        > No requirement for final sort
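
The V9 behaviour for FETCH FIRST can be sketched in Python as a bounded, always-ordered list. This is an illustration of the idea only, not DB2's sort code: the function name `fetch_first_n` is hypothetical and plain integers stand in for rows.

```python
from bisect import insort


def fetch_first_n(values, n):
    """Keep only the n smallest values, in order, while scanning once."""
    if n <= 0:
        return []
    top = []
    for value in values:
        if len(top) < n:
            insort(top, value)   # insert in order; no final sort needed
        elif value < top[-1]:
            insort(top, value)   # new value beats the current worst row
            top.pop()            # drop the row that no longer qualifies
    return top


print(fetch_first_n([5, 1, 9, 3, 7, 2], 3))
```

At no point does the list hold more than n + 1 values, which is the memory saving compared with sorting every row and then discarding all but the first n.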


Clusterratio Enhancement

• New Clusterratio formula in V9
  – Better awareness of prefetch range
  – More accurate CR for lower cardinality indexes
  – V9 adds new statistic collected by RUNSTATS
    • DATAREPEATFACTOR helps optimizer differentiate clustering from sequential data pattern
• RUNSTATS required in V9 before mass REBIND
  – As a migration step


Sequential Access

• Sequential prefetch only used for tablespace scan in V9
  – Dynamic prefetch used instead for other access paths
    • Dynamic prefetch tracks sequential access at runtime
    • Sequential prefetch is based upon bind/prepare prediction
      – At runtime, data may not be page sequential


Parallelism Enhancements

• In V8
  – Lowest cost is BEFORE parallelism
• In DB2 9
  – Lowest cost is AFTER parallelism
    • Only a subset of plans are considered for parallelism

[Figure: plans flow from the Optimizer to Parallelism. Labels: "One Lowest cost plan survives" and "How to parallelize these plans?"]


Additional Parallelism Enhancements

• In V8
  – Degree cut on leading table (exception: star join)
• In DB2 9
  – Degree can cut on non-leading table
    • Benefit for leading workfile, 1-row table etc.
  – Histogram statistics exploited for more even distribution
    • For index access with NPI
  – CPU bound query degree <= # of CPUs * 4
    • <= # of CPUs in V8


ORDER BY & FETCH FIRST in subqueries

• ORDER BY can be wrapped inside additional SQL

(SELECT * FROM T1 ORDER BY C1)
UNION ALL
(SELECT * FROM T2 ORDER BY C2)

• ORDER BY and FETCH FIRST n ROWS ONLY in subselect / fullselect
  – ability to select the top n rows

SELECT EMP_ACT.EMPNO, PROJNO
FROM EMP_ACT
WHERE EMP_ACT.EMPNO IN
  (SELECT EMPLOYEE.EMPNO
   FROM EMPLOYEE
   ORDER BY SALARY DESC
   FETCH FIRST 3 ROWS ONLY)


Merge

• MERGE
  – A new SQL DML statement in DB2 9 for z/OS
  – Combine Update and Insert operations into one statement
• SELECT FROM MERGE
  – a SELECT statement
  – Show updated/inserted rows
  – Including DB2 generated values
• SELECT FROM UPDATE/DELETE
  – V8 has SELECT FROM INSERT


SQL Family Compatibility

• INTERSECT
• EXCEPT

[Figure: Venn diagrams of result sets R1 and R2 illustrating INTERSECT, UNION, and EXCEPT.]

OLAP Functions
• RANK
• DENSE_RANK
• ROW_NUMBER
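
The difference between the three OLAP functions is easiest to see on data with ties. The Python sketch below mimics what ROW_NUMBER, RANK, and DENSE_RANK would return for an ORDER BY on a single column in descending order; the function name `olap_numbers` and the sample salaries are made up for the illustration.

```python
def olap_numbers(values):
    """Return (value, row_number, rank, dense_rank), highest value first."""
    ordered = sorted(values, reverse=True)
    result = []
    rank = dense_rank = 0
    previous = object()  # sentinel that never equals a real value
    for row_number, value in enumerate(ordered, start=1):
        if value != previous:
            rank = row_number   # RANK leaves gaps after ties
            dense_rank += 1     # DENSE_RANK does not
            previous = value
        result.append((value, row_number, rank, dense_rank))
    return result


for row in olap_numbers([200, 300, 100, 200]):
    print(row)
```

The two rows with value 200 share rank 2; the next row gets RANK 4 but DENSE_RANK 3, while ROW_NUMBER simply counts 1 to 4.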


Spatial Support


DB2 Spatial Support

• Seamless integration with DB2
• Spatial Data Types
  – ST_Point, ST_LineString, ST_Polygon, etc.
• Spatial Functions
  – ST_Contains, ST_Distance, ST_Intersect etc.
• Spatial stored procedures
  – Administration of coordinate and reference systems
• Implement Open Geospatial Consortium (OGC) SQL specification and ISO SQL/MM spatial standard for types and functions


Examples of spatial applications

• Insurance: Generate quote based on geographic location and risk assessment
• Retail: Display customers around a store to determine areas of market penetration
• Real Estate: Locate properties around a school
• Utilities: Broker power based on demand and delivery cost
• Communications: Locate cell phone towers based on call history


Steps to using spatial functions

• 1. Enable the database for spatial.
  – adds the spatial data types and functions and identifies the available coordinate systems and other spatial meta-data.
• 2. Enable a table for spatial.
  – identifying the spatial column in the table
  – create a UDF to call an external Web services geocoder and triggers to maintain the spatial columns.
• 3. Create a spatial index.
  – DSN5SCLP ODBC program invokes stored procedures for administrative tasks.
• 4. Submit the queries.
  – Using any SQL generating or accepting application.


Text Search Support


OmniFind Text Search Support: 12/2007

• CONTAINS() built in function for text search
• Search CHAR, VARCHAR, LOB, & XML columns
• OmniFind provides a text index server
• Efficient communication interactions with DB2 for z/OS
• OmniFind text indexes are persisted into DB2 tables for backup/recovery purposes

[Figure: an application invokes the SQL API of DB2 members in a Parallel Sysplex, which communicate over TCP/IP with OmniFind servers.]


Drivers for text search solution

• Customer demand
• Problems with prior text extender offerings
• OmniFind index and search technology


Customer search scenarios

• DB2 for z/OS table with catalog item descriptions
  – VARCHAR column, average 256 bytes
  – Online web searching with familiar interface
• DB2 for z/OS table with insurance agent notes
  – CLOB column, average 1K bytes
  – Agent remembers a claim but not who made it!
• DB2 for z/OS table with item names
  – CHAR column, padded to 80 bytes, 400K+ rows
  – Find items with keyword ordered by a customer
