© 2003 IBM Corporation2 10/3/2008
Disclaimer
The information in this document has not been submitted to any formal IBM review and is distributed on an "as is" basis without any warranty expressed or implied. Use of this information or the implementation of any of these techniques is a user responsibility and depends on the user's ability to evaluate and integrate them into the user's operational environment. While each item may have been reviewed for accuracy in a specific situation there is no guarantee the same or similar results may be achieved elsewhere.
V9 Introduction
• Most of this material was extracted from conference presentations by the following individuals, with their permission:
– Peggy Zagelow
– Terry Purcell
– Jim Teng
DB2 for z/OS Into the Future: Delivering Customer Value
[Roadmap figure: V7 (2001), V8 (2004), DB2 9 (2007), DB2 X (20xx)]
Ongoing themes: Performance, Scalability, Reliability, Availability, Serviceability, Security, Productivity, Application Development, SQL, XML, SOA
Release highlights shown: 64 bit, data definition on demand, pureXML™
DB2 9 for z/OS: Addressing Corporate Data Goals
Simplification, Reduced TCO
• Native SQL procedures
• Index compression
• Partition By Growth tables
• Cloned tables
• Volume based backup / recovery
Dynamic Warehousing
• Many SQL improvements
• Dynamic index ANDing
• Histogram statistics
• New built-in OLAP expressions
• Optimization Service Center
SOA Enablement
• pureXML
• Optimistic locking for WebSphere
• LOB performance, usability
Workload Consolidation
• More online schema changes
• Online REBUILD INDEX
• Trusted context and ROLEs
• Parallel Sysplex clustering improvements
DB2 9: Another feature-rich release
• SHRLEVEL(REFERENCE) for REORG of LOB table spaces
• Online RENAME COLUMN
• Online RENAME INDEX
• Online CHECK DATA & CHECK LOB
• Faster REORG by intra-REORG parallelism
• More online REORG by eliminating BUILD2 phase
• LOB Locks reduction
• Online REBUILD INDEX
• Renaming SCHEMA
• Renaming VCAT
• Tape support for BACKUP and RESTORE SYSTEM utilities
• Recovery of individual table spaces and indexes from volume-level backups
• Enhanced STOGROUP definition
• Preserving consistency when recovering individual objects to a prior point in time
• Spatial Support
• Text Search
• Reordered Row Format (RRF)
• Global query optimization
• Generalizing sparse index and in-memory data caching method
• Autonomic reoptimization
• Logging enhancements
• LOBs Network Flow Optimization
• NOT LOGGED table spaces
• Index on expressions
• Universal table spaces
• Partition-by-growth table spaces
• APPEND option at insert
• Autonomic index page split
• Different index page sizes
• Faster and more automatic DB2 restart
• MODIFY RECOVERY enhancements
• RLF improvements for remote application servers such as SAP
• Thin DB2 Connect client
• Virtual storage constraint relief
• Index compression
• DECIMAL FLOAT
• BIGINT
• VARBINARY & BINARY
• MERGE statement
• FETCH CONTINUE
• SELECT FROM UPDATE / DELETE / MERGE
• Enhanced CURRENT SCHEMA
• Automatic creation of database objects
• Modify early code without requiring an IPL
• Utilities CPU reduction
• Temporary space consolidation
• Removing more reasons for ‘soft’ outages
• Conditional restart enhancement: automatic search for the appropriate checkpoint
• ALTER column default
• Object recovery from BACKUP SYSTEM
• . . .
Version 9 deprecated or removed function
• Private protocol → DRDA (new help in DSNTP2DP)
• Plans containing DBRMs, ACQUIRE(ALLOCATE) → packages, ACQUIRE(USE)
• XML Extender → new XML type
• DB2 MQ XML user-defined functions and stored procedures → use the new XML functions
• msys for Setup DB2 Customization Center removed → install panels
• Simple table spaces → segmented or partitioned by growth
• Loading DSNHDECP directly → use the new interface
Planned DB2 9 Post GA Deliveries
• New XMLTABLE and XMLCAST functions: APAR PK51573
• Storage class zparm for online CHECK utilities: APAR PK41711
• Spatial data type enhancements – about 30 more functions
• Trusted context enhancements: APAR PK44617
• ALTER TABLE ALTER COLUMN SET, DROP DEFAULT
Agenda
• Memory Management
• Optimization Service Center
• Native SQL Procedures
• Plan Stability
• REOPT AUTO
• Histogram Statistics
• Global Query Optimization
• Generalized sparse index and in-memory data cache
• Dynamic Index ANDing
• Indexing Enhancements
• Larger pre-fetch and deferred write quantities
• Page Range Processing
• Automatic buffer pool management
• Misc Optimization Enhancements
• Spatial Support
• Text Search Support
Memory Management – V8
Engine Capacity Increase over time based on LSPR tables
[Figure: MIPS per engine by model (G2 Turbo, G3, G4, G5, G5 Turbo, G6, G6 Turbo, z900-1xx, z900-2xx, z990, z9, z10), assuming z9-701 = 602 MIPS]
CEC Capacity Increase over time based on LSPR tables
[Figure: maximum number of engines and MIPS with maximum engines, by model (G2 Turbo, G3, G4, G5, G5 Turbo, G6, G6 Turbo, z900-1xx, z900-2xx, z990, z9, z10)]
DBM1 Virtual Storage Constraint Relief
Above 2GB: Buffer pool, Buffer control blocks, Sort pool, RID pool, EDM DBD pool, Global dynamic statement cache, Castout engine work area, Compression dictionary, LOB
Below 2GB: Thread and stack storage, Other EDM pool, Local dynamic statement cache
Summary of EDM related storage in V8
Above 2GB: Global DSC, DBD
Below 2GB: SKCT/SKPT, CT/PT, Local DSC
Estimation of V8 below 2GB DBM1 use
• Average estimates
  – Thread storage: +40 to 90% (40% for system, 40 to 90% for user thread)
  – Stack storage: +100%
  – Local Dynamic statement cache: +60%
  – EDM pool: roughly the same
  – Global DSC control block: -70%
  – RID pool: -90%
• Most customers get some to good relief, but a small percentage of customers may see a small percentage increase in DBM1 use below 2GB
64-bit Buffer Pools
• Max BP size is lifted to 1TB
  – Max size of a single pool or of the sum of all pools
  – The actual maximum = the REAL storage available
  – Always allocated above 2GB
  – Castout buffers and Buffer Control Blocks are also allocated above 2GB
• Data space pools and Hiperpools are eliminated
  – Simplifying DB2 system management
64-bit Buffer Pools ...
• PGFIX = YES option to long-term page fix buffers in real storage (i.e. virtual = real)
  – Use where I/O rate is high
  – Must have real storage available to back the pool
  – Up to 10% CPU saving
  – Independent setting by buffer pool
  – Issues DSNB541I and ignores PGFIX = YES if the total active BP storage > 80% of REAL storage
  – PGFIX option is enabled in V8 CM (compatibility mode)
• PGFIX = NO (which is the default)
  – Needs to do page fix/free for each I/O or each GBP operation
64-bit Buffer Pools ...
• Need to have sufficient real storage to back BP
  – Paging I/Os will affect performance
  – Issues DSNB536I if the total active BP storage > REAL storage capacity
  – Issues DSNB610I if the total active BP storage > 2 x REAL storage
    • Adjust BP size downward or use the minimum size
  – Issues DSNB508I if the total BP size > 1 TB
• Default BP0 size raised from 2000 to 20000
• Default BP8K0 size changed from 0 to 1000
• Default BP16K0 size changed from 0 to 500
• Default BP32K size raised from 24 to 250
Batching of GBP Writes and Castout Reads
• Objectives
  – Write/Castout multiple pages in a single CF operation
  – Reduce traffic to and from CF
  – Improved data sharing performance for most workloads, especially for batch update
    • Workloads that update large numbers of pages for GBP-dependent objects
    • Reduced DBM1 CPU time and CF link utilization due to fewer CF messages
• What are the prerequisites?
  – New commands in z/OS 1.4
  – CF microcode shipped with CF LEVEL 12
Pages Written to GBP at Phase 1 instead of Phase 2
• Some Tx managers spawn other transactions at syncpoint
• Spawned Tx could encounter "record not found" if it tries to read the originating Tx's update from another member (rare, but a few customers have reported it)
• IMMEDWRITE NO will now write pages at commit phase 1
  – ZPARM IMMEDWRI(PH1) option removed
  – BIND IMMEDWRITE(PH1) option kept for compatibility
• Equivalent performance for Ph1 vs. Ph2 writes
• CPU cost to write pages to GBP is transferred from MSTR SRB to allied TCB
  – Included in the class 2 accounting CPU time
Memory Management – V9
DBM1 Virtual Storage below 2GB
• 3 major storage areas still below 2GB in V8
  – EDM pool containing SKCT/SKPT and CT/PT
  – Local dynamic statement cache
  – Thread and stack storage
• EDM pool in DB2 9 – need rebind
  – SKCT/SKPT moved above 2GB
  – A portion (close to 30%) of CT/PT moved above 2GB
  – Average estimated reduction of 60%, but with wide fluctuation, from 20 to 90%
DBM1 Virtual Storage below 2GB …
• Local dynamic statement cache
  – Rough estimation of V9 = 50% of V8
• User thread storage, System thread storage, Stack storage
  – Current expectation of less than 10% difference overall
• V8 PK20800 8/07 Display Thread(*) Service(Storage) for agent-local virtual storage, real storage, and auxiliary storage
• Potential reduction can range from 0 to 300MB, depending on how much thread/stack storage is used
Summary of EDM related storage in V8 & V9
V8
  Above 2GB: Global DSC, DBD
  Below 2GB: SKCT/SKPT, CT/PT, Local DSC
V9
  Above 2GB: Global DSC, DBD, SKCT/SKPT, Portion of CT/PT, Portion of local DSC
  Below 2GB: Remaining portion of CT/PT (70%), Remaining portion of Local DSC (50%)
Virtual and Real Storage
• Real storage – if everything under user control, such as buffer/storage pool size, number of concurrent threads, etc., is kept constant:
  – 5 to 25% increase in overall real storage from V7 to V8, primarily depending on active buffer pool size
  – Less than 10% from V8 to V9
• DDF virtual storage below 2GB
  – 15 to 40% reduction in V9
    • via shared storage between DDF and DBM1 above 2GB (CM)
Optimization Service Center
Optimization Service Center (OSC)
• V8 introduced “Intelligent Visual Explain”
  – Externalizing “hidden plan table” optimizer cost details
  – Stats Advisor enhancement recommends when to collect stats
  – Limited to single query only
• V9 provides a more extensive “Optimization Service Center”
  – All the features of Visual Explain / Statistics Advisor
  – Single query or workload
  – Plus query monitor
  – Advisors – Statistics (Index and Query Design in OE)
OSC Welcome Page
Native SQL Procedures
Native SQL Procedures in DB2 9
• What is an SQL procedure?
– A stored procedure that contains only SQL statements.
– May use SQL control statements to write the logic part of the program (WHILE, IF, etc)
– SQL Procedural Language or SQL PL
External and Native SQL procedures
• External SQL procedures (from V5 on)
  – Generated C program which runs in a WLM environment
• Native SQL procedures (from V9)
  – The SQL procedure logic runs in the DBM1 address space
Benefits of native SQL procedures
• Enhanced SQL PL support
  – Better Family Compatibility and Standards Compliance
  – Enhanced Portability
• Support for the Application Development Lifecycle
  – Versioning
  – Debugging
  – Deployment
  – Security and the management of source code
• Enhanced Performance
• Enhanced Usability
• Reduced cost of ownership – eligible for redirect to zIIP when invoked remotely, at DDF redirect percentage
Comparison of external and native SQL procedures
            | External                               | Native
Preparation | Multi-step, requires C compiler        | Single step DDL
Execution   | Requires WLM environment, load module  | Runs entirely within the DB2 engine
Native SQL Procedures Performance considerations
• Execution within DB2 engine
• SQL PL Compiler Transformation technology
• Optimized access to program variables throughout the SQL program
• Execution from remote thread eligible for zIIP at same percentage as DDF SRB
• Storage utilization optimized: 64-bit, LOB locators, program variables
Native SQL Procedures Summary
– New Syntax and data types
– Designed with the application development lifecycle in mind
– Native SQL procedures executed entirely in DBM1, not in WLM-managed stored procedures address space
  • Execution from remote thread eligible for zIIP at same percentage as DDF SRB
– No C compiler requirement
Plan Stability
Plan Stability Overview
• Ability to back up your static SQL packages
• At REBIND
  – Save old copies of packages in Catalog/Directory
  – Switch back to previous or original version
• Two flavors
  – BASIC
    • 2 copies: Current and Previous
  – EXTENDED
    • 3 copies: Current, Previous, Original
  – Default controlled by a ZPARM
  – Also supported as REBIND options
Plan Stability - BASIC support
[Figure: movement of package copies; the chart is to be read from bottom to top]
REBIND … PLANMGMT(BASIC)
  – Previous copy: delete
  – Current copy: move to previous copy
  – Incoming copy: move to current copy
REBIND … SWITCH(PREVIOUS)
  – Current copy and previous copy: move (the two copies swap places)
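Plan Stability - EXTENDED support
[Figure: movement of the current, previous, and original package copies]
REBIND … PLANMGMT(EXTENDED)
  – Previous copy: delete
  – Current copy: move to previous copy, clone to original copy
  – Incoming copy: becomes the current copy
REBIND … SWITCH(ORIGINAL)
  – Previous copy: delete
  – Current copy: move to previous copy
  – Original copy: clone to current copy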
Plan Stability Notes
• REBIND PACKAGE …
  – PLANMGMT(BASIC)
    • 2 copies: Current and Previous
  – PLANMGMT(EXTENDED)
    • 3 copies: Current, Previous, Original
• REBIND PACKAGE …
  – SWITCH(PREVIOUS)
    • Switch between current & previous
  – SWITCH(ORIGINAL)
    • Switch between current & original
• Most bind options can be changed at REBIND
  – But a few must be the same …
• FREE PACKAGE …
  – PLANMGMTSCOPE(ALL) – Free package completely
  – PLANMGMTSCOPE(INACTIVE) – Free old copies
• Catalog support
  – SYSPACKAGE reflects active copy
  – SYSPACKDEP reflects dependencies of all copies
  – Other catalogs (SYSPKSYSTEM, …) reflect metadata for all copies
• Invalidation and Auto Bind
  – Each copy invalidated separately
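The copy handling described on the BASIC and EXTENDED slides can be sketched as a small state model. This is only an illustration in Python, not DB2 code: the class `PackageCopies` and its method names are hypothetical, the copies are represented by plain strings, and the assumption that the original copy is cloned only when none exists yet is ours.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PackageCopies:
    """Hypothetical model of the copies kept for one package."""
    current: Optional[str] = None
    previous: Optional[str] = None
    original: Optional[str] = None

    def rebind(self, incoming: str, planmgmt: str = "BASIC") -> None:
        if planmgmt not in ("BASIC", "EXTENDED"):
            raise ValueError(planmgmt)
        # Assumption: EXTENDED clones the current copy to "original" once.
        if planmgmt == "EXTENDED" and self.original is None:
            self.original = self.current
        # The old previous copy is deleted; current moves to previous.
        self.previous = self.current
        self.current = incoming

    def switch(self, target: str) -> None:
        if target == "PREVIOUS":
            # Current and previous swap places.
            self.current, self.previous = self.previous, self.current
        elif target == "ORIGINAL":
            # Current moves to previous; original is cloned to current.
            self.previous = self.current
            self.current = self.original
        else:
            raise ValueError(target)


pkg = PackageCopies(current="v1")
pkg.rebind("v2", "EXTENDED")
pkg.rebind("v3", "EXTENDED")
pkg.switch("ORIGINAL")
```

After the last call the model holds v1 as current, v3 as previous, and v1 as original, which is the fallback the slides describe for a regressed access path.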
REOPT enhancement for dynamic SQL
• V8 REOPT options
  – Dynamic SQL
    • REOPT(NONE, ONCE, ALWAYS)
  – Static SQL
    • REOPT(NONE, ALWAYS)
• V9 Addition for Dynamic SQL
  – Bind option REOPT(AUTO)
Dynamic SQL REOPT - AUTO
• For dynamic SQL with parameter markers
  – DB2 will automatically reoptimize the SQL when
    • Filtering of one or more of the predicates changes dramatically, such that table join sequence or index selection may change
    • Some statistics are cached to improve performance of the runtime check
  – Newly generated access path will replace the current one in the statement cache.
• First optimization is the same as REOPT(ONCE)
  – Followed by analysis of the values supplied at each execution of the statement
Histogram Statistics
RUNSTATS Histogram Statistics
• RUNSTATS will produce equal-depth histogram
  – Each quantile (range) will have approximately the same number of rows
    • Not the same number of values
  – Another term is range frequency
• Example
  – 1, 3, 3, 4, 4, 6, 7, 8, 9, 10, 12, 15 (sequenced)
  – Let's cut that into 3 quantiles:
    • 1, 3, 3, 4, 4 | 6, 7, 8, 9 | 10, 12, 15
Seq No | Low Value | High Value | Cardinality | Frequency
1      | 1         | 4          | 3           | 5/12
2      | 6         | 9          | 4           | 4/12
3      | 10        | 15         | 3           | 3/12
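The three-quantile example can be reproduced with a short sketch. This is not the RUNSTATS algorithm, only a simple equal-depth split written in Python under two assumptions: each quantile targets an even share of the remaining rows, and a quantile is extended so that equal values are never split. The function name `equal_depth_quantiles` is hypothetical.

```python
from fractions import Fraction


def equal_depth_quantiles(values, n_quantiles):
    """Split sorted values into quantiles holding roughly equal row counts."""
    data = sorted(values)
    total = len(data)
    quantiles = []
    start = 0
    for seq in range(1, n_quantiles + 1):
        remaining = total - start
        if remaining == 0:
            break
        buckets_left = n_quantiles - seq + 1
        size = -(-remaining // buckets_left)  # ceiling division
        end = total if buckets_left == 1 else start + size
        # Keep identical values together in one quantile.
        while end < total and data[end] == data[end - 1]:
            end += 1
        chunk = data[start:end]
        quantiles.append({
            "seq": seq,
            "low": chunk[0],
            "high": chunk[-1],
            "cardinality": len(set(chunk)),          # distinct values
            "frequency": Fraction(len(chunk), total),  # share of rows
        })
        start = end
    return quantiles


rows = [1, 3, 3, 4, 4, 6, 7, 8, 9, 10, 12, 15]
histogram = equal_depth_quantiles(rows, 3)
```

For the slide's data this yields the ranges 1 to 4, 6 to 9, and 10 to 15, with 5, 4, and 3 of the 12 rows respectively. `Fraction` reduces 4/12 and 3/12 to 1/3 and 1/4.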
RUNSTATS Histogram Statistics Notes
• RUNSTATS
  – Maximum 100 quantiles for a column
  – Same value columns WILL be in the same quantile
  – Quantiles will be similar size but:
    • Will try to avoid big gaps between quantiles
    • Highvalue and lowvalue may have separate quantiles
    • Null WILL have a separate quantile
• Supports column groups as well as single columns
• Think “frequencies” for high cardinality columns
Histogram Statistics Example
• SAP uses INTEGER (or VARCHAR) for YEAR-MONTH
  WHERE YEARMONTH BETWEEN 200601 AND 200612
• Assuming data for 2006 & 2007
  – FF = (high-value – low-value) / (high2key – low2key)
  – FF = (200612 – 200601) / (200711 – 200602)
  – 10% of rows estimated to return
• Data assumed as evenly distributed between low and high range
[Figure: Data Distribution - Even Distribution; rows by Year/Month from 200601 to 200712]
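The 10% estimate follows directly from the interpolation formula on this slide. The arithmetic can be checked with a few lines of Python; the function name `interpolated_ff` is only illustrative.

```python
def interpolated_ff(low_value, high_value, low2key, high2key):
    """Filter factor for a BETWEEN predicate, assuming evenly spread values."""
    return (high_value - low_value) / (high2key - low2key)


# BETWEEN 200601 AND 200612, with LOW2KEY = 200602 and HIGH2KEY = 200711
ff = interpolated_ff(200601, 200612, 200602, 200711)  # 11 / 109
percent = round(ff * 100)
```

The result is about 10%, because the integer range from 200602 to 200711 spans 109 units even though only month numbers 01 to 12 actually occur within each year.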
Histogram Statistics Example
• Example (cont.)
  WHERE YEARMONTH BETWEEN 200601 AND 200612
  – Data only exists in ranges 200601-12 & 200701-12
    • Collect via histograms
  – No data between 200613 & 200700
  – 45% of rows estimated to return
[Figure: Data Distribution - Histograms; rows by Year/Month, with data in 2006 01-12 and 2007 01-12 and none from 200613 to 200700]
Global Query Optimization
Problem Scenario 1
• V8: large non-correlated subquery is materialized*
  SELECT * FROM SMALL_TABLE A
  WHERE A.C1 IN
    (SELECT B.C1 FROM BIG_TABLE B)
  – “BIG_TABLE” is scanned and put into workfile
  – “SMALL_TABLE” is joined with the workfile
• V9 may rewrite non-correlated subquery to correlated
  – Much more efficient if scan / materialisation of BIG_TABLE is avoided
  – Allows matching index access on BIG_TABLE
  SELECT * FROM SMALL_TABLE A
  WHERE EXISTS
    (SELECT 1 FROM BIG_TABLE B WHERE B.C1 = A.C1)
* Assumes subquery is not transformed to join
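The rewrite is only valid because the IN form and the EXISTS form return the same rows. That equivalence can be checked on toy data with the sketch below, which uses Python's built-in sqlite3 module rather than DB2, so it says nothing about DB2 access paths or performance. The table contents and the index name `BIG_IX` are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SMALL_TABLE (C1 INTEGER, NAME TEXT);
    CREATE TABLE BIG_TABLE (C1 INTEGER);
    CREATE INDEX BIG_IX ON BIG_TABLE (C1);
""")
conn.executemany("INSERT INTO SMALL_TABLE VALUES (?, ?)",
                 [(1, "a"), (5, "b"), (99999, "c")])
conn.executemany("INSERT INTO BIG_TABLE VALUES (?)",
                 [(i % 1000,) for i in range(10000)])

# Non-correlated form, as written by the application.
non_correlated = conn.execute("""
    SELECT * FROM SMALL_TABLE A
    WHERE A.C1 IN (SELECT B.C1 FROM BIG_TABLE B)
    ORDER BY A.C1
""").fetchall()

# Correlated form, as the optimizer may rewrite it.
correlated = conn.execute("""
    SELECT * FROM SMALL_TABLE A
    WHERE EXISTS (SELECT 1 FROM BIG_TABLE B WHERE B.C1 = A.C1)
    ORDER BY A.C1
""").fetchall()

conn.close()
```

Both queries return the rows with C1 = 1 and C1 = 5. In the correlated form each of the three outer rows needs only one index probe into BIG_TABLE.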
Problem Scenario 2
• V8: large outer table scanned rather than using matching index access*
  SELECT * FROM BIG_TABLE A
  WHERE EXISTS
    (SELECT 1 FROM SMALL_TABLE B WHERE A.C1 = B.C1)
  – “BIG_TABLE” is scanned to obtain A.C1 value
  – “SMALL_TABLE” gets matching index access
• V9 may rewrite correlated subquery to non-correlated
  SELECT * FROM BIG_TABLE A
  WHERE A.C1 IN
    (SELECT B.C1 FROM SMALL_TABLE B)
  – “SMALL_TABLE” scanned and put in workfile
  – Allows more efficient matching index access on BIG_TABLE
* Assumes subquery is not transformed to join
Virtual Tables
• A new way to internally represent subqueries
  – Represented as a Virtual table
    • Allows subquery to be considered in different join sequences
    • May or may not represent a workfile
    • Applies only to subqueries that cannot be transformed to joins
Correlated or non-correlated? ... I shouldn’t have to care!
EXPLAIN Output
• Additional row for materialized “Virtual Table”
  – Table type is "W" for "Workfile".
    • Name includes an indicator of the subquery QB number
    • Example: “DSNVT(02)”
  – Non-materialized Virtual tables will not be shown in EXPLAIN output.
• Additional column PARENT_PLANNO
  – Used with PARENT_QBLOCKNO to connect child QB to parent
  – V8 only contains PARENT_QBLOCKNO
    • Not possible to distinguish which plan step the child tasks belong to.
EXPLAIN – Non-correlated subquery
SELECT * FROM T1 WHERE T1.C2 IN
  (SELECT T2.C2 FROM T2, T3 WHERE T2.C1 = T3.C1)
QBNO | PLAN-NO | METHOD | TNAME     | AC-TYPE | MC | AC-NAME  | SC-JN | PAR_QB | PAR_PNO | QB_TYPE | TB_TYPE
1    | 1       | 0      | DSNVT(02) | R       | 0  |          | N     | 0      | 0       | SELECT  | W
1    | 2       | 1      | T1        | I       | 1  | T1_IX_C2 | Y     | 0      | 0       | SELECT  | T
2    | 1       | 0      | T2        | R       | 0  |          | N     | 1      | 1       | NCOSUB  | T
2    | 2       | 1      | T3        | I       | 1  | T3_X_C1  | N     | 1      | 1       | NCOSUB  | T
• Represented as a join in QB 1
• Which plan step does the subquery belong to?
  – PARENT_PLANNO = 1 and PARENT_QBLOCKNO = 1
    • Thus the row corresponding to QBNO=1, PLANNO=1 is the parent row.
EXPLAIN – Correlated subquery
SELECT * FROM T1 WHERE EXISTS
  (SELECT 1 FROM T2, T3 WHERE T2.C1 = T3.C1 AND T2.C2 = T1.C2)
QBNO | PLAN-NO | METHOD | TNAME | AC-TYPE | MC | AC-NAME  | PAR_QB | PAR_PNO | QB_TYPE | TB_TYPE
1    | 1       | 0      | T1    | R       | 0  |          | 0      | 0       | SELECT  | T
2    | 1       | 1      | T2    | I       | 1  | T2_IX_C2 | 1      | 1       | CORSUB  | T
2    | 2       | 1      | T3    | I       | 1  | T3_IX_C1 | 1      | 1       | CORSUB  | T
• Using the same example as the previous slide
  – Although the optimizer has rewritten it to correlated
Generalizing Sparse Index and In-Memory Data Cache
Pre-V9 Sparse Index & in-memory data cache
• V4 introduced sparse index
  – for non-correlated subquery workfiles
• V7 extended sparse index
  – for the materialized work files within star join
• V8 replaced sparse index
  – with in-memory data caching for star join
    • Runtime fallback to sparse index when memory is insufficient
How does Sparse Index work?
• Sparse index may be a subset of workfile (WF)
  – Example: WF may have 10,000 entries
    • Sparse index may have enough space (240K) for 1,000 entries
    • Sparse index is “binary searched” to find target location of search key
    • At most 10 WF entries are scanned
[Figure: T1 joined to T2 (WF) by nested loop join (NLJ) on t1.c = t2.c. The sparse index of (key, RID) entries, sorted in t2.c order, is binary searched to look up the “approximate” location of the qualified key in the workfile T2 (WF), which is also sorted in t2.c order.]
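The 10,000-entry example can be sketched as follows. This is an illustrative Python model, not DB2's implementation: the sorted workfile is a plain list of keys, the sparse index keeps every tenth key with its position, and the function names `build_sparse_index` and `probe` are hypothetical.

```python
import bisect


def build_sparse_index(workfile_keys, max_entries):
    """Keep every step-th key of the sorted workfile, with its position."""
    if not workfile_keys:
        return [], [], 1
    step = -(-len(workfile_keys) // max_entries)  # ceiling division
    positions = list(range(0, len(workfile_keys), step))
    index_keys = [workfile_keys[p] for p in positions]
    return index_keys, positions, step


def probe(workfile_keys, index_keys, positions, step, key):
    """Binary search the sparse index, then scan at most `step` WF entries."""
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot < 0:
        return None, 0  # key is below the first workfile entry
    start = positions[slot]
    scanned = 0
    for pos in range(start, min(start + step, len(workfile_keys))):
        scanned += 1
        if workfile_keys[pos] == key:
            return pos, scanned
        if workfile_keys[pos] > key:
            break
    return None, scanned


workfile = list(range(0, 20000, 2))  # 10,000 sorted keys
index_keys, positions, step = build_sparse_index(workfile, 1000)
hit = probe(workfile, index_keys, positions, step, 1234)
```

With 1,000 index entries over 10,000 workfile entries, each index entry covers 10 consecutive workfile entries, so a probe never scans more than 10 of them.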
Extending Data Caching in DB2 9
• In-memory data caching is extended to non-star join
• V9 will use a local pool above the bar
  – Instead of a global pool used in V8 star join
  – Data caching storage management will be associated with each thread
    • Which can reduce the potential storage contention
• New ZPARM MXDTCACH
  – specifies the maximum extent, in MB, for data caching per thread.
Benefit of Data Caching
• All tables lacking an index on join column(s):
  – Temporary tables
  – Table expressions
  – Materialized views
  – ... any table
• V9 also supports multi-column sparse index
Dynamic Index ANDing for Star Schema
Dynamic Index ANDing Challenge
• Filtering may come from multiple dimensions
  – Creating multi-column indexes to support the best combinations is difficult
[Figure: star schema with fact table F surrounded by dimension tables D1, D2, D3, D4, D5]
Index ANDing – Pre-Fact
• Pre-fact table access
  – Filtering may not be (truly) known until runtime
  – Filtering dimensions (D1, D3, D5) accessed in parallel
  – Join to respective fact table (F) indexes
  – Build RID lists (RID list 1, RID list 2, RID list 3)
  – Runtime optimizer may terminate parallel leg(s) which provide poor filtering at runtime
Index ANDing – Fact and Post-Fact
• Fact table access
  – Intersect filtering RID lists
    • Remaining RID lists (RID list 2, RID list 3) are “ANDed” (intersected), using parallelism, into RID list 2/3
  – Access fact table
    • From RID list
    • Final RID list used for parallel fact table access
• Post fact table
  – Join back to dimension tables
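The pre-fact and fact steps can be sketched as a set intersection with a runtime filtering check. This is a simplified Python illustration, not DB2 internals: RIDs are plain integers, and the 50% cut-off used to drop a poorly filtering leg (`max_selectivity`) is an invented threshold.

```python
def dynamic_index_anding(legs, fact_rows, max_selectivity=0.5):
    """Drop poorly filtering legs, then intersect the remaining RID lists."""
    kept = {
        name: rids
        for name, rids in legs.items()
        if len(rids) / fact_rows <= max_selectivity  # hypothetical cut-off
    }
    if not kept:
        return None, []  # no useful filtering: caller needs another plan
    final = set.intersection(*(set(rids) for rids in kept.values()))
    return sorted(final), sorted(kept)


# One RID list per filtering dimension leg (toy data).
legs = {
    "D1": list(range(0, 900)),     # qualifies 90% of the fact rows
    "D3": list(range(0, 100, 2)),
    "D5": list(range(0, 100, 3)),
}
final_rids, surviving_legs = dynamic_index_anding(legs, fact_rows=1000)
```

Here the D1 leg is terminated because it barely filters, and the D3 and D5 lists are intersected into the final RID list that drives the fact table access.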
V8 RID Pool failure = TS Scan
[Figure: RID processing (RID list, SORT) hits a physical or logical resource constraint and falls back to a tablespace scan]
V9 RID Pool Fallback Plan
[Figure: when RID processing (RID list, SORT) hits a physical or logical resource constraint, the fallback plan writes pairs of join result RIDs into a workfile; the next portion of RIDs is then retrieved and sorted]
Dynamic Index ANDing Highlights
• Pre-fact table filtering
  – Filtering dimensions accessed concurrently
• Runtime optimization
  – Terminate poorly filtering legs at runtime
• More aggressive parallelism
• Fallback to workfile for RID pool failure
  – Instead of r-scan
Indexing Enhancements
Index on Expression
• DB2 9 supports “index on expression”
  – Can turn a stage 2 predicate into an indexable predicate
SELECT * FROM CUSTOMERS WHERE YEAR(BIRTHDATE) = 1971
  – Previous FF = 1/25
  – Now, RUNSTATS collects frequencies, which improves FF accuracy
CREATE INDEX ADMF001.CUSTIX3 ON ADMF001.CUSTOMERS
(YEAR(BIRTHDATE) ASC)
Index Enhancement - Tracking Usage
• Additional indexes require overhead for
  – Utilities
    • REORG, RUNSTATS, LOAD etc
  – Data maintenance
    • INSERT, UPDATE, DELETE
  – Disk storage
  – Optimization time
    • Increases optimizer’s choices
• But identifying unused indexes is a difficult task
  – Especially in a dynamic SQL environment
Tracking Index Usage
• RTS records the index last used date.
  – SYSINDEXSPACESTATS.LASTUSED
    • Updated once in a 24 hour period
    • RTS service task updates at 1st externalization interval (set by STATSINT) after 12PM.
    • If the index is used by DB2, update occurs.
    • If the index was not used, no update.
• “Used” is defined by DB2 as:
  – As an access path for query or fetch.
  – For searched UPDATE / DELETE SQL statement.
  – As a primary index for referential integrity.
  – To support foreign key access
Larger pre-fetch and deferred write quantities
Buffer Pool adjusting
• If the buffer pool is adjusted, the result will be just as though an ALTER BUFFERPOOL VPSIZE command had been issued
  – The new size is stored by DB2 in the BSDS
• If the buffer pool is deallocated (e.g. because DB2 is being stopped) it will subsequently be reallocated at its most recently allocated size.
• Example
  – If BPOOL is adjusted from 800 MB to 900 MB
  – Then DB2 is stopped and restarted
  – BPOOL will be subsequently allocated at 900 MB
What if the BPOOL is manually altered?
• If a buffer pool's size is manually altered (via the ALTER BUFFERPOOL VPSIZE command), it is deregistered and then registered at the new size.
• Example
  – BPOOL registered at 800 MB
  – Altered to a size of 1000 MB
  – Then after the alteration has completed, DB2 deregisters and re-registers the buffer pool at 1000 MB with a new min of 750 MB and a new max of 1250 MB.
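The minimum and maximum in this example are the registered size minus and plus 25%. A small Python sketch of that arithmetic follows; the function name `autosize_bounds` is illustrative, and the 25% figure is the adjustment range the deck gives for AUTOSIZE.

```python
def autosize_bounds(registered_mb, percent=25):
    """Return (min, max) in MB for a pool registered at registered_mb."""
    delta = registered_mb * percent / 100
    return registered_mb - delta, registered_mb + delta


after_alter = autosize_bounds(1000)   # pool altered to 1000 MB
before_alter = autosize_bounds(800)   # pool originally registered at 800 MB
```

For 1000 MB this gives 750 MB and 1250 MB, as in the example above. For 800 MB it gives 600 MB and 1000 MB, which matches the BP1 registration values (600MB min, 1GB max) shown later in the deck.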
AUTOSIZE option
• DB2 will increase or decrease the size of a given buffer pool by up to 25% of the originally allocated size.
• By default, automatic buffer pool adjustment is turned off.
• It can be activated via a new AUTOSIZE(YES) option on the ALTER BUFFERPOOL command.
• Once activated, it can be deactivated by ALTER BUFFERPOOL(bpname) AUTOSIZE(NO).
• The AUTOSIZE attribute is added to the DISPLAY BUFFERPOOL output.
Prefetch and Deferred Write Quantity
• Bigger prefetch and deferred write quantity for bigger buffer pool
  – Max of 128KB V8 -> 256KB V9 in SQL table scan
  – 256KB V8 -> 512KB V9 in utility
  – +36% MB/sec in non-striped prefetch
  – +47% in 2-striped prefetch -> more effective striping
• “Bigger buffer pool”
  – For sequential prefetch, if VPSEQT*VPSIZE > 160MB for SQL, 320MB for utility
  – For deferred write, if VPSIZE > 160MB for SQL, 320MB for utility
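The “bigger buffer pool” rule can be written out as a small check. This Python sketch uses only the thresholds and prefetch maxima quoted on this slide. It assumes that VPSIZE is expressed in MB, that VPSEQT is a percentage, and that a pool below the threshold keeps the V8 maximum; the function and constant names are hypothetical.

```python
THRESHOLD_MB = {"sql": 160, "utility": 320}
V8_MAX_PREFETCH_KB = {"sql": 128, "utility": 256}
V9_MAX_PREFETCH_KB = {"sql": 256, "utility": 512}


def is_bigger_pool(vpsize_mb, vpseqt_pct, workload, operation):
    """Apply the slide's threshold test for prefetch or deferred write."""
    if operation == "prefetch":
        effective_mb = vpsize_mb * vpseqt_pct / 100  # VPSEQT * VPSIZE
    elif operation == "deferred_write":
        effective_mb = vpsize_mb
    else:
        raise ValueError(operation)
    return effective_mb > THRESHOLD_MB[workload]


def max_prefetch_kb(vpsize_mb, vpseqt_pct, workload):
    """Assumption: pools below the threshold stay at the V8 maximum."""
    if is_bigger_pool(vpsize_mb, vpseqt_pct, workload, "prefetch"):
        return V9_MAX_PREFETCH_KB[workload]
    return V8_MAX_PREFETCH_KB[workload]


small = max_prefetch_kb(200, 80, "sql")   # 80% of 200 MB = 160 MB, not above
large = max_prefetch_kb(250, 80, "sql")   # 80% of 250 MB = 200 MB
```

Note that the same 200 MB pool qualifies for the larger deferred write quantity but not for the larger prefetch quantity, because only the prefetch test scales the size by VPSEQT.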
Dynamic Prefetch & Preformat
• Replace all sequential prefetch, except in tablespace scan, with dynamic prefetch in SQL calls
  – Up to 50% faster
  – Dynamic prefetch is more intelligent and robust
• Bigger preformatting quantity and trigger ahead
  – From 2 (V8) to 16 (V9) cylinders if >16cyl allocation
  – 27% faster Insert in one measurement
Workfile Buffer Pools
• Heavier use of 32K workfile BP instead of 4K BP
  – V9 tries to use 32K BP for bigger record size to gain improved performance, especially I/O time
    • Less workfile space and faster I/O
  – Recommendation
    • Assign bigger 32K workfile BP
    • Allocate more 32K workfile datasets
    • If 4K workfile BP activity is significantly less, corresponding BP size and workfile datasets can be reduced.
Page Range Processing
Limiting the Partitions Accessed
• With DPSIs or tablespace scan of partitioned tablespace
  – beneficial to avoid accessing partitions with no qualifying rows
• Done using page range screening
  – V8 support for local predicates on the leading partitioning key(s)
• Reduces qualified rows read without indexing
SELECT SUM(GROSS_SALES) FROM T1
WHERE T1.MONTH = ? AND T1.STOR_ID = ?
[Figure: Table T1, Partitions 1 to 5]
Page Range Screening Enhancements
• DB2 9 introduces two page range screening enhancements:
  – Join predicates
  – Non-matching predicates
SELECT SUM(GROSS_SALES) FROM T1
WHERE T1.MONTH = ? AND T1.STOR_ID = ?
[Figure: Table T1, Partitions 1 to 5]
Page Range Screening with Join Predicates
• V8
  – All DPSI index parts accessed
    • page range screening for local predicates only
• V9
  – 1 DPSI index part on B accessed for each join row from A
    • join predicate(s) used for page range screening
  – 10X performance improvement in DB2 9 Redbook example
SELECT *
FROM TABLEA A, TABLEB B
WHERE A.COL001 = B.COL001   -- non-indexed partition key
AND A.COL004 = B.COL004     -- DPSI key
PARTITION BY (COL001 ASC)
(PART 1 VALUES('00000100'),
 PART 2 VALUES('00000200'),
 PART 3 VALUES('00000300'),
 ………,
 PART 999 VALUES('00099900'),
 PART 1000 VALUES('00100000'))
Page Range Screening with Non-matching Predicates
• V8, page range screening only applies to leading limit key(s)
  – 1000 DPSI parts must be probed
• V9, since only COL002 = ‘00000001’ is required,
  – page range screening can be applied on 2nd limit key,
  – only 20 DPSI parts are probed (1 in every 50 parts)
    • 16X performance improvement in DB2 9 Redbook example
SELECT SUM(COL008)
FROM TABLEA
WHERE COL002 = '00000001'   -- non-indexed 2nd part key
AND COL004 = '00000001'     -- DPSI key
PARTITION BY (COL001 ASC, COL002 ASC)
(PART 1 VALUES('00000100','00000002'),
 PART 2 VALUES('00000100','00000004'),
 PART 3 VALUES('00000100','00000006'),
 .....
 PART 50 VALUES('00000100','99999999'),
 PART 51 VALUES('00000200','00000002'),
 .....
 PART 1000 VALUES('99999999','99999999'))
Automatic buffer pool management
Automatic buffer pool management
• Only the size attribute of the buffer pool.
• Can be enabled or disabled at the individual buffer pool level.
• Automatic management entails the following:
  – DB2 registers the BPOOL with WLM
  – DB2 provides sizing information to WLM
  – DB2 communicates to WLM each time allied agents encounter delays
  – DB2 periodically reports BPOOL size and random read hit ratios to WLM
DB2 Registers BPOOL to WLM
• DB2 registers the buffer pool with WLM through the IWM4MREG service
• Trigger
  – ALTER BPOOL AUTOSIZE(YES)
  – BPOOL allocation
  – Automatic management set ON (DB2 deregisters when deallocated or altered OFF)
• Example registration: Name BP1, Current 800MB, Min 600MB, Max 1GB
DB2 communication to WLM
• DB2 communicates to WLM each time an allied agent encounters a delay caused by a random GetPage having to wait for read I/O.
• The following cases are not communicated to WLM:
  – Prefetch I/O
  – Wait for I/O on a sequential GetPage
  – Group buffer pool reads
Periodic reporting
• DB2 periodic report to WLM, through a data collection exit (one for each pool, e.g. BP0, BP1, BP2, BP7)
  – Buffer pool sizes
  – Hit ratio for random reads
• WLM
  1. Plots size and hit ratio over time.
  2. Projects effects of changing the size
Misc Optimization Enhancements
Sort Avoidance Improvements
• Improved sort avoidance for DISTINCT
  – From V9, DISTINCT can avoid a sort using a duplicate (non-unique) index
    • Previously, DISTINCT required a unique index to avoid a sort
• Sort avoidance for GROUP BY
  – Order of GROUP BY columns re-arranged to match index
  – E.g. index on C1, C2
    • GROUP BY C2, C1
    • Sort required in V8
    • Sort avoided in V9
© 2003 IBM Corporation90 10/3/2008
Sort Improvements
• Reduced workfile usage for very small sorts
  – A final sort step requiring 1 page will NOT allocate a workfile
• More efficient sort with FETCH FIRST clause
  – V8 and prior:
    • Sort would continue to completion
    • Then return only the requested ‘n’ rows
  – From V9:
    • If the requested ‘n’ rows will fit into a 32K page:
      – As the data is scanned:
        > Only the top ‘n’ rows are kept in memory
        > Order of the rows is tracked
        > No requirement for final sort
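The V9 behaviour for FETCH FIRST can be pictured as a bounded, always-ordered buffer that is maintained while the data is scanned. The sketch below is an illustration of that general technique, not DB2's internal code. The function name `top_n` and the use of `bisect` are choices made for this example.

```python
import bisect


def top_n(rows, n, key=lambda r: r):
    """Return the n smallest rows by key, in order, holding at most n rows in memory."""
    if n <= 0:
        return []
    kept_keys, kept_rows = [], []
    for row in rows:
        k = key(row)
        # Once the buffer is full, a row that does not beat the current last
        # entry can never be in the result, so it is discarded immediately.
        if len(kept_keys) == n and k >= kept_keys[-1]:
            continue
        pos = bisect.bisect_right(kept_keys, k)  # order is tracked on insert
        kept_keys.insert(pos, k)
        kept_rows.insert(pos, row)
        if len(kept_keys) > n:
            kept_keys.pop()
            kept_rows.pop()
    return kept_rows  # already ordered: no final sort step


print(top_n([5, 3, 9, 1, 7], 3))
```

Because the buffer never exceeds n rows, memory use is independent of the number of rows scanned, which is the point of the V9 change.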
Clusterratio Enhancement
• New Clusterratio formula in V9
  – Better awareness of prefetch range
  – More accurate CR for lower cardinality indexes
  – V9 adds a new statistic collected by RUNSTATS
    • DATAREPEATFACTOR helps the optimizer differentiate clustering from a sequential data pattern
• RUNSTATS required in V9 before mass REBIND
  – As a migration step
Sequential Access
• Sequential prefetch only used for table space scan in V9
  – Dynamic prefetch used instead for other access paths
    • Dynamic prefetch tracks sequential access at runtime
    • Sequential prefetch is based upon bind/prepare prediction
      – At runtime, data may not be page sequential
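Tracking sequential access at runtime can be sketched as a sliding window over recent page requests: if enough of them moved a short distance forward from the previous page, the access pattern is treated as sequential and prefetch becomes worthwhile. The class name, window size, threshold and gap limit below are illustrative assumptions for this sketch, not DB2's documented parameters.

```python
from collections import deque


class SequentialDetector:
    """Toy runtime detector for page-sequential access (all parameters assumed)."""

    def __init__(self, window=8, threshold=5, max_gap=16):
        self.flags = deque(maxlen=window)  # True for each recent "sequential" step
        self.threshold = threshold
        self.max_gap = max_gap
        self.last_page = None

    def access(self, page):
        """Record a page access; return True if prefetch would be triggered."""
        if self.last_page is not None:
            gap = page - self.last_page
            self.flags.append(0 < gap <= self.max_gap)
        self.last_page = page
        return sum(self.flags) >= self.threshold


detector = SequentialDetector()
sequential = [detector.access(p) for p in range(100, 110)]

detector = SequentialDetector()
scattered = [detector.access(p) for p in [500, 3, 900, 42, 77, 1000, 5, 600]]

print(any(sequential), any(scattered))
```

A bind-time prediction cannot react this way: it is fixed before the first page is read, whether or not the data turns out to be page sequential.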
Parallelism Enhancements
• In V8
  – Lowest cost is BEFORE parallelism
• In DB2 9
  – Lowest cost is AFTER parallelism
    • Only a subset of plans are considered for parallelism
[Diagram: Optimizer → Parallelism; labels "One lowest cost plan survives" and "How to parallelize these plans?"]
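The difference between the two approaches can be shown with a toy cost table. In this sketch the plan names, the costs, and the rule for choosing the subset (the `keep` cheapest sequential plans) are all hypothetical. The slides only say that a subset of plans is considered, not how it is selected.

```python
# Hypothetical plans: (name, sequential cost, cost after parallelism)
plans = [
    ("index scan", 100, 90),
    ("table space scan", 120, 40),
    ("hybrid join", 150, 60),
]


def choose_v8(plans):
    """Lowest cost BEFORE parallelism: one plan survives, then it is parallelized."""
    return min(plans, key=lambda p: p[1])


def choose_v9(plans, keep=2):
    """Lowest cost AFTER parallelism, evaluated over a subset of candidate plans."""
    candidates = sorted(plans, key=lambda p: p[1])[:keep]  # assumed subset rule
    return min(candidates, key=lambda p: p[2])


print("V8 picks:", choose_v8(plans)[0])
print("DB2 9 picks:", choose_v9(plans)[0])
```

With these made-up numbers, the V8-style choice commits to the index scan (final cost 90), while the DB2 9-style choice finds that the table space scan parallelizes better (final cost 40).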
Additional Parallelism Enhancements
• In V8
  – Degree cut on leading table (exception: star join)
• In DB2 9
  – Degree can be cut on a non-leading table
    • Benefit for leading workfile, 1-row table, etc.
  – Histogram statistics exploited for more even distribution
    • For index access with NPI
  – CPU-bound query degree <= # of CPUs * 4
    • <= # of CPUs in V8
ORDER BY & FETCH FIRST in subqueries
• ORDER BY can be wrapped inside additional SQL
• ORDER BY and FETCH FIRST n ROWS ONLY in subselect / fullselect
  – Ability to select the top n rows

(SELECT * FROM T1 ORDER BY C1)
UNION ALL
(SELECT * FROM T2 ORDER BY C2)

SELECT EMP_ACT.EMPNO, PROJNO
FROM EMP_ACT
WHERE EMP_ACT.EMPNO IN
  (SELECT EMPLOYEE.EMPNO
   FROM EMPLOYEE
   ORDER BY SALARY DESC
   FETCH FIRST 3 ROWS ONLY)
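To show what the second query returns, the sketch below evaluates the same logic over small in-memory lists. The employee numbers, salaries and project codes are invented sample data, and plain Python stands in for the database. It models the semantics of the query, not how DB2 executes it.

```python
# Hypothetical sample data.
employee = [  # (EMPNO, SALARY)
    ("000010", 52750), ("000020", 41250), ("000030", 38250),
    ("000050", 40175), ("000060", 32250),
]
emp_act = [  # (EMPNO, PROJNO)
    ("000010", "AD3100"), ("000020", "PL2100"), ("000030", "IF1000"),
    ("000050", "OP1000"), ("000060", "MA2110"),
]

# Subselect: ORDER BY SALARY DESC FETCH FIRST 3 ROWS ONLY
by_salary = sorted(employee, key=lambda e: e[1], reverse=True)
top3 = {empno for empno, _ in by_salary[:3]}

# Outer query: WHERE EMP_ACT.EMPNO IN (subselect)
result = [(empno, projno) for empno, projno in emp_act if empno in top3]
print(result)
```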
Merge
• MERGE
  – A new SQL DML statement in DB2 9 for z/OS
  – Combines UPDATE and INSERT operations into one statement
• SELECT FROM MERGE
  – A SELECT statement
  – Shows updated/inserted rows
  – Including DB2 generated values
• SELECT FROM UPDATE/DELETE
  – New in V9; V8 already has SELECT FROM INSERT
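The semantics of MERGE, and of selecting from it, can be modeled with a dictionary standing in for a table keyed on its matching column. This is a conceptual sketch only: the function name `merge`, the data, and the returned action label are assumptions made for illustration, and it does not show DB2 syntax.

```python
def merge(table, source_rows):
    """Update matching keys, insert the rest, and return the affected rows.

    Returning the affected rows mirrors the idea of SELECT FROM MERGE.
    """
    affected = []
    for key, value in source_rows:
        action = "UPDATE" if key in table else "INSERT"
        table[key] = value
        affected.append((key, value, action))
    return affected


# Hypothetical table: item id -> quantity
inventory = {"A1": 10, "B2": 4}
changes = merge(inventory, [("A1", 12), ("C3", 7)])
print(changes)
print(inventory)
```

Without MERGE, an application would typically attempt one operation, check the outcome, and then issue the other. Combining them removes that extra round trip.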
SQL Family Compatibility SQL
[Diagram: Venn diagrams of result sets R1 and R2 illustrating INTERSECT, UNION and EXCEPT]
• INTERSECT
• EXCEPT
• OLAP functions
  – RANK
  – DENSE_RANK
  – ROW_NUMBER
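The three OLAP functions differ only in how they number tied values, which is easiest to see side by side. The sketch below computes all three for a descending ordering, and uses Python sets to mirror INTERSECT and EXCEPT (which, without ALL, return distinct rows). The function name `olap_numbers` and the data are illustrative.

```python
def olap_numbers(values):
    """Return (value, ROW_NUMBER, RANK, DENSE_RANK) for values ordered descending."""
    out = []
    rank = dense = 0
    prev = object()  # sentinel that compares unequal to every value
    for row_number, v in enumerate(sorted(values, reverse=True), start=1):
        if v != prev:
            rank = row_number  # RANK skips ahead after ties
            dense += 1         # DENSE_RANK never leaves gaps
            prev = v
        out.append((v, row_number, rank, dense))
    return out


for row in olap_numbers([90, 80, 90, 70]):
    print(row)

# Set operators on two result sets R1 and R2.
r1, r2 = {1, 2, 3}, {2, 3, 4}
print("INTERSECT:", sorted(r1 & r2))
print("EXCEPT:", sorted(r1 - r2))
```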
Spatial Support
DB2 Spatial Support
• Seamless integration with DB2
• Spatial data types
  – ST_Point, ST_LineString, ST_Polygon, etc.
• Spatial functions
  – ST_Contains, ST_Distance, ST_Intersects, etc.
• Spatial stored procedures
  – Administration of coordinate and reference systems
• Implements the Open Geospatial Consortium (OGC) SQL specification and the ISO SQL/MM spatial standard for types and functions
Examples of spatial applications
• Insurance: Generate quote based on geographic location and risk assessment
• Retail: Display customers around a store to determine areas of market penetration
• Real Estate: Locate properties around a school
• Utilities: Broker power based on demand and delivery cost
• Communications: Locate cell phone towers based on call history
Steps to using spatial functions
• 1. Enable the database for spatial.
  – Adds the spatial data types and functions and identifies the available coordinate systems and other spatial metadata.
• 2. Enable a table for spatial.
  – Identify the spatial column in the table.
  – Create a UDF to call an external Web services geocoder, and triggers to maintain the spatial columns.
• 3. Create a spatial index.
  – The DSN5SCLP ODBC program invokes stored procedures for administrative tasks.
• 4. Submit the queries.
  – Using any SQL generating or accepting application.
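As a rough picture of what a step 4 query computes for a scenario such as locating properties around a school, the sketch below filters points by distance from a center. It uses planar Euclidean distance as a stand-in for a spatial distance function and ignores coordinate and reference systems entirely. The coordinates, names and the helper `within` are hypothetical.

```python
import math

# Hypothetical planar coordinates.
school = (0.0, 0.0)
properties = {"P1": (3.0, 4.0), "P2": (10.0, 0.0), "P3": (-1.0, 1.0)}


def within(center, points, radius):
    """Names of points whose straight-line distance from center is <= radius."""
    return sorted(
        name
        for name, (x, y) in points.items()
        if math.hypot(x - center[0], y - center[1]) <= radius
    )


nearby = within(school, properties, 5.0)
print(nearby)
```

In a spatially enabled database the same filter would be expressed in SQL against a spatial column, with the spatial index from step 3 limiting how many rows need the distance test.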
Text Search Support
OmniFind Text Search Support: 12/2007
• CONTAINS() built-in function for text search
• Search CHAR, VARCHAR, LOB, & XML columns
• OmniFind provides a text index server
• Efficient communication interactions with DB2 for z/OS
• OmniFind text indexes are persisted into DB2 tables for backup/recovery purposes
[Diagram: an application invokes the SQL API of DB2 members in a Parallel Sysplex, which communicate over TCP/IP with OmniFind servers]
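The idea behind a text index can be sketched with a toy inverted index: each word maps to the rows that contain it, so a keyword search becomes a set lookup instead of a scan of every column value. This is a generic illustration and not OmniFind's implementation. The function names, the tokenization and the sample notes are assumptions.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for row_id, text in docs.items():
        for word in text.lower().split():
            index[word.strip(".,!?")].add(row_id)
    return index


def contains(index, *terms):
    """Row ids whose text contains every search term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


# Hypothetical rows: id -> free-text column value
notes = {
    1: "Water damage claim, kitchen flooded.",
    2: "Windshield claim after hail.",
    3: "Customer asked about kitchen remodel.",
}
index = build_index(notes)
print(contains(index, "kitchen", "claim"))
```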
Drivers for text search solution
• Customer demand
• Problems with prior text extender offerings
• OmniFind index and search technology
Customer search scenarios
• DB2 for z/OS table with catalog item descriptions
  – VARCHAR column, average 256 bytes
  – Online web searching with familiar interface
• DB2 for z/OS table with insurance agent notes
  – CLOB column, average 1K bytes
  – Agent remembers a claim but not who made it!
• DB2 for z/OS table with item names
  – CHAR column, padded to 80 bytes, 400K+ rows
  – Find items with keyword ordered by a customer