Data Compression in Oracle
Carl Dudley
University of Wolverhampton, UK
UKOUG SIG Director
Main Topics
• Oracle 9i and 10g Compression - major features
• Compression in data warehousing
• Sampling the data to predict compression
• Pre-sorting the data for compression
• Behaviour of DML/DDL on compressed tables
• Compression Internals
• Advanced Compression in Oracle11g (for OLTP operations)
• Shrinking unused space
Oracle Data Compression – Main Features
Compression : Characteristics
• Trades physical I/O against CPU utilization
– Transparent to applications
– Can increase I/O throughput and buffer cache capacity
• Useful for 'read mostly' applications
– Decision Support and OLAP
• Compression is performed only when Oracle considers it worthwhile
– Depends on column length and amount of duplication
• Compression algorithms have caused little change to the kernel code
– Modifications only to block formatting and to accessing rows and columns
– No compression within individual column values or across blocks
• Blocks are held in compressed format in the buffer cache
Getting Compressed
• Building a new compressed table
CREATE TABLE <table_name> ...
COMPRESS;
CREATE TABLE <table_name> COMPRESS AS SELECT ...
• Altering an existing table to be compressed
ALTER TABLE <table_name> MOVE COMPRESS;
– No additional copy is created, but temp space and an exclusive table-level lock are required for the compression activity
ALTER TABLE <table_name> COMPRESS;
– Future bulk inserts may be compressed – existing data is not
• Compressing individual partitions
ALTER TABLE <table_name>
MOVE PARTITION <partition_name> COMPRESS;
– Existing data and future bulk inserts are compressed in the specified partition
Tablespace Level Compression
• Entire tablespaces can be compressed by default
– All objects in the tablespace will be compressed by default
CREATE | ALTER TABLESPACE <tablespace_name>
DEFAULT [ COMPRESS | NOCOMPRESS ] ...
Compressing Table Data
• Uncompressed conventional emp table
CREATE TABLE emp
(empno NUMBER(4)
,ename VARCHAR2(12)
...);
• Compressed emp table
CREATE TABLE emp
(empno NUMBER(4)
,ename VARCHAR2(12)
...)
COMPRESS;
• Could consider sorting the data on columns which lend themselves to compression
Table Data Compression
• Uncompressed emp table
7369 CLERK 2000 1550 ACCOUNTING
7782 MANAGER 4975 1600 PLANT
7902 ANALYST 4000 2100 OPERATIONS
7900 CLERK 2750 1500 OPERATIONS
7934 CLERK 2200 1200 ACCOUNTING
7654 PLANT 3000 1100 RESEARCH
• Compressed emp table
– Similar to index compression
[SYMBOL TABLE] [A]=CLERK, [B]=ACCOUNTING, [C]=PLANT, [D]=OPERATIONS
7369 [A] 2000 1550 [B]
7782 MANAGER 4975 1600 [C]
7902 ANALYST 4000 2100 [D]
7900 [A] 2750 1500 [D]
7934 [A] 2200 1200 [B]
7654 [C] 3000 1100 RESEARCH
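The per-block symbol table can be modelled with a short Python sketch (illustrative only – the token scheme and the `compress_block`/`decompress_block` helpers are invented here; Oracle's real on-block format appears in the dump slides later):

```python
# Toy model of block-level symbol-table compression: values that repeat
# within a block are stored once in a symbol table and each occurrence is
# replaced by a short token, as in the [A]/[B] picture above.
from collections import Counter

def compress_block(rows, min_repeats=2):
    """Replace values repeated >= min_repeats times with symbol-table tokens."""
    counts = Counter(v for row in rows for v in row)
    # Only duplicated values earn a symbol-table entry - compression is
    # performed only where it is worthwhile
    symbols = {v: i for i, (v, c) in enumerate(counts.items()) if c >= min_repeats}
    packed = [[("SYM", symbols[v]) if v in symbols else v for v in row]
              for row in rows]
    return symbols, packed

def decompress_block(symbols, packed):
    """Rebuild the original rows from the symbol table and tokens."""
    lookup = {i: v for v, i in symbols.items()}
    return [[lookup[t[1]] if isinstance(t, tuple) else t for t in row]
            for row in packed]

rows = [[7369, 'CLERK',   2000, 'ACCOUNTING'],
        [7782, 'MANAGER', 4975, 'PLANT'],
        [7900, 'CLERK',   2750, 'OPERATIONS'],
        [7934, 'CLERK',   2200, 'ACCOUNTING']]
symbols, packed = compress_block(rows)
assert decompress_block(symbols, packed) == rows   # lossless round trip
```

Only CLERK and ACCOUNTING repeat in this small sample, so only they enter the symbol table; one-off values such as MANAGER stay in place, mirroring the RESEARCH value left uncompressed above.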
Observing Compression Information
• Compressed tables are not identified in user_tables in Oracle9i
– Fixed in Oracle10g
• Create a new view called my_user_tables by modifying the create view statement for user_tables in the catalog.sql file (the COMPRESSED column and its DECODE expression below are the additions)
• Create a public synonym for the new view and grant public access to it
create or replace view MY_USER_TABLES
(TABLE_NAME, TABLESPACE_NAME, CLUSTER_NAME, IOT_NAME,
PCT_FREE, PCT_USED,
:
GLOBAL_STATS, USER_STATS, DURATION, SKIP_CORRUPT, MONITORING,
CLUSTER_OWNER, DEPENDENCIES,COMPRESSED)
as
select o.name, decode(bitand(t.property, 4194400), 0, ts.name, null),
decode(bitand(t.property, 1024), 0, null, co.name),
:
decode(bitand(t.flags, 8388608), 8388608, 'ENABLED', 'DISABLED'),
decode(bitand(s.spare1, 2048), 2048, 'ENABLED', 'DISABLED')
from sys.ts$ ts, sys.seg$ s, sys.obj$ co, sys.tab$ t, sys.obj$ o,
sys.obj$ cx, sys.user$ cu
:
Table Data Compression (continued)
SELECT COUNT(ename) uncompressed FROM ec;

UNCOMPRESSED
------------
      229376
Elapsed: 00:00:04.00

SELECT COUNT(ename) compressed FROM ec1;

COMPRESSED
----------
    229376
Elapsed: 00:00:02.07

• Scanning the compressed table is much faster
– Compressing can significantly reduce disk I/O - good for queries
• Possible increase in CPU activity
• Must unravel the compression but logical reads will be reduced

SELECT table_name,compressed,num_rows FROM my_user_tables;

TABLE_NAME COMPRESSED NUM_ROWS
---------- ---------- --------
EC         DISABLED     229376
EC1        ENABLED      229376
Space Reduction Due to Compression
• Space usage summary

Statistics for table EC1
Free Blocks.............................0
Total Blocks............................384
Total Bytes.............................3145728
Unused Blocks...........................46
Unused Bytes............................376832
Last Used Ext FileId....................1
Last Used Ext BlockId...................54025
Last Used Block.........................82

Statistics for table EC
Free Blocks.............................1
Total Blocks............................1438
Total Bytes.............................11534336
Unused Blocks...........................58
Unused Bytes............................475136
Last Used Ext FileId....................1
Last Used Ext BlockId...................53641
Last Used Block.........................70

– Summary routine adapted from Tom Kyte's example
Compression in Data Warehousing
Compression in Data Warehousing
• Fact tables are good candidates for compression
– Large and have repetitive values
– Repetitive data tends to be clustered
• More clustered than in TPC-H tables
• Dimension tables are often too small for compression
• Large block size leads to greater compression
– Typical in data warehouses
– More rows available for compression within each block
• Materialized views can be compressed (and partitioned)
– Naturally sorted on creation due to GROUP BY
– Especially good for ROLLUP views and join views
• Tend to contain repetitive data
Compression of Individual Table Partitions
• Partition level
– Partitioning must be range or list (or composite)
– Only the first partition in the example below is compressed
– Could consider compressing read-only partitions of historical data
CREATE TABLE sales
(sales_id NUMBER(8)
: :
,sales_date DATE)
PARTITION BY RANGE(sales_date)
(PARTITION sales_jan2003 VALUES LESS THAN
(TO_DATE('02/01/2003','DD/MM/YYYY')) COMPRESS,
PARTITION sales_feb2003 VALUES LESS THAN
(TO_DATE('03/01/2003','DD/MM/YYYY')),
PARTITION sales_mar2003 VALUES LESS THAN
(TO_DATE('04/01/2003','DD/MM/YYYY')),
PARTITION sales_apr2003 VALUES LESS THAN
(TO_DATE('05/01/2003','DD/MM/YYYY')));
Effect of Partition Operations
• Consider individual partitions compressed as shown
PARTITION p1 COMPRESS VALUES LESS THAN 100
PARTITION p2 COMPRESS VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS VALUES LESS THAN 400
• Splitting a compressed partition
ALTER TABLE s1 SPLIT PARTITION p1 AT (50)
INTO (PARTITION p1a, PARTITION p1b);
– Produces two new compressed partitions
PARTITION p1a COMPRESS VALUES LESS THAN 50
PARTITION p1b COMPRESS VALUES LESS THAN 100
PARTITION p2 COMPRESS VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS VALUES LESS THAN 400
Effect of Partition Operations (contd)
• Effect of merging compressed partitions
PARTITION p1a COMPRESS VALUES LESS THAN 50
PARTITION p1b COMPRESS VALUES LESS THAN 100
PARTITION p2 COMPRESS VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS VALUES LESS THAN 400
• Merge of two compressed partitions
ALTER TABLE s1 MERGE PARTITIONS p1b,p2
INTO PARTITION p1b_2;
– New partition p1b_2 is not compressed by default
• Same applies if any of the partitions to be merged are initially uncompressed
PARTITION p1a COMPRESS VALUES LESS THAN 50
PARTITION p1b_2 NOCOMPRESS VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS VALUES LESS THAN 400
Forcing Compression During Partition Maintenance
• Force compression of the new partition after a merge operation
ALTER TABLE s1 MERGE PARTITIONS p2,p3
INTO PARTITION p2_3 COMPRESS;
• Force compression of the new partition(s) after a split operation
ALTER TABLE s1 SPLIT PARTITION p1 AT (50)
INTO (PARTITION p1a COMPRESS,PARTITION p2a);
• Partitions may be empty or contain data during maintenance operations involving compression
Effect of Partitioned Bitmap Indexes
• Scenario:
– A table having no compressed partitions has locally partitioned bitmap indexes
– The presence of usable bitmap indexes will prevent the first operation that compresses a partition
SQL> ALTER TABLE sales MOVE PARTITION p4 COMPRESS;
ORA-14646: Specified alter table operation involving compression
cannot be performed in the presence of usable bitmap indexes
SQL> ALTER TABLE part2 SPLIT PARTITION p1a AT (25)
INTO ( PARTITION p1c COMPRESS,PARTITION p1d);
ORA-14646: Specified alter table operation involving compression
cannot be performed in the presence of usable bitmap indexes
Compression of Partitions with Bitmap Indexes in Place
• Uncompressed partitioned table with bitmap index in 3 partitions
CREATE TABLE emp_part
PARTITION BY RANGE (deptno)
(PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (21),
PARTITION p3 VALUES LESS THAN (31))
AS SELECT * FROM emp;
CREATE BITMAP INDEX part$empno
ON emp_part(empno) LOCAL;
Compression of Partitions with Bitmap Indexes in Place (continued)
• First compression operation requires the following
1. Mark bitmap indexes unusable (or drop them)
ALTER INDEX part$empno UNUSABLE;
2. Compress the first (and any subsequent) partition as required
ALTER TABLE emp_part MOVE PARTITION p1 COMPRESS;
3. Rebuild the bitmap indexes (or recreate them)
ALTER INDEX part$empno REBUILD PARTITION p1;
ALTER INDEX part$empno REBUILD PARTITION p2;
ALTER INDEX part$empno REBUILD PARTITION p3;
– Each index partition must be individually rebuilt
Compression of Partitions with Bitmap Indexes in Place (continued)
• Oracle needs to know the maximum number of records per block
– Correct mapping of bits to blocks can then be done
– On compression this value increases
• Oracle has to rebuild bitmaps to accommodate the potentially larger number of values even if no data is present in the partition(s)
– Could result in larger bitmaps for uncompressed partitions
• Increase in size can be offset by the actual compression
• Once rebuilt, the indexes can cope with any compression
– Subsequent compression operations do not invalidate bitmap indexes
• Recommended to create each partitioned table with at least one compressed (dummy/empty?) partition
– Can be subsequently dropped
• Compression activity does not affect B-tree usability
Table Level Compression for Partitioned Tables
• Compression can be the default for all partitions
CREATE TABLE sales
(sales_id NUMBER(8),
: :
sales_date DATE)
COMPRESS
PARTITION BY RANGE (sales_date)
...
– Can still specify individual partitions to be NOCOMPRESS
• Default partition maintenance actions on compressed tables
– Splitting non-compressed partitions results in non-compressed partitions
– Merging non-compressed partitions results in a compressed partition
– Adding a partition will result in a new compressed partition
– Moving a partition does not alter its compression
Finding the Largest Tables
SELECT owner
,name
,SUM(gb)
,SUM(pct)
FROM (SELECT owner
,name
,TO_CHAR(gb,'999.99') gb
,TO_CHAR((RATIO_TO_REPORT(gb) OVER())*100,'999,999,999.99') pct
FROM (SELECT owner
,SUBSTR(segment_name,1,30) name
,SUM(bytes/(1024*1024*1024)) gb
FROM dba_segments
WHERE segment_type IN ('TABLE','TABLE PARTITION')
GROUP BY owner
,segment_name
)
)
WHERE pct > 3
GROUP BY ROLLUP(owner
,name)
ORDER BY 3;
• Useful for finding candidates for compression
Finding the Largest Tables (contd)
OWNER NAME SUM(GB) SUM(PCT)
------------- -------------- ------------- ----------
SH COSTS .03 8.23
SH SALES .05 14.44
SH SALES_HIST .13 32.93
SH .21 55.61
SYS IDL_UB2$ .01 3.86
SYS SOURCE$ .02 6.43
SYS .03 10.29
.24 65.90
Sampling Data to Predict Compression
Compression Factor and Space Saving
• Compression Factor (CF)

CF = non-compressed blocks / compressed blocks

• Space Saving (SS)

SS = ((non-compressed blocks - compressed blocks) / non-compressed blocks) * 100
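These two measures can be written as small functions; the 1438 and 384 block counts below are the EC/EC1 figures from the earlier space-usage slide (the function names are ours):

```python
# CF and SS computed from segment block counts
def compression_factor(uncompressed_blocks, compressed_blocks):
    """CF: how many times smaller the compressed segment is."""
    return uncompressed_blocks / compressed_blocks

def space_saving(uncompressed_blocks, compressed_blocks):
    """SS: percentage of the original space that is released."""
    return (uncompressed_blocks - compressed_blocks) / uncompressed_blocks * 100

print(round(compression_factor(1438, 384), 2))   # 3.74
print(round(space_saving(1438, 384), 1))         # 73.3
```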
Predicting the Compression Factor
CREATE OR REPLACE FUNCTION compression_ratio (tabname VARCHAR2)
RETURN NUMBER IS
pct NUMBER := 0.000099; -- sample percentage
blkcnt NUMBER := 0; -- original block count (should be < 10K)
blkcntc NUMBER; -- compressed block count
BEGIN
EXECUTE IMMEDIATE ' CREATE TABLE temp_uncompressed PCTFREE 0 AS SELECT *
FROM ' || tabname || ' WHERE ROWNUM < 1';
WHILE ((pct < 100) AND (blkcnt < 1000)) LOOP -- until > 1000 blocks in sample
EXECUTE IMMEDIATE 'TRUNCATE TABLE temp_uncompressed';
EXECUTE IMMEDIATE 'INSERT INTO temp_uncompressed SELECT * FROM ' ||
tabname || ' SAMPLE BLOCK (' || pct || ',10)';
EXECUTE IMMEDIATE 'SELECT COUNT(DISTINCT(dbms_rowid.rowid_block_number(rowid)))
FROM temp_uncompressed' INTO blkcnt;
pct := pct * 10;
END LOOP;
EXECUTE IMMEDIATE 'CREATE TABLE temp_compressed COMPRESS AS SELECT * FROM
temp_uncompressed';
EXECUTE IMMEDIATE 'SELECT COUNT(DISTINCT(dbms_rowid.rowid_block_number(rowid)))
FROM temp_compressed' INTO blkcntc;
EXECUTE IMMEDIATE 'DROP TABLE temp_compressed';
EXECUTE IMMEDIATE 'DROP TABLE temp_uncompressed';
RETURN (blkcnt/blkcntc);
END;
/
Predicting the Compression Factor (continued)
• Run the compression test for the emp table
CREATE OR REPLACE PROCEDURE compress_test(p_comp VARCHAR2)
IS
comp_ratio NUMBER;
BEGIN
comp_ratio := compression_ratio(p_comp);
dbms_output.put_line('Compression factor for table ' ||
p_comp ||' is '|| comp_ratio );
END;
/
EXEC compress_test('EMP')
Compression factor for table EMP is 1.6
Compression Test – Clustered Data
CREATE TABLE clust (col1
VARCHAR2(1000))
COMPRESS;
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('WW...WW');
: : :
INSERT INTO clust VALUES ('WW...WW');
INSERT INTO clust VALUES ('XX...XX');
: : :
INSERT INTO clust VALUES ('YY...YY');
: : :
INSERT INTO clust VALUES ('ZZ...ZZ');
CREATE TABLE noclust (col1
VARCHAR2(1000))
COMPRESS;
INSERT INTO noclust VALUES ('VV...VV');
INSERT INTO noclust VALUES ('WW...WW');
INSERT INTO noclust VALUES ('XX...XX');
INSERT INTO noclust VALUES ('YY...YY');
INSERT INTO noclust VALUES ('ZZ...ZZ');
INSERT INTO noclust VALUES ('VV...VV');
INSERT INTO noclust VALUES ('WW...WW');
INSERT INTO noclust VALUES ('XX...XX');
INSERT INTO noclust VALUES ('YY...YY');
INSERT INTO noclust VALUES ('ZZ...ZZ');
INSERT INTO noclust VALUES ('VV...VV');
: : :
INSERT INTO noclust VALUES ('ZZ...ZZ');
• Every value for column col1 is 390 bytes long
• Both tables have a total of 25 rows stored in blocks of size 2K
– So a maximum of four uncompressed rows will fit in each block
• Both have the same amount of repeated values but the clustering is different
Compression Test (continued)
• noclust - 4 rows per block (7 blocks in total)
– The 5th row to be inserted must go in the next block as it contains different data, so no value repeats within a block and nothing is compressed
• clust - 20 rows per block
– Rows 2,3,4,5 are duplicates of the first row in the block
– Rows 7,8,9,10 are duplicates of the 6th row in the block, and this pattern is repeated
– The residual space in the first block is used by the compressed data
[Figure: block layouts – noclust fills seven blocks with four distinct values each; clust packs runs of duplicate values into far fewer blocks]
Compression Test - Compression Factors
• Compression test routine is accurate due to sampling of actual data
– Make sure the default tablespace is correctly set
• Temporary sample tables are physically built for the testing

EXEC compress_test('NOCLUST')
Compression factor for table NOCLUST is 1

EXEC compress_test('CLUST')
Compression factor for table CLUST is 3.5
Testing Compression : Using Repeatable Sampling on Oracle9i
• Estimate the compression ratio for a table, abc
1. Make sampling repeatable:
ALTER SESSION
SET EVENTS '10193 trace name context forever, level 1';
– This statement appears to have no effect in Oracle10g
2. Drop any previously created test tables
DROP TABLE abc$test1;
DROP TABLE abc$test2;
Testing Compression : Further Example (continued)
3. Create an empty compressed table:
CREATE TABLE abc$test1 COMPRESS AS
SELECT * FROM abc WHERE ROWNUM < 1;
4. Create an empty uncompressed table:
CREATE TABLE abc$test2 NOCOMPRESS AS
SELECT * FROM abc WHERE ROWNUM < 1;
5. Place the same test data in each of the test tables
LOCK TABLE abc IN SHARE MODE;
INSERT /*+ APPEND */ INTO abc$test1
SELECT * FROM abc SAMPLE BLOCK(x,y);
INSERT /*+ APPEND */ INTO abc$test2
SELECT * FROM abc SAMPLE BLOCK(x,y);
– This example uses block level sampling
Testing Compression : Sampling Rows
• Tables can be sampled at row or block level
– Block level samples a random selection of whole blocks
– Row level (default) samples a random selection of rows
SELECT * FROM emp SAMPLE (10);
– Selects a 10% sample of rows
– If repeated, a different sample will be taken
• Samples can be 'fixed' in Oracle10g using SEED
– SEED can have integer values from 0 to 100
– Can also have higher numbers ending in '00'
SELECT * FROM emp SAMPLE (10) SEED (1);
– Shows a 10% sample of rows
– If repeated, the exact same sample will be taken
• Also applies to block level sampling
• The sample set will change if DML is performed on the table
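SEED behaves like seeding any pseudo-random generator: the same seed replays the same choices. A Python analogy (only an analogy – Oracle's sampling algorithm is not `random.sample`):

```python
# Repeatable sampling with a fixed seed, analogous to SAMPLE ... SEED (n)
import random

population = list(range(1000))          # stand-in for the table's rows

def sample_10pct(seed=None):
    rng = random.Random(seed)           # fixed seed => repeatable sample
    return rng.sample(population, len(population) // 10)

assert sample_10pct(seed=1) == sample_10pct(seed=1)   # same seed, same rows
# sample_10pct() with no seed is seeded from system entropy, so repeated
# calls draw different samples - like SAMPLE without SEED
```

As with SEED, the repeatability only holds while the underlying data is unchanged; modify the population and the "same" sample differs.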
Pre-Sorting the Data for Compression
Sorting the Data for Compression
• Reorganize (pre-sort) rows in segments that will be compressed to cause repetitive data within blocks
• Presort the data on a column which has : no. of distinct values ~ no. of blocks
• Information on column cardinality is shown in:
ALL_TAB_COL_STATISTICS
ALL_PART_COL_STATISTICS
ALL_SUBPART_COL_STATISTICS
• For multi-column tables, order the rows by the low cardinality column(s)
CREATE TABLE emp_comp COMPRESS AS
SELECT * FROM emp ORDER BY <some unselective column(s)>;
– For a single-column table, order the table rows by the column value
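Why pre-sorting helps can be seen with a toy cost model (the numbers are entirely invented for illustration: ten rows per block, and a per-block cost of one symbol-table entry per distinct value plus one token per row):

```python
# Toy model: block-level symbol tables only pay off when duplicates land
# in the same block, which is exactly what pre-sorting arranges.
ROWS_PER_BLOCK = 10

def blocks(rows):
    return [rows[i:i + ROWS_PER_BLOCK] for i in range(0, len(rows), ROWS_PER_BLOCK)]

def compressed_size(rows):
    # one symbol-table entry per distinct value per block + one token per row
    return sum(len(set(b)) + len(b) for b in blocks(rows))

jobs = ['CLERK', 'ANALYST', 'MANAGER', 'SALESMAN', 'PRESIDENT'] * 40  # 200 rows
interleaved = jobs            # the 5 distinct values recur in every block
presorted = sorted(jobs)      # duplicates clustered into runs

print(compressed_size(interleaved))   # 300: every block carries 5 symbols
print(compressed_size(presorted))     # 220: each block carries 1 symbol
```

Sorting any column whose duplicates then fill whole blocks beats the interleaved layout; the cename/cjob comparison on the following slides shows that the choice of sort column matters too.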
Sorting the Data for Compression
• Presort data on a column having no. of distinct values ~ no. of blocks

SELECT COUNT(DISTINCT ename) FROM large_emp;   -- 170 enames
SELECT COUNT(DISTINCT job) FROM large_emp;     -- 5 jobs
SELECT * FROM large_emp;                       -- (114368 rows)

EMPNO ENAME   JOB
----- ------- -------
43275 25***** CLERK
47422 128**** ANALYST
79366 6****** MANAGER
    :       :       :
Sorting the Data for Compression (continued)
• Sorting on the job column is not the most effective
Compressed table sorted on ename : Number of used blocks = 172
CREATE TABLE cename COMPRESS
AS SELECT empno,ename,job FROM large_emp ORDER BY ename;
Compressed table sorted on job : Number of used blocks = 243
CREATE TABLE cjob COMPRESS
AS SELECT empno,ename,job FROM large_emp ORDER BY job;
Non-compressed table : Number of used blocks = 360
CREATE TABLE nocomp
AS SELECT empno,ename,job FROM large_emp;
Behaviour of DML/DDL on Tables with Default Compression
Default Compressed Table Behaviour
• Updates, and conventional single and multi-row inserts are NOT compressed
– UPDATE
• Wholesale updates lead to large increases in storage (>250%)
• Performance impact on UPDATEs can be around 400%
• Rows are migrated to new blocks (default value of PCTFREE is 0)
– DELETE
• Performance impact of around 15% for compressed rows
• Creating a compressed table can take 50% longer
• Compressed tables cannot be modified in 9.2.0.1
ORA-22856 cannot add column to object tables
Default Compressed Table Behaviour (continued)
• Operations which 'perform' compression
– CREATE TABLE ... AS SELECT ...
– ALTER TABLE ... MOVE ...
– INSERT /*+ APPEND */ (single threaded)
– INSERT /*+ PARALLEL(sales,4) */
• Requires ALTER SESSION ENABLE PARALLEL DML;
• Both of the above inserts work with data from database tables and external tables
– SQL*Loader DIRECT = TRUE
– Various partition maintenance operations
• Can not be used for:
– LOB and VARRAY constructs
– Index organized tables
• Can use index compression on IOTs
– External tables, index clusters, hash clusters
– Bulk loads within PL/SQL
– Logical standby techniques
Compression Internals
Hexadecimal Dump of Compressed Data
[Figure: annotated hex dump of a compressed block – the symbol table holds 14 unique names and 5 unique jobs; an arrow marks the start of the next block]
Oracle Dump - Two Uncompressed Employee Rows

ALTER SYSTEM DUMP
DATAFILE 8 BLOCK 35;

• Creates a trace file in USER_DUMP_DEST

...
block_row_dump:
tab 0, row 0, @0x4a1
tl: 41 fb: --H-FL-- lb: 0x2 cc: 8
col 0: [ 3] c2 03 38
col 1: [ 5] 43 4c 41 52 4b
col 2: [ 7] 4d 41 4e 41 47 45 52
col 3: [ 3] c2 4f 28
col 4: [ 7] 77 b5 06 09 01 01 01
col 5: [ 3] c2 19 33
col 6: *NULL*
col 7: [ 2] c1 0b
tab 0, row 1, @0x4ca
tl: 40 fb: --H-FL-- lb: 0x2 cc: 8
col 0: [ 3] c2 03 39
col 1: [ 5] 53 43 4f 54 54
col 2: [ 7] 41 4e 41 4c 59 53 54
col 3: [ 3] c2 4c 43
col 4: [ 7] 77 bb 04 13 01 01 01
col 5: [ 2] c2 1f
col 6: *NULL*
col 7: [ 2] c1 15
...
Oracle Dump of Table of empno,ename,job

tl: 18 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 9] 50 52 45 53 49 44 45 4e 54       <- PRESIDENT
col 1: [ 4] 4b 49 4e 47                      <- KING
bindmp: 00 0b 02 d1 50 52 45 53 49 44 45 4e 54 cc 4b 49 4e 47
tab 0, row 1, @0x746
tl: 9 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 7] 41 4e 41 4c 59 53 54             <- ANALYST
col 1: [ 4] 46 4f 52 44                      <- FORD
bindmp: 00 0a 02 11 cc 46 4f 52 44
...
tab 0, row 13, @0x6cc
tl: 10 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 5] 43 4c 45 52 4b                   <- CLERK
col 1: [ 5] 53 4d 49 54 48                   <- SMITH
bindmp: 00 0b 02 0e cd 53 4d 49 54 48
tab 0, row 14, @0x780
tl: 8 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 5] 43 4c 45 52 4b                   <- CLERK
bindmp: 00 04 cd 43 4c 45 52 4b
...
tab 0, row 17, @0x761
tl: 10 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 7] 41 4e 41 4c 59 53 54             <- ANALYST
bindmp: 00 02 cf 41 4e 41 4c 59 53 54
tab 1, row 0, @0x6c3
tl: 9 fb: --H-FL-- lb: 0x0 cc: 3
col 0: [ 5] 43 4c 45 52 4b                   <- CLERK
col 1: [ 5] 53 4d 49 54 48                   <- SMITH
col 2: [ 3] c2 02 34
bindmp: 2c 00 02 02 0d cb c2 02 34
Oracle Avoids Unnecessary Compression
• Create two tables with repeating small values in one column

CREATE TABLE tnocomp (
col1 VARCHAR2(2)
,col2 VARCHAR2(6))
PCTFREE 0;

CREATE TABLE tcomp (
col1 VARCHAR2(2)
,col2 VARCHAR2(6))
COMPRESS;

• Insert data (320 rows) as follows
– Values repeat in col1 every 5 rows
– Values are unique in col2

COL1 COL2
---- ------
1A   1ZZZZZ
2A   2ZZZZZ
3A   3ZZZZZ
4A   4ZZZZZ
5A   5ZZZZZ
1A   6ZZZZZ
2A   7ZZZZZ
...
4A   319ZZZ
5A   320ZZZ
Evidence of Minimal Compression

SELECT dbms_rowid.rowid_block_number(ROWID) block
,dbms_rowid.rowid_relative_fno(ROWID) file
,COUNT(*) num_rows
FROM &table_name
GROUP BY dbms_rowid.rowid_block_number(ROWID)
,dbms_rowid.rowid_relative_fno(ROWID);

tnocomp
BLOCK FILE NUM_ROWS
----- ---- --------
   66    8      128
   67    8      132
   68    8       60

tcomp
BLOCK FILE NUM_ROWS
----- ---- --------
   34    8      126
   35    8      126
   36    8       68

• Evidence of compression in the 'compressed' table
Further Evidence of Compression

ALTER SYSTEM
DUMP DATAFILE 8
BLOCK 67;

block_row_dump:
tab 0, row 0, @0x783
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 31 41
bindmp: 00 1b ca 31 41
tab 0, row 1, @0x77e
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 32 41
bindmp: 00 1b ca 32 41
tab 0, row 2, @0x779
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 33 41
bindmp: 00 1b ca 33 41
tab 0, row 3, @0x774
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 34 41
bindmp: 00 1a ca 34 41
tab 0, row 4, @0x76f
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 35 41
bindmp: 00 19 ca 35 41
tab 1, row 0, @0x763
tl: 12 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 2] 31 41
col 1: [ 6] 31 30 31 30 30 30
bindmp: 2c 00 02 02 00 c9 31 30 31 30 30 30
tab 1, row 1, @0x757
tl: 12 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 2] 32 41
col 1: [ 6] 31 30 32 30 30 30
bindmp: 2c 00 02 02 01 c9 31 30 32 30 30 30

– HEX(A) = 41
– Symbol table (tab 0) holds 1A, 2A, 3A, 4A, 5A
Compression not Performed on Unsuitable Data
• Both tables recreated with values in col1 now set to TO_CHAR(MOD(ROWNUM,50))
– Much less repetition of values (only every 50 rows) allowing less compression

COL1 COL2
---- ------
1    1ZZZZZ
2    2ZZZZZ
3    3ZZZZZ
4    4ZZZZZ
5    5ZZZZZ
...
49   49ZZZZ
50   50ZZZZ
1    51ZZZZ
2    52ZZZZ
3    53ZZZZ

tnocomp
BLOCK FILE NUM_ROWS
----- ---- --------
   66    8      128
   67    8      128
   68    8       64

tcomp - Oracle decides not to compress
BLOCK FILE NUM_ROWS
----- ---- --------
   34    8      128
   35    8      128
   36    8       64
Comparison of Heap and IOT Compression
IOT = Index Organized table
Comparison of IOT and Heap Tables
• Tests constructed using a standard set of data in emptest
– Six columns with no nulls

EMPNO  ENAME      JOB       HIREDATE  SAL    DEPTNO
------ ---------- --------- --------- ------ ------
1 KING PRESIDENT 17-NOV-81 5000 10
2 FORD ANALYST 03-DEC-81 3000 20
3 SCOTT ANALYST 09-DEC-82 3000 20
4 JONES MANAGER 02-APR-81 2975 20
5 BLAKE MANAGER 01-MAY-81 2850 30
6 CLARK MANAGER 09-JUN-81 2450 10
7 ALLEN SALESMAN 20-FEB-81 1600 30
8 TURNER SALESMAN 08-SEP-81 1500 30
9 MILLER CLERK 23-JAN-82 1300 10
10 WARD SALESMAN 22-FEB-81 1250 30
11 MARTIN SALESMAN 28-SEP-81 1250 30
12 ADAMS CLERK 12-JAN-83 1100 20
13 JAMES CLERK 03-DEC-81 950 30
14 SMITH CLERK 17-DEC-80 800 20
15 KING PRESIDENT 17-NOV-81 5000 10
16 FORD ANALYST 03-DEC-81 3000 20
... ... ... ... ... ...
229376 SMITH CLERK 17-DEC-80 800 20
Creation of IOTs
• empi : Conventional IOT based on emptest data
CREATE TABLE empi (empno,ename,job,hiredate,sal,deptno,
CONSTRAINT pk_empi PRIMARY KEY(ename,job,hiredate,sal,deptno,empno))
ORGANIZATION INDEX
PCTFREE 0
AS SELECT * FROM emptest;
• empic : First five columns of the primary key compressed
CREATE TABLE empic (empno,ename,job,hiredate,sal,deptno,
CONSTRAINT pk_empic PRIMARY KEY(ename,job,hiredate,sal,deptno,empno))
ORGANIZATION INDEX
PCTFREE 0
COMPRESS 5
AS SELECT * FROM emptest;
Test tables
• Four tables built having heap/IOT structures and compressed/noncompressed data
• Unique index built on empno column in emph and emphc tables
• Average row length obtained from user_tables (avg_row_len)
– Compressed heap tables show no reduction in average row length

Table  Table  Compress  Blocks  Average
Name   Type                     row length
------ -----  --------  ------  ----------
emph   Heap   No          1092          36
emphc  Heap   Yes          361          36
empi   IOT    No          1077          35
empic  IOT    Yes          261           7
Compression Data
• Number of blocks in IOTs obtained from index validation

VALIDATE INDEX &index_name;
SELECT * FROM index_stats;

• Number of blocks in heap tables obtained using dbms_rowid

SELECT COUNT(COUNT(*))
FROM &table_name
GROUP BY dbms_rowid.rowid_block_number(ROWID)
,dbms_rowid.rowid_relative_fno(ROWID);

• Compressed IOTs have compression shown as DISABLED in user_tables, but ENABLED in user_indexes
Timings to Scan Tables
SELECT COUNT(deptno) FROM <table_name>;
Table Table Compress Elapsed
Name Type Time (s)
emph Heap No 0.70
emphc Heap Yes 0.47
empi IOT No 0.37
empic IOT Yes 0.21
Access Paths for Heap Table
EXPLAIN PLAN FOR SELECT * FROM emph WHERE EMPNO > 193387;
SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 35989| 984K| 252 (2)|
| 1 | TABLE ACCESS BY INDEX ROWID| EMPH | 35989| 984K| 252 (2)|
|* 2 | INDEX RANGE SCAN | PK_EMPH | 35989| | 78 (3)|
----------------------------------------------------------------------------
EXPLAIN PLAN FOR SELECT * FROM emph WHERE EMPNO > 193386;
SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------
| 0 | SELECT STATEMENT | | 35990| 984K| 252 (5)|
|* 1 | TABLE ACCESS FULL| EMPH | 35990| 984K| 252 (5)|
---------------------------------------------------------------
Access Paths for Compressed Heap Table
EXPLAIN PLAN FOR SELECT * FROM emphc WHERE EMPNO > 206863;
SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------
| Id | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |          | 22513 |  615K |    86   (3)|
|  1 | TABLE ACCESS BY INDEX ROWID | EMPHC    | 22513 |  615K |    86   (3)|
|* 2 | INDEX RANGE SCAN            | PK_EMPHC | 22513 |       |    49   (3)|
------------------------------------------------------------------------------

EXPLAIN PLAN FOR SELECT * FROM emphc WHERE EMPNO > 206862;
SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
| Id | Operation         | Name  | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------
|  0 | SELECT STATEMENT  |       | 22514 |  642K |    91  (11)|
|* 1 | TABLE ACCESS FULL | EMPHC | 22514 |  642K |    91  (11)|
--------------------------------------------------------------------

• Note that a full table scan is (rightly) more likely to be performed
Access Path for IOTs
EXPLAIN PLAN FOR SELECT * FROM empi WHERE EMPNO > 200000;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 29376| 803K| 238 (1)|
|* 1 | INDEX FAST FULL SCAN| PK_EMPI | 29376| 803K| 238 (1)|
---------------------------------------------------------------------
Updates to Heap and IOT Tables
UPDATE <table_name> SET ename = 'XXXXXXX';

Table   Table   Compress   Blocks before   Blocks after   Increase    PCT        Elapsed
Name    Type               update          update         in blocks   increase   Time
-----   -----   --------   -------------   ------------   ---------   --------   -------
Emph    Heap    No                  1092           1280         188     ~ 10%     5 mins
Emphc   Heap    Yes                  361           2291        1930    ~ 600%    15 mins
Empi    IOT     No                  1077           2218        1141    ~ 100%     4 mins
Empic   IOT     Yes                  261            527         266    ~ 100%    12 mins
� Note the ‘explosion’ in size of the compressed heap table
� The update lengthens each employee name by at least one character
Advanced Compression in Oracle11g
(for OLTP Operations)
Advanced Compression in Oracle 11g
� Conventional DML maintains the compression
– Inserted and updated rows remain compressed
� Compression activity is kept to a minimum
� Known as the Advanced Compression option
– Available at extra cost
Compressing for OLTP Operations
� Requires COMPATIBILITY = 11.1.0 (or higher)
CREATE TABLE t1 ... COMPRESS FOR ALL OPERATIONS;
� Conventionally inserted rows stay uncompressed until PCTFREE is reached
– Minimises compression operations
– Existing rows are not compressed

[Diagram: life of a block under transaction activity – conventional inserts are not compressed; when the block becomes full its rows are compressed, freeing space; more uncompressed inserts fill the block; the rows are compressed again]
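The lifecycle described above can be sketched as follows (table name t1 follows the earlier example; the column definitions are assumptions):

```sql
CREATE TABLE t1
( id    NUMBER
, name  VARCHAR2(30)
) COMPRESS FOR ALL OPERATIONS;

-- Conventional (non-direct-path) inserts land in the block uncompressed...
INSERT INTO t1 VALUES (1, 'SMITH');
COMMIT;

-- ...and are compressed in bulk only when the block's free space falls to
-- PCTFREE, so individual statements pay almost no compression cost
```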
OLTP Compression
� Some early results
� Table containing 3 million parts records

    No compression                           35M
    Compression for direct path operations   12M
    Compression for all operations           16M

– Avoid large scale (batch) updates on OLTP compressed tables
• Significant processing overheads
OLTP Compression Test
� Table containing 500000 sales records
PROD_ID  CUST_ID  TIME_ID    CHANNEL_ID  PROMO_ID  AMOUNT_SOLD  ACCOUNT_TYPE
-------  -------  ---------  ----------  --------  -----------  -------------
     13     5590  10-JAN-04           3       999      1232.16  Minor account
     19     6277  23-FEB-05           1       123      7690.00  Minor account
     16     6859  04-NOV-05           3       999        66.16  Minor account
      :        :          :           :         :            :  :
� Three tables created
CREATE TABLE s_non_c PCTFREE 0 AS SELECT * FROM sales;
— Non-compressed table with fully packed blocks

CREATE TABLE s_c_all COMPRESS FOR ALL OPERATIONS
AS SELECT * FROM sales;
— Compressed table for OLTP operations

CREATE TABLE s_c COMPRESS AS SELECT * FROM sales;
— Compressed table for DSS operations
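Block counts such as those quoted in these tests can be obtained along these lines (a sketch; assumes statistics are gathered first so that USER_TABLES.BLOCKS is current):

```sql
-- Refresh optimizer statistics for each test table
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'S_NON_C')
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'S_C_ALL')
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'S_C')

SELECT table_name, blocks
FROM   user_tables
WHERE  table_name IN ('S_NON_C', 'S_C_ALL', 'S_C');
```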
OLTP Compression Test (continued)
� Update stress test – updates 94% of the rows
– No lengthening of any data

UPDATE sales SET account_type = 'Major account' WHERE prod_id > 13;

                             Non-compressed   Compressed Table   Compressed Table
                             table            for DSS            for OLTP
Original size (blocks)             3200              896               1024
Size after update (blocks)         3200             5504               5888
Elapsed time for update         39 secs        1:04 secs       2:20:12 secs

� Test somewhat unfair on OLTP compression
– Update is large and batch orientated
– I/O subsystem was single disk
– But still interesting?
Shrinking Unused Space in Oracle10g
Reclaiming Space in Oracle10g
� Unused space below the High Water Mark can be reclaimed online
– Space caused by delete operations can be returned to free space
– Object must be in a locally managed tablespace with automatic segment space management (ASSM)
– Removes the need for Online Redefinition to reclaim space?
� Two-step operation
– Rows are moved to blocks available for insert
• Requires row movement to be enabled
– High water mark (HWM) is repositioned
– Newly freed blocks are returned to free space if the COMPACT keyword is not used

ALTER TABLE <table_name> ENABLE ROW MOVEMENT;
ALTER TABLE <table_name> SHRINK SPACE;
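When the brief exclusive lock taken for the final HWM move must be deferred, the shrink can be split into two phases using the COMPACT keyword mentioned above:

```sql
ALTER TABLE <table_name> ENABLE ROW MOVEMENT;

-- Phase 1: move rows down, but leave the HWM (and the freed blocks) in place
ALTER TABLE <table_name> SHRINK SPACE COMPACT;

-- Phase 2 (later, e.g. off-peak): reposition the HWM and release the space
ALTER TABLE <table_name> SHRINK SPACE;
```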
Repositioning of HWM During Shrinks

[Diagram: before the shrink, blocks are allocated up to the HWM, with free blocks scattered below it; after the shrink, rows are packed into the lower blocks and the HWM is moved down, leaving the freed blocks above it]
� Rows are physically moved to space in blocks on the freelist
– Shrinking causes ROWIDs to change
– Indexes (bitmap and btree) are updated accordingly
� Shrinking does not work with compressed tables
Shrinking Objects
� Why Shrink?
– To reclaim space
– To increase speed of full table scans
– To allow access to the table during the necessary reorganisation
– The shrink operation can be terminated/interrupted at any time
• Can be continued at a later time from point of termination
� Objects that can be SHRINKed
– Tables
– Indexes
– Materialized views
– Materialized view logs
– Dependent objects may be shrunk when a table shrinks
ALTER TABLE <table_name> SHRINK SPACE [CASCADE];
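A short worked example of the CASCADE form (the table name emp is an assumption):

```sql
ALTER TABLE emp ENABLE ROW MOVEMENT;

-- CASCADE also shrinks dependent objects, e.g. the table's indexes
ALTER TABLE emp SHRINK SPACE CASCADE;
```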