PL SQL Questions

Apr 21, 2015

Raja Mohan
http://www.sqlsnippets.com/en/home.html

PL/SQL Collections

The chart below lists the properties of the three collection types on a set of parameters such as size, ease of modification, persistence, etc.

Size
- Index by tables: Unbounded, i.e. the number of elements they can hold is not pre-defined.
- Nested tables: Unbounded, i.e. the number of elements they can hold is not pre-defined.
- Varrays: Bounded, i.e. they hold a declared number of elements, though this number can be changed at runtime.

Subscript characteristics
- Index by tables: Subscripts can be arbitrary numbers or strings; they need not be sequential.
- Nested tables: Sequential numbers, starting from one.
- Varrays: Sequential numbers, starting from one.

Database storage
- Index by tables: Can be used in PL/SQL programs only; cannot be stored in the database.
- Nested tables: Can be stored in the database using equivalent SQL types, and manipulated through SQL.
- Varrays: Can be stored in the database using equivalent SQL types, and manipulated through SQL (but with less ease than nested tables).

Referencing and lookups
- Index by tables: Work as key-value pairs, e.g. salaries of employees can be stored with unique employee numbers used as subscripts: sal(102) := 2000;
- Nested tables: Similar to one-column database tables. Oracle stores the nested table data in no particular order, but when you retrieve the nested table into a PL/SQL variable the rows are given consecutive subscripts starting at 1.
- Varrays: Standard subscripting syntax, e.g. color(3) is the 3rd color in varray color.

Flexibility to changes
- Index by tables: Most flexible. Size can increase or decrease dynamically; elements can be added to any position in the list and deleted from any position.
- Nested tables: Almost like index-by tables, except that subscript values are not as flexible. Deletions are possible from non-contiguous positions.
- Varrays: Not very flexible. You must retrieve and update all the elements of the varray at the same time.

Mapping with other programming languages
- Index by tables: Hash tables.
- Nested tables: Sets and bags.
- Varrays: Arrays.
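The mapping to other programming languages can be made concrete with a rough Python analogy (this is an analogy only, not Oracle syntax): a dict stands in for an index-by table, and lists stand in for nested tables and varrays. The VA_LIMIT constant is an invented stand-in for a varray's declared limit.

```python
# Rough Python analogues of the three PL/SQL collection types.
# Behaviour such as Oracle's sparse deletes is simplified here.

# Index-by (associative) table: a hash map with arbitrary keys.
sal = {}
sal["emp_102"] = 2000        # string subscript, need not be sequential
sal[-5] = 1500               # negative subscripts are fine too

# Nested table: an unbounded sequence subscripted from 1 (modelled with a list).
nt = ["a", "b"]
nt.append("c")               # "extend, then assign", in PL/SQL terms

# Varray: a bounded sequence; we enforce the declared limit by hand.
VA_LIMIT = 10                # invented stand-in for the declared limit
va = ["a", "b"]
assert len(va) < VA_LIMIT    # a real varray raises an error past its limit
va.append("c")

print(sal["emp_102"], nt[2], va[2])   # 2000 c c
```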

Which Collection Type to Use?

You now have all the details about index by tables, nested tables and varrays. Given a situation, which one should you use for your list data?

Here are some guidelines.

Use index by tables when:

- Your program needs small lookups.
- The collection can be built in memory at runtime, when the package or procedure is initialized.
- The data volume is unknown beforehand.
- The subscript values are flexible (e.g. strings, negative numbers, non-sequential values).
- You do not need to store the collection in the database.

Use nested tables when:

- The data needs to be stored in the database.
- The number of elements in the collection is not known in advance.
- The elements of the collection may need to be retrieved out of sequence.
- Updates and deletions affect only some elements, at arbitrary locations.
- Your program does not rely on subscripts remaining stable, as their order may change when nested tables are stored in the database.

Use varrays when:

- The data needs to be stored in the database.
- The number of elements of the varray is known in advance.
- The data from the varray is accessed in sequence.
- Updates and deletions happen on the varray as a whole and not on arbitrarily located elements in the varray.


Sample Code

The examples below walk through the same operations on each of the three collection types: an associative array (aa), a nested table (nt), and a varray (va). The collection types and the print/tf helpers come from a supporting package named p.

declare

Declare a collection variable.

aa_0 p.aa_type ;
nt_0 p.nt_type ;
va_0 p.va_type ;

Declare, initialize, and load a collection variable.

aa p.aa_type ;  -- cannot load values in the declaration

nt p.nt_type := p.nt_type( 'a', 'b' );

va p.va_type := p.va_type( 'a', 'b' );

begin 

Let's inspect the variables to see what they look like at this point (NULL means the variable is not initialized).

p.print( 'aa_0 is ' ); p.print( aa_0 ); p.print( ' ' ); p.print( 'aa is ' ); p.print( aa );
p.print( 'nt_0 is ' ); p.print( nt_0 ); p.print( ' ' ); p.print( 'nt is ' ); p.print( nt );
p.print( 'va_0 is ' ); p.print( va_0 ); p.print( ' ' ); p.print( 'va is ' ); p.print( va );

aa_0 is NOT NULL and empty: .first = NULL, .last = NULL, .count = 0, .limit = NULL
aa   is NOT NULL and empty: .first = NULL, .last = NULL, .count = 0, .limit = NULL

nt_0 is NULL
nt   is (1) a (2) b: .first = 1, .last = 2, .count = 2, .limit = NULL

va_0 is NULL
va   is (1) a (2) b: .first = 1, .last = 2, .count = 2, .limit = 10

Initialize a collection after it has been declared.


-- n/a for associative arrays

nt_0 := p.nt_type() ;
p.print( nt_0 );

va_0 := p.va_type() ;
p.print( va_0 );

NOT NULL and empty: .first = NULL, .last = NULL, .count = 0, .limit = NULL   (nt_0)
NOT NULL and empty: .first = NULL, .last = NULL, .count = 0, .limit = 10     (va_0)

Add individual rows to a collection.

-- add 1 row at a time
aa(1) := 'a' ; aa(2) := 'b' ; aa(3) := 'c' ;
aa(4) := 'd' ; aa(5) := 'e' ;
aa(6) := 'e' ; aa(7) := 'e' ;

p.print( aa );

-- add 1 row
nt.extend ; nt(3) := 'c' ;

-- add 2 rows
nt.extend(2) ; nt(4) := 'd' ; nt(5) := 'e' ;

-- create two copies of row #5
nt.extend(2,5) ;

p.print( nt );

-- add 1 row
va.extend ; va(3) := 'c' ;

-- add 2 rows
va.extend(2) ; va(4) := 'd' ; va(5) := 'e' ;

-- create two copies of row #5
va.extend(2,5) ;

p.print( va );

(1) a (2) b (3) c (4) d (5) e (6) e (7) e: .first = 1, .last = 7, .count = 7, .limit = NULL   (aa)
(1) a (2) b (3) c (4) d (5) e (6) e (7) e: .first = 1, .last = 7, .count = 7, .limit = NULL   (nt)
(1) a (2) b (3) c (4) d (5) e (6) e (7) e: .first = 1, .last = 7, .count = 7, .limit = 10     (va)

Load a single value from the database into a collection row.

select val into aa(2) from t where val = 'B' ;
select val into nt(2) from t where val = 'B' ;
select val into va(2) from t where val = 'B' ;

p.print( aa ); p.print( nt ); p.print( va );

(1) a (2) B (3) c (4) d (5) e (6) e (7) e: .first = 1, .last = 7, .count = 7, .limit = NULL   (aa)
(1) a (2) B (3) c (4) d (5) e (6) e (7) e: .first = 1, .last = 7, .count = 7, .limit = NULL   (nt)
(1) a (2) B (3) c (4) d (5) e (6) e (7) e: .first = 1, .last = 7, .count = 7, .limit = 10     (va)

Initialize a collection and load it with multiple database values (pre-existing contents will be lost).

select val bulk collect into aa from t ;
p.print( aa );

select val bulk collect into nt from t ;
p.print( nt );

select val bulk collect into va from t ;
p.print( va );

(1) A (2) B (3) C (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = NULL   (aa)
(1) A (2) B (3) C (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = NULL   (nt)
(1) A (2) B (3) C (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = 10     (va)

Test a row's existence by subscript.

p.print( 'aa.exists(3) is '|| p.tf( aa.exists(3) ));

p.print( 'aa.exists(9) is '|| p.tf( aa.exists(9) ));

p.print( 'nt.exists(3) is '|| p.tf( nt.exists(3) ));

p.print( 'nt.exists(9) is '|| p.tf( nt.exists(9) ));

p.print( 'va.exists(3) is '|| p.tf( va.exists(3) ));

p.print( 'va.exists(9) is '|| p.tf( va.exists(9) ));

aa.exists(3) is TRUE   aa.exists(9) is FALSE
nt.exists(3) is TRUE   nt.exists(9) is FALSE
va.exists(3) is TRUE   va.exists(9) is FALSE


Test a row's existence by content.

-- associative arrays: use a loop (see below)

p.print( '''C'' member of nt is ' || p.tf( 'C' member of nt ));
p.print( '''X'' member of nt is ' || p.tf( 'X' member of nt ));

-- varrays: use a loop (see below)

'C' member of nt is TRUE
'X' member of nt is FALSE

Compare two collections for equality.

-- cannot use "=" with two associative arrays

nt_0 := nt ;

if nt_0 = nt then
  p.print( 'equal' );
else
  p.print( 'not equal' );
end if;

-- cannot use "=" with two varrays

equal

Update a collection row.

aa(1) := 'a' ; aa(3) := 'c' ;
p.print( aa );

nt(1) := 'a' ; nt(3) := 'c' ;
p.print( nt );

va(1) := 'a' ; va(3) := 'c' ;
p.print( va );

(1) a (2) B (3) c (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = NULL   (aa)
(1) a (2) B (3) c (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = NULL   (nt)
(1) a (2) B (3) c (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = 10     (va)


Remove rows from the middle of a collection.

aa.delete(2); aa.delete(3,4);
p.print( aa );

nt.delete(2); nt.delete(3,4);
p.print( nt );

-- not possible for varrays
p.print( va );

(1) a (5) E (6) F (7) G: .first = 1, .last = 7, .count = 4, .limit = NULL   (aa)
(1) a (5) E (6) F (7) G: .first = 1, .last = 7, .count = 4, .limit = NULL   (nt)
(1) a (2) B (3) c (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = 10   (va)

Loop through all rows in the collection.

declare
  i binary_integer ;
begin
  i := aa.first ;
  while i is not null loop
    p.print( i ||'. '|| aa(i) );
    i := aa.next(i) ;
  end loop;
end;

declare
  i binary_integer ;
begin
  i := nt.first ;
  while i is not null loop
    p.print( i ||'. '|| nt(i) );
    i := nt.next(i) ;
  end loop;
end;

for i in nvl(va.first,0) .. nvl(va.last,-1)
loop
  p.print( i ||'. '|| va(i) );
end loop;

1. a  5. E  6. F  7. G                      (aa)
1. a  5. E  6. F  7. G                      (nt)
1. a  2. B  3. c  4. D  5. E  6. F  7. G    (va)

Remove row(s) from the end of a collection.

aa.delete(7); aa.delete(5,6);
p.print( aa );

nt.trim; nt.trim(2);
p.print( nt );

va.trim; va.trim(2);
p.print( va );

(1) a: .first = 1, .last = 1, .count = 1, .limit = NULL   (aa)
(1) a: .first = 1, .last = 1, .count = 1, .limit = NULL   (nt)
(1) a (2) B (3) c (4) D: .first = 1, .last = 4, .count = 4, .limit = 10   (va)

Reuse rows left vacant by earlier delete operations (rows 2,3,4) and trim operations (rows 5,6,7).

aa(2) := 'B' ; aa(3) := 'C' ; aa(4) := 'D' ;
aa(5) := 'E' ; aa(6) := 'F' ; aa(7) := 'G' ;

p.print( aa );

-- note we do not need to call ".extend" for rows
-- 2,3,4 which were removed with ".delete"
nt(2) := 'B' ; nt(3) := 'C' ; nt(4) := 'D' ;

-- we do need to call ".extend" for rows 5,6,7
-- which were removed with ".trim"
nt.extend(3) ; nt(5) := 'E' ; nt(6) := 'F' ; nt(7) := 'G' ;

p.print( nt );

-- we need to call ".extend" first since rows 5,6,7
-- were removed with ".trim"
va.extend(3) ; va(5) := 'E' ; va(6) := 'F' ; va(7) := 'G' ;

p.print( va );

(1) a (2) B (3) C (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = NULL   (aa)
(1) a (2) B (3) C (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = NULL   (nt)
(1) a (2) B (3) c (4) D (5) E (6) F (7) G: .first = 1, .last = 7, .count = 7, .limit = 10     (va)

Delete all rows in the collection (frees memory too).

aa.delete ;
p.print( aa );

nt.delete ;
p.print( nt );

va.delete ;
p.print( va );

NOT NULL and empty: .first = NULL, .last = NULL, .count = 0, .limit = NULL   (aa)
NOT NULL and empty: .first = NULL, .last = NULL, .count = 0, .limit = NULL   (nt)
NOT NULL and empty: .first = NULL, .last = NULL, .count = 0, .limit = 10     (va)

Set a collection to NULL, i.e. uninitialized state.

-- not possible for associative arrays

-- "nt := null" will not work; use a null variable instead
declare
  nt_null p.nt_type ;
begin
  nt := nt_null ;
end;

p.print( nt );

-- "va := null" will not work; use a null variable instead
declare
  va_null p.va_type ;
begin
  va := va_null ;
end;

p.print( va );

NULL   (nt)
NULL   (va)

end;
/

The next table presents operational characteristics of each collection type.

Characteristic                                          Assoc. Array   Nested Table  Varray
------------------------------------------------------- -------------- ------------- --------
The entire collection can be saved in a
database column.                                                       Y             Y
Rows in the collection retain their order when the
entire collection is saved in a database column.        n/a                          Y
Legal subscript datatypes.                              any            Integer       Integer
Legal subscript value ranges.                           -2**31..2**31  1..2**31      1..2**31
                                                        (for Integers)
The collection can be defined to hold a predefined
maximum number of rows.                                                              Y
There can be gaps between subscripts, e.g. 1,3,8.       Y              Y
The collection must be initialized before use.                         Y             Y
The collection can be initialized with multiple rows
of data using a single command (i.e. a constructor).                   Y             Y
The collection must be extended before a new row
is added.                                                              Y             Y
Two collections can be compared for equality with
the "=" operator.                                                      Y
The collection can be manipulated in PL/SQL with
MULTISET operators, e.g. MULTISET UNION,
MULTISET INTERSECT.                                                    Y
The collection can be unnested in a query using the
TABLE() collection expression.                                         Y             Y


The Difference Between DECODE and CASE

DECODE and CASE statements in Oracle both provide a conditional construct, of this form:

if A = n1 then A1
else if A = n2 then A2
else X

Databases before Oracle 8.1.6 had only the DECODE function. CASE was introduced in Oracle 8.1.6, as a standard, more meaningful and more powerful function.

Everything DECODE can do, CASE can do as well. There is a lot more that you can do with CASE, though, which DECODE cannot, as we'll see in this article.

1. CASE can work with logical operators other than ‘=’

DECODE can do an equality check only. CASE is capable of other logical comparisons such as < and >. To achieve the same effect with DECODE, ranges of data had to be forced into discrete form, making for unwieldy code.

Here is an example of putting employees in grade brackets based on their salaries; this can be done elegantly with CASE.

SQL> select ename
  2       , case
  3           when sal < 1000
  4                then 'Grade I'
  5           when (sal >= 1000 and sal < 2000)
  6                then 'Grade II'
  7           when (sal >= 2000 and sal < 3000)
  8                then 'Grade III'
  9           else 'Grade IV'
 10         end sal_grade
 11  from emp
 12  where rownum < 4;

ENAME      SAL_GRADE
---------- ---------
SMITH      Grade I
ALLEN      Grade II
WARD       Grade II
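Because CASE is ANSI SQL, the same range-bucketing logic runs unchanged on other engines. Here is a sketch using Python's sqlite3, with made-up salary values chosen to land in the same grades as the output above:

```python
import sqlite3

# A minimal re-creation of the salary-grade example using the
# ANSI-standard CASE expression in SQLite. The table and salary values
# are hypothetical stand-ins for Oracle's demo EMP table.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, sal integer)")
conn.executemany("insert into emp values (?, ?)",
                 [("SMITH", 800), ("ALLEN", 1600), ("WARD", 1250)])

rows = conn.execute("""
    select ename,
           case
             when sal < 1000 then 'Grade I'
             when sal >= 1000 and sal < 2000 then 'Grade II'
             when sal >= 2000 and sal < 3000 then 'Grade III'
             else 'Grade IV'
           end as sal_grade
    from emp
""").fetchall()

grades = dict(rows)
print(grades)
```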

2. CASE can work with predicates and searchable subqueries

DECODE works with expressions which are scalar values only. CASE can work with predicates and subqueries in searchable form.

An example of categorizing employees based on reporting relationship, illustrating these two uses of CASE.

SQL> select e.ename,
  2         case
  3           -- predicate with "in"
  4           -- mark the category based on ename list
  5           when e.ename in ('KING','SMITH','WARD')
  6                then 'Top Bosses'
  7           -- searchable subquery
  8           -- identify if this emp has a reportee
  9           when exists (select 1 from emp emp1
 10                        where emp1.mgr = e.empno)
 11                then 'Managers'
 12           else
 13               'General Employees'
 14         end emp_category
 15  from emp e
 16  where rownum < 5;

ENAME      EMP_CATEGORY
---------- -----------------
SMITH      Top Bosses
ALLEN      General Employees
WARD       Top Bosses
JONES      Managers

3. CASE can work as a PL/SQL construct

DECODE can work as a function inside SQL only. CASE can be a more efficient substitute for IF-THEN-ELSE in PL/SQL.

SQL> declare
  2    grade char(1);
  3  begin
  4    grade := 'b';
  5    case grade
  6      when 'a' then dbms_output.put_line('excellent');
  7      when 'b' then dbms_output.put_line('very good');
  8      when 'c' then dbms_output.put_line('good');
  9      when 'd' then dbms_output.put_line('fair');
 10      when 'f' then dbms_output.put_line('poor');
 11      else dbms_output.put_line('no such grade');
 12    end case;
 13  end;
 14  /

PL/SQL procedure successfully completed.

CASE can even work as a parameter to a procedure call, while DECODE cannot.

SQL> var a varchar2(5);
SQL> exec :a := 'THREE';

PL/SQL procedure successfully completed.

SQL> create or replace procedure proc_test (i number)
  2  as
  3  begin
  4    dbms_output.put_line('output = '||i);
  5  end;
  6  /

Procedure created.

SQL> exec proc_test(decode(:a,'THREE',3,0));
BEGIN proc_test(decode(:a,'THREE',3,0)); END;
                 *
ERROR at line 1:
ORA-06550: line 1, column 17:
PLS-00204: function or pseudo-column 'DECODE' may be used inside a SQL
statement only
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored

SQL> exec proc_test(case :a when 'THREE' then 3 else 0 end);
output = 3

PL/SQL procedure successfully completed.

4. Careful! CASE handles NULL differently

Check out the different results when DECODE and CASE compare NULLs.

SQL> select decode(null
  2              , null, 'NULL'
  3                    , 'NOT NULL'
  4               ) null_test
  5  from dual;

NULL
----
NULL

SQL> select case null
  2         when null
  3         then 'NULL'
  4         else 'NOT NULL'
  5         end null_test
  6  from dual;

NULL_TES
--------
NOT NULL

The "searched CASE" form, however, handles NULL the way DECODE does.

SQL> select case
  2         when null is null
  3         then 'NULL'
  4         else 'NOT NULL'
  5         end null_test
  6  from dual;

NULL_TES
--------
NULL
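SQLite follows the same ANSI CASE semantics, so this NULL gotcha is easy to reproduce outside Oracle: simple CASE compares with "=", and NULL = NULL is unknown, so the ELSE branch fires; searched CASE can test IS NULL explicitly.

```python
import sqlite3

# Reproduce the NULL gotcha with ANSI CASE semantics in SQLite.
conn = sqlite3.connect(":memory:")

# Simple CASE: compares "case null when null" using "=",
# and NULL = NULL is unknown, so the ELSE branch is taken.
simple = conn.execute(
    "select case null when null then 'NULL' else 'NOT NULL' end"
).fetchone()[0]

# Searched CASE: an explicit IS NULL predicate evaluates to true.
searched = conn.execute(
    "select case when null is null then 'NULL' else 'NOT NULL' end"
).fetchone()[0]

print(simple, searched)   # NOT NULL NULL
```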

5. CASE expects data type consistency, DECODE does not

Compare the two examples: DECODE gives you a result, while CASE gives a data type mismatch error.

SQL> select decode(2,1,1,
  2                 '2','2',
  3                 '3') t
  4  from dual;

         T
----------
         2

SQL> select case 2 when 1 then '1'
  2              when '2' then '2'
  3              else '3'
  4         end
  5  from dual;
            when '2' then '2'
                 *
ERROR at line 2:
ORA-00932: inconsistent datatypes: expected NUMBER got CHAR

6. CASE is ANSI SQL-compliant

CASE complies with ANSI SQL. DECODE is proprietary to Oracle.


7. The difference in readability

In very simple situations, DECODE is shorter and easier to understand than CASE, as in:

SQL> -- An example where DECODE and CASE
SQL> -- can work equally well, and
SQL> -- DECODE is cleaner

SQL> select ename
  2       , decode (deptno, 10, 'Accounting',
  3                         20, 'Research',
  4                         30, 'Sales',
  5                             'Unknown') as department
  6  from   emp
  7  where rownum < 4;

ENAME      DEPARTMENT
---------- ----------
SMITH      Research
ALLEN      Sales
WARD       Sales

SQL> select ename
  2       , case deptno
  3           when 10 then 'Accounting'
  4           when 20 then 'Research'
  5           when 30 then 'Sales'
  6           else         'Unknown'
  7           end as department
  8  from emp
  9  where rownum < 4;

ENAME      DEPARTMENT
---------- ----------
SMITH      Research
ALLEN      Sales
WARD       Sales

In complex situations, CASE is shorter and easier to understand. Complicated processing in DECODE, even if technically achievable, is a recipe for messy, unreadable code, while the same can be achieved elegantly using CASE.


Grouping Rows with GROUP BY

GROUP BY

Consider a table like this one.

select grp_a, grp_b, val
from t
order by grp_a, grp_b ;

GRP_A      GRP_B      VAL
---------- ---------- ----------
a1         b1         10
a1         b1         20
a1         b2         30
a1         b2         40
a1         b2         50
a2         b3         12
a2         b3         22
a2         b3         32

GROUP BY allows us to group rows together so that we can include aggregate functions like COUNT, MAX, and SUM in the result set.

select grp_a, count(*), max( val ), sum( val )
from t
GROUP BY GRP_A
order by grp_a ;

GRP_A      COUNT(*)   MAX(VAL)   SUM(VAL)
---------- ---------- ---------- ----------
a1         5          50         150
a2         3          32         66
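Basic GROUP BY with aggregates behaves the same way on any SQL engine. Here is the same grouping reproduced with Python's sqlite3, using the sample data from table T above (only the three columns used in these examples):

```python
import sqlite3

# Rebuild the tutorial's table T (grp_a, grp_b, val columns only)
# and run the same GROUP BY aggregation.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (grp_a text, grp_b text, val integer)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("a1", "b1", 10), ("a1", "b1", 20), ("a1", "b2", 30), ("a1", "b2", 40),
    ("a1", "b2", 50), ("a2", "b3", 12), ("a2", "b3", 22), ("a2", "b3", 32),
])

rows = conn.execute("""
    select grp_a, count(*), max(val), sum(val)
    from t
    group by grp_a
    order by grp_a
""").fetchall()

print(rows)   # [('a1', 5, 50, 150), ('a2', 3, 32, 66)]
```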

We can specify multiple columns in the GROUP BY clause to produce a different set of groupings.

select grp_a, grp_b, count(*), max( val ), sum( val )
from t
GROUP BY GRP_A, GRP_B
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)   MAX(VAL)   SUM(VAL)
---------- ---------- ---------- ---------- ----------
a1         b1         2          20         30
a1         b2         3          50         120
a2         b3         3          32         66

Parentheses may be added around the GROUP BY expression list. Doing so has no effect on the result.

select grp_a, grp_b, count(*), max( val ), sum( val )
from t
GROUP BY ( GRP_A, GRP_B )
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)   MAX(VAL)   SUM(VAL)
---------- ---------- ---------- ---------- ----------
a1         b1         2          20         30
a1         b2         3          50         120
a2         b3         3          32         66

The GROUP BY expression list may be empty. This groups all rows retrieved by the query into a single group. Parentheses are mandatory when specifying an empty set.

select count(*), max( val ), sum( val )
from t
GROUP BY () ;

COUNT(*)   MAX(VAL)   SUM(VAL)
---------- ---------- ----------
8          50         216

The last example is equivalent to specifying no GROUP BY clause at all, like this.

select count(*), max( val ), sum( val )
from t ;

COUNT(*)   MAX(VAL)   SUM(VAL)
---------- ---------- ----------
8          50         216

GROUP BY and DISTINCT

We can use GROUP BY without specifying any aggregate functions in the SELECT list.

select grp_a, grp_b
from t
GROUP BY GRP_A, GRP_B
order by grp_a, grp_b ;

GRP_A      GRP_B
---------- ----------
a1         b1
a1         b2
a2         b3

However, the same result is usually produced by specifying DISTINCT instead of using GROUP BY.

select DISTINCT grp_a, grp_b
from t
order by grp_a, grp_b ;

GRP_A      GRP_B
---------- ----------
a1         b1
a1         b2
a2         b3

According to Tom Kyte the two approaches are effectively equivalent (see AskTom, "DISTINCT VS. GROUP BY"). Queries that use DISTINCT are typically easier to understand, but the GROUP BY approach can provide an elegant solution to otherwise cumbersome queries when more than one set of groupings is required. For example, to produce a result set that is the union of:

- distinct values in GRP_A
- distinct values in GRP_B
- distinct values in GRP_A + GRP_B

the following query would be required if we used DISTINCT

select distinct grp_a, null as grp_b
from t
union all
select distinct null, grp_b
from t
union all
select distinct grp_a, grp_b
from t
order by 1, 2 ;

GRP_A      GRP_B
---------- ----------
a1         b1
a1         b2
a1
a2         b3
a2
           b1
           b2
           b3

but a GROUP BY query could produce the same result with fewer lines of code.

select grp_a, grp_b
from t
group by cube( grp_a, grp_b )
having grouping_id( grp_a, grp_b ) != 3
order by 1, 2 ;

GRP_A      GRP_B
---------- ----------
a1         b1
a1         b2
a1
a2         b3
a2
           b1
           b2
           b3

(We will learn about the CUBE and GROUPING_ID features later in this tutorial.)


GROUP BY and Ordering

All other things being equal, changing the order in which columns appear in the GROUP BY clause has no effect on the way the result set is grouped. For example, this query

select grp_a, grp_b, count(*)
from t
GROUP BY GRP_A, GRP_B
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1         b1         2
a1         b2         3
a2         b3         3

returns the same results as this one.

select grp_a, grp_b, count(*)
from t
GROUP BY GRP_B, GRP_A  -- columns have been reversed
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1         b1         2
a1         b2         3
a2         b3         3

Gotcha: GROUP BY with no ORDER BY

The last two snippets used the same ORDER BY clause in both queries. What happens if we use no ORDER BY clause at all?

select grp_a, grp_b, count(*)
from t
group by grp_a, grp_b ;

GRP_A  GRP_B  COUNT(*)
------ ------ ----------
a1     b1     2
a1     b2     3
a2     b3     3

The results are still ordered. Some programmers interpret this as meaning that GROUP BY returns an ordered result set. This is an illusion, as the following snippet easily proves. Note how the same query now returns rows in a different order under the new conditions.

truncate table t;

-- this time we insert rows into T using a different order from that
-- of the Setup topic

insert into t values ( 'a2', 'b3', 'c2', 'd2', '32' ) ;
insert into t values ( 'a2', 'b3', 'c2', 'd2', '22' ) ;
insert into t values ( 'a2', 'b3', 'c2', 'd2', '12' ) ;
insert into t values ( 'a1', 'b2', 'c2', 'd1', '50' ) ;
insert into t values ( 'a1', 'b2', 'c1', 'd1', '40' ) ;
insert into t values ( 'a1', 'b2', 'c1', 'd1', '30' ) ;
insert into t values ( 'a1', 'b1', 'c1', 'd1', '20' ) ;
insert into t values ( 'a1', 'b1', 'c1', 'd1', '10' ) ;

commit;

select grp_a, grp_b, count(*)
from t
group by grp_a, grp_b ;

GRP_A  GRP_B  COUNT(*)
------ ------ ----------
a1     b2     3
a1     b1     2
a2     b3     3

-- (your results may vary)

The actual behaviour of GROUP BY without ORDER BY is documented in the SQL Reference Manual as follows.

"The GROUP BY clause groups rows but does not guarantee the order of the result set. To order the groupings, use the ORDER BY clause." (See AskTom, "Group by behavior in 10GR2", for another discussion of this issue.)

Duplicate Columns

If a column is used more than once in the SELECT clause it does not need to appear more than once in the GROUP BY clause.

select grp_a, upper(grp_a), count(*)
from t
group by GRP_A
order by grp_a ;

GRP_A      UPPER(GRP_ COUNT(*)
---------- ---------- ----------
a1         A1         5
a2         A2         3

If we did include the same column two or more times in the GROUP BY clause it would return the same results as the query above.

select grp_a, upper(grp_a), count(*)
from t
group by GRP_A, GRP_A
order by grp_a ;

GRP_A      UPPER(GRP_ COUNT(*)
---------- ---------- ----------
a1         A1         5
a2         A2         3

While there is no practical use for the latter syntax, in the upcoming topic GROUP_ID we will see how duplicate columns in a GROUPING SETS clause do produce different results than a distinct column list.

SELECT Lists

We may group by table columns that are not in the SELECT list, like GRP_B in the example below.

select grp_a, count(*)
from t
group by grp_a, GRP_B
order by grp_a, grp_b ;

GRP_A  COUNT(*)
------ ----------
a1     2
a1     3
a2     3

However, we may not select table columns that are absent from the GROUP BY list, as with GRP_A in this example.

select GRP_A, count(*)
from t
GROUP BY GRP_B ;

select GRP_A, count(*)
       *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

Constants

The rules for columns based on constant expressions differ slightly from those for table columns. As with table-based columns, we can include constant columns in the GROUP BY clause

select 123, 'XYZ', SYSDATE, grp_a, grp_b, count(*)
from t
group by 123, 'XYZ', SYSDATE, grp_a, grp_b
order by grp_a, grp_b ;

       123 'XY SYSDATE    GRP_A  GRP_B  COUNT(*)
---------- --- ---------- ------ ------ ----------
       123 XYZ 2009-06-07 a1     b1     2
       123 XYZ 2009-06-07 a1     b2     3
       123 XYZ 2009-06-07 a2     b3     3

and we can GROUP BY constant columns that are not in the SELECT list.

select grp_a, grp_b, count(*)
from t
group by 123, 'XYZ', SYSDATE, grp_a, grp_b
order by grp_a, grp_b ;

GRP_A  GRP_B  COUNT(*)
------ ------ ----------
a1     b1     2
a1     b2     3
a2     b3     3

Unlike table-based columns, constant columns can be selected even when they are absent from the GROUP BY list.

select 123, 'XYZ', SYSDATE, grp_a, grp_b, count(*)
from t
GROUP BY GRP_A, GRP_B
order by grp_a, grp_b ;

       123 'XY SYSDATE    GRP_A  GRP_B  COUNT(*)
---------- --- ---------- ------ ------ ----------
       123 XYZ 2009-06-07 a1     b1     2
       123 XYZ 2009-06-07 a1     b2     3
       123 XYZ 2009-06-07 a2     b3     3

Note how all three queries returned the same number of rows.

HAVING

When Oracle processes a GROUP BY query the WHERE clause is applied to the result set before the rows are grouped together. This allows us to use WHERE conditions involving columns like GRP_B in the query below, which is not listed in the GROUP BY clause.

select grp_a, count(*)
from t
WHERE GRP_B in ( 'b2', 'b3' )
group by grp_a
order by grp_a ;

GRP_A      COUNT(*)
---------- ----------
a1         3
a2         3

This does, however, prevent us from using conditions that involve aggregate values like COUNT(*), which are calculated after the GROUP BY clause is applied. For example, the following will not work.

select grp_a, count(*)
from t
WHERE COUNT(*) > 4
group by grp_a
order by grp_a ;

WHERE COUNT(*) > 4
      *
ERROR at line 3:
ORA-00934: group function is not allowed here

For these types of conditions the HAVING clause can be used.

select grp_a, count(*)
from t
group by grp_a
HAVING COUNT(*) > 4
order by grp_a ;

GRP_A      COUNT(*)
---------- ----------
a1         5
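The WHERE-before-grouping / HAVING-after-grouping distinction is standard SQL and can be sketched with Python's sqlite3, using the same sample data as above:

```python
import sqlite3

# WHERE filters rows before grouping; HAVING filters the groups
# afterwards. Reproduced with the tutorial's sample data for T.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (grp_a text, grp_b text, val integer)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("a1", "b1", 10), ("a1", "b1", 20), ("a1", "b2", 30), ("a1", "b2", 40),
    ("a1", "b2", 50), ("a2", "b3", 12), ("a2", "b3", 22), ("a2", "b3", 32),
])

where_rows = conn.execute("""
    select grp_a, count(*) from t
    where grp_b in ('b2', 'b3')
    group by grp_a order by grp_a
""").fetchall()

having_rows = conn.execute("""
    select grp_a, count(*) from t
    group by grp_a
    having count(*) > 4
    order by grp_a
""").fetchall()

print(where_rows)    # [('a1', 3), ('a2', 3)]
print(having_rows)   # [('a1', 5)]
```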

Note that the HAVING clause cannot reference table columns like VAL that are not listed in the GROUP BY clause.

select grp_a, count(*)
from t
group by grp_a
HAVING VAL > 5
order by grp_a ;

HAVING VAL > 5
       *
ERROR at line 4:
ORA-00979: not a GROUP BY expression

It can, on the other hand, reference table columns like GRP_A that are in the GROUP BY clause.

select grp_a, count(*)
from t
group by grp_a
HAVING GRP_A = 'a2'
order by grp_a ;

GRP_A      COUNT(*)
---------- ----------
a2         3

but doing so yields the same result as using a WHERE clause.

select grp_a, count(*)
from t
WHERE GRP_A = 'a2'
group by grp_a
order by grp_a ;

GRP_A      COUNT(*)
---------- ----------
a2         3

Given a choice between the last two snippets, I expect the WHERE clause to provide better performance in most, if not all, cases.

GROUPING SETS

There are times when the results of two or more different groupings are required from a single query. For example, say we wanted to combine the results of these two queries.

select grp_a, count(*)
from t
group by grp_a
order by grp_a ;

GRP_A      COUNT(*)
---------- ----------
a1         5
a2         3

select grp_b, count(*)
from t
group by grp_b
order by grp_b ;

GRP_B      COUNT(*)
---------- ----------
b1         2
b2         3
b3         3

UNION ALL could be used, like this

select grp_a, null, count(*)
from t
group by grp_a
UNION ALL
select null, grp_b, count(*)
from t
group by grp_b
order by 1, 2 ;

GRP_A      NULL       COUNT(*)
---------- ---------- ----------
a1                    5
a2                    3
           b1         2
           b2         3
           b3         3

but as of Oracle 9i a more compact syntax is available with the GROUPING SETS extension of the GROUP BY clause. With it the last query can be written as follows.

select grp_a, grp_b, count(*)
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_B )
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1                    5
a2                    3
           b1         2
           b2         3
           b3         3
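SQLite has no GROUPING SETS, which makes it a handy place to demonstrate the equivalence just described: GROUPING SETS ( GRP_A, GRP_B ) produces the same groups as the UNION ALL of the two single-column groupings.

```python
import sqlite3

# Emulate GROUP BY GROUPING SETS (grp_a, grp_b) with UNION ALL,
# using the tutorial's sample data for T.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (grp_a text, grp_b text, val integer)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("a1", "b1", 10), ("a1", "b1", 20), ("a1", "b2", 30), ("a1", "b2", 40),
    ("a1", "b2", 50), ("a2", "b3", 12), ("a2", "b3", 22), ("a2", "b3", 32),
])

rows = conn.execute("""
    select grp_a, null as grp_b, count(*) from t group by grp_a
    union all
    select null, grp_b, count(*) from t group by grp_b
""").fetchall()

# Same five groups as the GROUPING SETS result above (row order aside;
# note SQLite sorts NULLs first, Oracle last, so ORDER BY differs).
assert set(rows) == {("a1", None, 5), ("a2", None, 3),
                     (None, "b1", 2), (None, "b2", 3), (None, "b3", 3)}
print(sorted(rows, key=lambda r: (r[0] or "", r[1] or "")))
```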

It is important to understand how the clause grouping sets( grp_a, grp_b ) used in the last query differs from the clause group by ( grp_a, grp_b ) in the next query.

select grp_a, grp_b, count(*)
from t
GROUP BY ( GRP_A, GRP_B )
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1         b1         2
a1         b2         3
a2         b3         3

Note how the last query returned different rows than the GROUPING SETS query did even though both used the term (GRP_A, GRP_B).

GROUPING SETS, Composite Columns, and Empty Sets

Composite Columns

You can treat a collection of columns as an individual set by using parentheses in the GROUPING SETS clause. For example, to write a query that returns the equivalent of these two queries

select grp_a, grp_b, count(*)
from t
GROUP BY GRP_A, GRP_B
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1         b1         2
a1         b2         3
a2         b3         3

select grp_a, null, count(*)
from t
GROUP BY GRP_A
order by grp_a ;

GRP_A      N COUNT(*)
---------- - ----------
a1           5
a2           3

we could use the following GROUPING SETS clause.

select grp_a, grp_b, count(*)
from t
GROUP BY GROUPING SETS ( (GRP_A, GRP_B), GRP_A )
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1         b1         2
a1         b2         3
a1                    5
a2         b3         3
a2                    3

The term (GRP_A, GRP_B) is called a "composite column" when it appears inside a GROUPING SETS, ROLLUP, or CUBE clause.

Empty Sets

To add a grand total row to the result set an empty set, specified as (), can be used. In the example below the last row is generated by the empty set grouping.

select grp_a, grp_b, count(*)
from t
GROUP BY GROUPING SETS ( (GRP_A, GRP_B), () )
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1         b1         2
a1         b2         3
a2         b3         3
                      8

Gotcha - Parentheses without GROUPING SETS

Outside a GROUPING SETS clause (or ROLLUP or CUBE clauses) a parenthesized expression like (GRP_A, GRP_B) is no different than the same expression without parentheses. For example this query

select grp_a, grp_b, count(*)
from t
GROUP BY (GRP_A, GRP_B), GRP_A
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- ----------
a1         b1         2
a1         b2         3
a2         b3         3

returns the same results as this query

select grp_a, grp_b, count(*)
from t
GROUP BY GRP_A, GRP_B, GRP_A
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- --------
a1         b1         2
a1         b2         3
a2         b3         3

which in turn has the same result set as this one.

select grp_a, grp_b, count(*)
from t
GROUP BY GRP_A, GRP_B
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- --------
a1         b1         2
a1         b2         3
a2         b3         3

Gotcha: GROUPING SETS with Constants

When I first started using GROUPING SETS I used constants to produce grand total rows in my result sets, like this.

select grp_a, grp_b, count(*)
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_B, 0 )
order by grp_a, grp_b;

GRP_A      GRP_B      COUNT(*)
---------- ---------- --------
a1                    5
a2                    3
           b1         2
           b2         3
           b3         3
                      8


The last row in the result set is generated by the "0" grouping. I later learnt that an empty set term, "()", was actually a more appropriate syntactic choice than a constant but I continued to use constants out of habit. After all, both approaches seemed to produce the same results.

select grp_a, grp_b, count(*)
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_B, () )
order by grp_a, grp_b;

GRP_A      GRP_B      COUNT(*)
---------- ---------- --------
a1                    5
a2                    3
           b1         2
           b2         3
           b3         3
                      8

However, I later ran into a case where the two actually produced different results.

Query 1:

set null '(null)'

select grp_a
     , grp_b
     , nvl2( grp_b, 1, 0 ) nvl2_grp_b
     , count(*)
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_B, () )
order by grp_a, grp_b;

GRP_A  GRP_B  NVL2_GRP_B COUNT(*)
------ ------ ---------- --------
a1     (null) 0          5
a2     (null) 0          3
(null) b1     1          2
(null) b2     1          3
(null) b3     1          3
(null) (null) 0          8

Query 2:

set null '(null)'

select grp_a
     , grp_b
     , nvl2( grp_b, 1, 0 ) nvl2_grp_b
     , count(*)
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_B, 0 )
order by grp_a, grp_b;

GRP_A  GRP_B  NVL2_GRP_B COUNT(*)
------ ------ ---------- --------
a1     (null) (null)     5
a2     (null) (null)     3
(null) b1     1          2
(null) b2     1          3
(null) b3     1          3
(null) (null) 0          8


Note how Query 2 returns "(null)" in the NVL2_GRP_B column and Query 1 does not. This is because "0" appears in both the SELECT list and the GROUP BY clause. Readers who want to understand more about why these two queries differ can reverse engineer the two into their UNION ALL equivalents using the instructions at Reverse Engineering GROUPING BY Queries. Readers who don't can simply remember this rule of thumb: always use an empty set term to generate a grand total row, never a constant.

ROLLUP

It often happens that a query will have a group A which is a superset of group B which in turn is a superset of group C. When aggregates are required at each level a query like this can be used.

set null '(null)'

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by grouping sets ( ( grp_a, grp_b, grp_c )
                       , ( grp_a, grp_b )
                       , ( grp_a )
                       , ()
                       )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         b1         c1         2
a1         b1         (null)     2
a1         b2         c1         2
a1         b2         c2         1
a1         b2         (null)     3
a1         (null)     (null)     5
a2         b3         c2         3
a2         b3         (null)     3
a2         (null)     (null)     3
(null)     (null)     (null)     8

This arrangement is common enough that SQL actually provides a shortcut for specifying these types of GROUPING SETS clauses. It uses the ROLLUP operator. Here is how the query above looks when implemented with ROLLUP.

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by ROLLUP( GRP_A, GRP_B, GRP_C )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         b1         c1         2
a1         b1         (null)     2
a1         b2         c1         2
a1         b2         c2         1
a1         b2         (null)     3
a1         (null)     (null)     5
a2         b3         c2         3
a2         b3         (null)     3
a2         (null)     (null)     3
(null)     (null)     (null)     8
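The expansion ROLLUP performs can be written down mechanically: ROLLUP(c1, ..., cn) is shorthand for the GROUPING SETS clause containing every leading prefix of the column list, down to the empty set. A small Python sketch of that expansion (illustrative only, not Oracle syntax):

```python
def rollup(cols):
    """Expand ROLLUP(c1..cn) into its equivalent list of n+1 grouping sets:
    every leading prefix of the column list, ending with the empty set ()."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

for gs in rollup(["grp_a", "grp_b", "grp_c"]):
    print(gs)
```

This prints the four sets ('grp_a', 'grp_b', 'grp_c'), ('grp_a', 'grp_b'), ('grp_a',), and (), matching the hand-written GROUPING SETS clause shown earlier.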

CUBE

There are times when all combinations of a collection of grouping columns are required, as in this query.

set null '(null)'

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by grouping sets ( ( grp_a, grp_b, grp_c )
                       , ( grp_a, grp_b )
                       , ( grp_a, grp_c )
                       , ( grp_b, grp_c )
                       , ( grp_a )
                       , ( grp_b )
                       , ( grp_c )
                       , ()
                       )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         b1         c1         2
a1         b1         (null)     2
a1         b2         c1         2
a1         b2         c2         1
a1         b2         (null)     3
a1         (null)     c1         4
a1         (null)     c2         1
a1         (null)     (null)     5
a2         b3         c2         3
a2         b3         (null)     3
a2         (null)     c2         3
a2         (null)     (null)     3
(null)     b1         c1         2
(null)     b1         (null)     2
(null)     b2         c1         2
(null)     b2         c2         1
(null)     b2         (null)     3
(null)     b3         c2         3
(null)     b3         (null)     3
(null)     (null)     c1         4
(null)     (null)     c2         4
(null)     (null)     (null)     8

This arrangement is common enough that SQL provides a shortcut called the CUBE operator to implement it. Here is how the query above looks after re-writing it to use CUBE.

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by CUBE( GRP_A, GRP_B, GRP_C )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         b1         c1         2
a1         b1         (null)     2
a1         b2         c1         2
a1         b2         c2         1
a1         b2         (null)     3
a1         (null)     c1         4
a1         (null)     c2         1
a1         (null)     (null)     5
a2         b3         c2         3
a2         b3         (null)     3
a2         (null)     c2         3
a2         (null)     (null)     3
(null)     b1         c1         2
(null)     b1         (null)     2
(null)     b2         c1         2
(null)     b2         c2         1
(null)     b2         (null)     3
(null)     b3         c2         3
(null)     b3         (null)     3
(null)     (null)     c1         4
(null)     (null)     c2         4
(null)     (null)     (null)     8
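CUBE's expansion is equally mechanical: CUBE(c1, ..., cn) is shorthand for the GROUPING SETS clause containing all 2**n subsets of the column list. A Python sketch of that expansion (illustrative only, not Oracle syntax):

```python
from itertools import combinations

def cube(cols):
    """Expand CUBE(c1..cn) into all 2**n column subsets, largest first."""
    return [s for r in range(len(cols), -1, -1) for s in combinations(cols, r)]

sets = cube(["grp_a", "grp_b", "grp_c"])
print(len(sets))  # 8 grouping sets for three columns
```

For three columns this yields the same eight grouping sets that the hand-written GROUPING SETS query listed explicitly.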

Concatenated Groupings

The last type of grouping shortcut we will examine is called a Concatenated Grouping. With it one can re-write a query like this one, which effectively performs a cross-product of GRP_A with GRP_B and GRP_C,

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by grouping sets ( ( grp_a, grp_b )
                       , ( grp_a, grp_c )
                       )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         b1                    2
a1         b2                    3
a1                    c1         4
a1                    c2         1
a2         b3                    3
a2                    c2         3

into one like this.

set null '(null)'

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by grp_a, grouping sets( grp_b, grp_c )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         b1         (null)     2
a1         b2         (null)     3
a1         (null)     c1         4
a1         (null)     c2         1
a2         b3         (null)     3
a2         (null)     c2         3

The cross-product effect is more apparent when a query like this one

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by grouping sets ( ( grp_a, grp_c )
                       , ( grp_a, grp_d )
                       , ( grp_b, grp_c )
                       , ( grp_b, grp_d )
                       )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         (null)     c1         4
a1         (null)     c2         1
a1         (null)     (null)     5
a2         (null)     c2         3
a2         (null)     (null)     3
(null)     b1         c1         2
(null)     b1         (null)     2
(null)     b2         c1         2
(null)     b2         c2         1
(null)     b2         (null)     3
(null)     b3         c2         3
(null)     b3         (null)     3

is re-written into one like this.

select grp_a
     , grp_b
     , grp_c
     , count(*)
from t
group by grouping sets( grp_a, grp_b ), grouping sets( grp_c, grp_d )
order by 1, 2, 3;

GRP_A      GRP_B      GRP_C      COUNT(*)
---------- ---------- ---------- --------
a1         (null)     c1         4
a1         (null)     c2         1
a1         (null)     (null)     5
a2         (null)     c2         3
a2         (null)     (null)     3
(null)     b1         c1         2
(null)     b1         (null)     2
(null)     b2         c1         2
(null)     b2         c2         1
(null)     b2         (null)     3
(null)     b3         c2         3
(null)     b3         (null)     3
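The cross-product rule behind concatenated groupings can be sketched in Python (illustrative only, not Oracle syntax): each final grouping set is formed by concatenating one member from each GROUPING SETS list.

```python
from itertools import product

def concatenated(*grouping_sets):
    """Concatenated grouping sets expand to the cross-product of their members,
    concatenating one member tuple from each list into a single grouping set."""
    return [sum(combo, ()) for combo in product(*grouping_sets)]

# group by grouping sets( grp_a, grp_b ), grouping sets( grp_c, grp_d )
expanded = concatenated([("grp_a",), ("grp_b",)], [("grp_c",), ("grp_d",)])
print(expanded)
```

The result is the same four grouping sets, (grp_a, grp_c), (grp_a, grp_d), (grp_b, grp_c), and (grp_b, grp_d), that the longer GROUPING SETS query listed explicitly.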

Personally I have never found the need to use concatenated groupings. I find that specifically listing the desired groupings in a single GROUPING SETS clause or using a single ROLLUP or CUBE operator makes my queries easier to understand and debug. Concatenated groupings can, however, prove useful in data warehouse queries that deal with hierarchical cubes of data. See Concatenated Groupings for more information.

GROUP_ID

Unlike a regular GROUP BY clause, including the same column more than once in a GROUPING SETS clause produces duplicate rows.

select grp_a, count(*)
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_A )
order by grp_a ;

GRP_A      COUNT(*)
---------- --------
a1         5
a1         5
a2         3
a2         3

select grp_a, count(*)
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_A, GRP_A )
order by grp_a ;

GRP_A      COUNT(*)
---------- --------
a1         5
a1         5
a1         5
a2         3
a2         3
a2         3

The GROUP_ID function can be used to distinguish duplicates from each other.

select grp_a, count(*), GROUP_ID()
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_A, GRP_A )
order by grp_a, group_id() ;

GRP_A      COUNT(*) GROUP_ID()
---------- -------- ----------
a1         5        0
a1         5        1
a1         5        2
a2         3        0
a2         3        1
a2         3        2

In the trivial example above there would seem to be little practical use for GROUP_ID. There are times, however, when more complex GROUP BY clauses can return duplicate rows. It is in such queries that GROUP_ID proves useful.

Note that GROUP_ID will always be 0 in a result set that contains no duplicates.

select grp_a, grp_b, count(*), GROUP_ID()
from t
GROUP BY GROUPING SETS ( GRP_A, GRP_B )
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*) GROUP_ID()
---------- ---------- -------- ----------
a1                    5        0
a2                    3        0
           b1         2        0
           b2         3        0
           b3         3        0

Grouping by NULL Values

In the examples used thus far in the tutorial our base table had no null values in it. Let's now look at grouping a table that does contain null values.

set null '(null)'

select *
from t2
order by grp_a, grp_b ;

GRP_A      GRP_B      VAL
---------- ---------- ----------
A1         X1         10
A1         X2         40
A1         (null)     20
A1         (null)     30
A1         (null)     50
A2         (null)     60

Now consider the following GROUP BY query.

select grp_a, grp_b, count(*)
from t2
group by grp_a, grp_b
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- --------
A1         X1         1
A1         X2         1
A1         (null)     3
A2         (null)     1

So far so good, but let's use GROUPING SETS next and see what happens.

select grp_a, grp_b, count(*)
from t2
GROUP BY GROUPING SETS( (GRP_A, GRP_B), GRP_A )
order by grp_a, grp_b ;

GRP_A      GRP_B      COUNT(*)
---------- ---------- --------
A1         X1         1
A1         X2         1
A1         (null)     3
A1         (null)     5
A2         (null)     1
A2         (null)     1

We now have two rows with "(null)" under GRP_B for each GRP_A value, one representing the null values stored in T2.GRP_B and the other representing the set of all values in T2.GRP_B.

Gotcha - NVL and NVL2

One might expect that NVL or NVL2 could be used to distinguish the two nulls, like this

select grp_a
     , NVL( t2.GRP_B, 'n/a' ) AS GRP_B
     , nvl2( t2.grp_b, 1, 0 ) as test
     , count(*)
from t2
GROUP BY GROUPING SETS( (GRP_A, GRP_B), GRP_A )
order by grp_a, grp_b ;

GRP_A      GRP_B      TEST COUNT(*)
---------- ---------- ---- --------
A1         X1         1    1
A1         X2         1    1
A1         n/a        0    5
A1         n/a        0    3
A2         n/a        0    1
A2         n/a        0    1

but this is not the case because functions in the SELECT list operate on an intermediate form of the result set created after the GROUP BY clause is applied, not before. In the next topic we see how the GROUPING function can help us distinguish the two types of nulls.

GROUPING

The GROUPING function tells us whether or not a null in a result set represents the set of all values produced by a GROUPING SETS, ROLLUP, or CUBE operation. A value of "1" tells us it does, a value of "0" tells us it does not. In the output of the following query two of the four nulls represent the set of all GRP_B values.

set null '(null)'

select grp_a
     , grp_b
     , count(*)
     , GROUPING( GRP_A ) GROUPING_GRP_A
     , GROUPING( GRP_B ) GROUPING_GRP_B
from t2
group by grouping sets( (grp_a, grp_b), grp_a )
order by 1, 2;

GRP_A      GRP_B      COUNT(*) GROUPING_GRP_A GROUPING_GRP_B
---------- ---------- -------- -------------- --------------
A1         X1         1        0              0
A1         X2         1        0              0
A1         (null)     3        0              0
A1         (null)     5        0              1
A2         (null)     1        0              0
A2         (null)     1        0              1

Of course adding a column with zeros and ones to a report isn't the most user friendly way to distinguish grouped values. However, GROUPING can be used with DECODE to insert labels like "Total" into the result set. Here is one example.

select grp_a as "Group A", decode ( GROUPING( GRP_B )

Page 38: PL SQL Questions

, 1, 'Total:' , grp_b ) as "Group B", count(*) as "Count"from t2group by grouping sets( (grp_a, grp_b), grp_a )order by grp_a, GROUPING( GRP_B ), grp_b; Group A Group B Count---------- ---------- ----------A1 X1 1A1 X2 1A1 (null) 3A1 Total: 5A2 (null) 1A2 Total: 1 

Nulls and Aggregate Functions

In this topic we explored working with null values in GROUP BY columns. To learn how aggregate functions like COUNT() and SUM() deal with null values in non-GROUP BY columns see Nulls and Aggregate Functions.

Gotcha - ORA-00979

When using ORDER BY we need to be careful with the selection of column aliases. For example, say we attempted this query.

select grp_a
     , decode( grouping( grp_b ), 1, 'Total:', grp_b ) AS GRP_B
     , count(*)
from t2
group by grouping sets( (grp_a, grp_b), grp_a )
order by grouping( GRP_B );

     , decode( grouping( grp_b ), 1, 'Total:', grp_b ) AS GRP_B
       *
ERROR at line 3:
ORA-00979: not a GROUP BY expression

Note how the table has a column called GRP_B and the SELECT list has a column alias also called GRP_B. In the ORDER BY GROUPING( GRP_B ) clause one might expect the "GRP_B" term to refer to the table column, but Oracle interprets it as referring to the column alias, hence the ORA-00979 error.


To work around the error we can either prefix the column name with its table name

select grp_a
     , decode( grouping( grp_b ), 1, 'Total:', grp_b ) AS GRP_B
     , count(*)
from t2
group by grouping sets( (grp_a, grp_b), grp_a )
order by grouping( T2.GRP_B );

GRP_A      GRP_B      COUNT(*)
---------- ---------- --------
A1         (null)     3
A1         X1         1
A1         X2         1
A2         (null)     1
A1         Total:     5
A2         Total:     1

or change the column alias.

select grp_a as "Group A", decode( grouping( grp_b ), 1, 'Total:', grp_b ) AS "Group B", count(*) as "Count"from t2group by grouping sets( (grp_a, grp_b), grp_a )order by grouping( GRP_B ); Group A Group B Count---------- ---------- ----------A1 (null) 3A1 X1 1A1 X2 1A2 (null) 1A1 Total: 5A2 Total: 1 

GROUPING_ID

In the preceding topic we saw how the GROUPING function could be used to identify null values representing the set of all values produced by a GROUPING SETS, ROLLUP, or CUBE operation. What if we wanted to distinguish entire rows from each other? We could use a number of different GROUPING() calls like this

column bit_vector format a10

select TO_CHAR( GROUPING( GRP_A ) ) || TO_CHAR( GROUPING( GRP_B ) ) AS BIT_VECTOR
     , DECODE ( TO_CHAR( GROUPING( GRP_A ) ) || TO_CHAR( GROUPING( GRP_B ) )
              , '01', 'Group "' || GRP_A || '" Total'
              , '10', 'Group "' || GRP_B || '" Total'
              , '11', 'Grand Total'
              , NULL ) AS LABEL
     , count(*)
from t2
group by grouping sets ( grp_a, grp_b, () )
order by GROUPING( GRP_A ), grp_a, GROUPING( GRP_B ), grp_b;

BIT_VECTOR LABEL                    COUNT(*)
---------- ------------------------ --------
01         Group "A1" Total         5
01         Group "A2" Total         1
10         Group "X1" Total         1
10         Group "X2" Total         1
10         Group "" Total           4
11         Grand Total              6

but if the number of grouping sets were large concatenating all the required GROUPING() terms together would get cumbersome. Fortunately for us the GROUPING_ID function exists. It yields the decimal value of a bit vector (a string of zeros and ones) formed by concatenating all the GROUPING values for its parameters. The following example shows how it works.

select to_char( grouping( grp_a ) )
       || to_char( grouping( grp_b ) ) as bit_vector -- this column is only included for clarity
     , GROUPING_ID( GRP_A, GRP_B )
     , grp_a
     , grp_b
     , count(*)
from t2
group by grouping sets ( grp_a, grp_b, () )
order by GROUPING_ID( GRP_A, GRP_B ), grp_a, grp_b;

BIT_VECTOR GROUPING_ID(GRP_A,GRP_B) GRP_A      GRP_B      COUNT(*)
---------- ------------------------ ---------- ---------- --------
01         1                        A1                    5
01         1                        A2                    1
10         2                                   X1         1
10         2                                   X2         1
10         2                                              4
11         3                                              6
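The bit-vector-to-decimal conversion GROUPING_ID performs is easy to verify by hand. A Python sketch (illustrative only; `bits` stands for the list of GROUPING() values, an assumption of this sketch rather than Oracle syntax):

```python
def grouping_id(bits):
    """GROUPING_ID(c1..cn) = the decimal value of the bit vector formed by
    concatenating GROUPING(c1)..GROUPING(cn), read as a base-2 number."""
    return int("".join(str(b) for b in bits), 2)

# The bit vectors from the query above and their decimal values
for bits in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(bits, "->", grouping_id(bits))
```

The values 1, 2, and 3 printed for the last three bit vectors match the GROUPING_ID column in the result set above.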

Here is how we could use GROUPING_ID to streamline our original query.

select DECODE ( GROUPING_ID( GRP_A, GRP_B )
              , 1, 'Group "' || GRP_A || '" Total'
              , 2, 'Group "' || GRP_B || '" Total'
              , 3, 'Grand Total'
              , NULL ) AS LABEL
     , count(*)
from t2
group by grouping sets ( grp_a, grp_b, () )
order by GROUPING_ID( GRP_A, GRP_B ), grp_a, grp_b;

LABEL                    COUNT(*)
------------------------ --------
Group "A1" Total         5
Group "A2" Total         1
Group "X1" Total         1
Group "X2" Total         1
Group "" Total           4
Grand Total              6

Composite Columns

The following example shows how GROUPING_ID works when a composite column, (GRP_A, GRP_B), is included in the GROUPING SETS clause.

select GROUPING_ID( GRP_A, GRP_B )
     , grp_a
     , grp_b
     , count(*)
from t2
group by grouping sets ( (grp_a, grp_b), grp_a, grp_b, () )
order by 1, 2, 3;

GROUPING_ID(GRP_A,GRP_B) GRP_A      GRP_B      COUNT(*)
------------------------ ---------- ---------- --------
0                        A1         X1         1
0                        A1         X2         1
0                        A1                    3
0                        A2                    1
1                        A1                    5
1                        A2                    1
2                                   X1         1
2                                   X2         1
2                                              4
3                                              6

GROUPING_ID and HAVING

GROUPING_ID can also be used in the HAVING clause to filter out unwanted groupings. Say, for example, we started with a query like this one

select grouping_id( grp_a, grp_b ), grp_a, grp_b, count(*)
from t2
group by cube( grp_a, grp_b )
order by 1, 2, 3;

GROUPING_ID(GRP_A,GRP_B) GRP_A      GRP_B      COUNT(*)
------------------------ ---------- ---------- --------
0                        A1         X1         1
0                        A1         X2         1
0                        A1                    3
0                        A2                    1
1                        A1                    5
1                        A2                    1
2                                   X1         1
2                                   X2         1
2                                              4
3                                              6

and then we wanted to exclude the empty set grouping (the one with a GROUPING_ID of "3"). We simply add a HAVING clause as follows.

select grouping_id( grp_a, grp_b ), grp_a, grp_b, count(*)
from t2
group by cube( grp_a, grp_b )
HAVING GROUPING_ID( GRP_A, GRP_B ) != 3
order by 1, 2, 3;

GROUPING_ID(GRP_A,GRP_B) GRP_A      GRP_B      COUNT(*)
------------------------ ---------- ---------- --------
0                        A1         X1         1
0                        A1         X2         1
0                        A1                    3
0                        A2                    1
1                        A1                    5
1                        A2                    1
2                                   X1         1
2                                   X2         1
2                                              4

Reverse Engineering GROUPING BY Queries

At times we are faced with a complex GROUP BY query written by someone else, and figuring out the equivalent UNION ALL query can help us better understand its results. This is not as easy as it may first seem. A query like this, for example,

set null (null)

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ROLLUP ( GRP_A, GRP_B )
order by grp_a, grp_b;

GRP_A      GRP_B      NVL_GRP_A_ NVL2_GRP_B COUNT(*)
---------- ---------- ---------- ---------- --------
a1         b1         b1         1          2
a1         b2         b2         1          3
a1         (null)     a1         0          5
a2         b3         b3         1          3
a2         (null)     a2         0          3
(null)     (null)     (null)     0          8

is not simply the result of unioning together three identical subqueries with different GROUP BY clauses.

set null '(null)'

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ()
UNION ALL
select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ( GRP_A )
UNION ALL
select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ( GRP_A, GRP_B )
order by grp_a, grp_b;

  grp_a
  *
ERROR at line 2:
ORA-00979: not a GROUP BY expression

As you can see, such a query produces an error because the first and second subqueries select columns that are not in the GROUP BY clause. To determine the real equivalent UNION query we can use the following algorithm.

Step 1

Replace any ROLLUP or CUBE operators with their equivalent GROUPING SETS operator. In our example the query

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ROLLUP ( GRP_A, GRP_B )
order by grp_a, grp_b;

GRP_A      GRP_B      NVL_GRP_A_ NVL2_GRP_B COUNT(*)
---------- ---------- ---------- ---------- --------
a1         b1         b1         1          2
a1         b2         b2         1          3
a1         (null)     a1         0          5
a2         b3         b3         1          3
a2         (null)     a2         0          3
(null)     (null)     (null)     0          8

is replaced with

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY GROUPING SETS ( ()
                       , ( GRP_A )
                       , ( GRP_A, GRP_B )
                       )
order by grp_a, grp_b;

GRP_A      GRP_B      NVL_GRP_A_ NVL2_GRP_B COUNT(*)
---------- ---------- ---------- ---------- --------
a1         b1         b1         1          2
a1         b2         b2         1          3
a1         (null)     a1         0          5
a2         b3         b3         1          3
a2         (null)     a2         0          3
(null)     (null)     (null)     0          8

Step 2a

Next start with a query that groups by only the first term in the GROUPING SETS clause, which is an empty set in our example.

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY () ;

If the SELECT list contains columns that are not in the GROUP BY clause then replace those columns with NULL. In the query above both GRP_A and GRP_B are absent from the GROUP BY clause so we replace all occurrences of these columns in the SELECT list with NULL.

column grp_a format a6
column grp_b format a6
column nvl_grp_a_b format a11
column nvl2_grp_b format 999999999

select NULL as grp_a
     , NULL as grp_b
     , nvl( NULL, NULL ) as nvl_grp_a_b
     , nvl2( NULL, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY () ;

GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B COUNT(*)
------ ------ ----------- ---------- --------
(null) (null) (null)      0          8

Step 2b

Now we repeat the first step using the second term in the GROUPING SETS clause, ( GRP_A ).

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ( GRP_A ) ;

This time GRP_B is in the SELECT list but it is not in the GROUP BY list. We therefore need to replace GRP_B with NULL.

select grp_a
     , NULL as grp_b
     , nvl( NULL, grp_a ) as nvl_grp_a_b
     , nvl2( NULL, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ( GRP_A ) ;

GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B COUNT(*)
------ ------ ----------- ---------- --------
a1     (null) a1          0          5
a2     (null) a2          0          3

Step 2c

For the last set in the GROUPING SETS clause all selected columns are listed in the GROUP BY clause so no further transformation is needed. We can use the original SELECT list as-is.

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ( GRP_A, GRP_B ) ;

GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B COUNT(*)
------ ------ ----------- ---------- --------
a1     b1     b1          1          2
a1     b2     b2          1          3
a2     b3     b3          1          3

Step 3

The next step is to combine the three step 2 queries with UNION ALL and add an ORDER BY clause.

select NULL as grp_a
     , NULL as grp_b
     , nvl( NULL, NULL ) as nvl_grp_a_b
     , nvl2( NULL, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
group by ()
UNION ALL
select grp_a
     , NULL as grp_b
     , nvl( NULL, grp_a ) as nvl_grp_a_b
     , nvl2( NULL, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
group by ( grp_a )
UNION ALL
select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
group by ( grp_a, grp_b )
ORDER BY GRP_A, GRP_B;

GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B COUNT(*)
------ ------ ----------- ---------- --------
a1     b1     b1          1          2
a1     b2     b2          1          3
a1     (null) a1          0          5
a2     b3     b3          1          3
a2     (null) a2          0          3
(null) (null) (null)      0          8

Step 4 (Optional)

Lastly we reduce expressions like nvl( NULL, NULL ) and nvl2( NULL , 1, 0 ) to simpler, equivalent terms.

select null as grp_a
     , null as grp_b
     , NULL AS NVL_GRP_A_B
     , 0 AS NVL2_GRP_B
     , count(*)
from t
group by ()
union all
select grp_a
     , null as grp_b
     , GRP_A AS NVL_GRP_A_B
     , 0 AS NVL2_GRP_B
     , count(*)
from t
group by ( grp_a )
union all
select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
group by ( grp_a, grp_b )
order by grp_a, grp_b;

GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B COUNT(*)
------ ------ ----------- ---------- --------
a1     b1     b1          1          2
a1     b2     b2          1          3
a1     (null) a1          0          5
a2     b3     b3          1          3
a2     (null) a2          0          3
(null) (null) (null)      0          8
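The heart of this reverse-engineering algorithm can be sketched in a few lines of Python (illustrative only; the in-memory rows mirror the (GRP_A, GRP_B) columns of the sample table T): for each grouping set, GROUP BY the listed columns after replacing every non-grouped column with NULL, then concatenate the branches as UNION ALL would.

```python
from collections import Counter

# In-memory copy of the (grp_a, grp_b) columns of the sample table T.
rows = [("a1", "b1")] * 2 + [("a1", "b2")] * 3 + [("a2", "b3")] * 3

def branch(rows, keep):
    """One UNION ALL branch: GROUP BY the column positions in `keep`; every
    column absent from the GROUP BY list is replaced with None (NULL) first."""
    masked = [tuple(v if i in keep else None for i, v in enumerate(r)) for r in rows]
    return list(Counter(masked).items())

# ROLLUP( GRP_A, GRP_B ) == GROUPING SETS ( (), (GRP_A), (GRP_A, GRP_B) )
union_all = branch(rows, set()) + branch(rows, {0}) + branch(rows, {0, 1})
for (a, b), n in union_all:
    print(a, b, n)
```

The six rows produced match the ROLLUP query's result set, with None standing in for NULL.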

Result

The end result of the last step is a query which returns the same rows as the original GROUPING SETS query, which is repeated below for your convenience.

select grp_a
     , grp_b
     , nvl( grp_b, grp_a ) as nvl_grp_a_b
     , nvl2( grp_b, 1, 0 ) as nvl2_grp_b
     , count(*)
from t
GROUP BY ROLLUP ( GRP_A, GRP_B )
order by grp_a, grp_b;

GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B COUNT(*)
------ ------ ----------- ---------- --------
a1     b1     b1          1          2
a1     b2     b2          1          3
a1     (null) a1          0          5
a2     b3     b3          1          3
a2     (null) a2          0          3
(null) (null) (null)      0          8

Setup

Run the code on this page in SQL*Plus to create the sample tables, data, etc. used by the examples in this section.

create table t
( grp_a varchar2(10)
, grp_b varchar2(10)
, grp_c varchar2(10)
, grp_d varchar2(10)
, val   number
) ;

insert into t values ( 'a1' , 'b1' , 'c1', 'd1', '10' ) ;

insert into t values ( 'a1' , 'b1' , 'c1', 'd1', '20' ) ;
insert into t values ( 'a1' , 'b2' , 'c1', 'd1', '30' ) ;
insert into t values ( 'a1' , 'b2' , 'c1', 'd1', '40' ) ;
insert into t values ( 'a1' , 'b2' , 'c2', 'd1', '50' ) ;
insert into t values ( 'a2' , 'b3' , 'c2', 'd2', '12' ) ;
insert into t values ( 'a2' , 'b3' , 'c2', 'd2', '22' ) ;
insert into t values ( 'a2' , 'b3' , 'c2', 'd2', '32' ) ;

commit ;

create table t2
( grp_a varchar2(10)
, grp_b varchar2(10)
, val   number
) ;

insert into t2 values ( 'A1' , 'X1' , '10' ) ;
insert into t2 values ( 'A1' , 'X2' , '40' ) ;
insert into t2 values ( 'A1' , null , '20' ) ;
insert into t2 values ( 'A1' , null , '30' ) ;
insert into t2 values ( 'A1' , null , '50' ) ;
insert into t2 values ( 'A2' , null , '60' ) ;

commit ; 

Cleanup

Run the code on this page to drop the sample tables, procedures, etc. created in earlier parts of this section. To clear session state changes (e.g. those made by SET, COLUMN, and VARIABLE commands) exit your SQL*Plus session after running these cleanup commands.

drop table t ;
drop table t2 ;

exit


Hierarchical Data

This section presents various topics related to hierarchical data (also known as "tree structured" data). An example of hierarchical data is shown below.

KEY        PARENT_KEY
---------- ----------
nls        (null)
demo       nls
mesg       nls
server     (null)
bin        server
config     server
log        config
ctx        server
admin      ctx
data       ctx
delx       data
enlx       data
eslx       data
mig        ctx

It is often useful to order and display such rows using the hierarchical relationship. Doing so yields a result set that looks like this (KEY values are indented to highlight the hierarchy).

KEY_INDENTED    KEY_PATH
--------------- -------------------------
nls             /nls
 demo           /nls/demo
 mesg           /nls/mesg
server          /server
 bin            /server/bin
 config         /server/config
  log           /server/config/log
 ctx            /server/ctx
  admin         /server/ctx/admin
  data          /server/ctx/data
   delx         /server/ctx/data/delx
   enlx         /server/ctx/data/enlx
   eslx         /server/ctx/data/eslx
  mig           /server/ctx/mig

In this tutorial we explore various Oracle mechanisms for working with hierarchical data.

Connecting Rows

Say we wanted to take the following directory names from a file system and store them in a database table.

/nls
/nls/demo
/nls/mesg
/server
/server/bin
/server/config
/server/config/log
/server/ctx
/server/ctx/admin
/server/ctx/data
/server/ctx/data/delx
/server/ctx/data/enlx
/server/ctx/data/eslx
/server/ctx/mig

To do this we could use a table with a KEY column, which holds the directory name, and a PARENT_KEY column, which connects the directory to its parent directory. (Directory names like these would not typically be used as primary keys. We are bending the rules here for illustrative purposes.)

select * from t ;

KEY        PARENT_KEY NAME
---------- ---------- ----------
nls        (null)     NLS
demo       nls        DATA
mesg       nls        DEMO
server     (null)     SERVER
bin        server     BIN
config     server     CONFIG
log        config     LOG
ctx        server     CTX
admin      ctx        ADMIN
data       ctx        DATA
delx       data       DELX
enlx       data       ENLX
eslx       data       ESLX
mig        ctx        MESG

To connect and order the data in this table using the PARENT_KEY hierarchy we can create a Hierarchical Query using the START WITH and CONNECT BY clauses of the SELECT command. START WITH identifies the topmost rows in the hierarchy. CONNECT BY identifies all subsequent rows in the hierarchy.

The following snippet returns rows sorted hierarchically, starting from the root rows (those with no parents) on down through to the leaf rows (those with no children).

select key
     , level
from t
START WITH parent_key is null
CONNECT BY parent_key = prior key
;

KEY        LEVEL
---------- ------
nls        1
demo       2
mesg       2
server     1
bin        2
config     2
log        3
ctx        2
admin      3
data       3
delx       4
enlx       4
eslx       4
mig        3
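The START WITH / CONNECT BY evaluation can be pictured as a depth-first walk over the parent/child relationships. The following Python sketch (an illustration of the semantics, not Oracle code; the (key, parent_key) pairs mirror the sample table) reproduces both the hierarchical row order and the LEVEL pseudocolumn.

```python
# (key, parent_key) pairs mirroring the sample table T.
rows = [("nls", None), ("demo", "nls"), ("mesg", "nls"), ("server", None),
        ("bin", "server"), ("config", "server"), ("log", "config"),
        ("ctx", "server"), ("admin", "ctx"), ("data", "ctx"),
        ("delx", "data"), ("enlx", "data"), ("eslx", "data"), ("mig", "ctx")]

def connect_by(rows, parent=None, level=1):
    """START WITH parent_key IS NULL (when parent is None),
    CONNECT BY parent_key = PRIOR key, walked depth first."""
    out = []
    for key, parent_key in rows:
        if parent_key == parent:
            out.append((key, level))                      # visit the row ...
            out.extend(connect_by(rows, key, level + 1))  # ... then its children
    return out

for key, level in connect_by(rows):
    print("  " * (level - 1) + key, level)
```

The printed rows appear in the same root-to-leaf order as the hierarchical query's result set.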

The LEVEL pseudocolumn in the previous result indicates which level in the hierarchy each row is at. The topmost level is assigned a LEVEL of 1. To better illustrate hierarchical relationships the LEVEL column is commonly used to indent selected values, like this.

select lpad( ' ', level-1 ) || key as key_indented
     , level
from t
START WITH parent_key is null
CONNECT BY parent_key = prior key
;

KEY_INDENTED    LEVEL
--------------- ------
nls             1
 demo           2
 mesg           2
server          1
 bin            2
 config         2
  log           3
 ctx            2
  admin         3
  data          3
   delx         4
   enlx         4
   eslx         4
  mig           3

The PRIOR operator in hierarchical queries gives us access to column information from the parent of the current row. It can be used outside the CONNECT BY clause if required.

select lpad( ' ', level-1 ) || key as key_indented
     , PRIOR key as prior_key
     , PRIOR name as prior_name
from t
start with parent_key is null
connect by parent_key = prior key
;

KEY_INDENTED    PRIOR_KEY  PRIOR_NAME
--------------- ---------- ----------
nls             (null)     (null)
 demo           nls        NLS
 mesg           nls        NLS
server          (null)     (null)
 bin            server     SERVER
 config         server     SERVER
  log           config     CONFIG
 ctx            server     SERVER
  admin         ctx        CTX
  data          ctx        CTX
   delx         data       DATA
   enlx         data       DATA
   eslx         data       DATA
  mig           ctx        CTX

Changing Direction

To traverse the tree in the opposite direction, from leaf to root, simply choose a leaf row as the starting point and apply the PRIOR operator to the PARENT_KEY column instead of the KEY column.

select lpad( ' ', level-1 ) || key as key_indented
     , level
from t
START WITH KEY = 'delx'
connect by key = PRIOR PARENT_KEY
;

KEY_INDENTED    LEVEL
--------------- ------
delx            1
 data           2
  ctx           3
   server       4

Gotchas


CONNECT BY conditions are not applied to rows in level 1 of the hierarchy. In the following snippet note how the KEY <> 'delx' condition did not filter out the row with a KEY value of 'delx'.

select lpad( ' ', level-1 ) || key as key_indented
     , level
from t
start with key = 'delx'
connect by key = PRIOR PARENT_KEY
       and KEY <> 'delx'
;

KEY_INDENTED    LEVEL
--------------- ------
delx            1
 data           2
  ctx           3
   server       4

Order of Operations

The clauses in hierarchical queries are processed in the following order.

1. join conditions (either in the FROM clause or the WHERE clause)
2. START WITH clause
3. CONNECT BY clause
4. WHERE clause conditions that are not joins

The following two snippets demonstrate how this order of operations affects query results when filter conditions are in the WHERE clause versus when they are in the CONNECT BY clause.

Filter Condition in WHERE:

select lpad(' ', level-1 ) || key as key_indented
     , level
from t
WHERE LEVEL != 3
start with key = 'server'
connect by parent_key = prior key
;

KEY_INDENTED    LEVEL
--------------- ------
server          1
 bin            2
 config         2
 ctx            2
   delx         4
   enlx         4
   eslx         4

Filter Condition in CONNECT BY:

select lpad(' ', level-1 ) || key as key_indented
     , level
from t
start with key = 'server'
CONNECT BY parent_key = prior key
       and LEVEL != 3
;

KEY_INDENTED    LEVEL
--------------- ------
server          1
 bin            2
 config         2
 ctx            2

Sorting

Since START WITH and CONNECT BY apply a hierarchical sorting scheme to your data, you should generally not use any features that apply other sorting schemes, such as ORDER BY or GROUP BY, in your hierarchical queries. Doing so would negate the need for START WITH and CONNECT BY in the first place.

For example, given data with the following hierarchies

KEY_INDENTED
---------------
nls
 demo
 mesg
server
 bin
 config
  log
 ctx
  admin
  data
   delx
   enlx
   eslx
  mig

the ORDER BY clause in the hierarchical query on the left below destroys the hierarchical order. It yields the same results as if CONNECT BY was not used at all.

Hierarchical Query:

select key
from t
start with parent_key is null
connect by parent_key = prior key
ORDER BY NAME;

KEY
----------
admin
bin
config
ctx
data
demo
delx
mesg
enlx
eslx
log
mig
nls
server

Regular Query:

select key
from t
ORDER BY NAME;

KEY
----------
admin
bin
config
ctx
data
demo
delx
mesg
enlx
eslx
log
mig
nls
server

ORDER SIBLINGS BY

Unlike ORDER BY and GROUP BY, the ORDER SIBLINGS BY clause will not destroy the hierarchical ordering of queries. It allows you to control the sort order of all rows with the same parent (aka "siblings"). The following examples show how ORDER SIBLINGS BY can be used to sort siblings in ascending and descending order respectively.

Ascending Siblings:

select lpad(' ', level-1) || key as key_indented
from t
start with parent_key is null
connect by parent_key = prior key
ORDER SIBLINGS BY KEY ASC;

KEY_INDENTED
---------------
nls
 demo
 mesg
server
 bin
 config
  log
 ctx
  admin
  data
   delx
   enlx
   eslx
  mig

Descending Siblings:

select lpad(' ', level-1) || key as key_indented
from t
start with parent_key is null
connect by parent_key = prior key
ORDER SIBLINGS BY KEY DESC;

KEY_INDENTED
---------------
server
 ctx
  mig
  data
   eslx
   enlx
   delx
  admin
 config
  log
 bin
nls
 mesg
 demo
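Conceptually, ORDER SIBLINGS BY performs a depth-first walk in which each row's children are sorted before being visited. The following Python sketch (an illustration added here, not part of the original article) reproduces that walk over the sample hierarchy, with a dict of parent-to-children lists standing in for table T:

```python
# children lists standing in for table T (parent None marks the roots)
children = {
    None: ["nls", "server"],
    "nls": ["demo", "mesg"],
    "server": ["bin", "config", "ctx"],
    "config": ["log"],
    "ctx": ["admin", "data", "mig"],
    "data": ["delx", "enlx", "eslx"],
}

def walk(parent=None, level=1, reverse=False):
    # Visit siblings in sorted order, then recurse into each one:
    # this is the effect ORDER SIBLINGS BY KEY ASC/DESC has on the
    # result set of a CONNECT BY query.
    rows = []
    for key in sorted(children.get(parent, []), reverse=reverse):
        rows.append(" " * (level - 1) + key)   # indent like lpad(' ', level-1)
        rows.extend(walk(key, level + 1, reverse))
    return rows

print("\n".join(walk()))              # ascending siblings
print("\n".join(walk(reverse=True)))  # descending siblings
```

Running `walk()` yields the same indented listing as the ascending example above, and `walk(reverse=True)` the descending one.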

Oracle 8i and Earlier

The ORDER SIBLINGS BY clause is only available in Oracle version 9i or greater. For earlier versions a custom, recursive PL/SQL function can be used in place of ORDER SIBLINGS BY.

------------------------------------------------------------
-- Note:
--
-- This function is only for demonstration purposes.
-- In a real application more robust code would be needed
-- to guard against things like separator characters
-- appearing in KEY values, hierarchical loops in the data,
-- etc.
------------------------------------------------------------

create or replace function KEY_PATH
  ( p_key       t.key%type
  , p_separator varchar2 default '/'
  ) return varchar2
is
  v_parent_key t.parent_key%type ;
  v_key        t.key%type ;
begin

  select parent_key, key
  into   v_parent_key, v_key
  from   t
  where  key = p_key ;

  if v_parent_key is null then
    return ( p_separator || v_key );
  else
    return ( KEY_PATH( v_parent_key, p_separator ) ||
             p_separator || v_key );
  end if;

exception
  when no_data_found then
    return( null );
end;
/

show errors

No errors.

 

Ascending Siblings:

select lpad(' ', level-1) || key as key_indented
from t
start with parent_key is null
connect by parent_key = prior key
ORDER BY KEY_PATH( KEY, '/' ) ASC;

KEY_INDENTED
---------------
nls
 demo
 mesg
server
 bin
 config
  log
 ctx
  admin
  data
   delx
   enlx
   eslx
  mig

Descending Siblings:

select lpad(' ', level-1) || key as key_indented
from t
start with parent_key is null
connect by parent_key = prior key
ORDER BY RPAD( KEY_PATH( KEY, '/' ), 50, '~' ) DESC;

KEY_INDENTED
---------------
server
 ctx
  mig
  data
   eslx
   enlx
   delx
  admin
 config
  log
 bin
nls
 mesg
 demo

Gotchas

KEY_PATH's p_separator character should be a character that

1. does not exist in values under T.KEY
2. sorts lower than all characters that exist in T.KEY

For descending siblings the code RPAD( KEY_PATH( KEY, '/' ), 50, '~' ) should use a length larger than any possible KEY_PATH value ("50" in this example) and it should use a padding character that sorts higher than all characters contained in T.KEY ("~" in this example). Violating these rules can result in incorrectly sorted output.
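The reasoning behind these rules can be sketched outside the database. The Python snippet below (an illustration added here, not part of the original article) shows that '/' sorts lower than letters, so an ascending sort of paths keeps each parent immediately ahead of its children, while a naive descending sort puts children before parents until the paths are right-padded with a high-sorting character like '~':

```python
# hypothetical paths mirroring KEY_PATH output for table T
paths = ["/server", "/server/bin", "/server/ctx", "/server/ctx/mig"]

# Ascending: '/' sorts below letters, so each parent path sorts
# immediately before its children's paths.
ascending = sorted(paths)

# Naive descending: children now sort BEFORE their parents,
# destroying the hierarchy.
naive_desc = sorted(paths, reverse=True)

# RPAD-style fix: right-pad with '~' (which sorts above letters),
# so each parent sorts higher than its own children.
rpad_desc = sorted(paths, key=lambda p: p.ljust(50, "~"), reverse=True)

print(ascending)   # parents precede children
print(naive_desc)  # children precede parents
print(rpad_desc)   # parents precede children again
```

If the padding character sorted below any character used in T.KEY, or the padded length were shorter than the longest path, the comparison would again fall through to the child's extra path characters and mis-sort the rows.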

CONNECT_BY_ISLEAF

The CONNECT_BY_ISLEAF pseudocolumn returns 1 if the current row is a leaf of the tree defined by the CONNECT BY condition, 0 otherwise.

select lpad(' ', level-1 ) || key as key_indented
     , CONNECT_BY_ISLEAF
from t
start with key = 'server'
connect by parent_key = prior key
;

KEY_INDENTED    CONNECT_BY_ISLEAF
--------------- -----------------
server                          0
 bin                            1
 config                         0
  log                           1
 ctx                            0
  admin                         1
  data                          0
   delx                         1
   enlx                         1
   eslx                         1
  mig                           1

It is important to recognize that CONNECT_BY_ISLEAF only considers the tree defined by the CONNECT BY condition, not that of the underlying table data.

For example, in table T the rows with a KEY of 'config' and 'ctx' have descendants (children and grandchildren) and are therefore not leaf nodes in that context. In the following query, however, those same rows are considered leaf nodes (they have a CONNECT_BY_ISLEAF value of 1) because none of their descendants exist in the tree as defined by the CONNECT BY clause; they are filtered out by the "LEVEL <= 2" condition.

select lpad(' ', level-1 ) || key as key_indented
     , connect_by_isleaf
from t
start with key = 'server'
CONNECT BY parent_key = prior key
       and LEVEL <= 2 -- filters out descendants of "config" and "ctx"
;

KEY_INDENTED    CONNECT_BY_ISLEAF
--------------- -----------------
server                          0
 bin                            1
 config                         1
 ctx                            1

As we saw with the LEVEL column in the preceding tutorial, the order in which the CONNECT BY and WHERE clauses are evaluated can also affect the behaviour of the CONNECT_BY_ISLEAF pseudocolumn. The following example illustrates this. In it, "LEVEL <= 2" is placed in the WHERE clause rather than the CONNECT BY clause, causing CONNECT_BY_ISLEAF to be 0 for "config" and "ctx" even though those rows look like leaf nodes in the end result.

select lpad(' ', level-1 ) || key as key_indented
     , connect_by_isleaf
from t
WHERE -- filter out descendants of "config" and "ctx",
      -- this time using the WHERE clause
      level <= 2
start with key = 'server'
connect by parent_key = prior key
;

KEY_INDENTED    CONNECT_BY_ISLEAF
--------------- -----------------
server                          0
 bin                            1
 config                         0
 ctx                            0

CONNECT_BY_ROOT

The CONNECT_BY_ROOT operator returns column information from the root row of the hierarchy.

select lpad(' ', level-1 ) || key as key_indented
     , level
     , key
     , name
     , CONNECT_BY_ROOT key  as root_key
     , CONNECT_BY_ROOT name as root_name
from t
start with parent_key is null
connect by parent_key = prior key
;

KEY_INDENTED     LEVEL KEY        NAME       ROOT_KEY   ROOT_NAME
--------------- ------ ---------- ---------- ---------- ----------
nls                  1 nls        NLS        nls        NLS
 demo                2 demo       DATA       nls        NLS
 mesg                2 mesg       DEMO       nls        NLS
server               1 server     SERVER     server     SERVER
 bin                 2 bin        BIN        server     SERVER
 config              2 config     CONFIG     server     SERVER
  log                3 log        LOG        server     SERVER
 ctx                 2 ctx        CTX        server     SERVER
  admin              3 admin      ADMIN      server     SERVER
  data               3 data       DATA       server     SERVER
   delx              4 delx       DELX       server     SERVER
   enlx              4 enlx       ENLX       server     SERVER
   eslx              4 eslx       ESLX       server     SERVER
  mig                3 mig        MESG       server     SERVER

Gotchas


The manual page for CONNECT_BY_ROOT states

You cannot specify this operator in the START WITH condition or the CONNECT BY condition.

While there would be little use for CONNECT_BY_ROOT in the START WITH condition, which already operates on the root row itself, using CONNECT_BY_ROOT in the CONNECT BY condition can be useful and, in practice, actually works in some cases (as tested in Oracle 10g). In the following example we use CONNECT_BY_ROOT in the CONNECT BY condition to prevent rows beyond level 3 from being included in the results, but only under the "server" root row.

select lpad(' ', level-1 ) || key as key_indented
     , level
     , CONNECT_BY_ROOT key as root_key
from t
start with parent_key is null
connect by parent_key = prior key
       and not ( level > 3 and connect_by_root key = 'server' )
;

KEY_INDENTED     LEVEL ROOT_KEY
--------------- ------ ----------
nls                  1 nls
 demo                2 nls
 mesg                2 nls
server               1 server
 bin                 2 server
 config              2 server
  log                3 server
 ctx                 2 server
  admin              3 server
  data               3 server
  mig                3 server

The fact that the query above contradicts the documentation yet works without error in 10g suggests a bug in either the documentation or the SQL engine.

The Gotchas section of topic CONNECT BY LEVEL Method has an example where using CONNECT_BY_ROOT in CONNECT BY does not work so well.

SYS_CONNECT_BY_PATH

The SYS_CONNECT_BY_PATH function returns a single string containing all the column values encountered in the path from root to node.

select
  lpad(' ', level-1 ) || key as key_indented
, name
, SYS_CONNECT_BY_PATH( key , '/' ) as key_path
, SYS_CONNECT_BY_PATH( name, '/' ) as name_path
from t
start with parent_key is null
connect by parent_key = prior key
;

KEY_INDENTED    NAME       KEY_PATH                  NAME_PATH
--------------- ---------- ------------------------- -------------------------
nls             NLS        /nls                      /NLS
 demo           DATA       /nls/demo                 /NLS/DATA
 mesg           DEMO       /nls/mesg                 /NLS/DEMO
server          SERVER     /server                   /SERVER
 bin            BIN        /server/bin               /SERVER/BIN
 config         CONFIG     /server/config            /SERVER/CONFIG
  log           LOG        /server/config/log        /SERVER/CONFIG/LOG
 ctx            CTX        /server/ctx               /SERVER/CTX
  admin         ADMIN      /server/ctx/admin         /SERVER/CTX/ADMIN
  data          DATA       /server/ctx/data          /SERVER/CTX/DATA
   delx         DELX       /server/ctx/data/delx     /SERVER/CTX/DATA/DELX
   enlx         ENLX       /server/ctx/data/enlx     /SERVER/CTX/DATA/ENLX
   eslx         ESLX       /server/ctx/data/eslx     /SERVER/CTX/DATA/ESLX
  mig           MESG       /server/ctx/mig           /SERVER/CTX/MESG

SYS_CONNECT_BY_PATH is only available in Oracle version 9i or greater. For earlier versions a custom, recursive PL/SQL function can be used in place of SYS_CONNECT_BY_PATH.

------------------------------------------------------------
-- Note:
--
-- This function is only for demonstration purposes.
-- In a real application more robust code would be needed
-- to guard against things like separator characters
-- appearing in KEY values, hierarchical loops in the data,
-- etc.
------------------------------------------------------------

create or replace function KEY_PATH
  ( p_key       t.key%type
  , p_separator varchar2 default '/'
  ) return varchar2
is
  v_parent_key t.parent_key%type ;
  v_key        t.key%type ;
begin

  select parent_key, key
  into   v_parent_key, v_key
  from   t
  where  key = p_key ;

  if v_parent_key is null then
    return ( p_separator || v_key );
  else
    return ( KEY_PATH( v_parent_key, p_separator ) ||
             p_separator || v_key );
  end if;

exception
  when no_data_found then
    return( null );
end;
/

show errors

No errors.

select lpad(' ', level-1 ) || key as key_indented
     , name
     , KEY_PATH( key, '/' ) as KEY_PATH
from t
start with parent_key is null
connect by parent_key = prior key
;

KEY_INDENTED    NAME       KEY_PATH
--------------- ---------- -------------------------
nls             NLS        /nls
 demo           DATA       /nls/demo
 mesg           DEMO       /nls/mesg
server          SERVER     /server
 bin            BIN        /server/bin
 config         CONFIG     /server/config
  log           LOG        /server/config/log
 ctx            CTX        /server/ctx
  admin         ADMIN      /server/ctx/admin
  data          DATA       /server/ctx/data
   delx         DELX       /server/ctx/data/delx
   enlx         ENLX       /server/ctx/data/enlx
   eslx         ESLX       /server/ctx/data/eslx
  mig           MESG       /server/ctx/mig

With this approach an additional function would be needed if the path for another column, like NAME, were required.

If no hierarchical information, e.g. LEVEL, other than a path is required then the START WITH and CONNECT BY clauses can be omitted since KEY_PATH already knows how to traverse the hierarchy.

select
  key
, name
, KEY_PATH( key, '/' ) as KEY_PATH
from t
order by KEY_PATH;

KEY        NAME       KEY_PATH
---------- ---------- -------------------------
nls        NLS        /nls
demo       DATA       /nls/demo
mesg       DEMO       /nls/mesg
server     SERVER     /server
bin        BIN        /server/bin
config     CONFIG     /server/config
log        LOG        /server/config/log
ctx        CTX        /server/ctx
admin      ADMIN      /server/ctx/admin
data       DATA       /server/ctx/data
delx       DELX       /server/ctx/data/delx
enlx       ENLX       /server/ctx/data/enlx
eslx       ESLX       /server/ctx/data/eslx
mig        MESG       /server/ctx/mig
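For readers who want to see the root-to-node recursion at the heart of KEY_PATH in isolation, here is an analogous sketch in Python (an illustration added here, not part of the original article), with a dict of key-to-parent_key pairs standing in for table T:

```python
# key -> parent_key pairs standing in for table T (None marks a root)
parent = {
    "nls": None, "demo": "nls", "mesg": "nls",
    "server": None, "bin": "server", "config": "server",
    "log": "config", "ctx": "server", "admin": "ctx",
    "data": "ctx", "delx": "data", "enlx": "data",
    "eslx": "data", "mig": "ctx",
}

def key_path(key, separator="/"):
    # Mirrors KEY_PATH: recurse up to the root row, prefixing each
    # ancestor's key with the separator along the way.
    if key not in parent:
        return None                 # like the NO_DATA_FOUND handler
    if parent[key] is None:
        return separator + key      # root row
    return key_path(parent[key], separator) + separator + key

print(key_path("delx"))   # -> /server/ctx/data/delx

# Sorting rows by their path reproduces the hierarchy without any
# START WITH / CONNECT BY traversal, just as the query above does.
for key in sorted(parent, key=key_path):
    print(key_path(key))
```

As in the PL/SQL version, building the path for a second column such as NAME would need its own lookup; the recursion itself is unchanged.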

Setup

Run the code on this page in SQL*Plus to create the sample tables, data, etc. used by the examples in this section.

create table t
( key        varchar2(10)
, parent_key varchar2(10)
, name       varchar2(10)
);

insert into t values ( 'nls'   , null     , 'NLS'    );
insert into t values ( 'demo'  , 'nls'    , 'DATA'   );
insert into t values ( 'mesg'  , 'nls'    , 'DEMO'   );

insert into t values ( 'server', null     , 'SERVER' );
insert into t values ( 'bin'   , 'server' , 'BIN'    );
insert into t values ( 'config', 'server' , 'CONFIG' );
insert into t values ( 'log'   , 'config' , 'LOG'    );
insert into t values ( 'ctx'   , 'server' , 'CTX'    );
insert into t values ( 'admin' , 'ctx'    , 'ADMIN'  );
insert into t values ( 'data'  , 'ctx'    , 'DATA'   );
insert into t values ( 'delx'  , 'data'   , 'DELX'   );
insert into t values ( 'enlx'  , 'data'   , 'ENLX'   );
insert into t values ( 'eslx'  , 'data'   , 'ESLX'   );
insert into t values ( 'mig'   , 'ctx'    , 'MESG'   );

commit;

column level        format 99999
column key_indented format a15
column root_key     format a10
column root_name    format a10
column key_path     format a25
column name_path    format a25

set null '(null)'

variable v_target_key varchar2(10)

Cleanup

Run the code on this page to drop the sample tables, procedures, etc. created in earlier parts of this section. To clear session state changes (e.g. those made by SET, COLUMN, and VARIABLE commands) exit your SQL*Plus session after running these cleanup commands.

drop table t ;
drop function KEY_PATH ;

exit


Materialized Views

A Materialized View is effectively a database table that contains the results of a query. The power of materialized views comes from the fact that, once created, Oracle can automatically synchronize a materialized view's data with its source information as required with little or no programming effort.

Materialized views can be used for many purposes, including:

- Denormalization
- Validation
- Data Warehousing
- Replication

This tutorial explores materialized view basics. After completing it you should have enough information to use materialized views effectively in simple applications. For more complex applications links at the end of the tutorial will point to information on advanced features not covered here, e.g. partitioning, refresh groups, updatable materialized views.

Terminology

With relational views the FROM clause objects a view is based on are called "base tables". With materialized views, on the other hand, these objects are called either "detail tables" (in data warehousing documentation) or "master tables" (in replication documentation and the Oracle Database SQL Reference guide). Since SQL Snippets is concerned mainly with relational uses of materialized views we will avoid the contradictory terms "master" and "detail" altogether and instead use the term "base tables", thus remaining consistent with relational view terminology.

Materialized views were originally known as "snapshots" in early releases of Oracle. The SNAPSHOT keyword is still supported for backward compatibility, but should not be used in new code. You may still see this term in some Oracle 11g materialized view error messages.


Views vs Materialized Views

Like its predecessor the view, a materialized view allows you to store the definition of a query in the database.

Table:

select * from T ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4

View:

create view v
as
select *
from t ;

select * from V ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4

Materialized View:

create materialized view mv
as
select *
from t ;

select * from MV ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4


Unlike views, however, materialized views also store the results of the query in the database. In the following queries note how the rowids for the table and the view are identical, indicating the view returns the exact same data stored in the table. The rowids of the materialized view, on the other hand, differ from those of the table. This indicates the materialized view returns a physically separate copy of the table data.

Table:

select rowid
from T
order by rowid ;

ROWID
------------------
AAAgY9AAEAAAAVfAAA
AAAgY9AAEAAAAVfAAB
AAAgY9AAEAAAAVfAAC
AAAgY9AAEAAAAVfAAD

View:

select rowid
from V
order by rowid ;

ROWID
------------------
AAAgY9AAEAAAAVfAAA
AAAgY9AAEAAAAVfAAB
AAAgY9AAEAAAAVfAAC
AAAgY9AAEAAAAVfAAD

Materialized View:

select rowid
from MV
order by rowid ;

ROWID
------------------
AAAgZFAAEAAADyEAAA
AAAgZFAAEAAADyEAAB
AAAgZFAAEAAADyEAAC
AAAgZFAAEAAADyEAAD

The difference between views and materialized views becomes even more evident when table data is updated.

update t set val = upper(val);

Table:

select * from T ;

       KEY VAL
---------- -----
         1 A
         2 B
         3 C
         4

View:

select * from V ;

       KEY VAL
---------- -----
         1 A
         2 B
         3 C
         4

Materialized View:

select * from MV ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4

Note how, after the update, the view data matches the table data but the materialized view data does not. Data in materialized views must be refreshed to keep it synchronized with its base table. Refreshing can either be done manually, as below, or automatically by Oracle in some cases.

execute dbms_mview.refresh( 'MV' );

Table:

select * from T ;

       KEY VAL
---------- -----
         1 A
         2 B
         3 C
         4

View:

select * from V ;

       KEY VAL
---------- -----
         1 A
         2 B
         3 C
         4

Materialized View:

select * from MV ;

       KEY VAL
---------- -----
         1 A
         2 B
         3 C
         4

Now that the materialized view has been refreshed its data matches that of its base table.


Cleanup

drop materialized view mv ;

drop view v ;

update t set val = lower(val);
commit;

REFRESH COMPLETE

There are various ways to refresh the data in a materialized view, the simplest way being a complete refresh. When a complete refresh occurs the materialized view's defining query is executed and the entire result set replaces the data currently residing in the materialized view.

The REFRESH COMPLETE clause tells Oracle to perform complete refreshes by default when a materialized view is refreshed.

create materialized view mv
REFRESH COMPLETE
as
select * from t;

Let's see a complete refresh in action now. We will use the DBMS_MVIEW.REFRESH procedure to initiate it. The "list" parameter accepts a list of materialized views to refresh (in our case we only have one) and the "method" parameter accepts a "C", for Complete refresh.

select key, val, rowid from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWgHAAEAAAAIEAAA
         2 b     AAAWgHAAEAAAAIEAAB
         3 c     AAAWgHAAEAAAAIEAAC
         4       AAAWgHAAEAAAAIEAAD

execute DBMS_MVIEW.REFRESH( LIST => 'MV', METHOD => 'C' );

select key, val, rowid from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWgHAAEAAAAIEAAE
         2 b     AAAWgHAAEAAAAIEAAF
         3 c     AAAWgHAAEAAAAIEAAG
         4       AAAWgHAAEAAAAIEAAH

Note how the rowids in the second query differ from those of the first, even though the data in table T was unchanged throughout. This is because complete refreshes create a whole new set of data, even when the new result set is identical to the old one.


If a materialized view contains many rows and the base table's rows change infrequently, refreshing the materialized view completely can be an expensive operation. In such cases it would be better to process only the changed rows. We will explore this type of refresh next.

Cleanup

drop materialized view mv ; 

Materialized View Logs

As mentioned earlier, complete refreshes of materialized views can be expensive operations. Fortunately there is a way to refresh only the changed rows in a materialized view's base table. This is called fast refreshing. Before a materialized view can perform a fast refresh however it needs a mechanism to capture any changes made to its base table. This mechanism is called a Materialized View Log. We can create a materialized view log on our test table, T, like this.

describe T

 Name                           Null?    Type
 ------------------------------ -------- ----------------
 KEY                            NOT NULL NUMBER
 VAL                                     VARCHAR2(5)

create materialized view log on t ; 

Note how the materialized view log is not given a name. This is because a table can only ever have one materialized view log related to it at a time, so a name is not required. To see what a materialized view log looks like we can examine the table used to implement it. In practice developers never actually need to reference this table directly, but examining it here helps illustrate materialized view log behaviour.

describe MLOG$_T

 Name                           Null?    Type
 ------------------------------ -------- ----------------------
 KEY                                     NUMBER
 SNAPTIME$$                              DATE
 DMLTYPE$$                               VARCHAR2(1)
 OLD_NEW$$                               VARCHAR2(1)
 CHANGE_VECTOR$$                         RAW(255)

 

The MLOG$_T.KEY column mirrors the base table's primary key column T.KEY. The other MLOG$ columns are system generated.

select * from MLOG$_T ;

no rows selected

 


The query above shows that a materialized view log is initially empty upon creation. Rows are automatically added to MLOG$_T when base table T is changed.

UPDATE t set val = upper( val ) where KEY = 1 ;

INSERT into t ( KEY, val ) values ( 5, 'e' );

column dmltype$$ format a10

select key, dmltype$$ from MLOG$_T ;

       KEY DMLTYPE$$
---------- ----------
         1 U
         5 I

If the changes affecting T are rolled back, so are the changes to MLOG$_T.

rollback ;

Rollback complete.

select key, dmltype$$ from MLOG$_T ;

no rows selected

 

WITH PRIMARY KEY

To include the base table's primary key column in a materialized view log the WITH PRIMARY KEY clause can be specified.

drop materialized view log on t ;

create materialized view log on t WITH PRIMARY KEY ;

desc mlog$_t

 Name                           Null?    Type
 ------------------------------ -------- ---------------------
 KEY                                     NUMBER
 SNAPTIME$$                              DATE
 DMLTYPE$$                              VARCHAR2(1)
 OLD_NEW$$                               VARCHAR2(1)
 CHANGE_VECTOR$$                         RAW(255)

 

Note how MLOG$_T contains T's primary key column, T.KEY. This materialized view log is equivalent to the one created earlier in this topic, which did not have a WITH clause, because WITH PRIMARY KEY is the default option when no WITH clause is specified.

WITH ROWID

To include rowids instead of primary keys WITH ROWID can be specified.


drop materialized view log on t ;

create materialized view log on t WITH ROWID ;

desc mlog$_t

 Name                           Null?    Type
 ------------------------------ -------- ---------------------
 M_ROW$$                                 VARCHAR2(255)
 SNAPTIME$$                              DATE
 DMLTYPE$$                               VARCHAR2(1)
 OLD_NEW$$                               VARCHAR2(1)
 CHANGE_VECTOR$$                         RAW(255)

 

Note how the KEY column was replaced by the M_ROW$$ column, which contains rowids from table T. A materialized view log can also be created with both a rowid and a primary key column.

drop materialized view log on t ;

create materialized view log on t WITH ROWID, PRIMARY KEY ;

desc mlog$_t

 Name                           Null?    Type
 ------------------------------ -------- ---------------------
 KEY                                     NUMBER
 M_ROW$$                                 VARCHAR2(255)
 SNAPTIME$$                              DATE
 DMLTYPE$$                               VARCHAR2(1)
 OLD_NEW$$                               VARCHAR2(1)
 CHANGE_VECTOR$$                         RAW(255)

 

In this case both KEY and M_ROW$$ appear in the log table.

WITH SEQUENCE

A special SEQUENCE column can be included in the materialized view log to help Oracle apply updates to materialized view logs in the correct order when a mix of Data Manipulation Language (DML) commands, e.g. insert, update and delete, is performed on multiple base tables in a single transaction.

drop materialized view log on t ;

create materialized view log on t  WITH SEQUENCE ;
create materialized view log on t2 WITH SEQUENCE ;

INSERT into T  values ( 5, 'e' );
INSERT into T2 values ( 60, 3, 300 );

UPDATE T  set val = upper(val) where key = 5 ;
UPDATE T2 set amt = 333        where key = 60 ;

commit;

select SEQUENCE$$, key, dmltype$$ from mlog$_T ;

SEQUENCE$$        KEY DMLTYPE$$
---------- ---------- ----------
     60081          5 I
     60083          5 U

select SEQUENCE$$, key, dmltype$$ from mlog$_T2 ;

SEQUENCE$$        KEY DMLTYPE$$
---------- ---------- ----------
     60082         60 I
     60084         60 U

Since mixed DML is a common occurrence SEQUENCE will be specified in most materialized view logs. In fact, Oracle recommends it.

"Oracle recommends that the keyword SEQUENCE be included in your materialized view log statement unless you are sure that you will never perform a mixed DML operation (a combination of INSERT, UPDATE, or DELETE operations on multiple tables)." -- from Creating Materialized Views: Materialized View Logs"

WITH Column List

The WITH clause can also contain a list of specific base table columns. In the next snippet we include the VAL column.

drop materialized view log on t ;

create materialized view log on t WITH ( VAL );

desc mlog$_t

 Name                           Null?    Type
 ------------------------------ -------- ---------------------
 KEY                                     NUMBER
 VAL                                     VARCHAR2(5)
 SNAPTIME$$                              DATE
 DMLTYPE$$                               VARCHAR2(1)
 OLD_NEW$$                               VARCHAR2(1)
 CHANGE_VECTOR$$                         RAW(255)

select * from t ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4
         5 E

UPDATE t set val = 'f' where key = 5 ;

column old_new$$ format a10

select key, val, old_new$$ from mlog$_t ;

       KEY VAL   OLD_NEW$$
---------- ----- ----------
         5 E     O

INCLUDING NEW VALUES Clause

In the last snippet we see that the VAL column contains values as they existed before the update operation, aka the "old" value. There is no need to store the new value for an update because it can be derived by applying the change vector (a RAW value stored in CHANGE_VECTOR$$, which Oracle uses internally during refreshes) to the old value. In some situations, which we will identify in later topics, it helps to have both the old value and the new value explicitly saved in the materialized view log. We can do that using the INCLUDING NEW VALUES clause, like this.

drop materialized view log on T ;

create materialized view log on t
with sequence ( VAL )
INCLUDING NEW VALUES;

update t set val = 'g' where key = 5 ;

column old_new$$ format a9

select sequence$$, key, val, old_new$$
from mlog$_t
order by sequence$$ ;

SEQUENCE$$        KEY VAL   OLD_NEW$$
---------- ---------- ----- ---------
     60085          5 f     O
     60086          5 g     N

Note how both the old and the new values are stored in the same column, VAL. The OLD_NEW$$ column identifies the value as either an old or a new value.

Gotcha - Commas

The syntax diagrams for the CREATE MATERIALIZED VIEW LOG command indicate a comma is required between each component of the WITH clause. However this does not appear to be the case when the component is a column list, e.g. "( VAL )".

drop materialized view log on t ;


create materialized view log on t with sequence, ( VAL ), primary key ;

create materialized view log on t with sequence, ( VAL ), primary key
                                                *
ERROR at line 1:
ORA-00922: missing or invalid option

Omitting the comma before the column list works better.

create materialized view log on t with sequence ( VAL ), primary key;

Materialized view log created.

 

Gotcha - DBMS_REDEFINITION

The DBMS_REDEFINITION package has certain restrictions related to materialized view logs.

In Oracle 10g these restrictions are:

- Tables with materialized view logs defined on them cannot be redefined online.
- For materialized view logs and queue tables, online redefinition is restricted to changes in physical properties. No horizontal or vertical subsetting is permitted, nor are any column transformations. The only valid value for the column mapping string is NULL.

-- from Oracle® Database Administrator's Guide 10g Release 2 (10.2) - Restrictions for Online Redefinition of Tables

In Oracle 11g they are:

- After redefining a table that has a materialized view log, the subsequent refresh of any dependent materialized view must be a complete refresh.
- For materialized view logs and queue tables, online redefinition is restricted to changes in physical properties. No horizontal or vertical subsetting is permitted, nor are any column transformations. The only valid value for the column mapping string is NULL.

-- from Oracle® Database Administrator's Guide 11g Release 1 (11.1) - Restrictions for Online Redefinition of Tables

Cleanup

delete t2 ;
delete t ;
insert into t  select * from t_backup ;
insert into t2 select * from t2_backup ;
commit;

drop materialized view log on t ;
drop materialized view log on t2 ;

REFRESH FAST


Now that we know how materialized view logs track changes to base tables we can use them to perform fast materialized view refreshes, i.e. refreshes where only the individual materialized view rows affected by base table changes are updated. This is also called "incremental" refreshing.
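Before looking at the SQL, the bookkeeping idea can be sketched in a few lines of Python (a conceptual illustration added here, not how Oracle actually implements refreshes): record each changed key in a log, and on refresh re-apply only those keys to the stored copy instead of rebuilding it.

```python
# Conceptual sketch of fast (incremental) refresh.
base = {1: "a", 2: "b", 3: "c"}   # base table T
mv = dict(base)                   # materialized view's stored copy
mlog = []                         # materialized view log: changed keys

def change_base(key, val=None):
    # DML on the base table also records the key in the log.
    if val is None:
        del base[key]             # delete
    else:
        base[key] = val           # insert or update
    mlog.append(key)

def fast_refresh():
    # Touch only the logged keys, then purge the log.
    for key in mlog:
        if key in base:
            mv[key] = base[key]
        else:
            mv.pop(key, None)
    mlog.clear()

change_base(3, "XX")
fast_refresh()
print(mv)   # row 3 updated; rows 1 and 2 were never touched
```

A complete refresh, by contrast, would rebuild `mv` from scratch on every call regardless of what changed.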

Earlier in this tutorial we saw how the rowids for each row in a materialized view changed after a complete refresh. Now let's see what happens to a materialized view's rowids after a fast refresh. First we use the REFRESH FAST clause to specify that the default refresh method should be fast.

create materialized view log on t with sequence ;

create materialized view mv
REFRESH FAST
as
select * from t;

select key, val, rowid
from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWm+AAEAAAAaMAAA
         2 b     AAAWm+AAEAAAAaMAAB
         3 c     AAAWm+AAEAAAAaMAAC
         4       AAAWm+AAEAAAAaMAAD

Now we refresh the materialized view. The "F" value for the "method" parameter ensures the refresh will be a Fast one.

execute dbms_mview.refresh( list => 'MV', method => 'F' );

select key, val, rowid
from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWm+AAEAAAAaMAAA
         2 b     AAAWm+AAEAAAAaMAAB
         3 c     AAAWm+AAEAAAAaMAAC
         4       AAAWm+AAEAAAAaMAAD

The rowids did not change. Thus, with a fast refresh the materialized view data is not touched when no changes have been made to the base table, unlike a complete refresh where each row would have been created anew.

Now let's update a row in the base table.

update t
set val = 'XX'
where key = 3 ;

commit;


execute dbms_mview.refresh( list => 'MV', method => 'F' );

select key, val, rowid
from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWm+AAEAAAAaMAAA
         2 b     AAAWm+AAEAAAAaMAAB
         3 XX    AAAWm+AAEAAAAaMAAC
         4       AAAWm+AAEAAAAaMAAD

Still no change in the rowids. In row 3 we can see that VAL changed from "c" to "XX" though, telling us that row 3 was updated during the refresh.

Defaults

The REFRESH FAST clause of the CREATE MATERIALIZED VIEW command tells Oracle what type of refresh to perform when no refresh option is specified. A materialized view created with REFRESH FAST can still be refreshed completely if required though. In the following example note how, even though MV was created above with the REFRESH FAST clause, all its rowids change after the refresh. This indicates that a complete refresh was performed.

execute dbms_mview.refresh( list => 'MV', method => 'C' );

select key, val, rowid
from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWm+AAEAAAAaMAAE
         2 b     AAAWm+AAEAAAAaMAAF
         3 XX    AAAWm+AAEAAAAaMAAG
         4       AAAWm+AAEAAAAaMAAH

Similarly a materialized view created with REFRESH COMPLETE can be fast refreshed (assuming the materialized view is capable of being fast refreshed, we'll learn more about this later).

drop materialized view mv ;

create materialized view mv
REFRESH COMPLETE
as
select * from t;

select key, val, rowid
from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWnBAAEAAAAaMAAA
         2 b     AAAWnBAAEAAAAaMAAB
         3 XX    AAAWnBAAEAAAAaMAAC
         4       AAAWnBAAEAAAAaMAAD

execute dbms_mview.refresh( list => 'MV', method => 'F' );

select key, val, rowid
from mv ;

       KEY VAL   ROWID
---------- ----- ------------------
         1 a     AAAWnBAAEAAAAaMAAA
         2 b     AAAWnBAAEAAAAaMAAB
         3 XX    AAAWnBAAEAAAAaMAAC
         4       AAAWnBAAEAAAAaMAAD

Note how none of the rowids in MV changed, indicating a fast refresh.

Cleanup

drop materialized view mv ;

drop materialized view log on t ;

update t set val = 'c' where key = 3 ;
commit ;

Purging Materialized View Logs

Oracle automatically purges rows in the materialized view log when they are no longer needed. In the example below note how the log table is empty after the refresh.

create materialized view log on t ;

create materialized view mv
refresh fast
as
select * from t;

select count(*) from mlog$_t ;

  COUNT(*)
----------
         0

insert into t values ( 5, 'e' ) ;
commit;

select count(*) from mlog$_t ;

  COUNT(*)
----------
         1

execute dbms_mview.refresh( list => 'MV', method => 'F' );

select count(*) from mlog$_t ;


  COUNT(*)
----------
         0

DBMS_MVIEW.PURGE_LOG

If a materialized view log needs to be purged manually for some reason, a procedure called DBMS_MVIEW.PURGE_LOG can be used.

select count(*) from mlog$_t ;

  COUNT(*)
----------
         0

update t set val = 'X' where key = 5 ;
commit;

select count(*) from mlog$_t ;

  COUNT(*)
----------
         1

execute DBMS_MVIEW.PURGE_LOG( master => 'T', num => 9999, flag => 'delete' ) ;

select count(*) from mlog$_t ;

  COUNT(*)
----------
         0

The "num" and "flag" parameters can be used to partially purge the log. See the PURGE_LOG manual page for further details.

Once a materialized view log has been purged any materialized views dependent on the deleted rows cannot be fast refreshed. Attempting a fast refresh will raise an error.

execute dbms_mview.refresh( list => 'MV', method => 'F' );
BEGIN dbms_mview.refresh( list => 'MV', method => 'F' ); END;

*
ERROR at line 1:
ORA-12034: materialized view log on "SCOTT"."T" younger than last refresh
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2537
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2743
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2712
ORA-06512: at line 1

Such materialized views will need to be refreshed completely.

select * from mv ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4
         5 e

execute dbms_mview.refresh( list => 'MV', method => 'C' );

select * from mv ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4
         5 X

Cleanup

delete from t where key = 5 ;
commit ;

drop materialized view mv ;

drop materialized view log on t ; 

REFRESH FAST Categories

There are three ways to categorize a materialized view's ability to be fast refreshed.

1. It can never be fast refreshed.
2. It can always be fast refreshed.
3. It can be fast refreshed after certain kinds of changes to the base table but not others.

For the first case Oracle will raise an error if you try to create such a materialized view with its refresh method defaulted to REFRESH FAST. In the example below table T does not have a materialized view log on it. Materialized views based on T cannot therefore be fast refreshed. If we attempt to create such a materialized view we get an error.

create materialized view MV REFRESH FAST as select * from t2;
as select * from t2
*
ERROR at line 3:
ORA-23413: table "SCOTT"."T2" does not have a materialized view log

 


For the second case, materialized views are created without error and will always be fast refreshed unless a complete refresh is explicitly requested. The third case is a little trickier. The next example demonstrates why.

select * from t2 ;

       KEY      T_KEY        AMT
---------- ---------- ----------
        10          1        100
        20          1        300
        30          1        200
        40          2        250
        50          2        150

create materialized view log on t2
with primary key, rowid, sequence ( t_key, amt )
including new values;

create materialized view mv REFRESH FAST as select t_key, max( amt ) amt_max from t2 group by t_key;

select rowid, t_key, amt_max from mv ;

ROWID                   T_KEY    AMT_MAX
------------------ ---------- ----------
AAAhMzAAEAAAEG8AAA          1        300
AAAhMzAAEAAAEG8AAB          2        250

So far everything works as expected. We created a materialized view log and created a materialized view with fast refresh as its default refresh method. Let's try inserting a row into the base table.

insert into t2 values ( 5, 2, 500 );
commit;

execute dbms_mview.refresh( list => 'MV', method => 'F' );

select rowid, t_key, amt_max from mv ;

ROWID                   T_KEY    AMT_MAX
------------------ ---------- ----------
AAAhMzAAEAAAEG8AAA          1        300
AAAhMzAAEAAAEG8AAB          2        500

Again, it worked as expected. The view was fast refreshed (the rowids did not change after the DBMS_MVIEW.REFRESH command) and the materialized view correctly shows 500 as the maximum value for rows with T_KEY = 2. Now let's try deleting a row from the base table.


delete from t2 where key = 5 ;
commit;

execute dbms_mview.refresh( list => 'MV', method => 'F' );
BEGIN dbms_mview.refresh( list => 'MV', method => 'F' ); END;

*
ERROR at line 1:
ORA-32314: REFRESH FAST of "SCOTT"."MV" unsupported after deletes/updates
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2255
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2461
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2430
ORA-06512: at line 1

This time we received an error when we attempted a fast refresh. The reason is that this type of materialized view is an "insert-only" materialized view, i.e. it is only fast refreshable for inserts and direct loads, not updates or deletes. (We will see why it is an insert-only view in the next topic, DBMS_MVIEW.EXPLAIN_MVIEW.) To synchronize an insert-only materialized view after a delete we need to do a complete refresh.

execute dbms_mview.refresh( list => 'MV', method => 'C' );

select rowid, t_key, amt_max from mv ;

ROWID                   T_KEY    AMT_MAX
------------------ ---------- ----------
AAAhMzAAEAAAEG8AAC          1        300
AAAhMzAAEAAAEG8AAD          2        250

Restrictions on Fast Refresh

So how do we know whether a materialized view can be fast refreshed each time, sometimes, or never? One way would be to learn all the documented restrictions for fast refreshable materialized views. Here are some of them.

In general materialized views cannot be fast refreshed if the base tables do not have materialized view logs or the defining query:

- contains an analytic function
- contains non-repeating expressions like SYSDATE or ROWNUM
- contains RAW or LONG RAW data types
- contains a subquery in the SELECT clause
- contains a MODEL clause
- contains a HAVING clause
- contains nested queries with ANY, ALL, or NOT EXISTS
- contains a CONNECT BY clause
- references remote tables in different databases
- references remote tables in a single database and defaults to the ON COMMIT refresh mode
- references other materialized views which are not join or aggregate materialized views.
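As one concrete illustration of these restrictions, consider a defining query that contains SYSDATE (this sketch is not from the original tutorial; the view name and column alias are made up):

```sql
-- SYSDATE is a non-repeating expression, so this defining query is not
-- fast refreshable; the CREATE would fail with an ORA- error even if a
-- materialized view log exists on T
create materialized view mv_bad REFRESH FAST
as select key, val, sysdate as queried_on from t;
```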


There are even more restrictions for materialized views containing joins, aggregates, UNION ALL, subqueries, etc. They are documented in various sections of a few different manuals and are too numerous and complex to repeat here. The following links can help you find them if required though.

CREATE MATERIALIZED VIEW - FAST Clause
General Restrictions on Fast Refresh
Restrictions on Fast Refresh on Materialized Views with Joins Only
Restrictions on Fast Refresh on Materialized Views with Aggregates
Restrictions on Fast Refresh on Materialized Views with UNION ALL
Restrictions for Materialized Views with Subqueries
Restrictions for Materialized Views with Unions Containing Subqueries
Restrictions for Using Multitier Materialized Views
Restrictions for Materialized Views with Collection Columns

Fortunately there is a second, simpler alternative for determining whether a materialized view is fast refreshable or not. It uses the DBMS_MVIEW.EXPLAIN_MVIEW utility which we will explore next.

Cleanup

drop materialized view mv ;

drop materialized view log on t2 ; 

DBMS_MVIEW.EXPLAIN_MVIEW

As we saw in the preceding topic, predicting whether or not a materialized view is fast refreshable can be complicated. The DBMS_MVIEW.EXPLAIN_MVIEW utility can simplify this task however. Full details on how the utility works are available at the preceding link. The material below will help you use the utility effectively.

MV_CAPABILITIES_TABLE

There are two ways to get the output from DBMS_MVIEW.EXPLAIN_MVIEW, via a table or via a varray. To use the table method the current schema must contain a table called MV_CAPABILITIES_TABLE. The full, documented CREATE TABLE command for MV_CAPABILITIES_TABLE can be found on UNIX systems at $ORACLE_HOME/rdbms/admin/utlxmv.sql. It is also available in Oracle's documentation at Oracle Database Data Warehousing Guide - Basic Materialized Views - Using MV_CAPABILITIES_TABLE (see Gotcha for a related bug). Here is an abridged version.

create table MV_CAPABILITIES_TABLE
( statement_id    varchar(30)
, mvowner         varchar(30)
, mvname          varchar(30)
, capability_name varchar(30)
, possible        character(1)
, related_text    varchar(2000)
, related_num     number
, msgno           integer
, msgtxt          varchar(2000)
, seq             number
) ;

VARRAY Output

Using DBMS_MVIEW.EXPLAIN_MVIEW with the table output method typically involves

1. deleting old rows from MV_CAPABILITIES_TABLE
2. running DBMS_MVIEW.EXPLAIN_MVIEW
3. selecting new rows from MV_CAPABILITIES_TABLE.
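A sketch of those three steps (the column list in the final SELECT is illustrative; any of the MV_CAPABILITIES_TABLE columns may be selected):

```sql
-- 1. clear results from earlier runs
delete from mv_capabilities_table ;

-- 2. analyze a defining query (or a materialized view name)
execute dbms_mview.explain_mview( 'select * from t' ) ;
commit ;

-- 3. inspect the findings
select capability_name, possible, msgtxt
from   mv_capabilities_table ;
```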

To save time in this tutorial we will use DBMS_MVIEW.EXPLAIN_MVIEW's varray output option instead and supplement it with a custom function called MY_MV_CAPABILITIES.

create or replace function my_mv_capabilities
( p_mv                       in varchar2
, p_capability_name_filter   in varchar2 default '%'
, p_include_pct_capabilities in varchar2 default 'N'
, p_linesize                 in number   default 80
) return clob
as
  --------------------------------------------------------------------------------
  -- From http://www.sqlsnippets.com/en/topic-12884.html
  --
  -- Parameters:
  --
  --   p_mv
  --     o this value is passed to DBMS_MVIEW.EXPLAIN_MVIEW's "mv" parameter
  --     o it can contain either a query, CREATE MATERIALIZED VIEW command text,
  --       or a materialized view name
  --
  --   p_capability_name_filter
  --     o use either REFRESH, REWRITE, PCT, or the default
  --
  --   p_include_pct_capabilities
  --     Y - capabilities like REFRESH_FAST_PCT are included in the report
  --     N - capabilities like REFRESH_FAST_PCT are not included in the report
  --
  --   p_linesize
  --     o the maximum size allowed for any line in the report output
  --     o data that is longer than this value will be word wrapped
  --
  -- Typical Usage:
  --
  --   set long 5000
  --   select my_mv_capabilities( 'MV_NAME' ) as mv_report from dual ;
  --
  --   o the value 5000 is arbitrary; any value big enough to contain the
  --     report output will do
  --
  --------------------------------------------------------------------------------

pragma autonomous_transaction ;

v_nl constant char(1) := unistr( '\000A' ); -- new line

v_previous_possible char(1) := 'X' ;

v_capabilities sys.ExplainMVArrayType ;

v_output clob ;

begin

dbms_mview.explain_mview( mv => p_mv, msg_array => v_capabilities ) ;

for v_capability in
( select capability_name
       , possible
       , related_text
       , msgtxt
  from   table( v_capabilities )
  where  capability_name like '%' || upper( p_capability_name_filter ) || '%'
  and    not (     capability_name like '%PCT%'
               and upper( p_include_pct_capabilities ) = 'N' )
  order by mvowner, mvname, possible desc, seq
) loop

------------------------------------------------------------
-- print section heading
------------------------------------------------------------

if v_capability.possible <> v_previous_possible then

v_output := v_output || v_nl ||
  case v_capability.possible
    when 'T' then 'Capable of: '
    when 'Y' then 'Capable of: '
    when 'F' then 'Not Capable of: '
    when 'N' then 'Not Capable of: '
    else v_capability.possible || ':'
  end || v_nl ;

end if;

v_previous_possible := v_capability.possible ;

------------------------------------------------------------
-- print section body
------------------------------------------------------------
declare

v_indented_line_size varchar2(3) := to_char( p_linesize - 5 );

begin

-- print capability name indented 2 spaces

v_output := v_output || v_nl || '  ' || v_capability.capability_name || v_nl ;

-- print related text indented 4 spaces and word wrapped

if v_capability.related_text is not null then

v_output := v_output ||
  regexp_replace
  ( v_capability.related_text || ' '
  , '(.{1,' || v_indented_line_size || '} |.{1,' || v_indented_line_size || '})'
  , '    \1' || v_nl
  ) ;

end if;

-- print message text indented 4 spaces and word wrapped

if v_capability.msgtxt is not null then

v_output := v_output ||
  regexp_replace
  ( v_capability.msgtxt || ' '
  , '(.{1,' || v_indented_line_size || '} |.{1,' || v_indented_line_size || '})'
  , '    \1' || v_nl
  ) ;

end if;

end;

end loop;

commit ;

return( v_output );

end;
/

show errors

No errors.

This completes our preparations. Now let's see DBMS_MVIEW.EXPLAIN_MVIEW in action.

DBMS_MVIEW.EXPLAIN_MVIEW With a Query

DBMS_MVIEW.EXPLAIN_MVIEW can analyze three different types of materialized view code:

1. a defining query
2. a CREATE MATERIALIZED VIEW command
3. an existing materialized view.

Here is an example that explains a simple query which could appear as the defining query in a CREATE MATERIALIZED VIEW command.

set long 5000

select my_mv_capabilities( 'SELECT * FROM T', 'REFRESH' ) as mv_report from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

Not Capable of:

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT
  SCOTT.T
  the detail table does not have a materialized view log

REFRESH_FAST_AFTER_ONETAB_DML
  see the reason why REFRESH_FAST_AFTER_INSERT is disabled

REFRESH_FAST_AFTER_ANY_DML
  see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled

(Descriptions of each capability name are available at Table 8-7 CAPABILITY_NAME Column Details. A list of messages and related text is available at Table 8-8 MV_CAPABILITIES_TABLE Column Details.)

The EXPLAIN_MVIEW output above shows that fast refresh is not possible in this case because T has no materialized view log.

Note that DBMS_MVIEW.EXPLAIN_MVIEW can report on a materialized view's refresh, rewrite, and partition change tracking (PCT) capabilities. For now we will only examine refresh capabilities. Rewrite capabilities will be covered in Query Rewrite Restrictions and Capabilities.

DBMS_MVIEW.EXPLAIN_MVIEW With CREATE MATERIALIZED VIEW

Now let's create a materialized view log on T and then use EXPLAIN_MVIEW to explain the capabilities of an entire CREATE MATERIALIZED VIEW command.

create materialized view log on t ;

select my_mv_capabilities
       ( 'CREATE MATERIALIZED VIEW MV REFRESH FAST AS SELECT * FROM T'
       , 'REFRESH'
       ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT

REFRESH_FAST_AFTER_ONETAB_DML

REFRESH_FAST_AFTER_ANY_DML 

This time we see that a materialized view using our simple query could be fast refreshable in all cases.

DBMS_MVIEW.EXPLAIN_MVIEW With Existing Materialized View


For our last example we will explain an existing materialized view, the insert-only one we saw in the preceding topic REFRESH FAST Categories.

create materialized view log on t2
with primary key, rowid, sequence ( t_key, amt )
including new values;

create materialized view mv refresh fast as select t_key, max( amt ) amt_max from t2 group by t_key;

select my_mv_capabilities( 'MV', 'REFRESH' ) as mv_report from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT

Not Capable of:

REFRESH_FAST_AFTER_ONETAB_DML
  mv uses the MIN or MAX aggregate functions

REFRESH_FAST_AFTER_ONETAB_DML
  COUNT(*) is not present in the select list

REFRESH_FAST_AFTER_ANY_DML
  see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled

Here we see that fast refresh is available after inserts, but not other types of DML. Note also that the "REFRESH_FAST" capability will appear whenever at least one of the other REFRESH_FAST_% capabilities is available. It does not mean the materialized view is fast refreshable in all cases.

Gotcha

Both the $ORACLE_HOME/rdbms/admin/utlxmv.sql file and the CREATE TABLE command at Oracle Database Data Warehousing Guide - Basic Materialized Views - Using MV_CAPABILITIES_TABLE state the values in MV_CAPABILITIES_TABLE.POSSIBLE will either be "T" or "F".

CREATE TABLE MV_CAPABILITIES_TABLE
...
POSSIBLE  CHARACTER(1),  -- T = capability is possible
                         -- F = capability is not possible
...

In actual use we can see the values are really "Y" and "N".

delete from mv_capabilities_table ;

execute dbms_mview.explain_mview( 'select * from t' );

commit;

column possible format a8

select distinct POSSIBLE from mv_capabilities_table ;

POSSIBLE
--------
Y
N

The values "T" and "F" are, however, used when DBMS_MVIEW.EXPLAIN_MVIEW output is saved to a varray.

Cleanup

set long 80

drop materialized view mv ;

drop materialized view log on t ;
drop materialized view log on t2 ;

REFRESH FORCE

In REFRESH FAST Categories and DBMS_MVIEW.EXPLAIN_MVIEW we saw an insert-only materialized view which could be fast refreshed after inserts into the base table but needed a complete refresh after other types of DML. With these types of materialized views it is often most convenient to let Oracle decide which refresh method is best. The REFRESH FORCE method does just that. It performs a FAST refresh if possible, otherwise it performs a COMPLETE refresh.

create materialized view log on t2
with primary key, rowid, sequence ( t_key, amt )
including new values;

create materialized view mv REFRESH FORCE as select t_key, max( amt ) amt_max from t2 group by t_key;


select rowid, t_key, amt_max from mv ;

ROWID                   T_KEY    AMT_MAX
------------------ ---------- ----------
AAAWpLAAEAAAAaMAAA          1        300
AAAWpLAAEAAAAaMAAB          2        250

First let's try an insert and a refresh.

insert into t2 values ( 5, 2, 500 );
commit;

execute dbms_mview.refresh( list => 'MV' );

select rowid, t_key, amt_max from mv ;

ROWID                   T_KEY    AMT_MAX
------------------ ---------- ----------
AAAWpLAAEAAAAaMAAA          1        300
AAAWpLAAEAAAAaMAAB          2        500

Since the rowids did not change but the AMT_MAX values did we can tell that a FAST refresh was performed. Now let's try a delete followed by a refresh.

delete from t2 where key = 5 ;
commit;

execute dbms_mview.refresh( list => 'MV' );

select rowid, t_key, amt_max from mv ;

ROWID                   T_KEY    AMT_MAX
------------------ ---------- ----------
AAAWpLAAEAAAAaMAAC          1        300
AAAWpLAAEAAAAaMAAD          2        250

In the REFRESH FAST Categories topic we received an "ORA-32314: REFRESH FAST of "SCOTT"."MV" unsupported after deletes/updates" error at this point. This time with REFRESH FORCE we did not. Instead Oracle performed a COMPLETE refresh (note how the rowids for each row changed).

Cleanup

drop materialized view mv ;

drop materialized view log on t2 ; 

NEVER REFRESH

If for some reason we need to prevent refresh operations of any sort, FAST or COMPLETE, on our materialized views we can use the NEVER REFRESH method.


create materialized view mv NEVER REFRESH as select * from t;

select * from mv ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4

Let's see what happens when we update the base table and then attempt a refresh.

update t set val = upper(val) ;
commit ;

execute dbms_mview.refresh( 'MV' );
BEGIN dbms_mview.refresh( 'MV' ); END;

*
ERROR at line 1:
ORA-23538: cannot explicitly refresh a NEVER REFRESH materialized view ("MV")
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2537
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2743
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2712
ORA-06512: at line 1

Oracle prevented the refresh by raising an error.

I cannot see a practical reason for having a materialized view with NEVER REFRESH set at all times. NEVER REFRESH can come in handy, though, when refresh operations on a materialized view need to be prevented temporarily during maintenance or debugging operations. In this case the materialized view's refresh mode can be changed to NEVER REFRESH using the ALTER MATERIALIZED VIEW command.
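A sketch of that temporary switch might look like this (the ALTER commands are illustrative; check the ALTER MATERIALIZED VIEW manual page for the exact clauses your version supports):

```sql
-- temporarily block all refreshes during maintenance
alter materialized view mv NEVER REFRESH ;

-- ... perform maintenance on the base table ...

-- restore a normal refresh method afterwards
alter materialized view mv REFRESH FORCE ON DEMAND ;
```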

Cleanup

drop materialized view mv ;

update t set val = lower(val) ;
commit ;

ON DEMAND

Up to this point in the tutorial we have always refreshed our materialized views manually with the DBMS_MVIEW.REFRESH command. This is known as ON DEMAND refreshing and it is the default refresh mode when none is specified in the CREATE MATERIALIZED VIEW command. In other words this


create materialized view mv as select * from t; 

is equivalent to this.

drop materialized view mv ;

create materialized view mv REFRESH ON DEMAND as select * from t; 

To refresh ON DEMAND materialized views we explicitly call one of the following procedures.

DBMS_MVIEW.REFRESH
DBMS_MVIEW.REFRESH_ALL_MVIEWS
DBMS_MVIEW.REFRESH_DEPENDENT

Here is an example that uses DBMS_MVIEW.REFRESH.

insert into t values ( 5, 'e' );
commit;

select * from mv where key = 5 ;

no rows selected

 execute DBMS_MVIEW.REFRESH( 'MV' );

select * from mv where key = 5 ;

       KEY VAL
---------- -----
         5 e
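The other two procedures follow the same idea but also return an OUT count of failures, so they are easiest to call from an anonymous block. As a hedged sketch (the variable name is made up; check the DBMS_MVIEW manual page for the exact signatures in your version):

```sql
declare
  v_failures binary_integer ;
begin
  -- refresh every materialized view that depends on base table T
  dbms_mview.refresh_dependent( number_of_failures => v_failures
                              , list               => 'T' ) ;
end;
/
```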

Cleanup

drop materialized view mv ;

delete from t where key = 5 ;
commit;

ON COMMIT

In some situations it would be convenient to have Oracle refresh a materialized view automatically whenever changes to the base table are committed. This is possible using the ON COMMIT refresh mode. Here is an example.

create materialized view log on t ;

create materialized view mv REFRESH FAST ON COMMIT as select * from t ;

select rowid, key, val from mv ;

ROWID                     KEY VAL
------------------ ---------- -----
AAAXNGAAEAAAAasAAA          1 a
AAAXNGAAEAAAAasAAB          2 b
AAAXNGAAEAAAAasAAC          3 c
AAAXNGAAEAAAAasAAD          4

Let's see what happens to the view in the course of an insert operation.

insert into t values ( 5, 'e' );

select rowid, key, val from mv ;

ROWID                     KEY VAL
------------------ ---------- -----
AAAXNGAAEAAAAasAAA          1 a
AAAXNGAAEAAAAasAAB          2 b
AAAXNGAAEAAAAasAAC          3 c
AAAXNGAAEAAAAasAAD          4

Nothing has happened yet. Let's issue a COMMIT.

commit;

select rowid, key, val from mv ;

ROWID                     KEY VAL
------------------ ---------- -----
AAAXNGAAEAAAAasAAA          1 a
AAAXNGAAEAAAAasAAB          2 b
AAAXNGAAEAAAAasAAC          3 c
AAAXNGAAEAAAAasAAD          4
AAAXNGAAEAAAAatAAA          5 e

Note how the materialized view was automatically fast refreshed after the COMMIT command. No call to DBMS_MVIEW.REFRESH was required.

Restrictions

Materialized views can only refresh ON COMMIT in certain situations.

1. The materialized view cannot contain object types or Oracle-supplied types.
2. The base tables will never have any distributed transactions applied to them.

The first case produces an error during the CREATE MATERIALIZED VIEW command.

-- this materialized view is not fast refreshable
-- because the materialized view contains an Oracle-supplied type

create materialized view mv2 REFRESH FAST ON COMMIT
as select key, val, sys_xmlgen( val ) as val_xml from t;
as select key, val, sys_xmlgen( val ) as val_xml from t
*
ERROR at line 3:
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view

The second case generates an error when a distributed transaction is attempted on the base table. In the following example materialized view MV (created at the top of this page) was created with REFRESH FAST. Attempting a distributed transaction on its base table, T, will therefore raise an error.

insert into t select key+10, val from T@REMOTE ;
commit;
commit
*
ERROR at line 1:
ORA-02050: transaction 5.21.5632 rolled back, some remote DBs may be in-doubt
ORA-02051: another session in same transaction failed

 

(REMOTE is a database link which loops back to the current account.) ON DEMAND materialized views have no such restriction, as the following snippet demonstrates.

alter materialized view mv refresh ON DEMAND ;

insert into t select key+10, val from T@REMOTE ;
commit;

select * from t ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4
         5 e
        11 a
        12 b
        13 c
        14
        15 e

-- cleanup test data in preparation for next section

delete from t where key >= 5 ;
commit ;

Gotcha


The SQL Language Reference manual says this about the ON COMMIT clause.

"Specify ON COMMIT to indicate that a fast refresh is to occur whenever the database commits a transaction that operates on a master table of the materialized view." -- Oracle® Database SQL Language Reference: CREATE MATERIALIZED VIEW

When I first read this I assumed it meant that "REFRESH COMPLETE ON COMMIT" is not allowed. I also assumed that specifying "REFRESH ON COMMIT" is equivalent to specifying "REFRESH FAST ON COMMIT". The following examples prove neither is correct however.

create materialized view mv2 REFRESH COMPLETE ON COMMIT as select key, val from t; 

As we can see the CREATE MATERIALIZED VIEW command succeeded even though COMPLETE, not FAST, was specified with ON COMMIT. The next example examines the behavior of "REFRESH ON COMMIT" without a specified refresh method.

drop materialized view log on t ;

-- fast refreshable materialized views can no longer be created on T
-- because it has no materialized view log

drop materialized view mv2 ;

create materialized view mv2 REFRESH ON COMMIT as select key, val from t;

select rowid, key, val from mv2 ;

ROWID                     KEY VAL
------------------ ---------- -----
AAAXNMAAEAAAAakAAA          1 a
AAAXNMAAEAAAAakAAB          2 b
AAAXNMAAEAAAAakAAC          3 c
AAAXNMAAEAAAAakAAD          4

insert into t values ( 5, 'e' );
commit ;

select rowid, key, val from mv2 ;

ROWID                     KEY VAL
------------------ ---------- -----
AAAXNMAAEAAAAakAAE          1 a
AAAXNMAAEAAAAakAAF          2 b
AAAXNMAAEAAAAakAAG          3 c
AAAXNMAAEAAAAakAAH          4
AAAXNMAAEAAAAakAAI          5 e

The fact that all the rowids in MV2 changed after the INSERT transaction committed confirms that a complete refresh took place during the commit. "REFRESH ON COMMIT" is not therefore equivalent to "REFRESH FAST ON COMMIT". In fact, when no REFRESH method is specified the default behaviour is "REFRESH FORCE" regardless of whether ON COMMIT is used or not.

Given these observations I can only conclude the documentation is either in error or misleading when it says "specify ON COMMIT to indicate that a fast refresh is to occur".

Cleanup

drop materialized view mv ;

drop materialized view mv2 ;

delete from t where key >= 5 ;
commit ;

Constraints

System Generated Constraints

When a materialized view is created Oracle may add system generated constraints to its underlying table (i.e. the table containing the results of the query, not to be confused with a base table). In the following example note how Oracle automatically adds a primary key constraint to the table called "MV", which is part of the materialized view also called "MV".

create materialized view mv as select key, val from t;

column constraint_name format a20
column constraint_type format a15
column index_name format a15

select constraint_name, constraint_type, index_name
from user_constraints
where TABLE_NAME = 'MV' ;

CONSTRAINT_NAME      CONSTRAINT_TYPE INDEX_NAME
-------------------- --------------- ---------------
SYS_C0019948         P               SYS_C0019948

In the next example Oracle automatically adds a check constraint.

drop materialized view mv ;

describe t2

 Name                 Null?    Type
 -------------------- -------- ------
 KEY                  NOT NULL NUMBER
 T_KEY                NOT NULL NUMBER
 AMT                  NOT NULL NUMBER


create materialized view log on t2
with primary key, rowid, sequence ( t_key, amt )
including new values;

create materialized view mv refresh fast on commit as select t_key, count(*) row_count from t2 group by t_key;

column search_condition format a30

select constraint_name, constraint_type, search_condition
from user_constraints
where table_name = 'MV' ;

CONSTRAINT_NAME      CONSTRAINT_TYPE SEARCH_CONDITION
-------------------- --------------- ------------------------------
SYS_C0019949         C               "T_KEY" IS NOT NULL

Adding Your Own Constraints

If necessary we can create our own constraints on materialized view tables in addition to the ones Oracle may add. When the materialized view is in ON COMMIT mode these constraints effectively constrain the materialized view's base tables. Let's see this in action by creating a check constraint on MV.

select * from t2 ;

       KEY      T_KEY        AMT
---------- ---------- ----------
        10          1        100
        20          1        300
        30          1        200
        40          2        250
        50          2        150

alter table mv -- note we used "alter table" here
add CONSTRAINT MY_CONSTRAINT
CHECK ( ROW_COUNT <= 3 )
DEFERRABLE;

select constraint_name, constraint_type, search_condition
from user_constraints
where table_name = 'MV' ;

CONSTRAINT_NAME      CONSTRAINT_TYPE SEARCH_CONDITION
-------------------- --------------- ------------------------------
SYS_C0019949         C               "T_KEY" IS NOT NULL
MY_CONSTRAINT        C               ROW_COUNT <= 3

Now any attempt to create more than 3 rows per group in table T2 will generate an error at commit time.


insert into T2 values ( 5, 1, 500 );
commit;
commit
*
ERROR at line 1:
ORA-12008: error in materialized view refresh path
ORA-02290: check constraint (SCOTT.MY_CONSTRAINT) violated

 

Implementing multirow validation rules such as this one properly is not possible using check constraints on regular tables. Implementing them using triggers can be difficult if not impossible. With materialized views they are declared using a few lines of code and are virtually bulletproof when applied correctly. We will learn more about this powerful multirow validation approach in a future SQL Snippets tutorial so stay tuned! In the meantime Ask Tom "Declarative Integrity" has some good information on the subject.

Gotcha

When we created MY_CONSTRAINT above we used an ALTER TABLE command. Curiously enough an ALTER MATERIALIZED VIEW command would have worked too.

ALTER MATERIALIZED VIEW mv
add constraint my_second_constraint
check ( row_count < 4 )
deferrable;

select constraint_name, constraint_type, search_condition
from user_constraints
where table_name = 'MV' ;

CONSTRAINT_NAME      CONSTRAINT_TYPE SEARCH_CONDITION
-------------------- --------------- ------------------------------
SYS_C0019949         C               "T_KEY" IS NOT NULL
MY_CONSTRAINT        C               ROW_COUNT <= 3
MY_SECOND_CONSTRAINT C               row_count < 4

The Oracle manual page for ALTER MATERIALIZED VIEW however does not indicate that constraints can be added this way. Until the documentation says this is legal it is best to use ALTER TABLE.

Cleanup

drop materialized view mv ;

drop materialized view log on t2 ; 

Indexes

When a materialized view is created Oracle may add system generated indexes to its underlying table (i.e. the table containing the results of the query, not to be confused with a base table). In the following example note how Oracle automatically adds an index to implement the system generated primary key we saw in the preceding topic, Constraints.

create materialized view mv as select key, val from t;

column index_name format a15
column column_name format a15

select index_name
     , i.uniqueness
     , ic.column_name
from user_indexes i
     inner join
     user_ind_columns ic
     using ( index_name )
where i.table_name = 'MV';

INDEX_NAME      UNIQUENES COLUMN_NAME
--------------- --------- ---------------
SYS_C0019959    UNIQUE    KEY

In the next example Oracle automatically generates a function-based index.

drop materialized view mv ;

create materialized view log on t2
with primary key, rowid, sequence ( t_key, amt )
including new values;

create materialized view mv refresh fast on commit as select t_key, COUNT(*) ROW_COUNT from t2 group by t_key;

column column_expression format a35

select index_name
     , i.uniqueness
     , ic.column_name
     , ie.column_expression
from user_indexes i
     inner join
     user_ind_columns ic
     left outer join
     user_ind_expressions ie
     using ( index_name )
     using ( index_name )
where ic.table_name = 'MV';

INDEX_NAME      UNIQUENES COLUMN_NAME     COLUMN_EXPRESSION
--------------- --------- --------------- -----------------------------------
I_SNAP$_MV      UNIQUE    SYS_NC00003$    SYS_OP_MAP_NONNULL("T_KEY")

(Note that SYS_OP_MAP_NONNULL is an undocumented Oracle function. Do not attempt to use it in your own code. See Nulls and Equality: SQL Only for additional info.)

Adding Your Own Indexes

We can add our own indexes to MV just as we would a regular table. In the following example we will add an index on the T_KEY column.

create index MY_INDEX on mv ( T_KEY ) ;

select index_name
     , i.uniqueness
     , ic.column_name
from user_indexes i
     inner join
     user_ind_columns ic
     using ( index_name )
where i.table_name = 'MV';

INDEX_NAME      UNIQUENES COLUMN_NAME
--------------- --------- ---------------
I_SNAP$_MV      UNIQUE    SYS_NC00003$
MY_INDEX        NONUNIQUE T_KEY

To confirm that Oracle uses our index in queries let's turn SQL*Plus's Autotrace feature on and execute a query.

set autotrace on explain
set linesize 95

select *
from mv
where t_key = 2 ;

     T_KEY  ROW_COUNT
---------- ----------
         2          2

Execution Plan
----------------------------------------------------------
Plan hash value: 2793437614

--------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |     1 |    26 |     2   (0)| 00:00:01 |
|   1 |  MAT_VIEW ACCESS BY INDEX ROWID| MV       |     1 |    26 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN             | MY_INDEX |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T_KEY"=2)

Note
-----
   - dynamic sampling used for this statement

 

Note how the optimizer chose an INDEX RANGE SCAN from MY_INDEX in step 2.

Cleanup

drop materialized view mv ;

drop materialized view log on t2 ; 

ENABLE QUERY REWRITE

Materialized views can be useful for pre-calculating and storing derived values such as AMT_MAX in the following snippet.

create materialized view log on t2
with primary key, rowid, sequence ( t_key, amt )
including new values;

create materialized view mv refresh fast on commit as select t_key, MAX( AMT ) AMT_MAX from t2 group by t_key;

 

Such materialized views make queries like this

select t_key, amt_max
from mv
order by t_key ;

     T_KEY    AMT_MAX
---------- ----------
         1        300
         2        250

faster than the equivalent query against the base table.

select t_key, max( amt ) as amt_max
from t2
group by t_key
order by t_key ;

     T_KEY    AMT_MAX
---------- ----------
         1        300
         2        250

Wouldn't it be nice if Oracle could use the information in MV to resolve this last query too? If your database has a feature called Query Rewrite available and enabled, this happens automatically. To see it in action we first need to make the materialized view available to Query Rewrite, like this.

alter materialized view mv ENABLE QUERY REWRITE ; 

(See Gotcha - ORA-00439 below if you encounter an ORA-00439 error at this step.) Note that materialized views which do not include the ENABLE QUERY REWRITE clause will have Query Rewrite disabled by default.
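The separate ALTER step can be avoided by including the clause when the view is first created. The following is a sketch, equivalent to the MV defined earlier in this section:

```sql
-- Query Rewrite can be enabled at creation time; without this clause
-- rewrite is disabled by default.
create materialized view mv
  refresh fast on commit
  ENABLE QUERY REWRITE
as
  select t_key, MAX( AMT ) AMT_MAX
  from t2
  group by t_key;
```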

Next we collect statistics on the materialized view to help Oracle optimize the query rewrite process.

execute dbms_stats.gather_table_stats( user, 'MV' ) ; 

Finally we can confirm Oracle will use the materialized view in queries by turning SQL*Plus's Autotrace feature on.

set autotrace on explain
set linesize 95

select t_key, max( amt ) as amt_max
from t2
group by t_key
order by t_key ;

     T_KEY    AMT_MAX
---------- ----------
         1        300
         2        250

Execution Plan
----------------------------------------------------------
Plan hash value: 446852971

--------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     2 |    14 |     4  (25)| 00:00:01 |
|   1 |  SORT ORDER BY                |      |     2 |    14 |     4  (25)| 00:00:01 |
|   2 |   MAT_VIEW REWRITE ACCESS FULL| MV   |     2 |    14 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

 

Note how the optimizer chose to access MV for its pre-calculated MAX(AMT) values in line 2 even though the query itself made no mention of MV. Without the Query Rewrite feature the execution plan would look like this.

alter session set QUERY_REWRITE_ENABLED = FALSE ;

select t_key, max( amt ) as amt_max
from t2
group by t_key
order by t_key ;

     T_KEY    AMT_MAX
---------- ----------
         1        300
         2        250

Execution Plan
----------------------------------------------------------
Plan hash value: 50962384

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     5 |   130 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY     |      |     5 |   130 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |     5 |   130 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

 

Note how the optimizer chose to access T2 this time. Each time this query executes it must re-calculate MAX(AMT) for each group from the data in T2, a more expensive approach than simply selecting pre-calculated column values from MV.


Gotcha - ORA-00439

The materialized view query rewrite feature is not available in Oracle XE and some other Oracle configurations. If you attempt to use ENABLE QUERY REWRITE in an Oracle database where the feature is not enabled you will receive an ORA-00439 error.

create materialized view mv2
  refresh fast on commit
  ENABLE QUERY REWRITE
as
  select t_key ,
         count(*)   as row_count ,
         count(amt) as amt_count
  from t2
  group by t_key;
  from t2
  *
ERROR at line 9:
ORA-00439: feature not enabled: Materialized view rewrite
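Where the feature is unavailable, the materialized view itself can still be created; only the ENABLE QUERY REWRITE clause needs to be omitted. A sketch:

```sql
-- On editions without the rewrite feature (e.g. Oracle XE), drop the clause;
-- fast refresh still works, only automatic query rewrite is lost.
create materialized view mv2
  refresh fast on commit
as
  select t_key ,
         count(*)   as row_count ,
         count(amt) as amt_count
  from t2
  group by t_key;
```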

Cleanup

alter session set query_rewrite_enabled = true ;

set autotrace off

drop materialized view mv ;

drop materialized view log on t2 ; 

Query Rewrite Restrictions and Capabilities

Restrictions

Materialized views with the following characteristics cannot have query rewrite enabled:

- the defining query references functions which are not DETERMINISTIC
- an expression in the defining query is not repeatable; e.g. an expression containing the USER pseudocolumn or the SYSTIMESTAMP function.

Attempting to violate these restrictions results in an error.

create materialized view mv
  ENABLE QUERY REWRITE
as
  select key, val, USER from t;
as select key, val, USER from t
                    *
ERROR at line 3:
ORA-30353: expression not supported for query rewrite


Capabilities

A few different materialized view query rewrite capabilities exist. In EXPLAIN_MVIEW we used a utility called MY_MV_CAPABILITIES to explore a materialized view's refresh capabilities. In the snippets below we will use the same utility to explore rewrite capabilities. First let's look at a simple, single table materialized view with query rewrite disabled.

create materialized view mv DISABLE QUERY REWRITE as select key, val from t;

set long 5000

select my_mv_capabilities( 'MV', 'REWRITE' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Not Capable of:

REWRITE

REWRITE_FULL_TEXT_MATCH query rewrite is disabled on the materialized view

REWRITE_PARTIAL_TEXT_MATCH query rewrite is disabled on the materialized view

REWRITE_GENERAL query rewrite is disabled on the materialized view 

This materialized view obviously has no rewrite capabilities available to it. (Descriptions of each capability name are available at Table 8-7 CAPABILITY_NAME Column Details.)

Enabling query rewrite on the materialized view changes this.

alter materialized view mv ENABLE QUERY REWRITE ;

select my_mv_capabilities( 'MV', 'REWRITE' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REWRITE

REWRITE_FULL_TEXT_MATCH

REWRITE_PARTIAL_TEXT_MATCH


REWRITE_GENERAL 

Now all rewrite capabilities are available. If the materialized view happened to reference a remote table then some rewrite capabilities would be available, but not others.

drop materialized view mv ;

create materialized view mv enable query rewrite as select key, val from T@REMOTE;

select my_mv_capabilities( 'MV', 'REWRITE' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REWRITE

REWRITE_PARTIAL_TEXT_MATCH

REWRITE_GENERAL

Not Capable of:

REWRITE_FULL_TEXT_MATCH T mv references a remote table or view in the FROM list 

Cleanup

drop materialized view mv ; 

Join Queries

So far in this tutorial we have only seen materialized views based on a single table. Materialized views can also be created on multi-table queries to store the pre-calculated results of expensive join operations. Here is a simple example.

select * from t ;

       KEY VAL
---------- -----
         1 a
         2 b
         3 c
         4

select * from t2 ;

       KEY      T_KEY        AMT
---------- ---------- ----------
        10          1        100
        20          1        300
        30          1        200
        40          2        250
        50          2        150

create materialized view mv as
  select t.key  t_key ,
         t.val  t_val ,
         t2.key t2_key ,
         t2.amt t2_amt
  from   t, t2
  where  t.key = t2.t_key;

select t_key, t_val, t2_key, t2_amt
from mv ;

     T_KEY T_VAL     T2_KEY     T2_AMT
---------- ----- ---------- ----------
         1 a             10        100
         1 a             20        300
         1 a             30        200
         2 b             40        250
         2 b             50        150

REFRESH FAST

For a materialized view with only joins (no aggregates, unions, subqueries, etc.) to be fast refreshable certain restrictions beyond the General Restrictions on Fast Refresh must be met.

These additional restrictions are:

- materialized view logs with rowids must exist for all of the defining query's base tables
- the SELECT clause cannot contain object type columns
- the defining query cannot have a GROUP BY clause or aggregates
- rowid columns for each table instance in the FROM clause must appear in the SELECT clause.

-- from Restrictions on Fast Refresh on Materialized Views with Joins Only

In addition to these restrictions there are some recommended practices for using join queries. They are as follows. "If a materialized view contains joins but no aggregates, then having an index on each of the join column rowids in the detail table will enhance refresh performance greatly, because this type of materialized view tends to be much larger than materialized views containing aggregates." -- from Refreshing Materialized Views: Tips for Refreshing Materialized Views Without Aggregates

"After you create the materialized view, you must collect statistics on it using the DBMS_STATS package. Oracle Database needs the statistics generated by this package to optimize query rewrite." -- from CREATE MATERIALIZED VIEW.

The Prototype

Applying these restrictions and recommendations to our test case above yields the following prototypical materialized view with joins. Whenever I need to create this type of materialized view in an application I use the code below as a starting point to remind me of the requirements.

drop materialized view mv ;

create materialized view log on t with rowid, sequence ;

create materialized view log on t2 with rowid, sequence ;

create materialized view mv
  refresh fast on commit
  enable query rewrite
as
  select t.key    t_key ,
         t.val    t_val ,
         t2.key   t2_key ,
         t2.amt   t2_amt ,
         t.rowid  t_row_id ,
         t2.rowid t2_row_id
  from   t, t2
  where  t.key = t2.t_key;

create index mv_i1 on mv ( t_row_id ) ;
create index mv_i2 on mv ( t2_row_id ) ;

execute dbms_stats.gather_table_stats( user, 'MV' ) ; 

Whenever we create a fast refreshable view we should use our EXPLAIN_MVIEW utility, MY_MV_CAPABILITIES, to confirm it can be refreshed in all required situations.

set long 5000

select my_mv_capabilities( 'MV', 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST


REFRESH_FAST_AFTER_INSERT

REFRESH_FAST_AFTER_ONETAB_DML

REFRESH_FAST_AFTER_ANY_DML 

Now let's test drive our new MV. First, here are MV's initial contents.

select t_key, t_val, t2_key, t2_amt
from mv ;

     T_KEY T_VAL     T2_KEY     T2_AMT
---------- ----- ---------- ----------
         1 a             10        100
         1 a             20        300
         1 a             30        200
         2 b             40        250
         2 b             50        150

Now let's do some DML on both base tables and see the effect on MV.

insert into t2 values ( 60, 3, 300 ) ;

update t set val = upper(val) ;

commit;

select t_key, t_val, t2_key, t2_amt
from mv ;

     T_KEY T_VAL     T2_KEY     T2_AMT
---------- ----- ---------- ----------
         1 A             10        100
         1 A             20        300
         1 A             30        200
         2 B             40        250
         2 B             50        150
         3 C             60        300

Both changes are reflected in MV, as expected.

Query Rewrite

Materialized views containing joins can be used by the query rewrite facility (see ENABLE QUERY REWRITE).

select my_mv_capabilities( 'MV', 'REWRITE' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:


REWRITE

REWRITE_FULL_TEXT_MATCH

REWRITE_PARTIAL_TEXT_MATCH

REWRITE_GENERAL 

Gotcha - ANSI Join Syntax

When we attempt to create a materialized view with the ANSI join syntax equivalent of the defining query used above, we are surprisingly rewarded with an ORA-12015 error.

create materialized view mv2
  refresh fast
as
  select t.key    t_key ,
         t.val    t_val ,
         t2.key   t2_key ,
         t2.amt   t2_amt ,
         t.rowid  t_row_id ,
         t2.rowid t2_row_id
  from   T INNER JOIN T2 ON ( T.KEY = T2.T_KEY );
  from   T INNER JOIN T2 ON ( T.KEY = T2.T_KEY )
         *
ERROR at line 12:
ORA-12015: cannot create a fast refresh materialized view from a complex query

 

While this behaviour appears to be a bug at first glance, Metalink note 420856.1 explains that it is really an undocumented limitation of fast refresh materialized views.

An examination of the EXPLAIN_MVIEW results for this case points to some behind-the-scenes transformations with ANSI syntax which may be causing the limitation.

select my_mv_capabilities
       ( 'create materialized view mv2
            refresh fast
          as
            select t.key    t_key ,
                   t.val    t_val ,
                   t2.key   t2_key ,
                   t2.amt   t2_amt ,
                   t.rowid  t_row_id ,
                   t2.rowid t2_row_id
            from   T INNER JOIN T2 ON ( T.KEY = T2.T_KEY )'
       , 'REFRESH_FAST_AFTER_INSERT'
       ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Not Capable of:

REFRESH_FAST_AFTER_INSERT inline view or subquery in FROM list not supported for this type MV

REFRESH_FAST_AFTER_INSERT inline view or subquery in FROM list not supported for this type MV

REFRESH_FAST_AFTER_INSERT view or subquery in from list 
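Given this limitation, the practical workaround is to state the join in the traditional comma syntax, which is logically identical for an inner join. A sketch, mirroring the prototype earlier in this section:

```sql
-- Workaround for the ANSI join limitation: the same inner join written
-- with comma syntax is accepted for fast refresh.
create materialized view mv2
  refresh fast
as
  select t.key    t_key ,
         t.val    t_val ,
         t2.key   t2_key ,
         t2.amt   t2_amt ,
         t.rowid  t_row_id ,
         t2.rowid t2_row_id
  from   t, t2
  where  t.key = t2.t_key;
```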

Cleanup

drop materialized view mv ;

drop materialized view log on t ;
drop materialized view log on t2 ;

delete t2 ;
delete t ;
insert into t select * from t_backup ;
insert into t2 select * from t2_backup ;
commit;

Aggregate Queries

In addition to materialized views based on join queries, materialized views containing aggregate functions are also possible. Here is a simple example.

select *
from t2
order by key ;

       KEY      T_KEY        AMT
---------- ---------- ----------
        10          1        100
        20          1        300
        30          1        200
        40          2        250
        50          2        150

create materialized view mv as
  select t_key ,
         SUM(AMT) AMT_SUM
  from t2
  group by t_key;

select *
from mv
order by t_key ;

     T_KEY    AMT_SUM
---------- ----------
         1        600
         2        400

REFRESH FAST

For a materialized view with only aggregates (no joins, unions, subqueries, etc.) to be fast refreshable certain restrictions beyond the General Restrictions on Fast Refresh must be met. These additional restrictions are fully documented at Restrictions on Fast Refresh on Materialized Views with Aggregates. For our current test case the most significant restrictions are these.

- all base tables must have materialized view logs that:
  o "Contain all columns from the table referenced in the materialized view."
  o "Specify with ROWID and INCLUDING NEW VALUES."
  o "Specify the SEQUENCE clause if the table is expected to have a mix of inserts/direct-loads, deletes, and updates."
- aggregates in the defining query must be either SUM, COUNT, AVG, STDDEV, VARIANCE, MIN, or MAX
- the defining query's SELECT clause must contain all the columns listed in the GROUP BY clause

In addition to these restrictions, some additional columns may be required in the defining query to allow it to be fast refreshable in all cases. The table below summarizes these requirements.

Materialized Views with Aggregates: Requirements for Refresh Fast After Any DML

Aggregate       Additional Aggregates Required     Optional Aggregates  Note
--------------  ---------------------------------  -------------------  ----------------------------------------
COUNT(expr)     COUNT(*)
MIN(expr)       COUNT(*)                                                defining query must have no WHERE clause
MAX(expr)       COUNT(*)                                                defining query must have no WHERE clause
SUM(expr)       COUNT(*), COUNT(expr)
SUM(col)        COUNT(*)                                                "col" must have a NOT NULL constraint
AVG(expr)       COUNT(*), COUNT(expr)              SUM(expr)
STDDEV(expr)    COUNT(*), COUNT(expr), SUM(expr)   SUM(expr*expr)
VARIANCE(expr)  COUNT(*), COUNT(expr), SUM(expr)   SUM(expr*expr)

(For insert-only materialized views see Table 8-2 Requirements for Materialized Views with Aggregates.)

Oracle recommends including the Optional Aggregates expressions to obtain the most efficient and accurate fast refresh of the materialized view.
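The table above can be read as a recipe. For example, a fast refreshable materialized view containing AVG(AMT) would, per the table, also include COUNT(*) and COUNT(AMT), plus the optional SUM(AMT). A sketch (MV_AVG is a hypothetical name; it assumes T2 has a materialized view log with rowid, sequence ( t_key, amt ) including new values, as created elsewhere in this section):

```sql
create materialized view mv_avg
  refresh fast on commit
as
  select t_key
       , avg(amt)   as amt_avg
       , count(*)   as row_count   -- required for AVG(expr)
       , count(amt) as amt_count   -- required for AVG(expr)
       , sum(amt)   as amt_sum     -- optional, recommended by Oracle
  from t2
  group by t_key;
```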

Recommendations

The recommendation about gathering statistics that we saw in the Join Queries topic also applies to materialized views with aggregates.

"After you create the materialized view, you must collect statistics on it using the DBMS_STATS package. Oracle Database needs the statistics generated by this package to optimize query rewrite." -- from CREATE MATERIALIZED VIEW.

Additionally, we expect that our GROUP BY columns will often appear in WHERE or JOIN clauses. To improve the performance of such queries we therefore add indexes on our materialized view's GROUP BY columns.

The Prototype

Applying these restrictions and recommendations to our test case above yields the following prototypical materialized view with aggregates. Whenever I need to create this type of materialized view in an application I use the code below as a starting point to remind me of the requirements.

drop materialized view mv ;

create materialized view log on t2 with rowid, sequence ( t_key, amt ) including new values;

create materialized view mv
  refresh fast on commit
  enable query rewrite
as
  select t_key ,
         sum(amt)   as amt_sum ,
         count(*)   as row_count ,
         count(amt) as amt_count
  from t2
  group by t_key;


create index mv_i1 on mv ( t_key ) ;

execute dbms_stats.gather_table_stats( user, 'MV' ) ; 

Whenever we create a fast refreshable view we should use our EXPLAIN_MVIEW utility, MY_MV_CAPABILITIES, to confirm it can be refreshed in all required situations.

set long 5000

select my_mv_capabilities( 'MV', 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT

REFRESH_FAST_AFTER_ONETAB_DML

REFRESH_FAST_AFTER_ANY_DML 

Now let's test drive our new MV. First, here are MV's initial contents.

select *
from mv
order by t_key ;

     T_KEY    AMT_SUM  ROW_COUNT  AMT_COUNT
---------- ---------- ---------- ----------
         1        600          3          3
         2        400          2          2

Now let's do some DML on the base table and see the effect on MV.

insert into t2 values ( 60, 3, 300 ) ;

update t2 set amt = 0 where t_key = 2 ;

commit;

select *
from mv
order by t_key ;

     T_KEY    AMT_SUM  ROW_COUNT  AMT_COUNT
---------- ---------- ---------- ----------
         1        600          3          3
         2          0          2          2
         3        300          1          1

Both changes are reflected in MV, as expected.

Query Rewrite

Materialized views containing aggregates can be used by the query rewrite facility (see ENABLE QUERY REWRITE).

alter materialized view mv enable query rewrite ;

select my_mv_capabilities( 'MV', 'REWRITE' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REWRITE

REWRITE_FULL_TEXT_MATCH

REWRITE_PARTIAL_TEXT_MATCH

REWRITE_GENERAL 

Gotcha - Insert-Only Materialized Views On Commit

We know that COUNT(*), and sometimes COUNT(expr), must be included in our materialized views for them to be fast refreshable in all cases, but what happens if we do not include these columns? Let's find out.

create materialized view mv2
  refresh fast on commit
  enable query rewrite
as
  select t_key ,
         sum(amt) as amt_sum
--       count(*) as row_count ,
--       count(amt) as amt_count
  from t2
  group by t_key;

select *
from mv2
order by t_key ;

     T_KEY    AMT_SUM
---------- ----------
         1        600
         2          0
         3        300

Let's try an INSERT.

insert into t2 values ( 70, 3, 900 ) ;
commit ;

select *
from mv2
order by t_key ;

     T_KEY    AMT_SUM
---------- ----------
         1        600
         2          0
         3       1200

Looks good. The view was fast refreshed after the transaction committed.

In topic REFRESH FAST Categories we saw how an insert-only ON DEMAND materialized view similar to this one raised an error when we attempted to fast refresh it manually after a DELETE transaction. Let's see how our ON COMMIT version behaves after a DELETE.

delete from t2
where t_key = 1 ;

3 rows deleted.

commit;

Commit complete.

select *
from mv2
order by t_key ;

     T_KEY    AMT_SUM
---------- ----------
         1        600
         2          0
         3       1200

Oops, all the rows for T_KEY = 1 were deleted from T2 but the group still appears in MV2. The materialized view did not refresh on commit and no errors were generated. Let's try synchronizing MV2 manually using DBMS_MVIEW.REFRESH.

execute DBMS_MVIEW.REFRESH( 'MV2', 'c' )

select *
from mv2
order by t_key ;

     T_KEY    AMT_SUM
---------- ----------
         2          0
         3       1200

That's a little better. So we've confirmed we have another insert-only materialized view, except this time we won't get any warnings or errors if a commit fails to trigger a fast refresh. When I first learned materialized views I stumbled across this behaviour by accident and found it puzzling. After all, when one creates a materialized view specifying that it should REFRESH FAST ON COMMIT it seems reasonable to assume it will always refresh fast on commit. The manual page for CREATE MATERIALIZED VIEW did not mention insert-only materialized views so I had no clue to their existence, until I re-read the page a third time and followed up on this seemingly inconsequential little comment.

"(The REFRESH clause) only sets the default refresh options. For instructions on actually implementing the refresh, refer to Oracle Database Advanced Replication and Oracle Database Data Warehousing Guide." -- from CREATE MATERIALIZED VIEW

This eventually led me to learn about insert-only refreshing and how indispensable the DBMS_MVIEW.EXPLAIN_MVIEW utility is when working with fast refreshable materialized views. Let's see what DBMS_MVIEW.EXPLAIN_MVIEW has to say about MV2.

select my_mv_capabilities( 'MV2', 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT

Not Capable of:

REFRESH_FAST_AFTER_ONETAB_DML AMT_SUM SUM(expr) without COUNT(expr)

REFRESH_FAST_AFTER_ONETAB_DML COUNT(*) is not present in the select list

REFRESH_FAST_AFTER_ANY_DML see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled 

The report tells us MV2 is fast refreshable after insert, as we saw earlier, but not after other types of DML. This is how to recognize an insert-only materialized view.


So the lesson here is: do not assume materialized views created with REFRESH FAST ON COMMIT will always refresh fast on commit. Always check with DBMS_MVIEW.EXPLAIN_MVIEW to see whether or not the view is "insert-only".
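If the MY_MV_CAPABILITIES wrapper is not at hand, DBMS_MVIEW.EXPLAIN_MVIEW can be called directly. It writes its findings to MV_CAPABILITIES_TABLE, which is created by the Oracle-supplied script utlxmv.sql. A sketch:

```sql
-- Assumes MV_CAPABILITIES_TABLE exists in your schema (see utlxmv.sql).
execute dbms_mview.explain_mview( 'MV2' )

select capability_name, possible, msgtxt
from   mv_capabilities_table
where  capability_name like 'REFRESH%'
order  by seq ;
```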

Cleanup

drop materialized view mv ;
drop materialized view mv2 ;

drop materialized view log on t2 ;

delete t2 ;
insert into t2 select * from t2_backup ;
commit;

Nested Materialized Views

Sometimes a single materialized view will not meet our requirements. For example, given this base table

select t_key, amt, key
from t2
order by t_key, amt, key ;

     T_KEY        AMT        KEY
---------- ---------- ----------
         1        100         10
         1        200         30
         1        300         20
         2        150         50
         2        250         40

say we wanted a fast refreshable materialized view defined with the following query.

select t_key t_key
     , max(amt) amt_max
     , max(key) keep ( dense_rank last order by amt ) as t2_key_of_amt_max
from t2
group by t_key;

     T_KEY    AMT_MAX T2_KEY_OF_AMT_MAX
---------- ---------- -----------------
         1        300                20
         2        250                40

(T2_KEY_OF_AMT_MAX identifies the KEY value associated with the highest AMT value in each group.)


As always the first step is to create a materialized view log on T2.

create materialized view log on t2 with rowid , sequence ( key, t_key, amt ) including new values; 

Now let's see what the MY_MV_CAPABILITIES utility (created in topic DBMS_MVIEW.EXPLAIN_MVIEW) tells us about our query.

set long 5000

select my_mv_capabilities(
'select t_key t_key
      , max(amt) amt_max
      , max(key) keep ( dense_rank LAST order by amt ) as t2_key_of_amt_max
 from t2
 group by t_key'
, 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

Not Capable of:

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT aggregate function nested within an expression

REFRESH_FAST_AFTER_ONETAB_DML see the reason why REFRESH_FAST_AFTER_INSERT is disabled

REFRESH_FAST_AFTER_ONETAB_DML mv uses the MIN or MAX aggregate functions

REFRESH_FAST_AFTER_ANY_DML see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled 

Though not entirely obvious from the report, it turns out our query is not fast refreshable because the LAST aggregate function which we used to implement T2_KEY_OF_AMT_MAX is not one of the fast refreshable aggregates SUM, COUNT, AVG, STDDEV, VARIANCE, MIN and MAX (see Restrictions on Fast Refresh on Materialized Views with Aggregates). Let's try writing the query using a subquery instead of LAST.

select my_mv_capabilities(
'select t_key t_key
      , max(amt) amt_max
      , max(key) as t2_key_of_amt_max
 from t2
 where ( t_key, t2.amt ) in ( select t_key, max(amt) from t2 group by t_key )
 group by t_key'
, 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

Not Capable of:

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT subquery in mv

REFRESH_FAST_AFTER_ONETAB_DML see the reason why REFRESH_FAST_AFTER_INSERT is disabled

REFRESH_FAST_AFTER_ONETAB_DML mv uses the MIN or MAX aggregate functions

REFRESH_FAST_AFTER_ANY_DML see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled 

It looks like a subquery will not work either. Perhaps an analytic approach will work?

select my_mv_capabilities(
'select distinct
        t_key t_key
      , max( amt ) over ( partition by t_key ) as amt_max
      , last_value( key ) over
          ( partition by t_key order by amt
            range between unbounded preceding and unbounded following
          ) as t2_key_of_amt_max
 from t2'
, 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

Not Capable of:

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT DISTINCT clause in select list in mv

REFRESH_FAST_AFTER_INSERT DISTINCT clause in select list in mv

REFRESH_FAST_AFTER_INSERT window function in mv

REFRESH_FAST_AFTER_ONETAB_DML see the reason why REFRESH_FAST_AFTER_INSERT is disabled

REFRESH_FAST_AFTER_ANY_DML see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled 

This last approach did not work either, which is a bit of a relief actually since the technique is rather crass. We need to rethink our approach. Since deriving the desired result set is conceptually a three step process

1. find the highest AMT value for each T_KEY group
2. find the highest KEY value per AMT
3. join the results of steps (1) and (2) together on the AMT column

perhaps three separate materialized views would work? The materialized views for steps 1 and 2, which we will call MV1 and MV2, can be based on table T2 and can be refreshed independently of each other. However the materialized view for step 3, which we will call MV3, will need to be based on MV1 and MV2 and will need to refresh after they do.

Fortunately Oracle allows for a materialized view like MV3 and automatically manages the refresh order when all three views are refreshable on commit. Materialized views like MV3 are called "Nested Materialized Views". Note the term "Nested Materialized View" does not refer to MV1 and MV2, even though they could be thought of as being "nested" within MV3.


Restrictions and Recommendations

As always, before creating a type of materialized view we have not tried before we must be aware of its restrictions. For nested materialized views they are these.

- The base materialized views must contain joins or aggregates.
- The defining query must contain joins or aggregates.
- All base objects, whether they are tables or materialized views, must each have materialized view logs.
- If REFRESH FAST is specified then all materialized views in any chain related to the materialized view must also specify REFRESH FAST.

Note that all base objects in a nested materialized view, regardless of whether they are tables or materialized views, are treated as tables.

We are now ready to craft our three step solution.

create materialized view MV1
  refresh fast on commit
as
  select t_key
       , max(amt) amt_max
       , count(amt) amt_count
       , count(*) row_count
  from t2
  group by t_key;

create materialized view log on mv1 with rowid , sequence ( t_key, amt_max, amt_count, row_count ) including new values;

select my_mv_capabilities( 'MV1', 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT

REFRESH_FAST_AFTER_ANY_DML

create materialized view MV2
  refresh fast on commit
as
  select t_key
       , amt
       , max(key) max_key_per_amt
       , count(*) row_count
  from t2
  group by t_key, amt;

create materialized view log on mv2 with rowid , sequence ( t_key, max_key_per_amt, row_count ) including new values;

select my_mv_capabilities( 'MV2', 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT

REFRESH_FAST_AFTER_ANY_DML

create materialized view MV3
  refresh fast on commit
as
  select mv1.t_key
       , mv1.amt_max
       , mv2.max_key_per_amt as t2_key_of_amt_max
       , mv1.rowid mv1_rowid
       , mv2.rowid mv2_rowid
  from mv1, mv2
  where mv1.t_key = mv2.t_key
    and mv1.amt_max = mv2.amt;

select my_mv_capabilities( 'MV3', 'REFRESH' ) as mv_report
from dual ;

MV_REPORT
--------------------------------------------------------------------------------

Capable of:

REFRESH_COMPLETE

REFRESH_FAST

REFRESH_FAST_AFTER_INSERT

REFRESH_FAST_AFTER_ONETAB_DML

REFRESH_FAST_AFTER_ANY_DML 

We finally have a fast refreshable materialized view solution. Let's confirm that MV3, the nested one, contains the correct results.

select t_key, amt_max, t2_key_of_amt_max
from mv3
order by t_key ;

     T_KEY    AMT_MAX T2_KEY_OF_AMT_MAX
---------- ---------- -----------------
         1        300                20
         2        250                40

Good. It matches the results returned by the first query we tried which used the LAST function. Now let's put all three materialized views through their paces. First we perform a few mixed DML transactions.

insert into t2 values ( 60, 3, 450 );
insert into t2 values ( 70, 3, 550 );

update t2 set amt = 300 where key = 30 ;
commit;

delete from t2 where key = 70 ;
update t2 set amt = 650 where key = 60 ;
commit;

select t_key, amt, key
from t2
order by t_key, amt, key ;

     T_KEY        AMT        KEY
---------- ---------- ----------
         1        100         10
         1        300         20
         1        300         30
         2        150         50
         2        250         40
         3        650         60

Now we check MV3 to see if it contains the correct info.

select t_key, amt_max, t2_key_of_amt_max
from mv3
order by t_key ;

     T_KEY    AMT_MAX T2_KEY_OF_AMT_MAX
---------- ---------- -----------------
         1        300                30
         2        250                40
         3        650                60

It does. Mission accomplished.

Cleanup

drop materialized view mv1 ;
drop materialized view mv2 ;
drop materialized view mv3 ;

drop materialized view log on t2 ;

delete t2 ;
insert into t2 select * from t2_backup ;
commit;

Setup

Run the code on this page in SQL*Plus to create the sample tables, data, etc. used by the examples in this section.

create table t
( key number primary key
, val varchar2(5)
) ;

insert into t values ( 1, 'a' );
insert into t values ( 2, 'b' );
insert into t values ( 3, 'c' );
insert into t values ( 4, null );

commit;

create table t_backup as select * from t;

create table t2
( key   number primary key
, t_key number not null references t
, amt   number not null
) ;


insert into t2 values ( 10, 1, 100 ) ;
insert into t2 values ( 20, 1, 300 ) ;
insert into t2 values ( 30, 1, 200 ) ;
insert into t2 values ( 40, 2, 250 ) ;
insert into t2 values ( 50, 2, 150 ) ;
commit;

create table t2_backup as select * from t2;

 

Cleanup

Run the code on this page to drop the sample tables, procedures, etc. created in earlier parts of this section. To clear session state changes (e.g. those made by SET, COLUMN, and VARIABLE commands) exit your SQL*Plus session after running these cleanup commands.

drop table t2 ;
drop table t ;

drop table t_backup ;
drop table t2_backup ;

drop function my_mv_capabilities ;

--------------------------------------------------------------------------------
--
-- WARNING!!
--
-- MV_CAPABILITIES_TABLE is an Oracle table; do not drop it from your schema
-- unless you specifically created it for this tutorial and no longer wish to
-- use it
--
--------------------------------------------------------------------------------

drop table mv_capabilities_table ;

exit


MODEL Clause

This section presents tutorials on the MODEL clause of the SELECT command. Introduced in Oracle 10g, the MODEL clause is a powerful feature that gives you the ability to change any cell in the query's result set using data from any other cell (similar to the way a spreadsheet works). It also adds procedural features to SQL previously available only through PL/SQL calls.

For example, with MODEL you can take a simple table like this

   KEY GROUP_1    GROUP_2    DATE_VAL   NUM_VAL
------ ---------- ---------- ---------- -------
     1 A          a1         2005-01-01     100
     2 A          a2         2005-06-12     200
     3 A          a3                        300
     4 B          a1         2006-02-01
     5 B          a2         2006-06-12     300
     6 B          a3         2005-01-01     100
     7 C          a1         2006-06-12     100
     8 C          a2
     9            a1         2005-02-01     200
    10            a2         2005-02-01     800

and, with a single command, create a report containing ad-hoc totals like this.

set null ""

select
  case when key like 'Total%' then key else null end as total
, group_1
, group_2
, num_val
from t
model
  dimension by
  ( cast(key as varchar2(20)) as key
  , nvl( group_1, 'n/a' ) as group_1
  , nvl( group_2, 'n/a' ) as group_2
  )
  measures ( num_val )
  rules
  ( num_val[ 'Total 1 - A + C',   null, null ] = sum(num_val)[any,group_1 in ('A','C'),any]
  , num_val[ 'Total 2 - A + a2',  null, null ] =
      sum(num_val)[any,'A',any] + sum(num_val)[any,group_1 <> 'A','a2']
  , num_val[ 'Total 3 - n/a',     null, null ] = sum(num_val)[any,'n/a',any]
  , num_val[ 'Total 4 - a1 + a3', null, null ] = sum(num_val)[any,any,group_2 in ('a1','a3')]
  )
order by group_1, group_2, total nulls first ;

TOTAL                GROUP_1    GROUP_2    NUM_VAL
-------------------- ---------- ---------- -------
                     A          a1             100
                     A          a2             200
                     A          a3             300
                     B          a1
                     B          a2             300
                     B          a3             100
                     C          a1             100
                     C          a2
                     n/a        a1             200
                     n/a        a2             800
Total 1 - A + C                                700
Total 2 - A + a2                              1700
Total 3 - n/a                                 1000
Total 4 - a1 + a3                              800

You can also use MODEL's procedural features to produce results that are difficult, inefficient, or impossible to produce with a non-MODEL SELECT command. Here is an example.

set null "(null)"
column string format a40

select group_1, substr( string, 2 ) as string
from t
where num_val is not null
model
  return updated rows
  partition by ( group_1 )
  dimension by
  ( row_number() over (partition by group_1 order by num_val) as position )
  measures ( cast( num_val as varchar2(65) ) as string )  -- Note 1
  rules upsert
  iterate( 6 ) until ( presentv(string[iteration_number+2],1,0) = 0 )
  ( string[0] = string[0] || ',' || string[iteration_number+1] )
order by group_1 ;

GROUP_1    STRING
---------- ----------------------------------------
A          100,200,300
B          100,300
C          100
(null)     200,800


(This last technique is explained fully in another section of SQL Snippets at Rows to String: MODEL Method 1.)
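The iterative rows-to-string rule above amounts to a per-group string join. The sketch below, a hedged Python illustration (the names `rows`, `groups`, and `strings` are mine, not from the original), shows the same grouping-and-joining idea using the sample data:

```python
# Sketch of the rows-to-string aggregation done by the MODEL rule above:
# order rows by num_val, group them by group_1, then join with commas.
from collections import defaultdict

rows = [("A", 100), ("A", 200), ("A", 300), ("B", 100), ("B", 300),
        ("C", 100), (None, 200), (None, 800)]  # (group_1, num_val), nulls dropped

groups = defaultdict(list)
for g, v in sorted(rows, key=lambda r: r[1]):  # mirrors ORDER BY num_val
    groups[g].append(str(v))

strings = {g: ",".join(vs) for g, vs in groups.items()}
print(strings["A"])   # 100,200,300
print(strings[None])  # 200,800
```

This is only an analogy for what the ITERATE/UNTIL loop computes; the MODEL version does the concatenation cell by cell inside the SQL engine.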

Though powerful, the MODEL clause is also somewhat complex and this can be intimidating when you read about it for the first time. The tutorials to follow will therefore present very simple MODEL examples to help you quickly become comfortable with its many features. Before continuing it is important to know that everything in the MODEL clause is evaluated after all other clauses in the query, except for SELECT DISTINCT and ORDER BY. Knowing this will help you better understand the examples in this section's tutorials.

DIMENSION BY

In this tutorial we learn about the DIMENSION BY component of the MODEL clause. DIMENSION BY specifies which columns in a SELECT statement are dimension columns, which for our purposes can be thought of as any column that serves to identify each row in the result of a SELECT statement. By default, the dimension columns in a MODEL clause must produce a unique key for the result set. See the Oracle® Database Data Warehousing Guide 10g Release 2 (10.2) - Glossary for a formal definition.

Before we begin please note that, on its own, DIMENSION BY has little visible effect on the output of the SELECT statement. Most of the examples below would produce the same result as one with no MODEL clause at all. This is because we are not trying to manipulate the results just yet. We are simply seeing how to specify our dimension columns, which is a precursor for learning to manipulate results in subsequent pages.

Consider the following table.

select key, key_2, group_1, group_2, num_val
from t
order by key;

   KEY KEY_2 GROUP_1    GROUP_2    NUM_VAL
------ ----- ---------- ---------- -------
     1 T-1   A          a1             100
     2 T-2   A          a2             200
     3 T-3   A          a3             300
     4 T-4   B          a1
     5 T-5   B          a2             300
     6 T-6   B          a3             100
     7 T-7   C          a1             100
     8 T-8   C          a2
     9 T-9              a1             200
    10 T-10             a2             800

We see that KEY, KEY_2, and (GROUP_1, GROUP_2) all uniquely identify each row in the table. They are therefore dimension column candidates. To let Oracle know which column(s) we plan to use as dimensions we compose a MODEL clause like this. (Ignore the MEASURES and RULES clauses for now. We will explore those later.)

select key, num_val
from t
model
  DIMENSION BY ( KEY )
  measures ( num_val )
  rules ()
order by key;

   KEY NUM_VAL
------ -------
     1     100
     2     200
     3     300
     4
     5     300
     6     100
     7     100
     8
     9     200
    10     800

Multiple Dimensions

If needed, you can define more than one dimension column, as this example shows.

select group_1, group_2, num_val
from t
model
  DIMENSION BY ( GROUP_1, GROUP_2 )
  measures ( num_val )
  rules ()
order by group_1, group_2;

GROUP_1    GROUP_2    NUM_VAL
---------- ---------- -------
A          a1             100
A          a2             200
A          a3             300
B          a1
B          a2             300
B          a3             100
C          a1             100
C          a2
           a1             200
           a2             800

You can even include columns in the DIMENSION BY clause which are not required to uniquely identify each result row.

select key, date_val, num_val
from t
model
  DIMENSION BY ( KEY, DATE_VAL )  -- date_val not required to uniquely identify row
  measures ( num_val )
  rules ()
order by key;

   KEY DATE_VAL   NUM_VAL
------ ---------- -------
     1 2005-01-01     100
     2 2005-06-12     200
     3                300
     4 2006-02-01
     5 2006-06-12     300
     6 2005-01-01     100
     7 2006-06-12     100
     8
     9 2005-02-01     200
    10 2005-02-01     800

Aliasing

You cannot use SELECT clause aliases in DIMENSION BY. Here are some examples of aliases that will cause errors.

select KEY AS KEY_3, num_val
from t
model
  dimension by ( KEY_3 )
  measures ( num_val )
  rules ();

  dimension by ( KEY_3 )
                 *
ERROR at line 7:
ORA-00904: "KEY_3": invalid identifier

select KEY * 10 AS KEY_3, num_val
from t
model
  dimension by ( KEY_3 )
  measures ( num_val )
  rules ();

  dimension by ( KEY_3 )
                 *
ERROR at line 7:
ORA-00904: "KEY_3": invalid identifier

select ROWNUM AS KEY_3, num_val
from t
model
  dimension by ( KEY_3 )
  measures ( num_val )
  rules ();

  dimension by ( KEY_3 )
                 *
ERROR at line 7:
ORA-00904: "KEY_3": invalid identifier

 

You can however alias such expressions directly in DIMENSION BY.

select KEY_3, num_val
from t
model
  DIMENSION BY ( KEY AS KEY_3 )
  measures ( num_val )
  rules ()
order by key_3;

     KEY_3 NUM_VAL
---------- -------
         1     100
         2     200
         3     300
         4
         5     300
         6     100
         7     100
         8
         9     200
        10     800


select KEY_3, num_val
from t
model
  DIMENSION BY ( KEY * 10 AS KEY_3 )
  measures ( num_val )
  rules ()
order by key_3;

     KEY_3 NUM_VAL
---------- -------
        10     100
        20     200
        30     300
        40
        50     300
        60     100
        70     100
        80
        90     200
       100     800

select KEY_3, num_val
from t
model
  DIMENSION BY ( ROWNUM AS KEY_3 )
  measures ( num_val )
  rules ()
order by key_3;

     KEY_3 NUM_VAL
---------- -------
         1     100
         2     200
         3     300
         4
         5     300
         6     100
         7     100
         8
         9     200
        10     800

Uniqueness

By default, if your DIMENSION BY columns do not give you a unique key for your result set you will get an error.


select group_2, num_val
from t
model
  DIMENSION BY ( GROUP_2 )  -- group_2 is not unique
  measures ( num_val )
  rules ()
order by group_2;

from t
     *
ERROR at line 5:
ORA-32638: Non unique addressing in MODEL dimensions

 

This rule can be relaxed somewhat by specifying UNIQUE SINGLE REFERENCE.

select group_2, num_val
from t
model
  UNIQUE SINGLE REFERENCE
  dimension by ( group_2 )  -- group_2 is not unique
  measures ( num_val )
  rules ()
order by group_2;

GROUP_2    NUM_VAL
---------- -------
a1             100
a1             100
a1             200
a1
a2             800
a2             200
a2             300
a2
a3             300
a3             100

Note that UNIQUE SINGLE REFERENCE affects the types of RULES you can define. This is explained further in Expressions and Cell References.

MEASURES

In this tutorial we learn about the MEASURES component of the MODEL clause. MEASURES specifies which columns in a SELECT are measure columns, which for our purposes can be thought of as any column containing a measurable quantity, like a price or a length. See the Oracle® Database Data Warehousing Guide 10g Release 2 (10.2) - Glossary for a formal definition.

Before we begin please note that, on its own, MEASURES has little visible effect on the output of the SELECT statement. Most of the examples below would produce the same result as one with no MODEL clause at all. This is because we are not trying to manipulate the results just yet. We are simply seeing how to specify our measure columns, which is a precursor to manipulating the results. We will see how to actually manipulate our output when we explore the RULES clause in subsequent tutorials.

Before we see MEASURES in action first consider the following table.

select key, group_1, group_2, date_val, num_val
from t
order by key;

   KEY GROUP_1    GROUP_2    DATE_VAL   NUM_VAL
------ ---------- ---------- ---------- -------
     1 A          a1         2005-01-01     100
     2 A          a2         2005-06-12     200
     3 A          a3                        300
     4 B          a1         2006-02-01
     5 B          a2         2006-06-12     300
     6 B          a3         2005-01-01     100
     7 C          a1         2006-06-12     100
     8 C          a2
     9            a1         2005-02-01     200
    10            a2         2005-02-01     800

If we decide to use KEY as our sole dimension column, then all other columns are available for use as measure columns. To let Oracle know we want to use the NUM_VAL column as our measure we can compose a MODEL clause like this.

select key, num_val
from t
model
  dimension by ( key )
  MEASURES ( NUM_VAL )
  rules ()
order by key;

   KEY NUM_VAL
------ -------
     1     100
     2     200
     3     300
     4
     5     300
     6     100
     7     100
     8
     9     200
    10     800

If we want to include more measure columns we do it like this.

select key, date_val, num_val
from t
model
  dimension by ( key )
  MEASURES ( DATE_VAL, NUM_VAL )
  rules ()
order by key;

   KEY DATE_VAL   NUM_VAL
------ ---------- -------
     1 2005-01-01     100
     2 2005-06-12     200
     3                300
     4 2006-02-01
     5 2006-06-12     300
     6 2005-01-01     100
     7 2006-06-12     100
     8
     9 2005-02-01     200
    10 2005-02-01     800

You can define measures using constants and expressions instead of simple column names, like this.

select key, num_val, num_val_2, date_val_2, note
from t
model
  dimension by ( key )
  MEASURES
  ( num_val
  , NUM_VAL * 10     AS NUM_VAL_2
  , SYSDATE          AS DATE_VAL_2
  , 'A BRIEF NOTE'   AS NOTE
  )
  rules ()
order by key;

   KEY NUM_VAL  NUM_VAL_2 DATE_VAL_2 NOTE
------ ------- ---------- ---------- ------------
     1     100       1000 2007-02-28 A BRIEF NOTE
     2     200       2000 2007-02-28 A BRIEF NOTE
     3     300       3000 2007-02-28 A BRIEF NOTE
     4                    2007-02-28 A BRIEF NOTE
     5     300       3000 2007-02-28 A BRIEF NOTE
     6     100       1000 2007-02-28 A BRIEF NOTE
     7     100       1000 2007-02-28 A BRIEF NOTE
     8                    2007-02-28 A BRIEF NOTE
     9     200       2000 2007-02-28 A BRIEF NOTE
    10     800       8000 2007-02-28 A BRIEF NOTE


Nulls

Nulls and Aggregate Functions

This tutorial demonstrates how aggregate functions deal with null values. Techniques for generating results that ignore nulls and results that include nulls are highlighted.

Ignoring Nulls

According to the SQL Reference Manual section on Aggregate Functions:

All aggregate functions except COUNT(*) and GROUPING ignore nulls. You can use the NVL function in the argument to an aggregate function to substitute a value for a null. COUNT never returns null, but returns either a number or zero. For all the remaining aggregate functions, if the data set contains no rows, or contains only rows with nulls as arguments to the aggregate function, then the function returns null.

This means that, given a table with values like this

GROUP_KEY  VAL
---------- ----------
Group-1    (null)
Group-1    (null)

Group-2    a
Group-2    a
Group-2    z
Group-2    z
Group-2    (null)

Group-3    A
Group-3    A
Group-3    Z

aggregate functions like MAX, MIN, and COUNT will return values that, for the most part, ignore nulls, like these.

select group_key
, MAX( VAL ) max_val
, MIN( VAL ) min_val
, COUNT( * ) count_all_rows
, COUNT( VAL ) count_val
, COUNT( DISTINCT VAL ) count_distinct_val
from t1
group by group_key
order by group_key ;

GROUP_KEY  MAX_VAL    MIN_VAL    COUNT_ALL_ROWS  COUNT_VAL COUNT_DISTINCT_VAL
---------- ---------- ---------- -------------- ---------- ------------------
Group-1    (null)     (null)                  2          0                  0
Group-2    z          a                       5          4                  2
Group-3    Z          A                       3          3                  2

Note how MAX_VAL contains the same results for Group-2 and Group-3, even though Group-2 contains null VAL values and Group-3 does not. Note also that only COUNT_ALL_ROWS returned a count that included null values. The other two versions of COUNT() ignored null values.
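These null-handling rules are not Oracle-specific. As a quick cross-check, the sketch below reproduces the query above using Python's built-in sqlite3 module, whose aggregates follow the same convention (MAX and COUNT(col) ignore NULLs; COUNT(*) does not). The table name and data mirror the example; this is an illustration, not Oracle itself:

```python
# Reproduce the aggregate-vs-null behaviour shown above with sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t1 ( group_key text, val text )")
conn.executemany(
    "insert into t1 values (?, ?)",
    [("Group-1", None), ("Group-1", None),
     ("Group-2", "a"), ("Group-2", "a"), ("Group-2", "z"),
     ("Group-2", "z"), ("Group-2", None),
     ("Group-3", "A"), ("Group-3", "A"), ("Group-3", "Z")])

result = conn.execute("""
    select group_key,
           max(val),               -- ignores NULLs
           count(*),               -- counts every row, NULL or not
           count(val),             -- counts only non-NULL rows
           count(distinct val)     -- counts distinct non-NULL values
    from t1
    group by group_key
    order by group_key
""").fetchall()

for row in result:
    print(row)
# ('Group-1', None, 2, 0, 0)
# ('Group-2', 'z', 5, 4, 2)
# ('Group-3', 'Z', 3, 3, 2)
```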

Including Nulls

For mathematical aggregate functions like AVG, MEDIAN, and SUM including nulls in the calculation is of little practical use. For aggregate functions like MAX, MIN, and COUNT(DISTINCT ...) however, we sometimes need results that take nulls into account. For example, using the same test data as above

GROUP_KEY  VAL
---------- ----------
Group-1    (null)
Group-1    (null)

Group-2    a
Group-2    a
Group-2    z
Group-2    z
Group-2    (null)

Group-3    A
Group-3    A
Group-3    Z

we may wish to produce results like these.

GROUP_KEY  MAX_VAL    MIN_VAL    COUNT_DISTINCT_VAL_1
---------- ---------- ---------- --------------------
Group-1    (null)     (null)                        1
Group-2    (null)     a                             3
Group-3    Z          A                             2

For the MAX and MIN cases it helps to take the statement "all aggregate functions except COUNT(*) and GROUPING ignore nulls" with a grain of salt. Fortunately for us there are aggregate functions in addition to COUNT(*) and GROUPING which do not ignore nulls. Two of them are the FIRST and LAST functions, which we can use as follows to give us MAX and MIN results that include nulls.

select group_key
, MAX( VAL ) KEEP ( DENSE_RANK LAST  ORDER BY VAL ) max_val
, MIN( VAL ) KEEP ( DENSE_RANK FIRST ORDER BY VAL ) min_val
from t1
group by group_key
order by group_key ;

GROUP_KEY  MAX_VAL    MIN_VAL
---------- ---------- ----------
Group-1    (null)     (null)
Group-2    (null)     a
Group-3    Z          A
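The FIRST/LAST trick works because Oracle's default ordering places nulls last, so "take the value at the end of the sort order" naturally yields NULL whenever one is present. The same idea, sketched in Python with hypothetical helper names of my own (not Oracle functions):

```python
# Null-including MAX/MIN: sort with None as the highest value (like Oracle's
# default NULLS LAST), then take the endpoints of the ordering.
def max_including_nulls(vals):
    # key (is_none, value): any None outranks every real value
    return max(vals, key=lambda v: (v is None, v))

def min_including_nulls(vals):
    return min(vals, key=lambda v: (v is None, v))

group_2 = ["a", "a", "z", "z", None]
print(max_including_nulls(group_2))  # None -- a NULL is present
print(min_including_nulls(group_2))  # a
```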

For the COUNT( DISTINCT VAL ) case two possible approaches for including nulls are demonstrated below.

select group_key
, COUNT( DISTINCT DUMP(VAL) ) count_distinct_val_1
, COUNT( DISTINCT VAL ) + MAX( NVL2(VAL,0,1) ) count_distinct_val_2
from t1
group by group_key
order by group_key ;

GROUP_KEY  COUNT_DISTINCT_VAL_1 COUNT_DISTINCT_VAL_2
---------- -------------------- --------------------
Group-1                       1                    1
Group-2                       3                    3
Group-3                       2                    2

Be careful with the DUMP approach though since DUMP's output is truncated at 4000 characters. If the VAL column contained values whose DUMP output is truncated then the results can be incorrect.
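Both expressions above compute the same thing: the count of distinct values, with NULL counted as one extra value when present. A small Python sketch of that idea (function names are mine, for illustration):

```python
# COUNT(DISTINCT ...) with and without null inclusion, sketched with sets.
def count_distinct_including_nulls(vals):
    return len(set(vals))  # None hashes like any other value, so it counts once

def count_distinct_ignoring_nulls(vals):
    return len({v for v in vals if v is not None})

group_2 = ["a", "a", "z", "z", None]
print(count_distinct_including_nulls(group_2))  # 3
print(count_distinct_ignoring_nulls(group_2))   # 2
```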

DENSE_RANK and RANK

Two more aggregate functions where including nulls in the calculation may be necessary are the DENSE_RANK and RANK functions. Fortunately, as with FIRST and LAST, DENSE_RANK and RANK include nulls by default. For example, given test data like this (analytic value rankings are included for clarity)

GROUP_KEY  VAL        VAL_DENSE_RANK   VAL_RANK
---------- ---------- -------------- ----------
Group-1    (null)                  1          1
Group-1    (null)                  1          1

Group-2    a                       1          1
Group-2    a                       1          1
Group-2    z                       2          3
Group-2    z                       2          3
Group-2    (null)                  3          5

Group-3    A                       1          1
Group-3    A                       1          1
Group-3    Z                       2          3

The following results show how the aggregate versions of DENSE_RANK and RANK do not ignore nulls.

select group_key
, DENSE_RANK( NULL ) WITHIN GROUP ( ORDER BY VAL ) null_dense_rank_within_group
, RANK( NULL ) WITHIN GROUP ( ORDER BY VAL ) null_rank_within_group
from t1
group by group_key
order by group_key ;

GROUP_KEY  NULL_DENSE_RANK_WITHIN_GROUP NULL_RANK_WITHIN_GROUP
---------- ---------------------------- ----------------------
Group-1                               1                      1
Group-2                               3                      5
Group-3                               3                      4

Gotchas

Some people reading these two sentences from the manual

All aggregate functions except COUNT(*) and GROUPING ignore nulls. You can use the NVL function in the argument to an aggregate function to substitute a value for a null.

may infer that aggregate functions can be made to treat null values the same way they treat non-null values by simply using NVL to substitute nulls with some non-null value. A simple application of this logic can lead to trouble, however. For example, say we choose to substitute all null values with a 'z', like this.

select group_key
, max( NVL( VAL, 'z' ) ) max_val
, min( NVL( VAL, 'z' ) ) min_val
, count( distinct NVL( VAL, 'z' ) ) count_distinct_val
from t1
group by group_key
order by group_key ;

GROUP_KEY  MAX_VAL    MIN_VAL    COUNT_DISTINCT_VAL
---------- ---------- ---------- ------------------
Group-1    z          z                           1
Group-2    z          a                           2
Group-3    Z          A                           2

Note how none of the columns above contain the desired results which, as you will recall, are these.

GROUP_KEY  MAX_VAL    MIN_VAL    COUNT_DISTINCT_VAL_1
---------- ---------- ---------- --------------------
Group-1    (null)     (null)                        1
Group-2    (null)     a                             3
Group-3    Z          A                             2

A simple application of NVL clearly will not do, then. Taking the NVL idea a little further, programmers sometimes employ more complex solutions, such as this one.

select group_key
, DECODE( MAX( NVL( VAL, '~' ) ), '~', NULL, MAX( VAL ) ) max_val
, DECODE( MIN( NVL( VAL, '~' ) ), '~', NULL, MIN( VAL ) ) min_val
, COUNT( distinct NVL( VAL, '~' ) ) count_distinct_val
from t1
group by group_key
order by group_key ;

GROUP_KEY  MAX_VAL    MIN_VAL    COUNT_DISTINCT_VAL
---------- ---------- ---------- ------------------
Group-1    (null)     (null)                      1
Group-2    (null)     a                           3
Group-3    Z          A                           2

Without some mechanism to ensure that '~', and strings that sort higher than '~', never appear in VAL, however, these solutions will fail if such values are ever inserted into the table. For example, given this data

insert into t1 values ( 10, 'Group-4', null );
insert into t1 values ( 10, 'Group-4', '~' );

insert into t1 values ( 10, 'Group-5', null );
insert into t1 values ( 10, 'Group-5', '~~~' );

the results should be

GROUP_KEY  MAX_VAL    MIN_VAL    COUNT_DISTINCT_VAL_1
---------- ---------- ---------- --------------------
Group-4    (null)     ~                             2
Group-5    (null)     ~~~                           2

but the NVL approach gives us these incorrect results.

select group_key
, decode( max( nvl(val, '~' ) ), '~', null, max( val ) ) max_val
, decode( min( nvl(val, '~' ) ), '~', null, min( val ) ) min_val
, count( distinct nvl( val, '~' ) ) count_distinct_val
from t1
where group_key in ( 'Group-4', 'Group-5' )
group by group_key
order by group_key ;

GROUP_KEY  MAX_VAL    MIN_VAL    COUNT_DISTINCT_VAL
---------- ---------- ---------- ------------------
Group-4    (null)     (null)                      1
Group-5    ~~~        (null)                      2

To avoid these gotchas simply use the non-NVL alternatives presented under "Including Nulls" above.
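The sentinel pitfall above can be shown in miniature: once the placeholder can legally occur in the data, NULL and the placeholder become indistinguishable. This Python sketch (the name `max_with_nvl` and the emulation are mine) deliberately reproduces the failure:

```python
# Emulate DECODE( MAX( NVL(val,'~') ), '~', NULL, MAX(val) ) and show where
# the '~' sentinel breaks down.
SENTINEL = "~"

def max_with_nvl(vals):
    m = max(SENTINEL if v is None else v for v in vals)  # the NVL substitution
    return None if m == SENTINEL else m                  # the DECODE "undo" step

# Works while '~' never appears in the data ('~' sorts above 'z' in ASCII):
print(max_with_nvl(["a", "z", None]))  # None -- correct, the NULL "wins"

# Breaks as soon as a value sorting above the sentinel is inserted:
print(max_with_nvl([None, "~~~"]))     # ~~~  -- wrong: should be None
```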


Nulls and Equality

In SQL you should always consider the effect of null values when comparing two values for equality (or any type of comparison for that matter). Consider a table where two of its columns can contain null values.

select * from t;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  2 A          B
  3 (null)     A
  4 (null)     (null)

If we attempt a SELECT statement like the following we will only get row 1.

select *
from t
where c2 = c3 ;

 C1 C2         C3
--- ---------- ----------
  1 A          A

Row 4 is not returned because, in SQL, a null is not considered to be equal to or unequal to any value (including another null). If this is the behavior you need, then read no further. However, if you need a query that returns row 1 and row 4 then try one of the solutions in the subtopics to follow.
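The three-valued logic at work here can be sketched as an explicit comparison function. All the solutions that follow are, in essence, implementations of this one idea (the name `null_safe_equal` is mine, for illustration):

```python
# A null-safe equality check: two NULLs match, one NULL never matches a value.
def null_safe_equal(a, b):
    if a is None and b is None:
        return True       # two NULLs count as a match
    if a is None or b is None:
        return False      # NULL vs. a real value: no match
    return a == b

rows = [(1, "A", "A"), (2, "A", "B"), (3, None, "A"), (4, None, None)]
matches = [c1 for c1, c2, c3 in rows if null_safe_equal(c2, c3)]
print(matches)  # [1, 4]
```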

SQL + PL/SQL

These techniques work in both SQL and PL/SQL.

OR with IS NULL

While a bit cumbersome, this basic solution is the easiest to understand and implement.

select *
from t
where ( C2 = C3 OR ( C2 IS NULL AND C3 IS NULL ) );

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)

begin
  for r in ( select * from t ) loop
    if ( R.C2 = R.C3 OR ( R.C2 IS NULL AND R.C3 IS NULL ) ) then
      dbms_output.put_line( 'Row ' || r.c1 || ' contains matching values.' );
    end if;
  end loop;
end;
/

Row 1 contains matching values.
Row 4 contains matching values.

NVL

The following NVL approach is a popular one.

select *
from t
where nvl( c2, 'x' ) = nvl( c3, 'x' ) ;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)

One problem with this solution is that the replacement value "x", or whatever value you choose to use, might be inserted into columns C2 or C3 some day. This would cause a SELECT statement that had been working properly to suddenly start returning the wrong answer, like this.

insert into t values( 5, 'x', null );
commit;

select *
from t
where nvl( c2, 'x' ) = nvl( c3, 'x' ) ;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)
  5 x          (null)

The trick to making this solution bulletproof is to choose a replacement value that can never appear in either of the columns being compared. If we look at the table definition for T

desc t
 Name                                           Null?    Type
 ---------------------------------------------- -------- ------------
 C1                                                       NUMBER
 C2                                                       VARCHAR2(10)
 C3                                                       VARCHAR2(10)

 

we see that values in C2 and C3 can be at most 10 characters long. Any replacement value larger than 10 characters is therefore guaranteed to never appear in either column (assuming the sizes of C2 or C3 are never expanded).

select *
from t
where nvl( c2, '12345678901' ) = nvl( c3, '12345678901' ) ;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)

Custom Function

If you do these comparisons frequently you may wish to create a custom database function like this one.

create function SAME( p_1 in varchar2, p_2 in varchar2 ) return varchar2 is
begin
  return
    ( case
        when p_1 is null and p_2 is null then 'Y'
        when p_1 = p_2 then 'Y'
        else 'N'
      end );
end;
/

select *
from t
where SAME( C2, C3 ) = 'Y' ;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)

begin
  for r in ( select * from t ) loop
    if SAME( R.C2, R.C3 ) = 'Y' then
      dbms_output.put_line( 'Row ' || r.c1 || ' contains matching values.' );
    end if;
  end loop;
end;
/

Row 1 contains matching values.
Row 4 contains matching values.

With this approach, however, separate functions are required for comparing NUMBER values, VARCHAR2 values, DATE values, etc.

SQL Only

The following techniques, while more compact than the solutions presented in SQL + PL/SQL above, unfortunately work only in SQL statements, not in PL/SQL.

DECODE

One approach uses the DECODE function. Unlike the "=" operator, DECODE treats two nulls as equivalent. To return rows where two columns contain the same value we can therefore use a command like the following.

select *
from t
where DECODE( C2, C3, 'Y', 'N' ) = 'Y' ;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)

DUMP

Another approach uses the DUMP function.

select *
from t
where DUMP(C2) = DUMP(C3) ;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)

This approach has a couple of limitations however. One, the output of the DUMP function is truncated at 4000 characters. If the values being compared produce truncated DUMP output then the comparison can produce false positives. Here is an example.

select 'Oops! This row should not be returned.' as result
from dual
where DUMP( LPAD( 'A', 4000 ) ) = DUMP( LPAD( 'B', 4000 ) ) ;

RESULT
--------------------------------------
Oops! This row should not be returned.


 

Two, C2 and C3 must be the exact same datatype for the comparison to work. Comparing compatible datatypes such as VARCHAR2 and CHAR will fail to match any rows.

select 'Oops! This row should be returned, but it is not.' as result
from t
where dump( c2 ) = dump( 'A' ) ;

no rows selected

 

Examining the output of DUMP shows why this occurs. The "typ=" part of the DUMP output for both terms differs because column C2 is datatype 1, VARCHAR2, and the literal 'A' is datatype 96, CHAR.

column varchar2_val format a30
column char_val format a30 fold_before

select dump( c2 ) as varchar2_val
, dump( 'A' ) as char_val
from t
where c1 = 1;

VARCHAR2_VAL
------------------------------
CHAR_VAL
------------------------------
Typ=1 Len=1: 65
Typ=96 Len=1: 65

SYS_OP_MAP_NONNULL

Another approach that some have proposed uses the undocumented function SYS_OP_MAP_NONNULL.

select *
from t
where SYS_OP_MAP_NONNULL( C2 ) = SYS_OP_MAP_NONNULL( C3 ) ;

 C1 C2         C3
--- ---------- ----------
  1 A          A
  4 (null)     (null)

As with the other solutions on this page, SYS_OP_MAP_NONNULL does not work in PL/SQL.

begin
  for r in ( select * from t ) loop
    if ( SYS_OP_MAP_NONNULL( R.C2 ) = SYS_OP_MAP_NONNULL( R.C3 ) ) then
      dbms_output.put_line( 'Row ' || r.c1 || ' contains matching values.' );
    end if;
  end loop;
end;
/

  if ( SYS_OP_MAP_NONNULL( R.C2 ) = SYS_OP_MAP_NONNULL( R.C3 ) ) then
       *
ERROR at line 6:
ORA-06550: line 6, column 10:
PLS-00201: identifier 'SYS_OP_MAP_NONNULL' must be declared
ORA-06550: line 6, column 5:
PL/SQL: Statement ignored

It also has a length limitation.

select *
from dual
where SYS_OP_MAP_NONNULL( LPAD( 'A', 4000 ) ) =
      SYS_OP_MAP_NONNULL( LPAD( 'B', 4000 ) ) ;

select *
       *
ERROR at line 1:
ORA-01706: user function result value was too large

While undocumented features such as this one are compelling, their behavior or availability can change at any time, making them risky to include in your code. They also make support and maintenance harder for others who must work with your code and are not familiar with the feature.

Setup

Run the code on this page in SQL*Plus to create the sample tables, data, etc. used by the examples in this section.

Be sure to read Using SQL Snippets ™ before executing any of these setup steps.

create table t
( c1  number
, c2  varchar2(10)
, c3  varchar2(10)
);

insert into t values( 1, 'A' , 'A' );
insert into t values( 2, 'A' , 'B' );
insert into t values( 3, null , 'A' );
insert into t values( 4, null , null );


commit;

set null '(null)'
set numformat 99
set serveroutput on

Cleanup

Run the code on this page to drop the sample tables, procedures, etc. created in earlier parts of this section. To clear session state changes (e.g. those made by SET, COLUMN, and VARIABLE commands) exit your SQL*Plus session after running these cleanup commands. Be sure to read Using SQL Snippets ™ before executing any of these cleanup steps.

drop table t ;
drop function same ;

exit


Integer Series Generators

Sometimes, having a way to create a series of integers greatly simplifies certain queries. For example, if your data looks like this:

select * from t ;

DAY_OF_WEEK        VAL
----------- ----------
          1        100
          3        300
          4        400
          5        500

and you want a report that looks like this

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

It would be useful to have a table with the numbers 0 to 6 in it so you could write an outer join query like this.

select day_of_week, t.val
from days_of_the_week d left outer join t using ( day_of_week )
order by day_of_week;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

If you expect to write lots of queries that use the same series of integers and they are based on real world phenomena then creating a table like DAYS_OF_THE_WEEK can be the best solution.
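The "join against a full series to fill the gaps" pattern is worth seeing in miniature. This Python sketch (variable names are mine) plays the role of the outer join above, with `range(7)` standing in for the DAYS_OF_THE_WEEK table:

```python
# Fill in missing days 0..6 by "joining" sparse data against a full series,
# mirroring the LEFT OUTER JOIN in the report above.
data = {1: 100, 3: 300, 4: 400, 5: 500}   # day_of_week -> val

report = [(day, data.get(day)) for day in range(7)]  # None where no row exists
for day, val in report:
    print(day, "" if val is None else val)
```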


Occasionally, however, you may need a different set of integers just for one specific query, for ad-hoc reports, or for a system you do not have CREATE TABLE privileges on. In these cases it may be impractical or impossible to create a dedicated table that meets your needs. Fortunately there are flexible, generic techniques for generating integers. The tutorials in this section demonstrate a few of them. The feature chart below will help you decide which method is best for you.

Method                 Pure SQL solution; no      Works in versions
                       custom objects required    prior to 10g
---------------------  -------------------------  -----------------
Integer Table          N                          Y
MODEL                  Y                          N
ROWNUM + a Big Table   Y                          Y
CONNECT BY LEVEL       Y                          Y
CUBE                   Y                          Y
Type Constructor       N                          Y
Pipelined Function     N                          Y

Performance comparison charts for all these methods are available at the end of the section on the Performance Comparison - Small Numbers and Performance Comparison - Large Numbers pages.

Integer Table Method

This tutorial demonstrates how to generate a series of integers using a generic integer table. Other techniques are discussed in the topics listed in the menu to the left.

One of the most straightforward ways to generate a series of integers is by adding a generic integer table to your application. You can create such a table like this.

create table integers
( integer_value  integer primary key
)
organization index;

Table created.

 

Since this table only has a single indexed column we specified organization index to make this an Index-Organized table and save storage space.

To load the table a simple loop like the following will do the trick.

begin
  for i in -5 .. 10 loop
    insert into integers values ( i );
  end loop;
  commit;
end;
/


PL/SQL procedure successfully completed.

 

We used -5 and 10 as the limits of our series in this example. In practice you would choose limits that anticipate the smallest and largest integers you will ever need.

select * from integers ;

INTEGER_VALUE
-------------
           -5
           -4
           -3
           -2
           -1
            0
            1
            2
            3
            4
            5
            6
            7
            8
            9
           10

Later, when you need a specific series of integers you can use the INTEGERS table like this.

select i.integer_value as day_of_week
, t.val
from integers i left outer join t
  on ( i.integer_value = t.day_of_week )
where i.integer_value between 0 and 6
order by i.integer_value;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

MODEL Method


This tutorial demonstrates how to generate a series of integers using the MODEL clause of the SELECT command. (You can learn more about MODEL at SQL Features Tutorials: MODEL Clause.) This technique works only in Oracle 10g and later. Other techniques are discussed in the tutorials listed in the menu to the left.

With this technique you can generate a series of integers starting at "1" using a query like this.

select integer_value
from dual
where 1=2
model
  dimension by ( 0 as key )
  measures ( 0 as integer_value )
  rules upsert
  ( integer_value[ for key from 1 to 10 increment 1 ] = cv(key) );

INTEGER_VALUE
-------------
            1
            2
            3
            4
            5
            6
            7
            8
            9
           10

Changing the INCREMENT value lets us control the difference between successive values in the series.

select integer_value
from dual
where 1=2
model
  dimension by ( 0 as key )
  measures ( 0 as integer_value )
  rules upsert
  ( integer_value[ for key from 2 to 10 INCREMENT 2 ] = cv(key) );

INTEGER_VALUE
-------------
            2
            4
            6
            8
           10
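The FOR ... INCREMENT loop in these rules is the SQL analogue of a stepped range. For readers more at home in a procedural language, the two queries above correspond to this sketch:

```python
# The two MODEL series above, expressed as stepped ranges.
print(list(range(1, 11)))     # for key from 1 to 10 increment 1
print(list(range(2, 11, 2)))  # for key from 2 to 10 increment 2
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# [2, 4, 6, 8, 10]
```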

We can use bind variables to make the solution more generic.

variable v_first_key number
variable v_last_key number
variable v_increment number

execute :V_FIRST_KEY := 1
execute :V_LAST_KEY  := 5
execute :V_INCREMENT := 2

select key, integer_value
from dual
where 1=2
model
  dimension by ( 0 as key )
  measures ( 0 as integer_value )
  rules upsert
  ( integer_value[ for key from :V_FIRST_KEY to :V_LAST_KEY increment 1 ] =
      nvl2( integer_value[cv()-1]
          , integer_value[cv()-1] + :V_INCREMENT
          , cv(key) ) );

       KEY INTEGER_VALUE
---------- -------------
         1             1
         2             3
         3             5
         4             7
         5             9

When v_last_key is NULL or less than v_first_key no rows are returned.

execute :v_first_key := 1

PL/SQL procedure successfully completed.

execute :v_last_key := null

PL/SQL procedure successfully completed.

/

no rows selected

execute :v_last_key := 0

PL/SQL procedure successfully completed.

/

no rows selected

execute :v_last_key := -5

PL/SQL procedure successfully completed.

/

no rows selected
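The empty-series behavior demonstrated above can be captured in a small helper. The sketch below (the function name `integer_series` is mine, not from the original) mirrors the bind-variable query: an empty result when the upper bound is missing or below the lower bound, otherwise a stepped series starting at the lower bound:

```python
# A generator of integer series with the same edge-case behaviour as the
# parameterized MODEL query: no rows when last is NULL or last < first.
def integer_series(first, last, increment=1):
    if last is None or last < first:
        return []                              # mirrors "no rows selected"
    values, v = [], first
    while v <= last:
        values.append(v)
        v += increment
    return values

print(integer_series(1, 5))        # [1, 2, 3, 4, 5]
print(integer_series(1, 9, 2))     # [1, 3, 5, 7, 9]
print(integer_series(1, None))     # []
print(integer_series(1, -5))       # []
```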


 

Day of the Week Case Study

We can apply this technique to the day of the week scenario presented at the start of this chapter as follows.

select day_of_week
, t.val
from
( select day_of_week
  from dual
  where 1=2
  model
    dimension by ( 0 as key )
    measures ( 0 as day_of_week )
    rules upsert
    ( day_of_week[ for key from 0 to 6 increment 1 ] = cv(key) )
) i
left outer join t using ( day_of_week )
order by day_of_week;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

Gotchas

Descending Series

If you need a descending series of integers this attempt will not work.

select integer_value
from   dual
where  1=2
model
  dimension by ( 0 as key )
  measures     ( 0 as integer_value )
  rules upsert
  ( integer_value[ for key FROM 3 TO 1 increment 1 ] = cv(key) );

no rows selected

 

Instead, do it this way


select integer_value
from   dual
where  1=2
model
  dimension by ( 0 as key )
  measures     ( 0 as integer_value )
  rules upsert
  ( integer_value[ for key from 3 to 1 DECREMENT 1 ] = cv(key) )
ORDER BY INTEGER_VALUE DESC;

INTEGER_VALUE
-------------
            3
            2
            1

or this way.

select integer_value
from   dual
where  1=2
model
  dimension by ( 0 as key )
  measures     ( 0 as integer_value )
  rules upsert
  ( integer_value[ for key from 1 TO 3 INCREMENT 1 ] = cv(key) )
ORDER BY INTEGER_VALUE DESC;

INTEGER_VALUE
-------------
            3
            2
            1

WHERE 1=2

It is important to note that everything in the MODEL clause is evaluated after all other clauses in the query, except for SELECT DISTINCT and ORDER BY. Using the WHERE 1=2 clause ensures the query starts with an empty result set when MODEL rules are first applied to the rows returned by the SELECT ... FROM ... WHERE portion of the query.

While it would be possible to omit the WHERE 1=2 clause using an approach like this

select integer_value
from   dual
model
  dimension by ( 1 as key )
  measures     ( 1 as integer_value )
  rules upsert
  ( integer_value[ for key from 1 to 10 increment 1 ] = cv(key) );

INTEGER_VALUE
-------------
            1
            2
            3
            4
            5
            6
            7
            8
            9
           10

this query causes the result set to always contain at least one row both before and after the MODEL rules are applied. This is not a problem for queries that always return one or more rows like this one,

select key, integer_value
from   dual
model
  dimension by ( 4 as key )
  measures     ( 4 as integer_value )
  rules upsert
  ( integer_value[ for key from 4 to 8 increment 1 ] = cv(key) );

       KEY INTEGER_VALUE
---------- -------------
         4             4
         5             5
         6             6
         7             7
         8             8

but if the code is later parameterized and the TO bound is ever null or less than the FROM bound then the query will incorrectly return 1 row instead of the required zero rows for these cases.

variable v_first_key number
variable v_last_key number

execute :v_first_key := 3
execute :v_last_key := 0

select key, integer_value
from   dual
model
  dimension by ( :v_first_key as key )
  measures     ( :v_first_key as integer_value )
  rules upsert
  ( integer_value[ for key from :v_first_key to :v_last_key increment 1 ] = cv(key) );

       KEY INTEGER_VALUE
---------- -------------
         3             3

RETURN UPDATED ROWS


An alternative to using WHERE 1=2 would be to instead include a RETURN UPDATED ROWS clause, like this

select integer_value
from   dual
model RETURN UPDATED ROWS
  dimension by ( 1 as key )
  measures     ( 1 as integer_value )
  rules upsert
  ( integer_value[ for key from 1 TO 3 increment 1 ] = cv(key) );

INTEGER_VALUE
-------------
            1
            2
            3

select integer_value
from   dual
model RETURN UPDATED ROWS
  dimension by ( 1 as key )
  measures     ( 1 as integer_value )
  rules upsert
  ( integer_value[ for key from 3 TO 0 increment 1 ] = cv(key) );

no rows selected

 

but using WHERE 1=2 to ensure the query always starts with an empty set seems like a cleaner way to work than starting with one row and then relying on RETURN UPDATED ROWS to return that row in some cases but not others.

INCREMENT and Bind Variables

Unlike the FROM and TO bounds, the INCREMENT value cannot be supplied via a bind variable (as tested in 10g).

variable v_first_key number
variable v_last_key number
variable v_increment number

execute :v_first_key := 1
execute :v_last_key := 9
execute :v_increment := 2

select key, integer_value
from   dual
where  1=2
model
  dimension by ( 0 as key )
  measures     ( 0 as integer_value )
  rules upsert
  ( integer_value[ for key from :v_first_key to :v_last_key INCREMENT :v_increment ] = cv(key) );

( integer_value[ for key from :v_first_key to :v_last_key INCREMENT :v_increment ]
  *
ERROR at line 8:
ORA-32626: illegal bounds or increment in MODEL FOR loop

ROWNUM + a Big Table Method

This tutorial demonstrates how to generate a series of integers using the ROWNUM pseudocolumn and any available table with at least as many rows as the number of integers required. Other techniques are discussed in the other tutorials in this section.

Prerequisites

Before using this solution you need to find a table with at least as many rows in it as the number of integers you need to generate. That is, if you need a series of 10 integers then you need to find a table or view that will always have at least 10 rows in it. The data dictionary view ALL_OBJECTS is a popular choice for this method.

The Solution

Once you have identified a table with a sufficient number of rows simply select ROWNUM from it to generate the required integer series, like this.

select rownum
from   all_objects
where  rownum <= 10 ;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10
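The idea is simply "take and number the first N rows of any sufficiently large row source". This Python sketch (our own illustration; the names are not part of any Oracle API) shows the same pattern, with a generic iterator standing in for ALL_OBJECTS.

```python
from itertools import islice

def first_n_rownums(row_source, n):
    """Number the first n rows of an arbitrary row source, the way
    ROWNUM numbers the rows that survive `where rownum <= n`."""
    return [i + 1 for i, _ in enumerate(islice(row_source, n))]
```

first_n_rownums(iter(range(10**6)), 10) yields 1 through 10, matching the query above; a source with too few rows yields a correspondingly short series, which is why the table must have at least N rows.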

We can apply this technique to our day of the week scenario as follows.

select day_of_week
,      t.val
from   ( select rownum - 1 as day_of_week
         from   all_objects
         where  rownum <= 7
       ) i
       left outer join t using ( day_of_week )
order by day_of_week;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

CONNECT BY LEVEL Method

This tutorial demonstrates how to generate a series of integers using a novel application of the CONNECT BY clause, first posted by Mikito Harakiri at Ask Tom "how to display selective record twice in the query?". Other techniques are discussed in the other tutorials in this section.

With this technique you can generate a series of integers starting at "1" using a query like this.

select level
from   dual
connect by level <= 10 ;

     LEVEL
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

Queries Without PRIOR

The query above is a special case of a more general type of query, those that do not use the PRIOR operator. Applying the technique to a table with two rows, "a" and "b", yields some insight into how such queries work.

break on level duplicates skip 1


column path format a10

select level, sys_connect_by_path( key, '/' ) as path, key
from   t4
connect by level <= 3
order by level, path ;

     LEVEL PATH       KEY
---------- ---------- ---
         1 /a         a
         1 /b         b

         2 /a/a       a
         2 /a/b       b
         2 /b/a       a
         2 /b/b       b

         3 /a/a/a     a
         3 /a/a/b     b
         3 /a/b/a     a
         3 /a/b/b     b
         3 /b/a/a     a
         3 /b/a/b     b
         3 /b/b/a     a
         3 /b/b/b     b

Without a CONNECT BY condition that uses PRIOR it appears Oracle returns all possible hierarchy permutations. This effect may be useful where an exponentially increasing number of output rows is required.
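The row counts are easy to check: with n source rows and no PRIOR condition, level k contributes n^k rows. The following Python sketch (our own illustration, not Oracle code) enumerates the same paths the query above displays.

```python
from itertools import product

def all_paths(keys, max_level):
    """Every root-to-node path a PRIOR-less CONNECT BY produces:
    all permutations with repetition of the keys at each level,
    rendered in SYS_CONNECT_BY_PATH style."""
    return [(level, '/' + '/'.join(p))
            for level in range(1, max_level + 1)
            for p in product(keys, repeat=level)]
```

For keys "a" and "b" and three levels this yields 2 + 4 + 8 = 14 rows, with 8 rows (2^3) at level 3, matching the output above.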

The effect also proves useful in situations where more than one integer series is required from a single query. See Multiple Integer Series: CONNECT BY LEVEL Method for more details and an example.

Note there is some debate about whether queries without PRIOR in the CONNECT BY clause are legal or not. This is discussed further in the "Gotchas" section below.

Variables

The original syntax for this technique works fine when the number of rows is hardcoded to a value greater than or equal to 1. If, however, the number of rows is set with a bind variable whose value can be 0, negative, or null, the technique may not work as expected. It always generates at least one row in these cases.

clear breaks

variable v_total_rows number

execute :v_total_rows := 0

select level
from   dual
connect by level <= :v_total_rows ;

     LEVEL
----------
         1

execute :v_total_rows := -5

PL/SQL procedure successfully completed.

/

LEVEL---------- 1

1 row selected.

execute :v_total_rows := null

PL/SQL procedure successfully completed.

/

LEVEL---------- 1

1 row selected.

 

A simple WHERE clause fixes this behaviour.

execute :v_total_rows := 0

PL/SQL procedure successfully completed.

select level
from   dual
WHERE  :V_TOTAL_ROWS >= 1
connect by level <= :v_total_rows ;

no rows selected

 execute :v_total_rows := -5

PL/SQL procedure successfully completed.

/

no rows selected


execute :v_total_rows := null

PL/SQL procedure successfully completed.

/

no rows selected

execute :v_total_rows := 3

PL/SQL procedure successfully completed.

/

LEVEL---------- 1 2 3

3 rows selected.

 

Day of the Week Case Study

In the next snippet we apply the technique to the day of the week scenario we examined in prior tutorials.

select day_of_week
,      t.val
from   ( select level - 1 as day_of_week
         from   dual
         connect by level <= 7
       ) i
       left outer join t using( day_of_week )
order by day_of_week;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

Gotchas

To Use PRIOR or Not to Use PRIOR, That is the Question


Laurent Schneider argues in his blog post Bible of Oracle that a clause like CONNECT BY LEVEL <= 10 is an illegal construct since it has no expressions qualified with the PRIOR operator, as dictated by this statement in the SQL Reference Manual:

"in a hierarchical query, one expression in condition must be qualified with the PRIOR operator to refer to the parent row." -- Oracle® Database SQL Reference 10g Release 2 (10.2)

Others argue that this statement is a documentation bug. The fact that the CONNECT BY clause works without error in Oracle 10g and some 9i versions somewhat supports this view.

Note that the following queries, seemingly equivalent to the CONNECT BY LEVEL <= 10 solution, do not produce the desired 10 rows of output (as tested in Oracle 10g).

select level
from   dual
connect by level <= 10 AND PRIOR DUMMY = DUMMY;
ERROR:
ORA-01436: CONNECT BY loop in user data

select level
from   dual
connect by level <= 10 AND PRIOR 1 = 1;
ERROR:
ORA-01436: CONNECT BY loop in user data

The following variation may be more legal than the original solution since it includes a PRIOR condition and does not produce a CONNECT BY loop, but the PL/SQL call it contains makes it perform worse (from Re: Creating N Copies of a Row using "CONNECT BY CONNECT_BY_ROOT" - Volder).

select level
from   dual
connect by level <= 10
       and PRIOR DBMS_RANDOM.VALUE IS NOT NULL;

     LEVEL
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

The jury is still out on using CONNECT BY LEVEL to generate integers. Until there is a definitive answer, be aware there is a risk the technique may not work in future versions.

Issues

Issues with this technique, or variations of it, have been reported in Oracle versions earlier than 10.2. I have not tested these myself, but here are some posts that describe problems.

Ask Tom "Can there be an infinite DUAL?" - Weird results Ask Tom "Can there be an infinite DUAL?" - Weird Results (bug?) Ask Tom "how to display selective record twice in the query?"- minor simplification CONNECT BY Generator Rules | Ask Mr. Ed

In Oracle 9i, if you try the CONNECT BY LEVEL technique and get a single row when expecting multiple rows, like this

select level from dual connect by level < 10 ;

     LEVEL
----------
         1

putting the query in an inline view, as in this snippet, may help (I have not tested this).

select *
from (select level from dual connect by level < 10) ;

     LEVEL
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9

9 rows selected. 

Acknowledgements

Mikito Harakiri, Tom Kyte, Laurent Schneider and other posters at these threads:
Ask Tom "how to display selective record twice in the query?"
Ask Tom "Can there be an infinite DUAL?"


CUBE Method

This tutorial demonstrates how to generate a series of integers using the CUBE clause of the SELECT statement. Other techniques are discussed in the other tutorials in this section.

Here are some examples that generate a series of integers using CUBE.

To return 4 rows (2^2):

select rownum
from ( select 1 from dual group by cube( 1, 2 ) ) ;

    ROWNUM
----------
         1
         2
         3
         4

To return 8 rows (2^3):

select rownum
from ( select 1 from dual group by cube( 1, 2, 3 ) ) ;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
         7
         8

To return 16 rows (2^4):

select rownum
from ( select 1 from dual group by cube( 1, 2, 3, 4 ) ) ;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10
        11
        12
        13
        14
        15
        16

To return 9 rows:

select rownum
from ( select 1 from dual group by cube( 1, 2, 3, 4 ) )
where rownum <= 9;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9

We can apply this technique to our day of the week scenario with this query.

with days_of_the_week as
( select rownum - 1 as day_of_week
  from   ( select 1 from dual group by cube( 1, 2, 3 ) )
  where  rownum <= 7
)
select day_of_week
,      t.val
from   days_of_the_week
       left outer join t using ( day_of_week )
order by day_of_week;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

For more details about how this method works see CUBE - Explained.

Gotchas

Number of Arguments in CUBE

Ensure the numeric literal in the WHERE clause is less than or equal to 2^(number of CUBE arguments); otherwise you will not get the correct number of rows, as in this example, which attempts to generate 7 integers but only succeeds in generating 4.

select rownum as integer_value
from
( select 1
  from   t2
  group by cube ( 1, 2 )  -- will only generate 2^2 rows
)
where rownum <= 7  -- can only be <= 4, 3, 2, or 1 in this query, not 5, 6, 7, ...
;

INTEGER_VALUE
-------------
            1
            2
            3
            4

Inline View

Attempting to use rownum without the inline view will cause errors or incorrect results.

select rownum
from   t2
group by cube( 1, 2 ) ;
select rownum
       *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

select rownum
from   t2
group by rownum, cube( 1, 2 ) ;

    ROWNUM
----------
         1
         1
         1
         1

CUBE Method - Explained

To understand how the integer series generator described in the CUBE Method tutorial works we will start with a simple query, transform it into a query that uses CUBE the traditional way, and then turn it into an integer series generator for the values 1 to 4. (Read the Oracle manual page on the CUBE grouping operation first if you are not already familiar with this feature.)

set null "(null)" 

 

select *
from   t2 ;

C1     C2             C3
------ ------ ----------
x      y              42

select c1, c2, sum( c3 ) as sum_c3
from   t2
GROUP BY C1, C2 ;

C1     C2         SUM_C3
------ ------ ----------
x      y              42

select c1, c2, sum( c3 ) as sum_c3
from   t2
group by CUBE( c1, c2 );

C1     C2         SUM_C3
------ ------ ----------
(null) (null)         42
(null) y              42
x      (null)         42
x      y              42

select c1, c2, 1 AS ANY_LITERAL
from   t2
group by cube( c1, c2 );

C1     C2     ANY_LITERAL
------ ------ -----------
(null) (null)           1
(null) y                1
x      (null)           1
x      y                1

SELECT 1
from   t2
group by cube ( c1, c2 ) ;

         1
----------
         1
         1
         1
         1

select 1
from   t2
group by CUBE ( 1, 2 )  -- see Note 1
;

         1
----------
         1
         1
         1
         1

select ROWNUM AS INTEGER_VALUE
from
( select 1
  from   t2
  group by cube ( 1, 2 )
);

INTEGER_VALUE
-------------
            1
            2
            3
            4

select rownum as integer_value
from
( select 1
  from   t2
  group by cube ( 1, 2 )
)
where ROWNUM <= 3 ;

INTEGER_VALUE
-------------
            1
            2
            3

Note 1: In this technique it does not matter what literals you use in the arguments to CUBE. You could use arguments like 1, 1 or 'a','b' and still get the same number of rows. The important part is how many literals you include. Two literals will give you four rows (2^2), three literals will give you eight rows (2^3), four literals will give you sixteen rows (2^4), etc. I like to use arguments like 1,2,3,4,5,6,7 because it is easier to tell there are 7 arguments (which produce 2^7 rows) with this approach than with an argument list like 1,1,1,1,1,1,1.
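The 2^n row count comes from CUBE generating one grouping for every subset of its argument list. This counting argument can be sketched in Python (our own illustration, not Oracle code):

```python
from itertools import combinations

def cube_row_count(args):
    """GROUP BY CUBE produces one grouping per subset of its
    arguments, including the empty set, so n arguments yield 2^n
    groupings; with a one-group source each grouping is one row."""
    subsets = [c for r in range(len(args) + 1)
               for c in combinations(args, r)]
    return len(subsets)
```

cube_row_count([1, 2]) is 4 and cube_row_count([1, 2, 3, 4]) is 16, matching the examples above; 7 arguments give 128.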

Type Constructor Expression Method

This tutorial demonstrates how to generate a series or set of integers using Type Constructor Expressions for collection types. Other techniques are discussed in the other tutorials in this section.

Prerequisites

This solution requires a nested table type or varray type. We will use one called INTEGER_TABLE_TYPE created in the Setup topic for this section. If you do not have privileges to create a type like this see Setup - Note 1.

desc integer_table_type
 integer_table_type TABLE OF NUMBER(38)

 

The Solution

If you need a manageable number of integers, like 10 or 20, you can use a simple query like this one.

select column_value
from   table( integer_table_type( 1,2,3,4,5,6,7,8,9,10 ) ) ;

COLUMN_VALUE
------------
           1
           2
           3
           4
           5
           6
           7
           8
           9
          10

This method is unique among those in this section in that it lends itself well to creating sets of non-sequential integers as well as sequential series.

select column_value
from   table( integer_table_type( 1,1,4,4,4,8,10 ) ) ;

COLUMN_VALUE
------------
           1
           1
           4
           4
           4
           8
          10

Applying the technique to our day of the week scenario yields this query.

select i.column_value as day_of_week
,      t.val
from   table( integer_table_type( 0,1,2,3,4,5,6 ) ) i
       left outer join t on ( i.column_value = t.day_of_week )
order by i.column_value;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

If you require more integers than you care to list in a type constructor expression see the Type Constructor + Cartesian Product tutorial for a variation of this technique.

Gotchas

If we specify more than 999 arguments in a type constructor it will generate an ORA-00939: too many arguments for function error (as tested in Oracle 10g).

Type Constructor + Cartesian Product Method

This tutorial demonstrates how to generate a series of integers using Type Constructor Expressions for collection types and Cartesian Products. Other techniques are discussed in the other tutorials in this section.

Prerequisites

This solution requires a nested table type or varray type. We will use one called INTEGER_TABLE_TYPE created in the Setup topic for this section. If you do not have privileges to create a type like this see Setup - Note 1.

desc integer_table_type
 integer_table_type TABLE OF NUMBER(38)


 

The Solution

If you require a large number of integers then listing them all in a type constructor expression, like the solutions in the Type Constructor Expression Method tutorial, may be difficult or impossible. In this case you can use a Cartesian product with your type constructor expressions to generate a large number of rows with a small amount of code. Here are some examples. This query returns 9 rows (3x3).

select rownum
from   table( integer_table_type( 1,2,3 ) ) i1,
       table( integer_table_type( 1,2,3 ) ) i2;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9

This query returns 12 rows (3x4).

select rownum
from   table( integer_table_type( 1,2,3 ) ) i1,
       table( integer_table_type( 1,2,3,4 ) ) i2;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10
        11
        12

A query like this can return up to 10,000 rows (10^4), though we won't prove this by displaying them all here. Listing 15 of them should suffice.

with i as
( select * from table ( integer_table_type( 1,2,3,4,5,6,7,8,9,10 ) ) )
select rownum
from   i, i, i, i
where  rownum <= 15 ;

    ROWNUM
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10
        11
        12
        13
        14
        15

How Cartesian Products Work

When you do not specify a join between tables Oracle combines each row of one table with each row in the other to produce every possible row combination. This produces a result set with (number of rows in Table 1) x (number of rows in Table 2) rows in it. The following query illustrates this.

select rownum
,      i1.column_value i1_column_value
,      i2.column_value i2_column_value
from   table( integer_table_type( 1,2 ) ) i1,
       table( integer_table_type( 10,20,30 ) ) i2;

    ROWNUM I1_COLUMN_VALUE I2_COLUMN_VALUE
---------- --------------- ---------------
         1               1              10
         2               1              20
         3               1              30
         4               2              10
         5               2              20
         6               2              30
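The same row-combination rule is easy to verify outside SQL. In this Python sketch (illustrative only; the lists stand in for the two collections) every element of one list is paired with every element of the other:

```python
from itertools import product

i1 = [1, 2]        # stands in for the first collection
i2 = [10, 20, 30]  # stands in for the second collection

# every (i1, i2) combination, exactly as the unjoined tables combine
rows = list(product(i1, i2))
```

len(rows) is 2 x 3 = 6 and the first combination is (1, 10), matching the query output.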

Pipelined Function Method

This tutorial demonstrates how to generate a series of integers using a Pipelined Function. Other techniques are discussed in the other tutorials in this section.

Prerequisites

This solution requires a nested table type or varray type. We will use one called INTEGER_TABLE_TYPE created in the Setup topic for this section. If you do not have privileges to create a type like this see Setup - Note 1.

desc integer_table_type
 integer_table_type TABLE OF NUMBER(38)

 

You will also need the following custom database function.

create function integer_series(
  p_lower_bound in number,
  p_upper_bound in number
) return integer_table_type pipelined
as
begin
  for i in p_lower_bound .. p_upper_bound loop
    pipe row(i);
  end loop;
  return;
end;
/
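A pipelined function streams rows one at a time rather than materialising the whole collection first. A Python generator (our own analogy, not an Oracle feature) behaves the same way, including producing nothing when the bounds are inverted, just as a PL/SQL FOR loop with an inverted range executes zero times:

```python
def integer_series(p_lower_bound, p_upper_bound):
    """Generator analogue of the pipelined function: yield ("pipe")
    one integer per iteration of the lower..upper loop."""
    i = p_lower_bound
    while i <= p_upper_bound:
        yield i
        i += 1
```

list(integer_series(-5, 7)) reproduces the thirteen values shown in the next example.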

 

The Solution

Now that we have our prerequisites in place, here is how you use the INTEGER_SERIES function.

select *
from   table( integer_series(-5,7) ) ;

COLUMN_VALUE
------------
          -5
          -4
          -3
          -2
          -1
           0
           1
           2
           3
           4
           5
           6
           7

We can apply this technique to our day of the week scenario with this query.

select i.column_value as day_of_week
,      t.val
from   table( integer_series(0,6) ) i
       left outer join t on ( i.column_value = t.day_of_week )
order by i.column_value;

DAY_OF_WEEK        VAL
----------- ----------
          0
          1        100
          2
          3        300
          4        400
          5        500
          6

Performance Comparison - Small Numbers

The following tables show performance metrics for one run each of the eight integer series generation techniques described in the preceding tutorials.

Integer Table Method
MODEL Method
ROWNUM + a Big Table Method
CONNECT BY LEVEL Method
CUBE Method
Type Constructor Expression Method
Type Constructor + Cartesian Product Method
Pipelined Function Method

Each run generated a series of integers from 1 to 100. See the log file from these tests for more details.

Statistics

The following table shows database statistics where values for one method differ by more than 100 from another method.

Integer ROWNUM CONNECT BY Type Type Constructor PipelinedMETRIC_NAME Table MODEL + Big Table LEVEL CUBE Constructor + Cartesian Product Function---------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------------- ------------Elapsed Time (1/100 sec) 3 3 3 2 5 262 3 4session pga memory max 262,144 262,144 262,144 262,144 262,144 262,144 262,144 327,680session pga memory 196,608 65,536 131,072 131,072 196,608 65,536 131,072 327,680redo size 2,744 2,640 2,684 2,684 2,684 2,684 2,684 2,684sorts (rows) 2,071 2,071 2,071 2,072 2,199 2,071 2,081 2,076session uga memory 0 0 0 0 0 65,464 65,464 65,464

 

See Statistics Descriptions for a description of each metric.


Latch Gets

The following table shows total latch gets for each method.

Integer ROWNUM CONNECT BY Type Type Constructor PipelinedMETRIC_NAME Table MODEL + Big Table LEVEL CUBE Constructor + Cartesian Product Function---------------------------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------------- ------------cache buffers chains 206 163 221 163 163 1,208 181 249row cache objects 136 96 129 87 84 135 114 171library cache 92 77 86 80 77 113 89 179shared pool 70 25 26 24 25 85 29 45session idle bit 56 55 56 55 55 56 55 57library cache pin 50 43 48 43 43 57 45 89library cache lock 26 20 28 20 20 40 25 72enqueues 23 16 17 16 16 26 16 20enqueue hash chains 22 16 16 16 16 26 16 18shared pool simulator 12 9 7 9 9 16 10 17object queue header operation 8 12 12 12 12 305 12 15redo allocation 8 8 8 8 8 18 8 8cache buffers lru chain 7 6 6 6 6 396 6 7SQL memory manager workarea list latch 6 10 6 6 6 73 6 6session allocation 6 2 4 3 2 2 2 6sort extent pool 4 4 4 4 4 4 4 4session switching 4 4 4 4 4 4 4 4kks stats 4 2 2 2 2 4 2 2simulator hash latch 4 0 10 0 0 134 0 1simulator lru latch 4 0 10 0 0 130 0 1PL/SQL warning settings 3 3 3 3 3 3 3 3compile environment latch 2 1 2 1 1 1 1 3


object stats modification 1 1 2 1 1 1 1 2library cache lock allocation 1 1 1 1 1 1 1 2dml lock allocation 1 1 1 1 1 1 1 1FOB s.o list latch 1 0 0 1 0 0 0 1OS process 0 0 0 3 0 0 0 0messages 0 0 0 1 0 40 0 0channel operations parent latch 0 0 0 1 0 18 0 0channel handle pool latch 0 0 0 1 0 0 0 0OS process allocation 0 0 0 1 0 0 0 0process allocation 0 0 0 1 0 0 0 0process group creation 0 0 0 1 0 0 0 0checkpoint queue latch 0 0 0 0 0 269 0 0redo writing 0 0 0 0 0 13 0 0active checkpoint queue latch 0 0 0 0 0 13 0 0loader state object freelist 0 0 0 0 0 12 0 0virtual circuit buffers 0 0 0 0 0 9 0 0virtual circuit queues 0 0 0 0 0 7 0 0parallel query alloc buffer 0 0 0 0 0 4 0 0user lock 0 0 0 0 0 4 0 0session timer 0 0 0 0 0 3 0 0library cache load lock 0 0 0 0 0 2 0 2virtual circuits 0 0 0 0 0 2 0 0active service list 0 0 0 0 0 2 0 0library cache pin allocation 0 0 0 0 0 1 1 1resmgr:actses active list 0 0 0 0 0 1 0 0XDB unused session pool 0 0 0 0 0 1 0 0KMG MMAN ready and startup request latch 0 0 0 0 0 1 0 0resmgr:free threads list 0 0 0 0 0 1 0 0


------------ ------------ ------------ ------------ ------------ ------------ ------------------- ------------sum 757 575 709 575 559 3,242 632 986

 

Techniques that use a small number of latches scale better than techniques that use a large number of latches.

Warning: Results on your own systems with your own data will differ from these results. Results will even differ from one set of test runs to the next on the same machine. Run your own tests and average the results from multiple runs before making performance decisions.

Performance Comparison - Large Numbers

The following tables show performance metrics for one run each of six integer series generation techniques described in the preceding tutorials.

Integer Table Method
MODEL Method
ROWNUM + a Big Table Method
CONNECT BY LEVEL Method
Type Constructor + Cartesian Product Method
Pipelined Function Method

Each run generated a series of integers from 1 to 100,000. Note that the Type Constructor Expression Method technique was excluded from this comparison because it can only be used to generate up to 999 different values. The CUBE Method technique was excluded from this test because it failed to complete in under 10 minutes. See the log file from these tests for more details.

Statistics

The following table shows database statistics where values for one method differ by more than 100 from another method.

Integer ROWNUM CONNECT BY Type Constructor PipelinedMETRIC_NAME Table MODEL + Big Table LEVEL + Cartesian Product Function---------------------------------------- ------------ ------------ ------------ ------------ ------------------- ------------Elapsed Time (1/100 sec) 59 349 68 67 57 544session pga memory max 262,144 4,784,128 262,144 2,031,616 262,144 327,680session uga memory max 261,964 4,533,984 261,964 2,016,252 261,964 261,964


session pga memory 196,608 65,536 131,072 0 262,144 327,680session logical reads 6,879 45 6,927 45 78 111consistent gets 6,840 6 6,888 6 39 72consistent gets from cache 6,840 6 6,888 6 39 72no work - consistent read gets 6,822 0 6,860 0 0 5buffer is not pinned count 6,658 0 0 0 22 37DB time 29 311 27 37 22 34CPU used when call started 29 309 27 35 22 34CPU used by this session 29 309 27 35 21 34session uga memory 0 0 0 0 65,464 65,464

 

See Statistics Descriptions for a description of each metric.

Latch Gets

The following table shows total latch gets for each method.

Integer ROWNUM CONNECT BY Type Constructor PipelinedMETRIC_NAME Table MODEL + Big Table LEVEL + Cartesian Product Function---------------------------------------- ------------ ------------ ------------ ------------ ------------------- ------------cache buffers chains 13,946 836 13,927 163 196 258session idle bit 13,376 13,375 13,376 13,375 13,375 13,377simulator lru latch 424 172 461 0 0 1simulator hash latch 424 172 461 0 0 1row cache objects 139 96 129 87 132 171cache buffers lru chain 106 522 6 6 6 7library cache 92 83 85 79 119 20,172object queue header operation 79 365 12 12 15 18checkpoint queue latch 55 237 0 0 11 43library cache pin 50 49 48 43 54 13,417


shared pool 47 20 18 17 45 49library cache lock 30 24 30 22 34 79enqueues 21 96 17 16 16 100enqueue hash chains 20 97 16 16 16 99messages 10 28 4 0 8 48shared pool simulator 10 9 6 9 16 17redo allocation 9 12 8 8 8 12SQL memory manager workarea list latch 6 144 6 6 6 140channel operations parent latch 6 22 0 0 6 29session allocation 6 2 4 2 2 6session switching 4 4 4 4 4 4sort extent pool 4 4 4 4 4 4kks stats 4 2 2 2 2 2PL/SQL warning settings 3 3 3 3 3 3redo writing 2 7 0 0 1 8active checkpoint queue latch 2 3 0 0 1 1compile environment latch 2 1 2 1 1 3object stats modification 1 1 1 2 1 1library cache lock allocation 1 1 1 1 1 2dml lock allocation 1 1 1 1 1 1session timer 1 1 0 0 1 2KMG MMAN ready and startup request latch 1 1 0 0 1 1object queue header heap 1 0 0 0 0 0JS queue state obj latch 0 36 0 0 0 36active service list 0 10 0 0 0 10qmn task queue latch 0 4 0 0 0 0In memory undo latch 0 2 0 0 0 2OS process allocation 0 1 1 0 0 2


resmgr:actses active list 0 1 0 0 0 1resmgr:schema config 0 1 0 0 0 1kwqbsn:qsga 0 1 0 0 0 0Shared B-Tree 0 1 0 0 0 0library cache load lock 0 0 0 0 2 2library cache pin allocation 0 0 0 0 1 1mostly latch-free SCN 0 0 0 0 0 1undo global data 0 0 0 0 0 1lgwr LWN SCN 0 0 0 0 0 1Consistent RBA 0 0 0 0 0 1FOB s.o list latch 0 0 0 0 0 1 ------------ ------------ ------------ ------------ ------------------- ------------sum 28,883 16,447 28,633 13,879 14,089 48,136

 

Techniques that use a small number of latches scale better than techniques that use a large number of latches.

Warning: Results on your own systems with your own data will differ from these results. Results will even differ from one set of test runs to the next on the same machine. Run your own tests and average the results from multiple runs before making performance decisions.


Multiple Integer Series

The preceding topics showed how to generate a single series of integers. Sometimes many different integer series are required in the same query. For example, given a data table like this

KEY        QTY
--- ----------
a   (null)
b            0
c            1
d            2
e            3

we sometimes need queries that generate results like these

KEY        QTY INTEGER_VALUE
--- ---------- -------------
a   (null)     (null)

b            0 (null)

c            1             1

d            2             1
d            2             2

e            3             1
e            3             2
e            3             3

where a row like the one where KEY='d' needs the integer series "1,2" but the row where KEY='e' needs the series "1,2,3".

The subtopics in this section demonstrate various ways to accomplish this.

Join Method

Many of the single series solutions presented earlier can be easily adapted to generate multiple series with a simple join. Here is an example of how this is done using the Integer Table Method technique.

create table integers (
  integer_value integer primary key
) organization index;

insert into integers values ( 1 );
insert into integers values ( 2 );
insert into integers values ( 3 );
insert into integers values ( 4 );
insert into integers values ( 5 );


commit;

set null "(null)"

break on key duplicates skip 1

select key, qty, integer_value
from   t3
       left outer join integers
       on ( integers.integer_value <= t3.qty )
order by key, integer_value ;

KEY        QTY INTEGER_VALUE
--- ---------- -------------
a   (null)     (null)

b            0 (null)

c            1             1

d            2             1
d            2             2

e            3             1
e            3             2
e            3             3

With the Type Constructor Expression Method technique it would look like this.

select key, qty, column_value as integer_value
from   t3
       left outer join table( integer_table_type(1,2,3,4,5) ) integers
       on ( integers.column_value <= t3.qty )
order by key, integer_value;

KEY        QTY INTEGER_VALUE
--- ---------- -------------
a   (null)     (null)

b            0 (null)

c            1             1

d            2             1
d            2             2

e            3             1
e            3             2
e            3             3


In both queries note that we first generate more integers than required and then filter out the excess values via a join condition.
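The generate-then-filter idea can be sketched row by row. In this hedged Python illustration (the data and names are ours, mirroring the sample table T3) the join condition integer_value <= qty becomes a filter, and a NULL quantity matches no integers:

```python
integers = [1, 2, 3, 4, 5]  # pre-generated series, larger than any QTY

def series_for(qty):
    """Integers kept by the join condition integer_value <= qty;
    None plays the role of SQL NULL, which matches nothing."""
    if qty is None:
        return []
    return [i for i in integers if i <= qty]
```

series_for(3) gives [1, 2, 3], while series_for(0) and series_for(None) are empty, corresponding to the (null) rows the outer join produces for keys 'a' and 'b'.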

MODEL Method

With MODEL queries there is no need to use the join technique described in Join Method. We can generate multiple series by applying a FOR loop to each row in the base table with the aid of a PARTITION BY clause.

set null "(null)"

break on key duplicates skip 1

select key, key2, qty, integer_value
from   t3
model
  PARTITION BY ( KEY )
  dimension by ( 1 as key2 )
  measures     ( qty, cast( null as integer ) as integer_value )
  rules
  ( integer_value[ FOR KEY2 FROM 1 TO QTY[1] INCREMENT 1 ] = cv(key2) )
order by key, integer_value;

KEY       KEY2        QTY INTEGER_VALUE
--- ---------- ---------- -------------
a            1 (null)     (null)

b            1          0 (null)

c            1          1             1

d            1          2             1
d            2 (null)                 2

e            1          3             1
e            2 (null)                 2
e            3 (null)                 3

CONNECT BY LEVEL Method

With the CONNECT BY LEVEL approach there is also no need to use the Join Method. Multiple integer series can be created using a query like the following (the PATH column is included to illustrate how the query works):

set null "(null)"

break on key duplicates skip 1


column path format a10

select key, qty, level as integer_value,
       sys_connect_by_path( key, '/' ) as path
from t3
where qty >= 1
connect by KEY = PRIOR KEY
  and prior dbms_random.value is not null
  and level <= t3.qty
order by key, integer_value;

KEY        QTY INTEGER_VALUE PATH
--- ---------- ------------- ----------
c            1             1 /c

d            2             1 /d
d            2             2 /d/d

e            3             1 /e
e            3             2 /e/e
e            3             3 /e/e/e

or with this approach (only possible on version 10g or greater):

select key, qty, level as integer_value,
       sys_connect_by_path( key, '/' ) as path
from t3
where qty >= 1
connect by KEY = CONNECT_BY_ROOT KEY
  and level <= t3.qty
order by key, integer_value;

KEY        QTY INTEGER_VALUE PATH
--- ---------- ------------- ----------
c            1             1 /c

d            2             1 /d
d            2             2 /d/d

e            3             1 /e
e            3             2 /e/e
e            3             3 /e/e/e

Note that these approaches will not work for rows where no integer series is required, like the rows with KEY in ( 'a', 'b' ).
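Other databases express the same row-generation idea with a recursive common table expression instead of CONNECT BY. Purely as an analogy (SQLite via Python, not Oracle), the sketch below produces the same per-row series; like the CONNECT BY versions above, it omits the 'a' and 'b' rows:

```python
import sqlite3

# Analogy: a recursive CTE plays the role of CONNECT BY LEVEL, counting
# from 1 up to each row's qty. Table t3 mirrors the article's example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t3 ( key TEXT, qty INTEGER );
    INSERT INTO t3 VALUES ('a', NULL), ('b', 0), ('c', 1), ('d', 2), ('e', 3);
""")
rows = conn.execute("""
    WITH RECURSIVE series(key, qty, integer_value) AS (
        SELECT key, qty, 1 FROM t3 WHERE qty >= 1
        UNION ALL
        SELECT key, qty, integer_value + 1 FROM series
        WHERE integer_value < qty
    )
    SELECT key, qty, integer_value FROM series
    ORDER BY key, integer_value
""").fetchall()
print(rows)
```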


Gotchas

The CONNECT_BY_ROOT technique may not work without error in all cases and it may not work in Oracle versions beyond 10g. This is because it violates two restrictions documented at Hierarchical Query Operators:

1. "In a hierarchical query, one expression in the CONNECT BY condition must be qualified by the PRIOR operator."

2. "You cannot specify (CONNECT_BY_ROOT) in the START WITH condition or the CONNECT BY condition."

The fact that the query above contradicts the documentation yet works without error in 10g suggests a bug in either the documentation or the SQL engine.

On my system the following variation of the CONNECT_BY_ROOT query raised some rather severe ORA errors, casting further doubt on the technique's reliability (do not run this query on your own systems).

select key, qty, level as integer_value
from t3
start with qty >= 1
connect by KEY = CONNECT_BY_ROOT KEY and level <= t3.qty
order by key, integer_value;

ERROR at line 1:
ORA-03113: end-of-file on communication channel

ERROR:
ORA-03114: not connected to ORACLE

From the .trc file:

ORA-07445: exception encountered: core dump [ACCESS_VIOLATION]
[__VInfreq__msqopnws+2740] [PC:0x30E2580] [ADDR:0x2C] [UNABLE_TO_READ] []


Bulk Collect

Executing SQL statements in PL/SQL programs causes a context switch between the PL/SQL engine and the SQL engine. Too many context switches can degrade performance dramatically. To reduce the number of these context switches we can use a feature named bulk binding. Bulk binding lets us transfer rows between the SQL engine and the PL/SQL engine as collections. Bulk binding is available for SELECT, INSERT, DELETE and UPDATE statements.

BULK COLLECT is the bulk binding syntax for SELECT statements.

One thing I usually come across is that developers tend to use cursor FOR loops to process data. They declare a cursor, open it, fetch from it row by row in a loop and process each row as they fetch it.

Declare
  Cursor c1 is select column_list from table_name;
  Rec1 c1%rowtype;
Begin
  Open c1;
  Loop
    Fetch c1 into Rec1;
    Exit when c1%notfound;
    -- process rows ...
  End loop;
  Close c1;
End;

Here is a simple test case to compare the performance of fetching row by row and using bulk collect to fetch all rows into a collection.

SQL> create table t_all_objects as select * from all_objects;

Table created.

SQL> insert into t_all_objects select * from t_all_objects;

3332 rows created.

SQL> r
  1* insert into t_all_objects select * from t_all_objects

6664 rows created.

---replicated a couple of times

SQL> select count(*) from t_all_objects;

  COUNT(*)
----------
    213248

SQL> declare
  2    cursor c1 is select object_name from t_all_objects;
  3    rec1 c1%rowtype;
  4  begin
  5    open c1;
  6    loop
  7      fetch c1 into rec1;
  8      exit when c1%notfound;
  9      null;
 10    end loop;
 11  end;
 12  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:44.75

SQL> declare
  2    cursor c1 is select object_name from t_all_objects;
  3    type c1_type is table of c1%rowtype;
  4    rec1 c1_type;
  5  begin
  6    open c1;
  7    fetch c1 bulk collect into rec1;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:05.32

As can be clearly seen, bulk collecting the rows shows a huge performance improvement over fetching row by row.

The above method (which fetched all the rows at once) may not be applicable in all cases. When there are many rows to process, we can limit the number of rows bulk collected in each fetch, process those rows, and then fetch again. Otherwise process memory grows with every row fetched.
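The same batching pattern exists in most database APIs. As an analogy only (Python's sqlite3 module, not Oracle PL/SQL), fetchmany plays the role of BULK COLLECT ... LIMIT: memory stays bounded regardless of how many rows the cursor returns.

```python
import sqlite3

# Analogy to "fetch c1 bulk collect into rec1 limit 200": pull rows in
# fixed-size batches and process each batch before fetching the next.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_all_objects ( object_name TEXT )")
conn.executemany("INSERT INTO t_all_objects VALUES (?)",
                 [("obj%d" % i,) for i in range(1000)])

cur = conn.execute("SELECT object_name FROM t_all_objects")
total = 0
while True:
    batch = cur.fetchmany(200)   # at most 200 rows held in memory at once
    if not batch:
        break
    total += len(batch)          # "process" the batch here
print(total)
```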

SQL> declare
  2    cursor c1 is select object_name from t_all_objects;
  3    type c1_type is table of c1%rowtype;
  4    rec1 c1_type;
  5  begin
  6    open c1;
  7    loop
  8      fetch c1 bulk collect into rec1 limit 200;
  9      for i in 1..rec1.count loop
 10        null;
 11      end loop;
 12      exit when c1%notfound;
 13    end loop;
 14  end;
 15  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.07


Deciding When to Use Bulk Binds

PL/SQL code that uses bulk binds will be slightly more complicated and somewhat more prone to programmer bugs than code without bulk binds, so you need to ask yourself if the improved runtime performance will justify the expense. No universal rule exists to dictate when bulk binds are worthwhile and when they are not. However, the cost of adding a few lines of code is so slight that I would lean toward using bulk binds when in doubt.

A PL/SQL program that reads a dozen rows from a cursor will probably see no noticeable benefit from bulk binds. The same goes for a program that issues five or six UPDATE statements. However, a program that reads 1,000 rows from a cursor or performs that many similar UPDATE statements will most likely benefit from bulk binds.

If you have the luxury of time, you can test your code both with and without bulk binds. Running both versions of the code through SQL trace and TKPROF will yield reports from which you may derive a wealth of information.

A Simple Program With and Without Bulk Binds

In this section we will look at a simple program written both with and without bulk binds. We'll look at TKPROF reports that demonstrate the impact bulk binds can have. The discussion of the TKPROF reports will help you see how to interpret TKPROF output in order to assess the impact of bulk binds on your application.

Consider the following excerpts from a TKPROF report: ************************************************************************

DECLARE
  CURSOR c_orders IS
    SELECT order_id, currency_code, amount_local /* no bulk bind */
    FROM open_orders;
  v_amount_usd NUMBER;
BEGIN
  FOR r IN c_orders LOOP
    v_amount_usd := currency_convert (r.amount_local, r.currency_code);
    UPDATE open_orders /* no bulk bind */
    SET amount_usd = v_amount_usd
    WHERE order_id = r.order_id;
  END LOOP;
  COMMIT;
END;

call     count       cpu    elapsed     disk    query  current     rows
------- ------  -------- ---------- -------- -------- -------- --------
Parse        1      0.05       0.04        0        0        1        0
Execute      1     10.55      11.40        0        0        0        1
Fetch        0      0.00       0.00        0        0        0        0
------- ------  -------- ---------- -------- -------- -------- --------
total        2     10.60      11.45        0        0        1        1


************************************************************************

SELECT order_id, currency_code, amount_local /* no bulk bind */ FROM open_orders

call     count       cpu    elapsed     disk    query  current     rows
------- ------  -------- ---------- -------- -------- -------- --------
Parse        1      0.00       0.00        0        0        0        0
Execute      1      0.00       0.00        0        0        0        0
Fetch    30287      1.08       1.10        0    30393        0    30286
------- ------  -------- ---------- -------- -------- -------- --------
total    30289      1.08       1.10        0    30393        0    30286

************************************************************************

UPDATE open_orders /* no bulk bind */ SET amount_usd = :b2 WHERE order_id = :b1

call     count       cpu    elapsed     disk    query  current     rows
------- ------  -------- ---------- -------- -------- -------- --------
Parse        1      0.00       0.00        0        0        0        0
Execute  30286      7.19       7.32        1    60576    31022    30286
Fetch        0      0.00       0.00        0        0        0        0
------- ------  -------- ---------- -------- -------- -------- --------
total    30287      7.19       7.33        1    60576    31022    30286

As you can see, this is a very simple program that does not use bulk binds. (The code borders on being silly; please recognize it is for illustrative purposes only.) The PL/SQL engine used 10.55 CPU seconds to run this code (this figure does not include CPU time used by the SQL engine). There were 30,287 fetch calls against the cursor, requiring 30,393 logical reads and 1.08 CPU seconds. The UPDATE statement was executed 30,286 times, using 7.19 CPU seconds.

Now consider the following excerpts from another TKPROF report: ************************************************************************

DECLARE
  CURSOR c_orders IS
    SELECT order_id, currency_code, amount_local /* bulk bind */
    FROM open_orders;
  TYPE t_num_array IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
  TYPE t_char_array IS TABLE OF VARCHAR2(10) INDEX BY BINARY_INTEGER;
  v_order_ids      t_num_array;
  v_currency_codes t_char_array;
  v_amounts_local  t_num_array;
  v_amounts_usd    t_num_array;
  v_row_count      NUMBER := 0;
BEGIN
  OPEN c_orders;
  LOOP
    FETCH c_orders BULK COLLECT
    INTO v_order_ids, v_currency_codes, v_amounts_local
    LIMIT 100;
    EXIT WHEN v_row_count = c_orders%ROWCOUNT;
    v_row_count := c_orders%ROWCOUNT;
    FOR i IN 1..v_order_ids.count LOOP
      v_amounts_usd(i) := currency_convert (v_amounts_local(i), v_currency_codes(i));
    END LOOP;
    FORALL i IN 1..v_order_ids.count
      UPDATE open_orders /* bulk bind */
      SET amount_usd = v_amounts_usd(i)
      WHERE order_id = v_order_ids(i);
  END LOOP;
  CLOSE c_orders;
  COMMIT;
END;

call     count       cpu    elapsed     disk    query  current     rows
------- ------  -------- ---------- -------- -------- -------- --------
Parse        1      0.03       0.03        0        0        0        0
Execute      1      0.60       0.62        0        0        0        1
Fetch        0      0.00       0.00        0        0        0        0
------- ------  -------- ---------- -------- -------- -------- --------
total        2      0.63       0.66        0        0        0        1

************************************************************************

SELECT order_id, currency_code, amount_local /* bulk bind */ FROM open_orders

call     count       cpu    elapsed     disk    query  current     rows
------- ------  -------- ---------- -------- -------- -------- --------
Parse        1      0.00       0.00        0        0        0        0
Execute      1      0.00       0.00        0        0        0        0
Fetch      303      0.48       0.59        0     4815        0    30286
------- ------  -------- ---------- -------- -------- -------- --------
total      305      0.48       0.59        0     4815        0    30286

************************************************************************

UPDATE open_orders /* bulk bind */ SET amount_usd = :b1 WHERE order_id = :b2

call     count       cpu    elapsed     disk    query  current     rows
------- ------  -------- ---------- -------- -------- -------- --------
Parse        1      0.00       0.00        0        0        0        0
Execute    303      3.75       8.38        0    30895    31021    30286
Fetch        0      0.00       0.00        0        0        0        0
------- ------  -------- ---------- -------- -------- -------- --------
total      304      3.75       8.38        0    30895    31021    30286

This code uses bulk binds to do the same thing as the first code sample, but works with data 100 rows at a time instead of one row at a time. We can see that the CPU time used by the PL/SQL engine dropped from 10.55 seconds to 0.60 seconds. There were only 303 fetch calls against the cursor instead of 30,287, bringing logical reads down from 30,393 to 4,815 and CPU time down from 1.08 seconds to 0.48 seconds. The UPDATE statement was executed only 303 times instead of 30,286, reducing CPU time from 7.19 seconds to 3.75 seconds.
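The FORALL pattern (one batched DML call instead of one UPDATE per row) has a close analogue in most database APIs. As an illustration only (Python's sqlite3 module, not Oracle, with currency_convert rewritten as a hypothetical stand-in for the article's PL/SQL function):

```python
import sqlite3

# Analogy to FORALL: compute all the bind values first, then send the
# whole batch of UPDATEs in one executemany call instead of row by row.
def currency_convert(amount_local, currency_code):
    rates = {"EUR": 1.1, "USD": 1.0}   # hypothetical conversion rates
    return amount_local * rates[currency_code]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE open_orders "
             "( order_id INTEGER, currency_code TEXT,"
             "  amount_local REAL, amount_usd REAL )")
conn.executemany("INSERT INTO open_orders VALUES (?, ?, ?, NULL)",
                 [(1, "EUR", 100.0), (2, "USD", 50.0)])

rows = conn.execute(
    "SELECT order_id, currency_code, amount_local FROM open_orders").fetchall()
params = [(currency_convert(amt, ccy), oid) for oid, ccy, amt in rows]
conn.executemany(   # one batched call, like FORALL ... UPDATE
    "UPDATE open_orders SET amount_usd = ? WHERE order_id = ?", params)

usd = dict(conn.execute("SELECT order_id, amount_usd FROM open_orders"))
print(usd)
```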


In this example it would appear that bulk binds were definitely worthwhile – CPU time was reduced by about 75%, elapsed time by 50%, and logical reads by 50%. Although bulk binds are indeed beneficial here, the benefit is not truly as rosy as it appears in these TKPROF reports. The SQL trace facility imparts an overhead that is proportional to the number of parse, execute, and fetch calls to the SQL engine. Since bulk binds reduce the number of SQL calls, SQL trace adds much less overhead to code that uses bulk binds. While these TKPROF reports suggest that in this example bulk binds shaved about 50% off the elapsed time, the savings were about 25% when SQL trace was not enabled. This is still a significant savings. Thus using bulk binds in your PL/SQL programs can certainly be worth the effort. Just remember that SQL trace can inflate the perceived benefit.

Conclusion

Bulk binds allow PL/SQL programs to interact more efficiently with the SQL engine built into Oracle, enabling your PL/SQL programs to use less CPU time and run faster. The Oracle Call Interface has supported array processing for 15 years or more, and the increased efficiency it brings is well understood. It is nice to see this benefit available to PL/SQL programmers as well. PL/SQL bulk binds are not hard to implement, and can offer significant performance improvements for certain types of programs.