
Datawarehouse Concept

Jun 04, 2018


8/13/2019 Datawarehouse Concept

http://slidepdf.com/reader/full/datawarehouse-concept 1/260


Oracle9i Data Warehousing Guide

Release 2 (9.2)

Part Number A96520-01


1 Data Warehousing Concepts

This chapter provides an overview of the Oracle data warehousing implementation. It includes:

• What is a Data Warehouse?

• Data Warehouse Architectures

Note that this book is meant as a supplement to standard texts about data warehousing. This book focuses on Oracle-specific material and does not reproduce in detail material of a general nature. Two standard texts are:

• The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)

• Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)

What is a Data Warehouse?

A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.

In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

See Also:

Chapter 10, "Overview of Extraction, Transformation, and Loading"


A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William Inmon:

• Subject Oriented

• Integrated

• Nonvolatile

• Time Variant

Subject Oriented

Data warehouses are designed to help you analyze data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.

Integrated

Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.

Nonvolatile

Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.

Time Variant

In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.

Contrasting OLTP and Data Warehousing Environments

Figure 1-1 illustrates key differences between an OLTP system and a data warehouse.

Figure 1-1 Contrasting OLTP and Data Warehousing Environments


One major difference between the types of system is that data warehouses are not usually in third normal form (3NF), a type of data normalization common in OLTP environments.

Data warehouses and OLTP systems have very different requirements. Here are some examples of differences between typical data warehouses and OLTP systems:

• Workload

Data warehouses are designed to accommodate ad hoc queries. You might not know the workload of your data warehouse in advance, so a data warehouse should be optimized to perform well for a wide variety of possible query operations.

OLTP systems support only predefined operations. Your applications might be specifically tuned or designed to support only these operations.

• Data modifications

A data warehouse is updated on a regular basis by the ETL process (run nightly or weekly) using bulk data modification techniques. The end users of a data warehouse do not directly update the data warehouse.

In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date, and reflects the current state of each business transaction.

• Schema design


Data warehouses often use denormalized or partially denormalized schemas (such as a star schema) to optimize query performance.

OLTP systems often use fully normalized schemas to optimize update/insert/delete performance, and to guarantee data consistency.

• Typical operations

A typical data warehouse query scans thousands or millions of rows. For example, "Find the total sales for all customers last month."

A typical OLTP operation accesses only a handful of records. For example, "Retrieve the current order for this customer."

• Historical data

Data warehouses usually store many months or years of data. This is to support historical analysis.

OLTP systems usually store data from only a few weeks or months. The OLTP system stores only historical data as needed to successfully meet the requirements of the current transaction.

Data Warehouse Architectures

Data warehouses and their architectures vary depending upon the specifics of an organization's situation. Three common architectures are:

• Data Warehouse Architecture (Basic)

• Data Warehouse Architecture (with a Staging Area)

• Data Warehouse Architecture (with a Staging Area and Data Marts)

Data Warehouse Architecture (Basic)

Figure 1-2 shows a simple architecture for a data warehouse. End users directly access data derived from several source systems through the data warehouse.

Figure 1-2 Architecture of a Data Warehouse


In Figure 1-2, the metadata and raw data of a traditional OLTP system is present, as is an additional type of data, summary data. Summaries are very valuable in data warehouses because they pre-compute long operations in advance. For example, a typical data warehouse query is to retrieve something like August sales. A summary in Oracle is called a materialized view.
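The idea of a summary can be sketched in a few lines. The example below is only an illustration, assuming Python's built-in sqlite3 module as a stand-in for an Oracle database; the table names sales and sales_by_month and the sample rows are hypothetical. SQLite has no materialized views, so an ordinary table plays the role of the pre-computed summary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_date TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('2002-07-30', 100.0),
        ('2002-08-01', 250.0),
        ('2002-08-15', 125.5),
        ('2002-09-02', 80.0);
    -- Summary table: one pre-computed row per month (hypothetical name).
    CREATE TABLE sales_by_month AS
        SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS total
        FROM sales
        GROUP BY substr(sale_date, 1, 7);
""")

# Answering "August sales" from the detail rows scans every matching sale.
august_from_detail = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE sale_date LIKE '2002-08-%'"
).fetchone()[0]

# Answering it from the summary reads a single pre-computed row.
august_from_summary = conn.execute(
    "SELECT total FROM sales_by_month WHERE month = '2002-08'"
).fetchone()[0]

print(august_from_detail, august_from_summary)
```

Both queries return the same total; the summary simply moves the aggregation work to load time.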

Data Warehouse Architecture (with a Staging Area)

In Figure 1-2, you need to clean and process your operational data before putting it into the warehouse. You can do this programmatically, although most data warehouses use a staging area instead. A staging area simplifies building summaries and general warehouse management. Figure 1-3 illustrates this typical architecture.

Figure 1-3 Architecture of a Data Warehouse with a Staging Area


Data Warehouse Architecture (with a Staging Area and Data Marts)

Although the architecture in Figure 1-3 is quite common, you may want to customize your warehouse's architecture for different groups within your organization. You can do this by adding data marts, which are systems designed for a particular line of business. Figure 1-4 illustrates an example where purchasing, sales, and inventories are separated. In this example, a financial analyst might want to analyze historical data for purchases and sales.

Figure 1-4 Architecture of a Data Warehouse with a Staging Area and Data Marts


Note: 

Data marts are an important part of many warehouses, but they are not the focus of this book.

See Also:

Data Mart Suites documentation for further information regarding data marts

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


Part II Logical Design

This section deals with the issues in logical design in a data warehouse.

It contains the following chapter:


• Logical Design in Data Warehouses

2 Logical Design in Data Warehouses

This chapter tells you how to design a data warehousing environment and includes the following topics:

• Logical Versus Physical Design in Data Warehouses

• Creating a Logical Design

• Data Warehousing Schemas

• Data Warehousing Objects

Logical Versus Physical Design in Data Warehouses

Your organization has decided to build a data warehouse. You have defined the business requirements and agreed upon the scope of your application, and created a conceptual design. Now you need to translate your requirements into a system deliverable. To do so, you create the logical and physical design for the data warehouse. You then define:

• The specific data content

• Relationships within and between groups of data

• The system environment supporting your data warehouse

• The data transformations required

• The frequency with which data is refreshed

The logical design is more conceptual and abstract than the physical design. In the logical design, you look at the logical relationships among the objects. In the physical design, you look at the most effective way of storing and retrieving the objects as well as handling them from a transportation and backup/recovery perspective.

Orient your design toward the needs of the end users. End users typically want to perform analysis and look at aggregated data, rather than at individual transactions. However, end users might not know what they need until they see it. In addition, a well-planned design allows for growth and changes as the needs of users change and evolve.

By beginning with the logical design, you focus on the information requirements and save the implementation details for later.


Creating a Logical Design

A logical design is conceptual and abstract. You do not deal with the physical implementation details yet. You deal only with defining the types of information that you need.

One technique you can use to model your organization's logical information requirements is entity-relationship modeling. Entity-relationship modeling involves identifying the things of importance (entities), the properties of these things (attributes), and how they are related to one another (relationships).

The process of logical design involves arranging data into a series of logical relationships called entities and attributes. An entity represents a chunk of information. In relational databases, an entity often maps to a table. An attribute is a component of an entity that helps define the uniqueness of the entity. In relational databases, an attribute maps to a column.

To be sure that your data is consistent, you need to use unique identifiers. A unique identifier is something you add to tables so that you can differentiate between the same item when it appears in different places. In a physical design, this is usually a primary key.

While entity-relationship diagramming has traditionally been associated with highly normalized models such as OLTP applications, the technique is still useful for data warehouse design in the form of dimensional modeling. In dimensional modeling, instead of seeking to discover atomic units of information (such as entities and attributes) and all of the relationships between them, you identify which information belongs to a central fact table and which information belongs to its associated dimension tables. You identify business subjects or fields of data, define relationships between business subjects, and name the attributes for each subject.

See Also: 

Chapter 9, "Dimensions" for further information regarding dimensions

Your logical design should result in (1) a set of entities and attributes corresponding to fact tables and dimension tables and (2) a model of operational data from your source into subject-oriented information in your target data warehouse schema.

You can create the logical design using a pen and paper, or you can use a design tool such as Oracle Warehouse Builder (specifically designed to support modeling the ETL process) or Oracle Designer (a general purpose modeling tool).

See Also:

Oracle Designer and Oracle Warehouse Builder documentation sets


Data Warehousing Schemas

A schema is a collection of database objects, including tables, views, indexes, and synonyms. You can arrange schema objects in the schema models designed for data warehousing in a variety of ways. Most data warehouses use a dimensional model.

The model of your source data and the requirements of your users help you design the data warehouse schema. You can sometimes get the source model from your company's enterprise data model and reverse-engineer the logical data model for the data warehouse from this. The physical implementation of the logical data warehouse model may require some changes to adapt it to your system parameters: size of machine, number of users, storage capacity, type of network, and software.

Star Schemas

The star schema is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of one or more fact tables and the points of the star are the dimension tables, as shown in Figure 2-1.

Figure 2-1 Star Schema


The most natural way to model a data warehouse is as a star schema: only one join establishes the relationship between the fact table and any one of the dimension tables.

A star schema optimizes performance by keeping queries simple and providing fast response time. All the information about each level is stored in one row.

Note: 

Oracle Corporation recommends that you choose a star schema unless you have a clear reason not to.
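The single-join property of a star schema can be illustrated with a small sketch. This is not Oracle-specific: it assumes Python's built-in sqlite3 module, and the tables customers, products, and sales with their sample rows are hypothetical.

```python
import sqlite3

star = sqlite3.connect(":memory:")
star.executescript("""
    -- Dimension tables: descriptive attributes, one row per member.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, name TEXT, category TEXT);
    -- Fact table: numeric measure plus one foreign key per dimension.
    CREATE TABLE sales (
        customer_id INTEGER REFERENCES customers(customer_id),
        product_id  INTEGER REFERENCES products(product_id),
        amount      REAL
    );
    INSERT INTO customers VALUES (1, 'Ada', 'Boston'), (2, 'Lin', 'Austin');
    INSERT INTO products  VALUES (10, 'Widget', 'Hardware'), (20, 'Manual', 'Books');
    INSERT INTO sales VALUES (1, 10, 30.0), (1, 20, 12.0), (2, 10, 45.0);
""")

# One join connects the fact table to the customers dimension.
sales_by_city = star.execute("""
    SELECT c.city, SUM(s.amount)
    FROM sales s
    JOIN customers c ON c.customer_id = s.customer_id
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()

print(sales_by_city)
```

Adding the products dimension to the query would take exactly one more join, which is what keeps star queries simple.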


Other Schemas

Some schemas in data warehousing environments use third normal form rather than star schemas. Another schema that is sometimes useful is the snowflake schema, which is a star schema with normalized dimensions in a tree structure.

See Also:

Chapter 17, "Schema Modeling Techniques" for further information regarding star and snowflake schemas in data warehouses and Oracle9i Database Concepts for further conceptual material

Data Warehousing Objects

Fact tables and dimension tables are the two types of objects commonly used in dimensional data warehouse schemas.

Fact tables are the large tables in your warehouse schema that store business measurements. Fact tables typically contain facts and foreign keys to the dimension tables. Fact tables represent data, usually numeric and additive, that can be analyzed and examined. Examples include sales, cost, and profit.

Dimension tables, also known as lookup or reference tables, contain the relatively static data in the warehouse. Dimension tables store the information you normally use to constrain queries. Dimension tables are usually textual and descriptive and you can use them as the row headers of the result set. Examples are customers or products.

Fact Tables

A fact table typically has two types of columns: those that contain numeric facts (often called measurements), and those that are foreign keys to dimension tables. A fact table contains either detail-level facts or facts that have been aggregated. Fact tables that contain aggregated facts are often called summary tables. A fact table usually contains facts with the same level of aggregation. Though most facts are additive, they can also be semi-additive or non-additive. Additive facts can be aggregated by simple arithmetical addition. A common example of this is sales. Non-additive facts cannot be added at all. An example of this is averages. Semi-additive facts can be aggregated along some of the dimensions and not along others. An example of this is inventory levels, where you cannot tell what a level means simply by looking at it.
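The three kinds of facts can be made concrete with a few lines of arithmetic. The sketch below uses made-up rows and plain Python; the store names, days, and quantities are assumptions chosen only to show where addition is and is not meaningful.

```python
from statistics import mean

# Hypothetical rows: (store, day, units_sold, units_in_stock_at_end_of_day)
rows = [
    ("north", "mon", 10, 100),
    ("north", "tue", 20, 80),
    ("south", "mon", 5, 50),
    ("south", "tue", 15, 35),
]

# Additive: units sold can be summed across every dimension.
total_sold = sum(r[2] for r in rows)

# Semi-additive: stock can be summed across stores for a single day...
stock_tuesday = sum(r[3] for r in rows if r[1] == "tue")
# ...but summing it across days counts the same goods more than once.
misleading_stock = sum(r[3] for r in rows)

# Non-additive: averages cannot be combined by adding or averaging them.
north_sales = [10, 20, 30]
south_sales = [100]
average_of_averages = mean([mean(north_sales), mean(south_sales)])
true_average = mean(north_sales + south_sales)

print(total_sold, stock_tuesday, misleading_stock)
print(average_of_averages, true_average)
```

The average of the two store averages differs from the true average because the stores contribute different numbers of sales, which is why an average has to be recomputed from its additive components (sum and count).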

Creating a New Fact Table

You must define a fact table for each star schema. From a modeling standpoint, the primary key of the fact table is usually a composite key that is made up of all of its foreign keys.


Dimension Tables

A dimension is a structure, often composed of one or more hierarchies, that categorizes data. Dimensional attributes help to describe the dimensional value. They are normally descriptive, textual values. Several distinct dimensions, combined with facts, enable you to answer business questions. Commonly used dimensions are customers, products, and time.

Dimension data is typically collected at the lowest level of detail and then aggregated into higher level totals that are more useful for analysis. These natural rollups or aggregations within a dimension table are called hierarchies.

Hierarchies

Hierarchies are logical structures that use ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level. A hierarchy can also be used to define a navigational drill path and to establish a family structure.

Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels. A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies: one for product categories and one for product suppliers.

Dimension hierarchies also group levels from general to granular. Query tools use hierarchies to enable you to drill down into your data to view different levels of granularity. This is one of the key benefits of a data warehouse.

When designing hierarchies, you must consider the relationships in business structures, for example, a divisional multilevel sales organization.

Hierarchies impose a family structure on dimension values. For a particular level value, a value at the next higher level is its parent, and values at the next lower level are its children. These familial relationships enable analysts to access data quickly.
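The parent and child relationships described above can be sketched as a simple lookup structure. This is an illustration only: the PARENT mapping and the helper names children and ancestors are hypothetical, and a real warehouse would hold these relationships in dimension tables.

```python
# Child value -> parent value for a hypothetical time hierarchy
# (month -> quarter -> year).
PARENT = {
    "2002-01": "2002-Q1", "2002-02": "2002-Q1", "2002-03": "2002-Q1",
    "2002-04": "2002-Q2",
    "2002-Q1": "2002", "2002-Q2": "2002",
}

def children(value):
    """Drill down: values one level below the given value."""
    return sorted(c for c, p in PARENT.items() if p == value)

def ancestors(value):
    """Roll up: walk from a value to the root, nearest parent first."""
    path = []
    while value in PARENT:
        value = PARENT[value]
        path.append(value)
    return path

print(children("2002-Q1"))
print(ancestors("2002-02"))
```

A query tool that drills down from 2002 to its quarters and then to months is, in effect, calling children repeatedly along this path.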

Levels

A level represents a position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the month, quarter, and year levels. Levels range from general to specific, with the root level as the highest or most general level. The levels in a dimension are organized into one or more hierarchies.

Level Relationships


Level relationships specify top-to-bottom ordering of levels from most general (the root) to most specific information. They define the parent-child relationship between the levels in a hierarchy.

Hierarchies are also essential components in enabling more complex rewrites. For example, the database can aggregate existing quarterly sales revenue into a yearly aggregation when the dimensional dependencies between quarter and year are known.
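The quarterly-to-yearly case can be sketched as follows. This is not how Oracle's query rewrite is implemented; it only shows, under the assumption of a known quarter-to-year mapping, why an existing quarterly summary is enough to answer a yearly question. The names QUARTER_TO_YEAR, quarterly_revenue, and roll_up are hypothetical.

```python
from collections import defaultdict

# Known dependency: each quarter belongs to exactly one year.
QUARTER_TO_YEAR = {"2001-Q4": 2001, "2002-Q1": 2002, "2002-Q2": 2002}

# An existing summary at the quarter level (made-up figures).
quarterly_revenue = {"2001-Q4": 400.0, "2002-Q1": 150.0, "2002-Q2": 250.0}

def roll_up(summary, child_to_parent):
    """Aggregate a lower-level summary to the next level up."""
    totals = defaultdict(float)
    for child, value in summary.items():
        totals[child_to_parent[child]] += value
    return dict(totals)

yearly_revenue = roll_up(quarterly_revenue, QUARTER_TO_YEAR)
print(yearly_revenue)
```

The detail rows are never read: three summary values are enough to produce the yearly totals.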

Typical Dimension Hierarchy

Figure 2-2 illustrates a dimension hierarchy based on customers.

Figure 2-2 Typical Levels in a Dimension Hierarchy


See Also: 

Chapter 9, "Dimensions" and Chapter 22, "Query Rewrite" for further information regarding hierarchies

Unique Identifiers

Unique identifiers are specified for one distinct record in a dimension table. Artificial unique identifiers are often used to avoid the potential problem of unique identifiers changing. Unique identifiers are represented with the # character. For example, #customer_id.

Relationships

Relationships guarantee business integrity. An example is that if a business sells something, there is obviously a customer and a product. Designing a relationship between the sales information in the fact table and the dimension tables products and customers enforces the business rules in databases.


Example of Data Warehousing Objects and Their Relationships

Figure 2-3 illustrates a common example of a sales fact table and dimension tables customers, products, promotions, times, and channels.

Figure 2-3 Typical Data Warehousing Objects


Part III Physical Design

This section deals with the physical design of a data warehouse.

It contains the following chapters:

• Physical Design in Data Warehouses

• Hardware and I/O Considerations in Data Warehouses

• Parallelism and Partitioning in Data Warehouses

• Indexes

• Integrity Constraints

• Materialized Views

• Dimensions


3 Physical Design in Data Warehouses

This chapter describes the physical design of a data warehousing environment, and includes the following topics:

• Moving from Logical to Physical Design

• Physical Design

Moving from Logical to Physical Design

Logical design is what you draw with a pen and paper or design with Oracle Warehouse Builder or Designer before building your warehouse. Physical design is the creation of the database with SQL statements.

During the physical design process, you convert the data gathered during the logical design phase into a description of the physical database structure. Physical design decisions are mainly driven by query performance and database maintenance aspects. For example, choosing a partitioning strategy that meets common query requirements enables Oracle to take advantage of partition pruning, a way of narrowing a search before performing it.

See Also:

• Chapter 5, "Parallelism and Partitioning in Data Warehouses" for further information regarding partitioning


• Oracle9i Database Concepts for further conceptual material regarding all design matters

Physical Design

During the logical design phase, you defined a model for your data warehouse consisting of entities, attributes, and relationships. The entities are linked together using relationships. Attributes are used to describe the entities. The unique identifier (UID) distinguishes between one instance of an entity and another.

Figure 3-1 offers you a graphical way of looking at the different ways of thinking about logical and physical designs.

Figure 3-1 Logical Design Compared with Physical Design


During the physical design process, you translate the expected schemas into actual database structures. At this time, you have to map:

• Entities to tables

• Relationships to foreign key constraints

• Attributes to columns

• Primary unique identifiers to primary key constraints

• Unique identifiers to unique key constraints
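The mapping in the list above can be sketched with a small schema. The example assumes Python's built-in sqlite3 module rather than Oracle, and the customers and orders tables, their columns, and the helper is_rejected are hypothetical. Each logical element appears as its physical counterpart: entities as tables, attributes as columns, the primary unique identifier as a primary key, another unique identifier as a unique constraint, and the relationship as a foreign key.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
db.executescript("""
    CREATE TABLE customers (                 -- entity -> table
        customer_id INTEGER PRIMARY KEY,     -- primary UID -> primary key
        email       TEXT NOT NULL UNIQUE,    -- UID -> unique key constraint
        name        TEXT                     -- attribute -> column
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customers(customer_id)  -- relationship -> foreign key
    );
    INSERT INTO customers VALUES (1, 'ada@example.com', 'Ada');
    INSERT INTO orders VALUES (100, 1);
""")

def is_rejected(statement):
    """Return True when the database refuses the statement on integrity grounds."""
    try:
        db.execute(statement)
        return False
    except sqlite3.IntegrityError:
        return True

print(is_rejected("INSERT INTO orders VALUES (101, 99)"))  # no such customer
```

The same statement against a known customer is accepted, so the constraint encodes the relationship rather than blocking inserts in general.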

Physical Design Structures

Once you have converted your logical design to a physical one, you will need to create some or all of the following structures:


• Tablespaces

• Tables and Partitioned Tables

• Views

• Integrity Constraints

• Dimensions

Some of these structures require disk space. Others exist only in the data dictionary. Additionally, the following structures may be created for performance improvement:

• Indexes and Partitioned Indexes

• Materialized Views

Tablespaces

A tablespace consists of one or more datafiles, which are physical structures within the operating system you are using. A datafile is associated with only one tablespace. From a design perspective, tablespaces are containers for physical design structures.

Objects with different characteristics should be placed in separate tablespaces. For example, tables should be separated from their indexes and small tables should be separated from large tables. Tablespaces should also represent logical business units if possible. Because a tablespace is the coarsest granularity for backup and recovery or the transportable tablespaces mechanism, the logical business design affects availability and maintenance operations.

See Also:

Chapter 4, "Hardware and I/O Considerations in Data Warehouses" for further information regarding tablespaces

Tables and Partitioned Tables

Tables are the basic unit of data storage. They are the container for the expected amount of raw data in your data warehouse.

Using partitioned tables instead of nonpartitioned ones addresses the key problem of supporting very large data volumes by allowing you to decompose them into smaller and more manageable pieces. The main design criterion for partitioning is manageability, though you will also see performance benefits in most cases because of partition pruning or intelligent parallel processing. For example, you might choose a partitioning strategy based on a sales transaction date and a monthly granularity. If you have four years' worth of data, you can delete a month's data as it becomes older than four years with a single, quick DDL statement and load new data while only affecting 1/48th of the complete table. Business questions regarding the last quarter will only affect three months, which is equivalent to three partitions, or 3/48ths of the total volume.


Partitioning large tables improves performance because each partitioned piece is more manageable. Typically, you partition based on transaction dates in a data warehouse. For example, each month, one month's worth of data can be assigned its own partition.
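The arithmetic behind monthly partitioning can be sketched in plain Python. This is only a model of the idea, not of Oracle's implementation: a dictionary with one entry per month stands in for a partitioned table, and the names month_keys, partitions, and prune are hypothetical.

```python
from fractions import Fraction

def month_keys(start_year, years):
    """Return 'YYYY-MM' keys for the given number of whole years."""
    return [f"{y}-{m:02d}"
            for y in range(start_year, start_year + years)
            for m in range(1, 13)]

# One partition (here: a list of rows) per month, four years of data.
partitions = {key: [] for key in month_keys(1999, 4)}

def prune(parts, wanted_months):
    """Return only the partitions a query actually needs to read."""
    return {k: parts[k] for k in wanted_months if k in parts}

# A question about the last quarter touches three partitions.
last_quarter = prune(partitions, ["2002-10", "2002-11", "2002-12"])
fraction_scanned = Fraction(len(last_quarter), len(partitions))

# Rolling window: drop the oldest month and add the newest one.
oldest = min(partitions)
del partitions[oldest]
partitions["2003-01"] = []

print(fraction_scanned, oldest, len(partitions))
```

Dropping the oldest key is a single operation that leaves the other 47 months untouched, which mirrors the manageability argument made above.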

Data Segment Compression

You can save disk space by compressing heap-organized tables. A typical type of heap-organized table you should consider for data segment compression is partitioned tables.

To reduce disk use and memory use (specifically, the buffer cache), you can store tables and partitioned tables in a compressed format inside the database. This often leads to a better scaleup for read-only operations. Data segment compression can also speed up query execution. There is, however, a cost in CPU overhead.

Data segment compression should be used with highly redundant data, such as tables with many foreign keys. You should avoid compressing tables with much update or other DML activity. Although compressed tables or partitions are updatable, there is some overhead in updating these tables, and high update activity may work against compression by causing some space to be wasted.
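Why redundant data compresses well can be seen with any general-purpose compressor. The sketch below uses Python's zlib module, which is not the algorithm Oracle uses for data segment compression; it only illustrates the principle. The row layouts and the helper compressed_ratio are assumptions.

```python
import random
import zlib

def compressed_ratio(rows):
    """Compressed size divided by raw size; smaller means better compression."""
    raw = "\n".join(rows).encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)

random.seed(0)

# Highly redundant: a handful of foreign key values repeated many times.
redundant = [f"{random.choice([101, 102, 103])},{random.choice([7, 8])}"
             for _ in range(5000)]

# Much less redundant: nearly every value is distinct.
varied = [f"{random.getrandbits(64)},{random.getrandbits(64)}"
          for _ in range(5000)]

print(compressed_ratio(redundant) < compressed_ratio(varied))
```

Rows built from a few repeating key values shrink far more than rows of mostly distinct values, which is the reason fact tables with many foreign keys are good candidates.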

See Also: 

Chapter 5, "Parallelism and Partitioning in Data Warehouses" and Chapter 14, "Maintaining the Data Warehouse" for information regarding data segment compression and partitioned tables

Views

A view is a tailored presentation of the data contained in one or more tables or other views. A view takes the output of a query and treats it as a table. Views do not require any space in the database.
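That a view stores a query rather than data can be shown with a short sketch, again assuming Python's built-in sqlite3 module in place of Oracle; the sales table and the east_sales view are hypothetical.

```python
import sqlite3

vdb = sqlite3.connect(":memory:")
vdb.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('east', 10.0), ('west', 20.0);
    -- The view is a stored query; it holds no rows of its own.
    CREATE VIEW east_sales AS
        SELECT amount FROM sales WHERE region = 'east';
""")

before = vdb.execute("SELECT SUM(amount) FROM east_sales").fetchone()[0]

# A new base-table row is visible through the view immediately.
vdb.execute("INSERT INTO sales VALUES ('east', 5.0)")
after = vdb.execute("SELECT SUM(amount) FROM east_sales").fetchone()[0]

print(before, after)
```

Because the view is re-evaluated on every query, nothing needs to be refreshed when the base table changes.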

See Also: 

Oracle9i Database Concepts

Integrity Constraints

Integrity constraints are used to enforce business rules associated with your database and to prevent having invalid information in the tables. Integrity constraints in data warehousing differ from constraints in OLTP environments. In OLTP environments, they primarily prevent the insertion of invalid data into a record, which is not a big problem in data warehousing environments because accuracy has already been guaranteed. In data warehousing environments, constraints are only used for query rewrite. NOT NULL constraints are particularly common in data warehouses. Under some specific circumstances, constraints need space in the database. These constraints are in the form of the underlying unique index.

See Also:

Chapter 7, "Integrity Constraints" and Chapter 22, "Query Rewrite"

Indexes and Partitioned Indexes

Indexes are optional structures associated with tables or clusters. In addition to the classical B-tree indexes, bitmap indexes are very common in data warehousing environments. Bitmap indexes are optimized index structures for set-oriented operations. Additionally, they are necessary for some optimized data access methods such as star transformations.

Indexes are just like tables in that you can partition them, although the partitioning strategy is not dependent upon the table structure. Partitioning indexes makes it easier to manage the warehouse during refresh and improves query performance.
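The set-oriented nature of bitmap indexes can be sketched with Python integers used as bit strings. This is a toy model, not Oracle's storage format: bit i of each bitmap is set when row i holds the indexed value, and the helper names build_bitmap_index and row_ids, along with the sample columns, are hypothetical.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index

def row_ids(bitmap):
    """Decode a bitmap back into the list of matching row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Two low-cardinality columns of a hypothetical six-row table.
region = ["east", "west", "east", "north", "west", "east"]
status = ["gold", "gold", "basic", "gold", "basic", "gold"]

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)

# WHERE region = 'east' AND status = 'gold' becomes a bitwise AND.
east_and_gold = row_ids(region_idx["east"] & status_idx["gold"])
# WHERE region IN ('east', 'north') becomes a bitwise OR.
east_or_north = row_ids(region_idx["east"] | region_idx["north"])

print(east_and_gold, east_or_north)
```

Combining conditions costs one bitwise operation per predicate, which is why bitmaps suit ad hoc queries that filter on several dimension keys at once.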

See Also: 

Chapter 6, "Indexes" and Chapter 14, "Maintaining the Data Warehouse"

Materialized Views

Materialized views are query results that have been stored in advance so long-running calculations are not necessary when you actually execute your SQL statements. From a physical design point of view, materialized views resemble tables or partitioned tables and behave like indexes.

See Also: 

Chapter 8, "Materialized Views"

Dimensions

A dimension is a schema object that defines hierarchical relationships between columns or column sets. A hierarchical relationship is a functional dependency from one level of a hierarchy to the next one. A dimension is a container of logical relationships and does not require any space in the database. A typical dimension is city, state (or province), region, and country.
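A functional dependency between levels means that each value at a lower level determines exactly one value at the next level up. The sketch below checks that property on sample rows; the data, the LEVELS list, and the function is_functionally_dependent are hypothetical, and the check is a plain Python illustration rather than anything Oracle runs.

```python
def is_functionally_dependent(rows, child, parent):
    """True when every child value maps to exactly one parent value."""
    seen = {}
    for row in rows:
        # setdefault records the first parent seen for this child value.
        if seen.setdefault(row[child], row[parent]) != row[parent]:
            return False
    return True

geography = [
    {"city": "Boston", "state": "MA", "region": "Northeast", "country": "US"},
    {"city": "Salem",  "state": "MA", "region": "Northeast", "country": "US"},
    {"city": "Austin", "state": "TX", "region": "South",     "country": "US"},
]

LEVELS = ["city", "state", "region", "country"]

# Each level must determine the next one up for the hierarchy to hold.
hierarchy_holds = all(
    is_functionally_dependent(geography, child, parent)
    for child, parent in zip(LEVELS, LEVELS[1:])
)

print(hierarchy_holds)
```

The dependency runs in one direction only: a city determines its state, but a state such as MA does not determine a single city.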

See Also: 

Chapter 9, "Dimensions"


Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


4 Hardware and I/O Considerations in Data Warehouses

This chapter explains some of the hardware and I/O issues in a data warehousing environment and includes the following topics:

• Overview of Hardware and I/O Considerations in Data Warehouses
• RAID Configurations

Overview of Hardware and I/O Considerations in Data Warehouses

Data warehouses are normally very concerned with I/O performance. This is in contrast to OLTP systems, where the potential bottleneck depends on user workload and application access patterns. When a system is constrained by I/O capabilities, it is I/O bound, or has an I/O bottleneck. When a system is constrained by having limited CPU resources, it is CPU bound, or has a CPU bottleneck.

Database architects frequently use RAID (Redundant Arrays of Inexpensive Disks) systems to overcome I/O bottlenecks and to provide higher availability. RAID can be implemented in several levels, ranging from 0 to 7. Many hardware vendors have enhanced these basic levels to lessen the impact of some of the original restrictions at a given RAID level. The most common RAID levels are discussed later in this chapter.


Why Stripe the Data?

To avoid I/O bottlenecks during parallel processing or concurrent query access, all tablespaces accessed by parallel operations should be striped. Striping divides the data of a large table into small portions and stores them on separate datafiles on separate disks.

As shown in Figure 4-1, tablespaces should always stripe over at least as many devices as CPUs. In this example, there are four CPUs, two controllers, and five devices containing tablespaces.

Figure 4-1 Striping Objects Over at Least as Many Devices as CPUs

See Also: 

Oracle9i Database Concepts for further details about disk striping

You should stripe tablespaces for tables, indexes, rollback segments, and temporary tablespaces. You must also spread the devices over controllers, I/O channels, and internal buses. To make striping effective, you must make sure that enough controllers and other I/O components are available to support the bandwidth of parallel data movement into and out of the striped tablespaces.

You can use RAID systems or you can perform striping manually through careful datafile allocation to tablespaces.

The striping of data across physical drives has several consequences besides balancing I/O. One additional advantage is that logical files can be created that are larger than the maximum size usually supported by an operating system. There are disadvantages, however. Striping means that it is no longer possible to locate a single datafile on a specific physical drive. This can cause the loss of some application tuning capabilities. Also, it can cause database recovery to be more time-consuming. If a single physical disk in a RAID array needs recovery, all the disks that are part of that logical RAID device must be involved in the recovery.
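To make the idea of striping concrete, the following Python sketch maps a logical byte offset to a device and to an offset on that device under simple round-robin striping. The function name, the 64 KB stripe size, and the five-device layout are illustrative assumptions; the sketch does not describe how Oracle or any particular volume manager places data.

```python
def locate(offset, stripe_size, num_devices):
    """Map a logical byte offset to (device index, byte offset on that device)."""
    stripe_no = offset // stripe_size        # index of the stripe unit overall
    device = stripe_no % num_devices         # round-robin across the devices
    row = stripe_no // num_devices           # complete rows of stripe units before this one
    return device, row * stripe_size + offset % stripe_size

STRIPE = 64 * 1024   # assumed stripe size in bytes
DEVICES = 5          # assumed device count

# Consecutive stripe units land on consecutive devices and then wrap around.
placement = [locate(i * STRIPE, STRIPE, DEVICES)[0] for i in range(10)]
```

Because consecutive stripe units fall on different devices, a large sequential read or several concurrent readers can keep all devices busy at once.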

Automatic Striping

Automatic striping is usually flexible and easy to manage. It supports many scenarios such as multiple users running sequentially or as single users running in parallel. Two main advantages make automatic striping preferable to manual striping, unless the system is very small or availability is the main concern:

• For parallel scan operations (such as full table scan or fast full scan), operating system striping increases the number of disk seeks. Nevertheless, this is largely offset by the large I/O size (DB_BLOCK_SIZE * MULTIBLOCK_READ_COUNT), which should enable this operation to reach the maximum I/O throughput for your platform. This maximum is in general limited by the number of controllers or I/O buses of the platform, not by the number of disks (unless you have a small configuration or are using large disks).
• For index probes (for example, within a nested loop join or parallel index range scan), operating system striping enables you to avoid hot spots by evenly distributing I/O across the disks.

Oracle Corporation recommends using a large stripe size of at least 64 KB. Stripe size must be at least as large as the I/O size. If stripe size is larger than I/O size by a factor of two or four, then trade-offs may arise. The large stripe size can be advantageous because it lets the system perform more sequential operations on each disk; it decreases the number of seeks on disk. Another advantage of large stripe sizes is that more users can work on the system without affecting each other. The disadvantage is that large stripes reduce the I/O parallelism, so fewer disks are simultaneously active. If you encounter problems, increase the I/O size of scan operations (for example, from 64 KB to 128 KB), instead of changing the stripe size. The maximum I/O size is platform-specific (in a range, for example, of 64 KB to 1 MB).
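The relationship between I/O size and stripe size can be checked with a little arithmetic. The Python sketch below computes the scan I/O size as block size times blocks per read, and counts how many stripe units a single I/O request spans. The 8 KB block size and the read count of 8 are assumed values chosen only so that the product is 64 KB; the function names are hypothetical.

```python
def scan_io_size(db_block_size, multiblock_read_count):
    """I/O size of one scan request in bytes: block size times blocks per read."""
    return db_block_size * multiblock_read_count

def stripe_units_per_io(io_size, stripe_size, offset=0):
    """Number of stripe units (and so devices, if enough exist) one I/O request spans."""
    first = offset // stripe_size
    last = (offset + io_size - 1) // stripe_size
    return last - first + 1

KB = 1024
io_size = scan_io_size(8 * KB, 8)   # assumed: 8 KB blocks, 8 blocks per read
```

With a 64 KB stripe, an aligned 64 KB request stays on one device, which keeps each disk working sequentially. With a 16 KB stripe, the same request is split across four devices, which adds seeks on every one of them.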

With automatic striping, from a performance standpoint, the best layout is to stripe data, indexes, and temporary tablespaces across all the disks of your platform. This layout is also appropriate when you have little information about system usage. To increase availability, it may be more practical to stripe over fewer disks to prevent a single disk failure from affecting the entire data warehouse. However, for better performance, it is crucial to stripe all objects over multiple disks. In this way, maximum I/O performance (both in terms of throughput and in number of I/Os per second) can be reached when one object is accessed by a parallel operation. If multiple objects are accessed at the same time (as in a multiuser configuration), striping automatically limits the contention.

Manual Striping

You can use manual striping on all platforms. To do this, add multiple files to each tablespace, with each file on a separate disk. If you use manual striping correctly, your system's performance improves significantly. However, you should be aware of several drawbacks that can adversely affect performance if you do not stripe correctly.

When using manual striping, the degree of parallelism (DOP) is more a function of the number of disks than of the number of CPUs. First, it is necessary to have one server process for each datafile to drive all the disks and limit the risk of experiencing I/O bottlenecks. Second, manual striping is very sensitive to datafile size skew, which can affect the scalability of parallel scan operations. Third, manual striping requires more planning and set-up effort than automatic striping.
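The sensitivity to datafile size skew can be illustrated with a simple model. The Python sketch below assumes one server process per datafile, one datafile per disk, and the same transfer rate on every disk, so a parallel scan finishes only when the largest file has been read. The file sizes and the 50 MB/s rate are made-up figures.

```python
def scan_time(file_sizes_mb, mb_per_sec_per_disk):
    """Elapsed seconds for a parallel scan with one server process per datafile.

    Each file is on its own disk and is read independently, so the scan
    ends when the largest file has been read.
    """
    return max(file_sizes_mb) / mb_per_sec_per_disk

even = [1000, 1000, 1000, 1000]     # 4 GB spread evenly over four files
skewed = [2500, 500, 500, 500]      # the same 4 GB with one oversized file

even_seconds = scan_time(even, 50)
skewed_seconds = scan_time(skewed, 50)
```

In this model the skewed layout takes 50 seconds against 20 seconds for the even layout, although both hold the same amount of data on the same number of disks.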

Note: 

Oracle Corporation recommends that you choose automatic striping unless you have a clear reason not to.

Local and Global Striping

Local striping, which applies only to partitioned tables and indexes, is a form of non-overlapping, disk-to-partition striping. Each partition has its own set of disks and files, as illustrated in Figure 4-2. Disk access does not overlap, nor do files.

An advantage of local striping is that if one disk fails, it does not affect other partitions. Moreover, you still have some striping even if you have data in only one partition.

A disadvantage of local striping is that you need many disks to implement it, because each partition requires multiple disks of its own. Another major disadvantage is that when partitions are reduced to a few or even a single partition, the system retains limited I/O bandwidth. As a result, local striping is not optimal for parallel operations. For this reason, consider local striping only if your main concern is availability, rather than parallel execution.

Figure 4-2 Local Striping

Global striping, illustrated in Figure 4-3, entails overlapping disks and partitions.

Figure 4-3 Global Striping

Global striping is advantageous if you have partition pruning and need to access data in only one partition. Spreading the data in that partition across many disks improves performance for parallel execution operations. A disadvantage of global striping is that if one disk fails, all partitions are affected if the disks are not mirrored.
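The trade-off between local and global striping can be expressed as a small model. In the Python sketch below, a layout is a mapping from partition number to the set of disks that hold it. The partition and disk counts are assumptions chosen for illustration, and the function names are hypothetical.

```python
def local_layout(num_partitions, disks_per_partition):
    """Local striping: each partition has its own, non-overlapping set of disks."""
    return {
        p: set(range(p * disks_per_partition, (p + 1) * disks_per_partition))
        for p in range(num_partitions)
    }

def global_layout(num_partitions, num_disks):
    """Global striping: every partition is spread over all disks."""
    return {p: set(range(num_disks)) for p in range(num_partitions)}

def affected_partitions(layout, failed_disk):
    """Partitions that become unavailable if one unmirrored disk fails."""
    return sorted(p for p, disks in layout.items() if failed_disk in disks)

local = local_layout(4, 2)       # 4 partitions x 2 disks each = 8 disks
global_ = global_layout(4, 8)    # the same 8 disks shared by all partitions
```

A scan of a single partition can use two disks under the local layout and all eight under the global layout. Conversely, the failure of disk 3 affects one partition locally and all four globally.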

See Also: 

Oracle9i Database Concepts for information on disk striping and partitioning. For MPP systems, see your operating system specific Oracle documentation regarding the advisability of disabling disk affinity when using operating system striping

Analyzing Striping

Two considerations arise when analyzing striping issues for your applications. First, consider the cardinality of the relationships among the objects in a storage system. Second, consider what you can optimize in your striping effort: full table scans, general tablespace availability, partition scans, or some combinations of these goals. Cardinality and optimization are discussed in the following section.

Cardinality of Storage Object Relationships

To analyze striping, consider the relationships illustrated in Figure 4-4.

Figure 4-4 Cardinality of Relationships

Figure 4-4 shows the cardinality of the relationships among objects in a typical Oracle storage system. For every table there may be:

• p partitions, shown in Figure 4-4 as a one-to-many relationship
• s partitions for every tablespace, shown in Figure 4-4 as a many-to-one relationship
• f files for every tablespace, shown in Figure 4-4 as a one-to-many relationship
• m files to n devices, shown in Figure 4-4 as a many-to-many relationship

Striping Goals

You can stripe an object across devices to achieve one of three goals:

• Goal 1: To optimize full table scans, place a table on many devices.
• Goal 2: To optimize availability, restrict the tablespace to a few devices.
• Goal 3: To optimize partition scans, achieve intra-partition parallelism by placing each partition on many devices.

To attain both Goals 1 and 2 (having the table reside on many devices, with the highest possible availability), maximize the number of partitions p and minimize the number of partitions for each tablespace s.

To maximize Goal 1 but with minimal intra-partition parallelism, place each partition in its own tablespace. Do not use striped files, and use one file for each tablespace.

To minimize the formula for Goal 2 and thereby maximize availability, set f and n equal to 1. When you maximize availability, you minimize intra-partition parallelism. Goal 3 conflicts with Goal 2 because you cannot simultaneously maximize the formula for Goal 3 and minimize the formula for Goal 2. You must compromise to achieve some of the benefits of both goals.

Striping Goal 1: Optimize Full Table Scans

Having a table reside on many devices ensures scalable full table scans.

To calculate the optimal number of devices for each table, use this formula:

(Formula illustration not reproduced in this transcript.)

You can do this by having t partitions, with every partition in its own tablespace, if every tablespace has one file, and these files are not striped.

(Formula illustration not reproduced in this transcript.)

If the table is not partitioned, but is in one tablespace in one file, stripe it over n devices.

(Formula illustration not reproduced in this transcript.)

There are a maximum of t partitions, every partition in its own tablespace, f files in each tablespace, each tablespace on a striped device:

(Formula illustration not reproduced in this transcript.)

Striping Goal 2: Optimize Availability

Restricting each tablespace to a small number of devices and having as many partitions as possible helps you achieve high availability.

(Formula illustration not reproduced in this transcript.)

Availability is maximized when f = n = m = 1 and p is much greater than 1.

Striping Goal 3: Optimize Partition Scans

Achieving intra-partition parallelism is advantageous because partition scans are scalable. To do this, place each partition on many devices.

(Formula illustration not reproduced in this transcript.)

Partitions can reside in a tablespace that can have many files. You can have either a striped file or many files for each tablespace.

RAID Configurations

RAID systems, also called disk arrays, can be hardware- or software-based systems. The difference between the two is how CPU processing of I/O requests is handled. In software-based RAID systems, the operating system or an application level handles the I/O request, while in hardware-based RAID systems, disk controllers handle I/O requests. RAID usage is transparent to Oracle. All the features specific to a given RAID configuration are handled by the operating system and Oracle does not need to worry about them.

Primary logical database structures have different access patterns during read and write operations. Therefore, different RAID implementations will be better suited for these structures. The purpose of this chapter is to discuss some of the basic decisions you must make when designing the physical layout of your data warehouse implementation. It is not meant as a replacement for operating system and storage documentation or a consultant's analysis of your I/O requirements.

See Also: 

Oracle9i Database Performance Tuning Guide and Reference for more information regarding RAID

There are advantages and disadvantages to using RAID, and those depend on the RAID level under consideration and the specific system in question. The most common configurations in data warehouses are:

• RAID 0 (Striping)
• RAID 1 (Mirroring)
• RAID 0+1 (Striping and Mirroring)
• RAID 5

RAID 0 (Striping)

RAID 0 is a non-redundant disk array, so there will be data loss with any disk failure. If something on the disk becomes corrupted, you cannot restore or recalculate that data. RAID 0 provides the best write throughput performance because it never updates redundant information. Read throughput is also quite good, but you can improve it by combining RAID 0 with RAID 1.

Oracle does not recommend using RAID 0 systems without RAID 1 because the loss of one disk in the array will affect the complete system and make it unavailable. RAID 0 systems are used mainly in environments where performance and capacity are the primary concerns rather than availability.

RAID 1 (Mirroring)

RAID 1 provides full data redundancy by complete mirroring of all files. If a disk failure occurs, the mirrored copy is used to transparently service the request. RAID 1 mirroring requires twice as much disk space as there is data. In general, RAID 1 is most useful for systems where complete redundancy of data is required and disk space is not an issue. For large datafiles or systems with less disk space, RAID 1 may not be feasible, because it requires twice as much disk space as there is data. Writes under RAID 1 are no faster and no slower than usual. Reading data can be faster than on a single disk because the system can choose to read the data from the disk that can respond faster.

RAID 0+1 (Striping and Mirroring)

RAID 0+1 offers the best performance of all RAID systems, but costs the most because you double the number of drives. Basically, it combines the performance of RAID 0 and the fault tolerance of RAID 1. You should consider RAID 0+1 for datafiles with high write rates, for example, table datafiles, and online and archived redo log files.

Striping, Mirroring, and Media Recovery

Striping affects media recovery. Loss of a disk usually means loss of access to all objects stored on that disk. If all datafiles in a database are striped over all disks, then loss of any disk stops the entire database. Furthermore, you may need to restore all these database files from backups, even if each file has only a small fraction of its total data stored on the failed disk.

Often, the same system that provides striping also provides mirroring. With the declining price of disks, mirroring can provide an effective supplement to, but not a substitute for, backups and log archives. Mirroring can help your system recover from disk failures more quickly than using a backup, but mirroring is not as robust. Mirroring does not protect against software faults and other problems against which an independent backup would protect your system.

You can effectively use mirroring if you are able to reload read-only data from the original source tapes. If you have a disk failure, restoring data from backups can involve lengthy downtime, whereas restoring from a mirrored disk enables your system to get back online quickly or even stay online while the crashed disk is replaced and resynchronized.

RAID 5

RAID 5 systems provide redundancy for the original data while storing parity information as well. The parity information is striped over all disks in the system to avoid a single disk as a bottleneck during write operations. The I/O throughput of RAID 5 systems depends upon the implementation and the striping size. For a typical RAID 5 system, the throughput is normally lower than RAID 0+1 configurations. In particular, the performance for high concurrent write operations such as parallel load can be poor.
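Parity-based redundancy can be demonstrated in a few lines. The Python sketch below computes a parity block as the byte-wise XOR of the data blocks in one stripe and then rebuilds a lost block from the survivors. The block contents are made up, and the sketch keeps parity in a single place rather than rotating it across disks as a real RAID 5 array does.

```python
from functools import reduce

def parity(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"ABCD", b"EFGH", b"IJKL"]   # three data blocks of one stripe (made-up contents)
p = parity(data)                     # parity block for that stripe

# Simulate losing the second block and rebuild it from the other blocks plus parity.
rebuilt = parity([data[0], data[2], p])
```

Because every write to a data block must also update the parity block, small concurrent writes cost more than on a plain striped or mirrored array, which is consistent with the poor parallel load performance noted above.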

Many vendors use memory (as battery-backed cache) in front of the disks to increase throughput and to become comparable to RAID 0+1. Contact your disk array vendor for specific details.
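The cost side of the RAID levels discussed in this chapter can be compared with a short calculation. The Python sketch below returns the usable capacity, counted in whole disks, for an array of equally sized disks. It assumes the standard definitions: no redundancy for RAID 0, every block stored twice for RAID 1 and RAID 0+1, and one disk's worth of parity for RAID 5. The function name is hypothetical.

```python
def usable_disks(level, num_disks):
    """Usable capacity, in disks, of an array of equally sized disks."""
    if level == "0":
        return num_disks            # striping only, no redundancy
    if level in ("1", "0+1"):
        return num_disks / 2        # every block is stored twice
    if level == "5":
        return num_disks - 1        # one disk's worth of parity
    raise ValueError(f"unsupported RAID level: {level}")

capacities = {level: usable_disks(level, 8) for level in ("0", "1", "0+1", "5")}
```

For eight disks this gives 8, 4, 4, and 7 disks of usable space respectively, which shows why RAID 5 is attractive on capacity even though its write throughput is usually lower than that of RAID 0+1.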

The Importance of Specific Analysis

A data warehouse's requirements are at many levels, and resolving a problem at one level can cause problems with another. For example, resolving a problem with query performance during the ETL process can affect load performance. You cannot simply maximize query performance at the expense of an unrealistic load time. If you do, your implementation will fail. In addition, a particular process is dependent upon the warehouse's architecture. If you decide to change something in your system, it can cause performance to become unacceptable in another part of the warehousing process. An example of this is switching from using database files to flat files during the loading process. Flat files can have different read performance.

This chapter is not meant as a replacement for operating system and storage documentation. Your system's requirements will require detailed analysis prior to implementation. Only a detailed data warehouse architecture and I/O analysis will help you when deciding hardware and I/O strategies.

See Also: 

Oracle9i Database Performance Tuning Guide and Reference for details regarding how to analyze I/O requirements


5 Parallelism and Partitioning in Data Warehouses

Data warehouses often contain large tables and require techniques both for managing these large tables and for providing good query performance across these large tables. This chapter discusses two key methodologies for addressing these needs: parallelism and partitioning.


These topics are discussed:

• Overview of Parallel Execution
• Granules of Parallelism
• Partitioning Design Considerations
• Miscellaneous Partition Operations

Note: 

Parallel execution is available only with the Oracle9i Enterprise Edition.

Overview of Parallel Execution

Parallel execution dramatically reduces response time for data-intensive operations on large databases typically associated with decision support systems (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and hybrid systems. Parallel execution is sometimes called parallelism. Simply expressed, parallelism is the idea of breaking down a task so that, instead of one process doing all of the work in a query, many processes do part of the work at the same time. An example of this is when four processes handle four different quarters in a year instead of one process handling all four quarters by itself. The improvement in performance can be quite high. In this case, each quarter will be a partition, a smaller and more manageable unit of an index or table.
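The four-quarters example can be sketched in Python. The code below gives each quarter to its own worker and then combines the partial results, which is the same divide-and-combine shape that a parallel query uses. The sales figures are hypothetical, and Python threads are used only to show the decomposition; the sketch says nothing about how Oracle schedules its server processes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sales amounts, one list per quarter (one "partition" each).
quarters = {
    "Q1": [120, 80, 100],
    "Q2": [90, 110],
    "Q3": [70, 130, 50],
    "Q4": [200],
}

def total(rows):
    """Work done by one worker: aggregate a single quarter."""
    return sum(rows)

# Four workers, one per quarter, instead of one worker for the whole year.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = dict(zip(quarters, pool.map(total, quarters.values())))

year_total = sum(partials.values())   # combine the partial results
```

The final combination step is cheap compared with the per-quarter work, which is why operations such as aggregations and large scans divide well.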

See Also: 

Oracle9i Database Concepts for further conceptual information regarding parallel execution

When to Implement Parallel Execution

The most common use of parallel execution is in DSS and data warehousing environments. Complex queries, such as those involving joins of several tables or searches of very large tables, are often best executed in parallel.

Parallel execution is useful for many types of operations that access significant amounts of data. Parallel execution improves processing for:

• Large table scans and joins
• Creation of large indexes
• Partitioned index scans
• Bulk inserts, updates, and deletes
• Aggregations and copying

You can also use parallel execution to access object types within an Oracle database. For example, use parallel execution to access LOBs (large objects).

Parallel execution benefits systems that have all of the following characteristics:

• Symmetric multi-processors (SMP), clusters, or massively parallel systems
• Sufficient I/O bandwidth
• Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%)
• Sufficient memory to support additional memory-intensive processes such as sorts, hashing, and I/O buffers

If your system lacks any of these characteristics, parallel execution might not significantly improve performance. In fact, parallel execution can reduce system performance on overutilized systems or systems with small I/O bandwidth.

See Also: 

Chapter 21, "Using Parallel Execution" for further information regarding parallel execution requirements

Granules of Parallelism

Different parallel operations use different types of parallelism. The optimal physical database layout depends on the parallel operations that are most prevalent in your application or even on the necessity of using partitions.

The basic unit of work in parallelism is called a granule. Oracle divides the operation being parallelized (for example, a table scan, table update, or index creation) into granules. Parallel execution processes execute the operation one granule at a time. The number of granules and their size correlates with the degree of parallelism (DOP). It also affects how well the work is balanced across query server processes. There is no way you can enforce a specific granule strategy as Oracle makes this decision internally.

Block Range Granules

Block range granules are the basic unit of most parallel operations, even on partitioned tables. Therefore, from an Oracle perspective, the degree of parallelism is not related to the number of partitions.

Block range granules are ranges of physical blocks from a table. The number and the size of the granules are computed during runtime by Oracle to optimize and balance the work distribution for all affected parallel execution servers. The number and size of granules are dependent upon the size of the object and the DOP. Block range granules do not depend on static preallocation of tables or indexes. During the computation of the granules, Oracle takes the DOP into account and tries to assign granules from different datafiles to each of the parallel execution servers to avoid contention whenever possible. Additionally, Oracle considers the disk affinity of the granules on MPP systems to take advantage of the physical proximity between parallel execution servers and disks.

When block range granules are used predominantly for parallel access to a table or index, administrative considerations (such as recovery or using partitions for deleting portions of data) might influence partition layout more than performance considerations.
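The idea of a block range granule can be illustrated with a short Python sketch that cuts a table's blocks into contiguous ranges. The number of granules used here (a fixed multiple of the DOP) is an assumed heuristic chosen for the example; Oracle computes the real number and size internally at run time, and the function and parameter names are hypothetical.

```python
def block_range_granules(total_blocks, dop, granules_per_server=4):
    """Split blocks [0, total_blocks) into contiguous half-open ranges.

    The granule count (dop * granules_per_server) is an assumption made
    for this sketch, not Oracle's actual algorithm.
    """
    if total_blocks <= 0:
        return []
    count = min(total_blocks, dop * granules_per_server)
    size, extra = divmod(total_blocks, count)
    granules, start = [], 0
    for i in range(count):
        end = start + size + (1 if i < extra else 0)   # spread the remainder evenly
        granules.append((start, end))
        start = end
    return granules

granules = block_range_granules(total_blocks=1000, dop=4)
```

Having several granules for each server lets a server that finishes early pick up another range, which is how work stays balanced even when some ranges take longer than others.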

Partition Granules

When Oracle uses partition granules, a query server process works on an entire partition or subpartition of a table or index. Because partition granules are statically determined by the structure of the table or index when a table or index is created, partition granules do not give you the flexibility in parallelizing an operation that block granules do. The maximum allowable DOP is the number of partitions. This might limit the utilization of the system and the load balancing across parallel execution servers.

When Oracle uses partition granules for parallel access to a table or index, you should use a relatively large number of partitions (ideally, three times the DOP), so that Oracle can effectively balance work across the query server processes.

Partition granules are the basic unit of parallel index range scans and of parallel operations that modify multiple partitions of a partitioned table or index. These operations include parallel creation of partitioned indexes, and parallel creation of partitioned tables.
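The recommendation to use about three times as many partitions as the DOP can be motivated with a simple load-balancing model. The Python sketch below hands out whole partitions, largest first, to the least-loaded server and reports the resulting load per server. The partition sizes are hypothetical units of work, and the static greedy assignment is a simplification of how granules are actually distributed.

```python
import heapq

def assign(partition_sizes, dop):
    """Assign whole partitions to servers and return the sorted server loads."""
    servers = min(dop, len(partition_sizes))   # DOP cannot exceed the partition count
    loads = [(0, s) for s in range(servers)]
    heapq.heapify(loads)
    for size in sorted(partition_sizes, reverse=True):
        load, s = heapq.heappop(loads)          # least-loaded server so far
        heapq.heappush(loads, (load + size, s))
    return sorted(load for load, _ in loads)

few = assign([9, 5, 4, 2], dop=4)                            # partitions equal to DOP
many = assign([3, 2, 2, 2, 2, 1, 2, 1, 1, 1, 2, 1], dop=4)   # 3 x DOP, same 20 units
```

With four uneven partitions, the busiest server carries 9 of the 20 units while another carries 2. With twelve smaller partitions, every server ends up with 5 units, so the operation finishes sooner.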

See Also: 

Oracle9i Database Concepts for information on disk striping and partitioning

Partitioning Design Considerations

In conjunction with parallel execution, partitioning can improve performance in data warehouses. The following are the main design considerations for partitioning:

• Types of Partitioning
• Partition Pruning
• Partition-Wise Joins

Types of Partitioning

This section describes the partitioning features that significantly enhance data access and improve overall application performance. This is especially true for applications that access tables and indexes with millions of rows and many gigabytes of data.

Partitioned tables and indexes facilitate administrative operations by enabling these operations to work on subsets of data. For example, you can add a new partition, organize an existing partition, or drop a partition and cause less than a second of interruption to a read-only application.

Using the partitioning methods described in this section can help you tune SQL statements to avoid unnecessary index and table scans (using partition pruning). You can also improve the performance of massive join operations when large amounts of data (for example, several million rows) are joined together by using partition-wise joins. Finally, partitioning data greatly improves manageability of very large databases and dramatically reduces the time required for administrative tasks such as backup and restore.

Granularity can be easily added or removed to the partitioning scheme by splitting partitions. Thus, if a table's data is skewed to fill some partitions more than others, the ones that contain more data can be split to achieve a more even distribution. Partitioning also allows one to swap partitions with a table. By being able to easily add, remove, or swap a large amount of data quickly, swapping can be used to keep a large amount of data that is being loaded inaccessible until loading is completed, or can be used as a way to stage data between different phases of use. Some examples are current day's transactions or online archives.

See Also: 

Oracle9i Database Concepts for an introduction to the ideas behind partitioning

Partitioning Methods

Oracle offers four partitioning methods:

• Range Partitioning
• Hash Partitioning
• List Partitioning
• Composite Partitioning

Each partitioning method has different advantages and design considerations. Thus, each method is more appropriate for a particular situation.

Range Partitioning

Range partitioning maps data to partitions based on ranges of partition key values that you establish for each partition. It is the most common type of partitioning and is often used with dates. For example, you might want to partition sales data into monthly partitions.

Range partitioning maps rows to partitions based on ranges of column values. Range partitioning is defined by the partitioning specification for a table or index in PARTITION BY RANGE (column_list) and by the partitioning specifications for each individual partition in VALUES LESS THAN (value_list), where column_list is an ordered list of columns that determines the partition to which a row or an index entry belongs. These columns are called the partitioning columns. The values in the partitioning columns of a particular row constitute that row's partitioning key.

value_list is an ordered list of values for the columns in the column list. Each value must be either a literal or a TO_DATE or RPAD function with constant arguments. Only the VALUES LESS THAN clause is allowed. This clause specifies a non-inclusive upper bound for the partitions. All partitions, except the first, have an implicit low value specified by the VALUES LESS THAN literal on the previous partition. Any binary values of the partition key equal to or higher than this literal are added to the next higher partition. The highest partition is the one where the MAXVALUE literal is defined. The keyword MAXVALUE represents a virtual infinite value that sorts higher than any other value for the data type, including the null value.

The following statement creates a table sales_range that is range partitioned on the sales_date field:

CREATE TABLE sales_range
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_amount  NUMBER(10),
 sales_date    DATE)
COMPRESS
PARTITION BY RANGE(sales_date)
-- Each bound is the first day of the following month (MM/DD/YYYY).
(PARTITION sales_jan2000 VALUES LESS THAN(TO_DATE('02/01/2000','MM/DD/YYYY')),
 PARTITION sales_feb2000 VALUES LESS THAN(TO_DATE('03/01/2000','MM/DD/YYYY')),
 PARTITION sales_mar2000 VALUES LESS THAN(TO_DATE('04/01/2000','MM/DD/YYYY')),
 PARTITION sales_apr2000 VALUES LESS THAN(TO_DATE('05/01/2000','MM/DD/YYYY')));

Note: 

This table was created with the COMPRESS keyword, thus all partitions inherit this attribute.

See Also: 

Oracle9i SQL Reference for partitioning syntax and the Oracle9i Database Administrator's Guide for more examples
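The way a row's key is matched against non-inclusive upper bounds can be modeled in Python with a binary search. The sketch below assumes four monthly partitions whose bounds are 1 February through 1 May 2000, in the style of the sales_range example, and no MAXVALUE partition. The function name is hypothetical and the code is only a model of the mapping rule, not of Oracle's implementation.

```python
from bisect import bisect_right
from datetime import date

# Exclusive upper bounds, in the style of VALUES LESS THAN.
BOUNDS = [date(2000, 2, 1), date(2000, 3, 1), date(2000, 4, 1), date(2000, 5, 1)]
NAMES = ["sales_jan2000", "sales_feb2000", "sales_mar2000", "sales_apr2000"]

def partition_for(sales_date):
    """Return the first partition whose bound is strictly greater than the key."""
    i = bisect_right(BOUNDS, sales_date)   # a key equal to a bound goes to the next partition
    if i == len(BOUNDS):
        # No MAXVALUE partition exists in this model, so the key has no home.
        raise ValueError("key is beyond the highest partition bound")
    return NAMES[i]
```

A key of 1 February 2000 maps to sales_feb2000 rather than sales_jan2000 because the bound is non-inclusive, and any date before 2000 falls into the first partition because that partition has no lower bound.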


Hash Partitioning

Hash partitioning maps data to partitions based on a hashing algorithm that Oracle applies to a partitioning key that you identify. The hashing algorithm evenly distributes rows among partitions, giving partitions approximately the same size. Hash partitioning is the ideal method for distributing data evenly across devices. Hash partitioning is a good and easy-to-use alternative to range partitioning when data is not historical and there is no obvious column or column list where logical range partition pruning can be advantageous.

Oracle uses a linear hashing algorithm and to prevent data from clustering within specific partitions, you should define the number of partitions by a power of two (for example, 2, 4, 8).

The following statement creates a table sales_hash, which is hash partitioned on the salesman_id field:

CREATE TABLE sales_hash
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_amount  NUMBER(10),
 week_no       NUMBER(2))
PARTITION BY HASH(salesman_id)
PARTITIONS 4;

See Also: 

Oracle9i SQL Reference for partitioning syntax and the Oracle9i Database Administrator's Guide for more examples

Note: 

?o cannot define alternate hashin! al!orithms for partitions"

List Partitioning

List partitioning enables you to explicitly control how rows map to partitions. You do this by specifying a list of discrete values for the partitioning column in the description for each partition. This is different from range partitioning, where a range of values is associated with a partition, and from hash partitioning, where you have no control of the row-to-partition mapping. The advantage of list partitioning is that you can group and organize unordered and unrelated sets of data in a natural way. The following example creates a list partitioned table grouping states according to their sales regions:

CREATE TABLE sales_list
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_state   VARCHAR2(20),
 sales_amount  NUMBER(10),
 sales_date    DATE)
PARTITION BY LIST(sales_state)
(PARTITION sales_west    VALUES('California', 'Hawaii') COMPRESS,
 PARTITION sales_east    VALUES('New York', 'Virginia', 'Florida'),
 PARTITION sales_central VALUES('Texas', 'Illinois'));

Partition sales_west is furthermore created as a single compressed partition within sales_list. For details about partitioning and compression, see "Partitioning and Data Segment Compression".

An additional capability with list partitioning is that you can use a default partition, so that all rows that do not map to any other partition do not generate an error. For example, modifying the previous example, you can create a default partition as follows:

CREATE TABLE sales_list
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_state   VARCHAR2(20),
 sales_amount  NUMBER(10),
 sales_date    DATE)
PARTITION BY LIST(sales_state)
(PARTITION sales_west    VALUES('California', 'Hawaii'),
 PARTITION sales_east    VALUES('New York', 'Virginia', 'Florida'),
 PARTITION sales_central VALUES('Texas', 'Illinois'),
 PARTITION sales_other   VALUES(DEFAULT));

See Also:

Oracle9i SQL Reference for partitioning syntax, "Partitioning and Data Segment Compression" for information regarding data segment compression, and the Oracle9i Database Administrator's Guide for more examples
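The row-to-partition mapping described for list partitioning can be pictured with a small lookup. The following Python sketch is only an illustration of the idea, not Oracle code: the dictionary mirrors the value lists of sales_list, and the function name route_row and its default argument are hypothetical.

```python
# Hypothetical model of the sales_list value lists; not Oracle's implementation.
PARTITION_VALUES = {
    "sales_west": {"California", "Hawaii"},
    "sales_east": {"New York", "Virginia", "Florida"},
    "sales_central": {"Texas", "Illinois"},
}
DEFAULT_PARTITION = "sales_other"


def route_row(sales_state, default=DEFAULT_PARTITION):
    """Return the name of the partition whose value list contains sales_state."""
    for partition, values in PARTITION_VALUES.items():
        if sales_state in values:
            return partition
    if default is None:
        # Without a DEFAULT partition an unlisted value is rejected.
        raise ValueError(f"{sales_state!r} does not map to any partition")
    return default
```

Calling route_row("Hawaii") selects sales_west, while an unlisted state such as "Oregon" falls through to sales_other. Passing default=None models the first version of the table, where the same row would be rejected.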

Composite Partitioning

Composite partitioning combines range and hash or list partitioning. Oracle first distributes data into partitions according to boundaries established by the partition ranges. Then, for range-hash partitioning, Oracle uses a hashing algorithm to further divide the data into subpartitions within each range partition. For range-list partitioning, Oracle divides the data into subpartitions within each range partition based on the explicit list you chose.

Index Partitioning

You can choose whether or not to inherit the partitioning strategy of the underlying tables. You can create both local and global indexes on a table partitioned by range, hash, or composite methods. Local indexes inherit the partitioning attributes of their related tables. For example, if you create a local index on a composite table, Oracle automatically partitions the local index using the composite method.

Oracle supports only range partitioning for global partitioned indexes. You cannot partition global indexes using the hash or composite partitioning methods.

See Also:

Chapter 6, "Indexes"

Performance Issues for Range, List, Hash, and Composite Partitioning

This section describes performance issues for:

• When to Use Range Partitioning

• When to Use Hash Partitioning

• When to Use List Partitioning

• When to Use Composite Range-Hash Partitioning

• When to Use Composite Range-List Partitioning

When to Use Range Partitioning

Range partitioning is a convenient method for partitioning historical data. The boundaries of range partitions define the ordering of the partitions in the tables or indexes.

Range partitioning organizes data by time intervals on a column of type DATE. Thus, most SQL statements accessing range partitions focus on timeframes. An example of this is a SQL statement similar to "select data from a particular period in time." In such a scenario, if each partition represents data for one month, the query "find data of month 98-DEC" needs to access only the December partition of year 98. This reduces the amount of data scanned to a fraction of the total data available, an optimization method called partition pruning.

Range partitioning is also ideal when you periodically load new data and purge old data. It is easy to add or drop partitions.

It is common to keep a rolling window of data, for example keeping the past 36 months' worth of data online. Range partitioning simplifies this process. To add data from a new month, you load it into a separate table, clean it, index it, and then add it to the range-partitioned table using the EXCHANGE PARTITION statement, all while the original table remains online. Once you add the new partition, you can drop the trailing month with the DROP PARTITION statement. The alternative to using the DROP PARTITION statement can be to archive the partition and make it read only, but this works only when your partitions are in separate tablespaces.
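The rolling window bookkeeping can be sketched in a few lines. The following Python model is an assumption-laden illustration, not Oracle behavior: an OrderedDict stands in for the partitioned table, and the function name roll_window and the window parameter are hypothetical.

```python
from collections import OrderedDict


def roll_window(partitions, new_name, new_rows, window=36):
    """Attach a freshly loaded month, then drop the oldest months beyond `window`.

    partitions maps partition name -> rows, oldest partition first.
    Returns the names of the partitions that were dropped.
    """
    # Stands in for EXCHANGE PARTITION: the prepared data becomes a partition.
    partitions[new_name] = new_rows
    dropped = []
    while len(partitions) > window:
        # Stands in for DROP PARTITION on the trailing (oldest) month.
        name, _rows = partitions.popitem(last=False)
        dropped.append(name)
    return dropped
```

With a window of 36 and 36 months already online, adding one month returns the single oldest partition name, so the table again holds exactly 36 months.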

In conclusion, consider using range partitioning when:

• Very large tables are frequently scanned by a range predicate on a good partitioning column, such as ORDER_DATE or PURCHASE_DATE. Partitioning the table on that column enables partition pruning.

• You want to maintain a rolling window of data.

• You cannot complete administrative operations, such as backup and restore, on large tables in an allotted time frame, but you can divide them into smaller logical pieces based on the partition range column.

The following example creates the table sales for a period of two years, 1999 and 2000, and partitions it by range according to the column s_saledate to separate the data into eight quarters, each corresponding to a partition.

CREATE TABLE sales
  (s_productid  NUMBER,
   s_saledate   DATE,
   s_custid     NUMBER,
   s_totalprice NUMBER)
PARTITION BY RANGE(s_saledate)
 (PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
  PARTITION sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
  PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
  PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')),
  PARTITION sal00q1 VALUES LESS THAN (TO_DATE('01-APR-2000', 'DD-MON-YYYY')),
  PARTITION sal00q2 VALUES LESS THAN (TO_DATE('01-JUL-2000', 'DD-MON-YYYY')),
  PARTITION sal00q3 VALUES LESS THAN (TO_DATE('01-OCT-2000', 'DD-MON-YYYY')),
  PARTITION sal00q4 VALUES LESS THAN (TO_DATE('01-JAN-2001', 'DD-MON-YYYY')));
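To see how the VALUES LESS THAN bounds decide which partitions a date predicate can touch, the following Python sketch models the eight partitions of sales with a sorted list and a binary search. It is a simplified illustration that ignores time-of-day, not the optimizer's actual algorithm; the function names partition_for and prune are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# Exclusive upper bounds (VALUES LESS THAN) of the eight partitions of sales.
BOUNDS = [
    ("sal99q1", date(1999, 4, 1)), ("sal99q2", date(1999, 7, 1)),
    ("sal99q3", date(1999, 10, 1)), ("sal99q4", date(2000, 1, 1)),
    ("sal00q1", date(2000, 4, 1)), ("sal00q2", date(2000, 7, 1)),
    ("sal00q3", date(2000, 10, 1)), ("sal00q4", date(2001, 1, 1)),
]
UPPER = [bound for _name, bound in BOUNDS]


def partition_for(d):
    """Return the partition holding date d: the first bound strictly above d."""
    i = bisect_right(UPPER, d)
    if i == len(BOUNDS):
        raise ValueError(f"{d} is above the highest partition bound")
    return BOUNDS[i][0]


def prune(lo, hi):
    """Return the partitions that can hold rows with lo <= s_saledate <= hi."""
    first = bisect_right(UPPER, lo)
    last = min(bisect_right(UPPER, hi), len(BOUNDS) - 1)
    return [name for name, _bound in BOUNDS[first:last + 1]]
```

For example, a predicate covering 01-FEB-2000 through 15-MAY-2000 needs only sal00q1 and sal00q2; the other six partitions are never scanned. Because the bounds are exclusive, a date equal to a bound belongs to the next partition.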

When to Use Hash Partitioning

The way Oracle distributes data in hash partitions does not correspond to a business or a logical view of the data, as it does in range partitioning. Consequently, hash partitioning is not an effective way to manage historical data. However, hash partitions share some performance characteristics with range partitions. For example, partition pruning is limited to equality predicates. You can also use partition-wise joins, parallel index access, and parallel DML.

See Also:

"Partition-Wise Joins"

As a general rule, use hash partitioning for these purposes:

• To improve the availability and manageability of large tables or to enable parallel DML in tables that do not store historical data.

• To avoid data skew among partitions. Hash partitioning is an effective means of distributing data because Oracle hashes the data into a number of partitions, each of which can reside on a separate device. Thus, data is evenly spread over a sufficient number of devices to maximize I/O throughput. Similarly, you can use hash partitioning to distribute data evenly among the nodes of an MPP platform that uses Oracle Real Application Clusters.

• If it is important to use partition pruning and partition-wise joins according to a partitioning key that is mostly constrained by a distinct value or value list.

Note:

In hash partitioning, partition pruning uses only equality or IN-list predicates.

If you add or merge a hashed partition, Oracle automatically rearranges the rows to reflect the change in the number of partitions and subpartitions. The hash function that Oracle uses is especially designed to limit the cost of this reorganization. Instead of reshuffling all the rows in the table, Oracle uses an "add partition" logic that splits one and only one of the existing hashed partitions. Conversely, Oracle coalesces a partition by merging two existing hashed partitions.

Although the hash function's use of "add partition" logic dramatically improves the manageability of hash partitioned tables, it means that the hash function can cause a skew if the number of partitions of a hash partitioned table, or the number of subpartitions in each partition of a composite table, is not a power of two. In the worst case, the largest partition can be twice the size of the smallest. So for optimal performance, create a number of partitions and subpartitions for each partition that is a power of two. For example, 2, 4, 8, 16, 32, 64, 128, and so on.
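The factor-of-two skew follows from the way linear hashing splits one partition at a time. The following Python sketch uses the textbook linear hashing scheme as a stand-in; it is an assumption that this resembles Oracle's internal function, which is not documented here. The function names are hypothetical.

```python
def linear_hash_bucket(h, n_partitions):
    """Textbook linear hashing: map hash value h to one of n_partitions buckets."""
    low = 1 << (n_partitions.bit_length() - 1)  # largest power of two <= n
    split = n_partitions - low                  # buckets that have been split
    bucket = h % low
    if bucket < split:
        # This bucket was already split, so one more hash bit decides the half.
        bucket = h % (2 * low)
    return bucket


def partition_sizes(n_partitions, n_hashes=1 << 12):
    """Count how many of n_hashes uniformly spread hash values land in each bucket."""
    sizes = [0] * n_partitions
    for h in range(n_hashes):
        sizes[linear_hash_bucket(h, n_partitions)] += 1
    return sizes
```

With 8 partitions every bucket receives the same share. With 6 partitions, two of the original four buckets have been split in half while the other two have not, so the unsplit buckets are twice the size of the split ones, which is the worst case described above.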

The following example creates four hashed partitions for the table sales_hash using the column s_productid as the partition key:

CREATE TABLE sales_hash
  (s_productid  NUMBER,
   s_saledate   DATE,
   s_custid     NUMBER,
   s_totalprice NUMBER)
PARTITION BY HASH(s_productid)
PARTITIONS 4;


Specify partition names if you want to choose the names of the partitions. Otherwise, Oracle automatically generates internal names for the partitions. Also, you can use the STORE IN clause to assign hash partitions to tablespaces in a round-robin manner.
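Round-robin assignment simply cycles through the tablespace list as partitions are created. The following Python sketch shows the pattern under stated assumptions: the partition names p1, p2, and so on and the tablespace names are placeholders, not the internal names Oracle generates.

```python
from itertools import cycle


def store_in(n_partitions, tablespaces):
    """Assign partitions p1..pN to the given tablespaces in round-robin order."""
    next_tablespace = cycle(tablespaces)
    return {f"p{i}": next(next_tablespace) for i in range(1, n_partitions + 1)}
```

For four partitions and two tablespaces, the first and third partitions land in the first tablespace and the second and fourth in the second.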

See Also:

Oracle9i SQL Reference for partitioning syntax and the Oracle9i Database Administrator's Guide for more examples

When to Use List Partitioning

You should use list partitioning when you want to specifically map rows to partitions based on discrete values.

Unlike range and hash partitioning, multi-column partition keys are not supported for list partitioning. If a table is partitioned by list, the partitioning key can only consist of a single column of the table.

When to Use Composite Range-Hash Partitioning

Composite range-hash partitioning offers the benefits of both range and hash partitioning. With composite range-hash partitioning, Oracle first partitions by range. Then, within each range, Oracle creates subpartitions and distributes data within them using the same hashing algorithm it uses for hash partitioned tables.

Data placed in composite partitions is logically ordered only by the boundaries that define the range level partitions. The partitioning of data within each partition has no logical organization beyond the identity of the partition to which the subpartitions belong.

Consequently, tables and local indexes partitioned using the composite range-hash method:

• Support historical data at the partition level

• Support the use of subpartitions as units of parallelism for parallel operations such as PDML or space management and backup and recovery

• Are eligible for partition pruning and partition-wise joins on the range and hash dimensions

Using Composite Range-Hash Partitioning

Use the composite range-hash partitioning method for tables and local indexes if:

• Partitions must have a logical meaning to efficiently support historical data

• The contents of a partition can be spread across multiple tablespaces, devices, or nodes (of an MPP system)

• You require both partition pruning and partition-wise joins even when the pruning and join predicates use different columns of the partitioned table

• You require a degree of parallelism that is greater than the number of partitions for backup, recovery, and parallel operations

Most large tables in a data warehouse should use range partitioning. Composite partitioning should be used for very large tables or for data warehouses with a well-defined need for these conditions. When using the composite method, Oracle stores each subpartition on a different segment. Thus, the subpartitions may have properties that differ from the properties of the table or from the partition to which the subpartitions belong.

The following example partitions the table sales_range_hash by range on the column s_saledate to create four partitions that order data by time. Then, within each range partition, the data is further subdivided into 8 subpartitions by hash on the column s_productid:

CREATE TABLE sales_range_hash(
  s_productid  NUMBER,
  s_saledate   DATE,
  s_custid     NUMBER,
  s_totalprice NUMBER)
  PARTITION BY RANGE (s_saledate)
  SUBPARTITION BY HASH (s_productid) SUBPARTITIONS 8
  (PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
   PARTITION sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
   PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
   PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')));

Each hashed subpartition contains sales data for a single quarter, distributed by a hash of the product ID. The total number of subpartitions is 4 x 8 or 32.

In addition to this syntax, you can create subpartitions by using a subpartition template. This offers better ease in naming and control of location for tablespaces and subpartitions. The following statement illustrates this:

CREATE TABLE sales_range_hash(
  s_productid  NUMBER,
  s_saledate   DATE,
  s_custid     NUMBER,
  s_totalprice NUMBER)
  PARTITION BY RANGE (s_saledate)
  SUBPARTITION BY HASH (s_productid)
  SUBPARTITION TEMPLATE(
   SUBPARTITION sp1 TABLESPACE tbs1,
   SUBPARTITION sp2 TABLESPACE tbs2,
   SUBPARTITION sp3 TABLESPACE tbs3,
   SUBPARTITION sp4 TABLESPACE tbs4,
   SUBPARTITION sp5 TABLESPACE tbs5,
   SUBPARTITION sp6 TABLESPACE tbs6,
   SUBPARTITION sp7 TABLESPACE tbs7,
   SUBPARTITION sp8 TABLESPACE tbs8)
  (PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
   PARTITION sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
   PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
   PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')));

In this example, every partition has the same number of subpartitions. A sample mapping for sal99q1 is illustrated in Table 5-1. Similar mappings exist for sal99q2 through sal99q4.

Table 5-1 Subpartition Mapping

Subpartition    Tablespace
sal99q1_sp1     tbs1
sal99q1_sp2     tbs2
sal99q1_sp3     tbs3
sal99q1_sp4     tbs4
sal99q1_sp5     tbs5
sal99q1_sp6     tbs6
sal99q1_sp7     tbs7
sal99q1_sp8     tbs8
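The mapping in Table 5-1 is the template applied to one partition; applying it to all four partitions gives the full set of 32 subpartitions. The following Python sketch generates that mapping, assuming the partition_subpartition naming shown in the table. The function name expand_template is hypothetical.

```python
# Template entries (subpartition suffix, tablespace) as declared in the statement.
TEMPLATE = [(f"sp{i}", f"tbs{i}") for i in range(1, 9)]
PARTITIONS = ["sal99q1", "sal99q2", "sal99q3", "sal99q4"]


def expand_template(partitions, template):
    """Return (subpartition name, tablespace) for every partition/template pair."""
    return [
        (f"{partition}_{suffix}", tablespace)
        for partition in partitions
        for suffix, tablespace in template
    ]
```

Expanding the template over the four partitions yields 4 x 8 = 32 entries, and the entries for sal99q1 match the rows of Table 5-1.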

See Also:

Oracle9i SQL Reference for details regarding syntax and restrictions

When to Use Composite Range-List Partitioning

Composite range-list partitioning offers the benefits of both range and list partitioning. With composite range-list partitioning, Oracle first partitions by range. Then, within each range, Oracle creates subpartitions and distributes data within them to organize sets of data in a natural way as assigned by the list.

Data placed in composite partitions is logically ordered only by the boundaries that define the range level partitions.

Using Composite Range-List Partitioning

Use the composite range-list partitioning method for tables and local indexes if:

• Subpartitions have a logical grouping defined by the user

• The contents of a partition can be spread across multiple tablespaces, devices, or nodes (of an MPP system)

• You require both partition pruning and partition-wise joins even when the pruning and join predicates use different columns of the partitioned table

• You require a degree of parallelism that is greater than the number of partitions for backup, recovery, and parallel operations

Most large tables in a data warehouse should use range partitioning. Composite partitioning should be used for very large tables or for data warehouses with a well-defined need for these conditions. When using the composite method, Oracle stores each subpartition on a different segment. Thus, the subpartitions may have properties that differ from the properties of the table or from the partition to which the subpartitions belong.

This statement creates a table quarterly_regional_sales that is range partitioned on the txn_date field and list subpartitioned on state.

CREATE TABLE quarterly_regional_sales
(deptno     NUMBER,
 item_no    VARCHAR2(20),
 txn_date   DATE,
 txn_amount NUMBER,
 state      VARCHAR2(2))
PARTITION BY RANGE (txn_date)
SUBPARTITION BY LIST (state)
(PARTITION q1_1999 VALUES LESS THAN(TO_DATE('1-APR-1999','DD-MON-YYYY'))
 (SUBPARTITION q1_1999_northwest VALUES ('OR', 'WA'),
  SUBPARTITION q1_1999_southwest VALUES ('AZ', 'UT', 'NM'),
  SUBPARTITION q1_1999_northeast VALUES ('NY', 'VM', 'NJ'),
  SUBPARTITION q1_1999_southeast VALUES ('FL', 'GA'),
  SUBPARTITION q1_1999_northcentral VALUES ('SD', 'WI'),
  SUBPARTITION q1_1999_southcentral VALUES ('OK', 'TX')),
 PARTITION q2_1999 VALUES LESS THAN(TO_DATE('1-JUL-1999','DD-MON-YYYY'))
 (SUBPARTITION q2_1999_northwest VALUES ('OR', 'WA'),
  SUBPARTITION q2_1999_southwest VALUES ('AZ', 'UT', 'NM'),
  SUBPARTITION q2_1999_northeast VALUES ('NY', 'VM', 'NJ'),
  SUBPARTITION q2_1999_southeast VALUES ('FL', 'GA'),
  SUBPARTITION q2_1999_northcentral VALUES ('SD', 'WI'),
  SUBPARTITION q2_1999_southcentral VALUES ('OK', 'TX')),
 PARTITION q3_1999 VALUES LESS THAN (TO_DATE('1-OCT-1999','DD-MON-YYYY'))
 (SUBPARTITION q3_1999_northwest VALUES ('OR', 'WA'),
  SUBPARTITION q3_1999_southwest VALUES ('AZ', 'UT', 'NM'),
  SUBPARTITION q3_1999_northeast VALUES ('NY', 'VM', 'NJ'),
  SUBPARTITION q3_1999_southeast VALUES ('FL', 'GA'),
  SUBPARTITION q3_1999_northcentral VALUES ('SD', 'WI'),
  SUBPARTITION q3_1999_southcentral VALUES ('OK', 'TX')),
 PARTITION q4_1999 VALUES LESS THAN (TO_DATE('1-JAN-2000','DD-MON-YYYY'))
 (SUBPARTITION q4_1999_northwest VALUES ('OR', 'WA'),
  SUBPARTITION q4_1999_southwest VALUES ('AZ', 'UT', 'NM'),
  SUBPARTITION q4_1999_northeast VALUES ('NY', 'VM', 'NJ'),
  SUBPARTITION q4_1999_southeast VALUES ('FL', 'GA'),
  SUBPARTITION q4_1999_northcentral VALUES ('SD', 'WI'),
  SUBPARTITION q4_1999_southcentral VALUES ('OK', 'TX')));

You can create subpartitions in a composite partitioned table using a subpartition template. A subpartition template simplifies the specification of subpartitions by not requiring that a subpartition descriptor be specified for every partition in the table. Instead, you describe subpartitions only once in a template, then apply that subpartition template to every partition in the table. The following statement illustrates an example where you can choose the subpartition name and tablespace locations:

CREATE TABLE quarterly_regional_sales
(deptno     NUMBER,
 item_no    VARCHAR2(20),
 txn_date   DATE,
 txn_amount NUMBER,
 state      VARCHAR2(2))
PARTITION BY RANGE (txn_date)
SUBPARTITION BY LIST (state)
SUBPARTITION TEMPLATE
(SUBPARTITION northwest VALUES ('OR', 'WA') TABLESPACE ts1,
 SUBPARTITION southwest VALUES ('AZ', 'UT', 'NM') TABLESPACE ts2,
 SUBPARTITION northeast VALUES ('NY', 'VM', 'NJ') TABLESPACE ts3,
 SUBPARTITION southeast VALUES ('FL', 'GA') TABLESPACE ts4,
 SUBPARTITION northcentral VALUES ('SD', 'WI') TABLESPACE ts5,
 SUBPARTITION southcentral VALUES ('OK', 'TX') TABLESPACE ts6)
(PARTITION q1_1999 VALUES LESS THAN(TO_DATE('1-APR-1999','DD-MON-YYYY')),
 PARTITION q2_1999 VALUES LESS THAN(TO_DATE('1-JUL-1999','DD-MON-YYYY')),
 PARTITION q3_1999 VALUES LESS THAN(TO_DATE('1-OCT-1999','DD-MON-YYYY')),
 PARTITION q4_1999 VALUES LESS THAN(TO_DATE('1-JAN-2000','DD-MON-YYYY')));

See Also:

Oracle9i SQL Reference for details regarding syntax and restrictions


Partitioning and Data Segment Compression

You can compress several partitions or a complete partitioned heap-organized table. You do this by either defining a complete partitioned table as being compressed, or by defining it on a per-partition level. Partitions without a specific declaration inherit the attribute from the table definition or, if nothing is specified on table level, from the tablespace definition.

The decision whether to compress a partition or leave it uncompressed follows the same rules as for a nonpartitioned table. However, due to the capability of range and composite partitioning to separate data logically into distinct partitions, such a partitioned table is an ideal candidate for compressing parts of the data (partitions) that are mainly read-only. It is, for example, beneficial in all rolling window operations as a kind of intermediate stage before aging out old data. With data segment compression, you can keep more old data online, minimizing the burden of additional storage consumption.

You can also change any existing uncompressed table partition later on, add new compressed and uncompressed partitions, or change the compression attribute as part of any partition maintenance operation that requires data movement, such as MERGE PARTITION, SPLIT PARTITION, or MOVE PARTITION. The partitions can contain data or can be empty.

The access and maintenance of a partially or fully compressed partitioned table are the same as for a fully uncompressed partitioned table. Everything that applies to fully uncompressed partitioned tables is also valid for partially or fully compressed partitioned tables.

See Also:

Chapter 3, "Physical Design in Data Warehouses" for a generic discussion of data segment compression, Chapter 14, "Maintaining the Data Warehouse" for a sample rolling window operation with a range-partitioned table, and Oracle9i Database Performance Tuning Guide and Reference for an example of calculating the compression ratio

Data Segment Compression and Bitmap Indexes

If you want to use data segment compression on partitioned tables with bitmap indexes, you need to do the following before you introduce the compression attribute for the first time:

1. Mark bitmap indexes unusable.
2. Set the compression attribute.
3. Rebuild the indexes.


The first time you make a compressed partition part of an already existing, fully uncompressed partitioned table, you must either drop all existing bitmap indexes or mark them UNUSABLE prior to adding a compressed partition. This must be done irrespective of whether any partition contains any data. It is also independent of the operation that causes one or more compressed partitions to become part of the table. This does not apply to a partitioned table having B-tree indexes only.

This rebuilding of the bitmap index structures is necessary to accommodate the potentially higher number of rows stored for each data block with data segment compression enabled and must be done only for the first time. All subsequent operations, whether they affect compressed or uncompressed partitions, or change the compression attribute, behave identically for uncompressed, partially compressed, or fully compressed partitioned tables.

To avoid the recreation of any bitmap index structure, Oracle recommends creating every partitioned table with at least one compressed partition whenever you plan to partially or fully compress the partitioned table in the future. This compressed partition can stay empty or can even be dropped after the partitioned table is created.

Having a partitioned table with compressed partitions can lead to slightly larger bitmap index structures for the uncompressed partitions. The bitmap index structures for the compressed partitions, however, are in most cases smaller than the corresponding bitmap index structures before data segment compression. This depends heavily on the achieved compression rates.

Note:

Oracle will raise an error if compression is introduced to an object for the first time and there are usable bitmap index segments.

Example of Data Segment Compression and Partitioning

The following statement moves and compresses an already existing partition sales_q1_1998 of table sales:

ALTER TABLE sales
MOVE PARTITION sales_q1_1998 TABLESPACE ts_arch_q1_1998 COMPRESS;

If you use the MOVE statement, the local indexes for partition sales_q1_1998 become unusable. You have to rebuild them afterward, as follows:

ALTER TABLE sales
MODIFY PARTITION sales_q1_1998 REBUILD UNUSABLE LOCAL INDEXES;


The following statement merges two existing partitions into a new, compressed partition, residing in a separate tablespace. The local bitmap indexes have to be rebuilt afterward, as follows:

ALTER TABLE sales
MERGE PARTITIONS sales_q1_1998, sales_q2_1998
INTO PARTITION sales_1_1998 TABLESPACE ts_arch_1_1998
COMPRESS UPDATE GLOBAL INDEXES;

See Also:

Oracle9i Database Performance Tuning Guide and Reference for details regarding how to estimate the compression ratio when using data segment compression

Partition Pruning

Partition pruning is an essential performance feature for data warehouses. In partition pruning, the cost-based optimizer analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded partitions when building the partition access list. This enables Oracle to perform operations only on those partitions that are relevant to the SQL statement. Oracle prunes partitions when you use range, LIKE, equality, and IN-list predicates on the range or list partitioning columns, and when you use equality and IN-list predicates on the hash partitioning columns.

Partition pruning dramatically reduces the amount of data retrieved from disk and shortens processing time, improving query performance and resource utilization. If you partition the index and table on different columns (with a global, partitioned index), partition pruning also eliminates index partitions even when the partitions of the underlying table cannot be eliminated.

On composite partitioned objects, Oracle can prune at both the range partition level and at the hash or list subpartition level using the relevant predicates. Refer to the table sales_range_hash earlier, partitioned by range on the column s_saledate and subpartitioned by hash on column s_productid, and consider the following example:

SELECT * FROM sales_range_hash
WHERE s_saledate BETWEEN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY'))
  AND (TO_DATE('01-OCT-1999', 'DD-MON-YYYY'))
  AND s_productid = 1200;

Oracle uses the predicate on the partitioning columns to perform partition pruning as follows:

• When using range partitioning, Oracle accesses only partitions sal99q3 and sal99q4. Partition sal99q3 holds dates from 01-JUL-1999 up to but not including 01-OCT-1999. Because BETWEEN is inclusive, the upper bound 01-OCT-1999 itself falls in sal99q4.

• When using hash subpartitioning, Oracle accesses only the one subpartition in each partition that stores the rows with s_productid=1200. The mapping between the subpartition and the predicate is calculated based on Oracle's internal hash distribution function.

Pruning Using DATE Columns

In the earlier partition pruning example, the date value was fully specified with four digits for the year using the TO_DATE function, just as it was in the underlying table's range partitioning description. While this is the recommended format for specifying date values, the optimizer can prune partitions using the predicates on s_saledate when you use other formats, as in the following example:

SELECT * FROM sales_range_hash
WHERE s_saledate BETWEEN TO_DATE('01-JUL-99', 'DD-MON-RR')
  AND TO_DATE('01-OCT-99', 'DD-MON-RR')
  AND s_productid = 1200;

Although this uses the DD-MON-RR format, which is not the same as the base partition, the optimizer can still prune properly.

If you execute an EXPLAIN PLAN statement on the query, the PARTITION_START and PARTITION_STOP columns of the output table do not specify which partitions Oracle is accessing. Instead, you see the keyword KEY for both columns. The keyword KEY for both columns means that partition pruning occurs at run-time. It can also affect the execution plan, because the information about the pruned partitions is missing, compared with the same statement written with the same TO_DATE format as the partition table definition.

Avoiding I/O Bottlenecks

To avoid I/O bottlenecks, when Oracle is not scanning all partitions because some have been eliminated by pruning, spread each partition over several devices. On MPP systems, spread those devices over multiple nodes.

Partition-Wise Joins

Partition-wise joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel. This significantly reduces response time and improves the use of both CPU and memory resources. In Oracle Real Application Clusters environments, partition-wise joins also avoid or at least limit the data traffic over the interconnect, which is the key to achieving good scalability for massive join operations.

Partition-wise joins can be full or partial. Oracle decides which type of join to use.

Full Partition-Wise Joins

A full partition-wise join divides a large join into smaller joins between a pair of partitions from the two joined tables. To use this feature, you must equipartition both tables on their join keys. For example, consider a large join between a sales table and a customer table on the column customerid. The query "find the records of all customers who bought more than 100 articles in Quarter 3 of 1999" is a typical example of a SQL statement performing such a join. The following is an example of this:

SELECT c.cust_last_name, COUNT(*)
FROM sales s, customers c
WHERE s.cust_id = c.cust_id
  AND s.time_id BETWEEN TO_DATE('01-JUL-1999', 'DD-MON-YYYY')
  AND (TO_DATE('01-OCT-1999', 'DD-MON-YYYY'))
GROUP BY c.cust_last_name HAVING COUNT(*) > 100;

This large join is typical in data warehousing environments. The entire customer table is joined with one quarter of the sales data. In large data warehouse applications, this might mean joining millions of rows. The join method to use in that case is obviously a hash join. You can reduce the processing time for this hash join even more if both tables are equipartitioned on the customerid column. This enables a full partition-wise join.

When you execute a full partition-wise join in parallel, the granule of parallelism, as described under "Granules of Parallelism", is a partition. As a result, the degree of parallelism is limited to the number of partitions. For example, you require at least 16 partitions to set the degree of parallelism of the query to 16.

You can use various partitioning methods to equipartition both tables on the column customerid with 16 partitions. These methods are described in these subsections.

(ash-(ash

This is the simplest method# the customers and sales tables are both partitioned b* hash

into 02 partitions/ on the s_customerid and c_customerid colmns" This partitionin!

method enables fll partition)wise :oin when the tables are :oined on s_customerid and

c_customerid/ both representin! the same cstomer identification nmber" Becase *o

are sin! the same hash fnction to distribte the same information -cstomer I%3 intothe same nmber of hash partitions/ *o can :oin the e4ivalent partitions" The* arestorin! the same vales"

In serial/ this :oin is performed between pairs of matchin! hash partitions/ one at a time"$hen one partition pair has been :oined/ the :oin of another partition pair be!ins" The :oin completes when the 02 partition pairs have been processed"

Note:

A pair of matching hash partitions is defined as one partition with the same partition number from each table. For example, with full partition-wise joins we join partition 0 of sales with partition 0 of customers, partition 1 of sales with partition 1 of customers, and so on.
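As a minimal sketch of the hash-hash layout described above, the following DDL equipartitions both tables into 16 hash partitions on the customer ID. The column lists are hypothetical and trimmed for illustration; only the names s_customerid, c_customerid, and s_salesdate come from the text.

```sql
-- Hypothetical, trimmed-down column lists; only the partitioning clauses matter here.
CREATE TABLE customers (
  c_customerid  NUMBER        NOT NULL,
  c_name        VARCHAR2(40)              -- hypothetical attribute column
)
PARTITION BY HASH (c_customerid)
PARTITIONS 16;

CREATE TABLE sales (
  s_customerid  NUMBER        NOT NULL,
  s_salesdate   DATE          NOT NULL,
  s_amount      NUMBER(10,2)              -- hypothetical measure column
)
PARTITION BY HASH (s_customerid)
PARTITIONS 16;
```

Because both tables use the same partitioning method, the same number of partitions, and the same key values, a join on s_customerid = c_customerid can be considered for a full partition-wise join.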

Parallel execution of a full partition-wise join is a straightforward parallelization of the serial execution. Instead of joining one partition pair at a time, 16 partition pairs are joined in parallel by the 16 query servers. Figure 5-1 illustrates the parallel execution of a full partition-wise join.

Figure 5-1 Parallel Execution of a Full Partition-wise Join

In Figure 5-1, assume that the degree of parallelism and the number of partitions are the same, in other words, 16 for both. Defining more partitions than the degree of parallelism may improve load balancing and limit possible skew in the execution. If you have more partitions than query servers, when one query server completes the join of one pair of partitions, it requests that the query coordinator give it another pair to join. This process repeats until all pairs have been processed. This method enables the load to be balanced dynamically when the number of partition pairs is greater than the degree of parallelism, for example, 64 partitions with a degree of parallelism of 16.

Note:

To guarantee an equal work distribution, the number of partitions should always be a multiple of the degree of parallelism.

In Oracle Real Application Clusters environments running on shared-nothing or MPP platforms, placing partitions on nodes is critical to achieving good scalability. To avoid remote I/O, both matching partitions should have affinity to the same node. Partition pairs should be spread over all nodes to avoid bottlenecks and to use all CPU resources available on the system.

Nodes can host multiple pairs when there are more pairs than nodes. For example, with an 8-node system and 16 partition pairs, each node receives two pairs.

See Also:

Oracle9i Real Application Clusters Concepts for more information on data affinity

(Composite-Hash)-Hash

This method is a variation of the hash-hash method. The sales table is a typical example of a table storing historical data. For all the reasons mentioned under the heading "When to Use Range Partitioning", range is the logical initial partitioning method.

For example, assume you want to partition the sales table into eight partitions by range on the column s_salesdate. Also assume you have two years and that each partition represents a quarter. Instead of using range partitioning, you can use composite partitioning to enable a full partition-wise join while preserving the partitioning on s_salesdate. Partition the sales table by range on s_salesdate and then subpartition each partition by hash on s_customerid using 16 subpartitions for each partition, for a total of 128 subpartitions. The customers table can still use hash partitioning with 16 partitions.
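The composite layout just described could be declared roughly as follows. This is a sketch under stated assumptions: the partition names, the s_amount column, and the choice of 1999 and 2000 as the two years are illustrative and not taken from the guide.

```sql
-- Range partition by quarter, then hash subpartition on the join key.
-- 8 range partitions x 16 hash subpartitions = 128 subpartitions.
CREATE TABLE sales (
  s_customerid  NUMBER        NOT NULL,
  s_salesdate   DATE          NOT NULL,
  s_amount      NUMBER(10,2)              -- hypothetical measure column
)
PARTITION BY RANGE (s_salesdate)
SUBPARTITION BY HASH (s_customerid) SUBPARTITIONS 16
(
  PARTITION sales_q1_1999 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
  PARTITION sales_q2_1999 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
  PARTITION sales_q3_1999 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
  PARTITION sales_q4_1999 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')),
  PARTITION sales_q1_2000 VALUES LESS THAN (TO_DATE('01-APR-2000', 'DD-MON-YYYY')),
  PARTITION sales_q2_2000 VALUES LESS THAN (TO_DATE('01-JUL-2000', 'DD-MON-YYYY')),
  PARTITION sales_q3_2000 VALUES LESS THAN (TO_DATE('01-OCT-2000', 'DD-MON-YYYY')),
  PARTITION sales_q4_2000 VALUES LESS THAN (TO_DATE('01-JAN-2001', 'DD-MON-YYYY'))
);
```

The customers table keeps a plain PARTITION BY HASH (c_customerid) PARTITIONS 16 clause, so the hash dimension still matches on both sides of the join.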

When you use the method just described, a full partition-wise join works similarly to the one created by the hash-hash method. The join is still divided into 16 smaller joins between hash partition pairs from both tables. The difference is that now each hash partition in the sales table is composed of a set of 8 subpartitions, one from each range partition.

Figure 5-2 illustrates how the hash partitions are formed in the sales table. Each cell represents a subpartition. Each row corresponds to one range partition, for a total of 8 range partitions. Each range partition has 16 subpartitions. Each column corresponds to one hash partition for a total of 16 hash partitions; each hash partition has 8 subpartitions. Note that hash partitions can be defined only if all partitions have the same number of subpartitions, in this case, 16.

Hash partitions are implicit in a composite table. However, Oracle does not record them in the data dictionary, and you cannot manipulate them with DDL commands as you can range partitions.

Figure 5-2 Range and Hash Partitions of a Composite Table

(Composite-Hash)-Hash partitioning is effective because it lets you combine pruning (on s_salesdate) with a full partition-wise join (on customerid). In the previous example query, pruning is achieved by scanning only the subpartitions corresponding to Q3 of 1999, in other words, row number 3 in Figure 5-2. Oracle then joins these subpartitions with the customer table, using a full partition-wise join.

All characteristics of the hash-hash partition-wise join apply to the composite-hash partition-wise join. In particular, for this example, these two points are common to both methods:

• The degree of parallelism for this full partition-wise join cannot exceed 16. Even though the sales table has 128 subpartitions, it has only 16 hash partitions.

• The rules for data placement on MPP systems apply here. The only difference is that a hash partition is now a collection of subpartitions. You must ensure that all these subpartitions are placed on the same node as the matching hash partition from the other table. For example, in Figure 5-2, store hash partition 9 of the sales table, shown by the eight circled subpartitions, on the same node as hash partition 9 of the customers table.

(Composite-List)-List

The (Composite-List)-List method resembles that for (Composite-Hash)-Hash partition-wise joins.

Composite-Composite (Hash/List Dimension)

If needed, you can also partition the customer table by the composite method. For example, you partition it by range on a postal code column to enable pruning based on postal code. You then subpartition it by hash on customerid using the same number of partitions (16) to enable a partition-wise join on the hash dimension.

Range-Range and List-List

You can also join range partitioned tables with range partitioned tables and list partitioned tables with list partitioned tables in a partition-wise manner, but this is relatively uncommon. This is more complex to implement because you must know the distribution of the data before performing the join. Furthermore, if you do not correctly identify the partition bounds so that you have partitions of equal size, data skew during the execution may result.

The basic principle for using range-range and list-list is the same as for using hash-hash: you must equipartition both tables. This means that the number of partitions must be the same and the partition bounds must be identical. For example, assume that you know in advance that you have 10 million customers, and that the values for customerid vary from 1 to 10,000,000. In other words, you have 10 million possible different values. To create 16 partitions, you can range partition both tables, sales on s_customerid and customers on c_customerid. You should define partition bounds for both tables in order to generate partitions of the same size. In this example, partition bounds should be defined as 625001, 1250001, 1875001, ... 10000001, so that each partition contains 625000 rows.
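The bounds in this example follow from 10,000,000 / 16 = 625,000 values for each partition, so the upper bound of partition k is 625,000 × k + 1. As an illustrative sketch, with hypothetical partition names and a hypothetical c_name column, the customers table could be declared like this:

```sql
-- 16 equal-width range partitions over customer IDs 1 .. 10,000,000.
-- VALUES LESS THAN is exclusive, so bound k is 625000 * k + 1.
CREATE TABLE customers (
  c_customerid  NUMBER        NOT NULL,
  c_name        VARCHAR2(40)              -- hypothetical attribute column
)
PARTITION BY RANGE (c_customerid)
(
  PARTITION cust_p01 VALUES LESS THAN (625001),
  PARTITION cust_p02 VALUES LESS THAN (1250001),
  PARTITION cust_p03 VALUES LESS THAN (1875001),
  PARTITION cust_p04 VALUES LESS THAN (2500001),
  PARTITION cust_p05 VALUES LESS THAN (3125001),
  PARTITION cust_p06 VALUES LESS THAN (3750001),
  PARTITION cust_p07 VALUES LESS THAN (4375001),
  PARTITION cust_p08 VALUES LESS THAN (5000001),
  PARTITION cust_p09 VALUES LESS THAN (5625001),
  PARTITION cust_p10 VALUES LESS THAN (6250001),
  PARTITION cust_p11 VALUES LESS THAN (6875001),
  PARTITION cust_p12 VALUES LESS THAN (7500001),
  PARTITION cust_p13 VALUES LESS THAN (8125001),
  PARTITION cust_p14 VALUES LESS THAN (8750001),
  PARTITION cust_p15 VALUES LESS THAN (9375001),
  PARTITION cust_p16 VALUES LESS THAN (10000001)
);
```

The sales table would need the same 16 bounds on its own customer ID column for the two tables to be equipartitioned.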

Range-Composite, Composite-Composite (Range Dimension)

Finally, you can also subpartition one or both tables on another column. Therefore, the range-composite and composite-composite methods on the range dimension are also valid for enabling a full partition-wise join on the range dimension.

Partial Partition-Wise Joins

Oracle can perform partial partition-wise joins only in parallel. Unlike full partition-wise joins, partial partition-wise joins require you to partition only one table on the join key, not both tables. The partitioned table is referred to as the reference table. The other table may or may not be partitioned. Partial partition-wise joins are more common than full partition-wise joins.

To execute a partial partition-wise join, Oracle dynamically repartitions the other table based on the partitioning of the reference table. Once the other table is repartitioned, the execution is similar to a full partition-wise join.

The performance advantage that partial partition-wise joins have over joins in non-partitioned tables is that the reference table is not moved during the join operation. Parallel joins between non-partitioned tables require both input tables to be redistributed on the join key. This redistribution operation involves exchanging rows between parallel execution servers. This is a CPU-intensive operation that can lead to excessive interconnect traffic in Oracle Real Application Clusters environments. Partitioning large tables on a join key, either a foreign or primary key, prevents this redistribution every time the table is joined on that key. Of course, if you choose a foreign key to partition the table, which is the most common scenario, select a foreign key that is involved in many queries.

To illustrate partial partition-wise joins, consider the previous sales/customer example. Assume that customers is not partitioned or is partitioned on a column other than c_customerid. Because sales is often joined with customers on customerid, and because this join dominates our application workload, partition sales on s_customerid to enable partial partition-wise join every time customers and sales are joined. As in full partition-wise join, you have several alternatives:

Hash/List

The simplest method to enable a partial partition-wise join is to partition sales by hash on s_customerid. The number of partitions determines the maximum degree of parallelism, because the partition is the smallest granule of parallelism for partial partition-wise join operations.

The parallel execution of a partial partition-wise join is illustrated in Figure 5-3, which assumes that both the degree of parallelism and the number of partitions of sales are 16. The execution involves two sets of query servers: one set, labeled set 1 in Figure 5-3, scans the customers table in parallel. The granule of parallelism for the scan operation is a range of blocks.

Rows from customers that are selected by the first set, in this case all rows, are redistributed to the second set of query servers by hashing customerid. For example, all rows in customers that could have matching rows in partition P1 of sales are sent to query server 1 in the second set. Rows received by the second set of query servers are joined with the rows from the corresponding partitions in sales. Query server number 1 in the second set joins all customers rows that it receives with partition P1 of sales.

Figure 5-3 Partial Partition-wise Join

Note:

This section is based on range-hash, but it also applies for range-list partial partition-wise joins.

Considerations for full partition-wise joins also apply to partial partition-wise joins:

• The degree of parallelism does not need to equal the number of partitions. In Figure 5-3, the query executes with two sets of 16 query servers. In this case, Oracle assigns 1 partition to each query server of the second set. Again, the number of partitions should always be a multiple of the degree of parallelism.

• In Oracle Real Application Clusters environments on shared-nothing platforms (MPPs), each hash partition of sales should preferably have affinity to only one node in order to avoid remote I/Os. Also, spread partitions over all nodes to avoid bottlenecks and use all CPU resources available on the system. A node can host multiple partitions when there are more partitions than nodes.

See Also:

Oracle9i Real Application Clusters Concepts for more information on data affinity

Composite

As with full partition-wise joins, the prime partitioning method for the sales table is to use the range method on column s_salesdate. This is because sales is a typical example of a table that stores historical data. To enable a partial partition-wise join while preserving this range partitioning, subpartition sales by hash on column s_customerid using 16 subpartitions for each partition. Pruning and partial partition-wise joins can be used together if a query joins customers and sales and if the query has a selection predicate on s_salesdate.

When sales is composite, the granule of parallelism for a partial partition-wise join is a hash partition and not a subpartition. Refer to Figure 5-2 for an illustration of a hash partition in a composite table. Again, the number of hash partitions should be a multiple of the degree of parallelism. Also, on an MPP system, ensure that each hash partition has affinity to a single node. In the previous example, the eight subpartitions composing a hash partition should have affinity to the same node.

Note:

This section is based on range-hash, but it also applies for range-list partial partition-wise joins.

Range

Finally, you can use range partitioning on s_customerid to enable a partial partition-wise join. This works similarly to the hash method, but a side effect of range partitioning is that the resulting data distribution could be skewed if the size of the partitions differs. Moreover, this method is more complex to implement because it requires prior knowledge of the values of the partitioning column that is also a join key.

Benefits of Partition-Wise Joins

Partition-wise joins offer benefits described in this section:

• Reduction of Communications Overhead

• Reduction of Memory Requirements

Reduction of Communications Overhead

When executed in parallel, partition-wise joins reduce communications overhead. This is because, in the default case, parallel execution of a join operation by a set of parallel execution servers requires the redistribution of each table on the join column into disjoint subsets of rows. These disjoint subsets of rows are then joined pair-wise by a single parallel execution server.


last join, while the other pairs will have to wait. This is because, in the beginning of the execution, each parallel execution server works on a different partition pair. At the end of this first phase, only one pair is left. Thus, a single parallel execution server joins this remaining pair while all other parallel execution servers are idle.

Sometimes, parallel joins can cause remote I/Os. For example, on Oracle Real Application Clusters environments running on MPP configurations, if a pair of matching partitions is not collocated on the same node, a partition-wise join requires extra internode communication due to remote I/O. This is because Oracle must transfer at least one partition to the node where the join is performed. In this case, it is better to explicitly redistribute the data than to use a partition-wise join.

Miscellaneous Partition Operations

The following partition operations are needed on a regular basis:

• Adding Partitions

• Dropping Partitions

• Exchanging Partitions

• Moving Partitions

• Splitting and Merging Partitions

• Truncating Partitions

• Coalescing Partitions

Adding Partitions

Different types of partitions require slightly different syntax when being added. Basic topics are:

• Adding a Partition to a Range-Partitioned Table

• Adding a Partition to a Hash-Partitioned Table

• Adding a Partition to a List-Partitioned Table

Adding a Partition to a Range-Partitioned Table

Use the ALTER TABLE ... ADD PARTITION statement to add a new partition to the "high" end (the point after the last existing partition). To add a partition at the beginning or in the middle of a table, use the SPLIT PARTITION clause.

For example, consider the table, sales, which contains data for the current month in addition to the previous 12 months. On January 1, 1999, you add a partition for January, which is stored in tablespace tsx.

ALTER TABLE sales
  ADD PARTITION jan96 VALUES LESS THAN ('01-FEB-1999')
  TABLESPACE tsx;

You cannot add a partition to a range-partitioned table that has a MAXVALUE partition, but you can split the MAXVALUE partition. By doing so, you effectively create a new partition defined by the values that you specify, and a second partition that remains the MAXVALUE partition.
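For the MAXVALUE case, a split might look like the sketch below. The partition names sales_max and feb1999, the split date, and the DATE type of the partitioning column are assumptions made for illustration.

```sql
-- sales_max is assumed to be the existing VALUES LESS THAN (MAXVALUE) partition.
-- Rows below the AT value go to the first new partition (feb1999);
-- the remaining rows stay in the partition that keeps the MAXVALUE bound.
ALTER TABLE sales
  SPLIT PARTITION sales_max AT (TO_DATE('01-MAR-1999', 'DD-MON-YYYY'))
  INTO (PARTITION feb1999, PARTITION sales_max);
```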

Local and global indexes associated with the range-partitioned table remain usable.

Adding a Partition to a Hash-Partitioned Table

When you add a partition to a hash-partitioned table, Oracle populates the new partition with rows rehashed from an existing partition (selected by Oracle) as determined by the hash function.

The following statements show two ways of adding a hash partition to table scubagear. Choosing the first statement adds a new hash partition whose partition name is system generated, and which is placed in the table's default tablespace. The second statement also adds a new hash partition, but that partition is explicitly named p_named and is created in tablespace gear5.

ALTER TABLE scubagear ADD PARTITION;
ALTER TABLE scubagear
  ADD PARTITION p_named TABLESPACE gear5;

Adding a Partition to a List-Partitioned Table

The following statement illustrates adding a new partition to a list-partitioned table. In this example, physical attributes and NOLOGGING are specified for the partition being added.

ALTER TABLE q1_sales_by_region
  ADD PARTITION q1_nonmainland VALUES ('HI', 'PR')
  STORAGE (INITIAL 20K NEXT 20K) TABLESPACE tbs_3
  NOLOGGING;

Any value in the set of literal values that describe the partition being added must not exist in any of the other partitions of the table.

You cannot add a partition to a list-partitioned table that has a default partition, but you can split the default partition. By doing so, you effectively create a new partition defined by the values that you specify, and a second partition that remains the default partition.

Local and global indexes associated with the list-partitioned table remain usable.

Dropping Partitions

You can drop partitions from range, composite, list, or composite range-list partitioned tables. For hash-partitioned tables, or hash subpartitions of range-hash partitioned tables, you must perform a coalesce operation instead.

Dropping a Table Partition

Use one of the following statements to drop a table partition or subpartition:

• ALTER TABLE ... DROP PARTITION to drop a table partition

• ALTER TABLE ... DROP SUBPARTITION to drop a subpartition of a range-list partitioned table

A typical example of dropping a partition containing data and referential integrity objects is as follows:

ALTER TABLE sales
  DISABLE CONSTRAINT dname_sales1;
ALTER TABLE sales DROP PARTITION dec98;
ALTER TABLE sales
  ENABLE CONSTRAINT dname_sales1;

In this example, you disable the integrity constraints, issue the ALTER TABLE ... DROP PARTITION statement, then enable the integrity constraints. This method is most appropriate for large tables where the partition being dropped contains a significant percentage of the total data in the table.

See Also:

Oracle9i Database Administrator's Guide for more detailed examples

Exchanging Partitions

You can convert a partition (or subpartition) into a nonpartitioned table, and a nonpartitioned table into a partition (or subpartition) of a partitioned table by exchanging their data segments. You can also convert a hash-partitioned table into a partition of a range-hash partitioned table, or convert the partition of the range-hash partitioned table into a hash-partitioned table. Similarly, you can convert a list-partitioned table into a partition of a range-list partitioned table, or convert the partition of the range-list partitioned table into a list-partitioned table.

A typical example of exchanging into a nonpartitioned table follows. In this example, table stocks can be range, hash, or list partitioned.

ALTER TABLE stocks
  EXCHANGE PARTITION p3 WITH TABLE stock_table_3;

See Also:

Oracle9i Database Administrator's Guide for more detailed examples

Moving Partitions

Use the MOVE PARTITION clause to move a partition. For example, to move the most active partition to a tablespace that resides on its own disk (in order to balance I/O) and to not log the action, issue the following statement:

ALTER TABLE parts MOVE PARTITION depot2
  TABLESPACE ts094 NOLOGGING;

This statement always drops the partition's old segment and creates a new segment, even if you do not specify a new tablespace.

See Also:

Oracle9i Database Administrator's Guide for more detailed examples

Splitting and Merging Partitions

The SPLIT PARTITION clause of the ALTER TABLE or ALTER INDEX statement is used to redistribute the contents of a partition into two new partitions. Consider doing this when a partition becomes too large and causes backup, recovery, or maintenance operations to take a long time to complete. You can also use the SPLIT PARTITION clause to redistribute the I/O load.

This clause cannot be used for hash partitions or subpartitions.

A typical example is to split a range-partitioned table as follows:

ALTER TABLE vet_cats SPLIT PARTITION
  fee_katy AT (100) INTO ( PARTITION
  fee_katy1, PARTITION fee_katy2 );
ALTER INDEX JAF1 REBUILD PARTITION fee_katy1;
ALTER INDEX JAF1 REBUILD PARTITION fee_katy2;
ALTER INDEX VET REBUILD PARTITION vet_parta;
ALTER INDEX VET REBUILD PARTITION vet_partb;

See Also:

Oracle9i Database Administrator's Guide for more detailed examples

Use the ALTER TABLE ... MERGE PARTITIONS statement to merge the contents of two partitions into one partition. The two original partitions are dropped, as are any corresponding local indexes.

You cannot use this statement for a hash-partitioned table or for hash subpartitions of a range-hash partitioned table.

The following statement merges two subpartitions of a table partitioned using range-list method into a new subpartition located in tablespace tbs_west:

ALTER TABLE quarterly_regional_sales
  MERGE SUBPARTITIONS q1_1999_northwest, q1_1999_southwest
  INTO SUBPARTITION q1_1999_west
  TABLESPACE tbs_west;

Truncating Partitions

Use the ALTER TABLE ... TRUNCATE PARTITION statement to remove all rows from a table partition. Truncating a partition is similar to dropping a partition, except that the partition is emptied of its data, but not physically dropped.

You cannot truncate an index partition. However, if there are local indexes defined for the table, the ALTER TABLE ... TRUNCATE PARTITION statement truncates the matching partition in each local index.

The following example illustrates a partition that contains data and has referential integrity constraints:

ALTER TABLE sales
  DISABLE CONSTRAINT dname_sales1;
ALTER TABLE sales TRUNCATE PARTITION dec94;
ALTER TABLE sales
  ENABLE CONSTRAINT dname_sales1;

In this example, you disable the integrity constraints, issue the ALTER TABLE ... TRUNCATE PARTITION statement, then re-enable the integrity constraints.

This method is most appropriate for large tables where the partition being truncated contains a significant percentage of the total data in the table.

See Also:

Oracle9i Database Administrator's Guide for more detailed examples

Coalescing Partitions

Coalescing partitions is a way of reducing the number of partitions in a hash-partitioned table, or the number of subpartitions in a range-hash partitioned table. When a hash partition is coalesced, its contents are redistributed into one or more remaining partitions determined by the hash function. The specific partition that is coalesced is selected by Oracle, and is dropped after its contents have been redistributed.

The following statement illustrates a typical case of reducing by one the number of partitions in a table:

ALTER TABLE ouu1
  COALESCE PARTITION;
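For the range-hash case mentioned above, the coalesce is applied to the subpartitions of one range partition. The following is a sketch only; the table and partition names are hypothetical.

```sql
-- Reduce by one the number of hash subpartitions in a single range partition.
-- Oracle chooses which subpartition to remove and rehashes its rows.
ALTER TABLE sales
  MODIFY PARTITION sales_q1_1999 COALESCE SUBPARTITION;
```

Note that coalescing subpartitions in only some range partitions leaves the partitions with different subpartition counts, which matters for the equal-count condition described earlier for partition-wise joins.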

See Also:

Oracle9i Database Administrator's Guide for more detailed examples

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


Indexes

This chapter describes how to use indexes in a data warehousing environment and discusses the following types of index:

• Bitmap Indexes

• B-tree Indexes

• Local Indexes Versus Global Indexes

See Also:

Oracle9i Database Concepts for general information regarding indexing

Bitmap Indexes

Bitmap indexes are widely used in data warehousing environments. The environments typically have large amounts of data and ad hoc queries, but a low level of concurrent DML transactions. For such applications, bitmap indexing provides:

• Reduced response time for large classes of ad hoc queries

• Reduced storage requirements compared to other indexing techniques

• Dramatic performance gains even on hardware with a relatively small number of CPUs or a small amount of memory

• Efficient maintenance during parallel DML and loads

Fully indexing a large table with a traditional B-tree index can be prohibitively expensive in terms of space because the indexes can be several times larger than the data in the table. Bitmap indexes are typically only a fraction of the size of the indexed data in the table.

Note:

Bitmap indexes are available only if you have purchased the Oracle9i Enterprise Edition. See Oracle9i Database New Features for more information about the features available in Oracle9i and the Oracle9i Enterprise Edition.

An index provides pointers to the rows in a table that contain a given key value. A regular index stores a list of rowids for each key corresponding to the rows with that key value. In a bitmap index, a bitmap for each key value replaces a list of rowids.

Each bit in the bitmap corresponds to a possible rowid, and if the bit is set, it means that the row with the corresponding rowid contains the key value. A mapping function converts the bit position to an actual rowid, so that the bitmap index provides the same functionality as a regular index. If the number of different key values is small, bitmap indexes save space.

Bitmap indexes are most effective for queries that contain multiple conditions in the WHERE clause. Rows that satisfy some, but not all, conditions are filtered out before the table itself is accessed. This improves response time, often dramatically.

Benefits for Data Warehousing Applications

Bitmap indexes are primarily intended for data warehousing applications where users query the data rather than update it. They are not suitable for OLTP applications with large numbers of concurrent transactions modifying the data.

Parallel query and parallel DML work with bitmap indexes as they do with traditional indexes. Bitmap indexing also supports parallel create indexes and concatenated indexes.

See Also:

Chapter 17, "Schema Modeling Techniques" for further information about using bitmap indexes in data warehousing environments

Cardinality

The advantages of using bitmap indexes are greatest for columns in which the ratio of the number of distinct values to the number of rows in the table is under 1%. We refer to this ratio as the degree of cardinality. A gender column, which has only two distinct values (male and female), is ideal for a bitmap index. However, data warehouse administrators also build bitmap indexes on columns with higher cardinalities.

For example, on a table with one million rows, a column with 10,000 distinct values is a candidate for a bitmap index. A bitmap index on this column can outperform a B-tree index, particularly when this column is often queried in conjunction with other indexed columns. In fact, in a typical data warehouse environment, a bitmap index can be considered for any non-unique column.
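One way to estimate the degree of cardinality of a candidate column is to compare its distinct-value count with the row count. The query below is a simple sketch that assumes a non-empty customers table with a cust_gender column, as in the example that follows in this chapter; on an empty table the division raises an error.

```sql
-- Degree of cardinality = distinct values / total rows.
-- A ratio under 0.01 (1%) matches the guideline given above.
SELECT COUNT(DISTINCT cust_gender)             AS distinct_values,
       COUNT(*)                                AS total_rows,
       COUNT(DISTINCT cust_gender) / COUNT(*)  AS degree_of_cardinality
FROM   customers;
```

For the one-million-row example above, 10,000 distinct values give a ratio of 0.01, which is at the edge of the guideline.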

B-tree indexes are most effective for high-cardinality data: that is, for data with many possible values, such as customer_name or phone_number. In a data warehouse, B-tree indexes should be used only for unique columns or other columns with very high cardinalities (that is, columns that are almost unique). The majority of indexes in a data warehouse should be bitmap indexes.

In ad hoc queries and similar situations, bitmap indexes can dramatically improve query performance. AND and OR conditions in the WHERE clause of a query can be resolved quickly by performing the corresponding Boolean operations directly on the bitmaps before converting the resulting bitmap to rowids. If the resulting number of rows is small, the query can be answered quickly without resorting to a full table scan.

Example 6-1 Bitmap Index

The following shows a portion of a company's customers table.

SELECT cust_id, cust_gender, cust_marital_status, cust_income_level
FROM customers;

CUST_ID    C CUST_MARITAL_STATUS  CUST_INCOME_LEVEL
---------- - -------------------- ---------------------
        70 F                      D: 70,000 - 89,999
        80 F married              H: 150,000 - 169,999
        90 M single               H: 150,000 - 169,999
       100 F                      I: 170,000 - 189,999
       110 F married              C: 50,000 - 69,999
       120 M single               F: 110,000 - 129,999
       130 M                      J: 190,000 - 249,999
       140 M married              G: 130,000 - 149,999

Because cust_gender, cust_marital_status, and cust_income_level are all low-cardinality columns (there are only three possible values for marital status, two possible values for gender, and 12 for income level), bitmap indexes are ideal for these columns. Do not create a bitmap index on cust_id because this is a unique column. Instead, a unique B-tree index on this column provides the most efficient representation and retrieval.
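The indexing choices described for this example could be expressed as follows. The index names are hypothetical; if cust_id is already declared as a primary key, Oracle maintains a unique index for it and the last statement is unnecessary.

```sql
-- Bitmap indexes on the low-cardinality columns.
CREATE BITMAP INDEX customers_gender_bix
  ON customers (cust_gender);

CREATE BITMAP INDEX customers_marital_bix
  ON customers (cust_marital_status);

CREATE BITMAP INDEX customers_income_bix
  ON customers (cust_income_level);

-- Unique B-tree index on the unique column.
CREATE UNIQUE INDEX customers_id_uix
  ON customers (cust_id);
```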

Table 6-1 illustrates the bitmap index for the cust_gender column in this example. It consists of two separate bitmaps, one for each gender.

Table 6-1 Sample Bitmap Index

               gender='M'   gender='F'
cust_id 70         0            1
cust_id 80         0            1
cust_id 90         1            0
cust_id 100        0            1
cust_id 110        0            1
cust_id 120        1            0
cust_id 130        1            0
cust_id 140        1            0

Each entry (or bit) in the bitmap corresponds to a single row of the customers table. The value of each bit depends upon the values of the corresponding row in the table. For instance, the bitmap cust_gender='F' contains a one as its first bit because the gender is F in the first row of the customers table. The bitmap cust_gender='F' has a zero for its third bit because the gender of the third row is not F.

An analyst investigating demographic trends of the company's customers might ask, "How many of our married customers have an income level of G or H?" This corresponds to the following SQL query:

SELECT COUNT(*) FROM customers
WHERE cust_marital_status = 'married'
AND cust_income_level IN ('H: 150,000 - 169,999', 'G: 130,000 - 149,999');

Bitmap indexes can efficiently process this query by merely counting the number of ones in the bitmap illustrated in Figure 6-1. The result set will be found by using bitmap or merge operations without the necessity of a conversion to rowids. To identify additional specific customer attributes that satisfy the criteria, use the resulting bitmap to access the table after a bitmap to rowid conversion.

Figure 6-1 Executing a Query Using Bitmap Indexes
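The counting step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Oracle's internal representation: each bitmap is modeled as a Python integer, income levels are abbreviated to their letter, and the marital status and income values for cust_id 70 are assumed for the purpose of the example.

```python
# Sample rows: (cust_id, gender, marital_status, income_level letter).
# None stands for NULL; the non-gender values for cust_id 70 are assumed.
CUSTOMERS = [
    (70, "F", None, "D"),
    (80, "F", "married", "H"),
    (90, "M", "single", "H"),
    (100, "F", None, "I"),
    (110, "F", "married", "C"),
    (120, "M", "single", "F"),
    (130, "M", None, "J"),
    (140, "M", "married", "G"),
]


def build_bitmaps(rows, column):
    """Return {value: bitmap}; bit i is set when row i holds that value."""
    bitmaps = {}
    for i, row in enumerate(rows):
        value = row[column]
        bitmaps[value] = bitmaps.get(value, 0) | (1 << i)
    return bitmaps


marital = build_bitmaps(CUSTOMERS, 2)
income = build_bitmaps(CUSTOMERS, 3)

# married AND (income = 'H' OR income = 'G'):
# OR the two income bitmaps, then AND with the marital status bitmap.
result = marital["married"] & (income["H"] | income["G"])

# COUNT(*) only needs the number of one bits; no row lookup is required.
match_count = bin(result).count("1")

# Fetching other attributes needs the row positions (the rowid conversion).
matching_ids = [row[0] for i, row in enumerate(CUSTOMERS) if (result >> i) & 1]
```

The count is obtained from the merged bitmap alone, while the list of customer IDs requires going back to the rows, which mirrors the distinction the text draws between counting and a bitmap to rowid conversion.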


Bitmap Indexes and Nulls

Unlike most other types of indexes, bitmap indexes include rows that have NULL values. Indexing of nulls can be useful for some types of SQL statements, such as queries with the aggregate function COUNT.

Example 6-2 Bitmap Index

SELECT COUNT(*) FROM customers WHERE cust_marital_status IS NULL;

This query uses a bitmap index on cust_marital_status. Note that this query would not be able to use a B-tree index.
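As a rough illustration of why the null case is cheap, the sketch below keeps one bitmap per distinct value and treats NULL (modeled as Python's None) as just another value. The list of marital status values is illustrative sample data, and the integer-as-bitmap representation is an assumption made for the example.

```python
# Marital status per row position; None stands for NULL (illustrative data).
marital_status = [None, "married", "single", None, "married", "single", None, "married"]

# One bitmap per distinct value, including a bitmap for NULL.
bitmaps = {}
for position, value in enumerate(marital_status):
    bitmaps[value] = bitmaps.get(value, 0) | (1 << position)

# COUNT(*) ... WHERE cust_marital_status IS NULL is the population count
# of the NULL bitmap.
null_count = bin(bitmaps[None]).count("1")

# Every row appears in exactly one bitmap, so the counts add up to the row total.
total_rows = sum(bin(b).count("1") for b in bitmaps.values())
```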


CREATE BITMAP INDEX sales_cust_gender_bjix
ON sales(customers.cust_gender)
FROM sales, customers
WHERE sales.cust_id = customers.cust_id
LOCAL;

The following query shows how to use this bitmap join index and illustrates its bitmap pattern:

SELECT sales.time_id, customers.cust_gender, sales.amount
FROM sales, customers
WHERE sales.cust_id = customers.cust_id;

TIME_ID   C     AMOUNT
--------- - ----------
01-JAN-98 M       2291
01-JAN-98 F        114
01-JAN-98 M        553
01-JAN-98 M          0
01-JAN-98 M        195
01-JAN-98 M        280
01-JAN-98 M         32

Table 6-2 illustrates the bitmap join index in this example:

Table 6-2 Sample Bitmap Join Index

                 cust_gender='M'   cust_gender='F'
sales record 1   1                 0
sales record 2   0                 1
sales record 3   1                 0
sales record 4   1                 0
sales record 5   1                 0
sales record 6   1                 0
sales record 7   1                 0
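A bitmap join index can be pictured as a bitmap over fact table rows whose key values come from the joined dimension table. The following Python sketch rebuilds the pattern of Table 6-2 under stated assumptions: the cust_id attached to each sales record is hypothetical (the source shows only the resulting genders), and Python lists stand in for the stored bitmaps.

```python
# Dimension table: cust_id -> cust_gender (taken from the sample customers rows).
customer_gender = {90: "M", 110: "F", 120: "M", 130: "M", 140: "M"}

# Fact table: the cust_id of each sales record, in row order.
# These IDs are hypothetical; they are chosen to reproduce Table 6-2.
sales_cust_ids = [90, 110, 120, 90, 130, 140, 120]

# Resolve the join once, at index build time, and store one bit per sales row.
join_index = {"M": [], "F": []}
for cust_id in sales_cust_ids:
    gender = customer_gender[cust_id]
    for key in join_index:
        join_index[key].append(1 if key == gender else 0)

# One (M bit, F bit) pair per sales record, as in Table 6-2.
table = list(zip(join_index["M"], join_index["F"]))

# A query such as "count sales to male customers" no longer needs the join.
sales_to_male = sum(join_index["M"])
```

Because the join result is stored in the index, any change to the dimension rows can alter bits for many fact rows, which is one way to read the restrictions that follow.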

You can create other bitmap join indexes using more than one column or more than one table, as shown in these examples.


Example 6-4 Bitmap Join Index: Example 2

You can create a bitmap join index on more than one column, as in the following example, which uses customers(gender, marital_status):

CREATE BITMAP INDEX sales_cust_gender_ms_bjix
ON sales(customers.cust_gender, customers.cust_marital_status)
FROM sales, customers
WHERE sales.cust_id = customers.cust_id
LOCAL NOLOGGING;

Example 6-5 Bitmap Join Index: Example 3

You can create a bitmap join index on more than one table, as in the following, which uses customers(gender) and products(category):

CREATE BITMAP INDEX sales_c_gender_p_cat_bjix
ON sales(customers.cust_gender, products.prod_category)
FROM sales, customers, products
WHERE sales.cust_id = customers.cust_id
AND sales.prod_id = products.prod_id
LOCAL NOLOGGING;

Example 6-6 Bitmap Join Index: Example 4

You can create a bitmap join index on more than one table, in which the indexed column is joined to the indexed table by using another table. For example, we can build an index on countries.country_name, even though the countries table is not joined directly to the sales table. Instead, the countries table is joined to the customers table, which is joined to the sales table. This type of schema is commonly called a snowflake schema.

CREATE BITMAP INDEX sales_co_country_name_bjix
ON sales(countries.country_name)
FROM sales, customers, countries
WHERE sales.cust_id = customers.cust_id
AND customers.country_id = countries.country_id
LOCAL NOLOGGING;

Bitmap Join Index Restrictions

Join results must be stored; therefore, bitmap join indexes have the following restrictions:

• Parallel DML is currently only supported on the fact table. Parallel DML on one of the participating dimension tables will mark the index as unusable.
• Only one table can be updated concurrently by different transactions when using the bitmap join index.
• No table can appear twice in the join.
• You cannot create a bitmap join index on an index-organized table or a temporary table.
• The columns in the index must all be columns of the dimension tables.
• The dimension table join columns must be either primary key columns or have unique constraints.
• If a dimension table has a composite primary key, each column in the primary key must be part of the join.

See Also: 

Oracle9i SQL Reference for further details

B-tree Indexes

A B-tree index is organized like an upside-down tree. The bottom level of the index holds the actual data values and pointers to the corresponding rows, much as the index in a book has a page number associated with each index entry.

See Also:

Oracle9i Database Concepts for an explanation of B-tree structures

In general, use B-tree indexes when you know that your typical query refers to the indexed column and retrieves a few rows. In these queries, it is faster to find the rows by looking at the index. However, using the book index analogy, if you plan to look at every single topic in a book, you might not want to look in the index for the topic and then look up the page. It might be faster to read through every chapter in the book. Similarly, if you are retrieving most of the rows in a table, it might not make sense to look up the index to find the table rows. Instead, you might want to read or scan the table.

B-tree indexes are most commonly used in a data warehouse to index unique or near-unique keys. In many cases, it may not be necessary to index these columns in a data warehouse, because unique constraints can be maintained without an index, and because typical data warehouse queries may not work better with such indexes. Bitmap indexes should be more common than B-tree indexes in most data warehouse environments.

Local Indexes Versus Global Indexes

B-tree indexes on partitioned tables can be global or local. With Oracle8i and earlier releases, Oracle recommended that global indexes not be used in data warehouse environments because a partition DDL statement (for example, ALTER TABLE ... DROP PARTITION) would invalidate the entire index, and rebuilding the index is expensive. In Oracle9i, global indexes can be maintained without Oracle marking them as unusable after DDL. This enhancement makes global indexes more effective for data warehouse environments.

However, local indexes will be more common than global indexes. Global indexes should be used when there is a specific requirement which cannot be met by local indexes (for example, a unique index on a non-partitioning key, or a performance requirement).

Bitmap indexes on partitioned tables are always local.

See Also: 

"Types of Partitioning" for further details

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


7 Integrity Constraints

This chapter describes integrity constraints, and discusses:

• Why Integrity Constraints are Useful in a Data Warehouse
• Overview of Constraint States
• Typical Data Warehouse Integrity Constraints

Why Integrity Constraints are Useful in a Data Warehouse

Integrity constraints provide a mechanism for ensuring that data conforms to guidelines specified by the database administrator. The most common types of constraints include:

• UNIQUE constraints

To ensure that a given column is unique

• NOT NULL constraints

To ensure that no null values are allowed

• FOREIGN KEY constraints

To ensure that two keys share a primary key to foreign key relationship

Constraints can be used for these purposes in a data warehouse:

• Data cleanliness

Constraints verify that the data in the data warehouse conforms to a basic level of data consistency and correctness, preventing the introduction of dirty data.

• Query optimization

The Oracle database utilizes constraints when optimizing SQL queries. Although constraints can be useful in many aspects of query optimization, constraints are particularly important for query rewrite of materialized views.

Unlike data in many relational database environments, data in a data warehouse is typically added or modified under controlled circumstances during the extraction, transformation, and loading (ETL) process. Multiple users normally do not update the data warehouse directly, as they do in an OLTP system.

See Also:

Chapter 10, "Overview of Extraction, Transformation, and Loading"

Many significant constraint features have been introduced for data warehousing. Readers familiar with Oracle's constraint functionality in Oracle7 and Oracle8 should take special note of the functionality described in this chapter. In fact, many Oracle7-based and Oracle8-based data warehouses lacked constraints because of concerns about constraint performance. Newer constraint functionality addresses these concerns.

Overview of Constraint States

To understand how best to use constraints in a data warehouse, you should first understand the basic purposes of constraints. Some of these purposes are:

• Enforcement

In order to use a constraint for enforcement, the constraint must be in the ENABLE state. An enabled constraint ensures that all data modifications upon a given table (or tables) satisfy the conditions of the constraints. Data modification operations which produce data that violates the constraint fail with a constraint violation error.

• Validation

To use a constraint for validation, the constraint must be in the VALIDATE state. If the constraint is validated, then all data that currently resides in the table satisfies the constraint.

Note that validation is independent of enforcement. Although the typical constraint in an operational system is both enabled and validated, any constraint could be validated but not enabled or vice versa (enabled but not validated). These latter two cases are useful for data warehouses.

• Belief

In some cases, you will know that the conditions for a given constraint are true, so you do not need to validate or enforce the constraint. However, you may wish for the constraint to be present anyway to improve query optimization and performance. When you use a constraint in this way, it is called a belief or RELY constraint, and the constraint must be in the RELY state. The RELY state provides you with a mechanism for telling Oracle9i that a given constraint is believed to be true.

Note that the RELY state only affects constraints that have not been validated.
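The independence of enforcement and validation can be made concrete with a toy model. The sketch below is not Oracle's implementation; the Constraint and Table classes, the non-negative rule, and the sample values are all hypothetical, chosen only to show that "checks new modifications" and "existing rows are known to comply" are separate flags.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Constraint:
    predicate: Callable[[int], bool]
    enabled: bool    # ENABLE: every new modification is checked
    validated: bool  # VALIDATE: existing rows are known to satisfy the rule


@dataclass
class Table:
    constraint: Constraint
    rows: List[int] = field(default_factory=list)

    def insert(self, value: int) -> None:
        # Enforcement looks only at the incoming value.
        if self.constraint.enabled and not self.constraint.predicate(value):
            raise ValueError(f"constraint violated by {value}")
        self.rows.append(value)

    def validate(self) -> bool:
        # Validation looks only at the rows already stored.
        self.constraint.validated = all(
            self.constraint.predicate(v) for v in self.rows
        )
        return self.constraint.validated


# Enabled but not validated: the table already holds a violating row (-5).
non_negative = Constraint(lambda amount: amount >= 0, enabled=True, validated=False)
legacy = Table(non_negative, rows=[10, -5])

try:
    legacy.insert(-1)  # rejected: enforcement applies to new data
    rejected = False
except ValueError:
    rejected = True

legacy.insert(7)               # accepted
is_valid = legacy.validate()   # False: the old row still breaks the rule
```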

Typical Data Warehouse Integrity Constraints

This section assumes that you are familiar with the typical use of constraints, that is, constraints that are both enabled and validated. For data warehousing, many users have discovered that such constraints may be prohibitively costly to build and maintain. The topics discussed are:

• UNIQUE Constraints in a Data Warehouse
• FOREIGN KEY Constraints in a Data Warehouse
• RELY Constraints
• Integrity Constraints and Parallelism
• Integrity Constraints and Partitioning
• View Constraints

UNIQUE Constraints in a Data Warehouse

A UNIQUE constraint is typically enforced using a UNIQUE index. However, in a data warehouse whose tables can be extremely large, creating a unique index can be costly both in processing time and in disk space.

Suppose that a data warehouse contains a table sales, which includes a column sales_id. sales_id uniquely identifies a single sales transaction, and the data warehouse administrator must ensure that this column is unique within the data warehouse.

One way to create the constraint is as follows:

ALTER TABLE sales ADD CONSTRAINT sales_unique
UNIQUE(sales_id);

By default, this constraint is both enabled and validated. Oracle implicitly creates a unique index on sales_id to support this constraint. However, this index can be problematic in a data warehouse for three reasons:

• The unique index can be very large, because the sales table can easily have millions or even billions of rows.
• The unique index is rarely used for query execution. Most data warehousing queries do not have predicates on unique keys, so creating this index will probably not improve performance.
• If sales is partitioned along a column other than sales_id, the unique index must be global. This can detrimentally affect all maintenance operations on the sales table.

A unique index is required for unique constraints to ensure that each individual row modified in the sales table satisfies the UNIQUE constraint.

For data warehousing tables, an alternative mechanism for unique constraints is illustrated in the following statement:

ALTER TABLE sales ADD CONSTRAINT sales_unique
UNIQUE (sales_id) DISABLE VALIDATE;

This statement creates a unique constraint, but, because the constraint is disabled, a unique index is not required. This approach can be advantageous for many data warehousing environments because the constraint now ensures uniqueness without the cost of a unique index.

However, there are trade-offs for the data warehouse administrator to consider with DISABLE VALIDATE constraints. Because this constraint is disabled, no DML statements that modify the unique column are permitted against the sales table. You can use one of two strategies for modifying this table in the presence of a constraint:


• Use DDL to add data to this table (such as exchanging partitions). See the example in Chapter 14, "Maintaining the Data Warehouse".
• Before modifying this table, drop the constraint. Then, make all necessary data modifications. Finally, re-create the disabled constraint. Re-creating the disabled constraint is more efficient than re-creating an enabled constraint. However, this approach does not guarantee that data added to the sales table while the constraint has been dropped is unique.

FOREIGN KEY Constraints in a Data Warehouse

In a star schema data warehouse, FOREIGN KEY constraints validate the relationship between the fact table and the dimension tables. A sample constraint might be:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  ENABLE VALIDATE;

However, in some situations, you may choose to use a different state for the FOREIGN KEY constraints, in particular, the ENABLE NOVALIDATE state. A data warehouse administrator might use an ENABLE NOVALIDATE constraint when either:

• The tables contain data that currently disobeys the constraint, but the data warehouse administrator wishes to create a constraint for future enforcement.
• An enforced constraint is required immediately.

Suppose that the data warehouse loaded new data into the fact tables every day, but refreshed the dimension tables only on the weekend. During the week, the dimension tables and fact tables may in fact disobey the FOREIGN KEY constraints. Nevertheless, the data warehouse administrator might wish to maintain the enforcement of this constraint to prevent any changes that might affect the FOREIGN KEY constraint outside of the ETL process. Thus, you can create the FOREIGN KEY constraints every night, after performing the ETL process, as shown here:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  ENABLE NOVALIDATE;

ENABLE NOVALIDATE can quickly create an enforced constraint, even when the constraint is believed to be true. Suppose that the ETL process verifies that a FOREIGN KEY constraint is true. Rather than have the database re-verify this FOREIGN KEY constraint, which would require time and database resources, the data warehouse administrator could instead create a FOREIGN KEY constraint using ENABLE NOVALIDATE.
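The behavior of an enforced but unvalidated foreign key can be tried out with SQLite, used here purely as a runnable stand-in; it is not Oracle and has no ENABLE NOVALIDATE clause. The analogy rests on an assumption worth stating: switching SQLite's foreign_keys pragma on enforces the relationship for later statements without checking rows already present, and a separate foreign_key_check pragma performs the validation. The table is named times rather than time, and the row values are hypothetical.

```python
import sqlite3

# Autocommit mode, so the foreign_keys pragma is not issued inside a transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = OFF")
conn.execute("CREATE TABLE times (time_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE sales ("
    "sales_id INTEGER PRIMARY KEY, "
    "sales_time_id INTEGER REFERENCES times (time_id))"
)
conn.execute("INSERT INTO times VALUES (1)")
conn.execute("INSERT INTO sales VALUES (100, 1)")
conn.execute("INSERT INTO sales VALUES (101, 2)")  # orphan loaded before enforcement

# Begin enforcing; rows that are already stored are not re-examined.
conn.execute("PRAGMA foreign_keys = ON")
try:
    conn.execute("INSERT INTO sales VALUES (102, 3)")  # new orphan
    new_orphan_rejected = False
except sqlite3.IntegrityError:
    new_orphan_rejected = True

# Validation is a separate, explicit step over the existing rows.
existing_violations = conn.execute("PRAGMA foreign_key_check").fetchall()
violating_rowids = [row[1] for row in existing_violations]
sales_rows = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

The orphan loaded earlier stays in place while new orphans are refused, which is the weekday situation the text describes for fact data that arrives ahead of its dimension rows.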

RELY Constraints

The ETL process commonly verifies that certain constraints are true. For example, it can validate all of the foreign keys in the data coming into the fact table. This means that you can trust it to provide clean data, instead of implementing constraints in the data warehouse. You create a RELY constraint as follows:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  RELY DISABLE NOVALIDATE;

RELY constraints, even though they are not used for data validation, can:

• Enable more sophisticated query rewrites for materialized views. See Chapter 22, "Query Rewrite" for further details.
• Enable other data warehousing tools to retrieve information regarding constraints directly from the Oracle data dictionary.

Creating a RELY constraint is inexpensive and does not impose any overhead during DML or load. Because the constraint is not being validated, no data processing is necessary to create it.

Integrity Constraints and Parallelism

All constraints can be validated in parallel. When validating constraints on very large tables, parallelism is often necessary to meet performance goals. The degree of parallelism for a given constraint operation is determined by the default degree of parallelism of the underlying table.

Integrity Constraints and Partitioning

You can create and maintain constraints before you partition the data. Later chapters discuss the significance of partitioning for data warehousing. Partitioning can improve constraint management just as it does the management of many other operations. For example, Chapter 14, "Maintaining the Data Warehouse" provides a scenario creating UNIQUE and FOREIGN KEY constraints on a separate staging table, and these constraints are maintained during the EXCHANGE PARTITION statement.

View Constraints

You can create constraints on views. The only type of constraint supported on a view is a RELY constraint.

This type of constraint is useful when queries typically access views instead of base tables, and the DBA thus needs to define the data relationships between views rather than tables. View constraints are particularly useful in OLAP environments, where they may enable more sophisticated rewrites for materialized views.


See Also: 

Chapter 8, "Materialized Views" and Chapter 22, "Query Rewrite"


8 Materialized Views

This chapter introduces you to the use of materialized views and discusses:

• Overview of Data Warehousing with Materialized Views
• Types of Materialized Views
• Creating Materialized Views
• Registering Existing Materialized Views
• Partitioning and Materialized Views
• Materialized Views in OLAP Environments
• Choosing Indexes for Materialized Views
• Invalidating Materialized Views
• Security Issues with Materialized Views
• Altering Materialized Views
• Dropping Materialized Views
• Analyzing Materialized View Capabilities

Overview of Data Warehousing with Materialized Views

Typically, data flows from one or more online transaction processing (OLTP) databases into a data warehouse on a monthly, weekly, or daily basis. The data is normally


methods. The materialized views as replicas provide local access to data that otherwise would have to be accessed from remote sites. Materialized views are also useful in remote data marts.

See Also:

Oracle9i Replication and Oracle9i Heterogeneous Connectivity Administrator's Guide for details on distributed and mobile computing

Materialized Views for Mobile Computing

You can also use materialized views to download a subset of data from central servers to mobile clients, with periodic refreshes and updates between clients and the central servers.

This chapter focuses on the use of materialized views in data warehouses.

See Also:

Oracle9i Replication and Oracle9i Heterogeneous Connectivity Administrator's Guide for details on distributed and mobile computing

The Need for Materialized Views

You can use materialized views in data warehouses to increase the speed of queries on very large databases. Queries to large databases often involve joins between tables, aggregations such as SUM, or both. These operations are expensive in terms of time and processing power. The type of materialized view you create determines how the materialized view is refreshed and used by query rewrite.

You can use materialized views in a number of ways, and you can use almost identical syntax to perform a number of roles. For example, a materialized view can replicate data, a process formerly achieved by using the CREATE SNAPSHOT statement. Now CREATE MATERIALIZED VIEW is a synonym for CREATE SNAPSHOT.

Materialized views improve query performance by precalculating expensive join and aggregation operations on the database prior to execution and storing the results in the database. The query optimizer automatically recognizes when an existing materialized view can and should be used to satisfy a request. It then transparently rewrites the request to use the materialized view. Queries go directly to the materialized view and not to the underlying detail tables. In general, rewriting queries to use materialized views rather than detail tables improves response. Figure 8-1 illustrates how query rewrite works.

Figure 8-1 Transparent Query Rewrite
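The idea behind query rewrite can be imitated in miniature. The sketch below uses SQLite through Python as a stand-in, with a plain summary table playing the part of a materialized view and a small run function playing the part of the optimizer. SQLite has no materialized views or automatic rewrite, so the run function, the table names, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (prod_name TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('disk', 10), ('disk', 15), ('tape', 7);
    -- Precomputed aggregate standing in for a materialized view.
    CREATE TABLE product_sales_summary AS
        SELECT prod_name, SUM(amount) AS total
        FROM sales
        GROUP BY prod_name;
""")

DETAIL_QUERY = (
    "SELECT prod_name, SUM(amount) FROM sales "
    "GROUP BY prod_name ORDER BY prod_name"
)
REWRITTEN_QUERY = (
    "SELECT prod_name, total FROM product_sales_summary ORDER BY prod_name"
)


def run(query, use_summary=True):
    """Hypothetical rewrite step: answer from the summary when it matches."""
    if use_summary and query == DETAIL_QUERY:
        query = REWRITTEN_QUERY
    return conn.execute(query).fetchall()


# The caller issues the same SQL either way; only the access path differs.
from_detail = run(DETAIL_QUERY, use_summary=False)
from_summary = run(DETAIL_QUERY)
```

The application text never mentions the summary table, which is the sense in which the rewrite is transparent; a real optimizer matches queries by structure and cost rather than by exact string comparison.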


When using query rewrite, create materialized views that satisfy the largest number of queries. For example, if you identify 20 queries that are commonly applied to the detail or fact tables, then you might be able to satisfy them with five or six well-written materialized views. A materialized view definition can include any number of aggregations (SUM, COUNT(x), COUNT(*), COUNT(DISTINCT x), AVG, VARIANCE, STDDEV, MIN, and MAX). It can also include any number of joins. If you are unsure of which materialized views to create, Oracle provides a set of advisory procedures in the DBMS_OLAP package to help in designing and evaluating materialized views for query rewrite. These functions are also known as the Summary Advisor or the Advisor. Note that the OLAP Summary Advisor is different. See Oracle9i OLAP User's Guide for further details regarding the OLAP Summary Advisor.

If a materialized view is to be used by query rewrite, it must be stored in the same database as the fact or detail tables on which it relies. A materialized view can be partitioned, and you can define a materialized view on a partitioned table. You can also define one or more indexes on the materialized view.

Unlike indexes, materialized views can be accessed directly using a SELECT statement.

Note:

The techniques shown in this chapter illustrate how to use materialized views in data warehouses. Materialized views can also be used by Oracle Replication. See Oracle9i Replication for further information.

Components of Summary Management

Summary management consists of:

• Mechanisms to define materialized views and dimensions.
• A refresh mechanism to ensure that all materialized views contain the latest data.
• A query rewrite capability to transparently rewrite a query to use a materialized view.
• A collection of materialized view analysis and advisory functions and procedures in the DBMS_OLAP package. Collectively, these functions are called the Summary Advisor, and are also available as part of Oracle Enterprise Manager.

See Also: 

Chapter 16, "Summary Advisor" and Oracle9i OLAP User's Guide for OLAP-related schemas

Many large decision support system (DSS) databases have schemas that do not closely resemble a conventional data warehouse schema, but that still require joins and aggregates. The use of summary management features imposes no schema restrictions, and can enable some existing DSS database applications to improve performance without the need to redesign the database or the application.

Figure 8-2 illustrates the use of summary management in the warehousing cycle. After the data has been transformed, staged, and loaded into the detail data in the warehouse, you can invoke the summary management process. First, use the Advisor to plan how you will use summaries. Then, create summaries and design how queries will be rewritten.

Figure 8-2 Overview of Summary Management


Understanding the summary management process during the earliest stages of data warehouse design can yield large dividends later in the form of higher performance, lower summary administration costs, and reduced storage requirements.

Data Warehousing Terminology

Some basic data warehousing terms are defined as follows:

• Dimension tables describe the business entities of an enterprise, represented as hierarchical, categorical information such as time, departments, locations, and products. Dimension tables are sometimes called lookup or reference tables.

Dimension tables usually change slowly over time and are not modified on a periodic schedule. They are used in long-running decision support queries to aggregate the data returned from the query into appropriate levels of the dimension hierarchy.


• The existence of a materialized view is transparent to SQL applications, so that a DBA can create or drop materialized views at any time without affecting the validity of SQL applications.
• A materialized view consumes storage space.
• The contents of the materialized view must be updated when the underlying detail tables are modified.

Schemas and Dimension Tables

In the case of normalized or partially normalized dimension tables (a dimension that is stored in more than one table), identify how these tables are joined. Note whether the joins between the dimension tables can guarantee that each child-side row joins with one and only one parent-side row. In the case of denormalized dimensions, determine whether the child-side columns uniquely determine the parent-side (or attribute) columns. These relationships can be enabled with constraints, using the NOVALIDATE and RELY options if the relationships represented by the constraints are guaranteed by other means. Note that if the joins between fact and dimension tables do not support the parent-child relationship described previously, you still gain significant performance advantages from defining the dimension with the CREATE DIMENSION statement. Another alternative, subject to some restrictions, is to use outer joins in the materialized view definition (that is, in the CREATE MATERIALIZED VIEW statement).

You must not create dimensions in any schema that does not satisfy these relationships. Incorrect results can be returned from queries otherwise.
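For a denormalized dimension, the requirement that child-side columns uniquely determine parent-side columns can be checked with an ordinary grouping query. The sketch below runs such a check in SQLite through Python as a stand-in for the warehouse database; the customers_dim table, its city and state columns, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers_dim (cust_id INTEGER, city TEXT, state TEXT);
    INSERT INTO customers_dim VALUES
        (1, 'Austin',   'TX'),
        (2, 'Austin',   'TX'),
        (3, 'Portland', 'OR'),
        (4, 'Portland', 'ME');
""")

# Child values (city) that roll up to more than one parent value (state).
# Any row returned here means city does not uniquely determine state.
ambiguous_children = conn.execute("""
    SELECT city, COUNT(DISTINCT state) AS parents
    FROM customers_dim
    GROUP BY city
    HAVING COUNT(DISTINCT state) > 1
""").fetchall()

hierarchy_is_safe = not ambiguous_children
```

In this sample, Portland maps to two states, so a city-to-state hierarchy declared on these columns would not hold; a level key that combines city and state would be one way to restore the one-parent property.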

See Also: 

Chapter 9, "Dimensions" and Oracle9i OLAP User's Guide for OLAP-related schemas

Materialized View Schema Design Guidelines

Before starting to define and use the various components of summary management, you should review your schema design to abide by the following guidelines wherever possible.

Guidelines 1 and 2 are more important than guideline 3. If your schema design does not follow guidelines 1 and 2, it does not then matter whether it follows guideline 3. Guidelines 1, 2, and 3 affect both query rewrite performance and materialized view refresh performance.

Schema Guideline   Description

Guideline 1        Dimensions should either be denormalized (each dimension contained in one table) or the joins between tables in a normalized or partially


Schema Guideline   Description

Time Dimensions    partition and index the materialized view in the same manner as you have the fact tables.

If you are concerned with the time required to enable constraints and whether any constraints might be violated, use the ENABLE NOVALIDATE with the RELY clause to turn on constraint checking without validating any of the existing constraints. The risk with this approach is that incorrect query results could occur if any constraints are broken. Therefore, as the designer, you must determine how clean the data is and whether the risk of wrong results is too great.

Loading Data

A popular and efficient way to load data into a warehouse or data mart is to use SQL*Loader with the DIRECT or PARALLEL option or to use another loader tool that uses the Oracle direct-path API.

See Also:

Oracle9i Database Utilities for the restrictions and considerations when using SQL*Loader with the DIRECT or PARALLEL keywords

Loading strategies can be classified as one-phase or two-phase. In one-phase loading, data is loaded directly into the target table, quality assurance tests are performed, and errors are resolved by performing DML operations prior to refreshing materialized views. If a large number of deletions are possible, then storage utilization can be adversely affected, but temporary space requirements and load time are minimized. The DML that may be required after one-phase loading causes multitable aggregate materialized views to become unusable in the safest rewrite integrity level.

In a two-phase loading process:

• Data is first loaded into a temporary table in the warehouse.
• Quality assurance procedures are applied to the data.
• Referential integrity constraints on the target table are disabled, and the local index in the target partition is marked unusable.
• The data is copied from the temporary area into the appropriate partition of the target table using INSERT AS SELECT with the PARALLEL or APPEND hint.
• The temporary table is dropped.
• The constraints are enabled, usually with the NOVALIDATE option.
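The shape of the two-phase process above can be walked through with a small script. This sketch uses SQLite through Python as a stand-in, so partitions, hints, and constraint states are left out; the sales_staging table, the sample extract, and the rule that negative amounts fail quality assurance are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sales_id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE sales_staging (sales_id INTEGER, amount INTEGER);
""")

incoming = [(1, 120), (2, -40), (3, 75)]  # hypothetical extract

# Phase 1: load into the temporary (staging) table and apply QA there,
# so the target table never sees rows that fail the checks.
conn.executemany("INSERT INTO sales_staging VALUES (?, ?)", incoming)
rejected = conn.execute(
    "SELECT sales_id FROM sales_staging WHERE amount < 0"
).fetchall()
conn.execute("DELETE FROM sales_staging WHERE amount < 0")

# Phase 2: copy the clean rows with INSERT ... SELECT, then drop staging.
conn.execute("INSERT INTO sales SELECT sales_id, amount FROM sales_staging")
conn.execute("DROP TABLE sales_staging")
conn.commit()

loaded = conn.execute(
    "SELECT sales_id, amount FROM sales ORDER BY sales_id"
).fetchall()
```

Because corrections happen in staging, the target table receives inserts only, which is the property that keeps later DML cleanup (and its effect on materialized views) out of the picture.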

Immediately after loading the detail data and updating the indexes on the detail data, the database can be opened for operation, if desired. You can disable query rewrite at the system level by issuing an ALTER SYSTEM SET QUERY_REWRITE_ENABLED = false statement until all the materialized views are refreshed.

If QUERY_REWRITE_INTEGRITY is set to stale_tolerated, access to the materialized view can be allowed at the session level to any users who do not require the materialized views to reflect the data from the latest load by issuing an ALTER SESSION SET QUERY_REWRITE_ENABLED = true statement. This scenario does not apply when QUERY_REWRITE_INTEGRITY is either enforced or trusted because the system ensures in these modes that only materialized views with updated data participate in a query rewrite.

Overview of Materialized View Management Tasks

The motivation for using materialized views is to improve performance, but the overhead associated with materialized view management can become a significant system management problem. When reviewing or evaluating some of the necessary materialized view management activities, consider some of the following:

• Identifying what materialized views to create initially

• Indexing the materialized views

• Ensuring that all materialized views and materialized view indexes are refreshed properly each time the database is updated

• Checking which materialized views have been used

• Determining how effective each materialized view has been on workload performance

• Measuring the space being used by materialized views

• Determining which new materialized views should be created

• Determining which existing materialized views should be dropped

• Archiving old detail and materialized view data that is no longer useful

After the initial effort of creating and populating the data warehouse or data mart, the major administration overhead is the update process, which involves:

• Periodic extraction of incremental changes from the operational systems

• Transforming the data

• Verifying that the incremental changes are correct, consistent, and complete

• Bulk-loading the data into the warehouse

• Refreshing indexes and materialized views so that they are consistent with the detail data

The update process must generally be performed within a limited period of time known as the update window. The update window depends on the update frequency (such as daily or weekly) and the nature of the business. For a daily update frequency, an update window of two to six hours might be typical.

You need to know your update window for the following activities:


  WHERE s.prod_id = p.prod_id
  GROUP BY p.prod_name;

Example 8-2 creates a materialized view product_sales_mv that computes the sum of sales by prod_name. It is derived by joining the tables store and fact on the column store_key. The materialized view does not initially contain any data, because the build method is DEFERRED. A complete refresh is required for the first refresh of a build deferred materialized view. When it is refreshed and once populated, this materialized view can be used by query rewrite.

Example 8-3 Creating a Materialized View: Example 3

CREATE MATERIALIZED VIEW LOG ON sales
  WITH SEQUENCE, ROWID
  (prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW sum_sales
  PARALLEL
  BUILD IMMEDIATE
  REFRESH FAST ON COMMIT
AS
SELECT s.prod_id, s.time_id,
       COUNT(*) AS count_grp,
       SUM(s.amount_sold) AS sum_dollar_sales,
       COUNT(s.amount_sold) AS count_dollar_sales,
       SUM(s.quantity_sold) AS sum_quantity_sales,
       COUNT(s.quantity_sold) AS count_quantity_sales
FROM sales s
GROUP BY s.prod_id, s.time_id;

Example 8-3 creates a materialized view that contains aggregates on a single table. Because the materialized view log has been created, the materialized view is fast refreshable. If DML is applied against the sales table, then the changes will be reflected in the materialized view when the commit is issued.

Requirements for Using Materialized Views with Aggregates

Table 8-1 illustrates the aggregate requirements for materialized views.

Table 8-1 Requirements for Materialized Views with Aggregates

If aggregate X is present, aggregate Y is required and aggregate Z is optional

X                 Y                          Z
COUNT(expr)       -                          -
SUM(expr)         COUNT(expr)                -
AVG(expr)         COUNT(expr)                SUM(expr)
STDDEV(expr)      COUNT(expr), SUM(expr)     SUM(expr * expr)
VARIANCE(expr)    COUNT(expr), SUM(expr)     SUM(expr * expr)

Note that COUNT(*) must always be present. Oracle recommends that you include the optional aggregates in column Z in the materialized view in order to obtain the most efficient and accurate fast refresh of the aggregates.
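
To make the table concrete, here is a minimal sketch of a materialized view that contains AVG together with the aggregates the table asks for. The name avg_sales_mv is hypothetical, and the sketch assumes a materialized view log on sales that includes the referenced columns (such as the one created in Example 8-3).

```sql
-- Assumes a materialized view log on sales WITH SEQUENCE, ROWID (...)
-- INCLUDING NEW VALUES already exists, as in Example 8-3.
CREATE MATERIALIZED VIEW avg_sales_mv            -- hypothetical name
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
AS
SELECT s.prod_id,
       COUNT(*)             AS count_grp,        -- must always be present
       AVG(s.amount_sold)   AS avg_amount,       -- aggregate X
       COUNT(s.amount_sold) AS count_amount,     -- required aggregate Y
       SUM(s.amount_sold)   AS sum_amount        -- optional aggregate Z
FROM sales s
GROUP BY s.prod_id;
```

Including the optional SUM column costs little storage and follows the recommendation above for more efficient fast refresh.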

Materialized Views Containing Only Joins

Some materialized views contain only joins and no aggregates, such as in Example 8-4, where a materialized view is created that joins the sales table to the times and customers tables. The advantage of creating this type of materialized view is that expensive joins will be precalculated.

Fast refresh for a materialized view containing only joins is possible after any type of DML to the base tables (direct-path or conventional INSERT, UPDATE, or DELETE).

A materialized view containing only joins can be defined to be refreshed ON COMMIT or ON DEMAND. If it is ON COMMIT, the refresh is performed at commit time of the transaction that does DML on the materialized view's detail table. Oracle does not allow self-joins in materialized join views.

If you specify REFRESH FAST, Oracle performs further verification of the query definition to ensure that fast refresh can be performed if any of the detail tables change. These additional checks are:

• A materialized view log must be present for each detail table.

• The rowids of all the detail tables must appear in the SELECT list of the materialized view query definition.

• If there are no outer joins, you may have arbitrary selections and joins in the WHERE clause. However, if there are outer joins, the WHERE clause cannot have any selections. Further, if there are outer joins, all the joins must be connected by ANDs and must use the equality (=) operator.

• If there are outer joins, unique constraints must exist on the join columns of the inner table. For example, if you are joining the fact table and a dimension table and the join is an outer join with the fact table being the outer table, there must exist unique constraints on the join columns of the dimension table.

If some of these restrictions are not met, you can create the materialized view as REFRESH FORCE to take advantage of fast refresh when it is possible. If one of the tables did not meet all of the criteria, but the other tables did, the materialized view would still be fast refreshable with respect to the other tables for which all the criteria are met.

A materialized view log should contain the rowid of the master table. It is not necessary to add other columns.

To speed up refresh, you should create indexes on the materialized view's columns that store the rowids of the fact table.

Example 8-4 Materialized View Containing Only Joins

CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON times
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON customers
  WITH ROWID;

CREATE MATERIALIZED VIEW detail_sales_mv
  PARALLEL
  BUILD IMMEDIATE
  REFRESH FAST
AS
SELECT s.rowid sales_rid, t.rowid times_rid, c.rowid customers_rid,
       c.cust_id, c.cust_last_name, s.amount_sold,
       s.quantity_sold, s.time_id
FROM sales s, times t, customers c
WHERE s.cust_id = c.cust_id(+) AND
      s.time_id = t.time_id(+);

In this example, to perform a fast refresh, UNIQUE constraints should exist on c.cust_id and t.time_id. You should also create indexes on the columns sales_rid, times_rid, and customers_rid, as illustrated in the following. This will improve the refresh performance.

CREATE INDEX mv_ix_salesrid
  ON detail_sales_mv(sales_rid);

Alternatively, if the previous example did not include the columns times_rid and customers_rid, and if the refresh method was REFRESH FORCE, then this materialized view would be fast refreshable only if the sales table was updated but not if the tables times or customers were updated.

CREATE MATERIALIZED VIEW detail_sales_mv
  PARALLEL
  BUILD IMMEDIATE
  REFRESH FORCE
AS
SELECT s.rowid sales_rid,
       c.cust_id, c.cust_last_name, s.amount_sold,
       s.quantity_sold, s.time_id
FROM sales s, times t, customers c
WHERE s.cust_id = c.cust_id(+) AND
      s.time_id = t.time_id(+);

Nested Materialized Views

A nested materialized view is a materialized view whose definition is based on another materialized view. A nested materialized view can reference other relations in the database in addition to referencing materialized views.

Why Use Nested Materialized Views?

In a data warehouse, you typically create many aggregate views on a single join (for example, rollups along different dimensions). Incrementally maintaining these distinct materialized aggregate views can take a long time, because the underlying join has to be performed many times.

Using nested materialized views, you can create multiple single-table materialized views based on a joins-only materialized view and the join is performed just once. In addition, optimizations can be performed for this class of single-table aggregate materialized view and thus refresh is very efficient.

Example 8-5 Nested Materialized View

You can create a nested materialized view on materialized views that contain joins only or joins and aggregates.

All the underlying objects (materialized views or tables) on which the materialized view is defined must have a materialized view log. All the underlying objects are treated as if they were tables. All the existing options for materialized views can be used, with the exception of ON COMMIT REFRESH, which is not supported for a nested materialized view that contains joins and aggregates.

Using the tables and their columns from the sh sample schema, the following materialized views illustrate how nested materialized views can be created.

/* create the materialized view logs */
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON customers
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON times
  WITH ROWID;

/* create materialized view join_sales_cust_time as fast refreshable at
   COMMIT time */
CREATE MATERIALIZED VIEW join_sales_cust_time
  REFRESH FAST ON COMMIT
AS
SELECT c.cust_id, c.cust_last_name, s.amount_sold, t.time_id,
       t.day_number_in_week, s.rowid srid, t.rowid trid, c.rowid crid
FROM sales s, customers c, times t
WHERE s.time_id = t.time_id AND
      s.cust_id = c.cust_id;

To create a nested materialized view on the table join_sales_cust_time, you would have to create a materialized view log on the table. Because this will be a single-table aggregate materialized view on join_sales_cust_time, you need to log all the necessary columns and use the INCLUDING NEW VALUES clause.

/* create materialized view log on join_sales_cust_time */
CREATE MATERIALIZED VIEW LOG ON join_sales_cust_time
  WITH ROWID (cust_last_name, day_number_in_week, amount_sold)
  INCLUDING NEW VALUES;

/* create the single-table aggregate materialized view sum_sales_cust_time on
   join_sales_cust_time as fast refreshable at COMMIT time */
CREATE MATERIALIZED VIEW sum_sales_cust_time
  REFRESH FAST ON COMMIT
AS
SELECT COUNT(*) cnt_all, SUM(amount_sold) sum_sales, COUNT(amount_sold)
       cnt_sales, cust_last_name, day_number_in_week
FROM join_sales_cust_time
GROUP BY cust_last_name, day_number_in_week;

This schema can be diagrammatically represented as in Figure 8-3.

Figure 8-3 Nested Materialized View Schema


Restrictions When Using Nested Materialized Views

The following restrictions exist on the way you can nest materialized views:

• Fast refresh for ON COMMIT is not supported for a higher-level materialized view that contains joins and aggregates.

• DBMS_MVIEW.REFRESH APIs will not automatically refresh nested materialized views unless explicitly specified. Thus, if monthly_sales_mv is based on sales_mv, you have to refresh sales_mv first, followed by monthly_sales_mv. Oracle does not automatically refresh monthly_sales_mv when you refresh sales_mv or vice versa.

• If you have a table costs with a materialized view cost_mv based on it, you cannot then create a prebuilt materialized view on table costs. The result would make cost_mv a nested materialized view and this method of conversion is not supported.
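
The refresh ordering described in the second restriction could be scripted along these lines. This is a sketch only: it reuses the names sales_mv and monthly_sales_mv from the text, and the '?' method (force refresh) is an assumption about what suits both views.

```sql
BEGIN
  -- Refresh the underlying materialized view first
  -- ('?' requests FORCE: fast if possible, otherwise complete).
  DBMS_MVIEW.REFRESH('sales_mv', '?');

  -- Then refresh the materialized view that is defined on top of it.
  DBMS_MVIEW.REFRESH('monthly_sales_mv', '?');
END;
/
```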

Creating Materialized Views

A materialized view can be created with the CREATE MATERIALIZED VIEW statement or using Oracle Enterprise Manager. Example 8-6 creates the materialized view cust_sales_mv.

Example 8-6 Creating a Materialized View

CREATE MATERIALIZED VIEW cust_sales_mv
  PCTFREE 0 TABLESPACE demo
  STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0)
  PARALLEL
  BUILD IMMEDIATE
  REFRESH COMPLETE
  ENABLE QUERY REWRITE
AS
SELECT c.cust_last_name,
       SUM(amount_sold) AS sum_amount_sold
FROM customers c, sales s
WHERE s.cust_id = c.cust_id
GROUP BY c.cust_last_name;

It is not uncommon in a data warehouse to have already created summary or aggregation tables, and you might not wish to repeat this work by building a new materialized view. In this case, the table that already exists in the database can be registered as a prebuilt materialized view. This technique is described in "Registering Existing Materialized Views".

Once you have selected the materialized views you want to create, follow these steps for each materialized view.


1. Design the materialized view. Existing user-defined materialized views do not require this step. If the materialized view contains many rows, then, if appropriate, the materialized view should be partitioned (if possible) and should match the partitioning of the largest or most frequently updated detail or fact table (if possible). Refresh performance benefits from partitioning, because it can take advantage of parallel DML capabilities.

2. Use the CREATE MATERIALIZED VIEW statement to create and, optionally, populate the materialized view. If a user-defined materialized view already exists, then use the ON PREBUILT TABLE clause in the CREATE MATERIALIZED VIEW statement. Otherwise, use the BUILD IMMEDIATE clause to populate the materialized view immediately, or the BUILD DEFERRED clause to populate the materialized view later. A BUILD DEFERRED materialized view is disabled for use by query rewrite until the first REFRESH, after which it will be automatically enabled, provided the ENABLE QUERY REWRITE clause has been specified.
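
As an illustration of the BUILD DEFERRED path in step 2, the following sketch creates an empty materialized view and populates it later. The name cust_sales_deferred_mv is hypothetical, and the query is borrowed from the cust_sales_mv example.

```sql
CREATE MATERIALIZED VIEW cust_sales_deferred_mv   -- hypothetical name
  BUILD DEFERRED
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT c.cust_last_name,
       SUM(s.amount_sold) AS sum_amount_sold
FROM customers c, sales s
WHERE s.cust_id = c.cust_id
GROUP BY c.cust_last_name;

-- Later, populate it; 'C' requests a complete refresh. Only after this
-- first refresh does the view become usable by query rewrite.
BEGIN
  DBMS_MVIEW.REFRESH('cust_sales_deferred_mv', 'C');
END;
/
```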

See Also:

Oracle9i SQL Reference for descriptions of the SQL statements CREATE MATERIALIZED VIEW, ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW

Naming Materialized Views

The name of a materialized view must conform to standard Oracle naming conventions. However, if the materialized view is based on a user-defined prebuilt table, then the name of the materialized view must exactly match that table name.

If you already have a naming convention for tables and indexes, you might consider extending this naming scheme to the materialized views so that they are easily identifiable. For example, instead of naming the materialized view sum_of_sales, it could be called sum_of_sales_mv to denote that this is a materialized view and not a table or view.

Storage And Data Segment Compression

Unless the materialized view is based on a user-defined prebuilt table, it requires and occupies storage space inside the database. Therefore, the storage needs for the materialized view should be specified in terms of the tablespace where it is to reside and the size of the extents.

If you do not know how much space the materialized view will require, then the DBMS_OLAP.ESTIMATE_SIZE package, which is described in Chapter 16, "Summary Advisor", can estimate the number of bytes required to store this uncompressed materialized view. This information can then assist the design team in determining the tablespace in which the materialized view should reside.


You should use data segment compression with highly redundant data, such as tables with many foreign keys. This is particularly useful for materialized views created with the ROLLUP clause. Data segment compression reduces disk use and memory use (specifically, the buffer cache), often leading to a better scaleup for read-only operations. Data segment compression can also speed up query execution.

See Also:

Oracle9i SQL Reference for a complete description of STORAGE semantics, Oracle9i Database Performance Tuning Guide and Reference, and Chapter 5, "Parallelism and Partitioning in Data Warehouses" for data segment compression examples

Build Methods

Two build methods are available for creating the materialized view, as shown in Table 8-2. If you select BUILD IMMEDIATE, the materialized view definition is added to the schema objects in the data dictionary, and then the fact or detail tables are scanned according to the SELECT expression and the results are stored in the materialized view. Depending on the size of the tables to be scanned, this build process can take a considerable amount of time.

An alternative approach is to use the BUILD DEFERRED clause, which creates the materialized view without data, thereby enabling it to be populated at a later date using the DBMS_MVIEW.REFRESH package described in Chapter 14, "Maintaining the Data Warehouse".

Table 8-2 Build Methods

Build Method       Description
BUILD IMMEDIATE    Create the materialized view and then populate it with data
BUILD DEFERRED     Create the materialized view definition but do not populate it with data

Enabling Query Rewrite

Before creating a materialized view, you can verify what types of query rewrite are possible by calling the procedure DBMS_MVIEW.EXPLAIN_MVIEW. Once the materialized view has been created, you can use DBMS_MVIEW.EXPLAIN_REWRITE to find out if (or why not) it will rewrite a specific query.

Even though a materialized view is defined, it will not automatically be used by the query rewrite facility. You must set the QUERY_REWRITE_ENABLED initialization parameter to TRUE before using query rewrite. You also must specify the ENABLE QUERY REWRITE clause if the materialized view is to be considered available for rewriting queries.

If this clause is omitted or specified as DISABLE QUERY REWRITE when the materialized view is created, the materialized view can subsequently be enabled for query rewrite with the ALTER MATERIALIZED VIEW statement.

If you define a materialized view as BUILD DEFERRED, it is not eligible for query rewrite until it is populated with data.
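
A short sketch of these steps, using cust_sales_mv from the earlier example. The EXPLAIN_REWRITE call assumes that the output table REWRITE_TABLE has already been created in the current schema (Oracle ships the utlxrw.sql script for that purpose); the statement ID 'chk1' is an arbitrary label chosen for this illustration.

```sql
-- Make rewrite available in this session and on the materialized view.
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;
ALTER MATERIALIZED VIEW cust_sales_mv ENABLE QUERY REWRITE;

-- Ask why (or why not) a given query would be rewritten to use the view.
BEGIN
  DBMS_MVIEW.EXPLAIN_REWRITE(
    'SELECT c.cust_last_name, SUM(s.amount_sold)
       FROM customers c, sales s
      WHERE s.cust_id = c.cust_id
      GROUP BY c.cust_last_name',
    'cust_sales_mv',
    'chk1');
END;
/

SELECT message
FROM rewrite_table
WHERE statement_id = 'chk1'
ORDER BY sequence;
```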

Query Rewrite Restrictions

Query rewrite is not possible with all materialized views. If query rewrite is not occurring when expected, DBMS_MVIEW.EXPLAIN_REWRITE can help provide reasons why a specific query is not eligible for rewrite. Also, check to see if your materialized view satisfies all of the following conditions.

Materialized View Restrictions

You should keep in mind the following restrictions:

• The defining query of the materialized view cannot contain any non-repeatable expressions (ROWNUM, SYSDATE, non-repeatable PL/SQL functions, and so on).

• The query cannot contain any references to RAW or LONG RAW datatypes or object REFs.

• If the defining query of the materialized view contains set operators (UNION, MINUS, and so on), rewrite will use them for full text match rewrite only.

• If the materialized view was registered as PREBUILT, the precision of the columns must agree with the precision of the corresponding SELECT expressions unless overridden by the WITH REDUCED PRECISION clause.

• If the materialized view contains the same table more than once, it is possible to do a general rewrite, provided the query has the same aliases for the duplicate tables as the materialized view.

General Query Rewrite Restrictions

You should keep in mind the following restrictions:

• If a query has both local and remote tables, only local tables will be considered for potential rewrite.

• Neither the detail tables nor the materialized view can be owned by SYS.

• SELECT and GROUP BY lists, if present, must be the same in the query of the materialized view.

• Aggregate functions must occur only as the outermost part of the expression. That is, aggregates such as AVG(AVG(x)) or AVG(x) + AVG(x) are not allowed.

• CONNECT BY clauses are not allowed.


Refresh Options

When you define a materialized view, you can specify two refresh options: how to refresh and what type of refresh. If unspecified, the defaults are assumed as ON DEMAND and FORCE.

The two refresh execution modes are: ON COMMIT and ON DEMAND. Depending on the materialized view you create, some of the options may not be available. Table 8-3 describes the refresh modes.

Table 8-3 Refresh Modes

Refresh Mode    Description
ON COMMIT       Refresh occurs automatically when a transaction that modified one of the materialized view's detail tables commits. This can be specified as long as the materialized view is fast refreshable (in other words, not complex). The ON COMMIT privilege is necessary to use this mode
ON DEMAND       Refresh occurs when a user manually executes one of the available refresh procedures contained in the DBMS_MVIEW package (REFRESH, REFRESH_ALL_MVIEWS, REFRESH_DEPENDENT)

When a materialized view is maintained using the ON COMMIT method, the time required to complete the commit may be slightly longer than usual. This is because the refresh operation is performed as part of the commit process. Therefore this method may not be suitable if many users are concurrently changing the tables upon which the materialized view is based.

If you anticipate performing insert, update or delete operations on tables referenced by a materialized view concurrently with the refresh of that materialized view, and that materialized view includes joins and aggregation, Oracle recommends you use ON COMMIT fast refresh rather than ON DEMAND fast refresh.

If you think the materialized view did not refresh, check the alert log or trace file.

If a materialized view fails during refresh at COMMIT time, you must explicitly invoke the refresh procedure using the DBMS_MVIEW package after addressing the errors specified in the trace files. Until this is done, the materialized view will no longer be refreshed automatically at commit time.

You can specify how you want your materialized views to be refreshed from the detail tables by selecting one of four options: COMPLETE, FAST, FORCE, and NEVER. Table 8-4 describes the refresh options.


Table 8-4 Refresh Options

Refresh Option    Description
COMPLETE          Refreshes by recalculating the materialized view's defining query
FAST              Applies incremental changes to refresh the materialized view using the information logged in the materialized view logs, or from a SQL*Loader direct-path or a partition maintenance operation
FORCE             Applies FAST refresh if possible; otherwise, it applies COMPLETE refresh
NEVER             Indicates that the materialized view will not be refreshed with the Oracle refresh mechanisms

Whether the fast refresh option is available depends upon the type of materialized view. You can call the procedure DBMS_MVIEW.EXPLAIN_MVIEW to determine whether fast refresh is possible.
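
A minimal sketch of such a check, using cust_sales_mv as the subject. It assumes that the output table MV_CAPABILITIES_TABLE already exists in the current schema (Oracle supplies the utlxmv.sql script to create it).

```sql
-- Record the capabilities of the materialized view in MV_CAPABILITIES_TABLE.
BEGIN
  DBMS_MVIEW.EXPLAIN_MVIEW('cust_sales_mv');
END;
/

-- List the fast refresh capabilities and, where one is not possible,
-- the explanatory message.
SELECT capability_name, possible, msgtxt
FROM mv_capabilities_table
WHERE capability_name LIKE 'REFRESH_FAST%'
ORDER BY seq;
```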

General Restrictions on Fast Refresh

The defining query of the materialized view is restricted as follows:

• The materialized view must not contain references to non-repeating expressions like SYSDATE and ROWNUM.

• The materialized view must not contain references to RAW or LONG RAW data types.

Restrictions on Fast Refresh on Materialized Views with Joins Only

Defining queries for materialized views with joins only and no aggregates have the following restrictions on fast refresh:

• All restrictions from "General Restrictions on Fast Refresh".

• They cannot have GROUP BY clauses or aggregates.

• If the WHERE clause of the query contains outer joins, then unique constraints must exist on the join columns of the inner join table.

• If there are no outer joins, you can have arbitrary selections and joins in the WHERE clause. However, if there are outer joins, the WHERE clause cannot have any selections. Furthermore, if there are outer joins, all the joins must be connected by ANDs and must use the equality (=) operator.

• Rowids of all the tables in the FROM list must appear in the SELECT list of the query.


• Materialized aggregate views with outer joins are fast refreshable after conventional DML and direct loads, provided only the outer table has been modified. Also, unique constraints must exist on the join columns of the inner join table. If there are outer joins, all the joins must be connected by ANDs and must use the equality (=) operator.

• For materialized views with CUBE, ROLLUP, Grouping Sets, or concatenation of them, the following restrictions apply:

    • The SELECT list should contain a grouping distinguisher that can either be a GROUPING_ID function on all GROUP BY expressions or GROUPING functions one for each GROUP BY expression. For example, if the GROUP BY clause of the materialized view is "GROUP BY CUBE(a, b)", then the SELECT list should contain either "GROUPING_ID(a, b)" or "GROUPING(a) AND GROUPING(b)" for the materialized view to be fast refreshable.

    • GROUP BY should not result in any duplicate groupings. For example, "GROUP BY a, ROLLUP(a, b)" is not fast refreshable because it results in duplicate groupings "(a), (a, b), AND (a)".
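
The grouping distinguisher requirement can be illustrated with a small sketch. The name sales_cube_mv is hypothetical, and the example assumes that a suitable materialized view log on sales (with SEQUENCE, ROWID, the referenced columns, and INCLUDING NEW VALUES) already exists; whether the view is actually fast refreshable in a given environment should still be confirmed with DBMS_MVIEW.EXPLAIN_MVIEW.

```sql
CREATE MATERIALIZED VIEW sales_cube_mv           -- hypothetical name
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
AS
SELECT s.prod_id, s.time_id,
       -- Grouping distinguisher over all GROUP BY expressions.
       GROUPING_ID(s.prod_id, s.time_id) AS gid,
       COUNT(*)             AS cnt_all,
       COUNT(s.amount_sold) AS cnt_amount,
       SUM(s.amount_sold)   AS sum_amount
FROM sales s
GROUP BY CUBE(s.prod_id, s.time_id);
```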

Restrictions on Fast Refresh on Materialized Views With the UNION ALL Operator

Materialized views with the UNION ALL set operator support the REFRESH FAST option if the following conditions are satisfied:

• The defining query must have the UNION ALL operator at the top level.

The UNION ALL operator cannot be embedded inside a subquery, with one exception: The UNION ALL can be in a subquery in the FROM clause provided the defining query is of the form SELECT * FROM (view or subquery with UNION ALL) as in the following example:

CREATE VIEW view_with_unionall_mv
AS
(SELECT c.rowid crid, c.cust_id, 2 umarker
 FROM customers c
 WHERE c.cust_last_name = 'Smith'
 UNION ALL
 SELECT c.rowid crid, c.cust_id, 3 umarker
 FROM customers c
 WHERE c.cust_last_name = 'Jones');

CREATE MATERIALIZED VIEW unionall_inside_view_mv
  REFRESH FAST ON DEMAND
AS
SELECT * FROM view_with_unionall_mv;

Note that the view view_with_unionall_mv satisfies all requirements for fast refresh.

• Each query block in the UNION ALL query must satisfy the requirements of a fast refreshable materialized view with aggregates or a fast refreshable materialized view with joins.

The appropriate materialized view logs must be created on the tables as required for the corresponding type of fast refreshable materialized view.

Note that Oracle also allows the special case of a single table materialized view with joins only provided the ROWID column has been included in the SELECT list and in the materialized view log. This is shown in the defining query of the view view_with_unionall_mv.

• The SELECT list of each query must include a maintenance column, called a UNION ALL marker. The UNION ALL column must have a distinct constant numeric or string value in each UNION ALL branch. Further, the marker column must appear in the same ordinal position in the SELECT list of each query block.

• Some features such as outer joins, insert-only aggregate materialized view queries and remote tables are not supported for materialized views with UNION ALL.

• Partition Change Tracking-based refresh is not supported for UNION ALL materialized views.

• The compatibility initialization parameter must be set to 9.2.0 to create a fast refreshable materialized view with UNION ALL.

ORDER BY Clause

An ORDER BY clause is allowed in the CREATE MATERIALIZED VIEW statement. It is used only during the initial creation of the materialized view. It is not used during a full refresh or a fast refresh.

To improve the performance of queries against large materialized views, store the rows in the materialized view in the order specified in the ORDER BY clause. This initial ordering provides physical clustering of the data. If indexes are built on the columns by which the materialized view is ordered, accessing the rows of the materialized view using the index often reduces the time for disk I/O due to the physical clustering.

The ORDER BY clause is not considered part of the materialized view definition. As a result, there is no difference in the manner in which Oracle detects the various types of materialized views (for example, materialized join views with no aggregates). For the same reason, query rewrite is not affected by the ORDER BY clause. This feature is similar to the CREATE TABLE ... ORDER BY capability that exists in Oracle.

Materialized View Logs

Materialized view logs are required if you want to use fast refresh. They are defined using a CREATE MATERIALIZED VIEW LOG statement on the base table that is to be changed. They are not created on the materialized view. For fast refresh of materialized views, the definition of the materialized view logs must specify the ROWID clause. In addition, for aggregate materialized views, it must also contain every column in the table referenced in the materialized view, the INCLUDING NEW VALUES clause and the SEQUENCE clause.

An example of a materialized view log is shown as follows where one is created on the table sales.

CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID
  (prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold)
  INCLUDING NEW VALUES;

Oracle recommends that the keyword SEQUENCE be included in your materialized view log statement unless you are sure that you will never perform a mixed DML operation (a combination of INSERT, UPDATE, or DELETE operations on multiple tables).

The boundary of a mixed DML operation is determined by whether the materialized view is ON COMMIT or ON DEMAND.

• For ON COMMIT, the mixed DML statements occur within the same transaction because the refresh of the materialized view will occur upon commit of this transaction.

• For ON DEMAND, the mixed DML statements occur between refreshes. The following example of a materialized view log illustrates where one is created on the table sales that includes the SEQUENCE keyword:

  CREATE MATERIALIZED VIEW LOG ON sales
    WITH SEQUENCE, ROWID
    (prod_id, cust_id, time_id, channel_id, promo_id,
     quantity_sold, amount_sold)
    INCLUDING NEW VALUES;

.sing Oracle $nterprise )anager 

' materiali5ed view can also be created sin! Oracle 6nterprise Mana!er b* selectin! themateriali5ed view ob:ect t*pe" There is no difference in the information re4ired if thisapproach is sed" However/ *o mst complete three propert* sheets and *o mst

ensre that the option ,nable -ery .ewrite on the #eneral sheet is selected"

See Also: 

Oracle /nterprise Manager Con"iguration #uide and Chapter 02/9Smmar* 'dvisor9 for frther information

Using Materialized Views with NLS Parameters

When using certain materialized views, you must ensure that your NLS parameters are the same as when you created the materialized view. Materialized views with this restriction are as follows:

• Expressions that may return different values, depending on NLS parameter settings. For example, (date > "01-02-03") or (rate <= "2.150") are NLS parameter dependent expressions.

• Equijoins where one side of the join is character data. The result of this equijoin depends on collation and this can change on a session basis, giving an incorrect result in the case of query rewrite or an inconsistent materialized view after a refresh operation.

• Expressions that generate internal conversion to character data in the SELECT list of a materialized view, or inside an aggregate of a materialized aggregate view. This restriction does not apply to expressions that involve only numeric data, for example, a*b where a and b are numeric fields.
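
One way to act on this restriction is to record the session's NLS settings when the materialized view is created and to compare them before a refresh or before relying on query rewrite. The following is a minimal sketch, assuming the standard NLS_SESSION_PARAMETERS dictionary view; the parameter list and the values set below are illustrative assumptions, not recommendations.

```sql
-- Inspect the NLS settings of the current session.
SELECT parameter, value
FROM nls_session_parameters
WHERE parameter IN ('NLS_DATE_FORMAT', 'NLS_NUMERIC_CHARACTERS', 'NLS_SORT');

-- Restore the settings that were in effect when the materialized view
-- was created (the values here are assumptions for illustration only).
ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MM-RR';
ALTER SESSION SET NLS_SORT = BINARY;
```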

Registering Existing Materialized Views

Some data warehouses have implemented materialized views in ordinary user tables. Although this solution provides the performance benefits of materialized views, it does not:

• Provide query rewrite to all SQL applications

• Enable materialized views defined in one application to be transparently accessed in another application

• Generally support fast parallel or fast materialized view refresh

Because of these limitations, and because existing materialized views can be extremely large and expensive to rebuild, you should register your existing materialized view tables with Oracle whenever possible. You can register a user-defined materialized view with the CREATE MATERIALIZED VIEW ... ON PREBUILT TABLE statement. Once registered, the materialized view can be used for query rewrites or maintained by one of the refresh methods, or both.

The contents of the table must reflect the materialization of the defining query at the time you register it as a materialized view, and each column in the defining query must correspond to a column in the table that has a matching datatype. However, you can specify WITH REDUCED PRECISION to allow the precision of columns in the defining query to be different from that of the table columns.

The table and the materialized view must have the same name, but the table retains its identity as a table and can contain columns that are not referenced in the defining query of the materialized view. These extra columns are known as unmanaged columns. If rows are inserted during a refresh operation, each unmanaged column of the row is set to its default value. Therefore, the unmanaged columns cannot have NOT NULL constraints unless they also have default values.


Materialized views based on prebuilt tables are eligible for selection by query rewrite provided the parameter QUERY_REWRITE_INTEGRITY is set to at least the level of stale_tolerated or trusted.

See Also:

Chapter 22, "Query Rewrite" for details about integrity levels

When you drop a materialized view that was created on a prebuilt table, the table still exists--only the materialized view is dropped.

When a prebuilt table is registered as a materialized view and query rewrite is desired, the parameter QUERY_REWRITE_INTEGRITY must be set to at least stale_tolerated because, when it is created, the materialized view is marked as unknown. Therefore, only stale integrity modes can be used.
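
A minimal sketch of the session settings this implies, assuming query rewrite is controlled at the session level (the same parameters can also be set for the whole instance):

```sql
-- Allow rewrite against materialized views whose freshness is unknown,
-- such as those registered on prebuilt tables.
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = STALE_TOLERATED;
```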

The following example illustrates the two steps required to register a user-defined table. First, the table is created, then the materialized view is defined using exactly the same name as the table. This materialized view sum_sales_tab is eligible for use in query rewrite.

CREATE TABLE sum_sales_tab
  PCTFREE 0 TABLESPACE demo
  STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0) AS
  SELECT s.prod_id, SUM(amount_sold) AS dollar_sales,
         SUM(quantity_sold) AS unit_sales
  FROM sales s GROUP BY s.prod_id;

CREATE MATERIALIZED VIEW sum_sales_tab
ON PREBUILT TABLE WITHOUT REDUCED PRECISION
ENABLE QUERY REWRITE AS
  SELECT s.prod_id, SUM(amount_sold) AS dollar_sales,
         SUM(quantity_sold) AS unit_sales
  FROM sales s GROUP BY s.prod_id;

You could have compressed this table to save space. See "Storage And Data Segment Compression" for details regarding data segment compression.

In some cases, user-defined materialized views are refreshed on a schedule that is longer than the update cycle. For example, a monthly materialized view might be updated only at the end of each month, and the materialized view values always refer to complete time periods. Reports written directly against these materialized views implicitly select only data that is not in the current (incomplete) time period. If a user-defined materialized view already contains a time dimension:


• It should be registered and then fast refreshed each update cycle.

• You can create a view that selects the complete time period of interest.

• The reports should be modified to refer to the view instead of referring directly to the user-defined materialized view.

If the user-defined materialized view does not contain a time dimension, then:

• Create a new materialized view that does include the time dimension (if possible).

• The view should aggregate over the time column in the new materialized view.
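
The view that reports should use could be sketched as follows. The names monthly_sales_mv, complete_monthly_sales, and month_start are hypothetical; the sketch assumes the user-defined materialized view holds one row for each product and month, with month_start stored as a DATE.

```sql
-- Expose only months that have already ended, so that reports never
-- see the current (incomplete) period.
CREATE OR REPLACE VIEW complete_monthly_sales AS
  SELECT prod_id, month_start, dollar_sales, unit_sales
  FROM monthly_sales_mv
  WHERE month_start < TRUNC(SYSDATE, 'MM');
```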

Partitioning and Materialized Views

Because of the large volume of data held in a data warehouse, partitioning is an extremely useful option when designing a database.

Partitioning the fact tables improves scalability, simplifies system administration, and makes it possible to define local indexes that can be efficiently rebuilt. Partitioning the fact tables also improves the opportunity of fast refreshing the materialized view when the partition maintenance operation occurs.

Partitioning a materialized view also has benefits for refresh, because the refresh procedure can use parallel DML to maintain the materialized view.

See Also:

Chapter 5, "Parallelism and Partitioning in Data Warehouses" for further details about partitioning

Partition Change Tracking

It is possible and advantageous to track freshness to a finer grain than the entire materialized view. The ability to identify which rows in a materialized view are affected by a certain detail table partition is known as Partition Change Tracking (PCT). When one or more of the detail tables are partitioned, it may be possible to identify the specific rows in the materialized view that correspond to a modified detail partition(s); those rows become stale when a partition is modified while all other rows remain fresh.

Partition Change Tracking can be used to identify which materialized view rows correspond to a particular detail table. Partition Change Tracking is also used to support fast refresh after partition maintenance operations on detail tables. For instance, if a detail table partition is truncated or dropped, the affected rows in the materialized view are identified and deleted. Identifying which materialized view rows are fresh or stale, rather than considering the entire materialized view as stale, allows query rewrite to use those rows that are fresh while in QUERY_REWRITE_INTEGRITY = ENFORCED or TRUSTED modes.

To support PCT, a materialized view must satisfy the following requirements:


... partitioned table using the time_id column and products is partitioned by the prod_category column. times is not a partitioned table.

The first materialized view is for the yearly sales revenue for each product.

The second materialized view is for monthly customer sales. As customers tend to purchase in bulk, sales average just two orders for each customer per month. Therefore, the impact of including the time_id in the materialized view will not unacceptably increase the number of rows stored. However, most orders are large and contain many different products. With approximately 1000 different products sold each day, including the time_id in the materialized view would substantially increase the cardinality. This materialized view uses the DBMS_MVIEW.PMARKER function.

The detail tables must have materialized view logs for FAST REFRESH.

CREATE MATERIALIZED VIEW LOG ON SALES WITH ROWID
  (prod_id, time_id, quantity_sold, amount_sold)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON PRODUCTS WITH ROWID
  (prod_id, prod_name, prod_desc)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON TIMES WITH ROWID
  (time_id, calendar_month_name, calendar_year)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW cust_mth_sales_mv
BUILD DEFERRED REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE AS
  SELECT s.time_id, p.prod_id, SUM(s.quantity_sold), SUM(s.amount_sold),
         p.prod_name, t.calendar_month_name, COUNT(*),
         COUNT(s.quantity_sold), COUNT(s.amount_sold)
  FROM sales s, products p, times t
  WHERE s.time_id = t.time_id AND s.prod_id = p.prod_id
  GROUP BY t.calendar_month_name, p.prod_id, p.prod_name, s.time_id;

cust_mth_sales_mv includes the partition key column from table sales (time_id) in both its SELECT and GROUP BY lists. This enables PCT on table sales for materialized view cust_mth_sales_mv. However, the GROUP BY and SELECT lists include PRODUCTS.PROD_ID rather than the partition key column (PROD_CATEGORY) of the products table. Therefore, PCT is not enabled on table products for this materialized view. In other words, any partition maintenance operation to the sales table will allow a PCT fast refresh of cust_mth_sales_mv. However, PCT fast refresh is not possible after any kind of modification to the products table. To correct this, the GROUP BY and SELECT lists must include column PRODUCTS.PROD_CATEGORY. Following a partition maintenance operation, such as a drop partition, a PCT fast refresh should be performed on any materialized view that is referencing the table upon which the partition operations are undertaken.
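
The refresh step can be sketched as follows. The sketch assumes that the materialized view has already been populated once (it is created with BUILD DEFERRED) and that sales has a partition named sales_part1, which is a hypothetical name. Whether the fast refresh succeeds depends on the PCT requirements being met.

```sql
-- Partition maintenance operation on the detail table
-- (sales_part1 is a hypothetical partition name).
ALTER TABLE sales DROP PARTITION sales_part1;

-- Request a fast refresh ('F') of the dependent materialized view.
EXECUTE DBMS_MVIEW.REFRESH('CUST_MTH_SALES_MV', 'F');
```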

Example 8-8 Creating a Materialized View

CREATE MATERIALIZED VIEW prod_yr_sales_mv
BUILD DEFERRED
REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE AS
  SELECT DBMS_MVIEW.PMARKER(s.rowid),
         DBMS_MVIEW.PMARKER(p.rowid),
         s.prod_id, SUM(s.amount_sold), SUM(s.quantity_sold),
         p.prod_name, t.calendar_year, COUNT(*),
         COUNT(s.amount_sold), COUNT(s.quantity_sold)
  FROM sales s, products p, times t
  WHERE s.time_id = t.time_id AND
        s.prod_id = p.prod_id
  GROUP BY DBMS_MVIEW.PMARKER (s.rowid),
           DBMS_MVIEW.PMARKER (p.rowid),
           t.calendar_year, s.prod_id, p.prod_name;

prod_yr_sales_mv includes the DBMS_MVIEW.PMARKER function on the sales and products tables in both its SELECT and GROUP BY lists. This enables partition change tracking on both the sales table and the products table with significantly less cardinality impact than grouping by the respective partition key columns. In this example, the desired level of aggregation for prod_yr_sales_mv is to group by times.calendar_year. Using the DBMS_MVIEW.PMARKER function, the materialized view cardinality is increased only by a factor of the number of partitions in the sales table times the number of partitions in the products table. This would generally be significantly less than the cardinality impact of including the respective partition key columns.

A subsequent INSERT statement adds a new row to the sales_part3 partition of table sales. At this point, because cust_mth_sales_mv and prod_yr_sales_mv have partition change tracking available on table sales, Oracle can determine that those rows in these materialized views corresponding to sales_part3 are stale, while all other rows in these materialized views are unchanged in their freshness state. An INSERT INTO products statement is not tracked for materialized view cust_mth_sales_mv. Therefore, cust_mth_sales_mv becomes completely stale when the products table is modified in this way.
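
To see which of these capabilities Oracle actually reports for a materialized view, one option is the DBMS_MVIEW.EXPLAIN_MVIEW procedure. The sketch below assumes that MV_CAPABILITIES_TABLE has already been created in the current schema with the utlxmv.sql script, and it filters on the PCT-related capability rows.

```sql
EXECUTE DBMS_MVIEW.EXPLAIN_MVIEW('CUST_MTH_SALES_MV');

-- List the PCT-related capabilities and the detail tables they apply to.
SELECT capability_name, possible, related_text, msgtxt
FROM mv_capabilities_table
WHERE mvname = 'CUST_MTH_SALES_MV'
  AND capability_name LIKE 'PCT%'
ORDER BY seq;

-- Overall freshness as recorded in the data dictionary.
SELECT mview_name, staleness, last_refresh_type
FROM user_mviews
WHERE mview_name IN ('CUST_MTH_SALES_MV', 'PROD_YR_SALES_MV');
```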

Partitioning a Materialized View

Partitioning a materialized view involves defining the materialized view with the standard Oracle partitioning clauses, as illustrated in the following example. This statement creates a materialized view called part_sales_mv, which uses three partitions, can be fast refreshed, and is eligible for query rewrite.

CREATE MATERIALIZED VIEW part_sales_mv
PARALLEL


9 Dimensions

The following sections will help you create and manage a data warehouse:

• What are Dimensions?

• Creating Dimensions

• Viewing Dimensions

• Using Dimensions with Constraints

• Validating Dimensions

• Altering Dimensions

• Deleting Dimensions

• Using the Dimension Wizard

What are Dimensions?

A dimension is a structure that categorizes data in order to enable users to answer business questions. Commonly used dimensions are customers, products, and time.

For example, each sales channel of a clothing retailer might gather and store data regarding sales and returns of its clothing assortment. The retail chain management can build a data warehouse to analyze the sales of its products across all stores over time and help answer questions such as:

• What is the effect of promoting one product on the sale of a related product that is not promoted?

• What are the sales of a product before and after a promotion?

• How does a promotion affect the various distribution channels?


The data in the retailer's data warehouse system has two important components: dimensions and facts. The dimensions are products, customers, promotions, channels, and time. One approach for identifying your dimensions is to review your reference tables, such as a product table that contains everything about a product, or a promotion table containing all information about promotions. The facts are sales (units sold) and profits.

A data warehouse contains facts about the sales of each product on a daily basis.

A typical relational implementation for such a data warehouse is a Star Schema. The fact information is stored in the so-called fact table, whereas the dimensional information is stored in the so-called dimension tables. In our example, each sales transaction record is uniquely defined as for each customer, for each product, for each sales channel, for each promotion, and for each day (time).

See Also:

Chapter 17, "Schema Modeling Techniques" for further details

In Oracle9i, the dimensional information itself is stored in a dimension table. In addition, the database object dimension helps to organize and group dimensional information into hierarchies. This represents natural 1:n relationships between columns or column groups (the levels of a hierarchy) that cannot be represented with constraint conditions. Going up a level in the hierarchy is called rolling up the data and going down a level in the hierarchy is called drilling down the data. In the retailer example:

• Within the time dimension, months roll up to quarters, quarters roll up to years, and years roll up to all years.

• Within the product dimension, products roll up to subcategories, subcategories roll up to categories, and categories roll up to all products.

• Within the customer dimension, customers roll up to city. Then cities roll up to state. Then states roll up to country. Then countries roll up to subregion. Finally, subregions roll up to region, as shown in Figure 9-1.

Figure 9-1 Sample Rollup for a Customer Dimension


Data analysis typically starts at higher levels in the dimensional hierarchy and gradually drills down if the situation warrants such analysis.
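
Rolling up and drilling down correspond to grouping the same facts at different levels of the hierarchy. A minimal sketch, assuming the sales and customers tables of the sample schema are joined on cust_id, and using 'CA' purely as an illustrative state value:

```sql
-- Rolled up: sales by state.
SELECT c.cust_state_province, SUM(s.amount_sold) AS dollar_sales
FROM sales s, customers c
WHERE s.cust_id = c.cust_id
GROUP BY c.cust_state_province;

-- Drilled down: sales by city within one state.
SELECT c.cust_city, SUM(s.amount_sold) AS dollar_sales
FROM sales s, customers c
WHERE s.cust_id = c.cust_id
  AND c.cust_state_province = 'CA'
GROUP BY c.cust_city;
```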

Dimensions do not have to be defined, but spending time creating them can yield significant benefits, because they help query rewrite perform more complex types of rewrite. They are mandatory if you use the Summary Advisor (a GUI tool for materialized view management) to recommend which materialized views to create, drop, or retain.

See Also:

Chapter 22, "Query Rewrite" for further details regarding query rewrite and Chapter 16, "Summary Advisor" for further details regarding the Summary Advisor

You must not create dimensions in any schema that does not satisfy these relationships. Incorrect results can be returned from queries otherwise.

Creating Dimensions

Before you can create a dimension object, the dimension tables must exist in the database, containing the dimension data. For example, if you create a customer dimension, one or more tables must exist that contain the city, state, and country information. In a star schema data warehouse, these dimension tables already exist. It is therefore a simple task to identify which ones will be used.


Now you can draw the hierarchies of a dimension as shown in Figure 9-1. For example, city is a child of state (because you can aggregate city-level data up to state), and state is a child of country. This hierarchical information will be stored in the database object dimension.

In the case of normalized or partially normalized dimension representation (a dimension that is stored in more than one table), identify how these tables are joined. Note whether the joins between the dimension tables can guarantee that each child-side row joins with one and only one parent-side row. In the case of denormalized dimensions, determine whether the child-side columns uniquely determine the parent-side (or attribute) columns. These constraints can be enabled with the NOVALIDATE and RELY clauses if the relationships represented by the constraints are guaranteed by other means.

You create a dimension using either the CREATE DIMENSION statement or the Dimension Wizard in Oracle Enterprise Manager. Within the CREATE DIMENSION statement, use the LEVEL clause to identify the names of the dimension levels.

See Also:

Oracle9i SQL Reference for a complete description of the CREATE DIMENSION statement

This customer dimension contains a single hierarchy with a geographical rollup, with arrows drawn from the child level to the parent level, as shown in Figure 9-1.

Each arrow in this graph indicates that for any child there is one and only one parent. For example, each city must be contained in exactly one state and each state must be contained in exactly one country. States that belong to more than one country, or that belong to no country, violate hierarchical integrity. Hierarchical integrity is necessary for the correct operation of management functions for materialized views that include aggregates.
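
Hierarchical integrity can be checked directly with a query before a dimension is declared. The following sketch covers the city-to-state relationship and assumes the cust_city and cust_state_province columns of the customers table. Any row returned by the first query is a city that appears under more than one state; the second query finds cities with no state.

```sql
-- Cities that belong to more than one state.
SELECT cust_city, COUNT(DISTINCT cust_state_province) AS parent_count
FROM customers
GROUP BY cust_city
HAVING COUNT(DISTINCT cust_state_province) > 1;

-- Cities that belong to no state.
SELECT DISTINCT cust_city
FROM customers
WHERE cust_state_province IS NULL;
```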

For example, you can declare a dimension products_dim, which contains levels product, subcategory, and category:

CREATE DIMENSION products_dim
  LEVEL product IS (products.prod_id)
  LEVEL subcategory IS (products.prod_subcategory)
  LEVEL category IS (products.prod_category) ...

Each level in the dimension must correspond to one or more columns in a table in the database. Thus, level product is identified by the column prod_id in the products table and level subcategory is identified by a column called prod_subcategory in the same table.

In this example, the database tables are denormalized and all the columns exist in the same table. However, this is not a prerequisite for creating dimensions. "Using Normalized Dimension Tables" shows how to define a dimension whose levels come from more than one table.


A sample dimension definition follows:

CREATE DIMENSION products_dim
  LEVEL product IS (products.prod_id)
  LEVEL subcategory IS (products.prod_subcategory)
  LEVEL category IS (products.prod_category)
  HIERARCHY prod_rollup (
    product CHILD OF
    subcategory CHILD OF
    category
  )
  ATTRIBUTE product DETERMINES
    (products.prod_name, products.prod_desc,
     prod_weight_class, prod_unit_of_measure,
     prod_pack_size, prod_status, prod_list_price, prod_min_price)
  ATTRIBUTE subcategory DETERMINES
    (prod_subcategory, prod_subcat_desc)
  ATTRIBUTE category DETERMINES
    (prod_category, prod_cat_desc);

The design, creation, and maintenance of dimensions is part of the design, creation, and maintenance of your data warehouse schema. Once the dimension has been created, check that it meets these requirements:

• There must be a 1:n relationship between a parent and children. A parent can have one or more children, but a child can have only one parent.

• There must be a 1:1 attribute relationship between hierarchy levels and their dependent dimension attributes. For example, if there is a column fiscal_month_desc, then a possible attribute relationship would be fiscal_month_desc to fiscal_month_name.

• If the columns of a parent level and child level are in different relations, then the connection between them also requires a 1:n join relationship. Each row of the child table must join with one and only one row of the parent table. This relationship is stronger than referential integrity alone, because it requires that the child join key must be non-null, that referential integrity must be maintained from the child join key to the parent join key, and that the parent join key must be unique.

• You must ensure (using database constraints if necessary) that the columns of each hierarchy level are non-null and that hierarchical integrity is maintained.

• The hierarchies of a dimension can overlap or be disconnected from each other. However, the columns of a hierarchy level cannot be associated with more than one dimension.

• Join relationships that form cycles in the dimension graph are not supported. For example, a hierarchy level cannot be joined to itself either directly or indirectly.
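
The 1:1 attribute requirement can be tested with a grouping query. The sketch below assumes that the times table holds both fiscal_month_desc and fiscal_month_name. It returns every level value that maps to more than one attribute value, so an empty result is the desired outcome.

```sql
-- Level values that determine more than one attribute value.
SELECT fiscal_month_desc,
       COUNT(DISTINCT fiscal_month_name) AS attribute_values
FROM times
GROUP BY fiscal_month_desc
HAVING COUNT(DISTINCT fiscal_month_name) > 1;
```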

Note:

The information stored with a dimension object is only declarative. The previously discussed relationships are not enforced with the creation of a dimension object. You should validate any dimension definition with the DBMS_OLAP.VALIDATE_DIMENSION procedure, as discussed in "Validating Dimensions".

Multiple Hierarchies

A single dimension definition can contain multiple hierarchies. Suppose our retailer wants to track the sales of certain items over time. The first step is to define the time dimension over which sales will be tracked. Figure 9-2 illustrates a dimension times_dim with two time hierarchies.

Figure 9-2 times_dim Dimension with Two Time Hierarchies

From the illustration, you can construct the hierarchy of the denormalized times_dim dimension's CREATE DIMENSION statement as follows. The complete CREATE DIMENSION statement as well as the CREATE TABLE statement are shown in Oracle9i Sample Schemas.

CREATE DIMENSION times_dim
  LEVEL day IS TIMES.TIME_ID
  LEVEL month IS TIMES.CALENDAR_MONTH_DESC
  LEVEL quarter IS TIMES.CALENDAR_QUARTER_DESC
  LEVEL year IS TIMES.CALENDAR_YEAR
  LEVEL fis_week IS TIMES.WEEK_ENDING_DAY
  LEVEL fis_month IS TIMES.FISCAL_MONTH_DESC
  LEVEL fis_quarter IS TIMES.FISCAL_QUARTER_DESC
  LEVEL fis_year IS TIMES.FISCAL_YEAR
  HIERARCHY cal_rollup (
    day CHILD OF
    month CHILD OF
    quarter CHILD OF
    year
  )
  HIERARCHY fis_rollup (
    day CHILD OF
    fis_week CHILD OF
    fis_month CHILD OF
    fis_quarter CHILD OF
    fis_year
  ) <attribute determination clauses>...

Using Normalized Dimension Tables

The tables used to define a dimension may be normalized or denormalized and the individual hierarchies can be normalized or denormalized. If the levels of a hierarchy come from the same table, it is called a fully denormalized hierarchy. For example, cal_rollup in the times_dim dimension is a denormalized hierarchy. If levels of a hierarchy come from different tables, such a hierarchy is either a fully or partially normalized hierarchy. This section shows how to define a normalized hierarchy.

Suppose the tracking of a customer's location is done by city, state, and country. This data is stored in the tables customers and countries. The customer dimension customers_dim is partially normalized because the data entities cust_id and country_id are taken from different tables. The clause JOIN KEY within the dimension definition specifies how to join together the levels in the hierarchy. The dimension statement is partially shown in the following. The complete CREATE DIMENSION statement as well as the CREATE TABLE statement are shown in Oracle9i Sample Schemas.

CREATE DIMENSION customers_dim
  LEVEL customer IS (customers.cust_id)
  LEVEL city IS (customers.cust_city)
  LEVEL state IS (customers.cust_state_province)
  LEVEL country IS (countries.country_id)
  LEVEL subregion IS (countries.country_subregion)
  LEVEL region IS (countries.country_region)
  HIERARCHY geog_rollup (
    customer CHILD OF
    city CHILD OF
    state CHILD OF
    country CHILD OF
    subregion CHILD OF
    region
  JOIN KEY (customers.country_id) REFERENCES country
  ) ...attribute determination clause;
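
Because a JOIN KEY relationship is declared rather than enforced, it can be worth testing the three conditions it implies before creating the dimension. A sketch, assuming only the customers.country_id and countries.country_id columns used in the statement above; each query should return zero or no rows.

```sql
-- 1. The child join key must be non-null.
SELECT COUNT(*) AS null_keys
FROM customers
WHERE country_id IS NULL;

-- 2. Every child join key must have a matching parent row.
SELECT DISTINCT c.country_id
FROM customers c
WHERE c.country_id IS NOT NULL
  AND NOT EXISTS (SELECT 1
                  FROM countries k
                  WHERE k.country_id = c.country_id);

-- 3. The parent join key must be unique.
SELECT country_id, COUNT(*) AS occurrences
FROM countries
GROUP BY country_id
HAVING COUNT(*) > 1;
```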

Viewing Dimensions

Dimensions can be viewed through one of two methods:

• Using The DEMO_DIM Package

• Using Oracle Enterprise Manager

Using The DEMO_DIM Package

Two procedures allow you to display the dimensions that have been defined. First, the file smdim.sql, located under $ORACLE_HOME/rdbms/demo, must be executed to provide the DEMO_DIM package, which includes:

• DEMO_DIM.PRINT_DIM to print a specific dimension

• DEMO_DIM.PRINT_ALLDIMS to print all dimensions accessible to a user

The DEMO_DIM.PRINT_DIM procedure has only one parameter: the name of the dimension to display. The following example shows how to display the dimension TIMES_DIM.

SET SERVEROUTPUT ON;
EXECUTE DEMO_DIM.PRINT_DIM ('TIMES_DIM');

To display all of the dimensions that have been defined, call the procedure DEMO_DIM.PRINT_ALLDIMS without any parameters, as illustrated in the following.

EXECUTE DBMS_OUTPUT.ENABLE(10000);
EXECUTE DEMO_DIM.PRINT_ALLDIMS;

Regardless of which procedure is called, the output format is identical. A sample display is shown here.

DIMENSION SH.PROMO_DIM
LEVEL CATEGORY IS SH.PROMOTIONS.PROMO_CATEGORY
LEVEL PROMO IS SH.PROMOTIONS.PROMO_ID
LEVEL SUBCATEGORY IS SH.PROMOTIONS.PROMO_SUBCATEGORY
HIERARCHY PROMO_ROLLUP ( PROMO
CHILD OF SUBCATEGORY
CHILD OF CATEGORY)
ATTRIBUTE CATEGORY DETERMINES SH.PROMOTIONS.PROMO_CATEGORY
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_BEGIN_DATE
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_COST
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_END_DATE
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_NAME
ATTRIBUTE SUBCATEGORY DETERMINES SH.PROMOTIONS.PROMO_SUBCATEGORY
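
If the DEMO_DIM package is not installed, similar information is available from the data dictionary. This is a sketch rather than a replacement for the package output; it assumes the USER_DIMENSIONS, USER_DIM_LEVELS, and USER_DIM_HIERARCHIES views and a dimension named TIMES_DIM in the current schema.

```sql
-- Dimensions owned by the current user and whether they are invalid.
SELECT dimension_name, invalid
FROM user_dimensions;

-- Levels of one dimension and the table each level comes from.
SELECT level_name, detailobj_name
FROM user_dim_levels
WHERE dimension_name = 'TIMES_DIM';

-- Hierarchies defined in that dimension.
SELECT hierarchy_name
FROM user_dim_hierarchies
WHERE dimension_name = 'TIMES_DIM';
```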

Using Oracle Enterprise Manager

All of the dimensions that exist in the data warehouse can be viewed using Oracle Enterprise Manager. Select the Dimension object from within the Schema icon to display all of the dimensions. Select a specific dimension to graphically display its hierarchy, levels, and any attributes that have been defined.

See Also:

Oracle Enterprise Manager Administrator's Guide and "Using the Dimension Wizard" for details regarding creating and using dimensions

Using Dimensions with Constraints

Constraints play an important role with dimensions. Full referential integrity is sometimes enabled in data warehouses, but not always. This is because operational databases normally have full referential integrity and you can ensure that the data flowing into your warehouse never violates the already established integrity rules.

Oracle recommends that constraints be enabled and, if validation time is a concern, then the NOVALIDATE clause should be used as follows:

ENABLE NOVALIDATE CONSTRAINT pk_time;

Primary and foreign keys should be implemented also. Referential integrity constraints and NOT NULL constraints on the fact tables provide information that query rewrite can use to extend the usefulness of materialized views.

In addition, you should use the RELY clause to inform query rewrite that it can rely upon the constraints being correct as follows:

ALTER TABLE time MODIFY CONSTRAINT pk_time RELY;

This information is also used for query rewrite.
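
Written out as complete statements, the two fragments above might look as follows. The foreign key is an additional illustration: the constraint name fk_sales_time and the assumption that both tables have a time_id column are hypothetical.

```sql
-- Enable the constraint using NOVALIDATE, as recommended when
-- validation time is a concern.
ALTER TABLE time ENABLE NOVALIDATE CONSTRAINT pk_time;

-- Tell query rewrite that the constraint can be trusted.
ALTER TABLE time MODIFY CONSTRAINT pk_time RELY;

-- A foreign key from the fact table declared in the same way
-- (hypothetical constraint and column names).
ALTER TABLE sales ADD CONSTRAINT fk_sales_time
  FOREIGN KEY (time_id) REFERENCES time (time_id)
  RELY ENABLE NOVALIDATE;
```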

See Also:

Chapter 22, "Query Rewrite" for further details

Validating Dimensions

The information of a dimension object is declarative only and not enforced by the database. If the relationships described by the dimensions are incorrect, incorrect results could occur. Therefore, you should verify the relationships specified by CREATE DIMENSION using the DBMS_OLAP.VALIDATE_DIMENSION procedure periodically.

This procedure is easy to use and has only five parameters:

• Dimension name

• Owner name

• Set to TRUE to check only the new rows for tables of this dimension

• Set to TRUE to verify that all columns are not null

• Unique run ID obtained by calling the DBMS_OLAP.CREATE_ID procedure. The ID is used to identify the result of each run

The following example validates the dimension TIME_FN in the grocery schema:

VARIABLE RID NUMBER;
EXECUTE DBMS_OLAP.CREATE_ID(:RID);
EXECUTE DBMS_OLAP.VALIDATE_DIMENSION ('TIME_FN', 'GROCERY', FALSE, TRUE, :RID);

If the VALIDATE_DIMENSION procedure encounters any errors, they are placed in a system table. The table can be accessed from the view SYSTEM.MVIEW_EXCEPTIONS. Querying this view will identify the exceptions that were found. For example:

SELECT * FROM SYSTEM.MVIEW_EXCEPTIONS
WHERE RUNID = :RID;

RUNID OWNER    TABLE_NAME  DIMENSION_NAME RELATIONSHIP BAD_ROWID
----- -------- ----------- -------------- ------------ ------------------
  678 GROCERY  MONTH       TIME_FN        FOREIGN KEY  AAAAuwAAJAAAARwAAA

However, rather than querying this view, it may be better to use the rowid of the invalid row to retrieve the actual row that has violated the constraint. In this example, the dimension TIME_FN is checking a table called month. It has found a row that violates the constraints. Using the rowid, you can see exactly which row in the month table is causing the problem, as in the following:

SELECT * FROM month
WHERE rowid IN (SELECT bad_rowid
                FROM SYSTEM.MVIEW_EXCEPTIONS
                WHERE RUNID = :RID);

MONTH    QUARTER FISCAL_QTR YEAR FULL_MONTH_NAME MONTH_NUMB
-------- ------- ---------- ---- --------------- ----------
  199903   19981      19981 1998 March                    3

Finally, to remove results from the system table for the current run:

EXECUTE DBMS_OLAP.PURGE_RESULTS(:RID);

Altering Dimensions

You can modify the dimension using the ALTER DIMENSION statement. You can add or drop a level, hierarchy, or attribute from the dimension using this command.
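
A few hedged examples against the times_dim dimension defined earlier. They assume that no attribute clause refers to the fis_week level, because a level that is still referenced by a hierarchy or attribute cannot be dropped.

```sql
-- Drop a hierarchy, then a level that only that hierarchy used.
ALTER DIMENSION times_dim DROP HIERARCHY fis_rollup;
ALTER DIMENSION times_dim DROP LEVEL fis_week;

-- Add the level and the hierarchy back.
ALTER DIMENSION times_dim ADD LEVEL fis_week IS times.week_ending_day;
ALTER DIMENSION times_dim ADD HIERARCHY fis_rollup (
  day CHILD OF
  fis_week CHILD OF
  fis_month CHILD OF
  fis_quarter CHILD OF
  fis_year);

-- Recompile the dimension, for example after an underlying table changes.
ALTER DIMENSION times_dim COMPILE;
```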


... see the hierarchy and a dimension wizard is provided to facilitate easy definition of the dimension object.

The Dimension Wizard is automatically invoked whenever a request is made to create a dimension object in Oracle Enterprise Manager. You are then guided step by step through the information required for a dimension.

A dimension created using the Wizard can contain any of the attributes described in "Creating Dimensions", such as join keys, multiple hierarchies, and attributes. You might prefer to use the Wizard because it graphically displays the hierarchical relationships as they are being constructed. When it is time to describe the hierarchy, the Wizard automatically displays a default hierarchy based on the column values, which you can subsequently amend.

See Also:

Oracle Enterprise Manager Administrator's Guide

Managing the Dimension Object

The dimension object is located within the Warehouse section for a database. Selecting a specific dimension results in 5 sheets of information becoming available. The General Property sheet shown in Figure 9-3 displays the dimension definition in a graphical form.

Figure 9-3 Dimension General Property Sheet


The levels in the dimension can be shown on the General Property sheet. Alternatively, by selecting the Levels property sheet, you can display or delete levels, or define new ones for this dimension, as illustrated in Figure 9-4.

Figure 9-4 Dimension Levels Property Sheet

When you select a level name from the list on the left of the property sheet, the columns used for this level are displayed in the Selected Columns window in the lower half of the property sheet.

Levels can be added or removed by pressing the New or Delete buttons but they cannot be modified.

A similar property sheet to that for Levels is provided for the attributes in the dimension and is selected by clicking on the Attributes tab.

One of the main advantages of using Oracle Enterprise Manager to define the dimension is that the hierarchies can be easily displayed. Figure 9-5 illustrates the Hierarchy property sheet.

Figure 9-5 Dimension Hierarchy Property Sheet

You can add or remove hierarchies by pressing the New or Delete buttons but they cannot be modified.

Creating a Dimension

An alternative to writing the CREATE DIMENSION statement is to invoke the Dimension wizard, which guides you through 6 steps to create a dimension.

Step 1

First, you must define which type of dimension object is to be defined. If a time dimension is required, selecting the time dimension type ensures that your dimension is recognized as a time dimension that has specific types of hierarchies and attributes.

Step 2

Specify the name of your dimension and the schema in which it should reside by selecting from the drop-down list of schemas.

Step 3

The levels in the dimension are defined in Step 3 as shown in Figure 9-6.

Figure 9-6 Dimension Wizard: Define Levels

First, give the level a name and then select the table that contains the columns defining this level. Next, select one or more columns from the available list and use the > key to move them into the Selected Columns area. Your level then appears in the list on the left side of the property sheet.

To define another level, click the New button, or, if all the levels have been defined, click the Next button to proceed to the next step. If a mistake is made when defining a level, simply click the Delete button to remove it and start again.

Step 4

The levels in the dimension can also have attributes. Give the attribute a name, then select the level on which this attribute is to be defined and use the > button to move it into the Selected Levels column. Now choose the column from the drop-down list for this attribute.

Levels can be added or removed by pressing the New or Delete buttons but they cannot be modified.

Step 5

A hierarchy is defined as illustrated in Figure 9-7.

Figure 9-7 Dimension Wizard: Define Hierarchies

First, give the hierarchy a name, then select the levels to be used in this hierarchy and move them to the Selected Levels column using the > button.

The level name at the top of the list defines the top of the hierarchy. Use the up and down buttons to move the levels into the required order. Note that each level will indent so you can see the relationships between the levels.

Step 6

Finally, the Summary screen is displayed as shown in Figure 9-8, where a graphical representation of the dimension is shown on the left side of the property sheet and the CREATE DIMENSION statement is shown on the right side. Clicking on the Finish button will create the dimension.

Figure 9-8 Dimension Wizard: Summary Screen

Part IV Managing the Warehouse Environment

This section deals with the tasks for managing a data warehouse.

It contains the following chapters:

• Overview of Extraction, Transformation, and Loading

• Extraction in Data Warehouses

• Transportation in Data Warehouses

• Loading and Transformation

• Maintaining the Data Warehouse

• Change Data Capture

• Summary Advisor

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.

10 Overview of Extraction, Transformation, and Loading

This chapter discusses the process of extracting, transporting, transforming, and loading data in a data warehousing environment:

• Overview of ETL

• ETL Tools

Overview of ETL

You need to load your data warehouse regularly so that it can serve its purpose of facilitating business analysis. To do this, data from one or more operational systems needs to be extracted and copied into the warehouse. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for extraction, transformation, and loading. The acronym ETL is perhaps too simplistic, because it omits the transportation phase and implies that each of the other phases of the process is distinct. We refer to the entire process, including data loading, as ETL. You should understand that ETL refers to a broad process, and not three well-defined steps.

The methodology and tasks of ETL have been well known for many years, and are not necessarily unique to data warehouse environments: a wide variety of proprietary applications and database systems are the IT backbone of any enterprise. Data has to be shared between applications or systems in order to integrate them, so that at least two applications have the same picture of the world. This data sharing was mostly addressed by mechanisms similar to what we now call ETL.

Data warehouse environments face the same challenge with the additional burden that they not only have to exchange but also to integrate, rearrange, and consolidate data over many systems, thereby providing a new unified information base for business intelligence. Additionally, the data volume in data warehouse environments tends to be very large.

What happens during the ETL process? During extraction, the desired data is identified and extracted from many different sources, including database systems and applications. Very often, it is not possible to identify the specific subset of interest, therefore more data than necessary has to be extracted, so the identification of the relevant data will be done at a later point in time. Depending on the source system's capabilities (for example, operating system resources), some transformations may take place during this extraction process. The size of the extracted data varies from hundreds of kilobytes up to gigabytes, depending on the source system and the business situation. The same is true for the time delta between two (logically) identical extractions: the time span may vary between days/hours and minutes to near real-time. Web server log files, for example, can easily become hundreds of megabytes in a very short period of time.

After the data is extracted, it has to be physically transported to the target system or an intermediate system for further processing. Depending on the chosen way of transportation, some transformations can be done during this process, too. For example, a SQL statement which directly accesses a remote target through a gateway can concatenate two columns as part of the SELECT statement.
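As a minimal sketch of a transformation performed during transportation, the following statement concatenates two columns while selecting over a database link. The link name source_db and all table and column names are hypothetical and do not come from this guide.

```sql
-- Hypothetical names: source_db (database link), customers,
-- cust_first_name, cust_last_name, customer_names_stage
CREATE TABLE customer_names_stage AS
SELECT cust_id,
       cust_first_name || ' ' || cust_last_name AS cust_full_name  -- transformed in flight
FROM   customers@source_db;
```

The concatenation is evaluated as part of the query, so no separate transformation pass over the transported data is needed for this column.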

The emphasis in many of the examples in this section is scalability. Many long-time users of Oracle are experts in programming complex data transformation logic using PL/SQL. These chapters suggest alternatives for many such data manipulation operations, with a particular emphasis on implementations that take advantage of Oracle's new SQL functionality, especially for ETL and the parallel query infrastructure.

ETL Tools

Designing and maintaining the ETL process is often considered one of the most difficult and resource-intensive portions of a data warehouse project. Many data warehousing projects use ETL tools to manage this process. Oracle Warehouse Builder (OWB), for example, provides ETL capabilities and takes advantage of inherent database abilities. Other data warehouse builders create their own ETL tools and processes, either inside or outside the database.

Besides the support of extraction, transformation, and loading, there are some other tasks that are important for a successful ETL implementation as part of the daily operations of the data warehouse and its support for further enhancements. Besides the support for designing a data warehouse and the data flow, these tasks are typically addressed by ETL tools such as OWB.

Oracle9i is not an ETL tool and does not provide a complete solution for ETL. However, Oracle9i does provide a rich set of capabilities that can be used by both ETL tools and customized ETL solutions. Oracle9i offers techniques for transporting data between Oracle databases, for transforming large volumes of data, and for quickly loading new data into a data warehouse.

Daily Operations

The successive loads and transformations must be scheduled and processed in a specific order. Depending on the success or failure of the operation or parts of it, the result must be tracked and subsequent, alternative processes might be started. The control of the progress as well as the definition of a business workflow of the operations are typically addressed by ETL tools such as OWB.

Evolution of the Data Warehouse

As the data warehouse is a living IT system, sources and targets might change. Those changes must be maintained and tracked through the lifespan of the system without overwriting or deleting the old ETL process flow information. To build and keep a level of trust about the information in the warehouse, the process flow of each individual record in the warehouse should ideally be reconstructible at any point in the future.

Incremental Extraction

At a specific point in time, only the data that has changed since a well-defined event back in history will be extracted. This event may be the last time of extraction or a more complex business event like the last booking day of a fiscal period. To identify this delta change there must be a possibility to identify all the changed information since this specific time event. This information can be provided either by the source data itself, such as an application column reflecting the last-changed timestamp, or by a change table where an appropriate additional mechanism keeps track of the changes besides the originating transactions. In most cases, using the latter method means adding extraction logic to the source system.

Many data warehouses do not use any change-capture techniques as part of the extraction process. Instead, entire tables from the source systems are extracted to the data warehouse or staging area, and these tables are compared with a previous extract from the source system to identify the changed data. This approach may not have significant impact on the source systems, but it clearly can place a considerable burden on the data warehouse processes, particularly if the data volumes are large.
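One way to compare a full extract with the previous one is a pair of set-difference queries. This is only a sketch: the two staging tables are hypothetical, are assumed to have identical column lists, and are assumed to contain no LOB or LONG columns, which MINUS does not support.

```sql
-- Hypothetical staging tables:
--   orders_extract_curr : the current full extract
--   orders_extract_prev : the previous full extract

-- Rows that are new or changed since the previous extract
SELECT * FROM orders_extract_curr
MINUS
SELECT * FROM orders_extract_prev;

-- Rows that were deleted, plus the old versions of changed rows
SELECT * FROM orders_extract_prev
MINUS
SELECT * FROM orders_extract_curr;
```

Each query has to process both extracts in full, which illustrates why this approach shifts the cost from the source system to the warehouse.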

Oracle's Change Data Capture mechanism can extract and maintain such delta information.

See Also:

Chapter 15, "Change Data Capture" for further details about the Change Data Capture framework

Physical Extraction Methods

Depending on the chosen logical extraction method and the capabilities and restrictions on the source side, the extracted data can be physically extracted by two mechanisms. The data can either be extracted online from the source system or from an offline structure. Such an offline structure might already exist or it might be generated by an extraction routine.

There are the following methods of physical extraction:

• Online Extraction

• Offline Extraction

Online Extraction

The data is extracted directly from the source system itself. The extraction process can connect directly to the source system to access the source tables themselves or to an intermediate system that stores the data in a preconfigured manner (for example, snapshot logs or change tables). Note that the intermediate system is not necessarily physically different from the source system.

With online extractions, you need to consider whether the distributed transactions are using original source objects or prepared source objects.

Offline Extraction

The data is not extracted directly from the source system but is staged explicitly outside the original source system. The data already has an existing structure (for example, redo logs, archive logs or transportable tablespaces) or was created by an extraction routine.

You should consider the following structures:

• Flat files

Data in a defined, generic format. Additional information about the source object is necessary for further processing.

• Dump files

Oracle-specific format. Information about the containing objects is included.

• Redo and archive logs

Information is in a special, additional dump file.

• Transportable tablespaces

A powerful way to extract and move large volumes of data between Oracle databases. A more detailed example of using this feature to extract and transport data is provided in Chapter 12, "Transportation in Data Warehouses". Oracle Corporation recommends that you use transportable tablespaces whenever possible, because they can provide considerable advantages in performance and manageability over other extraction techniques.

See Also:

Oracle9i Database Utilities for more information on using dump and flat files and Oracle9i Supplied PL/SQL Packages and Types Reference for details regarding LogMiner

Change Data Capture

An important consideration for extraction is incremental extraction, also called Change Data Capture. If a data warehouse extracts data from an operational system on a nightly basis, then the data warehouse requires only the data that has changed since the last extraction (that is, the data that has been modified in the past 24 hours).

When it is possible to efficiently identify and extract only the most recently changed data, the extraction process (as well as all downstream operations in the ETL process) can be much more efficient, because it must extract a much smaller volume of data. Unfortunately, for many source systems, identifying the recently modified data may be difficult or intrusive to the operation of the system. Change Data Capture is typically the most challenging technical issue in data extraction.

Because change data capture is often desirable as part of the extraction process and it might not be possible to use Oracle's Change Data Capture mechanism, this section describes several techniques for implementing a self-developed change capture on Oracle source systems:

• Timestamps

• Partitioning

• Triggers

These techniques are based upon the characteristics of the source systems, or may require modifications to the source systems. Thus, each of these techniques must be carefully evaluated by the owners of the source system prior to implementation.

Each of these techniques can work in conjunction with the data extraction technique discussed previously. For example, timestamps can be used whether the data is being unloaded to a file or accessed through a distributed query.

See Also:

Chapter 15, "Change Data Capture" for further details

Timestamps

The tables in some operational systems have timestamp columns. The timestamp specifies the time and date that a given row was last modified. If the tables in an operational system have columns containing timestamps, then the latest data can easily be identified using the timestamp columns. For example, the following query might be useful for extracting today's data from an orders table:

SELECT * FROM orders
WHERE TRUNC(CAST(order_date AS date), 'dd') = TRUNC(SYSDATE, 'dd');
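When extraction runs do not line up with calendar days, a common variation is to compare the timestamp with the time of the previous extraction. The sketch below assumes a last_modified column on orders and a bookkeeping table etl_extract_log; both are hypothetical and not part of this guide's schema.

```sql
-- Hypothetical objects:
--   orders.last_modified : DATE column holding the last-changed timestamp
--   etl_extract_log      : one row per completed extraction (table_name, extract_time)
SELECT o.*
FROM   orders o
WHERE  o.last_modified > (SELECT MAX(l.extract_time)
                          FROM   etl_extract_log l
                          WHERE  l.table_name = 'ORDERS');
```

If the log contains no row for ORDERS, the subquery returns NULL and the query returns no rows, so the log would need to be seeded before the first incremental run.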

If the timestamp information is not available in an operational source system, you will not always be able to modify the system to include timestamps. Such modification would require, first, modifying the operational system's tables to include a new timestamp column and then creating a trigger to update the timestamp column following every operation that modifies a given row.

See Also: 

"Triggers"

Partitioning

Some source systems might use Oracle range partitioning, such that the source tables are partitioned along a date key, which allows for easy identification of new data. For example, if you are extracting from an orders table, and the orders table is partitioned by week, then it is easy to identify the current week's data.
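As a sketch of how a weekly partition might be read directly, the queries below first list the partitions of the orders table and then select from one of them. The partition name orders_2002_w23 is hypothetical; actual names depend on how the source table was created.

```sql
-- List the partitions and their upper bounds to find the current week
SELECT partition_name, high_value
FROM   user_tab_partitions
WHERE  table_name = 'ORDERS'
ORDER  BY partition_position;

-- Extract only the rows stored in one (hypothetically named) weekly partition
SELECT * FROM orders PARTITION (orders_2002_w23);
```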

Triggers

Triggers can be created in operational systems to keep track of recently updated records. They can then be used in conjunction with timestamp columns to identify the exact time and date when a given row was last modified. You do this by creating a trigger on each source table that requires change data capture. Following each DML statement that is executed on the source table, this trigger updates the timestamp column with the current time. Thus, the timestamp column provides the exact time and date when a given row was last modified.
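The following is a minimal sketch of such a trigger, assuming a source table named orders. The column name last_modified and the trigger name are hypothetical. A row-level BEFORE trigger is used here so that the timestamp is set as part of the same insert or update; deletes are not captured by this approach.

```sql
-- Hypothetical column added to the source table
ALTER TABLE orders ADD (last_modified DATE);

-- Stamp every inserted or updated row with the current time
CREATE OR REPLACE TRIGGER orders_set_last_modified
BEFORE INSERT OR UPDATE ON orders
FOR EACH ROW
BEGIN
  :NEW.last_modified := SYSDATE;
END;
/
```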

A similar internalized trigger-based technique is used for Oracle materialized view logs. These logs are used by materialized views to identify changed data, and these logs are accessible to end users. A materialized view log can be created on each source table requiring change data capture. Then, whenever any modifications are made to the source table, a record is inserted into the materialized view log indicating which rows were modified. If you want to use a trigger-based mechanism, use change data capture.

Materialized view logs rely on triggers, but they provide an advantage in that the creation and maintenance of this change-data system is largely managed by Oracle.
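For illustration, a materialized view log on a source table might be created and inspected as sketched below. The example assumes that the orders table has a primary key.

```sql
-- Record the primary key of every changed row
CREATE MATERIALIZED VIEW LOG ON orders WITH PRIMARY KEY;

-- Oracle stores the log in a table named MLOG$_<table name>
SELECT * FROM mlog$_orders;
```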

However, Oracle recommends the usage of synchronous Change Data Capture for trigger-based change capture, since CDC provides an externalized interface for accessing the change information and provides a framework for maintaining the distribution of this information to various clients.

Trigger-based techniques affect performance on the source systems, and this impact should be carefully considered prior to implementation on a production source system.

Data Warehousing Extraction Examples

You can extract data in two ways:

• Extraction Using Data Files

• Extraction Via Distributed Operations

Extraction Using Data Files

Most database systems provide mechanisms for exporting or unloading data from the internal database format into flat files. Extracts from mainframe systems often use COBOL programs, but many databases, as well as third-party software vendors, provide export or unload utilities.

Data extraction does not necessarily mean that entire database structures are unloaded in flat files. In many cases, it may be appropriate to unload entire database tables or objects. In other cases, it may be more appropriate to unload only a subset of a given table such as the changes on the source system since the last extraction or the results of joining multiple tables together. Different extraction techniques vary in their capabilities to support these two scenarios.

When the source system is an Oracle database, several alternatives are available for extracting data into files:

12 Transportation in Data Warehouses

The following topics provide information about transporting data into a data warehouse:

• Overview of Transportation in Data Warehouses

• Introduction to Transportation Mechanisms in Data Warehouses

Overview of Transportation in Data Warehouses

Transportation is the operation of moving data from one system to another system. In a data warehouse environment, the most common requirements for transportation are in moving data from:

• A source system to a staging database or a data warehouse database

• A staging database to a data warehouse

• A data warehouse to a data mart

Transportation is often one of the simpler portions of the ETL process, and can be integrated with other portions of the process. For example, as shown in Chapter 11, "Extraction in Data Warehouses", distributed query technology provides a mechanism for both extracting and transporting data.

Introduction to Transportation Mechanisms in Data Warehouses

You have three basic choices for transporting data in warehouses:

• Transportation Using Flat Files

• Transportation Through Distributed Operations

• Transportation Using Transportable Tablespaces

Transportation Using Flat Files

The most common method for transporting data is by the transfer of flat files, using mechanisms such as FTP or other remote file system access protocols. Data is unloaded or exported from the source system into flat files using techniques discussed in Chapter 11, "Extraction in Data Warehouses", and is then transported to the target platform using FTP or similar mechanisms.

Because source systems and data warehouses often use different operating systems and database systems, using flat files is often the simplest way to exchange data between heterogeneous systems with minimal transformations. However, even when transporting data between homogeneous systems, flat files are often the most efficient and most easy-to-manage mechanism for data transfer.

Transportation Through Distributed Operations

Distributed queries, either with or without gateways, can be an effective mechanism for extracting data. These mechanisms also transport the data directly to the target systems, thus providing both extraction and transportation in a single step. Depending on the tolerable impact on time and system resources, these mechanisms can be well suited for both extraction and transportation.

As opposed to flat file transportation, the success or failure of the transportation is recognized immediately with the result of the distributed query or transaction.
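A distributed operation of this kind could look like the following sketch, which is run on the target system and pulls one month of rows over a database link. The names source_db, sales_stage, and sale_date are hypothetical.

```sql
-- Hypothetical names: source_db (database link), sales_stage, sale_date
INSERT INTO sales_stage
SELECT *
FROM   sales@source_db
WHERE  TRUNC(sale_date, 'mm') = TO_DATE('01-jan-2000', 'dd-mon-yyyy');

COMMIT;
```

If the remote query fails, the INSERT statement itself fails, so the outcome of the transportation is known as soon as the statement returns.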

See Also: 

Chapter 11, "Extraction in Data Warehouses" for further details

Transportation Using Transportable Tablespaces

Oracle8i introduced an important mechanism for transporting data: transportable tablespaces. This feature is the fastest way for moving large volumes of data between two Oracle databases.

Previous to Oracle8i, the most scalable data transportation mechanisms relied on moving flat files containing raw data. These mechanisms required that data be unloaded or exported into files from the source database. Then, after transportation, these files were loaded or imported into the target database. Transportable tablespaces entirely bypass the unload and reload steps.

Using transportable tablespaces, Oracle data files (containing table data, indexes, and almost every other Oracle database object) can be directly transported from one database to another. Furthermore, like import and export, transportable tablespaces provide a mechanism for transporting metadata in addition to transporting data.

Transportable tablespaces have some notable limitations: source and target systems must be running Oracle8i (or higher), must be running the same operating system, must use the same character set, and, prior to Oracle9i, must use the same block size. Despite these limitations, transportable tablespaces can be an invaluable data transportation technique in many warehouse environments.

The most common applications of transportable tablespaces in data warehouses are in moving data from a staging database to a data warehouse, or in moving data from a data warehouse to a data mart.

See Also:

Oracle9i Database Concepts for more information on transportable tablespaces

Transportable Tablespaces Example

Suppose that you have a data warehouse containing sales data, and several data marts that are refreshed monthly. Also suppose that you are going to move one month of sales data from the data warehouse to the data mart.

Step 1: Place the Data to be Transported into its own Tablespace

In this step, we have copied the January sales data into a separate tablespace; however, in some cases, it may be possible to leverage the transportable tablespace feature without even moving data to a separate tablespace. If the sales table has been partitioned by month in the data warehouse and if each partition is in its own tablespace, then it may be possible to directly transport the tablespace containing the January data. Suppose the January partition, sales_jan2000, is located in the tablespace ts_sales_jan2000. Then the tablespace ts_sales_jan2000 could potentially be transported, rather than creating a temporary copy of the January sales data in the ts_temp_sales.

However, the same conditions must be satisfied in order to transport the tablespace ts_sales_jan2000 as are required for the specially created tablespace. First, this tablespace must be set to READ ONLY. Second, because a single partition of a partitioned table cannot be transported without the remainder of the partitioned table also being transported, it is necessary to exchange the January partition into a separate table (using the ALTER TABLE statement) to transport the January data. The EXCHANGE operation is very quick, but the January data will no longer be a part of the underlying sales table, and thus may be unavailable to users until this data is exchanged back into the sales table after the export of the metadata. The January data can be exchanged back into the sales table after you complete step 3.
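The exchange described above might be sketched as follows. The table name sales_jan_xchg is hypothetical, index handling is omitted, and the self-containment check assumes the DBMS_TTS package is available to the user running it.

```sql
-- Hypothetical helper table with the same columns as sales, initially empty
CREATE TABLE sales_jan_xchg AS SELECT * FROM sales WHERE 1 = 0;

-- Swap segments: sales_jan_xchg now owns the January data in ts_sales_jan2000
ALTER TABLE sales EXCHANGE PARTITION sales_jan2000 WITH TABLE sales_jan_xchg;

-- Confirm that the tablespace is self-contained, then freeze it
EXECUTE DBMS_TTS.TRANSPORT_SET_CHECK('ts_sales_jan2000', TRUE);
SELECT * FROM transport_set_violations;
ALTER TABLESPACE ts_sales_jan2000 READ ONLY;
```

Running the same EXCHANGE PARTITION statement again after the datafiles have been copied returns the January data to the sales table.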

Step 2: Export the Metadata

The Export utility is used to export the metadata describing the objects contained in the transported tablespace. For our example scenario, the Export command could be:

EXP TRANSPORT_TABLESPACE=y TABLESPACES=ts_temp_sales FILE=jan_sales.dmp

This operation will generate an export file, jan_sales.dmp. The export file will be small, because it contains only metadata. In this case, the export file will contain information describing the table temp_sales_jan, such as the column names, column datatype, and all other information that the target Oracle database will need in order to access the objects in ts_temp_sales.

Step 3: Copy the Datafiles and Export File to the Target System

Copy the data files that make up ts_temp_sales, as well as the export file jan_sales.dmp, to the data mart platform, using any transportation mechanism for flat files.

Once the datafiles have been copied, the tablespace ts_temp_sales can be set to READ WRITE mode if desired.

Step 4: Import the Metadata

Once the files have been copied to the data mart, the metadata should be imported into the data mart:

IMP TRANSPORT_TABLESPACE=y DATAFILES='/db/tempjan.f' TABLESPACES=ts_temp_sales FILE=jan_sales.dmp

At this point, the tablespace ts_temp_sales and the table temp_sales_jan are accessible in the data mart. You can incorporate this new data into the data mart's tables.

You can insert the data from the temp_sales_jan table into the data mart's sales table in one of two ways:

INSERT /*+ APPEND */ INTO sales SELECT * FROM temp_sales_jan;

Following this operation, you can delete the temp_sales_jan table (and even the entire ts_temp_sales tablespace).

Alternatively, if the data mart's sales table is partitioned by month, then the new transported tablespace and the temp_sales_jan table can become a permanent part of the data mart. The temp_sales_jan table can become a partition of the data mart's sales table:

ALTER TABLE sales ADD PARTITION sales_00jan VALUES
  LESS THAN (TO_DATE('01-feb-2000','dd-mon-yyyy'));
ALTER TABLE sales EXCHANGE PARTITION sales_00jan
  WITH TABLE temp_sales_jan
  INCLUDING INDEXES WITH VALIDATION;

Other Uses of Transportable Tablespaces

The previous example illustrates a typical scenario for transporting data in a data warehouse. However, transportable tablespaces can be used for many other purposes. In a data warehousing environment, transportable tablespaces should be viewed as a utility (much like Import/Export or SQL*Loader), whose purpose is to move large volumes of data between Oracle databases. When used in conjunction with parallel data movement operations such as the CREATE TABLE ... AS SELECT and INSERT ... SELECT statements, transportable tablespaces provide an important mechanism for quickly transporting data for many purposes.

13 Loading and Transformation

This chapter helps you create and manage a data warehouse, and discusses:

• Overview of Loading and Transformation in Data Warehouses
• Loading Mechanisms
• Transformation Mechanisms
• Loading and Transformation Scenarios

Overview of Loading and Transformation in Data Warehouses

Data transformations are often the most complex and, in terms of processing time, the most costly part of the ETL process. They can range from simple data conversions to extremely complex data scrubbing techniques. Many, if not all, data transformations can occur within an Oracle9i database, although transformations are often implemented outside of the database (for example, on flat files) as well.

This chapter introduces techniques for implementing scalable and efficient data transformations within Oracle9i. The examples in this chapter are relatively simple. Real-world data transformations are often considerably more complex. However, the transformation techniques introduced in this chapter meet the majority of real-world data transformation requirements, often with more scalability and less programming than alternative approaches.

This chapter does not seek to illustrate all of the typical transformations that would be encountered in a data warehouse, but to demonstrate the types of fundamental technology that can be applied to implement these transformations and to provide guidance in how to choose the best techniques.

Transformation Flow

From an architectural perspective, you can transform your data in two ways:

• Multistage Data Transformation
• Pipelined Data Transformation

Multistage Data Transformation

The data transformation logic for most data warehouses consists of multiple steps. For example, in transforming new records to be inserted into a sales table, there may be separate logical transformation steps to validate each dimension key.

Figure 13-1 offers a graphical way of looking at the transformation logic.

Figure 13-1 Multistage Data Transformation

When using Oracle9i as a transformation engine, a common strategy is to implement each different transformation as a separate SQL operation and to create a separate, temporary staging table (such as the tables new_sales_step1 and new_sales_step2 in Figure 13-1) to store the incremental results for each step. This load-then-transform strategy also provides a natural checkpointing scheme to the entire transformation process, which enables the process to be more easily monitored and restarted. However, a disadvantage to multistaging is that the space and time requirements increase.

It may also be possible to combine many simple logical transformations into a single SQL statement or single PL/SQL procedure. Doing so may provide better performance than performing each step independently, but it may also introduce difficulties in modifying, adding, or dropping individual transformations, as well as recovering from failed transformations.
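The multistage strategy can be sketched as a chain of CREATE TABLE ... AS SELECT statements, one per validation step. This is a minimal illustration rather than an example taken from this guide: the source table new_sales, the assumption that it has the same columns as sales, and the two validation rules are all hypothetical. Only the staging table names new_sales_step1 and new_sales_step2 come from Figure 13-1.

```sql
-- Step 1: keep only rows whose product key exists in the products dimension.
-- new_sales is a hypothetical source table with the same columns as sales.
CREATE TABLE new_sales_step1 NOLOGGING AS
SELECT ns.*
FROM new_sales ns
WHERE EXISTS (SELECT 1 FROM products p WHERE p.prod_id = ns.prod_id);

-- Step 2: validate the customer key against the customers dimension.
CREATE TABLE new_sales_step2 NOLOGGING AS
SELECT s1.*
FROM new_sales_step1 s1
WHERE EXISTS (SELECT 1 FROM customers c WHERE c.cust_id = s1.cust_id);

-- Final step: load the validated rows into the fact table.
INSERT /*+ APPEND */ INTO sales
SELECT * FROM new_sales_step2;
COMMIT;

-- The staging tables are temporary and can be dropped after a successful load.
DROP TABLE new_sales_step1;
DROP TABLE new_sales_step2;
```

Because each step materializes its result, a failed step can be rerun from the preceding staging table, at the cost of the extra space and time noted above.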


Pipelined Data Transformation

With the introduction of Oracle9i, Oracle's database capabilities have been significantly enhanced to address specifically some of the tasks in ETL environments. The ETL process flow can be changed dramatically and the database becomes an integral part of the ETL solution.

The new functionality renders some of the formerly necessary process steps obsolete, whilst others can be remodeled so that the data flow and the data transformation become more scalable and non-interruptive. The task shifts from a serial transform-then-load process (with most of the tasks done outside the database) or a load-then-transform process to an enhanced transform-while-loading process.

Oracle9i offers a wide variety of new capabilities to address all the issues and tasks relevant in an ETL scenario. It is important to understand that the database offers toolkit functionality rather than trying to address a one-size-fits-all solution. The underlying database has to enable the most appropriate ETL process flow for a specific customer need, and not dictate or constrain it from a technical perspective. Figure 13-2 illustrates the new functionality, which is discussed throughout later sections.

Figure 13-2 Pipelined Data Transformation

Loading Mechanisms

You can use the following mechanisms for loading a warehouse:

• SQL*Loader
• External Tables
• OCI and Direct-Path APIs
• Export/Import


SQL*Loader

Before any data transformations can occur within the database, the raw data must become accessible for the database. One approach is to load it into the database. Chapter 12, "Transportation in Data Warehouses", discusses several techniques for transporting data to an Oracle data warehouse. Perhaps the most common technique for transporting data is by way of flat files.

SQL*Loader is used to move data from flat files into an Oracle data warehouse. During this data load, SQL*Loader can also be used to implement basic data transformations. When using direct-path SQL*Loader, basic data manipulation, such as datatype conversion and simple NULL handling, can be automatically resolved during the data load. Most data warehouses use direct-path loading for performance reasons.

Oracle's conventional-path loader provides broader capabilities for data transformation than a direct-path loader: SQL functions can be applied to any column as those values are being loaded. This provides a rich capability for transformations during the data load. However, the conventional-path loader is slower than the direct-path loader. For these reasons, the conventional-path loader should be considered primarily for loading and transforming smaller amounts of data.

See Also:

Oracle9i Database Utilities for more information on SQL*Loader

The following is a simple example of a SQL*Loader controlfile to load data into the sales table of the sh sample schema from an external file sh_sales.dat. The external flat file sh_sales.dat consists of sales transaction data, aggregated on a daily level. Not all columns of this external file are loaded into sales. This external file will also be used as source for loading the second fact table of the sh sample schema, which is done using an external table.

The following shows the controlfile (sh_sales.ctl) to load the sales table:

LOAD DATA
INFILE sh_sales.dat
APPEND INTO TABLE sales
FIELDS TERMINATED BY "|"
( PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
  QUANTITY_SOLD, AMOUNT_SOLD)

It can be loaded with the following command:

$ sqlldr sh/sh control=sh_sales.ctl direct=true

External Tables


the loading and transformation within a single SQL DML statement, as shown in the following. You do not have to stage the data temporarily before inserting into the target table.

The Oracle object directories must already exist, and point to the directory containing the sh_sales.dat file as well as the directory containing the bad and log files.
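As a sketch of that prerequisite, the two directory objects referenced by the external table definition could be created as follows. The operating system paths are placeholders, not values from this guide, and the statements assume they are run by a user with the CREATE ANY DIRECTORY privilege who then grants access to the sh schema.

```sql
-- Paths are hypothetical; substitute the directories used on your server.
CREATE OR REPLACE DIRECTORY data_file_dir AS '/path/to/flat_files';
CREATE OR REPLACE DIRECTORY log_file_dir AS '/path/to/log_files';

-- The loading schema reads the data file and writes the bad and log files.
GRANT READ ON DIRECTORY data_file_dir TO sh;
GRANT READ, WRITE ON DIRECTORY log_file_dir TO sh;
```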

CREATE TABLE sales_transactions_ext
(
  PROD_ID NUMBER(6),
  CUST_ID NUMBER,
  TIME_ID DATE,
  CHANNEL_ID CHAR(1),
  PROMO_ID NUMBER(6),
  QUANTITY_SOLD NUMBER(3),
  AMOUNT_SOLD NUMBER(10,2),
  UNIT_COST NUMBER(10,2),
  UNIT_PRICE NUMBER(10,2)
)
ORGANIZATION external
(
  TYPE oracle_loader
  DEFAULT DIRECTORY data_file_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
    BADFILE log_file_dir:'sh_sales.bad_xt'
    LOGFILE log_file_dir:'sh_sales.log_xt'
    FIELDS TERMINATED BY "|" LDRTRIM
  )
  location
  (
    'sh_sales.dat'
  )
)REJECT LIMIT UNLIMITED;

The external table can now be used from within the database, accessing some columns of the external data only, grouping the data, and inserting it into the costs fact table:

INSERT /*+ APPEND */ INTO COSTS
(
  TIME_ID,
  PROD_ID,
  UNIT_COST,
  UNIT_PRICE
)
SELECT
  TIME_ID,
  PROD_ID,
  SUM(UNIT_COST),
  SUM(UNIT_PRICE)
FROM sales_transactions_ext
GROUP BY time_id, prod_id;


OCI and Direct-Path APIs

OCI and direct-path APIs are frequently used when the transformation and computation are done outside the database and there is no need for flat file staging.

Export/Import

Export and import are used when the data is inserted as is into the target system. No large volumes of data should be handled and no complex extractions are possible.

See Also:

Chapter 11, "Extraction in Data Warehouses" for further information

Transformation Mechanisms

You have the following choices for transforming data inside the database:

• Transformation Using SQL
• Transformation Using PL/SQL
• Transformation Using Table Functions

Transformation Using SQL

Once data is loaded into an Oracle9i database, data transformations can be executed using SQL operations. There are four basic techniques for implementing SQL data transformations within Oracle9i:

• CREATE TABLE ... AS SELECT And INSERT /*+APPEND*/ AS SELECT
• Transformation Using UPDATE
• Transformation Using MERGE
• Transformation Using Multitable INSERT

CREATE TABLE ... AS SELECT And INSERT /*+APPEND*/ AS SELECT

The CREATE TABLE ... AS SELECT statement (CTAS) is a powerful tool for manipulating large sets of data. As shown in the following example, many data transformations can be expressed in standard SQL, and CTAS provides a mechanism for efficiently executing a SQL query and storing the results of that query in a new database table. The INSERT /*+APPEND*/ ... AS SELECT statement offers the same capabilities with existing database tables.

In a data warehouse environment, CTAS is typically run in parallel using NOLOGGING mode for best performance.
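A parallel, NOLOGGING CTAS might look like the following sketch. The target table name sales_direct and the degree of parallelism are assumptions chosen for illustration; the column names are those of the sales table used elsewhere in this chapter.

```sql
-- Build a new table from an existing one in parallel, without redo logging.
-- sales_direct is a hypothetical name; the degree of parallelism 4 is arbitrary.
CREATE TABLE sales_direct
  NOLOGGING
  PARALLEL 4
AS
SELECT prod_id, cust_id, time_id, quantity_sold, amount_sold
FROM sales
WHERE channel_id = 'S';
```

Because NOLOGGING suppresses redo generation for the load, the new table cannot be recovered from the redo logs; it is usually backed up, or simply rebuilt, after such an operation.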


A simple and common type of data transformation is data substitution. In a data substitution transformation, some or all of the values of a single column are modified. For example, our sales table has a channel_id column. This column indicates whether a given sales transaction was made by a company's own sales force (a direct sale) or by a distributor (an indirect sale).

You may receive data from multiple source systems for your data warehouse. Suppose that one of those source systems processes only direct sales, and thus the source system does not know indirect sales channels. When the data warehouse initially receives sales data from this system, all sales records have a NULL value for the sales.channel_id field. These NULL values must be set to the proper key value. For example, you can do this efficiently by supplying the key value as part of the INSERT statement that loads the target sales table.

The structure of source table sales_activity_direct is as follows:

SQL> DESC sales_activity_direct
Name         Null? Type
------------ ----- ----------------
SALES_DATE         DATE
PRODUCT_ID         NUMBER
CUSTOMER_ID        NUMBER
PROMOTION_ID       NUMBER
AMOUNT             NUMBER
QUANTITY           NUMBER

INSERT /*+ APPEND NOLOGGING PARALLEL */
INTO sales
SELECT product_id, customer_id, TRUNC(sales_date), 'S',
  promotion_id, quantity, amount
FROM sales_activity_direct;

Transformation Using UPDATE

Another technique for implementing a data substitution is to use an UPDATE statement to modify the sales.channel_id column. An UPDATE will provide the correct result. However, if the data substitution transformations require that a very large percentage of the rows (or all of the rows) be modified, then it may be more efficient to use a CTAS statement than an UPDATE.
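The two alternatives can be compared side by side in a short sketch. It assumes the NULL channel keys were already loaded into sales and that 'S' is the correct key for direct sales, as in the preceding example; the table name sales_fixed is hypothetical.

```sql
-- In-place substitution: set the missing channel key on rows already loaded.
UPDATE sales
SET channel_id = 'S'
WHERE channel_id IS NULL;
COMMIT;

-- CTAS alternative when most or all rows must change.
-- sales_fixed is a hypothetical name for the rebuilt table.
CREATE TABLE sales_fixed NOLOGGING PARALLEL AS
SELECT prod_id, cust_id, time_id,
       NVL(channel_id, 'S') AS channel_id,
       promo_id, quantity_sold, amount_sold
FROM sales;
```

The UPDATE changes rows in place and generates undo and redo for each of them, while the CTAS writes a fresh copy of the table in one pass, which is why it tends to win when nearly every row is affected.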

Transformation Using MERGE

Oracle's merge functionality extends SQL, by introducing the SQL keyword MERGE, in order to provide the ability to update or insert a row conditionally into a table or out of line single table views. Conditions are specified in the ON clause. This is, besides pure bulk loading, one of the most common operations in data warehouse synchronization.

Prior to Oracle9i, merges were expressed either as a sequence of DML statements or as PL/SQL loops operating on each row. Both of these approaches suffer from deficiencies in performance and usability. The new merge functionality overcomes these deficiencies with a new SQL statement. This syntax has been proposed as part of the upcoming SQL standard.

When to Use Merge

There are several benefits of the new MERGE statement as compared with the two other existing approaches.

• The entire operation can be expressed much more simply as a single SQL statement.
• You can parallelize statements transparently.
• You can use bulk DML.
• Performance will improve because your statements will require fewer scans of the source table.

Merge Examples

The following discusses various implementations of a merge. The examples assume that new data for the dimension table products is propagated to the data warehouse and has to be either inserted or updated. The table products_delta has the same structure as products.

Example 1 Merge Operation Using SQL in Oracle9i

MERGE INTO products t
USING products_delta s
ON (t.prod_id=s.prod_id)
WHEN MATCHED THEN
UPDATE SET
  t.prod_list_price=s.prod_list_price,
  t.prod_min_price=s.prod_min_price
WHEN NOT MATCHED THEN
INSERT
  (prod_id, prod_name, prod_desc,
   prod_subcategory, prod_subcat_desc, prod_category,
   prod_cat_desc, prod_status, prod_list_price, prod_min_price)
VALUES
  (s.prod_id, s.prod_name, s.prod_desc,
   s.prod_subcategory, s.prod_subcat_desc,
   s.prod_category, s.prod_cat_desc,
   s.prod_status, s.prod_list_price, s.prod_min_price);

Example 2 Merge Operation Using SQL Prior to Oracle9i

A regular join between source products_delta and target products.

UPDATE products t
SET
  (prod_name, prod_desc, prod_subcategory, prod_subcat_desc, prod_category,
   prod_cat_desc, prod_status, prod_list_price,
   prod_min_price) =
  (SELECT prod_name, prod_desc, prod_subcategory, prod_subcat_desc,
     prod_category, prod_cat_desc, prod_status, prod_list_price,
     prod_min_price from products_delta s WHERE s.prod_id=t.prod_id);

An antijoin between source products_delta and target products.

INSERT INTO products t
SELECT * FROM products_delta s
WHERE s.prod_id NOT IN
  (SELECT prod_id FROM products);

The advantage of this approach is its simplicity and lack of new language extensions. The disadvantage is its performance. It requires an extra scan and a join of both the products_delta and the products tables.

Example 3 Pre-9i Merge Using PL/SQL

CREATE OR REPLACE PROCEDURE merge_proc
IS
  CURSOR cur IS
    SELECT prod_id, prod_name, prod_desc, prod_subcategory, prod_subcat_desc,
           prod_category, prod_cat_desc, prod_status, prod_list_price,
           prod_min_price
    FROM products_delta;
  crec cur%rowtype;
BEGIN
  OPEN cur;
  LOOP
    FETCH cur INTO crec;
    EXIT WHEN cur%notfound;
    UPDATE products SET
      prod_name = crec.prod_name, prod_desc = crec.prod_desc,
      prod_subcategory = crec.prod_subcategory,
      prod_subcat_desc = crec.prod_subcat_desc,
      prod_category = crec.prod_category,
      prod_cat_desc = crec.prod_cat_desc,
      prod_status = crec.prod_status,
      prod_list_price = crec.prod_list_price,
      prod_min_price = crec.prod_min_price
    WHERE crec.prod_id = prod_id;

    IF SQL%notfound THEN
      INSERT INTO products
        (prod_id, prod_name, prod_desc, prod_subcategory,
         prod_subcat_desc, prod_category,
         prod_cat_desc, prod_status, prod_list_price, prod_min_price)
      VALUES
        (crec.prod_id, crec.prod_name, crec.prod_desc,
         crec.prod_subcategory,
         crec.prod_subcat_desc, crec.prod_category,
         crec.prod_cat_desc, crec.prod_status, crec.prod_list_price,
         crec.prod_min_price);
    END IF;
  END LOOP;
  CLOSE cur;
END merge_proc;
/

Transformation Using Multitable INSERT

Many times, external data sources have to be segregated based on logical attributes for insertion into different target objects. It's also frequent in data warehouse environments to fan out the same source data into several target objects. Multitable inserts provide a new SQL statement for these kinds of transformations, where data can either end up in several or exactly one target, depending on the business transformation rules. This insertion can be done conditionally based on business rules or unconditionally.

It offers the benefits of the INSERT ... SELECT statement when multiple tables are involved as targets. In doing so, it avoids the drawbacks of the alternatives available to you using functionality prior to Oracle9i. One alternative was to issue n independent INSERT ... SELECT statements, thus processing the same source data n times and increasing the transformation workload n times. The other was to choose a procedural approach with a per-row determination of how to handle the insertion. This solution lacked direct access to the high-speed access paths available in SQL.

As with the existing INSERT ... SELECT statement, the new statement can be parallelized and used with the direct-load mechanism for faster performance.

Example 13-1 Unconditional Insert

The following statement aggregates the transactional sales information, stored in sales_activity_direct, on a daily basis and inserts into both the sales and the costs fact table for the current day.

INSERT ALL
  INTO sales VALUES (product_id, customer_id, today, 'S', promotion_id,
                     quantity_per_day, amount_per_day)
  INTO costs VALUES (product_id, today, product_cost, product_price)
SELECT TRUNC(s.sales_date) AS today,
  s.product_id, s.customer_id, s.promotion_id,
  SUM(s.amount) AS amount_per_day, SUM(s.quantity) quantity_per_day,
  p.product_cost, p.product_price
FROM sales_activity_direct s, product_information p
WHERE s.product_id = p.product_id
AND trunc(sales_date)=trunc(sysdate)
GROUP BY trunc(sales_date), s.product_id,
  s.customer_id, s.promotion_id, p.product_cost, p.product_price;

Example 13-2 Conditional ALL Insert

The following statement inserts a row into the sales and cost tables for all sales transactions with a valid promotion and stores the information about multiple identical orders of a customer in a separate table cum_sales_activity. It is possible two rows will be inserted for some sales transactions, and none for others.

INSERT ALL
WHEN promotion_id IN (SELECT promo_id FROM promotions) THEN
  INTO sales VALUES (product_id, customer_id, today, 'S', promotion_id,
                     quantity_per_day, amount_per_day)
  INTO costs VALUES (product_id, today, product_cost, product_price)
WHEN num_of_orders > 1 THEN
  INTO cum_sales_activity VALUES (today, product_id, customer_id,
                                  promotion_id, quantity_per_day, amount_per_day,
                                  num_of_orders)
SELECT TRUNC(s.sales_date) AS today, s.product_id, s.customer_id,
  s.promotion_id, SUM(s.amount) AS amount_per_day, SUM(s.quantity)
  quantity_per_day, COUNT(*) num_of_orders,
  p.product_cost, p.product_price
FROM sales_activity_direct s, product_information p
WHERE s.product_id = p.product_id
AND TRUNC(sales_date) = TRUNC(sysdate)
GROUP BY TRUNC(sales_date), s.product_id, s.customer_id,
  s.promotion_id, p.product_cost, p.product_price;

Example 13-3 Conditional FIRST Insert

The following statement inserts into an appropriate shipping manifest according to the total quantity and the weight of a product order. An exception is made for high value orders, which are also sent by express, unless their weight classification is too high. It assumes the existence of appropriate tables large_freight_shipping, express_shipping, and default_shipping.

INSERT FIRST
  WHEN (sum_quantity_sold > 10 AND prod_weight_class < 5) OR
       (sum_quantity_sold > 5 AND prod_weight_class > 5) THEN
    INTO large_freight_shipping VALUES
      (time_id, cust_id, prod_id, prod_weight_class, sum_quantity_sold)
  WHEN sum_amount_sold > 1000 THEN
    INTO express_shipping VALUES
      (time_id, cust_id, prod_id, prod_weight_class,
       sum_amount_sold, sum_quantity_sold)
  ELSE
    INTO default_shipping VALUES
      (time_id, cust_id, prod_id, sum_quantity_sold)
SELECT s.time_id, s.cust_id, s.prod_id, p.prod_weight_class,
  SUM(amount_sold) AS sum_amount_sold,
  SUM(quantity_sold) AS sum_quantity_sold
FROM sales s, products p
WHERE s.prod_id = p.prod_id
AND s.time_id = TRUNC(sysdate)
GROUP BY s.time_id, s.cust_id, s.prod_id, p.prod_weight_class;

Example 13-4 Mixed Conditional and Unconditional Insert

The following example inserts new customers into the customers table and stores all new customers with a cust_credit_limit of 4500 or higher in an additional, separate table for further promotions.

INSERT FIRST
  WHEN cust_credit_limit >= 4500 THEN
    INTO customers
    INTO customers_special VALUES (cust_id, cust_credit_limit)
  ELSE
    INTO customers
SELECT * FROM customers_new;

Transformation Using PL/SQL

In a data warehouse environment, you can use procedural languages such as PL/SQL to implement complex transformations in the Oracle9i database. Whereas CTAS operates on entire tables and emphasizes parallelism, PL/SQL provides a row-based approach and can accommodate very sophisticated transformation rules. For example, a PL/SQL procedure could open multiple cursors and read data from multiple source tables, combine this data using complex business rules, and finally insert the transformed data into one or more target tables. It would be difficult or impossible to express the same sequence of operations using standard SQL statements.

Using a procedural language, a specific transformation (or number of transformation steps) within a complex ETL processing can be encapsulated, reading data from an intermediate staging area and generating a new table object as output. A previous transformation generates the input table, and a subsequent transformation consumes the table generated by this specific transformation. Alternatively, these encapsulated transformation steps within the complete ETL process can be integrated seamlessly, thus streaming sets of rows between each other without the necessity of intermediate staging. You can use Oracle9i's table functions to implement such behavior.

Transformation Using Table Functions

Oracle9i's table functions provide the support for pipelined and parallel execution of transformations implemented in PL/SQL, C, or Java. Scenarios as mentioned earlier can be done without requiring the use of intermediate staging tables, which interrupt the data flow through various transformation steps.


  prod_subcat_desc VARCHAR2(2000),
  prod_category VARCHAR2(50),
  prod_cat_desc VARCHAR2(2000),
  prod_weight_class NUMBER(2),
  prod_unit_of_measure VARCHAR2(20),
  prod_pack_size VARCHAR2(30),
  supplier_id NUMBER(6),
  prod_status VARCHAR2(20),
  prod_list_price NUMBER(8,2),
  prod_min_price NUMBER(8,2)
);
/
CREATE TYPE product_t_table AS TABLE OF product_t;
/
COMMIT;

REM package of all cursor types
REM we have to handle the input cursor type and the output cursor collection
REM type
CREATE OR REPLACE PACKAGE cursor_PKG as
  TYPE product_t_rec IS RECORD (
    prod_id NUMBER(6),
    prod_name VARCHAR2(50),
    prod_desc VARCHAR2(4000),
    prod_subcategory VARCHAR2(50),
    prod_subcat_desc VARCHAR2(2000),
    prod_category VARCHAR2(50),
    prod_cat_desc VARCHAR2(2000),
    prod_weight_class NUMBER(2),
    prod_unit_of_measure VARCHAR2(20),
    prod_pack_size VARCHAR2(30),
    supplier_id NUMBER(6),
    prod_status VARCHAR2(20),
    prod_list_price NUMBER(8,2),
    prod_min_price NUMBER(8,2));
  TYPE product_t_rectab IS TABLE OF product_t_rec;
  TYPE strong_refcur_t IS REF CURSOR RETURN product_t_rec;
  TYPE refcur_t IS REF CURSOR;
END;
/

REM artificial help table, used to demonstrate figure 13-4
CREATE TABLE obsolete_products_errors (prod_id NUMBER, msg VARCHAR2(2000));

The following example demonstrates a simple filtering; it shows all obsolete products except the prod_category Boys. The table function returns the result set as a set of records and uses a weakly typed ref cursor as input.

CREATE OR REPLACE FUNCTION obsolete_products(cur cursor_pkg.refcur_t)
RETURN product_t_table
IS
  prod_id NUMBER(6);
  prod_name VARCHAR2(50);
  prod_desc VARCHAR2(4000);
  prod_subcategory VARCHAR2(50);
  prod_subcat_desc VARCHAR2(2000);
  prod_category VARCHAR2(50);
  prod_cat_desc VARCHAR2(2000);
  prod_weight_class NUMBER(2);
  prod_unit_of_measure VARCHAR2(20);
  prod_pack_size VARCHAR2(30);
  supplier_id NUMBER(6);
  prod_status VARCHAR2(20);
  prod_list_price NUMBER(8,2);
  prod_min_price NUMBER(8,2);
  sales NUMBER:=0;
  objset product_t_table := product_t_table();
  i NUMBER := 0;
BEGIN
  LOOP
    -- Fetch from cursor variable
    FETCH cur INTO prod_id, prod_name, prod_desc, prod_subcategory,
      prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
      prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
      prod_list_price, prod_min_price;
    EXIT WHEN cur%NOTFOUND; -- exit when last row is fetched
    IF prod_status='obsolete' AND prod_category != 'Boys' THEN
      -- append to collection
      i:=i+1;
      objset.extend;
      objset(i):=product_t(prod_id, prod_name, prod_desc, prod_subcategory,
        prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
        prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
        prod_list_price, prod_min_price);
    END IF;
  END LOOP;
  CLOSE cur;
  RETURN objset;
END;
/

You can use the table function in a SQL statement to show the results. Here we use additional SQL functionality for the output.

SELECT DISTINCT UPPER(prod_category), prod_status
FROM TABLE(obsolete_products(CURSOR(SELECT * FROM products)));

UPPER(PROD_CATEGORY) PROD_STATUS
-------------------- -----------
GIRLS                obsolete
MEN                  obsolete

2 rows selected.


The following example implements the same filtering as the first one. The main differences between those two are:

• This example uses a strong typed REF cursor as input and can be parallelized based on the objects of the strong typed cursor, as shown in one of the following examples.
• The table function returns the result set incrementally as soon as records are created.

REM Same example, pipelined implementation
REM strong ref cursor (input type is defined)
REM a table without a strong typed input ref cursor cannot be parallelized
REM
CREATE OR
REPLACE FUNCTION obsolete_products_pipe(cur cursor_pkg.strong_refcur_t)
RETURN product_t_table
PIPELINED
PARALLEL_ENABLE (PARTITION cur BY ANY) IS
  prod_id NUMBER(6);
  prod_name VARCHAR2(50);
  prod_desc VARCHAR2(4000);
  prod_subcategory VARCHAR2(50);
  prod_subcat_desc VARCHAR2(2000);
  prod_category VARCHAR2(50);
  prod_cat_desc VARCHAR2(2000);
  prod_weight_class NUMBER(2);
  prod_unit_of_measure VARCHAR2(20);
  prod_pack_size VARCHAR2(30);
  supplier_id NUMBER(6);
  prod_status VARCHAR2(20);
  prod_list_price NUMBER(8,2);
  prod_min_price NUMBER(8,2);
  sales NUMBER:=0;
BEGIN
  LOOP
    -- Fetch from cursor variable
    FETCH cur INTO prod_id, prod_name, prod_desc, prod_subcategory,
      prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
      prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
      prod_list_price, prod_min_price;
    EXIT WHEN cur%NOTFOUND; -- exit when last row is fetched
    IF prod_status='obsolete' AND prod_category !='Boys' THEN
      PIPE ROW (product_t(prod_id, prod_name, prod_desc, prod_subcategory,
        prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
        prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
        prod_list_price, prod_min_price));
    END IF;
  END LOOP;
  CLOSE cur;
  RETURN;
END;
/

You can use the table function as follows:

SELECT DISTINCT prod_category,
  DECODE(prod_status, 'obsolete', 'NO LONGER AVAILABLE', 'N/A')
FROM TABLE(obsolete_products_pipe(CURSOR(SELECT * FROM products)));

PROD_CATEGORY DECODE(PROD_STATUS,
------------- -------------------
Girls         NO LONGER AVAILABLE
Men           NO LONGER AVAILABLE

2 rows selected.

We now change the degree of parallelism for the input table products and issue the same statement again:

ALTER TABLE products PARALLEL 4;

The session statistics show that the statement has been parallelized:

SELECT * FROM V$PQ_SESSTAT WHERE statistic='Queries Parallelized';

STATISTIC            LAST_QUERY SESSION_TOTAL
-------------------- ---------- -------------
Queries Parallelized          1             3

1 row selected.

Table functions can also fan out results into persistent table structures. This is demonstrated in the next example. The function returns all obsolete products except those of a specific prod_category (default Men), which was set to status obsolete by error. The prod_id values detected as wrong are stored in a separate table structure. The result set consists of all other obsolete product categories. The example furthermore demonstrates how normal variables can be used in conjunction with table functions:


CREATE OR REPLACE FUNCTION obsolete_products_dml(cur cursor_pkg.strong_refcur_t,
  prod_cat VARCHAR2 DEFAULT 'Men') RETURN product_t_table
PIPELINED
PARALLEL_ENABLE (PARTITION cur BY ANY) IS
  PRAGMA AUTONOMOUS_TRANSACTION;
  prod_id NUMBER(6);
  prod_name VARCHAR2(50);
  prod_desc VARCHAR2(4000);
  prod_subcategory VARCHAR2(50);
  prod_subcat_desc VARCHAR2(2000);
  prod_category VARCHAR2(50);
  prod_cat_desc VARCHAR2(2000);
  prod_weight_class NUMBER(2);
  prod_unit_of_measure VARCHAR2(20);
  prod_pack_size VARCHAR2(30);
  supplier_id NUMBER(6);
  prod_status VARCHAR2(20);
  prod_list_price NUMBER(8,2);
  prod_min_price NUMBER(8,2);
  sales NUMBER:=0;
BEGIN
  LOOP
    -- Fetch from cursor variable
    FETCH cur INTO prod_id, prod_name, prod_desc, prod_subcategory,
      prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
      prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
      prod_list_price, prod_min_price;
    EXIT WHEN cur%NOTFOUND; -- exit when last row is fetched
    IF prod_status='obsolete' THEN
      IF prod_category=prod_cat THEN
        INSERT INTO obsolete_products_errors VALUES
          (prod_id, 'correction: category '||UPPER(prod_cat)||' still available');
      ELSE
        PIPE ROW (product_t(prod_id, prod_name, prod_desc, prod_subcategory,
          prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
          prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
          prod_list_price, prod_min_price));
      END IF;
    END IF;
  END LOOP;
  COMMIT;
  CLOSE cur;
  RETURN;
END;
/

The following query shows all obsolete product groups except the prod_category Men, which was wrongly set to status obsolete.


SELECT DISTINCT prod_category, prod_status
FROM TABLE(obsolete_products_dml(CURSOR(SELECT * FROM products)));

PROD_CATEGORY PROD_STATUS
------------- -----------
Boys          obsolete
Girls         obsolete

2 rows selected.

As you can see, there are some products of the prod_category Men that were obsoleted by accident:

SELECT DISTINCT msg FROM obsolete_products_errors;

MSG
----------------------------------------
correction: category MEN still available

1 row selected.

Taking advantage of the second input variable changes the result set as follows:

SELECT DISTINCT prod_category, prod_status
FROM TABLE(obsolete_products_dml(CURSOR(SELECT * FROM products), 'Boys'));

PROD_CATEGORY PROD_STATUS
------------- -----------
Girls         obsolete
Men           obsolete

2 rows selected.

SELECT DISTINCT msg FROM obsolete_products_errors;

MSG
-----------------------------------------
correction: category BOYS still available

1 row selected.

Because table functions can be used like a normal table, they can be nested, as shown in the following:

SELECT DISTINCT prod_category, prod_status
FROM TABLE(obsolete_products_dml(CURSOR(SELECT *
  FROM TABLE(obsolete_products_pipe(CURSOR(SELECT * FROM products))))));

PROD_CATEGORY PROD_STATUS
------------- -----------
Girls         obsolete


• Each disk has been further subdivided using an operating system utility into 4 operating system files with names like /dev/D1.1, /dev/D1.2, ... , /dev/D30.4.

• Four tablespaces are allocated on each group of 10 disks. To better balance I/O and parallelize tablespace creation (because Oracle writes each block in a datafile when it is added to a tablespace), it is best if each of the four tablespaces on each group of 10 disks has its first datafile on a different disk. Thus the first tablespace has /dev/D1.1 as its first datafile, the second tablespace has /dev/D4.2 as its first datafile, and so on, as illustrated in Figure 13-5.

Figure 13-5 Datafile Layout for Parallel Load Example

Step 1: Create the Tablespaces and Add Datafiles in Parallel

The following is the command to create a tablespace named Tsfacts1. Other tablespaces are created with analogous commands. On a 10-CPU machine, it should be possible to run all 12 CREATE TABLESPACE statements together. Alternatively, it might be better to run them in two batches of 6 (two from each of the three groups of disks).

CREATE TABLESPACE TSfacts1
  DATAFILE '/dev/D1.1'  SIZE 1024M REUSE,
           '/dev/D2.1'  SIZE 1024M REUSE,
           '/dev/D3.1'  SIZE 1024M REUSE,
           '/dev/D4.1'  SIZE 1024M REUSE,
           '/dev/D5.1'  SIZE 1024M REUSE,
           '/dev/D6.1'  SIZE 1024M REUSE,
           '/dev/D7.1'  SIZE 1024M REUSE,
           '/dev/D8.1'  SIZE 1024M REUSE,
           '/dev/D9.1'  SIZE 1024M REUSE,
           '/dev/D10.1' SIZE 1024M REUSE
  DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);

CREATE TABLESPACE TSfacts2
  DATAFILE '/dev/D4.2'  SIZE 1024M REUSE,
           '/dev/D5.2'  SIZE 1024M REUSE,
           '/dev/D6.2'  SIZE 1024M REUSE,
           '/dev/D7.2'  SIZE 1024M REUSE,
           '/dev/D8.2'  SIZE 1024M REUSE,
           '/dev/D9.2'  SIZE 1024M REUSE,
           '/dev/D10.2' SIZE 1024M REUSE,
           '/dev/D1.2'  SIZE 1024M REUSE,
           '/dev/D2.2'  SIZE 1024M REUSE,
           '/dev/D3.2'  SIZE 1024M REUSE
  DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts4
  DATAFILE '/dev/D10.4' SIZE 1024M REUSE,
           '/dev/D1.4'  SIZE 1024M REUSE,
           '/dev/D2.4'  SIZE 1024M REUSE,
           '/dev/D3.4'  SIZE 1024M REUSE,
           '/dev/D4.4'  SIZE 1024M REUSE,
           '/dev/D5.4'  SIZE 1024M REUSE,
           '/dev/D6.4'  SIZE 1024M REUSE,
           '/dev/D7.4'  SIZE 1024M REUSE,
           '/dev/D8.4'  SIZE 1024M REUSE,
           '/dev/D9.4'  SIZE 1024M REUSE
  DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts12
  DATAFILE '/dev/D30.4' SIZE 1024M REUSE,
           '/dev/D21.4' SIZE 1024M REUSE,
           '/dev/D22.4' SIZE 1024M REUSE,
           '/dev/D23.4' SIZE 1024M REUSE,
           '/dev/D24.4' SIZE 1024M REUSE,
           '/dev/D25.4' SIZE 1024M REUSE,
           '/dev/D26.4' SIZE 1024M REUSE,
           '/dev/D27.4' SIZE 1024M REUSE,
           '/dev/D28.4' SIZE 1024M REUSE,
           '/dev/D29.4' SIZE 1024M REUSE
  DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);
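The rotation of the first datafile across the disks of a group follows a regular pattern, so the datafile list for each tablespace can be generated rather than typed. The following Python sketch reproduces the four statements shown above. The rotation step of 3 disks is inferred from those statements (tablespace 1 starts at disk 1, tablespace 2 at disk 4, tablespace 4 at disk 10); the layout of tablespaces that are not shown, and the function names, are assumptions.

```python
def datafile_layout(tablespace_no, disks_per_group=10, files_per_disk=4, stride=3):
    """Return the ordered datafile names for tablespace 1..12.

    Tablespaces 1-4 live on disks 1-10, 5-8 on disks 11-20, 9-12 on 21-30.
    Within a group, the k-th tablespace uses file number k on every disk and
    starts `stride` disks further along than the previous tablespace.
    """
    group, slot = divmod(tablespace_no - 1, files_per_disk)
    first_disk = group * disks_per_group           # 0-based index of the group's first disk
    start = (slot * stride) % disks_per_group      # rotation of the first datafile
    return [
        f"/dev/D{first_disk + (start + i) % disks_per_group + 1}.{slot + 1}"
        for i in range(disks_per_group)
    ]


def create_tablespace_sql(tablespace_no):
    """Render a CREATE TABLESPACE statement text for the generated layout."""
    files = ",\n           ".join(
        f"'{name}' SIZE 1024M REUSE" for name in datafile_layout(tablespace_no)
    )
    return (
        f"CREATE TABLESPACE TSfacts{tablespace_no}\n"
        f"  DATAFILE {files}\n"
        "  DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);"
    )


if __name__ == "__main__":
    print(create_tablespace_sql(2))
```

Generating the statements this way makes it easy to confirm that all 120 operating system files are used exactly once before any tablespace is created.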

Extent sizes in the STORAGE clause should be multiples of the multiblock read size, where blocksize * MULTIBLOCK_READ_COUNT = multiblock read size.
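As a worked example of this rule, the following Python sketch computes the multiblock read size and rounds a desired extent size up to the next multiple of it. The block size of 8192 bytes and the read counts of 16 and 12 are illustrative assumptions, not values taken from this guide.

```python
def multiblock_read_size(block_size, multiblock_read_count):
    """Bytes transferred by one multiblock read."""
    return block_size * multiblock_read_count


def round_extent_up(extent_bytes, block_size, multiblock_read_count):
    """Smallest multiple of the multiblock read size that is >= extent_bytes."""
    unit = multiblock_read_size(block_size, multiblock_read_count)
    return -(-extent_bytes // unit) * unit   # ceiling division, then scale back


MB = 1024 * 1024

# 8 KB blocks, 16 blocks per read: 128 KB per read, and 100 MB is already a multiple.
aligned = round_extent_up(100 * MB, 8192, 16)

# 8 KB blocks, 12 blocks per read: 96 KB per read, so 100 MB must be rounded up.
padded = round_extent_up(100 * MB, 8192, 12)
```

With 16 blocks per read the 100 MB extent used in the statements above needs no adjustment; with 12 blocks per read it would have to grow slightly to stay aligned.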

INITIAL and NEXT should normally be set to the same value. In the case of parallel load, make the extent size large enough to keep the number of extents reasonable, and to avoid excessive overhead and serialization due to bottlenecks in the data dictionary. When PARALLEL=TRUE is used for parallel loader, the INITIAL extent is not used. In this case

• Indexes cannot be defined.

• You must set a small initial extent, because each loader session gets a new extent when it begins, and it does not use any existing space associated with the object.

• Space fragmentation issues arise.

However, regardless of the setting of this keyword, if you have one loader process for each partition, you are still effectively loading into the table in parallel.

Example 13-6 Loading Partitions in Parallel Case 1

In this approach, assume 12 input files are partitioned in the same way as your table. You have one input file for each partition of the table to be loaded. You start 12 SQL*Loader sessions concurrently in parallel, entering statements like these:

SQLLDR DATA=jan95.dat DIRECT=TRUE CONTROL=jan95.ctl
SQLLDR DATA=feb95.dat DIRECT=TRUE CONTROL=feb95.ctl
. . .
SQLLDR DATA=dec95.dat DIRECT=TRUE CONTROL=dec95.ctl

In the example, the keyword PARALLEL=TRUE is not set. A separate control file for each partition is necessary because the control file must specify the partition into which the loading should be done. It contains a statement such as the following:

LOAD INTO facts partition(jan95)

The advantage of this approach is that local indexes are maintained by SQL*Loader. You still get parallel loading, but on a partition level--without the restrictions of the PARALLEL keyword.

A disadvantage is that you must manually partition the input prior to loading.

Example 13-7 Loading Partitions in Parallel Case 2

In another common approach, assume an arbitrary number of input files that are not partitioned in the same way as the table. You can adopt a strategy of performing parallel load for each input file individually. Thus if there are seven input files, you can start seven SQL*Loader sessions, using statements like the following:

SQLLDR DATA=file1.dat DIRECT=TRUE PARALLEL=TRUE

Oracle partitions the input data so that it goes into the correct partitions. In this case all the loader sessions can share the same control file, so there is no need to mention it in the statement.

The keyword PARALLEL=TRUE must be used, because each of the seven loader sessions can write into every partition. In Case 1, every loader session would write into only one partition, because the data was partitioned prior to loading. Hence all the PARALLEL keyword restrictions are in effect.

In this case, Oracle attempts to spread the data evenly across all the files in each of the 12 tablespaces--however an even spread of data is not guaranteed. Moreover, there could be I/O contention during the load when the loader processes are attempting to write to the same device simultaneously.

Example 13-8 Loading Partitions in Parallel Case 3

In this example, you want precise control over the load. To achieve this, you must partition the input data in the same way as the datafiles are partitioned in Oracle.

This example uses 10 processes loading into 30 disks. To accomplish this, you must split the input into 120 files beforehand. The 10 processes will load the first partition in parallel on the first 10 disks, then the second partition in parallel on the second 10 disks, and so on through the 12th partition. You then run the following commands concurrently as background processes:

SQLLDR DATA=jan95.file1.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D1.1
...
SQLLDR DATA=jan95.file10.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D10.1
WAIT;
...
SQLLDR DATA=dec95.file1.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D30.4
...
SQLLDR DATA=dec95.file10.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D29.4
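Typing 120 such command lines by hand is error prone. The following Python sketch generates them in 12 batches of 10, pairing the i-th input file of each month with the i-th datafile of the corresponding tablespace. It only builds the command strings and does not run SQL*Loader. The pairing of month n with tablespace n, the month abbreviations other than jan, feb, and dec, and the rotation step of 3 are assumptions inferred from the commands and tablespace statements shown in this example.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def datafiles(tablespace_no, disks_per_group=10, files_per_disk=4, stride=3):
    """Ordered datafile names of tablespace 1..12 (rotation step is inferred)."""
    group, slot = divmod(tablespace_no - 1, files_per_disk)
    start = (slot * stride) % disks_per_group
    return [
        f"/dev/D{group * disks_per_group + (start + i) % disks_per_group + 1}.{slot + 1}"
        for i in range(disks_per_group)
    ]


def case3_batches(year="95"):
    """Return 12 batches; the 10 commands of a batch are meant to run concurrently."""
    batches = []
    for tablespace_no, month in enumerate(MONTHS, start=1):
        batch = [
            f"SQLLDR DATA={month}{year}.file{i}.dat "
            f"DIRECT=TRUE PARALLEL=TRUE FILE={path}"
            for i, path in enumerate(datafiles(tablespace_no), start=1)
        ]
        batches.append(batch)
    return batches


if __name__ == "__main__":
    for batch in case3_batches():
        print("\n".join(batch))
        print("WAIT;")   # finish one partition before starting the next
```

Keeping each batch to one command per datafile preserves the property this case is designed for: no two loader processes in a batch write to the same device.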

For Oracle Real Application Clusters, divide the loader session evenly among the nodes. The datafile being read should always reside on the same node as the loader session.

The keyword PARALLEL=TRUE must be used, because multiple loader sessions can write into the same partition. Hence all the restrictions entailed by the PARALLEL keyword are in effect. An advantage of this approach, however, is that it guarantees that all of the data is precisely balanced, exactly reflecting your partitioning.

Note:

Although this example shows parallel load used with partitioned tables, the two features can be used independent of one another.

Example 13-9 Loading Partitions in Parallel Case 4

For this approach, all partitions must be in the same tablespace. You need to have the same number of input files as datafiles in the tablespace, but you do not need to partition the input the same way in which the table is partitioned.

For example, if all 30 devices were in the same tablespace, then you would arbitrarily partition your input data into 30 files, then start 30 SQL*Loader sessions in parallel. The statement starting up the first session would be similar to the following:

SQLLDR DATA=file1.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D1
. . .
SQLLDR DATA=file30.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D30

The advantage of this approach is that, as in Case 3, you have control over the exact placement of datafiles because you use the FILE keyword. However, you are not required to partition the input data by value because Oracle does that for you.

A disadvantage is that this approach requires all the partitions to be in the same tablespace. This reduces availability.

Example 13-10 Loading External Data

This is probably the most basic use of external tables where the data volume is large and no transformations are applied to the external data. The load process is performed as follows:

1. You create the external table. Most likely, the table will be declared as parallel to perform the load in parallel. Oracle will dynamically perform load balancing between the parallel execution servers involved in the query.

2. Once the external table is created (remember that this only creates the metadata in the dictionary), data can be converted, moved and loaded into the database using either a PARALLEL CREATE TABLE AS SELECT or a PARALLEL INSERT statement.

CREATE TABLE products_ext
 (prod_id NUMBER, prod_name VARCHAR2(50), ...,
  price NUMBER(6,2), discount NUMBER(6,2))
ORGANIZATION EXTERNAL
(
  DEFAULT DIRECTORY stage_dir
  ACCESS PARAMETERS
  ( RECORDS FIXED 30
    BADFILE 'bad/bad_products_ext'
    LOGFILE 'log/log_products_ext'
    FIELDS
    ( prod_id POSITION (1:8) CHAR,
      prod_name POSITION (*,+50) CHAR,
      prod_desc POSITION (*,+200) CHAR,
      . . .)
  )
  LOCATION ('new/new_prod1.txt','new/new_prod2.txt'))
PARALLEL 5
REJECT LIMIT 200;

-- load it in the database using a parallel insert
ALTER SESSION ENABLE PARALLEL DML;
INSERT INTO products SELECT * FROM products_ext;

In this example, stage_dir is a directory where the external flat files reside.

Note that loading data in parallel can be performed in Oracle9i by using SQL*Loader. But external tables are probably easier to use and the parallel load is automatically coordinated. Unlike SQL*Loader, dynamic load balancing between parallel execution servers will be performed as well because there will be intra-file parallelism. The latter implies that the user will not have to manually split input files before starting the parallel load. This will be accomplished dynamically.

Key Lookup Scenario

Another simple transformation is a key lookup. For example, suppose that sales transaction data has been loaded into a retail data warehouse. Although the data warehouse's sales table contains a product_id column, the sales transaction data extracted from the source system contains Uniform Price Codes (UPC) instead of product IDs. Therefore, it is necessary to transform the UPC codes into product IDs before the new sales transaction data can be inserted into the sales table.

In order to execute this transformation, a lookup table must relate the product_id values to the UPC codes. This table might be the product dimension table, or perhaps another table in the data warehouse that has been created specifically to support this transformation. For this example, we assume that there is a table named product, which has a product_id and an upc_code column.

This data substitution transformation can be implemented using the following CTAS statement:

CREATE TABLE temp_sales_step2
NOLOGGING PARALLEL AS
SELECT
  sales_transaction_id,
  product.product_id sales_product_id,
  sales_customer_id,
  sales_time_id,
  sales_channel_id,
  sales_quantity_sold,
  sales_dollar_amount
FROM temp_sales_step1, product
WHERE temp_sales_step1.upc_code = product.upc_code;

This CTAS statement will convert each valid UPC code to a valid product_id value. If the ETL process has guaranteed that each UPC code is valid, then this statement alone may be sufficient to implement the entire transformation.
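The effect of the lookup join can be tried on a few rows without an Oracle instance. The following sketch uses Python's built-in sqlite3 module as a stand-in, so the NOLOGGING and PARALLEL clauses are omitted and the join is written in ANSI form. The sample UPC values and the reduced column list are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, upc_code TEXT NOT NULL);
CREATE TABLE temp_sales_step1 (
    sales_transaction_id INTEGER,
    upc_code             TEXT,
    sales_quantity_sold  INTEGER,
    sales_dollar_amount  REAL
);
INSERT INTO product VALUES (10, 'UPC-A'), (20, 'UPC-B');
INSERT INTO temp_sales_step1 VALUES
    (1, 'UPC-A', 2, 19.98),
    (2, 'UPC-B', 1, 5.00),
    (3, 'UPC-X', 4, 8.00);   -- no matching product row

-- the key lookup: replace each UPC code by the matching product_id
CREATE TABLE temp_sales_step2 AS
SELECT s.sales_transaction_id,
       p.product_id AS sales_product_id,
       s.sales_quantity_sold,
       s.sales_dollar_amount
FROM temp_sales_step1 s
JOIN product p ON s.upc_code = p.upc_code;
""")

rows = conn.execute(
    "SELECT sales_transaction_id, sales_product_id "
    "FROM temp_sales_step2 ORDER BY sales_transaction_id"
).fetchall()
print(rows)
```

Transaction 3 carries a UPC code that is not in the product table, so the inner join silently drops it. This is acceptable only when every UPC code is known to be valid.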

Exception Handling Scenario

In the preceding example, if you must also handle new sales data that does not have valid UPC codes, you can use an additional CTAS statement to identify the invalid rows:

CREATE TABLE temp_sales_step1_invalid NOLOGGING PARALLEL AS
  SELECT * FROM temp_sales_step1
  WHERE temp_sales_step1.upc_code NOT IN (SELECT upc_code FROM product);

This invalid data is now stored in a separate table, temp_sales_step1_invalid, and can be handled separately by the ETL process.

Another way to handle invalid data is to modify the original CTAS to use an outer join:

CREATE TABLE temp_sales_step2
NOLOGGING PARALLEL AS
  SELECT
    sales_transaction_id,
    product.product_id sales_product_id,
    sales_customer_id,
    sales_time_id,
    sales_channel_id,
    sales_quantity_sold,
    sales_dollar_amount
  FROM temp_sales_step1, product
  WHERE temp_sales_step1.upc_code = product.upc_code (+);

Using this outer join, the sales transactions that originally contained invalid UPC codes will be assigned a product_id of NULL. These transactions can be handled later.
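Both ways of handling invalid UPC codes can be compared on a small data set. The following sketch again uses Python's sqlite3 module as a stand-in for Oracle; the Oracle (+) notation is replaced by the equivalent ANSI LEFT OUTER JOIN, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, upc_code TEXT NOT NULL);
CREATE TABLE temp_sales_step1 (sales_transaction_id INTEGER, upc_code TEXT);
INSERT INTO product VALUES (10, 'UPC-A'), (20, 'UPC-B');
INSERT INTO temp_sales_step1 VALUES (1, 'UPC-A'), (2, 'UPC-B'), (3, 'UPC-X');
""")

# Approach 1: isolate the rows whose UPC code has no match in product.
invalid = conn.execute("""
    SELECT sales_transaction_id
    FROM temp_sales_step1
    WHERE upc_code NOT IN (SELECT upc_code FROM product)
""").fetchall()

# Approach 2: keep every row; unmatched rows receive a NULL product_id.
joined = conn.execute("""
    SELECT s.sales_transaction_id, p.product_id AS sales_product_id
    FROM temp_sales_step1 s
    LEFT OUTER JOIN product p ON s.upc_code = p.upc_code
    ORDER BY s.sales_transaction_id
""").fetchall()

print(invalid)
print(joined)
```

One caveat with the first approach: NOT IN returns no rows at all if the subquery yields a NULL, which is why upc_code is declared NOT NULL in this sketch.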

Additional approaches to handling invalid UPC codes exist. Some data warehouses may choose to insert null-valued product_id values into their sales table, while other data warehouses may not allow any new data from the entire batch to be inserted into the sales table until all invalid UPC codes have been addressed. The correct approach is determined by the business requirements of the data warehouse. Regardless of the specific requirements, exception handling can be addressed by the same basic SQL techniques as transformations.

Pivoting Scenarios

A data warehouse can receive data from many different sources. Some of these source systems may not be relational databases and may store data in very different formats from the data warehouse. For example, suppose that you receive a set of sales records from a nonrelational database having the form:

product_id, customer_id, weekly_start_date, sales_sun, sales_mon, sales_tue,
sales_wed, sales_thu, sales_fri, sales_sat

The input table looks like this:

    333     444 18-OCT-00         600
    333     444 19-OCT-00         700
    333     444 20-OCT-00         800
    333     444 21-OCT-00         900

Examples of Pre-Oracle9i Pivoting

The pre-Oracle9i way of pivoting, using either CTAS (or parallel INSERT AS SELECT) or PL/SQL, is shown in this section.

Example 1 Pre-Oracle9i Pivoting Using a CTAS Statement

CREATE TABLE temp_sales_step2 NOLOGGING PARALLEL AS
SELECT product_id, customer_id, time_id, amount_sold
FROM
 (SELECT product_id, customer_id, weekly_start_date time_id,
    sales_sun amount_sold FROM sales_input_table
  UNION ALL
  SELECT product_id, customer_id, weekly_start_date+1 time_id,
    sales_mon amount_sold FROM sales_input_table
  UNION ALL
  SELECT product_id, customer_id, weekly_start_date+2 time_id,
    sales_tue amount_sold FROM sales_input_table
  UNION ALL
  SELECT product_id, customer_id, weekly_start_date+3 time_id,
    sales_wed amount_sold FROM sales_input_table
  UNION ALL
  SELECT product_id, customer_id, weekly_start_date+4 time_id,
    sales_thu amount_sold FROM sales_input_table
  UNION ALL
  SELECT product_id, customer_id, weekly_start_date+5 time_id,
    sales_fri amount_sold FROM sales_input_table
  UNION ALL
  SELECT product_id, customer_id, weekly_start_date+6 time_id,
    sales_sat amount_sold FROM sales_input_table);

Like all CTAS operations, this operation can be fully parallelized. However, the CTAS approach also requires seven separate scans of the data, one for each day of the week. Even with parallelism, the CTAS approach can be time-consuming.

Example 2 Pre-Oracle9i Pivoting Using PL/SQL

PL/SQL offers an alternative implementation. A basic PL/SQL function to implement a pivoting operation is shown in the following statement:

DECLARE
  CURSOR c1 is
    SELECT product_id, customer_id, weekly_start_date, sales_sun,
      sales_mon, sales_tue, sales_wed, sales_thu, sales_fri, sales_sat
    FROM sales_input_table;
BEGIN
  FOR crec IN c1 LOOP
    INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date,
        crec.sales_sun );
    INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+1,
        crec.sales_mon );
    INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+2,
        crec.sales_tue );
    INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+3,
        crec.sales_wed );
    INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+4,
        crec.sales_thu );
    INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+5,
        crec.sales_fri );
    INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+6,
        crec.sales_sat );
  END LOOP;
  COMMIT;
END;

This PL/SQL procedure can be modified to provide even better performance. Array inserts can accelerate the insertion phase of the procedure. Further performance can be gained by parallelizing this transformation operation, particularly if the temp_sales_step1 table is partitioned, using techniques similar to the parallelization of data unloading described in Chapter 11, "Extraction in Data Warehouses". The primary advantage of this PL/SQL procedure over a CTAS approach is that it requires only a single scan of the data.

Example of Oracle9i Pivoting

Oracle9i offers a faster way of pivoting your data by using a multitable insert.

The following example uses the multitable insert syntax to insert into the demo table sh.sales some data from an input table with a different structure. The multitable insert statement looks like this:

INSERT ALL
  INTO sales (prod_id, cust_id, time_id, amount_sold)
    VALUES (product_id, customer_id, weekly_start_date, sales_sun)
  INTO sales (prod_id, cust_id, time_id, amount_sold)
    VALUES (product_id, customer_id, weekly_start_date+1, sales_mon)
  INTO sales (prod_id, cust_id, time_id, amount_sold)
    VALUES (product_id, customer_id, weekly_start_date+2, sales_tue)
  INTO sales (prod_id, cust_id, time_id, amount_sold)
    VALUES (product_id, customer_id, weekly_start_date+3, sales_wed)
  INTO sales (prod_id, cust_id, time_id, amount_sold)
    VALUES (product_id, customer_id, weekly_start_date+4, sales_thu)
  INTO sales (prod_id, cust_id, time_id, amount_sold)
    VALUES (product_id, customer_id, weekly_start_date+5, sales_fri)
  INTO sales (prod_id, cust_id, time_id, amount_sold)
    VALUES (product_id, customer_id, weekly_start_date+6, sales_sat)
SELECT product_id, customer_id, weekly_start_date, sales_sun,
  sales_mon, sales_tue, sales_wed, sales_thu, sales_fri, sales_sat
FROM sales_input_table;

This statement only scans the source table once and then inserts the appropriate data for each day.
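The single-scan property is easy to see in a procedural form. The following Python sketch is an analogue of the pivot, not a replacement for the multitable insert: it reads each weekly input record exactly once and emits seven daily rows from it. The dictionary-based record format is an assumption made for illustration.

```python
from datetime import date, timedelta

# Day columns in the order they appear in the weekly input record.
DAY_COLUMNS = ["sales_sun", "sales_mon", "sales_tue", "sales_wed",
               "sales_thu", "sales_fri", "sales_sat"]


def pivot(records):
    """Yield one (prod_id, cust_id, time_id, amount_sold) row per day.

    Each input record is visited once; the seven output rows are derived
    from it before the next record is read.
    """
    for record in records:
        for offset, column in enumerate(DAY_COLUMNS):
            yield (
                record["product_id"],
                record["customer_id"],
                record["weekly_start_date"] + timedelta(days=offset),
                record[column],
            )


week = {
    "product_id": 111, "customer_id": 222,
    "weekly_start_date": date(2000, 10, 1),
    "sales_sun": 100, "sales_mon": 200, "sales_tue": 300, "sales_wed": 400,
    "sales_thu": 500, "sales_fri": 600, "sales_sat": 700,
}
daily_rows = list(pivot([week]))
```

The UNION ALL form, by contrast, corresponds to running the outer loop seven times, once for each day column.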

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.

14 Maintaining the Data Warehouse

This chapter discusses how to load and refresh a data warehouse, and discusses:

• Using Partitioning to Improve Data Warehouse Refresh

• Optimizing DML Operations During Refresh

• Refreshing Materialized Views

• Using Materialized Views with Partitioned Tables

Using Partitioning to Improve Data Warehouse Refresh

ETL (Extraction, Transformation and Loading) is done on a scheduled basis to reflect changes made to the original source system. During this step, you physically insert the new, clean data into the production data warehouse schema, and take all of the other steps necessary (such as building indexes, validating constraints, taking backups) to make this new data available to the end users. Once all of this data has been loaded into the data warehouse, the materialized views have to be updated to reflect the latest data.

The partitioning scheme of the data warehouse is often crucial in determining the efficiency of refresh operations in the data warehouse load process. In fact, the load process is often the primary consideration in choosing the partitioning scheme of data warehouse tables and indexes.

The partitioning scheme of the largest data warehouse tables (for example, the fact table in a star schema) should be based upon the loading paradigm of the data warehouse.

Most data warehouses are loaded with new data on a regular schedule. For example, every night, week, or month, new data is brought into the data warehouse. The data being loaded at the end of the week or month typically corresponds to the transactions for the week or month. In this very common scenario, the data warehouse is being loaded by time. This suggests that the data warehouse tables should be partitioned on a date column. In our data warehouse example, suppose the new data is loaded into the sales table every month. Furthermore, the sales table has been partitioned by month. These steps show how the load process will proceed to add the data for a new month (January 2001) to the table sales.

1. Place the new data into a separate table, sales_01_2001. This data can be directly loaded into sales_01_2001 from outside the data warehouse, or this data can be the result of previous data transformation operations that have already occurred in the data warehouse. sales_01_2001 has the exact same columns, datatypes, and so forth, as the sales table. Gather statistics on the sales_01_2001 table.

2. Create indexes and add constraints on sales_01_2001. Again, the indexes and constraints on sales_01_2001 should be identical to the indexes and constraints on sales. Indexes can be built in parallel and should use the NOLOGGING and the COMPUTE STATISTICS options. For example:

CREATE BITMAP INDEX sales_01_2001_customer_id_bix
ON sales_01_2001(customer_id)
TABLESPACE sales_idx NOLOGGING PARALLEL COMPUTE STATISTICS;

Apply all constraints to the sales_01_2001 table that are present on the sales table. This includes referential integrity constraints. A typical constraint would be:

ALTER TABLE sales_01_2001 ADD CONSTRAINT sales_customer_id
  FOREIGN KEY (customer_id)
  REFERENCES customer(customer_id) ENABLE NOVALIDATE;

proportional to the amount of new data being loaded, not to the total size of the sales table.

Second, the new data is loaded with minimal impact on concurrent queries. All of the operations associated with data loading are occurring on a separate sales_01_2001 table. Therefore, none of the existing data or indexes of the sales table is affected during this data refresh process. The sales table and its indexes remain entirely untouched throughout this refresh process.

Third, in case of the existence of any global indexes, those are incrementally maintained as part of the exchange command. This maintenance does not affect the availability of the existing global index structures.

The exchange operation can be viewed as a publishing mechanism. Until the data warehouse administrator exchanges the sales_01_2001 table into the sales table, end users cannot see the new data. Once the exchange has occurred, then any end user query accessing the sales table will immediately be able to see the sales_01_2001 data.

Partitioning is useful not only for adding new data but also for removing and archiving data. Many data warehouses maintain a rolling window of data. For example, the data warehouse stores the most recent 36 months of sales data. Just as a new partition can be added to the sales table (as described earlier), an old partition can be quickly (and independently) removed from the sales table. These two benefits (reduced resources utilization and minimal end-user impact) are just as pertinent to removing a partition as they are to adding a partition.
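The rolling window can be pictured with a small in-memory model. The following Python sketch is purely conceptual and says nothing about how Oracle implements partitions: each partition is a separately held list of rows, publishing a prepared staging table attaches that list without copying rows, and removing the oldest partition detaches it without touching the others. The class and method names are hypothetical, and partitions are assumed to be added in chronological order.

```python
from collections import OrderedDict


class PartitionedTable:
    """Toy model of a range-partitioned table with a rolling window."""

    def __init__(self):
        # partition name -> list of rows, oldest partition first
        self.partitions = OrderedDict()

    def exchange_in(self, name, staging_rows):
        """Publish a fully prepared staging table as a new partition.

        Only a reference is attached, so the cost does not depend on the
        number of rows already stored in the table.
        """
        self.partitions[name] = staging_rows

    def drop_oldest(self):
        """Detach the oldest partition and return its (name, rows) pair."""
        return self.partitions.popitem(last=False)

    def roll(self, name, staging_rows, window):
        """Add one partition, then trim the table back to `window` partitions."""
        self.exchange_in(name, staging_rows)
        dropped = []
        while len(self.partitions) > window:
            dropped.append(self.drop_oldest()[0])
        return dropped
```

For a 36-month window, each monthly load would call roll with window=36. The returned names identify the partitions to archive or drop, and partitions not named are never visited.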

Removing data from a partitioned table does not necessarily mean that the old data is physically deleted from the database. There are two alternatives for removing old data from a partitioned table:

You can physically delete all data from the database by dropping the partition containing the old data, thus freeing the allocated space:

ALTER TABLE sales DROP PARTITION sales_01_1998;

You can exchange the old partition with an empty table of the same structure; this empty table is created equivalent to steps 1 and 2 described in the load process. Assuming the new empty table stub is named sales_archive_01_1998, the following SQL statement will 'empty' partition sales_01_1998:

ALTER TABLE sales EXCHANGE PARTITION sales_01_1998 WITH TABLE
sales_archive_01_1998 INCLUDING INDEXES WITHOUT VALIDATION
UPDATE GLOBAL INDEXES;

Note that the old data still exists, as the exchanged, nonpartitioned table sales_archive_01_1998.

If the partitioned table was set up in a way that every partition is stored in a separate tablespace, you can archive (or transport) this table using Oracle's transportable tablespace framework before dropping the actual data (the tablespace). See "Transportation Using Transportable Tablespaces" for further details regarding transportable tablespaces.

In some situations, you might not want to drop the old data immediately, but keep it as part of the partitioned table; although the data is no longer of main interest, there are still potential queries accessing this old, read-only data. You can use Oracle's data compression to minimize the space usage of the old data. We also assume that at least one compressed partition is already part of the partitioned table.

See Also:

Chapter 3, "Physical Design in Data Warehouses" for a generic discussion of data segment compression and Chapter 5, "Parallelism and Partitioning in Data Warehouses" for partitioning and data segment compression

Refresh Scenarios

A typical scenario might not only need to compress old data, but also to merge several old partitions to reflect the granularity for a later backup of several merged partitions. Let's assume that a backup (partition) granularity is on a quarterly base for any quarter, where the oldest month is more than 36 months behind the most recent month. In this case, we are therefore compressing and merging sales_01_1998, sales_02_1998, and sales_03_1998 into a new, compressed partition sales_q1_1998.

1. Create the new merged partition in parallel in another tablespace. The partition will be compressed as part of the MERGE operation:

ALTER TABLE sales MERGE PARTITIONS sales_01_1998, sales_02_1998, sales_03_1998
  INTO PARTITION sales_q1_1998 TABLESPACE archive_q1_1998 COMPRESS
  UPDATE GLOBAL INDEXES PARALLEL 4;

2. The partition MERGE operation invalidates the local indexes for the new merged partition. We therefore have to rebuild them:

ALTER TABLE sales MODIFY PARTITION sales_q1_1998 REBUILD UNUSABLE LOCAL INDEXES;

Alternatively, you can choose to create the new compressed data segment outside the partitioned table and exchange it back. The performance and the temporary space consumption are identical for both methods:

1. Create an intermediate table to hold the new merged information. The following statement inherits all NOT NULL constraints from the origin table by default:

CREATE TABLE sales_q1_1998_out TABLESPACE archive_q1_1998 NOLOGGING COMPRESS
PARALLEL 4 AS SELECT * FROM sales
WHERE time_id >= TO_DATE('01-JAN-1998','dd-mon-yyyy')
AND time_id < TO_DATE('01-APR-1998','dd-mon-yyyy');

2. Create the same index structure for table sales_q1_1998_out as for the existing table sales.

3. Prepare the existing table sales for the exchange with the new compressed table sales_q1_1998_out. Because the table to be exchanged contains data actually covered in three partitions, we have to 'create' one matching partition, having the range boundaries we are looking for. You simply have to drop two of the existing partitions. Note that you have to drop the lower two partitions sales_01_1998 and sales_02_1998; the lower boundary of a range partition is always defined by the upper (exclusive) boundary of the previous partition:

ALTER TABLE sales DROP PARTITION sales_01_1998;
ALTER TABLE sales DROP PARTITION sales_02_1998;

4. You can now exchange table sales_q1_1998_out with partition sales_03_1998. Unlike what the name of the partition suggests, its boundaries cover Q1-1998.

ALTER TABLE sales EXCHANGE PARTITION sales_03_1998
WITH TABLE sales_q1_1998_out INCLUDING INDEXES WITHOUT VALIDATION
UPDATE GLOBAL INDEXES;

Both methods apply to slightly different business scenarios: Using the MERGE PARTITION approach invalidates the local index structures for the affected partition, but it keeps all data accessible all the time. Any attempt to access the affected partition through one of the unusable index structures raises an error. The limited availability time is approximately the time for re-creating the local bitmap index structures. In most cases this can be neglected, since this part of the partitioned table shouldn't be touched too often.

The CTAS approach, however, reduces the unavailability of any index structures to close to zero, but there is a specific time window where the partitioned table does not have all the data, because we dropped two partitions. The limited availability time is approximately the time for exchanging the table. Depending on the existence and number of global indexes, this time window varies. Without any existing global indexes, this time window is a matter of a fraction of a second to a few seconds.

Note: 

Solution 1: Use parallel SQL operations (such as CREATE TABLE ... AS SELECT) to separate the new data from the data in previous time periods. Process the old data separately using other techniques.

New data feeds are not solely time based. You can also feed new data into a data warehouse with data from multiple operational systems on a business need basis. For example, the sales data from direct channels may come into the data warehouse separately from the data from indirect channels. For business reasons, it may furthermore make sense to keep the direct and indirect data in separate partitions.

Solution 2: Oracle supports composite range-list partitioning. The primary partitioning strategy of the sales table could be range partitioning based on time_id as shown in the example. However, the subpartitioning is a list based on the channel attribute. Each subpartition can now be loaded independently of each other (for each distinct channel) and added in a rolling window operation as discussed before. The partitioning strategy addresses the business needs in the most optimal manner.
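As a rough illustration of the composite range-list strategy in Solution 2, a table definition might look like the following sketch. The table name sales_by_channel, the column list, the partition names, and the channel codes 'D' and 'I' are assumptions made for illustration; only time_id and the channel attribute come from the text.

```sql
-- Hypothetical table: range partitioned on time_id,
-- list subpartitioned on a channel code (all names and values are illustrative).
CREATE TABLE sales_by_channel
( time_id      DATE         NOT NULL
, channel_id   CHAR(1)      NOT NULL
, prod_id      NUMBER       NOT NULL
, amount_sold  NUMBER(10,2)
)
PARTITION BY RANGE (time_id)
SUBPARTITION BY LIST (channel_id)
SUBPARTITION TEMPLATE
( SUBPARTITION direct   VALUES ('D')
, SUBPARTITION indirect VALUES ('I')
)
( PARTITION sales_q1_2001 VALUES LESS THAN (TO_DATE('01-APR-2001', 'DD-MON-YYYY'))
, PARTITION sales_q2_2001 VALUES LESS THAN (TO_DATE('01-JUL-2001', 'DD-MON-YYYY'))
);
```

With a layout like this, the direct and indirect feeds for a given time period land in separate subpartitions, so each feed can be loaded on its own schedule.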

Optimizing DML Operations During Refresh

You can optimize DML performance through the following techniques:

• Implementing an Efficient MERGE Operation
• Maintaining Referential Integrity
• Purging Data

Implementing an Efficient MERGE Operation

Commonly, the data that is extracted from a source system is not simply a list of new records that needs to be inserted into the data warehouse. Instead, this new data set is a combination of new records as well as modified records. For example, suppose that most of the data extracted from the OLTP systems will be new sales transactions. These records will be inserted into the warehouse's sales table, but some records may reflect modifications of previous transactions, such as returned merchandise or transactions that were incomplete or incorrect when initially loaded into the data warehouse. These records require updates to the sales table.

As a typical scenario, suppose that there is a table called new_sales that contains both inserts and updates that will be applied to the sales table. When designing the entire data warehouse load process, it was determined that the new_sales table would contain records with the following semantics:

• If a given sales_transaction_id of a record in new_sales already exists in sales, then update the sales table by adding the sales_dollar_amount and sales_quantity_sold values from the new_sales table to the existing row in the sales table.
• Otherwise, insert the entire new record from the new_sales table into the sales table.

This UPDATE-ELSE-INSERT operation is often called a merge. A merge can be executed using one SQL statement in Oracle9i, though it required two earlier.
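The single-statement form can be sketched with the Oracle9i MERGE statement. This is a minimal sketch, assuming that sales and new_sales share the three columns named in the text; a real sales table would carry more columns, and the INSERT branch would list all of them.

```sql
-- Sketch of the UPDATE-ELSE-INSERT semantics described above.
-- Assumption: both tables have exactly these three columns.
MERGE INTO sales s
USING new_sales n
ON (s.sales_transaction_id = n.sales_transaction_id)
WHEN MATCHED THEN
  -- Existing transaction: add the new amounts to the stored row.
  UPDATE SET s.sales_quantity_sold = s.sales_quantity_sold + n.sales_quantity_sold,
             s.sales_dollar_amount = s.sales_dollar_amount + n.sales_dollar_amount
WHEN NOT MATCHED THEN
  -- New transaction: insert the whole record.
  INSERT (sales_transaction_id, sales_quantity_sold, sales_dollar_amount)
  VALUES (n.sales_transaction_id, n.sales_quantity_sold, n.sales_dollar_amount);
```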

Example 14-1 Merging Prior to Oracle9i

The first SQL statement updates the appropriate rows in the sales table, while the second SQL statement inserts the rows:

UPDATE (SELECT s.sales_qua


15 Change Data Capture

Change Data Capture efficiently identifies and captures data that has been added to, updated, or removed from Oracle relational tables, and makes the change data available for use by applications. Change Data Capture is provided as an Oracle database server component with Oracle9i.

This chapter introduces Change Data Capture in the following sections:

• About Change Data Capture
• Installation and Implementation
• Security
• Columns in a Change Table
• Change Data Capture Views
• Synchronous Mode of Data Capture
• Publishing Change Data
• Managing Change Tables and Subscriptions
• Subscribing to Change Data
• Export and Import Considerations

See Also:

Oracle9i Supplied PL/SQL Packages and Types Reference for more information about the Change Data Capture publish and subscribe PL/SQL packages.

About Change Data Capture

Oftentimes, data warehousing involves the extraction and transportation of relational data from one or more source databases into the data warehouse for analysis. Change Data Capture quickly identifies and processes only the data that has changed, not entire tables, and makes the change data available for further use.

Without Change Data Capture, database extraction is a cumbersome process in which you move the entire contents of tables into flat files, and then load the files into the data warehouse. This ad hoc approach is expensive in a number of ways.

Change Data Capture does not depend on intermediate flat files to stage the data outside of the relational database. It captures the change data resulting from INSERT, UPDATE, and DELETE operations made to user tables. The change data is then stored in a database object called a change table, and the change data is made available to applications in a controlled way.

Table 15-1 describes the advantages of performing database extraction with Change Data Capture.

Table 15-1 Database Extraction With and Without Change Data Capture

Extraction
    With Change Data Capture: Database extraction from INSERT, UPDATE, and DELETE operations occurs immediately, at the same time the changes occur to the source tables.
    Without Change Data Capture: Database extraction is marginal at best for INSERT operations, and problematic for UPDATE and DELETE operations, because the data is no longer in the table.

Staging
    With Change Data Capture: Stages data directly to relational tables; there is no need to use flat files.
    Without Change Data Capture: The entire contents of tables are moved into flat files.

Interface
    With Change Data Capture: Provides an easy-to-use publish and subscribe interface using the DBMS_LOGMNR_CDC_PUBLISH and DBMS_LOGMNR_CDC_SUBSCRIBE packages.
    Without Change Data Capture: Error prone and manpower intensive to administer.

Cost
    With Change Data Capture: Supplied with the Oracle9i (and later) database server. Reduces overhead cost by simplifying the extraction of change data.
    Without Change Data Capture: Expensive because you must write and maintain the capture software yourself, or purchase it from a third-party vendor.

A Change Data Capture system is based on the interaction of a publisher and subscribers to capture and distribute change data, as described in the next section.

Publish and Subscribe Model

Most Change Data Capture systems have one publisher that captures and publishes change data for any number of Oracle source tables. There can be multiple subscribers accessing the change data. Change Data Capture provides PL/SQL packages to accomplish the publish and subscribe tasks.

Publisher

The publisher is usually a database administrator (DBA) who is in charge of creating and maintaining schema objects that make up the Change Data Capture system. The publisher performs these tasks:

• Determines the relational tables (called source tables) from which the data warehouse application is interested in capturing change data.
• Uses the Oracle supplied package, DBMS_LOGMNR_CDC_PUBLISH, to set up the system to capture data from one or more source tables.
• Publishes the change data in the form of change tables.
• Allows controlled access to subscribers by using the SQL GRANT and REVOKE statements to grant and revoke the SELECT privilege on change tables for users and roles.

Subscribers

The subscribers, usually applications, are consumers of the published change data. Subscribers subscribe to one or more sets of columns in source tables. Subscribers perform the following tasks:

• Use the Oracle supplied package, DBMS_LOGMNR_CDC_SUBSCRIBE, to subscribe to source tables for controlled access to the published change data for analysis.
• Extend the subscription window and create a new subscriber view when the subscriber is ready to receive a set of change data.
• Use SELECT statements to retrieve change data from the subscriber views.
• Drop the subscriber view and purge the subscription window when finished processing a block of changes.
• Drop the subscription when the subscriber no longer needs its change data.
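The subscriber tasks listed above might be sketched as the following SQL*Plus session. Only the package name DBMS_LOGMNR_CDC_SUBSCRIBE and the change set name SYNC_SET appear in this chapter; the individual procedure names and the argument order are recalled from the Oracle9i Supplied PL/SQL Packages and Types Reference and should be treated as assumptions to verify there. The source schema SH, the table SALES, and the column list are hypothetical.

```sql
-- Bind variables that receive the subscription handle and the generated view name.
VARIABLE subhandle NUMBER;
VARIABLE viewname  VARCHAR2(40);

-- Subscribe to selected columns of a source table (schema, table, columns are illustrative).
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.GET_SUBSCRIPTION_HANDLE('SYNC_SET', 'Change data for sales', :subhandle);
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.SUBSCRIBE(:subhandle, 'SH', 'SALES', 'PROD_ID, QUANTITY_SOLD');
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(:subhandle);

-- Extend the subscription window and create a subscriber view over it.
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.EXTEND_WINDOW(:subhandle);
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.PREPARE_SUBSCRIBER_VIEW(:subhandle, 'SH', 'SALES', :viewname);

-- The view name is generated; print it, then SELECT from that view to read the changes.
PRINT viewname

-- When the block of changes has been processed, drop the view and purge the window.
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.DROP_SUBSCRIBER_VIEW(:subhandle, 'SH', 'SALES');
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.PURGE_WINDOW(:subhandle);

-- When the change data is no longer needed at all, drop the subscription.
EXECUTE DBMS_LOGMNR_CDC_SUBSCRIBE.DROP_SUBSCRIPTION(:subhandle);
```

In practice the extend, select, drop, and purge steps would repeat in a loop for each new block of changes, with the subscription dropped only at the very end.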

Example of a Change Data Capture System

The Change Data Capture system captures the effects of DML statements, including INSERT, DELETE, and UPDATE, when they are performed on the source table. As these operations are performed, the change data is captured and published to corresponding change tables.

To capture change data, the publisher creates and administers change tables, which are special database tables that capture change data from a source table.

For example, for each source table for which you want to capture data, the publisher creates a corresponding change table. Change Data Capture ensures that none of the updates are missed or duplicated.

Each subscriber has its own view of the change data. This makes it possible for multiple subscribers to simultaneously subscribe to the same change table without interfering with one another.

Figure 15-1 shows the publish and subscribe model in a Change Data Capture system.

Figure 15-1 Publish and Subscribe Model in a Change Data Capture System


For example, assume that the change tables in Figure 15-1 contain all of the changes that occurred between Monday and Friday, and also assume that:

• Subscriber 1 is viewing and processing data from Tuesday.
• Subscriber 2 is viewing and processing data from Wednesday to Thursday.

Subscribers 1 and 2 each have a unique subscription window that contains a block of transactions. Change Data Capture manages the subscription window for each subscriber by creating a subscriber view that returns a range of transactions of interest to that subscriber. The subscriber accesses the change data by performing SELECT statements on the subscriber view that was generated by Change Data Capture.

When a subscriber needs to read additional change data, the subscriber makes procedure calls to extend the window and to create a new subscriber view. Each subscriber can walk through the data at its own pace, while Change Data Capture manages the data storage. As each subscriber finishes processing the data in its subscription window, it calls procedures to drop the subscriber view and purge the contents of the subscription window. Extending and purging windows is necessary to prevent the change table from growing indefinitely, and to prevent the subscriber from seeing the same data again.

Thus, Change Data Capture provides the following benefits for subscribers:

• Guarantees that each subscriber sees all of the changes, does not miss any changes, and does not see the same change data more than once.
• Keeps track of multiple subscribers and gives each subscriber shared access to change data.
• Handles all of the storage management, automatically removing data from change tables when it is no longer required by any of the subscribers.

Components and Terminology for Synchronous Change Data Capture

This section describes the Change Data Capture components shown in Figure 15-2. The publisher is responsible for all of the components shown in Figure 15-2, except for the subscriber views. The publisher creates and maintains all of the schema objects that make up the Change Data Capture system, and publishes change data so that subscribers can use it.

Subscribers are the consumers of change data and are granted controlled access to the change data by the publisher. Subscribers subscribe to one or more columns in source tables.

With synchronous data capture, the change data is generated as data manipulation language (DML) operations are made to the source table. Every time a DML operation occurs on a source table, a record of that operation is written to the change table.

Figure 15-2 Components in a Synchronous Change Data Capture System


The following subsections describe Change Data Capture components in more detail.

Source System

A source system is a production database that contains source tables for which Change Data Capture will capture changes.

Source Table

A source table is a database table that resides on the source system and contains the data you want to capture. Changes made to the source table are immediately reflected in the change table.

Change Source

A change source represents a source system. There is a system-generated change source named SYNC_SOURCE.

Change Set

A change set represents the collection of change tables. There is a system-generated change set named SYNC_SET.

Change Table


on the CREATE TABLE, ALTER TABLE, and DROP TABLE statements. In addition, rmcdc.sql removes all Java classes used by Change Data Capture. Note that after rmcdc.sql is called, CDC will no longer operate on the system. If the system administrator decides to remove the Java Virtual Machine from a database instance, rmcdc.sql must be called before rmjvm is called.

To re-install Change Data Capture, the SQL script initcdc.sql is provided in the admin directory. It creates the CDC system triggers and Java classes that are required by Change Data Capture.

Change Data Capture Restriction on Direct-Path INSERT

Change Data Capture does not support the direct-path INSERT statement (and, by association, the multi_table_insert statement) feature in parallel DML mode.

When you create a change table, Change Data Capture creates triggers on the source table. Because a direct-path INSERT disables all database triggers, any rows inserted into the source table using the SQL statement for direct-path INSERT in parallel DML mode will not be captured in the change table.

Similarly, Change Data Capture cannot capture the inserted rows from multitable insert operations because the SQL multi_table_insert statement in parallel DML mode uses direct-path INSERT. Also, note that the multitable insert operation does not return an error message to indicate that the triggers used by Change Data Capture did not fire.

See Also:

Oracle9i SQL Reference for more information regarding multitable inserts, direct-path INSERT, and triggers

Security

You grant privileges for a change table separately from the privileges you grant for a source table. For example, a subscriber that has privileges to perform a SELECT operation on a source table might not have privileges to perform a SELECT operation on a change table.

The publisher controls subscribers' access to change data by using the SQL GRANT and REVOKE statements to grant and revoke the SELECT privilege on change tables for users and roles. The publisher must grant the SELECT privilege before a user or application can subscribe to the change table.
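The access control described above comes down to ordinary object privileges. In the following sketch, the change table cdc_pub.sales_ct, the user dw_subscriber, and the role dw_analysts are hypothetical names chosen for illustration.

```sql
-- Publisher grants read access on a change table to a user and to a role
-- (all object, user, and role names are illustrative).
GRANT SELECT ON cdc_pub.sales_ct TO dw_subscriber;
GRANT SELECT ON cdc_pub.sales_ct TO dw_analysts;

-- Publisher later withdraws the user's access.
REVOKE SELECT ON cdc_pub.sales_ct FROM dw_subscriber;
```

Note that a SELECT privilege on the underlying source table is a separate grant and neither implies nor replaces the grant on the change table.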


16 Summary Advisor

This chapter illustrates how to use the Summary Advisor, a tool for choosing and understanding materialized views. The chapter contains:

• Overview of the Summary Advisor in the DBMS_OLAP Package
• Using the Summary Advisor
• Estimating Materialized View Size
• Is a Materialized View Being Used?
• Summary Advisor Wizard

Overview of the Summary Advisor in the DBMS_OLAP Package

Materialized views provide high performance for complex, data-intensive queries. The Summary Advisor helps you achieve this performance benefit by choosing the proper set of materialized views for a given workload. In general, as the number of materialized views and space allocated to materialized views is increased, query performance improves. But the additional materialized views have some cost: they consume additional storage space and must be refreshed, which increases maintenance time. The Summary Advisor considers these costs and makes the most cost-effective trade-offs when recommending the creation of new materialized views and evaluating the performance of existing materialized views.

To help you select from among the many possible materialized views in your schema, Oracle provides a collection of materialized view analysis and advisory functions and procedures in the DBMS_OLAP package. Collectively, these functions are called the Summary Advisor, and they are callable from any PL/SQL program. Figure 16-1 shows how the Summary Advisor recommends materialized views from a hypothetical or user-defined workload or one obtained from the SQL cache, or Oracle Trace. You can run the Summary Advisor from Oracle Enterprise Manager or by invoking the DBMS_OLAP package. You must have Java enabled to use the Summary Advisor.

All data and results generated by the Summary Advisor are stored in a set of tables referred to as the Summary Advisor repository. These tables are owned by SYSTEM and start with MVIEW$_ADV_*. Only DBAs can access these tables directly, but other users can access


CREATE_ID and the unique number is known subsequently as a run ID, workload ID or filter ID depending on the procedure it is given.

The identifier is used to store the Advisor artifacts in the repository. Each activity in the Advisor requires a unique identifier to distinguish it from other objects. For example, when you add a filter item, you associate the item with a filter ID. When you load a workload, the data gets stored using the unique workload ID. In addition, when you run RECOMMEND_MVIEW_STRATEGY or EVALUATE_MVIEW_STRATEGY, a unique ID is associated with the run.

Because the ID is just a unique number, Oracle uses the same CREATE_ID function to acquire the value. It is only when a specific operation is performed (such as a load workload) that the ID is identified as a workload ID.

You can use the Summary Advisor with or without a workload, but better results are achieved if a workload is provided. This can be supplied by:

• The user
• Oracle Trace
• The current SQL cache contents

Once the workload is loaded into the Advisor workload repository or at the time the materialized view recommendations are generated, a filter can be applied to the workload to restrict what is analyzed. This provides the ability to generate different sets of recommendations based on different workload scenarios.

These filters are created using the procedure ADD_FILTER_ITEM. You can create any number of filters, and use more than one at a time to filter a workload. See "Using Filters with the Summary Advisor" for further details.

The Summary Advisor uses four types of schema objects, some of which are defined in the user's schema and some of which are in the system schema:

• User schema

    For both V-tables and workload tables, the workload must be loaded into the advisor workload repository before it is available to the recommendation process.

    V-tables

    V-tables are generated by Oracle Trace for storing results of formatting server-collected trace. Please note that these V-tables are different from the V$ tables.

    Workload tables

    Workload tables are user tables that store workload information, and can reside in any schema.

• System schema

    Result tables

    Result tables are internal tables that store both intermediate and final results from all Summary Advisor components.

    Read-only views

    Read-only views allow you to access recommendations, filters and workloads. These views are MVIEW_RECOMMENDATIONS, MVIEW_EVALUATIONS, MVIEW_FILTER, and MVIEW_WORKLOAD.

    Whenever the Summary Advisor is run, the results, with the exception of estimated size, are placed in internal tables, which can be accessed from read-only views in the database. These results can be queried, so you do not have to keep running the Advisor process.

If you want to view the results of the last materialized view recommendation, you can issue the following statement:

    SELECT MVIEW_OWNER, MVIEW_NAME, RECOMMENDED_ACTION, PCT_PERFORMANCE_GAIN,
           BENEFIT_TO_COST_RATIO
    FROM SYSTEM.MVIEW_RECOMMENDATIONS
    WHERE RUNID = (SELECT MAX(RUNID) FROM SYSTEM.MVIEW_RECOMMENDATIONS)
    ORDER BY RECOMMENDATION_NUMBER ASC;

The advisory functions and procedures of the DBMS_OLAP package require you to gather structural statistics about fact and dimension table cardinalities, and the distinct cardinalities of every dimension level column, JOIN KEY column, and fact table key column. You do this by loading your data warehouse, then gathering either exact or estimated statistics with the DBMS_STATS package or the ANALYZE TABLE statement. Because gathering statistics is time-consuming and extreme statistical accuracy is not required, it is generally preferable to estimate statistics.
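Estimated statistics can be gathered in either of the two ways the paragraph above mentions. The following is a minimal sketch; the schema SH, the table SALES, and the 10 percent sample size are assumptions chosen for illustration, not recommendations from this guide.

```sql
-- Option 1 (SQL*Plus): estimate statistics from a sample with DBMS_STATS.
-- Schema, table, and sample size are illustrative.
EXECUTE DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SH', tabname => 'SALES', estimate_percent => 10);

-- Option 2: the ANALYZE TABLE statement with an estimated sample.
ANALYZE TABLE sh.sales ESTIMATE STATISTICS SAMPLE 10 PERCENT;
```

The same call would be repeated for each fact and dimension table that the workload touches.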

Using information from the system workload table, schema metadata and statistical information generated by the DBMS_STATS package, the Advisor engine generates summary recommendations and summary usage evaluations and stores the results in result tables.

To use the Summary Advisor with a workload, some or all of the following steps must be followed:

1. Optionally obtain an identifier number as a filter ID and define one or more filter items.
2. Obtain an identifier number as a workload ID and load a workload. If a filter was defined in step 1, then it can be used during the operation to refine the SQL statements as they are collected from the workload source. Load the workload.
3. Call the procedure RECOMMEND_MVIEW_STRATEGY to generate the recommendations.
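The steps above can be sketched as one SQL*Plus session. This is an illustration under assumptions: the workload table SH.MY_WORKLOAD is hypothetical, no filter is used (step 1 is skipped), and the argument list shown for RECOMMEND_MVIEW_STRATEGY (run ID, workload ID, filter ID, storage in bytes, retention percentage, retention list, fact table filter) is recalled from the package reference rather than stated in this section, so verify it before use.

```sql
VARIABLE workload_id NUMBER;
VARIABLE run_id      NUMBER;

-- Step 2: obtain a workload ID and load a user-defined workload without a filter.
EXECUTE DBMS_OLAP.CREATE_ID(:workload_id);
EXECUTE DBMS_OLAP.LOAD_WORKLOAD_USER(:workload_id, DBMS_OLAP.WORKLOAD_NEW, DBMS_OLAP.FILTER_NONE, 'SH', 'MY_WORKLOAD');

-- Step 3: obtain a run ID and generate recommendations.
-- Assumed arguments: no filter, about 1 MB of storage, retain 100 percent of existing views.
EXECUTE DBMS_OLAP.CREATE_ID(:run_id);
EXECUTE DBMS_OLAP.RECOMMEND_MVIEW_STRATEGY(:run_id, :workload_id, NULL, 1000000, 100, NULL, NULL);

-- Inspect the recommendations produced by this run.
SELECT mview_owner, mview_name, recommended_action
FROM SYSTEM.MVIEW_RECOMMENDATIONS
WHERE runid = :run_id
ORDER BY recommendation_number;
```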

These steps can be repeated several times with different workloads to see the effect on the materialized views.

Using the Summary Advisor

The following sections will help you use the Advisor:

• Identifier Numbers
• Workload Management
• Loading a User-Defined Workload
• Loading a Trace Workload
• Loading a SQL Cache Workload
• Validating a Workload
• Removing a Workload
• Using Filters with the Summary Advisor
• Removing a Filter
• Recommending Materialized Views
• Summary Data Report
• When Recommendations are No Longer Required
• Stopping the Recommendation Process
• Summary Advisor Sample Sessions
• Summary Advisor and Missing Statistics
• Summary Advisor Privileges and ORA-30446

Identifier Numbers

Most of the DBMS_OLAP procedures require a unique identifier as one of their parameters. You obtain this by calling the procedure CREATE_ID, which is illustrated in the following section.

DBMS_OLAP.CREATE_ID Procedure

Table 16-1 DBMS_OLAP.CREATE_ID Procedure Parameters

Parameter   Datatype   Description
id          NUMBER     The unique identifier that can be used to create a filter, load a workload, or create an analysis

With a SQL utility such as SQL*Plus, do the following:

1. Declare an output variable to receive the new identifier.

    VARIABLE MY_ID NUMBER;

2. Call the CREATE_ID procedure to generate a new identifier.

    EXECUTE DBMS_OLAP.CREATE_ID(:MY_ID);
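Because the DBMS_OLAP routines are callable from any PL/SQL program, the same identifier can also be obtained inside an anonymous block rather than through a SQL*Plus bind variable. The following is a minimal sketch; the local variable name my_id is arbitrary.

```sql
SET SERVEROUTPUT ON

DECLARE
  my_id NUMBER;  -- receives the new identifier
BEGIN
  -- CREATE_ID returns the identifier through its single parameter.
  DBMS_OLAP.CREATE_ID(my_id);
  DBMS_OUTPUT.PUT_LINE('New identifier: ' || my_id);
END;
/
```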

Workload Management

The Advisor performs best when a workload based on usage is available. The Advisor Workload Repository is capable of storing multiple workloads, so that the different uses of a real-world data warehousing environment can be viewed over a long period of time and across the life cycle of database instance startup and shutdown.

To facilitate wider use of the Summary Advisor, three types of workload are supported:

• Current contents of the SQL cache
• Oracle Trace collection
• User-specified workload

When the workload is loaded using the appropriate load_workload procedure, it is stored in a new workload repository in the SYSTEM schema called MVIEW_WORKLOAD whose format is shown in Table 16-2. A specific workload can be removed by calling the PURGE_WORKLOAD routine and passing it a valid workload ID. To remove all workloads for the current user, call PURGE_WORKLOAD and pass the constant value DBMS_OLAP.WORKLOAD_ALL.
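The two ways of purging described above can be sketched as follows in SQL*Plus. The bind variable :MY_ID is assumed to hold a workload ID that was previously obtained from CREATE_ID and used to load a workload.

```sql
-- Remove one specific workload, identified by its workload ID.
EXECUTE DBMS_OLAP.PURGE_WORKLOAD(:MY_ID);

-- Remove all workloads belonging to the current user.
EXECUTE DBMS_OLAP.PURGE_WORKLOAD(DBMS_OLAP.WORKLOAD_ALL);
```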

Table 16-2 MVIEW_WORKLOAD Table

Column         Datatype       Description
APPLICATION    VARCHAR2(30)   Optional application name for the query
CARDINALITY    NUMBER         Total cardinality of all of tables in query
WORKLOADID     NUMBER         Workload id identifying a unique sampling
FREQUENCY      NUMBER         Number of times query executed
IMPORT_TIME    DATE           Date at which item was collected
LASTUSE        DATE           Last date of execution
OWNER          VARCHAR2(30)   User who last executed query
PRIORITY       NUMBER         User-supplied ranking of query
QUERY          LONG           Query text
QUERYID        NUMBER         Id number identifying a unique query
RESPONSETIME   NUMBER         Execution time in seconds
RESULTSIZE     NUMBER         Total bytes selected by the query

Once the workload has been collected using the appropriate LOAD_WORKLOAD routine, a filter mechanism may also be applied; this lets you specify the portion of the workload that is to be loaded into the repository. You can also use the same filter mechanism to restrict workload-based summary recommendation and evaluation to a subset of the queries contained in the workload repository. Once the workload has been loaded, the Summary Advisor is run by calling the procedure RECOMMEND_MVIEW_STRATEGY. A major benefit of this approach is that it is easy to model different workloads by simply modifying the frequency column, removing some SQL queries, or adding new queries.

Summary Advisor can retrieve workload information from the SQL cache as well as Oracle Trace. If the collected data was retrieved from a server with the instance parameter cursor_sharing set to SIMILAR or FORCE, then user queries with embedded literal values will be converted to a statement that contains system-generated bind variables.

Note:

Oracle Trace will be deprecated in a future release.

In Oracle9i, it is not possible to retrieve the bind-variable data in order to reconstruct the statement in the form originally submitted by the user. This will, in turn, cause Summary Advisor to not consider the query for rewrite and potentially miss a critical statement in the user's workload. As a work-around, if the Advisor will be used to recommend materialized views, then the server should set the instance parameter CURSOR_SHARING to EXACT.
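The work-around can be applied by a suitably privileged user, for example as in the following SQL*Plus sketch. Whether to change the setting for the whole instance, and for how long, is a site decision that this guide does not prescribe.

```sql
-- Check the current setting (SQL*Plus command).
SHOW PARAMETER cursor_sharing

-- Keep literals in the SQL text so the Advisor sees statements as submitted.
ALTER SYSTEM SET CURSOR_SHARING = EXACT;
```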

Loading a User-Defined Workload

A user-defined workload is loaded using the procedure LOAD_WORKLOAD_USER. The workload_id is obtained by calling the procedure CREATE_ID. The value of the flags parameter determines whether the workload is considered to be new, should be used to overwrite an existing workload, or should be appended to an existing workload. The optional filter_id can be supplied to specify the filter that is to be used against this workload, where the filter would have been defined using the ADD_FILTER_ITEM procedure.

DBMS_OLAP.LOAD_WORKLOAD_USER Procedure

Table 16-3 DBMS_OLAP.LOAD_WORKLOAD_USER Procedure Parameters

Parameter Datatype Description

workload_id NUMBER

The required workload id that was returned by the create_id call

flags NUMBER

Can take one of the following values:

    DBMS_OLAP.WORKLOAD_OVERWRITE

    The load routine will explicitly remove any existing queries from the workload that are owned by the specified collection ID

    DBMS_OLAP.WORKLOAD_APPEND

    The load routine preserves any existing queries in the workload. Any queries collected by the load operation will be appended to the end of the specified workload

    DBMS_OLAP.WORKLOAD_NEW

    The load routine assumes there are no existing queries in the workload. If it finds an existing workload element, the call will fail with an error

    Note: the flags have the same behavior irrespective of the LOAD_WORKLOAD operation

filter_id NUMBER

Specify filter for the workload to be loaded

owner_name VARCHAR2

The schema that contains the user supplied table or view

table_name VARCHAR2

The table or view name containing valid workload data

The actual workload is defined in a separate table and the two parameters owner_name and table_name describe where it is stored. There is no restriction on which schema the workload resides in, the name for the table, or how many of these user-defined tables exist. The only restriction is that the format of the user table must correspond to the USER_WORKLOAD table, as described in Table 16-4:

Table 16-4 USER_WORKLOAD

Column         Datatype                                                            Optional/Required   Description
QUERY          Any VARCHAR or LONG type (all character types are supported)        Required            SQL statement
OWNER          VARCHAR2(30)                                                        Required            User who last executed query
APPLICATION    VARCHAR2(30)                                                        Optional            Application name for the query
FREQUENCY      NUMBER                                                              Optional            Number of times query executed
LASTUSE        DATE                                                                Optional            Last date of execution
PRIORITY       NUMBER                                                              Optional            User-supplied ranking of query
RESPONSETIME   NUMBER                                                              Optional            Execution time in seconds
RESULTSIZE     NUMBER                                                              Optional            Total bytes selected by the query
SQL_ADDR       NUMBER                                                              Optional            Cache address
SQL_HASH       NUMBER                                                              Optional            Cache hash value
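A user workload table that follows the USER_WORKLOAD format might be created as in this sketch. The table name my_workload, the choice of VARCHAR2(4000) for QUERY, and the decision to leave out the optional SQL_ADDR and SQL_HASH columns are assumptions made for illustration.

```sql
-- Hypothetical workload table matching the required and most optional
-- columns of the USER_WORKLOAD format.
CREATE TABLE my_workload
( query         VARCHAR2(4000) NOT NULL  -- required: SQL statement text
, owner         VARCHAR2(30)   NOT NULL  -- required: user who last executed the query
, application   VARCHAR2(30)             -- optional: application name
, frequency     NUMBER                   -- optional: number of executions
, lastuse       DATE                     -- optional: last date of execution
, priority      NUMBER                   -- optional: user-supplied ranking
, responsetime  NUMBER                   -- optional: execution time in seconds
, resultsize    NUMBER                   -- optional: total bytes selected
);
```

Different workload scenarios can then be modeled by updating the frequency or priority values, or by adding and deleting rows, before the table is loaded.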

The following is an example of loading a user workload.

1. Declare an output variable to receive the new identifier.

    VARIABLE MY_ID NUMBER;

2. Call the CREATE_ID procedure to generate a new identifier.

    EXECUTE DBMS_OLAP.CREATE_ID(:MY_ID);

3. Insert into the MY_WORKLOAD tables the queries you want advice on.

    INSERT INTO advisor_user_workload VALUES
    (
      'SELECT SUM(s.quantity_sold)
       FROM sales s, products p
       WHERE s.prod_id = p.prod_id AND p.prod_category = ''Boys''
       GROUP BY p.prod_category', 'SH', 'app1', 10, NULL, 5, NULL,
      NULL)

4. Load the workload from a target table or view.

    EXECUTE DBMS_OLAP.LOAD_WORKLOAD_USER(:MY_ID, DBMS_OLAP.WORKLOAD_NEW, DBMS_OLAP.FILTER_NONE, 'SH', 'MY_WORKLOAD');

Loading a Trace Workload

Alternatively, you can collect a Trace workload from Oracle Enterprise Manager to gather dynamic information about your query workload, which can be used by an advisory function. If Oracle Trace is available, consider using it to collect materialized view usage. Doing so enables you to see which materialized views are in use. It also lets the Advisor detect any unusual query requests from users that would result in recommending some different materialized views.

A workload collected by Oracle Trace is loaded using the procedure LOAD_WORKLOAD_TRACE. You obtain workload_id by calling the procedure CREATE_ID. The value of the flags parameter will determine whether the workload is considered new, should be used to overwrite an existing workload or should be appended to an existing workload. The optional filter ID can be supplied to specify the filter that is to be used against this workload. In addition, you can specify an application name to describe this workload and give every query a default priority. The application name is simply a tag that enables you to classify the workload query. The name can later be used to filter the workload during a RECOMMEND_MVIEW_STRATEGY or EVALUATE_MVIEW_STRATEGY operation.

The priority is an important piece of information. It tells the Advisor how important the query is to the business. When recommendations are formed, the priority will determine its value and will cause the Advisor to make decisions that favor higher ranking queries.

If the owner_name parameter is not defined, then the procedure will expect to find the formatted trace tables in the schema for the current user.
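A call that loads a trace workload might look like the following SQL*Plus sketch. The argument order (workload ID, flags, filter ID, application name, priority, owner name) follows the order in which the text above introduces the values and should be checked against the procedure's parameter table; the application tag 'myapp', the priority 20, and the schema SH are illustrative.

```sql
VARIABLE MY_ID NUMBER;

-- Obtain a workload ID, then load the formatted trace data found in schema SH.
-- No filter is applied; every query is tagged 'myapp' with default priority 20.
EXECUTE DBMS_OLAP.CREATE_ID(:MY_ID);
EXECUTE DBMS_OLAP.LOAD_WORKLOAD_TRACE(:MY_ID, DBMS_OLAP.WORKLOAD_NEW, DBMS_OLAP.FILTER_NONE, 'myapp', 20, 'SH');
```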

DBMS_OLAP.LOAD_WORKLOAD_TRACE Procedure

Table 16-5 DBMS_OLAP.LOAD_WORKLOAD_TRACE Procedure Parameters

Parameter Datatype Description

workload_id NUMBER

The required id that was returned by the CREATE_ID call

flags NUMBER

Can take one of the following values:

    DBMS_OLAP.WORKLOAD_OVERWRITE

    The load routine will explicitly remove any existing queries from the workload that are owned by the specified collection ID

    DBMS_OLAP.WORKLOAD_APPEND

    The load routine preserves any existing queries in the workload. Any queries collected by the load operation will be appended to the end of the specified workload

    DBMS_OLAP.WORKLOAD_NEW

    The load routine assumes there are no existing queries in the workload. If it finds an existing workload element, the call will


• ORACLE_TRACE_COLLECTION_PATH = location of collection files
• ORACLE_TRACE_COLLECTION_SIZE = 0
• ORACLE_TRACE_ENABLE = TRUE
• ORACLE_TRACE_FACILITY_NAME = oraclesm or oraclee
• ORACLE_TRACE_FACILITY_PATH = location of trace facility files

See Also:

Oracle9i Database Performance Tuning Guide and Reference for further information regarding these parameters

2. Run the Oracle Trace Manager, specify a collection name, and select the SUMMARY_EVENT set. Oracle Trace Manager reads information from the associated configuration file and registers events to be logged with Oracle. While collection is enabled, the workload information defined in the event set gets written to a flat log file.

3. When collection is complete, Oracle Trace automatically formats the Oracle Trace log file into a set of relations, which have the predefined synonyms beginning with V_192216243_. Alternatively, the collection file, which usually has an extension of .CDF, can be formatted manually using the otrcfmt utility, as shown in this example:

otrcfmt collection_name.cdf user/password@database

The trace data can be formatted in any schema. The LOAD_WORKLOAD_TRACE call lets you specify the location of the data.

4. Run the GATHER_TABLE_STATS procedure of the DBMS_STATS package or ANALYZE ... ESTIMATE STATISTICS to collect cardinality statistics on all fact tables, dimension tables, and key columns (any column that appears in a dimension LEVEL clause or JOIN clause of a CREATE DIMENSION statement).

5. Run the CREATE_ID


Part V Warehouse Performance

This section deals with ways to improve your data warehouse's performance, and contains the following chapters:

• Schema Modeling Techniques
• SQL for Aggregation in Data Warehouses
• SQL for Analysis in Data Warehouses
• OLAP and Data Mining
• Using Parallel Execution
• Query Rewrite

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


17 Schema Modeling Techniques

The following topics provide information about schemas in a data warehouse:

• Schemas in Data Warehouses
• Third Normal Form
• Star Schemas
• Optimizing Star Queries


Schemas in Data Warehouses

A schema is a collection of database objects, including tables, views, indexes, and synonyms.

There is a variety of ways of arranging schema objects in the schema models designed for data warehousing. One data warehouse schema model is a star schema. The Sales History sample schema (the basis for most of the examples in this book) uses a star schema. However, there are other schema models that are commonly used for data warehouses. The most prevalent of these schema models is the third normal form (3NF) schema. Additionally, some data warehouse schemas are neither star schemas nor 3NF schemas, but instead share characteristics of both schemas; these are referred to as hybrid schema models.

The Oracle9i database is designed to support all data warehouse schemas. Some features may be specific to one schema model (such as the star transformation feature, described in "Using Star Transformation", which is specific to star schemas). However, the vast majority of Oracle's data warehousing features are equally applicable to star schemas, 3NF schemas, and hybrid schemas. Key data warehousing capabilities such as partitioning (including the rolling window load technique), parallelism, materialized views, and analytic SQL are implemented in all schema models.

The determination of which schema model should be used for a data warehouse should be based upon the requirements and preferences of the data warehouse project team. Comparing the merits of the alternative schema models is outside of the scope of this book; instead, this chapter will briefly introduce each schema model and suggest how Oracle can be optimized for those environments.

Third Normal Form

Although this guide primarily uses star schemas in its examples, you can also use the third normal form for your data warehouse implementation.

Third normal form modeling is a classical relational-database modeling technique that minimizes data redundancy through normalization. When compared to a star schema, a 3NF schema typically has a larger number of tables due to this normalization process. For example, in Figure 17-1, the orders and order items tables contain information similar to that of the sales table in the star schema in Figure 17-2.

3NF schemas are typically chosen for large data warehouses, especially environments with significant data-loading requirements that are used to feed data marts and execute long-running queries.

The main advantages of 3NF schemas are that they:


• Provide a neutral schema design, independent of any application or data-usage considerations
• May require less data-transformation than denormalized schemas such as star schemas

Figure 17-1 presents a graphical representation of a third normal form schema.

Figure 17-1 Third Normal Form Schema

Optimizing Third Normal Form Queries

Queries on 3NF schemas are often very complex and involve a large number of tables. The performance of joins between large tables is thus a primary consideration when using 3NF schemas.

One particularly important feature for 3NF schemas is partition-wise joins. The largest tables in a 3NF schema should be partitioned to enable partition-wise joins. The most common partitioning technique in these environments is composite range-hash partitioning for the largest tables, with the most-common join key chosen as the hash-partitioning key.
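The idea behind a partition-wise join can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Oracle's implementation: both tables are assumed to be hash-partitioned on the join key with the same number of partitions, and the table contents, NUM_PARTITIONS, and the function names are hypothetical. Because matching keys always land in the same partition number, each pair of partitions can be joined on its own, which is what makes the work easy to run in parallel.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count shared by both tables


def hash_partition(rows, key, n=NUM_PARTITIONS):
    """Split rows into n buckets by hashing the join key."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)  # integer keys: modulo stands in for the hash
    return parts


def partition_wise_join(left, right, key):
    """Join matching partition pairs independently (each pair could run in parallel)."""
    result = []
    for lpart, rpart in zip(hash_partition(left, key), hash_partition(right, key)):
        index = defaultdict(list)
        for r in rpart:
            index[r[key]].append(r)
        for l in lpart:
            for r in index[l[key]]:
                result.append({**l, **r})
    return result


orders = [{"order_id": 1, "cust": "a"}, {"order_id": 2, "cust": "b"}, {"order_id": 5, "cust": "c"}]
order_items = [
    {"order_id": 1, "item": "x"},
    {"order_id": 1, "item": "y"},
    {"order_id": 5, "item": "z"},
]
joined = partition_wise_join(orders, order_items, "order_id")
print(sorted((r["order_id"], r["item"]) for r in joined))
```

Choosing the most common join key as the hash-partitioning key, as recommended above, is what guarantees that the rows to be joined meet in the same partition pair.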

Parallelism is often heavily utilized in 3NF environments, and parallelism should typically be enabled in these environments.

Star Schemas

The star schema is perhaps the simplest data warehouse schema. It is called a star schema because the entity-relationship diagram of this schema resembles a star, with points radiating from a central table. The center of the star consists of a large fact table and the points of the star are the dimension tables.

A star schema is characterized by one or more very large fact tables that contain the primary information in the data warehouse, and a number of much smaller dimension tables (or lookup tables), each of which contains information about the entries for a particular attribute in the fact table.

A star query is a join between a fact table and a number of dimension tables. Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other. The cost-based optimizer recognizes star queries and generates efficient execution plans for them.

A typical fact table contains keys and measures. For example, in the sh sample schema, the fact table, sales, contains the measures quantity_sold, amount, and cost, and the keys cust_id, time_id, prod_id, channel_id, and promo_id. The dimension tables are customers, times, products, channels, and promotions. The product dimension table, for example, contains information about each product number that appears in the fact table.

A star join is a primary key to foreign key join of the dimension tables to a fact table.

The main advantages of star schemas are that they:

• Provide a direct and intuitive mapping between the business entities being analyzed by end users and the schema design.
• Provide highly optimized performance for typical star queries.
• Are widely supported by a large number of business intelligence tools, which may anticipate or even require that the data-warehouse schema contain dimension tables.

Star schemas are used for both simple data marts and very large data warehouses.

Figure 17-2 presents a graphical representation of a star schema.

Figure 17-2 Star Schema


Snowflake Schemas

The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake.

Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data has been grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product_category table, and a product_manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance. Figure 17-3 presents a graphical representation of a snowflake schema.

Figure 17-3 Snowflake Schema

Note:

Oracle Corporation recommends you choose a star schema over a snowflake schema unless you have a clear reason not to.

Optimizing Star Queries

You should consider the following when using star queries:

• Tuning Star Queries
• Using Star Transformation

Tuning Star Queries


To get the best possible performance for star queries, it is important to follow some basic guidelines:

• A bitmap index should be built on each of the foreign key columns of the fact table or tables.
• The initialization parameter STAR_TRANSFORMATION_ENABLED should be set to true. This enables an important optimizer feature for star queries. It is set to false by default for backward compatibility.
• The cost-based optimizer should be used. This does not apply solely to star schemas: all data warehouses should always use the cost-based optimizer.

When a data warehouse satisfies these conditions, the majority of the star queries running in the data warehouse will use a query execution strategy known as the star transformation. The star transformation provides very efficient query performance for star queries.

Using Star Transformation

The star transformation is a powerful optimization technique that relies upon implicitly rewriting (or transforming) the SQL of the original star query. The end user never needs to know any of the details about the star transformation. Oracle's cost-based optimizer automatically chooses the star transformation where appropriate.

The star transformation is a cost-based query transformation aimed at executing star queries efficiently. Oracle processes a star query using two basic phases. The first phase retrieves exactly the necessary rows from the fact table (the result set). Because this retrieval utilizes bitmap indexes, it is very efficient. The second phase joins this result set to the dimension tables. An example of an end user query is: "What were the sales and profits for the grocery department of stores in the west and southwest sales districts over the last three quarters?" This is a simple star query.

Note:

Bitmap indexes are available only if you have purchased the Oracle9i Enterprise Edition. In Oracle9i Standard Edition, bitmap indexes and star transformation are not available.

Star Transformation with a Bitmap Index

A prerequisite of the star transformation is that there be a single-column bitmap index on every join column of the fact table. These join columns include all foreign key columns.

For example, the sales table of the sh sample schema has bitmap indexes on the time_id, channel_id, cust_id, prod_id, and promo_id columns.


Consider the following star query:

SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc,
   SUM(s.amount_sold) sales_amount
FROM sales s, times t, customers c, channels ch
WHERE s.time_id = t.time_id
AND   s.cust_id = c.cust_id
AND   s.channel_id = ch.channel_id
AND   c.cust_state_province = 'CA'
AND   ch.channel_desc IN ('Internet','Catalog')
AND   t.calendar_quarter_desc IN ('1999-Q1','1999-Q2')
GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc;

Oracle processes this query in two phases. In the first phase, Oracle uses the bitmap indexes on the foreign key columns of the fact table to identify and retrieve only the necessary rows from the fact table. That is, Oracle will retrieve the result set from the fact table using essentially the following query:

SELECT ... FROM sales
WHERE time_id IN
   (SELECT time_id FROM times
    WHERE calendar_quarter_desc IN ('1999-Q1','1999-Q2'))
AND cust_id IN
   (SELECT cust_id FROM customers WHERE cust_state_province = 'CA')
AND channel_id IN
   (SELECT channel_id FROM channels WHERE channel_desc IN ('Internet','Catalog'));

This is the transformation step of the algorithm, because the original star query has been transformed into this subquery representation. This method of accessing the fact table leverages the strengths of Oracle's bitmap indexes. Intuitively, bitmap indexes provide a set-based processing scheme within a relational database. Oracle has implemented very fast methods for doing set operations such as AND (an intersection in standard set-based terminology), OR (a set-based union), MINUS, and COUNT.

In this star query, a bitmap index on time_id is used to identify the set of all rows in the fact table corresponding to sales in 1999-Q1. This set is represented as a bitmap (a string of 1's and 0's that indicates which rows of the fact table are members of the set).

A similar bitmap is retrieved for the fact table rows corresponding to sales from 1999-Q2. The bitmap OR operation is used to combine this set of Q1 sales with the set of Q2 sales.

Additional set operations will be done for the customer dimension and the channel dimension. At this point in the star query processing, there are three bitmaps. Each bitmap corresponds to a separate dimension table, and each bitmap represents the set of rows of the fact table that satisfy that individual dimension's constraints.


These three bitmaps are combined into a single bitmap using the bitmap AND operation.

This final bitmap represents the set of rows in the fact table that satisfy all of the constraints on the dimension tables. This is the result set, the exact set of rows from the fact table needed to evaluate the query. Note that none of the actual data in the fact table has been accessed. All of these operations rely solely on the bitmap indexes and the dimension tables. Because of the bitmap indexes' compressed data representations, the bitmap set-based operations are extremely efficient.
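The bitmap arithmetic of this first phase can be sketched in Python. This is an illustration only, not Oracle's implementation: the eight-row fact table, its column values, and the helper name bitmap_for are made up for the example, and each bitmap is held in a Python integer in which bit i stands for fact row i.

```python
# Hypothetical 8-row fact table: (quarter, state, channel) per row.
fact_rows = [
    ("1999-Q1", "CA", "Internet"),
    ("1999-Q2", "CA", "Catalog"),
    ("1999-Q3", "CA", "Internet"),
    ("1999-Q1", "NY", "Internet"),
    ("1999-Q2", "CA", "Direct"),
    ("1999-Q1", "CA", "Catalog"),
    ("1999-Q2", "TX", "Catalog"),
    ("1999-Q2", "CA", "Internet"),
]


def bitmap_for(column, value):
    """Build a bitmap (as an int) with bit i set when row i has the given value."""
    bits = 0
    for i, row in enumerate(fact_rows):
        if row[column] == value:
            bits |= 1 << i
    return bits


# One bitmap per dimension: OR combines the values allowed within a dimension.
time_bm = bitmap_for(0, "1999-Q1") | bitmap_for(0, "1999-Q2")
cust_bm = bitmap_for(1, "CA")
chan_bm = bitmap_for(2, "Internet") | bitmap_for(2, "Catalog")

# AND intersects the three dimension bitmaps into the final result set.
result_bm = time_bm & cust_bm & chan_bm
matching_rows = [i for i in range(len(fact_rows)) if (result_bm >> i) & 1]
print(matching_rows)
```

Only the row positions in matching_rows would then need to be fetched from the fact table; everything before that point works on the bitmaps alone.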

Once the result set is identified, the bitmap is used to access the actual data from the sales table. Only those rows that are required for the end user's query are retrieved from the fact table. At this point, Oracle has effectively joined all of the dimension tables to the fact table using bitmap indexes. This technique provides excellent performance because Oracle is joining all of the dimension tables to the fact table with one logical join operation, rather than joining each dimension table to the fact table independently.

The second phase of this query is to join these rows from the fact table (the result set) to the dimension tables. Oracle will use the most efficient method for accessing and joining the dimension tables. Many dimensions are very small, and table scans are typically the most efficient access method for these dimension tables. For large dimension tables, table scans may not be the most efficient access method. In the previous example, a bitmap index on product.department can be used to quickly identify all of those products in the grocery department. Oracle's cost-based optimizer automatically determines which access method is most appropriate for a given dimension table, based upon the cost-based optimizer's knowledge about the sizes and data distributions of each dimension table.

The specific join method (as well as indexing method) for each dimension table will likewise be intelligently determined by the cost-based optimizer. A hash join is often the most efficient algorithm for joining the dimension tables. The final answer is returned to the user once all of the dimension tables have been joined. The query technique of retrieving only the matching rows from one table and then joining to another table is commonly known as a semi-join.

Execution Plan for a Star Transformation with a Bitmap Index

The following typical execution plan might result from "Star Transformation with a Bitmap Index":

SELECT STATEMENT
 SORT GROUP BY
  HASH JOIN
   TABLE ACCESS FULL                          CHANNELS
   HASH JOIN
    TABLE ACCESS FULL                         CUSTOMERS
    HASH JOIN
     TABLE ACCESS FULL                        TIMES
     PARTITION RANGE ITERATOR
      TABLE ACCESS BY LOCAL INDEX ROWID       SALES
       BITMAP CONVERSION TO ROWIDS
        BITMAP AND
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 CUSTOMERS
           BITMAP INDEX RANGE SCAN            SALES_CUST_BIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 CHANNELS
           BITMAP INDEX RANGE SCAN            SALES_CHANNEL_BIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 TIMES
           BITMAP INDEX RANGE SCAN            SALES_TIME_BIX

In this plan, the fact table is accessed through a bitmap access path based on a bitmap AND of three merged bitmaps. The three bitmaps are generated by the BITMAP MERGE row source being fed bitmaps from row source trees underneath it. Each such row source tree consists of a BITMAP KEY ITERATION row source which fetches values from the subquery row source tree, which in this example is a full table access. For each such value, the BITMAP KEY ITERATION row source retrieves the bitmap from the bitmap index. After the relevant fact table rows have been retrieved using this access path, they are joined with the dimension tables and temporary tables to produce the answer to the query.

Star Transformation with a Bitmap Join Index

In addition to bitmap indexes, you can use a bitmap join index during star transformations. Assume you have the following additional index structure:

CREATE BITMAP INDEX sales_c_state_bjix
ON sales(customers.cust_state_province)
FROM sales, customers
WHERE sales.cust_id = customers.cust_id
LOCAL NOLOGGING COMPUTE STATISTICS;

The processing of the same star query using the bitmap join index is similar to the previous example. The only difference is that Oracle will utilize the join index, instead of a single-table bitmap index, to access the customer data in the first phase of the star query.

Execution Plan for a Star Transformation with a Bitmap Join Index

The following typical execution plan might result from "Star Transformation with a Bitmap Join Index":

SELECT STATEMENT
 SORT GROUP BY
  HASH JOIN
   TABLE ACCESS FULL                          CHANNELS
   HASH JOIN
    TABLE ACCESS FULL                         CUSTOMERS
    HASH JOIN
     TABLE ACCESS FULL                        TIMES
     PARTITION RANGE ALL
      TABLE ACCESS BY LOCAL INDEX ROWID       SALES
       BITMAP CONVERSION TO ROWIDS
        BITMAP AND
         BITMAP INDEX SINGLE VALUE            SALES_C_STATE_BJIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 CHANNELS
           BITMAP INDEX RANGE SCAN            SALES_CHANNEL_BIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 TIMES
           BITMAP INDEX RANGE SCAN            SALES_TIME_BIX

The difference between this plan as compared to the previous one is that the inner part of the bitmap index scan for the customer dimension has no subselect. This is because the join predicate information on customer.cust_state_province can be satisfied with the bitmap join index sales_c_state_bjix.

How Oracle Chooses to Use Star Transformation

The star transformation is a cost-based transformation in the following sense. The optimizer generates and saves the best plan it can produce without the transformation. If the transformation is enabled, the optimizer then tries to apply it to the query and, if applicable, generates the best plan using the transformed query. Based on a comparison of the cost estimates between the best plans for the two versions of the query, the optimizer will then decide whether to use the best plan for the transformed or untransformed version.

If the query requires accessing a large percentage of the rows in the fact table, it might be better to use a full table scan and not use the transformations. However, if the constraining predicates on the dimension tables are sufficiently selective that only a small portion of the fact table needs to be retrieved, the plan based on the transformation will probably be superior.

Note that the optimizer generates a subquery for a dimension table only if it decides that it is reasonable to do so based on a number of criteria. There is no guarantee that subqueries will be generated for all dimension tables. The optimizer may also decide, based on the properties of the tables and the query, that the transformation does not merit being applied to a particular query. In this case the best regular plan will be used.
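The cost comparison can be pictured with a deliberately simplified Python sketch. Everything in it is an assumption made for illustration: the function name choose_plan, the two per-row cost constants, and the linear cost formulas are invented and bear no relation to the optimizer's real cost model. The sketch shows only the shape of the decision: estimate both plans, then keep the cheaper one.

```python
def choose_plan(fact_rows, selectivity, full_scan_cost_per_row=1.0, index_cost_per_row=4.0):
    """Return the cheaper of two hypothetical plans together with both cost estimates."""
    full_scan_cost = fact_rows * full_scan_cost_per_row
    # Assumed: the transformed plan pays a higher per-row price, but only for
    # the fraction of fact rows that survive the dimension predicates.
    transformed_cost = fact_rows * selectivity * index_cost_per_row
    plan = "star transformation" if transformed_cost < full_scan_cost else "full table scan"
    return plan, full_scan_cost, transformed_cost


print(choose_plan(1_000_000, 0.01))  # highly selective dimension predicates
print(choose_plan(1_000_000, 0.60))  # most of the fact table is needed
```

With these made-up constants, the selective query favors the transformed plan and the unselective one falls back to the full scan, which mirrors the behavior described above.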


• ROLLUP Extension to GROUP BY
• CUBE Extension to GROUP BY
• GROUPING Functions
• GROUPING SETS Expression
• Composite Columns
• Concatenated Groupings
• Considerations when Using Aggregation
• Computation Using the WITH Clause

Overview of SQL for Aggregation in Data Warehouses

Aggregation is a fundamental part of data warehousing. To improve aggregation performance in your warehouse, Oracle provides the following extensions to the GROUP BY clause:

• CUBE and ROLLUP extensions to the GROUP BY clause
• Three GROUPING functions
• GROUPING SETS expression

The CUBE, ROLLUP, and GROUPING SETS extensions to SQL make querying and reporting easier and faster. ROLLUP calculates aggregations such as SUM, COUNT, MAX, MIN, and AVG at increasing levels of aggregation, from the most detailed up to a grand total. CUBE is an extension similar to ROLLUP, enabling a single statement to calculate all possible combinations of aggregations. CUBE can generate the information needed in cross-tabulation reports with a single query.

CUBE, ROLLUP, and the GROUPING SETS extension let you specify exactly the groupings of interest in the GROUP BY clause. This allows efficient analysis across multiple dimensions without performing a CUBE operation. Computing a full cube creates a heavy processing load, so replacing cubes with grouping sets can significantly increase performance. CUBE, ROLLUP, and grouping sets produce a single result set that is equivalent to a UNION ALL of differently grouped rows.
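The UNION ALL equivalence can be tried out with a small Python script. It uses SQLite purely because it ships with Python; SQLite has no ROLLUP, so the script spells out by hand the three grouping levels that GROUP BY ROLLUP(country, channel) would produce in Oracle. The sales table, its columns, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (country TEXT, channel TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("US", "Internet", 100),
        ("US", "Direct", 250),
        ("UK", "Internet", 40),
        ("UK", "Direct", 60),
    ],
)

# Emulates GROUP BY ROLLUP(country, channel): three grouping levels glued
# together with UNION ALL. NULL marks a column that has been aggregated away.
rollup_sql = """
SELECT country, channel, SUM(amount) FROM sales GROUP BY country, channel
UNION ALL
SELECT country, NULL, SUM(amount) FROM sales GROUP BY country
UNION ALL
SELECT NULL, NULL, SUM(amount) FROM sales
"""
rows = conn.execute(rollup_sql).fetchall()
for row in rows:
    print(row)
```

The hand-written version scans the table once per grouping level, which is the overhead that a single ROLLUP, CUBE, or GROUPING SETS clause is meant to avoid.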

To enhance performance, CUBE, ROLLUP, and GROUPING SETS can be parallelized: multiple processes can simultaneously execute all of these statements. These capabilities make aggregate calculations more efficient, thereby enhancing database performance and scalability.

The three GROUPING functions help you identify the group each row belongs to and enable sorting subtotal rows and filtering results.

See Also:

Oracle9i SQL Reference for further details


Analyzing Across Multiple Dimensions

One of the key concepts in decision support systems is multidimensional analysis: examining the enterprise from all necessary combinations of dimensions. We use the term dimension to mean any category used in specifying questions. Among the most commonly specified dimensions are time, geography, product, department, and distribution channel, but the potential dimensions are as endless as the varieties of enterprise activity. The events or entities associated with a particular set of dimension values are usually referred to as facts. The facts might be sales in units or local currency, profits, customer counts, production volumes, or anything else worth tracking.

Here are some examples of multidimensional requests:

• Show total sales across all products at increasing aggregation levels for a geography dimension, from state to country to region, for 1999 and 2000.
• Create a cross-tabular analysis of our operations showing expenses by territory in South America for 1999 and 2000. Include all possible subtotals.
• List the top 10 sales representatives in Asia according to 2000 sales revenue for automotive products, and rank their commissions.

All these requests involve multiple dimensions. Many multidimensional questions require aggregated data and comparisons of data sets, often across time, geography or budgets.

To visualize data that has many dimensions, analysts commonly use the analogy of a data cube, that is, a space where facts are stored at the intersection of n dimensions. Figure 18-1 shows a data cube and how it can be used differently by various groups. The cube stores sales data organized by the dimensions of product, market, sales, and time. Note that this is only a metaphor: the actual data is physically stored in normal tables. The cube data consists of both detail and aggregated data.

Figure 18-1 Logical Cubes and Views by Different Users


You can retrieve slices of data from the cube. These correspond to cross-tabular reports such as the one shown in Table 18-1. Regional managers might study the data by comparing slices of the cube applicable to different markets. In contrast, product managers might compare slices that apply to different products. An ad hoc user might work with a wide variety of constraints, working in a subset cube.
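Slicing can be pictured with a short Python sketch. The cube contents, the DIMENSIONS tuple, and the slice_cube helper are hypothetical and exist only for illustration; as noted above, the real data lives in ordinary tables and the cube is a metaphor.

```python
# Hypothetical cube cells: (product, market, quarter) -> sales amount.
cube = {
    ("Laptop", "US", "Q1"): 120,
    ("Laptop", "UK", "Q1"): 80,
    ("Phone", "US", "Q1"): 200,
    ("Phone", "UK", "Q2"): 90,
    ("Laptop", "US", "Q2"): 150,
}
DIMENSIONS = ("product", "market", "quarter")


def slice_cube(cells, **fixed):
    """Keep only the cells whose coordinates match every fixed dimension value."""
    positions = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        coords: amount
        for coords, amount in cells.items()
        if all(coords[pos] == value for pos, value in positions.items())
    }


us_slice = slice_cube(cube, market="US")           # a regional manager's view
laptop_slice = slice_cube(cube, product="Laptop")  # a product manager's view
print(sum(us_slice.values()), sum(laptop_slice.values()))
```

Fixing more than one dimension at a time, for example slice_cube(cube, market="US", quarter="Q1"), corresponds to the subset cube an ad hoc user might work in.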

Answering multidimensional questions often involves accessing and querying huge quantities of data, sometimes in millions of rows. Because the flood of detailed data generated by large organizations cannot be interpreted at the lowest level, aggregated views of the information are essential. Aggregations, such as sums and counts, across many dimensions are vital to multidimensional analyses. Therefore, analytical tasks require convenient and efficient data aggregation.

Optimized Performance

Not only multidimensional issues, but all types of processing can benefit from enhanced aggregation facilities. Transaction processing, financial, and manufacturing systems all generate large numbers of production reports needing substantial system resources. Improved efficiency when creating these reports will reduce system load. In fact, any computer process that aggregates data from details to higher levels will benefit from optimized aggregation performance.

Oracle9i extensions provide aggregation features and bring many benefits, including:

• Simplified programming requiring less SQL code for many tasks
• Quicker and more efficient query processing


• Reduced client processing loads and network traffic because aggregation work is shifted to servers
• Opportunities for caching aggregations because similar queries can leverage existing work

An Aggregate Scenario

To illustrate the use of the GROUP BY extension, this chapter uses the sh data of the sample schema. All the examples refer to data from this scenario. The hypothetical company has sales across the world and tracks sales by both dollar amount and quantity. Because there are many rows of data, the queries shown here typically have tight constraints on their WHERE clauses to limit the results to a small number of rows.

Example 18-1 Simple Cross-Tabular Report With Subtotals

Table 18-1 is a sample cross-tabular report showing the total sales by country_id and channel_desc for the US and UK through the Internet and direct sales in September 2000.

Table 18-1 Simple Cross-Tabular Report With Subtotals


19 SQL for Analysis in Data Warehouses

The following topics provide information about how to improve analytical SQL queries in a data warehouse:

• Overview of SQL for Analysis in Data Warehouses
• Ranking Functions
• Windowing Aggregate Functions
• Reporting Aggregate Functions


Type                                 Used For

                                     set.

Windowing                            Calculating cumulative and moving aggregates. Works with these functions: SUM, AVG, MIN, MAX, COUNT, VARIANCE, STDDEV, FIRST_VALUE, LAST_VALUE, and new statistical functions

Reporting                            Calculating shares, for example, market share. Works with these functions: SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, RATIO_TO_REPORT, and new statistical functions

LAG/LEAD                             Finding a value in a row a specified number of rows from a current row.

FIRST/LAST                           First or last value in an ordered group.

Linear Regression                    Calculating linear regression and other statistics (slope, intercept, and so on).

Inverse Percentile                   The value in a data set that corresponds to a specified percentile.

Hypothetical Rank and Distribution   The rank or percentile that a row would have if inserted into a specified data set.

To perform these operations, the analytic functions add several new elements to SQL processing. These elements build on existing SQL to allow flexible and powerful calculation expressions. With just a few exceptions, the analytic functions have these new elements. The processing flow is represented in Figure 19-1.

Figure 19-1 Processing Order


Each calculation performed with an analytic function is based on a current row within a partition. The current row serves as the reference point determining the start and end of the window. For instance, a centered moving average calculation could be defined with a window that holds the current row, the six preceding rows, and the following six rows. This would create a sliding window of 13 rows, as shown in Figure 19-2.
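A centered moving average of this shape can be sketched in plain Python. The function name and the sample data are hypothetical, and the sketch assumes that the window is simply cut short where it would run past the first or last row of the partition, so rows near either end are averaged over fewer than 13 values.

```python
def centered_moving_average(values, preceding=6, following=6):
    """Average each row with up to `preceding` rows before and `following` rows after it."""
    averages = []
    for i in range(len(values)):
        start = max(0, i - preceding)              # window is clipped at the partition start
        end = min(len(values), i + following + 1)  # and at the partition end
        window = values[start:end]
        averages.append(sum(window) / len(window))
    return averages


daily_sales = list(range(1, 21))  # hypothetical partition of 20 rows: 1, 2, ..., 20
result = centered_moving_average(daily_sales)
print(result[10])  # the 11th row sits in a full 13-row window (values 5 through 17)
```

The current row is the only moving part: as i advances by one, both ends of the window advance with it, which is what makes the window slide.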

Figure 1@-2 !$iding Windo E.am&$e

Text description of the illstration dwhs!8AA"!if 

Ranking Functions

A ranking function computes the rank of a record compared to other records in the dataset based on the values of a set of measures. The types of ranking function are:

• RANK and DENSE_RANK

• CUME_DIST and PERCENT_RANK

• NTILE

• ROW_NUMBER

RANK and DENSE_RANK

The RANK and DENSE_RANK functions allow you to rank items in a group, for example, finding the top three products sold in California last year. There are two functions that perform ranking, as shown by the following syntax:

RANK ( ) OVER ( [query_partition_clause] order_by_clause )
DENSE_RANK ( ) OVER ( [query_partition_clause] order_by_clause )


The difference between RANK and DENSE_RANK is that DENSE_RANK leaves no gaps in ranking sequence when there are ties. That is, if you were ranking a competition using DENSE_RANK and had three people tie for second place, you would say that all three were in second place and that the next person came in third. The RANK function would also give three people in second place, but the next person would be in fifth place.
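
The competition scenario can be traced with a small Python sketch. It only mimics the numbering rules described above and is not how Oracle evaluates the functions; the function name and the scores are assumptions made for illustration.

```python
def rank_and_dense_rank(scores):
    """Return (score, rank, dense_rank) tuples, highest score first."""
    ordered = sorted(scores, reverse=True)
    rows = []
    rank = dense = 0
    previous = None
    for position, score in enumerate(ordered, start=1):
        if score != previous:
            rank = position  # RANK jumps to the row position, leaving gaps after ties
            dense += 1       # DENSE_RANK advances by one per distinct value
            previous = score
        rows.append((score, rank, dense))
    return rows


competition = [98, 95, 95, 95, 90]  # hypothetical scores: three people tie for second place
for row in rank_and_dense_rank(competition):
    print(row)
```

With these scores the last competitor gets rank 5 but dense rank 3, which is the gap behavior the paragraph describes.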

The following are some relevant points about RANK:

• Ascending is the default sort order, which you may want to change to descending.

• The expressions in the optional PARTITION BY clause divide the query result set into groups within which the RANK function operates. That is, RANK gets reset whenever the group changes. In effect, the value expressions of the PARTITION BY clause define the reset boundaries.

• If the PARTITION BY clause is missing, then ranks are computed over the entire query result set.

• The ORDER BY clause specifies the measures (<value expression>s) on which ranking is done and defines the order in which rows are sorted in each group (or partition). Once the data is sorted within each partition, ranks are given to each row starting from 1.

• The NULLS FIRST | NULLS LAST clause indicates the position of NULLs in the ordered sequence, either first or last in the sequence. The order of the sequence would make NULLs compare either high or low with respect to non-NULL values. If the sequence were in ascending order, then NULLS FIRST implies that NULLs are smaller than all other non-NULL values and NULLS LAST implies they are larger than non-NULL values. It is the opposite for descending order. See the example in "Treatment of NULLs".

• If the NULLS FIRST | NULLS LAST clause is omitted, then the ordering of the null values depends on the ASC or DESC arguments. Null values are considered larger than any other values. If the ordering sequence is ASC, then nulls will appear last; nulls will appear first otherwise. Nulls are considered equal to other nulls and, therefore, the order in which nulls are presented is non-deterministic.
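
The null-placement rules in the last two points can be illustrated with a short Python sketch in which None stands in for NULL. The function and parameter names are hypothetical and the code only models the ordering rule, not Oracle's sort implementation.

```python
def order_rows(values, descending=False, nulls_first=None):
    """Sort values, placing None according to a NULLS FIRST / NULLS LAST style rule."""
    if nulls_first is None:
        # Default described above: nulls count as larger than any other value,
        # so they come last for ASC and first for DESC.
        nulls_first = descending
    present = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    return nulls + present if nulls_first else present + nulls


amounts = [3, None, 1]  # hypothetical measure values
print(order_rows(amounts))                                      # ASC, default null placement
print(order_rows(amounts, descending=True))                     # DESC, default null placement
print(order_rows(amounts, descending=True, nulls_first=False))  # DESC NULLS LAST
```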

Ranking Order

The following example shows how the [ASC | DESC] option changes the ranking order.

Example 19-1 Ranking Order

SELECT channel_desc,
  TO_CHAR(SUM(amount_sold), '9,999,999,999') SALES$,
  RANK() OVER (ORDER BY SUM(amount_sold) ) AS default_rank,
  RANK() OVER (ORDER BY SUM(amount_sold) DESC NULLS LAST) AS custom_rank
FROM sales, products, customers, times, channels
WHERE sales.prod_id=products.prod_id AND
  sales.cust_id=customers.cust_id AND
  sales.time_id=times.time_id AND
  sales.channel_id=channels.channel_id AND
  times.calendar_month_desc IN ('2000-09', '2000-10')
  AND country_id='US'
GROUP BY channel_desc;

CHANNEL_DESC         SALES$         DEFAULT_RANK CUSTOM_RANK
-------------------- -------------- ------------ -----------
Direct Sales              5,744,263            5           1
Internet                  3,625,993            4           2
Catalog                   1,858,386            3           3
Partners                  1,500,213            2           4
Tele Sales                  604,656            1           5

While the data in this result is ordered on the measure SALES$, in general, it is not guaranteed by the RANK function that the data will be sorted on the measures. If you want the data to be sorted on SALES$ in your result, you must specify it explicitly with an ORDER BY clause, at the end of the SELECT statement.

Ranking on Multiple Expressions

Ranking functions need to resolve ties between values in the set. If the first expression cannot resolve ties, the second expression is used to resolve ties and so on. For example, here is a query ranking four of the sales channels over two months based on their dollar sales, breaking ties with the unit sales. (Note that the TRUNC function is used here only to create tie values for this query.)

Example 19-2 Ranking On Multiple Expressions

SELECT channel_desc, calendar_month_desc,
  TO_CHAR(TRUNC(SUM(amount_sold),-6), '9,999,999,999') SALES$,
  TO_CHAR(SUM(quantity_sold), '9,999,999,999') SALES_Count,
  RANK() OVER (ORDER BY trunc(SUM(amount_sold), -6) DESC,
    SUM(quantity_sold) DESC) AS col_rank
FROM sales, products, customers, times, channels
WHERE sales.prod_id=products.prod_id AND
  sales.cust_id=customers.cust_id AND
  sales.time_id=times.time_id AND
  sales.channel_id=channels.channel_id AND
  times.calendar_month_desc IN ('2000-09', '2000-10') AND
  channels.channel_desc<>'Tele Sales'
GROUP BY channel_desc, calendar_month_desc;

CHANNEL_DESC         CALENDAR SALES$         SALES_COUNT    COL_RANK
-------------------- -------- -------------- -------------- ---------
Direct Sales         2000-10      10,000,000        192,551         1
Direct Sales         2000-09       9,000,000        176,950         2
Internet             2000-10       6,000,000        123,153         3
Internet             2000-09       6,000,000        113,006         4
Catalog              2000-10       3,000,000         59,782         5
Catalog              2000-09       3,000,000         54,857         6
Partners             2000-10       2,000,000         50,773         7
Partners             2000-09       2,000,000         46,220         8


The sales_count column breaks the ties for three pairs of values.
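
To see the tie-breaking effect in isolation, the following Python sketch ranks the eight result rows twice: once on the truncated sales figure alone and once on sales followed by unit count. The row values are copied from the output of Example 19-2; the helper name `rank_by` is an assumption made for this sketch and the code imitates RANK numbering rather than Oracle's implementation.

```python
rows = [
    ("Direct Sales", "2000-10", 10_000_000, 192_551),
    ("Direct Sales", "2000-09", 9_000_000, 176_950),
    ("Internet", "2000-10", 6_000_000, 123_153),
    ("Internet", "2000-09", 6_000_000, 113_006),
    ("Catalog", "2000-10", 3_000_000, 59_782),
    ("Catalog", "2000-09", 3_000_000, 54_857),
    ("Partners", "2000-10", 2_000_000, 50_773),
    ("Partners", "2000-09", 2_000_000, 46_220),
]


def rank_by(rows, key):
    """RANK-style numbering: rows with equal keys share a rank, largest key first."""
    ordered = sorted(rows, key=key, reverse=True)
    ranked, rank, previous = [], 0, object()
    for position, row in enumerate(ordered, start=1):
        if key(row) != previous:
            rank, previous = position, key(row)
        ranked.append((row[0], row[1], rank))
    return ranked


by_sales_only = rank_by(rows, key=lambda r: r[2])
by_sales_then_units = rank_by(rows, key=lambda r: (r[2], r[3]))
print([r[2] for r in by_sales_only])
print([r[2] for r in by_sales_then_units])
```

Ranking on sales alone leaves the Internet, Catalog, and Partners pairs tied; adding the unit count as a second expression separates all three pairs.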

RANK and DENSE_RANK Difference

The difference between RANK and DENSE_RANK functions is illustrated as follows:

Example 19-3 RANK and DENSE_RANK

SELECT channel_desc, calendar_month_desc,
  TO_CHAR(TRUNC(SUM(amount_sold),-6), '9,999,999,999') SALES$,
  RANK() OVER (ORDER BY trunc(SUM(amount_sold),-6) DESC)
    AS RANK,
  DENSE_RANK() OVER (ORDER BY TRUNC(SUM(amount_sold),-6) DESC)
    AS DENSE_RANK
FROM sales, products, customers, times, channels
WHERE sales.prod_id=products.prod_id AND
  sales.cust_id=customers.cust_id AND
  sales.time_id=times.time_id AND
  sales.channel_id=channels.channel_id AND
  times.calendar_month_desc IN ('2000-09', '2000-10') AND
  channels.channel_desc<>'Tele Sales'
GROUP BY channel_desc, calendar_month_desc;

CHANNEL_DESC         CALENDAR SALES$         RANK      DENSE_RANK
-------------------- -------- -------------- --------- ----------
Direct Sales         2000-10


20 OLAP and Data Mining

In large data warehouse environments, many different types of analysis can occur. In addition to SQL queries, you may also apply more advanced analytical operations to your data. Two major types of such analysis are OLAP (On-Line Analytic Processing) and data mining. Rather than having a separate OLAP or data mining engine, Oracle has integrated OLAP and data mining capabilities directly into the database server. Oracle OLAP and Oracle Data Mining are options to the Oracle9i Database. This chapter provides a brief introduction to these technologies, and more detail can be found in these products' respective documentation.

The following topics provide an introduction to Oracle's OLAP and data mining capabilities:

• OLAP

• Data Mining

See Also:

Oracle9i OLAP User's Guide for further information regarding OLAP and Oracle Data Mining documentation for further information regarding data mining

OLAP

Oracle9i OLAP adds the query performance and calculation capability previously found only in multidimensional databases to Oracle's relational platform. In addition, it provides a Java OLAP API that is appropriate for the development of internet-ready analytical applications. Unlike other combinations of OLAP and RDBMS technology, Oracle9i OLAP is not a multidimensional database using bridges to move data from the relational data store to a multidimensional data store. Instead, it is truly an OLAP-enabled relational database. As a result, Oracle9i provides the benefits of a multidimensional database along with the scalability, accessibility, security, manageability, and high availability of the Oracle9i database. The Java OLAP API, which is specifically designed for internet-based analytical applications, offers productive data access.

See Also:

Oracle9i OLAP User's Guide for further information regarding OLAP

Benefits of OLAP and RDBMS Integration

Basing an OLAP system directly on the Oracle server offers the following benefits:

• Scalability

• Availability

• Manageability

• Backup and Recovery

• Security

Scalability

Backup and Recovery

• Details related to backup, restore, and recovery operations are maintained by the server in a recovery catalog and automatically used as part of these operations. This reduces administrative burden and minimizes the possibility of human errors.

• Backup and recovery operations are fully integrated with partitioning. Individual partitions, when placed in their own tablespaces, can be backed up and restored independently of the other partitions of a table.

• Oracle includes support for incremental backup and recovery using Recovery Manager, enabling operations to be completed efficiently within times proportional to the amount of changes, rather than the overall size of the database.

• The backup and recovery technology is highly scalable, and provides tight interfaces to industry-leading media management subsystems. This provides for efficient operations that can scale up to handle very large volumes of data. Open platforms for more hardware options and enterprise-level platforms.

See Also:

Oracle9i Recovery Manager User's Guide for further details

Security

Just as the demands of real-world transaction processing required Oracle to develop robust features for scalability, manageability, and backup and recovery, they led Oracle to create industry-leading security features. The security features in Oracle have reached the highest levels of U.S. government certification for database trustworthiness. Oracle's fine grained access control feature enables cell-level security for OLAP users. Fine grained access control works with minimal burden on query processing, and it enables efficient centralized security management.

Data Mining

Oracle enables data mining inside the database for performance and scalability. Some of the capabilities are:

• An API that provides programmatic control and application integration

• Analytical capabilities with OLAP and statistical functions in the database

• Multiple algorithms: Naïve Bayes, decision trees, clustering, and association rules

• Real-time and batch scoring modes

• Multiple prediction types

• Association insights

See Also:

Oracle Data Mining documentation for more information

Enabling Data Mining Applications

Oracle9i Data Mining provides a Java API to exploit the data mining functionality that is embedded within the Oracle9i database.

By delivering complete programmatic control of the database in data mining, Oracle Data Mining (ODM) delivers powerful, scalable modeling and real-time scoring. This enables e-businesses to incorporate predictions and classifications in all processes and decision points throughout the business cycle.

ODM is designed to meet the challenges of vast amounts of data, delivering accurate insights completely integrated into e-business applications. This integrated intelligence enables the automation and decision speed that e-businesses require in order to compete today.

Predictions and Insights

Oracle Data Mining uses data mining algorithms to sift through the large volumes of data generated by e-businesses to produce, evaluate, and deploy predictive models. It also enriches mission critical applications in CRM, manufacturing control, inventory management, customer service and support, Web portals, wireless devices and other fields with context-specific recommendations and predictive monitoring of critical processes. ODM delivers real-time answers to questions such as:

• Which N items is person A most likely to buy or like?

• What is the likelihood that this product will be returned for repair?

Mining Within the Database Architecture

Oracle Data Mining performs all the phases of data mining within the database. In each data mining phase, this architecture results in significant improvements including performance, automation, and integration.

Data Preparation

Data preparation can create new tables or views of existing data. Both options perform faster than moving data to an external data mining utility and offer the programmer the option of snap-shots or real-time updates.

Oracle Data Mining provides utilities for complex, data mining-specific tasks. Binning improves model build time and model performance, so ODM provides a utility for user-defined binning. ODM accepts data in either single record format or in transactional format and performs mining on transactional formats. Single record format is most common in applications, so ODM provides a utility for transforming single record format.
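
The two preparation steps mentioned above, binning and conversion from single record format to transactional format, can be pictured with a small Python sketch. This is not the ODM utility or its API; the function names, bin boundaries, and the sample record are all assumptions made for illustration, and the transactional layout shown (one row per identifier, attribute, and value) is one plausible reading of the term.

```python
def bin_value(value, boundaries):
    """Return the index of the bin that `value` falls into, given sorted upper boundaries."""
    for index, upper in enumerate(boundaries):
        if value <= upper:
            return index
    return len(boundaries)  # values above the last boundary share a final bin


def to_transactional(record_id, record):
    """Turn one single-record row (a dict) into (id, attribute, value) rows."""
    return [(record_id, name, value) for name, value in record.items()]


age_boundaries = [25, 40, 60]             # hypothetical user-defined bin edges
customer = {"age": 37, "city": "Boston"}  # hypothetical single-record row
customer["age"] = bin_value(customer["age"], age_boundaries)
print(to_transactional(101, customer))
```

Binning replaces the many distinct ages with a handful of bin numbers, which is the kind of reduction that shortens model build time.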

Associated analysis for preparatory data exploration and model evaluation is extended by Oracle's statistical functions and OLAP capabilities. Because these also operate within the database, they can all be incorporated into a seamless application that shares database objects. This allows for more functional and faster applications.

Model Building

Oracle Data Mining provides four algorithms: Naïve Bayes, Decision Tree, Clustering, and Association Rules. These algorithms address a broad spectrum of business problems, ranging from predicting the future likelihood of a customer purchasing a given product to understanding which products are likely to be purchased together in a single trip to the grocery store. All model building takes place inside the database. Once again, the data does not need to move outside the database in order to build the model, and therefore the entire data-mining process is accelerated.

Model Evaluation

Models are stored in the database and directly accessible for evaluation, reporting, and further analysis by a wide variety of tools and application functions. ODM provides APIs for calculating traditional confusion matrixes and lift charts. It stores the models, the underlying data, and these analysis results together in the database to allow further analysis, reporting and application specific model management.
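
For readers unfamiliar with the term, a confusion matrix simply counts how often each actual outcome was paired with each predicted outcome. The Python below is a generic sketch of that calculation, not the ODM API; the labels and data are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))


actual = ["buy", "buy", "no", "no", "buy", "no"]     # hypothetical known outcomes
predicted = ["buy", "no", "no", "buy", "buy", "no"]  # hypothetical model predictions
matrix = confusion_matrix(actual, predicted)

# Accuracy is the share of rows on the matrix diagonal (actual equals predicted).
accuracy = sum(n for (a, p), n in matrix.items() if a == p) / len(actual)
print(matrix[("buy", "buy")], matrix[("buy", "no")], accuracy)
```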

Scoring

Oracle Data Mining provides both batch and real-time scoring. In batch mode, ODM takes a table as input. It scores every record, and returns a scored table as a result. In real-time mode, parameters for a single record are passed in and the scores are returned in a Java object.

In both modes, ODM can deliver a variety of scores. It can return a rating or probability of a specific outcome. Alternatively it can return a predicted outcome and the probability of that outcome occurring. Some examples follow.

• How likely is this event to end in outcome A?

• Which outcome is most likely to result from this event?

• What is the probability of each possible outcome for this event?

Java API

The Oracle Data Mining API lets you build analytical models and deliver real-time predictions in any application that supports Java. The API is based on the emerging JSR-073 standard.

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


21 Using Parallel Execution

This chapter covers tuning in a parallel execution environment and discusses:

• Introduction to Parallel Execution Tuning

• Types of Parallelism

• Initializing and Tuning Parameters for Parallel Execution

• Tuning General Parameters for Parallel Execution

• Monitoring and Diagnosing Parallel Execution Performance

• Affinity and Parallel Operations

• Miscellaneous Parallel Execution Tuning Tips

Introduction to Parallel Execution Tuning

Parallel execution dramatically reduces response time for data-intensive operations on large databases typically associated with decision support systems (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and hybrid systems. Parallel execution improves processing for:

• Queries requiring large table scans, joins, or partitioned index scans

• Creation of large indexes

• Creation of large tables (including materialized views)

• Bulk inserts, updates, merges, and deletes

You can also use parallel execution to access object types within an Oracle database. For example, you can use parallel execution to access large objects (LOBs).

Parallel execution benefits systems with all of the following characteristics:

• Symmetric multiprocessors (SMPs), clusters, or massively parallel systems

• Sufficient I/O bandwidth

• Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%)

• Sufficient memory to support additional memory-intensive processes, such as sorts, hashing, and I/O buffers

If your system lacks any of these characteristics, parallel execution might not significantly improve performance. In fact, parallel execution may reduce system performance on overutilized systems or systems with small I/O bandwidth.

When to Implement Parallel Execution

Parallel execution provides the greatest performance improvements in DSS and data warehousing environments. OLTP systems also benefit from parallel execution, but usually only during batch processing.

During the day, most OLTP systems should probably not use parallel execution. During off-hours, however, parallel execution can effectively process high-volume batch operations. For example, a bank might use parallelized batch programs to perform millions of updates to apply interest to accounts.

Operations That Can Be Parallelized

The Oracle server can use parallel execution for any of the following:

• Access methods

For example, table scans, index full scans, and partitioned index range scans.

• Join methods

For example, nested loop, sort merge, hash, and star transformation.

• DDL statements

CREATE TABLE AS SELECT, CREATE INDEX, REBUILD INDEX, REBUILD INDEX PARTITION, and MOVE SPLIT COALESCE PARTITION

• DML statements

For example, INSERT AS SELECT, updates, deletes, and MERGE operations.

• Miscellaneous SQL operations

For example, GROUP BY, NOT IN, SELECT DISTINCT, UNION, UNION ALL, CUBE, and ROLLUP, as well as aggregate and table functions.


22 Query Rewrite

This chapter discusses how Oracle rewrites queries. It contains:

• Overview of Query Rewrite

• Enabling Query Rewrite

• How Oracle Rewrites Queries

• Special Cases for Query Rewrite

• Did Query Rewrite Occur?

• Design Considerations for Improving Query Rewrite Capabilities

Overview of Query Rewrite

One of the major benefits of creating and maintaining materialized views is the ability to take advantage of query rewrite, which transforms a SQL statement expressed in terms of tables or views into a statement accessing one or more materialized views that are defined on the detail tables. The transformation is transparent to the end user or application, requiring no intervention and no reference to the materialized view in the SQL statement. Because query rewrite is transparent, materialized views can be added or dropped just like indexes without invalidating the SQL in the application code.

Before the query is rewritten, it is subjected to several checks to determine whether it is a candidate for query rewrite. If the query fails any of the checks, then the query is applied to the detail tables rather than the materialized view. This can be costly in terms of response time and processing power.

The Oracle optimizer uses two different methods to recognize when to rewrite a query in terms of one or more materialized views. The first method is based on matching the SQL text of the query with the SQL text of the materialized view definition. If the first method fails, the optimizer uses the more general method in which it compares joins, selections, data columns, grouping columns, and aggregate functions between the query and a materialized view.

Query rewrite operates on queries and subqueries in the following types of SQL statements:

• SELECT

• CREATE TABLE ... AS SELECT

• INSERT INTO ... SELECT


It also operates on subqueries in the set operators UNION, UNION ALL, INTERSECT, and MINUS, and subqueries in DML statements such as INSERT, DELETE, and UPDATE.

Several factors affect whether or not a given query is rewritten to use one or more materialized views:

• Enabling or disabling query rewrite:

o  by the CREATE or ALTER statement for individual materialized views

o  by the initialization parameter QUERY_REWRITE_ENABLED

o  by the REWRITE and NOREWRITE hints in SQL statements

• Rewrite integrity levels

• Dimensions and constraints

There is also an explain rewrite procedure which will advise whether query rewrite is possible on a query and if so, which materialized views will be used.

Cost-Based Rewrite

Query rewrite is available with cost-based optimization. Oracle optimizes the input query with and without rewrite and selects the least costly alternative. The optimizer rewrites a query by rewriting one or more query blocks, one at a time.

If the rewrite logic has a choice between multiple materialized views to rewrite a query block, it will select the one which can result in reading in the least amount of data.

After a materialized view has been picked for a rewrite, the optimizer performs the rewrite, and then tests whether the rewritten query can be rewritten further with another materialized view. This process continues until no further rewrites are possible. Then the rewritten query is optimized and the original query is optimized. The optimizer compares these two optimizations and selects the least costly alternative.
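
The loop described above can be caricatured in a few lines of Python. This is a toy model and not the Oracle optimizer: a query is reduced to the set of tables it reads, a materialized view to the set of detail tables it replaces, and cost to a sum of row counts. Every name, number, and the cost function itself are assumptions made for this sketch, and each view is assumed to cover at least one detail table.

```python
def rewrite(query_tables, views, row_counts):
    """Repeatedly swap covered detail tables for a materialized view, then keep the cheaper plan."""
    def cost(tables):
        return sum(row_counts[t] for t in tables)

    original = set(query_tables)
    rewritten = set(query_tables)
    while True:
        # Views whose detail tables all still appear in the (partly rewritten) query.
        usable = [name for name, covered in views.items() if covered <= rewritten]
        if not usable:
            break  # no further rewrites are possible
        best = min(usable, key=lambda name: row_counts[name])  # read the least data
        rewritten = (rewritten - views[best]) | {best}
    # Compare the fully rewritten query with the original and keep the cheaper one.
    return rewritten if cost(rewritten) < cost(original) else original


row_counts = {"sales": 1_000_000, "times": 1_500, "products": 10_000,
              "sum_sales_mv": 5_000, "huge_mv": 5_000_000}  # hypothetical statistics
plan = rewrite({"sales", "times", "products"},
               {"sum_sales_mv": {"sales", "times"}}, row_counts)
print(sorted(plan))
```

The final comparison is the point of the sketch: a rewrite is only kept when its estimated cost is lower, which is why accurate statistics on both tables and materialized views matter.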

Since optimization is based on cost, it is important to collect statistics both on tables involved in the query and on the tables representing materialized views. Statistics are fundamental measures, such as the number of rows in a table, that are used to calculate the cost of a rewritten query. They are created by using the DBMS_STATS package.

Queries that contain in-line or named views are also candidates for query rewrite. When a query contains a named view, the view name is used to do the matching between a materialized view and the query. When a query contains an inline view, the inline view can be merged into the query before matching between a materialized view and the query occurs.

In addition, if the inline view's text definition exactly matches with that of an inline view present in any eligible materialized view, general rewrite may be possible. This is because, whenever a materialized view contains exactly identical inline view text to the one present in a query, query rewrite treats such an inline view like a named view or a table.

Figure 22-1 presents a graphical view of the cost-based approach used during the rewrite process.

Figure 22-1 The Query Rewrite Process

When Does Oracle Rewrite a Query?

A query is rewritten only when a certain number of conditions are met:

• Query rewrite must be enabled for the session.

• A materialized view must be enabled for query rewrite.

• The rewrite integrity level should allow the use of the materialized view. For example, if a materialized view is not fresh and query rewrite integrity is set to enforced, then the materialized view will not be used.

• Either all or part of the results requested by the query must be obtainable from the precomputed result stored in the materialized view.

To determine this, the optimizer may depend on some of the data relationships declared by the user using constraints and dimensions. Such data relationships include hierarchies, referential integrity, and uniqueness of key data, and so on.

Sample Schema and Materialized Views

The following sections use an example schema and a few materialized views to illustrate how the optimizer uses data relationships to rewrite queries. Oracle's sh sample schema consists of these tables:

COSTS, COUNTRIES, CUSTOMERS, PRODUCTS, PROMOTIONS, TIMES, CHANNELS, SALES

See Also:

Oracle9i Sample Schemas for details regarding the sh sample schema

Examples of Materialized Views for Query Rewrite

The query rewrite examples in this chapter mainly refer to the following materialized views. Note that those materialized views do not necessarily represent the most efficient implementation for the sh sample schema. Instead, they are a base for demonstrating Oracle's rewrite capabilities. Further examples demonstrating specific functionality can be found in the specific context.

The following materialized views contain joins and aggregates:

CREATE MATERIALIZED VIEW sum_sales_pscat_week_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_subcategory, t.week_ending_day,
  SUM(s.amount_sold) AS sum_amount_sold
FROM sales s, products p, times t
WHERE s.time_id=t.time_id
AND s.prod_id=p.prod_id
GROUP BY p.prod_subcategory, t.week_ending_day;

CREATE MATERIALIZED VIEW sum_sales_prod_week_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_id, t.week_ending_day, s.cust_id,
  SUM(s.amount_sold) AS sum_amount_sold
FROM sales s, products p, times t
WHERE s.time_id=t.time_id
AND s.prod_id=p.prod_id
GROUP BY p.prod_id, t.week_ending_day, s.cust_id;

CREATE MATERIALIZED VIEW sum_sales_pscat_month_city_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_subcategory, t.calendar_month_desc, c.cust_city,
  SUM(s.amount_sold) AS sum_amount_sold,
  COUNT(s.amount_sold) AS count_amount_sold
FROM sales s, products p, times t, customers c
WHERE s.time_id=t.time_id
AND s.prod_id=p.prod_id
AND s.cust_id=c.cust_id
GROUP BY p.prod_subcategory, t.calendar_month_desc, c.cust_city;

The following materialized views contain joins only:

CREATE MATERIALIZED VIEW join_sales_time_product_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_id, p.prod_name, t.time_id, t.week_ending_day,
  s.channel_id, s.promo_id, s.cust_id,
  s.amount_sold
FROM sales s, products p, times t
WHERE s.time_id=t.time_id
AND s.prod_id = p.prod_id;

CREATE MATERIALIZED VIEW join_sales_time_product_oj_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_id, p.prod_name, t.time_id, t.week_ending_day,
  s.channel_id, s.promo_id, s.cust_id,
  s.amount_sold
FROM sales s, products p, times t
WHERE s.time_id=t.time_id
AND s.prod_id=p.prod_id(+);

You must collect statistics on the materialized views so that the optimizer can determine whether to rewrite the queries. You can do this either on a per object base or for all newly created objects without statistics.

On a per object base, shown for join_sales_time_product_mv:

EXECUTE DBMS_STATS.GATHER_TABLE_STATS('SH','JOIN_SALES_TIME_PRODUCT_MV',
  estimate_percent=>20,block_sample=>TRUE,cascade=>TRUE);

For all newly created objects without statistics, on schema level:

EXECUTE DBMS_STATS.GATHER_SCHEMA_STATS('SH', options => 'GATHER EMPTY',
  estimate_percent=>20, block_sample=>TRUE, cascade=>TRUE);


aggregation

The process of consolidating data values into a single value. For example, sales data could be collected on a daily basis and then be aggregated to the week level, the week data could be aggregated to the month level, and so on. The data can then be referred to as aggregate data. Aggregation is synonymous with summarization, and aggregate data is synonymous with summary data.
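
As a rough illustration of this definition, the Python sketch below consolidates daily values first to the week level and then to the month level. The function name and the sample rows are assumptions made for this sketch; weeks are identified here by ISO year and week number, which is only one possible convention.

```python
from collections import defaultdict
from datetime import date


def aggregate(rows, level):
    """Sum amounts per group, where `level` maps a date to its group key."""
    totals = defaultdict(int)
    for day, amount in rows:
        totals[level(day)] += amount
    return dict(totals)


# Hypothetical daily sales rows: (day, amount).
daily = [(date(2000, 9, 29), 100), (date(2000, 9, 30), 50), (date(2000, 10, 2), 70)]

by_week = aggregate(daily, lambda d: tuple(d.isocalendar())[:2])  # (ISO year, ISO week)
by_month = aggregate(daily, lambda d: (d.year, d.month))
print(by_week)
print(by_month)
```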

ancestor 

A value at any level higher than a given value in a hierarchy. For example, in a Time dimension, the value 1999 might be the ancestor of the values Q1-99 and Jan-99.

See Also:

hierarchy and level

attribute

A descriptive characteristic of one or more levels. For example, the product dimension for a clothing manufacturer might contain a level called item, one of whose attributes is color. Attributes represent logical groupings that enable end users to select data based on like characteristics.

Note that in relational modeling, an attribute is defined as a characteristic of an entity. In Oracle9i, an attribute is a column in a dimension that characterizes elements of a single level.

cardinality

From an OLTP perspective, this refers to the number of rows in a table. From a data warehousing perspective, this typically refers to the number of distinct values in a column. For most data warehouse DBAs, a more important issue is the degree of cardinality.

See Also:

degree of cardinality

child 

A value at the level under a given value in a hierarchy. For example, in a Time dimension, the value Jan-99 might be the child of the value Q1-99. A value can be a child for more than one parent if the child value belongs to multiple hierarchies.

See Also:

• hierarchy

• level

• parent

cleansing 

The process of resolving inconsistencies and fixing the anomalies in source data, typically as part of the ETL process.

See Also:

ETL

Common Warehouse Metadata (CWM)

A repository standard used by Oracle data warehousing and decision support. The CWM repository schema is a standalone product that other products can share; each product owns only the objects within the CWM repository that it creates.

cross product 

A procedure for combining the elements in multiple sets. For example, given two columns, each element of the first column is matched with every element of the second column. A simple example is illustrated as follows:

Col1    Col2    Cross Product
----    ----    -------------
a       c       ac
b       d       ad
                bc
                bd

Cross products are performed when grouping sets are concatenated, as described in Chapter 18, "SQL for Aggregation in Data Warehouses".
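
The same two-column example can be reproduced with Python's standard library. This is only a sketch of the set operation, not of how Oracle concatenates grouping sets; the variable names are arbitrary.

```python
from itertools import product

col1 = ["a", "b"]
col2 = ["c", "d"]

# Match each element of the first column with every element of the second.
cross = ["".join(pair) for pair in product(col1, col2)]
print(cross)
```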

data mart 

A data warehouse that is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, the data can be derived from an enterprise-wide data warehouse. In an independent data mart, data can be collected directly from sources.

See Also:

data warehouse

data source

A database, application, repository, or file that contributes data to a warehouse.

data warehouse

A relational database that is designed for query and analysis rather than transaction processing. A data warehouse usually contains historical data that is derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables a business to consolidate data from several sources.

In addition to a relational database, a data warehouse environment often consists of an ETL solution, an OLAP engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

See Also:

ETL and online analytical processing (OLAP)

degree of cardinality

The number of unique values of a column divided by the total number of rows in the table. This is particularly important when deciding which indexes to build. You typically want to use bitmap indexes on low degree of cardinality columns and B-tree indexes on high degree of cardinality columns. As a general rule, a cardinality of under 1% makes a good candidate for a bitmap index.
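The ratio is easy to compute by hand for a sample of column values. The sketch below is an illustration in Python, not an Oracle utility: the function names and the sample columns are hypothetical, and the 1% threshold is simply the general rule quoted above.

```python
def degree_of_cardinality(values):
    """Return the number of distinct values divided by the number of rows."""
    if not values:
        raise ValueError("column has no rows")
    return len(set(values)) / len(values)


def suggest_index(values, bitmap_threshold=0.01):
    # Heuristic only: under the threshold suggests a bitmap index,
    # anything else suggests a B-tree index.
    if degree_of_cardinality(values) < bitmap_threshold:
        return "bitmap"
    return "B-tree"


gender = ["M", "F"] * 500        # 2 distinct values in 1,000 rows
customer_id = list(range(1000))  # 1,000 distinct values in 1,000 rows

print(degree_of_cardinality(gender), suggest_index(gender))
print(degree_of_cardinality(customer_id), suggest_index(customer_id))
```

In practice the decision also depends on factors such as update activity on the column, so the threshold should be read as a starting point rather than a rule.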

denormalize

The process of allowing redundancy in a table. Contrast with normalize.

derived fact (or measure)

A fact (or measure) that is generated from existing data using a mathematical operation or a data transformation. Examples include averages, totals, percentages, and differences.
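As a small illustration, a difference (profit) and a percentage (margin) can be derived from two stored facts. The rows and column names below are made up for the example.

```python
rows = [
    {"product": "widget", "sales": 120.0, "cost": 90.0},
    {"product": "gadget", "sales": 80.0, "cost": 50.0},
]

for row in rows:
    # Derived facts: computed from existing facts, not loaded from a source.
    row["profit"] = row["sales"] - row["cost"]
    row["margin_pct"] = 100.0 * row["profit"] / row["sales"]

# A total is also a derived fact.
total_sales = sum(row["sales"] for row in rows)
```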

detail 

See: fact table.

detail table

See: fact table.

dimension

element 

An object or process. For example, a dimension is an object, a mapping is a process, and both are elements.

entity

Entity is used in database modeling. In relational databases, it typically maps to a table.

ETL

Extraction, transformation, and loading. ETL refers to the methods involved in accessing and manipulating source data and loading it into a data warehouse. The order in which these processes are performed varies.

Note that ETT (extraction, transformation, transportation) and ETM (extraction, transformation, move) are sometimes used instead of ETL.

See Also: 

• data warehouse

• extraction

• transformation

• transportation
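The three ETL phases can be sketched as three small functions. This is a toy illustration under stated assumptions: the source is an in-memory CSV string, the "warehouse" is a plain dictionary, and all names are hypothetical. It does not reflect any Oracle tool.

```python
import csv
import io

# Hypothetical source data with inconsistent spacing and capitalization.
SOURCE_CSV = "region,amount\n east ,10\nWEST,5\neast,2.5\n"


def extract(text):
    """Extraction: take the raw records out of the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transformation: cleanse region names and convert amounts to numbers."""
    return [
        {"region": r["region"].strip().lower(), "amount": float(r["amount"])}
        for r in records
    ]


def load(rows, warehouse):
    """Loading: append the transformed rows to the target structure."""
    for row in rows:
        warehouse.setdefault(row["region"], []).append(row["amount"])
    return warehouse


warehouse = load(transform(extract(SOURCE_CSV)), {})
print(warehouse)  # {'east': [10.0, 2.5], 'west': [5.0]}
```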

extraction

The process of taking data out of a source as part of an initial phase of ETL.

See Also: 

ETL

fact

Data, usually numeric and additive, that can be examined and analyzed. Examples include sales, cost, and profit. Fact and measure are synonymous; fact is more commonly used with relational environments, measure is more commonly used with multidimensional environments.

See Also: 

derived fact (or measure)

fact table

A table in a star schema that contains facts. A fact table typically has two types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary key of a fact table is usually a composite key that is made up of all of its foreign keys.

A fact table might contain either detail level facts or facts that have been aggregated (fact tables that contain aggregated facts are often instead called summary tables). A fact table usually contains facts with the same level of aggregation.

fast refresh

An operation that applies only the data changes to a materialized view, thus eliminating the need to rebuild the materialized view from scratch.
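The idea behind fast refresh can be shown with an in-memory summary. The sketch below is a simplified analogy, not Oracle's mechanism: it handles inserted rows only, the summary is a dictionary of totals, and the function names are hypothetical.

```python
def full_refresh(detail_rows):
    """Rebuild the summary from scratch by scanning every detail row."""
    summary = {}
    for region, amount in detail_rows:
        summary[region] = summary.get(region, 0.0) + amount
    return summary


def fast_refresh(summary, changed_rows):
    """Apply only the newly inserted rows to the existing summary."""
    for region, amount in changed_rows:
        summary[region] = summary.get(region, 0.0) + amount
    return summary


detail = [("east", 10.0), ("west", 5.0)]
summary = full_refresh(detail)

new_rows = [("east", 2.5), ("north", 1.0)]
summary = fast_refresh(summary, new_rows)
```

Both paths end with the same totals, but the incremental one touches two rows instead of rescanning all of the detail data.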

file-to-table mapping

Maps data from flat files to tables in the warehouse.

hierarchy

A logical structure that uses ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation; for example, in a time dimension, a hierarchy might be used to aggregate data from the Month level to the Quarter level to the Year level. Hierarchies can be defined in Oracle9i as part of the dimension object. A hierarchy can also be used to define a navigational drill path, regardless of whether the levels in the hierarchy represent aggregated totals.

See Also: 

dimension and level
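Aggregating along a time hierarchy amounts to mapping each value to its parent at the next level and summing. The following Python sketch assumes hypothetical parent mappings and monthly totals; it is an illustration of the concept, not of the Oracle dimension object.

```python
# Child-to-parent mappings for two adjacent pairs of levels.
MONTH_TO_QUARTER = {
    "Jan-99": "Q1-99",
    "Feb-99": "Q1-99",
    "Mar-99": "Q1-99",
    "Apr-99": "Q2-99",
}
QUARTER_TO_YEAR = {"Q1-99": "1999", "Q2-99": "1999"}


def roll_up(totals, parent_of):
    """Aggregate totals from one level to the level above it."""
    rolled = {}
    for child, amount in totals.items():
        parent = parent_of[child]
        rolled[parent] = rolled.get(parent, 0) + amount
    return rolled


monthly = {"Jan-99": 100, "Feb-99": 150, "Mar-99": 50, "Apr-99": 200}
quarterly = roll_up(monthly, MONTH_TO_QUARTER)  # {'Q1-99': 300, 'Q2-99': 200}
yearly = roll_up(quarterly, QUARTER_TO_YEAR)    # {'1999': 500}
```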

level 

A position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the Month, Quarter, and Year levels.

See Also: 

hierarchy

level value table

A database table that stores the values or data for the levels you created as part of your dimensions and hierarchies.

mapping 

OLAP functionality is characterized by dynamic, multidimensional analysis of historical data, which supports activities such as the following:

• Calculating across dimensions and through hierarchies

• Analyzing trends

• Drilling up and down through hierarchies

• Rotating to change the dimensional orientation

OLAP tools can run against a multidimensional database or interact directly with a relational database.

OLTP

See: online transaction processing (OLTP).

online transaction processing (OLTP)

Online transaction processing. OLTP systems are optimized for fast and reliable transaction handling. Compared to data warehouse systems, most OLTP interactions will involve a relatively small number of rows, but a larger group of tables.

parallelism

Breaking down a task so that several processes do part of the work. When multiple CPUs each do their portion simultaneously, very large performance gains are possible.

parallel execution

Breaking down a task so that several processes do part of the work. When multiple CPUs each do their portion simultaneously, very large performance gains are possible.

parent

A value at the level above a given value in a hierarchy. For example, in a Time dimension, the value Q1-99 might be the parent of the value Jan-99.

See Also: 

• child

• hierarchy

• level

partition

Very large tables and indexes can be difficult and time-consuming to work with. To improve manageability, you can break your tables and indexes into smaller pieces called partitions.

pivoting

A transformation where each record in an input stream is converted to many records in the appropriate table in the data warehouse. This is particularly important when taking data from nonrelational databases.
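A common case is a source record that holds one measure per period in separate fields. The sketch below, with a hypothetical record layout and function name, turns one such record into one row per period.

```python
def pivot_record(record, key_field, measure_fields):
    """Convert one wide input record into one output row per measure field."""
    return [
        {key_field: record[key_field], "period": field, "amount": record[field]}
        for field in measure_fields
    ]


# One input record carrying three monthly amounts.
wide = {"store": "S1", "jan": 10, "feb": 20, "mar": 30}
rows = pivot_record(wide, "store", ["jan", "feb", "mar"])

for row in rows:
    print(row)
```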

publisher

Usually a database administrator who is in charge of creating and maintaining schema objects that make up the Change Data Capture system.

refresh

The mechanism whereby materialized views are changed to reflect new data.

schema

A collection of related database objects. Relational schemas are grouped by database user ID and include tables, views, and other objects. Whenever possible, a sample schema called sh is used throughout this Guide.

See Also: 

snowflake schema and star schema

semi-additive

Describes a fact (or measure) that can be summarized through addition along some, but not all, dimensions. Examples include headcount and on hand stock. Contrast with additive and nonadditive.
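On hand stock illustrates the distinction: quantities can be added across products within a month, but adding them across months does not give a stock level. The data and names in this sketch are invented for the example.

```python
# (month, product, quantity on hand at month end)
stock = [
    ("Jan", "widget", 10),
    ("Jan", "gadget", 5),
    ("Feb", "widget", 8),
    ("Feb", "gadget", 4),
]


def on_hand_by_month(rows):
    """Adding along the product dimension is meaningful."""
    totals = {}
    for month, _product, quantity in rows:
        totals[month] = totals.get(month, 0) + quantity
    return totals


monthly = on_hand_by_month(stock)  # {'Jan': 15, 'Feb': 12}

# Along the time dimension, a sum (27) is not a stock level;
# a closing balance or an average is used instead.
closing_balance = monthly["Feb"]
average_on_hand = sum(monthly.values()) / len(monthly)
```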

slice and dice

This is an informal term referring to data retrieval and manipulation. We can picture a data warehouse as a cube of data, where each axis of the cube represents a dimension. To "slice" the data is to retrieve a piece (a slice) of the cube by specifying measures and values for some or all of the dimensions. When we retrieve a data slice, we may also move and reorder its columns and rows as if we had diced the slice into many small pieces. A system with good slicing and dicing makes it easy to navigate through large amounts of data.
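The cube picture can be modeled with a dictionary keyed by one value per dimension. This is a toy sketch with hypothetical dimension names and data; real OLAP tools use far more efficient storage.

```python
DIMENSIONS = ("time", "product", "region")

# Each cell of the cube holds one measure value (sales amount).
cube = {
    ("Q1", "widget", "east"): 100,
    ("Q1", "widget", "west"): 80,
    ("Q1", "gadget", "east"): 60,
    ("Q2", "widget", "east"): 120,
}


def slice_cube(cells, **fixed):
    """Keep only the cells whose dimension values match the ones given."""
    positions = {name: DIMENSIONS.index(name) for name in fixed}
    return {
        key: value
        for key, value in cells.items()
        if all(key[pos] == fixed[name] for name, pos in positions.items())
    }


q1 = slice_cube(cube, time="Q1")                      # three cells
q1_east = slice_cube(cube, time="Q1", region="east")  # two cells
```

Fixing more dimensions narrows the slice; fixing all three returns a single cell.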

snowflake schema

A type of star schema in which the dimension tables are partly or fully normalized.

See Also: 

schema and star schema

source

A database, application, file, or other storage facility from which the data in a data warehouse is derived.

source system

A database, application, file, or other storage facility from which the data in a data warehouse is derived.

staging area

A place where data is processed before entering the warehouse.

staging file

A file used when data is processed before entering the warehouse.

star query

A join between a fact table and a number of dimension tables. Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other.
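The join pattern can be tried out with Python's built-in sqlite3 module. This sketch uses SQLite as a stand-in for Oracle, and the table names, columns, and rows are hypothetical. Note that each dimension table joins only to the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
CREATE TABLE times (time_id INTEGER PRIMARY KEY, quarter TEXT);
-- Fact table: foreign keys to each dimension plus one fact column.
CREATE TABLE sales (
    prod_id INTEGER REFERENCES products(prod_id),
    time_id INTEGER REFERENCES times(time_id),
    amount REAL
);
INSERT INTO products VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO times VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO sales VALUES (1, 1, 100.0), (1, 2, 120.0), (2, 1, 60.0);
""")

# Star query: the fact table joins to each dimension table on its key.
rows = conn.execute("""
SELECT p.prod_name, t.quarter, SUM(s.amount)
FROM sales s
JOIN products p ON s.prod_id = p.prod_id
JOIN times t ON s.time_id = t.time_id
GROUP BY p.prod_name, t.quarter
ORDER BY p.prod_name, t.quarter
""").fetchall()

for row in rows:
    print(row)
```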

star schema

A relational schema whose design represents a multidimensional data model. The star schema consists of one or more fact tables and one or more dimension tables that are related through foreign keys.

See Also: 

schema and snowflake schema

subject area

A classification system that represents or distinguishes parts of an organization or areas of knowledge. A data mart is often developed to support a subject area such as sales, marketing, or geography.

See Also: 

data mart

subscribers

Consumers of the published change data. These are normally applications.

summary

See: materialized view.

Summary Advisor

The Summary Advisor recommends which materialized views to retain, create, and drop. It helps database administrators manage materialized views. It is a GUI in Oracle Enterprise Manager, and has similar capabilities to the DBMS_OLAP package.

target

Holds the intermediate or final results of any part of the ETL process. The target of the entire ETL process is the data warehouse.

See Also: 

data warehouse and ETL

third normal form (3NF)

A classical relational database modeling technique that minimizes data redundancy through normalization.

third normal form schema

A schema that uses the same kind of normalization as typically found in an OLTP system. Third normal form schemas are sometimes chosen for large data warehouses, especially environments with significant data loading requirements that are used to feed data marts and execute long-running queries.

See Also: 

snowflake schema and star schema

transformation

The process of manipulating data. Any manipulation beyond copying is a transformation. Examples include cleansing, aggregating, and integrating data from multiple sources.

transportation