Advanced Querying OLAP Part 2

Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.

Dec 21, 2015

Transcript

Advanced Querying

OLAP

Part 2

Context

• OLAP systems for supporting decision making.

• Components:
  – Dimensions with hierarchies
  – Measures
  – Aggregation

• Data model:
  – Multidimensional cube

Context

• Operations:
  – Roll-up, drill-down
  – Pivot
  – Slice and dice

• Implementation:
  – ROLAP
  – MOLAP

Outline

• Examples of decision support queries

• Data Cubes
  – Conceptual data model
  – Typical operations

• SQL:1999 support for OLAP

• Implementation
  – ROLAP vs MOLAP
  – Indexing structures

MOLAP

• Not on top of relational database
  – most popular design
  – specialized data structures

• Multicubes vs Hypercubes

– Not all subcubes are materialized

Multicubes

• User identifies set of sparse attributes S, and a set of dense attributes D.

• Index tree is constructed on sparse dimensions.

• Each leaf points to a multidimensional array indexed by D.
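
The multicube layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dict stands in for the index tree on the sparse dimensions, the names `build_multicube`, `DATES` and `CUSTOMER_TYPES` are hypothetical, and the sample facts are made up for the sketch.

```python
from typing import Dict, List, Tuple

# Dense dimensions D: each value maps to a fixed position in the 2D array.
# (Hypothetical sample values for the sketch.)
DATES = ["1/1/07", "2/1/07"]
CUSTOMER_TYPES = ["1time", "ret", "reg"]

# Index on the sparse dimensions S = (product, store). A dict stands in for
# the index tree; only combinations that actually occur get an entry.
Multicube = Dict[Tuple[str, str], List[List[int]]]


def build_multicube(facts) -> Multicube:
    """Build the structure from (product, store, date, customer_type, amount) facts."""
    cube: Multicube = {}
    for product, store, date, ctype, amount in facts:
        # Allocate the dense array lazily, only for sparse keys that exist.
        array = cube.setdefault(
            (product, store),
            [[0] * len(CUSTOMER_TYPES) for _ in DATES],
        )
        # Direct access into the dense array by position.
        array[DATES.index(date)][CUSTOMER_TYPES.index(ctype)] += amount
    return cube


facts = [
    ("p", "s1", "1/1/07", "1time", 51),
    ("p", "s1", "1/1/07", "ret", 25),
    ("p", "s1", "2/1/07", "reg", 120),
    ("p", "s2", "2/1/07", "ret", 20),
]
cube = build_multicube(facts)
```

Only the two (product, store) pairs that occur in the facts get a dense array; every other combination costs no storage, which is the point of separating sparse from dense dimensions.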

Example

• product, store are sparse dimensions

• date and customer-type are dense

• Index on the sparse dimensions (e.g., B-tree, R-tree, …): the entry for prod. p leads to the leaves (prod. p, store s1) and (prod. p, store s2)

• Each leaf points to a 2D array (direct access) indexed by date and customer-type; the figure shows the same values for both leaves:

          1time   ret   reg   Total
1/1/07       51    25   158     234
2/1/07       58    20   120     198
…            65    22    51     138
Total       174    67   329     570

• Figure label: Linked list

Queries

• Efficiency depends on:
  – Does the index on sparse dimensions fit into memory?
  – Type of queries:
    • Restrictions on all dimensions
    • Restrictions only on dense
    • Restrictions only on some sparse and dense
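
The three query types can be contrasted with a small Python sketch. It assumes a dict keyed by (product, store) in place of the index on the sparse dimensions, uses the first two date rows of the example arrays (the elided "…" row is left out), and treats `None` as "no restriction"; the function name `query` is hypothetical.

```python
DATES = ["1/1/07", "2/1/07"]
CTYPES = ["1time", "ret", "reg"]

# Hypothetical multicube: sparse key (product, store) -> dense array [date][customer type].
cube = {
    ("p", "s1"): [[51, 25, 158], [58, 20, 120]],
    ("p", "s2"): [[51, 25, 158], [58, 20, 120]],
}


def query(product=None, store=None, ctype=None, date=None):
    """Sum the measure; None means no restriction on that dimension."""
    if product is not None and store is not None:
        # All sparse dimensions are fixed: a single index lookup finds the leaf.
        leaves = [cube[(product, store)]] if (product, store) in cube else []
    else:
        # Some sparse dimension is free: every matching leaf has to be visited.
        leaves = [
            arr for (p, s), arr in cube.items()
            if product in (None, p) and store in (None, s)
        ]
    # Dense restrictions translate into direct array positions.
    rows = range(len(DATES)) if date is None else [DATES.index(date)]
    cols = range(len(CTYPES)) if ctype is None else [CTYPES.index(ctype)]
    return sum(arr[r][c] for arr in leaves for r in rows for c in cols)


all_dims = query("p", "s1", "ret")                    # restriction on all sparse dims
dense_only = query(ctype="ret", date="2/1/07")        # restriction only on dense dims
mixed = query(store="s1", ctype="ret", date="2/1/07") # some sparse and dense
```

When only dense dimensions are restricted, the sketch has to touch every leaf, which is why it matters whether the sparse index fits into memory.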

Queries

• Selection on all attributes: (p,s1,ret,all)

Queries

• Only on dense attributes: (-,-,ret,"2/1/07")

Queries

• Only some sparse and dense attributes: (-,s1,ret,"2/1/07")

Specialized Indexing Structures

• B-trees (not covered),

• Bitmapped indices,

• Join indices,

• Spatial data structures (covered later)

Index Structures

• Indexing principle: mapping key values to records for associative direct access

• Most popular indexing technique in relational databases: B+-trees

• For multi-dimensional data, a large number of indexing techniques have been developed, e.g., R-trees

Bitmap Indexes

• Bitmap index: an indexing technique that has attracted attention in multi-dimensional DB implementation

Base table:

Customer  City     Car
c1        Detroit  Ford
c2        Chicago  Honda
c3        Detroit  Honda
c4        Poznan   Ford
c5        Paris    BMW
c6        Paris    Nissan

Bitmap Indexes

• The index consists of bitmaps:

Bitmaps for City:

Row  Chicago  Detroit  Paris  Poznan
1    0        1        0      0
2    1        0        0      0
3    0        1        0      0
4    0        0        0      1
5    0        0        1      0
6    0        0        1      0

Bitmaps for Car:

Row  BMW  Ford  Honda  Nissan
1    0    1     0      0
2    0    0     1      0
3    0    0     1      0
4    0    1     0      0
5    1    0     0      0
6    0    0     0      1

• Index on a particular column
• Index consists of a number of bit vectors (bitmaps)
• Each value in the indexed column has its own bit vector
• The length of the bit vector is the number of records in the base table
• The i-th bit is set if the i-th row of the base table has that value in the indexed column
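
As a rough illustration of how such bitmaps are derived from the base table, the following Python sketch builds one bit string per distinct value of a column. The helper name `build_bitmap_index` and the use of strings instead of packed bit vectors are assumptions made for readability.

```python
from collections import defaultdict

# Rows of the base table from the slide, in row order 1..6.
rows = [
    ("c1", "Detroit", "Ford"),
    ("c2", "Chicago", "Honda"),
    ("c3", "Detroit", "Honda"),
    ("c4", "Poznan", "Ford"),
    ("c5", "Paris", "BMW"),
    ("c6", "Paris", "Nissan"),
]


def build_bitmap_index(rows, column):
    """Return {value: bit string}; bit i is '1' if row i holds that value."""
    # One bit vector per distinct value, as long as the table has rows.
    index = defaultdict(lambda: ["0"] * len(rows))
    for i, row in enumerate(rows):
        index[row[column]][i] = "1"
    return {value: "".join(bits) for value, bits in index.items()}


city_index = build_bitmap_index(rows, 1)  # column 1 = City
car_index = build_bitmap_index(rows, 2)   # column 2 = Car
```

Each row sets exactly one bit across the bitmaps of a column, so every row of the bitmap tables contains a single 1.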

Bitmap Index

Figure: age index → bitmaps → data records. The age index is a tree over the age values (18, 19, 20, 21, 22, 23, 25, 26); each age value points to one bitmap.

Data records:

id  name   age
1   joe    20
2   fred   20
3   sally  21
4   nancy  20
5   tom    20
6   pat    25
7   dave   21
8   jeff   26
…

Bitmaps:

age = 20: 1101100000
age = 21: 0010001011

Query: Get people with age = 20 and name = "fred"

List for age = 20:      1101100000
List for name = "fred": 0100000001
Answer is intersection: 0100000000

Well suited for domains with small cardinality
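
The intersection step of this query can be reproduced with integer bit operations. The sketch below is a minimal illustration, assuming the bitmaps are given as bit strings with the leftmost bit standing for record 1; the helper names are hypothetical.

```python
# Bit strings from the slide; the leftmost bit corresponds to record 1.
age_20 = "1101100000"
name_fred = "0100000001"


def bitmap_and(a: str, b: str) -> str:
    """Bitwise AND of two equal-length bit strings via integer arithmetic."""
    if len(a) != len(b):
        raise ValueError("bitmaps must have the same length")
    result = int(a, 2) & int(b, 2)
    # Re-pad with leading zeros so the result keeps one bit per record.
    return format(result, "0{}b".format(len(a)))


def matching_records(bitmap: str):
    """1-based record positions whose bit is set."""
    return [i + 1 for i, bit in enumerate(bitmap) if bit == "1"]


answer = bitmap_and(age_20, name_fred)
record_ids = matching_records(answer)
```

Only record 2 survives the AND, so a single data record has to be fetched to answer the query.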

Bitmap Index – Summary

• With efficient hardware support for bitmap operations (AND, OR, XOR, NOT), a bitmap index offers better access methods for certain queries, e.g., selection on two attributes

• Some commercial products have implemented bitmap indexes

• Works poorly for high-cardinality domains since the number of bitmaps increases

• Difficult to maintain: needs reorganization when relation sizes change (new bitmaps)

Join

• "Combine" SALE, PRODUCT relations

• In SQL: SELECT * FROM SALE, PRODUCT WHERE SALE.prodId = PRODUCT.id

sale:

prodId  storeId  date  amt
p1      c1       1     12
p2      c1       1     11
p1      c3       1     50
p2      c2       1     8
p1      c1       2     44
p1      c2       2     4

product:

id  name  price
p1  bolt  10
p2  nut   5

joinTb:

prodId  name  price  storeId  date  amt
p1      bolt  10     c1       1     12
p2      nut   5      c1       1     11
p1      bolt  10     c3       1     50
p2      nut   5      c2       1     8
p1      bolt  10     c1       2     44
p1      bolt  10     c2       2     4

Join Indexes

product (with join index):

id  name  price  jIndex
p1  bolt  10     r1, r3, r5, r6
p2  nut   5      r2, r4

sale:

rId  prodId  storeId  date  amt
r1   p1      c1       1     12
r2   p2      c1       1     11
r3   p1      c3       1     50
r4   p2      c2       1     8
r5   p1      c1       2     44
r6   p1      c2       2     4

Join Indexes

• Traditional indexes: value → rids.

• Join indexes: tuples in the join → rids in the source tables.

• Data warehouse: values of dimensions of star schema → rows in fact table.

• Join indexes can span multiple dimensions
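
A join index between a dimension table and the fact table can be sketched as a precomputed map from dimension key to fact-table row ids. The Python below uses the product and sale rows from the join index example; the dict-based storage and the names `build_join_index` and `sales_for_product` are assumptions for illustration only.

```python
from collections import defaultdict

# Fact table SALE keyed by row id: (prodId, storeId, date, amt).
sale = {
    "r1": ("p1", "c1", 1, 12),
    "r2": ("p2", "c1", 1, 11),
    "r3": ("p1", "c3", 1, 50),
    "r4": ("p2", "c2", 1, 8),
    "r5": ("p1", "c1", 2, 44),
    "r6": ("p1", "c2", 2, 4),
}

# Dimension table PRODUCT keyed by id: (name, price).
product = {"p1": ("bolt", 10), "p2": ("nut", 5)}


def build_join_index(sale):
    """Map each product id to the rids of the SALE rows that join with it."""
    index = defaultdict(list)
    for rid, (prod_id, _store, _date, _amt) in sale.items():
        index[prod_id].append(rid)
    return dict(index)


join_index = build_join_index(sale)


def sales_for_product(prod_id):
    """Follow the join index instead of scanning the whole fact table."""
    return [sale[rid] for rid in join_index.get(prod_id, [])]


nut_total = sum(amt for _p, _s, _d, amt in sales_for_product("p2"))
```

The index is computed once and then reused by every query that joins on the product dimension, trading maintenance cost for cheaper joins.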

OLAP - Summary

• Data warehouse is a specialized database to support analytical queries = OLAP queries

• Data cube as conceptual model

• Implementation of Data Cube
  – View selection problem
  – Explosion problem
  – ROLAP vs. MOLAP
  – Indexing structures