DATA WAREHOUSING Multi Dimensional Data Modeling. Facts and Dimensions
Page 1: Dw design 1_dim_facts

Page 2: Dw design 1_dim_facts

Page 3: Dw design 1_dim_facts

While an entity-relationship modeling approach from relational database design could be used, the dimensional modeling approach to logical design is more often used for a data warehouse.

Page 4: Dw design 1_dim_facts

End users cannot understand, remember, or navigate an E/R model (not even with a GUI).

One reason is that an enterprise-level ERM would be too complex to understand.

Page 5: Dw design 1_dim_facts

Software cannot usefully query an E/R model

Page 6: Dw design 1_dim_facts

Use of E/R modeling doesn’t meet the DW purpose: intuitive and high performance querying

Page 7: Dw design 1_dim_facts

Fact table:
Sales_Fact (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $ . . .)

Dimension tables:
Time_Dim (TimeKey, TheDate, . . .)
Employee_Dim (EmployeeKey, EmployeeID, . . .)
Product_Dim (ProductKey, ProductID, . . .)
Customer_Dim (CustomerKey, CustomerID, . . .)
Shipper_Dim (ShipperKey, ShipperID, . . .)
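
The tables listed on this slide can be sketched as DDL. This is a minimal sketch using Python's built-in sqlite3 module, not the author's implementation. The table and key names come from the slide; the column types and the name SalesAmount (standing in for the "$" measure) are assumptions.

```python
import sqlite3

# Table and key names follow the slide. Column types are assumed, and
# SalesAmount is a hypothetical name for the "$" measure.
DIMENSIONS = {
    "Time_Dim": ("TimeKey", "TheDate TEXT"),
    "Employee_Dim": ("EmployeeKey", "EmployeeID TEXT"),
    "Product_Dim": ("ProductKey", "ProductID TEXT"),
    "Customer_Dim": ("CustomerKey", "CustomerID TEXT"),
    "Shipper_Dim": ("ShipperKey", "ShipperID TEXT"),
}


def create_star_schema(conn):
    """Create the five dimension tables and the central Sales_Fact table."""
    for table, (key, attribute) in DIMENSIONS.items():
        conn.execute(
            f"CREATE TABLE {table} ({key} INTEGER PRIMARY KEY, {attribute})"
        )
    # Every dimension key appears in the fact table as a foreign key.
    key_columns = ", ".join(
        f"{key} INTEGER NOT NULL REFERENCES {table}({key})"
        for table, (key, _) in DIMENSIONS.items()
    )
    all_keys = ", ".join(key for key, _ in DIMENSIONS.values())
    conn.execute(
        f"CREATE TABLE Sales_Fact ({key_columns}, SalesAmount REAL, "
        f"PRIMARY KEY ({all_keys}))"
    )


conn = sqlite3.connect(":memory:")
create_star_schema(conn)
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```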

Page 8: Dw design 1_dim_facts

Fact table columns: Geographic, Product, Time (dimensions); Units, $ (measures, the facts)

Dimension tables: Geographic, Product, Time

Several distinct dimensions, combined with facts, enable you to answer business questions.
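
One way to picture "answering a business question" is a join from the fact table to its dimensions followed by a GROUP BY. The sketch below assumes hypothetical column names (GeoKey, Region, Year, Amount) and made-up rows; only the dimension names Geographic, Product, and Time and the measures Units and $ come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Geographic_Dim (GeoKey INTEGER PRIMARY KEY, Region TEXT);
CREATE TABLE Product_Dim (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE Time_Dim (TimeKey INTEGER PRIMARY KEY, Year INTEGER);
CREATE TABLE Sales_Fact (GeoKey INTEGER, ProductKey INTEGER, TimeKey INTEGER,
                         Units INTEGER, Amount REAL);
INSERT INTO Geographic_Dim VALUES (1, 'North'), (2, 'South');
INSERT INTO Product_Dim VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO Time_Dim VALUES (1, 2023), (2, 2024);
INSERT INTO Sales_Fact VALUES
    (1, 1, 1, 10, 100.0),
    (1, 2, 1, 5, 75.0),
    (2, 1, 2, 8, 80.0),
    (1, 1, 2, 4, 40.0);
""")


def sales_by_region_and_year(conn):
    """Business question: how many units and dollars per region and year?"""
    return conn.execute("""
        SELECT g.Region, t.Year, SUM(f.Units), SUM(f.Amount)
        FROM Sales_Fact AS f
        JOIN Geographic_Dim AS g ON g.GeoKey = f.GeoKey
        JOIN Time_Dim AS t ON t.TimeKey = f.TimeKey
        GROUP BY g.Region, t.Year
        ORDER BY g.Region, t.Year
    """).fetchall()


report = sales_by_region_and_year(conn)
for row in report:
    print(row)
```

The dimensions supply the labels to group and filter by; the facts supply the numbers being summed.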

Page 9: Dw design 1_dim_facts

Dimensions are normally textual descriptions of the business.

Dimensions

Page 10: Dw design 1_dim_facts

Dimension tables contain relatively small amounts of relatively static data.

Dimensions

Page 11: Dw design 1_dim_facts

A dimension table is usually not normalized.

Dimensions
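
As an illustration of a dimension table that is not normalized, the sketch below flattens a product-to-category lookup into one wide row per product. The products, categories, and column names are hypothetical and do not come from the slides.

```python
# Hypothetical normalized source data: products point to a category lookup.
categories = {10: "Beverages", 20: "Snacks"}
products = [(1, "Cola", 10), (2, "Chips", 20), (3, "Lemonade", 10)]


def build_product_dim(products, categories):
    """Flatten product -> category into one wide dimension row per product."""
    return [
        {
            "ProductKey": key,
            "ProductName": name,
            # The category name is repeated on every row instead of
            # living in a separate lookup table.
            "CategoryName": categories[category_id],
        }
        for key, name, category_id in products
    ]


product_dim = build_product_dim(products, categories)
for row in product_dim:
    print(row)
```

The repeated category names cost some storage, but a query can filter or group by category without an extra join.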

Page 12: Dw design 1_dim_facts

Dimensions are independent of each other, not hierarchically related.

Dimensions

Page 13: Dw design 1_dim_facts

Dimensional attributes (the non-key attributes) help to describe the dimensional value.

Dimensional attributes

Page 14: Dw design 1_dim_facts

Facts are (usually numerical) measures of the business.

Facts

Page 15: Dw design 1_dim_facts

The fact table is the largest table in the star schema and holds large volumes of data.

Facts

Page 16: Dw design 1_dim_facts

The fact table is (often) normalized.

Facts

Page 17: Dw design 1_dim_facts

The fact table has a composite primary key made up of foreign keys.

Facts

PK = (FK1, FK2, ..., FKn)
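
A small sketch of what a composite primary key means in practice, using sqlite3. It assumes a reduced fact table with only two dimension keys and a hypothetical SalesAmount measure: each combination of keys may appear once, so a second row for the same combination is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sales_Fact (
        TimeKey INTEGER NOT NULL,
        ProductKey INTEGER NOT NULL,
        SalesAmount REAL,              -- hypothetical measure name
        PRIMARY KEY (TimeKey, ProductKey)
    )
""")


def try_insert(conn, row):
    """Return True if the row was stored, False if the key already exists."""
    try:
        conn.execute("INSERT INTO Sales_Fact VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(conn, (1, 100, 25.0))          # new key combination
other_product = try_insert(conn, (1, 200, 40.0))  # same day, other product
duplicate = try_insert(conn, (1, 100, 99.0))      # same combination again
print(first, other_product, duplicate)
```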

Page 18: Dw design 1_dim_facts

A fact table usually contains one or more numerical facts that occur for the combination of keys that defines each record.

Facts

measures

Page 19: Dw design 1_dim_facts

A fact table contains either detail-level facts or facts that have been aggregated (summary tables)

Facts

Σ
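
The difference between detail-level facts and an aggregated summary table can be sketched as follows. The table names Sales_Detail and Sales_Monthly, the columns, and the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Detail-level facts: one row per sale (made-up data).
CREATE TABLE Sales_Detail (TheDate TEXT, ProductKey INTEGER, SalesAmount REAL);
INSERT INTO Sales_Detail VALUES
    ('2024-01-05', 1, 10.0),
    ('2024-01-20', 1, 15.0),
    ('2024-02-03', 1, 7.0),
    ('2024-01-11', 2, 4.0);

-- Summary table: the same facts rolled up to one row per month and product.
CREATE TABLE Sales_Monthly AS
    SELECT substr(TheDate, 1, 7) AS Month,
           ProductKey,
           SUM(SalesAmount) AS SalesAmount
    FROM Sales_Detail
    GROUP BY substr(TheDate, 1, 7), ProductKey;
""")

monthly = conn.execute(
    "SELECT Month, ProductKey, SalesAmount FROM Sales_Monthly "
    "ORDER BY Month, ProductKey"
).fetchall()
for row in monthly:
    print(row)
```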

Page 20: Dw design 1_dim_facts

Facts are:

additive

semi-additive

non-additive

Facts

Page 21: Dw design 1_dim_facts

Additive facts can be summed across all of the dimensions.

Non-additive facts cannot be added at all. An example of this is averages.

Semi-additive facts can be aggregated along some of the dimensions and not along others. current_Balance is a semi-additive fact: it makes sense to add balances up across all accounts (what's the total current balance for all accounts in the bank?), but it does not make sense to add them up through time (adding up all current balances for a given account for each day of the month does not give us any useful information).

The most useful measures are numeric and additive.

Facts
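
The additivity rules can be checked with a few lines of plain Python. This is a sketch with made-up balances and sales figures; the function names are hypothetical.

```python
# Hypothetical end-of-day balances keyed by (account, day).
balances = {("A", 1): 100.0, ("A", 2): 120.0, ("B", 1): 50.0, ("B", 2): 50.0}


def total_balance_on(day):
    """Semi-additive: summing across accounts for one day is meaningful."""
    return sum(value for (_, d), value in balances.items() if d == day)


def average_balance_of(account):
    """Across time a sum is meaningless, so average the daily balances."""
    values = [value for (a, _), value in balances.items() if a == account]
    return sum(values) / len(values)


# Non-additive: an average price cannot be added or averaged directly.
# Keep the additive components (amount, units) and compute the ratio last.
store_sales = [{"amount": 10.0, "units": 1}, {"amount": 60.0, "units": 3}]


def overall_average_price(rows):
    return sum(r["amount"] for r in rows) / sum(r["units"] for r in rows)


print(total_balance_on(2))
print(average_balance_of("A"))
print(overall_average_price(store_sales))
```

The per-store average prices here are 10 and 20; their plain mean (15) differs from the true overall average price, which is why the ratio is computed from the additive parts.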

Page 22: Dw design 1_dim_facts

Grain: the atomic level of data of the business process.

It is a definition of the highest level of detail that is supported in a data warehouse.

Page 23: Dw design 1_dim_facts

A fact table usually contains facts with the same level of aggregation.

A proper dimensional design allows only facts of a uniform grain (the same dimensionality) to coexist in a single fact table.
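
A short sketch of why mixed grains in one fact table cause trouble: if a monthly total is stored next to the daily rows it summarizes, a plain sum counts the same sales twice. The rows and field names are made up for the example.

```python
# Made-up rows: two daily facts plus a monthly total of those same days.
mixed_rows = [
    {"grain": "day", "period": "2024-01-01", "amount": 10.0},
    {"grain": "day", "period": "2024-01-02", "amount": 20.0},
    {"grain": "month", "period": "2024-01", "amount": 30.0},
]


def naive_total(rows):
    """Sum every row, ignoring grain (double counts the monthly row)."""
    return sum(r["amount"] for r in rows)


def total_at_grain(rows, grain):
    """Sum only the rows recorded at one grain."""
    return sum(r["amount"] for r in rows if r["grain"] == grain)


print(naive_total(mixed_rows))
print(total_at_grain(mixed_rows, "day"))
```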

Page 24: Dw design 1_dim_facts

Some perfectly good fact tables represent measurements that have no facts! This kind of measurement is often called an event. The classic example of such a factless fact table is a record representing a student attending a class on a specific day. The dimensions are Day, Student, Professor, Course, and Location, but there are no obvious numeric facts. The tuition paid and grade received are good facts but not at the grain of the daily attendance.
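
The attendance example can be sketched as a table that holds only dimension keys. The table name Attendance_Fact, the key column names, and the rows are assumptions; the five dimensions come from the slide. With no numeric fact to sum, questions are answered by counting rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Factless fact table: only dimension keys, no measure columns.
CREATE TABLE Attendance_Fact (
    DayKey INTEGER, StudentKey INTEGER, ProfessorKey INTEGER,
    CourseKey INTEGER, LocationKey INTEGER
);
INSERT INTO Attendance_Fact VALUES
    (1, 101, 7, 55, 3),
    (1, 102, 7, 55, 3),
    (2, 101, 7, 55, 3),
    (2, 101, 8, 60, 4);
""")

# Each row records that an attendance event happened, so COUNT(*) is the
# natural aggregate.
attendance_per_course = conn.execute(
    "SELECT CourseKey, COUNT(*) FROM Attendance_Fact "
    "GROUP BY CourseKey ORDER BY CourseKey"
).fetchall()
print(attendance_per_course)
```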

Page 25: Dw design 1_dim_facts

Degenerate dimensions are dimensions without attributes (such as a transaction number or order number).

Put the attribute value into the fact table even though it is not an additive fact.
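
A sketch of an order number kept directly in the fact table, with no dimension table of its own. The table name Order_Line_Fact, the columns, and the rows are hypothetical. The order number is never summed, but it is still useful for grouping the line items of one order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- OrderNumber lives in the fact table; there is no Order_Dim to join to.
CREATE TABLE Order_Line_Fact (
    OrderNumber TEXT, ProductKey INTEGER, Quantity INTEGER, Amount REAL
);
INSERT INTO Order_Line_Fact VALUES
    ('SO-1001', 1, 2, 20.0),
    ('SO-1001', 2, 1, 15.0),
    ('SO-1002', 1, 5, 50.0);
""")

order_totals = conn.execute(
    "SELECT OrderNumber, COUNT(*), SUM(Amount) FROM Order_Line_Fact "
    "GROUP BY OrderNumber ORDER BY OrderNumber"
).fetchall()
print(order_totals)
```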

Page 26: Dw design 1_dim_facts

Page 27: Dw design 1_dim_facts

Dimension tables (each joined to the fact table on its key):
Employee_Dim (EmployeeKey, EmployeeID, . . .)
Time_Dim (TimeKey, TheDate, . . .)
Product_Dim (ProductKey, ProductID, . . .)
Customer_Dim (CustomerKey, CustomerID, . . .)
Shipper_Dim (ShipperKey, ShipperID, . . .)

Fact table:
Sales_Fact (TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $ . . .)

The dimensional keys TimeKey, EmployeeKey, ProductKey, CustomerKey, and ShipperKey form the multipart key of the fact table; $ . . . are the measures.

The fact table provides statistics for sales broken down by the product, time, employee, shipper, and customer dimensions.

Page 28: Dw design 1_dim_facts

Page 29: Dw design 1_dim_facts

1. Choose the data mart for the small group of end users we deal with.

Choose a business process to model, e.g., orders, invoices, etc.

Page 30: Dw design 1_dim_facts

2. Determine the fact table granularity (the smallest defined level of data in the table).

Page 31: Dw design 1_dim_facts

3. Select the fact table dimensions.

Choose the dimensions that will apply to each fact table record.

Add dimensions for "everything you know" about this grain.

Page 32: Dw design 1_dim_facts

4. Determine the facts for the table. In most cases, the granularity is at the transaction level, so the fact is the amount.

Choose the measure that will populate each fact table record

Add numeric measured facts true to the grain
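
The four design steps can be kept together as a simple checklist. The class below is a hypothetical illustration, not part of any library or of the original slides; the example values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class FactTableDesign:
    """One record per fact table, filled in step by step."""
    business_process: str = ""                      # step 1
    grain: str = ""                                 # step 2
    dimensions: list = field(default_factory=list)  # step 3
    facts: list = field(default_factory=list)       # step 4

    def missing_steps(self):
        """Return the names of the steps that have not been decided yet."""
        steps = {
            "business_process": self.business_process,
            "grain": self.grain,
            "dimensions": self.dimensions,
            "facts": self.facts,
        }
        return [name for name, value in steps.items() if not value]


draft = FactTableDesign(business_process="orders", grain="one row per order line")
complete = FactTableDesign(
    business_process="orders",
    grain="one row per order line",
    dimensions=["Time", "Product", "Customer"],
    facts=["amount"],
)
print(draft.missing_steps())
print(complete.missing_steps())
```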

Page 33: Dw design 1_dim_facts

Ralph Kimball, Margy Ross. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling. Second Edition.