Presented by Bryan Cafferky, Business Intelligence Consultant, BPC Global Solutions LLC *** October 18th – Providence SQL Saturday! ** [email protected] www.sql-fy.com
Page 1: Dimensional Modeling

Presented by Bryan Cafferky, Business Intelligence Consultant

BPC Global Solutions LLC

*** October 18th – Providence SQL Saturday! **

[email protected]

Page 2: Dimensional Modeling
Page 3: Dimensional Modeling

Kimball, Ralph, and Margy Ross. The Data Warehouse Toolkit. Wiley. Print.

Corr, Lawrence. Agile Data Warehouse Design. DecisionOne Press, 2011. Print.

Page 4: Dimensional Modeling
Page 5: Dimensional Modeling
Page 6: Dimensional Modeling
Page 7: Dimensional Modeling
Page 8: Dimensional Modeling

An example of a dimension…

A user might want to group by store state.

Page 9: Dimensional Modeling

An example of a fact table…

Sales amount is a fact.

Page 10: Dimensional Modeling

An example of a fact table…

A dimension can contain many dimension attributes.

A fact table has facts (or measures) and a key to each related dimension.

Page 11: Dimensional Modeling

• Dimensions relate directly to the fact table and never to other dimensions.

• The dimensions are denormalized, i.e. Sales_Person_Region does not have a related region lookup table as an OLTP design would likely have.

• Usually the dimension keys are NOT keys from the source systems; rather, they are generated by the data warehouse load process and are called surrogate keys.

• The dimension attributes you define determine the granularity, called the grain, of the facts, i.e. how detailed the measures are.

• Warning! This is not a relational design so be careful if you are an OLTP developer.
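
The points above can be sketched with a tiny star schema. This is only an illustration using Python's built-in sqlite3 module; the table names (dim_store, fact_sales), the column names, and the sample rows are made up for the example and do not come from the slides.

```python
import sqlite3

# Hypothetical star schema: one dimension (dim_store) and one fact table (fact_sales).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_store (
        store_key   INTEGER PRIMARY KEY,  -- surrogate key assigned by the load
        store_name  TEXT,
        store_state TEXT                  -- denormalized attribute, no lookup table
    );
    CREATE TABLE fact_sales (
        store_key    INTEGER REFERENCES dim_store (store_key),
        sales_amount REAL                 -- the fact (measure)
    );
    INSERT INTO dim_store VALUES
        (1, 'Boston', 'MA'), (2, 'Worcester', 'MA'), (3, 'Providence', 'RI');
    INSERT INTO fact_sales VALUES (1, 100.0), (2, 50.0), (3, 75.0), (1, 25.0);
""")

# Group the fact by a dimension attribute: total sales per store state.
sales_by_state = conn.execute("""
    SELECT d.store_state, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_store AS d ON d.store_key = f.store_key
    GROUP BY d.store_state
    ORDER BY d.store_state
""").fetchall()

print(sales_by_state)
```

The fact table joins straight to the dimension on the surrogate key, and the user's "group by store state" question needs only that one join.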

Page 12: Dimensional Modeling

A SQL statement walks into a bar and sees two tables. It approaches, and asks “may I join you?”

Page 13: Dimensional Modeling

Dimensions Facts

Page 14: Dimensional Modeling

When a dimension relates to another dimension you have a snowflake.

• Beware! OLTP designers must resist the urge to normalize by creating snowflakes.

• Snowflakes cause a number of performance and usability issues and are rarely justified.
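
One way to picture avoiding a snowflake is to copy the lookup value onto the dimension row during the load. The sketch below assumes a hypothetical source where sales people point at a separate region table; the names and data are invented for illustration.

```python
# Snowflaked source (hypothetical): sales person rows point at a region lookup.
regions = {10: "Northeast", 20: "Southwest"}
sales_people = [
    {"sales_person_key": 1, "name": "Ann", "region_id": 10},
    {"sales_person_key": 2, "name": "Raj", "region_id": 20},
]


def flatten(people, region_lookup):
    """Copy the region name onto each dimension row so no lookup join is needed."""
    return [
        {
            "sales_person_key": person["sales_person_key"],
            "name": person["name"],
            "sales_person_region": region_lookup[person["region_id"]],
        }
        for person in people
    ]


# Denormalized dimension: the region is now just another attribute.
dim_sales_person = flatten(sales_people, regions)
print(dim_sales_person)
```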

Page 15: Dimensional Modeling

• Surrogate Keys – artificially created keys (usually integers) used only by the data warehouse to uniquely identify a row in a dimension table.

• Grain – level of detail a fact row represents. For example, sale amount of a single item at a given date/time by salesperson A in the Boston store.

• Conformed Dimension – Different source systems (CRM versus Sales) often differ in the list of dimension values they support. The CRM system may not have closed branches but the sales system does. A consolidated list of dimension values that supports all the source systems' values is called a conformed dimension. Conformed dimensions are critical for a successful data warehouse.
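
A conformed dimension can be pictured as the consolidated list of values from every source. The sketch below is a simplified illustration: the branch codes, the conform helper, and the rule that the first source to mention a code supplies its name are all assumptions made for the example.

```python
# Hypothetical branch lists; the sales system still carries a closed branch (HFD).
crm_branches = {"BOS": "Boston", "PVD": "Providence"}
sales_branches = {"BOS": "Boston", "PVD": "Providence", "HFD": "Hartford"}


def conform(*sources):
    """Merge branch lists from several systems into one dimension."""
    merged = {}
    for source in sources:
        for code, name in source.items():
            # Assumption: the first system to mention a code supplies its name.
            merged.setdefault(code, name)
    # Assign an integer surrogate key to each consolidated value.
    return [
        {"branch_key": key, "branch_code": code, "branch_name": merged[code]}
        for key, code in enumerate(sorted(merged), start=1)
    ]


dim_branch = conform(crm_branches, sales_branches)
print(dim_branch)
```

Facts loaded from either system can then reference the same dim_branch rows, which is what lets reports combine CRM and sales data.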

Page 16: Dimensional Modeling

• Required to implement history of slowly changing dimensions.

• Avoids conflicts among backend application keys.

• Insulates the data warehouse from backend application changes.

• Different backend applications may use different columns as the dimension key.

• Note: Typically a surrogate key is just an integer.
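
As a rough sketch of how a load process might hand out surrogate keys, the class below maps each (source system, source key) pair to the next integer. SurrogateKeyMap and its key_for method are hypothetical names, and a real load would persist the mapping rather than keep it in memory.

```python
from itertools import count


class SurrogateKeyMap:
    """Hands out integer surrogate keys for (source system, source key) pairs."""

    def __init__(self):
        self._next_key = count(1)
        self._keys = {}

    def key_for(self, source_system, natural_key):
        business_key = (source_system, natural_key)
        if business_key not in self._keys:
            # First time this source key is seen: generate a new surrogate key.
            self._keys[business_key] = next(self._next_key)
        return self._keys[business_key]


keys = SurrogateKeyMap()
crm_customer = keys.key_for("CRM", "1001")
sales_customer = keys.key_for("SALES", "1001")  # same id, different system
crm_again = keys.key_for("CRM", "1001")

print(crm_customer, sales_customer, crm_again)
```

The two systems reuse the id "1001" without conflict, and the warehouse key stays stable even if a backend application later changes its own key scheme.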

Page 17: Dimensional Modeling
Page 18: Dimensional Modeling

1. Choose the business process

2. Declare the grain

3. Identify the dimensions

4. Identify the facts

Page 19: Dimensional Modeling

1. Choose the business process

The design builds on the actual business process that the data warehouse should cover, so the first step is to describe that process. This could, for instance, be a sales situation in a retail store. The description can be written in plain text, in basic Business Process Modeling Notation (BPMN), or in another notation such as the Unified Modeling Language (UML).

Page 20: Dimensional Modeling

2. Declare the grain

The grain is the exact description of what a single row of the dimensional model represents. This could for instance be “An individual line item on a customer slip from a retail store”. To clarify what the grain means, pick the central process and describe it in one sentence. That sentence is what you build your dimensions and fact table from. You may need to return to this step and alter the grain as you learn more about what the model is supposed to deliver.
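
A declared grain can be checked mechanically: if the grain is one line item on a customer slip, no two fact rows should share the same slip and line. The sketch below assumes hypothetical column names (slip_number, line_number) and sample rows.

```python
from collections import Counter

# Assumed grain: one row per line item on a customer slip.
GRAIN = ("slip_number", "line_number")


def grain_violations(rows, grain=GRAIN):
    """Return the grain values that appear on more than one fact row."""
    counts = Counter(tuple(row[column] for column in grain) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


fact_rows = [
    {"slip_number": 1001, "line_number": 1, "quantity": 2},
    {"slip_number": 1001, "line_number": 2, "quantity": 1},
    {"slip_number": 1002, "line_number": 1, "quantity": 5},
    {"slip_number": 1001, "line_number": 2, "quantity": 1},  # duplicate line item
]

violations = grain_violations(fact_rows)
print(violations)
```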

Page 21: Dimensional Modeling

3. Identify the dimensions

The dimensions must be defined within the grain from the second step of the 4-step process. Dimensions are the foundation of the fact table and supply the context by which its data is grouped and filtered. Typically dimensions are nouns like date, store, inventory etc. The dimensions hold the descriptive attributes; for example, the date dimension could contain data such as year, month and weekday.
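
A date dimension with the attributes mentioned above can be generated rather than loaded from a source system. This is a minimal sketch: the build_date_dimension function, the column names, and the yyyymmdd integer used as date_key are assumptions for illustration, not something the slides prescribe.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def build_date_dimension(start, end):
    """Build one dimension row per calendar day from start to end inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date_key": day.year * 10000 + day.month * 100 + day.day,
            "full_date": day.isoformat(),
            "year": day.year,
            "month": day.month,
            "weekday": WEEKDAYS[day.weekday()],  # weekday() returns 0 for Monday
        })
        day += timedelta(days=1)
    return rows


dim_date = build_date_dimension(date(2013, 12, 2), date(2013, 12, 10))
print(dim_date[0])
```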

Page 22: Dimensional Modeling

4. Identify the facts

After defining the dimensions, which supply the keys for the fact table, the next step is to identify the numeric facts that will populate each fact table row. This step is closely tied to the business users of the system, since the facts are what they come to the data warehouse to analyze. Most facts are numeric figures such as quantity or cost per unit, and the most useful are additive, meaning they can be summed across any dimension.
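
To make "additive" concrete, the sketch below uses made-up line items. Quantity can simply be summed, while a per-unit figure has to be multiplied by quantity first; the field names are hypothetical.

```python
# Hypothetical fact rows at line-item grain.
line_items = [
    {"quantity": 2, "unit_cost": 5.00},
    {"quantity": 1, "unit_cost": 22.50},
]

# Quantity is additive: summing it across rows gives a meaningful total.
total_quantity = sum(item["quantity"] for item in line_items)

# Unit cost is not meaningful when summed on its own, so extend it by
# quantity first and sum the extended amount instead.
total_cost = sum(item["quantity"] * item["unit_cost"] for item in line_items)

print(total_quantity, total_cost)
```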

Page 23: Dimensional Modeling

1. Choose the business process

2. Declare the grain

3. Identify the dimensions

4. Identify the facts

Page 24: Dimensional Modeling
Page 25: Dimensional Modeling
Page 26: Dimensional Modeling
Page 27: Dimensional Modeling
Page 28: Dimensional Modeling

Taken from The Data Warehouse Toolkit by Ralph Kimball and Margy Ross, Wiley Computer Publishing

Page 29: Dimensional Modeling
Page 30: Dimensional Modeling

Taken from Agile Data Warehouse Design by Lawrence Corr, DecisionOne Press

Mary Jones buys 1 book for $22.50 entitled “Agile Data Warehouse Design” on December 2, 2013 at 3:12 PM via Amazon.com using her Visa card to be delivered on December 10, 2013 by UPS.

Page 31: Dimensional Modeling

Taken from Agile Data Warehouse Design by Lawrence Corr, DecisionOne Press

• How?

• What?

• When?

• Where?

• Who?

• How Many?

• Why?

Page 32: Dimensional Modeling

Taken from Agile Data Warehouse Design by Lawrence Corr, DecisionOne Press

Mary Jones buys 1 book for $22.50 entitled “Agile Data Warehouse Design” on December 2, 2013 at 3:12 PM via Amazon.com using her Visa card to be delivered on December 10, 2013 by UPS.

How much? (Fact)

What?

Who?

When?

Where?

How?
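
The labels above can be turned into a rough data structure: every W except "how many" becomes a candidate dimension, and "how many" holds the candidate facts. The assignment of phrases from the Mary Jones story to each W below is one reading of the sentence, not taken from the slide, and split_event is a hypothetical helper.

```python
# One possible tagging of the Mary Jones story (an assumption, not the slide's).
order_event = {
    "who": {"customer": "Mary Jones"},
    "what": {"product": "Agile Data Warehouse Design"},
    "when": {"order_time": "2013-12-02 15:12", "delivery_date": "2013-12-10"},
    "where": {"outlet": "Amazon.com"},
    "how": {"payment": "Visa", "carrier": "UPS"},
    "how_many": {"quantity": 1, "amount": 22.50},
}


def split_event(event):
    """Separate candidate dimensions from candidate facts."""
    facts = event["how_many"]
    dimensions = {w: details for w, details in event.items() if w != "how_many"}
    return dimensions, facts


dimensions, facts = split_event(order_event)
print(sorted(dimensions))
print(facts)
```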

Page 33: Dimensional Modeling

Taken from Agile Data Warehouse Design by Lawrence Corr, DecisionOne Press

Boston University orders 2,000 books for $10,000.00 using a Corporate Amazon Account Membership discount with Amazon credit financed at 8% interest.

How much? (Fact)

What?

Who?

When?

Where?

How?

Page 34: Dimensional Modeling

Taken from Agile Data Warehouse Design by Lawrence Corr, DecisionOne Press

• Intuitive and natural for business users.

• Efficient way to get the required details.

• Provides a jumping-off point to get other information, such as “Mary ordered via the internet. Are there other outlets for buying products?” or “Mary is an individual. Do you have groups or corporate customers?”

• Helps you to focus on a single process at a time.

Page 35: Dimensional Modeling

Q: Why did the dimension take all day to take off its suit and put on a pair of jeans?

A: It was a slowly changing dimension.

Page 36: Dimensional Modeling
Page 37: Dimensional Modeling
Page 38: Dimensional Modeling
Page 39: Dimensional Modeling
Page 40: Dimensional Modeling
Page 41: Dimensional Modeling
Page 42: Dimensional Modeling
Page 43: Dimensional Modeling
Page 44: Dimensional Modeling

• Dimensional Modeling

• Facts and Dimensions

• Key Terms: Surrogate Key, Grain, Conformed Dimension, Bus Matrix

• The Star Schema

• Snowflaking Dimensions

• Slowly Changing Dimensions

• The Date Dimension