Data Warehouse design models in higher education courses
Patrizia Poščić, Associate Professor [email protected]
Danijela Subotić, Teaching Assistant [email protected]
Department of Informatics, University of Rijeka
Radmile Matejčić 2, 51000 Rijeka, Croatia
http://www.inf.uniri.hr
Overview
• Introduction
• DW architecture
• Modeling practices
– Entity-relationship model
– Data Vault model
– Dimensional model
• Conclusion
Introduction
• Selected Topics in Databases
• Graduate study, 1st year
• Data warehouse (DW) design as a topic
• Integrating several data modeling practices for complete DW design
• Practical assignment at the end of the semester
DW architecture
Modeling practices
• Modeling of existing database (DB) sources
– Entity-relationship model
– Relational model
• Modeling the enterprise data warehouse (EDW) as a system of record
– Data Vault model
• Modeling data marts (DM)
– Dimensional model
Business case
• We use a business case of a DW for a company that sells outdoor and adventure equipment
• All data model examples (shown on the following slides) were made in Erwin 9.5 and are based on IDEF1X
Entity-Relationship (ER) model
Sales DB
Marketing DB
Data Vault model
• A data modeling method that supports the design of data warehouses for long-term storage of historical data collected from various data sources
• Based on the assumption that the DW environment is constantly changing
• Highlights the need to track the origin of the data contained in the database, through an empirically defined set of metadata
• Enables tracing each value back to its source and tracking the history of changes
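The origin-tracking idea above can be illustrated with a minimal sketch. It assumes that each stored row carries two metadata columns, a load date and a record source, which is the usual Data Vault convention. The table name `sat_customer`, the column names, the sample rows, and the helper `trace` are hypothetical and are not taken from the course material; only the source names Sales DB and Marketing DB come from the business case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sat_customer (
    customer_key  TEXT NOT NULL,  -- business key (hypothetical)
    load_date     TEXT NOT NULL,  -- metadata: when the row reached the DW
    record_source TEXT NOT NULL,  -- metadata: which source system supplied it
    city          TEXT,           -- descriptive attribute
    PRIMARY KEY (customer_key, load_date)
);
INSERT INTO sat_customer VALUES
    ('C-100', '2015-01-10', 'SalesDB',     'Rijeka'),
    ('C-100', '2015-03-02', 'MarketingDB', 'Zagreb');
""")


def trace(conn, customer_key):
    """Return every stored version of one customer, oldest first."""
    return conn.execute(
        "SELECT load_date, record_source, city FROM sat_customer "
        "WHERE customer_key = ? ORDER BY load_date",
        (customer_key,),
    ).fetchall()


history = trace(conn, "C-100")
for load_date, record_source, city in history:
    print(load_date, record_source, city)
```

Each value can be traced to the system that delivered it, and the ordered rows form the change history of the customer.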
Data Vault model
• There is no distinction between good and bad data: all data is stored at all times, regardless of whether it conforms to business rules, which avoids loss of information
• Structural data is explicitly separated from descriptive attributes, regardless of whether they come from the same source
• The model is flexible to changes in the business environment
• Allows gap analysis and trend projections
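A possible way to picture the separation of structural and descriptive data is the standard Data Vault trio of hubs (business keys), links (relationships between keys), and satellites (descriptive attributes). The sketch below is only an illustration under that assumption; all table and column names are hypothetical, and the negative price is an invented example of data that breaks a plausible business rule yet is still stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Structural data: hubs hold business keys, links hold relationships.
CREATE TABLE hub_customer (
    customer_key TEXT PRIMARY KEY, load_date TEXT, record_source TEXT);
CREATE TABLE hub_product (
    product_key TEXT PRIMARY KEY, load_date TEXT, record_source TEXT);
CREATE TABLE link_order (
    customer_key TEXT REFERENCES hub_customer(customer_key),
    product_key  TEXT REFERENCES hub_product(product_key),
    load_date TEXT, record_source TEXT,
    PRIMARY KEY (customer_key, product_key)
);
-- Descriptive data: a satellite holds attributes, with no business-rule checks.
CREATE TABLE sat_product (
    product_key TEXT REFERENCES hub_product(product_key),
    load_date TEXT, record_source TEXT,
    name TEXT, unit_price REAL,
    PRIMARY KEY (product_key, load_date)
);
""")

conn.execute("INSERT INTO hub_product VALUES ('P-1', '2015-01-10', 'SalesDB')")
# A negative price is "bad" data, but it is stored as delivered by the source.
conn.execute(
    "INSERT INTO sat_product VALUES ('P-1', '2015-01-10', 'SalesDB', 'Tent', -5.0)"
)

# Business rules are applied only when data is read for downstream use.
stored = conn.execute("SELECT COUNT(*) FROM sat_product").fetchone()[0]
valid = conn.execute(
    "SELECT name FROM sat_product WHERE unit_price >= 0"
).fetchall()
print(stored, valid)
```

Because the rule is applied at query time rather than at load time, the questionable row remains available for later analysis or correction.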
Data Vault model
• Any change is implemented in the model as an independent extension of the existing model:
– the changes do not affect current applications
– all versions of the application can be based on the same, evolving DB
– all versions of the model are a subset of the DV model
• Enables fast parallel loading, which reduces overall costs
• Inserts, deletes, and updates of rows are implemented only as additions (nothing is ever lost or overwritten)
• Structural changes of and in data sources result in model expansion, principally through new links and without structural reconstruction of existing DW elements (architectural stability)
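The additions-only behaviour can be sketched in a few lines. This is an assumed, simplified in-memory satellite, not the course's implementation: the class `Satellite`, the `deleted` flag used to represent a source delete, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SatRow:
    key: str
    load_date: str            # ISO date, so string order equals time order
    record_source: str
    city: Optional[str]
    deleted: bool = False     # hypothetical marker for a delete in the source


class Satellite:
    """Append-only store: existing rows are never updated or removed."""

    def __init__(self):
        self.rows = []

    def current(self, key):
        """Return the latest live version of a key, or None."""
        versions = [r for r in self.rows if r.key == key]
        if not versions:
            return None
        latest = max(versions, key=lambda r: r.load_date)
        return None if latest.deleted else latest

    def record(self, key, load_date, record_source, city):
        """Source insert or update: append a row only if the value changed."""
        latest = self.current(key)
        if latest is not None and latest.city == city:
            return
        self.rows.append(SatRow(key, load_date, record_source, city))

    def record_delete(self, key, load_date, record_source):
        """Source delete: append a marker row instead of removing history."""
        self.rows.append(SatRow(key, load_date, record_source, None, deleted=True))


sat = Satellite()
sat.record("C-100", "2015-01-10", "SalesDB", "Rijeka")   # insert
sat.record("C-100", "2015-02-01", "SalesDB", "Rijeka")   # unchanged, no new row
sat.record("C-100", "2015-03-02", "SalesDB", "Zagreb")   # update becomes a new row
sat.record_delete("C-100", "2015-04-01", "SalesDB")      # delete becomes a new row
print(len(sat.rows), sat.current("C-100"))
```

After the delete, the customer no longer has a current version, yet all earlier versions remain in `sat.rows`.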
Dimensional model
• Used almost universally for DM design and presentation
• Distinguished by the star schema design
– a centralized fact table, which contains a multi-part key and one or more numerical business measures
– a fact (a set of measurements) should be tracked at the lowest granularity of the data
– the fact is surrounded by a rich context of dimensions
– dimension tables are denormalized, have a simple key, and store business attributes in the form of textual information
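The star schema described above might look like the following minimal sketch for the outdoor equipment business case. The table names, columns, and sample rows are assumptions made for illustration. The fact table has a multi-part key made of the dimension keys, and the product category is kept directly in the product dimension (denormalized) instead of in a separate table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: simple key plus textual business attributes.
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY, full_date TEXT, month_name TEXT, year INTEGER);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);

-- Fact: multi-part key of dimension keys plus numerical measures.
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    revenue    REAL,
    PRIMARY KEY (date_id, product_id)
);

INSERT INTO dim_date VALUES
    (1, '2015-01-10', 'January', 2015),
    (2, '2015-02-14', 'February', 2015);
INSERT INTO dim_product VALUES
    (1, 'Tent', 'Camping'),
    (2, 'Sleeping bag', 'Camping'),
    (3, 'Kayak', 'Water sports');
INSERT INTO fact_sales VALUES
    (1, 1, 2, 400.0),
    (1, 3, 1, 900.0),
    (2, 2, 5, 250.0);
""")

# A typical DM query: aggregate a measure by a dimension attribute.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(revenue_by_category)
```

Storing facts at the grain of one product per day keeps the detail needed to roll up along any dimension, such as month or category.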