8/11/2019 2010-SQL Saturday WM Presentation
Building a Data Warehouse using SQL Server 2008
Presented by Wes Dumey
Orlando SQL Saturday, October 16, 2010

First Things First
Networking is key at these events. Please take a minute and introduce yourself to the people on your left and right.

Let's Talk Trash
We'll discuss data warehousing with a view of how a trash company like Waste Management could build a data warehouse
All photographs and logos are property of Waste Management, Inc.

Fun Facts about Trash
Municipal solid waste (a.k.a. trash) is generated at a rate of 250 million tons per year (in the USA)
Each person produces an average of 4.5 lbs of trash per day
The nationwide recycling rate in 2008 was 33.2%
*Source: www.epa.gov

About the Presenter
Senior Consultant, Durable Impact Consulting, a Florida-based data warehouse consulting practice
10+ years of experience developing business intelligence solutions
Personal Interests: Economics and Aviation

Agenda
Overview of Data Warehouse principles
Data Modeling and Data Warehouse Architecting exercises
SSIS Example
Question/Answer Session

Let's Get Started
Our client today is Waste Management, Inc.
Our project is to develop a business intelligence solution covering residential and commercial service routes

Problem Definition
We need to solve the following business problems:
  Business has no long-term trend picture of commissioned employee performance
  Business has no ability to verify whether sales contracts are profitable
  Business would like to be able to conduct elasticity modeling on pricing

Steps to Complete Project
Determine metrics to be captured
Analyze source systems
Develop data model
Architect ETL solution
Design and develop reporting/analysis solution

Project Overview
Overview of a data warehouse:
  A centralized database system optimized for analysis that contains information from one or more source systems
  ETL (extract, transform, and load) jobs are created to load the data warehouse
  A reporting package typically sits on top of the data warehouse to provide end-user analysis

Data Modeling Primer
A data model is a logical and physical representation of the star (or snowflake) schemas used for the relational model
Three schematic table types:
  Dimensions: descriptions and attributes
  Facts: measures and quantities
  Aggregates: pre-computed answers (rolled-up facts)
Exercise: Can you think of some dimensions, facts, and aggregates used for this example?

Data Model
Dimensions: Date, Customer, Employee, Route, Vehicle, Rate
Facts: Sales activity, haul activity
Aggregates: Sales amount by employee, hauls by vehicle
How Facts and Dimensions are joined
  By use of a surrogate key (generally a meaningless number)
  Each dimension has a surrogate key as the primary identifier
  Natural keys in the data are used to find the surrogate keys, which are then passed into the fact tables
  This design allows for high performance
Aggregates are joined to facts through the common keys
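
The surrogate key lookup described on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the fact table name edw_sales_fact, the column names, and the sample rows are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: the surrogate key is the primary key; employee_id is the natural key.
    CREATE TABLE edw_employee_dim (
        employee_key  INTEGER PRIMARY KEY AUTOINCREMENT,
        employee_id   TEXT NOT NULL UNIQUE,
        employee_name TEXT NOT NULL
    );
    -- Fact (hypothetical name): stores only surrogate keys and measures.
    CREATE TABLE edw_sales_fact (
        employee_key INTEGER NOT NULL REFERENCES edw_employee_dim (employee_key),
        sales_amount REAL NOT NULL
    );
""")

conn.executemany(
    "INSERT INTO edw_employee_dim (employee_id, employee_name) VALUES (?, ?)",
    [("E100", "Pat"), ("E200", "Sam")],
)

# Source rows arrive carrying natural keys only (made-up sample data).
source_sales = [("E100", 250.0), ("E200", 100.0), ("E100", 50.0)]


def load_sales(conn, rows):
    """Resolve each natural key to its surrogate key, then insert the fact row."""
    for employee_id, amount in rows:
        found = conn.execute(
            "SELECT employee_key FROM edw_employee_dim WHERE employee_id = ?",
            (employee_id,),
        ).fetchone()
        if found is None:
            raise KeyError(f"no dimension row for natural key {employee_id!r}")
        conn.execute(
            "INSERT INTO edw_sales_fact (employee_key, sales_amount) VALUES (?, ?)",
            (found[0], amount),
        )


def sales_by_employee(conn):
    """Aggregate: sales amount by employee, joined through the surrogate key."""
    return conn.execute("""
        SELECT d.employee_name, SUM(f.sales_amount)
        FROM edw_sales_fact AS f
        JOIN edw_employee_dim AS d ON d.employee_key = f.employee_key
        GROUP BY d.employee_name
        ORDER BY d.employee_name
    """).fetchall()


load_sales(conn, source_sales)
print(sales_by_employee(conn))
```

The fact table never stores the natural key, so the join back to the dimension runs on a single integer column.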

Data Warehouse Dimensions
EDW_DATE_DIM (date_key, date attributes, ...)
EDW_CUSTOMER_DIM (customer_key, customer name, customer address, ...)
EDW_EMPLOYEE_DIM (employee_key, employee id, employee name, ...)
EDW_ROUTE_DIM (route_key, route id, route name, city, state, region, ...)
EDW_VEHICLE_DIM (vehicle_key, vehicle id, vehicle type, make, model, year, acquire date, disposal date, ...)
EDW_RATE_DIM (rate_key, rate id, rate type, begin date, end date, current ind, ...) SCD (slowly changing dimension)
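
The begin date, end date, and current indicator columns on EDW_RATE_DIM suggest a Type 2 slowly changing dimension, where a change expires the current row and adds a new version. The slide does not say which SCD type is used, so treat the following as one possible sketch. It uses Python's sqlite3 module in place of SQL Server, and the rate_amount column, the 'Y'/'N' indicator values, and the sample dates are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE edw_rate_dim (
        rate_key    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        rate_id     TEXT NOT NULL,                      -- natural key
        rate_type   TEXT NOT NULL,
        rate_amount REAL NOT NULL,                      -- hypothetical tracked attribute
        begin_date  TEXT NOT NULL,
        end_date    TEXT,                               -- NULL while the row is current
        current_ind TEXT NOT NULL                       -- 'Y' or 'N' (assumed encoding)
    )
""")


def apply_rate_change(conn, rate_id, rate_type, rate_amount, effective_date):
    """Type 2 change: expire the current row (if any) and insert a new version."""
    current = conn.execute(
        "SELECT rate_key, rate_type, rate_amount FROM edw_rate_dim "
        "WHERE rate_id = ? AND current_ind = 'Y'",
        (rate_id,),
    ).fetchone()
    if current is not None:
        if (current[1], current[2]) == (rate_type, rate_amount):
            return current[0]  # nothing changed, keep the existing version
        conn.execute(
            "UPDATE edw_rate_dim SET end_date = ?, current_ind = 'N' WHERE rate_key = ?",
            (effective_date, current[0]),
        )
    cursor = conn.execute(
        "INSERT INTO edw_rate_dim "
        "(rate_id, rate_type, rate_amount, begin_date, end_date, current_ind) "
        "VALUES (?, ?, ?, ?, NULL, 'Y')",
        (rate_id, rate_type, rate_amount, effective_date),
    )
    return cursor.lastrowid


first_key = apply_rate_change(conn, "R1", "residential", 20.0, "2010-01-01")
same_key = apply_rate_change(conn, "R1", "residential", 20.0, "2010-03-01")
second_key = apply_rate_change(conn, "R1", "residential", 22.5, "2010-06-01")

history = conn.execute(
    "SELECT rate_key, rate_amount, begin_date, end_date, current_ind "
    "FROM edw_rate_dim ORDER BY rate_key"
).fetchall()
```

Each version gets its own surrogate key, so fact rows loaded before the price change keep pointing at the rate that applied at the time.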

ETL Solution
ETL = Extract, transform, and load
Typically performed using ETL tools such as SQL Server 2008 Integration Services (SSIS)
Designed to read data from the source system and load it into the star schema
Typically scheduled on a repeating basis to keep data current
Can be simple or very complex
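
To make the three ETL stages concrete, here is a small sketch of a job that can be re-run on a schedule without creating duplicates. It is plain Python with sqlite3 rather than an SSIS package, and the CSV layout, the cleanup rules, and the sample routes are assumptions for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; a real job would read from the source system.
SOURCE_CSV = """route_id,route_name,city,state
RT-1, downtown loop ,orlando,fl
RT-2,Lake Shore,Tampa,FL
"""


def extract(text):
    """Extract: read raw source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace and standardize casing."""
    return [
        (
            row["route_id"].strip(),
            row["route_name"].strip().title(),
            row["city"].strip().title(),
            row["state"].strip().upper(),
        )
        for row in rows
    ]


def load(conn, rows):
    """Load: insert routes whose natural key is not yet present; return rows added."""
    added = 0
    for route in rows:
        exists = conn.execute(
            "SELECT 1 FROM edw_route_dim WHERE route_id = ?", (route[0],)
        ).fetchone()
        if exists is None:
            conn.execute(
                "INSERT INTO edw_route_dim (route_id, route_name, city, state) "
                "VALUES (?, ?, ?, ?)",
                route,
            )
            added += 1
    return added


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE edw_route_dim (
        route_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        route_id   TEXT NOT NULL UNIQUE,               -- natural key
        route_name TEXT NOT NULL,
        city       TEXT NOT NULL,
        state      TEXT NOT NULL
    )
""")

first_run = load(conn, transform(extract(SOURCE_CSV)))   # loads both routes
second_run = load(conn, transform(extract(SOURCE_CSV)))  # re-run adds nothing
```

Checking the natural key before inserting is what lets the same job run repeatedly against an unchanged source.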

Data Architecture Considerations
To stage or not to stage (creating a staging area, a temporary place for source data)
Data volumes will depend on how we build our jobs
Designed for ease of support and maintenance

Auditing
Use batch audit tables to keep track of what is running
Track insert/update metrics
Always know what is going on in your warehouse (and maybe trash, too)
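
A batch audit table can be as simple as one row per job run. The sketch below is an assumption about what such a table might look like, not a layout taken from the presentation: the table name etl_batch_audit, its columns, the status values, and the job names are hypothetical, and sqlite3 again stands in for SQL Server.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE etl_batch_audit (            -- hypothetical audit table
        batch_id      INTEGER PRIMARY KEY AUTOINCREMENT,
        job_name      TEXT NOT NULL,
        status        TEXT NOT NULL,          -- RUNNING, SUCCEEDED, or FAILED
        started_at    TEXT NOT NULL,
        finished_at   TEXT,
        rows_inserted INTEGER,
        rows_updated  INTEGER
    )
""")


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def run_audited(conn, job_name, job):
    """Run job() and record its status and insert/update counts.

    job must return a (rows_inserted, rows_updated) tuple.
    """
    batch_id = conn.execute(
        "INSERT INTO etl_batch_audit (job_name, status, started_at) "
        "VALUES (?, 'RUNNING', ?)",
        (job_name, _now()),
    ).lastrowid
    try:
        inserted, updated = job()
    except Exception:
        conn.execute(
            "UPDATE etl_batch_audit SET status = 'FAILED', finished_at = ? "
            "WHERE batch_id = ?",
            (_now(), batch_id),
        )
        raise
    conn.execute(
        "UPDATE etl_batch_audit SET status = 'SUCCEEDED', finished_at = ?, "
        "rows_inserted = ?, rows_updated = ? WHERE batch_id = ?",
        (_now(), inserted, updated, batch_id),
    )
    return batch_id


def failing_job():
    raise RuntimeError("source system unavailable")


run_audited(conn, "load_route_dim", lambda: (2, 0))
try:
    run_audited(conn, "load_rate_dim", failing_job)
except RuntimeError:
    pass  # the failure is already recorded in the audit table

audit_rows = conn.execute(
    "SELECT job_name, status, rows_inserted, rows_updated "
    "FROM etl_batch_audit ORDER BY batch_id"
).fetchall()
```

Writing the RUNNING row before the job starts means a job that hangs or crashes still leaves a trace of what was in flight.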

Reporting Solution
Create reports using SQL Server Reporting Services
Introduction to SSIS

Question/Answer Session

Additional Resources
Durable Impact white papers: www.durableimpact.com
Microsoft blogs
Some books of interest: