2010-SQL Saturday WM Presentation

Jun 02, 2018

  • 8/11/2019 2010-SQL Saturday WM Presentation

    Building a Data Warehouse using SQL Server 2008

    Presented by Wes Dumey

    Orlando SQL Saturday
    October 16, 2010

    First Things First
    • Networking is key at these events. Please take a minute to introduce yourself to the people on your left and right.

    Let's Talk Trash
    • We'll discuss data warehousing by looking at how a trash company like Waste Management could build a data warehouse

    All photographs and logos are property of Waste Management, Inc.

    Fun Facts about Trash
    • Municipal solid waste (a.k.a. trash) is generated at a rate of 250 million tons per year (in the USA)
    • Each person produces an average of 4.5 lbs of trash per day
    • The nationwide recycling rate in 2008 was 33.2%

    *Source: www.epa.gov

    About the Presenter
    • Senior Consultant, Durable Impact Consulting, a Florida-based data warehouse consulting practice
    • 10+ years of experience developing business intelligence solutions
    • Personal interests: economics and aviation

    Agenda
    • Overview of data warehouse principles
    • Data modeling and data warehouse architecting exercises
    • SSIS example
    • Question/answer session

    Let's Get Started
    • Our client today is Waste Management, Inc.
    • Our project is to develop a business intelligence solution covering residential and commercial service routes

    Problem Definition
    • We need to solve the following business problems:
        • Business has no long-term trend picture of commissioned employee performance
        • Business has no ability to verify whether sales contracts are profitable
        • Business would like to be able to conduct elasticity modeling on pricing
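
To make the last problem concrete, here is a minimal sketch of one common elasticity measure, the midpoint (arc) price elasticity of demand. The slides do not say which elasticity model the business has in mind, so the formula choice, the function name, and the sample figures are all assumptions for illustration.

```python
def arc_elasticity(p1: float, q1: float, p2: float, q2: float) -> float:
    """Midpoint (arc) price elasticity of demand between two observations.

    p1, q1: price and quantity before the change
    p2, q2: price and quantity after the change
    """
    pct_change_quantity = (q2 - q1) / ((q1 + q2) / 2)
    pct_change_price = (p2 - p1) / ((p1 + p2) / 2)
    return pct_change_quantity / pct_change_price


# Hypothetical figures: a monthly residential rate rises from $20 to $22
# and the number of subscribed customers falls from 1000 to 950.
elasticity = arc_elasticity(20.0, 1000, 22.0, 950)
print(round(elasticity, 3))  # magnitude below 1 means demand is price-inelastic
```

A warehouse that keeps rate history alongside sales facts is what makes the before and after observations available for this kind of calculation.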

    Steps to Complete Project
    • Determine metrics to be captured
    • Analyze source systems
    • Develop data model
    • Architect ETL solution
    • Design and develop reporting/analysis solution

    Project Overview
    • Overview of a data warehouse:
        • A centralized database system optimized for analysis that contains information from one or more source systems
        • ETL (extract, transform, and load) jobs are created to load the data warehouse
        • A reporting package typically sits on top of the data warehouse to provide end-user analysis

    Data Modeling Primer
    • A data model is a logical and physical representation of the star (or snowflake) schemas used for the relational model
    • Three schematic table types:
        • Dimensions: descriptions and attributes
        • Facts: measures and quantities
        • Aggregates: pre-computed answers (rolled-up facts)

    Exercise: Can you think of some dimensions, facts, and aggregates used for this example?

    Data Model
    • Dimensions: Date, Customer, Employee, Route, Vehicle, Rate
    • Facts: Sales activity, haul activity
    • Aggregates: Sales amount by employee, hauls by vehicle

    How Facts and Dimensions Are Joined
    • By use of a surrogate key (generally a meaningless number)
    • Each dimension has a surrogate key as the primary identifier
    • Natural keys in the data are used to find the surrogate keys, which are then passed into the fact tables
    • This design allows for high performance
    • Aggregates are joined to facts through the common keys
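
The surrogate-key lookup described above can be sketched in a few lines. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the fact table name edw_sales_fact, its columns, and the sample rows are assumptions that do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.executescript("""
CREATE TABLE edw_employee_dim (
    employee_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    employee_id   TEXT UNIQUE,                        -- natural key from the source
    employee_name TEXT
);
CREATE TABLE edw_sales_fact (                         -- hypothetical fact table
    employee_key INTEGER REFERENCES edw_employee_dim (employee_key),
    sales_amount REAL
);
""")
conn.executemany(
    "INSERT INTO edw_employee_dim (employee_id, employee_name) VALUES (?, ?)",
    [("E100", "Ann"), ("E200", "Bob")],
)

# Source rows carry only the natural key (employee id), not the surrogate key.
source_sales = [("E100", 250.0), ("E200", 100.0), ("E100", 50.0)]

for employee_id, amount in source_sales:
    # Use the natural key to find the surrogate key, then store it in the fact.
    (employee_key,) = conn.execute(
        "SELECT employee_key FROM edw_employee_dim WHERE employee_id = ?",
        (employee_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO edw_sales_fact (employee_key, sales_amount) VALUES (?, ?)",
        (employee_key, amount),
    )

# An aggregate such as "sales amount by employee" is a roll-up of the fact rows.
sales_by_employee = conn.execute("""
    SELECT d.employee_name, SUM(f.sales_amount)
    FROM edw_sales_fact AS f
    JOIN edw_employee_dim AS d ON d.employee_key = f.employee_key
    GROUP BY d.employee_name
    ORDER BY d.employee_name
""").fetchall()
print(sales_by_employee)
```

The fact table stores only the small integer key, which is one reason joins in this design tend to be fast.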

    Data Warehouse Dimensions
    • EDW_DATE_DIM (date_key, date attributes, ...)
    • EDW_CUSTOMER_DIM (customer_key, customer name, customer address, ...)
    • EDW_EMPLOYEE_DIM (employee_key, employee id, employee name, ...)
    • EDW_ROUTE_DIM (route_key, route id, route name, city, state, region, ...)
    • EDW_VEHICLE_DIM (vehicle_key, vehicle id, vehicle type, make, model, year, acquire date, disposal date, ...)
    • EDW_RATE_DIM (rate_key, rate id, rate type, begin date, end date, current ind, ...) SCD (slowly changing dimension)
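
The begin date, end date, and current indicator columns on the rate dimension suggest a history-keeping (often called Type 2) slowly changing dimension. The slides do not spell out the update logic, so the following is a sketch under that assumption, again using sqlite3 as a stand-in for SQL Server. The rate_amount column and the function name apply_rate_change are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("""
CREATE TABLE edw_rate_dim (
    rate_key    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    rate_id     TEXT,                               -- natural key
    rate_type   TEXT,
    rate_amount REAL,                               -- hypothetical attribute
    begin_date  TEXT,
    end_date    TEXT,                               -- NULL while the row is current
    current_ind INTEGER
)""")


def apply_rate_change(conn, rate_id, rate_type, rate_amount, effective_date):
    """Expire the current row for rate_id (if any), then insert a new current row."""
    conn.execute(
        "UPDATE edw_rate_dim SET end_date = ?, current_ind = 0 "
        "WHERE rate_id = ? AND current_ind = 1",
        (effective_date, rate_id),
    )
    conn.execute(
        "INSERT INTO edw_rate_dim "
        "(rate_id, rate_type, rate_amount, begin_date, end_date, current_ind) "
        "VALUES (?, ?, ?, ?, NULL, 1)",
        (rate_id, rate_type, rate_amount, effective_date),
    )


apply_rate_change(conn, "R1", "residential", 20.0, "2010-01-01")
apply_rate_change(conn, "R1", "residential", 22.0, "2010-07-01")

rows = conn.execute(
    "SELECT rate_key, rate_amount, begin_date, end_date, current_ind "
    "FROM edw_rate_dim ORDER BY rate_key"
).fetchall()
for row in rows:
    print(row)
```

Each change produces a new surrogate key, so facts loaded before the change keep pointing at the old rate row while new facts pick up the current one.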

    ETL Solution
    • ETL = Extract, transform, and load
    • Typically performed using ETL tools such as SQL Server 2008 Integration Services (SSIS)
    • Designed to read data from the source system and load it into the star schema
    • Typically scheduled on a repeating basis to keep data current
    • Can be simple or very complex
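
The three ETL stages can be sketched as three small functions. This is not SSIS; it is a plain Python illustration with sqlite3 as the target, and the CSV layout, the cleanup rules, and the function names are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with inconsistent spacing and casing.
SOURCE_CSV = """route_id,route_name,city,state
 r-01 ,north loop,orlando,fl
r-02,Lake Run,Tampa,FL
"""


def extract(text):
    """Extract: read source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace and standardize casing."""
    return [
        (
            row["route_id"].strip().upper(),
            row["route_name"].strip().title(),
            row["city"].strip().title(),
            row["state"].strip().upper(),
        )
        for row in rows
    ]


def load(conn, rows):
    """Load: insert the cleaned rows into the dimension table."""
    conn.executemany(
        "INSERT INTO edw_route_dim (route_id, route_name, city, state) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )


conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("""
CREATE TABLE edw_route_dim (
    route_key  INTEGER PRIMARY KEY AUTOINCREMENT,
    route_id   TEXT,
    route_name TEXT,
    city       TEXT,
    state      TEXT
)""")

load(conn, transform(extract(SOURCE_CSV)))
routes = conn.execute(
    "SELECT route_key, route_id, route_name, city, state "
    "FROM edw_route_dim ORDER BY route_key"
).fetchall()
print(routes)
```

In an SSIS package the same stages would typically appear as a source, one or more transformations, and a destination inside a data flow.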

    Data Architecture Considerations
    • To stage or not to stage (creating a staging area, a temporary place for source data)
    • Data volumes will depend on how we build our jobs
    • Designed for ease of support and maintenance
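
If the answer to "to stage or not to stage" is yes, one possible pattern is to land the source rows in a staging table untouched and then move only new rows into the warehouse. The sketch below assumes that pattern; the staging table stg_customer, the function run_batch, and the sample data are hypothetical, and sqlite3 stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.executescript("""
CREATE TABLE stg_customer (            -- hypothetical staging table
    customer_id   TEXT,
    customer_name TEXT
);
CREATE TABLE edw_customer_dim (
    customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id   TEXT UNIQUE,
    customer_name TEXT
);
""")


def run_batch(conn, source_rows):
    """Reload staging from the source, then add customers the warehouse lacks.

    Returns the number of rows added to the dimension.
    """
    before = conn.execute("SELECT COUNT(*) FROM edw_customer_dim").fetchone()[0]
    conn.execute("DELETE FROM stg_customer")  # staging is temporary: clear it each run
    conn.executemany("INSERT INTO stg_customer VALUES (?, ?)", source_rows)
    conn.execute("""
        INSERT INTO edw_customer_dim (customer_id, customer_name)
        SELECT s.customer_id, s.customer_name
        FROM stg_customer AS s
        WHERE NOT EXISTS (
            SELECT 1 FROM edw_customer_dim AS d
            WHERE d.customer_id = s.customer_id
        )
    """)
    after = conn.execute("SELECT COUNT(*) FROM edw_customer_dim").fetchone()[0]
    return after - before


first_run = run_batch(conn, [("C1", "Acme"), ("C2", "Bolt")])
second_run = run_batch(conn, [("C2", "Bolt"), ("C3", "Crate")])
print(first_run, second_run)
```

Staging costs extra storage and an extra copy step, but it keeps the source system free of long-running queries and gives a place to inspect raw data when a load fails.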

    Auditing
    • Use batch audit tables to keep track of what is running
    • Track insert/update metrics
    • Always know what is going on in your warehouse (and maybe trash, too)
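
A batch audit table can be as simple as one row per job run. The slides do not show the table layout, so the table name edw_batch_audit, its columns, the status values, and the helper functions below are all assumptions. As in the other sketches, sqlite3 stands in for SQL Server.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
conn.execute("""
CREATE TABLE edw_batch_audit (         -- hypothetical audit table
    batch_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    job_name      TEXT,
    started_at    TEXT,
    finished_at   TEXT,
    status        TEXT,
    rows_inserted INTEGER,
    rows_updated  INTEGER
)""")


def start_batch(conn, job_name):
    """Record that a job has started and return its batch id."""
    cursor = conn.execute(
        "INSERT INTO edw_batch_audit (job_name, started_at, status) "
        "VALUES (?, ?, 'RUNNING')",
        (job_name, datetime.now(timezone.utc).isoformat()),
    )
    return cursor.lastrowid


def finish_batch(conn, batch_id, rows_inserted, rows_updated, status="SUCCEEDED"):
    """Record the outcome and the insert/update metrics for a batch."""
    conn.execute(
        "UPDATE edw_batch_audit "
        "SET finished_at = ?, status = ?, rows_inserted = ?, rows_updated = ? "
        "WHERE batch_id = ?",
        (datetime.now(timezone.utc).isoformat(), status,
         rows_inserted, rows_updated, batch_id),
    )


batch_id = start_batch(conn, "load_customer_dim")
# The ETL job itself would run here and report its row counts.
finish_batch(conn, batch_id, rows_inserted=120, rows_updated=8)

audit_row = conn.execute(
    "SELECT job_name, status, rows_inserted, rows_updated "
    "FROM edw_batch_audit WHERE batch_id = ?",
    (batch_id,),
).fetchone()
print(audit_row)
```

Rows still marked RUNNING long after their start time are a quick way to spot jobs that hung or crashed.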

    Reporting Solution
    • Create reports using SQL Server Reporting Services

    Introduction to SSIS

    Question/Answer Session

    Additional Resources
    • Durable Impact white papers: www.durableimpact.com
    • Microsoft blogs
    • Some books of interest:
