Page 1: Sales Data Warehouse

Sales Data Warehouse

Advisor : Dr. Irwin Levinstein

Presented By : Kalyan Yadavalli

Page 2: Sales Data Warehouse

Contents

• Mission of the project
• Need for a Data Warehouse
• Data Warehouse - Overview
• Sales Data Warehouse
• Conclusion

Page 3: Sales Data Warehouse

Mission of the project

The mission of this project is to provide strategic and tactical support to the Marketing-Sales and Advertising departments of a media company through the acquisition and analysis of data pertaining to their customers and markets.

This project helps to identify areas of readership and marketing through creation of a Data Warehouse that will provide a media company with a better understanding of its customers and markets.

Page 4: Sales Data Warehouse

Need For a Data Warehouse

To provide an environment in which writing and maintaining queries and reports requires relatively little knowledge of the technical aspects of database technology.

To provide a means to speed up the writing and maintaining of queries and reports by technical personnel.

For example, a query that requests the total sales income and quantity sold for a range of products in a specific geographical region for a specific time period can typically be answered in a few seconds or less regardless of how many hundreds of millions of rows of data are stored in the data warehouse database.

To make it easier to query and report, on a regular basis, on data from multiple transaction processing systems and external data sources.

To prevent persons who only need to query and report transaction processing system data from having any access whatsoever to transaction processing system databases and logic used to maintain those databases.

Page 5: Sales Data Warehouse

Data Warehouse – Overview

A data warehouse is a copy of data combined from different data sources specifically structured for querying and reporting.

Data warehouses support business decisions by collecting, consolidating, and organizing data for reporting and analysis with tools such as online analytical processing (OLAP) and data mining.

Dimensional Modeling VS Entity-Relationship Modeling

An OLTP system requires a normalized structure to minimize redundancy, provide validation of input data, and support a high volume of fast transactions. A transaction usually involves a single business event, such as placing an order or posting an invoice payment. An OLTP model often looks like a spider web of hundreds or even thousands of related tables.

In contrast, a typical dimensional model uses a star design that is easy to understand and relate to business needs, supports simplified business queries, and provides superior query performance by minimizing table joins.

Page 6: Sales Data Warehouse

Dimensional Modeling VS Entity-Relationship Modeling

This project uses dimensional modeling, the logical design technique often used for data warehouses. It differs from entity-relationship modeling.

Entity-relationship modeling is a logical design technique that seeks to eliminate data redundancy, while dimensional modeling seeks to present data in a standard framework that is intuitive and allows for high-performance access.

For example, a query that requests the total sales income and quantity sold for a range of products in a specific geographical region for a specific time period can typically be answered in a few seconds or less regardless of how many hundreds of millions of rows of data are stored in the data warehouse database.
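The kind of query described above can be sketched against a tiny star schema. This is only an illustration: it uses Python's built-in `sqlite3` module as a stand-in for the project's SQL Server database, and the table names (`FactSales`, `DimDate`, `DimRegion`, `DimProduct`), columns, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical miniature star schema: one fact table joined to three dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate    (DateKey INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);
CREATE TABLE DimRegion  (RegionKey INTEGER PRIMARY KEY, RegionName TEXT);
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE FactSales  (DateKey INTEGER, RegionKey INTEGER, ProductKey INTEGER,
                         UnitsSold INTEGER, DollarsSold REAL);
INSERT INTO DimDate VALUES (1, 2005, 1), (2, 2005, 2), (3, 2006, 1);
INSERT INTO DimRegion VALUES (1, 'East'), (2, 'West');
INSERT INTO DimProduct VALUES (1, 'Daily'), (2, 'Sunday'), (3, 'Weekend');
INSERT INTO FactSales VALUES
    (1, 1, 1, 10, 100.0),
    (2, 1, 2,  5,  40.0),
    (2, 2, 1,  7,  70.0),
    (3, 1, 1,  3,  30.0);
""")


def sales_summary(conn, region, year, products):
    """Total units and dollars for a set of products in one region and year."""
    placeholders = ",".join("?" for _ in products)
    sql = f"""
        SELECT COALESCE(SUM(f.UnitsSold), 0), COALESCE(SUM(f.DollarsSold), 0.0)
        FROM FactSales f
        JOIN DimDate d    ON d.DateKey = f.DateKey
        JOIN DimRegion r  ON r.RegionKey = f.RegionKey
        JOIN DimProduct p ON p.ProductKey = f.ProductKey
        WHERE r.RegionName = ? AND d.Year = ?
          AND p.ProductName IN ({placeholders})
    """
    return conn.execute(sql, [region, year, *products]).fetchone()


units, dollars = sales_summary(conn, "East", 2005, ["Daily", "Sunday"])
print(units, dollars)
```

Every dimension joins directly to the fact table, so the query needs one join per filter and no chains of joins through intermediate tables.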

Page 7: Sales Data Warehouse

Entity–Relationship Modeling

[Diagram: entity-relationship model with the entities Customer, Campaign History, Payment, Salesperson, City, Demographics, SalesConditions, CustomerSubscriptions, District, Zones, Channel, Carrier History, Carrier Master, Campaign Offer]

Page 8: Sales Data Warehouse

Dimensional Modeling

Fact Table: Subscription Sales
• Keys: EffectiveDateKey, CustomerKey, SubscriptionsKey, PaymentKey, CampaignKey, SalesPersonKey, RouteKey, DemographicsKey
• Facts: UnitsSold, DollarsSold, DiscountCost, PremiumCost

Dimensions: Customer, Campaign, Payment, Salesperson, Route, Subscriptions, Date, Demographics

Page 9: Sales Data Warehouse

Kimball dimensional lifecycle diagram, with the following stages:
• Project Planning
• Business Requirement Definition
• Technical Architecture Design
• Product Selection & Installation
• Dimensional Modeling
• Physical Design
• Data Staging Design & Development
• End-User Application Specification
• End-User Application Development
• Deployment
• Maintenance and Growth
• Project Management

Page 10: Sales Data Warehouse

Sales Data Warehouse

• Business Users Requirements
• Technical Architecture
• Product [Software] Selection
• Dimensional Modeling
• Logical Design
• Data Staging Design & Development
• Building Data Cube using SQL Analysis Services
• End User Application Specification & Development
• Deployment
• Maintenance & Growth

Page 11: Sales Data Warehouse

Requirements Gathering

This phase involves the following steps:
• Collect some business questions the users want answered.
• Gather details/requirements from the business users.
• Get user sign-off on the business questions.

Business Questions:
• Can we profile our "best subscribers" to pull lists of "like" non-subscribers that we could touch in some way?
• Who exists in the marketplace and have we touched them?
• Can we build a loyalty model based on a subscriber's payment history?

Page 12: Sales Data Warehouse

Sales Data Warehouse High Level Technical Architecture

Source Systems -> (Extract) -> Data Staging Area -> (Load) -> Presentation Area -> (Access) -> Data Access Tools

Source Systems
• Marketing/Sales Data
• Name Phone Data
• Demographics

Data Staging Area
• Services: Transform from source to target. Maintain conformed dimensions.
• Data Storage: Flat files or relational tables.
• Design Goals: Staging throughput. Integrity and consistency.

Presentation Area
• Subscription Sales: Dimensional. Atomic and summary data. Business process.
• Design Goals: Ease-of-use. Query performance.
• Dimensional Bus: Conformed facts and dimensions.

Data Access Tools
• SQL Reporting Services
• Excel
• Access

Page 13: Sales Data Warehouse

Product Selection

Hardware Specs:
• AMD Opteron Processor 252, 2.6 GHz, 3.83 GB RAM
• Operating System: Windows Server 2003

Software Specs:
• Kimball Data Warehouse Tool [Create staging and production databases]
• Microsoft® SQL Server™ 2000 [ETL (Extract, Transform, Load)]
• Microsoft® SQL Server™ 2005 Integration Services [Nightly Automation]
• Microsoft® SQL Server™ 2005 Analysis Services [Create OLAP Data Cube]
• Microsoft® SQL Server™ 2005 Reporting Services [End User Reports]
• Internet Information Services [IIS 6.0] [Web Server to Host the Reports]

Page 14: Sales Data Warehouse

Dimensional Modeling

• Design Dimensions
  • Attributes of the dimension
  • Hierarchy in the dimension
• Dimensional Bus Matrix
• Design Fact Tables

Page 15: Sales Data Warehouse

Dimension Hierarchy-Subscriptions

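[Diagram: hierarchy of the Subscriptions dimension, with the following attributes]
• Subscription Name
• Rate, Service, Term, Publication
• Rate Year, Rate Area, Rate Type
• Discount Category
• Frequency Groups
• Term Length Groups
• Short or Long Term
• Business Group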

Page 16: Sales Data Warehouse

Dimensional Bus Matrix

Dimensions (columns): Date, Demographics, Sales Conditions, Salesperson, Campaign, …

Business Processes (rows):
• Subscription Sales (starts): X X X X X
• Subscription Tracking: X X X X
• Complaints: X X X
• Stops: X X X
• Upgrades and Downgrades: X X X X X

Page 17: Sales Data Warehouse

SubscriptionSales Fact Table

Keys: EffectiveDateKey, CampaignKey, SalespersonKey, CustomerKey, DemographicsKey, SubscriptionKey, …

Grain: Each subscription sold

Facts: Units Sold, Number of Sales (=1), Dollars Sold, Discount Cost, Premium Cost

Design Fact Tables
• Choose the business process as the fact table
• Declare the grain
• Choose the dimensions
• Choose the facts
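The grain and the "Number of Sales (=1)" fact can be illustrated with a short Python sketch. The class, field names, and sample values below are hypothetical. The only point is that, with one row per subscription sold and a constant fact of 1, counting sales becomes an ordinary sum like any other additive fact.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SubscriptionSale:
    """One instance per subscription sold (the declared grain)."""
    campaign_key: int
    units_sold: int
    dollars_sold: float
    number_of_sales: int = 1  # constant fact, so a count is just a SUM


def totals_by_campaign(rows):
    """Roll the additive facts up to the campaign level."""
    totals = defaultdict(lambda: {"sales": 0, "units": 0, "dollars": 0.0})
    for row in rows:
        bucket = totals[row.campaign_key]
        bucket["sales"] += row.number_of_sales
        bucket["units"] += row.units_sold
        bucket["dollars"] += row.dollars_sold
    return dict(totals)


rows = [
    SubscriptionSale(campaign_key=1, units_sold=13, dollars_sold=30.0),
    SubscriptionSale(campaign_key=1, units_sold=26, dollars_sold=55.0),
    SubscriptionSale(campaign_key=2, units_sold=52, dollars_sold=99.0),
]
result = totals_by_campaign(rows)
print(result)
```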

Page 18: Sales Data Warehouse

Dimension Model: SubscriptionSales

Fact table: Subscription Sales (starts)
• Grain: Each subscription sold
• Measures: Units Sold, Number of Sales (=1), Dollars Sold, Discount Cost, Premium Cost

Dimensions: SalesConditions, Campaign, Demographics, Address, Route, Salesperson, Subscription, Loyalty-Payment, Customer, Date

Page 19: Sales Data Warehouse

Logical Design

Fact Table Design

Dimension Table Design

Slowly Changing Dimensions
• Type 1: The new record replaces the original record. No trace of the old record exists.
• Type 2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two people.
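The difference between the two types can be sketched in a few lines of Python. This is a minimal illustration, not the project's ETL code: the `CustomerRow` class and the `valid_from`/`valid_to` columns are assumptions, added because a Type 2 dimension needs some way to mark the current row.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    surrogate_key: int
    customer_id: int                  # business key from the source system
    billing_method: str
    valid_from: date                  # assumed housekeeping column
    valid_to: Optional[date] = None   # None marks the current row (assumed)


def apply_type1(dim: List[CustomerRow], customer_id: int, billing_method: str) -> None:
    """Type 1: overwrite in place, so the old value is lost."""
    for row in dim:
        if row.customer_id == customer_id:
            row.billing_method = billing_method


def apply_type2(dim: List[CustomerRow], customer_id: int,
                billing_method: str, change_date: date) -> None:
    """Type 2: close the current row and add a new row with a new surrogate key."""
    current = next(r for r in dim
                   if r.customer_id == customer_id and r.valid_to is None)
    current.valid_to = change_date
    new_key = max(r.surrogate_key for r in dim) + 1
    dim.append(CustomerRow(new_key, customer_id, billing_method, change_date))


dim = [CustomerRow(1, 100, "mail", date(2004, 1, 1))]
apply_type1(dim, 100, "email")                       # still one row
apply_type2(dim, 100, "autopay", date(2005, 6, 1))   # now two rows
print(dim)
```

After the Type 2 change the same customer appears under two surrogate keys, which is why the dimension treats the customer "essentially as two people".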

Page 20: Sales Data Warehouse

Fact Table - SubscriptionSales

Column Name | Data Type | NULL? | Key? | FK To Dimension | Description
EffectiveDateKey | int | N | FK | DimDate | Key of effective date
EnteredDateKey | int | N | FK | DimDate | Key of date entered in the system
CustomerKey | int | N | FK | DimCustomer | Key of customer
LoyaltyKey | int | N | FK | DimLoyalty | Key of loyalty score
PaymentKey | int | N | FK | DimPaymentHistory | Key of payment behavior
SalesPersonKey | int | N | FK | DimSalesPerson | Key of sales person for the change
CampaignKey | int | N | FK | DimCampaign | Key of campaign
SalesConditionsKey | int | N | FK | DimSalesConditions | Key of sales conditions
SubscriptionKey | int | N | FK | DimSubscription | Key of subscription
PersonKey | int | N | FK | dimPerson | Key of person on the subscription

Page 21: Sales Data Warehouse

Dimension Table - CustomerSubscriptions

Column Name | Data Type | NULL? | Key? | FK To Dimension | Description | Slowly changing dimension type
CustomerSubscriptionKey | int | N | PK ID | | Surrogate Primary Key |
AddressNum | int | Y | BK | | Business key of the subscription summary record address |
SubscriptionNum | int | Y | BK | | Business key of the numbered subscription at the address |
BusinessKey | int | Y | BK | | Concatenated business key |
CustomerID | int | N | | | Unique identifier for this customer | 1
BillingMethod | int | | | | The method of delivery for the customer's bill | 2
OriginalStartDateKey | int | | FK | DimDate | The earliest start date on record for this customer | 1
StartDateKey | int | | FK | DimDate | The start date of this customer's current subscription | 2
StopDateKey | int | | FK | DimDate | The most recent stop date for this customer | 2
ExpireDateKey | int | | FK | DimDate | The expiration date of this customer's current or most recent subscription | 2

Page 22: Sales Data Warehouse

Data Staging

The Data Staging process consists of the following sub-processes.

Extracting: Reading and understanding the source data, and copying the parts that are needed to the staging area.

Transforming: Possible transformation steps in the data staging area:
• Cleaning the data: correcting misspellings, dealing with missing data elements, parsing into standard formats.
• Purging selected data that is not required.
• Combining data sources, by matching exactly on key values or performing fuzzy matches on non-key attributes.
• Creating surrogate keys for each dimensional record.
• Building aggregates to boost performance of common queries.

Loading: Loading the transformed data into the production database.
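Two of the transformation steps above, cleaning and surrogate key creation, can be sketched as follows. This is a simplified illustration in plain Python. The field names, the "UNKNOWN" placeholder for missing values, and the `SurrogateKeyGenerator` helper are hypothetical and not taken from the project.

```python
def clean(record):
    """Standardize one extracted record: trim text, normalize case, fill gaps."""
    return {
        "customer_id": record["customer_id"],
        # Missing city becomes an explicit placeholder (assumed convention).
        "city": (record.get("city") or "UNKNOWN").strip().upper(),
        # Keep only the 5-digit ZIP as the standard format (assumed convention).
        "zip": (record.get("zip") or "").strip()[:5],
    }


class SurrogateKeyGenerator:
    """Hands out one integer surrogate key per distinct business key."""

    def __init__(self):
        self._keys = {}

    def key_for(self, business_key):
        if business_key not in self._keys:
            self._keys[business_key] = len(self._keys) + 1
        return self._keys[business_key]


extracted = [
    {"customer_id": 100, "city": " suffolk ", "zip": "23434-1234"},
    {"customer_id": 101, "city": None, "zip": "23510"},
    {"customer_id": 100, "city": "Suffolk", "zip": "23434"},
]

keys = SurrogateKeyGenerator()
staged = []
for record in extracted:
    row = clean(record)
    row["customer_key"] = keys.key_for(row["customer_id"])
    staged.append(row)

print(staged)
```

The surrogate key is independent of the source system's identifier, so the same business key always maps to the same warehouse key no matter how often it is extracted.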

Page 23: Sales Data Warehouse

Data Staging -ETL Architecture

[Diagram: ETL architecture, with the following components]
• Common Source Database
• Source Schema: Creates the tables for the common source database.
• Kimball Data Modeling tool: Creates the staging and production database tables and metadata.
• Source Queries: Creates views (queries) that feed data to production.
• Staging Database
• Production Data Warehouse Database
• DTS (Data Transformation Services) Package
• Setup files on the SQL server that is running the DTS packages:
  • Database Configuration File: Database connection information for the ETL process.
  • Dates Configuration File: Important date info for the ETL process.

Page 24: Sales Data Warehouse

Data Staging –Development for Sales Data Warehouse

The ETL packages perform the following work.
• Extract the full sets of dimension rows.
• Most transformation logic occurs in the extract query, using SQL.
• Extracted rows are stored in a staging table until the ETL package is run.
• There are steps for the staged rows to be fixed up via SQL statements: a statement for deleting bogus rows, and a separate statement for updating rows.
• Find rows that are new; insert them into the target table.
• Use a checksum to find rows that have seen a Type 1 change. Update the appropriate columns in the target table.
• Use a checksum to find rows that have seen a Type 2 change. Propagate a new dimension row.
• Log the number of rows extracted, staged, deleted and updated from the staging tables, inserted into target, and Type 1 and Type 2 rows updated in target.
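The checksum comparison in the steps above can be sketched in Python. This is an illustration under stated assumptions, not the project's DTS logic. The slides do not say which checksum function is used, so SHA-256 from `hashlib` stands in. The column groupings follow the CustomerSubscriptions table (CustomerID as Type 1, BillingMethod as Type 2), while the row layout and the `classify` helper are hypothetical.

```python
import hashlib

TYPE1_COLUMNS = ("customer_id",)      # overwrite on change
TYPE2_COLUMNS = ("billing_method",)   # keep history on change


def checksum(row, columns):
    """Checksum over the chosen columns (SHA-256 is an assumed stand-in)."""
    payload = "|".join(str(row[c]) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(staged_rows, target_rows, business_key="business_key"):
    """Split staged rows into new, Type 1, Type 2, and unchanged."""
    current = {row[business_key]: row for row in target_rows}
    result = {"new": [], "type1": [], "type2": [], "unchanged": []}
    for row in staged_rows:
        existing = current.get(row[business_key])
        if existing is None:
            result["new"].append(row)
        elif checksum(row, TYPE2_COLUMNS) != checksum(existing, TYPE2_COLUMNS):
            # Simplification: a row with both kinds of change lands here only.
            result["type2"].append(row)
        elif checksum(row, TYPE1_COLUMNS) != checksum(existing, TYPE1_COLUMNS):
            result["type1"].append(row)
        else:
            result["unchanged"].append(row)
    return result


target = [
    {"business_key": 1, "customer_id": 100, "billing_method": "mail"},
    {"business_key": 2, "customer_id": 200, "billing_method": "mail"},
    {"business_key": 3, "customer_id": 300, "billing_method": "email"},
]
staged = [
    {"business_key": 1, "customer_id": 100, "billing_method": "mail"},
    {"business_key": 2, "customer_id": 201, "billing_method": "mail"},
    {"business_key": 3, "customer_id": 300, "billing_method": "autopay"},
    {"business_key": 4, "customer_id": 400, "billing_method": "mail"},
]

result = classify(staged, target)
counts = {kind: len(rows) for kind, rows in result.items()}
print(counts)  # row counts per category, as would be written to the ETL log
```

Comparing one checksum per column group avoids a column-by-column comparison and tells the package directly whether to update in place or add a new dimension row.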

Page 25: Sales Data Warehouse

Dimension-Customer Subscription

Page 26: Sales Data Warehouse

Building Data Cube using SQL Analysis Services

SQL Server Analysis Services 2005 provides tools for developing OLAP applications.

OLAP [Online Analytical Processing] organizes data warehouse data into multidimensional cubes based on the dimensional model, and then preprocesses these cubes to provide maximum performance for queries that summarize data in various ways.

Build the cube using SQL Analysis Services and deploy it to SQL Analysis Services Server.
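The idea of preprocessing a cube can be illustrated with a toy example in Python. It does not reflect how Analysis Services stores or selects aggregations internally. It only shows the principle of computing totals for every combination of dimensions ahead of time, so that a summary query becomes a lookup. The `build_cube` function and the sample facts are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(facts, dimensions, measure):
    """Precompute the measure total for every combination of dimensions."""
    cube = defaultdict(float)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for fact in facts:
                cell = (dims, tuple(fact[d] for d in dims))
                cube[cell] += fact[measure]
    return dict(cube)


facts = [
    {"year": 2005, "channel": "DM Sales", "dollars": 30.0},
    {"year": 2005, "channel": "Carrier Sales", "dollars": 20.0},
    {"year": 2006, "channel": "DM Sales", "dollars": 50.0},
]
cube = build_cube(facts, ("year", "channel"), "dollars")

# Summary queries are now dictionary lookups instead of scans of the facts.
grand_total = cube[((), ())]
total_2005 = cube[(("year",), (2005,))]
dm_sales_total = cube[(("channel",), ("DM Sales",))]
print(grand_total, total_2005, dm_sales_total)
```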

Page 27: Sales Data Warehouse
Page 28: Sales Data Warehouse

End User Application Development

Reporting Services - Uses the Analysis Services Data Cube as Data Source.

Sample Report Screenshots

Page 29: Sales Data Warehouse

Solicitor Sales

Sales Type | Sales Channel | Sales Agent | Number Of Sales | Cost Per Unit | Retention
Carrier Sales | Carrier Sales | Carrier Sales | 2 | $0.00 | 50.0%
DM Sales | DM Sales | DM Sales | 124 | $0.13 | 13.7%
Non-Solicited | Other | 2005 THISISHAMPRDS FREE2WKSAMP | 8 | $0.00 | 12.5%
 | | AD CONTRACT START | 8 | $0.17 | 62.5%
 | | ALLCONNECT | 1,914 | $0.04 | 43.5%
 | | CAN'T AFFORD DM "2 WKS FREE" | 7 | $0.10 | 71.4%
 | | COLLECTIONS TEAM | 15 | $0.14 | 73.3%
 | | COOLSAVINGS.COM | 9,415 | $21.55 | 7.1%
 | | CUSTOMER SERVICE PROMO STARTS | 1,934 | $0.15 | 73.7%
 | | DATA ENTRY STARTS | 49 | $0.14 | 59.2%

Page 30: Sales Data Warehouse

Sales Agent Details

Sales Agent | Source Name | Phone | Address | St | City | ZIP | # Sales | Units Sold | $ Sold | Discount Cost | Premium Cost
AD CONTRACT START | ULTIMATE TAN OF SMITHFIELD | (757) 365-9400 | 13412 BENNS CHURCH BLVD | VA | SMITHFIELD | 23430 | 1 | 260 | $30.99 | $37.91 | $5.00
AD CONTRACT START | CHOREY & ASSOCIATE | (757) 539-7451 | 330 W CONSTANCE RD # 100 | VA | SUFFOLK | 23434 | 1 | 260 | $30.99 | $37.91 | $5.00
AD CONTRACT START | CHOREY AND ASSOCIATE | (757) 539-7454 | 804 W WASHINGTON ST | VA | SUFFOLK | 23434 | 1 | 260 | $30.99 | $37.91 | $5.00
AD CONTRACT START | VIRGINIA STAGE CO | 0 | 254 GRANBY ST | VA | NORFOLK | 23510 | 1 | 260 | $30.99 | $37.91 | $5.00
AD CONTRACT START | SPINE & ORTHAPEDIC CTR, PC | 0 | 6160 KEMPSVILLE CIR # 303A | VA | NORFOLK | 23502 | 1 | 260 | $30.99 | $37.91 | $5.00
AD CONTRACT START | COUNTRYWIDE HOME LOAN | 0 | 3000 WOODLAWN DR | VA | SUFFOLK | 23434 | 1 | 260 | $30.99 | $37.91 | $5.00
AD CONTRACT START | WHITE, E.D. | 0 | 730 10TH ST | VA | VIRGINIA BEACH | 23451 | 1 | 260 | $30.99 | $37.91 | $5.00
AD CONTRACT START | PERMANENT COATING SOLUTIONS IN | (757) 539-4366 | 434 N MAIN ST # D | VA | SUFFOLK | 23434 | 1 | 260 | $30.99 | $37.91 | $5.00
Total | | | | | | | 8 | 2,080 | $247.94 | $303.26 | $40.00

Page 31: Sales Data Warehouse

Deployment

Deploy the reports to the SQL Server 2005 Reporting Services server.

Grant users access to view the reports.

Desktop installation: .NET Framework 2.0, for access to Report Builder.

Page 32: Sales Data Warehouse

Maintenance & Growth

• Training the end users.
• Automated nightly updates to the Data Warehouse.

Page 33: Sales Data Warehouse
Page 34: Sales Data Warehouse

Conclusion

The reports generated from the data warehouse answered the following questions collected from the business users during the requirements gathering phase of the project.
• Identify their best customers/loyal customers [customer subscriptions/subscription sales]
• Non-subscribers who can be reached
• Contact history of customers in the marketplace [demographic data]

Benefits to Marketing
• Increased telemarketing close rates and increased direct mail response rates
• Reduced cost and use of outside telemarketing services and reduced print and mailing costs
• Identification of new product bundling and distribution opportunities
• Increased acquisition and retention rates, and reduced cost of acquisitions

Benefits to Advertising
• An increase in the annual rate of revenue growth
• Increase in new advertisers
• Improved targeting capabilities

Page 35: Sales Data Warehouse

QUESTIONS ?