2 Copyright © Oracle Corporation, 2002. All rights reserved. Defining Data Warehouse Concepts and Terminology

Jan 02, 2016


Pamela Lawson

Defining Data Warehouse Concepts and Terminology


Objectives

After completing this lesson, you should be able to do the following:

• Identify a common, broadly accepted definition of a data warehouse

• Describe the differences between dependent and independent data marts

• Identify some of the main warehouse development approaches

• Recognize some of the operational properties and common terminology of a data warehouse


Definition of a Data Warehouse

“A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of data in support of management’s decisions.”

— W.H. Inmon

“An enterprise structured repository of subject-oriented, time-variant, historical data used for information retrieval and decision support. The data warehouse stores atomic and summary data.”

— Oracle’s Data Warehouse Definition


Data Warehouse Properties

A data warehouse is:

• Subject-oriented

• Integrated

• Time-variant

• Nonvolatile


Subject-Oriented

Data is categorized and stored by business subject rather than by application.

OLTP Applications: Equity Plans, Shares, Insurance, Loans, Savings

Data Warehouse Subject: Customer financial information
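
As a rough illustration of subject orientation, the sketch below regroups rows that arrive per application into one structure keyed by customer. The record layout, field names, and values are hypothetical and are not taken from the course material.

```python
from collections import defaultdict

# Hypothetical rows as each OLTP application might hold them (illustrative data only).
oltp_records = [
    {"application": "Savings", "customer_id": 101, "balance": 2500.0},
    {"application": "Loans", "customer_id": 101, "balance": -12000.0},
    {"application": "Insurance", "customer_id": 102, "balance": 0.0},
    {"application": "Shares", "customer_id": 102, "balance": 4300.0},
]

def organize_by_subject(records):
    """Regroup application-oriented rows under the customer they describe."""
    by_customer = defaultdict(dict)
    for row in records:
        by_customer[row["customer_id"]][row["application"]] = row["balance"]
    return dict(by_customer)

customer_financials = organize_by_subject(oltp_records)
print(customer_financials[101])
```

The same data is present before and after; only the organizing key changes from application to business subject.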


Integrated

Data on a given subject is defined and stored once.

OLTP Applications: Savings, Current Accounts, Loans

Data Warehouse: Customer
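
One way to picture integration: each application identifies the same customer in its own format, and the warehouse stores that customer once under a single convention. The field names and the normalization rule below are assumptions made for this sketch.

```python
# Hypothetical source rows: each application names and formats the customer differently.
savings = [{"cust_no": "C-101", "name": "ANN LEE"}]
current_accounts = [{"customer": "101", "full_name": "Ann Lee"}]
loans = [{"CustomerID": 101, "Name": "ann lee"}]

def normalize(cust_id, name):
    """Map a source-specific id and name onto one warehouse convention."""
    digits = "".join(ch for ch in str(cust_id) if ch.isdigit())
    return int(digits), name.strip().title()

def integrate(*sources):
    """Store each customer once, whichever applications mention them."""
    customers = {}
    for rows, id_field, name_field in sources:
        for row in rows:
            cust_id, name = normalize(row[id_field], row[name_field])
            customers.setdefault(cust_id, {"customer_id": cust_id, "name": name})
    return customers

customer_dim = integrate(
    (savings, "cust_no", "name"),
    (current_accounts, "customer", "full_name"),
    (loans, "CustomerID", "Name"),
)
print(customer_dim)
```

Three source rows collapse into a single customer entry, which is the "defined and stored once" property in miniature.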


Time-Variant

Data is stored as a series of snapshots, each representing a period of time.
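
A minimal sketch of the snapshot idea, assuming month-end balance snapshots held in an in-memory list. The function names and the sample figures are hypothetical.

```python
from datetime import date

# Warehouse table modelled as a list of (snapshot_date, customer_id, balance) rows.
balance_snapshots = []

def take_snapshot(snapshot_date, current_balances):
    """Append the state of every account as of snapshot_date; earlier rows stay."""
    for customer_id, balance in current_balances.items():
        balance_snapshots.append((snapshot_date, customer_id, balance))

def balance_as_of(customer_id, as_of):
    """Return the balance from the most recent snapshot on or before as_of."""
    candidates = [
        (snap_date, balance)
        for snap_date, cust, balance in balance_snapshots
        if cust == customer_id and snap_date <= as_of
    ]
    return max(candidates)[1] if candidates else None

take_snapshot(date(2002, 1, 31), {101: 2500.0})
take_snapshot(date(2002, 2, 28), {101: 3100.0})
print(balance_as_of(101, date(2002, 2, 15)))
```

Because every snapshot is kept, a query can ask what the data looked like at any earlier point, which an operational system holding only current values cannot answer.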


Nonvolatile

Typically data in the data warehouse is not updated or deleted.

Operational: Insert, Update, Delete, or Read

Warehouse: Load, Read
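
The contrast can be sketched as a table type that exposes only load and read operations. The class below is a hypothetical illustration, not an Oracle API; a real warehouse would enforce this through privileges and load procedures.

```python
class WarehouseTable:
    """Append-only table: rows can be loaded and read, not updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Bulk-append a batch of rows, as a load job would."""
        self._rows.extend(dict(row) for row in rows)

    def read(self, predicate=lambda row: True):
        """Return copies of matching rows so callers cannot modify stored data."""
        return [dict(row) for row in self._rows if predicate(row)]

sales = WarehouseTable()
sales.load([{"region": "East", "amount": 120}, {"region": "West", "amount": 80}])
east = sales.read(lambda row: row["region"] == "East")
east[0]["amount"] = 0  # changes only the caller's copy
print(sales.read(lambda row: row["region"] == "East"))
```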


Changing Warehouse Data

Operational Databases → Warehouse Database

• First-time load

• Refresh (repeated)

• Purge or Archive
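
The life cycle on this slide (first-time load, repeated refreshes, then purge or archive) might be sketched as follows. The row layout, the "newer than the latest loaded day" refresh rule, and the cutoff-based archive rule are assumptions chosen to keep the example small.

```python
from datetime import date

warehouse = []   # active warehouse rows
archive = []     # rows moved out of the active warehouse

def first_time_load(operational_rows):
    """Populate the empty warehouse from the operational history."""
    warehouse.extend(operational_rows)

def refresh(operational_rows):
    """Append only rows newer than the latest one already loaded."""
    latest = max((row["day"] for row in warehouse), default=date.min)
    new_rows = [row for row in operational_rows if row["day"] > latest]
    warehouse.extend(new_rows)
    return len(new_rows)

def purge_or_archive(cutoff):
    """Move rows older than cutoff from the warehouse into the archive."""
    old = [row for row in warehouse if row["day"] < cutoff]
    archive.extend(old)
    warehouse[:] = [row for row in warehouse if row["day"] >= cutoff]

source = [{"day": date(2002, 1, 1), "amount": 10}, {"day": date(2002, 2, 1), "amount": 20}]
first_time_load(source)
source.append({"day": date(2002, 3, 1), "amount": 30})
added = refresh(source)
purge_or_archive(date(2002, 2, 1))
print(added, len(warehouse), len(archive))
```

Note that refresh only adds rows and purge only moves them out by age; at no point is an existing warehouse row edited in place.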


Data Warehouse Versus OLTP

Property            OLTP                     Data Warehouse
Response Time       Sub seconds to seconds   Seconds to hours
Operations          DML                      Primarily read-only
Nature of Data      30–60 days               Snapshots over time
Data Organization   Application              Subject, time
Size                Small to large           Large to very large
Data Sources        Operational, internal    Operational, internal, external
Activities          Processes                Analysis


Usage Curves

• Operational system is predictable

• Data warehouse:

  – Variable

  – Random


User Expectations

• Control expectations

• Set achievable targets for query response

• Set SLAs

• Educate

• Growth and use are exponential


Enterprisewide Warehouse

• Large scale implementation

• Covers the entire business

• Data from all subject areas

• Developed incrementally

• Single source of enterprisewide data

• Synchronized enterprisewide data

• Single distribution point to dependent data marts


Data Warehouses Versus Data Marts

Property              Data Warehouse    Data Mart
Scope                 Enterprise        Department
Subjects              Multiple          Single subject, line of business (LOB)
Data Source           Many              Few
Implementation Time   Months to years   Months


Dependent Data Mart

Sources: Operational Systems (Operations Data, Legacy Data), Flat Files, External Data

Data Warehouse: Marketing, Sales, Finance, HR

Data Marts (fed from the data warehouse): Marketing, Sales, Finance


Independent Data Mart

Sources: Operational Systems (Operations Data, Legacy Data), Flat Files, External Data

Data Mart (fed directly from the sources): Sales or Marketing
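
The difference between the two kinds of data mart lies in where the mart gets its data. The sketch below contrasts them under simplifying assumptions: sources are plain lists of rows, and all names and values are hypothetical.

```python
# Hypothetical rows; in practice these would come from database tables or files.
operational_sales = [
    {"dept": "Sales", "customer": "Ann", "amount": 120},
    {"dept": "Marketing", "customer": "Bob", "amount": 45},
]
external_data = [{"dept": "Sales", "customer": "Cy", "amount": 60}]

def build_warehouse(*sources):
    """Combine all sources into one enterprise-wide collection."""
    return [row for source in sources for row in source]

def dependent_mart(warehouse, dept):
    """A dependent mart is filled from the warehouse, never from sources directly."""
    return [row for row in warehouse if row["dept"] == dept]

def independent_mart(dept, *sources):
    """An independent mart extracts from the sources itself, with no warehouse."""
    return [row for source in sources for row in source if row["dept"] == dept]

warehouse = build_warehouse(operational_sales, external_data)
sales_dependent = dependent_mart(warehouse, "Sales")
sales_independent = independent_mart("Sales", operational_sales)
print(len(sales_dependent), len(sales_independent))
```

In this example the independent mart reads only the one source its builders chose, so it ends up with fewer rows than the dependent mart. That kind of divergence between marts is one reason a single distribution point is attractive.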


Typical Data Warehouse Components

• Source Systems: Operational, External, Legacy

• Staging Area

• Presentation Area: Data Warehouse, Data Marts

• Access Tools

• ODS (operational data store)

• Metadata Repository


Warehouse Development Approaches

• “Big bang” approach

• Incremental approach:

  – Top-down incremental approach

  – Bottom-up incremental approach


“Big Bang” Approach

Analyze enterprise requirements

Build enterprise data warehouse

Report in subsets or store in data marts


Top-Down Approach

Analyze requirements at the enterprise level

Develop conceptual information model

Identify and prioritize subject areas

Complete a model of selected subject area

Map to available data

Perform a source system analysis

Implement base technical architecture

Establish metadata, extraction, and load processes for the initial subject area

Create and populate the initial subject area data mart within the overall warehouse framework


Bottom-Up Approach

Define the scope and coverage of the data warehouse and analyze the source systems within this scope

Define the initial increment based on political pressure, assumed business benefit, and data volume

Implement base technical architecture and establish metadata, extraction, and load processes as required by increment

Create and populate the initial subject areas within the overall warehouse framework


Incremental Approach to Warehouse Development

• Multiple iterations

• Shorter implementations

• Validation of each phase

Increment 1 (iterative): Strategy → Definition → Analysis → Design → Build → Production


Data Warehousing Process Components

• Methodology

• Architecture

• Extraction, Transformation, and Load (ETL)

• Implementation

• Operation and Support


Methodology

• Helps ensure a successful data warehouse

• Encourages incremental development

• Provides a staged approach to an enterprisewide warehouse:

  – Safe

  – Manageable

  – Proven

  – Recommended


Architecture

• “Provides the planning, structure, and standardization needed to ensure integration of multiple components, projects, and processes across time.”

• “Establishes the framework, standards, and procedures for the data warehouse at an enterprise level.”

— The Data Warehousing Institute


Extraction, Transformation, and Load (ETL)

“Effective data extract, transform and load (ETL) processes represent the number one success factor for your data warehouse project and can absorb up to 70 percent of the time spent on a typical data warehousing project.”

— DM Review, March 2001

Source → Staging Area → Target
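
The source, staging area, and target flow can be sketched as three small functions. This is an illustrative outline only: the field names, the cleaning rules, and the choice to silently skip rejected rows are assumptions, and production ETL is normally done with dedicated tooling.

```python
def extract(source_rows):
    """Copy raw rows from the source into the staging area unchanged."""
    return [dict(row) for row in source_rows]

def transform(staged_rows):
    """Clean staged rows: trim names, convert amounts to numbers, drop bad rows."""
    cleaned = []
    for row in staged_rows:
        try:
            amount = float(row["amount"])
            customer = row["customer"].strip().title()
        except (KeyError, ValueError):
            continue  # a real process would log or quarantine rejected rows
        cleaned.append({"customer": customer, "amount": amount})
    return cleaned

def load(target, rows):
    """Append transformed rows to the target table."""
    target.extend(rows)
    return len(rows)

source = [
    {"customer": " ann lee ", "amount": "120.50"},
    {"customer": "BOB RAY", "amount": "n/a"},
]
target = []
staging_area = extract(source)
loaded = load(target, transform(staging_area))
print(loaded, target)
```

Keeping the staged copy separate from the source means transformation work never touches the operational system, and a failed transform can be rerun from staging.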


Implementation

Data Warehouse Architecture → Implementation

For example, incremental implementation: Increment 1, Increment 2, ..., Increment n


Operation and Support

• Data access and reporting

• Refreshing warehouse data

• Monitoring

• Responding to change


Phases of the Incremental Approach

• Strategy

• Definition

• Analysis

• Design

• Build

• Production

Increment 1: Strategy → Definition → Analysis → Design → Build → Production


Strategy Phase Deliverables

• Business goals and objectives

• Data warehouse purpose, objectives, and scope

• Enterprise data warehouse logical model

• Incremental milestones

• Source systems data flows

• Subject area gap analysis


Strategy Phase Deliverables (continued)

• Data acquisition strategy

• Data quality strategy

• Metadata strategy

• Data access environment

• Training strategy


Summary

In this lesson, you should have learned how to:

• Identify a common, broadly accepted definition of a data warehouse

• Describe the differences between dependent and independent data marts

• Identify some of the main warehouse development approaches

• Recognize some of the operational properties and common terminology of a data warehouse


Practice 2-1 Overview

This practice covers the following topics:

• Answering questions regarding data warehousing concepts and terminology

• Discussing some data warehouse concepts and terminology