DATA WAREHOUSE By: Ravi Ranjan

May 11, 2015

Technology

sumit621

Transcript
Page 1:

DATA WAREHOUSE

By: Ravi Ranjan

Page 2:

DEFINITION

Data Warehouse: A collection of corporate information, derived directly from operational systems and some external data sources. Its specific purpose is to support business decisions, not business operations.

Page 3:

THE PURPOSE OF DATA WAREHOUSING

Realize the value of data
• Data / information is an asset
• Methods to realize the value (Reporting, Analysis, etc.)

Make better decisions
• Turn data into information
• Create competitive advantage
• Methods to support the decision making process (EIS, DSS, etc.)

Page 4:

Data Warehouse Components

• Staging Area
  • A preparatory repository where transaction data can be transformed for use in the data warehouse

• Data Mart
  • Traditional dimensionally modeled set of dimension and fact tables
  • Per Kimball, a data warehouse is the union of a set of data marts

• Operational Data Store (ODS)
  • Modeled to support near real-time reporting needs.
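To make the "dimension and fact tables" of a data mart concrete, here is a minimal sketch that builds a tiny star schema in an in-memory SQLite database and runs a typical aggregate query against it. The table names, columns, and sample rows are hypothetical illustrations and do not come from the slides.

```python
import sqlite3


def build_sales_mart():
    """Create a tiny star schema: one fact table keyed to two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            amount REAL
        );
        """
    )
    # Hypothetical sample data, for illustration only.
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(1, "widget"), (2, "gadget")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20150101, 2015, 1), (20150201, 2015, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 20150101, 10.0), (1, 20150201, 15.0), (2, 20150101, 7.5)],
    )
    return conn


def sales_by_product(conn):
    """Join the fact table to the product dimension and total the amounts."""
    return conn.execute(
        """
        SELECT p.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()
```

Calling `sales_by_product(build_sales_mart())` returns one total per product. In this layout the fact table holds only keys and measures, while descriptive attributes live in the dimensions.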

Page 5:

DATA WAREHOUSE FUNCTIONALITY

[Diagram: a Data Warehouse Engine with a Metadata Repository. Data sources: Relational Databases, Legacy Data, Purchased Data, ERP Systems. Load stages: Extraction, Cleansing, Optimized Loader. Access: Analyze, Query.]
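The extraction, cleansing, and loading stages named on this slide can be sketched as three small functions. This is only an illustration under stated assumptions: the record layout, the `customer_id` field, and the in-memory list standing in for the warehouse are all hypothetical.

```python
def extract(sources):
    """Gather raw records from every source system into one list."""
    records = []
    for source_name, rows in sources.items():
        for row in rows:
            # Tag each record with the system it came from.
            records.append({**row, "source": source_name})
    return records


def cleanse(records):
    """Trim text fields and drop records that lack a customer id."""
    cleaned = []
    for rec in records:
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        if rec.get("customer_id"):  # hypothetical required field
            cleaned.append(rec)
    return cleaned


def load(warehouse, records):
    """Append cleansed records to the warehouse and report how many were loaded."""
    warehouse.extend(records)
    return len(records)
```

A run chains the three steps, for example `load(warehouse, cleanse(extract(sources)))`, where `sources` maps a source name such as `"erp"` or `"legacy"` to a list of row dictionaries.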

Page 6:

EVOLUTION OF DATA WAREHOUSE ARCHITECTURE

Top-Down Architecture

Bottom-Up Architecture

Enterprise Data Mart Architecture

Data Stage/Data Mart Architecture


Page 7:

VERY LARGE DATA BASES

Terabytes -- 10^12 bytes: Wal-Mart -- 24 Terabytes

Petabytes -- 10^15 bytes: Geographic Information Systems

Exabytes -- 10^18 bytes: National Medical Records

Zettabytes -- 10^21 bytes: Weather images

Yottabytes -- 10^24 bytes: Intelligence Agency Videos

WAREHOUSES ARE VERY LARGE DATABASES
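As a quick check on the arithmetic behind these units, the sketch below converts between the decimal sizes listed on this slide. The function names are hypothetical, and the largest unit uses the standard SI prefix "yotta" for 10^24.

```python
# Decimal (SI) byte units, matching the powers of ten on the slide.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}


def to_bytes(value, unit):
    """Convert a size in the given unit to a number of bytes."""
    return value * UNITS[unit]


def convert(value, from_unit, to_unit):
    """Convert a size from one unit to another."""
    return to_bytes(value, from_unit) / UNITS[to_unit]
```

For example, `to_bytes(24, "terabyte")` expresses the 24-terabyte Wal-Mart figure in bytes, and `convert(24, "terabyte", "petabyte")` shows it is a small fraction of one petabyte.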

Page 8:

COMPLEXITIES OF CREATING A DATA WAREHOUSE

Incomplete errors
• Missing Fields
• Records or Fields That, by Design, are not Being Recorded

Incorrect errors
• Wrong Calculations, Aggregations
• Duplicate Records
• Wrong Information Entered into Source System
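Two of the error types listed above, missing fields and duplicate records, can be detected mechanically. The following is a minimal sketch, assuming records arrive as dictionaries; the function names and field names are hypothetical.

```python
def find_missing_fields(records, required_fields):
    """Return (record index, field) pairs where a required field is absent or empty."""
    problems = []
    for i, rec in enumerate(records):
        for field in required_fields:
            if rec.get(field) in (None, ""):
                problems.append((i, field))
    return problems


def find_duplicates(records, key_fields):
    """Return (first index, duplicate index) pairs for records sharing the same key."""
    seen = {}
    duplicates = []
    for i, rec in enumerate(records):
        key = tuple(rec.get(f) for f in key_fields)
        if key in seen:
            duplicates.append((seen[key], i))
        else:
            seen[key] = i
    return duplicates
```

Checks like these catch incomplete and duplicated rows. Wrong calculations or wrong information entered at the source generally need business rules or reconciliation against another system.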

Page 9:

SUCCESS & FUTURE OF DATA WAREHOUSE

The Data Warehouse has successfully supported the increased needs of the State over the past eight years.

The need for growth continues, however, as the desire for more integrated data increases.

The Data Warehouse has software and tools in place to provide the functionality needed to support new enterprise Data Warehouse projects.

The future capabilities of the Data Warehouse can be expanded to include other programs and agencies.

Page 10:

DATA WAREHOUSE PITFALLS

You are going to spend much time extracting, cleaning, and loading data

You are going to find problems with systems feeding the data warehouse

You will find the need to store/validate data not being captured/validated by any existing system

Large scale data warehousing can become an exercise in data homogenizing

Page 11:

DATA WAREHOUSE PITFALLS…

The time it takes to load the warehouse will expand to fill the available window... and then some

You are building a HIGH maintenance system

You will fail if you concentrate on resource optimization to the neglect of project, data, and customer management issues and an understanding of what adds value to the customer

Page 12:

BEST PRACTICES

Complete requirements and design

Prototyping is key to business understanding

Utilizing proper aggregations and detailed data

Training is an on-going process

Build data integrity checks into your system.
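One way to combine the aggregation and integrity-check practices above is to rebuild a summary from the detailed rows and compare it with the stored summary. This is a sketch under assumptions: the `month` and `amount` fields and both function names are hypothetical.

```python
from collections import defaultdict


def build_monthly_summary(detail_rows):
    """Aggregate detailed rows into a total amount per month."""
    totals = defaultdict(float)
    for row in detail_rows:
        totals[row["month"]] += row["amount"]
    return dict(totals)


def check_summary_integrity(detail_rows, summary, tolerance=1e-9):
    """Return the months whose stored total disagrees with the detailed rows."""
    expected = build_monthly_summary(detail_rows)
    mismatched = []
    for month in sorted(set(expected) | set(summary)):
        difference = expected.get(month, 0.0) - summary.get(month, 0.0)
        if abs(difference) > tolerance:
            mismatched.append(month)
    return mismatched
```

An empty result means the aggregate and the detail agree. Any month listed points to a load or calculation problem worth investigating.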

Page 13:

Top-Down Architecture

Page 14:

Bottom-Up Architecture

Page 15:

Enterprise Data Mart Architecture

Page 16:

Data Stage/Data Mart Architecture

Page 17:

Thank You