Dr. Abdul Basit Siddiqui Assistant Professor FURC (Lecture Slides Week # 2)
Page 1: Dwh lecture slides-week2

Dr. Abdul Basit Siddiqui Assistant Professor

FURC(Lecture Slides Week # 2)

Page 2: Dwh lecture slides-week2

Approach of the Course

Develop an understanding of the underlying RDBMS concepts.

Apply these concepts to VLDB / DSS environments and understand where and why they break down.

Expose the differences between RDBMS and Data Warehouse in the context of VLDB.

Provide the basics of DSS tools such as OLAP, Data Mining and demonstrate their applications.

Demonstrate the application of DSS concepts and limitations of the OLTP concepts through lab exercises.

04/15/23 2 Data Warehouse & Mining - Spring 2014

Page 3: Dwh lecture slides-week2

Summary of the Course

Introduction & Background
De-Normalization
Online Analytical Processing (OLAP)
Dimensional Modeling
Extract-Transform-Load (ETL)
Data Quality Management (DQM)
Parallelism, Join and Indexing Techniques
Data Mining Concepts
Data Cleansing
Association Rule Mining
Clustering
Classification

Page 4: Dwh lecture slides-week2

Books

Reference Books

W. H. Inmon, Building the Data Warehouse, John Wiley & Sons Inc., NY

R. Kimball, The Data Warehouse Toolkit, John Wiley & Sons Inc., NY

Paulraj Ponniah, Data Warehousing Fundamentals, John Wiley & Sons Inc., NY

Page 5: Dwh lecture slides-week2
Page 6: Dwh lecture slides-week2

Why this Course?

The world is changing (in fact, it has already changed).

Either change or be left behind.

Missing the opportunities or going in the wrong direction has prevented us from growing.

What is the right direction?

Harnessing the data in the knowledge-driven economy.

Doing what cannot be automated, or is difficult to automate.

Page 7: Dwh lecture slides-week2

Historical Overview

1960: Master Files and Reports
1965: Lots of Master Files
1970: Direct Memory Access and DBMS
1975: Online High Performance Transaction Processing
1980: PCs and 4GL Technology (MIS/DSS)
1985: Extract Programs, Extract Processing
1990: The Legacy System’s Web

Page 8: Dwh lecture slides-week2

The Need of the Time

We are drowning in data BUT starving for information.

Knowledge is power BUT Intelligence is absolute/super power.

Page 9: Dwh lecture slides-week2

The Need of the Time

Figure: Data -> Information -> Knowledge -> Intelligence -> POWER ($/£)

Page 10: Dwh lecture slides-week2

ABC Pvt Ltd is a company with branches at Karachi, Quetta, Peshawar and Lahore. The Sales Manager wants a quarterly sales report. Each branch has a separate operational system.

Page 11: Dwh lecture slides-week2

Figure: separate branch systems at Karachi, Quetta, Peshawar and Lahore; the Sales Manager asks for sales per item type per branch for the first quarter.

Page 12: Dwh lecture slides-week2

Solution 1: ABC Pvt Ltd.

Extract sales information from each database.

Store the information in a common repository at a single site.
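The two steps above can be sketched in a few lines of Python. This is only an illustration, not the company's actual system: the in-memory SQLite databases stand in for the four branch systems, and the table layout, item types, dates and amounts are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows per branch: (sale_date, item_type, amount).
BRANCH_SALES = {
    "Karachi": [("2014-01-15", "Electronics", 120.0), ("2014-02-03", "Clothing", 80.0)],
    "Quetta": [("2014-03-20", "Electronics", 60.0)],
    "Peshawar": [("2014-01-09", "Clothing", 45.0), ("2014-04-02", "Clothing", 30.0)],
    "Lahore": [("2014-02-27", "Electronics", 200.0)],
}

def make_branch_db(rows):
    """Stand-in for one branch's separate operational system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, item_type TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn

def build_repository(branch_dbs):
    """Extract sales from every branch and store them in one common repository."""
    repo = sqlite3.connect(":memory:")
    repo.execute(
        "CREATE TABLE sales (branch TEXT, sale_date TEXT, item_type TEXT, amount REAL)"
    )
    for branch, conn in branch_dbs.items():
        rows = conn.execute("SELECT sale_date, item_type, amount FROM sales").fetchall()
        repo.executemany(
            "INSERT INTO sales VALUES (?, ?, ?, ?)", [(branch, *row) for row in rows]
        )
    return repo

def first_quarter_report(repo, year="2014"):
    """Sales per item type per branch for January through March."""
    query = """
        SELECT branch, item_type, SUM(amount)
        FROM sales
        WHERE sale_date >= ? AND sale_date < ?
        GROUP BY branch, item_type
        ORDER BY branch, item_type
    """
    return repo.execute(query, (f"{year}-01-01", f"{year}-04-01")).fetchall()

branch_dbs = {name: make_branch_db(rows) for name, rows in BRANCH_SALES.items()}
repo = build_repository(branch_dbs)
report = first_quarter_report(repo)
for row in report:
    print(row)
```

Once the rows sit in one place, the quarterly report is a single grouped query instead of four separate ones that must be merged by hand.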

Page 13: Dwh lecture slides-week2

Figure: Karachi, Quetta, Peshawar and Lahore feed the Data Warehouse; the Sales Manager obtains the report through query and analysis tools.

Page 14: Dwh lecture slides-week2

One Stop Shopping Super Market has a huge operational database. Whenever executives want a report, the OLTP system becomes slow and the data entry operators have to wait for some time.

Page 15: Dwh lecture slides-week2

Figure: two Data Entry Operators and Management share the Operational Database; the operators wait while the management report runs.

Page 16: Dwh lecture slides-week2

Solution 2

Extract the data needed for analysis from the operational database.

Store it in the warehouse.

Refresh the warehouse at regular intervals so that it contains up-to-date information for analysis.

The warehouse will contain data with a historical perspective.
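A minimal sketch of such a periodic refresh, under stated assumptions: both databases are in-memory SQLite stand-ins, the `orders` and `order_history` tables are hypothetical, and new rows are detected by comparing ids against the highest id already loaded (which only works if ids keep increasing).

```python
import sqlite3

def setup():
    """Create a hypothetical operational database and an empty warehouse."""
    oltp = sqlite3.connect(":memory:")
    oltp.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")
    dwh = sqlite3.connect(":memory:")
    dwh.execute(
        "CREATE TABLE order_history "
        "(id INTEGER PRIMARY KEY, item TEXT, qty INTEGER, loaded_at TEXT)"
    )
    return oltp, dwh

def refresh(oltp, dwh, loaded_at):
    """Copy rows not yet in the warehouse; rows already loaded are never touched."""
    last_id = dwh.execute("SELECT COALESCE(MAX(id), 0) FROM order_history").fetchone()[0]
    new_rows = oltp.execute(
        "SELECT id, item, qty FROM orders WHERE id > ?", (last_id,)
    ).fetchall()
    dwh.executemany(
        "INSERT INTO order_history VALUES (?, ?, ?, ?)",
        [(*row, loaded_at) for row in new_rows],
    )
    dwh.commit()
    return len(new_rows)

oltp, dwh = setup()
oltp.executemany("INSERT INTO orders (item, qty) VALUES (?, ?)", [("milk", 2), ("bread", 1)])
first = refresh(oltp, dwh, "2014-01-31")   # first scheduled refresh

oltp.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", ("eggs", 12))
# The operational system may purge old rows; the warehouse keeps its copy.
oltp.execute("DELETE FROM orders WHERE id = 1")
second = refresh(oltp, dwh, "2014-02-28")  # next scheduled refresh

history = dwh.execute("SELECT id, item, loaded_at FROM order_history ORDER BY id").fetchall()
print(first, second, history)
```

Reports then run against `order_history`, so the data entry operators are no longer slowed down, and the purged order is still available for analysis.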

Page 17: Dwh lecture slides-week2

Figure: Data Entry Operators run transactions against the operational database; data is extracted into the Data Warehouse, from which the Manager gets the report.

Page 18: Dwh lecture slides-week2

Cakes & Cookies is a small, new company. The President of the company wants the company to grow. He needs information so that he can make correct decisions.

Page 19: Dwh lecture slides-week2

Solution 3

Improve the quality of data before loading it into the warehouse.

Perform data cleaning and transformation before loading the data.

Use query analysis tools to support ad hoc queries.
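As a rough illustration of cleaning and transformation before loading, the sketch below normalises product names and dates, rejects invalid rows, and drops re-sent records. The field names, receipt numbers, date formats and sample rows are all hypothetical; real cleaning rules depend on the source data.

```python
from datetime import datetime

# Hypothetical raw rows as they might arrive from the source system.
RAW_ROWS = [
    {"receipt": "R1", "product": " Chocolate Cake ", "date": "2014-01-05", "units": "3"},
    {"receipt": "R2", "product": "chocolate cake", "date": "05/01/2014", "units": "2"},
    {"receipt": "R3", "product": "Cookies", "date": "2014-01-06", "units": "-4"},  # bad quantity
    {"receipt": "R4", "product": "Cookies", "date": "2014-01-06", "units": "5"},
    {"receipt": "R4", "product": "Cookies", "date": "2014-01-06", "units": "5"},   # sent twice
    {"receipt": "R5", "product": "", "date": "2014-01-07", "units": "1"},          # no product
]

# Assumed input formats: ISO dates and day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

def parse_date(text):
    """Return the date in ISO format, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def clean(rows):
    """Split rows into cleaned records ready for loading and rejected rows."""
    seen_receipts = set()
    cleaned, rejected = [], []
    for row in rows:
        # Transformation: collapse whitespace and use one spelling per product.
        product = " ".join(row["product"].split()).title()
        date = parse_date(row["date"])
        try:
            units = int(row["units"])
        except ValueError:
            units = None
        # Cleaning: reject rows with missing or impossible values.
        if not product or date is None or units is None or units <= 0:
            rejected.append(row)
            continue
        # Cleaning: keep only the first copy of each receipt.
        if row["receipt"] in seen_receipts:
            continue
        seen_receipts.add(row["receipt"])
        cleaned.append((row["receipt"], product, date, units))
    return cleaned, rejected

cleaned, rejected = clean(RAW_ROWS)
print(cleaned)
print(len(rejected), "rows rejected")
```

Keeping the rejected rows, rather than silently dropping them, lets someone inspect and correct them at the source.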

Page 20: Dwh lecture slides-week2

Figure: the President uses a query and analysis tool on the Data Warehouse (chart of sales against time) to plan expansion and improvement.

Page 21: Dwh lecture slides-week2

Case Study

AFCO Foods & Beverages is a new company which produces dairy, bread and meat products, with its production unit located at Gujranwala.

Their products are sold in all regions of Pakistan.

They have sales units at the provincial headquarters.

The President of the company wants sales information.

Page 22: Dwh lecture slides-week2

Sales Information

Report: The number of units sold.

113

Report: The number of units sold over time.

January  February  March  April
14       41        33     25

Page 23: Dwh lecture slides-week2

Sales Information

Report: The number of items sold for each product with time.

Product      Jan  Feb  Mar  Apr
Wheat Bread    -    -    6   17
Cheese         6   16    6    8
Swiss Rolls    8   25   21    -

Dimensions: Product, Time

Page 24: Dwh lecture slides-week2

Sales Information

Report: The number of items sold in each city for each product with time.

City     Product      Jan  Feb  Mar  Apr
Karachi  Wheat Bread    -    -    3   10
Karachi  Cheese         3   16    6    -
Karachi  Swiss Rolls    4   16    6    -
Lahore   Wheat Bread    -    -    3    7
Lahore   Cheese         3    -    -    8
Lahore   Swiss Rolls    4    9   15    -

Dimensions: Product, Time, City
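The reports in this case study are the same sales facts viewed at different levels of detail. The sketch below, which is only an illustration, stores the units from the city-level report and sums away dimensions to get back the coarser reports. The month assigned to each value is an assumption: the slide leaves some cells blank, and the placement used here is the one for which the monthly totals match 14, 41, 33 and 25. The `roll_up` helper is hypothetical.

```python
from collections import defaultdict

# Units sold per (city, product, month); blank cells are simply absent.
FACTS = {
    ("Karachi", "Wheat Bread", "Mar"): 3, ("Karachi", "Wheat Bread", "Apr"): 10,
    ("Karachi", "Cheese", "Jan"): 3, ("Karachi", "Cheese", "Feb"): 16,
    ("Karachi", "Cheese", "Mar"): 6,
    ("Karachi", "Swiss Rolls", "Jan"): 4, ("Karachi", "Swiss Rolls", "Feb"): 16,
    ("Karachi", "Swiss Rolls", "Mar"): 6,
    ("Lahore", "Wheat Bread", "Mar"): 3, ("Lahore", "Wheat Bread", "Apr"): 7,
    ("Lahore", "Cheese", "Jan"): 3, ("Lahore", "Cheese", "Apr"): 8,
    ("Lahore", "Swiss Rolls", "Jan"): 4, ("Lahore", "Swiss Rolls", "Feb"): 9,
    ("Lahore", "Swiss Rolls", "Mar"): 15,
}
DIMENSIONS = ("city", "product", "month")

def roll_up(facts, keep):
    """Sum units over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, units in facts.items():
        totals[tuple(key[i] for i in positions)] += units
    return dict(totals)

total_units = roll_up(FACTS, ())[()]                      # the single-number report
by_month = roll_up(FACTS, ("month",))                     # units sold over time
by_product_month = roll_up(FACTS, ("product", "month"))   # per product with time

print(total_units)
print(by_month)
print(by_product_month)
```

Going the other way, from a total to the per-city breakdown, is not possible without the detailed facts, which is why the warehouse stores data at the finest level the reports will need.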

Page 25: Dwh lecture slides-week2

Report: The number of items sold and income in each region for each product with time (Rs = income, U = units sold).

                          Jan        Feb        Mar        Apr
City     Product           Rs   U     Rs   U     Rs   U     Rs   U
Karachi  Wheat Bread        -   -      -   -   7.44   3  24.80  10
Karachi  Cheese          7.95   3  42.40  16  15.90   6      -   -
Karachi  Swiss Rolls     7.32   4  29.98  16  10.98   6      -   -
Lahore   Wheat Bread        -   -      -   -   7.44   3  17.36   7
Lahore   Cheese          7.95   3      -   -      -   -  21.20   8
Lahore   Swiss Rolls     7.32   4  16.47   9  27.45  15      -   -

Page 26: Dwh lecture slides-week2

Data Warehousing includes

Build Data Warehouse.
Online Analysis/Analytical Processing (OLAP).
Presentation.

Figure (labels): RDBMS and Flat File sources; Cleaning, Selection & Integration; Warehouse & OLAP server; Client; Presentation.