Dr. Abdul Basit Siddiqui Assistant Professor FURC (Lecture Slides Week # 2)
Approach of the Course
Develop an understanding of the underlying RDBMS concepts.
Apply these concepts to VLDB / DSS environments and understand where and why they break down.
Expose the differences between RDBMS and Data Warehouse in the context of VLDB.
Provide the basics of DSS tools such as OLAP and Data Mining, and demonstrate their applications.
Demonstrate the application of DSS concepts and the limitations of OLTP concepts through lab exercises.
04/15/23 Data Warehouse & Mining - Spring 2014
Summary of the Course
Introduction & Background
De-Normalization
Online Analytical Processing (OLAP)
Dimensional Modeling
Extract-Transform-Load (ETL)
Data Quality Management (DQM)
Parallelism, Join and Indexing Techniques
Data Mining Concepts
Data Cleansing
Association Rule Mining
Clustering
Classification
Books
Reference Books
W. H. Inmon, Building the Data Warehouse, John Wiley & Sons Inc., NY
R. Kimball, The Data Warehouse Toolkit, John Wiley & Sons Inc., NY
Paulraj Ponniah, Data Warehousing Fundamentals, John Wiley & Sons Inc., NY
Why this Course?
The world is changing (in fact, it has already changed).
Either change or be left behind.
Missing opportunities or going in the wrong direction has prevented us from growing.
What is the right direction?
Harnessing the data in the knowledge-driven economy.
Doing what cannot be automated, or is difficult to automate.
Historical Overview
1960: Master Files and Reports
1965: Lots of Master Files
1970: Direct Memory Access and DBMS
1975: Online High Performance Transaction Processing
1980: PCs and 4GL Technology (MIS/DSS)
1985: Extract Programs, Extract Processing
1990: The Legacy System’s Web
The Need of the Time
We are drowning in data but starving for information.
Knowledge is power, but intelligence is absolute (super) power.
The Need of the Time
[Figure: progression from Data to Information to Knowledge to Intelligence to POWER ($/£)]
ABC Pvt Ltd is a company with branches at Karachi, Quetta, Peshawar and Lahore. The Sales Manager wants a quarterly sales report. Each branch has a separate operational system.
[Figure: the Sales Manager needs sales per item type per branch for the first quarter from the separate Karachi, Quetta, Peshawar and Lahore systems]
Solution 1: ABC Pvt Ltd.
Extract sales information from each database.
Store the information in a common repository at a single site.
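The two steps above can be sketched in Python with the standard `sqlite3` module. This is only an illustration under stated assumptions: the in-memory databases stand in for the four branch systems, and the table layout, the item types and the sample rows are hypothetical, not taken from the slides.

```python
import sqlite3

# Hypothetical branch data: (item_type, sale_date, units). Not from the slides.
BRANCH_SALES = {
    "Karachi": [("Bread", "2014-01-15", 5), ("Dairy", "2014-02-03", 2)],
    "Quetta": [("Bread", "2014-03-20", 4)],
    "Peshawar": [("Meat", "2014-02-11", 1)],
    "Lahore": [("Dairy", "2014-01-09", 3), ("Dairy", "2014-04-02", 7)],
}


def make_branch_db(rows):
    """Create an in-memory database standing in for one branch's operational system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (item_type TEXT, sale_date TEXT, units INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


def build_repository(branch_dbs):
    """Extract sales from every branch and store them in one common table."""
    repo = sqlite3.connect(":memory:")
    repo.execute(
        "CREATE TABLE sales (branch TEXT, item_type TEXT, sale_date TEXT, units INTEGER)"
    )
    for branch, conn in branch_dbs.items():
        rows = conn.execute("SELECT item_type, sale_date, units FROM sales").fetchall()
        # Tag every extracted row with the branch it came from.
        repo.executemany(
            "INSERT INTO sales VALUES (?, ?, ?, ?)",
            [(branch, *row) for row in rows],
        )
    return repo


def first_quarter_report(repo, year="2014"):
    """Sales per item type per branch for the first quarter of the given year."""
    return repo.execute(
        "SELECT branch, item_type, SUM(units) FROM sales "
        "WHERE sale_date >= ? AND sale_date < ? "
        "GROUP BY branch, item_type ORDER BY branch, item_type",
        (f"{year}-01-01", f"{year}-04-01"),
    ).fetchall()


branch_dbs = {name: make_branch_db(rows) for name, rows in BRANCH_SALES.items()}
repository = build_repository(branch_dbs)
report = first_quarter_report(repository)
for line in report:
    print(line)
```

Once the rows sit in one place, the quarterly report is a single query instead of four separate extracts that someone has to merge by hand.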
[Figure: the Karachi, Quetta, Peshawar and Lahore systems feed the Data Warehouse; the Sales Manager uses query and analysis tools on it to produce the report]
One Stop Shopping Super Market has a huge operational database. Whenever executives want a report, the OLTP system becomes slow and the data entry operators have to wait for some time.
[Figure: data entry operators wait on the Operational Database while Management runs a report]
Solution 2
Extract the data needed for analysis from the operational database.
Store it in the warehouse.
Refresh the warehouse at regular intervals so that it contains up-to-date information for analysis.
The warehouse will contain data with a historical perspective.
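A periodic refresh can be sketched as an incremental copy. The example below is a minimal illustration, assuming that every operational transaction carries an increasing `id`; the table names `transactions` and `sales_history` and the sample rows are hypothetical.

```python
import sqlite3


def setup():
    """Create an empty operational database and an empty warehouse (both in memory)."""
    oltp = sqlite3.connect(":memory:")
    oltp.execute(
        "CREATE TABLE transactions (id INTEGER PRIMARY KEY, item TEXT, units INTEGER)"
    )
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE sales_history (id INTEGER PRIMARY KEY, item TEXT, units INTEGER)"
    )
    return oltp, warehouse


def refresh(oltp, warehouse):
    """Copy only the rows that are newer than the last one already in the warehouse."""
    last_id = warehouse.execute(
        "SELECT COALESCE(MAX(id), 0) FROM sales_history"
    ).fetchone()[0]
    new_rows = oltp.execute(
        "SELECT id, item, units FROM transactions WHERE id > ?", (last_id,)
    ).fetchall()
    warehouse.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", new_rows)
    warehouse.commit()
    return len(new_rows)


oltp, warehouse = setup()
oltp.executemany(
    "INSERT INTO transactions (item, units) VALUES (?, ?)",
    [("Cheese", 3), ("Wheat Bread", 2)],
)
first = refresh(oltp, warehouse)   # first scheduled refresh

oltp.execute("INSERT INTO transactions (item, units) VALUES (?, ?)", ("Swiss Rolls", 4))
oltp.execute("DELETE FROM transactions WHERE id = 1")  # operational system purges an old row
second = refresh(oltp, warehouse)  # next scheduled refresh

history_count = warehouse.execute("SELECT COUNT(*) FROM sales_history").fetchone()[0]
print(first, second, history_count)
```

Reports now run against `sales_history`, so they do not compete with data entry, and the purged operational row is still available in the warehouse for historical analysis.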
[Figure: data entry operators post transactions to the operational database; data is extracted into the Data Warehouse, from which the Manager gets the report]
Cakes & Cookies is a small, new company. The President of the company wants the company to grow. He needs information so that he can make correct decisions.
Solution 3
Improve the quality of data before loading it into the warehouse.
Perform data cleaning and transformation before loading the data.
Use query and analysis tools to support ad hoc queries.
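As a small illustration of cleaning and transformation before loading, the sketch below standardizes text fields, converts units to integers, rejects unusable records and skips duplicate transaction ids. The record layout and the cleaning rules are assumptions chosen for the example, not rules given in the lecture.

```python
def clean_record(raw):
    """Return a cleaned (id, city, product, units) tuple, or None if unusable."""
    city = raw.get("city", "").strip().title()
    # Collapse repeated whitespace inside the product name, then normalize case.
    product = " ".join(raw.get("product", "").split()).title()
    try:
        units = int(str(raw.get("units", "")).strip())
    except ValueError:
        return None
    if not city or not product or units < 0:
        return None
    return (raw["id"], city, product, units)


def clean_batch(raw_records):
    """Clean a batch, dropping bad records and repeated transaction ids."""
    cleaned, seen_ids, rejected = [], set(), 0
    for raw in raw_records:
        record = clean_record(raw)
        if record is None:
            rejected += 1
        elif record[0] not in seen_ids:
            seen_ids.add(record[0])
            cleaned.append(record)
    return cleaned, rejected


# Hypothetical raw extract with typical quality problems.
RAW = [
    {"id": 1, "city": " karachi ", "product": "wheat  bread", "units": "3"},
    {"id": 2, "city": "LAHORE", "product": "Cheese", "units": 8},
    {"id": 2, "city": "LAHORE", "product": "Cheese", "units": 8},            # duplicate
    {"id": 3, "city": "Lahore", "product": "Swiss Rolls", "units": "n/a"},   # bad units
    {"id": 4, "city": "", "product": "Cheese", "units": "5"},                # missing city
]

cleaned, rejected = clean_batch(RAW)
print(cleaned, rejected)
```

Only the records in `cleaned` would be loaded; in practice the rejected ones would usually be logged for correction at the source rather than silently dropped.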
[Figure: the President uses a query and analysis tool on the Data Warehouse; a chart of sales against time is labelled with expansion and improvement]
Case Study
AFCO Foods & Beverages is a new company which produces dairy, bread and meat products, with its production unit located at Gujranwala.
Their products are sold in all regions of Pakistan.
They have sales units at the provincial headquarters.
The President of the company wants sales information.
Sales Information
Report: The number of units sold.
113
Report: The number of units sold over time
January  February  March  April
14       41        33     25
Sales Information
Report: The number of items sold for each product with time (dimensions: Product, Time)
             Jan  Feb  Mar  Apr
Wheat Bread    -    -    6   17
Cheese         6   16    6    8
Swiss Rolls    8   25   21    -
Sales Information
Report: The number of items sold in each City for each product with time (dimensions: Product, Time, City)
                      Jan  Feb  Mar  Apr
Karachi  Wheat Bread    -    -    3   10
         Cheese         3   16    6    -
         Swiss Rolls    4   16    6    -
Lahore   Wheat Bread    -    -    3    7
         Cheese         3    -    -    8
         Swiss Rolls    4    9   15    -
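The three reports above show the same sales facts at different levels of detail, and each coarser report can be obtained by summing the finer one. The sketch below illustrates this roll-up with the unit counts from the reports. The placement of the empty cells is an assumption, inferred so that the city figures add up to the product and monthly totals; the `roll_up` helper is hypothetical.

```python
from collections import defaultdict

# (city, product, month) -> units sold, taken from the city-level report.
SALES = {
    ("Karachi", "Wheat Bread", "Mar"): 3,
    ("Karachi", "Wheat Bread", "Apr"): 10,
    ("Karachi", "Cheese", "Jan"): 3,
    ("Karachi", "Cheese", "Feb"): 16,
    ("Karachi", "Cheese", "Mar"): 6,
    ("Karachi", "Swiss Rolls", "Jan"): 4,
    ("Karachi", "Swiss Rolls", "Feb"): 16,
    ("Karachi", "Swiss Rolls", "Mar"): 6,
    ("Lahore", "Wheat Bread", "Mar"): 3,
    ("Lahore", "Wheat Bread", "Apr"): 7,
    ("Lahore", "Cheese", "Jan"): 3,
    ("Lahore", "Cheese", "Apr"): 8,
    ("Lahore", "Swiss Rolls", "Jan"): 4,
    ("Lahore", "Swiss Rolls", "Feb"): 9,
    ("Lahore", "Swiss Rolls", "Mar"): 15,
}


def roll_up(cells, keep):
    """Sum units over every dimension whose position is not listed in `keep`."""
    totals = defaultdict(int)
    for key, units in cells.items():
        totals[tuple(key[i] for i in keep)] += units
    return dict(totals)


by_product_month = roll_up(SALES, keep=(1, 2))  # drop the City dimension
by_month = roll_up(SALES, keep=(2,))            # drop City and Product
grand_total = roll_up(SALES, keep=())[()]       # drop every dimension

print(by_product_month[("Swiss Rolls", "Feb")])
print(by_month)
print(grand_total)
```

Keeping the data at the most detailed level and aggregating on demand is what lets one store answer all three reports.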
Report: The number of items sold and income in each region for each product with time.
(Rs = income, U = units sold)
                           Jan         Feb         Mar         Apr
                          Rs   U      Rs   U      Rs   U      Rs   U
Karachi  Wheat Bread       -   -       -   -    7.44   3   24.80  10
         Cheese         7.95   3   42.40  16   15.90   6       -   -
         Swiss Rolls    7.32   4   29.98  16   10.98   6       -   -
Lahore   Wheat Bread       -   -       -   -    7.44   3   17.36   7
         Cheese         7.95   3       -   -       -   -   21.20   8
         Swiss Rolls    7.32   4   16.47   9   27.45  15       -   -