Database Processing for Business Intelligence Systems
Chapter Eight
DAVID M. KROENKE and DAVID J. AUER
DATABASE CONCEPTS, 4th Edition
Chapter Objectives
• Learn the basic concepts of data warehouses and data marts
• Learn the basic concepts of dimensional databases
• Learn the basic concepts of business intelligence (BI) systems
• Learn the basic concepts of OLAP and data mining
KROENKE and AUER - DATABASE CONCEPTS (4th Edition) © 2010, 2008 Pearson Prentice Hall 8-2
Heather Sweeney Designs: Database Design
Heather Sweeney Designs: HSD Database Diagram in SQL Server 2005
Business Intelligence Systems
• Business intelligence (BI) systems are information systems that
  – Assist managers and other professionals in the analysis of current and past activities and in the prediction of future events
  – Do not support operational activities, such as the recording and processing of orders
    • These are supported by transaction processing systems
  – Support management assessment, analysis, planning and control
• BI systems fall into two broad categories
  – Reporting systems that sort, filter, group, and make elementary calculations on operational data
  – Data mining applications that perform sophisticated analyses on data, analyses that usually involve complex statistical and mathematical processing
The Relationship Among Operational and BI Applications
Characteristics of Business Intelligence Applications
Characteristics of a Data Warehouse
Problems with Operational Data
• “Dirty Data”
  – Example: “G” for Gender
  – Example: “213” for Age
• Missing Values
• Inconsistent Data
  – Example: data that have changed, such as a customer’s phone number
Problems with Operational Data (Continued)
• Nonintegrated Data
  – Example: data from two or more sources that need to be combined
• Incorrect Format
  – Example: time data in hours when needed in minutes
• Too Much Data
  – Example: an excessive number of columns
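The problems listed above can be screened for before data are loaded into a warehouse. The following is a minimal sketch, not a complete cleansing routine. The field names, the set of valid gender codes, and the age limit are assumptions chosen only to match the slide examples.

```python
# Assumed reference values; a real system would take these from its own data standards.
VALID_GENDERS = {"F", "M"}
MAX_AGE = 120


def find_problems(record):
    """Return a list of data-quality problems found in one operational record."""
    problems = []
    # Missing values: any field that is None or an empty string.
    for field, value in record.items():
        if value is None or value == "":
            problems.append(f"{field}: missing value")
    # Dirty data: a gender code outside the assumed code set.
    gender = record.get("gender")
    if gender not in (None, "") and gender not in VALID_GENDERS:
        problems.append(f"gender: unexpected code {gender!r}")
    # Dirty data: an age that is not a whole number within the assumed range.
    age = record.get("age")
    if age not in (None, ""):
        text = str(age)
        if not (text.isdigit() and int(text) <= MAX_AGE):
            problems.append(f"age: implausible value {age!r}")
    return problems


def hours_to_minutes(hours):
    """Incorrect format: convert time recorded in hours to minutes."""
    return hours * 60


print(find_problems({"gender": "G", "age": "213", "phone": ""}))
print(hours_to_minutes(1.5))
```

Checks like these only flag suspect records. Deciding how to repair or discard them is a separate step.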
ETL Data Transformation
• Data may need to be transformed for use in a data warehouse
  – Example
    • {CountryCode → CountryName}
    • “US” → “United States”
  – Example
    • Email address to Email domain
    • [email protected] → “somewhere.com”
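The two transformations above might be sketched as follows. The names `COUNTRY_NAMES`, `country_name`, and `email_domain` are hypothetical, the lookup table holds only the one pair given on the slide, and the sample address is made up for illustration.

```python
# Hypothetical lookup table; a real ETL job would load this from a reference source.
COUNTRY_NAMES = {"US": "United States"}


def country_name(code):
    """Map a country code to its name; unknown codes are returned unchanged."""
    return COUNTRY_NAMES.get(code.strip().upper(), code)


def email_domain(address):
    """Return the part of an email address after the last '@', lowercased."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return domain.lower()


print(country_name("US"))
print(email_domain("someone@somewhere.com"))  # illustrative address only
```

Returning unknown codes unchanged is one possible design choice. A stricter job could reject them instead.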
Characteristics of a Data Mart
Enterprise Data Warehouse (EDW) Architecture
• Combines the data warehouse structure and the data mart structures shown above
• Expensive to create, staff and operate
• Smaller organizations use subsets of the EDW architecture
Dimensional Databases
• A non-normalized database structure used for data warehouses
• May use slowly changing dimensions
  – Values change infrequently
    • Phone Number
    • Address
• Use a Date or Time dimension
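As a rough illustration of these points, the sketch below builds a tiny dimensional structure in an in-memory SQLite database. All table names, column names, and rows are hypothetical and are not the HSD-DW schema. The date dimension stores year and quarter alongside the full date, which is the kind of redundancy a non-normalized design accepts in exchange for simpler queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (
    date_id   INTEGER PRIMARY KEY,
    full_date TEXT NOT NULL,
    year      INTEGER NOT NULL,   -- derivable from full_date, stored anyway
    quarter   INTEGER NOT NULL
);
CREATE TABLE customer_dim (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    phone       TEXT              -- a slowly changing attribute
);
CREATE TABLE sales_fact (
    date_id     INTEGER REFERENCES date_dim(date_id),
    customer_id INTEGER REFERENCES customer_dim(customer_id),
    amount      REAL NOT NULL
);
""")
conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?, ?)",
                 [(1, "2009-01-15", 2009, 1), (2, "2009-04-20", 2009, 2)])
conn.executemany("INSERT INTO customer_dim VALUES (?, ?, ?)",
                 [(1, "Customer A", "555-0100"), (2, "Customer B", "555-0101")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)",
                 [(1, 1, 100.0), (1, 2, 50.0), (2, 1, 25.0)])


def sales_by_quarter(connection):
    """Total the fact table's amounts for each year and quarter of the date dimension."""
    return connection.execute("""
        SELECT d.year, d.quarter, SUM(f.amount)
        FROM sales_fact AS f
        JOIN date_dim AS d ON d.date_id = f.date_id
        GROUP BY d.year, d.quarter
        ORDER BY d.year, d.quarter
    """).fetchall()


print(sales_by_quarter(conn))
```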
Star Schema
HSD-DW Star Schema
Two-Dimensional Matrix
Three-Dimensional Matrix
Conformed Dimensions and the Extended HSD-DW Schema
Reporting Systems
Reporting Systems: RFM Analysis
• RFM Analysis analyzes and ranks customers according to purchasing patterns:
  – R = Recent (most recent order)
  – F = Frequent (how often an order is made)
  – M = Money (dollar amount of orders)
• Customers are sorted into five groups, each containing 20% of the customers.
• Each group is given a numerical value:
  – 1 = Top 20%
  – 2, 3, 4 = Each 20% in between top and bottom 20%
  – 5 = Bottom 20%
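The scoring rule above can be sketched in a few lines. This is one possible reading, under stated assumptions: the customer data are invented, M is taken as the total dollar amount of orders, and ties are broken by input order. The function names are hypothetical.

```python
from datetime import date


def quintile_scores(values):
    """Rank values best-first (largest first) and assign 1 (top 20%) to 5 (bottom 20%)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        # Integer arithmetic splits the ranked list into five equal-sized groups.
        scores[i] = rank * 5 // len(values) + 1
    return scores


def rfm_scores(customers):
    """Return {customer: (R, F, M)} from {customer: (last_order, order_count, amount)}."""
    names = list(customers)
    r = quintile_scores([customers[n][0] for n in names])  # a later date is more recent
    f = quintile_scores([customers[n][1] for n in names])
    m = quintile_scores([customers[n][2] for n in names])
    return {n: (r[i], f[i], m[i]) for i, n in enumerate(names)}


# Invented sample data: (date of most recent order, number of orders, dollar amount).
customers = {
    "A": (date(2009, 12, 1), 10, 500.0),
    "B": (date(2009, 6, 15), 2, 900.0),
    "C": (date(2009, 11, 20), 7, 100.0),
    "D": (date(2008, 3, 2), 1, 50.0),
    "E": (date(2009, 9, 9), 4, 300.0),
}

for name, score in rfm_scores(customers).items():
    print(name, score)
```

In this sample, customer B scores (4, 4, 1): an infrequent, not very recent buyer who nevertheless spends the most.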
The RFM Score Report
Reporting Systems: Report Characteristics
Reporting Systems: Report System Functions
• Report Authoring:
  – Connect to data sources
  – Create the report structure
  – Format the report
• Report Management:
  – Defines who receives what reports, when, and by what means
• Report Delivery:
  – Push reports or allow them to be pulled
OLAP and Data Mining
• OnLine Analytical Processing (OLAP) is a technique for dynamically examining database data
  – OLAP uses arithmetic functions such as Sum and Average
• Data Mining is a mathematically sophisticated technique for analyzing database data
  – Data mining uses mathematical and statistical techniques
OLAP
• OLAP systems produce an OLAP report, also known as an OLAP cube
• The OLAP report uses inputs called dimensions
• The OLAP report calculates outputs called measures
• Excel PivotTables can be used to create OLAP reports
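To make the terms concrete, here is a small sketch of an OLAP-style report built by hand. Two dimensions are the inputs and a summed measure is the output. The sales rows, field names, and the `olap_report` function are hypothetical, and a PivotTable or OLAP server would do far more than this.

```python
from collections import defaultdict


def olap_report(rows, row_dim, col_dim, measure):
    """Sum `measure` for every combination of the two dimension values."""
    cells = defaultdict(float)
    for row in rows:
        cells[(row[row_dim], row[col_dim])] += row[measure]
    return dict(cells)


# Invented sample data.
sales = [
    {"product": "Video", "year": 2008, "amount": 100.0},
    {"product": "Video", "year": 2009, "amount": 150.0},
    {"product": "Book", "year": 2009, "amount": 40.0},
    {"product": "Video", "year": 2009, "amount": 10.0},
]

report = olap_report(sales, "product", "year", "amount")
for key in sorted(report):
    print(key, report[key])
```

Swapping the `row_dim` and `col_dim` arguments, or choosing a different dimension, gives a different view of the same data, which is what "dynamically examining" refers to.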
Excel PivotTable OLAP Report I
Excel PivotTable OLAP Report II
Excel PivotTable OLAP Report III
Data Mining Applications: The Convergence of the Disciplines
Data Mining Applications: Popular Data Mining Techniques
• Cluster analysis – Identifies groups of entities that have similar characteristics
• Decision tree analysis – Classifies entities into groups based on past history
• Logistic regression – Produces equations that offer probabilities that certain events will occur
• Neural Networks – Complex statistical prediction techniques
• Market Basket Analysis – Determines patterns of associated buying behavior
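As one illustration from this list, market basket analysis is commonly summarized with three measures: support, confidence, and lift. The sketch below computes them for a single pair of items. The baskets, item names, and the `basket_metrics` function are invented for the example, and real tools search over many item combinations rather than one pair.

```python
def basket_metrics(transactions, a, b):
    """Return (support, confidence, lift) for the rule 'buyers of a also buy b'."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if a in t)
    count_b = sum(1 for t in transactions if b in t)
    count_ab = sum(1 for t in transactions if a in t and b in t)
    if count_a == 0 or count_b == 0:
        raise ValueError("both items must appear in at least one transaction")
    support = count_ab / n            # share of baskets containing both items
    confidence = count_ab / count_a   # share of a-baskets that also contain b
    lift = confidence / (count_b / n) # confidence relative to b's overall share
    return support, confidence, lift


# Invented sample baskets.
baskets = [
    {"video", "book"},
    {"video", "book", "kit"},
    {"video"},
    {"kit"},
]

support, confidence, lift = basket_metrics(baskets, "video", "book")
print(support, round(confidence, 3), round(lift, 3))
```

A lift above 1 suggests the two items are bought together more often than their individual popularity alone would predict.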
Data Mining Applications: Cluster Analysis I
Data Mining Applications: Cluster Analysis II
Data Mining Applications: Cluster Analysis III
Data Mining Applications: Market Basket Analysis
Database Processing for Business Intelligence Systems
End of Presentation on Chapter Eight
DAVID M. KROENKE and DAVID J. AUER
DATABASE CONCEPTS, 4th Edition
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States
of America.
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall