CHAPTER SIX Databases and Data Warehouses

CHAPTER SIX Databases and Data Warehouses. Information Granularity Refers to the level of detail of information Detailed (POS transaction) Coarse (Global.

Dec 25, 2015


Alisha Garrett
Page 1:

CHAPTER SIX

Databases and Data Warehouses

Page 2:

Information Granularity

- Refers to the level of detail of information
  - Detailed (POS transaction)
  - Coarse (Global sales totals)
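
As a rough sketch of the two ends of the granularity scale, the Python below rolls detailed POS transactions up into coarser totals. The store names, regions, and amounts are made up for illustration.

```python
from collections import defaultdict

# Hypothetical POS transactions: (store, region, amount). Illustrative data only.
pos_transactions = [
    ("Store A", "West", 12.50),
    ("Store A", "West", 7.25),
    ("Store B", "East", 30.00),
    ("Store C", "East", 4.75),
]

def total_by_region(transactions):
    """Roll detailed transactions up to a coarser, per-region level."""
    totals = defaultdict(float)
    for _store, region, amount in transactions:
        totals[region] += amount
    return dict(totals)

regional_totals = total_by_region(pos_transactions)  # coarser than single transactions
global_total = sum(regional_totals.values())         # coarsest: one global sales total
```

Each step up discards detail: the global total can no longer say which store or region a sale came from.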

Page 3:

Transactional vs. Analytical Information

- Transactional information comes from a business process
  - A bank deposit
  - A credit card charge

- Analytical information uses transactional data for the purposes of decision making
  - Account balance trends
  - Using credit card history to detect fraud
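
One way to picture the difference: each charge below is transactional information, and the check that compares a new charge against the history is analytical. The threshold rule and the amounts are assumptions for illustration, not a real fraud-detection method.

```python
from statistics import mean

# Hypothetical charge history for one card (transactional information).
charges = [42.0, 18.5, 55.0, 23.0, 61.5]

def looks_suspicious(history, new_charge, factor=5.0):
    """Flag a charge that exceeds `factor` times the historical average.

    The rule and the default factor are assumptions chosen for illustration.
    """
    return new_charge > factor * mean(history)

flag_large = looks_suspicious(charges, 950.0)   # far above the usual spending
flag_normal = looks_suspicious(charges, 60.0)   # in line with the history
```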

Page 4:

Transactional vs. Analytical Information

Page 5:

Information Dimensions

- Information timeliness
  - Obsolete information is useless
  - Today’s information needs to be provided in real time or near real time
- Information quality
  - Wrong information is useless
  - Redundant information can be the cause of errors
  - Information must be complete
  - Data inconsistency and data integrity

Page 6:

Database Management

- Characteristics
  - Complex
  - Databases often spread across multiple servers
  - Databases often spread across multiple physical disks
  - Fault tolerance is critical
  - Databases may be distributed

Page 7:

Database Vendors

- The industry has consolidated
  - IBM
    - DB2 Universal
  - Oracle
  - Microsoft
    - SQL Server
    - Access
  - Sun (MySQL)
    - Is now Oracle

Page 8:

Database Performance

- Transaction Processing Performance Council provides standard benchmarks
  - TPC-C – Online transaction processing
  - TPC-E – Online brokerage transactions
  - TPC-H – Ad-hoc decision support
  - TPC-W – Web / E-commerce

Page 9:

Database Performance (TPC-C)

- Multiple transaction types
- Independent of software and hardware
- Scalable
- Basis is online transaction processing (OLTP)

Page 10:

1960s Data Management

- These are legacy systems
- Batch processing
- Characterized by traditional file processing
  - Data processing was sequential
  - Not possible to directly locate a particular file record
  - Data dependent on the programs that used the data
    - Program data dependence

Page 11:

1970s Data Management

- Batch processing gives way to online transaction processing
- Files stored on disk rather than tape
- Any record can be located in the same amount of time
- Technologies
  - Indexed Sequential Access Method (ISAM)
  - Virtual Storage Access Method (VSAM)
  - Direct Access files
    - Use a hashing function to derive record addresses from keys
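
A minimal sketch of the direct-access idea, assuming a simple division-remainder hash and an in-memory list standing in for the file. The slot count, keys, and helper names are hypothetical.

```python
NUM_SLOTS = 11  # assumed size of the direct-access file, in record slots

def slot_for(key: int) -> int:
    """Division-remainder hash: map a record key to a slot number."""
    return key % NUM_SLOTS

# Each slot holds a list so keys that hash to the same slot (collisions) can coexist.
file_slots = [[] for _ in range(NUM_SLOTS)]

def store(key, record):
    file_slots[slot_for(key)].append((key, record))

def fetch(key):
    """Go straight to the computed slot; no sequential scan of the whole file."""
    for stored_key, record in file_slots[slot_for(key)]:
        if stored_key == key:
            return record
    return None

store(1001, "Alice")
store(1012, "Bob")    # hashes to the same slot as 1001
store(2040, "Carol")
```

Lookup cost depends on how many records share a slot, not on where the record sits in the file, which is what makes access time roughly uniform.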

Page 12:

1980s Data Management

- Databases are becoming commonplace
- Personal computer databases are evolving
  - dBASE
  - R:BASE

Page 13:

1990s Data Management

- Huge data stores and transaction processing capabilities
- Distributed databases
- Object-oriented databases
- 6 million+ transactions per second

Page 14:

Realities of a DBMS

- Data centric rather than application centric
- Can be a repository for all an organization’s data
- Databases tend to be centralized
- Queries get data from a DBMS
  - SQL is the standard query language
- Report generators create printed and Web-based reports
- Applications interface with DBMS
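
To illustrate an application getting data from a DBMS through SQL, here is a small sketch using Python's built-in sqlite3 module. The table, columns, and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Ann", "Reno"), ("Bo", "Fallon"), ("Cy", "Reno")],
)

# The query states what data is wanted; the DBMS decides how to retrieve it.
rows = conn.execute(
    "SELECT name FROM customer WHERE city = ? ORDER BY name", ("Reno",)
).fetchall()
```

The application never touches the stored files directly; it only sends SQL and receives rows.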

Page 15:

Types of Databases

- Database models include:
  - Hierarchical database model – A tree-based structure
  - Network database model – Mathematically, a directed graph
  - Relational database model – stores information in the form of logically related two-dimensional tables
  - Object-oriented databases

Page 16:

Elements of a Database

- Logical view and physical view
  - Users see and work with the logical view
  - Physical view is controlled by the database management system itself

Page 17:

Entities and Attributes

- Relational databases store information in tables (entities)
  - Customer / order / product
- Tables contain fields (attributes)
  - Customer name, address

Page 18:

Keys

- Each table has a primary key that uniquely identifies each record
  - Natural keys have some meaning (stock symbol)
  - Artificial keys have no intrinsic meaning (your R number)
- Foreign keys are used to link tables in one-to-many relationships
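
The sketch below shows a primary key and a foreign key in a one-to-many relationship, again using sqlite3. The customer and orders tables and their contents are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- artificial key, no intrinsic meaning
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
        total       REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bo');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# One customer, many orders: the join follows the foreign key.
orders_per_customer = conn.execute("""
    SELECT c.name, COUNT(o.order_id)
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.name
""").fetchall()
```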

Page 19:

Database Interaction

Page 20:

Advantages of an RDBMS (Scalability)

- Database can scale to the terabyte or petabyte range
  - NSA maintains 1.9 trillion telephone call records
- Large databases can span several servers and storage devices

Page 21:

Advantages of an RDBMS (Redundancy)

- Databases can be configured to write duplicate (redundant) information
  - Citibank
- Journaling and checkpointing are supported

Page 22:

Advantages of an RDBMS (Integrity)

- Relational integrity constraints are rules that apply to the relationships between tables
- Business integrity constraints enforce business rules
  - Not really a part of the DBMS itself
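
A small sketch of the two kinds of constraint, assuming SQLite as the DBMS: the foreign key is a relational constraint the database enforces, while the credit limit is a business rule checked in application code. The limit, table names, and `place_order` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1)")

def place_order(order_id, customer_id, total, credit_limit=500.0):
    """Return 'ok' or the kind of rule that rejected the order."""
    # Business integrity rule, checked in application code (assumed limit).
    if total > credit_limit:
        return "business rule violated"
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer_id, total))
    except sqlite3.IntegrityError:
        # Relational integrity rule, enforced by the DBMS: no such customer.
        return "relational constraint violated"
    return "ok"

result_ok = place_order(1, 1, 100.0)
result_orphan = place_order(2, 99, 100.0)     # customer 99 does not exist
result_too_big = place_order(3, 1, 9999.0)    # exceeds the assumed credit limit
```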

Page 23:

Advantages of an RDBMS (Information Security)

- A DBMS supports advanced access rights
  - By table and fields
  - By time of day
  - By location
  - By row information

Page 24:

Data-driven Web Sites

- Nearly all transactional Web sites rely on a database
  - Amazon
  - Your bank
  - Any shopping cart application
  - eBay or Craigslist
  - Facebook and YouTube

Page 25:

Database Integration

- Databases often need to be integrated
  - Because of mergers and acquisitions
  - Because of organizational changes
- We are referring to connections to multiple databases

Page 26:

Data Warehouses (Introduction)

- Central source for clean data
- May contain internal or external data
- Used to spot hidden patterns in data
- May be integrated with operational database
- Parts of a data warehouse are called data marts
- Data warehouses contain an analytical component

Page 27:

Cleansing Data

- Data is often obtained from a myriad of sources
  - External lists
  - Internal databases
  - Other databases
- This data must be cleansed and sanitized to remove
  - Redundancy / errors / etc.
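
As an illustration of cleansing, the sketch below standardizes names and email addresses, drops incomplete records, and removes duplicates. The cleansing rules and sample records are assumptions; real cleansing pipelines are usually far more involved.

```python
def cleanse(records):
    """Normalize names and emails, drop records without an email, and de-duplicate."""
    seen = set()
    clean = []
    for name, email in records:
        name = " ".join(name.split()).title()   # collapse whitespace, fix casing
        email = email.strip().lower()
        if not email or email in seen:          # incomplete record or redundancy
            continue
        seen.add(email)
        clean.append((name, email))
    return clean

raw = [
    ("ann  smith", "Ann@Example.com "),   # from an external list
    ("Ann Smith", "ann@example.com"),     # from an internal database, same person
    ("bo jones", ""),                     # incomplete record
    ("Cy Young", "cy@example.com"),
]
cleaned = cleanse(raw)
```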

Page 28:

Data Warehouses (Illustration)

Page 29:

Multidimensional Analysis

Data are often analyzed as 3-dimensional cubes

Cubes are then ‘sliced and diced’ to look at various layers
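
A toy cube, sketched as a Python dictionary keyed by three dimensions. The dimension names, the data, and the `select_cells` helper are hypothetical; fixing one dimension gives a slice, and fixing several gives a smaller sub-cube (a dice).

```python
# Hypothetical sales cube: (product, region, month) -> units sold.
cube = {
    ("Widget", "East", "Jan"): 10,
    ("Widget", "West", "Jan"): 4,
    ("Widget", "East", "Feb"): 7,
    ("Gadget", "East", "Jan"): 3,
    ("Gadget", "West", "Feb"): 8,
}
DIMENSIONS = ("product", "region", "month")

def select_cells(cube, **fixed):
    """Keep only the cells that match every fixed dimension value."""
    wanted = {DIMENSIONS.index(dim): value for dim, value in fixed.items()}
    return {
        key: units
        for key, units in cube.items()
        if all(key[i] == value for i, value in wanted.items())
    }

january_slice = select_cells(cube, month="Jan")                     # slice: one dimension fixed
east_widgets = select_cells(cube, product="Widget", region="East")  # dice: several fixed
```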

Page 30:

Multidimensional Analysis (Illustration)

Page 31:

The cost of Perfect Information

Page 32:

Database Design (Introduction)

- In the systems process, we design before we implement
  - Requirements specification
  - Conceptual design
  - Logical design
  - Physical design

Page 33:

Database Design Tools

- Unified Modeling Language (UML)
  - Visio
  - Rational Rose
- Entity relationship diagrams describe relationships between data
- Normalization eliminates redundant data
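
To make the normalization point concrete, this sketch splits a flat set of order rows, where the customer's city repeats on every order, into two tables. The rows and field names are invented, and using the customer name as the linking key is a simplification.

```python
# Unnormalized rows: the customer's city is repeated on every order.
flat_orders = [
    {"order_id": 10, "customer": "Ann", "city": "Reno", "total": 25.0},
    {"order_id": 11, "customer": "Ann", "city": "Reno", "total": 40.0},
    {"order_id": 12, "customer": "Bo", "city": "Fallon", "total": 15.0},
]

def normalize(rows):
    """Split flat rows into a customer table and an orders table."""
    customers = {}   # customer name -> city, stored once per customer
    orders = []
    for row in rows:
        customers[row["customer"]] = row["city"]
        orders.append({
            "order_id": row["order_id"],
            "customer": row["customer"],   # acts as the foreign key
            "total": row["total"],
        })
    return customers, orders

customers, orders = normalize(flat_orders)
```

After the split, a change of city is made in one place instead of on every order row.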

Page 34:

Database Management HR

- Database administrators
- Data managers
- Programmers and systems analysts
- Data security

Page 35:

BUSINESS INTELLIGENCE / DATA MINING

Page 36:

Business Intelligence (Introduction)

Simply put, it’s internal and external data used to support better decision making

It’s challenging to sift through the mountains of data

It requires cross-functional collaboration between systems

More in the next chapter, but we use ERP systems to improve business intelligence

Page 37:

Business Intelligence (Industries)

- BI applies to all industries
- Retail and sales
  - Understanding procurement and distribution (SCM) / customers (CRM)
- Banking
  - Understand credit worthiness / fraud behavior
- Insurance
  - Forecast claim risk and understand at-risk customers

Page 38:

Business Intelligence (Industries)

- Airlines
  - Routing planes / minimize turnaround time (Southwest)
- Marketing
  - Demographics
  - Sell based on known customer behavior (Harrah’s)
  - Amazon

Page 39:

Business Intelligence (Levels)

- Operational
  - Day-to-day operations (building a Dell)
- Tactical
  - Short term (Dell ordering supplies)
- Strategic
  - Long term organizational goals
- The systems that provide BI typically do so at all levels

Page 40:

BI Levels (Illustration)

Page 41:

BI and Latency

- From the time of acquisition, how long does it take to analyze (analysis latency)
- Time to make a decision based on the analysis
- E-transactions significantly reduce latency

Page 42:

Data Mining (Introduction)

- Data gets mined (analyzed) from data contained in a data warehouse or data mart
- Specialized tools are used to analyze data for ‘interesting nuggets’
- Ways to mine
  - Drill down (general to specific)
  - Drill up (specific to general)
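
Drilling down and up can be sketched as summarizing the same facts at different levels of a hierarchy. The year > quarter > month hierarchy, the figures, and the `summarize` helper are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, month, amount).
sales = [
    (2015, "Q1", "Jan", 100), (2015, "Q1", "Feb", 120), (2015, "Q1", "Mar", 90),
    (2015, "Q2", "Apr", 150), (2015, "Q2", "May", 130),
]
LEVELS = ("year", "quarter", "month")

def summarize(facts, level):
    """Total the amounts at the requested level of the year > quarter > month hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for *path, amount in facts:
        totals[tuple(path[:depth])] += amount
    return dict(totals)

by_year = summarize(sales, "year")        # most general view
by_quarter = summarize(sales, "quarter")  # drill down one level
by_month = summarize(sales, "month")      # drill down again
```

Going from `by_month` back to `by_quarter` or `by_year` is the drill-up direction.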

Page 43:

Data Mining (Clustering)

- Cluster analysis groups data by trait or traits
- Examples
  - Don’t drink the water in Fallon
  - Segment customers by zip codes
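
The zip code example amounts to grouping records on a shared trait, sketched below with made-up customers. This is the simplest case, where the trait is known in advance; cluster analysis tools more typically discover the groups from how similar the records are.

```python
from collections import defaultdict

# Hypothetical customers: (name, zip code).
customer_list = [
    ("Ann", "89501"), ("Bo", "89406"), ("Cy", "89501"),
    ("Di", "89406"), ("Ed", "89701"),
]

def segment_by(records, trait_index):
    """Group record names by the value found at `trait_index`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[trait_index]].append(record[0])
    return dict(groups)

segments = segment_by(customer_list, 1)  # one segment per zip code
```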

Page 44:

Data Mining (Association)

- Answers the question “What traits are associated with other traits?”
  - When I stay at Harrah’s,
    - I gamble
    - I eat at the Sage room
  - When I stay in Vegas, I gamble more
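
One common way to quantify an association is the share of records with one trait that also show another, often called confidence. The sketch below computes it over invented visit records; the trait names are placeholders.

```python
# Hypothetical visit records: the set of traits observed for each visit.
visits = [
    {"stay", "gamble", "dine"},
    {"stay", "gamble"},
    {"stay", "dine"},
    {"gamble"},
    {"stay", "gamble", "dine"},
]

def confidence(records, if_trait, then_trait):
    """Share of records containing `if_trait` that also contain `then_trait`."""
    with_if = [r for r in records if if_trait in r]
    if not with_if:
        return 0.0
    return sum(then_trait in r for r in with_if) / len(with_if)

stay_then_gamble = confidence(visits, "stay", "gamble")  # 3 of the 4 stays
```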

Page 45:

Data Mining (Statistical Analysis)

- It’s basic statistics
  - Analysis of variance
  - Correlation coefficients
  - Etc.
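
As one example of the statistics involved, here is a plain-Python Pearson correlation coefficient. The two series are made-up numbers chosen so the relationship is perfectly linear.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

ad_spend = [1, 2, 3, 4, 5]
units_sold = [2, 4, 6, 8, 10]   # perfectly linear in ad_spend

r = pearson(ad_spend, units_sold)
```

A value near 1 or -1 indicates a strong linear relationship; a value near 0 indicates little or none.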

Page 46:

BI Benefits

- We can understand what’s happening inside and outside a department
  - Sales knows about product inventory levels and production schedules
  - Production knows about sales and sales forecasts
  - Finance knows about the sales forecasts too
- This information is provided in near real time

Page 47:

Quantifying BI

- Some benefits can be clearly quantified
  - Costs went down
  - Productivity increased
  - Inventory levels were optimized 10%
- Some are indirectly quantified
- Some benefits are intangible
- Sometimes, we get unexpected results