ISM 50 - Business Information Systems Lecture 16 Instructor: Magdalini Eirinaki UC Santa Cruz May 24, 2007
ISM 50 - Business Information Systems

Lecture 16

Instructor: Magdalini Eirinaki
UC Santa Cruz
May 24, 2007

Outline

  Announcements
  MySQL case (cont’d)
  Student Presentation - FedEx
  Student Presentation - UPS
  OLTP
  OLAP

Announcements

Reading for 5/29
  Messerschmitt 11.2 (pp. 333-335)
  Messerschmitt 18.1 (pp. 498-505)
  Akamai Case (reader pp. 217-236)
  (Read book first)

Forthcoming Student Presentations
  5/29
    Evan Price (business paper)
    Matthew Payne (Akamai case)
  5/31
    Eric Gonzalez (business paper: WalMart)
    Derek Stern (business paper: Home Depot)

News Folio #3 due next Thursday (5/31)

mySQL Case

MySQL

What does MySQL make?

How successful is MySQL?
  Visibility: Fortune magazine, more mentions on the WWW
  Reaction from giants
  Revenue growth: 2001 $700k, 2002 $6.2m, 2003 $10m
  Good performance reviews
  Recent SAP alliance
  But market share is tiny: $10 million out of a $10 billion market!

Why success?
  Good technology
  Large DBMSs are bloated with features most users don’t need
  Innovative OSS model
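
The numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes the revenue figures are in US dollars and that the $10 billion market size refers to the same year as the $10 million revenue (2003); the variable names are illustrative only.

```python
# Figures quoted on the slide (assumed to be US dollars).
revenue = {2001: 700_000, 2002: 6_200_000, 2003: 10_000_000}
market_size = 10_000_000_000  # the "$10 billion market"

# Market share in 2003: revenue divided by total market size.
share_2003 = revenue[2003] / market_size

# Year-over-year growth multiples (this year's revenue / last year's).
growth = {
    year: revenue[year] / revenue[year - 1]
    for year in sorted(revenue)
    if year - 1 in revenue
}

print(f"2003 market share: {share_2003:.2%}")
for year, multiple in growth.items():
    print(f"{year}: {multiple:.1f}x the previous year's revenue")
```

Revenue grew quickly year over year, yet the share of the overall market stays at a tenth of a percent, which is the tension the slide points at.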

MySQL

How does OSS work?

Two types of license:
  GPL
    Free
    No support
    Any software that uses MySQL as a module must itself be made GPL
  Commercial license
    Support
    Can be distributed with non-open-source software
    Not free: MySQL Classic $250, Pro $495 (for ~50 users)
    Compare to:
      MSFT $3150 single proc for 50 users
      IBM $33000 single proc for 50 users
      Oracle $40000 single proc for 50 users
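
To make the price gap concrete, the list prices above can be expressed as multiples of the MySQL Pro license. This is a rough sketch that takes the slide's figures at face value and treats them as directly comparable, which is an assumption (the MySQL price is quoted for about 50 users, the others per single processor for 50 users).

```python
# List prices quoted on the slide (US dollars).
prices = {
    "MySQL Pro": 495,
    "MSFT": 3150,
    "IBM": 33000,
    "Oracle": 40000,
}

# Express every price as a multiple of the MySQL Pro price.
baseline = prices["MySQL Pro"]
ratios = {vendor: price / baseline for vendor, price in prices.items()}

for vendor, ratio in ratios.items():
    print(f"{vendor:10s} ${prices[vendor]:>6,}  {ratio:5.1f}x MySQL Pro")
```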

Aside: DBs in different software stacks

[Figure: four software stacks, each listed top layer first]

General software stack
  Application
  Middleware (DBMS)
  Operating System

ERP software stack
  SAP (or Oracle, Axapta, etc.)
  Oracle (or MySQL, IBM, etc.)
  MS Windows or other OS

Web application software stack
  Proprietary business logic
  Apache Web Server
  MySQL or other DB
  Linux or other OS

Banking software stack
  Proprietary banking app.
  Oracle or other DB
  IBM z/OS or other OS

Which companies are competitors? Which are complementary to each other? Which are both?

MySQL

Which segments of the market is MySQL strong in?
  Large companies or small companies?
  Web applications or critical enterprise data?

Why would a major enterprise want to pay so much more for an Oracle or IBM DB?

MySQL: market

[Figure: market grid. Company size: Small 20%, Medium 30%, Large 50%. Use: Enterprise-wide data 90%, Web sites 10%. Vendors shown: Microsoft, Oracle, IBM, MySQL. Buying criteria shown: reliability, scalability, support, longevity, cost.]

Figure adapted from “Teaching Note for MySQL Open Source Database,” 6/1/04, Stanford GSB.

How should MySQL grow in order to meet its stated goal of getting to $100 million in revenue?

MySQL: Growth Strategy

- Lack of brand identity in this segment
- MySQL lacks the organization to offer support
- Large enterprises have high switching costs

[Figure: market grid repeated from the market slide (company size by use, with vendors and buying criteria).]

Figure adapted from “Teaching Note for MySQL Open Source Database,” 6/1/04, Stanford GSB.

MySQL: Growth Strategy

- Not a big enough market to reach stated $100 million goal.

[Figure: market grid repeated from the market slide (company size by use, with vendors and buying criteria).]

Figure adapted from “Teaching Note for MySQL Open Source Database,” 6/1/04, Stanford GSB.

Stay Put?

MySQL: Growth Strategy

+ Many of these customers already using MySQL with websites
+ Less emphasis on global organization
+ Leverage SAP alliance
- Up against Microsoft.

[Figure: market grid repeated from the market slide (company size by use, with vendors and buying criteria).]

Figure adapted from “Teaching Note for MySQL Open Source Database,” 6/1/04, Stanford GSB.

Maybe?

MySQL: Growth Strategy

+ Builds on existing brand and strengths
- Market not so big

[Figure: market grid repeated from the market slide (company size by use, with vendors and buying criteria).]

Figure adapted from “Teaching Note for MySQL Open Source Database,” 6/1/04, Stanford GSB.

Maybe?

Student Presentations

Audrey McCloskey FedEx

Mark Sheldon UPS

Databases & OLTP

Databases

Recall - Two capabilities

  Aggregation: accessing multiple databases
  Sharing: two or more applications accessing the same databases

[Figure: Application I and Application II, each connected to databases]

Example - Travel Agency

[Figure: applications Travelocity.com and CheapTickets.com; resources Hotels, Cars, Airtickets]

What can go wrong?

Example - Travel Agency

[Figure: applications Travelocity.com and CheapTickets.com; resources Hotels, Cars, Airtickets]

A resource might be unavailable

Example - Travel Agency

[Figure: applications Travelocity.com and CheapTickets.com; resources Hotels, Cars, Airtickets]

Two applications might try to access & update the same resource concurrently
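
A minimal sketch of the concurrent-update problem, assuming a hypothetical in-memory `RoomInventory` class standing in for the shared hotel database. Two threads play the two travel sites competing for the last room; the lock makes the availability check and the update a single step, so only one of them can succeed. A real DBMS uses its own concurrency control rather than an application-level lock; this only illustrates the idea.

```python
import threading


class RoomInventory:
    """Hypothetical shared resource: rooms left in one hotel."""

    def __init__(self, rooms):
        self.rooms = rooms
        self._lock = threading.Lock()

    def book(self):
        # The check and the update happen under one lock, so two
        # applications cannot both see the last room as available.
        with self._lock:
            if self.rooms > 0:
                self.rooms -= 1
                return True
            return False


inventory = RoomInventory(rooms=1)
results = {}


def agent(name):
    # Each travel site tries to book the same last room.
    results[name] = inventory.book()


threads = [
    threading.Thread(target=agent, args=(site,))
    for site in ("Travelocity.com", "CheapTickets.com")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results, "rooms left:", inventory.rooms)
```

Which site wins depends on thread scheduling, but exactly one booking succeeds and the room count never goes negative.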

Example - Travel Agency

[Figure: applications Travelocity.com and CheapTickets.com; resources Hotels, Cars, Airtickets]

An application or a host might crash before the completion of the transaction

Example - Travel Agency

[Figure: applications Travelocity.com and CheapTickets.com; resources Hotels, Cars, Airtickets]

A customer’s transaction should be completed in its entirety, or aborted
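
The all-or-nothing requirement can be sketched with Python's built-in `sqlite3` module. The `inventory` table, its contents, and the `book_trip` helper are assumptions made for illustration; the point is that when the car reservation fails, the flight and hotel reservations made earlier in the same transaction are undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, available INTEGER)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [("airticket", 5), ("hotel", 2), ("car", 0)],  # no cars left
)
conn.commit()


def book_trip(conn, items):
    """Reserve one unit of every item, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for item in items:
                cur = conn.execute(
                    "UPDATE inventory SET available = available - 1 "
                    "WHERE item = ? AND available > 0",
                    (item,),
                )
                if cur.rowcount != 1:
                    raise RuntimeError(f"{item} unavailable")
        return True
    except RuntimeError:
        return False


ok = book_trip(conn, ["airticket", "hotel", "car"])
stock = dict(conn.execute("SELECT item, available FROM inventory"))
print(ok, stock)
```

Because the whole trip is one transaction, the failed booking leaves the ticket and hotel counts exactly as they were.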

Transaction Processing

“The coordination of multiple resources and the shared access to common resources in a systematic and consistent way”

Examples?
  Financial applications (stock market, ATMs)
  Reservations (travel, theatre)
  Manufacturing (inventory, purchasing, billing)
  Etc.

Online Transaction Processing (OLTP)

Transaction processing for networked applications

4 important properties of transactions: ACID
  Atomicity
  Consistency
  Isolation
  Durability

The ACID properties

Atomicity
  All transaction components should either complete together (commit) or abort
  E.g., all reservations (airline, hotel, car) should be grouped as a single transaction that either commits or aborts
Consistency
  A transaction must leave the system in a consistent state at the end of the transaction, or else abort
  E.g., either a consistent set of reservations has been made, or none
Isolation
  Concurrent transactions are allowed only if they don’t interfere with each other
  Two travel agents can concurrently access the same database if the reservations are for different dates/places
Durability
  A transaction leaves the resources in a permanent state after it commits
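
Consistency and durability can be illustrated with a small `sqlite3` sketch. The `flight` table, the `seats_left >= 0` rule, and the temporary file are all assumptions chosen for the example. Closing and reopening the database file is used here as a simple stand-in for "the state is permanent after commit"; it does not simulate a real crash.

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk database file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "reservations.db")

conn = sqlite3.connect(path)
conn.execute(
    "CREATE TABLE flight (id INTEGER PRIMARY KEY, "
    "seats_left INTEGER CHECK (seats_left >= 0))"  # consistency rule
)
conn.execute("INSERT INTO flight VALUES (1, 1)")
conn.commit()

# Consistency: a transaction that would break the rule is aborted.
try:
    with conn:
        conn.execute("UPDATE flight SET seats_left = seats_left - 2 WHERE id = 1")
    violated = False
except sqlite3.IntegrityError:
    violated = True

# Durability: a committed booking is still there after reconnecting.
with conn:
    conn.execute("UPDATE flight SET seats_left = seats_left - 1 WHERE id = 1")
conn.close()

reopened = sqlite3.connect(path)
seats_left = reopened.execute(
    "SELECT seats_left FROM flight WHERE id = 1"
).fetchone()[0]
reopened.close()
print(violated, seats_left)
```

The overbooking attempt is rejected and leaves the seat count untouched, while the valid booking is visible from a new connection.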

Structure of a Transaction

[Figure: durable starting state -> actions to be performed -> durable, consistent end state (on successful completion); on abort, rollback to the starting state]

OLTP

Simplifies application development

Enables protection and integrity of mission-critical data in a transparent way
  for the end user
  for the application developer

Data warehouses & OLAP

“Data Mining: Concepts and Techniques”
Jiawei Han
©2006 Jiawei Han and Micheline Kamber, All rights reserved

Recall - Decision Support

Knowledge management systems: turn data and information into knowledge
  A data warehouse stores an organization’s historical data
    Provides functionalities for summarizing, aggregating, and reporting on these data
  Data mining is the process of discovering patterns in large amounts of data

What is a Data Warehouse?

A decision support database that is maintained separately from the organization’s operational database

Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing

Slide adapted from slides for Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber. Copyright 2006. See copyright notice

Data Warehouse—Nonvolatile

Operational update of data does not occur in the data warehouse environment
  Does not require transaction processing, recovery, and concurrency control mechanisms
  Requires only two operations in data accessing: initial loading of data and access of data

Slide adapted from slides for Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber. Copyright 2006. See copyright notice

Data Warehouse—Time Variant

The time horizon for the data warehouse is significantly longer than that of operational systems
  Operational database: current value data
  Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)

Every key structure in the data warehouse
  Contains an element of time, explicitly or implicitly

Slide adapted from slides for Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber. Copyright 2006. See copyright notice

Data Warehouse vs. Operational DBMS

OLTP (on-line transaction processing)
  Major task of traditional relational DBMS
  Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.

OLAP (on-line analytical processing)
  Major task of data warehouse system
  Data analysis and decision making

Distinct features (OLTP vs. OLAP):
  User and system orientation: customer vs. market
  Data contents: current, detailed vs. historical, consolidated
  Database design: Relational (ER) vs. Star
  View: current, local vs. evolutionary, integrated
  Access patterns: update vs. read-only but complex queries

Slide adapted from slides for Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber. Copyright 2006. See copyright notice

OLAP Database

Data modeling
  “Star-schema”
  “Snowflake-schema”

OLAP Cube: Multidimensional Database
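
As a rough illustration of a star schema, the sketch below builds one fact table surrounded by two dimension tables in an in-memory SQLite database and runs a typical read-only aggregate query over them. All table names, column names, and sample rows are invented for the example; a snowflake schema would further split the dimension tables into normalized sub-tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "when" and "what" of each sale.
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
-- The fact table holds the measure plus one foreign key per dimension.
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_date    VALUES (1, 2006, 'Q4'), (2, 2007, 'Q1');
INSERT INTO dim_product VALUES (10, 'tickets'), (20, 'hotels');
INSERT INTO fact_sales  VALUES (1, 10, 100.0), (1, 20, 250.0),
                               (2, 10, 300.0), (2, 20, 50.0);
""")

# Analytical query: join the fact table to its dimensions and aggregate.
rows = conn.execute("""
    SELECT d.year, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d    ON d.date_id = f.date_id
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
""").fetchall()
print(rows)
```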

OLAP Database

OLAP Cube
  Enables analysis at various aggregation levels
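
One way to picture "various aggregation levels" is a roll-up: summing the cube's measure over the dimensions you are not interested in. The sketch below uses a plain dictionary as a toy cube; the dimensions (year, region, product), the sample values, and the `roll_up` helper are hypothetical and not taken from any OLAP product.

```python
from collections import defaultdict

# Hypothetical base cells of a cube: (year, region, product) -> sales.
cells = {
    (2006, "West", "tickets"): 100,
    (2006, "East", "tickets"): 150,
    (2007, "West", "tickets"): 300,
    (2007, "West", "hotels"): 50,
}
DIMENSIONS = ("year", "region", "product")


def roll_up(cells, keep):
    """Aggregate the measure over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)


by_year = roll_up(cells, ["year"])
by_year_region = roll_up(cells, ["year", "region"])
grand_total = roll_up(cells, [])
print(by_year, by_year_region, grand_total)
```

Keeping fewer dimensions gives a coarser summary; keeping none gives the grand total.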

OLTP vs. OLAP

                    OLTP                          OLAP
users               clerk, IT professional        knowledge worker
function            day-to-day operations         decision support
DB design           application-oriented          subject-oriented
data                current, up-to-date;          historical;
                    detailed, flat relational;    summarized, multidimensional;
                    isolated                      integrated, consolidated
usage               repetitive                    ad-hoc
access              read/write;                   lots of scans
                    index/hash on prim. key
unit of work        short, simple transaction     complex query
# records accessed  tens                          millions
# users             thousands                     hundreds
DB size             100MB-GB                      100GB-TB
metric              transaction throughput        query throughput, response

Slide adapted from slides for Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber. Copyright 2006. See copyright notice

Why Separate Data Warehouse?

High performance for both systems
  DBMS: tuned for OLTP (access methods, indexing, concurrency control, recovery)
  Warehouse: tuned for OLAP (complex OLAP queries, multidimensional view, consolidation)

Note: There are more and more systems which perform OLAP analysis directly on relational databases

Slide adapted from slides for Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber. Copyright 2006. See copyright notice