A Weapon in Your Competitive Arsenal: The Data Warehouse
John Rome, Arizona State University
2005 Fall Conference

Agenda
• Quiz
• Background
• Define Data Warehousing
• Discuss Latest Buzzwords
• Demo of Actual Data Warehouse
• Lessons Learned and Some Advice
• Demo/Questions/Discussion
• Later Today…
  – Data Mining, Dashboard and Data Quality
Quiz--Truth or Urban Legend?
1. Pizza Hut knows your favorite toppings, what you ordered last, and whether you like salad with your meat lover's pie?
2. Ekco sells more turkey basters during Christmas than Thanksgiving?
3. Four-wheel-drive green Subarus outsell blue ones by a wide margin, except in Wisconsin?
4. Walmart increases sales by placing diapers and beer next to each other?
About Arizona State University
• Located in the Phoenix metropolitan area
• 61,033 Students
• 5,393 Full-Time Administrative Staff
• 2,165 Full-Time Faculty
• Awarded Research I Status in 1994
• “New American University”
• http://www.asu.edu
“One University, Many Places”
About ASU’s Data Administration
• Reports to the President’s Office
• 5 Professional Staff, 4 Support Staff
• Mission: Data Access, Data Quality, and Data Education
• Supports Centralized/Decentralized Initiatives
• Data Warehouse is “full-employment”
• In Preliminary ERP discussions
• Close ties with IR office
• http://www.asu.edu/data_admin
Warehousing Was and Still is Hot...
• $8B Industry
• 90% of CIOs claim to be developing (Meta Group, 1998), with 99.9% today
• Higher Education Institutions are building them
• Keynotes at IR conferences!!
• Chapters in college textbooks
• Amazon.com barometer
“A Database with Snapshots of Data Dedicated for Reporting Purposes”
Why All the Fuss About Warehousing?
• Powerful Data Source for Reporting
• Fills in Gaps Left by Operational Systems
• Integrates Data from Silo Systems
• Both Strategic and Tactical
• Keeps Historical Data
• Assists Longitudinal Studies
• Helps Assessment and Retention
• Becoming Mission Critical to Organizations!
How Is a Warehouse Different?

Warehouse
• data is read-only
• managed redundancy
• serves management
• “time fixed” data
• “what if” processing
• historical trends
• response… minutes

OLTP
• data is updated
• minimal redundancy
• serves operational users
• “current value” data
• repetitive processing
• limited history
• response… seconds
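The difference between “current value” and “time fixed” data can be sketched in a few lines. The example below is only an illustration: it uses Python's built-in sqlite3 module to stand in for both systems, and the `student` (operational) and `student_snapshot` (warehouse) tables, their columns, and the sample values are all hypothetical, not taken from ASU's systems.

```python
import sqlite3

def build_demo():
    """Create an in-memory database with a hypothetical OLTP table and a
    hypothetical warehouse snapshot table (illustrative names only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, major TEXT)")
    conn.execute(
        "CREATE TABLE student_snapshot (snapshot_date TEXT, id INTEGER, major TEXT)"
    )
    conn.execute("INSERT INTO student VALUES (1, 'Biology')")
    return conn

def take_snapshot(conn, snapshot_date):
    """Copy the current operational rows into the warehouse, stamped with a date."""
    conn.execute(
        "INSERT INTO student_snapshot SELECT ?, id, major FROM student",
        (snapshot_date,),
    )

conn = build_demo()
take_snapshot(conn, "2005-09-01")
# OLTP side: the row is updated in place, so the old value is no longer stored there.
conn.execute("UPDATE student SET major = 'Chemistry' WHERE id = 1")
take_snapshot(conn, "2005-10-01")

# "Current value" data: one row, latest state only.
current = conn.execute("SELECT major FROM student WHERE id = 1").fetchall()
# "Time fixed" data: one row per snapshot, so the change is visible as a trend.
history = conn.execute(
    "SELECT snapshot_date, major FROM student_snapshot ORDER BY snapshot_date"
).fetchall()
print(current)
print(history)
```

The operational table answers "what is this student's major now?", while the snapshot table can also answer "what was it on a given date?", which is what historical trend reporting needs.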
Sample Warehouse Architecture
[Diagram. Labels: Other Sources; Mainframe, MVS/ESA, Legacy System (DB2/IDMS); Data Warehouse; NT Web Server (ASP, Cold Fusion); UNIX Web Server (Java); UNIX; connections via SQL/ODBC, SQL/JDBC, and SQL/”Native”]
Some “BI” Buzzwords
• OLAP
• MOLAP
• ROLAP
• Metadata
• Replication
• Aggregation
• Star Schema
• Multi-dimensional
• Facts/Dimensions
• Bit-Mapped Indexing
• Drill-Down
• Transformation Tools (ETL)
• De-Normalized
• Snowflake Schema
• Operational Data Store (ODS)
• XML
• Data Mining
• Data Quality
• Business Intelligence
• Dashboards
• SQL
• Data Mart
What is a Data Mart?
A data mart is often a very focused slice of a larger data warehouse.

Data Warehouse vs. Data Mart
| Data Warehouse | Data Mart
Scope | Enterprise | Specific business process
Data Perspective | Historical data; some summary; lightly denormalized | Current (some history); highly denormalized
Data Subjects 20-30 tables (each subject area) Multiple subjects
SQL. Structured Query Language (pronounced "sequel"). The lingua franca of data access in relational databases. It is used to build the queries that run against data warehouses.
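As an illustration of the kind of query SQL expresses, here is a minimal sketch that runs a headcount report through Python's built-in sqlite3 module. The `enrollment` table, its columns, the campus names, and the rows are made up for the example; a real warehouse query would target whatever tables the warehouse actually exposes.

```python
import sqlite3

def enrollment_by_term(conn):
    """Return (term, campus, headcount) rows from a hypothetical enrollment table."""
    query = """
        SELECT term, campus, COUNT(*) AS headcount
        FROM enrollment
        GROUP BY term, campus
        ORDER BY term, campus
    """
    return conn.execute(query).fetchall()

# Build a tiny in-memory table so the query has something to run against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student_id INTEGER, term TEXT, campus TEXT)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [
        (1, "2005 Fall", "Main"),
        (2, "2005 Fall", "Main"),
        (3, "2005 Fall", "West"),
        (1, "2005 Spring", "Main"),
    ],
)
rows = enrollment_by_term(conn)
for row in rows:
    print(row)
```

The same SELECT / GROUP BY pattern is what most ad hoc reporting tools generate behind the scenes.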
Tools Are Doing the Dirty Work
End User Access Tools
-Gartner Group
-Keith Gile, Forrester
What is ETL?
• Tool or process used to move data from one system/DB to another system/DB
• Over 100 ETL tools on market, about 10 serious contenders
• Range from free to $750K
• Better ones may be cost-prohibitive
• Database often has bulk load utilities
• Sometimes it's E.L.T. (load the data first after extract, then transform with programs or stored procedures after the load)
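The ETL versus E.L.T. distinction in the last bullet can be sketched as follows. This is a toy illustration, not a description of any particular ETL product: the source rows, the table names (`grades_etl`, `staging`, `grades_elt`), and the cleanup rule (trim and upper-case) are all assumptions made for the example, and SQLite stands in for the target database.

```python
import sqlite3

# Made-up source rows standing in for an extract from an operational system.
SOURCE_ROWS = [("  smith ", "A"), ("JONES", "b")]

def extract():
    """Extract step: here it simply returns the hard-coded sample rows."""
    return list(SOURCE_ROWS)

def etl(conn):
    """ETL: transform in the program first, then load the cleaned rows."""
    cleaned = [(name.strip().upper(), grade.upper()) for name, grade in extract()]
    conn.execute("CREATE TABLE grades_etl (name TEXT, grade TEXT)")
    conn.executemany("INSERT INTO grades_etl VALUES (?, ?)", cleaned)
    return conn.execute("SELECT name, grade FROM grades_etl ORDER BY name").fetchall()

def elt(conn):
    """ELT: load the raw rows into a staging table, then transform with SQL."""
    conn.execute("CREATE TABLE staging (name TEXT, grade TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", extract())
    conn.execute(
        "CREATE TABLE grades_elt AS "
        "SELECT UPPER(TRIM(name)) AS name, UPPER(grade) AS grade FROM staging"
    )
    return conn.execute("SELECT name, grade FROM grades_elt ORDER BY name").fetchall()

conn = sqlite3.connect(":memory:")
etl_rows = etl(conn)
elt_rows = elt(conn)
print(etl_rows == elt_rows)
```

Both paths end with the same cleaned rows; what differs is where the transformation work runs, in the moving program or inside the database after the load.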
ETL Example
What Is a Data Model?
• A graphical representation that identifies the information needs of the business: a data-driven, rather than function-based (or process-based), view of an organization.
• Completeness
• Simplicity
• No redundancy (OLTP)
• Enforcement of Business Rules
• Data Reusability
• Stability and Flexibility
• Communication Effectiveness
Some Design Guidelines
• Add element of time to the tables
• Appropriately name tables, attributes, views
• Add derived fields when necessary
• Make sure data integrates
• Consider security and privacy in design
• Consider performance (indexes, etc.)
• Make sure data model can answer the critical business questions
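A few of these guidelines (an element of time, a derived field, an index, and a check that the model answers a business question) can be shown together in a small sketch. Everything specific here is an assumption for illustration: the `student_term_snapshot` table, its columns, the sample rows, and the 12-credit-hour full-time threshold are hypothetical, and SQLite via Python's sqlite3 module stands in for the warehouse database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Element of time: every row carries the date of the snapshot it belongs to.
# Derived field: full_time_flag is computed at load time from credit_hours.
conn.execute(
    """
    CREATE TABLE student_term_snapshot (
        snapshot_date  TEXT    NOT NULL,
        student_id     INTEGER NOT NULL,
        credit_hours   INTEGER NOT NULL,
        full_time_flag TEXT    NOT NULL
    )
    """
)
# Performance: index the column that report queries are expected to filter on.
conn.execute(
    "CREATE INDEX ix_snapshot_date ON student_term_snapshot (snapshot_date)"
)

def load(conn, snapshot_date, rows, full_time_threshold=12):
    """Load (student_id, credit_hours) rows, deriving full_time_flag on the way in.
    The 12-hour threshold is an assumption made for this example."""
    conn.executemany(
        "INSERT INTO student_term_snapshot VALUES (?, ?, ?, ?)",
        [
            (snapshot_date, sid, hours, "Y" if hours >= full_time_threshold else "N")
            for sid, hours in rows
        ],
    )

load(conn, "2005-09-15", [(1, 15), (2, 6), (3, 12)])
# A sample business question: how many full-time students at each snapshot?
full_time_counts = conn.execute(
    "SELECT snapshot_date, COUNT(*) FROM student_term_snapshot "
    "WHERE full_time_flag = 'Y' GROUP BY snapshot_date"
).fetchall()
print(full_time_counts)
```

Storing the derived flag means every report applies the same full-time rule instead of each query re-implementing it.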
Display Your Model Proudly...
[Entity-relationship diagram. Entities: STUDENT, CLASS, CLASS MEETING TIME, COURSE, COURSE CATALOG, COLLEGE, CAMPUS. Relationship labels: takes, is offered by, offers, is identified by]
“Mona Lisa” “Wall Ware” “American Gothic”
Demo Time
• Ad Hoc Quer(ies) using BI Tool
• Retention Application using Web
About ASU’s Data Warehouse
• 10 years in the making
• Major subject areas (Student, HR, Financial)
• Supports over 1500 users
• “Poor Man’s Repository” for definitions
• Source of data – multiple operational