Module 5 Metadata, Tools, and Data Warehousing

Feb 24, 2016

zyta

Page 1: Module 5  Metadata, Tools, and Data Warehousing

ITEC 450

MODULE 5 Metadata, Tools, and Data Warehousing
Section 4 Data Warehouse Administration

Page 2: Module 5  Metadata, Tools, and Data Warehousing

DATA WAREHOUSE AND CHARACTERISTICS

A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data that is designed for query and analysis rather than for transaction processing.

Subject-oriented – data pertains to a particular subject instead of the many subjects pertinent to the company’s ongoing operations.

Integrated – consistent naming conventions, formats, and encoding structures across data drawn from multiple sources.

Time-variant – data is identified with a particular time period, so trends and changes can be studied.

Non-volatile (non-updatable) – data in a data warehouse is stable. Data is loaded and should not be updated or removed.

Page 3: Module 5  Metadata, Tools, and Data Warehousing

COMPARISON OF DATABASE CHARACTERISTICS

Page 4: Module 5  Metadata, Tools, and Data Warehousing

DATA WAREHOUSE AND BUSINESS INTELLIGENCE

A data warehouse usually contains historical data derived from transaction data and other sources.

It enables an organization to consolidate data.

It includes:
  An extraction, transportation, transformation, and loading (ETL) solution
  An online analytical processing (OLAP) engine
  Client analysis tools
  Reporting

Page 5: Module 5  Metadata, Tools, and Data Warehousing

ANALYTICAL VS. TRANSACTION PROCESSING

Analytical processing – informational systems
  DSS – decision support system
  OLAP – online analytical processing
  Data mining – the process of discovering new information, in the form of patterns or rules, from vast amounts of data

Transaction processing – operational systems
  OLTP – online transaction processing

Page 6: Module 5  Metadata, Tools, and Data Warehousing

DATA WAREHOUSE DESIGN

Star schema – a data modeling technique used to map multidimensional decision support data into a relational database. It is excellent for ad-hoc queries, but poorly suited to online transaction processing. It contains four components:
  Fact table
  Dimension tables
  Attributes
  Attribute hierarchies

Snowflake schema – a star schema in which the dimension tables have additional relationships, that is, the dimension tables are themselves normalized into further related tables
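
The star schema described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the course's own example: the retail tables (fact_sales, dim_date, dim_product), their columns, and the sample rows are all hypothetical. It shows one fact table keyed to two dimension tables, and an ad-hoc query that joins them and rolls revenue up by dimension attributes.

```python
import sqlite3

# Hypothetical retail star schema: one fact table, two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER NOT NULL,   -- attribute hierarchy: year > quarter > month
    quarter  INTEGER NOT NULL,
    month    INTEGER NOT NULL
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    category    TEXT NOT NULL,   -- attribute hierarchy: category > name
    name        TEXT NOT NULL
);
CREATE TABLE fact_sales (
    date_key    INTEGER NOT NULL REFERENCES dim_date (date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product (product_key),
    units_sold  INTEGER NOT NULL,
    revenue     REAL NOT NULL
);
""")

conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
    [(20160115, 2016, 1, 1), (20160220, 2016, 1, 2)],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Books", "SQL Primer"), (2, "Games", "Chess Set")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(20160115, 1, 3, 60.0), (20160115, 2, 1, 25.0), (20160220, 1, 2, 40.0)],
)
conn.commit()

# Ad-hoc analytical query: join the fact table to its dimensions and aggregate.
revenue_by_category_month = conn.execute("""
    SELECT p.category, d.month, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()

print(revenue_by_category_month)
```

In a snowflake variant of this sketch, category would move out of dim_product into its own table that dim_product references.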

Page 7: Module 5  Metadata, Tools, and Data Warehousing

STAR SCHEMA COMPONENTS

Page 8: Module 5  Metadata, Tools, and Data Warehousing

STAR SCHEMA EXAMPLE

Page 9: Module 5  Metadata, Tools, and Data Warehousing

DATA MOVEMENT – ETL PROCESS

ETL – Extract, Transform, and Load
  Capture – extract, or obtain a snapshot of, a chosen subset of the source data for loading into the data warehouse
  Scrub (data cleansing) – uses pattern recognition and AI techniques to upgrade data quality
  Transform – convert data from the format of the operational system to the format of the data warehouse
  Load – place transformed data into the warehouse and create indexes
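
The four steps above can be sketched as a small pipeline. This is an illustrative sketch under stated assumptions: the source rows, the dim_customer table, and the function names are hypothetical, and a simple regular expression stands in for the pattern recognition that a real scrubbing tool would apply.

```python
import re
import sqlite3

# Hypothetical operational rows: (customer name, signup date MM/DD/YYYY, state).
SOURCE_ROWS = [
    ("  alice smith ", "02/24/2016", "va"),
    ("Bob Jones", "13/45/2016", "VA"),   # impossible date
    ("carol  lee", "01/05/2015", "Md"),  # outside the chosen subset
]

DATE_PATTERN = re.compile(r"^(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/(\d{4})$")


def capture(rows, year):
    """Capture: snapshot the chosen subset (here, signups in one year)."""
    return [row for row in rows if row[1].endswith("/" + str(year))]


def scrub(rows):
    """Scrub: normalize names and states, drop rows whose date fails the pattern."""
    cleaned = []
    for name, signup, state in rows:
        if DATE_PATTERN.match(signup):
            cleaned.append((" ".join(name.split()).title(), signup, state.strip().upper()))
    return cleaned


def transform(rows):
    """Transform: convert MM/DD/YYYY (source) to ISO YYYY-MM-DD (warehouse)."""
    converted = []
    for name, signup, state in rows:
        month, day, year = DATE_PATTERN.match(signup).groups()
        converted.append((name, f"{year}-{month}-{day}", state))
    return converted


def load(conn, rows):
    """Load: place the transformed rows in the warehouse table, then index it."""
    conn.execute("CREATE TABLE dim_customer (name TEXT, signup_date TEXT, state TEXT)")
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_customer_state ON dim_customer (state)")
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(scrub(capture(SOURCE_ROWS, 2016))))
loaded = warehouse.execute("SELECT name, signup_date, state FROM dim_customer").fetchall()
print(loaded)
```

Keeping each step a separate function makes it possible to time and tune them individually, which matters once extract performance becomes a concern.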

Page 10: Module 5  Metadata, Tools, and Data Warehousing

DATA WAREHOUSE PERFORMANCE

Perspectives of data warehouse performance
  Extract performance – how the ETL process performs
  Data management – database design and data quality
  Query performance – OLAP tuning
  Server performance – hardware support

Automated summary tables
  Provide a proper set of aggregate information
  Commonly implemented with materialized views or batch operation tables

DBMS features to support data warehousing
  Materialized views – automatic creation of summaries
  Bitmap indexes – widely used in data warehousing, in addition to B-tree indexes
  Parallel execution – multiple processes work together simultaneously to run a single SQL statement
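
A summary table maintained by a batch operation can be sketched as follows. SQLite has no materialized views, so this sketch uses the batch-table approach instead; the table names, columns, and the refresh_summary helper are hypothetical. The point it illustrates is that the summary answers aggregate queries without rescanning the detail rows, but it is only as current as its last refresh.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (sale_month TEXT, region TEXT, revenue REAL);
CREATE TABLE summary_sales_by_month (
    sale_month    TEXT PRIMARY KEY,
    total_revenue REAL,
    sale_count    INTEGER
);
""")


def refresh_summary(conn):
    """Batch operation: rebuild the summary from the detail rows in one transaction."""
    with conn:
        conn.execute("DELETE FROM summary_sales_by_month")
        conn.execute("""
            INSERT INTO summary_sales_by_month
            SELECT sale_month, SUM(revenue), COUNT(*)
            FROM fact_sales
            GROUP BY sale_month
        """)


def read_summary(conn):
    return conn.execute(
        "SELECT sale_month, total_revenue, sale_count "
        "FROM summary_sales_by_month ORDER BY sale_month"
    ).fetchall()


with conn:
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [("2016-01", "East", 100.0), ("2016-01", "West", 50.0), ("2016-02", "East", 75.0)],
    )
refresh_summary(conn)
first = read_summary(conn)

# New detail rows do not reach the summary until the next batch refresh.
with conn:
    conn.execute("INSERT INTO fact_sales VALUES ('2016-02', 'West', 25.0)")
stale = read_summary(conn)
refresh_summary(conn)
fresh = read_summary(conn)

print(first, stale, fresh, sep="\n")
```

A DBMS with materialized views can take over this refresh step, which is the "automatic creation of summaries" the slide refers to.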

Page 11: Module 5  Metadata, Tools, and Data Warehousing

MODULE 5 Metadata, Tools, and Data Warehousing
Section 5 DBA Rules of Thumb

Page 12: Module 5  Metadata, Tools, and Data Warehousing

THE RULES OF THUMB

Personal DBA handbook
  Write down your own experience
  Categorize it in a searchable note or repository

Back up everything and plan for the worst all the time
  Before making any changes, ensure that you can recover from them

Automate and share your knowledge
  Create a systematic way to troubleshoot problems
  Create, reuse, and share scripts
  Knowledge sharing will open many avenues for you

Next levels
  Understand the business, not just the technology
  Keep up-to-date on technology
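
The rule "before making any changes, ensure that you can recover from them" can be turned into a reusable script. The sketch below is one possible shape, assuming a SQLite database and Python's sqlite3 module: it uses the real Connection.backup method, while apply_change_safely, no_negative_balances, and the accounts table are hypothetical names made up for the illustration.

```python
import sqlite3


def apply_change_safely(conn, change_sql, is_healthy):
    """Back up, apply the change, and restore the backup if the check fails."""
    saved = sqlite3.connect(":memory:")
    conn.backup(saved)                 # full copy taken before the change
    try:
        conn.executescript(change_sql)
        if is_healthy(conn):
            return True
        saved.backup(conn)             # recover: copy the saved state back
        return False
    finally:
        saved.close()


def no_negative_balances(conn):
    """Hypothetical health check: no account may end up overdrawn."""
    rows = conn.execute("SELECT id FROM accounts WHERE balance < 0").fetchall()
    return not rows


db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
INSERT INTO accounts VALUES (1, 100), (2, 50);
""")

bad_applied = apply_change_safely(
    db, "UPDATE accounts SET balance = balance - 500 WHERE id = 1;", no_negative_balances
)
after_bad = db.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

good_applied = apply_change_safely(
    db, "UPDATE accounts SET balance = balance - 30 WHERE id = 1;", no_negative_balances
)
after_good = db.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

print(bad_applied, after_bad, good_applied, after_good)
```

A production script would write the backup to durable storage rather than memory and would also restore when the change itself raises an error; both are left out here to keep the sketch short.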

Page 13: Module 5  Metadata, Tools, and Data Warehousing

COURSE SUMMARY (YOUR LEARNING)

DBA Roles and Responsibilities
DBMS Architecture, Physical and Logical Structures
DBMS Installation and Database Creation
Database Connectivity and Network Components
Database Security and Audit Capability
Database Backup and Recovery
Database Monitoring, DBMS System Tuning, Physical Configuration Optimization
SQL Query Coding and Tuning, Data Loading
Database Metadata, Data Dictionary
Data Warehouse Characteristics and Overview