www.netpeach.com Business Intelligence Overview By: Netpeach BI Team

Business Intelligence Overview

Jan 14, 2015

Page 1: Business Intelligence Overview

www.netpeach.com

Business Intelligence Overview

By:Netpeach BI Team

Page 2: Business Intelligence Overview

Business Intelligence Overview

- Definition
- Architecture
  - Source systems / OLTP
  - ETL process
  - Data Warehouses / OLAP
- OLTP vs. OLAP
- ODS and Data Marts
- Data Warehouse Design Approaches
- Dimensional Modeling
- From Enterprise models to Dimensional models
- Schema Types: Star, Snowflake, Fact Constellation
- Conclusion

Page 3: Business Intelligence Overview

Definition

The term business intelligence (BI) refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business information.

The purpose of business intelligence is to support better business decision making. BI systems provide historical, current, and predictive views of business operations, most often using data that has been gathered into a data warehouse or a data mart and occasionally working from operational data.

Page 4: Business Intelligence Overview

BI enables enterprises to:

- measure performance and trends
- use analytic information strategically
- unlock the value of their information
- identify opportunities
- improve efficiency
- perform competitive analysis
- find the cause of a business outcome
- perform data mining
- etc.

Page 5: Business Intelligence Overview

Examples

• Cause & predictive analysis: Credit card annual fee

• Performance and trends: Region total sales / our sales

• Competitive: Our sales / competitor sales in a particular region or location, etc.

• Right timing: Bank customer accounts (pattern changes)

• Data mining: market basket analysis

Page 6: Business Intelligence Overview

Architecture

[Figure: BI architecture diagram. Labeled layers: Source Systems; Staging/ETL (Extract, Clean, Transform, Integrate, Loading); Target Systems (ODS, EDW, DM, etc.); BI Data Presentation (Analysis Services, Reporting Services, OLAP Services, Dashboards / Scorecards, Alerts / Notifications).]

Page 7: Business Intelligence Overview

Architecture cont…

Typical BI architecture has the following components:

• A source system, also called an operational system: typically an online transaction processing (OLTP) system, but other systems or files that capture or hold data of interest are also possible.

• An extraction, transformation, and loading (ETL) process.

• A data warehouse: typically an online analytical processing (OLAP) system.

• A business intelligence platform such as MicroStrategy.

Page 8: Business Intelligence Overview

Source Systems (OLTP)

An operational system is a term used in data warehousing to refer to a system that is used to process the day-to-day transactions of an organization. These systems are designed so processing of day-to-day transactions is performed efficiently and the integrity of the transactional data is preserved.

Operational systems are sometimes referred to as operational databases, transaction processing systems, or online transaction processing (OLTP) systems. In OLTP systems, relational database design uses the discipline of data modeling and generally follows Codd's rules of data normalization in order to ensure data integrity.

Page 9: Business Intelligence Overview

Source Systems examples

- Account transactions in a bank.
- Sales transactions in a retail outlet.
- Inventory management transactions in a warehouse.
- Workforce management transactions such as attendance, vacations, overtime tracking, etc.
- Operational expenditure systems.
- External sources, such as industry information like elasticity or demand of a product from third-party sources in the retail domain.
- Etc.

Page 10: Business Intelligence Overview

ETL – Extraction, Transformation and Loading

The Extraction, Transformation, and Loading (ETL) process represents all the steps necessary to move data from different source systems to an integrated data warehouse.

The ETL process involves the following steps:

- Data is gathered from various source systems.
- The data is transformed and prepared to be loaded into the data warehouse. Transformation procedures can include converting data types and names, eliminating unwanted data, correcting typographical errors, aggregating data, filling in incomplete data, and similar processes to standardize the format and structure of data.
- The data is loaded into the data warehouse.
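
The three ETL steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source rows, the `sales` table, and the function names `extract`, `transform`, and `load` are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from different source systems.
SOURCE_ROWS = [
    {"cust": " Alice ", "amount": "120.50", "region": "north"},
    {"cust": "BOB", "amount": "80", "region": "North"},
    {"cust": "carol", "amount": None, "region": "south"},  # incomplete record
]


def extract():
    """Extract: gather rows from the source systems (here, an in-memory list)."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: drop unusable rows, convert types, and standardize formats."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:  # eliminate unwanted (incomplete) data
            continue
        cleaned.append(
            (
                row["cust"].strip().title(),    # standardize customer names
                float(row["amount"]),           # convert text to a number
                row["region"].strip().lower(),  # standardize region codes
            )
        )
    return cleaned


def load(conn, rows):
    """Load: write the prepared rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(conn, transform(extract()))
print(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall())
```

A real ETL tool adds scheduling, error handling, and incremental loads; the sketch only shows the order of the steps.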

Page 11: Business Intelligence Overview

Data Warehouse (OLAP)

A data warehouse, in its simplest form, is no more than a collection of the key pieces of information used to manage and direct the business for the most profitable outcome.

- According to Bill Inmon, “a data warehouse is a subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management decisions”.
- Ralph Kimball states that a data warehouse is “a copy of transaction data specifically structured for query and analysis”.

Page 12: Business Intelligence Overview

OLAP

OLAP: a category of software tools that provides analysis of data stored in a database. OLAP tools enable users to analyze different dimensions of multidimensional data. For example, they provide time series and trend analysis views. OLAP is often used in data mining.

Page 13: Business Intelligence Overview

OLAP Analysis

Imagine an organization that manufactures and sells goods in several states of the USA.

During OLAP analysis, top executives may seek answers to the following:

- Number of products manufactured.
- Number of products manufactured in a location.
- Number of products manufactured over a period of time within a location.
- Number of products manufactured in the current year compared to the previous year.
- Sales dollar value for a particular product.
- Sales dollar value for a product in a location.
- Sales dollar value for a product in a year within a location.
- Sales dollar value for a product in a year within a location, sold or serviced by an employee.
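
Each question above adds one more dimension to the grouping. A minimal Python sketch of that idea, assuming a hypothetical list of sales facts and a helper named `roll_up` (neither comes from the slides):

```python
from collections import defaultdict

# Hypothetical sales facts: (product, state, year, sales_dollars).
FACTS = [
    ("widget", "TX", 2013, 100.0),
    ("widget", "TX", 2014, 150.0),
    ("widget", "CA", 2014, 200.0),
    ("gadget", "CA", 2014, 50.0),
]
DIMENSIONS = ("product", "state", "year")


def roll_up(facts, group_by):
    """Sum sales dollars grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(name) for name in group_by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]  # the measure is the last column
    return dict(totals)


# Sales dollar value for a product, then per location, then per year.
by_product = roll_up(FACTS, ["product"])
by_product_state = roll_up(FACTS, ["product", "state"])
by_product_state_year = roll_up(FACTS, ["product", "state", "year"])
print(by_product)
```

OLAP tools typically precompute or index such groupings so that moving between levels of detail stays fast on large volumes.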

Page 14: Business Intelligence Overview

OLTP / OLAP

| OLTP | FEATURE | OLAP |
|---|---|---|
| Transactional applications using a front end: data capture, modify, delete; no direct DB access | PURPOSE | Analysis: analyse data; read only; sometimes direct access to DB |
| Operational administrative staff, data entry operator, database professional, etc. | TYPE OF USERS | Manager, analyst, executive, executive management |
| Relational data structures | DATA STRUCTURES | Multidimensional data structures |
| Normalized DBMS | DUPLICATED DATA | De-normalized & normalized DBMS |
| Many | NUMBER OF USERS | Few |
| Predefined operations | WORKLOAD | Ad hoc queries, predefined reports |
| Volatile | DATA MODIFICATIONS | Updated on a regular basis |
| Small volume (current data) | DATA VOLUME | Large volume (historical data) |
| Availability must be high | | Response time must be good |

Page 15: Business Intelligence Overview

DW related: ODS and Data Marts

ODS (Operational Data Store) - has a broad, enterprise-wide scope, but unlike the real enterprise data warehouse, its data is refreshed in near real time and used for routine business activity.

Data Mart - a subset of a data warehouse that supports a particular region, business unit, or business function.
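
As a rough illustration of a data mart being a subset of a warehouse, the sketch below copies one region's rows out of a warehouse table. The table names `warehouse_sales` and `mart_north_sales` and the sample rows are assumptions made for the example, with SQLite standing in for both stores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical warehouse table covering every region.
    CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL);
    INSERT INTO warehouse_sales VALUES
        ('north', 'widget', 100.0),
        ('south', 'widget', 70.0),
        ('north', 'gadget', 25.0);

    -- The mart holds only the rows that one region needs.
    CREATE TABLE mart_north_sales AS
        SELECT product, amount FROM warehouse_sales WHERE region = 'north';
    """
)

mart_rows = conn.execute(
    "SELECT product, amount FROM mart_north_sales ORDER BY product"
).fetchall()
print(mart_rows)
```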

Page 16: Business Intelligence Overview

Typical positioning of an ODS

[Figure: positioning of an ODS. Labeled elements: Source Applications (Application 1, Application 2, ..., Application n), ETL, ODS, EDW, and three Data Marts.]

Page 17: Business Intelligence Overview

ODS vs DW

| ODS | DW |
|---|---|
| Designed to support operational monitoring | Designed to support the decision-making process |
| Data is volatile | Data is non-volatile |
| Current data | Historical data |
| Designed for running the business | Designed for analyzing the business |
| Follows normalization | Follows de-normalization |
| Designed using E/R modeling | Designed using dimensional modeling |

Page 18: Business Intelligence Overview

ODS and DW use case

In a pharmaceutical company:

The customer ODS is used for:
- sending new product details
- promotional activities
- scheduling appointments

The DW is used to answer:
- In a month, what is the total value of medicines prescribed by a doctor?
- What is our company's share?
- Is the doctor missing any information from us?

Page 19: Business Intelligence Overview

Data Warehouse design approaches

Kimball - Let everybody build what they want when they want it; we'll integrate it all when and if we need to. (BOTTOM-UP APPROACH)

Pros: fast to build, quick ROI, nimble
Cons: harder to maintain as an enterprise resource, often redundant, often difficult to integrate data marts

Inmon - Don't do anything until you've designed everything. (TOP-DOWN APPROACH)

Pros: easy to maintain, tightly integrated
Cons: takes way too long to deliver first projects, rigid

Page 20: Business Intelligence Overview

Dimensional data modeling

• Dimensional data modeling is a logical design technique that seeks to present the data in a standard framework that is intuitive and allows high-performance access.

• A data model specifically for designing data warehouses

• The method was developed based on observations of practice, and in particular, providing data in “user-friendly” form.

Page 21: Business Intelligence Overview

From ER Models to Dimensional Models

A typical process in an enterprise

Page 22: Business Intelligence Overview

Sample OLTP data model

Page 23: Business Intelligence Overview

Step 1. Classify Entities

Transaction entities:
- record an event that happened at a point in time
- contain measurements or quantities

Component entities:
- are directly related to a transaction entity
- answer questions like the “who”, “what”, “when”, “where”, “how” and “why” of a business event

In a sales application, the component entities are:
- Customer: who made the purchase
- Product: what was sold
- Location: where it was sold
- Period: when it was sold

Classification entities:
- are related to component entities by a chain of one-to-many relationships
- represent hierarchies embedded in the data model

Page 24: Business Intelligence Overview

Step 2. Identify Hierarchies

• A hierarchy in an Entity Relationship model is any sequence of entities joined together by one-to-many relationships, all aligned in the same direction.
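
To make the definition concrete, the sketch below walks a small set of one-to-many relationships and lists the maximal chains they form. The entity names, the `(one_side, many_side)` encoding, and the function `find_hierarchies` are assumptions for illustration, and the sketch assumes the relationships contain no cycles.

```python
# Hypothetical one-to-many relationships, written as (one_side, many_side).
ONE_TO_MANY = [
    ("Region", "State"),
    ("State", "Location"),
    ("Location", "Sale"),
    ("ProductType", "Product"),
    ("Product", "Sale"),
]


def find_hierarchies(relationships):
    """Return maximal chains of entities linked by one-to-many relationships."""
    children = {}
    many_sides = set()
    for one, many in relationships:
        children.setdefault(one, []).append(many)
        many_sides.add(many)

    # A chain starts at an entity that never appears on the "many" side.
    roots = [one for one in children if one not in many_sides]
    chains = []

    def walk(entity, path):
        path = path + [entity]
        if entity not in children:  # no further "many" side: chain ends here
            chains.append(path)
            return
        for child in children[entity]:
            walk(child, path)

    for root in roots:
        walk(root, [])
    return chains


for chain in find_hierarchies(ONE_TO_MANY):
    print(" -> ".join(chain))
```

With this sample data, both chains end at the transaction entity `Sale`, and the entities above `Location` and `Product` play the role of classification entities.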

Page 25: Business Intelligence Overview

Step 3. Produce Dimensional Models

Operators for producing dimensional models:
- Operator 1: Collapse Hierarchy
- Operator 2: Aggregation

There is a wide range of options for producing dimensional models from an Entity Relationship model.

These include:
- Star Schema
- Snowflake Schema
- Constellation / Integrated Schema
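
The two operators can be sketched on plain Python dictionaries. The slides give no definitions, so this assumes the usual reading: collapsing a hierarchy copies a higher-level entity's attributes into the lower-level entity, and aggregation summarizes a transaction entity. The sample rows and the function names `collapse_hierarchy` and `aggregate` are hypothetical.

```python
# Hypothetical normalized rows: each Location points to its State via state_id.
STATES = {1: {"state_name": "Texas", "region": "South"}}
LOCATIONS = [
    {"location_id": 10, "city": "Austin", "state_id": 1},
    {"location_id": 11, "city": "Dallas", "state_id": 1},
]
SALES = [
    {"location_id": 10, "amount": 100.0},
    {"location_id": 10, "amount": 15.0},
    {"location_id": 11, "amount": 40.0},
]


def collapse_hierarchy(children, parents, foreign_key):
    """Operator 1: copy each parent's attributes into its child rows."""
    collapsed = []
    for child in children:
        row = {k: v for k, v in child.items() if k != foreign_key}
        row.update(parents[child[foreign_key]])
        collapsed.append(row)
    return collapsed


def aggregate(transactions, group_key, measure):
    """Operator 2: summarize a transaction entity by one of its keys."""
    totals = {}
    for row in transactions:
        totals[row[group_key]] = totals.get(row[group_key], 0.0) + row[measure]
    return totals


print(collapse_hierarchy(LOCATIONS, STATES, "state_id"))
print(aggregate(SALES, "location_id", "amount"))
```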

Page 26: Business Intelligence Overview

Star Schema

• A fact table is formed for each transaction entity. The key of the table is the combination of the keys of its associated component entities.

• A dimension table is formed for each component entity, by collapsing hierarchically related classification entities into it.

A star schema consists of one large central table, called the fact table, and a number of smaller tables, called dimension tables, which radiate out from the central table.
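
A small star schema can be sketched with Python's built-in `sqlite3` module. All table names, column names, and rows below are made up for illustration: the fact table's key combines the dimension keys plus the sale date, and each dimension table carries its hierarchy columns (product type, state, region) collapsed into it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one per component entity, hierarchy columns collapsed in.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY, name TEXT, product_type TEXT);
    CREATE TABLE dim_location (
        location_id INTEGER PRIMARY KEY, city TEXT, state TEXT, region TEXT);

    -- Fact table: its key combines the keys of the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        sale_date TEXT,
        amount REAL,
        PRIMARY KEY (product_id, location_id, sale_date));

    INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
    INSERT INTO dim_location VALUES
        (10, 'Austin', 'TX', 'South'), (11, 'Boston', 'MA', 'East');
    INSERT INTO fact_sales VALUES
        (1, 10, '2014-01-05', 100.0),
        (1, 11, '2014-01-05', 40.0),
        (2, 10, '2014-01-06', 15.0);
    """
)

# One join per dimension is enough to report sales by region.
region_totals = conn.execute(
    """
    SELECT l.region, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    GROUP BY l.region
    ORDER BY l.region
    """
).fetchall()
print(region_totals)
```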

Page 27: Business Intelligence Overview

Sample Star Schema

Page 28: Business Intelligence Overview

Snowflake Schema

A snowflake schema is a star schema with all hierarchies explicitly shown.
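
For comparison, here is a sketch of a snowflaked location dimension, again using `sqlite3` with invented table names and rows. Each level of the hierarchy (location, state, region) keeps its own table, so a report by region needs one join per level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Each level of the location hierarchy keeps its own table.
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE dim_state (
        state_id INTEGER PRIMARY KEY, state TEXT,
        region_id INTEGER REFERENCES dim_region(region_id));
    CREATE TABLE dim_location (
        location_id INTEGER PRIMARY KEY, city TEXT,
        state_id INTEGER REFERENCES dim_state(state_id));
    CREATE TABLE fact_sales (
        location_id INTEGER REFERENCES dim_location(location_id),
        sale_date TEXT, amount REAL);

    INSERT INTO dim_region VALUES (1, 'South'), (2, 'East');
    INSERT INTO dim_state VALUES (1, 'TX', 1), (2, 'MA', 2);
    INSERT INTO dim_location VALUES (10, 'Austin', 1), (11, 'Boston', 2);
    INSERT INTO fact_sales VALUES
        (10, '2014-01-05', 100.0),
        (11, '2014-01-05', 40.0),
        (10, '2014-01-06', 15.0);
    """
)

# Reaching the region level now takes three joins instead of one.
region_totals = conn.execute(
    """
    SELECT r.region, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    JOIN dim_state AS s ON s.state_id = l.state_id
    JOIN dim_region AS r ON r.region_id = s.region_id
    GROUP BY r.region
    ORDER BY r.region
    """
).fetchall()
print(region_totals)
```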

Page 29: Business Intelligence Overview

Star vs. Snowflake

| | Star Schema | Snowflake Schema |
|---|---|---|
| Ease of maintenance/change | Has redundant data and is hence harder to maintain and change | No redundancy and is hence easier to maintain and change |
| Ease of use | Less complex queries; easy to understand | More complex queries; harder to understand |
| Query performance | Fewer foreign keys and hence shorter query execution time | More foreign keys and hence longer query execution time |
| Space | Has de-normalized tables and hence takes more space | Has normalized tables and hence takes less space |
| Good for | Data marts with simple relationships (1:1 or 1:many) | The data warehouse core, to simplify complex relationships (many:many) |
| When to use | When a dimension hierarchy contains more levels, it is good practice to use a star schema, as it requires few joins and improves performance | When a dimension hierarchy contains fewer levels and the data volume is relatively large, a snowflake schema is better, as it reduces space while adding few joins |

A star schema does not support many-to-many relationships between attributes in a dimension, as each dimension is de-normalized into a single table.

Page 30: Business Intelligence Overview

Fact constellation schema

The fact constellation architecture contains multiple fact tables that share many dimension tables
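
A minimal sketch of the idea, with hypothetical tables: two fact tables (`fact_sales` and `fact_production`) share the same product and location dimensions, so a single report can line up both measures through the shared dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Shared dimension tables.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT);

    -- Two fact tables that reference the same dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        units_sold INTEGER);
    CREATE TABLE fact_production (
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        units_made INTEGER);

    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_location VALUES (10, 'Austin');
    INSERT INTO fact_sales VALUES (1, 10, 30), (2, 10, 5);
    INSERT INTO fact_production VALUES (1, 10, 50), (2, 10, 5);
    """
)

# Aggregate each fact table separately, then combine through the shared dimension.
made_vs_sold = conn.execute(
    """
    SELECT p.name,
           (SELECT SUM(units_made) FROM fact_production
             WHERE product_id = p.product_id),
           (SELECT SUM(units_sold) FROM fact_sales
             WHERE product_id = p.product_id)
    FROM dim_product AS p
    ORDER BY p.name
    """
).fetchall()
print(made_vs_sold)
```

Aggregating each fact table on its own before combining avoids double counting, which a direct join between the two fact tables could cause.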

Page 31: Business Intelligence Overview

Constellation / Integrated Schema

A constellation schema consists of a set of star schemas with hierarchically linked fact tables.

Page 32: Business Intelligence Overview

Step 4. Evolution and Refinement

• Check if we can combine any fact tables

• Check if we can combine any dimension tables

• Handle subtypes

Page 33: Business Intelligence Overview

Conclusion

Below are a few of the most popular tools:

ETL tools
- Informatica
- Data Junction
- DataStage
- Ab Initio
- SSIS
- Oracle Warehouse Builder
- Pentaho
- Talend

OLAP tools
- Business Objects
- Cognos PowerPlay
- MicroStrategy
- Hyperion Essbase
- SSAS
- SSRS
- Oracle Express
- Oracle OLAP option

Databases
- Teradata
- Netezza

Page 34: Business Intelligence Overview

Questions & Answers