1.Delivery Process 2.System Process Anahory
Page 1: Delivery Process

1. Delivery Process

2. System Process

Anahory

Page 2: Delivery Process

Delivery Process

The process that delivers a data warehouse has to be fundamentally different from the traditional waterfall method

The issue with DW projects is that it is difficult to complete the tasks and deliverables in the strict, ordered fashion demanded by a waterfall method, because requirements are rarely fully understood and are expected to change over time

Knock-on effect

Architectures, designs, and build components cannot be completed until the requirements are completed, which can lead to constant requirement iteration without delivery, i.e. “paralysis by analysis”

Page 3: Delivery Process

Steps in DW delivery method

1. IT Strategy

- DWs are strategic investments

- They require business processes to be redesigned in order to generate the projected benefits

- If there is no overall IT strategy that includes a DW, it is difficult to procure and retain funding for the project

Page 4: Delivery Process

Steps in DW delivery method

2. Business case

Identify the projected business benefits that should be derived from using the data warehouses

Benefits may or may not be quantifiable (Ex: $5000 savings per annum)

Projected benefits should be clearly stated

DWs that do not have a clear business case tend to suffer from credibility problems at some stage during the delivery process

Page 5: Delivery Process

Steps in DW delivery method

3. Education and Prototyping

Organizations will experiment with the concept of data analysis and educate themselves on the value of a data warehouse

In some instances the data warehouse may be the first large-scale client-server solution being implemented within the organization, and will require

new skills

experiences

hardware

Page 6: Delivery Process

Steps in DW delivery method

A prototyping activity on a small scale can further the education process as long as

1. The prototype addresses a clearly defined technical objective

2. The prototype can be thrown away once the feasibility of the concept has been shown

3. The activity addresses a small subset of the eventual data content of the DW

4. The activity time scale is not critical; it is seen as a timeboxed effort to come to grips with the new technologies being considered

Page 7: Delivery Process

Steps in DW delivery method

4. Business Requirements

To produce a set of production-quality deliverables that grow into a full solution

Overall requirements should be understood

Overall system architecture is in place

20% of the time within the business requirements phase should be spent on understanding the longer-term requirements

Determine the logical model for information within DW

Determine source systems that provide the data

Page 8: Delivery Process

Steps in DW delivery method

Determine business rules to be applied to data

Determine query profiles for the immediate requirement

Determine which aspects of the data may not be available from the existing operational systems

It is probably not feasible to populate the DW with that data: manual processes to supplement the data captured by the extract and load process are generally unreliable

Page 9: Delivery Process

Steps in DW delivery method

5. Technical blueprint

Delivers an overall architecture that satisfies the longer-term requirements

Defines the components that must be implemented in the short term in order to derive any business benefit

The blueprint must identify

- Overall system architecture

- Server and data mart architecture

- Essential components of DB design

Page 10: Delivery Process

Steps in DW delivery method

- Data retention strategy

- Backup and recovery strategy

- Capacity plan for hardware and infrastructure (LAN, WAN)

Detailed design of DB is not produced in this stage

Significant components are identified and sized

Page 11: Delivery Process

Steps in DW delivery method

6. Building the Vision

First production deliverable is produced

Smallest component of DW that adds business benefit

Ex: this stage builds the major infrastructure components for extracting and loading data, but limits them to the extraction and load of one or two data sources, with minimal history

Page 12: Delivery Process

Steps in DW delivery method

7. History Load

Remainder of the required history is loaded into DW

New entities would not be added into DW

Physical entities would be created to store increased data volumes

Ex: the Building the Vision phase has delivered a retail sales analysis DW with 3 months' worth of history

- Business users analyze recent trends and address short-term sales issues

- Does not provide sufficient data to identify annual or seasonal sales trends

Page 13: Delivery Process

Steps in DW delivery method

- The next step could be to backload two years' worth of sales history from archive tape, which allows business users to analyze trends year on year

- Data volumes become larger

- Operational management issues become more complex: the likelihood of disk failure increases dramatically and load processes take much longer to execute

- For this reason, the backloading of history is carried out as a separate phase

Page 14: Delivery Process

Steps in DW delivery method

8. Ad hoc Query

Configure an ad hoc query tool to operate against the DW

End-user access tools are capable of automatically generating the DB query that answers a question posed by the user.

Users will typically pose questions in terms that they are familiar with (Ex: sales by store last week). The access tool, which is aware of the structure of the information within the DW, converts the question into a DB query
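As a rough illustration of how an access tool might turn "sales by store last week" into a DB query, the sketch below maps business terms to columns through a small metadata dictionary. The table name `sales_fact`, the column names, and the helper functions are all hypothetical; real query tools use far richer metadata layers.

```python
from datetime import date, timedelta

# Hypothetical metadata describing one fact table; every name is illustrative.
SALES_MODEL = {
    "fact_table": "sales_fact",
    "measures": {"sales": "SUM(sale_amount)"},
    "dimensions": {"store": "store_name", "product": "product_name"},
    "date_column": "sale_date",
}

def last_week(today):
    """Return (Monday, Sunday) of the calendar week before `today`."""
    this_monday = today - timedelta(days=today.weekday())
    return this_monday - timedelta(days=7), this_monday - timedelta(days=1)

def build_query(measure, dimension, start, end, model=SALES_MODEL):
    """Translate a business question into SQL text plus bind parameters."""
    dim_col = model["dimensions"][dimension]
    agg = model["measures"][measure]
    sql = (
        f"SELECT {dim_col}, {agg} AS {measure} "
        f"FROM {model['fact_table']} "
        f"WHERE {model['date_column']} BETWEEN ? AND ? "
        f"GROUP BY {dim_col}"
    )
    return sql, (start.isoformat(), end.isoformat())

# "Sales by store last week", asked on Wednesday 15 May 2024.
start, end = last_week(date(2024, 5, 15))
sql, params = build_query("sales", "store", start, end)
print(sql)
print(params)
```

The user only supplies the terms "sales", "store", and "last week"; the mapping to physical columns lives in the metadata, which is why the DW should not be designed around one specific tool.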

Page 15: Delivery Process

Steps in DW delivery method

9. Automation

Operational management processes are fully automated within the data warehouse. These include

1. Extracting and loading the data from a variety of source systems

2. Transforming the data into a form suitable for analysis

3. Backing up, restoring and archiving data

4. Generating aggregations from predefined definitions within the DW

5. Monitoring query profiles, and determining the appropriate aggregations to maintain system performance
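Item 5 can be pictured with a very small sketch: count how often each grouping profile appears in a query log and propose an aggregation for the frequent ones. The log format, the `min_share` threshold, and the function name are assumptions made for illustration, not part of the method described here.

```python
from collections import Counter

# Hypothetical query log: each entry is the tuple of columns a query grouped by.
query_log = [
    ("store", "week"),
    ("store", "week"),
    ("product", "month"),
    ("store", "week"),
    ("product", "month"),
    ("region", "day"),
]

def suggest_aggregations(log, min_share=0.3):
    """Return grouping profiles that make up at least `min_share` of all queries."""
    counts = Counter(log)
    total = sum(counts.values())
    return [profile for profile, n in counts.most_common() if n / total >= min_share]

candidates = suggest_aggregations(query_log)
print(candidates)
```

With this sample log, the store/week and product/month profiles pass the threshold, while the single region/day query does not justify a summary of its own.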

Page 16: Delivery Process

Steps in DW delivery method

10. Extending Scope

The DW is extended to address a new set of business requirements

Additional data sources are loaded into the DW, and new data marts are created

11. Requirement Evolution

Requirements are never static

Business requirements will constantly change during the life of the DW. The process should support this and allow these changes to be reflected within the system

Page 17: Delivery Process

Data Warehouse delivery process

IT strategy

Education

Technical blue print

Building the vision

History Load

Ad-hoc query

Automation

Business case analysis

Business requirements

Requirement evolution

Extending scope

Page 18: Delivery Process

Accessing the DW

1. Do not design the DW around a specific tool or tool type

2. To fully understand the user requirement and round them out, you must gain an understanding of the business

3. Make sure that period information is captured by department, group and any other organization divisions

4. It is imperative to get the level of detail at which data must be stored correct. If this decision is made incorrectly, the DW will have to be completely reorganized at some future date

Page 19: Delivery Process

System Process

Data warehouses must be architected to support 3 major factors

1. Populating the warehouse

2. Day-to-day management of the warehouse

3. Ability to cope with requirements evolution

Page 20: Delivery Process

1. Populating the warehouse

- Cleaning up the data

- Making it available for analysis

- Typically done on a daily basis after the close of the business day

Page 21: Delivery Process

2. Day-to-day management

-different from the management of an operational system

-Volumes are larger and require active management such as creating/deleting summaries, or rolling data on/off the archive

-Essential in order to satisfy business requirements

Page 22: Delivery Process

3. Ability to cope with requirement evolution

-Tends to be the most complex aspect of a DW

-Requires the architecture to be structured to cope with future changes in query profiles

-Evolution of completely new subject areas

Page 23: Delivery Process

Typical Process Flow within DW

1. Extract and load the data

2. Clean and transform the data into the form that can cope with large data volumes and provide good query performance

3. Back up and archive data

4. Manage queries and direct them to the appropriate data sources

Page 24: Delivery Process

1. Extract and Load process

-Extracting data from the sources

-Loading into the DB

-Stripping out any detail that is there to support the operational systems rather than the business requirements

-Adding more context

-Reconciling data with the other data sources

Page 25: Delivery Process

1. Extract and Load process

a. Controlling the process

Mechanisms that determine when to

start extracting the data

run the transformations

run the consistency checks

b. When to initiate extraction

Start extracting data from a data source only when it represents the same snapshot of time as all the other data sources
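The snapshot rule could be enforced by a simple gate in the controlling process, sketched below. It assumes each source system reports the last business date it has closed; the source names and the `source_status` structure are hypothetical.

```python
from datetime import date

def ready_to_extract(source_status, business_date):
    """True only when every source has closed the same business date."""
    return all(closed == business_date for closed in source_status.values())

def lagging_sources(source_status, business_date):
    """Names of the sources that do not yet represent the required snapshot."""
    return sorted(name for name, closed in source_status.items()
                  if closed != business_date)

# Hypothetical status: billing has not yet closed 14 May.
status = {
    "orders": date(2024, 5, 14),
    "inventory": date(2024, 5, 14),
    "billing": date(2024, 5, 13),
}
target = date(2024, 5, 14)

if ready_to_extract(status, target):
    print("start extraction")
else:
    print("waiting for:", lagging_sources(status, target))
```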

Page 26: Delivery Process

1. Extract and Load process

c. Loading the data

- Do not execute consistency checks until all the data sources have been loaded into the temporary data store

- Expect the effort required to clean up the source systems to increase exponentially with the number of overlapping data sources
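A minimal sketch of deferring consistency checks until every source is staged, using an in-memory SQLite database as the temporary data store. The staging table names, the two sources, and the reconciliation rule (matching order id and amount) are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the temporary data store
conn.execute("CREATE TABLE stage_orders (order_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE stage_billing (order_id INTEGER, amount REAL)")

EXPECTED_SOURCES = {"stage_orders", "stage_billing"}
loaded = set()

def load(table, rows):
    """Load one source into its staging table and record that it has arrived."""
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    loaded.add(table)

def unmatched_orders():
    """Cross-source check; refuses to run until every source is staged."""
    if loaded != EXPECTED_SOURCES:
        raise RuntimeError(f"still waiting for: {sorted(EXPECTED_SOURCES - loaded)}")
    cur = conn.execute(
        "SELECT o.order_id FROM stage_orders o "
        "LEFT JOIN stage_billing b "
        "ON b.order_id = o.order_id AND b.amount = o.amount "
        "WHERE b.order_id IS NULL ORDER BY o.order_id"
    )
    return [row[0] for row in cur]

load("stage_orders", [(1, 10.0), (2, 25.5), (3, 7.0)])
try:
    unmatched_orders()
except RuntimeError as exc:
    print(exc)  # billing has not been staged yet

load("stage_billing", [(1, 10.0), (2, 20.0)])
print(unmatched_orders())  # orders whose billing row is missing or differs
```

Running the check early would flag every order as unmatched simply because billing had not arrived, which is the false alarm the rule above is meant to avoid.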

Page 27: Delivery Process

2. Clean and Transform Data

1. Clean and transform the loaded data into a structure that speeds up queries

2. Partition the data in order to speed up queries, optimize hardware performance and simplify management of the DW

3. Create aggregations to speed up the common queries
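To make step 3 concrete, the following sketch builds one aggregation (a summary table) from a detail table in an in-memory SQLite database. The table and column names are hypothetical, and a production warehouse would normally use its own DBMS features for summaries and partitioning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (store TEXT, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        ("North", "2024-05-06", 100.0),
        ("North", "2024-05-07", 50.0),
        ("South", "2024-05-06", 75.0),
        ("South", "2024-06-01", 20.0),
    ],
)

# Aggregation: one row per store and month instead of one row per sale.
conn.execute(
    """
    CREATE TABLE sales_by_store_month AS
    SELECT store, substr(sale_date, 1, 7) AS month, SUM(amount) AS total
    FROM sales_fact
    GROUP BY store, substr(sale_date, 1, 7)
    """
)

rows = conn.execute(
    "SELECT store, month, total FROM sales_by_store_month ORDER BY store, month"
).fetchall()
for row in rows:
    print(row)
```

A common query such as "monthly sales by store" can then read three summary rows rather than scanning every detail row.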

Page 28: Delivery Process

Clean and transform the data

Make sure data is consistent within itself

Make sure that data is consistent with other data within the same source

Make sure that data is consistent with other data in other source systems

Make sure the data is consistent with the information already in the warehouse
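The four levels of consistency above might be expressed as four small checks, as in this sketch. The record layout, the specific rules (quantity times unit price equals amount, unique order ids, and so on), and the sample data are all assumptions for illustration.

```python
def check_record(rec):
    """Level 1: the record is consistent within itself."""
    return abs(rec["quantity"] * rec["unit_price"] - rec["amount"]) < 0.005

def check_within_source(records):
    """Level 2: no duplicate keys within the same source."""
    ids = [r["order_id"] for r in records]
    return len(ids) == len(set(ids))

def check_across_sources(records, other_source_ids):
    """Level 3: every order is also known to the other source system."""
    return all(r["order_id"] in other_source_ids for r in records)

def check_against_warehouse(records, known_stores):
    """Level 4: every store already exists in the warehouse."""
    return all(r["store"] in known_stores for r in records)

# Hypothetical staged data.
orders = [
    {"order_id": 1, "store": "North", "quantity": 2, "unit_price": 5.0, "amount": 10.0},
    {"order_id": 2, "store": "East", "quantity": 1, "unit_price": 3.0, "amount": 3.0},
]
billing_ids = {1, 2}
warehouse_stores = {"North", "South"}

report = {
    "within_record": all(check_record(r) for r in orders),
    "within_source": check_within_source(orders),
    "across_sources": check_across_sources(orders, billing_ids),
    "against_warehouse": check_against_warehouse(orders, warehouse_stores),
}
print(report)
```

In this sample only the last check fails, because the store "East" is not yet known to the warehouse.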

Page 29: Delivery Process

Transforming into effective structures

-Convert the source data in the temporary data store into a structure that is designed to balance query performance and operational cost

Page 30: Delivery Process

3. Backup and Archive Process

Data in the data warehouse is backed up regularly to ensure that the DW can always be recovered from data loss and from software and hardware failures

In archiving, older data is removed from the system in a format that allows it to be quickly restored if required

It is common to archive the data as a flat file extract, where the file is in a format that allows the data to be fast loaded directly into the relevant fact and dimension tables
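A flat file archive and reload can be sketched with CSV and an in-memory SQLite table, as below. The table name, the cutoff date, and the use of an in-memory buffer in place of a real archive file are assumptions; a real warehouse would use its DBMS bulk loader for the fast load.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (store TEXT, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        ("North", "2022-01-10", 40.0),
        ("South", "2022-03-02", 60.0),
        ("North", "2024-05-06", 100.0),
    ],
)

def archive_before(cutoff, out):
    """Write rows older than `cutoff` to a CSV stream, then delete them."""
    rows = conn.execute(
        "SELECT store, sale_date, amount FROM sales_fact "
        "WHERE sale_date < ? ORDER BY sale_date",
        (cutoff,),
    ).fetchall()
    csv.writer(out).writerows(rows)
    conn.execute("DELETE FROM sales_fact WHERE sale_date < ?", (cutoff,))
    return len(rows)

def restore(archive):
    """Reload archived rows straight into the fact table."""
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", csv.reader(archive))

def row_count():
    return conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]

buf = io.StringIO(newline="")  # stands in for the archive file
archived = archive_before("2023-01-01", buf)
remaining = row_count()

buf.seek(0)
restore(buf)
restored = row_count()
print(archived, remaining, restored)
```

Because the file columns match the fact table columns one for one, the restore needs no transformation step, which is what makes the reload quick.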

Page 31: Delivery Process

4. Query Management process

Manages the queries and speeds them by directing queries to the most effective data source

Ensures system resources are used in an efficient way

Does not generally operate during the regular load of information
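Directing a query to the most effective data source can be pictured as choosing the smallest table that still contains every dimension the query needs. The catalogue of summary tables, their row counts, and the `route` function below are hypothetical.

```python
# Hypothetical catalogue: summary table -> (dimensions it contains, approximate rows).
SUMMARIES = {
    "sales_by_store_month": ({"store", "month"}, 2_000),
    "sales_by_store_product_month": ({"store", "product", "month"}, 500_000),
}
FACT_TABLE = "sales_fact"  # detail data, used when no summary fits

def route(dimensions):
    """Pick the smallest table that covers every dimension the query needs."""
    needed = set(dimensions)
    candidates = [
        (rows, name)
        for name, (dims, rows) in SUMMARIES.items()
        if needed <= dims
    ]
    return min(candidates)[1] if candidates else FACT_TABLE

print(route(["store", "month"]))
print(route(["product", "month"]))
print(route(["store", "day"]))
```

The first query is answered from the small store/month summary, the second needs the larger summary because it involves products, and the third falls back to the detail table because no summary holds daily data.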