Best Practices: Data Administration and Quality Daniel Linstedt, all rights reserved, http://LearnDataVault.com

Best Practices: Data Admin & Data Management

Jan 19, 2015


I built this presentation for Informatica World in 2006. It is all about Data Administration, Data Quality and Data Management. It is NOT about the Informatica product. This presentation was a hit: standing room only, with about 150 people attending. The content is still useful and applicable today. If you want to use my material, please credit it as (C) Dan Linstedt, all rights reserved, http://LearnDataVault.com
Transcript
Page 1: Best Practices: Data Admin & Data Management

Best Practices: Data Administration and Quality

Daniel Linstedt, all rights reserved, http://LearnDataVault.com


Introduction and Expectations

• Author, Inventor, Speaker – and part time photographer…

• 25+ years in the IT industry

• Worked in DoD, US Gov’t, Fortune 50, and so on…

• Find out more about the Data Vault:

• http://www.youtube.com/LearnDataVault

• http://LearnDataVault.com

• Full profile on http://www.LinkedIn.com/dlinstedt


Agenda

• Introductions and expectations

• Defining data administration issues

• Applying best practices

• Conclusions and Q&A


Defining Data Administration Issues


What is Data Administration?

“What do we mean by that in the case of data administration? We mean that DA must get out of the design review committee mentality and substitute something more value-added and flexible. It must recognize that systems tend to grow organically, and be a part of that process, rather than an instiller of order upon it.”

Eric Rawlins, 1995

Originally published by: Database Research Group, Inc., http://www.well.com/user/woodman/organic.html


The Role of Data Administration

• Data administration and management are key roles in today's enterprise projects

• Data administration is a part of data management; the two should be used together

• Compliance, accountability, and governance provide a foundationally strong and tenable architecture, and must be a part of the DA's working knowledge


Cross-Organization Roles and Responsibilities

[Diagram: roles arranged across three views: Business (Owner View), Logical (Designer View), and Physical (Builder View). Roles shown: Data Steward, Discipline Authority, Business Process Manager, Data Administrator, Database Administrator, Data Usage Contact, Data Manager, Data Modeler.]

DA is a ROLE and typically involves more than one person in order to achieve success.


Data Administrator Responsibilities

• Crossing the organization, building accountability across data sets

• Providing governance over master data and master metadata sets

• Assisting the data modeler in managing logical data models and matching these to business processes

• Ensuring the physical data set meets pre-designed metrics and measures, providing gap analysis between what-was-designed and what-is-implemented

• Interfacing with the business users to ensure master metadata is meeting their needs

• Promoting manageable and traceable systems growth through standards and metrics measurements


Top 10 Data Administration Issues

1. Inadequate or missing master metadata

2. Ineffective master data management

3. Incomplete logical models

4. Undefined business process models

5. Missing process control and metrics measurements

6. Non-defined user access matrices

7. Ineffective change management

8. Missing element classification system

9. Lack of user-training material

10. Mismatched data performance SLAs with DBA objectives


Defining Data Administration Issues

Top 4 Examples


Defining Master Metadata

• Master Metadata

• Information describing the elements/attributes that make up the master data structure, and how those attributes are used

• These metadata are agreed upon by the business users to be universal in definition

• Questions to ask

• Why is master metadata important?

• What are the impacts of missing master metadata?

• Why is master metadata a part of the DA world?

• How does a DA build a master metadata management program?


Defining Master Data Management

• Master Data

• Information housed in a single, consistent, quality-cleansed reference table, located at a single location

• All elements except the surrogate key in the master data set are defined by master metadata at a global (enterprise-, or sometimes industry-wide) level

• Questions to ask

• Why is master data important?

• What are the impacts of master data?

• Why is master data a part of the DA world?

• How does a DA build a master data management program?

• Is master data connected to master metadata? How?
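One way to picture the link between master data and master metadata is a small check that every element of a master reference table, other than the surrogate key, has an agreed definition. This is only a sketch: the names `MASTER_METADATA`, `SURROGATE_KEY`, `undefined_elements`, and the customer columns are hypothetical and do not come from the presentation.

```python
# Hypothetical master metadata: one agreed business definition per element.
MASTER_METADATA = {
    "customer_name": "Legal name of the customer as agreed by the business",
    "country_code": "Two-letter code of the customer's country",
}

SURROGATE_KEY = "customer_sk"  # assumed name; exempt from master metadata


def undefined_elements(columns, metadata, surrogate_key=SURROGATE_KEY):
    """Return the columns that lack a master metadata definition."""
    return sorted(
        col for col in columns
        if col != surrogate_key and not metadata.get(col, "").strip()
    )


# A master data table whose 'segment' column was never defined by the business.
customer_columns = ["customer_sk", "customer_name", "country_code", "segment"]
print(undefined_elements(customer_columns, MASTER_METADATA))
```

A non-empty result is a prompt for a conversation with the business owners, not something to fix in code.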


Assessing Logical Model Viability

• Logical Data Models

• A business view or representation of data integration in a data modeling format containing relationships and dependencies

• Questions to ask

• When was the last time the logical models were compared to the business process diagrams?

• When was the last time the logical model was reviewed with the business users?

• Do all the elements in the logical model contain metadata defined by key business individuals?

• Does the logical model match the physical model?
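The last question, whether the logical model matches the physical model, can be approached as a simple set comparison. The sketch below assumes both models can be exported as a mapping of entity name to attribute names; the function `model_gap` and the sample models are hypothetical.

```python
def model_gap(logical, physical):
    """Compare entity -> attribute names between logical and physical models."""
    gaps = {}
    for entity in sorted(set(logical) | set(physical)):
        designed = set(logical.get(entity, ()))
        built = set(physical.get(entity, ()))
        missing = sorted(designed - built)      # designed but not implemented
        unplanned = sorted(built - designed)    # implemented but never designed
        if missing or unplanned:
            gaps[entity] = {"missing": missing, "unplanned": unplanned}
    return gaps


# Hypothetical models, for illustration only.
logical_model = {"customer": ["name", "country"], "invoice": ["amount"]}
physical_model = {"customer": ["name", "cntry_cd"]}

for entity, gap in model_gap(logical_model, physical_model).items():
    print(entity, gap)
```

Name-based matching is a deliberate simplification: a renamed column such as `cntry_cd` shows up as both missing and unplanned, which is where master metadata would be needed to reconcile the two.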


Defining Business Process Models

• Business Process Models

• A graphical flow of business processes including key data sets, dependencies, and key business processes

• The processes identified are often referred to as critical path components

• Questions to ask

• Why is BPR a part of data administration?

• What impact do BPMs have on data administration?

• Do all the elements in the logical model contain metadata defined by key business individuals?

• Does the logical model match the physical model?

[Diagram: example business process flow from Start to End through nodes A to G]


Applying Best Practices


Revealing the DA Best Practices

• Review/construct logical data model to meet business needs

• Establish master data management strategy

• Audit data on a regular basis, ensure KPIs and metrics are met

• Meet with end-users to synchronize metadata, logical data models, and business process flows

• Maintain standards and end-user access paths

• Build metrics and SLAs to monitor the DA process

• Engineer and deploy a metadata management strategy


DA: MDM and Master Metadata

Many times we see a cross-role responsibility of data management and data administration. The cross-role is responsible for the following:

1. Defining work breakdown structure (WBS) as it pertains to the data and metadata, and business processes themselves

2. Defining organizational breakdown structure (OBS) as it pertains to the data and business process ownership

3. Mapping business processes to logical data

4. Defining and deploying data governance strategies

5. Classifying metrics for data errors, and auditable sources of data

6. Managing and tracking KPIs to architectural goals, aligning logical models, and business processes (data flow) to current business objectives


Work Breakdown Structure

• Assume that the requirements and the types of deliverables are fairly well understood

• Code: source, mappings, workflows, errors, etc.

• Documents: design spec., test plan, test cases, user manuals, etc.

• Training: end user training, support personnel training, etc.

• For each of the deliverables, consider the set of activities that will be employed to develop the deliverable (based on the process/procedure chosen)

• Map the deliverable against the chosen activities, and consider the sequencing of the activities, including any inter-deliverable relationships


Organizational Breakdown Structure

• Identify the workers within the IT community who are involved with Informatica, and Informatica support

• Also identify the business technical liaison (business lead) from the business side and ensure the sponsor’s participation.

• Cross the OBS with the WBS for a complete view of the work assignments

• This will also help determine the roles and responsibilities
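Crossing the OBS with the WBS can be sketched as a small responsibility matrix that also exposes deliverables nobody owns and roles with no work. Everything below is an assumption for illustration: the deliverables, the role names, the `assignments` mapping, and the `cross` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical WBS deliverables and OBS roles; none come from the slides.
wbs = ["mappings", "test plan", "end user training"]
obs = ["ETL developer", "business lead", "support analyst"]

# Assumed assignments: (deliverable, role) -> responsibility.
assignments = {
    ("mappings", "ETL developer"): "builds",
    ("test plan", "ETL developer"): "writes",
    ("test plan", "business lead"): "approves",
}


def cross(wbs_items, obs_roles, assigned):
    """Build a deliverable -> {role: responsibility} matrix and list gaps."""
    matrix = defaultdict(dict)
    for (item, role), duty in assigned.items():
        if item not in wbs_items or role not in obs_roles:
            raise ValueError(f"unknown deliverable or role: {item!r}, {role!r}")
        matrix[item][role] = duty
    unowned = [item for item in wbs_items if item not in matrix]
    idle = [role for role in obs_roles
            if not any(role in roles for roles in matrix.values())]
    return dict(matrix), unowned, idle


matrix, unowned, idle = cross(wbs, obs, assignments)
print("unowned deliverables:", unowned)
print("roles without work:", idle)
```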


DA: Architecting Data Governance

[Diagram: two architectures compared. Non-compliant: Source Systems, Business Rules & IQ, EDW, Data Marts. Compliant: Source Systems, Hard Business Rules, EDW, Business Rules & IQ, Data Marts.]

• Soft business rules and IQ shift to process AFTER the EDW

• Hard business rules still process BEFORE the EDW
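The placement of rules around the EDW can be sketched as two separate functions applied at different points in the flow. The slide does not define what counts as a hard or a soft rule, so the split used here (hard rules as technical alignment such as trimming and casting, soft rules as business interpretation) is an assumption, as are all function and field names.

```python
def apply_hard_rules(row):
    """Before the EDW: assumed technical alignment only (trim, cast)."""
    return {"customer": row["customer"].strip(), "hours": float(row["hours"])}


def apply_soft_rules(row):
    """After the EDW: assumed business interpretation, kept out of the EDW."""
    flagged = dict(row)
    flagged["valid_hours"] = row["hours"] >= 0  # hypothetical business rule
    return flagged


source_rows = [{"customer": " ACME ", "hours": "8"},
               {"customer": "Globex", "hours": "-2"}]

edw = [apply_hard_rules(r) for r in source_rows]   # every source row is kept
data_mart = [apply_soft_rules(r) for r in edw]     # interpreted downstream

print(len(edw), "rows kept in the EDW")
print([r["valid_hours"] for r in data_mart])
```

In this arrangement the row with negative hours still reaches the EDW unchanged in meaning, so it remains available for audit; only the downstream mart marks it as a rule break.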


Establishing Auditable Sources

[Diagram: data flow for auditable sources. Labels shown: Source System (OLTP), 2nd Source System, Sync Routines, Data Export, Oper Reports, Staging, EDW Data Warehouse, DW Exports.]

• Secure

• Auditable

• Compliant


DA – Defining Data Errors and Models

• Data admin designs the logical data models used for error handling

• Data admin assesses the implementation of the error handling architecture

• Data admin provides data accountability to the business users

• Data admin defines master data sets, and master metadata for both “good” and “bad” data according to the business rules

[Diagram: ETL load process (reader, transform, writer) across Source System, Staging Area, Data Warehouse, Data Marts, Database, and B.I. Tool, alongside Error Stage**, Error Warehouse**, and Error Marts.]

** Not usually implemented
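The idea of keeping "bad" data in its own error structures rather than discarding it can be sketched as a load step that routes each row either to staging or to an error stage. This is a minimal illustration, not the architecture from the slide: the `load` function, the check names, and the sample rows are hypothetical.

```python
def load(rows, checks):
    """Split rows into a staging list and an error-stage list."""
    staging, error_stage = [], []
    for row in rows:
        failures = [name for name, check in checks.items() if not check(row)]
        if failures:
            # Keep the original row plus the reasons, so nothing is lost.
            error_stage.append({"row": row, "failed": failures})
        else:
            staging.append(row)
    return staging, error_stage


# Hypothetical checks, for illustration only.
checks = {
    "has_account": lambda r: r.get("account_id") is not None,
    "non_negative_hours": lambda r: r.get("hours", 0) >= 0,
}

rows = [
    {"account_id": 1, "hours": 8},
    {"account_id": None, "hours": 4},
    {"account_id": 2, "hours": -3},
]
staging, error_stage = load(rows, checks)
print(len(staging), "good rows;", len(error_stage), "error rows")
```

Recording which check failed alongside the untouched row is what lets the error data be modeled and reported on later.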


DA Example – Classifications of Errors

• Soft-Errors (Business Rule Breaks)

• Data requires parent key

• Negative hours on a time-card were charged

• Customer has no account, but has transactions

• Technical Errors

• Datatype mismatches

• Null/Not Null issues

• Missing data, default data, mis-calculated data

• Hard-Errors

• Database: out of space, bad indexes, roll-back segment

• Network: went down, can’t locate IP

• Machine: CPU bad, RAM bad, disk bad, machine shutting down, out of disk space

[Diagram labels: Business Owns the Error; I.T. Owns the Error]
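The three error classes can be sketched as a lookup that returns both the class and an owner. The slide names two owners but the transcript does not preserve which classes each one covers, so the `OWNER` mapping below (business owns soft errors, I.T. owns technical and hard errors) is an assumption, and the error codes are hypothetical.

```python
from enum import Enum


class ErrorClass(Enum):
    SOFT = "soft error (business rule break)"
    TECHNICAL = "technical error"
    HARD = "hard error"


# Assumed ownership mapping; the slide names the two owners only.
OWNER = {
    ErrorClass.SOFT: "Business",
    ErrorClass.TECHNICAL: "I.T.",
    ErrorClass.HARD: "I.T.",
}

# Hypothetical error codes mapped to the slide's three classes.
CLASSIFICATION = {
    "missing_parent_key": ErrorClass.SOFT,
    "negative_hours": ErrorClass.SOFT,
    "datatype_mismatch": ErrorClass.TECHNICAL,
    "unexpected_null": ErrorClass.TECHNICAL,
    "database_out_of_space": ErrorClass.HARD,
    "network_down": ErrorClass.HARD,
}


def triage(code):
    """Return (class, owner) for a known error code."""
    error_class = CLASSIFICATION[code]
    return error_class, OWNER[error_class]


for code in ("negative_hours", "datatype_mismatch", "network_down"):
    error_class, owner = triage(code)
    print(f"{code}: {error_class.value}, owned by {owner}")
```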


DA: Tracking Errors – KPIs at Work

[Chart: hours (0 to 30) by month (Jan to Dec). Series shown: Expectation, Modeling Errors, Process Errors, MDM Errors.]
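Tracking error hours against an expectation can be sketched as a simple comparison per month and category. The chart's actual values are not recoverable from the transcript, so every number below is invented for illustration, and treating the expectation as a single per-category threshold is an assumption.

```python
# Hypothetical figures, invented for illustration; not read from the chart.
EXPECTATION_HOURS = 10

hours_logged = {
    "Jan": {"modeling": 12, "process": 6, "mdm": 3},
    "Feb": {"modeling": 9, "process": 11, "mdm": 2},
    "Mar": {"modeling": 4, "process": 5, "mdm": 1},
}


def over_expectation(logged, expectation):
    """Return (month, category, hours) entries above the expected hours."""
    return [
        (month, category, hours)
        for month, categories in logged.items()
        for category, hours in categories.items()
        if hours > expectation
    ]


for month, category, hours in over_expectation(hours_logged, EXPECTATION_HOURS):
    print(f"{month}: {category} errors took {hours}h (expected <= {EXPECTATION_HOURS}h)")
```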


Metadata and Data Administration

• Metadata, data lineage, and element definition are all a part of data administration

• How can we stitch the different metadata together?

• Does our metadata repository capture its own versions of metadata?

• Does our enterprise vision allow business users to access and modify metadata that they own?

• Data administrators must be responsible for capturing, architecting, and engineering metadata standards across the organization
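Two of the questions above, whether the repository captures versions of its own metadata and whether business users can modify the metadata they own, can be illustrated with an append-only store. This is a sketch under stated assumptions: the `MetadataRepository` class, its methods, and the rule that the first definer of an element is its owner are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetadataRepository:
    """Append-only store: every edit adds a version, nothing is overwritten."""
    _versions: dict = field(default_factory=dict)

    def define(self, element, definition, owner):
        """Record a new version; only the original owner may edit."""
        history = self._versions.setdefault(element, [])
        if history and history[0]["owner"] != owner:
            raise PermissionError(f"{owner} does not own {element}")
        history.append({
            "version": len(history) + 1,
            "definition": definition,
            "owner": owner,
            "recorded_at": datetime.now(timezone.utc),
        })
        return history[-1]["version"]

    def current(self, element):
        """Return the latest definition of an element."""
        return self._versions[element][-1]["definition"]

    def history(self, element):
        """Return every definition ever recorded, oldest first."""
        return [v["definition"] for v in self._versions[element]]


repo = MetadataRepository()
repo.define("customer", "Any party with an account", owner="sales")
repo.define("customer", "Any party with an active account", owner="sales")
print(repo.current("customer"))
print(len(repo.history("customer")), "versions kept")
```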


Metadata Administration Lifecycle

[Diagram: metadata administration lifecycle, with the following stages]

• Identify New Metadata

• Integrate With Master Metadata Repository

• Edit and Manage Master Metadata (Provide Business Users with Web Interface)

• Stitch Master Metadata Together

• Compare Master Metadata With Business Process And Objectives

• Export Master Metadata or Deploy via SOA With Master Data Set

Derived from Meta Integration Metadata Lifecycle


Monitoring DA Efforts

Establish KPIs for Each of the Following Areas

• Focus on the Big Stuff

• KPI: peer review for readability and architecture

• KPI: peer review and sign-off for metadata definitions

• KPI: project plan with named deliverables and phases

• Make process rules, not artifact rules

• KPI: measure the effectiveness of edits/updates to the existing standards

• KPI: track the amount of time spent by the DA on the edits and changes to the rules

• KPI: assess end-user access (queries) against the metadata, documentation, and models

• Consult others for added value

• KPI: track the effectiveness of the peer reviews by tying the above metrics together

• KPI: assign specific recurring “standing” meetings, assign roll-call and notes with action items

• Map the World

• KPI: track changes, amount of time spent on data models

• KPI: measure level of effort based on complexity, resulting from impact analysis studies


Case Study for DA Results

After Implementing DA Best Practices

Government Manufacturing Firm

• Three people, 6 months from start to finish on EDW

• Passed government and financial audits in the first 3 weeks of production release

• Saved the company $15M, and $45M in the first 3 months

• Reduced cycle time in manufacturing by 3 months after showing specific data lineage

• Changed IT from a cost center to a profit center

• Demonstrated a 15-year-old billing error on the operational reports


Conclusions and Q&A


Revealing the DA Best Practices (Recap)

• Review/construct logical data model to meet business needs

• Establish master data management strategy

• Audit data on a regular basis, ensure KPIs and metrics are met

• Meet with end-users to synchronize metadata, logical data models, and business process flows

• Maintain standards and end-user access paths

• Build metrics and SLAs to monitor the DA process

• Engineer and deploy a metadata management strategy


The Experts Say…

“The Data Vault is the optimal choice for modeling the EDW in the DW 2.0 framework.” Bill Inmon

“The Data Vault is foundationally strong and exceptionally scalable architecture.” Stephen Brobst

“The Data Vault is a technique which some industry experts have predicted may spark a revolution as the next big thing in data modeling for enterprise warehousing....” Doug Laney


More Notables…

“[The Data Vault] captures a practical body of knowledge for data warehouse development which both agile and traditional practitioners will benefit from..” Scott Ambler


Where To Learn More

• The Technical Modeling Book: http://LearnDataVault.com

• The Discussion Forums & events: http://LinkedIn.com – Data Vault Discussions

• Contact me: http://DanLinstedt.com (web), [email protected] (email)

• Worldwide User Group (Free): http://dvusergroup.com


Thank you

Contact us today:

Dan Linstedt

[email protected]

http://LearnDataVault.com