Information Governance and Data Discovery Vincent McBurney IM Practice Lead Focus Strategies and Solutions [email protected] www.focus.co DQ Asia Pacific March, 2011 Sydney, Australia
Transcript
Page 1: Focus

Information Governance and

Data Discovery

Vincent McBurney

IM Practice Lead

Focus Strategies and Solutions

[email protected]

www.focus.co

DQ Asia Pacific

March, 2011 Sydney, Australia

Page 2: Focus

Data governance is a set of processes that ensures that important data assets are formally managed throughout the enterprise.

Data governance helps control the cost, risk and time of data-driven IT projects.

Page 3: Focus

IBM Information Governance Maturity Model

• The categories of effective data governance

Page 4: Focus

Capability Maturity Model Integration (CMMI)

• Based on the Capability Maturity Model (CMM) and applied to each category of data governance.

Graphic sourced from Carnegie Mellon Software Engineering Institute

Page 5: Focus

What Maturity do you need?

• Recommended Maturity Level for different types of IT projects.

Page 6: Focus

Recommended Online Community

Page 7: Focus

• Why does a simple enhancement request take so long?

• Why are our estimates always wrong?

• Why does everyone take so long to do things?

Page 8: Focus

The Victim Statements – the Business

• “They just spent so long on meetings and documentation and didn’t build anything!”

• “We could have built this faster ourselves.”

• “When we got to UAT Testing there were bugs and it had to be fixed over and over again.”

• “We spend all this money on IT and what do we get for it?”

Page 9: Focus

Obvious Suspect – IT Team

• “Requirements and rules kept changing right up through testing.”

• “You thought that was bad, wait until you see phase 2.”

• “No one told us there were three different definitions for client status.”

• “It would help if the business knew what they wanted.”

Page 10: Focus

Obvious Scapegoat – the New Guy

• “I don’t know where the application documentation is.”

• “I need to update my resume.”

• “I don’t even know who was managing the project.”

• “Turned out the Functional Spec I was using was out of date by two years.”

• “There are three different definitions for client status?”

• “Wait, which client status are we talking about?”

• “I didn’t do a proper handover as the guy I replaced was always out to lunch.”

Page 11: Focus

The Coroner’s Report

• The team did not have the information and the context for the change.

Page 12: Focus

The Information Server Approach

• Metadata Workbench and Business Glossary provide context

Farnaz Erfan
Metadata Workbench lineage also includes external data transformations and custom programs such as procedures or stored procedures. You can document these processes as part of the entire information flow, so the lineage is not just Information Server specific. Hence, I would say that we can put a star next to "procedures" as well.
Farnaz Erfan
Business Glossary can also link to the technical assets (tables, columns, cubes, models, etc.), as I am sure you know, but from a business term entry point. Was there a reason you did not mark those as Business Glossary features?
Page 13: Focus

Define Business Glossary in the Unified Process

• Define Business Problem

• Obtain Executive Sponsorship

• Conduct Maturity Assessment

• Build Roadmap

• Establish Organisation Blueprint

• Build Data Dictionary

• Understand Data

• Create Metadata Repository

• Define Metrics

• Appoint Data Stewards

• Manage Data Quality

• Implement Master Data Management

• Create Specialised Centers of Excellence

• Manage Security & Privacy

• Manage Life-cycle

• Measure Results

= Enable through Process

= Enable through Technology

Page 14: Focus

• The steps to a successful Business Glossary.

Page 15: Focus

Glossary in a Project

Create your Glossary during the Understand and Define stage.

Use and refine your Glossary during subsequent phases.

Page 16: Focus

Identify Subject Areas

• If you are using a Business Glossary to support a Data Warehouse then start with the high level conceptual data model.

Diagram: high level conceptual data model. Subject areas and entities shown: Learning; Teaching; Development; Management; Outcome; Grant; Attempt; Recruitment; Admission; Publication; Unit; Unit Offering; Completion; Staff; Centre; Location; Research; Policy; Commercialisation; Risk, Quality & Evaluation; Course; Student; Award; Unit Delivery; Survey; Alumni; Organisation; Faculty; School; Planning; Health & Safety; Training; Accounts; Performance.

Page 17: Focus

Start with a Formal Vocabulary

• Focus helped create a Glossary for a Data Collection at NCVER.

• Clearly defined Data Dictionary with elements and rules.

Page 18: Focus

Define the Lifecycle of Terms

• Work out the Data Stewardship Policies

– How to use the Term status

– Identify review groups

– Collaborate via email

– Track changes over time

– Report to track progress of reviews

Term Added – “Candidate”

Small Team Review – “Accepted”

Large Team Review – “Standard”

Enterprise Term
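
The term lifecycle on this slide can be read as a small state machine. The sketch below is one possible way to model it and is not how Business Glossary stores status internally. The status names come from the slide; the `Term` class, the `promote` method and the `review_progress` report are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class TermStatus(Enum):
    CANDIDATE = "Candidate"  # term has just been added
    ACCEPTED = "Accepted"    # passed the small team review
    STANDARD = "Standard"    # passed the large team review


# Allowed promotions, in the order shown on the slide.
NEXT_STATUS = {
    TermStatus.CANDIDATE: TermStatus.ACCEPTED,
    TermStatus.ACCEPTED: TermStatus.STANDARD,
}


@dataclass
class Term:
    name: str
    status: TermStatus = TermStatus.CANDIDATE
    history: list = field(default_factory=list)  # (old, new, reviewer) tuples

    def promote(self, reviewer: str) -> None:
        """Move the term one step along the lifecycle and record who approved it."""
        if self.status not in NEXT_STATUS:
            raise ValueError(f"{self.name} is already {self.status.value}")
        new_status = NEXT_STATUS[self.status]
        self.history.append((self.status.value, new_status.value, reviewer))
        self.status = new_status


def review_progress(terms):
    """Count terms per status, the kind of figure a review progress report needs."""
    counts = {status.value: 0 for status in TermStatus}
    for term in terms:
        counts[term.status.value] += 1
    return counts
```

Keeping the change history on the term itself covers the "track changes over time" policy, and the per-status counts give a simple basis for the review progress report.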

Page 19: Focus

Basic Glossary Entry

• Using Glossary just for Definitions

Page 20: Focus

Adding Synonyms and Related Terms

• Synonyms track different names for the term across the Enterprise.

• Related Terms are used to define validation business rules.

Page 21: Focus

Assigning Physical Assets

• External Assets – Given Name is linked to external HTTP links such as documents in SharePoint or intranet pages.

• Metadata Assets – Given Name has been explicitly linked to a FIRST_NAME column in the Warehouse.

Screenshot callouts: a linked Word document, linked DB columns, change history.
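
To make the parts of a glossary entry concrete, here is a minimal sketch of a term that carries a definition, synonyms, related terms, external assets and metadata assets. It is an illustrative data structure only and does not reflect the Business Glossary data model or API. The `GlossaryTerm` class, the `build_name_index` helper, the `CUSTOMER` table name, the example link and the definition wording are all assumptions; only Given Name and the `FIRST_NAME` column come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    name: str
    definition: str
    synonyms: set = field(default_factory=set)           # other names used across the enterprise
    related_terms: set = field(default_factory=set)      # terms referenced by business rules
    external_assets: list = field(default_factory=list)  # HTTP links to documents or intranet pages
    metadata_assets: list = field(default_factory=list)  # (table, column) pairs in the warehouse


def build_name_index(terms):
    """Map every term name and synonym (case-insensitive) to its term."""
    index = {}
    for term in terms:
        for label in {term.name, *term.synonyms}:
            index[label.lower()] = term
    return index


given_name = GlossaryTerm(
    name="Given Name",
    definition="The first name of a person.",  # illustrative wording
    synonyms={"First Name", "Forename"},
    related_terms={"Family Name"},
    external_assets=["http://intranet.example/naming-standard.doc"],  # hypothetical link
    metadata_assets=[("CUSTOMER", "FIRST_NAME")],  # table name is hypothetical
)

index = build_name_index([given_name])
print(index["first name"].name)  # Given Name
```

Indexing synonyms alongside the preferred name lets someone who searches for a local name, such as "First Name", land on the single agreed definition.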

Page 22: Focus

Custom Browse and Data Entry Forms

• Using the Glossary API to write our own authoring forms

Screenshot callouts: better date entry, different column order, better validation.

Page 23: Focus

Business Term Linkage

• Context is everything.

Diagram: a Business Term linked to its System of Record DB, DW Table, Synonyms, Homonyms and Related Terms, Cognos, Data Model and Metadata Workbench.

Page 24: Focus

Get a Fast Start with Imports

• The Quality of Business Glossary imports has a major impact on the success of the implementation.

– Excel imports using templates.

– Build, copy paste and prepare content quickly.

– Email content around for updates and review.

• 300-400 terms in the first three weeks.
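
Because import quality matters so much, it can help to check a sheet before loading it. The sketch below assumes the spreadsheet has been saved as CSV and checks it for blank required fields and duplicate term names. The column names `Name`, `Definition` and `Status` are hypothetical and are not the actual Business Glossary import template; `check_import_rows` is an illustrative helper.

```python
import csv
import io

REQUIRED = ("Name", "Definition", "Status")  # hypothetical template columns


def check_import_rows(csv_text):
    """Return (clean_rows, problems) for a glossary import sheet saved as CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen, clean, problems = set(), [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        name = (row.get("Name") or "").strip()
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            problems.append((line_no, f"missing {', '.join(missing)}"))
        elif name.lower() in seen:
            problems.append((line_no, f"duplicate term {name}"))
        else:
            seen.add(name.lower())
            clean.append(row)
    return clean, problems


sheet = (
    "Name,Definition,Status\n"
    "Given Name,The first name of a person,Candidate\n"
    "Client Status,,Candidate\n"
    "given name,Duplicate entry,Candidate\n"
)
clean, problems = check_import_rows(sheet)
```

The problem list can be emailed back to the contributors along with the sheet, which fits the review-by-email approach described above.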

Page 25: Focus

• Profiling

• Primary and Foreign Key Discovery

• Transformation Discovery

• Unified Schema Build

Page 26: Focus

Data Profiling
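
As a rough illustration of what column profiling measures, the sketch below computes row, null and distinct counts, the most frequent values and value lengths for one column. It is a generic example, not the InfoSphere Discovery implementation; `profile_column` and the sample values are hypothetical.

```python
from collections import Counter


def profile_column(values):
    """Return basic profiling statistics for one column of raw values."""
    non_null = [v for v in values if v is not None and str(v).strip() != ""]
    frequencies = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(frequencies),
        "top_values": frequencies.most_common(3),
        "min_length": min((len(str(v)) for v in non_null), default=0),
        "max_length": max((len(str(v)) for v in non_null), default=0),
    }


# Hypothetical client status values pulled from a source system.
stats = profile_column(["Active", "active", "Closed", None, "", "Active"])
```

Here the frequency list shows both "Active" and "active", the kind of inconsistency that points to more than one definition of client status.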

Page 27: Focus

Primary and Foreign Key Discovery
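
A simple way to picture key discovery: a primary key candidate is a column whose values are all present and unique, and a foreign key candidate is a column whose values all appear in another table's key. The sketch below applies those two checks to in-memory rows. It is a simplified illustration, not the algorithm InfoSphere Discovery uses, and the function names and sample tables are hypothetical.

```python
def candidate_primary_keys(rows, columns):
    """Columns whose values are all non-null and unique across the rows."""
    keys = []
    for col in columns:
        values = [row[col] for row in rows]
        if all(v is not None for v in values) and len(set(values)) == len(values):
            keys.append(col)
    return keys


def candidate_foreign_keys(child_rows, child_columns, parent_rows, parent_key):
    """Child columns whose non-null values all appear in the parent key column."""
    parent_values = {row[parent_key] for row in parent_rows}
    matches = []
    for col in child_columns:
        values = {row[col] for row in child_rows if row[col] is not None}
        if values and values <= parent_values:
            matches.append(col)
    return matches


# Hypothetical sample tables.
customers = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 3, "name": "Ann"},
]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 3},
    {"order_id": 12, "cust_id": None},
]

pk = candidate_primary_keys(customers, ["cust_id", "name"])
fk = candidate_foreign_keys(orders, ["order_id", "cust_id"], customers, "cust_id")
```

On real data these checks only propose candidates; a data steward still needs to confirm that a discovered relationship is meaningful and not a coincidence of small value ranges.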

Page 28: Focus

Unified Schema Build Example

• Three different source systems, three tables each.

Page 29: Focus

Overlap Analysis

• Find overlapping columns and data using profiling results.
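
One way to think about overlap analysis is as a comparison of the distinct values in two columns. The sketch below reports how many values are shared and what share of each column they represent, then lists column pairs above a threshold. It is a generic illustration under assumed names (`column_overlap`, `overlapping_columns`, the sample tables and the threshold) and not the InfoSphere Discovery algorithm.

```python
def column_overlap(values_a, values_b):
    """Share of each column's distinct values that also appear in the other."""
    a, b = set(values_a), set(values_b)
    common = a & b
    return {
        "common": len(common),
        "a_in_b": len(common) / len(a) if a else 0.0,
        "b_in_a": len(common) / len(b) if b else 0.0,
    }


def overlapping_columns(table_a, table_b, threshold=0.8):
    """Pairs of columns where at least `threshold` of one side's values are shared."""
    pairs = []
    for name_a, values_a in table_a.items():
        for name_b, values_b in table_b.items():
            stats = column_overlap(values_a, values_b)
            if max(stats["a_in_b"], stats["b_in_a"]) >= threshold:
                pairs.append((name_a, name_b, stats["common"]))
    return pairs


# Hypothetical columns from two source systems.
crm = {"CLIENT_ID": [1, 2, 3, 4], "STATUS": ["A", "C"]}
billing = {"CUST_NO": [2, 3, 4, 5], "STATE": ["NSW", "VIC"]}

pairs = overlapping_columns(crm, billing, threshold=0.7)
```

Columns with different physical names, such as `CLIENT_ID` and `CUST_NO` here, surface as overlap candidates because the comparison looks at the data and ignores the names.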

Page 30: Focus

Unified Schema Build

Page 31: Focus

Unified Column Analysis

• See your data quality before you move the data.

Page 32: Focus

InfoSphere Discovery

• Data Warehouse

– Data Inventory: Profiling and Primary/Foreign Keys

– Design and Prototype: Schema Build and Overlap Profiling

– Load: Mapping and Transformation Discovery

• Application Consolidation/Migration

– Data Inventory

– Rule Discovery: document old rules, define new rules

– Source to Target: map from old to new

• MDM

– Data Inventory: Overlapping Master Data and Conformance

– Design and Prototype: build a unified MDM registry

Page 33: Focus

Unified Metadata Approach

Diagram: Business Glossary linked to Discovery, FastTrack, Information Analyzer, Cognos, Data Model, Metadata Workbench and Blueprint Director.

• Discovery: Profiling, Values, Frequencies, Overlap and Links, Transform Discovery. Assign Terms. Create FastTrack Maps.

• Audit: Define valid reference data values and show them in Glossary. Show Profiling Stats in Glossary.

• Mapping: Create source to target mappings with columns and terms. Map by physical names or business names. Automap by business names.

• Cognos: Turn Framework Manager metadata into a Business Glossary. Popup field help. Link KPI definitions to related terms. Find in Cognos.

• Data Model: Turn a Glossary into a Logical Model. Turn a Logical Model into a Glossary.

• Metadata Workbench: Link Terms to Assets in bulk. Link Stewards in bulk. Report on changed and stale terms.
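
The idea behind automapping by business names can be sketched as follows: once source and target columns are assigned glossary terms, columns that share a term become mapping candidates even when their physical names differ. This is a simplified illustration, not FastTrack's actual behaviour; `automap_by_term`, the column names and the term assignments are hypothetical.

```python
def automap_by_term(source_terms, target_terms):
    """Pair source and target columns that are assigned the same business term.

    Both arguments map a physical column name to its assigned glossary term.
    Returns (mappings, unmapped_targets).
    """
    source_by_term = {}
    for column, term in source_terms.items():
        source_by_term.setdefault(term.lower(), []).append(column)

    mappings, unmapped = [], []
    for column, term in target_terms.items():
        candidates = source_by_term.get(term.lower(), [])
        if len(candidates) == 1:
            mappings.append((candidates[0], column, term))
        else:
            # No source, or several competing sources: leave for a person to decide.
            unmapped.append(column)
    return mappings, unmapped


# Hypothetical term assignments for source and warehouse columns.
source = {
    "CRM.FNAME": "Given Name",
    "CRM.SNAME": "Family Name",
    "CRM.STAT": "Client Status",
    "BILL.STATUS": "Client Status",
}
target = {
    "DW.FIRST_NAME": "Given Name",
    "DW.LAST_NAME": "Family Name",
    "DW.CLIENT_STATUS": "Client Status",
    "DW.DOB": "Date of Birth",
}

mappings, unmapped = automap_by_term(source, target)
```

The sketch deliberately leaves ambiguous targets unmapped: two source columns both tagged Client Status is exactly the situation where a steward should pick the system of record.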

Page 34: Focus

Getting Started with Data Governance

Six things everyone can do today:

1. Define your desired outcomes from Data Governance

2. Be clear about the problems you are solving

3. Define a realistic organisational structure for your environment

4. Focus on a DG pilot program that can deliver outcomes with business benefits

5. Take advantage of best practices and models from organisations like the Data Governance Council and MIKE 2.0

6. Be real with organisational challenges, funding requirements, scope and duration of deliverables