17.10.19 · Modernising Data Architecture for AI · Jon Teo, Solution Specialist, Informatica

Modernising Data Architecture for AI · Challenges for Data Scientists and Analytics teams: •Large numbers of systems & data sources. •Hybrid data sources, both in-house and in-cloud.

Aug 02, 2020


17.10.19

Modernising Data Architecture for AI

Jon Teo

Solution Specialist, Informatica


2 © Informatica. Proprietary and Confidential.

Objectives:

1. Importance of Data in Healthcare AI

2. Data Challenges in AI & Analytics

3. Modern Data Management Architecture

Data Needs AI

AI Needs Data


Some Working Definitions Before We Begin

digitalready.co


Health AI State of Play

1. Percentage (%) of healthcare professionals using Digital Health technology in their practice

2. Percentage (%) of healthcare professionals comfortable using AI for: (chart not reproduced)

n = 3194


Myriad Healthcare Applications for AI

• Health Consumer
• Institutions & Providers
• Research & Discovery


Some Examples of AI Applications in:

Population Management

Medicare Beneficiaries Leakage

Personalised Medicine

Customised Radiotherapy

Clinical Research

Clinical Trial Candidate Screening

PED Admissions Recruitment


The “Big Data” Vision for Precision Healthcare

[Pan-omics + SDoH + Global Evidence Base] + Deep Learning = Precision Healthcare

The Tapestry of Potentially High-Value Information Sources That May be Linked to an Individual for Use in Health Care. JAMA. 2014

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again, 1st Ed. Topol, Eric. 2018


Data Challenges in Health AI & Analytics


UBER STRATA SJO 2017

IDEATION EXPLORATION PREPARATION ANALYSIS BUILDING SHARING

SATISFACTION

Common Data Challenges in Analytics and AI Programmes

1. Data Discovery & Access
2. Data Quality & Preparation
3. Protection & Permission
4. Explainable AI


1. Discovery & Access to Data

Challenges for Data Scientists and Analytics teams:

• Large numbers of systems & data sources.

• Hybrid data sources, both in-house and in-cloud.

• “Data Swamps” in existing repositories.

• Reliance on IT to process access to data in a timely manner.

• Addition of new data sources requires lengthy system integration projects.


2. Data Preparation & Data Quality Needs

Developing AI capabilities requires both quantity and quality in data

Image/Pattern Recognition Speech, Voice, NLP, Free-Text

Relationship Discovery

• Source of Data

• Image Pre-processing

• Tagging

• Ontology Management

• Regional Localisation

• Entity Extraction

• Intelligent Matching

• Ontology Management

• Graph structuring


3. Data Protection & Permissioning

Human Biomedical Research Act

Increased regulations, and expected accountability for Data Use

Top reasons individuals would use digital health technology: Assurance health data is secure

Sources of Data Protection Friction:

• Data Protection & Privacy Impact Assessments
• Approval processes
• Org behaviors & processes
• Appropriate data protection mechanisms

csoonline.com


4. The need for Explainable AI

Applying AI in healthcare settings will likely face challenges if it is considered a “Black Box”:

1. High-impact / high-cost decisions
2. When unexpected recommendations are made
3. For post-hoc reviews of performance & incidents encountered
4. Concerns about data-driven bias
5. Detecting ML ‘cheating’

Pacemaker

• Regulatory Approval
• User Adoption
• Patient Outcomes
• Programme Trust


Modern Data Management Architecture


3 Major Enablers for Modern Data Management

• Flexible Data Platform
• Collaborative Data Governance
• AI-Assisted Data Management


Flexible Data Platform Architecture

1. Modular
2. Scalable
3. Logical Data Warehouse & Data Lake Architecture
4. Metadata-Driven
5. Enterprise Data Management Capabilities

• Data Democratization
• Real-Time Operational Analytics
• Pervasive Analytics & AI
• IoT, Machine Data, & Streaming Analytics


Platform Approach for Flexible Data Operations

Data Management Platform

Business Operations: Business Process Management, Next-Best Recommendations, Digital Commerce, Automation

Analytics: Predictive Analytics, Cognitive Analytics, Prescriptive Analytics, Descriptive Analytics, Self-service Analytics, Streaming Analytics

Outbound Touch Points: Communities, Social, Web, Mobile / Text, Mobile Apps, Email

Touch Point Routing

Inbound Touch Points: Email, Web, IoT, Mobile, Social

Assisted Interaction: Professional Services, Clinical support, Admin Staff, Customer Service Rep

Unassisted Interaction: Knowledge Base, Forums, Downloads

Health Consumer Data

Databases, Application Servers, Documents, Mainframe, Operational Data, External Data, Partner Data, SaaS, Big Data, Machine Data, IoT, Cloud

AI: Clustering Algorithms, Learning Algorithms, Natural Language Processing, Recommendations, Categorization, Classification

Architect, Citizen Integrators, IT Specialist, Data Scientist, Data Analyst, Application Developers


Modular Platform for Flexible Data Operations

AI: Clustering Algorithms, Learning Algorithms, Natural Language Processing, Recommendations, Categorization, Classification

Inbound Touch Points: Email, Web, IoT, Mobile, Social

Assisted Interaction: Professional Services, Customer support, Sales Rep, Customer Service Rep

Unassisted Interaction: Knowledge Base, Forums, Downloads

Consumer Interaction Data

Architect, Citizen Integrators, IT Specialist, Data Scientist, Data Analyst, Application Developers

Business Operations: Business Process Management, Next-Best Recommendations, Digital Commerce, Automation

Analytics: Predictive Analytics, Cognitive Analytics, Prescriptive Analytics, Descriptive Analytics, Self-service Analytics, Streaming Analytics

Outbound Touch Points: Communities, Social, Web, Mobile / Text, Mobile Apps, Email

Touch Point Routing

Databases, Application Servers, Documents, Mainframe, Operational Data, External Data, Partner Data, SaaS, Big Data, Machine Data, IoT, Cloud

Integration Platform

Data Management Platform
• Deployment (cloud, on-premises)
• Connectivity
• Monitor and Manage
• Multi-latency Ingestion, API & Integration Patterns
• Metadata Foundation
• Master Data Management
• 360 Insights
• Data Quality, Data Governance & Data Privacy
• Data Discovery & Cataloging
• AI-enabled Automation


Collaborative Data Governance

Unlocking Value of Data

Traditional Data Governance
• Manual Effort
• Policy – Implementation Gap
• Top-Down & Siloed

Compliance and Risk

• Collaborative Effort: Democratised DG with proper stakeholder interest alignment
• Integrated View: Connecting data & business via multi-dimensional viewpoints
• Automated & Scalable: Feasible to manage ‘4V’ explosion of data


Evolution of Data Governance Practice

“Governance without Implementation is just Documentation”

Policy, Direction & Definition
• Risk Management
• Compliance & Regulatory
• Data Ownership
• Data & Digital Strategy

Data Universe: Applications, Data Stores, Cloud, EUC

Operational Implementation Controls and Measurement
• Master & Reference Data Management
• Democratised Data Access
• Data Lifecycle Management

• Data Protection
• Enterprise Data Catalog
• Data Health Management
• Privacy & Security Analytics

CPO / CDO, Governance Office, Data Owner, Data Steward, Data Users, Data Architect, IT Team

“Data Governance is a Team Sport”


Evolution of Data Governance Practice

Governance Outcomes achieved in a cohesive, efficient manner

Policy, Direction & Definition
• Risk Management
• Compliance & Regulatory
• Data Ownership
• Data & Digital Strategy

Operational Implementation Controls and Measurement
• Master & Reference Data Management
• Democratised Data Access
• Data Lifecycle Management

Data Universe: Applications, Data Stores, Cloud, EUC

AI-Augmented Governance Community

• Enterprise Data Catalog
• Data Health Management
• Privacy & Security Analytics
• Data Protection

Enterprise Data Visibility: Business Definitions, Physical Discovery
Sustainable Data Quality: Detect & Design, Implement & Measure
Privacy & Compliance: Define Policy & Protections, Enforce & Report


Example of DG Capability - Data Lineage

Business Data Logical View
Policy Owners
Physical Data Resources
Data Steward
Data Engineers
Document / Enforce
Data Analysts
Validate / Discover


Data Governance helps “Context-explainable” AI

towardsdatascience.com

A working approach to improve AI explainability:

1. Consider the whole AI development chain. How much do we trust all the components that went into developing the model?
2. Plan for “Explainability by Design” throughout the development process.


Collaborative Practices for “Context-Validated” AI

For AI Assets: ‘Data + Model + Context’.

• Provide traceability of the AI asset throughout its lifecycle

• Leverage multiple stakeholders in collaborative production of AI asset to provide end-to-end traceability & oversight

Example of Context-Validated AI Development:

1. Model – Logical representation of how data is processed to produce prediction, may include processing types, staging, weights, etc.

2. Model Code – Algorithm that processes data consistent with model to produce prediction

3. Data – Inputs used to produce the prediction

4. KPI – Quantification of predictive outcome
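The four parts listed above, together with the traceability the slide asks for, could be kept in one record. The following is a minimal sketch under stated assumptions: the class `AIAsset`, its field names, the `record` method and all sample values (including the `git:abc123` reference) are hypothetical and do not come from the presentation or from any Informatica product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AIAsset:
    """Hypothetical record bundling Model + Model Code + Data + KPI."""
    model: str        # logical description of how data is processed
    model_code: str   # reference to the algorithm, e.g. a commit id
    data: list[str]   # identifiers of the input datasets
    kpi: str          # what the prediction quantifies
    history: list[dict] = field(default_factory=list)

    def record(self, actor: str, role: str, action: str) -> None:
        """Append one lifecycle event so the asset stays traceable."""
        self.history.append({
            "actor": actor,
            "role": role,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def roles_involved(self) -> set[str]:
        """Stakeholder roles that have touched this asset so far."""
        return {event["role"] for event in self.history}


# Illustrative values only.
asset = AIAsset(
    model="gradient-boosted trees over admission features",
    model_code="git:abc123",
    data=["admissions_2018", "lab_results_2018"],
    kpi="30-day readmission risk",
)
asset.record("alice", "Data Steward", "approved training data")
asset.record("bob", "Data Scientist", "trained model")
```

Keeping the event history on the asset itself is one simple way to show who contributed at each stage; a real governance platform would more likely store this in a catalogue or lineage service.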


Simple AI development pipeline

Data Scientist, LoB Executive, LoB Executive, Citizen Analyst

Intuition
Report
KPI (Predictive) ?
Model + Data ?


Collaboration is Key to Explainable AI

Producer, Consumer
Technical, Business

Data Scientist, Data Engineer, Data Steward, LoB Executive, Citizen Analyst

Intuition
KPI (Predictive)
Model Code (Test)
Data (Training)
Logical Model
Data Dictionary (Production)
Data Quality & Lineage (Production)
Report
LoB Executive


AI-assisted Data Management

Automation
• Discovery
• Next-best Actions
• Platform Scaling

Engagement
• Intuitive UX
• Natural Language DQ
• Increase collaboration

Insight
• Patterns
• Sense-making
• Platform Management

AI
Explosion in Volume
Data Controls
New Data Types & Sources


Example 1: Intelligent Structure Discovery

AI

Examples: clickstreams, log files, IoT data, txt, csv, Excel, PDF, Word, etc.

• AI can automatically discover the structure in the data.
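As a rough illustration of what discovering structure means for a delimited file, the sketch below guesses the delimiter and a type for each column from a small sample. It is a rule-based toy built on Python's standard `csv.Sniffer`, not the AI technique the slide refers to. The function names, the sample data, and the assumption that the first row is a header are all hypothetical.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Return the narrowest type name that fits every value in a column."""
    def fits(parse):
        try:
            for v in values:
                parse(v)
            return True
        except ValueError:
            return False

    if fits(int):
        return "int"
    if fits(float):
        return "float"
    if fits(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    return "text"


def discover_structure(sample):
    """Guess the delimiter, then infer a type for each column."""
    dialect = csv.Sniffer().sniff(sample, delimiters=",;|\t")
    rows = list(csv.reader(io.StringIO(sample), dialect))
    header, body = rows[0], rows[1:]   # assumes the first row is a header
    columns = list(zip(*body))         # transpose rows into columns
    return {name: infer_type(col) for name, col in zip(header, columns)}


# Illustrative pipe-delimited sample.
sample = (
    "id|device|reading|seen\n"
    "1|pump-a|0.82|2019-01-01\n"
    "2|pump-b|1.40|2019-01-02\n"
)
schema = discover_structure(sample)
```

A production tool would have to cope with nested, semi-structured and binary formats such as PDF or Word, which this sketch does not attempt.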


Example 2: Discover & Catalogue Data Entities

“Real-world” Data Catalogue

Data Problem

• Relating physical data to Business Glossary entities is laborious, confusing and not sustainable.
• E.g. address and customer details may be normalized, column names may be cryptic, etc.

AI for Data Cataloguing: like photo tagging for data

• Unsupervised learning techniques to cluster & classify similar data types.
• Learns associations of user-tagged data types to tag similar concepts across the Enterprise.
• Learns concept hierarchies to derive composite business entities across the Enterprise.
• Semantic search of Enterprise catalogue with AI-led recommendations
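The idea of carrying a user-applied tag over to similar columns can be illustrated with a much simpler stand-in than the learning techniques listed above. This sketch reduces each value to its character shape and suggests a tag when an untagged column has the same dominant shape as a tagged one. The column names, sample values and the `propagate_tags` helper are hypothetical.

```python
from collections import Counter


def signature(value):
    """Collapse a value to its shape: letters become A, digits become 9."""
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch for ch in value
    )


def profile(values):
    """Most common shape among a column's sample values."""
    return Counter(signature(v) for v in values).most_common(1)[0][0]


def propagate_tags(columns, tagged):
    """Suggest a user-applied tag for untagged columns with the same profile."""
    by_profile = {profile(columns[name]): tag for name, tag in tagged.items()}
    return {
        name: by_profile.get(profile(vals))
        for name, vals in columns.items()
        if name not in tagged
    }


# Illustrative columns: one tagged by a user, two with cryptic names.
columns = {
    "crm.cust_phone": ["6512-3456", "6598-7654"],
    "erp.col_17": ["6234-5678", "6345-6789"],
    "erp.col_18": ["SG", "MY"],
}
suggestions = propagate_tags(columns, {"crm.cust_phone": "Phone Number"})
```

Exact shape matching is deliberately crude; it only shows the flow from one human tag to suggestions elsewhere, which a steward would still need to confirm.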


Example 3: Handling Data Drift

Data Problem

• Data sources and resources can change ‘unannounced’. Traditional data mapping is brittle.
• Data drift can happen for formats, structure or meaning.
• Same semantics, format change: 01/01/2019 and 01-01-2019 and Jan-01-2019
• Structural changes within a file: some records contain 10 fields, others contain 8

AI for Data Drift

• Runtime processing can gracefully overcome noise and changes in incoming data.
• Unexpected data can be captured and processed.

Original Log vs. New version Log:

• New fields that are not in the model are mapped to unassigned ports
• New date format is handled correctly
• Added spaces are handled correctly
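The three behaviours shown for the log example (extra fields kept aside, a new date format accepted, added spaces tolerated) can be sketched with plain rules. This is a hand-written illustration, not the AI-driven approach the slide describes. The field names in `EXPECTED`, the helper names and the sample records are hypothetical, and the numeric dates are assumed to be month-first.

```python
from datetime import date, datetime

# Assumption: numeric dates are month-first, matching "Jan-01-2019".
DATE_FORMATS = ("%m/%d/%Y", "%m-%d-%Y", "%b-%d-%Y")
EXPECTED = ("timestamp", "host", "status")  # hypothetical field model


def parse_date(text: str) -> date:
    """Try each known format in turn; surrounding spaces are ignored."""
    text = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {text!r}")


def read_record(fields):
    """Map known fields by position; keep any extras instead of failing."""
    known = dict(zip(EXPECTED, (f.strip() for f in fields)))
    known["timestamp"] = parse_date(known["timestamp"])
    unassigned = [f.strip() for f in fields[len(EXPECTED):]]
    return known, unassigned


# An original-style record and a drifted one with a new date format,
# added spaces and two extra fields.
old = read_record(["01/01/2019", "web-1", "200"])
new = read_record([" Jan-01-2019 ", "web-1", "200", "eu-west", "12ms"])
```

Both records yield the same timestamp, and the two extra values on the drifted record are returned separately rather than rejected, loosely mirroring the "unassigned ports" idea.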


Many other AI applications for Data Management

Problem Definition, Explore Data, Prepare Data, Build & Test Model, Deploy Model, Output Results (& Refine Model)

Data Discovery & Access
• Data Relationship Inference
• Business Term Associations
• Dataset Similarity
• Entity Extraction
• Data Domain Inference
• Column Similarity

Data Quality & Data Preparation
• Business Rules Translation
• Entity Matching
• Business Rule Associations
• Mass Data Correction
• Natural Language Description of Code
• Schema Inference

Protection & Permission
• Data Anomaly Detection
• Self Secure
• Operational Anomaly Detection
• Cost of Data Breach

Data Pipeline Management
• Self Healing Processing
• Self Tuning Processing
• Smart Data Visualization
• Schedule Optimization

Security Analyst, DPO
Data Scientist, Data Steward
Data Scientist, Data Steward
Data Engineer
Data Engineer


Putting it All Together

Modernised Data Management powers future Analytics & AI development

Problem Definition, Explore Data, Prepare Data, Build & Test Model, Deploy Model, Output Results (& Refine Model)

• Flexible Data Platform Architecture
• Evolved Data Governance Practices
• AI/ML Enablers

-80% reduction in data quality issues
Retailer, Australia

-50% workload for data stewards
-3 man-months of discovery effort
Health Provider, USA
Distributor, USA


The Intelligent Data Platform

PRODUCTS: DATA INTEGRATION, iPaaS, BIG DATA MANAGEMENT, ENTERPRISE DATA CATALOG, DATA QUALITY & GOVERNANCE, MASTER DATA MANAGEMENT, DATA SECURITY

SOLUTIONS: PRODUCT 360, SUPPLIER 360, CUSTOMER 360, REFERENCE 360, SECURE@SOURCE, ENTERPRISE DATA PREPARATION, ENTERPRISE DATA GOVERNANCE, CUSTOMER 360 INSIGHTS

MULTI-CLOUD, REAL TIME/STREAMING, BIG DATA, TRADITIONAL

MONITOR AND MANAGE
DATA ENGINE
CONNECTIVITY


Thank You

Jon Teo

[email protected]


Enhance Patient Lives with Data-Driven Decisions

• A Sanofi company, dedicated to transforming the lives of people with hemophilia and other rare blood disorders through world-class research, development, and commercialization of innovative therapies

• Needed a modern hybrid data architecture to easily integrate and synchronize data from multiple hybrid sources such as Salesforce and Veeva CRM into Azure SQL DW

• Built a scalable solution (i.e., hardware and storage) leveraging Informatica Intelligent Cloud Services for data integration and management and Azure as the platform

• Gained faster time-to-insights and made better, faster, data-driven business decisions to respond quickly and reach more patients


Bioverativ’s Cloud Journey

2017: Customer Relationship Management Rollout

• Salesforce & Veeva CRM rollout
• New digital transformation requirements arise that include additional reporting requirements
• Additional patient and data integration requirements increase

2018: Informatica Cloud Data Integration and Azure Rollout

• Implemented CRM Analytics CDW on Azure DWH
• Informatica Intelligent Cloud Services for data integration needs, which included support for vendor & external partner data systems
• Managed hybrid data sources and added new CRM services: Service Cloud, Veeva CRM, Veeva Vault

2019+: Integration with Sanofi & Cloud Data Lakes

• Integrate with Sanofi data warehouse infrastructure
• Implement data catalog, data lakes
• Implement predictive analytics to support data scientists


“Big Data” in Healthcare is about Quantity and Quality

Supervised training needs labelled data.

• 1.4M hand-labelled images
• 878% growth in global health data since 2016
• 800M medical scans per annum (US)
• 8.41 petabytes average data generated per organisation


Cloud Data Warehouse Modernization Blueprint

ENTERPRISE DATA CATALOG
DATA QUALITY & GOVERNANCE
DATA PRIVACY & PROTECTION

Visualization, Business Intelligence, Machine Learning

Elastic Compute
Cloud Object Store
API & Application Integration
Streaming Ingestion
Common Enterprise Metadata
AI/ML Engine
Cloud Data Integration
Replication & Mass Ingestion

On-Premises: Databases, Application Servers, Mainframe
SaaS
Edge: Logs, Machine Data, Connected Devices

Cloud Data Warehouses
Cloud Data Lake
Cloud


Intelligent Data Catalog

Machine
Humans
Metadata Collection
Knowledge Graph

AI Curated Catalog: Structure Discovery, Profile and Domain Discovery, Recommendations, Similarity, Clustering

Business & Crowd Sourced Curation

Data Catalog Use Cases: Data Asset Management, Data Governance, Self-Service Analytics

Data Analyst, Data Engineer, Data Architect, Data Scientist, Data Steward

Business Context: Glossary, Process, Policies

Broad Metadata Source: Databases, Documents, Mainframe, Cloud Data Warehouse, Application Servers, ETL Tools, Other Metadata Tools, Business Intelligence

Wisdom of Crowd: Annotations, Comments, Ratings, Business Classifications, Business Glossary Associations