Data Virtualization: Revolutionizing data cloning a.k.a. copy data management kylehailey.com [email protected] @datavirt
Transcript
Page 1: Data Virtualization: Revolutionizing data cloning

Data Virtualization: Revolutionizing data cloning

a.k.a. copy data management

kylehailey.com [email protected] @datavirt

Page 2: Data Virtualization: Revolutionizing data cloning

Data virtualization

• Fast becoming the new norm

• Used by over 100 of the Fortune 500

• Enables DevOps

Page 3: Data Virtualization: Revolutionizing data cloning

DevOps movement

• Goals: clarify
• Metrics: define
• Constraints: identify
• Priorities: set
• Iterations: fast

Page 4: Data Virtualization: Revolutionizing data cloning

DevOps :

• Goals: clarify
• Metrics: define
• Constraints: identify
• Priorities: set
• Iterations: fast

• Continuous Integration
• Cloud
• Agile
• Kanban
• Kata

“IT is the factory floor of this century”

Page 5: Data Virtualization: Revolutionizing data cloning

The Goal : Theory of Constraints

Improvement not made at the constraint is an illusion

factory floor optimization

Page 6: Data Virtualization: Revolutionizing data cloning

Factory floor

Page 7: Data Virtualization: Revolutionizing data cloning

Factory floor

constraint

Not a relay race

Page 8: Data Virtualization: Revolutionizing data cloning

Tune before constraint

constraint

Tuning here

Stock piling

Page 9: Data Virtualization: Revolutionizing data cloning

Tune after constraint

constraint

Tuning here

Starvation

Page 10: Data Virtualization: Revolutionizing data cloning

Factory floor : straightforward

constraint

Goal: find the constraint and optimize it
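The factory floor slides can be put into a few lines of arithmetic. This is a minimal sketch, assuming a serial line whose output is capped by its slowest station; the station rates and the function names `throughput` and `constraint_index` are hypothetical, chosen only for illustration.

```python
def throughput(stage_rates):
    """Units per hour leaving a serial line: capped by the slowest stage."""
    return min(stage_rates)


def constraint_index(stage_rates):
    """Position of the slowest stage, i.e. the constraint."""
    return stage_rates.index(min(stage_rates))


# Hypothetical three-station line, in units per hour.
line = [10, 4, 8]

baseline = throughput(line)                # 4
tuned_before = throughput([20, 4, 8])      # 4: work stockpiles in front of station 2
tuned_after = throughput([10, 4, 16])      # 4: station 3 starves
tuned_constraint = throughput([10, 6, 8])  # 6: only this change moves the output

print(baseline, tuned_before, tuned_after, tuned_constraint)
```

Tuning before or after the constraint leaves the output unchanged, which is the "illusion" the earlier slide refers to.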

Page 11: Data Virtualization: Revolutionizing data cloning

The Phoenix Project

What is the constraint in IT?

Page 12: Data Virtualization: Revolutionizing data cloning

Put your energy into the constraint

Top 5 constraints in IT

1. Dev environments setup
2. QA setup
3. Code Architecture
4. Development
5. Product management

- Gene Kim

“One of the most powerful things that organizations can do is to enable development and testing to get environment they need when they need it”

Page 13: Data Virtualization: Revolutionizing data cloning

Data is the constraint

60% Projects Over Schedule

85% delayed waiting for data

Data is the Constraint

CIO Magazine Survey:

only getting worse

Gartner: Data Doomsday, by 2017 one third of IT in crisis

Page 14: Data Virtualization: Revolutionizing data cloning

• Data Constraint
• Solution
• Use Cases

In this presentation :

Page 15: Data Virtualization: Revolutionizing data cloning

Typical Architecture

[Diagram: Production instance with its file system and database]

Page 16: Data Virtualization: Revolutionizing data cloning

Typical Architecture

[Diagram: Production instance with its file system and database, plus a Backup copy of the file system and database]

Page 17: Data Virtualization: Revolutionizing data cloning

Typical Architecture

[Diagram: Production instance, Reporting instance and Backup, each with its own file system and database copy]

Page 18: Data Virtualization: Revolutionizing data cloning

Typical Architecture

[Diagram: Production, Dev, QA, UAT, Reporting and Backup, each with its own file system and database copy]

Triple Tax

Page 19: Data Virtualization: Revolutionizing data cloning

Typical Architecture

[Diagram: Production, Dev, QA, UAT, Reporting and Backup, each with its own file system and database copy]

Page 20: Data Virtualization: Revolutionizing data cloning

Typical Architecture

[Diagram: Production, Dev, QA, UAT, Reporting and Backup, each with its own file system and database copy]

Page 21: Data Virtualization: Revolutionizing data cloning

Copies


• Oracle customers : 8-12 copies per db

• Fortune 2K: 1000s of multi-TB databases

• Downstream storage staggering

- 3 petabytes at just one client

Page 22: Data Virtualization: Revolutionizing data cloning

• Hardware
  – storage, systems, network
  – rack space, power, cooling

• People
  – 1000s of hours per year just for DBAs
  – DBAs
  – Sys Admin
  – Storage Admin
  – Backup Admin
  – Network Admin

• $10s of millions for data center modernizations

Copies require People & Time

Page 23: Data Virtualization: Revolutionizing data cloning

companies unaware

Page 24: Data Virtualization: Revolutionizing data cloning

companies unaware

Developer or Analyst

Boss, Storage Admin, DBA

Page 25: Data Virtualization: Revolutionizing data cloning

Metrics

– Time – Old Data – Storage

Other – Analysts – Audits – Data Center Modernization

companies unaware

"We say no, no, no until we can't say no anymore": IT's response when asked for copies of the prod DB

Page 26: Data Virtualization: Revolutionizing data cloning

1. Waiting to check in code
2. Production Bugs
3. Expensive Slow QA

Biggest problem in Application Development

Page 27: Data Virtualization: Revolutionizing data cloning

Development : bottlenecks

Frustration Waiting

Page 28: Data Virtualization: Revolutionizing data cloning

Development : Bugs

Old Unrepresentative Data

Page 29: Data Virtualization: Revolutionizing data cloning

Development : subsets

• False Negatives
• False Positives
• Bugs in Production

Page 30: Data Virtualization: Revolutionizing data cloning

Production Wall


Page 31: Data Virtualization: Revolutionizing data cloning

QA : Long setup times

[Chart: cost to correct (axis 0 to 70) versus delay in fixing the bug (axis 1 to 7)]

Software Engineering Economics – Barry Boehm (1981)

Page 32: Data Virtualization: Revolutionizing data cloning

QA : destructive tests refresh time


[Timeline: 20 MIN TEST runs separated by 8 Hrs refreshes]

Page 33: Data Virtualization: Revolutionizing data cloning

• Data Constraint
• Solution
• Use Cases

In this presentation :

Page 34: Data Virtualization: Revolutionizing data cloning

Development QA UAT

99% of blocks are identical
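If 99% of blocks are identical, most of the storage spent on full copies is duplication. A rough sketch of the arithmetic, assuming a hypothetical 1 TB source, three copies (Development, QA, UAT) and 1% unique blocks per copy; compression is ignored and the function names are illustrative.

```python
def physical_copies_tb(copies, source_tb):
    """Storage when every copy is a full physical copy."""
    return copies * source_tb


def thin_clones_tb(copies, source_tb, unique_fraction):
    """Storage when copies share one set of blocks and keep only their own changes."""
    return source_tb + copies * source_tb * unique_fraction


full = physical_copies_tb(3, 1.0)    # 3.0 TB
thin = thin_clones_tb(3, 1.0, 0.01)  # about 1.03 TB

print(f"full copies: {full:.2f} TB, shared blocks: {thin:.2f} TB")
```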

Page 35: Data Virtualization: Revolutionizing data cloning

Solution

Page 36: Data Virtualization: Revolutionizing data cloning

Development QA UAT

Thin Clone

Page 37: Data Virtualization: Revolutionizing data cloning

• EMC Symmetrix
  – 16 snapshots
  – Write performance impact
  – No snapshots of snapshots

• Netapp & EMC VNX
  – 255 snapshots

• ZFS
  – Compression
  – Unlimited snapshots
  – Snapshots of snapshots

• DxFS
  – Compression
  – Unlimited snapshots
  – Snapshots of snapshots
  – Shared cache in memory

Technology Core : file system snapshots

Also check out new SSD storage such as: Pure Storage, EMC XtremIO

Page 38: Data Virtualization: Revolutionizing data cloning

Snapshot 1 – full backup once only at link time

Jonathan Lewis © 2013 Virtual DB


a b c d e f g h i

We start with a full backup, analogous to a level 0 RMAN backup. It includes the archived redo log files needed for recovery. Run in archivelog mode.

Page 39: Data Virtualization: Revolutionizing data cloning

Snapshot 2 (from SCN)

Jonathan Lewis © 2013

b' c'

a b c d e f g h i

The "backup from SCN" is analogous to a level 1 incremental backup (which includes the relevant archived redo logs). It is sensible to enable BCT (block change tracking).

Delphix executes standard RMAN scripts

Page 40: Data Virtualization: Revolutionizing data cloning

Apply Snapshot 2

Jonathan Lewis © 2013

a b c d e f g h i b' c'

The Delphix appliance unpacks the RMAN backup and "overwrites" the initial backup with the changed blocks, but DxFS makes new copies of the blocks

Page 41: Data Virtualization: Revolutionizing data cloning

Drop Snapshot 1

Jonathan Lewis © 2013

b' c' a d e f g h i

The call to RMAN leaves us with a new level 0 backup, waiting for recovery. But we can pick the snapshot root block. We have EVERY level 0 backup.
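The snapshot slides above can be modeled as a toy copy-on-write block store. This is a minimal sketch of the idea only, not how DxFS or ZFS is implemented; the class `BlockStore` and its methods are hypothetical names.

```python
class BlockStore:
    """Toy copy-on-write store: each snapshot maps a block name to a stored version."""

    def __init__(self):
        self.blocks = {}     # version id -> payload
        self.snapshots = {}  # snapshot name -> {block name: version id}
        self._next_id = 0

    def _put(self, payload):
        version_id = self._next_id
        self._next_id += 1
        self.blocks[version_id] = payload
        return version_id

    def full_backup(self, name, data):
        """Snapshot 1: store every block once (the level 0 analogue)."""
        self.snapshots[name] = {blk: self._put(p) for blk, p in data.items()}

    def incremental(self, parent, name, changed):
        """Later snapshots: share unchanged blocks, write new copies of changed ones."""
        mapping = dict(self.snapshots[parent])
        for blk, payload in changed.items():
            mapping[blk] = self._put(payload)  # the old version is left untouched
        self.snapshots[name] = mapping

    def drop(self, name):
        """Drop a snapshot and free versions no remaining snapshot references."""
        del self.snapshots[name]
        live = {v for m in self.snapshots.values() for v in m.values()}
        for version_id in list(self.blocks):
            if version_id not in live:
                del self.blocks[version_id]

    def read(self, name):
        return {blk: self.blocks[v] for blk, v in self.snapshots[name].items()}


store = BlockStore()
store.full_backup("snap1", {blk: blk for blk in "abcdefghi"})
store.incremental("snap1", "snap2", {"b": "b'", "c": "c'"})
blocks_with_both = len(store.blocks)   # 11: nine originals plus b' and c'
old_b = store.read("snap1")["b"]       # "b": snapshot 1 is still a full image
store.drop("snap1")
blocks_after_drop = len(store.blocks)  # 9: only b and c were freed
```

Every snapshot reads as a complete image even though only the changed blocks were stored a second time.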

Page 42: Data Virtualization: Revolutionizing data cloning

Creating a vDB

Jonathan Lewis © 2013

b' c' a d e f g h i

The first step in creating a vDB is to take a snapshot of the filesystem as at the backup you want (then roll it forward).

My vDB (filesystem)

Your vDB (filesystem)

b' c' a d e f g h i

Page 43: Data Virtualization: Revolutionizing data cloning

Creating a vDB

Jonathan Lewis © 2013

b' c' a d e f g h i

The first step in creating a vDB is to take a snapshot of the filesystem as at the backup you want (then roll it forward).

My vDB (filesystem)

Your vDB (filesystem)

i' b' c' a d e f g h i

b' c' a d e f g h i
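Two vDBs built on the same snapshot can be sketched as private overlays on a shared, read-only set of blocks. This is a simplified model under that assumption; `VirtualDB` and its methods are hypothetical names, not a product API.

```python
class VirtualDB:
    """Toy thin clone: reads fall through to the shared snapshot, writes stay private."""

    def __init__(self, snapshot):
        self._base = snapshot  # shared between clones, never modified
        self._private = {}     # blocks this clone has rewritten

    def read(self, block):
        return self._private.get(block, self._base[block])

    def write(self, block, payload):
        self._private[block] = payload

    def private_block_count(self):
        return len(self._private)


snapshot = {"a": "a", "b": "b'", "c": "c'", "d": "d", "e": "e",
            "f": "f", "g": "g", "h": "h", "i": "i"}

my_vdb = VirtualDB(snapshot)
your_vdb = VirtualDB(snapshot)

your_vdb.write("i", "i'")         # one clone changes block i
yours_i = your_vdb.read("i")      # "i'"
mine_i = my_vdb.read("i")         # "i": the other clone is unaffected
shared_i = snapshot["i"]          # "i": the snapshot itself is unchanged
```

Each clone costs only the blocks it rewrites, one block here, while both still see a full-size database.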

Page 44: Data Virtualization: Revolutionizing data cloning

Fuel does not equal a car

Challenges

1. Technical
2. Bureaucracy

Page 45: Data Virtualization: Revolutionizing data cloning

Bureaucracy

• Developer: asks for DB, gets access
• Manager: approves
• DBA: requests system, sets up DB
• System Admin: requests storage, sets up machine
• Storage Admin: allocates storage (takes snapshot)

Page 46: Data Virtualization: Revolutionizing data cloning

Why are hand offs so expensive?

1 hour

1 day

9 days

Bureaucracy

Page 47: Data Virtualization: Revolutionizing data cloning

Technical Challenge

[Diagram: database LUNs on the Production Filer serve the Source instance; snapshot clones on the same filer serve instances on Target A, Target B and Target C]

Page 48: Data Virtualization: Revolutionizing data cloning

Technical Challenge

[Diagram: database LUNs on the Production Filer; a snapshot and clones on a Development Filer serve instances on Target A, Target B and Target C]

Page 49: Data Virtualization: Revolutionizing data cloning

Technical Challenge

[Diagram: Production instance and file system, target storage, target instance]

1. Copy: time flow, purge
2. Clone (snapshot): compress, share cache
3. Provision: mount, recover, rename; self service, roles & security

Page 50: Data Virtualization: Revolutionizing data cloning

How to get data virtualization?

1. Source sync
2. Storage snapshots
3. Target deploy

Product          Source Sync   Storage Snapshots   Deploy automation
ZFS              -             Yes (unlimited)     -
EMC              SRDF          Yes (16 or 255)     -
Netapp           SMO           Yes (255)           -
Oracle EM 12c    Data Guard    Netapp, ZFS         Yes (Oracle only, no branching)
Actifio          Yes           Yes                 Yes (no branching)
Delphix          Yes           Yes                 Yes

Page 51: Data Virtualization: Revolutionizing data cloning

Actifio

[Diagram: Production instances, an Actifio appliance, and Target instances]

Page 52: Data Virtualization: Revolutionizing data cloning

Oracle Snap Clone

[Diagram: Production instances; ZFSSA or NetApp storage; EM 12c; Target instances]

Page 53: Data Virtualization: Revolutionizing data cloning

Oracle Snap Clone

[Diagram: Production instances; Data Guard instances; ZFSSA or NetApp storage; EM 12c; Target instances]

Page 54: Data Virtualization: Revolutionizing data cloning

Oracle Snap Clone

[Diagram: Production instances on any storage; Data Guard instance on Solaris ZFS; EM 12c; Target instances]

Page 55: Data Virtualization: Revolutionizing data cloning

Incremental forever: collect changes

[Diagram: Production instances feed changes into a time flow; Target instances mount it over NFS]

Page 56: Data Virtualization: Revolutionizing data cloning

Database Virtualization

Page 57: Data Virtualization: Revolutionizing data cloning

Three Physical Copies

Three Virtual Copies

Data Virtualization Appliance

Page 58: Data Virtualization: Revolutionizing data cloning

Before Virtual Data

[Diagram: Production, Dev, QA, UAT, Reporting and Backup, each with its own instance, file system and database copy]

“triple data tax”

Page 59: Data Virtualization: Revolutionizing data cloning

With Virtual Data

[Diagram: Production instance with its file system and database; Dev & QA, Reporting and Backup instances all served by one Data Virtualization Appliance]

Page 60: Data Virtualization: Revolutionizing data cloning

• Problem in the Industry
• Solution
• Use Cases

Page 61: Data Virtualization: Revolutionizing data cloning

1. Development and QA
2. Production Support
3. Business

Use Cases

Page 62: Data Virtualization: Revolutionizing data cloning

1. Development & QA
2. Production Support
3. Business

Use Cases

Page 63: Data Virtualization: Revolutionizing data cloning

Development: Virtual Data

Development

* Fast * Free * Full size * Self service

Page 64: Data Virtualization: Revolutionizing data cloning

Virtual Data: Easy

Instance

Instance

Instance

Instance

Source

DVA

Page 65: Data Virtualization: Revolutionizing data cloning

Development Virtual Data: Parallelize

gif by Steve Karam

Page 66: Data Virtualization: Revolutionizing data cloning

Development Virtual Data: Full size

Page 67: Data Virtualization: Revolutionizing data cloning

Development Virtual Data: Self Service

Page 68: Data Virtualization: Revolutionizing data cloning

QA : Virtual Data

• Fast
• Parallel
• Rollback
• A/B testing

Page 69: Data Virtualization: Revolutionizing data cloning

Dev

QA

Instance

Prod

DVA

• Eliminate build time

• Find bugs Fast

• Run Parallel QA

QA Virtual Data : Parallel

Production Time Flow

Page 70: Data Virtualization: Revolutionizing data cloning

QA Virtual Data : Fast Refresh


[Timeline: 20 MIN TEST runs, with and without the 8 Hrs refreshes between them]

• Fast
• Full
• Fresh
• Efficient

Page 71: Data Virtualization: Revolutionizing data cloning

QA with Virtual Data: Rewind

DVA

Instance

QA

Prod

Production Time Flow
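Rewind can be pictured as bookmarking a clone's private changes before a destructive test and restoring them afterwards. A minimal sketch under that assumption; `RewindableClone`, `bookmark` and `rewind` are hypothetical names, not a product API.

```python
class RewindableClone:
    """Toy QA clone that can return to a named earlier state."""

    def __init__(self, base):
        self._base = base     # shared snapshot, never modified
        self._private = {}    # this clone's own changes
        self._bookmarks = {}  # bookmark name -> saved copy of the private changes

    def read(self, block):
        return self._private.get(block, self._base[block])

    def write(self, block, payload):
        self._private[block] = payload

    def bookmark(self, name):
        self._bookmarks[name] = dict(self._private)

    def rewind(self, name):
        self._private = dict(self._bookmarks[name])


qa = RewindableClone({"orders": "clean", "customers": "clean"})
qa.bookmark("before_test")
qa.write("orders", "scrambled by destructive test")
during_test = qa.read("orders")  # "scrambled by destructive test"
qa.rewind("before_test")
after_rewind = qa.read("orders")  # "clean"
```

Only the clone's private changes are discarded, so nothing has to be copied again from production.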

Page 72: Data Virtualization: Revolutionizing data cloning

QA with Virtual Data: A/B

DVA

Instance

Instance

Instance

Index 1

Index 2

Production Time Flow

Page 73: Data Virtualization: Revolutionizing data cloning

Data Version Control

1/30/2015

Dev

QA

2.1

Dev

QA

2.2

2.1 2.2

Instance

Prod

DVA Production Time Flow

Page 74: Data Virtualization: Revolutionizing data cloning

1. Development and QA
2. Production Support
3. Business

Use Cases

Page 75: Data Virtualization: Revolutionizing data cloning

• Backups
• Recovery
• Forensics
• Migration
• Consolidation

Production Support

Page 76: Data Virtualization: Revolutionizing data cloning

9TB database, 1TB of change per day, 30 days of backups: storage requirements

[Chart: storage required (axis 0 to 70) over week 1 to week 4 for three series: original, Oracle, Delphix]
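The chart's inputs (9TB database, 1TB of change per day, 30 days) support a back-of-the-envelope comparison of backup policies. The three policies below are assumptions chosen for illustration and are not claimed to be the chart's series; compression is ignored.

```python
DB_TB = 9              # database size, from the slide
CHANGE_TB_PER_DAY = 1  # daily change, from the slide
DAYS = 30              # retention, from the slide


def daily_full(days):
    """Assumed policy: a full backup every day."""
    return DB_TB * days


def weekly_full_daily_incremental(days):
    """Assumed policy: a full backup every 7th day, incrementals in between."""
    total = 0
    for day in range(days):
        total += DB_TB if day % 7 == 0 else CHANGE_TB_PER_DAY
    return total


def incremental_forever(days):
    """Assumed policy: one full backup, then only changed blocks each day."""
    return DB_TB + CHANGE_TB_PER_DAY * (days - 1)


print(daily_full(DAYS))                     # 270
print(weekly_full_daily_incremental(DAYS))  # 70
print(incremental_forever(DAYS))            # 38
```

Under these assumptions, keeping one full copy plus changes needs far less storage than repeating full backups.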

Page 77: Data Virtualization: Revolutionizing data cloning

Recovery

Instance

Instance

Recover VDB

Drop

Source

DVA Production Time Flow

Page 78: Data Virtualization: Revolutionizing data cloning

Forensics

Instance

Development

DVA

Source

Production Time Flow

Page 79: Data Virtualization: Revolutionizing data cloning

Development (the new production)

Instance

Development

DVA

Source

Development

Prod & VDB Time Flow

Page 80: Data Virtualization: Revolutionizing data cloning

Migration

Page 81: Data Virtualization: Revolutionizing data cloning

1. Development and QA
2. Production Support
3. Business Intelligence

Use Cases

Page 82: Data Virtualization: Revolutionizing data cloning

Business Intelligence

• ETL
• Temporal
• Confidence Testing
• Federated Databases
• Audits

Page 83: Data Virtualization: Revolutionizing data cloning

Business Intelligence: ETL and Refresh Windows

1pm 10pm 8am noon

Page 84: Data Virtualization: Revolutionizing data cloning

Business Intelligence: batch taking too long

1pm 10pm 8am noon

2011

2012

2013

2014

2015

Page 85: Data Virtualization: Revolutionizing data cloning

2011

2012

2013

2014

2015

1pm 10pm 8am noon

10pm 8am noon 9pm

6am 8am 10pm

Page 86: Data Virtualization: Revolutionizing data cloning

Business Intelligence: ETL and DW Refreshes

Instance

Prod

Instance

DW & BI

Page 87: Data Virtualization: Revolutionizing data cloning

• Collect only changes
• Refresh in minutes

Instance

Prod

BI and DW

ETL 24x7

DVA

Virtual Data: Fast Refreshes

Production Time Flow

Page 88: Data Virtualization: Revolutionizing data cloning

Temporal Data

Page 89: Data Virtualization: Revolutionizing data cloning

Confidence testing

Page 90: Data Virtualization: Revolutionizing data cloning

Modernization: Federated

Instance

Instance

Source1

Source2

DVA

Production Time Flow 1

Production Time Flow 2

Page 91: Data Virtualization: Revolutionizing data cloning

Modernization: Federated

Page 92: Data Virtualization: Revolutionizing data cloning

“I looked like a hero” – Tony Young, CIO, Informatica

Modernization: Federated

Page 93: Data Virtualization: Revolutionizing data cloning

Production Time Flow

Audit


Instance

Prod

DVA

Live Archive

Page 94: Data Virtualization: Revolutionizing data cloning

1. Development & QA
2. Production Support
3. Business

Use Case Summary

Page 95: Data Virtualization: Revolutionizing data cloning

How expensive is the Data Constraint?

DVA at Fortune 500 :

Dev throughput increase by 2x

Page 96: Data Virtualization: Revolutionizing data cloning

Faster

• Financial Close
• BI refreshes
• Surgical recovery
• Projects

How expensive is the Data Constraint?

Page 97: Data Virtualization: Revolutionizing data cloning

• Projects “12 months to 6 months.” – New York Life

• Insurance product “about 50 days ... to about 23 days” – Presbyterian Health

• “Can't imagine working without it” – State of California

Virtual Data Quotes

Page 98: Data Virtualization: Revolutionizing data cloning

• Problem: Data is the constraint
• Solution: Virtualize data
• Results:
  – Half the time for projects
  – Higher quality
  – Increased revenue

Summary

Page 99: Data Virtualization: Revolutionizing data cloning

Thank you!

• Kyle Hailey | Oracle ACE and Technical Evangelist, Delphix

– [email protected]

– kylehailey.com

– slideshare.net/khailey

– @datavirt