1 © Copyright 2013 EMC Corporation. All rights reserved. Characterization of Incremental Data Changes for Efficient Data Protection Hyong Shim, Philip Shilane, & Windsor Hsu Backup Recovery Systems Division EMC Corporation

Dec 23, 2015

Transcript

Characterization of Incremental Data Changes for Efficient Data Protection

Hyong Shim, Philip Shilane, & Windsor Hsu

Backup Recovery Systems Division
EMC Corporation


Data Protection Environment

[Diagram: Application Servers and Virtual Machines; SAN or LAN; Primary Storage (high I/O per sec., medium capacity); WAN; Data Protection Storage (large capacity, medium I/O per sec.)]


Contributions

Detailed analysis of data change characteristics from enterprise customers

Design of replication snapshots that lower overheads on primary storage

Evaluation of overheads on data protection storage

Rules-of-thumb for storage engineers and administrators


EMC Symmetrix VMAX Traces

Trace Set     # Volumes   # Storage Systems   Duration (hrs)   Estimated Capacity (GB)
1hr_1Wrt      109,263     125                 30.4 [78.3]      71 [203]
1hr_1GBWrt    16,100      120                 7.7 [6.7]        132 [262]
24hr_1GBWrt   508         13                  24.4 [1.2]       318 [439]

Collected from enterprise customer sites


Capacity and Write Footprint

Analysis for 1hr_1GBWrt

Not collected: applications using each volume


I/O Properties

Trace Set     # Write reqs (1000s)   Write size (GB)   # Read reqs (1000s)   Read size (GB)
1hr_1Wrt      72 [510]               2 [31]            167 [1963]            5 [66]
1hr_1GBWrt    429 [1270]             11 [80]           796 [4987]            25 [166]
24hr_1GBWrt   1803 [4839]            51 [338]          7824 [23875]          242 [763]

1.9-4.3X more read I/Os than write I/Os
2.3-4.7X more GB read than written
High variability
More analysis in the paper
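
The read-to-write ranges quoted on this slide can be reproduced from the table with a few divisions. The sketch below assumes the leading value in each cell (the one outside the brackets) is the representative figure for the trace set; the names TRACE_VALUES and read_write_ratios are made up for this illustration.

```python
# Leading values per trace set, copied from the I/O Properties table.
# Tuple order: write reqs (1000s), write size (GB), read reqs (1000s), read size (GB).
TRACE_VALUES = {
    "1hr_1Wrt": (72, 2, 167, 5),
    "1hr_1GBWrt": (429, 11, 796, 25),
    "24hr_1GBWrt": (1803, 51, 7824, 242),
}


def read_write_ratios(values):
    """Return {trace_set: (request ratio, byte ratio)}, each as read / write."""
    ratios = {}
    for name, (w_reqs, w_gb, r_reqs, r_gb) in values.items():
        ratios[name] = (r_reqs / w_reqs, r_gb / w_gb)
    return ratios


RATIOS = read_write_ratios(TRACE_VALUES)
for name, (req_ratio, gb_ratio) in RATIOS.items():
    print(f"{name}: {req_ratio:.1f}x read requests, {gb_ratio:.1f}x GB read")
```

The smallest and largest results across the three trace sets round to the 1.9-4.3X and 2.3-4.7X ranges stated above.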


Sequential vs. Random Write I/O

We measure how much data are written, on average, after seeking to a non-consecutive sector.

We selected the most sequential and the most random volumes for analysis.

[Figure: trace timeline (w = Write I/O, r = Read I/O) mapped onto a storage volume, showing sequential write I/O runs. Example: (5 + 1 + 3) / 3 = 3]
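
One way to read the metric on this slide is as the mean length of a run of consecutive writes. The sketch below is an illustration, not the authors' tool: it assumes each write is a (start_sector, sector_count) pair, that reads are ignored, and that a write extends the current run only when it starts exactly where the previous write ended. The function name and the sample trace are hypothetical.

```python
def mean_sequential_run(writes):
    """Average number of sectors written per seek.

    writes -- list of (start_sector, sector_count) tuples in trace order.
    A write that starts exactly where the previous write ended extends the
    current run; any other start sector counts as a seek and opens a new run.
    """
    runs = []
    next_sector = None
    for start, count in writes:
        if runs and start == next_sector:
            runs[-1] += count      # consecutive: extend the current run
        else:
            runs.append(count)     # seek: start a new run
        next_sector = start + count
    return sum(runs) / len(runs) if runs else 0.0


# Hypothetical trace with runs of 5, 1, and 3 sectors.
EXAMPLE_WRITES = [(0, 2), (2, 3), (100, 1), (50, 2), (52, 1)]
print(mean_sequential_run(EXAMPLE_WRITES))
```

With this sample trace the three runs are 5, 1, and 3 sectors long, which matches the slide's arithmetic (5 + 1 + 3) / 3 = 3.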


Trace Analysis Methodology

[Figure: trace timeline (w = Write I/O, r = Read I/O) over storage volume sectors grouped into blocks, divided into Replication Interval 1 and Replication Interval 2. The transfer period may require snapshot storage and I/O.]

Create a snapshot to protect block data


Replication Snapshot

[Figure: storage volume state before transfer takes place, blocks 0 to 4, with the modified blocks to be transferred marked; trace timeline (w = Write I/O).]

Goal: Create a snapshot technique, integrated with replication, that decreases overheads on primary storage

Changed block tracking records modified blocks for the next replication interval, possibly with a bit vector.

A snapshot has to maintain block values against overwrites.
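
Changed block tracking with a bit vector could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name, the 8-sectors-per-block default, and the method names are all hypothetical and do not come from the slides.

```python
class ChangedBlockTracker:
    """Bit vector with one bit per block; a set bit marks a modified block."""

    def __init__(self, num_blocks, sectors_per_block=8):
        self.num_blocks = num_blocks
        self.sectors_per_block = sectors_per_block   # assumed block size
        self.bits = bytearray((num_blocks + 7) // 8)

    def record_write(self, start_sector, sector_count):
        """Mark every block touched by a write as modified."""
        first = start_sector // self.sectors_per_block
        last = (start_sector + sector_count - 1) // self.sectors_per_block
        for block in range(first, last + 1):
            self.bits[block // 8] |= 1 << (block % 8)

    def modified_blocks(self):
        """Return the block numbers whose bit is set, in ascending order."""
        return [b for b in range(self.num_blocks)
                if self.bits[b // 8] & (1 << (b % 8))]

    def start_next_interval(self):
        """Hand back the blocks to transfer and clear the vector."""
        to_transfer = self.modified_blocks()
        self.bits = bytearray(len(self.bits))
        return to_transfer


tracker = ChangedBlockTracker(num_blocks=5)
tracker.record_write(start_sector=8, sector_count=4)    # falls inside block 1
tracker.record_write(start_sector=30, sector_count=4)   # spans blocks 3 and 4
TO_TRANSFER = tracker.start_next_interval()
print(TO_TRANSFER)
```

The vector costs one bit per block, so the tracking overhead stays small even for large volumes; the list returned at the interval boundary is the set of blocks the snapshot then has to protect during transfer.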


Replication Snapshot

Baseline Snapshot: All writes cause copy-on-write

[Figure: storage volume state before transfer takes place (blocks 0 to 4, modified blocks to be transferred marked), the snapshot area, and a trace timeline of writes (w) arriving while the transfer is in progress under the Baseline scheme.]


Replication Snapshot

Changed Block Replication Snapshot (CB): Only writes to tracked blocks cause copy-on-write

[Figure: blocks 0 to 4 and the snapshot area, with writes (w) arriving while the transfer is in progress, comparing Baseline and CB.]


Replication Snapshot

Changed Block with Early Release Replication Snapshot (CBER): Only writes to tracked blocks cause copy-on-write, and blocks are released once transferred

[Figure: blocks 0 to 4 and the snapshot area, with writes (w) arriving while the transfer is in progress, comparing Baseline, CB, and CBER.]


Replication Snapshot

Baseline Snapshot: All writes cause copy-on-write

Changed Block Replication Snapshot (CB): Only writes to tracked blocks cause copy-on-write

Changed Block with Early Release Replication Snapshot (CBER): Only writes to tracked blocks cause copy-on-write, and blocks are released once transferred

[Figure: blocks 0 to 4 and the snapshot area under the Baseline, CB, and CBER schemes as writes (w) arrive; modified blocks to be transferred are marked.]
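
The difference between the three schemes can be made concrete with a toy simulation. The sketch below is an interpretation of the slide text, not the authors' implementation: it assumes events arrive as an ordered list of writes ("w") and block transfers ("t"), that only the first overwrite of a block needs a copy, and that "released" means the block no longer needs protection once sent. The function name and the event encoding are hypothetical.

```python
def simulate(events, tracked, scheme):
    """Count copy-on-write operations and the peak snapshot-area size.

    events  -- ordered list of ("w", block) writes and ("t", block) transfers
    tracked -- blocks modified in the previous interval (to be transferred)
    scheme  -- "baseline", "cb", or "cber"
    """
    preserved = set()     # blocks whose old contents sit in the snapshot area
    transferred = set()   # blocks already sent to protection storage
    cow_ops = 0
    peak = 0
    for kind, block in events:
        if kind == "t":
            transferred.add(block)
            if scheme == "cber":
                preserved.discard(block)   # early release of the saved copy
            continue
        if block in preserved:
            continue                       # old value is already saved
        if scheme == "baseline":
            needs_copy = True              # every write is protected
        elif scheme == "cb":
            needs_copy = block in tracked  # only tracked blocks
        else:
            # CBER: tracked blocks that have not been transferred yet
            needs_copy = block in tracked and block not in transferred
        if needs_copy:
            preserved.add(block)
            cow_ops += 1
            peak = max(peak, len(preserved))
    return cow_ops, peak


# Hypothetical interval: blocks 1 and 3 are tracked; writes hit blocks 0, 1, 3.
TRACKED = {1, 3}
EVENTS = [("t", 1), ("w", 0), ("w", 1), ("w", 3), ("t", 3), ("w", 3)]
RESULTS = {s: simulate(EVENTS, TRACKED, s) for s in ("baseline", "cb", "cber")}
print(RESULTS)
```

In this made-up trace the baseline performs three copies, CB two, and CBER one, because block 0 is untracked and block 1 has already been transferred when it is overwritten.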


Snapshot Storage Overheads

Rule-of-thumb: Over-provision primary capacity by 8% for snapshots


Snapshot I/O Overheads

Rule-of-thumb: Over-provision primary I/O by 100% to support copy-on-write related write-amplification



Transfer Size to Protection Storage

Rule-of-thumb: 40% of written bytes are transferred to protection storage
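
Less is transferred than is written because a block overwritten several times within one replication interval is sent only once. The sketch below illustrates that effect on a made-up trace; it is not how the 40% figure was derived, and the function name, the (timestamp, block) encoding, and the sample data are assumptions.

```python
def transfer_fraction(write_blocks, interval_hours):
    """Fraction of block writes that result in a transferred block.

    write_blocks   -- list of (timestamp_hours, block_number), one per block write
    interval_hours -- replication interval length in hours
    Each block is sent at most once per interval, so repeated overwrites
    inside one interval are coalesced.
    """
    if not write_blocks:
        return 0.0
    unique = {(int(t // interval_hours), block) for t, block in write_blocks}
    return len(unique) / len(write_blocks)


# Hypothetical trace: block 7 is overwritten hourly, blocks 1 and 2 once each.
EXAMPLE = [(0.5, 7), (1.5, 7), (2.5, 7), (3.5, 7), (0.2, 1), (3.9, 2)]
for hours in (1, 2, 4):
    print(hours, transfer_fraction(EXAMPLE, hours))
```

On this sample the fraction falls from 1.0 at a 1-hour interval to 0.5 at a 4-hour interval, showing how longer intervals absorb more overwrites.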


IOPS Requirements for Protection Storage

Rule-of-thumb: Protection storage must support 20% of the I/O per second capabilities of primary storage


Related Work

Trace analysis
– Numerous publications
– Most closely related is Patterson [2002]

Snapshots
– Common paradigm for storage but rarely integrated with incremental transfer techniques
– Storage overheads: Azagury [2002] and Shah [2006]

Synchronous Mirroring
– Effective when change rates are low and geographic distance is small
– We are focused on periodic, asynchronous replication


Conclusion

Trace analysis shows diversity of storage characteristics

Snapshot overheads on primary storage can be decreased by improved integration with network transfer

Sequential versus random access patterns affect incremental change patterns on both primary and protection storage


Rules-of-Thumb

Over-provision primary capacity by 8% for snapshots

Over-provision primary I/O by 100% to support copy-on-write related write-amplification

A write buffer decreases snapshot I/O overheads but has little impact on storage overheads

40% of written bytes are transferred to protection storage

Schedule at least 6 hours between transfers to minimize clean data in transferred blocks

Schedule at least 12 hours between transfers to minimize peak network bandwidth requirements

Protection storage must support 20% of the I/O per second capabilities of primary storage
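
The numeric rules above can be applied as a back-of-the-envelope calculation. The sketch below simply encodes the percentages from this slide; the function name, the parameter names, and the sample inputs are hypothetical, and "over-provision by 100%" is read here as doubling the workload's I/O rate.

```python
def size_replication(primary_capacity_gb, primary_iops, written_gb_per_interval):
    """Apply the slide's rules of thumb to one primary storage system."""
    return {
        # 8% extra primary capacity for the snapshot area
        "snapshot_capacity_gb": 0.08 * primary_capacity_gb,
        # 100% extra primary I/O for copy-on-write write amplification
        "provisioned_primary_iops": 2.0 * primary_iops,
        # 40% of written bytes are transferred to protection storage
        "transfer_gb_per_interval": 0.40 * written_gb_per_interval,
        # protection storage supports 20% of primary I/O per second
        "protection_iops": 0.20 * primary_iops,
    }


# Hypothetical system: 1000 GB, 5000 IOPS, 50 GB written per interval.
PLAN = size_replication(primary_capacity_gb=1000, primary_iops=5000,
                        written_gb_per_interval=50)
for key, value in PLAN.items():
    print(f"{key}: {value:.0f}")
```

These are averages drawn from traces with high variability, so the output is a starting point for planning rather than a guarantee for any single volume.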


Questions?


Trace Analysis: Replication of Snapshots

The amount of data to replicate drops in half with 12 hours between snapshots. 4 KB results are compared to Patterson 2002.


I/O Per Second (IOPS) Request Rate

Trace Set     Average Write Rate   Peak Write Rate (10 ms)   Average Read Rate   Peak Read Rate (10 ms)
1hr_1Wrt      0.7 [8]              1762 [2602]               2 [25]              1693 [2457]
1hr_1GBWrt    15 [37]              4360 [4379]               29 [118]            3603 [4135]
24hr_1GBWrt   20 [55]              9004 [8165]               89 [269]            5647 [7012]

Peak Values: IOPS are calculated every 10 ms period, and the peaks for each volume are averaged.
More analysis in the paper
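
The peak calculation described above can be sketched as follows. This is an illustration only: it assumes request timestamps are integers in microseconds (to avoid floating-point bucketing errors) and that windows are fixed, non-overlapping 10 ms bins; the function names and the sample volumes are hypothetical.

```python
from collections import Counter


def peak_iops(timestamps_us, window_us=10_000):
    """Highest request rate, in I/Os per second, over fixed 10 ms windows."""
    if not timestamps_us:
        return 0.0
    counts = Counter(t // window_us for t in timestamps_us)
    return max(counts.values()) * 1_000_000 / window_us


def mean_peak_iops(volumes):
    """Average the per-volume peaks; volumes maps name -> list of timestamps."""
    peaks = [peak_iops(ts) for ts in volumes.values()]
    return sum(peaks) / len(peaks) if peaks else 0.0


# Hypothetical request timestamps in microseconds for two volumes.
VOLUMES = {
    "vol_a": [0, 2_000, 9_000, 15_000],   # 3 requests in the first 10 ms window
    "vol_b": [0, 20_000],                 # at most 1 request per window
}
print(mean_peak_iops(VOLUMES))
```

Here vol_a peaks at 300 IOPS and vol_b at 100 IOPS, so the averaged peak is 200 IOPS. Short windows like this explain why the peak rates in the table are far above the average rates.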
