Audax Group: CIO Perspectives - Managing The Copy Data Explosion

Post on 21-Oct-2014

DESCRIPTION

Presentation delivered by Audax Group CIO to Gartner Symposium ITxpo on managing the Copy Data Explosion with Actifio

Transcript

This presentation, including any supporting materials, is owned by Gartner, Inc. and/or its affiliates and is for the sole use of the intended Gartner audience or other authorized recipients. This presentation may contain information that is confidential, proprietary or otherwise legally protected, and it may not be further copied, distributed or publicly displayed without the express written permission of Gartner, Inc. or its affiliates. © 2012 Gartner, Inc. and/or its affiliates. All rights reserved.

Erik-Jan Dubóvik

Chief Information Officer

Audax Group

CIO Perspectives: Opportunities in Managing the Copy Data Explosion

About Audax Group

• Background

- Founded 1999, ~140 people, offices in Boston & New York

- Investor in lower-middle market companies

- Manage over $5B of assets through our private equity, mezzanine debt, and private senior debt businesses

Copy Data Management Visualized

Infrastructure-Centric Data Management (Status Quo)

1 Redundant – Multiple silos, same 4 primitives

2 Complex – Keep adding to relieve “symptoms”

3 Slow – Moving lots of data across networks

Information-Centric Data Management

1 Flexible – Any environment (virtual, hybrid…)

2 Simple – One integrated data protection app

3 Fast – Data mounts directly to production

DUPLICATION + INFRASTRUCTURE + OPERATIONS + COMPLEXITY + COST

A whole new market category…

To go from good to great, storage administrators should evaluate these types of tools:

“Copy data management: These products can perform a host of functions, including backup, archiving, replication and creation of test data using a minimal number of copies.”

13 March 2013, ID: G00248888

…And a ‘Best Practice’

Dave Russell, VP Distinguished Analyst

“The notion of copy data management — which reduces the proliferation of secondary copies of data for backup, disaster recovery, testing and reporting — is becoming increasingly important to contain costs and to improve infrastructure agility.” 15 August 2013 G00252768

Best Practices for Repairing the Broken State of Backup

Copy Data Growth Drivers

Q: What are the reasons for growth of secondary data copies?

[Bar chart: % of respondents (N=556) citing each reason, on a 0% to 80% axis; bar values not recoverable. Reasons charted:]

- Other

- Lack of data copy management tools and/or practices

- New/expanded use of business analytics

- Regulatory requirements to store data for a specific period of time

- Larger size of secondary copies to be created

- More copies per application are created

- Increased number of applications

The Power of Copy Data Management

Tools Landscape

[Diagram: point tools grouped under five functions: Tiering, Snapshot, Backup, Dedup, Replication]

Tools shown: RecoverPoint, SRDF, MirrorView, DataDomain, Avamar, Networker, FAST, Timefinder, RM, SnapView, Remote Copy, Continuous Access, DataProtector, Adaptive Optimization, Virtual Copy, EVA, Snapshot, SnapMirror, SyncSort, CommVault, NetBackup, AST, Inmage, True Copy, SmartTiers, Shadow Image, CoW Snapshot, BackupExec, VxFS DST, PureDisk, StoreOnce, HDIM, VVR, RealTime

Context and Problem

• Situation

- Resource & time intensive business processes require immediate systems performance and limited downtime

- 5 ESX Hosts, 50 servers, 16TB storage, Dual LTO4

- 500k emails/mo (3,500/FTE); annual data growth 10%

• The Problem

- Backup window entering business day

- Business continuity technology didn’t protect all systems & relied on tape* for server restoration

- Level 1 RTOs range 5hrs (SQL) to 12hrs (email), 48hrs (file)

- Backup email service not acceptable for multi-hour use

* If tapes are corrupt, RPO grows to 7 days or longer.

Objectives

• Justification & Business Case

- Fully protect all company systems

- Eliminate need for expensive Tier 1 storage

- Establish Co-Lo for systems and personnel

- Free up expensive real estate (i.e., NY Server Room)

- Avoid growing IT staff

• Specific Goals & Timeline

- 3 month project start-to-finish

- Major improvement of RPO/RTO

Objectives: RPO/RTO

                      Level 1 RPO/RTO   Level 2 RPO/RTO   Level 3 RPO/RTO
Actifio Target        3hr/15min         18hr/30min        24hr/45min
Previous Capability   24hr/80hr         24hr/8.7 days     24hr/19.6 days

Graphic source: Wikipedia
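
The gap between the previous capability and the Actifio target is easier to compare once every RPO/RTO figure is expressed in hours. The sketch below is only an illustration: the numbers are transcribed from the RPO/RTO slide, while the names `previous`, `target`, and `improvement_factors` are made up for this example. For Level 1, the RTO drops from 80 hours to 15 minutes, a factor of 320.

```python
HOURS_PER_DAY = 24

# (RPO in hours, RTO in hours) per protection level, transcribed from the slide.
previous = {
    "Level 1": (24, 80),
    "Level 2": (24, 8.7 * HOURS_PER_DAY),
    "Level 3": (24, 19.6 * HOURS_PER_DAY),
}
target = {
    "Level 1": (3, 15 / 60),
    "Level 2": (18, 30 / 60),
    "Level 3": (24, 45 / 60),
}


def improvement_factors(before, after):
    """Return {level: (rpo_factor, rto_factor)}; a factor of 2 means half the time."""
    return {
        level: (
            before[level][0] / after[level][0],
            before[level][1] / after[level][1],
        )
        for level in before
    }


factors = improvement_factors(previous, target)
for level, (rpo_x, rto_x) in factors.items():
    print(f"{level}: RPO improves {rpo_x:.1f}x, RTO improves {rto_x:.0f}x")
```

Note that the Level 3 RPO factor is 1.0: the target keeps the 24-hour RPO and only shortens the RTO.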

Approach: Options

• Alternatives Considered

- Expand existing host-based replication software (DoubleTake, WANSync)

- Veeam + new storage

  • Pushing limits of tech at a comparatively higher cost

• Considerations

- Failover: How long to “spin up” server in Production site? DR?

- Application support: Linux, Exchange, SQL Server, SharePoint?

- Storage: How much required? De-dupe/compression (important if using one device for backup)?

- Replication: Site-to-site on-premise capable? Site-to-Cloud (if so, what limitations, if any)?

- Severability vs. Integration: Acceptable risk if part of VM environment (vs. standalone)?

- Data Restore: Server vs. item-level? Number of snapshots? How long to “spin up” server?

- Cost: Savings from HW/SW elimination, avoidance & downsizing? Staff optimization?

- Timing: Natural refresh cycle of related HW/SW (e.g., storage, dedupe, backup, data center)?

- Connectivity: Local environment (Fibre vs. iSCSI)? WAN (1MB/5/10/100/1GB)?

Approach

• Strategies

- Engage business management to participate in people/process change and define system priorities

- Embrace opportunity around architecture change

• Technologies Leveraged

- Actifio, VMware, Cisco, Metro-E (100MB)

Our Actifio Environment

[Diagram: SITE A: PRODUCTION and SITE B: FAILOVER]

- Ingest Server ONCE

- Capture only changed blocks (zero backup window)

- Store only unique blocks (10X lower storage)

- Move only unique blocks (70% less bandwidth)

- Recreate data on demand

- Instantly mount recovered data (zero restore window)

- Incremental restore for BC
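
The ideas of capturing only changed blocks, storing and moving only unique blocks, and recreating data on demand can be illustrated with a small content-addressed block store. This is a generic deduplication sketch and not a description of how Actifio works internally. `BlockStore`, `capture`, `replicate_to`, `mount`, and the 4-byte block size are all hypothetical choices made for the example, and a real product would typically get its changed-block list from the hypervisor or a filter driver rather than rehashing every block.

```python
from __future__ import annotations

import hashlib

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to trace


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut an image into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BlockStore:
    """Keeps each unique block once, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def capture(self, data: bytes, previous: list[str] | None = None) -> tuple[list[str], int]:
        """Return (recipe, changed_count). `previous` is the recipe of the
        last capture taken into this same store, if any."""
        previous = previous or []
        recipe: list[str] = []
        changed = 0
        for index, block in enumerate(split_blocks(data)):
            digest = hashlib.sha256(block).hexdigest()
            recipe.append(digest)
            if index >= len(previous) or previous[index] != digest:
                changed += 1                           # only changed blocks are captured
                self.blocks.setdefault(digest, block)  # only unseen blocks are stored
        return recipe, changed

    def replicate_to(self, remote: BlockStore, recipe: list[str]) -> int:
        """Copy only the blocks the remote store lacks; return how many were sent."""
        missing = {d for d in recipe if d not in remote.blocks}
        for digest in missing:
            remote.blocks[digest] = self.blocks[digest]
        return len(missing)

    def mount(self, recipe: list[str]) -> bytes:
        """Recreate an image on demand from its recipe."""
        return b"".join(self.blocks[d] for d in recipe)


production = BlockStore()
failover = BlockStore()

v1 = b"AAAABBBBCCCCDDDD"
v2 = b"AAAABBBBXXXXDDDD"  # same image with one block changed

recipe1, changed1 = production.capture(v1)
recipe2, changed2 = production.capture(v2, previous=recipe1)

sent_v2 = production.replicate_to(failover, recipe2)
sent_v1 = production.replicate_to(failover, recipe1)
```

In this run the second capture records a single changed block, the production store holds five unique blocks for two 4-block images, and replicating the first image after the second sends only the one block the failover store does not already have.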

Challenges & Results

• Biggest Challenges

- Overly aggressive protection SLAs @ start

- Multiple power outages during transition

- Metro-E providers didn’t provide “true” Layer 2

• How Did We Overcome Them (Or Not)?

- Increased RPOs for Level 2 & 3 systems

- Stopped synchronization for 18 hours to re-index system

- Implemented Network Interface Devices (NIDs) to route all Layer 2 traffic (necessary for Metro-E High Availability)

Challenges & Results

• Results: $ and Intangible

- Increased short-term costs, but $150k less than alternative

- Met all RPO/RTO objectives; didn’t meet timeline

  • Metro-E networking issues were unforeseen

• Upside Surprises

- Added near real-time restoration of item-level objects from any backup of Exchange & SharePoint

- Decided to move Production to Co-Lo; new storage implementation to be handled through Actifio

Lessons Learned & Recommendations

• Lessons Learned

- Engage telecom carrier Engineering early on

- Use project as opportunity to review Business Continuity on a holistic basis

- Partner w/ cross-functional vendor (storage, backup)

• What Would We Do Differently?

- Less aggressive with Level 2 & 3 SLAs @ start

- Test network technology earlier & more often

Quantifying The Problem

The Copy Data Ratio (CDR)

CDR = (Total Data in Environment (TB) / Total Amount of Production Data (TB)) x 100

Example: (45TB / 8TB) x 100 = 563
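
The CDR arithmetic is simple enough to check in a few lines. The sketch below assumes both inputs are in terabytes; the function names `copy_data_ratio` and `round_half_up` are illustrative and not part of the presentation. The slide's example works out to exactly 562.5, which the slide reports as 563.

```python
def copy_data_ratio(total_tb: float, production_tb: float) -> float:
    """CDR = (total data in environment / production data) x 100."""
    if production_tb <= 0:
        raise ValueError("production data must be greater than zero")
    return total_tb / production_tb * 100


def round_half_up(value: float) -> int:
    """Round a non-negative value with .5 going up.

    Python's built-in round() sends ties to the nearest even number
    (round(562.5) is 562), so the slide's 563 needs explicit half-up rounding.
    """
    return int(value + 0.5)


cdr = copy_data_ratio(45, 8)
print(cdr, round_half_up(cdr))  # prints: 562.5 563
```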

Quantifying The Problem

The Copy Data Ratio (CDR): What’s Your Number?

CDR           Assessment
100 – 150     Optimistic
150 – 350     Opportunistic
350 – 700     Urgency
700 – 1,000   Crisis

Example score: 563 (Urgency)
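
Mapping a CDR score to its band can be sketched as a small lookup. The band edges and labels come from the slide; everything else is an assumption made for this example: the slide's ranges share their endpoints, so the sketch treats each lower bound as inclusive and each upper bound as exclusive, and it labels scores of 1,000 and above as "Crisis" even though the slide stops at 1,000. The name `classify_cdr` is hypothetical.

```python
# (lower bound inclusive, upper bound exclusive, label); edges from the slide.
CDR_BANDS = [
    (100, 150, "Optimistic"),
    (150, 350, "Opportunistic"),
    (350, 700, "Urgency"),
    (700, 1000, "Crisis"),
]


def classify_cdr(cdr: float) -> str:
    """Return the slide's label for a CDR score."""
    if cdr < 100:
        # Total data includes production data, so the ratio cannot fall below 100.
        raise ValueError("CDR cannot be below 100")
    for lower, upper, label in CDR_BANDS:
        if lower <= cdr < upper:
            return label
    # Assumption: anything at or beyond the last edge stays in the top band.
    return "Crisis"


print(classify_cdr(563))  # prints: Urgency
```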

Evaluating CDR Score in Relation to Operational Complexity

[2x2 matrix: Tools in Use (Low to High) against Copy Data Ratio (Low to High), with the example CDR of 563 plotted]

Quadrant 1: Transformational opportunity for savings, efficiency gains

Quadrant 2: Large opportunity for savings, efficiency gains

Quadrant 3: Opportunity for savings, some efficiency gains

Quadrant 4: Limited savings, efficiency opportunities

Summary

• Copy data is a source of significant spend and inefficiency in the enterprise

• Impact felt most severely on revenue-generating and business-agility initiatives

• Delays / issues due to resource drain from copy data sprawl

• Important to understand the magnitude of the problem

• Calculating the Copy Data Ratio (CDR) can help influence an action plan based on effort / impact analysis
