4/17/2019 1 Understanding your HA and DR Options Brian Nordland PowerHA Architect HelpSystems Email: Brian.Nordland at HelpSystems.com
Understanding your HA and DR Options

Page 1: Understanding your HA and DR Options

Understanding your HA and DR Options

Brian Nordland, PowerHA Architect

HelpSystems

Email: Brian.Nordland at HelpSystems.com

Page 2: Understanding your HA and DR Options

Top IT Concerns for 2019

Hardware Replication

?

Page 3: Understanding your HA and DR Options

Which is the best?

It depends…

Page 4: Understanding your HA and DR Options

Page 5: Understanding your HA and DR Options

How long can you be down for?

Recovery Time Objective

Page 6: Understanding your HA and DR Options

Washington County Public Schools is trying to recover student data, including grades and attendance records, that was apparently not properly backed up and was permanently lost following a minor fire that downed multiple servers more than a week ago.

How much data can you afford to lose?

Recovery Point Objective

Page 7: Understanding your HA and DR Options

Recovery Time Objective: 5 minutes

Recovery Point Objective: 3 minutes (lost data)

Which solution gives me the best RPO and RTO?

It depends…
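As a rough illustration of how the two objectives are measured, the sketch below computes the downtime and the data-loss window for a single outage and compares them with the 5-minute RTO and 3-minute RPO shown on this slide. The timestamps and the function name are hypothetical, chosen only for the example.

```python
from datetime import datetime, timedelta

def recovery_metrics(last_good_copy, outage_start, service_restored):
    """Return (data_loss_window, downtime) for a single outage."""
    data_loss = outage_start - last_good_copy   # compared against the RPO
    downtime = service_restored - outage_start  # compared against the RTO
    return data_loss, downtime

# Objectives from the slide.
RTO = timedelta(minutes=5)
RPO = timedelta(minutes=3)

# Hypothetical outage timeline (illustrative values only).
last_good_copy = datetime(2019, 4, 17, 9, 58)
outage_start = datetime(2019, 4, 17, 10, 0)
service_restored = datetime(2019, 4, 17, 10, 4)

data_loss, downtime = recovery_metrics(last_good_copy, outage_start, service_restored)
meets_rpo = data_loss <= RPO
meets_rto = downtime <= RTO
print(f"data loss window: {data_loss}, meets RPO: {meets_rpo}")
print(f"downtime: {downtime}, meets RTO: {meets_rto}")
```

Whether a given technology meets both numbers depends on the type of outage, which is the point of "It depends…".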

Page 8: Understanding your HA and DR Options

[Diagram labels: Operating System, Physical Server, Data Storage, Data Center]

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Page 9: Understanding your HA and DR Options

Page 10: Understanding your HA and DR Options

Page 11: Understanding your HA and DR Options

Tips for Success in HA and DR

• You need an RTO and RPO for every type of outage

• Think about an RTO and RPO for both planned and unplanned outages

• Every company has different needs
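One way to act on these tips is to keep a small table of objectives and check it for gaps. The sketch below is a minimal illustration: the outage types come from the scorecard columns, while the RTO/RPO values and the function name are hypothetical placeholders, not recommendations.

```python
from datetime import timedelta
from itertools import product

# Outage types taken from the scorecard columns.
OUTAGE_TYPES = [
    "Storage Outage",
    "Server/Hardware Outage",
    "OS/Software Outage",
    "Data Center Outage",
    "Regional Outage",
]

# Hypothetical objectives: (outage type, planned/unplanned) -> (RTO, RPO).
requirements = {
    ("Storage Outage", "planned"): (timedelta(minutes=5), timedelta(0)),
    ("Storage Outage", "unplanned"): (timedelta(hours=1), timedelta(minutes=5)),
    ("Regional Outage", "unplanned"): (timedelta(hours=24), timedelta(hours=1)),
}

def missing_objectives(requirements):
    """List every (outage type, planned/unplanned) pair with no RTO/RPO defined."""
    return [
        key
        for key in product(OUTAGE_TYPES, ("planned", "unplanned"))
        if key not in requirements
    ]

gaps = missing_objectives(requirements)
for outage, kind in gaps:
    print(f"No RTO/RPO defined for {kind} {outage}")
```

Each company would fill in its own values; the check only makes sure no outage type has been left without an answer.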

Page 12: Understanding your HA and DR Options

HelpSystems. All rights reserved.

The Technologies

Focusing on which types of outages they protect against

Live Partition Mobility (LPM)

Page 13: Understanding your HA and DR Options

Page 14: Understanding your HA and DR Options

Page 15: Understanding your HA and DR Options

Live Partition Mobility (LPM)

Minimal application impact

Planned server hardware outages only

Requires everything to be virtualized

Requires external storage

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Planned Only

Page 16: Understanding your HA and DR Options

Remote Restart

Restart of partition on another physical server

For unplanned server hardware outages

Requires everything to be virtualized

Requires external storage

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

IBM VM Recovery Manager for HA: product for managing LPM and Remote Restart

Page 17: Understanding your HA and DR Options

Page 18: Understanding your HA and DR Options

Full System HyperSwap

Near-zero application impact for planned storage outages

Minimal application impact for unplanned storage outages

Supported for IBM DS8000 (PowerHA express edition), or IBM SVC/Storwize

Page 19: Understanding your HA and DR Options

Full System HyperSwap

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Page 20: Understanding your HA and DR Options

Page 21: Understanding your HA and DR Options

Page 22: Understanding your HA and DR Options

Full System Replication: Metro Mirror

Synchronous replication – all data on disk identical for great RPO but limited in distance

Secondary server is not accessible, but ready to be started instead of the first

Requires bandwidth for application data, OS data, and temporary storage (due to IBM i’s single level store)

Stopping replication to “test” requires manual IP address fixup.

When done, pick one copy or the other

Easy to set up, easy to manage (tools/products)

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Page 23: Understanding your HA and DR Options

Full System Replication: Global Mirror

Asynchronous replication – worse RPO than Metro Mirror, but allows for distances spanning the globe.

Secondary server is not accessible, but ready to be started instead of the first

Requires bandwidth for Application data, OS data, and temporary storage (due to IBM i’s single level store)

Stopping replication to “test” requires manual IP address fixup.

When done, pick one copy or the other

Easy to set up, easy to manage (tools/products)

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups
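The RPO difference between synchronous and asynchronous replication can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of how Metro Mirror or Global Mirror are implemented: writes are numbered, the source fails right after the last write, and `lag_writes` is a hypothetical parameter for how far the target may trail the source.

```python
from collections import deque

def simulate(num_writes, mode, lag_writes=0):
    """Toy replication model.

    Returns the acknowledged writes that are missing on the target
    if the source fails right after the last write.

    mode="sync":  a write is acknowledged only after the target has it.
    mode="async": a write is acknowledged at once and shipped later; the
                  target trails the source by up to `lag_writes` writes.
    """
    acknowledged = []
    target = []
    in_flight = deque()
    for write_id in range(num_writes):
        if mode == "sync":
            target.append(write_id)        # remote copy first...
            acknowledged.append(write_id)  # ...then acknowledge to the application
        elif mode == "async":
            acknowledged.append(write_id)  # acknowledge immediately
            in_flight.append(write_id)     # ship in the background
            while len(in_flight) > lag_writes:
                target.append(in_flight.popleft())
        else:
            raise ValueError(f"unknown mode: {mode}")
    on_target = set(target)
    return [w for w in acknowledged if w not in on_target]

lost_sync = simulate(100, "sync")
lost_async = simulate(100, "async", lag_writes=3)
print(f"sync lost {len(lost_sync)} writes, async lost {len(lost_async)} writes")
```

In this model the synchronous case loses nothing because every acknowledged write is already on the target, while the asynchronous case loses whatever was still in flight. The price of the synchronous case, not modeled here, is that each write waits on the link, which is why distance is limited.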

Page 24: Understanding your HA and DR Options

Full System Replication: Implementation Products/Options

Roll your own

You are responsible for ensuring things are done correctly and in the correct order to prevent data loss or corruption.

IBM VM Recovery Manager for DR

Automates and manages replication and switching.

• Orchestration/GUI partitions run AIX

• Requires everything to be virtualized

• Can switch AIX, Linux, IBM i

• Great for Managed Service Providers

IBM Lab Services Full System Replication Manager

Automates and manages replication and switching.

• Management partition runs IBM i

• Does not require virtualization

• Handles any “IP address fix-up”

What about OS/Software Outages?

Two primary flavors of technologies:

• Logical/Software Replication

• Hardware Replication with PowerHA

Page 25: Understanding your HA and DR Options

Software-Based (Logical) Replication

Remote Journaling

• PF, data queues, data areas, IFS

• Sync and Async

User Profiles and Spool files

• Many solutions use QAUDJRN

All instances are active

Advantages of Logical Replication

Source and target are both active (target can be used for BI, reporting or test purposes)

Perform offsite backups on target system without impacting RPO

Ability to distribute data to multiple target systems

Less bandwidth used (only journals are sent)

Considerations for Logical Replication

Requires more daily care/feeding/monitoring than hardware replication

Can have a bigger impact on system performance than hardware replication

Page 26: Understanding your HA and DR Options

Software-Based (Logical) Replication

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Synchronous Asynchronous

PowerHA: Hardware Replication with IASPs

Page 27: Understanding your HA and DR Options

Page 28: Understanding your HA and DR Options

Page 29: Understanding your HA and DR Options

Independent Auxiliary Storage Pools (IASPs)

Separates the OS from Applications/Data

Separate namespace and database

Can be taken online and offline without a system restart

Foundation for PowerHA technologies

Some objects do not make sense in an IASP; for these, there is the administrative domain

Administrative Domain

Synchronizes objects that do not make sense in an IASP

• Security

• Configuration

Examples

• User Profiles

• Printer Device Descriptions

• System Values

• And more…

Page 30: Understanding your HA and DR Options

Advantages of PowerHA

Any object in the IASP is replicated

Faster/easier role swaps

Less monitoring

Oftentimes less expensive than logical replication

Uses less bandwidth than Full System Replication (no OS or temporary data)

Considerations for PowerHA

Up-front work to get into an IASP (pay me now, save in the long run)

Uses more bandwidth than logical replication

Target server cannot be accessed while replication is active

• Replication can be detached

• FlashCopy with external storage

PowerHA - Solutions for every storage type

Internal Storage, DS8000, SVC/Storwize, IBM Copy Services Manager (DS8000)

LUN Level Switching

Metro Mirror and Global Mirror

FlashCopy

HyperSwap

Sync. Geographic Mirroring

Async. Geographic Mirroring

LUN Level Switching

Metro Mirror and Global Mirror

FlashCopy

HyperSwap

Metro Mirror and Global Mirror

New

HyperSwap with Global Mirror

Page 31: Understanding your HA and DR Options

Geographic Mirroring

Hardware replication done by the system

Works with any storage – generally recommended only for under 4 TB

Synchronous and Asynchronous flavors

PowerHA Geographic Mirroring

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Synchronous Asynchronous

Page 32: Understanding your HA and DR Options

PowerHA Metro Mirror

Synchronous replication – great RPO, but limited in distance

Allows for detaching to stop replication and test on target system or perform backups

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

PowerHA Global Mirror

Asynchronous replication – worse RPO than Metro Mirror, but allows for distances spanning the globe.

Allows for detaching to stop replication and test on target system or perform backups

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Page 33: Understanding your HA and DR Options

PowerHA LUN Level Switching

Data is switched between servers

Provides protection against OS/Software outages and server hardware outages

Frequently combined with Global Mirror

Page 34: Understanding your HA and DR Options

PowerHA LUN Level Switching

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Combining Technologies

Page 35: Understanding your HA and DR Options

PowerHA LUN Level Switching + Global Mirror

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

PowerHA DS8000 HyperSwap + Global Mirror

Outage Type Scorecard

Storage Outage Server/Hardware Outage OS/Software Outage Data Center Outage Regional Outage Offline Backups

Great RPO

Page 36: Understanding your HA and DR Options

Today we talked about real-time replication and switching technologies

However…

"As a result of a server migration project, any photos, videos, and audio files you uploaded more than three years ago may no longer be available on or from MySpace," the company announced last weekend. "We apologize for the inconvenience."

Page 37: Understanding your HA and DR Options

Real-time replication solutions for HA and DR are an addition to, not a replacement for, point-in-time disaster recovery solutions.

Summary

When considering HA and DR solutions, you need to first look at your RPO and RTO requirements for every type of outage.

There are solutions for every type of outage available. Many of these solutions can be combined.

Real-time replication solutions are an addition to, not a replacement for, point-in-time disaster recovery options, such as tape backup.
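The idea that solutions can be combined to cover more outage types can be sketched as a set union. In the example below the coverage sets are taken only from the bullet text of the slides (the scorecard check marks are not part of this transcript), so they are likely incomplete, and the particular combination is hypothetical and chosen only for illustration.

```python
# Scorecard columns that describe outage types (from the slides).
OUTAGE_TYPES = [
    "Storage Outage",
    "Server/Hardware Outage",
    "OS/Software Outage",
    "Data Center Outage",
    "Regional Outage",
]

# Coverage taken only from the slide bullets; the scorecard check marks
# are not in this transcript, so these sets are likely incomplete.
COVERAGE = {
    "Live Partition Mobility": {("Server/Hardware Outage", "planned")},
    "Remote Restart": {("Server/Hardware Outage", "unplanned")},
    "Full System HyperSwap": {
        ("Storage Outage", "planned"),
        ("Storage Outage", "unplanned"),
    },
}

def combined_coverage(technologies):
    """Union of the (outage type, planned/unplanned) pairs covered by each technology."""
    covered = set()
    for name in technologies:
        covered |= COVERAGE[name]
    return covered

def uncovered_types(technologies):
    """Outage types for which the combination covers neither planned nor unplanned outages."""
    covered_types = {outage for outage, _ in combined_coverage(technologies)}
    return [outage for outage in OUTAGE_TYPES if outage not in covered_types]

# Hypothetical combination, for illustration only.
combo = ["Live Partition Mobility", "Remote Restart", "Full System HyperSwap"]
gaps = uncovered_types(combo)
print("Not covered according to the bullets:", gaps)
```

Outage types that remain in the list are the ones where another technology, or a point-in-time option such as tape backup, would still be needed.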

Page 38: Understanding your HA and DR Options

Any Questions?