hbaseconasia2017: HBase Disaster Recovery Solution at Huawei

Jan 22, 2018

HBase Disaster Recovery Solution at Huawei

Ashish Singhi

About

• Senior Technical Leader at Huawei

• Around 6 years of experience in Big Data related projects

• Apache HBase Committer

Agenda

• Why Disaster Recovery?

• Backup vs. Disaster Recovery

• HBase Disaster Recovery

• Solution

• Miscellaneous

• Future Work

Why Disaster Recovery?

Cost of Downtime

Backup vs. Disaster Recovery

Two different problems and solutions

                    Backup                        Disaster Recovery
Process             Archive items to cold media   Replicate to secondary site
Infrastructure      Medium level                  Duplicate of active cluster (high level)
Cost                Affordable                    Expensive
Restore process     One to few at a time          One to everything
Restore time        Slow                          Fast
Production usage    Common                        Rare

HBase Disaster Recovery

• HBase disaster recovery is based on replication, which mirrors data across a network in real time.

• The technology is used to move data from a local source location to one or more target locations.

• Replication over WAN has become an ideal technology for disaster recovery, preventing data loss in the event of a failure.

Deployment Strategies

Active – Standby Cluster

[Figure: an active HBase cluster and a standby HBase cluster, each with its own ZooKeeper, connected by replication.]

• Active cluster: /hbase/clusterState: active. Serves read and write client requests.

• Standby cluster: /hbase/clusterState: standby. Serves only read client requests.

Replication

[Figure: replication data flow.]

• Source cluster: a region server with its WAL, a Replication Source Manager, and one Replication Source/End Point per peer. ZooKeeper holds the …/peers/, …/rs/ and …/hfile-refs/ znodes.

• Peer Cluster 1 and Peer Cluster 2, each labeled with its own tableCfs setting: a region server whose Replication Sink applies incoming data to the table.

• Data travels from the source to each peer along two paths, labeled "Batch" and "Bulk load".

Sync DDL Operations

• Synchronize table properties across clusters.

• Any change in the source cluster is reflected immediately in the peer clusters.

• Does not break replication.

• An additional option on the DDL command requests the sync.

• The changes are then synced to the peer clusters internally.
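
The sync option can be pictured with a small in-memory model. This is only a sketch: the Cluster class, the alter_table_synced function and the sync flag are hypothetical names chosen for illustration, not the actual HBase shell or Admin API, and the slides do not say how the real option is spelled.

```python
class Cluster:
    """Hypothetical stand-in for a cluster's table metadata."""

    def __init__(self, name):
        self.name = name
        self.tables = {}  # table name -> dict of table properties

    def alter_table(self, table, **props):
        # Apply the property changes to this cluster only.
        self.tables.setdefault(table, {}).update(props)


def alter_table_synced(source, peers, table, sync=False, **props):
    """Alter a table on the source; with sync=True, repeat it on every peer."""
    source.alter_table(table, **props)
    if sync:
        for peer in peers:
            peer.alter_table(table, **props)
```

Without the flag the change stays local to the source cluster, which is the situation that can leave a peer with a stale table definition.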

Sync Security-Related Data

• Synchronize security-related HBase data across the clusters.

• Any update to the ACL, Quota or Visibility Labels table in the source cluster is reflected immediately in the peer clusters.

• A custom WAL entry filter is added to replication for this.

• Does not break security for HBase data access.
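
A rough sketch of what such a filter decides, assuming (as in stock HBase) that edits to system tables are otherwise skipped by replication. The WalEntry type and security_aware_filter function are hypothetical, and the table names hbase:acl, hbase:quota and hbase:labels are the stock HBase names for these tables, assumed here rather than taken from the slides.

```python
from typing import NamedTuple, Optional

# Assumed names of the system tables that hold security-related data.
SECURITY_TABLES = frozenset({"hbase:acl", "hbase:quota", "hbase:labels"})


class WalEntry(NamedTuple):
    """Hypothetical, simplified WAL entry: just a table name and a payload."""
    table: str
    payload: bytes


def security_aware_filter(entry: WalEntry) -> Optional[WalEntry]:
    """Return the entry to replicate it, or None to drop it."""
    if entry.table.startswith("hbase:"):
        # System namespace: let only the security tables through.
        return entry if entry.table in SECURITY_TABLES else None
    # User tables replicate as usual.
    return entry
```

The filter stays narrow on purpose: other system tables such as hbase:meta describe one cluster's own regions and would not make sense on a peer.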

Read Only Cluster

• Enable a cluster to serve only read requests.

• A coprocessor-based solution.

• The standby cluster serves all read requests.

• The standby cluster serves a write request only if it comes from:

  • a super user

  • one of a list of accepted IPs
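
The rules above amount to a small decision function. The sketch below models only that decision; Request and ReadOnlyPolicy are hypothetical names, and the real solution would run the check inside an HBase coprocessor rather than in Python.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address


@dataclass(frozen=True)
class Request:
    user: str
    client_ip: str
    is_write: bool


@dataclass
class ReadOnlyPolicy:
    """Hypothetical model of the read-only check, not the actual coprocessor."""
    cluster_state: str  # "active" or "standby"
    super_users: set = field(default_factory=set)
    accepted_ips: set = field(default_factory=set)

    def allows(self, req: Request) -> bool:
        # Reads are served on either cluster.
        if not req.is_write:
            return True
        # The active cluster accepts writes from any client.
        if self.cluster_state == "active":
            return True
        # On standby, a write passes only for a super user or an accepted IP.
        if req.user in self.super_users:
            return True
        return ip_address(req.client_ip) in self.accepted_ips


policy = ReadOnlyPolicy(
    cluster_state="standby",
    super_users={"hbase"},
    accepted_ips={ip_address("10.0.0.5")},
)
```

With this policy, a write from an ordinary user at 10.0.0.9 is rejected, while the same write from user hbase or from 10.0.0.5 is accepted.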

Cluster Recovery

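[Figure: cluster recovery. The two clusters swap roles: the former active cluster becomes the standby and the former standby becomes the active cluster, with /hbase/clusterState updated in each cluster's ZooKeeper. The clusters remain connected by replication.]

• Active cluster: serves read and write client requests.

• Standby cluster: serves only read client requests.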
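
The role swap during recovery can be sketched as a pure function over the two cluster states. The dictionary here is a hypothetical in-memory stand-in for the /hbase/clusterState value kept in each cluster's ZooKeeper; the fail_over name and the cluster names are illustrative only.

```python
def fail_over(states: dict) -> dict:
    """Swap the active and standby roles and return the new mapping."""
    swap = {"active": "standby", "standby": "active"}
    new_states = {cluster: swap[state] for cluster, state in states.items()}
    # Guard against ending up with two active or two standby clusters.
    if sorted(new_states.values()) != ["active", "standby"]:
        raise ValueError("expected exactly one active and one standby cluster")
    return new_states
```

Calling fail_over twice returns the original assignment, which corresponds to failing back once the original active cluster is healthy again.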

Miscellaneous

• Increased the default replication.source.ratio to 0.5

• Adaptive hbase.replication.rpc.timeout

• Active cluster HDFS server configurations are maintained in Standby cluster ZooKeeper for bulk loaded data replication.
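
To see what raising replication.source.ratio changes, assume it has its stock HBase meaning: the fraction of the peer cluster's region servers that a source may use as replication sinks, rounded up. The choose_sinks function below is a hypothetical sketch of that selection, not HBase code.

```python
import math
import random


def choose_sinks(peer_region_servers, ratio, rng=random):
    """Pick ceil(len(servers) * ratio) peer region servers at random as sinks."""
    servers = list(peer_region_servers)
    num_sinks = math.ceil(len(servers) * ratio)
    return rng.sample(servers, num_sinks)
```

For a peer cluster with 10 region servers, a ratio of 0.5 spreads shipped edits over 5 sinks, where a smaller ratio such as 0.1 would send them all to a single sink.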

Future Work

• Move HBase replication tracking from ZooKeeper to an HBase table (HBASE-15867)

• Copy bulk loaded data to peer with data locality

• Network bandwidth throttling for replication data

Thank You!

mailto: [email protected]: ashishsinghi89