Red Hat Storage Server Replication: Past, Present, & Future


DESCRIPTION

"In this session, we’ll detail Red Hat Storage Server data replication strategies for both near replication (LAN) and far replication (over WAN), and explain how replication has evolved over the last few years. You’ll learn about: Past mechanisms. Near replication (client-side replication). Far replication using timestamps (xtime). Present mechanisms. Near replication (server side) built using quorum and journaling. Faster far replication using journaling. Unified replication. Replication using snapshots. Stripe replication using erasure coding."

Transcript

RED HAT STORAGE SERVER REPLICATION: PAST AND PRESENT
Jeff Darcy, Venky Shankar, Raghavan Pichai
GlusterFS/RHS Developers @ Red Hat

Talk Outline

Background

Local replication

Remote replication

Next steps

Questions

Background: Types of replication, goals, and challenges

Synchronous Replication


+ High consistency

- Network sensitive

Quorum Enforcement

Replica #1, Replica #2, Replica #3

Majority can write; minority can't

There can only be one majority => no split brain
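A minimal sketch of the quorum rule, assuming three replicas and a simple reachability list; the names and the have_quorum/try_write helpers are illustrative, not the GlusterFS API:

# Minimal illustration of quorum enforcement across three replicas.
# Names and structure are illustrative, not the GlusterFS implementation.

REPLICAS = ["replica-1", "replica-2", "replica-3"]

def have_quorum(reachable):
    # A write is allowed only when a strict majority of replicas is reachable.
    return len(reachable) > len(REPLICAS) // 2

def try_write(reachable, key, value):
    if not have_quorum(reachable):
        # The minority side refuses writes, so two divergent copies can never
        # both be "latest": there is only one possible majority.
        raise IOError("quorum not met, refusing write")
    for replica in reachable:
        print("writing %s=%r to %s" % (key, value, replica))

# Majority (2 of 3) can write; minority (1 of 3) cannot.
try_write(["replica-1", "replica-2"], "x", 42)
try:
    try_write(["replica-3"], "x", 99)
except IOError as err:
    print("rejected:", err)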

Synchronous Replication Data Flows

(Diagram: two data flows from the client through the servers, chain and fan-out)

Fan Out Replication

(Diagram: the client sends the write directly to each server)

Split bandwidth

Wait for slowest

Chain Replication

(Diagram: the client sends the write to the first server, which forwards it to the second)

Full bandwidth

Two hops
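A toy sketch of the bandwidth/hop tradeoff between the two flows, under assumed link speeds (CLIENT_UPLINK and SERVER_LINK are made-up constants, not anything GlusterFS exposes):

# Toy comparison of fan-out vs. chain replication for one write of SIZE_MB bytes.
# Purely illustrative: real latency depends on the network, not these constants.

SIZE_MB = 100          # size of the write
CLIENT_UPLINK = 100    # client -> server bandwidth, MB/s (assumed)
SERVER_LINK = 1000     # server -> server bandwidth, MB/s (assumed)

def fan_out_time():
    # The client splits its uplink across both replicas and waits for the slowest.
    per_replica_bw = CLIENT_UPLINK / 2
    return SIZE_MB / per_replica_bw

def chain_time():
    # The client sends once at full uplink speed; the first server forwards the
    # write, adding a second hop over the (usually faster) server network.
    return SIZE_MB / CLIENT_UPLINK + SIZE_MB / SERVER_LINK

print("fan-out: %.1fs, chain: %.1fs" % (fan_out_time(), chain_time()))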

Asynchronous Replication


+ Network insensitive

- Low consistency

Effect of Network Partitions

(Diagram: a network partition leaves the two asynchronous replicas holding different values)

What’s the correct value?

Tradeoff Space

(Chart: synchronous replication (S) is high-consistency but network-sensitive; asynchronous replication (A) is network-insensitive but low-consistency)

Red Hat Storage: Synchronous Near-Replication
Raghavan P
Developer, Red Hat

Traditional replication using AFR

"Automatic File Replication"

Client-based replication

Entry, metadata, and data replication

Automated self-healing when bricks recover after failure

AFR Sequence Diagram

(Sequence diagram: Client 1 runs lock, pre-op, op, post-op, unlock against Server A and Server B; Client 2's lock request is blocked until Client 1 unlocks, after which its own pre-op begins)
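A simplified sketch of the transaction shown in the diagram: lock, pre-op (mark pending), the operation, post-op (clear pending), unlock. The Brick class and afr_write function are invented for illustration; the real AFR translator is C code inside GlusterFS:

# Simplified AFR-style write transaction against two bricks.
# Illustrative only; not the GlusterFS implementation.

import threading

class Brick:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.pending = 0          # "dirty" counter used to decide self-heal
        self.data = {}

def afr_write(bricks, key, value):
    # 1. Lock on every brick so concurrent writers are serialized
    #    (a second client's lock request blocks here, as in the diagram).
    for b in bricks:
        b.lock.acquire()
    try:
        # 2. Pre-op: mark the operation as pending on all bricks.
        for b in bricks:
            b.pending += 1
        # 3. The operation itself.
        for b in bricks:
            b.data[key] = value
        # 4. Post-op: clear pending counters on bricks that succeeded;
        #    a brick left with pending > 0 is healed when it comes back.
        for b in bricks:
            b.pending -= 1
    finally:
        # 5. Unlock everywhere.
        for b in bricks:
            b.lock.release()

bricks = [Brick("server-A"), Brick("server-B")]
afr_write(bricks, "file1", b"hello")
print([(b.name, b.data) for b in bricks])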

AFR improvements

In the 3.4 release: eager locking, piggybacking, server quorum

In the 3.5 release: granular self-heal

In the 3.6 release: rewrite of the code, pending counters, self-healing in the context of the self-heal daemon

NSR: new-style (a.k.a. server-side) replication

Replication happens on the back end (brick processes)

Controlled by a designated "leader", also known as the sweeper

Advantages:

Client network bandwidth usage is optimized for direct (FUSE) mounts

Avoidance of split brain

The sweeper is elected per term, using the majority principle

A per-term changelog on the sweeper preserves the ordering of operations

Variable consistency models allow trading consistency for performance

NSR high level blocks

NSR client-side translator: sends I/O to the sweeper

Sweeper (leader): forwards I/O to its peers; commits after all peers complete

Non-sweeper (follower): accepts I/O only from the sweeper or from reconciliation; rejects I/O arriving directly from clients (the client retries)

Changelog

Reconciliation: uses membership information to figure out which terms are missing, and uses the changelogs to sync the corresponding terms

NSR Sequence Diagram

(Sequence diagram: Client 1 and Client 2 send their requests to the sweeper, which forwards each operation to the follower and commits it once the follower completes)
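A rough sketch of the server-side flow above: clients send operations to the elected sweeper, which journals them per term, forwards them to its followers, and commits once every peer has completed. The Sweeper and Follower classes are purely illustrative, not GlusterFS code:

# Illustrative NSR-style flow: the leader ("sweeper") journals and forwards writes.

class Follower:
    def __init__(self, name):
        self.name = name
        self.store = {}

    def apply(self, source, term, op):
        # Followers accept operations only from the sweeper (or reconciliation);
        # anything arriving directly from a client is rejected and retried.
        if source != "sweeper":
            raise PermissionError("not the sweeper; client must retry")
        key, value = op
        self.store[key] = value
        return "ok"

class Sweeper:
    def __init__(self, followers):
        self.followers = followers
        self.term = 1             # a new term starts whenever a leader is elected
        self.changelog = []       # per-term changelog preserves operation order
        self.store = {}

    def write(self, key, value):
        op = (key, value)
        self.changelog.append((self.term, op))      # journal first
        acks = [f.apply("sweeper", self.term, op)   # then forward to peers
                for f in self.followers]
        if all(a == "ok" for a in acks):            # commit after all peers complete
            self.store[key] = value
            return "committed"
        return "failed"

sweeper = Sweeper([Follower("brick-2"), Follower("brick-3")])
print(sweeper.write("file1", b"data"))              # -> committed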

Red Hat Storage Server: Geo-Replication
Venky Shankar
Developer, Red Hat

Geo-Replication

Asynchronous data replication

Continuous, incremental

Across geographies

One site (master) to another (slave)

Multi-slave

Cascading

Fan-out

Disaster Recovery

Remote Replication: Past

Overview

Single node

Change detection: crawling (xtime-based crawl)

Data synchronization: rsync

Suboptimal processing of renames, deletes, and hardlinks

Crawling and xtime

xtime: inode change time, marked up to the root (marker xlator)

Crawling/scanning: directory crawl and file synchronization where xtime(master) > xtime(slave)

Slave xtime maintained by the master
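A sketch of the xtime-based crawl, using in-memory Node objects in place of real inodes: the marker xlator bumps xtime up to the root, so the crawler only descends into subtrees whose master xtime is newer than the slave's recorded xtime. The structure and names are invented for illustration, not gsyncd code:

# Illustrative xtime-driven crawl (the "past" geo-replication mechanism).

class Node:
    def __init__(self, name, xtime, children=None):
        self.name = name
        self.xtime = xtime            # time of the newest change in this subtree
        self.children = children or []

def crawl(master, slave_xtimes, sync):
    # Descend only where xtime(master) > xtime(slave).
    if master.xtime <= slave_xtimes.get(master.name, 0):
        return                         # subtree unchanged since the last sync
    for child in master.children:
        crawl(child, slave_xtimes, sync)
    if not master.children:
        sync(master)                   # a leaf (file) that needs re-syncing
    slave_xtimes[master.name] = master.xtime   # slave xtime is kept by the master

# The marker xlator bumps xtime on every ancestor up to the root, so a change
# under dir1 makes dir1 and "/" newer than the slave's recorded xtime.
root = Node("/", 200, [
    Node("dir1", 200, [Node("dir1/fileA", 200)]),
    Node("dir2", 90,  [Node("dir2/fileB", 90)]),
])
slave = {"/": 100, "dir1": 100, "dir1/fileA": 100, "dir2": 90, "dir2/fileB": 90}
crawl(root, slave, lambda n: print("rsync", n.name))   # only dir1/fileA is synced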

Remote Replication: Present

Overview

Multi-node

Distributed (parallel) synchronization

Replica failover

Change detection: consumable journals

Data synchronization (configurable): rsync, or tar+ssh (for large numbers of small files)

Efficient processing of renames, deletes, and hardlinks

Journaling

Journaling translator (changelog): records FOPs efficiently, local to each brick; covers data, entry, and metadata operations

Change detection: O(1) relative to the number of changes

Consumer library (libgfchangelog): per brick, publish/subscribe mechanism, journals periodically published
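A sketch of the publish/subscribe journal flow described above. The BrickChangelog class and its record/rollover/consume calls are invented for illustration; the real per-brick consumer library is libgfchangelog:

# Illustrative changelog publish/subscribe loop (API invented for illustration).

from collections import deque

class BrickChangelog:
    # Brick side: record FOPs into the current journal, publish it periodically.
    def __init__(self):
        self.current = []             # journal being written
        self.published = deque()      # journals ready for consumers

    def record(self, fop_class, path):
        self.current.append((fop_class, path))   # DATA / ENTRY / METADATA records

    def rollover(self):
        # Periodic publish: close the current journal and expose it to consumers.
        if self.current:
            self.published.append(self.current)
            self.current = []

def consume(changelog, replicate):
    # Only recorded changes are read; the volume is never crawled.
    while changelog.published:
        for fop_class, path in changelog.published.popleft():
            replicate(fop_class, path)

log = BrickChangelog()
log.record("ENTRY", "/docs/new-file")
log.record("DATA", "/docs/new-file")
log.rollover()
consume(log, lambda c, p: print("sync", c, p))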

Remote Replication: Future

Replicating snapshots

Multi-master: vector clocks, conflict detection & resolution

libgfapi integration

Geo-replication to a Swift target

Features

Red Hat Storage Server: Replication-related Features
Jeff Darcy
Developer, Red Hat

Unified Replication

(Diagram: a leader with its changelog replicates synchronously to a local replica and asynchronously to a remote replica, each of which keeps its own changelog)

Erasure Coding (a.k.a. “disperse”)

(Diagram: a file is striped into four data fragments D1-D4 plus three parity fragments P1-P3; a lost fragment such as D2 can be rebuilt from the surviving fragments)
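A much-simplified illustration of the idea using a single XOR parity fragment; the actual disperse translator uses Reed-Solomon-style coding so that several fragments (e.g. with a 4 data + 3 parity layout) can be lost and still recovered:

# Simplified erasure-coding illustration: split data into fragments plus one
# XOR parity fragment, then rebuild a lost fragment from the survivors.

from functools import reduce

def split(data, k):
    # Split data into k equal fragments (padded) plus one XOR parity fragment.
    frag_len = -(-len(data) // k)                 # ceiling division
    data = data.ljust(frag_len * k, b"\0")
    frags = [data[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), frags)
    return frags, parity

def rebuild(frags, parity, lost_index):
    # Recover the fragment at lost_index by XOR-ing parity with the survivors.
    survivors = [f for i, f in enumerate(frags) if i != lost_index]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                  survivors, parity)

frags, parity = split(b"hello erasure coding", 4)
assert rebuild(frags, parity, 1) == frags[1]      # D2 lost, then reconstructed
print("reconstructed:", rebuild(frags, parity, 1))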

Also…

Volume snapshots

File snapshots

Deduplication + compression

Checksums

Tiering (a.k.a. data classification)

Tier 0: SSD, no replication

Tier 1: normal disk, sync replication

Tier 2: SMR disk, erasure coding + compression + checksums, async replication
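A toy placement policy for the data-classification idea above; the tier thresholds and the place() helper are invented for illustration, not a GlusterFS policy engine:

# Toy data-classification policy matching the tiers above.
# Thresholds and attributes are illustrative only.

TIERS = [
    # (name, minimum accesses/day to qualify, storage / protection attributes)
    ("tier-0", 100, "SSD, no replication"),
    ("tier-1", 1,   "normal disk, sync replication"),
    ("tier-2", 0,   "SMR disk, erasure coding + compression + checksums, async replication"),
]

def place(accesses_per_day):
    # Pick the hottest tier whose activity threshold the file still meets.
    for name, threshold, attrs in TIERS:
        if accesses_per_day >= threshold:
            return "%s (%s)" % (name, attrs)
    return "tier-2"

for path, rate in [("/db/index", 500), ("/home/report.doc", 3), ("/archive/2013.tar", 0)]:
    print(path, "->", place(rate))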

Questions?
