Couchbase, Inc. Confidential Couchbase Durability, HA and DR Todd Greenstein
Jun 17, 2015
Ultra High Availability - Agenda
• Demo
• Architecture - Single Node Type 3.0
• Intra-Cluster Replicas (1-3)
• Graceful Failover and Node Recovery
• Cross Data Center Replication (XDCR)
• Backup and Restore
Single Node Architecture
Single Node Type, Version 3.0
Data Manager Cluster Manager
DCP: Agile Enterprise/Mission Critical Scalability - What is it?
• What is it? Couchbase 3.0 now uses DCP: Database Change Protocol.
• What does DCP do? DCP handles how a cluster rebalances vBuckets, how Views are updated, and how XDCR is performed. DCP now feeds View updates and XDCR directly from memory, rather than waiting for data to reach disk. This dramatically improves scalability and performance.
Single Node Type
• Documents are Hashed at the Smart Client
• Documents Enter into Memory on a Single Node
• Immediate/Async
• Intra-Cluster Replication Queue (option to wait for persist)
• Disk Queue
• View Queue
• XDCR Queue (if configured)
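The first step of the write path, hashing at the smart client, can be sketched as follows. This is a simplified illustration, not the client library's actual code: the plain CRC32-modulo hash, the round-robin cluster map, and the names `vbucket_for_key`, `build_cluster_map`, and `node_for_key` are assumptions made for the example.

```python
import zlib

NUM_VBUCKETS = 1024  # every Bucket is split into 1024 vBuckets


def vbucket_for_key(key):
    """Map a document key to a vBucket ID (simplified CRC32 hash, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_cluster_map(nodes):
    """Assign each vBucket an active node, round-robin (hypothetical layout)."""
    return [nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)]


def node_for_key(key, cluster_map):
    """Look up the single node that holds the active copy of this key."""
    return cluster_map[vbucket_for_key(key)]


nodes = ["server1", "server2", "server3", "server4"]
cluster_map = build_cluster_map(nodes)
target = node_for_key("user::1001", cluster_map)
print("user::1001 is written to", target)
```

Because the client computes the target itself, no routing tier sits between the application and the node that owns the document.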
Metadata Ejection/Tunable Memory - What is it?
• What is it? Metadata Ejection/Tunable Memory. Previous versions of Couchbase required ALL keys (metadata) to be resident in memory, all the time, regardless of working set size. In 3.0 the option exists at the bucket level to control whether all metadata should remain in memory.
Durability, HA and DR
Intra-Cluster Replication
Inter-Cluster Replication (XDCR)
Intra-Cluster Replication
• Referred to as "Replicas" in Couchbase. There is no Master/Slave. All nodes are equal.
• Configured at the individual Bucket level, from 1 to 3 Replicas. The default configuration is 1 Replica.
• Data is automatically sharded into 1024 vBuckets across all nodes in a cluster, regardless of cluster size. (Reminder: vBuckets are the storage containers for data within Buckets.)
• Each replica is a full copy of all 1024 vBuckets.
• Replica vBuckets are stored in memory and persisted to disk, in individual vBucket files. Strongly consistent.
• Replicas for each node's active vBuckets are guaranteed to be stored on a different node. With rack zone awareness, this goes a step further: replicas for one zone are guaranteed to reside in another zone.
Active and Replica vBuckets on an Individual Node in a 4 Node Cluster*
1. A Bucket is created with 1 Replica by default.
   • 256 Active vBuckets in Memory and 256 vBucket Files on Disk
   • 256 Replica vBuckets in Memory and 256 Replica vBucket Files on Disk
   • None of the Replicas on this Node are for Active vBuckets on this Node
2. Replica Count for Bucket Changed to 2
   • 256 Active vBuckets in Memory and 256 vBucket Files on Disk (active count stays the same)
   • 512 Replica vBuckets in Memory and now 512 Replica vBucket Files on Disk
*All nodes in a cluster function identically, and all Buckets have 1024 vBuckets.
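The per-node counts in this example follow from simple arithmetic, sketched below. The function name `per_node_counts` is made up for the illustration, and the sketch assumes vBuckets are spread evenly across the nodes.

```python
NUM_VBUCKETS = 1024  # fixed per Bucket, regardless of cluster size


def per_node_counts(num_nodes, num_replicas):
    """Active and replica vBuckets held by one node, assuming an even spread."""
    active = NUM_VBUCKETS // num_nodes
    replica = NUM_VBUCKETS * num_replicas // num_nodes
    return {"active": active, "replica": replica}


# 4 node cluster, default of 1 Replica
print(per_node_counts(4, 1))
# Same cluster after the replica count is changed to 2
print(per_node_counts(4, 2))
```

The active count depends only on the number of nodes, which is why it stays at 256 when the replica count changes.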
HA: Failing Over a Node in a 4 Node Cluster with Autofailover enabled
User Configured Replica Count = 1
[Diagram: App Server 1 and App Server 2, each with a Couchbase Client Library and Cluster Map, connected to a Couchbase Server Cluster of Servers 1-4, each holding Active and Replica docs]
• App servers accessing docs. For this example only a small subset of the documents in 1024 vBuckets are shown for clarity.
• Requests to Server 3 begin to fail
• The Cluster detects the failure and initiates a failover.
  – Promotes replicas of docs (vBuckets) to active from Memory. The replica vBuckets on Disk are also promoted to active.
  – Updates the cluster map, and the smart client is immediately aware.
• Requests for docs now go to the appropriate server
• Typically a rebalance would follow, but is not required.
• Rebalance is an online operation
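The failover sequence above can be sketched as a small simulation. This is an illustrative model only, not the Cluster Manager's implementation: the dictionary-based cluster map, the 8-vBucket toy cluster, and the names `build_cluster_map` and `fail_over` are assumptions for the example.

```python
def build_cluster_map(nodes, num_vbuckets):
    """Toy layout: each vBucket's single replica lives on the next node."""
    cluster_map = {}
    for vb in range(num_vbuckets):
        active = nodes[vb % len(nodes)]
        replica = nodes[(vb + 1) % len(nodes)]
        cluster_map[vb] = {"active": active, "replicas": [replica]}
    return cluster_map


def fail_over(cluster_map, failed_node):
    """Return a new cluster map with the failed node removed.

    vBuckets that were active on the failed node get a replica promoted;
    replicas that lived on the failed node are dropped until a rebalance.
    """
    new_map = {}
    for vb, entry in cluster_map.items():
        active = entry["active"]
        replicas = list(entry["replicas"])  # copy, so the old map is untouched
        if active == failed_node:
            if not replicas:
                raise RuntimeError("vBucket %d has no replica to promote" % vb)
            active = replicas.pop(0)  # promote the first replica to active
        else:
            replicas = [r for r in replicas if r != failed_node]
        new_map[vb] = {"active": active, "replicas": replicas}
    return new_map


nodes = ["server1", "server2", "server3", "server4"]
before = build_cluster_map(nodes, num_vbuckets=8)  # 8 instead of 1024 for brevity
after = fail_over(before, "server3")
print("vBucket 2 active:", before[2]["active"], "->", after[2]["active"])
```

After the failover some vBuckets have no replica left, which is why a rebalance typically follows even though it is not required.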
Inter-Cluster Replication (XDCR)
• Referred to as Cross Data Center Replication.
• Configured at the individual Bucket level.
• Uni-directional and bi-directional replication are both supported.
• Source and Destination Clusters don't need to match.
• In 3.0 XDCR happens immediately. 32 streams per Bucket are utilized to perform the replication. XDCR is eventually consistent. SSL/Encryption is supported.
• If the replication stream fails, the service resumes once connectivity is restored. Replication resumes from the last known "consistent" checkpoint. In 3.0 replication is pausable.
• Conflict Resolution, Bi-directional Replication:
  • The document with the most updates wins.
  • If both copies have the same number of updates, metadata is compared:
    • Numerical sequence (incremented on each mutation)
    • CAS value
    • Document flags
    • Expiration (TTL) value
  • XDCR ensures that the same winner is on both sides of the replication.
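The conflict resolution order can be sketched as an ordered comparison of metadata fields. This is an illustration under stated assumptions: the `DocMeta` class and `resolve` function are hypothetical, the numerical sequence is treated as the update count, and the larger value is assumed to win at each step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocMeta:
    rev_seqno: int   # numerical sequence, incremented on each mutation
    cas: int         # CAS value
    flags: int       # document flags
    expiration: int  # expiration (TTL) value


def resolve(source, destination):
    """Pick the winning copy by comparing metadata fields in order.

    Tuples compare element by element, so a later field is consulted
    only when every earlier field is equal.
    """
    def rank(meta):
        return (meta.rev_seqno, meta.cas, meta.flags, meta.expiration)

    return source if rank(source) > rank(destination) else destination


a = DocMeta(rev_seqno=3, cas=100, flags=0, expiration=0)
b = DocMeta(rev_seqno=2, cas=999, flags=0, expiration=0)
c = DocMeta(rev_seqno=3, cas=101, flags=0, expiration=0)

print(resolve(a, b))  # a has more updates, so CAS is never consulted
print(resolve(a, c))  # same update count, so the CAS value decides
```

Because the comparison is deterministic and uses only metadata carried with each copy, both clusters reach the same verdict independently: `resolve(a, b)` and `resolve(b, a)` return the same document.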
XDCR Data Flow
[Diagram: data flowing from the Source cluster to the Destination cluster]
Backup and Restore
Version 3.0 Backup Strategy
In Version 3.0 Couchbase improves on an already robust backup technology with the addition of incremental backups:
• More options for backup strategies
• Greater flexibility in the restoration process
• Reduces the amount of time needed for daily backups
• Reduces the amount of disk storage needed for backups
• Reduces bandwidth usage when backing up over a network

Incremental backups can be performed using either:
• Differential incremental backup
• Cumulative incremental backup
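The difference between the two incremental modes can be sketched as follows. The sketch assumes the usual definitions: a differential backup holds the changes since the last backup of any type, while a cumulative backup holds all changes since the last full backup. The function `incremental_backup` and the sequence-numbered mutation list are hypothetical and do not represent the backup tool's interface.

```python
def incremental_backup(mutations, history, mode):
    """Return the mutations an incremental backup would capture.

    mutations: list of {"seqno": int, "key": str}
    history:   earlier backups as {"kind": str, "seqno": int}
    mode:      "differential" or "cumulative"
    """
    if mode == "differential":
        candidates = history  # build on the last backup of any type
    elif mode == "cumulative":
        candidates = [b for b in history if b["kind"] == "full"]
    else:
        raise ValueError("unknown mode: %s" % mode)
    if not candidates:
        raise ValueError("an incremental backup needs an earlier backup to build on")
    start = max(b["seqno"] for b in candidates)
    return [m for m in mutations if m["seqno"] > start]


mutations = [{"seqno": n, "key": "doc%d" % n} for n in range(1, 11)]
history = [
    {"kind": "full", "seqno": 4},          # full backup taken after mutation 4
    {"kind": "differential", "seqno": 7},  # differential taken after mutation 7
]

diff = [m["seqno"] for m in incremental_backup(mutations, history, "differential")]
cumu = [m["seqno"] for m in incremental_backup(mutations, history, "cumulative")]
print("differential captures:", diff)
print("cumulative captures:  ", cumu)
```

Differential backups are smaller, but a restore needs the full backup plus every differential taken since. A cumulative backup is larger, but a restore needs only the full backup and the latest cumulative.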
Incremental Backup