Bootstrap From Backups Reducing cluster load while adding capacity #CassandraSummit instaclustr.com
Transcript
Page 1: Cassandra Bootstrap from Backups

Bootstrap From Backups

Reducing cluster load while adding capacity

#CassandraSummit

instaclustr.com

Page 2: Cassandra Bootstrap from Backups

Who am I and what do I do?

• Ben Bromhead

• Co-founder and CTO of Instaclustr -> www.instaclustr.com

• Instaclustr provides Cassandra-as-a-Service in the cloud.

• Currently in AWS, Google Cloud in private beta with more to come.

• We currently manage 50+ nodes for various customers, who do various things with them.

Page 3: Cassandra Bootstrap from Backups

Cassandra and Scaling

• Premise: We have an existing cluster and we need either more storage / better performance / higher availability.

• Normally this is fairly awesome. Most people do the following:

• Set seed nodes, start Cassandra.

• Node joins the ring and takes responsibility for some portion of the ring.

• Commence the bootstrap process. The joining node streams data from other nodes for the range, builds indexes etc.

• Specifically the node receives streamed SSTables that contain rows within the range that it is now responsible for (the data component)
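To make the range hand-over concrete, here is a minimal Python sketch of which range a joining node takes and which existing node it takes it from. It assumes a simplified ring with one token per node (no vnodes, no replication), and the name `range_taken_by` is hypothetical, not a Cassandra API.

```python
from bisect import bisect_left
from typing import List, Tuple


def range_taken_by(ring: List[Tuple[int, str]], new_token: int) -> Tuple[int, int, str]:
    """Return (start_exclusive, end_inclusive, previous_owner) for a joining node.

    `ring` is a list of (token, node_name) pairs sorted by token. A node with
    token T is treated as the primary for the range (predecessor_token, T].
    Assumes `new_token` is not already present in the ring.
    """
    tokens = [token for token, _ in ring]
    i = bisect_left(tokens, new_token)
    # The first node clockwise from the new token used to own this range.
    previous_owner = ring[i % len(ring)][1]
    # The range starts at the predecessor token; index -1 wraps around the ring.
    start = ring[i - 1][0]
    return start, new_token, previous_owner


ring = [(0, "node1"), (100, "node2"), (200, "node3")]
print(range_taken_by(ring, 150))
```

In this toy ring, a node joining with token 150 takes over part of what node3 used to own, so node3 is the node that would stream the matching SSTable data during a regular bootstrap.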

Page 4: Cassandra Bootstrap from Backups

Not perfect, but getting better

• Joining node can violate consistency due to range movements - Somewhat fixed in 2.1 - See CASSANDRA-2434

• Adding a replacement node with the same address/range ownership is a different workflow. replace_address workflow is still tricky for some people. - See CASSANDRA-7356

• Adding nodes to a cluster with multiple racks can also be tricky and prone to creating hotspots. This is mainly an operational issue.

Page 5: Cassandra Bootstrap from Backups

A wild “fundamental issue” appears…

• Joining nodes add additional load on the existing nodes in the cluster.

• Joining nodes stream data from existing nodes (the node that used to be the primary for the range that is moving).

• Takes up valuable bandwidth and I/O

• Key requirement: As a managed Cassandra service, we need to make all our operations as side-effect free as possible.

• Key requirement: Our customers don’t want to worry about operation specific details.

Page 6: Cassandra Bootstrap from Backups

How do we prevent this?

Page 7: Cassandra Bootstrap from Backups

Solutions, part 1

Make sure your nodes never get stressed.

• Capacity planning (OpsCenter has some good tools). Traffic prediction.

Page 8: Cassandra Bootstrap from Backups

Solutions, part 2

Make sure your nodes never get stressed

• Over provision.

Page 9: Cassandra Bootstrap from Backups

Solutions, part 3

Make sure your nodes never get stressed.

• Ensure your startup / app / project / whatever never goes viral or gets featured in national media.

Page 10: Cassandra Bootstrap from Backups

Solutions, part 4

If your nodes are already stressed, very hard to add capacity.

• Batten down the hatches and wait for a quiet time?

Page 11: Cassandra Bootstrap from Backups

Solutions, part 5

If your nodes are already stressed, very hard to add capacity.

• You are a Cassandra wizard.

Page 12: Cassandra Bootstrap from Backups

Solutions, part 6

If your nodes are already stressed, very hard to add capacity.

• Rebuild from another DC.

• Add the node with auto_bootstrap: false in cassandra.yaml, then run nodetool rebuild -- OTHER_DC

Page 13: Cassandra Bootstrap from Backups

• All these solutions have various strengths and weaknesses.

• They have side-effects or are relatively costly.

• Still need to address:

• Key requirement: As a managed Cassandra service, we need to make all our operations as side-effect free as possible.

• Key requirement: Our customers don’t want to worry about operation specific details.

Page 14: Cassandra Bootstrap from Backups

Bootstrap from Backups!

• SSTables are immutable.

• SSTables are also the base unit of data that nodes stream to each other.

• SSTables are what we back up.

• How about we stream the SSTables from the backup location instead of the live node?

Page 15: Cassandra Bootstrap from Backups
Page 16: Cassandra Bootstrap from Backups

• Define an arbitrary command that streams the sstable to stdout.

• Cassandra will substitute some values (broadcast address and filename) into the command to help identify which sstable to fetch.

• e.g. cat /mnt/some-nfs-mount/%source/%filename

• Cassandra will run the command in a separate process and read the sstable from the process's stdout stream.

• If the process fails, the node streams the sstable using the current streaming process. This becomes a performance optimisation rather than a replacement streaming mechanism.
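The steps above can be sketched in a few lines of Python. This is an illustration of the idea only, not the Cassandra patch itself: the function names `build_argv` and `fetch_sstable`, and the `fallback` callback standing in for the regular streaming path, are hypothetical. Only the `%source` and `%filename` placeholders come from the slide. For brevity the sketch buffers the whole command output in memory instead of consuming it as a stream.

```python
import shlex
import subprocess
from typing import Callable, List


def build_argv(template: str, source: str, filename: str) -> List[str]:
    """Split the command template into arguments, then fill in the placeholders.

    Splitting first means the substituted values never need shell quoting.
    """
    return [
        token.replace("%source", source).replace("%filename", filename)
        for token in shlex.split(template)
    ]


def fetch_sstable(
    template: str,
    source: str,
    filename: str,
    fallback: Callable[[str, str], bytes],
) -> bytes:
    """Run the configured command and return the sstable bytes from its stdout.

    If the command cannot be started or exits with a non-zero status, use
    `fallback`, which stands in for streaming from the live node.
    """
    argv = build_argv(template, source, filename)
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        return result.stdout
    except (OSError, subprocess.CalledProcessError):
        return fallback(source, filename)
```

With the template from the slide, `build_argv("cat /mnt/some-nfs-mount/%source/%filename", "10.0.0.1", "data.db")` yields `["cat", "/mnt/some-nfs-mount/10.0.0.1/data.db"]` (the address and filename here are made-up values). Because any failure falls back to the normal path, the backup fetch behaves as an optimisation rather than a hard dependency.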

Page 17: Cassandra Bootstrap from Backups

[Diagram: Normal Bootstrap procedure. Labels: nodes 1, 2 and 3; new node; SSTable1]

Page 18: Cassandra Bootstrap from Backups

[Diagram: Normal Bootstrap procedure. Labels: nodes 1, 2 and 3; new node; SSTable1]

Page 19: Cassandra Bootstrap from Backups

[Diagram: Normal Cluster with backups. Labels: nodes 1, 2 and 3; SSTable1; NAS/S3 holding SSTable1, SSTable2, SSTableN]

Page 20: Cassandra Bootstrap from Backups

[Diagram: Bootstrap from backup. Labels: nodes 1, 2 and 3; new node; SSTable1; NAS/S3 holding SSTable1, SSTable2, SSTableN]

Page 21: Cassandra Bootstrap from Backups

[Diagram: Bootstrap from backup. Labels: nodes 1, 2 and 3; new node; SSTable1; NAS/S3 holding SSTable1, SSTable2, SSTableN]

Page 22: Cassandra Bootstrap from Backups

[Diagram: Bootstrap from backup - Catch up. Labels: nodes 1, 2 and 3; new node; SSTable1; NAS/S3 holding SSTable1, SSTable2, SSTableN]

Page 23: Cassandra Bootstrap from Backups

How does it look in real life?

Page 24: Cassandra Bootstrap from Backups

This is your cluster on regular bootstrap

[Chart: Operations (y-axis, 0 to 30000) against Minutes (x-axis, 0 to 60)]

Page 25: Cassandra Bootstrap from Backups

This is your cluster on bootstrap from backups

[Chart: Operations (y-axis, 0 to 30000) against Minutes (x-axis, 0 to 60)]

Page 26: Cassandra Bootstrap from Backups

Why does this matter

• Mostly side-effect free bootstrapping.

• Explore reactive scaling rather than predictive.

• Makes your cluster more cost effective to run.

Page 27: Cassandra Bootstrap from Backups

When can I use this!?

• Not right now, haven’t even submitted as a patch to the C* project (we will).

• Currently running in beta with a select few of our customers.

• Not too sure how much of a good idea it is to use stdout as the stream mechanism. So far so good?

• Will probably need a refactor of the StreamMessage workflow… currently bootstrap from backups is a hack that doesn't fit the current model.

Page 28: Cassandra Bootstrap from Backups

Questions