Page 1: Cassandra Applications Benchmarking

CASSANDRA & BENCHMARKING

A holistic perspective

Page 2: Cassandra Applications Benchmarking

Agenda

1. This presentation covers performance benchmarks for Cassandra-based systems

2. Discuss benchmarking in general

3. Define an approach

4. Explore gotchas and things to look out for

5. Hear from you! (Prizes for best benchmarking stories)

Page 3: Cassandra Applications Benchmarking

Benchmarking

• Benchmark testing is the process of load testing a component or an entire end-to-end IT system to determine the performance characteristics of the application.

Page 4: Cassandra Applications Benchmarking

Benchmarking Properties

• Should be repeatable
• Should capture performance measurements from successive runs
• Ideally there should be low variance between successive tests
• Should highlight improvements or degradations caused by system changes

Page 5: Cassandra Applications Benchmarking

Modern Systems

• More often than not distributed
• Many different types of system components
• Complex performance constraints
• What is easily measured? Network, CPU, memory, I/O utilisation
• More difficult: technology-specific factors, e.g. Cassandra – impact of compaction, read performance

Page 6: Cassandra Applications Benchmarking

Justification for Benchmarking

• Simple:
  • Will the system keep performing as the number of users grows?
• Complex:
  • Cost Reduction
  • Optimisation
  • Growth Projection
  • TCO

Page 7: Cassandra Applications Benchmarking

APPROACH

Page 8: Cassandra Applications Benchmarking

Caveats

• The more information you have, the better…
• Any investment in systemic testing is generally a good investment
• Simplify the goals/outcomes for the business
• Automate as much as possible and formalise the test procedure to ensure adherence to quality measures.
• Be as interested in percentiles as in mean values (see the sketch below)
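
To make the last point concrete, here is a minimal sketch (pure Python standard library; the latency values are made up for illustration) of how a mean can hide the long tail that percentiles expose:

    import statistics

    # Hypothetical response times in ms: mostly fast, with a slow tail.
    latencies = [12, 14, 15, 13, 16, 14, 15, 480, 520, 15]

    mean = statistics.mean(latencies)
    p95 = statistics.quantiles(latencies, n=100)[94]  # 95th percentile
    # The mean (~111 ms) looks tolerable; the p95 (~538 ms) exposes the tail.
    print(f"mean={mean:.1f}ms p95={p95:.1f}ms")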

Page 9: Cassandra Applications Benchmarking

Requirements

• Discover resource constraints

• Discover modes of failure

• Guarantee operation outside of usual parameters

• Ensure SLAs are being met

• Ensure operation over longer periods is consistent.

Page 10: Cassandra Applications Benchmarking

Basic Approach

• Distinguish component benchmark from system benchmark.

• Component benchmarks are important; they define a basic SLA for inter-component operations.
• A system is the sum of all its parts, not just each component: component performance does not imply system performance.

• Take corrective action from the bottom up (network, hardware, compute resources) as well as from the top down (API design, data access patterns).

Page 11: Cassandra Applications Benchmarking

Holistic Approach

• The system exists to service business requirements; work backwards from them.
• Define our benchmark from the user's perspective.
• Technical goals and business goals must align.
• The system must function in its entirety; it is not sufficient to performance test each component in isolation.

Page 12: Cassandra Applications Benchmarking

1. Define a Basic Traffic Model

• Example - Simple Storefront (see the sketch below)
  • GET /product/list (50%)
  • GET /product/{id} (20%)
  • POST /product/{id}/order (20%)
  • GET /orders/list (10%)
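
As an illustration (not from the deck), the traffic model above can be expressed as a weighted distribution that a load driver samples on every iteration. The endpoint names and shares come from this slide; everything else is an assumption.

    import random

    # Page 12 traffic model: endpoint -> share of all requests.
    TRAFFIC_MODEL = {
        "GET /product/list":        0.50,
        "GET /product/{id}":        0.20,
        "POST /product/{id}/order": 0.20,
        "GET /orders/list":         0.10,
    }

    def next_request(model):
        """Sample the next request to issue, weighted by the model."""
        return random.choices(list(model), weights=list(model.values()))[0]

    # A real driver would issue the sampled request; here we just print five.
    for _ in range(5):
        print(next_request(TRAFFIC_MODEL))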

Page 13: Cassandra Applications Benchmarking

2. Define a User Profile

• User Type 1 - browse heavy
  • GET /product/list (70%)
  • GET /product/{id} (20%)
  • POST /product/{id}/order (5%)
  • GET /orders/list (5%)
• User Type 2 - compulsive buyers
  • GET /product/list (30%)
  • GET /product/{id} (20%)
  • POST /product/{id}/order (30%)
  • GET /orders/list (20%)
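
Continuing the sketch from the previous slide, each user type is simply its own weighted model (the shares are from this slide; the type names are mine):

    # Per-user-type traffic models, same shape as TRAFFIC_MODEL above.
    USER_PROFILES = {
        "browse_heavy": {          # User Type 1
            "GET /product/list":        0.70,
            "GET /product/{id}":        0.20,
            "POST /product/{id}/order": 0.05,
            "GET /orders/list":         0.05,
        },
        "compulsive_buyer": {      # User Type 2
            "GET /product/list":        0.30,
            "GET /product/{id}":        0.20,
            "POST /product/{id}/order": 0.30,
            "GET /orders/list":         0.20,
        },
    }

    # A simulated user of a given type samples from its own model:
    # next_request(USER_PROFILES["browse_heavy"])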

Page 14: Cassandra Applications Benchmarking

Peak Periods?

• Adding an hourly activity profile allows for a more useful benchmark.
• Can be expressed as an active user count.
• Very simple to assign a probability to the number of each type of user on the system at that time, e.g. 20% type 1, 80% type 2.
• The ideal circumstance is to use real data for these models if any is available.
• Distributed load drivers coordinate to meet the hourly user count (see the sketch below).
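
Putting the two previous slides together, a hedged sketch: an hourly active-user curve plus a per-hour type mix tells the load drivers how many simulated users of each type to run. The numbers below are illustrative, not from the deck.

    import random

    # Hour of day -> target active users (use real data where available).
    HOURLY_ACTIVE_USERS = {0: 500, 6: 2000, 12: 9000, 18: 16000, 22: 4000}

    # Probability of each user type being on the system (the 20%/80% example).
    USER_TYPE_MIX = {"browse_heavy": 0.2, "compulsive_buyer": 0.8}

    def spawn_plan(hour):
        """How many simulated users of each type to run for this hour."""
        total = HOURLY_ACTIVE_USERS.get(hour, 1000)
        drawn = random.choices(
            list(USER_TYPE_MIX), weights=list(USER_TYPE_MIX.values()), k=total)
        return {t: drawn.count(t) for t in USER_TYPE_MIX}

    print(spawn_plan(18))  # e.g. {'browse_heavy': 3223, 'compulsive_buyer': 12777}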

Page 15: Cassandra Applications Benchmarking

Peak Periods?

[Chart: "Active Users" (0 to 16,000) by "Hour" of day (0 to 22)]

Page 16: Cassandra Applications Benchmarking

Tooling

• JMeter

• The Grinder

• Jolokia (JMX)

• Logstash / Statsd

• Codahale Metrics

• Graphite (Visualisation)

• iostat / dstat, iftop, netstat, htop, etc.

• cassandra-stress (useful for a basic sanity check)
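
Most of these tools speak very simple protocols. For instance, statsd accepts plaintext metrics over UDP, so any tier can emit timings cheaply; a minimal sketch (host, port, and the metric name are assumptions; 8125 is the conventional statsd port):

    import socket
    import time

    def record_timing(name, ms, host="127.0.0.1", port=8125):
        """Send a statsd timing metric, formatted '<name>:<ms>|ms', over UDP."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(f"{name}:{ms:.1f}|ms".encode(), (host, port))

    start = time.time()
    # ... issue a request against the system under test ...
    record_timing("storefront.product_list", (time.time() - start) * 1000.0)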

Page 17: Cassandra Applications Benchmarking

CASSANDRA Specifics

Page 18: Cassandra Applications Benchmarking

Considerations

• Cassandra’s append-only writes mean writes are consistently fast given sufficient resources
• Compaction has a different impact depending on the strategy you use (STCS is lighter than LCS)
• Pending compactions tend to back up more during load-oriented testing
• Reads have a significant impact depending on:
  • Spread of column mutations across SSTables
  • Compaction strategy (STCS is less efficient than LCS for the above)
  • No. of reads for the same row key (whether we are exercising the key cache or not)
  • Our consistency level (the same applies to writes)

Page 19: Cassandra Applications Benchmarking

Common Issues

• Poor query design (unbounded queries, abuse of ALLOW FILTERING) and other anti-patterns (see the query sketch below).
• Poor capacity planning: disk, memory, CPU, etc.
• Many failed requests on coordinators may lead to resources being over-used for hinted handoff.
• If a node is memory constrained you may get JVM pauses due to garbage collection.
• Poor network connectivity and incorrect consistency levels may lead to more timeouts.
• It is possible to have hotspots in Cassandra if you have not modelled keys correctly.
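
To illustrate the first point, a hedged sketch using the DataStax Python driver (the keyspace, table, and column names are hypothetical): an unbounded ALLOW FILTERING scan versus a query bounded by its partition key, with an explicit consistency level and result limit.

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(["127.0.0.1"])    # contact point: assumption
    session = cluster.connect("store")  # keyspace 'store': hypothetical

    # Anti-pattern: full-table scan, filtered server-side across every node.
    bad = SimpleStatement(
        "SELECT * FROM orders WHERE status = 'open' ALLOW FILTERING")

    # Better: hit a single partition via the partition key, bound the result
    # set, and make the consistency level explicit.
    good = SimpleStatement(
        "SELECT * FROM orders WHERE user_id = %s LIMIT 100",
        consistency_level=ConsistencyLevel.LOCAL_QUORUM)
    rows = session.execute(good, ("user-42",))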

Page 20: Cassandra Applications Benchmarking

What to collect during test?

• Read / Write latency per CF (nodetool cfstats)

• No. of reads / writes (nodetool cfstats)

• No. of pending compactions

• Thread pool usage, especially pending tasks (nodetool tpstats)

• Correlate with:
  • Disk I/O
  • CPU
  • Memory usage

• Visualise as much as possible and use overlays for correlation (a collection sketch follows).
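
As a sketch of periodic collection during a run (the node address and interval are assumptions; nodetool compactionstats is a standard command whose output includes a "pending tasks" line):

    import re
    import subprocess
    import time

    def pending_compactions(host="127.0.0.1"):
        """Parse 'pending tasks: N' out of nodetool compactionstats."""
        out = subprocess.check_output(
            ["nodetool", "-h", host, "compactionstats"], text=True)
        match = re.search(r"pending tasks:\s*(\d+)", out)
        return int(match.group(1)) if match else 0

    # Sample once a minute for the duration of the test, timestamping each point.
    while True:
        print(int(time.time()), pending_compactions())
        time.sleep(60)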

Page 21: Cassandra Applications Benchmarking

Points to Remember

• Latency reported by Cassandra is internal, so it is only useful for telling whether Cassandra I/O is performing adequately. Graph it to get the most value, or use OpsCenter.
• Add metrics at every tier in your system; make sure it is possible to correlate the above numbers with latency in other parts of the system.
• Soak testing is critical with Cassandra, as empty-system performance may be very different once disk utilisation and compaction requirements grow.
• Experiment with settings for easy gains. Some CFs may benefit from the row cache.

Page 22: Cassandra Applications Benchmarking

YOUR STORIES

Best two stories get books from O'Reilly