1 million writes per sec. on 60 nodes with Cassandra and EBS

Feb 13, 2017

Jim Plush
Transcript
Page 1: 1 Million Writes per second on 60 nodes with Cassandra and EBS

1 million writes per sec. on 60 nodes with Cassandra and EBS

Page 2: 1 Million Writes per second on 60 nodes with Cassandra and EBS

© 2015. All Rights Reserved.

1 Million Writes Per Second w/60 nodes.

EBS and C*

Jim Plush - Sr Director of Engineering, CrowdStrike

Dennis Opacki - Sr Cloud Systems Architect

Page 3: 1 Million Writes per second on 60 nodes with Cassandra and EBS

An Introduction to CrowdStrike

We Are a Cybersecurity Technology Company

We Detect, Prevent And Respond To All Attack Types In Real Time, Protecting Organizations From Catastrophic Breaches

We Provide Next Generation Endpoint Protection, Threat Intelligence & Pre & Post IR Services

NEXT-GEN ENDPOINT

INCIDENT RESPONSE

THREAT INTEL

http://www.crowdstrike.com/introduction-to-crowdstrike-falcon-host/

Page 4: 1 Million Writes per second on 60 nodes with Cassandra and EBS

CrowdStrike Scale

•  Cloud-based endpoint protection

•  A single customer can generate > 2TB daily

•  500K+ events per second

•  Multiple petabytes of managed data

Page 5: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Truisms???

•  HTTPS is too slow to run everywhere

•  All you need is anti-virus

•  Never run Cassandra on EBS

Page 6: 1 Million Writes per second on 60 nodes with Cassandra and EBS

What is EBS?

[Diagram: an EC2 instance with two EBS data volumes attached, mounted at /mnt/foo and /mnt/bar]

•  Network Mounted Hard Drive

•  Ability to snapshot data

•  Data encryption at rest & in flight

Page 7: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Existing EBS Assumptions

•  Jittery I/O aka: Noisy neighbors

•  Single Point of Failure in a Region

•  Cost is too damn high

•  Bad Volumes (dd and destroy)

Page 8: 1 Million Writes per second on 60 nodes with Cassandra and EBS

A recent project: initial requirements

•  1PB of incoming event data from millions of devices

•  Modeled as a graph

•  1 million writes per second (burst)

•  Age data out after x days

•  95% write 5% read

Page 9: 1 Million Writes per second on 60 nodes with Cassandra and EBS

We Tried

•  Cassandra + Titan

•  Sharding?

•  Neo4J

•  PostgreSQL, MySQL, SQLite

•  LevelDB/RocksDB

Page 10: 1 Million Writes per second on 60 nodes with Cassandra and EBS

We have to make this work

•  Cassandra had the properties we needed

•  Time for a new approach?

http://techblog.netflix.com/2014/07/revisiting-1-million-writes-per-second.html

Page 11: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Number of Machines for 1PB

[Bar chart: number of machines needed for 1PB, comparing i2.xlarge with c4.2XL on EBS; y-axis from 0 to 2,250]

Page 12: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Yearly Cost for 1PB Cluster

[Bar chart: yearly cost of a 1PB cluster in millions of $; bars for i2.xlarge on demand, i2.xlarge reserved, c4.2xl on demand, and c4.2xl reserved; annotation: "With EBS"; y-axis from 0 to 16]

Page 13: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Initial Launch

Date Tiered Compaction

…more details by Jeff Jirsa, CrowdStrike

Cassandra Summit 2015 - DTCS

Page 14: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Initial Launch

•  Cassandra 2.0.12 (DSE)

•  m3.2xlarge 8 core

•  Single 4TB EBS GP2 ~10,000 IOPS

•  Default tunings

Page 15: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Performance was terrible

•  12 node cluster

•  ~60K writes per second RF2

•  ~10K writes per 8 core box

•  We went to the experts

Page 16: 1 Million Writes per second on 60 nodes with Cassandra and EBS

At Cassandra Summit 2014, FamilySearch asked the same question: Where’s the bottleneck?

https://www.youtube.com/watch?v=Qfzg7gcSK-g

Page 17: 1 Million Writes per second on 60 nodes with Cassandra and EBS

IOPS Available

[Bar chart: IOPS available, i2.xlarge vs. c4.2xlarge; y-axis from 0 to 50,000]

Page 18: 1 Million Writes per second on 60 nodes with Cassandra and EBS

1.3K IOPS?

Page 19: 1 Million Writes per second on 60 nodes with Cassandra and EBS

IOPS I see you there, but I can’t reach you!

Page 20: 1 Million Writes per second on 60 nodes with Cassandra and EBS
Page 21: 1 Million Writes per second on 60 nodes with Cassandra and EBS

The magic gates opened…

We hit 1 million writes per second RF3 on 60 nodes

Page 22: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Testing Setup!

Page 23: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Testing Methodology

•  Each test run
   –  clean C* instances
   –  old test keyspaces dropped

•  13+TBs of data loaded during read testing

•  20 C4.4XL Stress Writers, each with their own 1 billion key sequence

Page 24: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Cluster Topology

[Diagram: cluster topology]

•  Stress nodes: 10 instances in AZ 1A, 10 instances in AZ 1B

•  C* nodes: 20 in AZ 1A, 20 in AZ 1B, 20 in AZ 1C

•  OpsCenter

Page 25: 1 Million Writes per second on 60 nodes with Cassandra and EBS

EBS

Page 26: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Cassandra Stress 2.1.x

bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops\(insert=1\) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=1000 -errors ignore

Page 27: 1 Million Writes per second on 60 nodes with Cassandra and EBS

PCSTAT - Al Tobey

http://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-21

https://github.com/tobert/pcstat

Page 28: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test - What is C* capable of?

Page 29: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test

1+ Million Writes Per second RF:3

3+ Million Local Writes Per second

NICE!
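The gap between the two figures on this slide is replication: every client write lands on RF replicas. The sketch below works that arithmetic through, assuming writes spread evenly across nodes. The function name is illustrative, and the "before" numbers (12 nodes, RF2, 8 cores) and "after" core count (16) come from the Initial Launch and Major Tweaks slides.

```python
def local_writes_per_node(client_writes_per_sec, replication_factor, nodes):
    """Replica (local) writes each node absorbs, assuming an even spread."""
    total_local = client_writes_per_sec * replication_factor
    return total_local / nodes

# Initial launch: ~60K client writes/s at RF2 on 12 m3.2xlarge (8 cores each)
before = local_writes_per_node(60_000, 2, 12)
# Tuned cluster: 1M client writes/s at RF3 on 60 C4.4XL (16 cores each)
after = local_writes_per_node(1_000_000, 3, 60)

print(f"before: {before:,.0f} local writes/s per node, {before / 8:,.0f} per core")
print(f"after:  {after:,.0f} local writes/s per node, {after / 16:,.0f} per core")
```

Under these assumptions the initial cluster works out to the "~10K writes per 8 core box" quoted earlier, and the tuned cluster to 50K local writes per node.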

Page 30: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test

Page 31: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test

No Dropped Mutations, system healthy at 1.1M after 50 mins

Page 32: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test

I/O Util is not pegged

Commit Disk = Steady!

Page 33: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test

Low IO Wait

Page 34: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test

95th Latency = Reasonable

Page 35: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test - Read Fail

compression={'chunk_length_kb': '64', 'sstable_compression': 'LZ4Compressor'}

https://issues.apache.org/jira/browse/CASSANDRA-10249

https://issues.apache.org/jira/browse/CASSANDRA-8894

Data Drive Pegged :(
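One way to see why chunk size can matter on the read path is to bound how many chunk-sized reads a volume can serve. The sketch below is illustrative arithmetic only, not a measurement from this test. It assumes every read that misses the page cache pulls one full chunk of `chunk_length_kb` from disk (compressed chunks are usually smaller on disk, so this is pessimistic) and uses the 160MB/s data-volume throughput quoted on the Major Tweaks slide. The function name is made up for this example.

```python
def max_chunk_reads_per_sec(volume_mb_per_sec, chunk_length_kb):
    """Upper bound on whole-chunk reads/s that a volume's throughput allows."""
    return volume_mb_per_sec * 1024 / chunk_length_kb

# Compare the 64KB setting from the slide with two smaller, hypothetical sizes.
for chunk_kb in (64, 16, 4):
    bound = max_chunk_reads_per_sec(160, chunk_kb)
    print(f"{chunk_kb:>2}KB chunks: at most {bound:,.0f} uncached reads/s")
```

The point of the comparison is only that the bound scales inversely with chunk size; real behavior also depends on cache hit rate and compression ratio.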

Page 36: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Reading Data

•  24 hour read test

•  over 10TBs of data in the CF

•  sustained > 350K reads per second over 24 hours

•  1M reads per sec peak

•  CL ONE

•  12 C4.4XL stress boxes

Page 37: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Reading Data

Page 38: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Reading Data

Page 39: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Reading Data

Not Pegged :)

Page 40: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Reading Data

7.2ms 95th latency

Page 41: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Netflix Test resource usage

•  180 fewer cores (45 fewer i2.xlarge instances)

•  24 hour test (sans data transfer cost)
   –  Netflix cluster/stress
      •  Cost: ~$6300
      •  285 i2.xlarge $0.85 per hour
   –  CrowdStrike cluster/stress with EBS cost
      •  Cost: ~$2600
      •  60 C4.4XL $0.88 per hour
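The core and cost figures above can be partly reproduced from the instance counts and hourly prices. This is a rough sketch that covers only the listed database instances for 24 hours. The slide's totals are higher because they also include the stress nodes and, for CrowdStrike, EBS. The vCPU counts (4 for i2.xlarge, 16 for C4.4XL) are the published instance sizes, and the function name is illustrative.

```python
HOURS = 24

def instance_cost_cents(count, cents_per_hour, hours=HOURS):
    """On-demand instance cost in cents (integers avoid float rounding)."""
    return count * cents_per_hour * hours

netflix = instance_cost_cents(285, 85)      # 285 i2.xlarge at $0.85/hour
crowdstrike = instance_cost_cents(60, 88)   # 60 C4.4XL at $0.88/hour

# vCPU counts: i2.xlarge has 4, C4.4XL has 16
core_gap = 285 * 4 - 60 * 16

print(f"Netflix instances:     ${netflix / 100:,.2f}")
print(f"CrowdStrike instances: ${crowdstrike / 100:,.2f}")
print(f"core difference: {core_gap} ({core_gap // 4} i2.xlarge equivalents)")
```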

Page 42: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Read Notes with EBS

•  Our test was a single 10K IOPS volume

•  More/Bigger Reads?
   –  PIOPS gives you as much throughput as you need
   –  RAID0 multiple EBS volumes

[Diagram: EBS Vol1 and EBS Vol2 combined into a single mount at /mnt/data]
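For readers unfamiliar with RAID0, the idea is that consecutive chunks of the logical device alternate between the member volumes, so a large read is served by all of them at once. A minimal sketch of that address mapping (the function name is hypothetical, and this models the layout only, not any particular RAID tool):

```python
def raid0_location(logical_chunk, volumes):
    """Map a logical chunk index to (volume index, chunk index on that volume)."""
    return logical_chunk % volumes, logical_chunk // volumes

# Six consecutive chunks striped over two volumes, as in the diagram above.
for chunk in range(6):
    volume, offset = raid0_location(chunk, 2)
    print(f"logical chunk {chunk} -> EBS Vol{volume + 1}, chunk {offset}")
```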

Page 43: 1 Million Writes per second on 60 nodes with Cassandra and EBS

What Unlocked Performance!

Page 44: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Major Tweaks

•  Ubuntu HVM types

•  Enhanced Networking

•  now faster than PVM

•  Ubuntu distro tuned for cloud workloads

•  XFS Filesystem

Page 45: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Major Tweaks

•  Cassandra 2.1

•  Java 8

•  G1 Garbage Collector - cassandra-env

https://issues.apache.org/jira/browse/CASSANDRA-7486

Page 46: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Major Tweaks

•  C4.4XL 16 core, EBS Optimized

•  4TB, 10,000 IOPS EBS GP2 Encrypted Data Drive
   –  160MB/s throughput

•  1TB 3000 IOPS EBS GP2 Encrypted Commit Log Drive
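The IOPS figures for both drives follow from how GP2 sizes its baseline: 3 IOPS per GiB, with a floor of 100 and, at the time of this talk, a ceiling of 10,000. A small sketch of that rule, treating 1TB as roughly 1,000 GiB (the constant and function names are illustrative, and the ceiling reflects 2015 limits rather than current ones):

```python
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS_2015 = 10_000  # per-volume ceiling when the talk was given

def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a GP2 volume of the given size."""
    return max(GP2_MIN_IOPS, min(size_gib * GP2_IOPS_PER_GIB, GP2_MAX_IOPS_2015))

print("commit log drive (1TB):", gp2_baseline_iops(1000))
print("data drive (4TB):      ", gp2_baseline_iops(4000))
```

This is why the 4TB data drive is listed at 10,000 IOPS rather than 12,000: the size-based figure is capped.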

Page 47: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Major Tweaks

•  cassandra-env.sh
   –  MAX_HEAP_SIZE=8G
   –  JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"

•  Lots of other minor tweaks

Page 48: 1 Million Writes per second on 60 nodes with Cassandra and EBS

cassandra-env.sh

Put PID in batch mode

Mask CPU0 from the process to reduce context switching

Magic From Al Tobey
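The code shown on this slide did not survive in the transcript. As a rough idea of what the two annotations describe, here is a sketch using Python's Linux-only scheduler calls. It is not the snippet from the slide, and the helper names are hypothetical.

```python
import os

def cpus_without_cpu0(cpus):
    """Return the CPU set minus CPU 0, unless that would leave nothing."""
    remaining = set(cpus) - {0}
    return remaining or set(cpus)

def tune_process(pid):
    """Linux only: switch pid to SCHED_BATCH and keep it off CPU 0."""
    # Batch scheduling tells the kernel this is a throughput-oriented task.
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    # Restrict the task to every currently allowed CPU except CPU 0.
    os.sched_setaffinity(pid, cpus_without_cpu0(os.sched_getaffinity(pid)))

# Example (not run here): tune_process(os.getpid())
```

These calls act on a single task, and new threads inherit the policy and CPU mask from their creator, so a setup like this would need to run before the process spawns its worker threads.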

Page 49: 1 Million Writes per second on 60 nodes with Cassandra and EBS

YAML Settings

•  cassandra.yaml (based on 16 core)
   –  concurrent_reads: 32
   –  concurrent_writes: 64
   –  memtable_flush_writers: 8
   –  trickle_fsync: true
   –  trickle_fsync_interval_in_kb: 1000
   –  native_transport_max_threads: 256
   –  concurrent_compactors: 4

Page 50: 1 Million Writes per second on 60 nodes with Cassandra and EBS

cassandra.yaml

We found a good portion of the CPU load was being used for internode compression which reduced write throughput

internode_compression: none
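When reproducing the test, it can help to confirm that a node's cassandra.yaml actually carries the values from this slide and the YAML Settings slide. The sketch below is a minimal, standard-library-only check. The parser is deliberately naive (top-level `key: value` lines only), the sample text is invented for the demo, and the function names are hypothetical.

```python
# Values copied from the YAML Settings and cassandra.yaml slides (16-core box).
TUNED = {
    "concurrent_reads": "32",
    "concurrent_writes": "64",
    "memtable_flush_writers": "8",
    "trickle_fsync": "true",
    "trickle_fsync_interval_in_kb": "1000",
    "native_transport_max_threads": "256",
    "concurrent_compactors": "4",
    "internode_compression": "none",
}

def parse_flat_yaml(text):
    """Parse top-level 'key: value' lines; skips comments, nesting and lists."""
    settings = {}
    for line in text.splitlines():
        if not line or line[0] in " \t#-" or ":" not in line:
            continue
        key, _, value = line.partition(":")
        settings[key.strip()] = value.split("#", 1)[0].strip()
    return settings

def config_drift(yaml_text, expected=TUNED):
    """Return {key: (found, wanted)} for every setting that differs."""
    actual = parse_flat_yaml(yaml_text)
    return {k: (actual.get(k), v) for k, v in expected.items() if actual.get(k) != v}

sample = """\
concurrent_reads: 32
concurrent_writes: 32
internode_compression: all  # example value
"""
drift = config_drift(sample)
for key, (found, wanted) in drift.items():
    print(f"{key}: found {found!r}, slide value {wanted!r}")
```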

Page 51: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Lessons Learned

•  EBS was never the bottleneck during testing, GP2 is legit

•  If you’re doing batching, write to the same rowkey in the batch

•  Built-in types like list and map come at a performance penalty
   –  30% hit on our writes using Map type

•  DTCS is very young (see Jeff Jirsa’s talk)

•  2.1 Stress Tool is tricky but great for modeling workloads

•  How will compression affect your read path?

Page 52: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Test your own!

https://github.com/CrowdStrike/cassandra-tools

Page 53: 1 Million Writes per second on 60 nodes with Cassandra and EBS

It’s just python

•  launch 20 nodes in us-east-1
   –  python launch.py launch --nodes=20 --config=c4-ebs-hvm --az=us-east-1a

•  bootstrap the new nodes with C*, RAID/Format disks, etc…
   –  fab -u ubuntu bootstrapcass21:config=c4-highperf

•  run arbitrary commands
   –  fab -u ubuntu cmd:config=c4-highperf,cmd="sudo rm -rf /mnt/cassandra/data/summit_stress"

Page 54: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Run custom stress profiles… multi-node support

export NODENUM=1

$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50

Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops\(insert=1,simple=9\) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore

export NODENUM=2

$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50

Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops\(insert=1,simple=9\) no-warmup -pop seq=1000000001..2000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore
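The two runs differ only in their `-pop seq=` range, which tracks NODENUM so that each stress writer covers its own block of a billion keys. A plausible sketch of that mapping is below. The real logic lives in runstress.py in the CrowdStrike repository and may differ, and the function and constant names here are hypothetical.

```python
KEYS_PER_NODE = 1_000_000_000  # each stress writer owns a 1-billion-key range

def seq_range(nodenum, keys_per_node=KEYS_PER_NODE):
    """Build the cassandra-stress 'seq=' value for a 1-based stress node number."""
    if nodenum < 1:
        raise ValueError("nodenum must be >= 1")
    start = (nodenum - 1) * keys_per_node + 1
    end = nodenum * keys_per_node
    return f"seq={start}..{end}"

for n in (1, 2):
    print(f"NODENUM={n}: -pop {seq_range(n)}")
```

Keeping the ranges disjoint means the writers never overwrite each other's keys, so the cluster sees only fresh inserts.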

Page 55: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Where are we today?

•  ~3 months on our EBS based cluster

•  Hundreds of TBs of graph data and growing in C*

•  Billions of vertices/edges

•  Changing perceptions?

Page 56: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Special thanks to

•  Leif Jackson
•  Marcus King
•  Alan Hannan
•  Jeff Jirsa
•  Al Tobey
•  Nick Panahi
•  J.B. Langston
•  Marcus Eriksson
•  Iian Finlayson
•  Dani Traphagen

Page 57: 1 Million Writes per second on 60 nodes with Cassandra and EBS

EBS heading into 2016

Page 58: 1 Million Writes per second on 60 nodes with Cassandra and EBS

4TB (10k IOPS) GP2

IO Hit? Not enough to faze C*

Page 59: 1 Million Writes per second on 60 nodes with Cassandra and EBS

So why the hate for EBS?

Page 60: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Following the Crowd – Trust Issues

•  Used instance-store image and ephemeral drives

•  Painful to stop/start instances, resize

•  Couldn’t avoid scheduled maintenance (i.e. Reboot-a-palooza)

•  Encryption required shenanigans

Page 61: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Guess What?

•  We still had failures

•  Now we get to rebuild from scratch

Page 62: 1 Million Writes per second on 60 nodes with Cassandra and EBS

EBS’s Troubled Childhood

What do you mean my volume is “stuck”?

•  April 2011 – Netflix, Reddit and Quora

•  October 2012 – Reddit, Imgur, Heroku

•  August 2013 – Vine, AirBNB

Page 63: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Kiss of Death

http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html

•  Spread services across multiple regions

•  Test failure scenarios regularly (Chaos Monkey)

•  Make Cassandra databases more resilient by avoiding EBS

Page 64: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Redemption

Amazon moves quickly and quietly:

•  March 2011 – New EBS GM

•  July 2012 – Provisioned IOPS

•  May 2014 – Native Encryption

•  Jun 2014 – GP2 (game changer)

•  Mar 2015 – 16TB / 10K GP2 / 20K PIOPS

Page 65: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Redemption

•  Prioritized EBS availability and consistency beyond features and functionality

•  Compartmentalized the control plane - broke cross-AZ dependencies for running volumes

•  Simplified workflows to favor sustained operation

•  Tested and simulated via TLA+/PlusCal - better understood corner cases

•  Dedicated a large fraction of engineering resources to reliability and performance

Page 66: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Reliability

EBS Team targets 99.999% availability

exceeding expectations
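To put the five-nines target in concrete terms, an availability percentage converts directly into a yearly downtime budget. A small worked example (the function name is illustrative, and a 365-day year is assumed):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years

def allowed_downtime_minutes(availability_percent):
    """Yearly downtime budget, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100

for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.2f} minutes/year")
```

At 99.999% the budget is a little over five minutes per year.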

Page 67: 1 Million Writes per second on 60 nodes with Cassandra and EBS

CrowdStrike Today

In past 12 months, zero EBS-related failures

•  Thousands of GP2 data volumes (~2PB data)

•  Transitioning all systems to EBS root drives

•  Moved all data stores to EBS (C*, Kafka, Elasticsearch, Postgres, etc)

Page 68: 1 Million Writes per second on 60 nodes with Cassandra and EBS
Page 69: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Staying Safe - Architecture

•  Select a region with >2 AZs (e.g. us-east-1 or us-west-2)

•  Use EBS GP2 or PIOPS storage

•  Separate volumes for data and commit logs

Page 70: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Staying Safe - Ops

•  Use EBS volume monitoring

•  Pre-warm EBS volumes?

•  Schedule snapshots for consistent backups

Page 71: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Most Importantly

•  Challenge assumptions

•  Stay current on AWS blog

•  Talk with your peers

Page 72: 1 Million Writes per second on 60 nodes with Cassandra and EBS
Page 73: 1 Million Writes per second on 60 nodes with Cassandra and EBS

Thank you @jimplush

@opacki