
Sergey Sverchkov and Vitaly Rudenia. Choosing a NoSQL database

May 10, 2015


#BigDataBY
Transcript
Page 1: Sergey Sverchkov and Vitaly Rudenia. Choosing a NoSQL database

© ALTOROS | CONFIDENTIAL

Choosing a NoSQL: a Real-Life Case

Sergey Sverchkov, Project Manager

Vitaly Rudenia, Java Team Lead

Page 2

Business goals

Improve scalability of the Oracle-based solution

Get the benefits of the cloud

Provide high availability and excellent performance

Add geographic redundancy

Web performance vs. revenue

Page 3

How to choose a database?

How we did it:

Arranged data stores by categories (key-value, document, relational / ACID)

Identified key criteria

Measured performance

Checked stability and scalability

Page 4

Evaluation criteria

Selection criteria | Importance score
Multi-data center bi-directional replication | 10
Support for active/active reads/writes across regions | 10/9
Auto resynchronization of data between regions | 10
Support for encryption of data replication traffic across regions | 10
Configurable replication factor | 9
Tunable consistency for reads and writes | 9
Survive loss of nodes and up to an entire region | 8
Ability to add nodes in a cluster and rebalance data | 8
Rich Query and Indexing capabilities | 8
Security: Kerberos or similar authentication models | 7
Backup/Recovery: Ability to perform live snapshots and restore | 6
Bulk Loading and Extract/Dump capability; adapters for data transfer to Hadoop | 3
Support for counters / sequence type structures | 3
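One possible way to turn the importance scores above into a ranking is a weighted sum. The sketch below is only an illustration and not necessarily the method the authors used: the candidate names (`database_a`, `database_b`) and their 0.0 to 1.0 ratings are hypothetical placeholders, and only the three weights are copied from the table.

```python
# Importance weights copied from three rows of the criteria table.
WEIGHTS = {
    "Multi-data center bi-directional replication": 10,
    "Tunable consistency for reads and writes": 9,
    "Rich Query and Indexing capabilities": 8,
}

# HYPOTHETICAL ratings (0.0 = unsupported, 1.0 = fully supported).
# These are placeholders, not results from the presentation.
RATINGS = {
    "database_a": {
        "Multi-data center bi-directional replication": 1.0,
        "Tunable consistency for reads and writes": 0.5,
        "Rich Query and Indexing capabilities": 0.5,
    },
    "database_b": {
        "Multi-data center bi-directional replication": 0.5,
        "Tunable consistency for reads and writes": 1.0,
        "Rich Query and Indexing capabilities": 1.0,
    },
}


def weighted_score(ratings, weights):
    """Sum of weight * rating; a criterion without a rating counts as 0."""
    return sum(weight * ratings.get(name, 0.0) for name, weight in weights.items())


def rank(all_ratings, weights):
    """Return (database, score) pairs, best score first."""
    scored = {db: weighted_score(r, weights) for db, r in all_ratings.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for database, score in rank(RATINGS, WEIGHTS):
        print(database, score)
```

A weighted sum hides hard requirements, so criteria with importance 10 may be better treated as pass/fail filters before any scoring.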

Page 5

Compared data stores

Relational | Key-Value | Document-oriented | Column family
MySQL Cluster | Redis | MongoDB | Cassandra
Oracle 12c | MySQL HandlerSocket | Couchbase | Vertica
MariaDB | Berkeley DB | CouchDB | Teradata
VoltDB | Project Voldemort | MarkLogic Server | Accumulo
NuoDB | Riak | | HBase
Vertica | | |

Page 6

The benchmark framework

Amazon EC2 nodes for database cluster:

- i2.4xlarge: vCPU=16, RAM=122 GB, Storage=4x800 GB SSD in RAID0

- 3 availability zones in one region

- a single security group

Page 7

Data loading statistics

Initial data sets

Entity name | Number of records | Size of a record, KB
Users | 69,278,283 | 14.5
Orders | 5,000,000 | 0.5
Inventory | 99,923 | 0.0016
Activity | 422,227,370 | 28.4

Loading statistics

Database name | Load time (hr) | Number of threads | Throughput | Average latency (ms)
Couchbase | 11.2 | 10 | 12,335 | 0.6
MongoDB | 9.8 | 10 | 14,000 | 0.4
Riak | 10.2 | 10 | 13,990 | 4
Cassandra | ~4 | 10 | - | -
MySQL | 8.9 | 10 | 15,340 | 1.3
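The two tables can be cross-checked against each other. The slide does not state the unit of the Throughput column; the sketch below assumes it is records per second, and under that assumption load time multiplied by throughput should land close to the total number of records in the initial data sets. Cassandra is left out because the slide gives no throughput for it.

```python
# Record counts from the "Initial data sets" table.
RECORDS = {
    "Users": 69_278_283,
    "Orders": 5_000_000,
    "Inventory": 99_923,
    "Activity": 422_227_370,
}

# (load time in hours, throughput) from the "Loading statistics" table.
# ASSUMPTION: throughput is in records per second; the slide gives no unit.
LOAD = {
    "Couchbase": (11.2, 12_335),
    "MongoDB": (9.8, 14_000),
    "Riak": (10.2, 13_990),
    "MySQL": (8.9, 15_340),
}


def implied_records(load_hours, throughput_per_sec):
    """Records that would be loaded at a constant rate over the load time."""
    return load_hours * 3600 * throughput_per_sec


total_records = sum(RECORDS.values())

if __name__ == "__main__":
    print("total records:", total_records)
    for name, (hours, throughput) in LOAD.items():
        ratio = implied_records(hours, throughput) / total_records
        print(f"{name}: implied / actual records = {ratio:.3f}")
```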

Page 8

Benchmarking workloads

Workload description | Operations
1. Device Payload upload | Insert
2. Add to Shopping Cart - with inventory check/update | Read, Insert
3. Profile registrations | Read, Insert
4. Login + Token (update of last login) | Read, Update
5. Order create | Read, Update, Insert
6. Activity List - Last 30 activities for a user | Read (range of records)
7. Activity Detail - Details of a single activity based on id | Read (single record)
8. Aggregation for the last 30 days for a user | Read and aggregate
9. Delete of activity based on id | Delete
10. Profile search - based on First and Last name | Read

Page 9

Initial data model

Page 10

Workload: Device sync

Steps:

1. Generate SYNC_ID
2. Read PAYLOAD value from pre-generated file
3. Insert new record (SYNC_ID, PAYLOAD) into SYNC

Performance results, 10 parallel threads

Parameter | Couchbase | MongoDB | Riak | Cassandra | MySQL
Throughput (ops/sec) | 38,520 | 150 | 14,302 | 21,672 | 5,786
Average latency, ms (Insert) | 1.7 | 451.1 | 5.1 | 4.0 | 10.2
95th percentile latency, ms (Insert) | 1.4 | 220.9 | 6.1 | 3.3 | 14.0
99th percentile latency, ms (Insert) | 3.2 | 561.7 | 12.7 | 43.4 | 14.0
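To make the metrics in the table concrete, here is a minimal sketch of how a workload like this could be driven from 10 parallel threads while collecting throughput, average latency, and the 95th and 99th percentiles. It is not the authors' benchmark framework: `InMemoryStore` is a hypothetical stand-in for a database client, and the payload is passed in directly rather than read from a pre-generated file.

```python
import statistics
import threading
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for a database client (not a benchmarked system)."""

    def __init__(self):
        self._tables = {}
        self._lock = threading.Lock()

    def insert(self, table, key, value):
        with self._lock:
            self._tables.setdefault(table, {})[key] = value

    def count(self, table):
        with self._lock:
            return len(self._tables.get(table, {}))


def device_sync(store, payload):
    sync_id = uuid.uuid4().hex              # step 1: generate SYNC_ID
    # step 2: the payload is supplied by the caller in this sketch
    store.insert("SYNC", sync_id, payload)  # step 3: insert into SYNC


def run_benchmark(store, payload, threads=10, ops_per_thread=200):
    latencies_ms = []
    results_lock = threading.Lock()

    def worker():
        local = []
        for _ in range(ops_per_thread):
            start = time.perf_counter()
            device_sync(store, payload)
            local.append((time.perf_counter() - start) * 1000.0)
        with results_lock:
            latencies_ms.extend(local)

    started = time.perf_counter()
    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    elapsed = time.perf_counter() - started

    cuts = statistics.quantiles(latencies_ms, n=100)  # 99 percentile cut points
    return {
        "throughput_ops_sec": len(latencies_ms) / elapsed,
        "avg_ms": statistics.fmean(latencies_ms),
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
    }


if __name__ == "__main__":
    print(run_benchmark(InMemoryStore(), b"x" * 1024))
```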

Page 11

Workload: Add to Shopping Cart

Steps:

1. Retrieve row from INVENTORY with given SKU_ID
2. Generate ORDER_ID (STATE=‘INCOMPLETE’)
3. Insert new Order to ORDERS

Performance results, 10 parallel threads

Parameter | Couchbase | MongoDB | Riak | Cassandra | MySQL
Throughput (ops/sec) | 54,335 | 10,126 | 15,023 | 46,369 | 8,319
Average latency, ms (Insert / Read) | 0.7 / 0.7 | 7.6 / 1.2 | 2.7 / 1.6 | 0.9 / 1.0 | 8.4 / 3.7
95th percentile latency, ms (Insert / Read) | 1.1 / 1.1 | 20.1 / 1.6 | 3.5 / 2.4 | 2.4 / 2.4 | 11.5 / 5.4
99th percentile latency, ms (Insert / Read) | 1.4 / 1.4 | 34.2 / 2.7 | 4.9 / 3.3 | 4.4 / 5.9 | 13.2 / 6.5

Page 12

Workload: Profile registrations

Steps:

1. Generate a unique USER_ID, FIRST_NAME, LAST_NAME
2. Set email as [email protected]
3. Read USER to verify that the record doesn’t exist
4. Insert new record into USER

Performance results, 10 parallel threads

Parameter | Couchbase | MongoDB | Riak | Cassandra | MySQL
Throughput (ops/sec) | 4,521 | 3,800 | 24,807 | 31,490 | 8,346
Average latency, ms (Insert / Read) | 0.8 / 16.9 | 64 / 5.7 | 3.1 / - | 1.4 / 1.4 | 8.5 / 2.2
95th percentile latency, ms (Insert / Read) | 1.2 / 25.5 | 133.1 / 156 | 4.3 / - | 2.4 / 2.4 | 11.9 / 4.1
99th percentile latency, ms (Insert / Read) | 1.4 / 34.1 | 197.6 / 29.9 | 5.3 / - | 4.4 / 5.1 | 13.6 / 5.4
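The read-then-insert pattern in the steps above can be sketched as follows. The slide does not say which field the existence check uses; this sketch assumes a lookup by email through a secondary index. `UserTable`, `find_by_email`, and the sample address are hypothetical names, with plain dictionaries standing in for the USER table.

```python
import uuid


class UserTable:
    """Dictionaries standing in for the USER table and a secondary index."""

    def __init__(self):
        self.by_id = {}
        self.id_by_email = {}  # hypothetical secondary index on email

    def find_by_email(self, email):
        # Returns the stored record, or None when the email is unknown.
        return self.by_id.get(self.id_by_email.get(email))


def register_profile(table, first_name, last_name, email):
    user_id = uuid.uuid4().hex                 # step 1: unique USER_ID
    # step 2: the email is supplied by the caller in this sketch
    if table.find_by_email(email) is not None:  # step 3: existence check
        return None
    table.by_id[user_id] = {                   # step 4: insert the record
        "FIRST_NAME": first_name,
        "LAST_NAME": last_name,
        "EMAIL": email,
    }
    table.id_by_email[email] = user_id
    return user_id


if __name__ == "__main__":
    users = UserTable()
    print(register_profile(users, "Ann", "Lee", "ann@example.com"))
    print(register_profile(users, "Ann", "Lee", "ann@example.com"))
```

The check and the insert are two separate operations, so in a real store without transactions or a uniqueness constraint, two concurrent registrations could both pass the check.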

Page 13

Workload: Order Create

Steps:

1. Read a row from INVENTORY based on SKU_ID
2. Update INVENTORY for SKU_ID
3. INSERT to ORDERS(STATE=‘COMPLETE’)

Performance results, 10 parallel threads

Parameter | Couchbase | MongoDB | Riak | Cassandra | MySQL
Throughput (ops/sec) | 27,436 | 4,259 | 5,339 | 98,105 | 3,023
Average latency, ms (Update / Insert / Read) | 0.7 / 0.8 / 0.7 | 27.2 / 3.1 / 3.8 | 4.7 / 2.8 / 1.7 | 0.8 / 0.8 / 1.5 | 11.4 / 8.1 / 3.5
95th percentile latency, ms (Update / Insert / Read) | 1.1 / 1.1 / 1.1 | 57.5 / 6.4 / 8.7 | 6.7 / 4.1 / 2.4 | 1.3 / 1.3 / 3.4 | 16.0 / 10.9 / 4.8
99th percentile latency, ms (Update / Insert / Read) | 1.4 / 2 / 2 | 90.1 / 18.3 / 18.7 | 8.8 / 5.4 / 3.3 | 2.4 / 2.4 / 7.5 | 18.9 / 12.1 / 5.7
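A minimal sketch of the read, update, insert sequence above, with dictionaries in place of the INVENTORY and ORDERS tables. Two things are assumptions rather than details from the slide: the update is modeled as decrementing a stock quantity, and a lock stands in for whatever atomicity the real database provides. The `Shop` class is a hypothetical name.

```python
import threading
import uuid


class Shop:
    """Dictionaries standing in for the INVENTORY and ORDERS tables."""

    def __init__(self, inventory):
        self.inventory = dict(inventory)  # SKU_ID -> quantity in stock
        self.orders = {}
        self._lock = threading.Lock()

    def create_order(self, sku_id):
        with self._lock:
            quantity = self.inventory.get(sku_id)  # step 1: read INVENTORY
            if quantity is None or quantity <= 0:
                return None                        # unknown SKU or out of stock
            # step 2: update INVENTORY (ASSUMPTION: decrement the quantity)
            self.inventory[sku_id] = quantity - 1
            order_id = uuid.uuid4().hex
            # step 3: insert into ORDERS with STATE='COMPLETE'
            self.orders[order_id] = {"SKU_ID": sku_id, "STATE": "COMPLETE"}
            return order_id


if __name__ == "__main__":
    shop = Shop({"sku-1": 2})
    print([shop.create_order("sku-1") for _ in range(3)])
```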

Page 14

Workload: Aggregate last 30 activities

Steps:

1. Aggregate the last 30 activities from SPORT_ACTIVITY

Performance results, 10 parallel threads

Parameter | Couchbase | MongoDB | Riak | Cassandra | MySQL
Throughput (ops/sec) | 3,783 | 9,285 | - | 60,713 | 9,195
Average latency, ms (Scan) | 21.1 | 29 | - | 1.3 | 6.5
95th percentile latency, ms (Scan) | 33.1 | 114.1 | - | 2.3 | 12.8
99th percentile latency, ms (Scan) | 41.6 | 209.3 | - | 3.4 | 20.5
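The slide does not say which fields are aggregated, so the sketch below only illustrates the shape of the operation: pick a user's 30 most recent activities and aggregate over them. The field names (`user_id`, `timestamp`, `distance`) are hypothetical, and a list of dictionaries stands in for SPORT_ACTIVITY.

```python
from heapq import nlargest


def aggregate_last_activities(activities, user_id, limit=30):
    """Aggregate over the `limit` most recent activities of one user."""
    own = (a for a in activities if a["user_id"] == user_id)
    latest = nlargest(limit, own, key=lambda a: a["timestamp"])
    return {
        "count": len(latest),
        "total_distance": sum(a["distance"] for a in latest),
    }


if __name__ == "__main__":
    sample = [
        {"user_id": "u1", "timestamp": i, "distance": 1.0} for i in range(40)
    ]
    print(aggregate_last_activities(sample, "u1"))
```

In a real store this filter and sort would normally be pushed down to an index or clustering key ordered by time, so that only the 30 matching rows are read.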

Page 15

Performance comparison

[Chart: Database comparison. Throughput (ops/sec), axis from 0 to 140,000, for Couchbase, MongoDB, Cassandra, Riak, and MySQL]

Page 16

Stability tests - Couchbase

Node failure verification – 18 nodes

Number of failed nodes | Login: throughput, ops/sec | Login: scan, average latency, ms | Login: update, average latency, ms | Profile registration: throughput, ops/sec | Profile registration: scan, average latency, ms | Profile registration: insert, average latency, ms
1–6 | 4,450 | 17.1 | 0.7 | 4,600 | 86 | 0.8
>7 | 3,120 | 20 | 0.7 | 4,020 | 90 | 1.1

Region failure verification

Bucket name | Sync throughput per cluster, ops/sec | Sync throughput per node, ops/sec
SportActivity | 115,000 | 19,167
Users | 111,000 | 18,500
Inventory | 122,000 | 20,333
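The per-node column appears to be the per-cluster figure divided by the number of nodes in a cluster. The slide does not state that number; dividing by 6 reproduces all three per-node values, so the sketch below treats 6 nodes per cluster as an inferred assumption.

```python
# Per-cluster sync throughput (ops/sec) from the region failure table.
CLUSTER_SYNC_THROUGHPUT = {
    "SportActivity": 115_000,
    "Users": 111_000,
    "Inventory": 122_000,
}

# ASSUMPTION: inferred from the ratio of the two columns, not stated on the slide.
NODES_PER_CLUSTER = 6


def per_node(cluster_ops_sec, nodes=NODES_PER_CLUSTER):
    """Per-node throughput, rounded to whole operations per second."""
    return round(cluster_ops_sec / nodes)


if __name__ == "__main__":
    for bucket, ops in CLUSTER_SYNC_THROUGHPUT.items():
        print(bucket, per_node(ops))
```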

Page 17

Stability tests - MongoDB

Node failure verification – 9 nodes

Number of failed nodes | Profile registration: throughput, ops/sec | Profile registration: scan, average latency, ms | Profile registration: update, average latency, ms | Login: throughput, ops/sec | Login: read, average latency, ms | Login: insert, average latency, ms
1–3 | 8,721 | 6.3 | 24.2 | 58.9 | 6.7 | 63.0

> 4 failed data nodes: all open client connections are closed

1 or 2 configuration servers unavailable: cluster metadata becomes read-only

Region failure verification, issues found

Error inserting data when two master nodes failed in the “remote” data center

Primary node cannot be elected, or the cluster is in the state between elections

Page 18

Stability tests - Cassandra

Node failure verification – 9 nodes

9 data nodes in 3 groups, replication factor 3

Every 3 minutes, one node in each group was stopped (except the seed node)

Login

Overall run time, min | Overall throughput, ops/sec | Operations
22 | 4,543.9 | 6,000,000

Order create

Overall run time, min | Overall throughput, ops/sec | Operations
10.8 | 6,935.2 | 4,500,000

Profile registration

Overall run time, min | Overall throughput, ops/sec | Operations
23.7 | 2,807.2 | 4,000,000
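The three columns are related: overall throughput should be roughly the number of operations divided by the run time in seconds. The short check below recomputes throughput from the other two columns; small differences from the reported values are expected because the run times on the slide are rounded.

```python
# (run time in minutes, reported throughput in ops/sec, operations)
RUNS = {
    "Login": (22, 4_543.9, 6_000_000),
    "Order create": (10.8, 6_935.2, 4_500_000),
    "Profile registration": (23.7, 2_807.2, 4_000_000),
}


def overall_throughput(operations, run_time_min):
    """Operations per second over the whole run."""
    return operations / (run_time_min * 60)


if __name__ == "__main__":
    for name, (minutes, reported, operations) in RUNS.items():
        computed = overall_throughput(operations, minutes)
        print(f"{name}: computed {computed:.1f} ops/sec, reported {reported}")
```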

Page 19

Couchbase

In-memory database; additional memory is needed for metadata:

Metadata size = document count × (56 bytes of metadata per document + key size)

Fast reads (within 1 ms) when the dataset fits in memory

View operations (profile registrations) consume CPU and perform slowly

Continuous, asynchronous replication process between regions
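As a worked example of the metadata formula above, the sketch below estimates metadata memory for the Users data set (69,278,283 records in the initial data sets). The 56 bytes per document comes from the slide; the 36-byte key size is an assumption chosen for illustration, since the slide does not give key lengths.

```python
METADATA_BYTES_PER_DOCUMENT = 56  # per-document overhead quoted on the slide


def metadata_bytes(document_count, key_size_bytes):
    """Metadata = documents count * (56 bytes + key size)."""
    return document_count * (METADATA_BYTES_PER_DOCUMENT + key_size_bytes)


if __name__ == "__main__":
    users = 69_278_283        # record count from the initial data sets
    assumed_key_size = 36     # ASSUMPTION: key length in bytes, not from the slides
    total = metadata_bytes(users, assumed_key_size)
    print(f"Users metadata: {total / 1024 ** 3:.2f} GiB")
```

This memory is needed in addition to the memory for the documents themselves and scales with the number of documents, not their size.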

Summary

MongoDB

Good performance in some workloads

Sensitive to the availability of configuration servers: without them, there are no chunk migrations or splits

A replica set should have at least one primary and one secondary node

Only a replica set can be geographically distributed between data centers

Page 20

Cassandra

Key-value + columnar data store

Flexibility in configuring a cluster topology

Very fast at inserting data, but keep JVM garbage collection in mind

CQL3 (Cassandra Query Language) and support for prepared statements

Summary

Riak

Multiple back-ends within a single Riak instance

No way to delete an entire non-empty bucket

Map/Reduce code for scan and aggregation

MySQL NDB Cluster

All cluster data is stored in memory, so backups are needed

Quite complex configuration – 3 types of processes

Geographic replication is configured manually

Page 21

Choosing NoSQL solution

Sergey Sverchkov: [email protected]

Vitaly Rudenia: [email protected]

Altoros, 2014

Thank you