@doanduyhai Introduction to Cassandra DuyHai DOAN, Technical Advocate

Cassandra introduction at FinishJUG

Jul 18, 2015


Duyhai Doan
Page 1: Cassandra introduction at FinishJUG

Introduction to Cassandra
DuyHai DOAN, Technical Advocate

Page 2: Cassandra introduction at FinishJUG

Who Am I?

Duy Hai DOAN, Cassandra technical advocate
•  talks, meetups, confs
•  open-source devs (Achilles, …)
•  OSS Cassandra point of contact

[email protected] ☞ @doanduyhai

Page 3: Cassandra introduction at FinishJUG

DataStax

•  Founded in April 2010
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 200+ employees
•  Headquarters in the San Francisco Bay Area
•  EU headquarters in London, offices in France and Germany
•  DataStax Enterprise = OSS Cassandra + extra features

Page 4: Cassandra introduction at FinishJUG

Agenda

Architecture
•  Cluster, Replication, Consistency

Data model
•  Last Write Wins (LWW), CQL basics, From SQL to CQL, Lightweight Transaction

DSE

Use Cases

Page 5: Cassandra introduction at FinishJUG

Cassandra history

NoSQL database
•  created at Facebook
•  open-sourced since 2008
•  current version = 2.1
•  column-oriented ☞ distributed table

Page 6: Cassandra introduction at FinishJUG

Cassandra 5 key facts

Key fact 1: linear scalability

[Diagram: a scale running from NetcoSports (3 nodes, ≈3GB) to 1k+ nodes and PB+ of data, with a "YOU" marker]

Page 7: Cassandra introduction at FinishJUG

Cassandra 5 key facts

Key fact 2: continuous availability (≈100% up-time)
•  resilient architecture (Dynamo)

Page 8: Cassandra introduction at FinishJUG

Cassandra 5 key facts

Key fact 3: multi-data centers
•  out-of-the-box (config only)
•  AWS conf for multi-region DCs
•  GCE/CloudStack support
•  Microsoft Azure

Page 9: Cassandra introduction at FinishJUG

Multi-DC usages

Data-locality, disaster recovery

[Diagram: one ring of 8 nodes (n1 to n8) and one ring of 5 nodes (n1 to n5), labelled New York (DC1) and London (DC2), linked by async replication]

Page 10: Cassandra introduction at FinishJUG

Multi-DC usages

Workload segregation/virtual DC

[Diagram: one ring of 8 nodes (n1 to n8) and one ring of 5 nodes (n1 to n5) in the same room, labelled Production (Live) and Analytics (Spark/Hadoop), linked by async replication]

Page 11: Cassandra introduction at FinishJUG

Multi-DC usages

Prod data copy for testing/benchmarking

[Diagram: a ring of 8 nodes (n1 to n8) and a ring of 3 nodes (n1 to n3) labelled "My tiny test cluster", with a data copy between them]

Use LOCAL consistency

NEVER WRITE HERE !!!

Page 12: Cassandra introduction at FinishJUG

Cassandra 5 key facts

Key fact 4: operational simplicity
•  1 node = 1 process + 2 config files (main + IP)
•  deployment automation
•  OpsCenter for monitoring

Page 13: Cassandra introduction at FinishJUG

Cassandra 5 key facts

Page 14: Cassandra introduction at FinishJUG

Cassandra 5 key facts

Key fact 5: analytics combo
•  Cassandra + Spark = awesome !
•  realtime streaming/analytics/aggregation …

Page 15: Cassandra introduction at FinishJUG

Cassandra architecture

Cluster
Replication
Consistency

Page 16: Cassandra introduction at FinishJUG

Cassandra architecture

Cluster layer
•  Amazon Dynamo paper
•  masterless architecture

Data-store layer
•  Google Bigtable paper
•  columns/column families

Page 17: Cassandra introduction at FinishJUG

Data distribution

Random: hash of #partition → token = hash(#p)
Hash: ]-X, X]
X = huge number (2^64/2)

[Diagram: ring of 8 nodes, n1 to n8]

Page 18: Cassandra introduction at FinishJUG

Token Ranges

A: ]0, X/8]
B: ]X/8, 2X/8]
C: ]2X/8, 3X/8]
D: ]3X/8, 4X/8]
E: ]4X/8, 5X/8]
F: ]5X/8, 6X/8]
G: ]6X/8, 7X/8]
H: ]7X/8, X]

[Diagram: ring of 8 nodes, n1 to n8, each owning one of the token ranges A to H]
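The hashing and token-range ideas from the two slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 from the standard library stands in for the partitioner's real hash function, the token space is the ]-X, X] interval with X = 2^64/2, and the helper names (`token`, `build_ring`, `owner`) are hypothetical, not Cassandra APIs.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2**64          # tokens live in ]-2**63, 2**63]
LOWEST = -(2**63)


def token(partition_key: str) -> int:
    """Hash a partition key to a signed 64-bit token (MD5 is a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_ring(nodes):
    """Assign each node the upper bound of one equal slice of the token space."""
    step = RING_SIZE // len(nodes)
    return [(LOWEST + (i + 1) * step, node) for i, node in enumerate(nodes)]


def owner(ring, tok):
    """Return the node whose range ]lower, upper] contains the token."""
    bounds = [bound for bound, _ in ring]
    # First upper bound >= token; wrap around if the token is past the last bound.
    return ring[bisect_left(bounds, tok) % len(ring)][1]


ring = build_ring([f"n{i}" for i in range(1, 9)])
for key in ("user_id1", "user_id2", "user_id3"):
    print(key, "->", owner(ring, token(key)))
```

Because the owner is computed from the key alone, any node can locate a partition without asking a master, which is what makes the "distributed table" picture on the next slides work.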

Page 19: Cassandra introduction at FinishJUG

Distributed Table

[Diagram: ring of 8 nodes (n1 to n8) owning token ranges A to H; partitions user_id1 to user_id5 are spread across the nodes]

Page 21: Cassandra introduction at FinishJUG

Linear scalability

[Diagram: the ring shown twice, once with 8 nodes (n1 to n8) and once with 10 nodes (n1 to n10)]

Page 22: Cassandra introduction at FinishJUG

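Failure tolerance

Replication Factor (RF) = 3

[Diagram: ring of 8 nodes (n1 to n8) owning token ranges A to H; each write is copied to 3 nodes (1, 2, 3), so that three consecutive nodes hold the ranges {B, A, H}, {C, B, A} and {D, C, B}]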
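A minimal sketch of how the {B, A, H}, {C, B, A}, {D, C, B} sets on the slide above come about, assuming the simplest placement rule: the node owning a range keeps the first copy and the next RF-1 nodes clockwise keep the others. Racks, data centers and virtual nodes are ignored, and the function names are hypothetical.

```python
def replicas(nodes, primary_index, rf):
    """Primary owner plus the next rf-1 nodes clockwise on the ring."""
    return [nodes[(primary_index + i) % len(nodes)] for i in range(rf)]


def ranges_held(nodes, range_names, rf):
    """For each node, list every token range it stores a copy of."""
    held = {node: [] for node in nodes}
    for i, name in enumerate(range_names):
        for node in replicas(nodes, i, rf):
            held[node].append(name)
    return held


nodes = [f"n{i}" for i in range(1, 9)]
held = ranges_held(nodes, list("ABCDEFGH"), rf=3)
for node in ("n2", "n3", "n4"):
    print(node, sorted(held[node]))
```

With RF = 3 every range lives on three different nodes, so losing any single node still leaves two copies of each range.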

Page 23: Cassandra introduction at FinishJUG

Coordinator node

Incoming requests (read/write)
Coordinator node handles the request

Every node can be coordinator → masterless

[Diagram: ring of 8 nodes (n1 to n8); a request reaches the coordinator, which forwards it to replicas 1, 2 and 3]

Page 24: Cassandra introduction at FinishJUG

Consistency

Tunable at runtime
•  ONE
•  QUORUM (strict majority w.r.t. RF)
•  ALL

Applies to both reads & writes

Page 25: Cassandra introduction at FinishJUG

Consistency in action

RF = 3, Write ONE, Read ONE

Write ONE: B
Replicas: B A A (data replication in progress …)
Read ONE: A

Page 26: Cassandra introduction at FinishJUG

Consistency in action

RF = 3, Write ONE, Read QUORUM

Write ONE: B
Replicas: B A A (data replication in progress …)
Read QUORUM: A

Page 27: Cassandra introduction at FinishJUG

Consistency in action

RF = 3, Write ONE, Read ALL

Write ONE: B
Replicas: B A A (data replication in progress …)
Read ALL: B

Page 28: Cassandra introduction at FinishJUG

Consistency in action

RF = 3, Write QUORUM, Read ONE

Write QUORUM: B
Replicas: B B A (data replication in progress …)
Read ONE: A

Page 29: Cassandra introduction at FinishJUG

Consistency in action

RF = 3, Write QUORUM, Read QUORUM

Write QUORUM: B
Replicas: B B A (data replication in progress …)
Read QUORUM: B

Page 30: Cassandra introduction at FinishJUG

Consistency trade-off

Page 31: Cassandra introduction at FinishJUG

Consistency level

ONE
Fast, may not read latest written value

Page 32: Cassandra introduction at FinishJUG

Consistency level

QUORUM
Strict majority w.r.t. Replication Factor
Good balance

Page 33: Cassandra introduction at FinishJUG

Consistency level

ALL
Paranoid
Slow, no high availability

Page 34: Cassandra introduction at FinishJUG

Consistency summary

ONE Read + ONE Write
☞ available for read/write even with (N-1) replicas down

QUORUM Read + QUORUM Write
☞ available for read/write even with 1+ replica down
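The combinations walked through in the "Consistency in action" slides follow a simple rule of thumb: a read is guaranteed to overlap the latest write when (replicas written) + (replicas read) > RF. The sketch below encodes that rule; the function names are hypothetical and only the three levels from the slides are modelled.

```python
def quorum(rf: int) -> int:
    """Strict majority with respect to the replication factor."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must answer for a given consistency level."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def reads_see_latest_write(write_level: str, read_level: str, rf: int) -> bool:
    """True when every read set must intersect every write set."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf


def tolerated_failures(level: str, rf: int) -> int:
    """How many replicas may be down while the level can still be satisfied."""
    return rf - replicas_required(level, rf)


for w, r in [("ONE", "ONE"), ("ONE", "QUORUM"), ("ONE", "ALL"),
             ("QUORUM", "ONE"), ("QUORUM", "QUORUM")]:
    print(f"Write {w}, Read {r}:", reads_see_latest_write(w, r, rf=3))
```

For RF = 3 this reproduces the slides: only Write ONE + Read ALL and Write QUORUM + Read QUORUM are certain to return B, while ONE tolerates 2 replicas down and QUORUM tolerates 1.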

Page 35: Cassandra introduction at FinishJUG

Q & A

Page 36: Cassandra introduction at FinishJUG

Data model

Last Write Wins
CQL basics
From SQL to CQL
Lightweight Transaction

Page 37: Cassandra introduction at FinishJUG

Cassandra Write Path

[Diagram: step 1, the write goes to the commit logs (Commit log1, Commit log2, … Commit logn); the Memory area is still empty]

Page 38: Cassandra introduction at FinishJUG

Cassandra Write Path

[Diagram: step 1, commit logs (Commit log1, Commit log2, … Commit logn); step 2, in Memory, one MemTable per table (MemTable Table1, MemTable Table2, … MemTable TableN)]

Page 39: Cassandra introduction at FinishJUG

Cassandra Write Path

[Diagram: step 3, the MemTables are flushed to disk as SSTable1, SSTable2 and SSTable3 for Table1, Table2 and Table3, next to the commit logs; the Memory area is empty again]

Page 40: Cassandra introduction at FinishJUG

Cassandra Write Path

[Diagram: commit logs and SSTable1 to SSTable3 (Table1 to Table3) on disk; new MemTables (MemTable Table1, MemTable Table2, … MemTable TableN) in Memory]

Page 41: Cassandra introduction at FinishJUG

Cassandra Write Path

[Diagram: commit logs on disk; Table1, Table2 and Table3 accumulate further SSTables (SSTable1, SSTable2, SSTable3, …) after more flushes; the Memory area is empty]
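The three numbered steps of the write path (commit log, MemTable, flush to SSTable) can be mimicked with a toy in-memory node. This is a deliberately simplified sketch: `ToyNode` and its `memtable_limit` threshold are hypothetical, there is a single table, and clearing the commit log on flush glosses over how real commit-log segments are managed.

```python
class ToyNode:
    """Toy model of one node's write path for a single table."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # step 1: append-only log, used for crash recovery
        self.memtable = {}        # step 2: in-memory structure holding recent writes
        self.sstables = []        # step 3: immutable, sorted files "on disk"
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. durability first
        self.memtable[key] = value             # 2. then the in-memory table
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. flush when the MemTable is full

    def flush(self):
        # An SSTable is written once, sorted by key, and never modified afterwards.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()   # flushed data no longer needs the log


node = ToyNode(memtable_limit=2)
for key, value in [("b", 2), ("a", 1), ("c", 3)]:
    node.write(key, value)
print(node.sstables, node.memtable)
```

Nothing on the write path reads or rewrites existing files, which is why repeated flushes leave several SSTables per table, as in the last diagram.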

Page 42: Cassandra introduction at FinishJUG

Last Write Wins (LWW)

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);

#partition | age | name
jdoe       | 33  | John DOE

Page 43: Cassandra introduction at FinishJUG

Last Write Wins (LWW)

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);

#partition | age (t1) | name (t1)
jdoe       | 33       | John DOE

(t1) = auto-generated timestamp

Page 44: Cassandra introduction at FinishJUG

Last Write Wins (LWW)

UPDATE users SET age = 34 WHERE login = 'jdoe';

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34

Page 45: Cassandra introduction at FinishJUG

Last Write Wins (LWW)

DELETE age FROM users WHERE login = 'jdoe';

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)

Page 46: Cassandra introduction at FinishJUG

Last Write Wins (LWW)

SELECT age FROM users WHERE login = 'jdoe';

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE   ?
SSTable2: jdoe | age (t2) = 34   ?
SSTable3: jdoe | age (t3) = ✕ (tombstone)   ?

Page 47: Cassandra introduction at FinishJUG

Last Write Wins (LWW)

SELECT age FROM users WHERE login = 'jdoe';

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE   ✕
SSTable2: jdoe | age (t2) = 34   ✕
SSTable3: jdoe | age (t3) = ✕ (tombstone)   ✓ (most recent timestamp wins)

Page 48: Cassandra introduction at FinishJUG

Compaction

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)

New SSTable: jdoe | age (t3) = ✕ (tombstone) | name (t1) = John DOE
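The INSERT / UPDATE / DELETE / SELECT sequence and the compaction result above can be replayed with a small merge function. It is a sketch, not Cassandra's storage engine: SSTables are plain dictionaries keyed by (partition, column), timestamps are the integers 1 to 3 standing for t1 to t3, and `TOMBSTONE`, `merge` and `read` are hypothetical names.

```python
TOMBSTONE = object()   # marker written by DELETE instead of removing data in place


def merge(sstables):
    """Per (partition, column), keep the cell with the highest timestamp."""
    latest = {}
    for sstable in sstables:
        for cell_key, (timestamp, value) in sstable.items():
            current = latest.get(cell_key)
            if current is None or timestamp > current[0]:
                latest[cell_key] = (timestamp, value)
    return latest


def read(sstables, partition, column):
    """Return the live value, or None if it is absent or deleted."""
    cell = merge(sstables).get((partition, column))
    if cell is None or cell[1] is TOMBSTONE:
        return None
    return cell[1]


sstable1 = {("jdoe", "age"): (1, 33), ("jdoe", "name"): (1, "John DOE")}  # INSERT at t1
sstable2 = {("jdoe", "age"): (2, 34)}                                     # UPDATE at t2
sstable3 = {("jdoe", "age"): (3, TOMBSTONE)}                              # DELETE at t3

print(read([sstable1, sstable2], "jdoe", "age"))             # value after the UPDATE
print(read([sstable1, sstable2, sstable3], "jdoe", "age"))   # value after the DELETE
compacted = merge([sstable1, sstable2, sstable3])            # the "New SSTable"
```

Compaction is the same merge written back to disk: the new SSTable keeps only the age tombstone at t3 and the name at t1.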

Page 49: Cassandra introduction at FinishJUG

Historical data

You want to keep data history ?
•  do not use internal generated timestamp !!!
•  ☞ time-series data modeling

history
SSTable1: id | date1 (t1) | date2 (t2) | … | date9 (t9)
SSTable2: id | date10 (t10) | date11 (t11) | …

Page 50: Cassandra introduction at FinishJUG

CRUD operations

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);

UPDATE users SET age = 34 WHERE login = 'jdoe';

DELETE age FROM users WHERE login = 'jdoe';

SELECT age FROM users WHERE login = 'jdoe';

Page 51: Cassandra introduction at FinishJUG

Simple Table

CREATE TABLE users (
    login text,
    name text,
    age int,
    …
    PRIMARY KEY(login));

login = partition key (#partition)

Page 52: Cassandra introduction at FinishJUG

What about joins ?

How can I join data between tables ?
How can I model 1 – N relationships ?
How to model a mailbox ?

[Diagram: User (1) to Emails (n)]

Page 53: Cassandra introduction at FinishJUG

Clustered table (1 – N)

CREATE TABLE mailbox (
    login text,
    message_id timeuuid,
    interlocutor text,
    message text,
    PRIMARY KEY((login), message_id));

•  login: partition key
•  message_id: clustering column (sorted)
•  (login, message_id) together: uniqueness

Page 54: Cassandra introduction at FinishJUG

On disk layout

SSTable1
jdoe | message_id1 | message_id2 | … | message_id104
hsue | message_id1 | message_id2 | … | message_id78

SSTable2
jdoe | message_id105 | message_id106 | … | message_id169

Page 55: Cassandra introduction at FinishJUG

Queries

Get message by user and message_id (date)

SELECT * FROM mailbox WHERE login = 'jdoe' and message_id = '2014-09-25 16:00:00';

Get message by user and date interval

SELECT * FROM mailbox WHERE login = 'jdoe' and message_id <= '2014-09-25 16:00:00' and message_id >= '2014-09-20 16:00:00';
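Why the two queries above are cheap can be seen in a toy model of the mailbox table: one partition per login, with rows kept sorted by the clustering column so that a date interval is a contiguous slice. `ToyMailbox` is hypothetical, and plain integers stand in for the timeuuid values of message_id.

```python
from bisect import bisect_left, bisect_right


class ToyMailbox:
    """Partition key -> rows kept sorted by clustering column (message_id)."""

    def __init__(self):
        self.partitions = {}

    def insert(self, login, message_id, message):
        rows = self.partitions.setdefault(login, [])
        ids = [mid for mid, _ in rows]
        pos = bisect_left(ids, message_id)
        if pos < len(rows) and rows[pos][0] == message_id:
            rows[pos] = (message_id, message)      # same primary key: overwrite
        else:
            rows.insert(pos, (message_id, message))

    def slice(self, login, start, end):
        """Rows of one partition with start <= message_id <= end."""
        rows = self.partitions.get(login, [])
        ids = [mid for mid, _ in rows]
        return rows[bisect_left(ids, start):bisect_right(ids, end)]


box = ToyMailbox()
for mid, text in [(5, "fifth"), (1, "first"), (3, "third")]:
    box.insert("jdoe", mid, text)
print(box.slice("jdoe", 2, 5))
```

The lookup first jumps to a single partition by its key and then takes a sorted range inside it; without the login there is no partition to jump to.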

Page 56: Cassandra introduction at FinishJUG

Queries

Get message by message_id only ?

SELECT * FROM mailbox WHERE message_id = '2014-09-25 16:00:00';

Get message by date interval only ?

SELECT * FROM mailbox WHERE message_id <= '2014-09-25 16:00:00' and message_id >= '2014-09-20 16:00:00';

Page 57: Cassandra introduction at FinishJUG

Queries

Get message by message_id only (#partition not provided)

SELECT * FROM mailbox WHERE message_id = '2014-09-25 16:00:00';

Get message by date interval only (#partition not provided)

SELECT * FROM mailbox WHERE message_id <= '2014-09-25 16:00:00' and message_id >= '2014-09-20 16:00:00';

Page 58: Cassandra introduction at FinishJUG

Without #partition

[Diagram: ring of 8 nodes, each marked with a question mark]

No #partition ☞ no token ☞ where is my data ?

Page 59: Cassandra introduction at FinishJUG

The importance of #partition

In RDBMS, no primary key ☞ full table scan 😭

Page 60: Cassandra introduction at FinishJUG

The importance of #partition

With Cassandra, no partition key ☞ full CLUSTER scan 😱

Page 61: Cassandra introduction at FinishJUG

Queries

Get message by user range (range query on #partition)

SELECT * FROM mailbox WHERE login >= 'hsue' and login <= 'jdoe';

Get message by user pattern (non exact match on #partition)

SELECT * FROM mailbox WHERE login like '%doe%';

Page 62: Cassandra introduction at FinishJUG

WHERE clause restrictions

All queries (INSERT/UPDATE/DELETE/SELECT) must provide #partition

Only exact match (=) on #partition, range queries (<, ≤, >, ≥) not allowed
•  ☞ full cluster scan

On clustering columns, only range queries (<, ≤, >, ≥) and exact match

WHERE clause only possible
•  on columns defined in PRIMARY KEY
•  on indexed columns (⚠)
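The restrictions listed above can be restated as a small checker, which may help when reviewing queries by hand. It is only a rough approximation under stated assumptions: secondary indexes are ignored, the ordering rules between several clustering columns are not modelled, and `check_where` is a hypothetical helper, not part of any driver.

```python
RANGE_OPS = {"<", "<=", ">", ">="}


def check_where(restrictions, partition_key, clustering_columns):
    """restrictions maps column name -> operator used in the WHERE clause."""
    if restrictions.get(partition_key) != "=":
        return "rejected: #partition must be restricted with ="
    for column, op in restrictions.items():
        if column == partition_key:
            continue
        if column not in clustering_columns:
            return f"rejected: {column} is not part of the PRIMARY KEY"
        if op != "=" and op not in RANGE_OPS:
            return f"rejected: operator {op} is not supported on {column}"
    return "ok"


# The mailbox table: PRIMARY KEY((login), message_id)
print(check_where({"login": "=", "message_id": "<="}, "login", ["message_id"]))
print(check_where({"message_id": "="}, "login", ["message_id"]))
print(check_where({"login": ">="}, "login", ["message_id"]))
```

The first query is accepted; the other two are the "#partition not provided" and "range query on #partition" cases from the earlier slides.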

Page 65: Cassandra introduction at FinishJUG

WHERE clause restrictions

What if I want to perform « arbitrary » WHERE clause ?
•  search form scenario, dynamic search fields

DO NOT RE-INVENT THE WHEEL !
☞ Apache Solr (Lucene) integration (DataStax Enterprise)
☞ Same JVM, 1-cluster-2-products (Solr & Cassandra)

SELECT * FROM users WHERE solr_query = 'age:[33 TO *] AND gender:male';

SELECT * FROM users WHERE solr_query = 'lastname:*schwei?er';

Page 66: Cassandra introduction at FinishJUG

Collections & maps

CREATE TABLE users (
    login text,
    name text,
    age int,
    friends set<text>,
    hobbies list<text>,
    languages map<int, text>,
    …
    PRIMARY KEY(login));

Keep the cardinality low ≈ 1000

Page 67: Cassandra introduction at FinishJUG

User Defined Type (UDT)

Instead of

CREATE TABLE users (
    login text,
    …
    street_number int,
    street_name text,
    postcode int,
    country text,
    …
    PRIMARY KEY(login));

Page 68: Cassandra introduction at FinishJUG

User Defined Type (UDT)

CREATE TYPE address (
    street_number int,
    street_name text,
    postcode int,
    country text);

CREATE TABLE users (
    login text,
    …
    location frozen<address>,
    …
    PRIMARY KEY(login));

Page 69: Cassandra introduction at FinishJUG

UDT insert

INSERT INTO users(login, name, location)
VALUES (
    'jdoe',
    'John DOE',
    {
        street_number: 124,
        street_name: 'Congress Avenue',
        postcode: 95054,
        country: 'USA'
    });

Page 70: Cassandra introduction at FinishJUG

UDT update

UPDATE users SET location = {
    street_number: 125,
    street_name: 'Congress Avenue',
    postcode: 95054,
    country: 'USA'
} WHERE login = 'jdoe';

Can be nested ☞ store documents
•  but no dynamic fields (or use map<text, blob>)

Page 71: Cassandra introduction at FinishJUG

From SQL to CQL

Normalized

[Diagram: User (1) to Comment (n)]

CREATE TABLE comments (
    article_id uuid,
    comment_id timeuuid,
    author_login text, // typical join id
    content text,
    PRIMARY KEY((article_id), comment_id));

Page 73: Cassandra introduction at FinishJUG

From SQL to CQL

1 SELECT
-  10 last comments
-  10 author_login

What to do with 10 author_login ???
10 extra SELECT → N+1 SELECT problem !

[Diagram: User (1) to Comment (n)]

Page 74: Cassandra introduction at FinishJUG

From SQL to CQL

De-normalized

[Diagram: User (1) to Comment (n)]

CREATE TABLE comments (
    article_id uuid,
    comment_id timeuuid,
    author frozen<person>, // person is UDT
    content text,
    PRIMARY KEY((article_id), comment_id));
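The difference between the normalized and de-normalized comments tables comes down to how many SELECTs one page view costs. The following sketch counts them with an in-memory stand-in for the database; `CountingStore`, the sample logins and the comment text are all hypothetical illustration data.

```python
class CountingStore:
    """In-memory stand-in for a database that counts SELECT calls."""

    def __init__(self, tables):
        self.tables = tables
        self.selects = 0

    def select(self, table, key):
        self.selects += 1
        return self.tables[table][key]


def last_comments_normalized(store, article_id):
    """1 SELECT for the comments, then 1 SELECT per author_login (N+1)."""
    comments = store.select("comments", article_id)
    return [
        {"content": c["content"], "author": store.select("users", c["author_login"])}
        for c in comments
    ]


def last_comments_denormalized(store, article_id):
    """The author data travels with each comment: a single SELECT."""
    return store.select("comments", article_id)


authors = {f"user{i}": {"firstname": f"First{i}", "lastname": f"Last{i}"} for i in range(10)}
normalized = CountingStore({
    "users": authors,
    "comments": {"article1": [{"author_login": login, "content": "hello"} for login in authors]},
})
denormalized = CountingStore({
    "comments": {"article1": [{"author": person, "content": "hello"} for person in authors.values()]},
})

last_comments_normalized(normalized, "article1")
last_comments_denormalized(denormalized, "article1")
print(normalized.selects, denormalized.selects)
```

With 10 comments by 10 different authors, the normalized layout needs 11 SELECTs where the de-normalized one needs 1, at the price of duplicating the author data.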

Page 76: Cassandra introduction at FinishJUG

Data modeling best practices

Start by queries
•  identify core functional read paths
•  1 read scenario ≈ 1 SELECT

Denormalize
•  wisely, only duplicate necessary & immutable data
•  functional/technical trade-off

Page 77: Cassandra introduction at FinishJUG

Data modeling best practices

Person UDT
-  firstname/lastname
-  date of birth
-  gender
-  mood
-  location

Page 78: Cassandra introduction at FinishJUG

Data modeling best practices

[Mock-up of a profile card]
John DOE, male
birthdate: 21/02/1981
subscribed since 03/06/2011
☉ San Mateo, CA
"Impossible is not John DOE"

Full detail read from User table on click

Page 79: Cassandra introduction at FinishJUG

Data modeling trade-off

What if ...
•  not possible to de-normalize with immutable data ?
•  have to duplicate mutable data ?

Page 82: Cassandra introduction at FinishJUG

Data modeling trade-off

2 strategies
•  either accept to normalize some data (extra SELECT required)
•  or de-normalize and update everywhere upon data mutation

But always keep those scenarios rare (5%-10% max), focus on the 90%

Example: Twitter tweet deletion

Page 83: Cassandra introduction at FinishJUG

Q & A

Page 84: Cassandra introduction at FinishJUG

Lightweight Transaction (LWT)

What ? ☞ make operations linearizable

Why ? ☞ solve a class of race conditions in Cassandra that would require installing an external lock manager

Page 85: Cassandra introduction at FinishJUG

Lightweight Transaction (LWT)

[Diagram: two clients race to create the same account]

Client 1: SELECT * FROM account WHERE id = 'jdoe'; (0 rows)
Client 2: SELECT * FROM account WHERE id = 'jdoe'; (0 rows)
Client 1: INSERT INTO account (id, email) VALUES ('jdoe', '[email protected]');
Client 2: INSERT INTO account (id, email) VALUES ('jdoe', '[email protected]'); ☞ winner

Page 86: Cassandra introduction at FinishJUG

Lightweight Transaction (LWT)

How ? ☞ implementing Paxos protocol on Cassandra

Syntax ?

INSERT INTO account (id, email) VALUES ('jdoe', '[email protected]') IF NOT EXISTS;

UPDATE account SET email = '[email protected]' WHERE id = 'jdoe' IF email = '[email protected]';
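What IF NOT EXISTS and IF email = … buy you is an atomic "check, then write". The sketch below imitates that contract in memory so the race from the previous slide can be replayed; it is not how Cassandra implements it (a plain lock stands in for the Paxos rounds), and `ToyAccounts` and the example.com addresses are hypothetical.

```python
import threading


class ToyAccounts:
    """In-memory table whose conditional writes are atomic."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()   # stand-in for the Paxos consensus round

    def insert_if_not_exists(self, account_id, email):
        """Mimics INSERT … IF NOT EXISTS: returns (applied, current value)."""
        with self._lock:
            if account_id in self._rows:
                return False, self._rows[account_id]
            self._rows[account_id] = email
            return True, email

    def update_if(self, account_id, expected_email, new_email):
        """Mimics UPDATE … IF email = expected: returns whether it was applied."""
        with self._lock:
            if self._rows.get(account_id) != expected_email:
                return False
            self._rows[account_id] = new_email
            return True


accounts = ToyAccounts()
print(accounts.insert_if_not_exists("jdoe", "first@example.com"))
print(accounts.insert_if_not_exists("jdoe", "second@example.com"))
```

The second client is told that its insert was not applied and sees the existing row, instead of silently overwriting it as in the SELECT-then-INSERT race.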

Page 87: Cassandra introduction at FinishJUG

Lightweight Transaction (LWT)

Recommendations
•  insert with LWT ☞ delete must use LWT

INSERT INTO my_table … IF NOT EXISTS
☞ DELETE FROM my_table … IF EXISTS

Page 88: Cassandra introduction at FinishJUG

Lightweight Transaction (LWT)

Recommendations
•  LWT expensive (4 round-trips), do not abuse
•  only for 1% – 5% use cases

Page 89: Cassandra introduction at FinishJUG

Lightweight Transaction (LWT)

[Diagram: the 4 round-trips of an LWT (1 to 4), with the labels Queue-in, Consensus, Compare, Swap / Learn]

Page 90: Cassandra introduction at FinishJUG

Q & A

Page 91: Cassandra introduction at FinishJUG

DSE (DataStax Enterprise)

•  Security
•  Analytics (Spark & Hadoop)
•  Search (Solr)

Page 92: Cassandra introduction at FinishJUG

Use Cases

•  Messaging
•  Collections/Playlists
•  Fraud detection
•  Recommendation/Personalization
•  Internet of things/Sensor data

Page 94: Cassandra introduction at FinishJUG

Thank You @doanduyhai

[email protected]

https://academy.datastax.com/