Cassandra at Umbel

Jul 03, 2015

Umbel presents on their internal bitmap index that is backed by Cassandra.
Transcript
Page 1: Cassandra at Umbel

September 2014

Cassandra at Umbel

Page 2: Cassandra at Umbel

Travis Turner

Chief Data Scientist, Co-Founder

Page 3: Cassandra at Umbel

Umbel

Empower companies to convert people-based data into addressable, actionable relationships.

Page 4: Cassandra at Umbel

Cassandra at Umbel

1. Segmentation of people-based data
2. Pilosa
3. Cassandra Persistent Storage

Page 5: Cassandra at Umbel

Umbel | Unmask your data. www.umbel.com © 2014 Umbel Corp. All Rights Reserved 5

SDKs

Capture

S3

Page 6: Cassandra at Umbel

Page 7: Cassandra at Umbel

Page 8: Cassandra at Umbel

Decentralized, Distributed Bitmap Index & Query Engine

Pilosa

Page 9: Cassandra at Umbel

SDKs

Capture

S3

Pilosa

Page 10: Cassandra at Umbel

Page 11: Cassandra at Umbel

Page 12: Cassandra at Umbel

Page 13: Cassandra at Umbel

Page 14: Cassandra at Umbel

Page 15: Cassandra at Umbel

Page 16: Cassandra at Umbel

Page 17: Cassandra at Umbel

[Figure: bitmaps shown as rows of 1 and 0 bits]

Page 18: Cassandra at Umbel

[Figure: bitmaps shown as rows of 1 and 0 bits]

AND

Page 19: Cassandra at Umbel

[Figure: bitmaps shown as rows of 1 and 0 bits]

OR

Page 20: Cassandra at Umbel

[Figure: bitmaps shown as rows of 1 and 0 bits]

OR NOT
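The AND, OR, and NOT slides combine bitmaps bit by bit. A minimal sketch of that idea follows, modelling each bitmap as a Python integer in which bit i stands for profile i. The helper names (`bitmap`, `members`) and the 8-bit `UNIVERSE` are illustrative assumptions, not part of Pilosa.

```python
UNIVERSE = 8  # assumed number of profiles, only needed to bound NOT


def bitmap(profile_ids):
    """Build an integer bitmap with one bit set per profile id."""
    b = 0
    for pid in profile_ids:
        b |= 1 << pid
    return b


def members(b):
    """List the profile ids whose bit is set."""
    return [i for i in range(b.bit_length()) if (b >> i) & 1]


a = bitmap([1, 3, 5, 7])
b = bitmap([3, 4, 5, 6])
mask = (1 << UNIVERSE) - 1

print(members(a & b))            # AND: profiles in both segments
print(members(a | b))            # OR: profiles in either segment
print(members(~b & mask))        # NOT: profiles outside segment b
print(members(a | (~b & mask)))  # OR combined with NOT
```

Because each operation is a single machine instruction per 64-bit word, segment queries stay cheap even when a bitmap covers a very large number of profiles.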

Page 21: Cassandra at Umbel

Page 22: Cassandra at Umbel

Page 23: Cassandra at Umbel

Page 24: Cassandra at Umbel

2^64 0

Page 25: Cassandra at Umbel

Slice 2^64

65,536 bits

Page 26: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

Page 27: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Page 28: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Page 29: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

Page 30: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000

Page 31: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16

Page 32: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

Page 33: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000

Page 34: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11

Page 35: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11 = 0 1001 = 9

Page 36: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11 = 0 1001 = 9

001 1111 0000

Page 37: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11 = 0 1001 = 9

001 1111 0000 ÷ 2^6

Page 38: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11 = 0 1001 = 9

001 1111 0000 ÷ 2^6 = 0 0111 = 7

Page 39: Cassandra at Umbel

Slice 2^64

65,536 bits

Fragment

0 1 2 313

Chunk (2048 bits)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11 = 0 1001 = 9

001 1111 0000 ÷ 2^6 = 0 0111 = 7

11 0000 = 48
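The worked example for profile id 150,000 is three successive integer divisions: by 2^16 to find the slice, by 2^11 to find the chunk, and by 2^6 to find the block, with the final remainder giving the bit position. A minimal sketch of that arithmetic follows; the function name `locate` and the constant names are illustrative assumptions.

```python
SLICE_BITS = 16  # 2^16 = 65,536 bits per slice
CHUNK_BITS = 11  # 2^11 = 2,048 bits per chunk
BLOCK_BITS = 6   # 2^6  = 64 bits per block


def locate(profile_id):
    """Split a profile id into (slice, chunk, block, bit) by repeated division."""
    slice_, rest = divmod(profile_id, 1 << SLICE_BITS)
    chunk, rest = divmod(rest, 1 << CHUNK_BITS)
    block, bit = divmod(rest, 1 << BLOCK_BITS)
    return slice_, chunk, block, bit


print(locate(150_000))  # (2, 9, 7, 48)
```

Since every divisor is a power of two, the same result can be obtained with shifts and masks, which is why the slides present it as slicing groups of binary digits off the front of the id.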

Page 40: Cassandra at Umbel

v 2.1

Cassandra Persistent Storage

Page 41: Cassandra at Umbel

Chunk (2048 bits)

0 1 2 313

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11 = 0 1001 = 9

001 1111 0000 ÷ 2^6 = 0 0111 = 7

11 0000 = 48

CREATE TABLE bitmap (
    bitmap_id bigint,
    db varchar,
    frame varchar,
    slice int,
    chunkkey int,
    blockindex int,
    block bigint,
    PRIMARY KEY ((bitmap_id, db, frame, slice), chunkkey, blockindex)
);

Page 42: Cassandra at Umbel

Chunk (2048 bits)

0 1 2 313

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Block (64 bits)

profile id: 150,000

0010 0100 1001 1111 0000 ÷ 2^16 = 0010 = 2

0100 1001 1111 0000 ÷ 2^11 = 0 1001 = 9

001 1111 0000 ÷ 2^6 = 0 0111 = 7

11 0000 = 48

PQL: set(88, d, 0, 150000)

cqlsh:pilosa> select * from bitmap;

 bitmap_id | db   | frame | slice | chunkkey | blockindex | block           | filter
-----------+------+-------+-------+----------+------------+-----------------+--------
        88 | test | d     |     2 |       -1 |          0 |               1 |      0
        88 | test | d     |     2 |        9 |          7 | 281474976710656 |      0

(2 rows)
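The second row of that query result can be reproduced from the profile id alone: slice 2, chunkkey 9, blockindex 7, and a block whose only set bit is bit 48 (2^48 = 281474976710656). The sketch below shows one way to derive those column values. The helper `block_row` is hypothetical, the `db` value is passed in rather than derived, and the row with chunkkey -1 is not modelled because the slides do not explain it. The signed conversion is included because a CQL `bigint` is a signed 64-bit integer, so a block with bit 63 set would be stored as a negative number.

```python
def block_row(bitmap_id, db, frame, profile_id):
    """Column values for the block that holds one set bit (hypothetical helper)."""
    slice_ = profile_id >> 16              # 65,536 profile ids per slice
    chunkkey = (profile_id >> 11) & 0x1F   # 32 chunks of 2,048 bits per slice
    blockindex = (profile_id >> 6) & 0x1F  # 32 blocks of 64 bits per chunk
    block = 1 << (profile_id & 0x3F)       # the single bit inside the block
    if block >= 1 << 63:                   # reinterpret as signed 64-bit
        block -= 1 << 64
    return {
        "bitmap_id": bitmap_id,
        "db": db,
        "frame": frame,
        "slice": slice_,
        "chunkkey": chunkkey,
        "blockindex": blockindex,
        "block": block,
    }


row = block_row(88, "test", "d", 150_000)
print(row)
```

With the partition key `(bitmap_id, db, frame, slice)` from the table definition, all blocks of one bitmap within one slice land in the same partition, ordered by `chunkkey` and then `blockindex`.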

Page 43: Cassandra at Umbel

Slice

Page 44: Cassandra at Umbel

Slice

Frame

Page 45: Cassandra at Umbel

Frame

Frame Types

• Default
• Time-based
• Top-n

Page 46: Cassandra at Umbel

foo

set(88, foo, 150000)

Default

88

150000

Page 47: Cassandra at Umbel

bar.t

set(88, bar.t, 150000, 2014-09-18T19:00)

fn(88, 2014-09-18T19:00) => [88.y, 88.m, 88.d, 88.h]

bitmap_id: 64 bits
id: 44 bits
date/hour: 20 bits (60 years with hours, 2010-2070)

Time-based

88.y 88.m 88.d 88.h

150000
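A time-based set therefore writes the same bit into four bitmaps, one per granularity (year, month, day, hour). The sketch below expands a bitmap id and timestamp into four keys. The function name `time_views` is an assumption, and the key strings follow the `88.m.2014-08` style used on the next slide, with the year format extrapolated from it. The real system packs the id and date/hour into a 64-bit bitmap_id (44 + 20 bits), and that packing is not shown on the slides, so it is not reproduced here.

```python
from datetime import datetime


def time_views(bitmap_id, ts):
    """Return the year, month, day, and hour keys that one timestamped set touches."""
    return [
        f"{bitmap_id}.y.{ts:%Y}",
        f"{bitmap_id}.m.{ts:%Y-%m}",
        f"{bitmap_id}.d.{ts:%Y-%m-%d}",
        f"{bitmap_id}.h.{ts:%Y-%m-%dT%H}",
    ]


print(time_views(88, datetime(2014, 9, 18, 19, 0)))
```

Writing four bitmaps per set costs extra storage, but it lets a range query read a handful of coarse bitmaps instead of one bitmap per hour.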

Page 48: Cassandra at Umbel

bar.t

get(88, bar.t, 2014-08-01T00:00, 2014-09-04T03:00)

88.m.2014-08
88.d.2014-09-01
88.d.2014-09-02
88.d.2014-09-03
88.h.2014-09-04T00
88.h.2014-09-04T01
88.h.2014-09-04T02

Time-based

88.y 88.m 88.d 88.h

150000
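The range query above is answered by covering the interval with the coarsest bitmaps that fit: one month, three days, and three hours. One way to compute such a cover is a greedy walk from the start time, sketched below. The function name `cover`, the half-open interval `[start, end)`, and the requirement that both ends fall on hour boundaries are assumptions made for this illustration; the slides show only the resulting keys.

```python
from datetime import datetime, timedelta


def cover(bitmap_id, start, end):
    """Greedily cover [start, end) with year, month, day, and hour keys."""
    keys = []
    t = start
    while t < end:
        next_year = datetime(t.year + 1, 1, 1)
        next_month = datetime(t.year + (t.month == 12), t.month % 12 + 1, 1)
        next_day = datetime(t.year, t.month, t.day) + timedelta(days=1)
        if (t.month, t.day, t.hour) == (1, 1, 0) and next_year <= end:
            keys.append(f"{bitmap_id}.y.{t:%Y}")
            t = next_year
        elif (t.day, t.hour) == (1, 0) and next_month <= end:
            keys.append(f"{bitmap_id}.m.{t:%Y-%m}")
            t = next_month
        elif t.hour == 0 and next_day <= end:
            keys.append(f"{bitmap_id}.d.{t:%Y-%m-%d}")
            t = next_day
        else:
            keys.append(f"{bitmap_id}.h.{t:%Y-%m-%dT%H}")
            t += timedelta(hours=1)
    return keys


for key in cover(88, datetime(2014, 8, 1, 0, 0), datetime(2014, 9, 4, 3, 0)):
    print(key)
```

The bitmaps named by these keys would then be combined with OR to produce the result for the whole range.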

Page 49: Cassandra at Umbel

brands.n

set(88, brands.n, 150000)

• only one fragment
• sorted by count
• configurable limit (50,000)
• compares count and loads from Cassandra

Top-n

88

150000

Page 50: Cassandra at Umbel

brands.n

set(88, brands.n, 150000)

• only one fragment
• sorted by count
• configurable limit (50,000)
• compares count and loads from Cassandra

Top-n

88

150000

cqlsh:pilosa> select * from bitmap;

 bitmap_id | db       | frame | slice | chunkkey | blockindex | block
-----------+----------+-------+-------+----------+------------+-----------------
        88 | brands.n | d     |     2 |       -1 |          0 |               1
        88 | brands.n | d     |     2 |        9 |          7 | 281474976710656

(2 rows)

Page 51: Cassandra at Umbel

brands.n

top-n(get(222, foo), brands.n)

Top-n
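One plausible reading of `top-n(get(222, foo), brands.n)` is: rank the bitmaps in the `brands.n` frame by how many profiles each shares with the bitmap returned by `get(222, foo)`. The sketch below illustrates that reading only. The function `top_n`, the dictionary standing in for the frame, and the integer-as-bitmap model are all assumptions, and the count-sorted cache with its configurable limit described on the earlier slides is not modelled.

```python
import heapq


def top_n(source, frame, n):
    """Rank bitmaps in `frame` by the size of their intersection with `source`."""
    counts = (
        (bin(source & bits).count("1"), bitmap_id)
        for bitmap_id, bits in frame.items()
    )
    return heapq.nlargest(n, counts)


def bitmap(profile_ids):
    """Build an integer bitmap with one bit set per profile id."""
    b = 0
    for pid in profile_ids:
        b |= 1 << pid
    return b


source = bitmap([1, 2, 150_000])       # stands in for get(222, foo)
brands = {                              # stands in for the brands.n frame
    88: bitmap([1, 2, 3, 150_000]),
    89: bitmap([2]),
    90: bitmap([7]),
}
print(top_n(source, brands, 2))  # [(3, 88), (1, 89)]
```

Each result is a `(count, bitmap_id)` pair, so the first entry is the brand bitmap with the largest overlap.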

Page 52: Cassandra at Umbel

Thank You!

umbel.com/engineering