
Cassandra for the relational brain - Percona

May 16, 2018


@PatrickMcFadin

Patrick McFadin
Chief Evangelist for Apache Cassandra

A little Cassandra for the Relational Brain

Relational Modeling

[Diagram: Data, Models, Application; Normalized Data]

• Think a YouTube competitor
  – Users add videos, rate them, comment on them, etc.
  – Can search for videos by tag

CREATE TABLE users (
  id number(12) NOT NULL,
  firstname nvarchar2(25) NOT NULL,
  lastname nvarchar2(25) NOT NULL,
  email nvarchar2(50) NOT NULL,
  password nvarchar2(255) NOT NULL,
  created_date timestamp(6),
  PRIMARY KEY (id),
  CONSTRAINT email_uq UNIQUE (email)
);

-- Users by email address index
CREATE INDEX idx_users_email ON users (email);

• Create entity table
• Add constraints
• Index fields
• Foreign Key relationships

CREATE TABLE videos (
  id number(12),
  userid number(12) NOT NULL,
  name nvarchar2(255),
  description nvarchar2(500),
  location nvarchar2(255),
  location_type int,
  added_date timestamp,
  CONSTRAINT users_userid_fk FOREIGN KEY (userid) REFERENCES users (id) ON DELETE CASCADE,
  PRIMARY KEY (id)
);

Cassandra Modeling

[Diagram: Data, Models, Application]

Think Before You Model
Or how to keep doing what you’re already doing

Some of the Entities and Relationships in KillrVideo

[Entity-relationship diagram]
• User: id, firstname, lastname, email, password
• Video: id, name, description, location, preview_image, tags
• Comment: id, comment
• Relationships: adds (timestamp), posts (timestamp), features, rates (rating)
• Cardinality labels on the diagram: 1, n, m

Modeling Queries

• What are your application’s workflows?
• How will I access the data?
• Knowing your queries in advance is NOT optional
• Different from RDBMS because I can’t just JOIN or create new indexes to support new queries

Some Application Workflows in KillrVideo

• User logs into site
• Show basic information about user
• Show videos added by a user
• Show comments posted by a user
• Search for a video by tag
• Show latest videos added to the site
• Show comments for a video
• Show ratings for a video
• Show video and its details

Some Queries in KillrVideo to Support Workflows

Users
• User logs into site -> Find user by email address
• Show basic information about user -> Find user by id

Comments
• Show comments for a video -> Find comments by video (latest first)
• Show comments posted by a user -> Find comments by user (latest first)

Ratings
• Show ratings for a video -> Find ratings by video

Videos
• Search for a video by tag -> Find video by tag
• Show latest videos added to the site -> Find videos by date (latest first)
• Show video and its details -> Find video by id
• Show videos added by a user -> Find videos by user (latest first)

“Static” Table

CREATE TABLE videos (
  videoid uuid,
  userid uuid,
  name varchar,
  description varchar,
  location text,
  location_type int,
  preview_thumbnails map<text,text>,
  tags set<varchar>,
  added_date timestamp,
  PRIMARY KEY (videoid)
);

Callouts: Table Name; Column Name; Column CQL Type; Primary Key Designation; Partition Key

Insert

INSERT INTO videos (videoid, name, userid, description, location, location_type,
                    preview_thumbnails, tags, added_date)
VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,
        'The data model is dead. Long live the data model.',
        9761d3d7-7fbd-4269-9988-6cfd4e188678,
        'First in a three part series for Cassandra Data Modeling',
        'http://www.youtube.com/watch?v=px6U2n74q3g',
        1,
        {'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},
        {'cassandra','data model','relational','instruction'},
        '2013-05-02 12:30:29');

Callouts: Table Name; Fields; Values; Partition Key: Required

Partition keys

06049cbb-dfed-421f-b889-5f649a0de1ed -> Murmur3 Hash -> Token = 7224631062609997448
873ff430-9c23-4e60-be5f-278ea2bb21bd -> Murmur3 Hash -> Token = -6804302034103043898

Consistent hash. 64-bit number between -2^63 and 2^63-1

INSERT INTO videos (videoid, name, userid, description)
VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,
        'The data model is dead. Long live the data model.',
        9761d3d7-7fbd-4269-9988-6cfd4e188678,
        'First in a three part series for Cassandra Data Modeling');

INSERT INTO videos (videoid, name, userid, description)
VALUES (873ff430-9c23-4e60-be5f-278ea2bb21bd,
        'Become a Super Modeler',
        9761d3d7-7fbd-4269-9988-6cfd4e188678,
        'Second in a three part series for Cassandra Data Modeling');
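
To make the token idea concrete, here is a minimal Python sketch of how a token might be mapped to a node on a ring. The four-node ring, the node names and the evenly spaced range boundaries are assumptions for illustration only; the two tokens in the usage note are the ones shown on this slide. Real clusters usually add virtual nodes and replication, which this sketch ignores.

```python
from bisect import bisect_left

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

# Hypothetical 4-node ring: each node owns the token range that ends at its boundary.
NODES = ["node-a", "node-b", "node-c", "node-d"]
RING = [MIN_TOKEN + (i + 1) * (2**64 // len(NODES)) - 1 for i in range(len(NODES))]


def owner(token: int) -> str:
    """Return the (hypothetical) node whose range contains the token."""
    if not MIN_TOKEN <= token <= MAX_TOKEN:
        raise ValueError("token outside the signed 64-bit range")
    # First boundary that is >= token identifies the owning range.
    return NODES[bisect_left(RING, token)]
```

With these assumed boundaries, owner(7224631062609997448) lands on the last node and owner(-6804302034103043898) on the first, so the two videos live on different nodes even though the same user added both.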

“Dynamic” Table

CREATE TABLE videos_by_tag (
  tag text,
  videoid uuid,
  added_date timestamp,
  name text,
  preview_image_location text,
  tagged_date timestamp,
  PRIMARY KEY (tag, videoid)
);

Callouts: Partition Key (tag); Clustering Column (videoid)

Users – The Cassandra Way

User logs into site -> Find user by email address

CREATE TABLE user_credentials (
  email text,
  password text,
  userid uuid,
  PRIMARY KEY (email)
);

Show basic information about user -> Find user by id

CREATE TABLE users (
  userid uuid,
  firstname text,
  lastname text,
  email text,
  created_date timestamp,
  PRIMARY KEY (userid)
);

Primary key relationship

PRIMARY KEY (tag, videoid)
  Partition Key: tag
  Clustering Column: videoid

Example partition for tag "data model":

 videoid                              | timestamp
--------------------------------------+---------------------
 06049cbb-dfed-421f-b889-5f649a0de1ed | 2013-05-02 12:30:29
 873ff430-9c23-4e60-be5f-278ea2bb21bd | 2013-05-16 16:50:00
 49f64d40-7d89-4890-b910-dbf923563a33 | 2013-06-11 11:00:00
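
The relationship between the two parts of the key can be sketched in Python as a dictionary of partitions, each holding rows ordered by the clustering column. This is only a mental model, not how Cassandra is implemented; the pairing of timestamps with the second and third videoid is an assumption read off the slide layout, and sorting videoids as plain strings is a simplification.

```python
from collections import defaultdict

# Rows as (tag, videoid, added_date). Pairing of dates to the last two ids is assumed.
ROWS = [
    ("data model", "873ff430-9c23-4e60-be5f-278ea2bb21bd", "2013-05-16 16:50:00"),
    ("data model", "06049cbb-dfed-421f-b889-5f649a0de1ed", "2013-05-02 12:30:29"),
    ("data model", "49f64d40-7d89-4890-b910-dbf923563a33", "2013-06-11 11:00:00"),
    ("cassandra", "06049cbb-dfed-421f-b889-5f649a0de1ed", "2013-05-02 12:30:29"),
]


def build_partitions(rows):
    """Group rows by partition key (tag), ordered by clustering column (videoid)."""
    partitions = defaultdict(dict)
    for tag, videoid, added_date in rows:
        # Same (tag, videoid) is the same primary key, so a later write replaces it.
        partitions[tag][videoid] = added_date
    return {tag: sorted(cells.items()) for tag, cells in partitions.items()}


def videos_by_tag(partitions, tag):
    """One partition lookup, in the spirit of SELECT ... WHERE tag = ?"""
    return partitions.get(tag, [])


partitions = build_partitions(ROWS)
```

Calling videos_by_tag(partitions, "data model") returns all three rows from a single partition, already in clustering order, which is the property the "Find video by tag" query relies on.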

Select

SELECT name, description, added_date
FROM videos
WHERE videoid = 06049cbb-dfed-421f-b889-5f649a0de1ed;

 name                                              | description                                              | added_date
---------------------------------------------------+----------------------------------------------------------+--------------------------
 The data model is dead. Long live the data model. | First in a three part series for Cassandra Data Modeling | 2013-05-02 12:30:29-0700

Callouts: Fields; Table Name; Primary Key: Partition Key Required

Controlling Order

CREATE TABLE raw_weather_data (
  wsid text,
  year int,
  month int,
  day int,
  hour int,
  temperature double,
  PRIMARY KEY ((wsid), year, month, day, hour)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,10,-5.6);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,9,-5.1);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,8,-4.9);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,7,-5.3);

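Clustering

raw_weather_data (Order By: DESC)

 wsid        | year | month | day | hour | temperature
-------------+------+-------+-----+------+-------------
 10010:99999 | 2005 |    12 |   1 |   10 |        -5.6
 10010:99999 | 2005 |    12 |   1 |    9 |        -5.1
 10010:99999 | 2005 |    12 |   1 |    8 |        -4.9
 10010:99999 | 2005 |    12 |   1 |    7 |        -5.3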
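
As a rough illustration of what the clustering order means, the following Python sketch sorts the four readings by their clustering columns in descending order, regardless of the order in which they arrive. The function names are hypothetical, and the arrival order is made up for the example.

```python
# (wsid, year, month, day, hour, temperature), in an arbitrary arrival order.
readings = [
    ("10010:99999", 2005, 12, 1, 7, -5.3),
    ("10010:99999", 2005, 12, 1, 10, -5.6),
    ("10010:99999", 2005, 12, 1, 8, -4.9),
    ("10010:99999", 2005, 12, 1, 9, -5.1),
]


def clustering_key(row):
    """Clustering columns (year, month, day, hour) of a row."""
    return row[1:5]


def store_desc(rows):
    """Order rows the way CLUSTERING ORDER BY (... DESC) would keep them."""
    return sorted(rows, key=clustering_key, reverse=True)


stored = store_desc(readings)
hours = [row[4] for row in stored]
```

After sorting, hours is [10, 9, 8, 7], matching the table above: the newest reading sits first in the partition, so "latest first" queries need no extra sort at read time.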

Write Path

Client:

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,7,-5.3);

[Diagram: Client, Node, Commit Log, Memtable (rows of wsid, year, month, day, hour, temp), Data (SSTables), Compaction]
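
The components named in the write path diagram can be modeled with a toy Python class. This is a simplified sketch under stated assumptions, not Cassandra's implementation: the class name, the flush threshold of two rows and the in-memory "SSTables" are all hypothetical, and commit log truncation after a flush is left out.

```python
class ToyNode:
    """Toy write path: commit log append, memtable update, flush to SSTables."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory rows keyed by primary key
        self.sstables = []     # immutable sorted runs standing in for disk files
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted, immutable SSTable."""
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def compact(self):
        """Merge all SSTables into one; newer tables win on duplicate keys."""
        merged = {}
        for table in self.sstables:  # oldest first
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []
```

Writing the four weather readings into ToyNode(flush_threshold=2) produces two small SSTables, and compact() merges them into a single sorted run, which is the "merged, sorted and stored sequentially" idea in miniature.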

Storage Model - Logical View

SELECT wsid, hour, temperature
FROM raw_weather_data
WHERE wsid='10010:99999' AND year = 2005 AND month = 12 AND day = 1;

 wsid        | hour         | temperature
-------------+--------------+-------------
 10010:99999 | 2005:12:1:10 |        -5.6
 10010:99999 | 2005:12:1:9  |        -5.1
 10010:99999 | 2005:12:1:8  |        -4.9
 10010:99999 | 2005:12:1:7  |        -5.3

Storage Model - Disk Layout

SELECT wsid, hour, temperature
FROM raw_weather_data
WHERE wsid='10010:99999' AND year = 2005 AND month = 12 AND day = 1;

Merged, Sorted and Stored Sequentially

 10010:99999 | 2005:12:1:10 | 2005:12:1:9 | 2005:12:1:8 | 2005:12:1:7
             |         -5.6 |        -5.1 |        -4.9 |        -5.3

Added to the same partition in later builds of the slide:
• 2005:12:1:11 = -4.9
• 2005:12:1:12 = -5.4

Read Path

Client:

SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999' AND year = 2005 AND month = 12 AND day = 1 AND hour >= 7 AND hour <= 10;

[Diagram: Client, Node, Memtable (rows of wsid, year, month, day, hour, temp), Data (SSTables)]

Query patterns
• Range queries
• “Slice” operation on disk

SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999' AND year = 2005 AND month = 12 AND day = 1 AND hour >= 7 AND hour <= 10;

Partition key for locality; single seek on disk

 10010:99999 | 2005:12:1:10 | 2005:12:1:9 | 2005:12:1:8 | 2005:12:1:7
             |         -5.6 |        -5.1 |        -4.9 |        -5.3
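
A range query over clustering columns amounts to finding two positions in a sorted run and returning everything between them. The Python sketch below shows that idea with the standard bisect module; it is an analogy rather than Cassandra's read code, the function name is hypothetical, and the rows are kept in ascending order here (the table itself stores them descending) so that bisect can be used directly.

```python
from bisect import bisect_left, bisect_right

# One partition (wsid '10010:99999'): (clustering key, temperature), ascending.
partition = [
    ((2005, 12, 1, 7), -5.3),
    ((2005, 12, 1, 8), -4.9),
    ((2005, 12, 1, 9), -5.1),
    ((2005, 12, 1, 10), -5.6),
    ((2005, 12, 1, 11), -4.9),
    ((2005, 12, 1, 12), -5.4),
]
keys = [key for key, _ in partition]


def slice_hours(year, month, day, first_hour, last_hour):
    """Return the contiguous run of rows with first_hour <= hour <= last_hour."""
    start = bisect_left(keys, (year, month, day, first_hour))
    end = bisect_right(keys, (year, month, day, last_hour))
    return partition[start:end]
```

slice_hours(2005, 12, 1, 7, 10) returns the four rows for hours 7 through 10 as one contiguous slice, without touching the rows for hours 11 and 12.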

Query patterns
• Range queries
• “Slice” operation on disk

SELECT weatherstation,hour,temperature
FROM temperature
WHERE weatherstation_id='10010:99999' AND year = 2005 AND month = 12 AND day = 1 AND hour >= 7 AND hour <= 10;

Sorted by event_time. Programmers like this.

 weather_station | hour         | temperature
-----------------+--------------+-------------
 10010:99999     | 2005:12:1:10 |        -5.6
 10010:99999     | 2005:12:1:9  |        -5.1
 10010:99999     | 2005:12:1:8  |        -4.9
 10010:99999     | 2005:12:1:7  |        -5.3

Other New and Not-so-New-but-different things

Collections

CREATE TABLE videos (
  videoid uuid,
  userid uuid,
  name varchar,
  description varchar,
  location text,
  location_type int,
  preview_thumbnails map<text,text>,
  tags set<varchar>,
  added_date timestamp,
  PRIMARY KEY (videoid)
);

Set
  tags set<varchar>
  Callouts: Column Name; CQL Type: For Ordering

List
  Callouts: Column Name; CQL Type

Map
  preview_thumbnails map<text,text>
  Callouts: Column Name; CQL Key Type; CQL Value Type

Aggregates (Sort of)

*As of Cassandra 2.2

• Built-in: avg, min, max, count(<column name>)
• Runs on server
• Always use with partition key

User Defined Functions

CREATE FUNCTION maxI(current int, candidate int)
CALLED ON NULL INPUT
RETURNS int
LANGUAGE java
AS 'if (current == null) return candidate; else return Math.max(current, candidate);';

CREATE AGGREGATE maxAgg(int)
SFUNC maxI
STYPE int
INITCOND null;

Callouts: CQL Type; Pure Function

SELECT maxAgg(temperature)
FROM raw_weather_data
WHERE wsid='10010:99999' AND year = 2005 AND month = 12 AND day = 1;

Aggregate using function over partition
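
One way to read SFUNC, STYPE and INITCOND is as a fold: the state starts at INITCOND and the state function is applied once per row. The Python sketch below mirrors the maxI logic under that reading. The Python names are hypothetical stand-ins, and the temperature values are the four readings used earlier in the deck.

```python
from functools import reduce


def max_i(current, candidate):
    """State function with the same logic as the maxI body."""
    if current is None:
        return candidate
    return max(current, candidate)


def max_agg(values, initcond=None):
    """Fold the state function over the rows of one partition."""
    return reduce(max_i, values, initcond)


temperatures = [-5.6, -5.1, -4.9, -5.3]
warmest = max_agg(temperatures)
```

Here warmest is -4.9, and an empty partition yields the initial condition (None), which corresponds to INITCOND null.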

Lightweight Transactions

INSERT INTO videos (videoid, name, userid, description, location, location_type,
                    preview_thumbnails, tags, added_date)
VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,
        'The data model is dead. Long live the data model.',
        9761d3d7-7fbd-4269-9988-6cfd4e188678,
        'First in a three part series for Cassandra Data Modeling',
        'http://www.youtube.com/watch?v=px6U2n74q3g',
        1,
        {'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},
        {'cassandra','data model','relational','instruction'},
        '2013-05-02 12:30:29')
IF NOT EXISTS;

IF NOT EXISTS: Don’t overwrite!

Lightweight Transactions

CREATE TABLE IF NOT EXISTS videos_by_tag (
  tag text,
  videoid uuid,
  added_date timestamp,
  name text,
  preview_image_location text,
  tagged_date timestamp,
  PRIMARY KEY (tag, videoid)
);

IF NOT EXISTS: No-op. Don’t throw error

Regular Update

UPDATE videos
SET name = 'The data model is dead. Long live the data model.'
WHERE videoid = 06049cbb-dfed-421f-b889-5f649a0de1ed;

Callouts: Table Name; Fields to Update: Not in Primary Key; Primary Key

Lightweight Transactions

UPDATE videos
SET name = 'The data model is dead. Long live the data model.'
WHERE videoid = 06049cbb-dfed-421f-b889-5f649a0de1ed
IF userid = 9761d3d7-7fbd-4269-9988-6cfd4e188678;

IF condition: Don’t overwrite!

Deleting Data

Delete

DELETE FROM videos
WHERE videoid = 06049cbb-dfed-421f-b889-5f649a0de1ed;

Callouts: Table Name; Primary Key: Required

Expiring Data

Time To Live = TTL

INSERT INTO videos (videoid, name, userid, description, location, location_type,
                    preview_thumbnails, tags, added_date)
VALUES (06049cbb-dfed-421f-b889-5f649a0de1ed,
        'The data model is dead. Long live the data model.',
        9761d3d7-7fbd-4269-9988-6cfd4e188678,
        'First in a three part series for Cassandra Data Modeling',
        'http://www.youtube.com/watch?v=px6U2n74q3g',
        1,
        {'YouTube':'http://www.youtube.com/watch?v=px6U2n74q3g'},
        {'cassandra','data model','relational','instruction'},
        '2013-05-02 12:30:29')
USING TTL 2592000;

Expire Data: 30 Days
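
The TTL value is given in seconds, so the 30-day figure can be checked with a little arithmetic. The Python sketch below does that and computes an expiry moment; the helper name and the assumption that the row is written at 2013-05-02 12:30:29 are hypothetical, since the TTL clock starts at write time rather than at added_date.

```python
from datetime import datetime, timedelta

# 30 days * 24 hours * 60 minutes * 60 seconds
TTL_SECONDS = int(timedelta(days=30).total_seconds())


def expires_at(written_at: datetime, ttl_seconds: int = TTL_SECONDS) -> datetime:
    """Moment after which the written cells would no longer be returned."""
    return written_at + timedelta(seconds=ttl_seconds)


# Assumed write time, for illustration only.
expiry = expires_at(datetime(2013, 5, 2, 12, 30, 29))
```

TTL_SECONDS works out to 2592000, the value used in the INSERT, and a row written at the assumed time would expire on 2013-06-01 at 12:30:29.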

Thank you!
Bring the questions

Follow me on twitter @PatrickMcFadin