
Apache Cassandra 2.0 - #Cassandra - GOTO Conference
gotocon.com/.../slides/HayatoShimizu_ApacheCassandra20.pdf

May 13, 2018


duongdieu
Transcript
Page 1:

©2013 DataStax Confidential. Do not distribute without consent.

Apache Cassandra 2.0 - #Cassandra

USE aarhus;
SELECT * FROM presenters WHERE name = 'Hayato Shimizu';

 name           | title               | company  | area
----------------+---------------------+----------+------
 Hayato Shimizu | Solutions Architect | DataStax | EMEA

Page 2:

DataStax and Cassandra

• DataStax is the commercial company behind Apache Cassandra
• Cassandra is a highly distributed database

http://planetcassandra.org
http://www.datastax.com

Page 3:

Five Years of Cassandra

[Timeline figure: releases 0.1, 0.3, 0.6, 0.7, 1.0, 1.2, ..., 2.0, and DSE; axis label: Jul-08]

Page 4:

Core values

• Massive scalability
• High performance
• Reliability/Availability

http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf

Page 5:

New Core Value

• Massive scalability
• High performance
• Reliability/Availability
• Ease of use

CREATE TABLE users (
    id text PRIMARY KEY,
    name text,
    state text,
    birth_date timestamp,
    email text
);

-- email is optional and left unset here
INSERT INTO users (id, name, state, birth_date)
VALUES ('hshimizu', 'Hayato Shimizu', 'Surrey', '1995-01-01');

SELECT * FROM users WHERE id = 'hshimizu';

Page 6:

Cassandra Basics

Page 7:

Data Storage Structure

Keyspace: the_matrix
replication_factor: DC1:3, DC2:3

Table: character_information

Neo
    DOB: 2600-06-27
    Actor: Keanu Reeves
    email1: Neo@matrix
    email2: [email protected] Mr. Anderson

Page 8:

Cassandra Architecture – Data Replication

Token Range: 0 -> 2^127

• C* offers an active-everywhere strategy
• C* offers flexible replication strategies with TUNABLE CONSISTENCY

Consistency levels: One, Two, Three, Quorum, Local Quorum, Each Quorum, All
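
As a rough illustration of how replica placement and consistency levels fit together, the sketch below puts four hypothetical nodes on a token ring and works out how many replicas a single-datacenter request waits for. The node names, the evenly spaced tokens, and both helper functions are assumptions made for this example; a real cluster hashes the partition key with its configured partitioner, and the per-datacenter levels (Local Quorum, Each Quorum) are left out.

```python
import bisect

RING_SIZE = 2 ** 127  # token range from the slide: 0 -> 2^127

# Hypothetical four-node ring with one evenly spaced token per node.
ring = sorted((i * RING_SIZE // 4, f"node{i + 1}") for i in range(4))
tokens = [token for token, _ in ring]


def replicas_for(token, replication_factor):
    """Start at the first node whose token is >= the key's token, then walk clockwise."""
    start = bisect.bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


def required_acks(level, replication_factor):
    """Number of replicas that must respond at a given consistency level."""
    quorum = replication_factor // 2 + 1
    return {"ONE": 1, "TWO": 2, "THREE": 3,
            "QUORUM": quorum, "ALL": replication_factor}[level]


print(replicas_for(RING_SIZE // 3, 3))
print(required_acks("QUORUM", 3))
```

With a replication factor of 3, a quorum is 2, so a quorum write and a quorum read always overlap on at least one replica (2 + 2 > 3).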

Page 9:

Cassandra Architecture - Writes

[Diagram labels: Memtable, SSTable, Commitlog]
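
The three components named on this slide make up the write path: a write is appended to the commit log, applied to the in-memory memtable, and the memtable is later flushed to an immutable SSTable. The toy class below mimics that order entirely in memory. The class name, the flush threshold, and the list-based "files" are assumptions for illustration and do not reflect Cassandra's internals.

```python
class ToyTable:
    def __init__(self, flush_threshold=3):
        self.commitlog = []   # append-only record of writes not yet flushed
        self.memtable = {}    # latest value per key, held in memory
        self.sstables = []    # immutable, key-sorted snapshots ("on disk")
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commitlog.append((key, value))  # 1. log the write for durability
        self.memtable[key] = value           # 2. apply it to the memtable
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # 3. persist the memtable sorted by key, then start a fresh one
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commitlog.clear()  # flushed writes no longer need replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest snapshot first
            for k, v in sstable:
                if k == key:
                    return v
        return None


table = ToyTable(flush_threshold=2)
table.write("a", 1)
table.write("b", 2)   # reaches the threshold and triggers a flush
table.write("a", 3)   # newer value lives only in the memtable
print(table.read("a"), table.read("b"))
```

Because writes never modify an existing SSTable, a read has to consult the memtable first and then the SSTables from newest to oldest.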

Page 10:

Cassandra 1.2

Page 11:

1.2 for Developers

• CQL3
• SQL-like
• Collections – set, list, map
• Data dictionary
• Auth support
• Tracing
• Atomic batches

Page 12:

Authentication

• [cassandra.yaml]
• authenticator: PasswordAuthenticator
• # DSE offers KerberosAuthenticator as well

CREATE USER robin WITH PASSWORD 'manager' SUPERUSER;
ALTER USER cassandra WITH PASSWORD 'newpassword';
DROP USER cassandra;

Page 13:

Authorization

• [cassandra.yaml]
• authorizer: CassandraAuthorizer

GRANT select ON audit TO jonathan;
GRANT modify ON users TO robin;
GRANT all ON ALL KEYSPACES TO lara;

Page 14:

Native Protocol

CQL native protocol: efficient, lightweight, asynchronous

Java (GA): https://github.com/datastax/java-driver
.NET (GA): https://github.com/datastax/csharp-driver
Python (Beta): https://github.com/datastax/python-driver

Coming soon: C++, PHP, Ruby, others

Page 15:

1.2 for Operators

• Virtual nodes
• JBOD improvements
• Off-heap bloom filters, compression metadata
• "Dense node" support (5-10TB/machine)
• Parallel leveled compaction

Page 16:

1.2.5+

• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Removed cell-name bloom filters
• Thread-local allocation
• LZ4 compression (default in 2.0)
• (1.2.7) CQL Input/Output for Hadoop
• (1.2.7) Range tombstone performance
• (1.2.9) Larger default LCS file size (160MB, up from 5MB)

Page 17:

Cassandra 2.0

Page 18:

2.0

• Lightweight transactions
• Triggers (experimental)
• Improved compaction
• CQL cursors
• Streaming rewritten

Page 19:

Lightweight Transactions: the problem

Session 1
SELECT * FROM users
WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (...)
VALUES ('jbellis', ...)

Session 2
SELECT * FROM users
WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (...)
VALUES ('jbellis', ...)
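
The race between the two sessions can be replayed deterministically. In this minimal sketch a dictionary stands in for the users table, and the function names and the "session 1" / "session 2" values are made up for the example; the point is only that a plain check-then-insert lets the later write silently replace the earlier one.

```python
users = {}  # stands in for the users table, keyed by username


def select(username):
    return users.get(username)


def insert(username, owner):
    users[username] = owner  # a plain INSERT is an upsert: the last write wins


# Both sessions check first, and both see an empty result...
seen_by_session_1 = select("jbellis")
seen_by_session_2 = select("jbellis")

# ...so both go ahead and insert.
if seen_by_session_1 is None:
    insert("jbellis", "session 1")
if seen_by_session_2 is None:
    insert("jbellis", "session 2")

print(users)  # only one session's row survives, and neither was told
```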

Page 20:

Paxos

• All operations are quorum-based
• An elected leader sends the participating replicas a prepare message carrying a ballot
• Replicas reply with a promise not to accept proposals with older ballots
• Each replica sends information about unfinished operations to the leader during prepare
• Paxos Made Simple
• Paxos Made Live – An Engineering Perspective
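
The prepare/promise exchange listed above can be sketched with a toy single-decree round. This is a simplified model under stated assumptions, not Cassandra's implementation: the Replica class, run_round, and the integer ballots are hypothetical, everything runs in one process, and the read and commit steps of a lightweight transaction are omitted.

```python
class Replica:
    def __init__(self):
        self.promised = 0        # highest ballot this replica has promised
        self.in_progress = None  # (ballot, value) accepted but not yet committed

    def prepare(self, ballot):
        """Promise to ignore older ballots and report any unfinished proposal."""
        if ballot <= self.promised:
            return None  # reject: a ballot at least this new was already promised
        self.promised = ballot
        return ("promise", self.in_progress)

    def propose(self, ballot, value):
        """Accept the proposal unless a newer ballot has been promised."""
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.in_progress = (ballot, value)
        return True


def run_round(replicas, ballot, value):
    """Return the value accepted by a quorum, or None if the round was lost."""
    quorum = len(replicas) // 2 + 1
    replies = [replica.prepare(ballot) for replica in replicas]
    promises = [reply for reply in replies if reply is not None]
    if len(promises) < quorum:
        return None  # the caller would retry with a higher ballot
    # If any replica reports an unfinished proposal, finish the newest one first.
    unfinished = [state for _, state in promises if state is not None]
    if unfinished:
        value = max(unfinished, key=lambda state: state[0])[1]
    accepted = sum(replica.propose(ballot, value) for replica in replicas)
    return value if accepted >= quorum else None


replicas = [Replica() for _ in range(3)]
print(run_round(replicas, 1, "a"))  # first proposer wins with its own value
print(run_round(replicas, 1, "b"))  # stale ballot: no promises, round is lost
print(run_round(replicas, 2, "b"))  # newer ballot must finish the pending "a"
```

The last call shows why replicas report unfinished operations during prepare: a new leader completes the pending proposal before its own value gets a turn.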

Page 21:

LWT: details

• 4 round trips vs 1 for normal updates
• Paxos state is durable
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0

Page 22:

Using LWT

CREATE TABLE IF NOT EXISTS users (
    username text PRIMARY KEY,
    email text,
    …
);

INSERT INTO users (username, email, ...)
VALUES ('jbellis', '[email protected]', ...)
IF NOT EXISTS;

UPDATE users
SET email = '[email protected]', ...
WHERE username = 'jbellis'
IF email = '[email protected]';
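
The conditional statements above behave like compare-and-set operations: each one reports whether it was applied (Cassandra returns this in an [applied] column). The sketch below imitates only those semantics with a dictionary. The function names and the example.com addresses are hypothetical, and the atomicity that Cassandra obtains from a Paxos round is simply assumed here because the code is single-threaded.

```python
users = {}  # stands in for the users table, keyed by username


def insert_if_not_exists(username, email):
    """Mimics INSERT ... IF NOT EXISTS: returns True only if the row was created."""
    if username in users:
        return False
    users[username] = {"email": email}
    return True


def update_email_if(username, expected_email, new_email):
    """Mimics UPDATE ... IF email = <expected>: applied only when the condition holds."""
    row = users.get(username)
    if row is None or row["email"] != expected_email:
        return False
    row["email"] = new_email
    return True


print(insert_if_not_exists("jbellis", "old@example.com"))  # first writer is applied
print(insert_if_not_exists("jbellis", "other@example.com"))  # second writer is not
print(update_email_if("jbellis", "old@example.com", "new@example.com"))
```

Unlike the plain check-then-insert pattern, the losing session learns that its write was not applied and can react.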

Page 23:

LWT: Use with caution

• Great for 1% of your application
• Eventual consistency is your friend
• http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis

Page 24:

Triggers - Experimental

• CREATE TRIGGER <name> ON <table> USING <classname>;
• Expect changes in 2.1

import java.nio.ByteBuffer;
import java.util.Collection;

import org.apache.cassandra.db.ColumnFamily;
import org.apache.cassandra.db.RowMutation;
import org.apache.cassandra.triggers.ITrigger;

// augment() receives the partition key and the update, and returns any additional mutations to apply
class MyTrigger implements ITrigger
{
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        ...
    }
}

Page 25:

Compaction

• Single-pass, always
• LeveledCompactionStrategy performs size-tiered compaction (SizeTieredCompactionStrategy) in Level 0 when L0 falls behind
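
To give a feel for what size-tiered compaction does with the SSTables in Level 0, here is a simplified sketch of its bucketing step: SSTables of similar size are grouped, and a group becomes a compaction candidate once it has enough members. The function is an illustration under assumptions, not Cassandra's code. Its parameter names follow the STCS options bucket_low, bucket_high, and min_threshold, the sizes are invented, and details such as min_sstable_size and max_threshold are left out.

```python
def size_tiered_buckets(sizes, bucket_low=0.5, bucket_high=1.5, min_threshold=4):
    """Group SSTable sizes into buckets of similar size; return buckets worth compacting."""
    buckets = []  # each bucket is a list of sizes
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if bucket_low * average <= size <= bucket_high * average:
                bucket.append(size)  # close enough to this bucket's average
                break
        else:
            buckets.append([size])   # no similar bucket yet: start a new one
    return [bucket for bucket in buckets if len(bucket) >= min_threshold]


# Hypothetical SSTable sizes in MB: four small ones, two medium, one large.
print(size_tiered_buckets([10, 11, 12, 9, 160, 150, 5000]))
```

Only the four small SSTables form a bucket large enough to compact; the medium and large ones wait until more tables of similar size accumulate.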

Page 26:

Healthy leveled compaction

Page 27:

Sad leveled compaction

Page 28:

STCS in L0

Page 29:

Cursors (before)

CREATE TABLE timeline (
    user_id uuid,
    tweet_id timeuuid,
    tweet_author uuid,
    tweet_body text,
    PRIMARY KEY (user_id, tweet_id)
);

-- Manual paging before 2.0 (illustrative: CQL has no OR, so clients ran these as separate queries)
SELECT *
FROM timeline
WHERE (user_id = :last_key
       AND tweet_id > :last_tweet)
   OR token(user_id) > token(:last_key)
LIMIT 100;

Page 30:

Cursors (after)

SELECT * FROM timeline;
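
With cursors, the page-by-page bookkeeping moves out of the application: the client iterates over the result and further pages are fetched behind the scenes. The generator below sketches that idea over a sorted in-memory list. fetch_page, iterate, and the use of the last row as the paging state are assumptions for illustration and do not show any driver's actual API.

```python
import bisect


def fetch_page(sorted_rows, paging_state, page_size):
    """Stand-in for one round trip: up to page_size rows after paging_state."""
    if paging_state is None:
        start = 0
    else:
        start = bisect.bisect_right(sorted_rows, paging_state)
    return sorted_rows[start:start + page_size]


def iterate(sorted_rows, page_size=100):
    """Yield every row while fetching only one page at a time."""
    paging_state = None
    while True:
        page = fetch_page(sorted_rows, paging_state, page_size)
        if not page:
            return
        yield from page
        paging_state = page[-1]  # resume after the last row already seen


rows = list(range(250))  # hypothetical, already-sorted result set
print(sum(1 for _ in iterate(rows, page_size=100)))  # rows arrive across three pages
```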

Page 31:

Misc. performance improvements

• Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path
• New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor
• Faster partition index lookups and cache reads by improving performance of off-heap memory
• Faster reads of compressed data by switching from CRC32 to Adler checksums
• JEMalloc support for off-heap allocation
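
Both checksum algorithms mentioned in the list are available in Python's standard zlib module, which makes it easy to see that they are interchangeable from the caller's point of view: each maps a block of bytes to a 32-bit value. The snippet is only an illustration of the two functions and says nothing about how Cassandra applies them to compressed chunks; the input bytes are arbitrary.

```python
import zlib

block = b"Wikipedia"  # arbitrary sample data standing in for a compressed chunk

adler = zlib.adler32(block) & 0xFFFFFFFF  # two running sums modulo 65521
crc = zlib.crc32(block) & 0xFFFFFFFF      # polynomial division over the bits

print(f"adler32 = {adler:#010x}")
print(f"crc32   = {crc:#010x}")
```

Adler-32 needs only additions and a modulo per byte, which is the usual reason it is chosen when checksum speed matters more than error-detection strength.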

Page 32:

Spring cleaning

• Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
• The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric
• The on-heap partition cache ("row cache") has been removed
• Vnodes are on by default
• The old token range bisection code for non-vnode clusters is gone
• Removed emergency memory pressure valve logic

Page 33:

Operational concerns

• Java 7 is now required!
• Leveled compaction level information has been moved into sstable metadata
• Kernel page cache skipping has been removed in favor of optional row preheating (preheat_kernel_page_cache)
• Streaming has been rewritten to be more transparent and robust
• Streaming support for old-version sstables

Page 34:

DataStax Enterprise

• Analytics Integration
• Search Integration
• Security Enhancement
• Production Support

Page 35:


http://planetcassandra.org http://www.datastax.com