Cassandra 2.0 2.1 Codebits, Lisbon, April 2014 www.datastax.com @DataStaxEU
Cassandra 2.0 to 2.1

Jan 26, 2015

Johnny Miller

Talk given at Codebits 2014 on Cassandra - features and enhancements in 2.0 and upcoming features in 2.1
Page 1: Cassandra 2.0 to 2.1

Cassandra 2.0 2.1 Codebits, Lisbon, April 2014

www.datastax.com @DataStaxEU

Page 2: Cassandra 2.0 to 2.1

About Me

©2014 DataStax. Do not distribute without consent. @DataStaxEU 2

Johnny Miller Solutions Architect

•  @CyanMiller

•  www.linkedin.com/in/johnnymiller

We are hiring www.datastax.com/careers

@DataStaxCareers

Page 3: Cassandra 2.0 to 2.1

DataStax - Introduction

•  Founded in April 2010

•  We drive Apache Cassandra™

•  400+ customers (25 of the Fortune 100)

•  200+ employees

•  Home to Apache Cassandra™ Chair & most committers

•  Contribute ~ 90% of code into Apache Cassandra™ code base

•  Headquartered in San Francisco Bay area

•  European headquarters established in London

•  Offices in France and Germany

Our Goal

To be the first and best database choice for online applications

Page 4: Cassandra 2.0 to 2.1

Why DataStax?

DataStax supports both the open source community and enterprises.

Open Source/Community

•  Apache Cassandra (employ Cassandra chair and 90+% of the committers)
•  DataStax Community Edition
•  DataStax OpsCenter
•  DataStax DevCenter
•  DataStax Drivers/Connectors
•  Online Documentation
•  Online Training
•  Mailing lists and forums

Enterprise Software

•  DataStax Enterprise Edition
•  Certified Cassandra
•  Built-in Analytics
•  Built-in Enterprise Search
•  Enterprise Security
•  DataStax OpsCenter
•  Expert Support
•  Consultative Help
•  Professional Training

Page 5: Cassandra 2.0 to 2.1

History of Cassandra

Page 6: Cassandra 2.0 to 2.1

Cassandra Adoption

Source: http://db-engines.com/en/ranking, April 2014

Page 7: Cassandra 2.0 to 2.1

Core Values

•  Massive Scalability •  High Performance •  Reliability/Availability

Page 8: Cassandra 2.0 to 2.1

Performance and Scale

“In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.” Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl et al., August 2012. Benchmark paper presented at the Very Large Data Bases (VLDB) conference, 2012. http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf

End Point Independent NoSQL Benchmark

Lowest in latency…

Netflix Cloud Benchmark… Highest in throughput…

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

http://www.datastax.com/wp-content/uploads/2013/02/WP-Benchmarking-Top-NoSQL-Databases.pdf

Page 9: Cassandra 2.0 to 2.1

Performance and Scale

Cassandra works for small to huge deployments.

•  Cassandra @ Netflix
   •  80+ Clusters
   •  2500+ nodes
   •  4 Data Centres (Amazon Regions)
   •  > 1 Trillion transactions per day
•  Cassandra @ eBay
   •  > 250TB of data, dozens of nodes, multiple data centres
   •  > 6 billion writes, > 5 billion reads per day

Source: http://planetcassandra.org

Page 10: Cassandra 2.0 to 2.1

Availability

•  Cassandra was designed with the understanding that system/hardware failures can and do occur

•  Peer-to-peer, distributed system •  All nodes the same – masterless with no single point of failure •  Read/Write-anywhere and across data centres

“Cassandra, our distributed cloud persistence store which is distributed across all zones and regions, dealt with the loss of one third of its regional nodes without any loss of data or availability”. http://techblog.netflix.com/2012/07/lessons-netflix-learned-from-aws-storm.html

“During Hurricane Sandy, we lost an entire data center. Completely. Lost. It. Our application fail-over resulted in us losing just a few moments of serving requests for a particular region of the country, but our data in Cassandra never went offline.” http://planetcassandra.org/blog/post/outbrain-touches-over-80-of-all-us-online-users-with-help-from-cassandra/

Page 11: Cassandra 2.0 to 2.1

Cassandra 1.2

Page 12: Cassandra 2.0 to 2.1

New Core Value

•  Massive Scalability •  High Performance •  Reliability/Availability •  Ease of Use

CREATE TABLE users (
  id uuid PRIMARY KEY,
  name text,
  country text,
  birth_date int
);

CREATE INDEX ON users(country);

SELECT * FROM users
WHERE country = 'Portugal'
  AND birth_date > 1950;

Cluster cluster = Cluster.builder()
    .addContactPoints("10.158.02.40", "10.158.02.44")
    .build();

Session session = cluster.connect("akeyspace");

session.execute(
    "INSERT INTO user (username, password) " +
    "VALUES ('johnny', 'password1234')");

Page 13: Cassandra 2.0 to 2.1

CQL3 Delivers

"Coming from a relational database background we found the transition to Cassandra to be very straightforward. There are a few simple key concepts one must grasp at first but ever since it’s been smooth sailing for us.”

- Boris Wolf, Comcast

Find out more: •  Introduction to CQL3 and Data Modeling

Slides: http://bit.ly/jpm_003, Video: http://bit.ly/jpm_004 [Cassandra Meetup, Helsinki, Feb 2014]

Page 14: Cassandra 2.0 to 2.1

Native Drivers and Protocol

Traditionally, Cassandra clients (Hector, Astyanax¹, etc.) were developed using Thrift. Cassandra 1.2 introduced CQL3 together with the CQL native protocol and native drivers, a new and easier way of using Cassandra.

Why?

•  Easier to develop and model
•  Best practices for building modern distributed applications
•  Integrated tools and experience
•  Enable Cassandra to evolve more easily and support new features

¹ Astyanax is being updated to include the native driver: https://github.com/Netflix/astyanax/wiki/Astyanax-over-Java-Driver

Page 15: Cassandra 2.0 to 2.1

Native Drivers

•  Java •  C# •  Python •  C++ (beta) •  ODBC (beta) •  Clojure •  Erlang •  Node.js •  Ruby •  Plus many, many more….

Get them here: http://www.datastax.com/download

Find out more: •  Going Native With Apache Cassandra

http://bit.ly/jpm_001 [QCon, London 2014]

Page 16: Cassandra 2.0 to 2.1

Asynchronous Read

ResultSetFuture future = session.executeAsync("SELECT * FROM user");

for (Row row : future.get()) {
    String userName = row.getString("username");
    String password = row.getString("password");
}

Note: The future returned implements Guava's ListenableFuture interface. This means you can use all of Guava's Futures¹ methods!

¹ http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/Futures.html

Page 17: Cassandra 2.0 to 2.1

Read with Callbacks

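final ResultSetFuture future =
    session.executeAsync("SELECT * FROM user");

future.addListener(new Runnable() {
    public void run() {
        // The future is already complete when the listener fires, so this does not block.
        // getUninterruptibly() is used because run() cannot throw the checked
        // exceptions declared by get().
        for (Row row : future.getUninterruptibly()) {
            String userName = row.getString("username");
            String password = row.getString("password");
        }
    }
}, executor);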

Page 18: Cassandra 2.0 to 2.1

Parallelize Calls

int queryCount = 99;

List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();

// Fire all queries without waiting for any of them
for (int i = 0; i < queryCount; i++) {
    futures.add(
        session.executeAsync("SELECT * FROM user "
            + "WHERE username = '" + i + "'"));
}

// Collect the results once every query is in flight
for (ResultSetFuture future : futures) {
    for (Row row : future.getUninterruptibly()) {
        // do something
    }
}

Page 19: Cassandra 2.0 to 2.1

Query Tracing

•  You can turn tracing on or off for queries with the TRACING ON | OFF command.

•  This can help you understand what Cassandra is doing and identify any performance problems.

Find out more: •  http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2

Page 20: Cassandra 2.0 to 2.1

Also worth noting…

•  Atomic Batches
•  CQL3 Authentication Support
•  CQL3 Collections Data Type
•  Virtual Nodes (vnodes)
•  JBOD improvements
•  Parallel leveled compaction
•  LZ4 compression

Plus much, much more….

Page 21: Cassandra 2.0 to 2.1

Cassandra 2.0 DataStax Enterprise 4.0

Page 22: Cassandra 2.0 to 2.1

Lightweight Transactions (LWT)

Why?

•  Solve a class of race conditions in Cassandra that you would otherwise need to install an external locking manager to solve.

Syntax:

INSERT INTO customer_account (customerID, customer_email)
VALUES ('Johnny', '[email protected]')
IF NOT EXISTS;

UPDATE customer_account
SET customer_email = '[email protected]'
WHERE customerID = 'Johnny'
IF customer_email = '[email protected]';

Example Use Case: •  Registering a user

Page 23: Cassandra 2.0 to 2.1

Race Condition

Session 1:

SELECT name
FROM users
WHERE username = 'johnny';

(0 rows)

INSERT INTO users
  (username, name, email, password, created_date)
VALUES ('johnny', 'Johnny Miller', ['[email protected]'], 'ba27e03fd9...', '2011-06-20 13:50:00');

Session 2:

SELECT name
FROM users
WHERE username = 'johnny';

(0 rows)

INSERT INTO users
  (username, name, email, password, created_date)
VALUES ('johnny', 'Johnny Miller', ['[email protected]'], 'ea24e13ad9...', '2011-06-20 13:50:01');

This one wins!

Page 24: Cassandra 2.0 to 2.1

Lightweight Transactions

INSERT INTO users
  (username, name, email, password, created_date)
VALUES ('johnny', 'Johnny Miller', ['[email protected]'], 'ba27e03fd9...', '2011-06-20 13:50:00')
IF NOT EXISTS;

 [applied]
-----------
      True

INSERT INTO users
  (username, name, email, password, created_date)
VALUES ('johnny', 'Johnny Miller', ['[email protected]'], 'ea24e13ad9...', '2011-06-20 13:50:01')
IF NOT EXISTS;

 [applied] | username | created_date   | name
-----------+----------+----------------+---------------
     False | johnny   | 2011-06-20 ... | Johnny Miller
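The [applied] result behaves much like an atomic put-if-absent. The following is a minimal single-process analogy in Java, assuming an in-memory map stands in for the users table. Cassandra reaches the same decision across replicas using Paxos, which this sketch does not model, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the users table; not Cassandra code.
final class UserRegistry {
    private final Map<String, String> passwordHashByUsername = new ConcurrentHashMap<>();

    // Mirrors INSERT ... IF NOT EXISTS: returns true ([applied] = True) only
    // for the first caller to claim the username; later callers change nothing.
    boolean registerIfNotExists(String username, String passwordHash) {
        return passwordHashByUsername.putIfAbsent(username, passwordHash) == null;
    }

    // Returns the hash stored by the winning registration, or null if none exists.
    String storedHash(String username) {
        return passwordHashByUsername.get(username);
    }
}
```

Unlike the plain INSERT race on the previous slide, the second registration here is rejected instead of silently overwriting the first.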

Page 25: Cassandra 2.0 to 2.1

Lightweight Transactions

•  Uses the Paxos algorithm
•  All operations are quorum-based, i.e. we can lose nodes and it's still going to work!
•  See Paxos Made Simple - http://bit.ly/paxosmadesimple
•  Consequences of Lightweight Transactions
   •  4 round trips vs. 1 for normal updates
   •  Operations are done on a per-partition basis
   •  Will be going across data centres to obtain consensus
   •  Cassandra user will need read and write access, i.e. you get back the row!

Great for 1% of your app, but eventual consistency is still your friend!

Find out more:
•  http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
•  Eventual Consistency != Hopeful Consistency: http://www.youtube.com/watch?v=A6qzx_HE3EU

Page 26: Cassandra 2.0 to 2.1

Batch Statements and LWT

BEGIN BATCH
  UPDATE foo SET z = 1 WHERE x = 'a' AND y = 1;
  UPDATE foo SET z = 2 WHERE x = 'a' AND y = 2 IF t = 4;
APPLY BATCH;

•  Allows you to group multiple conditional updates in a batch as long as all those updates apply to the same partition

Page 27: Cassandra 2.0 to 2.1

Triggers

CREATE TRIGGER <name> ON <table> USING <classname>;

class MyTrigger implements ITrigger {
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update) {
        ...
    }
}

•  The trigger defined on a table fires before a requested DML statement occurs

•  You place the trigger code in a lib/triggers subdirectory of the Cassandra installation directory

•  A full working example can be found in the Cassandra examples/triggers directory

•  EXPERIMENTAL: Expect changes in Cassandra 2.1

Find out more: •  http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-0-prototype-triggers-support

Page 28: Cassandra 2.0 to 2.1

In-Memory Tables (DataStax Enterprise 4.0)

CREATE TABLE users (
  uid text,
  fname text,
  lname text,
  PRIMARY KEY (uid)
) WITH compaction = {'class': 'MemoryOnlyStrategy', 'size_limit_in_mb': 100}
  AND memtable_flush_period_in_ms = 3600000;

•  We expect that in-memory column families will be on average 20-50% faster with significantly less observed variance on read queries.
•  Great use case is for workloads with a lot of overwrites
•  Caution: more tables = more memory = gc death spiral

Find out more: •  http://www.datastax.com/2014/02/why-we-added-in-memory-to-cassandra

Page 29: Cassandra 2.0 to 2.1

Static Columns

A static column is a special column that is shared by all the rows of the same partition.

CREATE TABLE foo (
  x text,
  y bigint,
  t bigint static,
  z bigint,
  PRIMARY KEY (x, y)
);

INSERT INTO foo (x, y, t, z) VALUES ('a', 1, 1, 10);
INSERT INTO foo (x, y, t, z) VALUES ('a', 2, 2, 20);

SELECT * FROM foo;

 x | y | t | z
---+---+---+----
 a | 1 | 2 | 10
 a | 2 | 2 | 20
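Both rows show t = 2 because the second INSERT overwrote the single value the partition shares. The following is a toy Java model of one partition, assuming a single field for the static column and a sorted map for the clustered rows. The class is hypothetical and says nothing about how Cassandra stores the data on disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model of one partition of table foo; not Cassandra code.
final class FooPartition {
    private Long staticT;                                  // one value shared by every row
    private final Map<Long, Long> zByY = new TreeMap<>();  // clustering key y -> regular column z

    // Mirrors INSERT INTO foo (x, y, t, z): t overwrites the shared static value.
    void insert(long y, long t, long z) {
        staticT = t;
        zByY.put(y, z);
    }

    // Mirrors SELECT *: every row comes back as {y, t, z} with the same static t.
    List<long[]> selectAll() {
        List<long[]> rows = new ArrayList<>();
        for (Map.Entry<Long, Long> row : zByY.entrySet()) {
            rows.add(new long[] {row.getKey(), staticT, row.getValue()});
        }
        return rows;
    }
}
```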

Page 30: Cassandra 2.0 to 2.1

Static Columns

•  Considerations
   •  Use them when you want to store some per-partition “static” information alongside clustered rows and still want to be able to query both of those with a single SELECT.
   •  Only columns that are not part of the PRIMARY KEY can be static.
   •  Only tables with at least one clustering column can have static columns.
   •  Tables with the COMPACT STORAGE option cannot have static columns.

Page 31: Cassandra 2.0 to 2.1

No more CQL2

•  CQL2 is not supported any more.
•  CQL2 has been discouraged for a while, and if you are still using it, do not upgrade until you have rewritten your application to use CQL3.

Page 32: Cassandra 2.0 to 2.1

Clustered columns can be indexed

CREATE TABLE foo (
  a int,
  b int,
  c int,
  PRIMARY KEY (a, b)
);

•  It was previously impossible to create an index on the ‘b’ column, since that column was a special clustered column.

•  This restriction has now been fixed and you can create indexes on clustered columns just as if they were regular CQL columns.

CREATE INDEX ON foo (b);

Page 33: Cassandra 2.0 to 2.1

Conditional create/drop ks/table/index statements in CQL3

•  You can now use IF EXISTS and IF NOT EXISTS conditionals for dropping and creating tables and keyspaces.

Page 34: Cassandra 2.0 to 2.1

Automatic Paging

•  This is great!
•  Historically it has been difficult to get huge result sets out of Cassandra. It has generally been necessary to explicitly enumerate your row keys in reasonably small batches (1000 rows or so per batch would be common).
•  This feature now allows you to get huge result sets (including “select * from table”), and have the server automatically page the results, while the client can trivially iterate over the entire result set.
•  This should remove a very common cause of OOMs (out of memory exceptions), and should make data exploration much easier.

Page 35: Cassandra 2.0 to 2.1

Paging (before)

CREATE TABLE timeline (
  user_id uuid,
  tweet_id timeuuid,
  tweet_author uuid,
  tweet_body text,
  PRIMARY KEY (user_id, tweet_id)
);

SELECT *
FROM timeline
WHERE (user_id = :last_key
       AND tweet_id > :last_tweet)
   OR token(user_id) > token(:last_key)
LIMIT 100

Page 36: Cassandra 2.0 to 2.1

Paging (after)

SELECT * FROM timeline;
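With automatic paging the client simply iterates, and pages are fetched behind the scenes as needed. The following is a minimal Java sketch of that idea, assuming a hypothetical PageSource interface that returns pages by offset. The real protocol resumes from an opaque paging state instead of an offset, and with the DataStax Java driver the page size is controlled through the statement's fetch size.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical source of pages; a real driver would ask the server for the next page.
interface PageSource<T> {
    // Returns up to pageSize items starting at offset, or an empty list when exhausted.
    List<T> fetch(int offset, int pageSize);
}

// Iterates over an arbitrarily large result while holding only one page in memory.
final class PagingIterator<T> implements Iterator<T> {
    private final PageSource<T> source;
    private final int pageSize;
    private Iterator<T> currentPage = Collections.<T>emptyList().iterator();
    private int offset = 0;
    private boolean exhausted = false;

    PagingIterator(PageSource<T> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one has been fully consumed.
        while (!currentPage.hasNext() && !exhausted) {
            List<T> page = source.fetch(offset, pageSize);
            offset += page.size();
            exhausted = page.size() < pageSize;   // a short page means no more data
            currentPage = page.iterator();
        }
        return currentPage.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return currentPage.next();
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}
```

Because only one page is held at a time, memory use depends on the page size and not on the size of the table, which is why this removes a common cause of OOMs.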

Page 37: Cassandra 2.0 to 2.1

Thrift

•  Replace Thrift HsHa with LMAX Disruptor based implementation

•  Because of the substantial changes at the Thrift transport layer, be sure to update your app to use Thrift clients compatible with Cassandra 2.0, and test your application thoroughly before going to production.

Page 38: Cassandra 2.0 to 2.1

Streaming

•  This is a major rewrite of the Cassandra streaming protocol, and should be much more robust and reliable than the previous implementation.

•  It includes: •  several performance optimizations

•  multiple parallel sstable streaming

•  better logging

•  more metrics

Page 39: Cassandra 2.0 to 2.1

Reduce request latency with rapid retry protection/eager retries

•  This should substantially help with your 95%-99% latency.
•  By rapidly detecting that a query was sent to a slow node, this feature will greatly speed up performing a retry on another node.
•  There is new metadata associated with each table: speculative_retry='99.0PERCENTILE' //default
•  Be careful – retries will have an effect on what throughput you can achieve in your cluster.

ALTER TABLE users WITH speculative_retry = '10ms';

ALTER TABLE users WITH speculative_retry = '99percentile';
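The idea behind an eager retry can be sketched in a few lines of Java (8 or later). This is a simplified illustration, not Cassandra's coordinator code. It assumes two hypothetical suppliers standing in for replica reads and a fixed threshold in milliseconds, playing the role of a setting such as '10ms'.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical coordinator-side sketch of a speculative (eager) retry; not Cassandra code.
final class SpeculativeRead {
    // Asks the primary replica first. If no answer has arrived after thresholdMillis,
    // the backup replica is asked as well and the first answer to arrive wins.
    static <T> T read(Supplier<T> primary, Supplier<T> backup, long thresholdMillis) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        try {
            CompletableFuture<T> winner = new CompletableFuture<>();
            CompletableFuture.supplyAsync(primary, pool).thenAccept(winner::complete);
            pool.schedule(() -> {
                if (!winner.isDone()) {
                    // The extra request is the cost: it adds load to the cluster.
                    CompletableFuture.supplyAsync(backup, pool).whenComplete((value, error) -> {
                        if (error != null) {
                            winner.completeExceptionally(error);
                        } else {
                            winner.complete(value);
                        }
                    });
                }
            }, thresholdMillis, TimeUnit.MILLISECONDS);
            return winner.join();
        } finally {
            pool.shutdownNow();   // cancels the timer and interrupts a still-running slow read
        }
    }
}
```

A low threshold trims tail latency but sends more duplicate requests, which is the throughput trade-off the slide warns about.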

Page 40: Cassandra 2.0 to 2.1

Official way to disable compactions

•  nodetool disableautocompaction •  nodetool enableautocompaction

Page 41: Cassandra 2.0 to 2.1

Remove row-level bloom filters

•  This should be a largely invisible change since there was never a noticeable performance improvement from having these bloom filters.

•  However, you will see a reduction in memory usage as a result.

Page 42: Cassandra 2.0 to 2.1

add default_time_to_live

•  This has been a long-requested Cassandra feature and makes auto-expiring data easier.

•  You can have a single per-table TTL that will always be set unless overridden by the client.

•  It also allows for significant performance optimizations on the server side.
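The override rule can be written down as a small Java sketch. This is a toy model with hypothetical names. It assumes a table default of 0 means "never expire" and that a write without a TTL is represented by null. How an explicit client TTL of 0 is treated is an assumption here and may differ between Cassandra versions.

```java
// Hypothetical toy model of how a per-table default TTL combines with a client-supplied TTL.
final class TtlResolver {
    static final int NO_TTL = 0;   // 0 means "never expire" in this sketch

    // clientTtlSeconds is null when the write carries no TTL of its own;
    // otherwise the client value overrides the table default.
    static int effectiveTtlSeconds(int tableDefaultTtlSeconds, Integer clientTtlSeconds) {
        return clientTtlSeconds != null ? clientTtlSeconds : tableDefaultTtlSeconds;
    }

    // Expiry time of a cell written at writeTimeSeconds, or Long.MAX_VALUE if it never expires.
    static long expiresAtSeconds(long writeTimeSeconds, int ttlSeconds) {
        return ttlSeconds == NO_TTL ? Long.MAX_VALUE : writeTimeSeconds + ttlSeconds;
    }
}
```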

Page 43: Cassandra 2.0 to 2.1

New network topology snitch for mixed ec2/other envs

•  There is a new snitch(YamlFileNetworkTopologySnitch) and a new yaml file (cassandra-topology.yaml) that will be used if you select it.

•  This snitch should probably be used for any cluster that spans both EC2 as well as non-EC2 environments.

Page 44: Cassandra 2.0 to 2.1

Removed compatibility with pre-1.2.5 sstables and network messages

•  This is very important as it means that you must upgrade to Cassandra 1.2.6 (or equivalent DSE) or later before upgrading to Cassandra 2.0.x or DSE 4.0.x.

Page 45: Cassandra 2.0 to 2.1

Improve memory use defaults

•  Memtables now use ¼ your heap by default instead of ⅓.

•  Additionally, the write timeout has been dramatically lowered to 2 seconds from 10 seconds, and the read timeout has been changed to 5 seconds.

Page 46: Cassandra 2.0 to 2.1

add SHOW SESSION <tracing-session> command

•  If you aren’t already using tracing to debug your dev and production clusters, then start doing so.

•  It’s one of the most powerful tools that you have at your disposal to understand what is going on.

•  This lets you explicitly specify which session you want to display the output for.

•  Previously you would have had to manually query it from the system_traces.sessions and system_traces.events tables.

Page 47: Cassandra 2.0 to 2.1

Single-pass compaction

•  This should noticeably improve the performance of compaction since Cassandra no longer has to read through each sstable twice.

Page 48: Cassandra 2.0 to 2.1

Compact hottest sstables first and optionally omit coldest from compaction entirely

•  Read-coldness (how [in]frequently a row is read) is now used in consideration of compaction.

•  If you have a lot of cold data, this could greatly reduce the amount of unnecessary re-compaction.

Page 49: Cassandra 2.0 to 2.1

Leveled compaction performs size- tiered compactions in L0

•  If LCS gets behind, read performance deteriorates as we have to check bloom filters on many sstables in L0.

•  For wide rows, this can mean having to seek for each one since the BF doesn't help us reject much.

•  Performing size-tiered compaction in L0 will mitigate this until we can catch up on merging it into higher levels

Page 50: Cassandra 2.0 to 2.1

New CQL-aware SSTableWriter

•  Prior to Cassandra 2.0.4, it was possible to write SSTables for CQL3 tables, but only with a lot of difficulty.
•  Particularly with complex schemas, this is very complicated and error-prone, and should be deprecated as an approach.
•  Instead the new CQL3-aware SSTableWriter should be used:

String schema = "CREATE TABLE foo (c1 int, c2 text, c3 float, PRIMARY KEY (c1, c2))";
String insert = "INSERT INTO foo (c1, c2, c3) VALUES (?, ?, ?)";
CQLSSTableWriter writer = CQLSSTableWriter.builder()
    .inDirectory("path/to/directory")   // output directory; this path is a placeholder
    .forTable(schema)                   // "for" is a reserved word in Java, so the builder method is forTable
    .using(insert)
    .build();

writer.addRow(3, "foo", 2.3f);
writer.addRow(1, "bar", 0.0f);
writer.close();

Page 51: Cassandra 2.0 to 2.1

Plus more….

•  Java 7 is now required!
•  Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path
•  Faster partition index lookups and cache reads by improving performance of off-heap memory
•  Faster reads of compressed data by switching from CRC32 to Adler checksums
•  JEMalloc support for off-heap allocation
•  The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric
•  The on-heap partition cache (“row cache”) has been removed
•  Vnodes are on by default in Cassandra (off by default in DataStax Enterprise).

And more……

Page 52: Cassandra 2.0 to 2.1

Find out more…

•  Cassandra 2.0 documentation http://www.datastax.com/documentation/cassandra/2.0/

•  DataStax Enterprise 4.0 documentation http://www.datastax.com/documentation/datastax_enterprise/4.0/

•  What’s new in Cassandra 2.0 http://www.datastax.com/wp-content/uploads/2013/09/WP-DataStax-WhatsNewC2.0.pdf

•  New CQL features in Cassandra 2.0.6 http://www.datastax.com/dev/blog/cql-in-2-0-6

•  What’s under the hood in Cassandra 2.0 http://www.datastax.com/dev/blog/whats-under-the-hood-in-cassandra-2-0

•  Facebook’s Cassandra paper, annotated and compared to Apache Cassandra 2.0 http://www.datastax.com/documentation/articles/cassandra/cassandrathenandnow.html

Page 53: Cassandra 2.0 to 2.1

Cassandra 2.1

Page 54: Cassandra 2.0 to 2.1

User Defined Types

CREATE TYPE address (
  street text,
  city text,
  zip_code int,
  phones set<text>
);

CREATE TABLE users (
  id uuid PRIMARY KEY,
  name text,
  addresses map<text, address>
);

SELECT id, name, addresses.city, addresses.phones FROM users;

 id       | name   | addresses.city | addresses.phones
----------+--------+----------------+------------------------------
 63bf691f | johnny | London         | {'0201234567', '0796622222'}

Page 55: Cassandra 2.0 to 2.1

User Defined Types

Considerations

•  You cannot update only parts of a UDT value; you have to overwrite the whole thing every time (limitation in the current implementation, may change).
•  A UDT value is always read in its entirety under the hood (as of the current implementation at least).
•  UDTs are not meant to store large and complex "documents" as of their current implementation, but rather to help make the denormalization of small amounts of data more convenient and flexible.

•  It is possible to use a UDT as type of any CQL column, including clustering ones.

Find out more: •  http://www.datastax.com/dev/blog/cql-in-2-1

Page 56: Cassandra 2.0 to 2.1

Secondary indexes on collections

CREATE TABLE songs (
  id uuid PRIMARY KEY,
  artist text,
  album text,
  title text,
  data blob,
  tags set<text>
);

CREATE INDEX song_tags_idx ON songs(tags);

SELECT * FROM songs WHERE tags CONTAINS 'blues';

 id       | album         | artist            | tags                  | title
----------+---------------+-------------------+-----------------------+------------------
 5027b27e | Country Blues | Lightnin' Hopkins | {'acoustic', 'blues'} | Worrying My Mind

Page 57: Cassandra 2.0 to 2.1

Secondary indexes on map keys

•  If you prefer indexing the map keys, you can do so by creating a KEYS index and by using CONTAINS KEY

CREATE TABLE products (
  id int PRIMARY KEY,
  description text,
  price int,
  categories set<text>,
  features map<text, text>
);

CREATE INDEX feat_key_index ON products(KEYS(features));

SELECT id, description
FROM products
WHERE features CONTAINS KEY 'refresh-rate';

 id    | description
-------+-----------------------------
 34134 | 120-inch 1080p 3D plasma TV

Page 58: Cassandra 2.0 to 2.1

Counters++

•  simpler implementation, no more edge cases
•  possible to properly repair now
•  significantly less garbage and internode traffic generated
•  better performance for 99% of uses

Page 59: Cassandra 2.0 to 2.1

Row Cache

CREATE TABLE notifications (
  target_user text,
  notification_id timeuuid,
  source_id uuid,
  source_type text,
  activity text,
  PRIMARY KEY (target_user, notification_id)
)
WITH CLUSTERING ORDER BY (notification_id DESC)
  AND caching = 'rows_only'
  AND rows_per_partition_to_cache = '3';

Page 60: Cassandra 2.0 to 2.1

Thrift post-Cassandra 2.1

•  There is a proposal to freeze thrift starting with 2.1.0 •  http://bit.ly/freezethrift

•  Will retain it for backwards compatibility, but no new features or changes to the Thrift API after 2.1.0

“CQL3 is almost two years old now and has proved to be the better API that Cassandra needed. CQL drivers have caught up with and passed the Thrift ones in terms of features, performance, and usability. CQL is easier to learn and more productive than Thrift.” - Jonathan Ellis, Apache Cassandra Chair

Page 61: Cassandra 2.0 to 2.1

2.1 Roadmap

•  Beta1 - 20th Feb
•  Beta2 - ?
•  RC - ?
•  Final release currently planned for mid-2014

Page 62: Cassandra 2.0 to 2.1

Find Out More

DataStax:
•  http://www.datastax.com
Getting Started:
•  http://www.datastax.com/documentation/gettingstarted/index.html
Training:
•  http://www.datastax.com/training
Downloads:
•  http://www.datastax.com/download
Documentation:
•  http://www.datastax.com/docs
Developer Blog:
•  http://www.datastax.com/dev/blog
Community Site:
•  http://planetcassandra.org
Webinars:
•  http://planetcassandra.org/Learn/CassandraCommunityWebinars
