DataStax NYC Java Meetup: Cassandra with Java

Jan 26, 2015

DataStax presentation at NYC Java Meetup on June 30, 2014 on using Apache Cassandra via Java
Transcript
Page 1: DataStax NYC Java Meetup: Cassandra with Java

Confidential

Apache Cassandra & Java

Caroline George

DataStax Solutions Engineer

June 30, 2014

Page 2: DataStax NYC Java Meetup: Cassandra with Java

Agenda

• Cassandra Overview

• Cassandra Architecture

• Cassandra Query Language

• Interacting with Cassandra using Java

• About DataStax

Page 3: DataStax NYC Java Meetup: Cassandra with Java

CASSANDRA OVERVIEW

Page 4: DataStax NYC Java Meetup: Cassandra with Java

Who is using DataStax?

Collections / Playlists

Recommendation / Personalization

Fraud detection

Messaging

Internet of Things / Sensor data

Page 5: DataStax NYC Java Meetup: Cassandra with Java

What is Apache Cassandra?

Apache Cassandra™ is a massively scalable NoSQL database.

• Continuous availability
• High performing writes and reads
• Linear scalability
• Multi-data center support

Page 6: DataStax NYC Java Meetup: Cassandra with Java

The NoSQL Performance Leader

Netflix Cloud Benchmark…
Source: Netflix Tech Blog

End Point Independent NoSQL Benchmark
Highest in throughput…
Lowest in latency…

“In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.”
Source: Solving Big Data Challenges for Enterprise Application Performance Management, benchmark paper presented at the Very Large Database Conference, 2013.

Page 7: DataStax NYC Java Meetup: Cassandra with Java

Cassandra is Fault Tolerant

[Figure: two clients connected to a ring of eight nodes with tokens 10, 20, 30, 40, 50, 60, 70, 80]

Replication Factor = 3

Token  Order_id  Qty  Sale
70     1001      10   100
44     1002      5    50
15     1003      30   200

If a node fails or goes down temporarily, we can still retrieve the data from the other 2 nodes.
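The fault-tolerance idea on this slide can be sketched in a few lines of plain Java. This is a toy model, not Cassandra's code: the class name `TokenRingSketch` and its methods are hypothetical, and it assumes the simplest placement rule (the first replica is the first node whose token is greater than or equal to the row token, and the remaining replicas are the next nodes clockwise).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of replica placement on a token ring (hypothetical, not Cassandra's code). */
class TokenRingSketch {
    private final TreeSet<Integer> nodeTokens = new TreeSet<>();
    private final int replicationFactor;

    TokenRingSketch(List<Integer> tokens, int replicationFactor) {
        this.nodeTokens.addAll(tokens);
        this.replicationFactor = replicationFactor;
    }

    /** Nodes (identified by their token) that hold a copy of the row with the given token. */
    List<Integer> replicasFor(int rowToken) {
        List<Integer> replicas = new ArrayList<>();
        // First replica: the first node whose token is >= the row token, wrapping around the ring.
        Integer node = nodeTokens.ceiling(rowToken);
        if (node == null) node = nodeTokens.first();
        int copies = Math.min(replicationFactor, nodeTokens.size());
        while (replicas.size() < copies) {
            replicas.add(node);
            // Assumption: the remaining replicas are simply the next nodes clockwise.
            node = nodeTokens.higher(node);
            if (node == null) node = nodeTokens.first();
        }
        return replicas;
    }

    /** Replicas that can still serve the row while the given nodes are down. */
    List<Integer> liveReplicasFor(int rowToken, Set<Integer> downNodes) {
        List<Integer> live = new ArrayList<>(replicasFor(rowToken));
        live.removeAll(downNodes);
        return live;
    }
}
```

With the eight-node ring from the slide and a replication factor of 3, the row with token 44 lands on the nodes with tokens 50, 60 and 70. If node 50 goes down, `liveReplicasFor` still returns nodes 60 and 70, which is the "other 2 nodes" case described above.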

Page 8: DataStax NYC Java Meetup: Cassandra with Java

Multi Data Center Support

[Figure: two rings of eight nodes each, labeled West Data Center and East Data Center; one ring uses tokens 10 through 80, the other tokens 15 through 85; a client connects to each ring]

Data center outage occurs: no interruption to the business

Page 9: DataStax NYC Java Meetup: Cassandra with Java

Writes in Cassandra

Data is organized into Partitions

1. Data is written to a Commit Log for a node (durability)

2. Data is written to MemTable (in memory)

3. MemTables are flushed to disk in an SSTable based on size.

SSTables are immutable

[Figure: a client write goes to the Commit Log and to the MemTable in memory; MemTables are flushed to disk as SSTables]
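The three write steps above can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the class `WritePathSketch`, the row-count flush threshold (Cassandra flushes based on size) and the simplified read path are hypothetical and do not mirror Cassandra's internals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the write path: commit log, memtable, immutable SSTables (hypothetical). */
class WritePathSketch {
    private final int flushThreshold;                         // assumption: flush after this many rows
    private final List<String> commitLog = new ArrayList<>();
    private TreeMap<String, String> memtable = new TreeMap<>();
    private final List<Map<String, String>> sstables = new ArrayList<>();

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String partitionKey, String value) {
        commitLog.add(partitionKey + "=" + value);            // 1. append to the commit log (durability)
        memtable.put(partitionKey, value);                    // 2. write to the memtable (in memory)
        if (memtable.size() >= flushThreshold) flush();       // 3. flush once the memtable is large enough
    }

    private void flush() {
        // The flushed table is wrapped read-only: SSTables are never modified after they are written.
        sstables.add(Collections.unmodifiableMap(memtable));
        memtable = new TreeMap<>();
        commitLog.clear();                                    // flushed data no longer needs the log
    }

    /** Simplified read: memtable first, then SSTables from newest to oldest. */
    String read(String partitionKey) {
        if (memtable.containsKey(partitionKey)) return memtable.get(partitionKey);
        for (int i = sstables.size() - 1; i >= 0; i--) {
            String value = sstables.get(i).get(partitionKey);
            if (value != null) return value;
        }
        return null;
    }

    int sstableCount() { return sstables.size(); }

    int memtableSize() { return memtable.size(); }
}
```

Because flushed tables are immutable, an update to an existing key never rewrites an SSTable. The new value goes into the current memtable and shadows the older copy on disk.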

Page 10: DataStax NYC Java Meetup: Cassandra with Java

Tunable Data Consistency

Page 11: DataStax NYC Java Meetup: Cassandra with Java

Built for Modern Online Applications

• Architected for today’s needs
• Linear scalability at lowest cost
• 100% uptime
• Operationally simple

Page 12: DataStax NYC Java Meetup: Cassandra with Java

Cassandra Query Language

Page 13: DataStax NYC Java Meetup: Cassandra with Java

CQL - DevCenter

A SQL-like query language for communicating with Cassandra

DataStax DevCenter – a free, visual query tool for creating and running CQL statements against Cassandra and DataStax Enterprise.

Page 14: DataStax NYC Java Meetup: Cassandra with Java

CQL - Create Keyspace

CREATE KEYSPACE demo
WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'EastCoast' : 3, 'WestCoast' : 2};

Page 15: DataStax NYC Java Meetup: Cassandra with Java

CQL - Basics

CREATE TABLE users (
    username text,
    password text,
    create_date timestamp,
    PRIMARY KEY (username, create_date)
) WITH CLUSTERING ORDER BY (create_date DESC);

INSERT INTO users (username, password, create_date)
VALUES ('caroline', 'password1234', '2014-06-01 07:01:00');

SELECT * FROM users
WHERE username = 'caroline' AND create_date = '2014-06-01 07:01:00';

Predicates
On the partition key: = and IN
On the clustering columns: <, <=, =, >=, >, IN

Page 16: DataStax NYC Java Meetup: Cassandra with Java

Collection Data Types

CQL supports columns that contain collections of data.

The collection types include: Set, List and Map.

Favor sets over lists – better performance

CREATE TABLE users (
    username text,
    set_example set<text>,
    list_example list<text>,
    map_example map<int,text>,
    PRIMARY KEY (username)
);

Page 17: DataStax NYC Java Meetup: Cassandra with Java

Plus much more…

Lightweight Transactions

INSERT INTO customer_account (customerID, customer_email)
VALUES ('LauraS', '[email protected]') IF NOT EXISTS;

UPDATE customer_account SET customer_email = '[email protected]'
WHERE customerID = 'LauraS'
IF customer_email = '[email protected]';

Counters

UPDATE UserActions SET total = total + 2
WHERE user = 123 AND action = 'xyz';

Time to live (TTL)

INSERT INTO users (id, first, last)
VALUES ('abc123', 'abe', 'lincoln') USING TTL 3600;

Batch Statements

BEGIN BATCH
    INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3b', 'second user');
    UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2';
    INSERT INTO users (userID, password) VALUES ('user3', 'ch@ngem3c');
    DELETE name FROM users WHERE userID = 'user2';
APPLY BATCH;

Page 18: DataStax NYC Java Meetup: Cassandra with Java

JAVA CODE EXAMPLES

Page 19: DataStax NYC Java Meetup: Cassandra with Java

DataStax Java Driver

• Written for CQL 3.0
• Uses the binary protocol introduced in Cassandra 1.2
• Uses Netty to provide an asynchronous architecture
• Can do asynchronous or synchronous queries
• Has connection pooling
• Has node discovery and load balancing

http://www.datastax.com/download

Page 20: DataStax NYC Java Meetup: Cassandra with Java

Add .JAR Files to Project

The easiest way to do this is with Maven, a software project management tool

Page 21: DataStax NYC Java Meetup: Cassandra with Java

Add .JAR Files to Project

In the pom.xml file, select the Dependencies tab

Click the Add… button in the left column

Enter the DataStax Java driver info

Page 22: DataStax NYC Java Meetup: Cassandra with Java

Connect & Write

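import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Contact points are only used to discover the rest of the cluster
Cluster cluster = Cluster.builder()
    .addContactPoints("10.158.02.40", "10.158.02.44")
    .build();

// Connect to the "demo" keyspace
Session session = cluster.connect("demo");

session.execute(
    "INSERT INTO users (username, password) "
    + "VALUES ('caroline', 'password1234')");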

Note: Cluster and Session objects should be long-lived and re-used

Page 23: DataStax NYC Java Meetup: Cassandra with Java

Read from Table

ResultSet rs = session.execute("SELECT * FROM users");

List<Row> rows = rs.all();

for (Row row : rows) {
    String userName = row.getString("username");
    String password = row.getString("password");
}

Page 24: DataStax NYC Java Meetup: Cassandra with Java

Asynchronous Read

ResultSetFuture future = session.executeAsync("SELECT * FROM users");

// getUninterruptibly() blocks until the result arrives; unlike get(), it throws no checked exceptions
for (Row row : future.getUninterruptibly()) {
    String userName = row.getString("username");
    String password = row.getString("password");
}

Note: The future returned implements Guava's ListenableFuture interface. This means you can use all of Guava's Futures [1] methods!

[1] http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/Futures.html

Page 25: DataStax NYC Java Meetup: Cassandra with Java

Read with Callbacks

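final ResultSetFuture future = session.executeAsync("SELECT * FROM users");

// executor is any java.util.concurrent.Executor on which the callback should run
future.addListener(new Runnable() {
    public void run() {
        // The future is already complete when the listener fires, so this call does not block
        for (Row row : future.getUninterruptibly()) {
            String userName = row.getString("username");
            String password = row.getString("password");
        }
    }
}, executor);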

Page 26: DataStax NYC Java Meetup: Cassandra with Java

Parallelize Calls

int queryCount = 99;

List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();

// Send all queries without waiting for any of them to finish
for (int i = 0; i < queryCount; i++) {
    futures.add(
        session.executeAsync("SELECT * FROM users "
            + "WHERE username = '" + i + "'"));
}

// Collect the results in the order the queries were sent
for (ResultSetFuture future : futures) {
    for (Row row : future.getUninterruptibly()) {
        // do something
    }
}

Page 27: DataStax NYC Java Meetup: Cassandra with Java

Prepared Statements

PreparedStatement statement = session.prepare(
    "INSERT INTO users (username, password) "
    + "VALUES (?, ?)");

BoundStatement bs = statement.bind();

bs.setString("username", "caroline");
bs.setString("password", "password1234");

session.execute(bs);

Page 28: DataStax NYC Java Meetup: Cassandra with Java

Query Builder

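// Requires: import static com.datastax.driver.core.querybuilder.QueryBuilder.eq;
// In Java driver 2.0 and later, the Query class is named Statement.
Query query = QueryBuilder.select().all()
    .from("demo", "users")
    .where(eq("username", "caroline"));

ResultSet rs = session.execute(query);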

Page 29: DataStax NYC Java Meetup: Cassandra with Java

Load Balancing

Determines which node is contacted next once a connection to the cluster has been established

Cluster cluster = Cluster.builder()
    .addContactPoints("10.158.02.40", "10.158.02.44")
    .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("DC1"))
    .build();

Policies are:
• RoundRobinPolicy
• DCAwareRoundRobinPolicy (default)
• TokenAwarePolicy

Page 30: DataStax NYC Java Meetup: Cassandra with Java

RoundRobinPolicy

• Not data-center aware
• Each subsequent request after the initial connection to the cluster goes to the next node in the cluster
• If the node that is serving as the coordinator fails during a request, the next node is used
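The round-robin behavior described on this slide can be illustrated with a small stand-alone sketch. It is not the driver's RoundRobinPolicy source: the class `RoundRobinSketch` and the `downNodes` parameter are hypothetical and only model "take the next node, skip it if it has failed".

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy round-robin coordinator selection (hypothetical, not the driver's implementation). */
class RoundRobinSketch {
    private final List<String> nodes;
    private final AtomicInteger cursor = new AtomicInteger();

    RoundRobinSketch(List<String> nodes) {
        this.nodes = nodes;
    }

    /** Returns the next node in order, moving on to the following one if a node is down. */
    String nextCoordinator(Set<String> downNodes) {
        for (int attempt = 0; attempt < nodes.size(); attempt++) {
            // floorMod keeps the index valid even if the counter overflows
            int index = Math.floorMod(cursor.getAndIncrement(), nodes.size());
            String candidate = nodes.get(index);
            if (!downNodes.contains(candidate)) return candidate;
        }
        throw new IllegalStateException("no node available");
    }
}
```

For nodes A, B and C, successive calls return A, B, C, A and so on. If B is marked down when its turn comes, the request goes to C instead.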

Page 31: DataStax NYC Java Meetup: Cassandra with Java

DCAwareRoundRobinPolicy

• Is data-center aware
• Does a round robin within the local data center
• Only goes to another data center if no node is available to be coordinator in the local data center
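A minimal sketch of the "local data center first" rule, assuming the client already knows which nodes are local and which are remote. The class `DcAwareRoundRobinSketch` and its parameters are hypothetical and only illustrate the selection order; the driver's DCAwareRoundRobinPolicy is more involved.

```java
import java.util.List;
import java.util.Set;

/** Toy data-center-aware round robin (hypothetical, not the driver's implementation). */
class DcAwareRoundRobinSketch {
    private final List<String> localNodes;
    private final List<String> remoteNodes;
    private int localCursor = 0;
    private int remoteCursor = 0;

    DcAwareRoundRobinSketch(List<String> localNodes, List<String> remoteNodes) {
        this.localNodes = localNodes;
        this.remoteNodes = remoteNodes;
    }

    /** Prefers a live local node; falls back to a remote node only if no local node is up. */
    synchronized String nextCoordinator(Set<String> downNodes) {
        // Round robin within the local data center
        for (int i = 0; i < localNodes.size(); i++) {
            String candidate = localNodes.get(Math.floorMod(localCursor++, localNodes.size()));
            if (!downNodes.contains(candidate)) return candidate;
        }
        // Every local node is down: try the other data center
        for (int i = 0; i < remoteNodes.size(); i++) {
            String candidate = remoteNodes.get(Math.floorMod(remoteCursor++, remoteNodes.size()));
            if (!downNodes.contains(candidate)) return candidate;
        }
        throw new IllegalStateException("no node available in any data center");
    }
}
```

Requests alternate between the local nodes as long as at least one of them is up. A remote node is returned only when every local node is marked down.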

Page 32: DataStax NYC Java Meetup: Cassandra with Java

TokenAwarePolicy

• Is aware of where the replicas for a given token live
• Instead of round robin, the client chooses the node that contains the primary replica as the coordinator
• Avoids the extra time taken when an arbitrary node serves as coordinator and then has to contact the nodes holding the replicas
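The token-aware choice can be sketched as a lookup on a sorted map of node tokens. This is an illustration under stated assumptions, not the driver's TokenAwarePolicy: the class `TokenAwareSketch`, the integer tokens and the `fallbackNode` parameter (standing in for whatever a wrapped policy such as round robin would pick) are hypothetical, and the ring is assumed to be non-empty.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy token-aware coordinator choice (hypothetical, not the driver's implementation). */
class TokenAwareSketch {
    private final TreeMap<Integer, String> nodesByToken = new TreeMap<>();

    TokenAwareSketch(Map<Integer, String> nodesByToken) {
        this.nodesByToken.putAll(nodesByToken);
    }

    /** The node that owns the primary replica for a row token. */
    String primaryReplica(int rowToken) {
        // Owner is the first node whose token is >= the row token
        Map.Entry<Integer, String> owner = nodesByToken.ceilingEntry(rowToken);
        if (owner == null) owner = nodesByToken.firstEntry();   // wrap around the ring
        return owner.getValue();
    }

    /** Uses the primary replica as coordinator when it is up, otherwise the fallback choice. */
    String chooseCoordinator(int rowToken, Set<String> downNodes, String fallbackNode) {
        String primary = primaryReplica(rowToken);
        return downNodes.contains(primary) ? fallbackNode : primary;
    }
}
```

When the request goes straight to a node that already holds the row, that node can answer from its own copy instead of first forwarding the request to a replica, which is the saving the last bullet describes.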

Page 33: DataStax NYC Java Meetup: Cassandra with Java

Additional Information & Support

• Community Site (http://planetcassandra.org)
• Documentation (http://www.datastax.com/docs)
• Downloads (http://www.datastax.com/download)
• Getting Started (http://www.datastax.com/documentation/gettingstarted/index.html)
• DataStax (http://www.datastax.com)

Page 34: DataStax NYC Java Meetup: Cassandra with Java

ABOUT DATASTAX

Page 35: DataStax NYC Java Meetup: Cassandra with Java

About DataStax

Founded in April 2010

30 Percent

500+ Customers

Santa Clara, Austin, New York, London

300+ Employees

Page 36: DataStax NYC Java Meetup: Cassandra with Java

DataStax delivers Apache Cassandra to the Enterprise

Certified / Enterprise-ready Cassandra

Visual Management & Monitoring Tools

24x7 Support & Training

Page 37: DataStax NYC Java Meetup: Cassandra with Java

Page 38: DataStax NYC Java Meetup: Cassandra with Java

DSE 4.5

Page 39: DataStax NYC Java Meetup: Cassandra with Java

Thank You!

[email protected]
www.linkedin.com/in/carolinerg
@carolinerg

Follow for more updates all the time: @PatrickMcFadin