Relational Databases to Riak

Jul 21, 2015

Transcript
Page 1: Relational Databases to Riak

This presentation includes information that is confidential and proprietary to Basho Technologies and should not be forwarded or distributed without Basho's prior written consent. © 2014. Basho Technologies, Inc. All Rights Reserved.

Matt Brender Developer Advocate

From Relational to Riak

Page 2: Relational Databases to Riak

Relational databases give us a lot

Page 3: Relational Databases to Riak

•  Relationships
•  Transactions
•  Schemas
•  Ad-hoc queries (SQL)

Page 4: Relational Databases to Riak

RDBMS BE LIKE

=>

Page 5: Relational Databases to Riak

RDBMS BE LIKE

OR

Page 6: Relational Databases to Riak

It’s not about NoSQL

Page 7: Relational Databases to Riak

What we really need..

Page 8: Relational Databases to Riak

is a database that makes my app faster

Page 9: Relational Databases to Riak

Enter Riak

Page 10: Relational Databases to Riak

•  Scalable
•  Key => Value
•  Schema-less
•  Eventually Consistent
•  Highly Available

Page 11: Relational Databases to Riak

•  Key => Value
•  Masterless
•  Schema-less
•  Fault Tolerant
•  High Availability
•  Queries & Search

Page 12: Relational Databases to Riak

Scalable

Riak has a masterless architecture in which every node in a cluster is capable of serving read and write requests.

Requests are routed to nodes using standard load balancing appliances or software like Nginx or HAProxy.

Page 13: Relational Databases to Riak

Scalable

Data is distributed evenly. Instead of requiring manual sharding (partitioning), Riak automatically distributes data across a cluster by hashing each key (the bucket/key combination) with the SHA-1 algorithm, which converts it into a number from:

0 - 1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,976

or

0 - 2^160
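
As a rough illustration of the hashing step, the sketch below maps a bucket/key pair to an integer on the 160-bit ring. It assumes the pair is hashed as a plain "bucket/key" string; Riak encodes the pair differently internally, so the numbers will not match a real cluster. The function name ring_position is hypothetical.

```python
import hashlib

RING_SIZE = 2 ** 160  # SHA-1 digests are 160 bits wide


def ring_position(bucket: str, key: str) -> int:
    """Map a bucket/key pair to an integer in the range [0, 2**160)."""
    # Assumption: hash the plain "bucket/key" string. Riak's own encoding
    # of the pair differs, so this only illustrates the idea.
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


position = ring_position("country", "US")
print(0 <= position < RING_SIZE)
```

Because the hash output is effectively uniform, keys land all over the range regardless of how similar their names are.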

Page 14: Relational Databases to Riak

Scalable

Consistent Hashing – The Ring

0 - 2^160
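
One way to picture the ring: the 0 - 2^160 range is cut into equal partitions, and each partition is owned by a node. The sketch below is a simplification under stated assumptions: 64 partitions, 4 nodes with hypothetical names, round-robin ownership, and 3 replicas placed on consecutive partitions. Riak's actual partition claim and replica placement logic is more involved.

```python
import hashlib

RING_SIZE = 2 ** 160
NUM_PARTITIONS = 64                            # assumed partition count
REPLICAS = 3                                   # assumed number of replicas
NODES = ["node1", "node2", "node3", "node4"]   # hypothetical node names


def partition_for(bucket: str, key: str) -> int:
    """Return the index of the partition that the key hashes into."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position // (RING_SIZE // NUM_PARTITIONS)


def preference_list(bucket: str, key: str) -> list:
    """Return the nodes that would hold replicas of the key."""
    first = partition_for(bucket, key)
    # Walk clockwise around the ring, wrapping at the end.
    partitions = [(first + i) % NUM_PARTITIONS for i in range(REPLICAS)]
    # Round-robin ownership: partition p belongs to node p mod node count.
    return [NODES[p % len(NODES)] for p in partitions]


print(partition_for("country", "US"), preference_list("country", "US"))
```

Adding a node in this model only changes who owns some partitions; the key-to-partition mapping stays the same, which is the point of consistent hashing.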

Page 15: Relational Databases to Riak

•  Linear Scaling: Riak scales in a near-linear fashion, so increasing the number of nodes in a cluster increases the number of reads and writes the cluster can handle in a predictable fashion.

If 10 nodes can serve: 40,000 Writes/Second

Then 20 nodes should serve: 72,000+ Writes/Second
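
The figures above imply a scaling efficiency that can be checked with simple arithmetic. This sketch uses only the slide's numbers; the function name is hypothetical.

```python
def scaling_efficiency(base_nodes, base_rate, new_nodes, new_rate):
    """Ratio of the observed rate to the rate perfect linear scaling would give."""
    ideal_rate = base_rate * new_nodes / base_nodes
    return new_rate / ideal_rate


# 10 nodes at 40,000 writes/second versus 20 nodes at 72,000 writes/second
efficiency = scaling_efficiency(10, 40_000, 20, 72_000)
print(f"{efficiency:.0%} of perfectly linear scaling")
```

Perfect linear scaling would give 80,000 writes/second on 20 nodes, so 72,000+ corresponds to roughly 90% efficiency or better.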

“To enable rapid iteration at scale, Riot moved to Riak to support millions of concurrent players at any moment.”

Scalable

Page 16: Relational Databases to Riak

RELATIONAL SCALABILITY

•  Designed for vertical scale

•  Cost considerations are a key element of vertical scaling

•  Sharding or re-distribution is I/O intensive

A - K L - P Q - Z

Page 17: Relational Databases to Riak

Key => Value

Riak stores data as a combination of keys and values in buckets

•  Keys – simply binary values used to identify Objects.*

•  Values – can be numbers, strings, objects, binaries, etc.

•  Buckets – used to define a virtual namespace for storing Riak objects.

Page 18: Relational Databases to Riak

Key => Value

Riak offers both HTTP and Protocol Buffers APIs. The following HTTP API example uses curl to retrieve a value by key:

curl http://127.0.0.1:8098/types/places/buckets/country/keys/US

{
  "Alpha2_s": "US",
  "Alpha3_s": "USA",
  "EnglishName_s": "United States",
  "NumericCode_i": 840
}

Note: Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.
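
The same HTTP request can be made from application code without a client library. The sketch below uses only the Python standard library and follows the URL layout of the curl example; the helper names object_url and fetch_json are hypothetical, and the fetch assumes a Riak node is reachable at the given host and port.

```python
import json
import urllib.request
from urllib.parse import quote


def object_url(host, bucket_type, bucket, key, port=8098):
    """Build the HTTP URL for an object, mirroring the curl example."""
    return (
        f"http://{host}:{port}"
        f"/types/{quote(bucket_type, safe='')}"
        f"/buckets/{quote(bucket, safe='')}"
        f"/keys/{quote(key, safe='')}"
    )


def fetch_json(url, timeout=5.0):
    """GET the object and decode its JSON body (needs a reachable node)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = object_url("127.0.0.1", "places", "country", "US")
print(url)
# country = fetch_json(url)  # uncomment when a Riak node is running locally
```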

Page 19: Relational Databases to Riak

There is a diverse group of client libraries for Riak that support both the HTTP and Protocol Buffers APIs:

Key => Value

Basho Supported Libraries:
•  Java
•  Ruby
•  Python
•  PHP
•  Erlang
•  .NET
•  Node.js

Community Libraries:
•  C
•  Clojure
•  Go
•  Perl
•  Scala
•  R

Page 20: Relational Databases to Riak

Schemas are not enforced by Riak, but by your application.

Schema-less

You still:
•  Design a schema
•  Denormalize dependent data types

But get:
•  Single reads for common access patterns
•  Richer, simpler data structures
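
Since Riak does not enforce the schema, the application has to check records before writing them. A minimal sketch of that idea, using the field names from the country record in this deck; the COUNTRY_SCHEMA mapping and the validate helper are hypothetical and not part of any Riak client.

```python
# Expected fields and types for a country record (assumed from the example).
COUNTRY_SCHEMA = {
    "Alpha2_s": str,
    "Alpha3_s": str,
    "EnglishName_s": str,
    "NumericCode_i": int,
}


def validate(record, schema):
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"{field} should be {expected_type.__name__}")
    return errors


country = {
    "Alpha2_s": "US",
    "Alpha3_s": "USA",
    "EnglishName_s": "United States",
    "NumericCode_i": 840,
}
print(validate(country, COUNTRY_SCHEMA))
```

Running the check in the write path keeps malformed records out of the bucket, which is the job a relational schema would otherwise do.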

curl http://127.0.0.1:8098/types/places/buckets/country/keys/US

{
  "Alpha2_s": "US",
  "Alpha3_s": "USA",
  "EnglishName_s": "United States",
  "NumericCode_i": 840
}

Page 21: Relational Databases to Riak

Schemas are not enforced by Riak, but by your application.

Schema-less

Application Type    Key                   Value
Session             User/Session ID       Session Data
Advertising         Campaign ID           Ad Content
Logs                Date                  Log File
Sensor              Date, Date/Time       Sensor Updates
User Data           Login, email, UUID    User Attributes
Content             Title, Integer        Text, JSON/XML/HTTP document, images, etc.

Page 22: Relational Databases to Riak

Eventually Consistent

C = Consistency
A = Availability
P = Partition Tolerance

[Diagram: two clients and three DB nodes separated by a network partition]

The CAP theorem states that a distributed system can support at most two of these three properties.

Page 23: Relational Databases to Riak

Eventually Consistent

Read repair takes place on every successful read and updates any replica copies that are out of sync.

Active anti-entropy (AAE) is a background operation that compares Merkle trees of the replicas and repairs any data that has diverged.

Nodes periodically send their current view of the ring state to a randomly selected peer using a gossip protocol.

get(“bucket/key”)
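
Read repair can be sketched as follows: a read collects the copies from the replicas, picks the newest, and writes it back to any replica that is behind. This is a simplification; a plain integer version stands in for Riak's causal metadata, and the Replica class and read_with_repair function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    version: int  # assumption: a simple counter instead of causal metadata
    value: str


def read_with_repair(replicas):
    """Return the newest value and bring stale replicas up to date."""
    newest = max(replicas, key=lambda replica: replica.version)
    for replica in replicas:
        if replica.version < newest.version:
            # Repair the stale copy as a side effect of the read.
            replica.version = newest.version
            replica.value = newest.value
    return newest.value


replicas = [
    Replica("node1", 2, "new"),
    Replica("node2", 1, "old"),
    Replica("node3", 2, "new"),
]
print(read_with_repair(replicas), [r.value for r in replicas])
```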

Page 24: Relational Databases to Riak

Eventually Consistent

Dotted Version Vectors are a tool used by Riak to track the logical sequence of updates to a key/value pair (versus the chronological order of events) and manage the process of merging siblings created as one of the side effects of eventual consistency.

[Diagram: version vector entries such as A:1, B:1, C:1, and C:2, with sibling values C1 and C2]
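
The comparison behind sibling detection can be sketched with plain version vectors, which are a simpler relative of the dotted version vectors Riak uses. Each vector maps an actor to the number of updates seen from it. The function names and the example vectors below are illustrative assumptions.

```python
def descends(a, b):
    """True if vector a has seen every update that vector b records."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify the logical relationship between two version vectors."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    # Neither side has seen all of the other's updates: keep both values.
    return "concurrent (siblings)"


print(compare({"A": 1, "B": 1}, {"A": 1}))
print(compare({"B": 1, "C": 1}, {"A": 1, "B": 1}))
```

When one vector supersedes the other, the older value can be discarded safely. When they are concurrent, both values are kept as siblings until something merges them.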

Page 25: Relational Databases to Riak

> curl http://127.0.0.1:8098/types/places/buckets/country/keys/US
Siblings:
47fGOQwxRzq6wsbM7idvFB
2mJD0DEGoxdxdHUqS3bYt3
7Y68tqVG99xHBDu7AKtmb4

> curl -H "Accept: multipart/mixed" http://127.0.0.1:8098/types/places/buckets/country/keys/US

--RigRoRk6lkPXYIqBOv1jKEacnlr
Content-Type: application/json
Link: </buckets/country>; rel="up"
Etag: 47fGOQwxRzq6wsbM7idvFB
Last-Modified: Wed, 05 Nov 2014 22:44:00 GMT

{"Alpha2_s":"US","Alpha3_s":"USA","EnglishName_s":"United States","NumericCode_i":840}
--RigRoRk6lkPXYIqBOv1jKEacnlr
Content-Type: application/json
Link: </buckets/country>; rel="up"

...

Eventually Consistent

Page 26: Relational Databases to Riak

Riak Data Types (Convergent Replicated Data Types or CRDTs) are a developer-friendly way to keep track of updates in an eventually consistent environment:

•  Map: Supports the nesting of any of the Riak Data Types.

•  Register: A named binary field that can only be used as part of a Map.

•  Counter: Keeps track of increments and decrements on an integer.

•  Flag: Values limited to enable or disable.

•  Set: A collection of unique binary values that supports add and remove operations on one or more values.
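
To show why such data types converge without coordination, here is a minimal counter sketch in the style of a PN-counter. It is not Riak's implementation; the class and its methods are illustrative assumptions. Each actor tracks its own increments and decrements, and merging takes the per-actor maximum.

```python
class PNCounter:
    """Counter that converges when replicas merge in any order."""

    def __init__(self, actor):
        self.actor = actor
        self.increments = {}  # actor -> total increments seen from it
        self.decrements = {}  # actor -> total decrements seen from it

    def increment(self, amount=1):
        self.increments[self.actor] = self.increments.get(self.actor, 0) + amount

    def decrement(self, amount=1):
        self.decrements[self.actor] = self.decrements.get(self.actor, 0) + amount

    def value(self):
        return sum(self.increments.values()) - sum(self.decrements.values())

    def merge(self, other):
        """Fold in another replica's state; safe to repeat."""
        pairs = (
            (self.increments, other.increments),
            (self.decrements, other.decrements),
        )
        for mine, theirs in pairs:
            for actor, count in theirs.items():
                mine[actor] = max(mine.get(actor, 0), count)


a, b = PNCounter("a"), PNCounter("b")
a.increment(3)
b.increment(2)
b.decrement(1)
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

Because merge only ever takes maximums, applying it twice or in a different order gives the same result, so the application never has to resolve siblings for this type.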

Eventually Consistent

Page 27: Relational Databases to Riak

Hinted handoff allows Riak nodes to temporarily take over storage operations for a failed node and update that node with changes when it comes back online.

put(“bucket/key”)
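
Hinted handoff can be pictured with a small in-memory model: when the intended node is down, another node accepts the write along with a hint naming the intended owner, and replays it once that node returns. The Node class and the put and handoff functions below are hypothetical and only mirror the behavior described above.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.data = {}
        self.hints = {}  # intended owner name -> {key: value} held on its behalf


def put(key, value, primary, fallback):
    """Write to the primary, or park the write on the fallback with a hint."""
    if primary.online:
        primary.data[key] = value
    else:
        fallback.hints.setdefault(primary.name, {})[key] = value


def handoff(fallback, recovered):
    """Replay writes held for a recovered node; return how many were sent."""
    pending = fallback.hints.pop(recovered.name, {})
    recovered.data.update(pending)
    return len(pending)


node1, node2 = Node("node1"), Node("node2")
node1.online = False
put("country/US", "United States", primary=node1, fallback=node2)
node1.online = True
print(handoff(node2, node1), node1.data)
```

The write succeeds while node1 is down, which is what keeps the cluster available for writes during a failure.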

High Availability

Page 28: Relational Databases to Riak

RELATIONAL AVAILABILITY

•  Master/Replica Architecture

•  Assumption of Transactional Consistency

•  What happens under failure conditions?

[Diagram: a master with three replicas and a coordination layer; when nodes fail (X), writes and reads WAIT on master coordination]

Page 29: Relational Databases to Riak

Riak automatically replicates between clusters
•  Configurable number of remote replicas
•  Options for real-time sync and full sync
•  Spanning tree support for cascading replication

Geo-Data Locality allows localized data processing
•  Reduced latency to end-users
•  Allows sub 5ms responses
•  Active-Active ensures continuous user experience

High Availability

Riak Multi-Datacenter (MDC) Replication

Page 30: Relational Databases to Riak

IN REVIEW

Page 31: Relational Databases to Riak
Page 32: Relational Databases to Riak

RDBMS BE LIKE

OR

Page 33: Relational Databases to Riak

RELATIONAL & NOSQL: What’s the difference?

Relational Database         NoSQL Database
Structured Data             Unstructured Data
Defined Schema              No pre-defined Schema
Tables with Rows/Columns    Small and Large Data Sets on Commodity HW
Indexed w/ Table Joins      Many Models: K/V, document store, graph
SQL                         Variety of Query Methods

Page 34: Relational Databases to Riak

Biggest change for dev: reads

Page 35: Relational Databases to Riak

Biggest change for ops: administration

Page 36: Relational Databases to Riak

THE COST OF DOWNTIME

Page 37: Relational Databases to Riak

WHAT YOU WILL GAIN

More flexible, fluid designs

More natural data representations

Scaling without pain

Reduced operational complexity

Page 38: Relational Databases to Riak

RIAK DEPLOYED WORLDWIDE

Page 39: Relational Databases to Riak

Questions