DataStax: The Whys of NoSQL

Feb 20, 2017

Page 1: DataStax: The Whys of NoSQL

The Whys of NoSQL

Page 2: DataStax: The Whys of NoSQL

1 Jargon Galore

2 Schema

3 Modeling and Internals

4 Deployment

5 Conclusion

2 © 2015. All Rights Reserved.

Page 3: DataStax: The Whys of NoSQL

©2015 DataStax Confidential. Do not distribute without consent. 3

SQL Jargon

Page 4: DataStax: The Whys of NoSQL


NoSQL Noise?

Page 5: DataStax: The Whys of NoSQL

Schema


Rigid Schema

Schema Free

Schema on read

Schema Easy to change

Inflexible

Writes are schema free, reads are freaking slow

Reads/Writes are schema aware

Schema changes are O(1) operations

BLOBs

Too Slow

Optimized for Agility of change when needed, not theoretical extremes
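To make the "schema changes are O(1) operations" point concrete, here is a minimal sketch of a schema-aware table whose rows are stored sparsely, so adding a column only updates metadata. `ToyTable` and its methods are hypothetical names invented for this illustration; this is not how any particular database implements it.

```python
class ToyTable:
    """Toy table: rows are sparse dicts, the schema is separate metadata."""

    def __init__(self, columns):
        self.columns = set(columns)   # schema metadata
        self.rows = {}                # primary key -> {column: value}

    def add_column(self, name):
        # Metadata-only change: no existing row is touched or rewritten.
        self.columns.add(name)

    def write(self, key, **values):
        # Writes are schema aware: unknown columns are rejected.
        unknown = set(values) - self.columns
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        self.rows.setdefault(key, {}).update(values)

    def read(self, key, column):
        # Reads are schema aware too.
        if column not in self.columns:
            raise ValueError(f"unknown column: {column}")
        # Rows written before the column existed simply have no value for it.
        return self.rows.get(key, {}).get(column)


users = ToyTable(["name"])
users.write("u1", name="Ada")
users.add_column("email")  # cost does not depend on the number of rows
users.write("u2", name="Grace", email="grace@example.com")
```

Reading `email` for `u1` returns `None` because the row predates the column, which is what lets the schema change skip a rewrite of existing data.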

Page 6: DataStax: The Whys of NoSQL


Normalization, Joins, Referential Integrity

Database normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy.

Referential integrity is a property of data which, when satisfied, requires every value of one column of a table to exist as a value of another column in a different table.

A JOIN is a means for combining fields from two tables (or more) by using values common to each.

Source - https://en.wikipedia.org/
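The three definitions on this slide can be seen together in a small sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their contents are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# Normalized layout: the customer name is stored once, orders refer to it by id.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " item TEXT NOT NULL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 'keyboard')")

# JOIN: combine fields from both tables using the shared customer id.
joined = conn.execute(
    "SELECT customers.name, orders.item FROM orders"
    " JOIN customers ON customers.id = orders.customer_id"
).fetchall()

# Referential integrity: an order for a missing customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 'mouse')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Here `joined` holds the single row `('Ada', 'keyboard')`, and `rejected` ends up `True` because customer 99 does not exist.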

Page 7: DataStax: The Whys of NoSQL


Not all Data Access is equal

1:168K random vs. sequential (disk)

1:10 random vs. sequential (memory)

Source - https://queue.acm.org/detail.cfm?id=1563874
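A quick back-of-the-envelope calculation shows what the two ratios on this slide mean in practice. The one-second sequential workload is a hypothetical baseline chosen only to make the arithmetic easy to read; it is not a figure from the cited article.

```python
# Ratios quoted on the slide (random : sequential throughput).
RATIOS = {"1:168K": 168_000, "1:10": 10}


def random_access_seconds(sequential_seconds, slowdown):
    """Time for the same work done with random access, given the slowdown factor."""
    return sequential_seconds * slowdown


for label, slowdown in RATIOS.items():
    seconds = random_access_seconds(1.0, slowdown)  # assumed 1 s sequential baseline
    print(f"{label}: 1 s sequential -> {seconds:,.0f} s random ({seconds / 3600:.1f} h)")
```

At a 1:168K ratio, work that takes one second sequentially takes about 46.7 hours with random access, which is why the access pattern matters more than the raw amount of data.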

Page 8: DataStax: The Whys of NoSQL


Disk Density

Source http://silvertonconsulting.com/blog/2010/04/22/save-the-planet-buy-fatter-disks-and-flash/#sthash.sh2nwqtX.dpbs

Page 9: DataStax: The Whys of NoSQL


Minimize Data Redundancy?

[Chart: HDD price per GB by year, 1980 to 2014; price axis on a log scale from $0.01 to $1,000,000.00]

Page 10: DataStax: The Whys of NoSQL

OS Cache

C* Read and Write paths


Memtable 1 Memtable 2 Memtable N

SSTable 1 SSTable 2 SSTable N

Commit Log

Persistent Storage

Off Heap

In Process Memory

Reads (memtable + N SSTables where N >= 1)

Mandatory Flush

Writes

Max # of SSTables = N (based on compaction)

Creation of new memtable during flush operation (cleanup tombstones, cleanup token ranges, etc.)

Time (memtable_flush_period_in_ms controls the frequency)

Accounting

SSTable Compacted

RANDOM ACCESS

SEQUENTIAL ACCESS
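The write and read paths labelled on this slide can be sketched as a toy log-structured store. This is a simplified model for illustration, not Cassandra's implementation: `ToyLSM`, `memtable_limit`, and the size-based flush trigger are all assumptions, and tombstones, token ranges, and on-disk formats are left out.

```python
class ToyLSM:
    """Toy log-structured store: commit log + memtable + immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.commit_log = []   # sequential, append-only durability log
        self.memtable = {}     # in-memory writes since the last flush
        self.sstables = []     # immutable sorted runs, newest last

    def write(self, key, value):
        # No read before write: append to the log and overwrite in memory.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, immutable run, then start fresh.
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        # Check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            if key in table:
                return table[key]
        return None

    def compact(self):
        # Merge all runs into one; later (newer) runs win on duplicate keys.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))] if merged else []


store = ToyLSM(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]:
    store.write(key, value)
```

After these writes the store holds two SSTables plus a memtable containing `d`, so a read of `a` has to look past the memtable and finds the newer value in the most recent run. Calling `store.compact()` merges the runs into one, which is the cost that keeps reads from touching an ever-growing number of SSTables.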

Page 11: DataStax: The Whys of NoSQL

Execution Engine


Page 12: DataStax: The Whys of NoSQL

Key takeaways


Optimal utilization of physical resources (random access, sequential IO and CPU)

No Read before Write (well mostly!)

Plan for Compaction (like commercial paper, you need a regular pay back)

De-Normalize for optimal application response (use 2NF instead of 3NF)
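As a rough illustration of the de-normalization takeaway, the sketch below answers the same question two ways: with a read-time join over normalized data, and with a single keyed lookup over data that was duplicated at write time. All names and sample data are hypothetical. Copying the customer name into each order introduces a transitive dependency, which is the kind of redundancy 3NF removes and 2NF tolerates.

```python
# Normalized: two "tables" that must be joined at read time.
customers = {1: {"name": "Ada", "city": "Stockholm"}}
orders = [
    {"order_id": 10, "customer_id": 1, "item": "keyboard"},
    {"order_id": 11, "customer_id": 1, "item": "mouse"},
]


def orders_with_names_normalized(customer_id):
    # Read-time join: one lookup per order into the customers table.
    return [
        (customers[o["customer_id"]]["name"], o["item"])
        for o in orders
        if o["customer_id"] == customer_id
    ]


# Denormalized: the customer name is copied into each order at write time,
# and rows are grouped by the key the application queries on.
orders_by_customer = {}


def record_order(customer_id, customer_name, order_id, item):
    orders_by_customer.setdefault(customer_id, []).append(
        {"order_id": order_id, "customer_name": customer_name, "item": item}
    )


def orders_with_names_denormalized(customer_id):
    # Single keyed lookup, no join.
    return [
        (o["customer_name"], o["item"])
        for o in orders_by_customer.get(customer_id, [])
    ]


record_order(1, "Ada", 10, "keyboard")
record_order(1, "Ada", 11, "mouse")
```

Both functions return the same rows for customer 1; the denormalized version trades extra storage and write-time duplication for a read that touches one key.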

Page 13: DataStax: The Whys of NoSQL

Deployment Semantics


R/W R

Single Box DR GR

Scale Up by Sharding

Replication

GR + DR

San Francisco

New York

Stockholm

DC1 DC2

Page 14: DataStax: The Whys of NoSQL

Linear Scaling


http://www.datastax.com/apache-cassandra-leads-nosql-benchmark

End Point Report Excerpt: Balanced Read/Write YCSB Test

Page 15: DataStax: The Whys of NoSQL

So what's the catch?


Page 16: DataStax: The Whys of NoSQL


Conclusion

Best in class performance, backed by physics

Enables pragmatic business agility

Delivering delightful customer experience

Always on

Linear Scale architecture delivering optimal ROI

Page 17: DataStax: The Whys of NoSQL

Thank you