Page 1
NoSQL Distilled: A guide to polyglot persistence
#NoSQLDistilled
@pramodsadalage
ThoughtWorks Inc.
Tuesday, June 11, 13
Page 2
Why RDBMS?
Page 3
ACID Transactions
• Atomicity
• Consistency
• Isolation
• Durability
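Atomicity, the first of the four properties above, can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the accounts table, the names ann and bob, and the helper functions are hypothetical and do not come from the talk.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:  # commits when the block succeeds
        conn.executemany(
            "INSERT INTO accounts (name, balance) VALUES (?, ?)",
            [("ann", 100), ("bob", 20)],
        )
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # rolls back every statement in the block on error
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # An overdraft violates the CHECK constraint and raises here,
            # after the credit above has already been applied.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A failed transfer leaves both balances untouched, even though the credit ran before the debit failed.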
Page 4
Standard Query Interface
Page 5
Interact with many languages
Page 6
Everyone knows SQL
Page 7
Everyone knows SQL
Page 8
Limitless indexing
Page 9
Handles many data models
Page 10
Why NoSQL
Page 11
Schema changes are hard
Page 12
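Impedance mismatch
[Figure: one order as the application sees it, split across four relational tables]

In the application (one aggregate):
ID: 1001
customer: Ann
line items:
  product       quantity   price   total
  0321293533    2          $48     $96
  0321601912    1          $39     $39
  0131495054    1          $51     $51
payment details:
  Card: Amex
  CC Number: 12345
  expiry: 04/2001

In the database (four tables): orders, customers, order lines, credit cards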
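The mismatch can be sketched as the mapping code an application ends up writing. This is a minimal sketch, assuming hypothetical column names (`order_id`, `quantity`, `total`, and so on); only the four table names and the sample values follow the slide.

```python
# One order as the application sees it: a single nested structure.
order = {
    "id": 1001,
    "customer": "Ann",
    "line_items": [
        {"product": "0321293533", "quantity": 2, "price": 48},
        {"product": "0321601912", "quantity": 1, "price": 39},
        {"product": "0131495054", "quantity": 1, "price": 51},
    ],
    "payment": {"card": "Amex", "cc_number": "12345", "expiry": "04/2001"},
}


def to_rows(order: dict) -> dict:
    """Split one order aggregate into flat rows for four relational tables."""
    return {
        "customers": [{"name": order["customer"]}],
        "orders": [{"id": order["id"], "customer": order["customer"]}],
        "order_lines": [
            {
                "order_id": order["id"],
                "product": item["product"],
                "quantity": item["quantity"],
                "price": item["price"],
                "total": item["quantity"] * item["price"],
            }
            for item in order["line_items"]
        ],
        "credit_cards": [{"order_id": order["id"], **order["payment"]}],
    }
```

Reading the order back needs the reverse mapping, with joins across all four tables.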
Page 13
Application vs Integration databases
[Figure: Integration Database: Billing and Inventory share one database. Application Database: Billing and Inventory each have their own database and talk through a web service.]
Page 14
Running on clusters
Page 15
Unstructured data
Page 16
Uneven rate of data growth
Page 17
Domain Models
Page 18
Domain driven data models
Page 19
RDBMS data
Page 20
Aggregate model (Embedding objects)
Page 21
Aggregate Data
// in customers
{
  "customer": {
    "id": 1,
    "name": "Martin",
    "billingAddress": [{"city": "Chicago"}],
    "orders": [
      {
        "id": 99,
        "orderItems": [
          {"productId": 27, "price": 32.45, "productName": "NoSQL Distilled"}
        ],
        "shippingAddress": [{"city": "Chicago"}],
        "orderPayment": [
          {
            "ccinfo": "1000-1000-1000-1000",
            "txnId": "abelif879rft",
            "billingAddress": {"city": "Chicago"}
          }
        ]
      }
    ]
  }
}
Page 22
Aggregate model (Referencing Objects)
Page 23
Aggregate data
// in Customers
{
  "id": 1,
  "name": "Martin",
  "billingAddress": [{"city": "Chicago"}]
}

// in Orders
{
  "id": 99,
  "customerId": 1,
  "orderItems": [
    {"productId": 27, "price": 32.45, "productName": "NoSQL Distilled"}
  ],
  "shippingAddress": [{"city": "Chicago"}],
  "orderPayment": [
    {
      "ccinfo": "1000-1000-1000-1000",
      "txnId": "abelif879rft",
      "billingAddress": {"city": "Chicago"}
    }
  ]
}
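The difference between the embedded and the referenced model can be sketched as a conversion from one to the other. This is a minimal sketch, assuming plain Python dicts in place of a real document store; the helper names `to_referenced` and `orders_for` are hypothetical.

```python
import copy
from typing import Any, Dict, List, Tuple


def to_referenced(embedded: Dict[str, Any]) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Split an embedded customer aggregate into a customer document and
    separate order documents that point back to it through customerId."""
    customer = copy.deepcopy(embedded["customer"])
    orders = customer.pop("orders", [])
    for order in orders:
        order["customerId"] = customer["id"]
    return customer, orders


def orders_for(customer_id: int, orders: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """With references, loading a customer's orders takes a second lookup."""
    return [order for order in orders if order["customerId"] == customer_id]
```

With embedding, one read returns the customer and all orders. With references, orders can be read on their own, and the application resolves `customerId` itself.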
Page 24
Aggregate Orientation
Page 25
RDBMSs have no concept of aggregates
Page 26
Aggregates reduce the need for ACID
Page 27
Better for clusters, can be distributed easily
Page 28
Key-Value
Document
Column-Family
Page 29
Key Value Databases
Page 30
Key-Value Database
• One key, one value
• Value is opaque to the database
• Like a hash
• Some are distributed

Oracle      Riak
instance    cluster
table       bucket
row         key-value
row-id      key
Page 31
“key” (VIN)    “value” (car facts)
JTTDR…         make#Ford model#Mustang year#2011 …
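The key-value model can be sketched as a bucket of opaque byte strings. This is a minimal in-memory sketch, not Riak's API: the class `KeyValueStore`, the helpers, and the key "VIN-0001" are hypothetical, and the `name#value` encoding assumes values contain no spaces.

```python
from typing import Dict, Optional


class KeyValueStore:
    """In-memory stand-in for a key-value database: bucket -> key -> bytes."""

    def __init__(self) -> None:
        self._buckets: Dict[str, Dict[str, bytes]] = {}

    def put(self, bucket: str, key: str, value: bytes) -> None:
        # The store never looks inside the value; it only files it by key.
        self._buckets.setdefault(bucket, {})[key] = value

    def get(self, bucket: str, key: str) -> Optional[bytes]:
        return self._buckets.get(bucket, {}).get(key)

    def delete(self, bucket: str, key: str) -> None:
        self._buckets.get(bucket, {}).pop(key, None)


def encode_facts(facts: Dict[str, str]) -> bytes:
    """Application-side encoding, e.g. b'make#Ford model#Mustang'."""
    return " ".join(f"{name}#{value}" for name, value in facts.items()).encode("utf-8")


def decode_facts(blob: bytes) -> Dict[str, str]:
    """Application-side decoding; only the application knows this format."""
    return dict(pair.split("#", 1) for pair in blob.decode("utf-8").split())
```

Because the value is opaque, finding all Fords would mean fetching and decoding every value in the application.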
Page 32
Document Databases
Page 33
Document Database
• One key, one value
• Value is visible to the database
• Value can be queried
• JSON/XML documents

Oracle      MongoDB
instance    mongod
schema      database
table       collection
row         document
row_id      _id
Page 34
“id” (VIN)    “document” (car facts)
JTTDR…        {… “make”: “Ford”, “model”: “Mustang”, “year”: 2011, … }
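Because the database can see inside a document, it can answer queries on the document's fields. This is a minimal in-memory sketch of that idea, not MongoDB's API: the class `Collection`, its methods, and the `VIN-…` ids are hypothetical.

```python
import copy
from typing import Any, Dict, List, Optional


class Collection:
    """In-memory stand-in for a document collection keyed by _id."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def insert(self, doc: Dict[str, Any]) -> None:
        self._docs[doc["_id"]] = copy.deepcopy(doc)

    def get(self, doc_id: str) -> Optional[Dict[str, Any]]:
        doc = self._docs.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def find(self, criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Return documents whose fields equal every value in criteria."""
        return [
            copy.deepcopy(doc)
            for doc in self._docs.values()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]
```

Documents in the same collection do not need the same fields; a query simply skips documents that lack a matching field.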
Page 35
Column-Family Databases
Page 36
Column-Family Database
• Data organized as columns
• Each row has a row key
• Columns have versioned data
• Row data is sorted by column name

Oracle                        Cassandra
instance                      cluster
database                      keyspace
table                         column-family
row                           row
columns same for every row    columns can be different for each row
Page 37
“id” (VIN)    “column families” (car facts)
JTTDR…        {… “car”:{“make”: “Ford”, “model”: “Focus”…} “service”:{…} }
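The column-family layout can be sketched as nested maps: row key, then column family, then column name, then a list of versions. This is a minimal in-memory sketch, not Cassandra's API: the class `ColumnFamilyStore`, its methods, the logical clock, and the `VIN-…` row keys are hypothetical.

```python
import itertools
from collections import defaultdict
from typing import Any, List, Optional, Tuple


class ColumnFamilyStore:
    """row key -> column family -> column name -> [(timestamp, value), ...]"""

    def __init__(self) -> None:
        self._rows = defaultdict(lambda: defaultdict(dict))
        self._clock = itertools.count(1)  # logical timestamps for versions

    def put(self, row_key: str, family: str, column: str, value: Any) -> None:
        # Each write appends a new version instead of overwriting the old one.
        versions = self._rows[row_key][family].setdefault(column, [])
        versions.append((next(self._clock), value))

    def get(self, row_key: str, family: str, column: str) -> Optional[Any]:
        """Return the newest version of a column, or None if absent."""
        versions = self.versions(row_key, family, column)
        return versions[-1][1] if versions else None

    def versions(self, row_key: str, family: str, column: str) -> List[Tuple[int, Any]]:
        return list(self._rows.get(row_key, {}).get(family, {}).get(column, []))

    def columns(self, row_key: str, family: str) -> List[str]:
        """Column names of one row, sorted by name; rows may differ."""
        return sorted(self._rows.get(row_key, {}).get(family, {}))
```

Two rows in the same column family can hold entirely different columns, which is the contrast the table above draws with a relational row.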
Page 38
Key Points: Aggregate Databases
Page 39
Inter-aggregate relations are hard to maintain
Page 40
Schema-less means implicit schema
Page 41
Graph Databases
Page 42
Page 43
Graph Databases
• Is a multi-relational graph
• Relationships are first-class citizens
• Traversal algorithms
• Nodes and edges can have data (key-value pairs)
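These points can be sketched as a small property graph with a breadth-first traversal. This is a minimal in-memory sketch, not the API of any graph database: the class `Graph`, its methods, and the sample relationship type are hypothetical.

```python
from collections import deque
from typing import Any, Dict, List, Tuple


class Graph:
    """Property graph: nodes and typed, directed edges both carry key-value data."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}
        self.edges: List[Tuple[str, str, str, Dict[str, Any]]] = []

    def add_node(self, node_id: str, **props: Any) -> None:
        self.nodes[node_id] = props

    def add_edge(self, source: str, rel_type: str, target: str, **props: Any) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source, rel_type, target, props))

    def neighbors(self, node_id: str, rel_type: str) -> List[str]:
        return [t for s, r, t, _ in self.edges if s == node_id and r == rel_type]

    def reachable(self, start: str, rel_type: str, max_depth: int) -> Dict[str, int]:
        """Breadth-first traversal along one relationship type.

        Returns {node_id: hops from start} for nodes within max_depth hops.
        """
        depths: Dict[str, int] = {}
        queue = deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue
            for nxt in self.neighbors(node, rel_type):
                if nxt != start and nxt not in depths:
                    depths[nxt] = depth + 1
                    queue.append((nxt, depth + 1))
        return depths
```

Several relationship types can connect the same pair of nodes, and a traversal follows only the type it is asked about.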
Page 44
Graph databases work best for data with complex relations
Page 45
Key-Value Database Usage
Page 46
Session Storage
Page 47
User Profiles/Preferences
Page 48
Shopping Cart
Page 49
Single user analytics
Page 50
Document Database Usage
Page 51
Event Logging
Page 52
Prototype development
Page 53
eCommerce Application
Page 54
Content Management Applications
Page 55
Column-Family Database Usage
Page 56
Large write volume
Page 57
Content Management
Page 58
eCommerce Application
Page 59
Graph Database Usage
Page 60
Connected Data
Page 61
Routing things/money
Page 62
Location Services
Page 63
Recommendation engines
Page 64
Schema-less, really?
Page 65
Schema-free does not mean no schema-migration
Page 66
Schema is implicit in code
Page 67
Data must be migrated when the schema in code is changed
Page 68
All data need not be migrated at the same time (lazy migration)
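Lazy migration can be sketched as an upgrade that runs when a record is read. This is a minimal sketch, assuming a plain dict as the store; the `schemaVersion` field, the change from `name` to `firstName`/`lastName`, and all function names are hypothetical.

```python
from typing import Any, Callable, Dict

CURRENT_VERSION = 2


def upgrade_v1_to_v2(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical schema change: split 'name' into first and last name."""
    new_doc = dict(doc)
    first, _, last = new_doc.pop("name").partition(" ")
    new_doc["firstName"] = first
    new_doc["lastName"] = last
    return new_doc


# Maps a schema version to the function that upgrades it by one version.
MIGRATIONS: Dict[int, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    1: upgrade_v1_to_v2,
}


def load(store: Dict[str, Dict[str, Any]], key: str) -> Dict[str, Any]:
    """Read a document, upgrading it on the way out if it is old."""
    doc = store[key]
    version = doc.get("schemaVersion", 1)  # documents without a version are v1
    migrated = False
    while version < CURRENT_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
        doc["schemaVersion"] = version
        migrated = True
    if migrated:
        store[key] = doc  # write back so the upgrade happens only once
    return doc
```

Documents that are never read stay in the old shape, so the code must keep every migration step for as long as old documents may exist.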
Page 69
Polyglot Persistence
Page 70
Use different data storage technologies for varying needs
Page 71
Can be across the enterprise or in a single application
Page 72
Encapsulate data access through services
Page 73
[Figure: the e-commerce platform calls four services, each backed by its own store]

Service                         Store                Data
Session/Cart storage service    Key-Value store      Shopping cart and session data
Order persistence service       Document store       Completed orders
Inventory and Price service     RDBMS (Legacy DB)    Inventory and item price
Nodes and Relations service     Graph store          Customer social graph
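Encapsulating data access through services can be sketched with two of the services from the diagram. This is a minimal sketch, assuming in-memory dicts in place of the real key-value and document stores; the class and method names are hypothetical.

```python
from typing import Any, Dict


class CartService:
    """Session/cart storage; a dict stands in for the key-value store."""

    def __init__(self) -> None:
        self._kv: Dict[str, Dict[str, int]] = {}

    def add_item(self, session_id: str, product_id: str, quantity: int) -> None:
        cart = self._kv.setdefault(session_id, {})
        cart[product_id] = cart.get(product_id, 0) + quantity

    def get_cart(self, session_id: str) -> Dict[str, int]:
        return dict(self._kv.get(session_id, {}))

    def clear(self, session_id: str) -> None:
        self._kv.pop(session_id, None)


class OrderService:
    """Order persistence; a dict of documents stands in for the document store."""

    def __init__(self) -> None:
        self._docs: Dict[int, Dict[str, Any]] = {}
        self._next_id = 1

    def save_order(self, customer_id: int, items: Dict[str, int]) -> int:
        order_id = self._next_id
        self._next_id += 1
        self._docs[order_id] = {
            "id": order_id,
            "customerId": customer_id,
            "items": dict(items),
        }
        return order_id

    def get_order(self, order_id: int) -> Dict[str, Any]:
        return dict(self._docs[order_id])


def checkout(session_id: str, customer_id: int,
             carts: CartService, orders: OrderService) -> int:
    """Platform code sees only the services, never the stores behind them."""
    items = carts.get_cart(session_id)
    if not items:
        raise ValueError("cart is empty")
    order_id = orders.save_order(customer_id, items)
    carts.clear(session_id)
    return order_id
```

Swapping the store behind either service would not change `checkout`, which is the point of hiding each store behind a service.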
Page 74
[Figure: the e-commerce platform stores shopping cart data, completed orders, and session data in the RDBMS, and sends search requests to SOLR. Indexed data in SOLR is updated in batch or in real time.]
Page 75
Speculative Retail Web Application
(martinfowler.com/bliki/PolyglotPersistence.html)

Data                     Store
Session Storage          Redis
Financial Data           RDBMS
Shopping Cart            Riak
Recommendation engine    Neo4J
Product Catalog          MongoDB
Reporting                RDBMS
Analytics                Cassandra
User Activity Logs       Cassandra
Page 76
Experience
Page 77
e-learning (before)
Page 78
e-learning (after)
Page 79
e-commerce (before)
Page 80
e-commerce (after)
Page 81
How do I choose?
Page 82
Choose for programmer productivity
Page 83
Choose for data access performance
Page 84
Choose to stick with the default
Page 85
Choose by testing your expectations
Page 86
Try the databases; they are all open source
Page 87
Thanks
#NoSQLDistilled
@pramodsadalage
sadalage.com