Document databases in practice Luigi Berrettini Nicola Baldi http://it.linkedin.com/in/nicolabaldi http://it.linkedin.com/in/luigiberrettini
RavenDB

Jan 27, 2015

An overview of document stores with a deep dive into Ayende Rahien's RavenDB: document design, querying, indexing, concurrency and more
Page 1: RavenDB

Document databases in practice

Luigi Berrettini

http://it.linkedin.com/in/luigiberrettini

Nicola Baldi

http://it.linkedin.com/in/nicolabaldi

Page 2: RavenDB

Overview

15/12/2012 Document databases in practice 2

Page 3: RavenDB

Unbounded result sets problem

Unbounded number of requests problem

Page 4: RavenDB

Document databases favor denormalization over composition and joins

Relations work differently than in RDBMSs

They are schema-less, but documents still need careful design

Page 5: RavenDB

« a conceptual model should be drawn with little or no regard for the software that might implement it » (Martin Fowler, UML Distilled)

A domain model should be independent from implementation details like persistence

In RavenDB this is somewhat true

Page 6: RavenDB

RDBMSs are schema-full

• tuples = sets of key-value pairs ⇒ flat structure

• more complex data structures are stored as relations

Document databases are schema-less

• object graphs stored as docs ⇒ no flat structure

• each document is treated as a single entity

RavenDB's suggested approach is to follow the aggregate pattern from the DDD book

Page 7: RavenDB

ENTITY

Some objects are not defined primarily by their attributes

They represent a thread of identity that runs through time and often across distinct representations

Mistaken identity can lead to data corruption

Page 8: RavenDB

VALUE OBJECT

When you care only about the attributes of an element of the model, classify it as a value object

Make it express the meaning of the attributes it conveys and give it related functionality

Treat the value object as immutable

Don't give it any identity and avoid the design complexities necessary to maintain entities

Page 9: RavenDB

AGGREGATE

Invariants are consistency rules that must be maintained whenever data changes

They’ll involve relationships within an aggregate (relations & foreign keys: order / orderlines)

Invariants applied within an aggregate will be enforced with the completion of each transaction

Page 10: RavenDB

Cluster entities and value objects into aggregates and define boundaries around each

Choose one entity to be the root of each aggregate and control all access to the objects inside the boundary through the root

Allow external objects to hold references to the root only

Transient references to internal members can be passed out for use within a single operation only

Page 11: RavenDB

Because the root controls access, it cannot be blindsided by changes to the internals

This arrangement makes it practical to enforce all invariants for objects in the aggregate and for the aggregate as a whole in any state change
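
The entity, value object and aggregate rules above can be sketched in a few lines of Python. This is an illustration only, not RavenDB client code: the `Order` and `OrderLine` classes, the `approved_limit` invariant and every field name are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLine:
    """Value object: immutable, defined only by its attributes."""
    product_id: str
    quantity: int
    unit_price: int  # in cents, to keep arithmetic exact


class Order:
    """Aggregate root: the only way to reach or change the order lines."""

    def __init__(self, order_id, approved_limit):
        self.id = order_id                    # identity of the entity
        self.approved_limit = approved_limit  # hypothetical business rule
        self._lines = []

    @property
    def lines(self):
        # Hand out a read-only snapshot, never the internal list.
        return tuple(self._lines)

    @property
    def total(self):
        return sum(line.quantity * line.unit_price for line in self._lines)

    def add_line(self, line):
        # Invariant checked on every state change: total <= approved limit.
        if self.total + line.quantity * line.unit_price > self.approved_limit:
            raise ValueError("order total would exceed the approved limit")
        self._lines.append(line)


order = Order("orders/1", approved_limit=10_000)
order.add_line(OrderLine("products/1", quantity=2, unit_price=3_000))
```

Because callers can only go through `add_line`, a line that would break the invariant is rejected before the aggregate changes. Stored as one document, the whole aggregate is also saved in one piece.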

Page 12: RavenDB

Nested child document

Page 13: RavenDB

Document referenced by ID

Page 14: RavenDB

Denormalized reference

We clone the properties we care about when displaying or processing a containing document

This avoids many cross-document lookups and means only the necessary data is transmitted over the network

It makes other scenarios more difficult: if we clone frequently changing data, keeping the copies in sync could become very demanding on the server

Use it only for rarely changing data, or for data that can still be dereferenced from an out-of-sync copy
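
A small sketch of a denormalized reference, using plain Python dictionaries in place of stored documents. The document shapes, field names and the `denormalized_reference` helper are all hypothetical; the point is only that the order keeps the referenced Id plus a copy of the few fields it needs.

```python
def denormalized_reference(doc, fields):
    """Copy the Id plus the chosen fields of `doc` into a small reference."""
    return {"Id": doc["Id"], **{name: doc[name] for name in fields}}


customer = {"Id": "customers/1", "Name": "Ann", "Email": "ann@example.com"}
product = {"Id": "products/7", "Name": "Keyboard", "Description": "Mechanical"}

order = {
    "Id": "orders/1",
    "Customer": denormalized_reference(customer, ["Name"]),
    "Lines": [
        {"Product": denormalized_reference(product, ["Name"]), "Quantity": 2},
    ],
}

# Later the customer document changes; the copy inside the order does not.
customer["Name"] = "Ann Smith"

# The stale copy still carries the Id, so the full document can be looked up.
documents = {customer["Id"]: customer, product["Id"]: product}
current_name = documents[order["Customer"]["Id"]]["Name"]
```

Displaying the order needs no extra lookup, at the price of a customer name that may lag behind the customer document.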

Page 15: RavenDB

Page 16: RavenDB

Order contains denormalized data from Customer and Product

Full data are saved elsewhere

Page 17: RavenDB

Page 18: RavenDB

Querying

Page 19: RavenDB

DocumentStore

• used to connect to a RavenDB data store

• thread-safe

• one instance per database per application

Session

• used to perform operations on the database

• not thread-safe

• implements the Unit of Work pattern: in a single session, a single document (identified by its key) always resolves to the same instance, and changes are tracked
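
The two Unit of Work properties listed for the session can be shown with a toy model. This is not the RavenDB session API: `ToySession`, its methods and the dictionary standing in for the database are assumptions for illustration.

```python
import copy


class ToySession:
    """Toy unit of work over a dict that plays the role of the database."""

    def __init__(self, database):
        self._database = database
        self._loaded = {}     # identity map: key -> the one live instance
        self._snapshots = {}  # key -> copy taken at load time

    def load(self, key):
        if key not in self._loaded:
            doc = copy.deepcopy(self._database[key])
            self._loaded[key] = doc
            self._snapshots[key] = copy.deepcopy(doc)
        return self._loaded[key]

    def save_changes(self):
        """Write back only the documents that differ from their snapshot."""
        changed = [k for k, d in self._loaded.items() if d != self._snapshots[k]]
        for key in changed:
            self._database[key] = copy.deepcopy(self._loaded[key])
            self._snapshots[key] = copy.deepcopy(self._loaded[key])
        return changed


database = {"users/1": {"name": "Ann"}, "users/2": {"name": "Bob"}}
session = ToySession(database)
user = session.load("users/1")
same_user = session.load("users/1")  # same instance, no second copy
session.load("users/2")              # loaded but never modified
user["name"] = "Anna"
written = session.save_changes()     # only the modified document is written
```

Nothing reaches the database until `save_changes` is called, and the untouched `users/2` is not written at all.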

Page 20: RavenDB

Page 21: RavenDB

Sequential GUID key

• when document key is not relevant (e.g. log entries)

• entity Id = sequential GUID (sorts well for indexing)

• Id property missing / not set ⇒ server generates a key

Identity key

• entity Id = prefix + next available integer Id for it

• Id property set to a prefix = value ending with slash

• new DocumentStore ⇒ server sends a range of HiLo keys

Assign a key yourself

• for documents which already have a native id (e.g. users)
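
The idea behind HiLo ranges can be sketched as follows. This is a generic HiLo illustration under assumed names (`ToyServer`, `HiLoKeyGenerator`, `range_size`), not RavenDB's implementation: the server hands out whole ranges of integers, and each client builds keys from its own range without contacting the server again until the range runs out.

```python
class ToyServer:
    """Hands out non-overlapping integer ranges per key prefix."""

    def __init__(self, range_size=32):
        self.range_size = range_size
        self._next_hi = {}  # prefix -> index of the next range to hand out

    def next_range(self, prefix):
        hi = self._next_hi.get(prefix, 0)
        self._next_hi[prefix] = hi + 1
        start = hi * self.range_size + 1
        return start, start + self.range_size - 1


class HiLoKeyGenerator:
    """Client side: consumes a local range, asks for a new one when empty."""

    def __init__(self, server):
        self._server = server
        self._ranges = {}  # prefix -> (next value, last value of the range)

    def next_key(self, prefix):
        next_value, last = self._ranges.get(prefix, (1, 0))
        if next_value > last:  # no range yet, or range exhausted
            next_value, last = self._server.next_range(prefix)
        self._ranges[prefix] = (next_value + 1, last)
        return f"{prefix}/{next_value}"


server = ToyServer(range_size=2)
client_a = HiLoKeyGenerator(server)
client_b = HiLoKeyGenerator(server)
keys = [
    client_a.next_key("users"),  # client A receives range 1..2
    client_b.next_key("users"),  # client B receives range 3..4
    client_a.next_key("users"),  # served locally from A's range
    client_a.next_key("users"),  # A's range is exhausted, new range 5..6
]
```

Two clients never produce the same key, but keys are not strictly sequential across clients.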

Page 22: RavenDB

Page 23: RavenDB

soft limit = 128: a query with no Take() is treated as Take(128)

hard limit = 1024: if x > 1024, Take(x) returns 1024 documents
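
The two limits amount to a simple clamp on the requested page size. A minimal sketch, assuming the default values quoted above; `effective_take` is a hypothetical helper, not a RavenDB function.

```python
SOFT_LIMIT = 128   # applied when the query has no Take()
HARD_LIMIT = 1024  # upper bound for any Take(x)


def effective_take(requested=None):
    """Return how many documents a query may return at most."""
    if requested is None:
        return SOFT_LIMIT
    return min(requested, HARD_LIMIT)
```

For example, `effective_take()` gives 128 and `effective_take(5000)` gives 1024, which is how the server protects itself from unbounded result sets.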

Page 24: RavenDB

RavenDB can skip over some results internally ⇒ TotalResults value invalidated

For proper paging use SkippedResults:

Skip(currentPage * pageSize + SkippedResults)

Assuming a page size of 10…
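
The Skip formula above can be tried out on a toy model. The `query` function below is an assumption standing in for the server: it scans index entries, drops entries whose document it has already returned in the same request, and reports how many it dropped. Only the formula `currentPage * pageSize + SkippedResults` comes from the slide; a page size of 2 is used to keep the data short.

```python
def query(entries, skip, take):
    """Toy server: scan from `skip`, drop repeats seen in this request."""
    seen, page, skipped = set(), [], 0
    for doc_id in entries[skip:]:
        if len(page) == take:
            break
        if doc_id in seen:
            skipped += 1  # counted in the toy SkippedResults
            continue
        seen.add(doc_id)
        page.append(doc_id)
    return page, skipped


def all_pages(entries, page_size):
    """Page through the results, carrying the skipped count forward."""
    pages, total_skipped, current_page = [], 0, 0
    while True:
        skip = current_page * page_size + total_skipped
        page, skipped = query(entries, skip, page_size)
        if not page:
            return pages
        pages.append(page)
        total_skipped += skipped
        current_page += 1


# Several index entries can point at the same document.
entries = ["a", "a", "b", "c", "c", "c", "d", "e"]
pages = all_pages(entries, page_size=2)
naive_second_page, _ = query(entries, skip=1 * 2, take=2)
```

Here `pages` is `[["a", "b"], ["c", "d"], ["e"]]`, while the naive second page computed without the skipped count is `["b", "c"]` and repeats `"b"`.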

Page 25: RavenDB

Page 26: RavenDB

Page 27: RavenDB

RavenDB supports Count and Distinct

SelectMany, GroupBy and Join are not supported

The let keyword is not supported

For such operations an index is needed

Page 28: RavenDB

All queries use an index to return results

Dynamic = created automatically by the server

Static = created explicitly by the user

Page 29: RavenDB

no matching static index to query ⇒ RavenDB automatically creates a dynamic index on the fly (on first user query)

based on requests coming in, RavenDB can decide to promote a temporary index to a permanent one

Page 30: RavenDB

permanent

expose much more functionality

low latency: on first run dynamic indexes have performance issues

map / reduce
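
As an illustration of what a map / reduce index computes, here is a sketch in plain Python that counts posts per tag. It does not use RavenDB's index definition syntax; the `posts` shape and the function names are assumptions. The map step emits one flat entry per tag, and the reduce step groups entries by tag and sums the counts.

```python
from collections import defaultdict


def map_posts(posts):
    """Map: emit one {"tag", "count"} entry per tag of each post."""
    for post in posts:
        for tag in post["tags"]:
            yield {"tag": tag, "count": 1}


def reduce_counts(entries):
    """Reduce: group by tag and sum counts; output has the input's shape."""
    totals = defaultdict(int)
    for entry in entries:
        totals[entry["tag"]] += entry["count"]
    return [{"tag": tag, "count": count} for tag, count in sorted(totals.items())]


posts = [{"tags": ["nosql", "ravendb"]}, {"tags": ["nosql"]}]
counts = reduce_counts(map_posts(posts))
```

The reduce output has the same shape as its input, so it can be reduced again with new map results without reprocessing every document. This is the kind of grouping that a plain query cannot express with GroupBy.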

Page 31: RavenDB

Page 32: RavenDB

Page 33: RavenDB

Page 34: RavenDB

Advanced topics

Page 35: RavenDB

an index is made of documents

document

• atomic unit of indexing and searching

• flat ⇒ recursion and joins must be denormalized

• flexible schema

• made of fields

Page 36: RavenDB

field

• a name-value pair with associated info

• can be indexed if you're going to search on it ⇒ tokenization by analysis

• can be stored in order to preserve original untokenized value within document

example of physical index structure: {"__document_id": "docs/1", "tag": "NoSQL"}
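
A minimal sketch of the two ideas above: a document holding several tags is flattened into one index entry per tag, and an indexed field is tokenized before it becomes searchable. The analyzer used here (lowercase, split on whitespace) is a stand-in assumption, not the analysis the server actually performs, and all function names are hypothetical.

```python
def flatten(doc_id, doc):
    """One flat index entry per tag, each pointing back to the document."""
    return [{"__document_id": doc_id, "tag": tag} for tag in doc["tags"]]


def analyze(value):
    """Stand-in analyzer: lowercase the value and split it on whitespace."""
    return value.lower().split()


def build_inverted_index(entries, field):
    """Map each token of `field` to the ids of the documents containing it."""
    index = {}
    for entry in entries:
        for token in analyze(entry[field]):
            index.setdefault(token, set()).add(entry["__document_id"])
    return index


entries = flatten("docs/1", {"tags": ["NoSQL", "Document DB"]})
entries += flatten("docs/2", {"tags": ["nosql"]})
tag_index = build_inverted_index(entries, "tag")
```

Searching for the token `nosql` now finds both documents, even though one stored the tag as `NoSQL`. The original spelling survives only if the field is also stored.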

Page 37: RavenDB

Page 38: RavenDB

Page 39: RavenDB

Page 40: RavenDB

One to one

Page 41: RavenDB

One to many ⇒ SELECT N+1

Page 42: RavenDB

Value type

Page 43: RavenDB

indexing runs on a background thread, triggered when a document is created or updated

the server responds quickly, BUT you may query stale indexes (better stale than offline)
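
Staleness can be modeled without real threads. In the toy store below, which is an assumption for illustration and not RavenDB code, writes return immediately and only queue the document for indexing. The `run_indexing` method plays the role of the background indexing thread, and a query reports whether the index it read was stale.

```python
class ToyStore:
    """Writes are immediate; the index catches up only in run_indexing()."""

    def __init__(self):
        self.docs = {}
        self.index = {}    # tag -> set of document keys
        self.pending = []  # keys written but not yet indexed

    def put(self, key, doc):
        # Toy limitation: handles new documents only, not updates or deletes.
        self.docs[key] = doc
        self.pending.append(key)

    def run_indexing(self):
        """Stand-in for the background indexing thread."""
        for key in self.pending:
            for tag in self.docs[key]["tags"]:
                self.index.setdefault(tag, set()).add(key)
        self.pending.clear()

    def query(self, tag, wait_for_non_stale=False):
        """Return (matching keys, is_stale)."""
        if wait_for_non_stale:
            self.run_indexing()  # block until the index has caught up
        return sorted(self.index.get(tag, ())), bool(self.pending)


store = ToyStore()
store.put("posts/1", {"tags": ["nosql"]})
stale_result = store.query("nosql")
fresh_result = store.query("nosql", wait_for_non_stale=True)
```

The first query answers at once but misses the new post and flags the result as stale. The second waits for indexing and sees it.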

Page 44: RavenDB

Page 45: RavenDB

documentStore.Conventions.DefaultQueryingConsistency

ConsistencyOptions.QueryYourWrites: same behavior as WaitForNonStaleResultsAsOfLastWrite

ConsistencyOptions.MonotonicRead: you never go back in time and read older data than what you have already seen

Page 46: RavenDB

Page 47: RavenDB

Page 48: RavenDB
