Document-Oriented Databases for the .NET platform Anton Samarskyy

03 net saturday anton samarskyy ''document oriented databases for the .net platform''

Jan 27, 2015

Page 1

Document-Oriented Databases for the .NET platform

Anton Samarskyy

Page 2

Agenda

• Challenges of relational databases
• NoSQL: not only SQL
• Document store concept
• Document-oriented databases
• RavenDB
• RavenDB demo
• MapReduce (optional)

Page 3

Relational database properties

• ACID: Atomic, Consistent, Isolated, Durable
• Relational: based on relational algebra and Codd's work
• Table/row based
• Rich querying capabilities
• Foreign keys
• Schema

Page 4

What do our apps need?

• Need to scale horizontally
• Partitioning and replication
• Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP)
• Web 2.0
• Performance, performance, performance
• Flexibility
• Big, even huge, datasets

http://www.graph-database.org

Page 5

Not only SQL philosophy

• Non-relational, distributed, cloud-ready
• Open source
• Horizontally scalable: easy replication support
• Schema-free
• Simple API
• BASE (not ACID): Basically Available, Soft state, Eventual consistency
• Huge amounts of data

Page 6

NoSQL pros

+ Cheap, easy to implement
+ Removes the impedance mismatch between objects and tables
+ Quickly processes large amounts of data
+ Data modeling flexibility
+ Command Query Responsibility Segregation (CQRS), Event Sourcing

Page 7

NoSQL cons

- New technologies
- Data is generally duplicated, with potential for inconsistency
- No standard language or format for queries
- Depends on the application layer to enforce data integrity
- Reporting

Page 8

NoSQL types

Common
• Wide Column Store / Column Families
• Key Value / Tuple Store
• Document Store
• Graph Databases
• Object Databases

Other
• Grid & Cloud Database Solutions
• XML Databases
• Multivalue Databases
• File Databases

Page 9

CAP

• Consistency: each client has the same view
• Availability: all clients can read and write
• Partition tolerance: works well across different network partitions

http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

Page 10

You pick only two!

Page 11

Who is using NoSQL?

Page 12

Document-oriented databases are

• Collections of independent documents: XML, JSON, YAML
• Non-relational, i.e. they do not store data in tables with uniform-sized fields for each record
• Not limited in the number of fields or their length
• Usually accessible via a RESTful HTTP/JSON API
• Horizontally scalable
• Able to be distributed
• Fault-tolerant

Page 13

Why a document store?

• Schema-free
• User-generated content
• Storing full, complex object graphs
• Low overhead: usually operates on a single document
  - One read, one write
• Fast
• Known format means the database can do interesting things with it…
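
A minimal sketch of the single-document access pattern, written in Python only to keep it short. The `DocumentStore` class and the `users/1`-style keys are hypothetical and not any real database API. It shows documents of different shapes living side by side, and a whole object graph written and read in one operation.

```python
import json


class DocumentStore:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    def __init__(self):
        self._docs = {}  # key -> JSON text

    def put(self, key, document):
        # One write stores the whole object graph as a single JSON document.
        self._docs[key] = json.dumps(document)

    def get(self, key):
        # One read returns the whole graph; no joins are needed.
        raw = self._docs.get(key)
        return None if raw is None else json.loads(raw)


store = DocumentStore()
# No schema: the two documents have different fields.
store.put("users/1", {"name": "ayende", "projects": ["rhino mocks", "raven db"]})
store.put("orders/1", {"customer": "users/1",
                       "lines": [{"product": "book", "quantity": 2}]})

order = store.get("orders/1")
print(order["lines"][0]["product"])
```

The order lines are embedded in the order document rather than stored in a separate table, which is what makes the "one read, one write" pattern possible.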

Page 14

Indexing

• Order in a schema-free world
• Materialized views
• Built in the background
• Allow stale reads
• Don't slow down CRUD operations
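
The trade-off in these bullets can be sketched with a toy store whose index is refreshed separately from writes. This is Python for brevity, every name in it is hypothetical, and a real database would run the refresh on a background thread rather than on demand. It handles new documents only.

```python
class IndexedStore:
    """Toy store whose index is refreshed separately from writes (hypothetical)."""

    def __init__(self):
        self.docs = {}        # key -> document
        self.by_name = {}     # index: name -> set of keys
        self._pending = []    # keys written since the last index refresh

    def put(self, key, doc):
        # The write only records the document; it does not touch the index.
        self.docs[key] = doc
        self._pending.append(key)

    def refresh_index(self):
        # Stands in for the background indexing task.
        for key in self._pending:
            self.by_name.setdefault(self.docs[key]["name"], set()).add(key)
        self._pending.clear()

    def query(self, name):
        # Returns (results, is_stale); results may lag behind recent writes.
        return sorted(self.by_name.get(name, ())), bool(self._pending)


store = IndexedStore()
store.put("users/1", {"name": "ayende"})
print(store.query("ayende"))  # index not refreshed yet: no results, flagged stale
store.refresh_index()
print(store.query("ayende"))  # index caught up: result found, not stale
```

Writes stay cheap because they never wait for indexing; the price is that a query can return results that are flagged as stale.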

Page 15

Index concept

{
  "name": "ayende",
  "twitter": "@ayende",
  "projects": [
    "rhino mocks",
    "nhibernate",
    "raven db"
  ]
}

from doc in docs
from prj in doc.projects
select new { Project = prj, Name = doc.name }

GET /indexes/ProjectAndName?query=Project:raven

http://ayende.com/blog/4459/that-no-sql-thing-document-databases
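
To make the flow concrete, here is a rough Python equivalent of the index definition and query on this slide. The function names are made up for illustration, and the word-level match is only a crude stand-in for how a full-text engine would treat `Project:raven`.

```python
docs = [
    {"name": "ayende", "twitter": "@ayende",
     "projects": ["rhino mocks", "nhibernate", "raven db"]},
]


def map_project_and_name(doc):
    # Same shape as the LINQ definition: one index entry per project.
    for project in doc["projects"]:
        yield {"Project": project, "Name": doc["name"]}


# Build the index by running the map function over every document.
index = [entry for doc in docs for entry in map_project_and_name(doc)]


def query_project(term):
    # Crude stand-in for a full-text term match such as Project:raven.
    return [e for e in index if term in e["Project"].split()]


print(query_project("raven"))
```

One document with three projects produces three index entries, so the query works against the flattened entries and never has to scan the documents themselves.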

Page 16

Document DB family

• CouchDB: Apache project created by Damien Katz
• RavenDB: project by Oren Eini and Hibernating Rhinos
• MongoDB: 10gen project
• SimpleDB: Amazon project, used as a web service in concert with Amazon Elastic Compute Cloud

Page 17

Comparison

• CouchDB: Erlang, REST API, JavaScript map-reduce querying (concurrent), via .NET helpers
• MongoDB: C++, dynamic queries (non-concurrent MapReduce), custom TCP/IP access, .NET drivers: 10gen, NoRM (LINQ)
• RavenDB: .NET, REST API, LINQ queries mapped to Lucene.NET, plus MapReduce
• SimpleDB: Erlang, name/value store, basic queries, not RESTful, via .NET helpers

Page 18

RavenDB

• Built on existing infrastructure (ESENT) that is known to scale to amazing sizes
• Can be transactional, i.e. ACID: supports System.Transactions and can take part in distributed transactions
• Indexes are defined via LINQ queries; implements IQueryable, which maps to Lucene
• Supports map/reduce operations

Page 19

RavenDB

• Comes with a fully functional .NET client API, with Unit of Work and change tracking
• REST based, so you can access it directly from JavaScript
• Supports optimistic concurrency control
• Can be extended with MEF
• Has trigger support
• Supports sharding and replication

http://ravendb.net
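
Of the features listed on this slide, optimistic concurrency is the easiest to sketch in a few lines. The Python below is not the RavenDB client API: `VersionedStore` is hypothetical, and a plain integer version stands in for whatever change token the database actually uses.

```python
class ConcurrencyError(Exception):
    """Raised when a document changed since the caller last read it."""


class VersionedStore:
    """Toy store with a per-document version counter (hypothetical)."""

    def __init__(self):
        self._docs = {}  # key -> (version, document)

    def load(self, key):
        # Returns (version, document); version 0 means "does not exist yet".
        return self._docs.get(key, (0, None))

    def save(self, key, document, expected_version):
        current_version, _ = self._docs.get(key, (0, None))
        if current_version != expected_version:
            # Someone else saved first: reject instead of overwriting.
            raise ConcurrencyError(key)
        self._docs[key] = (current_version + 1, document)
        return current_version + 1


store = VersionedStore()
store.save("users/1", {"name": "ayende"}, expected_version=0)
version, doc = store.load("users/1")
store.save("users/1", {"name": "oren"}, expected_version=version)  # succeeds
try:
    # Reusing the old version simulates a second writer with stale data.
    store.save("users/1", {"name": "stale"}, expected_version=version)
except ConcurrencyError:
    print("conflict detected")
```

No locks are held between the read and the write; a conflict is only detected at save time, and the caller decides whether to reload and retry.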

Page 20

RavenDB extensibility

• MEF (Managed Extensibility Framework)
• Triggers
  - PUT trigger
  - DELETE trigger
  - Read trigger
  - Index update triggers
• Request responders
• Custom serialization/deserialization

Page 21

Demo: RavenDB

• Setup, server
• RavenDB Client API
• Denormalization, modeling documents
• CRUD
• Attachments
• Indexes
• MapReduce indexes
• Sharding

Page 22

MapReduce

• MapReduce is a programming model and an associated implementation for processing and generating large data sets
• The Map function processes a key/value pair to generate a set of intermediate key/value pairs
• The Reduce function merges all intermediate values associated with the same intermediate key

Page 23

Map

Page 24

Sort

Page 25

Reduce
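
The Map, Sort and Reduce phases can be sketched on a single machine in a few lines of Python. The sample documents are made up for illustration (the second one is invented), and the job counts how many people work on each project. A real MapReduce system would run the same three phases spread across many machines.

```python
from itertools import groupby
from operator import itemgetter

docs = [
    {"name": "ayende", "projects": ["rhino mocks", "nhibernate", "raven db"]},
    {"name": "anton", "projects": ["raven db"]},  # invented sample document
]


def map_fn(doc):
    # Map: emit one intermediate (key, value) pair per project.
    for project in doc["projects"]:
        yield (project, 1)


def reduce_fn(key, values):
    # Reduce: merge all values that share the same intermediate key.
    return (key, sum(values))


# Map phase
pairs = [pair for doc in docs for pair in map_fn(doc)]
# Sort phase: bring equal keys together so they can be grouped
pairs.sort(key=itemgetter(0))
# Reduce phase: one call per distinct key
result = [reduce_fn(key, [value for _, value in group])
          for key, group in groupby(pairs, key=itemgetter(0))]

print(result)  # [('nhibernate', 1), ('raven db', 2), ('rhino mocks', 1)]
```

The sort step matters because `groupby` only merges adjacent items; in a distributed setting the same role is played by shuffling pairs so that each key ends up on one reducer.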

Page 26

Sharding

• Sharding refers to horizontal partitioning of data across multiple machines

• The idea is to split the load across many commodity machines, instead of buying huge expensive servers
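
One common way to decide which machine holds a document is to hash its key. The Python sketch below assumes hash-based partitioning and three hypothetical shard names; the slides do not say which strategy RavenDB itself uses, so treat this as an illustration of the idea only.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical machine names


def shard_for(key, shards=SHARDS):
    # Use a stable hash (not Python's per-process hash()) so every client
    # routes the same key to the same shard.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Route a handful of document keys to their shards.
placement = {}
for key in ["users/1", "users/2", "users/3", "orders/1", "orders/2"]:
    placement.setdefault(shard_for(key), []).append(key)

print(placement)
```

Simple modulo placement moves most keys when a shard is added or removed, which is why production systems often use consistent hashing or explicit range assignment instead.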

Page 27

Thanks!

Questions or comments?