MongoDB Schema Design

Post on 16-Jan-2015

DESCRIPTION

An overview of MongoDB Schema Design from M

Transcript

Emily Stolfo

#mongodbdays

Schema Design

Ruby Engineer/Evangelist, 10gen

@EmStolfo

Agenda

• Working with documents

• Common patterns

• Queries and Indexes

Terminology

RDBMS ➜ MongoDB
Database ➜ Database
Table ➜ Collection
Row ➜ Document
Index ➜ Index
Join ➜ Embedded Document
Foreign Key ➜ Reference

Working with Documents

Documents

Provide flexibility and performance

Example Schema (MongoDB)

Embedding

Linking

Relational Schema Design

Focuses on data storage

Document Schema Design

Focuses on data use

Schema Design Considerations

• What is a priority?
  – High consistency
  – High read performance
  – High write performance

• How does the application access and manipulate data?
  – Read/Write Ratio
  – Types of Queries / Updates
  – Data life-cycle and growth
  – Analytics (Map Reduce, Aggregation)

Tools for Data Access

• Flexible Schemas

• Embedded data structures

• Secondary Indexes

• Multi-Key Indexes

• Aggregation Framework
  – Pipeline operators: $project, $match, $limit, $skip, $sort, $group, $unwind

• No Joins
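The pipeline operators listed above are applied stage by stage, each stage feeding the next. As a rough sketch only, the plain JavaScript below imitates `$unwind` followed by `$group` on an in-memory array. The helpers `unwind` and `groupCount` and the second sample book are hypothetical, and nothing here talks to a MongoDB server.

```javascript
// In-memory stand-ins for two pipeline stages; not MongoDB's implementation.
const books = [
  { title: "MongoDB: The Definitive Guide", authors: ["Kristina Chodorow", "Mike Dirolf"] },
  { title: "Hypothetical Second Book", authors: ["Kristina Chodorow"] } // made-up sample data
];

// Like $unwind: emit one copy of the document per array element.
function unwind(docs, field) {
  return docs.flatMap(doc => doc[field].map(value => ({ ...doc, [field]: value })));
}

// Like $group with { count: { $sum: 1 } }: count documents per key value.
function groupCount(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts].map(([_id, count]) => ({ _id, count }));
}

// Roughly what this shell pipeline would compute:
// db.books.aggregate([ { $unwind: "$authors" },
//                      { $group: { _id: "$authors", count: { $sum: 1 } } } ])
const booksPerAuthor = groupCount(unwind(books, "authors"), "authors");
console.log(booksPerAuthor);
```

Each stage takes a list of documents and returns a new list, which is why stages can be chained in any order that makes sense for the query.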

Data Manipulation

• Conditional Query Operators
  – Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte
  – Vector: $in, $nin, $all, $size

• Atomic Update Operators
  – Scalar: $inc, $set, $unset
  – Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet
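To make the update operators concrete, here is a small sketch that mimics four of them on a plain object. `applyUpdate` is a hypothetical helper, and the `copies` and `tags` fields are invented for the example. It only illustrates what each operator does to a document; the server applies the real operators atomically.

```javascript
// Hypothetical helper mimicking $inc, $set, $push and $addToSet on a plain object.
function applyUpdate(doc, update) {
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount;        // $inc: add to a number
  }
  for (const [field, value] of Object.entries(update.$set || {})) {
    doc[field] = value;                             // $set: overwrite a field
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    (doc[field] = doc[field] || []).push(value);    // $push: always append
  }
  for (const [field, value] of Object.entries(update.$addToSet || {})) {
    const arr = (doc[field] = doc[field] || []);
    if (!arr.includes(value)) arr.push(value);      // $addToSet: append if absent (primitives only here)
  }
  return doc;
}

const bookDoc = { _id: "123456789", copies: 1, tags: ["mongodb"] };

// Shell equivalent of the first call:
// db.books.update({ _id: "123456789" }, { $inc: { copies: 2 }, $addToSet: { tags: "mongodb" } })
applyUpdate(bookDoc, { $inc: { copies: 2 }, $addToSet: { tags: "mongodb" } });
applyUpdate(bookDoc, { $set: { language: "English" }, $push: { tags: "databases" } });
console.log(bookDoc);
```

The two updates are issued separately because one update cannot apply two operators to the same field.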

Schema Design Example

Library Management Application

• Patrons

• Books

• Authors

• Publishers

Example: One to One Relations

patron = {
  _id: "joe",
  name: "Joe Bookreader"
}

address = {
  patron_id: "joe",
  street: "123 Fake St.",
  city: "Faketon",
  state: "MA",
  zip: 12345
}

Modeling Patrons

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
  }
}

One to One Relations

• “Contains” relationships are often embedded.

• Document provides a holistic representation of objects with embedded entities.

• Optimized read performance.
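The read-performance point can be illustrated by counting lookups. In this sketch, plain arrays stand in for collections and `findOne` is a hypothetical helper, not a driver call. The referenced design needs two lookups to assemble a patron, while the embedded design needs one.

```javascript
// Plain arrays stand in for collections; no MongoDB server or driver is involved.
let lookupCount = 0;
function findOne(collection, predicate) {
  lookupCount += 1;                       // count every round trip we would make
  return collection.find(predicate) || null;
}

// Referenced design: patron and address are separate documents.
const patronsRef = [{ _id: "joe", name: "Joe Bookreader" }];
const addressesRef = [
  { patron_id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 }
];

// Embedded design: the address lives inside the patron document.
const patronsEmbedded = [{
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 }
}];

function loadPatronReferenced(id) {
  const patron = findOne(patronsRef, d => d._id === id);
  const address = findOne(addressesRef, d => d.patron_id === id);
  return { ...patron, address };
}

function loadPatronEmbedded(id) {
  return findOne(patronsEmbedded, d => d._id === id);
}

const referenced = loadPatronReferenced("joe");
const lookupsForReferenced = lookupCount;   // 2

lookupCount = 0;
const embedded = loadPatronEmbedded("joe");
const lookupsForEmbedded = lookupCount;     // 1

console.log(lookupsForReferenced, lookupsForEmbedded);
```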

Examples: One to Many Relations

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  addresses: [
    { street: "1 Vernon St.", city: "Newton", state: "MA", … },
    { street: "52 Main St.", city: "Boston", state: "MA", … }
  ]
}

Patrons with many addresses

Example 2: Publishers and Books

One to Many Relations

Publishers and Books relation

• Publishers put out many books

• Books have one publisher

MongoDB: The Definitive Guide
By Kristina Chodorow and Mike Dirolf
Published: 9/24/2010
Pages: 216
Language: English
Publisher: O’Reilly Media, CA

Book Data

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Book Model with Embedded Publisher

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Book Model with Normalized Publisher

publisher = {
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly"
}

Link with Publisher _id as a Reference

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA",
  books: [ "123456789", ... ]
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Link with Book _ids as a Reference

Where do you put the reference?

• Reference to single publisher on books
  – Use when items have unbounded growth (unlimited # of books)

• Array of books in publisher document
  – Optimal when many means a handful of items
  – Use when there is a bound on potential growth
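The two placements lead to different lookups. The sketch below runs both against in-memory arrays. The helper names are hypothetical, and for brevity the sample documents carry both kinds of reference at once, which a real schema would not need.

```javascript
// Sample data carrying both reference styles so the two lookups can be compared.
const allBooks = [
  { _id: "123456789", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" }
];
const publisherDoc = { _id: "oreilly", name: "O'Reilly Media", books: ["123456789"] };

// Option 1: each book points at its publisher.
// Shell equivalent: db.books.find({ publisher_id: "oreilly" })
function booksByPublisherId(books, publisherId) {
  return books.filter(b => b.publisher_id === publisherId);
}

// Option 2: the publisher holds an array of book _ids.
// Shell equivalent: db.books.find({ _id: { $in: publisher.books } })
function booksFromPublisherArray(books, publisher) {
  return books.filter(b => publisher.books.includes(b._id));
}

const viaBookReference = booksByPublisherId(allBooks, "oreilly");
const viaPublisherArray = booksFromPublisherArray(allBooks, publisherDoc);
console.log(viaBookReference[0].title, viaPublisherArray[0].title);
```

With option 2 the publisher document grows by one array element per book, which is why it suits cases where the number of books is bounded.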

Example 3: Books and Patrons

One to Many Relations

Books and Patrons

• Book can be checked out by one Patron at a time

• Patrons can check out many books (but not 1000s)

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... }
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  ...
}

Modeling Checkouts

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    { _id: "123456789", checked_out: ISODate("2012-10-15") },
    { _id: "987654321", checked_out: ISODate("2012-09-12") },
    ...
  ]
}

Modeling Checkouts

De-normalization

Provides data locality

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    {
      _id: "123456789",
      title: "MongoDB: The Definitive Guide",
      authors: [ "Kristina Chodorow", "Mike Dirolf" ],
      checked_out: ISODate("2012-10-15")
    },
    {
      _id: "987654321",
      title: "MongoDB: The Scaling Adventure",
      ...
    },
    ...
  ]
}

Modeling Checkouts - de-normalized
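One way to produce the de-normalized shape is to copy the fields a patron view needs at checkout time. The sketch below assumes a hypothetical `checkOut` helper working on plain objects, with a JavaScript `Date` standing in for the shell's `ISODate`.

```javascript
// Hypothetical helper: copy the book fields the patron view needs into the patron document.
function checkOut(patron, book, when) {
  patron.checked_out = patron.checked_out || [];
  patron.checked_out.push({
    _id: book._id,
    title: book.title,
    authors: [...book.authors],   // copy, so later edits to the book do not leak in
    checked_out: when
  });
  return patron;
}

const joe = { _id: "joe", name: "Joe Bookreader" };
const definitiveGuide = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: ["Kristina Chodorow", "Mike Dirolf"]
};

checkOut(joe, definitiveGuide, new Date("2012-10-15"));

// Listing Joe's checkouts now needs only the patron document.
const checkedOutTitles = joe.checked_out.map(entry => entry.title);
console.log(checkedOutTitles);
```

The trade-off is that the copied title and authors must be updated separately if the book document ever changes.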

Referencing vs. Embedding

• Embedding is a bit like pre-joining data

• Document level operations are easy for the server to handle

• Embed when the “many” objects always appear with (viewed in the context of) their parents.

• Reference when you need more flexibility

How does your application access and manipulate data?

Example: Many to Many Relations

book = {
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

author = {
  _id: "mdirolf",
  name: "Mike Dirolf",
  hometown: "Albany"
}

Books and Authors

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

author = {
  _id: "mdirolf",
  name: "Mike Dirolf",
  hometown: "Albany"
}

Relation stored in Book document

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [
    { book_id: 123456789, title: "MongoDB: The Definitive Guide" }
  ]
}

Relation stored in Author document

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  authors: [ "kchodorow", "mdirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York",
  books: [ 123456789, ... ]
}

author = {
  _id: "mdirolf",
  name: "Mike Dirolf",
  hometown: "Albany",
  books: [ 123456789, ... ]
}

Relation stored in both documents
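When the relation is stored on both sides, the application has to write both documents. A minimal sketch on plain objects follows. `linkAuthorAndBook` is a hypothetical helper, and in MongoDB each side would be a separate update to a separate document.

```javascript
// Hypothetical helper: record the relation on both sides, skipping duplicates.
function linkAuthorAndBook(author, book) {
  if (!book.authors.includes(author._id)) book.authors.push(author._id);
  if (!author.books.includes(book._id)) author.books.push(book._id);
}

const guide = { _id: 123456789, title: "MongoDB: The Definitive Guide", authors: [] };
const kchodorow = { _id: "kchodorow", name: "Kristina Chodorow", books: [] };
const mdirolf = { _id: "mdirolf", name: "Mike Dirolf", books: [] };

linkAuthorAndBook(kchodorow, guide);
linkAuthorAndBook(mdirolf, guide);
linkAuthorAndBook(mdirolf, guide);   // repeating the call changes nothing

console.log(guide.authors, kchodorow.books, mdirolf.books);
```

Skipping duplicates makes the helper safe to retry, which matters when one of the two writes fails and the pair is attempted again.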

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

db.books.find( { "authors.name": "Kristina Chodorow" } )

Where do you put the reference?

Think about common queries
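A query on a dotted path such as "authors.name" reaches into every element of the embedded array. The sketch below imitates that matching on in-memory documents. `valuesAtPath` and `findByPath` are hypothetical helpers and a simplification of the server's real matching rules.

```javascript
// Collect every value found at a dotted path, descending into arrays along the way.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const key of path.split(".")) {
    current = current
      .flatMap(v => (Array.isArray(v) ? v : [v]))          // step into array elements
      .filter(v => v !== null && typeof v === "object")    // only objects have fields
      .map(v => v[key])
      .filter(v => v !== undefined);
  }
  return current.flatMap(v => (Array.isArray(v) ? v : [v]));
}

// Simplified stand-in for db.books.find({ "authors.name": value })
function findByPath(docs, path, value) {
  return docs.filter(doc => valuesAtPath(doc, path).includes(value));
}

const bookCollection = [{
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ]
}];

const matches = findByPath(bookCollection, "authors.name", "Kristina Chodorow");
console.log(matches.map(b => b.title));
```

Because the author names are embedded in the book, this common query is answered from the books collection alone.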

Where do you put the reference?

Think about indexes

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

db.books.createIndex( { "authors.name": 1 } )
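An index on a field inside an array is a multi-key index: it holds one entry per array element, so a single book can appear under several keys. The sketch below shows the idea with a `Map`. `buildMultikeyIndex` and the second book are hypothetical, and a real index is a sorted tree rather than a hash map.

```javascript
// Made-up collection; the second book exists only to show a key with two entries.
const indexedBooks = [
  {
    _id: "123456789",
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "Kristina Chodorow" },
      { _id: "mdirolf", name: "Mike Dirolf" }
    ]
  },
  {
    _id: "hypothetical-2",
    title: "Hypothetical Second Book",
    authors: [{ _id: "kchodorow", name: "Kristina Chodorow" }]
  }
];

// Build key -> [document _ids], adding one entry per distinct key in each document.
function buildMultikeyIndex(docs, keysOf) {
  const index = new Map();
  for (const doc of docs) {
    for (const key of new Set(keysOf(doc))) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const authorNameIndex = buildMultikeyIndex(indexedBooks, doc => doc.authors.map(a => a.name));

// The lookup goes straight to the matching _ids instead of scanning every book.
const hits = authorNameIndex.get("Kristina Chodorow") || [];
console.log(hits);
```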

Summary

• Schema design is different in MongoDB

• Basic data design principles apply

• Focus on how application accesses and manipulates data

• Evolve schema to meet changing requirements

• Application-level logic is important!
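As one illustration of evolving a schema with application-level logic, suppose older patron documents hold a single `address` while newer ones hold an `addresses` array, as in the patron examples earlier. A hypothetical accessor can accept both shapes, so old documents do not have to be migrated up front.

```javascript
// Hypothetical accessor that tolerates both the old and the new patron shape.
function getAddresses(patron) {
  if (Array.isArray(patron.addresses)) return patron.addresses;   // new shape
  if (patron.address) return [patron.address];                    // old shape
  return [];                                                      // no address recorded
}

const oldStylePatron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 }
};

const newStylePatron = {
  _id: "joe",
  name: "Joe Bookreader",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", state: "MA" },
    { street: "52 Main St.", city: "Boston", state: "MA" }
  ]
};

console.log(getAddresses(oldStylePatron).length, getAddresses(newStylePatron).length);
```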

Emily Stolfo

#mongodbdays

Thank You

Ruby Engineer/Evangelist, 10gen

@EmStolfo
