Emily Stolfo #mongodbdays Schema Design Ruby Engineer/Evangelist, 10gen @EmStolfo

MongoDB Schema Design

Jan 16, 2015


An overview of MongoDB Schema Design from M
Transcript
Page 1: MongoDB Schema Design

Emily Stolfo

#mongodbdays

Schema Design

Ruby Engineer/Evangelist, 10gen

@EmStolfo

Page 2: MongoDB Schema Design

Agenda

• Working with documents

• Common patterns

• Queries and Indexes

Page 3: MongoDB Schema Design

Terminology

RDBMS ➜ MongoDB
Database ➜ Database
Table ➜ Collection
Row ➜ Document
Index ➜ Index
Join ➜ Embedded Document
Foreign Key ➜ Reference

Page 4: MongoDB Schema Design

Working with Documents

Page 5: MongoDB Schema Design

Documents

Provide flexibility and performance

Page 6: MongoDB Schema Design

Example Schema (MongoDB)

Page 7: MongoDB Schema Design

Embedding

Example Schema (MongoDB)

Page 8: MongoDB Schema Design

Embedding

Linking

Example Schema (MongoDB)

Page 9: MongoDB Schema Design

Relational Schema Design

Focuses on data storage

Page 10: MongoDB Schema Design

Document Schema Design

Focuses on data use

Page 11: MongoDB Schema Design

Schema Design Considerations

• What is a priority?
– High consistency
– High read performance
– High write performance

• How does the application access and manipulate data?
– Read/Write Ratio
– Types of Queries / Updates
– Data life-cycle and growth
– Analytics (Map Reduce, Aggregation)

Page 12: MongoDB Schema Design

Tools for Data Access

• Flexible Schemas

• Embedded data structures

• Secondary Indexes

• Multi-Key Indexes

• Aggregation Framework
– Pipeline operators: $project, $match, $limit, $skip, $sort, $group, $unwind

• No Joins

Page 13: MongoDB Schema Design

Data Manipulation

• Conditional Query Operators
– Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte
– Vector: $in, $nin, $all, $size

• Atomic Update Operators
– Scalar: $inc, $set, $unset
– Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet
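To give a rough feel for what a few of these update operators do to a document, here is a minimal in-memory sketch in plain JavaScript. The `applyUpdate` helper is hypothetical and exists only for this illustration; it is not part of MongoDB or any driver, it handles only `$inc`, `$set`, `$push`, and `$addToSet` on top-level fields, and it compares `$addToSet` values as scalars.

```javascript
// Hypothetical helper (not a MongoDB API): applies a small subset of
// MongoDB-style update operators to a plain object and returns a copy.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // leave the input untouched
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount; // missing field starts at 0
  }
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    result[field] = [...(result[field] || []), value]; // always appends
  }
  for (const [field, value] of Object.entries(update.$addToSet || {})) {
    const current = result[field] || [];
    // Appends only when the (scalar) value is not already present.
    result[field] = current.includes(value) ? current : [...current, value];
  }
  return result;
}

// Field names below (copies, tags, holds) are assumptions for the example.
const book = { _id: "123456789", copies: 2, tags: ["databases"] };
const updated = applyUpdate(book, {
  $inc: { copies: 1 },
  $addToSet: { tags: "databases" },
  $push: { holds: "joe" },
});
console.log(updated);
```

The point of the sketch is that each operator changes one field in place, so the application does not have to read the whole document, modify it, and write it back.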

Page 14: MongoDB Schema Design

Schema Design Example

Page 15: MongoDB Schema Design

Library Management Application

• Patrons

• Books

• Authors

• Publishers

Page 16: MongoDB Schema Design

One to One Relations

example

Page 17: MongoDB Schema Design

patron = {
  _id: "joe",
  name: "Joe Bookreader"
}

address = {
  patron_id: "joe",
  street: "123 Fake St.",
  city: "Faketon",
  state: "MA",
  zip: 12345
}

Modeling Patrons

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
  }
}

Page 18: MongoDB Schema Design

One to One Relations

• “Contains” relationships are often embedded.

• Document provides a holistic representation of objects with embedded entities.

• Optimized read performance.
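The read-performance point can be sketched with plain JavaScript arrays standing in for collections. This is only an in-memory illustration under that assumption; the function names are made up for the example, and no MongoDB driver is involved.

```javascript
// Separate documents: the address points back at the patron.
const patrons = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { patron_id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
];

// Hypothetical loader: needs two lookups to assemble one patron.
function loadPatronSeparate(id) {
  const patron = patrons.find((p) => p._id === id);          // lookup 1
  const address = addresses.find((a) => a.patron_id === id); // lookup 2
  return { ...patron, address };
}

// Embedded: the single patron document already contains the address.
const patronsEmbedded = [
  {
    _id: "joe",
    name: "Joe Bookreader",
    address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
  },
];

// Hypothetical loader: one lookup returns the whole object.
function loadPatronEmbedded(id) {
  return patronsEmbedded.find((p) => p._id === id);
}

console.log(loadPatronSeparate("joe").address.city);
console.log(loadPatronEmbedded("joe").address.city);
```

Both loaders return the same information; the embedded form simply gets it in one read.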

Page 19: MongoDB Schema Design

examples

One To Many Relations

Page 20: MongoDB Schema Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  addresses: [
    { street: "1 Vernon St.", city: "Newton", state: "MA", … },
    { street: "52 Main St.", city: "Boston", state: "MA", … }
  ]
}

Patrons with many addresses

Page 21: MongoDB Schema Design

example 2

Publishers and Books

One to Many Relations

Page 22: MongoDB Schema Design

Publishers and Books relation

• Publishers put out many books

• Books have one publisher

Page 23: MongoDB Schema Design

MongoDB: The Definitive Guide
By Kristina Chodorow and Mike Dirolf
Published: 9/24/2010
Pages: 216
Language: English
Publisher: O’Reilly Media, CA

Book Data

Page 24: MongoDB Schema Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Book Model with Embedded Publisher

Page 25: MongoDB Schema Design

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Book Model with Normalized Publisher

Page 26: MongoDB Schema Design

publisher = {
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly"
}

Link with Publisher _id as a Reference

Page 27: MongoDB Schema Design

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA",
  books: [ "123456789", ... ]
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Link with Book _ids as a Reference

Page 28: MongoDB Schema Design

Where do you put the reference?

• Reference to single publisher on books
– Use when items have unbounded growth (unlimited # of books)

• Array of books in publisher document
– Optimal when many means a handful of items
– Use when there is a bound on potential growth
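The two placements can be compared with a small in-memory sketch in plain JavaScript. Arrays stand in for collections, the helper names are hypothetical, and the sample data carries both kinds of reference at once purely so the two lookups can be shown side by side.

```javascript
// Sample data holding BOTH reference styles, for comparison only.
const publishers = [
  { _id: "oreilly", name: "O'Reilly Media", books: ["123456789"] },
];
const books = [
  { _id: "123456789", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
];

// Option 1: reference on the book. The publisher document stays the same
// size however many books point at it.
function booksByPublisherId(publisherId) {
  return books.filter((b) => b.publisher_id === publisherId);
}

// Option 2: array of book _ids on the publisher. The publisher document
// grows with every new book, so this suits a bounded, small "many".
function booksFromPublisherArray(publisherId) {
  const publisher = publishers.find((p) => p._id === publisherId);
  return publisher.books.map((id) => books.find((b) => b._id === id));
}

console.log(booksByPublisherId("oreilly").map((b) => b.title));
console.log(booksFromPublisherArray("oreilly").map((b) => b.title));
```

In the mongo shell, option 1 would correspond to a query such as `db.books.find({ publisher_id: "oreilly" })`, assuming the books live in a collection named `books`.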

Page 29: MongoDB Schema Design

example 3

Books and Patrons

One to Many Relations

Page 30: MongoDB Schema Design

Books and Patrons

• Book can be checked out by one Patron at a time

• Patrons can check out many books (but not 1000s)

Page 31: MongoDB Schema Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... }
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  ...
}

Modeling Checkouts

Page 32: MongoDB Schema Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    { _id: "123456789", checked_out: "2012-10-15" },
    { _id: "987654321", checked_out: "2012-09-12" },
    ...
  ]
}

Modeling Checkouts

Page 33: MongoDB Schema Design

De-normalization

Provides data locality

Page 34: MongoDB Schema Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    { _id: "123456789",
      title: "MongoDB: The Definitive Guide",
      authors: [ "Kristina Chodorow", "Mike Dirolf" ],
      checked_out: ISODate("2012-10-15") },
    { _id: "987654321",
      title: "MongoDB: The Scaling Adventure",
      ... },
    ...
  ]
}

Modeling Checkouts - de-normalized
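As a sketch of how the de-normalized form might be produced, the following plain JavaScript copies the fields a patron view needs into the checkout entry at checkout time. The `checkOut` function is hypothetical, works on in-memory objects only, and uses a JavaScript `Date` where the slides use `ISODate`.

```javascript
// Hypothetical helper: returns a new patron object with the book's title and
// authors copied into the checked_out array (the de-normalized copy).
function checkOut(patron, book, date) {
  const entry = {
    _id: book._id,
    title: book.title,          // copied from the book document
    authors: [...book.authors], // copied from the book document
    checked_out: date,
  };
  return { ...patron, checked_out: [...(patron.checked_out || []), entry] };
}

const book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: ["Kristina Chodorow", "Mike Dirolf"],
};

const joe = checkOut({ _id: "joe", name: "Joe Bookreader" }, book, new Date("2012-10-15"));

// Listing Joe's checkouts now needs no second lookup into the books.
const titles = joe.checked_out.map((entry) => entry.title);
console.log(titles);
```

The trade-off is that the copied title and authors are a snapshot: if the book document changes later, the application has to update the copies as well.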

Page 35: MongoDB Schema Design

Referencing vs. Embedding

• Embedding is a bit like pre-joining data

• Document level operations are easy for the server to handle

• Embed when the “many” objects always appear with (viewed in the context of) their parents.

• Reference when you need more flexibility

How does your application access and manipulate data?

Page 36: MongoDB Schema Design

example

Many to Many Relations

Page 37: MongoDB Schema Design

book = {
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

author = {
  _id: "mdirolf",
  name: "Mike Dirolf",
  hometown: "Albany"
}

Books and Authors

Page 38: MongoDB Schema Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

author = {
  _id: "mdirolf",
  name: "Mike Dirolf",
  hometown: "Albany"
}

Relation stored in Book document

Page 39: MongoDB Schema Design

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [
    { book_id: 123456789, title: "MongoDB: The Definitive Guide" }
  ]
}

Relation stored in Author document

Page 40: MongoDB Schema Design

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  authors: [ "kchodorow", "mdirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York",
  books: [ 123456789, ... ]
}

author = {
  _id: "mdirolf",
  name: "Mike Dirolf",
  hometown: "Albany",
  books: [ 123456789, ... ]
}

Relation stored in both documents
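When the relation lives in both documents, the application has to keep the two sides in step. A minimal in-memory sketch of that bookkeeping in plain JavaScript follows; the `link` function is hypothetical and mutates plain objects rather than talking to a database.

```javascript
// Hypothetical helper: records the book/author relation on BOTH documents,
// skipping entries that are already present so repeated calls are harmless.
function link(book, author) {
  if (!book.authors.includes(author._id)) {
    book.authors.push(author._id);
  }
  if (!author.books.includes(book._id)) {
    author.books.push(book._id);
  }
}

const book = { _id: 123456789, title: "MongoDB: The Definitive Guide", authors: [] };
const kchodorow = { _id: "kchodorow", name: "Kristina Chodorow", books: [] };
const mdirolf = { _id: "mdirolf", name: "Mike Dirolf", books: [] };

link(book, kchodorow);
link(book, mdirolf);
link(book, kchodorow); // repeated call adds nothing

console.log(book.authors);
console.log(kchodorow.books, mdirolf.books);
```

Against a real database the two writes would be separate updates to two documents, which is one reason the later summary stresses application-level logic.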

Page 41: MongoDB Schema Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

db.books.find( { "authors.name": "Kristina Chodorow" } )

Where do you put the reference?

Think about common queries
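A query on a dotted path such as `authors.name` matches a book when any element of its `authors` array has that name. The plain JavaScript below imitates that matching rule over an in-memory array; `findByAuthorName` is a hypothetical stand-in for the query, and the second book is invented sample data.

```javascript
const books = [
  {
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "Kristina Chodorow" },
      { _id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
  {
    // Invented second document so the filter has something to exclude.
    title: "Some Other Book",
    authors: [{ _id: "someone", name: "Someone Else" }],
  },
];

// Hypothetical stand-in for a query on "authors.name": a book matches if
// at least one embedded author has the given name.
function findByAuthorName(collection, name) {
  return collection.filter((book) =>
    book.authors.some((author) => author.name === name)
  );
}

const matches = findByAuthorName(books, "Kristina Chodorow");
console.log(matches.map((book) => book.title));
```

Because the author names are stored inside the book, this common query is answered from the books alone, with no lookup into the authors.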

Page 42: MongoDB Schema Design

Where do you put the reference?

Think about indexes

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

db.books.createIndex( { "authors.name": 1 } )
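An index on a field inside an array is a multi-key index: it holds one entry per array element, so a book with two authors is reachable under both names. The plain JavaScript sketch below models that idea with a `Map`; it is a conceptual illustration, not how MongoDB stores indexes, and `buildAuthorNameIndex` and the `_id` value on the book are assumptions made for the example.

```javascript
// Hypothetical model of a multi-key index: author name -> list of book _ids.
function buildAuthorNameIndex(books) {
  const index = new Map();
  for (const book of books) {
    for (const author of book.authors) { // one index entry per array element
      const ids = index.get(author.name) || [];
      ids.push(book._id);
      index.set(author.name, ids);
    }
  }
  return index;
}

const books = [
  {
    _id: 123456789, // assumed _id, added so the index has something to point at
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "Kristina Chodorow" },
      { _id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
];

const index = buildAuthorNameIndex(books);
console.log(index.get("Kristina Chodorow"));
console.log(index.get("Mike Dirolf"));
```

With such an index in place, a lookup by author name goes straight to the matching books instead of scanning every document.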

Page 43: MongoDB Schema Design

Summary

• Schema design is different in MongoDB

• Basic data design principles apply

• Focus on how application accesses and manipulates data

• Evolve schema to meet changing requirements

• Application-level logic is important!

Page 44: MongoDB Schema Design

Emily Stolfo

#mongodbdays

Thank You

Ruby Engineer/Evangelist, 10gen

@EmStolfo