Schema Design #MongoDBTokyo Derick Rethans Software Engineer, 10gen
Page 1: Schema & Design

Schema Design

#MongoDBTokyo

Derick Rethans, Software Engineer, 10gen

Page 2: Schema & Design

Agenda

• Working with documents

• Evolving a Schema

• Queries and Indexes

• Common Patterns

Page 3: Schema & Design

Terminology

RDBMS ➜ MongoDB
Database ➜ Database
Table ➜ Collection
Row ➜ Document
Index ➜ Index
Join ➜ Embedded Document
Foreign Key ➜ Reference

Page 4: Schema & Design

Working with Documents

Page 5: Schema & Design

Modeling Data

Page 6: Schema & Design

Documents

Provide flexibility and performance

Page 7: Schema & Design

Normalized Data

Page 8: Schema & Design

De-Normalized (embedded) Data

Page 9: Schema & Design

Relational Schema Design

Focus on data storage

Page 10: Schema & Design

Document Schema Design

Focus on data use

Page 11: Schema & Design

Schema Design Considerations

• How do we manipulate the data?

– Dynamic Ad-Hoc Queries

– Atomic Updates

– Map Reduce

• What are the access patterns of the application?

– Read/Write Ratio

– Types of Queries / Updates

– Data life-cycle and growth rate

Page 12: Schema & Design

Data Manipulation

• Conditional Query Operators

– Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte

– Vector: $in, $nin, $all, $size

• Atomic Update Operators

– Scalar: $inc, $set, $unset

– Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet
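To make a few of the update operators above concrete, here is a plain JavaScript model of what they do to a single document. This is an in-memory illustration only: the helper names (`inc`, `push`, `addToSet`, `pull`) and the sample document are hypothetical, they are not MongoDB APIs, and the sketch only handles scalar array values.

```javascript
// In-memory model of a few update operators. Helper names are hypothetical.

// Like $inc: add to a numeric field, treating a missing field as 0.
function inc(doc, field, amount) {
  doc[field] = (doc[field] || 0) + amount;
}

// Like $push: always append, even if the value is already present.
function push(doc, field, value) {
  (doc[field] = doc[field] || []).push(value);
}

// Like $addToSet: append only if the value is not already in the array.
function addToSet(doc, field, value) {
  const arr = (doc[field] = doc[field] || []);
  if (!arr.includes(value)) arr.push(value);
}

// Like $pull: remove every element equal to the value.
function pull(doc, field, value) {
  if (Array.isArray(doc[field])) {
    doc[field] = doc[field].filter((v) => v !== value);
  }
}

// Hypothetical sample document.
const opsDemo = { _id: "joe", visits: 1, tags: ["fiction"] };
inc(opsDemo, "visits", 1);            // visits: 2
push(opsDemo, "tags", "fiction");     // duplicate is appended
addToSet(opsDemo, "tags", "fiction"); // no change, already present
addToSet(opsDemo, "tags", "history"); // appended
pull(opsDemo, "tags", "fiction");     // both "fiction" entries removed
```

The difference between `push` and `addToSet` is the one that usually matters for schema design: the first models a list, the second models a set.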

Page 13: Schema & Design

Data Access

• Flexible Schemas

• Ability to embed complex data structures

• Secondary Indexes

• Multi-Key Indexes

• Aggregation Framework

– $project, $match, $limit, $skip, $sort, $group, $unwind

• No Joins
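The aggregation stages listed above compose like chained array operations. The sketch below mimics `$match`, `$group`, and `$sort` in plain JavaScript over an in-memory array. The sample data and field values are hypothetical, and this is only a mental model of the stage semantics, not how the server executes a pipeline.

```javascript
// Hypothetical sample data standing in for a books collection.
const sampleBooks = [
  { title: "A", language: "English", pages: 216 },
  { title: "B", language: "English", pages: 300 },
  { title: "C", language: "Japanese", pages: 150 },
  { title: "D", language: "Japanese", pages: 90 },
];

// $match: keep only books with at least 100 pages.
const matched = sampleBooks.filter((b) => b.pages >= 100);

// $group: one output document per language, with a count and a page total.
const groups = new Map();
for (const b of matched) {
  const g = groups.get(b.language) || { _id: b.language, count: 0, totalPages: 0 };
  g.count += 1;
  g.totalPages += b.pages;
  groups.set(b.language, g);
}

// $sort: order the groups by totalPages, descending.
const pipelineResult = [...groups.values()].sort(
  (a, b) => b.totalPages - a.totalPages
);
```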

Page 14: Schema & Design

Getting Started

Page 15: Schema & Design

Library Management Application

• Patrons

• Books

• Authors

• Publishers

Page 16: Schema & Design

An Example

One to One Relations

Page 17: Schema & Design

patron = {
  _id: "joe",
  name: "Joe Bookreader"
}

address = {
  patron_id: "joe",
  street: "123 Fake St.",
  city: "Faketon",
  state: "MA",
  zip: 12345
}

Modeling Patrons

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
  }
}
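The two patron designs differ mainly in how many lookups it takes to assemble one patron. A minimal in-memory sketch, assuming `Map` objects as stand-ins for collections (the function and variable names are hypothetical):

```javascript
// In-memory stand-ins for the separate patron and address collections.
const patronsById = new Map([["joe", { _id: "joe", name: "Joe Bookreader" }]]);
const addressesByPatron = new Map([
  ["joe", { patron_id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 }],
]);

// Referenced design: two lookups are needed to build the full patron.
function loadPatronReferenced(id) {
  const patron = patronsById.get(id);
  if (!patron) return null;
  return { ...patron, address: addressesByPatron.get(id) || null };
}

// Embedded design: the address travels with the patron document.
const embeddedPatrons = new Map([
  ["joe", {
    _id: "joe",
    name: "Joe Bookreader",
    address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
  }],
]);

// One lookup returns everything.
function loadPatronEmbedded(id) {
  return embeddedPatrons.get(id) || null;
}
```

Against a real server, each extra lookup in the referenced design would be a separate query.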

Page 18: Schema & Design

One to One Relations

• Mostly the same as the relational approach

• Generally good idea to embed “contains” relationships

• Document model provides a holistic representation of objects

Page 19: Schema & Design

An Example

One To Many Relations

Page 20: Schema & Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  addresses: [
    {street: "1 Vernon St.", city: "Newton", state: "MA", …},
    {street: "52 Main St.", city: "Boston", state: "MA", …}
  ]
}

Modeling Patrons

Page 21: Schema & Design

Publishers and Books

• Publishers put out many books

• Books have one publisher

Page 22: Schema & Design

MongoDB: The Definitive Guide
By Kristina Chodorow and Mike Dirolf
Published: 9/24/2010
Pages: 216
Language: English
Publisher: O’Reilly Media, CA

Book

Page 23: Schema & Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Modeling Books – Embedded Publisher

Page 24: Schema & Design

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Modeling Books & Publisher Relationship

Page 25: Schema & Design

publisher = {
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly"
}

Publisher _id as a Foreign Key

Page 26: Schema & Design

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA",
  books: [ "123456789", ... ]
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Book _id as a Foreign Key

Page 27: Schema & Design

Where do you put the foreign key?

• Array of books inside of publisher

– Makes sense when many means a handful of items

– Useful when items have bound on potential growth

• Reference to single publisher on books

– Useful when items have unbounded growth (unlimited # of books)

• SQL doesn’t give you a choice, no arrays
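Both placements of the foreign key answer the question "which books does this publisher have?", just from opposite ends. A small in-memory sketch of the two lookups; the second book and all function names are hypothetical, added only so the result has more than one element:

```javascript
// Books carry a publisher_id (reference on the "many" side).
const fkBooks = [
  { _id: "123456789", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: "000000001", title: "Hypothetical Second Book", publisher_id: "oreilly" },
];

// The publisher carries an array of book ids (reference on the "one" side).
const fkPublisher = {
  _id: "oreilly",
  name: "O’Reilly Media",
  books: ["123456789", "000000001"],
};

// Option 1: scan books for a matching publisher_id.
// The book list can grow without the publisher document growing.
function booksByReference(publisherId) {
  return fkBooks.filter((b) => b.publisher_id === publisherId);
}

// Option 2: resolve each id stored in the publisher's array.
// The publisher document grows with every book added.
function booksByArray(publisher) {
  const byId = new Map(fkBooks.map((b) => [b._id, b]));
  return publisher.books.map((id) => byId.get(id)).filter(Boolean);
}
```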

Page 28: Schema & Design

Another Example

One to Many Relations

Page 29: Schema & Design

Books and Patrons

• Book can be checked out by one Patron at a time

• Patrons can check out many books (but not 1000s)

Page 30: Schema & Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... }
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  ...
}

Modeling Checkouts

Page 31: Schema & Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    { _id: "123456789", checked_out: "2012-10-15" },
    { _id: "987654321", checked_out: "2012-09-12" },
    ...
  ]
}

Modeling Checkouts

Page 32: Schema & Design

De-normalize for speed

De-normalization provides data locality

Page 33: Schema & Design

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    { _id: "123456789",
      title: "MongoDB: The Definitive Guide",
      authors: [ "Kristina Chodorow", "Mike Dirolf" ],
      checked_out: ISODate("2012-10-15") },
    { _id: "987654321",
      title: "MongoDB: The Scaling Adventure",
      ... },
    ...
  ]
}

Modeling Checkouts -- de-normalized

Page 34: Schema & Design

Referencing vs. Embedding

• Embedding is a bit like pre-joined data

• Document level ops are easy for server to handle

• Embed when the “many” objects always appear with (viewed in the context of) their parents.

• Reference when you need more flexibility
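The flexibility cost of embedding shows up on writes: a value copied into another document has to be updated wherever it was copied. A minimal in-memory sketch of retitling a book whose title is also embedded in patron checkouts (the function name, the second patron, and the new title are hypothetical):

```javascript
// Canonical book documents.
const canonicalBooks = new Map([
  ["123456789", { _id: "123456789", title: "MongoDB: The Definitive Guide" }],
]);

// Patrons with de-normalized copies of the book title.
const denormPatrons = [
  { _id: "joe", checked_out: [{ _id: "123456789", title: "MongoDB: The Definitive Guide" }] },
  { _id: "ann", checked_out: [] }, // hypothetical second patron
];

// Retitle a book and every embedded copy. Returns the number of copies touched.
function retitleBook(bookId, newTitle) {
  const book = canonicalBooks.get(bookId);
  if (!book) return 0;
  book.title = newTitle;
  let copiesUpdated = 0;
  for (const patron of denormPatrons) {
    for (const entry of patron.checked_out) {
      if (entry._id === bookId) {
        entry.title = newTitle;
        copiesUpdated += 1;
      }
    }
  }
  return copiesUpdated;
}
```

With references only, the same change would touch one document. With embedded copies, reads are a single fetch but writes fan out.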

Page 35: Schema & Design

An Example

Single Table Inheritance

Page 36: Schema & Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  kind: "loanable",
  locations: [ ... ],
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Single Table Inheritance

Page 37: Schema & Design

An Example

Many to Many Relations

Page 38: Schema & Design

Relational Approach

Page 39: Schema & Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "K-Awesome" },
    { _id: "mdirolf", name: "Batman Mike" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}

Books and Authors
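With author ids and display names embedded in each book, and full details kept in separate author documents, both directions of the many-to-many relation can be followed without a join table. An in-memory sketch (function names are hypothetical; only the one author document shown on the slide is included, so the other author falls back to the embedded stub):

```javascript
// Full author documents, keyed by _id.
const m2mAuthors = new Map([
  ["kchodorow", { _id: "kchodorow", name: "Kristina Chodorow", hometown: "New York" }],
]);

// Books embed a small stub (id plus display name) for each author.
const m2mBooks = [
  {
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "K-Awesome" },
      { _id: "mdirolf", name: "Batman Mike" },
    ],
  },
];

// Author -> books: match on the embedded author id.
function booksByAuthor(authorId) {
  return m2mBooks.filter((b) => b.authors.some((a) => a._id === authorId));
}

// Book -> authors: resolve each stub to the full document when one exists.
function authorDetails(book) {
  return book.authors.map((a) => m2mAuthors.get(a._id) || a);
}
```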

Page 40: Schema & Design

An Example

Trees

Page 41: Schema & Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  category: "MongoDB"
}

category = { _id: "MongoDB", parent: "Databases" }
category = { _id: "Databases", parent: "Programming" }

Parent Links

Page 42: Schema & Design

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

category = { _id: "MongoDB", children: [ 123456789, … ] }
category = { _id: "Databases", children: [ "MongoDB", "Postgres" ] }
category = { _id: "Programming", children: [ "Databases", "Languages" ] }

Child Links

Page 43: Schema & Design

Modeling Trees

• Parent Links

- Each node is stored as a document

- Contains the id of the parent

• Child Links

- Each node contains the ids of the children

- Can support graphs (multiple parents / children)
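With parent links, finding all ancestors of a node means following one link at a time. A minimal sketch over an in-memory map of category id to parent id; the function name is hypothetical, and treating "Programming" as a root with a `null` parent is an assumption, since the slides do not show that document:

```javascript
// Category id -> parent id. The null parent for the root is assumed.
const categoryParents = new Map([
  ["MongoDB", "Databases"],
  ["Databases", "Programming"],
  ["Programming", null],
]);

// Walk parent links upward, nearest ancestor first.
// The `seen` set guards against accidental cycles in the data.
function ancestorsOf(categoryId) {
  const path = [];
  const seen = new Set([categoryId]);
  let current = categoryParents.get(categoryId);
  while (current != null && !seen.has(current)) {
    path.push(current);
    seen.add(current);
    current = categoryParents.get(current);
  }
  return path;
}
```

Against a real database each step of the loop would be its own query, so the cost grows with the depth of the tree.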

Page 44: Schema & Design

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  categories: [ "Programming", "Databases", "MongoDB" ]
}

book = {
  title: "MySQL: The Definitive Guide",
  authors: [ "Michael Kofler" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  parent: "MongoDB",
  ancestors: [ "Programming", "Databases", "MongoDB" ]
}

Array of Ancestors
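Storing the full ancestor path on each book turns "everything under this category" into a single membership test instead of a tree walk. In MongoDB, a query on an array field with a scalar value matches documents whose array contains that value. The in-memory sketch below models the same check with `Array.prototype.includes`; the second book and the function name are hypothetical:

```javascript
// Each book stores its whole category path.
const ancestorBooks = [
  { title: "MongoDB: The Definitive Guide", categories: ["Programming", "Databases", "MongoDB"] },
  { title: "Hypothetical Python Book", categories: ["Programming", "Languages"] },
];

// All books at or below a category: one membership test per document.
function booksUnder(category) {
  return ancestorBooks.filter((b) => b.categories.includes(category));
}
```

The trade-off is that moving a category in the tree means rewriting the array on every book beneath it.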

Page 45: Schema & Design

An Example

Queues

Page 46: Schema & Design

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  available: 3
}

db.books.findAndModify({
  query: { _id: 123456789, available: { "$gt": 0 } },
  update: { $inc: { available: -1 } }
})

Book Document
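The `findAndModify` call above only decrements `available` when the query still matches, so the counter cannot drop below zero. The sketch below models that behavior in plain JavaScript on an in-memory map. The function name is hypothetical, and atomicity here comes for free because the sketch is single-threaded, whereas the server performs the match and update atomically on one document.

```javascript
// In-memory stand-in for the books collection.
const inventory = new Map([[123456789, { _id: 123456789, available: 3 }]]);

// Model of: query { _id, available: { $gt: 0 } }, update { $inc: { available: -1 } }.
// Returns the document as it was before the update (findAndModify's default),
// or null when no document matched.
function checkOutCopy(bookId) {
  const book = inventory.get(bookId);
  if (!book || !(book.available > 0)) return null;
  const before = { ...book };
  book.available -= 1;
  return before;
}
```

A caller treats `null` as "no copies left" and any other result as a successful checkout.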

Page 47: Schema & Design

Software Engineer, 10gen

Derick Rethans

#MongoDBTokyo

Thank You