MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10gen

May 25, 2015

MongoDB’s basic unit of storage is a document. Documents can represent rich, schema-free data structures, meaning that we have several viable alternatives to the normalized, relational model. In this talk, we’ll discuss the tradeoffs of various data modeling strategies in MongoDB using a library as a sample application. You will learn how to work with documents, evolve your schema, and apply common schema design patterns.
Transcript
Page 1

Consulting Engineer, 10gen

Jason Zucchetto

#MongoSF

Schema Design

Page 2

Agenda

• Working with documents

• Evolving a Schema

• Queries and Indexes

• Common Patterns

Page 3

Terminology

RDBMS ➜ MongoDB

Database ➜ Database

Table ➜ Collection

Row ➜ Document

Index ➜ Index

Join ➜ Embedded Document

Foreign Key ➜ Reference

Page 4

Working with Documents

Page 5

Modeling Data

Page 6

Documents

Provide flexibility and performance

Page 7

Normalized Data

Page 8

De-Normalized (embedded) Data

Page 9

Relational Schema Design

Focus on data storage

Page 10

Document Schema Design

Focus on data use

Page 11

Data Access

• Flexible Schemas

• Ability to embed complex data structures

• Secondary Indexes

• Multi-Key Indexes

• Aggregation Framework: $project, $match, $limit, $skip, $sort, $group, $unwind

• No Joins
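To make the $unwind and $group stages listed above a little more concrete, here is a minimal sketch that emulates them in plain JavaScript over an in-memory array. It does not call MongoDB at all; the helper names `unwind` and `countBy` and the two "example" books are hypothetical, added only for illustration.

```javascript
// In-memory stand-in for a books collection.
// Only the first entry comes from the slides; the others are made-up sample data.
const books = [
  { title: "MongoDB: The Definitive Guide", authors: ["Kristina Chodorow", "Mike Dirolf"] },
  { title: "Example Book", authors: ["A. Writer"] },
  { title: "Another Example", authors: ["A. Writer"] },
];

// Emulates $unwind: emit one copy of the document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((value) => ({ ...doc, [field]: value }))
  );
}

// Emulates $group with a { $sum: 1 } accumulator, keyed on a single field.
function countBy(docs, field) {
  const counts = {};
  for (const doc of docs) {
    counts[doc[field]] = (counts[doc[field]] || 0) + 1;
  }
  return counts;
}

// Number of books per author, computed from the embedded authors array.
const booksPerAuthor = countBy(unwind(books, "authors"), "authors");
```

Because the authors live inside each book document, the count needs no join: the array is flattened and grouped in one pass.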

Page 12

Getting Started

Page 13

Library Management Application

• Patrons

• Books

• Authors

• Publishers

Page 14

An Example

One to One Relations

Page 15

Modeling Patrons

// Referenced: the address lives in its own document
patron = {
  _id: "joe",
  name: "Joe Bookreader"
}

address = {
  patron_id: "joe",
  street: "123 Fake St.",
  city: "Faketon",
  state: "MA",
  zip: 12345
}

// Embedded: the address is part of the patron document
patron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
  }
}

Page 16

One to One Relations

• Mostly the same as the relational approach

• Generally a good idea to embed “contains” relationships

• Document model provides a holistic representation of objects
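The difference between the two patron layouts can be sketched in plain JavaScript. The arrays below are hypothetical in-memory stand-ins for collections, and `cityReferenced` and `cityEmbedded` are illustrative helper names, not MongoDB calls.

```javascript
// Referenced layout: patron and address are separate documents.
const patronsReferenced = [{ _id: "joe", name: "Joe Bookreader" }];
const addresses = [
  { patron_id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
];

// Embedded layout: the address is contained in the patron document.
const patronsEmbedded = [
  {
    _id: "joe",
    name: "Joe Bookreader",
    address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
  },
];

// Two lookups, combined in application code.
function cityReferenced(patronId) {
  const patron = patronsReferenced.find((p) => p._id === patronId);
  const address = addresses.find((a) => a.patron_id === patronId);
  return patron && address ? address.city : undefined;
}

// One lookup returns the whole object.
function cityEmbedded(patronId) {
  const patron = patronsEmbedded.find((p) => p._id === patronId);
  return patron ? patron.address.city : undefined;
}
```

Both functions return the same answer; the embedded version simply gets there with a single read, which is the "holistic representation" point made on the slide.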

Page 17

An Example

One To Many Relations

Page 18

Modeling Patrons

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  addresses: [
    { street: "1 Vernon St.", city: "Newton", state: "MA", … },
    { street: "52 Main St.", city: "Boston", state: "MA", … }
  ]
}
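With addresses embedded as an array, a patron matches a city if any one of its addresses does. The sketch below imitates that matching rule in plain JavaScript; `patronsInCity` is a hypothetical helper, and the elided address fields from the slide are simply left out.

```javascript
// In-memory stand-in for the patrons collection.
const patrons = [
  {
    _id: "joe",
    name: "Joe Bookreader",
    addresses: [
      { street: "1 Vernon St.", city: "Newton", state: "MA" },
      { street: "52 Main St.", city: "Boston", state: "MA" },
    ],
  },
];

// Roughly the semantics of querying on "addresses.city":
// the document matches when at least one array element matches.
function patronsInCity(city) {
  return patrons.filter((p) => p.addresses.some((a) => a.city === city));
}
```

A call such as `patronsInCity("Boston")` returns Joe once, even though he has two addresses, because the unit returned is the patron document.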

Page 19

Publishers and Books

• Publishers put out many books

• Books have one publisher

Page 20

Book

MongoDB: The Definitive Guide
By Kristina Chodorow and Mike Dirolf
Published: 9/24/2010
Pages: 216
Language: English

Publisher: O’Reilly Media, CA

Page 21

Modeling Books – Embedded Publisher

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Page 22

Modeling Books & Publisher Relationship

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Page 23

Publisher _id as a Foreign Key

publisher = {
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: "oreilly"
}

Page 24

Book _id as a Foreign Key

publisher = {
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA",
  books: [ "123456789", ... ]
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

Page 25

Where Do You Put the Foreign Key?

• Array of books inside of publisher
  – Makes sense when many means a handful of items
  – Useful when items have a bound on potential growth

• Reference to single publisher on books
  – Useful when items have unbounded growth (unlimited # of books)

• SQL doesn’t give you a choice, no arrays
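The two placements of the foreign key lead to two different lookups. The following is a plain JavaScript sketch over in-memory data, not MongoDB code; `booksViaArray` and `booksViaReference` are hypothetical helper names, and the publisher `_id` of "oreilly" is borrowed from the earlier slide.

```javascript
// Layout 1: the publisher holds an array of book ids (a bounded list).
const publisherWithBooks = {
  _id: "oreilly",
  name: "O’Reilly Media",
  books: ["123456789"],
};

// Layout 2: each book holds a single publisher_id (the list can grow without bound).
const allBooks = [
  { _id: "123456789", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
];

// Layout 1 lookup: read the publisher, then fetch the books it lists.
function booksViaArray(publisher, books) {
  const ids = new Set(publisher.books);
  return books.filter((b) => ids.has(b._id));
}

// Layout 2 lookup: filter books on their publisher_id field.
function booksViaReference(publisherId, books) {
  return books.filter((b) => b.publisher_id === publisherId);
}
```

In the first layout the publisher document grows with every new book, which is why the slide reserves it for cases where "many" means a handful.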

Page 26

Another Example

One to Many Relations

Page 27

Books and Patrons

• Book can be checked out by one Patron at a time

• Patrons can check out many books (but not 1000’s)

Page 28

Modeling Checkouts

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... }
}

book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  ...
}

Page 29

Modeling Checkouts

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    { _id: "123456789", checked_out: "2012-10-15" },
    { _id: "987654321", checked_out: "2012-09-12" },
    ...
  ]
}

Page 30

De-normalize for speed

Denormalization

Provides data locality

Page 31

Modeling Checkouts: Denormalized

patron = {
  _id: "joe",
  name: "Joe Bookreader",
  join_date: ISODate("2011-10-15"),
  address: { ... },
  checked_out: [
    {
      _id: "123456789",
      title: "MongoDB: The Definitive Guide",
      authors: [ "Kristina Chodorow", "Mike Dirolf" ],
      checked_out: ISODate("2012-10-15")
    },
    {
      _id: "987654321",
      title: "MongoDB: The Scaling Adventure",
      ...
    },
    ...
  ]
}
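A denormalized checkout copies the book fields a patron page needs into the patron document at checkout time. The sketch below shows that idea in plain JavaScript; `checkOut` and `checkedOutTitles` are hypothetical helpers operating on in-memory objects, not MongoDB operations.

```javascript
const book = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: ["Kristina Chodorow", "Mike Dirolf"],
};

const patron = { _id: "joe", name: "Joe Bookreader", checked_out: [] };

// Copy the displayed book fields into the patron's checked_out array.
function checkOut(patronDoc, bookDoc, when) {
  patronDoc.checked_out.push({
    _id: bookDoc._id,
    title: bookDoc.title,
    authors: [...bookDoc.authors], // copy, so later edits to the book do not leak in
    checked_out: when,
  });
}

// Listing titles touches only the patron document: no lookup per book.
function checkedOutTitles(patronDoc) {
  return patronDoc.checked_out.map((entry) => entry.title);
}

checkOut(patron, book, new Date("2012-10-15"));
```

The price of this locality is duplication: if a book's title were ever corrected, the copies inside patron documents would have to be updated separately.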

Page 32

Referencing vs. Embedding

• Embedding is a bit like pre-joined data

• Document-level ops are easy for server to handle

• Embed when the 'many' objects always appear with (i.e. viewed in the context of) their parent

• Reference when you need more flexibility

Page 33

An Example

Single Table Inheritance

Page 34

Single Table Inheritance

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  kind: "loanable",
  locations: [ ... ],
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}
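One way to read the `kind` field is as a discriminator: documents of several kinds share a collection, and each kind carries only the fields it needs. The plain JavaScript sketch below assumes that reading. The second kind ("reference"), the location value, and the helper `ofKind` are all hypothetical; the slide shows only the "loanable" document.

```javascript
// One in-memory "collection" holding documents of different kinds.
const books = [
  {
    title: "MongoDB: The Definitive Guide",
    kind: "loanable",
    locations: ["Main Branch"], // made-up value; the slide elides this list
  },
  {
    title: "Example Reference Volume", // hypothetical document of another kind
    kind: "reference",
  },
];

// Select documents of one kind, as a query on { kind: ... } would.
function ofKind(docs, kind) {
  return docs.filter((d) => d.kind === kind);
}

const loanable = ofKind(books, "loanable");
```

Because the schema is flexible, the "reference" document can omit `locations` entirely rather than storing an empty placeholder column.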

Page 35

An Example

Many to Many Relations

Page 36

Books and Authors

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "K-Awesome" },
    { _id: "mdirolf", name: "Batman Mike" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

author = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "New York"
}
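The books-and-authors layout keeps a small copy of each author (id plus display name) inside the book and the full record in a separate author document. A plain JavaScript sketch of the two directions of lookup follows; `booksByAuthor` and `authorDetails` are hypothetical helpers, and the `mdirolf` author document is assumed, since the slide shows only one author.

```javascript
// In-memory stand-ins for the authors and books collections.
const authors = [
  { _id: "kchodorow", name: "Kristina Chodorow", hometown: "New York" },
  { _id: "mdirolf", name: "Mike Dirolf" }, // assumed; not shown on the slide
];

const books = [
  {
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "K-Awesome" },
      { _id: "mdirolf", name: "Batman Mike" },
    ],
  },
];

// Author -> books: match on the _id embedded in the authors array.
function booksByAuthor(authorId) {
  return books.filter((b) => b.authors.some((a) => a._id === authorId));
}

// Book -> full author records: follow each embedded _id to the authors collection.
function authorDetails(book) {
  return book.authors.map((a) => authors.find((full) => full._id === a._id));
}
```

The embedded name is enough to render a book page; the second lookup is only needed when details such as `hometown` are wanted.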

Page 37

An Example

Trees

Page 38

Parent Links

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  category: "MongoDB"
}

category = { _id: "MongoDB", parent: "Databases" }
category = { _id: "Databases", parent: "Programming" }

Page 39

Child Links

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

category = { _id: "MongoDB", children: [ 123456789, … ] }
category = { _id: "Databases", children: [ "MongoDB", "Postgres" ] }
category = { _id: "Programming", children: [ "Databases", "Languages" ] }

Page 40

Modeling Trees

• Parent Links

- Each node is stored as a document

- Contains the id of the parent

• Child Links

- Each node contains the ids of its children

- Can support graphs (multiple parents per child)
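With parent links, finding all ancestors of a category means following one link at a time. The sketch below walks the chain in plain JavaScript over an in-memory array; `ancestorsOf` is a hypothetical helper, the root document with `parent: null` is an assumption, and the data is assumed to contain no cycles.

```javascript
// In-memory stand-in for the categories collection (parent-link layout).
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Programming", parent: null }, // assumed root; not shown on the slide
];

// Follow parent links upward, nearest ancestor first.
function ancestorsOf(id) {
  const byId = new Map(categories.map((c) => [c._id, c]));
  const path = [];
  let node = byId.get(id);
  while (node && node.parent) {
    path.push(node.parent);
    node = byId.get(node.parent);
  }
  return path;
}

const mongoAncestors = ancestorsOf("MongoDB"); // ["Databases", "Programming"]
```

Against a real database each step of the loop would be a separate query, so the cost grows with the depth of the tree.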

Page 41

Array of Ancestors

book = {
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  categories: [ "Programming", "Databases", "MongoDB" ]
}

book = {
  title: "MySQL: The Definitive Guide",
  authors: [ "Michael Kofler" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  parent: "MongoDB",
  ancestors: [ "Programming", "Databases", "MongoDB" ]
}
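Storing the full ancestor path on each book turns "everything under this category" into a single membership test. A plain JavaScript sketch over in-memory data follows; `booksUnder` is a hypothetical helper and the second book is made-up sample data.

```javascript
// In-memory stand-in for the books collection (array-of-ancestors layout).
const books = [
  {
    title: "MongoDB: The Definitive Guide",
    parent: "MongoDB",
    ancestors: ["Programming", "Databases", "MongoDB"],
  },
  {
    title: "Example Languages Book", // hypothetical
    parent: "Languages",
    ancestors: ["Programming", "Languages"],
  },
];

// One pass finds every book at or below the category: no tree walk needed.
function booksUnder(category) {
  return books.filter((b) => b.ancestors.includes(category));
}
```

Compared with parent links, reads become cheaper while moving a category becomes more expensive, since every affected `ancestors` array has to be rewritten.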

Page 42

An Example

Queues

Page 43

Book Document

book = {
  _id: 123456789,
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  available: 3
}

db.books.findAndModify({
  query: { _id: 123456789, available: { "$gt": 0 } },
  update: { $inc: { available: -1 } }
})
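The findAndModify call above combines a guard (`available` greater than 0) with a decrement in one operation on a single document. The sketch below imitates that behaviour in plain JavaScript on an in-memory array; `takeCopy` is a hypothetical helper, and returning the pre-update document mirrors the command's default when `new: true` is not set.

```javascript
// In-memory stand-in for the books collection.
const books = [
  { _id: 123456789, title: "MongoDB: The Definitive Guide", available: 3 },
];

// Match on _id and available > 0, then decrement.
// Returns the document as it was before the update, or null when nothing matched.
function takeCopy(id) {
  const book = books.find((b) => b._id === id && b.available > 0);
  if (!book) {
    return null;
  }
  const before = { ...book };
  book.available -= 1;
  return before;
}
```

Here the check and the update cannot interleave because JavaScript runs the function to completion; in MongoDB the same guarantee comes from the server applying the operation atomically to the one document, which is what keeps two patrons from taking the last copy.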

Page 44

Consulting Engineer, 10gen

Jason Zucchetto

#MongoSF

Thank You