Page 1: Schema Design

Schema Design

Software Engineer, MongoDB

Craig Wilson

#MongoDBDays

@craiggwilson

Page 2: Schema Design

All application development is Schema Design

Page 3: Schema Design

Success comes from a Proper Data Structure

Page 4: Schema Design

Terminology

RDBMS        MongoDB
Database  ➜  Database
Table     ➜  Collection
Row       ➜  Document
Index     ➜  Index
Join      ➜  Embedding & Linking

Page 5: Schema Design

Working with Documents

Page 6: Schema Design

{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    authors: [
        { _id: "kchodorow", name: "Kristina Chodorow" },
        { _id: "mdirolf", name: "Mike Dirolf" }
    ],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English",
    publisher: {
        name: "O’Reilly Media",
        founded: "1980",
        location: "CA"
    }
}

What is a Document?

Page 7: Schema Design

Traditional Schema Design

Focus on Data Storage

Page 8: Schema Design

Document Schema Design

Focus on Data Usage

Page 9: Schema Design

Traditional Schema Design

What answers do I have?

Page 10: Schema Design

Document Schema Design

What questions do I have?

Page 11: Schema Design

Schema Design By Example

Page 12: Schema Design

Library Management Application

•  Patrons/Users

•  Books

•  Authors

•  Publishers

Page 13: Schema Design

Question: What is a Patron’s Address?

Page 14: Schema Design

> patron = db.patrons.find({ _id: "joe" })
{
    _id: "joe",
    name: "Joe Bookreader"
}

> address = db.addresses.find({ _id: "joe" })
{
    _id: "joe",
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
}

 

A Patron and their Address

Page 15: Schema Design

> patron = db.patrons.find({ _id: "joe" })
{
    _id: "joe",
    name: "Joe Bookreader",
    address: {
        street: "123 Fake St.",
        city: "Faketon",
        state: "MA",
        zip: 12345
    }
}

A Patron and their Address

Page 16: Schema Design

One-to-One Relationships

•  “Belongs to” relationships are often embedded.

•  Holistic representation of entities with their embedded attributes and relationships.

•  Optimized for read performance

Page 17: Schema Design

Question: What are a Patron’s Addresses?

Page 18: Schema Design

> patron = db.patrons.find({ _id: "bob" })
{
    _id: "bob",
    name: "Bob Knowitall",
    addresses: [
        { street: "1 Vernon St.", city: "Newton", … },
        { street: "52 Main St.", city: "Boston", … }
    ]
}

A Patron and their Addresses

Page 19: Schema Design

> patron = db.patrons.find({ _id: "bob" })
{
    _id: "bob",
    name: "Bob Knowitall",
    addresses: [
        { street: "1 Vernon St.", city: "Newton", … },
        { street: "52 Main St.", city: "Boston", … }
    ]
}

> patron = db.patrons.find({ _id: "joe" })
{
    _id: "joe",
    name: "Joe Bookreader",
    address: { street: "123 Fake St.", city: "Faketon", … }
}

A Patron and their Addresses

Page 20: Schema Design

Migration Possibilities

•  Migrate all documents when the schema changes.

•  Migrate On-Demand
   –  As we pull up a patron’s document, we make the change.
   –  Any patrons that never come into the library never get updated.

•  Leave it alone
   –  As long as the application knows about both types…
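The "Migrate On-Demand" option could be sketched roughly as follows. This is only an illustration in plain JavaScript, not code from the talk: the helper name `normalizePatron` is hypothetical, and writing the upgraded document back to the database is left out.

```javascript
// Hypothetical helper: upgrade a patron document from the old shape
// (a single `address` sub-document) to the new shape (an `addresses` array).
function normalizePatron(patron) {
  // Already in the new shape: nothing to do.
  if (Array.isArray(patron.addresses)) {
    return patron;
  }
  // Copy every field except the legacy `address`, then wrap it in an array.
  const { address, ...rest } = patron;
  return { ...rest, addresses: address ? [address] : [] };
}

// The old-style document for "joe" from the earlier slide.
const joe = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
};

const upgraded = normalizePatron(joe);
console.log(JSON.stringify(upgraded, null, 2));
```

An application would call such a helper each time it loads a patron, so documents are upgraded only when they are actually touched.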

Page 21: Schema Design

Question: Who is the publisher of this book?

Page 22: Schema Design

Book

MongoDB: The Definitive Guide,

By Kristina Chodorow and Mike Dirolf

Published: 9/24/2010

Pages: 216

Language: English

Publisher: O’Reilly Media, CA

Page 23: Schema Design

> book = db.books.find({ _id: "123" })
{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    authors: [ "Kristina Chodorow", "Mike Dirolf" ],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English",
    publisher: {
        name: "O’Reilly Media",
        founded: "1980",
        location: "CA"
    }
}

Book with embedded Publisher

Page 24: Schema Design

Book with embedded Publisher

•  Optimized for read performance of Books

•  Other queries become difficult
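To illustrate why other queries become difficult with an embedded publisher, here is a small in-memory sketch in plain JavaScript. Listing the publishers means scanning every book and de-duplicating the embedded copies. The helper `listPublishers` and the books with `_id` "456" and "789" are made up for the example.

```javascript
// Books with embedded publishers. Book "123" comes from the slides;
// books "456" and "789" are hypothetical sample data.
const books = [
  {
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    publisher: { name: "O'Reilly Media", founded: "1980", location: "CA" },
  },
  {
    _id: "456",
    title: "Hypothetical Second Book",
    publisher: { name: "O'Reilly Media", founded: "1980", location: "CA" },
  },
  {
    _id: "789",
    title: "Hypothetical Third Book",
    publisher: { name: "Penguin", founded: "1983", location: "CA" },
  },
];

// Hypothetical helper: walk every book and keep the first copy of each
// publisher, keyed by name.
function listPublishers(allBooks) {
  const byName = new Map();
  for (const book of allBooks) {
    if (!byName.has(book.publisher.name)) {
      byName.set(book.publisher.name, book.publisher);
    }
  }
  return [...byName.values()];
}

console.log(listPublishers(books).map((p) => p.name));
```

The cost grows with the number of books rather than the number of publishers, which is one reason a separate publishers collection can be attractive.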

Page 25: Schema Design

Question: Who are all the publishers in the system?

Page 26: Schema Design

> publishers = db.publishers.find()
{
    _id: "oreilly",
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
}
{
    _id: "penguin",
    name: "Penguin",
    founded: "1983",
    location: "CA"
}

All Publishers

Page 27: Schema Design

> book = db.books.findOne({ _id: "123" })
{
    _id: "123",
    publisher_id: "oreilly",
    title: "MongoDB: The Definitive Guide",
    authors: [ "Kristina Chodorow", "Mike Dirolf" ],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English"
}

> db.publishers.find({ _id: book.publisher_id })
{
    _id: "oreilly",
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
}

Book with linked Publisher

Page 28: Schema Design

Question: What are all the books a publisher has published?

Page 29: Schema Design

> publisher = db.publishers.findOne({ _id: "oreilly" })
{
    _id: "oreilly",
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA",
    books: ["123", …]
}

> books = db.books.find({ _id: { $in: publisher.books } })

Publisher with linked Books

Page 30: Schema Design

Question: Who are the authors of a given book?

Page 31: Schema Design

> book = db.books.findOne({ _id: "123" })
{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English",
    authors: ["kchodorow", "mdirolf"]
}

> authors = db.authors.find({ _id: { $in: book.authors } })
{ _id: "kchodorow", name: "Kristina Chodorow", hometown: … }
{ _id: "mdirolf", name: "Mike Dirolf", hometown: … }

Books with linked Authors

Page 32: Schema Design

> book = db.books.find({ _id: "123" })
{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English",
    authors: [
        { id: "kchodorow", name: "Kristina Chodorow" },
        { id: "mdirolf", name: "Mike Dirolf" }
    ]
}

Books with linked Authors

Page 33: Schema Design

Question: What are all the books an author has written?

Page 34: Schema Design

> authors = db.authors.find({ _id: "kchodorow" })
{
    _id: "kchodorow",
    name: "Kristina Chodorow",
    hometown: "Cincinnati",
    books: [
        { id: "123", title: "MongoDB: The Definitive Guide" }
    ]
}

Authors with linked Books

Page 35: Schema Design

> authors = db.authors.find({ _id: "kchodorow" })
{
    _id: "kchodorow",
    name: "Kristina Chodorow",
    hometown: "Cincinnati",
    books: [
        { id: "123", title: "MongoDB: The Definitive Guide" }
    ]
}

> book = db.books.find({ _id: "123" })
{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    authors: [
        { id: "kchodorow", name: "Kristina Chodorow" },
        { id: "mdirolf", name: "Mike Dirolf" }
    ]
}

Links on both Authors and Books
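When links live on both sides, every new relationship has to be recorded twice. The following is a minimal in-memory sketch of that bookkeeping in plain JavaScript. The helper `linkBookToAuthor` is hypothetical, and a real application would issue two database updates instead of mutating objects.

```javascript
// Hypothetical helper: record the relationship on both documents,
// skipping a side that already has the link so repeated calls are safe.
function linkBookToAuthor(book, author) {
  const bookHasAuthor = book.authors.some((a) => a.id === author._id);
  if (!bookHasAuthor) {
    book.authors.push({ id: author._id, name: author.name });
  }
  const authorHasBook = author.books.some((b) => b.id === book._id);
  if (!authorHasBook) {
    author.books.push({ id: book._id, title: book.title });
  }
}

const book = { _id: "123", title: "MongoDB: The Definitive Guide", authors: [] };
const kristina = {
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [],
};

linkBookToAuthor(book, kristina);
linkBookToAuthor(book, kristina); // second call adds nothing

console.log(book.authors.length, kristina.books.length); // prints "1 1"
```

If only one of the two writes succeeds, the two documents disagree, which is the data-integrity cost of keeping links in both places.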

Page 36: Schema Design

Linking vs. Embedding

•  Embedding
   –  Great for read performance
   –  Writes can be slow
   –  Data integrity needs to be managed

•  Linking
   –  Flexible
   –  Data integrity is built-in
   –  Work is done during reads
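The trade-off can be made concrete with a small in-memory sketch in plain JavaScript. Renaming an author touches every book that embeds a copy of the name, but only one document when books link to authors by id. All helper names, the book with `_id` "456", and the new name are hypothetical.

```javascript
// Embedded: each book carries its own copy of the author's name.
const embeddedBooks = [
  {
    _id: "123",
    authors: [
      { id: "kchodorow", name: "Kristina Chodorow" },
      { id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
  // Hypothetical second book by the same author.
  { _id: "456", authors: [{ id: "kchodorow", name: "Kristina Chodorow" }] },
];

// Returns how many book documents had to be rewritten.
function renameEmbeddedAuthor(books, authorId, newName) {
  let touched = 0;
  for (const book of books) {
    let changed = false;
    for (const author of book.authors) {
      if (author.id === authorId) {
        author.name = newName;
        changed = true;
      }
    }
    if (changed) touched += 1;
  }
  return touched;
}

// Linked: books hold author ids only; names live in one place.
const authorsById = new Map([
  ["kchodorow", { _id: "kchodorow", name: "Kristina Chodorow" }],
  ["mdirolf", { _id: "mdirolf", name: "Mike Dirolf" }],
]);
const linkedBooks = [
  { _id: "123", authors: ["kchodorow", "mdirolf"] },
  { _id: "456", authors: ["kchodorow"] },
];

// A rename is a single write, but reading names costs a lookup per author.
function renameLinkedAuthor(authors, authorId, newName) {
  authors.get(authorId).name = newName;
}

function authorNames(book, authors) {
  return book.authors.map((id) => authors.get(id).name);
}

const embeddedWrites = renameEmbeddedAuthor(embeddedBooks, "kchodorow", "K. Chodorow");
renameLinkedAuthor(authorsById, "kchodorow", "K. Chodorow");
console.log(embeddedWrites, authorNames(linkedBooks[0], authorsById));
```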

Page 37: Schema Design

Question: What are all the books about databases?

Page 38: Schema Design

> book = db.books.find({ _id: "123" })
{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    category: "MongoDB"
}

> categories = db.categories.find({ _id: "MongoDB" })
{
    _id: "MongoDB",
    parent: "Databases"
}

Categories as Documents
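With categories stored as documents that point at a parent, answering "what are all the books about databases?" means walking up the parent chain for each book's category. A rough in-memory sketch in plain JavaScript follows. The helpers `isUnder` and `booksAbout` are hypothetical, and the "Databases" and "Programming" category documents are assumed rather than shown in the slides.

```javascript
// Category documents keyed by _id. Only the "MongoDB" document appears in
// the slides; the other two are assumed for illustration.
const categoriesById = new Map([
  ["MongoDB", { _id: "MongoDB", parent: "Databases" }],
  ["Databases", { _id: "Databases", parent: "Programming" }],
  ["Programming", { _id: "Programming", parent: null }],
]);

const books = [
  { _id: "123", title: "MongoDB: The Definitive Guide", category: "MongoDB" },
];

// Follow parent links upward until the ancestor is found or the chain ends.
// The `seen` set guards against accidental cycles in the data.
function isUnder(categoryId, ancestorId, categories) {
  const seen = new Set();
  let current = categoryId;
  while (current != null && !seen.has(current)) {
    if (current === ancestorId) return true;
    seen.add(current);
    const doc = categories.get(current);
    current = doc ? doc.parent : null;
  }
  return false;
}

function booksAbout(ancestorId, allBooks, categories) {
  return allBooks.filter((book) => isUnder(book.category, ancestorId, categories));
}

console.log(booksAbout("Databases", books, categoriesById).map((b) => b.title));
```

Against a real database each step up the chain would be another query, which is the read-time work this design trades for a normalized category tree.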

Page 39: Schema Design

> book = db.books.find({ _id: "123" })
{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    categories: ["MongoDB", "Databases", "Programming"]
}

> db.books.find({ categories: "Databases" })

Categories as an Array

Page 40: Schema Design

> book = db.books.find({ _id: "123" })
{
    _id: "123",
    title: "MongoDB: The Definitive Guide",
    category: "Programming/Databases/MongoDB"
}

> db.books.find({ category: /^Programming\/Databases\// })

Categories as a Path
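The path approach relies on an anchored prefix match. One way to build such a pattern safely from a category path is sketched below in plain JavaScript. The helper names are hypothetical, and the choice to also match a book filed exactly at the prefix is an assumption of this sketch.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: match the given category path itself or anything
// below it, but not siblings that merely share a textual prefix.
function categoryPrefixPattern(prefix) {
  return new RegExp("^" + escapeRegExp(prefix) + "(/|$)");
}

const pattern = categoryPrefixPattern("Programming/Databases");

console.log(pattern.test("Programming/Databases/MongoDB"));
console.log(pattern.test("Programming/DatabasesAndMore"));

// In the shell, such a pattern could be passed as the query value, e.g.
// db.books.find({ category: pattern })
```

Anchoring the pattern at the start of the string is what lets a prefix match of this kind make use of an index on `category`.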

Page 41: Schema Design

Conclusion

•  Schema design is different in MongoDB

•  Basic data design principles stay the same

•  Focus on how an application accesses/manipulates data

•  Evolve the schema to meet requirements as they change

Page 42: Schema Design

Schema Design

Software Engineer, 10gen

Craig Wilson

#MongoDBDays

@craiggwilson