Schema Design in MongoDB - TriMug Meetup North Carolina

Post on 10-Nov-2014

DESCRIPTION

The Schema Design talk given at the TriMug meetup in Durham, NC

Transcript

Schema Design

J. Randall Hunt, Developer and Evangelist at MongoDB

Who am I?

• J. Randall Hunt

• @jrhunt

• github.com/ranman

• randall@mongodb.com

Why are you here?

• To learn about MongoDB

• To engage in the MongoDB community

• To get free stuff

Levels of Abstraction!

ORMs To Save The Day!

Why change something that's been around for 40 years?

10TB

Data Humankind Has Produced Until 1991

Data Mankind Produces Every Day Since 2001

[Slide graphic: a grid of 55 "10TB" blocks]

NOSQL/NOREL

Relational Schema Design: Focus on data storage

Document Schema Design: Focus on data use

Why MongoDB?

• Focus on commodity hardware, not insane machines

• Document Store

• Dynamic Schema

• Sensible Defaults

• Modern Scaling Infrastructure

How People Use MongoDB

• Analytics

• Risk Management

• Caching Layer

• Recommendation Engines

• GIS

Nitty Gritty

RDBMS → MongoDB

Database → Database

Table → Collection

Row → Document

Index → Index

Join → Embedded Document

Foreign Key → Reference

Documents?

{ "hello": "world" }

{
  "_id": ObjectId("51638f8332e9bc556fe86de7"),
  "dstats": [
    { "+": "5", "-": "0", "f": "gitstreamer.py" },
    { "+": "3", "-": "3", "f": "post-commit.py" }
  ],
  "author": "ranman",
  "ts": ISODate("2013-04-08T19:48:11-0400"),
  "project": "gitstreamer",
  "msg": "turning this into a webapp"
}

CRUD

test> db.test.find()
Fetched 0 record(s) in 1ms -- Index[none]

test> db.test.insert({'hello': 'world'})
Inserted 1 record(s) in 1ms
Insert WriteResult({ "ok": 1, "n": 1 })

test> db.test.find({'hello': 'world'})
{ "_id": ObjectId("52d61af21486ef9e06d6d41a"), "hello": "world" }
Fetched 1 record(s) in 0ms -- Index[none]

test> db.test.update({'hello': 'world'}, {$set: {'hello': 'welt'}})
Updated 1 existing record(s) in 0ms
Update WriteResult({ "ok": 1, "n": 1 })
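To make the update step concrete, here is a minimal in-memory sketch in plain JavaScript of what a `$set` update does to a matching document. It runs without a MongoDB server. The `matches` and `updateWithSet` helpers are hypothetical names used for illustration only; they are not part of the MongoDB shell or driver API, and real queries and update operators are far richer than this.

```javascript
// Hypothetical in-memory stand-in for a collection; not the MongoDB API.
const collection = [{ _id: 1, hello: "world" }];

// True when every field in the query equals the same field in the document
// (top-level equality matching only).
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

// Mimics the effect of {$set: {...}}: overwrite or add the listed fields and
// leave everything else untouched. Stops after the first match, mirroring the
// default single-document behavior of the shell's update().
// Returns the number of documents changed (0 or 1).
function updateWithSet(docs, query, fields) {
  for (const doc of docs) {
    if (matches(doc, query)) {
      Object.assign(doc, fields);
      return 1;
    }
  }
  return 0;
}

const updated = updateWithSet(collection, { hello: "world" }, { hello: "welt" });
console.log(updated, collection);
```

The point of `$set` is that only the named fields change; the `_id` and any other fields in the document survive the update.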

Lots of Operators!

Enough already, I know what MongoDB is! Teach me schema design!

Library Management

• Patrons

• Books

• Authors

• Publishers

One To One Relations

patron = {
  _id: ObjectId("52d7173817d8bbd9564613cd"),
  name: 'Joe Schmoe'
}

address = {
  patron_id: ObjectId("52d7173817d8bbd9564613cd"),
  street: "100 Five Bridge Rd",
  city: "Clinton",
  state: "NC",
  zip: 28723
}

> patron = db.patrons.find({'name': 'Joe Schmoe'})[0]
> db.address.find({'patron_id': patron._id})

patron = {
  _id: ObjectId("52d7173817d8bbd9564613cd"),
  name: 'Joe Schmoe',
  address: {
    street: "100 Five Bridge Rd",
    city: "Clinton",
    state: "NC",
    zip: 28723
  }
}

> db.patrons.findOne({'name': /Joe Schmoe/})
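The difference between the two one-to-one shapes above can be sketched in plain JavaScript, using arrays as stand-ins for collections (no MongoDB server involved). The function names `cityReferenced` and `cityEmbedded` are hypothetical and exist only for this illustration; ids are shown as plain strings rather than ObjectIds.

```javascript
// In-memory stand-ins for collections; plain arrays, not the MongoDB API.
const patronId = "52d7173817d8bbd9564613cd";

// Referenced model: the address lives in its own collection.
const patrons = [{ _id: patronId, name: "Joe Schmoe" }];
const addresses = [
  { patron_id: patronId, street: "100 Five Bridge Rd", city: "Clinton", state: "NC", zip: 28723 },
];

// Embedded model: the address is a sub-document of the patron.
const patronsEmbedded = [
  {
    _id: patronId,
    name: "Joe Schmoe",
    address: { street: "100 Five Bridge Rd", city: "Clinton", state: "NC", zip: 28723 },
  },
];

// Two lookups: first the patron, then the address that points back at it.
function cityReferenced(name) {
  const patron = patrons.find((p) => p.name === name);
  const address = addresses.find((a) => a.patron_id === patron._id);
  return address.city;
}

// One lookup: the address arrives with the patron.
function cityEmbedded(name) {
  const patron = patronsEmbedded.find((p) => p.name === name);
  return patron.address.city;
}

console.log(cityReferenced("Joe Schmoe"), cityEmbedded("Joe Schmoe"));
```

Both functions return the same answer; the embedded shape simply gets there in a single read, which is the "focus on data use" idea from earlier in the talk.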

One To Many Relations

book = {
  _id: ObjectId(...),
  title: "MongoDB",
  authors: ['Kristina Chodorow', 'Mike Dirolf'],
  published_date: ISODate('2010-09-24'),
  pages: 216,
  language: 'English',
  publisher: {
    name: "O'Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Four ways to model one-to-many (there are more)

• Embed the publisher

• Use publisher as the "foreign key"

• Use book as the "foreign key"

• Hybrid

publisher = {
  _id: "oreilly",
  name: "O'Reilly Media",
  founded: "1980",
  location: "CA"
}

book = {
  _id: ObjectId(...),
  title: "MongoDB",
  authors: ['Kristina Chodorow', 'Mike Dirolf'],
  published_date: ISODate('2010-09-24'),
  pages: 216,
  language: 'English',
  publisher_id: 'oreilly'
}

publisher = {
  name: "O'Reilly Media",
  founded: "1980",
  location: "CA",
  books: [ ObjectId(...), ... ]
}

book = {
  _id: ObjectId(...),
  title: "MongoDB",
  authors: ['Kristina Chodorow', 'Mike Dirolf'],
  published_date: ISODate('2010-09-24'),
  pages: 216,
  language: "English"
}
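The two reference-based models above answer the same questions with different amounts of work. The sketch below compares them in plain JavaScript with arrays standing in for collections. The ids `"b1"` and `"b2"`, the second book, and all function names are made up for illustration; nothing here is the MongoDB API.

```javascript
// Model A: each book carries a reference to its publisher.
const publishersA = [{ _id: "oreilly", name: "O'Reilly Media" }];
const booksA = [
  { _id: "b1", title: "MongoDB", publisher_id: "oreilly" },
  { _id: "b2", title: "Postgres", publisher_id: "oreilly" },
];

// Model B: the publisher carries an array of book ids.
const publishersB = [{ _id: "oreilly", name: "O'Reilly Media", books: ["b1", "b2"] }];
const booksB = [
  { _id: "b1", title: "MongoDB" },
  { _id: "b2", title: "Postgres" },
];

// Model A, publisher -> books: scan books for the matching reference.
function titlesByPublisherA(publisherId) {
  return booksA.filter((b) => b.publisher_id === publisherId).map((b) => b.title);
}

// Model B, publisher -> books: read the id array, then fetch each book.
function titlesByPublisherB(publisherId) {
  const publisher = publishersB.find((p) => p._id === publisherId);
  return publisher.books.map((id) => booksB.find((b) => b._id === id).title);
}

// Model A, book -> publisher: follow the reference stored on the book.
function publisherNameA(bookId) {
  const book = booksA.find((b) => b._id === bookId);
  return publishersA.find((p) => p._id === book.publisher_id).name;
}

// Model B, book -> publisher: search publishers whose array contains the id.
function publisherNameB(bookId) {
  return publishersB.find((p) => p.books.includes(bookId)).name;
}

console.log(titlesByPublisherA("oreilly"), titlesByPublisherB("oreilly"));
console.log(publisherNameA("b1"), publisherNameB("b1"));
```

Whichever side holds the reference makes one direction of the relationship cheap and the other a search, so the choice depends on which question the application asks most often.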

Hybrid Models

• We store the foreign key

• and the info relevant to the relation

patron = {
  _id: ObjectId("52d7173817d8bbd9564613cd"),
  name: "Joe Bookreader",
  address: {...},
  join_date: ISODate("2011-10-15"),
  books: [
    {_id: ObjectId(...), title: "MongoDB", author: "Kristina C.", ...},
    {_id: ObjectId(...), title: "Postgres", author: "Randall H.", ...}
  ]
}
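A rough sketch of how the hybrid shape behaves, in plain JavaScript with in-memory objects instead of a MongoDB server. The ids `"b1"` and `"b2"` and the helper names (`checkedOutTitles`, `fullBook`, `renameBook`) are hypothetical. The last helper illustrates a general cost of denormalizing: copied fields have to be kept in sync by the application.

```javascript
// Canonical book documents (in-memory stand-in for a books collection).
const books = [
  { _id: "b1", title: "MongoDB", author: "Kristina C.", pages: 216 },
  { _id: "b2", title: "Postgres", author: "Randall H." },
];

// Hybrid patron: each entry keeps the foreign key (_id) plus the fields
// most often needed when displaying the patron.
const patron = {
  _id: "52d7173817d8bbd9564613cd",
  name: "Joe Bookreader",
  books: [
    { _id: "b1", title: "MongoDB", author: "Kristina C." },
    { _id: "b2", title: "Postgres", author: "Randall H." },
  ],
};

// Common case: answered from the patron document alone.
function checkedOutTitles(p) {
  return p.books.map((b) => b.title);
}

// Rare case: follow the stored key to get fields that were not copied.
function fullBook(p, index) {
  const ref = p.books[index];
  return books.find((b) => b._id === ref._id);
}

// The cost: a change to a copied field must be applied to every copy.
function renameBook(bookId, newTitle, allPatrons) {
  books.find((b) => b._id === bookId).title = newTitle;
  for (const p of allPatrons) {
    for (const copy of p.books) {
      if (copy._id === bookId) copy.title = newTitle;
    }
  }
}

console.log(checkedOutTitles(patron), fullBook(patron, 0).pages);
```

This tends to pay off when the copied fields rarely change, as with a book's title and author.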

Where do you put the foreign key?

• Array of books inside of publisher

    • Makes sense when "many" means a handful of items

    • Useful when items have bounds on potential growth

• Reference to a single publisher on each book

    • Useful when items have unbounded growth
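The growth concern in the bullets above can be made visible with a small sketch in plain JavaScript (in-memory objects, not the MongoDB API). The helper names and the `"book" + i` ids are invented for illustration, and JSON length is used only as a crude proxy for document size. For context, MongoDB caps a single document at 16 MB, which is one reason unbounded arrays are risky.

```javascript
// Array model: the publisher document holds every book id.
const publisherWithArray = { _id: "oreilly", name: "O'Reilly Media", books: [] };

// Reference model: the publisher document never changes when books are added.
const publisherPlain = { _id: "oreilly", name: "O'Reilly Media" };
const books = [];

// Adding a book in the array model rewrites (and grows) the publisher.
function addBookArrayModel(publisher, bookId) {
  publisher.books.push(bookId);
}

// Adding a book in the reference model only inserts a new book document.
function addBookReferenceModel(bookCollection, bookId, publisherId) {
  bookCollection.push({ _id: bookId, publisher_id: publisherId });
}

for (let i = 0; i < 1000; i += 1) {
  addBookArrayModel(publisherWithArray, "book" + i);
  addBookReferenceModel(books, "book" + i, publisherPlain._id);
}

// Crude size proxy: serialized length of each publisher document.
const sizeWithArray = JSON.stringify(publisherWithArray).length;
const sizePlain = JSON.stringify(publisherPlain).length;
console.log(sizeWithArray, sizePlain);
```

With a handful of books the array model is harmless; as the count grows without bound, the publisher document keeps growing while the reference model spreads the growth across many small book documents.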

Other Things To Model

• Trees

• Queues

• Many-To-Many relationships

Thanks! @jrhunt
