Taking care about your schema in the MongoDB’s schemaless world
Alessandro Palumbo
[email protected] http://it.linkedin.com/in/alessandropalumbo/
http://www.byte-code.com
Except where otherwise noted, this work is licensed under: http://creativecommons.org/licenses/by/3.0/

Jul 14, 2015

MongoDB Milan
Transcript
Page 1: Taking care about your schema in the MongoDB’s schemaless world

Taking care about your schema in the MongoDB’s schemaless world

Alessandro Palumbo

[email protected] http://it.linkedin.com/in/alessandropalumbo/

http://www.byte-code.com

Page 2: Taking care about your schema in the MongoDB’s schemaless world

MongoDB

from humongous “huge; enormous”

NoSQL

Open source

Document-oriented

JSON-style documents

Page 3: Taking care about your schema in the MongoDB’s schemaless world

JSON-style documents

{
  "_id" : "6c85fa4c-fa64-44e2-89c9-e5eb7f306ed7",
  "code" : "CRS0001",
  "name" : "Test",
  "description" : "Test description",
  "active" : true,
  "scheduledDate" : {
    "from" : ISODate("2013-09-12T00:00:00.000Z"),
    "to" : ISODate("2013-10-31T00:00:00.000Z")
  },
  "version" : NumberLong(1)
}

Page 4: Taking care about your schema in the MongoDB’s schemaless world

DON’T BE RELATIONAL

No joins: we can embed

No full transactions: document-level transactions

No schema: is it really an issue?

Page 5: Taking care about your schema in the MongoDB’s schemaless world

DESIGN
  Design for query
  Embedded data vs references
  Dynamic schema vs static languages

FRIENDLY FIRE (aka RTFM)
  Avoid natural keys as identifiers

PERFORMANCE
  Preallocate fields?
  Tuning updates and inserts
  Document moving slows you

Page 6: Taking care about your schema in the MongoDB’s schemaless world

FRIENDLY FIRE

Page 7: Taking care about your schema in the MongoDB’s schemaless world

AVOID NATURAL KEYS AS IDENTIFIERS

Every collection has a default index on the _id field. If no _id is provided, the driver or mongod creates an _id field with an ObjectId value.

Add a unique index on the natural key instead of using it as the identifier; the application domain can sometimes evolve in unexpected ways.

Remember that if sharding is enabled, a unique index must have the shard key as its prefix.
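The advice on this slide can be sketched with a toy in-memory collection. This is not MongoDB driver code: `CourseCollection`, `rename_code` and `DuplicateKeyError` are hypothetical names invented for the illustration, and two plain dictionaries stand in for the default _id index and a unique index on the natural key `code`.

```python
import uuid


class DuplicateKeyError(Exception):
    """Raised when the unique constraint on the natural key would be violated."""


class CourseCollection:
    """Toy in-memory stand-in for a collection (hypothetical, not a driver API)."""

    def __init__(self):
        self._by_id = {}    # plays the role of the default _id index
        self._by_code = {}  # plays the role of a unique index on "code"

    def insert(self, code, name):
        if code in self._by_code:
            raise DuplicateKeyError(code)
        # Surrogate identifier: generated, never derived from business data
        doc = {"_id": str(uuid.uuid4()), "code": code, "name": name}
        self._by_id[doc["_id"]] = doc
        self._by_code[code] = doc["_id"]
        return doc["_id"]

    def rename_code(self, doc_id, new_code):
        """Change the natural key; the identifier stays the same."""
        if new_code in self._by_code:
            raise DuplicateKeyError(new_code)
        doc = self._by_id[doc_id]
        del self._by_code[doc["code"]]
        doc["code"] = new_code
        self._by_code[new_code] = doc_id

    def find_by_code(self, code):
        doc_id = self._by_code.get(code)
        return None if doc_id is None else self._by_id[doc_id]


courses = CourseCollection()
course_id = courses.insert("CRS0001", "Test")
# The business decides to change its course codes: only the natural key moves
courses.rename_code(course_id, "CRS-2015-0001")
```

Because other documents would refer to `course_id` rather than to the code, a change in the coding scheme touches one field instead of every reference.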

Page 8: Taking care about your schema in the MongoDB’s schemaless world

DESIGN

Page 9: Taking care about your schema in the MongoDB’s schemaless world

DESIGN FOR QUERY

Document design follows from the queries the application will run.

Reference or embed documents: “denormalized” is not always a bad word.

Your document design determines which operations are safe and which are not.

Page 10: Taking care about your schema in the MongoDB’s schemaless world

EMBEDDED DATA VS REFERENCES

Embedded data models allow applications to store related pieces of information in the same database record.

The maximum BSON document size is 16 megabytes, and embedding may lead to performance issues if used incorrectly.

Usually there is a “contains” relation between the embedding and the embedded object.
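A minimal sketch of an embedded model, using plain Python dictionaries rather than a driver. The `lessons` field and the helper names are assumptions made up for the example, and `approx_size` measures JSON text length, which is only a rough proxy for the real BSON size.

```python
import json

MAX_BSON_DOCUMENT_SIZE = 16 * 1024 * 1024  # the 16 MB limit, in bytes

course = {
    "_id": "6c85fa4c-fa64-44e2-89c9-e5eb7f306ed7",
    "code": "CRS0001",
    "name": "Test",
    # "contains" relation: the lessons live inside the course that owns them
    "lessons": [
        {"title": "Introduction", "minutes": 30},
        {"title": "Schema design", "minutes": 45},
    ],
}


def total_minutes(course_doc):
    """Everything needed is in the one record, so no second lookup is required."""
    return sum(lesson["minutes"] for lesson in course_doc["lessons"])


def approx_size(doc):
    """Rough size proxy based on JSON text length; real BSON sizes differ."""
    return len(json.dumps(doc).encode("utf-8"))


print(total_minutes(course))  # 75
print(approx_size(course) < MAX_BSON_DOCUMENT_SIZE)
```

An embedded array that can grow without bound is the case to watch: it pushes the document toward the size limit and makes every read of the parent heavier.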

Page 11: Taking care about your schema in the MongoDB’s schemaless world

EMBEDDED DATA VS REFERENCES

Normalized data models describe relationships using references between documents.

Referential integrity is not supported: a reference can point to an object that does not exist.

References provide more flexibility than embedding, but remember that client applications have to look up referenced objects with additional queries.
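The cost of references can be sketched with two dictionaries standing in for two collections. The collection contents, the `teacher_id` field and the `find_one` helper are hypothetical; the point is only that resolving a reference takes a second lookup, and that nothing prevents the reference from dangling.

```python
# Two toy "collections", keyed by _id (illustrative data only)
courses = {
    "c1": {"_id": "c1", "code": "CRS0001", "teacher_id": "t1"},
    "c2": {"_id": "c2", "code": "CRS0002", "teacher_id": "t9"},  # dangling reference
}
teachers = {
    "t1": {"_id": "t1", "name": "Ada"},
}


def find_one(collection, doc_id):
    """Stand-in for a single query by identifier."""
    return collection.get(doc_id)


def course_with_teacher(course_id):
    """Resolve a reference on the client side: one query per collection."""
    course = find_one(courses, course_id)               # query 1
    if course is None:
        return None
    teacher = find_one(teachers, course["teacher_id"])  # query 2
    # No referential integrity: the application must cope with a missing target
    return {"course": course, "teacher": teacher}


print(course_with_teacher("c1")["teacher"])
print(course_with_teacher("c2")["teacher"])  # None
```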

Page 12: Taking care about your schema in the MongoDB’s schemaless world

DYNAMIC SCHEMA VS STATIC LANGUAGES

Why use a dynamic schema if we are not using a dynamic programming language?

Inheritance is not only a matter of hierarchy; it can also be a matter of composition.

Composition is the key to introducing a dynamic schema in a static programming language.
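One possible reading of the composition idea, sketched with a typed Python dataclass standing in for a class in a statically typed language. The `Course` type, its `extras` component and the two conversion methods are assumptions for illustration, not something the slides spell out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Fields the static type knows about; everything else is kept in "extras"
_KNOWN_FIELDS = ("code", "name")


@dataclass
class Course:
    code: str
    name: str
    # Composed "open" part: holds fields the static type does not declare
    extras: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_document(cls, doc):
        extras = {k: v for k, v in doc.items() if k not in _KNOWN_FIELDS}
        return cls(code=doc["code"], name=doc["name"], extras=extras)

    def to_document(self):
        # Unknown fields survive a load/save round trip untouched
        return {"code": self.code, "name": self.name, **self.extras}


stored = {"code": "CRS0001", "name": "Test", "active": True}
course = Course.from_document(stored)
print(course.extras)
```

The typed fields keep compile-time style checks where the schema is stable, while the composed map absorbs fields that vary from document to document.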

Page 13: Taking care about your schema in the MongoDB’s schemaless world

PERFORMANCE

Page 14: Taking care about your schema in the MongoDB’s schemaless world

DOCUMENT MOVING SLOWS YOU

MongoDB allocates the space for a record taking a padding factor into account.

When an updated document does not fit in its record space, it is moved.

A dynamic schema is the main cause of document moving.
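A simplified model of the rule on this slide. The function names and the numbers are illustrative assumptions; the real storage engine chooses and adjusts the padding factor itself.

```python
import math


def allocate_record(doc_size, padding_factor):
    """Bytes reserved for a new record: document size scaled by the padding factor."""
    return math.ceil(doc_size * padding_factor)


def fits_in_place(record_size, new_doc_size):
    """True if the updated document still fits; False means it has to be moved."""
    return new_doc_size <= record_size


# Illustrative values only: a 1000-byte document with a padding factor of 1.5
record = allocate_record(doc_size=1000, padding_factor=1.5)
print(record)                       # 1500
print(fits_in_place(record, 1400))  # True: updated in place
print(fits_in_place(record, 1600))  # False: the document is moved
```

Adding fields to existing documents is what makes `new_doc_size` grow past the reserved space, which is why a schema that changes shape on update triggers moves.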

Page 15: Taking care about your schema in the MongoDB’s schemaless world

PREALLOCATE FIELDS?

Field preallocation can fix document moving issues in some use cases.

Default values must be used to preallocate; this must be handled in the application.

NULL is not a default value :-) as it has its own type.
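To see why the default value matters, here is a small size estimator that follows the BSON element layout (one type byte, the key as a NUL-terminated string, then the value). It covers only a handful of types and is meant as an estimate; the `hits` document of hourly counters and the helper names are hypothetical.

```python
def bson_value_size(value):
    """Payload size in bytes for a few BSON types (estimate, not an encoder)."""
    if value is None:
        return 0  # null has its own type and carries no payload
    if isinstance(value, bool):
        return 1
    if isinstance(value, int):
        return 4 if -2**31 <= value < 2**31 else 8  # int32 or int64
    if isinstance(value, float):
        return 8
    if isinstance(value, str):
        return 4 + len(value.encode("utf-8")) + 1  # length prefix + bytes + NUL
    if isinstance(value, dict):
        return bson_document_size(value)
    raise TypeError("type not covered by this sketch: %r" % type(value))


def bson_document_size(doc):
    size = 4 + 1  # int32 length prefix + trailing NUL
    for key, value in doc.items():
        size += 1 + len(key.encode("utf-8")) + 1 + bson_value_size(value)
    return size


def set_hour(doc, hour, count):
    """Return a copy of the document with one hourly counter updated."""
    updated = {"hits": dict(doc["hits"])}
    updated["hits"][str(hour)] = count
    return updated


preallocated = {"hits": {str(h): 0 for h in range(24)}}      # same-type defaults
null_defaults = {"hits": {str(h): None for h in range(24)}}  # null placeholders
empty = {"hits": {}}                                         # no preallocation

for name, doc in [("preallocated", preallocated),
                  ("null_defaults", null_defaults),
                  ("empty", empty)]:
    growth = bson_document_size(set_hour(doc, 13, 250)) - bson_document_size(doc)
    print(name, "grows by", growth, "bytes")
```

Only the document preallocated with values of the final type keeps its size after the update; the null placeholders still grow when a real value replaces them.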

Page 16: Taking care about your schema in the MongoDB’s schemaless world

TUNING UPDATES AND INSERTS

MongoDB stores BSON documents as a sequence of fields and values, not as a hash table.

Writing the first field of a document (or of a nested document) is considerably faster than writing the last.

An intra-document hierarchy can help with this issue.
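The effect of field position can be sketched by modelling a document as an ordered list of (key, value) pairs and counting key comparisons. This is a toy model with made-up field names, not a measurement of the server: it only shows why grouping fields into nested documents shortens the scan.

```python
def find_field(pairs, key):
    """Scan (key, value) pairs in order; return (value, comparisons made)."""
    for steps, (k, v) in enumerate(pairs, start=1):
        if k == key:
            return v, steps
    raise KeyError(key)


def find_nested(pairs, group, key):
    """Locate the group first, then scan only inside that nested document."""
    sub, outer_steps = find_field(pairs, group)
    value, inner_steps = find_field(sub, key)
    return value, outer_steps + inner_steps


# 100 fields in one flat document
flat = [("f%02d" % i, i) for i in range(100)]

# The same 100 fields grouped into 10 nested documents of 10 fields each
nested = [
    ("g%d" % g, [("f%02d" % (g * 10 + i), g * 10 + i) for i in range(10)])
    for g in range(10)
]

print(find_field(flat, "f00")[1])           # 1
print(find_field(flat, "f99")[1])           # 100
print(find_nested(nested, "g9", "f99")[1])  # 20
```

In the worst case the flat layout needs 100 comparisons while the two-level layout needs 20, which is the kind of saving the slide attributes to an intra-document hierarchy.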

Page 17: Taking care about your schema in the MongoDB’s schemaless world

Any questions?