Except where otherwise noted, this work is licensed under: http://creativecommons.org/licenses/by/3.0/ Application Design FOR MongoDB Alessandro Palumbo [email protected] http://it.linkedin.com/in/alessandropalumbo/ http://www.byte-code.com
Application Design for MongoDB

May 17, 2015

When you use MongoDB for the first time, the biggest risk is applying the same patterns and designs you used in the SQL world. If you do, you miss the real change that MongoDB requires: a change in the way of thinking.
Transcript
Page 1: Application Design for MongoDB

Application Design for MongoDB

Alessandro Palumbo [email protected]

http://it.linkedin.com/in/alessandropalumbo/ http://www.byte-code.com

Page 2: Application Design for MongoDB

MongoDB

from humongous “huge; enormous”

NoSQL

Open source

Document-oriented: JSON-style documents

Page 3: Application Design for MongoDB

JSON-style documents

{
  "_id" : "6c85fa4c-fa64-44e2-89c9-e5eb7f306ed7",
  "code" : "CRS0001",
  "name" : "Test",
  "description" : "Test description",
  "active" : true,
  "scheduledDate" : {
    "from" : ISODate("2013-09-12T00:00:00.000Z"),
    "to" : ISODate("2013-10-31T00:00:00.000Z")
  },
  "version" : NumberLong(1)
}

Page 4: Application Design for MongoDB

Don’t be relational

No joins: we can embed

No full transactions: document-level transactions

No schema: is it really an issue?

Page 5: Application Design for MongoDB

DESIGN
Design for query
Embedded data vs references
DBRefs vs manual reference
Dynamic schema vs static languages
Pure driver vs mapping frameworks
Be careful with dates
Split data on multiple collections

FRIENDLY FIRE (aka RTFM)
Write concern
Read preference
Atomic document operations
Avoid natural keys as identifiers

PERFORMANCE
Preallocate fields?
Be aware of the trees
Preprocess high-resolution data
Tuning updates and inserts
Document moving slows you

Page 6: Application Design for MongoDB

FRIENDLY FIRE

Page 7: Application Design for MongoDB

Atomic document operations

Operations on multiple documents are not atomic: there is no “all or nothing”.

Embedding or application-level transactions can be used to handle the issue.

Relational transactions are not totally safe.
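
A minimal sketch of the embedding approach, assuming a hypothetical order document that embeds its lines and keeps a running total. Because both fields live in the same document, one update can change them together. The function only builds the filter and update documents; with PyMongo they would be passed to `collection.update_one(filter, update)`. All field names here are illustrative, not taken from the slides.

```python
def add_order_line(order_id, sku, qty, unit_price):
    """Build the filter and update for one single-document write.

    The embedded line and the running total are changed by the same
    update, so they cannot be observed out of step with each other.
    """
    filter_doc = {"_id": order_id}
    update_doc = {
        # append the new line to the embedded array (hypothetical field)
        "$push": {"lines": {"sku": sku, "qty": qty, "unitPrice": unit_price}},
        # keep the denormalized total aligned with the lines
        "$inc": {"total": qty * unit_price},
    }
    return filter_doc, update_doc


order_filter, order_update = add_order_line("order-1", "SKU-42", 2, 10)
```

If the lines were stored in a separate collection, the same change would need two writes, and a failure between them would leave the total and the lines inconsistent.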

Page 8: Application Design for MongoDB

Write concern

“Describes the guarantee that MongoDB provides when reporting on the success of a write operation”

It is set by the client and can be set for each operation.

Levels: Errors Ignored, Unacknowledged, Acknowledged (*), Journaled, Replica Acknowledged (w > 1, majority, or custom using tags)

(*) default
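
As a rough sketch, the levels above can be expressed through the `w` and `journal` connection string options. The mapping and the level names used as dictionary keys are assumptions made for this example, and the legacy Errors Ignored level is left out. The helper only builds a URI string; it does not connect to anything.

```python
from urllib.parse import urlencode

# Assumed mapping from the slide's level names to connection string options.
WRITE_CONCERNS = {
    "unacknowledged": {"w": 0},
    "acknowledged": {"w": 1},
    "journaled": {"w": 1, "journal": "true"},
    "replica_acknowledged": {"w": 2},      # any value greater than 1
    "majority": {"w": "majority"},
}


def mongo_uri(host, level):
    """Return a connection string that sets the client-wide default."""
    return f"mongodb://{host}/?{urlencode(WRITE_CONCERNS[level])}"


journaled_uri = mongo_uri("localhost:27017", "journaled")
```

The URI sets the default for the whole client. Per-operation overrides depend on the driver; in PyMongo, for instance, `collection.with_options(write_concern=...)` returns a collection handle with a different write concern.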

Page 9: Application Design for MongoDB

Read preference

“It describes how MongoDB clients route read operations to members of a replica set”

Modes: primary (*), primaryPreferred, secondary, secondaryPreferred, nearest

(*) default

It is set by the client and can be set for each operation.
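
The routing rules behind the five modes can be illustrated with a small simulation. This is a simplified sketch, not how a driver is implemented: the member list, the `ping_ms` field and the `pick_member` function are hypothetical, and real drivers also consider latency windows, tags and staleness.

```python
def pick_member(members, mode):
    """Choose the host a read would go to under the given mode.

    members: list of dicts with 'host', 'role' ('primary' or 'secondary')
    and 'ping_ms' (hypothetical measured latency).
    """
    by_ping = lambda m: m["ping_ms"]
    primaries = [m for m in members if m["role"] == "primary"]
    secondaries = sorted((m for m in members if m["role"] == "secondary"), key=by_ping)
    candidates = {
        "primary": primaries,
        "primaryPreferred": primaries + secondaries,
        "secondary": secondaries,
        "secondaryPreferred": secondaries + primaries,
        "nearest": sorted(members, key=by_ping),
    }[mode]
    if not candidates:
        raise LookupError(f"no member available for mode {mode!r}")
    return candidates[0]["host"]


MEMBERS = [
    {"host": "a", "role": "primary", "ping_ms": 20},
    {"host": "b", "role": "secondary", "ping_ms": 5},
    {"host": "c", "role": "secondary", "ping_ms": 9},
]
```

With these members, `primary` reads go to `a`, while `secondary` and `nearest` reads go to `b`. If the primary is removed from the list, `primaryPreferred` falls back to `b` and `primary` fails.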

Page 10: Application Design for MongoDB

Avoid natural keys as identifiers

All collections have an index on the _id field, which exists by default. If _id is not provided, the driver or mongod will create an _id field with an ObjectId value.

Add a unique index on the natural key instead: sometimes the application domain evolves in unexpected ways.

Remember that, if sharding is enabled, a unique index must contain the shard key as a prefix.
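
A small sketch of the idea, assuming a course document like the one shown earlier in the deck: the `_id` is a generated UUID string and the natural key (`code`) is an ordinary field protected by a unique index. The helper name and the index description format are made up for this example; with PyMongo the index would be created with `collection.create_index([("code", 1)], unique=True)`.

```python
import uuid


def new_course(code, name):
    """Create a document with a surrogate _id; the natural key stays a field."""
    return {"_id": str(uuid.uuid4()), "code": code, "name": name}


# Description of the unique index on the natural key (illustrative format).
NATURAL_KEY_INDEX = {"keys": [("code", 1)], "unique": True}

course = new_course("CRS0001", "Test")
```

If the coding scheme changes later, only the `code` field and its index are affected; references that point to `_id` stay valid.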

Page 11: Application Design for MongoDB

DESIGN

Page 12: Application Design for MongoDB

Design for query

Document design is a function of the queries that will exist in the application.

Reference or embed documents: “denormalized” is not always a bad word.

Your document design will determine which operations are safe and which are not.

Page 13: Application Design for MongoDB

Embedded data vs references

Embedded data models allow applications to store related pieces of information in the same database record.

Usually there is a “contains” relation between the embedding and the embedded object.

The maximum BSON document size is 16 megabytes, and embedding may lead to performance issues if it is not used correctly.
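
For illustration, here is a hypothetical order that "contains" its lines. Everything needed to answer a question about the order is in one document, so one read is enough. The size check is only a rough proxy: it measures the JSON encoding, not the actual BSON size.

```python
import json

MAX_BSON_BYTES = 16 * 1024 * 1024  # the 16 MB limit quoted above

# Hypothetical order document with its lines embedded.
order = {
    "_id": "order-1",
    "customer": "ACME",
    "lines": [
        {"sku": "SKU-1", "qty": 2},
        {"sku": "SKU-2", "qty": 1},
    ],
}


def approx_size(doc):
    """Rough size in bytes; JSON length only approximates the BSON size."""
    return len(json.dumps(doc).encode("utf-8"))


def total_quantity(doc):
    """No second lookup is needed: the lines travel with the order."""
    return sum(line["qty"] for line in doc["lines"])
```

An embedded array that grows without bound (for example, every event ever recorded for a user) is the typical case where the size limit and the performance issues start to matter.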

Page 14: Application Design for MongoDB

Embedded data vs references

Normalized data models describe relationships using references between documents.

Referential integrity is not supported: a reference could point to a nonexistent object.

References provide more flexibility than embedding, but remember that client-side applications will have to look up referenced objects with multiple queries.
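
The client-side lookup can be sketched with two in-memory dictionaries standing in for collections. The collection contents, the `customerId` field and both helper functions are hypothetical; `find_one` here is a local stand-in, not a driver call.

```python
# In-memory stand-ins for two collections, keyed by _id.
customers = {"cust-1": {"_id": "cust-1", "name": "ACME"}}
orders = {
    "order-1": {"_id": "order-1", "customerId": "cust-1"},
    "order-2": {"_id": "order-2", "customerId": "cust-404"},  # dangling reference
}


def find_one(collection, doc_id):
    """Stand-in for a query by _id; returns None when nothing matches."""
    return collection.get(doc_id)


def order_with_customer(order_id):
    """Resolve the reference in the application: two lookups, no join."""
    order = find_one(orders, order_id)                    # first query
    if order is None:
        return None
    customer = find_one(customers, order["customerId"])   # second query
    return {**order, "customer": customer}
```

Nothing stops `order-2` from pointing to a customer that does not exist, so the application has to handle a `None` customer itself.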

Page 15: Application Design for MongoDB

DBRefs vs manual reference

DBRefs are a convention for representing a reference to a document: a DBRef holds the collection name, the id and, optionally, the database name.

Manual references are just fields that hold the id of the related document, without the collection name or the database name.

Manual references are suitable for most use cases.
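
The two shapes side by side, as plain dictionaries. The `$ref`, `$id` and `$db` field names follow the DBRef convention; the enrollment and course documents are hypothetical.

```python
def dbref(collection, doc_id, database=None):
    """Build a DBRef-style sub-document: $ref, $id and optionally $db."""
    ref = {"$ref": collection, "$id": doc_id}
    if database is not None:
        ref["$db"] = database
    return ref


# Manual reference: just the id; the application knows the target collection.
enrollment_manual = {"_id": "enr-1", "courseId": "course-1"}

# DBRef: the reference also carries the collection (and optionally the db).
enrollment_dbref = {"_id": "enr-1", "course": dbref("courses", "course-1")}
```

The manual form is smaller and easier to query; the DBRef form is mainly useful when one field may point to documents in different collections.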

Page 16: Application Design for MongoDB

Be careful with dates

BSON Date is a 64-bit signed integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). Negative values represent dates before 1970. The official BSON specification refers to the BSON Date type as the UTC datetime.

Always use BSON Date when a value represents an instant in time, or you will not be able to use date operators on those fields.
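
A short sketch of the representation described above, using only the standard library. The helper names are made up for this example; a driver performs this conversion for you when you store a datetime value. The last lines show why dates kept as text are a problem: strings compare character by character, not as instants.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_bson_millis(dt):
    """Milliseconds since the Unix epoch; dt must be timezone-aware."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_bson_millis(ms):
    """Inverse conversion; negative values give dates before 1970."""
    return EPOCH + timedelta(milliseconds=ms)


start = datetime(2013, 9, 12, tzinfo=timezone.utc)
before_epoch = datetime(1969, 12, 31, tzinfo=timezone.utc)

# Day/month/year strings sort as text: the 2013 date comes first here
# even though it is the later instant.
as_text = sorted(["31/10/2012", "12/09/2013"])
```

A range query such as "everything scheduled after a given day" works on the integer representation and gives wrong answers on strings like these.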

Page 17: Application Design for MongoDB

Split data on multiple collections

Split data across multiple collections to partition it easily (a.k.a. multitenancy).

Use collections as namespaces for your data.

Remember that once data is partitioned, it will be harder to aggregate if needed.
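
One possible naming scheme, sketched with a dictionary that stands in for a database. The tenant names, the `tenant.entity` convention and the helper functions are assumptions for this example.

```python
def collection_name(tenant, entity):
    """Use the tenant as a namespace prefix, e.g. 'acme.orders'."""
    return f"{tenant}.{entity}"


# In-memory stand-in for a database: collection name -> documents.
database = {
    collection_name("acme", "orders"): [{"total": 10}, {"total": 5}],
    collection_name("globex", "orders"): [{"total": 7}],
}


def tenant_total(tenant):
    """A per-tenant query touches exactly one collection."""
    return sum(o["total"] for o in database[collection_name(tenant, "orders")])


def global_total(tenants):
    """A cross-tenant figure needs one query per collection, merged in code."""
    return sum(tenant_total(t) for t in tenants)
```

The per-tenant query is simple and isolated; the cross-tenant total is where the extra aggregation cost shows up.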

Page 18: Application Design for MongoDB

Dynamic schema vs static languages

Why use a dynamic schema if we are not using a dynamic programming language?

Inheritance is not only a matter of hierarchy; it can also be a matter of composition.

Composition is the key to introducing a dynamic schema in a static programming language.
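
One way to read the composition idea, sketched with a typed Python dataclass standing in for a class in a static language: the fields the application always relies on are declared, and everything else is carried in a composed map of extra attributes. The class and field names are hypothetical, and this is only one interpretation of the slide.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Course:
    # Fixed, typed part: what the application always relies on.
    code: str
    name: str
    # Dynamic part, added by composition rather than by subclassing.
    extras: Dict[str, Any] = field(default_factory=dict)

    def to_document(self) -> Dict[str, Any]:
        """Flatten typed fields and extras into one document."""
        doc = dict(self.extras)
        doc.update({"code": self.code, "name": self.name})
        return doc

    @classmethod
    def from_document(cls, doc: Dict[str, Any]) -> "Course":
        """Unknown fields are kept in extras instead of being dropped."""
        known = {"code", "name"}
        extras = {k: v for k, v in doc.items() if k not in known}
        return cls(code=doc["code"], name=doc["name"], extras=extras)


loaded = Course.from_document({"code": "CRS0001", "name": "Test", "room": "B12"})
```

Documents that gain new fields over time still load into the same class, and the extra fields survive a read-modify-write cycle.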

Page 19: Application Design for MongoDB

Pure driver vs mapping frameworks

Using the MongoDB driver directly gives you great power, but forces you to write a lot of boilerplate code.

Mapping frameworks help you write less code, but you give up control over some aspects of persistence.

Why not take the best from both?

Page 20: Application Design for MongoDB

PERFORMANCE

Page 21: Application Design for MongoDB

Be aware of the trees

Indexes in MongoDB are defined at the collection level and can be on any field or sub-field of the document.

Indexes are built as B-trees and can be of different types: Single Field, Compound, Multikey, Geospatial, Text (beta), Hashed.

They can also be unique and sparse.
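
The index types above differ mostly in their key specification. The sketch below lists one key specification per type in the list-of-pairs form that PyMongo's `collection.create_index(keys, **options)` accepts. Apart from the fields of the sample course document, the field names (`tags`, `location`) are hypothetical, and the naming helper only mirrors the usual default naming convention.

```python
# One key specification per index type (field names partly hypothetical).
INDEX_SPECS = {
    "single_field": [("code", 1)],
    "compound": [("active", 1), ("scheduledDate.from", -1)],  # sub-field via dot path
    "multikey": [("tags", 1)],             # same syntax, on an array field
    "geospatial": [("location", "2dsphere")],
    "text": [("description", "text")],
    "hashed": [("_id", "hashed")],
}

# Options that can be combined with a key specification.
INDEX_OPTIONS = {"unique": True, "sparse": True}


def default_index_name(keys):
    """Join field and direction/type with underscores, e.g. 'code_1'."""
    return "_".join(f"{name}_{kind}" for name, kind in keys)
```

For example, `default_index_name(INDEX_SPECS["compound"])` gives `active_1_scheduledDate.from_-1`.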

Page 22: Application Design for MongoDB

Document moving slows you down

MongoDB allocates the space for a record taking a padding factor into account.

When an updated document does not fit in its record space, it is moved.

A dynamic schema is the first cause of document moving.

Page 23: Application Design for MongoDB

Preallocate fields?

Field preallocation can fix document-moving issues in some use cases.

Default values must be used to preallocate, and this must be handled in the application.

null is not a default value :-) as it has its own type.
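
A sketch of preallocation for a hypothetical per-page, per-day hit counter. The document is created with all 24 hourly counters already present and set to 0, a default of the same type as the values that will replace it, so later increments do not add fields. The document layout and helper names are assumptions for this example.

```python
def preallocated_hourly_stats(page, day):
    """Create the full document up front, with a zero for every hour."""
    return {
        "_id": f"{page}:{day}",
        "page": page,
        "day": day,
        # 0, not None: the default has the same type as the real values
        "hits": {str(hour): 0 for hour in range(24)},
    }


def hit_update(hour):
    """Update document that increments one preallocated counter."""
    return {"$inc": {f"hits.{hour}": 1}}


stats = preallocated_hourly_stats("/home", "2013-09-12")
```

The application has to insert the preallocated document before the first increment arrives, for example when the day starts.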

Page 24: Application Design for MongoDB

Preprocess high-resolution data

MongoDB lets you store your data at its maximum resolution.

Map-reduce and aggregation are OK, but you can also preprocess the data and keep aggregated values that you can use for your queries.

MongoDB rocks for business intelligence.
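
Preprocessing can be as simple as rolling raw events up into hourly buckets as they arrive. In this sketch the events, the bucket key format and the helper names are hypothetical; the resulting pairs are filter and update documents that could be applied with PyMongo's `collection.update_one(filter, update, upsert=True)`.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw events at full resolution.
events = [
    datetime(2013, 9, 12, 10, 5),
    datetime(2013, 9, 12, 10, 40),
    datetime(2013, 9, 12, 11, 1),
]


def hourly_counts(timestamps):
    """Count events per hour bucket, e.g. '2013-09-12T10'."""
    return Counter(ts.strftime("%Y-%m-%dT%H") for ts in timestamps)


def rollup_updates(timestamps):
    """One (filter, update) pair per bucket, in bucket order."""
    counts = hourly_counts(timestamps)
    return [({"_id": bucket}, {"$inc": {"count": n}})
            for bucket, n in sorted(counts.items())]


updates = rollup_updates(events)
```

Queries for hourly figures then read the small pre-aggregated documents, while the raw events remain available for anything that needs full resolution.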

Page 25: Application Design for MongoDB

Tuning updates and inserts

MongoDB stores BSON documents as a sequence of fields and values, not as a hash table.

Writing the first field of a document (or of a nested document) is considerably faster than writing the last.

An intra-document hierarchy can help to handle the issue.
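
A back-of-the-envelope model of why the hierarchy helps, assuming a hypothetical document with one counter per minute of the day and fields stored in numeric order. In a flat layout, reaching the last minute means skipping every earlier field; with hours nested above minutes, only the earlier hours and the earlier minutes of that hour are skipped. The functions count skipped fields in this simplified model; they do not measure a real server.

```python
def fields_skipped_flat(minute_of_day):
    """Flat layout: fields '0'..'1439' at one level."""
    return minute_of_day


def fields_skipped_nested(minute_of_day):
    """Nested layout: 24 hour sub-documents with 60 minute fields each."""
    hour, minute = divmod(minute_of_day, 60)
    return hour + minute


def nested_path(minute_of_day):
    """Dotted path of the counter in the nested layout (hypothetical field)."""
    hour, minute = divmod(minute_of_day, 60)
    return f"minutes.{hour}.{minute}"


last_minute = 24 * 60 - 1
```

For the last minute of the day the flat layout skips 1439 fields, while the nested layout skips 23 + 59 = 82.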

Page 26: Application Design for MongoDB

Any questions?