MongoDB Introduction and Internal by Shridhar Joshi
Jan 15, 2015
What is MongoDB?
• Open source, scalable, high-performance, document-oriented NoSQL Key-Value based database.
• Features:
  • JSON-style, document-oriented, schema-less storage
  • B-tree indexes supported on any attribute
  • Log-based replication for master/slave and replica sets
  • Auto-sharding architecture (via horizontal partitioning) that scales to thousands of nodes
  • NoSQL-style queries
  • Surprising update behaviors
  • Map/Reduce support
  • GridFS specification for storing large files
  • Developed by 10gen, with commercial support
Well/Less Well Suited
Source: http://www.mongodb.org/display/DOCS/Use+Cases
Basic concepts in MongoDB
MongoDB (NoSQL)     Relational DBMS
Database            Database
Collection          Relation
Document            Tuple
Field               Column
Index               Index
Cursor              Cursor
MongoDB: Databases* contain Collections*, which contain Documents* and Indexes*; Documents contain Fields*
Relational DBMS: Databases* contain Relations*, which contain Columns* and Indexes*
* means 0 or more objects
Each document has its own fields and makes MongoDB schema-less.
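A minimal sketch of what schema-less means in practice, using a plain JavaScript array as a stand-in for a collection (an assumption for illustration; this is not MongoDB itself). The two documents mirror the ones inserted in the CRUD demo and share only the name field.

```javascript
// Toy in-memory "collection": a plain array of documents (not MongoDB itself).
const t = [
  { name: "bob", age: 30 },
  { name: "alice", gender: "female" },
];

// Each document carries its own field names; nothing forces them to match.
const fieldSets = t.map((doc) => Object.keys(doc).sort());

// A filter on a field simply skips documents that lack that field.
const withAge = t.filter((doc) => "age" in doc);

console.log(fieldSets);
console.log(withAge.map((doc) => doc.name));
```

The point of the sketch is only that field names live in each document rather than in a table definition.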
CRUD Demo time
> show dbs                                                 // view existing databases
> use test                                                 // switch to database "test"
> db.t.insert({name:'bob',age:30})                         // insert 30-year-old bob
> db.t.insert({name:'alice',gender:'female'})              // insert alice, a woman
> db.t.find()                                              // list all documents in collection t
> db.t.find({name:'bob'},{age:1})                          // find bob, returning only the age field (plus _id)
> db.t.find().limit(1).skip(1)                             // return the second document
> db.t.find().sort({name:1})                               // sort the results by name, ascending
> db.t.find({$or:[{name:'bob'},{name:'tom'}]})             // find bob's or tom's documents
> db.t.update({name:'bob'},{$set:{age:31}},false,true)     // set the age of every bob to 31 (upsert=false, multi=true)
> db.stats()                                               // database statistics
> db.getCollectionNames()                                  // collections in this database
> db.t.ensureIndex({name:1})                               // create an index on name
> db.people.find({name:"bob"}).explain()                   // show the query plan
Query Optimization
db.people.find({x:10,y:"foo"})
Candidate plans:
• Index on x: Index Scan
• Index on y: Index Scan
• Collection people: DiskLocation Scan
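The three candidate plans can be compared with a toy model. This is a sketch under stated assumptions, not MongoDB's optimizer: the collection is a plain array, buildIndex is a hypothetical helper, and ranking plans by the number of documents examined is a criterion chosen here for illustration.

```javascript
// Toy model of the three candidate plans for find({x: 10, y: "foo"}).
// Illustration only; this is not MongoDB's optimizer code.
const people = [];
for (let i = 0; i < 1000; i++) {
  people.push({ x: i % 100, y: i % 2 === 0 ? "foo" : "bar" });
}

// Hypothetical helper: an "index" maps a field value to document positions.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, pos) => {
    if (!index.has(doc[field])) index.set(doc[field], []);
    index.get(doc[field]).push(pos);
  });
  return index;
}

const indexOnX = buildIndex(people, "x");
const indexOnY = buildIndex(people, "y");
const matches = (doc) => doc.x === 10 && doc.y === "foo";

// Each plan reports how many documents it had to examine.
function indexPlan(name, index, value) {
  const positions = index.get(value) || [];
  const results = positions.map((p) => people[p]).filter(matches);
  return { name, examined: positions.length, results };
}
function fullScanPlan() {
  return { name: "full scan", examined: people.length, results: people.filter(matches) };
}

const plans = [
  indexPlan("index on x", indexOnX, 10),
  indexPlan("index on y", indexOnY, "foo"),
  fullScanPlan(),
];

// Assumed criterion: prefer the plan that examines the fewest documents.
const best = plans.reduce((a, b) => (b.examined < a.examined ? b : a));
console.log(best.name, best.examined, best.results.length);
```

All three plans return the same documents; they differ only in how much work they do to find them.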
MongoDB Architecture
Source: mongoDB Replication and Replica Set by Dwight Merriman 10gen
MongoDB Sharding
MongoDB uses two key operations to facilitate sharding: split and migrate. Split divides a chunk into two ranges; it is done to ensure that no single chunk is unusually large. Migrate moves a chunk (the data associated with a key range) to another shard. This is done as needed to rebalance.
Split is an inexpensive metadata operation, while migrate is expensive as large amounts of data may be moving server to server. Both splits and migrates are performed automatically.
MongoDB has a subsystem called the balancer, which monitors shard loads and moves chunks around if it finds an imbalance.
If you add a new shard to the system, some chunks will eventually be moved to that shard to spread out the load.
A recently split chunk may be moved immediately to a new shard if the system predicts that future insertions will benefit from that move.
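Split and migrate can be sketched with a toy model. Everything below is an assumption made for illustration: the split threshold, the median split point, and the rule that the balancer evens out chunk counts are stand-ins, not MongoDB's actual policy.

```javascript
// Toy model of chunks and a balancer; numbers and thresholds are made up.
const MAX_KEYS_PER_CHUNK = 4; // hypothetical split threshold

// A chunk owns the key range [min, max) and remembers its keys.
function makeChunk(min, max, keys) {
  return { min, max, keys };
}

// Split: cut one chunk into two ranges at its median key.
// No data changes shard here, which is why a split is cheap.
function split(chunk) {
  const sorted = [...chunk.keys].sort((a, b) => a - b);
  const mid = sorted[Math.floor(sorted.length / 2)];
  return [
    makeChunk(chunk.min, mid, sorted.filter((k) => k < mid)),
    makeChunk(mid, chunk.max, sorted.filter((k) => k >= mid)),
  ];
}

// Migrate: move one chunk from the fullest shard to the emptiest one.
function balanceOnce(shards) {
  const names = Object.keys(shards);
  const byCount = (a, b) => shards[a].length - shards[b].length;
  const ordered = [...names].sort(byCount);
  const emptiest = ordered[0];
  const fullest = ordered[ordered.length - 1];
  if (shards[fullest].length - shards[emptiest].length < 2) return false;
  shards[emptiest].push(shards[fullest].pop());
  return true;
}

// One shard holds a single oversized chunk; a second shard was just added.
const shards = {
  shardA: [makeChunk(0, 100, [5, 12, 33, 47, 58, 61, 79, 90])],
  shardB: [],
};

// Split every chunk that is too large, repeating until none is.
let didSplit = true;
while (didSplit) {
  didSplit = false;
  for (const name of Object.keys(shards)) {
    shards[name] = shards[name].flatMap((c) => {
      if (c.keys.length <= MAX_KEYS_PER_CHUNK) return [c];
      didSplit = true;
      return split(c);
    });
  }
}

// Rebalance until chunk counts differ by at most one.
while (balanceOnce(shards)) {}

console.log(shards.shardA.length, shards.shardB.length);
```

In this run the oversized chunk is split once at key 58, and the upper half moves to the newly added shard.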
MongoDB Sharding
Pull mode
MongoDB Sharding: Briefly
Sequence for moving a chunk FROM: C TO: N
FROM (C):
1. Make sure my view is complete and lock
2. Get the documents' DiskLoc for sharding
3. Trigger N to start sharding in pull mode
TO (N):
1. Copy index definitions from C
2. Remove existing data in [min~max]
3. Clone the data in [min~max] from C
4. Ask C to replicate the changes
Commit:
• C asks N to commit
• N commits
MongoDB Sharding: In Detail (FROM and TO)
Note: Documents on FROM can still be updated or deleted during sharding; TO catches up in step 4.
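The catch-up behavior in the note can be sketched as a toy pull-mode migration. All names below (C, N, changeLog, writeToC) are hypothetical, the shards are plain in-memory maps, and copying index definitions is left out; real shards exchange this data over the network.

```javascript
// Toy model of a pull-mode chunk migration from shard C (FROM) to shard N (TO).
// All names here are hypothetical; index copying is omitted.
const range = { min: 0, max: 100 };
const inRange = (key) => key >= range.min && key < range.max;

const C = { docs: new Map([[10, "a"], [20, "b"], [150, "z"]]), changeLog: [] };
const N = { docs: new Map([[30, "stale"]]) };

// Writes to C during the migration are applied and also logged for N.
function writeToC(op) {
  if (op.type === "delete") C.docs.delete(op.key);
  else C.docs.set(op.key, op.value);
  if (inRange(op.key)) C.changeLog.push(op);
}

// TO: remove any existing data in [min, max).
for (const key of [...N.docs.keys()]) if (inRange(key)) N.docs.delete(key);

// TO: clone the data in [min, max) from C.
for (const [key, value] of C.docs) if (inRange(key)) N.docs.set(key, value);

// Meanwhile, clients keep updating and deleting on C.
writeToC({ type: "update", key: 10, value: "a2" });
writeToC({ type: "delete", key: 20 });
writeToC({ type: "insert", key: 40, value: "c" });

// TO: pull the logged changes from C and replay them to catch up.
for (const op of C.changeLog.splice(0)) {
  if (op.type === "delete") N.docs.delete(op.key);
  else N.docs.set(op.key, op.value);
}

// Commit: N now owns the range, so C drops its copy.
for (const key of [...C.docs.keys()]) if (inRange(key)) C.docs.delete(key);

console.log([...N.docs.entries()], [...C.docs.entries()]);
```

After the replay, N reflects the writes that reached C while the clone was in progress, and C keeps only the document outside the migrated range.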
Replication and Sharding
Source: http://www.mongodb.org/display/DOCS/Simple+Initial+Sharding+Architecture
MongoDB Replication: Pull mode
The slave continuously pulls the oplog from the master.
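A minimal sketch of pull-mode replication, assuming an in-memory array as the oplog and a simple position counter on the slave. The entry format and the names writeToMaster and pullOnce are made up for illustration; they are not MongoDB's real oplog format or API.

```javascript
// Toy model of pull-mode replication; not the real oplog format.
const master = { data: {}, oplog: [] };
const slave = { data: {}, lastApplied: 0 }; // position reached in the master's oplog

// The master applies a write and appends it to its oplog.
function writeToMaster(key, value) {
  master.data[key] = value;
  master.oplog.push({ key, value });
}

// The slave asks for every entry after its own position and replays it.
function pullOnce() {
  const pending = master.oplog.slice(slave.lastApplied);
  for (const entry of pending) slave.data[entry.key] = entry.value;
  slave.lastApplied += pending.length;
  return pending.length;
}

writeToMaster("bob", 30);
writeToMaster("alice", 25);
const firstPull = pullOnce(); // picks up both writes
writeToMaster("bob", 31);
const secondPull = pullOnce(); // picks up only the new write
const thirdPull = pullOnce(); // nothing new to apply

console.log(firstPull, secondPull, thirdPull, slave.data);
```

Because the slave tracks its own position, the master does not need to know which slaves exist or how far behind each one is.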
Question
References:
1. Source code digest: http://www.cnblogs.com/daizhj/category/260889.html
2. Books: http://www.mongodb.org/display/DOCS/Books
3. MongoDB official website: http://www.mongodb.com/