MongoDB Indexing Constraints & Creative Schemas Chris Winslett [email protected] Thursday, June 27, 13

May 12, 2015

Page 1: MongoDB Indexing Constraints and Creative Schemas

MongoDB Indexing Constraints & Creative Schemas

Chris Winslett [email protected]

Thursday, June 27, 13

Page 2: MongoDB Indexing Constraints and Creative Schemas

My Background

• For the past year, I’ve looked at MongoDB logs at least once every day.

• We routinely answer the question “how can I improve performance?”

Page 3: MongoDB Indexing Constraints and Creative Schemas

Who’s this talk for?

• New to MongoDB

• Seeing some slow operations, and need help debugging

• Running database operations on a sizeable deploy

• I have a MongoDB deployment, and I’ve hit a performance wall

Page 4: MongoDB Indexing Constraints and Creative Schemas

What should you learn?

Know where to look on a running MongoDB to uncover slowness, and discuss solutions.

MongoDB has performance “patterns”.

How to think about improving performance.

And . . .

Page 5: MongoDB Indexing Constraints and Creative Schemas

Schema Design

Design with the end in mind.

Page 6: MongoDB Indexing Constraints and Creative Schemas

MongoDB Indexing Constraints

• One index per query *

• One range operator per query ($)

• Range operator must be last field in index

• Using RAM well

* except $or, but the sin with $or is appending a sort to the query.
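
The "range operator last" rule can be sketched as a small helper that orders index keys for a query. This is a rough illustration only: `suggestIndexKeys` and the `RANGE_OPERATORS` list are hypothetical names, not part of MongoDB, and treating `$in` as a range operator is an assumption based on how this talk groups multi-value operators.

```javascript
// Hypothetical helper (not a MongoDB API): put equality fields first and
// the single range field last, following the constraints listed above.
// Assumption: these operators count as "range" (multi-value) operators.
const RANGE_OPERATORS = ["$gt", "$gte", "$lt", "$lte", "$in", "$nin", "$ne"];

function isRangeCondition(condition) {
  if (condition === null || typeof condition !== "object" || Array.isArray(condition)) {
    return false;
  }
  return Object.keys(condition).some((op) => RANGE_OPERATORS.includes(op));
}

function suggestIndexKeys(query) {
  const equality = [];
  const range = [];
  for (const [field, condition] of Object.entries(query)) {
    (isRangeCondition(condition) ? range : equality).push(field);
  }
  if (range.length > 1) {
    // The talk's rule of thumb: one range operator per query.
    throw new Error("more than one range field: " + range.join(", "));
  }
  const keys = {};
  for (const field of [...equality, ...range]) {
    keys[field] = 1;
  }
  return keys;
}

// The range field is listed first in the query but ends up last in the index.
const keys = suggestIndexKeys({
  publishedAt: { $gt: new Date(0) },
  publisher: "US Weekly",
});
console.log(JSON.stringify(keys));
```

The resulting key document could be passed to `ensureIndex` in the shell. A query that trips the "more than one range field" error is a hint that the schema, not the index, may need to change.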

Page 7: MongoDB Indexing Constraints and Creative Schemas

The Tools

• `mongostat` for MongoDB Behavior

• `tail` the logs for current operations

• `iostat` for disk util

• `top -c` for CPU usage

Page 8: MongoDB Indexing Constraints and Creative Schemas

First, a Simple One

query getmore command  res faults  locked db ar|aw netIn netOut conn     time
  129       4       7 126m      2 my_db:0.0%   3|0   27k   445k   42 15:36:54
   64       4       3 126m      0 my_db:0.0%   5|0   12k   379k   42 15:36:55
   65       7       8 126m      0 my_db:0.1%   3|0   15k   230k   42 15:36:56
   65       3       3 126m      1 my_db:0.0%   3|0   13k   170k   42 15:36:57
   66       1       6 126m      1 my_db:0.0%   0|0   14k   262k   42 15:36:58
   32       8       5 126m      0 my_db:0.0%   5|0    5k   445k   42 15:36:59

a truncated mongostat

Alerted due to high CPU

Page 9: MongoDB Indexing Constraints and Creative Schemas

log

[conn73454] query my_db.my_collection query: { $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } } ntoreturn:5 ntoskip:0 nscanned:33236 scanAndOrder:1 keyUpdates:0 numYields: 21 locks(micros) r:317266 nreturned:5 reslen:3127 178ms
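
The numbers worth reading in a line like this are `nscanned`, `nreturned`, and `scanAndOrder`. A minimal sketch of pulling them out, assuming the log format shown above; `parseSlowQuery` is a hypothetical helper, and the query document is left out of the sample string for brevity.

```javascript
// Hypothetical helper: extract the counters from one slow-query log line.
function parseSlowQuery(line) {
  const number = (name) => {
    const match = line.match(new RegExp("\\b" + name + ":\\s*(\\d+)"));
    return match ? Number(match[1]) : null;
  };
  const millis = line.match(/(\d+)ms\s*$/);
  const nscanned = number("nscanned");
  const nreturned = number("nreturned");
  return {
    nscanned,
    nreturned,
    // scanAndOrder:1 means the sort was done in memory, not via an index.
    scanAndOrder: number("scanAndOrder") === 1,
    millis: millis ? Number(millis[1]) : null,
    // Documents scanned for every document returned.
    scannedPerReturned: nscanned !== null && nreturned ? nscanned / nreturned : null,
  };
}

// Counters copied from the log line above (query document omitted).
const line =
  "[conn73454] query my_db.my_collection ntoreturn:5 ntoskip:0 nscanned:33236 " +
  "scanAndOrder:1 keyUpdates:0 numYields: 21 locks(micros) r:317266 " +
  "nreturned:5 reslen:3127 178ms";

const parsed = parseSlowQuery(line);
console.log(parsed);
```

Here 33236 documents are scanned to return 5, and the sort happens in memory. Both point to a query that no index matches.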

Page 10: MongoDB Indexing Constraints and Creative Schemas

Solution

We are fixing this query:

{ $query: { publisher: "US Weekly" }, orderby: { publishedAt: -1 } }

With this index:

db.my_collection.ensureIndex({"publisher": 1, publishedAt: -1}, {background: true})

I would show you the logs, but now they are silent.

Page 11: MongoDB Indexing Constraints and Creative Schemas

The Pattern

Inefficient Read Queries from in-memory table scans cause high CPU load

Caused by not matching indexes to queries.

Page 12: MongoDB Indexing Constraints and Creative Schemas

Example 2

MongoDB Twitter-ish Feed

Customer was building a network graph of users.

Page 13: MongoDB Indexing Constraints and Creative Schemas

Naive Method

Statuses

{ creator_id: ObjectId(), status: "This is so awesome!" }

Users

{ _id: ObjectId(), friends: [array-o-friends] }

Query

db.statuses.find({creator_id: {$in: [array-o-friends]}}).sort({_id: -1})

Page 14: MongoDB Indexing Constraints and Creative Schemas

Solution

Statuses

{ creator_id: ObjectId(), friends_of_creator: [array-of-viewers], status: "This is so awesome!" }

Users

{ _id: ObjectId(), friends: [array-o-friends] }

Query

db.statuses.find({friends_of_creator: ObjectId()}).sort({_id: -1})
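
The schema above moves the work from read time to write time: the creator's friend list is copied onto each status when it is written. A minimal in-memory sketch of that idea follows. `buildStatus` and `feedFor` are hypothetical helpers, plain strings stand in for ObjectIds, and the array filter only imitates what the `find` on `friends_of_creator` would do against a real collection (the sort on `_id` is left out).

```javascript
// Hypothetical helper: copy the creator's friends onto the status at write time.
function buildStatus(creator, text) {
  return {
    creator_id: creator._id,
    friends_of_creator: [...creator.friends], // who may see this status
    status: text,
  };
}

// In-memory stand-in for db.statuses.find({friends_of_creator: viewerId}).
// One equality match on an array field, instead of an $in over every friend.
function feedFor(viewerId, statuses) {
  return statuses.filter((s) => s.friends_of_creator.includes(viewerId));
}

// Assumption: string ids instead of ObjectId() so the sketch runs anywhere.
const alice = { _id: "alice", friends: ["bob", "carol"] };
const dave = { _id: "dave", friends: ["carol"] };

const statuses = [
  buildStatus(alice, "This is so awesome!"),
  buildStatus(dave, "Hello"),
];

console.log(feedFor("bob", statuses).map((s) => s.status));
console.log(feedFor("carol", statuses).map((s) => s.status));
```

The trade-off is larger status documents, plus extra write work whenever a friend list changes, in exchange for a read that a single index on `friends_of_creator` can serve.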

Page 15: MongoDB Indexing Constraints and Creative Schemas

The Pattern

With graphs, query on a “viewable by” field.

What worked with minimal documents was not scaling.

Page 16: MongoDB Indexing Constraints and Creative Schemas

Similar Issues - Messages

Naive

{ sender_id: ObjectId(), recipient_id: ObjectId(), message: "This is so awesome!" }

Naive Query

db.messages.find({$or: [{sender_id: ObjectId()}, {recipient_id: ObjectId()}]}).sort({_id: -1})

Solution

{ sender_id: ObjectId(), recipient_id: ObjectId(), participants: [ObjectId(), ObjectId()], thread_id: ObjectId(), message: "This is so awesome!" }

Query

db.messages.find({participants: ObjectId()}).sort({_id: -1})

Page 17: MongoDB Indexing Constraints and Creative Schemas

Example 3

insert query update delete getmore command faults locked % idx miss % qr|qw ar|aw
    *0    *0     *0     *0       0     1|0   1422        0          0   0|0  50|0
    *0     6     *0     *0       0     6|0    575        0          0   0|0  51|0
    *0     3     *0     *0       0     1|0   1047        0          0   0|0  50|0
    *0     2     *0     *0       0     3|0   1660        0          0   0|0  50|0

a truncated mongostat

Alerted on high CPU

Page 18: MongoDB Indexing Constraints and Creative Schemas

tail

[initandlisten] connection accepted from ....
[conn4229724] authenticate: { authenticate: ....
[initandlisten] connection accepted from ....
[conn4229725] authenticate: { authenticate: .....
[conn4229717] query ..... 102ms
[conn4229725] query ..... 140ms

amazingly quiet

Page 19: MongoDB Indexing Constraints and Creative Schemas

currentOp

> db.currentOp()
{
  "inprog" : [
    {
      "opid" : 66178716,
      "lockType" : "read",
      "secs_running" : 760,
      "op" : "query",
      "ns" : "my_db.my_collection",
      "query" : {
        keywords: { $in: ["keyword1", "keyword2"] },
        tags: { $in: ["tags1", "tags2"] }
      },
      orderby: { "created_at": -1 },
      "numYields" : 21
    }
  ]
}

Page 20: MongoDB Indexing Constraints and Creative Schemas

Solution

Return Stability to Database

> db.currentOp().inprog.filter(function(row) {
    return row.secs_running > 100 && row.op == "query"
  }).forEach(function(row) {
    db.killOp(row.opid)
  })

Disable query, and refactor schema.
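
Before killing anything on a production database, it can help to dry-run the selection against a captured `inprog` array. The sketch below keeps the same filter as the kill loop above but as a plain function. `longRunningQueries` is a hypothetical name, and the second and third sample operations are invented for illustration; only the first comes from the currentOp output shown earlier.

```javascript
// Hypothetical helper: the same selection logic as the kill loop above,
// separated out so it can be checked before calling db.killOp().
function longRunningQueries(inprog, minSeconds) {
  return inprog.filter((op) => op.op === "query" && op.secs_running > minSeconds);
}

const sampleInprog = [
  { opid: 66178716, op: "query", secs_running: 760 }, // from the currentOp output
  { opid: 1, op: "query", secs_running: 2 },          // invented: short query, keep
  { opid: 2, op: "update", secs_running: 500 },       // invented: not a query, keep
];

const opidsToKill = longRunningQueries(sampleInprog, 100).map((op) => op.opid);
console.log(opidsToKill);
```

Only read queries past the threshold are selected. Long-running writes are left alone, which matches the `row.op == "query"` condition in the shell version.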

Page 21: MongoDB Indexing Constraints and Creative Schemas

Refactoring

I have one word for you, “Schema”

Page 22: MongoDB Indexing Constraints and Creative Schemas

Example 4

A map reduce has gradually run slower and slower.

Page 23: MongoDB Indexing Constraints and Creative Schemas

Finding Offenders

Find the time of the slowest query of the day:

grep '[0-9]\{3,100\}ms$' $MONGODB_LOG | awk '{print $NF}' | sort -n

Page 24: MongoDB Indexing Constraints and Creative Schemas

Slowest Map Reduce

my_db.$cmd command: {
  mapreduce: "my_collection",
  map: function() {},
  query: {
    $or: [
      { object.type: "this" },
      { object.type: "that" }
    ],
    time: { $lt: new Date(1359025311290), $gt: new Date(1358420511290) },
    object.ver: 1,
    origin: "tnh"
  },
  out: "my_new_collection",
  reduce: function(keys, vals) { .... }
} ntoreturn:1 keyUpdates:0 numYields: 32696 locks(micros) W:143870 r:511858643 w:6279425 reslen:140 421185ms

Page 25: MongoDB Indexing Constraints and Creative Schemas

Solution

Problem

Query is slow because it has multiple multi-value operators: $or, $gt, and $lt.

Solution

Change schema to use an “hour_created” attribute:

hour_created: “%Y-%m-%d %H”

Create an index on “hour_created” followed by the “$or” values. Query using the new “hour_created”.
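
A minimal sketch of deriving the “hour_created” value when a document is written. `hourCreated` is a hypothetical helper, and formatting in UTC is an assumption, since the slide does not say which time zone the buckets use. The index key document reflects one reading of the slide: “hour_created” first, then the field the “$or” branches test.

```javascript
// Hypothetical helper: format a Date as "%Y-%m-%d %H".
// Assumption: buckets are in UTC.
function hourCreated(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    date.getUTCFullYear() + "-" +
    pad(date.getUTCMonth() + 1) + "-" +
    pad(date.getUTCDate()) + " " +
    pad(date.getUTCHours())
  );
}

// One reading of the slide: hour_created first, then the $or field.
const indexKeys = { hour_created: 1, "object.type": 1 };

// The two timestamps from the slow map reduce above fall into these buckets.
const upperBucket = hourCreated(new Date(1359025311290));
const lowerBucket = hourCreated(new Date(1358420511290));
console.log(lowerBucket, upperBucket);
```

With the hour stored as a string, a job that processes one hour at a time can match on it by equality and no longer needs the `$gt`/`$lt` pair on `time`.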

Page 26: MongoDB Indexing Constraints and Creative Schemas

Words of caution

2 / 4 solutions were to add an index.

Adding new indexes as a solution scales poorly.

Page 27: MongoDB Indexing Constraints and Creative Schemas

Sometimes . . .

It is best to do nothing, except add shards / add hardware.

Go back to the drawing board on the design.

Page 28: MongoDB Indexing Constraints and Creative Schemas

Bad things happen to good databases?

• ORMs

• Manage your indexes and queries.

• Constraints will set you free.

Page 29: MongoDB Indexing Constraints and Creative Schemas

Road Map for Refactoring

• Measure, measure, measure.

• Find your slowest queries and determine if they can be indexed

• Rephrase the problem you are solving by asking “How do I want to query my data?”

Page 30: MongoDB Indexing Constraints and Creative Schemas

Thank you!

• Questions?

• E-mail me: [email protected]
