Managing a Maturing MongoDB Ecosystem
Charity Majors (@mipsytipsy)
Thursday, June 20, 13
Managing a maturing MongoDB ecosystem
automating with chef
performance tuning
disaster recovery
chef.
Basic replica set
How do I chef that?
... grab the AWS and mongodb cookbooks, create a site wrapper cookbook
make a role for your cluster,
launch some nodes,
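The role itself isn't captured in this transcript; with the edelight mongodb cookbook, a cluster role is roughly a JSON sketch like the one below. The role name, recipe list, and attribute keys here are illustrative — check the cookbook's attributes files for the exact keys.

```json
{
  "name": "mongo-cluster-foo",
  "description": "replica set members for the foo cluster",
  "run_list": [
    "recipe[aws]",
    "recipe[mongodb::replicaset]"
  ],
  "override_attributes": {
    "mongodb": {
      "cluster_name": "foo"
    }
  }
}
```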
initiate the replica set,
... and you’re done.
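The initiation command didn't survive transcription; stripped of chef, it is a single mongo shell call on one of the new nodes (hostnames and set name are placeholders):

```javascript
// run once, in the mongo shell on any one of the new nodes;
// _id must match the replSet name the mongods were started with
rs.initiate({
  _id: "foo",
  members: [
    { _id: 0, host: "mongo-foo-1:27017" },
    { _id: 1, host: "mongo-foo-2:27017" },
    { _id: 2, host: "mongo-foo-3:27017" }
  ]
})
rs.status()  // confirm one PRIMARY and healthy SECONDARYs
```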
Adding snapshots
adding RAID for EBS volumes
this will bootstrap a new node for the cluster from snapshots
with this role ...
multiple clusters
distinct cluster name, backup host, backup volumes
sharding
Thursday, June 20, 13
assign a shard name per cluster, per role
treat them like ordinary replica sets
Arbiters
• Mongod processes that do nothing but vote
• Highly reliable
• To provision an arbiter, use the LWRP
• Easy to run multiple arbiters on a single host
arbiter LWRP
replica set with arbiters
run multiple arbiters on a single host:
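The LWRP invocation shown on the slide isn't captured in the transcript. Stripped of chef, running multiple arbiters on one box is just several small mongod processes on different ports, one per replica set (paths and flags illustrative — arbiters hold no data, so they can be kept tiny):

```
mongod --replSet foo --port 27020 --dbpath /var/lib/mongo-arb-foo \
       --nojournal --smallfiles --oplogSize 8 --fork --logpath /var/log/mongo-arb-foo.log
mongod --replSet bar --port 27021 --dbpath /var/lib/mongo-arb-bar \
       --nojournal --smallfiles --oplogSize 8 --fork --logpath /var/log/mongo-arb-bar.log
```

Each arbiter is then added from its set's primary with rs.addArb("arbhost:27020").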
Managing votes with arbiters
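The slide's config isn't captured; the usual shell sequence for stripping votes from, say, a snapshot node looks like this (the member index is illustrative):

```javascript
// run against the primary
cfg = rs.conf()
cfg.members[3].priority = 0   // snapshot node: never eligible to be primary
cfg.members[3].hidden = true  // invisible to clients (requires priority 0)
cfg.members[3].votes = 0      // let arbiters carry the quorum instead
rs.reconfig(cfg)
```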
tuning and performance.
resources and provisioning
tuning your filesystem
snapshotting and warmups
fragmentation
Provisioning tips
• Memory is your primary scaling constraint
• Your working set must fit into memory
• in 2.4, estimate with:
• Page faults? Your working set may not fit
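The estimation command didn't survive transcription; in 2.4 the server can report a working set estimate through serverStatus, and page fault counts live in the same output:

```javascript
// mongodb 2.4+: working set estimate (it is only an estimate)
db.serverStatus({ workingSet: 1 }).workingSet
// fields include pagesInMemory and overSeconds

// page faults, to sanity-check whether the working set really fits:
db.serverStatus().extra_info.page_faults
```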
Disk options
• If you’re on Amazon:
• EBS
• Dedicated SSD
• Provisioned IOPS
• Ephemeral
• If not:
• use SSDs!
EBS classic
EBS with PIOPS:
... just say no to EBS
SSD (hi1.4xlarge)
• 8 cores
• 60 gigs RAM
• 2 × 1 TB SSD drives
• 120k random reads/sec
• 85k random writes/sec
• expensive! $2300/mo on demand
PIOPS
• Up to 2000 IOPS/volume
• Up to 1024 GB/volume
• Variability of < 0.1%
• Costs double regular EBS
• Supports snapshots
• RAID together multiple volumes for more storage/performance
Estimating PIOPS
• estimate how many IOPS to provision with the “tps” column of sar -d 1
• multiply that by 2-3x depending on your spikiness
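The slide's arithmetic can be sketched as a tiny function — the 2.5x default is just the middle of the 2-3x headroom range:

```javascript
// Rough PIOPS sizing: take the steady-state "tps" column from `sar -d 1`,
// then multiply by a headroom factor of 2-3x for spiky traffic.
function estimatePiops(steadyTps, headroom) {
  if (headroom === undefined) headroom = 2.5;  // middle of the 2-3x range
  return Math.ceil(steadyTps * headroom);
}

// e.g. a steady 600 IOPS with 2x headroom -> provision 1200 PIOPS
console.log(estimatePiops(600, 2));  // 1200
console.log(estimatePiops(600));     // 1500
```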
Ephemeral storage
• Cheap
• Fast
• No network latency
• No snapshot capability
• Data is lost forever if you stop or resize the instance
Filesystem and limits
• Raise file descriptor limits
• Raise connection limits
• Mount with noatime and nodiratime
• Consider putting the journal on a separate volume
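As a sketch, those mount options and limits translate into config fragments like these (device, mount point, and numbers are illustrative):

```
# /etc/fstab -- data volume mounted noatime/nodiratime
/dev/xvdf  /var/lib/mongodb  ext4  defaults,noatime,nodiratime  0 0

# /etc/security/limits.conf -- raise file descriptor limits for the mongod user
mongodb  soft  nofile  64000
mongodb  hard  nofile  64000
```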
Blockdev
• Your default blockdev readahead is probably wrong
• Too large? you will underuse memory
• Too small? you will hit the disk too much
• Experiment.
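Checking and changing the readahead looks like this (device and value are illustrative; small values like 32 sectors = 16 KB are a common starting point for mongodb's random-access workload):

```
blockdev --getra /dev/xvdf     # print current readahead, in 512-byte sectors
blockdev --setra 32 /dev/xvdf  # set it; experiment and re-measure
```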
Snapshot best practices
• Set priority = 0
• Set hidden = 1
• Consider setting votes = 0
• Lock mongo or stop mongod before snapshot
• Consider running continuous compaction on snapshot node
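The lock step, in plain mongo shell terms:

```javascript
// on the hidden snapshot secondary, before triggering the EBS snapshot
db.fsyncLock()    // flush dirty pages and block further writes
// ... take the snapshot ...
db.fsyncUnlock()  // release the lock once the snapshot is underway
```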
Restoring from snapshot
• EBS snapshots lazily load blocks from S3
• run “dd” on each of the data files to pull blocks down
• Always warm up a secondary before promoting
• warm up both indexes and data
• http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
• in mongodb 2.2 and above you can use the touch command:
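The touch invocation itself isn't in the transcript; it takes this form (the collection name is a placeholder):

```javascript
// mongodb 2.2+: page a collection's data and/or indexes into RAM
db.runCommand({ touch: "mycollection", data: true, index: true })
```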
Fragmentation
• Your RAM gets fragmented too!
• Leads to underuse of memory
• Deletes are not the only source of fragmentation
• Repair, compact, or resync regularly
3 ways to fix fragmentation:
• Re-sync a secondary from scratch
• hard on your primary; rs.syncFrom() a secondary
• Repair a secondary
• can cause small discrepancies in your data
• Run continuous compaction on your snapshot node
• won’t reset padding factors
• not appropriate if you do lots of deletes
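For reference, compaction is driven per collection (the name is a placeholder):

```javascript
// run on a secondary; compact blocks the node while it runs
db.runCommand({ compact: "mycollection" })
// 2.2+ can also rewrite documents with a fixed padding factor:
db.runCommand({ compact: "mycollection", paddingFactor: 1.1 })
```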
Fragmentation is terrible
Upgrade!
mongo is getting faster. :)
disasters and recovery.
Finding bad queries
• db.currentOp()
• mongodb.log
• profiling collection
db.currentOp()
• Check the queue size
• Any indexes building?
• Sort by num_seconds
• Sort by num_yields, locktype
• Consider adding comments to your queries
• Run explain() on queries that are long-running
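A typical inspection along those lines, using 2.4-era field names (what the slide calls num_seconds appears as secs_running in currentOp output):

```javascript
// long-running client queries, slowest first
db.currentOp().inprog
  .filter(function (op) { return op.op === "query" && op.secs_running > 5; })
  .sort(function (a, b) { return b.secs_running - a.secs_running; })
```

Comments can be attached to queries with the $comment operator, so they show up here and in the logs.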
mongodb.log
• Configure output with --slowms
• Look for high execution time, nscanned, ntoreturn
• See which queries are holding long locks
• Match connection ids to IPs
system.profile collection
• Enable profiling with db.setProfilingLevel()
• Does not persist through restarts
• Like mongodb.log, but queryable
• Writes to this collection incur some cost
• Use db.system.profile.find() to get slow queries for a certain collection, time range, execution time, etc
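A sketch of both steps (thresholds, database, and collection names are illustrative):

```javascript
// profile everything slower than 100ms on this database
db.setProfilingLevel(1, 100)

// slowest recent ops against one collection, in the last hour
db.system.profile.find({
  ns: "mydb.mycollection",
  millis: { $gt: 100 },
  ts: { $gt: new Date(Date.now() - 3600 * 1000) }
}).sort({ millis: -1 })
```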
... when queries pile up ...
• Know what your tipping point looks like
• Don’t switch your primary or restart
• Do kill queries before the tipping point
• Write your kill script before you need it
• Don’t kill internal mongo operations, only queries.
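A minimal kill script in that spirit — threshold and filters are illustrative, not Parse's actual script:

```javascript
db.currentOp().inprog.forEach(function (op) {
  if (op.op !== "query") return;                            // client queries only
  if (!op.secs_running || op.secs_running < 30) return;     // tipping-point threshold
  if (op.ns && op.ns.indexOf("local.oplog") === 0) return;  // never kill oplog tailers
  print("killing op " + op.opid + " on " + op.ns);
  db.killOp(op.opid);
})
```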
can’t elect a master?
• Never run with an even number of votes (max 7)
• You need > 50% of votes to elect a primary
• Set your priority levels explicitly if you need warmup
• Consider delegating voting to arbiters
• Set snapshot nodes to be nonvoting if possible.
• Check your mongo log. Is something vetoing? Do they have an inconsistent view of the cluster state?
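The majority rule is simple enough to state as code, and shows directly why an even number of votes is dangerous:

```javascript
// A primary needs a strict majority of all configured votes (max 7 voting members).
function canElectPrimary(votesReachable, totalVotes) {
  return votesReachable > totalVotes / 2;
}

// 3 votes configured, 2 reachable -> majority, can elect
console.log(canElectPrimary(2, 3));  // true
// 4 votes configured, 2 reachable -> a tie is not a majority: no primary
console.log(canElectPrimary(2, 4));  // false
```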
secondaries crashing?
• Some rare mongo bugs will cause all secondaries to crash unrecoverably
• Never kill oplog tailers or other internal database operations; this can also trash secondaries
• Arbiters are more stable than secondaries, consider using them to form a quorum with your primary
replication stops?
• Other rare bugs will stop replication or cause secondaries to exit without a corrupt op
• The correct way to fix this is to re-snapshot off the primary and rebuild your secondaries.
• However, you can sometimes *dangerously* repair a secondary:
1. stop mongo
2. bring it back up in standalone mode
3. repair the offending collection
4. restart mongo again as part of the replica set
• Everything is getting vaguely slower?
• check your padding factor, try compaction
• You rs.remove() a node and get weird driver errors?
• always shut down mongod after removing from replica set
• Huge background flush spike?
• probably an EBS or disk problem
• You run out of connection limits?
• possibly a driver bug
• hard-coded to 80% of soft ulimit until 20k is reached.
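That connection-limit rule, as stated on the slide, works out to:

```javascript
// how mongod of this era derived its connection limit, per the slide:
// 80% of the soft file-descriptor ulimit, capped at 20,000
function maxIncomingConnections(softUlimit) {
  return Math.min(Math.floor(softUlimit * 0.8), 20000);
}

console.log(maxIncomingConnections(1024));    // 819  -- a default ulimit is far too low
console.log(maxIncomingConnections(100000));  // 20000 -- the hard cap
```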
• It looks like all I/O stops for a while?
• check your mongodb.log for large newExtent warnings
• also make sure you aren’t reaching PIOPS limits
• You get weird driver errors after adding/removing/re-electing?
• some drivers have problems with this, you may have to restart
Glossary of resources
• Opscode AWS cookbook
• https://github.com/opscode-cookbooks/aws
• edelight MongoDB cookbook
• https://github.com/edelight/chef-mongodb
• Parse MongoDB cookbook fork
• https://github.com/ParsePlatform/Ops/tree/master/chef/cookbooks/mongodb
• Parse compaction scripts and warmup scripts
• http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
• http://blog.parse.com/2013/03/26/always-be-compacting/
Charity Majors (@mipsytipsy)