Running MongoDB in the Cloud Tony Tam @fehguy

Running MongoDB in the Cloud

May 11, 2015



Tony Tam

A talk about how Wordnik migrated from EC2 to physical servers and back again, largely thanks to the cloud-friendliness of MongoDB
Transcript
Page 1: Running MongoDB in the Cloud

Running MongoDB in the Cloud

Tony Tam @fehguy

Page 2: Running MongoDB in the Cloud

What this Talk is About

Wordnik left the cloud and came back

• What?!?

• Why we left

• Decisions

• Why we came back (and what we did differently)

Page 3: Running MongoDB in the Cloud

Who is Wordnik?

•World’s fastest updating English dictionary

• Based on input of text at ~8k words/second

• Word Graph as basis to our analysis

• Synchronous & asynchronous processing

•Tens of billions of documents in NR storage

•Concept & Meaning Discovery Engine

•> 20M daily REST API calls, billions served

Page 4: Running MongoDB in the Cloud

So Why the Detour?

•Architectural Choices

•Business Choices

•Feedback, tooling, infrastructure

•Learning

•Changes in use case

•Progress!

Page 5: Running MongoDB in the Cloud

Architecture History

•EC2-based LAMP Stack

• POC (and seed funding)

• A manageable corpus < 1M records

•REST API

• Web + public

• MySQL in master/slave

• ~1B documents

• Operational nightmare

Page 6: Running MongoDB in the Cloud

Architecture History

•MongoDB

• First-order MySQL issues solved

• But it got slow…

•Real Servers to the rescue!

• Faster, bigger disks

•MongoDB for Corpus, Structured Data

• Faster Reads + Writes!

• More metal (72GB RAM)

• More cores

• “Cold” query time from 400ms to < 100ms

Page 7: Running MongoDB in the Cloud

Why Change?

Easy!

•Can’t beat metal…except

• Quick expansion

• Batch jobs/experiments

• Add a datacenter

• Full cluster migration

• The bill for unused capacity

Page 8: Running MongoDB in the Cloud

Architectural Mindshift

1. Anything can die, anytime

2. Centralized, redundant state (see point 1)

3. Server performance is *different*

• CPU, I/O, Memory—choose one

• Smart design makes it work!

Page 9: Running MongoDB in the Cloud

Architectural Mindshift

•Your software will need to change!

• So will the components you rely on

Page 10: Running MongoDB in the Cloud

Your Infrastructure

•Deploying Servers

• Going to need a lot!

•Configuration

•Updates to your software

What about Data?

Cloud Hero

Page 11: Running MongoDB in the Cloud

Let’s make this Work!

•MySQL Master Slave

• Take a snapshot (yes, this will block)

• Keep your binlogs!

change master to MASTER_HOST='app1', MASTER_USER='XXXX', MASTER_PASSWORD='XXXX', MASTER_LOG_FILE='app1-relay.0038774', MASTER_LOG_POS=6754205951;

Page 12: Running MongoDB in the Cloud

Let’s make this Work!

But…

•Your master is down!

• Quick, promote a slave!

• Point the other slaves to the new master

•As for the clients…

“Well, we never really tried that…”

Page 13: Running MongoDB in the Cloud

Better with Mongo

•Easy up, easy down!

• Startup: Sync your data, and announce to clients when ready for business

• Shutdown: Announce your departure and leave

•Replica sets

rs.add("db4.wordnik.com:27017");

rs.remove("db1.wordnik.com:27017");
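Because replica set members come and go, clients are usually pointed at a seed list rather than a single host. A minimal sketch of building a standard MongoDB connection string from such a list follows. The host `db2.wordnik.com`, the replica set name `rs0`, and the database name `wordnik` are assumptions for illustration and do not come from the talk.

```javascript
// Build a MongoDB connection string from a seed list of replica set members.
// Drivers use the seed list to discover the current members, so clients do
// not need to be reconfigured each time rs.add() or rs.remove() is run.
function replicaSetUri(hosts, replicaSet, db) {
  if (hosts.length === 0) {
    throw new Error("at least one seed host is required");
  }
  return `mongodb://${hosts.join(",")}/${db}?replicaSet=${encodeURIComponent(replicaSet)}`;
}

// Hypothetical seed list: db4 was added and db1 removed in the slide above.
const seeds = ["db2.wordnik.com:27017", "db4.wordnik.com:27017"];
const uri = replicaSetUri(seeds, "rs0", "wordnik");
console.log(uri);
```

The seed list only needs to contain some reachable members, not all of them.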

Page 14: Running MongoDB in the Cloud

Better with Mongo

Page 15: Running MongoDB in the Cloud

But what about Performance?

•Software Design

• It’s slow! (What is *it*?)

• Profile everything

import com.wordnik.util.perf._

...

def findUser(id:Long): User = {

Profile("UserDao::findUserById", dao.findUserById(id))

}

http://github.com/wordnik/wordnik-oss
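The idea behind the `Profile(...)` wrapper can be sketched in a few lines: wrap a call, time it, and accumulate statistics under a name. The sketch below is a hypothetical in-process profiler, not the wordnik-oss implementation; `profile`, `stats`, and the stand-in `findUserById` are names invented for illustration.

```javascript
// Hypothetical profiler: accumulates call count and timings per name.
const stats = new Map(); // name -> { count, totalMs, maxMs }

function profile(name, fn) {
  const start = process.hrtime.bigint();
  try {
    return fn();
  } finally {
    // Record the timing even if fn() throws.
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    const s = stats.get(name) || { count: 0, totalMs: 0, maxMs: 0 };
    s.count += 1;
    s.totalMs += elapsedMs;
    s.maxMs = Math.max(s.maxMs, elapsedMs);
    stats.set(name, s);
  }
}

// Stand-in for a real DAO lookup (assumption for this sketch).
function findUserById(id) {
  return { id, name: `user-${id}` };
}

function findUser(id) {
  return profile("UserDao::findUserById", () => findUserById(id));
}

findUser(42);
console.log(stats.get("UserDao::findUserById"));
```

Naming each timed section is what lets "it's slow" be narrowed down to a specific call.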

Page 16: Running MongoDB in the Cloud

But what about Performance?

Page 17: Running MongoDB in the Cloud

But what about Performance?

•“It’s the database!”

• What is it?

•Mapping layer

• Mysql (12+ joins) => 50 records/sec

• Mongo JSON POJO => 1000 records/sec

• Mongo DBO POJO => 35,000 records/sec

•How do you know? Profile it!
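To make the mapping-layer point concrete, here is a rough sketch that measures two ways of turning a stored document into an application object: a round trip through a JSON string, and a direct field-by-field mapping. It does not reproduce the Java driver benchmark quoted above, and the numbers it prints will differ by machine; `User`, `mapViaJson`, `mapDirect`, and `recordsPerSecond` are hypothetical names.

```javascript
class User {
  constructor(id, name) {
    this.id = id;
    this.name = name;
  }
}

// Mapping via an intermediate JSON string (extra serialize + parse step).
function mapViaJson(doc) {
  const parsed = JSON.parse(JSON.stringify(doc));
  return new User(parsed.id, parsed.name);
}

// Mapping directly from the document's fields.
function mapDirect(doc) {
  return new User(doc.id, doc.name);
}

// Measure how many records per second a mapper handles.
function recordsPerSecond(mapper, docs) {
  const start = process.hrtime.bigint();
  for (const doc of docs) mapper(doc);
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return docs.length / (Math.max(elapsedNs, 1) / 1e9);
}

const docs = Array.from({ length: 10000 }, (_, i) => ({ id: i, name: `user-${i}` }));
console.log("via JSON:", Math.round(recordsPerSecond(mapViaJson, docs)), "records/sec");
console.log("direct:  ", Math.round(recordsPerSecond(mapDirect, docs)), "records/sec");
```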

Page 18: Running MongoDB in the Cloud

It’s Still Slow!

•It’s the index!

• How do you know?

• AHHHHH

Page 19: Running MongoDB in the Cloud

It’s Still Slow!

•Balance your B-Tree

• Can't always keep the index in RAM. MMF (memory-mapped files) "does its thing"

• Right-balanced B-tree keeps necessary index hot

• If you hit indexes on disk, mute your pager
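One way to get a right-balanced index is to use keys that only ever increase, so every insert lands at the right-hand edge of the sorted key space. The sketch below illustrates that property with a plain sorted array instead of a real B-tree. The key layout (zero-padded millisecond timestamp plus sequence number) is an assumption for illustration, not a format from the talk.

```javascript
// Fixed-width, zero-padded key so string order matches numeric order.
function makeKey(timestampMs, seq) {
  return String(timestampMs).padStart(13, "0") + "-" + String(seq).padStart(6, "0");
}

// Binary search: index at which `key` would be inserted to keep the array sorted.
function insertPosition(sortedKeys, key) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Insert increasing keys and record whether each one landed at the right edge.
const keys = [];
const landedAtRightEdge = [];
const base = 1300000000000; // arbitrary starting timestamp for the sketch
for (let i = 0; i < 5; i++) {
  const key = makeKey(base + i * 1000, i);
  const pos = insertPosition(keys, key);
  landedAtRightEdge.push(pos === keys.length);
  keys.splice(pos, 0, key);
}
console.log(landedAtRightEdge);
```

With random keys the insert positions would be scattered across the whole array, which in an index corresponds to touching pages that may not be in RAM.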

Page 20: Running MongoDB in the Cloud

But it’s Still Slow!

•Look at your Schema design

• Design to limit index size/number

• _id is your friend—make it meaningful

• Record size consistency

• Hierarchical data beware!

• Split documents even in same collection!

db.posts.find({_id:/^tony_posts_/})

{_id:"tony_posts_1", posts:[...]}

{_id:"tony_posts_2", posts:[...]}

{_id:"tony_posts_3", posts:[...]}

YOUR app knows best
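The split-document pattern above can be sketched without a database: the application chunks a user's posts into fixed-size bucket documents with a meaningful `_id`, then fetches them all with an anchored prefix match, which MongoDB can serve from the `_id` index. The helper names and the bucket size are assumptions for illustration.

```javascript
// Build the _id for the n-th bucket of a user's posts, e.g. "tony_posts_1".
function bucketId(user, n) {
  return `${user}_posts_${n}`;
}

// Split a list of posts into bucket documents of at most bucketSize posts each.
function splitPosts(user, posts, bucketSize) {
  const docs = [];
  for (let i = 0; i < posts.length; i += bucketSize) {
    docs.push({
      _id: bucketId(user, docs.length + 1),
      posts: posts.slice(i, i + bucketSize),
    });
  }
  return docs;
}

// In-memory equivalent of db.posts.find({_id: /^tony_posts_/}).
function findByPrefix(collection, prefix) {
  const escaped = prefix.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const re = new RegExp("^" + escaped);
  return collection.filter((doc) => re.test(doc._id));
}

const collection = [
  ...splitPosts("tony", ["p1", "p2", "p3", "p4", "p5"], 2),
  ...splitPosts("erin", ["q1"], 2), // hypothetical second user
];
const tonyDocs = findByPrefix(collection, "tony_posts_");
console.log(tonyDocs.map((d) => d._id));
```

Keeping buckets a similar size also helps with the record size consistency mentioned above.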

Page 21: Running MongoDB in the Cloud

Really, it’s STILL slow!

•Your monolithic app/DB won’t scale the same on VMs

•Specialize!

• Wordnik uses mSOA

• Data tiers follow service types

• Smaller *everything*

Powered API

swagger.wordnik.com

Page 22: Running MongoDB in the Cloud

Really, it’s STILL slow!

•Your monolithic app/DB won’t scale the same on VMs

•Specialize!

• Wordnik uses mSOA

• Data tiers follow service types

• Smaller *everything*

Powered API

swagger.wordnik.com

A contract for your clients

Page 23: Running MongoDB in the Cloud

Be the Boss of your Data

•Your app *should* be smarter than your DB

• Lots of users?

• Lots of blog posts?

• Lots of images?

• Shard? On what?

•Data dimensionality

• Keep active data hot

• Don’t try to boil the ocean

Page 24: Running MongoDB in the Cloud

Cloud Computing + Mongo

•It can work extremely well

• No “Save as Cloud!” menu item

•Shifting constraints

• Optimize for RAM on VM

• Virtual disk => virtual performance

•Be “Deployable”

• Mongo Replica Sets are made for this

Page 25: Running MongoDB in the Cloud

Cloud Computing + Mongo

•System Durability

• Design your software for abuse

• Your old design doesn’t apply

• Add APM hooks, now!

•Dissect your app

• Build to micro services with dedicated MongoDB clusters

•Deployment Infrastructure

• Don’t wait until it’s too late

Page 26: Running MongoDB in the Cloud

See More

• See more about Wordnik APIs

http://developer.wordnik.com

• Migrating from MySQL to MongoDB http://www.slideshare.net/fehguy/migrating-from-mysql-to-mongodb-at-wordnik

• Maintaining your MongoDB Installation http://www.slideshare.net/fehguy/mongo-sv-tony-tam

• Swagger API Framework http://swagger.wordnik.com

• Mapping Benchmark https://github.com/fehguy/mongodb-benchmark-tools

• Wordnik OSS Tools https://github.com/wordnik/wordnik-oss

Page 27: Running MongoDB in the Cloud

Questions?