Lessons Learned from Migrating 2+ Billion Documents at Craigslist
Jeremy Zawodny
[email protected] / [email protected]
http://blog.zawodny.com/


Jan 15, 2015




The slides from my 2011 MongoSF talk of the same name
Transcript
Page 1: Lessons Learned Migrating 2+ Billion Documents at Craigslist

Lessons Learned from Migrating 2+ Billion Documents at Craigslist

Jeremy Zawodny
[email protected]
[email protected]

http://blog.zawodny.com/

Page 2

Outline

• Recap last year’s MongoSV Talk
– The Archive, Why MongoDB, etc.
– http://www.10gen.com/video/mongosv2010/craigslist
• The Infrastructure
• The Lessons
• Wishlist
• Q&A

Page 3

Craigslist Numbers

• 2 data centers
• ~500 servers
• ~100 MySQL servers
• ~700 cities, worldwide
• ~1 billion hits/day
• ~1.5 million posts/day

Page 4

Archive: Where Data Goes To Die

Live Numbers
• ~1.75M posts/day
• ~14 day avg. lifetime
• ~60 day retention
• ~100M posts

• We keep all postings
• Users reuse postings
• Daily archive migration
• Internal query tools

Page 5

Archive Pain

• Coupled Schemas
• Big Indexes
• Hardware Failures
• Replication Lag
• Poor Search
• Human Time Costs

Page 6

MongoDB Wins

• Scalable
• Fast
• Friendly
• Proven
• Pragmatic
• Approachable

Page 7

MongoDB Details

• Plan for 5 billion documents
• Average size: 2KB
• 3 replica sets, 3 servers each
• Deploy to 2 datacenters
• Same deployment in each datacenter
• Posting ID is the sharding key

Page 8

MongoDB Architecture

• Typical Sharding with Replica Sets

  client      client      client      client
          mongos    mongos    mongos
  shard001        shard002        shard003
(replica set)   (replica set)   (replica set)
        config    config    config

(external Sphinx full-text indexers not pictured)

Page 9

Lesson: Know Your Hardware

• MongoDB on blades really sucks
– Single 10k RPM disks can’t take it when data is noticeably larger than RAM
– Mongo operations can hit the client timeout (30 sec default)
– Even minutely cron jobs start to spew
– Lots of time wasted in the development environment trying different kernels, tuning, etc.
– Most noticeable during heavy writes, but can happen if pages fall out of RAM for other reasons

Page 10

Lesson: Replica Sets Rock

• Lots of reboots happened during dev environment troubleshooting
• Each time, one of the remaining nodes took over
• No “reclone,” no config file or DNS changes
• Stuff “just worked” while nodes bounced up and down

Page 11

Lesson: Know Your Data

• MongoDB is UTF-8
– Some of our older data is decidedly NOT UTF-8
– We had lots of sloppy encoding issues to clean up, and we had to clean them all up
– Start data load. Wait 12-36 hours. Witness fail. Fix code. Start over. Sigh.
– This is a combination of having been sloppy and having old data. Even with a lot less history, this can bite you. Get your encoding house in order!
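The slides show no code, but the pre-load scrubbing described above can be sketched in Python (a hypothetical helper, not the Perl tooling Craigslist actually used; the latin-1 fallback is an assumption about what the legacy encoding might be):

```python
def to_clean_utf8(raw: bytes, fallback: str = "latin-1") -> str:
    """Return a valid UTF-8 string for a legacy byte blob.

    Try strict UTF-8 first; fall back to a single-byte encoding
    (latin-1 here, which never fails to decode) so a bulk load
    does not die 12-36 hours in on one bad posting.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)

# A posting body with a latin-1 e-acute byte, which is invalid UTF-8:
legacy = b"caf\xe9 for sale"
print(to_clean_utf8(legacy))  # café for sale
```

Running a pass like this over the archive before the load, rather than letting the insert fail mid-migration, is the "get your encoding house in order" step.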

Page 12

Lesson: Know Your Data Size

• MongoDB has a document size limit
– 4MB in 1.6.x, 16MB in 1.8.x
• What to do with outliers?
– In our case, trim off some useless data.
– Going from relational to document means this sort of problem is easy to have: one parent, many children.
• It’d be nice if this were easier to change, but clients have it hard-coded too.
• Compression would help, of course.
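The "one parent, many children" trim can be sketched as follows (hypothetical code, not from the talk; JSON size is used as a stand-in for BSON size, which your driver's encoder would measure in production):

```python
import json

MAX_DOC_BYTES = 16 * 1024 * 1024  # the 16MB hard limit in MongoDB 1.8.x

def trim_to_fit(doc: dict, child_key: str, limit: int = MAX_DOC_BYTES) -> dict:
    """Drop trailing child records until the (JSON-approximated)
    document size fits under the server's hard limit."""
    def size(d):
        return len(json.dumps(d).encode("utf-8"))
    while size(doc) > limit and doc[child_key]:
        doc[child_key].pop()  # trim the least useful data first
    return doc

# Tiny demo with an artificially small 300-byte "limit":
posting = {"_id": 1, "title": "sofa", "events": [{"note": "x" * 100}] * 5}
trimmed = trim_to_fit(posting, "events", limit=300)
print(len(trimmed["events"]))
```

Which child data counts as "useless" is an application decision; the point is to make the cut deliberately before the insert is rejected.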

Page 13

Lesson: Know Your Data Types

• Field types and conversions can be expensive to do after the fact!
– MongoDB treats strings and numbers differently, but some programming languages (such as Perl) don’t make that distinction obvious
– This has indexing implications when you later look for 123456789 but had unknowingly stored “123456789”
– http://search.cpan.org/dist/MongoDB/lib/MongoDB/DataTypes.pod

Page 14

Data Types, continued

• “If the type of a field is ambiguous and important to your application, you should document what you expect the application to send to the database and convert your data to those types before sending.”

• Do you know how to do that in your language of choice?

• Some drivers may make a “guess” that gets it right most of the time.
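The string-vs-number pitfall above can be illustrated in Python, where the type distinction is explicit (a conceptual sketch; the talk's context was Perl, whose scalars blur it):

```python
# A query for the integer 123456789 will not match a document that
# stored the string "123456789" -- MongoDB compares types as well as
# values. A plain Python dict behaves the same way:
stored = {"123456789": "posting A"}   # string key, as a sloppy driver might send
print(123456789 in stored)            # False: int and str never compare equal

# The fix the slide recommends: pick the canonical type up front and
# convert before every insert and every query.
def canonical_posting_id(value) -> int:
    """Coerce a posting ID to int before it touches the database."""
    return int(value)

assert canonical_posting_id("123456789") == canonical_posting_id(123456789)
```

Doing the coercion in one shared helper keeps inserts and queries from ever disagreeing about the field's type.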

Page 15

Lesson: Know Some Sharding

• The Balancer can be your frenemy
– Initial insert rate: 8,000/sec
– Later drops to 200/sec
– Too much time spent waiting to page in data that’s going to be sent to another node and never looked at (locally) again
– Pre-split your data if possible
– http://blog.zawodny.com/2011/03/06/mongodb-pre-splitting-for-faster-data-loading-and-importing/
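The arithmetic behind pre-splitting can be sketched like this (an illustrative helper, not code from the blog post; the boundaries it produces would be fed to a splitAt-style admin command before the bulk load):

```python
def split_points(min_id: int, max_id: int, chunks: int) -> list[int]:
    """Evenly spaced posting-ID boundaries for pre-splitting.

    Creating chunks (and moving them to their final shards) before
    the import means the balancer never has to page in freshly
    loaded data just to migrate it somewhere else.
    """
    step = (max_id - min_id) // chunks
    return [min_id + step * i for i in range(1, chunks)]

# 2 billion posting IDs split into 8 chunks -> 7 boundaries
points = split_points(0, 2_000_000_000, 8)
print(points[0], points[-1])  # 250000000 1750000000
```

Evenly spaced boundaries assume posting IDs are roughly uniform over the range; skewed keys would call for quantile-based split points instead.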

Page 16

Lesson: Know Some Replica Sets

• Replica set re-sync requires index rebuilds on the secondary
– Most painful when a secondary is down too long and can’t catch up using the oplog
– Typically happens during high write volumes
– In a large data set, the index rebuilding can take a couple of days even without many indexes
– What if you lose another node while that is happening?

Page 17

MongoDB Wishlist

• Replica set node re-sync without index rebuilding
• Record (or field) compression (not everyone uses a filesystem that offers compression)
• A way to tap into the oplog so that changes can be fed to external indexers (Sphinx, Redis, etc.)
• Hash-based sharding (coming soon?)
• Cluster snapshot/backup tool
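The hash-based sharding wish can be sketched conceptually (this is not MongoDB's eventual hashed-shard-key feature, just an illustration of why it helps a monotonically increasing key like a posting ID):

```python
import hashlib

def shard_for(posting_id: int, shards: int = 3) -> int:
    """Hash-based placement: a stable hash of the key spreads
    sequential posting IDs evenly across shards, avoiding the hot
    "last chunk" that range sharding gives an ever-growing ID.
    """
    digest = hashlib.md5(str(posting_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shards

# Sequential IDs scatter across shards instead of all landing on one:
print([shard_for(i) for i in range(10)])
```

The trade-off, of course, is that range queries on the shard key then touch every shard, which is why it belongs on a wishlist rather than being the default.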

Page 18

craigslist is hiring!

• Front-end Engineering
– HTML, CSS, JavaScript, jQuery
– (Mobile too)
• Network Administration
– Routers, switches, load balancers, etc.
• Back-end Engineering
– Linux, Apache, Perl, MySQL, MongoDB, Redis, Gearman, etc.
• Systems Administration
– Help keep all those systems running.

send resumes to: [email protected]

Plain Text or PDF, no Word Docs!

Page 19

craigslist is hiring!

• Laid back, non-corporate environment
• Engineering-driven culture
– Lots of interesting technical challenges
• Easy SF commute
• Excellent benefits and pay
• High-impact work
– Millions use craigslist daily

send resumes to: [email protected]

Plain Text or PDF, no Word Docs!