Building Wikipedia Scalable LAMP on a shoestring budget Brion Vibber GatorJUG 2007-09-12

Dec 19, 2015

Page 1:

Building Wikipedia
Scalable LAMP on a shoestring budget

Brion Vibber GatorJUG 2007-09-12

Page 2:

Wikipedia is happy to serve you...

Page 3:

• 7,000,000,000 page views per month

• 32,000 HTTP objects/second at peak

• Hardware budget to date: $1 million

• Tech department staff: 4
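
For a sense of scale, the monthly figure can be turned into an average rate. This is only a back-of-the-envelope sketch: the 30-day month is an assumption, since the slide does not say how the month is counted.

```python
# Average page-view rate implied by the slide's monthly figure.
# Assumption: a 30-day month (not stated on the slide).
PAGE_VIEWS_PER_MONTH = 7_000_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds

avg_views_per_second = PAGE_VIEWS_PER_MONTH / SECONDS_PER_MONTH
print(f"about {avg_views_per_second:,.0f} page views per second on average")
```

That works out to roughly 2,700 page views per second, averaged over the whole month.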

Page 4:

That’s not a big budget, dude.

Page 5:

Web 2.0 model

1. Make a cool site

2. Slap up some ads to look like you have a business plan

3. Sell out to Google, Y!, or Microsoft

cha-ching!

Page 6:

Wikipedia model

1. Make a cool site

2. Incorporate as a not-for-profit, keep the site non-commercial and ad-free

3. Beg for money

Hey, at least we’re honest about it!

Page 7:

Wikimedia Foundation, Inc.

• 501(c)(3) not-for-profit

• Funded primarily through donations, tax-deductible in the US (hint hint)

• Annual donation drive in Fall/Winter (hmmm, that’s coming up isn’t it?)

Page 8:

Free is good!

so when you have no money...

Page 9:

Free as in software!

yay :)

Page 10:

Basic LAMP stack

Page 11:

• Linux

• Apache

• MySQL

• PHP / Perl / Python / Pwhatever

y’all know the drill...

Page 12:

keep it simple at the core...

Page 13:

[Diagram: Apache+PHP, MySQL]

Page 14:

Ahhh, simple is nice.

:)

Page 15:

Simple is slow.

:(

Page 16:

First, add cache!

Page 17:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Page 18:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Page 19:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Good for static-dynamic sites like a wiki...

The public face of a given page doesn’t change very often, so you can cache at the HTTP level.
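
The idea of caching at the HTTP level can be sketched with a toy cache sitting in front of a page renderer. This is not Squid configuration, just an illustration in Python; `HttpCache`, `render_page`, and the URL are hypothetical names made up for the example.

```python
from typing import Callable, Dict


class HttpCache:
    """Toy stand-in for an HTTP cache placed in front of the app servers."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self.backend = backend            # renders a page (the slow path)
        self.store: Dict[str, str] = {}   # URL -> cached response body
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self.store:
            self.hits += 1
            return self.store[url]
        self.misses += 1
        body = self.backend(url)          # only reached on a cache miss
        self.store[url] = body
        return body

    def purge(self, url: str) -> None:
        """Drop a cached page, for example after someone edits it."""
        self.store.pop(url, None)


def render_page(url: str) -> str:
    # Hypothetical backend; a real one would run PHP and query MySQL.
    return f"<html>rendered {url}</html>"


cache = HttpCache(render_page)
for _ in range(1000):
    cache.get("/wiki/Main_Page")
print(cache.hits, cache.misses)   # 1000 requests, one backend render

cache.purge("/wiki/Main_Page")    # an edit invalidates the cached copy
cache.get("/wiki/Main_Page")
print(cache.hits, cache.misses)
```

Because the page changes rarely, almost every request is answered from the cache, and the backend only renders again after a purge.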

Page 20:

[Diagram: Apache+PHP+APC and Squid in Tampa; Squid caches in Amsterdam and Seoul]

Page 21:

[Diagram: Apache+PHP+APC and Squid in Tampa; Squid caches in Amsterdam and Seoul]

Good for geographic load balancing, too!

Use cheaper, faster local bandwidth...

Page 22:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Page 23:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

PHP compiles your scripts to bytecode...

...then throws it away after execution.

Page 24:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Compiling on every run adds a lot of overhead...

...especially as your application grows...

...pulling in lots of framework and library code.

Page 25:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Always use an opcode cache with PHP!

• APC

• eAccelerator

• Zend Platform

• ...

Drastically reduces startup time for large apps, with moderate improvements even for small scripts.

Page 26:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Page 27:

[Diagram: Squid, Apache+PHP+APC, MySQL, memcached]

Share temporary data in your network’s memory

Faster than a disk-backed database when you just need an object cache...
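
A common way to use an object cache like this is to check the cache first and fall back to the database only on a miss. The sketch below uses an in-process stand-in rather than a real memcached client; `FakeMemcached`, `load_user_from_db`, and the key format are all hypothetical.

```python
import time


class FakeMemcached:
    """In-process stand-in for a memcached client (get/set with expiry)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]   # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)


db_queries = 0


def load_user_from_db(user_id):
    # Hypothetical slow path; stands in for a MySQL query.
    global db_queries
    db_queries += 1
    return {"id": user_id, "name": f"user{user_id}"}


def get_user(cache, user_id):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:                      # cache miss: hit the database
        user = load_user_from_db(user_id)
        cache.set(key, user, ttl=300)     # keep it for five minutes
    return user


cache = FakeMemcached()
for _ in range(5):
    get_user(cache, 42)
print(db_queries)  # five lookups, one database query
```

Because the data is temporary, losing a cache entry only costs one extra database query.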

Page 28:

Now add cash!

Page 29:

[Diagram: several Squid caches, many PHP servers also running memcached, and several MySQL servers]

Page 30:

[Diagram: several Squid caches, many PHP servers also running memcached, and several MySQL servers]

Page 31:

[Diagram: several Squid caches, many PHP servers also running memcached, and several MySQL servers]

Put underutilized memory and disk to work!

Page 32:

Unless you’re playing the blade game, those web servers come with plenty of memory and disk space. Use it!

Page 33:

A bit of memory on each server adds up to a big memcached store...
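
One simple way to pool memory from many servers is to hash each cache key to pick a server. The sketch below is an illustration only: the host names, the 512 MB per server, and the `server_for` helper are assumptions, not details from the talk.

```python
import hashlib
from collections import Counter

# Hypothetical web servers, each donating some RAM to memcached.
SERVERS = [f"web{i:02d}:11211" for i in range(1, 9)]
MEMORY_PER_SERVER_MB = 512  # assumed figure, for illustration only


def server_for(key):
    """Map a cache key to one server by hashing the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]


counts = Counter(server_for(f"page:{n}") for n in range(10_000))
print("pool size:", len(SERVERS) * MEMORY_PER_SERVER_MB, "MB")
for server, n in sorted(counts.items()):
    print(server, n)
```

Plain modulo hashing remaps most keys whenever the server list changes; consistent hashing is the usual alternative when that matters.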

Page 34:

Disk space can be used for replicated bulk data storage at very little CPU cost.

Page 35:

[Diagram: several Squid caches, many PHP servers also running memcached, and several MySQL servers]

Page 36:

[Diagram: several Squid caches, many PHP servers also running memcached, and several MySQL servers]

Break it up...

Page 37:

Replicate for speed!

[Diagram: one MySQL master replicating to two slaves]

Master: high write load, low read load

Slaves: low write load, high read load

...but your application now has to think about replication lag.
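
Here is one way an application might think about replication lag: send writes to the master, send reads to a slave that is not too far behind, and fall back to the master when freshness matters. This is a minimal sketch with hypothetical names (`Server`, `ReplicatedPool`, the 5-second threshold), not a description of any particular application's code.

```python
import random


class Server:
    def __init__(self, name, lag_seconds=0.0):
        self.name = name
        self.lag_seconds = lag_seconds  # how far behind the master it is


class ReplicatedPool:
    def __init__(self, master, slaves, max_lag=5.0):
        self.master = master
        self.slaves = slaves
        self.max_lag = max_lag          # assumed threshold, in seconds

    def for_write(self):
        return self.master              # all writes go to the master

    def for_read(self, needs_fresh=False):
        if needs_fresh:
            # For example, right after the user saved an edit.
            return self.master
        usable = [s for s in self.slaves if s.lag_seconds <= self.max_lag]
        # Fall back to the master if every slave is too far behind.
        return random.choice(usable) if usable else self.master


pool = ReplicatedPool(
    Server("db-master"),
    [Server("db-slave1", lag_seconds=0.4), Server("db-slave2", lag_seconds=30.0)],
)
print(pool.for_write().name)
print(pool.for_read().name)
print(pool.for_read(needs_fresh=True).name)
```

In this example the lagging slave is skipped, so ordinary reads land on `db-slave1` and fresh reads go to the master.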

Page 38:

& reliability...

[Diagram: MySQL master with two slaves]

Master dead?

Promote a slave!

[Diagram: one slave takes over as MySQL master]

...but failover isn’t automated with MySQL.
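
Since failover has to be handled outside MySQL, a script or an operator needs some rule for choosing which slave to promote. A plausible rule, shown here as a sketch, is to pick the healthy slave that has applied the most of the master's changes. The `Replica` type and the position numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    name: str
    applied_position: int   # how much of the master's log it has applied
    healthy: bool = True


def pick_new_master(slaves: List[Replica]) -> Replica:
    """Choose the healthy slave that is most caught up."""
    candidates = [s for s in slaves if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy slave to promote")
    return max(candidates, key=lambda s: s.applied_position)


slaves = [
    Replica("db2", applied_position=10_450),
    Replica("db3", applied_position=10_512),
    Replica("db4", applied_position=10_600, healthy=False),
]
new_master = pick_new_master(slaves)
print(new_master.name)  # db3: the most caught-up healthy slave
```

Choosing the server is only the first step; the remaining slaves and the application still have to be pointed at the new master.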

Page 39:

& some tricks.

[Diagram: MySQL master with two slaves]

apply schema changes on slaves...

...then swap masters...

...for low-downtime column and index changes!
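
The order of operations in this trick can be sketched as a small simulation. No real database is involved; the `Db` class, server names, and version numbers are hypothetical, and the sketch assumes the old and new schemas can replicate to each other while the change is rolling out.

```python
class Db:
    def __init__(self, name, role):
        self.name = name
        self.role = role            # "master" or "slave"
        self.schema_version = 1


def rolling_schema_change(servers, new_version):
    """Upgrade slaves first, swap masters, then upgrade the old master."""
    log = []
    old_master = next(s for s in servers if s.role == "master")
    slaves = [s for s in servers if s.role == "slave"]

    # 1. Apply the change on each slave while the master keeps serving.
    for s in slaves:
        s.schema_version = new_version
        log.append(f"alter {s.name}")

    # 2. Swap masters: promote a slave that already has the new schema.
    new_master = slaves[0]
    new_master.role, old_master.role = "master", "slave"
    log.append(f"promote {new_master.name}")

    # 3. The old master is now a slave and can be altered at leisure.
    old_master.schema_version = new_version
    log.append(f"alter {old_master.name}")
    return log


servers = [Db("db1", "master"), Db("db2", "slave"), Db("db3", "slave")]
steps = rolling_schema_change(servers, new_version=2)
print(steps)
```

The slow column or index change never runs on the server that is currently taking writes, which is where the low downtime comes from.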

Page 40:

Data too big? Load too high?

Split along logical data partitions, such as subsites that don’t interact closely.

• MySQL group s1: English-language Wikipedia

• MySQL group s2: Next 19 biggest wikis

• MySQL group s3: Next 730 wikis
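
Partitioning like this comes down to a lookup from wiki to database group. The sketch below shows the shape of such a lookup. The wiki identifiers and the contents of `BIG_WIKIS_S2` are placeholders: the slides only say that s2 holds the next 19 biggest wikis, not which ones.

```python
# Placeholder members; the slides do not list which wikis are in s2.
BIG_WIKIS_S2 = {"dewiki", "frwiki", "jawiki"}


def group_for(wiki: str) -> str:
    """Return the MySQL group that holds a given wiki's database."""
    if wiki == "enwiki":        # English-language Wikipedia gets its own group
        return "s1"
    if wiki in BIG_WIKIS_S2:    # the next biggest wikis share a group
        return "s2"
    return "s3"                 # everything else


for wiki in ["enwiki", "dewiki", "smallwiki"]:
    print(wiki, "->", group_for(wiki))
```

This works because the wikis rarely need to query each other's data, so each request only talks to one group.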

Page 41:

Data too big? Load too high?

Split along functional boundaries...

• MySQL big iron: page metadata, links, users... read/write/update... active index scans

• MySQL on the PHP servers: append-only bulk text... nice simple blobs

[Diagram: one big MySQL server, plus three PHP servers each also running MySQL]

Page 42:

General scalability tricks...

• Smart caching can keep most load away from the backend

• Keep data sets small -- look for places to spread out horizontally

• Keep worst cases fast, not just average cases
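
The last point is easy to see with numbers. In the sketch below the latencies are made up for illustration: most requests are fast, a few are very slow, and the average hides the slow ones.

```python
import math
import statistics

# Made-up request latencies in milliseconds: 98 fast requests, 2 very slow.
latencies_ms = [20] * 98 + [4000] * 2


def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


print("mean:  ", statistics.mean(latencies_ms))
print("median:", statistics.median(latencies_ms))
print("p99:   ", percentile(latencies_ms, 99))
```

Here the mean is under 100 ms and the median is 20 ms, yet the 99th percentile is 4 seconds, and those slow requests are the ones that tie up backend resources.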

Page 43:

Questions?

Page 44:

wikimediafoundation.org