Scaling Django Web Apps Mike Malone djangocon 2009 Thursday, September 10, 2009
Oct 17, 2014
http://www.flickr.com/photos/kveton/2910536252/
Pownce
• Large scale
• Hundreds of requests/sec
• Thousands of DB operations/sec
• Millions of user relationships
• Millions of notes
• Terabytes of static data
Pownce
• Encountered and eliminated many common scaling bottlenecks
• Real world example of scaling a Django app
• Django provides a lot for free
• I’ll be focusing on what you have to build yourself, and the rare places where Django got in the way
Scalability
Scalability
Scalability is NOT:
• Speed / Performance
• Generally affected by language choice
• Achieved by adopting a particular technology
A Scalable Application

import time

def application(environ, start_response):
    time.sleep(10)
    start_response('200 OK', [('content-type', 'text/plain')])
    return ('Hello, world!',)
A High Performance Application

def application(environ, start_response):
    remote_addr = environ['REMOTE_ADDR']
    f = open('access-log', 'a+')
    f.write(remote_addr + "\n")
    f.flush()
    f.seek(0)
    hits = sum(1 for l in f.xreadlines()
               if l.strip() == remote_addr)
    f.close()
    start_response('200 OK', [('content-type', 'text/plain')])
    return (str(hits),)
Scalability
A scalable system doesn’t need to change when the size of the problem changes.
Scalability
• Accommodate increased usage
• Accommodate increased data
• Maintainable
Scalability
• Two kinds of scalability
• Vertical scalability: buying more powerful hardware, replacing what you already own
• Horizontal scalability: buying additional hardware, supplementing what you already own
Vertical Scalability
• Costs don’t scale linearly (a server that’s twice as fast costs more than twice as much)
• Inherently limited by current technology
• But it’s easy! If you can get away with it, good for you.
Vertical Scalability
“Sky scrapers are special. Normal buildings don’t need 10 floor foundations. Just build!”
- Cal Henderson
Horizontal Scalability
The ability to increase a system’s capacity by adding more processing units (servers)
Horizontal Scalability
It’s how large apps are scaled.
Horizontal Scalability
• A lot more work to design, build, and maintain
• Requires some planning, but you don’t have to do all the work up front
• You can scale progressively...
• Rest of the presentation is roughly in order
Caching
Caching
• Several levels of caching available in Django
• Per-site cache: caches every page that doesn’t have GET or POST parameters
• Per-view cache: caches output of an individual view
• Template fragment cache: caches fragments of a template
• None of these are that useful if pages are heavily personalized
Caching
• Low-level Cache API
• Much more flexible, allows you to cache at any granularity
• At Pownce we typically cached
• Individual objects
• Lists of object IDs
• Hard part is invalidation
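A minimal sketch of the "individual objects plus lists of object IDs" pattern, written for Python 3 with an in-memory stand-in for the cache so it runs without Django or memcached. DictCache, NOTES, the key formats, and every function name here are hypothetical; in a real app the cache would be django.core.cache.cache and the loaders would be ORM queries.

```python
# Hypothetical in-memory stand-in for django.core.cache.cache.
class DictCache:
    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


cache = DictCache()

# Pretend database: note id -> (author id, body).
NOTES = {1: (7, "hello"), 2: (7, "world"), 3: (8, "other")}
db_queries = []  # records every simulated database hit


def load_note(note_id):
    """Fetch one note, going to the 'database' only on a cache miss."""
    key = "note:%d" % note_id
    note = cache.get(key)
    if note is None:
        db_queries.append(("note", note_id))
        note = NOTES[note_id]
        cache.set(key, note)
    return note


def note_ids_for_author(author_id):
    """Cache the list of IDs, not the objects, so each object is stored once."""
    key = "note_ids_for:%d" % author_id
    ids = cache.get(key)
    if ids is None:
        db_queries.append(("ids", author_id))
        ids = sorted(i for i, (author, _) in NOTES.items() if author == author_id)
        cache.set(key, ids)
    return ids


def notes_for_author(author_id):
    return [load_note(i) for i in note_ids_for_author(author_id)]


def on_note_saved(note_id, author_id):
    """Invalidation: drop the object and the ID list that may now be stale."""
    cache.delete("note:%d" % note_id)
    cache.delete("note_ids_for:%d" % author_id)
```

Keeping the ID list separate from the objects means a change to one note invalidates one small key and one list, not every cached page that happens to show the note.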
Caching
• Cache backends:
• Memcached
• Database caching
• Filesystem caching
Caching
Use Memcache.
Sessions
Use Memcache.
Sessions
Or Tokyo Cabinet
http://github.com/ericflo/django-tokyo-sessions/
Thanks @ericflo
Caching
Basic caching comes free with Django:

from django.core.cache import cache
from django.db import models

class UserProfile(models.Model):
    ...
    def get_social_network_profiles(self):
        cache_key = 'networks_for_%s' % self.user.id
        profiles = cache.get(cache_key)
        if profiles is None:
            profiles = self.user.social_network_profiles.all()
            cache.set(cache_key, profiles)
        return profiles
Caching
Invalidate when a model is saved or deleted:

from django.core.cache import cache
from django.db.models import signals

# Signal receivers are passed the sending model class and the affected instance.
def nuke_social_network_cache(sender, instance, **kwargs):
    cache_key = 'networks_for_%s' % instance.user_id
    cache.delete(cache_key)

signals.post_save.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
signals.post_delete.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
Caching
• Invalidate post_save, not pre_save
• Still a small race condition
• Simple solution, worked for Pownce:
• Instead of deleting, set the cache key to None for a short period of time
• Instead of using set to cache objects, use add, which fails if there’s already something stored for the key
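The race is a reader that loaded the old row just before a write and then caches it just after the invalidation. A rough sketch of the None-plus-add trick, written for Python 3 against a hypothetical in-memory cache with memcached-like semantics (TinyCache, its hand-advanced clock, TOMBSTONE_SECONDS, invalidate, and fill are all made up for illustration):

```python
# Hypothetical in-memory cache with memcached-like add/set/get semantics.
class TinyCache:
    def __init__(self):
        self._data = {}   # key -> (value, expires_at or None)
        self.now = 0      # fake clock, advanced by hand

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and expires_at <= self.now:
            del self._data[key]
            return None
        return entry

    def get(self, key):
        entry = self._live(key)
        return None if entry is None else entry[0]

    def set(self, key, value, timeout=None):
        expires_at = None if timeout is None else self.now + timeout
        self._data[key] = (value, expires_at)

    def add(self, key, value, timeout=None):
        """Store only if the key is absent; report whether it was stored."""
        if self._live(key) is not None:
            return False
        self.set(key, value, timeout)
        return True


cache = TinyCache()
TOMBSTONE_SECONDS = 5  # assumed length of the "short period"


def invalidate(key):
    # Instead of deleting, park None under the key for a few seconds.
    cache.set(key, None, TOMBSTONE_SECONDS)


def fill(key, value_read_from_db):
    # Readers use add, so a value read before the write cannot overwrite
    # the placeholder left by invalidate().
    return cache.add(key, value_read_from_db)
```

While the placeholder is live, get still returns None, so readers fall through to the database and their add is rejected. Once it expires, the next reader's add succeeds and caching resumes.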
Advanced Caching
• Memcached’s atomic increment and decrement operations are useful for maintaining counts
• They were added to the Django cache API in Django 1.1
Advanced Caching
• On earlier versions, you can still use them if you poke at the internals of the cache object a bit
• cache._cache is the underlying cache object

try:
    result = cache._cache.incr(cache_key, delta)
except ValueError:  # nonexistent key raises ValueError
    # Do it the hard way, store the result.
    result = count_the_hard_way()  # hypothetical helper, e.g. a COUNT(*) query
    cache.set(cache_key, result)
return result
Advanced Caching
• Other missing cache API
• delete_multi & set_multi
• append: add data to existing key after existing data
• prepend: add data to existing key before existing data
• cas: store this data, but only if no one has edited it since I fetched it
Advanced Caching
• It’s often useful to cache objects ‘forever’ (i.e., until you explicitly invalidate them)
• User and UserProfile
• fetched almost every request
• rarely change
• But Django won’t let you
• IMO, this is a bug :(
The Memcache Backend

class CacheClass(BaseCache):
    def __init__(self, server, params):
        BaseCache.__init__(self, params)
        self._cache = memcache.Client(server.split(';'))

    def add(self, key, value, timeout=0):
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        # An explicit timeout of 0 ("never expire") is falsy, so it is
        # silently replaced by the default timeout here.
        return self._cache.add(smart_str(key), value, timeout or self.default_timeout)
The Memcache Backend

class CacheClass(BaseCache):
    def __init__(self, server, params):
        BaseCache.__init__(self, params)
        self._cache = memcache.Client(server.split(';'))

    def add(self, key, value, timeout=None):
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        # Fall back to the default only when no timeout was passed,
        # so an explicit 0 reaches memcached unchanged.
        if timeout is None:
            timeout = self.default_timeout
        return self._cache.add(smart_str(key), value, timeout)
Advanced Caching
• Typical setup has memcached running on web servers
• Pownce web servers were I/O and memory bound, not CPU bound
• Since we had some spare CPU cycles, we compressed large objects before caching them
• The Python memcache library can do this automatically, but the API is not exposed
Monkey Patching core.cache

from django.core.cache import cache
from django.utils.encoding import smart_str
import inspect as i

if 'min_compress_len' in i.getargspec(cache._cache.set)[0]:
    class CacheClass(cache.__class__):
        def set(self, key, value, timeout=None, min_compress_len=150000):
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            if timeout is None:
                timeout = self.default_timeout
            return self._cache.set(smart_str(key), value, timeout, min_compress_len)
    cache.__class__ = CacheClass
Advanced Caching
• Useful tool: automagic single object cache
• Use a manager to check the cache prior to any single object get by pk
• Invalidate assets on save and delete
• Eliminated several hundred QPS at Pownce
Advanced Caching
All this and more at:
http://github.com/mmalone/django-caching/
Caching
Now that you’ve made life easier for your DB server, the next thing to fall over is your app server.
Load Balancing
Load Balancing
• Out of the box, Django uses a shared nothing architecture
• App servers have no single point of contention
• Responsibility pushed down the stack (to DB)
• This makes scaling the app layer trivial: just add another server
Load Balancing
Spread work between multiple nodes in a cluster using a load balancer.
• Hardware or software
• Layer 7 or Layer 4
[Diagram: Load Balancer -> App Servers -> Database]
Load Balancing
• Hardware load balancers
• Expensive, like $35,000 each, plus maintenance contracts
• Need two for failover / high availability
• Software load balancers
• Cheap and easy, but more difficult to eliminate as a single point of failure
• Lots of options: Perlbal, Pound, HAProxy, Varnish, Nginx
Load Balancing
• Most of these are layer 7 proxies, and some software balancers do cool things
• Caching
• Re-proxying
• Authentication
• URL rewriting
Load Balancing
A common setup for large operations is to use redundant layer 4 hardware balancers in front of a pool of layer 7 software balancers.
[Diagram: Hardware Balancers -> Software Balancers -> App Servers]
Load Balancing
• At Pownce, we used a single Perlbal balancer
• Easily handled all of our traffic (hundreds of simultaneous connections)
• A SPOF, but we didn’t have $100,000 for black box solutions, and weren’t worried about service guarantees beyond three or four nines
• Plus there were some neat features that we took advantage of
Perlbal Reproxying
Perlbal reproxying is a really cool, and really poorly documented, feature.
Perlbal Reproxying
1. Perlbal receives request
2. Perlbal forwards it to an app server
   a. App server checks auth (etc.)
   b. App server returns HTTP 200 with the X-Reproxy-URL header set to an internal file server URL
3. Perlbal fetches the file from the file server and serves it to the client
Perlbal Reproxying
• Completely transparent to end user
• Doesn’t keep large app server instance around to serve file
• Users can’t access files directly (like they could with a 302)
Perlbal Reproxying
Plus, it’s really easy:

from django.http import HttpResponse

def download(request, filename):
    # Check auth, do your thing
    response = HttpResponse()
    # FILE_SERVER is the internal file server's base URL, defined elsewhere.
    response['X-REPROXY-URL'] = '%s/%s' % (FILE_SERVER, filename)
    return response
Load Balancing
Best way to reduce load on your app servers: don’t use them to do hard stuff.
Queuing
Queuing
• A queue is simply a bucket that holds messages until they are removed for processing by clients
• Many expensive operations can be queued and performed asynchronously
• User experience doesn’t have to suffer
• Tell the user that you’re running the job in the background (e.g., transcoding)
• Make it look like the job was done real-time (e.g., note distribution)
Queuing
• Lots of open source options for queuing
• Ghetto Queue (MySQL + Cron)
• this is the official name.
• Gearman
• TheSchwartz
• RabbitMQ
• Apache ActiveMQ
• ZeroMQ
Queuing
• Lots of fancy features: brokers, exchanges, routing keys, bindings...
• Don’t let that crap get you down, this is really simple stuff
• Biggest decision: persistence
• Does your queue need to be durable and persistent, able to survive a crash?
• This requires logging to disk which slows things down, so don’t do it unless you have to
Queuing
• Pownce used a simple ghetto queue built on MySQL / cron
• Problematic if you have multiple consumers pulling jobs from the queue
• No point in reinventing the wheel, there are dozens of battle-tested open source queues to choose from
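To make the multiple-consumer problem concrete: two workers that both SELECT the oldest job will both run it unless claiming is atomic. A small sketch of a conditional-UPDATE claim, using Python 3 and SQLite in place of MySQL so it runs anywhere. The jobs table, its columns, and claim_job are invented for this example and are not Pownce's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, claimed_by TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (payload) VALUES (?)", [("send note 1",), ("send note 2",)]
)
conn.commit()


def claim_job(conn, worker):
    """Claim one unclaimed job for `worker`; return (id, payload) or None."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # queue is empty
        # The WHERE clause makes the claim conditional: if another consumer
        # got there first, no row is updated and we try the next job.
        cur = conn.execute(
            "UPDATE jobs SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL",
            (worker, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row
```

This handles double delivery only. Retries, crashed workers, and cleanup are the parts a battle-tested queue already solves.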
Django Standalone Scripts
Consumers need to set up the Django environment:

from django.core.management import setup_environ
from mysite import settings

setup_environ(settings)
THE DATABASE!
The Database
• Til now we’ve been talking about
• Shared nothing
• Pushing problems down the stack
• But we have to store a persistent and consistent view of our application’s state somewhere
• Enter, the database...
CAP Theorem
• Three properties of a shared-data system
• Consistency: all clients see the same data
• Availability: all clients can see some version of the data
• Partition Tolerance: system properties hold even when the system is partitioned & messages are lost
• But you can only have two
CAP Theorem
• Big long proof... here’s my version.
• Empirically, seems to make sense.
• Eric Brewer
• Professor at University of California, Berkeley
• Co-founder and Chief Scientist of Inktomi
• Probably smarter than me
CAP Theorem
• The relational database systems we all use were built with consistency as their primary goal
• But at scale our system needs to have high availability and must be partitionable
• The RDBMS’s consistency requirements get in our way
• Most sharding / federation schemes are kludges that trade consistency for availability & partition tolerance
The Database
• There are lots of non-relational databases coming onto the scene
• CouchDB
• Cassandra
• Tokyo Cabinet
• But they’re not that mature, and they aren’t easy to use with Django
Denormalization
Denormalization
• Django encourages normalized data, which is usually good
• But at scale you need to denormalize
• Corollary: joins are evil
• Django makes it really easy to do joins using the ORM, so pay attention
Denormalization
• Start with a normalized database
• Selectively denormalize things as they become bottlenecks
• Denormalized counts, copied fields, etc. can be updated in signal handlers
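As an illustration of a denormalized count, here is a sketch in Python 3 with plain SQLite rather than the Django ORM, so it runs standalone. The users and notes tables, the note_count column, and both functions are assumptions made up for the example. The counter update stands in for what a post_save signal handler would do.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT,
                        note_count INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE notes (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    INSERT INTO users (id, name) VALUES (1, 'alice');
""")


def add_note(user_id, body):
    # Write the note and bump the denormalized counter in one transaction.
    with db:
        db.execute("INSERT INTO notes (user_id, body) VALUES (?, ?)", (user_id, body))
        db.execute("UPDATE users SET note_count = note_count + 1 WHERE id = ?", (user_id,))


def note_count(user_id):
    # Single-row lookup by primary key: no COUNT(*) and no join.
    return db.execute("SELECT note_count FROM users WHERE id = ?", (user_id,)).fetchone()[0]
```

The read gets cheaper and the write gets slightly more expensive, which is usually the right trade for a read-heavy app.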
Replication
Replication
• Typical web app is 80 to 90% reads
• Adding read capacity will get you a long way
• MySQL Master-Slave replication
[Diagram: master (read & write) replicating to slaves (read only)]
Replication
• Django doesn’t make it easy to use multiple database connections, but it is possible
• Some caveats
• Slave lag interacts with caching in weird ways
• You can only save to your primary DB (the one you configure in settings.py)
• Unless you get really clever...
Replication
1. Create a custom database wrapper by subclassing DatabaseWrapper

class SlaveDatabaseWrapper(DatabaseWrapper):
    def _cursor(self, settings):
        if not self._valid_connection():
            kwargs = {
                'conv': django_conversions,
                'charset': 'utf8',
                'use_unicode': True,
            }
            # Merge in the connection parameters of a randomly chosen slave.
            kwargs.update(pick_random_slave(settings.SLAVE_DATABASES))
            self.connection = Database.connect(**kwargs)
            ...
        cursor = CursorWrapper(self.connection.cursor())
        return cursor
Replication
2. Custom QuerySet that uses primary DB for writes

class MultiDBQuerySet(QuerySet):
    ...
    def update(self, **kwargs):
        # Point the query at the primary DB for the write, then switch back.
        slave_conn = self.query.connection
        self.query.connection = default_connection
        try:
            return super(MultiDBQuerySet, self).update(**kwargs)
        finally:
            self.query.connection = slave_conn
Replication
3. Custom Manager that uses your custom QuerySet

class SlaveDatabaseManager(db.models.Manager):
    def get_query_set(self):
        return MultiDBQuerySet(self.model, query=self.create_query())

    def create_query(self):
        return db.models.sql.Query(self.model, connection)
Replication
Example on GitHub:
http://github.com/mmalone/django-multidb/
Replication
• Goal:
• Read-what-you-write consistency for writer
• Eventual consistency for everyone else
• Slave lag screws things up
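The slides state the goal but not the mechanism. One common approach, shown here purely as an assumption and not as what Pownce did, is to pin a user's reads to the primary for a short window after that user writes. A Python 3 sketch; the server names, MAX_LAG_SECONDS, and both functions are hypothetical:

```python
import random
import time

PRIMARY = "primary"
REPLICAS = ["replica-1", "replica-2"]
MAX_LAG_SECONDS = 5.0  # assumed upper bound on replication lag

_last_write = {}  # user id -> time of that user's most recent write


def record_write(user_id, now=None):
    _last_write[user_id] = time.time() if now is None else now


def db_for_read(user_id, now=None):
    """Pin recent writers to the primary; send everyone else to a replica."""
    now = time.time() if now is None else now
    wrote_at = _last_write.get(user_id)
    if wrote_at is not None and now - wrote_at < MAX_LAG_SECONDS:
        return PRIMARY
    return random.choice(REPLICAS)
```

The writer sees their own change immediately, while other users may see it a few seconds late. The same window has to be respected when filling caches, or a lagging replica's stale row gets cached.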
Replication
What happens when you become write saturated?
Federation
Federation
• Start with Vertical Partitioning: split tables that aren’t joined across database servers
• Actually pretty easy
• Except not with Django
Federation
django.db.models.base
FAIL!
Federation
• At some point you’ll need to split a single table across databases (e.g., user table)
• Auto-increment PKs won’t work
• It’d be nice to have a UUIDField for PKs
• You can probably build this yourself
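A quick sketch of why UUIDs help here, in Python 3 using only the standard library. The shard names and the hash-modulus mapping are assumptions for illustration, not a recommendation of a particular sharding scheme:

```python
import uuid
import zlib

SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]  # hypothetical


def new_user_id():
    # Random UUIDs can be generated on any app server without coordination,
    # unlike an auto-increment column owned by a single database.
    return uuid.uuid4()


def shard_for(user_id):
    # Stable mapping from key to shard: hash the UUID bytes and take a modulus.
    return SHARDS[zlib.crc32(user_id.bytes) % len(SHARDS)]
```

A plain modulus remaps most keys when the shard count changes, so a real deployment would more likely use a lookup table or consistent hashing.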
Profiling, Monitoring & Measuring
Know your SQL

>>> Article.objects.filter(pk=3).query.as_sql()
('SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article" WHERE "app_article"."id" = %s ', (3,))
Know your SQL

>>> import sqlparse
>>> def pp_query(qs):
...     t = qs.query.as_sql()
...     sql = t[0] % t[1]
...     print sqlparse.format(sql, reindent=True, keyword_case='upper')
...
>>> pp_query(Article.objects.filter(pk=3))
SELECT "app_article"."id", "app_article"."name", "app_article"."author_id"
FROM "app_article"
WHERE "app_article"."id" = 3
Know your SQL

>>> from django.db import connection
>>> connection.queries
[{'time': '0.001', 'sql': u'SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article"'}]
Know your SQL
• It’d be nice if a lightweight stacktrace could be done in QuerySet.__init__
• Stick the result in connection.queries
• Now we know where the query originated
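The idea can be sketched outside Django with the standard traceback module (Python 3). Here query_log stands in for connection.queries, and record_query and article_detail_view are hypothetical names. Hooking this into QuerySet.__init__ is left out, since that would mean patching Django itself.

```python
import traceback

query_log = []  # stand-in for connection.queries


def record_query(sql, depth=3):
    """Log `sql` together with the innermost `depth` calling frames."""
    # extract_stack() ends with this function's own frame, so drop it.
    frames = traceback.extract_stack()[:-1][-depth:]
    origin = ["%s:%d in %s" % (f.filename, f.lineno, f.name) for f in frames]
    query_log.append({"sql": sql, "origin": origin})


def article_detail_view():
    # Hypothetical view that triggers a query.
    record_query('SELECT "app_article"."id" FROM "app_article"')
```

Capping the depth keeps the overhead small. A few frames are usually enough to tell which view or template tag issued the query.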
Measuring
Django Debug Toolbar
http://github.com/robhudson/django-debug-toolbar/
Monitoring
• Ganglia
• Munin
You can’t improve what you don’t measure.
Measuring & Monitoring
• Measure
• Server load, CPU usage, I/O
• Database QPS
• Memcache QPS, hit rate, evictions
• Queue lengths
• Anything else interesting
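For example, the memcache hit rate falls out of two counters that memcached's stats command reports, get_hits and get_misses. A small Python 3 sketch; the sample numbers are made up and hit_rate is a hypothetical helper:

```python
def hit_rate(stats):
    """Fraction of GETs served from cache, or None if there were no GETs."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    total = hits + misses
    return None if total == 0 else hits / total


# Hypothetical snapshot; memcached reports its counters as strings.
sample = {"get_hits": "9000", "get_misses": "1000", "evictions": "12"}
```

Graphing this over time in Ganglia or Munin makes a drop in hit rate or a rise in evictions visible before the database feels it.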
All done... Questions?
Contact me at [email protected] or @mjmalone