Scaling Django with Gevent
Mahendra M
@mahendra
https://github.com/mahendra
@mahendra
● Python developer for 6 years
● FOSS enthusiast/volunteer for 14 years
● Bangalore LUG and Infosys LUG
● FOSS.in and LinuxBangalore/200x
● Gevent user for 1 year
● Twisted user for 5 years (before migrating)
● Added twisted support libraries like mustaine
Concurrency models
● Multi-Process
● Threads
● Event driven
● Coroutines
Process/Thread
[Diagram: a request enters dispatch(), which hands it to one of worker_1() … worker_n(); a worker calls read(fp), db_rd(), db_wr() and sock_wr().]
Process/Thread
● There are blocking sections in the code
● Python GIL is an issue in thread-based concurrency
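The dispatch-to-worker model above can be sketched with the standard library's thread pool. This is only an illustration: handle() and the 10 ms sleep are hypothetical stand-ins for the blocking read/db/socket calls, not code from our products.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle(request_id):
    """Hypothetical worker: every step blocks the thread that runs it."""
    time.sleep(0.01)   # stands in for read(fp), db_rd(), db_wr(), sock_wr()
    return "response-%d" % request_id

def dispatch(request_ids, n_workers=4):
    """Hand each request to one of n_workers threads; replies come back in order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(handle, request_ids))

responses = dispatch(range(8))
```

Each worker thread is idle while it blocks, so concurrency is capped by the number of threads, and only one thread at a time executes Python bytecode because of the GIL.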
Event driven
[Diagram: block_on_events() waits for event_1, event_2 … event_n and invokes hdler_1(), hdler_2() … hdler_n(); events are posted with ev().]
Event driven web server
[Diagram: event_loop() receives the events request, opened, sql_read, sql_writ and responded. The handlers parse(), open(fp), read_sql(), wri_sql(), sock_wr() and close() each do one step; most of them call reg() to register for the next event.]
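A minimal sketch of such an event loop, using only the standard library's selectors module. The socket pair, the on_readable() handler and the received list are assumptions made for the illustration; a real server would register a listening socket and many client connections.

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # picks epoll, kqueue, ... where available
received = []

def on_readable(conn):
    """Handler: called only when the loop reports that conn has data."""
    data = conn.recv(1024)
    if data:
        received.append(data)
    else:                           # peer closed: stop watching this socket
        sel.unregister(conn)
        conn.close()

# A connected socket pair stands in for a client and a server connection.
client, server = socket.socketpair()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, on_readable)   # the reg() step

client.sendall(b"request")
client.close()

# The event loop: block until something is ready, then call its handler.
while sel.get_map():
    for key, _mask in sel.select(timeout=1):
        key.data(key.fileobj)
sel.close()
```

The handler never waits for I/O itself; it only runs once the selector says the socket is ready, which is why one thread can serve many connections.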
Two years back
● Using python twisted for half of our products
● Using django for the other half
● Quite a nightmare
Python twisted
● An event driven library (very scalable)
● Using epoll or kqueue
[Diagram: Client → Nginx (SSL & LB) → Server 1, Server 2 … Server N; each server runs Proc 1 (:8080), Proc 2 (:8080) … Proc N (:8080).]
Gevent
A coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
Coroutines
● Python coroutines are very similar to generators.
def abc( seq ):
    lst = list( seq )
    for i in lst:
        value = yield i
        if value is not None:
            lst.append( value )
r = abc( [1,2,3] )
next( r )      # prime the generator: run up to the first yield
r.send( 4 )    # appends 4 to lst and returns the next item, 2
Gevent features
● Fast event-loop based on libevent (epoll, kqueue etc.)
● Lightweight execution units based on greenlets (coroutines)
● Monkey patching support
● Simple API
● Fast WSGI server
Greenlets
● Primitive notion of micro-threads with no implicit scheduling
● Just co-routines or independent pseudo-threads
● Other systems like gevent build micro-threads on top of greenlets.
● Execution happens by switching execution among greenlet stacks
● Greenlet switching is not implicit (switch())
Greenlet execution
[Diagram: timelines for the main greenlet and a child greenlet. Labels: abc(), func_1(), some() with reg(), func_2(); control moves between the two greenlets at each pause().]
Greenlet code
from greenlet import greenlet
def test1():
    gr2.switch()    # hand control to gr2 explicitly
def test2():
    gr1.switch()    # hand control back; test1 resumes after its switch()
gr1 = greenlet(test1)
gr2 = greenlet(test2)
gr1.switch()        # start test1 from the main greenlet
How does gevent work
● Creates an implicit event loop inside a dedicated greenlet
● When a function in gevent wants to block, it switches to the greenlet of the event loop. This will schedule another child greenlet to run
● The event loop automatically picks up the fastest polling mechanism available in the system
● One event loop runs inside a single OS thread (process)
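The switching described above can be observed with a small sketch. The worker() function, the log list and the delays are made up for the illustration; gevent.sleep() stands in for any call that would block.

```python
import gevent

log = []

def worker(name, delay):
    log.append(name + " start")
    gevent.sleep(delay)    # "blocks": control switches to the event-loop greenlet
    log.append(name + " end")

jobs = [gevent.spawn(worker, "a", 0.01), gevent.spawn(worker, "b", 0.05)]
gevent.joinall(jobs)
```

Both workers start before either one finishes: as soon as "a" blocks, the event loop greenlet schedules "b", all within one OS thread.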
Gevent code
import gevent
from gevent import socket
urls = ['www.google.com', 'www.example.com', 'www.python.org']
jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
gevent.joinall(jobs, timeout=2)
[job.value for job in jobs]
['74.125.79.106', '208.77.188.166', '82.94.164.162']
Gevent APIs
● Greenlet management (spawn, timeout, schedule)
● Greenlet local data
● Networking (socket, ssl, dns, select)
● Synchronization
  ● Event – notify multiple listeners
  ● Queue – synchronized producer/consumer queues
  ● Locking – Semaphores
● Greenlet pools
● TCP/IP and WSGI servers
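A short sketch that touches two of the APIs listed above, a greenlet pool and a synchronized queue. The producer()/consumer() pair and the None sentinel are assumptions made for the example, not part of gevent.

```python
import gevent
from gevent.pool import Pool
from gevent.queue import Queue

tasks = Queue()
results = []

def producer():
    for n in range(5):
        tasks.put(n)
    tasks.put(None)        # sentinel: tells the consumer to stop

def consumer():
    while True:
        n = tasks.get()    # switches to other greenlets while the queue is empty
        if n is None:
            break
        results.append(n * n)

pool = Pool(2)             # at most two greenlets run from this pool at a time
pool.spawn(producer)
pool.spawn(consumer)
pool.join()                # wait until both greenlets have finished
```

No lock is needed around results, because a greenlet only gives up control at a blocking call such as tasks.get().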
Gevent advantages
● Code reads almost like synchronous code. No callbacks and deferreds
● Lightweight greenlets
● Good concurrency
● No Python GIL contention (all greenlets run in a single OS thread)
● No need for in-process locking, since a greenlet cannot be pre-empted
Gevent issues
● A greenlet will run till it blocks or switches
● Be wary of large/infinite loops
● Monkey patching is required for unsupported blocking libraries. Might not work well with some libraries
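The first two issues can be seen in a small sketch. The busy() and run() helpers are hypothetical; gevent.sleep(0) is used as an explicit yield inside the loop.

```python
import gevent

order = []

def busy(name, steps, cooperative):
    for i in range(steps):
        order.append((name, i))
        if cooperative:
            gevent.sleep(0)    # explicit yield so other greenlets get a turn

def run(cooperative):
    del order[:]
    gevent.joinall([gevent.spawn(busy, "a", 2, cooperative),
                    gevent.spawn(busy, "b", 2, cooperative)])
    return list(order)

greedy = run(False)   # "a" finishes its whole loop before "b" runs at all
fair = run(True)      # with the yield, "b" gets a turn between the steps of "a"
```

A loop that never blocks or yields holds the only OS thread, so every other greenlet (and every other request in that process) waits until it is done.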
Our django dream
● We love django
● I like twisted, but love django more
  ● Coding complexity
  ● Lack of developers for hire
  ● Deployment complexity
● Gevent saved the day
The Django Problem
● In an HTTP request cycle, we wanted the following operations
  ● Fetch some metadata for an item being sold
  ● Purchase the item for the user in the billing system
  ● Fetch ads to be shown along with the item
  ● Fetch recommendations based on this item
● In parallel … !!
● Twisted was the only option
Twisted code
from twisted.internet.defer import DeferredList
from twisted.web.server import NOT_DONE_YET
def handle_purchase( rqst ):
    defs = []
    defs.append( biller() )
    defs.append( ads() )
    defs.append( recos() )
    defs.append( meta() )
    dl = DeferredList( defs, … )         # "def" is a keyword, so use another name
    dl.addCallback( send_response )      # pass the callback itself, do not call it
    return NOT_DONE_YET
Twisted issues
● The issues were with everything else
  ● Header management
  ● Templates for response
  ● ORM support
  ● SOAP, REST, Hessian/Burlap support
    – We liked to use suds, requests, mustaine etc.
  ● Session management and auth
  ● Caching support
● The above are django's strengths
● Django's vibrant eco-system (celery, south, tastypie)
gunicorn
● A python WSGI HTTP server
● Supports running code under sync, eventlet, gevent etc. workers
● Uses monkey patching
● Excellent django support
  ● gunicorn_django app.settings
● Enabled gevent support for our app by default without any code changes
● Spawns and manages worker processes and distributes load amongst them
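A gunicorn configuration file is plain Python, so enabling the gevent worker can be sketched as below. The file name and all values are hypothetical and need tuning per deployment; the file is passed to gunicorn with its -c option.

```python
# gunicorn.conf.py: hypothetical values, not our production settings
bind = "127.0.0.1:8080"      # address that Nginx proxies to
workers = 4                  # worker processes managed by the gunicorn master
worker_class = "gevent"      # use the gevent worker instead of the default sync worker
worker_connections = 1000    # maximum simultaneous clients per gevent worker
```

With worker_class set to gevent, gunicorn applies gevent's monkey patching in each worker, which is why the application code itself did not have to change.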
Migrating our products
import gevent
def handle_purchase( request ):
    jobs = []
    jobs.append( gevent.spawn( biller, … ) )
    jobs.append( gevent.spawn( ads, … ) )
    jobs.append( gevent.spawn( meta, … ) )
    jobs.append( gevent.spawn( reco, … ) )
    gevent.joinall( jobs )    # wait until all four calls have finished
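A runnable sketch of the same pattern. The four service functions are hypothetical stubs that only sleep; the real ones called the billing, ads, metadata and recommendation backends. The function takes an item id instead of a Django request to keep the example self-contained.

```python
import gevent

# Hypothetical stand-ins for the four backend calls; each one "blocks" briefly.
def biller(item_id):
    gevent.sleep(0.01)
    return {"purchased": item_id}

def ads(item_id):
    gevent.sleep(0.01)
    return ["ad-for-%s" % item_id]

def meta(item_id):
    gevent.sleep(0.01)
    return {"title": "item %s" % item_id}

def reco(item_id):
    gevent.sleep(0.01)
    return ["also-like-%s" % item_id]

def handle_purchase(item_id, timeout=2):
    names = ("bill", "ads", "meta", "reco")
    jobs = [gevent.spawn(fn, item_id) for fn in (biller, ads, meta, reco)]
    gevent.joinall(jobs, timeout=timeout)    # wait for all four, bounded by timeout
    # job.value stays None if a job failed or did not finish within the timeout
    return dict(zip(names, (job.value for job in jobs)))

result = handle_purchase("42")
```

The four calls overlap, so the request takes roughly as long as the slowest backend rather than the sum of all four.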
Migrating our products
● Migrating our entire code base (2 products) took around 1 week to finish
● Was easier because we were already using the inlineCallbacks() decorator of twisted
● Only small parts of our code had to be migrated
Deployment
[Diagram: Client → Nginx (SSL & LB) → Gunicorn 1, Gunicorn 2 … Gunicorn N; each Gunicorn manages Proc 1, Proc 2 … Proc N.]
Life today
● Single framework for all 4 products
● Use django's awesome features and ecosystem
● Increased scalability. More so with celery.
● Use blocking python libraries without worrying too much
● No more usage of python-twisted
● Coding, testing and maintenance are much easier
● We are hiring!!
Links
● http://greenlet.readthedocs.org/en/latest/index.html
● http://www.gevent.org/
● http://in.pycon.org/2010/talks/48-twisted-programming