State of the real-time web with Django
Aymeric Augustin - @aymericaugustin
DjangoCon US - September 5th, 2013
Real-time
1. Systems responding within deadlines
2. Simulations running at wall clock time
3. Processing events without perceivable delay
http://en.wikipedia.org/wiki/Real-time_computing
Real-time web
“The real-time web is a set of technologies and practices that enable users to receive information as soon as it is published by its authors, rather than requiring that they or their software check a source periodically for updates.”
http://en.wikipedia.org/wiki/Real-time_web
Use cases
Games
Chat
Live data
Notifications
Collaboration
VoIP
Social feeds
All these use cases require servers to push events to clients.
Problem
Client → Server: GET / HTTP/1.1
Server → Client: HTTP/1.1 200 OK
Server: information published!
Server → Client: ???
The request-response model doesn’t allow servers to push events to clients.
Early solutions
• Java applets (1996)
• Not quite the web
• “Pushlets” (2000)
• Call back from Java applications into DHTML
• “Comet” (2006)
• Long-lived HTTP connections to reduce latency
• A revolution in browser-based user interfaces
http://www.pushlets.com/ – http://infrequently.org/2006/03/comet-low-latency-data-for-the-browser/
HTTP long polling
• Server keeps the request on hold and only sends a response when there is an event to deliver
• Client resends a request after each response
Client → Server: request
Server: event
Server → Client: response
Client → Server: request
Server: event
Server → Client: response
Client → Server: request
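The exchange above can be sketched with the Python standard library alone. This is a minimal illustration, not the demo from this talk: the `events` queue, the `LongPollingHandler` class and the `poll` and `publish` helpers are all hypothetical names. The server holds each GET until an event is available, and the client sends a new request after each response.

```python
import queue
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

events = queue.Queue()  # hypothetical in-process event source


class LongPollingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Keep the request on hold until there is an event to deliver.
        body = events.get().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def poll(url, count):
    """Client side: resend a request after each response."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    received = []
    for _ in range(count):
        with opener.open(url) as response:
            received.append(response.read().decode("utf-8"))
    return received


def publish(messages):
    """Event source: publish one message every 100 ms."""
    for message in messages:
        time.sleep(0.1)
        events.put(message)


server = ThreadingHTTPServer(("127.0.0.1", 0), LongPollingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:{}/".format(server.server_address[1])

threading.Thread(target=publish, args=(["first", "second"],), daemon=True).start()
received = poll(url, 2)

server.shutdown()
server.server_close()
```

Each request blocks a server thread while it waits, which is the cost the deployment slides measure later.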
HTTP streaming
• Server sends a series of events in a single HTTP response
• Chunked
• EOF-terminated
• Client processes each incoming event
Client → Server: request
Server → Client: response ...
Server: event
Server → Client: response ...
Server: event
Server → Client: response ...
Chunked response
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

25
This is the data in the first chunk

1C
and this is the second one

0
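To make the framing concrete, here is a minimal sketch of a decoder for a chunked body like the one above. The function name `decode_chunked` is hypothetical, and the sketch ignores trailers. Each chunk is a hexadecimal size line, that many bytes of data, then a CRLF; a size of 0 ends the body.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body, ignoring chunk extensions and trailers."""
    data = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # The size is hexadecimal; anything after ";" is a chunk extension.
        size = int(body[pos:eol].split(b";", 1)[0], 16)
        pos = eol + 2
        if size == 0:
            break  # last chunk
        data += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(data)


# The body from the slide: 0x25 = 37 bytes and 0x1C = 28 bytes,
# each count including the line break that is part of the chunk data.
encoded = (
    b"25\r\nThis is the data in the first chunk\r\n\r\n"
    b"1C\r\nand this is the second one\r\n\r\n"
    b"0\r\n\r\n"
)
decoded = decode_chunked(encoded)
```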
RFC 6202
“The authors acknowledge that both the HTTP long polling and HTTP streaming mechanisms stretch the original semantic of HTTP and that the HTTP protocol was not designed for bidirectional communication.”
http://tools.ietf.org/html/rfc6202
Server-sent events
• HTTP stream of events
• Format: text/event-stream
• JavaScript API: EventSource interface and events
http://www.w3.org/TR/eventsource/
: The ‘data’ field is mandatory.
data: This is the first message.

: ‘event’ and ‘id’ are optional.
event: message
data: This is another message
data: over several lines.

event: flash
data: This is a flash event!
id: 0042
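In a browser the EventSource interface parses this format. As a rough illustration of the rules, here is a simplified Python parser. `parse_event_stream` is a hypothetical name, and the sketch leaves out parts of the specification such as the `retry` field. A blank line dispatches an event, lines starting with a colon are comments, and consecutive `data` lines are joined with newlines.

```python
def parse_event_stream(text):
    """Parse a text/event-stream payload into a list of event dicts."""
    events = []
    event_type, data_lines, last_id = "message", [], None
    for line in text.splitlines() + [""]:
        if line == "":
            # A blank line dispatches the event, if it carries any data.
            if data_lines:
                events.append(
                    {"event": event_type, "data": "\n".join(data_lines), "id": last_id}
                )
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        elif field == "id":
            last_id = value
    return events


stream = """\
: The 'data' field is mandatory.
data: This is the first message.

: 'event' and 'id' are optional.
event: message
data: This is another message
data: over several lines.

event: flash
data: This is a flash event!
id: 0042
"""
events = parse_event_stream(stream)
```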
WebSocket
• Provides bidirectional communication in the context of the existing HTTP infrastructure
• RFC 6455 (supersedes hybi-xx and hixie-xx)
• Opening handshake to upgrade from HTTP
• Framing protocol and closing handshake
• Provisions for extensions and subprotocols
• JavaScript API: WebSocket interface and events
http://tools.ietf.org/html/rfc6455 – http://www.w3.org/TR/websockets/
WebSocket
> GET /endpoint HTTP/1.1
> Host: ws.example.com
> Connection: Upgrade
> Upgrade: websocket
> Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
> Sec-WebSocket-Version: 13

< HTTP/1.1 101 Switching Protocols
< Connection: Upgrade
< Upgrade: websocket
< Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
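The Sec-WebSocket-Accept value is derived from Sec-WebSocket-Key as RFC 6455 describes: append a fixed GUID to the key, hash the result with SHA-1, and Base64-encode the digest. A short sketch, where the function name `accept_key` is hypothetical:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept header value for a given key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


accept = accept_key("dGhlIHNhbXBsZSBub25jZQ==")
```

For the key in the request above, this yields the value shown in the response.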
WebSocket
> 81 83 ec a0 1e 0c 81 f9 75
# TEXT [ M A S K ] m Y k
< 81 0a 48 65 6c 6c 6f 20 6d 59 6b 21
# TEXT H e l l o   m Y k !
< 88 02 03 e8
# CLOSE 1000
> 88 82 73 7a 29 aa 70 92
# CLOSE [ M A S K ] 1000
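The frames above can be decoded by hand. A minimal sketch follows, where `decode_frame` is a hypothetical name and only payloads up to 125 bytes are handled. The first byte carries the FIN bit and the opcode, the second carries the mask bit and the length, and frames sent by the client are XOR-masked with a 4-byte key.

```python
def decode_frame(frame: bytes):
    """Decode a single short WebSocket frame into (fin, opcode, payload)."""
    fin = bool(frame[0] & 0x80)
    opcode = frame[0] & 0x0F  # 0x1 = text, 0x8 = close
    masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    if length > 125:
        raise ValueError("extended payload lengths are not handled in this sketch")
    pos = 2
    if masked:
        mask = frame[pos:pos + 4]
        pos += 4
    payload = frame[pos:pos + length]
    if masked:
        # Unmask by XOR-ing each byte with the masking key, cycling every 4 bytes.
        payload = bytes(byte ^ mask[i % 4] for i, byte in enumerate(payload))
    return fin, opcode, payload


client_text = decode_frame(bytes.fromhex("81 83 ec a0 1e 0c 81 f9 75"))
server_text = decode_frame(bytes.fromhex("81 0a 48 65 6c 6c 6f 20 6d 59 6b 21"))
client_close = decode_frame(bytes.fromhex("88 82 73 7a 29 aa 70 92"))
close_code = int.from_bytes(client_close[2], "big")
```

The payload of a close frame starts with a 2-byte status code in network byte order.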
An example of long polling
• Events (any source) → Redis PUBLISH → Pub/Sub (Redis)
• Application (Django) → Redis SUBSCRIBE → Pub/Sub (Redis)
• Web page (JS) → GET, long polling → Application (Django)
Helpers
# demo/models.py (1/2)
import redis

CHANNEL = 'demo'

def send_message(message):
    client = redis.StrictRedis()
    message = message.encode('utf-8')
    return client.publish(CHANNEL, message)
Helpers
# demo/models.py (2/2)
def recv_message():
    client = redis.StrictRedis()
    pubsub = client.pubsub()
    pubsub.subscribe(CHANNEL)
    for event in pubsub.listen():
        if event['type'] == 'message':
            message = event['data'].decode('utf-8')
            break
    pubsub.unsubscribe()
    return message
Publisher
# demo/send_msg.py
#!/usr/bin/env python
import sys

from demo.models import send_message

message = " ".join(sys.argv[1:])
num = send_message(message)
if num == 1:
    log = "Sent to one subscriber"
else:
    log = "Sent to {} subscribers".format(num)
print("{}: {}".format(log, message))
Subscriber
# demo/views.py
from django.http import HttpResponse

from demo.models import recv_message

def long_polling_endpoint(request):
    message = recv_message()
    return HttpResponse(message.encode('utf-8'),
                        content_type='text/plain; charset=utf-8')
HTML
# demo/templates/demo/long_polling.html
<!DOCTYPE html>
{% load static %}
<html>
  <head>
    <title>Long polling demo</title>
  </head>
  <body>
    <ul><!-- messages will be inserted here --></ul>
    <script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
    <script src="{% static 'demo/long_polling.js' %}"></script>
  </body>
</html>
JavaScript
# demo/static/demo/long_polling.js
$(function () {
    function show_next_message() {
        $('<li><i>Loading…</i></li>')
            .appendTo($('ul'))
            .load('endpoint/', show_next_message);
    }
    show_next_message();
});
Deployment
% gunicorn --timeout 10 --workers 2 dcus13rt.wsgi
[68271] [INFO] Starting gunicorn 17.5
[68271] [INFO] Listening at: http://127.0.0.1:8000
[68271] [INFO] Using worker: sync
[68274] [INFO] Booting worker with pid: 68274
[68275] [INFO] Booting worker with pid: 68275

% ab -c 20 -n 20 \
    http://127.0.0.1:8000/long_polling/endpoint/
Benchmarking 127.0.0.1 (be patient)...

% PYTHONPATH=. demo/send_msg.py spam
Sent to 2 subscribers: spam
Deployment
% ab -c 20 -n 20 \
    http://127.0.0.1:8000/long_polling/endpoint/
...
Server Software: gunicorn/17.5
Server Hostname: 127.0.0.1
Server Port: 8000

Document Path: /long_polling/endpoint/
Document Length: 4 bytes

Concurrency Level: 20
Time taken for tests: 93.476 seconds
Complete requests: 20
Failed requests: 18
   (Connect: 0, Receive: 0, Length: 18, Exceptions: 0)
An example of WebSocket
• Events (any source) → Redis PUBLISH → Pub/Sub (Redis)
• Real time (Tulip) → Redis SUBSCRIBE → Pub/Sub (Redis)
• Web page (JS) ↔ WebSocket ↔ Real time (Tulip)
Subscriber
# demo/handlers.py
import tulip

from demo.models import recv_message

@tulip.coroutine
def simple_endpoint(websocket, uri):
    # Doesn't work! recv_message isn't a coroutine.
    message = yield from recv_message()
    websocket.send(message)
Subscriber
# demo/handlers.py (1/2)
import tulip
import websockets

from demo.models import recv_message

subscribers = set()

@tulip.coroutine
def endpoint(websocket, uri):
    global subscribers
    subscribers.add(websocket)
    yield from websocket.recv()
    subscribers.remove(websocket)
Subscriber
# demo/handlers.py (2/2)
def relay_messages():
    while True:
        message = recv_message()
        for websocket in subscribers:
            if websocket.open:
                websocket.send(message)

if __name__ == '__main__':
    websockets.serve(endpoint, 'localhost', 7999)
    loop = tulip.get_event_loop()
    loop.run_in_executor(None, relay_messages)
    loop.run_forever()
HTML
# demo/templates/demo/websocket.html
<!DOCTYPE html>
{% load static %}
<html>
  <head>
    <title>WebSocket demo</title>
  </head>
  <body>
    <ul><!-- messages will be inserted here --></ul>
    <script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
    <script src="{% static 'demo/websocket.js' %}"></script>
  </body>
</html>
JavaScript
# demo/static/demo/websocket.js
$(function () {
    var ws = new WebSocket("ws://localhost:7999/");
    ws.onmessage = function (event) {
        $('<li>' + event.data + '</li>')
            .appendTo($('ul'));
    };
});
Deployment with gevent
% gunicorn --timeout 10 --workers 2 \
    --worker-class gevent dcus13rt.wsgi
[68261] [INFO] Starting gunicorn 17.5
[68261] [INFO] Listening at: http://127.0.0.1:8000
[68261] [INFO] Using worker: gevent
[68264] [INFO] Booting worker with pid: 68264
[68265] [INFO] Booting worker with pid: 68265

% ab -c 20 -n 20 \
    http://127.0.0.1:8000/long_polling/endpoint/
Benchmarking 127.0.0.1 (be patient)...

% PYTHONPATH=. demo/send_msg.py spam
Sent to 20 subscribers: spam
Deployment with gevent
% ab -c 20 -n 20 \
    http://127.0.0.1:8000/long_polling/endpoint/
...
Server Software: gunicorn/17.5
Server Hostname: 127.0.0.1
Server Port: 8000

Document Path: /long_polling/endpoint/
Document Length: 4 bytes

Concurrency Level: 20
Time taken for tests: 1.740 seconds
Complete requests: 20
Failed requests: 0
Execution model
• Based on an event loop
• Handle many socket connections in a single thread
• epoll (Linux), kqueue (BSD), IOCP (Windows)
• More efficient than one thread per connection
• Suitable for network programming
http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html
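As a rough sketch of this execution model, the standard `selectors` module lets one thread wait on several sockets at once. It picks an efficient mechanism such as epoll or kqueue where the platform offers one. The function `run_uppercase_loop` and the socket pairs are illustrative assumptions, not code from the talk.

```python
import selectors
import socket


def run_uppercase_loop(server_sides):
    """Handle several connections in one thread: read one message from each."""
    selector = selectors.DefaultSelector()
    replies = {}
    for name, sock in server_sides.items():
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ, data=name)
    # The event loop: block until some socket is readable, then handle it.
    while selector.get_map():
        for key, _ in selector.select():
            replies[key.data] = key.fileobj.recv(1024).upper()
            selector.unregister(key.fileobj)
    selector.close()
    return replies


clients, servers = {}, {}
for name in ("a", "b", "c"):
    client, server = socket.socketpair()
    client.sendall(name.encode("ascii") + b" says hi")
    clients[name], servers[name] = client, server

replies = run_uppercase_loop(servers)

for sock in list(clients.values()) + list(servers.values()):
    sock.close()
```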
Programming model
• Based on explicit cooperative multi-threading
• Callbacks
• Coroutines
• In Python: yield (from)
• Suitable for concurrent applications
http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html
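A toy illustration of explicit cooperative multitasking with plain generators: each task runs until it yields, and a round-robin scheduler decides who runs next. The names `pause`, `worker` and `run` are hypothetical, and this is far simpler than a real event loop.

```python
from collections import deque


def pause():
    """Hand control back to the scheduler once."""
    yield


def worker(name, steps, log):
    for step in range(steps):
        log.append((name, step))
        yield from pause()  # explicit suspension point


def run(tasks):
    """Round-robin scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)


log = []
run([worker("a", 2, log), worker("b", 2, log)])
```

The two workers interleave only at the `yield from` points, which is what "explicit" means here.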
The sad truth
“Converting [a synchronous operation] to asynchronous requires modifying every point that calls it to yield control appropriately.”
http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html
What about gevent?
It modifies every function that performs I/O by monkey-patching the standard library so you don’t have to change your own code.
You get the benefits of the execution model for free!
But you lose the benefits of the programming model.
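To show what monkey-patching means, here is a standard-library-only sketch. It does not use gevent and is not how gevent is implemented: it only replaces `time.sleep` at runtime so that unmodified calling code picks up the new behavior. The names `fetch` and `patched_sleep` are hypothetical.

```python
import time


def fetch(log):
    """Unmodified application code calling a blocking stdlib function."""
    time.sleep(0.01)
    log.append("fetched")


calls = []


def patched_sleep(seconds):
    # A cooperative library would switch to another task here instead of blocking.
    calls.append(seconds)


original_sleep = time.sleep
time.sleep = patched_sleep  # the monkey-patch
try:
    log = []
    fetch(log)  # fetch() is unchanged, yet it now calls patched_sleep
finally:
    time.sleep = original_sleep  # restore the real function
```

The calling code cannot tell where control is handed over, which is the trade-off stated above.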
Major Python frameworks
• gevent is “a coroutine-based networking library”
• Twisted is “an event-driven networking engine”
• Tornado is a “web framework and asynchronous networking library”
PEP 3156
• Pluggable event loop API for interoperability
• Callbacks, transports, protocols, futures
• High-level scheduler based on coroutines
• Suspend execution with yield from ...
• Reference implementation code-named Tulip
• Effort led by Guido van Rossum himself
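Tulip was released in the standard library as `asyncio` in Python 3.4, and current Python writes `await` where these slides write `yield from`. A minimal sketch in that later syntax, with the hypothetical coroutine names `fetch` and `main`:

```python
import asyncio


async def fetch(name, delay, log):
    log.append("start " + name)
    await asyncio.sleep(delay)  # suspends this coroutine, not the whole thread
    log.append("done " + name)
    return name.upper()


async def main(log):
    # Run both coroutines concurrently on one event loop.
    return await asyncio.gather(fetch("a", 0.02, log), fetch("b", 0.01, log))


log = []
results = asyncio.run(main(log))
```

Both coroutines start before either finishes, and `gather` returns results in argument order.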
django-c10k-demo
• Define WebSocket handlers – @websocket
• Wire them in URLconfs
• Use runserver in dev and gunicorn in prod
• Tulip under the hood
• Real-time made easy!
https://github.com/aaugustin/django-c10k-demo
django-c10k-demo
https://github.com/aaugustin/django-c10k-demo
from c10ktools.http import websocket

@websocket
def worker(ws):
    # ... initial synchronization (not shown) ...
    while True:
        msg = yield from ws.recv()
        if msg is None:
            break
        step, row, col, state = msg.split()
        for subscriber in subscribers[row][col]:
            if subscriber.open:
                subscriber.send(msg)
    for row, col in subscriptions:
        subscribers[row][col].remove(ws)
Django isn’t asynchronous
# Synchronous cache access
def sync(request, key):
    value = cache.get(key)
    return HttpResponse(value)

# Asynchronous cache access
@websocket
def async(ws):
    value = yield from cache.get(key)
    ws.send(value)
Django isn’t asynchronous
# Synchronous ORM access
def sync(request, user_id):
    user = User.objects.get(id=int(user_id))
    bio_html = user.bio.html
    return HttpResponse(bio_html)

# Asynchronous ORM access
@websocket
def async(ws):
    user_id = yield from ws.recv()
    user = yield from User.objects.get(id=int(user_id))
    bio_html = # ??? (no “yield from” attribute access!)
    ws.send(bio_html)
HTTP != real-time
• Execution – threads vs. events
• Programming – preemptive vs. cooperative
• Protocol – request / response vs. message streams
• Scalability – stateless vs. stateful
• Workload – CPU vs. I/O bound
Key learnings
• PEP 3156 will improve support and resources for asynchronous I/O in Python ≥ 3.4.
• Django isn’t designed for explicit cooperative multi-threading and that’s unlikely to change.
• Robust client and server stacks are emerging as well as deployment best-practices.
• Simplified development setups are possible at the cost of losing parity with production.