Page 1

Concurrent HTTP Proxy with Caching

Ashwin Bharambe

Monday, Dec 4, 2006

Page 2

Outline

Parsing: some quick hints

Threads: review of the lecture

Synchronization: using semaphores; a preview of Wednesday's lecture

Caching in the proxy

Questions?

Page 3

Parsing an HTTP request

Things to keep in mind: read all lines of the request, not just the first (use rio_readlineb)

Look for Host:, Connection: headers

How do you parse? strtok? It has complex semantics and modifies the string passed as its argument.

sscanf: sscanf(line, "%s %s %s", req, url, version)

Hand-coded strchr(‘ ’) and strdup
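For concreteness, here is a minimal sketch of the sscanf approach; the helper name parse_request_line is illustrative, not part of the handout. Remember that sscanf's first argument is the string being parsed; the buffers to fill come after the format string.

#include <stdio.h>

/* Hypothetical helper: split "GET http://host/path HTTP/1.0" into its three
 * whitespace-separated parts. Returns 0 on success, -1 on a malformed line.
 * Each output buffer is assumed to be at least as large as line. */
int parse_request_line(const char *line, char *method, char *uri, char *version)
{
    if (sscanf(line, "%s %s %s", method, uri, version) != 3)
        return -1;
    return 0;
}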

Page 4

Allocating Buffer Space

Size of request is not known beforehand: the client can send an arbitrary number of headers.

Size of response is not known beforehand: the server may not set a Content-Length header, and some servers set it incorrectly!

How do you allocate space beforehand, then? You cannot! Use realloc(), periodically adding more space, as in the snippets below.

n = rio_readnb(...);
if (used + n > alloced) {
    req = realloc(...);
    alloced += chunk_size;
}
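A slightly fuller sketch of the same pattern, assuming the csapp.h wrappers (Malloc, Realloc, Rio_readnb); the helper name read_whole_response and the CHUNK_SIZE constant are illustrative, not from the handout.

#include "csapp.h"

#define CHUNK_SIZE 8192   /* illustrative growth increment */

/* Read an entire server response of unknown length into a heap buffer.
 * Hypothetical helper; *lenp receives the number of bytes stored. */
char *read_whole_response(rio_t *rp, size_t *lenp)
{
    size_t alloced = CHUNK_SIZE, used = 0;
    char *resp = Malloc(alloced);
    char buf[CHUNK_SIZE];
    ssize_t n;

    while ((n = Rio_readnb(rp, buf, sizeof(buf))) > 0) {
        if (used + n > alloced) {            /* grow before appending */
            alloced += CHUNK_SIZE;
            resp = Realloc(resp, alloced);
        }
        memcpy(resp + used, buf, n);
        used += n;
    }
    *lenp = used;
    return resp;
}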

Page 5

Concurrent servers

[Diagram: several web browsers connect to the proxy, which in turn connects to several web servers]

Iterative servers can only serve one client at a time.

Concurrent servers handle multiple requests in parallel.

Page 6

Implementing concurrency

1. Processes

Fork a child process for every incoming client connection. It is difficult to share data among child processes.

2. Threads: create a thread to handle every incoming client connection. Our focus today.

3. I/O multiplexing with Unix select(): use select() to notice pending socket activity, then manually interleave the processing of multiple open connections. More complex!

~ implement your own app-specific thread package!
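For reference, a minimal sketch (not part of the lab starter code) of the select() style, assuming the csapp.h wrappers (Open_listenfd, Select, Accept, Read, Rio_writen) and C99; it simply echoes bytes back on whichever connected descriptors are ready.

#include "csapp.h"

int main(int argc, char **argv)
{
    int listenfd, maxfd;
    fd_set read_set, ready_set;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }
    listenfd = Open_listenfd(atoi(argv[1]));
    maxfd = listenfd;

    FD_ZERO(&read_set);
    FD_SET(listenfd, &read_set);          /* always watch the listening socket */

    while (1) {
        ready_set = read_set;
        Select(maxfd + 1, &ready_set, NULL, NULL, NULL);

        if (FD_ISSET(listenfd, &ready_set)) {      /* new connection pending */
            int connfd = Accept(listenfd, NULL, NULL);
            FD_SET(connfd, &read_set);
            if (connfd > maxfd)
                maxfd = connfd;
        }

        for (int fd = 0; fd <= maxfd; fd++) {      /* data pending on a client */
            if (fd == listenfd || !FD_ISSET(fd, &ready_set))
                continue;
            char buf[MAXLINE];
            ssize_t n = Read(fd, buf, MAXLINE);
            if (n <= 0) {                          /* client closed connection */
                Close(fd);
                FD_CLR(fd, &read_set);
            } else {
                Rio_writen(fd, buf, n);            /* echo it back */
            }
        }
    }
}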

Page 7

Traditional view of a process

Process = process context + code, data, and stack

[Figure: the process's code, data, and stack (read-only code/data, read/write data, run-time heap, shared libraries, stack) alongside its process context: program context (data registers, condition codes, stack pointer SP, program counter PC) and kernel context (VM structures, descriptor table, brk pointer)]

Page 8

Alternate view of a process

Process = thread + code, data, and kernel context

[Figure: the same address space, now divided into the main thread (stack plus thread context: data registers, condition codes, SP, PC) and the shared code, data, and kernel context (VM structures, descriptor table, brk pointer)]

Page 9

A process with multiple threads

Multiple threads can be associated with a process:
Each thread has its own logical control flow (instruction flow).
Each thread shares the same code, data, and kernel context.
Each thread has its own thread ID (TID).

[Figure: two threads, each with its own stack and thread context (data registers, condition codes, SP1/PC1 and SP2/PC2), sharing the same code, data, and kernel context]

Page 10

Threads vs. processes

How threads and processes are similar:
Each has its own logical control flow.
Each can run concurrently.
Each is context switched.

How threads and processes are different:
Threads share code and data, processes (typically) do not.
Threads are less expensive than processes.

Process control (creating and reaping) is twice as expensive as thread control.

Linux/Pentium III numbers: ~20K cycles to create and reap a process, ~10K cycles to create and reap a thread.

Page 11

Posix threads (pthreads)

Creating and reaping threads: pthread_create, pthread_join, pthread_detach

Determining your thread ID: pthread_self

Terminating threads: pthread_cancel, pthread_exit, exit [terminates all threads], return [terminates current thread]

Page 12

Hello World, with pthreads

/*
 * hello.c - Pthreads "hello, world" program
 */
#include "csapp.h"

void *thread(void *vargp);

int main()
{
    pthread_t tid;

    Pthread_create(&tid, NULL, thread, NULL);
    Pthread_join(tid, NULL);
    exit(0);
}

/* thread routine */
void *thread(void *vargp)
{
    printf("Hello, world!\n");
    return NULL;
}

Notes on the code: the second argument to Pthread_create is the thread attributes (usually NULL), and the last is the thread argument (void *p); Pthread_join's second argument receives the thread's return value (void **p). The capitalized Pthread_xxx wrappers check errors.

Page 13

Hello World, with pthreads

[Execution timeline: the main thread calls Pthread_create(), which returns; the peer thread starts, runs printf(), and returns NULL (terminating). Meanwhile the main thread calls Pthread_join() and waits for the peer thread to terminate; Pthread_join() then returns, and exit() terminates the main thread and any peer threads.]

Page 14

Thread-based echo server

int main(int argc, char **argv)
{
    int listenfd, *connfdp, port, clientlen;
    struct sockaddr_in clientaddr;
    pthread_t tid;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }
    port = atoi(argv[1]);

    listenfd = open_listenfd(port);
    while (1) {
        clientlen = sizeof(clientaddr);
        connfdp = Malloc(sizeof(int));
        *connfdp = Accept(listenfd, (SA *)&clientaddr, &clientlen);
        Pthread_create(&tid, NULL, thread, connfdp);
    }
}

Page 15

Thread-based echo server

/* thread routine */
void *thread(void *vargp)
{
    int connfd = *((int *)vargp);

    Pthread_detach(pthread_self());
    Free(vargp);

    echo_r(connfd);   /* thread-safe version of echo() */
    Close(connfd);
    return NULL;
}


pthread_detach() is recommended in the proxy lab

Page 16

Issue 1: detached threads

A thread is either joinable or detached. A joinable thread can be reaped or killed by other threads.

It must be reaped (with pthread_join) to free its resources.

A detached thread cannot be reaped or killed by other threads.

Its resources are automatically reaped on termination.

The default state is joinable; call pthread_detach(pthread_self()) to make a thread detached.

Why should we use detached threads? Because pthread_join() blocks the calling thread.

Page 17

Issue 2: avoid unintended sharing

What happens if we pass the address of connfd to the thread routine, as in the second of the two snippets below?

connfdp = Malloc(sizeof(int));
*connfdp = Accept(listenfd, (SA *)&clientaddr, &clientlen);
Pthread_create(&tid, NULL, thread, connfdp);

connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
Pthread_create(&tid, NULL, thread, (void *)&connfd);
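The danger in the second version is a race: the main thread may accept a new connection and overwrite connfd before the peer thread has dereferenced vargp. The Malloc version above avoids this; another common fix (not shown on the slide) is to pass the descriptor by value through the void * argument:

#include "csapp.h"
#include <stdint.h>

void echo_r(int connfd);    /* thread-safe echo from the echo server example */

/* Peer thread: recover the descriptor value from the void * argument. */
void *thread(void *vargp)
{
    int connfd = (int)(intptr_t)vargp;

    Pthread_detach(pthread_self());
    echo_r(connfd);
    Close(connfd);
    return NULL;
}

/* In main's accept loop, the value itself is passed instead of a pointer:
 *   connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
 *   Pthread_create(&tid, NULL, thread, (void *)(intptr_t)connfd);
 */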

Page 18

Issue 3: thread-safe

It is easy to share data structures between threads, but we need to do this correctly!

Recall the shell lab: the job data structures were shared between the main process and the signal handler.

Synchronize multiple control flows.

Page 19

Synchronizing with semaphores

Semaphores are counters for resources shared between threads

Non-negative integer synchronization variable

Two operations, P(s) and V(s), both atomic:
P(s): [ while (s == 0) wait(); s--; ]
V(s): [ s++; ]

If the initial value of s == 1, the semaphore serves as a mutual-exclusion lock.

This is just a very brief description; details in the next lecture.

Page 20

Sharing with POSIX semaphores

#include "csapp.h"#define NITERS 1000

unsigned int cnt; /* counter */sem_t sem; /* semaphore */

int main() { pthread_t tid1, tid2;

Sem_init(&sem, 0, 1);

/* create 2 threads and wait */ ...... exit(0);}

/* thread routine */void *count(void *arg){ int i;

for (i=0;i<NITERS;i++){ P(&sem); cnt++; V(&sem); } return NULL;}
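The slide elides the thread-creation code; it would look roughly like the following sketch (create both counting threads, then wait for each):

Pthread_create(&tid1, NULL, count, NULL);
Pthread_create(&tid2, NULL, count, NULL);
Pthread_join(tid1, NULL);
Pthread_join(tid2, NULL);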

Page 21

Thread-safety of library functions

All functions in the Standard C Library are thread-safe. Examples: malloc, free, printf, scanf.

Most Unix system calls are thread-safe, with a few exceptions:

Thread-unsafe function    Reentrant version
asctime                   asctime_r
ctime                     ctime_r
gethostbyaddr             gethostbyaddr_r
gethostbyname             gethostbyname_r
inet_ntoa                 (none)
localtime                 localtime_r
rand                      rand_r
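As a quick illustration of the reentrant pattern (not from the slides): ctime_r writes into a caller-supplied buffer instead of a shared static one.

#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t now = time(NULL);
    char buf[26];               /* ctime_r requires at least 26 bytes */

    /* ctime() would return a pointer to a static buffer (thread-unsafe);
     * ctime_r() fills a buffer the caller owns instead. */
    ctime_r(&now, buf);
    printf("%s", buf);
    return 0;
}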

Page 22

Thread-unsafe functions: fixes

The problem: these functions return a pointer to a static variable. For example:

struct hostent *gethostbyname(char *name)
{
    static struct hostent h;
    <contact DNS and fill in h>
    return &h;
}

Fix 1: rewrite the code so the caller passes a pointer to a struct:

hostp = Malloc(...);
gethostbyname_r(name, hostp, ...);

Issue: requires changes in both the caller and the callee.

Page 23

Thread-unsafe functions: fixes

Fix 2: lock-and-copy

Issue: requires only simple changes in the caller; however, the caller must free the memory.

struct hostent *gethostbyname_ts(char *name)
{
    /* assumes a global sem_t mutex, initialized to 1 */
    struct hostent *q = Malloc(sizeof(struct hostent));
    struct hostent *p;

    P(&mutex);               /* lock */
    p = gethostbyname(name);
    *q = *p;                 /* copy */
    V(&mutex);
    return q;
}

Page 24

Caching

What should you cache? The complete HTTP response, including headers.

You don't need to parse the response, but real proxies do. Why?

If size(response) > MAX_OBJECT_SIZE, don't cache it.
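One possible shape for a cache entry, as a sketch; the type and field names are illustrative, not prescribed by the lab:

#include "csapp.h"

/* Illustrative cache entry: the key is the request URI, the value is the
 * complete HTTP response (headers + body) exactly as received. */
typedef struct cache_entry {
    char *uri;                  /* key: the requested URI */
    char *response;             /* value: raw response bytes, headers included */
    size_t size;                /* number of bytes in response */
    unsigned long timestamp;    /* last-access time, used for LRU */
    struct cache_entry *next;   /* simple singly-linked list of entries */
} cache_entry_t;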

Page 25

Cache Replacement

Least Recently Used (LRU): evict the cache entry whose "access" timestamp is farthest in the past.

When to evict? When you have no space, i.e. when size(cache) + size(new_entry) > MAX_CACHE_SIZE.
What is size(cache)? The sum of the sizes of all cache entries.
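A sketch of the eviction loop, assuming the illustrative cache_entry_t list above and the MAX_CACHE_SIZE limit from the handout; cache_size is the running sum of entry sizes.

/* Evict least-recently-used entries until the new object fits. */
void evict_until_fits(cache_entry_t **head, size_t *cache_size, size_t new_size)
{
    while (*cache_size + new_size > MAX_CACHE_SIZE && *head != NULL) {
        /* find the link whose entry has the oldest access timestamp */
        cache_entry_t **lru = head;
        for (cache_entry_t **p = head; *p != NULL; p = &(*p)->next)
            if ((*p)->timestamp < (*lru)->timestamp)
                lru = p;

        cache_entry_t *victim = *lru;
        *lru = victim->next;                /* unlink the victim */
        *cache_size -= victim->size;
        Free(victim->uri);
        Free(victim->response);
        Free(victim);
    }
}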

Page 26

Cache Synchronization

A single cache is shared by all proxy threads, so you must carefully control access to the cache.

What operations should be locked? add_cache_entry, remove_cache_entry, lookup_cache_entry

For the ambitious: multiple readers can peacefully co-exist, but if a writer arrives, that thread MUST synchronize access with the others.
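One common way to get this behavior is the first readers-writers pattern with semaphores; a sketch, assuming both semaphores are initialized to 1 with Sem_init. lookup_cache_entry would take the reader lock, while add_cache_entry and remove_cache_entry would take the writer lock.

#include "csapp.h"

static int readcnt = 0;     /* number of readers currently using the cache */
static sem_t mutex;         /* protects readcnt */
static sem_t w;             /* held by the writer, or by the group of readers */

void reader_lock(void)
{
    P(&mutex);
    if (++readcnt == 1)     /* first reader locks out writers */
        P(&w);
    V(&mutex);
}

void reader_unlock(void)
{
    P(&mutex);
    if (--readcnt == 0)     /* last reader lets writers back in */
        V(&w);
    V(&mutex);
}

void writer_lock(void)   { P(&w); }
void writer_unlock(void) { V(&w); }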

Page 27

Summary

Threading is a clean and efficient way to implement a concurrent server.

We need to synchronize multiple threads for concurrent accesses to shared variables.

Semaphores are one way to do this. Thread-safety is the difficult part of thread programming.

Final review session: Friday, 1-2:30pm, WeH 7500 (all TAs)

Page 28

Questions?