Concurrency, Threads, and Events
Ken Birman
(Based on a slide set prepared by Robbert van Renesse)
Summary Paper 1
Using Threads in Interactive Systems: A Case Study (Hauser et al., 1993)
  Analyzes two interactive computing systems
  Classifies thread usage
  Finds that programmers are still struggling (pre-Java)
    Limited scheduling support
    Priority inversion
Summary Paper 2
SEDA: An Architecture for Well-Conditioned, Scalable Internet Services (Welsh et al., 2001)
  Analyzes threads vs. event-based systems, finds problems with both
  Suggests a trade-off: staged event-driven architecture
  Evaluated for two applications
    Easy to program and performs well
What is a thread?
  A traditional “process” is an address space and a thread of control.
  Now add multiple threads of control
    Share address space
    Individual program counters and stacks
  Same as multiple processes sharing an address space.
Thread Switching
  To switch from thread T1 to T2:
    Thread T1 saves its registers (including pc) on its stack
    Scheduler remembers T1’s stack pointer
    Scheduler restores T2’s stack pointer
    T2 restores its registers
    T2 resumes
  Two models: preemptive/non-preemptive
Thread Scheduler
  Maintains the stack pointer of each thread
  Decides what thread to run next
    E.g., based on priority or resource usage
  Decides when to pre-empt a running thread
    E.g., based on a timer
  May need to deal with multiple CPUs
    But not usually
  “fork” creates a new thread
  Blocking or calling “yield” lets scheduler run
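The non-preemptive model can be sketched with Python generators standing in for threads. This is only an illustration: a generator keeps its own local state across a `yield`, much as a thread keeps its registers on its stack, but no real register or stack-pointer switching is shown. The names `worker` and `run_round_robin` are made up for this sketch.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative 'thread': do one step, then yield to the scheduler."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary switch point; the generator keeps i for later

def run_round_robin(threads):
    """Non-preemptive scheduler: resume each ready thread in turn."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the thread until its next yield
            ready.append(current)  # still runnable: back of the ready queue
        except StopIteration:
            pass                   # thread finished; drop it

log = []
run_round_robin([worker("T1", 2, log), worker("T2", 3, log)])
print(log)
```

A preemptive scheduler differs in that the switch is forced by a timer rather than requested by the running thread.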
Synchronization Primitives
  Semaphores
    P(S): block if semaphore is “taken”
    V(S): release semaphore
  Monitors:
    Only one thread active in a module at a time
    Threads can block waiting for some condition using the WAIT primitive
    Threads need to signal using NOTIFY or BROADCAST
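As a rough illustration of both primitives, here is a sketch using Python's standard `threading` module. `Semaphore.acquire`/`release` play the roles of P and V, and `Condition.wait`/`notify_all` play the roles of WAIT and BROADCAST. The `BoundedBuffer` class and the helper names are assumptions made for this example, not something from the papers.

```python
import threading
from collections import deque

def guarded_increment(sem, counter):
    sem.acquire()            # P(S): blocks while the semaphore is taken
    try:
        counter[0] += 1
    finally:
        sem.release()        # V(S): release the semaphore

class BoundedBuffer:
    """Monitor-style buffer: one lock guards all state."""
    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._cond = threading.Condition()   # lock plus wait/notify

    def put(self, item):
        with self._cond:                      # enter the monitor
            while len(self._items) >= self._capacity:
                self._cond.wait()             # WAIT: gives up the lock while blocked
            self._items.append(item)
            self._cond.notify_all()           # BROADCAST to all waiters

    def get(self):
        with self._cond:
            while not self._items:
                self._cond.wait()
            item = self._items.popleft()
            self._cond.notify_all()
            return item

def demo():
    buf = BoundedBuffer(capacity=2)
    received = []
    consumer = threading.Thread(
        target=lambda: received.extend(buf.get() for _ in range(5)))
    consumer.start()
    for i in range(5):
        buf.put(i)
    consumer.join()
    return received
```

Waiters re-check their condition in a `while` loop because a notification only says the state may have changed, not that the condition now holds.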
Uses of threads
  To exploit CPU parallelism
    Run two CPUs at once in the same program
  To exploit I/O parallelism
    Run I/O while computing, or do multiple I/O
    Listen to the “window” while also running code, e.g. allow commands during an interactive game
  For program structuring
    E.g., timers
  To avoid deadlock in RPC-based applications
Hauser’s categorization
  Defer Work: asynchronous activity
    Print, e-mail, create new window, etc.
  Pumps: pipeline components
    Wait on input queue; send to output queue
    E.g., slack process: add latency for buffering
  Sleepers & one-shots
    Periodic activity & timers
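The pump category above can be sketched as a thread that blocks on an input queue and forwards results to an output queue. This is a minimal sketch using `queue.Queue` from the standard library; the `STOP` sentinel and the two transforms are hypothetical choices for the example.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel marking end of stream

def pump(inbox, outbox, transform):
    """Pipeline component: wait on the input queue, send to the output queue."""
    while True:
        item = inbox.get()        # blocks until input is available
        if item is STOP:
            outbox.put(STOP)      # pass shutdown downstream
            return
        outbox.put(transform(item))

def run_pipeline(values):
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    stages = [
        threading.Thread(target=pump, args=(q1, q2, lambda x: x + 1)),
        threading.Thread(target=pump, args=(q2, q3, lambda x: x * 2)),
    ]
    for t in stages:
        t.start()
    for v in values:
        q1.put(v)
    q1.put(STOP)
    out = []
    while (item := q3.get()) is not STOP:
        out.append(item)
    for t in stages:
        t.join()
    return out
```

Each queue has a single consumer here, so items leave the pipeline in the order they entered it.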
Categorization, cont’d
  Deadlock Avoiders
    Avoid deadlock through ordered acquisition of locks
    When needing more locks, roll-back and re-acquire
  Task Rejuvenation: recovery
    Start new thread when old one dies, say because of uncaught exception
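Ordered lock acquisition can be sketched as follows. The idea is that every thread takes the locks it needs in one global order, so a cycle of waiters cannot form. The `RankedLock` class, the rank numbers, and the `transfer` example are assumptions made up for this illustration.

```python
import threading
from contextlib import ExitStack

class RankedLock:
    """A lock with a fixed position in the global acquisition order."""
    def __init__(self, rank):
        self.rank = rank
        self.lock = threading.Lock()

def acquire_in_order(stack, *locks):
    # Always take locks by ascending rank, whatever order the caller names them in.
    for ranked in sorted(locks, key=lambda l: l.rank):
        stack.enter_context(ranked.lock)

def transfer(src, dst, balances, amount):
    with ExitStack() as stack:      # releases every acquired lock on exit
        acquire_in_order(stack, src, dst)
        balances[src.rank] -= amount
        balances[dst.rank] += amount

def demo():
    a, b = RankedLock(0), RankedLock(1)
    balances = {0: 100, 1: 100}
    t1 = threading.Thread(
        target=lambda: [transfer(a, b, balances, 1) for _ in range(1000)])
    t2 = threading.Thread(
        target=lambda: [transfer(b, a, balances, 1) for _ in range(1000)])
    t1.start(); t2.start()
    t1.join(); t2.join()
    return balances
```

Without the sort, `t1` could hold `a` while waiting for `b` and `t2` could hold `b` while waiting for `a`.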
Categorization, cont’d
  Serializers: event loop
    for (;;) {
        get_next_event();
        handle_event();
    }
  Concurrency Exploiters
    Use multiple CPUs
  Encapsulated Forks
    Hidden threads used in library packages
    E.g., menu-button queue
Common Problems
  Priority Inversion
    High priority thread waits for low priority thread
    Solution: temporarily push priority up (rejected??)
  Deadlock
    X waits for Y, Y waits for X
  Incorrect Synchronization
    Forgetting to release a lock
  Failed “fork”
  Tuning
    E.g. timer values in different environment
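The “forgetting to release a lock” problem is one that modern languages can partly guard against with scoped locking. A small sketch in Python, where the `with` statement releases the lock on every exit path, including exceptions; `risky_update` and `state` are hypothetical names for the example.

```python
import threading

lock = threading.Lock()

def risky_update(state, value):
    with lock:                 # released on normal exit and on exceptions
        if value < 0:
            raise ValueError("negative value")
        state["total"] += value

state = {"total": 0}
try:
    risky_update(state, -1)    # raises while holding the lock
except ValueError:
    pass
risky_update(state, 5)         # would hang here if the lock had leaked
```

This addresses only the missed-release case; it does nothing for priority inversion or lock-ordering deadlocks.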
Problems he neglects
  Implicit need for ordering of events
    E.g. thread A is supposed to run before thread B does, but something delays A
  Non-reentrant code
  Languages lack “monitor” features and users are perhaps surprisingly weak at detecting and protecting concurrently accessed data
Criticism of Hauser
  He assumes superb programmers and seems to believe that “most” programmers won’t use threads (his example systems are really platforms, not applications)
  Systems old but/and not representative
    Pre-Java and C#
  And now there are some tools that can help discover problems
What is an Event?
  An object queued for some module
  Operations:
    create_event_queue(handler) → EQ
    enqueue_event(EQ, event-object)
      Invokes, eventually, handler(event-object)
  Handler is not allowed to block
    Blocking could cause entire system to block
    But page faults, garbage collection, …
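The two operations above might look like this in a single-threaded setting. This is a sketch under assumptions: the slides name only `create_event_queue` and `enqueue_event`; the `EventQueue` class, the `_queues` registry, and the `run_until_idle` dispatcher are invented here to make the example runnable.

```python
from collections import deque

class EventQueue:
    def __init__(self, handler):
        self.handler = handler
        self.pending = deque()

_queues = []  # every queue known to the (hypothetical) dispatcher

def create_event_queue(handler):
    eq = EventQueue(handler)
    _queues.append(eq)
    return eq

def enqueue_event(eq, event):
    eq.pending.append(event)   # returns at once; the handler runs later

def run_until_idle():
    """Dispatch one event at a time; each handler runs to completion."""
    progress = True
    while progress:
        progress = False
        for eq in _queues:
            if eq.pending:
                eq.handler(eq.pending.popleft())
                progress = True

printed = []
printer = create_event_queue(printed.append)
doubler = create_event_queue(lambda n: enqueue_event(printer, n * 2))
enqueue_event(doubler, 1)
enqueue_event(doubler, 2)
run_until_idle()
```

A handler that blocked inside `run_until_idle` would stall every queue, which is why handlers must not block.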
Example Event System
(Also common in telecommunications industry, where it’s called “workflow programming”)
Event Scheduler
  Decides which event queue to handle next.
    Based on priority, CPU usage, etc.
  Never pre-empts event handlers!
    No need for a stack per event handler
  May need to deal with multiple CPUs
Synchronization?
  Handlers cannot block → no synchronization
  Handlers should not share memory
    At least not in parallel
  All communication through events
Uses of Events
  CPU parallelism
    Different handlers on different CPUs
  I/O concurrency
    Completion of I/O signaled by event
    Other activities can happen in parallel
  Program structuring
    Not so great…
    But can use multiple programming languages!
Hauser’s categorization ?!
  Defer Work: asynchronous activity
    Send event to printer, etc.
  Pumps: pipeline components
    Natural use of events!
  Sleepers & one-shots
    Periodic events & timer events
Categorization, cont’d
  Deadlock Avoiders
    Ordered lock acquisition still works
  Task Rejuvenation: recovery
    Watchdog events?
Categorization, cont’d
  Serializers: event loop
    Natural use of events and handlers!
  Concurrency Exploiters
    Use multiple CPUs
  Encapsulated Events
    Hidden events used in library packages
    E.g., menu-button queue
Threads vs. Events
  Event-based systems use fewer resources
    Better performance (particularly scalability)
  Event-based systems harder to program
    Have to avoid blocking at all cost
    Block-structured programming doesn’t work
    How to do exception handling?
  In both cases, tuning is difficult
Both?
  In practice, many kinds of systems need to support both threads and events
    Threaded programs in Unix are the common example of these, because window systems use events
    The programmer uses cthreads or pthreads
  Major problem: the UNIX kernel interface wasn’t designed with threads in mind!
Why does this cause problems?
  Many system calls block the “process”
    File read or write, for example
  And many libraries aren’t reentrant
  So when the user employs threads
    The application may block unexpectedly
      Limited work-around: add “kernel threads”
    And the user might stumble into a reentrancy bug
Events as seen in Unix
  Window systems use small messages…
  But the “old” form of events is signals
    Kernel basically simulates an interrupt into the user’s address space
    The “event handler” then runs…
      But can it launch new threads?
      Some system calls can return EINTR
      Very limited options to “block” signals in critical sections
How do people work around this?
  They try not to do blocking I/O
    Use asynchronous system calls… or select… or some mixture of the two
  Or try to turn the whole application into an event-driven one using pools of threads, in the SEDA model (more or less)
    One dedicated thread per I/O “channel”, to turn signal-style events into events on the event queue for the processing stage
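The `select` work-around can be sketched with Python's standard `select` and `socket` modules: wait until some channel is readable, then read only from those channels, so no read blocks. The `serve_ready` helper is a name made up for this example.

```python
import select
import socket

def serve_ready(channels, timeout=1.0):
    """Wait until at least one socket is readable, then read only from those."""
    readable, _, _ = select.select(channels, [], [], timeout)
    return [(sock, sock.recv(4096)) for sock in readable]

# Two connected socket pairs stand in for two I/O channels.
a, a_peer = socket.socketpair()
c, c_peer = socket.socketpair()
a_peer.sendall(b"hi")                 # only channel a has data
ready = serve_ready([a, c])
for s in (a, a_peer, c, c_peer):
    s.close()
```

Here only `a` is reported as readable; reading from `c` directly would have blocked the whole program.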
This can be hard, but it works
  Must write the whole program and have a way to review any libraries it uses!
  One learns, the hard way, that pretty much nothing else works
  Unix programs built by inexperienced developers are often riddled with concurrency bugs!
SEDA
  Mixture of models of threads and (small message-style) events
  Events, queues, and “pools of event handling threads”.
  Pools can be dynamically adjusted as need arises.
  Similar to JavaBeans and EventListeners?
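A stage in this style might be sketched as an event queue served by a small thread pool. This is a toy illustration of the idea on the slide, not the SEDA implementation: the `Stage` class, its method names, and the fixed two-stage demo are all assumptions, and SEDA's resource controllers are reduced to a manual `add_worker` call.

```python
import queue
import threading

class Stage:
    """One stage: an event queue plus a pool of handler threads."""
    def __init__(self, handler, workers=2, next_stage=None):
        self.events = queue.Queue()
        self.handler = handler
        self.next_stage = next_stage
        self.threads = []
        for _ in range(workers):
            self.add_worker()

    def add_worker(self):
        # Pool can grow at run time; this sketch does it by hand.
        t = threading.Thread(target=self._run, daemon=True)
        t.start()
        self.threads.append(t)

    def enqueue(self, event):
        self.events.put(event)

    def _run(self):
        while True:
            event = self.events.get()
            try:
                result = self.handler(event)
                if self.next_stage is not None:
                    self.next_stage.enqueue(result)   # hand off via the next queue
            finally:
                self.events.task_done()

    def drain(self):
        self.events.join()    # wait until every queued event is handled

def demo():
    results = []
    sink = Stage(results.append, workers=1)
    upper = Stage(str.upper, workers=2, next_stage=sink)
    for word in ["a", "b", "c"]:
        upper.enqueue(word)
    upper.drain()
    sink.drain()
    return sorted(results)
```

Stages communicate only through queues, so each pool can be sized independently of the others.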
Authors: “Best of both worlds”
  Ease of programming of threads
    Or even better
  Performance of events
    Or even better
Threads Considered Harmful
  Like goto, transfer to some entry in program
    In any scope
    Destroys structure of programs
  Primitive Synchronization Primitives
    Too low-level
    Too coarse-grained
    Too error-prone
    Prone to over-specification
Example: create file
  1. Create file
  2. Read current directory (may be cached)
  3. Update and write back directory
  4. Write file
Thread Implementations
  1. Serialize: op1; op2; op3; op4
    • Simplest and most common
  2. Use threads
    • Requires at least two semaphores!
    • Results in complicated program
  3. Simplified threads
    a) Create file and read directory in parallel
    b) Barrier
    c) Write file and write directory in parallel
    • Over-specification!
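Option 3 can be sketched as two parallel phases separated by a barrier. In this illustration the file system operations are replaced by log entries, and joining the phase-one threads serves as the barrier; `run_parallel` and `create_file_with_barrier` are names made up for the example.

```python
import threading

def run_parallel(*ops):
    """Run the operations concurrently and wait for all of them."""
    threads = [threading.Thread(target=op) for op in ops]
    for t in threads:
        t.start()
    for t in threads:
        t.join()              # joining every thread acts as the barrier

def create_file_with_barrier(log):
    def record(name):
        return lambda: log.append(name)   # stand-in for the real operation

    # Phase 1: create file and read directory in parallel.
    run_parallel(record("create file"), record("read directory"))
    # Barrier: both phase-1 operations are complete at this point.
    # Phase 2: write file and write directory in parallel.
    run_parallel(record("write file"), record("write directory"))

log = []
create_file_with_barrier(log)
```

The over-specification is visible here: the barrier makes "write file" wait for "read directory" along with everything else in phase one, although the listed steps give no reason for it to.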
Event Implementation
  Create a dummy handler that awaits file creation and directory read events and then sends an event to update the directory.
  Not great…
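The dummy handler might look like the following sketch. It exists only to remember which of the two events has arrived so far, which is the awkward part. The event names, the single global queue, and the `JoinHandler` class are assumptions made for the illustration.

```python
from collections import deque

events = deque()   # single global event queue (an assumption of this sketch)
log = []

class JoinHandler:
    """Dummy handler: waits for both events, then emits 'update_directory'."""
    def __init__(self, needed):
        self.waiting = set(needed)

    def __call__(self, name):
        self.waiting.discard(name)
        if not self.waiting:                  # both inputs have arrived
            events.append("update_directory")

join = JoinHandler({"file_created", "directory_read"})

def dispatch(name):
    log.append(name)
    if name in ("file_created", "directory_read"):
        join(name)

events.extend(["directory_read", "file_created"])   # arrival order is arbitrary
while events:
    dispatch(events.popleft())
```

The state that a thread would keep implicitly on its stack has to be kept by hand in `self.waiting`.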
GOP: Discussion
  Specifies dependencies at a high level
    No semaphores, condition variables, etc.
    No explicit threads nor events
  Can easily be supported by many languages
    C, Java, etc.
  Top-down specification
    Compare with make, Prolog, theorem provers
  Exception handling easily supported
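To make the comparison with make concrete, here is a sketch of dependency-driven execution for the create-file example. It is not GOP's notation, which these slides do not show; it only illustrates stating dependencies and letting a runtime pick the order. The dependency table is an assumption read off the earlier example, and `graphlib.TopologicalSorter` (Python 3.9+) stands in for the runtime.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (assumed for this sketch).
deps = {
    "create_file": set(),
    "read_directory": set(),
    "update_directory": {"create_file", "read_directory"},
    "write_file": {"create_file"},
}

def run(deps, actions):
    order = []
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    while sorter.is_active():
        ready = sorter.get_ready()     # every step whose dependencies are done
        for step in sorted(ready):     # a real runtime could run these in parallel
            actions[step]()
            order.append(step)
            sorter.done(step)
    return order

actions = {name: (lambda: None) for name in deps}   # placeholder operations
order = run(deps, actions)
```

No thread, semaphore, or handler appears in the specification, and "write_file" is tied only to "create_file", unlike in the barrier version.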
Conclusion
  Threads still problematic
    As a code structuring mechanism
    High resource usage
  Events also problematic
    Hard to code, but efficient
  SEDA and GOP address shortcomings
    But neither can be said to have taken hold