Concurrent Programming
The Cunning Plan
• We’ll look into:
– What concurrent programming is
– Why you care
– How it’s done
• We’re going to skim over *all* the interesting details
One-Slide Summary
• There are many different ways to do concurrent programming
• There can be more than one
• We need synchronization primitives (e.g. semaphores) to deal with shared resources
• Message passing is complicated
What?
• Concurrent Programming
– using multiple threads on a single machine
• the OS simulates concurrency, or
• multiple cores/processors run them in parallel
– using message passing
• memory is not shared between threads
• more general in terms of hardware requirements
– etc.
What (Shorter version)
• There are a million different ways to do concurrent programming
• We’ll focus on three:
– co-begin blocks
– threads
– message passing
Concurrent Programming: Why?
1. Because it is intuitive for some problems (say you’re writing httpd)
2. Because we need better-than-sequential performance
3. Because the problem is inherently distributed (e.g. BitTorrent)
Coding Challenges
• How do you divide the problem across threads?
– easy: matrix multiplication using threads
– hard: heated plate using message passing
– harder: n-body simulation for large n
One Slide on Co-Begin
• We want to execute commands simultaneously m’kay—solution:

int x;
int y;
// ...
run-in-parallel {
  functionA(&x) | functionB(&x, &y)
}

(Diagram: Main forks into A and B, which run in parallel, then join back into Main.)
Threads
• Most common in everyday applications
• Instead of a run-in-parallel block, we want explicit ways to create and destroy threads
• Threads can all see a program’s global variables (i.e. they share memory)
Some Syntax:
Thread mythread = new Thread( new Runnable() {
  public void run() { /* your code here */ }
} );
mythread.start();
mythread.join();
Some Syntax:
Thread mythread = new Thread( new Runnable() {
  public void run() { /* your code here */ }
} );
mythread.start();

void * foo(void * arg) {
  int * x = (int *)arg;
  /* your code here */
  return NULL;
}
// ...
int bar = 5;
pthread_t my_id;
pthread_create(&my_id, NULL, foo, (void *)&bar);
// ...
pthread_join(my_id, NULL);
Example: Matrix Multiplication
• Given: a 4×3 matrix A and a 3×4 matrix B
• Compute: C = A × B; e.g. the first row of A, (9 7 4), times the first column of B, (2 5 -3):
9 *  2 =  18
7 *  5 =  35
4 * -3 = -12
        +-----
          41
Matrix Multiplication ‘Analysis’
• We have:
p = 4    size(A) = (p, q)
q = 3    size(B) = (q, r)
r = 4    size(C) = (p, r)
• Complexity:
• p×r elements in C
• O(q) operations per element
• Note: calculating each element of C is independent from the other elements
Matrix Multiplication using Threads
pthread_t threads[P][R];
struct location locs[P][R];

for (i = 0; i < P; ++i) {
  for (j = 0; j < R; ++j) {
    locs[i][j].row = i;
    locs[i][j].col = j;
    pthread_create( &threads[i][j], NULL,
                    calc_cell, (void *)(&locs[i][j]) );
  }
}

for (i = 0; i < P; ++i) {
  for (j = 0; j < R; ++j) {
    pthread_join( threads[i][j], NULL );
  }
}
Matrix Multiplication using Threads
for each element in C: create a thread: call the function 'calc_cell'
for each created thread: wait until the thread finishes
// Profit
Postmortem
• Relatively easy to parallelize:
– matrices A and B are ‘read only’
– each thread writes to a unique entry in C
– entries in C do not depend on each other
• What are some problems with this?
– overhead of creating threads
– use of shared memory
Synchronization
• So far, we have only covered how to create & destroy threads
• What else do we need? (See title)
Synchronization

• We want to do things like:
– event A must happen before event B
and
– events A and B cannot occur simultaneously
• Is there a problem here?

Thread 1:              Thread 2:
counter = counter + 1  counter = counter + 1
Semaphores
• A number n (initialized to some value)
• Can only increment, sem.V(), and decrement, sem.P()
• n > 0 : P() doesn’t block
  n ≤ 0 : P() blocks
• V() unblocks some waiting process
More Semaphore Goodness
• Semaphores are straightforward to implement on most types of systems
• Easy to use for resource management (set n equal to the number of resources)
• Some additional features are common (e.g. bounded semaphores)
Semaphore Example
• Let’s try this again:

Main:
Semaphore wes = new Semaphore(0)
// start threads 1 and 2 simultaneously

Thread 1:                Thread 2:
counter = counter + 1    wes.P()
wes.V()                  counter = counter + 1
Semaphore Example 2

• Suppose we want two threads to “meet up” at specific points in their code:

Semaphore aArrived = new Semaphore(0)
Semaphore bArrived = new Semaphore(0)
// start threads A and B simultaneously

Thread A:        Thread B:
foo_a1           foo_b1
aArrived.V()     bArrived.V()
bArrived.P()     aArrived.P()
foo_a2           foo_b2

• Careful: if each thread instead does its P() before its V()…

Thread A:        Thread B:
foo_a1           foo_b1
bArrived.P()     aArrived.P()
aArrived.V()     bArrived.V()
foo_a2           foo_b2

…both block forever, each waiting for a V() the other never reaches
Deadlock
• ‘Deadlock’ refers to a situation in which one or more threads is waiting for something that will never happen
• Theorem: You will, at some point in your life, write code that deadlocks
Readers/Writers Problem
• Let’s do a slightly bigger example
• Problem:
– some finite buffer b
– multiple writer threads (only one can write at a time)
– multiple reader threads (many can read at a time)
– can only read if no writing is happening
Readers/Writers Solution #1

int readers = 0
Semaphore mutex = new Semaphore(1)
Semaphore roomEmpty = new Semaphore(1)

Writers:
roomEmpty.P()
// write here
roomEmpty.V()
Readers/Writers Solution #1

Readers:
mutex.P()
  readers ++
  if (readers == 1)
    roomEmpty.P()
mutex.V()
// read here
mutex.P()
  readers --
  if (readers == 0)
    roomEmpty.V()
mutex.V()
Starvation
• Starvation occurs when a thread is continuously denied resources
• Not the same as deadlock: it might eventually get to run, but it needs to wait longer than we want
• In the previous example, a writer might ‘starve’ if there is a continuous onslaught of readers
Guiding Question
• Earlier, I said sem.V() unblocks some waiting thread
• If we don’t unblock in FIFO order, that means we could cause starvation
• Do we care?
Synchronization Summary
• We can use semaphores to enforce synchronization:
– ordering
– mutual exclusion
– queuing
• There are other constructs as well
• See your local OS Prof
Message Passing
• Threads and co. rely on shared memory
• Semaphores make very little sense if they cannot be shared between n > 1 threads
• What about systems in which we can’t share memory?
Message Passing
• Processes (not threads) are created for us (they just exist)
• We can do the following:
blocking_send(int destination, char * buffer, int size)
blocking_receive(int source, char * buffer, int size)
Message Passing: Motivation
• We don’t care if threads run on different machines:
– same machine - use virtual memory tricks to make messages very quick
– different machines - copy and send over the network
Heated Plate Simulation
• Suppose you have a metal plate:
• Three sides are chilled to 273 K• One side is heated to 373 K
Heated Plate Simulation
Problem: Calculate the heat distribution after some time t:

(Figures: heat maps of the plate at t=10, t=30, and t=50)
Heated Plate Simulation
• We model the problem by dividing the plate into small squares:
• For each time step, take the average of a square’s four neighbors
Heated Plate Simulation
• Problem: need to communicate for each time step
• Sending messages is expensive…
(Diagram: the plate’s squares divided into strips, one per process P1, P2, P3)
• Solution: send fewer, larger messages, and limit the longest message path
How to cause deadlock in MPI
Process 1:
char * buff = "Goodbye";
char * buff2 = new char[15];
send(2, buff, 8);
recv(2, buff2, 15);

Process 2:
char * buff = ", cruel world\n";
char * buff2 = new char[8];
send(1, buff, 15);
recv(1, buff2, 8);

• Both blocking sends wait for the other side to post its receive, so neither process ever reaches its recv()
Postmortem
• Our heated plate solution does not rely on shared memory
• Sending messages becomes complicated in a hurry (easy to do the wrong thing)
• We need to reinvent the wheel constantly for different interaction patterns
Example Summary

• Matrix Multiplication Example
– used threads and implicitly shared memory
– this is common for everyday applications (especially useful for servers, GUI apps, etc.)

• Heated Plate Example
– used message passing
– this is more common for big science and big business (also e.g. peer-to-peer)
– it is not used to code your average firefox
Guiding Question
If you’re writing a GUI app (let’s call it “firefox”), would you prefer to use threads or message passing?
Summary
• Looked at three ways to do concurrent programming:
– co-begin
– threads, implicitly shared memory, semaphores
– message passing
• Concerns of scheduling, deadlock, starvation