1
Outline Part 1
• Objectives:
  – To define the process and thread abstractions.
  – To briefly introduce mechanisms for implementing processes (threads).
  – To introduce the critical section problem.
  – To learn how to reason about the correctness of concurrent programs.
• Administrative details:
  – Groups are listed with “class pics”.
  – Pictures – make sure name-to-face mapping is correct.
  – Password protected: name: cps110  passwd: OntheGo
2
Lab 1 suggestions
• Some code interleavings can cause “system” problems
  – e.g. freeing the same object twice
  – Add code to detect this and give an error message. This code should be “conditional” on an argument, so you can see what happens if you let the system handle the problem.
• Fetching data from a freed object should also be detected and stopped, conditionally.
Concurrency
• Multiple things happening simultaneously
• Could provide direct support in HW
  – Atomic increment
  – Insert node into sorted list??
• Just provide low-level primitives to construct atomic sequences – called synchronization primitives
  LOCK(counter->lock);   // Wait here until unlocked
  counter->value = counter->value + 1;
  UNLOCK(counter->lock);
• test&set (x) instruction: returns previous value of x and sets x to “1”
  LOCK(x)   => while (test&set(x));
  UNLOCK(x) => x = 0;
5
The Basics of Processes
• Processes are the OS-provided abstraction of multiple tasks (including user programs) executing concurrently.
• A Process IS: one instance of a program (which is only a passive set of bits) executing (implying an execution context – register state, memory resources, etc.)
• OS schedules processes to share CPU.
6
Why Use Processes?
• To capture naturally concurrent activities within the structure of the programmed system.
• To gain speedup by overlapping activities or exploiting parallel hardware – from DMA to multiprocessors.

[Figure: multiple processors (P P P P) sharing a Memory]
7
Separation of Policy and Mechanism
(System Design Principle)
• “Why and What” vs. “How”
• Objectives and strategies vs. data structures, hardware and software implementation issues.
• Process abstraction vs. Process machinery
Can you think of examples?
8
Process Abstraction
• Unit of scheduling
• One (or more*) sequential threads of control
  – program counter, register values, call stack
• Unit of resource allocation
  – address space (code and data), open files
  – sometimes called tasks or jobs
• Operations on processes: fork (clone-style creation), wait (parent on child), exit (self-termination), signal, kill.
Process-related System Calls in Unix.
9
Threads and Processes
• Decouple the resource allocation aspect from the control aspect
• Thread abstraction - defines a single sequential instruction stream (PC, stack, register values)
• Process - the resource context serving as a “container” for one or more threads (shared address space)
• Kernel threads - unit of scheduling (kernel-supported thread operations still slow)
10
Threads and Processes
[Figure: two address spaces (processes), each serving as a container for one or more threads]
11
An Example
[Figure: a doc formatting process – one address space holding the shared “doc”, with two threads:
  – Editing thread: responding to your typing in your doc
  – Autosave thread: periodically writes your doc file to disk]
12
User-Level Threads
• To avoid the performance penalty of kernel-supported threads, implement at user level and manage by a run-time system
  – Contained “within” a single kernel entity (process)
  – Invisible to OS (OS schedules their container, not being aware of the threads themselves or their states). Poor scheduling decisions possible.
• User-level thread operations can be 100x faster than kernel thread operations, but need better integration / cooperation with OS.
13
Context Switching
• When a process is running, its program counter, register values, stack pointer, etc. are contained in the hardware registers of the CPU. The process has direct control of the CPU hardware for now.
• When a process is not the one currently running, its current register values are saved in a process descriptor data structure (PCB – process control block).
• Context switching involves moving state between CPU and various processes’ PCBs by the OS.
14
Process State Transitions

[State diagram:
  Create Process → Ready
  Ready → Running (scheduled)
  Running → Ready (suspended while another process scheduled)
  Running → Blocked (sleep, due to outstanding request of syscall)
  Blocked → Ready (wakeup, due to interrupt)
  Running → Done]
15
Process Mechanisms
• PCB data structure in kernel memory represents a process (allocated on process creation, deallocated on termination).
• PCBs reside on various state queues (including a different queue for each “cause” of waiting) reflecting the process’s state.
• As a process executes, the OS moves its PCB from queue to queue (e.g. from the “waiting on I/O” queue to the “ready to run” queue).
What is the value of x when both threads leave this while loop?
19
Nondeterminism
• What unit of work can be performed without interruption? Indivisible or atomic operations.
• Interleavings – possible execution sequences of operations drawn from all threads.
• Race condition – final results depend on ordering and may not be “correct”.

while (i < 10) {x = x + 1; i++;}

The increment decomposes into steps, and a switch can occur between any of them:
  load value of x into reg
      yield( )
  add 1 to reg
      yield( )
  store reg value at x
      yield( )
20
Interleaving

Thread 0:
  Load x (value 0)
  Incr x (value 1 in reg)
  Store x (value 1)
  Load x (value 1) for 2nd iteration
  Incr x (value 2 in reg)
  …
  Store x for 10th iteration (value 10)

Thread 1:
  Load x (value 0)
  Incr x (value 1 in reg)
  Store x (value 1)
  …
  Store x for 9th iteration (value 9)
  Load x (value 1) for 10th iteration
  Incr x (value 2 in reg)
  Store x (value 2) for 10th iteration
21
Reasoning about Interleavings
• On a uniprocessor, the possible execution sequences depend on when context switches can occur
  – Voluntary context switch – the process or thread explicitly yields the CPU (blocking on a system call it makes, or invoking a Yield operation).
  – Interrupts or exceptions occurring – an asynchronous handler is activated, disrupting the execution flow.
  – Preemptive scheduling – a timer interrupt may cause an involuntary context switch at any point in the code.
• On multiprocessors, the ordering of operations on shared memory locations is the important factor.
  – Memory references are atomic, but instructions aren’t
The Trouble with Concurrency
• Two threads (T1, T2) in one address space, or two processes in the kernel
• One counter (shared data: count)

Each thread executes:
  ld  r2, count
  add r1, r2, r3
  st  count, r1

Time  T1              T2              count
      ld (count)
      add
      switch
                      ld (count)
                      add
                      st (count+1)    count+1
                      switch
      st (count+1)                    count+1
Solution: Atomic Sequence of Instructions
• Atomic Sequence
  – Appears to execute to completion without any intervening operations

Time  T1              T2              count
      begin atomic
      ld (count)
      add
      switch
                      begin atomic
                      wait
                      switch
      st (count+1)                    count+1
      end atomic
      switch
                      ld (count+1)
                      add
                      st (count+2)    count+2
                      end atomic
24
Critical Sections
• If a sequence of non-atomic operations must be executed as if it were atomic in order to be correct, then we need to provide a way to constrain the possible interleavings in this critical section of our code.
  – Critical sections are code sequences that contribute to “bad” race conditions.
  – Synchronization is needed around such critical sections.
• Mutual Exclusion – goal is to ensure that critical sections execute atomically w.r.t. related critical sections in other threads or processes.
  – How?
25
The Critical Section Problem

Each process follows this template:
while (1)
{ ...other stuff...   // processes in here shouldn’t stop others
  enter_region( );
  critical section
  exit_region( );
}
The problem is to define enter_region and exit_region to ensure mutual exclusion with some degree of fairness.

• Busywaiting (spinning)
  – execute a tight loop if critical section is busy
  – benefits from specialized atomic (read-mod-write) instructions
• Blocking synchronization
  – sleep (enqueued on wait queue) while C.S. is busy

Synchronization primitives (abstractions, such as locks) which are provided by a system may be implemented with some combination of these techniques.
27
The Trouble with Concurrency in Threads...

Data: x (initially 0); i and j (each thread’s private counter, initially 0)

Thread 0:                        Thread 1:
while (i < 10)                   while (j < 10)
  {x = x + 1; i++;}                {x = x + 1; j++;}

What is the value of x when both threads leave this while loop?
28
Range of Answers

Process 0:
LD x                 // x currently 0
Add 1
ST x                 // x now 1, stored over 9
Do 9 more full loops // leaving x at 10

Process 1:
LD x                 // x currently 0
Add 1
ST x                 // x now 1
Do 8 more full loops // x = 9
LD x                 // x now 1
Add 1
ST x                 // x = 2, stored over 10
29
The Critical Section Problem

while (1)
{ ...other stuff...
  enter_region( );
  critical section
  exit_region( );
}
30
Proposed Algorithm for 2 Process Mutual Exclusion

Boolean flag[2];

proc (int i) {
  while (TRUE) {
    compute;
    flag[i] = TRUE;
    while (flag[(i+1) mod 2]);
    critical section;
    flag[i] = FALSE;
  }
}

flag[0] = flag[1] = FALSE;
fork (proc, 0);
fork (proc, 1);

Is it correct?
Assume they go lockstep. Both set their own flag to TRUE. Both busywait forever on the other’s flag -> deadlock.
31
Proposed Algorithm for 2 Process Mutual Exclusion

• enter_region:
  needin [me] = true;
  turn = you;
  while (needin [you] && turn == you) {no_op};

• exit_region:
  needin [me] = false;

Is it correct?
32
Interleaving of Execution of 2 Threads (blue and green)

Each thread executes:
enter_region:
  needin [me] = true;
  turn = you;
  while (needin [you] && turn == you) {no_op};
Critical Section
exit_region:
  needin [me] = false;
33
One interleaving:
  blue:  needin [blue] = true;
  green: needin [green] = true;
  blue:  turn = green;
  green: turn = blue;
  blue:  while (needin [green] && turn == green)   // turn == blue, so blue proceeds
  green: while (needin [blue] && turn == blue) {no_op};   // green busywaits
  blue:  Critical Section
  blue:  needin [blue] = false;
  green: while (needin [blue] && turn == blue)     // now false, green proceeds
  green: Critical Section
  green: needin [green] = false;
34
Greedy Version (turn = me)

  blue:  needin [blue] = true;
  green: needin [green] = true;
  blue:  turn = blue;
  blue:  while (needin [green] && turn == green)   // turn == blue, so blue enters
  green: turn = green;
  green: while (needin [blue] && turn == blue)     // turn == green, so green enters too
  blue:  Critical Section
  green: Critical Section
  Oooops!
35
Synchronization
• We illustrated the dangers of race conditions when multiple threads execute instructions that interfere with each other when interleaved.
• Goal in solving the critical section problem is to build synchronization so that the sequence of instructions that can cause a race condition are executed AS IF they were indivisible (just appearances).
• “Other stuff” can be interleaved with critical section code as well as with the enter_region and exit_region protocols, but it is deemed OK.
36
Peterson’s Algorithm for 2 Process Mutual Exclusion

• enter_region:
  needin [me] = true;
  turn = you;
  while (needin [you] && turn == you) {no_op};

• exit_region:
  needin [me] = false;

What about more than 2 processes?
38
Can we extend the 2-process algorithm to work with n processes?

Idea: Tournament

[Figure: a tournament tree – each internal node runs the 2-process enter_region
(needin [me] = true; turn = you; while (needin [you] && turn == you) ...)
between the winners of its two subtrees; the winner at the root enters the CS]

Details: Bookkeeping (left to the reader)
39
Lamport’s Bakery Algorithm

• enter_region:
  choosing[me] = true;
  number[me] = max(number[0:n-1]) + 1;
  choosing[me] = false;
  for (j = 0; j < n; j++) {
      while (choosing[j]) ;
      while ((number[j] != 0) and
             ((number[j] < number[me]) or
              ((number[j] == number[me]) and (j < me)))) ;
  }

• exit_region:
  number[me] = 0;
40
Explanation of Lamport’s Bakery Algorithm

choosing[me] = true;
number[me] = max(number[0:n-1]) + 1;
choosing[me] = false;
/* choosing[i] is false whenever number[i] is not in the middle of changing to non-zero */

for (j = 0; j < n; j++) {
    while (choosing[j]) ;
    while ((number[j] != 0) and
           ((number[j] < number[me]) or
            ((number[j] == number[me]) and (j < me)))) ;
}

/* While thread i is in this for-loop, number[i] is non-zero. If thread j arrives later, while i is already examining number[], then by the time j sets choosing[j] false, number[j] > number[i]. So even if i has already examined entry j, i can enter CS(i), and j will NOT enter CS(j) until i leaves CS(i) and sets number[i] to 0. */
41
Interleaving / Execution Sequence with Bakery Algorithm

[Figure sequence: four threads (0–3), each with a choosing flag and a number[] slot, stepping through enter_region:]

for (j = 0; j < n; j++) {
    while (choosing[j]) {skip}
    while ((number[j] != 0) and
           ((number[j] < number[me]) or
            ((number[j] == number[me]) and (j < me)))) {skip}
}

• Initially every choosing flag is false and every number[] entry is 0.
• Several threads begin choosing concurrently; two of them draw the same ticket (number = 1) because they read max(number[]) at the same time.
• A thread scanning the for-loop first waits at entry j while choosing[j] is true, so it never acts on a ticket that is still being computed.
• Later arrivals draw larger tickets (here one thread draws 2 and Thread 3 draws 3); Thread 3 is stuck until every smaller ticket returns to 0.
• Ties (the two threads with number = 1) are broken by thread id: the (j < me) test lets the lower-numbered thread in first.
• As each thread leaves the CS it sets its number[] entry back to 0, releasing the holder of the next-smallest ticket.
49
Hardware Assistance
• Most modern architectures provide some support for building synchronization: atomic read-modify-write instructions.
• Example: test-and-set (loc, reg)
  [ sets bit to 1 in the new value of loc;
    returns old value of loc in reg ]
• Other examples: compare-and-swap, fetch-and-op

[ ] notation means atomic
50
Busywaiting with Test-and-Set
• Declare a shared memory location to represent a busyflag on the critical section we are trying to protect.
• enter region (or acquiring the “lock”):
  while (test-and-set (busyflag, reg)) ;   /* spin while the old value was 1 */
• exit region (or releasing the “lock”):
  busyflag = 0;
51
Pros and Cons of Busywaiting
• Key characteristic – the “waiting” process is actively executing instructions in the CPU and using memory cycles.
• Appropriate when:
  – High likelihood of finding the critical section unoccupied (don’t need a context switch just to find that out), or estimated wait time is very short.
• Disadvantages:
  – Wastes resources (CPU, memory, bus bandwidth)
52
Blocking Synchronization
• OS implementation involving changing the state of the “waiting” process from running to blocked.
• Need some synchronization abstraction known to OS – provided by system calls.
  – mutex locks with operations acquire and release
  – semaphores with operations P and V (down, up)
  – condition variables with wait and signal
53
Template for Implementing Blocking Synchronization
• Associated with the lock is a memory location (busy) and a queue for waiting threads/processes.
• Acquire syscall:
  while (busy) {enqueue caller on lock’s queue}
  /* upon waking to nonbusy lock */
  busy = true;
• Release syscall:
  busy = false;
  /* wakeup */ move any waiting threads to Ready queue
54
Pros and Cons of Blocking
• Waiting processes/threads don’t consume CPU cycles
• Appropriate: when the cost of a system call is justified by expected waiting time
  – High likelihood of contention for lock
  – Long critical sections
• Disadvantage: OS involvement overhead
57
Outline Part 2
• Objectives:
  – Higher level synchronization mechanisms
  – Classic problems in concurrency
• Administrative details:
58
Semaphores
• Well-known synchronization abstraction
• Defined as a non-negative integer with two atomic operations
  P(s) – [wait until s > 0; s--]
  V(s) – [s++]
• The atomicity and the waiting can be implemented by either busywaiting or blocking solutions.
59
Semaphore Usage
• Binary semaphores can provide mutual exclusion (solution of critical section problem)
• Counting semaphores can represent a resource with multiple instances (e.g. solving producer/consumer problem)
• Signaling events (persistent events that stay relevant even if nobody is listening right now)
60
The Critical Section Problem

Semaphore: mutex, initially 1

while (1)
{ ...other stuff...
  P(mutex);
  critical section
  V(mutex);
}

Fill in the boxes
61
When is a code section a critical section?
Thread 0
a = a + c;
b = b + c;
Thread 1
a = a + c;
b = b + c;
62
When is a code section a critical section?
Thread 0
P(mutex)
a = a + c;
b = b + c;
V(mutex)
Thread 1
P(mutex)
a = a + c;
b = b + c;
V(mutex)
63
When is a code section a critical section?
Thread 0
P(mutexa)
a = a + c;
V(mutexa)
P(mutexb)
b = b + c;
V(mutexb)
Thread 1
P(mutexa)
a = a + c;
V(mutexa)
P(mutexb)
b = b + c;
V(mutexb)
64
When is a code section a critical section?
Thread 0
P(mutex0)
a = a + c;
b = b + c;
V(mutex0)
Thread 1
P(mutex1)
a = a + c;
b = b + c;
V(mutex1)
65
When is a code section a critical section?
Thread 0
a = a + c;
b = b + c;
Thread 1
a = a + c;
b = b + c;
Thread 2
c = a + b;
66
When is a code section a critical section?
Thread 0
P(mutexa)
a = a + c;
V(mutexa)
P(mutexb)
b = b + c;
V(mutexb)
Thread 1
P(mutexa)
a = a + c;
V(mutexa)
P(mutexb)
b = b + c;
V(mutexb)
Thread 2
?
c = a + b;
?
67
When is a code section a critical section?
Thread 0
P(mutexa)
a = a + c;
V(mutexa)
P(mutexb)
b = b + c;
V(mutexb)
Thread 1
P(mutexa)
a = a + c;
V(mutexa)
P(mutexb)
b = b + c;
V(mutexb)
Thread 2
P(mutexa)
P(mutexb)
c = a + b;
V(mutexa)
V(mutexb)
68
When is a code section a critical section?
Thread 0
P(mutex);
a = a + c;
b = b + c;
V(mutex);
Thread 1
P(mutex);
a = a + c;
b = b + c;
V(mutex);
Thread 2
P(mutex);
c = a + b;
V(mutex);
69
Classic Problems
There are a number of “classic” problems, each of which represents a class of synchronization situations
• Locks and condition variables
  Lock::Acquire
  Lock::Release
  Condition::Wait (conditionLock)
  Condition::Signal (conditionLock)
  Condition::Broadcast (conditionLock)
86
Design Decisions / Issues
• Locking overhead (granularity)
• Broadcast vs. Signal
• Nested lock/condition variable problem

LOCK a DO
  LOCK b DO
    while (not_ready) wait (b, c)   // releases b, not a
  END
END

• My advice – correctness first!
87
Lock Granularity – how much should one lock protect?

[Figure, shown twice: a sorted linked list (head, tail; nodes 2, 4, 6, 8, 10, with a node 3 being inserted) operated on by threads A and B – contrasting one lock for the whole list with finer-grained locking]

Concurrency vs. overhead
Complexity threatens correctness
89
Using Condition Variables

while (! required_conditions) wait (m, c);

• Why we use “while”, not “if” – the invariant is not guaranteed to hold when the waiter wakes up, so it must be rechecked.
• Why use broadcast vs. signal – can arise if we are using one condition queue for many reasons. Waking threads have to sort it out (spurious wakeups). Possibly better to separate into multiple conditions (but more complexity to code).
– Message passing – communication vs. synchronization
• Administrative details:
  – Signing up for demos – demo scheduler
  – Rules for demos:
    • Four groups should sign up to meet with me, two with Varun.
    • Try to demo some labs with each of us.
    • Each member is expected to be prepared to contribute to each demo.
  – Ask your grader where the demo will be held if they don’t post to the newsgroup.
    • For me, my office – LSRC D336.
91
5 Dining Philosophers

[Figure: Philosophers 0–4 seated around a table, one fork between each adjacent pair]

Each philosopher:
while (food available)
{ pick up 2 adj. forks;
  eat;
  put down forks;
  think awhile;
}
92
Template for Philosopher

while (food available)
{ /*pick up forks*/
  eat;
  /*put down forks*/
  think awhile;
}
93
Naive Solution

while (food available)
{ P(fork[left(me)]);
  P(fork[right(me)]);
  eat;
  V(fork[left(me)]);
  V(fork[right(me)]);
  think awhile;
}

Does this work?
94
Simplest Example of Deadlock

R1 and R2 initially 1 (binary semaphores)

Thread 0:       Thread 1:
P(R1)           P(R2)
P(R2)           P(R1)
V(R1)           V(R2)
V(R2)           V(R1)

Interleaving:
  Thread 0: P(R1)
  Thread 1: P(R2)
  Thread 0: P(R2)  waits
  Thread 1: P(R1)  waits
95
Conditions for Deadlock
• Mutually exclusive use of resources
  – Binary semaphores R1 and R2
• Circular waiting
  – Thread 0 waits for Thread 1 to V(R2) and Thread 1 waits for Thread 0 to V(R1)
• Hold and wait
  – Holding either R1 or R2 while waiting on the other
• No pre-emption
  – Neither R1 nor R2 is forcibly removed from its respective holding thread.
96
Philosophy 101 (or why 5DP is interesting)
• How to eat with your Fellows without causing Deadlock.
  – Circular arguments (the circular wait condition)
  – Not giving up on firmly held things (no preemption)
  – Infinite patience with half-baked schemes (hold some & wait for more)
• Why Starvation exists and what we can do about it.
97
Dealing with Deadlock
It can be prevented by breaking one of the prerequisite conditions:
• Mutually exclusive use of resources
  – Example: Allowing shared access to read-only files (readers/writers problem)
• Circular waiting
  – Example: Define an ordering on resources and acquire them in order
• Hold and wait
• No pre-emption
98
while (food available)
{ if (me == 0) {P(fork[left(me)]); P(fork[right(me)]);}
  if (blocking[leftneighbor(me)] || blocking[rightneighbor(me)]) wakeup( );
  forkMutex.Release( );
  think awhile;
}
103
Readers/Writers Problem
Synchronizing access to a file or data record in a database such that any number of threads requesting read-only access are allowed but only one thread requesting write access is allowed, excluding all readers.