
Jul 05, 2020


dariahiddleston

Università degli studi di Udine Sistemi operativi – Operating Systems

Synchronization

• Synchronization primitives
  • HW primitives
    • Atomic operations
  • Low-level synchronization primitives
    • Exclusive locks, rwlocks, seq. locks, non-blocking data structures
    • Locking strategies and issues
  • High-level synchronization primitives
• Synchronization patterns
• Classical problems
• Deadlock management

Concurrency

• Multiple applications (multiprogramming)
  • independent applications
    • processes unaware of others
    • competition on shared resources
  • cooperating applications
    • processes indirectly aware of others
    • cooperation by sharing resources
    • synchronization
• Parallel applications
  • processes/threads directly aware of others
  • cooperation by communication (messages or shared variables)
  • synchronization

Concurrency issues

• Race conditions
  • final results depend on execution order
• Starvation
  • some task waits indefinitely
• Deadlock
  • a circular waiting dependency prevents work from proceeding

Race condition

• Results depend on the order of execution

Shared vars: a=1 ; b=2

    process A:  a=a+b
    process B:  b=a+b

Possible results (a=? ; b=?):
  a=3 ; b=5   (A runs before B)
  a=4 ; b=3   (B runs before A)
  a=3 ; b=3   (both read the initial values before either one writes)

Race condition

• Results depend on the order of execution

Shared var: count=0

    process A:            process B:
    local tmpA            local tmpB
    tmpA=count            tmpB=count
    tmpA=tmpA+1           tmpB=tmpB+1
    count=tmpA            count=tmpB

Possible results (count=?):
  count=2   OK
  count=1   NO (both tasks read count=0, so one increment is lost)
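The two outcomes can be reproduced deterministically by replaying a chosen interleaving of the three steps of each process. The sketch below is only an illustration: the names `struct task`, `step` and `replay`, and the encoding of a schedule as a string of `A` and `B` letters, are assumptions made for this example and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* One process of the slide: a private temporary and a program counter. */
struct task { int tmp; int pc; };

/* Execute the next of the three steps of a task on the shared counter. */
static void step(struct task *t, int *count)
{
    switch (t->pc++) {
    case 0: t->tmp = *count;     break;   /* tmp = count   */
    case 1: t->tmp = t->tmp + 1; break;   /* tmp = tmp + 1 */
    case 2: *count = t->tmp;     break;   /* count = tmp   */
    default: assert(!"task already finished");
    }
}

/* Replay a schedule such as "AAABBB": each letter names the process
 * that runs its next step. Returns the final value of count. */
int replay(const char *schedule)
{
    struct task a = {0, 0}, b = {0, 0};
    int count = 0;

    for (size_t i = 0; schedule[i] != '\0'; i++)
        step(schedule[i] == 'A' ? &a : &b, &count);
    return count;
}
```

With this encoding, `replay("AAABBB")` runs the two processes one after the other, while `replay("ABABAB")` lets both read `count` before either writes it back.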

Mutual exclusion

• A group of instructions must be executed atomically

Shared var: count=0

    process A:                    process B:
    local tmpA                    local tmpB
    BeginSection / Lock           BeginSection / Lock
        tmpA=count                    tmpB=count
        tmpA=tmpA+1                   tmpB=tmpB+1
        count=tmpA                    count=tmpB
    EndSection / Unlock           EndSection / Unlock

The instructions between BeginSection / Lock and EndSection / Unlock form the critical section.

Result: count=2

Starvation

[Figure: two snapshots of a ready queue feeding the CPU. First snapshot: ready processes D C B A, E in RUN. Second snapshot: ready processes D E C B, A in RUN.]

Deadlock

[Figure: Task A, Task B, Task C and Task D arranged in a cycle, connected by four "wait" arrows.]

• Wrong synchronization!
• System is blocked!

Synchronization primitives

Synchronization

• HW primitives
  • processor instructions
  • usually not privileged
• Low-level synchronization primitives
  • built on top of HW primitives
  • do not require scheduler intervention
    • can be implemented at user level
• High-level synchronization primitives
  • built on top of low-level primitives
  • interact with scheduler
    • from user level, imply syscalls

Synchronization primitives
HW primitives

HW primitives

• Atomic Read, atomic Write
  • not practical
    • requires N accesses to synchronize N tasks
    • requires unique IDs for tasks
• Atomic Read-Modify-Write
  • allows implementing simple spin-locks
    • implementation independent of involved tasks
    • no "unique IDs" requirement
  • clever implementations can reduce contention
    • ticket, array-based, queue-based locks
• Atomic Read-Test-Modify-Write
  • allows wait-free synchronization
• Load-Link and Store-Conditional
  • do not require a double memory access in a single instruction

HW primitives

• Atomic Read-Modify-Write
  • minimal feature to implement practical locks
  • Test-and-Set
  • Read-and-Increment
    • x86: lock xadd
  • Exchange
    • x86: xchg
    • ARM: swp
      • deprecated since ARMv6
  • Others:
    • fetch_and_sub, fetch_and_or, ...

Pseudo-code (each function body executes atomically):

    int Test-and-Set(int *ptr)
    {
        int old = *ptr;
        *ptr = 1;
        return old;
    }

    int Read-and-Increment(int *ptr, int inc)
    {
        int old = *ptr;
        *ptr = old + inc;
        return old;
    }

    int Exchange(int *ptr, int new)
    {
        int old = *ptr;
        *ptr = new;
        return old;
    }
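In portable C these three operations are available through C11 `<stdatomic.h>`. The wrappers below are a minimal sketch of that mapping; the wrapper names are chosen for this example, and which machine instruction the compiler emits for each call depends on the target.

```c
#include <assert.h>
#include <stdatomic.h>

/* Test-and-Set: set the flag, return its previous state (0 or 1). */
int test_and_set(atomic_flag *flag)
{
    assert(flag);
    return atomic_flag_test_and_set(flag) ? 1 : 0;
}

/* Read-and-Increment: add inc, return the value held before the addition. */
int read_and_increment(atomic_int *ptr, int inc)
{
    assert(ptr);
    return atomic_fetch_add(ptr, inc);
}

/* Exchange: store newval, return the value it replaced. */
int exchange(atomic_int *ptr, int newval)
{
    assert(ptr);
    return atomic_exchange(ptr, newval);
}
```

An `atomic_flag` must be initialized with `ATOMIC_FLAG_INIT`; all three calls use the default sequentially consistent ordering.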

HW primitives

• Atomic Read-Test-Modify-Write
  • allows wait-free and lock-free synchronization
  • Compare-and-Exchange or Compare-and-Swap (CAS)
    • x86: lock cmpxchg

Pseudo-code (the function body executes atomically):

    int Compare-Exchange(int *ptr, int testval, int new)
    {
        int old = *ptr;
        if (old == testval)
            *ptr = new;
        return old;
    }
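A hedged C11 sketch of the same interface follows. `atomic_compare_exchange_strong` reports success as a boolean and writes the observed value back through its second argument, so the wrapper recovers the "return the old value" convention of the pseudo-code. The function `atomic_max` is a hypothetical example of the usual CAS retry loop and is not part of the slides.

```c
#include <assert.h>
#include <stdatomic.h>

/* Compare-Exchange with the interface of the pseudo-code:
 * returns the value found in *ptr. */
int compare_exchange(atomic_int *ptr, int testval, int newval)
{
    assert(ptr);
    int expected = testval;
    /* On failure the call copies the current value into expected;
     * on success expected still equals testval, i.e. the old value. */
    atomic_compare_exchange_strong(ptr, &expected, newval);
    return expected;
}

/* Typical lock-free use: retry until the update is applied to an
 * unchanged value. Here: atomically raise *ptr to at least val. */
void atomic_max(atomic_int *ptr, int val)
{
    int old = atomic_load(ptr);

    while (old < val) {
        int seen = compare_exchange(ptr, old, val);
        if (seen == old)
            break;          /* our write went through */
        old = seen;         /* somebody else changed it: test again */
    }
}
```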

HW primitives

• Load-Link and Store-Conditional
  • do not require a double memory access in a single instruction
  • MIPS:
    • ll, sc
  • ARM:
    • ldrex, strex

Pseudo-code (each function body executes atomically):

    int LL(int *ptr)
    {
        remember this access
        return *ptr;
    }

    int SC(int *ptr, int val)
    {
        if (this cpu has executed LL on ptr) {
            if (*ptr written since the last LL performed by this cpu)
                return SC_FAILURE;      /* fail */
            else {                      /* *ptr has not changed */
                *ptr = val;
                return SC_SUCCESS;      /* success */
            }
        }
        unspecified behavior
    }

HW primitives

• Load-Link and Store-Conditional
  • a LL / Modify / SC sequence behaves as an atomic Read-Modify-Write

    PROCESSOR A                          PROCESSOR B
    LL x                                 LL x
    Modify x                             Modify x
    SC x                                 SC x
    failure → the operations were        success → the operations were
    not atomic: retry                    atomic: go on
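C has no direct LL/SC operations, so the retry pattern can only be approximated in portable code. The sketch below swaps in C11 `atomic_compare_exchange_weak`, which is allowed to fail spuriously so that compilers can implement it with an LL/SC pair on machines that provide one. The function name `fetch_and_add_retry` is an assumption made for this example.

```c
#include <assert.h>
#include <stdatomic.h>

/* fetch_and_add built from a retry loop: read, modify, conditional store. */
int fetch_and_add_retry(atomic_int *ptr, int inc)
{
    assert(ptr);
    int old = atomic_load(ptr);             /* plays the role of LL */

    /* The weak CAS plays the role of SC: it fails if *ptr no longer
     * holds old (or spuriously), and then reloads old for the retry. */
    while (!atomic_compare_exchange_weak(ptr, &old, old + inc))
        ;                                   /* not atomic: retry */
    return old;                             /* atomic: go on */
}
```

Unlike a real SC, a CAS succeeds whenever the value matches, even if it was changed and restored in between; this difference does not matter for a plain addition.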

HW primitives: summary

• Atomic accesses:
  • Read-Modify-Write operations
    • swap, fetch_and_add, fetch_and_sub, fetch_and_or, fetch_and_and, ...
      • perform the operation suggested by the name, and return the old value
    • add_and_fetch, sub_and_fetch, or_and_fetch, and_and_fetch, ...
      • perform the operation suggested by the name, and return the new value
  • Read-Test-Modify-Write operations
    • compare_and_swap

Operation costs

• Typical values:
  • Best-case atomic increment: 50 – 100 cycles
  • Best-case Compare-and-Exchange: 50 – 100 cycles
    • CAS on a variable in cache
  • Memory barrier: 100 – 150 cycles
  • Single cache miss: 200 – 300 cycles
  • Compare-and-Exchange cache miss: 500 – 1000 cycles

Synchronization primitives
Low-level synchronization primitives

Low-level synchronization primitives

• Exclusive locks
• Reader-Writer locks
• Sequential locks
• Non-blocking data structures

Low-level synchronization primitives

• Exclusive locks
  • allow only one task to proceed, others must wait
  • several tasks compete to acquire a lock
    • only one wins (acquires the lock)
    • others wait until the lock is released
  • e.g.,
    • entering a critical section → lock acquisition
    • exiting the critical section → lock release

Exclusive lock

• Binary variable
  • States:
    • locked (or acquired, or held),
    • unlocked (or free, or available)
• Operations
  • lock(lock_var)
      if lock_var is unlocked then lock_var becomes locked
      else the calling task cannot proceed until lock_var becomes unlocked
  • unlock(lock_var)
      lock_var becomes unlocked
      note: unlock should be called by the task that holds the lock

Exclusive lock implementations

• Classical locking algorithms
  • Dekker's algorithm
  • Peterson's algorithm
  • Lamport's bakery algorithm
• Spinlocks
  • Polling on a variable
    • Basic implementation
    • Ticket spinlock
    • Array spinlock

Exclusive lock implementations

• Implementation
  • based on atomic Read and atomic Write
    • Classical locking algorithms
      • Dekker's algorithm
      • Peterson's algorithm
      • Lamport's bakery algorithm
  • based on atomic Read-Modify-Write, or atomic Read-Test-Modify-Write, or Load-Link and Store-Conditional
    • spinlock

Callouts:
  • Needs unique task id (no id reuse)
  • Must read N memory locations
  • Requires sequential consistency
  • Limited to 2 tasks
  • Requires processor consistency

Classical locking algorithms

• Lock algorithm properties
  • Mutual exclusion (safety property)
    • critical sections of different threads do not overlap
    • cannot guarantee integrity of computation without this property!
  • No deadlock
    • if someone attempts to acquire the lock, then someone will acquire it
    • does not imply deadlock-free programs
  • No starvation
    • every thread that attempts to acquire the lock eventually succeeds
    • implies no deadlock
    • desirable but not essential
      • practical locks: many permit starvation, if it is unlikely to occur
    • without a real-time guarantee, starvation freedom is weak

Classical locking algorithms

• Dekker's algorithm (1964)
  • for 2 tasks
• Peterson's algorithm (1981)
  • for 2 tasks
  • generalizable to N tasks (filter algorithm)
• Lamport's "bakery" algorithm (1974)
  • for N tasks

Classical locking algorithms

• Use atomic load and store only, no stronger atomic primitives
• Not used in practice
  • locks based on stronger atomic primitives are more efficient
• Why study classical lock algorithms?
  • understand the principles underlying synchronization
    • ubiquitous in parallel programs
  • appreciate their subtlety
  • motivate the need for stronger hardware primitives

Wrong algorithm - 1

    #define N 2     /* number of processes */
    int flag[N];    /* initialized to all 0s */

    void lock(int process /* 0 or 1 */)
    {
        int other = 1 - process;
        flag[process] = 1;          /* I'm interested */
        while (flag[other] == 1)
            ;                       /* wait */
    }

    void unlock(int process /* 0 or 1 */)
    {
        flag[process] = 0;
    }

Wrong algorithm - 1

Task A (process=0) and task B (process=1) both execute:

    int other = 1 - process;
    flag[process] = 1;
    while (flag[other] == 1) ;  /* wait */
    ...
    critical section
    ...
    flag[process] = 0;          /* unlock */

• OK when the lock attempts do not overlap: the task that arrives second waits until the first one unlocks
• Deadlock when both tasks set their flag before either one tests the other's flag: each sees the other as locked and waits forever

Wrong algorithm - 2

    #define N 2     /* number of processes */
    int turn = 0;   /* who has reserved access */

    void lock(int process /* 0 or 1 */)
    {
        turn = process;             /* other goes first */
        while (turn == process)
            ;                       /* wait */
    }

    void unlock(int process /* 0 or 1 */)
    {
    }

Wrong algorithm - 2

Task A (process=0) and task B (process=1) both execute, repeatedly:

    turn = process;
    while (turn == process) ;   /* wait */
    ...
    critical section
    ...

• OK as long as the tasks keep alternating: a waiting task is unlocked only when the other task calls lock() and overwrites turn
• A task that calls lock() while the other one never does waits forever

Dekker's algorithm

    #define N 2     /* number of processes */
    int flag[N];    /* initialized to all 0s */
    int turn = 0;   /* who has reserved access */

    void lock(int process /* 0 or 1 */)
    {
        int other = 1 - process;
        flag[process] = 1;
        while (flag[other]) {
            flag[process] = 0;
            while (turn != process)
                ;                   /* wait */
            flag[process] = 1;
        }
    }

    void unlock(int process /* 0 or 1 */)
    {
        turn = 1 - process;         /* other */
        flag[process] = 0;
    }

Peterson's algorithm

    #define N 2     /* number of processes */
    int flag[N];    /* initialized to all 0s */
    int turn = 0;   /* who wrote turn last must wait */

    void lock(int process /* 0 or 1 */)
    {
        int other = 1 - process;
        flag[process] = 1;          /* I'm interested */
        turn = process;             /* other goes first */
        while (turn == process && flag[other] == 1)
            ;                       /* wait */
    }

    void unlock(int process /* 0 or 1 */)
    {
        flag[process] = 0;
    }
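With plain `int` variables, a modern compiler or processor may reorder the store to `flag[process]` and the load of `flag[other]`, which breaks the algorithm. A minimal sketch that restores the required ordering with C11 sequentially consistent atomics follows; the function names `peterson_lock` and `peterson_unlock` are chosen for this example.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int flag[2];   /* zero-initialized: nobody is interested */
static atomic_int turn;      /* the task that wrote turn last must wait */

void peterson_lock(int process /* 0 or 1 */)
{
    assert(process == 0 || process == 1);
    int other = 1 - process;

    /* atomic_store and atomic_load default to memory_order_seq_cst,
     * so these accesses cannot be reordered with each other. */
    atomic_store(&flag[process], 1);        /* I'm interested */
    atomic_store(&turn, process);           /* other goes first */
    while (atomic_load(&turn) == process &&
           atomic_load(&flag[other]) == 1)
        ;                                   /* wait */
}

void peterson_unlock(int process /* 0 or 1 */)
{
    atomic_store(&flag[process], 0);
}
```

Each of the two tasks calls `peterson_lock` with its own index before the critical section and `peterson_unlock` with the same index after it.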

Peterson's algorithm

Task A (process=0) and task B (process=1) both execute:

    other = 1-process;
    flag[process] = 1;
    turn = process;
    while (turn == process && flag[other] == 1) ;
    ...
    critical section
    ...
    flag[process] = 0;          /* unlock */

• OK when the lock attempts do not overlap: the task that arrives second waits until the first one unlocks
• OK when the lock attempts overlap: the task that writes turn last waits, the other one enters the critical section and unlocks it on exit

Lamport's bakery algorithm

• On arrival get an (incremental) ticket
• The bakery serves whoever has the smallest ticket

[Figure: arriving tasks take tickets 10, 11, 12, 13; waiting tasks hold tickets 7, 8, 9; the served task holds ticket 6 ("now serving: 6").]

Lamport's bakery algorithm

    int flag[N];    /* initialized to all 0s */
    int ticket[N];  /* initialized to all 0s */

    void lock(int process)
    {
        int j;
        flag[process] = 1;      /* I'm getting my ticket */
        ticket[process] = 1 + max(ticket[0], ..., ticket[N-1]);
        flag[process] = 0;
        for (j = 0; j < N; j++) {
            while (flag[j])
                ;   /* wait if task-j is getting its ticket */
            /* Wait for threads with higher priority */
            while (ticket[j] != 0 &&
                   ((ticket[j] < ticket[process]) ||
                    ((ticket[j] == ticket[process]) && j < process)))
                ;   /* wait */
        }
    }

    void unlock(int process)
    {
        ticket[process] = 0;
    }

Observations

• Bakery algorithm is concise, elegant and fair
• Why is it not practical?
  • must read N distinct locations (N could be very large)
  • threads must be assigned unique IDs between 0 and N-1
    • awkward for dynamic threads
  • value of a label is monotonically increasing and unbounded
• Can there be a cleverer lock, using only atomic load/store, that avoids these problems?
  • No. Any deadlock-free algorithm requires reading or writing at least N distinct locations in the worst case.

Synchronization primitives
Low-level synchronization primitives
Spinlocks

Spinlocks

• Repeatedly check the lock variable
  • loop while it is locked
  • at kernel level: disable preemption
    • Note: on uni-processor systems, just disable preemption
  • clever implementations can reduce contention
    • ticket locks
    • array-based locks
    • queue-based locks
  • whenever possible, the processor is put into a low-power state while waiting
    • e.g., with a wfe or wfi on ARM

Spinlock implementation (example)

Leveraging Test-and-Set:

    void lock(int *lck)
    {
        while (Test-and-Set(lck) == 1) {
            continue;   /* wait! */
        }
        /* memory barrier if needed */
    }

    void unlock(int *lck)
    {
        /* memory barrier if needed */
        *lck = 0;
    }

Leveraging Exchange:

    void lock(int *lck)
    {
        while (Exchange(lck, 1) == 1) {
            continue;   /* wait! */
        }
        /* memory barrier if needed */
    }

    void unlock(int *lck)
    {
        /* memory barrier if needed */
        *lck = 0;
    }

Spinlock implementation (example)

Leveraging Test-and-Set (reducing communication):

    void lock(int *lck)
    {
        while ((*lck == 1) || (Test-and-Set(lck) == 1)) {
            continue;   /* wait! */
        }
        /* memory barrier if needed */
    }

    void unlock(int *lck)
    {
        /* memory barrier if needed */
        *lck = 0;
    }

Leveraging Exchange (reducing communication):

    void lock(int *lck)
    {
        do {
            while (*lck == 1) {
                continue;   /* wait! */
            }
        } while (Exchange(lck, 1) == 1);
        /* memory barrier if needed */
    }

    void unlock(int *lck)
    {
        /* memory barrier if needed */
        *lck = 0;
    }
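The "test before test-and-set" idea can be sketched in portable C11 as follows. This is an illustration under assumptions, not the course's reference code: the type and function names are invented here, and the "memory barrier if needed" comments of the pseudo-code are expressed as acquire ordering on the exchange and release ordering on the unlocking store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int locked; } spinlock_t;   /* 0 = free, 1 = held */

void spin_init(spinlock_t *l)
{
    atomic_init(&l->locked, 0);
}

/* One attempt: true if the lock was free and now belongs to the caller. */
bool spin_trylock(spinlock_t *l)
{
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

void spin_lock(spinlock_t *l)
{
    for (;;) {
        /* Spin on plain loads first: they are served by the local cache
         * and cause no coherence traffic until the holder writes. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) == 1)
            ;                                       /* wait! */
        if (spin_trylock(l))
            return;
    }
}

void spin_unlock(spinlock_t *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The extra `spin_trylock` is not in the slides; it is included because it makes the lock state observable from a single thread.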

Spinlock implementation (example)

Using LL and SC:

    void lock(int *lck)
    {
        /*
         * write code here
         */

        /* memory barrier if needed */
    }

    void unlock(int *lck)
    {
        /* memory barrier if needed */
        *lck = 0;
    }

Spinlock implementation (ticket)

• Same principle as Lamport's bakery algorithm
  • arriving tasks get a ticket
    • atomically
  • there is a global indicator: the current turn
  • each task waits until the current turn is equal to its own ticket
  • a leaving task increments the current turn

Spinlock implementation (ticket)

    typedef struct {
        int next;       /* next available ticket */
        int current;    /* current turn */
    } lock_t;

    void init_lock(lock_t *lck)
    {
        lck->next = lck->current = 0;
    }

    void lock(volatile lock_t *lck)
    {
        int myturn;

        myturn = /* get a ticket: must be an atomic operation */

        /* wait until it's my turn */
    }

    void unlock(lock_t *lck)
    {
        /* increment current turn
           (atomicity not required: only one task here) */
    }

Spinlock implementation (ticket)

    typedef struct {
        int next;
        int current;
    } lock_t;

    void init_lock(lock_t *lck)
    {
        lck->next = lck->current = 0;
    }

    void lock(volatile lock_t *lck)
    {
        int myturn;

        /* atomically acquire the current value of the field next
           and store a new value for it */
        myturn = fetch_and_add(&lck->next, 1);

        /* loop until the field current becomes equal to myturn;
           lck must be volatile for this test */
        while (myturn != lck->current)
            continue;   /* wait! */
        /* memory barrier */
    }

    void unlock(lock_t *lck)
    {
        /* memory barrier */
        lck->current++;
        /* memory barrier, for efficiency: the previous write becomes
           visible on all CPUs as soon as possible, so spinning is reduced */
    }
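A possible C11 rendering of the ticket lock is sketched below. The names `ticket_lock_t`, `ticket_init`, `ticket_lock` and `ticket_unlock` are assumptions made for this example; unsigned counters are used so that wrap-around is well defined, and the memory barriers of the pseudo-code are replaced by acquire and release orderings.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
    atomic_uint next;       /* next available ticket */
    atomic_uint current;    /* current turn */
} ticket_lock_t;

void ticket_init(ticket_lock_t *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->current, 0);
}

void ticket_lock(ticket_lock_t *l)
{
    /* Take a ticket: the only read-modify-write of the acquisition. */
    unsigned my = atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

    /* Wait for the turn; acquire pairs with the release in unlock. */
    while (atomic_load_explicit(&l->current, memory_order_acquire) != my)
        ;                                           /* wait! */
}

void ticket_unlock(ticket_lock_t *l)
{
    /* Only the holder writes current, so a plain load followed by a
     * release store is enough: no atomic increment is required. */
    unsigned cur = atomic_load_explicit(&l->current, memory_order_relaxed);

    assert(cur != atomic_load_explicit(&l->next, memory_order_relaxed));
    atomic_store_explicit(&l->current, cur + 1, memory_order_release);
}
```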

Spinlock implementation (ticket)

With a back-off delay (lock_t and init_lock() are unchanged):

    void lock(volatile lock_t *lck)
    {
        int myturn;

        myturn = fetch_and_add(&lck->next, 1);

        /* to reduce contention (delay can be a simple empty loop) */
        while (myturn != lck->current)
            delay(myturn - lck->current);
        /* memory barrier */
    }

    void unlock(lock_t *lck)
    {
        /* memory barrier */
        lck->current++;
        /* memory barrier */
    }

Spinlock implementation (ticket)

With sleeping (lock_t and init_lock() are unchanged):

    void lock(volatile lock_t *lck)
    {
        int myturn;

        myturn = fetch_and_add(&lck->next, 1);

        /* to reduce contention (if there is architectural support) */
        while (myturn != lck->current)
            wait_for_event();
        /* memory barrier */
    }

    void unlock(lock_t *lck)
    {
        /* memory barrier */
        lck->current++;
        /* memory barrier (needed) */
        send_event();
    }

Spinlock implementation (ticket)

(lock_t and init_lock() are unchanged)

    void lock(volatile lock_t *lck)
    {
        int myturn;

        /* implement fetch_and_add using LL and SC */

        while (myturn != lck->current)
            continue;   /* wait! */
        /* memory barrier */
    }

    void unlock(lock_t *lck)
    {
        /* memory barrier */
        lck->current++;
        /* memory barrier */
    }

Spinlock implementation (ticket)

• Only 1 atomic instruction executed per lock acquisition
• Fair, locks granted in order of request: no starvation
• Back-off delay proportional to position in queue
  • if the time in the critical section is constant, the delay can be calculated such that the subsequent test of lck->current will just succeed
• Polling on a single shared location
  • bus traffic with an invalidate cache coherency protocol (e.g., MESI)
  • delay not necessary with a write-update protocol (e.g., Firefly)

Spinlock implementation (array-based)

• Each task must poll a different location
  • arriving tasks get an index
    • atomically
  • each task waits until its own slot becomes free
  • a leaving task unlocks the following one

Spinlock implementation (array-based)

[Figure: an array of flags, all Locked except one Unlocked entry, with an index idx into the array. Task 1: release lock. Task 2: acquired lock. Task 3: waiting.]

Spinlock implementation (array-based)

    typedef struct {
        int flags[N];   /* should be padded: each element in a different
                           cache line; N must be a power of 2 */
        int queuelast, winner_idx;
    } lock_t;

    void init_lock(lock_t *lck)
    {
        int i;
        lck->flags[0] = HAS_LOCK;
        for (i = 1; i < N; i++)
            lck->flags[i] = MUST_WAIT;
        lck->queuelast = 0;
    }

    void lock(volatile lock_t *lck)
    {
        int myplace;

        /* get a location to poll (each task obtains a different index) */
        myplace = fetch_and_add(&lck->queuelast, 1);

        while (lck->flags[myplace % N] == MUST_WAIT)
            continue;
        /* only one task here: record the index used (needed in unlock) */
        lck->winner_idx = myplace;
        /* memory barrier */
    }

    void unlock(lock_t *lck)
    {
        /* memory barrier */
        lck->flags[lck->winner_idx % N] = MUST_WAIT;
        /* allows the next waiting task to proceed */
        lck->flags[(lck->winner_idx + 1) % N] = HAS_LOCK;
        /* memory barrier */
    }
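The array-based lock can be sketched in C11 as shown below. Several details are assumptions made for this example: the names, the slot count `NSLOTS`, and the 64-byte alignment used to place each flag in its own cache line. As in the pseudo-code, the sketch is only valid if at most `NSLOTS` tasks use the lock at the same time.

```c
#include <assert.h>
#include <stdatomic.h>

#define NSLOTS 8u   /* assumed bound on concurrent tasks; a power of 2 */

typedef struct {
    /* One flag per slot, aligned so that each one sits in its own cache
     * line (64 bytes is an assumption about the target machine). */
    struct { _Alignas(64) atomic_int has_lock; } slot[NSLOTS];
    atomic_uint queuelast;  /* next free place in the queue */
    unsigned winner;        /* slot of the holder; touched only by it */
} array_lock_t;

void array_init(array_lock_t *l)
{
    atomic_init(&l->slot[0].has_lock, 1);           /* first comer proceeds */
    for (unsigned i = 1; i < NSLOTS; i++)
        atomic_init(&l->slot[i].has_lock, 0);       /* must wait */
    atomic_init(&l->queuelast, 0);
    l->winner = 0;
}

void array_lock(array_lock_t *l)
{
    /* Get a location to poll: each task obtains a different slot. */
    unsigned my = atomic_fetch_add_explicit(&l->queuelast, 1,
                                            memory_order_relaxed) % NSLOTS;

    while (!atomic_load_explicit(&l->slot[my].has_lock, memory_order_acquire))
        ;                                           /* poll a private slot */
    l->winner = my;                                 /* needed by unlock */
}

void array_unlock(array_lock_t *l)
{
    unsigned my = l->winner;

    assert(atomic_load_explicit(&l->slot[my].has_lock, memory_order_relaxed));
    atomic_store_explicit(&l->slot[my].has_lock, 0, memory_order_relaxed);
    /* Pass the lock on: the next waiting task may proceed. */
    atomic_store_explicit(&l->slot[(my + 1) % NSLOTS].has_lock, 1,
                          memory_order_release);
}
```

Because `NSLOTS` is a power of 2, the slot sequence stays contiguous when the unsigned counter `queuelast` wraps around.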

Spinlock implementation (array-based)

• Tasks do not poll a single shared location
  • reduced bus traffic for a write-invalidate cache coherency protocol
• Lock is passed from a task to the next
  • through a shared slot in an array
  • this slot is not shared with any other thread
• Only 1 atomic instruction executed per lock acquisition
• Fair, lock is granted in order of request: no starvation
• Need to know max number of threads

Spinlocks

• Applicable to any number of tasks
• Applicable to any number of processors (shared memory)
• Simple
  • thus easy to verify
• Support multiple critical sections
  • each critical section is identified by its own lock variable

Spinlocks

• Process waits by executing a loop
  • Can be implemented at user level
    • no syscalls are required by user level code
  • CPU time is wasted

Synchronization primitives
Low-level synchronization primitives
Reader-writer locks

Low-level synchronization primitives

• Reader-Writer locks
  • 2 categories of tasks: readers and writers
    • readers can proceed concurrently
    • a writer must have exclusive access
  • increase parallelism
    • readers advance in parallel
    • a new reader can proceed if other readers are accessing data
  • tasks must be specialized
    • readers do not modify data!
  • writers may starve
    • a writer must wait until there are no more readers, but a new reader can steal the waiting writer's turn
    • give priority to writers → increased complexity (thus overhead)

Reader-Writer lock

• 3-state variable
  • States:
    • unlocked
    • reader_locked
    • writer_locked
• Operations
  • read_lock(lock_var)
      if lock_var is writer_locked the calling task cannot proceed
      else lock_var becomes (or stays) reader_locked
  • write_lock(lock_var)
      if lock_var is unlocked then lock_var becomes writer_locked
      else the calling task cannot proceed
  • unlock(lock_var)
      lock_var is reader_locked and no more readers → lock_var becomes unlocked
      lock_var is writer_locked → lock_var becomes unlocked

RWlock implementation (example)

    const int W = 1;
    const int R = 2;

    typedef int lock_t;

    void init_lock(lock_t *lck)
    {
        *lck = 0;
    }

    void read_lock(volatile lock_t *lck)
    {
        fetch_and_add(lck, R);
        while (*lck & W)
            continue;
    }

    void write_lock(lock_t *lck)
    {
        while (CAS(lck, 0, W) != 0)
            continue;
    }

    void read_unlock(lock_t *lck)
    {
        fetch_and_add(lck, -R);
    }

    void write_unlock(lock_t *lck)
    {
        fetch_and_add(lck, -W);
    }

• Simple
• Not efficient
  • Polling CAS
• Not fair
  • Readers are preferred
  • Writers can starve
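The same reader-preferring scheme can be written with C11 atomics, as in the following sketch. The `rw_` prefixed names and the additional `rw_write_trylock` are assumptions introduced for this example; the encoding (bit 0 for the writer, the remaining bits for the reader count) follows the constants W and R of the slide.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { W = 1, R = 2 };          /* bit 0: writer flag; upper bits: readers */
typedef atomic_int rwlock_t;    /* initialize to 0 (unlocked) */

void rw_read_lock(rwlock_t *lck)
{
    atomic_fetch_add(lck, R);           /* announce the reader */
    while (atomic_load(lck) & W)        /* wait for the writer to leave */
        ;
}

/* One attempt: succeeds only with no readers and no writer. */
bool rw_write_trylock(rwlock_t *lck)
{
    int unlocked = 0;
    return atomic_compare_exchange_strong(lck, &unlocked, W);
}

void rw_write_lock(rwlock_t *lck)
{
    while (!rw_write_trylock(lck))      /* polling CAS */
        ;
}

void rw_read_unlock(rwlock_t *lck)
{
    assert(atomic_load(lck) >= R);
    atomic_fetch_sub(lck, R);
}

void rw_write_unlock(rwlock_t *lck)
{
    assert(atomic_load(lck) & W);
    atomic_fetch_sub(lck, W);
}
```

A reader that arrives while a writer waits still increments the count first, so the writer's CAS keeps failing: this is the reader preference noted above.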

Reader-Writer lock

• Variants
  • more states
    • VAX/VMS Distributed Lock Manager: 6-state lock
      • states: Unlocked, Concurrent-Read, Concurrent-Write, Protected-Read, Protected-Write, Exclusive
    • DBMS: even more than 30 states!

[Table: compatibility matrix of the lock states. Result of each combination: allowed or blocked.]

Low-level synchronization primitives

• Sequential locks
  • Similar to reader-writer locks but writers have priority
    • a writer is never blocked by readers
      • writers do not starve
    • a writer is only serialized with respect to other writers
    • readers try to get data
      • operation is restarted if a conflict with a writer is detected

Example

Reader:

    do {
        seq = read_seqbegin(&test_seqlock);
        ...     /* read data */
    } while (read_seqretry(&test_seqlock, seq));

Writer:

    write_seqlock(&test_seqlock);
    ...     /* update data */
    write_sequnlock(&test_seqlock);
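How a sequence counter lets readers detect a conflicting writer can be sketched in C11 as follows. This is not the kernel implementation behind `read_seqbegin` and `write_seqlock`: the type `seq_pair_t` and both functions are hypothetical, the protected data is reduced to two integers, and serialization between writers is assumed to be provided elsewhere.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical protected data: a pair that must be read consistently. */
typedef struct {
    atomic_uint seq;    /* even: stable; odd: a write is in progress */
    atomic_int  x, y;   /* atomics, so racing reads are well defined */
} seq_pair_t;

/* Writer side. Assumes writers are already serialized among themselves
 * (for example by an exclusive lock, omitted here). */
void seq_write(seq_pair_t *p, int x, int y)
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    assert((s & 1u) == 0);                          /* no write in progress */
    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);      /* odd seq before data */
    atomic_store_explicit(&p->x, x, memory_order_relaxed);
    atomic_store_explicit(&p->y, y, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Reader side: never blocks the writer, restarts on conflict. */
void seq_read(seq_pair_t *p, int *x, int *y)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *x = atomic_load_explicit(&p->x, memory_order_relaxed);
        *y = atomic_load_explicit(&p->y, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);  /* data before recheck */
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

The reader retries when the counter was odd at the start or changed during the read, which are the two ways a concurrent write can show up.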

Synchronization primitives
Low-level synchronization primitives
Locking strategies and issues

Locking strategies

• Giant lock
  • the whole code (e.g., a library) is protected with a single lock
  • simplest approach
  • allows porting non-parallel code to parallel architectures
  • available parallelism is lost

Università degli studi di Udine Sistemi operativi – Operating Systems

Locking strategies

• Coarse-grained locking
  • code is split into subsystems
    • e.g., for an OS kernel
      • filesystems
      • memory management
      • network stack
      • video drivers
      • input drivers
      • ...
  • each subsystem is protected by its own lock
  • calls to different subsystems can proceed concurrently
  • communication between different subsystems can still require a global lock

Università degli studi di Udine Sistemi operativi – Operating Systems

Locking strategies

• Fine-grained locking
  • locks protect individual data structures
  • scalable
  • several locks must be managed
    • need to understand which locks are required
    • ordering of lock requests
    • management of a hierarchy of locks
    • rules!


Locking issues

• Deadlock
  • circular waiting dependency that prevents work from proceeding
  • tasks blocked on a lock held by a task waiting for another lock held...

• Convoying
  • set of tasks repeatedly competing for a lock
  • progression speed is limited by the slowest task
  • fast tasks are forced to slow down
  • similar to a column of cars in a single lane

Università degli studi di Udine Sistemi operativi – Operating Systems

Locking issues

• Priority inversion
  • a high-priority task (TH) is blocked on a lock held by a low-priority task (TL)
  • an independent medium-priority task (TM) is ready
  ⇒ TM is scheduled to run
  ⇒ TH effectively gets a lower priority than TM

• Workarounds
  • disable preemption while a lock is held
    • requires disabling interrupts
  • priority ceiling
    • give the highest priority to a task that holds a lock
  • priority inheritance
    • when a task blocks on a lock, its priority passes to the lock owner (if higher)
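As an illustration of the priority inheritance workaround, POSIX threads let a mutex request this protocol through its attributes. The sketch below assumes a POSIX system where the option is available; init_pi_mutex is a hypothetical helper name, and the call may return ENOTSUP where the protocol is not supported.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>      /* ENOTSUP, for callers that check the result */
#include <pthread.h>

/* Hypothetical helper: initialize m so that its owner inherits the
 * priority of the highest-priority task blocked on it.
 * Returns 0 on success or an error number (e.g. ENOTSUP). */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

The mutex is then used with the ordinary pthread_mutex_lock and pthread_mutex_unlock calls; the priority boost is applied by the system, not by the application.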

Università degli studi di Udine Sistemi operativi – Operating Systems

Locking issues

• Signal-safety
  • signal handlers (and interrupt handlers) cannot safely share locks with the rest of the code
  • e.g.,
    1. task1 holds lockA
    2. task1 is interrupted by a signal
    3. the signal handler requires lockA
    • the signal handler blocks, task1 cannot proceed
    ⇒ deadlock
  • disable signals (or interrupts) when lockA is acquired
    • not required for locks that are not used in signal handlers
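A minimal sketch of "disable signals when lockA is acquired", assuming POSIX threads and that SIGUSR1 is the signal whose handler would need lockA. The names lockA, shared_counter and update_with_signal_blocked are illustrative only; note that pthread_mutex_lock is not async-signal-safe, so a real handler should avoid it anyway.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>

static pthread_mutex_t lockA = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter;

/* Update shared_counter with SIGUSR1 blocked in this thread, so a handler
 * that needs lockA cannot interrupt the thread while the lock is held. */
int update_with_signal_blocked(int delta)
{
    sigset_t block, old;
    int value;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &block, &old);   /* disable the signal */

    pthread_mutex_lock(&lockA);
    shared_counter += delta;
    value = shared_counter;
    pthread_mutex_unlock(&lockA);

    pthread_sigmask(SIG_SETMASK, &old, NULL);   /* restore the old mask */
    return value;
}
```

A signal that arrives while it is blocked stays pending and is delivered after the mask is restored, that is, after the lock has been released.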

Università degli studi di Udine Sistemi operativi – Operating Systems

Locking issues

� Kill-tolerant availability

� tasks killed while holding a lock

� Pre-emption tolerance

� tasks pre-empted while holding a lock


Locking issues

• Overall performance
  • overhead of lock primitives
    • global communication
    • memory barriers
  • depends on lock contention
    • a non-contended lock is stored in only one CPU cache
      • still not free: memory barrier
    • contended locks bounce from one cache to another
      • cache misses
  • look for efficient algorithms
  • use specialized locks
    • e.g., reader-writer locks

Università degli studi di Udine Sistemi operativi – Operating Systems

Synchronization

Non-blocking data

structures

Synchronization primitives

Low-level synchronization primitives

Università degli studi di Udine Sistemi operativi – Operating Systems

Low-level synchronization primitives

� Non-blocking data structures

� Lock-free data structures

� e.g. lock-free linked lists

� see Linux llist

� others: buffer, stack, queue, map, snapshot

� Wait-free data structures

� much harder than lock-free

� not always possible

Università degli studi di Udine Sistemi operativi – Operating Systems

Lock- and Wait-free synchronization

• Lock-free synchronization:
  • at least one thread will make progress in finite time
  • a data structure is lock-free if and only if some operation completes after a finite number of steps system-wide have been executed on the structure

• Wait-free synchronization:
  • every thread will make progress in finite time
  • a data structure is wait-free if and only if every operation on the structure completes after it has executed a finite number of steps
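The difference between the two guarantees can be seen on a shared counter. The sketch below uses C11 atomics; the function names are illustrative. The first version is wait-free only under the assumption that the hardware provides a native atomic fetch-and-add (on LL/SC machines it is itself a retry loop).

```c
#include <stdatomic.h>

static atomic_int counter;

/* Wait-free (assuming a hardware fetch-and-add instruction):
 * every caller finishes in a bounded number of its own steps. */
int increment_wait_free(void)
{
    return atomic_fetch_add(&counter, 1) + 1;
}

/* Lock-free: a failed CAS means that some other thread succeeded, so the
 * system as a whole progresses, but this caller may retry many times. */
int increment_lock_free(void)
{
    int old = atomic_load(&counter);
    while (!atomic_compare_exchange_weak(&counter, &old, old + 1))
        ;   /* on failure, old has been reloaded with the current value */
    return old + 1;
}
```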


Lock-free stack (example)

struct NodeType {

Datatype data;

struct NodeType *next;

};

struct NodeType *Head;

void init() {

Head = NULL;

}

void push(struct NodeType *n) {

n->next = Head;

Head = n;

}

struct NodeType *pop() {

struct NodeType *n;

n = Head;

if (n != NULL)

Head = n->next;

return n;

}

Notes:
• Head points to the top of the stack; the last node points to NULL
• global data (Head) is changed in push and pop
• not concurrent: a lock is needed to make push and pop atomic
• lock-free idea: if nobody else has changed the global data, the changes are valid; otherwise, abort and retry

Università degli studi di Udine Sistemi operativi – Operating Systems

Lock-free stack (example)

struct NodeType {

Datatype data;

struct NodeType *next;

};

struct NodeType *Head;

void init() {

Head = NULL;

}

void push(struct NodeType *n) {

do {

n->next = Head;

} while (CAS(&Head, n->next, n) != n->next);

}

struct NodeType *pop() {

struct NodeType *n;

do {

n = Head;

} while (n != NULL && CAS(&Head, n, n->next) != n);

return n;

}

Lock free: push and pop retry until the CAS on Head (top of the stack) succeeds
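The CAS-based stack can be written with C11 atomics roughly as follows. This is a sketch, not the slide's CAS primitive: atomic_compare_exchange_weak returns a boolean and reloads the expected value on failure, the node carries an int payload for illustration, and the ABA problem is not addressed.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

static _Atomic(struct node *) head = NULL;

void push(struct node *n)
{
    struct node *old = atomic_load(&head);
    do {
        n->next = old;   /* link to the top we believe is current */
    } while (!atomic_compare_exchange_weak(&head, &old, n));
}

struct node *pop(void)
{
    struct node *n = atomic_load(&head);
    while (n != NULL) {
        struct node *next = n->next;
        /* succeeds only if head still equals n; otherwise n is reloaded */
        if (atomic_compare_exchange_weak(&head, &n, next))
            break;
    }
    return n;   /* NULL if the stack was empty */
}
```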

Università degli studi di Udine Sistemi operativi – Operating Systems

Lock-Free issues

• Designing generalized lock-free algorithms is hard
  ⇒ design lock-free data structures instead
    • buffer, list, stack, queue, map, deque, snapshot

• ABA problem
  • typical lock-free operation (task1):
    1. atomically read a flag (finds the value A)
    2. use data
    3. test the current value of the flag
       • if it is still A, data has not changed: OK to proceed; otherwise, repeat the operation
  • problem:
    • after task1.1, task2 stores B to the flag
    • before task1.3, task2 changes data and stores A to the flag
    ⇒ task1 is not aware of the changes
    ⇒ data inconsistency

Università degli studi di Udine Sistemi operativi – Operating Systems

Lock-free stack (example): ABA

struct NodeType {

Datatype data;

struct NodeType *next;

};

struct NodeType *Head;

void init() {

Head = NULL;

}

void push(struct NodeType *n) {

do {

n->next = Head;

} while (CAS(&Head, n->next, n) != n->next);

}

struct NodeType *pop() {

struct NodeType *n, *next;

do {

n = Head; next = (n != NULL) ? n->next : NULL;

} while (n != NULL && CAS(&Head, n, next) != n);

return n;

}

Lock free

Head

A B C NULL

top of the stack


Lock-free stack (example): ABA

TASK1 runs: n = pop();
TASK2 runs: n1 = pop(); n2 = pop(); push(n1);

TASK1:                          TASK2:
n = Head;
                                n1 = Head;
                                CAS(&Head, n1, n1->next)
                                n2 = Head;
                                CAS(&Head, n2, n2->next)
                                n1->next = Head;
                                CAS(&Head, n1->next, n1)
CAS(&Head, n, n->next)

Initial stack: Head → A → B → C → NULL (n and n1 point to A, n2 points to B)



Università degli studi di Udine Sistemi operativi – Operating Systems

Lock-free stack (example): ABA

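TASK1 runs: n = pop();
TASK2 runs: n1 = pop(); n2 = pop(); push(n1);

TASK1:                          TASK2:
n = Head;
next = n->next;
                                n1 = Head;
                                CAS(&Head, n1, n1->next)
                                n2 = Head;
                                CAS(&Head, n2, n2->next)
                                n1->next = Head;
                                CAS(&Head, n1->next, n1)
CAS(&Head, n, next)

Initial stack: Head → A → B → C → NULL (n and n1 point to A, n2 and next point to B)

The final CAS in TASK1 is successful: Head is A again, so Head is set to next (B), a node that TASK2 has already popped.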


Lock-free stack (example): ABA

• Do not reuse nodes
  • task2:

      instead of:            use:
      n1 = pop();            n1 = pop();
      n2 = pop();            n2 = pop();
      push(n1);              n3 = new node
                             n3->data = n1->data;
                             push(n3);

• When can n1 be freed?
  • after n1 is released, another task can obtain that memory as a new node
  ⇒ ABA can still happen

Università degli studi di Udine Sistemi operativi – Operating Systems

ABA solutions

• Deferred reclamation
  • do not reuse nodes
  • don't recycle the memory “too soon”
    • garbage collector
    • hazard pointers
    • Read-Copy-Update

• Use the same CAS for 2 pointers
  • needs a double-word CAS

• Tagged pointers
  • some bits of a pointer are used as a counter
  • beware of wrap-around
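The tagged-pointer idea can be sketched without a double-word CAS by replacing pointers with 32-bit indices into a node pool and packing index and counter into one 64-bit word. Everything here (pool, NIL, pack, the index-based push and pop) is an assumption made for the illustration; the 32-bit tag can still wrap around, as noted above.

```c
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 8
#define NIL 0xFFFFFFFFu   /* plays the role of the NULL pointer */

struct node {
    int data;
    _Atomic uint32_t next;   /* index of the next node, or NIL */
};

static struct node pool[POOL_SIZE];

/* low 32 bits: index of the top node; high 32 bits: tag (update counter) */
static _Atomic uint64_t head = NIL;

static uint64_t pack(uint32_t tag, uint32_t idx)
{
    return ((uint64_t)tag << 32) | idx;
}

void push(uint32_t idx)
{
    uint64_t old = atomic_load(&head);
    uint64_t new_head;
    do {
        pool[idx].next = (uint32_t)old;
        new_head = pack((uint32_t)(old >> 32) + 1, idx);
    } while (!atomic_compare_exchange_weak(&head, &old, new_head));
}

uint32_t pop(void)
{
    uint64_t old = atomic_load(&head);
    for (;;) {
        uint32_t idx = (uint32_t)old;
        if (idx == NIL)
            return NIL;   /* empty stack */
        uint64_t new_head = pack((uint32_t)(old >> 32) + 1, pool[idx].next);
        if (atomic_compare_exchange_weak(&head, &old, new_head))
            return idx;
    }
}
```

Because every successful update increments the tag, a task that read Head before an A, B, A sequence compares against a stale tag and its CAS fails, even though the index is the same again.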

Università degli studi di Udine Sistemi operativi – Operating Systems

Lock-free list (example)

struct NodeType {

Datatype data;

struct NodeType *next;

};

struct NodeType *Head, *Tail;

void init() {

struct NodeType *dummynode;

dummynode = malloc(sizeof(struct NodeType));

dummynode->next = NULL;

Head = Tail = dummynode;

}

void insert(struct NodeType *n) {

struct NodeType *tmp;

n->next = NULL;

tmp = Tail;

tmp->next = n;

Tail = n;

}

struct NodeType *remove() {

struct NodeType *n;

n = Head->next;

if (n != NULL) {

Head = n;

}

return n;

}

Notes:
• Head points to a dummy node, followed by the first node; Tail points to the last node, whose next is NULL
• remove: discard the dummy node; n becomes the new dummy node
• not concurrent: a lock is needed to make insert and remove atomic


after every step, the list must remain consistent:

- nodes are all linked

- Head points to the dummy node

- Tail is after Head

concurrent tasks must “cooperate”


Lock-free list (example)

Head

NULL

dummy node

Tail

first node

struct NodeType {

Datatype data;

struct NodeType *next;

};

struct NodeType *Head, *Tail;

void init() {

struct NodeType *dummynode;

dummynode = malloc(sizeof(struct NodeType));

dummynode->next = NULL;

Head = Tail = dummynode;

}

void insert(struct NodeType *n) {

struct NodeType *tmp, *ntmp;

n->next = NULL;

do {

tmp = Tail;

ntmp = tmp->next;

if (Tail != tmp) continue;

if (ntmp != NULL) {

CAS(&Tail, tmp, tmp->next);

continue;

}

} while (CAS(&tmp->next, NULL, n) != NULL);

CAS(&Tail, tmp, n);

}

struct NodeType *remove() {

struct NodeType *n, *h, *t;

do {

h = Head;

t = Tail;

n = h->next;

if (Head != h) continue;

if (n == NULL)

break;

if (h == t) {

CAS(&Tail, t, n);

continue;

}

} while (CAS(&Head, h, n) != h);

return n;

}

Lock free

Università degli studi di Udine Sistemi operativi – Operating Systems

Synchronization

High-level

synchronization primitives

Synchronization primitives

Università degli studi di Udine Sistemi operativi – Operating Systems

� Semaphores

� Semaphores (Counting semaphores)

� Binary semaphores

� Mutexes

� Condition variables

� Monitors

� Deferred processing

� e.g., Read-Copy-Update (RCU)

High-level synchronization primitives

Università degli studi di Udine Sistemi operativi – Operating Systems

• Semaphore
  • integer variable
  • operations (all atomic)
    • initialize
      • set the initial value: an arbitrary non-negative value
    • semWait (also: P)
      • decrement value; if the result is negative, then suspend the calling process
      • if suspended, the process is stored on a list associated with the semaphore
      • used to enter a critical section
    • semSignal (also: V)
      • increment value; if the result is non-positive, then resume a suspended process
      • the process to be resumed is taken from the list associated with the semaphore
      • used to leave a critical section

High-level synchronization primitives


Semaphores

• Strong semaphore
  • tasks are resumed in FIFO order
  • fair implementation

• Weak semaphore
  • no order is imposed on task reactivations

Università degli studi di Udine Sistemi operativi – Operating Systems

Semaphores

• Binary semaphore
  • semaphore that can only assume the values 0 or 1
  • initialize
    • only 0 or 1 are valid initial values
  • semSignal
    • if a process is waiting, resume it; otherwise set value to 1
    • the process to be resumed is taken from the list associated with the semaphore
  • semWait
    • if value is 0, then suspend the process; else set value to 0
    • if suspended, the process is stored on a list associated with the semaphore

Università degli studi di Udine Sistemi operativi – Operating Systems

Semaphore implementation (example)

typedef struct semaphore_t {
    int count;
    int lock;
    QUEUE suspended;
} semaphore;

void semWait(semaphore *sem)
{
    lock(&sem->lock);
    sem->count--;
    if (sem->count < 0) {
        place this process in sem->suspended
        unlock(&sem->lock);
        suspend this process
    } else {
        unlock(&sem->lock);
    }
}

void semSignal(semaphore *sem)
{
    lock(&sem->lock);
    sem->count++;
    if (sem->count <= 0) {
        remove a process P from sem->suspended
        place process P on the ready list
    }
    unlock(&sem->lock);
}

Notes:
• these are kernel-level operations
• access to sem must be atomic: the section between lock and unlock is protected by a spinlock
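Outside the kernel, a counting semaphore with similar behaviour can be sketched on top of a POSIX mutex and condition variable. This is a swapped-in variant, not the implementation above: the names csem, csem_init, csem_wait and csem_signal are hypothetical, the count never goes negative, and waiters are not resumed in any guaranteed order (a weak semaphore).

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;      /* protects count */
    pthread_cond_t  nonzero;   /* signaled when count becomes positive */
    int count;                 /* never negative in this variant */
} csem;

void csem_init(csem *s, int value)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = value;
}

void csem_wait(csem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)                       /* suspend while nothing is available */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

void csem_signal(csem *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->nonzero);           /* resume one waiter, if any */
    pthread_mutex_unlock(&s->lock);
}
```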

Università degli studi di Udine Sistemi operativi – Operating Systems

High-level synchronization primitives

• Mutex
  • similar to a binary semaphore, but only the task owning the mutex can unlock it
  • same semantics as low-level locks
    • but the scheduler comes into play

• Reentrant (or recursive) mutex
  • a task can acquire the mutex multiple times
  • multiple levels of ownership
  • must be released the same number of times


High-level synchronization primitives

• Monitor
  • abstract data type
    • accessible only through “access procedures” (all atomic and exclusive)
    • only one task at a time can access the monitor
  • object-oriented approach
    • e.g., in C++ a monitor can be implemented with a class where:
      • there is a reentrant mutex as a field
      • all methods acquire the mutex on entry
      • all methods release the mutex on exit
      • signaling is realized with explicit condition variables

Università degli studi di Udine Sistemi operativi – Operating Systems

� Condition variables

� Condition to test

� Operations (all atomic)

� cond_wait

� sleep until another task calls signal or broadcast

� cond_notify (also: signal)

� wake up a waiting task

� cond_notifyAll (also: broadcast)

� wake up all waiting tasks

High-level synchronization primitives

Università degli studi di Udine Sistemi operativi – Operating Systems

Condition variables

• Using condition variables
  • on waiting: use a loop
    • the condition set by the signaling task may no longer be true
    • another task could have changed the condition after the signaling
    • Mesa semantics
  • Hoare semantics:
    • after the signaling, the waiting thread is woken up
    • nobody else gets control of the condition
    • hard to implement (never used in practice)
  • a lock (or a mutex) is required
    • prevents the race condition:
      • sequence: task A tests (done is 0), task B sets (done becomes 1), task B calls cond_signal, task A calls cond_wait
      • the signal is lost ⇒ the waiting thread is never woken up

Università degli studi di Udine Sistemi operativi – Operating Systems

Condition variables pseudocode (example 1)

typedef struct {
    lock_t lock;
    QUEUE waiting;
} cond_t;

void cond_wait(cond_t *cond_var)
{
    atomically add this task to cond_var->waiting
    unlock(&cond_var->lock);
    suspend this task
    lock(&cond_var->lock);
}

void cond_notify(cond_t *cond_var)
{
    atomically remove a task T from cond_var->waiting
    resume T (place T on the ready list)
}

void cond_notify_all(cond_t *cond_var)
{
    for each task T in cond_var->waiting {
        atomically remove T from cond_var->waiting
        resume T (place T on the ready list)
    }
}


Condition variables (example 1)

Task A:
    /* do something */
    /* need to wait for task B */
    lock(&cv->lock);
    while (done == 0) {
        cond_wait(cv);
    }
    unlock(&cv->lock);

Task B:
    /* do something */
    lock(&cv->lock);
    done = 1;
    cond_notify(cv);
    unlock(&cv->lock);
    /* now task A can advance */

Notes:
• done is initially 0
• in task A, the test on the condition and the call to cond_wait must be atomic
• in task B, the change to the condition and the call to cond_notify must be atomic
• both are protected by the same lock

Università degli studi di Udine Sistemi operativi – Operating Systems

Condition variables pseudocode (example 2)

typedef struct {
    QUEUE waiting;
} cond_t;

void cond_wait(cond_t *cond_var, lock_t *lck)
{
    atomically add this task to cond_var->waiting
    unlock(lck);
    suspend this task
    lock(lck);
}

void cond_notify(cond_t *cond_var, lock_t *lck)
{
    atomically remove a task T from cond_var->waiting
    resume T (place T on the ready list)
}

void cond_notify_all(cond_t *cond_var, lock_t *lck)
{
    for each task T in cond_var->waiting {
        atomically remove T from cond_var->waiting
        resume T (place T on the ready list)
    }
}

Università degli studi di Udine Sistemi operativi – Operating Systems

Condition variables (example 2)

Task A:
    /* do something */
    /* need to wait for task B */
    lock(&lck);
    while (done == 0) {
        cond_wait(&cv, &lck);
    }
    unlock(&lck);

Task B:
    /* do something */
    lock(&lck);
    done = 1;
    cond_notify(&cv, &lck);
    unlock(&lck);
    /* now task A can advance */

Notes:
• done is initially 0
• in task A, the test on the condition and the call to cond_wait must be atomic
• in task B, the change to the condition and the call to cond_notify must be atomic
• both are protected by the same lock
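The same pattern maps directly onto POSIX threads, where the lock is passed to the wait call as in example 2. A minimal sketch, with task_b, wait_for_b and run_example as illustrative names:

```c
#include <pthread.h>

static pthread_mutex_t lck = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;
static int done = 0;                      /* the condition; initially 0 */

/* Task B: change the condition and notify under the same lock. */
static void *task_b(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lck);
    done = 1;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&lck);
    return NULL;
}

/* Task A: test the condition in a loop (Mesa semantics). */
void wait_for_b(void)
{
    pthread_mutex_lock(&lck);
    while (done == 0)
        pthread_cond_wait(&cv, &lck);     /* unlocks lck while sleeping */
    pthread_mutex_unlock(&lck);
}

/* Runs both tasks; returns the final value of done, or -1 on error. */
int run_example(void)
{
    pthread_t b;
    if (pthread_create(&b, NULL, task_b, NULL) != 0)
        return -1;
    wait_for_b();
    pthread_join(b, NULL);
    return done;
}
```

If task B runs first, task A finds done already set and never sleeps, so the notification cannot be lost.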

Università degli studi di Udine Sistemi operativi – Operating Systems

Read-Copy Update

� Synchronization for read-mostly data

� Update is split in:

� removal

� reclamation

� Publish-Subscribe Mechanism

� Simple to apply to data structures

� lists, arrays


Read-Copy Update

• Update is split into:
  • removal
    • remove the reference to old data
  • reclamation
    • free memory
  ⇒ removal does not need to wait for running readers
  ⇒ reclamation must wait until readers have finished

Università degli studi di Udine Sistemi operativi – Operating Systems

Read-Copy Update

• Publish-Subscribe Mechanism
  • subscribe to data
    • rcu_dereference
  • publish new data
    • rcu_assign_pointer
  • old data is “reclaimed” when it is no longer needed
    • after a “grace” period
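The publish and subscribe steps can be approximated in portable C11 with a release store and an acquire load on a shared pointer. This is only an analogy under stated assumptions: publish and subscribe are hypothetical names, not the kernel's rcu_assign_pointer and rcu_dereference, and the sketch does not implement grace periods, so the caller must not free the old data while readers may still use it.

```c
#include <stdatomic.h>
#include <stddef.h>

struct config {
    int a, b;
};

static _Atomic(struct config *) dataptr = NULL;

/* Publish new data: all writes to *p become visible to any reader that
 * later obtains p. Returns the previously published pointer, which may
 * only be reclaimed once no reader can still hold it. */
struct config *publish(struct config *p)
{
    return atomic_exchange_explicit(&dataptr, p, memory_order_acq_rel);
}

/* Subscribe: obtain a reference to the currently published data. */
const struct config *subscribe(void)
{
    return atomic_load_explicit(&dataptr, memory_order_acquire);
}
```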

Università degli studi di Udine Sistemi operativi – Operating Systems

Readers-Writers with RCU

Reader(s):
    rcu_read_lock();
    p1 = rcu_dereference(dataptr);
    ... do something with data
    rcu_read_unlock();

Writer(s):
    lock(&writers_lock);
    ... prepare new data pointed by p2
    oldp = dataptr;
    rcu_assign_pointer(dataptr, p2);
    unlock(&writers_lock);
    synchronize_rcu();
    ... free data pointed by oldp

p1 is only valid between rcu_read_lock and rcu_read_unlock

[Figure: dataptr, data (referenced through p1) and data2 (referenced through p2)]

Università degli studi di Udine Sistemi operativi – Operating Systems

Readers-Writers with RCU

• Reader (no blocking operations here!)
    rcu_read_lock();                    /* reader signals its arrival */
    p1 = rcu_dereference(dataptr);      /* reader gets a reference to data */
    ... do something with data
    rcu_read_unlock();                  /* reader signals its leaving */

• Writer
    lock(&writers_lock);                /* writer synchronizes with other writers */
    ... prepare new data pointed by p2
    oldp = dataptr;
    rcu_assign_pointer(dataptr, p2);    /* writer “publishes” new data */
    unlock(&writers_lock);              /* writer synchronizes with other writers */
    synchronize_rcu();                  /* writer waits until “active” readers complete */
    ... free data pointed by oldp       /* old data is no longer needed (can be freed) */


Readers-Writers with RCU

• Reader (no blocking operations here!)
    rcu_read_lock();
    p1 = rcu_dereference(dataptr);
    ... do something with data
    rcu_read_unlock();

• Writer
    lock(&writers_lock);
    ... prepare new data pointed by p2
    oldp = dataptr;
    rcu_assign_pointer(dataptr, p2);
    unlock(&writers_lock);
    call_rcu(..., reclaim_func);        /* to avoid waiting in writers */

    /* called asynchronously when readers have completed */
    reclaim_func(...)
    {
        ... free data
    }

Università degli studi di Udine Sistemi operativi – Operating Systems

Read-Copy Update

• Performance
  • readers
    • do not acquire locks
    • do not perform atomic instructions
    • do not need memory barriers (except on Alpha)

• Deadlock immunity

• Realtime latency

Università degli studi di Udine Sistemi operativi – Operating Systems

Read-Copy Update

• Readers and updaters run concurrently
  • readers can obtain old data

• Low-priority RCU readers can block high-priority reclaimers

• Grace-period latencies can extend for many milliseconds

Università degli studi di Udine Sistemi operativi – Operating Systems

Synchronization

Synchronization patterns


Synchronization patterns

• Signaling
  • instruction (or block of instructions) A1 must be executed before B1

task A:
    instruction A1;
    semSignal(sem);

task B:
    semWait(sem);
    instruction B1;

sem is initialized with 0
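With POSIX unnamed semaphores the signaling pattern looks as follows. This is a sketch assuming a platform where sem_init is supported (e.g. Linux); the step counters exist only to record the order in which A1 and B1 ran.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t sem;                      /* initialized with 0 */
static int step, a1_step, b1_step;     /* record the execution order */

static void *task_a(void *arg)
{
    (void)arg;
    a1_step = ++step;                  /* instruction A1 */
    sem_post(&sem);                    /* semSignal(sem) */
    return NULL;
}

static void *task_b(void *arg)
{
    (void)arg;
    sem_wait(&sem);                    /* semWait(sem) */
    b1_step = ++step;                  /* instruction B1 */
    return NULL;
}

/* Returns 1 if A1 was executed before B1, -1 on thread creation error. */
int run_signaling(void)
{
    pthread_t a, b;
    step = a1_step = b1_step = 0;
    sem_init(&sem, 0, 0);
    if (pthread_create(&b, NULL, task_b, NULL) != 0)
        return -1;
    if (pthread_create(&a, NULL, task_a, NULL) != 0)
        return -1;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&sem);
    return a1_step < b1_step;
}
```

Task B is started first on purpose: it still cannot execute B1 until task A has posted the semaphore.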

Università degli studi di Udine Sistemi operativi – Operating Systems

Synchronization patterns

• Mutual exclusion (Mutex)
  • A1 and B1 cannot overlap

task A:
    semWait(sem_mutex);
    instruction A1;
    semSignal(sem_mutex);

task B:
    semWait(sem_mutex);
    instruction B1;
    semSignal(sem_mutex);

sem_mutex is initialized with 1

Università degli studi di Udine Sistemi operativi – Operating Systems

Synchronization patterns

• Multiplex
  • generalized mutex
  • no more than k tasks can be in the critical section at the same time
  • same structure as the mutex (initialize sem_mutex with k)

Università degli studi di Udine Sistemi operativi – Operating Systems

Synchronization patterns

• Rendezvous
  • both A1 and B1 must be executed before A2 and B2

task A:
    instruction A1;
    semSignal(semBgo);
    semWait(semAgo);
    instruction A2;

task B:
    instruction B1;
    semSignal(semAgo);
    semWait(semBgo);
    instruction B2;

semAgo and semBgo are initialized with 0


Synchronization patterns

• Barrier
  • generalized rendezvous (to k tasks)
  • use a barrier object
    • implemented on top of semaphores

task A1:
    instruction A1_before;
    barrier(B,k)
    instruction A1_after;

task A2:
    instruction A2_before;
    barrier(B,k)
    instruction A2_after;

...

task Ak:
    instruction Ak_before;
    barrier(B,k)
    instruction Ak_after;

Università degli studi di Udine Sistemi operativi – Operating Systems

Barriers

An implementation

typedef struct barr_t {
    int arrived;
    semaphore mutex, sem;
} barr;

/* b is passed by pointer so that all tasks update the same barrier object */
void barrier(barr *b, int n_proc)
{
    int tmp;

    semWait(&b->mutex);
    tmp = ++b->arrived;
    semSignal(&b->mutex);

    if (tmp != n_proc) {
        semWait(&b->sem);
        semSignal(&b->sem);
    } else {
        semWait(&b->mutex);
        b->arrived = 0;
        semSignal(&b->mutex);
        semSignal(&b->sem);
    }
}

arrived and sem are initialized with 0; mutex is initialized with 1

Not reusable: after all tasks have left, sem == 1, so an additional semWait(sem) is needed

Università degli studi di Udine Sistemi operativi – Operating Systems

Barriers

• Reusable barrier
  • where should the final semWait be issued?
    • after all tasks have left the barrier
      • otherwise one waiting task will not resume
    • before tasks leave the barrier
      • otherwise a task can re-enter the barrier before the final semWait

[Figure: the barrier is split into phase 1 and phase 2]

Università degli studi di Udine Sistemi operativi – Operating Systems

Barriers

An implementation

typedef struct barr_t {
    int arrived;
    semaphore mutex, phase1, phase2;
} barr;

/* b is passed by pointer so that all tasks update the same barrier object */
void barrier(barr *b, int n_proc)
{
    int tmp;

    /* phase 1 */
    semWait(&b->mutex);
    tmp = ++b->arrived;
    semSignal(&b->mutex);

    if (tmp != n_proc) {
        semWait(&b->phase1);
        semSignal(&b->phase1);
    } else {
        semWait(&b->phase2);     /* semaphore phase2 needs an additional semWait too */
        semSignal(&b->phase1);
    }

    /* phase 2 */
    semWait(&b->mutex);
    tmp = --b->arrived;
    semSignal(&b->mutex);

    if (tmp != 0) {
        semWait(&b->phase2);
        semSignal(&b->phase2);
    } else {
        semWait(&b->phase1);     /* additional semWait: puts phase1 back to its initial value */
        semSignal(&b->phase2);
    }
}

arrived and phase1 are initialized with 0; mutex and phase2 are initialized with 1
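A runnable sketch of the two-phase barrier with POSIX semaphores and threads. It assumes a platform that supports sem_init; barr_init, worker, run_barrier_demo, N_TASKS and ROUNDS are names introduced only for this illustration. Each worker checks, after every barrier, that all tasks had arrived in that round.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define N_TASKS 4
#define ROUNDS  100

typedef struct {
    int arrived;
    sem_t mutex, phase1, phase2;
} barr;

static barr B;
static atomic_int count[ROUNDS];   /* arrivals per round */
static atomic_int errors;

void barr_init(barr *b)
{
    b->arrived = 0;
    sem_init(&b->mutex, 0, 1);
    sem_init(&b->phase1, 0, 0);
    sem_init(&b->phase2, 0, 1);
}

void barrier(barr *b, int n_proc)
{
    int tmp;

    sem_wait(&b->mutex); tmp = ++b->arrived; sem_post(&b->mutex);
    if (tmp != n_proc) { sem_wait(&b->phase1); sem_post(&b->phase1); }
    else               { sem_wait(&b->phase2); sem_post(&b->phase1); }

    sem_wait(&b->mutex); tmp = --b->arrived; sem_post(&b->mutex);
    if (tmp != 0) { sem_wait(&b->phase2); sem_post(&b->phase2); }
    else          { sem_wait(&b->phase1); sem_post(&b->phase2); }
}

static void *worker(void *arg)
{
    (void)arg;
    for (int r = 0; r < ROUNDS; r++) {
        atomic_fetch_add(&count[r], 1);          /* "before" work */
        barrier(&B, N_TASKS);
        if (atomic_load(&count[r]) != N_TASKS)   /* "after" check */
            atomic_fetch_add(&errors, 1);
    }
    return NULL;
}

/* Returns the number of rounds in which a task passed the barrier early,
 * or -1 if a thread could not be created. */
int run_barrier_demo(void)
{
    pthread_t t[N_TASKS];
    barr_init(&B);
    for (int i = 0; i < N_TASKS; i++)
        if (pthread_create(&t[i], NULL, worker, NULL) != 0)
            return -1;
    for (int i = 0; i < N_TASKS; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&errors);
}
```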


Synchronization

Classical problems

Università degli studi di Udine Sistemi operativi – Operating Systems

Classical problems

� Use semaphores to solve:

� Producer – Consumer (with a bounded buffer)

� Readers – Writers

� no priority

� no-starve writers

� writers with priority

� Dining philosophers

Università degli studi di Udine Sistemi operativi – Operating Systems

Producer – Consumer

� Some tasks produce data

� Some tasks consume data

� Data are consumed in the same order they are produced

� The queue size is known and limited (e.g., a circular buffer)

Producers Consumers

FIFO

Università degli studi di Udine Sistemi operativi – Operating Systems

Producer – Consumer

• Grant exclusive access to the queue
• The producer signals to consumers that new data is ready
• The consumer signals that space is available in the queue

Producer:
    semWait(space);
    semWait(mutex);
    queue.insert(data);
    semSignal(mutex);
    semSignal(inqueue);

Consumer:
    semWait(inqueue);
    semWait(mutex);
    data = queue.get();
    semSignal(mutex);
    semSignal(space);

mutex is initialized with 1
inqueue is initialized with 0 (initial data in the queue)
space is initialized with the queue size (initial room in the queue)


Producer – Consumer

Producer:
    semWait(space);
    semWait(mutex);
    queue.insert(data);
    semSignal(mutex);
    semSignal(inqueue);

Consumer:
    semWait(mutex);
    semWait(inqueue);
    data = queue.get();
    semSignal(mutex);
    semSignal(space);

• Swap the semWaits? WRONG
  • the consumer waits inside the critical section
  • the producer cannot pass the critical section
  • the producer cannot send semSignal to the consumer
  ⇒ DEADLOCK

Università degli studi di Udine Sistemi operativi – Operating Systems

Producer – Consumer

• Swap the semSignals?
  • no deadlock
  • additional context switches can occur (non-optimal implementation)

Producer:
    semWait(space);
    semWait(mutex);
    queue.insert(data);
    semSignal(mutex);
    semSignal(inqueue);

Consumer:
    semWait(inqueue);
    semWait(mutex);
    data = queue.get();
    semSignal(space);
    semSignal(mutex);

mutex is initialized with 1
inqueue is initialized with 0 (initial data in the queue)
space is initialized with the queue size (initial room in the queue)

Università degli studi di Udine Sistemi operativi – Operating Systems

Producer – Consumer

• Using condition variables and mutexes
  • grant exclusive access to the queue
  • the producer signals to consumers that new data is ready
  • the consumer signals that space is available in the queue

Producer:
    mutex_lock(mutex);
    while (count == MAX)
        cond_wait(space, mutex);
    queue.insert(data);
    count++;
    cond_signal(datain, mutex);
    mutex_unlock(mutex);

Consumer:
    mutex_lock(mutex);
    while (count == 0)
        cond_wait(datain, mutex);
    data = queue.get();
    count--;
    cond_signal(space, mutex);
    mutex_unlock(mutex);

mutex is initially unlocked
count is initialized with 0 (initial data in the queue)
datain is used to signal that there is some data in the queue
space is used to signal that there is free space in the queue
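The condition-variable solution translates almost line by line to POSIX threads. The sketch below assumes a small circular buffer of MAX integers; put, get, producer and run_producer_consumer are illustrative names, and the demo uses one producer and one consumer.

```c
#include <pthread.h>

#define MAX 4                          /* queue capacity (arbitrary) */

static int buf[MAX];
static int in, out, count;             /* circular buffer state */
static pthread_mutex_t mutex  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  space  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  datain = PTHREAD_COND_INITIALIZER;

void put(int data)
{
    pthread_mutex_lock(&mutex);
    while (count == MAX)               /* queue full: wait for space */
        pthread_cond_wait(&space, &mutex);
    buf[in] = data;
    in = (in + 1) % MAX;
    count++;
    pthread_cond_signal(&datain);
    pthread_mutex_unlock(&mutex);
}

int get(void)
{
    int data;
    pthread_mutex_lock(&mutex);
    while (count == 0)                 /* queue empty: wait for data */
        pthread_cond_wait(&datain, &mutex);
    data = buf[out];
    out = (out + 1) % MAX;
    count--;
    pthread_cond_signal(&space);
    pthread_mutex_unlock(&mutex);
    return data;
}

static void *producer(void *arg)
{
    int n = *(int *)arg;
    for (int i = 1; i <= n; i++)
        put(i);
    return NULL;
}

/* Produces 1..n in one thread, consumes in the caller; returns the sum
 * of the consumed values, or -1 if the thread could not be created. */
long run_producer_consumer(int n)
{
    pthread_t p;
    long sum = 0;
    if (pthread_create(&p, NULL, producer, &n) != 0)
        return -1;
    for (int i = 0; i < n; i++)
        sum += get();
    pthread_join(p, NULL);
    return sum;
}
```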

Università degli studi di Udine Sistemi operativi – Operating Systems

Readers – Writers

Shared data

Writers

Readers

� Some tasks write in a shared area

� Some tasks read from the shared area

� No order must be enforced

� Data integrity must be preserved� do not read half-written data


Readers – Writers - 1

• Grant exclusive access to writers
• The first arriving reader must signal that data is in use (and wait for a writer to finish)
• The last leaving reader must signal that nobody is using the data

Writer:
    semWait(noWriters);
    write data
    semSignal(noWriters);

Reader:
    semWait(mutex);
    if (readers==0) semWait(noWriters);
    readers++;
    semSignal(mutex);

    read data

    semWait(mutex);
    readers--;
    if (readers==0) semSignal(noWriters);
    semSignal(mutex);

noWriters is initialized with 1 (initially nobody is accessing the shared data)
mutex is initialized with 1
readers is initialized with 0 (initially no readers are reading)

Università degli studi di Udine Sistemi operativi – Operating Systems

Readers – Writers - 2

• Writers have less chance to access the data
  • possible starvation

• Solution:
  • do not allow incoming readers to access data until waiting writers have been served

[Figure: shared data with active readers, a waiting writer, and incoming readers]

Università degli studi di Udine Sistemi operativi – Operating Systems

Readers – Writers - 2

• Block readers when a writer is waiting
• Resume readers when a writer finishes
• Readers must not hold the writer_in semaphore

Writer:
    semWait(writer_in);
    semWait(noWriters);
    write data
    semSignal(noWriters);
    semSignal(writer_in);

Reader:
    semWait(writer_in);
    semSignal(writer_in);

    semWait(mutex);
    if (readers==0) semWait(noWriters);
    readers++;
    semSignal(mutex);

    read data

    semWait(mutex);
    readers--;
    if (readers==0) semSignal(noWriters);
    semSignal(mutex);

noWriters is initialized with 1; mutex is initialized with 1; readers is initialized with 0
writer_in is initialized with 1 (readers and writers can try to proceed)


• readers and writers wait on writer_in
• one writer or one reader is selected
• how to grant priority to writers?


Readers – Writers - 3

Writer:
    semWait(mutexW);
    if (writers==0) semWait(noReaders);
    writers++;
    semSignal(mutexW);

    semWait(noWriters);
    write data
    semSignal(noWriters);

    semWait(mutexW);
    writers--;
    if (writers==0) semSignal(noReaders);
    semSignal(mutexW);

Reader:
    semWait(noReaders);
    semWait(mutexR);
    if (readers==0) semWait(noWriters);
    readers++;
    semSignal(mutexR);
    semSignal(noReaders);

    read data

    semWait(mutexR);
    readers--;
    if (readers==0) semSignal(noWriters);
    semSignal(mutexR);

• Writers wait on noWriters while readers are inside (so new readers are blocked)
• Readers wait on noReaders while a writer is inside (so incoming writers can still block readers)

noWriters is initialized with 1 ; mutexR and mutexW are initialized with 1 ; readers is initialized with 0
noReaders is initialized with 1 ; writers is initialized with 0

Readers – Writers

• Readers – Writers – 1
  - Simple
  - Writers can starve
• Readers – Writers – 2
  - No starvation for writers
• Readers – Writers – 3
  - Writers have priority over readers

Dining philosophers

[Figure: round table with philosophers 0 to 4 and forks 0 to 4 placed between them]

• Five plates
  - one for each philosopher
• Five forks
• To eat, two forks are needed

philosopher:
    for (;;) {
        think();            /* non-critical section */
        get_forks();
        eat();              /* critical section */
        release_forks();
    }

• Only one philosopher can hold a fork at a time.
• No deadlock.
• No starvation.
• Allow more than one philosopher to eat at the same time

Dining philosophers

philosopher:
    for (;;) {
        think();

        /* get_forks(); */
        semWait(fork[(i+1)%5]);
        semWait(fork[i]);

        eat();

        /* release_forks(); */
        semSignal(fork[(i+1)%5]);
        semSignal(fork[i]);
    }

fork[0] to fork[4] are each initialized with 1


Dining philosophers

• Deadlock
  - if all five philosophers take their first fork before anyone takes the second one, every second semWait blocks forever

philosopher 0:  think(); semWait(fork[1]); semWait(fork[0]); eat(); semSignal(fork[1]); semSignal(fork[0]);
philosopher 1:  think(); semWait(fork[2]); semWait(fork[1]); eat(); semSignal(fork[2]); semSignal(fork[1]);
philosopher 2:  think(); semWait(fork[3]); semWait(fork[2]); eat(); semSignal(fork[3]); semSignal(fork[2]);
philosopher 3:  think(); semWait(fork[4]); semWait(fork[3]); eat(); semSignal(fork[4]); semSignal(fork[3]);
philosopher 4:  think(); semWait(fork[0]); semWait(fork[4]); eat(); semSignal(fork[0]); semSignal(fork[4]);

Dining philosophers

• Allow only 4 philosophers to try to acquire forks at the same time
  - with 5 forks and at most 4 competitors, at least one philosopher can always get both forks

Dining philosophers

• Symmetric solution

philosopher:
    for (;;) {
        think();

        /* get_forks(); */
        semWait(mutex4);
        semWait(fork[(i+1)%5]);
        semWait(fork[i]);

        eat();

        /* release_forks(); */
        semSignal(fork[(i+1)%5]);
        semSignal(fork[i]);
        semSignal(mutex4);
    }

fork[0] to fork[4] are each initialized with 1
mutex4 is initialized with 4
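The symmetric solution can be tried out with POSIX threads and semaphores. The sketch below assumes semWait/semSignal map to sem_wait/sem_post, renames fork[] to fork_sem[] so it cannot clash with fork(2), and replaces the endless loop with a bounded number of rounds (ROUNDS) so that the run terminates. The names run_dinner and meals are hypothetical.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>

#define N 5
#define ROUNDS 1000          /* assumption: a bounded run instead of for (;;) */

static sem_t fork_sem[N];    /* fork[] on the slide */
static sem_t mutex4;         /* admits at most N-1 philosophers at a time */
static int meals[N];         /* meals[i] is written only by philosopher i */

static void *philosopher(void *arg) {
    int i = (int)(intptr_t)arg;
    for (int r = 0; r < ROUNDS; r++) {
        /* think(); */
        sem_wait(&mutex4);
        sem_wait(&fork_sem[(i + 1) % N]);
        sem_wait(&fork_sem[i]);
        meals[i]++;          /* eat(); */
        sem_post(&fork_sem[(i + 1) % N]);
        sem_post(&fork_sem[i]);
        sem_post(&mutex4);
    }
    return NULL;
}

/* Runs the five philosophers to completion; returns the total number of meals. */
int run_dinner(void) {
    pthread_t t[N];
    int total = 0;
    sem_init(&mutex4, 0, N - 1);
    for (int i = 0; i < N; i++) {
        sem_init(&fork_sem[i], 0, 1);
        meals[i] = 0;
    }
    for (int i = 0; i < N; i++)
        pthread_create(&t[i], NULL, philosopher, (void *)(intptr_t)i);
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    for (int i = 0; i < N; i++)
        total += meals[i];
    return total;
}
```

If mutex4 were initialized with 5 instead of 4, the same program could hang in the deadlock described earlier, since all five philosophers could then hold one fork each.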

Dining philosophers

• Asymmetric solution

L-philosopher:
    for (;;) {
        think();

        /* get_forks(); */
        semWait(fork[(i+1)%5]);
        semWait(fork[i]);

        eat();

        /* release_forks(); */
        semSignal(fork[(i+1)%5]);
        semSignal(fork[i]);
    }

R-philosopher:
    for (;;) {
        think();

        /* get_forks(); */
        semWait(fork[i]);
        semWait(fork[(i+1)%5]);

        eat();

        /* release_forks(); */
        semSignal(fork[(i+1)%5]);
        semSignal(fork[i]);
    }

fork[0] to fork[4] are each initialized with 1 ; there is at least one L-philosopher and one R-philosopher


Dining philosophers

• No more than 4 philosophers can try to acquire forks

L-philosopher 0:  think(); semWait(fork[1]); semWait(fork[0]); eat(); semSignal(fork[1]); semSignal(fork[0]);
L-philosopher 1:  think(); semWait(fork[2]); semWait(fork[1]); eat(); semSignal(fork[2]); semSignal(fork[1]);
L-philosopher 2:  think(); semWait(fork[3]); semWait(fork[2]); eat(); semSignal(fork[3]); semSignal(fork[2]);
L-philosopher 3:  think(); semWait(fork[4]); semWait(fork[3]); eat(); semSignal(fork[4]); semSignal(fork[3]);
R-philosopher 4:  think(); semWait(fork[4]); semWait(fork[0]); eat(); semSignal(fork[0]); semSignal(fork[4]);

ph 4 is blocked by ph 3 (both ask for fork[4] first)

Dining philosophers

[Animation: the same five listings (L-philosophers 0 to 3, R-philosopher 4) shown in two further scenarios]

• ph 3 is blocked by ph 4, ph 0 is blocked by ph 4
• ph 3 is blocked by ph 4, ph 4 is blocked by ph 0
• No more than 4 philosophers can try to acquire forks


Synchronization

Deadlock management

Deadlock conditions

• Mutual exclusion.
  - A resource can be assigned to only a fixed, finite number of processes at a time. No other process may access a resource unit that has reached its maximum number of assignments.
  - Needed (to enforce synchronization)
• No preemption.
  - No resource can be forcibly removed from a process holding it.
  - Difficult to avoid (a rollback is needed to implement resource preemption)
• Hold and wait.
  - A process may hold allocated resources while awaiting assignment of other resources.
• Circular wait.
  - A closed chain of processes exists, such that each process holds at least one resource needed by the next process in the chain.

Deadlock is possible


Resources

• Swappable space
• Devices
  - physical drives
  - files
• Main memory blocks
• Internal resources
  - I/O interrupts handling

Required asynchronously by independent processes

Resource allocation graph

[Figure legend: an edge between a Task and a Resource means either "Task requires Resource" or "Task holds Resource"]

[Figure: B requires R1, which A holds] B has to wait until A releases R1

[Figure: tasks A and B and resources R1 and R2 connected in a cycle] circular dependency: deadlock
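Looking for a circular dependency amounts to looking for a cycle in the graph. The sketch below is one possible way to do it, assuming the usual convention that a request is an edge from task to resource and an assignment is an edge from resource to task, and that each resource has a single instance (so a cycle implies deadlock). The adjacency matrix edge, the node numbering, and the function names are hypothetical.

```c
#include <stdbool.h>

#define MAX_NODES 8   /* assumption: tasks and resources share one index space */

/* edge[u][v] is true for "task u requires resource v"
 * or "resource u is held by task v". */
static bool edge[MAX_NODES][MAX_NODES];

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = finished. */
static bool dfs(int u, int n, int *state) {
    state[u] = 1;
    for (int v = 0; v < n; v++) {
        if (!edge[u][v])
            continue;
        if (state[v] == 1)
            return true;                 /* back edge: cycle found */
        if (state[v] == 0 && dfs(v, n, state))
            return true;
    }
    state[u] = 2;
    return false;
}

/* Returns true if the graph over nodes 0..n-1 contains a cycle. */
bool has_cycle(int n) {
    int state[MAX_NODES] = {0};
    for (int u = 0; u < n; u++)
        if (state[u] == 0 && dfs(u, n, state))
            return true;
    return false;
}
```

With nodes A=0, B=1, R1=2, R2=3, the first figure is edge[1][2] and edge[2][0] (no cycle); adding edge[0][3] and edge[3][1] closes the cycle.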

Deadlock handling

• Prevention
  - make deadlock not possible
• Avoidance
  - disallow operations that may lead to a deadlock
• Detection
  - periodically check for deadlock and recover

Deadlock prevention

• Disallow hold-and-wait
  - all the needed resources must be requested simultaneously
  - a process is blocked until all the requested resources are available
  - inefficient
    - a process must also acquire resources that it needs only for short time intervals, or that it turns out not to need at all
• Allow preemption
  - when a request is refused, a process must release all its resources
  - OS may request a process to release resources
  - practical only for resources with an easily restored state (e.g., processor)
• Disallow circular waits
  - define an ordering on resources
  - a process that owns a resource R can request a resource Q only if ord(R) < ord(Q)
  - disallows incremental resource requests
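The resource-ordering rule is commonly applied to locks. The sketch below is one illustration, assuming that ord() is simply the memory address of the mutex; the account_t type and the functions lock_pair, unlock_pair, transfer, and run_transfers are hypothetical. Two threads transfer in opposite directions, which could deadlock if each locked "its own" account first.

```c
#include <pthread.h>
#include <stdint.h>

typedef struct {
    pthread_mutex_t lock;
    long balance;
} account_t;

/* Always lock the mutex with the lower address first (ord = address). */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a > (uintptr_t)b) { pthread_mutex_t *t = a; a = b; b = t; }
    pthread_mutex_lock(a);
    if (a != b)
        pthread_mutex_lock(b);
}

static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    if (a != b)
        pthread_mutex_unlock(b);
}

void transfer(account_t *from, account_t *to, long amount) {
    lock_pair(&from->lock, &to->lock);
    from->balance -= amount;
    to->balance += amount;
    unlock_pair(&from->lock, &to->lock);
}

static account_t acc_a = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static account_t acc_b = { PTHREAD_MUTEX_INITIALIZER, 1000 };

static void *a_to_b(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&acc_a, &acc_b, 1);
    return NULL;
}

static void *b_to_a(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&acc_b, &acc_a, 1);
    return NULL;
}

/* Runs both directions concurrently; returns the sum of the two balances. */
long run_transfers(void) {
    pthread_t t1, t2;
    acc_a.balance = 1000;
    acc_b.balance = 1000;
    pthread_create(&t1, NULL, a_to_b, NULL);
    pthread_create(&t2, NULL, b_to_a, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return acc_a.balance + acc_b.balance;
}
```

Because both threads acquire the two mutexes in the same global order, no circular wait can form between them.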


Deadlock avoidance

• Evaluate resource requests
  - grant a resource request only if a deadlock cannot occur
  - OS must know all the future requests
  - banker's algorithm (Dijkstra)

Deadlock detection

• Periodically check for deadlock
  - grant resource requests whenever possible
• If a deadlock is detected
  - kill all deadlocked processes
    - most common approach
  - successively abort deadlocked processes (until the deadlock no longer exists)
    - selection order can be a key factor
  - roll back all deadlocked processes to a previous state
    - backup and restore mechanism must be implemented
  - force a deadlocked process to release resources
    - preemption
    - roll back the process to a point prior to the resource acquisition

Banker's algorithm

• For a single resource type
  - Process
    - resources used
    - resources needed
  - Available resources
• grant a request only if it will lead to a safe state
  - safe state:
    - there exists at least one process that still needs no more resources than are available
  - unsafe state:
    - deadlock is possible (absence of deadlock cannot be ensured)

Banker's algorithm

• For a single resource type

Initial state:
             Allocated  Needed
Process A        0        15
Process B        0         7
Process C        0         4
Process D        0        12
Available: 20

Later state:
             Allocated  Needed
Process A        8         7
Process B        4         3
Process C        1         3
Process D        6         6
Available: 1

unsafe: with the available resources no process is guaranteed to terminate


Banker's algorithm

• For a single resource type

Initial state:
             Allocated  Needed
Process A        0        15
Process B        0         7
Process C        0         4
Process D        0        12
Available: 20

Later state:
             Allocated  Needed
Process A        7         8
Process B        4         3
Process C        2         2
Process D        5         7
Available: 2

safe: with the available resources process C can surely terminate

Banker's algorithm

• For a single resource type
  - safe state:
    $\exists\, i : \mathrm{Needed}(i) \le \mathrm{Available}$

Needed(i): resources still needed by process i
Available: resources still available on the system
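The condition above finds one process that can surely terminate. The usual formulation of the banker's safety check repeats that step: a process that can finish gives back its allocation, which may let another one finish, and the state is safe if all of them can finish in some order. A minimal sketch under that reading follows; the function name is_safe and the bound MAX_PROCS are hypothetical.

```c
#include <stdbool.h>

#define MAX_PROCS 16   /* assumption: upper bound on the number of processes */

/* Single resource type. Returns true if every process can terminate in some
 * order, starting from `available` free units. */
bool is_safe(int n, const int allocated[], const int needed[], int available) {
    bool done[MAX_PROCS] = { false };
    int finished = 0;
    bool progress = true;

    if (n > MAX_PROCS)
        return false;

    while (progress) {
        progress = false;
        for (int i = 0; i < n; i++) {
            if (!done[i] && needed[i] <= available) {
                available += allocated[i];   /* process i finishes and releases */
                done[i] = true;
                finished++;
                progress = true;
            }
        }
    }
    return finished == n;
}
```

On the two example states of the slides, the state with 1 unit available is reported unsafe, and the state with 2 units available is reported safe (C finishes, then B, D, and A).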

Banker's algorithm

• For several resource types
  - replicate information for each resource type

             Type-1             Type-2             Type-3
             Allocated Needed   Allocated Needed   Allocated Needed
Process A        0       15         0        1         0        3
Process B        0        7         0        2         0        7
Process C        0        4         0        4         0        4
Process D        0       12         0        1         0        9
Available           20                  4                 10

• safe state:
    $\exists\, i : \forall\, j \;\; \mathrm{Needed}(i,j) \le \mathrm{Available}(j)$

Needed(i,j): resources of type j still needed by process i
Available(j): resources of type j still available on the system
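The same check extends to several resource types by comparing vectors component by component. As before, this sketch repeats the step until no further process can finish, which is the usual formulation rather than something stated on the slide; NPROC, NTYPE, fits, and is_safe_multi are hypothetical names.

```c
#include <stdbool.h>

#define NPROC 4
#define NTYPE 3

/* True if need[j] <= avail[j] for every resource type j. */
static bool fits(const int need[NTYPE], const int avail[NTYPE]) {
    for (int j = 0; j < NTYPE; j++)
        if (need[j] > avail[j])
            return false;
    return true;
}

/* Returns true if all processes can terminate in some order. */
bool is_safe_multi(int allocated[NPROC][NTYPE], int needed[NPROC][NTYPE],
                   const int available[NTYPE]) {
    int avail[NTYPE];
    bool done[NPROC] = { false };
    int finished = 0;
    bool progress = true;

    for (int j = 0; j < NTYPE; j++)
        avail[j] = available[j];

    while (progress) {
        progress = false;
        for (int i = 0; i < NPROC; i++) {
            if (done[i] || !fits(needed[i], avail))
                continue;
            for (int j = 0; j < NTYPE; j++)
                avail[j] += allocated[i][j];   /* process i releases everything */
            done[i] = true;
            finished++;
            progress = true;
        }
    }
    return finished == NPROC;
}
```

For the initial state in the table (nothing allocated, 20, 4, and 10 units available), every process fits, so the state is safe.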

Synchronization

User level (POSIX) synchronization primitives

GCC builtins for atomic accesses

• Read-Modify-Write operations
  - __sync_fetch_and_add(type *ptr, type value);
  - __sync_fetch_and_sub(type *ptr, type value);
  - __sync_fetch_and_or(type *ptr, type value);
  - __sync_fetch_and_and(type *ptr, type value);
  - __sync_fetch_and_xor(type *ptr, type value);
  - __sync_fetch_and_nand(type *ptr, type value);
• perform the operation suggested by the name, and return the old value;
• imply a full memory barrier

GCC builtins for atomic accesses

• Read-Modify-Write operations
  - __sync_add_and_fetch(type *ptr, type value);
  - __sync_sub_and_fetch(type *ptr, type value);
  - __sync_or_and_fetch(type *ptr, type value);
  - __sync_and_and_fetch(type *ptr, type value);
  - __sync_xor_and_fetch(type *ptr, type value);
  - __sync_nand_and_fetch(type *ptr, type value);
• perform the operation suggested by the name, and return the new value;
• imply a full memory barrier
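A small sketch of how these builtins might be used for a shared counter, assuming GCC (or a compatible compiler) and POSIX threads. The names counter, worker, and count_with_threads, and the constants, are hypothetical.

```c
#include <pthread.h>

#define MAX_THREADS 8
#define INCREMENTS 100000

static long counter;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        __sync_fetch_and_add(&counter, 1);   /* atomic counter++ */
    return NULL;
}

/* Starts up to MAX_THREADS workers and returns the final counter value. */
long count_with_threads(int nthreads) {
    pthread_t t[MAX_THREADS];
    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

The only difference between the two families is the returned value: with x == 5, __sync_fetch_and_add(&x, 3) returns 5, while __sync_add_and_fetch(&x, 3) returns 8; in both cases x becomes 8.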

GCC builtins for atomic accesses

• Read-Modify-Write operations
  - __sync_lock_test_and_set(type *ptr, type value);
    - performs an atomic exchange: writes value into *ptr and returns the previous contents of *ptr;
    - implies an acquire barrier
• Read-Test-Modify-Write operations
  - __sync_val_compare_and_swap(type *ptr, type oldval, type newval);
  - __sync_bool_compare_and_swap(type *ptr, type oldval, type newval);
    - perform an atomic compare-and-swap: if the current value of *ptr is oldval, then write newval into *ptr;
    - __sync_val_compare_and_swap returns the old value of *ptr
    - __sync_bool_compare_and_swap returns true if the comparison is successful
    - imply a full memory barrier
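Compare-and-swap is typically used in a retry loop: read the current value, compute the new one, and try to install it; if another thread changed the variable in the meantime, start again from the value it wrote. The sketch below applies this pattern to an "atomic maximum"; atomic_max is a hypothetical name, not a GCC builtin.

```c
/* Raises *ptr to `candidate` if candidate is larger.
 * Returns the value that *ptr held before this call took effect. */
long atomic_max(long *ptr, long candidate) {
    long old = __sync_fetch_and_add(ptr, 0);     /* atomic read of *ptr */
    while (old < candidate) {
        long seen = __sync_val_compare_and_swap(ptr, old, candidate);
        if (seen == old)
            break;        /* *ptr was still `old`: our write went through */
        old = seen;       /* somebody changed *ptr: retry with the fresh value */
    }
    return old;
}
```

With m == 3, atomic_max(&m, 7) returns 3 and leaves m == 7; a following atomic_max(&m, 5) returns 7 and leaves m unchanged.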

GCC builtins for atomic accesses

• Others:
  - __sync_lock_release(type *ptr);
    - writes 0 to *ptr;
    - implies a release barrier
  - __sync_synchronize();
    - issues a full memory barrier
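__sync_lock_test_and_set and __sync_lock_release are enough to build a simple spinlock. The following is a sketch, not a production lock: the names spin_lock, spin_unlock, and run_spinners are hypothetical, and the call to sched_yield() while spinning is an added assumption to keep the example well-behaved on machines with few cores.

```c
#include <pthread.h>
#include <sched.h>

static volatile int lock_word;   /* 0 = free, 1 = taken */
static long shared_counter;      /* protected by lock_word */

static void spin_lock(volatile int *l) {
    /* The exchange returns the old value: 1 means someone else holds the lock. */
    while (__sync_lock_test_and_set(l, 1))
        sched_yield();
}

static void spin_unlock(volatile int *l) {
    __sync_lock_release(l);      /* writes 0 with release semantics */
}

static void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < 50000; i++) {
        spin_lock(&lock_word);
        shared_counter++;        /* critical section */
        spin_unlock(&lock_word);
    }
    return NULL;
}

/* Runs two threads through the critical section; returns the final counter. */
long run_spinners(void) {
    pthread_t t1, t2;
    shared_counter = 0;
    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

The acquire barrier on the exchange and the release barrier on the store keep the accesses to shared_counter inside the critical section.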


User level (POSIX) synchronization primitives

• Low-level primitives
  - spinlocks
• High-level primitives
  - semaphores
  - mutexes
  - reader-writer locks
  - condition variables
  - barriers

User level (POSIX) synchronization primitives

• Low-level primitives
  - spinlocks
    - type: pthread_spinlock_t
    - operations:
        pthread_spin_init       initialization
        pthread_spin_destroy    deallocation
        pthread_spin_lock       locking
        pthread_spin_unlock     unlocking
        pthread_spin_trylock    tentative locking

User level (POSIX) synchronization primitives

• High-level primitives
  - semaphores
    - type: sem_t
    - operations:
        sem_init         initialization
        sem_destroy      deallocation
        sem_getvalue     reading the current value
        sem_wait         waiting
        sem_timedwait    waiting (with a timeout)
        sem_trywait      tentative waiting
        sem_post         unlocking
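A short sketch of the non-blocking operations, using a semaphore as a counter of free units. The helper names take_up_to and units_left are hypothetical.

```c
#include <semaphore.h>

/* Takes up to `wanted` units without blocking; returns how many were taken. */
int take_up_to(sem_t *s, int wanted) {
    int got = 0;
    while (got < wanted && sem_trywait(s) == 0)   /* fails instead of sleeping */
        got++;
    return got;
}

/* Returns the current value of the semaphore, or -1 on error. */
int units_left(sem_t *s) {
    int v;
    return sem_getvalue(s, &v) == 0 ? v : -1;
}
```

For a semaphore initialized with sem_init(&pool, 0, 3), take_up_to(&pool, 5) takes the three available units and then stops, where a plain sem_wait would have put the caller to sleep on the fourth attempt.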

User level (POSIX) synchronization primitives

• High-level primitives
  - mutexes
    - type: pthread_mutex_t
    - operations:
        pthread_mutex_init       initialization
        pthread_mutex_destroy    deallocation
        pthread_mutex_lock       locking
        pthread_mutex_unlock     unlocking
        pthread_mutex_trylock    tentative locking


User level (POSIX) synchronization primitives

• High-level primitives
  - reader-writer locks
    - type: pthread_rwlock_t
    - operations:
        pthread_rwlock_init         initialization
        pthread_rwlock_destroy      deallocation
        pthread_rwlock_rdlock       locking (for reading)
        pthread_rwlock_wrlock       locking (for writing)
        pthread_rwlock_unlock       unlocking
        pthread_rwlock_tryrdlock    tentative locking (for reading)
        pthread_rwlock_trywrlock    tentative locking (for writing)
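The reader-writer lock packages the Readers – Writers problem into a single primitive. A minimal sketch follows, with a hypothetical shared integer table_value and hypothetical wrappers table_read, table_write, and table_try_write.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static int table_value;          /* the shared data */

/* Many readers may hold the lock at the same time. */
int table_read(void) {
    pthread_rwlock_rdlock(&table_lock);
    int v = table_value;
    pthread_rwlock_unlock(&table_lock);
    return v;
}

/* A writer gets exclusive access. */
void table_write(int v) {
    pthread_rwlock_wrlock(&table_lock);
    table_value = v;
    pthread_rwlock_unlock(&table_lock);
}

/* Tentative write: returns 1 on success, 0 if the lock is currently held. */
int table_try_write(int v) {
    if (pthread_rwlock_trywrlock(&table_lock) != 0)
        return 0;
    table_value = v;
    pthread_rwlock_unlock(&table_lock);
    return 1;
}
```

While any reader holds the lock, table_try_write returns 0 immediately instead of blocking.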

User level (POSIX) synchronization primitives

• High-level primitives
  - condition variables
    - type: pthread_cond_t
    - operations:
        pthread_cond_init         initialization
        pthread_cond_destroy      deallocation
        pthread_cond_wait         waiting
        pthread_cond_timedwait    waiting
        pthread_cond_signal       notifying
        pthread_cond_broadcast    notifying
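A condition variable is always used together with a mutex and a predicate that the waiter re-checks in a loop. The sketch below hands a value from one thread to another; the names ready, observed, waiter, and run_handoff are hypothetical, and the value passed is assumed to be non-zero.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready;       /* the predicate, protected by lock (0 = not ready) */
static int observed;    /* what the waiter saw */

static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!ready)                        /* re-check: wakeups may be spurious */
        pthread_cond_wait(&cond, &lock);  /* releases lock while waiting */
    observed = ready;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts a waiter, publishes `value` (non-zero), and returns what it observed. */
int run_handoff(int value) {
    pthread_t t;
    ready = 0;
    observed = 0;
    pthread_create(&t, NULL, waiter, NULL);

    pthread_mutex_lock(&lock);
    ready = value;                        /* change the predicate under the lock */
    pthread_cond_signal(&cond);           /* notify one waiter */
    pthread_mutex_unlock(&lock);

    pthread_join(t, NULL);
    return observed;
}
```

The loop around pthread_cond_wait also covers the case where the signal is sent before the waiter starts waiting: the waiter then finds ready already set and never sleeps.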

User level (POSIX) synchronization primitives

• High-level primitives
  - barriers
    - type: pthread_barrier_t
    - operations:
        pthread_barrier_init       initialization
        pthread_barrier_destroy    deallocation
        pthread_barrier_wait       waiting
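A barrier makes a group of threads wait for each other between two phases of a computation. In this sketch each thread writes its own slot in phase 1 and reads its neighbour's slot in phase 2; the arrays stage1 and stage2 and the function run_phases are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>

#define NT 4

static pthread_barrier_t barrier;
static int stage1[NT];   /* written in phase 1, one slot per thread */
static int stage2[NT];   /* written in phase 2, one slot per thread */

static void *worker(void *arg) {
    int i = (int)(intptr_t)arg;
    stage1[i] = i + 1;                    /* phase 1 */
    pthread_barrier_wait(&barrier);       /* wait until all NT threads arrive */
    stage2[i] = stage1[(i + 1) % NT];     /* phase 2: neighbour's result is ready */
    return NULL;
}

/* Runs both phases on NT threads and returns the sum of stage2[]. */
int run_phases(void) {
    pthread_t t[NT];
    int sum = 0;
    pthread_barrier_init(&barrier, NULL, NT);
    for (int i = 0; i < NT; i++)
        pthread_create(&t[i], NULL, worker, (void *)(intptr_t)i);
    for (int i = 0; i < NT; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    for (int i = 0; i < NT; i++)
        sum += stage2[i];
    return sum;
}
```

Without the barrier, a thread could read its neighbour's stage1 slot before it has been written.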

User level (POSIX) synchronization primitives

Low-level (no sleeping)
    Lock                  pthread_spinlock_t
    RW-Lock               NO

High-level (may sleep)
    Mutex                 pthread_mutex_t
    Semaphore             sem_t
    RW-mutex              pthread_rwlock_t
    Condition Variable    pthread_cond_t
    Barrier               pthread_barrier_t