Spin Locks and Contention. Based on slides by Maurice Herlihy & Nir Shavit. Tomer Gurevich

Dec 16, 2015


Keyshawn Basden
Transcript
Page 1: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Spin Locks and Contention

Based on slides by Maurice Herlihy & Nir Shavit

Tomer Gurevich

Page 2: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Mutual Exclusion

• Most programs aren’t embarrassingly parallel

• “Critical sections” of the code must be executed by only one thread at a time to ensure correctness

• Use locks for mutual exclusion

Art of Multiprocessor Programming 2

Page 3: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Example: concurrent counter

[Figure: timeline of Thread 1 and Thread 2 incrementing a shared counter. Both threads read 1 (R1) and both write 2 (W2), so one increment is lost.]
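The interleaving on this slide can be replayed deterministically in a single thread. This is a minimal sketch that assumes R1 and W2 mean "read 1" and "write 2"; the class and method names are hypothetical.

```java
// Hypothetical demo class: replays the slide's interleaving step by step.
class LostUpdateDemo {
    static int replay() {
        int counter = 1;      // shared counter starts at 1
        int t1 = counter;     // Thread 1 reads 1  (R1)
        int t2 = counter;     // Thread 2 reads 1  (R1)
        counter = t1 + 1;     // Thread 1 writes 2 (W2)
        counter = t2 + 1;     // Thread 2 writes 2 (W2)
        return counter;       // two increments ran, but the value is 2
    }
}
```

Without mutual exclusion, the read and the write of an increment can be separated by another thread's read, which is why the increment belongs in a critical section.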

Page 4: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Locks

[Figure: threads take the lock, run the critical section (CS), and reset the lock upon exit.]

…the lock introduces a sequential bottleneck

Page 5: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

What should you do if you can’t get a lock?

• Keep trying
  – “Spin” or “busy-wait”
  – Good if delays are short

• Give up the processor
  – Good if delays are long
  – Always good on a uniprocessor

Page 6: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Outline

• Spinlock review
• TAS-lock optimizations
• Queue locks
• Abortable locks

Page 7: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Review: Test-and-Set

• Atomic operation
• Test-and-set (addr, new_val)
  – Set the current value of the word addr to new_val
  – Return the old value
• TAS aka “getAndSet”

Page 8: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Review: Test-and-Set

public class AtomicBoolean {
  boolean value;

  // Atomically store newValue and return the previous value.
  public synchronized boolean getAndSet(boolean newValue) {
    boolean prior = value;
    value = newValue;
    return prior;
  }
}

Page 9: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Test-and-Set Locks

• Locking
  – Lock is free: value is false
  – Lock is taken: value is true
• Acquire lock by calling TAS
  – If result is false, you win
  – If result is true, you lose
• Release lock by writing false

Page 10: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Test-and-set Lock

class TASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (state.getAndSet(true)) {}
  }

  void unlock() {
    state.set(false);
  }
}

Page 11: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Test-and-set Lock: annotations on the code above

• AtomicBoolean state: lock state is an AtomicBoolean
• while (state.getAndSet(true)) {}: keep trying until lock acquired
• state.set(false): release lock by resetting state to false

Page 14: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Space Complexity

• TAS spin-lock has small “footprint”
• N-thread spin-lock uses O(1) space

Page 15: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Performance

• Experiment
  – n threads
  – Increment shared counter 1 million times
• How long should it take?
• How long does it take?
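The experiment can be sketched as a small harness. The TASLock below mirrors the lock shown earlier in the deck; the class CounterExperiment, its run method, and the way the work is split across threads are assumptions made for illustration, not the setup used for the measurements in the slides.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class TASLock {
    private final AtomicBoolean state = new AtomicBoolean(false);
    void lock()   { while (state.getAndSet(true)) { } }  // spin until TAS returns false
    void unlock() { state.set(false); }
}

// Hypothetical harness: nThreads threads share `total` increments of one counter.
class CounterExperiment {
    static long counter;  // only touched while holding the lock

    // Returns the elapsed time in nanoseconds.
    static long run(int nThreads, int total) {
        TASLock lock = new TASLock();
        counter = 0;
        int perThread = total / nThreads;  // assumes total is divisible by nThreads
        Thread[] workers = new Thread[nThreads];
        long start = System.nanoTime();
        for (int i = 0; i < nThreads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.lock();
                    try { counter++; } finally { lock.unlock(); }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try { w.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return System.nanoTime() - start;
    }
}
```

Calling run with the same total and a growing thread count gives the kind of time-versus-threads curve the following slides discuss. Ideally the time would stay flat, because the total amount of work is fixed.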

Page 16: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Mystery #1

[Figure: time vs. number of threads, with one curve for the TAS lock and one for the ideal.]

What is going on?

Page 17: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Bus-Based Architectures

[Figure: several caches and the memory connected by a shared bus.]

Page 18: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Processor Issues Load Request

[Figure: a processor puts a “Gimme data” request on the bus.]

Memory Responds

[Figure: memory answers “Got your data right here” and the data is loaded into the requesting cache.]

Processor Issues Load Request

[Figure: another processor requests the same data with “Gimme data”; a cache that already holds it answers “I got data”.]

Other Processor Responds

[Figure: the cache that holds the data supplies it over the bus, so both caches now hold a copy.]

Page 26: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Cache Coherence

• We have lots of copies of data
  – Original copy in memory
  – Cached copies at processors
• Some processor modifies its own copy
  – What do we do with the others?
  – How to avoid confusion?

Page 27: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Modify Cached Data

[Figure: one processor modifies its cached copy of data that is also held in memory and in another cache.]

What’s up with the other copies?

Modified cache data

• Other caches invalidate data
• This cache acquires write permission
• Memory can be updated later

Page 33: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

What’s wrong with TASLock?

• TAS invalidates cache lines
• Spinners
  – Miss in cache
  – Go to bus
• Thread wants to release lock
  – Delayed behind spinners

Page 34: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Test-and-Test-and-Set Locks

• Lurking stage
  – Wait until lock “looks” free
  – Spin while read returns true (lock taken)
• Pouncing stage
  – As soon as lock “looks” available
  – Read returns false (lock free)
  – Call TAS to acquire lock
  – If TAS loses, back to lurking

Page 35: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Test-and-test-and-set Lock

class TTASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (true) {
      while (state.get()) {}               // lurk: spin on the cached value
      if (!state.getAndSet(true)) return;  // pounce: lock looked free, try TAS
    }
  }
}

Page 36: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Test-and-test-and-set Lock: annotations on the code above

• while (state.get()) {}: wait until lock looks free
• if (!state.getAndSet(true)) return: then try to acquire it
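The slide shows only lock(). A complete version might look like the sketch below, which assumes the same unlock() as the TAS lock and adds a hypothetical isLocked() helper for inspection.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class TTASLock {
    private final AtomicBoolean state = new AtomicBoolean(false);

    void lock() {
        while (true) {
            while (state.get()) { }              // lurk: read-only spin, served from the local cache
            if (!state.getAndSet(true)) return;  // pounce: TAS returned false, so we won
            // TAS returned true: someone else got there first, go back to lurking
        }
    }

    // Assumed to match the TAS lock: release by writing false.
    void unlock() { state.set(false); }

    // Hypothetical helper, not part of the slides.
    boolean isLocked() { return state.get(); }
}
```

The only difference from the TAS lock is the inner read loop, which keeps waiting threads off the bus while the lock is held.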

Page 38: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Graph

[Figure: time vs. number of threads, with curves for the TAS lock, the TTAS lock, and the ideal.]

Page 39: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Test-and-test-and-set

• Wait until lock “looks” free
  – Spin on local cache
  – No bus use while lock busy
• Problem: when lock is released
  – Invalidation storm …

Page 40: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Local Spinning while Lock is Busy

[Figure: every cache and the memory hold the value “busy”; spinners read their local copies.]

On Release

[Figure: the releasing thread writes “free”; the other cached copies become invalid.]

• Everyone misses, rereads
• Everyone tries TAS

Page 44: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

An important observation

[Figure: time axis of spin-lock attempts, labeled d, r1d, r2d.]

• If the lock looks free
• But I fail to get it
• There must be contention
• Better to back off than to collide again

Page 45: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Solution: delay

[Figure: time axis of spin-lock attempts, labeled d, 2d, 4d.]

If I fail to get lock
  – Wait random duration before retry
  – Each subsequent failure doubles expected wait

Page 46: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Exponential Backoff Lock

public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}               // state: the lock’s AtomicBoolean
      if (!state.getAndSet(true)) return;
      sleep(random() % delay);
      if (delay < MAX_DELAY) delay = 2 * delay;
    }
  }
}

Page 47: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Exponential Backoff Lock: annotations on the code above

• int delay = MIN_DELAY: fix minimum delay
• while (state.get()) {}: wait until lock looks free
• the getAndSet(true) test: if we win, return
• sleep(random() % delay): back off for random duration
• if (delay < MAX_DELAY) delay = 2 * delay: double max delay, within reason
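The slide code leaves state, sleep, random, MIN_DELAY and MAX_DELAY undefined. One way to fill them in is sketched below. The delay bounds, the use of ThreadLocalRandom and LockSupport.parkNanos, and the nextLimit helper are assumptions, not choices taken from the slides. Unlike the slide, this version also clamps the limit at the maximum instead of letting the last doubling overshoot it.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class BackoffLock {
    static final long MIN_DELAY_NANOS = 1_000;      // assumed value, needs tuning per platform
    static final long MAX_DELAY_NANOS = 1_000_000;  // assumed value, needs tuning per platform
    private final AtomicBoolean state = new AtomicBoolean(false);

    // Hypothetical helper: double the limit, but never beyond the maximum.
    static long nextLimit(long limit) {
        return Math.min(2 * limit, MAX_DELAY_NANOS);
    }

    void lock() {
        long limit = MIN_DELAY_NANOS;
        while (true) {
            while (state.get()) { }              // wait until the lock looks free
            if (!state.getAndSet(true)) return;  // if we win, return
            // Lost the race: pause for a random time in [0, limit) nanoseconds.
            LockSupport.parkNanos(ThreadLocalRandom.current().nextLong(limit));
            limit = nextLimit(limit);
        }
    }

    void unlock() { state.set(false); }
}
```

Drawing the pause at random from a growing range makes it less likely that threads which collided once will collide again on the next attempt.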

Page 52: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Spin-Waiting Overhead

[Figure: time vs. number of threads, with curves for the TTAS lock and the backoff lock.]

Page 53: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Backoff: Other Issues

• Good
  – Easy to implement
  – Beats TTAS lock
• Bad
  – Must choose parameters carefully
  – Not portable across platforms

Page 54: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Summary: basic TAS-Lock

• Performs well under low contention, but basic spin locks aren’t scalable

• All threads spin on the same shared memory location, causing a lot of bus traffic

• No fairness, so a thread might starve

Page 55: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Queue locks

• Keep FIFO order
• Scalable locks
• Harder to implement
• Hurt performance under low contention

Page 56: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Anderson Queue Lock

[Figure: an array of flags (T F F F F F F F) and a shared counter, next.]

• Idle: only the first flag is true (T).
• Acquiring: a thread calls getAndIncrement on next to take a slot. The first thread’s slot holds T, so it acquires the lock (“Mine!”).
• A second thread calls getAndIncrement, gets the following slot (F), and waits there while the first thread holds the lock.
• Released: the flags become F T F F F F F F, and the second thread acquires the lock.
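The deck shows the Anderson lock only as an animation. A minimal sketch of the same idea follows, assuming an AtomicIntegerArray for the flags (1 stands for T, 0 for F) and a fixed capacity that bounds the number of concurrent threads. The names ALock, mySlot and lastSlot are illustrative.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class ALock {
    private final int size;                   // capacity: maximum number of concurrent threads
    private final AtomicIntegerArray flags;   // 1 = may enter (T), 0 = wait (F)
    private final AtomicInteger next = new AtomicInteger(0);
    private final ThreadLocal<Integer> mySlot = ThreadLocal.withInitial(() -> 0);

    ALock(int capacity) {
        size = capacity;
        flags = new AtomicIntegerArray(capacity);
        flags.set(0, 1);                      // initially only the first flag is T
    }

    void lock() {
        int slot = Math.floorMod(next.getAndIncrement(), size);  // take the next slot
        mySlot.set(slot);
        while (flags.get(slot) == 0) { }      // spin on my own slot
    }

    void unlock() {
        int slot = mySlot.get();
        flags.set(slot, 0);                   // reset my slot for later reuse
        flags.set((slot + 1) % size, 1);      // hand the lock to the next slot
    }

    // Hypothetical helper for inspection: the slot this thread used last.
    int lastSlot() { return mySlot.get(); }
}
```

Because each waiting thread spins on a different array element, a release wakes exactly one successor, and threads are served in the order in which they took their slots.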

Page 65: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Problem: false sharing

• Each thread spins on a different variable, so there should be no reason for contention.

• But adjacent array elements may be contained within the same cache line…

Page 66: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

The Solution: Padding

[Figure: the flags array is padded (T / / / F / / /) so that each flag falls on its own cache line (Line 1, Line 2).]

Spin on my line
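Padding can be approximated by spacing the flags out inside a larger array. The sketch below assumes 64-byte cache lines and 4-byte array elements, hence a stride of 16. The class name and the stride are assumptions, and the JVM does not guarantee that the array itself starts on a cache-line boundary.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical padded flag array: logical slot i lives at physical index i * STRIDE.
class PaddedFlags {
    static final int STRIDE = 16;  // assumed: 64-byte line / 4-byte int
    private final AtomicIntegerArray cells;

    PaddedFlags(int slots) {
        cells = new AtomicIntegerArray(slots * STRIDE);
    }

    static int index(int slot) { return slot * STRIDE; }

    int get(int slot)             { return cells.get(index(slot)); }
    void set(int slot, int value) { cells.set(index(slot), value); }
}
```

The cost is space: each flag now occupies a whole cache line, which multiplies the footprint of the array.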

Page 67: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Performance

• Shorter handover than backoff
• Curve is practically flat
• Scalable performance

[Figure: curves for the queue lock and the TTAS lock.]

Page 68: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Anderson Queue Lock

• Good
  – Easy to implement queue lock
• Bad
  – Not space efficient
• What if unknown number of threads?
• What if small number of actual contenders?

Page 69: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

CLH Lock

• FIFO order
• Small, constant-size overhead per thread

Page 70: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

CLH Queue Lock

class Qnode {
  AtomicBoolean locked = new AtomicBoolean(true);
}

Page 71: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

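CLH Queue Lock

class CLHLock implements Lock {
  AtomicReference<Qnode> tail;  // initially a node whose locked is false
  ThreadLocal<Qnode> myNode = ThreadLocal.withInitial(Qnode::new);

  public void lock() {
    Qnode qnode = myNode.get();
    qnode.locked.set(true);     // needed once nodes are recycled
    Qnode pred = tail.getAndSet(qnode);
    while (pred.locked.get()) {}
  }
}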

Page 72: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

CLH Queue Lock: annotations on the code above

• AtomicReference<Qnode> tail: queue tail
• ThreadLocal<Qnode> myNode: thread-local Qnode
• tail.getAndSet(myNode): swap in my node
• while (pred.locked) {}: spin until predecessor releases lock

Page 76: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Initially

[Figure: tail refers to a single node whose locked field is false; the lock is idle.]

Purple Wants the Lock

[Figure: Purple creates a node with locked = true and swaps it into tail, receiving the old node (false) as its predecessor.]

Purple Has the Lock

[Figure: Purple’s predecessor is false, so Purple has acquired the lock; tail refers to Purple’s node (true).]

Red Wants the Lock

[Figure: Red creates a node with locked = true and swaps it into tail; Red’s predecessor is Purple’s node (true), so Red is still acquiring. The nodes form an implicit linked list.]

Page 87: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

CLH Queue Lock

class CLHLock implements Lock {
  …
  public void unlock() {
    myNode.get().locked.set(false);  // notify successor
    myNode.set(pred);                // pred: the predecessor node obtained in lock(), kept per thread
  }
}

Page 88: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

CLH Queue Lock: annotations on the code above

• myNode.locked.set(false): notify successor
• myNode = pred: recycle predecessor’s node
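Putting lock() and unlock() together, a self-contained CLH lock could look like the sketch below. It departs from the slide code in a few labeled ways: the node uses a volatile boolean instead of an AtomicBoolean, the predecessor is kept in a second thread-local (myPred, an assumed name) so that unlock() can reach it, and tailLocked() is a hypothetical inspection helper.

```java
import java.util.concurrent.atomic.AtomicReference;

class CLHLock {
    static final class QNode {
        volatile boolean locked;  // assumption: volatile field instead of the slide's AtomicBoolean
    }

    // The queue starts with one node whose locked field is false.
    private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
    private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

    void lock() {
        QNode node = myNode.get();
        node.locked = true;                 // announce that I want (or hold) the lock
        QNode pred = tail.getAndSet(node);  // swap in my node
        myPred.set(pred);
        while (pred.locked) { }             // spin until predecessor releases lock
    }

    void unlock() {
        QNode node = myNode.get();
        node.locked = false;                // notify successor
        myNode.set(myPred.get());           // recycle predecessor's node
    }

    // Hypothetical helper for inspection: state of the node currently at the tail.
    boolean tailLocked() { return tail.get().locked; }
}
```

Recycling the predecessor's node is what keeps the overhead constant per thread: after a release, the node just given up may still be read by the successor, so the releasing thread takes over the node nobody is watching any more.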

Page 90: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Purple Releases

[Figure: Purple sets its node’s locked field to false; Red, spinning on that node, sees false (“Bingo!”) and acquires the lock. Red’s own node stays true.]

Page 92: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Space Usage

• Let
  – L = number of locks
  – N = number of threads
• ALock
  – O(LN)
• CLH lock
  – O(L+N)
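As a rough worked example of these bounds, take L = 1000 locks and N = 64 threads. Both figures are assumptions chosen for illustration, and constant factors are ignored. The ALock needs one N-slot array per lock, while the CLH lock needs roughly one node per lock plus one per thread:

```latex
\begin{align*}
\text{ALock:} \quad & L \cdot N = 1000 \times 64 = 64{,}000 \text{ flag slots} \\
\text{CLH lock:} \quad & L + N = 1000 + 64 = 1{,}064 \text{ nodes}
\end{align*}
```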

Page 93: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

CLH Lock

• Good
  – Lock release affects successor only
  – Small, constant-sized space
• Bad
  – Doesn’t work for uncached NUMA architectures

Page 94: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

NUMA Architectures

• Acronym:
  – Non-Uniform Memory Architecture
• Illusion:
  – Flat shared memory
• Truth:
  – No caches (sometimes)
  – Some memory regions faster than others

Page 95: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

MCS Lock

• FIFO order, list-based queue lock
• Similar to CLH
• Spin on local memory only, solving the NUMA problem

Page 96: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

MCS lock

• Each node now contains a “next” field.

• Each thread spins locally on the “locked” field of its own node.

• Upon release, notify the next node that you have finished.
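The deck gives no code for the MCS lock, so the following is a sketch of how the three points above can fit together. The field and class names, and the isIdle() helper, are assumptions.

```java
import java.util.concurrent.atomic.AtomicReference;

class MCSLock {
    static final class QNode {
        volatile boolean locked;  // the owner of this node spins on this field
        volatile QNode next;      // successor in the queue, if any
    }

    private final AtomicReference<QNode> tail = new AtomicReference<>(null);
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);

    void lock() {
        QNode node = myNode.get();
        QNode pred = tail.getAndSet(node);   // append my node to the queue
        if (pred != null) {                  // someone is ahead of me
            node.locked = true;
            pred.next = node;                // link in behind the predecessor
            while (node.locked) { }          // spin on my own node only
        }
    }

    void unlock() {
        QNode node = myNode.get();
        if (node.next == null) {
            // No visible successor: try to mark the queue empty.
            if (tail.compareAndSet(node, null)) return;
            // A successor swapped itself in but has not linked yet: wait for the link.
            while (node.next == null) { }
        }
        node.next.locked = false;            // notify the next node that I have finished
        node.next = null;                    // reset my node for reuse
    }

    // Hypothetical helper for inspection: true when no thread holds or waits for the lock.
    boolean isIdle() { return tail.get() == null; }
}
```

In contrast to CLH, each thread spins on a field in its own node, so the spin location can sit in memory that is local to the spinning processor.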


Page 97: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Abortable Locks

• What if you want to give up waiting for a lock?
• For example
  – Timeout
  – Database transaction aborted by user

Page 98: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Back-off Lock

• Aborting is trivial
  – Just return from lock() call
• Extra benefit:
  – No cleaning up
  – Immediate return
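A timed tryLock for a backoff lock illustrates why aborting is trivial here. This is a minimal sketch: the class name, the delay bounds, and the choice to back off after every unsuccessful attempt are assumptions.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class TimedBackoffLock {
    private static final long MIN_DELAY_NANOS = 1_000;      // assumed
    private static final long MAX_DELAY_NANOS = 1_000_000;  // assumed
    private final AtomicBoolean state = new AtomicBoolean(false);

    // Returns true if the lock was acquired before the timeout expired.
    boolean tryLock(long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long limit = MIN_DELAY_NANOS;
        while (true) {
            if (!state.get() && !state.getAndSet(true)) return true;
            // Giving up is just a return: no queue entry or other shared state to clean up.
            if (System.nanoTime() - deadline >= 0) return false;
            LockSupport.parkNanos(ThreadLocalRandom.current().nextLong(limit));
            limit = Math.min(2 * limit, MAX_DELAY_NANOS);
        }
    }

    void unlock() { state.set(false); }
}
```

The lock is not reentrant, so a second tryLock from the thread that already holds it simply times out.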

Page 99: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Queue Locks

• Can’t just quit
  – Thread in line behind will starve
• Need a graceful way out

Page 100: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Abortable CLH Lock

• When a thread gives up
  – Removing node in a wait-free way is hard
• Idea:
  – Let successor deal with it.

Page 101: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

Queue Locks

[Figure: a queue of nodes. The head holds the lock (locked, true); two threads behind it are spinning on their predecessors (true, true).]

• Time-out: the middle thread gives up and marks its node as aborted.
• Predecessor aborted: the thread behind it notices the aborted node.
• That thread then spins on the aborted node’s predecessor instead.
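The "let the successor deal with it" idea can be sketched as a queue lock with a timeout. The sketch is modeled on the time-out lock idea the deck refers to as ToLock; the details below (the AVAILABLE sentinel, a fresh node per attempt, the pred field doubling as the abort marker, and the tryFromOtherThread demo helper) are assumptions rather than code from the slides.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class TOLock {
    static final class QNode {
        // null: owner is waiting or holds the lock; AVAILABLE: lock released;
        // any other node: owner aborted, follow this reference instead.
        volatile QNode pred;
    }

    private static final QNode AVAILABLE = new QNode();
    private final AtomicReference<QNode> tail = new AtomicReference<>(null);
    private final ThreadLocal<QNode> myNode = new ThreadLocal<>();

    boolean tryLock(long time, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(time);
        QNode node = new QNode();            // fresh node for every attempt
        myNode.set(node);
        QNode pred = tail.getAndSet(node);
        if (pred == null || pred.pred == AVAILABLE) return true;  // lock was free
        while (System.nanoTime() - deadline < 0) {
            QNode predPred = pred.pred;
            if (predPred == AVAILABLE) return true;   // predecessor released the lock
            if (predPred != null) pred = predPred;    // predecessor aborted: skip over it
        }
        // Timed out. If I am still the tail, unlink myself; otherwise leave a
        // pointer to my predecessor so that my successor can skip me.
        if (!tail.compareAndSet(node, pred)) node.pred = pred;
        return false;
    }

    void unlock() {
        QNode node = myNode.get();
        // If nobody is behind me, empty the queue; otherwise signal the successor.
        if (!tail.compareAndSet(node, null)) node.pred = AVAILABLE;
    }

    // Hypothetical demo helper: try the lock from a new thread, releasing it on success.
    static boolean tryFromOtherThread(TOLock lock, long millis) {
        boolean[] result = new boolean[1];
        Thread t = new Thread(() -> {
            result[0] = lock.tryLock(millis, TimeUnit.MILLISECONDS);
            if (result[0]) lock.unlock();
        });
        t.start();
        try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return result[0];
    }
}
```

An aborting thread never removes its node from the middle of the queue. It only publishes where its successor should look next, which avoids the hard problem of unlinking a node while others are traversing the list.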

Page 105: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

One Lock To Rule Them All?

• TTAS+Backoff, CLH, MCS, ToLock…
• Each better than others in some way
• There is no one solution
• Lock we pick really depends on:
  – the application
  – the hardware
  – which properties are important

Page 106: Spin Locks and Contention Based on slides by by Maurice Herlihy & Nir Shavit Tomer Gurevich.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

• You are free:
  – to Share: to copy, distribute and transmit the work
  – to Remix: to adapt the work
• Under the following conditions:
  – Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work).
  – Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
• For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to
  – http://creativecommons.org/licenses/by-sa/3.0/.

• Any of the above conditions can be waived if you get permission from the copyright holder.

• Nothing in this license impairs or restricts the author's moral rights.