Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007


Dec 14, 2015

Page 1: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

Spin Locks and Contention Management

The Art of Multiprocessor Programming

Spring 2007

Page 2: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 2

Focus so far: Correctness

• Models
  – Accurate (we never lied to you)
  – But idealized (so we forgot to mention a few things)
• Protocols
  – Elegant
  – Important
  – But naïve

Page 3: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 3

New Focus: Performance

• Models
  – More complicated (not the same as complex!)
  – Still focus on principles (not soon obsolete)
• Protocols
  – Elegant (in their fashion)
  – Important (why else would we pay attention)
  – And realistic (your mileage may vary)

Page 4: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 4

Kinds of Architectures

• SISD (Uniprocessor)
  – Single instruction stream
  – Single data stream
• SIMD (Vector)
  – Single instruction
  – Multiple data
• MIMD (Multiprocessors)
  – Multiple instruction
  – Multiple data

Page 5: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 5

Kinds of Architectures

• SISD (Uniprocessor)
  – Single instruction stream
  – Single data stream
• SIMD (Vector)
  – Single instruction
  – Multiple data
• MIMD (Multiprocessors)
  – Multiple instruction
  – Multiple data

Our space: MIMD

Page 6: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 6

MIMD Architectures

• Memory Contention
• Communication Contention
• Communication Latency

[Figure: two MIMD organizations, labeled “Shared Bus” (with memory) and “Distributed”]

Page 7: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 7

Today: Revisit Mutual Exclusion

• Think of performance, not just correctness and progress

• Begin to understand how performance depends on our software properly utilizing the multiprocessor machine’s hardware

• And get to know a collection of locking algorithms…


Page 8: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 8

What Should you do if you can’t get a lock?

• Keep trying
  – “spin” or “busy-wait”
  – Good if delays are short
• Give up the processor
  – Good if delays are long
  – Always good on uniprocessor

Page 9: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 9

What Should you do if you can’t get a lock?

• Keep trying (our focus)
  – “spin” or “busy-wait”
  – Good if delays are short
• Give up the processor
  – Good if delays are long
  – Always good on uniprocessor

Page 10: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 10

Basic Spin-Lock

[Figure: threads contend for a spin lock that guards the critical section (CS); the lock is reset upon exit]

Page 11: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 11

Basic Spin-Lock

[Figure: threads contend for a spin lock that guards the critical section (CS); the lock is reset upon exit]

…lock introduces sequential bottleneck

Page 12: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 12

Basic Spin-Lock

[Figure: threads contend for a spin lock that guards the critical section (CS); the lock is reset upon exit]

…lock suffers from contention

Page 13: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 13

Basic Spin-Lock

[Figure: threads contend for a spin lock that guards the critical section (CS); the lock is reset upon exit]

Notice: these are two different phenomena

…lock suffers from contention

Sequential bottleneck: no parallelism

Page 14: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 14

Basic Spin-Lock

[Figure: threads contend for a spin lock that guards the critical section (CS); the lock is reset upon exit]

…lock suffers from contention

Contention: ???

Page 15: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 15

Review: Test-and-Set

• Boolean value
• Test-and-set (TAS)
  – Swap true with current value
  – Return value tells if prior value was true or false
• Can reset just by writing false
• TAS aka “getAndSet”

Page 16: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 16

Review: Test-and-Set

public class AtomicBoolean {
  boolean value;

  public synchronized boolean getAndSet(boolean newValue) {
    boolean prior = value;
    value = newValue;
    return prior;
  }
}

Page 17: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 17

Review: Test-and-Set

public class AtomicBoolean {
  boolean value;

  public synchronized boolean getAndSet(boolean newValue) {
    boolean prior = value;
    value = newValue;
    return prior;
  }
}

Package java.util.concurrent.atomic

Page 18: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 18

Review: Test-and-Set

public class AtomicBoolean {
  boolean value;

  public synchronized boolean getAndSet(boolean newValue) {
    boolean prior = value;
    value = newValue;
    return prior;
  }
}

Swap old and new values

Page 19: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 19

Review: Test-and-Set

AtomicBoolean lock = new AtomicBoolean(false);
…
boolean prior = lock.getAndSet(true);

Page 20: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 20

Review: Test-and-Set

AtomicBoolean lock = new AtomicBoolean(false);
…
boolean prior = lock.getAndSet(true);

Swapping in true is called “test-and-set” or TAS
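To make the return-value convention concrete, here is a minimal sketch using the real `java.util.concurrent.atomic.AtomicBoolean`. The class name `TasDemo` and the helper `run()` are hypothetical names introduced only for this illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class TasDemo {
    // Returns the prior values seen by three successive test-and-set calls.
    static boolean[] run() {
        AtomicBoolean lock = new AtomicBoolean(false); // false means "free"
        boolean first = lock.getAndSet(true);   // prior value false: this caller wins
        boolean second = lock.getAndSet(true);  // prior value true: this caller loses
        lock.set(false);                        // reset just by writing false
        boolean third = lock.getAndSet(true);   // free again, so prior value is false
        return new boolean[] { first, second, third };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "false true false"
    }
}
```

A caller that sees `false` come back knows it was the one that flipped the value, which is exactly the property a lock needs.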

Page 21: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 21

Test-and-Set Locks

• Locking
  – Lock is free: value is false
  – Lock is taken: value is true
• Acquire lock by calling TAS
  – If result is false, you win
  – If result is true, you lose
• Release lock by writing false

Page 22: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 22

Test-and-set Lock

class TASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (state.getAndSet(true)) {}
  }

  void unlock() {
    state.set(false);
  }
}

Page 23: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 23

Test-and-set Lock

class TASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (state.getAndSet(true)) {}
  }

  void unlock() {
    state.set(false);
  }
}

Lock state is AtomicBoolean

Page 24: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 24

Test-and-set Lock

class TASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (state.getAndSet(true)) {}
  }

  void unlock() {
    state.set(false);
  }
}

Keep trying until lock acquired

Page 25: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 25

Test-and-set Lock

class TASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (state.getAndSet(true)) {}
  }

  void unlock() {
    state.set(false);
  }
}

Release lock by resetting state to false

Page 26: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 26

Space Complexity

• TAS spin-lock has small “footprint”
• N-thread spin-lock uses O(1) space
• As opposed to O(n) for Peterson/Bakery
• How did we overcome the Ω(n) lower bound?
• We used an RMW (read-modify-write) operation…

Page 27: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 27

Performance

• Experiment
  – n threads
  – Increment shared counter 1 million times
• How long should it take?
• How long does it take?
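The experiment can be reproduced with a small harness. The sketch below is one possible setup, not the authors' benchmark: the class name `CounterExperiment`, the even split of the increments among threads, and the use of `System.nanoTime` for timing are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CounterExperiment {
    // TAS spin lock, written as on the earlier slide.
    static final class TASLock {
        private final AtomicBoolean state = new AtomicBoolean(false);
        void lock() { while (state.getAndSet(true)) { } }
        void unlock() { state.set(false); }
    }

    private static long counter; // shared; only touched while holding the lock

    // Runs `threads` workers that together perform `total` increments.
    // Returns { final counter value, elapsed nanoseconds }.
    static long[] run(int threads, int total) {
        counter = 0;
        TASLock lock = new TASLock();
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            // Split the work evenly; the first (total % threads) workers take one extra.
            final int share = total / threads + (t < total % threads ? 1 : 0);
            workers[t] = new Thread(() -> {
                for (int i = 0; i < share; i++) {
                    lock.lock();
                    try { counter++; } finally { lock.unlock(); }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { counter, elapsed };
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 8; n *= 2) {
            long[] r = run(n, 1_000_000);
            System.out.printf("threads=%d count=%d time=%.1f ms%n", n, r[0], r[1] / 1e6);
        }
    }
}
```

Because the total amount of work is fixed and every increment is serialized by the lock, the best one can hope for is a running time that stays constant as threads are added. Measured times depend heavily on the machine and the JVM, so treat any single run as indicative only.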

Page 28: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 28

Graph

[Plot: time vs. threads, showing the curve labeled “ideal”]

no speedup because of sequential bottleneck

Page 29: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 29

Mystery #1

[Plot: time vs. threads, showing the TAS lock and ideal curves]

What is going on?

Let’s try and fix it…

Page 30: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 30

Test-and-Test-and-Set Locks

• Lurking stage
  – Wait until lock “looks” free
  – Spin while read returns true (lock taken)
• Pouncing state
  – As soon as lock “looks” available
  – Read returns false (lock free)
  – Call TAS to acquire lock
  – If TAS loses, back to lurking

Page 31: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 31

Test-and-test-and-set Lock

class TTASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
    }
  }
}

Page 32: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 32

Test-and-test-and-set Lock

class TTASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
    }
  }
}

Wait until lock looks free

Page 33: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 33

Test-and-test-and-set Lock

class TTASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
    }
  }
}

Then try to acquire it

Page 34: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 34

Mystery #2

[Plot: time vs. threads, showing the TAS lock, TTAS lock, and ideal curves]

Page 35: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 35

Mystery

• Both
  – TAS and TTAS
  – Do the same thing (in our model)
• Except that
  – TTAS performs much better than TAS
  – Neither approaches ideal

Page 36: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 36

Opinion

• Our memory abstraction is broken
• TAS & TTAS methods
  – Are provably the same (in our model)
  – Except they aren’t (in field tests)
• Need a more detailed model …

Page 37: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 37

Bus-Based Architectures

[Figure: three caches and memory connected by a shared bus]

Page 38: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 38

Bus-Based Architectures

[Figure: three caches and memory connected by a shared bus]

Random access memory (10s of cycles)

Page 39: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 39

Bus-Based Architectures

[Figure: three caches and memory connected by a shared bus]

Shared Bus
• Broadcast medium
• One broadcaster at a time
• Processors and memory all “snoop”

Page 40: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 40

Bus-Based Architectures

[Figure: three caches and memory connected by a shared bus]

Per-Processor Caches
• Small
• Fast: 1 or 2 cycles
• Address & state information

Page 41: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 41

Jargon Watch

• Cache hit
  – “I found what I wanted in my cache”
  – Good Thing™

Page 42: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 42

Jargon Watch

• Cache hit
  – “I found what I wanted in my cache”
  – Good Thing™
• Cache miss
  – “I had to shlep all the way to memory for that data”
  – Bad Thing™

Page 43: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 43

Cave Canem

• This model is still a simplification
  – But not in any essential way
  – Illustrates basic principles
• Will discuss complexities later

Page 44: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 44

Processor Issues Load Request

[Figure: caches and memory on a shared bus; label: data]

Page 45: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 45

Processor Issues Load Request

[Figure: caches and memory on a shared bus; label: data]

“Gimme data”

Page 46: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 46

Memory Responds

[Figure: caches and memory on a shared bus; label: data]

“Got your data right here”

Page 47: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 47

Processor Issues Load Request

[Figure: caches and memory on a shared bus; label: data]

“Gimme data”

Page 48: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 48

Processor Issues Load Request

[Figure: caches and memory on a shared bus; label: data]

“Gimme data”

Page 49: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 49

Processor Issues Load Request

[Figure: caches and memory on a shared bus; label: data]

“I got data”

Page 50: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 50

Other Processor Responds

[Figure: caches and memory on a shared bus; label: data]

“I got data”

Page 51: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 51

Other Processor Responds

[Figure: caches and memory on a shared bus; label: data]

Page 52: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 52

Modify Cached Data

[Figure: caches and memory on a shared bus; label: data]

Page 53: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 53

Modify Cached Data

[Figure: caches and memory on a shared bus; label: data]

Page 54: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 54

Modify Cached Data

[Figure: caches and memory on a shared bus; label: data]

Page 55: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 55

Modify Cached Data

[Figure: caches and memory on a shared bus; label: data]

What’s up with the other copies?

Page 56: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 56

Cache Coherence

• We have lots of copies of data
  – Original copy in memory
  – Cached copies at processors
• Some processor modifies its own copy
  – What do we do with the others?
  – How to avoid confusion?

Page 57: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 57

Write-Back Caches

• Accumulate changes in cache
• Write back when needed
  – Need the cache for something else
  – Another processor wants it
• On first modification
  – Invalidate other entries
  – Requires non-trivial protocol …

Page 58: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 58

Write-Back Caches

• Cache entry has three states
  – Invalid: contains raw seething bits
  – Valid: I can read but I can’t write
  – Dirty: Data has been modified
• Intercept other load requests
• Write back to memory before using cache
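The three states can be exercised with a toy model. The sketch below is a deliberately simplified illustration, not a real coherence protocol: it tracks a single address, and the names `CacheModel`, `Bus`, `Line`, and `busTransactions` are hypothetical. A read miss or a first write costs one bus transaction; hits cost none.

```java
import java.util.ArrayList;
import java.util.List;

class CacheModel {
    enum State { INVALID, VALID, DIRTY }

    // The shared bus plus main memory for one address.
    static final class Bus {
        final List<Line> lines = new ArrayList<>();
        int memory = 0;
        int busTransactions = 0; // loads and invalidations that had to use the bus

        Line attach() { Line l = new Line(this); lines.add(l); return l; }
    }

    // One processor's cached copy of that address.
    static final class Line {
        private final Bus bus;
        State state = State.INVALID;
        int value;

        Line(Bus bus) { this.bus = bus; }

        int read() {
            if (state == State.INVALID) {            // cache miss: go to the bus
                bus.busTransactions++;
                for (Line other : bus.lines) {
                    if (other != this && other.state == State.DIRTY) {
                        bus.memory = other.value;    // owner intercepts the load and writes back
                        other.state = State.VALID;   // owner keeps a read-only copy
                    }
                }
                value = bus.memory;
                state = State.VALID;
            }
            return value;                            // cache hit: no bus traffic
        }

        void write(int v) {
            if (state != State.DIRTY) {              // first modification: invalidate other entries
                bus.busTransactions++;
                for (Line other : bus.lines) {
                    if (other != this) {
                        if (other.state == State.DIRTY) bus.memory = other.value;
                        other.state = State.INVALID;
                    }
                }
                state = State.DIRTY;
            }
            value = v;                               // changes accumulate in the cache
        }
    }

    public static void main(String[] args) {
        Bus bus = new Bus();
        Line a = bus.attach();
        Line b = bus.attach();
        a.read();      // miss
        b.read();      // miss
        a.write(7);    // invalidates b; memory is not updated yet
        a.write(8);    // hit on a dirty line: no bus traffic
        b.read();      // miss; a writes back and drops to VALID
        System.out.println(bus.busTransactions); // prints 4
    }
}
```

Note that memory still holds the stale value after `a.write(7)`: in a write-back cache the update reaches memory only when another processor asks for the data.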

Page 59: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 59

Invalidate

[Figure: caches and memory on a shared bus; label: data]

Page 60: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 60

Invalidate

[Figure: caches and memory on a shared bus; label: data]

“Mine, all mine!”

Page 61: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 61

Invalidate

[Figure: caches and memory on a shared bus; label: data]

“Uh, oh”

Page 62: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 62

Invalidate

[Figure: caches and memory on a shared bus; label: data]

Other caches lose read permission

Page 63: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 63

Invalidate

[Figure: caches and memory on a shared bus; label: data]

Other caches lose read permission

This cache acquires write permission

Page 64: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 64

Invalidate

[Figure: caches and memory on a shared bus; label: data]

Memory provides data only if not present in any cache, so no need to change it now (expensive)

Page 65: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 65

Another Processor Asks for Data

[Figure: caches and memory on a shared bus; label: data]

Page 66: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 66

Owner Responds

[Figure: caches and memory on a shared bus; label: data]

“Here it is!”

Page 67: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 67

End of the Day …

[Figure: caches and memory on a shared bus; label: data]

Reading OK, no writing

Page 68: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 68

Mutual Exclusion

• What do we want to optimize?
  – Bus bandwidth used by spinning threads
  – Release/Acquire latency
  – Acquire latency for idle lock

Page 69: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 69

Simple TASLock

• TAS invalidates cache lines
• Spinners
  – Miss in cache
  – Go to bus
• Thread wants to release lock
  – delayed behind spinners

Page 70: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 70

Test-and-test-and-set

• Wait until lock “looks” free
  – Spin on local cache
  – No bus use while lock busy
• Problem: when lock is released
  – Invalidation storm …

Page 71: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 71

Local Spinning while Lock is Busy

[Figure: caches and memory on a shared bus; every copy reads “busy”]

Page 72: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 72

On Release

[Figure: caches and memory on a shared bus; cache entries read “free”, “invalid”, “invalid”; memory reads “free”]

Page 73: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 73

On Release

[Figure: caches and memory on a shared bus; cache entries read “free”, “invalid”, “invalid”; memory reads “free”; two caches are marked “miss”]

Everyone misses, rereads

Page 74: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 74

On Release

[Figure: caches and memory on a shared bus; cache entries read “free”, “invalid”, “invalid”; memory reads “free”; two processors issue TAS(…)]

Everyone tries TAS

Page 75: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 75

Problems

• Everyone misses
  – Reads satisfied sequentially
• Everyone does TAS
  – Invalidates others’ caches
• Eventually quiesces after lock acquired
  – How long does this take?

Page 76: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 76

Measuring Quiescence Time

[Figure: processors P1, P2, …, Pn]

X = time of ops that don’t use the bus
Y = time of ops that cause intensive bus traffic

In the critical section, run ops X then ops Y. As long as quiescence time is less than X, there is no drop in performance.

By gradually varying X, we can determine the exact time to quiesce.

Page 77: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 77

Quiescence Time

Increases linearly with the number of processors for bus architecture

[Plot: time vs. threads]

Page 78: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 78

Mystery Explained

[Plot: time vs. threads, showing the TAS lock, TTAS lock, and ideal curves]

TTAS lock: better than TAS but still not as good as ideal

Page 79: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 79

Solution: Introduce Delay

[Figure: timeline of attempts on the spin lock, separated by delays labeled d, r1d, r2d]

• If the lock looks free
• But I fail to get it
• There must be lots of contention
• Better to back off than to collide again

Page 80: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 80

Dynamic Example: Exponential Backoff

[Figure: timeline of attempts on the spin lock, separated by delays labeled d, 2d, 4d]

If I fail to get lock
– wait random duration before retry
– Each subsequent failure doubles expected wait
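A runnable version of this idea might look like the sketch below. It is an illustration under stated assumptions: the delay bounds are hypothetical values in nanoseconds, and `LockSupport.parkNanos` with `ThreadLocalRandom` stands in for whatever sleep and random primitives a given platform offers. The `demo` helper exists only to check mutual exclusion.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class BackoffLock {
    // Hypothetical tuning values (nanoseconds); good choices are platform-dependent.
    private static final long MIN_DELAY = 1_000;
    private static final long MAX_DELAY = 1_000_000;

    private final AtomicBoolean state = new AtomicBoolean(false);

    void lock() {
        long delay = MIN_DELAY;
        while (true) {
            while (state.get()) { }                 // wait until the lock looks free
            if (!state.getAndSet(true)) return;     // won the TAS: lock acquired
            // Lost the race: pause for a random time in [0, delay) ...
            LockSupport.parkNanos(ThreadLocalRandom.current().nextLong(delay));
            // ... and double the bound for next time, up to the cap.
            if (delay < MAX_DELAY) delay = 2 * delay;
        }
    }

    void unlock() { state.set(false); }

    // Self-check: `threads` workers each add `perThread` to a counter guarded by the lock.
    static int demo(int threads, int perThread) {
        BackoffLock lock = new BackoffLock();
        int[] counter = { 0 };
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 10_000)); // 40000 when mutual exclusion holds
    }
}
```

Only a thread that saw the lock free and still lost the TAS backs off, since that combination is the signal for high contention.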

Page 81: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 81

Exponential Backoff Lock

public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
      sleep(random() % delay);
      if (delay < MAX_DELAY)
        delay = 2 * delay;
    }
  }
}

Page 82: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 82

Exponential Backoff Lock

public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
      sleep(random() % delay);
      if (delay < MAX_DELAY)
        delay = 2 * delay;
    }
  }
}

Fix minimum delay

Page 83: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 83

Exponential Backoff Lock

public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
      sleep(random() % delay);
      if (delay < MAX_DELAY)
        delay = 2 * delay;
    }
  }
}

Wait until lock looks free

Page 84: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 84

Exponential Backoff Lock

public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
      sleep(random() % delay);
      if (delay < MAX_DELAY)
        delay = 2 * delay;
    }
  }
}

If we win, return

Page 85: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 85

Exponential Backoff Lock

public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
      sleep(random() % delay);
      if (delay < MAX_DELAY)
        delay = 2 * delay;
    }
  }
}

Back off for random duration

Page 86: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 86

Exponential Backoff Lock

public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}
      if (!state.getAndSet(true))
        return;
      sleep(random() % delay);
      if (delay < MAX_DELAY)
        delay = 2 * delay;
    }
  }
}

Double max delay, within reason

Page 87: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 87

Spin-Waiting Overhead

[Plot: time vs. threads, showing the TTAS lock and backoff lock curves]

Page 88: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 88

Backoff: Other Issues

• Good
  – Easy to implement
  – Beats TTAS lock
• Bad
  – Must choose parameters carefully
  – Not portable across platforms

Page 89: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 89

Idea

• Avoid useless invalidations
  – By keeping a queue of threads
• Each thread
  – Notifies next in line
  – Without bothering the others

Page 90: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 90

Anderson Queue Lock

flags

next

T F F F F F F F

idle

Page 91: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 91

Anderson Queue Lock

flags

next

T F F F F F F F

acquiring

getAndIncrement

Page 92: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 92

Anderson Queue Lock

flags

next

T F F F F F F F

acquiring

getAndIncrement

Page 93: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 93

Anderson Queue Lock

flags

next

T F F F F F F F

acquired

Mine!

Page 94: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 94

Anderson Queue Lock

flags

next

T F F F F F F F

acquired acquiring

Page 95: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 95

Anderson Queue Lock

flags

next

T F F F F F F F

acquired acquiring

getAndIncrement

Page 96: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 96

Anderson Queue Lock

flags

next

T F F F F F F F

acquired acquiring

getAndIncrement

Page 97: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 97

acquired

Anderson Queue Lock

flags

next

T F F F F F F F

acquiring

Page 98: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 98

released

Anderson Queue Lock

flags

next

T T F F F F F F

acquired

Page 99: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 99

released

Anderson Queue Lock

flags

next

T T F F F F F F

acquired

Yow!

Page 100: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 100

Anderson Queue Lock

class ALock implements Lock {
  boolean[] flags = {true, false, …, false};
  AtomicInteger next = new AtomicInteger(0);
  ThreadLocal<Integer> mySlot;

Page 101: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 101

Anderson Queue Lock

class ALock implements Lock {
  boolean[] flags = {true, false, …, false};
  AtomicInteger next = new AtomicInteger(0);
  ThreadLocal<Integer> mySlot;

One flag per thread

Page 102: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 102

Anderson Queue Lock

class ALock implements Lock {
  boolean[] flags = {true, false, …, false};
  AtomicInteger next = new AtomicInteger(0);
  ThreadLocal<Integer> mySlot;

Next flag to use

Page 103: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 103

Anderson Queue Lock

class ALock implements Lock {
  boolean[] flags = {true, false, …, false};
  AtomicInteger next = new AtomicInteger(0);
  ThreadLocal<Integer> mySlot;

Thread-local variable

Page 104: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 104

Anderson Queue Lock

public void lock() {
  int slot = next.getAndIncrement();
  mySlot.set(slot);               // remember our slot for unlock()
  while (!flags[slot % n]) {};
  flags[slot % n] = false;
}

public void unlock() {
  flags[(mySlot.get() + 1) % n] = true;
}

Page 105: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 105

Anderson Queue Lock

public void lock() {
  int slot = next.getAndIncrement();
  mySlot.set(slot);               // remember our slot for unlock()
  while (!flags[slot % n]) {};
  flags[slot % n] = false;
}

public void unlock() {
  flags[(mySlot.get() + 1) % n] = true;
}

Take next slot

Page 106: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 106

Anderson Queue Lock

public void lock() {
  int slot = next.getAndIncrement();
  mySlot.set(slot);               // remember our slot for unlock()
  while (!flags[slot % n]) {};
  flags[slot % n] = false;
}

public void unlock() {
  flags[(mySlot.get() + 1) % n] = true;
}

Spin until told to go

Page 107: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 107

Anderson Queue Lock

public void lock() {
  int slot = next.getAndIncrement();
  mySlot.set(slot);               // remember our slot for unlock()
  while (!flags[slot % n]) {};
  flags[slot % n] = false;
}

public void unlock() {
  flags[(mySlot.get() + 1) % n] = true;
}

Prepare slot for re-use

Page 108: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 108

Anderson Queue Lock

public void lock() {
  int slot = next.getAndIncrement();
  mySlot.set(slot);               // remember our slot for unlock()
  while (!flags[slot % n]) {};
  flags[slot % n] = false;
}

public void unlock() {
  flags[(mySlot.get() + 1) % n] = true;
}

Tell next thread to go
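For readers who want to run the Anderson lock, here is a self-contained sketch. It departs from the slide in ways that should be read as assumptions: an `AtomicIntegerArray` (1 = go, 0 = wait) replaces the plain `boolean[]` so that spinning threads are guaranteed to see updates under the Java memory model, `Thread.yield()` is added to the spin loop so the demo behaves on machines with few cores, and `Math.floorMod` guards against a negative slot if the counter overflows. The `demo` helper is hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

class ALock {
    private final int n;                     // capacity: max threads contending at once
    private final AtomicIntegerArray flags;  // 1 = "go", 0 = "wait"
    private final AtomicInteger next = new AtomicInteger(0);
    private final ThreadLocal<Integer> mySlot = new ThreadLocal<>();

    ALock(int capacity) {
        n = capacity;
        flags = new AtomicIntegerArray(capacity); // all slots start at 0
        flags.set(0, 1);                          // the first arrival may enter
    }

    void lock() {
        int slot = Math.floorMod(next.getAndIncrement(), n); // take next slot
        mySlot.set(slot);                                    // remember it for unlock()
        while (flags.get(slot) == 0) { Thread.yield(); }     // spin until told to go
        flags.set(slot, 0);                                  // prepare slot for re-use
    }

    void unlock() {
        flags.set((mySlot.get() + 1) % n, 1);                // tell next thread to go
    }

    // Self-check: `threads` workers (at most `threads` contenders) share one counter.
    static int demo(int threads, int perThread) {
        ALock lock = new ALock(threads);
        int[] counter = { 0 };
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1_000)); // 4000 when mutual exclusion holds
    }
}
```

Each waiter spins on its own slot, so a release touches only the successor's flag. The lock is correct only while the number of simultaneously contending threads does not exceed the capacity.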

Page 109: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 109

Performance

• Curve is practically flat

• Scalable performance

• FIFO fairness

[Plot: curves labeled “queue” and “TTAS”]

Page 110: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 110

Anderson Queue Lock

• Good
  – First truly scalable lock
  – Simple, easy to implement
• Bad
  – Space hog
  – One bit per thread
    • Unknown number of threads?
    • Small number of actual contenders?

Page 111: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 111

CLH Lock

• FIFO order
• Small, constant-size overhead per thread

Page 112: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 112

Initially

false

tail

idle

Page 113: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 113

Initially

false

tail

idle

Queue tail

Page 114: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 114

Initially

false

tail

idle

Lock is free

Page 115: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 115

Initially

false

tail

idle

Page 116: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 116

Purple Wants the Lock

false

tail

acquiring

Page 117: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 117

Purple Wants the Lock

false

tail

acquiring

true

Page 118: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 118

Purple Wants the Lock

false

tail

acquiring

true

Swap

Page 119: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 119

Purple Has the Lock

false

tail

acquired

true

Page 120: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 120

Red Wants the Lock

[Figure: purple has acquired the lock and red is acquiring. Red swaps its own node, holding true, into the tail and then spins on purple's node, which also holds true.]

Actually, it spins on a cached copy.

Page 125: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 125

Purple Releases

[Figure: purple sets its node to false. Red, spinning on that node, sees false ("Bingo!") and acquires the lock. Red's own node still holds true.]

Page 127: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 127

Space Usage

• Let
  – L = number of locks
  – N = number of threads

• ALock
  – O(LN)

• CLH lock
  – O(L+N)
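To make the gap concrete, here is a rough count. It assumes ALock keeps one N-slot array per lock, while CLH needs one node per lock plus one per thread. Constants and cache-line padding are ignored, and the helper names are made up for illustration.

```java
// Back-of-the-envelope space comparison; not part of either lock.
class SpaceUsage {
    // ALock: every lock carries its own array with one slot per thread.
    static long alockSlots(long locks, long threads) {
        return locks * threads;
    }

    // CLH: one node per lock (the tail) plus one node per thread.
    static long clhNodes(long locks, long threads) {
        return locks + threads;
    }
}
```

With L = 1000 locks and N = 64 threads, this gives 64,000 slots for ALock against 1,064 nodes for CLH.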

Page 128: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 128

CLH Queue Lock

class Qnode {
  // true = not released yet
  AtomicBoolean locked = new AtomicBoolean(true);
}

Page 130: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 130

CLH Queue Lock

class CLHLock implements Lock {
  // Tail of the queue
  AtomicReference<Qnode> tail;
  // Thread-local Qnode
  ThreadLocal<Qnode> myNode = ThreadLocal.withInitial(Qnode::new);
  public void lock() {
    // Swap in my node
    Qnode pred = tail.getAndSet(myNode.get());
    // Spin until predecessor releases lock
    while (pred.locked.get()) {}
  }
}

Page 135: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 135

CLH Queue Lock

class CLHLock implements Lock {
  …
  public void unlock() {
    // Notify successor
    myNode.get().locked.set(false);
    // Recycle predecessor's node
    // (pred is the node lock() swapped out of the tail; it must be kept per thread)
    myNode.set(pred);
  }
}
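The lock() and unlock() fragments above can be combined into one class, sketched below. Several details are assumptions that the slides do not show: the `myPred` thread-local that carries the predecessor from lock() to unlock(), the initial dummy node in the tail, a plain volatile flag in place of AtomicBoolean, and `Thread.yield()` in the spin loop so the sketch behaves when threads outnumber cores.

```java
import java.util.concurrent.atomic.AtomicReference;

// A minimal CLH queue lock sketch, not a tuned implementation.
class CLHLock {
    static class Qnode {
        volatile boolean locked = false;
    }

    // The tail starts at a dummy node whose flag is false: the lock is free.
    private final AtomicReference<Qnode> tail = new AtomicReference<>(new Qnode());
    private final ThreadLocal<Qnode> myNode = ThreadLocal.withInitial(Qnode::new);
    // Assumed helper: remembers the predecessor seen in lock().
    private final ThreadLocal<Qnode> myPred = new ThreadLocal<>();

    public void lock() {
        Qnode qnode = myNode.get();
        qnode.locked = true;                 // not released yet
        Qnode pred = tail.getAndSet(qnode);  // swap in my node
        myPred.set(pred);
        while (pred.locked) {                // spin until predecessor releases
            Thread.yield();
        }
    }

    public void unlock() {
        Qnode qnode = myNode.get();
        qnode.locked = false;                // notify successor
        myNode.set(myPred.get());            // recycle predecessor's node
    }
}
```

Recycling the predecessor's node is what keeps the space per thread constant: the thread's old node may still be read by its successor, so the thread takes over the node that nobody spins on any more.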

Page 138: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 138

CLH Lock

• Good
  – Lock release affects successor only
  – Small, constant-sized space

• Bad
  – Doesn't work for uncached NUMA architectures

Page 139: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 139

NUMA Architectures

• Acronym:
  – Non-Uniform Memory Architecture

• Illusion:
  – Flat shared memory

• Truth:
  – No caches (sometimes)
  – Some memory regions faster than others

Page 140: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 140

NUMA Machines

Spinning on local memory is fast

Page 141: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 141

NUMA Machines

Spinning on remote memory is slow

Page 142: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 142

CLH Lock

• Each thread spins on predecessor's memory

• Could be far away …

Page 143: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 143

MCS Lock

• FIFO order
• Spin on local memory only
• Small, constant-size overhead

Page 144: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 144

Initially

[Figure: the lock's tail pointer and an idle thread.]

Acquiring

[Figure: the acquiring thread allocates a Qnode and swaps it into the tail. The queue was empty, so the thread has acquired the lock.]

Page 149: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 149

Acquiring

[Figure: a second thread is acquiring while the first has acquired the lock. It swaps its own Qnode, holding true, into the tail and spins on that node until the flag becomes false ("Yes!").]

Page 155: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 155

MCS Queue Lock

class Qnode {
  // volatile so that a spinning thread sees updates made by other threads
  volatile boolean locked = false;
  volatile Qnode next = null;
}

Page 156: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 156

MCS Queue Lock

class MCSLock implements Lock {
  AtomicReference<Qnode> tail;
  public void lock() {
    // Make a Qnode
    Qnode qnode = new Qnode();
    // Add my node to the tail of the queue
    Qnode pred = tail.getAndSet(qnode);
    // Fix if queue was non-empty
    if (pred != null) {
      qnode.locked = true;
      pred.next = qnode;
      // Wait until unlocked
      while (qnode.locked) {}
    }
  }
}

Page 161: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 161

MCS Queue Lock

class MCSLock implements Lock {
  AtomicReference<Qnode> tail;
  public void unlock() {
    // qnode is the node this thread enqueued in lock()
    // Missing successor?
    if (qnode.next == null) {
      // If really no successor, return
      if (tail.compareAndSet(qnode, null))
        return;
      // Otherwise wait for successor to catch up
      while (qnode.next == null) {}
    }
    // Pass lock to successor
    qnode.next.locked = false;
  }
}
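Put together, the two MCS methods might look like the sketch below. The thread-local `myNode` (so that unlock() can find the node that lock() enqueued), the reset of `next` at the end of unlock(), and the `Thread.yield()` calls in the spin loops are assumptions added to make the class complete; they are not taken from the slides.

```java
import java.util.concurrent.atomic.AtomicReference;

// A minimal MCS queue lock sketch: each thread spins on its own node.
class MCSLock {
    static class Qnode {
        volatile boolean locked = false;
        volatile Qnode next = null;
    }

    private final AtomicReference<Qnode> tail = new AtomicReference<>(null);
    // Assumed helper: one reusable node per thread.
    private final ThreadLocal<Qnode> myNode = ThreadLocal.withInitial(Qnode::new);

    public void lock() {
        Qnode qnode = myNode.get();
        Qnode pred = tail.getAndSet(qnode);  // add my node to the tail
        if (pred != null) {                  // queue was non-empty
            qnode.locked = true;
            pred.next = qnode;               // link behind predecessor
            while (qnode.locked) {           // spin on my own node
                Thread.yield();
            }
        }
    }

    public void unlock() {
        Qnode qnode = myNode.get();
        if (qnode.next == null) {            // missing successor?
            if (tail.compareAndSet(qnode, null)) {
                return;                      // really no successor
            }
            while (qnode.next == null) {     // successor has swapped but not linked yet
                Thread.yield();
            }
        }
        qnode.next.locked = false;           // pass lock to successor
        qnode.next = null;                   // make the node reusable
    }
}
```

The order of the two writes in lock() matters: the flag is set to true before the node is linked, so the predecessor cannot clear it too early.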

Page 166: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 166

Purple Release

[Figure: purple is releasing and sees no successor on its node, while another thread is in the middle of its swap.]

By looking at the queue, I see another thread is active.

I have to wait for that thread to finish.

[Figure: the other thread is now spinning on its own node, which holds true. Purple sets that node to false, and the other thread holds the lock.]

Page 172: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 172

Abortable Locks

• What if you want to give up waiting for a lock?

• For example
  – Timeout
  – Database transaction aborted by user

Page 173: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 173

Back-off Lock

• Aborting is trivial
  – Just return from lock() call

• Extra benefit:
  – No cleaning up
  – Wait-free
  – Immediate return
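As an illustration of why aborting is trivial here, the sketch below adds a timeout to a test-and-test-and-set lock with exponential backoff. The method name `tryLock`, the delay bounds, and the use of `LockSupport.parkNanos` for the backoff pause are assumptions chosen for this example.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// A back-off lock whose acquire can give up: aborting is just a return.
class BackoffLock {
    private static final long MIN_DELAY_NANOS = 1_000;      // assumed bound
    private static final long MAX_DELAY_NANOS = 1_000_000;  // assumed bound
    private final AtomicBoolean state = new AtomicBoolean(false);

    // Returns true if the lock was acquired, false if the caller timed out.
    public boolean tryLock(long timeoutNanos) {
        long start = System.nanoTime();
        long limit = MIN_DELAY_NANOS;
        while (true) {
            // Test-and-test-and-set: read first, swap only if it looks free.
            if (!state.get() && !state.getAndSet(true)) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;                // no queue entry to clean up
            }
            long delay = ThreadLocalRandom.current().nextLong(limit) + 1;
            LockSupport.parkNanos(delay);    // back off for a random interval
            limit = Math.min(MAX_DELAY_NANOS, 2 * limit);
        }
    }

    public void unlock() {
        state.set(false);
    }
}
```

A thread that times out leaves no trace in the lock's state, which is exactly what a queue lock cannot offer without extra work.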

Page 174: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 174

Queue Locks

• Can't just quit
  – Thread in line behind will starve

• Need a graceful way out

Page 175: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 175

Queue Locks

[Figure: three threads are spinning, each on a flag holding true. As each lock holder releases, the next flag becomes false and the next thread in line holds the lock, until the last thread holds it.]

[Figure: the same three spinning threads, but one of them simply quits. When the flag ahead of it becomes false, nobody passes the lock on, and the thread in line behind keeps spinning on true ("pwned").]

Page 184: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 184

Abortable CLH Lock

• When a thread gives up
  – Removing node in a wait-free way is hard

• Idea:
  – let successor deal with it.

Page 185: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 185

Initially

[Figure: the tail, an idle thread, and a node marked A.]

Each node holds a pointer to its predecessor (or null). The distinguished available node (A) means the lock is free.

Acquiring

[Figure: the acquiring thread creates a node and swaps it into the tail.]

Null predecessor means lock not released or aborted.

Acquired

[Figure: the thread is now locked, next to the node marked A.]

Pointer to AVAILABLE means lock is free.

Page 192: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 192

Normal Case

[Figure: one thread is locked and two threads behind it are spinning.]

Null means lock is not free & request not aborted.

One Thread Aborts

[Figure: the thread in the middle has timed out.]

Successor Notices

[Figure: the spinning successor looks at the timed-out node.]

Non-null means predecessor aborted.

Recycle Predecessor's Node

[Figure: only the spinning thread and the locked thread remain.]

Spin on Earlier Node

[Figure: the successor spins on the earlier node. The lock holder has released, and that node now points to A.]

The lock is now mine.

Page 198: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 198

Time-out Lock

public class TOLock implements Lock {
  // Distinguished node to signify free lock
  static Qnode AVAILABLE = new Qnode();
  // Tail of the queue
  AtomicReference<Qnode> tail;
  // Remember my node …
  ThreadLocal<Qnode> myNode;

Page 202: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 202

Time-out Lock

public boolean lock(long timeout) {
  // Create & initialize node
  Qnode qnode = new Qnode();
  myNode.set(qnode);
  qnode.prev = null;
  // Swap with tail
  Qnode pred = tail.getAndSet(qnode);
  // If predecessor absent or released, we are done
  if (pred == null || pred.prev == AVAILABLE) {
    return true;
  }
  …

Page 206: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 206

Time-out Lock

  …
  long start = now();
  // Keep trying for a while …
  while (now() - start < timeout) {
    // Spin on predecessor's prev field
    Qnode predPred = pred.prev;
    if (predPred == AVAILABLE) {
      // Predecessor released lock
      return true;
    } else if (predPred != null) {
      // Predecessor aborted, advance one
      pred = predPred;
    }
  }
  …

Page 211: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 211

Time-out Lock

  …
  // Try to put predecessor back
  if (!tail.compareAndSet(qnode, pred))
    // Otherwise, if it's too late, redirect to predecessor
    qnode.prev = pred;
  return false;
  }
}

Page 214: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 214

Time-out Lock

public void unlock() {
  Qnode qnode = myNode.get();
  // Clean up if no one waiting
  if (!tail.compareAndSet(qnode, null))
    // Notify successor by pointing to AVAILABLE node
    qnode.prev = AVAILABLE;
}
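The time-out lock fragments above can be assembled into one class, roughly as follows. This is a sketch under assumptions: the `Qnode` class with a volatile `prev` field, `System.nanoTime()` standing in for the slides' `now()`, a timeout measured in nanoseconds, and `Thread.yield()` inside the spin loop are choices made here for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// A minimal abortable queue lock: an aborted node is skipped by its successor.
class TOLock {
    static class Qnode {
        // null: not released and not aborted; AVAILABLE: released;
        // any other node: owner aborted, spin on that node instead.
        volatile Qnode prev = null;
    }

    static final Qnode AVAILABLE = new Qnode();
    private final AtomicReference<Qnode> tail = new AtomicReference<>(null);
    private final ThreadLocal<Qnode> myNode = new ThreadLocal<>();

    public boolean lock(long timeoutNanos) {
        Qnode qnode = new Qnode();           // create and initialize node
        myNode.set(qnode);
        Qnode pred = tail.getAndSet(qnode);  // swap with tail
        if (pred == null || pred.prev == AVAILABLE) {
            return true;                     // predecessor absent or released
        }
        long start = System.nanoTime();
        while (System.nanoTime() - start < timeoutNanos) {
            Qnode predPred = pred.prev;      // spin on predecessor's prev field
            if (predPred == AVAILABLE) {
                return true;                 // predecessor released lock
            } else if (predPred != null) {
                pred = predPred;             // predecessor aborted, advance one
            }
            Thread.yield();
        }
        // Timed out: try to put the predecessor back as tail.
        if (!tail.compareAndSet(qnode, pred)) {
            qnode.prev = pred;               // too late: redirect successor
        }
        return false;
    }

    public void unlock() {
        Qnode qnode = myNode.get();
        if (!tail.compareAndSet(qnode, null)) {  // clean up if no one waiting
            qnode.prev = AVAILABLE;              // otherwise notify successor
        }
    }
}
```

A new node is allocated on every lock() call because an aborted node may still be referenced by a successor, so it cannot be reused safely by its owner.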

Page 217: Spin Locks and Contention Management The Art of Multiprocessor Programming Spring 2007.

© Herlihy-Shavit 2007 217

One Lock To Rule Them All?

• TTAS+Backoff, CLH, MCS, TOLock…
• Each better than others in some way
• There is no one solution
• Lock we pick really depends on:
  – the application
  – the hardware
  – which properties are important