
Load-reserve / Store-conditional on POWER and ARM

Peter Sewell (slides from Susmit Sarkar)

University of Cambridge

June 2012


Peter Sewell (Cambridge) Load-reserve / Store-conditional on POWER and ARM June 2012 2 / 10

Correct implementations of C/C++ on hardware

Can it be done?
◮ . . . on highly relaxed hardware? e.g. Power

What is involved?
◮ Mapping new constructs to assembly
◮ Optimizations: which ones legal?


Implementing C/C++11 on POWER: Pointwise Mapping

C/C++11 Operation     POWER Implementation
Store (non-atomic)    st
Load (non-atomic)     ld
Store relaxed         st
Store release         lwsync; st
Store seq-cst         lwsync; st
Load relaxed          ld
Load consume          ld (and preserve dependency)
Load acquire          ld; cmp; bc; isync
Load seq-cst          sync; ld; cmp; bc; isync
Fence acquire         lwsync
Fence release         lwsync
Fence seq-cst         sync
CAS relaxed           loop: lwarx; cmp; bc exit; stwcx.; bc loop; exit:
CAS seq-cst           sync; loop: lwarx; cmp; bc exit; stwcx.; bc loop; isync; exit:
. . .                 ...

Is that mapping correct?

(From Paul McKenney and Raul Silvera)

Answer: No!

Store seq-cst needs a full sync: sync; st rather than lwsync; st.

With Store seq-cst mapped to sync; st:

Is that mapping correct?

Answer: Yes!

Is that the only correct mapping?

Answer: No!

Implementing C/C++11 on POWER: Pointwise Mapping

C/C++11 Operation     POWER Implementation                                              Alternative
Store (non-atomic)    st
Load (non-atomic)     ld
Store relaxed         st
Store release         lwsync; st
Store seq-cst         sync; st                                                          sync; st; sync;
Load relaxed          ld
Load consume          ld (and preserve dependency)
Load acquire          ld; cmp; bc; isync
Load seq-cst          sync; ld; cmp; bc; isync                                          ld; sync
Fence acquire         lwsync
Fence release         lwsync
Fence seq-cst         sync
CAS relaxed           loop: lwarx; cmp; bc exit; stwcx.; bc loop; exit:
CAS seq-cst           sync; loop: lwarx; cmp; bc exit; stwcx.; bc loop; isync; exit:
. . .                 ...

All compilers must agree for separate compilation
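To make the table concrete, here is a minimal C++11 sketch of a message-passing idiom that uses three of its rows: a non-atomic store and load, a store release, and a load acquire. The names payload, ready and message_passing are hypothetical, and the comments only note which table row a compiler following the mapping would apply; the instructions actually emitted depend on the compiler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, chosen for illustration only.
int payload = 0;                  // non-atomic data
std::atomic<bool> ready{false};   // flag that carries the synchronisation

int message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // table row: Store (non-atomic)
        ready.store(true, std::memory_order_release);  // table row: Store release
    });
    std::thread consumer([&seen] {
        // table row: Load acquire
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                                // table row: Load (non-atomic)
    });

    producer.join();
    consumer.join();
    return seen;  // the release/acquire pair guarantees the consumer saw 42
}
```

The program is data-race free because the only accesses to payload are ordered by the release store and the acquire load that reads from it.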


Machine Synchronisation Operations

x86: atomic synchronization operations, e.g. “atomic add”, “CAS”, . . .

RISC-friendly alternative: Load-reserve/Store-conditional (aka LL/SC, larx/stcx and lwarx/stwcx, LDREX/STREX)

Can be used to implement CAS, atomic add, spinlocks, . . .

Universal (like CAS) [Herlihy’93] (but no ABA problem)

Atomic Addition

    loop: lwarx r, d;
          add r,v,r;
          stwcx r, d;
          bne loop;

Informally, stwcx succeeds only if no other write to the same address since last lwarx, setting a flag iff it succeeds
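The retry loop above has a close analogue in portable C++11. The sketch below is an assumption-laden stand-in rather than the slide's code: it uses compare_exchange_weak, which (like a store-conditional) is allowed to fail spuriously and so must sit in a loop. Unlike load-reserve/store-conditional, a CAS compares values, so it is exposed to the ABA problem. The names atomic_add and run_adders are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: atomic addition built from a retry loop.
int atomic_add(std::atomic<int>& d, int v) {
    int r = d.load(std::memory_order_relaxed);  // read the current value
    // On failure (real or spurious) compare_exchange_weak reloads r with the
    // value currently in d, and the loop tries again.
    while (!d.compare_exchange_weak(r, r + v,
                                    std::memory_order_relaxed,
                                    std::memory_order_relaxed)) {
        // retry
    }
    return r + v;  // the value this call installed
}

// Several threads each add 1 repeatedly; no increment may be lost.
int run_adders(int threads, int adds_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, adds_per_thread] {
            for (int i = 0; i < adds_per_thread; ++i) {
                atomic_add(counter, 1);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.load();
}
```

Relaxed ordering is used throughout because the sketch only demonstrates atomicity of the update, not ordering with respect to other locations.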


What is no write since . . . ?

In machine time?
◮ Neither necessary, nor sufficient

Microarchitecturally (simplified): if cache-line ownership not lost since last lwarx

(but we don’t want to model the microarchitecture...)


Modeling “not lost since”

Abstractly: ownership chain modeled by building up coherence order

Coherence: order relating stores to the same location (eventually linear)

A stwcx succeeds only if it is (or at least, if it can become) coherence-next-to the write read from by lwarx

. . . and no other write can later come in between

Isolate key concept: write reaching coherence point
◮ coherence is linear below this write, and no new edges will be added below


Coherence points and a successful stwcx

Atomic Addition

    loop: lwarx r, x;
          add r,3,r;
          stwcx r, x;
          bne loop;

Coherence order for x:

[Figure: coherence order over the writes i:W x=0, j:W x=1, a:W x=2, b:W x=3, c:W x=4]

Suppose lwarx reads from a:W x=2

stwcx can succeed if this becomes possible:

[Figure: coherence order in which i:W x=0, j:W x=1, a:W x=2 and the new write d:W∗ x=5 are marked as writes that have reached coherence point; c:W x=4 and b:W x=3 also appear]

Warning: stwcx can fail spuriously
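The success condition can be illustrated with a toy model. This is a deliberately simplified sketch and not the formal model from the paper: it assumes coherence for one location is always a linear list, that lwarx reads the coherence-latest write, and it never fails spuriously. All names (Location, Reservation, lwarx, store, stwcx) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: coherence order for one location, oldest write first.
struct Location {
    std::vector<int> coherence{0};  // starts with an initial write of 0
};

// Remembers which write the load-reserve read from.
struct Reservation {
    std::size_t read_from = 0;
};

// Load-reserve: read the coherence-latest write and record its position.
int lwarx(const Location& loc, Reservation& res) {
    res.read_from = loc.coherence.size() - 1;
    return loc.coherence.back();
}

// Ordinary store by any thread: appended to the coherence order.
void store(Location& loc, int value) {
    loc.coherence.push_back(value);
}

// Store-conditional: succeeds only if its write can be placed immediately
// after the write that lwarx read from, with nothing in between.
bool stwcx(Location& loc, const Reservation& res, int value) {
    if (res.read_from != loc.coherence.size() - 1) {
        return false;  // another write already follows the one read from
    }
    loc.coherence.push_back(value);
    return true;
}
```

With writes of 1 and 2 already in the list, a lwarx that reads 2 followed by a stwcx of 2 + 3 appends 5, matching the d:W∗ x=5 write in the slide; an intervening store makes the stwcx fail.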

Load-reserve/store-conditional and ordering

Same-thread load-reserve/store-conditionals ordered by program order

If all memory accesses are l-r/s-c sequences
Then: only SC behaviour

But . . . normal loads/stores (to different addresses) not ordered; the l-r/s-c do not act as a barrier

Confusion here led to Linux bug . . . bad barrier placement in atomic-add-return
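The same distinction shows up at the C++11 level: an atomic read-modify-write is atomic on its own location, but it orders accesses to other locations only if it is given a suitable memory order. The sketch below is an illustration under that assumption and is not the Linux code mentioned above; data_word, counter and publish_with_rmw are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int data_word = 0;             // ordinary (non-atomic) location
std::atomic<int> counter{0};   // updated with an atomic read-modify-write

int publish_with_rmw() {
    data_word = 0;
    counter.store(0, std::memory_order_relaxed);
    int seen = -1;

    std::thread writer([] {
        data_word = 7;
        // With memory_order_relaxed the increment would still be atomic, but
        // nothing would order the store to data_word before it.
        // memory_order_release supplies that ordering.
        counter.fetch_add(1, std::memory_order_release);
    });
    std::thread reader([&seen] {
        while (counter.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        seen = data_word;  // race free only because of release/acquire above
    });

    writer.join();
    reader.join();
    return seen;
}
```

Replacing both memory orders with memory_order_relaxed would leave the counter update atomic but turn the accesses to data_word into a data race.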


Correctness of the Mapping

Theorem: For any sane, non-optimising compiler following the mapping:

[Diagram: compilation takes a DRF C/C++ prog to a POWER prog; the C/C++11 semantics gives the C/C++11 execution observations and the POWER semantics gives the POWER execution observations; POWER execution observations ⊆ C/C++11 execution observations]

The compilation:
Preserves memory accesses;
Uses the mapping table;
Respects the thread local semantics of C/C++, preserving dependencies

From POWER trace, build key relations (happens-before, SC order)

Required properties from abstract machine properties

If trace looks like it produces data race, build the C/C++ data race


For details...

see Synchronising C/C++ and POWER, Sarkar et al., PLDI 2012

http://www.cl.cam.ac.uk/~pes20/cppppc-supplemental/

In the paper:

A formal model of load-reserve/store-conditional (in Lem)

An executable model with exploration tool (ppcmem)

Simplifications to the C/C++11 lock model

Models “tight” against each other: relaxing the Power model would make C/C++11 unimplementable
