Towards a Scalable Non-Blocking Coding Style Dr. Cliff Click, Distinguished Engineer Azul Systems http://blogs.azulsystems.com/cliff

Towards a Scalable Non-Blocking Coding Style

May 10, 2015

Nonblocking (NB) algorithms are something of a Holy Grail of concurrent programming: typically very fast, even under heavy load, they come with hard guarantees about forward progress. The downside is that they are very hard to get right. The presentation's authors worked on writing some nonblocking utilities (open sourced on SourceForge in the high-scale-lib project) and have made some progress toward a coding style that can be used to build a variety of NB data structures: hash tables, sets, work queues, and bit vectors. These data structures scale much better than even the concurrent JDK™ software utilities while providing the same correctness guarantees. They usually have similar overheads at the low end while scaling incredibly well on high-end hardware. The coding style is still very immature but shows clear promise. It stems from a handful of basic premises: you don't hide payload during updates; any thread can complete (or ignore) any in-progress update; use flat arrays for quick access and broadest-possible striping; and use parallel, concurrent, incremental array copy. At the core is a simple state-machine description of the update logic.
Transcript
Page 1: Towards a Scalable Non-Blocking Coding Style

Towards a Scalable Non-Blocking Coding Style

Dr. Cliff Click, Distinguished Engineer, Azul Systems

http://blogs.azulsystems.com/cliff

Page 2: Towards a Scalable Non-Blocking Coding Style

2008 JavaOneSM Conference | java.sun.com/javaone | 2

The Computer Revolution is Here
• We already did the 0->1 CPU transition

Concurrent Programming is Now 'The Norm' and hard to do
• We're doing the 1->2 CPU transition

Scalable Concurrent Programming is even harder
• Time to think about the 2->N CPU transition

Here is a different way of thinking about the problem

Page 3: Towards a Scalable Non-Blocking Coding Style

What is a Non-Blocking Algorithm?

Formally:
• Stopping one thread will not prevent global progress

Less formally:
• No thread 'locks' any resource
  • and then gets pre-empted by the OS
  • or blocked in I/O, etc.
• No 'critical sections', locks, mutexes, spin-locks, etc.

Individual threads might starve

Page 4: Towards a Scalable Non-Blocking Coding Style

XXX-Free Hierarchy

Wait-Free Algorithms (the best)
• All threads complete in finite count of steps
• Low priority threads cannot block high priority threads
• No priority inversion possible

Lock-Free (this work)
• Every successful step makes Global Progress
• But individual threads may starve
  • Hence priority inversion is possible
• No live-lock

Obstruction-Free
• A single thread in isolation completes in finite count of steps
• Threads may block each other
  • Hence live-lock is possible

Page 5: Towards a Scalable Non-Blocking Coding Style

Motivation

Multi-core is now almost unavoidable
Larger core counts more common:
• 8+ (X86), 64 (Sun/Rock), 768 (Azul, more coming)

Locking suffers serious contention issues
• Amdahl's Law, etc.

Would like to write correct code without locks!
Obstruction-free can live-lock
• More prone with higher CPU count
• Or higher thread count

Wait-free algorithms behave the best
• But tend to be slow
• And are very hard to code
  • Handful of people on the planet can write these

Page 6: Towards a Scalable Non-Blocking Coding Style

Scalable

Most large-CPU count shared-memory hardware is:
• Parallel-read, Independent-write

Multiple CPUs reading the same location is fast
• Free 'cache-hitting-loads'

Multiple CPUs writing to the same location serialize
• Speed limited to '1-cache-miss-per-write' or '1-memory-bus-update-per-write'

Must avoid all CPUs writing same location for independent operations
• i.e., no shared counters, single lock-words, etc.

Classic reader/writer lock chokes w/ >100 CPUs
• Contention on single reader-count word limits scaling
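To make the "no shared counters" advice concrete, here is a minimal sketch of a counter striped across several words, so that independent writers rarely update the same location. The class name StripedCounter and the choice of stripe by thread id are assumptions made for illustration; this is not the high-scale-lib implementation. Neighbouring cells of an AtomicLongArray can still share a cache line, so a production version would also need padding.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical illustration: spread increments over several words so that
// independent writers rarely hit the same memory location.
final class StripedCounter {
    private final AtomicLongArray cells;   // one counter word per stripe
    private final int mask;                // stripes - 1 (stripes is a power of 2)

    StripedCounter(int stripes) {
        if (stripes <= 0 || Integer.bitCount(stripes) != 1)
            throw new IllegalArgumentException("stripe count must be a power of 2");
        cells = new AtomicLongArray(stripes);
        mask = stripes - 1;
    }

    // Writers: each thread updates "its" stripe, chosen from the thread id.
    void increment() {
        int idx = (int) Thread.currentThread().getId() & mask;
        cells.getAndIncrement(idx);
    }

    // Readers: sum all stripes. The result is a snapshot that may already be
    // stale when it is returned, which is typical for striped counters.
    long sum() {
        long s = 0;
        for (int i = 0; i < cells.length(); i++) s += cells.get(i);
        return s;
    }
}
```

The trade-off is the usual one: writes scale because they are spread out, while a read has to visit every stripe.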

Page 7: Towards a Scalable Non-Blocking Coding Style

Agenda

• Motivation
• A Scalable Non-Blocking Coding Style
• Example 1: BitVector
• Example 2: HashTable
• Example 3: Nearly FIFO Queue
• Summary

Page 8: Towards a Scalable Non-Blocking Coding Style

Parts we need...

An Array to hold all Data
• Fast parallel (scalable) access

Atomic-update on single Array Words
• java.util.concurrent.atomic.*
• “No spurious failure CAS”

A Finite State Machine
• Replicated per array word (or small set of words)
• Use Atomic-Update to 'step' in the FSM

Construct algorithm from many FSM 'steps'
• Lock-Free: Each CAS makes progress
• CAS success is local progress
• CAS failure means another CAS succeeded (global progress, local starvation)

Page 9: Towards a Scalable Non-Blocking Coding Style

How Big is the Array?

Don't answer that: Make array growable
• Resize array as needed
• Common operation for Collection classes

Support array resize via State Machine
• Really: array-copy while in use
• All array words are independent
• Copy is parallel, incremental, concurrent

But mostly operate without a copy-in-progress
• So the common situation is simple, fast

Page 10: Towards a Scalable Non-Blocking Coding Style

Concurrent Array Resize

Copy old Array into a new larger Array
The hard part during a resize operation:
• Copy without losing any late-writes to old Array

Fix: “mark” old Array words with no-more-updates flag
• Payload still visible through the “mark”

Updaters of marked payload must copy then update in new array
Readers seeing mark must copy then read in new array

Page 11: Towards a Scalable Non-Blocking Coding Style

Atomic Update

Need some form of Atomic-Update
• java.util.concurrent.atomic.Atomic*

Update 1 word IFF old-value is equal to expected-value
Generally Compare-And-Swap (CAS, Azul/Sparc/X86) or Load-Linked / Store-Conditional (LL/SC, IBM)

Common Hardware Limitations
• LL/SC suffers from live-lock
• Both CAS & LL/SC can suffer spurious failure on some hardware
  • Infinite spurious failures is live-lock(?)
  • Finite failures fixed with spin loop
• Useful if CAS does not spuriously fail (e.g. Azul)
  • Especially at high CPU count
  • If 1000 CPUs attempt update, 1 should succeed

Page 12: Towards a Scalable Non-Blocking Coding Style

Atomic Update: Failure

CAS failure returns old value on most (all?) hardware?
• Old value is evidence CAS did not fail spuriously
• The “witness”: the “proof of failure”
• LL/SC never provides old value

The witness is not available after the CAS
• Overwritten by another thread

JDK API mistake: witness turned into a boolean
• Hence failure-for-cause cannot be distinguished from spurious-failure

Hence must spin on CAS failure until see reason for failure
• Report either CAS success OR
• CAS failure-for-cause

Spinning builds a “No spurious failure CAS”
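One way to read the spinning advice is as a small wrapper that turns the boolean-returning JDK call back into a CAS that reports a witness. The sketch below assumes an AtomicLongArray as the backing store; the names WitnessCas and casWitness are hypothetical. It re-reads the word after every failed attempt and only reports failure once it has seen a value that differs from the expected one.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical helper: a CAS that never reports failure without a cause.
final class WitnessCas {
    private WitnessCas() {}

    // Returns the value observed in word i. If the result equals 'expected',
    // the update was installed; any other result is the "witness" that
    // justifies the failure.
    static long casWitness(AtomicLongArray a, int i, long expected, long update) {
        while (true) {
            long seen = a.get(i);
            if (seen != expected) return seen;        // failure for cause
            if (a.compareAndSet(i, expected, update)) // success
                return expected;
            // The boolean gave no reason for the failure: loop, re-read the
            // word, and decide again from what is actually there.
        }
    }
}
```

A caller compares the returned value with the expected one, for example `long w = WitnessCas.casWitness(words, idx, old, old | mask);`, and treats `w == old` as success. On Java 9 and later, `AtomicLongArray.compareAndExchange` returns the witness value directly.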

Page 13: Towards a Scalable Non-Blocking Coding Style

Towards A Scalable Lock-Free Coding Style

Big Array to hold Data
Parallel, Scalable read access
Concurrent writes via: CAS & Finite State Machine
• No JMM issues during Finite State Machine updates
• No locks, no volatile

Fast as a best-of-breed not-thread-safe implementation
• But as correct as thread-safe implementations
• Much faster than locking under heavy load
• No indirections in common case
• Directly reach main data array in 1 step

Resize as needed
• Copy Array to a larger Array on demand
• Use State Machine to help copy
• “Mark” old Array words to avoid missing late updates

Page 14: Towards a Scalable Non-Blocking Coding Style

Agenda

• Motivation
• A Scalable Non-Blocking Coding Style
• Example 1: BitVector
• Example 2: HashTable
• Example 3: Nearly FIFO Queue
• Summary

Page 15: Towards a Scalable Non-Blocking Coding Style

Example 1: BitVector

Size: O(max element)
• Auto-resizing

Supports concurrent insert, remove, test&set
Obvious implementation:
• Array of 'long' - 64-bit payload words
• Bit mask & shift accessors

How to 'mark' payload?
• Steal 1 bit out of 64
• MOD 63 to select index words – this example only
  • (Actually: avoid slow MOD by moving every 64th bit to recursive bitvector)

Code up in SourceForge, high-scale-lib

Page 16: Towards a Scalable Non-Blocking Coding Style

Example 1: BitVector

Basic get & test/set (using MOD)

// CAS, copy and grow are helpers that are not shown on the slide.
boolean get( int x ) {
  long[] A = _A;                  // read once
  int idx = x/63;                 // 63 payload bits per word
  if( idx >= A.length ) return false;
  long old = A[idx];
  if( old < 0 )                   // marked?
    return copy(x).get(x);
  long mask = 1L << (x%63);
  return (old & mask) != 0;
}

boolean test_set( int x ) {
  long[] A = _A;                  // read once
  int idx = x/63;
  if( idx >= A.length ) return grow(x);
  while( true ) {                 // spin loop
    long old = A[idx];
    if( old < 0 )                 // marked?
      return copy(x).test_set(x);
    long mask = 1L << (x%63);
    if( (old & mask) != 0 ) return true;
    if( CAS(A[idx],old,old|mask) ) return false;
  }
}

Page 17: Towards a Scalable Non-Blocking Coding Style

Example 1: BitVector

Read Array once – it may change out from under us!

Page 18: Towards a Scalable Non-Blocking Coding Style

Example 1: BitVector

Out-of-bounds triggers resize

Page 19: Towards a Scalable Non-Blocking Coding Style

Example 1: BitVector

'Mark' triggers copy & retry

Page 20: Towards a Scalable Non-Blocking Coding Style

Example 1: BitVector

Failed CAS must retry – BUT!
• Means another thread made progress

Page 21: Towards a Scalable Non-Blocking Coding Style

Example 1: BitVector

Almost as fast as plain BitVector
• Normal load & mask for get/set
• Range check
• Extra '<0' test (triggers copy & retry)
• Set uses CAS spin-loop

Copy: Sign-bit to stop further updates
• Use CAS to set sign-bit
• Then copy word to new array
• Repeat operation on new array

Finite State Machine!
• per Array word
• Hidden in the code

Let's make the FSM obvious...
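For readers who want something that compiles, here is a minimal sketch of the get and test/set paths on top of AtomicLongArray. It assumes a fixed-size vector and non-negative element numbers: the class name FixedBitVector is hypothetical, and the grow and copy steps from the slides are deliberately left out, so an out-of-range set or a marked word simply raises an exception.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical fixed-size variant of the slide's BitVector: 63 payload bits
// per word, with the sign bit reserved as the "mark". Resize is not modelled.
final class FixedBitVector {
    private static final int BITS = 63;
    private final AtomicLongArray words;

    FixedBitVector(int maxElement) {
        words = new AtomicLongArray(maxElement / BITS + 1);
    }

    boolean get(int x) {
        int idx = x / BITS;
        if (idx >= words.length()) return false;      // out of range: not set
        return (words.get(idx) & (1L << (x % BITS))) != 0;
    }

    // Sets bit x and returns its previous value.
    boolean testSet(int x) {
        int idx = x / BITS;
        if (idx >= words.length())
            throw new IndexOutOfBoundsException("grow() is not modelled: " + x);
        long mask = 1L << (x % BITS);
        while (true) {                                // spin loop
            long old = words.get(idx);
            if (old < 0)                              // marked word
                throw new IllegalStateException("copy() is not modelled");
            if ((old & mask) != 0) return true;       // already set
            if (words.compareAndSet(idx, old, old | mask)) return false;
            // CAS failed: another thread changed this word, so retry.
        }
    }
}
```

Each failed compareAndSet here means some other thread changed the same word, which is the lock-free progress argument from the slides.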

Page 22: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] State: 0000 “initial”

Page 23: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] 0000 “initial” -set-> 0XXX “active”; 0XXX -set & clear-> 0XXX

A: Normal operations

Page 24: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] Old array: 0000 -set-> 0XXX; 0XXX -set & clear-> 0XXX
[Diagram] New array: 0000 “initial”

A: Normal operations

Out-of-Bounds set triggers resize!

Page 25: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] Old array: 0000 -set-> 0XXX; 0XXX -set & clear-> 0XXX; 0XXX -mark-> 1XXX “marked”
[Diagram] New array: 0000

A: Normal operations
B: Mark to prevent further updates

Page 26: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] Old array: 0000 -set-> 0XXX; 0XXX -set & clear-> 0XXX; 0XXX -mark-> 1XXX
[Diagram] New array: 0000 -copy-> 0XXX

A: Normal operations
B: Mark to prevent further updates
C: Copy from old to new

Page 27: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] Old array: 0000 -set-> 0XXX; 0XXX -set & clear-> 0XXX; 0XXX -mark-> 1XXX
[Diagram] New array: 0000 -copy-> 0XXX

A: Normal operations
B: Mark to prevent further updates
C: Copy from old to new
D: Memory-fence between arrays

Page 28: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] Old array: 0000 -set-> 0XXX; 0XXX -set & clear-> 0XXX; 0XXX -mark-> 1XXX; 1XXX -copy done-> 1000 “copy-done”
[Diagram] New array: 0000 -copy-> 0XXX

A: Normal operations
B: Mark to prevent further updates
C: Copy from old to new
D: Memory-fence between arrays
E: Signal copy-done in old table

Page 29: Towards a Scalable Non-Blocking Coding Style

BitVector State Machine

[Diagram] Old array: 0000 -set-> 0XXX; 0XXX -set & clear-> 0XXX; 0XXX -mark-> 1XXX; 1XXX -copy done-> 1000 (a second 'mark' transition is also shown)
[Diagram] New array: 0000 -copy-> 0XXX

A: Normal operations
B: Mark to prevent further updates
C: Memory-fence between arrays
D: Copy from old to new
E: Signal copy-done in old table
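The per-word copy steps can be written down as a small helper that any thread may call. This is a simplified sketch under stated assumptions, not the high-scale-lib code: the class name WordCopy and method copyWord are hypothetical, both arrays are AtomicLongArray (whose CAS also provides the ordering the slides call a fence), and the sketch is only safe when bits in the new array are never cleared while a slow helper may still be running. Supporting clear during a copy needs additional state that is not shown here.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch of the per-word copy: mark, copy, then signal copy-done.
final class WordCopy {
    static final long MARK = Long.MIN_VALUE;   // sign bit: "no more updates"

    private WordCopy() {}

    // Drives word i of the old array to the copy-done state (MARK with no
    // payload). Safe to call from many threads and more than once.
    static void copyWord(AtomicLongArray oldA, AtomicLongArray newA, int i) {
        long old;
        while (true) {                          // mark the old word
            old = oldA.get(i);
            if (old < 0) break;                 // somebody already marked it
            if (oldA.compareAndSet(i, old, old | MARK)) { old |= MARK; break; }
        }
        if (old == MARK) return;                // empty or already copy-done

        long payload = old & ~MARK;             // payload is visible through the mark
        // Copy: only the first copier installs the payload in the new word.
        // (Assumes set-only use of the new array, see the note above.)
        newA.compareAndSet(i, 0L, payload);
        // Signal copy-done in the old array. A marked word only ever changes
        // to MARK, so a failed CAS means another helper already did this.
        oldA.compareAndSet(i, old, MARK);
    }
}
```

A reader or updater that finds a negative word would call copyWord for that word first and then repeat its operation on the new array.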

Page 30: Towards a Scalable Non-Blocking Coding Style

Resize - motivation

Triggered by adding larger element
Copy each word before get/put
Pay indirection even after copy
• Visit old table, fence, operate on new table

So need to copy all words eventually, and then
Promote: make new array the top-level array
• No more indirection

Policy? How to copy all words?
• Visiting threads can “copy some words”
• Or background threads copy, or only-writers, etc.
• Good standard engineering, nothing special

Page 31: Towards a Scalable Non-Blocking Coding Style

Resize – Copy Mechanics

Helper: any thread copying words it does not directly need
Helpers CAS-up a “promise to copy” counter
• Atomic-increment by fixed N (e.g. 16 words)

Helpers copy words via State Machine
Helpers atomic-increment “done work” counter
• On transition to “copy-done” state

Promote new Array when “done work” == A.length
What If: Helper stalled? (promises but never copies)
• Allow helpers to “double-promise”!
• Worst case: each thread can complete entire copy

Eventually, copy completes & array promotes
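The two counters can be sketched as follows. The class CopyHelper, its field names, and the IntPredicate callback are assumptions made for this illustration: the callback stands for the per-word state machine and must return true only for the thread that actually moved a word to “copy-done”. Letting the promise counter wrap around the array is one way to get the “double-promise” behaviour, because a chunk promised by a stalled helper is eventually promised again to somebody else.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical sketch of the "promise to copy" and "done work" counters.
final class CopyHelper {
    static final int CHUNK = 16;                 // words promised per call
    private final int length;                    // words in the old array
    private final AtomicInteger promised = new AtomicInteger();
    private final AtomicInteger done = new AtomicInteger();

    CopyHelper(int length) { this.length = length; }

    // Promises one chunk, copies it, and returns true for exactly one caller:
    // the one whose work completed the copy and who should promote the array.
    boolean helpCopy(IntPredicate copyWord) {
        int start = promised.getAndAdd(CHUNK);   // atomic "promise"
        int finished = 0;
        for (int k = 0; k < CHUNK; k++) {
            int i = Math.floorMod(start + k, length);  // wrap = double-promise
            if (copyWord.test(i)) finished++;    // count only our own transitions
        }
        // Each word is counted once, by the thread that reached copy-done.
        return finished > 0 && done.addAndGet(finished) == length;
    }
}
```

Because every word is counted by exactly one thread, the “done work” counter reaches the array length exactly once, no matter how many helpers overlap.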

Page 32: Towards a Scalable Non-Blocking Coding Style

Coding Style Elements

Large array for parallel read & update
• No JMM issues for read or update (no lock, no volatile)

State Machine per-array-word
• Successful CAS is FSM transition
• Failed CAS causes retry
  • (but another thread made progress)

'Mark' payload words to stop 'late updates'
Array copy for Resize
• Copy is parallel, incremental, concurrent
• Copy part of State Machine
• Unrelated threads can make progress during resize
• Fence between old and new tables

Page 33: Towards a Scalable Non-Blocking Coding Style

Agenda

• Motivation
• A Scalable Non-Blocking Coding Style
• Example 1: BitVector
• Example 2: HashTable
• Example 3: Nearly FIFO Queue
• Summary

Page 34: Towards a Scalable Non-Blocking Coding Style

Example 2: HashTable

Array of K/V Pairs
• Keys in even slots, Values in odd slots
• CAS each word separately, but FSM spans both words
• Value can also be 'Tombstone'
• Key & Value both start as null

Mark payload by 'boxing' values
Copy on resize, or to flush stale keys
Supports concurrent insert, remove, test, resize
Linear scaling on Azul to 768 CPUs
• More than a billion reads/sec simultaneous with
• More than 10 million updates/sec

Code up in SourceForge, high-scale-lib
• Passes Java Compatibility Kit (JCK) for ConcurrentHashMap

Page 35: Towards a Scalable Non-Blocking Coding Style

“Uninteresting” Details

Good, standard engineering – nothing special
Closed Power-of-2 Hash Table
• Reprobe on collision
• Stride-1 reprobe: better cache behavior
• (complicated argument about 2^n vs prime goes here)

Key & Value on same cache line
Hash memoized
• Should be same cache line as K + V
• But hard to do in pure Java

No allocation on get() or put()
Auto-Resize

Page 36: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] State: 0/0 “initial”

• Inserting K/V pair
• Already probed table, missed
• Found proper empty K/V slot
• Ready to claim slot for this Key

Page 37: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] 0/0 -insert key-> K/0 “bare key”

Claim key slot

Page 38: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] 0/0 -insert key-> K/0 -insert V-> K/V “active”

Initial set of Value

Page 39: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] 0/0 -insert key-> K/0 -insert V-> K/V; K/V -delete-> K/T “deleted”

Delete uses 'tombstone' value; Key remains

Page 40: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] 0/0 -insert key-> K/0 -insert V-> K/V; K/V -change V-> K/V; K/V -delete-> K/T “deleted”; K/T -re-insert-> K/V

Re-insert uses same key slot
Change Value uses same key slot
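The normal-operation part of this state machine (claim key, set value, delete, re-insert) can be sketched in a few dozen lines. The sketch assumes a fixed-size table with no resize and no boxing; the class TinyNbMap and its method names are hypothetical and far simpler than the high-scale-lib hash table. Keys live in even slots and values in odd slots, each changed with its own CAS, and a shared TOMBSTONE object stands for a deleted value.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical fixed-size map showing 0/0 -> K/0 -> K/V -> K/T -> K/V.
final class TinyNbMap {
    private static final Object TOMBSTONE = new Object();
    private final AtomicReferenceArray<Object> kvs;   // keys even, values odd
    private final int mask;                           // pairs - 1

    TinyNbMap(int pairs) {                            // pairs must be a power of 2
        if (pairs <= 0 || Integer.bitCount(pairs) != 1)
            throw new IllegalArgumentException("size must be a power of 2");
        kvs = new AtomicReferenceArray<>(2 * pairs);
        mask = pairs - 1;
    }

    // Finds the pair index for key; optionally claims an empty key slot.
    private int find(Object key, boolean claim) {
        int idx = key.hashCode() & mask;
        for (int probes = 0; probes <= mask; probes++) {
            Object k = kvs.get(2 * idx);
            if (k == null) {
                if (!claim) return -1;                // miss
                if (kvs.compareAndSet(2 * idx, null, key)) return idx; // 0/0 -> K/0
                k = kvs.get(2 * idx);                 // lost the race: see who won
            }
            if (k.equals(key)) return idx;            // keys never change once set
            idx = (idx + 1) & mask;                   // stride-1 reprobe
        }
        if (claim) throw new IllegalStateException("table full; resize not modelled");
        return -1;
    }

    // Installs newVal in the value slot and returns the previous live value.
    private Object swapValue(int idx, Object newVal) {
        while (true) {
            Object old = kvs.get(2 * idx + 1);
            if (kvs.compareAndSet(2 * idx + 1, old, newVal))
                return old == TOMBSTONE ? null : old;
        }
    }

    Object get(Object key) {
        int idx = find(key, false);
        if (idx < 0) return null;
        Object v = kvs.get(2 * idx + 1);
        return v == TOMBSTONE ? null : v;             // bare key or deleted: absent
    }

    Object put(Object key, Object val) {
        if (val == null) throw new NullPointerException("null marks a bare key");
        return swapValue(find(key, true), val);       // insert V, change V, re-insert
    }

    Object remove(Object key) {
        int idx = find(key, false);
        return idx < 0 ? null : swapValue(idx, TOMBSTONE);  // K/V -> K/T, key stays
    }
}
```

Because a deleted key keeps its slot, a later put of the same key reuses that slot, which is the re-insert transition described above.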

Page 41: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array: 0/0 -insert key-> K/0 -insert V-> K/V; K/V -change V-> K/V; K/V -delete-> K/T; K/T -re-insert-> K/V
[Diagram] New array: 0/0 “initial”

Resize triggered, new array created

Page 42: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array: 0/0 -insert key-> K/0 -insert V-> K/V; K/V -change V-> K/V; K/V -delete-> K/T; K/T -re-insert-> K/V; K/V -box-> K/[V] “boxed V”
[Diagram] New array: 0/0

Boxing V prevents further changes

Page 43: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array: 0/0 -insert key-> K/0 -insert V-> K/V; K/V -change V-> K/V; K/V -delete-> K/T; K/T -re-insert-> K/V; K/V -box-> K/[V]
[Diagram] New array: 0/0 -insert key-> K/0 “bare key”

Claim key slot in new table

Page 44: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array: 0/0 -insert key-> K/0 -insert V-> K/V; K/V -change V-> K/V; K/V -delete-> K/T; K/T -re-insert-> K/V; K/V -box-> K/[V]
[Diagram] New array: 0/0 -insert key-> K/0 -copy-> K/V “active”

Copy in new table without box

Page 45: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array: 0/0 -insert key-> K/0 -insert V-> K/V; K/V -change V-> K/V; K/V -delete-> K/T; K/T -re-insert-> K/V; K/V -box-> K/[V]
[Diagram] New array: 0/0 -insert key-> K/0 -copy-> K/V
[Diagram] Memory-fence between arrays

Fence after writing to new array and before setting 'copy done'

Page 46: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array states: 0/0, K/0, K/V, K/T, K/[V], K/[T] “copy done”; transitions: insert key, insert V, change V, delete, re-insert, box, copy done
[Diagram] New array: 0/0 -insert key-> K/0 -copy-> K/V
[Diagram] Memory-fence between arrays

Final state: “new Array has Value”

Page 47: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array states: 0/0, K/0, K/V, K/T, K/[V], K/[T]; transitions: insert key, insert V, change V, delete, re-insert, box, copy done
[Diagram] New array: 0/0
[Diagram] Memory-fence between arrays

Nothing to copy

Page 48: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array states: 0/0, K/0, K/V, K/T, K/[V], K/[T]; transitions: insert key, insert V, change V, delete, re-insert, box, copy done
[Diagram] New array: 0/0
[Diagram] Memory-fence between arrays

Copy stops partial insertion

Page 49: Towards a Scalable Non-Blocking Coding Style

HashTable State Machine

[Diagram] Old array states: 0/0, K/0, K/V, K/T, K/[V], K/[T]; transitions: insert key, insert V, change V, delete, re-insert, box, copy done
[Diagram] New array: 0/0 -insert key-> K/0 -copy-> K/V
[Diagram] Memory-fence between arrays

Page 50: Towards a Scalable Non-Blocking Coding Style

Agenda

• Motivation
• A Scalable Non-Blocking Coding Style
• Example 1: BitVector
• Example 2: HashTable
• Example 3: Nearly FIFO Queue
• Summary

Page 51: Towards a Scalable Non-Blocking Coding Style

Example 3: Nearly FIFO Queue

Concurrent near-FIFO Queue
• e.g. producer / consumer worklist
• Producers & consumers are large thread pools

Scaling bottleneck:
• Locking or single word CAS on push & pop

Could stripe Queue:
• Many short Queues
• Select random Queue
• Many different locks or many different words to CAS
  • Less contention
• Pick at random to push or pop
• Must search all queues for not-full or not-empty

Page 52: Towards a Scalable Non-Blocking Coding Style

Example 3: Nearly FIFO Queue

1000's of CPUs need 1000's of Queues
• Stripe Ad-Absurdum
• Queues get ever-smaller
• Get down to Queues of 1 entry

Single-entry Queue: either full or empty
• Implement as a single word
• Either null or value

Need 1000's of single-entry Queues
• Array of single word Queues

Producers start @ random index
• Search for null, CAS down value

Consumers start @ random index
• Search for value, CAS down null
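Since the slides present this queue as an idea without code, the following is only a sketch of the array of single-entry queues. The class name SlotQueue and the single-pass scan policy are assumptions, and resize, promotion, and any feedback to the thread pools are left out. Producers and consumers each start at a random index, scan the array once, and use one CAS per slot.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch: every slot is a one-entry queue, null means empty.
final class SlotQueue<T> {
    private final AtomicReferenceArray<T> slots;

    SlotQueue(int size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        slots = new AtomicReferenceArray<>(size);
    }

    // Producer: search for a null slot and CAS the value in.
    // Returns false if one full pass found no empty slot.
    boolean offer(T value) {
        if (value == null) throw new NullPointerException("null marks an empty slot");
        int n = slots.length();
        int start = ThreadLocalRandom.current().nextInt(n);
        for (int k = 0; k < n; k++) {
            int i = start + k;
            if (i >= n) i -= n;                       // wrap around the array
            if (slots.get(i) == null && slots.compareAndSet(i, null, value))
                return true;
        }
        return false;
    }

    // Consumer: search for a value and CAS the slot back to null.
    // Returns null if one full pass found nothing.
    T poll() {
        int n = slots.length();
        int start = ThreadLocalRandom.current().nextInt(n);
        for (int k = 0; k < n; k++) {
            int i = start + k;
            if (i >= n) i -= n;
            T v = slots.get(i);
            if (v != null && slots.compareAndSet(i, v, null))
                return v;
        }
        return null;
    }
}
```

Ordering is only approximately FIFO: a value is taken by whichever consumer reaches its slot first, and the advancing scan keeps any slot from being skipped within a pass.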

Page 53: Towards a Scalable Non-Blocking Coding Style

Example 3: Nearly FIFO Queue

Nearly FIFO:
• Consumers must advance scan point
• Or might neglect tasks left in other slots
• Means every value in array gets visited eventually

Tricky bit: correct array size for efficiency
• Too small, table gets full, producers spin uselessly
• Too large, table is mostly empty, consumers scan uselessly

Array copy & promote is easier:
• Risk: late insert in old array just prior to promote abandons value
• Consumers fill old array with 'tombstone'
• Promote when old array is entirely 'stoned'

Still need feedback mechanisms on P/C threadpools

Page 54: Towards a Scalable Non-Blocking Coding Style

Example 3: Nearly FIFO Queue

Work in progress, no code yet...
But out of time anyways ;-)
Nice idea, hope it pans out

Page 55: Towards a Scalable Non-Blocking Coding Style

Agenda

• Motivation
• A Scalable Non-Blocking Coding Style
• Example 1: BitVector
• Example 2: HashTable
• Example 3: Nearly FIFO Queue
• Summary

Page 56: Towards a Scalable Non-Blocking Coding Style

Summary

Lock-Free
Highly scalable (proven scalable to ~1000 CPUs)
Use large array for data
• Allows fast parallel-read
• Allows parallel, incremental, concurrent copy

Use Finite State Machine to control writes
• FSM-per-word
• Successful CAS advances FSM
• Failed CAS retries

During copy, FSM includes words from both arrays

http://www.azulsystems.com/blogs/cliff

Page 57: Towards a Scalable Non-Blocking Coding Style

Dr. Cliff Click, Distinguished Engineer

Azul Systems

http://blogs.azulsystems.com/cliff