Portable, Unobtrusive Garbage Collection for Multiprocessor Systems

Damien Doligez, Georges Gonthier (POPL 1994)

Presented by Eran Yahav (yahave@math.tau.ac.il)



A concurrent, generational garbage collector for a multithreaded implementation of ML - Doligez - Leroy (POPL 1993)

On-the-fly garbage collection: an exercise in cooperation - Dijkstra et al. (1978)

Overview

Motivation

Concurrent collection strategies

Concurrent collection constraints

The basic algorithm (Dijkstra)

Doligez-Leroy model

Doligez-Leroy concurrent collector

Concurrent GC

Known as a tough problem

Published algorithms contain simplifying assumptions that either:
  impose unbearable overhead on mutators
  require high degree of hardware/OS support

Other algorithms are buggy

“Stop the world”

all threads stop synchronously and perform GC

introduces synchronization between otherwise independent threads

[Figure: timelines of threads T1-T4, all pausing together at each “Sync. GC” point]


“Stop the world” - Mostly Parallel GC (Boehm et al.)

uses virtual memory page protections

reduces duration of the “stop the world” period

does not prevent synchronization between threads at “stop the world” points

[Figure: timelines of threads T1-T4; labels: marking, Sync. GC]

“Stop the world” - Scalable mark-sweep GC

uses a parallelization of Boehm’s mostly parallel collector

reduces duration of “stop the world” periods

does not prevent synchronization between threads at “stop the world” points

[Figure: timelines of threads T1-T4; labels: marking, Sync. GC]

“Stop the world” - Real Time GC (Nettles & O’Toole)

incremental copying collector

reduces duration of “stop the world” periods

does not prevent synchronization between threads at the swap point

[Figure: timelines of threads T1-T4; label: Sync. GC]

Concurrent collector

run the collector concurrently with user threads

use as little synchronization as possible between user threads and the GC thread

[Figure: timelines of threads T1-T4 running alongside a separate GC thread]

Concurrent Collection strategies

Reference counting

copying (relocation)

mark & sweep

Concurrent GC - Reference counting

Locks on reference counters

[Figure: heap object with RC = 2; mutators M1, M2 and M3 apply -1 and +1 updates to its counter]
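To see why concurrent reference counting needs a lock on every counter, here is a minimal Python sketch. The names `RefCounted`, `hammer` and `run_mutators` are made up for illustration and do not come from the talk. Each counter update is a read-modify-write, so several mutators changing the same counter must be serialized.

```python
import threading


class RefCounted:
    """Hypothetical heap object whose reference count is guarded by a lock."""

    def __init__(self):
        self.rc = 0
        self._lock = threading.Lock()

    def incr(self):
        with self._lock:            # the read-modify-write must be atomic
            self.rc += 1

    def decr(self):
        with self._lock:
            self.rc -= 1
            return self.rc == 0     # True when the object could be reclaimed


def hammer(obj, n):
    # One mutator: takes n references, then drops them again.
    for _ in range(n):
        obj.incr()
    for _ in range(n):
        obj.decr()


def run_mutators(num_threads=3, n=2000):
    obj = RefCounted()
    obj.incr()
    obj.incr()                      # start at RC = 2, as in the slide's figure
    threads = [threading.Thread(target=hammer, args=(obj, n))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.rc
```

Every pointer assignment pays for a lock acquisition here, which is the kind of overhead on frequent mutator actions that the talk wants to avoid.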

Concurrent GC - relocation

relocating objects while mutators are running

[Figure: heap with from-space and to-space; the GC relocating an object; mutators M1 and M2 each shown with a “?”]

Concurrent GC - relocation

relocating objects while mutators are running

must ensure that mutators are aware of relocation:
  test on heap pointer deref
  extra indirection word for each object
  virtual memory page protections

significant run-time penalty

Concurrent GC - mark/sweep

Mark all threads’ roots

No inherent locks

Mutators may change trace graph during any collection phase

[Figure: threads 1-3 and global variables pointing into the heap]

Multiprocessors facts of life

Registers are local
  impossible to track down machine registers of a running process

Synchronization is expensive
  semaphores and synchronization are only available through expensive system calls

Unobtrusive?

No overhead on frequent actions:
  move data between registers and memory
  deref a heap pointer
  fill a field in a new heap object

imposes sync. overhead only on reserve actions (for which it is unavoidable)

mutator cooperation with collector is done only at mutator’s convenience

Portable?

No special use of OS synchronization primitives

no hardware support

Where all else fails

relocating GC algorithms break locality or impose large overhead

proposed incremental algorithms require global synchronization

mark & sweep - collector working while mutators change trace graph - complicated but possible

The basic algorithm

Dijkstra et al. - “On-the-fly garbage collection”

published in 1978

breaks locality

assumes fixed set of roots

[Figure: threads 1-3, global variables and a GC thread, all operating on the heap]

Dijkstra’s collector

Mark: for each x in Globals do MarkGray(x)

Scan: repeat
        dirty ← false
        for each x in heap do
          if color[x] = Gray then
            dirty ← true
            MarkGray(x.Sons)
            color[x] ← Black
      until not dirty

Sweep: for each x in heap do
         if color[x] = White then append x to free list
         else if color[x] = Black then color[x] ← White

[Figure: color transition diagram over black, gray and white; edges labeled mark, update, sweep, allocate]
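The three phases above can be run as a small sequential simulation. This is only a sketch under assumptions: the `Heap` class, the dictionary representation and the string colors are invented for illustration, and no mutator runs concurrently, so it shows the marking and sweeping logic only, not the on-the-fly cooperation.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"


class Heap:
    """Hypothetical heap: object name -> color and list of sons."""

    def __init__(self):
        self.color = {}
        self.sons = {}
        self.free_list = []

    def add(self, x, sons=()):
        self.color[x] = WHITE
        self.sons[x] = list(sons)

    def mark_gray(self, x):
        # Shading only affects white objects; gray and black stay as they are.
        if self.color[x] == WHITE:
            self.color[x] = GRAY


def collect(heap, global_roots):
    # Mark: shade every global root.
    for x in global_roots:
        heap.mark_gray(x)

    # Scan: repeat full heap passes until a pass finds no gray object.
    dirty = True
    while dirty:
        dirty = False
        for x in list(heap.color):
            if heap.color[x] == GRAY:
                dirty = True
                for s in heap.sons[x]:
                    heap.mark_gray(s)
                heap.color[x] = BLACK

    # Sweep: white objects are garbage, black objects are reset to white.
    for x in list(heap.color):
        if heap.color[x] == WHITE:
            heap.free_list.append(x)
            del heap.color[x], heap.sons[x]
        elif heap.color[x] == BLACK:
            heap.color[x] = WHITE
    return heap.free_list
```

The repeated whole-heap passes in the scan loop are one way to see why the slide says the basic algorithm breaks locality.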

Doligez-Leroy model

Damien Doligez & Xavier Leroy, 1993

a concurrent, generational GC for a multithreaded implementation of ML

relies on ML properties:
  compile-time distinction between mutable and immutable objects
  duplicating immutable objects is semantically transparent

does not stop program threads

Doligez-Leroy model

Do anything to avoid synchronization

trade collection “quality” for level of synchronization - allow large amounts of floating garbage

trade collection “simplicity” for level of synchronization - complicated algorithm (not to mention correctness proof)

Doligez-Leroy model

[Figure: threads 1-3, each with its own stack and minor heap, plus global variables and the shared major heap]

Collection generations

Each thread treats the two heaps (private and shared) as two generations
  private = young generation
  shared = old generation

immutable objects are allocated in private heaps
  does not require synchronization

mutable objects handled differently (later)

Minor collection

When private heap is full - stop and perform minor collection

copy live objects from private heap to shared heap (old generation)

after minor collection, whole private heap is free

can be performed at any time

synchronization is only required for allocation of the copied objects on the shared heap
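A minor collection as described above can be sketched as a copy from a thread's private heap into the shared heap. The data layout is an assumption made for this example, not the layout from the talk: a pointer is a `("private", addr)` or `("shared", addr)` tuple, any other field is plain data, and `SharedHeap` and `minor_collect` are hypothetical names. Only `SharedHeap.alloc` takes a lock, matching the claim that synchronization is needed only to allocate the copies.

```python
import threading


class SharedHeap:
    """Hypothetical old generation; allocation is the only synchronized step."""

    def __init__(self):
        self.objects = []
        self._lock = threading.Lock()

    def alloc(self, fields):
        with self._lock:
            self.objects.append(fields)
            return ("shared", len(self.objects) - 1)


def minor_collect(roots, private, shared):
    """Copy everything reachable from roots out of the private heap."""
    forward = {}                       # private address -> shared pointer

    def copy(value):
        if not (isinstance(value, tuple) and value[0] == "private"):
            return value               # plain data, or already in the shared heap
        addr = value[1]
        if addr not in forward:
            # Assumes immutable (hence acyclic) data: copy the sons first.
            fields = [copy(f) for f in private[addr]]
            forward[addr] = shared.alloc(fields)
        return forward[addr]

    new_roots = [copy(r) for r in roots]
    private.clear()                    # the whole private heap is free again
    return new_roots
```

Dead private objects are never visited, so the cost of this sketch depends on the amount of live data rather than on the size of the private heap.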

Major collection

Dedicated GC thread

uses a variation of Dijkstra’s algorithm (mark & sweep)

does not move objects, so no synchronization is required when accessing/modifying objects in shared heap

will be described later

Major and minor collection

[Figure: threads 1-3 with their stacks and minor heaps, global variables, the shared major heap, and the GC thread]

Copy on update

We assumed no pointers from shared heap to private heap

[Figure: major heap containing an object labeled “Not reachable from thread’s roots”]

Copy on update

Copy the referenced object (and descendants)

similar to minor collection with a single root

simply does some of the minor collection right away
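Copy on update can be sketched as a store operation that first moves the stored value, and everything it points to, into the shared heap. This is a simplified illustration with invented names (`copy_to_shared`, `update`) and an invented layout: pointers are `("private", addr)` or `("shared", addr)` tuples, the shared heap is a plain list, and the synchronization needed for shared allocation is left out.

```python
def copy_to_shared(ptr, private, shared, forward):
    """Copy the private object behind ptr, and its descendants, into shared."""
    kind, addr = ptr
    if kind == "shared":
        return ptr
    if addr not in forward:
        fields = [copy_to_shared(f, private, shared, forward)
                  if isinstance(f, tuple) else f
                  for f in private[addr]]
        shared.append(fields)          # a real collector synchronizes this step
        forward[addr] = ("shared", len(shared) - 1)
    return forward[addr]


def update(shared, target, index, value, private, forward):
    """Store value into a field of a shared object, copying it first if needed."""
    if isinstance(value, tuple):
        value = copy_to_shared(value, private, shared, forward)
    shared[target][index] = value
```

The originals stay in the private heap until the next minor collection, and the `forward` table plays the role of the forwarding information that collection would use to avoid copying them twice.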

Copy on update

Until next minor collection, copying thread can access original and copied objects

immutable objects - semantically equivalent

what about mutable objects?

Allocation of mutable objects

If copied - can update both objects separately

no equivalence of original and copied object

solution: always allocate mutable objects in the shared heap

requires synchronization (free list)

ML programs usually use few mutable objects

mutable objects have longer life span than average

The Concurrent collector

Adapted version of Dijkstra’s algorithm

naming conventions
  mutator = thread + minor collection thread
  collector = major collector

major collector only requires marking of mutator roots.

does not demand minor collections

Four color marking

White - not yet marked (or unreachable)

Gray - marked but sons not marked

Black - marked and sons marked

Blue - free list blocks

Collection phases

Root enumeration

end of marking

sweeping

Root enumeration

Raise a flag to signal beginning of marking

shade globals

ask mutators to shade roots

wait until all mutators answered

meanwhile - start scanning and marking

Root enumeration

Collector

Mark: for each x in Globals do MarkGray(x)
      call mutators to mark roots
      wait until all mutators answered
      ...

Mutators

Cooperate: if call to roots is pending then
               call MarkGray on all roots
               answer the call
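The handshake between the collector and the mutators can be illustrated with plain polled flags. The sketch below is a single-threaded simulation with hypothetical names (`Mutator`, `Collector`, `call_pending`); it does not model real threads or memory ordering. The point it shows is that the collector only posts a request and each mutator answers at a moment of its own choosing.

```python
class Mutator:
    """Hypothetical program thread with a list of roots."""

    def __init__(self, roots):
        self.roots = roots
        self.call_pending = False

    def cooperate(self, mark_gray):
        # Invoked by the mutator itself, at its own convenience.
        if self.call_pending:
            for r in self.roots:
                mark_gray(r)
            self.call_pending = False      # answer the call


class Collector:
    def __init__(self, global_roots, mutators):
        self.global_roots = global_roots
        self.mutators = mutators
        self.gray = set()
        self.phase = "idle"

    def mark_gray(self, x):
        self.gray.add(x)

    def start_marking(self):
        self.phase = "marking"             # raise the flag
        for x in self.global_roots:        # shade globals
            self.mark_gray(x)
        for m in self.mutators:            # ask mutators to shade their roots
            m.call_pending = True

    def all_answered(self):
        return not any(m.call_pending for m in self.mutators)
```

While `all_answered()` is still false the collector is free to start scanning gray objects, which corresponds to the "meanwhile" step on the previous slide.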

End of marking

Repeatedly mark gray objects until no more gray objects remain

Scan: repeat
        dirty ← false
        for each x in heap do
          if color[x] = Gray then
            dirty ← true
            MarkGray(x.Sons)
            color[x] ← Black
      until not dirty

Sweeping

Scan heap

All white objects are free - set to blue and add to the free list

all black objects are reset to white

some objects might have been set to gray since the end of marking phase - set to white
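The sweeping rules can be written as a single pass over the heap. This is a minimal sketch that assumes the heap is a dictionary from integer addresses to color names; the function name `sweep` and the representation are illustrative only.

```python
WHITE, GRAY, BLACK, BLUE = "white", "gray", "black", "blue"


def sweep(color, free_list):
    """One sweep pass over a heap given as {address: color}."""
    for x in sorted(color):                # walk the heap in address order
        if color[x] == WHITE:
            color[x] = BLUE                # unreachable: becomes a free block
            free_list.append(x)
        elif color[x] in (BLACK, GRAY):
            color[x] = WHITE               # survivor: reset for the next cycle
        # blue blocks are already on the free list and are left alone
```

Gray objects found here were shaded after marking ended, so the sketch resets them to white like the black ones instead of freeing them.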

Invariants (1/2)

All objects reachable from mutator roots at the time mutator shaded its roots, or that become reachable after that time are black at the end of the marking phase

Objects can become reachable by allocation and modification, which are performed concurrently with the collection

Invariants (2/2)

gray objects that are unreachable at the beginning of the mark phase become black during mark, then white during sweep and reclaimed by the next cycle (floating garbage)

all white objects unreachable at the start of the marking phase remain white

No unreachable object ever becomes reachable again

there are no blue objects outside the free list

Concurrent allocation and modification

Mutators must consider collector status when performing modification or allocation of heap objects

first, let’s consider modification of heap objects

Concurrent modification

Updating a black object could result in a reachable object that remains white at the end of marking

even worse - the set of roots is not fixed during collection

must shade both the new value and the old value
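The rule of shading both values can be sketched as a write barrier around every heap store. The sketch assumes a dictionary heap and string colors, and it shades unconditionally; whether the real collector skips the shading in some phases is not stated on the slide, so that part is an assumption. The names `shade` and `update` are illustrative.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"


def shade(color, x):
    # Only white objects change; gray and black ones are left alone.
    if x is not None and color[x] == WHITE:
        color[x] = GRAY


def update(heap, color, obj, field, new):
    """Store `new` into heap[obj][field], shading the old and the new value."""
    old = heap[obj][field]
    shade(color, old)      # the old value may still be held elsewhere, e.g. in a register
    shade(color, new)      # the new value may be reachable only through obj
    heap[obj][field] = new
```

If `obj` is already black, the collector will not scan it again, so shading `new` is what keeps the newly stored object from staying white until the sweep.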

What happens if we don’t shade new value

[Figure, three animation steps: threads T1 and T2 and objects A and B in the major heap. Timeline labels: root enumeration, mark T1 root, T2 updates A, T2 pops, mark T2 root, end mark, sweep]

What happens if we don’t shade old value

[Figure, three animation steps: thread T and objects A and B in the major heap. Timeline labels: root enumeration, mark T root, T pushes B, T updates A, end mark, sweep]

Concurrent Allocation

Assign right color to new objects

during marking - allocated objects are black
  allocated objects are reachable
  sons of allocated objects are reachable and will eventually be set to black

during sweeping - white if already swept, gray otherwise
  set to gray to avoid immediate deallocation

Synchronization

It is always safe to set an object to gray

setting many objects to gray is inefficient
  will be only reclaimed on next cycle

allows us to avoid synchronization when race condition can end up making an object gray

used to test collector status without locking

Synchronization

Coloring of newly allocated block

if phase = marking then
    set object to black
    if phase = sweeping then
        set object to gray
else
    if address(object) < sweep-pointer then
        set object to white
    else
        set object to gray
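One reading of the steps above is that the phase is tested twice without a lock: if it flips from marking to sweeping between the two tests, the block falls back to gray, which is always safe. The Python sketch below follows that reading, so the nesting is an assumption. `read_phase` is a hypothetical callable standing in for an unsynchronized read of the collector status, and the other names are illustrative too.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"


def color_for_new_block(read_phase, address, sweep_pointer):
    """Pick the color of a freshly allocated block without taking a lock."""
    if read_phase() == "marking":
        color = BLACK
        # Second, unsynchronized look at the phase: if sweeping started in
        # between, fall back to gray so the sweeper does not free the block.
        if read_phase() == "sweeping":
            color = GRAY
    else:
        if address < sweep_pointer:
            color = WHITE      # already swept: safe until the next cycle
        else:
            color = GRAY       # not swept yet: avoid immediate deallocation
    return color
```

In a test the race can be simulated by passing `iter(["marking", "sweeping"]).__next__` as `read_phase`, which makes the two reads return different phases.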

Color transitions summary

[Figure: color transition diagram over black, gray, blue and white; edges labeled mark, allocate, update, sweep]

Experimental results

Corrections

When shading old value - what old value do we shade?

Another thread might “replace” old value, after we shade it, by a non-shaded value

this is corrected by adding another handshake - all updates must end before we start marking

Summary

Doligez, Leroy & Gonthier concurrent GC

does not stop program threads

four-color mark & sweep - white, gray, black and blue

relies on ML language properties, but can be extended for other languages

The End
