1

Immix: A Mark-Region Garbage Collector

Curtis Dunham

CS 395T Presentation

Feb 2, 2011

Thanks to Steve Blackburn and Jennifer Sartor for their 2008 and 2009 Immix presentations, respectively.

I believe this presentation to be ~95% Steve’s PLDI talk and ~4% Jennifer Sartor’s 395T presentation and < 1% mine

2

Comparison to Prior Work; Contributions

Space Efficiency

Fast Collection

Mutator Performance

Copying

Mark-Sweep

Mark-Compact

Status Quo Before This Work

Post-Immix: The New World of GC

Immix: Mark-Region w/ Opportunistic Defragmentation

Bump Pointer

Non-Semispace

Does both (every object either marked or copied)

In One Pass

Locality

3

GC Fundamentals: Algorithmic Components

• Allocation
– Bump Allocation
– Free List

• Identification
– Tracing (implicit)
– Reference Counting (explicit)

• Reclamation
– Sweep-to-Free
– Compact
– Evacuate

4

GC Fundamentals: Canonical Garbage Collectors

• Mark-Sweep [McCarthy 1960]
– Free-list + trace + sweep-to-free

• Mark-Compact [Styger 1967]
– Bump allocation + trace + compact

• Semi-Space [Cheney 1970]
– Bump allocation + trace + evacuate

5

Sweep-to-Region and Mark-Region

Reclamation: Sweep-to-Free, Compact, Evacuate, Sweep-to-Region

• Mark-Sweep: Free-list + trace + sweep-to-free
• Mark-Compact: Bump allocation + trace + compact
• Semi-Space: Bump allocation + trace + evacuate
• Mark-Region: Bump alloc + trace + sweep-to-region

6

Naïve Mark-Region

• Contiguous allocation into regions
– Excellent locality
– For simplicity, objects cannot span regions

• Simple mark phase (like mark-sweep)
– Mark objects and their containing region

• Unmarked regions can be freed
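
The three bullets above can be sketched as a toy heap. This is a minimal illustration, not the Immix implementation: the class names, the 1024-byte `REGION_SIZE`, and the object model (Python objects with a `refs` list) are all assumptions made for the example.

```python
from dataclasses import dataclass, field

REGION_SIZE = 1024  # bytes; hypothetical value chosen for this sketch


@dataclass(eq=False)
class Obj:
    size: int
    region: int = -1                          # index of the containing region
    refs: list = field(default_factory=list)  # outgoing references
    marked: bool = False


class NaiveMarkRegionHeap:
    def __init__(self, num_regions):
        self.free_regions = list(range(num_regions))
        self.objects = {r: [] for r in range(num_regions)}  # region -> objects
        self.region_marked = [False] * num_regions
        self.current = None  # region currently being bump-allocated into
        self.cursor = 0      # bump pointer (offset inside the current region)

    def alloc(self, size):
        assert size <= REGION_SIZE, "objects cannot span regions"
        if self.current is None or self.cursor + size > REGION_SIZE:
            if not self.free_regions:
                raise MemoryError("no free region; a collection is needed")
            self.current = self.free_regions.pop(0)
            self.cursor = 0
        obj = Obj(size=size, region=self.current)
        self.objects[self.current].append(obj)
        self.cursor += size  # contiguous (bump) allocation
        return obj

    def collect(self, roots):
        # Clear the marks left by the previous collection.
        for objs in self.objects.values():
            for o in objs:
                o.marked = False
        self.region_marked = [False] * len(self.region_marked)

        # Mark phase: mark each reachable object and its containing region.
        stack = list(roots)
        while stack:
            o = stack.pop()
            if o.marked:
                continue
            o.marked = True
            self.region_marked[o.region] = True
            stack.extend(o.refs)

        # Sweep-to-region: any in-use region with no mark is freed whole.
        freed = []
        for r, marked in enumerate(self.region_marked):
            if not marked and self.objects[r]:
                self.objects[r] = []
                self.free_regions.append(r)
                freed.append(r)
        if self.current in freed:
            self.current = None
        return freed
```

Note that a region holding a single live object stays fully occupied in this scheme, which is the fragmentation problem that lines and opportunistic defragmentation address later in the talk.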

7

Heap Organization

• Blocks
– Analogous to regions
– Recyclable: reusable for (more) allocation
– Immix block = 32KB

• Lines
– Objects can span lines
– Immix line = 128B (256 per block)

• Opportunistic defragmentation
– Candidate and target blocks: move from mostly-empty to mostly-full
– Single pass to mark and copy
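
The block and line sizes above make the address arithmetic simple. The following is a small sketch under stated assumptions: addresses are plain integers, the heap starts at a block-aligned `heap_base`, and the function names are invented for the example.

```python
BLOCK_SIZE = 32 * 1024                      # Immix block = 32KB (from the slide)
LINE_SIZE = 128                             # Immix line = 128B (from the slide)
LINES_PER_BLOCK = BLOCK_SIZE // LINE_SIZE   # 256 per block


def block_index(addr, heap_base=0):
    """Index of the block containing addr."""
    return (addr - heap_base) // BLOCK_SIZE


def line_in_block(addr, heap_base=0):
    """Index (0..255) of the line containing addr, within its block."""
    return ((addr - heap_base) % BLOCK_SIZE) // LINE_SIZE


def lines_spanned(addr, size, heap_base=0):
    """Lines touched by an object, assuming it does not cross a block boundary."""
    first = line_in_block(addr, heap_base)
    last = line_in_block(addr + size - 1, heap_base)
    return range(first, last + 1)
```

For instance, a 200-byte object starting at offset 100 covers bytes 100 through 299, so it touches lines 0, 1, and 2 of its block.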

8

Immix: Lines and Blocks

Small Regions
✗ Fragmentation (can’t fill blocks)
✗ Increased metadata o/h
✗ Constrained object sizes

Large Regions
✓ More contiguous allocation
✗ Fragmentation (false marking)

Lines & Blocks (block = N pages, line approx. 1 cache line)
✓ Less fragmentation
▫ Objects span lines
✓ Fast common case
▫ Lines marked with objects
▫ TLB locality, cache locality
▫ Block > 4 × max object size

[Figure: blocks labeled Free and Recyclable]

“In a mark-region collector, region size embodies the collector’s space-time tradeoff.”

9

Allocation Policy(Recycling)

• Recycle partially marked blocks first
– Minimize fragmentation
– Maximize sharing of freed blocks

• Recycle in address order
– We explored other options

• Allocate into free blocks last
– Effect on locality and fragmentation?
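
The policy above can be sketched in a few lines. This is an illustration only: the state names (`"free"`, `"recyclable"`, `"unavailable"`), the list-of-states representation, and both function names are assumptions made for the example.

```python
def allocation_order(block_states):
    """Order in which blocks are offered to the allocator.

    block_states lists one state per block, in address order.
    Recyclable (partially marked) blocks come first, in address order;
    completely free blocks are used last. Unavailable blocks are skipped.
    """
    recyclable = [i for i, s in enumerate(block_states) if s == "recyclable"]
    free = [i for i, s in enumerate(block_states) if s == "free"]
    return recyclable + free


def holes(line_marks):
    """Return (start_line, length) for each run of unmarked lines in a block."""
    result = []
    start = None
    for i, marked in enumerate(line_marks):
        if not marked and start is None:
            start = i                         # a hole begins here
        elif marked and start is not None:
            result.append((start, i - start))  # the hole ended at line i - 1
            start = None
    if start is not None:
        result.append((start, len(line_marks) - start))
    return result
```

Within a recyclable block, the bump pointer would then be set to the start of each hole in turn, with the end of the hole as its limit.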

10

Opportunistic Defragmentation

• Identify source and target blocks
– (see paper for heuristics)

• Evacuate objects in source blocks
– Allocate into target blocks

• Opportunistic
– Leave in place if no space, or object pinned

• Opportunistically evacuate fragmented blocks
– Lightweight, uses same allocation mechanism
– No cost in common case (specialized GC)

• Source = most holes
• Other heuristics?
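
The mark-or-copy decision described above can be sketched as a single tracing pass. This is a simplified model, not the Immix code: the `Obj` fields, the byte counter standing in for target-block space, and the function names are hypothetical, and the paper's actual source-selection heuristics are reduced to "most holes".

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    block: int
    size: int
    pinned: bool = False
    refs: list = field(default_factory=list)
    marked: bool = False
    forwarded_to: object = None   # set once the object has been evacuated


def choose_sources(hole_counts, n):
    """Pick the n blocks with the most holes as defragmentation sources."""
    return set(sorted(hole_counts, key=hole_counts.get, reverse=True)[:n])


def trace_and_defrag(roots, sources, target_block, target_free_bytes):
    """One pass over the live graph: every object is either marked or copied."""
    worklist = []

    def visit(obj):
        nonlocal target_free_bytes
        if obj.forwarded_to is not None:
            return obj.forwarded_to           # already evacuated
        if obj.marked:
            return obj                        # already marked in place
        movable = obj.block in sources and not obj.pinned
        if movable and obj.size <= target_free_bytes:
            # Evacuate: copy into the target block and leave a forwarding pointer.
            copy = Obj(block=target_block, size=obj.size,
                       refs=obj.refs, marked=True)
            target_free_bytes -= obj.size
            obj.forwarded_to = copy
            obj = copy
        else:
            obj.marked = True                 # opportunistic: leave in place
        worklist.append(obj)
        return obj

    new_roots = [visit(r) for r in roots]
    while worklist:
        obj = worklist.pop()
        obj.refs = [visit(child) for child in obj.refs]  # fix up references
    return new_roots
```

When the target space runs out or an object is pinned, the pass simply falls back to marking, which is why defragmentation adds no separate phase.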

11

Details

• Parallelizable
– Coarse sweeping
– Defragmentation

• Demand-driven overflow allocations
– Medium objects

• Metadata space overheads
– For parallel synch: mark bytes (not bits)
– Line and block mark, not just object mark
– Defragmentation headroom
– Overflow allocation block
– Conservative line marking
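
A rough feel for the cost of mark bytes versus mark bits follows from the 32KB block and 128B line sizes given earlier. The calculation below is a back-of-the-envelope sketch that assumes one mark per line plus one byte of mark per block; it ignores object marks, defragmentation headroom, and the overflow block.

```python
BLOCK_SIZE = 32 * 1024                      # bytes per block
LINE_SIZE = 128                             # bytes per line
LINES_PER_BLOCK = BLOCK_SIZE // LINE_SIZE   # 256


def mark_overhead(bytes_per_line_mark):
    """Fraction of a block spent on line marks plus one block-mark byte."""
    metadata = LINES_PER_BLOCK * bytes_per_line_mark + 1
    return metadata / BLOCK_SIZE


byte_marks = mark_overhead(1)       # one byte per line: 257 bytes per block
bit_marks = mark_overhead(1 / 8)    # one bit per line: 33 bytes per block
print(f"mark bytes: {byte_marks:.2%}, mark bits: {bit_marks:.2%}")
```

Under these assumptions, byte-granularity line marks cost under 1% of the heap, which is the price paid for letting parallel collector threads set marks without atomic bit operations.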

12

Other Optimizations

Implicit Marking

✓ Most objects small
▫ Small objects implicitly mark next line
✓ V. fast common case
▫ Large objects mark lines exactly

[Figure labels: line mark, implicit line mark]
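
Implicit marking can be sketched as follows. The assumptions here are that "small" means no larger than one line, that line marks live in a per-block list of booleans, and that the allocator compensates by never reusing the line directly after a marked one; the function names are invented for the example.

```python
LINE_SIZE = 128  # bytes per line


def mark_lines(line_marks, addr, size):
    """Mark the lines of a live object at block offset addr."""
    first = addr // LINE_SIZE
    if size <= LINE_SIZE:
        # Small object: mark only the line holding its first byte.
        # It may spill into the next line, which is covered implicitly.
        line_marks[first] = True
    else:
        # Larger object: mark exactly the lines it occupies.
        last = (addr + size - 1) // LINE_SIZE
        for i in range(first, last + 1):
            line_marks[i] = True


def usable_lines(line_marks):
    """Lines the allocator may reuse: unmarked and not right after a marked line."""
    return [i for i, marked in enumerate(line_marks)
            if not marked and not (i > 0 and line_marks[i - 1])]
```

The small-object path does no size arithmetic beyond one comparison, which is what keeps the common case fast; the cost is that one line per hole is conservatively left unused.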

Overflow Allocation

▫ Multi-line objects may skip many small holes
▫ Overflow allocation (used on failure)
✓ Large objects uncommon
✓ V. effective solution
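
A sketch of the overflow idea, under stated assumptions: a multi-line ("medium") object is taken to be anything larger than one line, holes are given as byte ranges, and the overflow block is modeled as a second bump region. The `Allocator` class and its fields are hypothetical.

```python
LINE_SIZE = 128  # bytes per line


class Allocator:
    def __init__(self, holes, overflow_capacity):
        # holes: (start, end) byte ranges inside recyclable blocks
        self.holes = list(holes)
        self.cursor, self.limit = self.holes.pop(0) if self.holes else (0, 0)
        self.overflow_cursor = 0
        self.overflow_limit = overflow_capacity

    def alloc(self, size):
        # Fast path: bump-allocate inside the current hole.
        if self.cursor + size <= self.limit:
            addr = self.cursor
            self.cursor += size
            return ("hole", addr)
        if size > LINE_SIZE:
            # Multi-line object that does not fit: use the overflow block
            # rather than abandoning the current hole.
            if self.overflow_cursor + size > self.overflow_limit:
                raise MemoryError("overflow block exhausted")
            addr = self.overflow_cursor
            self.overflow_cursor += size
            return ("overflow", addr)
        # Small object: advance to the next hole that can hold it.
        while self.holes:
            self.cursor, self.limit = self.holes.pop(0)
            if self.cursor + size <= self.limit:
                addr = self.cursor
                self.cursor += size
                return ("hole", addr)
        raise MemoryError("no hole large enough")
```

Because the current hole is kept when a medium object fails to fit, later small objects can still fill it, so occasional larger objects do not cause many small holes to be skipped.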

13

Mark-Region: Immix (Bump Allocation + Trace + Sweep-to-Region)

[Figure: four panels comparing MarkSweep, MarkCompact, SemiSpace, and Immix. Garbage Collection, Mutator, and Total Performance plot time against space; Minimum Heap shows space.]

✓ Simple, very fast collection
✓ Space efficient
✓ Good locality
✓ Excellent performance

Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo

14

Total Performance

Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo

[Plot: Total Time (Normalized), axis 1 to 2, against Heap Size (Normalized), axis 1 to 6, for MarkSweep, MarkCompact, SemiSpace, and Immix]

15

Discussion

• Necessity of two-level hierarchy?

• Caching/Paging?
– Efficacy of tuned line/block sizes: e.g. actual TLB miss reduction?

• Implicit Marking
– Advantages overcome possible fragmentation?

• Methodology and Results

16

Minimum Heap

[Bar chart: minimum heap size for MarkSweep, MarkCompact, SemiSpace, and Immix; axis ticks from 0 in steps of 0.2; surviving data labels: 1.61111111111111 and 3.0377358490566]

17

Sticky Performance

Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo

[Plot: Total Time (Normalized), axis 1 to 1.25, against Heap Size (Normalized), axis 1 to 6, for StickyMS, StickyIX, and GenMS (Production)]

Benefits of Sticky?
