1
Immix: A Mark-Region Garbage Collector
Curtis Dunham
CS 395T Presentation
Feb 2, 2011
Thanks to Steve Blackburn and Jennifer Sartor for their 2008 and 2009 Immix presentations, respectively.
I believe this presentation to be ~95% Steve’s PLDI talk, ~4% Jennifer Sartor’s 395T presentation, and <1% mine.
2
Comparison to Prior Work; Contributions
• Space efficiency, fast collection, mutator performance
• Status quo before this work: Copying, Mark-Sweep, Mark-Compact
• Post-Immix: the new world of GC
– Immix: mark-region w/ opportunistic defragmentation
– Bump pointer
– Non-semispace
– Does both (every object either marked or copied) in one pass
– Locality
3
GC Fundamentals: Algorithmic Components
• Allocation
– Bump Allocation
– Free List
• Identification
– Tracing (implicit)
– Reference Counting (explicit)
• Reclamation
– Sweep-to-Free
– Compact
– Evacuate
4
GC Fundamentals: Canonical Garbage Collectors
• Mark-Sweep [McCarthy 1960]
– Free-list + trace + sweep-to-free
• Mark-Compact [Styger 1967]
– Bump allocation + trace + compact
• Semi-Space [Cheney 1970]
– Bump allocation + trace + evacuate
5
Sweep-To-Region and Mark-Region
• Reclamation: Sweep-to-Free, Compact, Evacuate, Sweep-to-Region
• Mark-Sweep: Free-list + trace + sweep-to-free
• Mark-Compact: Bump allocation + trace + compact
• Semi-Space: Bump allocation + trace + evacuate
• Mark-Region: Bump allocation + trace + sweep-to-region
6
Naïve Mark-Region
• Contiguous allocation into regions
– Excellent locality
– For simplicity, objects cannot span regions
• Simple mark phase (like mark-sweep)
– Mark objects and their containing region
• Unmarked regions can be freed
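The naive scheme can be modeled in a few lines. This is only an illustrative sketch: the class and field names, the 1024-byte region size, and the object representation are assumptions made for the example, not the collector's implementation.

```python
from dataclasses import dataclass, field

REGION_SIZE = 1024  # bytes per region; an arbitrary size chosen for this sketch


@dataclass(eq=False)  # eq=False keeps identity-based hashing
class Obj:
    size: int
    region: int = -1                          # index of the containing region
    refs: list = field(default_factory=list)  # outgoing references


class NaiveMarkRegionHeap:
    def __init__(self, num_regions):
        self.free_regions = list(range(num_regions))
        self.current = None   # region currently being bump-allocated into
        self.cursor = 0       # bump pointer within that region
        self.in_use = set()   # regions handed out and not yet freed

    def alloc(self, size):
        assert 0 < size <= REGION_SIZE, "objects cannot span regions"
        if self.current is None or self.cursor + size > REGION_SIZE:
            if not self.free_regions:
                raise MemoryError("no free region; collect first")
            self.current = self.free_regions.pop(0)
            self.in_use.add(self.current)
            self.cursor = 0
        obj = Obj(size, self.current)
        self.cursor += size   # contiguous (bump) allocation
        return obj

    def collect(self, roots):
        """Trace from roots, marking each reachable object's region,
        then free every in-use region that was never marked."""
        marked_regions, seen, stack = set(), set(), list(roots)
        while stack:
            obj = stack.pop()
            if obj in seen:
                continue
            seen.add(obj)
            marked_regions.add(obj.region)
            stack.extend(obj.refs)
        for region in sorted(self.in_use - marked_regions):
            self.in_use.discard(region)
            self.free_regions.append(region)
        if self.current not in marked_regions:
            self.current = None
        return marked_regions
```

Note that a single live object keeps its whole region alive in this model, which is the fragmentation problem that lines and blocks address.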
7
Heap Organization
• Blocks – analogous to Regions
– Recyclable: reusable for (more) allocation
– Immix block = 32KB
• Lines
– Objects can span lines
– Immix line = 128B (256 per block)
• Opportunistic defragmentation
– Candidate and target blocks: move from mostly-empty to mostly-full
– Single pass to mark and copy
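The block and line sizes fix the address arithmetic. A minimal sketch, assuming a heap that starts at a block-aligned `heap_base` (the function names and the `heap_base` parameter are hypothetical):

```python
BLOCK_SIZE = 32 * 1024                      # Immix block = 32KB
LINE_SIZE = 128                             # Immix line = 128B
LINES_PER_BLOCK = BLOCK_SIZE // LINE_SIZE   # 256 lines per block


def block_index(addr, heap_base=0):
    """Index of the block containing addr."""
    return (addr - heap_base) // BLOCK_SIZE


def line_index(addr, heap_base=0):
    """Index, within its block, of the line containing addr."""
    return ((addr - heap_base) % BLOCK_SIZE) // LINE_SIZE


def lines_spanned(addr, size, heap_base=0):
    """Line indices touched by an object occupying [addr, addr + size).
    The object is assumed to lie within a single block."""
    first = line_index(addr, heap_base)
    last = line_index(addr + size - 1, heap_base)
    return range(first, last + 1)
```

For example, a 16-byte object at offset 120 straddles the boundary between lines 0 and 1, so `lines_spanned(120, 16)` covers both.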
8
Immix: Lines and Blocks
Small Regions
✗ Increased metadata o/h
✗ Constrained object sizes
✗ Fragmentation (can’t fill blocks)
Large Regions
✓ More contiguous allocation
✗ Fragmentation (false marking)
Lines & Blocks
• Blocks: N pages
– TLB locality
– Block > 4 × max object size
• Lines: approx. 1 cache line
– Cache locality
✓ Less fragmentation
▫ Objects span lines
✓ Fast common case
▫ Lines marked with objects
[Figure labels: Free, Free, Recyclable, Recyclable]
“In a mark-region collector, region size embodies the collector’s space-time tradeoff.”
9
Allocation Policy (Recycling)
• Recycle partially marked blocks first
– Minimize fragmentation
– Maximize sharing of freed blocks
• Recycle in address order
– We explored other options
• Allocate into free blocks last
– Effect on locality and fragmentation?
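The ordering above can be sketched as follows. The representation (a dict from block address to a list of per-line mark flags) and both function names are assumptions for illustration. Visiting free blocks in address order is also an assumption, since the slide specifies address order only for recycling.

```python
def find_holes(line_marks):
    """Yield (start, end) line ranges of consecutive unmarked lines."""
    start = None
    for i, marked in enumerate(line_marks):
        if not marked and start is None:
            start = i                 # a hole begins
        elif marked and start is not None:
            yield (start, i)          # the hole ends before line i
            start = None
    if start is not None:
        yield (start, len(line_marks))


def allocation_order(blocks):
    """blocks: dict mapping block address -> list of line mark booleans.
    Returns the block addresses in the order an allocator following the
    policy would visit them: recyclable (partially marked) blocks first,
    in address order, then completely free blocks. Fully marked blocks
    have no holes and are skipped."""
    recyclable = sorted(a for a, m in blocks.items() if any(m) and not all(m))
    free = sorted(a for a, m in blocks.items() if not any(m))
    return recyclable + free
```

Within each recyclable block, the allocator would bump-allocate into the ranges reported by `find_holes`.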
10
Opportunistic Defragmentation
• Identify source and target blocks
– (see paper for heuristics)
• Evacuate objects in source blocks
– Allocate into target blocks
• Opportunistic
– Leave in place if no space, or object pinned
• Opportunistically evacuate fragmented blocks
– Lightweight, uses same allocation mechanism
– No cost in common case (specialized GC)
• Source = most holes
• Other heuristics?
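A rough model of the per-object decision, assuming the simple "most holes" ranking named on the slide rather than the paper's actual heuristics. All names here (`Obj`, `select_sources`, `visit`, `target_free`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Obj:
    size: int
    block: int             # id of the block currently holding the object
    pinned: bool = False


def select_sources(hole_counts, max_sources):
    """hole_counts: block id -> number of holes.
    Pick the blocks with the most holes (ties broken by block id)."""
    ranked = sorted(hole_counts, key=lambda b: (-hole_counts[b], b))
    return set(ranked[:max_sources])


def visit(obj, sources, target_free):
    """Handle one live object during the single trace pass.
    target_free: target block id -> free bytes (updated in place).
    Returns "copied" if the object was evacuated, else "marked"."""
    if obj.block in sources and not obj.pinned:
        for target, free in target_free.items():
            if free >= obj.size:
                target_free[target] = free - obj.size
                obj.block = target     # evacuate into the target block
                return "copied"
    return "marked"                    # not a source, pinned, or no space
```

Every live object ends up either marked or copied, and the fallback to marking in place is what makes the evacuation opportunistic.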
11
Details
• Parallelizable
– Coarse sweeping
– Defragmentation
• Demand-driven overflow allocations
– Medium objects
• Metadata space overheads
– For parallel synch: mark bytes (not bits)
– Line and block mark, not just object mark
– Defragmentation headroom
– Overflow allocation block
– Conservative line marking
12
Other Optimizations
Implicit Marking
✓ Most objects small
▫ Small objects implicitly mark next line
✓ V. fast common case
▫ Large objects mark lines exactly
[Figure labels: Line mark, Implicit line mark]
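One way to picture implicit marking is sketched below. It assumes "small" means no larger than one line, so a small object can spill into at most the next line. The function names and the list-of-booleans line table are illustrative only.

```python
LINE_SIZE = 128  # Immix line = 128B


def mark_object(line_marks, offset, size):
    """Mark the lines of one live object at byte offset `offset` in a block."""
    first = offset // LINE_SIZE
    if size <= LINE_SIZE:
        # Small object: mark only the line holding its start. Any spill
        # into the following line is covered implicitly by the allocator.
        line_marks[first] = True
    else:
        # Larger object: mark exactly the lines it occupies.
        last = (offset + size - 1) // LINE_SIZE
        for i in range(first, last + 1):
            line_marks[i] = True


def usable_lines(line_marks):
    """Lines the allocator may reuse: unmarked, and not directly after a
    marked line (that line is treated as implicitly marked)."""
    return [i for i, marked in enumerate(line_marks)
            if not marked and (i == 0 or not line_marks[i - 1])]
```

The common case does one mark-table write per small object, at the cost of possibly giving up one free line per hole.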
Overflow Allocation
▫ Multi-line objects may skip many small holes
▫ Overflow allocation (used on failure)
✓ Large objects uncommon
✓ V. effective solution
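A sketch of how overflow allocation might sit beside the normal bump pointer. It assumes that multi-line objects are those larger than one line and that the overflow block is an empty block obtained on demand. The class name, the hole representation (byte ranges), and the error handling are hypothetical.

```python
LINE_SIZE = 128          # Immix line = 128B
BLOCK_SIZE = 32 * 1024   # Immix block = 32KB


class OverflowAllocator:
    def __init__(self, holes, free_blocks):
        self.holes = list(holes)               # (start, end) byte ranges
        self.free_blocks = list(free_blocks)   # base addresses of empty blocks
        self.cursor = self.limit = 0           # current hole
        self.over_cursor = self.over_limit = 0  # overflow block, on demand

    def _take_free_block(self):
        if not self.free_blocks:
            raise MemoryError("no free blocks; a collector would run here")
        base = self.free_blocks.pop(0)
        return base, base + BLOCK_SIZE

    def _bump(self, size):
        addr = self.cursor
        self.cursor += size
        return addr

    def alloc(self, size):
        assert 0 < size <= BLOCK_SIZE
        if self.cursor + size <= self.limit:   # fast path: fits current hole
            return self._bump(size)
        if size > LINE_SIZE:
            # Multi-line object that failed: use the overflow block rather
            # than abandoning the current hole and skipping small holes.
            if self.over_cursor + size > self.over_limit:
                self.over_cursor, self.over_limit = self._take_free_block()
            addr = self.over_cursor
            self.over_cursor += size
            return addr
        # Small object: advance to the next hole, then to a free block.
        while self.cursor + size > self.limit:
            if self.holes:
                self.cursor, self.limit = self.holes.pop(0)
            else:
                self.cursor, self.limit = self._take_free_block()
        return self._bump(size)
```

In this model a 300-byte request that does not fit the current hole leaves that hole in place, so later small objects can still fill it.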
13
Mark-Region: Immix (Bump Allocation + Trace + Sweep-to-Region)
[Figure: four panels comparing MarkSweep, MarkCompact, SemiSpace, and Immix. Garbage Collection, Mutator, and Total Performance each plot Time against Space; Minimum Heap shows Space.]
✓ Simple, very fast collection
✓ Space efficient
✓ Good locality
✓ Excellent performance
Actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo
14
Total Performance
Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo
[Figure: Total Time (Normalized), axis from 1 to 2, against Heap Size (Normalized), axis from 1 to 6, for MarkSweep, MarkCompact, SemiSpace, and Immix.]
15
Discussion
• Necessity of two-level hierarchy?
• Caching/Paging?
– Efficacy of tuned line/block sizes: e.g. actual TLB miss reduction?
• Implicit Marking
– Advantages overcome possible fragmentation?
• Methodology and Results
16
Minimum Heap
[Figure: bar chart of minimum heap size for MarkSweep, MarkCompact, SemiSpace, and Immix; axis from 0 to 1.4, with two data labels above that range, 1.61 and 3.04.]
17
Sticky Performance
Geomean of DaCapo, jvm98 and jbb2000 on 2.4GHz Core 2 Duo
[Figure: Total Time (Normalized), axis from 1 to 1.25, against Heap Size (Normalized), axis from 1 to 6, for StickyMS, StickyIX, and GenMS (Production).]
Benefits of Sticky?