ParMarkSplit: A Parallel Mark-Split Garbage Collector Based on a Lock-Free Skip-List. Nhan Nguyen, Philippas Tsigas, Håkan Sundell. Distributed Computing and Systems, Chalmers University of Technology

Dec 18, 2015



ParMarkSplit: A Parallel Mark-Split Garbage Collector Based on a Lock-Free Skip-List

Nhan Nguyen

Philippas Tsigas

Håkan Sundell

Distributed Computing and Systems

Chalmers University of Technology


Garbage Collection (GC)

Reclaim memory no longer used by programs

Algorithms: mark-sweep (McCarthy 1959), copying (Fenichel & Yochelson 1969, Cheney 1970), reference counting (Collins 1960), mark-compact (Cohen & Nicolau 1983), mark-split (Sagonas & Wilhelmsson 2006), mark-region (Blackburn & McKinley 2008), etc.

Multicore era: garbage collectors need to be parallelized!


How do we parallelize GCs?

The answer is not limited to one algorithm!

Several algorithms have been parallelized: mark-sweep, mark-compact, copying, mark-region, etc.

Mark-Split [Sagonas et al. ISMM 2006]: Can Mark-Split be parallelized? Is it a challenging problem to solve?


Contributions

ParMarkSplit: a parallel and concurrent mark-split garbage collector
• Lock-free skip-list with extended functionality
• Lazy splitting mechanism

Implementation in OpenJDK HotSpot
• A collector for the old generation

Evaluation using the DaCapo benchmarks (Blackburn et al. 2006)


Contributions

• Introduction
• Background: mark-split algorithm
• Contribution: skip-list with extended functionality; parallelization of mark-split
• Evaluation
• Conclusion


Mark-split algorithm (Sagonas & Wilhelmsson, ISMM 2006)

Mark-Sweep:
• MARK reachable objects as LIVE
• Scan the heap to SWEEP over unmarked objects
• Create the free-list: a list of free spaces

Mark-Split: mark without sweep
• MARK live objects
• Create the free-list during the mark phase, using an operation called SPLIT!

In some cases, this replacement of the sweep phase pays off!
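
To make the SPLIT operation concrete, here is a minimal sequential sketch in Python. It is not the authors' implementation: the free-list is a plain sorted list of half-open intervals, and the names `FreeList`, `split`, `mark_split`, and the toy heap layout are assumptions made for illustration only.

```python
import bisect


class FreeList:
    """Sorted free intervals [start, end) over a heap (illustrative only)."""

    def __init__(self, heap_start, heap_end):
        # Before marking, the whole heap is treated as one free interval.
        self.starts = [heap_start]
        self.ends = [heap_end]

    def split(self, obj_start, obj_size):
        """Remove a live object's range from the interval that contains it."""
        obj_end = obj_start + obj_size
        # find_interval: the last interval whose start is <= obj_start
        i = bisect.bisect_right(self.starts, obj_start) - 1
        if i < 0 or obj_end > self.ends[i]:
            raise ValueError("object is not inside a free interval")
        start, end = self.starts[i], self.ends[i]
        # delete the containing interval
        del self.starts[i]
        del self.ends[i]
        # insert the (up to two) non-empty remainders, right one first
        if obj_end < end:
            self.starts.insert(i, obj_end)
            self.ends.insert(i, end)
        if start < obj_start:
            self.starts.insert(i, start)
            self.ends.insert(i, obj_start)

    def intervals(self):
        return list(zip(self.starts, self.ends))


def mark_split(roots, objects, heap_size):
    """objects maps address -> (size, [addresses of referenced objects])."""
    free = FreeList(0, heap_size)
    marked = set()
    stack = list(roots)
    while stack:
        addr = stack.pop()
        if addr in marked:
            continue
        marked.add(addr)            # MARK
        size, refs = objects[addr]
        free.split(addr, size)      # SPLIT during marking, no sweep afterwards
        stack.extend(refs)
    return free.intervals()


# Toy heap of 100 units; the object at address 40 is unreachable garbage.
heap = {0: (10, [20]), 20: (10, [60]), 40: (10, []), 60: (20, [])}
free_intervals = mark_split(roots=[0], objects=heap, heap_size=100)
print(free_intervals)
```

In this sketch the cost depends on the number of live objects rather than on the heap size, which is one intuition for why replacing the sweep can pay off when little of the heap is live.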


Mark-Split (1)

[Figure: the heap, the root set, and the free-list]


Mark-Split (2)

[Figure: the heap, the root set, and the free-list]


Mark-Split (2)

[Figure: the heap to be collected, the root set, and the free-list]

Sequential Mark-Split outperforms Mark-Sweep in certain cases (ISMM 2006)


Mark-Split Algorithm for Multi-core Architectures

Most frequent operation in Mark-Split: “search and split”

Free-list implementation in a sequential context:
• Fast search is enough!
• Balanced search tree, skip-list, hash table

Free-list implementation in a multi-threaded context:
• An atomic “search and split” is needed
• split = find_interval + delete [+ insert_intervals]

Suitable search data structure to adapt? Skip-list!
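
One simple way to make “search and split” atomic is to guard the whole composite operation with a single lock. The Python sketch below shows that coarse-grained approach purely for illustration; it is not the lock-free skip-list proposed in these slides, and `LockedFreeList`, `worker`, and the toy heap layout are hypothetical names and data.

```python
import bisect
import threading


class LockedFreeList:
    """Free intervals [start, end) kept sorted; split is atomic via one lock."""

    def __init__(self, heap_start, heap_end):
        self._lock = threading.Lock()
        self._intervals = [(heap_start, heap_end)]

    def split(self, obj_start, obj_end):
        with self._lock:  # find + delete + insert happen as one step
            # find_interval: the last interval starting at or before obj_start
            key = (obj_start, float("inf"))
            i = bisect.bisect_right(self._intervals, key) - 1
            if i < 0:
                return False
            start, end = self._intervals[i]
            if obj_end > end:
                return False  # range is not free (for example, already split)
            remainders = []
            if start < obj_start:
                remainders.append((start, obj_start))
            if obj_end < end:
                remainders.append((obj_end, end))
            # delete the interval and insert zero, one, or two remainders
            self._intervals[i:i + 1] = remainders
            return True

    def snapshot(self):
        with self._lock:
            return list(self._intervals)


def worker(free_list, objects):
    for start, size in objects:
        free_list.split(start, start + size)


# Live objects of size 10 at addresses 0, 20, 40, ..., 980, marked by 4 threads.
free_list = LockedFreeList(0, 1000)
live = [(addr, 10) for addr in range(0, 1000, 20)]
threads = [threading.Thread(target=worker, args=(free_list, live[k::4]))
           for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = free_list.snapshot()
print(len(result), result[:3])
```

Every marking thread serializes on the same lock here, so contention grows with the number of threads; that bottleneck is the motivation for looking at a lock-free search structure instead.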


Skip-list

[Figure: a skip-list from Head to Tail holding the keys 1 to 7, with layers labeled 50% and 25%]

Layers of ordered lists with different densities achieve tree-like behavior

• Search: O(log N), probabilistic!
• Lock-free skip-list [Sundell & Tsigas 2003]
• An extension is needed to perform split
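
The layered structure can be sketched with a small sequential skip-list in Python. This is only an illustration of the basic data structure, not the lock-free algorithm of Sundell & Tsigas and not the extended version from these slides; `SkipList`, `floor`, and `MAX_HEIGHT` are names assumed here. The `floor` lookup is the kind of search a free-list needs to find the interval that starts at or before a given address.

```python
import random


class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height  # next[i] is the successor on layer i


class SkipList:
    MAX_HEIGHT = 16

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = Node(None, self.MAX_HEIGHT)  # sentinel, holds no key

    def _random_height(self):
        # Each extra layer is kept with probability 1/2: all nodes are on
        # layer 0, about 50% reach layer 1, about 25% reach layer 2, ...
        height = 1
        while height < self.MAX_HEIGHT and self._rng.random() < 0.5:
            height += 1
        return height

    def _predecessors(self, key):
        """For each layer, the last node whose key is strictly below key."""
        preds = [self._head] * self.MAX_HEIGHT
        node = self._head
        for layer in range(self.MAX_HEIGHT - 1, -1, -1):
            while node.next[layer] is not None and node.next[layer].key < key:
                node = node.next[layer]
            preds[layer] = node
        return preds

    def insert(self, key):
        preds = self._predecessors(key)
        nxt = preds[0].next[0]
        if nxt is not None and nxt.key == key:
            return False  # key already present
        node = Node(key, self._random_height())
        for layer in range(len(node.next)):
            node.next[layer] = preds[layer].next[layer]
            preds[layer].next[layer] = node
        return True

    def contains(self, key):
        nxt = self._predecessors(key)[0].next[0]
        return nxt is not None and nxt.key == key

    def floor(self, key):
        """Largest stored key that is <= key, or None if there is none."""
        pred = self._predecessors(key)[0]
        nxt = pred.next[0]
        if nxt is not None and nxt.key == key:
            return nxt.key
        return None if pred is self._head else pred.key


sl = SkipList(seed=1)
for k in [5, 1, 7, 3, 2, 6, 4]:
    sl.insert(k)
print(sl.contains(4), sl.contains(8), sl.floor(4.5))
```

The search starts on the sparsest layer and drops down one layer each time it would overshoot, which is where the expected logarithmic cost comes from.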


Extended functionality


Search data structure for parallelization of mark-split:
• Stores and efficiently searches for free memory chunks
• Can perform a special, composite split operation: split = 1 delete + 2 inserts, performed atomically using CAS


Lazy Splitting

• Reduces the number of interval-splitting operations
• Lowers the contention on the shared skip-list

Observation: several adjacent objects are marked consecutively
• 10-60% of total objects (degree of adjacency)

Mark-split becomes mark-mark…-mark-split
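
The idea can be sketched in Python as a small buffer that merges adjacent, consecutively marked objects into one pending range and issues a single split for the whole run. This is an assumption-laden illustration rather than the mechanism as implemented in ParMarkSplit: `LazySplitter`, its `split` callback, and the per-marker buffering policy are hypothetical.

```python
class LazySplitter:
    """Buffers adjacent marked objects and issues one split per run."""

    def __init__(self, split):
        self._split = split      # callback taking (start, end) of a live run
        self._pending = None     # (start, end) of the run being accumulated
        self.split_calls = 0

    def mark(self, start, end):
        if self._pending is not None:
            p_start, p_end = self._pending
            if start == p_end:   # object directly follows the pending run
                self._pending = (p_start, end)
                return
            if end == p_start:   # object directly precedes the pending run
                self._pending = (start, p_end)
                return
            self.flush()         # not adjacent: split the finished run
        self._pending = (start, end)

    def flush(self):
        if self._pending is not None:
            self._split(*self._pending)
            self.split_calls += 1
            self._pending = None


# Six marked objects, given as (start, end), forming three adjacent runs.
ranges = []
lazy = LazySplitter(lambda s, e: ranges.append((s, e)))
for obj in [(0, 10), (10, 20), (20, 30), (50, 60), (60, 70), (90, 100)]:
    lazy.mark(*obj)
lazy.flush()  # do not forget the last pending run
print(lazy.split_calls, ranges)
```

In this toy run, six marks lead to three split calls, so the shared structure is touched half as often; the actual saving depends on how often live objects are adjacent and marked back to back.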


Our parallelization of mark-split

Our Parallel Mark-Split (PMS) is naturally achieved by:
• Using the extended skip-list to store free memory intervals for splitting
• Integrating concurrent splitting into parallel marking

Implementation:
• New GC available for HotSpot, an industry-level Java Virtual Machine

Evaluation using DaCapo benchmarks on Intel and AMD systems
• Comparison with lock-based PMS and Concurrent Mark-Sweep (CMS)


A comparison to Concurrent Mark-Sweep

Concurrent Mark-Sweep: Initial Mark, Concurrent Mark, Final Mark, Sweep

Parallel Mark-Split: Initial Mark & Split, Concurrent Mark & Split, Final Mark & Split, then translate the skip-list to a free-list for allocation


Evaluation setup

• DaCapo benchmarks: avrora, tomcat, sunflow, lusearch, xalan
• 2 Intel Nehalem x 6 HT cores
• 4 AMD Bulldozer x 12 cores
• Linux Ubuntu 11.10
• OpenJDK 7


Lazy Splitting

[Figure: number of split_interval operations when using the lazy splitting mechanism, shown as a percentage (0% to 100%) for avrora, lusearch, sunflow, tomcat, and xalan]


Evaluation – Garbage collection time

Stop-the-world scenario

[Figure: GC time (s) versus number of GC threads (6, 9, 12, 15) on the AMD and Intel systems, comparing PMS, PMS_Lock, and CMS. Two panels: tomcat (y-axis 0 to 1) and sunflow (y-axis 0 to 0.5). The slide labels one panel “Good case” and the other “Bad case”.]


Evaluation – Benchmark time

Concurrent GC

[Figure: benchmark time (axis labeled “Benchmark time (s)”) versus number of GC threads (6, 9, 12, 15) on the AMD and Intel systems, comparing PMS, PMS_Lock, and CMS. Two panels: sunflow (y-axis 0 to 8000) and tomcat (y-axis 0 to 7000). The slide labels one panel “Good case” and the other “Bad case”.]

Characteristics of applications that benefit from ParMarkSplit:

• High garbage-to-live ratio

• Live objects reside next to each other and are marked consecutively


Conclusion

Parallel and Concurrent Mark-Split GC
• An extended lock-free skip-list to handle free intervals
• Lazy splitting mechanism

Implemented as a garbage collector in industry-standard OpenJDK 7 HotSpot

Evaluation: ParMarkSplit performs well with applications with certain characteristics