Assessing the Scalability of Garbage Collectors on Many Cores (Funded by ANR projects: Prose and ConcoRDanT). Lokesh Gidra, Gaël Thomas, Julien Sopena, Marc Shapiro. Regal-LIP6/INRIA

Dec 14, 2015


Assessing the Scalability of Garbage Collectors on Many Cores (Funded by ANR projects: Prose and ConcoRDanT)

Lokesh Gidra, Gaël Thomas, Julien Sopena, Marc Shapiro

Regal-LIP6/INRIA

Introduction

Why?
– MREs (managed runtime environments) are ubiquitous!
– GC is a vital component of an MRE, so its performance is critical.
– Hardware is more and more multi-resourced.
– Are GCs scaling with such hardware?
– Current solutions have not been evaluated on true many-cores!

What?
– Assess GC scalability: empirical results.
– Identify possible factors affecting GC scalability.

Multi-Node Architecture

[Figure: two of the machine's nodes. Each node has cores C0, C1, ..., C5, one L2 cache per core, one L3 cache, a memory controller (MC) and local DRAM, plus links to the other nodes.]

Our machine has 8 nodes with 6 cores each.

Remote access >> Local access

Parallel Copying Garbage Collection

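[Figure: execution timeline. Mutator threads run during application time; GC threads run during pause time. Total time = application time + pause time.]

[Figure: GC threads copy live objects from the from-space to the to-space; dead objects are left behind.]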
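The copying step can be sketched in a few lines of Python. This is a minimal, single-threaded illustration in the style of Cheney's algorithm, not the collector measured in the talk; the names `Obj`, `copy_collect` and `forward` are hypothetical, and a real parallel collector would split the scan loop across several GC threads.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Obj:
    """A heap object: a label plus references to other objects."""
    name: str
    refs: List["Obj"] = field(default_factory=list)


def copy_collect(roots: List[Obj]) -> List[Obj]:
    """Copy every object reachable from `roots` into a fresh to-space.

    Unreachable (dead) objects are never visited, so they simply
    stay behind in the from-space. `roots` is updated in place.
    """
    forwarding: Dict[int, Obj] = {}  # id(old object) -> its copy
    to_space: List[Obj] = []

    def forward(old: Obj) -> Obj:
        # Copy the object on first visit; reuse the copy afterwards.
        copy = forwarding.get(id(old))
        if copy is None:
            # The copy's refs still point into the from-space for now.
            copy = Obj(old.name, list(old.refs))
            forwarding[id(old)] = copy
            to_space.append(copy)
        return copy

    # Step 1: copy the roots.
    for i, root in enumerate(roots):
        roots[i] = forward(root)

    # Step 2: scan the to-space, copying children as they are found.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        obj.refs = [forward(child) for child in obj.refs]
        scan += 1
    return to_space
```

In this sketch, "scanning" is the loop over `to_space` and "copying" is the body of `forward`; on a multi-node machine each of the two can touch memory that belongs to another node.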

GC’s Effect on Application Scalability (Lusearch)

Setup: mutator threads = GC threads = number of cores (varied).

Up to 6 cores:
• 3x performance improvement.

More than 6 cores:
• No improvement in total time.
• Proportion of pause time increases up to 50%.

GC Scalability (Lusearch)

Pause time increases with the number of GC threads: negative scalability!

Setup: mutator threads = cores = 48, with a varying number of GC threads.

1. Remote Scanning

[Figure: from-space and to-space spread across Node 0 to Node 3, holding live and dead objects, with GC threads GC0 to GC3.]

Random (default) object allocation.

87.7% of scans were remote!
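A rough way to see why random placement leads to mostly remote scans: if an object is equally likely to live on any of N nodes, and the GC thread that scans it is equally likely to run on any node, the scan is local with probability 1/N. For N = 8 that gives an expected remote fraction of 7/8 = 87.5%. The sketch below checks this by simulation. The uniform, independent placement is an assumption of this example rather than something the slides state, and the function name `remote_scan_fraction` is hypothetical.

```python
import random


def remote_scan_fraction(num_nodes: int, num_objects: int, seed: int = 0) -> float:
    """Estimate the fraction of scans that cross a node boundary.

    Assumption (not from the slides): both the node holding each object
    and the node of the GC thread scanning it are chosen uniformly and
    independently at random.
    """
    rng = random.Random(seed)
    remote = 0
    for _ in range(num_objects):
        object_node = rng.randrange(num_nodes)   # node whose memory holds the object
        scanner_node = rng.randrange(num_nodes)  # node running the scanning GC thread
        if object_node != scanner_node:
            remote += 1
    return remote / num_objects


# Analytic value under the same assumption: (N - 1) / N.
expected_for_8_nodes = (8 - 1) / 8  # 0.875
```

Calling `remote_scan_fraction(8, 200_000)` should land close to 0.875. That this is near the measured 87.7% is suggestive only; the slides do not present this model.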

2. Remote Copying

[Figure: from-space and to-space spread across Node 0 to Node 3, holding live and dead objects, with GC threads GC0 to GC3.]

82.7% of copies were remote!

3. Load Balancing

[Figure: a task queue. Owner: push and pop. Other GC threads: steal (pop). Shared variable: size (task queue size).]

• Based on the work-stealing technique.

• 1 task queue per GC thread.

Highly unbalanced load:

• Requires a lot of stealing.

• GC threads keep stealing until all tasks are done.

Performance impact:

• ≥ 2-4 cache misses per steal!

• 33.3% improvement in pause time by disabling load balancing!
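The queue discipline above can be illustrated with a small round-based simulation. This is a single-threaded sketch with hypothetical names (`run_work_stealing`), not the collector's actual code: tasks do not spawn new tasks here, and picking the longest queue as the victim is a simplification chosen for the example. Reading the length of another thread's queue stands in for reading the shared `size` variable.

```python
from collections import deque
from typing import Deque, List, Tuple


def run_work_stealing(initial: List[List[int]]) -> Tuple[List[int], int]:
    """Run all tasks; return (tasks done per GC thread, number of steals).

    `initial[i]` is the starting task list of GC thread i. In each round
    every thread performs at most one task.
    """
    queues: List[Deque[int]] = [deque(tasks) for tasks in initial]
    done = [0] * len(queues)
    steals = 0
    while any(queues):  # an empty deque is falsy
        for me, queue in enumerate(queues):
            if queue:
                queue.pop()  # owner pops from its own end
                done[me] += 1
                continue
            # Own queue is empty: inspect the other queues' sizes
            # (the shared variable) and steal from the longest one.
            victim = max(range(len(queues)), key=lambda i: len(queues[i]))
            if queues[victim]:
                queues[victim].popleft()  # thief takes from the opposite end
                done[me] += 1
                steals += 1
    return done, steals
```

With a highly unbalanced start such as `run_work_stealing([[1] * 12, [], [], []])`, most of the work ends up being stolen, whereas a balanced start needs no steals at all. On real hardware each steal touches another thread's queue and its `size` field, which is where the cache misses mentioned above would come from.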

Conclusion

• GC does affect an application’s scalability: it matters!

• GC doesn’t scale with the hardware!

• Bottlenecks:
– Remote Scanning
– Remote Copying
– Load Balancing

• Future Work:
– Fix the bottlenecks: does that help the GC scale?

DaCapo Benchmarks’ Scalability

Revisiting App. (Lusearch) Scalability…
