© Richard Jones, Eric Jul, 1999-2004 mmnet GC & MM Summer School, 20-21 July 2004 1 A Rapid Introduction to Garbage Collection Richard Jones Computing Laboratory University of Kent at Canterbury mm-net Garbage Collection & Memory Management Summer School Tuesday 20 July 2004 © Richard Jones, 2004. All rights reserved.

Dec 20, 2015



A Rapid Introduction to Garbage Collection

Richard Jones

Computing Laboratory

University of Kent at Canterbury

mm-net Garbage Collection & Memory Management Summer School

Tuesday 20 July 2004

© Richard Jones, 2004. All rights reserved.


PART 1: Introduction

Motivation for garbage collection

What to look for



Why garbage collect?

Finite storage requirement
• computers have finite, limited storage

Language requirement
• many OO languages assume GC, e.g. allocated objects may survive much longer than the method that created them

Problem requirement
• the nature of the problem may make it very hard/impossible to determine when something is garbage


Why automatic garbage collection?


Because human programmers just can’t get it right. Either too little is collected, leading to memory leaks, or too much is collected, leading to broken programs.

Explicit memory management conflicts with the software engineering principles of abstraction and modularity.

It’s not a silver bullet

• Some memory management problems cannot be solved using automatic GC, e.g. if you forget to drop references to objects that you no longer need.

• Some environments are inimical to garbage collection
– embedded systems with limited memory
– hard real-time systems


PART 2: The Basics

• What is garbage?

• The concept of liveness by reachability

• The basic algorithms

• The cost of garbage collection



What is garbage?

Almost all garbage collectors assume the following definition of live objects called liveness by reachability: if you can get to an object, then it is live.

More formally: An object is live if and only if:

it is referenced in a predefined variable called a root,

or

it is referenced in a variable contained in a live object

(i.e. it is transitively referenced from a root).

Non-live objects are called dead objects, i.e. garbage.
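The definition of liveness by reachability can be sketched as a graph traversal. This is only an illustration: the heap is modelled as a hypothetical dictionary from object names to the names they reference, and `live_objects` is a name invented for this sketch.

```python
from typing import Dict, Iterable, List, Set


def live_objects(roots: Iterable[str], refs: Dict[str, List[str]]) -> Set[str]:
    """Return every object that is transitively referenced from a root."""
    live: Set[str] = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in live:
            continue                      # already known to be live
        live.add(obj)
        worklist.extend(refs.get(obj, []))
    return live


# Hypothetical heap: object name -> names of the objects it references.
heap = {"A": ["B"], "B": ["C"], "C": [], "D": ["E"], "E": ["D"]}
live = live_objects(["A"], heap)
garbage = set(heap) - live                # non-live objects are dead
print(sorted(live), sorted(garbage))
```

Here D and E reference each other but neither is reachable from the root, so both are garbage under this definition.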


Roots

Objects and references can be considered a directed graph. Live objects are those reachable from a root. A process executing a computation is called a mutator: it simply modifies the object graph dynamically.

Determining roots of a computation is, in general, language-dependent.

In common language implementations roots include
• words in the static area
• registers
• words on the execution stack that point into the heap.


The basic algorithms

• Reference counting: Keep a note on each object in your garage, indicating the number of live references to the object. If an object’s reference count goes to zero, throw the object out (it’s dead).

• Mark-Sweep: Put a note on objects you need (roots). Then recursively put a note on anything needed by a live object. Afterwards, check all objects and throw out objects without notes.

• Mark-Compact: Put notes on objects you need (as above). Move anything with a note on it to the back of the garage. Burn everything at the front of the garage (it’s all dead).

• Copying: Move objects you need to a new garage. Then recursively move anything needed by an object in the new garage. Afterwards, burn down the old garage (any objects in it are dead)!


Reference counting

[Figure: heap objects R, S and T and a root, shown before and after Update(left(R), S)]

The simplest form of garbage collection is reference counting.

Basic idea: count the number of references from live objects.

Each object has a reference count (RC)
• when a reference is copied, the referent’s RC is incremented

• when a reference is deleted, the referent’s RC is decremented

• an object can be reclaimed when its RC = 0
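The three rules above can be sketched as follows. This is a minimal illustration, not the slides' own code: `Obj`, `add_ref`, `delete_ref`, `update` and the `freed` list are hypothetical names, and the starting shape of the graph (R's left field pointing at T) is an assumption made for the example.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.rc = 0          # reference count
        self.fields = {}     # field name -> referenced Obj


freed = []                   # names of reclaimed objects, kept for inspection


def add_ref(obj):
    """A reference to obj has been copied."""
    if obj is not None:
        obj.rc += 1


def delete_ref(obj):
    """A reference to obj has been deleted; reclaim obj if its RC reaches 0."""
    if obj is None:
        return
    obj.rc -= 1
    if obj.rc == 0:
        # The dead object's own references are deleted too (recursive freeing).
        for child in obj.fields.values():
            delete_ref(child)
        obj.fields.clear()
        freed.append(obj.name)


def update(holder, field, new):
    """Overwrite holder.field with new, adjusting both reference counts."""
    add_ref(new)                              # increment first, so re-storing
    delete_ref(holder.fields.get(field))      # the same object is safe
    holder.fields[field] = new


r, s, t = Obj("R"), Obj("S"), Obj("T")
add_ref(r)                   # the root's reference to R
update(r, "left", t)         # assumed starting state: left(R) = T
update(r, "right", s)
update(r, "left", s)         # Update(left(R), S): T loses its only reference
print(freed, s.rc)
```

T is reclaimed the moment its count drops to zero, which is the near-zero zombie time that reference counting is known for.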


Advantages of reference counting


Simple to implement

Costs distributed throughout program

Good locality of reference: only touch old and new targets' RCs

Works well because few objects are shared and many are short-lived

Zombie time minimized: the zombie time is the time from when an object becomes garbage until it is collected

Immediate finalisation is possible (due to near zero zombie time)



Disadvantages of reference counting

Not comprehensive (does not collect all garbage): cannot reclaim cyclic data structures

High cost of manipulating RCs: cost is ever-present even if no garbage is collected

Bad for concurrency — need Compare&Swap

Tightly coupled interface to mutator

High space overheads

Recursive freeing cascade
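The first disadvantage, that cyclic structures are never reclaimed, can be seen in a small sketch. The `Node` class and the `inc`/`dec` helpers are hypothetical names used only for this illustration.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.rc = 0          # reference count
        self.next = None     # single outgoing reference


def inc(node):
    node.rc += 1


def dec(node, freed):
    node.rc -= 1
    if node.rc == 0:
        freed.append(node.name)
        if node.next is not None:
            dec(node.next, freed)   # cascade to the referent


freed = []
a, b = Node("A"), Node("B")
inc(a)                  # root -> A
a.next = b
inc(b)                  # A -> B
b.next = a
inc(a)                  # B -> A closes the cycle
dec(a, freed)           # the root drops its reference to A
print(freed, a.rc, b.rc)
```

After the root's reference is dropped, A and B are unreachable, yet each still holds a count of 1 from the other, so neither count ever reaches zero.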



Mark-Sweep

Mark-sweep is a tracing algorithm: it works by following (tracing) references from live objects to find other live objects.

Implementation: Each object has a mark-bit associated with it.

There are two phases:
• Mark phase: starting from the roots, the graph is traced and the mark-bit is set in each unmarked object encountered. At the end of the mark phase, unmarked objects are garbage.
• Sweep phase: starting from the bottom, the heap is swept
– mark-bit not set: the object is reclaimed
– mark-bit set: the mark-bit is cleared
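The two phases can be sketched as below. This is a simplified model under stated assumptions: the heap is a Python list standing in for memory in address order, and `Obj`, `mark`, `sweep` and `collect` are hypothetical names.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.marked = False   # the mark-bit
        self.refs = []        # outgoing references


def mark(roots):
    """Mark phase: trace from the roots, setting the mark-bit on each object."""
    stack = list(roots)       # explicit mark stack
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: visit every object, reclaiming the unmarked ones."""
    survivors, reclaimed = [], []
    for obj in heap:
        if obj.marked:
            obj.marked = False            # clear for the next collection
            survivors.append(obj)
        else:
            reclaimed.append(obj.name)
    return survivors, reclaimed


def collect(roots, heap):
    mark(roots)
    return sweep(heap)


a, b, c, d = Obj("A"), Obj("B"), Obj("C"), Obj("D")
a.refs = [b]
c.refs = [d]
d.refs = [c]                  # C and D form an unreachable cycle
survivors, reclaimed = collect([a], [a, b, c, d])
print([o.name for o in survivors], reclaimed)
```

Unlike reference counting, the unreachable cycle C and D is reclaimed without any special handling, because it is simply never marked.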


A simple mark-sweep example

[Figure: a root pointing into a heap of objects, each drawn with its mark-bit]


Advantages of mark-sweep

Comprehensive: cyclic garbage collected naturally

No run-time overhead on pointer manipulations

Loosely coupled to mutator

Does not move objects
• does not break any mutator invariants
• optimiser-friendly
• requires only one reference to each live object to be discovered (rather than having to find every reference)


Disadvantages of mark-sweep

Stop/start nature leads to disruptive pauses and long zombie times.

Complexity is O(heap) rather than O(live)
• every live object is visited in mark phase
• every object, alive or dead, is visited in sweep phase

Degrades with residency (heap occupancy)
• the collector needs headroom in the heap to avoid thrashing

Fragmentation and mark-stack overflow are issues

Tracing collectors must be able to find roots (unlike reference counting)



Fast allocation?

Problem: Non-moving memory managers fragment the heap

• mark-sweep

• reference counting

A compacted heap

• offers better spatial locality, e.g. better virtual memory and cache performance

• allows fast allocation

– merely bump a pointer
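Bump-pointer allocation in a compacted heap can be sketched in a few lines. The `BumpAllocator` class is a hypothetical illustration in which addresses are plain integers; a real allocator would also deal with alignment and object headers.

```python
class BumpAllocator:
    def __init__(self, size):
        self.free = 0         # address of the next free byte
        self.limit = size     # end of the contiguous free region

    def alloc(self, nbytes):
        """Return the address of a new block, or None if the heap is full."""
        if self.free + nbytes > self.limit:
            return None       # a real system would trigger a collection here
        addr = self.free
        self.free += nbytes   # "bump" the pointer past the new object
        return addr


heap = BumpAllocator(32)
first = heap.alloc(8)
second = heap.alloc(16)
too_big = heap.alloc(16)      # only 8 bytes remain
print(first, second, too_big)
```

Allocation is a bounds check and an addition, with no free-list search, which is only possible because all free space is one contiguous region.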


Copying GC Example

[Figure sequence: Registers, Fromspace holding objects A to G, and Tospace with its scan and free pointers, redrawn after each step below]

• copy root and update pointer, leaving forwarding address
• scan A': copy B and C, leaving forwarding addresses
• scan B': copy D and E, leaving forwarding addresses
• scan C': copy F and G, leaving forwarding addresses
• scan D' and E': nothing to do
• scan F': use A's forwarding address
• scan G': nothing to do
• scan = free, so collection is complete
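The copying collection traced in this example, with scan and free pointers in tospace and forwarding addresses left behind in fromspace, can be sketched as follows. This is a simplified model, not the slides' code: tospace is a Python list whose length plays the role of the free pointer, `Obj`, `copy` and `collect` are hypothetical names, and the object graph (A to B and C, B to D and E, C to F and G, F back to A) is inferred from the step captions.

```python
class Obj:
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = fields if fields is not None else []  # references
        self.forward = None       # forwarding address, set once copied


def copy(obj, tospace):
    """Return obj's tospace copy, copying it first if that has not been done."""
    if obj.forward is None:
        # The clone's fields still point into fromspace until it is scanned.
        clone = Obj(obj.name + "'", list(obj.fields))
        tospace.append(clone)     # appending advances the free pointer
        obj.forward = clone       # leave a forwarding address behind
    return obj.forward


def collect(roots):
    tospace = []
    new_roots = [copy(root, tospace) for root in roots]
    scan = 0
    while scan < len(tospace):    # finished when scan catches up with free
        obj = tospace[scan]
        obj.fields = [copy(child, tospace) for child in obj.fields]
        scan += 1
    return new_roots, tospace


a, b, c, d, e, f, g = (Obj(n) for n in "ABCDEFG")
a.fields = [b, c]
b.fields = [d, e]
c.fields = [f, g]
f.fields = [a]                    # back-reference resolved via forwarding
new_roots, tospace = collect([a])
print([o.name for o in tospace])
```

The region of tospace between scan and free acts as the work queue, so no separate mark stack is needed and objects end up in breadth-first order.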


Disadvantages of copying GC

Stop-and-copy may be disruptive

Degrades with residency

Requires twice the address space of other simple collectors
• touch twice as many pages
• trade-off against fragmentation

Cost of copying large objects

Long-lived data may be repeatedly copied

All references must be updated

Moving objects may break mutator invariants

Breadth-first copying may disturb locality patterns


Mark-compact collection

Mark-compact collectors make at least two passes over the heap after marking

• to relocate objects

• to update references (not necessarily in this order)

Issues
• how many passes?

• compaction style

– sliding: preserve the original order of objects

– linearising: objects that reference each other are placed adjacently (as far as possible)

– arbitrary: objects moved without regard for original order or referential locality
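A sliding compactor can be sketched as below. The slides do not prescribe a particular scheme; this illustration assumes a simple three-pass design that keeps forwarding addresses in a side table, that marking has already been done, and that references are stored as integer addresses. `Obj` and `compact` are hypothetical names.

```python
class Obj:
    def __init__(self, addr, size, refs=(), marked=False):
        self.addr = addr            # current start address
        self.size = size
        self.refs = list(refs)      # addresses of the objects referenced
        self.marked = marked        # set by an earlier mark phase


def compact(heap, roots):
    """heap: objects in address order; roots: addresses of root referents.
    Returns (live objects, updated roots, new free pointer)."""
    # Pass 1: compute each live object's new address, preserving order.
    forward, free = {}, 0
    for obj in heap:
        if obj.marked:
            forward[obj.addr] = free
            free += obj.size
    # Pass 2: update roots and the references held by live objects.
    new_roots = [forward[r] for r in roots]
    for obj in heap:
        if obj.marked:
            obj.refs = [forward[r] for r in obj.refs]
    # Pass 3: relocate live objects and clear their mark-bits.
    live = []
    for obj in heap:
        if obj.marked:
            obj.addr = forward[obj.addr]
            obj.marked = False
            live.append(obj)
    return live, new_roots, free


heap = [
    Obj(0, 4, refs=[10], marked=True),
    Obj(4, 6),                      # dead
    Obj(10, 2, marked=True),
    Obj(12, 4),                     # dead
    Obj(16, 3, refs=[0], marked=True),
]
live, new_roots, free = compact(heap, roots=[16])
print([o.addr for o in live], new_roots, free)
```

The live objects keep their original relative order and everything from the returned free pointer upwards is one contiguous free region, ready for bump-pointer allocation.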


Cost metrics

Many cost metrics can be interesting (albeit not necessarily at the same time). These cost metrics cover different types of concerns that may apply. The metrics are partially orthogonal, partially overlapping, and certainly also partially contradictory.

In general it is not possible to identify one particular metric as the most important in all cases — it is application dependent.

Because different GC algorithms emphasise different metrics, it is also, in general, not possible to point out one particular GC algorithm as “the best”.

In the following, we present the most important metrics to consider when choosing a collector algorithm.


GC Metrics

Execution time
• total execution time
• distribution of GC execution time
• time to allocate a new object

Memory usage
• additional memory overhead
• fragmentation
• virtual memory and cache performance

Delay time
• length of disruptive pauses
• zombie times

Other important metrics
• comprehensiveness
• implementation simplicity and robustness