Memory Management Tom Roeder CS215 2006fa

Dec 16, 2015

Memory Management

Tom Roeder

CS215 2006fa

Motivation

Recall unmanaged code, e.g. C:

{
    double* A = malloc(sizeof(double)*M*N);
    for(int i = 0; i < M*N; i++) {
        A[i] = i;
    }
}

What’s wrong?
  memory leak: forgot to call free(A)
  common problem in C
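For comparison, here is a minimal sketch of the same loop with the missing free added. The function name fill_and_sum and its return value are assumptions made for illustration; they are not part of the original slide.

```c
#include <stdlib.h>

/* Hypothetical helper: fills an m*n buffer with 0..m*n-1, sums it,
   and releases the buffer before returning.
   Returns -1.0 if the allocation fails. */
double fill_and_sum(size_t m, size_t n)
{
    size_t count = m * n;
    double *a = malloc(sizeof *a * count);
    if (a == NULL) {
        return -1.0;
    }

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        a[i] = (double)i;
        sum += a[i];
    }

    free(a);    /* the call the slide's version forgot */
    return sum;
}
```

Every path that leaves the function after a successful malloc has to reach free, which is exactly the bookkeeping a garbage collector takes over.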

Motivation

What’s wrong here?

char* f()
{
    char c[100];
    for(int i = 0; i < 100; i++) {
        c[i] = i;
    }
    return c;
}

Returning memory allocated on the stack
Can you still do this in C#?
  no: array sizes must be specified in “new” expressions, so arrays live on the heap
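One possible C fix is to take the buffer from the heap so that it outlives the call. The sketch below assumes that approach; make_sequence and make_sequence_demo are hypothetical names, and the caller becomes responsible for calling free.

```c
#include <stdlib.h>

/* Hypothetical replacement for f(): the buffer comes from the heap,
   so it is still valid after the function returns.
   The caller owns the result and must free() it. */
char *make_sequence(size_t len)
{
    char *c = malloc(len);
    if (c == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        c[i] = (char)i;
    }
    return c;
}

/* Usage sketch: returns 1 if the buffer holds the expected values. */
int make_sequence_demo(void)
{
    char *s = make_sequence(100);
    if (s == NULL) {
        return 0;
    }
    int ok = (s[0] == 0 && s[99] == 99);
    free(s);
    return ok;
}
```

This removes the dangling pointer but reintroduces the leak risk from the previous slide: someone still has to remember the free.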

Motivation

Solution: no explicit malloc/free (new/delete), e.g. in Java/C#

{
    double[] A = new double[M*N];
    for(int i = 0; i < M*N; i++) {
        A[i] = i;
    }
}

No leak: memory is “lost” but freed later
A garbage collector tries to free memory
  keeps track of used information somehow

COM’s Solution

Reference Counting
  AddRef/Release
    each time a new reference is created: call AddRef
    each time released: call Release
  must be called by programmer
  leads to difficult bugs
    forgot to AddRef: objects disappear underneath
    forgot to Release: memory leaks
Entirely manual solutions unacceptable
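The discipline described above can be sketched in plain C. This is a hypothetical analog, not the COM API: the Counted type, the add_ref and release functions, and the live_objects counter are all invented for the example.

```c
#include <stdlib.h>

/* Hypothetical C analog of AddRef/Release. This is not the COM API. */
typedef struct {
    int refs;
    int payload;
} Counted;

static int live_objects = 0;    /* instrumentation for the example only */

Counted *counted_new(int payload)
{
    Counted *obj = malloc(sizeof *obj);
    if (obj == NULL) {
        return NULL;
    }
    obj->refs = 1;              /* the creator holds the first reference */
    obj->payload = payload;
    live_objects++;
    return obj;
}

void add_ref(Counted *obj)
{
    obj->refs++;
}

void release(Counted *obj)
{
    if (--obj->refs == 0) {
        live_objects--;
        free(obj);
    }
}

/* Balanced calls: returns the number of objects still alive,
   or -1 if allocation failed. */
int balanced_demo(void)
{
    Counted *a = counted_new(42);
    if (a == NULL) {
        return -1;
    }
    Counted *alias = a;
    add_ref(alias);             /* a second reference was created */
    release(a);                 /* first holder is done */
    release(alias);             /* last holder is done: object is freed */
    return live_objects;
}
```

Dropping the add_ref call would make the second release touch freed memory, and dropping either release would leave the object alive forever, which are the two bugs named on the slide.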

Garbage Collection

Why must we do this in COM?
  no way to tell what points to what
  C/C++ pointers can point to anything
C#/Java have a managed runtime
  all pointer types are known at runtime
  can do reference counting in CLR
Garbage Collection is program analysis
  figure out properties of code automatically
  two types of analysis: dynamic and static

Soundness and Completeness

For any program analysis
  Sound?
    are the operations always correct?
    usually an absolute requirement
  Complete?
    does the analysis capture all possible instances?
For Garbage Collection
  sound = it never deletes memory that is still in use
  complete = it deletes all unused memory

Reference Counting

As in COM, keep count of references. How?
  on assignment, increment the new target’s count and decrement the old target’s
  when removing variables, decrement
    e.g. local variables being removed from stack
  know where all objects live
  at ref count 0, reclaim object space
Advantage: incremental (don’t stop)
Is this safe?
  Yes: not referenced means not reachable

Reference Counting

Disadvantages
  constant cost, even when lots of space
    optimize the common case!
  can’t detect cycles
Has fallen out of favor.

[Figure: object graph annotated with reference counts 1, 1, 1 and 2; label “Reachable”]
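The cycle problem can be reproduced with two nodes. The sketch below is an illustration under assumed names (RcNode, rc_assign, rc_release, rc_demo); it uses stack objects and a “reclaimed” flag instead of real deallocation so that the outcome can be inspected.

```c
#include <stddef.h>

/* Hypothetical node type: one outgoing reference and a reference count. */
typedef struct RcNode {
    int refs;
    int reclaimed;              /* set when the count reaches zero */
    struct RcNode *next;
} RcNode;

static void rc_release(RcNode *n)
{
    if (n == NULL) {
        return;
    }
    if (--n->refs == 0) {
        n->reclaimed = 1;
        RcNode *child = n->next;
        n->next = NULL;
        rc_release(child);      /* a dead object drops its outgoing reference */
    }
}

/* Assignment: increment the new target, then decrement the old one. */
static void rc_assign(RcNode **slot, RcNode *target)
{
    if (target != NULL) {
        target->refs++;
    }
    rc_release(*slot);
    *slot = target;
}

/* Returns how many of the two nodes were reclaimed once all
   external references are gone. */
int rc_demo(int make_cycle)
{
    RcNode a = { 0, 0, NULL }, b = { 0, 0, NULL };
    RcNode *root_a = NULL, *root_b = NULL;  /* stand-ins for local variables */

    rc_assign(&root_a, &a);                 /* a.refs == 1 */
    rc_assign(&root_b, &b);                 /* b.refs == 1 */
    rc_assign(&a.next, &b);                 /* b.refs == 2 */
    if (make_cycle) {
        rc_assign(&b.next, &a);             /* a.refs == 2 */
    }

    rc_assign(&root_a, NULL);               /* the locals go away */
    rc_assign(&root_b, NULL);

    return a.reclaimed + b.reclaimed;
}
```

Without the cycle both nodes are reclaimed. With it, each node keeps the other’s count at 1, so neither is reclaimed even though nothing outside the cycle can reach them.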

Trees

Instead of counting references
  keep track of some top-level objects
  and trace out the reachable objects
  only clean up heap when out of space
    much better for low-memory programs
Two major types of algorithm
  Mark and Sweep
  Copy Collectors

Trees

Top-level objects
  managed by CLR
  local variables on stack
  registers pointing to objects
Garbage collector starts from the top-level objects
  builds a graph of the reachable objects

Mark and Sweep

Two-pass algorithm
  First pass: walk the graph and mark every object reached
    everything starts unmarked
  Second pass: sweep the heap, remove unmarked
    not reachable implies garbage
Soundness?
  Yes: any object not marked is not reachable
Completeness?
  Yes, since any object unreachable is not marked
  but only complete eventually
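The two passes can be sketched over a tiny fixed-size heap. Everything here is an assumption for illustration: the Obj layout, the use of array indices as references, and the names mark, sweep, collect and mark_sweep_demo.

```c
#define HEAP_OBJECTS 6
#define MAX_REFS 2

/* Hypothetical object layout: an in-use flag, a mark bit, and up to
   MAX_REFS outgoing references stored as heap indices (-1 = none). */
typedef struct {
    int in_use;
    int marked;
    int refs[MAX_REFS];
} Obj;

static Obj heap[HEAP_OBJECTS];

/* First pass: mark everything reachable from object i. */
static void mark(int i)
{
    if (i < 0 || i >= HEAP_OBJECTS || !heap[i].in_use || heap[i].marked) {
        return;
    }
    heap[i].marked = 1;
    for (int r = 0; r < MAX_REFS; r++) {
        mark(heap[i].refs[r]);
    }
}

/* Second pass: reclaim every unmarked object and clear the marks
   for the next collection. Returns the number reclaimed. */
static int sweep(void)
{
    int reclaimed = 0;
    for (int i = 0; i < HEAP_OBJECTS; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;
            reclaimed++;
        }
        heap[i].marked = 0;
    }
    return reclaimed;
}

int collect(const int *roots, int nroots)
{
    for (int i = 0; i < nroots; i++) {
        mark(roots[i]);
    }
    return sweep();
}

/* Example graph: root -> 0 <-> 1 (reachable cycle),
   2 <-> 3 (garbage cycle), 4 isolated, 5 never allocated. */
int mark_sweep_demo(void)
{
    for (int i = 0; i < HEAP_OBJECTS; i++) {
        heap[i].in_use = (i < 5);
        heap[i].marked = 0;
        heap[i].refs[0] = -1;
        heap[i].refs[1] = -1;
    }
    heap[0].refs[0] = 1;
    heap[1].refs[0] = 0;
    heap[2].refs[0] = 3;
    heap[3].refs[0] = 2;

    int roots[] = { 0 };
    return collect(roots, 1);
}
```

Unlike reference counting, the garbage cycle between objects 2 and 3 is reclaimed, because reachability is decided from the roots rather than from per-object counts.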

Mark and Sweep

Can be expensive
  e.g. emacs
    everything stops and collection happens
  this is a general problem for garbage collection
at end of first phase, know all reachable objects
  should use this information
  how could we use it?

Copy Collectors

Instead of just marking as we trace
  copy each reachable object to new part of heap
  needs to have enough space to do this
  no need for second pass
Advantages
  one pass
  compaction
Disadvantages
  higher memory requirements

Fragmentation

Common problem in memory schemes
Enough memory but not enough contiguous
  consider allocator in OS

[Figure: memory layout with blocks labeled 5, 5, 10, 15, 10 and 10, and a request labeled “10?”]

Unmanaged algorithms

best-fit
  search the heap for the closest fit
  takes time
  causes external fragmentation (as we saw)
first-fit
  choose the first fit found
  starts from beginning of heap
next-fit
  first-fit with a pointer to last place searched

Unmanaged algorithms

worst-fit
  put the object in the largest possible hole
  under what workload is this good?
    objects need to grow
    e.g. database construction
    e.g. network connection table
different algorithms appropriate in different settings: designed differently
  in compiler/runtime, we want access speed
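The four placement policies differ only in which hole they pick. The sketch below models the free list as an array of hole sizes; that representation, the function signatures, and the example_holes data are assumptions for illustration, and a real allocator would also split and coalesce holes.

```c
#include <stddef.h>

/* Example data (hypothetical): four free holes, sizes in arbitrary units. */
const size_t example_holes[4] = { 5, 15, 10, 20 };

/* Each function returns the index of the chosen hole, or -1 if none fits. */

int first_fit(const size_t *holes, int n, size_t request)
{
    for (int i = 0; i < n; i++) {
        if (holes[i] >= request) {
            return i;
        }
    }
    return -1;
}

/* Next-fit: first-fit that resumes from *cursor, the last place searched. */
int next_fit(const size_t *holes, int n, size_t request, int *cursor)
{
    for (int step = 0; step < n; step++) {
        int i = (*cursor + step) % n;
        if (holes[i] >= request) {
            *cursor = i;
            return i;
        }
    }
    return -1;
}

/* Best-fit: the smallest hole that is still large enough. */
int best_fit(const size_t *holes, int n, size_t request)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (holes[i] >= request && (best < 0 || holes[i] < holes[best])) {
            best = i;
        }
    }
    return best;
}

/* Worst-fit: the largest hole, leaving the most room to grow. */
int worst_fit(const size_t *holes, int n, size_t request)
{
    int worst = -1;
    for (int i = 0; i < n; i++) {
        if (holes[i] >= request && (worst < 0 || holes[i] > holes[worst])) {
            worst = i;
        }
    }
    return worst;
}
```

For a request of 10 against example_holes, first-fit picks the hole of size 15, best-fit the hole of size 10, and worst-fit the hole of size 20. Best-fit and worst-fit always scan every hole, while first-fit and next-fit can stop early.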

Heap Allocation Algorithms

Best for managed heap?
  must usually be O(1)
    so not best or first fit
  use next fit
    walk on the edge of the last chunk
General idea
  allocate contiguously
  allocate forwards until out of memory
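Allocating forwards from the edge of the last chunk is often called bump-pointer allocation. A minimal sketch follows, assuming a fixed 64-byte arena and ignoring alignment; arena, next_free, bump_alloc and bump_demo are hypothetical names.

```c
#include <stddef.h>

#define ARENA_SIZE 64

static unsigned char arena[ARENA_SIZE];
static size_t next_free = 0;    /* edge of the last allocated chunk */

/* O(1) allocation: hand out the next `size` bytes, or NULL when the
   arena is full (the point at which a collector would have to run).
   Alignment is ignored to keep the sketch short. */
void *bump_alloc(size_t size)
{
    if (size > ARENA_SIZE - next_free) {
        return NULL;
    }
    void *p = &arena[next_free];
    next_free += size;
    return p;
}

/* Usage sketch: returns 1 if chunks are handed out contiguously and
   an oversized request is refused. */
int bump_demo(void)
{
    next_free = 0;
    unsigned char *a = bump_alloc(16);
    unsigned char *b = bump_alloc(16);
    void *c = bump_alloc(40);   /* only 32 bytes remain */
    return a == arena && b == a + 16 && c == NULL;
}
```

There is no search at all, which is why this scheme only works when something else (compaction or copying) keeps the free space in one contiguous piece.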

Compacting Copy Collector

Move live objects to bottom of heap
  leaves more free space on top
  contiguous allocation allows faster access
    cache works better with locality
Must then modify references
  recall: references are really pointers
  must update location in each object
Can be made very fast

Compacting Copy Collector

Another possible collector:
  divide memory into two halves
  fill up one half before doing any collection
  on full:
    walk the trees and copy to other side
    work from new side
Need twice memory of other collectors
But don’t need to find space in old side
  contiguous allocation is easy
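The two-halves scheme can be sketched with index-based objects and a forwarding field. The Cell layout and all function names are assumptions for illustration; the scan loop follows the general shape of Cheney’s copying algorithm rather than any particular runtime.

```c
#define SPACE_OBJECTS 8

/* Hypothetical object: a value, one outgoing reference (index into the
   current half, -1 = none), and a forwarding index used while copying. */
typedef struct {
    int value;
    int ref;
    int forward;                /* -1 until the object has been copied */
} Cell;

static Cell space_a[SPACE_OBJECTS], space_b[SPACE_OBJECTS];
static Cell *from_space = space_a, *to_space = space_b;
static int next_obj = 0;        /* allocation edge in from_space */

int cell_alloc(int value, int ref)
{
    if (next_obj == SPACE_OBJECTS) {
        return -1;              /* half is full: time to collect */
    }
    from_space[next_obj] = (Cell){ value, ref, -1 };
    return next_obj++;
}

/* Copy one object to to_space (at most once) and return its new index. */
static int copy_cell(int old, int *to_next)
{
    if (old < 0) {
        return -1;
    }
    if (from_space[old].forward < 0) {
        from_space[old].forward = *to_next;
        to_space[*to_next] = from_space[old];
        to_space[*to_next].forward = -1;
        (*to_next)++;
    }
    return from_space[old].forward;
}

/* Copy everything reachable from the roots, fix references, swap halves. */
void copy_collect(int *roots, int nroots)
{
    int to_next = 0;
    for (int i = 0; i < nroots; i++) {
        roots[i] = copy_cell(roots[i], &to_next);
    }
    /* Copied objects still hold old indices until the scan visits them. */
    for (int scan = 0; scan < to_next; scan++) {
        to_space[scan].ref = copy_cell(to_space[scan].ref, &to_next);
    }
    Cell *tmp = from_space;
    from_space = to_space;
    to_space = tmp;
    next_obj = to_next;
}

/* Usage sketch: two live cells separated by garbage end up adjacent. */
int copy_demo(void)
{
    from_space = space_a;
    to_space = space_b;
    next_obj = 0;
    cell_alloc(99, -1);                 /* index 0: garbage */
    int tail = cell_alloc(2, -1);       /* index 1 */
    cell_alloc(98, -1);                 /* index 2: garbage */
    int head = cell_alloc(1, tail);     /* index 3, refers to index 1 */
    int roots[] = { head };
    copy_collect(roots, 1);
    return roots[0] == 0
        && from_space[0].value == 1 && from_space[0].ref == 1
        && from_space[1].value == 2 && next_obj == 2;
}
```

After the collection the survivors sit at indices 0 and 1 of the new half and allocation continues from index 2, so no search for free space is needed. Garbage is never touched; the cost is proportional to the live data.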

C# Memory management

Related to next-fit, copy-collector
  keep a NextObjPointer to next free space
  use it for new objects until no more space
Keep knowledge of Root objects
  global and static object pointers
  all thread stack local variables
  registers pointing to objects
  maintained by JIT compiler and runtime
    e.g. JIT keeps a table of roots

C# Memory management

On traversal:
  walk from roots to find all good objects
  linear pass through heap
    on gap, compact higher objects down
    fix object references to make this work
  very fast in general
Speedups:
  assume different types of objects
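The “compact higher objects down and fix references” step can be sketched as a sliding compaction over an array of slots. This illustrates the technique only and is not the CLR’s implementation; the Slot layout, the assumption that a marking pass has already set the live flags, and the names compact and compact_demo are all hypothetical.

```c
#define N_SLOTS 6

/* Hypothetical slot: a live flag (assumed set by an earlier marking
   pass), one reference stored as a slot index (-1 = none), and a value. */
typedef struct {
    int live;
    int ref;
    int value;
} Slot;

/* Slides live objects down over gaps and rewrites references.
   Requires n <= N_SLOTS. Returns the new allocation edge. */
int compact(Slot *heap, int n, int *roots, int nroots)
{
    int new_index[N_SLOTS];
    int next = 0;

    /* Pass 1: compute where each live object will land. */
    for (int i = 0; i < n; i++) {
        new_index[i] = heap[i].live ? next++ : -1;
    }

    /* Pass 2: fix references held by roots and by live objects. */
    for (int r = 0; r < nroots; r++) {
        if (roots[r] >= 0) {
            roots[r] = new_index[roots[r]];
        }
    }
    for (int i = 0; i < n; i++) {
        if (heap[i].live && heap[i].ref >= 0) {
            heap[i].ref = new_index[heap[i].ref];
        }
    }

    /* Pass 3: move objects down; a destination is never above its source. */
    for (int i = 0; i < n; i++) {
        if (heap[i].live) {
            heap[new_index[i]] = heap[i];
        }
    }
    for (int i = next; i < n; i++) {
        heap[i].live = 0;
    }
    return next;
}

/* Usage sketch: live objects at slots 1, 4 and 5 slide to 0, 1 and 2. */
int compact_demo(void)
{
    Slot heap[N_SLOTS] = {
        { 0, -1,  0 },
        { 1,  4, 10 },
        { 0, -1,  0 },
        { 0, -1,  0 },
        { 1, -1, 20 },
        { 1,  1, 30 },
    };
    int roots[] = { 5 };
    int edge = compact(heap, N_SLOTS, roots, 1);
    return edge == 3 && roots[0] == 2
        && heap[0].value == 10 && heap[0].ref == 1
        && heap[2].value == 30 && heap[2].ref == 0;
}
```

The returned edge plays the role of NextObjPointer: new objects are allocated from there.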

Generations

Current .NET uses 3 generations:
  0 – recently created objects: yet to survive GC
  1 – survived 1 GC pass
  2 – survived more than 1 GC pass
Assumption: objects that have lived longer will tend to live longer
Is this a good assumption?
  good assumption for many applications
  and for many systems (e.g. P2P)
Put older objects lower in heap

Generations

During compaction, promote generations
  e.g. Gen 1 reachable object goes to Gen 2
Eventually:

[Figure: heap divided into Generation 0, Generation 1 and Generation 2]
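Promotion can be sketched with a per-object generation number. In this illustration the reachable flag is assumed to have been computed by an earlier trace, and the GenObj type and the gen_collect and gen_demo functions are hypothetical names, not .NET APIs.

```c
#define MAX_GENERATION 2

/* Hypothetical object record: reachability is assumed to be known. */
typedef struct {
    int reachable;
    int generation;
    int live;
} GenObj;

/* Collects generations 0..max_gen. Unreachable objects in those
   generations are reclaimed, survivors are promoted by one generation.
   Objects in older generations are not examined at all.
   Returns the number of objects reclaimed. */
int gen_collect(GenObj *objs, int n, int max_gen)
{
    int reclaimed = 0;
    for (int i = 0; i < n; i++) {
        if (!objs[i].live || objs[i].generation > max_gen) {
            continue;
        }
        if (!objs[i].reachable) {
            objs[i].live = 0;
            reclaimed++;
        } else if (objs[i].generation < MAX_GENERATION) {
            objs[i].generation++;
        }
    }
    return reclaimed;
}

/* Usage sketch: a generation 0 collection followed by a full one. */
int gen_demo(void)
{
    GenObj objs[] = {
        { 1, 0, 1 },    /* reachable, generation 0 */
        { 0, 0, 1 },    /* garbage, generation 0 */
        { 0, 2, 1 },    /* garbage, generation 2 */
    };
    int first = gen_collect(objs, 3, 0);    /* reclaims object 1 only */
    int second = gen_collect(objs, 3, 2);   /* reclaims object 2 */
    return first == 1 && second == 1
        && objs[0].generation == 2 && objs[2].live == 0;
}
```

The generation 2 garbage survives the first, cheap collection and is only reclaimed by the full one, which is the trade the generational assumption makes.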

More Generation Optimization

Don’t trace references in old objects. Why?
  speed improvement
  but could refer to young objects
Use Write-Watch support. How?
  note if an old object has some field set
  then can trace through references

Large Objects Heap

Area of the heap dedicated to large objects
  never compacted. Why?
    copy cost outweighs any locality
  automatic generation 2
    rarely collected
    large objects likely to have long lifetime
Commonly used for DataGrid objects
  results from database queries
  20k or more

Object Pinning

Can require that an object not move
  could hurt GC performance
  useful for unsafe operation
  in fact, needed to make pointers work
syntax: fixed(…) { … }
  will not move objects in the declaration in the block

Finalization

Recall C++ destructors:

~MyClass() {
    // cleanup
}

  called when object is deleted
  does cleanup for this object
Don’t do this in C# (or Java)
  similar construct exists
  but only called on GC
  no guarantees when

Finalization

More common idiom:

~MyClass() {
    Dispose(false);
    // C# compiles this into Finalize(), which calls base.Finalize() afterwards
}

  maybe needed for unmanaged resources
  slows down GC significantly
Finalization in GC:
  when object with Finalize method created
    add to Finalization Queue
  when about to be GC’ed, add to Freachable Queue

Finalization

images from MSDN, Nov 2000