CSEP505: Programming Languages Lecture 10: OOP; Memory Mgmt; Wrap-Up Dan Grossman Winter 2009
Dec 21, 2015

CSEP505: Programming Languages
Lecture 10: OOP; Memory Mgmt; Wrap-Up

Dan Grossman

Winter 2009

12 March 2009 CSE P505 Winter 2009 Dan Grossman 2

Last time

• Key novelty / semantic difference of OOP is dynamic dispatch
  – Defined by self mapping to “whole current object”
    • The method’s “receiver”

• Investigating the “extensibility problem” with canonical example:
  – Abstract class Exp with subclasses IntExp, AddExp, …
  – Exp has methods for interp, typecheck, toInt, …


The Grid

            interp   typecheck   toInt   …
IntExp      Code     Code        Code    Code
AddExp      Code     Code        Code    Code
MultExp     Code     Code        Code    Code
…           Code     Code        Code    Code

1 new function: a new column
1 new class: a new row


Back to MultExp

• Even in OOP, MultExp is easy to add, but you’ll copy the typecheck method of AddExp

• Or maybe AddExp extends MultExp, but it’s a kludge

• Or maybe refactor into BinaryExp with subclasses AddExp and MultExp
  – So much for not changing existing code
  – Fairly heavyweight approach to a helper function


Remaining OO plan

• Meaning of type-safety for OO

• Why are subtyping and subclassing separate concepts worth keeping separate?

• Multiple inheritance; multiple interfaces

• Static overloading

• Multimethods

• Revenge of bounded polymorphism


Typechecking

We were sloppy: talked about types without “what are we preventing”

1. In pure OO, stuck if we need to interpret v.m(v1,…,vn) and v has no m method (taking n args)
   • “No such method” error

2. Also if ambiguous: multiple methods with same name and there is no “best choice”
   • “No best match” error
   • Will arise with static overloading and multimethods


Subtyping vs. subclassing

• Often convenient confusion: C a subtype of D if and only if C a subclass of D
  – But self is covariant; the key type system difference

• But more subtypes are sound
  – If A has every field and method that B has (at appropriate types), then subsume A to B
  – Interfaces help, but require explicit annotation

• And fewer subtypes could allow more code reuse…


Non-subtyping example

Pt2 ≤ Pt1 is unsound here:

class Pt1 extends Object {
  int x;
  int get_x() { x }
  bool compare(Pt1 p) { p.get_x() == self.get_x() }
}
class Pt2 extends Pt1 {
  int y;
  int get_y() { y }
  bool compare(Pt2 p) { // override
    p.get_x() == self.get_x() && p.get_y() == self.get_y()
  }
}
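To see what goes wrong, here is a rough Python transliteration of the two classes. Python has no static types, so the unsoundness shows up as the run-time “no such method” error that a sound type system would have to rule out. The helper same_as_origin is a hypothetical client, not part of the slide.

```python
class Pt1:
    def __init__(self, x):
        self.x = x

    def get_x(self):
        return self.x

    def compare(self, p):
        return p.get_x() == self.get_x()


class Pt2(Pt1):
    def __init__(self, x, y):
        super().__init__(x)
        self.y = y

    def get_y(self):
        return self.y

    def compare(self, p):  # override that assumes p is a Pt2
        return p.get_x() == self.get_x() and p.get_y() == self.get_y()


def same_as_origin(p):
    """Hypothetical client written against Pt1: it passes a Pt1 to compare."""
    return p.compare(Pt1(0))


def gets_stuck(p):
    """Return True if using p where a Pt1 is expected hits 'no such method'."""
    try:
        same_as_origin(p)
        return False
    except AttributeError:  # Pt1 has no get_y
        return True
```

If Pt2 ≤ Pt1 were allowed, same_as_origin(Pt2(0, 0)) would typecheck, yet the overriding compare sends get_y to a plain Pt1.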


What happened

• Could inherit code without being a subtype

• Cannot always do this
  – what if get_x called self.compare with a Pt1

Possible solutions:
  – Re-typecheck get_x in subclass
  – Use a really fancy type system
  – Don’t override compare

• Moral: Not suggesting “subclassing not subtyping” is useful, but the concepts of inheritance and subtyping are orthogonal


Remaining OO plan

• Meaning of type-safety for OO

• Why are subtyping and subclassing separate concepts worth keeping separate?

• Multiple inheritance; multiple interfaces

• Static overloading

• Multimethods

• Revenge of bounded polymorphism


Multiple inheritance

Why not allow C extends C1,…,Cn {…}
  – and C≤C1, …, C≤Cn

What everyone agrees on: C++ has it, Java doesn’t

We’ll just consider some problems it introduces and how (multiple) interfaces avoid some of them

Problem sources:

1. Class hierarchy is a dag, not a tree

2. Type hierarchy is a dag, not a tree


Diamonds

• If C extends C1, C2 and C1, C2 have a common (transitive) superclass D, we have a diamond
  – Always have one with multiple inheritance and a topmost class (Object)

• If D has a field f, does C have one field or two?
  – C++ answer: yes

• If D has a method m, C1 and C2 will have a clash
  – Also possible without a diamond

• If subsumption is coercive (changing method-lookup), then how we subsume from C to D affects run-time behavior (incoherent)

Diamonds are common, largely due to types like Object with methods like equals


Method-name clash

What if C extends C1, C2 which both define m?

Possibilities:

1. Reject declaration of C
   • Too restrictive with diamonds

2. Require C overrides m
   • Possibly with directed resends

3. “Left-side” (C1) wins
   • Question: does cast to C2 change what m means?

4. C gets both methods (implies incoherent subtyping)

5. Other?
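As one concrete data point, Python supports multiple inheritance and resolves a clash by linearizing the superclasses, which behaves like option 3 here. The sketch below also shows option 2 as an override with a directed resend. The class names follow the slide; the class D is a hypothetical addition for the second case.

```python
class C1:
    def m(self):
        return "C1.m"


class C2:
    def m(self):
        return "C2.m"


class C(C1, C2):
    """No override: Python's method resolution order picks the left parent."""
    pass


class D(C1, C2):
    """Explicit override with a directed resend to the right parent."""
    def m(self):
        return C2.m(self)


def lookup_order(cls):
    """Names of the classes searched for a method, in order."""
    return [k.__name__ for k in cls.__mro__]
```

Python has no casts, so the slide's question about casting to C2 does not arise there; in a language with coercive subsumption it would.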


Implementation issues

• Multiple-inheritance semantics often muddied by wanting “efficient member lookup”
  – If “efficient” is compile-time offset from self pointer, then multiple inheritance means subsumption must “bump the pointer”
  – Roughly why C++ has different sorts of casts

• Preaching: Better to think
  – semantically first: how should subsumption affect the behavior of method-lookup
  – implementationally second: what can I optimize based on the class/type hierarchy


Digression: casts

A “cast” can mean too many different things (cf. C++):

Language-level:
• Upcast: no run-time effect
• Downcast: failure or no run-time effect
• Conversion: key question is round-tripping
• “Reinterpret bits”: not well-defined

Implementation level:
• Upcast: usually no run-time effect, but see multiple inheritance
• Downcast: check the tag, maybe fail, but see multiple inheritance
• Conversion: same as at language level
• “Reinterpret bits”: no effect (by definition)


Least supertypes

[Related to homework 4 challenge problem]

For e1 ? e2 : e3
  – e2 and e3 need the same type
  – But that just means a common supertype
  – But which one? (The least one)
    • But multiple inheritance means may not exist!

Common solution:
• Reject without explicit cast on e2 and/or e3
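A small sketch of why a least supertype may not exist, using Python classes as a stand-in for the type hierarchy. The function name minimal_common_supertypes and the classes I, J, Animal, Dog, Cat are made up for illustration.

```python
class I: pass
class J: pass
class C1(I, J): pass
class C2(I, J): pass

class Animal: pass
class Dog(Animal): pass
class Cat(Animal): pass


def minimal_common_supertypes(a, b):
    """Common superclasses of a and b that have no other common superclass below them."""
    common = [k for k in a.__mro__ if issubclass(b, k)]
    return [k for k in common
            if not any(o is not k and issubclass(o, k) for o in common)]
```

With single inheritance the result always has one element (the least supertype). For C1 and C2 it has two incomparable elements, I and J, so there is no single best type for a conditional whose branches have types C1 and C2.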


Multiple inheritance summary

1. Diamond issues (coherence issues, shared (?) fields)

2. Method clashes (what does inheriting m mean)

3. Implementation issues (slower method lookup)

4. Least supertypes (may not exist)

Multiple interfaces have issues (3) and (4)
  – Again, an interface is just a named type
  – Provides no implementation (method or field definition)


Remaining OO plan

• Meaning of type-safety for OO

• Why are subtyping and subclassing separate concepts worth keeping separate?

• Multiple inheritance; multiple interfaces

• Static overloading

• Multimethods

• Revenge of bounded polymorphism


Static overloading

• So far: Assume every method name unique
  – Same name in subclass meant override

• Many OO languages allow same name, different argument types:
    A f(B b) {…}
    C f(D d, E e) {…}
    F f(G g, H h) {…}

• Changes method-lookup definition for e.m(e1,…,en)
  – Old: method-lookup a (meta)function of the class of the object e evaluates to (at run-time)
  – New: method-lookup a (meta)function of the class of the object e evaluates to (at run-time) and the types of e1,…,en (at compile-time)


Ambiguity

Because of subtyping, multiple methods can match!

“Best match” rules are complicated. One rough idea:
  – Fewer subsumptions is better match
  – If tied, subsume to immediate supertypes & recur

Ambiguities remain (no best match)

1. A f(B) or C f(B) (usually disallowed)

2. A f(B) or A f(C) and f(e) where e has a subtype of B and C but B and C are incomparable

3. A f(B,C) or A f(C,B) and f(e1,e2) where e1 and e2 have type B and B ≤ C


Multimethods

Static overloading mostly saves keystrokes
  – Shorter method names
  – Name-mangling on par with syntactic sugar
  – But sometimes can comment out a method and program still type-checks with different run-time behavior due to different compile-time method resolution

Multiple (dynamic) dispatch (a.k.a. multimethods) much more interesting: Method lookup for e.m(e1,…,en) is a (meta)function of the classes of the objects e and e1,…,en evaluate to (at run-time)

A natural generalization: “receiver” no longer special

So may as well write m(e1,…,en) instead of e1.m(e2,…,en)


Multimethods example

• compare(x,y) calls first version unless both arguments are Bs
  – Could add “one of each” methods if you want different behavior

• f has fairly surprising behavior
  – But still more useful than with static overloading?

class A { int f; }
class B extends A { int g; }
bool compare(A x, A y) { x.f==y.f }
bool compare(B x, B y) { x.f==y.f && x.g==y.g }
bool f(A x, A y, A z) { compare(x,y) && compare(y,z) }
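Python has no built-in multimethods, so the sketch below simulates them with a hand-written table and a run-time “best match” search over the classes of both arguments. The table COMPARE and the helper names compare_AA and compare_BB are assumptions for illustration; the classes and the bodies follow the slide.

```python
class A:
    def __init__(self, f):
        self.f = f


class B(A):
    def __init__(self, f, g):
        super().__init__(f)
        self.g = g


def compare_AA(x, y):
    return x.f == y.f


def compare_BB(x, y):
    return x.f == y.f and x.g == y.g


# (parameter classes, implementation) for the generic function compare
COMPARE = [((A, A), compare_AA), ((B, B), compare_BB)]


def compare(x, y):
    """Dispatch on the run-time classes of BOTH arguments."""
    applicable = [(sig, fn) for sig, fn in COMPARE
                  if isinstance(x, sig[0]) and isinstance(y, sig[1])]
    for sig, fn in applicable:
        # best match: at least as specific as every applicable signature
        if all(issubclass(sig[0], other[0]) and issubclass(sig[1], other[1])
               for other, _ in applicable):
            return fn(x, y)
    raise TypeError("no best match")


def f(x, y, z):
    return compare(x, y) and compare(y, z)
```

The surprise in f: with x and z both Bs that differ in g and a plain A in the middle, both calls fall back to the (A, A) version, so f reports equality even though x and z are not equal as Bs.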


Pragmatics; UW

Not clear where multimethods should be defined
• No longer “belong to a class” because receiver not special

Multimethods are “more OO” because dynamic-dispatch is the essence of OO

Multimethods are “less OO” because without distinguished receiver the “analogy to physical objects” is reduced

A couple papers:
  – Millstein got a UW PhD around multimethods for Java
    • UW a long-time multimethods leader
  – Nice summary and “where really used” Noble OOPSLA08


Revenge of ambiguity

• Like static overloading, multimethods have “no best match” problems

• Unlike static overloading, the problem does not arise until run-time!

Possible solutions:

1. Run-time exception

2. Always define a best-match (e.g., Dylan)

3. A conservative type system
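A minimal sketch of solution 1, again simulating multimethods in Python with a table. The two signatures (A, B) and (B, A) are each fine on their own; the ambiguity only appears when a call happens to pass two Bs at run-time. The names TABLE and call are hypothetical.

```python
class A: pass
class B(A): pass


# Two methods for the same generic function, neither more specific than the other
TABLE = [((A, B), lambda x, y: "A,B version"),
         ((B, A), lambda x, y: "B,A version")]


def call(x, y):
    applicable = [(sig, fn) for sig, fn in TABLE
                  if isinstance(x, sig[0]) and isinstance(y, sig[1])]
    if not applicable:
        raise TypeError("no such method")
    best = [fn for sig, fn in applicable
            if all(issubclass(sig[0], o[0]) and issubclass(sig[1], o[1])
                   for o, _ in applicable)]
    if len(best) != 1:
        raise TypeError("no best match")  # solution 1: run-time exception
    return best[0](x, y)


def error_for(x, y):
    """Return the error message of a failing call, or None if it succeeds."""
    try:
        call(x, y)
        return None
    except TypeError as e:
        return str(e)
```

A conservative type system (solution 3) would instead reject the pair of declarations up front, for example by demanding a (B, B) method whenever both of these exist.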


Remaining OO plan

• Meaning of type-safety for OO

• Why are subtyping and subclassing separate concepts worth keeping separate?

• Multiple inheritance; multiple interfaces

• Static overloading

• Multimethods

• Revenge of bounded polymorphism


Still want generics

OO subtyping no replacement for parametric polymorphism
So have both

Example:

/* 3 type constructors (e.g., Int Set a type) */
interface ’a Comparable { Int f(’a,’a); }
interface ’a Predicate { Bool f(’a); }
class ’a Set { …
  constructor(’a Comparable x){…}
  unit add (’a x) {…}
  ’a Set functional_add(’a x) {…}
  ’a find (’a Predicate x) {…}
}


Worse ambiguity

“Interesting” interaction with overloading or multimethods

class B {
  Int f(Int C x){1}
  Int f(String C x){2}
  Int g(’a x) { self.f(x) }
}

Whether match is found depends on instantiation of ’a

Cannot resolve static overloading at compile-time without code duplication

At run-time, need run-time type information
  – Including instantiation of type constructors
  – Or restrict overloading enough to avoid it


Wanting bounds

As expected, with subtyping and generics, want bounded polymorphism

Example:

interface Printable { unit print(); }
class (’a ≤ Printable) Logger {
  ’a item;
  ’a get() { item.print(); item }
}

w/o polymorphism, get would return a Printable (not useful)

w/o the bound, get could not send print to item
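The same Logger can be sketched with Python's typing module, where a TypeVar with bound= plays the role of ’a ≤ Printable. Python does not enforce these annotations at run-time; the claim is only that a static checker would accept get returning T and the call to print. The class Report and its page_count method are made-up names used to show that the caller keeps the precise type.

```python
from typing import Generic, Protocol, TypeVar


class Printable(Protocol):
    def print(self) -> None: ...


T = TypeVar("T", bound=Printable)  # the bound: T ≤ Printable


class Logger(Generic[T]):
    def __init__(self, item: T) -> None:
        self.item = item

    def get(self) -> T:
        self.item.print()  # allowed because of the bound
        return self.item   # caller gets back a T, not just a Printable


class Report:
    """Hypothetical Printable with an extra method."""
    def __init__(self) -> None:
        self.printed = 0

    def print(self) -> None:
        self.printed += 1

    def page_count(self) -> int:
        return 3
```

Because get returns T, a caller holding a Logger of Report can still use page_count on the result without a downcast.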


Fancy example

With forethought, can use bounds to avoid some subtyping limitations

(Example lifted from Abadi/Cardelli text; I would have never thought of this)

/* Herbivore1 ≤ Omnivore1 unsound */
interface Omnivore1 { unit eat(Food); }
interface Herbivore1 { unit eat(Veg); }
/* T Herbivore2 ≤ T Omnivore2 sound for any T */
interface (’a≤Food) Omnivore2 { unit eat(’a); }
interface (’a≤Veg) Herbivore2 { unit eat(’a); }
/* subtyping lets us pass herbivores to feed but only if food is a Veg */
unit feed(’a food, ’a Omnivore animal) {
  animal.eat(food);
}


You have grading to do…

I am going to distribute course evaluation forms so you may rate the quality of this course. Your participation is voluntary, and you may omit specific items if you wish. To ensure confidentiality, do not write your name on the forms. There is a possibility your handwriting on the yellow written comment sheet will be recognizable; however, I will not see the results of this evaluation until after the quarter is over and you have received your grades. Please be sure to use a No. 2 PENCIL ONLY on the scannable form.

I have chosen _______ to distribute and collect the forms. When you are finished, he/she will collect the forms, put them into an envelope and mail them to the Office of Educational Assessment. If there are no questions, I will leave the room and not return until all the questionnaires have been finished and collected. Thank you for your participation.


From the beginning

Problem:
1. Why do we need memory management?
   • Same reason for any finite reusable resource
2. What does safety mean? (What is guaranteed?)
3. What is drag?

Solutions:
1. How does tracing garbage collection (GC) work?
2. What other ways for safe memory management?
   a. Unique pointers
   b. (Automatic) reference-counting
   c. Regions


Why reuse?

• Values/objects/code take up space

• Using too much space slows down programs
  – Eventually they stop (memory exhaustion)

• Optimal space: reclaim immediately after last use
  – Earlier is incorrect (dangling-pointer dereference)
  – Drag is time between last use and reclamation

• But:
  – Last-use undecidable
  – Batched reclamation can gain time for space


The view from C/C++

• Stack objects reclaimed at end of block/function

• Heap objects reclaimed with call to free/delete
  – Drag can still exist

• Dangling-pointers fine; dereferencing them unsafe
  – “Double-free” also unsafe

• Unreclaimed objects that become unreachable will:
  – Never be used
  – Never be reclaimed
  – So drag until termination (“space leak”)


Reachability

Reachability soundly approximates “may be used again”

Inductive definition (transitive “points to”):
• Global variables reachable
• Unreclaimed stack objects reachable
  – Liveness analysis can do a bit better
• Objects pointed to by reachable objects are reachable

C: Avoid leaks by freeing before unreachable

Garbage-collected language: Make things unreachable

Reachability is an approximation that works well in practice


Reachability and leaks

• GC’d languages reclaim unreachable objects
  – So by some definitions “leaks are impossible”
  – Like by some definitions deadlock with atomic is impossible

• But “infinite drag times” are possible
  – Example: large unused data structure in a global

• Programming for space in GC’d languages
  – Usually ignore the issue
  – Set pointers to null when done with them
    • Error-prone!
  – Use weak pointers where appropriate
    • Provided as a language feature, dereference can fail

• Compiler-writer should also consider if optimizations are “safe for space”
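For a feel of weak pointers, here is a small sketch using Python's standard weakref module. The class BigTable and the function demo are hypothetical names. The behavior shown relies on the object being collected once the last strong pointer is dropped, which holds in CPython but is not promised at any particular moment by every implementation.

```python
import gc
import weakref


class BigTable:
    """Stand-in for a large data structure (hypothetical name)."""
    def __init__(self):
        self.rows = list(range(1000))


def demo():
    table = BigTable()
    weak = weakref.ref(table)         # weak pointer: does not keep table reachable
    alive_before = weak() is table    # dereference succeeds while a strong pointer exists
    table = None                      # drop the only strong pointer
    gc.collect()                      # request a collection
    return alive_before, weak()       # the dereference can now fail (gives None)
```

The caller of a weak pointer has to handle the failed dereference, which is the price for not extending the object's lifetime.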


Where are we

Problem:
1. Why do we need memory management?
   • Same reason for any finite reusable resource
2. What does safety mean?
3. What is drag?

Solutions:
1. How does garbage collection (GC) work?
2. What other ways for safe memory management?
   a. Unique pointers
   b. (Automatic) reference-counting
   c. Regions


Reachability, cont’d

Algorithm sketch to find all reachable objects:
• Start at roots (globals and stack objects)
• Follow all pointers, but do not go around cycles

Problems:
• Find all pointers in pointed-to object
  – How big is the object?
  – What fields are integers?
• Avoid cycles (solution depends on GC technique)
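The algorithm sketch can be written as a worklist traversal. This version models the heap as a dictionary from object names to the names they point to, which sidesteps the “find all pointers” problem entirely; the function name and the representation are assumptions for illustration.

```python
def reachable(roots, pointers):
    """Return the set of objects reachable from roots.

    pointers maps each object to the list of objects it points to.
    """
    seen = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in seen:          # already visited: do not go around cycles
            continue
        seen.add(obj)
        worklist.extend(pointers.get(obj, []))
    return seen
```

Here the seen set is the cycle-avoidance mechanism; real collectors get the same effect from mark bits or forwarding pointers.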


Finding sizes

Garbage collector must know an object’s size
  – free/delete need to know too!

Solutions:
• A header word (e.g., before object) with the size
  – Class pointer can “serve double-duty”
• Size segregation and a global table of “page to size”

Bottom line:
• Allocator and/or compiler must collaborate with GC


Finding pointers

Does the GC know which fields/roots are pointers?
• Yes: accurate GC
• No: conservative GC

Theory: With conservative GC, “one unlucky int” could keep huge amount of data

Practice: Conservative GC tends to work

Accurate GC techniques:
• Class-pointer can “serve triple-duty”
• Low-order bit tricks (e.g., Caml ints are 31 bits)
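The low-order bit trick can be sketched with plain integers standing in for machine words. The assumption is that heap pointers are word-aligned, so their lowest bit is always 0, which leaves the lowest bit free to mark integers; that costs one bit of integer range. All function names here are made up.

```python
def tag_int(n):
    """Encode integer n as a word whose low bit is 1."""
    return (n << 1) | 1


def untag_int(word):
    """Recover the integer from a tagged word."""
    return word >> 1


def is_pointer(word):
    """Aligned addresses are even, so a 0 low bit means 'pointer'."""
    return word & 1 == 0


def tagged_add(a, b):
    """Add two tagged ints without untagging: (2x+1) + (2y+1) - 1 = 2(x+y) + 1."""
    return a + b - 1
```

With this encoding the collector can scan any word and decide whether to follow it, without consulting type information.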


Conservative GC for C

Yes, you can (conservatively) GC a C program
• The Boehm-Demers-Weiser conservative collector

2 of many interesting details:
• Use collector’s malloc (so GC knows the size)
• Possible b/c C bans code most people think is legal:

void f() {
  int * p = malloc(100*sizeof(int));
  int * q = p + 1000; // not allowed
  q[-950] = 17;
  int * r = p + 100;  /* allowed */
  r[-50] = 17;
}

Compile-time flag to “add a byte or keep 2 objects”


Semispace copying collection

• Divide memory into 2 equal-size contiguous pieces
• Allocate objects into one-space until full
  – Easy and fast: “bump an allocation-pointer”
• Now have a full from-space & an empty to-space
  – Copy reachable objects into end of to-space
  – Set allocation-pointer just past them in to-space
  – Restart the program (semispaces reversed roles)


Wait a minute

Skimmed over key details
• We moved objects; must update all pointers to them
• Must avoid cycles
• The GC can run without much extra space (good)

How:
• “Cheney queue” just two pointers in to-space
  – Objects to scan (update pointers and maybe add pointed-to objects to queue)
• Cycle avoidance: forwarding-pointers in from-space
  – Easy to tell what space is pointed-to
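A toy version of the copying step, with each semispace modeled as a Python list and an address as an index into it. The Ref and Forward classes and the collect function are assumptions for illustration. The two “pointers in to-space” are the scan index and the length of to_space (the allocation pointer); objects between them form the queue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    addr: int  # index of the target object in the current space


@dataclass(frozen=True)
class Forward:
    addr: int  # left behind in from-space: where the object now lives


def collect(from_space, roots):
    """Copy everything reachable from roots; return (to_space, new_roots)."""
    to_space = []

    def copy(ref):
        obj = from_space[ref.addr]
        if isinstance(obj, Forward):          # already moved: reuse the new address
            return Ref(obj.addr)
        new_addr = len(to_space)              # bump the allocation pointer
        to_space.append(list(obj))            # fields still hold from-space refs
        from_space[ref.addr] = Forward(new_addr)
        return Ref(new_addr)

    new_roots = [copy(r) for r in roots]
    scan = 0                                  # objects before scan are fully updated
    while scan < len(to_space):
        obj = to_space[scan]
        for i, field in enumerate(obj):
            if isinstance(field, Ref):        # non-pointer fields are left alone
                obj[i] = copy(field)
        scan += 1
    return to_space, new_roots
```

In the usage below, slot 0 is garbage and slots 1 and 2 form a cycle; after collection the cycle has been compacted to addresses 0 and 1 and every pointer has been updated.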


Mark-sweep collection

• Allocate objects until you have almost no room left
• Mark all reachable objects (bit in header word)
  – Avoid cycle by checking bit
• Sweep through memory
  – If object unmarked, reclaim it
  – If object marked, unmark it
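The two phases can be sketched over a toy heap, modeled as a Python list of objects that each carry a mark bit. The class Obj and the functions mark and sweep are hypothetical; reclaiming is modeled simply by leaving an object out of the returned list.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.fields = []     # pointers to other Obj instances
        self.marked = False  # the mark bit in the "header"


def mark(roots):
    stack = list(roots)      # explicit stack instead of recursion
    while stack:
        obj = stack.pop()
        if obj.marked:       # bit already set: seen before, so cycles terminate
            continue
        obj.marked = True
        stack.extend(obj.fields)


def sweep(heap):
    """Return the surviving objects; unmarked ones are dropped (reclaimed)."""
    live = []
    for obj in heap:
        if obj.marked:
            obj.marked = False   # unmark, ready for the next collection
            live.append(obj)
    return live
```

Note that sweep visits every object in the heap, live or dead, whereas a copying collector only touches live objects.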

No 2x space and no moving objects, but…


Wait another minute

• In practice, if more than 2/3 of objects or so are reachable, you spend lots of time in GC

• Allocation is complicated
  – Must find enough space for the new object
  – Fragmentation can hurt performance
    • Or exhaust memory before copying GC does

• No “Cheney” queue, so GC needs an explicit stack or low-level cleverness to run in little space


Generational

Copying and mark-sweep from about 1960
Generational GC a key mid-80s optimization because
• Most objects die young
• Most old objects never get mutated to point to young

How:
• Allocate in a nursery
• Empty nursery has no pointers into it!
• Fill nursery like in copying collection
• Also track mutations to record pointers into nursery
  – Yet another reason to avoid mutation (slower)
• To collect nursery, ignore rest of heap except recorded pointers
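A rough sketch of the bookkeeping, assuming every pointer store goes through a write barrier that records old-to-young pointers in a remembered set. The class names, the old flag, and promotion-by-flag are all simplifications made up for illustration; a real nursery collection would copy survivors out.

```python
class Obj:
    def __init__(self, name, old=False):
        self.name = name
        self.old = old       # True once the object lives outside the nursery
        self.fields = []


class Heap:
    def __init__(self):
        self.remembered = set()  # old objects mutated to point into the nursery

    def write(self, obj, target):
        """Write barrier: the extra work on every mutation."""
        obj.fields.append(target)
        if obj.old and not target.old:
            self.remembered.add(obj)

    def collect_nursery(self, nursery, stack_roots):
        """Promote young survivors; return the list of reclaimed nursery objects."""
        worklist = [o for o in stack_roots if not o.old]
        for old in self.remembered:            # the only part of the old heap we look at
            worklist.extend(t for t in old.fields if not t.old)
        while worklist:
            o = worklist.pop()
            if o.old:                          # already promoted (or never young)
                continue
            o.old = True                       # promote the survivor
            worklist.extend(o.fields)
        self.remembered.clear()
        return [o for o in nursery if not o.old]
```

The collection never traverses the old generation; it only reads the fields of the remembered objects.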


Some more terms

Just sketched the basics of copying and mark-sweep

And the orthogonal issue of generations

Some other terms worth knowing:
• Incremental GC: do a little bit on each allocation
  – Avoid large pause times
• Concurrent GC (collector thread in parallel with the program)
• Parallel GC (multiple collector threads)
• Lots of other important tricks:
  – lazy-sweeping, large-object spaces, …


GC Summary

Great survey paper: Paul R. Wilson. Uniprocessor Garbage Collection Techniques. International Workshop on Memory Management 1992

• Programmer must know about reachability, that objects may move, that mutation may cost, etc.

• GC implementor must try to do well without knowing the application’s memory behavior
  – But done by memory-system experts!
  – One-size-fits-most


Where are we

Problem:
1. Why do we need memory management?
   • Same reason for any finite reusable resource
2. What does safety mean?
3. What is drag?

Solutions:
1. How does garbage collection (GC) work?
2. What other ways for safe memory management?
   a. Unique pointers
   b. (Automatic) reference-counting
   c. Regions


Now forget GC

Idioms that avoid dangling-pointer dereferences
  – And languages and/or types to enforce them!
  – A language can have more than one
  – More work than GC, but safer than unchecked malloc/free

Worth knowing just for the idioms


Unique pointers

• If p is the only pointer to o, then free(p) can’t lead to dangling-pointer dereferences provided *p is not used afterwards

• Unique-pointers allow only trees (no dags or cycles)

• Maintaining uniqueness invariant
  – Dynamic: destructive-reads
    • p=q and free(q) set q to null
  – Static: linear type systems and/or flow analysis
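The dynamic approach can be sketched as a wrapper whose reads are destructive: moving the pointer nulls the source, so at most one live handle to the object exists. The class Unique and its methods are hypothetical, and “freeing” is modeled only by nulling the handle.

```python
class Unique:
    """A handle that is meant to be the only pointer to its object."""

    def __init__(self, obj):
        self._obj = obj

    def get(self):
        if self._obj is None:
            raise ValueError("handle is null (moved or freed)")
        return self._obj

    def move(self):
        """p = q.move(): destructive read, q becomes null."""
        obj = self.get()
        self._obj = None
        return Unique(obj)

    def free(self):
        """free(q): release the object and null the handle."""
        self.get()          # freeing a null handle is an error
        self._obj = None


def is_null(handle):
    try:
        handle.get()
        return False
    except ValueError:
        return True
```

A dereference of a moved-from or freed handle fails with a checked error instead of touching reclaimed memory; a static linear type system would reject such code before it runs.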


Reference-counting

(Dynamic) ref-counting basics:
• Store number of pointers to object with object
• If count goes to zero, free it

Can automate this easily enough

But:
• Cycles never get reclaimed unless programmer breaks the cycle
  – Or cycles are eventually detected via other techniques
• Expensive without tricks (e.g., “deferred ref-counting”)
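A hand-rolled sketch of the counting discipline and of the cycle problem. Each Cell has a single pointer field, and the freed list stands in for actually releasing memory; all names are made up for illustration.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.count = 0      # number of pointers to this cell
        self.field = None   # the one pointer this cell holds


freed = []                  # names of cells whose count reached zero


def inc(c):
    if c is not None:
        c.count += 1


def dec(c):
    if c is None:
        return
    c.count -= 1
    if c.count == 0:
        freed.append(c.name)
        child, c.field = c.field, None
        dec(child)          # freeing a cell drops the pointer it held


def set_field(c, target):
    old = c.field
    c.field = target
    inc(target)
    dec(old)


def make_cycle():
    a, b = Cell("a"), Cell("b")
    inc(a)                  # a root (say, a stack variable) points to a
    set_field(a, b)
    set_field(b, a)         # a -> b -> a
    return a, b
```

Dropping the root with dec(a) leaves both counts at 1, so the cycle is never freed. Breaking the cycle first with set_field(b, None) lets the same dec(a) free both cells.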


Regions

• A decades-old idiom also known as zones, arenas, …

• Partition memory into regions; every object in one region

• API basics
  – new_region returns a handle
  – new_object takes a handle
  – free_region takes a handle

• No free_object


What did we do

• Accomplished nothing if we put every object in a different region
• But now intra-region pointers “can’t go wrong”
  – Programmer puts objects with similar lifetimes in same region
  – To avoid leaks, just don’t lose the handle
• For inter-region pointers, options:
  – Dynamic ref-count (see RC or RTSJ)
  – Type-system to restrict “what points where” and when pointers can be dereferenced (see Cyclone)


A common idiom

• Far too painful in C: caller knows lifetime of result, callee knows size and structure of result
  – Leads to evil stack-allocated buffers

• Region solution: a region-handle argument
  – Easy even if result is some complicated graph

result_t g(handle_t, …);
void f() {
  handle_t h = new_region();
  result_t r = g(h,…);
  /* compute with r */
  free_region(h);
}
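A toy model of the idiom in Python, where a region is just a list that owns its objects. The three API names come from the earlier slide; the Region class, the live flag, and the bodies of g and f are assumptions for illustration. Python is garbage-collected, so this shows only the shape of the API, not real manual reclamation.

```python
class Region:
    def __init__(self):
        self.objects = []
        self.live = True


def new_region():
    return Region()


def new_object(h, value):
    if not h.live:
        raise ValueError("region already freed")
    h.objects.append(value)
    return value


def free_region(h):
    h.objects.clear()      # everything in the region goes at once
    h.live = False


def g(h, n):
    """Callee decides size and structure; the caller's handle decides lifetime."""
    return [new_object(h, {"id": i}) for i in range(n)]


def f():
    h = new_region()
    r = g(h, 3)
    total = sum(node["id"] for node in r)   # compute with r
    free_region(h)
    return total, h
```

There is no per-object free: the caller releases the whole result, however complicated, with one call on the handle.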


Course summary

• Defining languages is hard but worth it
  – Interpretation vs. translation
  – Inference rules vs. a PL for the metalanguage

• Essential features we investigated
  – Mutable variables (and loops)
  – Higher-order functions, scope
  – Pairs and sums
  – Threads (and locks and channels)
  – Objects

• Types restrict programs (that’s a good thing!)
  – But want polymorphism for reuse and abstraction


Penultimate slide

• We avoided:
  – Subjective non-science (“I like curly braces”)
  – Real-world issues (“cool libraries/tricks in language X”)

• Focused on:
  – Concepts that almost every language has, including the next fad that doesn’t exist yet
  – Connections (objects and closures are different, but not totally different)
  – Reference implementations, not fast or industrial-strength ones


Questions?
