Compiling Objects in Class-Based Languages CS 153: Compilers

Jan 14, 2016


Gwen Logan

Compiling Objects in Class-Based Languages

CS 153: Compilers


Object-Oriented Language?

It should be clear that there are several families of OO languages:

• Prototype-based (e.g. JavaScript, Lua)
• Class-based (e.g. C++, Java, C#)

We will focus on the compilation of class-based OO languages.


A brief incomplete history of OO

• (Early 60’s) Key concepts emerge in various languages/programs: Sketchpad (Sutherland), SIMSCRIPT (Markowitz), and probably many others.

• (1967) Simula 67 (Dahl, Nygaard) crystallizes many ideas (class, object, subclass, dispatch) into a coherent OO language

• (1972) Smalltalk (Kay) introduces the concept of object-oriented programming (you should try Squeak!)

• (1978) Modula-2 (Wirth)

• (1985) Eiffel (Meyer)

• (1990’s) OO programming becomes mainstream: C++, Java, C#, …


Classes

What’s the difference between a class and an object?


Class (a blueprint for objects)
• declared fields / instance variables
  – values may differ from object to object
  – usually mutable
• methods
  – shared by all objects of a class
  – inherited from superclasses
  – usually immutable
  – implicit receiver object (this, self)
• all have visibility: public / private / protected


Code generation for objects

Methods
• Generate method body code
• Generate method calls (dispatching)

Fields
• Memory layout
• Alignment


Jish Abstract Syntax

type tipe = Int_t | Bool_t | Class_t of class_name

type exp = Var of var | Int of int | Nil
         | Assign of var * exp
         | New of class_name
         | Invoke of exp * var * (exp list) | ...

type stmt = Exp of exp | Seq of stmt * stmt | ...

type method = Method of {mname : var, mret_tipe : tipe option,
                         margs : (var * tipe) list, mbody : stmt}

type class =
  Class of {cname : class_name, csuper : class_name,
            cinstance_vars : (var * tipe) list,
            cmethods : method list}


The need for dynamic dispatching

Methods look like functions, type-checked like functions… what’s different?

Problem: in general, the compiler can’t tell what code to run when a method is called.

interface Point { int getx(); float norm(); }
class ColoredPoint implements Point {
  …
  float norm() { return sqrt(x*x+y*y); }
}
class 3DPoint implements Point {
  …
  float norm() { return sqrt(x*x+y*y+z*z); }
}
Point p = new 3DPoint(...);
p.norm();

Solution: dispatch table (vtable, virtual method table)


Method dispatch

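Idea: every method has its own offset
Offset is used to look up method in vtable

class A {
  void foo() { .. }     // offset 0
}
class B extends A {
  void bar() { .. }     // offset 1
  void baz() { .. }     // offset 2
}
class C extends B {
  void foo() {…}        // offset 0 (overrides A.foo)
  void bar() {…}        // offset 1 (overrides B.bar)
  void baz() {…}        // offset 2 (overrides B.baz)
  void quux() {…}       // offset 3
}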


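vtable

offset 0: foo   (slot introduced by A)
offset 1: bar   (slot introduced by B)
offset 2: baz   (slot introduced by B)
offset 3: quux  (slot introduced by C)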
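One way to picture how these offsets come about is the sketch below. It assumes a subclass's vtable starts as a copy of its parent's, that an overriding method reuses the inherited slot, and that a new method is appended at the end. `ClassInfo`, `vtable()` and `offsetOf()` are hypothetical names made up for this illustration, not part of the course compiler.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical compile-time model of a class: its name, its parent, and the
// methods it declares, in declaration order.
class ClassInfo {
    final String name;
    final ClassInfo parent;        // null for a root class
    final List<String> declared;

    ClassInfo(String name, ClassInfo parent, String... declared) {
        this.name = name;
        this.parent = parent;
        this.declared = Arrays.asList(declared);
    }

    // Maps method name -> implementation ("Class.method"). Iteration order is
    // slot order: LinkedHashMap keeps a key's position when it is re-inserted,
    // so an override stays in the inherited slot and a new method is appended.
    Map<String, String> vtable() {
        Map<String, String> table = new LinkedHashMap<>();
        if (parent != null) {
            table.putAll(parent.vtable());
        }
        for (String m : declared) {
            table.put(m, name + "." + m);
        }
        return table;
    }

    // The offset (slot index) used to look up a method in this class's vtable.
    int offsetOf(String method) {
        int i = 0;
        for (String m : vtable().keySet()) {
            if (m.equals(method)) return i;
            i++;
        }
        throw new IllegalArgumentException("no method " + method + " in " + name);
    }
}

class VtableDemo {
    public static void main(String[] args) {
        ClassInfo a = new ClassInfo("A", null, "foo");
        ClassInfo b = new ClassInfo("B", a, "bar", "baz");
        ClassInfo c = new ClassInfo("C", b, "foo", "bar", "baz", "quux");
        System.out.println(b.vtable().values()); // [A.foo, B.bar, B.baz]
        System.out.println(c.vtable().values()); // [C.foo, C.bar, C.baz, C.quux]
        System.out.println(c.offsetOf("quux"));  // 3
    }
}
```

Because a method keeps the same offset in every subclass, a call site only needs the offset, never the receiver's exact class.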


Fields

Fields work similarly: calculate each field's offset into the object.

class Pt2d extends Object {
  int x; /* offset 4 */
  int y; /* offset 8 */
  void movex(int i) { x = x + i; }
  void movey(int i) { y = y + i; }
}
class Pt3d extends Pt2d {
  int z; /* offset 12 */
  void movez(int i) { z = z + i; }
}


Fields

• Values of fields placed in object
• Accesses to fields are indexed loads
• Need to know size of superclasses – can be a problem
  – e.g., Java – field offsets resolved at dynamic link/load time

class Pt3d extends Pt2d {
  int z; /* offset = size(Pt2d) + 4 = 12 */
  void movez(int i) { z = z + i; }
}
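The offset arithmetic can be sketched as follows. The sketch assumes a 4-byte header word (the vtable pointer) and 4-byte ints, which is what the offsets 4, 8 and 12 in the comments suggest. `Layout` is a hypothetical helper. Its `size` includes the header, so `z` lands at Pt2d's total size. That agrees with `size(Pt2d) + 4` if `size(Pt2d)` in the comment counts only the 8 bytes of fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical layout model: a 4-byte header followed by 4-byte int fields.
class Layout {
    static final int HEADER_BYTES = 4;  // assumption: one vtable pointer
    static final int INT_BYTES = 4;     // assumption: 4-byte ints

    final Map<String, Integer> offsets = new LinkedHashMap<>();
    final int size;                     // total bytes, header included

    // A subclass keeps every inherited offset and appends its own fields
    // after the end of the superclass layout.
    Layout(Layout parent, String... fields) {
        int next = HEADER_BYTES;
        if (parent != null) {
            offsets.putAll(parent.offsets);
            next = parent.size;
        }
        for (String f : fields) {
            offsets.put(f, next);
            next += INT_BYTES;
        }
        this.size = next;
    }

    int offsetOf(String field) {
        return offsets.get(field);
    }
}

class LayoutDemo {
    public static void main(String[] args) {
        Layout object = new Layout(null);           // Object: header only
        Layout pt2d = new Layout(object, "x", "y");
        Layout pt3d = new Layout(pt2d, "z");
        System.out.println(pt3d.offsets);           // {x=4, y=8, z=12}
    }
}
```

The constructor needs `parent.size` before it can place a single field, which is exactly the dependency on the superclass that the slide flags as a problem.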


Object layout


Field layout in Java (actually)

public abstract class AbstractMap<K,V> implements Map<K,V> {
  Set<K> keySet;
  Collection<V> values;
}
public class HashMap<K,V> extends AbstractMap<K,V> {
  Entry[] table;
  int size;
  int threshold;
  float loadFactor;
  int modCount;
  boolean useAltHashing;
  int hashSeed;
}

Will keySet be the first field in HashMap?
Will table be the 3rd field in HashMap?


Field Layout in Java (actually)

$ java -jar target/java-object-layout.jar java.util.HashMap
java.util.HashMap
 offset  size        type  description
      0    12              (object header + first field alignment)
     12     4         Set  AbstractMap.keySet
     16     4  Collection  AbstractMap.values
     20     4         int  HashMap.size
     24     4         int  HashMap.threshold
     28     4       float  HashMap.loadFactor
     32     4         int  HashMap.modCount
     36     4         int  HashMap.hashSeed
     40     1     boolean  HashMap.useAltHashing
     41     3              (alignment/padding gap)
     44     4     Entry[]  HashMap.table
     48     4         Set  HashMap.entrySet
     52     4              (loss due to the next object alignment)
     56                    (object boundary, size estimate)
VM reports 56 bytes per instance


Field Layout in Java

Instead of following declaration order, the fields are organized in memory in the following order:

1. doubles and longs

2. ints and floats

3. shorts and chars

4. booleans and bytes

5. references

Why was there a reference as the first field in HashMap?
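A simplified model of this ordering is sketched below. It is not the JVM's actual algorithm. It assumes a 12-byte header and 4-byte references (as in the tool output), places superclass fields before subclass fields, applies the ordering within each class, and aligns each field to its own size. `FieldLayout` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldLayout {
    static final int HEADER_BYTES = 12;  // assumption: header size from the tool output
    static final int REF_BYTES = 4;      // assumption: 4-byte references

    // Position of a type in the ordering: 1 = doubles/longs ... 5 = references.
    static int group(String type) {
        switch (type) {
            case "double": case "long":  return 1;
            case "int": case "float":    return 2;
            case "short": case "char":   return 3;
            case "boolean": case "byte": return 4;
            default:                     return 5;
        }
    }

    static int sizeOf(String type) {
        switch (group(type)) {
            case 1:  return 8;
            case 2:  return 4;
            case 3:  return 2;
            case 4:  return 1;
            default: return REF_BYTES;
        }
    }

    static int alignUp(int offset, int alignment) {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Places one class's fields ({type, name} pairs in declaration order)
    // starting at `start`, records each offset in `out`, returns the end offset.
    static int place(int start, String[][] fields, Map<String, Integer> out) {
        List<String[]> sorted = new ArrayList<>(Arrays.asList(fields));
        // List.sort is stable, so declaration order survives within a group.
        sorted.sort(Comparator.comparingInt((String[] f) -> group(f[0])));
        int next = start;
        for (String[] f : sorted) {
            int size = sizeOf(f[0]);
            next = alignUp(next, size);
            out.put(f[1], next);
            next += size;
        }
        return next;
    }

    // Lays out the fields shown on the HashMap slide: superclass first.
    static Map<String, Integer> hashMapLayout() {
        Map<String, Integer> out = new LinkedHashMap<>();
        int end = place(HEADER_BYTES, new String[][] {
            {"Set", "keySet"}, {"Collection", "values"}}, out);
        place(end, new String[][] {
            {"Entry[]", "table"}, {"int", "size"}, {"int", "threshold"},
            {"float", "loadFactor"}, {"int", "modCount"},
            {"boolean", "useAltHashing"}, {"int", "hashSeed"}}, out);
        return out;
    }

    public static void main(String[] args) {
        // {keySet=12, values=16, size=20, threshold=24, loadFactor=28,
        //  modCount=32, hashSeed=36, useAltHashing=40, table=44}
        System.out.println(hashMapLayout());
    }
}
```

Under these assumptions the model reproduces the offsets reported for the fields declared on the slide, including the 3-byte gap before `table`.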


Compiling to Cish

For every method m(x1,…,xn), generate a Cish function m(self,vtables,x1,…,xn).

At startup, for every class C, create a record of C's methods (the vtable).

Collect all of the vtables into a big record.
• we will pass this data structure to each method as the vtables argument.
• wouldn't need this if we had a global variable in Cish for storing the vtables.

Create a Main object and invoke its main() method.


Operations

new C
• create a record big enough to hold a C object
• initialize the object's vtable pointer.
• initialize instance variables with default values
  – 0 is default for int, false for bool, null for objects.
• return pointer to object as result

e.m(e1,…,en)
• evaluate e to an object.
• extract a pointer to the m method from e's vtable
• invoke m, passing to it e,e1,…,en
  – e is passed as self, or this.
• in a real system, must check that e isn't null!
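These two operations can be mimicked in plain Java by treating an object as a record whose slot 0 holds the vtable. This is only a sketch: `CompiledMethod`, `ObjRuntime` and `InvokeDemo` are hypothetical names, fields are restricted to ints, and `movex` returns the new value of `x` (unlike the void method in the Pt2d example) so the effect is visible. The `vtables` argument is dropped because a Java static field can play the role of the global variable that Cish lacks.

```java
// A compiled method: receives self explicitly, followed by the arguments.
interface CompiledMethod {
    int call(Object[] self, int... args);
}

class ObjRuntime {
    // new C: allocate the record, set the vtable pointer, default the fields.
    static Object[] newObject(CompiledMethod[] vtable, int numFields) {
        Object[] obj = new Object[1 + numFields];
        obj[0] = vtable;                  // slot 0: vtable pointer
        for (int i = 1; i < obj.length; i++) {
            obj[i] = 0;                   // default value for int fields
        }
        return obj;
    }

    // e.m(e1,...,en): null check, fetch m from the vtable, pass e as self.
    static int invoke(Object[] receiver, int methodOffset, int... args) {
        if (receiver == null) {
            throw new NullPointerException("method call on null receiver");
        }
        CompiledMethod[] vtable = (CompiledMethod[]) receiver[0];
        return vtable[methodOffset].call(receiver, args);
    }
}

class InvokeDemo {
    static final int X = 1;       // record slot of field x
    static final int MOVEX = 0;   // vtable offset of movex

    // Vtable for Pt2d, built once at startup.
    static final CompiledMethod[] PT2D_VTABLE = {
        (self, args) -> {         // movex(i): x = x + i
            self[X] = (Integer) self[X] + args[0];
            return (Integer) self[X];
        }
    };

    static int demo() {
        Object[] p = ObjRuntime.newObject(PT2D_VTABLE, 2);  // new Pt2d
        ObjRuntime.invoke(p, MOVEX, 3);                     // p.movex(3)
        return ObjRuntime.invoke(p, MOVEX, 4);              // p.movex(4)
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 7
    }
}
```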


Operations (continued)

x, x := e
• read or write a variable.
• the variable could be a local or an instance variable.
• if it's an instance variable, we must use the "self" pointer to access the value.
  – *(this+offset(x)) = [| e |]

(C)e -- type casts
• if e has type D and D ≤ C, succeeds. (upcast)
• if e has type D and C ≤ D, performs a run-time check to make sure the object is actually (at least) a C. (downcast)
• if e has type D, and C is unrelated to D, then generates a compile-time error.
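The three cast cases can be sketched with a class descriptor that records its parent. The sketch assumes single inheritance and that each object's run-time class is available (for example through its vtable). `ClassDesc` and `CastDemo` are hypothetical names, and the `List` class here is just an arbitrary class unrelated to `Pt2d`.

```java
class ClassDesc {
    final String name;
    final ClassDesc parent;   // null for the root class

    ClassDesc(String name, ClassDesc parent) {
        this.name = name;
        this.parent = parent;
    }

    // D <= C: true if this class is c or (transitively) extends c.
    boolean isSubclassOf(ClassDesc c) {
        for (ClassDesc d = this; d != null; d = d.parent) {
            if (d == c) return true;
        }
        return false;
    }
}

class CastDemo {
    static final ClassDesc OBJECT = new ClassDesc("Object", null);
    static final ClassDesc PT2D = new ClassDesc("Pt2d", OBJECT);
    static final ClassDesc PT3D = new ClassDesc("Pt3d", PT2D);
    static final ClassDesc LIST = new ClassDesc("List", OBJECT);

    // Compile-time decision for (target)e, where e has static type `from`.
    static String classify(ClassDesc from, ClassDesc target) {
        if (from.isSubclassOf(target)) return "upcast";    // always succeeds
        if (target.isSubclassOf(from)) return "downcast";  // needs run-time check
        return "error";                                    // unrelated classes
    }

    // Run-time check emitted for a downcast: the object's actual class
    // must be (at least) the target class.
    static boolean downcastSucceeds(ClassDesc runtimeClass, ClassDesc target) {
        return runtimeClass.isSubclassOf(target);
    }

    public static void main(String[] args) {
        System.out.println(classify(PT3D, PT2D));         // upcast
        System.out.println(classify(PT2D, PT3D));         // downcast
        System.out.println(classify(PT2D, LIST));         // error
        System.out.println(downcastSucceeds(PT2D, PT3D)); // false
    }
}
```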


Subtleties in Type-Checking:

Every object has a run-time type.
• Essentially its vtable.

The type-checker tracks a static type.
• some super-type of the object.
• NB: Java confuses super-types and super-classes.

In reality, if e is of type C, then e could be nil or a C object.
• Java "C" = ML "C option"


Subtyping vs. Inheritance

Inheritance is a way to assemble classes

Simple inheritance: D extends C implies D ≤ C
• a read of instance variable x defined in C?
  – okay because D has it too.
• an invocation of method m defined in C?
  – okay because D has it too.
• m : (C self,T1,…,Tn) → T
  – Read C instance variables, invoke C methods.


Overriding:

class List {
  int hd;
  List tl;
  void append(List y) {
    if (tl == Nil) tl := y;
    else tl.append(y);
  }
}
class DList extends List {
  DList prev;
  void append(DList y) {    // Java won't let you say this…
    if (tl == Nil) {
      tl := y;
      if (y != Nil) y.prev := self;
    } else {
      tl.append(y);
    }
  }
}


Best you can do:

class List {
  int hd;
  List tl;
  void append(List y) {
    if (tl == Nil) tl := y;
    else tl.append(y);
  }
}
class DList extends List {
  DList prev;
  void append(List y) {
    if (tl == Nil) {
      tl := y;
      if (y != Nil) ((DList)y).prev := self;    // run-time type-check
    } else {
      tl.append(y);
    }
  }
}
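For reference, a direct Java transcription of this version is sketched below, with `null` for Nil, `=` for `:=`, and `this` for self. `DListDemo` is a hypothetical driver added to show what the run-time check does.

```java
class List {
    int hd;
    List tl;

    void append(List y) {
        if (tl == null) tl = y;
        else tl.append(y);
    }
}

class DList extends List {
    DList prev;

    @Override
    void append(List y) {
        if (tl == null) {
            tl = y;
            // Downcast: checked at run time, throws ClassCastException
            // if y is a plain List.
            if (y != null) ((DList) y).prev = this;
        } else {
            tl.append(y);
        }
    }
}

class DListDemo {
    // Appending a DList to a DList sets up the back pointer.
    static boolean dlistAppendLinksBack() {
        DList a = new DList();
        DList b = new DList();
        a.append(b);
        return a.tl == b && b.prev == a;
    }

    // Appending a plain List type-checks but fails the run-time check.
    static boolean plainAppendFails() {
        try {
            new DList().append(new List());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(dlistAppendLinksBack()); // true
        System.out.println(plainAppendFails());     // true
    }
}
```

The second call is accepted by the type-checker because `append` must keep the parameter type `List`. The mistake only surfaces when the cast runs.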


What We Wish we Had…

Don't just "copy" when inheriting:
• Also replace super-class name with sub-class name.

That is, we need a "self" type as much as a self value.

But this will not, in general, give you that the sub-class is a sub-type of the super-class.

why?


Overloading

• In Java you can override a method, and you can also overload a method

• With overloading, you don’t update the behavior of a method; instead, within a class hierarchy, you have different methods with the same name

• However, for a call to such an overloaded method, we must be able to figure out which one to call at compile time


Danger of Overriding + Overloading

class C {
  boolean equals(C other) { … } // 1
}
class SC extends C {
  boolean equals(C other) { … } // 2
  boolean equals(SC other) { … } // 3
}

C c = new C();
SC sc = new SC();
C c' = new SC();

c.equals(c);  c.equals(sc);  c.equals(c');
c'.equals(c); c'.equals(sc); c'.equals(c');
sc.equals(c); sc.equals(sc); sc.equals(c');
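The nine calls can be checked with the runnable sketch below. To make the outcome observable, the method is renamed `eq` (a stand-in for `equals`) and returns the number of the body that ran. `c2` stands in for the primed variable, and `OverloadDemo` is a hypothetical driver. Java picks the overload (`eq(C)` or `eq(SC)`) at compile time from the static types of the receiver and argument, then dispatches on the receiver's run-time class.

```java
class C {
    int eq(C other) { return 1; }    // body 1
}

class SC extends C {
    @Override
    int eq(C other) { return 2; }    // body 2: overrides body 1
    int eq(SC other) { return 3; }   // body 3: an overload, not an override
}

class OverloadDemo {
    static String run() {
        C c = new C();
        SC sc = new SC();
        C c2 = new SC();   // static type C, run-time class SC

        // One group of three digits per row of calls.
        return "" + c.eq(c)  + c.eq(sc)  + c.eq(c2)  + " "
                  + c2.eq(c) + c2.eq(sc) + c2.eq(c2) + " "
                  + sc.eq(c) + sc.eq(sc) + sc.eq(c2);
    }

    public static void main(String[] args) {
        System.out.println(run()); // 111 222 232
    }
}
```

Body 3 runs only when both the receiver and the argument have static type `SC`. In particular `c2.eq(sc)` runs body 2 even though both objects are `SC` instances, because static type `C` offers only the signature `eq(C)`.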