CS 320: Compiling Techniques David Walker
Dec 16, 2015


CS 320: Compiling Techniques

David Walker

People
  David Walker (Professor)
    412 Computer Science Building
    dpw@cs.princeton.edu
    office hours: after each class
  Guilherme Ottoni (TA)
    417 Computer Science Building
    [email protected]
    office hours: Mondays 2-2:30 PM, Fridays 2-3 PM

Information
  Web site: www.cs.princeton.edu/courses/archive/spring05/cos320/index.htm
  Mailing list:
    To subscribe: [email protected]
    To post to this list, send your email to: [email protected]

Books
  Modern Compiler Implementation in ML, Andrew Appel
  A reference manual for SML
    best choice: online references (see course web site)
    several hardcopy books
      Elements of ML Programming, Jeffrey D. Ullman

Assignment 0
  Write your name and other information on the sheet circulating
  Find, skim and bookmark the course web pages
  Subscribe to course e-mail list
  Begin assignment 1
    Figure out how to run & use SML
    Due next Thursday February 10


onward!

What is a compiler?
  A compiler is a program that translates a source language into an equivalent target language

What is a compiler?

  C program:
    while (i > 3) { a[i] = b[i]; i++; }

  compiler does this

  assembly program:
    mov eax, ebx
    add eax, 1
    cmp eax, 3
    jcc eax, edx

What is a compiler?

  Java program:
    class foo { int bar; ...}

  compiler does this

  C program:
    struct foo { int bar; ...}

What is a compiler?

  Java program:
    class foo { int bar; ...}

  compiler does this

  Java virtual machine program:
    ........
    .........
    ........

What is a compiler?

  Latex program:
    \newcommand{....}

  compiler does this

  Tex program:
    \sfd\sf\fadg

What is a compiler?

  Tex program:
    \newcommand{....}
    \sfd\sf\fadg

  compiler does this

  Postscript program

What is a compiler?
  Other places:
    Web scripts are compiled into HTML
    assembly language is compiled into machine language
    hardware description language is compiled into a hardware circuit
    ...

Compilers are complex
  front-end:  text file to abstract syntax
                lexing; parsing
  middle-end: abstract syntax to intermediate form (IR)
                analysis; optimizations; data layout
  back-end:   IR to machine code
                code generation; register allocation

Course project
  front-end:  Fun Source Language
                simple imperative language
  middle-end: Only 1 IR (the initial abstract syntax generated by the parser)
                type checking; high-level optimizations
  back-end:   Code Generation
                instruction selection algorithms; register allocation via graph coloring

Standard ML
  Standard ML is a domain-specific language for building compilers
  Support for
    Complex data structures (abstract syntax, compiler intermediate forms)
    Memory management like Java
    Large projects with many modules
    Advanced type system for error detection

Introduction to ML
  You will be responsible for learning ML on your own. Today I will cover some basics
  Resources:
    Robert Harper's online book "an introduction to ML" is a good place to start
    See course webpage for pointers and info about how to get the software

Intro to ML
  Highlights
    Data Structures for compilers
      Data type definitions
      Pattern matching
    Strongly-typed language
      Every expression has a type
      Certain errors cannot occur
      Polymorphic types provide flexibility
    Flexible Module System
      Abstract Types
      Higher-order modules (functors)
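To make the last two highlights concrete, here is a minimal sketch of a polymorphic function and a functor. The names ORDERED, Max, IntOrd, IntMax and swap are made up for illustration and do not come from the course materials.

```sml
(* A signature describes what a module must provide. *)
signature ORDERED =
sig
  type t
  val leq : t * t -> bool
end

(* A functor is a module parameterized by another module. *)
functor Max (O : ORDERED) =
struct
  fun max (a : O.t, b : O.t) : O.t = if O.leq (a, b) then b else a
end

(* One possible argument: integers with the usual ordering. *)
structure IntOrd =
struct
  type t = int
  fun leq (a : int, b : int) = a <= b
end

structure IntMax = Max (IntOrd)
val m = IntMax.max (3, 7)

(* A polymorphic function: its inferred type is 'a * 'b -> 'b * 'a. *)
fun swap (x, y) = (y, x)
val p = swap (1, "one")
```

The same Max functor could be applied to any other structure that matches ORDERED, which is the sense in which modules are "higher-order".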

Intro to ML
  Interactive Language
    Type in expressions
    Evaluate and print type and result
    Compiler as well
  High-level programming features
    Data types
    Pattern matching
    Exceptions
    Mutable data discouraged

Preliminaries
  start sml in Unix by typing sml at a prompt:

    tux% sml
    Standard ML of New Jersey, Version 110.0.7, September 28, 2000 [CM; autoload enabled]
    -

    (* quit SML by pressing ctrl-D *)
    (* just so you know, comments can be (* nested *) *)


Preliminaries
  Read – Eval – Print – Loop

    - 3 + 2;
    > 5 : int
    - it + 7;
    > 12 : int
    - it - 3;
    > 9 : int
    - 4 + true;
    stdIn:17.1-17.9 Error: operator and operand don't agree [literal]
      operator domain: int * int
      operand:         int * bool
      in expression:
        4 + true

Preliminaries
  Read – Eval – Print – Loop

    - 3 div 0;
    Failure : Div        run-time error

Basic Values

    - ();
    > () : unit                      => like "void" in C (sort of)
                                     => the uninteresting value/type
    - true;
    > true : bool
    - false;
    > false : bool
    - if it then 3+2 else 7;         "else" clause is always necessary
    > 7 : int
    - false andalso loop_Forever;
    > false : bool                   andalso, orelse: short-circuit evaluation
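The short-circuit behaviour can be checked with a small sketch like the one below. The function loopForever is a hypothetical stand-in for the slide's loop_Forever: it never returns, so the program only terminates because the right operand is skipped.

```sml
(* Hypothetical helper: calling it would loop forever. *)
fun loopForever () : bool = loopForever ()

(* andalso evaluates its right operand only when the left one is true. *)
val a = false andalso loopForever ()

(* orelse evaluates its right operand only when the left one is false. *)
val b = true orelse loopForever ()

(* if is an expression, so both branches must have the same type. *)
val c = if a then 3 + 2 else 7
```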

Basic Values

  Integers
    - 3 + 2;
    > 5 : int
    - 3 + (if not true then 5 else 7);      No division between expressions and statements
    > 10 : int

  Strings
    - "Dave" ^ " " ^ "Walker";
    > "Dave Walker" : string
    - print "foo\n";
    foo
    > () : unit

  Reals
    - 3.14;
    > 3.14 : real
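A short sketch pulling these basic values together. The variable names are illustrative only; Real.fromInt and Int.toString are Basis Library functions, needed because ML does not convert between int, real and string implicitly.

```sml
(* if is an expression, so it can appear inside arithmetic. *)
val n = 3 + (if not true then 5 else 7)

(* ^ concatenates strings. *)
val name = "Dave" ^ " " ^ "Walker"

(* print is called for its effect; its result is the unit value. *)
val u : unit = print (name ^ "\n")

(* ints and reals do not mix: convert explicitly. *)
val area = 3.14 * Real.fromInt (2 * 2)

(* likewise, ints must be converted before concatenation. *)
val s = "n = " ^ Int.toString n
```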

Using SML/NJ
  Interactive mode is a good way to start learning and to debug programs, but…
  Type in a series of declarations into a ".sml" file

    - use "foo.sml";
    [opening foo.sml]
    …                        list of declarations with their types

Larger Projects
  SML has its own built in interactive "make"
  Pros:
    It automatically does the dependency analysis for you
    No crazy makefile syntax to learn
  Cons:
    May be more difficult to interact with other languages or tools

Compilation Manager

    % sml
    - OS.FileSys.chDir "~/courses/510/a2";
    - CM.make();                  looks for "sources.cm", analyzes dependencies
    [compiling…]                  compiles files in group
    [wrote…]                      saves binaries in ./CM/
    - CM.make' "myproj/"();       specify directory

  files: sources.cm, c.sml, b.sml, a.sig

  sources.cm:
    Group is
      a.sig
      b.sml
      c.sml

What is next?
  ML has a rich set of structured values
    Tuples: (17, true, "stuff")
    Records: {name = "Dave", ssn = 332177}
    Lists: 3::4::5::nil or [3,4]@[5]
    Datatypes
    Functions
    And more!
  Rather than list all the details, we will write a couple of programs
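As a quick taste of the structured values listed above, the sketch below builds each one and takes it apart again. All variable names and the sum function are illustrative and not part of the course code.

```sml
(* Tuples: built with parentheses, taken apart by pattern or by #i. *)
val t = (17, true, "stuff")
val (n, flag, s) = t
val second = #2 t

(* Records: fields are selected by name. *)
val r = {name = "Dave", ssn = 332177}
val who = #name r

(* Lists: :: adds one element at the front, @ appends two lists. *)
val xs = 3 :: 4 :: 5 :: nil
val ys = [3, 4] @ [5]
val same = (xs = ys)

(* Functions over lists are usually written by pattern matching. *)
fun sum [] = 0
  | sum (x :: rest) = x + sum rest
val total = sum xs
```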

An interpreter
  Interpreters are usually implemented as a series of transformers:

    stream of characters
      --(lexing/parsing)-->  abstract syntax
      --(evaluate)-->        abstract value
      --(print)-->           stream of characters

A little language (LL)
  An arithmetic expression e is
    a boolean value
    an if statement (if e1 then e2 else e3)
    an integer
    an add operation
    a test for zero (isZero e)

LL abstract syntax in ML

    datatype term = Bool of bool
                  | If of term * term * term
                  | Num of int
                  | Add of term * term
                  | IsZero of term

  -- by convention, constructors are capitalized
  -- constructors can take a single argument of a particular type
  -- term * term * term is the type of a tuple (another eg: string * char)
  -- the vertical bar separates alternatives

LL abstract syntax in ML

    Add (Num 2, Num 3)

  represents the expression "2 + 3"

         Add
        /   \
      Num   Num
       |     |
       2     3

LL abstract syntax in ML

    If (Bool true, Num 0, Add (Num 2, Num 3))

  represents "if true then 0 else 2 + 3"

              If
           /   |   \
       Bool   Num   Add
         |     |   /   \
       true    0  Num  Num
                   |    |
                   2    3
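One way to see the correspondence between trees and concrete syntax is to write a function that turns a term back into text. The sketch below repeats the datatype so that it stands alone; toString is a hypothetical helper, not part of the course code, and it makes no attempt to parenthesize nested terms.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

(* One clause per constructor; recursion follows the shape of the tree. *)
fun toString (Bool b) = Bool.toString b
  | toString (If (t1, t2, t3)) =
      "if " ^ toString t1 ^ " then " ^ toString t2 ^ " else " ^ toString t3
  | toString (Num n) = Int.toString n
  | toString (Add (t1, t2)) = toString t1 ^ " + " ^ toString t2
  | toString (IsZero t) = "isZero " ^ toString t

val example = If (Bool true, Num 0, Add (Num 2, Num 3))
val shown = toString example
```

Here shown is the string "if true then 0 else 2 + 3". Note that Int.toString writes negative numbers with ML's ~ sign.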

Function declarations

    fun isValue t =
      case t of
        Num n => true
      | Bool b => true
      | _ => false

  -- isValue is the function name; t is the function parameter
  -- the default pattern _ matches anything

What is the type of the parameter t? Of the function?

    fun isValue t =
      case t of
        Num n => true
      | Bool b => true
      | _ => false

  -- isValue is the function name; t is the function parameter
  -- the default pattern _ matches anything

What is the type of the parameter t? Of the function?

    fun isValue (t:term) : bool =
      case t of
        Num n => true
      | Bool b => true
      | _ => false

    val isValue : term -> bool

  ML does type inference => you need not annotate functions yourself (but it can be helpful)

A type error

    fun isValue t =
      case t of
        Num n => n
      | _ => false

    ex.sml:22.3-24.15 Error: types of rules don't agree [literal]
      earlier rule(s): term -> int
      this rule: term -> bool
      in rule:
        Successor t2 => true

A type error

  Actually, ML will give you several errors in a row:

    ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
      earlier rule(s): term -> int
      this rule: term -> bool
      in rule:
        Successor t2 => true
    ex.sml:22.3-25.15 Error: types of rules don't agree [literal]
      earlier rule(s): term -> int
      this rule: term -> bool
      in rule:
        _ => false


A very subtle error

    fun isValue t =
      case t of
        num => true
      | _ => false

  The code above type checks. But when we test it, the function always returns "true." What has gone wrong?
  -- num is not capitalized (and has no argument)
  -- ML treats it like a variable pattern (matches anything!)
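The effect of the variable pattern can be reproduced with the sketch below. The names buggy and fixed are made up for this illustration. The unreachable `_` rule is left out of buggy, since a rule that follows a variable pattern can never be selected.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

(* num is a fresh variable here, not the constructor Num,
   so this single rule matches every term. *)
fun buggy t =
  case t of
    num => true

(* With the constructors spelled correctly, non-values fall through to _. *)
fun fixed t =
  case t of
    Num _ => true
  | Bool _ => true
  | _ => false

val r1 = buggy (IsZero (Num 0))
val r2 = fixed (IsZero (Num 0))
```

Here r1 is true even though IsZero (Num 0) is not a value, while r2 is false as intended.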


Exceptions

    exception Error of string

    fun debug s : unit = raise (Error s)

  in SML interpreter:

    - debug "hello";

    uncaught exception Error
      raised at: ex.sml:15.28-15.35
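An exception does not have to reach the top level: a handle clause can catch it and recover the value it carries. The sketch below reuses Error and debug from the slide; the name msg and the strings are illustrative.

```sml
exception Error of string

fun debug s : unit = raise (Error s)

(* handle pattern-matches on the raised exception. If debug returned
   normally, the result would be "no error". *)
val msg = (debug "hello"; "no error")
          handle Error s => "caught: " ^ s
```

Both sides of handle must have the same type, here string.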

Evaluator

    fun isValue t = ...

    exception NoRule

    fun eval t =
      case t of
        (Bool _ | Num _) => t
      | ...

Evaluator

    ...

    fun eval t =
      case t of
        (Bool _ | Num _) => t
      | If (t1,t2,t3) =>
          let val v = eval t1
          in
            case v of
              Bool b => if b then (eval t2) else (eval t3)
            | _ => raise NoRule
          end

  -- let statement for remembering temporary results

Evaluator

    exception NoRule

    fun eval1 t =
      case t of
        (Bool _ | Num _) => ...
      | ...
      | Add (t1,t2) =>
          (case (eval1 t1, eval1 t2) of
             (Num n1, Num n2) => Num (n1 + n2)
           | (_,_) => raise NoRule)

Finishing the Evaluator

    fun eval1 t =
      case t of
        ...
      | ...
      | Add (t1,t2) => ...
      | IsZero t => ...

  -- be sure your case is exhaustive
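Putting the pieces together, one possible complete evaluator is sketched below. The If and Add rules follow the earlier slides. The IsZero rule is an assumption, since the slides leave it elided: it evaluates its argument and returns Bool true exactly when the result is Num 0. Separate rules are used for Bool and Num so that the sketch stays within plain Standard ML patterns.

```sml
datatype term =
    Bool of bool
  | If of term * term * term
  | Num of int
  | Add of term * term
  | IsZero of term

exception NoRule

fun eval t =
  case t of
    Bool _ => t
  | Num _ => t
  | If (t1, t2, t3) =>
      (case eval t1 of
         Bool b => if b then eval t2 else eval t3
       | _ => raise NoRule)
  | Add (t1, t2) =>
      (case (eval t1, eval t2) of
         (Num n1, Num n2) => Num (n1 + n2)
       | _ => raise NoRule)
  (* Assumed semantics: the slides do not show this rule. *)
  | IsZero t1 =>
      (case eval t1 of
         Num n => Bool (n = 0)
       | _ => raise NoRule)

val five = eval (Add (Num 2, Num 3))
val zero = eval (If (Bool true, Num 0, Add (Num 2, Num 3)))
```

Each nested case is wrapped in parentheses so that the rules that follow it belong to the outer case. Every constructor of term has a rule, so the match is exhaustive, and ill-formed programs such as Add (Bool true, Num 1) raise NoRule.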


Finishing the Evaluator

    fun eval1 t =
      case t of
        ...
      | ...
      | Add (t1,t2) => ...

  What if we forgot a case?

    ex.sml:25.2-35.12 Warning: match nonexhaustive
      (Bool _ | Zero) => ...
      If (t1,t2,t3) => ...
      Add (t1,t2) => ...

Last Things
  Learning to program in SML can be tricky at first
  But once you get used to it, you will never want to go back to imperative languages
  Check out the reference materials listed on the course homepage