CSEP505: Programming Languages Lecture 2: functional programming, syntax, semantics via interpretation or translation Dan Grossman Winter 2009

Jan 25, 2016

Page 1: Dan Grossman Winter 2009

CSEP505: Programming Languages
Lecture 2: functional programming, syntax, semantics via interpretation or translation

Dan Grossman

Winter 2009

Page 2: Dan Grossman Winter 2009

15 January 2009 CSE P505 Winter 2009 Dan Grossman 2

Where are we

Programming:
• To finish: Caml tutorial
• Idioms using higher-order functions
  – Similar to objects
• Tail recursion

Languages:
• Abstract syntax, Backus-Naur Form
• Definition via interpretation
• Definition via translation

We are unlikely to finish these slides today; that’s okay

Page 3: Dan Grossman Winter 2009


Picking up our tutorial

• We did:
  – Recursive higher-order functions
  – Records
  – Recursive datatypes

• We started some important odds and ends:
  – Standard-library
  – Common higher-order function idioms
  – Tuples
  – Nested patterns
  – Exceptions

• Need to do:
  – (Simple) Modules

Page 4: Dan Grossman Winter 2009


Standard library

• Values (e.g., functions) bound to foo in module M are accessed via M.foo

• Standard library organized into modules
• For homework 1, will use List, String, and Char
  – Mostly List, for example, List.fold_left
  – And we point you to the useful functions

• Standard library a mix of “primitives” (e.g., String.length) and useful helpers written in Caml (e.g., List.fold_left)

• Pervasives is a module implicitly “opened”

Page 5: Dan Grossman Winter 2009


Higher-order functions

• Will discuss “map” and “fold” idioms more next time, but to help get through early parts of homework 1:

let rec mymap f lst =
  match lst with
    [] -> []
  | hd::tl -> (f hd)::(mymap f tl)

let lst234 = mymap (fun x -> x+1) [1;2;3]
let lst345 = List.map (fun x -> x+1) lst234
let incr_list = mymap (fun x -> x+1)

Page 6: Dan Grossman Winter 2009


Tuples

Defining record types all the time is unnecessary:
• Types: t1 * t2 * … * tn
• Construct tuples e1,e2,…,en
• Get elements with pattern-matching x1,x2,…,xn
• Advice: use parentheses!

let x = (3,"hi",(fun x -> x), fun x -> x ^ "ism")

let z = match x with (i,s,f1,f2) -> f1 i (*poor style *)

let z = (let (i,s,f1,f2) = x in f1 i)

Page 7: Dan Grossman Winter 2009


Pattern-matching revealed

• You can pattern-match anything
  – Only way to access datatypes and tuples
  – A variable or _ matches anything
  – Patterns can nest
  – Patterns can include constants (3, "hi", …)

• Patterns are not expressions, though syntactically a subset
  – Plus some bells/whistles (as-patterns, or-patterns)

• Exhaustiveness and redundancy checking at compile-time!

• let can have patterns, just sugar for one-branch match!

Page 8: Dan Grossman Winter 2009


Fancy patterns example

type sign = P | N | Z

let multsign x1 x2 =
  let sign x = if x>=0 then (if x=0 then Z else P) else N in
  match (sign x1,sign x2) with
    (P,P) -> P
  | (N,N) -> P
  | (Z,_) -> Z
  | (_,Z) -> Z
  | _ -> N (* many say bad style *)

To avoid overlap, two more cases (more robust if type changes)
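One way to read "two more cases" is to replace the final wildcard with the two sign combinations it stands for. The sketch below assumes that reading; it is not taken from the slides. If a constructor is later added to sign, the compiler's exhaustiveness check then flags this match instead of silently returning N.

```ocaml
type sign = P | N | Z

let sign x = if x > 0 then P else if x = 0 then Z else N

(* Same function as multsign, with the catch-all spelled out explicitly. *)
let multsign_explicit x1 x2 =
  match (sign x1, sign x2) with
  | (P, P) -> P
  | (N, N) -> P
  | (Z, _) -> Z
  | (_, Z) -> Z
  | (P, N) -> N   (* was covered by the wildcard *)
  | (N, P) -> N   (* was covered by the wildcard *)
```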

Page 9: Dan Grossman Winter 2009


Fancy patterns example (and exns)

exception ZipLengthMismatch

let rec zip3 lst1 lst2 lst3 =
  match (lst1,lst2,lst3) with
    ([],[],[]) -> []
  | (hd1::tl1,hd2::tl2,hd3::tl3) -> (hd1,hd2,hd3)::(zip3 tl1 tl2 tl3)
  | _ -> raise ZipLengthMismatch

'a list -> 'b list -> 'c list -> ('a*'b*'c) list

Page 10: Dan Grossman Winter 2009


Pattern-matching in general

• Full definition of matching is recursive
  – Over a value and a pattern
  – Produce a binding list or fail
  – You implement a simple version in homework 1

• Example:

  (p1,p2,p3) matches (v1,v2,v3) if pi matches vi for 1<=i<=3
  – Binding list is 3 subresults appended together

Page 11: Dan Grossman Winter 2009


“Quiz”

What is

let f x y = x + y

let f pr = (match pr with (x,y) -> x+y)

let f (x,y) = x + y

let f (x1,y1) (x2,y2) = x1 + y2

Page 12: Dan Grossman Winter 2009


Exceptions

See the manual for:

• Exceptions that carry values
  – Much like datatypes but extends exn

• Catching exceptions
  – try e1 with …
  – Much like pattern-matching but cannot be exhaustive

• Exceptions are not hierarchical (unlike Java/C# subtyping)
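A small sketch of both points, assuming a made-up exception BadIndex and helper names that do not appear in the slides: the exception is declared with a payload much like a datatype constructor, and try … with matches on it like a one-sided pattern match.

```ocaml
(* Hypothetical exception carrying the index that was out of range. *)
exception BadIndex of int

(* Return element n (0-based) of lst, or raise BadIndex n. *)
let nth lst n =
  let rec go rest k =
    match rest with
    | [] -> raise (BadIndex n)
    | hd :: tl -> if k = 0 then hd else go tl (k - 1)
  in
  go lst n

(* Handlers are patterns over exn; exceptions not listed keep propagating. *)
let nth_or_default lst n default =
  try nth lst n with
  | BadIndex _ -> default
```

Because anyone can add new exceptions to exn, a handler can never be checked for exhaustiveness the way an ordinary match can.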

Page 13: Dan Grossman Winter 2009


Modules

• So far, only way to hide things is local let
  – Not good for large programs
  – Caml has a fancy module system, but we need only the basics

• Modules and signatures give
  – Namespace management
  – Hiding of values and types
  – Abstraction of types
  – Separate type-checking and compilation

• By default, Caml builds on the filesystem

Page 14: Dan Grossman Winter 2009


Module pragmatics

• foo.ml defines module Foo

• Bar uses variable x, type t, constructor C in Foo via Foo.x, Foo.t, Foo.C
  – Can open a module, use sparingly

• foo.mli defines signature for module Foo
  – Or “everything public” if no foo.mli

• Order matters (command-line)
  – No forward references (long story)
  – Program-evaluation order

• See manual for .cm[i,o] files, -c flag, etc.

Page 15: Dan Grossman Winter 2009


Module example

foo.ml:

type t1 = X1 of int | X2 of int

let get_int t = match t with X1 i -> i | X2 i -> i

type even = int

let makeEven i = i*2
let isEven1 i = true
(* isEven2 is "private" *)
let isEven2 i = (i mod 2)=0

foo.mli:

(* choose to show *)
type t1 = X1 of int | X2 of int

val get_int : t1->int

(* choose to hide *)
type even

val makeEven : int->even
val isEven1 : even->bool

Page 16: Dan Grossman Winter 2009


Module example

bar.ml:

type t1 = X1 of int | X2 of int

let conv1 t = match t with X1 i -> Foo.X1 i | X2 i -> Foo.X2 i
let conv2 t = match t with Foo.X1 i -> X1 i | Foo.X2 i -> X2 i

let _ = Foo.get_int(conv1(X1 17));
        Foo.isEven1(Foo.makeEven 17)
        (* Foo.isEven1 34 would not type-check here: even is abstract *)

foo.mli:

(* choose to show *)
type t1 = X1 of int | X2 of int

val get_int : t1->int

(* choose to hide *)
type even

val makeEven : int->even
val isEven1 : even->bool

Page 17: Dan Grossman Winter 2009


Not the whole language

• Objects
• Loop forms (bleach)
• Fancy module stuff (e.g., functors)
• Polymorphic variants
• Mutable fields
• …

Just don’t need much of this for class (nor do I use it much)
• Will use floating-point, etc. (easy to pick up)

Page 18: Dan Grossman Winter 2009


Summary

• Done with Caml tutorial
  – Focus on “up to speed” while being precise
  – Much of class will be more precise

• Next: functional-programming idioms
  – Uses of higher-order functions
  – Tail recursion
  – Life without mutation or loops

Will use Caml but ideas are more general

Page 19: Dan Grossman Winter 2009


6 closure idioms

Closure: Function plus environment where function was defined
– Environment matters when function has free variables

1. Create similar functions

2. Combine functions

3. Pass functions with private data to iterators

4. Provide an abstract data type

5. Currying and partial application

6. Callbacks

Page 20: Dan Grossman Winter 2009


Create similar functions

let addn m n = m + n

let add_one = addn 1

let add_two = addn 2

let rec f m = if m=0 then [] else (addn m)::(f (m-1))

let lst65432 = List.map (fun x -> x 1) (f 5)

Page 21: Dan Grossman Winter 2009


Combine functions

let f1 g h = (fun x -> g (h x))

type 'a option = None | Some of 'a (*predefined*)

let f2 g h x = match g x with None -> h x | Some y -> y

(* just a function pointer *)
let print_int = f1 print_string string_of_int

(* a closure *)
let truncate1 lim f = f1 (fun x -> min lim x) f
let truncate2 lim f = f1 (min lim) f

Page 22: Dan Grossman Winter 2009


Private data for iterators

let rec map f lst =
  match lst with
    [] -> []
  | hd::tl -> (f hd)::(map f tl)

(* just a function pointer *)
let incr lst = map (fun x -> x+1) lst
let incr = map (fun x -> x+1)

(* a closure *)
let mul i lst = map (fun x -> x*i) lst
let mul i = map (fun x -> x*i)

Page 23: Dan Grossman Winter 2009


A more powerful iterator

let rec fold_left f acc lst =
  match lst with
    [] -> acc
  | hd::tl -> fold_left f (f acc hd) tl

(* just function pointers *)
let f1 = fold_left (fun x y -> x+y) 0
let f2 = fold_left (fun x y -> x && y>0) true

(* a closure *)
let f3 lst lo hi =
  fold_left (fun x y -> if y>lo && y<hi then x+1 else x) 0 lst

Page 24: Dan Grossman Winter 2009


Thoughts on fold

• Functions like fold decouple recursive traversal (“walking”) from data processing

• No unnecessary type restrictions
• Similar to visitor pattern in OOP
  – Private fields of a visitor like free variables

• Very useful if recursive traversal hides fault tolerance (thanks to no mutation) and massive parallelism

  MapReduce: Simplified Data Processing on Large Clusters
  Jeffrey Dean and Sanjay Ghemawat
  6th Symposium on Operating System Design and Implementation, 2004

Page 25: Dan Grossman Winter 2009


Provide an ADT

• Note: This is mind-bending stuff

type set = { add : int -> set; member : int -> bool }
let empty_set =
  let exists lst j = (*could use fold_left!*)
    let rec iter rest =
      match rest with
        [] -> false
      | hd::tl -> j=hd || iter tl in
    iter lst in
  let rec make_set lst =
    { add = (fun i -> make_set(i::lst));
      member = exists lst } in
  make_set []

Page 26: Dan Grossman Winter 2009


Thoughts on ADT example

• By “hiding the list” behind the functions, we know clients do not assume the representation

• Why? All you can do with a function is apply it
  – No other primitives on functions
  – No reflection
  – No aspects
  – …
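A client-side sketch of the set ADT may make this concrete. It repeats the record type and, purely as a shortcut for this illustration, swaps the hand-written search for the standard List.exists; the names s1 and s2 are made up here. Note that the client only ever applies the add and member functions and never sees the list.

```ocaml
type set = { add : int -> set; member : int -> bool }

let empty_set =
  (* Assumption for brevity: List.exists replaces the slide's iter loop. *)
  let exists lst j = List.exists (fun hd -> hd = j) lst in
  let rec make_set lst =
    { add = (fun i -> make_set (i :: lst)); member = exists lst }
  in
  make_set []

(* Each add returns a new set; older sets are unchanged. *)
let s1 = empty_set.add 3
let s2 = s1.add 5

let five_in_s2 = s2.member 5
let five_in_s1 = s1.member 5
```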

Page 27: Dan Grossman Winter 2009


Currying

• We’ve been using currying a lot
  – Efficient and convenient in Caml
  – (Partial application not efficient, but still convenient)

• Just remember that the semantics is to build closures:
  – More obvious when desugared:

let f = fun x -> (fun y -> (fun z -> … ))

let a = ((f 1) 2) 3
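To fill in the elided body with something concrete, here is a sketch using a hypothetical three-argument adder (the names add3, add3_sugar, and add_to_3 are illustrative only). The sugared and desugared definitions mean the same thing, and stopping after two arguments yields a closure that remembers x and y.

```ocaml
(* Desugared: each fun returns another function, i.e., a closure. *)
let add3 = fun x -> (fun y -> (fun z -> x + y + z))

(* Sugared form of the same definition. *)
let add3_sugar x y z = x + y + z

(* Partial application: a closure with x = 1 and y = 2 in its environment. *)
let add_to_3 = add3 1 2

let six = add_to_3 3
```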

Page 28: Dan Grossman Winter 2009


Callbacks

• Library takes a function to apply later, on an event:
  – When a key is pressed
  – When a network packet arrives
  – …

• Function may be a filter, an action, …

• Various callbacks need private state of different types

• Fortunately, a function’s type does not depend on the types of its free variables

Page 29: Dan Grossman Winter 2009


Callbacks cont’d

type event = …
val register_callback : (event->unit)->unit

• Compare OOP: subclassing for private state

abstract class EventListener {
  abstract void m(Event); // "pure virtual"
}
void register_callback(EventListener);

• Compare C: a void* arg for private state

void register_callback(void*, void (*)(void*,Event));
// void* and void* better be compatible
// callee must pass back the same void*
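The Caml version can be sketched as follows. Everything here is an assumption made for illustration: the event constructors, the list-based registry, the fire function, and the two clients are not from the slides. The point it tries to show is that both callbacks have type event->unit even though one closes over an int counter and the other over a list of strings.

```ocaml
(* Hypothetical event type; the slides leave it elided. *)
type event = KeyPress of char | Packet of string

(* Hypothetical library side: a registry of callbacks and a way to fire events. *)
let callbacks : (event -> unit) list ref = ref []
let register_callback (f : event -> unit) : unit = callbacks := f :: !callbacks
let fire (e : event) : unit = List.iter (fun f -> f e) !callbacks

(* Client 1: private state is an int counter. *)
let make_key_counter () =
  let count = ref 0 in
  let callback e = match e with KeyPress _ -> incr count | Packet _ -> () in
  (callback, fun () -> !count)

(* Client 2: private state is a list of strings. *)
let make_packet_log () =
  let log = ref [] in
  let callback e = match e with Packet s -> log := s :: !log | KeyPress _ -> () in
  (callback, fun () -> List.rev !log)

let (count_keys, keys_seen) = make_key_counter ()
let (log_packets, packets_seen) = make_packet_log ()

let () =
  register_callback count_keys;
  register_callback log_packets;
  fire (KeyPress 'a');
  fire (Packet "hello");
  fire (KeyPress 'b')
```

This sketch uses references for the private state, which the tutorial otherwise avoids; a callback of type event->unit can only be useful through some effect.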

Page 30: Dan Grossman Winter 2009


Recursion and efficiency

• Recursion is more powerful than loops
  – Just pass loop state as another argument

• But isn’t it less efficient?
  – Function calls more time than branches?
    • Compiler’s problem
    • An O(1) detail irrelevant in 99+% of code
  – More stack space waiting for return
    • Shared problem: use tail calls where it matters
    • An O(n) issue (for recursion-depth n)

Page 31: Dan Grossman Winter 2009


Tail recursion example

(* factorial *)
let rec fact1 x =
  if x=0 then 1 else x * (fact1(x-1))

• More complicated, more efficient version

let fact2 x =
  let rec f acc x =
    if x=0 then acc else f (acc*x) (x-1)
  in
  f 1 x

• Accumulator pattern (base-case becomes initial accumulator)

Page 32: Dan Grossman Winter 2009


Another example

• Again O(n) stack savings
• But input was already O(n) size

let rec sum1 lst =
  match lst with
    [] -> 0
  | hd::tl -> hd + (sum1 tl)

let sum2 lst =
  let rec f acc lst =
    match lst with
      [] -> acc
    | hd::tl -> f (acc+hd) tl
  in f 0 lst

Page 33: Dan Grossman Winter 2009


Half-example

• One tail-call, one non
• Tail recursive version will build O(n) worklist
  – No space savings
  – That’s what the stack is for!

• O(1) space requires mutation and no re-entrancy

type tree = Leaf of int | Node of tree * tree
let sum tr =
  let rec f acc tr =
    match tr with
      Leaf i -> acc+i
    | Node(left,right) -> f (f acc left) right
  in f 0 tr

Page 34: Dan Grossman Winter 2009


Informal definition

If the result of f x is the result of the enclosing function, then the call is a tail call (in tail position):

• In (fun x -> e), the e is in tail position.
• If if e1 then e2 else e3 is in tail position, then e2 and e3 are in tail position.
• If let p = e1 in e2 is in tail position, then e2 is in tail position.
• …

• Note: for call e1 e2, neither is in tail position
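The rules above can be checked against a small pair of functions. This is only an illustrative sketch with made-up names (length1, length2): in the first, the recursive call is an argument to +, so it is not in tail position; in the second, the recursive call is the whole result of a match branch inside the function body, so it is.

```ocaml
(* Not a tail call: after length1 tl returns, 1 + ... still has to run. *)
let rec length1 lst =
  match lst with
  | [] -> 0
  | _ :: tl -> 1 + length1 tl

(* Tail call: the result of f (acc + 1) tl is the result of f itself. *)
let length2 lst =
  let rec f acc lst =
    match lst with
    | [] -> acc
    | _ :: tl -> f (acc + 1) tl
  in
  f 0 lst
```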

Page 35: Dan Grossman Winter 2009


Defining languages

• We have built up some terminology and relevant programming prowess

• Now
  – What does it take to define a programming language?
  – How should we do it?

Page 36: Dan Grossman Winter 2009


Syntax vs. semantics

Need: what every string means:

“Not a program” or “produces this answer”

Typical decomposition of the definition:

1. Lexing, a.k.a. tokenization, string to token list

2. Parsing, token list to labeled tree (AST)

3. Type-checking (a filter)

4. Semantics (for what got this far)

For now, ignore (3) (accept everything) and skip (1)-(2)

Page 37: Dan Grossman Winter 2009


Abstract syntax

To ignore parsing, we need to define trees directly:
• A tree is a labeled node and an ordered list of (zero or more) child trees.
• A PL’s abstract syntax is a subset of the set of all such trees:
  – What labels are allowed?
  – For a label, what children are allowed?

Advantage of trees: no ambiguity, i.e., no need for parentheses

Page 38: Dan Grossman Winter 2009


Syntax metalanguage

• So we need a metalanguage to describe what syntax trees are allowed in our language.

• A fine choice: Caml datatypes

• +: concise and direct for common things
• -: limited expressiveness (silly example: nodes labeled Foo must have a prime-number of children)
• In practice: push such limitations to type-checking

type exp = Int of int | Var of string
         | Plus of exp * exp | Times of exp * exp
type stmt = Skip | Assign of string * exp | Seq of stmt * stmt
          | If of exp * stmt * stmt | While of exp * stmt

Page 39: Dan Grossman Winter 2009


We defined a subset?

• Given a tree, does the datatype describe it?
  – Is root label a constructor?
  – Does it have the right children of the right type?
  – Recur on children

• Worth repeating: a finite description of an infinite set
  – (all?) PLs have an infinite number of programs
  – Definition is recursive, but not circular!

• Made no mention of parentheses, but we need them to “write a tree as a string”

Page 40: Dan Grossman Winter 2009


BNF

A more standard metalanguage is Backus-Naur Form
• Common: should know how to read and write it

e ::= c | x | e + e | e * e
s ::= skip | x := e | s;s | if e then s else s | while e s

(x in {x1,x2,…,y1,y2,…,z1,z2,…,…})
(c in {…,-2,-1,0,1,2,…})

Also defines an infinite set of trees. Differences:

• Different metanotation (::= and |)

• Can omit labels, e.g., “every c is an e”

• We changed some labels (e.g., := for Assign)

Page 41: Dan Grossman Winter 2009


Ambiguity revisited

• Again, metalanguages for abstract syntax just assume there are enough parentheses

• Bad example:

if x then skip else y := 0; z := 0

• Good example:

y:=1; (while x (y:=y*x; x:= x-1))

Page 42: Dan Grossman Winter 2009


Our first PL

• Let’s call this dumb language IMP
  – It has just mutable ints, a while loop, etc.
  – No functions, locals, objects, threads, …

Defining it:

1. Lexing (e.g., what ends a variable)

2. Parsing (make a tree from a string)

3. Type-checking (accept everything)

4. Semantics (to do)

You’re not responsible for (1) and (2)! Why…

Page 43: Dan Grossman Winter 2009


Syntax is boring

• Parsing PLs is a computer-science success story
• “Solved problem” taught in compilers
• Boring because:
  – “If it doesn’t work (efficiently), add more keywords/parentheses”
  – Extreme: put parentheses on everything and don’t use infix
    • 1950s example: LISP (foo …)
    • 1990s example: XML <foo> … </foo>

• So we’ll assume we have an AST

Page 44: Dan Grossman Winter 2009


Toward semantics

Now: describe what an AST “does/is/computes”
• Do expressions first to get the idea
• Need an informal idea first
  – A way to “look up” variables (the heap)
• Need a metalanguage
  – Back to Caml (for now)

e ::= c | x | e + e | e * e
s ::= skip | x := e | s;s | if e then s else s | while e s

(x in {x1,x2,…,y1,y2,…,z1,z2,…,…})
(c in {…,-2,-1,0,1,2,…})

Page 45: Dan Grossman Winter 2009


An expression interpreter

• Definition by interpretation: Program means what an interpreter written in the metalanguage says it means

type exp = Int of int | Var of string
         | Plus of exp * exp | Times of exp * exp
type heap = (string * int) list

let rec lookup h str = … (*lookup a variable*)

let rec interp_e (h:heap) (e:exp) =
  match e with
    Int i        -> i
  | Var str      -> lookup h str
  | Plus(e1,e2)  -> (interp_e h e1)+(interp_e h e2)
  | Times(e1,e2) -> (interp_e h e1)*(interp_e h e2)

Page 46: Dan Grossman Winter 2009


Not always so easy

let rec interp_e (h:heap) (e:exp) =
  match e with
    Int i        -> i
  | Var str      -> lookup h str
  | Plus(e1,e2)  -> (interp_e h e1)+(interp_e h e2)
  | Times(e1,e2) -> (interp_e h e1)*(interp_e h e2)

• By fiat, “IMP’s plus/times” is the same as Caml’s
• We assume lookup always returns an int
  – A metalanguage exception may be inappropriate
  – So define lookup to return 0 by default?

• What if we had division?

Page 47: Dan Grossman Winter 2009


On to statements

• A wrong idea worth pursuing:

let rec interp_s (h:heap) (s:stmt) =
  match s with
    Skip          -> ()
  | Seq(s1,s2)    -> interp_s h s1 ; interp_s h s2
  | If(e,s1,s2)   -> if interp_e h e
                     then interp_s h s1
                     else interp_s h s2
  | Assign(str,e) -> (* ??? *)
  | While(e,s1)   -> (* ??? *)

Page 48: Dan Grossman Winter 2009


What went wrong?

• In IMP, expressions produce numbers (given a heap)
• In IMP, statements change heaps, i.e., they produce a heap (given a heap)

let rec interp_s (h:heap) (s:stmt) =
  match s with
    Skip          -> h
  | Seq(s1,s2)    -> let h2 = interp_s h s1 in
                     interp_s h2 s2
  | If(e,s1,s2)   -> if (interp_e h e) <> 0
                     then interp_s h s1
                     else interp_s h s2
  | Assign(str,e) -> update h str (interp_e h e)
  | While(e,s1)   -> (* ??? *)

Page 49: Dan Grossman Winter 2009


About that heap

• In IMP, a heap maps strings to values
• Yes, we could use mutation, but that is:
  – less powerful (old heaps do not exist)
  – less explanatory (interpreter passes current heap)

type heap = (string * int) list

let rec lookup h str =
  match h with
    [] -> 0 (* kind of a cheat *)
  | (s,i)::tl -> if s=str then i else lookup tl str

let update h str i = (str,i)::h

• As a definition, this is great despite terrible waste of space

Page 50: Dan Grossman Winter 2009


Meanwhile, while

• Loops are always the hard part!

let rec interp_s (h:heap) (s:stmt) =
  match s with
  …
  | While(e,s1) -> if (interp_e h e) <> 0
                   then let h2 = interp_s h s1 in
                        interp_s h2 s
                   else h

• s is While(e,s1)
• Semi-troubling circular definition
  – That is, interp_s might not terminate

Page 51: Dan Grossman Winter 2009


Finishing the story

• Have interp_e and interp_s
• A “program” is just a statement
• An initial heap is (say) one that maps everything to 0

type heap = (string * int) list

let mt_heap = [] (* common PL pun *)

let interp_prog s = lookup (interp_s mt_heap s) "ans"

Fancy words: We have defined a large-step operational-semantics using Caml as our metalanguage
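Putting the pieces from the preceding slides into one file gives a runnable large-step interpreter. The sketch below only assembles the slide definitions; the test program fact5 (computing 5! into ans) is an assumption added for illustration, and since IMP has no subtraction it decrements x by adding -1.

```ocaml
type exp = Int of int | Var of string
         | Plus of exp * exp | Times of exp * exp
type stmt = Skip | Assign of string * exp | Seq of stmt * stmt
          | If of exp * stmt * stmt | While of exp * stmt
type heap = (string * int) list

let rec lookup (h : heap) str =
  match h with
  | [] -> 0                       (* unset variables read as 0 *)
  | (s, i) :: tl -> if s = str then i else lookup tl str

let update (h : heap) str i : heap = (str, i) :: h

let rec interp_e (h : heap) (e : exp) =
  match e with
  | Int i -> i
  | Var str -> lookup h str
  | Plus (e1, e2) -> interp_e h e1 + interp_e h e2
  | Times (e1, e2) -> interp_e h e1 * interp_e h e2

let rec interp_s (h : heap) (s : stmt) =
  match s with
  | Skip -> h
  | Assign (str, e) -> update h str (interp_e h e)
  | Seq (s1, s2) -> interp_s (interp_s h s1) s2
  | If (e, s1, s2) -> if interp_e h e <> 0 then interp_s h s1 else interp_s h s2
  | While (e, s1) -> if interp_e h e <> 0 then interp_s (interp_s h s1) s else h

let mt_heap : heap = []
let interp_prog s = lookup (interp_s mt_heap s) "ans"

(* Hypothetical test program:
   x := 5; ans := 1; while x (ans := ans * x; x := x + (-1)) *)
let fact5 =
  Seq (Assign ("x", Int 5),
  Seq (Assign ("ans", Int 1),
       While (Var "x",
              Seq (Assign ("ans", Times (Var "ans", Var "x")),
                   Assign ("x", Plus (Var "x", Int (-1)))))))

let result = interp_prog fact5
```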

Page 52: Dan Grossman Winter 2009


Fancy words

• Operational semantics
  – Definition by interpretation
  – Often implies metalanguage is “inference rules” (a mathematical formalism we’ll learn in a couple weeks)

• Large-step
  – Interpreter function “returns an answer” (or diverges)
  – So definition says nothing about intermediate computation
  – Simpler than small-step when that’s okay

Page 53: Dan Grossman Winter 2009


Language properties

• A semantics is necessary to prove language properties

• Example: Expression evaluation is total and deterministic

“For all heaps h and expressions e, there is exactly one integer i such that interp_e h e returns i”
  – Rarely true for “real” languages
  – But often care about subsets for which it is true

• Prove for all expressions by induction on the tree-height of an expression

Page 54: Dan Grossman Winter 2009


Small-step

• Now redo our interpreter with small-step
  – An expression/statement “becomes a slightly simpler thing”
  – A less efficient interpreter, but has advantages as a definition (discuss after interpreter)

            Large-step          Small-step
interp_e    heap->exp->int      heap->exp->exp
interp_s    heap->stmt->heap    heap->stmt->(heap*stmt)

Page 55: Dan Grossman Winter 2009


Example

Switching to concrete syntax, where each → is one call to interp_e and heap maps everything to 0

(x+3)+(y*z) → (0+3)+(y*z)
            → 3+(y*z)
            → 3+(0*z)
            → 3+(0*0)
            → 3+0
            → 3

Page 56: Dan Grossman Winter 2009


Small-step expressions

exception AlreadyValue

let rec interp_e (h:heap) (e:exp) =
  match e with
    Int i                -> raise AlreadyValue
  | Var str              -> Int (lookup h str)
  | Plus(Int i1,Int i2)  -> Int (i1+i2)
  | Plus(Int i1, e2)     -> Plus(Int i1,interp_e h e2)
  | Plus(e1, e2)         -> Plus(interp_e h e1,e2)
  | Times(Int i1,Int i2) -> Int (i1*i2)
  | Times(Int i1, e2)    -> Times(Int i1,interp_e h e2)
  | Times(e1, e2)        -> Times(interp_e h e1,e2)

“We just take one little step”

We chose “left to right”, but not important
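To see the steps one at a time, the stepper can be run repeatedly while recording each intermediate expression. The sketch below assumes two helper functions that are not in the slides, show (a fully parenthesized printer) and trace (repeated stepping until a value is reached), and applies them to the expression (x+3)+(y*z) under the empty heap.

```ocaml
type exp = Int of int | Var of string
         | Plus of exp * exp | Times of exp * exp
type heap = (string * int) list

exception AlreadyValue

let rec lookup (h : heap) str =
  match h with
  | [] -> 0
  | (s, i) :: tl -> if s = str then i else lookup tl str

(* One small step, left to right, as on the slide. *)
let rec interp_e (h : heap) (e : exp) =
  match e with
  | Int _ -> raise AlreadyValue
  | Var str -> Int (lookup h str)
  | Plus (Int i1, Int i2) -> Int (i1 + i2)
  | Plus (Int i1, e2) -> Plus (Int i1, interp_e h e2)
  | Plus (e1, e2) -> Plus (interp_e h e1, e2)
  | Times (Int i1, Int i2) -> Int (i1 * i2)
  | Times (Int i1, e2) -> Times (Int i1, interp_e h e2)
  | Times (e1, e2) -> Times (interp_e h e1, e2)

(* Hypothetical helper: print an expression with explicit parentheses. *)
let rec show e =
  match e with
  | Int i -> string_of_int i
  | Var s -> s
  | Plus (a, b) -> "(" ^ show a ^ "+" ^ show b ^ ")"
  | Times (a, b) -> "(" ^ show a ^ "*" ^ show b ^ ")"

(* Hypothetical helper: every expression seen on the way to a value. *)
let rec trace h e =
  match e with
  | Int _ -> [show e]
  | _ -> show e :: trace h (interp_e h e)

let example = Plus (Plus (Var "x", Int 3), Times (Var "y", Var "z"))
let steps = trace [] example
```

Printing steps with List.iter print_endline shows one line per state, which is the kind of trace a small-step definition makes available.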

Page 57: Dan Grossman Winter 2009


Small-step statements

let rec interp_s (h:heap) (s:stmt) =
  match s with
    Skip              -> raise AlreadyValue
  | Assign(str,Int i) -> ((update h str i),Skip)
  | Assign(str,e)     -> (h,Assign(str,interp_e h e))
  | Seq(Skip,s2)      -> (h,s2)
  | Seq(s1,s2)        -> let (h2,s3) = interp_s h s1 in
                         (h2,Seq(s3,s2))
  | If(Int i,s1,s2)   -> (h, if i <> 0 then s1 else s2)
  | If(e,s1,s2)       -> (h, If(interp_e h e, s1, s2))
  | While(e,s1)       -> (*???*)

Page 58: Dan Grossman Winter 2009


Meanwhile, while

• Loops are always the hard part!

let rec interp_s (h:heap) (s:stmt) =
  match s with
  …
  | While(e,s1) -> (h, If(e,Seq(s1,s),Skip))

• “A loop takes one step to its unrolling”
• s is While(e,s1)
• interp_s always terminates
• interp_prog may not terminate…

Page 59: Dan Grossman Winter 2009


Finishing the story

• Have interp_e and interp_s
• A “program” is just a statement
• An initial heap is (say) one that maps everything to 0

type heap = (string * int) list
let mt_heap = [] (* common PL pun *)
let interp_prog s =
  let rec loop (h,s) =
    match s with
      Skip -> lookup h "ans"
    | _ -> loop (interp_s h s)
  in loop (mt_heap,s)

Fancy words: We have defined a small-step operational-semantics using Caml as our metalanguage

Page 60: Dan Grossman Winter 2009


Small vs. large again

• Small is really inefficient – descends and rebuilds AST at every tiny step

• But as a definition, it gives a trace of program states
  – A state is a pair heap*stmt
  – Can talk about them e.g., “no state has x>17…”
  – Infinite loops now produce infinite traces rather than Caml just “hanging forever”

• Theorem: Total equivalence: interp_prog (large) returns i for s if and only if interp_prog (small) does
  – Proof is pretty tricky

• With the theorem, we can choose whatever semantics is most convenient for whatever else we want to prove

Page 61: Dan Grossman Winter 2009


Where are we

Definition by interpretation
• We have abstract syntax and two interpreters for our source language IMP
• Our metalanguage is Caml

Now definition by translation
• Abstract syntax and source language still IMP
• Metalanguage still Caml
• Target language now “Caml with just functions, strings, ints, and conditionals” (tricky stuff?)

Page 62: Dan Grossman Winter 2009


In pictures and equations

Source program → Compiler (in metalang) → Target program

• If the target language has a semantics, then:

compiler + targetSemantics = sourceSemantics

Page 63: Dan Grossman Winter 2009


Deep vs. shallow

• Meta and target can be the same language
  – Unusual for a “real” compiler
  – Makes example harder to follow

• Our target will be a subset of Caml
  – After translation, you could (in theory) “unload” the AST definition
  – This is a “deep embedding”
    • An IMP while loop becomes a function
    • Not a piece of data that says “I’m a while loop”
    • Shows you can really think of loops, assignments, etc. as “functions over heaps”

Page 64: Dan Grossman Winter 2009


Goals

• xlate_e: exp -> ((string->int)->int)
  – “given an exp, produce a function that given a function from strings to ints returns an int”
  – (string->int acts like a heap)
  – An expression “is” a function from heaps to ints

• xlate_s: stmt->((string->int)->(string->int))
  – A statement “is” a function from heaps to heaps

Page 65: Dan Grossman Winter 2009


Expression translation

let rec xlate_e (e:exp) =
  match e with
    Int i -> (fun h -> i)
  | Var str -> (fun h -> h str)
  | Plus(e1,e2) ->
      let f1 = xlate_e e1 in
      let f2 = xlate_e e2 in
      (fun h -> (f1 h) + (f2 h))
  | Times(e1,e2) ->
      let f1 = xlate_e e1 in
      let f2 = xlate_e e2 in
      (fun h -> (f1 h) * (f2 h))

xlate_e: exp -> ((string->int)->int)

Page 66: Dan Grossman Winter 2009


What just happened

(* an example *)
let e = Plus(Int 3, Times(Var "x", Int 4))
let f = xlate_e e (* compile *)
(* the value bound to f is a function whose body
   does not use any IMP abstract syntax! *)
let ans = f (fun s -> 0) (* run w/ empty heap *)

• Our target sublanguage:
  – Functions (including + and *, not interp_e)
  – Strings and integers
  – Variables bound to things in our sublanguage
  – (later: if-then-else)

• Note: No lookup until “run-time” (of course)

Page 67: Dan Grossman Winter 2009


Wrong

• This produces a program not in our sublanguage:

let rec xlate_e (e:exp) =
  match e with
    Int i -> (fun h -> i)
  | Var str -> (fun h -> h str)
  | Plus(e1,e2) ->
      (fun h -> (xlate_e e1 h) + (xlate_e e2 h))
  | Times(e1,e2) ->
      (fun h -> (xlate_e e1 h) * (xlate_e e2 h))

• Caml evaluates function bodies when called (like YFL)

• Waits until run-time to translate Plus and Times children!

Page 68: Dan Grossman Winter 2009


Now what?

• What’s left?
  – Statements
  – Programs (a.k.a. “finishing the story”)
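The slides stop before the statement translation, so what follows is only one possible sketch consistent with the stated type stmt->((string->int)->(string->int)), not the lecture's own code. It assumes heaps are functions from strings to ints, that assignment builds a new heap function with a string comparison, and that a while loop becomes a recursive function over heaps (which goes slightly beyond the "functions, strings, ints, and conditionals" sublanguage named earlier). The names xlate_prog and fact5 are made up for the example. As in xlate_e, all sub-statements are translated before any heap is supplied.

```ocaml
type exp = Int of int | Var of string
         | Plus of exp * exp | Times of exp * exp
type stmt = Skip | Assign of string * exp | Seq of stmt * stmt
          | If of exp * stmt * stmt | While of exp * stmt

let rec xlate_e (e : exp) : (string -> int) -> int =
  match e with
  | Int i -> (fun _ -> i)
  | Var str -> (fun h -> h str)
  | Plus (e1, e2) ->
      let f1 = xlate_e e1 in
      let f2 = xlate_e e2 in
      (fun h -> f1 h + f2 h)
  | Times (e1, e2) ->
      let f1 = xlate_e e1 in
      let f2 = xlate_e e2 in
      (fun h -> f1 h * f2 h)

(* Assumed design: a statement becomes a function from heaps to heaps. *)
let rec xlate_s (s : stmt) : (string -> int) -> (string -> int) =
  match s with
  | Skip -> (fun h -> h)
  | Assign (str, e) ->
      let f = xlate_e e in
      (fun h -> let i = f h in (fun x -> if x = str then i else h x))
  | Seq (s1, s2) ->
      let f1 = xlate_s s1 in
      let f2 = xlate_s s2 in
      (fun h -> f2 (f1 h))
  | If (e, s1, s2) ->
      let f = xlate_e e in
      let f1 = xlate_s s1 in
      let f2 = xlate_s s2 in
      (fun h -> if f h <> 0 then f1 h else f2 h)
  | While (e, s1) ->
      let f = xlate_e e in
      let f1 = xlate_s s1 in
      let rec loop h = if f h <> 0 then loop (f1 h) else h in
      loop

(* Compile once; running supplies the all-zero heap and reads "ans". *)
let xlate_prog (s : stmt) : unit -> int =
  let f = xlate_s s in
  (fun () -> f (fun _ -> 0) "ans")

(* Hypothetical test program: computes 5! into ans. *)
let fact5 =
  Seq (Assign ("x", Int 5),
  Seq (Assign ("ans", Int 1),
       While (Var "x",
              Seq (Assign ("ans", Times (Var "ans", Var "x")),
                   Assign ("x", Plus (Var "x", Int (-1)))))))

let run_fact5 = xlate_prog fact5
```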