Code Generation and Optimisation
Haskell for Compiler Writers
Implementing compilers
There are two basic requirements of any compiler implementation:
1. To represent the source / target program in a data structure, usually referred to as an abstract syntax tree.
2. To traverse the abstract syntax tree, extracting information and transforming it from one form to another.
Why Haskell?
Haskell has two main features which make it good for writing compilers.
1. Algebraic data types allow an abstract syntax tree to be easily constructed.
2. Pattern matching makes it easy to define functions that traverse an abstract syntax tree.
This lecture
How to:
define abstract syntax
transform abstract syntax trees
using Haskell.
Aim: to appreciate features of Haskell that make it good for implementing compilers.
FUNCTIONS AND APPLICATIONS
Principles of Haskell
Functions
The factorial function in Haskell:
fact :: Int -> Int
fact(n) = if n == 1 then 1 else n * fact(n-1)
The first line is the type signature, where :: is read "is of type". The second line is an equation.
Function names begin with a lower-case letter.
Application and reduction
A function applied to an input is called an application, e.g. fact(3).
An application reduces to the right-hand side of the first matching equation:
fact(3) ⇒ if 3 == 1 then 1 else 3 * fact(3-1)
The symbol ⇒ is read "reduces to".
Evaluation
Expressions are evaluated by repeatedly reducing applications:
fact(3)
⇒ if 3 == 1 then 1 else 3 * fact(3-1)
⇒ if False then 1 else 3 * fact(3-1)
⇒ 3 * fact(3-1)
⇒ 3 * fact(2)
⇒ ...
⇒ 6
In short, fact(3) ⇒* 6, where ⇒* is read "evaluates to".
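The definition above can be loaded into GHCi unchanged. The sketch below repeats it and adds a second version, factSafe, which is a hypothetical name not taken from the slides. It is included only to show that counting down from an input below 1 never reaches the slide's base case n == 1, whereas a guard on n <= 1 stops for such inputs.

```haskell
-- The slide's definition, exactly as written.
fact :: Int -> Int
fact(n) = if n == 1 then 1 else n * fact(n-1)

-- Hypothetical variant (not from the slides): the guard n <= 1 also
-- covers 0 and negative inputs, for which fact keeps counting down
-- without meeting its base case.
factSafe :: Int -> Int
factSafe n
  | n <= 1    = 1
  | otherwise = n * factSafe (n - 1)
```

In GHCi, fact(3) and factSafe(3) both evaluate to 6.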
LISTS AND TUPLES
Commonly used data types
Lists
The empty list is written [].
The list with head h and tail t is written h:t, where : is pronounced "cons".
[5,6,7] ≡ 5:6:7:[]
[1] ≡ 1:[] (the list containing 1)
['x', 'y'] ≡ 'x' : ('y':[])
The type of a list
The type of a list of elements of type a is written [a].
If x:xs is of type [a]
then x must have type a
and xs must have type [a].
For example, the list ['a', 'b'] is a value of type [Char].
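As a small check of the notation, the sketch below writes the same lists in bracket form and in cons form and gives each an explicit type. The names are illustrative only and do not come from the slides.

```haskell
-- Bracket notation and cons notation denote the same list.
bracketInts, consInts :: [Int]
bracketInts = [5,6,7]
consInts    = 5:6:7:[]

bracketChars, consChars :: [Char]
bracketChars = ['x', 'y']
consChars    = 'x' : ('y' : [])

-- A singleton list: head 1, tail [].
singleton :: [Int]
singleton = 1:[]
```

Comparing bracketInts == consInts in GHCi gives True, and asking for the type of ['a', 'b'] with :type reports [Char].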
Sum
A function to sum the elements of a list:
sum :: [Int] -> Int
sum([]) = 0
sum(x:xs) = x + sum(xs)
The definition consists of two equations.
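One practical note when trying this definition: the Prelude already exports a function called sum, so a file that defines its own needs to hide the Prelude's version. A minimal sketch, assuming the definition is placed in its own file:

```haskell
-- Hide the Prelude's sum so the slide's definition can use the name.
import Prelude hiding (sum)

sum :: [Int] -> Int
sum([]) = 0          -- the empty list sums to 0
sum(x:xs) = x + sum(xs)  -- head plus the sum of the tail
```

The first equation matches only the empty list and the second matches only a non-empty list, so exactly one equation applies at each reduction step.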
Exercise 1
Give the reduction steps to evaluate the following application.
sum([1,2,3])
Polymorphism
A function to compute the length of a list:
length :: [a] -> Int
length([]) = 0
length(x:xs) = 1 + length(xs)
It is a polymorphic function: it can be applied to a list of values of any type. The a in its type is a type variable.
Polymorphism
A couple more examples of polymorphic functions:
head :: [a] -> a
head(x:xs) = x
tail :: [a] -> [a]
tail(x:xs) = xs
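The three polymorphic functions can be tried together. As with sum, the Prelude exports the same names, so the sketch below hides them. The two definitions at the end are illustrative names, not from the slides, and show the same length being used at two element types.

```haskell
import Prelude hiding (length, head, tail)

length :: [a] -> Int
length([]) = 0
length(x:xs) = 1 + length(xs)

head :: [a] -> a
head(x:xs) = x

tail :: [a] -> [a]
tail(x:xs) = xs

-- The type variable a is instantiated to Int here ...
lengthOfInts :: Int
lengthOfInts = length([1 :: Int, 2, 3])

-- ... and to Char here.
lengthOfChars :: Int
lengthOfChars = length(['a', 'b'])
```

Note that head and tail have no equation for [], so applying either to the empty list fails at run time with a pattern-match error.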
Tuples
If x1, x2, …, xn are values of types t1, t2, …, tn respectively, then the tuple (x1, x2, …, xn) is a value of type (t1, t2, …, tn).
For example, the following values are of type (Char, [Int]):
('a', [])
('b', [9])
('z', [5,6,7])
Multiple inputs
Tuples can be used to pass multiple inputs to a function. For example:
min :: (Int, Int) -> Int
min(x, y) = if x < y then x else y
min(5, 10) ⇒* 5
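The tuple examples can be checked with a short file. The name examples is illustrative and not from the slides. The Prelude's own min is hidden because it takes its two arguments one after the other, as in min 5 10, rather than as a single pair.

```haskell
import Prelude hiding (min)

-- Three values of type (Char, [Int]), collected in a list.
examples :: [(Char, [Int])]
examples = [('a', []), ('b', [9]), ('z', [5,6,7])]

-- The slide's min: one input, which happens to be a pair.
min :: (Int, Int) -> Int
min(x, y) = if x < y then x else y
```

Because the pattern (x, y) matches any pair, a single equation is enough for this function.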
Exercise 2
Define a function
append :: ([a], [a]) -> [a]
that joins two lists into a single list, e.g.
append([1,4,2], [3,4]) ⇒* [1,4,2,3,4]
Homework Exercise
Define a function
first :: (Int, [a]) -> [a]
such that first(n, xs) returns the first n elements of the list xs, e.g.
first(2, [9,8,3,5]) ⇒* [9,8]
first(4, [3,5]) ⇒* [3,5]
Infix operators
Infix operators can be defined for functions of two arguments. For example, the definition
xs ++ ys = append(xs, ys)
allows ++ to be used as follows.
[1,2] ++ [3] ++ [4,5,6] ⇒* [1,2,3,4,5,6]
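To run the ++ example, append must be defined first. The sketch below uses one possible definition of append, so it also gives away an answer to Exercise 2; readers who want to attempt the exercise should do so before reading on. The Prelude's ++ is hidden so the name can be reused.

```haskell
import Prelude hiding ((++))

-- One possible definition of append (an assumption, not the slides' own).
append :: ([a], [a]) -> [a]
append([], ys) = ys
append(x:xs, ys) = x : append(xs, ys)

-- Same fixity as the Prelude gives its own ++.
infixr 5 ++

(++) :: [a] -> [a] -> [a]
xs ++ ys = append(xs, ys)
```

With this file loaded, [1,2] ++ [3] ++ [4,5,6] evaluates to [1,2,3,4,5,6].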
Precedence and associativity
Any Haskell operator can be given a precedence (from 0 to 9) and left, right, or non-associativity.
For example, we can write:
infixl 6 -
infixr 7 *
So x-y-z is interpreted as (x-y)-z.
And x-y*z is interpreted as x-(y*z).
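The effect of a fixity declaration is easiest to see on operators whose result depends on grouping. The sketch below defines three hypothetical operators (.-., .-: and .*. are names invented for this example) that behave like subtraction and multiplication but carry their own fixity declarations.

```haskell
-- Left-associative subtraction at precedence 6.
infixl 6 .-.
(.-.) :: Int -> Int -> Int
x .-. y = x - y

-- The same operation, declared right-associative instead.
infixr 6 .-:
(.-:) :: Int -> Int -> Int
x .-: y = x - y

-- Multiplication at the higher precedence 7.
infixl 7 .*.
(.*.) :: Int -> Int -> Int
x .*. y = x * y

leftGrouped :: Int
leftGrouped = 10 .-. 3 .-. 2     -- parsed as (10 .-. 3) .-. 2

rightGrouped :: Int
rightGrouped = 10 .-: 3 .-: 2    -- parsed as 10 .-: (3 .-: 2)

tighterFirst :: Int
tighterFirst = 10 .-. 3 .*. 2    -- parsed as 10 .-. (3 .*. 2)
```

Here leftGrouped is 5, rightGrouped is 9, and tighterFirst is 4: only the fixity declarations differ between the first two.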
Exercise 3
In Haskell, we have:
infixr 5 ++
Why make ++ right-associative?
USER-DEFINED TYPES
Type synonyms and algebraic data types
Type synonyms
Type synonyms allow a new (more meaningful) name to be given to an existing type, e.g.
type String = [Char]
Here String is the new name and [Char] is the existing type. The new type String is entirely equivalent to [Char]. In Haskell:
['h', 'i', '!'] ≡ "hi!"
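A short sketch of what "entirely equivalent" means in practice. The synonym Name and the function backwards are hypothetical, introduced only for this example; String is the Prelude's own synonym.

```haskell
-- A hypothetical synonym layered on top of String.
type Name = String

-- A string literal and an explicit list of characters are the same value.
greeting :: String
greeting = "hi!"

spelledOut :: [Char]
spelledOut = ['h', 'i', '!']

-- Any list function works at type Name without conversion,
-- because Name, String and [Char] are the same type.
backwards :: Name -> Name
backwards(n) = reverse(n)
```

The comparison greeting == spelledOut type-checks and is True, which would not be possible if String were a distinct type.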
Algebraic data types
A data definition introduces a new type, and a set of constructors that can be used to create values of that type.
data Bool = True | False
data Colour = Red | Green | Blue
In each definition, the name before = is the data type and the names after it are its data constructors.
Type and constructor names begin with an upper-case letter.
Pattern matching
Examples of functions involving Bool and Colour:
not :: Bool -> Bool
not(False) = True
not(True) = False
isRed :: Colour -> Bool
isRed(Red) = True
isRed(x) = False
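The Colour example can be run as a file of its own. In this sketch the deriving clause and the function next are additions that do not appear on the slides: deriving lets GHCi print and compare colours, and next shows the one-equation-per-constructor style. The catch-all equation of isRed is written with the wildcard _, which matches anything like the variable x but signals that the value is unused.

```haskell
data Colour = Red | Green | Blue
  deriving (Show, Eq)  -- added so values can be printed and compared

isRed :: Colour -> Bool
isRed(Red) = True
isRed(_) = False       -- reached only when the first equation fails

-- Hypothetical extra function: one equation per constructor.
next :: Colour -> Colour
next(Red) = Green
next(Green) = Blue
next(Blue) = Red
```

Since an application reduces using the first matching equation, the catch-all equation of isRed has to come last; placed first, it would match Red as well.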
Shapes
A data constructor may have associated components, e.g.
data Shape = Circ(Float) | Rect(Float, Float)
Example values of type Shape:
Circ(10.5), whose component value is the radius
Rect(10.2, 20.9), whose component values are the width and height
Area
A function to compute the area of any given shape.
area :: Shape -> Float
area(Rect(w, h)) = w * h
area(Circ(r)) = pi * r * r
(Compare with C code for same task in LSA Chapter 2.)
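Put together, the Shape type and area form a complete program fragment. The sketch below also adds perimeter, a hypothetical second function not on the slides, to show that further traversals follow the same shape: one equation per constructor, with the components bound by the pattern.

```haskell
data Shape = Circ(Float) | Rect(Float, Float)

area :: Shape -> Float
area(Rect(w, h)) = w * h
area(Circ(r)) = pi * r * r

-- Hypothetical extra function in the same style.
perimeter :: Shape -> Float
perimeter(Rect(w, h)) = 2 * (w + h)
perimeter(Circ(r)) = 2 * pi * r
```

For example, area(Rect(2, 3)) evaluates to 6.0.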
CASE STUDY
A simplifier for arithmetic expressions.
Concrete syntax
Here is a concrete syntax for arithmetic expressions.
v = [a-z]+
n = [0-9]+
e → v | n | e + e | e * e | ( e )
Example expression:
x * y + (z * 10)
Simplification
Consider the algebraic law:
∀e. e * 1 = e
Example simplification:
x * (y * 1) → x * y
This law can be used to simplify expressions by using it as a rewrite rule from left to right.
Problem
1. Define an abstract syntax, in Haskell, for arithmetic.
2. Implement the simplification rule as a Haskell function over abstract syntax trees.
Abstract syntax
data Op = Add | Mul
data Expr = Num(Int) | Var(String) | Apply(Expr, Op, Expr)
An op is an addition or a multiplication.
An expression is a number, or a variable, or an application of an op to two sub-expressions.
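To make the abstract syntax concrete, the sketch below builds the two example expressions from the case study as values of type Expr and gives one possible implementation of the rewrite rule e * 1 = e. It rests on several assumptions: the deriving clauses are added so that trees can be printed and compared, x * y + (z * 10) is read with * binding tighter than +, and simplify and its helper mulOne are an illustrative design rather than the lecture's own solution.

```haskell
data Op = Add | Mul
  deriving (Show, Eq)

data Expr = Num(Int) | Var(String) | Apply(Expr, Op, Expr)
  deriving (Show, Eq)

-- x * y + (z * 10), assuming * binds tighter than +
example :: Expr
example = Apply(Apply(Var("x"), Mul, Var("y")), Add, Apply(Var("z"), Mul, Num(10)))

-- x * (y * 1)
redundant :: Expr
redundant = Apply(Var("x"), Mul, Apply(Var("y"), Mul, Num(1)))

-- Simplify both operands first, then apply the rule at this node.
simplify :: Expr -> Expr
simplify(Apply(e1, Mul, e2)) = mulOne(simplify(e1), simplify(e2))
simplify(Apply(e1, op, e2)) = Apply(simplify(e1), op, simplify(e2))
simplify(e) = e

-- Hypothetical helper: rebuild a multiplication, dropping a right operand of 1.
mulOne :: (Expr, Expr) -> Expr
mulOne(e, Num(1)) = e
mulOne(e1, e2) = Apply(e1, Mul, e2)
```

Here simplify(redundant) yields Apply(Var "x", Mul, Var "y"), the tree for x * y, while simplify(example) returns example unchanged because it contains no multiplication by 1. Only the law as stated is implemented, so 1 * e is left alone.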