Compilation 2007

Abstract Syntax Trees

Michael I. Schwartzbach

BRICS, University of Aarhus

Syntax Trees Carry Information

[Figure: the syntax tree of a method void printNumber(int number, int base). The body is a sequence consisting of a declaration (String ns, initialized to ""), an if statement (number == 0, assigning "0" to ns), a while loop (number > 0, whose body assigns to ns and number using %, / and concat), and a call to System.out.print. The nodes are annotated with their types: int, boolean, or String.]
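Read back from the node labels of the figure, the tree appears to correspond to a method along the following lines. This is a reconstruction, not source shown on the slide; the class name PrintNumberSketch and the helper render (which returns the string so that it can be checked) are assumptions made for illustration.

```java
// Hypothetical reconstruction of the method whose syntax tree the figure shows.
class PrintNumberSketch {
  // Builds the digits of number in the given base, mirroring the tree's statements.
  static String render(int number, int base) {
    String ns = "";                 // decl: String ns = ""
    if (number == 0) ns = "0";      // if (number == 0) ns = "0"
    while (number > 0) {            // while (number > 0)
      ns = (number % base) + ns;    // concat: an int operand and a String operand
      number = number / base;       // integer division
    }
    return ns;
  }

  static void printNumber(int number, int base) {
    System.out.print(render(number, base) + "\n");
  }

  public static void main(String[] args) {
    printNumber(10, 2);             // prints 1010
  }
}
```

The type annotations in the figure line up with this reading: the comparisons are boolean, the arithmetic is int, and everything that flows into ns is String.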

Syntax Trees in SableCC

• SableCC creates the parse tree automatically
• A common superclass of nodes:

public abstract class Node implements Switchable, Cloneable {
  public abstract Object clone();
  public Node parent() {...}
  void parent(Node parent) {...}
  abstract void removeChild(Node child);
  abstract void replaceChild(Node oldChild, Node newChild);
  public void replaceBy(Node node) {...}
  protected String toString(Node node) {...}
  protected String toString(List list) {...}
  protected Node cloneNode(Node node) {...}
  protected List cloneList(List list) {...}
  ...
}

Tokens

Tokens are a special kind of node:

public abstract class Token extends Node {
  public String getText() {...}
  public void setText(String text) {...}
  public int getLine() {...}
  public void setLine(int line) {...}
  public int getPos() {...}
  public void setPos(int pos) {...}
  public String toString() {...}
  ...
}

Tree Invariant

• SableCC trees are guaranteed to be tree shaped
• If a node is moved around, it loses its parent
• Use the clone() method instead of sharing
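A minimal sketch of how such an invariant can be maintained, assuming a hand-written miniature node hierarchy. MiniNode, Leaf, Box and TreeInvariantDemo are hypothetical names, not SableCC's generated classes; the point is only that attaching a node to a new parent detaches it from the old one.

```java
// Hypothetical miniature of a tree-shaped node structure; not SableCC's generated code.
abstract class MiniNode {
  private MiniNode parent;
  MiniNode parent() { return parent; }
  void parent(MiniNode p) { parent = p; }
  abstract void removeChild(MiniNode child);
  @Override public abstract MiniNode clone();
}

final class Leaf extends MiniNode {
  final String text;
  Leaf(String text) { this.text = text; }
  @Override void removeChild(MiniNode child) { }          // a leaf has no children
  @Override public Leaf clone() { return new Leaf(text); }
}

final class Box extends MiniNode {
  private MiniNode child;
  MiniNode getChild() { return child; }

  void setChild(MiniNode node) {
    if (child != null) child.parent(null);                // old child becomes an orphan
    if (node != null) {
      if (node.parent() != null) node.parent().removeChild(node);  // detach from old parent
      node.parent(this);
    }
    child = node;
  }

  @Override void removeChild(MiniNode c) { if (child == c) child = null; }

  @Override public Box clone() {
    Box copy = new Box();
    if (child != null) copy.setChild(child.clone());      // deep copy, never shared
    return copy;
  }
}

class TreeInvariantDemo {
  public static void main(String[] args) {
    Leaf x = new Leaf("x");
    Box a = new Box();
    Box b = new Box();
    a.setChild(x);
    b.setChild(x);                                        // x moves from a to b
    System.out.println(a.getChild() == null);             // true
    a.setChild(x.clone());                                // use a copy instead of sharing
    System.out.println(b.getChild() == x);                // true
  }
}
```

Because a node can have at most one parent, the structure can never degenerate into a DAG, which is why sharing a subtree requires clone().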

Our Favorite Grammar in SableCC

Helpers
  tab = 9;
  cr = 13;
  lf = 10;

Tokens
  eol = cr | lf | cr lf;
  blank = ' ' | tab;
  star = '*';
  slash = '/';
  plus = '+';
  minus = '-';
  l_par = '(';
  r_par = ')';
  id = 'x' | 'y' | 'z';

Ignored Tokens
  blank, eol;

Productions
  start  = {plus}   start plus term |
           {minus}  start minus term |
           {term}   term;
  term   = {mult}   term star factor |
           {div}    term slash factor |
           {factor} factor;
  factor = {id}     id |
           {paren}  l_par start r_par;

Concrete Classes for Tokens

public final class TId extends Token {...}
public final class TBlank extends Token {...}
public final class TEol extends Token {...}
public final class TLPar extends Token {...}
public final class TMinus extends Token {...}
public final class TPlus extends Token {...}
public final class TRPar extends Token {...}
public final class TSlash extends Token {...}
public final class TStar extends Token {...}

Abstract Classes for Nonterminals

public abstract class PFactor extends Node {...}
public abstract class PStart extends Node {...}
public abstract class PTerm extends Node {...}

Concrete Classes for Productions

public final class APlusStart extends PStart {...}
public final class AMinusStart extends PStart {...}
public final class ATermStart extends PStart {...}
public final class AMultTerm extends PTerm {...}
public final class ADivTerm extends PTerm {...}
public final class AFactorTerm extends PTerm {...}
public final class AIdFactor extends PFactor {...}
public final class AParenFactor extends PFactor {...}

Naming Conventions

• Production: foo = {bar} baz | {qux} quux
• Abstract class: PFoo
• Concrete classes: ABarFoo, AQuxFoo
• Generated enum: EFoo = {BAR, QUX}
• Every PFoo node has a kindPFoo() method
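To make the convention concrete, here is a hand-written miniature of what the classes for the production foo might look like. It is a simplified sketch: the real generated classes extend Node and carry their children, and NamingDemo and describe are hypothetical names added for illustration.

```java
// Hypothetical miniature of the classes generated for
//   foo = {bar} baz | {qux} quux
enum EFoo { BAR, QUX }

abstract class PFoo {
  abstract EFoo kindPFoo();                 // tells which alternative a node is
}

final class ABarFoo extends PFoo {
  @Override EFoo kindPFoo() { return EFoo.BAR; }
}

final class AQuxFoo extends PFoo {
  @Override EFoo kindPFoo() { return EFoo.QUX; }
}

class NamingDemo {
  // Switching on the kind avoids a chain of instanceof tests.
  static String describe(PFoo foo) {
    switch (foo.kindPFoo()) {
      case BAR: return "the bar alternative";
      case QUX: return "the qux alternative";
      default:  return "unknown";
    }
  }

  public static void main(String[] args) {
    System.out.println(describe(new AQuxFoo()));   // prints: the qux alternative
  }
}
```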

Parse Trees Use T-Nodes and A-Nodes

The parse tree for x*y+z:

APlusStart
  ATermStart
    AMultTerm
      AFactorTerm
        AIdFactor
          TId (x)
      TStar
      AIdFactor
        TId (y)
  TPlus
  AFactorTerm
    AIdFactor
      TId (z)

Irrelevant Details

• LALR(1) parse trees have irrelevant details
• There is no semantic distinction between:
  • PStart
  • PTerm
  • PFactor
• The extra structure complicates traversals...

Abstract Syntax Trees

An AST records only semantically relevant information:

ABinopExp
  ABinopExp
    AVarExp
      TId
    AMulBinop
    AVarExp
      TId
  AAddBinop
  AVarExp
    TId

ASTs in SableCC

• The AST could be built by hand, by traversing the parse tree and creating the new nodes
• We would quickly get tired of this...
• SableCC allows another grammar for the ASTs
• Productions define an inductive mapping
• An AST grammar is a recursive datatype

Our Favorite Grammar with ASTs (1/2)

Helpers
  tab = 9;
  cr = 13;
  lf = 10;

Tokens
  eol = cr | lf | cr lf;
  blank = ' ' | tab;
  star = '*';
  slash = '/';
  plus = '+';
  minus = '-';
  l_par = '(';
  r_par = ')';
  id = 'x' | 'y' | 'z';

Ignored Tokens
  blank, eol;

Our Favorite Grammar with ASTs (2/2)

Productions
  start {-> exp} =
    {plus} start plus term
      {-> New exp.binop(start.exp, New binop.add(), term.exp)} |
    {minus} start minus term
      {-> New exp.binop(start.exp, New binop.sub(), term.exp)} |
    {term} term {-> term.exp};

  term {-> exp} =
    {mult} term star factor
      {-> New exp.binop(term.exp, New binop.mul(), factor.exp)} |
    {div} term slash factor
      {-> New exp.binop(term.exp, New binop.div(), factor.exp)} |
    {factor} factor {-> factor.exp};

  factor {-> exp} =
    {id} id {-> New exp.var(id)} |
    {paren} l_par start r_par {-> start.exp};

Abstract Syntax Tree
  exp = {binop} [l]:exp binop [r]:exp | {var} id;
  binop = {add} | {sub} | {mul} | {div};
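The Abstract Syntax Tree section of the grammar is a recursive datatype. Written by hand in plain Java it might look roughly as follows; Exp, BinopExp, VarExp, Binop and AstSketch are hypothetical stand-ins for the generated PExp, ABinopExp, AVarExp and binop classes, kept deliberately small.

```java
// Hypothetical hand-written counterpart of the AST grammar
//   exp   = {binop} [l]:exp binop [r]:exp | {var} id;
//   binop = {add} | {sub} | {mul} | {div};
enum Binop { ADD, SUB, MUL, DIV }

abstract class Exp { }

final class BinopExp extends Exp {
  final Exp l;
  final Binop op;
  final Exp r;
  BinopExp(Exp l, Binop op, Exp r) { this.l = l; this.op = op; this.r = r; }
}

final class VarExp extends Exp {
  final String id;
  VarExp(String id) { this.id = id; }
}

class AstSketch {
  // Structural recursion over the datatype: one case per alternative.
  static String show(Exp e) {
    if (e instanceof VarExp) return ((VarExp) e).id;
    BinopExp b = (BinopExp) e;
    return "(" + show(b.l) + " " + b.op + " " + show(b.r) + ")";
  }

  public static void main(String[] args) {
    // The AST that the grammar above would build for x*y+z.
    Exp e = new BinopExp(
        new BinopExp(new VarExp("x"), Binop.MUL, new VarExp("y")),
        Binop.ADD,
        new VarExp("z"));
    System.out.println(show(e));   // prints ((x MUL y) ADD z)
  }
}
```

Note that the distinction between start, term and factor has disappeared: precedence is now encoded in the shape of the tree alone.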

Traversal Support

• Many applications need to traverse the AST
• SableCC adds automatic support
• An extended visitor pattern: AnalysisAdapter
• A specialization: DepthFirstAdapter
• These are generated for the given AST

AnalysisAdapter

• For each node class XYZ, the adapter has a method
    public void caseXYZ(XYZ node)
  that may be overridden.
• Each node class has a method
    public void apply(...)
  that accepts an AnalysisAdapter and invokes the appropriate caseXYZ method.

DepthFirstAdapter

• A subclass of AnalysisAdapter
• For each node type XYZ it has two further methods:
    public void inXYZ(XYZ node)
    public void outXYZ(XYZ node)
• The caseXYZ methods are implemented to perform a depth-first traversal of the AST, calling inXYZ before and outXYZ after visiting the children
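The interplay of apply, caseXYZ, inXYZ and outXYZ can be sketched with a hand-written miniature. The classes N, Var, Add, Adapter, DepthFirst, Postfix and TraversalDemo below are hypothetical and only imitate the shape of the generated code for a two-node AST.

```java
// Hypothetical miniature of the generated visitor classes for a two-node AST.
abstract class N {
  abstract void apply(Adapter a);            // double dispatch into caseXYZ
}

final class Var extends N {
  final String id;
  Var(String id) { this.id = id; }
  @Override void apply(Adapter a) { a.caseVar(this); }
}

final class Add extends N {
  final N l, r;
  Add(N l, N r) { this.l = l; this.r = r; }
  @Override void apply(Adapter a) { a.caseAdd(this); }
}

// Counterpart of AnalysisAdapter: every case method does nothing by default.
class Adapter {
  public void caseVar(Var node) { }
  public void caseAdd(Add node) { }
}

// Counterpart of DepthFirstAdapter: the case methods walk the children
// and call the in/out hooks before and after.
class DepthFirst extends Adapter {
  public void inVar(Var node) { }
  public void outVar(Var node) { }
  public void inAdd(Add node) { }
  public void outAdd(Add node) { }

  @Override public void caseVar(Var node) { inVar(node); outVar(node); }

  @Override public void caseAdd(Add node) {
    inAdd(node);
    node.l.apply(this);
    node.r.apply(this);
    outAdd(node);
  }
}

// A client only overrides the hooks it cares about.
class Postfix extends DepthFirst {
  final StringBuilder out = new StringBuilder();
  @Override public void outVar(Var node) { out.append(node.id).append(' '); }
  @Override public void outAdd(Add node) { out.append("+ "); }
}

class TraversalDemo {
  static String postfix(N tree) {
    Postfix p = new Postfix();
    tree.apply(p);
    return p.out.toString().trim();
  }

  public static void main(String[] args) {
    N tree = new Add(new Add(new Var("x"), new Var("y")), new Var("z"));
    System.out.println(postfix(tree));       // prints x y + z +
  }
}
```

Overriding a case method replaces the traversal for that node type, whereas overriding an in or out method keeps the traversal and merely hooks into it.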

Pretty Printing (1/3)

import analysis.*;
import node.*;
import java.io.*;

public class PrettyPrint extends AnalysisAdapter {
  private PrintStream out;

  public PrettyPrint(PrintStream out) {
    this.out = out;
  }

  private void print(Token t) {
    print(t.getText());
  }

  private void print(Node n) {
    if (n == null) print("<<<null>>>");
    else n.apply(this);
  }

Pretty Printing (2/3)

  private void print(Object o) {
    out.print(o.toString());
  }

  public @Override void caseABinopExp(ABinopExp binopexp) {
    out.print("(");
    print(binopexp.getL());
    print(binopexp.getBinop());
    print(binopexp.getR());
    out.print(")");
  }

  public @Override void caseAVarExp(AVarExp varexp) {
    print(varexp.getId());
  }

Pretty Printing (3/3)

  public @Override void caseAAddBinop(AAddBinop addbinop) {
    out.print("+");
  }

  public @Override void caseASubBinop(ASubBinop subbinop) {
    out.print("-");
  }

  public @Override void caseAMulBinop(AMulBinop mulbinop) {
    out.print("*");
  }

  public @Override void caseADivBinop(ADivBinop divbinop) {
    out.print("/");
  }
}

Evaluating the Expressions

• A typical task for the DepthFirstAdapter
• But it only has void methods...
• We must add a value field to all AST nodes (and other compiler phases will add further fields)
• But this means changing all the generated files
• What happens when they are regenerated?

Aspects

• An aspect is in most ways similar to a class
• There is only one instance, obtained with aspectOf()
• Cool new ability: an aspect can add to other classes:
  • new fields
  • new methods
  • new interfaces
• The AspectJ compiler weaves things together
• The full AspectJ language is much more general

The Evaluate Aspect (1/2)

import analysis.*;
import node.*;

public aspect Evaluate extends DepthFirstAdapter {
  int x,y,z;

  public Evaluate setEnv(int x, int y, int z) {
    this.x = x;
    this.y = y;
    this.z = z;
    return this;
  }

  public int PExp.value; /* inject a value field into the PExp class */

  public @Override void outABinopExp(ABinopExp binopexp) {
    binopexp.value = eval(binopexp.getL().value,
                          binopexp.getBinop().kindPBinop(),
                          binopexp.getR().value);
  }

The Evaluate Aspect (2/2)

  public @Override void outAVarExp(AVarExp varexp) {
    String id = varexp.getId().getText();
    if (id.equals("x")) varexp.value = x;
    else if (id.equals("y")) varexp.value = y;
    else if (id.equals("z")) varexp.value = z;
  }

  int eval(int l, EBinop op, int r) {
    switch(op) {
      case ADD: return l+r;
      case SUB: return l-r;
      case MUL: return l*r;
      case DIV: return l/r;
    }
    return 0;
  }
}
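The aspect needs the AspectJ compiler. For readers who want to run the evaluation logic with plain javac, here is a sketch that stores the computed values in a side table (an IdentityHashMap keyed by node) instead of an injected field. This is a swapped-in technique, not the approach taken on the slides, and E, V, B, Op and SideTableEval are hypothetical names.

```java
import java.util.IdentityHashMap;
import java.util.Map;

enum Op { ADD, SUB, MUL, DIV }

// Hypothetical miniature AST, standing in for PExp, AVarExp and ABinopExp.
abstract class E { }

final class V extends E {
  final String id;
  V(String id) { this.id = id; }
}

final class B extends E {
  final E l, r;
  final Op op;
  B(E l, Op op, E r) { this.l = l; this.op = op; this.r = r; }
}

class SideTableEval {
  // Plays the role of the injected PExp.value field.
  private final Map<E, Integer> value = new IdentityHashMap<>();
  private final int x, y, z;

  SideTableEval(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

  // Post-order walk: children first, then the node itself (the role of the out methods).
  int eval(E e) {
    int result;
    if (e instanceof V) {
      result = lookup(((V) e).id);
    } else {
      B b = (B) e;
      int l = eval(b.l);
      int r = eval(b.r);
      switch (b.op) {
        case ADD: result = l + r; break;
        case SUB: result = l - r; break;
        case MUL: result = l * r; break;
        default:  result = l / r; break;   // DIV; throws ArithmeticException if r == 0
      }
    }
    value.put(e, result);
    return result;
  }

  private int lookup(String id) {
    if (id.equals("x")) return x;
    if (id.equals("y")) return y;
    if (id.equals("z")) return z;
    return 0;                              // unknown identifiers default to 0
  }

  Integer valueOf(E e) { return value.get(e); }

  public static void main(String[] args) {
    E tree = new B(new B(new V("x"), Op.MUL, new V("y")), Op.ADD, new V("z"));
    System.out.println(new SideTableEval(2, 3, 4).eval(tree));   // prints 10
  }
}
```

The side table keeps the generated files untouched, at the cost of a map lookup per access and no static guarantee that a value has been computed for a given node.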

Using Aspects and Adapters

import parser.*;
import lexer.*;
import node.*;
import java.io.*;

class Main {
  public static void main(String[] args) {
    try {
      Parser p =
        new Parser(
          new Lexer(
            new PushbackReader(new InputStreamReader(System.in))));
      int x,y,z;
      x = Integer.parseInt(args[0]);
      y = Integer.parseInt(args[1]);
      z = Integer.parseInt(args[2]);
      Start tree = p.parse(); /* parse the input */
      PExp exp = tree.getExp();
      exp.apply(new PrettyPrint(System.out)); /* pretty print */
      exp.apply(Evaluate.aspectOf().setEnv(x,y,z)); /* evaluate */
      System.out.println(exp.value);
    }
    catch(Exception e) { System.out.println(e); }
  }
}

Manipulating ASTs

• Desugaring: locally translate constructs into simpler forms
• Weeding: reject unwanted ASTs
• Transforming: rewrite sub-ASTs

An HTML Subset

HTML → word* |
       <a href="word"> HTML </a> |
       <b> HTML </b> |
       <i> HTML </i> |
       <em> HTML </em>

HTML in SableCC (1/2)

Helpers
  tab = 9;
  cr = 13;
  lf = 10;
  char = ['a'..'z'] | ['A'..'Z'] | ['0'..'9'];

Tokens
  eol = cr | lf | cr lf;
  blank = ' ' | tab;
  starta = '<a';
  href = 'href';
  eq = '=';
  quote = '"';
  gt = '>';
  enda = '</a>';
  startb = '<b>';
  starti = '<i>';
  startem = '<em>';
  endb = '</b>';
  endi = '</i>';
  endem = '</em>';
  word = char+;

Ignored Tokens
  blank, eol;

HTML in SableCC (2/2)

Productions
  html = {word} word* |
         {a}    starta href eq [quote1]:quote word [quote2]:quote gt html enda |
         {b}    startb html endb |
         {i}    starti html endi |
         {em}   startem html endem;

Desugaring

• View <em> as syntactic sugar for <i>
• Just perform the translation during AST building:

Productions
  html {-> html} =
    {word} word*
      {-> New html.word([word])} |
    {a} starta href eq [quote1]:quote word [quote2]:quote gt html enda
      {-> New html.a(word, html.html)} |
    {b} startb html endb
      {-> New html.b(html.html)} |
    {i} starti html endi
      {-> New html.i(html.html)} |
    {em} startem html endem
      {-> New html.i(html.html)};

Abstract Syntax Tree
  html = {word} word* | {a} word html | {b} html | {i} html;

Weeding

• Don't allow nested anchors
• One solution is to rewrite the grammar:

HTML → word* |
       <a href="word"> HTMLNoAnchor </a> |
       <b> HTML </b> |
       <i> HTML </i> |
       <em> HTML </em>

HTMLNoAnchor → word* |
               <b> HTMLNoAnchor </b> |
               <i> HTMLNoAnchor </i> |
               <em> HTMLNoAnchor </em>

Combinatorial Explosion

• We just doubled the size of the grammar
• Enforcing 10 constraints like this makes the grammar 2^10 = 1024 times larger
• And impossible to maintain...

A Weeding Phase

import node.*;
import analysis.*;

public class Weeding extends DepthFirstAdapter {
  int aHeight = 0;

  public @Override void inAAHtml(AAHtml node) {
    if (aHeight>0) System.out.println("Nested anchors");
    aHeight++;
  }

  public @Override void outAAHtml(AAHtml node) {
    aHeight--;
  }
}
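The counter technique does not depend on SableCC. A self-contained sketch on a hand-written miniature tree, in which H, Words, Tag and WeedSketch are hypothetical names and errors are collected in a list rather than printed so that they can be inspected:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature HTML tree: a run of words, or a tag around a body.
abstract class H { }

final class Words extends H {
  final String text;
  Words(String text) { this.text = text; }
}

final class Tag extends H {
  final String name;
  final H body;
  Tag(String name, H body) { this.name = name; this.body = body; }
}

class WeedSketch {
  private int aHeight = 0;                        // number of enclosing <a> tags
  final List<String> errors = new ArrayList<>();

  void visit(H h) {
    if (!(h instanceof Tag)) return;              // words have no children
    Tag t = (Tag) h;
    boolean anchor = t.name.equals("a");
    if (anchor) {                                 // the role of inAAHtml
      if (aHeight > 0) errors.add("Nested anchors");
      aHeight++;
    }
    visit(t.body);
    if (anchor) aHeight--;                        // the role of outAAHtml
  }

  static int countErrors(H h) {
    WeedSketch w = new WeedSketch();
    w.visit(h);
    return w.errors.size();
  }

  public static void main(String[] args) {
    H bad = new Tag("a", new Tag("b", new Tag("a", new Words("x"))));
    System.out.println(countErrors(bad));         // prints 1
  }
}
```

The grammar stays at its original size; the constraint costs one counter and two small methods.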

Transformation

• Eliminate nested <b> tags
• Again, one solution is to rewrite the grammar:

HTML → word* |
       <a href="word"> HTML </a> |
       <b> HTMLInsideB </b> |
       <i> HTML </i> |
       <em> HTML </em>

HTMLInsideB → word* |
              <a href="word"> HTMLInsideB </a> |
              <b> HTMLInsideB </b> |      (ignore this in the AST)
              <i> HTMLInsideB </i> |
              <em> HTMLInsideB </em>

Combinatorial Explosion

• This also doubles the size of the grammar
• Detecting 7 conditions like this makes the grammar 2^7 = 128 times larger
• Combined with the earlier 10 constraints, the grammar is now 131,072 times larger, with nonterminals such as:
  HTMLInsideBNotInsideINoAnchor...
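The arithmetic behind these figures, assuming that the constraints are independent and that each one doubles the grammar, so that k constraints multiply its size by 2^k:

```latex
\[
  2^{10} = 1024, \qquad
  2^{7} = 128, \qquad
  2^{10} \cdot 2^{7} = 2^{17} = 131072 .
\]
```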

A Transformation Phase

import node.*;
import analysis.*;

public class Transform extends DepthFirstAdapter {
  int bHeight = 0;

  public @Override void inABHtml(ABHtml node) {
    bHeight++;
  }

  public @Override void outABHtml(ABHtml node) {
    bHeight--;
    /* replace on the way out, after the body has been traversed,
       so that more deeply nested b tags are eliminated as well */
    if (bHeight>0) node.replaceBy(node.getHtml());
  }
}

An Outline Phase (1/2)

import node.*;
import analysis.*;

public class Outline extends DepthFirstAdapter {
  int indent = 0;

  String indentString() {
    String s="";
    for (int i=0; i<indent; i++) s=s+" ";
    return s;
  }

  public void inAWordHtml(AWordHtml node) {
    indent++;
    System.out.println(indentString()+node.toString());
  }

  public void outAWordHtml(AWordHtml node) {
    indent--;
  }

An Outline Phase (2/2)

  public void inAAHtml(AAHtml node) {
    System.out.println(indentString()+"a"+" "+node.getWord().toString());
    indent++;
  }

  public void outAAHtml(AAHtml node) { indent--; }

  public void inABHtml(ABHtml node) {
    System.out.println(indentString()+"b");
    indent++;
  }

  public void outABHtml(ABHtml node) { indent--; }

  public void inAIHtml(AIHtml node) {
    System.out.println(indentString()+"i");
    indent++;
  }

  public void outAIHtml(AIHtml node) { indent--; }
}

The Main Application

import parser.*;
import lexer.*;
import node.*;
import java.io.*;

class Main {
  public static void main(String[] args) {
    try {
      Parser p =
        new Parser(
          new Lexer(
            new PushbackReader(new InputStreamReader(System.in))));
      Start tree = p.parse(); /* parse the input */
      tree.apply(new Weeding()); /* check nested anchors */
      tree.apply(new Transform()); /* eliminate nested b tags */
      tree.apply(new Outline()); /* print an outline */
    }
    catch(Exception e) { System.out.println(e); }
  }
}