Lab 3: Using ML-Yacc Zhong Zhuang [email protected]

Mar 19, 2016

Transcript
Page 1: Lab 3: Using ML-Yacc

Lab 3: Using ML-Yacc

Zhong Zhuang [email protected]

Page 2: Lab 3: Using ML-Yacc

How to write a parser?
  Write a parser by hand
  Use a parser generator
    May not be as efficient as a hand-written parser
    General and robust

How does it work?
  Parser Specification -> parser generator -> Parser
  stream of tokens -> Parser -> abstract syntax

Page 3: Lab 3: Using ML-Yacc

ML-Yacc specification

Three parts again:

User Declarations: declare values available in the rule actions

%%

ML-Yacc Definitions: declare terminals and non-terminals; special declarations to resolve conflicts

%%

Rules: the parser is specified by CFG rules and their associated semantic actions, which generate abstract syntax

Page 4: Lab 3: Using ML-Yacc

ML-Yacc Definitions

specify type of positions
  %pos int * int

specify terminal and nonterminal symbols
  %term IF | THEN | ELSE | PLUS | MINUS ...
  %nonterm prog | exp | op

specify end-of-parse token
  %eop EOF

specify start symbol (by default, the nonterminal on the LHS of the first rule)
  %start prog

Page 5: Lab 3: Using ML-Yacc

A Simple ML-Yacc File

  %%

  %term NUM | PLUS | MUL | LPAR | RPAR
  %nonterm exp | fact | base

  %pos int
  %start exp
  %eop EOF

  %%

  exp  : fact ()
       | fact PLUS exp ()

  fact : base ()
       | base MUL fact ()

  base : NUM ()
       | LPAR exp RPAR ()

Callouts:
  %term / %nonterm lines: grammar symbols
  lines after the second %%: grammar rules
  the () after each rule: semantic actions (currently do nothing)

Page 6: Lab 3: Using ML-Yacc

each nonterminal may have a semantic value associated with it

when the parser reduces with (X ::= s)
  a semantic action will be executed
  it uses the semantic values from the symbols in s

when parsing is completed successfully
  the parser returns the semantic value associated with the start symbol
  usually a syntax tree

Page 7: Lab 3: Using ML-Yacc

to use semantic values during parsing, we must declare symbol types:
  %term NUM of int | PLUS | MUL | ...
  %nonterm exp of int | fact of int | base of int

the type of a semantic action must match the type declared for the nonterminal on the left-hand side of its rule

Page 8: Lab 3: Using ML-Yacc

A Simple ML-Yacc File with Action

  %%

  %term NUM of int | PLUS | MUL | LPAR | RPAR
  %nonterm exp of int | fact of int | base of int

  %pos int
  %start exp
  %eop EOF

  %%

  exp  : fact (fact)
       | fact PLUS exp (fact + exp)

  fact : base (base)
       | base MUL fact (base * fact)

  base : NUM (NUM)
       | LPAR exp RPAR (exp)

Callouts:
  %term / %nonterm lines: grammar symbols with type declarations
  lines after the second %%: grammar rules with semantic actions
  the actions compute an integer result
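To make the role of the semantic actions concrete, here is a minimal hand-written sketch of the same grammar in Standard ML. It is not what ML-Yacc generates (ML-Yacc builds a table-driven LR parser). It assumes the multiplication rule is fact : base MUL fact, as in the action-free file on Page 5. The names token, SyntaxError and parse are hypothetical.

```sml
(* Hypothetical token type standing in for the %term declaration. *)
datatype token = NUM of int | PLUS | MUL | LPAR | RPAR

exception SyntaxError

(* Each function takes the remaining tokens and returns
   (semantic value, tokens left over), one function per nonterminal. *)
fun exp ts =
  let val (f, rest) = fact ts
  in
    case rest of
      PLUS :: rest' =>                      (* exp : fact PLUS exp (fact + exp) *)
        let val (e, rest'') = exp rest' in (f + e, rest'') end
    | _ => (f, rest)                        (* exp : fact (fact) *)
  end
and fact ts =
  let val (b, rest) = base ts
  in
    case rest of
      MUL :: rest' =>                       (* fact : base MUL fact (base * fact) *)
        let val (f, rest'') = fact rest' in (b * f, rest'') end
    | _ => (b, rest)                        (* fact : base (base) *)
  end
and base (NUM n :: rest) = (n, rest)        (* base : NUM (NUM) *)
  | base (LPAR :: rest) =                   (* base : LPAR exp RPAR (exp) *)
      (case exp rest of
         (e, RPAR :: rest') => (e, rest')
       | _ => raise SyntaxError)
  | base _ = raise SyntaxError

(* The value of the start symbol is the result of the whole parse. *)
fun parse ts =
  case exp ts of
    (v, []) => v
  | _ => raise SyntaxError

val seven = parse [NUM 1, PLUS, NUM 2, MUL, NUM 3]
val nine  = parse [LPAR, NUM 1, PLUS, NUM 2, RPAR, MUL, NUM 3]
```

Each function returns the value that the corresponding rule action would compute, so the value returned for exp plays the role of the semantic value of the start symbol.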

Page 9: Lab 3: Using ML-Yacc

Conflicts in ML-Yacc

We often write ambiguous grammars

Example
  exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR

  Tokens from lexer: NUM PLUS NUM MUL NUM
  State of parser: E+E
  To be read: MUL NUM

Page 10: Lab 3: Using ML-Yacc

Conflicts in ML-Yacc

We often write ambiguous grammars

Example
  exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR

  Tokens from lexer: NUM PLUS NUM MUL NUM
  State of parser: E+E
  To be read: MUL NUM

  If we shift:
    Shift   E+E*
    Shift   E+E*E
    Reduce  E+E
    Reduce  E
  Result is: E+(E*E)

Page 11: Lab 3: Using ML-Yacc

Conflicts in ML-Yacc

We often write ambiguous grammars

Example
  exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR

  Tokens from lexer: NUM PLUS NUM MUL NUM
  State of parser: E+E
  To be read: MUL NUM

  If we reduce:
    Reduce  E
    Shift   E*
    Shift   E*E
    Reduce  E
  Result is: (E+E)*E
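The two outcomes can be reproduced with a toy shift-reduce loop in Standard ML. This is a rough sketch, not ML-Yacc's actual LR machinery: it handles only NUM and binary operators, and every conflict is settled by one fixed policy. All names (binop, item, choice, parse, show) are hypothetical.

```sml
datatype binop  = Plus | Mul
datatype token  = NUM | OP of binop
datatype tree   = E | Bin of tree * binop * tree
datatype item   = Tree of tree | Oper of binop   (* parser stack entries, top first *)
datatype choice = Shift | Reduce

fun show E = "E"
  | show (Bin (l, Plus, r)) = "(" ^ show l ^ "+" ^ show r ^ ")"
  | show (Bin (l, Mul, r))  = "(" ^ show l ^ "*" ^ show r ^ ")"

(* parse : choice -> token list -> tree
   `policy` decides every shift-reduce conflict the same way. *)
fun parse policy tokens =
  let
    (* At end of input, reduce whatever is left on the stack. *)
    fun reduceAll (Tree r :: Oper b :: Tree l :: rest) =
          reduceAll (Tree (Bin (l, b, r)) :: rest)
      | reduceAll [Tree t] = t
      | reduceAll _ = raise Fail "syntax error"

    fun step (stack, []) = reduceAll stack
      | step (stack, NUM :: ts) = step (Tree E :: stack, ts)
      (* Conflict: "E op E" is on the stack and another operator is next. *)
      | step (stack as (Tree r :: Oper b :: Tree l :: rest), OP next :: ts) =
          (case policy of
             Reduce => step (Tree (Bin (l, b, r)) :: rest, OP next :: ts)
           | Shift  => step (Oper next :: stack, ts))
      | step (stack, OP b :: ts) = step (Oper b :: stack, ts)
  in
    step ([], tokens)
  end

val input = [NUM, OP Plus, NUM, OP Mul, NUM]
val shifted = show (parse Shift input)    (* "(E+(E*E))" *)
val reduced = show (parse Reduce input)   (* "((E+E)*E)" *)
```

The same token stream yields two different trees depending only on the choice made at the conflict, which is why the parser generator needs to be told which one is intended.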

Page 12: Lab 3: Using ML-Yacc

This is a shift-reduce conflict
  We want E+(E*E), because “*” has higher precedence than “+”

Another shift-reduce conflict
  Tokens from lexer: NUM PLUS NUM PLUS NUM
  State of parser: E+E
  To be read: PLUS NUM

  If we shift:
    Shift   E+E+
    Shift   E+E+E
    Reduce  E+E
    Reduce  E
  Result is: E+(E+E)

  If we reduce:
    Reduce  E
    Shift   E+
    Shift   E+E
    Reduce  E
  Result is: (E+E)+E

Page 13: Lab 3: Using ML-Yacc

Deal with shift-reduce conflicts
  In this case we need to reduce, because “+” is left associative

Deal with it! Three options:
  let ML-Yacc complain
    the default choice is to shift when it encounters a shift-reduce conflict
    BAD: programmer intentions unclear; harder to debug other parts of your grammar; generally inelegant
  rewrite the grammar to eliminate ambiguity
    can be complicated and less clear
  use Yacc precedence directives
    %left, %right, %nonassoc

Page 14: Lab 3: Using ML-Yacc

Precedence and Associativity
  precedence of a terminal is based on the order in which associativity is specified (later declarations have higher precedence)
  precedence of a rule is the precedence of its right-most terminal
    e.g. precedence of (E ::= E + E) == prec(+)
  a shift-reduce conflict is resolved as follows
    prec(terminal) > prec(rule) ==> shift
    prec(terminal) < prec(rule) ==> reduce
    prec(terminal) = prec(rule) ==>
      assoc(terminal) = left ==> reduce
      assoc(terminal) = right ==> shift
      assoc(terminal) = nonassoc ==> report as error
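The resolution rule above is small enough to write down as a function. The following Standard ML sketch only restates the table; it is not code from ML-Yacc. The numeric precedence levels (1 for PLUS, 2 for MUL) are assumed for illustration, as are the names assoc, decision and resolve.

```sml
datatype assoc    = Left | Right | NonAssoc
datatype decision = Shift | Reduce | Error

(* precTerminal  : precedence of the lookahead terminal
   assocTerminal : declared associativity of that terminal
   precRule      : precedence of the rule that could be reduced *)
fun resolve (precTerminal : int, assocTerminal : assoc, precRule : int) =
  if precTerminal > precRule then Shift
  else if precTerminal < precRule then Reduce
  else
    case assocTerminal of
      Left     => Reduce
    | Right    => Shift
    | NonAssoc => Error

(* Assumed levels: %left PLUS gives 1, a later %left MUL gives 2. *)
val haveSumSeeMul  = resolve (2, Left, 1)   (* Shift:  E+E . *  *)
val haveProdSeePlus = resolve (1, Left, 2)  (* Reduce: E*E . +  *)
val haveSumSeePlus = resolve (1, Left, 1)   (* Reduce: E+E . +  *)
```

With these levels, the E+E state followed by MUL shifts (giving E+(E*E)) and the E+E state followed by PLUS reduces (giving (E+E)+E), which matches the results wanted on the earlier slides.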

Page 15: Lab 3: Using ML-Yacc

  datatype exp = Int of int
               | Add of exp * exp
               | Sub of exp * exp
               | Mul of exp * exp
               | Div of exp * exp

  %%

  %left PLUS MINUS
  %left MUL DIV       (higher precedence)

  %%

  exp : NUM               (Int NUM)
      | exp PLUS exp      (Add (exp1, exp2))
      | exp MINUS exp     (Sub (exp1, exp2))
      | exp MUL exp       (Mul (exp1, exp2))
      | exp DIV exp       (Div (exp1, exp2))
      | LPAR exp RPAR     (exp)
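Once the actions build this datatype, the effect of the precedence declarations shows up in the shape of the tree. As a small illustration, the Standard ML sketch below writes out by hand the trees one would expect for 1 + 2 * 3 and 5 - 3 - 1 under the declarations above, and evaluates them. The eval function and the example values are assumptions added for illustration; they are not part of the lab files.

```sml
datatype exp = Int of int
             | Add of exp * exp
             | Sub of exp * exp
             | Mul of exp * exp
             | Div of exp * exp

(* A straightforward evaluator over the abstract syntax. *)
fun eval (Int n)      = n
  | eval (Add (a, b)) = eval a + eval b
  | eval (Sub (a, b)) = eval a - eval b
  | eval (Mul (a, b)) = eval a * eval b
  | eval (Div (a, b)) = eval a div eval b

(* 1 + 2 * 3: MUL has higher precedence, so it sits deeper in the tree. *)
val precTree = Add (Int 1, Mul (Int 2, Int 3))

(* 5 - 3 - 1: MINUS is left associative, so the left pair groups first. *)
val assocTree = Sub (Sub (Int 5, Int 3), Int 1)

val v1 = eval precTree    (* 7 *)
val v2 = eval assocTree   (* 1 *)
```

Grouping the subtraction to the right instead, Sub (Int 5, Sub (Int 3, Int 1)), would evaluate to 3, so the associativity declaration changes the meaning and not just the tree shape.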

Page 16: Lab 3: Using ML-Yacc

Reduce-reduce Conflict

This kind of conflict is more difficult to deal with

Example
  sequence  ::= (empty) | maybeword | sequence word
  maybeword ::= (empty) | word

When we get a “word” from the lexer, either
  word -> maybeword -> sequence (rule 1), or
  empty -> sequence, then sequence word -> sequence (rule 2)

We have more than one way to get “sequence” from the input “word”

Page 17: Lab 3: Using ML-Yacc

Reduce-reduce Conflict

A reduce-reduce conflict means there are two or more rules that apply to the same sequence of input. This usually indicates a serious error in the grammar.
  ML-Yacc reduces by the first rule
  Generally, reduce-reduce conflicts are not allowed in your ML-Yacc file

We need to fix our grammar:
  sequence ::= (empty) | sequence word

Page 18: Lab 3: Using ML-Yacc

Summary of conflicts
  Shift-reduce conflict
    resolve with precedence and associativity
    shift by default
  Reduce-reduce conflict
    reduce by first rule
    Not allowed!

Page 19: Lab 3: Using ML-Yacc

Lab3
  Your job is to finish a parser for the C language
  Input: a “.c” file
  Output: “Success!” if the “.c” file is correct

File description
  c.lex, c.grm, main.sml, call-main.sml, sources.cm, lab3.mlb, test.c

Page 20: Lab 3: Using ML-Yacc

Using ML-Yacc
  Read the ML-Yacc Manual

Run
  Once you finish “c.grm” and “c.lex”, in the command line (use MLton’s tools):
    mlyacc c.grm
    mllex c.lex
  we will get “c.grm.sig”, “c.grm.sml”, “c.grm.desc”, “c.lex.sml”

Then compile Lab3
  Start SML/NJ and run
    CM.make "sources.cm";
  or, in the command line,
    mlton lab3.mlb

To run lab3
  In SML/NJ,
    Main.parse "test.c";
  or, in the command line,
    lab3 test.c

Page 21: Lab 3: Using ML-Yacc

“Debug” ML-Yacc File

When you run mlyacc, you’ll see error messages if your ML-Yacc file has conflicts. For example:

  mlyacc c.grm
  2 shift/reduce conflicts

Open the file “c.grm.desc” (this file is generated by mlyacc).

The beginning of this file lists the conflicts:

  2 shift/reduce conflicts
  error: state 0: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
  error: state 1: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)

The rest are all the states:

  state 0:
      prog : . structs vdecs preds funcs
      MYSTRUCT   shift 3
      prog       goto 429
      structs    goto 2
      structdec  goto 1
      .          reduce by rule 12

“rule 12” means the 12th rule (counting from 0) in your ML-Yacc file

Page 22: Lab 3: Using ML-Yacc

Use ML-Lex with ML-Yacc

Most of the work in “c.lex” this time can be copied from Lab2
  You can re-use the regular expressions and lexical rules

Difference from Lab2
  You have to define the tokens in “c.grm”:
    %term INT of int | EOF
  The “%term” declaration in “c.grm” is automatically turned into a signature in “c.grm.sig”:

    signature C_TOKENS =
    sig
      type ('a,'b) token
      type svalue
      val EOF: 'a * 'a -> (svalue,'a) token
      val INT: (int) * 'a * 'a -> (svalue,'a) token
    end
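To see how a lexer action would call these token constructors, here is a self-contained Standard ML sketch. MockTokens is a hypothetical stand-in for the structure that mlyacc generates (the real one has a different internal representation), and lexInt is a hypothetical helper that mimics what an ML-Lex action might do with the matched text and its start position.

```sml
signature C_TOKENS =
sig
  type ('a,'b) token
  type svalue
  val EOF: 'a * 'a -> (svalue,'a) token
  val INT: int * 'a * 'a -> (svalue,'a) token
end

(* Hypothetical mock of the generated token structure. *)
structure MockTokens =
struct
  datatype svalue = VOID | INTv of int
  (* name of the terminal, its semantic value, left and right positions *)
  datatype ('a,'b) token = TOKEN of string * 'a * 'b * 'b
  fun EOF (l, r)    = TOKEN ("EOF", VOID, l, r)
  fun INT (i, l, r) = TOKEN ("INT", INTv i, l, r)
end

(* Check that the mock matches the signature shown on the slide. *)
structure Tokens : C_TOKENS = MockTokens

(* What a lexer action for an integer literal might do:
   convert the matched text and pass the value plus two positions. *)
fun lexInt (text : string, pos : int) =
  Tokens.INT (valOf (Int.fromString text), pos, pos + size text)

val tok = lexInt ("42", 0)
val eof = Tokens.EOF (2, 2)
```

The point of the sketch is the calling convention: a terminal declared with "of int" takes its value first and then the left and right positions, while a terminal without a value takes only the two positions.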

Page 23: Lab 3: Using ML-Yacc

Hints
  Read the ML-Yacc Manual
  Read the language specification
  Test a lot!