Pure and Declarative Syntax Definition: Paradise Lost and Regained
Lennart Kats, Eelco Visser, Guido Wachsmuth
Delft University of Technology
Nov 21, 2014
PARADISE
PARADISE LOST
PARADISE REGAINED
PARADISE DENIED
PARADISE
WORDS
TREES
GRAMMARS
LANGUAGE ENGINEERS
LANGUAGES
GRAMMARS
NATURAL
PURE
BEAUTIFUL
SOFTWARE ENGINEERS
LANGUAGE SOFTWARE
NOT NATURAL
NOT PURE
NOT BEAUTIFUL
SYNTAX DEFINITIONS
NATURAL
PURE
BEAUTIFUL
THE FALL
PARSER DEFINITIONS
PARADISE LOST
PAIN
SWEAT
NOT NATURAL
NOT PURE
NOT BEAUTIFUL
THE PLAGUES
context-free grammars
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
GRAMMAR CLASSES
DISAMBIGUATION
LEXICAL SYNTAX
TREE CONSTRUCTION
EVOLUTION
COMPOSITION
RESTRICTION TO PARSERS
PARADISE
WORDS WERE MADE THROUGH
GRAMMARS
GRAMMARS
Num ➝ Digit Num
Num ➝ Digit
Digit ➝ “0”
Digit ➝ “1”
production rules: the four rules above
terminal symbols: “0”, “1”
nonterminal symbols: Num, Digit
start symbol: Num
WORDS
Num
Digit Num        (Num ➝ Digit Num)
Digit Digit      (Num ➝ Digit)
Digit 0          (Digit ➝ “0”)
1 0              (Digit ➝ “1”)
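The productive reading of the Num grammar can be sketched in a few lines of Python. This is only an illustration: the `RULES` table and the `words` function are names made up for this sketch, and the search is a plain breadth-first expansion of the leftmost nonterminal, bounded by a maximum word length.

```python
from collections import deque

# Production rules from the slide: nonterminal -> list of right-hand sides.
RULES = {
    "Num": [["Digit", "Num"], ["Digit"]],
    "Digit": [["0"], ["1"]],
}

def words(start, max_len):
    """Return all words of at most max_len symbols derivable from start."""
    found = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Find the leftmost nonterminal in the sentential form.
        idx = next((i for i, s in enumerate(form) if s in RULES), None)
        if idx is None:
            found.add("".join(form))  # only terminals left: a word
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            # No rule shrinks a form, so longer forms can be pruned.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda w: (len(w), w))

print(words("Num", 2))
```

The word 10 derived on the slide is among the results for any bound of two or more.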
SENTENCES
Exp ➝ Exp “+” Exp
Exp ➝ Exp “*” Exp
Exp ➝ Num
production rules: the three rules above
terminal symbols: “+”, “*”
nonterminal symbols: Exp, Num
start symbol: Exp
Exp
Exp + Exp          (Exp ➝ Exp “+” Exp)
Exp * Exp + Exp    (Exp ➝ Exp “*” Exp)
3 * Exp + Exp      (Exp ➝ Num)
3 * 7 + Exp        (Exp ➝ Num)
3 * 7 + 21         (Exp ➝ Num)
THEY MADE LANGUAGES BY MAKING GRAMMARS
GRAMMAR
LANGUAGE
TRUTH
productive              reductive
Exp ➝ Exp “+” Exp       Exp “+” Exp ➝ Exp
Exp ➝ Exp “*” Exp       Exp “*” Exp ➝ Exp
Exp ➝ Num               Num ➝ Exp
3 * 7 + 21
3 * 7 + Exp        (Num ➝ Exp)
3 * Exp + Exp      (Num ➝ Exp)
Exp * Exp + Exp    (Num ➝ Exp)
Exp + Exp          (Exp “*” Exp ➝ Exp)
Exp                (Exp “+” Exp ➝ Exp)
THEY TURNED WORDS INTO
TREES
SENTENCES
STRUCTURE
productive              reductive               tree construction
Exp ➝ Num               Num ➝ Exp               Exp node above Num
Exp ➝ Exp “+” Exp       Exp “+” Exp ➝ Exp       Exp node above Exp + Exp
Exp ➝ Exp “*” Exp       Exp “*” Exp ➝ Exp       Exp node above Exp * Exp
[Figure: the parse tree for 3 * 7 + 21 is built step by step. Each number becomes an Exp node, 3 * 7 is grouped under one Exp node, and that node and 21 are joined under the root Exp node for +.]
ONE FORMALISM
THREE READINGS
PURE
DECLARATIVE
BEAUTIFUL
PARADISE LOST
EFFICIENCY
THE FIRST PLAGUE WAS
GRAMMAR CLASSES
context-free grammars
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
paradise                grammar classes
Exp “+” Exp ➝ Exp       Term (“+” Term)* ➝ Exp
Exp “*” Exp ➝ Exp       Fact (“*” Fact)* ➝ Term
Num ➝ Exp               Num ➝ Fact
[Figure: two parse trees for 3 + 7 + 21. Paradise: nested Exp nodes only. Grammar classes: an Exp node above a repetition (*) of Term nodes, each above a repetition (*) of Fact nodes.]
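The grammar-classes rules Term (“+” Term)* ➝ Exp, Fact (“*” Fact)* ➝ Term and Num ➝ Fact map almost one-to-one onto a recursive-descent parser. The following is a minimal sketch under assumptions: the function names and the tuple-based tree encoding are invented for illustration, and the input is assumed to be a list of already separated tokens.

```python
def parse_exp(tokens, pos=0):
    """Term ("+" Term)* -> Exp; returns (tree, next position)."""
    term, pos = parse_term(tokens, pos)
    terms = [term]
    while pos < len(tokens) and tokens[pos] == "+":
        term, pos = parse_term(tokens, pos + 1)
        terms.append(term)
    return ("Exp", terms), pos

def parse_term(tokens, pos):
    """Fact ("*" Fact)* -> Term."""
    fact, pos = parse_fact(tokens, pos)
    facts = [fact]
    while pos < len(tokens) and tokens[pos] == "*":
        fact, pos = parse_fact(tokens, pos + 1)
        facts.append(fact)
    return ("Term", facts), pos

def parse_fact(tokens, pos):
    """Num -> Fact."""
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"number expected at token {pos}")
    return ("Fact", tokens[pos]), pos + 1

tree, end = parse_exp("3 + 7 + 21".split())
print(tree)
```

The resulting tree is flat: one Exp node with three Term children, rather than the nested Exp nodes of the paradise grammar.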
THE SECOND PLAGUE WAS
DISAMBIGUATION
[Figure: two different parse trees for 3 * 7 + 21. One groups 3 * 7 first, the other groups 7 + 21 first.]
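The ambiguity can be made concrete by enumerating every parse tree that the grammar Exp “+” Exp ➝ Exp, Exp “*” Exp ➝ Exp, Num ➝ Exp allows for a sentence. This is a brute-force sketch, not how a real parser works; the `trees` function and the nested-tuple encoding are assumptions of this example, and the token list is assumed to alternate numbers and operators.

```python
def trees(tokens):
    """Return every parse tree of tokens under Exp -> Exp op Exp | Num."""
    if len(tokens) == 1:
        return [tokens[0]]
    result = []
    # Try each operator position as the root of the tree.
    for i in range(1, len(tokens), 2):
        for left in trees(tokens[:i]):
            for right in trees(tokens[i + 1:]):
                result.append((left, tokens[i], right))
    return result

for t in trees(["3", "*", "7", "+", "21"]):
    print(t)
```

For three operands there are two trees; the count grows quickly with the number of operators.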
text books
precedence   operators                     associativity
1            ( ), [ ]                      non-associative
2            new                           non-associative
3            .                             left-associative
4            ++, --                        non-associative
5            -, +, !, ~, ++, --, (type)    right-associative
6            *, /, %                       left-associative
7            +, -                          left-associative
…            …                             …
context-free grammars
unambiguous
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
grammar classes
Exp “+” Term ➝ Exp
Term ➝ Exp
Term “*” Fact ➝ Term
Fact ➝ Term
Num ➝ Fact
FALSE PROPHETS
paradise                PEGs
“a” “b” ➝ A             “a” “b” / “a” ➝ A
“a” ➝ A
L = {ab, a}             L = {ab, a}

paradise                PEGs
“a” ➝ A                 “a” / “a” “b” ➝ A
“a” “b” ➝ A
L = {ab, a}             L = {a}
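Why the order of alternatives changes the language of a PEG can be seen in a tiny combinator sketch. The helpers `lit`, `seq`, `choice` and `accepts` are names invented for this illustration and do not come from any PEG library; a parser here is a function from (text, position) to the new position, or None on failure.

```python
def lit(s):
    """Parser matching the literal s; returns the new position or None."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def seq(*parsers):
    """Sequence: every parser must succeed, one after the other."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """Ordered choice: commit to the first alternative that succeeds."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def accepts(parser, text):
    """A word is in the language if the parser consumes all of it."""
    return parser(text, 0) == len(text)

ab_first = choice(seq(lit("a"), lit("b")), lit("a"))  # "a" "b" / "a"
a_first = choice(lit("a"), seq(lit("a"), lit("b")))   # "a" / "a" "b"

print(accepts(ab_first, "ab"), accepts(a_first, "ab"))
```

With “a” first, the choice commits after reading a and never tries the longer alternative, so ab is rejected.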
if c1 then if c2 then s1 else s2
dangling else
“if” E “then” S “else” S / “if” E “then” S ➝ S
“if” E “then” S / “if” E “then” S “else” S ➝ S
PEGs
THE THIRD PLAGUE WAS
LEXICAL SYNTAX
morphology & syntax
limited look-ahead
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
scanners
[Figure: a scanner splits the characters of 3 * 7 + 21 into tokens.]
parsers
[Figure: a parser builds the tree of Exp nodes over those tokens.]
x = 1. * .10
y : array [ 1 .. 10 ] of integer    (intended tokens)
y : array [ 1. .10 ] of integer     (tokens found by the scanner)
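The mis-tokenised range can be reproduced with a small longest-match scanner. This is a sketch under assumptions: the three token classes and their regular expressions are hypothetical stand-ins for a Pascal-like lexical syntax, not taken from any real language definition.

```python
import re

# Hypothetical token classes for a Pascal-like language.
TOKEN_PATTERNS = [
    ("REAL", re.compile(r"\d+\.\d*|\.\d+")),
    ("INT", re.compile(r"\d+")),
    ("DOTDOT", re.compile(r"\.\.")),
]

def scan(text):
    """Split text into tokens, always taking the longest match."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = None
        for name, rx in TOKEN_PATTERNS:
            m = rx.match(text, pos)
            if m and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise SyntaxError(f"unexpected character at {pos}")
        name, end = best
        tokens.append((name, text[pos:end]))
        pos = end
    return tokens

print(scan("1..10"))    # two REAL tokens
print(scan("1 .. 10"))  # INT, DOTDOT, INT
```

The scanner has no knowledge of the syntactic context, so it cannot tell that a range is expected inside the brackets.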
THE FOURTH PLAGUE WAS
TREE CONSTRUCTION
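[Figure: the parse tree for 3 + 7 + 21 with nested Exp nodes, next to the desired abstract syntax tree with Add and Const nodes over 3, 7 and 21.]
paradise
[Figure: the parse tree for 3 + 7 + 21 with Exp, Term, Fact and repetition (*) nodes, which is further away from the desired Add and Const tree.]
grammar classes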
expr: INTEGER        { $$ = con($1); }
    | expr '+' expr  { $$ = opr('+', 2, $1, $3); }
    | expr '*' expr  { $$ = opr('*', 2, $1, $3); }
    ;
semantic actions
THE FIFTH PLAGUE WAS
EVOLUTION
paradise                grammar classes
Exp “+” Exp ➝ Exp       Exp “+” Term ➝ Exp
Exp “*” Exp ➝ Exp       Term ➝ Exp
Num ➝ Exp               Term “*” Fact ➝ Term
                        Fact ➝ Term
                        Num ➝ Fact

paradise                grammar classes
Exp “+” Exp ➝ Exp       CExp “+” Term ➝ CExp
Exp “*” Exp ➝ Exp       Term ➝ CExp
Num ➝ Exp               Term “*” Fact ➝ Term
Exp “=” Exp ➝ Exp       Fact ➝ Term
Exp “<” Exp ➝ Exp       Num ➝ Fact
Exp “>” Exp ➝ Exp       Exp “=” CExp ➝ Exp
                        Exp “<” CExp ➝ Exp
                        Exp “>” CExp ➝ Exp
                        CExp ➝ Exp
THE SIXTH PLAGUE WAS
COMPOSITION
parsers
context-free grammars
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
scanners
public boolean authenticate(String user, String pw) {
  SQL stm = <| SELECT id FROM Users WHERE name = ${user} AND password = ${pw} |>;
  return executeQuery(stm).size() != 0;
}
THE SEVENTH PLAGUE WAS
RESTRICTION TO
PARSERS
PRETTY PRINTERS
SENTENCE GENERATORS
AST ACCESS
IDE SUPPORT
PARADISE REGAINED
GENERALISED
PARSING
context-free grammars
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
context-free grammars
paradise                SDF
                        context-free syntax
Exp “+” Exp ➝ Exp         Exp "+" Exp -> Exp
Exp “*” Exp ➝ Exp         Exp "*" Exp -> Exp
Num ➝ Exp                 NUM -> Exp
DECLARATIVE
DISAMBIGUATION
context-free grammars
unambiguous
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
context-free grammars
unambiguous
context-free grammars
text books
precedence   operators                     associativity
1            ( ), [ ]                      non-associative
2            new                           non-associative
3            .                             left-associative
4            ++, --                        non-associative
5            -, +, !, ~, ++, --, (type)    right-associative
6            *, /, %                       left-associative
7            +, -                          left-associative
…            …                             …
context-free priorities
  Exp "*" Exp -> Exp {left}
> Exp "+" Exp -> Exp {left}
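One way to read these declarations is as a filter over the set of possible parse trees: the grammar stays ambiguous, and the priorities and {left} annotations rule out the unwanted trees. The sketch below illustrates that reading only. It is not how SDF or its parser is implemented; the names `PRIORITY`, `LEFT_ASSOC`, `trees`, `allowed` and `parse` are invented here, and trees are enumerated by brute force.

```python
PRIORITY = {"*": 2, "+": 1}  # "*" > "+", as in the priorities section
LEFT_ASSOC = {"*", "+"}      # both productions are marked {left}

def trees(tokens):
    """Every parse tree under the ambiguous grammar Exp -> Exp op Exp | Num."""
    if len(tokens) == 1:
        return [tokens[0]]
    return [(left, tokens[i], right)
            for i in range(1, len(tokens), 2)
            for left in trees(tokens[:i])
            for right in trees(tokens[i + 1:])]

def root_op(tree):
    return tree[1] if isinstance(tree, tuple) else None

def allowed(tree):
    """Reject trees that violate a priority or associativity declaration."""
    if not isinstance(tree, tuple):
        return True
    left, op, right = tree
    for child in (left, right):
        child_op = root_op(child)
        # Priority: a lower-priority operator may not be a direct child.
        if child_op is not None and PRIORITY[child_op] < PRIORITY[op]:
            return False
    # Left associativity: the same operator may not be the right child.
    if op in LEFT_ASSOC and root_op(right) == op:
        return False
    return allowed(left) and allowed(right)

def parse(tokens):
    return [t for t in trees(tokens) if allowed(t)]

print(parse("3 * 7 + 21".split()))
```

For 3 * 7 + 21 and for 3 + 7 + 21 exactly one tree survives the filter.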
context-free syntax
  "if" E "then" S          -> S {prefer}
  "if" E "then" S "else" S -> S
SCANNERLESS
PARSING
morphology & syntax
lexical syntax
  [0-9]+            -> NUM
  [\ \t\n]          -> LAYOUT
  "//" ~[\n]* [\n]  -> LAYOUT
parser
parser
DECLARATIVE
TREE CONSTRUCTION
[Figure: the parse tree for 3 + 7 + 21 with nested Exp nodes, next to the abstract syntax tree with Add and Const nodes over 3, 7 and 21.]
paradise
SDF
context-free syntax
  Exp "+" Exp -> Exp {cons("Add")}
  Exp "*" Exp -> Exp {cons("Mul")}
  NUM         -> Exp {cons("Const")}
paradise SDF
Add(Add(Const("3"), Const("7")), Const("21"))
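The effect of the cons annotations can be imitated with a small function that maps a parse tree to a term: each production contributes its constructor name, and literals such as "+" are dropped. This is a sketch under assumptions; the nested-tuple parse tree, the `CONSTRUCTORS` table and `to_term` are made up for illustration.

```python
# Constructor names taken from the cons(...) annotations.
CONSTRUCTORS = {"+": "Add", "*": "Mul"}

def to_term(tree):
    """Turn a parse tree into a term, dropping literals such as "+"."""
    if isinstance(tree, str):
        return f'Const("{tree}")'
    left, op, right = tree
    return f"{CONSTRUCTORS[op]}({to_term(left)}, {to_term(right)})"

# Parse tree of 3 + 7 + 21 with + grouping to the left.
print(to_term((("3", "+", "7"), "+", "21")))
```

Printing the term for the left-nested tree of 3 + 7 + 21 gives the same term as on the slide.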
SEAMLESS
EVOLUTION
context-free syntax
  Exp "+" Exp -> Exp {cons("Add")}
  Exp "*" Exp -> Exp {cons("Mul")}
  NUM         -> Exp {cons("Const")}
  Exp "=" Exp -> Exp {cons("Eq")}
  Exp ">" Exp -> Exp {cons("Gt")}
  Exp "<" Exp -> Exp {cons("Lt")}
MODULAR
COMPOSITION
context-free grammars
LR(k)
LR(1)
LALR(1)
SLR
LR(0)
LL(k)
LL(1)
LL(0)
context-free grammars
public boolean authenticate(String user, String pw) {
  SQL stm = <| SELECT id FROM Users WHERE name = ${user} AND password = ${pw} |>;
  return executeQuery(stm).size() != 0;
}
module Java-SQL
imports Java SQL
exports
  context-free syntax
    "<|" Query "|>" -> Exp    {cons("ToSQL")}
    "${" Exp "}"    -> SqlExp {cons("FromSQL")}
BEYOND
PARSERS
PRETTY PRINTERS
SENTENCE GENERATORS
AST ACCESS
IDE SUPPORT
PARADISE DENIED
still around
still around
still have to use it
still have to learn LL, LR, SLR, LALR
still think using parser generators is hard
modern parser generator
modern parser generator
PARADISE OPEN
slides title author copyright
1, 2, 42 Jeremiah lamenting Rembrandt public domain
3, 7, 14-16, 19-21, 23-25, 31-33, 43, 88-92 Adam and Eve in the Garden of Eden Wenzel Peter, photo: Jonathan Linczak some rights reserved
4, 28, 93 Expulsion from the Garden of Eden Thomas Cole public domain
5, 138, 175 Livres d'heures des Étienne Chevalier Jean Fouquet public domain
6, 167 The Adoration of the Golden Calf Nicolas Poussin public domain
8, 51, 112, 131, 150 Thesaurus Enoch Lau some rights reserved
9, 79, 81 The Burmis Tree Monsieur david some rights reserved
10, 13, 22, 45, 69 Latin Grammar Anthony Nelzin some rights reserved
11, 17, 176 The Garden of Earthly Delights (centre panel) Hieronymus Bosch public domain
12, 18, 68 Programming language textbooks K.lee public domain
26 The Fall of Man Jacob Jordaens public domain
27 Illustration d'après un Bison naturalisé d'Eulalie en Margeride F Lamiot some rights reserved
29 Can't Concentrate Sasha Wolff some rights reserved
30 Cold Sweat Eric Tastad some rights reserved
34 The Fifth Plague of Egypt Joseph M. W. Turner public domain
36 Managed Destruction Harley Kingston some rights reserved
37 Book Scanner Ben Woosley some rights reserved
38 Dead trees in the clay pan of the Deadvlei Harald Süpfle some rights reserved
39 Charles Robert Darwin John Maler Collier public domain
40 Black Lego Wallpaper monohex some rights reserved
41 Four - Nova Prospekt (Restricted) |Digressive| some rights reserved
44, 110, 148 Noam Chomsky Fellowsisters some rights reserved
57, 80, 115, 129, 151 Minuscule 798 f.41v - f.42r unknown public domain
70 Latin Bible Gerard Brils, photo: Adrian Pingstone public domain
71 Themis and Aegeus Kodros Painter, photo: Bibi Saint-Pol public domain
94 IBM System/3 Jonathunder some rights reserved
104 Destruction of the Beast and the False Prophet Benjamin West public domain
168 lex & yacc O’Reilly all rights reserved
169 flex & bison O’Reilly all rights reserved
170-172 Students TU Delft Media Services all rights reserved
173 ANTLR The Pragmatic Bookshelf all rights reserved
174 Xtext all rights reserved