Transcript
Page 1: Combinator parsing

Combinator Parsing
By Swanand Pagnis

Page 2: Combinator parsing

Higher-Order Functions for Parsing
By Graham Hutton

Page 3: Combinator parsing

• Abstract & Introduction

• Build a parser, one fn at a time

• Moving beyond toy parsers

Page 4: Combinator parsing

Abstract

Page 5: Combinator parsing

In combinator parsing, the text of parsers resembles BNF notation. We present the basic method, and a number of extensions. We address the special problems presented by whitespace, and parsers with separate lexical and syntactic phases. In particular, a combining form for handling the “offside rule” is given. Other extensions to the basic method include an “into” combining form with many useful applications, and a simple means by which combinator parsers can produce more informative error messages.

Page 6: Combinator parsing

• Combinators that resemble BNF notation

• Whitespace handling through "Offside Rule"

• "Into" combining form for advanced parsing

• Strategy for better error messages

Page 7: Combinator parsing

Introduction

Page 8: Combinator parsing

Primitive Parsers

• Take input

• Process one character

• Return results and unused input

Page 9: Combinator parsing

Combinators

• Combine primitives

• Define building blocks

• Return results and unused input

Page 10: Combinator parsing

Lexical analysis and syntax

• Combine the combinators

• Define lexical elements

• Return results and unused input

Page 11: Combinator parsing

input: "from:swiggy to:me"
output: [("f", "rom:swiggy to:me")]

Page 12: Combinator parsing

input: "42 => ans"
output: [("4", "2 => ans")]

Page 13: Combinator parsing

rule: 'a' followed by 'b'
input: "abcdef"
output: [(('a','b'),"cdef")]

Page 14: Combinator parsing

rule: 'a' followed by 'b'
input: "abcdef"
output: [(('a','b'),"cdef")]

Combinator

Page 15: Combinator parsing

Language choice

Page 16: Combinator parsing

Suggested: Lazy Functional Languages

Page 17: Combinator parsing

Miranda: Author's choice

Page 18: Combinator parsing

Haskell: An obvious choice. 🤓

Page 19: Combinator parsing

Racket: Another obvious choice. 🤓

Page 20: Combinator parsing

Ruby: 🍎 to 🍊 so $ for learning

Page 21: Combinator parsing

OCaml: Functional, but not lazy.

Page 22: Combinator parsing

Haskell %

Page 23: Combinator parsing

Simple when you stick to FP fundamentals

• Higher order functions

• Immutability

• Recursive problem solving

• Algebraic types

Page 24: Combinator parsing

Let's build a parser, one fn at a time

Page 25: Combinator parsing

type Parser a b = [a] -> [(b, [a])]

Page 26: Combinator parsing

Types help with abstraction

• We'll be dealing with parsers and combinators

• Parsers are functions, they accept input and return results

• Combinators accept parsers and return parsers

Page 27: Combinator parsing

A parser is a function that accepts an input and returns parsed results and the unused input for each result

Page 28: Combinator parsing

Parser is a function type that accepts a list of type a and returns all possible results as a list of tuples of type (b, [a])

Page 29: Combinator parsing

(Parser Char Number)
input: "42 it is!"          -- the input is a [Char], so a is Char
output: [(42, " it is!")]   -- b is a Number

Page 30: Combinator parsing

type Parser a b = [a] -> [(b, [a])]

Page 31: Combinator parsing

Primitive Parsers

Page 32: Combinator parsing

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

Page 33: Combinator parsing

Always succeeds
Returns "v" for all inputs

Page 34: Combinator parsing

failure :: Parser a b
failure inp = []

Page 35: Combinator parsing

Always fails
Returns "[]" for all inputs

Page 36: Combinator parsing

satisfy :: (a -> Bool) -> Parser a a
satisfy p [] = failure []
satisfy p (x:xs)
  | p x       = succeed x xs  -- if p(x) is true
  | otherwise = failure []

Page 37: Combinator parsing

satisfy :: (a -> Bool) -> Parser a a
satisfy p [] = failure []
satisfy p (x:xs)
  | p x       = succeed x xs  -- if p(x) is true
  | otherwise = failure []

Guard Clauses, if you want to Google

Page 38: Combinator parsing

literal :: Eq a => a -> Parser a a
literal x = satisfy (== x)

Page 39: Combinator parsing

match_3 = (literal '3')
match_3 "345" -- => [('3',"45")]
match_3 "456" -- => []

Page 40: Combinator parsing

succeed failure satisfy literal
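
The four primitives above can be collected into one file that loads as-is in GHCi. This is a minimal sketch that follows the slide definitions; the explicit signature on `match_3` and the comments are additions.

```haskell
-- A parser takes a list of input symbols and returns every possible
-- (result, unused input) pair; the empty list means failure.
type Parser a b = [a] -> [(b, [a])]

-- Always succeeds with the value v and consumes nothing.
succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

-- Always fails, whatever the input.
failure :: Parser a b
failure _ = []

-- Consumes one symbol if it satisfies the predicate p.
satisfy :: (a -> Bool) -> Parser a a
satisfy _ [] = failure []
satisfy p (x:xs)
  | p x       = succeed x xs
  | otherwise = failure []

-- Consumes one symbol equal to x.
literal :: Eq a => a -> Parser a a
literal x = satisfy (== x)

-- Example parser from the slides.
match_3 :: Parser Char Char
match_3 = literal '3'
```

With this loaded, `match_3 "345"` evaluates to `[('3',"45")]` and `match_3 "456"` to `[]`.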

Page 41: Combinator parsing

Combinators

Page 42: Combinator parsing

match_3_or_4 = match_3 `alt` match_4
match_3_or_4 "345" -- => [('3',"45")]
match_3_or_4 "456" -- => [('4',"56")]

Page 43: Combinator parsing

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

Page 44: Combinator parsing

(p1 `alt` p2) inp = p1 inp ++ p2 inp

List concatenation

Page 45: Combinator parsing

(match_3 `and_then` match_4) "345" -- => [(('3','4'),"5")]

Page 46: Combinator parsing

🐉

Page 47: Combinator parsing

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

Page 48: Combinator parsing

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

List comprehensions

Page 49: Combinator parsing

[Diagram: p1 yields several results, (v11, out11), (v12, out12), (v13, out13); p2 is then run on each unused input, yielding results such as (v21, out21), (v22, out22) and (v31, out31), (v32, out32)]

Page 50: Combinator parsing

((v11, v21), out21) ((v11, v22), out22)

Page 51: Combinator parsing

(match_3 `and_then` match_4) "345" -- => [(('3','4'),"5")]
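
To make the two combinators concrete, here is a small self-contained sketch of `alt` and `and_then` as defined on the slides. The compact `literal` and the `ambiguous` example are illustrative additions, not part of the talk.

```haskell
type Parser a b = [a] -> [(b, [a])]

-- Compact stand-in for the literal primitive.
literal :: Eq a => a -> Parser a a
literal x (y:ys) | x == y = [(y, ys)]
literal _ _               = []

-- Choice: collect the results of both parsers on the same input.
alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

-- Sequencing: run p2 on every unused input left by p1 and pair the values.
and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

match_3, match_4 :: Parser Char Char
match_3 = literal '3'
match_4 = literal '4'

-- When both alternatives succeed, both results are kept.
ambiguous :: [(Char, String)]
ambiguous = (match_3 `alt` literal '3') "345"
```

Because results are plain lists, ambiguity needs no special handling: `ambiguous` simply contains the same result twice, and a failing branch contributes nothing.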

Page 52: Combinator parsing

Manipulating values

Page 53: Combinator parsing

match_3 = (literal '3')
match_3 "345" -- => [('3',"45")]
match_3 "456" -- => []

Page 54: Combinator parsing

(number "42") "42 => answer" -- => [(42, " answer")]

Page 55: Combinator parsing

(keyword "for") "for i in 1..42" -- => [(:for, " i in 1..42")]

Page 56: Combinator parsing

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

Page 57: Combinator parsing

((string "3") `using` float) "3" -- => [(3.0, "")]

Page 58: Combinator parsing

Levelling up

Page 59: Combinator parsing

many :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` cons) `alt` (succeed [])

Page 60: Combinator parsing

0 or many

Page 61: Combinator parsing

(many (literal 'a')) "aab" -- => [("aa","b"),("a","ab"),("","aab")]

Page 62: Combinator parsing

(many (literal 'a')) "xyz" -- => [("","xyz")]

Page 63: Combinator parsing

some :: Parser a b -> Parser a [b]
some p = ((p `and_then` many p) `using` cons)

Page 64: Combinator parsing

1 or many

Page 65: Combinator parsing

(some (literal 'a')) "aab" -- => [("aa","b"),("a","ab")]

Page 66: Combinator parsing

(some (literal 'a')) "xyz" -- => []
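
The slides use `cons` inside `many` and `some` without defining it. The sketch below assumes it turns a `(head, tail)` pair into a list, which is what the types require; `digits` is an illustrative name, and `satisfy` is written in a compact form.

```haskell
import Data.Char (isDigit)

type Parser a b = [a] -> [(b, [a])]

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

satisfy :: (a -> Bool) -> Parser a a
satisfy p (x:xs) | p x = [(x, xs)]
satisfy _ _            = []

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

-- Assumed helper: rebuild a list from the (first, rest) pair that and_then returns.
cons :: (a, [a]) -> [a]
cons (x, xs) = x : xs

-- Zero or more repetitions; the longest match comes first.
many :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` cons) `alt` succeed []

-- One or more repetitions.
some :: Parser a b -> Parser a [b]
some p = (p `and_then` many p) `using` cons

-- Illustrative parser: a non-empty run of digits.
digits :: Parser Char String
digits = some (satisfy isDigit)
```

Here `digits "42x"` evaluates to `[("42","x"),("4","2x")]`: every shorter match is kept as a later alternative, which is what allows backtracking further up the grammar.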

Page 67: Combinator parsing

positive_integer = some (satisfy Data.Char.isDigit)

negative_integer = ((literal '-') `and_then` positive_integer) `using` cons

positive_decimal = (positive_integer `and_then` (((literal '.') `and_then` positive_integer) `using` cons)) `using` join

negative_decimal = ((literal '-') `and_then` positive_decimal) `using` cons

Page 68: Combinator parsing

number :: Parser Char [Char]
number = negative_decimal `alt` positive_decimal `alt` negative_integer `alt` positive_integer
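
The number parsers above rely on `cons` and `join`, neither of which is defined on the slides. The following sketch assumes `cons` prepends the first component of a pair and `join` concatenates the two components; `numval` is assumed to be a thin wrapper around `read`.

```haskell
import Data.Char (isDigit)

type Parser a b = [a] -> [(b, [a])]

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

satisfy :: (a -> Bool) -> Parser a a
satisfy p (x:xs) | p x = [(x, xs)]
satisfy _ _            = []

literal :: Eq a => a -> Parser a a
literal x = satisfy (== x)

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

many, some :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` cons) `alt` succeed []
some p = (p `and_then` many p) `using` cons

-- Assumed helpers: used on the slides but not defined there.
cons :: (a, [a]) -> [a]
cons (x, xs) = x : xs

join :: ([a], [a]) -> [a]
join (xs, ys) = xs ++ ys

positive_integer, negative_integer, positive_decimal, negative_decimal, number
  :: Parser Char String
positive_integer = some (satisfy isDigit)
negative_integer = (literal '-' `and_then` positive_integer) `using` cons
positive_decimal =
  (positive_integer `and_then` ((literal '.' `and_then` positive_integer) `using` cons))
    `using` join
negative_decimal = (literal '-' `and_then` positive_decimal) `using` cons
number = negative_decimal `alt` positive_decimal `alt` negative_integer `alt` positive_integer

-- Assumed helper: convert the matched text to a Double.
numval :: String -> Double
numval = read
```

The most specific alternative is listed first, so `head (number "-12.5 rest")` is `("-12.5"," rest")`; the shorter readings such as `"-12"` remain available as later results.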

Page 69: Combinator parsing

word :: Parser Char [Char]
word = some (satisfy isLetter)

Page 70: Combinator parsing

string :: (Eq a) => [a] -> Parser a [a]
string [] = succeed []
string (x:xs) = (literal x `and_then` string xs) `using` cons

Page 71: Combinator parsing

(string "begin") "begin end" -- => [("begin"," end")]

Page 72: Combinator parsing

xthen :: Parser a b -> Parser a c -> Parser a c
p1 `xthen` p2 = (p1 `and_then` p2) `using` snd

Page 73: Combinator parsing

thenx :: Parser a b -> Parser a c -> Parser a b
p1 `thenx` p2 = (p1 `and_then` p2) `using` fst

Page 74: Combinator parsing

ret :: Parser a b -> c -> Parser a c
p `ret` v = p `using` (const v)

Page 75: Combinator parsing

succeed, failure, satisfy, literal, alt, and_then, using, many, some, string, word, number, xthen, thenx, ret

Page 76: Combinator parsing

Expression Parser & Evaluator

Page 77: Combinator parsing

data Expr = Const Double | Expr `Add` Expr | Expr `Sub` Expr | Expr `Mul` Expr | Expr `Div` Expr

Page 78: Combinator parsing

(Const 3) `Mul` ((Const 6) `Add` (Const 1)) -- => "3*(6+1)"

Page 79: Combinator parsing

parse "3*(6+1)" -- => (Const 3) `Mul` ((Const 6) `Add` (Const 1))

Page 80: Combinator parsing

(Const 3) `Mul` ((Const 6) `Add` (Const 1)) -- => 21

Page 81: Combinator parsing

BNF Notation

expn ::= expn + expn
       | expn − expn
       | expn ∗ expn
       | expn / expn
       | digit+
       | (expn)

Page 82: Combinator parsing

Improving a little:

expn ::= term + term | term − term | term
term ::= factor ∗ factor | factor / factor | factor
factor ::= digit+ | (expn)

Page 83: Combinator parsing

Parsers that resemble BNF

Page 84: Combinator parsing

addition = ((term `and_then` ((literal '+') `xthen` term)) `using` plus)

Page 85: Combinator parsing

subtraction = ((term `and_then` ((literal '-') `xthen` term)) `using` minus)

Page 86: Combinator parsing

multiplication = ((factor `and_then` ((literal '*') `xthen` factor)) `using` times)

Page 87: Combinator parsing

division = ((factor `and_then` ((literal '/') `xthen` factor)) `using` divide)

Page 88: Combinator parsing

parenthesised_expression = ((nibble (literal '(')) `xthen` ((nibble expn) `thenx`(nibble (literal ')'))))

Page 89: Combinator parsing

value xs = Const (numval xs)
plus (x,y) = x `Add` y
minus (x,y) = x `Sub` y
times (x,y) = x `Mul` y
divide (x,y) = x `Div` y

Page 90: Combinator parsing

expn = addition `alt` subtraction `alt` term

Page 91: Combinator parsing

term = multiplication `alt` division `alt` factor

Page 92: Combinator parsing

factor = (number `using` value) `alt` parenthesised_expression

Page 93: Combinator parsing

expn "12*(5+(7-2))" -- => [ (Const 12.0 `Mul` (Const 5.0 `Add` (Const 7.0 `Sub` Const 2.0)),""), … ]

Page 94: Combinator parsing

value xs = Const (numval xs)
plus (x,y) = x `Add` y
minus (x,y) = x `Sub` y
times (x,y) = x `Mul` y
divide (x,y) = x `Div` y

Page 95: Combinator parsing

value = numval
plus (x,y) = x + y
minus (x,y) = x - y
times (x,y) = x * y
divide (x,y) = x / y

Page 96: Combinator parsing

expn "12*(5+(7-2))" -- => [(120.0,""), (12.0,"*(5+(7-2))"), (1.0,"2*(5+(7-2))")]

Page 97: Combinator parsing

expn "(12+1)*(5+(7-2))" -- => [(130.0,""), (13.0,"*(5+(7-2))")]
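
Putting the pieces together, the evaluating version of the expression parser can be sketched in one file. This is a simplified sketch under two assumptions: `number` accepts unsigned integers only, and parentheses are matched without the whitespace-skipping `nibble` introduced later. The semantic actions are written with `uncurry` instead of the named `plus`, `minus`, `times` and `divide`.

```haskell
import Data.Char (isDigit)

type Parser a b = [a] -> [(b, [a])]

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

satisfy :: (a -> Bool) -> Parser a a
satisfy p (x:xs) | p x = [(x, xs)]
satisfy _ _            = []

literal :: Eq a => a -> Parser a a
literal x = satisfy (== x)

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

xthen :: Parser a b -> Parser a c -> Parser a c
p1 `xthen` p2 = (p1 `and_then` p2) `using` snd

thenx :: Parser a b -> Parser a c -> Parser a b
p1 `thenx` p2 = (p1 `and_then` p2) `using` fst

many, some :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` uncurry (:)) `alt` succeed []
some p = (p `and_then` many p) `using` uncurry (:)

-- Simplifying assumption: unsigned integers only.
number :: Parser Char String
number = some (satisfy isDigit)

-- One parser per grammar rule, evaluating as it goes.
expn, term, factor :: Parser Char Double
expn = addition `alt` subtraction `alt` term
  where
    addition    = (term `and_then` (literal '+' `xthen` term)) `using` uncurry (+)
    subtraction = (term `and_then` (literal '-' `xthen` term)) `using` uncurry (-)
term = multiplication `alt` division `alt` factor
  where
    multiplication = (factor `and_then` (literal '*' `xthen` factor)) `using` uncurry (*)
    division       = (factor `and_then` (literal '/' `xthen` factor)) `using` uncurry (/)
factor = (number `using` read)
  `alt` (literal '(' `xthen` (expn `thenx` literal ')'))
```

The complete parse is the first result, so `fst (head (expn "12*(5+(7-2))"))` is `120.0`. Swapping the arithmetic functions for data constructors yields a parse tree instead of a value, without touching the grammar.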

Page 98: Combinator parsing

Moving beyond toy parsers

Page 99: Combinator parsing

Whitespace? 🤔

Page 100: Combinator parsing

white = (literal ' ') `alt` (literal '\t') `alt` (literal '\n')

Page 101: Combinator parsing

white = many (any literal " \t\n")

Page 102: Combinator parsing

/\s*/

Page 103: Combinator parsing

any p = foldr (alt.p) failure

Page 104: Combinator parsing

any p [x1,x2,...,xn] = (p x1) `alt` (p x2) `alt` ... `alt` (p xn)

Page 105: Combinator parsing

white = many (any literal " \t\n")

Page 106: Combinator parsing

nibble p = white `xthen` (p `thenx` white)

Page 107: Combinator parsing

The parser (nibble p) has the same behaviour as parser p, except that it eats up any white-space in the input string before or afterwards

Page 108: Combinator parsing

(nibble (literal 'a')) " a " -- => [('a',""),('a'," "),('a'," ")]

Page 109: Combinator parsing

symbol = nibble.string

Page 110: Combinator parsing

symbol "$fold" " $fold " -- => [("$fold", ""), ("$fold", " ")]
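
The whitespace helpers can be tried out with the sketch below. It uses the name `any_of`, as the later lexer slides do, because a top-level `any` would clash with the Prelude function of the same name; `cons` is an assumed helper that the slides leave undefined.

```haskell
type Parser a b = [a] -> [(b, [a])]

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

failure :: Parser a b
failure _ = []

literal :: Eq a => a -> Parser a a
literal x (y:ys) | x == y = [(y, ys)]
literal _ _               = []

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

xthen :: Parser a b -> Parser a c -> Parser a c
p1 `xthen` p2 = (p1 `and_then` p2) `using` snd

thenx :: Parser a b -> Parser a c -> Parser a b
p1 `thenx` p2 = (p1 `and_then` p2) `using` fst

-- Assumed helper, not defined on the slides.
cons :: (a, [a]) -> [a]
cons (x, xs) = x : xs

many :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` cons) `alt` succeed []

string :: Eq a => [a] -> Parser a [a]
string []     = succeed []
string (x:xs) = (literal x `and_then` string xs) `using` cons

-- Try p on each element of a list and combine the alternatives.
any_of :: (a -> Parser b c) -> [a] -> Parser b c
any_of p = foldr (alt . p) failure

-- Zero or more whitespace characters.
white :: Parser Char String
white = many (any_of literal " \t\n")

-- Skip whitespace on both sides of p.
nibble :: Parser Char b -> Parser Char b
nibble p = white `xthen` (p `thenx` white)

symbol :: String -> Parser Char String
symbol = nibble . string
```

With one space on each side, `(nibble (literal 'a')) " a "` evaluates to `[('a',""),('a'," ")]`: one result where the trailing space is consumed and one where it is left in the input.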

Page 111: Combinator parsing

The Offside Rule

Page 112: Combinator parsing

w = x + y
    where
      x = 10
      y = 15 - 5
z = w * 2

Page 113: Combinator parsing

w = x + y
    where
      x = 10
      y = 15 - 5
z = w * 2

Page 114: Combinator parsing

When obeying the offside rule, every token must lie either directly below, or to the right of its first token

Page 115: Combinator parsing

i.e. A weak indentation policy

Page 116: Combinator parsing

The Offside Combinator

Page 117: Combinator parsing

type Pos a = (a, (Integer, Integer))

Page 118: Combinator parsing

prelex "3 + \n 2 * (4 + 5)"
-- => [('3',(0,0)), ('+',(0,2)), ('2',(1,2)), ('*',(1,4)), … ]

Page 119: Combinator parsing

satisfy :: (a -> Bool) -> Parser a a
satisfy p [] = failure []
satisfy p (x:xs)
  | p x       = succeed x xs  -- if p(x) is true
  | otherwise = failure []

Page 120: Combinator parsing

satisfy :: (a -> Bool) -> Parser (Pos a) a
satisfy p [] = failure []
satisfy p (x:xs)
  | p a       = succeed a xs  -- if p(a) is true
  | otherwise = failure []
  where (a, (r, c)) = x

Page 121: Combinator parsing

satisfy :: (a -> Bool) -> Parser (Pos a) a
satisfy p [] = failure []
satisfy p (x:xs)
  | p a       = succeed a xs  -- if p(a) is true
  | otherwise = failure []
  where (a, (r, c)) = x

Page 122: Combinator parsing

offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- (p inpON) ]
  where
    inpON = takeWhile (onside (head inp)) inp
    inpOFF = drop (length inpON) inp
    onside (a, (r, c)) (b, (r', c')) = r' >= r && c' >= c

Page 123: Combinator parsing

offside :: Parser (Pos a) b -> Parser (Pos a) b

Page 124: Combinator parsing

offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- (p inpON) ]

Page 125: Combinator parsing

(nibble (literal 'a')) " a " -- => [('a',""),('a'," "),('a'," ")]

Page 126: Combinator parsing

offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- (p inpON) ]

Page 127: Combinator parsing

offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- (p inpON) ]
  where
    inpON = takeWhile (onside (head inp)) inp

Page 128: Combinator parsing

offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- (p inpON) ]
  where
    inpON = takeWhile (onside (head inp)) inp
    inpOFF = drop (length inpON) inp

Page 129: Combinator parsing

offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- (p inpON) ]
  where
    inpON = takeWhile (onside (head inp)) inp
    inpOFF = drop (length inpON) inp
    onside (a, (r, c)) (b, (r', c')) = r' >= r && c' >= c

Page 130: Combinator parsing

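inp_1: (3 + 2 * (4 + 5)) + (8 * 10), split across lines so that "+ (8 * 10)" starts in column 0 of the third line (offside)

inp_2: (3 + 2 * (4 + 5)) + (8 * 10), laid out so that every token is onside

[The original line breaks and indentation of both inputs were lost in extraction]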

Page 131: Combinator parsing

(offside expn) (prelex inp_1)
-- => [(21.0,[('+',(2,0)),('(',(2,2)),('8',(2,3)),('*',(2,5)),('1',(2,7)),('0',(2,8)),(')',(2,9))])]

(offside expn) (prelex inp_2)
-- => [(101.0,[])]
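
The slides never show `prelex`, so the sketch below assumes a simple version that tags every character with its (row, column) and does not expand tabs. `offside` follows the slide definition; `item`, `tokens` and `bodyTokens` are hypothetical names used only for the demonstration.

```haskell
import Data.Char (isSpace)

type Parser a b = [a] -> [(b, [a])]
type Pos a = (a, (Integer, Integer))  -- (symbol, (row, column))

-- Assumed implementation: number rows and columns from 0, no tab handling.
prelex :: String -> [Pos Char]
prelex = go (0, 0)
  where
    go _ [] = []
    go (r, c) (x:xs)
      | x == '\n' = (x, (r, c)) : go (r + 1, 0) xs
      | otherwise = (x, (r, c)) : go (r, c + 1) xs

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

-- Position-aware satisfy: tests the symbol and ignores its position.
satisfy :: (a -> Bool) -> Parser (Pos a) a
satisfy p ((a, _):xs) | p a = [(a, xs)]
satisfy _ _                 = []

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

many, some :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` uncurry (:)) `alt` succeed []
some p = (p `and_then` many p) `using` uncurry (:)

-- Run p on the onside prefix only, and require p to consume all of it.
offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- p inpON ]
  where
    inpON  = takeWhile (onside (head inp)) inp
    inpOFF = drop (length inpON) inp
    onside (_, (r, c)) (_, (r', c')) = r' >= r && c' >= c

-- Hypothetical demo parser: accepts any single symbol.
item :: Parser (Pos a) a
item = satisfy (const True)

-- Hypothetical demo input: whitespace dropped, as a lexer would do.
tokens :: [Pos Char]
tokens = filter (not . isSpace . fst) (prelex "x = a\n    b\ny")

-- Everything after "x =": the body starts at column 4.
bodyTokens :: [Pos Char]
bodyTokens = drop 2 tokens
```

Here `offside (some item) bodyTokens` evaluates to `[("ab",[('y',(2,0))])]`: `a` and `b` sit at column 4 or beyond, while `y` at column 0 is offside and is handed back as unused input.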

Page 132: Combinator parsing

Quick recap before we 🛫

Page 133: Combinator parsing

∅
|> succeed, failure
|> satisfy, literal
|> alt, and_then, using
|> many, some
|> string, thenx, xthen, ret
|> expression parser & evaluator
|> any, nibble, symbol
|> prelex, offside

Page 134: Combinator parsing

Practical parsers

Page 135: Combinator parsing

🎯 Syntactical analysis

🎯 Lexical analysis

🎯 Parse trees

Page 136: Combinator parsing

type Parser a b = [a] -> [(b, [a])]
type Pos a = (a, (Integer, Integer))

Page 137: Combinator parsing

data Tag = Ident | Number | Symbol | Junk deriving (Show, Eq)
type Token = (Tag, [Char])

Page 138: Combinator parsing

(Symbol, "if")
(Number, "123")

Page 139: Combinator parsing

Parse the input with parser p, and attach tag t to the result

Page 140: Combinator parsing

(p `tok` t) inp = [ (((t, xs), (r, c)), out) | (xs, out) <- p inp ]
  where (x, (r,c)) = head inp

Page 141: Combinator parsing

(p `tok` t) inp = [ ((<token>,<pos>),<unused input>) | (xs, out) <- p inp ]
  where (x, (r,c)) = head inp

Page 142: Combinator parsing

(p `tok` t) inp = [ (((t, xs), (r, c)), out) | (xs, out) <- p inp ]
  where (x, (r,c)) = head inp

Page 143: Combinator parsing

((string "where") `tok` Symbol) inp -- => ((Symbol,"where"), (r, c))

Page 144: Combinator parsing

many ((p1 `tok` t1) `alt` (p2 `tok` t2) `alt` ... `alt` (pn `tok` tn))

Page 145: Combinator parsing

[(p1, t1), (p2, t2), …, (pn, tn)]

Page 146: Combinator parsing

lex = many.(foldr op failure)
  where (p, t) `op` xs = (p `tok` t) `alt` xs

Page 147: Combinator parsing

🐉

Page 148: Combinator parsing

lex = many.(foldr op failure)
  where (p, t) `op` xs = (p `tok` t) `alt` xs

Page 149: Combinator parsing

-- Rightmost computation
cn = (pn `tok` tn) `alt` failure

Page 150: Combinator parsing

-- Followed by
(pn-1 `tok` tn-1) `alt` cn

Page 151: Combinator parsing

many ((p1 `tok` t1) `alt` (p2 `tok` t2) `alt` ... `alt` (pn `tok` tn))

Page 152: Combinator parsing

lexer = lex [ ((some (any_of literal " \n\t")), Junk),
              ((string "where"), Symbol),
              (word, Ident),
              (number, Number),
              ((any_of string ["(", ")", "="]), Symbol) ]

Page 153: Combinator parsing

lexer = lex [ ((some (any_of literal " \n\t")), Junk),
              ((string "where"), Symbol),
              (word, Ident),
              (number, Number),
              ((any_of string ["(", ")", "="]), Symbol) ]

Page 154: Combinator parsing

lexer = lex [ ((some (any_of literal " \n\t")), Junk),
              ((string "where"), Symbol),
              (word, Ident),
              (number, Number),
              ((any_of string ["(", ")", "="]), Symbol) ]

Page 155: Combinator parsing

head (lexer (prelex "where x = 10"))
-- => ([((Symbol,"where"),(0,0)),
--      ((Ident,"x"),(0,6)),
--      ((Symbol,"="),(0,8)),
--      ((Number,"10"),(0,10))],
--     [])

Page 156: Combinator parsing

(head.lexer.prelex) "where x = 10"
-- => ([((Symbol,"where"),(0,0)),
--      ((Ident,"x"),(0,6)),
--      ((Symbol,"="),(0,8)),
--      ((Number,"10"),(0,10))],
--     [])

Page 157: Combinator parsing

(head.lexer.prelex) "where x = 10"
-- => ([((Symbol,"where"),(0,0)),
--      ((Ident,"x"),(0,6)),
--      ((Symbol,"="),(0,8)),
--      ((Number,"10"),(0,10))],
--     [])

Function composition

Page 158: Combinator parsing

length ((lexer.prelex) "where x = 10") -- => 198

Page 159: Combinator parsing

Conflicts? Ambiguity?

Page 160: Combinator parsing

In this case, "where" is a source of conflict. It can be a symbol or an identifier.

Page 161: Combinator parsing

lexer = lex [ {- 1 -} ((some (any_of literal " \n\t")), Junk),
              {- 2 -} ((string "where"), Symbol),
              {- 3 -} (word, Ident),
              {- 4 -} (number, Number),
              {- 5 -} ((any_of string ["(",")","="]), Symbol) ]

Page 162: Combinator parsing

Higher priority, higher precedence

Page 163: Combinator parsing

Removing Junk

Page 164: Combinator parsing

strip :: [(Pos Token)] -> [(Pos Token)]
strip = filter ((/= Junk).fst.fst)

Page 165: Combinator parsing

((/= Junk).fst.fst) ((Symbol,"where"),(0,0)) -- => True
((/= Junk).fst.fst) ((Junk,"where"),(0,0)) -- => False

Page 166: Combinator parsing

(fst.head.lexer.prelex) "where x = 10"
-- => [((Symbol,"where"),(0,0)),
--     ((Junk," "),(0,5)),
--     ((Ident,"x"),(0,6)),
--     ((Junk," "),(0,7)),
--     ((Symbol,"="),(0,8)),
--     ((Junk," "),(0,9)),
--     ((Number,"10"),(0,10))]

Page 167: Combinator parsing

(strip.fst.head.lexer.prelex) "where x = 10"
-- => [((Symbol,"where"),(0,0)),
--     ((Ident,"x"),(0,6)),
--     ((Symbol,"="),(0,8)),
--     ((Number,"10"),(0,10))]
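
The lexing pipeline can be assembled into one runnable sketch. Several parts are assumptions: `prelex` is not shown on the slides and is given a simple implementation here, `number` is reduced to a run of digits, and `Prelude`'s own `lex` is hidden so that the slide's name can be reused.

```haskell
import Prelude hiding (lex)
import Data.Char (isDigit, isLetter)

type Parser a b = [a] -> [(b, [a])]
type Pos a = (a, (Integer, Integer))

data Tag = Ident | Number | Symbol | Junk deriving (Show, Eq)
type Token = (Tag, String)

-- Assumed implementation: tag each character with (row, column).
prelex :: String -> [Pos Char]
prelex = go (0, 0)
  where
    go _ [] = []
    go (r, c) (x:xs)
      | x == '\n' = (x, (r, c)) : go (r + 1, 0) xs
      | otherwise = (x, (r, c)) : go (r, c + 1) xs

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

failure :: Parser a b
failure _ = []

satisfy :: (a -> Bool) -> Parser (Pos a) a
satisfy p ((a, _):xs) | p a = [(a, xs)]
satisfy _ _                 = []

literal :: Eq a => a -> Parser (Pos a) a
literal x = satisfy (== x)

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

many, some :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` uncurry (:)) `alt` succeed []
some p = (p `and_then` many p) `using` uncurry (:)

string :: Eq a => [a] -> Parser (Pos a) [a]
string []     = succeed []
string (x:xs) = (literal x `and_then` string xs) `using` uncurry (:)

any_of :: (a -> Parser b c) -> [a] -> Parser b c
any_of p = foldr (alt . p) failure

-- Tag the text matched by p and record where the match started.
tok :: Parser (Pos Char) String -> Tag -> Parser (Pos Char) (Pos Token)
(p `tok` t) inp = [ (((t, xs), (r, c)), out) | (xs, out) <- p inp ]
  where (_, (r, c)) = head inp

-- Build a lexer from (parser, tag) pairs; earlier pairs take priority.
lex :: [(Parser (Pos Char) String, Tag)] -> Parser (Pos Char) [Pos Token]
lex = many . foldr op failure
  where (p, t) `op` xs = (p `tok` t) `alt` xs

lexer :: Parser (Pos Char) [Pos Token]
lexer = lex
  [ (some (any_of literal " \n\t"), Junk)
  , (string "where", Symbol)
  , (some (satisfy isLetter), Ident)   -- word
  , (some (satisfy isDigit), Number)   -- simplified number
  , (any_of string ["(", ")", "="], Symbol)
  ]

strip :: [Pos Token] -> [Pos Token]
strip = filter ((/= Junk) . fst . fst)
```

Only the first result is needed in practice, and laziness means the other alternatives are never computed when `head` is taken. `(strip . fst . head . lexer . prelex) "where x = 10"` yields the four non-junk tokens with their positions.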

Page 168: Combinator parsing

Syntax Analysis

Page 169: Combinator parsing

characters |> lexical analysis |> tokens

Page 170: Combinator parsing

tokens |> syntax analysis |> parse trees

Page 171: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Page 172: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Script

Page 173: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Definition

Page 174: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Body

Page 175: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Expression

Page 176: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Definition

Page 177: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Primitives

Page 178: Combinator parsing

data Script = Script [Def]
data Def = Def Var [Var] Expn
data Expn = Var Var | Num Double | Expn `Apply` Expn | Expn `Where` [Def]
type Var = [Char]

Page 179: Combinator parsing

prog = (many defn) `using` Script

Page 180: Combinator parsing

defn = ( (some (kind Ident)) `and_then` ((lit "=") `xthen` (offside body))) `using` defnFN

Page 181: Combinator parsing

body = ( expr `and_then` (((lit "where") `xthen` (some defn)) `opt` [])) `using` bodyFN

Page 182: Combinator parsing

expr = (some prim) `using` (foldl1 Apply)

Page 183: Combinator parsing

prim = ((kind Ident) `using` Var) `alt` ((kind Number) `using` numFN) `alt` ((lit "(") `xthen` (expr `thenx` (lit ")")))

Page 184: Combinator parsing

-- only allow a kind of tag
kind :: Tag -> Parser (Pos Token) [Char]
kind t = (satisfy ((== t).fst)) `using` snd

-- only allow a given symbol
lit :: [Char] -> Parser (Pos Token) [Char]
lit xs = (literal (Symbol, xs)) `using` snd

Page 185: Combinator parsing

prog = (many defn) `using` Script

Page 186: Combinator parsing

defn = ( (some (kind Ident)) `and_then` ((lit "=") `xthen` (offside body))) `using` defnFN

Page 187: Combinator parsing

body = ( expr `and_then` (((lit "where") `xthen` (some defn)) `opt` [])) `using` bodyFN

Page 188: Combinator parsing

expr = (some prim) `using` (foldl1 Apply)

Page 189: Combinator parsing

prim = ((kind Ident) `using` Var) `alt` ((kind Number) `using` numFN) `alt` ((lit "(") `xthen` (expr `thenx` (lit ")")))

Page 190: Combinator parsing

data Script = Script [Def]
data Def = Def Var [Var] Expn
data Expn = Var Var | Num Double | Expn `Apply` Expn | Expn `Where` [Def]
type Var = [Char]

Page 191: Combinator parsing

The functions highlighted in orange on the slides (the ones passed to `using`) are for transforming values.

Page 192: Combinator parsing

Use data constructors to generate parse trees

Page 193: Combinator parsing

Use evaluation functions to evaluate and generate a value

Page 194: Combinator parsing

f x y = add a b
        where
          a = 25
          b = sub x y

answer = mult (f 3 7) 5

Page 195: Combinator parsing

Script [
  Def "f" ["x","y"] (
    ((Var "add" `Apply` Var "a") `Apply` Var "b")
    `Where` [
      Def "a" [] (Num 25.0),
      Def "b" [] ((Var "sub" `Apply` Var "x") `Apply` Var "y")]),
  Def "answer" [] (
    (Var "mult" `Apply` ((Var "f" `Apply` Num 3.0) `Apply` Num 7.0))
    `Apply` Num 5.0)]
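
The syntax-analysis stage can be sketched on its own by feeding it a hand-written token list. `opt`, `defnFN`, `bodyFN` and `numFN` are used on the slides but never shown, so the versions below are assumptions chosen to fit the types; `lit` is written with `satisfy` directly, and `sampleTokens` is a hypothetical input standing for `answer = mult 3 5`.

```haskell
type Parser a b = [a] -> [(b, [a])]
type Pos a = (a, (Integer, Integer))

data Tag = Ident | Number | Symbol | Junk deriving (Show, Eq)
type Token = (Tag, String)

type Var = String
data Script = Script [Def] deriving (Show, Eq)
data Def = Def Var [Var] Expn deriving (Show, Eq)
data Expn = Var Var | Num Double | Expn `Apply` Expn | Expn `Where` [Def]
  deriving (Show, Eq)

succeed :: b -> Parser a b
succeed v inp = [(v, inp)]

satisfy :: (a -> Bool) -> Parser (Pos a) a
satisfy p ((a, _):xs) | p a = [(a, xs)]
satisfy _ _                 = []

alt :: Parser a b -> Parser a b -> Parser a b
(p1 `alt` p2) inp = p1 inp ++ p2 inp

and_then :: Parser a b -> Parser a c -> Parser a (b, c)
(p1 `and_then` p2) inp =
  [ ((v1, v2), out2) | (v1, out1) <- p1 inp, (v2, out2) <- p2 out1 ]

using :: Parser a b -> (b -> c) -> Parser a c
(p `using` f) inp = [ (f v, out) | (v, out) <- p inp ]

xthen :: Parser a b -> Parser a c -> Parser a c
p1 `xthen` p2 = (p1 `and_then` p2) `using` snd

thenx :: Parser a b -> Parser a c -> Parser a b
p1 `thenx` p2 = (p1 `and_then` p2) `using` fst

many, some :: Parser a b -> Parser a [b]
many p = ((p `and_then` many p) `using` uncurry (:)) `alt` succeed []
some p = (p `and_then` many p) `using` uncurry (:)

-- Assumed: try p, otherwise succeed with the default v.
opt :: Parser a b -> b -> Parser a b
p `opt` v = p `alt` succeed v

offside :: Parser (Pos a) b -> Parser (Pos a) b
offside p inp = [ (v, inpOFF) | (v, []) <- p inpON ]
  where
    inpON  = takeWhile (onside (head inp)) inp
    inpOFF = drop (length inpON) inp
    onside (_, (r, c)) (_, (r', c')) = r' >= r && c' >= c

kind :: Tag -> Parser (Pos Token) String
kind t = satisfy ((== t) . fst) `using` snd

lit :: String -> Parser (Pos Token) String
lit xs = satisfy (== (Symbol, xs)) `using` snd

prog :: Parser (Pos Token) Script
prog = many defn `using` Script

defn :: Parser (Pos Token) Def
defn = (some (kind Ident) `and_then` (lit "=" `xthen` offside body)) `using` defnFN

body, expr, prim :: Parser (Pos Token) Expn
body = (expr `and_then` ((lit "where" `xthen` some defn) `opt` [])) `using` bodyFN
expr = some prim `using` foldl1 Apply
prim = (kind Ident `using` Var)
  `alt` (kind Number `using` numFN)
  `alt` (lit "(" `xthen` (expr `thenx` lit ")"))

-- Assumed semantic actions.
defnFN :: ([Var], Expn) -> Def
defnFN (f:xs, e) = Def f xs e
defnFN ([], _)   = error "unreachable: some never yields an empty list"

bodyFN :: (Expn, [Def]) -> Expn
bodyFN (e, []) = e
bodyFN (e, ds) = e `Where` ds

numFN :: String -> Expn
numFN = Num . read

-- Hypothetical, already stripped tokens for: answer = mult 3 5
sampleTokens :: [Pos Token]
sampleTokens =
  [ ((Ident, "answer"), (0, 0)), ((Symbol, "="), (0, 7)), ((Ident, "mult"), (0, 9))
  , ((Number, "3"), (0, 14)), ((Number, "5"), (0, 16)) ]
```

Taking `fst (head (prog sampleTokens))` gives a `Script` with a single `Def "answer"` whose body applies `mult` to `3.0` and then to `5.0`, mirroring the shape of the larger tree above.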

Page 196: Combinator parsing

Strategy for writing parsers

Page 197: Combinator parsing

1. Identify components i.e. Lexical elements

Page 198: Combinator parsing

lexer = lex [ ((some (any_of literal " \n\t")), Junk),
              ((string "where"), Symbol),
              (word, Ident),
              (number, Number),
              ((any_of string ["(", ")", "="]), Symbol) ]

Page 199: Combinator parsing

2. Structure these elements a.k.a. syntax

Page 200: Combinator parsing

defn = ((some (kind Ident)) `and_then` ((lit "=") `xthen` (offside body))) `using` defnFN

body = (expr `and_then` (((lit "where") `xthen` (some defn)) `opt` [])) `using` bodyFN

expr = (some prim) `using` (foldl1 Apply)

prim = ((kind Ident) `using` Var) `alt` ((kind Number) `using` numFN) `alt` ((lit "(") `xthen` (expr `thenx` (lit ")")))

Page 201: Combinator parsing

3. BNF notation is very helpful

Page 202: Combinator parsing

4. TDD in the absence of types

Page 203: Combinator parsing

Where to, next?

Page 204: Combinator parsing

Monadic Parsers
Graham Hutton, Erik Meijer

Page 205: Combinator parsing

Introduction to FP
Philip Wadler

Page 206: Combinator parsing

The Dragon Book
If your interest is in compilers

Page 207: Combinator parsing

Libraries?

Page 208: Combinator parsing

Haskell: Parsec, MegaParsec. ✨
OCaml: Angstrom. ✨ 🚀
Ruby: rparsec, or roll your own
Elixir: Combine, ExParsec
Python: Parsec. ✨

Page 209: Combinator parsing

Thank you!

Page 210: Combinator parsing

Twitter: @_swanand
GitHub: @swanandp

