Grammars CPSC 5135

Grammars CPSC 5135. Formal Definitions A symbol is a character. It represents an abstract entity that has no inherent meaning Examples: a, A, 3, *, -,=

Jan 02, 2016


Stewart White
Transcript
Page 1:

Grammars

CPSC 5135

Page 2:

Formal Definitions

• A symbol is a character. It represents an abstract entity that has no inherent meaning.

• Examples: a, A, 3, *, -, =

Page 3:

Formal Definitions

• An alphabet is a finite set of symbols.

• Examples: A = { a, b, c } B = { 0, 1 }

Page 4:

Formal Definitions

• A string (or word) is a finite sequence of symbols from a given alphabet.

• Examples:

S = { 0, 1 } is an alphabet

0, 1, 11010, 101, 111 are strings from S

A = { a, b, c, d } is an alphabet

bad, cab, dab, d, aaaaa are strings from A
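The definition can be checked mechanically. The following is a minimal sketch, assuming symbols are single characters and an alphabet is a Python set; the helper name `is_string_over` is hypothetical and not part of the slides.

```python
def is_string_over(word, alphabet):
    """Return True if every symbol of `word` belongs to `alphabet`."""
    return all(symbol in alphabet for symbol in word)

# The two alphabets from the examples.
S = {"0", "1"}
A = {"a", "b", "c", "d"}

print(is_string_over("11010", S))  # True
print(is_string_over("bad", A))    # True
print(is_string_over("bed", A))    # False, because e is not a symbol of A
```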

Page 5:

Formal Definitions

• A language is a set of strings from an alphabet.

• The set can be finite or infinite.

• Examples:

A = { 0, 1 }

L1 = { 00, 01, 10, 11 }

L2 = { 010, 0110, 01110, 011110, … }
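A finite language can be written out as a set, while an infinite one needs a membership rule. The sketch below assumes that the pattern behind L2 is "a 0, then one or more 1s, then a 0"; the slide only lists the first few members, so that reading is an assumption, as is the function name `in_L2`.

```python
import re

# L1 is finite, so it can be listed explicitly.
L1 = {"00", "01", "10", "11"}

def in_L2(word):
    """Membership test for L2, read here as { 0 1^n 0 : n >= 1 } (an assumption)."""
    return re.fullmatch(r"01+0", word) is not None

print("01" in L1)        # True
print(in_L2("011110"))   # True
print(in_L2("00"))       # False under the assumed pattern
```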

Page 6:

Formal Definitions

• A grammar is a quadruple G = (V, Σ, R, S) where

1) V is a finite set of variables (non-terminals),

2) Σ is a finite set of terminals, disjoint from V,

3) R is a finite set of rules. The left side of each rule is a string of one or more elements from V ∪ Σ, and the right side is a string of zero or more elements from V ∪ Σ,

4) S is an element of V and is called the start symbol.

Page 7:

Formal Definitions

• Example grammar:

• G = (V, Σ, R, S)

V = { S, A }

Σ = { a, b }

R = { S → aA,

A → bA,

A → a }
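One way to hold this example in a program is sketched below. It assumes every symbol is a single character and each rule is a (left, right) pair of strings; the function `is_well_formed` is a hypothetical helper that checks the four conditions of the grammar definition.

```python
V = {"S", "A"}
SIGMA = {"a", "b"}
R = [("S", "aA"), ("A", "bA"), ("A", "a")]
START = "S"

def is_well_formed(variables, terminals, rules, start):
    """Check the conditions in the definition of a grammar."""
    symbols = variables | terminals
    # Terminals must be disjoint from variables; the start symbol must be a variable.
    if variables & terminals or start not in variables:
        return False
    for left, right in rules:
        # Left side: one or more symbols. Right side: zero or more symbols.
        if len(left) == 0:
            return False
        if not all(ch in symbols for ch in left + right):
            return False
    return True

print(is_well_formed(V, SIGMA, R, START))  # True
```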

Page 8:

Derivations

R = { S → aA, A → bA, A → a }

• A derivation is a sequence of replacements that begins with the start symbol; each step replaces a substring matching the left side of a rule with the string on the right side of that rule.

S ⇒ aA ⇒ abA ⇒ abbA ⇒ abba
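The derivation of abba can be replayed step by step. This is a minimal sketch, assuming the three rules S → aA, A → bA, A → a are numbered 0, 1, 2 and that each step rewrites the leftmost occurrence of the rule's left side; the function name `derive` is hypothetical.

```python
RULES = [("S", "aA"), ("A", "bA"), ("A", "a")]

def derive(start, steps):
    """Apply rules by index, rewriting the leftmost match of each left side."""
    form = start
    forms = [form]
    for index in steps:
        left, right = RULES[index]
        if left not in form:
            raise ValueError(f"rule {index} does not apply to {form!r}")
        form = form.replace(left, right, 1)
        forms.append(form)
    return forms

print(" => ".join(derive("S", [0, 1, 1, 2])))
# S => aA => abA => abbA => abba
```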

Page 9:

Derivations

• What strings can be generated from the following grammar?

S → aBa

B → aBa

B → b
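One way to explore the question is to enumerate short strings by brute force. The sketch below reads the rules as S → aBa, B → aBa, B → b, and assumes variables are uppercase letters, terminals are lowercase letters, and no rule shortens a sentential form (so forms longer than the bound can be discarded). The function name `generate` is hypothetical.

```python
from collections import deque

def generate(rules, start, max_len):
    """Collect every terminal string of length <= max_len derivable from start."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if form.islower():          # only terminals left: a generated string
            words.add(form)
            continue
        for left, right in rules:
            pos = form.find(left)
            while pos != -1:        # try every position where the rule applies
                new = form[:pos] + right + form[pos + len(left):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(left, pos + 1)
    return sorted(words, key=lambda w: (len(w), w))

RULES = [("S", "aBa"), ("B", "aBa"), ("B", "b")]
print(generate(RULES, "S", 7))
# ['aba', 'aabaa', 'aaabaaa']
```

The output suggests strings of the form n a's, one b, then n a's, with n at least 1; the enumeration only illustrates the pattern and does not prove it.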

Page 10:

Formal Definitions

• The language generated by a grammar is the set of all strings of terminal symbols which are derivable from S in 0 or more steps.

• What is the language generated by this grammar?

S → a

S → aB

B → aB

B → a

Page 11:

Kleene Closure

• Let Σ be a set of strings. Σ* is called the Kleene closure of Σ and represents the set of all concatenations of 0 or more strings in Σ. The concatenation of 0 strings is the empty string, written ε.

• Examples:

Σ* = { 1 }* = { ε, 1, 11, 111, 1111, … }

Σ* = { 01 }* = { ε, 01, 0101, 010101, … }

Σ* = { 0 + 1 }* = the set of all possible strings of 0’s and 1’s (+ means union)
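Because a Kleene closure is infinite whenever Σ contains a non-empty string, a program can only list a finite part of it. The sketch below enumerates all concatenations of at most `max_parts` strings; the function name `kleene` and the cut-off parameter are assumptions made for illustration, and the empty string appears as `''`.

```python
from itertools import product

def kleene(strings, max_parts):
    """All concatenations of 0..max_parts strings drawn from `strings`."""
    result = set()
    for count in range(max_parts + 1):
        for parts in product(strings, repeat=count):
            result.add("".join(parts))
    return sorted(result, key=lambda w: (len(w), w))

print(kleene({"1"}, 3))       # ['', '1', '11', '111']
print(kleene({"01"}, 2))      # ['', '01', '0101']
print(kleene({"0", "1"}, 2))  # ['', '0', '1', '00', '01', '10', '11']
```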

Page 12:

Formal Definitions

• A grammar G = (V,Σ, R, S) is right-linear if all rules are of the form:

A → xB

A → x

where A, B ∈ V and x ∈ Σ*

Page 13:

Right-linear Grammar

• G = (V, Σ, R, S)

V = { S, B }

Σ = { a, b }

R = { S → aS,

S → B,

B → bB,

B → ε }

What language is generated?
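A right-linear grammar can be turned into a simple recognizer: each rule A → xB consumes the prefix x and continues with B, and each rule A → x must match the rest of the input exactly. The sketch below encodes the four rules that way, with `None` marking "no following variable". It assumes the grammar has no cycle of rules that consume nothing (true here); the names `RULES` and `accepts` are hypothetical.

```python
# variable -> list of (terminal prefix, next variable or None)
RULES = {
    "S": [("a", "S"), ("", "B")],   # S -> aS, S -> B
    "B": [("b", "B"), ("", None)],  # B -> bB, B -> epsilon
}

def accepts(word, variable="S"):
    """Return True if `word` can be derived from `variable`."""
    for prefix, following in RULES[variable]:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        if following is None:
            if rest == "":
                return True
        elif accepts(rest, following):
            return True
    return False

for w in ["", "aa", "bbb", "aabbb", "ba", "abab"]:
    print(repr(w), accepts(w))
```

On these samples the accepted strings are exactly those with all a's before all b's, which is consistent with the language a*b*.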

Page 14:

Formal Definitions

• A grammar G = (V,Σ, R, S) is left-linear if all rules are of the form:

A → Bx

A → x

where A, B ∈ V and x ∈ Σ*

Page 15:

Formal Definitions

• A regular grammar is one that is either right or left linear.

• Let Q be a finite set and let Σ be a finite set of symbols. Let δ be a function from Q × Σ to Q, let q0 be an element of Q, and let A be a subset of Q. We call each element of Q a state, δ the transition function, q0 the initial state, and A the set of accepting states. A deterministic finite automaton (DFA) is then the 5-tuple < Q, Σ, q0, δ, A >.

• Every regular grammar is equivalent to a DFA: for each regular grammar there is a DFA that accepts exactly the language the grammar generates.
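The 5-tuple maps directly onto a small simulator. The sketch below is an illustration, not part of the slides: the state names `q0`, `q1`, `dead` and the choice of language (any number of a's followed by any number of b's) are assumptions.

```python
def run_dfa(delta, start, accepting, word):
    """Follow the transition function symbol by symbol; accept if the final state is accepting."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting

# Transition function delta: (state, symbol) -> state
DELTA = {
    ("q0", "a"): "q0",     ("q0", "b"): "q1",
    ("q1", "a"): "dead",   ("q1", "b"): "q1",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}
ACCEPTING = {"q0", "q1"}

print(run_dfa(DELTA, "q0", ACCEPTING, "aabb"))  # True
print(run_dfa(DELTA, "q0", ACCEPTING, "aba"))   # False
```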

Page 16:

Language Definition

• Recognition – a machine is constructed that reads a string and pronounces whether the string is in the language or not. (Compiler)

• Generation – a device is created to generate strings that belong to the language. (Grammar)

Page 17:

Chomsky Hierarchy

• Noam Chomsky (1950s) described 4 classes of grammars:

1) Type 0 – Unrestricted grammars

2) Type 1 – Context-sensitive grammars

3) Type 2 – Context-free grammars

4) Type 3 – Regular grammars

Page 18:

Grammars

• Context-free and regular grammars have application in computing

• Context-free grammar – each rule or production has a left side consisting of a single non-terminal

Page 19:

Backus-Naur form (BNF)

• BNF was used to describe programming language syntax and is similar to Chomsky’s context free grammars

• A meta-language is a language used to describe another language

• BNF is a meta-language for computer languages

Page 20:

BNF

• Consists of nonterminal symbols, terminal symbols (lexemes and tokens), and rules or productions

• <if-stmt> → if <logical-expr> then <stmt>

• <if-stmt> → if <logical-expr> then <stmt> else <stmt>

• <if-stmt> → if <logical-expr> then <stmt>

| if <logical-expr> then <stmt> else <stmt>

Page 21:

A Small Grammar

<program> → begin <stmt_list> end

<stmt_list> → <stmt> | <stmt> ; <stmt_list>

<stmt> → <var> = <expression>

<var> → A | B | C

<expression> → <var> + <var>

| <var> - <var>

| <var>

Page 22:

A Derivation

<program> ⇒ begin <stmt_list> end

⇒ begin <stmt> end

⇒ begin <var> = <expression> end

⇒ begin A = <expression> end

⇒ begin A = <var> + <var> end

⇒ begin A = B + <var> end

⇒ begin A = B + C end

Page 23:

Terms

• Each of the strings in a derivation is called a sentential form.

• If the leftmost non-terminal is always the one selected for replacement, the derivation is a leftmost derivation.

• Derivations can be leftmost, rightmost, or neither

• Derivation order has no effect on the language generated by the grammar
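The terms above can be made concrete with the small grammar for `begin ... end` programs. The sketch below is a hypothetical helper, not part of the slides: each step picks either the leftmost or the rightmost non-terminal and replaces it with the alternative whose index is given. Both orders reach the same sentence, which illustrates that derivation order does not change the language.

```python
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}

def derive(choices, leftmost=True):
    """Return the sentential forms produced by applying `choices` in order."""
    form = ["<program>"]
    forms = [" ".join(form)]
    for choice in choices:
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        i = positions[0] if leftmost else positions[-1]
        form = form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]
        forms.append(" ".join(form))
    return forms

left = derive([0, 0, 0, 0, 0, 1, 2], leftmost=True)
right = derive([0, 0, 0, 0, 2, 1, 0], leftmost=False)
print(left[-1])   # begin A = B + C end
print(right[-1])  # begin A = B + C end
```

Every element of `left` and `right` is a sentential form; only the last one is a sentence, because it contains no non-terminals.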

Page 24:

Derivations Yield Parse Trees

<program> ⇒ begin <stmt_list> end

⇒ begin <stmt> end

⇒ begin <var> = <expression> end

⇒ begin A = <expression> end

⇒ begin A = <var> + <var> end

⇒ begin A = B + <var> end

⇒ begin A = B + C end

Parse tree (children are indented under their parent):

<program>
  begin
  <stmt_list>
    <stmt>
      <var>
        A
      =
      <expression>
        <var>
          B
        +
        <var>
          C
  end

Page 25:

Parse Trees

• Parse trees describe the hierarchical structure of the sentences of the language they define.

• A grammar that generates a sentence for which there are two or more distinct parse trees is ambiguous.

Page 26:

An Ambiguous Grammar

<assign> → <id> = <expr>

<id> → A | B | C

<expr> → <expr> + <expr>

| <expr> * <expr>

| ( <expr> )

| <id>

Page 27:

Two Parse Trees – Same Sentence

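Sentence: A = B + C * A (children are indented under their parent)

Tree 1 (the root <expr> is an addition):

<assign>
  <id>
    A
  =
  <expr>
    <expr>
      <id>
        B
    +
    <expr>
      <expr>
        <id>
          C
      *
      <expr>
        <id>
          A

Tree 2 (the root <expr> is a multiplication):

<assign>
  <id>
    A
  =
  <expr>
    <expr>
      <expr>
        <id>
          B
      +
      <expr>
        <id>
          C
    *
    <expr>
      <id>
        A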

Page 28:

Derivation 1

<assign> ⇒ <id> = <expr>

⇒ A = <expr>

⇒ A = <expr> + <expr>

⇒ A = <id> + <expr>

⇒ A = B + <expr>

⇒ A = B + <expr> * <expr>

⇒ A = B + <id> * <expr>

⇒ A = B + C * <expr>

⇒ A = B + C * <id>

⇒ A = B + C * A

Page 29:

Derivation 2

<assign> ⇒ <id> = <expr>

⇒ A = <expr>

⇒ A = <expr> * <expr>

⇒ A = <expr> + <expr> * <expr>

⇒ A = <id> + <expr> * <expr>

⇒ A = B + <expr> * <expr>

⇒ A = B + <id> * <expr>

⇒ A = B + C * <expr>

⇒ A = B + C * <id>

⇒ A = B + C * A
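The two leftmost derivations above correspond to two different parse trees for the same sentence. One way to confirm this mechanically is to count parse trees. The sketch below covers only the <expr> part of the ambiguous grammar (<expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>) and assumes the input is already split into tokens; the function name `count_expr_trees` is hypothetical.

```python
from functools import lru_cache

IDS = {"A", "B", "C"}

def count_expr_trees(tokens):
    """Count the distinct parse trees that derive `tokens` from <expr>."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from <expr>.
        total = 0
        if j - i == 1 and tokens[i] in IDS:
            total += 1                                # <expr> -> <id>
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)              # <expr> -> ( <expr> )
        for k in range(i + 1, j - 1):
            if tokens[k] in {"+", "*"}:               # <expr> -> <expr> op <expr>
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_expr_trees("B + C * A".split()))      # 2
print(count_expr_trees("( B + C ) * A".split()))  # 1
```

A count above 1 for any sentence is enough to show that the grammar is ambiguous.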

Page 30:

Ambiguity

• Parse trees are used to determine the semantics of a sentence

• Ambiguous grammars lead to semantic ambiguity - this is intolerable in a computer language

• Often, ambiguity in a grammar can be removed

Page 31:

Unambiguous Grammar

<assign> → <id> = <expr>

<id> → A | B | C

<expr> → <expr> + <term> | <term>

<term> → <term> * <factor> | <factor>

<factor> → ( <expr> ) | <id>

• This grammar makes multiplication take precedence over addition

Page 32:

Associativity of Operators

<assign> → <id> = <expr>

<id> → A | B | C

<expr> → <expr> + <term> | <term>

<term> → <term> * <factor> | <factor>

<factor> → ( <expr> ) | <id>

Addition operators associate from left to right

Parse tree for A = B + C + A (children are indented under their parent):

<assign>
  <id>
    A
  =
  <expr>
    <expr>
      <expr>
        <term>
          <factor>
            <id>
              B
      +
      <term>
        <factor>
          <id>
            C
    +
    <term>
      <factor>
        <id>
          A
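The grouping this grammar imposes can be shown with a small parser that mirrors the left-recursive rules directly. For <expr> → <expr> + <term>, it splits the token list at the last + that is outside parentheses: everything to the left is an <expr>, everything to the right is a <term>. This is a sketch for illustration only (the helper names are hypothetical, and it is not how production parsers handle left recursion); it returns a fully parenthesized string so the tree shape is visible.

```python
def split_last(tokens, op):
    """Index of the last `op` outside parentheses, or -1 if there is none."""
    depth = 0
    for i in range(len(tokens) - 1, -1, -1):
        tok = tokens[i]
        if tok == ")":
            depth += 1
        elif tok == "(":
            depth -= 1
        elif tok == op and depth == 0:
            return i
    return -1

def expr(tokens):
    # <expr> -> <expr> + <term> | <term>
    i = split_last(tokens, "+")
    if i == -1:
        return term(tokens)
    return f"({expr(tokens[:i])} + {term(tokens[i + 1:])})"

def term(tokens):
    # <term> -> <term> * <factor> | <factor>
    i = split_last(tokens, "*")
    if i == -1:
        return factor(tokens)
    return f"({term(tokens[:i])} * {factor(tokens[i + 1:])})"

def factor(tokens):
    # <factor> -> ( <expr> ) | <id>
    if len(tokens) == 1 and tokens[0] in ("A", "B", "C"):
        return tokens[0]
    if len(tokens) >= 3 and tokens[0] == "(" and tokens[-1] == ")":
        return expr(tokens[1:-1])
    raise ValueError(f"not a <factor>: {tokens}")

print(expr("B + C * A".split()))  # (B + (C * A))
print(expr("B + C + A".split()))  # ((B + C) + A)
```

The first result shows multiplication binding tighter than addition; the second shows addition grouping from the left, matching the parse tree for A = B + C + A.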

Page 33:

BNF

• A BNF rule whose left-hand side also appears at the beginning of its right-hand side is left recursive.

• Left recursion specifies left associativity.

• Right recursion is usually used for right-associative operators such as exponentiation.

<factor> → <exp> ** <factor> | <exp>

<exp> → ( <expr> ) | <id>
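Right recursion translates naturally into a recursive function, and the recursion makes the operator group from the right. The sketch below follows <factor> → <exp> ** <factor> | <exp>, but substitutes integer literals for <exp> so the grouping can be observed numerically; that substitution and the function name `power` are assumptions made for illustration.

```python
def power(tokens):
    """Evaluate <factor> -> <exp> ** <factor> | <exp>, with integers standing in for <exp>."""
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected an integer")
    base = int(tokens[0])
    if len(tokens) == 1:
        return base                       # <factor> -> <exp>
    if tokens[1] != "**":
        raise SyntaxError("expected '**'")
    return base ** power(tokens[2:])      # <factor> -> <exp> ** <factor>

print(power("2 ** 3 ** 2".split()))  # 512
```

The result is 2 ** (3 ** 2) = 512; a left-associative grouping, (2 ** 3) ** 2, would give 64 instead.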

Page 34:

Ambiguous If Grammar

<stmt> → <if_stmt>

<if_stmt> → if <logic_expr> then <stmt>

| if <logic_expr> then <stmt> else <stmt>

• Consider the sentential form:

if <logic_expr> then if <logic_expr> then <stmt> else <stmt>

Page 35:

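Parse Trees for an If Statement

Tree 1 (the else belongs to the outer if; children are indented under their parent):

<if_stmt>
  if
  <logic_expr>
  then
  <stmt>
    <if_stmt>
      if
      <logic_expr>
      then
      <stmt>
  else
  <stmt>

Tree 2 (the else belongs to the inner if):

<if_stmt>
  if
  <logic_expr>
  then
  <stmt>
    <if_stmt>
      if
      <logic_expr>
      then
      <stmt>
      else
      <stmt>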

Page 36:

Unambiguous Grammar for If Statements

<stmt> → <matched> | <unmatched>

<matched> → if <logic_expr> then <matched> else <matched>

| any non-if statement

<unmatched> → if <logic_expr> then <stmt>

| if <logic_expr> then <matched> else <unmatched>
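The effect of the matched/unmatched grammar is that an else always pairs with the nearest if that does not yet have one. A hand-written parser gets the same pairing by consuming an else as soon as it sees one. The sketch below uses that greedy technique rather than the grammar itself, and assumes conditions and non-if statements are single tokens; the function name `parse_stmt` is hypothetical.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement starting at tokens[pos]; return (tree, next_pos)."""
    if tokens[pos] != "if":
        return tokens[pos], pos + 1               # any non-if statement
    condition = tokens[pos + 1]
    if tokens[pos + 2] != "then":
        raise SyntaxError("expected 'then'")
    then_part, pos = parse_stmt(tokens, pos + 3)
    # Greedy step: an else that follows is attached to this (innermost open) if.
    if pos < len(tokens) and tokens[pos] == "else":
        else_part, pos = parse_stmt(tokens, pos + 1)
        return ("if", condition, then_part, else_part), pos
    return ("if", condition, then_part), pos

tree, _ = parse_stmt("if c1 then if c2 then s1 else s2".split())
print(tree)  # ('if', 'c1', ('if', 'c2', 's1', 's2'))
```

The else branch s2 ends up inside the inner if, so the outer if has no else part.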

Page 37:

Extended BNF (EBNF)

• Optional parts are denoted by [ … ]

<selection> → if ( <expr> ) <stmt> [ else <stmt> ]

• Braces indicate that the enclosed part can be repeated indefinitely or left out

<ident_list> → <identifier> { , <identifier> }

• Multiple-choice options are put in parentheses and separated by the or operator |

<for_stmt> → for <var> := <expr> (to | downto) <expr> do <stmt>
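EBNF braces map directly onto a loop in a hand-written parser. The sketch below recognizes <ident_list> → <identifier> { , <identifier> } over a list of tokens; it assumes an identifier is anything Python's `str.isidentifier` accepts, and the function name `parse_ident_list` is hypothetical.

```python
def parse_ident_list(tokens):
    """Recognize <ident_list> -> <identifier> { , <identifier> } and return the names."""
    names = []
    pos = 0

    def identifier():
        nonlocal pos
        if pos < len(tokens) and tokens[pos].isidentifier():
            names.append(tokens[pos])
            pos += 1
        else:
            raise SyntaxError("expected an identifier")

    identifier()
    # The braces { , <identifier> } become a while loop: zero or more repetitions.
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1
        identifier()
    if pos != len(tokens):
        raise SyntaxError("unexpected token after identifier list")
    return names

print(parse_ident_list(["x", ",", "y", ",", "z"]))  # ['x', 'y', 'z']
```

An optional part in square brackets would become a single `if` test in the same style.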

Page 38:

BNF vs EBNF for Expressions

BNF:

<expr> → <expr> + <term>

| <expr> - <term>

| <term>

<term> → <term> * <factor>

| <term> / <factor>

| <factor>

EBNF:

<expr> → <term> { (+ | -) <term> }

<term> → <factor> { (* | /) <factor> }
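The EBNF form reads almost like code: each pair of braces is a loop that keeps folding the next operand into a running result, which yields left-to-right grouping. The sketch below evaluates expressions that way. The slide does not define <factor>, so treating it as a non-negative integer literal is an assumption, as is the function name `evaluate`.

```python
def evaluate(tokens):
    """Evaluate a token list using the EBNF rules for <expr> and <term>."""
    pos = 0

    def factor():
        # Assumption: <factor> is a non-negative integer literal.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected an integer")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # <term> -> <factor> { (* | /) <factor> }
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():
        # <expr> -> <term> { (+ | -) <term> }
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return result

print(evaluate("8 - 3 - 2".split()))  # 3
print(evaluate("2 + 3 * 4".split()))  # 14
```

The first result shows subtraction grouping as (8 - 3) - 2, and the second shows that <term> is evaluated before it is added, so multiplication takes precedence.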