LR-Grammars LR(0), LR(1), and LR(K)
LR-Grammars LR(0), LR(1), and LR(K). Deterministic Context-Free Languages DCFL A family of languages that are accepted by a Deterministic Pushdown Automaton.

Dec 15, 2015

Freddie Coe
Transcript
Page 1

LR-Grammars

LR(0), LR(1), and LR(K)

Page 2

Deterministic Context-Free Languages (DCFLs)

A family of languages that are accepted by a Deterministic Pushdown Automaton (DPDA).

Many programming languages can be described by means of DCFLs.

Page 3

Prefix and Proper Prefix

Prefix (of a string): any number of leading symbols of that string.
Example: abc
Prefixes: ε, a, ab, abc

Proper prefix (of a string): a prefix of a string, but not the string itself.
Example: abc
Proper prefixes: ε, a, ab
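
The two definitions above can be sketched in a few lines of Python. The function names `prefixes` and `proper_prefixes` are illustrative only and do not come from the slides; the empty string `''` plays the role of ε.

```python
def prefixes(s: str) -> list[str]:
    """All prefixes of s, from the empty string up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]


def proper_prefixes(s: str) -> list[str]:
    """All prefixes of s except s itself."""
    return prefixes(s)[:-1]


print(prefixes("abc"))         # ['', 'a', 'ab', 'abc']
print(proper_prefixes("abc"))  # ['', 'a', 'ab']
```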

Page 4

Prefix Property

A context-free language (CFL) L is said to have the prefix property if, whenever w is in L, no proper prefix of w is in L.

This is not considered a severe restriction. Why? Because we can easily convert a DCFL to a DCFL with the prefix property by introducing an endmarker.
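
As a rough illustration of the endmarker idea, the sketch below checks the prefix property on a small finite sample of words. It is only a sketch: real DCFLs are usually infinite, and the helper names `has_prefix_property` and `add_endmarker`, as well as the choice of `$` as the endmarker, are assumptions made for this example.

```python
def has_prefix_property(language: set[str]) -> bool:
    """True if no word of the (finite) sample is a proper prefix of another word in it."""
    return not any(
        u != w and w.startswith(u) for u in language for w in language
    )


def add_endmarker(language: set[str], marker: str = "$") -> set[str]:
    """Append an endmarker that must not occur inside any word."""
    assert all(marker not in w for w in language)
    return {w + marker for w in language}


sample = {"ab", "abab"}  # "ab" is a proper prefix of "abab"
print(has_prefix_property(sample))                 # False
print(has_prefix_property(add_endmarker(sample)))  # True
```

Once every word ends in a symbol that appears nowhere else, no word can be a proper prefix of another, which is why the restriction is mild.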

Page 5

Suffix and Proper Suffix

Suffix (of a string): any number of trailing symbols of that string.

Proper suffix: a suffix of a string, but not the string itself.

Page 6

Example Grammar

This is the grammar that will be used in many of the examples:

S' → Sc
S → SA | A
A → aSb | ab

Page 7

LR-Grammar

Left-to-right scan of the input producing a rightmost derivation.

Simply:
L stands for Left-to-right
R stands for Rightmost derivation

Page 8

LR-Items

An item (for a given CFG): a production with a dot anywhere in the right side (including the beginning and end).

In the event of an ε-production B → ε: B → · is an item.

Page 9

Example: Items

Given our example grammar:
S' → Sc, S → SA | A, A → aSb | ab

The items for the grammar are:
S' → ·Sc, S' → S·c, S' → Sc·
S → ·SA, S → S·A, S → SA·, S → ·A, S → A·
A → ·aSb, A → a·Sb, A → aS·b, A → aSb·, A → ·ab, A → a·b, A → ab·
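
The item list above can be generated mechanically. In this minimal sketch the encoding is an assumption: a production is a `(head, body)` pair with the body as a tuple of symbols, and an item is `(head, body, dot)` where `dot` is the position of the dot. An ε-production would have an empty body and yield the single item B → ·.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def items(grammar):
    """Yield every item (head, body, dot), with the dot at each position 0..len(body)."""
    for head, body in grammar:
        for dot in range(len(body) + 1):
            yield head, body, dot


def show(item) -> str:
    """Render an item in the slide notation, e.g. A → aS·b."""
    head, body, dot = item
    return f"{head} → {''.join(body[:dot])}·{''.join(body[dot:])}"


all_items = list(items(GRAMMAR))
for it in all_items:
    print(show(it))
print(len(all_items))  # 15
```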

Page 10

Some Notation

⇒* = zero or more steps in a derivation
⇒*rm = zero or more steps in a rightmost derivation
⇒rm = single step in a rightmost derivation

Page 11

Right-Sentential Form

A sentential form that can be derived by a rightmost derivation.

A string α of terminals and variables is called a sentential form if S ⇒* α.

Page 12

More terms

Handle: a substring which matches the right-hand side of a production and represents 1 step in the derivation.

Or more formally: a handle of a right-sentential form γ for CFG G is a substring β such that:
S ⇒*rm δAw ⇒rm δβw
δβw = γ

If the grammar is unambiguous and there are no useless symbols, the rightmost derivation of a given right-sentential form, and therefore its handle, is unique.

Page 13

Example

Given our example grammar:
S' → Sc, S → SA | A, A → aSb | ab

An example rightmost derivation:
S' ⇒ Sc ⇒ SAc ⇒ SaSbc

Therefore we can say that SaSbc is a right-sentential form, and its handle is aSb.

Page 14

More terms

Viable prefix (of a right-sentential form γ): any prefix of γ ending no farther right than the right end of a handle of γ.

Complete item: an item where the dot is the rightmost symbol.

Page 15

Example

Given our example grammar:
S' → Sc, S → SA | A, A → aSb | ab

The right-sentential form abc:
S' ⇒*rm Ac ⇒ abc

An item A → α·β is valid for a viable prefix γ if there is a rightmost derivation S ⇒*rm δAw ⇒rm δαβw with δα = γ.

Valid items:
A → ab· for prefix ab
A → a·b for prefix a
A → ·ab for prefix ε

A → ab· is a complete item, and Ac is the right-sentential form from which abc is derived.
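
The viable prefixes of a single right-sentential form follow directly from the position of its handle. The sketch below assumes the handle position is already known and passed in as `handle_end` (the index just past the handle's last symbol); the function name is hypothetical.

```python
def viable_prefixes(form: str, handle_end: int) -> list[str]:
    """Prefixes of `form` that end no farther right than index `handle_end` (exclusive)."""
    return [form[:i] for i in range(handle_end + 1)]


# Right-sentential form abc: the handle ab ends at index 2.
print(viable_prefixes("abc", 2))    # ['', 'a', 'ab']
# Right-sentential form SaSbc: the handle aSb ends at index 4.
print(viable_prefixes("SaSbc", 4))  # ['', 'S', 'Sa', 'SaS', 'SaSb']
```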

Page 16

LR(0)

Left-to-right scan of the input producing a rightmost derivation with a look-ahead (on the input) of 0 symbols.

It is a restricted type of CFG.
It is the first in the family of LR-grammars.
LR(0) grammars define exactly the DCFLs having the prefix property.

Page 17

Computing Sets of Valid Items

The definition of LR(0) and the method of accepting L(G) for an LR(0) grammar G by a DPDA depend on knowing the set of valid items for each viable prefix γ.

For every CFG G, the set of viable prefixes is a regular set.

This regular set is accepted by an NFA whose states are the items for G.

Page 18

Continued

Given an NFA (whose states are the items for G) that accepts the regular set, we can apply the subset construction to this NFA and yield a DFA.

The state of this DFA after reading a viable prefix γ is the set of valid items for γ.

Page 19

NFA M

NFA M recognizes the viable prefixes for CFG G = (V, T, P, S).

M = (Q, V ∪ T, δ, q0, Q)
Q = set of items for G plus state q0

Three rules:
1. δ(q0, ε) = {S → ·α | S → α is a production}
2. δ(A → α·Bβ, ε) = {B → ·γ | B → γ is a production}
   Allows expansion of a variable B appearing immediately to the right of the dot.
3. δ(A → α·Xβ, X) = {A → αX·β}
   Permits moving the dot over any grammar symbol X if X is the next input symbol.
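
The three transition rules can be simulated directly. The sketch below runs the NFA on a string of grammar symbols by alternating ε-closure and dot moves, which is the subset construction done on the fly. The tuple encoding of items and the names `eps_moves`, `eps_closure`, `move` and `valid_items` are assumptions for illustration, and every grammar symbol in the prefix is taken to be a single character.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
START = "S'"
Q0 = "q0"  # the extra start state of the NFA


def eps_moves(state):
    """Rules 1 and 2: the ε-transitions out of one NFA state."""
    if state == Q0:
        return {(h, b, 0) for h, b in GRAMMAR if h == START}
    _, body, dot = state
    if dot < len(body):
        return {(h, b, 0) for h, b in GRAMMAR if h == body[dot]}
    return set()


def eps_closure(states):
    """All states reachable from `states` using ε-transitions only."""
    closure, todo = set(states), list(states)
    while todo:
        for nxt in eps_moves(todo.pop()):
            if nxt not in closure:
                closure.add(nxt)
                todo.append(nxt)
    return closure


def move(states, symbol):
    """Rule 3: advance the dot over `symbol` wherever that is possible."""
    moved = set()
    for state in states:
        if state == Q0:
            continue
        head, body, dot = state
        if dot < len(body) and body[dot] == symbol:
            moved.add((head, body, dot + 1))
    return moved


def valid_items(prefix):
    """δ(q0, prefix) without q0 itself; empty if `prefix` is not a viable prefix."""
    current = eps_closure({Q0})
    for symbol in prefix:
        current = eps_closure(move(current, symbol))
    return current - {Q0}


print(valid_items("ab"))  # {('A', ('a', 'b'), 2)}
```

For the prefix ab the only surviving item is A → ab·, which agrees with the earlier example.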

Page 20

Theorem 10.9

The NFA M has the property that δ(q0, γ) contains A → α·β iff A → α·β is valid for γ.

This theorem gives a method for computing the sets of valid items for any viable prefix.

Note: M is an NFA. It can be converted to a DFA. Then, by inspecting each state, it can be determined whether the grammar is a valid LR(0) grammar.

Page 21

Definition of LR(0) Grammar

G is an LR(0) grammar if:
1. The start symbol does not appear on the right side of any production.
2. For all viable prefixes γ of G, if A → α· is a complete item valid for γ, then it is unique, i.e., there are no other complete items (and there are no items with a terminal to the right of the dot) that are valid for γ.
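
The two conditions can be checked by building every reachable set of valid items and inspecting each one. The following is a sketch under assumptions: it uses the common closure/goto formulation of the subset construction rather than an explicit NFA, and the function names and the second grammar `NOT_LR0` are made up for this example.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def closure(grammar, items):
    """Add B → ·γ for every variable B that appears right after a dot."""
    result, todo = set(items), list(items)
    while todo:
        _, body, dot = todo.pop()
        if dot < len(body):
            for head, prod in grammar:
                new = (head, prod, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return frozenset(result)


def goto(grammar, state, symbol):
    """Move the dot over `symbol`, then close the result."""
    moved = {(h, b, d + 1) for h, b, d in state if d < len(b) and b[d] == symbol}
    return closure(grammar, moved)


def dfa_states(grammar, start):
    """All reachable sets of valid items."""
    first = closure(grammar, {(h, b, 0) for h, b in grammar if h == start})
    seen, todo = {first}, [first]
    while todo:
        state = todo.pop()
        for symbol in {b[d] for _, b, d in state if d < len(b)}:
            nxt = goto(grammar, state, symbol)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def is_lr0(grammar, start):
    variables = {h for h, _ in grammar}
    if any(start in body for _, body in grammar):
        return False  # condition 1 fails
    for state in dfa_states(grammar, start):
        complete = [it for it in state if it[2] == len(it[1])]
        shifts = [it for it in state
                  if it[2] < len(it[1]) and it[1][it[2]] not in variables]
        if complete and (len(complete) > 1 or shifts):
            return False  # condition 2 fails
    return True


# Hypothetical grammar for {a, ab}, a language without the prefix property.
NOT_LR0 = [("S'", ("S",)), ("S", ("a",)), ("S", ("a", "b"))]

print(len(dfa_states(GRAMMAR, "S'")))  # 9
print(is_lr0(GRAMMAR, "S'"))           # True
print(is_lr0(NOT_LR0, "S'"))           # False
```

In the second grammar the state reached on a contains both S → a· and S → a·b, so a complete item shares a state with an item that has a terminal after the dot.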

Page 22

Facts we now know:

Every LR(0) grammar generates a DCFL.
Every DCFL with the prefix property has an LR(0) grammar.
Every language with an LR(0) grammar has the prefix property.
L is a DCFL iff L$ has an LR(0) grammar, where $ is an endmarker that does not occur in L.

Page 23

DPDAs from LR(0) Grammars

We trace out the rightmost derivation in reverse.

The stack holds a viable prefix (of a right-sentential form) and the current state (of the DFA).
Viable prefix: X1X2…Xk
States: s1, s2, …, sk
Stack: s0X1s1…Xksk

Page 24

Reduction

If sk contains A → α·, then A → α· is valid for X1X2…Xk.
α is a suffix of X1X2…Xk.
Let α = Xi+1…Xk.
There exists w such that X1…Xkw is a right-sentential form.

Page 25

Reduction Continued

There is a derivation:
S ⇒*rm X1…XiAw ⇒rm X1…Xkw

To obtain the right-sentential form preceding X1…Xkw in a rightmost derivation, we reduce α to A.
Therefore, we pop Xi+1…Xk (together with the states si+1, …, sk) from the stack and push A onto the stack.

Page 26

Shift

If sk contains only incomplete items, then the right-sentential form preceding X1…Xkw cannot be formed using a reduction.
Instead we simply "shift" the next input symbol onto the stack.
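
The reduce and shift moves can be put together into a small parser for the example grammar. This is a sketch, not the DPDA construction from the proof: the DFA states are computed on demand with a closure/goto formulation, the stack is a Python list laid out as s0 X1 s1 … Xk sk, and names such as `parse` are assumptions. It relies on the grammar being LR(0), so a state with a complete item holds exactly that one item.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
START = "S'"


def closure(items):
    """Add B → ·γ for every variable B that appears right after a dot."""
    result, todo = set(items), list(items)
    while todo:
        _, body, dot = todo.pop()
        if dot < len(body):
            for head, prod in GRAMMAR:
                new = (head, prod, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return frozenset(result)


def goto(state, symbol):
    """DFA transition: move the dot over `symbol`, then close."""
    return closure({(h, b, d + 1) for h, b, d in state
                    if d < len(b) and b[d] == symbol})


def parse(word):
    """Return the productions used, in reduction order, or None if `word` is rejected."""
    stack = [closure({(h, b, 0) for h, b in GRAMMAR if h == START})]
    pos, reductions = 0, []
    while True:
        state = stack[-1]
        complete = [(h, b) for h, b, d in state if d == len(b)]
        if complete:  # reduce by the unique complete item
            head, body = complete[0]
            del stack[len(stack) - 2 * len(body):]  # pop symbols and their states
            reductions.append((head, body))
            if head == START:
                return reductions if pos == len(word) else None
            stack += [head, goto(stack[-1], head)]
        else:  # shift the next input symbol
            if pos == len(word):
                return None
            nxt = goto(state, word[pos])
            if not nxt:
                return None  # no transition on this symbol
            stack += [word[pos], nxt]
            pos += 1


for head, body in parse("aabbc"):
    print(head, "→", "".join(body))
```

Reading the printed productions from bottom to top gives the rightmost derivation of aabbc.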

Page 27

Theorem 10.10

If L is L(G) for an LR(0) grammar G, then L is N(M) for a DPDA M.

N(M) = the language accepted by M by empty stack (null stack).

Page 28

Proof

Construct from G the DFA D whose transition function recognizes G's viable prefixes.

The stack symbols of M are:
the grammar symbols of G
the states of D

M has start state q and other states used to perform reductions.

Page 29

We know that:

If G is LR(0), then reductions are the only way to get the previous right-sentential form when the state of the DFA (on the top of the stack) contains a complete item.

When M starts on input w, it will construct a rightmost derivation for w in reverse order.

Page 30

What we need to prove:

When a shift is called for and the top DFA state on the stack has only incomplete items, then there are no handles among the grammar symbols on the stack.

(Note: if there were a handle, then some DFA state on the stack would have a complete item.)

Page 31

Suppose a state contains the complete item A → α·.

Each state is on top of the stack when it is first pushed.

The handle α would then immediately be reduced to A.

Therefore, a complete item cannot possibly become buried on the stack.

Page 32

Proof continued

Acceptance occurs when the top of the stack contains the start symbol.

The start symbol, by definition of LR(0) grammars, cannot appear on the right side of a production.

L(G) always has the prefix property if G is LR(0).

Page 33

Conclusion of Proof

Thus, if w is in L(G), M finds the rightmost derivation of w, reduces w to S, and accepts.

If M accepts w, then the sequence of right-sentential forms provides a derivation of w from S.

N(M) = L(G)

Page 34

Corollary of Theorem 10.10

Every LR(0) grammar is unambiguous.

Why? The rightmost derivation of w is unique (given the construction we provided).

Page 35

LR(1) Grammars

An LR grammar with 1 symbol of look-ahead.
All and only deterministic CFLs have LR(1) grammars.
They are greatly important to compiler design. Why?
Because they are broad enough to include the syntax of almost all programming languages,
and restrictive enough to have efficient parsers (that are essentially DPDAs).

Page 36

LR(1) Item

Consists of an LR(0) item followed by a look-ahead set consisting of terminals and/or the special symbol $.
$ = the right end of the string

General form: A → α·β, {a1, a2, …, an}

The LR(1) items form the states of an NFA that recognizes viable prefixes; the sets of valid items are obtained by converting this NFA to a DFA.

Page 37

A grammar is LR(1) if:
1. The start symbol does not appear on the right side of any production.
2. Whenever the set of items I valid for some viable prefix includes some complete item A → α·, {a1, …, an}, then:
   No ai appears immediately to the right of the dot in any item of I.
   If B → β·, {b1, …, bk} is another complete item in I, then ai ≠ bj for any 1 ≤ i ≤ n and 1 ≤ j ≤ k.
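
The second condition can be checked one item set at a time. The sketch below assumes an LR(1) item is encoded as `(head, body, dot, lookaheads)` with `lookaheads` a frozenset; it does not build the item sets themselves, and the function name `lr1_conflicts` and the sample sets are hypothetical.

```python
def lr1_conflicts(item_set):
    """Return descriptions of every violation of condition 2 in one set of LR(1) items."""
    conflicts = []
    complete = [it for it in item_set if it[2] == len(it[1])]
    after_dot = {it[1][it[2]] for it in item_set if it[2] < len(it[1])}
    for i, (head, body, _, look) in enumerate(complete):
        clash = look & after_dot  # some ai appears right after a dot
        if clash:
            conflicts.append(
                f"shift/reduce on {sorted(clash)} for {head} → {''.join(body)}·")
        for other in complete[i + 1:]:
            both = look & other[3]  # some ai equals some bj
            if both:
                conflicts.append(f"reduce/reduce on {sorted(both)}")
    return conflicts


# S → a·, {$} together with S → a·b, {$}: not allowed in LR(0), fine in LR(1).
ok = [("S", ("a",), 1, frozenset({"$"})), ("S", ("a", "b"), 1, frozenset({"$"}))]
# A → a·, {b} together with B → a·b, {$}: b is both a look-ahead and a shift symbol.
bad = [("A", ("a",), 1, frozenset({"b"})), ("B", ("a", "b"), 1, frozenset({"$"}))]

print(lr1_conflicts(ok))   # []
print(lr1_conflicts(bad))  # ["shift/reduce on ['b'] for A → a·"]
```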

Page 38

Accepting an LR(1) language:

Similar to the DPDA used with LR(0) grammars.

However, it is allowed to use the next input symbol during its decision making.

This is accomplished by appending a $ to the end of the input; the DPDA keeps the next input symbol as part of its state.

Page 39

LR(1) Rules for Reduce/Shift

1. If the top set of items has a complete item A → α·, {a1, a2, …, an}, where A ≠ S, reduce by A → α if the current input symbol is in {a1, a2, …, an}.
2. If the top set of items has an item S → α·, {$}, then reduce by S → α and accept if the current symbol is $ (i.e., the end of the input is reached).
3. If the top set of items has an item A → α·aβ, T, and a is the current input symbol, then shift.
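
The three rules amount to a decision function over the top set of items and the current input symbol. In this sketch an LR(1) item is assumed to be `(head, body, dot, lookaheads)`, the start symbol defaults to `"S"`, and the returned tuples (`"reduce"`, `"accept"`, `"shift"`, `"error"`) are an invented convention, not part of the slides.

```python
def action(item_set, symbol, start="S"):
    """Choose the parser move for one set of LR(1) items and the current input symbol."""
    # Rules 1 and 2: a complete item whose look-ahead set contains the symbol.
    for head, body, dot, look in item_set:
        if dot == len(body) and symbol in look:
            kind = "accept" if head == start else "reduce"
            return (kind, head, body)
    # Rule 3: an item with the current symbol right after the dot.
    for head, body, dot, look in item_set:
        if dot < len(body) and body[dot] == symbol:
            return ("shift", symbol)
    return ("error",)


state = [("A", ("a",), 1, frozenset({"c"})), ("B", ("a", "b"), 1, frozenset({"$"}))]
final = [("S", ("A",), 1, frozenset({"$"}))]

print(action(state, "c"))  # ('reduce', 'A', ('a',))
print(action(state, "b"))  # ('shift', 'b')
print(action(state, "$"))  # ('error',)
print(action(final, "$"))  # ('accept', 'S', ('A',))
```

Evaluating this function for every item set and every terminal (plus $) would fill in exactly one table cell per pair, provided the grammar is LR(1).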

Page 40

Regarding the Rules

The LR(1) conditions guarantee that at most one of the rules will be applied for any input symbol or $.

Often, for practicality, the information is summarized in a table:
Rows: sets of items
Columns: terminals and $