Context-Free Parsing

Context-Free Parsing. 2/37 Basic issues Top-down vs. bottom-up Handling ambiguity –Lexical ambiguity –Structural ambiguity Breadth first vs. depth first.

Dec 19, 2015


Context-Free Parsing


2/37

Basic issues

• Top-down vs. bottom-up

• Handling ambiguity
– Lexical ambiguity
– Structural ambiguity

• Breadth first vs. depth first

• Handling recursive rules

• Handling empty rules


3/37

Some terminology

• Rules written A → B c
o Terminal vs. non-terminal symbols
o Left-hand side (head): always non-terminal
o Right-hand side (body): can be mix of terminal and non-terminal, any number of them
o Unique start symbol (usually S)
o ‘→’ “rewrites as”, but is not directional (an “=” sign would be better)
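
As an illustration of this terminology, here is a minimal Python sketch of one way to hold rules as data. The names RULES, START, nonterminals and terminals are assumptions made for this example, and the rules are the small grammar used on the slides that follow.

```python
# Each rule is a (head, body) pair; the body is a tuple of symbols.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]
START = "S"  # unique start symbol


def nonterminals(rules):
    """Symbols that appear as the head of some rule."""
    return {head for head, _ in rules}


def terminals(rules):
    """Symbols that appear only in rule bodies.

    Here these are the lexical categories det, n and v, because no
    rule in RULES rewrites them (the lexicon handles them instead).
    """
    heads = nonterminals(rules)
    return {sym for _, body in rules for sym in body if sym not in heads}
```

With this layout the head is always a single non-terminal, and a body can mix both kinds of symbol in any number.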


4/37

1. Top-down with simple grammar

Grammar
S → NP VP
NP → det n
VP → v
VP → v NP

Lexicon
det → {an, the}
n → {elephant, man}
v → shot

Input: the man shot an elephant

S → NP VP
NP → det n
det → the
n → man
VP → v
v → shot

No more rules, but input is not completely accounted for… So we must backtrack, and try the other VP rule


5/37

1. Top-down with simple grammar

Input: the man shot an elephant

S → NP VP
NP → det n
det → the
n → man
VP → v NP
v → shot
NP → det n
det → an
n → elephant

No more rules, and input is completely accounted for
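
The top-down walk-through above can be sketched as a small depth-first recognizer with backtracking. This is only an illustrative sketch: the names GRAMMAR, LEXICON, expand and recognize are assumptions for the example, and it only answers yes or no rather than building a tree.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["det", "n"]],
    "VP": [["v"], ["v", "NP"]],   # rules are tried in this order
}
LEXICON = {
    "det": {"an", "the"},
    "n": {"elephant", "man"},
    "v": {"shot"},
}


def expand(symbols, words):
    """Yield True once for each way `symbols` can derive exactly `words`."""
    if not symbols:
        # No more rules: succeed only if the input is fully accounted for.
        if not words:
            yield True
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:
        # Lexical category: it must match the next word.
        if words and words[0] in LEXICON[first]:
            yield from expand(rest, words[1:])
    else:
        # Non-terminal: try each rule in turn; a failed rule backtracks here.
        for body in GRAMMAR[first]:
            yield from expand(body + rest, words)


def recognize(sentence):
    return any(expand(["S"], sentence.split()))
```

For "the man shot an elephant" the sketch first tries VP → v, finds input left over, and only then tries VP → v NP, which mirrors the two slides above.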


6/37

Breadth-first vs depth-first (1)

• When we came to the VP rule we were faced with a choice of two rules

• “Depth-first” means following the first choice through to the end

• “Breadth-first” means keeping all your options open

• We’ll see this distinction more clearly later, and also see that it is quite significant


7/37

2. Bottom-up with simple grammar

Grammar
S → NP VP
NP → det n
VP → v
VP → v NP

Lexicon
det → {an, the}
n → {elephant, man}
v → shot

Input: the man shot an elephant
Lexical categories: det n v det n

NP → det n (the man)
VP → v (shot)
S → NP VP (the man shot)

We’ve reached the top, but input is not completely accounted for… So we must backtrack, and try the other VP rule

NP → det n (an elephant)
VP → v NP (shot an elephant)
S → NP VP (the man shot an elephant)

We’ve reached the top, and input is completely accounted for
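
A bottom-up search of this kind can be sketched as a backtracking shift-reduce recognizer. This is one possible realisation and not necessarily the procedure behind the slides; the names RULES, LEXICON, reduce_to_s and recognize are assumptions for the example.

```python
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]
LEXICON = {"an": "det", "the": "det", "elephant": "n", "man": "n", "shot": "v"}


def reduce_to_s(stack, cats):
    """Depth-first search: can `stack` plus the remaining `cats` become a lone S?"""
    if stack == ("S",) and not cats:
        return True  # reached the top with all input accounted for
    # Try every rule whose body matches the top of the stack.
    for head, body in RULES:
        if stack[-len(body):] == body:
            if reduce_to_s(stack[:-len(body)] + (head,), cats):
                return True
    # No reduction led anywhere: shift the next category, if there is one.
    if cats:
        return reduce_to_s(stack + (cats[0],), cats[1:])
    return False


def recognize(sentence):
    cats = tuple(LEXICON[word] for word in sentence.split())
    return reduce_to_s((), cats)
```

On "the man shot an elephant" the search first reduces shot to a VP and reaches S with "an elephant" still unread, then backtracks and succeeds with VP → v NP.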


8/37

Same again but with lexical ambiguity

Grammar
S → NP VP
NP → det n
VP → v
VP → v NP

Lexicon
det → {an, the}
n → {elephant, man, shot}
v → shot

shot can be v or n


9/37

3. Top-down with lexical ambiguity

Input: the man shot an elephant

S → NP VP
NP → det n
det → the
n → man
VP → v NP
v → shot
NP → det n
det → an
n → elephant

Same as before: at this point, we are looking for a v, and shot fits the bill; the n reading never comes into play


10/37

4. Bottom-up with lexical ambiguity

Input: the man shot an elephant
Lexical categories: det n v det n, plus a second arc n for shot

NP → det n (the man)
VP → v (shot)
S → NP VP (the man shot)
NP → det n (an elephant)
VP → v NP (shot an elephant)
S → NP VP (the man shot an elephant)

Terminology: graph, nodes, arcs (edges)


11/37

4. Bottom-up with lexical ambiguity

[Chart over "the man shot an elephant": lexical arcs det, n, v, det, n and the extra n arc for shot; two NP arcs, two VP arcs and two S arcs]

Let’s get rid of all the unused arcs


12/37

4. Bottom-up with lexical ambiguity

[Chart over "the man shot an elephant" with the unused arcs removed: lexical arcs det, n, v, det, n; two NP arcs, one VP arc and one S arc]

Let’s get rid of all the unused arcs


13/37

4. Bottom-up with lexical ambiguity

[Chart over "the man shot an elephant": lexical arcs det, n, v, det, n; two NP arcs, one VP arc and one S arc]

And let’s clear away all the arcs…



15/37

Breadth-first vs depth-first (2)

• In chart parsing, the distinction is more clear cut:

• At any point there may be a choice of things to do: which arcs to develop

• Breadth-first vs. depth-first can be seen as what order they are done in

• Queue (FIFO = breadth-first) vs. stack (LIFO = depth-first)
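
The queue-versus-stack point can be sketched with a single agenda that is read from one end or the other. The tree of choices below (CHOICES) and the function name visit_order are hypothetical; in a chart parser the agenda items would be arcs to develop rather than letters.

```python
from collections import deque

# Hypothetical tree of choices: each item maps to the options it opens up.
CHOICES = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f"]}


def visit_order(start, breadth_first):
    """Return the order in which items come off the agenda."""
    agenda = deque([start])
    order = []
    while agenda:
        # Queue (FIFO): take the oldest item. Stack (LIFO): take the newest.
        item = agenda.popleft() if breadth_first else agenda.pop()
        order.append(item)
        agenda.extend(CHOICES.get(item, []))
    return order
```

Only the line that removes an item from the agenda differs between the two strategies; everything else is shared.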


16/37

Same again but with structural ambiguity

Grammar
S → NP VP
NP → det n
NP → det n PP
VP → v
VP → v NP
VP → v NP PP
PP → prep NP

Lexicon
det → {an, the, his}
n → {elephant, man, shot, pyjamas}
v → shot
prep → in

the man shot an elephant: [S [NP [det the] [n man]] [VP [v shot] [NP [det an] [n elephant]]]]
in his pyjamas: [PP [prep in] [NP [det his] [n pyjamas]]]

We introduce a PP rule in two places



18/37

5. Top-down with structural ambiguity

Input: the man shot an elephant in his pyjamas

S → NP VP
NP → det n (the first of two NP rules: NP → det n, NP → det n PP)
det → the
n → man

At this point, depending on our strategy (breadth-first vs. depth-first) we may consider the NP complete and look for the VP, or we may try the second NP rule. Let’s see what happens in the latter case.

NP → det n PP
PP → prep NP
prep → in

The next word, shot, isn’t a prep, so this rule simply fails


19/37

5. Top-down with structural ambiguity

Input: the man shot an elephant in his pyjamas

S → NP VP
NP → det n (the man)
VP → v
v → shot

As before, the first VP rule works, but does not account for all the input.

Similarly, if we try the second VP rule, and the first NP rule …

VP → v NP
v → shot
NP → det n (an elephant)


20/37

5. Top-down with structural ambiguity

Input: the man shot an elephant in his pyjamas

So far: S → NP VP, NP → det n (the man), VP → v NP, v → shot, NP → det n (an elephant)

So what do we try next? This (the second NP rule, NP → det n PP)? Or this (the third VP rule, VP → v NP PP)?

Depth-first: it’s a stack, LIFO

Breadth-first: it’s a queue, FIFO


21/37

5. Top-down with structural ambiguity (depth-first)

Input: the man shot an elephant in his pyjamas

S → NP VP
NP → det n (the man)
VP → v NP
v → shot
NP → det n PP (an elephant in his pyjamas)
PP → prep NP
prep → in
NP → det n (his pyjamas)


22/37

5. Top-down with structural ambiguity (breadth-first)

Input: the man shot an elephant in his pyjamas

S → NP VP
NP → det n (the man)
VP → v NP PP
v → shot
NP → det n (an elephant)
PP → prep NP
prep → in
NP → det n (his pyjamas)


23/37

Recognizing ambiguity

• Notice how the choice of strategy determines which result we get (first).

• In both strategies, there are often rules left untried, on the list (whether queue or stack).

• If we want to know if our input is ambiguous, at some time we do have to follow these through.

• As you will see later, trying out alternative paths can be quite computationally intensive
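
To check whether an input is ambiguous we have to follow every alternative through, not stop at the first result. The sketch below does that top-down for the structural-ambiguity grammar and returns all complete trees; the names GRAMMAR, LEXICON, parse, parse_body and all_parses are assumptions for the example.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["det", "n"], ["det", "n", "PP"]],
    "VP": [["v"], ["v", "NP"], ["v", "NP", "PP"]],
    "PP": [["prep", "NP"]],
}
LEXICON = {
    "det": {"an", "the", "his"},
    "n": {"elephant", "man", "shot", "pyjamas"},
    "v": {"shot"},
    "prep": {"in"},
}


def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol in LEXICON:
        if start < len(words) and words[start] in LEXICON[symbol]:
            yield (symbol, words[start]), start + 1
        return
    for body in GRAMMAR[symbol]:
        for children, end in parse_body(body, words, start):
            yield (symbol, *children), end


def parse_body(body, words, start):
    """Yield (children, end) for every way the symbols in `body` match in sequence."""
    if not body:
        yield (), start
        return
    for tree, mid in parse(body[0], words, start):
        for rest, end in parse_body(body[1:], words, mid):
            yield (tree, *rest), end


def all_parses(sentence):
    """Return every tree for S that accounts for the whole input."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]
```

An input is ambiguous under this grammar when all_parses returns more than one tree. For the pyjamas sentence it returns one tree with the PP inside the NP and one with the PP attached to the VP.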


24/37

6. Bottom-up with structural ambiguity

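Input: the man shot an elephant in his pyjamas
Lexical categories: det n v det n prep det n

NP → det n (the man)
VP → v (shot)
S → NP VP (the man shot)
NP → det n (an elephant)
VP → v NP (shot an elephant)
S → NP VP (the man shot an elephant)
NP → det n (his pyjamas)
PP → prep NP (in his pyjamas)
NP → det n PP (an elephant in his pyjamas)
VP → v NP (shot, an elephant in his pyjamas)
VP → v NP PP (shot, an elephant, in his pyjamas)
S → NP VP (the man shot an elephant in his pyjamas), once for each of the two VPs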


25/37

6. Bottom-up with structural ambiguity

[Chart over "the man shot an elephant in his pyjamas" (det n v det n prep det n) showing the NP, PP, VP and S arcs without their rule labels]


26/37

Recursive rules

• “Recursive” rules call themselves

• We already have some recursive rule pairs:
NP → det n PP
PP → prep NP

• Rules can be immediately recursive:
AdjG → adj AdjG
(the) big fat ugly (man)


27/37

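Recursive rules

Left recursive
AdjG → AdjG adj
AdjG → adj

Right recursive
AdjG → adj AdjG
AdjG → adj

[Two trees over "big fat rich old": in the left-recursive tree each AdjG has an AdjG as its left daughter; in the right-recursive tree each AdjG has an AdjG as its right daughter]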


28/37

7. Top-down with left recursion

Grammar
NP → det n
NP → det AdjG n
AdjG → AdjG adj
AdjG → adj

Input: the big fat rich old man

NP → det n (det → the, then fails at big)
NP → det AdjG n
det → the
AdjG → AdjG adj
AdjG → AdjG adj
AdjG → AdjG adj
(and so on: each expansion proposes another AdjG without consuming any input)

You can’t have left-recursive rules with a top-down parser, even if the non-recursive rule is first
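
One practical consequence is that a top-down parser can check its grammar for left recursion before parsing. The sketch below is a simple reachability check over leftmost body symbols; it assumes a grammar without empty rules, and the function name left_recursive and the rule lists are made up for the example.

```python
def left_recursive(rules):
    """Return the non-terminals that can reach themselves via leftmost symbols.

    Assumes no empty rules, so only the first symbol of each body matters.
    """
    heads = {head for head, _ in rules}
    # reach[A] = symbols that can appear at the left edge of something derived from A
    reach = {head: set() for head in heads}
    for head, body in rules:
        if body:
            reach[head].add(body[0])
    changed = True
    while changed:
        changed = False
        for head in heads:
            for sym in list(reach[head]):
                new = reach.get(sym, set()) - reach[head]
                if new:
                    reach[head] |= new
                    changed = True
    return {head for head in heads if head in reach[head]}


LEFT = [
    ("NP", ("det", "n")),
    ("NP", ("det", "AdjG", "n")),
    ("AdjG", ("AdjG", "adj")),
    ("AdjG", ("adj",)),
]
RIGHT = [
    ("NP", ("det", "n")),
    ("NP", ("det", "AdjG", "n")),
    ("AdjG", ("adj", "AdjG")),
    ("AdjG", ("adj",)),
]
```

The check flags AdjG in the left-recursive grammar and nothing in the right-recursive one. It also catches indirect cases such as A → B c with B → A d.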


29/37

7. Top-down with right recursion

Grammar
NP → det n
NP → det AdjG n
AdjG → adj AdjG
AdjG → adj

Input: the big fat rich old man

NP → det AdjG n
det → the
AdjG → adj AdjG (big)
AdjG → adj AdjG (fat)
AdjG → adj AdjG (rich)
AdjG → adj AdjG (old, then fails: man is not an adj)
AdjG → adj (old)
n → man


30/37

8. Bottom-up with left and right recursion

Grammar
NP → det n
NP → det AdjG n
AdjG → AdvG adj AdjG
AdjG → adj
AdvG → AdvG adv
AdvG → adv

AdjG rule is right recursive, AdvG rule is left recursive

Input: the very very fat ugly man
Lexical categories: det adv adv adj adj n

AdjG → adj (fat), AdjG → adj (ugly)
AdvG → adv (very), AdvG → adv (very)
AdvG → AdvG adv (very very)
AdjG → AdvG adj AdjG (very fat ugly)
AdjG → AdvG adj AdjG (very very fat ugly)
NP → det AdjG n

Quite a few useless paths, but overall no difficulty


31/37

8. Bottom-up with left and right recursion

AdjG rule is right recursive, AdvG rule is left recursive

Input: the very very fat ugly man
Lexical categories: det adv adv adj adj n

[Chart showing the rules applied: AdjG → adj, AdvG → adv, AdvG → AdvG adv, AdjG → AdvG adj AdjG, NP → det AdjG n]


32/37

Empty rules

• For example
NP → det AdjG n
AdjG → adj AdjG
AdjG → ε

• Equivalent to
NP → det AdjG n
NP → det n
AdjG → adj
AdjG → adj AdjG

• Or
NP → det (AdjG) n
AdjG → adj (AdjG)
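
The claimed equivalence can be spot-checked by brute force: enumerate every short sequence of categories and ask whether each grammar accepts it. This is only a bounded check and not a proof; the names WITH_EMPTY, WITHOUT_EMPTY, derives, matches and language are assumptions for the example, and the recognizer relies on neither grammar being left recursive.

```python
from itertools import product

WITH_EMPTY = {
    "NP": [("det", "AdjG", "n")],
    "AdjG": [("adj", "AdjG"), ()],        # () stands for the empty body ε
}
WITHOUT_EMPTY = {
    "NP": [("det", "AdjG", "n"), ("det", "n")],
    "AdjG": [("adj",), ("adj", "AdjG")],
}


def derives(grammar, symbol, seq):
    """True if `symbol` can derive exactly the category sequence `seq`."""
    if symbol not in grammar:             # lexical category: matches itself only
        return seq == (symbol,)
    return any(matches(grammar, body, seq) for body in grammar[symbol])


def matches(grammar, body, seq):
    """True if the symbols in `body` can jointly derive `seq`."""
    if not body:
        return not seq
    return any(
        derives(grammar, body[0], seq[:i]) and matches(grammar, body[1:], seq[i:])
        for i in range(len(seq) + 1)
    )


def language(grammar, max_len):
    """All category sequences up to `max_len` that the grammar accepts as an NP."""
    cats = ("det", "adj", "n")
    return {
        seq
        for n in range(max_len + 1)
        for seq in product(cats, repeat=n)
        if derives(grammar, "NP", seq)
    }
```

Both grammars accept a det, any number of adj, then an n, so the two sets agree for every length tried.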


33/37

7. Top-down with empty rules

Grammar
NP → det AdjG n
AdjG → adj AdjG
AdjG → ε

Input: the man

NP → det AdjG n
det → the
AdjG → adj AdjG (fails: man is not an adj)
AdjG → ε
n → man

Input: the big fat man

NP → det AdjG n
det → the
AdjG → adj AdjG (big)
AdjG → adj AdjG (fat)
AdjG → ε
n → man


34/37

8. Bottom-up with empty rules

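Grammar
NP → det AdjG n
AdjG → adj AdjG
AdjG → ε

Input: the fat man
Lexical categories: det adj n

AdjG → ε (an empty AdjG can be proposed at each of the four positions before, between and after the words)
AdjG → adj AdjG (fat, followed by an empty AdjG)
NP → det AdjG n

Lots of useless paths, especially in a long sentence, but otherwise no difficulty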


35/37

Top down vs. bottom-up

• Bottom-up builds many useless trees

• Top-down can propose false trails, sometimes quite long, which are only abandoned when they reach the word level
– Especially a problem if breadth-first

• Bottom-up very inefficient with empty rules

• Top-down CANNOT handle left-recursion

• Top-down cannot do partial parsing
– Partial parsing is especially useful for speech

• Wouldn’t it be nice to combine them to get the advantages of both?


36/37

Left-corner parsing

• The “left corner” of a rule is the first symbol after the rewrite arrow
– e.g. in S → NP VP, the left corner is NP.

• Left-corner parsing starts bottom-up, taking the first item off the input and finding a rule for which it is the left corner.

• This provides a top-down prediction, but we continue working bottom-up until the prediction is fulfilled.

• When a rule is completed, apply the left-corner principle: is that completed constituent a left corner?
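
The procedure in these bullets can be sketched as a small recursive recognizer. This is a simplified illustration and not a full left-corner parser: it omits the usual check that a rule's head can actually lead to the current goal, and the names RULES, LEXICON, parse, complete, parse_all and recognize are assumptions for the example.

```python
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "n")),
    ("VP", ("v",)),
    ("VP", ("v", "NP")),
]
LEXICON = {"an": ["det"], "the": ["det"], "elephant": ["n"], "man": ["n"], "shot": ["v"]}


def parse(goal, words):
    """Yield the words left over after recognising one `goal` at the front of `words`."""
    if not words:
        return
    # Bottom-up step: take the first item off the input and look up its category.
    for cat in LEXICON.get(words[0], []):
        yield from complete(cat, goal, words[1:])


def complete(found, goal, words):
    """`found` has just been recognised; grow it upwards until it becomes `goal`."""
    if found == goal:
        yield words
    for head, body in RULES:
        if body[0] == found:                          # `found` is this rule's left corner
            for rest in parse_all(body[1:], words):   # top-down prediction of the rest
                yield from complete(head, goal, rest)


def parse_all(goals, words):
    """Recognise each predicted symbol in turn."""
    if not goals:
        yield words
        return
    for rest in parse(goals[0], words):
        yield from parse_all(goals[1:], rest)


def recognize(sentence):
    return any(rest == [] for rest in parse("S", sentence.split()))
```

On "the man shot an elephant" the sketch first completes S with VP → v and words left over, then succeeds with VP → v NP.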


37/37

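9. Left-corner with simple grammar

Grammar
S → NP VP
NP → det n
VP → v
VP → v NP

Input: the man shot an elephant

det → the
NP → det n (left corner det; n → man)
S → NP VP (left corner NP)
v → shot
VP → v (left corner v)

but text not all accounted for, so try VP → v NP

VP → v NP (left corner v)
det → an
NP → det n (left corner det; n → elephant)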