Post on 04-Jan-2016
October 2008 CSA3180: Sentence Parsing 1
CSA3180: NLP Algorithms
Sentence Parsing Algorithms 2
Problems with DFTD Parser
Problems with DFTD Parser
• Left Recursion
• Handling Ambiguity
• Inefficiency
Left Recursion
• A grammar is left recursive if it contains at least one non-terminal A for which A ⇒* α A β and α ⇒* ε
(n.b. ⇒* is the reflexive, transitive closure of ⇒)
• Intuitive idea: a derivation of that category includes the category itself along its leftmost branch.
NP → NP PP
NP → NP and NP
NP → DetP Nominal
DetP → NP 's
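The definition above can be checked mechanically. The following is a minimal sketch, assuming a grammar stored as a Python dict and containing no empty productions, so that only the first symbol of each rule can begin a derivation. The function name and the lexical rules (the, cat, with) are hypothetical and added only to make the example complete.

```python
def left_recursive_nonterminals(grammar):
    """Return the non-terminals A that can derive a string starting with A.

    Assumption: no rule has an empty right-hand side, so only the first
    symbol of each production can sit on the leftmost branch.
    """
    # Direct left-corner relation: A -> B ... where B is a non-terminal.
    left_corner = {
        a: {rhs[0] for rhs in prods if rhs and rhs[0] in grammar}
        for a, prods in grammar.items()
    }
    result = set()
    for start in grammar:
        # Follow left corners transitively from `start`.
        seen, stack = set(), list(left_corner[start])
        while stack:
            sym = stack.pop()
            if sym in seen:
                continue
            seen.add(sym)
            stack.extend(left_corner[sym])
        if start in seen:  # `start` reappears on its own leftmost branch
            result.add(start)
    return result


# The four rules from the slide, plus hypothetical lexical rules.
GRAMMAR = {
    "NP": [("NP", "PP"), ("NP", "and", "NP"), ("DetP", "Nominal")],
    "DetP": [("NP", "'s"), ("the",)],
    "Nominal": [("cat",)],
    "PP": [("P", "NP")],
    "P": [("with",)],
}

print(left_recursive_nonterminals(GRAMMAR))
```

NP is reported because of the immediately left-recursive rules, and DetP because of the indirect cycle NP → DetP Nominal, DetP → NP 's.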
Left Recursion
Left recursion can lead to an infinite loop
[nltk demo]
Dealing with Left Recursion
• Use a different parsing strategy
• Reformulate the grammar to eliminate left recursion (LR):
A → A β | α
is rewritten as
A → α A'
A' → β A' | ε
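The rewriting step can be sketched as a small function. This is an illustration only, assuming productions are tuples of symbols, the empty tuple stands for ε, and the grammar has no rule of the form A → A. The function name is hypothetical.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A beta | alpha as A -> alpha A', A' -> beta A' | epsilon.

    Only immediate left recursion is handled; indirect cycles are not.
    """
    new_nt = nonterminal + "'"
    # beta: what follows the leading A in each left-recursive rule.
    betas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # alpha: every alternative that does not start with A.
    alphas = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not betas:  # nothing to rewrite
        return {nonterminal: list(productions)}
    return {
        nonterminal: [alpha + (new_nt,) for alpha in alphas],
        new_nt: [beta + (new_nt,) for beta in betas] + [()],  # () is epsilon
    }


rules = [("NP", "and", "NP"), ("D", "N"), ("D", "N", "PP")]
for lhs, rhss in eliminate_immediate_left_recursion("NP", rules).items():
    print(lhs, "->", " | ".join(" ".join(r) or "ε" for r in rhss))
```

Here the α alternatives are expanded in place rather than kept as a separate symbol, which generates the same strings.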
Rewriting the Grammar
NP → NP ‘and’ NP
NP → D N | D N PP
Rewriting the Grammar
NP → NP ‘and’ NP      (the part after the leading NP is β)
NP → D N | D N PP     (the non-recursive alternatives are α)
New Grammar
NP → α NP1
NP1 → β NP1 | ε
α → D N | D N PP
β → ‘and’ NP
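To see the effect of the rewrite, here is a minimal sketch of a backtracking top-down recognizer run on both versions of the grammar. The lexical rules (the, cat, dog, with), the PP rules, the depth limit, and the function names are assumptions made for illustration; they are not part of the slides.

```python
def parse(grammar, symbol, tokens, pos, depth=0, max_depth=50):
    """Yield every position where `symbol` can end when started at `pos`."""
    if depth > max_depth:
        raise RecursionError("depth limit hit: probable left recursion")
    if symbol not in grammar:  # terminal: must match the next token
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    for rhs in grammar[symbol]:  # try each alternative in order
        ends = [pos]             # an empty rhs (epsilon) consumes nothing
        for sym in rhs:
            ends = [e2 for e in ends
                    for e2 in parse(grammar, sym, tokens, e, depth + 1, max_depth)]
        yield from ends


def recognizes(grammar, tokens, start="NP"):
    return len(tokens) in set(parse(grammar, start, tokens, 0))


LEXICON = {  # hypothetical lexical and PP rules
    "D": [("the",)], "N": [("cat",), ("dog",)],
    "PP": [("P", "NP")], "P": [("with",)],
}
NEW_GRAMMAR = {
    "NP": [("α", "NP1")],
    "NP1": [("β", "NP1"), ()],
    "α": [("D", "N"), ("D", "N", "PP")],
    "β": [("and", "NP")],
    **LEXICON,
}
OLD_GRAMMAR = {
    "NP": [("NP", "and", "NP"), ("D", "N"), ("D", "N", "PP")],
    **LEXICON,
}

print(recognizes(NEW_GRAMMAR, "the cat and the dog".split()))
try:
    recognizes(OLD_GRAMMAR, "the cat and the dog".split())
except RecursionError as err:
    print("old grammar:", err)
```

With the original grammar the recognizer keeps expanding NP without consuming any input until the depth limit stops it; with the new grammar every recursive call happens only after at least one token has been consumed.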
New Parse Tree
NP
├── α
│   ├── D: the
│   └── N: cat
└── NP1
    └── ε
Rewriting the Grammar
• Different parse tree
• Unnatural parse tree?
Problems with DFTD Parser
• Left Recursion
• Handling Ambiguity
• Inefficiency
Handling Ambiguity
• Coordination Ambiguity: different scope of conjunction:
    Hot curry and ice taste nice with rice
    Hot curry and rice taste nice with ice
• Attachment Ambiguity: a constituent can be added to the parse tree in different places:
    I shot an elephant in my trousers
• VP → VP PP
  NP → NP PP
Real sentences are full of ambiguities
President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio
Prepositional Phrase Ambiguity
No. of PPs    No. of parses
2             2
3             5
4             14
5             42
6             132
7             429
8             1430
he will deliver
- to the American people
- over nationwide TV
- in New York
- during September
- for very good reasons
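The growth in the number of parses can be reproduced by counting attachments directly. The sketch below assumes only the two rules NP → NP PP and PP → P NP, applied to a base NP followed by n PPs; attachment to a verb phrase is ignored, and the function name is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def np_parses(n_pps):
    """Count the parses of a base NP followed by `n_pps` prepositional phrases."""
    if n_pps == 0:
        return 1  # the bare NP has exactly one analysis
    # Top rule NP -> NP PP: the left NP absorbs the first k PPs, and the
    # NP inside the final top-level PP absorbs the remaining n_pps - 1 - k.
    return sum(np_parses(k) * np_parses(n_pps - 1 - k) for k in range(n_pps))


for n in range(2, 9):
    print(n, np_parses(n))
# 2 2, 3 5, 4 14, 5 42, 6 132, 7 429, 8 1430
```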
Growth of Number of Ambiguities
The nth Catalan number counts the ways of dissecting a polygon with n+2 sides into triangles by drawing nonintersecting diagonals.
No. of PPs    No. of parses
2             2
3             5
4             14
5             42
6             132
7             429
8             1430
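For reference, the Catalan numbers have the closed form C(n) = (2n choose n) / (n + 1). A short sketch using the standard library (the function name is just illustrative):

```python
from math import comb


def catalan(n):
    """Return the nth Catalan number via the closed form C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


print([catalan(n) for n in range(2, 9)])
# [2, 5, 14, 42, 132, 429, 1430]
```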
Handling Ambiguities
• Statistical disambiguation
  – which is the most probable interpretation?
• Semantic knowledge
  – which is the most sensible interpretation?
  – Subatomic particles such as positively charged protons and electrons
Problems with DFTD Parser
• Left Recursion
• Handling Ambiguity
• Inefficiency
Repeated Parsing of Subtrees
• Local versus global ambiguity.
  – NP → Det Noun
  – NP → NP PP
• Because of the top-down, depth-first, left-to-right policy, the parser builds trees that fail because they do not cover all of the input.
• Successive parses cover larger segments of the input, but these include structures that have already been built before.
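The repeated work can be made visible by counting how often a backtracking parser is asked for the same category at the same input position. The sketch below is illustrative only: it uses a right-recursive stand-in grammar (a top-down parser cannot run NP → NP PP directly), and the grammar, the `ends` function, and the `calls` counter are all assumptions, so the counts differ from those of the lecture's own example.

```python
from collections import Counter

# Hypothetical right-recursive grammar for the example phrase.
GRAMMAR = {
    "NP": [("Det", "Noun"), ("Det", "Noun", "PPs")],
    "PPs": [("PP",), ("PP", "PPs")],
    "PP": [("P", "PropN")],
    "Det": [("a",)],
    "Noun": [("flight",)],
    "P": [("from",), ("to",), ("on",)],
    "PropN": [("Indianapolis",), ("Houston",), ("TWA",)],
}
TOKENS = "a flight from Indianapolis to Houston on TWA".split()
calls = Counter()  # (symbol, position) -> number of times it was parsed


def ends(symbol, pos):
    """Return every end position for `symbol` starting at `pos`, with backtracking."""
    calls[(symbol, pos)] += 1
    if symbol not in GRAMMAR:  # terminal
        return [pos + 1] if pos < len(TOKENS) and TOKENS[pos] == symbol else []
    found = []
    for rhs in GRAMMAR[symbol]:  # each alternative restarts from `pos`
        positions = [pos]
        for sym in rhs:
            positions = [e for p in positions for e in ends(sym, p)]
        found.extend(positions)
    return found


result = ends("NP", 0)
repeated = {key: n for key, n in calls.items() if n > 1 and key[0] in GRAMMAR}
print(result)
print(repeated)
```

Every entry in `repeated` is a constituent that was rebuilt from scratch after an earlier alternative had already analysed the same span, for example the PP starting at position 2 ("from Indianapolis").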
Repeated Parsing of Subtrees
Constituent                                     Times parsed
a flight                                        4
from Indianapolis                               3
to Houston                                      2
on TWA                                          1
A flight from Indianapolis                      3
A flight from Indianapolis to Houston           2
A flight from Indianapolis to Houston on TWA    1
[Figure: partial parse trees built along the way, first an NP covering "a flight" (Det Noun), then an NP covering "a flight from Indianapolis" (NP PP)]