Natural Language Processing Lecture 17: Earley’s Algorithm and Dependencies

Dec 19, 2015

Transcript
Page 1

Natural Language Processing

Lecture 17: Earley’s Algorithm and Dependencies

Page 2

Survey Feedback

• Expanded office hours
  – Tuesday evenings
  – Friday afternoons

• More detail in the lectures
• Piazza
• Quiz & Midterm policy
  – You don’t get them back

• Grading policy

Page 3

Earley’s Algorithm

Page 4

Grammar for Examples

NP -> N
NP -> DT N
NP -> NP PP
NP -> PNP
PP -> P NP
S -> NP VP
S -> VP
VP -> V NP
VP -> VP PP

DT -> a
DT -> the
P -> through
P -> with
PNP -> Swabha
PNP -> Chicago
V -> book
V -> books
N -> book
N -> books
N -> flight
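For reference, here is the same grammar written out as Python data (my own sketch, not from the slides); the ROOT -> S rule is added because it is the start rule used by the Earley chart on the next slide.

# Sketch: the example grammar as a dict from left-hand side to a list of right-hand sides.
GRAMMAR = {
    "ROOT": [("S",)],                      # start rule, used by the Earley chart below
    "S":    [("NP", "VP"), ("VP",)],
    "NP":   [("N",), ("DT", "N"), ("NP", "PP"), ("PNP",)],
    "VP":   [("V", "NP"), ("VP", "PP")],
    "PP":   [("P", "NP")],
    "DT":   [("a",), ("the",)],
    "P":    [("through",), ("with",)],
    "PNP":  [("Swabha",), ("Chicago",)],
    "V":    [("book",), ("books",)],
    "N":    [("book",), ("books",), ("flight",)],
}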

Page 5

Earley’s Algorithm

• More “top-down” than CKY.
• Still dynamic programming.
• The Earley chart:

ROOT → • S [0,0]

goal: ROOT → S • [0,n]

book the flight through Chicago
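A minimal way to code up these chart items (a sketch; the names are mine, not the course's): each item records a rule, the position of the dot, and the span [i, j].

from collections import namedtuple

# An Earley item X → α • β [i, j]: rule X → rhs, dot position, and span.
Item = namedtuple("Item", ["lhs", "rhs", "dot", "i", "j"])

# Initial item ROOT → • S [0,0]
axiom = Item("ROOT", ("S",), 0, 0, 0)

def is_goal(item, n):
    # Goal item ROOT → S • [0,n] for a sentence of n words.
    return (item.lhs == "ROOT" and item.dot == len(item.rhs)
            and item.i == 0 and item.j == n)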

Page 6

Earley’s Algorithm: PREDICT

Given V → α • X β [i, j] and the rule X → γ, create X → • γ [j, j]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

book the flight through Chicago

ROOT → • S [0,0]
S → VP
S → • VP [0,0]
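A sketch of PREDICT over the Item and GRAMMAR representations assumed above (again my own rendering, not the course's code):

from collections import namedtuple
Item = namedtuple("Item", ["lhs", "rhs", "dot", "i", "j"])   # as sketched above

def predict(item, grammar):
    # Given V → α • X β [i, j] and every rule X → γ, create X → • γ [j, j].
    if item.dot == len(item.rhs):
        return                              # dot at the end: nothing left to predict
    X = item.rhs[item.dot]
    for gamma in grammar.get(X, []):        # words have no grammar entry, so nothing is yielded
        yield Item(X, gamma, 0, item.j, item.j)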

Page 7

Earley’s Algorithm: SCAN

Given V → α • T β [i, j] and the rule T → w_{j+1}, create T → w_{j+1} • [j, j+1]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

V → book • [0,1]

book the flight through Chicago

VP → • V NP [0,0]
V → book
V → book • [0,1]
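A corresponding sketch of SCAN (same assumed Item tuple and a grammar dict that includes the lexical rules); words is the input sentence, so words[j] is w_{j+1} in the slide's 1-based numbering:

from collections import namedtuple
Item = namedtuple("Item", ["lhs", "rhs", "dot", "i", "j"])   # as sketched above

def scan(item, words, grammar):
    # Given V → α • T β [i, j] and the lexical rule T → w_{j+1}, create T → w_{j+1} • [j, j+1].
    if item.dot == len(item.rhs) or item.j >= len(words):
        return
    T = item.rhs[item.dot]                  # the category right after the dot
    w = words[item.j]                       # the next input word, w_{j+1}
    if (w,) in grammar.get(T, []):          # is T → w a rule of the grammar?
        yield Item(T, (w,), 1, item.j, item.j + 1)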

Page 8

Earley’s Algorithm: COMPLETE

Given V → α • X β [i, j] and X → γ • [j, k], create V → α X • β [i, k]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

V → book • [0,1]
VP → V • NP [0,1]

book the flight through Chicago

VP → • V NP [0,0]
V → book • [0,1]
VP → V • NP [0,1]
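And a sketch of COMPLETE, which advances every item in the chart that was waiting for the category just finished (same assumed representations):

from collections import namedtuple
Item = namedtuple("Item", ["lhs", "rhs", "dot", "i", "j"])   # as sketched above

def complete(item, chart):
    # Given X → γ • [j, k] and any V → α • X β [i, j] in the chart, create V → α X • β [i, k].
    if item.dot < len(item.rhs):
        return                              # not a completed item
    for other in list(chart):               # chart: the set of items built so far
        if (other.j == item.i               # other ends where this item starts
                and other.dot < len(other.rhs)
                and other.rhs[other.dot] == item.lhs):
            yield Item(other.lhs, other.rhs, other.dot + 1, other.i, item.j)

Running PREDICT, SCAN, and COMPLETE to a fixed point, starting from the axiom ROOT → • S [0,0], gives the recognizer; finding the goal item ROOT → S • [0,n] in the chart answers whether the sentence is in the language.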

Page 11

Thought Questions

• Runtime?
  – O(n³)

• Memory?
  – O(n²)

• Weighted version?
• Recovering trees?

Page 12

Parsing as Search

Page 13

Implementing Recognizers as Search

Agenda = { state0 }
while (Agenda not empty)
    s = pop a state from Agenda
    if s is a success-state, return s      // valid parse tree
    else if s is not a failure-state:
        generate new states from s
        push new states onto Agenda
return nil                                 // no parse!
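The same loop in runnable Python (a sketch; is_success, is_failure, and expand are placeholders for whatever state space the recognizer uses):

def agenda_search(initial_state, is_success, is_failure, expand):
    # Generic agenda-driven recognizer: pop a state, stop on success, otherwise expand it.
    agenda = [initial_state]
    while agenda:
        s = agenda.pop()                     # pop a state from the agenda
        if is_success(s):
            return s                         # valid parse
        if not is_failure(s):
            agenda.extend(expand(s))         # push the newly generated states
    return None                              # no parse

How the pop is done matters: popping from the end gives depth-first search, popping from the front gives breadth-first search, and a priority queue gives best-first search.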

Page 14

Agenda-Based Probabilistic Parsing

Agenda = { (item, value) : initial updates from equations }
// items take the form [X, i, j]; values are reals
while (Agenda not empty)
    u = pop an update from Agenda
    if u.item is goal, return u.value      // valid parse tree
    else if u.value > Chart[u.item]:
        store Chart[u.item] ← u.value
        if u.item combines with other Chart items:
            generate new updates from u and items stored in Chart
            push new updates onto Agenda
return nil                                 // no parse!
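A runnable sketch of the same loop (my rendering, not the course's code): items are [X, i, j] triples, values are real-valued scores, and combine is a placeholder for the deduction step that builds larger items from the chart.

import heapq

def agenda_parse(initial_updates, is_goal, combine):
    chart = {}                                         # item → best value found so far
    agenda = [(-value, item) for item, value in initial_updates]
    heapq.heapify(agenda)                              # pop the highest-valued update first
    while agenda:
        neg_value, item = heapq.heappop(agenda)
        value = -neg_value
        if is_goal(item):
            return value                               # value of the goal item
        if value > chart.get(item, float("-inf")):
            chart[item] = value                        # store the improved value
            for new_item, new_value in combine(item, value, chart):
                heapq.heappush(agenda, (-new_value, new_item))
    return None                                        # no parse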

Page 15

Catalog of CF Parsing Algorithms

• Recognition/Boolean vs. parsing/probabilistic
• Chomsky normal form/CKY vs. general/Earley’s
• Exhaustive vs. agenda

Page 16

Dependency Parsing

Page 17

Treebank Tree

The luxury auto maker last year sold 1,214 cars in the U.S.

DT NN NN NN JJ NN VBD CD NNS IN DT NNP

[Tree diagram: NP, PP, VP, and S constituents over the tagged words]

Page 18

Headed Tree

The luxury auto maker last year sold 1,214 cars in the U.S.

DT NN NN NN JJ NN VBD CD NNS IN DT NNP

[The same tree, with the head child of each constituent marked]

Page 19

Lexicalized Tree

The luxury auto maker last year sold 1,214 cars in the U.S.

DT NN NN NN JJ NN VBD CD NNS IN DT NNP

[Lexicalized tree: NP(maker), NP(year), NP(cars), NP(U.S.), PP(in), VP(sold), S(sold)]

Page 20

Dependency Tree

Page 21

Methods for Dependency Parsing

• Parse with a phrase-structure parser with headed / lexicalized rules
  – Reuse algorithms we know
  – Leverage improvements in phrase structure parsing

• Maximum spanning tree algorithms
  – Words are nodes, edges are possible links
  – MSTParser

• Shift-reduce parsing
  – Read words in one at a time, decide to “shift” or “reduce” to incrementally build tree structures
  – MaltParser, Stanford NN Dependency Parser

Page 22

Maximum Spanning Tree

• Each dependency is an edge
• Assign each edge a goodness score (ML problem)
• Dependencies must form a tree
• Find the highest scoring tree (Chu-Liu-Edmonds algorithm)

Figure: Graham Neubig
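As a rough illustration of the setup (not the full Chu-Liu-Edmonds algorithm; the score matrix is whatever the learned model produces, and all names are mine): pick the best-scoring head for each word and check whether the result is already a tree. Chu-Liu-Edmonds adds a cycle-contraction step for the cases where this greedy choice creates a cycle.

def greedy_heads(scores):
    # scores[h][d]: goodness of the edge from head h to dependent d; node 0 is an artificial ROOT.
    n = len(scores)
    heads = [0] * n                                    # heads[0] is unused (ROOT has no head)
    for d in range(1, n):
        heads[d] = max((h for h in range(n) if h != d), key=lambda h: scores[h][d])
    return heads

def is_tree(heads):
    # True if every word reaches ROOT by following heads, i.e. the chosen edges contain no cycle.
    for d in range(1, len(heads)):
        seen, node = set(), d
        while node != 0:
            if node in seen:
                return False                           # a cycle: Chu-Liu-Edmonds would contract it
            seen.add(node)
            node = heads[node]
    return True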

Page 23

Shift-Reduce Parsing

• Two data structures
  – Buffer: words that are being read in
  – Stack: partially built dependency trees

• At each point choose
  – Shift: move a word from the buffer to the stack
  – Reduce-left: combine the top two items on the stack by making the top word the head of the tree
  – Reduce-right: combine the top two items on the stack by making the second word the head of the tree

• Parsing as classification: the classifier says “shift”, “reduce-left”, or “reduce-right”

Page 24

Shift-Reduce Parsing

Figure: Graham Neubig

[Figure: stack and buffer contents before and after an action]

Page 25

Parsing as Classification

• Given a state:

• What action is best?

• Better classification -> better parsing

[Diagram: an example parser state, showing its stack and buffer]

Page 26

Shift-Reduce Algorithm

• ShiftReduce(buffer)
  – make list heads
  – stack = [ (0, “ROOT”, “ROOT”) ]
  – while |buffer| > 0 or |stack| > 1:
    • feats = MakeFeats(stack, buffer)
    • action = Predict(feats, weights)
    • if action = shift:
      – stack.push(buffer.read())
    • elif action = reduce_left:
      – heads[stack[-2]] = stack[-1]
      – stack.remove(-2)
    • else:  # action = reduce_right
      – heads[stack[-1]] = stack[-2]
      – stack.remove(-1)
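A self-contained, runnable version of this loop (a sketch: the trained classifier, MakeFeats and Predict with learned weights, is replaced by a hypothetical choose_action stand-in, so only the control flow is real):

def choose_action(stack, buffer):
    # Placeholder policy standing in for the classifier: shift everything, then reduce leftward.
    if buffer:
        return "shift"
    return "reduce_left" if len(stack) > 2 else "reduce_right"

def shift_reduce(words):
    buffer = [(i + 1, w) for i, w in enumerate(words)]     # (index, word), 1-based indices
    stack = [(0, "ROOT")]
    heads = {}                                             # dependent index → head index
    while buffer or len(stack) > 1:
        action = choose_action(stack, buffer)
        if action == "shift":
            stack.append(buffer.pop(0))                    # move the next word onto the stack
        elif action == "reduce_left":
            heads[stack[-2][0]] = stack[-1][0]             # top word becomes the head
            del stack[-2]
        else:                                              # reduce_right
            heads[stack[-1][0]] = stack[-2][0]             # second word becomes the head
            del stack[-1]
    return heads

# With the placeholder policy, shift_reduce("book the flight".split()) returns {2: 3, 1: 3, 3: 0}:
# "book" and "the" attach to "flight", which attaches to ROOT.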