Natural Language Processing

Jan 11, 2016

Page 1: Natural Language Processing

Natural Language Processing

Readings: Russell and Norvig’s AI: A Modern Approach (Chapter 22)

James Allen’s NLU (Chapter 3)

Page 2: Natural Language Processing

What is Natural Language Processing?

Page 3: Natural Language Processing

NLP: applications

• Speech recognition and synthesis

• Machine translation

• Document processing– information extraction– summarization

• Text generation

• Dialog systems (typed and spoken)

Page 4: Natural Language Processing

Levels of language analysis

• Phonology: What words (or sub-words) are we dealing with?

• Morphology: How are words constructed from more basic meaning units?

• Syntax: What phrases are we dealing with?

• Semantics: What’s the context-free meaning?

• Pragmatics: What is the more exact (context-dependent) meaning?

• Discourse knowledge: How do the immediately preceding sentences affect the interpretation of the next sentence?

• World knowledge: Using general knowledge about the world

Page 5: Natural Language Processing

Levels of language analysis

• Phonetics: sounds -> words

– /b/ + /o/ + /t/ = boat

• Morphology: morphemes -> words

– friend + ly = friendly

• Syntax: word sequence -> sentence structure

• Semantics: sentence structure + word meaning -> sentence meaning

• Pragmatics: sentence meaning + context -> more precise meaning

• Discourse and world knowledge

Page 6: Natural Language Processing

Levels of language analysis (cont.)

1. Language is one of the fundamental aspects of human behavior and is a crucial component of our lives.

2. Green frogs have large noses.

3. Green ideas have large noses.

4. Large have green ideas nose.

5. I go store.

Page 7: Natural Language Processing

Why is NLP Hard?

“At last, a computer that understands you like your mother”

Page 8: Natural Language Processing

Ambiguity

• “At last, a computer that understands you like your mother”

1. (*) It understands you as well as your mother understands you

2. It understands (that) you like your mother

3. It understands you as well as it understands your mother

• 1 and 3: Does this mean well, or poorly?

Page 9: Natural Language Processing

Ambiguity at Many Levels

At the acoustic level (speech recognition):

1. “ ... a computer that understands you like your mother”

2. “ ... a computer that understands you lie cured mother”

Page 10: Natural Language Processing

Ambiguity at Many Levels

At the syntactic level:

Different structures lead to different interpretations.

Page 11: Natural Language Processing

More Syntactic Ambiguity

Page 12: Natural Language Processing

Ambiguity at Many Levels

At the semantic (meaning) level:

Two definitions of “mother”

• a woman who has given birth to a child

• a substance consisting of bacteria, used to produce vinegar (i.e., mother of vinegar)

This is an instance of word sense ambiguity

Page 13: Natural Language Processing

Ambiguity at Many Levels

At the discourse level:

• Alice says they’ve built a computer that understands you like your mother

• But she ...

• ... doesn’t know any details

• ... doesn’t understand me at all

Page 14: Natural Language Processing

Syntactic analysis

• Syntax can make explicit when there are several possible interpretations

– (Rice flies) like sand.

– Rice (flies like sand).

• Knowledge of ‘correct’ grammar can help find the right interpretation

– Flying planes are dangerous.

– Flying planes is dangerous.

Page 15: Natural Language Processing

Syntax shows how words are related in a sentence.

Visiting aunts ARE boring.

vs

Visiting aunts IS boring.

 

Subject verb agreement allows us to disambiguate here.

Page 16: Natural Language Processing

How do we represent syntax?

Parse Tree

Page 17: Natural Language Processing

An example:

– Parsing sentence:

– "They are cooking apples."

Page 18: Natural Language Processing

Parse 1

Page 19: Natural Language Processing

Parse 2

Page 20: Natural Language Processing

How do we represent syntax?

List

Sue hit John

[s, [np, [proper_noun, Sue]],

[vp, [v, hit],

[np, [proper_noun, John]]]]
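
The bracketed list form maps directly onto nested lists in a programming language. The following is a minimal Python sketch, not part of the original slides; the helper name `leaves` is hypothetical and simply reads the words back off the tree.

```python
# Nested-list parse tree for "Sue hit John", mirroring the bracketed form.
# Each node is [label, child, child, ...]; a bare string is a word.
tree = ["s",
        ["np", ["proper_noun", "Sue"]],
        ["vp", ["v", "hit"],
               ["np", ["proper_noun", "John"]]]]


def leaves(node):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(node, str):          # a word
        return [node]
    _label, *children = node           # skip the category label
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


print(leaves(tree))
```

Reading the leaves left to right recovers the original sentence, which is a quick sanity check that the brackets are balanced and nested as intended.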

Page 22: Natural Language Processing

What strategies exist for trying to find the structure in natural language?

Top Down vs. Bottom Up

Bottom - Up

John, hit, the, cat

prpn, hit, the, cat

prpn, v, the, cat

prpn, v, det, cat

prpn, v, det, n

np, v, det, n

np, v, np

np, vp

s

Better if many alternative rules for a phrase

Worse if many alternative terminal symbols for each word

Top - Down

s

s -> np, vp

s -> prpn, vp

s -> John, v, np

s -> John, hit, np

s -> John, hit, det,n

s -> John, hit, the,n

s -> John, hit, the,cat

Better if many alternative terminal symbols for each word

Worse if many alternative rules for a phrase
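
The bottom-up trace for "John hit the cat" can be reproduced with a small greedy reducer. This is a sketch under stated assumptions: the rule list `RULES` is inferred from the trace (the slides do not list it), lexical rules are placed first so that words are tagged before phrases are built, and there is no backtracking, which is enough here only because this sentence has a single analysis.

```python
# (LHS, RHS) pairs, tried in this order. Assumed from the trace on the slide.
RULES = [
    ("prpn", ["John"]),
    ("v", ["hit"]),
    ("det", ["the"]),
    ("n", ["cat"]),
    ("np", ["prpn"]),
    ("np", ["det", "n"]),
    ("vp", ["v", "np"]),
    ("s", ["np", "vp"]),
]


def reduce_once(symbols):
    """Apply the first rule (in RULES order) whose RHS occurs in symbols."""
    for lhs, rhs in RULES:
        for i in range(len(symbols) - len(rhs) + 1):
            if symbols[i:i + len(rhs)] == rhs:
                return symbols[:i] + [lhs] + symbols[i + len(rhs):]
    return None  # no rule applies


def bottom_up_trace(words):
    """Return every intermediate symbol list, starting from the words."""
    trace = [list(words)]
    while True:
        reduced = reduce_once(trace[-1])
        if reduced is None:
            return trace
        trace.append(reduced)


for line in bottom_up_trace(["John", "hit", "the", "cat"]):
    print(", ".join(line))
```

A real bottom-up parser has to consider several categories per word and several reductions per step, which is where the "many alternative terminal symbols" cost noted above comes from.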

Page 23: Natural Language Processing

What does an example grammar for English look like?

• Re-write rules

1. sentence -> noun phrase, verb phrase

2. noun phrase -> art, noun

3. noun phrase -> art, adj, noun

4. verb phrase -> verb

5. verb phrase -> verb, noun phrase
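
One way to see what these five rules license is to store them as data and enumerate every sequence of word categories they derive. The sketch below is illustrative only; the dictionary layout and the function name `expand` are assumptions, and the approach works because this grammar has no recursive rules.

```python
from itertools import product

# The five re-write rules, keyed by left-hand side.
GRAMMAR = {
    "sentence": [["noun phrase", "verb phrase"]],
    "noun phrase": [["art", "noun"], ["art", "adj", "noun"]],
    "verb phrase": [["verb"], ["verb", "noun phrase"]],
}


def expand(symbol):
    """Return every sequence of word categories derivable from symbol."""
    if symbol not in GRAMMAR:           # a word category: nothing to rewrite
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine one expansion of each RHS symbol, left to right.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([cat for part in parts for cat in part])
    return results


for sequence in expand("sentence"):
    print(" ".join(sequence))
```

Two noun phrase shapes combined with three verb phrase shapes give six category sequences in total, from "art noun verb" up to "art adj noun verb art adj noun".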

Page 24: Natural Language Processing

Top down parsing (as a search procedure)

1 The 2 dog 3 cried 4

Step  Current state     Backup states                     Comment
1     ((S) 1)                                             initial position
2     ((NP VP) 1)                                         Rule 1
3     ((ART N VP) 1)    ((ART ADJ N VP) 1)                Rules 2 & 3
4     ((N VP) 2)        ((ART ADJ N VP) 1)                Match ART with “the”
5     ((VP) 3)          ((ART ADJ N VP) 1)                Match N with “dog”
6     ((V) 3)           ((V NP) 3), ((ART ADJ N VP) 1)    Rules 4 & 5
7                                                         Success
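
The search above can be sketched in a few lines of Python. Each state is a pair (symbols still to be found, position in the input), the first state in the list is the current one, and the rest are the backup states. The lexicon entries are an assumption read off the "Match" comments in the trace, and the function name `top_down_parse` is hypothetical.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["ART", "N"], ["ART", "ADJ", "N"]],
    "VP": [["V"], ["V", "NP"]],
}
# Assumed word categories, taken from the matches shown in the trace.
LEXICON = {"the": {"ART"}, "dog": {"N"}, "cried": {"V"}}


def top_down_parse(words):
    """Depth-first top-down recognizer. Returns (accepted, trace)."""
    states = [(["S"], 1)]              # current state first, then backups
    trace = []
    while states:
        trace.append(list(states))     # snapshot of this step
        (symbols, pos), backups = states[0], states[1:]
        if not symbols and pos == len(words) + 1:
            return True, trace         # nothing left to find, input used up
        new_states = []
        if symbols:
            first, rest = symbols[0], symbols[1:]
            if first in GRAMMAR:       # rewrite a non-terminal
                new_states = [(rhs + rest, pos) for rhs in GRAMMAR[first]]
            elif pos <= len(words) and first in LEXICON.get(words[pos - 1], ()):
                new_states = [(rest, pos + 1)]   # match a word category
        states = new_states + backups  # dead end: fall back to a backup
    return False, trace


def show(state):
    symbols, pos = state
    return f"(({' '.join(symbols)}) {pos})"


accepted, trace = top_down_parse(["the", "dog", "cried"])
for step, snapshot in enumerate(trace, start=1):
    print(step, "  ".join(show(s) for s in snapshot))
print("accepted:", accepted)
```

The sketch is a recognizer only: it reports whether the sentence is accepted but does not build a parse tree, and it would loop on a left-recursive grammar.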

Page 25: Natural Language Processing

Chart Parsing

General Principles: A Bottom-Up parsing method

– Construct a parse starting from the input symbols

– Build constituents from sub-constituents

– When all constituents on the RHS of a rule are matched, create a constituent for the LHS of the rule

The Chart allows storing partial analyses, so that they can be shared.

Data structures used by the algorithm:

– The Key: the current constituent we are attempting to “match”

– An Active Arc: a grammar rule that has a partially matched RHS

– The Agenda: keeps track of newly found unprocessed constituents

– The Chart: records processed constituents (non-terminals) that span substrings of the input

Page 26: Natural Language Processing

Chart Parsing

Steps in the Process: Input is processed left-to-right, one word at a time

1. Find all POS of the current word (terminal-level)

2. Initialize Agenda with all POS of the word

3. Pick a Key from the Agenda

4. Add all grammar rules that start with the Key as active arcs

5. Extend any existing active arcs with the Key

6. Add LHS constituents of newly completed rules to the Agenda

7. Add the Key to the Chart

8. If Agenda not empty, go to (3), else go to (1)

Page 27: Natural Language Processing

The Chart Parsing Algorithm

Extending Active Arcs with a Key:

- Each Active Arc has the form: <pi>[A -> X1 … • C … Xm]<pj>, where • marks how much of the RHS has been matched so far

- A Key constituent has the form: <pj>C<pk>

- When processing the Key <p1>C<p2>, we search the active arc list for an arc <p0>[A -> X1 … • C … Xm]<p1>, and then create a new active arc <p0>[A -> X1 … C • … Xm]<p2>

- If the new active arc is a completed rule: <p0>[A -> X1 … C •]<p2>, then we add <p0>A<p2> to the Agenda

- After “using” the Key to extend all relevant arcs, it is entered into the Chart

Page 28: Natural Language Processing

The Chart Parsing Algorithm

The Main Algorithm: parsing input x = x1…xn

1. i = 0

2. If the Agenda is empty and i < n, then set i = i + 1, find all POS of xi, and add them as constituents <pi>C<pi+1> to the Agenda

3. Pick a Key constituent <pj>C<pk> from the Agenda

4. For each grammar rule of the form A -> C X1 … Xm, add <pj>[A -> • C X1 … Xm]<pj> (an arc with nothing matched yet) to the list of active arcs

5. Use the Key to extend all relevant active arcs

6. Add the LHS of any completed active arcs to the Agenda

7. Insert the Key into the Chart

8. If the Key is <p1>S<pn+1> (an S spanning the whole input), then accept the input; else go to (2)

Page 29: Natural Language Processing

Chart Parsing - Example

The Grammar:

(1) S -> NP VP

(2) NP -> ART ADJ N

(3) NP -> ART N

(4) NP -> ADJ N

(5) VP -> AUX VP

(6) VP -> V NP

Page 30: Natural Language Processing

Chart Parsing - Example

The input: x = “The large can can hold the water”

POS of Input Words:

– the: ART

– large: ADJ

– can: N, AUX, V

– hold: N, V

– water: N, V
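
The chart parsing steps from the preceding slides can be tried on this example with a short Python sketch. It is a simplified recognizer, not the slides' exact procedure: constituents are tuples (category, start, end) with positions numbered from 1, active arcs are tuples (LHS, RHS, number of RHS symbols matched, start, end), and acceptance is checked once at the end instead of inside the loop. The names `chart_parse`, `GRAMMAR` and `LEXICON` are illustrative.

```python
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("ART", "ADJ", "N")),
    ("NP", ("ART", "N")),
    ("NP", ("ADJ", "N")),
    ("VP", ("AUX", "VP")),
    ("VP", ("V", "NP")),
]
LEXICON = {
    "the": ["ART"], "large": ["ADJ"], "can": ["N", "AUX", "V"],
    "hold": ["N", "V"], "water": ["N", "V"],
}


def chart_parse(words):
    """Bottom-up chart recognizer. Returns the set of constituents found."""
    n = len(words)
    agenda, chart, arcs = [], set(), set()
    i = 0
    while agenda or i < n:
        if not agenda:                       # read the next word
            i += 1
            for cat in LEXICON.get(words[i - 1].lower(), []):
                agenda.append((cat, i, i + 1))
            continue
        key = agenda.pop()                   # pick a Key
        if key in chart:                     # already processed
            continue
        cat, start, end = key
        for lhs, rhs in GRAMMAR:             # rules that start with the Key
            if rhs[0] == cat:
                arcs.add((lhs, rhs, 0, start, start))
        for lhs, rhs, dot, a_start, a_end in list(arcs):
            if a_end == start and rhs[dot] == cat:   # Key extends this arc
                if dot + 1 == len(rhs):              # rule completed
                    agenda.append((lhs, a_start, end))
                else:
                    arcs.add((lhs, rhs, dot + 1, a_start, end))
        chart.add(key)                       # the Key enters the Chart
    return chart


sentence = "The large can can hold the water".split()
chart = chart_parse(sentence)
print(("S", 1, len(sentence) + 1) in chart)
for constituent in sorted(chart, key=lambda c: (c[1], c[2], c[0])):
    print(constituent)
```

Because every constituent is stored once in the chart, the noun phrase "the water" is built a single time and then reused by each verb phrase analysis that needs it, which is the sharing of partial analyses mentioned earlier.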

Page 31: Natural Language Processing

The large can can hold the water

Page 32: Natural Language Processing

The large can can hold the water

Page 33: Natural Language Processing

The large can can hold the water

Page 34: Natural Language Processing

The large can can hold the water

Page 35: Natural Language Processing

The final Chart