L90: Overview of Natural Language Processing
Lecture 12: Natural Language Generation

Weiwei Sun

Department of Computer Science and Technology
University of Cambridge

Michaelmas 2020/21


I have a question about whether you've ever attempted to look at generation? [...] That is a rich, rich area which so few people address [...]

Well, I find generation completely terrifying [...] I am very interested in the problem [...] That's an important question.

ACL lifetime achievement award lecture (vimeo.com/288152682). Mark Steedman, FBA, FRSE

equally important to language understanding

Lecture 12: Natural Language Generation

1. Overview

2. Text summarization

3. Surface realisation

4. Evaluation


Overview


Generation from what?!

[Diagram: natural language expressions ↔ representations R. Comprehension maps expressions to representations; production maps representations back to expressions. R may be morphological structure, syntactic structure, semantic structure, discourse structure, or application-related structure.]

[...] you can get away with incomplete semantics when you are doing parsing, but when you're doing generation, you have to specify everything in semantics. And we don't know how to do that. At least we don't know how to do that completely or properly.

Mark Steedman, FBA, FRSE


Generation from what?!

• logical form: inverse of (deep) (semantic) parsing, aka surface realisation

• formally-defined data: databases, knowledge bases, etc

• semantic web ontologies, etc

• semi-structured data: tables, graphs etc

• numerical data: weather reports, etc

• cross-modal input: image, etc

• user input (plus other data sources) in assistive communication.

generating from data often requires domain experts


Components of a classical generation system

• Content determination: deciding what information to convey

• Discourse structuring: overall ordering, sub-headings etc

• Aggregation: deciding how to split information into sentence-sized chunks

• Referring expression generation: deciding when to use pronouns, which modifiers to use etc

• Lexical choice: which lexical items convey a given concept (or predicate choice)

• Realization: mapping from a meaning representation (or syntax tree) to a string (or speech)

• Fluency ranking (a toy pipeline covering several of these stages is sketched below)
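These stages compose into a pipeline. Below is a minimal end-to-end sketch with one function per stage; every fact tuple, rule, and name in it is an illustrative assumption, not from the lecture.

```python
# A toy classical-NLG pipeline sketch: content determination, aggregation,
# referring expression generation and realization as separate functions.
facts = [("dog", "chase", "cat"), ("dog", "in", "garden")]

def determine_content(facts):
    # Content determination by a fixed rule: drop location facts.
    return [f for f in facts if f[1] != "in"]

def realize(fact, pronominal=False):
    subj, verb, obj = fact
    # Referring expression generation: pronoun for a repeated subject.
    subject = "it" if pronominal else f"the {subj}"
    # Crude morphology: regular past tense by appending "d".
    return f"{subject} {verb}d the {obj}."

def generate(facts):
    selected = determine_content(facts)
    # Aggregation: one sentence per fact, pronominalizing repeated subjects.
    sents, prev_subj = [], None
    for f in selected:
        sents.append(realize(f, pronominal=(f[0] == prev_subj)))
        prev_subj = f[0]
    return " ".join(s.capitalize() for s in sents)

print(generate(facts))   # The dog chased the cat.
```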


A typical framework for neural generation

[Diagram: encoder–decoder. The input x is encoded into a vector; the decoder then unrolls hidden states h1, h2, ..., hn, emitting output tokens y1=I, y2=love, ..., yn=processing.]

• Many different model designs.

• Need many examples of input and desired output.
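As a concrete sketch of this framework: a minimal encoder–decoder, assuming PyTorch (the lecture does not prescribe a toolkit); the class, vocabulary sizes, and dimensions are illustrative.

```python
# A minimal sequence-to-sequence sketch: encode the input into a single
# vector, then unroll a decoder that emits one output token per step,
# mirroring the x / h1..hn / y1..yn picture on the slide.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)   # scores over output vocabulary

    def forward(self, src, tgt_in):
        # Encoding: compress the whole input into the final hidden state.
        _, h = self.encoder(self.src_emb(src))
        # Decoding: unroll h1..hn conditioned on the encoded input
        # (teacher forcing: the gold previous token is fed at each step).
        hs, _ = self.decoder(self.tgt_emb(tgt_in), h)
        return self.out(hs)                    # one distribution per step

model = Seq2Seq(src_vocab=100, tgt_vocab=100)
src = torch.randint(0, 100, (1, 5))      # toy input sequence
tgt_in = torch.randint(0, 100, (1, 4))   # shifted gold output sequence
print(model(src, tgt_in).shape)          # torch.Size([1, 4, 100])
```

Training such a model needs many input/output pairs, as the slide notes; the toy tensors above only exercise the shapes.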


[Slide: figure from Y Goldberg's talk (see Readings).]


Approaches to generation

• Classical (limited domain): hand-written rules for the first five steps, grammar for realization; grammar small enough that there is no need for fluency ranking (or hand-written rules).

• Templates: most practical systems. Fixed text with slots, fixed rules for content determination (see the sketch after this list).

• Statistical (limited domain): components as above, but using machine learning (supervised or unsupervised).

• Neural (sequence-)to-sequence models.
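A minimal template generator might look like the sketch below; the weather-report domain and field names are illustrative assumptions, not from the lecture.

```python
# Template-based generation: fixed text with slots, plus a fixed rule
# for content determination.
def weather_report(data: dict) -> str:
    # Content determination by fixed rule: only mention rain if expected.
    rain = " Rain is expected." if data["rain_mm"] > 0 else ""
    # Fixed text with slots:
    return (f"Today in {data['city']} the temperature will reach "
            f"{data['max_temp']} degrees.{rain}")

print(weather_report({"city": "Cambridge", "max_temp": 12, "rain_mm": 3}))
# Today in Cambridge the temperature will reach 12 degrees. Rain is expected.
```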


Text Summarization


Regeneration: transforming text

• Text from partially ordered bag of words: statistical MT.

• Paraphrase

• Summarization (single- or multi-document)

• Wikipedia article construction from text fragments

• Text simplification

Also: mixed generation and regeneration systems, MT.


Overview of summarization

• Pure form of task: reduce the length of a document.
• Most used for search results, question answering etc: different scenarios have different requirements.
• Multidocument summarization: e.g., bringing together information from different news reports.
• Two main system types:
  Extractive: select sentences from a document. Possibly compress selected sentences.
  Abstractive: use partial analysis of the text to build a summary.

Extractive

If we consider a discourse relation as a relationship between two phrases, we get a binary branching tree structure for the discourse. In many relationships, such as Explanation, one phrase depends on the other: e.g., the phrase being explained is the main one and the other is subsidiary. In fact we can get rid of the subsidiary phrases and still have a reasonably coherent discourse.
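A minimal extractive system can be sketched without discourse structure at all: score sentences by word frequency and keep the top ones. This illustrates only the "select sentences" idea, not the discourse-based pruning described above; the scoring rule and example document are illustrative assumptions.

```python
# A minimal frequency-based extractive summarizer sketch.
from collections import Counter
import re

def extractive_summary(text: str, n_sentences: int = 1) -> str:
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))
    # Score each sentence by the average frequency of its words.
    def score(s):
        toks = re.findall(r'\w+', s.lower())
        return sum(freq[t] for t in toks) / max(len(toks), 1)
    best = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Emit the selected sentences in their original order.
    return ' '.join(s for s in sentences if s in best)

doc = ("The dog chased a cat. The cat ran into the garden. "
       "The garden was quiet.")
print(extractive_summary(doc, 2))
```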


Abstractive summarization with meaning representations

I saw Joe's dog, which was running in the garden.

The dog was chasing a cat.

[Figure: the pipeline of Liu et al. 2015. Semantic parsing maps each sentence to an AMR graph (see-01 and run-02 for the first sentence, chase-01 for the second, with ARG0, ARG1, poss and location edges). The sentence graphs are merged on shared concepts (dog). A summary subgraph is then selected, and surface realisation produces:]

Joe's dog was chasing a cat in the garden.

Liu et al. 2015
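A sketch of the merge and subgraph-selection steps, with AMR graphs simplified to sets of (head, relation, dependent) triples over concept labels. Merging by label and the hand-picked focus set are simplifications for illustration; Liu et al. 2015 align variables and learn the subgraph selection.

```python
# AMR graphs as triple sets; merging is set union once shared concept
# labels are identified across sentences.
g1 = {("see-01", "ARG0", "i"), ("see-01", "ARG1", "dog"),
      ("dog", "poss", "joe"), ("run-02", "ARG0", "dog"),
      ("run-02", "location", "garden")}
g2 = {("chase-01", "ARG0", "dog"), ("chase-01", "ARG1", "cat")}

merged = g1 | g2   # the shared concept "dog" now connects the two graphs

# Summarize: keep the subgraph over a chosen focus set of concepts. Note
# this naive version drops the garden location (attached to run-02), which
# the slide's summary reattaches to chase-01.
focus = {"chase-01", "dog", "cat", "garden", "joe"}
summary_graph = {(h, r, d) for (h, r, d) in merged
                 if h in focus and d in focus}
print(sorted(summary_graph))
# [('chase-01', 'ARG0', 'dog'), ('chase-01', 'ARG1', 'cat'), ('dog', 'poss', 'joe')]
```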


Abstractive summarization: Evaluation

Evaluation on the Proxy Report section of AMRBank LDC2017T10.

AMRs   NLG model               rouge-1   rouge-2   rouge-L
gold   amr2seq + LM            40.4      20.3      31.4
gold   amr2seq                 38.9      12.9      27.0
gold   amr2bow (Liu et al.)    39.6       6.2      22.1
RIGA   amr2seq + LM            42.3      21.2      33.6
RIGA   amr2seq                 37.8      10.7      26.9
–      OpenNMT                 36.1      19.2      31.1

Hardy and Vlachos, 2018


Surface Realisation


Modeling Syntactico-Semantic Composition

The Principle of Compositionality

The meaning of an expression is a function of the meanings of its parts and of the way they are syntactically combined.

B. Partee

[Figure: composing the meanings of "blue" and "panda" to obtain the meaning of "blue panda".]
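A minimal sketch of the principle in code: word meanings as predicates, with the syntactic combination Adj+N paired with a semantic operation (here, intersection). The toy denotations, and treating "blue" as intersective, are assumptions for illustration.

```python
# Compositionality sketch: the meaning of "blue panda" is computed from
# the meanings of "blue" and "panda" plus the way they combine.
panda = lambda x: x in {"po", "yuan_zai"}       # toy denotation of "panda"
blue  = lambda x: x in {"po", "sky", "sea"}     # toy denotation of "blue"

# The syntactic rule Adj+N -> N determines the semantic operation:
def intersective(adj, noun):
    return lambda x: adj(x) and noun(x)

blue_panda = intersective(blue, panda)
print(blue_panda("po"), blue_panda("yuan_zai"))  # True False
```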


Parse a meaning representation

A dynamic programming algorithm (Chiang et al., 2013)

[Figure: a semantic graph with nodes A–G linked by arg1 and cjt-l/cjt-r edges is decomposed step by step; numbered grammar rules of the form X ⇒ Z rewrite subgraphs, yielding a dynamic-programming chart over the graph.]


Evaluation


Tokenwise evaluation

complete match?

POS tagging

$$\frac{|\{\langle \mathrm{word}, \mathrm{tag}\rangle\}_{\mathrm{system}} \cap \{\langle \mathrm{word}, \mathrm{tag}\rangle\}_{\mathrm{gold}}|}{|\{\mathrm{word}\}|}$$

Phrase structure parsing

$$\mathrm{precision} = \frac{|\{\langle \mathrm{left}, \mathrm{right}, \mathrm{category}\rangle\}_{\mathrm{system}} \cap \{\langle \mathrm{left}, \mathrm{right}, \mathrm{category}\rangle\}_{\mathrm{gold}}|}{|\{\langle \mathrm{left}, \mathrm{right}, \mathrm{category}\rangle\}_{\mathrm{system}}|}$$

$$\mathrm{recall} = \frac{|\{\langle \mathrm{left}, \mathrm{right}, \mathrm{category}\rangle\}_{\mathrm{system}} \cap \{\langle \mathrm{left}, \mathrm{right}, \mathrm{category}\rangle\}_{\mathrm{gold}}|}{|\{\langle \mathrm{left}, \mathrm{right}, \mathrm{category}\rangle\}_{\mathrm{gold}}|}$$

$$F_\beta = \frac{(1+\beta^2) \times \mathrm{precision} \times \mathrm{recall}}{\beta^2 \times \mathrm{precision} + \mathrm{recall}}$$

f-score: en.wikipedia.org/wiki/F-score
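A sketch of these formulas over sets of labelled spans ⟨left, right, category⟩; the toy gold and system bracketings are illustrative.

```python
# Precision, recall and F-beta over sets of labelled constituents.
def prf(system: set, gold: set, beta: float = 1.0):
    correct = len(system & gold)
    p = correct / len(system) if system else 0.0
    r = correct / len(gold) if gold else 0.0
    f = ((1 + beta**2) * p * r / (beta**2 * p + r)) if p + r > 0 else 0.0
    return p, r, f

gold   = {(0, 2, "NP"), (2, 5, "VP"), (0, 5, "S")}
system = {(0, 2, "NP"), (1, 5, "VP"), (0, 5, "S")}
print(prf(system, gold))   # (0.666..., 0.666..., 0.666...)
```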


rouge

rouge-N: Overlap of N-grams between the system and reference summaries.

rouge-L: Longest Common Subsequence.

• A sequence $Z = [z_1, z_2, \ldots, z_k]$ is a subsequence of another sequence $X = [x_1, x_2, \ldots, x_m]$ if there exists a strictly increasing sequence $[i_1, i_2, \ldots, i_k]$ of indices of $X$ such that for all $j = 1, 2, \ldots, k$, we have $x_{i_j} = z_j$.

• The longest common subsequence (LCS) of $X$ and $Y$ is a common subsequence with maximum length.

Sentence-level LCS ($X$: reference):

$$R_{lcs} = \frac{\#\mathrm{LCS}(X,Y)}{\#X} \qquad P_{lcs} = \frac{\#\mathrm{LCS}(X,Y)}{\#Y}$$

Lin (2004): www.aclweb.org/anthology/W04-1013.pdf
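A sketch of sentence-level rouge-L following Lin (2004): LCS length by dynamic programming, then $R_{lcs}$, $P_{lcs}$ and their F-measure. The example token lists are illustrative.

```python
# LCS length via the standard O(mn) dynamic program.
def lcs_len(x, y):
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = (dp[i-1][j-1] + 1 if x[i-1] == y[j-1]
                        else max(dp[i-1][j], dp[i][j-1]))
    return dp[m][n]

def rouge_l(reference, candidate, beta=1.0):
    lcs = lcs_len(reference, candidate)
    r = lcs / len(reference)     # R_lcs: recall against the reference
    p = lcs / len(candidate)     # P_lcs: precision of the candidate
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)

ref = "the dog was chasing a cat in the garden".split()
cand = "joe 's dog chased a cat".split()
print(rouge_l(ref, cand))   # 0.4 (LCS = [dog, a, cat])
```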


Readings

• Ann's lecture notes. https://www.cl.cam.ac.uk/teaching/1920/NLP/materials.html

• Y Goldberg. Neural Language Generation. https://inlg2018.uvt.nl/wp-content/uploads/2018/11/INLG2018-YoavGoldberg.pdf
