Introduction to Natural Language Generation. Yael Netzer, Department of Computer Science, Ben Gurion University

Jan 04, 2016

Page 1: Introduction to Natural Language Generation

Introduction to Natural Language Generation

Yael Netzer

Department of Computer Science

Ben Gurion University

Page 2: Introduction to Natural Language Generation

November 6, 2001, Yael Netzer, BGU

Outline

• Introduction – what is NLG

• Traditional architecture of NLG system

• Statistical methods in NLG

• FUF/SURGE

• An example in Hebrew – the noun phrase

• A statistical method for generation

Page 3: Introduction to Natural Language Generation

What is Natural Language Generation (NLG)

NLG is the process of constructing natural language outputs from non-linguistic inputs. [VanLinden]

NLG is mapping some communication goal to some surface utterance that satisfies the goal. [Reiter & Dale]

Page 4: Introduction to Natural Language Generation

Aspects in NLG

• Theoretical and practical interests:

– Theoretical: modeling various depths of human language representation and production.

– Practical: engineering human/computer interfaces (computer as an author/authoring aid).

Page 5: Introduction to Natural Language Generation

Example systems:

• NLG as an author:

– Weather reports (FoG)

– Stock market descriptions

– Museum artifact descriptions (ILEX)

– "Personal" letters to customers (AlethGen)

• NLG as an authoring aid

• Integrated (partial) NLG uses:

– NLG in augmentative and alternative communication

– Summarization (integrating 'cut and paste' techniques with generation)

– Machine translation (generation from an interlingua)

Page 6: Introduction to Natural Language Generation

Inputs of NLG systems

Formally, the input to an NLG system can be defined as a four-tuple (k, c, u, d):

• k - knowledge source (tables of numbers, a knowledge representation language); domain dependent, so no generalizations can be made about it.

• c - communicative goal: the consequence of a given execution of the system (considering appropriate information)

Page 7: Introduction to Natural Language Generation

NLG input spec. cont.

u - user model: characterization of the hearer or intended audience for whom the text is to be generated.

d - discourse history: previous interactions between user and NLG controlling anaphoric forms, preventing repetitions.
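
The four-tuple can be pictured as a small record. The sketch below is only an illustration: the class name, the field names and the weather figures are assumptions made for this example and do not come from any particular NLG system.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class NLGInput:
    """One request to an NLG system, as the four-tuple (k, c, u, d)."""
    knowledge_source: Any                                   # k: e.g. a table of numbers
    communicative_goal: str                                 # c: what the text should achieve
    user_model: dict = field(default_factory=dict)          # u: who the text is for
    discourse_history: list = field(default_factory=list)   # d: what was already said


# Hypothetical request; all values are invented for illustration.
request = NLGInput(
    knowledge_source={"month": "May", "mean_temp_c": 14.2, "rain_mm": 21.0},
    communicative_goal="summarize the month's weather",
    user_model={"expertise": "layperson"},
)
```

The discourse history starts empty and would grow as the system produces text, so that later stages can choose anaphoric forms and avoid repetition.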

Page 8: Introduction to Natural Language Generation

The output for an NLG system

Any text conveying the communicative goal:

It can be a single word, like "yes" in a dialogue, or a text consisting of many paragraphs in other cases.

The output should suit the medium: web pages with hyperlinks, a voice stream, etc.

Page 9: Introduction to Natural Language Generation

Main (Pipeline) Architecture

• Content determination

– What information should be included in the text?

• Document structuring

– How to organize the text.

• Lexicalisation

– Choosing particular words or phrases.

• Aggregation

– Composing chunks of information into sentences.

• Referring expression generation

– What properties should be used in referring to an entity.

• Surface realization

– Mapping the underlying content of the text to a grammatically correct sentence that expresses the desired meaning.
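
The pipeline can be sketched as a chain of functions, one per stage, each consuming the output of the previous one. This is a toy illustration under strong assumptions: the concept names, the two-entry lexicon, the single sentence frame and the input fields (temp, avg_temp, rain, avg_rain) are all invented here, and a real system would do far more at every stage.

```python
# Hypothetical lexicon and reference table for a toy weather domain.
LEXICON = {"TEMP_BELOW_AVG": "cooler", "RAIN_BELOW_AVG": "drier"}
REFERENCES = {"MONTH": "the month"}


def content_determination(data):
    # Decide what to say: report only values below the average.
    messages = []
    if data["temp"] < data["avg_temp"]:
        messages.append(("MONTH", "TEMP_BELOW_AVG"))
    if data["rain"] < data["avg_rain"]:
        messages.append(("MONTH", "RAIN_BELOW_AVG"))
    return messages


def document_structuring(messages):
    # Order the messages; here, a stable sort that groups them by entity.
    return sorted(messages, key=lambda message: message[0])


def lexicalisation(messages):
    # Map each concept to a word.
    return [(entity, LEXICON[concept]) for entity, concept in messages]


def aggregation(messages):
    # Merge messages about the same entity into one sentence-sized chunk.
    grouped = {}
    for entity, word in messages:
        grouped.setdefault(entity, []).append(word)
    return list(grouped.items())


def referring_expressions(groups):
    # Replace each entity identifier by a noun phrase.
    return [(REFERENCES[entity], words) for entity, words in groups]


def surface_realization(groups):
    # Fill a fixed sentence frame; a real realizer would use a grammar.
    sentences = [
        f"{noun_phrase.capitalize()} was {' and '.join(words)} than average."
        for noun_phrase, words in groups
    ]
    return " ".join(sentences)


def generate(data):
    result = content_determination(data)
    for stage in (document_structuring, lexicalisation, aggregation,
                  referring_expressions, surface_realization):
        result = stage(result)
    return result


print(generate({"temp": 14, "avg_temp": 16, "rain": 20, "avg_rain": 35}))
# prints: The month was cooler and drier than average.
```

The point of the strict pipeline is that each stage can be replaced independently, for example swapping the template in the last stage for a grammar-based realizer.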

Page 10: Introduction to Natural Language Generation

Content Determination

Content determination:

• The process of deciding what to say.

• No general rules; it is domain specific:

– What is important, what should always be included, what is exceptional information, etc.

– In practice, it constructs a set of messages from the underlying data (entities, concepts and relations).

Page 11: Introduction to Natural Language Generation

Document Structuring

Document Structuring:

imposing ordering and structure over the information.

- conceptual grouping

- rhetorical relationships.

Page 12: Introduction to Natural Language Generation

Lexical choice

Lexical chooser:

• Determines the particular words to be used to express concepts and relations.

• Trade-off: complexity of coding vs. richer language.

– Choosing content words: information is mapped from a conceptual vocabulary.

– The lexical chooser (LC) should supply a variety of words, consider the user model (e.g. a precise vs. a general description of a weather phenomenon), and account for pragmatic considerations (formal vs. casual style).
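
A minimal sketch of a lexical chooser that takes the user model and style into account. The concept name, the candidate words and the two features (detail, style) are hypothetical, chosen only to mirror the weather example above.

```python
# Hypothetical lexicon: several candidate words per concept, each annotated
# with the level of detail and the style it suits.
LEXICON = {
    "LIGHT_RAIN": [
        {"word": "light precipitation", "detail": "precise", "style": "formal"},
        {"word": "light rain", "detail": "general", "style": "formal"},
        {"word": "drizzle", "detail": "general", "style": "casual"},
    ],
}


def choose_word(concept, user_model):
    """Return the candidate that matches the most user-model features."""
    def score(entry):
        return sum(entry[key] == user_model.get(key) for key in ("detail", "style"))

    # max() keeps the first candidate on ties, so list order acts as a default.
    return max(LEXICON[concept], key=score)["word"]


print(choose_word("LIGHT_RAIN", {"detail": "precise", "style": "formal"}))
print(choose_word("LIGHT_RAIN", {"detail": "general", "style": "casual"}))
```

Richer output costs more coding: every extra candidate word needs its own annotations in the lexicon.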

Page 13: Introduction to Natural Language Generation

Aggregation

Aggregation can be performed at various stages:

– In the planner: combines similar data.

– In lexicalization: aggregates several concepts into one lexical element.

– Aggregation of sentences:

• "The month was cooler than average. The month was drier than average." into "The month was cooler and drier than average."
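
Sentence aggregation of this kind can be approximated with a crude string heuristic: keep the words two sentences share at the start and at the end, and conjoin the parts that differ. This is a sketch for illustration only; the function name is made up, and real systems aggregate over semantic or syntactic structures, not over raw strings.

```python
def aggregate(sentence1, sentence2):
    """Conjoin two sentences that differ only in one middle span, else return None."""
    words1 = sentence1.rstrip(".").split()
    words2 = sentence2.rstrip(".").split()
    shortest = min(len(words1), len(words2))

    # Length of the shared prefix.
    prefix = 0
    while prefix < shortest and words1[prefix] == words2[prefix]:
        prefix += 1

    # Length of the shared suffix, not overlapping the prefix.
    suffix = 0
    while suffix < shortest - prefix and words1[-1 - suffix] == words2[-1 - suffix]:
        suffix += 1

    middle1 = words1[prefix:len(words1) - suffix]
    middle2 = words2[prefix:len(words2) - suffix]
    if prefix == 0 or not middle1 or not middle2:
        return None  # nothing shared up front, or nothing to conjoin

    tail = words1[len(words1) - suffix:]
    return " ".join(words1[:prefix] + middle1 + ["and"] + middle2 + tail) + "."


print(aggregate("The month was cooler than average.",
                "The month was drier than average."))
# prints: The month was cooler and drier than average.
```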

Page 14: Introduction to Natural Language Generation

Referring expression generation

Referring Expression Generation:

– An entity can be referred to in many ways: initial reference, subsequent reference, distinguishing descriptions, definite descriptions, pronouns.

• Proper names:

– באר שבע

– באר שבע בית הנגב

• Definite descriptions:

– The train that leaves at 10am

– The next train.

• Pronouns:

– it
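
The choice between these forms depends on the discourse history. A minimal sketch, assuming a very simple policy (full description on first mention, pronoun when the entity was the last thing mentioned, short definite description otherwise); the entity record and its field names are hypothetical.

```python
def refer(entity, discourse_history):
    """Pick a referring expression and record the mention in the history."""
    if discourse_history and discourse_history[-1] == entity["id"]:
        expression = entity["pronoun"]   # mentioned just now: a pronoun is enough
    elif entity["id"] in discourse_history:
        expression = entity["short"]     # mentioned earlier: short definite description
    else:
        expression = entity["full"]      # first mention: full, distinguishing description
    discourse_history.append(entity["id"])
    return expression


# Hypothetical entities for illustration.
train = {"id": "train-1", "full": "the train that leaves at 10am",
         "short": "the train", "pronoun": "it"}
station = {"id": "station-1", "full": "the central station",
           "short": "the station", "pronoun": "it"}

history = []
mentions = [refer(train, history), refer(train, history),
            refer(station, history), refer(train, history)]
print(mentions)
```

Real referring expression generation also has to check that the chosen description distinguishes the entity from the others in context, which this policy ignores.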

Page 15: Introduction to Natural Language Generation

Syntactic realizer

Syntactic Realizer: syntax and morphology.

– Most general, domain independent (but definitely language dependent).

– Various usage scenarios.

– Input to syntactic realization is not observable.

• Input for syntactic realizers in NLG:

– What knowledge is needed to prepare the input?

– Who supplies this knowledge?

– Can we find a common abstraction, shared across languages and applications?

Page 16: Introduction to Natural Language Generation

Possible techniques for realizers

• Bi-directional grammar specification.

• Grammar specifications tuned for generation.

• Templates

• Corpus statistics

Page 17: Introduction to Natural Language Generation

A note on bi-directional grammar

• Realization is, in some respects, easier than parsing: no need to handle the full range of syntax that a human might use, no need to resolve ambiguities, no need to recover from ill-formed input.

• A bi-directional grammar is, in theory, an elegant approach.

• However, most NLG systems use a generation-oriented grammar.

Page 18: Introduction to Natural Language Generation

Why not bi-directional?

• The output of an NLU parser is very different from the input to an NLG realizer.

• It is not obvious that lexicalization is a part of realization.

• In practice, large bi-directional grammars are not easy to engineer.

• Moreover, generation is a process of making choices, including the choice to use 'canned text' when needed.

Page 19: Introduction to Natural Language Generation

Syntactic Realizer

• This work concerns Syntactic Realizers – the grammar

• Input for the grammar: a lexicalized representation of a phrase, at various levels of abstraction.

• Output of the grammar: a grammatical string that represents the information in the input as accurately as possible.

Page 20: Introduction to Natural Language Generation

The input question is:

[Diagram boxes: Knowledge base; Application; Content planner and lexicon; "Input??"; Syntactic Realizer]

Page 21: Introduction to Natural Language Generation

FUF/SURGE - Implementation

• The grammar is written in FUF – Functional Unification Formalism [Elhadad].

– FD (functional description): a list of (att val) pairs, where val = atom | FD | path.

– Grammar: a meta-FD, with disjunction expressed with ALT and control with NONE, GIVEN, ANY.

All components in the generation process can be implemented with this formalism.
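
The core operation behind this formalism is unification of two FDs. The sketch below shows the idea in Python, with FDs as nested dictionaries and ALT as a list of branches tried in order. It is a simplified illustration and not FUF itself: paths, backtracking into partially unified branches, and the NONE, GIVEN and ANY controls are not modeled, and the function names are made up.

```python
def unify(fd1, fd2):
    """Unify two FDs (nested dicts); return the merged FD, or None on a conflict."""
    result = dict(fd1)
    for attribute, value2 in fd2.items():
        if attribute not in result:
            result[attribute] = value2          # new information: just add it
            continue
        value1 = result[attribute]
        if isinstance(value1, dict) and isinstance(value2, dict):
            merged = unify(value1, value2)      # both are FDs: unify recursively
            if merged is None:
                return None
            result[attribute] = merged
        elif value1 != value2:
            return None                         # two different atoms: failure
    return result


def unify_alt(fd, branches):
    """Disjunction in the spirit of ALT: return the first branch that unifies."""
    for branch in branches:
        result = unify(fd, branch)
        if result is not None:
            return result
    return None


# Hypothetical two-branch grammar and input.
grammar = [
    {"cat": "clause", "mood": "declarative"},
    {"cat": "np", "number": "singular"},
]
print(unify_alt({"cat": "np", "lex": "girl"}, grammar))
```

The input selects its own grammar branch: the clause branch fails on the cat feature, so the np branch is used and enriches the input with a default number.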

Page 22: Introduction to Natural Language Generation

Requirements for a syntactic realizer

• Mapping thematic structure onto syntactic roles.

• Control of syntactic paraphrasing and alternations.

• Provision of defaults for syntactic features.

• Propagation of agreement features.

• Selection of closed class words.

• The imposition of linear precedence constraints.

• The inflection of open class words.

Page 23: Introduction to Natural Language Generation

SURGE [Elhadad&Robin 96]

• Draws on Functional Grammar, HPSG and descriptive studies of language.

• Input for the grammar is a lexicalized representation of a phrase (a clause, NP, AP).

• Minimal syntactic information in the input keeps purely syntactic knowledge out of the earlier stages of the process, gives the grammar paraphrasing power, and is also useful for multilingual applications.

Page 24: Introduction to Natural Language Generation

Input for SURGE in general

• Each constituent has the feature cat which determines which part of the grammar it will be unified with.

• The representation of the clause is mostly semantic: a process (in SFL terms) and its participants. Paraphrasing can be done using one feature, like focus.

• The input of an NP uses mostly syntactic features.

• Paraphrases require a different input.

Page 25: Introduction to Natural Language Generation

An Example

((cat clause)
 (tense past)
 (process ((type material) (agentless no) (lex "kiss")))
 (participants ((agent ((cat proper) (lex "John")))
                (affected ((cat common) (lex "girl"))))))

John kissed the girl.

With the added feature (focus {partic affected}):

The girl was kissed by John.
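
To make the effect of the focus feature concrete, here is a toy realizer in Python that accepts the same input as a nested dictionary. It is not SURGE and does not use unification: it handles only past-tense clauses with a regular verb, singular participants and a default definite article, and all function names are assumptions made for this sketch.

```python
def noun_phrase(fd):
    # Proper nouns stand alone; common nouns get a definite article by default.
    return fd["lex"] if fd["cat"] == "proper" else "the " + fd["lex"]


def past_form(verb):
    # Regular verbs only; the past participle has the same form.
    return verb + "d" if verb.endswith("e") else verb + "ed"


def realize(clause):
    """Realize a past-tense clause, in the passive if focus is on the affected."""
    verb = past_form(clause["process"]["lex"])
    agent = noun_phrase(clause["participants"]["agent"])
    affected = noun_phrase(clause["participants"]["affected"])
    if clause.get("focus") == ("partic", "affected"):
        words = f"{affected} was {verb} by {agent}"   # passive paraphrase
    else:
        words = f"{agent} {verb} {affected}"          # default: active voice
    return words[0].upper() + words[1:] + "."


clause = {
    "cat": "clause",
    "tense": "past",
    "process": {"type": "material", "agentless": "no", "lex": "kiss"},
    "participants": {
        "agent": {"cat": "proper", "lex": "John"},
        "affected": {"cat": "common", "lex": "girl"},
    },
}

print(realize(clause))
# prints: John kissed the girl.
print(realize({**clause, "focus": ("partic", "affected")}))
# prints: The girl was kissed by John.
```

The semantic part of the input is identical in both calls; only the single focus feature changes, and the realizer takes care of the word order, the auxiliary and the "by" phrase.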