Towards World-Level Understanding for Conversational Agents
Jackie Chi Kit Cheung, McGill University


Jul 06, 2018



Transcript
Page 1:

Jackie Chi Kit Cheung, McGill University

Towards World-Level Understanding for Conversational Agents

Page 2:

Conversational Agents – Many Possible Uses!

Corollary: probably little data for your particular domain and application!

Page 3:

Robust and Useful Conversational Agents

Robust – Should not require much additional engineering for every new word, entity, domain, genre, or task

Useful – Should understand and correctly process:

• Events, entities, relations

• Beliefs, intentions, desires

• Sentiment, emotions, attitudes

Do our current neural network models satisfy these desiderata?

Page 4:

Case Study: Predicting Dialogue Success

We created a dataset from stackoverflow.com for predicting success in goal-driven human-human dialogues

Noseworthy, Cheung, Pineau, SIGDIAL 2017

User A : I accidentally closed the Stack Trace window in the Visual Studio 2008 debugger. How do I redisplay this window?

User B : While debugging: Debug\Windows\Call stack

User A : Thanks, I don’t know how I overlooked it.

Page 5:

What Do Current Methods Capture?

Long Short-Term Memory Networks (Hochreiter and Schmidhuber, 1997) and variants, trained in a standard supervised setup

• Suggests LSTM model is mostly capturing discourse cues

• Task-specific supervised learning focuses on useful cues for this task only!

• Much harder: understanding whether information need in question was satisfied

Noseworthy, Cheung, Pineau, SIGDIAL 2017

Information given           Success F1   Failure F1
Full conversation thread        89           73
Only the last comment           86           68
Without last comment            83           38
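As a reference point for the table above, per-class F1 is the harmonic mean of precision and recall computed for one class at a time. A minimal sketch in plain Python (not the authors' evaluation code; the labels are made up for illustration):

```python
def f1_per_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of that class's precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = ["success", "success", "failure", "failure"]
y_pred = ["success", "failure", "failure", "failure"]
print(round(f1_per_class(y_true, y_pred, "success"), 2))  # 0.67
```

Computing F1 separately for success and failure, as in the table, exposes the asymmetry: the model without the last comment is far worse at recognizing failures than successes.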

Page 6:

Incorporating World Knowledge

Incorporate additional sources of information outside of task-specific training:

• Wikipedia pages, WordNet, specialized dictionaries, TV show descriptions, product descriptions

World knowledge can make our systems more robust to new entities, events, and tasks

Page 7:

Wikilinks Rare Entity Prediction

Wikilinks Corpus
• Dataset for coreference resolution (Singh et al., 2012)
• Web corpus where spans are annotated with links to Wikipedia pages
• We enhance this with definitions from Freebase/Wikipedia

Long, Bengio, Lowe, Cheung, Precup, EMNLP 2017

Page 8:

Plausibility Cloze Task

• Predict which entity from a document fits into a blank, given entity definitions
• Candidate entities are drawn from the same original document
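To make the cloze setup concrete: each candidate is scored by how well its definition matches the context around the blank. In this toy sketch, plain word overlap stands in for the paper's LSTM encoders, and the entity names and one-line definitions are invented for illustration:

```python
def overlap_score(context, definition):
    """Count words shared between the blank's context and a candidate's definition."""
    return len(set(context.lower().split()) & set(definition.lower().split()))

def predict_entity(context, definitions):
    """Pick the candidate whose definition best matches the context."""
    return max(definitions, key=lambda e: overlap_score(context, definitions[e]))

definitions = {  # hypothetical one-line definitions, not real Wikipedia text
    "EntityA": "a port city on the southern coast",
    "EntityB": "a large inland capital city",
}
context = "the ferry arrived at the ___ port on the coast"
print(predict_entity(context, definitions))  # EntityA
```

The key property, which carries over to the learned models, is that the decision depends on the definition text rather than on how often an entity appeared in training.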

Page 9:

A Double Encoder Model (DoubEnc)

• LSTM networks encode the context and the definition, then combine the information from both
• A further model exploits long-range dependencies between choices with a hierarchical encoder (HierEnc)

[Diagram: a context encoder and a definition encoder feed into a combined representation]
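One way to picture the combination step: each encoder produces a vector, and a candidate is scored by how compatible its definition vector is with the context vector. A minimal sketch with hand-picked vectors standing in for the LSTM states (the paper's actual combination layer may differ):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(context_vec, definition_vec):
    """Compatibility of a candidate's definition encoding with the context encoding."""
    return sigmoid(dot(context_vec, definition_vec))

h_context = [0.5, -0.2, 0.8]   # stand-in for the context encoder's final state
h_def_a   = [0.6,  0.1, 0.7]   # stand-in for candidate A's definition encoding
h_def_b   = [-0.4, 0.9, -0.5]  # stand-in for candidate B's definition encoding
best = max([("A", h_def_a), ("B", h_def_b)],
           key=lambda nv: score(h_context, nv[1]))[0]
print(best)  # A
```

HierEnc extends this picture by letting information flow across the document's successive blanks, rather than scoring each blank in isolation.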

Page 10:

Rare Entity Prediction Results

Random    Randomly predict an entity
ContEnc   Standard LSTM language model
DoubEnc   Double encoder model
HierEnc   Hierarchical encoder model

[Bar chart, accuracy (%): Random 30.1, ContEnc 39.6, DoubEnc 54.0, HierEnc 56.6]

Page 11:

Rare and Unseen Entities

Correct Answer: Larnaca

Dataset frequencies: Istanbul (86); Larnaca (2, 0 in training)
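These frequencies illustrate why purely supervised models struggle here: a frequency-style baseline can only fall back on what it saw in training, so it can never pick an entity with zero training occurrences. A toy sketch (the counts come from the slide; the baseline itself is illustrative, not a model from the paper):

```python
train_counts = {"Istanbul": 86, "Larnaca": 0}  # training-set frequencies from the slide

def frequency_baseline(candidates):
    """Always pick the most frequent training entity: fails on rare/unseen ones."""
    return max(candidates, key=lambda e: train_counts.get(e, 0))

print(frequency_baseline(["Istanbul", "Larnaca"]))  # Istanbul (wrong: answer is Larnaca)
# A definition-based model can still choose Larnaca, because its external
# description is available at test time even though it never appeared in training.
```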

Page 12:

Expectations for the Future

Moving beyond end-to-end training for a single task!

Modularized and reusable components

Current work:
• External information about entities and events
• Common sense reasoning
• Reusable components for language generation (grammaticality, content selection, style transfer)

Page 13:

Work supported by:
Natural Sciences and Engineering Research Council
Fonds de recherche du Québec – Nature et technologies
Samsung

Page 14:

References

• Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9(8):1735–1780.

• Teng Long, Emmanuel Bengio, Ryan Lowe, Jackie C. K. Cheung, and Doina Precup. 2017. World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions. EMNLP.

• Michael Noseworthy, Jackie C. K. Cheung, and Joelle Pineau. 2017. Predicting Success in Goal-Driven Human-Human Dialogues. SIGDIAL.

• Sameer Singh, Amarnag Subramanya, Fernando Pereira, and Andrew McCallum. 2012. Wikilinks: A large-scale cross-document coreference corpus labeled via links to Wikipedia. Technical Report.