a “k-hypotheses + other” belief updating model Dan Bohus Alex Rudnicky Computer Science Department Carnegie Mellon University Pittsburgh, PA 15213 acknowledgements Tim Paek Eric Horvitz Microsoft Research

Dec 21, 2015


a “k-hypotheses + other” belief updating model

Dan Bohus, Alex Rudnicky

Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213

acknowledgements: Tim Paek, Eric Horvitz (Microsoft Research)


2/25

motivation

spoken language interfaces are still very brittle

[Parade, Sunday, March 26]


3/25

misunderstandings

U: what the user said [RECOGNIZED] / confidence score

S: What city are you leaving from?

U: Chicago [CHICAGO] / 0.72

S: traveling from Chicago. Where would you like to go?

U: Huntsville [SEOUL] / 0.65

S: traveling to Seoul… What day did you need to travel?

U: no no I’m traveling to Birmingham [THE TRAVELING to berlin P_M] / 0.35

S: traveling in the afternoon… okay, what day would you be departing Chicago?

U: the tenth of August [AT THE TENTH OF AUGUST]

S: a flight tomorrow… I have a Korean Airlines flight departing Chicago at 1:40pm, arrives Seoul at 5pm the next day. Do you want to take that?

U: my destination is Birmingham [flight destination mr WEEKEND]

S: traveling on Saturday, August 12th … I have a flight departing Chicago at 1:40pm arrives Seoul at ……

(the remaining two confidence scores on the slide, 0.58 and 0.28, belong to the last two user turns)

arrival = {Seoul / 0.65}


4/25

[Diagram: the dialog from the previous slide, annotated with the belief update f applied after each user turn: arrival = {Seoul / 0.65} → f → arrival = ?, with one belief state per turn for the arrival and departure concepts (arrival = { … }, departure = { … })]


5/25

belief updating: problem statement

S: traveling to Seoul… What day did you need to travel?

U: [THE TRAVELING to berlin P_M]

arrival = {Seoul / 0.65} → f → arrival = ?

given:
  an initial belief B_initial(C) over concept C
  a system action SA(C)
  a user response R

construct an updated belief B_updated(C) ← f(B_initial(C), SA(C), R)


6/25

outline

introduction

current solutions

approach

experimental results

effects on global performance

conclusion and future work

intro : current solutions : approach : experimental results : global performance : conclusion


7/25

current solutions

S: traveling from Chicago. Where would you like to go?

U: [SEOUL] / 0.65

S: traveling to Seoul… what day did you need to travel?

U: [THE TRAVELING to berlin P_M] / 0.35

arrival = {Seoul / 0.65} → f → arrival = ?

confidence scores / detecting misunderstandings [Cox, Chase, Bansal, Hazen, Ravishankar, Walker, San-Segundo, Bohus]

detecting corrections [Litman, Swerts, Hirschberg, Krahmer, Levow]

track single values; use simple heuristic belief updating rules:
  explicit confirmations: yes / no
  implicit confirmations: new values overwrite old values
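The heuristic rules on this slide can be sketched roughly as follows. This is only an illustration: the slide does not spell out the exact rules, so the function name, the action labels, and the choices to set confidence to 1.0 on "yes" and to clear the value on "no" are all assumptions.

```python
from typing import Optional, Tuple

# A single tracked value: (value, confidence). None means "no value yet".
Belief = Optional[Tuple[str, float]]


def heuristic_update(belief: Belief,
                     system_action: str,
                     answer: Optional[str] = None,
                     new_value: Optional[Tuple[str, float]] = None) -> Belief:
    """Single-value heuristic update (hypothetical sketch, not the authors' code)."""
    if system_action == "explicit_confirm" and belief is not None:
        if answer == "yes":
            return (belief[0], 1.0)  # assumption: "yes" fully confirms the value
        if answer == "no":
            return None              # assumption: "no" discards the value
    if new_value is not None:
        return new_value             # new values overwrite old values
    return belief


# Hypothetical example: a low-confidence new value replaces a better one.
print(heuristic_update(("Seoul", 0.65), "implicit_confirm",
                       new_value=("Berlin", 0.35)))
```

The last call shows the weakness such rules share: only one value is tracked, so whatever was heard last replaces the previous value regardless of how the two confidence scores compare.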




9/25

belief updating: problem statement

S: traveling to Seoul… What day did you need to travel?

U: [THE TRAVELING to berlin P_M] / 0.35

arrival = {Seoul / 0.65} → f → arrival = ?

given:
  an initial belief B_initial(C) over concept C
  a system action SA(C)
  a user response R

construct an updated belief B_updated(C) ← f(B_initial(C), SA(C), R)


10/25

belief representation

B_updated(C) ← f(B_initial(C), SA(C), R)

probability distribution over the set of possible values

[Figure: distribution over all possible values of the departure concept: ABERDEEN, TX; ABILENE, TX; ALBANY, NY; ALBUQUERQUE, NM; ALLENTOWN, PA; ALEXANDRIA, LA; ALLAKAKET, AK; ALLIANCE, NE; ALPENA, MI; ALPINE, TX; … YUMA, AZ]

however, the system “hears” only a small number of conflicting values for a concept throughout a session: max = 3 conflicting values heard


11/25

belief representation

B_updated(C) ← f(B_initial(C), SA(C), R)

compressed belief representation: k hypotheses + other

dynamically add and drop hypotheses: remember m hypotheses, add n new ones (m+n=k)

departure_city [k=3, m=2, n=1]

{Austin, Boston, Houston, other}

S: Did you say you were flying from Austin?
U: [NO ASPEN]

{Boston, Austin, other} + new hypothesis Aspen

S: flying from Aspen… what is your destination?
U: [NO NO I DIDN’T THAT THAT]

{Boston, Aspen, other} + no new hypothesis (Ø)

B…(C) is a multinomial variable of degree k+1
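A minimal sketch of the compression step, assuming beliefs are stored as a dictionary from value to probability. The function name, the probabilities in the example, and the choice to admit new hypotheses with zero mass (leaving it to the learned model to redistribute probability) are illustrative assumptions, not details taken from the slides.

```python
from typing import Dict, List

OTHER = "other"


def compress_belief(belief: Dict[str, float],
                    new_values: List[str],
                    m: int,
                    n: int) -> Dict[str, float]:
    """Keep the m most probable hypotheses, admit up to n new ones,
    and fold all remaining probability mass into the 'other' bucket."""
    named = {v: p for v, p in belief.items() if v != OTHER}
    kept = sorted(named, key=named.get, reverse=True)[:m]
    compressed = {v: named[v] for v in kept}
    # New hypotheses enter with zero mass (assumption); duplicates are ignored.
    added = [v for v in dict.fromkeys(new_values) if v not in compressed][:n]
    for v in added:
        compressed[v] = 0.0
    # Whatever is not assigned to a named hypothesis belongs to 'other'.
    compressed[OTHER] = 1.0 - sum(compressed.values())
    return compressed


# Hypothetical probabilities for the departure_city example, k=3 (m=2, n=1).
before = {"Austin": 0.5, "Boston": 0.3, "Houston": 0.1, OTHER: 0.1}
print(compress_belief(before, ["Aspen"], m=2, n=1))
```

With these made-up numbers, Houston is dropped, Aspen is admitted, and the mass Houston held moves into "other", so the result still has k+1 = 4 entries that sum to one.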


12/25

system action

B_updated(C) ← f(B_initial(C), SA(C), R)

request
S: When would you like to take this flight?
U: Friday [FRIDAY] / 0.65

explicit confirmation
S: Did you say you wanted to fly this Friday?
U: Yes [GUEST] / 0.30

implicit confirmation
S: A flight for Friday … at what time?
U: At ten a.m. [AT TEN A_M] / 0.86

no action / unexpected update
S: okay. I will complete the reservation. Please tell me your name or say ‘guest user’ if you are not a registered user.
U: guest user [THIS TUESDAY] / 0.55


13/25

user response

B_updated(C) ← f(B_initial(C), SA(C), R)

acoustic / prosodic: acoustic and language scores, duration, pitch information, voiced-to-unvoiced ratio, speech rate, initial pause

lexical: number of words, presence of words highly correlated with corrections or acknowledgements

grammatical: number of slots (new and repeated), goodness-of-parse scores

dialog: dialog state, turn number, expectation match, timeout, barge-in, concept identity

priors: priors for concept values

confusability: how confusable concept values are


14/25

approach

multinomial regression problem

multinomial generalized linear model: sample efficient

stepwise approach: feature selection

one separate model for each system action:

B_updated(C) ← f(B_initial(C), SA(C), R)

becomes

B_updated(C) ← f_SA(C)(B_initial(C), R)
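One way to picture "one separate model for each system action" is a table of multinomial logit models keyed by action. The sketch below uses the common reference-class parameterization (one outcome has a fixed logit of 0); whether the authors' models are parameterized this way is an assumption, and the outcome names, feature names, and weights are all made up.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights: Dict[str, Dict[str, float]],
            features: Dict[str, float],
            reference: str) -> Dict[str, float]:
    """weights maps each non-reference outcome to per-feature coefficients;
    the reference outcome has a fixed logit of 0."""
    outcomes = [reference] + list(weights)
    logits = [0.0] + [
        sum(coef * features.get(name, 0.0) for name, coef in w.items())
        for w in weights.values()
    ]
    return dict(zip(outcomes, softmax(logits)))


# One model per system action; every number here is hypothetical.
MODELS = {
    "explicit_confirm": {
        "new_1": {"bias": -1.0, "answer_no": 2.0},
        "other": {"bias": -2.0, "answer_no": 1.0},
    },
}

# A "no" answer to an explicit confirmation, as a feature vector.
updated = predict(MODELS["explicit_confirm"],
                  {"bias": 1.0, "answer_no": 1.0},
                  reference="initial_top")
print(updated)
```

Selecting the model by system action first and then feeding it only the initial belief and response features mirrors the f_SA(C) form on the slide; in practice the weights would be fitted from annotated dialog data rather than written by hand.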




16/25

data

RoomLine: conference room reservations; explicit and implicit confirmations

user study: 46 participants, 10 scenario-based interactions each

corpus: 449 sessions, 8848 user turns

transcribed & annotated: misunderstandings, corrections, correct concept values


17/25

model performance

Model (M) [k=2, all features]

initial baseline (i) [error before update]

heuristic baseline (h) [error after heuristic update]

correction baseline (c) [error if we had perfect correction detection]

[Figure: four bar charts of error rate (%), one per system action; values listed in the order of the bar labels printed on the slide]

explicit confirm (i, h, M, c): 30.8, 16.1, 5.0, 6.2

implicit confirm (i, h, M, c): 30.3, 26.0, 15.0, 21.5

request (i, h, M): 98.2, 9.5, 5.7

no action (i, h, M): 79.7, 44.8, 14.8




19/25

a new user study …

implemented models in the system

2nd, between-subjects experiment:
  control: using heuristic update rules
  treatment: using belief updating models

40 participants, non-native users: improvements more likely at high word-error-rates


20/25

effect on task success

logistic ANOVA on task success

logit(TaskSuccess) ← 2.09 - 0.05∙WER + 0.69∙Condition

p=0.009

[Figure: probability of task success (0% to 100%) versus word error rate (0% to 100%), treatment and control curves]

at 30% word error rate: 78% (treatment) vs. 64% (control)

control reaches 78% only at 16% word error rate
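The fitted model can be evaluated directly to recover the percentages quoted on the slide. The sketch assumes WER is expressed in percentage points and Condition is 1 for treatment and 0 for control; the slide does not state this coding, but it is the one that reproduces the quoted figures. The function name is hypothetical.

```python
import math


def task_success_probability(wer: float, treatment: bool) -> float:
    """Probability of task success from the logit model on the slide.

    wer is in percentage points (assumption); treatment=True means the
    belief updating models were used, False means heuristic rules.
    """
    logit = 2.09 - 0.05 * wer + 0.69 * (1.0 if treatment else 0.0)
    return 1.0 / (1.0 + math.exp(-logit))


for wer, treatment in [(30, True), (30, False), (16, False)]:
    label = "treatment" if treatment else "control"
    print(f"WER {wer}%, {label}: {task_success_probability(wer, treatment):.2f}")
```

Under these assumptions the treatment condition at 30% WER is predicted to do about as well as the control condition at 16% WER, which is one way to read the gap between the two curves.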


21/25

how about efficiency?

ANOVA on task duration for successful tasks

Duration ← -0.21 + 0.013∙WER - 0.106∙Condition

significant improvement, equivalent to a 7.9% absolute reduction in word-error rate

p=0.0003
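The "equivalent reduction in word-error rate" can be read as the ratio of the Condition coefficient to the WER coefficient: the WER change that would shift Duration by the same amount as switching from control to treatment. The sketch below is an interpretation rather than the authors' stated computation, and the function name is hypothetical.

```python
def equivalent_wer_reduction(wer_coef: float, condition_coef: float) -> float:
    """WER change (in points) whose effect on Duration matches the
    effect of switching condition, under a linear model."""
    return abs(condition_coef) / wer_coef


# Coefficients as printed on the slide (rounded).
print(round(equivalent_wer_reduction(0.013, -0.106), 1))
```

With the rounded coefficients shown on the slide the ratio comes out near 8.2 points, close to the 7.9% quoted; the small gap is presumably due to rounding of the printed coefficients.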




23/25

summary

U: [CHICAGO] / 0.72

S: traveling from Chicago. Where would you like to go?

U: [SEOUL] / 0.65

S: traveling to Seoul… What day did you need to travel?

U: [THE TRAVELING to berlin P_M] / 0.35

S: traveling in the afternoon. Okay, what day would you be departing Chicago?

arrival = {Seoul / 0.65} → f → arrival = ?

[Diagram: beliefs over the arrival and departure concepts (arrival = { … }, departure = { … }) updated after each turn]

approach for constructing accurate beliefs: integrate information across multiple turns

significant gains in task success and efficiency


24/25

other advantages

learns from data: tuned to the domain in which it operates

sample efficient / scalable: local one-turn optimization, concepts are independent

RoomLine operates with 29 concepts; cardinality: 2 to several hundred

portable: decoupled from dialog task specification; no assumptions about dialog management


25/25

future work

integrate information from n-best list

integrate other high-level knowledge

domain-specific constraints

inter-concept dependencies

investigate technique in other domains


26/25

thank you! questions …


27/25

improvements at different WER

[Figure: absolute improvement in task success (y-axis, 0.02 to 0.18) versus word-error-rate (x-axis, 0 to 100)]


28/25

user study

10 scenarios, fixed order; presented graphically (explained during briefing)

participants compensated per task success


29/25

informative features

priors and confusability

initial confidence scores

concept identity

barge-in

expectation match

repeated grammar slots


30/25

Models (k=2, runtime features)

# The model for the explicit confirm action
LR_MODEL(EC)                   new_1    other
k                         =   -15.96     3.61
answer_type[YES]          =   -12.67    -5.90
answer_type[NO]           =     4.55     3.15
answer_type[OTHER]        =     1.20    -0.75
concept_id(equip)         =     6.96     4.42
i_th_confusability        =    -3.67    -4.80
ih_diff_lexical_one_word  =   -15.99    -1.17
lexw1[SMALL]              =    17.63    20.26
response_new_hyps_in_selh =    18.85     0.41
END


31/25

Models (k=2, runtime features)

# The model for the implicit confirm action
LR_MODEL(IC)                   new_1    other
mark_confirm              =     0.31    -1.74
mark_disconfirm           =     3.39     1.57
i_th_conf                 =     0.39    -3.63
i_th_confusability        =    -4.17    -4.54
k                         =   -16.83     3.75
lex[THREE]                =    -2.25    -2.68
response_new_hyps_in_selh =    20.88     1.70
turn_number               =     0.01     0.03
END


32/25

Models (k=2, runtime features)

# The model for the request action
LR_MODEL(REQ)                            new_1    other
k                                   =    -0.78     3.56
barge_in                            =    -2.07    -1.40
concept_id(date)                    =    11.29     9.80
concept_id(user_name)               =     1.93   -13.91
dialog_state[RequestSpecificTimes]  =    13.29    14.26
ih_diff_lexical                     =    -1.54     0.17
initial_num_hyps_>_0                =   -21.70    -2.71
total_num_parses                    =    -1.06    -0.40
ur_selh_new_1_conf                  =     4.09     1.76
ur_selh_new_1_confusability         =     5.81     1.70
ur_selh_new_1_prior                 =     0.67     0.98
ur_selh_new_1_prior_>_1             =    -1.00    -6.38
END