NESPOLE! Project Status Carnegie Mellon University Grenoble Meeting November 15, 2001


Page 1: NESPOLE! Project Status Carnegie Mellon University

NESPOLE! Project Status

Carnegie Mellon University

Grenoble Meeting

November 15, 2001

Page 2: NESPOLE! Project Status Carnegie Mellon University

Main Accomplishments: Nov-01

• Improved DACPar Analyzer

• Improved SR engines

• Port to Linux HLT servers

• Significant coverage improvements

• Formal evaluation

• (SPECTRUM Proposal)

Page 3: NESPOLE! Project Status Carnegie Mellon University

The DACPar Analyzer

• Parse an utterance for arguments (SOUP)

• Segment the utterance into sentences

• Extract features from the utterance and the single best parse output

• Use a learned classifier to identify the speech act (TiMBL)

• Use a learned classifier to identify the concept sequence (TiMBL)

• Combine into a full parse
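The steps on this slide can be read as a small pipeline. The sketch below only illustrates that control flow: every function name, the `Segment` and `Analysis` types, the toy stand-ins for the parser and classifiers, and the example labels are hypothetical, and the rendering of the combined result (speech act and concepts joined by "+", followed by the arguments) is an assumption rather than something stated in the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Segment:
    words: List[str]
    arguments: Dict[str, str]  # argument name -> value found by the parser


@dataclass
class Analysis:
    speech_act: str
    concepts: List[str]
    arguments: Dict[str, str]

    def render(self) -> str:
        # Assumed rendering: speech act and concepts joined by "+", then arguments.
        da = "+".join([self.speech_act] + self.concepts)
        args = ", ".join(f"{k}={v}" for k, v in self.arguments.items())
        return f"{da} ({args})" if args else da


def analyze(
    utterance: str,
    parse_arguments: Callable[[str], Any],
    segment: Callable[[str, Any], List[Segment]],
    extract_features: Callable[[Segment], Dict[str, Any]],
    classify_speech_act: Callable[[Dict[str, Any]], str],
    classify_concepts: Callable[[Dict[str, Any]], List[str]],
) -> List[Analysis]:
    parse = parse_arguments(utterance)           # argument parse (SOUP's role)
    analyses = []
    for seg in segment(utterance, parse):        # split the utterance into units
        feats = extract_features(seg)            # features from words + best parse
        speech_act = classify_speech_act(feats)  # learned classifier (TiMBL's role)
        concepts = classify_concepts(feats)      # learned classifier (TiMBL's role)
        analyses.append(Analysis(speech_act, concepts, seg.arguments))  # combine
    return analyses


# Toy stand-ins so the sketch runs end to end; none of this is real system behavior.
def toy_parse(utt: str) -> Dict[str, str]:
    return {"room-type": "double"} if "double" in utt else {}


def toy_segment(utt: str, parse: Dict[str, str]) -> List[Segment]:
    return [Segment(utt.split(), parse)]


def toy_features(seg: Segment) -> Dict[str, Any]:
    return {"first_word": seg.words[0], "n_args": len(seg.arguments)}


def toy_speech_act(feats: Dict[str, Any]) -> str:
    return "request-information" if feats["first_word"] == "do" else "give-information"


def toy_concepts(feats: Dict[str, Any]) -> List[str]:
    return ["availability", "room"] if feats["n_args"] else []


result = analyze("do you have a double room", toy_parse, toy_segment,
                 toy_features, toy_speech_act, toy_concepts)
print([a.render() for a in result])
```

Passing the parser and classifiers in as callables keeps the sketch independent of any particular SOUP or TiMBL interface, which the slides do not describe.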

Page 4: NESPOLE! Project Status Carnegie Mellon University

Improved DACPar Analyzer

• Improved segmentation of utterances into SDUs

• Using IF well-formedness constraints to improve overall DA classification

• Coverage and training set improvements

Page 5: NESPOLE! Project Status Carnegie Mellon University

DACPar - Improved Segmentation

• Segmenting single turns into DA units (SDUs) - two problems:
– under-segmentation: detecting SDU boundaries between parsed arguments
– over-segmentation: due to CrossDomain grammar - single SDUs that are incorrectly split

• New segment boundary detector implemented based on argument statistical model

• CrossDomain grammar tuned to prevent over-segmentation

Page 6: NESPOLE! Project Status Carnegie Mellon University

DACPar - Using IF Constraints

• Two goals:
– Ensure that resulting DA analysis is a legal IF
– Improve classification outcome using the well-formedness constraints from IF spec

• Classifier produces ranked list of DAs

• Select highest ranking DA that licenses the greatest number of arguments (ideally all)
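The selection rule in the last bullet can be sketched as follows. This is a minimal illustration, not the project's code: the `licensed` table standing in for the IF specification, the DA labels, and the argument names are all hypothetical. Ties are broken in favor of the higher-ranked DA, which is one reading of "highest ranking".

```python
from typing import Dict, List, Set


def select_da(ranked_das: List[str], arguments: Set[str],
              licensed: Dict[str, Set[str]]) -> str:
    """Pick the DA that licenses the most parsed arguments, preferring higher rank."""
    if not ranked_das:
        raise ValueError("ranked_das must not be empty")
    best, best_count = ranked_das[0], -1
    for da in ranked_das:  # candidates are in best-first classifier order
        count = len(arguments & licensed.get(da, set()))
        if count > best_count:  # strict '>' keeps the higher-ranked DA on ties
            best, best_count = da, count
        if count == len(arguments):
            break  # all arguments licensed; no lower-ranked DA can do better
    return best


# Hypothetical stand-in for the IF specification: DA -> arguments it licenses.
licensed = {
    "give-information+price": {"price"},
    "give-information+price+room": {"price", "room-type"},
    "request-information+room": {"room-type"},
}
ranked = ["give-information+price", "give-information+price+room",
          "request-information+room"]
choice = select_da(ranked, {"price", "room-type"}, licensed)
print(choice)
```

A DA missing from the table licenses nothing in this sketch, so it is only chosen when no candidate licenses any argument.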

Page 7: NESPOLE! Project Status Carnegie Mellon University

DACPar - Initial Results

• SA classification accuracy ~65%

• DA classification accuracy ~45%

• Eng-to-Eng translation (from trans) 58% (43%)

• Eng-to-Eng translation (from hypo) 45% (32%)

Page 8: NESPOLE! Project Status Carnegie Mellon University

Improved SR Engine

Page 9: NESPOLE! Project Status Carnegie Mellon University

Showcase-1 Formal Evaluation

• Data used for evaluation

• Evaluation scheme: end-to-end, mono- and cross-lingual, SDU-based, human-grading

• Compiling of results

• Initial available results

• Lessons learned...

Page 10: NESPOLE! Project Status Carnegie Mellon University

Evaluation - Data Used

• Goals:
– unseen data not used for system development
– both scenario-a and scenario-c, some MM data
– original mono-lingual data and cross-lingual data collected when using the system

• Mixture intended primarily for comprehensiveness, not for comparison of different conditions (stat significance)

Page 11: NESPOLE! Project Status Carnegie Mellon University

Evaluation Methodology

• Evaluation scheme: end-to-end, mono- and cross-lingual, SDU-based, human-grading

• Evaluate translation from transcriptions and from SR output, also SR WERs (SR as a paraphrase)

• Multiple human graders - should NOT be system developers

• One grader segments each turn into SDUs, graders then assign grades for each identified SDU

• Cross-lingual eval:
– client SDUs from E/G/F --> Italian
– agent SDUs from Italian --> E/G/F

• Donna’s grading program
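The methodology mentions SR word error rates. As a reminder of the standard definition, WER is the word-level edit distance (substitutions, insertions, deletions) between reference and hypothesis divided by the reference length. The sketch below is a textbook implementation with made-up example sentences, not the scoring tool used in the project; word accuracy is commonly reported as 1 - WER, but the slides do not say how their accuracy figures were computed.

```python
from typing import List


def word_errors(reference: List[str], hypothesis: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hypothesis) + 1))  # distance from an empty reference
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # distance to an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_errors(ref, hypothesis.split()) / len(ref)


# One substitution (double -> single) and one deletion (please) over 4 words.
print(wer("a double room please", "a single room"))
```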

Page 12: NESPOLE! Project Status Carnegie Mellon University

Compiling of Results

• Each site should compile its own results!

• Calculate separate results for:
– each dialogue, each grader, client/agent SDUs

• Average/combine results for:
– all graders, client+agent, all dialogues combined
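The slides do not state how per-dialogue results are combined. One reading that is consistent with the English figures reported later in this deck is an SDU-weighted average, so that larger dialogues count for more. The sketch below assumes that reading; the function name is hypothetical, and the inputs are the English SDU counts and grader G1's per-dialogue HYPO scores from the evaluation slides. Because those scores are already rounded, the check is only approximate.

```python
from typing import Dict


def combine(scores: Dict[str, float], sdu_counts: Dict[str, int]) -> float:
    """SDU-weighted average of per-dialogue percentage scores."""
    total = sum(sdu_counts[d] for d in scores)
    if total == 0:
        raise ValueError("no SDUs to average over")
    return sum(scores[d] * sdu_counts[d] for d in scores) / total


# English evaluation data: SDUs per dialogue, and G1's HYPO score per dialogue.
sdu_counts = {"a1": 46, "a2": 123, "amm": 54, "cmm": 109}
g1_hypo = {"a1": 76, "a2": 55, "amm": 91, "cmm": 71}

combined = combine(g1_hypo, sdu_counts)
print(round(combined))  # compare with the ALL row reported for G1 (69)
```

An unweighted mean of the same four scores gives about 73, so the two conventions can differ noticeably when dialogue sizes are uneven.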

Page 13: NESPOLE! Project Status Carnegie Mellon University

Results: SR Performance

German SR Accuracy

Speaker   % Accuracy
------------------------
g006      42.69
g034      66.32
g047      78.67
g051      69.43
Average   63.52

English SR Accuracy

Speaker   % Accuracy
------------------------
e025ap    68.6
e039ap    39.5
e011yp    83.1
e827cy    71.0
Average   61.9

Page 14: NESPOLE! Project Status Carnegie Mellon University

English Evaluation

English Eval Data

a1 = e025ap ( 46 SDUs) ( 27 utts)

a2 = e039ap (123 SDUs) ( 37 utts)

amm = e011yp ( 54 SDUs) ( 39 utts)

cmm = e827cy (109 SDUs) ( 48 utts)

ALL = total (332 SDUs) (151 utts)

Page 15: NESPOLE! Project Status Carnegie Mellon University

English-to-English

HYPO   G1      G2      G3      ALL    | WA
-------------------------------------------
a1     76(65)  74(61)  65(52)  72(59) | 68
a2     55(39)  43(32)  50(35)  50(35) | 39
amm    91(89)  93(85)  91(78)  91(84) | 84
cmm    71(63)  65(59)  69(56)  68(59) | 70
ALL    69(59)  63(54)  65(51)  66(56) | 61
-------------------------------------------

Page 16: NESPOLE! Project Status Carnegie Mellon University

English-to-English

SLT-TCT  G1      G2      G3      ALL
----------------------------------------
a1       74(70)  76(54)  67(41)  72(55)
a2       62(46)  45(40)  46(32)  51(39)
amm      74(57)  67(54)  61(48)  67(53)
cmm      65(49)  40(31)  51(31)  52(37)
ALL      67(52)  51(41)  53(35)  58(43)
----------------------------------------

SLT-REC  G1      G2      G3      ALL
----------------------------------------
a1       58(50)  52(33)  43(24)  51(36)
a2       41(27)  29(23)  33(21)  34(23)
amm      69(57)  70(63)  70(41)  70(54)
cmm      50(39)  32(26)  41(21)  41(29)
ALL      51(39)  40(32)  43(25)  45(32)
----------------------------------------

Page 17: NESPOLE! Project Status Carnegie Mellon University

Results: English-to-English

English-to-English

       a1      a2      amm     cmm     ALL
----------------------------------------------
TCT    72(55)  51(39)  67(53)  52(37)  58(43)
REC    51(36)  34(23)  70(54)  41(29)  45(32)
HYPO   72(59)  50(35)  91(84)  68(59)  66(56)
----------------------------------------------

Page 18: NESPOLE! Project Status Carnegie Mellon University

Results: English-to-Italian

English-to-Italian

       a1      a2      amm     cmm     ALL
-----------------------------------------------
TCT    77(52)  48(36)  67(45)  59(31)  55(38)
REC    57(39)  29(19)  69(44)  39(24)  43(27)
-----------------------------------------------

English-to-English

       a1      a2      amm     cmm     ALL
----------------------------------------------
TCT    72(55)  51(39)  67(53)  52(37)  58(43)
REC    51(36)  34(23)  70(54)  41(29)  45(32)
HYPO   72(59)  50(35)  91(84)  68(59)  66(56)
----------------------------------------------

Page 19: NESPOLE! Project Status Carnegie Mellon University

German Evaluation

Graders:
G1: Benjamin
G2: Tanja
G3: Stephan

Dialogs:
a1:  g047ak ( 46 SDUs / 23 utts.)
a2:  g051ak (174 SDUs / 59 utts.)
amm: g006yk (108 SDUs / 70 utts.)
c1:  g034ck (314 SDUs / 98 utts.)
All: 644 SDUs / 350 utts.

Page 20: NESPOLE! Project Status Carnegie Mellon University

German-to-German

      HYPO     SLT-TCT  SLT-REC
G1    57 (50)  28 (23)  26 (23)
G2    59 (50)  24 (6)   21 (5)
G3    64 (48)  39 (7)   32 (5)
All   58 (48)  31 (15)  25 (12)

      HYPO     SLT-TCT  SLT-REC
a1    81 (74)  55 (21)  52 (20)
a2    71 (59)  35 (14)  34 (14)
amm   38 (25)  34 (18)  22 (11)
c1    58 (49)  23 (8)   19 (8)
All   58 (48)  31 (15)  25 (12)

          G1       G2       G3       All
HYPO      57 (50)  59 (50)  64 (48)  58 (48)
SLT-TCT   28 (23)  24 (6)   39 (7)   31 (15)
SLT-REC   26 (23)  21 (5)   32 (5)   25 (12)

Page 21: NESPOLE! Project Status Carnegie Mellon University

German-to-Italian

      SLT-TCT  SLT-REC
G1    31 (7)   26 (4)
G2    38 (9)   32 (6)
G3    30 (24)  26 (22)
All   32 (13)  27 (11)

      SLT-TCT  SLT-REC
a1    55 (21)  56 (22)
a2    39 (13)  34 (12)
amm   36 (18)  31 (15)
c1    25 (10)  19 (8)
All   32 (13)  27 (11)

          G1       G2       G3       All
SLT-TCT   31 (7)   38 (9)   30 (24)  32 (13)
SLT-REC   26 (4)   32 (6)   26 (22)  27 (11)

Page 22: NESPOLE! Project Status Carnegie Mellon University

Lessons Learned/Issues

• Variance between graders - what to do?

• Segmentation variations - what to do?

• Grading with two binary decisions

• New data for next evaluation + save copy of current system version

• Release current eval data for system dev?

• Component Evaluation

Page 23: NESPOLE! Project Status Carnegie Mellon University

Showcase-2 Open Issues

• Definitions of the domains and scenarios for showcase-2a and showcase-2b

• Data Collection

• New functionalities:
– for the users (client/agent)
– for the system developers & for demonstration

• Architecture modifications

Page 24: NESPOLE! Project Status Carnegie Mellon University

Demo at IST Issues

• Details about the demo

• Demo “wrapper” around the system:
– client initiates call from a web page
– dealing with the push-to-talk issue
– other functionalities?

• Schedule for tests before demo