Human Factors in Voice Interface Design
Jeff Dworkin
Segment Marketing Manager
jeff.dworkin@dialogic.com
Dec 16, 2015
www.dialogic.com
Company Confidential • © Copyright 2008 Dialogic Corporation. All rights reserved.
Slide 2
Dialogic Evolution
“VIDEO IS THE NEW VOICE”™
2006 2007 2008
Extends Mobile VAS Segment Leadership
Video Algorithmic and Analytics Leadership
Extends Technology Enabling MSS Leadership
– Fax Segment MSS Leadership
Converged Communications Technology Enabling Market Segment Share Leadership
– Dialogic “pioneer” history, relationships and patent portfolio
– Enterprise Gateway
Established SS7 / Signaling Part of Business
Established HMP as core to Dialogic customer value proposition
Deeper Service Provider Segment Products / Customers
– Service Provider gateway and IP media server
Extends Technology Enabling MSS Leadership
TDM to IP Transition Leadership
Extend into Web communication and open source ISV innovators
Enabling Video IP Streaming Value Added Services
Mission: To Enable Secure Multimedia Communications Through Any Network To And From Any Endpoint In The World
Slide 3
What is Human Factors?
Ergonomics – an applied science concerned with designing and arranging the things people use so that people and things interact most efficiently and safely.
Ergonomics is the physical part.
Human Factors encompasses the physical as well as the mental and emotional.
The Man/Machine Interface
Slide 4
PEOPLE JUST DON’T LISTEN!
Persistence, Memory and Time
Slide 5
Persistence, Memory and Time
Telephony Interfaces vs. Visual Interfaces
Slide 6
Persistence
In a visual display, data remains on the display until replaced by new data.
This allows users to:
– Return to a task after interruption
– Review, by scanning back and forth, among several possible menu choices
– Eliminate or minimize the effects of time by scrolling freely between the past and the present
– Maintain context, even when confronted with multiple tasks
Slide 7
Memory
The serial presentation of auditory information places heavy demands on working memory
– More impactful on novice users
– More impactful on older users
Slide 8
Time
Time is the enemy of the spoken user interface
- Bruce Balentine/David P. Morgan, How To Build a Speech Recognition Application
– Defeating this enemy requires repeating critical information until it “sticks”
– Yet it takes time to say things
– “Hold on, I’m writing this down”
– Cultural/Social issues can cause communication breakdown
  • Issues of Prosody/Timing
  • What’s your phone number?
– Is it 973-555-1212 or is it 9735-5-1212?
Slide 9
Machine Output
Prompts
Slide 10
Machine Spoken Output
Prompts – indicate it is time for user input.
Feedback – presents the application state that results from user input, allowing the user to compare original intent with final results.
Instructions – give information to the user about operating the user interface or understanding the task.
Help – offer context sensitive corrective action. Often adopts a separate mode or state aimed at coaching.
Application Data – the content or information that the user seeks or intends to modify.
Slide 11
Silence, the Silent Killer
People will wait without feedback for six to eight seconds.
– Anything longer than that and callers will think something is wrong.
– Causes frustration.
– Causes people to hang up.
If a processing delay or a wait in queue lasts more than six seconds, give the caller feedback.
– Music, Information, Advertising.
– If using tones, explain the tone, or callers may think the tone is an indication that they have been disconnected.
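The six-second rule above can be sketched as a simple schedule. This is a minimal illustration, not a platform API: the function name `feedback_times` is hypothetical, and the repeat `interval` is an assumption, since the slide only gives the six-second threshold.

```python
def feedback_times(wait_seconds, threshold=6.0, interval=6.0):
    """Return offsets (in seconds) at which to start caller feedback.

    threshold: silence tolerated before the first feedback (from the slide).
    interval:  assumed gap between later announcements (not from the slide).
    """
    if threshold <= 0 or interval <= 0:
        raise ValueError("threshold and interval must be positive")
    times = []
    t = threshold
    # Feedback is only needed while the caller is still waiting.
    while t < wait_seconds:
        times.append(t)
        t += interval
    return times
```

A wait of five seconds needs no feedback, while a 20-second wait would trigger feedback at 6, 12 and 18 seconds under these assumptions. Continuous music would need only the first offset.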
Slide 12
Prompts
Slide 13
Action-Goal vs Goal-Action
Action-Goal
– Press one for sales…
Goal-Action
– For sales, press one…
Goal-Action reflects the way people think; using Action-Goal can cause confusion.
What you are saying:
– Press One for Sales…Press Two for Marketing…Press Three for Support.
What is heard:
– Press One (not heard because the user is not paying attention yet) for Sales, press two…for Marketing…press three, for support…???
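Building menu prompts from data makes it easy to keep every option in Goal-Action order. A minimal sketch; the function name `goal_action_menu` and the (goal, key) pair format are assumptions made for illustration.

```python
def goal_action_menu(options):
    """Build a menu prompt with the goal stated before the action.

    options: iterable of (goal, key) pairs, e.g. ("sales", "one").
    """
    # The goal comes first so the caller hears what the option is for
    # before hearing which key to press.
    return " ".join(f"For {goal}, press {key}." for goal, key in options)
```

For example, `goal_action_menu([("sales", "one"), ("support", "two")])` produces "For sales, press one. For support, press two."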
Slide 14
Please, Now and Thank-You
“Social Graces” just add to the length of the communication
– For sales, please press one now…
– For sales, press one…
Many phone-based interfaces are tedious because they unnecessarily put the word “please” in front of every action statement on a menu (e.g., “For more information, please press 4.”) (Schumacher)
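A prompt script can be checked for these fillers before recording. The following is a deliberately naive sketch: the function name `strip_social_graces` is hypothetical, and removing every "please" and "now" is an assumption that will be wrong for prompts where "now" carries meaning, so the output should be reviewed by a person.

```python
import re

# Naive filler list; only whole words are matched.
_FILLERS = re.compile(r"\b(?:please|now)\b\s*", re.IGNORECASE)

def strip_social_graces(prompt):
    """Remove 'please' and 'now' from a prompt and tidy the spacing."""
    text = _FILLERS.sub("", prompt)
    # Close up any space left in front of punctuation.
    text = re.sub(r"\s+([.,…?!])", r"\1", text).strip()
    # Restore the capital if the first word was removed.
    return text[:1].upper() + text[1:]
```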
Slide 15
Anthropomorphism
The attribution of human characteristics to non-human beings
This is not the same as the system having a “personality”
Experts disagree on the use of anthropomorphism
In my opinion: Avoid anthropomorphism
– The more “like” a person people believe the system to be, the more they want to communicate with it as if it were a person. But it is not a person; it is a machine.
If you must personify, let the personality be a narrator or guide, not the machine
Slide 16
Compression
The speed or tempo at which recordings are played
– Should be between 135 words/min and 170 words/min
– Software can be used to compress (or speed up) playback while maintaining the pitch of the voice
– Faster may seem better, but it can cause errors due to retention issues and response mistakes, especially in older adults (Sharit, 2003)
Faster tempo can cause Perceived Enunciation Errors or Mondegreens
– There’s a bathroom on the right
  • There’s a bad moon on the rise
– Mairzy doats and dozy doats and liddle lamzy divey
  • Mares eat oats and does eat oats and little lambs eat ivy
– For information and directions, press 5…
  • ???
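The 135 to 170 words/min range above can be checked for any recorded prompt from its script and its playback duration. A minimal sketch; the function names are hypothetical, and counting whitespace-separated words is a simplifying assumption (digits and abbreviations may be spoken as more than one word).

```python
def words_per_minute(prompt_text, duration_seconds):
    """Estimate speaking rate from the script and the recording length."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return len(prompt_text.split()) / duration_seconds * 60.0

def tempo_in_range(prompt_text, duration_seconds, low=135.0, high=170.0):
    """True if the estimated rate falls inside the recommended range."""
    return low <= words_per_minute(prompt_text, duration_seconds) <= high
```

A five-word prompt that plays in two seconds works out to 150 words/min, inside the range; the same prompt compressed to 1.5 seconds is about 200 words/min and would be flagged.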
Slide 17
Short Prompts vs Long Prompts
With dial-through, dial-ahead and/or barge-in, why is this relevant?
– Prevents repeating irrelevant prompts during error correction.
  • “Thank you for calling XYZ, please enter your PIN”
  • “That was not a correct entry”
  • “Thank you for calling XYZ, please enter your PIN”
– Prompts can be used for “grunt detection” even when ASR is failing.
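One way to avoid the repeated greeting shown above is to record the greeting and the request as separate short prompts, so error correction replays only the request. A minimal sketch of that idea; the function name `pin_dialog` and its default prompt wordings are assumptions for illustration.

```python
def pin_dialog(entries, valid_pin,
               greeting="Thank you for calling XYZ.",
               request="Enter your PIN.",
               error="That was not a correct entry."):
    """Return the prompts played for a sequence of caller entries.

    The greeting is played once; after an error only the short
    request prompt is repeated.
    """
    played = [greeting, request]
    for entry in entries:
        if entry == valid_pin:
            break
        played += [error, request]
    return played
```

With one wrong entry followed by the right one, the caller hears the greeting, the request, the error message and the request again, never the greeting a second time.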
Slide 18
Lists, Menus and User Input
Lists and Menus
Slide 19
Hierarchy vs Skip and Scan
Hierarchy
– For Sales, press 1…For Support, press 2.
– There are four matches…For Jan Smith, press 1…For John Smith, press 2…For Ken Smith, press 3.
– “Lakeview Terrace”, press 9…“Burn after Reading”, press 10…“Igor”, press 11.
Skip and Scan
– Sales. To select this option, press 1. For the next option, press 9. For the previous option, press 7.
– Jan Smith. To select this option, press 1. For the next option, press 9. For the previous option, press 7.
– “Lakeview Terrace”. To select this option, press 1. For the next option, press 9. For the previous option, press 7.
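The skip-and-scan pattern above uses the same three keys no matter how long the list is. A minimal sketch of the navigation logic, using the key assignments from the slide (1 select, 9 next, 7 previous); the class name `SkipAndScanMenu` is hypothetical, and stopping at the ends of the list rather than wrapping around is an assumption.

```python
class SkipAndScanMenu:
    """Present one option at a time; the caller selects or moves on."""

    def __init__(self, options):
        if not options:
            raise ValueError("menu needs at least one option")
        self.options = list(options)
        self.index = 0

    def prompt(self):
        """Return the prompt for the option currently in focus."""
        return (f"{self.options[self.index]}. To select this option, press 1. "
                "For the next option, press 9. For the previous option, press 7.")

    def press(self, key):
        """Handle a key; return the selected option for '1', else None."""
        if key == "1":
            return self.options[self.index]
        if key == "9":
            # Assumed behavior: stay on the last option instead of wrapping.
            self.index = min(self.index + 1, len(self.options) - 1)
        elif key == "7":
            self.index = max(self.index - 1, 0)
        return None
```

Because the keys never change, the list can grow past nine entries without the numbering problem visible in the hierarchy example ("press 10", "press 11").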
Slide 20
Number of Choices Per Menu
The primacy and recency effects
– Designers should also consider the primacy and recency effects, which lead users to remember the first and last options most often. The recency effect makes the last few items presented in a list the easiest to recall. However, a short disturbance or interference can make it difficult to remember the last few items (Baddeley, 1999).
Most people can only remember 5 choices
– Some can remember more, some less
– More complex instructions are harder to remember
– Older users have more difficulty remembering
– 5 items ±2, depending on user base and complexity, is a good rule of thumb
Slide 21
Delimiters: To # or not to #
What is that thing (#) called?
– Pound, Number Sign, Hash
Telling them where it is
– The # key is located at the lower right corner of your keypad.
“Enter your 4-digit PIN followed by the #”?
– Why require the # if you know the length of the expected input?
“Enter your 4-digit PIN”
– What to do if they enter # anyway?
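One possible answer to the last question is to accept the PIN as soon as the expected number of digits arrives and quietly discard a trailing #. A minimal sketch of that policy; the function name `read_pin` is hypothetical, and treating an early # as an incomplete entry is an assumption.

```python
def read_pin(keys, length=4):
    """Return (pin, leftover_keys) from a string of pressed keys.

    pin is None when fewer than `length` digits arrive before a '#'
    or before the input ends.
    """
    pin = keys[:length]
    if len(pin) < length or not pin.isdigit():
        return None, ""
    rest = keys[length:]
    if rest.startswith("#"):
        # The delimiter was not needed; drop it so it cannot be
        # mistaken for input to the next prompt.
        rest = rest[1:]
    return pin, rest
```

Callers who were trained by other systems to press # are not penalized, and callers who skip it are not left waiting for a timeout.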
Slide 22
Other User Inputs
Lack of Instruction, Preparation
Directional Metaphors
Consistent use of keys
Mnemonics
Dynamic Menus
Alphabetic Input
– Two Button: Key then Position
– Two Button: Key then Location
– Count along the key
Slide 23
Press vs Enter
Use Press when a single-digit entry is required
– Implies that no delimiter (#) is needed
– “For Sales, Press 1...”
Use Enter when a multi-digit entry is required
– Doesn’t matter if it is a fixed-length entry or a variable-length entry
– “Enter your 4-digit PIN now”
– “Enter your PIN, followed by the # key”
Slide 24
DTMF or ASR: Different or Better
DTMF: STRENGTHS
– Familiarity
– Ubiquity
– Speed
– Privacy
– Efficiency
– Availability
– Cost
DTMF: WEAKNESSES
– Auditory Only
– Taxes Working Memory
– Limited Input Device
– Variability in Equipment
ASR: STRENGTHS
– Hands Free in a Mobile World
– Flexible
– Adaptable
– Good for Data Intensive Input
  • Automated Attendant
  • Lists
ASR: WEAKNESSES
– Cost
– Difficult to Recover From Errors
– Error Amplification
– Regional Issues
– Legally Ambiguous
Slide 25
ASR Menus
Don’t mimic DTMF menus
– “To Pay with Visa, press 1 or say one”
– “To Pay with Visa, press or say 1”
– “To Pay with Visa, say Visa”
How about
– “What Credit Card Would You Like to Use to Pay for That?”
Slide 26
Feedback
Presents the application state that results from user input, allowing the user to compare original intent with final results:
Echoing user input for confirmation
– You entered “ABC”. If this is correct, press 1. If you need to try again, press 2.
– You said “ABC”, is this correct?
Do not echo menu choices
– For technical support press 1…
  • “Technical Support Menu”
– Echoing can be tedious for experienced users; the feedback can be implied in the follow-up prompt
  • “For new product installation support, press 1. For troubleshooting an existing implementation, press 2.”
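The two confirmation styles above differ only in how the caller supplied the input. A minimal sketch that picks the wording by input mode; the function name `confirmation_prompt` and the mode labels "dtmf" and "asr" are assumptions for illustration.

```python
def confirmation_prompt(value, mode="dtmf"):
    """Echo collected data back to the caller for confirmation.

    Intended for data entry only; per the slide, menu choices
    should not be echoed.
    """
    if mode == "dtmf":
        return (f"You entered “{value}”. If this is correct, press 1. "
                "If you need to try again, press 2.")
    if mode == "asr":
        return f"You said “{value}”, is this correct?"
    raise ValueError("mode must be 'dtmf' or 'asr'")
```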
Slide 27
References
How to Build a Speech Recognition Application
– Balentine & Morgan, 2001
It’s Better to be a Good Machine than a Bad Person
– Balentine, 2007
Increasing the Usability of Interactive Voice Response Systems: Research and Guidelines for Phone-Based Interfaces
– Schumacher, Hardzinski & Schwartz, 1995
Skip and Scan: Cleaning up Telephone Interfaces
– Resnick & Virzi, 1992
Effects of Age, Speech Rate, and Environmental Support in Using Telephone Voice Menu Systems
– Sharit, Czaja, Nair & Lee, 2003
Slide 28
Dialogic, Dialogic Pro, Brooktrout, Cantata, SnowShore, Eicon, Eicon Networks, Eiconcard, Diva, SIPcontrol, Diva ISDN, TruFax, Realblocs, Realcomm 100, NetAccess, Instant ISDN, TRXStream, Exnet, Exnet Connect, EXS, ExchangePlus VSE, SwitchKit, N20, Powering The Service-Ready Network, Vantage, Connecting People to Information, Connecting to Growth, Making Innovation Thrive and Shiva, among others as well as related logos, are either registered trademarks or trademarks of Dialogic Corporation or its subsidiaries (“Dialogic”). The names of actual companies and products mentioned herein are the trademarks of their respective owners. Dialogic encourages all users of its products to procure all necessary intellectual property licenses required to implement their concepts or applications, which licenses may vary from country to country. Dialogic may make changes to specifications, product descriptions, and plans at any time, without notice.
06/08
USE CASE(S)
Any use case(s) shown and/or described herein represent one or more examples of the various ways, scenarios or environments in which Dialogic products can be used. Such use case(s) are non-limiting and do not represent recommendations of Dialogic as to whether or how to use Dialogic products.