CMPUT 301: Lecture 31 Out of the Glass Box Martin Jagersand Department of Computing Science University of Alberta

Jan 19, 2016


Jonathan Best

Transcript
CMPUT 301: Lecture 31
Out of the Glass Box

Martin Jagersand

Department of Computing Science
University of Alberta

Overview

• Idea:
  – why only use the sense of vision in user interfaces?
  – increase the bandwidth of the interaction by using multiple sensory channels, instead of overloading the visual channel

Overview

• Multi-sensory systems:
  – use more than one sensory channel in interaction
  – e.g., sound, video, gestures, physical actions, etc.

Overview

• Usable senses:
  – sight, sound, touch, taste, smell
  – haptics, proprioception, and accelerations
  – each is important on its own
  – together, they provide a fuller interaction with the natural world

Overview

• Usable senses:
  – computers rarely offer such a rich interaction
  – we can use sight, sound, and sometimes touch
  – flight simulators and some games use accelerations to create a multimodal immersion experience
  – we cannot (yet) use taste or smell

Overview

• Multi-modal systems:
  – use more than one sense in the interaction
  – e.g., sight and sound: a word processor that speaks the words as well as rendering them on the screen

Overview

• Multi-media systems:
  – use a number of different media to communicate information
  – e.g., a computer-based teaching system with video, animation, text, and still images

Speech

• Human speech:
  – natural mastery of language
  – instinctive, taken for granted
  – difficult to appreciate the complexities
  – potentially a useful way to extend human-computer interaction

Speech

• Structure:
  – phonemes (English)
    – 40 (24 consonant and 16 vowel sounds)
    – basic atomic units of speech
    – sound slightly different depending on context …

Speech

• Structure:
  – allophones
    – 120 to 130
    – all the sounds in the language
    – count depends on accents

Speech

• Structure:
  – morphemes
    – basic atomic units of language
    – part or whole words
    – formed into sentences using the rules of grammar

Speech

• Prosody:
  – variations in emphasis, stress, pauses, and pitch to impart more meaning to sentences

• Co-articulation:
  – the effect of context on the sound
  – transforms phonemes into allophones
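
The step from phonemes to allophones can be sketched as a context-dependent lookup. This is only a toy illustration, assuming one simplified rule for English /p/ (aspirated at the start of a word, as in "pin", but not after /s/, as in "spin"). The function name `realize` and the single rule are hypothetical and far simpler than real co-articulation.

```python
# Toy illustration of co-articulation: one phoneme, several allophones.
# The single rule below is a simplifying assumption, not a phonological model.

def realize(phonemes):
    """Map a list of phoneme symbols to allophones using the left context."""
    allophones = []
    for i, p in enumerate(phonemes):
        prev = phonemes[i - 1] if i > 0 else None
        if p == "p" and prev is None:
            allophones.append("pʰ")   # aspirated at the start of the word
        else:
            allophones.append(p)      # otherwise left unchanged
    return allophones

pin = realize(["p", "ɪ", "n"])         # the word "pin"
spin = realize(["s", "p", "ɪ", "n"])   # the word "spin"
```

The same phoneme /p/ comes out as two different sounds depending on what precedes it, which is why the allophone count is larger than the phoneme count.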

Speech Recognition

• Problems:
  – different people speak differently (e.g., accent, stress, volume, etc.)
  – background noises
  – “ummm …” and “errr …”
  – speech may conflict with complex cognition

Speech Recognition

• Issues:
  – recognizing words is not enough
  – need to extract meaning
  – understanding a sentence requires context, such as information about the subject and the speaker

Speech Recognition

• Phonetic typewriter:
  – developed for Finnish (a phonetic language)
  – trained on one speaker, tries to generalize to others
  – uses neural network that clusters similar sounds together, for a character
  – poor performance on speakers it has not been trained on
  – requires a large dictionary of minor variations
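
The idea of grouping similar sounds under one character can be sketched with a nearest-prototype lookup. This is not the trained neural network the slide refers to; it is a much simpler stand-in that assumes each character already has one representative feature vector. The 2-D vectors, the `PROTOTYPES` table, and `nearest_character` are invented for illustration.

```python
import math

# Hypothetical 2-D feature vectors standing in for learned sound clusters.
# A real system would learn these from recorded speech.
PROTOTYPES = {
    "a": (0.9, 0.2),
    "i": (0.2, 0.9),
    "u": (0.1, 0.1),
}

def nearest_character(features):
    """Return the character whose prototype is closest to the features."""
    return min(PROTOTYPES, key=lambda ch: math.dist(features, PROTOTYPES[ch]))

# Two incoming sounds, each close to (but not exactly at) a prototype.
typed = "".join(nearest_character(f) for f in [(0.8, 0.3), (0.25, 0.8)])
```

A speaker whose sounds fall far from the stored prototypes would be mislabeled, which is one way to picture the poor performance on untrained speakers.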

Speech Recognition

• Currently:
  – single-user, limited-vocabulary systems can work satisfactorily
  – no general-user, general-vocabulary systems are commercially successful yet

• Current commercial examples:
  – simple telephone-based UIs, such as train schedule information systems

Speech Recognition

• Potential:
  – for users with physical disabilities
  – for lightweight, mobile devices
  – for when the user’s hands are already occupied with a manual task (auto mechanic, surgeon)

Speech Synthesis

• What:
  – computer-generated speech
  – natural and familiar way of receiving information

Speech Synthesis

• Problems:
  – humans find it difficult to adjust to monotonic, non-prosodic speech
  – computer needs to understand natural language and the domain
  – speech is transient (hard to review or browse)
  – produces noise in the workplace or requires headphones (intrusive)

Speech Synthesis

• Potential:
  – screen readers
    – read a textual display to a visually impaired person
  – warning signals
    – spoken information especially for aircraft pilots whose visual and haptic channels are busy

Speech Synthesis

• Virtual newscaster (Ananova)

Uninterpreted Speech

• What:
  – fixed, recorded speech
  – e.g., played back in airport announcements
  – e.g., attached as voice annotation to files

Uninterpreted Speech

• Digital processing:
  – change playback speed without changing pitch
    – to quickly scan phone messages
    – to manually transcribe voice to text
    – to figure out the lyrics and chords of a song
  – spatialization and environmental effects
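
Changing playback speed without changing pitch can be sketched with plain overlap-add (OLA) time stretching: short windowed frames are copied from the recording at one spacing and laid down at another, so the duration changes while each frame keeps its original sample rate. This is a minimal sketch under simple assumptions (mono samples as a list of floats, a Hann window, 50% overlap). The name `time_stretch` and its parameters are hypothetical. Plain OLA produces audible artifacts, and practical systems tend to use refinements such as WSOLA or a phase vocoder.

```python
import math

def time_stretch(samples, rate, frame=512):
    """Change duration by roughly 1/rate using plain overlap-add (OLA).

    rate > 1 plays faster, rate < 1 plays slower. Frames are copied,
    not resampled, so the pitch inside each frame is unchanged.
    """
    hop_out = frame // 2                       # spacing of frames in the output
    hop_in = max(1, round(hop_out * rate))     # spacing of frames in the input
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame) for n in range(frame)]
    n_frames = max(1, (len(samples) - frame) // hop_in + 1)
    out_len = (n_frames - 1) * hop_out + frame
    out = [0.0] * out_len
    norm = [0.0] * out_len                     # summed window weights per sample
    for k in range(n_frames):
        src, dst = k * hop_in, k * hop_out
        for n in range(frame):
            if src + n < len(samples):
                out[dst + n] += samples[src + n] * window[n]
                norm[dst + n] += window[n]
    # Divide by the summed weights so overlapping frames do not change loudness.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]

# One second of a 440 Hz tone at an assumed 8000 samples per second.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
fast = time_stretch(tone, 2.0)   # about half as long
slow = time_stretch(tone, 0.5)   # about twice as long
```

A rate of 2.0 is the kind of setting that suits scanning phone messages, while a rate below 1.0 suits manual transcription.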

Non-Speech Sound

• What:
  – boings, bangs, squeaks, clicks, etc.
  – commonly used in user interfaces to provide warnings and alarms

Non-Speech Sound

• Why:
  – fewer typing mistakes with key clicks
  – video games harder without sound

Non-Speech Sound?

• D’oh!

Non-Speech Sound

• Dual mode displays:
  – information presented along two different sensory channels
  – e.g., sight and sound
  – allows for redundant presentation
    – user uses whichever they find easiest
  – allows for resolution of ambiguity in one mode through information in the other

Non-Speech Sound

• Dual mode displays:
  – humans can react faster to auditory than visual stimuli
  – sound is especially good for transient information that would otherwise clutter a visual display
  – sound is more language and culture independent (unlike speech)

Non-Speech Sound

• Auditory icons:
  – use natural sounds to represent different types of objects and actions in the user interface
  – e.g., breaking glass sound when deleting a file
  – direction and volume of sounds can indicate position and importance/size
  – SonicFinder
  – not all actions have an intuitive sound
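
Using direction and volume to convey position and importance can be sketched as stereo panning plus a gain. This is a minimal sketch assuming a mono sound stored as a list of floats and a constant-power pan law. The function `place_sound` and its `pan` and `importance` parameters are hypothetical names, not part of SonicFinder or any real toolkit.

```python
import math

def place_sound(samples, pan, importance):
    """Return (left, right) channels for a mono sound.

    pan: -1.0 (far left) to 1.0 (far right), standing in for on-screen position.
    importance: 0.0 to 1.0, used directly as the overall volume.
    Constant-power panning keeps perceived loudness even across positions.
    """
    angle = (pan + 1.0) * math.pi / 4.0        # maps pan to 0 .. pi/2
    left_gain = importance * math.cos(angle)
    right_gain = importance * math.sin(angle)
    left = [s * left_gain for s in samples]
    right = [s * right_gain for s in samples]
    return left, right

click = [1.0, -1.0, 0.5]                       # stand-in for a recorded sound
left, right = place_sound(click, pan=1.0, importance=0.5)
```

With `pan=1.0` the sound is heard almost entirely in the right channel at half volume, suggesting a moderately important object at the right edge of the screen.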

Non-Speech Sound

• Earcons:
  – synthetic sounds used to convey information
  – structured combinations of motives (musical notes) to provide rich information
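
The structured combination of motives can be sketched as concatenating short note sequences, for example one motive for an action followed by one for an object. This is a minimal sketch. The `MOTIVES` table, the specific notes, the sample rate, and the function names are all invented for illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

# Each motive is a short sequence of (frequency in Hz, duration in seconds).
# The notes are made up for illustration.
MOTIVES = {
    "create": [(440.0, 0.1), (660.0, 0.1)],   # rising pair
    "delete": [(660.0, 0.1), (440.0, 0.1)],   # falling pair
    "file":   [(523.25, 0.2)],                # single longer note
}

def compound_earcon(*names):
    """Concatenate named motives into one note sequence."""
    notes = []
    for name in names:
        notes.extend(MOTIVES[name])
    return notes

def render(notes):
    """Turn a note sequence into sine-wave samples in [-1, 1]."""
    samples = []
    for freq, dur in notes:
        count = round(SAMPLE_RATE * dur)
        samples.extend(math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
                       for n in range(count))
    return samples

create_file = compound_earcon("create", "file")
audio = render(create_file)
```

Because "create file" and "delete file" share the same final motive, a listener who has learned the parts can interpret combinations they have not heard before.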

Non-Speech Sound

• Earcons:

Handwriting Recognition

• Handwriting:
  – text and graphic input
  – complex strokes and spaces
  – natural

Handwriting Recognition

• Problems:
  – variation in handwriting between users
  – variation from day to day and over years for a single user
  – variation of letters depending on nearby letters

Handwriting Recognition

• Currently:
  – limited success with systems trained on a few users, with separated letters
  – generic, multi-user, cursive text recognition systems are not accurate enough to be commercially successful

• Current applications: e.g., pre-sorting of mail (but a human has to assist with failures)

Handwriting Recognition

• Newton:
  – printing or cursive writing recognition
  – dictionary of words
  – contextual recognition
  – fine tune spacing and letter shapes
  – fine tune recognition speed
  – learn handwriting over time

Handwriting Recognition

• Newton:

End

• What did I learn today?

• What questions do I still have?