On Grounding Human Communication with Human-Computer Interaction Designs
Hao-Chuan Wang 王浩全
Department of Computer Science, Institute of Information Systems and Applications, National Tsing Hua University
http://www.cs.nthu.edu.tw/~haochuan
May 26, 2014 @ Department of Communication and Technology, National Chiao Tung University
A Quick Overview of Human-Computer Interaction (HCI)
The two “senses” of Human-Computer Interaction: From interface …
“Interaction” in the sense of computers listening and responding to people’s input
… to problem solving and value creation in the real world
“Interaction” in the sense of designing technologies based on user needs, goals, constraints, and characteristics. UCD: User-Centered Design.
Identifying & fixing usability problems
Technology supported education
Persuasive (behavioral change) computing
HCI: Studying Existing and Possible Relationships between Computers and People
ACM SIGCHI Curricula 1996
30 Years of the HCI Community
ACM SIGCHI: 9 Turing Award Winners / 188 ACM Fellows
http://dl.acm.org/sig.cfm?id=SP923
What’s Changing in HCI Today?
Big picture is still there, but:
• More emphasis is on use contexts and applications.
• Computers are of many forms, doing all sorts of things.
• Computing is not necessarily done by silicon-chip computers.
- Input and output are versatile. Not necessarily “keyboard and mouse”, “text, speech or graphics”.
- Collaboration and social. Not necessarily “one human, one computer”.
Computer-Mediated Communication (CMC)
What’s the longest distance in the world? 世界上最遠的距離是什麼?
Supporting Human Communication
Communication in the sense of data transmission across physical distance is not that hard today
• Wired and wireless computer networking, internet etc.
Communication, in the sense of understanding each other, or crossing the “psychological distance” between people, remains hard
• Difficulties in expressing or understanding thoughts
• Barriers between generations, genders, professions, languages, and cultures.
Supporting human communication continues to be a challenging yet worthwhile topic in HCI.
Ultimate Goal? Mind-Connecting!
Lost in Technologies
However, technology development does not always approach the goal effectively. For example:
Video conferencing
• Bandwidth-demanding. Video lag disrupts conversation.
• Adoption is not guaranteed. Privacy and other social concerns.
Machine translation
• Quality concerns
• Even non-fluent second-language use can beat MT (cf. Yamashita & Ishida, 2006).
Observation
Designs of CMC can work better when features and constraints of human communication are investigated and considered.
Ex. Awareness indicator that makes “typing” visible in instant messaging.
Basic research stays relevant! What are the features of successful and unsuccessful communication? What’s the nature of “understanding”?
Grounding Communication
How Would You Describe…
Where you live in Hsinchu?
Where you lived when you were in the U.S.?
My Answer
Where you live in Hsinchu? Near 清大後門 (the back gate of National Tsing Hua University).
Where you lived when you were in the U.S.? In Ithaca, a college town in the middle of New York state, if you know where it is. It’s where Cornell University is located.
Do you see the general difference? Why?
The amount of knowledge that we share.
Common Ground
Knowledge, beliefs, and attitudes that we share, know that we share, and know that we know that we share influence how we use language to communicate.
Grounding: Interactive process by which communicators exchange evidence of their understanding to arrive at the state of common ground.
Herbert Clark
Evidence of Common Ground
Physical co-presence (being co-located)
• “close that door”
Shared community membership
• “Let’s meet at 小七” (小七 is a local nickname for 7-Eleven)
Linguistic co-presence (can access same utterances)
“What’s this?”
Grounding is a Collaborative Process
The Role of Media: Affordances
An influential HCI-rooted concept, which roughly means “action-permitting properties” of objects that people see
• Chair affords sitting
• Door-knob affords door-opening
• Virtual keyboard affords typing (but is this trivial?)
Don Norman
Affordances of Communication Media
Technology Changes Grounding
Affordances of media constrain how people may interact with one another
• E.g., if no visibility, impossible to use head-nodding as a technique for grounding
People may learn to adapt their grounding behaviors (this happens. E.g., emoticons in IM) or
Design new CMC tools with useful properties to support
Microsoft Research Asia UR Project: FY13-RES-OPP-027
Wang, H-C., & Lai, C-T. (accepted). Kinect-taped Communication: Using Motion Sensing to Study Gesture Use and Similarity in Face-to-Face and Computer-Mediated Brainstorming. ACM Conference on Human Factors in Computing Systems (CHI) 2014. Full paper. [Acceptance rate: 22.8%]
Kinect-taped Communication: Using Motion Sensing to Study Gesture Use and Similarity in Face-to-Face and Computer-Mediated Brainstorming
Hao-Chuan Wang, Chien-Tung Lai National Tsing Hua University, Taiwan
[cf. Bos et al., 2002; Setlock et al., 2004; Scissors et al., 2008; Wang et al., 2009]
Computer-mediated communication (CMC) tools are prevalent, but are they all equal?
• Ex. Video vs. Audio
Media properties influence aspects of communication differently
• Task performance, grounding, styles, similarity of language patterns, social processes and outcomes etc.
How do media influence communication?
Communication could be more than speaking.
Both verbal and non-verbal channels are active
Studying gesture use in communication
Current methods:
• Videotaping with manual coding.
• Giving specific instructions to participants (e.g., to gesture or not).
• Using confederates etc.
Problems to solve:
• High cost. Labor-intensiveness.
• Resolution of manual analysis: Hard to recognize and reliably label small movements.
• Scalability: Hard to study arbitrary communication in the wild.
“Kinect-taping” method
Like videotaping, we use motion sensing devices, such as Microsoft Kinect, to record hand and body movements during conversations.
• Detailed, easier-to-process representations.
• Behavioral science instrument (“microscope”) to study non-verbal communication in ad hoc groups.
• Low cost if automatic measures are satisfactory.
Re-appropriating motion sensors in HCI: Sensing-aided user research for future designs
From sensors as design elements to sensors as research instruments to help future designs.
A media comparison study
Investigate how people use gestures during face-to-face and computer-mediated brainstorming
Compare three communication media
H1. Visibility increases gesture use
Proportion of gesture: Face-to-Face > Video > Audio
H2. Visibility increases accommodation
Similarity between group members’ gestures: Face-to-Face > Video > Audio
Also explore how gesture use, level of understanding, and ideation productivity correlate.
[cf. Clark & Brennan, 1991]
[cf. Giles & Coupland, 1991]
Experimental design
36 individuals, 18 two-person groups
Kinect-taped group brainstorming sessions
Face-to-Face, Video, Audio
Three trials (15 min each) in counterbalanced order
Data analysis
Amount and similarity of gestures, level of understanding, productivity
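The slides say only that the three media were run in counterbalanced order; they do not describe the scheme. One common approach, sketched here purely as an assumption, is full counterbalancing: cycle the 18 groups through all six possible orders of the three media so that each order is used three times. The function name `assign_orders` is hypothetical.

```python
from collections import Counter
from itertools import permutations

MEDIA = ("Face-to-Face", "Video", "Audio")

def assign_orders(n_groups, conditions=MEDIA):
    """Cycle groups through every possible ordering of the conditions.

    Assumption: full counterbalancing. The slides do not state
    which counterbalancing scheme was actually used.
    """
    orders = list(permutations(conditions))  # 3! = 6 orders for three media
    return [orders[g % len(orders)] for g in range(n_groups)]

orders = assign_orders(18)
usage = Counter(orders)  # number of groups that received each order
```

With 18 groups and six orders, each order is assigned to the same number of groups, which is what keeps trial position from being confounded with medium.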
How to quantify gestures? How many gestures are there in a 15 min talk?
[Figure: motion segmented into “moving” and “not moving” periods. Two unit motions with speed threshold 0; three unit motions with speed threshold 2.]
Choose the thresholds
[Figure: speed axis (m/s) with regions labeled “Too few signals”, “Data points of interest”, and “Almost everything”.]
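The thresholding idea can be sketched as follows: treat every maximal run of samples whose speed exceeds the threshold as one unit motion. This is a minimal illustration, not the study's implementation. The function name, the strict "greater than" comparison, and the sample speed trace are all assumptions.

```python
def segment_unit_motions(speeds, threshold):
    """Return (start, end) index pairs, end exclusive, for each maximal
    run of samples whose speed is strictly above the threshold."""
    segments = []
    start = None
    for i, speed in enumerate(speeds):
        moving = speed > threshold
        if moving and start is None:
            start = i                      # a new unit motion begins
        elif not moving and start is not None:
            segments.append((start, i))    # the current unit motion ends
            start = None
    if start is not None:                  # trace ended while still moving
        segments.append((start, len(speeds)))
    return segments

# Hypothetical speed trace, one value per frame.
trace = [0.0, 3.0, 4.0, 1.0, 3.5, 0.0, 0.0, 2.5, 3.0, 0.0]

low = segment_unit_motions(trace, 0.0)   # a brief slowdown does not split a motion
high = segment_unit_motions(trace, 2.0)  # the same slowdown now splits it
```

On this made-up trace, raising the threshold turns one long motion into two shorter ones, which is why the threshold choice changes the gesture count.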
How to measure similarity between unit motions?
Feature extraction and representation
Unit motions are represented as feature vectors
• Time length, path length, displacement, velocity, speed, angular movement etc.
• Features extracted for both hands and both elbows.
73 features extracted for each unit motion.
Similarity between unit motions: Cosine value between the two vectors.
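As a rough illustration of the representation, the sketch below computes four of the listed features for a single joint and compares two feature vectors by cosine. The full 73-feature set is not specified in the slides, so the function names, the choice of features, the assumption that positions are in metres, and the sample trajectories are all hypothetical.

```python
import math

def motion_features(points, dt):
    """Features of one joint's trajectory: time length, path length,
    displacement, and mean speed. `points` are (x, y, z) positions
    (assumed to be in metres) sampled every `dt` seconds."""
    steps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    duration = dt * (len(points) - 1)
    path_length = sum(steps)
    displacement = math.dist(points[0], points[-1])
    mean_speed = path_length / duration if duration > 0 else 0.0
    return [duration, path_length, displacement, mean_speed]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

# Hypothetical trajectories of one hand in two unit motions.
motion_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
motion_b = [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 2.0, 2.0)]

similarity = cosine_similarity(motion_features(motion_a, 0.5),
                               motion_features(motion_b, 0.5))
```

Cosine compares the direction of the feature vectors rather than their magnitude, so features on very different scales would normally need normalizing first. The slides do not say whether this was done.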
Validating the similarity metric
[Figure: validation procedure. Motion queries are randomly selected from the Kinect-taped motion database, similar and dissimilar motions are retrieved, and the machine ranking (1, 2, 3) is compared with the human ranking (1, 2, 3).]
Validating the similarity metric: contingency analysis
Counts by machine rank (rows) and human rank (columns):
Machine Rank    Human R1    Human R2    Human R3
R1              29          2           5
R2              7           27          2
R3              0           7           29
χ² = 107.97, p < .001
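The contingency analysis can be reproduced from the counts on the slide. The sketch below computes both the Pearson chi-square and the likelihood-ratio chi-square (G statistic) in plain Python. On these counts the likelihood-ratio statistic comes out at about 107.97 and the Pearson statistic at about 103.83, which suggests, though the slides do not say so, that the reported value is the likelihood-ratio one. The function names are hypothetical.

```python
import math

# Counts from the slide: rows are machine ranks R1..R3,
# columns are human ranks R1..R3.
observed = [
    [29, 2, 5],
    [7, 27, 2],
    [0, 7, 29],
]

def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def pearson_chi2(table):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for row_o, row_e in zip(table, expected)
               for o, e in zip(row_o, row_e))

def likelihood_ratio_chi2(table):
    """G statistic: 2 * sum of O * ln(O / E); empty cells contribute 0."""
    expected = expected_counts(table)
    return 2.0 * sum(o * math.log(o / e)
                     for row_o, row_e in zip(table, expected)
                     for o, e in zip(row_o, row_e)
                     if o > 0)

pearson = pearson_chi2(observed)
g_stat = likelihood_ratio_chi2(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)  # (3 - 1) * (3 - 1) = 4
```

Either statistic is far beyond the usual critical values for 4 degrees of freedom, consistent with the reported p < .001.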
Key Results
H1: Amount of gesture use
H2: Similarity between group members
Associations
• Amount of gesture and understanding
• Amount of gesture and ideation productivity
• Gesture similarity and ideation productivity
Visibility on proportion of gesture use
[Figure: bar chart of proportion of gesture use (%), y-axis from 0 to 16, for Face-to-face, Video, and Audio.]
H1 not supported. Media did not influence percentage of gesture.
People gesture as much in Audio as in F2F and Video.
Association between self-gesture and level of understanding
[Figure: model-predicted understanding and model-predicted number of ideas plotted against the proportion of the individual’s own gesture use (%), with separate lines for Audio, F2F, and Video.]
Non-communicative function of gesture.
Understanding correlates with self-gesture but not partner-gesture.
Stronger correlation with reduced or no visibility.
Similarity between group members
[Figure: bar chart of between-participant gestural similarity, y-axis from 0.46 to 0.55, for Face-to-face, Video, and Audio.]
H2 supported. Similarity F2F > Video > Audio.
People gesture more similarly when they can see each other.
Summary and implications
Motion sensing for studying non-verbal behaviors in CMC.
Media Comparison Study
• Visibility influences similarity but not amount of gesture.
• Only self-gesture correlates with understanding.
• Gesture doesn’t seem to convey much meaning to the partner. Seeing the partner is not crucial to understanding.
Kinect-taping Method
• Study communication of ad hoc groups in the wild.
• Distributed deployment study of CMC tools.
• Cross-lingual and cross-cultural communication.
Summary and implications (cont.)
• The value of video may be relatively limited to the social and collaborative aspect (similarity etc.).
• Feedback that promotes self-gesturing may help understanding.
Effects of Interface Interactivity on Collecting Language Data to Power Dialogue Agents
Young, S., Keiser, S. & Gašić, M. Spoken Dialogue Management using Partially Observable Markov Decision Processes
Spoken Dialogue Systems
ChiCHI 2014 | Effects of Interface Interactivity on Collecting Language Data to Power Dialogue Agents
How to collect more natural language responses?
Language Generation Task
Some Existing Methods
• One-on-one interviews to get the responses from people
  - Manual data collection.
  - Expensive.
• Using surveys with specific instructions, “Imagine that you’re answering people’s questions …”
  - Less expensive.
  - Non-interactive, “imagined interaction”.
Idea: Using an Interactive Chat Bot to Elicit Natural Responses
Anthropomorphic features:
✓ Greet workers
✓ Simulate human typing delays
✓ Wait for response
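The three anthropomorphic features could be combined roughly as in the sketch below. It is an illustration only, not the interface used in the study. The function names, the greeting text, the typing speed of 6 characters per second, and the 5-second cap are all assumptions. Input, output, and sleeping are passed in as parameters so the logic can be exercised without a real chat front end.

```python
import time

def typing_delay(message, chars_per_second=6.0, max_delay=5.0):
    """Seconds to wait before showing a message, as if a person typed it.
    Both default values are illustrative assumptions."""
    return min(len(message) / chars_per_second, max_delay)

def run_chat(questions, read_reply, send, sleep=time.sleep):
    """Greet the worker, then ask each question and wait for a reply."""
    greeting = "Hi! Thanks for helping out today."
    sleep(typing_delay(greeting))      # pause before the greeting appears
    send(greeting)                     # feature 1: greet workers
    answers = []
    for question in questions:
        sleep(typing_delay(question))  # feature 2: simulate typing delay
        send(question)
        answers.append(read_reply())   # feature 3: wait for the response
    return answers
```

In a terminal session, `run_chat(questions, input, print)` would be one way to try it. Capping the delay keeps long questions from stalling the conversation.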
Static Interface
Crowdsourcing Answer Generation
Evaluation
Compare interactive and static interface
Crowdsourcing to select quality responses
Evaluate the results with end users
Stage 1: Creation
Stage 2: Aggregation
Evaluation Stage
PTT: a BBS system and online community in Taiwan
MTurk
Multilingual Crowdsourcing Study
Chinese and English versions of ads and task instructions are prepared for crowdsourcing
Stage 1: Answer Creation
• 223 workers
  - 122 from MTurk
  - 101 from PTT
Stage 2: Answer Aggregation
• 222 workers
Evaluation
• 165 workers
  - 98 from MTurk
  - 67 from PTT
Key Results
Interactive vs. Static Interface
• 73.6% of comments show preference for working with the interactive chat bot.
• Increasing the satisfaction of workers (Kittur, A., et al. 2013)
Conclusion
• Present an interactive chat bot-based interface for crowdsourcing language generation tasks for building natural dialogue agents.
• Interactivity leads to higher worker satisfaction, and better perceived enjoyability by Chinese-speaking users.
• Also identified language specificity of crowdsourcing platforms, which helps to inform crowdsourcing practices.
Thank you for listening.
Acknowledgement This study is partially supported by Project D352B24310 and conducted at ITRI under the sponsorship of the Ministry of Economic Affairs, Taiwan.
Key Messages
Supporting human communication continues to be an important topic in HCI, both to research and design practice.
• Focusing on how to shorten the “psychological distance” between people. “Mind-connecting”!
Basic and applied behavioral, cognitive and social sciences help to understand the features of successful and unsuccessful communication
• Insight that we should focus on CMC affordances as much as technicality.
Interdisciplinary work can benefit both sides: Social and behavioral sciences help technology design, and vice versa.
Ultimate Goal? Mind-Connecting!
國立清華大學人機合作與社群運算實驗室 NTHU Collaborative and Social Computing Lab (CSC Lab)
Acknowledgement for Support from
Ministry of Science and Technology, Taiwan 科技部
Google Inc. 美國Google總部
Microsoft Research Asia 微軟亞洲研究院
Industrial Technology Research Institute (ITRI) 工業技術研究院
Delta Corp 台達電子公司