“Yours is Better!” Participant Response Bias in HCI
Nicola Dell, Vidya Vaidyanathan, Indrani Medhi, Edward Cutrell, William Thies
CHI 2012
Joon-won Lee, Jimin Han, Jincheul Jang
Motivation
• HCI researchers frequently work with groups of people that differ significantly from themselves
• Little attention has been paid to the effects these differences have on the evaluation of HCI systems.
• The authors measure participant response bias due to interviewer demand characteristics, and the role of social and demographic factors in influencing that bias.
What is a “demand characteristic”?
• Participants in an experiment often share with the experimenter the hope that the study will be successful.
• Frequently, a participant will want to ensure that she makes a useful contribution to the study and so will strive to be a ‘good’ participant and provide the experimenter with the ‘right’ results.
• Conversely, a participant may resent the experimenter and actively work to disprove the hypothesis.
Lack of research in HCI
• The authors found only one study that specifically addresses demand characteristics in HCI (in their ‘trial of trials’, Brown et al. found that participants changed their system usage partly to give researchers ‘good’ data).
• Research that quantifies participant response bias due to demand characteristics in HCI settings is scarce.
Only one study
4 Contributions to HCI
• Survey the existing literature to bring demand characteristics and their known effects to the attention of the HCI community.
• If participants believe that a particular technological artifact is favored by the interviewer, their responses are biased to favor it.
• If the interviewer is a foreign researcher who requires a translator, responses are even more biased towards the technology favored by the interviewer.
• With a foreign interviewer and a translator, participants report a preference for an obviously inferior technology.
Experimental Design
• H.1 If participants believe that the interviewer favors a technology, their responses will be biased to favor it as well
• H.2 If the interviewer is a foreign researcher requiring a translator, participants’ responses will be even more biased towards the technology favored by the interviewer
• H.3 Participants will express a preference for an obviously inferior technology if they believe it is favored by the interviewer
Experimental Design
• 450 participants in total; field study in Bangalore, India
• Experiment 1: tests H.1 and H.2
- Participants were shown a video clip on each of 2 smartphones; one of the smartphones was associated with the interviewer
• Experiment 2: tests H.1, H.2 and H.3
- One of the video clips was degraded
- Measured participants' preference for the degraded video when it was associated with the interviewer
Experimental Procedure (1/2)
• Between-subjects design
• Sample size of 50 per experimental condition
• Each interview lasted between 2 and 3 minutes
• Same general interview procedure across all experimental conditions
• Demand characteristic: association (bold font in the script)
Experimental Procedure (2/2)
• The order of the videos (associated or not) was randomized to prevent ordering effects
• The interviewer recorded participant responses and comments
• Responses were coded into 3 distinct classes
– 1. favored the video associated with the interviewer
– 2. favored the video not associated with the interviewer
– 3. same (not used in this paper)
Interviewers
• Vary the social status of the interviewers
• 2 different female graduate students
- 29-year-old English-speaking Caucasian foreign interviewer (not born in Bangalore, distinguishable as an outsider)
- 33-year-old Kannada- and English-speaking Indian local interviewer (grew up in Bangalore, identifiable as a local)
• The foreign interviewer required a translator
• In this region, language is associated with prestige, opportunity and high social status
Participants
• 2 distinct social groups
– 1. Male university students (elite students): speak English, experienced with high technology; 200 male students
– 2. Local auto rickshaw drivers: high-school education, daily income $5~$10, own cheap mobile phones and have no experience with high technology; 250 male drivers (larger socio-demographic difference than the 1st group)
• Simplified interview script for rickshaw drivers
• 2x2 factorial design
– Interviewer (Foreign/Local)
– Participants (Driver/Student)
• Dependent variable: video chosen
• Response bias in all cases
• The largest bias?
Experiment 1: Identical videos
Why Chi-Square Test?
• Suited to categorical responses, e.g. Yes/No
<Textbook Ch.4 (p.92~)>
• H1. Presence of Response Bias
Experiment 1: Identical videos
χ²(1, n = 144) = 26.7, p < .001: significant bias
n = 103 + 41
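The statistic above can be reproduced from the two counts on the slide. This is a minimal sketch, assuming a goodness-of-fit test against an even 50/50 split (what unbiased choices between identical videos would give) and assuming that 103 is the number of responses favoring the interviewer's video. The function name is illustrative, not from the paper.

```python
import math

def chi_square_even_split(count_a, count_b):
    """Goodness-of-fit chi-square for two categories against a 50/50 split."""
    n = count_a + count_b
    expected = n / 2  # with no bias, each video would be chosen equally often
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts from the slide (n = 103 + 41); which count is which is an assumption.
chi2, p = chi_square_even_split(103, 41)
print(f"chi2(1, n=144) = {chi2:.1f}, p = {p:.1e}")
```

Only the standard library is used; the `erfc` shortcut is valid for 1 degree of freedom only.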
• H1. Presence of Response Bias
Experiment 1: Identical videos
– Foreign interviewer
• Rickshaw Drivers: χ²(1, n = 42) = 18.7, p < .001 (significant bias)
• University Students: χ²(1, n = 36) = 5.4, p = .02 (significant bias)
– Local interviewer
• Rickshaw Drivers: χ²(1, n = 28) = 3.6, p = .06 (ns.)
• University Students: χ²(1, n = 38) = 2.6, p = .10 (ns.)
• Rickshaw + Student: χ²(1, n = 66) = 6.1, p = .01
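The reported p-values can be checked directly from the reported statistics. This is a minimal sketch, assuming 1 degree of freedom throughout, a conventional .05 threshold, and that the statistics map to the groups in the order they are listed. Because the inputs are already rounded, the computed p-values can differ slightly from the slide in the second decimal.

```python
import math

def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Chi-square values as reported on the slide; the group labels are assumed
# to follow the listing order.
reported = {
    "Foreign / rickshaw drivers": 18.7,
    "Foreign / university students": 5.4,
    "Local / rickshaw drivers": 3.6,
    "Local / university students": 2.6,
    "Local / rickshaw + student": 6.1,
}

for group, chi2 in reported.items():
    p = p_value_df1(chi2)
    verdict = "significant" if p < 0.05 else "ns."  # .05 threshold is an assumption
    print(f"{group}: chi2 = {chi2}, p = {p:.3f} ({verdict})")
```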
• H2. Impact of Foreign Interviewer
– 2 comparisons
• Foreign-Rickshaw vs. Foreign-Student
• Foreign-Rickshaw vs. Local-Rickshaw
Experiment 1: Identical videos
[Figure: bias ratios 5x, 2.1x, 2.3x]
• H2. Impact of Foreign Interviewer
– No significant relationship between
• the video chosen and the interviewer: χ²(1, n = 70) = 2.28, p = .13
• the video chosen and the participant: χ²(1, n = 78) = 2.11, p = .15
Experiment 1: Identical videos
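The two comparisons above are tests of independence on 2x2 tables (group by video chosen). The slides do not give the cell counts. The counts in this sketch are assumptions, back-calculated so that they reproduce the reported n and χ² values, and should be checked against the paper before reuse. The sketch also assumes Pearson's chi-square without continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 degree of freedom
    return chi2, p_value

# ASSUMED counts, not taken from the slides:
# (favored the interviewer's video, favored the other video)
foreign_rickshaw = (35, 7)
local_rickshaw = (19, 9)
foreign_student = (25, 11)

# Video chosen vs. interviewer (foreign vs. local, rickshaw drivers only)
chi2_i, p_i = chi_square_2x2(*foreign_rickshaw, *local_rickshaw)
# Video chosen vs. participant (rickshaw vs. student, foreign interviewer only)
chi2_p, p_p = chi_square_2x2(*foreign_rickshaw, *foreign_student)

print(f"interviewer: chi2(1, n=70) = {chi2_i:.2f}, p = {p_i:.2f}")
print(f"participant: chi2(1, n=78) = {chi2_p:.2f}, p = {p_p:.2f}")
```

Both p-values fall above .05, which agrees with the "no significant relationship" reading for Experiment 1.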
• Make one of the video clips noticeably worse than the other
– The interviewer associates herself with the degraded clip
• Load the low-quality video clip on one smartphone and the high-quality clip on the other
• Modify the scripts
• Randomize the playing order to avoid order effects
Experiment 2: Degraded video
• Add a condition: without association
– By changing the script, this condition represents a baseline that minimizes demand characteristics
• No local-interviewer condition with university students
– In Experiment 1, the two groups showed no significant differences
Experiment 2: Degraded video
• No response bias in groups a and b
• Group d (foreign interviewer with students) is more biased than group b
• Response bias occurred for rickshaw drivers
Experiment 2: Degraded video
• H1: Presence of Response Bias
– Chi-square test, with Fisher’s exact test to improve accuracy
– Foreign interviewer with rickshaw drivers (p < .001)
– Foreign interviewer with university students (p = .008)
– Local interviewer with rickshaw drivers (p = .003)
Experiment 2: Degraded video
• H2: Impact of Foreign Interviewer
Experiment 2: Degraded video
p = .049, Cramér’s V = .288
Significant association between interviewers and video chosen (response bias)
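Cramér's V rescales a chi-square statistic to an effect size between 0 and 1. This is a minimal sketch of the standard formula. The table behind V = .288 is not reproduced on the slides, so the usage line applies the formula to the Experiment 1 interviewer comparison, χ²(1, n = 70) = 2.28, purely as an illustration.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V: sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

# Illustration only: Experiment 1 statistic, assumed to come from a 2x2 table.
v = cramers_v(2.28, 70, 2, 2)
print(f"Cramér's V = {v:.3f}")
```

For a 2x2 table the formula reduces to sqrt(χ² / n).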
• H3: Preference for Inferior Tech
– In the case of foreign interviewers with rickshaw drivers, the participants select the low-quality video (27/50 = 54%)
– In the case of local interviewers,
Experiment 2: Degraded video
χ²(1, n = 46) = 1.39, p = .24, ns.
χ²(1, n = 47) = 2.57, p = .11, ns.
Summary of Hypotheses tests
H.1 If participants believe that the interviewer favors a technology, their responses will be biased to favor it as well
– Experiment 1: Foreign-rickshaw: sig.; Foreign-student: sig.; Local-rickshaw: ns.; Local-student: ns.
– Experiment 2: sig.
H.2 If the interviewer is a foreign researcher requiring a translator, participants’ responses will be even more biased towards the technology favored by the interviewer
– Experiment 1: ns.
– Experiment 2: Rickshaw: sig.
H.3 Participants will express a preference for an obviously inferior technology if they believe it is favored by the interviewer
– Experiment 1: - (not tested)
– Experiment 2: ns.
Discussion, Recommendation
• Participants did not seem to tell the interviewer the ‘right’ response while secretly thinking otherwise; rather, they seemed to genuinely believe the interviewer’s artifact to be superior and identified convincing reasons to justify their choice.
• Researchers should pay more attention to the types of response bias that might result from working with any participant population and actively take steps to minimize this bias.
Generalization and Limitations
• Consideration of gender
– Davis et al. (2010) suggested that gender might be an important factor that could influence participant response.
• This study did not examine the extent to which gender may play a role in any bias observed.
• All experiments in this paper recruited “male” participants and were conducted by “female” interviewers.
• Cultural differences
• More sophisticated analysis
Conclusion
• (1) if participants believe that a particular technological artifact is favored by the interviewer, their responses are biased to favor it as well
• (2) the bias due to interviewer demand characteristics is exaggerated much further when the interviewer is a foreign researcher requiring a translator
• (3) in response to a foreign interviewer with a translator, participants of lower social status report a preference for an obviously inferior technology, which they otherwise do not prefer