Preliminary Examination
Manas Tungare
Advisory Committee:
Dr. Manuel Pérez-Quiñones
Dr. Stephen H. Edwards
Dr. Edward A. Fox
Prof. Steve Harrison
Dr. Tonya Smith-Jackson
Talk outline
[Timeline: presentation and questions, 0 to ~45 min; additional comments and suggestions afterward.]
Slides contain only major citations. The document contains full citations.
OK to record audio?
Your questions/comments are welcome at any time.
Talk outline
• Introduction
• Problem statement
• A review of my work so far
• Research questions
• How my research plan will address these
• Planned schedule
Introduction
Human-Computer Interaction
Personal Information Management
Multi-Platform User Interfaces
Personal Information
Multiple devices
Problems and workarounds
• Constant need for manual synchronization
• Give up using multiple computers
• Copy addresses and phone numbers on sticky notes
• Use USB flash drives to cart data around
• Email files to themselves
Evaluation issues in PIM
• Evaluating new PIM tools
• Comparing PIM tools developed by diverse research groups
• Choosing suitable reference tasks for PIM
• Measures that are valid across tasks
Paraphrased from discussions at the CHI 2008 Workshop on Personal Information Management, April 2008.
Problem Statement
Understanding PIM
• Understanding users and how they use multiple devices to accomplish PIM
• Identify common device configurations in information ecosystems
• Identify tasks performed on each device
• Identify problems, frustrations
• What is the mental workload incurred by users when they are trying to use multiple devices for personal information management?
• For those tasks that users have indicated are frustrating for them, do the alternate strategies result in lower mental workload?
• Are multi-dimensional subjective workload assessment techniques (such as NASA TLX) an accurate indicator of operator performance in information ecosystems?
Mental workload in information ecosystems
Research: Phase I
Understanding users’ PIM practices across devices
Research Questions
• Devices and activities: What is the distribution of users who use multiple devices? What are the most common devices? What are common PIM tasks? Which tasks are bound to a particular device?
• The use of multiple devices together: Which devices are commonly used in groups? What methods are employed to share data among these devices? What problems and frustrations arise?
• Factors in the choice of new devices: What factors influence users’ buying decisions for new devices? How do they integrate a new device into their current set of devices?
• Device failures: How often do users encounter failures in their information ecosystems? What are common types of failure? How do users cope with failure?
Survey: August 2007
• Knowledge workers (N=220)
• Highlights from preliminary results:
• 96% use at least one laptop
• 71% use at least one desktop
• Lots of frustrated users (as expected)
• Longer discussion in [Tungare and Pérez-Quiñones 2008]
Survey analysis
• Content analysis to uncover common tasks
• Quantitative analysis to determine typical set of devices for experiment
• Recruit two students to code a random subset of survey responses; ensure high inter-rater reliability
• Design Phase II experiment based on these findings
Content analysis: example
“The last device I acquired was a cell phone from Verizon. I would have liked to synchronize data from my laptop or my PDA with it but there seems to be no reasonable way to do so. I found a program that claimed to be able to break in over bluetooth but it required a fair amount of guess work as to data rates etc and I was never able to actually get it to do anything. In the end I gave up. Fortunately I dont know that many people and I usually have my PDA with me so it isnt a big deal but frankly I dont know how Verizon continues to survive with the business set...”
Content analysis: example
[The same quote, shown again with codes overlaid: Device 1, Device 2, Task, Problem 1, Problem 2, Conclusion.]
Research: Phase II
Measurement of mental workload and task performance of users while they perform representative PIM tasks
Mental workload
• [...] “That portion of an operator’s limited capacity actually required to perform a particular task.” [O’Donnell and Eggemeier, 1986]
• Low to moderate levels of workload are associated with acceptable levels of operator performance [Wilson and Eggemeier, 2006]
• Often used as a measure of operator performance
Mental workload as a measure of operator performance
• Alternative: direct measurement of task performance (a sketch follows this list):
• Time taken to perform the task
• Number of errors, etc.
• Task metrics are more difficult to measure
• Need instrumentation of equipment
• Scores cannot be compared across tasks
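To make the instrumentation point concrete, here is a minimal sketch in Python of a logging harness for the two metrics above. It is purely illustrative; the task name and the harness itself are hypothetical, not part of the planned apparatus.

```python
# Minimal sketch of instrumenting a benchmark task to log time on
# task and error count. Everything here is hypothetical.
import time

class TaskLogger:
    def __init__(self, task_name):
        self.task_name = task_name
        self.errors = 0

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def error(self):
        self.errors += 1  # observer records one participant error

    def __exit__(self, *exc):
        elapsed = time.monotonic() - self.start
        print(f"{self.task_name}: {elapsed:.1f} s, {self.errors} errors")

# Hypothetical usage during a session:
with TaskLogger("sync contacts from phone to laptop") as log:
    ...          # participant performs the task
    log.error()  # one error observed
```

This is exactly the kind of per-task, per-device instrumentation that makes direct metrics costly: every device in an ecosystem would need a comparable harness.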
NASA Task Load Index
[Figure 8.6: the NASA TLX rating sheet (fields: Name, Task, Date). Hart and Staveland’s NASA Task Load Index (TLX) method assesses workload on six subscales; increments of high, medium, and low estimates for each point result in 21 gradations on each scale. Each subscale is rated from Very Low to Very High, except Performance, which runs from Perfect to Failure.]
• Mental Demand: How mentally demanding was the task?
• Physical Demand: How physically demanding was the task?
• Temporal Demand: How hurried or rushed was the pace of the task?
• Performance: How successful were you in accomplishing what you were asked to do?
• Effort: How hard did you have to work to accomplish your level of performance?
• Frustration: How insecure, discouraged, irritated, stressed, and annoyed were you?
Measuring mental workload
• NASA TLX: Task Load Index
• SWAT: Subjective Workload Assessment Technique
• WP: Workload Profile
Validity of workload measures
• Mental workload consistently shown to be negatively correlated with performance metrics [Bertram et al. 1992]
• Airline cockpits [Ballas et al. 1992]
• Navigation [Schryver 1994]
• Multi-device computing environments: information ecosystems [None yet!]
Research Question 1
• RQ: What is the mental workload incurred by users in certain common tasks that were considered difficult in Phase I?
• Hypothesis: Subjective assessment of mental workload will be high in these tasks
• Experiment: Measure mental workload for several representative tasks performed in information ecosystems
Research Question 2
• RQ: Is a decrease in mental workload a factor that motivates changes in users’ information management strategies?
• Hypothesis: Users adopt strategies that will eventually lead to lowered mental workload
• Experiment: Compare mental workload for tasks identified as difficult, and for their respective workarounds
Research Question 3
• RQ: Are subjective assessments of mental workload an accurate indicator of operator performance in this domain?
• Hypothesis: Mental workload measured by NASA TLX (including existing dimensions, and possibly new dimensions) can be used to predict operator performance
• Experiment: (Attempt to) correlate workload assessments with operator performance; a sketch of this analysis follows
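As a concrete illustration of the planned correlation analysis, a minimal sketch in Python with entirely hypothetical numbers (the real data will come from the Phase II sessions):

```python
# Minimal sketch: correlate per-participant TLX scores with a direct
# performance measure (time on task). All data are hypothetical.
from statistics import mean, stdev

def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

tlx_scores = [32.0, 45.5, 58.0, 64.0, 71.5, 80.0]   # workload, 0-100
time_on_task = [95, 120, 160, 150, 210, 250]        # seconds

# A strong positive r (higher workload, slower completion) would
# support using TLX as a predictor of operator performance.
print(round(pearson_r(tlx_scores, time_on_task), 2))
```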
Experiment design
• Representative tasks from the content analysis of Phase I
• Identify devices, tasks, strategies, etc. and use these to give users benchmark tasks
• Measure mental workload
• Other benchmark tasks as well, to establish a baseline
Expected contributions
• Understanding users and how they use multiple devices to accomplish PIM
• Comparing workloads in different information ecosystems
• Formative feedback for designers
• Validating NASA TLX as an accurate predictor of task performance in information ecosystems
Schedule
[Gantt chart, May 2008 through September 2008:]
• Perform content analysis for Phase I
• Determine tasks
• Recruitment, IRB, scheduling of study
• Conduct experiments
• Perform analysis
• Write dissertation
• Prepare publications
Questions & comments
Note to self: Turn off audio recording before committee deliberation.
Thank you!
Supporting Slides
Mental workload and task performance
[Figure: task performance plotted as a function of mental workload; performance stays acceptable at low to moderate workload and degrades as workload grows. After O’Donnell and Eggemeier, 1986.]
Why NASA TLX
• Higher correlation with performance as compared to SWAT and WP [Rubio & Díaz, 2004]
• Validated in several environments since 1988 [several, 1988–present]
NASA TLX procedure
[Figure: the TLX rating sheet again, zoomed in on a single subscale (Frustration Level) to show its 20 steps from Very Low to Very High.]
[Figure: the pairwise-comparison step. For each pair of subscales (e.g., Mental Demand vs. another), the participant picks the one that contributed more to workload; tallies across all 15 pairs become the subscale weights. A sketch of the resulting computation follows.]
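Putting the two steps together, here is a minimal sketch of how the weighted TLX score is computed; the subscale keys and the participant data below are hypothetical:

```python
# Minimal sketch of weighted NASA TLX scoring: six subscale ratings
# (0-100) weighted by how often each subscale was chosen in the
# 15 pairwise comparisons. Data below are hypothetical.
SUBSCALES = ["mental", "physical", "temporal",
             "performance", "effort", "frustration"]

def tlx_score(ratings, wins):
    """Overall workload: weighted mean of the six subscale ratings."""
    assert sum(wins.values()) == 15  # 6 choose 2 = 15 comparisons
    return sum(ratings[s] * wins[s] for s in SUBSCALES) / 15.0

ratings = {"mental": 70, "physical": 10, "temporal": 55,
           "performance": 40, "effort": 65, "frustration": 80}
wins = {"mental": 4, "physical": 0, "temporal": 3,
        "performance": 2, "effort": 3, "frustration": 3}
print(tlx_score(ratings, wins))  # 64.0
```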
Quantitative analysis
[Bar chart: number of participants using particular sets of devices as a group. Devices in the charted groups include home desktop, laptop, cell phone, media player, work desktop, and PDA/cell phone; group sizes ranged from 52 participants down to 18 (52, 32, 29, 25, 24, 22, 20, 19, 18).]
Content analysis
• Techniques from [Neuendorf 2004, Krippendorff 2004]
• Inter-rater reliability with 2 additional coders (expected Cohen’s κ ≥ 0.6–0.7; a sketch of the κ computation follows this list)
• Purpose of content analysis is to design the experiment, not to draw conclusions
• Coding: a priori versus emergent
• Challenge: converging on representative tasks
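For reference, a minimal sketch of computing Cohen’s κ for two coders. The categories and code assignments below are hypothetical, chosen only to mirror the Device/Task/Problem coding scheme shown earlier:

```python
# Minimal sketch of Cohen's kappa for two coders assigning one
# categorical code per item. Data are hypothetical.
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement from each coder's marginal code frequencies.
    fa, fb = Counter(codes_a), Counter(codes_b)
    expected = sum(fa[c] * fb[c] for c in fa) / n ** 2
    return (observed - expected) / (1 - expected)

a = ["task", "problem", "device", "task", "task", "problem", "device", "task"]
b = ["task", "problem", "task", "task", "device", "problem", "device", "task"]
print(cohens_kappa(a, b))  # 0.6, at the low end of the target range
```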
Experimental setup
• Explain features
• Training period with example tasks
• Account for experience
• Stratified samples?
• Participant recruitment
• CHCI, CS@VT, CRC, Google (?)