Lecture 12 & 13
UX Goals and Metrics
Human Computer Interaction / COG3103, 2014 Fall
Class hours : Tue 1-3 pm/Thurs 12-1 pm
4 & 6 November
PLANNING
Tullis Chapter 3.
Lecture #12 COG_Human Computer Interaction 2
INTRODUCTION
• What are the goals of your usability study?
– Are you trying to ensure optimal usability for a new piece of functionality?
– Are you benchmarking the user experience for an existing product?
• What are the goals of users?
– Do users complete a task and then stop using the product?
– Do users use the product numerous times on a daily basis
• What is the appropriate evaluation method?
– How many participants are needed to get reliable feedback?
– How will collecting metrics impact the timeline and budget?
– How will the data be collected and analyzed?
STUDY GOALS
• How will the data be used within the product development lifecycle?
• Two general ways to use data
– Formative
– Summative
STUDY GOALS
FORMATIVE SUMMATIVE
Formative: like a chef who periodically checks a dish while it’s being prepared and makes adjustments to positively impact the end result.
Summative: like a restaurant critic who evaluates the dish after it is completed and compares the meal with other restaurants.
STUDY GOALS
• Formative Usability
– Evaluates a product or design, identifies shortcomings, makes recommendations
– Repeats the process
• Attributes
– Iterative nature of testing, with the goal of improving the design
– Done before the design has been finalized
• Key Questions
– What are the most significant usability issues that are preventing users from completing their goals or that are resulting in inefficiencies?
– What aspects of the product work well for users? What do they find frustrating?
– What are the most common errors or mistakes users are making?
– Are improvements being made from one design iteration to the next?
– What usability issues can you expect to remain after the product is launched?
STUDY GOALS
• Summative Usability
– Goal is to evaluate how well a product or piece of functionality meets its objectives
– Comparing several products to each other
– Focus on evaluating against a certain set of criteria
• Key Questions
– Did we meet the usability goals of the project?
– How does our product compare against the competition?
– Have we made improvements from one product release to the next?
USER GOALS
• Need to know about users and what they are trying to accomplish
– Are they forced to use the product every day as part of their jobs?
– Are they likely to use the product only once or twice?
– Is the product a source of entertainment?
– Does the user care about design aesthetics?
• Simplifies to two main aspects of the user experience
– Performance
– Satisfaction
USER GOALS
• Performance
– What the user does in interacting with the product
• Metrics (more in Ch 4)
– Degree of success in accomplishing a task or set of tasks
– Time to perform each task
– Amount of effort to perform a task
• Number of mouse clicks
• Cognitive effort
• Important in products where users have no choice in how they are used
– If users can’t successfully complete key tasks, the product will fail
USER GOALS
• Satisfaction
– What users say or think about their interaction
• Metrics (more in Ch 6)
– Ease of use
– Exceeds expectations
– Visually appealing
– Trustworthy
• Important in products where users have a choice in usage
STUDY DETAILS
• Budgets and Timelines
– Difficult to provide cost or time estimates for any particular type of study
• General rules of thumb
– Formative study
• Small number of participants (≤10)
• Little impact on budget or timeline
– Lab setting with a larger number of participants (>12)
• Most significant cost: recruiting and compensating participants
• Time required to run tests
• Additional cost for usability specialists
• Time to clean up and analyze data
– Online study
• Half of the time is spent setting up the study
• Running an online study requires little if any time from a usability specialist
• The other half of the time is spent cleaning up and analyzing data
• 100–200 person-hours overall (with roughly 50% variation)
STUDY DETAILS
• Evaluation Methods
– Not restricted to a certain type of method (lab test vs. online test)
– Choose a method based on how many participants you have and what metrics you want to use
• Lab test with a small number of participants
– One-on-one session between moderator and participant
– Participant thinks aloud; moderator notes participant behavior and responses to questions
– Metrics to collect
• Issue-based metrics: issue frequency, type, severity
• Performance metrics: task success, errors, efficiency
• Self-reported metrics: answers to questions regarding each task at the end of the study
• Caution
– Easy to overgeneralize performance and self-reported metrics without an adequate sample size
STUDY DETAILS
• Evaluation Methods (continued)
• Lab test with a larger number of participants
– Able to collect a wider range of data because increased sample size means increased confidence in the data
• All performance, self-reported, and physiological metrics are fair game
– Caution
• Inferring website traffic patterns from usability lab data is not very reliable
• Neither is looking at how subtle design changes impact the user experience
• Online studies
– Testing with many participants at the same time
– Excellent way to collect a lot of data in a short time
– Able to collect many performance and self-reported metrics, and to test subtle design changes
– Caution
• Difficult to collect issue-based data; can’t directly observe participants
• Good for software or website testing; difficult to test consumer electronics
STUDY DETAILS
• Participants
– Have a major impact on findings
• Recruiting issues
– Identifying the recruiting criteria that determine whether a participant is eligible for the study
• How to segment users
– How many users are needed
• Diversity of the user population
• Complexity of the product
• Specific goals of the study
– Recruiting strategy
• Generate a list from customer data
• Send requests via email distribution lists
• Use a third party
• Post an announcement on a website
STUDY DETAILS
• Data Collection
– Plan how you will capture the data needed for the study
– Has a significant impact on how much work comes later when analysis begins
• Lab test with a small number of participants
– Excel works well
– Have a template in place for quickly capturing data during testing
– Enter data in numeric format as much as possible
• 1 = success
• 0 = failure
– Everyone should know the coding scheme extremely well
• If someone flips the scales or doesn’t understand what to enter, you must throw out or recode the data
• Larger studies
– Use a data capture tool
– Helpful to have the option to download raw data into Excel
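The coding-scheme advice above can be sketched in a few lines. This is only an illustration: the participant records and the `success` field name are invented, but the idea of validating codes against the agreed scheme before analysis is the one the text recommends.

```python
# Hypothetical capture rows exported from the Excel template:
# one row per participant/task, success coded "1" or "0".
rows = [
    {"participant": "P1", "task": "T1", "success": "1"},
    {"participant": "P2", "task": "T1", "success": "0"},
    {"participant": "P3", "task": "T1", "success": "yes"},  # coding mistake
]

VALID_CODES = {"0", "1"}

def invalid_rows(rows):
    """Return rows whose success code violates the agreed scheme."""
    return [r for r in rows if r["success"] not in VALID_CODES]

for r in invalid_rows(rows):
    print(f"{r['participant']}/{r['task']}: bad code {r['success']!r}")
```

Running a check like this immediately after each session catches a flipped or misunderstood code while the observer still remembers what happened, instead of forcing you to throw the data out later.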
STUDY DETAILS
• Data Cleanup
– Data are rarely in a format that is instantly ready to analyze
– Can take anywhere from one hour to a couple of weeks
• Cleanup tasks
– Filtering data
• Check for extreme values (e.g., task completion times)
• Some participants leave in the middle of the study, so their times are unusually large
• Impossibly short times may indicate a user who was not truly engaged in the study
• Remove results from users who are not in the target population
– Creating new variables
• Building on the raw data is useful
• May create a top-2-box variable for self-reported scales
• Aggregate an overall success average representing all tasks
• Create an overall usability score
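As a sketch of these cleanup steps: filter implausible completion times, then derive a top-2-box variable from a 1–5 self-reported scale. The times, ratings, and plausibility cutoffs below are invented for illustration and would be tuned per study.

```python
# Invented session data: task times in seconds and 1-5 ease-of-use ratings.
times = [42, 55, 3, 61, 900, 48]
ratings = [5, 4, 2, 5, 3, 4]

# Assumed plausibility cutoffs for this task (tune per study).
MIN_PLAUSIBLE, MAX_PLAUSIBLE = 10, 600

# Filtering: drop impossibly short times and abandoned-session outliers.
clean_times = [t for t in times if MIN_PLAUSIBLE <= t <= MAX_PLAUSIBLE]

# Top-2-box: share of ratings in the top two scale points (4 or 5).
top2 = sum(1 for r in ratings if r >= 4) / len(ratings)

print(clean_times, round(top2, 2))
```

Keeping the cutoffs as named constants documents the filtering decision, which matters when someone later asks why certain records were excluded.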
STUDY DETAILS
• Cleanup tasks (continued)
– Verifying responses
• Notice a large percentage of participants giving the same wrong answer
• Check why this happens
– Checking consistency
• Make sure data were captured properly
• Check task completion times and success against self-reported metrics (e.g., a task completed fast but given a low rating)
– Data captured incorrectly
– Participant confused the scales of the question
– Transferring data
• Capture and clean up data in Excel, then use another program to run statistics, then move back to Excel to create charts and graphs
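A consistency check like the “completed fast but low rating” case can be automated. The record layout and the thresholds here are hypothetical; flagged records are for manual review (flipped scale? capture error?), not automatic exclusion.

```python
# Invented records: time on task (seconds) and a 1-7 satisfaction rating.
records = [
    {"participant": "P1", "time_s": 35, "rating": 7},
    {"participant": "P2", "time_s": 30, "rating": 1},   # fast but unhappy?
    {"participant": "P3", "time_s": 240, "rating": 2},  # slow and unhappy: plausible
]

FAST_S = 60   # at or under this counts as "completed fast"
LOW = 3       # at or under this counts as a "low" rating

# Fast completion paired with a low rating may mean the participant
# confused the scale direction or the data were captured incorrectly.
suspicious = [r for r in records if r["time_s"] <= FAST_S and r["rating"] <= LOW]

for r in suspicious:
    print(f"{r['participant']}: check scale direction / data capture")
```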
SUMMARY
• Formative vs. summative approach
– Formative: collecting data to help improve the design before it is launched or released
– Summative: measuring the extent to which certain target goals were achieved
• When deciding on the most appropriate metrics, take into account two main aspects of the user experience: performance and satisfaction
– Performance metrics characterize what the user does
– Satisfaction metrics relate to what users think or feel about their experience
• Budgets and timelines need to be planned well in advance when running any usability study
• Three general types of evaluation methods are used to collect usability data
– Lab tests with a small number of participants
• Best for formative testing
– Lab tests with a larger number of participants (>12)
• Best for capturing a combination of qualitative and quantitative data
– Online studies with a very large number of participants (>100)
• Best for examining subtle design changes and preferences
SUMMARY
• Clearly identify criteria for recruiting participants
– Truly representative of the target group
– Formative
• 6 to 8 users for each iteration is enough
• If there are distinct groups, it is helpful to have four from each group
– Summative
• 50 to 100 representative users
• Plan how you are going to capture all the data needed
– Have a template for quickly capturing data during the test
– Make sure everyone is familiar with the coding conventions
• Data cleanup
– Manipulating the data in a way that makes them usable and reliable
– Filtering removes extreme values or records that are problematic
– Consistency checks and verifying responses make sure participants’ intentions map to their responses
UX GOALS, METRICS, AND TARGETS
Hartson Chapter 10.
INTRODUCTION
Figure 10-1 You are here; the chapter on UX goals, metrics, and targets in the context of the overall Wheel lifecycle template.
UX GOALS
• Example: User Experience Goals for Ticket Kiosk System
– We can define the primary high-level UX goals for the ticket buyer to include:
• Fast and easy walk-up-and-use user experience, with absolutely no user training
• Fast learning so new user performance (after limited experience) is on par with that of an experienced user [from AB-4-8]
• High customer satisfaction leading to high rate of repeat customers [from BC-6-16]
– Some other possibilities:
• High learnability for more advanced tasks [from BB-1-5]
• Draw, engagement, attraction
• Low error rate for completing transactions correctly, especially in the interaction for payment [from CG-13-17]
UX TARGET TABLES
Table 10-1 Our UX target table, as evolved from the Whiteside, Bennett, and Holtzblatt (1988) usability specification table
WORK ROLES, USER CLASSES, AND UX GOALS
Work Role: User Class | UX Goal
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user
Table 10-2 Choosing a work role, user class, and UX goal for a UX target
UX MEASURES
• Objective UX measures (directly measurable by evaluators)
– Initial performance
– Long-term performance (longitudinal, experienced, steady state)
– Learnability
– Retainability
– Advanced feature usage
• Subjective UX measures (based on user opinions)
– First impression (initial opinion, initial satisfaction)
– Long-term (longitudinal) user satisfaction
MEASURING INSTRUMENTS
Work Role: User Class | UX Goal | UX Measure
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression
Table 10-3 Choosing initial performance and first impression as UX measures
MEASURING INSTRUMENTS
• Benchmark Tasks
– Address designer questions with benchmark tasks and UX targets
– Selecting benchmark tasks
• Create benchmark tasks for a representative spectrum of user tasks.
• Start with short and easy tasks and then increase difficulty progressively.
• Include some navigation where appropriate.
• Avoid large amounts of typing (unless typing skill is being evaluated).
• Match the benchmark task to the UX measure.
• Adapt scenarios already developed for design.
• Use tasks in realistic combinations to evaluate task flow.
MEASURING INSTRUMENTS
• Do not forget to evaluate with your power users.
• To evaluate error recovery, a benchmark task can begin in an error state.
• Consider tasks to evaluate performance in “degraded modes” due to partial equipment failure.
• Do not try to make a benchmark task for everything.
– Constructing benchmark task content
• Remove any ambiguities with clear, precise, specific, and repeatable instructions.
• Tell the user what task to do, but not how to do it.
• Do not use words in benchmark tasks that appear specifically in the interaction design.
MEASURING INSTRUMENTS
• Use work context and usage-centered wording, not system-oriented wording.
• Have clear start and end points for timing.
• Keep some mystery in it for the user.
• Annotate situations where evaluators must ensure pre-conditions for running benchmark tasks.
• Use “rubrics” for special instructions to evaluators.
• Put each benchmark task on a separate sheet of paper.
• Write a “task script” for each benchmark task.
MEASURING INSTRUMENTS
Work Role: User Class | UX Goal | UX Measure | Measuring Instrument
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression |
Table 10-4 Choosing “buy special event ticket” benchmark task as measuring instrument for “initial performance” UX measure in first UX target
MEASURING INSTRUMENTS
Work Role: User Class | UX Goal | UX Measure | Measuring Instrument
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT2: Buy movie ticket
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression |
Table 10-5 Choosing “buy movie ticket” benchmark task as measuring instrument for second initial performance UX measure
MEASURING INSTRUMENTS
– How many benchmark tasks and UX targets do you need?
– Ensure ecological validity: as you write your benchmark task descriptions, ask how the setting can be made more realistic
• What are the constraints in the user or work context?
• Does the task involve more than one person or role?
• Does the task require a telephone or other physical props?
• Does the task involve background noise?
• Does the task involve interference or interruption?
• Does the user have to deal with multiple simultaneous inputs, for example, multiple audio feeds through headsets?
MEASURING INSTRUMENTS
Work Role: User Class | UX Goal | UX Measure | Measuring Instrument
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT2: Buy movie ticket
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | Questions Q1–Q10 in the QUIS questionnaire
Table 10-6 Choosing questionnaire as measuring instrument for first-impression UX measure
MEASURING INSTRUMENTS
UX Goal | UX Measure | Measuring Instrument
Ease of first-time use | Initial performance | Time on task
Ease of learning | Learnability | Time on task or error rate, after a given amount of use and compared with initial performance
High performance for experienced users | Long-term performance | Time and error rates
Low error rates | Error-related performance | Error rates
Error avoidance in safety-critical tasks | Task-specific error performance | Error count, with strict target levels (much more important than time on task)
Error recovery performance | Task-specific time performance | Time on recovery portion of the task
Overall user satisfaction | User satisfaction | Average score on questionnaire
User attraction to product | User opinion of attractiveness | Average score on questionnaire, with questions focused on the effectiveness of the “draw” factor
Quality of user experience | User opinion of overall experience | Average score on questionnaire, with questions focused on the quality of the overall user experience, including specific points about your product that might be associated most closely with emotional impact factors
Overall user satisfaction | User satisfaction | Average score on questionnaire, with questions focusing on willingness to be a repeat customer and to recommend the product to others
Continuing ability of users to perform without relearning | Retainability | Time on task and error rates, re-evaluated after a period of time off (e.g., a week)
Avoid having the user walk away in dissatisfaction | User satisfaction, especially initial satisfaction | Average score on questionnaire, with questions focusing on initial impressions and satisfaction
Table 10-7 Close connections among UX goals, UX measures, and measuring instruments
UX METRICS
Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket | Average time on task
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT2: Buy movie ticket | Average number of errors
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | Questions Q1–Q10 in the QUIS questionnaire | Average rating across users and across questions
Table 10-8 Choosing UX metrics for UX measures
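The “average rating across users and across questions” metric is just a grand mean over the participant-by-question matrix. The ratings below are invented, and only three questions stand in for the questionnaire items:

```python
# Invented ratings: one row per participant, one column per question,
# on a 10-point scale (three questions shown for brevity).
ratings = [
    [8, 7, 9],   # participant 1
    [6, 7, 8],   # participant 2
    [9, 8, 7],   # participant 3
]

# Flatten across users and questions, then take the grand mean.
all_scores = [score for row in ratings for score in row]
grand_mean = sum(all_scores) / len(all_scores)
print(round(grand_mean, 2))
```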
SETTING LEVELS
Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric | Baseline Level
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket | Average time on task | 3 minutes
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT2: Buy movie ticket | Average number of errors | <1
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | Questions Q1–Q10 in questionnaire XYZ | Average rating across users and across questions | 7.5/10
Table 10-9 Setting baseline levels for UX measures
SETTING LEVELS
Work Role: User Class | UX Goal | UX Measure | Measuring Instrument | UX Metric | Baseline Level | Target Level
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT1: Buy special event ticket | Average time on task | 3 min, as measured at the MUTTS ticket counter | 2.5 min
Ticket buyer: Casual new user, for occasional personal use | Walk-up ease of use for new user | Initial user performance | BT2: Buy movie ticket | Average number of errors | <1 | <1
Ticket buyer: Casual new user, for occasional personal use | Initial customer satisfaction | First impression | Questions Q1–Q10 in questionnaire XYZ | Average rating across users and across questions | 7.5/10 | 8/10
Ticket buyer: Frequent music patron | Accuracy | Experienced usage error rate | BT3: Buy concert ticket | Average number of errors | <1 | <1
Casual public ticket buyer | Walk-up ease of use for new user | Initial user performance | BT4: Buy Monster Truck Pull tickets | Average time on task | 5 min (online system) | 2.5 min
Casual public ticket buyer | Walk-up ease of use for new user | Initial user performance | BT4: Buy Monster Truck Pull tickets | Average number of errors | <1 | <1
Casual public ticket buyer | Initial customer satisfaction | First impression | QUIS questions 4–7, 10, 13 | Average rating across users and across questions | 6/10 | 8/10
Casual public ticket buyer | Walk-up ease of use for user with a little experience | Just post-initial performance | BT5: Buy Almost Famous movie tickets | Average time on task | 5 min (including review) | 2 min
Casual public ticket buyer | Walk-up ease of use for user with a little experience | Just post-initial performance | BT6: Buy Ben Harper concert tickets | Average number of errors | <1 | <1
Table 10-10 Setting target levels for UX metrics
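Once baseline and target levels are in the table, checking an observed UX metric against them is mechanical. The session times below are invented, the levels echo the first row of the table, and the statistic is simplified to a plain mean (a real analysis would also report a confidence interval):

```python
# Invented observed times (minutes) for a "buy special event ticket" task.
observed_times_min = [2.0, 2.5, 2.0, 3.0, 2.5]

BASELINE_MIN = 3.0  # baseline level, e.g. as measured at a ticket counter
TARGET_MIN = 2.5    # target level from the UX target table

mean_time = sum(observed_times_min) / len(observed_times_min)

# Lower is better for time on task, so "meets" means at or under the level.
meets_baseline = mean_time <= BASELINE_MIN
meets_target = mean_time <= TARGET_MIN
print(round(mean_time, 2), meets_baseline, meets_target)
```

The same comparison shape works for error counts and questionnaire scores; only the direction of “better” changes (ratings must meet or exceed the level).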
PRACTICAL TIPS AND CAUTIONS FOR CREATING UX TARGETS
• Are user classes for each work role specified clearly enough?
– Have you taken into account potential trade-offs among user groups?
– Are the values for the various levels reasonable?
– Be prepared to adjust your target level values, based on initial observed results
– Remember that the target level values are averages.
– How well do the UX measures capture the UX goals for the design?
– What if the design is in its early stages and you know the design will change significantly in the next version, anyway?
– What about UX goals, metrics, and targets for usefulness and emotional impact?
Exercise 10-2: Creating Benchmark Tasks and UX Targets for Your System
• Goal
– To gain experience in writing effective benchmark tasks and measurable UX targets.
• Activities
– We have shown you a rather complete set of examples of benchmark tasks and UX targets for the Ticket Kiosk System. Your job is to do something similar for the system of your choice.
– Begin by identifying which work roles and user classes you are targeting in evaluation (a brief description is enough).
– Write three or more UX target table entries (rows), including your choices for each column. Have at least two UX targets based on a benchmark task and at least one based on a questionnaire.
– Create and write up a set of about three benchmark tasks to go with the UX targets in the table.
• Do NOT make the tasks too easy.
• Make tasks increasingly complex.
• Include some navigation.
• Create tasks that you can later “implement” in your low-fidelity rapid prototype.
• The expected average performance time for each task should be no more than about 3 minutes, just to keep it short and simple for you during evaluation.
– Include the questionnaire question numbers in the measuring instrument column of the appropriate UX target.
Exercise 10-2: Creating Benchmark Tasks and UX Targets for Your System
• Cautions and hints:
– Do not spend any time on design in this exercise; there will be time for detailed design in the next exercise.
– Do not plan to give users any training.
• Deliverables:
– Two user benchmark tasks, each on a separate sheet of paper.
– Three or more UX targets entered into a blank UX target table on your laptop or on paper.
– If you are doing this exercise in a classroom environment, finish up by reading your benchmark tasks to the class for critique and discussion.
• Schedule
– Work efficiently and complete in about an hour and a half.