An Introduction to Usability Testing Bill Killam, MA CHFP Adjunct Professor University of Maryland [email protected] User-Centered Design www.user-centereddesign.com

An Introduction to Usability Testing

Feb 25, 2016


An Introduction to Usability Testing. Bill Killam, MA CHFP Adjunct Professor University of Maryland [email protected]. Background. Origin of the Species. - PowerPoint PPT Presentation
Transcript
Page 1: An Introduction to Usability Testing

An Introduction to Usability Testing

Bill Killam, MA CHFP

Adjunct Professor

University of Maryland

[email protected]

User-Centered Design www.user-centereddesign.com

Page 2: An Introduction to Usability Testing

Background


Page 3: An Introduction to Usability Testing


Origin of the Species

“Usability testing” is the common name for multiple forms of both user-based and non-user-based system evaluation focused on a specific aspect of the design

Done for many, many years prior, but popularized in the media by Jakob Nielsen in the 1990s

Page 4: An Introduction to Usability Testing


What does “usability” mean?

ISO 9126 – “A set of attributes that bear on the effort needed for use, and on the individual assessment of such use, by a stated or implied set of users”

ISO 9241 – “Extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.”

Page 5: An Introduction to Usability Testing


The Ontology of “Usability”

Accessibility
– A precursor to usability: if users cannot gain access to the product, its usability is a moot point
Functional Suitability
– Does the product contain the functionality required by the user?
Functional Discoverability
– Can the user “discover” the functions of a product?
Ease-of-learning
– Can the user figure out how to exercise the functionality provided?
Ease-of-use
– Can the user exercise the functionality accurately and efficiently once it’s learned (includes accessibility issues)?
– Can users use it safely?
Ease-of-recall
– Can the knowledge of operation be easily maintained over time?
Safety
– Can the user operate the system in relative safety, and recover from errors?
Subjective Preference
– Do users like using it?

Page 6: An Introduction to Usability Testing

Usability, Organizations,

and Process


Page 7: An Introduction to Usability Testing

Thought From CHI ’92

The 1970s, when Hardware is King
– 1950s – it’s an art
– 1960s – there are degrees
– 1970s – they’re in management
The 1980s, when Software is King
– 1960s – it’s an art
– 1970s – there are degrees
– 1980s – they’re in management
The 1990s, when Interaction will be King
– 1970s – it’s an art
– 1980s – there are degrees
– 1990s – they’re in management

Page 8: An Introduction to Usability Testing

Processes

System Development Models
– Waterfall
– Spiral
– V-Model
Software Development Models
– Dynamic System Development Process (DSDP)
– Joint Application Development Process (JAD) (circa 1970)
– Structured Systems Analysis and Design Methodology (SSADM) (circa 1980)
– Information Requirement Analysis/Soft System (circa 1980)
– Object Oriented Programming (origins in 1960, but a common methodology in the 1990s)
– Rapid Application Development (circa 1991)*
– Agile*
  • Extreme Programming (circa 1990)
  • SCRUM

Page 9: An Introduction to Usability Testing

Processes (concluded)

Interface Design Models
– User-Centered Design (the common term)
– Star (Hartson & Hix, 1989)
– LUCID (Cognetics, 2008)
– ISO 13407/ISO 9241
– Design Thinking (aka Human Centered Design) (IDEO)
Characteristics of a User-Centered Design Process
– Design is a separate activity, distinct from development
– Design should occur, completely, before development begins
– Feedback is needed at many steps in the design process to…
  • Confirm the direction of design
  • Evaluate alternatives
– User-Centered Design techniques can also be used to test the outcome (the final product) under the correct conditions

Page 10: An Introduction to Usability Testing


Corporate Organization Structure

[Organization chart: C-Level Management (CEO, CFO, CIO, CTO, CPO) at the top; beneath it Sales, Marketing, Product Design & Development, Training, Field Services, and R&D]

Page 11: An Introduction to Usability Testing


Product Design & Development

[Organization chart: Systems Engineer (Management) over a Design Team and a Development Team. Boxes shown: Mechanical Engineering, Electrical Engineering, Software Engineering & Web Development, Visual Design, Technical Writers, Interaction Design, Industrial Design, Test & Evaluation, R&D]

Page 12: An Introduction to Usability Testing

Testing Methods

Part 1: Non-User Based Testing

Page 13: An Introduction to Usability Testing


Compliance Testing

The Spelling and Grammar checker of usability testing

Possible (within limits) to be performed by anyone

Can remove the low level usability issues that often mask more significant usability issues


Page 14: An Introduction to Usability Testing

Compliance Testing (concluded)

Style Guide-based Testing
– Checklists
– Interpretation Issues
– Scope Limitations
Available Standards
– Commercial GUI & Web Standards and Style Guides
– Domain Specific GUI & Web Standards and Style Guides
– Internal Standards and Style Guides
Interface Specification Testing*

*Special Case of QC Testing that assumes a usable design to start with

Page 15: An Introduction to Usability Testing


Expert Review

Aka: Heuristic Evaluation
One or more usability experts review a product, application, etc.
Free format review or structured review
Subjective but based on sound usability and design principles
Highly dependent on the qualifications of the reviewer(s)

Page 16: An Introduction to Usability Testing


Expert Review (Concluded)

Nielsen’s 10 Most Common Mistakes Made by Web Developers (three versions)
Shneiderman’s 8 Golden Rules
Constantine & Lockwood Heuristics
Forrester Group Heuristics
Norman’s 4 Principles of Usability

Page 17: An Introduction to Usability Testing

1st Heuristic

Functional discoverability through obvious interactive elements and adequate feedback

Page 18: An Introduction to Usability Testing


Page 19: An Introduction to Usability Testing


Page 20: An Introduction to Usability Testing

2nd Heuristic

A good, complete, and unambiguous cognitive (or conceptual) model to predict the effects of our actions

Page 21: An Introduction to Usability Testing


Page 22: An Introduction to Usability Testing


Page 23: An Introduction to Usability Testing


Page 24: An Introduction to Usability Testing


Page 25: An Introduction to Usability Testing


Cognitive Models

We all develop cognitive models
– They may not be complete
– They may be inconsistent
– They may be self-contradicting
– They are not always correct
We don’t necessarily invest in maintaining our cognitive models

Page 26: An Introduction to Usability Testing

Conceptual Model Issues - Tabs


Page 27: An Introduction to Usability Testing

Conceptual Model Issues


Page 28: An Introduction to Usability Testing

Conceptual Model Issues - Tabs


Page 29: An Introduction to Usability Testing

Conceptual Model Issues


Page 30: An Introduction to Usability Testing

3rd Heuristic

Design for the intended users (and not yourself)

Page 31: An Introduction to Usability Testing


Page 32: An Introduction to Usability Testing


Page 33: An Introduction to Usability Testing


1131 SAN 0820+1 LGW AA 2734 FCYBM D10 1

AA 2734 CHG PLANE AT DFW

X12 1805 SAN 1425+1 LGW BA 284 FJMSB D10 1

2100 SAN 2030+1 LHR TW 702 FCYBQ * 2

TW 702 EQUIPMENT 767 LAX L-10

Page 34: An Introduction to Usability Testing


[Figure: the same three flights drawn as bars on a timeline running from midnight to midnight (6:00 AM, Noon, 6:00 PM marked), with local London times in parentheses: TWA 702 to Heathrow, BA 284 to Gatwick, AA 2734 to Gatwick]

Page 35: An Introduction to Usability Testing

4th Heuristic

Design for Errors (Slips)

Page 36: An Introduction to Usability Testing

Error versus Slip

Errors are generated by a lack of understanding or a lack of sufficient or correct information
– Lack of sufficient or correct information is the responsibility of the designer in the presentation layer of an interface
– Lack of understanding is the responsibility of the designer in the interaction and in the conceptual model of an interface
– Errors are often undetectable by the end user
Slips are common user issues
– Hand/eye coordination or basic control of our psychomotor systems
– Exacerbated by distraction, speed, attention overload
– Unavoidable by design but need to be anticipated and addressed by the designer

Page 37: An Introduction to Usability Testing

Others

Cognitive Walkthrough
– Specific review to ensure the correct information is available for the task being performed
– Also low cost usability testing
– Highly dependent on the qualifications of the reviewer(s)
Pluralistic Walkthrough
– Team Approach
– Best if a diverse population of reviewers
– Issues related to cognition (understanding) more than presentation
– Also low cost usability testing
– Highly dependent on the qualifications of the reviewer(s)

Page 38: An Introduction to Usability Testing

Testing Methods

Part 2: User Based Testing

Page 39: An Introduction to Usability Testing

Statistics: A Primer


Page 40: An Introduction to Usability Testing

Some Principles

Research is used to test a hypothesis based on a theory
– Smoking increases the likelihood of developing cancer
Testing is used to support a decision
– For example, “this design change is going to be better for users”, or “design A is better than design B”
Statistics are used to provide a way to relate the small sample tested to the larger population, but small is a relative term
– 25-30 is considered minimal before you see regression to the mean
Statistical analysis assumes the data obtained is valid and reliable

Page 41: An Introduction to Usability Testing

Validity

Validity is the degree to which the results of a research study provide trustworthy information about the truth or falsity of the hypothesis*

Internal validity refers to the situation where the “experimental treatments make a difference in this specific experimental instance” (from Campbell, D.T. & Stanley, J.C. (1963), Experimental and Quasi-experimental Designs for Research)

External validity asks the question of “generalizability”


*Cherulnik, P.D. 2001. Methods for Behavioural Research: A Systematic Approach

Page 42: An Introduction to Usability Testing

Reliability

Reliability is the ability of a test to show the same results if conducted multiple times
– Test-retest reliability
– Repeatability
– Reproducibility

Page 43: An Introduction to Usability Testing

Use of Confidence Intervals

When working with small samples, confidence intervals provide a way to represent uncertainty in test results

Since each sample and each test is different, the confidence level tells the informed reader the likelihood that another sample will provide the same results. (In other words, if you ran the test again, what value are you likely to get next time?)

Typical confidence intervals in research include the 90% or 95% confidence interval. Behavioural research often uses an 80% confidence interval.

Page 44: An Introduction to Usability Testing

Use of Confidence Intervals (continued)

“You just finished a usability test. You had 5 participants attempt a task in a new version of your software. All 5 out of 5 participants completed the task. You rush excitedly to tell your manager the results so you can communicate this success to the development team. Your manager asks, ‘OK, this is great with 5 users, but what are the chances that 50 or 1000 will have a 100% completion rate?’ ” - Jeff Sauro

The confidence level tells the informed reader the likelihood that another sample will provide the same results. In other words, if you ran the test again, what value are you likely to get next time?
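As a rough illustration of the manager's question: when every participant succeeds, the exact (Clopper-Pearson) two-sided lower bound on the true completion rate has a closed form, (alpha/2)^(1/n). The sketch below assumes a 95% confidence level and that the conditions for validity and reliability hold. The function name is hypothetical, and this is one common method, not necessarily the one the presenter or Sauro had in mind.

```python
def all_success_lower_bound(n, confidence=0.95):
    """Exact two-sided lower bound on the completion rate when n of n succeed.

    Assumes independent participants drawn at random from the user population.
    """
    alpha = 1.0 - confidence
    return (alpha / 2.0) ** (1.0 / n)


# How low could the true completion rate plausibly be after a perfect round?
for n in (5, 8, 20):
    print(f"{n} of {n} succeeded: true rate could be as low as "
          f"{all_success_lower_bound(n):.0%}")
```

With 5 of 5 successes the lower bound comes out near 48%, so a perfect small round is still compatible with a true completion rate far below 100%.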


Page 45: An Introduction to Usability Testing

Use of Confidence Intervals (continued)

Usability testing is typically done with very few people per round
– Nielsen says 5 (but not for the right reason)
– Krug says 2 or 3 (also not for the right reason)
– 3 per user group, profile, or persona is considered a minimum by convention and ISO standard, a day consisting of about 8-9 people
You could do statistical analysis on the results of a typical usability test if…
– Your test was valid and reliable
– You had truly random sampling
– You did not interfere with performance during testing

Page 46: An Introduction to Usability Testing

Use of Confidence Intervals (concluded)

Confidence intervals when testing with, say, 8 people range in width from 37% (0 out of 8 or 8 out of 8) to between 50%-70% (all other values)
– For example, if 6 out of 8 people successfully completed a task in your test, you can only predict that somewhere between 20% and 97% of all people would complete the task (assuming all conditions for validity and reliability have been met)
– If you want to confidently state, based on your testing, that 9 out of 10 people will be able to successfully complete a task, and all conditions needed for validity and reliability have been met, you need to test 430 people and 400 of them have to successfully complete the task
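To make interval widths of this kind concrete, here is a minimal sketch of an exact (Clopper-Pearson) binomial confidence interval, written with the standard library only. The choice of method, the 95% level, and all function names are assumptions for illustration; the figures it prints depend on those choices and need not match every number quoted on the slide.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target):
    """Bisection for f(p) == target, where f decreases as p goes from 0 to 1."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, confidence=0.95):
    """Exact two-sided confidence interval for a completion rate."""
    alpha = 1 - confidence
    if successes == 0:
        lower = 0.0
    else:
        # Smallest p for which seeing this many successes or more is plausible.
        lower = _solve_decreasing(
            lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2)
    if successes == n:
        upper = 1.0
    else:
        # Largest p for which seeing this few successes or fewer is plausible.
        upper = _solve_decreasing(
            lambda p: binom_cdf(successes, n, p), alpha / 2)
    return lower, upper


for k in (8, 6, 4):
    low, high = clopper_pearson(k, 8)
    print(f"{k}/8 completed: {low:.0%} to {high:.0%} (width {high - low:.0%})")
```

Under these assumptions the 8-of-8 case gives an interval about 37 percentage points wide, and the mixed outcomes give considerably wider ones.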


Page 47: An Introduction to Usability Testing

The Psychology of Usability


Page 48: An Introduction to Usability Testing

Attention

• Highly Limited
• Attenuator Model
• Switching Model
• But “attention” here means conscious attention; we also have non-conscious attention

Page 49: An Introduction to Usability Testing

Test Your Attention


Page 50: An Introduction to Usability Testing


Page 52: An Introduction to Usability Testing

Non Conscious Attention

• The car versus elephant analogy
• Accounts for the vast majority of decision making
• Efficient (lazy)

Page 53: An Introduction to Usability Testing

7H15 M3554G3 53RV35 70 PROV3 H0W 0UR M1ND5 C4N D0 4M4Z1NG 7H1NG5! 1MPR3551V3 7H1NG5! 1N 7H3 B3G1NN1NG 17 WA5 H4RD BU7 N0W, 0N 7H15 L1N3 Y0UR M1ND 1S R34D1NG 17 4U70M471C4LLY W17H 0U7 3V3N 7H1NK1NG 4B0U7 17


Page 54: An Introduction to Usability Testing


Page 55: An Introduction to Usability Testing

Orange Yellow Green Black Blue Red

Page 56: An Introduction to Usability Testing

FINISHED FILES ARE THE RE-SULT OF YEARS OF SCIENTIF-IC STUDY COMBINED WITH THE EXPERIENCE OF MANY YEARS


Page 57: An Introduction to Usability Testing


Page 58: An Introduction to Usability Testing

Awareness is not needed to function...

...and it's a good thing, based on the limits of our awareness

How many times have you found yourself thinking about something in the morning while taking your shower and forgotten whether you actually washed your hair?

If you are in a car singing along with the radio and you get distracted thinking about some topic, you may not recall that you continued to sing, but others around you can attest to the fact that you did, indeed, sing and didn't go blank or babble.

Similarly, the reverse is true: you can read a passage, get distracted, and feel you need to reread the passage to learn it. But research has shown that facts get through even though you're not conscious of it.

Page 59: An Introduction to Usability Testing

CRT

A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?

If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets?

In a lake, there is a patch of lily pads. Every day, the patch doubles in size.  If it takes 48 days for the patch to cover the entire lake, how long would it  take for the patch to cover half of the lake?  
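For readers who want to check their own answers, the arithmetic behind the three puzzles can be written out as a short sketch. The variable names are illustrative only; the intuitive answers (10 cents, 100 minutes, 24 days) are the ones the slower, deliberate calculation below rules out.

```python
# 1. Bat and ball, in cents to avoid floating-point noise:
#    ball + (ball + 100) = 110  ->  ball = 5
ball_cents = (110 - 100) // 2
bat_cents = ball_cents + 100
assert ball_cents + bat_cents == 110

# 2. Machines: 5 machines make 5 widgets in 5 minutes, so one machine
#    makes one widget in 5 minutes; 100 machines working in parallel
#    make 100 widgets in the same 5 minutes.
minutes_per_widget = 5
machines, widgets = 100, 100
minutes_needed = minutes_per_widget * widgets // machines

# 3. Lily pads: the patch doubles daily, so it covered half the lake
#    one day before it covered all of it.
days_to_half = 48 - 1

print(ball_cents, "cents;", minutes_needed, "minutes;", days_to_half, "days")
```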


Page 60: An Introduction to Usability Testing

Humans attempt to avoid mental effort, often resulting in errors of judgment and calculation. However, level 2 processing can be activated. Example: In an experiment, a set of puzzles (the Cognitive Reflection Test) was presented to students at Princeton. When the fonts and representation were simple, 90% of the participants made an error on at least one of the three problems. When the font was muddled and it was hard to read, error rates dropped to 35%.

Page 61: An Introduction to Usability Testing

The Anchor and Adjustment Bias

Level 1 thinking wants to work "efficiently" (i.e., with as little effort as possible). Given an anchor point, it will expend an amount of energy to adjust it. But it doesn't care how realistic the anchor point is. That requires level 2 thinking. So, a low anchor point will be adjusted a bit up, and a high anchor point will be adjusted a bit down. But the adjustment is based on the amount of effort needed. This effect can be confirmed by engaging level 2 thinking with another task. When participants are asked to identify a tone while doing an anchor and adjustment task, their adjustments are lessened.

Page 62: An Introduction to Usability Testing

Learning

Much of our learning is also done without any conscious awareness.

The Garcia Effect
Classical Conditioning
Operant Conditioning

Page 63: An Introduction to Usability Testing

Developing Expertise

When we do need conscious awareness to learn an activity, we become proficient, even expert, as the thinking and decision making move from level 2 to level 1.

Consider driving a car. When you first learned to drive a car, it required all of your attention. You could (should) not listen to the radio, engage in a conversation, etc. But as you became more skilled, you moved the activity from conscious (level 2) thinking and decision making to non conscious (level 1) thinking and decision making.

Page 64: An Introduction to Usability Testing

Goal of interaction design

– Recall that the primary task is NOT to operate the computer. The primary task is to accomplish some task that only REQUIRES the use of a computer. All of our conscious attention should be on the primary task.

– Since conscious thinking (attention, or level 2 thinking) is so limited, the goal of interaction design is to reduce the requirement for conscious attention and allow product interaction to occur (ideally) as all non conscious (level 1) thinking

Page 65: An Introduction to Usability Testing

Other Issues

• The effects of emotion
• The effects of memory on emotion
• The effects of bias
• Etc., etc., etc.

Page 66: An Introduction to Usability Testing

Specific goals of design

Intuitive design leads to ease of learning - we can use transfer of knowledge from prior experience to quickly obtain proficient operation with a design, and little conscious attention is needed.

It's better that we already know how to use a new design than have to stop to figure it out.

Consistency, good conceptual models, good feedback, matching expectation, etc. lead to ease of use, where we can continue to operate the design with little conscious attention needed while we dedicate our conscious attention to the task we are trying to accomplish.

The less we have to redirect our attention from our task to attend to how we accomplish the task, the more transparent the product design, ideally to the point we don't even notice the device we used to get the job done.

Page 67: An Introduction to Usability Testing

Observational “Tests”


Page 68: An Introduction to Usability Testing

Contextual Inquiry

Field Study
– Sometimes (incorrectly) called “ethnography”
Direct observation of
– intended users
– performing the intended tasks
– in the intended environment
(Should be) non disruptive, so it’s limited in its ability to be diagnostic or exploratory
Common functions are viewed
– Incomplete view of a system
Can be time consuming and logistically prohibitive
Best for directly observable data from a “safe” distance

Page 69: An Introduction to Usability Testing

Performance-Based Tests


Page 70: An Introduction to Usability Testing

Performance-based Testing

Sometimes called “Un-moderated Remote Usability Testing”
Must be non-disruptive
– Need a fully operational system, mock up, or prototype
– In context (ideally not in a lab)
Need large enough sample
Need objective measure(s)
Need a comparison or a benchmark
Example
– Redundant High Centered Tail Lights
Applicability in (some) web-based situations, however…
– Limited ability to determine cause
– Limited ability to determine possible changes/improvements

Page 71: An Introduction to Usability Testing

The Think Aloud Protocol


Page 72: An Introduction to Usability Testing

Think Aloud Protocol

Most widely used (which is not a good thing)
Highly disruptive to performance
No reliable evidence of its efficacy
When used on existing systems or interactive prototypes/mockups
– Issues of the ability for users to be introspective
– Issues of distraction (split attention)
– Issues of verbal overshadowing
– Issues of increased anxiety
– Issues of projected responding
Suitability for concept presentation and cognitive walkthroughs on non-operational products (e.g., story boards, static screen flows, Wizard of Oz walkthroughs)

Page 73: An Introduction to Usability Testing


Page 74: An Introduction to Usability Testing

Threats to User-Based Testing

Reactivity Effect
– Individuals alter their performance or behavior due to the awareness that they are being observed
– “The Hawthorne Effect” is the most widely known version
– Bradley, Wilder
– Demand characteristics (subtle) and projected responding (more overt)
Issues with introspection and confabulation
The Effect of Anxiety
– General
– With split attention
– During a think aloud protocol

Page 75: An Introduction to Usability Testing

Interrupted Task-based Test


Page 76: An Introduction to Usability Testing

Interrupted Task-Based Testing

A compromise approach that allows for exploration of issues without being overly disruptive when issues are not present
Can be used for exploratory testing on an existing design
Can be used for exploring possible design alternatives
Should (Must)
– follow the ethical guidelines for the treatment of human subjects (including informed consent), confidentiality
Should not
– be hampered by trying to support statistical analysis

Page 77: An Introduction to Usability Testing

Test Set-up

What’s the hypothesis?
– Required for research
– Required for usability testing?
Define Your Variables
– Dependent and Independent Variables
– Confounding Variables
– Operationalize Your Variables

Page 78: An Introduction to Usability Testing

Participant Issues

User-types
– Users versus user surrogates
– All profiles or specific user profiles/personas?
– Critical segments?
How many?
– Relationship to statistical significance
– “Discount Usability” – whose rule?
– No fewer than 3 from any group
Participant stipends
Over recruiting
Scheduling

Page 79: An Introduction to Usability Testing

Within versus Between Subject Designs

Based on time commitment & number of designs/products
Within lets everyone see both products, which is better for small scale studies
Practically: Use an unbalanced within subject design

Page 80: An Introduction to Usability Testing

Defining Task Scenarios

Scenarios are contrived for testing, may not be representative of real world usage patterns, and are NOT always required
Short, unambiguous tasks to explore areas of concern, redesign, or of interest
Wording is critical
– In the user’s own terms
– Does not contain “seeds” to the correct solution
Enough to form a complete test but able to stay within the time limit
– Flexibility is key
– Variations ARE allowed

Page 81: An Introduction to Usability Testing

Preparing Test Materials

Consent form
Video release form
Receipt and confidentiality agreement
Facilitator’s Guide
– Introductory comments
– Participant task descriptions
– Questionnaires, SUS, Cooper-Harper, etc.
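For the SUS questionnaire listed above, scoring follows a fixed rule: each of the ten items is answered on a 1 to 5 scale, odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5 to give a score from 0 to 100. A minimal sketch, in which the function name and the input layout (a list of ten integers in item order) are assumptions:

```python
def sus_score(responses):
    """Score one participant's System Usability Scale questionnaire.

    `responses` holds the ten item ratings in order, each from 1 to 5.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        if item_number % 2 == 1:
            total += rating - 1      # odd items are positively worded
        else:
            total += 5 - rating      # even items are negatively worded
    return total * 2.5


print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

Validating the input before scoring is a design choice here: a skipped or mistyped item would otherwise silently distort the 0 to 100 score.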


Page 82: An Introduction to Usability Testing

Piloting the Design

Getting subjects
– Convenience sampling
– Cells and Power
Collect data
Check task wording
Check timing

Page 83: An Introduction to Usability Testing

Facilitating

Rogerian principles apply
– Unconditional Positive Regard
– Empathy
– Congruence
Rogerian techniques are used
– Minimal encouragers
– Reflections
– Summarization
– Open ended questions
Objectiveness

Page 84: An Introduction to Usability Testing

Collecting Data

Collecting data
– Data is observational, not transcribable
– The data is NOT in the interface, the data is in the user!
– Behavior, reactions, hesitations (movement and voice), body language, “tells”
Collecting participant comments may be misleading (e.g., confabulation), but may help indicate when issues are present (e.g., projected responding)
Collecting subjective data (why not)
– Pre-test
– Post-task
– Post-test

Page 85: An Introduction to Usability Testing

Reporting Results


Page 86: An Introduction to Usability Testing


Page 87: An Introduction to Usability Testing

Efficiency Data – Time on Task

Efficiency
– Can be operationalized in a number of ways
– Time on task being the most common
Time on task can be measured objectively
External time
– Important to management and some types of engineering (particularly process flow)
– It’s not necessarily important to users
– Time-on-task does not correlate with effectiveness except in extreme cases

Page 88: An Introduction to Usability Testing

Sample ToT Data – Controlled Experiment*

[Figure: two histograms, System A and System B, each plotting Number of Individuals (0 to 10) against ToT Time in Seconds (150 to 1350)]

*Source: UCD, Inc. – Voting System Usability Compliance Test Development Report for NIST


Page 89: An Introduction to Usability Testing

Efficiency Data – Other Measures

The following measures have been proposed
– Number of clicks
– Number of pages
– Number of errors
– Number of times the back button is used
– “Pogo sticking”
There is no construct validity for any of these measures against task performance (though there may be some spurious correlations for some of these)

Page 90: An Introduction to Usability Testing

Satisfaction Data

Satisfaction data can be operationalized in a number of ways, but is always opinion data
– Standardized survey instrument (e.g. SUS, SUMI, QUIS)
– Simple Likert item and Likert scale assessments
Satisfaction data suffers from numerous issues that threaten its validity
– Halo effect
– Leniency bias
– Strictness bias
– Projected responding
– Issues with introspection
– Usability issues (a lack of agreed understanding between the question(er) and the respondent)
Satisfaction data does not correlate with performance

Page 91: An Introduction to Usability Testing

Post Test Analysis of Approx. 3000 Sessions*


*Source: Jeff Sauro, Measuring Usability

Subjective Ease of Use Assessment (when successful)


Page 92: An Introduction to Usability Testing

Post Test Analysis of Approx. 3000 Sessions*


Subjective Ease of Use Assessment (when unsuccessful)


Page 93: An Introduction to Usability Testing

Effectiveness Data

Effectiveness data can be operationalized in a number of ways but is generally operationalized as success or failure to complete a task

Completion rate as a pass/fail criterion can be measured objectively if the criterion is pre-determined and is not subjective

Best estimates, error rate, and the confidence interval can be calculated easily for a pass/fail measure of completion rate using a Binomial calculation
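One way to carry out such a calculation is the adjusted Wald (Agresti-Coull) interval, which adds z²/2 successes and z² trials before applying the usual normal approximation and is often suggested for small usability samples. This is a swapped-in technique chosen for brevity, since the slide does not say which binomial calculation is meant. The function name and the default z = 1.96 (roughly 95% confidence) are assumptions.

```python
from math import sqrt


def adjusted_wald(successes, n, z=1.96):
    """Return (best estimate, lower bound, upper bound) for a completion rate.

    Uses the adjusted Wald (Agresti-Coull) method with critical value z.
    """
    n_adj = n + z**2
    p_adj = (successes + z**2 / 2) / n_adj          # adjusted best estimate
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)  # margin of error
    return p_adj, max(0.0, p_adj - margin), min(1.0, p_adj + margin)


estimate, low, high = adjusted_wald(6, 8)
print(f"6 of 8 completed: estimate {estimate:.0%}, interval {low:.0%} to {high:.0%}")
```

Clamping the bounds to the range 0 to 1 keeps the interval meaningful when all or none of the participants complete the task.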


Page 94: An Introduction to Usability Testing

Descriptive Statistics

But the data often shows other patterns such as bimodal distributions. In these cases, the average and standard deviation are not adequate…
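A small made-up example shows why. The ratings below are hypothetical 7-point scores from ten participants, invented purely for illustration: the average lands on a value that no participant actually gave, and only the distribution itself reveals the two clusters.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical 7-point ratings from ten participants (made-up data).
ratings = [1, 1, 2, 2, 2, 6, 6, 6, 7, 7]

print("mean:", mean(ratings))
print("standard deviation:", round(stdev(ratings), 2))

# A text histogram exposes the two clusters the summary numbers hide.
counts = Counter(ratings)
for score in range(1, 8):
    print(score, "#" * counts[score])
```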


Page 95: An Introduction to Usability Testing

User Ratings

[Figure: histogram of user ratings. x-axis: Score; y-axis: Number who got that score]

Page 96: An Introduction to Usability Testing

Correlated User Ratings

[Figure: correlated user ratings, SUS versus Cooper Harper]

Page 97: An Introduction to Usability Testing

Findings from Sets of User Ratings

[Figure panels: DC – MCH Data; Memphis – MCH Data; Memphis – SUS Data; DC – SUS Data]

Page 98: An Introduction to Usability Testing

Reportable “Results”

Violations of industry standards and best practices are reportable results from testing (though many should have been included in any expert review prior to testing)

Direct user comments may or may not be reportable, based on the observer’s assessment of the comment

Direct user behaviour is generally reportable, but only if confirmed to be behaviour based on a design issue and/or behaviour that is consistent throughout testing

An observation of a reaction suggestive of a cognitive issue, regardless of its effect on observable behaviour, is reportable provided there is a basis for that assumption

Behaviours that did not occur in testing but are suspected to occur under different conditions are reportable provided they are based on prior experience and there is a basis for that behaviour

Subjective data is reportable to support other findings, but this support may be inversely correlated with observation or performance


Page 99: An Introduction to Usability Testing

Design Guidelines

All navigation should be grouped together.

Page 100: An Introduction to Usability Testing

Prior Research Findings


Bold form labels draw users’ eyes away from the form and reduce usability. Consider removing the bold and possibly bolding the content.

Page 101: An Introduction to Usability Testing


Knowledge of Human Perception:


There are 50 hyperlinks on the home page (not including primary nav.) representing four levels within the clinical trial section and direct links to other parts of NCI

Page 102: An Introduction to Usability Testing

Industry Standards and Best Practices

Participants (without prior exposure) failed to recognize the five primary disciplines as navigational elements. The most common expectation (if noticed at all) was that the links would provide definitions of the terms.


Page 103: An Introduction to Usability Testing

Direct Observation or Comment


Participants had difficulty understanding what content was searched. Many thought all content in Clinical Trials would be searched, not just ongoing trials.

A few participants wanted to use the global NCI search to search Clinical Trials (consider labelling this “Search NCI” or “NCI Search”).

Some participants responded to the term “Find” even when the search form was on the page.

Page 104: An Introduction to Usability Testing

Reporting Results


Page 105: An Introduction to Usability Testing

Conclusions

Any testing is better than no testing, but don’t mistake “6 pack and friends” testing for the real thing

Testing with human subjects is highly valuable and can be deeply insightful, and the basic skills can be taught, but it is serious business and should not be conducted casually

The more you know about experimental design the better your testing will be, but the more you know about users the better the data you can get from any testing


Page 106: An Introduction to Usability Testing

Conclusions (concluded ;-) )

Testing is best done early and often as part of a user-centered design process (it is part of what makes it user-centered)

The intent of testing should be not just to know what happened, but to determine why it happened and to figure out what, if anything, can be done about it

Unless you have the right conditions and a large sample set available, there is little distinction between a true expert review and small sample user-based testing, but experts will need users to “see” the data


Page 107: An Introduction to Usability Testing

Other Formats

Remote Usability Testing

– Has logistical advantages
– Generates a false assumption that it’s more valid
– Doable as a think aloud, but otherwise results in a hybrid (part interrupted task based and part think aloud)
– Much of the observational data is missing

Eye Tracking, Physiological Measures, Blink Rates, etc.

– Objective measures that seem more real
– But lack a perceptual component (e.g., with eye tracking what we look directly at is not all we see, we can look directly at something and not see it, and what we perceive is not always what is in front of us)

Co-Discovery

– 2 people working on a problem together
– A highly useful hybrid approach (natural task performance and think aloud)