Usability of CAPTCHAs Or “usability issues in CAPTCHA design”

Jeff Yan
School of Computing Science, Newcastle University, UK
(Joint work with Ahmad Salah El Ahmad)


SOUPS’08 (CMU, July 2008)

Apology

This is the 2nd time I have missed SOUPS, and the nth (n > 2) time I have been unable to present my own paper.
All due to the same problem: a US visitor visa!
(I started my application in April and have not heard the result yet …)


Does this man look like a terrorist?! ;-)


CAPTCHA

Why was it invented? Ask any CMU people, or read the cartoon
Automated Turing tests that computers cannot pass, but humans can
Almost standard security technology (e.g. for anti-spam), with widespread application on commercial websites


Main CAPTCHAs

Text-based schemes typically require users to solve a text recognition task; they are the most widely deployed
Sound-based schemes typically require users to solve a speech recognition task
Image-based schemes typically require users to perform an image recognition task. Example: Microsoft’s Asirra

This paper is about understanding how to design usable and robust CAPTCHAs, with a focus on usability


Isn’t that …

CAPTCHAs with poor usability should not exist by definition?
Yes, but … many deployed CAPTCHAs, including famous ones, are still not that usable …


How about robustness?

When necessary, it will be covered
However, our major attacks are discussed elsewhere:
Low-cost attacks on schemes by Microsoft, Yahoo and Google (CCS’08, to appear)
The pixel count attack (ACSAC’07): breaking CAPTCHAs by counting the number of pixels!
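The pixel-count idea can be illustrated with a minimal sketch. It assumes the challenge has already been segmented into binary character bitmaps and that each character has a distinct foreground-pixel count that distortion does not change. The tiny glyphs, the `GLYPHS` table and the function names are hypothetical and are not taken from the ACSAC’07 paper.

```python
# Hypothetical 3x3 glyphs standing in for a scheme's character bitmaps
# ('1' = foreground pixel, '0' = background pixel).
GLYPHS = {
    "I": ["010", "010", "010"],  # 3 foreground pixels
    "L": ["100", "100", "111"],  # 5 foreground pixels
    "O": ["111", "101", "111"],  # 8 foreground pixels
}


def pixel_count(bitmap):
    """Count the foreground pixels in one character bitmap."""
    return sum(row.count("1") for row in bitmap)


def build_lookup(glyphs):
    """Map each pixel count to the characters that produce it."""
    lookup = {}
    for char, bitmap in glyphs.items():
        lookup.setdefault(pixel_count(bitmap), []).append(char)
    return lookup


def recognise(segments, lookup):
    """Guess each segmented character from its pixel count alone.

    Emits '?' when the count is unknown or shared by several characters.
    """
    guess = []
    for bitmap in segments:
        candidates = lookup.get(pixel_count(bitmap), [])
        guess.append(candidates[0] if len(candidates) == 1 else "?")
    return "".join(guess)


lookup = build_lookup(GLYPHS)
shifted_i = ["100", "100", "100"]  # moved sideways, same pixel count
print(recognise([GLYPHS["L"], GLYPHS["O"], shifted_i], lookup))
```

The shifted glyph is still recognised because moving pixels around does not change how many there are, which is why a distortion that preserves pixel counts offers little protection under these assumptions.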


A framework for CAPTCHA usability

Distortion: the distortion techniques employed and their impact on usability
Content: the content embedded in CAPTCHA challenges and its impact on usability, e.g. how should the content be organized?
Presentation: the way that CAPTCHA challenges are presented and its impact on usability


Distortion | confusing characters

It is well known that under common distortions, characters such as 1 and l, o and 0, or 5 and s cause confusion

To be secure (i.e. resistant to segmentation attacks), the Google and Yahoo CAPTCHAs introduced new confusing characters: vv or w? rm or nn? cl or d? cm or an? rn or m? nn or m? …


Distortion | confusing characters

~6% of challenges in the Google CAPTCHA, and ~10% in the latest Yahoo scheme (rolled out in Mar 2008), were observed to contain such confusing characters.
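A challenge generator could screen its strings for such sequences before rendering them. The sketch below is only an illustration: the `LOOKALIKES` table contains the letter pairs named on the slide and is not exhaustive, the function names are hypothetical, and the sample strings are made up rather than drawn from any deployed scheme.

```python
# Letter sequences from the slide that can be misread once characters are
# distorted and pushed together, with the reading they are confused with.
LOOKALIKES = {
    "vv": "w",
    "rm": "nn",
    "cl": "d",
    "cm": "an",
    "rn": "m",
    "nn": "m",
}


def ambiguous_pairs(challenge):
    """Return (sequence, lookalike) tuples for every risky sequence found."""
    text = challenge.lower()
    return [(seq, alt) for seq, alt in LOOKALIKES.items() if seq in text]


def ambiguity_rate(challenges):
    """Fraction of challenges containing at least one risky sequence."""
    flagged = sum(1 for c in challenges if ambiguous_pairs(c))
    return flagged / len(challenges)


sample = ["corner", "house", "clam", "tree"]  # made-up challenge strings
print(ambiguous_pairs("corner"))
print(ambiguity_rate(sample))
```

A generator could reject or regenerate any flagged string, at the cost of shrinking the space of possible challenges.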


Content | string length

A design issue: should the string length be predictable or not?

Case study: the Microsoft CAPTCHA used a fixed length of 8 characters, which helped its usability

The first object is “7”?

The first object is “L”?

With the length info, users can be pretty sure that the first objects in the above examples are noise.


Content | string length

However, the length info also helped our automated segmentation attack (success rate: >92%)
Our program knows when to stop!

[Figure labels: “Start point”; “Stop: identified 8 chars already”]
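Why a known length helps an attacker can be shown with a toy segmenter. This is a minimal sketch and not the attack itself: it assumes a binary image in which characters are separated by empty columns, and `segment_columns` and the sample image are hypothetical.

```python
def segment_columns(rows, expected_length):
    """Split a binary image into character column ranges, left to right.

    rows: equal-length strings of '0'/'1' ('1' = foreground pixel).
    Returns half-open (start, end) column ranges. Scanning stops as soon
    as expected_length segments are found, so anything further right is
    treated as noise and ignored.
    """
    width = len(rows[0])
    segments = []
    start = None
    # One extra step past the right edge closes a segment touching it.
    for x in range(width + 1):
        has_ink = x < width and any(row[x] == "1" for row in rows)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            segments.append((start, x))
            start = None
            if len(segments) == expected_length:
                break
    return segments


# Toy image: three "characters" followed by a trailing noise blob.
image = [
    "110010011001",
    "110010011001",
]
print(segment_columns(image, expected_length=3))
print(segment_columns(image, expected_length=10))
```

With the length known, the trailing blob is never considered; without it, the segmenter has no basis for deciding which of the four objects is noise.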


Presentation | the use of colour

Using colour is common practice in CAPTCHA design (for all sorts of reasons)

However, we have seen many cases in which the use of colour is unhelpful for usability, has had a negative impact on security, or is problematic in terms of both usability and security


Presentation | the use of colour

Case 1: Gimpy-r (a well-known early scheme)

[Figure: how humans see it vs. how machines see it]


Presentation | the use of colour

Case 1: Gimpy-r

The dominant colour of the distorted text (often black) is distinguishable: it always has the lowest intensity and never appears in the background, so it is easy to extract the text
The colour background: not much use in terms of security; negative effect on usability (e.g. confusing people)
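If the text always has the lowest intensity and that intensity never occurs in the background, as observed for Gimpy-r, extraction reduces to a single threshold. The sketch below assumes the image is given as rows of greyscale intensities (0 to 255); that representation and the name `extract_text_mask` are hypothetical.

```python
def extract_text_mask(image):
    """Return a binary mask marking the pixels with the lowest intensity.

    image: list of rows, each a list of greyscale values (0-255).
    Assumes the text colour is exactly the darkest value present and that
    no background pixel shares it.
    """
    darkest = min(min(row) for row in image)
    return [[1 if pixel == darkest else 0 for pixel in row] for row in image]


# Toy image: a dark stroke (intensity 0) over a multi-coloured background.
image = [
    [200, 0, 180],
    [90, 0, 255],
    [0, 0, 120],
]
for row in extract_text_mask(image):
    print(row)
```

Everything the colourful background contributes is discarded in one pass, which is why it adds little security under these assumptions.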


Presentation | the use of colour

Case 2: BotBlock

[Figure: how humans see it vs. how machines see it]


Presentation | the use of colour

Case 2: BotBlock

Sophisticated colour management provides resistance to OCR
However, colour is misused: the text has distinguishable colour patterns, and the same foreground colour occurs repeatedly, so it is easy to extract the text automatically
Negative effect on usability and a false sense of security


Presentation | the use of colour

It seems that the “Las Vegas effect” also applies to CAPTCHA design: no colour might be better than too much colour
Major CAPTCHAs have started to avoid fancy colour management, including Microsoft, Yahoo, Google and reCAPTCHA


The framework: applied to text CAPTCHAs

Category        Usability issue

Distortion      Distortion method and level
                Confusing characters
                Friendly to foreigners?

Content         Character set
                String length: how long? predictable or not?
                Random string or dictionary word?
                Offensive word

Presentation    Font type and size
                Image size
                Use of colour
                Integration with web pages


The framework

Inspired by text-based CAPTCHAs
Applicable to sound-based schemes (for details, see our paper)
Also applicable to image-based schemes (e.g. IMAGINATION); for schemes such as Asirra and Bongo, in which distortion is absent, only the dimensions of content and presentation apply


Summary

First attempt towards a systematic analysis of usability issues in CAPTCHA design (in particular, text-based schemes)
Proposed a simple but novel framework, which accommodates both the new issues we have identified and known issues scattered in the literature
The framework is applicable to text-, sound- and (some) image-based CAPTCHAs
