Jeff Yan
School of Computing Science
Newcastle University, UK
(Joint work with Ahmad Salah El Ahmad)
Usability of CAPTCHAs
Or “usability issues in CAPTCHA design”
SOUPS’08 (CMU, July 2008) (2)
Apology
2nd time to miss SOUPS … nth (n > 2) time to be unable to
present my paper … All due to the same problem:
A US visit visa! (I started my application in April and have not heard the result yet …)
Does this man look like a terrorist?! ;-)
CAPTCHA
Why was it invented?
Ask any CMU people, or read the cartoon
Automated Turing tests that computers cannot pass, but humans can
Almost a standard security technology (e.g. for anti-spam), with widespread application on commercial websites
Main CAPTCHAs
Text-based schemes typically require users to solve a text recognition task; they are the most widely deployed
Sound-based schemes typically require users to solve a speech recognition task
Image-based schemes typically require users to perform an image recognition task. Example: Microsoft’s Asirra
This paper is about understanding how to design usable and robust CAPTCHAs, with a focus on usability
Isn’t that …
CAPTCHAs with poor usability should not exist by definition?
Yes, but … many deployed CAPTCHAs, including famous ones, are still not that usable …
How about robustness?
When necessary, it will be covered
However, our major attacks are discussed elsewhere
Low-cost attacks on schemes by Microsoft, Yahoo and Google (CCS’08, to appear)
The pixel count attack (ACSAC’07): breaking CAPTCHAs by counting the number of pixels!
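To illustrate the idea of recognising characters by pixel count, here is a minimal sketch. It is not the attack code from the ACSAC’07 paper. It assumes each character has already been segmented into a binary glyph (a list of rows of 0/1 values), and the names `pixel_count`, `build_lookup` and `recognise` are hypothetical.

```python
def pixel_count(glyph):
    """Count the foreground (value 1) pixels in a binary glyph."""
    return sum(sum(row) for row in glyph)


def build_lookup(samples):
    """Map each observed pixel count to the labels that produce it.

    `samples` is an assumed dict of {label: glyph} collected beforehand.
    """
    table = {}
    for label, glyph in samples.items():
        table.setdefault(pixel_count(glyph), []).append(label)
    return table


def recognise(glyph, table):
    """Return candidate labels whose pixel count matches the glyph."""
    return table.get(pixel_count(glyph), [])


# Toy glyphs, invented for illustration only.
samples = {
    "I": [[1], [1], [1]],
    "L": [[1, 0], [1, 0], [1, 1]],
}
table = build_lookup(samples)
candidates = recognise([[1, 1], [1, 1]], table)  # 4 foreground pixels
```

If two characters share a pixel count, the lookup returns several candidates, so a real attack would need some further way to tell them apart.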
A framework for CAPTCHA usability
Distortion: distortion techniques employed and their impact on usability
Content: content embedded in CAPTCHA challenges and its impact on usability, e.g. how should the content be organized?
Presentation: the way that CAPTCHA challenges are presented and its impact on usability
Distortion | confusing characters
It is well known that under common distortions, characters such as 1 and l, o and 0, 5 and s would cause confusion
To be secure (or resistant to segmentation attacks), Google and Yahoo CAPTCHAs introduced new confusing characters: vv or w? rm or nn? cl or d? cm or an? rn or m? nn or m? …
Distortion | confusing characters
~6% of challenges in the Google CAPTCHA, and
~10% in the latest Yahoo scheme (rolled out since Mar 2008),
were observed to have such confusing characters.
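A designer could screen generated challenge strings for such sequences before rendering them. The sketch below is one way to do this, under stated assumptions: the sequence list is taken only from the examples above, and the function names are hypothetical. It is not the method used to obtain the figures above.

```python
# Adjacent-letter sequences from the examples above that can be misread
# once the characters are crowded together (an illustrative list only).
AMBIGUOUS_SEQUENCES = ("vv", "rm", "nn", "cl", "cm", "an", "rn")


def ambiguous_sequences(challenge):
    """Return the ambiguous sequences that occur in a challenge string."""
    text = challenge.lower()
    return [seq for seq in AMBIGUOUS_SEQUENCES if seq in text]


def fraction_ambiguous(challenges):
    """Fraction of challenges containing at least one ambiguous sequence."""
    if not challenges:
        return 0.0
    flagged = sum(1 for c in challenges if ambiguous_sequences(c))
    return flagged / len(challenges)


# Made-up challenge strings, not real CAPTCHA data.
sample = ["modern", "table", "clvvn", "house"]
share = fraction_ambiguous(sample)
```

A generator could reject or regenerate any string for which `ambiguous_sequences` returns a non-empty list, at the cost of a slightly smaller challenge space.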
Content | string length
A design issue: should the string length be predictable or not?
Case study: the Microsoft CAPTCHA used a fixed length of 8 characters, which helped its usability
The first object is “7”?
The first object is “L”?
With the length info, users can be pretty sure that the first objects in the above examples are noise.
Content | string length
However, the length info also helped our automated segmentation attack (success rate: >92%)
Our program knows when to stop!
[Figure labels: Start point; Stop: identified 8 chars already]
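The role of the known length can be shown with a deliberately simplified sketch. It is not the segmentation attack itself. It assumes the image has been reduced to a per-column ink count and that characters are separated by empty columns, which real schemes are designed to prevent. The name `segment_columns` is hypothetical.

```python
def segment_columns(column_ink, expected_length=8):
    """Split a column-ink profile into (start, end) segments, left to right.

    Scanning stops as soon as `expected_length` segments have been found,
    so anything further to the right is treated as noise.
    """
    segments = []
    start = None
    for x, ink in enumerate(column_ink):
        if ink > 0 and start is None:
            start = x  # a new object begins
        elif ink == 0 and start is not None:
            segments.append((start, x))  # the object ended at column x
            start = None
            if len(segments) == expected_length:
                return segments  # known length reached: stop here
    if start is not None and len(segments) < expected_length:
        segments.append((start, len(column_ink)))
    return segments


# Toy profile: four objects separated by empty columns.
profile = [0, 2, 3, 0, 1, 1, 0, 4, 0, 5]
first_two = segment_columns(profile, expected_length=2)
all_found = segment_columns(profile)
```

With an unpredictable length the early return is not available, and the program has to decide by other means whether a trailing object is a character or noise.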
Presentation | the use of colour
Using colour is common practice in CAPTCHA design (for all sorts of reasons)
However, we have seen many cases in which the use of colour is unhelpful for usability, has a negative impact on security, or is problematic in terms of both usability and security
Presentation | the use of colour
Case 1: Gimpy-r (a well-known early scheme)
[Figure labels: How humans see it; How machines see it]
Presentation | the use of colour
Case 1: Gimpy-r
The dominant colour of the distorted text (often black) is distinguishable: it always has the lowest intensity, and never appears in the background
Easy to extract the text
Colour background:
Not much use in terms of security
Negative effect on usability (e.g. confusing people)
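The observation that the text colour always has the lowest intensity suggests a very simple extraction step, sketched below. This is a minimal illustration, not the code used against Gimpy-r. It assumes the image is a list of rows of (r, g, b) tuples, uses the channel sum as a stand-in for intensity, and the function names are hypothetical.

```python
def intensity(pixel):
    """Channel sum, used here as a simple proxy for intensity."""
    r, g, b = pixel
    return r + g + b


def extract_text_mask(image):
    """Mark the pixels that share the lowest intensity in the image.

    Returns a binary mask (1 = assumed text pixel, 0 = background).
    """
    darkest = min(intensity(p) for row in image for p in row)
    return [[1 if intensity(p) == darkest else 0 for p in row]
            for row in image]


# Toy 2x2 image: black text pixels on a colourful background.
image = [
    [(200, 10, 10), (0, 0, 0)],
    [(0, 0, 0), (30, 180, 90)],
]
mask = extract_text_mask(image)
```

Once the mask is obtained, the colourful background no longer plays any part in recognition, which is why it adds little in terms of security.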
Presentation | the use of colour
Case 2: BotBlock
[Figure labels: How humans see it; How machines see it]
Presentation | the use of colour
Case 2: BotBlock
Sophisticated colour management provides resistance to OCR
However, colour is misused:
texts have distinguishable colour patterns
the same colour for foreground occurs repetitively
easy to extract text automatically
Negative effect on usability and a false sense of security.
Presentation | the use of colour
It seems that the “Las Vegas effect” also applies to CAPTCHA design
No colour might be better than too much colour
Major CAPTCHAs started to avoid using fancy colour management, including Microsoft, Yahoo, Google and reCAPTCHA
The framework: applied to text CAPTCHAs
Category        Usability issue
Distortion      Distortion method and level
                Confusing characters
                Friendly to foreigners?
Content         Character set
                String length: How long? Predictable or not?
                Random string or dictionary word?
                Offensive word
Presentation    Font type and size
                Image size
                Use of colour
                Integration with web pages
The framework
Inspired by text-based CAPTCHAs
Applicable to sound-based schemes (for details, see our paper)
Also applicable to image-based schemes (e.g. IMAGINATION)
For schemes such as Asirra and Bongo, in which distortion is absent, only the dimensions of content and presentation will apply.
Summary
First attempt towards a systematic analysis of usability issues in CAPTCHA design (in particular, text-based schemes)
Proposed a simple but novel framework, which accommodates both novel issues we have identified, and known issues scattered in the literature
The framework is applicable to text, sound and (some) image based CAPTCHAs.