Human Based Character Recognition Via Web-Security Measures
Original research by Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham, and Manuel Blum
Presented by: Md. Shihab Uddin, Roll: 0607029, CSE, KUET
This paper was published in Science Express on 14 August 2008 by the American Association for the Advancement of Science (AAAS).
Outline
1. CAPTCHAs
2. Why re-invent CAPTCHA?
3. Digitizing books with Re-CAPTCHA
4. Re-CAPTCHA in use
5. Re-CAPTCHA current
6. References
CAPTCHAs
[Screenshots: CAPTCHA at Gmail; CAPTCHA at Yahoo! Mail; CAPTCHA at Hotmail]
CAPTCHAs
A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a program that can tell whether its user is a human or a computer.
It usually appears as a colorful image of distorted text at the bottom of a web registration form.
The text can be deciphered by humans but not by computer programs or automated bots.
Applications: free e-mail services, social networks, and blogs; data collection; preventing worms and spam; preventing dictionary attacks.
Why Re-invent CAPTCHA?
A calculation:
Time taken to solve one CAPTCHA = 10 seconds
CAPTCHAs solved daily = more than 200 million
Human hours lost = more than 150,000 hours a day
About 6% of the world's population types a CAPTCHA every day.
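The calculation above can be checked with a few lines of Python. This is only a sketch: the two inputs are the slide's own lower bounds, and the function name `human_hours_per_day` is made up for illustration.

```python
# Figures taken from the slide; both are stated there as lower bounds.
SECONDS_PER_CAPTCHA = 10
CAPTCHAS_PER_DAY = 200_000_000


def human_hours_per_day(captchas_per_day: int, seconds_each: int) -> float:
    """Total human time spent per day, converted from seconds to hours."""
    return captchas_per_day * seconds_each / 3600


hours = human_hours_per_day(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} human hours per day")  # prints "555,556 human hours per day"
```

With these inputs the product comes to roughly 556,000 hours, which is comfortably above the "more than 150,000 hours" quoted on the slide.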
Why Re-invent CAPTCHA?
Although CAPTCHAs stop spam and automated bots, the human effort spent solving them is wasted every day.
Is there any way to use this human effort for something good?
The solution: Re-CAPTCHA, or re-invented CAPTCHA.
Digitizing Books: Normal Approach
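Scan the book, then run OCR (optical character recognition) on the scanned pages.
The problem is that OCR is not perfect: it cannot decipher about 20% of the words, whereas Re-CAPTCHA deciphers about 99%.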
Digitizing Books: Re-CAPTCHA Approach
1. Scan the book.
2. Pick out the words that OCR cannot read.
3. Turn each of these words into a randomly distorted image.
Digitizing Books: Re-CAPTCHA Approach
The randomly distorted image of the unread word is paired with a distorted control word whose text is already known.
The two words are placed in random order to form one Re-CAPTCHA.
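The pairing step can be sketched in Python as follows. All names here (`Challenge`, `make_challenge`, `check_response`, the image identifiers) are hypothetical and not taken from the paper or from any real Re-CAPTCHA API. The sketch also assumes, following the general description of the scheme rather than this slide, that a correct answer to the control word is what marks the user as human, after which the answer to the other word is kept as a guess.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    image_ids: tuple      # the two distorted word images, in display order
    control_position: int  # 0 or 1: which image is the known control word
    control_answer: str    # expected text for the control word


def make_challenge(unknown_image_id, control_image_id, control_answer, rng):
    """Place the control word and the unknown word in random order."""
    control_position = rng.randrange(2)
    if control_position == 0:
        images = (control_image_id, unknown_image_id)
    else:
        images = (unknown_image_id, control_image_id)
    return Challenge(images, control_position, control_answer)


def check_response(challenge, typed_words):
    """Return (passed, guess): the guess for the unknown word is kept
    only when the control word was typed correctly."""
    if len(typed_words) != 2:
        return False, None
    typed_control = typed_words[challenge.control_position].strip().lower()
    if typed_control != challenge.control_answer.lower():
        return False, None
    return True, typed_words[1 - challenge.control_position].strip().lower()
```

Because the user cannot tell which of the two words is the control word, both have to be typed carefully for the response to pass.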
Digitizing Books: Re-CAPTCHA Approach
One Re-CAPTCHA is sent to many users.
If 3 users type the same word and it matches an OCR guess, the word is digitized.
If 6 users skip the word instead of typing it, the word is considered unreadable.
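The decision rule on this slide can be written as a small function. This is a minimal sketch using the slide's two thresholds (3 matching answers, 6 skips); the function name `resolve_word` and the data shapes are assumptions made for illustration, and the production system may weigh answers differently.

```python
from collections import Counter

MATCHES_NEEDED = 3  # matching human answers required (from the slide)
SKIPS_ALLOWED = 6   # skips after which the word is given up (from the slide)


def resolve_word(ocr_guesses, responses):
    """Decide the fate of one unknown word.

    ocr_guesses: candidate readings produced by OCR.
    responses:   answers in arrival order; None stands for a skip.
    Returns ("digitized", word), ("unreadable", None) or ("pending", None).
    """
    ocr = {guess.lower() for guess in ocr_guesses}
    votes = Counter()
    skips = 0
    for response in responses:
        if response is None:
            skips += 1
            if skips >= SKIPS_ALLOWED:
                return ("unreadable", None)
            continue
        word = response.strip().lower()
        votes[word] += 1
        if votes[word] >= MATCHES_NEEDED and word in ocr:
            return ("digitized", word)
    return ("pending", None)
```

For example, `resolve_word(["overlooks"], ["overlooks", None, "Overlooks", "overlocks", "overlooks"])` returns `("digitized", "overlooks")`, while six skips in a row return `("unreadable", None)`.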
Re-CAPTCHA IN USE
Free to use.
Popular users: Facebook, Twitter, Craigslist, and more than 100,000 other websites.
Re-CAPTCHA IN USE
[Screenshots: Re-CAPTCHA on Twitter; Re-CAPTCHA on Facebook]
[Chart: words digitized per day]
Re-CAPTCHA IN USE
Digitization rate:
1. 4 million words per day
2. Approximately 160 books per day (400 pages per book, 250 words per page)
3. These figures are old; the current rate is much higher, because Facebook and Twitter, which together now have nearly 500 million users, use Re-CAPTCHA.
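The two rate figures can be related with simple arithmetic. The sketch below is an illustration only: `books_per_day` and its `human_fraction` parameter are hypothetical, and the value 0.25 is an assumption chosen to show one way the figures fit together, not a number given on the slide.

```python
WORDS_PER_DAY = 4_000_000  # from the slide
PAGES_PER_BOOK = 400       # from the slide
WORDS_PER_PAGE = 250       # from the slide

words_per_book = PAGES_PER_BOOK * WORDS_PER_PAGE  # 100,000 words


def books_per_day(words_per_day, words_per_book, human_fraction=1.0):
    """Books completed per day when only `human_fraction` of each book's
    words has to be typed by people (hypothetical parameter)."""
    return words_per_day / (words_per_book * human_fraction)


print(books_per_day(WORDS_PER_DAY, words_per_book))        # 40.0
print(books_per_day(WORDS_PER_DAY, words_per_book, 0.25))  # 160.0
```

If every word of a book had to be typed by people, 4 million words would cover 40 books. The figure of 160 books is reached if the 4 million words are only those that OCR could not read and these make up about one word in four, which is an assumption here.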
Re-CAPTCHA IN USE
Words come from:
1. The New York Times (1851-1980)
2. Internet Archive
Stored in:
1. Google News
2. Google Books
Re-CAPTCHA CURRENT
1. Google acquired Re-CAPTCHA.
2. Luis von Ahn works as a research scientist at Google alongside his position at Carnegie Mellon.
3. Luis von Ahn's co-workers on Re-CAPTCHA now work at Google.
4. Luis von Ahn has received many awards for inventing CAPTCHA and Re-CAPTCHA, including a MacArthur Fellowship, and has been named one of the 10 best computer scientists in the world and a pioneer of human computation.
REFERENCES
1. Paper from www.sciencemag.org
2. http://www.captcha.net
3. http://www.re-captcha.net
4. http://www.cs.cmu.edu/~biglou (homepage of Luis von Ahn)
5. Pictures from the web: Facebook, Twitter, Google