Captcha Segmentation, First Seminar - DECOM · 2014-09-26

Captcha Segmentation

First Seminar

BCC448

Pattern Recognition

Students: Filipe Eduardo Mata dos Santos

Pedro Henrique Lopes Silva

Paper

• "Text-based CAPTCHA strengths and weaknesses."

• Elie Bursztein, Matthieu Martin and John Mitchell

• Proceedings of the 18th ACM conference on Computer and communications security.

• ACM, 2011

Captcha

• “Completely Automated Public Turing tests to tell Computers and Humans Apart”

• “Reverse Turing tests”

• Uses

Real-world Captcha Security Features

• Anti-recognition

• Anti-segmentation

Real-world Captcha Security Features

• Anti-recognition

1. Multi-fonts

2. Charset

3. Font Size

4. Distortion

5. Blurring

6. Tilting

7. Waving

Real-world Captcha Security Features

• Anti-segmentation

8. Complex Background

9. Lines

10. Collapsing

Segmentation

• Background Confusion

o Complex Background

o Color Similarity

o Noise*

Segmentation

• Using Lines

o Small Lines

o Big Lines

• Collapsing

o Predictable Collapsing

o Unpredictable Collapsing

Data Set

• Authorize(50%)

• Baidu(1-10%)

• Blizzard(50%)

• Captcha.net(50%)

• CNN(10-24%)

• Digg(10-24%)

• eBay(25-49%)

• Google(0%)

• Megaupload(50%)

• NIH(50%)

• Recaptcha(0%)

• Reddit(25-49%)

• Skyrock(1-10%)

• Slashdot(25-49%)

• Wikipedia(25-49%)

Decaptcha

• Code in C#

o Speed

o Robustness

o Availability of AI/Vision Libraries

• Visual Studio

• Framework

o AForge

o Accord

• Machine Learning

o SVM

Decaptcha Pipeline

• Method

1. Preprocessing

2. Segmentation

3. Post-segmentation

4. Recognition

5. Post-processing
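
The five stages listed above can be sketched roughly as follows. This is only an illustrative Python outline, not Decaptcha's actual code (which the slides say is written in C#). The stage names come from the slide; everything inside each stage is an assumption made for illustration: fixed-threshold binarization, splitting on empty columns (vertical projection), a minimum-width noise filter, and a caller-supplied `classify` function standing in for the SVM. All function names and parameters here are hypothetical.

```python
from typing import Callable, List, Tuple

# A grayscale image as rows of pixels, 0 (black) to 255 (white).
Image = List[List[int]]
Span = Tuple[int, int]  # half-open column range [start, end)


def preprocess(gray: Image, threshold: int = 128) -> Image:
    """Stage 1 (assumed): binarize, 1 = ink (dark pixel), 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment(binary: Image) -> List[Span]:
    """Stage 2 (assumed): cut wherever a column contains no ink."""
    width = len(binary[0]) if binary else 0
    ink_per_column = [sum(row[x] for row in binary) for x in range(width)]
    spans: List[Span] = []
    start = None
    for x, ink in enumerate(ink_per_column):
        if ink and start is None:
            start = x                      # a run of ink columns begins
        elif not ink and start is not None:
            spans.append((start, x))       # the run ended at column x
            start = None
    if start is not None:
        spans.append((start, width))       # run touches the right edge
    return spans


def post_segment(spans: List[Span], min_width: int = 2) -> List[Span]:
    """Stage 3 (assumed): drop spans too narrow to be a character."""
    return [(a, b) for a, b in spans if b - a >= min_width]


def recognize(binary: Image, spans: List[Span],
              classify: Callable[[Image], str]) -> List[str]:
    """Stage 4: classify each cropped segment (stand-in for the SVM)."""
    return [classify([row[a:b] for row in binary]) for a, b in spans]


def post_process(chars: List[str]) -> str:
    """Stage 5 (assumed): assemble the final answer string."""
    return "".join(chars)


def solve(gray: Image, classify: Callable[[Image], str]) -> str:
    """Run the five stages in order."""
    binary = preprocess(gray)
    spans = post_segment(segment(binary))
    return post_process(recognize(binary, spans, classify))
```

Splitting on empty columns only works when characters do not touch, which is exactly what the anti-segmentation features (lines, collapsing) are meant to prevent, so a real solver would need a more robust stage 2.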

References

• Bursztein, Elie, Matthieu Martin, and John Mitchell. "Text-based CAPTCHA strengths and weaknesses." Proceedings of the 18th ACM conference on Computer and communications security. ACM, 2011.