Auto Correction for Mobile Typing 2016320172 Chan Ho Jun 2016320177 Hyeon Min Park 2016160040 Sun Mook Choi 2016-06-14 1
[16.06.14] Auto Correction for Mobile Typing

Jan 23, 2018

KENNY Park
Transcript
Page 1: [16.06.14] Auto Correction for Mobile Typing

Auto Correction for Mobile Typing

2016320172 Chan Ho Jun

2016320177 Hyeon Min Park

2016160040 Sun Mook Choi

2016-06-14

Page 2: [16.06.14] Auto Correction for Mobile Typing

Contents

Algorithm Research

Nota Keyboard

SwiftKey

Conclusion

Reference

Page 3: [16.06.14] Auto Correction for Mobile Typing

Chapter 1: ALGORITHM RESEARCH

Page 4: [16.06.14] Auto Correction for Mobile Typing

Ultimate Goal of Spelling Correction

Reducing spelling errors while the user types the same way as before

Reducing spelling errors that occur at borders between keys

Page 5: [16.06.14] Auto Correction for Mobile Typing

Causes of Spelling Errors

Differences among individuals’ touch distributions

The mismatch between a key’s area of recognition and an individual’s touch distribution

Page 6: [16.06.14] Auto Correction for Mobile Typing

Review

Machine Learning

Learn through training data

Supervised Learning

Knowing a user’s intention is the key to spelling correction

Supervised model

- Refined input & answer information

Page 7: [16.06.14] Auto Correction for Mobile Typing

Review (Cont’d)

Problem

Difficult to tell which key the user intended when he or she presses the border between keys

Other Algorithms

By tracking backspace

- Inferring the answer information

- Learning through supervised learning

Low accuracy

Page 8: [16.06.14] Auto Correction for Mobile Typing

Semi-supervised Learning

Supervised learning

A small amount of labeled data (the answer information)

Unsupervised learning

A large amount of unlabeled data (the distribution of pressed keys)

A model that can learn without answer information when a user presses the borders between keys

Page 9: [16.06.14] Auto Correction for Mobile Typing

Clustering Algorithm

Grouping similar objects into the same group

Distribution-based clustering

Gaussian mixture models

- Using the Expectation-Maximization algorithm

Page 10: [16.06.14] Auto Correction for Mobile Typing

Clustering Algorithm (Cont’d)

Data near the key center

The user intended that key

Used directly to train the model

Data on key borders

Fed into the clustering algorithm

- Widens a key's area of recognition
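
The idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Nota Keyboard's actual implementation: each key is modeled as one spherical 2-D Gaussian, touches within a hypothetical `seed_radius` of a key center are treated as labeled, and all other touches get soft assignments from the Expectation-Maximization loop. The function names, the radius, and the synthetic touch data are all invented for the example.

```python
import numpy as np

def fit_key_gmm(touches, key_centers, seed_radius, n_iter=50):
    """Fit one spherical 2-D Gaussian per key with EM.

    touches:     (N, 2) touch coordinates
    key_centers: (K, 2) nominal key centers
    seed_radius: touches closer than this to a center count as labeled
                 (hypothetical threshold, chosen for illustration)
    """
    touches = np.asarray(touches, dtype=float)
    centers = np.asarray(key_centers, dtype=float)
    n_keys = len(centers)

    # Distance from every touch to every key center, shape (N, K).
    dist = np.linalg.norm(touches[:, None, :] - centers[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    labeled = dist.min(axis=1) < seed_radius   # "data near the key center"
    one_hot = np.eye(n_keys)[nearest]          # hard labels for those touches

    means = centers.copy()
    variances = np.full(n_keys, float(seed_radius) ** 2)
    weights = np.full(n_keys, 1.0 / n_keys)

    for _ in range(n_iter):
        # E-step: soft responsibility of each key for each touch.
        sq = ((touches[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_p = np.log(weights) - np.log(2 * np.pi * variances) - sq / (2 * variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        resp[labeled] = one_hot[labeled]       # labeled touches keep their key

        # M-step: re-estimate each key's Gaussian from the responsibilities.
        nk = resp.sum(axis=0) + 1e-9
        means = (resp.T @ touches) / nk[:, None]
        sq = ((touches[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        variances = (resp * sq).sum(axis=0) / (2 * nk) + 1e-6
        weights = nk / nk.sum()

    return means, variances, weights

def predict_key(point, means, variances, weights):
    """Return the index of the key whose Gaussian best explains the touch."""
    sq = ((np.asarray(point, dtype=float) - means) ** 2).sum(axis=1)
    log_p = np.log(weights) - np.log(2 * np.pi * variances) - sq / (2 * variances)
    return int(log_p.argmax())

# Synthetic user who hits key 0 to the right of its nominal center.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0]])
touches = np.vstack([
    rng.normal([2.0, 0.0], 1.5, size=(200, 2)),
    rng.normal([11.0, 0.0], 1.5, size=(200, 2)),
])
means, variances, weights = fit_key_gmm(touches, centers, seed_radius=3.0)
print(predict_key((5.5, 0.0), means, variances, weights))
```

The touch at (5.5, 0) lies past the nominal border at x = 5, so a fixed grid would assign it to the second key. A model fitted to this user's shifted touches can instead assign it to the first key, which is the sense in which clustering widens a key's area of recognition.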

Page 11: [16.06.14] Auto Correction for Mobile Typing

Chapter 2: NOTA KEYBOARD

Page 12: [16.06.14] Auto Correction for Mobile Typing

Statistics

Metric           Before           After            Change
Error rate       5.52%            4.12%            25.4% decreased
Input speed      292.0 press/min  306.1 press/min  4.8% increased
Backspace input  9.19%            7.02%            23.6% decreased
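
The reported percentage changes are relative changes between the two values of each metric. A quick check, assuming the first value of each pair is the baseline and the second is the value measured with Nota Keyboard (the function and dictionary names are illustrative):

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Value pairs copied from the statistics slide.
stats = {
    "Error rate (%)": (5.52, 4.12),
    "Input speed (press/min)": (292.0, 306.1),
    "Backspace input (%)": (9.19, 7.02),
}

for name, (before, after) in stats.items():
    print(f"{name}: {relative_change(before, after):+.1f}%")
# Error rate (%): -25.4%
# Input speed (press/min): +4.8%
# Backspace input (%): -23.6%
```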

Page 13: [16.06.14] Auto Correction for Mobile Typing

Usage Map

5/8 ~ 6/10

Page 14: [16.06.14] Auto Correction for Mobile Typing

Typing Video

Page 15: [16.06.14] Auto Correction for Mobile Typing

Correction Moment

Page 16: [16.06.14] Auto Correction for Mobile Typing

Problems or Limitations

Cannot suggest corrections based on context

When the data set is small: high error rate if incorrect data is mistakenly entered

Page 17: [16.06.14] Auto Correction for Mobile Typing

Chapter 3: SWIFTKEY

Page 18: [16.06.14] Auto Correction for Mobile Typing

SwiftKey

Natural Language Processing (NLP) for predictions and spelling corrections

Retroactive correction

Page 19: [16.06.14] Auto Correction for Mobile Typing

NLP – Types of Errors

Non-word error (NWE)

bannana → banana

Real-word error (RWE)

Typographical

- two → tow

Cognitive

- two → too

Page 20: [16.06.14] Auto Correction for Mobile Typing

Correction

NWE: Detect error → Candidate generation → Candidate selection

RWE: Candidate generation → Candidate selection

Page 21: [16.06.14] Auto Correction for Mobile Typing

Candidate Generation

Words with similar spelling

Words with similar pronunciation (for RWE)

The word itself (for RWE)

Page 22: [16.06.14] Auto Correction for Mobile Typing

Candidate Generation: Words with similar spelling

Smallest edit distance between words, where the edits of letters are

Deletion

Insertion

Substitution

Reversal (Transposition)

80% to 95% of errors are within edit distance 1
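
Since most errors are within edit distance 1, candidates can be generated by applying every single-letter edit to the typo and keeping the results found in a vocabulary. The sketch below follows that idea. The function names and the tiny vocabulary are assumptions for illustration, and the code applies each edit to the typo, so it undoes the opposite error made by the typist (inserting a letter undoes a typist's deletion, and so on).

```python
import string

def edits1(word, alphabet=string.ascii_lowercase):
    """Return every string exactly one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # Remove one letter (undoes a typist's insertion).
    removed = [left + right[1:] for left, right in splits if right]
    # Add one letter (undoes a typist's deletion).
    added = [left + ch + right for left, right in splits for ch in alphabet]
    # Replace one letter (undoes a substitution).
    replaced = [left + ch + right[1:]
                for left, right in splits if right for ch in alphabet]
    # Swap two adjacent letters (undoes a reversal).
    swapped = [left + right[1] + right[0] + right[2:]
               for left, right in splits if len(right) > 1]
    return set(removed + added + replaced + swapped)

def candidates(typo, vocabulary):
    """Vocabulary words within edit distance 1 of the typo."""
    return sorted(edits1(typo) & set(vocabulary))

vocabulary = {"actress", "cress", "caress", "access", "across", "acres", "banana"}
print(candidates("acress", vocabulary))
# ['access', 'acres', 'across', 'actress', 'caress', 'cress']
```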

Page 23: [16.06.14] Auto Correction for Mobile Typing

Candidate Generation: Example

Typo    Candidate  t_i  c_i  Type
acress  actress    -    t    Deletion
acress  cress      a    -    Insertion
acress  caress     ac   ca   Reversal
acress  access     r    c    Substitution
acress  across     e    o    Substitution
acress  acres      s    -    Insertion
acress  acres      s    -    Insertion

Jurafsky 2012

Page 24: [16.06.14] Auto Correction for Mobile Typing

Candidate Selection

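Select the candidate for which the following is greatest (Bayes’ Theorem):

\[
P(\text{candidate} \mid \text{typo})
= \frac{P(\text{typo} \mid \text{candidate})\, P(\text{candidate})}{P(\text{typo})}
\propto \underbrace{P(\text{typo} \mid \text{candidate})}_{\text{Error Model}} \; \underbrace{P(\text{candidate})}_{\text{Language Model}}
\]

P(typo) is the same for every candidate, so it can be dropped when comparing candidates.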

Page 25: [16.06.14] Auto Correction for Mobile Typing

Candidate Selection: Language Model

Unigram Model

\[ P(\text{candidate}) \]

The ratio of the frequency of candidate to the total count of words in the training set

n-gram Model

\[ P(\text{candidate} \mid \text{word}_1, \ldots, \text{word}_{n-1}) \]

The relative frequency of candidate in the training set, given the n-1 surrounding words
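
Both estimates can be computed from raw counts. The sketch below uses a ten-word toy corpus invented for illustration, takes n = 2 (a bigram model conditioned on the preceding word), and applies no smoothing, so unseen words or pairs get probability 0. The function names are assumptions.

```python
from collections import Counter

def train_counts(tokens):
    """Count single words and adjacent word pairs in the training tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def p_unigram(word, unigrams):
    """Frequency of `word` divided by the total number of training words."""
    return unigrams[word] / sum(unigrams.values())

def p_bigram(word, previous, unigrams, bigrams):
    """Frequency of `previous word` divided by the frequency of `previous`."""
    if unigrams[previous] == 0:
        return 0.0
    return bigrams[(previous, word)] / unigrams[previous]

# Toy corpus, invented for illustration.
tokens = "the actress walked across the stage and the actress bowed".split()
unigrams, bigrams = train_counts(tokens)

print(p_unigram("actress", unigrams))                 # 2 of 10 words
print(p_bigram("actress", "the", unigrams, bigrams))  # 2 of 3 uses of "the"
```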

Page 26: [16.06.14] Auto Correction for Mobile Typing

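Candidate Selection: Error Model

Noisy Channel Model

Kernighan, Church, Gale 1990

\[
P(\text{typo} \mid \text{candidate}) \approx
\begin{cases}
\dfrac{\mathrm{del}[c_{i-1}, c_i]}{\mathrm{count}[c_{i-1} c_i]}, & \text{if deletion} \\[2ex]
\dfrac{\mathrm{add}[c_{i-1}, t_i]}{\mathrm{count}[c_{i-1}]}, & \text{if insertion} \\[2ex]
\dfrac{\mathrm{sub}[t_i, c_i]}{\mathrm{count}[c_i]}, & \text{if substitution} \\[2ex]
\dfrac{\mathrm{rev}[c_i, c_{i+1}]}{\mathrm{count}[c_i c_{i+1}]}, & \text{if reversal}
\end{cases}
\]

del[x, y]: count of xy typed as x
add[x, y]: count of x typed as xy
sub[x, y]: count of y typed as x
rev[x, y]: count of xy typed as yx

c_i: the edit letter in the correction
t_i: the edit letter in the typo

count[x]: count of x in the training set
count[xy]: count of xy in the training set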
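
As a sketch, the four cases reduce to a table lookup followed by a division. The version below uses one count table per edit type, named after the del, add, sub and rev definitions on this slide. The function name, the data layout, and every number in the toy tables are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from Kernighan, Church and Gale.

```python
def p_typo_given_candidate(kind, x, y, confusion, counts):
    """Estimate P(typo | candidate) for a single edit.

    kind      -- "deletion", "insertion", "substitution" or "reversal"
    x, y      -- the two letters that index the table for that edit
    confusion -- tables of error counts: confusion["del"][(x, y)], etc.
    counts    -- training-set count of each letter or letter pair
    """
    if kind == "deletion":        # x = c_{i-1}, y = c_i
        return confusion["del"][(x, y)] / counts[x + y]
    if kind == "insertion":       # x = c_{i-1}, y = t_i
        return confusion["add"][(x, y)] / counts[x]
    if kind == "substitution":    # x = t_i, y = c_i
        return confusion["sub"][(x, y)] / counts[y]
    if kind == "reversal":        # x = c_i, y = c_{i+1}
        return confusion["rev"][(x, y)] / counts[x + y]
    raise ValueError(f"unknown edit type: {kind}")

# Hypothetical toy counts, invented for illustration only.
confusion = {
    "del": {("c", "t"): 5},
    "add": {},
    "sub": {("e", "o"): 2},
    "rev": {("c", "a"): 1},
}
counts = {"ct": 1000, "o": 4000, "ca": 500}

# "actress" typed as "acress": the t after c was dropped.
print(p_typo_given_candidate("deletion", "c", "t", confusion, counts))
# "across" typed as "acress": e was typed where o was intended.
print(p_typo_given_candidate("substitution", "e", "o", confusion, counts))
# "caress" typed as "acress": ca was typed as ac.
print(p_typo_given_candidate("reversal", "c", "a", confusion, counts))
```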

Page 27: [16.06.14] Auto Correction for Mobile Typing

Kernighan, Church, Gale 1990

Page 28: [16.06.14] Auto Correction for Mobile Typing

Kernighan, Church, Gale 1990

Page 30: [16.06.14] Auto Correction for Mobile Typing

Candidate Selection: Example (Language Model: Unigram, Error Model: Noisy Channel Model)

Candidate  Frequency  P(Candidate)  P(Typo|Candidate)  P(Typo|Candidate) P(Candidate)
actress    9321       .0000230573   .000117000         2.7000 × 10^-9
cress      220        .0000005442   .000001440         .00078 × 10^-9
caress     686        .0000016969   .000001640         .00280 × 10^-9
access     37038      .0000916207   .000000209         .01900 × 10^-9
across     120844     .0002989314   .000009300         2.8000 × 10^-9
acres      12874      .0000318463   .000032100         1.0000 × 10^-9
acres      12874      .0000318463   .000034200         1.0000 × 10^-9

Using training set of Corpus of Contemporary English (400 million words)
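
Selecting the candidate then amounts to multiplying the two probability columns and taking the maximum. A short sketch using the values from this table (variable and function names are illustrative):

```python
# (candidate, P(candidate), P(typo | candidate)), copied from the table.
rows = [
    ("actress", 0.0000230573, 0.000117),
    ("cress",   0.0000005442, 0.00000144),
    ("caress",  0.0000016969, 0.00000164),
    ("access",  0.0000916207, 0.000000209),
    ("across",  0.0002989314, 0.0000093),
    ("acres",   0.0000318463, 0.0000321),
    ("acres",   0.0000318463, 0.0000342),
]

def score(row):
    """Error model times language model for one candidate row."""
    _, prior, channel = row
    return channel * prior

best = max(rows, key=score)
for name, prior, channel in sorted(rows, key=score, reverse=True):
    print(f"{name:8s} {channel * prior:.3e}")
print("selected:", best[0])
```

With a unigram language model, "across" narrowly outscores "actress" (about 2.8 × 10^-9 against 2.7 × 10^-9), because the unigram prior ignores the surrounding words.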

Jurafsky 2012

Page 31: [16.06.14] Auto Correction for Mobile Typing

Candidate Selection: Example (Language Model: Bigram)

“… a stellar and versatile acress whose combination of sass and glamour …”

Using training set of Corpus of Contemporary English (400 million words)

P(actress|versatile) = .000021    P(whose|actress) = .0010

P(across|versatile) = .000021    P(whose|across) = .000006

P(versatile, actress, whose) = .000021 × .001000 = 210 × 10^-10

P(versatile, across, whose) = .000021 × .000006 = 1 × 10^-10
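
The same comparison can be written as a product of bigram probabilities over the phrase. A sketch using the four probabilities quoted on this slide (the dictionary layout and function name are assumptions; `math.prod` requires Python 3.8 or later):

```python
import math

# bigram[(previous, word)] = P(word | previous), values from the slide.
bigram = {
    ("versatile", "actress"): 0.000021,
    ("actress", "whose"): 0.0010,
    ("versatile", "across"): 0.000021,
    ("across", "whose"): 0.000006,
}

def phrase_probability(words, bigram):
    """Product of P(word | previous word) over consecutive word pairs."""
    return math.prod(bigram[(prev, word)] for prev, word in zip(words, words[1:]))

p_actress = phrase_probability(["versatile", "actress", "whose"], bigram)
p_across = phrase_probability(["versatile", "across", "whose"], bigram)
print(f"{p_actress:.2e}  {p_across:.2e}")
print("selected:", "actress" if p_actress > p_across else "across")
```

Taking context into account, "actress" scores more than a hundred times higher than "across", so the bigram model recovers the intended word.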

Jurafsky 2012

Page 32: [16.06.14] Auto Correction for Mobile Typing

Chapter 4: CONCLUSION

Page 33: [16.06.14] Auto Correction for Mobile Typing

Nota Keyboard: Preventing typos

SwiftKey: Correcting typos

Page 34: [16.06.14] Auto Correction for Mobile Typing

Appendix: REFERENCE

Page 35: [16.06.14] Auto Correction for Mobile Typing

Reference

https://en.wikipedia.org/wiki/Semi-supervised_learning

https://en.wikipedia.org/wiki/Cluster_analysis#Algorithms

https://play.google.com/store/apps/details?id=com.notakeyboard&hl=ko

Kernighan, M. D., Church, K. W., & Gale, W. A. (1990). A Spelling Correction Program Based on a Noisy Channel Model.

Jurafsky, D. (2012). Spelling Correction and the Noisy Channel. Lecture. Retrieved June 10, 2016, from http://spark-public.s3.amazonaws.com/nlp/slides/spelling.pdf

Page 36: [16.06.14] Auto Correction for Mobile Typing

Q&A

Page 37: [16.06.14] Auto Correction for Mobile Typing

Thank You

You can view this presentation again at https://docs.com/kennyhm97/2659/16-06-14-auto-correction-for-mobile-typing