Towards a Smart Wearable Tool to Enable People with SSPI to Communicate by Sentence Fragments
https://www.youtube.com/watch?v=qzPAY4f8Wc8
András Németh (presenter, constructor, ELTE) – in video
Anita Verő (natural language processing, ELTE) – in video
András Sárkány (natural language processing, ELTE) – in video
Gyula Vörös (natural language processing, ELTE) – in video
Brigitta Miksztay-Réthey (special needs expert, ELTE)
Takumi Toyama (eye-tracking solution, DFKI)
Daniel Sonntag (head of the German group, DFKI)
András Lőrincz (project leader, ELTE)
Challenge Handicap et Technologie 2014, May 26-27, Lille, France
• Fiducial markers around the board are recognized by
vision-based pattern recognition techniques
Usage and HCI details
• The estimated gaze position was indicated as a small red
circle on the projected board (similar to the previous test).
• A symbol was selected by keeping the red circle on it for
two seconds.
• The eye tracker sometimes needs recalibration;
– the user could initiate recalibration by raising his head (detected by the MPU).
– Once recalibration was triggered, a distinct sound was played, and an arrow indicated where the participant had to look to recalibrate.
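The dwell-based selection described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation; the class name and the two-second threshold are the only things taken from the slides.

```python
# Sketch of dwell-based symbol selection (illustrative, not the authors' code).
DWELL_TIME = 2.0  # seconds the gaze circle must rest on a symbol

class DwellSelector:
    def __init__(self, dwell_time=DWELL_TIME):
        self.dwell_time = dwell_time
        self.current = None  # symbol currently under the gaze circle
        self.since = None    # timestamp when the gaze entered it

    def update(self, symbol, timestamp):
        """Feed one gaze sample; return the selected symbol or None."""
        if symbol != self.current:
            # Gaze moved to a different symbol: restart the dwell timer.
            self.current, self.since = symbol, timestamp
            return None
        if symbol is not None and timestamp - self.since >= self.dwell_time:
            self.since = timestamp  # reset so the symbol is not re-fired at once
            return symbol
        return None

sel = DwellSelector()
sel.update("tea", 0.0)         # gaze lands on "tea"
sel.update("tea", 1.0)         # still dwelling, nothing selected yet
print(sel.update("tea", 2.0))  # dwell complete -> prints "tea"
```

A real system would also debounce noisy gaze estimates before feeding samples to such a selector.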
Communication Setup
Communication experiments
• Goal: communicate with a partner
• Two situations (with different boards)
– Buying food
– Discussing an appointment
Experimental results
• Verification
– To verify that communication was successful, the participant indicated misunderstandings using well-known yes-no gestures, which were quick and reliable. Moreover, a certified expert in AAC was present to indicate apparent communication problems.
– 205 symbol selections were made
– 23 of them were incorrect
• 89% accuracy
– The error rate was acceptable
• Real communication took place!
Experiment 3
External Symbols (towards a mixed reality setup)
External symbols
• Idea
– Not all symbols are present in the system
– Optical character recognition can help
– (Object recognition can help)
• Example
– The user wants to buy a certain type of
sandwich
– In the store, there are labels with the names of the sandwich types on them
• The OCR was simulated
External symbols - Communication Setting
Results
• Wizard-of-Oz experiment for OCR
• Similarly good results as in Experiment 2
Technical input processing
and sentence generation
methods
Sentence fragment generation
• A word guessing game
– Good afternoon, how are … you?
– I apologize for being late, I am very … sorry!
– My favorite OS is … [Linux, Mac OS, Windows XP, Win CE].
• A symbol guessing game
• The LM needs to help
– to select words with the right sense (disambiguate homonyms: e.g., river bank versus money bank)
– to select hypotheses where graphical symbols (words) are 'in the right place'
– to increase cohesion between words (agreement)
Sentence fragment generation
• Understanding symbol communication requires practice
• Symbol communication is non-syntactical
– Function words are rarely used
– Order of symbols may vary
• e.g., {lemon, sugar, tea} -> tea with lemon and sugar
• Idea
– Generate fragments by inserting function words
– Rank them based on language models
– The user selects from the top 4 variants
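The generate-and-rank idea above can be sketched as follows. The function-word list and the hand-made bigram scores are invented stand-ins for a real language-model query; only the {lemon, sugar, tea} example comes from the slides.

```python
from itertools import permutations, product

FUNCTION_WORDS = ["", "with", "and"]  # "" = insert nothing in that slot

# Hypothetical bigram scores; a real system would query an n-gram LM.
BIGRAM_SCORE = {
    ("tea", "with"): 3.0, ("with", "lemon"): 2.0, ("lemon", "and"): 2.0,
    ("and", "sugar"): 2.0, ("tea", "lemon"): 0.5, ("lemon", "sugar"): 0.5,
}

def score(words):
    """Sum of bigram scores, with a small default for unseen pairs."""
    return sum(BIGRAM_SCORE.get(bg, 0.1) for bg in zip(words, words[1:]))

def candidates(symbols):
    """All orderings of the symbols with optional function words inserted."""
    for order in permutations(symbols):
        for fws in product(FUNCTION_WORDS, repeat=len(order) - 1):
            words = [order[0]]
            for fw, sym in zip(fws, order[1:]):
                if fw:
                    words.append(fw)
                words.append(sym)
            yield words

def top_fragments(symbols, k=4):
    """Rank every candidate fragment and keep the top k for the user."""
    ranked = sorted(candidates(symbols), key=score, reverse=True)
    return [" ".join(w) for w in ranked[:k]]

print(top_fragments(["lemon", "sugar", "tea"])[0])  # "tea with lemon and sugar"
```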
Language Models
• Estimate the probabilities of n-grams (sequences of
words)
• P(tea with sugar) > P(tea the sugar)
– Use a corpus (collection of texts)
• Sparsity problem
– Long n-grams tend to be rare
– Smoothing and backoff methods are used to deal with
this problem
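A toy illustration of the relative-frequency estimates and the sparsity problem described above; the miniature corpus is invented for the example.

```python
from collections import Counter

# Tiny made-up corpus; real models use corpora of billions of tokens.
corpus = "tea with sugar . tea with lemon . coffee with sugar .".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def p_bigram(w1, w2):
    """P(w2 | w1) by relative frequency; 0.0 when the bigram is unseen."""
    if unigrams[w1] == 0:
        return 0.0
    return bigrams[(w1, w2)] / unigrams[w1]

print(p_bigram("tea", "with"))    # 1.0: every "tea" is followed by "with"
print(p_bigram("with", "sugar"))  # 2 of 3 occurrences of "with" precede "sugar"
print(p_bigram("tea", "sugar"))   # 0.0: unseen bigram -> needs smoothing/backoff
```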
Stupid backoff
• Let w_1^L denote a string of L tokens over a fixed vocabulary.
• The approximation P(w_i | w_1^{i-1}) ≈ P(w_i | w_{i-n+1}^{i-1}) reflects the Markov assumption that only the most recent n-1 tokens are relevant when predicting the next word.
• For any substring w_i^j of w_1^L, let f(w_i^j) denote the frequency of occurrence of that substring in the training data. The maximum likelihood probability estimates for the n-grams are given by their relative frequencies:
• P(w_i | w_{i-n+1}^{i-1}) = f(w_{i-n+1}^i) / f(w_{i-n+1}^{i-1})
• Problematic because the numerator or the denominator can be zero.
• Appropriate conditions are needed on the r.h.s.
• Noisy: estimates need smoothing.
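A minimal Stupid Backoff scorer, following the recursion published by Brants et al. (2007, cited below): use the relative frequency when the n-gram was seen, otherwise back off to the shorter context with a fixed factor α = 0.4. The toy corpus and function names are illustrative; the scores S(·) are deliberately not normalized probabilities.

```python
from collections import Counter

ALPHA = 0.4  # fixed backoff factor from Brants et al. (2007)

corpus = "tea with sugar . tea with lemon . coffee with sugar .".split()
N = len(corpus)
# Frequency f of every n-gram up to length 3.
counts = Counter(
    tuple(corpus[i:i + n]) for n in (1, 2, 3) for i in range(N - n + 1)
)

def stupid_backoff(word, context):
    """S(word | context), backing off to shorter contexts when needed."""
    if not context:
        return counts[(word,)] / N  # base case: relative unigram frequency
    full, ctx = tuple(context) + (word,), tuple(context)
    if counts[full] > 0:
        return counts[full] / counts[ctx]  # seen: relative frequency
    return ALPHA * stupid_backoff(word, context[1:])  # unseen: back off

print(stupid_backoff("sugar", ["tea", "with"]))    # seen trigram: 1/2
print(stupid_backoff("sugar", ["coffee", "tea"]))  # unseen: backs off twice
```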
Language corpora in use
• Google Books n-gram corpus
– Collection of digitized books from Google
– Very large, freely available
– Represents written language
• OpenSubtitles corpus
– Collection of film and TV subtitles
– Moderate size, freely available
– Represents spoken language
Language modeling tools used
• Google Books n-gram corpus
– Software: BerkeleyLM
• Pauls, A., Klein, D.: Faster and Smaller N-Gram Language Models. In: 49th Annual Meeting of the ACL: Human Language Technologies, Vol. 1, pp. 258--267. ACL, Stroudsburg, USA (2011)
– Method: Stupid Backoff
• Brants, T., Popat, A. C., Xu, P., Och, F. J., Dean, J.: Large Language Models in Machine Translation. In: EMNLP 2007, pp. 858--867 (2007)
• OpenSubtitles corpus
– Software: KenLM
• Heafield, K.: KenLM: Faster and Smaller Language Model Queries. In: EMNLP 2011 Sixth Workshop on Statistical Machine Translation, pp. 187--197. ACL, Stroudsburg, USA (2011)
– Method: Modified Kneser-Ney smoothing
• Heafield, K., Pouzyrevsky, I., Clark, J. H., Koehn, P.: Scalable Modified Kneser-Ney Language Model Estimation. In: 51st Annual Meeting of the Association for Computational Linguistics, pp. 690--696. Curran Associates, Inc., New York, USA (2013)
Prefix tree building algorithm (1)
• Representation
– Work with a prefix tree, where each path represents an n-gram
• Input
– Set of named entities (e.g., tea, sugar)
– Set of function words (e.g., you, with, for)
– Desired length of sentence fragment (e.g., 3 words)
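The representation above can be sketched with nested dictionaries, where every root-to-leaf path of the desired length is one candidate fragment. All names here are illustrative; in the real system the tree would be pruned and the paths ranked by the language model rather than enumerated exhaustively.

```python
# Sketch of the prefix-tree representation (illustrative, not the authors' code).

def build_prefix_tree(entities, function_words, length):
    """Nested dicts; each path of `length` words is one candidate fragment."""
    vocab = entities + function_words
    def expand(depth):
        if depth == length:
            return {}  # leaf: fragment complete
        return {w: expand(depth + 1) for w in vocab}
    return expand(0)

def paths(tree, prefix=()):
    """Enumerate all root-to-leaf paths (candidate fragments)."""
    if not tree:
        yield " ".join(prefix)
        return
    for word, sub in tree.items():
        yield from paths(sub, prefix + (word,))

tree = build_prefix_tree(["tea", "sugar"], ["with"], length=3)
frags = list(paths(tree))
print(len(frags))                 # 3-word paths over a 3-word vocabulary: 27
print("tea with sugar" in frags)  # True
```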