Keyword Extraction and Image Annotation Games to Enhance the Cultural Database Creation
Virach Sornlertlamvanich and Thatsanee Charoenporn
[email protected], [email protected]
National Electronics and Computer Technology Center, Thailand
PNC 2012 Annual Conference and Joint Meetings, UC Berkeley, US., December 7-9, 2012
• Segmented/Tagged/Labeled Description
Word                        POS tag   Label   English gloss
ผ้าซิ่น                        N         B-K     tube skirt
ลายมัดหมี่บ้านปทุมแก้ว            N         I-K     mudmee pattern of Ban Pathum Kaeo
<space>                     P         O
เป็น                         V         O       is
งานฝีมือพื้นบ้าน                 N         O       folk handicraft
…                           …         …
• Keyword List (extracted from tag and title)
…
ผ้า
…
ผ้าซิ่น
ผ้าซิ่นลายมัดหมี่บ้านปทุมแก้ว
…
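The step from labeled rows to keyword candidates can be sketched as follows. This is a minimal illustration, assuming the labels follow the usual B/I/O chunking convention (B-K opens a keyword, I-K continues it, O is outside) and that chunk words are joined without spaces, as Thai is written. The function name `extract_keywords` is hypothetical and is not taken from the slides.

```python
from typing import List, Tuple

# One row per token: (word, POS tag, chunk label).
Token = Tuple[str, str, str]


def extract_keywords(tokens: List[Token]) -> List[str]:
    """Collect every maximal B-K, I-K, ... run as one keyword string."""
    keywords: List[str] = []
    current: List[str] = []
    for word, _pos, label in tokens:
        if label == "B-K":
            # A new keyword starts; close the previous one, if any.
            if current:
                keywords.append("".join(current))
            current = [word]
        elif label == "I-K" and current:
            current.append(word)
        else:
            # "O" (or a stray I-K with no open chunk) ends the current keyword.
            if current:
                keywords.append("".join(current))
            current = []
    if current:
        keywords.append("".join(current))
    return keywords


tokens = [
    ("ผ้าซิ่น", "N", "B-K"),
    ("ลายมัดหมี่บ้านปทุมแก้ว", "N", "I-K"),
    (" ", "P", "O"),
    ("เป็น", "V", "O"),
    ("งานฝีมือพื้นบ้าน", "N", "O"),
]
keywords = extract_keywords(tokens)
print(keywords)
```

On these rows the sketch yields the single full chunk; shorter entries in the keyword list, such as ผ้า, would have to come from the tag and title sources.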
Chunking Model
Preliminary Experiment Result
• 3000 examples for training, 500 examples for testing
• Based on Margin Infused Relaxed Algorithm (MIRA), Crammer et al., 2005
– Baseline features (Unigram and Bigram) +
– 3 character prefix/suffix of current word +
– 3 consecutive POS tags
• Recall=0.8256, Precision=0.9061, F1=0.8640
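The feature set and the reported score can be illustrated with a short sketch. It only shows one plausible reading of the feature templates and does not implement the MIRA learner itself. Two assumptions are made here: the bigram pairs the previous word with the current one, and the "3 consecutive POS tags" window is previous, current, next. The names `token_features` and `f1_score` are hypothetical. The last lines check that the reported F1 is the harmonic mean of the reported precision and recall.

```python
from typing import Dict, List, Tuple


def token_features(sent: List[Tuple[str, str]], i: int) -> Dict[str, str]:
    """Feature strings for position i of a sentence of (word, POS) pairs."""
    words = [w for w, _ in sent]
    tags = [t for _, t in sent]

    def at(seq: List[str], j: int) -> str:
        # Positions outside the sentence are padded with a boundary symbol.
        return seq[j] if 0 <= j < len(seq) else "<PAD>"

    word = words[i]
    return {
        "unigram": word,
        "bigram": at(words, i - 1) + "|" + word,        # assumed: previous + current
        "prefix3": word[:3],
        "suffix3": word[-3:],
        "pos3": "|".join(at(tags, j) for j in (i - 1, i, i + 1)),  # assumed window
    }


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


features = token_features([("temple", "N"), ("built", "V")], 0)
reported_f1 = round(f1_score(0.9061, 0.8256), 4)
print(features)
print(reported_f1)
```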
Semantic Relation Acquisition
• Extract common syntactic patterns between two nouns
• Our task is to acquire triples (e_i, r_ij, e_j), where
– e_i and e_j are entities (keywords)
– r_ij is a relationship between them
(e_i, BUILT_BY, e_j)
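A triple of this shape can be represented and filled from a surface pattern roughly as below. The sketch is illustrative only: the slides do not give the actual patterns, so the English regular expression, the `PATTERNS` table, and the placeholder sentence are all assumptions standing in for patterns mined from Thai descriptions.

```python
import re
from typing import List, NamedTuple


class Triple(NamedTuple):
    head: str      # first entity (keyword)
    relation: str  # relationship between the two entities
    tail: str      # second entity (keyword)


# Hypothetical pattern table: relation name -> pattern linking two noun phrases.
PATTERNS = {
    "BUILT_BY": re.compile(
        r"(?P<head>\w[\w ]*?) (?:was|is) built by (?P<tail>\w[\w ]*)"
    ),
}


def extract_triples(sentence: str) -> List[Triple]:
    """Return one triple per pattern match found in the sentence."""
    triples: List[Triple] = []
    for relation, pattern in PATTERNS.items():
        for m in pattern.finditer(sentence):
            triples.append(Triple(m.group("head"), relation, m.group("tail")))
    return triples


triples = extract_triples("temple X was built by ruler Y")
print(triples)
```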
ESP Game and Peekaboom
Proposed by Luis von Ahn; descriptions by Pete Cashmore, May 25, 2006
• ESP Game – In the ESP Game, the two players are shown an image and asked to enter a word that describes it. The players can’t see each other’s guesses. The aim is to enter the same word as your partner in the shortest possible time. But there’s an ulterior motive here: much of the data is recorded, and could be used to power image search engines in the future. What’s cheaper – paying thousands of Mechanical Turkers to label all the images on the web, or tricking people into doing it for free?
• Peekaboom – Peekaboom takes the ESP Game to the next level. Unlike the ESP Game, it’s asymmetrical. To start, one user is shown an image and the other sees an empty black space. The first user is given a word relating to the image, and the aim is to communicate that word to the other player by revealing portions of the image. So if the word is “eye” and the image is a face, you reveal the eye to your partner. But the real aim here is to build a better image search engine: one that could identify individual items within an image.
ESP Game
• Two players are shown an image and asked to enter a word that describes it.
• The aim is to enter the same word as your partner in the shortest possible time.
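The agreement rule can be sketched in a few lines. This is a simplified model, assuming a round ends at the first moment one player enters a word the other player has already entered, with matching done on lowercased text. The guess words come from the slide; the player assignment, timestamps, and the function name `first_agreement` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def normalize(word: str) -> str:
    return word.strip().lower()


def first_agreement(
    guesses: Iterable[Tuple[float, str, str]]
) -> Optional[Tuple[float, str]]:
    """guesses are (seconds elapsed, player id, word) in any order.

    Returns (time, word) for the earliest guess that the other player
    has also entered, or None if the players never agree.
    """
    seen: Dict[str, Set[str]] = {}  # player id -> words entered so far
    for t, player, word in sorted(guesses):
        w = normalize(word)
        seen.setdefault(player, set()).add(w)
        if any(w in words for other, words in seen.items() if other != player):
            return t, w
    return None


round_guesses = [
    (2.0, "A", "Twitter Bird"),
    (3.5, "B", "Bird"),
    (5.0, "A", "Angry bird"),
    (6.2, "B", "Mohawk"),
    (7.1, "A", "bird"),
]
label = first_agreement(round_guesses)
print(label)
```

The agreed word becomes a candidate name for the image, and the elapsed time is what the players try to minimize.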
[Figure: example guesses for one image (Twitter Bird, Bird, Angry bird, Bird, Mohawk). Purpose: to name the image.]
Peekaboom
• One user is shown a named image and reveals the part of the image that corresponds to the name.
• The other user enters a word relating to the image.
• The aim is to enter the same word as the name in the shortest possible time.
[Figure: example guesses (Bird, Bird, Bird, Squirrel, Flying fish). Purpose: to label the object in the image.]
Extended Peekaboom
• One user is shown a named image and reveals the part of the image that corresponds to the name.
• The other user enters a word relating to the image.
• The aim is to enter the same word as the name in the shortest possible time.
• Any word from the same synset counts as a match.
• Once a synset is selected, cross-language matching can be determined.
[Figure: example guesses (Bird, Bird, Bird, Squirrel, Flying fish) matched through AWN.]
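Synset-based matching across languages can be sketched with a toy lookup table. Nothing here reflects the real AWN interface: the `SYNSETS` dictionary, its WordNet-style ids, and the function names are assumptions made for illustration, with a handful of hand-picked entries.

```python
from typing import Dict, Set

# Toy stand-in for AWN: synset id -> words per language (illustrative entries).
SYNSETS: Dict[str, Dict[str, Set[str]]] = {
    "bird.n.01": {"en": {"bird"}, "th": {"นก"}, "ja": {"鳥"}},
    "squirrel.n.01": {"en": {"squirrel"}, "th": {"กระรอก"}},
}


def synsets_of(word: str) -> Set[str]:
    """Ids of all synsets that list the word in any language."""
    w = word.strip().lower()
    return {
        sid
        for sid, langs in SYNSETS.items()
        if any(w in words for words in langs.values())
    }


def matches_name(guess: str, image_name: str) -> bool:
    """A guess matches when it shares at least one synset with the name."""
    return bool(synsets_of(guess) & synsets_of(image_name))


def labels_in(word: str, lang: str) -> Set[str]:
    """Labels in another language reachable through the shared synset."""
    out: Set[str] = set()
    for sid in synsets_of(word):
        out |= SYNSETS[sid].get(lang, set())
    return out


print(matches_name("นก", "Bird"))
print(labels_in("bird", "th"))
```

Because the match is made at the synset level, a Thai guess can confirm an English name, and a confirmed label can be carried over to every language listed for that synset.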
Demo
• ESP-like game
– http://m-culture.in.th/game/esp_game
– Play mode
  • Single player mode: play against history
  • Two-player mode: guess to match each other
• Extended Peekaboom game
– http://m-culture.in.th/game/peekaboom
– Play mode
  • Single player mode: play against history
  • Two-player mode: guess to match each other
– For Thai language, use AWN to support synonym, hypernym, hyponym, meronym, and holonym
– For other languages, use AWN to support synonym only
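The per-language rule for which AWN relations may produce a match could look like the sketch below. It is an assumption-laden illustration: the `RELATIONS` table is hypothetical, and English words stand in for the Thai entries so the example stays readable.

```python
from typing import Dict, Optional, Tuple

ALL_RELATIONS = ("synonym", "hypernym", "hyponym", "meronym", "holonym")


def allowed_relations(lang: str) -> Tuple[str, ...]:
    # Thai uses every relation; any other language uses synonyms only.
    return ALL_RELATIONS if lang == "th" else ("synonym",)


# Hypothetical lookup: (guess, image name) -> relation of the guess to the name.
RELATIONS: Dict[Tuple[str, str], str] = {
    ("fowl", "bird"): "synonym",
    ("animal", "bird"): "hypernym",
    ("sparrow", "bird"): "hyponym",
    ("wing", "bird"): "meronym",
    ("flock", "bird"): "holonym",
}


def match_type(guess: str, name: str, lang: str) -> Optional[str]:
    """Return 'exact', an allowed relation name, or None when there is no match."""
    if guess == name:
        return "exact"
    relation = RELATIONS.get((guess, name))
    return relation if relation in allowed_relations(lang) else None


print(match_type("animal", "bird", "th"))  # allowed under the Thai policy
print(match_type("animal", "bird", "en"))  # rejected: synonyms only
```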
Preliminary Experiment
• 18 images were played by 19 people. For each image, players had 60 seconds to guess a suitable word.
• AWN expanded the matching in 67 cases, increasing the matching ratio by 22%.
Match type   Exact   Syn   Hyper   Hypo   Mero   Holo
Count        229     32    16      1      7      11
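The counts above can be checked with a few lines of arithmetic. The sum of the non-exact columns is 67, as stated. The slides do not say which denominator the 22% figure uses; one reading, assumed here, is the share of AWN-expanded matches among all matches, which gives about 22.6%.

```python
counts = {
    "exact": 229,
    "synonym": 32,
    "hypernym": 16,
    "hyponym": 1,
    "meronym": 7,
    "holonym": 11,
}

# Matches that only succeed because of an AWN relation.
expanded = sum(n for kind, n in counts.items() if kind != "exact")
total = counts["exact"] + expanded

# Assumed interpretation: AWN-expanded matches as a share of all matches.
share_of_all = expanded / total
# Alternative reading: gain relative to exact matches alone.
gain_over_exact = expanded / counts["exact"]

print(expanded, total)
print(f"{share_of_all:.1%} of all matches, {gain_over_exact:.1%} over exact")
```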
CULTURAL KNOWLEDGE SERVICE
[Figure: Cultural Database, Product Database, Shop Database, and Maker Database. Each record (culture, product, shop, maker) has a Title, a Snippet description, and Tags A, B, C; a tag set A, B, C, D is shown between the databases.]
[Figure: To find a related Product from Culture information. Culture and product records, each with Title, Snippet description, and Tags A, B, C.]
[Figure: To find the background Culture information from a Product. Product and culture records, each with Title, Snippet description, and Tags A, B, C.]
[Figure: Product and Culture information relation. Product and culture records, each with Title, Snippet description, and Tags A, B, C.]
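One way such a relation between records could be computed is by tag overlap, sketched below. The slides show records with a Title, Snippet, and Tags but do not state the linking rule, so ranking by the number of shared tags is an assumption, as are the `Record` class and the `related` function.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Record:
    kind: str            # "culture", "product", "shop", or "maker"
    title: str
    snippet: str
    tags: FrozenSet[str]


def related(source: Record, pool: List[Record], kind: str) -> List[Record]:
    """Records of the given kind that share a tag with source, most overlap first."""
    scored = [
        (len(source.tags & r.tags), r)
        for r in pool
        if r.kind == kind and r is not source
    ]
    hits = [(n, r) for n, r in scored if n > 0]
    hits.sort(key=lambda pair: -pair[0])  # stable: ties keep pool order
    return [r for _, r in hits]


records = [
    Record("culture", "culture 1", "description", frozenset({"A", "B"})),
    Record("product", "product 1", "description", frozenset({"B", "C", "D"})),
    Record("product", "product 2", "description", frozenset({"D"})),
    Record("product", "product 3", "description", frozenset({"A", "B"})),
]

# Culture -> related products, then product -> background culture.
products = [r.title for r in related(records[0], records, "product")]
cultures = [r.title for r in related(records[1], records, "culture")]
print(products, cultures)
```

The same function serves both directions shown in the figures, since only the source record and the target kind change.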
Summary
• From this ESP-like game, we successfully named the images or at least obtained a list of candidates for labeling the object in the image to be used in the next extended Peekaboom game.
• Synonym, hypernym, hyponym, meronym, and holonym relations from AWN help increase the matching ratio.
• Cross-language image labeling is realized through AWN synonyms.
Future Work
• Enhancing keyword extraction to find more term candidates for image matching
• Call for participation in the extended ESP and Peekaboom games for image labeling