In-depth Exploration of Geotagging Performance Using Sampling Strategies on YFCC100M
George Kordopatis-Zilos, Symeon Papadopoulos, Yiannis Kompatsiaris
Information Technologies Institute, Thessaloniki, Greece
MMCommons Workshop, October 16, 2016 @ Amsterdam, NL
Transcript
Where is it? Depicted landmark: Eiffel Tower. Location: Paris, Tennessee.
The keyword “Tennessee” is essential for correctly placing the photo.
Motivation: Evaluating multimedia retrieval systems
• What do we evaluate?
• How?
• What decisions do we make based on it?
[Diagram: MM system (black box) + test collection → comparison to ground truth → evaluation measure → decision]
Problem Formulation
• Test collection creation → evaluation bias
• Reducing performance to a single measure misses many nuances of performance
• Test problem: geotagging = predicting the geographic location of a multimedia item based on its content
Example: Evaluating geotagging
• Test collection #1: 1M images, 700K located in the US
• Assume we use P@1km as the evaluation measure
• System 1: almost perfect precision in the US (100%), very poor for the rest of the world (10%) → P@1km = 0.7·100% + 0.3·10% = 73%
• System 2: approximately the same precision all over the world (65%) → P@1km = 65% (see the sketch below)
• Test collection #2: 1M images, 500K depicting cats and puppies on a white background
• Then, for 50% of the collection, any prediction is essentially random
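A minimal sketch of the composition effect above: the aggregate P@1km is just a share-weighted average of per-region precision, so the regional make-up of the test collection decides which system looks better. The function name and region split are illustrative, and the numbers are the hypothetical ones from the example.

```python
def aggregate_precision(region_shares, region_precisions):
    """Aggregate P@1km as the share-weighted average of per-region P@1km."""
    return sum(region_shares[r] * region_precisions[r] for r in region_shares)

# Hypothetical values from the example above (test collection #1).
shares = {"US": 0.7, "rest_of_world": 0.3}
system_1 = {"US": 1.00, "rest_of_world": 0.10}   # near-perfect in the US only
system_2 = {"US": 0.65, "rest_of_world": 0.65}   # uniform precision worldwide

print(aggregate_precision(shares, system_1))  # 0.73 -> ranked "better"
print(aggregate_precision(shares, system_2))  # 0.65
```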
Multimedia Geotagging
• Problem of estimating the geographic location of a multimedia item (e.g. a Flickr image + metadata)
• Variety of approaches:
  • Text-based: use the text metadata (tags)
    • Gazetteer-based
    • Statistical methods (associations between tags & locations)
  • Visual
    • Similarity-based (find the most similar items and use their location)
    • Model-based (learn a visual model of an area)
  • Hybrid: combine text and visual
Language Model
• Most likely cell: (formula shown on the slide)
• Tag-cell probability: (formula shown on the slide)
We will refer to this as: Base LM (or Basic)
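The two formulas appear only as images on the slide. As a hedged reconstruction, a commonly used tag-based language model for geotagging (the exact estimator used in the talk may differ) selects the cell that maximizes the summed tag-cell probabilities, with probabilities estimated from user counts in the training set:

```latex
% Sketch of a standard tag-based language model for geotagging; T is the item's
% tag set, c ranges over grid cells, and N_u(t,c) is the number of distinct
% users that used tag t inside cell c. The estimator in the talk may differ.
\[
  \hat{c} = \arg\max_{c} \sum_{t \in T} p(t \mid c)
  \qquad
  p(t \mid c) = \frac{N_u(t, c)}{\sum_{c'} N_u(t, c')}
\]
```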
Language Model Extensions
• Feature selection
  • Discard tags that do not provide any geographical cues
  • Selection criterion: locality > 0
• Feature weighting
  • Give more importance to tags that carry geographic information
  • Weight = linear combination of locality and spatial entropy
• Multiple grids
  • Consider two grids, fine and coarse; if the estimate from the fine grid falls within that of the coarse grid, use the fine-grid estimate (sketched below)
• Similarity search
  • Within the selected cell, use the lat/lon of the most similar item to refine the location estimate
We will refer to this as: Full LM (or Full)
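A minimal sketch of two of the extensions (feature weighting and multiple grids), under assumptions: tag_cell_prob maps a tag to its per-cell probabilities, weight holds a per-tag weight such as the locality/spatial-entropy combination above (its exact form is not reproduced here), and fine_to_coarse maps each fine cell to the coarse cell that contains it. This is one plausible reading of the multiple-grid rule, not the authors' exact implementation.

```python
from collections import defaultdict

def most_likely_cell(tags, tag_cell_prob, weight):
    """Weighted language model: score every cell by the weighted sum of tag-cell probabilities."""
    scores = defaultdict(float)
    for t in tags:
        for cell, p in tag_cell_prob.get(t, {}).items():
            scores[cell] += weight.get(t, 1.0) * p
    return max(scores, key=scores.get) if scores else None

def multi_grid_estimate(tags, fine_lm, coarse_lm, weight, fine_to_coarse):
    """If the fine-grid estimate lies inside the coarse-grid estimate, keep the fine one."""
    fine_cell = most_likely_cell(tags, fine_lm, weight)
    coarse_cell = most_likely_cell(tags, coarse_lm, weight)
    if fine_cell is not None and fine_to_coarse.get(fine_cell) == coarse_cell:
        return fine_cell      # agreement: keep the more precise estimate
    return coarse_cell        # otherwise fall back to the coarse estimate
```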
MediaEval Placing Task
• Benchmarking activity in the context of MediaEval
• Dataset:
  • Flickr images and videos (different each year)
  • Training and test set
• Also possible to test systems that use external data
Edition | Training Set | Test Set
2015    | 4,695,149    | 949,889
2014    | 5,025,000    | 510,000
2013    | 8,539,050    | 262,000
Proposed Evaluation Framework
• Initial (reference) test collection Dref
• Sampling function f: Dref → Dtest
• Performance volatility (see the sketch below)
• p(D): performance score achieved on collection D
• In our case, we consider two such measures:
  • P@1km
  • Median distance error
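A minimal sketch of the framework's quantities, assuming per-item distance errors in km are available, that P@1km counts errors of at most 1 km, and that volatility is taken as the relative change of a score on Dtest versus Dref (the slides do not spell out the exact definition).

```python
import statistics

def p_at_1km(errors_km):
    """Fraction of items whose distance error is at most 1 km."""
    return sum(e <= 1.0 for e in errors_km) / len(errors_km)

def median_error(errors_km):
    """Median distance error in km."""
    return statistics.median(errors_km)

def volatility(p_ref, p_test):
    """Relative change of a performance score under sampling (assumed definition)."""
    return (p_test - p_ref) / p_ref

# Usage: per-item distance errors (km) on D_ref and on a sampled D_test = f(D_ref).
errors_ref = [0.3, 12.0, 0.8, 150.0, 0.5]
errors_test = [12.0, 150.0, 0.5]
print(volatility(p_at_1km(errors_ref), p_at_1km(errors_test)))
```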
Sampling Strategies
A variety of approaches for the Placing Task collection:
• Geographical Uniform Sampling
• User Uniform Sampling
• Text-based Sampling
• Text Diversity Sampling
• Geographically Focused Sampling
• Ambiguity-based Sampling
• Visual Sampling
Uniform Sampling
• Geographic Uniform Sampling (see the sketch below)
  • Divide the earth's surface into square areas of approximately the same size (~10x10 km)
  • Select N items from each area (N = median number of items per area)
• User Uniform Sampling
  • Select only one item per user
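A minimal sketch of geographic uniform sampling, with assumptions: cells are approximated by snapping coordinates to a ~0.1-degree grid (roughly 10x10 km near the equator, a simplification of equally sized square areas), and N is the median number of items over non-empty cells.

```python
import random
import statistics
from collections import defaultdict

def geo_uniform_sample(items, cell_deg=0.1, seed=0):
    """items: list of dicts with 'lat'/'lon'. Returns a geographically balanced subset."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for it in items:
        key = (int(it["lat"] // cell_deg), int(it["lon"] // cell_deg))
        cells[key].append(it)
    n = int(statistics.median(len(v) for v in cells.values()))  # N = median items/area
    sample = []
    for members in cells.values():
        rng.shuffle(members)
        sample.extend(members[: max(n, 1)])
    return sample
```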
Text Sampling
• Text-based Sampling
  • Select only items with more than M terms (M: median number of terms per item)
• Text Diversity Sampling (see the sketch below)
  • Represent items using bag-of-words
  • Use MinHash to generate a binary code per BoW vector
  • Select one item per code (bucket)
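A minimal sketch of text diversity sampling, assuming each item carries a set of terms; the MinHash-to-binary-code scheme below is a generic one, not necessarily the code construction used by the authors.

```python
import random

def minhash_code(terms, num_hashes=16, seed=42):
    """Concatenate the parity bits of num_hashes min-hash values into a binary code.
    Uses Python's built-in hash(), which is stable within a single run."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(32) for _ in range(num_hashes)]
    bits = []
    for salt in salts:
        min_h = min(hash((salt, t)) & 0xFFFFFFFF for t in terms)
        bits.append(str(min_h & 1))  # keep one bit per hash function
    return "".join(bits)

def text_diversity_sample(items):
    """items: list of dicts with a 'terms' set. Keep the first item seen per MinHash bucket."""
    buckets = {}
    for it in items:
        if it["terms"]:
            buckets.setdefault(minhash_code(it["terms"]), it)
    return list(buckets.values())
```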
Other Sampling Strategies
• Geographically Focused Sampling
  • Pick items from a selected place (continent/country)
• Ambiguity-based Sampling
  • Select the set of items associated with ambiguous place names (or the complementary set)
  • Ambiguity defined with the help of entropy (see the sketch below)
• Visual Sampling
  • Select only items associated with a given visual concept
  • Select only items associated with concepts related to buildings
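A minimal sketch of ambiguity-based sampling, with assumptions: for each place name we look at the distribution of grid cells of the training items that carry it and call the name ambiguous when the entropy of that distribution exceeds a threshold; the threshold and cell definition are placeholders, not values from the talk.

```python
import math
from collections import Counter

def name_entropy(cells_for_name):
    """Shannon entropy of the cell distribution of one place name."""
    counts = Counter(cells_for_name)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def ambiguity_sample(items, name_to_cells, threshold=1.0, keep_ambiguous=True):
    """items: dicts with a 'tags' set. Keep items that do (or do not) mention an ambiguous name."""
    ambiguous = {n for n, cells in name_to_cells.items() if name_entropy(cells) > threshold}
    if keep_ambiguous:
        return [it for it in items if it["tags"] & ambiguous]
    return [it for it in items if not (it["tags"] & ambiguous)]
```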