Multipedia: Enriching DBpedia with Images
Andrés García-Silva†, Asunción Gómez-Pérez†, Max Jakob*, Pablo Mendez* and Chris Bizer*
† {hgarcia, ocorcho, asun}@fi.upm.es, Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660 Boadilla del Monte, Madrid, Spain
* [email protected], Web-based Systems Group, Freie Universität Berlin, Germany
• Enriching ontologies with multimedia
   - Images and videos complement the information about concepts and entities in existing knowledge bases.
   - Multimodal ontologies can help in QA systems, user interfaces, search and recommendation processes.
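[Figure: fragment of a multimodal ontology with the concepts Bone and Pathology, the relations isA and occurs, and images linked to concepts through depicts]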
«Show me X-ray Images with fractures of the Femur»
Radhouani, S., Lim, J.H., Chevallet, J.-P., Falquet, G.: Combining textual and visual ontologies to solve medical multimodal queries. In: IEEE International Conference on Multimedia and Expo, pp. 1853-1856 (2006).
Garcia-Silva et al.
Multipedia Introduction
• Goal: populate a general-purpose ontology with images from the Web.
   - Find relevant images for ontology instances with ambiguous names.
• DBpedia knowledge base
   - Collects facts from Wikipedia, covering 3.5 million entities.
   - Entities are classified into a consistent cross-domain ontology: 272 classes and 1.6 million instances.
   - Has evolved into a hub in the Linked Data cloud.
• Images in DBpedia
   - Wikipedia images are represented in DBpedia (foaf:depiction).
   - About 70% of the Wikipedia articles do not have images.
1) Measure the relatedness between a DBpedia resource and an image:
   - Overlap of terms between the context of the former and the tags of the latter.
2) Represent the DBpedia resource and the images in a Vector Space Model:
   - TF as the weighting scheme,
   - cosine function to measure similarity.
3) Generate a ranking of images according to the similarity value.
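The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`tf_vector`, `cosine`, `rank_images`), the image identifiers and the image tags are hypothetical, and only the context words are taken from the Apple example in the slides.

```python
import math
from collections import Counter


def tf_vector(terms):
    """Map each (lowercased) term to its raw term frequency (TF)."""
    return Counter(t.lower() for t in terms)


def cosine(u, v):
    """Cosine similarity between two sparse TF vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_images(context_terms, image_tags):
    """Return (image_id, score) pairs sorted by descending similarity."""
    ctx = tf_vector(context_terms)
    scored = [(img, cosine(ctx, tf_vector(tags))) for img, tags in image_tags.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Context words from the slides' Apple example; the tags are invented.
context = ["juice", "fruit", "apples", "capital", "michigan", "orange"]
images = {
    "img_fruit": ["fruit", "apples", "juice", "red"],
    "img_phone": ["iphone", "apple", "store"],
}
ranking = rank_images(context, images)
```

In this toy data the image tagged with fruit-related terms shares three terms with the context and is ranked first, while the image with no overlapping tags gets a similarity of zero.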
R_tag-based = img_1; img_2; ...; img_q
R_context-based = img_1; img_2; ...; img_p
Aggregate: R_final = img_1; img_2; ...; img_l
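The slides do not say how the tag-based and context-based rankings are aggregated. Purely as an illustration, the sketch below merges two ranked lists with a Borda count, which is an assumed technique and not necessarily the one used in Multipedia; the function name `borda_aggregate` and the image identifiers are hypothetical.

```python
def borda_aggregate(*rankings):
    """Merge ranked lists of image ids with a Borda count (assumed method).

    An item at 0-based position i in a list of length n earns n - i points;
    an item missing from a list earns nothing from that list.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, img in enumerate(ranking):
            scores[img] = scores.get(img, 0) + (n - position)
    # Sort by descending score; ties are broken by image id for determinism.
    return sorted(scores, key=lambda img: (-scores[img], img))


# Hypothetical input rankings.
r_tag_based = ["img1", "img2", "img3"]
r_context_based = ["img2", "img4", "img1"]
r_final = borda_aggregate(r_tag_based, r_context_based)
```

An image that appears high in both lists accumulates the most points, so agreement between the two rankings is rewarded over a top position in only one of them.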
Multipedia Experiments
• How many context words produce the best results?
Apple context: «juice, fruit, apples, capital, michigan, orange»
Multipedia Experiments
• Ambiguity
• Search engines work well for:
   - unambiguous names
   - ambiguous names referring to a dominant sense, e.g., dbpedia:Stonehenge
• However, they fail for ambiguous names:
   - lacking a dominant sense, e.g., dbpedia:Apple
   - when they do not refer to the dominant sense, e.g., dbpedia:Blackberry
Multipedia Experiments
• Dominance:
• Dataset:
   - 10 classes, with 15 DBpedia resources (dbpr) randomly selected per class.
   - Each dbpr must 1) be popular and 2) have a dominance under 0.7.
   - With this limit we found dbpr for Mammals, Birds and Insects.
   - Increasing the dominance limit to 0.9, we found dbpr for the rest of the classes.
Multipedia Experiments
• 15 people evaluated the results of three approaches.
• Each image was rated by 3 evaluators.
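The slides do not state how the three ratings per image were combined into a single judgement. One common option, assumed here only for illustration, is a majority vote followed by a precision computation over the images returned for a resource; the function names and the ratings below are hypothetical.

```python
def majority_relevant(ratings):
    """True if more than half of the evaluators marked the image as relevant."""
    return sum(ratings) * 2 > len(ratings)


def precision(image_ratings):
    """Fraction of returned images judged relevant by majority vote."""
    if not image_ratings:
        return 0.0
    relevant = sum(majority_relevant(r) for r in image_ratings)
    return relevant / len(image_ratings)


# Hypothetical ratings: one row per image, one boolean per evaluator.
ratings_for_resource = [
    [True, True, False],
    [False, False, True],
    [True, True, True],
    [False, True, True],
]
p = precision(ratings_for_resource)
```

With an odd number of evaluators per image, as in the setup described above, a majority vote never produces ties.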
Multipedia Conclusions
• Multipedia: an approach to automatically populate an ontology with images related to its existing instances
• We focused on the particularly challenging problem of ambiguity in instance names
• Human-driven evaluation of the approach, involving 15 users and a total of 2250 image ratings, covering DBpedia resources from several classes.
• A variation of Multipedia improves average precision by 9.4% over a baseline of keyword queries to commercial image search engines
• We have validated that, in contrast to the baseline, our approach achieves the highest precision with ambiguous names lacking a dominant sense.