October 24, 2009 Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs - ICWSM 2007
Using Ontologies to Strengthen Folksonomies and Enrich Information Retrieval in Weblogs
Alexandre Passant EDF, Recherche & Développement, Clamart, France Université Paris 4, Laboratoire LaLICC, Paris, France
1 Corporate Blogging
Classical (old-school) information systems
• Features: • Workflows • Access Control • Templates • Archives
• Designed to store: • Project reports • Internal notes • Hierarchical information
• But what about informal and unstructured content? • Short news • E-mails • Coffee-break discussions
A new vision: Enterprise 2.0
• Web 2.0 concepts in a corporate environment
• Write and share any knowledge:
• Blogs • Comments, trackbacks
• Re-use and capitalize knowledge:
• Wikis • Open structure and cooperation
• Be informed:
• RSS feeds • Shared subscriptions
EDF R&D Experiment
• EDF R&D: • 2000 researchers • 3 research centers
• Our platform: • RSS subscription system • Blogs • Group wikis
• Experiment (1.5 years): • 1500 feeds, 8000 blog posts • 80 bloggers, 600 readers / subscribers
2 Weblogs Indexing and Folksonomies
Indexing blog posts
• Metadata about the container:
• Automatically created metadata: • Author, creation date, title • Dublin Core standards
• Metadata about blog post content:
• Previously defined: • Ontologies, taxonomies
• User-defined: • Free keywords and folksonomies
• Folksonomies: • Easy to adopt • Built from scratch
Limits of folksonomies
• Tag variation: • Different tags for the same meaning:
• Synonymy, Abbreviation, Acronyms … • Typos
• Tag ambiguity: • Different meanings for the same tag:
• Acronyms, Homonymy …
• Flat organisation: • No way to suggest related tags
• Paris / France • blog / socialsoftware
Ambiguity example from del.icio.us
Some tools to reduce variation and explosion
• Tag completion: • Forward and “backward” completion • AJAX
• Tag suggestion: • Based on blog post content • Suggesting existing tags • But also new ones using named-entity extraction • AJAX
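Tag completion can be sketched as follows. This is a minimal illustration, not the platform's code: it assumes "forward" completion means matching the beginning of an existing tag and "backward" completion means matching its end (the slides do not define either term). The function names and sample tags are hypothetical.

```python
def forward_completion(fragment, tags):
    """Return existing tags that start with the typed fragment."""
    fragment = fragment.lower()
    return sorted(t for t in tags if t.lower().startswith(fragment))


def backward_completion(fragment, tags):
    """Return existing tags that end with the typed fragment (assumed meaning)."""
    fragment = fragment.lower()
    return sorted(t for t in tags if t.lower().endswith(fragment))


# Hypothetical folksonomy content.
existing_tags = {"semanticweb", "web2.0", "socialsoftware", "software", "sparql"}

print(forward_completion("s", existing_tags))
print(backward_completion("web", existing_tags))
```

Proposing tags that already exist is what limits variation: a user who sees "semanticweb" offered is less likely to create a new spelling of it.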
3 The Semantic Web
A Few Words About the Semantic Web
• Semantic Web
• Extending the current Web • Unified description of resources
• Ontologies
• Representation of a domain • Common vocabulary
• Languages
• Modeling: RDF(S), OWL … • Querying: SPARQL …
Weblogging and the Semantic Web
• Tools: • Snippet Manager [Cayzer et al. 2005] • semiBlog [Möller et al. 2006]
• Folksonomies: • Tag ontology to represent connections between tags [Newman 2005] • Tagging model [Gruber 2005]
• Vocabularies:
• RSS 1.0 based on RDF, can be extended with other vocabularies • SIOC: Semantically-Interlinked Online Communities [Breslin et al. 2005]
• State of the art: • "What next for Semantic Blogging" [Cayzer 2006]
4 Strengthening Folksonomies
A mix of Folksonomies and Ontologies
• Mixing both approaches • Keep the open spirit of folksonomies • Use an ontology layer as a formal way to represent data • Remove ambiguity and add meaning to tags to enhance search experience
• How? • Link tags to domain ontology concepts (classes / instances) • Create links between posts and ontology concepts using SIOC
• First steps • Analyse the folksonomy and most popular tags • Create a core ontology (based on PROTON) and its instances • 300 classes, 600 instances, mainly named entities
A simple way to link tags to ontology instances
From Tags to Ontology
1) Create and tag post
2) Link tag to ontology
3) Link post to ontology
4) Create new post
5) Infer related post
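The five steps can be sketched with plain Python dictionaries. This is an illustration only: the real platform stores these links as RDF, and the `ex:` identifiers, tag names, and function names below are hypothetical.

```python
from collections import defaultdict

# Step 2 (hypothetical data): tags linked to ontology instances.
# Two different tags point to the same instance.
tag_to_concept = {
    "sw": "ex:SemanticWeb",
    "semanticweb": "ex:SemanticWeb",
}

# Post id -> ontology instances the post is linked to.
post_topics = defaultdict(set)


def publish(post_id, tags):
    """Steps 1 and 3: store a tagged post and link it to the instances of its tags."""
    for tag in tags:
        concept = tag_to_concept.get(tag)
        if concept is not None:
            post_topics[post_id].add(concept)


def related_posts(post_id):
    """Step 5: other posts sharing at least one ontology instance with post_id."""
    topics = post_topics[post_id]
    return sorted(p for p, t in post_topics.items() if p != post_id and topics & t)


publish("post-1", ["sw"])           # steps 1 to 3
publish("post-2", ["semanticweb"])  # step 4: a new post using a different tag
print(related_posts("post-2"))      # step 5
```

The two posts share no tag, yet they are found to be related because both tags resolve to the same ontology instance.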
Dealing with ambiguity and maintaining instances
• User interface to remove ambiguity when adding a new post
• Ability to browse the ontology and link a tag to another class / instance when the tag is new or has a different meaning
• Restricted back office to create new instances
5 Enhanced Information Retrieval
A Semantic Search Engine
• Decentralized data:
• Ontologies • Blog posts w/ RDF description • A central 3store w/ SPARQL
• Using tags / ontology links to:
• Deal with tag variations • Remove ambiguity
Suggesting related tags and posts
• Goal:
• Offer suggestions to users based on their current tag search • Let users discover potentially interesting posts
• Different approaches, using both tagging and ontologies:
• Using instance co-occurrence • Using instance types • Defining specific rules
• Run on the fly and suggest results, using SPARQL
Using instance co-occurrence
• Basic approach:
– Posts sharing concepts
– Ontology to remove ambiguity
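A minimal sketch of the co-occurrence approach, assuming posts are already linked to ontology instances. The data and the function name `co_occurring` are hypothetical; the platform itself runs this kind of lookup as a SPARQL query.

```python
from collections import Counter

# Hypothetical data: post id -> ontology instances it is linked to.
post_topics = {
    "post-1": {"ex:Paris", "ex:EDF"},
    "post-2": {"ex:Paris", "ex:EDF", "ex:SPARQL"},
    "post-3": {"ex:SPARQL", "ex:RDF"},
}


def co_occurring(instance, post_topics):
    """Count the instances that appear in the same posts as the given instance."""
    counts = Counter()
    for topics in post_topics.values():
        if instance in topics:
            counts.update(topics - {instance})
    return counts.most_common()


print(co_occurring("ex:Paris", post_topics))
```

Counting on instances rather than raw tags is what keeps the suggestion unambiguous: two posts tagged with the same ambiguous string but linked to different instances do not contribute to each other's counts.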
Using instance types
• Suggesting instances of the same class:
– Programming languages
– Energy companies
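Suggesting instances of the same class can be sketched as a lookup on class membership. The instance and class identifiers and the function name `siblings` are hypothetical, and the sketch ignores subclass hierarchies.

```python
# Hypothetical instance -> class assignments.
instance_class = {
    "ex:Python": "ex:ProgrammingLanguage",
    "ex:Java": "ex:ProgrammingLanguage",
    "ex:PHP": "ex:ProgrammingLanguage",
    "ex:EDF": "ex:EnergyCompany",
}


def siblings(instance, instance_class):
    """Return the other instances that belong to the same class."""
    cls = instance_class.get(instance)
    if cls is None:
        return []
    return sorted(i for i, c in instance_class.items() if c == cls and i != instance)


print(siblings("ex:Python", instance_class))
```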
Defining specific rules
• Rule(s) for each class
• Location / {sub|top}locations • Company / companies working in the same area
Rule example
• RDF rule
    {
      :x rdf:type sioc:post .
      :y rdf:type sioc:post .
      :x sioc:topic :a .
      :y sioc:topic :b .
      :a rdf:type ptop:Agent .
      :b rdf:type ptop:Agent .
      :a company:expertiseIn :d .
      :b company:expertiseIn :d
    } => {
      :x sioc:related_to :y
    } .
• Reformulated w/ SPARQL
    SELECT ?post ?topic ?domain
    WHERE {
      <$uri> rdf:type ptop:Agent ;
             company:expertiseIn ?domain .
      ?post sioc:topic ?topic .
      ?topic rdf:type ptop:Agent ;
             company:expertiseIn ?domain .
    }
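The effect of the rule can be illustrated in plain Python over a small set of triples. This is a sketch under stated assumptions: the property and class names are taken from the rule, while the `ex:` resources and the helper names are hypothetical. Unlike the rule as written, the sketch skips the trivial case where a post is related to itself.

```python
from itertools import permutations

# Hypothetical triples (subject, predicate, object).
triples = {
    ("ex:post1", "rdf:type", "sioc:post"),
    ("ex:post2", "rdf:type", "sioc:post"),
    ("ex:post1", "sioc:topic", "ex:CompanyA"),
    ("ex:post2", "sioc:topic", "ex:CompanyB"),
    ("ex:CompanyA", "rdf:type", "ptop:Agent"),
    ("ex:CompanyB", "rdf:type", "ptop:Agent"),
    ("ex:CompanyA", "company:expertiseIn", "ex:Energy"),
    ("ex:CompanyB", "company:expertiseIn", "ex:Energy"),
}


def objects(subject, predicate):
    """All objects of the triples matching the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def infer_related():
    """Relate posts whose topics are agents sharing a domain of expertise."""
    posts = sorted(s for s, p, o in triples if p == "rdf:type" and o == "sioc:post")
    agents = {s for s, p, o in triples if p == "rdf:type" and o == "ptop:Agent"}
    inferred = set()
    for x, y in permutations(posts, 2):  # ordered pairs of distinct posts
        for a in objects(x, "sioc:topic") & agents:
            for b in objects(y, "sioc:topic") & agents:
                shared = objects(a, "company:expertiseIn") & objects(b, "company:expertiseIn")
                if shared:
                    inferred.add((x, "sioc:related_to", y))
    return inferred


for triple in sorted(infer_related()):
    print(triple)
```

The nested loops are acceptable for an illustration; on a real triplestore the same join is left to the SPARQL engine.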
Running and displaying results
• SPARQL queries on the triplestore
• Offering different tagclouds to split results in clusters:
• Instances • Specific rules • Co-occurrence
• Let users know why they should have a look at related posts
6 Conclusion
Conclusion
• Mixing folksonomies and ontologies • Keep simplicity of free tagging • Ontologies to reduce limits of free tagging • Infer related posts / topics • A mix of top-down and bottom-up approaches
• A real use-case • Adapt model and interfaces to users' needs
• Future work • Knowledge extraction from weblogs • Use wikis to add properties to ontology instances • New ways to browse and query blog posts
Thank you!