Date: 30/11/2012
SSONDE: Semantic Similarity On liNked Data Entities
Riccardo Albertoni [email protected]
Ontology Engineering Group, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid
Joint work with Monica De Martino (CNR-IMATI-GE)
MTSR 2012, 6th Metadata and Semantics Research Conference, 28-30 November 2012, Cádiz (Spain)
Presentation Outline
1. How SSONDE fits with other linked data technologies
• What is it for? What is it not for?
2. Characteristics of instance similarity in SSONDE
• The theory behind SSONDE’s similarity is detailed in Albertoni R. and De Martino M.; Asymmetric and context dependent semantic similarity among ontology instances, Journal of Data Semantics, LNCS, 2008.
3. SSONDE Architecture and Examples on Linked Data
Linked Data Crawling architectural pattern
[Figure: the Linked Data crawling architectural pattern. Labels: SSONDE; LDSpider/Fuseki; LDIF; build analysis services (cluster analysis, explorative search on resources).]
Tom Heath and Christian Bizer (2011) Linked Data: Evolving the Web into a Global Data Space (1st edition). 1-136. Morgan & Claypool
SSONDE Instance similarity
• is not meant to align ontologies/schemas or to interlink/consolidate entities;
• aims at
  • providing a method for comparing entities represented as instances in an ontology-driven repository or as entities exposed as linked data;
  • supporting explorative searches;
• assumes all the integration steps are done: it works at the Application Layer of the Linked Data Crawling Architectural Pattern;
• main characteristics (which make SSONDE unique of its kind):
  • Context, to represent similarity criteria (algorithm parameters);
  • Asymmetry, to emphasize containment between instances.
Example: comparing researchers
Sim(a,b) might differ from Sim(b,a)
• Sim is not the inverse of a metric distance, so metric properties cannot be exploited to prune comparisons.
Here asymmetry is adopted to highlight the containment between instances A and B.
Example of containment (comparing with respect to publications only):
• A is a Ph.D. student who has always published with his tutor B.
[Figure: A and B linked to their publications pub 1, pub 2 and pub 3.]
• A is contained in B (A<<B): A can be replaced by B.
• B is not contained in A: if you replace B with A, some experience gets lost.
SSONDE’s Asymmetric Similarity returns
Sim(A,B) ranges in [0,1].
It is proportional to the number of data and object property values that A shares with B:
• If A is contained in B, then Sim(A,B) = 1.
• If A is not contained in B, then Sim(A,B) < 1.
• If A and B don’t share any “features”, then Sim(A,B) = 0.
• If A has exactly the same characteristics as B (A<<B, B<<A), then Sim(A,B) = Sim(B,A) = 1.
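The properties listed above can be illustrated with a minimal Python sketch. It is not SSONDE's actual measure, which is defined in the cited Journal of Data Semantics paper. It assumes each entity is reduced to a flat set of "features" and uses a plain containment ratio, |A ∩ B| / |A|, which happens to satisfy the four listed cases. The function name `containment_similarity` and the assignment of publications to A and B are hypothetical.

```python
def containment_similarity(a_features, b_features):
    """Fraction of A's features that B also has: asymmetric, in [0, 1].

    Illustrative stand-in only, not the measure implemented by SSONDE.
    """
    a, b = set(a_features), set(b_features)
    if not a:
        # Assumption: an entity without features shares nothing with anyone.
        return 0.0
    return len(a & b) / len(a)


# Hypothetical data: Ph.D. student A always published with tutor B.
student_a = {"pub 1", "pub 2"}
tutor_b = {"pub 1", "pub 2", "pub 3"}

sim_ab = containment_similarity(student_a, tutor_b)  # A is contained in B
sim_ba = containment_similarity(tutor_b, student_a)  # B is not contained in A
sim_none = containment_similarity({"pub 9"}, tutor_b)  # nothing shared

print(sim_ab, sim_ba, sim_none)
```

Because the denominator is the size of the first argument only, swapping the arguments changes the result, which is the asymmetry the slide describes.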
Results comparing young and senior researchers of IMATI
[Figure: similarity matrices. Panels: Research Experience; Research Interest.]
The darker the matrix value, the higher the similarity.
SSONDE ARCHITECTURE
[Figure: SSONDE architecture. Recoverable labels:
• Configuration: kind of store; list of instances (or Java class to generate the list); ref. context; ref. rules (e.g., Jena rules).
• Similarity: Context Layer, Ontology Layer, Data Layer.
• Data wrappers: Jena TDB, Jena SDB, Jena MEM, Virtuoso wrapper, ...
• Stores: TDB Rep., SDB Rep., RDF dumps, Virtuoso.
• Output: similarity matrix in CSV; n-most similar entities in JSON; ...
• Web of Data: RDF dumps, HTTP dereferenceable URIs, SPARQL endpoints; linked datasets served by third parties.
• Crawling architectural pattern (LDIF, LDSpider + Fuseki): linked data consumption into a local data store/cache.]
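The architecture names two outputs: a similarity matrix in CSV and the n-most similar entities in JSON. The following Python sketch shows what producing such outputs could look like. It is only an assumption-laden illustration: the CSV and JSON layouts, the function names, the entities A, B and C, and the containment-ratio measure are all hypothetical and are not taken from SSONDE itself.

```python
import csv
import io
import json


def containment_similarity(a, b):
    """Hypothetical stand-in measure: share of A's features also found in B."""
    return len(a & b) / len(a) if a else 0.0


def similarity_matrix(entities):
    """Return sorted entity names and the full (asymmetric) matrix."""
    names = sorted(entities)
    matrix = [[containment_similarity(entities[r], entities[c]) for c in names]
              for r in names]
    return names, matrix


def matrix_to_csv(names, matrix):
    """Serialize the matrix with a header row and a leading name column."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([""] + names)
    for name, row in zip(names, matrix):
        writer.writerow([name] + row)
    return buf.getvalue()


def n_most_similar_json(names, matrix, n):
    """For each entity, list the n other entities it is most similar to."""
    result = {}
    for i, name in enumerate(names):
        others = [(names[j], matrix[i][j]) for j in range(len(names)) if j != i]
        others.sort(key=lambda pair: (-pair[1], pair[0]))
        result[name] = [{"entity": e, "similarity": s} for e, s in others[:n]]
    return json.dumps(result, indent=2)


# Hypothetical entities described by the publications they are linked to.
entities = {
    "A": {"pub 1", "pub 2"},
    "B": {"pub 1", "pub 2", "pub 3"},
    "C": {"pub 3"},
}
names, matrix = similarity_matrix(entities)
csv_text = matrix_to_csv(names, matrix)
json_text = n_most_similar_json(names, matrix, n=1)
print(csv_text)
print(json_text)
```

Rows are the first argument of Sim and columns the second, so the matrix is generally not symmetric.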
SSONDE: a building block for new analysis services
SSONDE applied on “real linked data”
• Analysing habitats and species
  • published in NatureSDIplus (ECP-2007-GEO-317007), a European project developing a Spatial Data Infrastructure for Nature Conservation;
  • to rank habitats according to the species they host, giving an insight into inter-dependencies between habitats and species.
• Analysing overlaps among scientific interests
  • subset of the linked dataset provided at data.cnr.it as part of the SemanticScout framework by third parties (Gangemi et al.);
  • to compare IMATI-CNR researchers according to their
(i) semantic similarity optimization:
  (i) the caching of intermediate similarity results;
  (ii) the adoption of the MapReduce paradigm to speed up the assessment of semantic similarity;
(ii) domain-driven extensions at the data layer:
  (iii) defining new data layer measures suited for geo-referenced entities;
  (iv) multilingual similarity;
(iii) definition of interfaces for sifting entities according to their similarity, exploiting visualization frameworks such as Exhibit, Google Visualization and the JavaScript InfoVis Toolkit.
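The caching of intermediate similarity results mentioned in the list above can be sketched in Python with plain memoization. This is a hedged illustration, not a description of how SSONDE does or will do it: the function `cached_similarity`, the counter `computations` and the containment-ratio measure are hypothetical, and the feature sets are assumed to be hashable `frozenset` values.

```python
from functools import lru_cache

computations = 0  # counts how often the measure is actually evaluated


@lru_cache(maxsize=None)
def cached_similarity(a, b):
    """Memoized stand-in measure; arguments must be hashable (frozenset)."""
    global computations
    computations += 1
    return len(a & b) / len(a) if a else 0.0


a = frozenset({"pub 1", "pub 2"})
b = frozenset({"pub 1", "pub 2", "pub 3"})

first = cached_similarity(a, b)    # computed
second = cached_similarity(a, b)   # served from the cache
reverse = cached_similarity(b, a)  # different key: the measure is asymmetric

print(first, second, reverse, computations)
```

Since Sim(A,B) and Sim(B,A) can differ, the cache key has to keep the argument order; the two directions are stored as separate entries.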
SSONDE Framework
• Albertoni R., De Martino M.; SSONDE: Semantic Similarity On liNked Data Entities, 6th Metadata and Semantics Research Conference, 28-30 November 2012, Cádiz (Spain) [to appear]
• Framework installation & use: http://code.google.com/p/ssonde/wiki/GettingStarted
Semantic Similarity Theoretical Framework
• Albertoni R. and De Martino M.; Asymmetric and context dependent semantic similarity among ontology instances, Journal of Data Semantics, LNCS, 2008.
• Albertoni R. and De Martino M.; Semantic similarity of ontology instances tailored on the application context. Full paper at On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE, volume 4275 of LNCS, pages 1020–1038. Springer, 2006.
Issues adapting theoretical framework to Linked Data
• Albertoni R., De Martino M.; Semantic Similarity and Selection of Resources Published According to Linked Data Best Practice, OnToContent 2010, Part of the OTM (OTM'10)
Further Applications
Comparing EUNIS habitats wrt their species
• Albertoni R., De Martino M.; Semantic Technology to Exploit Digital Content Exposed as Linked Data, eChallenges e-2011, 26-28 October 2011, Florence, Italy
Comparing shapes metadata (not Linked Data)
• Albertoni R., De Martino M.; Using Context Dependent Semantic Similarity to Browse Information Resources: an Application for the Industrial Design, First workshop on multimedia Annotation and Retrieval enabled by Shared Ontologies, Genoa, Italy, (2007)
A complete list of references on SSONDE and its Instance Similarity