A Semantic Web View on Concepts and their Alignments Antoine Isaac Vrije Universiteit Amsterdam Europeana Concepts in Context, Köln, July 19 th 2010

A Semantic Web View on Concepts and their Alignments

Antoine Isaac

Vrije Universiteit Amsterdam

Europeana

Concepts in Context, Köln, July 19th 2010

Linked Data Principles

1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information using standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things.

Tim Berners-Lee, http://linkeddata.org/

A way to publish Semantic Web data

A web of data

• Publish and re-use data via the web, building innovative applications over former data silos

• Principle #4 is crucial to this vision:
  Include links to other URIs, so that they can discover more things.

http://linkeddata.org/

SKOS, Knowledge Organization Systems and Linked Data

SKOS allows representing (simple) KOS data as RDF

animals
  NT cats

cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats

domestic cats
  USE cats

wildcats
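
The thesaurus entries above can be turned into SKOS statements along the following lines. This is only a minimal sketch: the namespace http://example.org/kos/, the helper names and the naive way of minting URIs from labels are assumptions made for illustration, and triples are kept as plain Python tuples rather than in an RDF library.

```python
# Hypothetical namespace for the example thesaurus (an assumption, not from the slides).
EX = "http://example.org/kos/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Thesaurus codes that link two concepts, mapped to SKOS properties.
CONCEPT_RELATIONS = {"BT": "broader", "NT": "narrower", "RT": "related"}


def uri(label):
    """Mint a concept URI from a label (naive slug, for illustration only)."""
    return EX + label.replace(" ", "_")


def thesaurus_to_skos(entries):
    """Turn {term: [(code, value), ...]} into a set of (s, p, o) triples."""
    triples = set()
    # Terms carrying a USE reference are non-preferred: they become labels, not concepts.
    non_preferred = {t for t, rels in entries.items()
                     if any(code == "USE" for code, _ in rels)}
    for term, relations in entries.items():
        if term in non_preferred:
            for code, target in relations:
                if code == "USE":
                    triples.add((uri(target), SKOS + "altLabel", term))
            continue
        triples.add((uri(term), RDF_TYPE, SKOS + "Concept"))
        triples.add((uri(term), SKOS + "prefLabel", term))
        for code, value in relations:
            if code in CONCEPT_RELATIONS:
                triples.add((uri(term), SKOS + CONCEPT_RELATIONS[code], uri(value)))
            elif code == "UF":
                triples.add((uri(term), SKOS + "altLabel", value))
            elif code == "SN":
                triples.add((uri(term), SKOS + "scopeNote", value))
    return triples


entries = {
    "animals": [("NT", "cats")],
    "cats": [("UF", "domestic cats"), ("RT", "wildcats"), ("BT", "animals"),
             ("SN", "used only for domestic cats")],
    "domestic cats": [("USE", "cats")],
    "wildcats": [],
}
triples = thesaurus_to_skos(entries)
for s, p, o in sorted(triples):
    print(s, p, o)
```

Note that "domestic cats" does not get a URI of its own: it ends up as a skos:altLabel of the concept for "cats", which is one common reading of a USE/UF pair.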

SKOS, KOSs and LD

SKOS allows bridging across KOSs from different contexts

http://www.w3.org/2004/02/skos/

Some landmark KOS LD implementations

• Many Libraries – not a surprise!
  • Swedish National Library’s Libris catalogue and thesaurus http://libris.kb.se/
  • Library of Congress’ vocabularies, including LCSH http://id.loc.gov/
  • DNB’s Gemeinsame Normdatei (incl. SWD subject headings) http://d-nb.info/gnd/
    Documentation at https://wiki.d-nb.de/display/LDS
  • BnF’s RAMEAU subject headings http://stitch.cs.vu.nl/
  • OCLC’s DDC classification http://dewey.info/ and VIAF http://viaf.org/
  • STW economy thesaurus http://zbw.eu/stw
  • National Library of Hungary’s catalogue and thesauri http://oszkdk.oszk.hu/resource/DRJ/404 (example)

• Other fields
  • Wikipedia categories through DBpedia http://dbpedia.org/
  • New York Times subject headings http://data.nytimes.com/
  • IVOA astronomy vocabularies http://www.ivoa.net/Documents/latest/Vocabularies.html
  • GEMET environmental thesaurus http://eionet.europa.eu/gemet
  • UMTHES
  • Agrovoc http://aims.fao.org/
  • Linked Life Data http://linkedlifedata.com/
  • Taxonconcept http://www.taxonconcept.org/
  • UK Public sector vocabularies http://standards.esd.org.uk/ (e.g., http://id.esd.org.uk/lifeEvent/7 )

KOS Alignments?

Quite a few of them are linked to some other resource
• LCSH, SWD and RAMEAU interlinked through MACS mappings
• GND linked to DBpedia and VIAF
• Libris linked to LCSH
• Agrovoc to CAT, NAL, SWD, GEMET
• NYT to Freebase, DBpedia, GeoNames
• DBpedia links are overwhelming
  Hungary, STW, TaxonConcept, GND…

Is that enough? Are these links any good?

[Cyganiak, Jentzsch] http://linkeddata.org/

Sparse linkage: the LD cloud

[Guéret, 2010] http://blog.larkc.eu/?p=1941

Sparse linkage: another view

Linked Data Issues

Mike Uschold’s “semantic elephants”
• Proliferation of URIs, Managing Coreference
• Versioning and URIs
• Overloading owl:sameAs

http://lists.w3.org/Archives/Public/public-lod/2010May/0012.html

What kind of links?

Coreference links are the most used (and needed)
• owl:sameAs
• skos:exactMatch
• skos:closeMatch
• rdfs:seeAlso
• umbel:isLike

Overloading owl:sameAs

• Formally, two URIs linked by owl:sameAs are inferred to have the same properties

  ex:a name "Antoine Isaac" .
  ex:b owl:sameAs ex:a .

  Implies

  ex:b name "Antoine Isaac" .

• Many owl:sameAs statements are asserted between resources that are only very similar [Halpin 2009]
  The same resource in different contexts, a reference…
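
The inference described above can be sketched in a few lines. This is a simplified illustration, not a full OWL reasoner: triples are plain tuples, only the symmetry of owl:sameAs and the copying of property values in subject position are handled, and the function name is an assumption.

```python
SAME_AS = "owl:sameAs"


def apply_same_as(triples):
    """Copy property values across owl:sameAs links until nothing new is inferred."""
    inferred = set(triples)
    while True:
        new = set()
        links = [(s, o) for s, p, o in inferred if p == SAME_AS]
        for a, b in links:
            # owl:sameAs is symmetric.
            new.add((b, SAME_AS, a))
            for s, p, o in inferred:
                if p == SAME_AS:
                    continue
                # Whatever is stated about one URI also holds for the other.
                if s == a:
                    new.add((b, p, o))
                elif s == b:
                    new.add((a, p, o))
        if new <= inferred:
            return inferred
        inferred |= new


triples = {
    ("ex:a", "name", "Antoine Isaac"),
    ("ex:b", SAME_AS, "ex:a"),
}
inferred = apply_same_as(triples)
for triple in sorted(inferred):
    print(triple)
```

Because every statement is copied, linking two resources that are only "very similar" also copies the statements that should have stayed with one of them.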

Case study: New York Times

• 10K concepts (places, descriptors, persons, organizations)
  http://data.nytimes.com

• Manually or automatically mapped by NYT staff to DBpedia, Freebase, GeoNames
  Linking the LD cloud to NYT articles!
  Makes it easy to mix NYT content with other content

• Started with quite messy modeling

http://data.nytimes.com/60694995023816375851
    dcterms:rightsHolder The New York Times Company .
http://data.nytimes.com/60694995023816375851
    owl:sameAs http://dbpedia.org/resource/Park_Slope%2C_Brooklyn .

Clearer KOS alignments (1)

What is being aligned?
Concepts, documents, real-world entities “out there” (persons, places…)

• In principle owl:sameAs should not be applied across disjoint categories

• But even for one category there can be issues
  • Two KOS concepts representing the same notion but with different management metadata attached (skos:changeNote)

Clearer KOS alignments (2)

How is it aligned? Distinguish:
• exact co-reference
• conceptual similarity, including equivalence
• classification

• Making clearer distinctions between conceptual links
  • skos:narrowMatch, skos:broadMatch, skos:relatedMatch

• Minimize ontological commitment for KOS data consumers
  • skos:exactMatch: concepts can be used interchangeably across a wide range of information retrieval applications. skos:exactMatch is a transitive property
  • skos:closeMatch: In order to avoid the possibility of "compound errors" when combining mappings across more than two concept schemes, skos:closeMatch is not declared to be a transitive property
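
The practical difference between the two properties shows up when mappings are chained across three schemes. The sketch below assumes three hypothetical concept identifiers (a:cats, b:cats, c:cats) and models only transitivity, leaving aside the symmetry of both properties.

```python
def transitive_closure(pairs):
    """Return the pairs plus every pair reachable by chaining them."""
    closure = set(pairs)
    while True:
        new = {(a, d)
               for a, b in closure
               for c, d in closure
               if b == c and a != d}
        if new <= closure:
            return closure
        closure |= new


# Hypothetical mappings between three concept schemes a, b and c.
exact_match = {("a:cats", "b:cats"), ("b:cats", "c:cats")}
close_match = {("a:cats", "b:cats"), ("b:cats", "c:cats")}

# skos:exactMatch is transitive, so chaining is licensed.
exact_inferred = transitive_closure(exact_match)

# skos:closeMatch is not declared transitive, so nothing is added.
close_inferred = set(close_match)

print(("a:cats", "c:cats") in exact_inferred)
print(("a:cats", "c:cats") in close_inferred)
```

A consumer that treats skos:closeMatch like skos:exactMatch would derive the a-to-c link as well, which is exactly the "compound error" the SKOS definition tries to avoid.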

Case study: New York Times (2)

Data quality has considerably improved
• Factual data is attached to the concept itself; management data is attached to the resource representing the data source (context)

http://data.nytimes.com/60694995023816375851
    rdf:type skos:Concept ;
    skos:prefLabel "Park Slope (NYC)" ;
    geo:lat "40.6701033" ;
    owl:sameAs http://dbpedia.org/resource/Park_Slope%2C_Brooklyn .

http://data.nytimes.com/60694995023816375851.rdf
    dcterms:rightsHolder "The New York Times Company" ;
    foaf:primaryTopic http://data.nytimes.com/60694995023816375851 .

• Still, for resources linked with owl:sameAs, statements representing different modeling choices can be merged
  The DBpedia resource might not be a skos:Concept, or might use a different latitude format

Clearer KOS alignments (3)

What is the alignment for?
• SKOS mapping properties use the notion of validity within one application context
• Application context for mapping has been investigated in thesaurus interoperability studies
• Application of alignments matters:
  • STITCH application scenarios for Cultural Heritage: book re-indexing, thesaurus merging, query reformulation…
  • The same alignment performs differently for different scenarios [Isaac 2008, Wang 2009]

Application-specific alignment evaluation

Example: OAEI 2007 campaign, 3 matching tools evaluated for thesaurus merging & book re-indexing

[Figure: two bar charts on a 0% to 100% scale comparing Falcon, Silas and DSSim; one panel shows Precision and Coverage, the other shows Pa and Ra]
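
As a rough illustration of the Precision and Coverage measures named in the chart, the sketch below scores one tool's mappings against a pooled set of mappings judged correct. The definitions used here are an assumption (precision as the share of a tool's mappings that are correct, coverage as the share of the pooled correct mappings that the tool found), and all concept identifiers and numbers are invented for the example; they are not the OAEI 2007 results.

```python
def precision(found, correct):
    """Share of the tool's mappings that were judged correct."""
    return len(found & correct) / len(found) if found else 0.0


def coverage(found, pooled_correct):
    """Share of the pooled correct mappings that this tool found."""
    if not pooled_correct:
        return 0.0
    return len(found & pooled_correct) / len(pooled_correct)


# Invented mappings between two vocabularies g and b.
tool_output = {("g:1", "b:1"), ("g:2", "b:2"), ("g:3", "b:9"), ("g:4", "b:4")}
pooled_correct = {("g:1", "b:1"), ("g:2", "b:2"), ("g:4", "b:4"),
                  ("g:5", "b:5"), ("g:6", "b:6")}

p = precision(tool_output, pooled_correct)
c = coverage(tool_output, pooled_correct)
print(f"precision={p:.2f} coverage={c:.2f}")
```

A scenario such as book re-indexing would instead be scored on the annotations the mappings produce, so the same alignment can rank differently there.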

Application-specific alignments

Why?

Take 2 thesauri at the Nat. Library of the Netherlands: GTT and Brinkman

• For thesaurus merging, gtt:excavation should be aligned to brinkman:excavation

• For book re-indexing, gtt:excavation should be aligned to brinkman:archeology_netherlands

• Requires a finer-grained representation of the context in which the alignment is produced
  • Who created it?
  • Manual vs. automatic?
  • Which alignment strategy or tool?
  • Is there a degree of confidence?

Case study: New York Times (3)

• Using nyt:mapping_strategy property with nyt:manual or nyt:automatic:

http://data.nytimes.com/60694995023816375851.rdf
    nyt:mapping_strategy http://data.nytimes.com/elements/manual .

• Problem: it applies to the context file for the concept, not to the statement itself:

http://data.nytimes.com/60694995023816375851 owl:sameAs http://dbpedia.org/resource/Park_Slope%2C_Brooklyn .

• Using simple binary properties (skos:exactMatch…) between aligned resources does not allow for much flexibility
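
One way around this limitation is to describe each mapping as a resource of its own, so that metadata can be attached to the statement rather than to the context file. The sketch below is an assumption-laden illustration: the Mapping class, the ex: property names and the identifier ex:mapping1 are all hypothetical, and only the two URIs and the "manual" strategy come from the NYT example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    """A mapping with optional metadata about how it was produced."""
    source: str
    relation: str
    target: str
    creator: Optional[str] = None
    strategy: Optional[str] = None      # e.g. "manual" or "automatic"
    tool: Optional[str] = None
    confidence: Optional[float] = None

    def to_triples(self, mapping_id):
        """Describe the mapping as a resource, so metadata hangs off the mapping itself."""
        triples = [
            (mapping_id, "ex:source", self.source),
            (mapping_id, "ex:relation", self.relation),
            (mapping_id, "ex:target", self.target),
        ]
        for name in ("creator", "strategy", "tool", "confidence"):
            value = getattr(self, name)
            if value is not None:
                triples.append((mapping_id, "ex:" + name, value))
        return triples


park_slope = Mapping(
    source="http://data.nytimes.com/60694995023816375851",
    relation="owl:sameAs",
    target="http://dbpedia.org/resource/Park_Slope%2C_Brooklyn",
    strategy="manual",
)
triples = park_slope.to_triples("ex:mapping1")
for triple in triples:
    print(triple)
```

The cost is visible in the output: one simple link has become four statements about an intermediate resource, which a consumer has to know how to interpret.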

Ontology Matching community practices

• Community investigating the ontology and vocabulary matching issues
  Ontology Alignment Evaluation Initiative http://oaei.ontologymatching.org

• Matching tools produce some metadata
• Metadata repositories store and manage them
  – BioPortal http://bioportal.bioontology.org/
  – CATCH vocabulary and alignment repository http://stitch.cs.vu.nl/repository/
  …

• Consensus: richer alignment metadata is needed

From a simple representation

to a more complete one

http://alignapi.gforge.inria.fr/edoal.html

Can LD accommodate complex representations?

• The strength of the LD vision lies in the relative simplicity of a standard representation

• LD provides a simple way to publish data and follow one’s nose to connected data
  Serendipity!

• Reification and metadata on links are not really compatible with it
  Higher barrier for data publication and consumption

Peaceful co-existence

• Applications with a narrow scope that require precise data can afford
  • Selecting alignments they consume
  • Exploiting finer-grained representations
  • Creating finer-grained representations

• Simple data for applications that are simple and/or exploiting a wide range of datasets
  • Simple mash-up applications robust to (limited) approximation
  • Web-scale applications
    Large-scale document retrieval, Concept discovery

Does it need to be perfect anyway?

• Do we really want to throw away crucial URI co-reference data?
  http://sameAs.org has 35,187,488 URIs in 11,285,263 bundles

• Extensive linking to DBpedia is useful, even with a type of link that is not used in the theoretically correct way
  Cf. BBC content and data mash-ups
  http://www.bbc.co.uk/wildlifefinder/ http://www.bbc.co.uk/music/

• Issues with mixed quality are being tackled
  – http://sameAs.org as a “service to provide you with help finding URIs”, keeping track of data sources
  – Representation and exchange of provenance info is under active investigation
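
The "bundles" mentioned above group URIs that are connected by co-reference links. How sameAs.org computes them is not described here; the sketch below simply assumes that a bundle is a connected component of the link graph and builds it with a small union-find. The example.org URIs are hypothetical.

```python
def bundles(same_as_pairs):
    """Group URIs into bundles: connected components of the co-reference links."""
    parent = {}

    def find(x):
        # Find the representative of x, shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return sorted((sorted(group) for group in groups.values()),
                  key=lambda group: group[0])


links = [
    ("http://data.nytimes.com/60694995023816375851",
     "http://dbpedia.org/resource/Park_Slope%2C_Brooklyn"),
    # Hypothetical third source linking to the same DBpedia resource.
    ("http://example.org/places/park-slope",
     "http://dbpedia.org/resource/Park_Slope%2C_Brooklyn"),
    # Hypothetical unrelated pair, forming a second bundle.
    ("http://example.org/people/1", "http://example.org/authors/a"),
]
result = bundles(links)
for bundle in result:
    print(bundle)
```

Keeping track of which source asserted each link, as the slide notes, would let a consumer drop the links it does not trust before building the bundles.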

Peaceful co-existence (2)

• If you have complex representation, don’t be pedantic and publish simpler data, too!

• Articulation between LD (to discover links) and alignment repositories is needed

• Technically feasible, best practices have to be identified
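
Publishing simpler data next to a richer representation could look like the following sketch, which flattens metadata-rich alignment cells into plain triples and lets a consumer select by scenario or confidence. The cell layout, the function name, the chosen mapping relations and the confidence values are assumptions for illustration; only the GTT and Brinkman concept names and the two scenarios come from the slides.

```python
def publish_simple(cells, scenario=None, min_confidence=0.0):
    """Flatten rich alignment cells into plain (source, relation, target) triples."""
    simple = []
    for cell in cells:
        if scenario is not None and cell["scenario"] != scenario:
            continue
        if cell["confidence"] < min_confidence:
            continue
        simple.append((cell["source"], cell["relation"], cell["target"]))
    return simple


# Relations and confidence values below are invented for the example.
cells = [
    {"source": "gtt:excavation", "relation": "skos:exactMatch",
     "target": "brinkman:excavation",
     "scenario": "thesaurus merging", "confidence": 0.9},
    {"source": "gtt:excavation", "relation": "skos:relatedMatch",
     "target": "brinkman:archeology_netherlands",
     "scenario": "book re-indexing", "confidence": 0.7},
]

everything = publish_simple(cells)
reindexing = publish_simple(cells, scenario="book re-indexing")
confident = publish_simple(cells, min_confidence=0.8)
print(everything)
print(reindexing)
print(confident)
```

The rich cells stay in the alignment repository, while the flattened triples are what a Linked Data consumer would discover by following links.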

Conclusions

• (Almost) any alignment is better than none
  This is a web of data: without links there’s almost no value

• There is already great linking happening!

• More involvement from this community would certainly help!
  Alignments themselves & theoretical foundations

Thanks!

Possible participation channels:
• Linked Open Data community (http://linkeddata.org) and mailing list ([email protected])
• Library Linked Data W3C incubator group (http://www.w3.org/2005/Incubator/lld/wiki/ ) and community list ([email protected])

References

• [Halpin 2009] Harry Halpin, Pat Hayes. When owl:sameAs isn't the Same: An Analysis of Identity Links on the Semantic Web. LDOW 2009

• [Isaac 2008] Antoine Isaac, Henk Matthezing, Lourens van der Meij, Stefan Schlobach, Shenghui Wang, Claus Zinn. Putting ontology alignment in context: usage scenarios, deployment and evaluation in a library case. ESWC 2008

• [Wang 2009] Shenghui Wang, Antoine Isaac, Balthasar Schopman, Stefan Schlobach, Lourens van der Meij. Matching multi-lingual subject vocabularies. ECDL 2009