ARIADNE is funded by the European Commission's Seventh Framework Programme
Semantic Integration Experiments: Improving Interoperability and Reusability
Unlocking the Potential of Digital Archaeological Data, Florence, 15 December 2016
Maria Theodoridou, FORTH-ICS, Greece
The challenge
Build an Integrated Knowledge Repository and support innovative reasoning on archaeological datasets (relating and combining data), preserving the original meaning and the perspective of the different data providers.
Two main pillars:
• A global, extensible schema in the form of a formal ontology that allows for integration without loss of meaning
  – ARIADNE Reference Model = CIDOC CRM + Extension Suite
• Common vocabularies/terminologies
  – Use of well-established standard terminologies
  – Getty AAT
  – Nomisma.org
ARIADNE Reference Model
Few concepts, high recall
Special concepts, high precision
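The recall/precision trade-off named above can be illustrated with a toy class hierarchy. This is a minimal sketch only: the class names, items and helper functions are hypothetical and are not taken from the ARIADNE Reference Model. Querying a general concept returns every item classified under it (high recall), while querying a specialised concept returns a smaller, more targeted set (high precision).

```python
# Hypothetical mini class hierarchy: child -> parent.
# The names are illustrative, not taken from the ARIADNE Reference Model.
SUBCLASS_OF = {
    "Coin": "PhysicalObject",
    "Sculpture": "PhysicalObject",
    "WoodSample": "PhysicalObject",
}

# Hypothetical instance data: item -> most specific class.
ITEMS = {
    "item1": "Coin",
    "item2": "Coin",
    "item3": "Sculpture",
    "item4": "WoodSample",
}


def is_a(cls, target):
    """Return True if cls equals target or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_of(target):
    """All items whose class is target or one of its subclasses."""
    return sorted(item for item, cls in ITEMS.items() if is_a(cls, target))


broad = instances_of("PhysicalObject")  # general concept: every item matches
narrow = instances_of("Coin")           # special concept: only the coins match
print(broad)
print(narrow)
```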
Case Studies
• Numismatics
  – a traditional science with experience and initiatives in standardization, so it was chosen as a very good starting point for item-level integration
  – Nomisma.org serves as an authoritative resource
• Wood/Dendrochronology
  – integration of information from diverse datasets and (via NLP) archaeological reports in different languages
  – Getty AAT serves as an authoritative resource
• Sculptures
  – data integration of sources from various disciplines, including sculpture information and its archaeological context
  – focuses on the provenance of information according to bibliographic references, which leads to advanced literature research
Numismatics Case Study
Extracts of 5 diverse databases & datasets:
• OEAW: dFMRO coin archive, 72 records
• COINS Project: SAR Archive, 627 records
• COINS Project: FWM Archive
• iDAI Coins Pergamon, 517 records
• CulturaItalia: MuseiD-Italia, 25562 records
• NLP data from Heslington East Excavation Archive, 37 records
• ACDM records
Numismatics Case Study
Wood/Dendrochronology Case Study
• Extracts of 5 archaeological datasets, output from NLP on 25 grey literature reports
• Multilingual: English, Dutch and Swedish data
• Data integration via CIDOC CRM and Getty AAT
• 1.09 million RDF triples
• 23,594 records
• 37,935 objects
• Demonstration query builder for easier cross-search and browse of integrated datasets
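To make the role of a shared vocabulary in multilingual cross-search concrete, here is a minimal sketch. It assumes that local terms in English, Dutch and Swedish have already been aligned to one concept identifier. The identifiers, records and function names are hypothetical stand-ins and are not actual Getty AAT URIs or ARIADNE data.

```python
# Hypothetical concept identifiers standing in for shared vocabulary entries
# (for example Getty AAT concepts); the real identifiers are not given here.
OAK = "concept:oak"
PINE = "concept:pine"

# Local terms mapped to the shared concept they denote.
TERM_TO_CONCEPT = {
    "oak": OAK, "eik": OAK, "ek": OAK,        # English, Dutch, Swedish
    "pine": PINE, "den": PINE, "tall": PINE,  # English, Dutch, Swedish
}

# Invented sample records from three language communities.
records = [
    {"id": "en-1", "material": "Oak"},
    {"id": "nl-1", "material": "eik"},
    {"id": "sv-1", "material": "ek"},
    {"id": "sv-2", "material": "tall"},
]


def normalise(term):
    """Map a free-text term to its shared concept, or None if unmapped."""
    return TERM_TO_CONCEPT.get(term.strip().lower())


def find_by_concept(rows, concept):
    """Return ids of all rows whose material resolves to the given concept."""
    return [row["id"] for row in rows if normalise(row["material"]) == concept]


oak_ids = find_by_concept(records, OAK)
print(oak_ids)
```

A single query for the oak concept then reaches records regardless of the language the source data was recorded in.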
Wood/Dendrochronology Case Study
[Workflow diagram. Inputs: archaeological datasets from ADS, DANS and SND (DCCD, VAG cruck, NMS, VAG dendro, UNID), grey literature, and Getty AAT (RDF). Steps: NLP producing XML; tabular records; cleansing + normalisation (OpenRefine); transformation (STELETO, XSLT); direct import. Output: RDF triple store, accessed via SPARQL queries by the demonstration application (Query Builder).]
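The cleansing and transformation steps in this workflow can be pictured with a small, self-contained sketch. It does not use OpenRefine or STELETO and does not reproduce their behaviour. It only shows the general idea of normalising tabular values and emitting one RDF statement per cell in N-Triples syntax. The namespace, column names and sample rows are assumptions made for illustration.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, not an ARIADNE URI

# Invented tabular extract with typical inconsistencies (spacing, case, gaps).
RAW = """id,material,year
S1, Oak ,1450
S2,oak,
S3,PINE,1602
"""


def clean(value):
    """Cleansing + normalisation: trim whitespace and lower-case the term."""
    return value.strip().lower()


def row_to_triples(row):
    """Transformation: turn one tabular record into N-Triples lines."""
    subject = f"<{BASE}sample/{row['id'].strip()}>"
    triples = [
        f"{subject} <{BASE}material> <{BASE}material/{clean(row['material'])}> ."
    ]
    year = row["year"].strip()
    if year:  # skip empty cells instead of emitting an empty literal
        triples.append(f'{subject} <{BASE}year> "{year}" .')
    return triples


triples = [
    line
    for row in csv.DictReader(io.StringIO(RAW))
    for line in row_to_triples(row)
]
print("\n".join(triples))
```

After normalisation, the first two samples point at the same material resource even though the source cells were spelled differently.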
Sculptures Case Study
• Extracts of 5 diverse databases & datasets:
  – Archaeological object database: Arachne
  – Field research databases: Athenian Agora, iDAI.field
  – Museum data: British Museum
  – Research data: Oxford Roman Economy Project
• Data integration via CIDOC CRM and controlled vocabularies: Getty AAT, Wikidata, Zenon, iDAI.gazetteer
• Numismatics Case Study: 1.2M triples
• Wood/Dendrochronology Case Study: 1.5M triples
• Sculptures Case Study: 5.5M triples
• AAT thesaurus: 4.4M triples
Total ~13M triples
Contains different levels of information:
• Item-specific information
• Document research data
• NLP data
• Catalog information
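The quoted total can be checked by adding the four per-source figures (in millions of triples). The short sketch below does only that arithmetic; the dictionary keys are labels chosen here for readability.

```python
# Triple counts in millions, as listed for each source.
triples_millions = {
    "Numismatics case study": 1.2,
    "Wood/Dendrochronology case study": 1.5,
    "Sculptures case study": 5.5,
    "AAT thesaurus": 4.4,
}

total = sum(triples_millions.values())
print(f"{total:.1f}M triples, roughly {round(total)}M")
```

The sum is 12.6M, which rounds to the approximate 13M stated in the total.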
Technologies used:
http://www.metaphacts.com/
https://www.blazegraph.com/
Research questions
• Query mechanisms support innovative reasoning on archaeological datasets
• Query power lies in relating and combining
  – data from different providers, preserving the original meaning and their perspective
  – data from grey literature reports
  – item level with catalog info on archaeological datasets
Research questions
• Find all bronze coins (item level info, retrieves datasets from multiple providers)
• Find the publishers of all collections that contain coins (catalog info)
• Find all datasets and grey literature reports that contain bronze antoninianus (item level, NLP data and catalog info)
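The first two questions can be sketched over a tiny in-memory set of triples to show how item-level and catalog-level statements combine in one query. In the integrated repository such questions would be posed as SPARQL queries against the RDF triple store. The plain-Python matcher below is a simplified stand-in, and every identifier, predicate name and publisher in it is invented for illustration.

```python
# (subject, predicate, object) triples; all identifiers are hypothetical.
TRIPLES = {
    # Item-level information from three imaginary providers.
    ("a:coin1", "type", "Coin"),
    ("a:coin1", "material", "bronze"),
    ("a:coin1", "partOf", "coll:A"),
    ("b:coin7", "type", "Coin"),
    ("b:coin7", "material", "bronze"),
    ("b:coin7", "partOf", "coll:B"),
    ("c:coin3", "type", "Coin"),
    ("c:coin3", "material", "silver"),
    ("c:coin3", "partOf", "coll:C"),
    ("c:statue1", "type", "Sculpture"),
    ("c:statue1", "partOf", "coll:D"),
    # Catalog-level information about the collections.
    ("coll:A", "publisher", "Publisher A"),
    ("coll:B", "publisher", "Publisher B"),
    ("coll:C", "publisher", "Publisher C"),
    ("coll:D", "publisher", "Publisher D"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for ts, tp, to in TRIPLES
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


def bronze_coins():
    """Items typed as coins that also consist of bronze."""
    coins = {s for s, _, _ in match(p="type", o="Coin")}
    bronze = {s for s, _, _ in match(p="material", o="bronze")}
    return sorted(coins & bronze)


def publishers_of_coin_collections():
    """Publishers of every collection that contains at least one coin."""
    coins = {s for s, _, _ in match(p="type", o="Coin")}
    collections = {o for s, _, o in match(p="partOf") if s in coins}
    return sorted(o for s, _, o in match(p="publisher") if s in collections)


print(bronze_coins())
print(publishers_of_coin_collections())
```

The second function joins item-level statements (which items are coins, which collection they belong to) with catalog-level statements (who publishes each collection), which is the kind of combination the questions above rely on.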
[Query result illustration: SAR records, NLP record, CulturaItalia records, DAI record, OEAW records, catalog info.]
Contributing partners
Achille Felicetti, PIN
Carlo Meghini, CNR-ISTI
Philipp Gerth, DAI
Ceri Binding, USW
Douglas Tudhope, USW
Andreas Vlachidis, USW
Nadezhda Kecheva, NIAM-BAS
Sara di Giorgio, ICCU
Edeltraud Aspoeck, OEAW
Anja Masur, OEAW
ARIADNE is a project funded by the European Commission under the Community's Seventh Framework Programme, contract no. FP7-INFRASTRUCTURES-2012-1-313193. The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.