CAEPIA 2011 Linked Data Methodology

Post on 13-Jun-2015

http://www.weso.es http://www.bcn.cl

An architecture and process of implantation for Linked Data environments

A case study for the Library of Congress of Chile

Francisco Cifuentes – José María Álvarez – Christian Sifaqui – José Emilio Labra

TLDE-CAEPIA 2011

Overview: this talk in 1 minute

Why? Linked Open Data in Public Administrations

How? Proposal of Architecture

Adoption process

Where? Library of Congress - Chile

Linked Open Data in Public Administrations

Government data & actions can be supervised

Improve transparency & confidence

Linked Open Data in Public Administrations

Public value (generates citizen experience)

Research & Collaboration

Reuse data

Linked Open Data in Public Administrations

Public information belongs to citizens

Financed by public resources

Return on investment

Linked Open Data in Public Administrations

Legislation is public information…

…and must be in the public domain

Everyone is affected by laws

OK, Linked Open Data is good! But…

Architecture & Adoption Process

There is huge interest in publishing LOD

Practical guidelines & methodologies?

Our proposal:

Architecture of Linked Open Data

Implementation methodology

Considerations in Public Administrations context

Large volumes of data

Semistructured content

Contents of general interest

High expectations

New projects should not interfere

Small teams in large organizations

Low semantic expertise

Linked Open Data Architecture

[Architecture diagram]

Server side: Operating System, Web Server, Web Application Server, RDF Storage, Cache DB, SPARQL Endpoint, Output RDF Graph, Ontologies, Documentation Portal, Update RDF Graph Service

Client side: Web Browser, Semantic Application

Adoption Process

[Diagram: phases plotted over time]

Contextualization

Ontology design

RDF Graph Modeling

SPARQL Endpoint Implementation

RDF Graph Implementation

Update Graph Service

Documentation Web Portal

Non-functional Requirements

Optional Data Visualization & demos

OK, you propose an architecture & adoption process, but…

Contextualization

Library of Congress - Chile

Contextualization

Leychile 2008

Juridical certainty

LOD in Leychile: Natural extension

Improve interoperability (more formats)

Create domain ontologies

Complex queries through SPARQL endpoint

Contextualization

Publish Linked Open Data – 5 stars

Norms and relationships in a global RDF graph

Infrastructure for future developments

First stage, pilot project

Contextualization

≈ 300,000 norms and their relationships

Modifications, Concordances, etc.

First stage ⇒ Only main metadata of norms

Title, important dates, types, relationships

We exclude body text (articles, chapters, etc.)

Contextualization

Definition of domain model:

Norms, relationships, types of norms, metadata

Functional Requirements for Bibliographic Records (FRBR)

Output formats: RDFa, RDF/XML, JSON, N3,…
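To make the domain model and one of its output formats concrete, here is a minimal sketch that serializes the main metadata of a norm as N-Triples. The property names (`dc:title`, `n:publishDate`, `n:isModifiedBy`) are the ones used in the talk's SPARQL example; the `Norm` class, the helper names and the `example.org` URIs in the usage line are assumptions made for illustration, not part of the published ontology.

```python
from dataclasses import dataclass, field

# Namespaces as they appear in the talk's example query; which properties
# the real ontology defines beyond these is not stated in the slides.
DC = "http://purl.org/dc/elements/1.1/"
N = "http://datos.bcn.cl/ontologies/bcn-norms#"


@dataclass
class Norm:
    """Hypothetical in-memory view of a norm's main metadata."""
    uri: str
    title: str
    publish_date: str  # ISO date, e.g. "1995-02-17"
    modified_by: list[str] = field(default_factory=list)  # URIs of modifying norms


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(norm: Norm) -> list[str]:
    """Return one N-Triples line per metadata statement of the norm."""
    lines = [
        f'<{norm.uri}> <{DC}title> "{escape_literal(norm.title)}" .',
        f'<{norm.uri}> <{N}publishDate> "{escape_literal(norm.publish_date)}" .',
    ]
    for other in norm.modified_by:
        lines.append(f"<{norm.uri}> <{N}isModifiedBy> <{other}> .")
    return lines


decree = Norm("http://example.org/norm/296", "Decree 296", "1995-02-17",
              ["http://example.org/norm/12066"])
print("\n".join(to_ntriples(decree)))
```

Literals are written as plain strings here; whether the real graph types dates as `xsd:date` is not stated in the slides.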

Domain Ontologies

Small Ontology about Norms

RDF Graph Modeling

A norm can be modified by another norm

Decree 296

Published 1995-02-17

Art. 1. abc.

Art. 2. def.

Art. 3. ghi.

Decree 12066

Published 2005-05-15

Art. 1. Modify Decree 296 as follows: in Art. 1, replace “a” with “xyz”.

Now, Decree 296 should be:

Decree 296

Art. 1. xyzbc.

Art. 2. def.

Art. 3. ghi.
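The example above can be sketched as versioned data: each modification produces a new version of the norm and leaves the earlier one intact. The `Version` class and `apply_substitution` function are hypothetical names, and dating the amended version by the publication date of Decree 12066 is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One dated version of a norm's text (illustrative structure only)."""
    valid_from: str            # ISO date from which this text applies
    articles: dict[int, str]   # article number -> article text


def apply_substitution(base: Version, valid_from: str, article: int,
                       old: str, new: str) -> Version:
    """Return a new version in which `old` is replaced by `new` in one article.

    The base version is not changed, so the original text stays available.
    """
    if old not in base.articles[article]:
        raise ValueError(f"{old!r} not found in article {article}")
    articles = dict(base.articles)  # copy, so the base version keeps its text
    articles[article] = articles[article].replace(old, new, 1)
    return Version(valid_from, articles)


original = Version("1995-02-17", {1: "abc.", 2: "def.", 3: "ghi."})
amended = apply_substitution(original, "2005-05-15", 1, "a", "xyz")
print(amended.articles[1])   # xyzbc.
print(original.articles[1])  # abc.
```

Keeping both versions is what lets the graph answer questions about the text of a norm at a given date.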

RDF Graph Modeling

Careful URI Design

Expressiveness

RDF Graph Modeling

http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/

Decree 296

http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@1995-02-17

Original

http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@2005-05-10

Latest version
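The URIs above appear to follow a regular pattern: country, norm type, issuing body, publication date and number, with a language and date suffix for each version. The following sketch builds such URIs; the reading of each path segment is inferred from the examples rather than taken from the URI guidelines, and the function names are hypothetical.

```python
BASE = "http://datos.bcn.cl/recurso"


def norm_uri(country: str, norm_type: str, issuer: str,
             publish_date: str, number: int) -> str:
    """URI of the norm as a whole, independent of any version."""
    return f"{BASE}/{country}/{norm_type}/{issuer}/{publish_date}/{number}"


def version_uri(norm: str, language: str, version_date: str) -> str:
    """URI of one dated, language-specific version of a norm."""
    return f"{norm}/{language}@{version_date}"


decree = norm_uri("cl", "DTO", "ministerio-del-interior", "1995-02-17", 296)
print(version_uri(decree, "es", "1995-02-17"))
```

Generating URIs from metadata in one place keeps them predictable, which matters because every link in the graph depends on them.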

Links to other datasets (Countries for International Treaties)

DBpedia, Geonames

Reuse vocabularies / Ontologies

SKOS, DC, FOAF, DBpedia, ORG

Triplestore: Openlink Virtuoso

SPARQL Endpoint

SPARQL Endpoint

Example of query

Find all norms issued by a municipality between 1995 and 2000 that were modified after 2005.

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX n: <http://datos.bcn.cl/ontologies/bcn-norms#>

SELECT ?normTitle ?creatorName ?pubDate ?pubDateOther
WHERE {
  ?norm n:createdBy ?creator .
  ?creator n:hasName ?creatorName .
  ?norm dc:title ?normTitle .
  ?norm n:publishDate ?pubDate .
  ?norm n:isModifiedBy ?otherNorm .
  ?otherNorm n:publishDate ?pubDateOther .
  FILTER (regex(?creatorName, "MUNICIPALIDAD", "i"))
  FILTER (?pubDate > "1995" && ?pubDate < "2000" && ?pubDateOther > "2005")
}
ORDER BY (?pubDate)
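A client could submit a query like the one above over the standard SPARQL Protocol. The sketch below only builds the HTTP request, using the Python standard library; the endpoint address is an assumption (the slides do not give it), and `build_sparql_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint address; not stated in the slides.
ENDPOINT = "http://datos.bcn.cl/sparql"


def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the `Accept` header asks for the standard JSON results format, and other formats can be requested the same way.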

RDF Graph Implementation

http://code.google.com/p/weso-desh/

We developed a Linked Data Frontend (WESO-DESH)

Content negotiation based on HTTP 303 See Other

Definition of URIs based on regular expressions

Easy configuration

Support for CONSTRUCT, ASK & DESCRIBE

Delegates output formats to SPARQL Endpoint

Result caching

GUI for administration backend (in progress)
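The combination of regular-expression URI definitions and 303 redirects can be illustrated with a small sketch. This is not WESO-DESH code: the rule table, the `/page/` and `/data/` prefixes and the `negotiate` function are assumptions chosen to show the general technique.

```python
import re

# Hypothetical configuration: URI pattern -> document URL template per media type.
RULES = [
    (re.compile(r"^/recurso/(?P<path>.+)$"), {
        "text/html": "/page/{path}",
        "application/rdf+xml": "/data/{path}.rdf",
    }),
]


def negotiate(path: str, accept: str) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a resource URI and an Accept header."""
    # Simplification: follows the order of the Accept header and ignores q-values.
    wanted = [part.split(";")[0].strip() for part in accept.split(",")]
    for pattern, targets in RULES:
        match = pattern.match(path)
        if not match:
            continue
        for media_type in wanted:
            if media_type in targets:
                location = targets[media_type].format(**match.groupdict())
                # 303 See Other: the resource is not a document itself,
                # so point the client at a document that describes it.
                return 303, {"Location": location, "Vary": "Accept"}
        return 406, {}  # URI known, but no acceptable representation
    return 404, {}      # no rule matches this URI


print(negotiate("/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296", "text/html"))
```

A browser asking for HTML and an RDF client asking for RDF/XML receive different `Location` values for the same resource URI, so one identifier serves both audiences.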

RDF Graph Implementation

WESO-DESH (Linked Data Frontend)

Output HTML+RDFa

XML Configuration

Update Graph Service

*ETL = Extraction, Transformation, Loading

Automatic extraction & transformation process to update the RDF Graph

Based on Pentaho - Kettle ETL

Executes Transformations in threads

Configuration in XML
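The idea of running transformations in threads can be sketched with a thread pool from the Python standard library. This does not use Kettle and is not the project's updater; the record layout and the function names are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

Triple = tuple[str, str, str]


def transform(record: dict[str, str]) -> list[Triple]:
    """Turn one extracted record into triples (hypothetical field names)."""
    subject = record["uri"]
    return [(subject, key, value) for key, value in record.items() if key != "uri"]


def run_update(records: list[dict[str, str]], workers: int = 4) -> list[Triple]:
    """Transform records in a thread pool and collect the triples to load."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = pool.map(transform, records)  # results come back in input order
        return [triple for batch in batches for triple in batch]


extracted = [
    {"uri": "http://example.org/norm/296", "title": "Decree 296"},
    {"uri": "http://example.org/norm/12066", "title": "Decree 12066"},
]
for triple in run_update(extracted):
    print(triple)
```

Threads help most when each transformation waits on I/O, such as reading from the source database; the final loading step into the triplestore is left out here.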

Documentation

Documentation Web Portal: TYPO3 CMS

Sections:

URI construction guidelines

Example queries

Output formats

Ontology documentation

etc.

Non-Functional Requirements

Response time

Cache system, Profiling

Security & privacy

Different views and access levels of RDF Graph

Others

Internationalization

Accessibility

Use of standards
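For the response-time requirement, a cache in front of the SPARQL endpoint is one common approach. The sketch below is a time-limited result cache; the `QueryCache` class, its parameters and the five-minute default are assumptions, not a description of the deployed cache system.

```python
import time
from typing import Callable


class QueryCache:
    """Cache query results for a fixed number of seconds."""

    def __init__(self, run_query: Callable[[str], str], ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run_query = run_query   # function that actually hits the endpoint
        self._ttl = ttl               # seconds a cached result stays valid
        self._clock = clock           # injectable clock, handy for testing
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, query: str) -> str:
        now = self._clock()
        cached = self._entries.get(query)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]          # still fresh: skip the endpoint
        result = self._run_query(query)
        self._entries[query] = (now, result)
        return result


cache = QueryCache(lambda q: f"results for {q}")
print(cache.get("ASK { ?s ?p ?o }"))
```

An expiry time is a reasonable fit when the graph changes in periodic batches, since a slightly stale answer is acceptable between updates.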

Optional: Data visualization

http://www.weso.es/lodviz/

Prototype tool: LODViz (Linked Open Data Visualization)

Based on HTML5 (pattern library)

Work in progress

Results

Public Dataset Catalogs Faceted Browser - CTIC Foundation

Five stars Linked Open Data

Conclusions

First stage finished

More than 300,000 norms exported

≈ 8 million triples, ≈ 27 triples per norm

200/400 triples added each day

3 tools in development

WESO DESH - Linked data frontend

WESO RUD – RDF Updater

LODVIZ – Linked Open Data Visualization

Proposed methodology of Linked Open Data

Future Work

Library of Congress of Chile

More datasets: Biographies, Geographical data

History of Law

Improve documentation

WESO Research group

Semantic search engine

Entity extraction & reconciliation in text

Resource Recommendation

Provenance & graph views

The End

http://www.weso.es

http://www.bcn.cl

More Information

Main Team

Francisco Cifuentes
Member of WESO Research Group and Library of Congress of Chile
http://www.weso.es/~fcifuentes

José María Álvarez
Member of WESO Research Group
http://josemalvarez.es

Christian Sifaqui
Head of Systems and Network information services, Library of Congress of Chile
http://sifaqui.blogspot.com/

José Emilio Labra
Associate Professor at the University of Oviedo and Head of WESO Research Group
http://www.di.uniovi.es/~labra/

Credits

Most of the pictures were obtained from the Internet.

Transparency image: http://2.bp.blogspot.com/--wFwsKwMgAg/TjSDXOLCTzI/AAAAAAAAOzQ/qvBtbShckdI/s1600/11.2.bmp

Euros: Minuto digital. http://www.minutodigital.com/wp-content/uploads/euros-300x196.jpg

Library: http://ffernandez.files.wordpress.com/2010/04/biblioteca.jpg

FRBR: http://cucataloging.blogspot.com/

Contextualization: http://tentblogger.com/right-advertisers/

Documentation: http://susops.blogspot.com/2010/07/power-of-documentation.html
