Page 1: CAEPIA 2011 Linked Data Methodology

http://www.weso.es http://www.bcn.cl

An architecture and process of implantation for Linked Data environments

A case study for the Library of Congress of Chile

Francisco Cifuentes, José María Álvarez, Christian Sifaqui, José Emilio Labra

TLDE-CAEPIA 2011

Page 2: CAEPIA 2011 Linked Data Methodology

Overview: this talk in 1 minute

Why? Linked Open Data in Public Administrations

How? Proposal of architecture

Adoption process

Where? Library of Congress - Chile

Page 3: CAEPIA 2011 Linked Data Methodology

Linked Open Data in Public Administrations

Government data & actions can be supervised

Improve transparency & confidence

Page 4: CAEPIA 2011 Linked Data Methodology

Linked Open Data in Public Administrations

Public value (generates citizen experience)

Research & Collaboration

Reuse data

Page 5: CAEPIA 2011 Linked Data Methodology

Linked Open Data in Public Administrations

Public information belongs to citizens

Financed by public resources

Return on investment

Page 6: CAEPIA 2011 Linked Data Methodology

Linked Open Data in Public Administrations

Legislation is public information…

…and must be in the public domain

Everyone is affected by laws

Page 7: CAEPIA 2011 Linked Data Methodology

OK, Linked Open Data is good! But…

Page 8: CAEPIA 2011 Linked Data Methodology

Architecture & Adoption Process

There is huge interest in publishing LOD

Practical guidelines & methodologies?

Our proposal:

Architecture of Linked Open Data

Implementation methodology

Page 9: CAEPIA 2011 Linked Data Methodology

Considerations in Public Administrations context

Large volumes of data

Semistructured content

Contents of general interest

High expectations

New projects should not interfere

Small teams in large organizations

Low semantic expertise

Page 10: CAEPIA 2011 Linked Data Methodology

Linked Open Data Architecture

[Architecture diagram]

Server side: Operating System, Web Server, Web Application Server, RDF Storage, Cache DB, SPARQL Endpoint, Output RDF Graph, Ontologies, Documentation Portal, Update RDF Graph Service

Client side: Web Browser, Semantic Application

Page 11: CAEPIA 2011 Linked Data Methodology

Adoption Process

Phases (plotted against time):

Contextualization

Ontology design

RDF Graph Modeling

SPARQL Endpoint Implementation

RDF Graph Implementation

Update Graph Service

Documentation Web Portal

Non-functional requirements

Optional: data visualization & demos

Page 12: CAEPIA 2011 Linked Data Methodology

OK, you propose an architecture & adoption process, but…

Page 13: CAEPIA 2011 Linked Data Methodology

Contextualization

Library of Congress - Chile

Page 14: CAEPIA 2011 Linked Data Methodology

Contextualization

Leychile 2008

Juridical certainty

LOD in Leychile: Natural extension

Improve interoperability (more formats)

Create domain ontologies

Complex queries through SPARQL endpoint

Page 15: CAEPIA 2011 Linked Data Methodology

Contextualization

Publish Linked Open Data – 5 stars

Norms and relationships in a global RDF graph

Infrastructure for future developments

First stage, pilot project

Page 16: CAEPIA 2011 Linked Data Methodology

Contextualization

≈ 300,000 norms and their relationships (modifications, concordances, etc.)

First stage ⇒ only the main metadata of norms: title, important dates, types, relationships

The body text (articles, chapters, etc.) is excluded

Page 17: CAEPIA 2011 Linked Data Methodology

Contextualization

Definition of domain model: norms, relationships, types of norms, metadata

Functional Requirements for Bibliographic Records (FRBR)

Output formats: RDFa, RDF/XML, JSON, N3,…

Page 18: CAEPIA 2011 Linked Data Methodology

Domain Ontologies

Small ontology about norms

Page 19: CAEPIA 2011 Linked Data Methodology

RDF Graph Modeling

A norm can be modified by another norm

Decree 296, published 1995-02-17
Art. 1. abc.
Art. 2. def.
Art. 3. ghi.

Decree 12066, published 2005-05-15
Art. 1. Modify Decree 296 in the following way: substitute in Art. 1 the word "a" with "xyz".

Now, Decree 296 should be:

Decree 296
Art. 1. xyzbc.
Art. 2. def.
Art. 3. ghi.
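
The modification above can be sketched in Python as a norm that keeps one set of articles per dated version. This is only an illustration of the idea: the `Norm` class, its fields and the `modify` method are hypothetical names, not part of the BCN data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Norm:
    """A norm whose text is kept as dated versions (hypothetical model)."""
    name: str
    # Maps the date of a version to its articles (article number -> text).
    versions: dict[date, dict[int, str]] = field(default_factory=dict)

    def latest(self) -> dict[int, str]:
        """Return the articles of the most recent version."""
        return self.versions[max(self.versions)]

    def modify(self, when: date, article: int, old: str, new: str) -> None:
        """Record a new version in which `old` is replaced by `new` in one article."""
        articles = dict(self.latest())  # copy, so earlier versions stay intact
        articles[article] = articles[article].replace(old, new, 1)
        self.versions[when] = articles


decree_296 = Norm("Decree 296", {date(1995, 2, 17): {1: "abc.", 2: "def.", 3: "ghi."}})
# Decree 12066 substitutes "a" with "xyz" in Art. 1 of Decree 296.
decree_296.modify(date(2005, 5, 15), article=1, old="a", new="xyz")

original = decree_296.versions[date(1995, 2, 17)]
latest = decree_296.latest()
```

Keeping every version, rather than overwriting the text, is what lets the original and the latest wording of the same decree be addressed separately.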

Page 20: CAEPIA 2011 Linked Data Methodology

RDF Graph Modeling

Careful URI Design

Expressiveness

Page 21: CAEPIA 2011 Linked Data Methodology

RDF Graph Modeling

Decree 296: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/

Original: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@1995-02-17

Latest version: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@2005-05-10
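
A minimal Python sketch of how URIs of this shape could be assembled. The reading of the path segments (country, norm type, issuing body, publication date, number, then language and version date) is an assumption drawn from the three example URIs, and the function names are hypothetical.

```python
from datetime import date

BASE = "http://datos.bcn.cl/recurso"


def norm_uri(country: str, norm_type: str, issuer: str, published: date, number: int) -> str:
    """URI of a norm, independent of any particular version."""
    return f"{BASE}/{country}/{norm_type}/{issuer}/{published.isoformat()}/{number}/"


def version_uri(norm: str, language: str, version_date: date) -> str:
    """URI of one dated version of a norm: the norm URI plus "<language>@<date>"."""
    return f"{norm}{language}@{version_date.isoformat()}"


decree = norm_uri("cl", "DTO", "ministerio-del-interior", date(1995, 2, 17), 296)
original = version_uri(decree, "es", date(1995, 2, 17))
latest = version_uri(decree, "es", date(2005, 5, 10))
```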

Page 22: CAEPIA 2011 Linked Data Methodology

SPARQL Endpoint

Links to other datasets (countries for international treaties): DBpedia, GeoNames

Reuse of vocabularies / ontologies: SKOS, DC, FOAF, DBpedia, ORG

Triplestore: OpenLink Virtuoso

Page 23: CAEPIA 2011 Linked Data Methodology

SPARQL Endpoint

Example of query: find all norms issued by a municipality between 1995 and 2000 that were modified after 2005.

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX n: <http://datos.bcn.cl/ontologies/bcn-norms#>

SELECT ?normTitle ?creatorName ?pubDate ?pubDateOther
WHERE {
  ?norm n:createdBy ?creator .
  ?creator n:hasName ?creatorName .
  ?norm dc:title ?normTitle .
  ?norm n:publishDate ?pubDate .
  ?norm n:isModifiedBy ?otherNorm .
  ?otherNorm n:publishDate ?pubDateOther .
  FILTER (regex(?creatorName, "MUNICIPALIDAD", "i"))
  FILTER (?pubDate > "1995" && ?pubDate < "2000" && ?pubDateOther > "2005")
}
ORDER BY (?pubDate)
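
A query like this is normally sent to an endpoint over HTTP. The Python sketch below only builds the GET request, using the `query` parameter defined by the SPARQL Protocol, and does not send it. The endpoint address is a placeholder, not the real BCN endpoint, and `build_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # placeholder address, not the real endpoint

QUERY = """\
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX n: <http://datos.bcn.cl/ontologies/bcn-norms#>

SELECT ?normTitle ?creatorName ?pubDate ?pubDateOther
WHERE {
  ?norm n:createdBy ?creator .
  ?creator n:hasName ?creatorName .
  ?norm dc:title ?normTitle .
  ?norm n:publishDate ?pubDate .
  ?norm n:isModifiedBy ?otherNorm .
  ?otherNorm n:publishDate ?pubDateOther .
  FILTER (regex(?creatorName, "MUNICIPALIDAD", "i"))
  FILTER (?pubDate > "1995" && ?pubDate < "2000" && ?pubDateOther > "2005")
}
ORDER BY (?pubDate)
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
```

Passing `request` to `urllib.request.urlopen` would execute the query against whichever endpoint address is configured.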

Page 24: CAEPIA 2011 Linked Data Methodology

RDF Graph Implementation

http://code.google.com/p/weso-desh/

We developed a Linked Data Frontend (WESO-DESH)

Content negotiation based on HTTP 303 See Other

Definition of URIs based on regular expressions

Easy configuration

Support for CONSTRUCT, ASK & DESCRIBE

Delegates output formats to SPARQL Endpoint

Result caching

GUI for administration backend (in progress)

Page 25: CAEPIA 2011 Linked Data Methodology

RDF Graph Implementation

WESO-DESH (Linked Data Frontend)

Output HTML+RDFa

XML Configuration

Page 26: CAEPIA 2011 Linked Data Methodology

Update Graph Service

Automatic extraction & transformation process to update the RDF graph

Based on Pentaho Kettle ETL*

Executes transformations in threads

Configuration in XML

*ETL = Extraction, Transformation, Loading
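
The extract, transform and load steps with threaded transformations can be sketched in plain Python. This does not use Kettle and is not the project's updater: the rows returned by `extract` are made up, an in-memory set stands in for the triplestore, and only the two property URIs come from the example query (`dc:title` and `n:publishDate`).

```python
from concurrent.futures import ThreadPoolExecutor

DC = "http://purl.org/dc/elements/1.1/"
N = "http://datos.bcn.cl/ontologies/bcn-norms#"


def extract() -> list[dict[str, str]]:
    """Stand-in for reading changed norms from the source system (made-up rows)."""
    return [
        {"uri": "http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/",
         "title": "Decree 296", "published": "1995-02-17"},
        {"uri": "http://example.org/norm/made-up",
         "title": 'A "quoted" title', "published": "2005-05-15"},
    ]


def escape(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def transform(row: dict[str, str]) -> list[str]:
    """Turn one source row into N-Triples lines."""
    subject = f"<{row['uri']}>"
    return [
        f'{subject} <{DC}title> "{escape(row["title"])}" .',
        f'{subject} <{N}publishDate> "{row["published"]}" .',
    ]


def load(triples: list[str], graph: set[str]) -> None:
    """Stand-in for inserting triples into the triplestore."""
    graph.update(triples)


def run(graph: set[str], workers: int = 4) -> int:
    """Transform rows in worker threads, load the results, return the graph size."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for triples in pool.map(transform, extract()):
            load(triples, graph)
    return len(graph)
```

Calling `run(set())` transforms the rows concurrently while loading stays in the calling thread, so the graph is only ever written from one place.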

Page 27: CAEPIA 2011 Linked Data Methodology

Documentation

Documentation Web Portal: TYPO3 CMS

Sections:

URI construction guidelines

Example queries

Output formats

Ontology documentation

etc.

Page 28: CAEPIA 2011 Linked Data Methodology

Non-Functional Requirements

Response time: cache system, profiling

Security & privacy: different views and access levels of the RDF graph

Others: internationalization, accessibility, use of standards
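
As an illustration of the cache system mentioned under response time, here is a minimal time-to-live cache for query results in Python. The slides do not describe how the actual cache works, so the `TTLCache` class and its expiry policy are assumptions.

```python
import time
from typing import Callable


class TTLCache:
    """Keeps computed results for a fixed number of seconds (hypothetical policy)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: dict[str, tuple[float, object]] = {}  # key -> (expiry, value)

    def get_or_compute(self, key: str, compute: Callable[[], object]) -> object:
        """Return the cached value for `key`, computing and storing it if missing or expired."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._entries[key] = (now + self.ttl, value)
        return value
```

Using the query text as the key means a repeated query is answered without reaching the triplestore until the entry expires.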

Page 29: CAEPIA 2011 Linked Data Methodology

Optional: Data visualization

http://www.weso.es/lodviz/

Prototype tool: LODViz (Linked Open Data Visualization)

Based on HTML5 (pattern library)

Work in progress

Page 30: CAEPIA 2011 Linked Data Methodology

Page 31: CAEPIA 2011 Linked Data Methodology

Results

Public Dataset Catalogs Faceted Browser - CTIC Foundation

Five-star Linked Open Data

Page 32: CAEPIA 2011 Linked Data Methodology

Conclusions

First stage finished

> 300,000 norms exported

≈ 8 million triples, ≈ 27 triples per norm

200-400 triples added each day

3 tools in development

WESO DESH - Linked Data frontend

WESO RUD - RDF Updater

LODVIZ - Linked Open Data Visualization

Proposed methodology of Linked Open Data

Page 33: CAEPIA 2011 Linked Data Methodology

Future Work

Library of Congress of Chile

More datasets: biographies, geographical data

History of Law

Improve documentation

WESO Research Group

Semantic search engine

Entity extraction & reconciliation in text

Resource Recommendation

Provenance & graph views

Page 34: CAEPIA 2011 Linked Data Methodology

The End

http://www.weso.es

http://www.bcn.cl

More Information

Page 35: CAEPIA 2011 Linked Data Methodology

Main Team

Francisco Cifuentes
Member of WESO Research Group and Library of Congress of Chile
http://www.weso.es/~fcifuentes

José María Álvarez
Member of WESO Research Group
http://josemalvarez.es

Christian Sifaqui
Head of Systems and Network Information Services, Library of Congress of Chile
http://sifaqui.blogspot.com/

José Emilio Labra
Associate Professor at the University of Oviedo and Head of WESO Research Group
http://www.di.uniovi.es/~labra/

Page 36: CAEPIA 2011 Linked Data Methodology

Credits

Most of the images were obtained from the Internet.

Transparency image: http://2.bp.blogspot.com/--wFwsKwMgAg/TjSDXOLCTzI/AAAAAAAAOzQ/qvBtbShckdI/s1600/11.2.bmp

Euros: Minuto digital. http://www.minutodigital.com/wp-content/uploads/euros-300x196.jpg

Library: http://ffernandez.files.wordpress.com/2010/04/biblioteca.jpg

FRBR: http://cucataloging.blogspot.com/

Contextualization: http://tentblogger.com/right-advertisers/

Documentation: http://susops.blogspot.com/2010/07/power-of-documentation.html