Open Data and its Potential - reuse of public sector information - Svein Ølnes, Vestlandsforsking, 13.04.2011
Open data and reuse of public information

May 08, 2015

A presentation of open data and its potential, especially seen in light of the linked open data development.

Presentation held for Institute of Information and Media Science at the University of Bergen, 14.04.2011
Transcript
Page 1: Open data and reuse of public information

Open Data and its Potential

- reuse of public sector information

- Svein Ølnes, Vestlandsforsking, 13.04.2011

Page 2: Open data and reuse of public information

www.vestforsk.no

Outline

About Vestforsk and myself

Semantic technologies

Linked (Open) Data

Open Data

Open Data -> LOD -> Sem. Techn.

Relevant projects and resources

Literature

Page 3: Open data and reuse of public information

Vestlandsforsking

ICT themes

Semantic technologies, information structures ++

Regional development, organizational changes with ICT

ICT application areas

Public sector (eGovernment, eHealth)

Tourism sector (local, regional, national, international level)

Vestforsk also does research in

Climate change

Transport and environment

Sustainable tourism

Renewable energy

Page 4: Open data and reuse of public information

About me

Vestforsk since 1996

eGovernment

Municipalities

Government

Semantic technologies

Projects

Norge.no (established 1999/2000)

MiSide (development of demonstrator in 2004)

LivsIT/Los (2003 – to date)

Evaluation of public websites (2001 – to date)

Page 5: Open data and reuse of public information

Naming things!

[The famous cartoon by Gary Larson showing a man painting ”the cat”, ”the dog”, ”the house” on his cat, dog, and house and explaining ”Now, this should clear up a few things around here!”]

Page 6: Open data and reuse of public information

Technology waves

Procedure oriented (Focus: Syntax, Data: Hierarchical)

Object oriented (Focus: Structure, Data: Relational)

Component based (Focus: Services, Data: XML)

Model driven (Focus: Semantics, Data: Ontology & Data)

[Timeline axis: 1965, 1975, 1985, 1995, 2005, 2015]

Stian Danenbarger, Bouvet.no

Page 7: Open data and reuse of public information

The ontology spectrum

The ontology spectrum: From weak to strong semantics

1. Vocabulary
• plain text documents/HTML pages – almost no semantic structure

2. Controlled vocabularies (weak semantic structure)
• adding metadata to the information

3. Taxonomies
• metadata and hierarchy

4. Thesauri
• metadata, hierarchy and a limited set of relations (BT, NT, related to ...)

5. Stronger semantic structures/ontologies
• metadata, [hierarchy], any relations

(Daconta et al.: “The Semantic Web”)

Page 8: Open data and reuse of public information

Semantic technologies

AI tradition/Logics: Semantic web

W3C as the standardization body

Humanities/Library science: Topic Maps

ISO-standard

Light-weight, bottom-up: Microformats

Not a standard yet, but might be as part of HTML5

Page 9: Open data and reuse of public information

Semantic Web

”Web of data”

”Web 3.0”

”Semantic web” coined by Tim Berners-Lee in the mid-1990s

The (in-)famous article ”The Semantic Web” in Scientific American 2001 (TBL, Jim Hendler, Ora Lassila)

Wikipedia: However, the Semantic Web as originally envisioned, a system that enables machines to understand and respond to complex human requests based on their meaning, has remained largely unrealized and its critics have questioned its feasibility.

Page 10: Open data and reuse of public information

Semantic Web stack

Page 11: Open data and reuse of public information

Lessons learned from the HTML history?

XHTML 1: HTML as XML

XHTML 2: Get rid of HTML altogether

... it was a disaster!

WHATWG TF – a rebellion inside W3C

Web Apps 1.0

... eventually led to HTML5

pragmatism won over idealism

Jeremy Keith: ”HTML5 for Web Designers”

Page 12: Open data and reuse of public information

Semantic Web light

Is the Semantic Web too complex?

difficult to scale to the WWW

more suitable for use within smaller domains

Introducing ”Light-weight” SW:

RDFa: RDF expressed as (x)HTML – part of HTML

GRDDL: RDF data from XML/xHTML documents

SKOS: Simple Knowledge Organization System – representation of ”classical” structures such as taxonomies and thesauri in RDF

• organizing concepts with standard relations

Linked (Open) Data
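
The SKOS item above (concepts organized with standard relations) can be illustrated with a small sketch. It writes a tiny vocabulary as Turtle using the SKOS properties prefLabel, altLabel and broader. The concepts, labels and the example.org base URI are invented for illustration and do not come from the presentation.

```python
# Minimal sketch: serialize a tiny controlled vocabulary as SKOS in Turtle.
# The SKOS namespace is the real W3C one; everything else is hypothetical.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# (concept id, preferred label, alternative labels, broader concept id or None)
CONCEPTS = [
    ("transport", "Transport", [], None),
    ("public-transport", "Public transport", ["Bus", "Tram"], "transport"),
]


def to_turtle(base, concepts):
    """Return a Turtle document with one skos:Concept per entry."""
    lines = [f"@prefix skos: <{SKOS}> .", ""]
    for cid, pref, alts, broader in concepts:
        parts = ["a skos:Concept", f'skos:prefLabel "{pref}"@en']
        parts += [f'skos:altLabel "{alt}"@en' for alt in alts]
        if broader:
            # skos:broader expresses the thesaurus relation BT (broader term)
            parts.append(f"skos:broader <{base}{broader}>")
        lines.append(f"<{base}{cid}> " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines)


turtle = to_turtle("http://example.org/vocab/", CONCEPTS)
print(turtle)
```

The same list of concepts could be written out in other formats as well; Turtle is used here only because it is compact to read.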

Page 13: Open data and reuse of public information

Topic Maps

ISO standard from 2001 (present standard from 2003)

ISO 13250:2003

Strong Norwegian community

small worldwide community compared to SW

Large uptake in portals, especially public portals

”Fight” between TM and SW

Largely over, ”SW has won”

Linked Data as a common ground for further development

Focus has shifted from technology to utilizing data

Page 14: Open data and reuse of public information

Simple Topic Maps model

3 Topic types: person, project and publication

3 Association types: Project manager of, Author of, and Result of
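
A rough sketch of this model in code may help. It uses the topic types and association types named on the slide, but the instances (Alice, the project, the report) are invented, and the data structure is a simplification: real Topic Maps associations carry explicit roles, which are reduced here to a source and a target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    name: str
    topic_type: str  # "person", "project" or "publication"


@dataclass(frozen=True)
class Association:
    assoc_type: str  # "project manager of", "author of" or "result of"
    source: Topic
    target: Topic


# Hypothetical instances, for illustration only
alice = Topic("Alice", "person")
project = Topic("Example project", "project")
report = Topic("Example report", "publication")

associations = [
    Association("project manager of", alice, project),
    Association("author of", alice, report),
    Association("result of", report, project),
]


def related(topic, associations):
    """Return (association type, other topic) for each association the topic takes part in."""
    out = []
    for a in associations:
        if a.source == topic:
            out.append((a.assoc_type, a.target))
        elif a.target == topic:
            out.append((a.assoc_type, a.source))
    return out


for assoc_type, other in related(alice, associations):
    print(f"Alice: {assoc_type}: {other.name}")
```

Navigating from any topic to its neighbours through typed associations is the basic idea behind a Topic Maps driven portal.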

Page 15: Open data and reuse of public information

Topic Maps in use

Some Topic Maps driven portals:

uib.no

vestforsk.no

nofima.no

regjeringen.no

stortinget.no

bergen.kommune.no

Page 16: Open data and reuse of public information

The Linking Open Data (LOD) Project

Page 17: Open data and reuse of public information

Ultimate goal: My metadata is your data (and vice versa)

[Diagram of interlinked data sources: SERES, Lovdata, LOS, Europeana, KS, Smilno, Volven, Yr.no, Kartverket, SKD, SSB]

Page 18: Open data and reuse of public information

Linked (Open) Data

using the Web to lower the barriers to linking data

use of RDF to make typed statements

Linked Data = Use the Web to make typed links between data from different sources

Alex Wright: The Web That Wasn’t (Topic Maps 2008 Conf.)

David Weinberger: Thank God! (Topic Maps 2008 Conf.)

”small pieces loosely joined”

Page 19: Open data and reuse of public information

Linked Data vs. Linked Open Data

Page 20: Open data and reuse of public information

Linked Data Principles

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names

3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

4. Include links to other URIs, so that they can discover more things

Linked Data can be serialized as

RDF/XML

N3 (Turtle)

RDFa
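
The principles and serializations above can be sketched in a few lines. The example names a thing with an HTTP URI, makes typed statements about it, and links it to a URI in another data source. The person URI under example.org is hypothetical. FOAF and the DBpedia resource URI are existing vocabularies and identifiers, though the example itself is not taken from the presentation. The output uses the one-statement-per-line N-Triples form, which is also valid Turtle.

```python
# Minimal sketch: Linked Data statements written out as N-Triples.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Uri(str):
    """Marks an object as a URI so it can be told apart from a plain literal."""


person = "http://example.org/people/alice"  # hypothetical HTTP URI (principles 1 and 2)

triples = [
    (person, RDF_TYPE, Uri(FOAF + "Person")),
    (person, FOAF + "name", "Alice Example"),
    # Typed link to another data source (principle 4)
    (person, FOAF + "based_near", Uri("http://dbpedia.org/resource/Bergen")),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Uri):
            obj = f"<{o}>"
        else:
            # Only backslashes and double quotes are escaped in this sketch
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


ntriples = to_ntriples(triples)
print(ntriples)
```

In practice an RDF library would handle serialization and escaping; the hand-written version is only meant to show how little is needed to express a typed link.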

Page 21: Open data and reuse of public information

Linked Open Data Star Scheme

Tim Berners-Lee/DERI – University of Galway

Page 22: Open data and reuse of public information

Linked Data example

”Populated place” is a concept defined in the DBpedia ontology

Use established ontologies wherever possible

FOAF (friend-of-a-friend)

Dublin Core

hCard, hCalendar, hAtom

Page 23: Open data and reuse of public information

Linked Data vs. Semantic Web

The Semantic Web, or the Web of Data, is the ultimate goal

Linked Data provides the means to reach that goal

Linked Data helps build the Web of Data that later can be exploited by more advanced technologies such as intelligent agents (it has to be added that this is the claim of the proponents of the Semantic Web and intelligent agents)

Tom Heath: ”Without Linked Data, no Semantic Web!”

Talis Nodalities no. 11

Page 24: Open data and reuse of public information

Open data

In principle all data, but mostly public data because that is the easiest to start with

PSI directive from EU an important enabler (also included in Offentleglova)

data.norge.no

data.norge.no from FAD to Difi

and from blog to data repository (?)

Page 25: Open data and reuse of public information

Why open data?

1. Increase democratic control and political participation

Empower citizens to exercise their democratic rights

2. Foster service and product innovation

New opportunities for innovation generated by open government data

3. Strengthen law enforcement

Especially the US and the UK strategies emphasize this

Study published in the European Journal of ePractice, 2011

Page 26: Open data and reuse of public information

“Open data and its enemies”

Some pressure from FAD (recently expressed in ”Tildelingsbrevet”), but slow movement in general

cultural issues

budget issues

fear of losing control

transparency is seen as a threat

Map data is some of the most important – Map Authorities are not willing to publish raw data

Page 27: Open data and reuse of public information

Closed map data a problem

Bente Kalsnes, Origo

Page 28: Open data and reuse of public information

Open data strategies

Study published in the European Journal of ePractice, 2011

Page 29: Open data and reuse of public information

Open Data Instruments

Study published in the European Journal of ePractice, 2011

Page 30: Open data and reuse of public information

Top 10 drivers of open data

1. Strategies and experiences

2. Political leadership

3. Regional initiatives

4. Citizen initiatives

5. Market initiatives

6. Emerging technologies

7. European legislation

8. Thought leaders

9. Possibility of monitoring government

10. Budget cuts

European Journal of ePractice, 2011

Page 31: Open data and reuse of public information

Top 10 barriers to open data

1. Closed government culture

2. Privacy legislation

3. Limited quality of data

4. Limited user-friendliness/Info overload

5. Lack of standardisation of open data

6. Security threats

7. Existing charging methods

8. Uncertain economic impact

9. Digital divide

10. Network overload

European Journal of ePractice, 2011

Page 32: Open data and reuse of public information

data.norge.no

Initiative from FAD started in 2010

(I will take credit for the name! :)

Mostly a blog

Gradually building up a data repository

From 01.05.2011 Difi will have the responsibility for data.norge.no

Page 33: Open data and reuse of public information

data.norge.no as of April 2011

1. Byantikvarens gule liste (xls)

2. Einingsregisteret (rdf/xml)

3. Gardsmatrikkelen 1886 (xls)

4. Idrettsanlegg (csv)

5. Kraftprisar (tab-separated text)

6. Ladestasjonar (csv, ov2..)

7. Los (ods)

8. N5000 (various graphics formats + sosi/shape)

9. Statlege styre, råd og utval (html)

10. Statsbudsjettet og nasjonalbudsjettet 2011 (xls, csv)

11. Tenestemannsregisteret (csv)

• no.ckan.net lists 212 different data sources

Page 34: Open data and reuse of public information

7 tips for publishing linked open data

1. Use standard Internet protocols for access (http)

2. All objects need a unique identifier (URI)

3. Avoid aggregation of data

4. Structure metadata in a machine readable format (xml or xml/rdf/xtm)

5. Use international character set (UTF-8)

6. Use minimum Dublin Core as a standard way of describing metadata

7. Think about linking to other data sources by preparing for Linked Data
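
Several of these tips (a URI per object, machine-readable metadata, UTF-8, Dublin Core) can be combined in one small sketch. It writes a Dublin Core description of a dataset as UTF-8 encoded XML using Python's standard library. The Dublin Core namespace is the real one. The wrapper element name "metadata", the example.org identifier, the publisher and the date are assumptions made for illustration. The title is borrowed from the data.norge.no list only to show non-ASCII characters.

```python
import xml.etree.ElementTree as ET

# Dublin Core Metadata Element Set namespace
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)


def dublin_core_record(identifier, title, publisher, date, fmt):
    """Return a UTF-8 encoded XML record (bytes) with a minimal set of Dublin Core elements."""
    record = ET.Element("metadata")  # wrapper element name is an arbitrary choice
    fields = [
        ("identifier", identifier),  # tip 2: a unique URI for the object
        ("title", title),
        ("publisher", publisher),
        ("date", date),
        ("format", fmt),
    ]
    for name, value in fields:
        ET.SubElement(record, f"{{{DC}}}{name}").text = value
    # tip 5: encode as UTF-8 and say so in the XML declaration (Python 3.8+)
    return ET.tostring(record, encoding="utf-8", xml_declaration=True)


xml_bytes = dublin_core_record(
    identifier="http://example.org/dataset/styre-raad-utval",  # hypothetical URI
    title="Statlege styre, råd og utval",
    publisher="Example agency",  # hypothetical
    date="2011-04-13",  # hypothetical
    fmt="text/html",
)
print(xml_bytes.decode("utf-8"))
```

A record like this can be published next to the raw data, so that a catalogue can harvest the description without parsing the data itself.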

Page 35: Open data and reuse of public information

Relevant projects from Vestforsk

Sesam4 – Semantic technologies for SMEs

Los – a navigator for public services

Tourism concepts – a common vocabulary for the tourism industry

Seminars on semantic technologies

The WIMS’11 Conference

Page 36: Open data and reuse of public information

Sesam4

VERDIKT project 2008 – 2011 (ended 31st of March this year)

Use of semantic technologies in SMEs

Provided a set of tools for SMEs (and others) to use for ”semantisizing” their data

Demonstrated semantic technologies in two pilots:

Tourism

Business information

NR, Vestlandsforsking, Esis, Computas, UNI Digital, Cyberwatcher, TextUrgy, Ovitas, IKT-Norge

Page 37: Open data and reuse of public information

Sesam4 – lessons learned

Project planned in 2007

A lot of things have happened since 2007

Emergence of Linked Data

Sesam4 gradually tuned in to LOD

Too much focus, resources, and discussion (!) spent on ontologies!

Light-weight approach saves time & money

Valuable tools for semantic lifting and best practices remain available for anybody to use (most of the project is open source)

Page 38: Open data and reuse of public information

LOS – a navigator to public services

LivsIT (1996 – 2004)

Life situations

Los (2005 - ??)

Shared vocabulary for public services

More than 1/3 of the municipalities in Norway use Los as a foundation of their web portal

Difi is the responsible agency

Los a success despite Difi’s lack of support and development

Problem with uptake in Governmental bodies

By using Los municipalities can share information with Governmental bodies and with each other

Page 39: Open data and reuse of public information

What a difference a little semantics can do

Note: Bergen recently changed their internal search to Google search and lost the semantic support (Los) for search

Page 40: Open data and reuse of public information

What a difference a little semantics can do

Page 41: Open data and reuse of public information

Los – structure

Tema = Theme

Emneord = Keyword

Nettressurs = Net resources

Page 42: Open data and reuse of public information

How does Los work?

Keyword

Net resource

Help word
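
The slides name only the element types (theme, keyword, net resource, help word), so the following is a guess at how they might fit together, not a description of the actual Los implementation. It assumes that themes group keywords, that help words act as synonyms pointing to a keyword, and that keywords point to net resources. All entries and URLs are invented.

```python
# Hypothetical miniature vocabulary in the spirit of the Los structure
THEMES = {"Transport": ["Parking", "Public transport"]}  # theme -> keywords
HELP_WORDS = {"bus": "Public transport", "car park": "Parking"}  # help word -> keyword
NET_RESOURCES = {  # keyword -> net resources
    "Public transport": ["http://example.org/timetables"],
    "Parking": ["http://example.org/parking-permits"],
}


def find_resources(query):
    """Resolve a search term to net resources via a keyword or a help word."""
    term = query.strip().lower()
    keywords = {k.lower(): k for k in NET_RESOURCES}
    keyword = keywords.get(term) or HELP_WORDS.get(term)
    if keyword is None:
        return []
    return NET_RESOURCES.get(keyword, [])


def browse(theme):
    """List the keywords under a theme together with their net resources."""
    return {k: NET_RESOURCES.get(k, []) for k in THEMES.get(theme, [])}


print(find_resources("Bus"))
print(browse("Transport"))
```

Under these assumptions a user searching for an everyday word still reaches the resource filed under the official keyword, which is the kind of semantic search support described for the municipal portals.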

Page 43: Open data and reuse of public information

Tourism concepts

Pre-project for the Norwegian tourism industry (VisitNorway, NCE Tourism)

Advice on constructing a common vocabulary for tourism concepts

Initiated by Anders Waage Nilsen in NCE Tourism/Fjord Norway (Anders now in MediArena)

Page 44: Open data and reuse of public information

Tourism concepts - advice

1. Simplification (today’s categorizing scheme is too complicated)

2. Develop a controlled vocabulary with emphasis on keywords (the Los method)

3. Not everything can be solved with categorizing

A controlled vocabulary is necessary but not enough

4. Publish the vocabulary in the cloud

5. Publish the vocabulary in many formats (html, xml, xml/rdf, xtm)

6. Publish also the information resources in the cloud, as linked open data

Page 45: Open data and reuse of public information

Seminars on semantic technologies

Vestforsk initiated a series of seminars on semantic technologies as part of its 25th anniversary in 2010

A total of 7-8 seminars will be held, 4 already arranged

Streaming of all seminars, and archived for video on demand

We also have the project ”Kunnskap kryssar grenser”/”Access to Knowledge” where we focus on streaming and use of video

Page 46: Open data and reuse of public information

WIMS’11

International Conference on Web Intelligence, Mining, and Semantics

Sogndal, May 25 – 27

Keynote speakers:

Jim Hendler: The Semantic Web 10th Year Update (25.05)

Peter Mika: Making Things Findable (26.05)

Sören Auer: Creating Knowledge Out of Interlinked Data (26.05)

Ashwin Ram: Open Social Learning Communities (27.05)

Marko Grobelnik: Scalable Reasoning on Intensive Streams of Data (27.05)

wims.vestforsk.no

Page 47: Open data and reuse of public information

Some resources

Vestforsk series of seminars on semantic technologies

http://www.vestforsk.no/aktuelt/seminarserie-om-semantiske-teknologiar

Linked Data – The Story So Far (Bizer, Heath, Berners-Lee)

http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf

Linked Data vs. Linked Open Data

http://datavisualization.ch/opinions/introduction-to-linked-data

Linked Data – Evolving the Web into a Global Data Space (Heath, Bizer)

http://linkeddatabook.com/editions/1.0/

Introduction to Linked Open Data for Visualization Creators:

http://datavisualization.ch/opinions/introduction-to-linked-data

CKAN: The Data Hub

http://ckan.net

Page 48: Open data and reuse of public information

More resources

Talis Nodalities:

http://www.talis.com/nodalities

Publishing Open Government Data (working draft)

http://www.w3.org/TR/2009/WD-gov-data-20090908/

Åpne data og journalistikk (Bente Kalsnes, Origo)

http://www.slideshare.net/benteka/pne-data-og-journalistikk

European Journal of ePractice

http://www.epractice.eu/en/journal/issues

Figshare: Sharing scientific data (http://figshare.com)

http://blog.okfn.org/2011/03/02/introducing-figshare-a-new-way-to-share-open-scientific-data/

Page 49: Open data and reuse of public information

Thank you for your attention!

Contact information:

Svein Ølnes – [email protected]