Data coherence between OSM and Wikipedia Cristian Consonni Fondazione Bruno Kessler State of the Map 2013 - Birmingham September 2013 Cristian Consonni Data coherence between OSM and WIkipedia 1 / 16

Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Aug 29, 2014

Title:
"Data coherence between OSM and Wikipedia"

Abstract:
Volunteered geographical information (VGI) is one facet of the phenomenon of crowdsourcing, in which people collect and share large amounts of data in open and collaborative projects. Although these projects have different purposes and scopes, there is some overlap between them, so one can ask whether these data, which are collected by different communities through different processes, are coherent.

In this talk I will discuss a set of possible analyses comparing OSM and Wikipedia data, how they can be performed, and a path for further research. I will also present some preliminary results of applying these metrics to Italian Wikipedia and OSM in Italy for a given category of objects (churches and historical buildings).
Transcript
Page 1: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Data coherence between OSM and Wikipedia

Cristian Consonni
Fondazione Bruno Kessler

State of the Map 2013 - Birmingham
September 2013

Page 2: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Outline

1 Introduction

2 The Problem

3 Proposing a Solution
    Wikipedia-OSM comparator
    Nuts4Nuts

4 Conclusions

5 Questions

Page 3: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Collecting Information About the Real World

Page 5: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Collecting Information About the Real World

Wikipedia and OpenStreetMap are:
- collaborative
- volunteer-driven
- free (as in freedom and as in beer)

Both projects collect information about the real world.

Page 6: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Different Processes and Communities

Wikipedia

- anonymous users can edit
- entries consist of text (or media)
- only encyclopedic subjects
- content can be protected from editing in case of problems

OpenStreetMap

- only registered users can edit
- entries consist of data
- everything can be described
- content is always editable

Page 7: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Inconsistencies in the data

Data in Wikipedia can be inconsistent with data from OpenStreetMap.
We should compare the data and reconcile the differences.

On Wikipedia the metro station “Colosseum” is inside the Colosseum itself.
On OpenStreetMap the metro station is correctly placed outside the monument.

OpenStreetMap maps on Wikipedia are provided by the WIWOSM tool by User:Master and User:Kolossos; check it out at:

http://wiki.openstreetmap.org/wiki/WIWOSM

Page 10: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Proposal of the Solution

Two steps towards a solution:

1 Compare the data
    - Identify links between Wikipedia pages and OSM entities
    - Extract all the available geographical information
    - Define metrics to calculate whether the data are “close” or not

2 Reconcile the differences
    - Provide the communities with the results of the previous analysis
    - Create tools to facilitate the reconciliation
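
One possible "closeness" metric for the comparison step is the great-circle distance between the coordinates stored in Wikipedia and those stored in OSM. The sketch below is only an illustration: the slides do not say which metric was used, and the function names and the 100 m threshold are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_close(wiki_coord, osm_coord, threshold_m=100.0):
    """True if the two (lat, lon) pairs are within threshold_m metres.

    threshold_m is a hypothetical cut-off, not a value from the talk.
    """
    return haversine_m(*wiki_coord, *osm_coord) <= threshold_m
```

A pair of entries that fails `is_close` would be a candidate inconsistency to report back to the two communities; the right threshold probably depends on the kind of object being compared.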

Page 11: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Comparing the data
Wikipedia-OpenStreetMap comparator

Proof-of-concept: comparing data about churches in Italy:

Wikipedia-OpenStreetMap comparator
source code: https://github.com/CristianCantoro/WOcomparator

Easy case:
- pre-defined category of items (selection on a set of features in OSM, articles with a given template in Wikipedia)
- only entities with a (it:)Wikipedia attribute were selected

⇒ linking is straightforward.
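
Linking is straightforward in this case because the OSM element itself names the article. The sketch below reads the `wikipedia` tag from a dictionary of OSM tags. It is not taken from WOcomparator: the function names are hypothetical, and it assumes the common `wikipedia=it:Title` tagging form, with the language-suffixed key `wikipedia:it=Title` as a fallback.

```python
from urllib.parse import quote


def wikipedia_link(tags, lang="it"):
    """Return (lang, title) for an OSM element's Wikipedia link, or None.

    `tags` is a plain dict of OSM key/value pairs (an assumed input format).
    """
    value = tags.get("wikipedia")
    if value and ":" in value:
        # Split only on the first colon: titles may contain colons themselves.
        prefix, title = value.split(":", 1)
        if prefix == lang and title:
            return prefix, title
    # Fallback: language-suffixed key, e.g. wikipedia:it=Title
    title = tags.get(f"wikipedia:{lang}")
    if title:
        return lang, title
    return None


def article_url(lang, title):
    """Build the article URL, using underscores for spaces as MediaWiki does."""
    return f"https://{lang}.wikipedia.org/wiki/{quote(title.replace(' ', '_'))}"
```

Elements for which `wikipedia_link` returns `None` fall outside the easy case and would need a separate linking strategy.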

Page 12: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Comparing the data
Wikipedia-OpenStreetMap comparator

http://it.wikipedia.org/wiki/Utente:CristianCantoro/Georeferenziazione

Page 13: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Comparing the data
Nuts4Nuts

For the hard case (trying to link every possible thing), another tool:

Nuts4Nuts
source code: https://github.com/SpazioDati/Nuts4Nuts

http://nuts4nutsrecon.spaziodati.eu/reconcile?queries={%22q0%22:%20{%22query%22:%20%22Palazzo%20Vecchio%22}}

Known limitations:
- limited to Italy
- uses external services

grab the source code: https://github.com/SpazioDati/Nuts4Nuts
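
The example URL above passes a JSON object in a `queries` parameter, with one entry per query key (`q0`). As a rough sketch, a URL of that shape can be built as follows. The helper names are hypothetical, only the `query` field shown on the slide is used, and nothing here checks whether the service is still reachable.

```python
import json
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint copied from the slide; its current availability is not verified.
BASE_URL = "http://nuts4nutsrecon.spaziodati.eu/reconcile"


def build_reconcile_url(names, base_url=BASE_URL):
    """Build a URL with one query per name, keyed q0, q1, ..."""
    queries = {f"q{i}": {"query": name} for i, name in enumerate(names)}
    # quote_via=quote encodes spaces as %20, matching the URL on the slide.
    params = urlencode({"queries": json.dumps(queries)}, quote_via=quote)
    return f"{base_url}?{params}"


def parse_reconcile_url(url):
    """Recover the queries dict from a URL built by build_reconcile_url."""
    raw = parse_qs(urlsplit(url).query)["queries"][0]
    return json.loads(raw)
```

Calling `build_reconcile_url(["Palazzo Vecchio"])` produces a URL carrying the same `q0` query as the slide's example, and batching several names into one request is just a longer list.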

Page 14: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Dandelion

Nuts4Nuts is built using the infrastructure provided by Dandelion (http://dandelion.eu), a datamarket by SpazioDati srl.

Page 15: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Future Work

Nuts4Nuts is a step towards finding geographical information for Wikipedia articles that have no explicit coordinates in them.

Future work:
- study new approaches to link entities between Wikipedia and OpenStreetMap
- an application to fix inconsistencies or fill in missing data, like this:

Page 16: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Conclusions

Wikipedia and OSM collect information about the real world

Comparing data between the two projects can highlight inconsistencies

We should fix them

Page 19: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Questions & Contacts

Questions?
mail: [email protected]
@CristianCantoro

github: https://github.com/CristianCantoro

Page 20: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Thank you

Thank you!
This work was supported by:

A project by:
- SpazioDati srl
- Edizioni Curcu & Genovese

with funds from the European Regional Development Fund.
More information: http://trentino.dandelion.eu

Page 21: Data coherence between OpenStreetMap and Wikipedia - Presentation @ State of the Map 2013 Birmingham

Copyright notice

This presentation is released under the CC BY-SA 3.0 licence.

Further info: http://creativecommons.org/licenses/by-sa/3.0/

Logos and trademarks belong to their respective owners.
