CKAN: an open-source data management solution for open data (Ivan Ermilov)
Page 1: CKAN as an open-source data management solution for open data

CKAN: an open-source data management solution for open data

Ivan Ermilov


AKSW Research Group

http://aksw.org


My experience with CKAN

● PublicData.eu portal
  o Crowd-sourcing CSV2RDF mappings
● LODStats
  o Version 1: crawling datahub.io (CKAN)
  o Version 2: CKAN aggregator for data.gov, publicdata.eu and datahub.io
  o Version 2: Crawled all three portals and published the data on datahub.io


CKAN IS NOT a file storage!


Why CKAN?

● An open source platform
  o Relatively easy to deploy
  o Provides a rich set of features for free
● Data management
● Community involvement


Who uses CKAN?

● All major open governments
  o Canada (open.canada.ca): 244,238 datasets
  o The U.S. (data.gov): 131,348 datasets
  o Europe (publicdata.eu): 47,863 datasets
● And some other communities:
  o Semantic Web community (datahub.io): 9,509 datasets


CKAN architecture


CKAN Pros/Cons

● Pros
  o Organizes your data in a structured way
  o Has an extension to support DCAT (only for datasets)
  o Provides an API to digest your data
● Cons
  o The data model does not work for all use cases (DBpedia)
  o No strict guidelines for dataset publishing


CKAN functionality

● Publishing metadata
● Exposing metadata (API/front-end)
● Access control for users/organizations
● Additional functionality via plugins


CKAN extensions/plugins

● Data preview and visualization
● CKAN + DCAT
● Extension that adds the Disqus commenting system to CKAN
● Simple API dataset hits counter

Full list is available at: http://extensions.ckan.org/


CKAN deployment

● From source
● OS package (e.g. as a Debian package)
● Docker image

Official guide: http://docs.ckan.org/en/latest/maintaining/installing/index.html


CKAN Multi-Tier Deployment


CKAN API

● Well documented
● Covers everything you can do with the web interface
  o You can write your own web interface
● Various API clients
  o ckanclient (python) - official
  o Ruby, PHP, Java, Nodejs, Perl, R

https://github.com/ckan/ckan/wiki/CKAN-API-Clients


CKAN API methods

● Retrieving data
● Creating new data
● Updating existing data
● Deleting existing data
● Data is: packages, resources, groups, tags, users etc.

http://docs.ckan.org/en/latest/api/index.html
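
A write operation (create, update, delete) can be sketched as a POST request carrying a JSON body and an API key. This is a minimal illustration using only the Python standard library, not the official client; the portal URL, the API key and the dataset fields below are hypothetical placeholders, and a real portal may require more fields (for example an owning organization).

```python
import json
from urllib.request import Request


def build_action_request(base_url, action, payload, api_key):
    """Prepare (but do not send) a POST request for a CKAN action.

    Write actions such as package_create, package_update and
    package_delete take a JSON body and an Authorization header.
    """
    return Request(
        url=f"{base_url.rstrip('/')}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": api_key,
        },
        method="POST",
    )


# Hypothetical portal, key and dataset, for illustration only.
request = build_action_request(
    "http://ckan.example.org",
    "package_create",
    {"name": "my-dataset", "title": "My dataset"},
    "HYPOTHETICAL-API-KEY",
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would actually send it; deleting works the same way with the `package_delete` action and a payload such as `{"id": "my-dataset"}`.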


CKAN API: Examples

● Get package list
  o http://demo.ckan.org/api/3/action/package_list
  o Disabled for data.gov
● Get one package
  o http://demo.ckan.org/api/3/action/package_show?id=adur_district_spending
● ckan.logic.action.get.organization_show
  o api/3/action/organization_show?id=...
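
The calls above can be sketched in a few lines of Python using only the standard library. The helper names (`action_url`, `unwrap`, `call_action`) are made up for this illustration and are not part of any CKAN client; the sketch relies on the Action API wrapping every answer in a JSON object with `success` and `result` fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def action_url(base_url, action, **params):
    """Build an Action API (version 3) URL, e.g. .../api/3/action/package_list."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    if params:
        url += "?" + urlencode(params)
    return url


def unwrap(payload):
    """Return the "result" field of an Action API response, or raise on failure."""
    if not payload.get("success"):
        raise RuntimeError(f"CKAN API call failed: {payload.get('error')}")
    return payload["result"]


def call_action(base_url, action, **params):
    """Send a GET request to the portal and return the unwrapped result.

    Needs network access; the portal may disable some actions.
    """
    with urlopen(action_url(base_url, action, **params), timeout=30) as response:
        return unwrap(json.load(response))
```

Assuming demo.ckan.org is reachable, `call_action("http://demo.ckan.org", "package_list")` would return the list of dataset names, and `call_action("http://demo.ckan.org", "package_show", id="adur_district_spending")` the metadata of one dataset, provided it still exists there.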


Use Case: LODStats

● Aggregate CKAN instances via API
● Filter out only related datasets
● Build an application on top of it
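
The aggregate-and-filter step might look roughly like the sketch below. It is not the actual LODStats code: it assumes "related" means datasets with at least one RDF-like resource, the list of format labels is a guess, and the input is taken to be dataset dictionaries already fetched from each portal (as `package_show` returns them, with a `resources` list).

```python
# Assumption: format labels that count as RDF; real portals use many spellings.
RDF_FORMATS = {"rdf", "rdf+xml", "turtle", "ttl", "n3", "nt", "n-triples"}


def rdf_resources(dataset):
    """Return the resources of one dataset whose format looks like RDF."""
    return [
        resource
        for resource in dataset.get("resources", [])
        if (resource.get("format") or "").strip().lower() in RDF_FORMATS
    ]


def aggregate(portals):
    """Merge datasets from several portals, keeping only those with RDF data.

    `portals` maps a portal URL to an iterable of dataset dictionaries.
    """
    catalogue = []
    for portal, datasets in portals.items():
        for dataset in datasets:
            resources = rdf_resources(dataset)
            if resources:
                catalogue.append({
                    "portal": portal,
                    "name": dataset["name"],
                    "urls": [resource["url"] for resource in resources],
                })
    return catalogue
```

An application on top of this would then iterate over the resulting catalogue, for instance to download and analyse each URL.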


Use Case: CSV2RDF

● Integrated with a particular CKAN instance
● Aggregates all CSV files from the instance
● Provides an interface for CSV2RDF conversion
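
Collecting the CSV files of one instance could be approached with the `package_search` action, paging through the results. This is a sketch and not the CSV2RDF implementation: the filter query `res_format:CSV` assumes the portal indexes resource formats under that field name and spells the format as "CSV", and the helper names are invented for the example.

```python
from urllib.parse import urlencode


def csv_search_url(base_url, start=0, rows=100):
    """Build a package_search URL for one page of datasets with CSV resources."""
    query = urlencode({"fq": "res_format:CSV", "start": start, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{query}"


def csv_urls(search_result):
    """Extract CSV resource URLs from the "result" part of a package_search answer."""
    urls = []
    for dataset in search_result["results"]:
        for resource in dataset.get("resources", []):
            if (resource.get("format") or "").strip().upper() == "CSV":
                urls.append(resource["url"])
    return urls
```

A crawler would request `csv_search_url(base, start=0)`, then keep increasing `start` by `rows` until it has covered the `count` reported in the search result.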


Thank you for your attention!

Presented by Ivan Ermilov.
LinkedIn: https://www.linkedin.com/in/iermilov
Email: [email protected]: earthquakesan