Open Data Portals Andrew Ferlitsch OpenGeoCode.Org, Co-Founder Sharp Labs of America, Principal Researcher http://www.opengeocode.org/articles/Open%20Data.pptx

Dec 15, 2015

Open Data Portals

Andrew Ferlitsch

OpenGeoCode.Org, Co-Founder

Sharp Labs of America, Principal Researcher

http://www.opengeocode.org/articles/Open%20Data.pptx

Open Data
INDEX:
• What is Open Data?
• What are Open Data Portals?
• Data Portals in the US

What is Open Data?

• Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.

opendatahandbook.org/en/what-is-open-data/

• Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control.

en.wikipedia.org/wiki/Open_data

What are Open Data Portals?

• A single point of access to open data, provided freely by:
– Government: Federal, State, Regional and local Municipal governments.
– Institutions: Universities
– Organizations: International and National standards bodies and NGOs.
– Private: Corporations, Individuals

Open Data Portals in the US

• US government agencies have been mandated to make data collected and compiled with US tax dollars accessible to the public.

• In 2009, the Obama administration launched the Open Government Initiative (OGI) to provide a centralized repository for “public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.” (data.gov).

• On May 9, 2013, President Obama signed an executive order that made open and machine-readable data the new default for government information. Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government. (whitehouse.gov/open).

Open Data Portals
INDEX:
• Open Data Portals in North America
• Open Data Portals in the World

Open Data Portals in Northern America

• Number of “government” data portals in the United States (~350 as of 10/2014)
– Federal Government: 50+ Data Portals
– State Government: 150+ Data Portals
– County Government: 50+ Data Portals
– City Government: 100+ Data Portals

• Number of “government” data portals in Canada (~90 as of 10/2014)
– Federal Government: 5+ Data Portals
– State Government: 20+ Data Portals
– County Government: 15+ Data Portals
– City Government: 50+ Data Portals

Open Data Portals in the World

• 212 of 250 countries worldwide have at least one government open data portal (10/2014).

• Largest (non US/CA) Portals (10/2014):
– United Kingdom: 35+
– Spain: 30+
– Italy: 25+
– Australia: 20+
– France: 20+
– Germany: 15+
– Austria: 10+
– Finland: 10+
– Netherlands: 10+
– Brazil: 10+

Data Portal Catalogs
INDEX:
• Data Portal Catalogs
• OpenGeoCode
• DataCatalogs
• Sunlight Foundation

Data Portal Catalogs

• OpenGeoCode (crowd sourced) – 1300 portals http://opengeocode.org/opendata/

• Datacatalogs (Open Knowledge Foundation) – 390 portals http://datacatalogs.org/

• Sunlight Foundation – 169 portals https://github.com/sunlightpolicy/opendata

• Open Access Directory – 110 portals http://oad.simmons.edu/oadwiki/Data_repositories

• OpenGovernmentData – 60 portals http://opengovernmentdata.org/data/catalogues/

OpenGeoCode - Catalog

Largest Catalog / Crowd Sourced

Major Categories:

• Data Portals
• Transparency Portals
• GIS/Gazetteer
• Census/Demographics

View
• List View by Country
• Map View
• Filter by Category

CSV Dump of Catalog

OpenGeoCode - Catalog

DataCatalogs - Catalog

Most Established / Maintained by Curators / Crowd Sourced

Built using CKAN 2.0 (open source)

View
• Browse Alphabetically
• Map View

Has tags, but the catalog is not yet searchable by them.

Sunlight Foundation - Catalog

List of US Portals

Maintained in GitHub

Data Portal Providers
INDEX:
• Socrata
• CKAN
• ESRI – Geoportal Server
• Development Seed

Primary Data Portal Providers

• Socrata – Private
Strong presence in federal, state and municipal government in the US.
• Hosting Service
• Interactive Search
• Developer API

Primary Data Portal Providers

• CKAN – Open Source
Strong presence outside the US.
• Hosting
• Interactive Search
• Developer API
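
As an illustration of the Developer API bullet, CKAN portals expose an HTTP "Action API" whose package_search call returns matching datasets as JSON. The sketch below only builds the request URL and parses a response of that shape. The portal address and the sample response are made up for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal_base, query, rows=5):
    """Build a CKAN Action API (v3) package_search URL for a portal."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_base.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Pull dataset titles out of a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [pkg.get("title", pkg.get("name")) for pkg in payload["result"]["results"]]


# Hypothetical portal address, used only to show the URL shape.
url = package_search_url("https://demo.ckan.example/", "crime", rows=2)

# Hand-written sample in the shape CKAN returns; not real portal data.
sample = json.dumps({
    "success": True,
    "result": {"count": 2, "results": [
        {"name": "crime-incidents", "title": "Crime Incidents"},
        {"name": "transit-stops", "title": "Transit Stops"},
    ]},
})
titles = dataset_titles(sample)
print(url)
print(titles)
```

Against a live portal, the URL would be fetched over HTTP and the response body passed to dataset_titles.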

Primary Data Portal Providers

• ESRI – Geoportal Server
Strong presence in GIS/Mapping, Land/Property
• Free / Open Source
• Interactive Search

ESRI: Geoportal Server – Example Sites

Other Sites:
• Abu Dhabi SDI GeoPortal
• Australia E-NRIMS Digital Geographic Information
• Austria Energeo Geoportal
• Canada Saskatchewan GeoSask Portal
• GeoPortal Genie
• Malaysia GeoPortal
• Poland IKAR Geoportal
• Portugal National System for Geographic Information (SNIG)
• Sweden Geodata Portal
• USA New York Ocean and Great Lakes Ecosystem Conservation Council

ESRI: Geoportal Server – Features

• Geoportal Catalog Service for GIS Resources
– OGC (Open Geospatial Consortium) WS compliant
– Publish resources to the geoportal by registering the resource's metadata with the catalog service: datasets, analyses, tools, and web services.

• Search
– Keyword and Location
– Clip-Zip-Ship (emails packaged Zip files)
– Search from ArcGIS applications

Primary Data Portal Providers

• Development Seed – NGO
Strong presence internationally (UN, World Bank, 3rd World).

• Builds integrated systems and tools for open data deployment.

US Census
INDEX:
• TIGER/Line Shapefiles
• KML Boundary Files

US Census: TIGER/Line Shapefiles

Streets/Roads
• Obtained from County Survey data
• Street addresses extrapolated

Administrative Boundaries
• Nation, Region
• State, County, Place, ZCTA (~zip)
• PUMA, MSA
• CBSA, Tract
• Voting Districts
• School Districts
• Native American Reservations

Infrastructure
• Railroads

US Census: TIGER/KML Boundary Files

NEW for 2013

Administrative Boundaries
• Nation, Regions
• State, County, Place, ZCTA (~zip)
• PUMA, MSA
• CBSA, Tract
• Voting Districts
• School Districts
• Native American Regions

Data Portal - Portland, OR
INDEX:
• Portland Portal – CivicApps.Org
• Portland Data Portal – Trimet
• Portland Data Portal – Other Shapefiles

Portland Data Portal (CivicApps.Org)

• Local Design
• Listing of Datasets
• Selected APIs

• Street Addresses
• Business Licenses
• Crime Statistics
• Parks / Trees
• Restaurant Inspections (CSV/Text)
• Trimet (KML)
• Boundaries / Bridges / etc (Shapefiles)

Portland Data Portal - Trimet

Formats (table columns): Shapefile, KML, CSV, WS

Datasets (table rows):
• Trimet Boundary
• Schedule
• Detours
• Fare Zones
• Park n Ride
• Rail Lines
• Rail Stops
• Arrival Prediction
• Routes
• Route Stops
• Transit Centers

Portland Data Portal – Other Shapefiles

Boundaries
• City / County / Zip Codes
• Urban Renewal
• Address Points / Streets / Center Lines
• Enterprise Zones
• Local Improvement Districts
• Parks
• Neighborhood Associations
• Snow/Ice Routes
• Business Associations
• Watershed Areas
• Metro Council Districts

Footprints / POI
• Sidewalks / Curbs / Ramps
• Guardrails
• Bicycle Parking
• Hospitals
• Bridges
• Libraries
• Capital Improvement Projects
• ITS Signs / Cameras
• City Halls
• Garbage Routes / Leaf Pickup
• Fire Stations
• Parking Meters
• Schools
• Traffic Devices / Signals

AND MORE

PDX Crime Analysis
INDEX:
• PDX Crime Analysis – Datasets
• PDX Crime Analysis – Process
• PDX Crime Analysis – High Crime Stops
• PDX Crime Analysis – Low Crime Stops
• PDX Crime Analysis – Time Flow

PDX Crime Analysis - Datasets

• Datasets from CivicApps.Org
– Crime Incidents 2013 – by type, date and time
– Trimet Transit Stops – location and coordinate
– Business Licenses – business type (NAICS code)

• Analysis
– Look for correlation between transit stops with high crime in the vicinity and:
  • Presence of Alcohol Establishments
  • Route and Time of Day

• Application Source and Data: www.opengeocode.org/PDX/crimePDX.zip

PDX Crime Analysis - Process

Process flow (from the slide diagram):
• Source: CivicApps.Org datasets (Crime Incidents, Trimet Stops, Business Licenses)
• ETL: Automated Extract-Transform-Load (ODI Linked CSV Format, CUDE Ontology)
• Analyze: Custom Analysis Tool (Java), Cmdline
• Output Results: CSV, KML

PDX Crime Analysis – High Crime Stops

- Filtered to Personal Crime Categories (e.g., assault, robbery, prostitution, drugs).

- Filter for Trimet stops with over 300 reported (personal) crimes last year within 1/10th of a mile.

- Alcohol establishments within same radius average 1 to 9.

- Concentration around Downtown and Burnside.
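
The radius filter described above can be sketched as follows. This is a minimal brute-force illustration, not the author's Java tool: it assumes stops, crimes, and businesses are already available as WGS84 latitude/longitude pairs, and all coordinates, identifiers, and the tiny threshold in the sample are invented for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_within(stop, points, radius_miles=0.1):
    """Count points lying within radius_miles of a (lat, lon) stop."""
    return sum(
        1 for lat, lon in points
        if haversine_miles(stop[0], stop[1], lat, lon) <= radius_miles
    )


def high_crime_stops(stops, crimes, threshold, radius_miles=0.1):
    """Return {stop_id: count} for stops with more than threshold crimes nearby."""
    result = {}
    for stop_id, location in stops.items():
        n = count_within(location, crimes, radius_miles)
        if n > threshold:
            result[stop_id] = n
    return result


# Invented sample data; the slide's real threshold is 300 crimes per year.
stops = {"A": (45.5230, -122.6760), "B": (45.5600, -122.6000)}
crimes = [
    (45.5231, -122.6761), (45.5229, -122.6759), (45.5235, -122.6765),
    (45.5601, -122.6001), (45.6000, -122.7000),
]
alcohol = [(45.5232, -122.6758)]

high = high_crime_stops(stops, crimes, threshold=2)
print(high)
print(count_within(stops["A"], alcohol))
```

The same count_within helper serves for both the crime count and the count of alcohol establishments in the same radius. For city-scale data a spatial index would replace the nested scan.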

PDX Crime Analysis – Low Crime Stops

- Filtered to Personal Crime Categories (e.g., assault, robbery, prostitution, drugs).

- Filter for Trimet stops with under 100 reported (personal) crimes last year within 1/10th of a mile.

- Alcohol establishments within same radius average 0 to 1.

PDX Crime Analysis – Time Flow

- Filtered to Personal Crime Categories (e.g., assault, robbery, prostitution, drugs).

- Filter for Trimet stops with over 200 reported (personal) crimes last year within 1/10th of a mile, between 6am and noon.

- Alcohol establishments within same radius average 4 to 8.

- Concentration around Downtown Transit Center.
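
The time-of-day restriction above can be applied before any per-stop counting. A minimal sketch, assuming each incident carries its report time as an "HH:MM:SS" string under a hypothetical "time" key, and treating the window as 6am inclusive to noon exclusive (the slide does not say which ends are included):

```python
from datetime import time


def in_window(report_time, start=time(6, 0), end=time(12, 0)):
    """True when start <= report_time < end (window must not cross midnight)."""
    return start <= report_time < end


def morning_incidents(incidents):
    """Keep only incidents reported inside the default 6am-to-noon window."""
    return [inc for inc in incidents if in_window(time.fromisoformat(inc["time"]))]


# Invented sample incidents for illustration.
incidents = [
    {"id": 1, "time": "05:59:00"},
    {"id": 2, "time": "06:00:00"},
    {"id": 3, "time": "11:59:59"},
    {"id": 4, "time": "12:00:00"},
    {"id": 5, "time": "23:15:00"},
]
morning = morning_incidents(incidents)
print([inc["id"] for inc in morning])
```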

ETL / Portland CivicApps
INDEX:
• ETL for Portland Civic Apps
• ETL PDX Usage
• ETL PDX Parseable Datasets

Open Source Tools
ETL PDX Usage

• Usage: pdxETL [-p params] url
• URL is the CivicApps.Org location of a CSV file on the FTP site:
– E.g., ftp://ftp02.portlandoregon.gov/CivicApps/address.zip
– Will automatically download
– Extract from ZIP file
– Transform fields into our standardized Linked CSV Vocabulary:
  • http://www.opengeocode.org/cude1.2/LinkedCSV-Vocab.php
– Export to CSV for loading into database / application
– Application and Data: www.opengeocode.org/PDX/pdxETL.zip
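
The extract, transform, and export steps listed above can be sketched in a few lines. This is not the pdxETL tool itself: the column names and the mapping below are hypothetical stand-ins for the Linked CSV vocabulary terms, and the download step is left out so the sketch runs offline against a ZIP built in memory.

```python
import csv
import io
import zipfile

# Hypothetical source-to-target column mapping; the real vocabulary
# terms are defined at the Linked CSV Vocabulary URL on the slide.
FIELD_MAP = {"ADDRESS_FULL": "address", "X_COORD": "x", "Y_COORD": "y"}


def extract_csv(zip_bytes):
    """Return the text of the first .csv member found in a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".csv"):
                return archive.read(name).decode("utf-8")
    raise ValueError("no CSV file in archive")


def transform(csv_text, field_map):
    """Rename mapped columns and drop unmapped ones."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{new: row[old] for old, new in field_map.items()} for row in reader]


def export_csv(rows, fieldnames):
    """Serialize rows to CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Build a small in-memory ZIP with invented data in place of a download.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr(
        "address.csv",
        "ADDRESS_FULL,X_COORD,Y_COORD,NOTE\n123 MAIN ST,100,200,x\n",
    )

rows = transform(extract_csv(buf.getvalue()), FIELD_MAP)
result = export_csv(rows, list(FIELD_MAP.values()))
print(result)
```

Keeping the three steps as separate functions means a new dataset needs only a new mapping.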

Open Source Tools
ETL PDX Parseable Datasets

• Supports ETL for PDX datasets in CSV format:
– Address Points ftp://ftp02.portlandoregon.gov/CivicApps/address.zip -e unit_value=ADDRESS:SPC
– Building Permits ftp://ftp02.portlandoregon.gov/CivicApps/permits.zip -p FC=S,FD=BLDG,FX=permit
– Business Licenses ftp://ftp02.portlandoregon.gov/CivicApps/business_licenses.zip -e BusinessName=NAME:LEGAL
– Crime Incidents ftp://ftp02.portlandoregon.gov/CivicApps/crime_incident_data.zip
– Crime Incidents 2004 - 2013 ftp://ftp02.portlandoregon.gov/CivicApps/crime_incident_data_2004.zip ftp://ftp02.portlandoregon.gov/CivicApps/crime_incident_data_2014.zip
– Park Finder ftp://ftp02.portlandoregon.gov/CivicApps/ParkFinder.zip
– Public Art ftp://ftp02.portlandoregon.gov/CivicApps/public_art.zip -p FC-S,FD-ARTP
– Earthquake (BEECN) http://www.portlandoregon.gov/pbem/article/

Join Open Source Project
Tasks to Do

• We invite developers in the Portland area to make community contributions to the ETL PDX tool.
– KML format (Trimet datasets)
– Conversion of State Plane Coordinates to WGS84 (lat/lon)
– Shapefile format (large number of datasets)
– [Geo]JSON output
– Extend to State of Oregon Data Portal
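
As a starting point for the [Geo]JSON output task, point records can be wrapped in a GeoJSON FeatureCollection. A minimal sketch, assuming rows already carry WGS84 coordinates under hypothetical "lat" and "lon" keys (the State Plane conversion is a separate task and is not handled here); the sample row is invented.

```python
import json


def to_geojson(rows, lat_key="lat", lon_key="lon"):
    """Wrap point rows in a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        props = {k: v for k, v in row.items() if k not in (lat_key, lon_key)}
        features.append({
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude], in that order.
            "geometry": {
                "type": "Point",
                "coordinates": [float(row[lon_key]), float(row[lat_key])],
            },
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


# Invented sample row for illustration.
rows = [{"name": "Sample Stop", "lat": "45.523", "lon": "-122.676"}]
collection = to_geojson(rows)
print(json.dumps(collection, indent=2))
```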