From Open Data to Open Algorithms
Innovations in Data Management
Forum on Innovative Data Approaches to SDGs
UN ESCAP - 1 June 2017
About me
Philosopher turned Software Engineer turned Data Scientist
Co-founder of DrivenData
@pjbull | @drivendataorg
A Case Study
4 years of history
8,000 locations
315,000 violations
Data Partner
A public-private data partnership changed the perspective the city had on the problem.
778 algorithms
607 data scientists
2.5 months
1st
Liliana Medina
MSc in Electrical and Computer Engineering
Portuguese citizen living and working in the UK
Built a model with the history of violations and the “sentiment” of the Yelp reviews.
Boston is currently running an experiment using what they learned from the competition.
In the initial results, they are finding 25% more violations with the new process.
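As a rough illustration of the idea behind the winning entry (violation history combined with review sentiment), here is a minimal sketch. It is not the winning model: the word lists, the weights, and the function names `sentiment` and `risk_score` are all assumptions made for this example.

```python
# Toy illustration only: hand-made word lists stand in for a learned
# sentiment model, and the weights are arbitrary assumptions.
NEGATIVE = {"dirty", "sick", "roach", "smell"}
POSITIVE = {"clean", "fresh", "friendly"}


def sentiment(review):
    """Return a score in [-1, 1]; 0.0 when no lexicon word appears."""
    words = [w.strip(".,!?") for w in review.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def risk_score(past_violations, reviews,
               history_weight=1.0, sentiment_weight=2.0):
    """Combine average past violations with average review sentiment.

    Higher scores suggest a location to inspect sooner.
    """
    avg_history = (sum(past_violations) / len(past_violations)
                   if past_violations else 0.0)
    avg_sentiment = (sum(sentiment(r) for r in reviews) / len(reviews)
                     if reviews else 0.0)
    # Negative sentiment raises the score, positive sentiment lowers it.
    return history_weight * avg_history - sentiment_weight * avg_sentiment
```

With the same violation history, a location with negative reviews scores higher than one with positive reviews, which is the ranking behavior an inspection schedule would use.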
Open Source
All of the algorithms from every competition are published as open source code.
We can’t improve what we can’t measure
Open Data
Permissive License
• No fees, no restrictions
• Allow commercial use
• Common license (e.g., CC0)
Useful Metadata
• How collected
• When, how often updated
• Machine readable codebook
Open Format
• No proprietary file types (SAS, STATA, SPSS)
• Follow standards
• CSV, JSON, HDF5
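The three Open Data criteria above can be sketched in a few lines: an open format (CSV) for the records, plus a machine-readable codebook (JSON) that carries the license and collection metadata. The field names, sample rows, and metadata values below are hypothetical, chosen only to illustrate the shape of such a release.

```python
import csv
import io
import json

# Hypothetical inspection records; values are made up for illustration.
records = [
    {"location_id": "A001", "inspection_date": "2015-03-02", "violations": 3},
    {"location_id": "B014", "inspection_date": "2015-03-05", "violations": 0},
]

# Machine-readable codebook: license, how and when the data is collected,
# and a description of every column. All values here are assumptions.
codebook = {
    "license": "CC0-1.0",
    "collection_method": "on-site inspection reports",
    "update_frequency": "weekly",
    "fields": {
        "location_id": {"type": "string", "description": "Unique location ID"},
        "inspection_date": {"type": "date", "description": "ISO 8601 date"},
        "violations": {"type": "integer", "description": "Violations found"},
    },
}


def to_csv(rows, fieldnames):
    """Serialize rows to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Column order comes from the codebook, so data and documentation agree.
csv_text = to_csv(records, list(codebook["fields"]))
codebook_text = json.dumps(codebook, indent=2)
```

Publishing `csv_text` and `codebook_text` side by side lets anyone load the data and its documentation without proprietary software.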
We can’t improve what we can’t reproduce
Open Algorithms
Data
• The data as it was used for analysis
• Any intermediate data
Code
• Preprocessing
• Modeling and analysis
• Visualizations
Results
• Bottom line up front
• Context
• Methods and modeling decisions
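One way to make the Data, Code, and Results pieces above hang together is to have a single script run every step and record fingerprints of the data it used. The sketch below assumes a toy dataset and hypothetical function names (`preprocess`, `analyze`, `run`); it shows the pattern, not a prescribed workflow.

```python
import hashlib
import json
import statistics

# Toy raw data, made up for illustration; one row has a missing value.
RAW = [
    {"location_id": "A001", "violations": 3},
    {"location_id": "B014", "violations": None},
    {"location_id": "C120", "violations": 5},
]


def fingerprint(obj):
    """SHA-256 of a canonical JSON dump, to identify the exact data used."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def preprocess(rows):
    """Preprocessing step: drop rows with missing violation counts."""
    return [r for r in rows if r["violations"] is not None]


def analyze(rows):
    """Analysis step: a deliberately simple summary statistic."""
    return {"mean_violations": statistics.mean(r["violations"] for r in rows)}


def run(raw):
    """Run the whole pipeline and report data fingerprints with the results."""
    clean = preprocess(raw)
    return {
        "input_sha256": fingerprint(raw),
        "intermediate_sha256": fingerprint(clean),
        "n_rows_used": len(clean),
        "results": analyze(clean),
    }


report = run(RAW)
```

Because every step is deterministic, rerunning `run(RAW)` yields an identical report, and a changed fingerprint signals that the input or intermediate data differs from what was published.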
Open Algorithms Foster
Transparency
Citizen collaboration
Shared knowledge
Build your future on open algorithms.