Not all open data is born equal
Nov 22, 2014
• Canadian nonprofit that builds websites and tools to help governments and citizens engage with each other
• Follows two main strategies:
– Improve access to government information via open data
– Make participation easy and meaningful
Some context
• Citizen Budget: an online consultative budget simulator for municipalities and civil society organizations
• Represent: the largest database and open API of elected Canadian officials, with two Drupal modules for easy website integration
• MaMairie/MyCityHall: an online portal for tracking and interacting with your city hall
• Open511: an open data standard for traffic data and basic related tools
Ongoing projects
Data = Natural Resources?
Source: James St-Jones (cc-by)
Source: USGS
Value!
Meh? Hint: this is bauxite
Value extraction
Diamond
Extract
Cut
Tada!
Aluminum
Discover it’s valuable
Elaborate process
Industrialize process…
Cans, Car parts, etc.
• Sort of case study
– San Francisco Bay Area: 2 leading organizations
• Bay Area Rapid Transit (BART): 80+ apps
• Metropolitan Transportation Commission (MTC): handful of apps
– Same region (full of geeks and startups)
– Same “type” of data (transportation)
– Both organizations are innovative
Let’s look at “intrinsic” data value
Traffic and Transit data
• Transit data
– GTFS & SIRI: open data-oriented standards
– Used by 250 transit/transportation agencies
• Traffic
– Several standards (TMDD, TPEG, etc.), but difficult to use in an open data context
Standard = low barrier to entry
Tools/apps built for these standards can reach lots of customers
1. Standardization
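To illustrate the low barrier to entry, here is a minimal sketch, assuming a feed that follows the GTFS `stops.txt` layout (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`). The sample rows and the `load_stops` helper are hypothetical and only for illustration; the point is that the same few lines could read the stops of any agency publishing GTFS.

```python
import csv
import io

# Tiny in-memory stand-in for a GTFS stops.txt file (sample values are made up).
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Main St Station,45.5017,-73.5673
S2,Harbour Terminal,45.5088,-73.5540
"""


def load_stops(stops_file):
    """Read stops from any file-like object laid out like GTFS stops.txt."""
    reader = csv.DictReader(stops_file)
    return [
        {
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
        for row in reader
    ]


# Hypothetical usage: swap the StringIO for an open file from a real feed.
stops = load_stops(io.StringIO(SAMPLE_STOPS_TXT))
for stop in stops:
    print(stop["name"], stop["lat"], stop["lon"])
```

Because the column names come from the standard rather than from one agency, nothing in the helper is specific to a single data publisher.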
• Transit data
– Data can be interpreted on its own. No need for external data
• Traffic
– Several subsets of related data (accidents, construction, road data, etc.)
– Data managed by several jurisdictions (local, regional, provincial, federal)
Managing several sources and several datasets is always… complex
2. Self-sufficiency
• Transit
– (Quite) simple: some schedules, some fares, some spatial data
• Traffic
– Complex: networks are wide, intertwined, with lots of rules, lots of “free” actors
Modeling complex data is… complex and more prone to discrepancies
3. Complexity
• Transit
– Buses and trains usually follow their schedule
– Adding a GPS to every bus is simple and gives almost 100% data reliability
• Traffic
– Impossible to monitor every single road segment
Lack of reliability has a strong, negative impact on data value
4. Reliability
Techno-utopian dream
Your iPhone 8S
Dear smartphone, I need to pick up the kids at school as fast as possible. What’s the best choice?
A wealth of data
Road events
Parking data
Crowdsourced data
Realtime traffic sensors (gov)
Realtime traffic (business)
Planned trip
Road data
Gas price
Personal data: car, location, habits
Car efficiency
• “Diamond” data is self-sufficient: a strength for adoption
• For all data: the real value is in cross-use with other datasets
• Some datasets will find their value because of the existence of other datasets
• Adding new datasets has a multiplier effect on existing related datasets
Multiplicative effect
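One way to picture the multiplier effect, as a rough sketch under a simplifying assumption: treat every pair of datasets as one potential cross-use. The `pairings` helper and the choice of datasets are hypothetical; real value obviously depends on whether a given pairing is actually useful.

```python
from itertools import combinations


def pairings(datasets):
    """List every possible two-dataset combination (assumed proxy for cross-use)."""
    return list(combinations(datasets, 2))


datasets = ["road events", "parking data", "traffic data"]

# Count potential pairings before and after one more dataset is opened.
before = len(pairings(datasets))
after = len(pairings(datasets + ["gas price"]))
print(before, after)
```

Under this assumption, the number of pairings for n datasets is n(n-1)/2, so each newly opened dataset adds more potential combinations than the one before it.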
• Usually open data = open government data
• But open data can be much more
Not only gov data
Gas price
Road events
Road data
Traffic data
Parking data
Parking data
Car, transit pass, bike share
Transportation habits
Planned trip
Traffic data
Vehicle efficiency
Crowdsourced data
Open gov data
Open personal data
Open (?) data from companies
Bike share
Gartner’s hype cycle of innovation (but it is not only about hype)
Some innovation theory
Innovation trigger
Peak of inflated expectations
Trough of disillusionment
Slope of enlightenment
Plateau of productivity
Stairway to heaven (internet-style)
Abyssal crash
You might be here…
…or here
• Assess your datasets: use the diamond vs bauxite analogy or any other analysis framework
• Not all datasets are born equal; some might take more time to show their value
• Help the discovery and value extraction process
• Follow “open” standards when they exist, or participate in their development
• Improve reliability of data where possible
• Be patient… but active!
Conclusion
Twitter: @opennorth
Facebook: OpenNorth.NordOuvert
Blog: www.opennorth.ca/blog
Stéphane Guidoin
@hoedic