Exploring the Networks in Open Public Data
Uldis Bojārs
Institute of Mathematics and Computer Science
University of Latvia
Using Open Data Workshop
Brussels, 20-Jun-2012
Nov 01, 2014
About us
• Institute of Mathematics and Computer Science, University of Latvia
– http://www.lumii.lv/resource/show/170
– Uldis Bojārs @CaptSolo
– Valdis Krebs http://orgnet.com
– Pēteris Ručevskis
Network visualisation and analysis
Applications:
• discover interesting patterns
• explore data in [more] detail
Work from the Open Data Hackathon in Riga
• analysis of Saeima voting patterns
• http://opendata.lv
Overview
• Data needs to be Open
• Pre-processing and filtering the data
– selecting what to show
• Data visualization
– iterative process (visualize, refine, repeat)
• What’s next?
Open Data needed first (!)
“Open data is data that can be freely used, reused and redistributed by anyone …”
http://opendefinition.org/
Data needs to be:
• open
• easy to use
Still a problem in Latvia:
• only a few datasets are open in an easy-to-consume form (PDF does not count :)
http://titania.saeima.lv/LIVS11/SaeimaLIVS2_DK.nsf/0/9DEA96450E79B7E5C2257944007E589D?OpenDocument
Pre-processing
• Input:
– raw vote data (scraped from the website), published at http://data.opendata.lv/
• Output:
– nodes (MPs)
– edges (connections between them)
• What is a connection?
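As an illustration of the pre-processing step, here is a minimal sketch that groups raw vote records by MP. The record layout (MP name, decision id, vote) and the names RAW_VOTES and votes_by_mp are assumptions made for this example; the slides do not show the actual format of the scraped data.

```python
from collections import defaultdict

# Hypothetical raw records: (MP name, decision id, vote).
# The real layout of the scraped data is not shown in the slides.
RAW_VOTES = [
    ("MP A", 1, "for"),
    ("MP B", 1, "for"),
    ("MP C", 1, "against"),
    ("MP A", 2, "against"),
    ("MP B", 2, "against"),
    ("MP C", 2, "for"),
]


def votes_by_mp(records):
    """Group raw records into {mp: {decision_id: vote}}."""
    table = defaultdict(dict)
    for mp, decision_id, vote in records:
        table[mp][decision_id] = vote
    return dict(table)


VOTES = votes_by_mp(RAW_VOTES)
NODES = sorted(VOTES)  # one graph node per MP
```

With the votes grouped per MP, any pair of MPs can be compared decision by decision, which is what an edge definition needs.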
Defining graph connections
• Connect MPs if they have voted similarly
– disagreed on at most n% of decisions
• Filter out cases where almost all MPs voted the same
• Filter out trivial decisions
• Filter out noise
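The connection rule and the near-unanimity filter above could be sketched as follows. This is one possible reading, not the authors' implementation: the 90% majority cut-off, the sample votes and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical sample: {mp: {decision_id: vote}}. Decision 1 is unanimous.
VOTES = {
    "A": {1: "for", 2: "for", 3: "for", 4: "against", 5: "for"},
    "B": {1: "for", 2: "for", 3: "for", 4: "against", 5: "against"},
    "C": {1: "for", 2: "against", 3: "against", 4: "for", 5: "against"},
    "D": {1: "for", 2: "against", 3: "against", 4: "for", 5: "for"},
}


def drop_near_unanimous(votes, max_majority=0.9):
    """Drop decisions where the largest bloc exceeds max_majority of votes cast."""
    decisions = set()
    for mp_votes in votes.values():
        decisions.update(mp_votes)
    keep = set()
    for d in decisions:
        cast = [mp_votes[d] for mp_votes in votes.values() if d in mp_votes]
        largest = max(cast.count(choice) for choice in set(cast))
        if largest / len(cast) <= max_majority:
            keep.add(d)
    return {mp: {d: v for d, v in mp_votes.items() if d in keep}
            for mp, mp_votes in votes.items()}


def disagreement(a, b):
    """Share of jointly voted decisions on which two MPs voted differently."""
    shared = a.keys() & b.keys()
    if not shared:
        return None  # no common decisions, nothing to compare
    return sum(a[d] != b[d] for d in shared) / len(shared)


def build_edges(votes, n_percent):
    """Connect two MPs if they disagreed on at most n_percent of decisions."""
    edges = []
    for mp1, mp2 in combinations(sorted(votes), 2):
        rate = disagreement(votes[mp1], votes[mp2])
        if rate is not None and rate * 100 <= n_percent:
            edges.append((mp1, mp2))
    return edges


FILTERED = drop_near_unanimous(VOTES)
EDGES = build_edges(FILTERED, n_percent=25)
```

In this sample the unanimous decision 1 is dropped, and at n = 25% only the pairs A and B, and C and D, remain connected. Raising or lowering n_percent changes how dense the graph becomes.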
Node colour legend
• Ruling coalition:
– Zatlers' Reform Party
– Unity
– the National Alliance
• Opposition:
– Harmony Centre
– Greens / Farmers Party
• a few non-party MPs
MPs who always vote the same (n = 0%)
Connection criteria too narrow
MPs who disagree in less than 35% of cases
Connection criteria too broad (everyone agrees, really?)
Refining the visualisation
• Need to find the right cut-off values (n%)
– where patterns [start to] appear
– and the visualisation makes sense
• Show the results to domain experts
– MPs, journalists, political researchers, …
• Experts:
– help improve visualisations
– can discover new things for themselves
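One way to support the search for a cut-off is to tabulate how the graph changes as n grows. The sketch below is only an aid for that iteration, under stated assumptions: the disagreement rates are invented, and counting edges and connected groups is our choice of summary, not something the slides prescribe.

```python
# Hypothetical pairwise disagreement rates, in percent.
DISAGREEMENT = {
    ("A", "B"): 5, ("A", "C"): 30, ("A", "D"): 60,
    ("B", "C"): 20, ("B", "D"): 55, ("C", "D"): 40,
}


def count_components(nodes, edges):
    """Count connected groups of nodes using a small union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def sweep(rates, cutoffs):
    """For each cut-off n, return (n, number of edges, number of groups)."""
    nodes = sorted({mp for pair in rates for mp in pair})
    rows = []
    for n in cutoffs:
        edges = [pair for pair, rate in rates.items() if rate <= n]
        rows.append((n, len(edges), count_components(nodes, edges)))
    return rows


for n, n_edges, n_groups in sweep(DISAGREEMENT, [0, 10, 25, 35, 60]):
    print(f"n={n:>2}%  edges={n_edges}  groups={n_groups}")
```

A cut-off where the number of groups is neither "every MP alone" nor "one big blob" is a reasonable place to start drawing and showing the result to domain experts.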
MPs who disagree in less than 11% of cases
Opposition parties [sometimes] vote the same
MPs who disagree in less than 25% of cases
Bridges appear between ruling coalition and opposition parties
(see slides 21, 22 re the bridging role of yellow nodes)
What next?
• Improve our understanding of data
• Enhance visualisations
– add clusters, etc.
• Create multiple visualisations
– different topics, changes in time, etc.
• Bring in more data
– explain nodes & edges
Donations to political parties
http://www.thenetworkthinkers.com/2011/12/innovation-happens-at-intersections.html
Network visualisation example #1
Intra-company communication patterns
Network visualisation example #2
Conclusion
• Need more, useful Open Data
• Discovering patterns, making sense of data
– helping make sense = purpose of visualisations
• Looking forward to collaboration re:
– Using Open Data
– Data Visualisation and Analysis
More info
• Uldis Bojārs [email protected]
• Social Network Analysis talk / Valdis Krebs
– http://www.slideshare.net/DERIGalway/valdis-krebs-social-network-analysis-19872007
• Smart Network Analyzer tool
– http://sna.lumii.lv/
– in development at IMCS, University of Latvia