Crossfilter: Fast Multidimensional Filtering for Coordinated Views
Apr 22, 2015
Data management and visualization consultant
Project to develop a general-purpose collaborative data management, transformation,
and visualization platform (GPCDMTVP for short).
You can find me at http://esjewett.com or @esjewett
We’ve got problems
More specifically
We’ve got data
and filters
and aggregations
and speed
Demo: Moritz Stefaner’s “Elastic Lists” experiment
Demo review
• We control the data
• Many different simultaneous filters
• Aggregation: count
• Data in the browser … in Flash
What would a general-purpose approach to this
problem look like?
Let’s just call it … say … Crossfilter
JavaScript (am I at the right meetup?)
Data is encapsulated
var data = [
  { date: '2014-01-01', value: 10, color: 'orange' },
  { … },
  …
];
var transactions = crossfilter(data);
and dimensional
var dateDim = transactions.dimension(
  function (d) {
    return "" + d.date;
  }
);
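The `"" + d.date` in the accessor coerces every key to a string, so all keys are compared the same way. Here is a small plain-JavaScript sketch of why that matters. It does not use Crossfilter, and the sample values are made up for illustration.

```javascript
// Hypothetical keys of mixed type, as an uncast accessor might return them.
var a = "10", b = "9", c = 10;

// With mixed types, `<` switches between string and numeric comparison,
// so the three values have no single consistent ordering.
var mixed = {
  aBeforeB: a < b, // "10" < "9": both strings, compared character by character
  bBeforeC: b < c, // "9" < 10: string vs number, compared numerically
  aBeforeC: a < c  // "10" < 10: compared numerically
};

// Casting everything to one type first gives one well-defined ordering.
var asNumbers = [a, b, c].map(Number).sort(function (x, y) { return x - y; });
var asStrings = [a, b, c].map(String).sort();
```

Here `a` sorts before `b` and `b` before `c`, yet `a` does not sort before `c`. Returning a single type from the accessor avoids this.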
Filter on dimensions
// Each call replaces the previous filter on this dimension

// Exact match
dateDim.filter("2014-01-01");

// Range: "2013-01-01" <= date < "2014-01-01" (upper bound is exclusive)
dateDim.filter(["2013-01-01", "2014-01-01"]);

// Custom predicate on the dimension's value
dateDim.filter(
  function (d) {
    return d === "2013-06-01";
  }
);

// Clear this dimension's filter
dateDim.filterAll();
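To make the filter forms concrete, here is a naive plain-JavaScript sketch of what each one selects. The helper `matchesFilter` and the sample dates are hypothetical, and this is not how Crossfilter implements filtering. It only mirrors the semantics, including the half-open range described in the Crossfilter API reference.

```javascript
// Hypothetical helper: does `value` pass `filter`?
// Mirrors the forms accepted by dimension.filter(...), nothing more.
function matchesFilter(filter, value) {
  if (filter === null || filter === undefined) return true; // no filter, like filterAll()
  if (typeof filter === "function") return !!filter(value); // custom predicate
  if (Array.isArray(filter)) {
    return filter[0] <= value && value < filter[1];          // range [lo, hi)
  }
  return value === filter;                                   // exact match
}

// Made-up dimension values.
var dates = ["2013-01-01", "2013-06-01", "2014-01-01"];

function select(filter) {
  return dates.filter(function (d) { return matchesFilter(filter, d); });
}

var exact = select("2014-01-01");
var range = select(["2013-01-01", "2014-01-01"]);
var custom = select(function (d) { return d === "2013-06-01"; });
var cleared = select(null);
```

Note that the range result excludes "2014-01-01", because the upper bound is exclusive.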
Aggregate on dimensions
// Count transactions per day
var dateGroup = dateDim.group();
Aggregate on dimensions
// Count transactions per month
var monthGroup = dateDim.group(
  function (d) {
    return d.substr(0, 7); // "2013-06-01" -> "2013-06"
  }
);
Aggregate on dimensions
// Sum by value over groups of days
dateGroup.reduceSum(
  function (d) {
    // "d" is the complete record
    return d.value;
  }
);
“Queries”
// Month with most activity under current filter
monthGroup.top(1);
// [ { key: "2013-06", value: 435 } ]

// Day with highest value under current filter
dateGroup.top(1);
// [ { key: "2013-12-24", value: 143700 } ]

// Get all the months' values under current filter (ascending by key)
monthGroup.all();
// [ { key: "2013-02", value: 250 },
//   { key: "2013-06", value: 435 },
//   { key: "2013-12", value: 315 }, … ]
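For readers who want to see what these group queries compute, here is a naive plain-JavaScript stand-in. The sample records and the helpers `countBy` and `topK` are hypothetical. Crossfilter maintains these results incrementally as filters change, rather than recomputing them from scratch as this sketch does.

```javascript
// Hypothetical sample records; only `date` is used for grouping here.
var records = [
  { date: "2013-02-10", value: 5 },
  { date: "2013-06-01", value: 20 },
  { date: "2013-06-15", value: 10 },
  { date: "2013-12-24", value: 40 }
];

// Naive stand-in for a counting group: one { key, value } object per key,
// returned in ascending key order, similar to what all() gives back.
function countBy(rows, keyFn) {
  var counts = {};
  rows.forEach(function (r) {
    var k = keyFn(r);
    counts[k] = (counts[k] || 0) + 1;
  });
  return Object.keys(counts).sort().map(function (k) {
    return { key: k, value: counts[k] };
  });
}

// Naive stand-in for top(k): the k groups with the largest values.
function topK(groups, k) {
  return groups
    .slice() // copy, so the key-ordered array is left untouched
    .sort(function (x, y) { return y.value - x.value; })
    .slice(0, k);
}

var byMonth = countBy(records, function (r) { return r.date.slice(0, 7); });
var busiest = topK(byMonth, 1);
```

As with `top(1)` in Crossfilter, `busiest` is an array with one element, not a bare object.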
So what about speed?
Demo
Some notes
• Dimension accessors must return naturally ordered values. Cast before returning!
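• reduce(add, remove, initial): custom aggregations beyond count and sum
• order(orderValue): choose the value that top(k) ranks groups by
• groupAll(): one aggregate over all records instead of one per key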
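As an illustration of reduce(add, remove, initial), here is a sketch of a running-average aggregation. The reducer names and the sample values are made up. The code only simulates the calls by hand, so it runs without Crossfilter.

```javascript
// Reducers in the shape reduce(add, remove, initial) expects:
// `p` is the group's running value, `v` is a complete record.
function reduceInitial() {
  return { count: 0, total: 0, avg: 0 };
}

function reduceAdd(p, v) {
  // Called when a record enters the filtered set.
  p.count += 1;
  p.total += v.value;
  p.avg = p.total / p.count;
  return p;
}

function reduceRemove(p, v) {
  // Called when a record is filtered out.
  p.count -= 1;
  p.total -= v.value;
  p.avg = p.count ? p.total / p.count : 0; // avoid dividing by zero
  return p;
}

// Hand simulation: two records enter, then one is filtered out.
var p = reduceInitial();
p = reduceAdd(p, { value: 10 });
p = reduceAdd(p, { value: 30 });
var avgBefore = p.avg;
p = reduceRemove(p, { value: 10 });
var avgAfter = p.avg;

// With a real group this would be wired up roughly as:
//   dateGroup.reduce(reduceAdd, reduceRemove, reduceInitial);
//   dateGroup.order(function (p) { return p.avg; }); // rank top(k) by average
```

Because the running value is an object here, an order accessor like the one in the final comment is what lets top(k) rank groups by a single number.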
Ethan Jewett
esjewett.com / coredatra.com
@esjewett
https://github.com/esjewett
Links
• Crossfilter - http://square.github.io/crossfilter/
• Crossfilter API - https://github.com/square/crossfilter/wiki/API-Reference
• Moritz Stefaner’s Elastic Lists - http://moritz.stefaner.eu/projects/elastic-lists/