Rule-based Exploration of Structured Data in the Browser
Sudhir Agarwal, Abhijeet Mohapatra, Michael Genesereth and Harold Boley
DEXTER http://dexter.stanford.edu
Aug 17, 2015
Ad hoc exploration of structured data is often cumbersome and time-consuming with state-of-the-art tools
Lots of structured data available on the Web
- Limited or no querying support, e.g. http://www.govtrack.us: “Which senators are 40 years old?”
- Ad hoc compilation from multiple sources, e.g. “Which U.S. Department Heads attended Stanford University?”
- Combining private (local) data with publicly accessible data
DEXTER: Browser-based Explorer for Structured Data
- runs exclusively on the client side
- supports ad hoc queries
Data = Tables
* Extraction and integration of web data by end-users [Agarwal ’13]
Rules = Dexlog
Extends standard Datalog with negation, aggregates, and built-ins
Reference: http://dexter.stanford.edu/main/dexlog.html
Sets and tuples as first-class citizens, e.g. {}, {[“a”,“1”],[“b”,“2”]}
Introduces a specialized operator called setof
q(X,S) :- p(X) & setof(Y, r(X,Y), S)
For every X such that p(X) is true, q(X, S_X) holds, where
S_X = {Y | r(X,Y)}
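To make the setof semantics concrete, here is a minimal Python sketch that evaluates the rule above over small in-memory relations. It is not how Dexter evaluates Dexlog; the function name setof_query and the sample relations are made up for illustration. Following the definition on the slide, an X with no matching r(X,Y) is paired with the empty set.

```python
def setof_query(p, r):
    """Evaluate q(X,S) :- p(X) & setof(Y, r(X,Y), S) over in-memory relations.

    p is a set of values, r is a set of (X, Y) pairs (both hypothetical inputs).
    """
    result = set()
    for x in p:
        # S_X = {Y | r(X,Y)}; frozenset so the set can be stored inside a tuple
        s_x = frozenset(y for (x2, y) in r if x2 == x)
        result.add((x, s_x))
    return result


# Made-up sample data
p = {"a", "b", "c"}
r = {("a", 1), ("a", 2), ("b", 3)}

q = setof_query(p, r)
for x, s in sorted(q, key=lambda pair: pair[0]):
    print(x, sorted(s))
```

Each X in p contributes exactly one tuple to q, so the aggregation is grouped by X rather than computed over r as a whole.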
Integrity constraints:
The second argument of p functionally depends on the first argument of p:
illegal :- p(X,Y) & p(X,Z) & distinct(Y,Z)
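The constraint above derives illegal whenever some X is paired with two different values. A rough Python equivalent, assuming p is held as a set of (X, Y) pairs, might look as follows; the function name is_illegal and the sample data are hypothetical.

```python
from collections import defaultdict


def is_illegal(p):
    """Return True if some X occurs in p with two distinct second arguments.

    Mirrors: illegal :- p(X,Y) & p(X,Z) & distinct(Y,Z)
    """
    values_by_x = defaultdict(set)
    for x, y in p:
        values_by_x[x].add(y)
    # The functional dependency is violated if any X maps to more than one value
    return any(len(ys) > 1 for ys in values_by_x.values())


# Made-up sample relations
consistent = {("alice", "CA"), ("bob", "NY")}
inconsistent = {("alice", "CA"), ("alice", "NY")}

print(is_illegal(consistent))
print(is_illegal(inconsistent))
```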
Examples: Hospitals registered with Medicare (hospitals.xml)
Examples: Hospitals registered with Medicare in California
misc.hospitalsInCA(NAME) :- misc.hospitals(NAME,ADDR,CITY,"CA",ZIP,COUNTY,PHONE)
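The rule above is a selection on the fourth column (the constant "CA") followed by a projection onto NAME. A small Python sketch of the same operation, with made-up rows standing in for the contents of hospitals.xml and a hypothetical function name:

```python
# Made-up rows in the column order used by the rule:
# (NAME, ADDR, CITY, STATE, ZIP, COUNTY, PHONE)
hospitals = [
    ("Hospital A", "1 Main St", "Palo Alto", "CA", "94301", "Santa Clara", "555-0100"),
    ("Hospital B", "2 Oak Ave", "Austin", "TX", "73301", "Travis", "555-0101"),
    ("Hospital C", "3 Pine Rd", "Fresno", "CA", "93650", "Fresno", "555-0102"),
]


def hospitals_in_ca(rows):
    """Keep rows whose state column equals "CA" and project onto the name."""
    return {name for (name, _addr, _city, state, _zip, _county, _phone) in rows
            if state == "CA"}


print(sorted(hospitals_in_ca(hospitals)))
```

The result is a set because a Datalog-style relation contains no duplicate tuples.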
Examples: What is the number of hospitals in each state?
Examples: EECS Faculty at MIT
https://www.eecs.mit.edu/people/faculty-advisors
misc.mitFaculty(A,B,C,D,E,F)
Examples: Turing Award Winners
https://en.wikipedia.org/wiki/Turing_Award
misc.turing(YEAR,NAME)
Examples: Faculty at MIT who have won a Turing Award
misc.mitTuring(NAME) :- misc.turing(YEAR,NAME) & misc.mitFaculty(NAME,B,C,D,E,F)
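The misc.mitTuring rule joins the two tables on the shared variable NAME. The sketch below shows that join in Python over placeholder rows; the function name and all data values are invented for illustration, and the five unnamed faculty columns are filled with dummy strings.

```python
# Placeholder data; real rows would come from the two web pages
turing = {
    (2001, "Person A"),
    (2002, "Person B"),
    (2003, "Person C"),
}
mit_faculty = {
    ("Person B", "b", "c", "d", "e", "f"),
    ("Person D", "b", "c", "d", "e", "f"),
}


def mit_turing(turing_rows, faculty_rows):
    """Join on NAME: names that appear in both relations."""
    faculty_names = {row[0] for row in faculty_rows}
    return {name for (_year, name) in turing_rows if name in faculty_names}


print(sorted(mit_turing(turing, mit_faculty)))
```

Because the join is on exact equality of NAME, a person whose name is written differently in the two sources would not appear in the result.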
Summary
• Dexter is a browser-based, domain-independent explorer for structured data (e.g. CSV, XML, JSON, databases, APIs)
• Ad hoc exploration of structured data as tables through Dexlog rules
• Client-side query evaluation
• Support for exporting, visualizing and sharing tables
• Implemented in JavaScript; runs entirely inside the user’s browser
Project Page: http://dexter.stanford.edu
Backup Slides
Client-Side Rule Evaluation
• Hybrid-shipping strategy: evaluate queries directly on sources whenever possible
• Decompose queries
  Input:  q(X) :- source1.t1(X,Y) & source2.t2(Y,“a”)
  Output: q(X) :- q1(X,Y) & q2(Y)
          q1(X,Y) :- t1(X,Y) (evaluated at source1)
          q2(Y) :- t2(Y,“a”) (evaluated at source2)
• Remove irrelevant rules
• Evaluate queries in parallel
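The decomposition and parallel evaluation steps above can be illustrated with a small Python sketch. This is only an approximation under stated assumptions: the two sources are simulated by local sets (SOURCE1_T1 and SOURCE2_T2 are made-up data), q1 and q2 stand for the subqueries shipped to each source, and a thread pool stands in for whatever concurrency mechanism Dexter actually uses in the browser.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for tables held by two remote sources
SOURCE1_T1 = {(1, "k1"), (2, "k2"), (3, "k3")}
SOURCE2_T2 = {("k1", "a"), ("k2", "b"), ("k3", "a")}


def q1():
    """q1(X,Y) :- t1(X,Y), evaluated at source1."""
    return set(SOURCE1_T1)


def q2():
    """q2(Y) :- t2(Y,"a"), evaluated at source2.

    The selection on "a" runs at the source, so only matching Y values
    are shipped back to the client.
    """
    return {y for (y, value) in SOURCE2_T2 if value == "a"}


def q():
    """q(X) :- q1(X,Y) & q2(Y), joined on the client."""
    # Issue both subqueries concurrently, then join the partial results
    with ThreadPoolExecutor(max_workers=2) as pool:
        future1 = pool.submit(q1)
        future2 = pool.submit(q2)
        r1, r2 = future1.result(), future2.result()
    return {x for (x, y) in r1 if y in r2}


answers = q()
print(sorted(answers))
```

Only the final join on Y happens on the client; each source receives a subquery that mentions its own table alone.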
Additional Features
• Visualizing a table’s data as charts
• Exporting into popular formats
  - CSV
  - JSON
  - XML
  - RuleML/XML
• Sharing tables via Dexter Server
• Filters