Pay-as-you-go OWL Query Answering Using a Triple Store
Yujiao Zhou, Yavor Nenov, Bernardo Cuenca Grau and Ian Horrocks

Pay-as-you-go Approach
Intuition
‣ to delegate the bulk of the computational workload to a highly scalable datalog reasoner
‣ to minimise the use of a fully-fledged reasoner

Evaluation
‣ Evaluated on LUBM(100, 1000), UOBM(1, 60, 500), FLY, DBPedia+travel and NPD FactPages.
‣ Result tables: "Average time without OWL 2 reasoning" and "Average time".

Acknowledgements
This work was supported by the Royal Society, the EPSRC projects Score!, ExODA, and MaSI³, and the FP7 project OPTIQUE.

Diagram
[Figure: system workflow. Inputs: ontology, data D, query Q. Datalog engines (triple store) compute the lower bound L = L_RL ∪ L_EL ∪ … and the upper bound U. If L = U, done. Otherwise: tracking by datalog encoding extracts a fragment F; summarisation rules out non-answers, using σ(cert(q, F)) ⊆ cert(q, σ(F)); dependency analysis (incomplete endomorphisms) arranges calls to the full OWL 2 reasoner heuristically according to the dependencies; output.]

Over-approx to datalog
‣ upper bound U: answer of q w.r.t. the resulting set of rules U(Σ) and D.

Lower bounds
‣ basic lower bound L_RL: answer of q w.r.t. the datalog fragment of Σ and D;
‣ EL lower bound L_EL: answer of q w.r.t. the ELHO fragment of Σ and D.

Tracking encoding in datalog
Intuition: to compute all the rules and facts that participate in a proof of q(a) in Σ ∪ D. This goal can be achieved using a datalog encoding.
‣ Example: if B_1(x_1) ∧ … ∧ B_m(x_m) → H(x) is a rule r in U(Σ), then H^t(x) ∧ B_1(x_1) ∧ … ∧ B_m(x_m) → S(c_r) ∧ B_1^t(x_1) ∧ … ∧ B_m^t(x_m) is added to the tracking rules.
‣ Involved rules: {r | S(c_r) is derived}
‣ Involved facts: {P(a) ∈ D | P^t(a) is derived}

Summarisation & dependency between answers
‣ Let σ be the summary function; then σ(cert(q, F)) ⊆ cert(q, σ(F))
‣ If there is an endomorphism from a to b in F, then a ∈ cert(q, F) implies b ∈ cert(q, F)
‣ Existential knowledge
‣ Disjunctive knowledge
[Figure: example graphs with node labels {…, A ⊓ B, …}, {…, A ⊔ B, …}, {A, …} and {C}; nodes x_1 and x_2 and a constant c connected by R edges.]

           DL      Ontology   Dataset       Queries
LUBM(n)    SHI     93         ~100,000n     14 (std) + 10
UOBM(n)    SHIN    314        ~200,000n     15
FLY        SRI     144,407    6,308         5
DBPedia    SHOIN   1,757      12,119,662    441 (atomic)
NPD        SHIF    819        3,817,079     329 (atomic)

           LUBM(1000)   UOBM(100)   FLY    DBPedia   NPD
Queries    22/24        12/15       5/5    439/441   294/329
Time (s)   18.4         0.7         0.2    0.3       0.1

           LUBM(100)    UOBM(1)     FLY    DBPedia   NPD
Time (s)   29.6         1.8         0.2    3         3

Problem Setting
‣ Ontology Σ: a set of rules of the form φ(x) → ⋁_i ∃y_i ψ(x, y_i)
‣ Data D: a set of ground atoms of the form P(a)
‣ Conjunctive queries: FO formulas of the form q(x) ← ∃y ψ(x, y)
where ψ and φ are conjunctions of atoms.
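
To make the interplay of the lower and upper bounds concrete for rules of the form given in the problem setting, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the authors' system: all predicates are unary, queries are atomic, existential quantifiers are left out, and the rule encoding and every function name (`saturate`, `datalog_fragment`, `over_approximate`, `bounds`) are hypothetical. The lower bound uses only the rules without disjunction; the upper bound replaces each disjunction by a conjunction, so every disjunct is derived.

```python
# A rule is (body, heads): body is a tuple of unary predicates over one
# variable x; heads is a tuple of alternative head predicates
# (more than one alternative means a disjunction). Hypothetical encoding.

def saturate(facts, rules):
    """Naive forward chaining for rules B1(x), ..., Bm(x) -> H(x)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Snapshot of the constants seen so far.
            for const in {c for _, c in derived}:
                if all((p, const) in derived for p in body) \
                        and (head, const) not in derived:
                    derived.add((head, const))
                    changed = True
    return derived

def datalog_fragment(sigma):
    """Keep only the rules with a single head (no disjunction)."""
    return [(body, heads[0]) for body, heads in sigma if len(heads) == 1]

def over_approximate(sigma):
    """Turn each disjunction into a conjunction: one rule per disjunct."""
    return [(body, head) for body, heads in sigma for head in heads]

def answers(pred, facts):
    return {c for p, c in facts if p == pred}

def bounds(sigma, data, query_pred):
    """Return (lower bound, upper bound, gap) for an atomic query."""
    lower = answers(query_pred, saturate(data, datalog_fragment(sigma)))
    upper = answers(query_pred, saturate(data, over_approximate(sigma)))
    return lower, upper, upper - lower

# Student(x) -> Undergrad(x) v Grad(x); Grad(x) -> Person(x);
# Undergrad(x) -> Person(x)
SIGMA = [
    (("Student",), ("Undergrad", "Grad")),
    (("Grad",), ("Person",)),
    (("Undergrad",), ("Person",)),
]
DATA = {("Student", "a"), ("Grad", "b")}

lower, upper, gap = bounds(SIGMA, DATA, "Grad")
print(lower, upper, gap)
```

Only the constants in the gap (here `a` for the query `Grad`) would need to be passed to a fully-fledged reasoner; when the gap is empty, the bounds coincide and no further reasoning is required.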