Exemplar Queries: Knowledge Exploration using Information Graphs Davide Mottin, University of Trento August 20, 2015 @ RMIT University, Melbourne Department of Information Engineering and Computer Science

Dec 30, 2015


Exemplar Queries: Knowledge Exploration using Information Graphs
Davide Mottin, University of Trento
August 20, 2015 @ RMIT University, Melbourne

Department of Information Engineering and Computer Science


Short Bio

Education
• April 2015 – Now in the job market!: PhD in computer science from University of Trento
  • Thesis title: “Advanced Query Paradigms for the Novice User”
  • Advisors: Prof. Themis Palpanas, Prof. Yannis Velegrakis
• 2010/08: MSc/BSc in computer science

Working Experience
• 2012: Yahoo! Labs, Barcelona under Dr. Francesco Bonchi
• 2011: Microsoft Research, Beijing under Dr. Haixun Wang


Traditional Query Answering

owns=Search Engine, based=California, produces=Mobiles

Database


Hardly Expressible Queries

Query???

Does not know how to describe other companies

Database


The Exemplar Queries perspective

“I think the greatest way to learn is to learn by someone's example.”

Tobey Maguire


A different need


Existing Search Engines

acquisitions like Google-YouTube

Yahoo!-Tumblr or Microsoft-Skype are not present as interesting acquisitions.

Cannot be solved by Related Queries [Boldi11,Bordino13] and Query Relaxation [Mottin13,Mishra09].


A new perspective


Exemplar Queries

Input: Qe, an example element of interest
Output: set of elements in the desired result set

Exemplar Query Evaluation
• evaluate Qe in a database D, finding a sample s
• find the set of elements a similar to s given a similarity relation

[PVLDB 2014, SIGMOD 2014 (Demo)]


Challenges

• Define the similarity between sample and answers
• Determine the best data-model for the problem
• Find answers efficiently


Our Approach

Exemplar Queries
• The user query is an indication of the structure of the answers


Problem

Solution Overview [SIGMOD Record 2014]


General Solution

Input: User Query Q, an example of the expected results
Output: Set of expected results

Procedure:
- Detect the sample for the query Q
- Find the structures similar to the sample
- Rank the results


Data Model: Knowledge graph


Strict equality: Edge Isomorphism


S A1 A2


Similarity: Edge Isomorphism

D. Mottin et al. Exemplar queries: Give me an example of what you need. PVLDB, 7(5), 2014.


Solution

Input: User Query Q, an example of the expected results.
Output: Set of expected results

Procedure:
- Detect the sample for the query Q
- Prune the non-matching nodes
- Find the structures edge isomorphic to the sample
- Rank the results

subgraph isomorphism is NP-complete [Cook71]

Solution
1. IterativePruning: fast reject non-matching nodes


d-neighborhood

                 distance 1    distance 2
                 a   b   c     a   b   c
Graph node 1     2   0   0     1   2   1
Query node q1    1   0   0     0   1   1
Difference       1   0   0     1   1   0

Theorem


d-neighborhood

                 distance 1    distance 2
                 a   b   c     a   b   c
Graph node 2     1   1   1     2   1   0
Query node q1    1   0   0     0   1   1
Difference       0   1   1     2   0  -1

Theorem
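Reading the two d-neighborhood examples together, a graph node appears to stay a candidate for query node q1 only when no entry of the difference vector is negative. The sketch below checks that rule on the numbers from the slides. The rule itself and the function names `difference` and `may_match` are assumptions made for illustration, not the talk's exact theorem.

```python
# Vectors copied from the slides; label order within each distance is a, b, c.
QUERY_Q1 = [1, 0, 0, 0, 1, 1]
GRAPH_NODE_1 = [2, 0, 0, 1, 2, 1]
GRAPH_NODE_2 = [1, 1, 1, 2, 1, 0]


def difference(graph_vec, query_vec):
    """Entry-wise graph count minus query count."""
    return [g - q for g, q in zip(graph_vec, query_vec)]


def may_match(graph_vec, query_vec):
    """Assumed pruning rule: keep the graph node only if every
    entry of the difference vector is non-negative."""
    return all(d >= 0 for d in difference(graph_vec, query_vec))


diff_1 = difference(GRAPH_NODE_1, QUERY_Q1)
diff_2 = difference(GRAPH_NODE_2, QUERY_Q1)
keep_1 = may_match(GRAPH_NODE_1, QUERY_Q1)  # no negative entry
keep_2 = may_match(GRAPH_NODE_2, QUERY_Q1)  # last entry is -1
```

Under this reading graph node 1 survives, while graph node 2 is rejected because it has no c-labeled neighbor at distance 2.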


The IterativePruning Algorithm

1. Start from a query node q
2. Match q with the graph nodes
3. For each adjacent node of q
4. Find nodes in the graph from candidate map of q matching the edge
5. Repeat 2. with an adjacent node of q until all nodes have been visited

Theorem (Pruning Completeness)
No subgraph isomorphic solution is discarded by the IterativePruning algorithm
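The five steps can be sketched as a breadth-first walk over the query that narrows a candidate map. This is a simplified illustration, not the published IterativePruning implementation: the triple representation, the function name `iterative_pruning`, and the toy graph are all assumptions. A candidate is removed only when some query edge has no counterpart among the remaining candidates, so a true match is never dropped.

```python
from collections import deque


def iterative_pruning(graph_edges, query_edges, start):
    """Return {query node: set of graph nodes that may still match it}.

    Edges are (source, label, target) triples (an assumed encoding).
    """
    graph_nodes = {n for s, _, t in graph_edges for n in (s, t)}

    # Query adjacency: node -> [(label, neighbour, edge leaves this node?)]
    q_adj = {}
    for s, label, t in query_edges:
        q_adj.setdefault(s, []).append((label, t, True))
        q_adj.setdefault(t, []).append((label, s, False))

    candidates = {start: set(graph_nodes)}  # step 2: any node may match
    visited = set()
    queue = deque([start])                  # step 1: start from one node
    while queue:
        q = queue.popleft()
        if q in visited:
            continue
        visited.add(q)
        for label, q_next, outgoing in q_adj.get(q, []):  # step 3
            # Step 4: graph edges with this label touching a candidate of q.
            pairs = []
            for s, l, t in graph_edges:
                if l != label:
                    continue
                n, n_next = (s, t) if outgoing else (t, s)
                if n in candidates[q]:
                    pairs.append((n, n_next))
            if q_next in candidates:
                pairs = [(n, m) for n, m in pairs if m in candidates[q_next]]
            candidates[q] = {n for n, _ in pairs}
            candidates[q_next] = {m for _, m in pairs}
            queue.append(q_next)                           # step 5
    return candidates


# Toy data, for illustration only.
GRAPH = [
    ("Google", "acquired", "YouTube"),
    ("Google", "founded_in", "Menlo Park"),
    ("Yahoo!", "acquired", "Tumblr"),
    ("Yahoo!", "founded_in", "Santa Clara"),
    ("Microsoft", "acquired", "Skype"),
]
QUERY = [("company", "acquired", "target"), ("company", "founded_in", "city")]
candidates = iterative_pruning(GRAPH, QUERY, "company")
```

In the toy graph Microsoft has no founded_in edge, so it is dropped from the candidates of `company`, and Skype is then dropped from the candidates of `target`.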


Solution

Input: User Query Q, an example of the expected results.
Output: Set of expected results

Procedure:
- Detect the sample for the query Q
- Restrict the search space
- Prune the non-matching nodes
- Find the structures edge isomorphic to the sample
- Rank the results

subgraph isomorphism is NP-complete [Cook71]

Solution
1. IterativePruning: fast reject non-matching nodes
2. RelevantNeighborhood: restrict the search space to “near” nodes


Restricting the search space


S A1 A2

User Query

Idea
1. Not all the nodes are equally relevant
2. Nodes “far” from the query are less related


The Relevant Neighborhood Algorithm

Prune the search space by identifying the valuable portions:
• Based on an approximation of Personalized PageRank
• Transition matrix A with non-uniform edge weights based on inverse frequency

Procedure
1. Assign each node in the sample a fixed number of particles
2. Distribute the particles on neighbor nodes favoring sample edge-labels
3. Repeat 2 until the number of particles is less than a threshold
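The particle procedure can be sketched as follows. This is a rough illustration under several assumptions that are not in the slides: edges are treated as undirected, a fixed fraction `keep` of the particles is retained as score at each node, edge weights are the inverse label frequency multiplied by a hypothetical `boost` for labels that appear in the sample, and any share below `threshold` is dropped.

```python
from collections import Counter, defaultdict


def relevant_neighborhood(edges, sample_nodes, sample_labels,
                          particles=100.0, keep=0.15,
                          threshold=1.0, boost=2.0):
    """Spread particles from the sample nodes and return node scores."""
    label_freq = Counter(label for _, label, _ in edges)
    neighbours = defaultdict(list)
    for s, label, t in edges:
        weight = 1.0 / label_freq[label]      # rare labels weigh more
        if label in sample_labels:
            weight *= boost                   # favor sample edge labels
        neighbours[s].append((t, weight))
        neighbours[t].append((s, weight))

    score = defaultdict(float)
    frontier = {n: particles for n in sample_nodes}   # step 1
    while frontier:                                   # step 3
        nxt = defaultdict(float)
        for node, amount in frontier.items():
            score[node] += keep * amount
            rest = (1.0 - keep) * amount
            total = sum(w for _, w in neighbours[node])
            for nb, w in neighbours[node]:            # step 2
                share = rest * w / total
                if share >= threshold:                # drop tiny shares
                    nxt[nb] += share
        frontier = nxt
    return dict(score)


# Toy data, for illustration only.
EDGES = [
    ("Google", "acquired", "YouTube"),
    ("Google", "founded_in", "Menlo Park"),
    ("Google", "industry", "Internet"),
    ("Yahoo!", "industry", "Internet"),
    ("Yahoo!", "acquired", "Tumblr"),
    ("CBS", "owns", "CNET"),
]
scores = relevant_neighborhood(EDGES, {"Google", "YouTube"}, {"acquired"})
```

Nodes that never receive particles, such as the disconnected CBS and CNET here, do not appear in `scores` and would fall outside the restricted search space.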


Similarity: Simulation

D. Mottin et al. Exemplar queries: a New Way of Searching. Submitted for publication.


Strict equality: Edge Isomorphism

S A1 A2

Why is Yahoo!-Tumblr not present?


More freedom: Simulation

S A1 A2

Tumblr matches both an acquisition and a website

Match edge-label sequences instead of structures


Algorithms for Simulation

• Use Strong Simulation [Ma14], with:
  • bounded matchings
  • node-topology preserving

Issue: Strong Simulation preserves node labels
Idea: Apply the Strong Simulation algorithm on a graph where edges become nodes with label equal to the original edge.

Pruning:
• d-neighborhood becomes a boolean vector
• a node matches a query node if the boolean and between the two vectors is positive

Theorem
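The graph transformation described in the idea above can be sketched in a few lines. The function name `edges_to_nodes`, the triple encoding, and the choice to give original nodes the placeholder label `None` are assumptions of this sketch; the slides do not say how original nodes are labeled.

```python
def edges_to_nodes(edges):
    """Turn an edge-labeled graph into a node-labeled one.

    Each (source, label, target) triple becomes a new node carrying the
    edge label, wired as source -> edge node -> target.
    """
    node_labels = {}
    new_edges = []
    for i, (s, label, t) in enumerate(edges):
        edge_node = ("edge", i)            # fresh node standing for the edge
        node_labels.setdefault(s, None)    # placeholder label (assumption)
        node_labels.setdefault(t, None)
        node_labels[edge_node] = label
        new_edges.append((s, edge_node))
        new_edges.append((edge_node, t))
    return node_labels, new_edges


# Toy data, for illustration only.
labels, links = edges_to_nodes([
    ("Yahoo!", "acquired", "Tumblr"),
    ("Tumblr", "is_a", "website"),
])
```

A node-label-preserving matcher run on the transformed graph then compares the edge labels of the original graph.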


Ranking results


S A1 A2

User Query

Google Yahoo! CBS

Combination of two factors
1. Structural: similarity of two nodes in terms of neighbor relationships
2. Distance-based: the PageRank already computed


Experimental Setup

Dataset
• Freebase: 76M nodes, 314M edges (entire!)
• Freebase Internet Domain: 2M nodes, 6M edges
• Synthetic datasets
• Testset: 100 queries manually mapped from AOL query logs
• Baseline: NeMa [Khan13]: approximate answers on graphs

Measures
• Algorithms total time
• User study asking to evaluate the usefulness of our approach


Scalability results (10M nodes)

Time
• RelevantNeighborhood is stable on the number of answers
• <150ms to get the answers


Usefulness

Quality
• 92% of people say that Exemplar Queries are useful
• 62% already had the need for such a service

Comparison
Which method is preferred?
• 64% Exemplar Queries
• 30% Other approaches


Simulation vs Isomorphism

Analysis
• Simulation finds more answers (up to 48%) but aggregates results
• Isomorphism runs faster than simulation (fewer operations on simple queries)


Qualitative Evaluation


Query: Google – YouTube – Menlo Park

Approximate Graph Query Answering [Khan13]

Edge Isomorphism

Simulation

Answers are collapsed

More interesting answers


Size increment for Simulation

25% to 46% more edges than isomorphism: Answers are collapsed


Dealing with too many results

“One of the effects of living with electric information is that we live habitually in a state of information overload. There's always more than you can cope with.”

Marshall McLuhan


Result Refinement


Information overload

I want to know about IT company acquisitions


Too many results to visualize


Dealing with Information Overload

• Faceted Search
  • present aspects of the results [Roy08]
• Query reformulation
  • Modify some of the query conditions
  • In structured databases [Mishra09]
  • In web search [Dang10]

First Study of Problem on GRAPHS


Graph Search


Graph Query Reformulation

Results

Query

Reformulations: query supergraphs

Exponential number of reformulations


Challenges

• The number of reformulations is exponential
• Quantify the interestingness of a reformulation
• Finding query reformulations is NP-complete


A Naïve Approach: k-most frequent super-graphs

Query

480 matches

450 matches

100 matches

Supergraphs

30 matches

420 matches

Until k reformulations are found:
- Retrieve the most frequent super-pattern

Frequent ≠ Interesting!


Our Approach

Graph Query Reformulation with Diversity
• Finds k meaningful reformulations efficiently

D. Mottin, F. Bonchi, F. Gullo. Graph Query Reformulation with Diversity, KDD 2015.


Finding meaningful Reformulations

Results

Query

Find k meaningful reformulations:
1. Coverage: span all the results
2. Diversity: present different aspects of the results


Diversity Matters

Results

Query

Objective function f(Q)

λ = 1
• Non optimal: f({Q1’,Q2’}) = 7
• Optimal: f({Q3’,Q4’}) = 8
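To make the role of λ concrete, here is a small sketch of an objective that adds coverage and λ-weighted diversity. The exact form of f is not spelled out on the slide, so this is an assumption: coverage is taken as the sum of the result-set sizes, and diversity as the size of the symmetric difference between each pair of result sets. The result sets are hypothetical and do not reproduce the values 7 and 8 above.

```python
from itertools import combinations


def objective(result_sets, lam=1.0):
    """Assumed form: f = sum of coverage + lam * sum of pairwise diversity."""
    coverage = sum(len(r) for r in result_sets)
    diversity = sum(len(a ^ b) for a, b in combinations(result_sets, 2))
    return coverage + lam * diversity


# Hypothetical result sets of two pairs of reformulations.
overlapping = [{1, 2, 3}, {2, 3, 4}]
disjoint = [{1, 2, 3}, {4, 5, 6}]

f_overlapping = objective(overlapping, lam=1.0)
f_disjoint = objective(disjoint, lam=1.0)
```

Both pairs cover six results in total, but the disjoint pair scores higher because its reformulations show different parts of the result set.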


Problem

Graph Query Reformulation with Diversity

Theorem (NP-hardness)
The MAX-SUM Diversification Problem reduces to this problem, so it is NP-hard

[KDD 2015]


Solution: Greedy Algorithm

Greedy

While k reformulations are not found
1. Find the reformulation leading to the maximum increment of the objective function (marginal gain)
2. Add the reformulation to the results

Theorem
The algorithm is a ½-approximation

Finding the maximum gain is #P-complete [Valiant79]

Solution
Fast_MMPG: Branch and bound algorithm with quality guarantees
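The greedy loop can be sketched as below. The sketch assumes the candidate reformulations and their result sets are already enumerated, which sidesteps the hard part that Fast_MMPG addresses. The objective is the same assumed coverage-plus-diversity form (result-set sizes plus λ times pairwise symmetric differences), and all names and data are hypothetical.

```python
from itertools import combinations


def objective(result_sets, lam=1.0):
    """Assumed objective: coverage plus lam times pairwise diversity."""
    coverage = sum(len(r) for r in result_sets)
    diversity = sum(len(a ^ b) for a, b in combinations(result_sets, 2))
    return coverage + lam * diversity


def greedy_reformulations(candidates, k, lam=1.0):
    """Pick k reformulations, each time taking the largest marginal gain."""
    chosen = []
    remaining = dict(candidates)
    while len(chosen) < k and remaining:
        picked_sets = [candidates[name] for name in chosen]
        current = objective(picked_sets, lam)
        # Step 1: reformulation with the maximum marginal gain.
        best = max(
            remaining,
            key=lambda name: objective(picked_sets + [remaining[name]], lam)
            - current,
        )
        # Step 2: add it to the results.
        chosen.append(best)
        del remaining[best]
    return chosen


# Hypothetical reformulations mapped to the ids of the results they match.
CANDIDATES = {
    "Q1": {1, 2, 3, 4},
    "Q2": {1, 2, 3},
    "Q3": {5, 6},
}
picked = greedy_reformulations(CANDIDATES, k=2)
```

With these toy sets the greedy choice is Q1 followed by Q3: Q2 matches more results than Q3, but it overlaps almost entirely with Q1, so its marginal gain is smaller.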


The multiplicity vector

Results

0 0 0 0 0
1 1 0 0 0
2 2 1 1 0
2 2 2 2 0
2 3 3 3 1

Output set of reformulations


Upper bound on the Marginal gain

Lemma
The marginal gain increases if the multiplicity of the considered item is where |Q| is the number of reformulations in the reformulated set constructed so far.

Upper bound : is the value of the objective function considering only results with multiplicity

Theorem


Upper bound

Results

0 0 0 0 0
1 2 1 1 1

Output set of reformulations

1 2 1 1 1


The Fast_MMPG Algorithm

Until the reformulation with the maximum upper bound and marginal gain is found:
1. Expand the reformulation with the max upper bound
2. Prune reformulations with marginal gain smaller than the upper bound so far

[Figure labels: upper bound, marginal gain]


Experimental Setup

• Datasets:
  • AIDS: 10k chemical compounds
  • Financial: 17k transaction workflows
  • Web: 13k interactions with a recommender system
• Baseline algorithms:
  • k-freq: returns top-k frequent supergraphs of a query
  • LIndex: informative patterns index
• Experiments:
  • Time and objective function value varying k, query size, λ
  • Anecdotal
  • Scalability


Time Comparison

Number of reformulations
1. k-freq runs only slightly faster
2. Time increases linearly in k
3. Fast_MMPG has real-time performance

Query size
1. Fast_MMPG comparable to k-freq
2. Time decreases with query size (fewer reformulations)

[Plots: time vs. number of reformulations (k); time vs. query size]


Objective function gain

Analysis
1. Lambda correctly moves the objective function towards diversity
2. k-freq only captures coverage


Qualitative evaluation

[Figure: molecular structures of the query and of the reformulations returned by k-freq and by Fast_MMPG]

Analysis
• k-freq finds reformulations of the same superquery
• Fast_MMPG returns reformulations with more diversified structures


Conclusions

Hardly Expressible Queries
• Exemplar Queries: user query is an example of the desired results
• Efficient algorithmic solution scaling on real knowledge graphs
• Study of 2 similarity measures for query answering

Information Overload
• Study of the problem in graph databases
• Principled objective function optimizing coverage and diversity
• Algorithmic solutions with quality guarantees


Other Studied Problems

“There are no right answers to wrong questions.”

Ursula K. Le Guin


Empty-Answer Problem

COMPANYDB

Company   Based       Revenue   Mobile   Search   Hardware   Cloud
Apple     Cupertino   $62B      0        0        0          1
Google    M.View      $80B      0        1        1          0
HP        Palo Alto   $30B      0        0        1          0
Yahoo!    Sunnyvale   $16B      0        1        0          0

query = Mobile, Search, Hardware

{}

No answer
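The empty answer can be reproduced with a few lines over the COMPANYDB rows. The dictionary encoding and the function name `answer` are illustrative assumptions; the 0/1 values are copied from the table.

```python
# Boolean attributes copied from the COMPANYDB table.
COMPANY_DB = [
    {"company": "Apple", "mobile": 0, "search": 0, "hardware": 0, "cloud": 1},
    {"company": "Google", "mobile": 0, "search": 1, "hardware": 1, "cloud": 0},
    {"company": "HP", "mobile": 0, "search": 0, "hardware": 1, "cloud": 0},
    {"company": "Yahoo!", "mobile": 0, "search": 1, "hardware": 0, "cloud": 0},
]


def answer(db, required):
    """Return the companies that satisfy every required attribute."""
    return [row["company"] for row in db
            if all(row[attr] == 1 for attr in required)]


strict = answer(COMPANY_DB, ["mobile", "search", "hardware"])
relaxed = answer(COMPANY_DB, ["search", "hardware"])
```

No row has Mobile, Search and Hardware all set, so `strict` is empty. Dropping the Mobile condition, as a relaxation would, leaves Google as the only answer.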


Dealing with the Empty Answer Problem

• Ranking results based on user preferences
  • IR [Baeza11] and database solutions [Chaudhuri04]
• Query relaxation
  • Modify some of the query conditions [Mishra09]
  • (-) Suggests all the modifications together
  • (-) Does not take user feedback into account


Our Solution: Interactive Query Relaxation

• Suggests one relaxation at a time
• Takes user feedback into account
• Models user preferences
• Optimization centric relaxation suggestions
  • User-centric (effort, relevance)
  • System-centric (profit)

[PVLDB 2013, SIGMOD 2014 (Demo)]


Conclusions

We propose

• Exemplar Query Framework on Information Graphs: user query is an example of the desired results

We study

• Exemplar Query Answering: efficiently answering and ranking of exemplar queries

• Graph Query Reformulation: provide insights of the exemplar query answers

We show

• Solutions scaling on real size information graphs

• Principled approaches with quality guarantee

• Practical applicability of the problem


Future Directions

Query reformulation in connected-graphs
• Current: set of small graphs (simulated in big graphs)

Include User preferences
• In exemplar queries
• In graph query reformulation

Multiple exemplar queries
• Current: single exemplar queries
• With multiple exemplar queries semantics changes


Questions?

Thank you!


Publications

Hardly Expressible Queries
• D. Mottin, M. Lissandrini, Y. Velegrakis, T. Palpanas. Exemplar queries: Give me an example of what you need. PVLDB, 7(5), 2014.
• D. Mottin, M. Lissandrini, Y. Velegrakis, T. Palpanas. Searching with XQ: the eXemplar Query Search Engine. SIGMOD, 2014.
• M. Lissandrini, D. Mottin, D. Papadimitriou, T. Palpanas, Y. Velegrakis. Unleashing the power of information graphs. SIGMOD Record, 43(4), 2014.
• D. Mottin, M. Lissandrini, Y. Velegrakis, T. Palpanas. Exemplar queries: A new Way of Searching. (under submission)

Information Overload
• D. Mottin, F. Bonchi, F. Gullo. Graph Query Reformulation with Diversity. KDD, 2015.

Empty-Answer
• D. Mottin, A. Marascu, S. B. Roy, G. Das, T. Palpanas, Y. Velegrakis. A probabilistic optimization framework for the empty-answer problem. PVLDB, 6(14), 2013.
• D. Mottin, A. Marascu, S. B. Roy, G. Das, T. Palpanas, Y. Velegrakis. IQR: An interactive query relaxation system for the empty-answer problem. SIGMOD, 2014.
• D. Mottin, A. Marascu, S. B. Roy, G. Das, T. Palpanas, Y. Velegrakis. A holistic and principled approach for the empty-answer problem. (under submission)


Bibliography

[Mishra09] C. Mishra and N. Koudas. Interactive query refinement. In EDBT, 2009.

[Roy08] S. Basu Roy, H. Wang, G. Das, U. Nambiar, and M. Mohania. Minimum-effort driven dynamic faceted search in structured databases. In CIKM, 2008.

[Chaudhuri04] S. Chaudhuri, G. Das, V. Hristidis, and G. Weikum. Probabilistic ranking of database query results. In VLDB, 2004.

[Baeza11] R. A. Baeza-Yates and B. A. Ribeiro-Neto. Modern Information Retrieval. 2011.

[Haveliwala02] T. H. Haveliwala. Topic-sensitive pagerank. In WWW, 2002.

[Cook71] S. A. Cook. The complexity of theorem-proving procedures. In Symposium on Theory of Computing, 1971.

[Ma14] S. Ma, Y. Cao, W. Fan, J. Huai, and T. Wo. Strong simulation: Capturing topology in graph pattern matching. TODS, 2014.


Bibliography

[Valiant79] L. G. Valiant. The complexity of computing the permanent. Theoretical Computer Science, 1979.

[Dang10] V. Dang and W. B. Croft. Query reformulation using anchor text. In WSDM, 2010.

[Bordino13] I. Bordino, G. De Francisci Morales, I. Weber, and F. Bonchi. From machu picchu to rafting the urubamba river: anticipating information needs via the entity-query graph. In WSDM, 2013.

[Boldi11] P. Boldi, F. Bonchi, C. Castillo, and S. Vigna. Query reformulation mining: models, patterns, and applications. Information Retrieval, 2011.

[Khan13] A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. NeMa: Fast graph search with label similarity. PVLDB, 6(3), 2013.


Research Topics

Probabilistic databases
• Consider probabilistic knowledge bases to capture noise and uncertainty
• Propose solutions that cope with many world semantics
• Propose novel similarity measures for exemplar queries
• Define reformulations in a probabilistic fashion

Exemplar Query Answering Framework
• Study the problem of identifying exemplar queries need
• Propose solutions for keyword queries to graph samples
• Extend current solution with incomplete queries or multiple queries
• Include reformulation capabilities
• Study exemplar queries in other context (research papers, newspapers, …)


Back-up slides


RelevantNeighborhood