Querying Big Graphs: Theory and Practice Wenfei Fan School of Informatics University of Edinburgh 1

Querying Big Graphs: Theory and Practice - cse…iwgdm/2014/Slides/Wenfei.pdf · Querying Big Graphs: Theory and Practice ... Find all matches of a pattern in a graph ... • pattern

Jul 08, 2018

Page 1

Querying Big Graphs:

Theory and Practice

Wenfei Fan

School of Informatics

University of Edinburgh

Page 2

Social networks modeled as graphs

[Figure: example social graph with nodes labeled B, A1, …, Am, and W]

Node: person

Edge: relationship (e.g., report, supervise)

Social graphs: Facebook, Twitter, LinkedIn, …

Page 3

Find all matches of a pattern in a graph

Pattern matching in social graphs

Identify suspects in a drug ring

“Understanding the structure of drug trafficking organizations”

[Figure: a pattern graph (nodes B, A, S, W; edge labels 3, 3, 1) next to the data graph (nodes B, A1, …, Am, W)]

Page 4

Graph Pattern Matching

Input: a pattern graph Q and a data graph G

Output: all the matches of Q in G, i.e., all subgraphs of G that are isomorphic to Q

• isomorphism: a bijective function f on nodes such that (u, u’) ∈ Q iff (f(u), f(u’)) ∈ G

Applications

• pattern recognition

• intelligence analysis

• transportation network analysis

• Web site classification

• social position detection

• user targeted advertising

• knowledge base disambiguation …

Good for social network analysis?
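The definition above can be sketched as a brute-force search. This is only an illustration of the semantics, not an algorithm from the talk: it tries every injective assignment of pattern nodes to data nodes and keeps those in which each pattern edge lands on a data edge (so a match is the subgraph of G formed by the images of the pattern edges). The function name and the toy graphs are hypothetical.

```python
from itertools import permutations

def subgraph_isomorphisms(q_nodes, q_edges, g_nodes, g_edges):
    """Yield every injective mapping f from pattern nodes to data nodes
    such that each pattern edge (u, u2) lands on a data edge (f(u), f(u2))."""
    g_edge_set = set(g_edges)
    q_nodes = list(q_nodes)
    # permutations() enumerates all injective assignments: exponential cost.
    for image in permutations(g_nodes, len(q_nodes)):
        f = dict(zip(q_nodes, image))
        if all((f[u], f[u2]) in g_edge_set for u, u2 in q_edges):
            yield f

# Hypothetical toy data: a two-edge path pattern and a small directed graph.
pattern_nodes = ["x", "y", "z"]
pattern_edges = [("x", "y"), ("y", "z")]
graph_nodes = [1, 2, 3, 4]
graph_edges = [(1, 2), (2, 3), (3, 4), (1, 3)]

matches = list(subgraph_isomorphisms(pattern_nodes, pattern_edges,
                                     graph_nodes, graph_edges))
```

The enumeration visits on the order of |V|^|VQ| candidate mappings, which is why this approach does not scale beyond toy inputs.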

Page 5

New challenges

Real-life social graphs are typically large

Facebook: more than 1 billion nodes and over 140 billion links

Is it feasible on big graphs?

Graph pattern matching is costly

• NP-complete to decide whether there exists a match

• possibly exponentially many matches

How long do we have to wait?

Page 6

The good, the bad and the ugly

Traditional computational complexity theory of almost 50 years:

• The good: polynomial time computable (PTIME)

• The bad: NP-hard (intractable)

• The ugly: PSPACE-hard, EXPTIME-hard, undecidable…

Polynomial time queries become intractable on big data

What happens when it comes to big data?

Assuming an SSD that scans 6GB/s, a linear scan of a data set D would take

• 1.9 days when D is of 1PB (10^15 B)

• 5.28 years when D is of 1EB (10^18 B)

O(n) time is already beyond reach on big data in practice!
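The two figures follow directly from dividing the data size by the scan rate. A minimal sketch of that arithmetic, assuming decimal units (1PB = 10^15 bytes, 6GB/s = 6 × 10^9 bytes per second) and a 365-day year; the variable names are illustrative:

```python
SCAN_RATE_BYTES_PER_SEC = 6e9   # 6 GB/s, the SSD rate assumed on the slide

def scan_seconds(size_bytes, rate=SCAN_RATE_BYTES_PER_SEC):
    """Time for one sequential pass over size_bytes of data."""
    return size_bytes / rate

PB = 1e15
EB = 1e18
days_for_pb = scan_seconds(PB) / 86_400            # seconds per day
years_for_eb = scan_seconds(EB) / (86_400 * 365)   # seconds per year

print(f"1 PB: {days_for_pb:.1f} days")
print(f"1 EB: {years_for_eb:.2f} years")
```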

Page 7

Tractability revisited for big data

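BD-tractable: properly contained in P, unless P = NC

[Figure: diagram of complexity classes with regions labeled BD-tractable, not BD-tractable, PTIME, and NP and beyond; graph pattern matching is marked on the diagram]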

Page 8

Coping with the sheer size of real-life graphs

1. Revising graph pattern matching

1) Bounded simulation

2) Incorporating edge relationships

2. Making big graphs small

1) Distributed graph pattern matching

2) Query preserving graph compression

3) Graph pattern matching using views

4) Incremental graph pattern matching

3. Approximate query answering

1) Relaxing the semantics of queries

2) Resource-bounded query answering

Joint work with Xin Wang and Yinghui Wu

Page 9

Graph pattern matching for social network analysis

Page 10

Pattern matching in social graphs


Subgraph isomorphism may be too strict for social data analysis

• not allowed by bijection

• relation instead of function

[Figure: the pattern graph (nodes B, A, S, W; edge labels 3, 3, 1) next to the data graph (nodes B, A1, …, Am, W)]

Page 11

Graph simulation

Input: a pattern graph Q and a data graph G

Output: a binary relation S on the nodes of Q and G

Does this suffice?

• each node u in Q is mapped to a node v in G, such that (u, v) ∈ S

• for each (u, v) ∈ S, each edge (u, u’) in Q is mapped to an edge (v, v’) in G, such that (u’, v’) ∈ S
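The two conditions above can be computed as a greatest fixpoint: start from every candidate pair and repeatedly discard pairs that violate the edge condition. The sketch below is a naive version for illustration only; it is not the quadratic-time algorithm behind the bound quoted later in the deck. Node labels with label equality are an assumption taken from the later complexity slide, and all names are hypothetical.

```python
def graph_simulation(q_nodes, q_edges, q_label, g_nodes, g_edges, g_label):
    """Return the maximum simulation S as a dict u -> set of data nodes,
    or an empty dict if some pattern node has no match."""
    succ = {v: set() for v in g_nodes}
    for v, w in g_edges:
        succ[v].add(w)
    # Start from all label-compatible pairs, then prune to a fixpoint.
    sim = {u: {v for v in g_nodes if g_label[v] == q_label[u]} for u in q_nodes}
    changed = True
    while changed:
        changed = False
        for u, u2 in q_edges:
            # v stays a match of u only if some successor of v matches u2.
            keep = {v for v in sim[u] if succ[v] & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True
    if any(not sim[u] for u in q_nodes):
        return {}
    return sim

# Hypothetical toy data: pattern b -> w; node 4 is labeled B but has no W child.
result = graph_simulation(
    ["b", "w"], [("b", "w")], {"b": "B", "w": "W"},
    [1, 2, 3, 4], [(1, 2), (1, 3)], {1: "B", 2: "W", 3: "W", 4: "B"},
)
```

Note that pattern node w is related to two data nodes at once, which a function-based (isomorphism) match would not allow.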

Page 12

Pattern matching in social graphs

The quest for a revision of graph simulation

• edges to paths

[Figure: the pattern graph (nodes B, A, S, W; edge labels 3, 3, 1) next to the data graph (nodes B, A1, …, Am, W)]

Page 13

Social Graphs

Directed graph G = (V, E, fA)

attributes fA(u): a tuple (A1 = a1, ..., An = an)

Social graphs: attributes for data content (label, keywords, blogs, comments, rating …)

[Figure: example graph with nodes labeled DB, AI, Gen, Eco, Med, Soc, and Chem; sample attribute tuples: (‘dept’=CS, ‘field’=AI), (‘dept’=CS, ‘field’=DB), (‘dept’=Bio, ‘field’=Gen), (‘dept’=Bio, ‘field’=Eco)]

Page 14

Pattern Graphs

Pattern graph: Q = (VQ, EQ, fv, fe)

Incorporating search conditions and bounds on the number of hops

• fv(u): a search condition, i.e., a conjunction of A op a, op in <, ≤, =, ≠, >, ≥; for example, fv(): ‘dept’=CS

• fe(u,u’): a bound, either a constant k (bounded: within k hops) or a symbol ∗ (unbounded)

[Figure: pattern graph with nodes CS, Bio, Soc, and Med and edge bounds ∗, 3, ∗, 2, 2, 3]

Page 15

Bounded Simulation

G = (V, E, fA) matches Q = (VQ, EQ, fv, fe) via bounded simulation, if there exists a binary relation S ⊆ VQ × V such that S

• is a total mapping: for each u ∈ VQ, there exists v ∈ V such that (u,v) ∈ S

• satisfies search conditions and bounds on edge-to-path mappings: for each (u,v) ∈ S,

o attributes fA(v) satisfy predicate fv(u)

o each (u,u’) in EQ is mapped to a path in G from v to v’ of length bounded by fe(u,u’), with (u’,v’) ∈ S

Q(G): a unique maximum match relation S, empty if G does not match Q

[Figure: pattern graph with nodes CS, Bio, Soc, and Med and edge bounds ∗, 3, ∗, 2, 2, 3, related by S to a data graph with nodes DB, AI, Gen, Eco, Med, Med, Soc, and Chem]
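The definition can be sketched as a fixpoint computation over hop distances. This is a naive illustration rather than the cubic-time algorithm from the talk. It assumes that pattern edges map to nonempty paths, that a bound k means "at most k hops", and that `"*"` stands for the unbounded symbol ∗; the function and variable names are hypothetical.

```python
from collections import deque

def hop_distances(succ, source):
    """Length of the shortest nonempty path from source to each reachable node."""
    dist = {}
    queue = deque((w, 1) for w in succ[source])
    while queue:
        node, d = queue.popleft()
        if node in dist:
            continue
        dist[node] = d
        queue.extend((w, d + 1) for w in succ[node])
    return dist

def bounded_simulation(q_nodes, q_edges, predicate, g_nodes, g_edges, attrs):
    """q_edges maps (u, u2) -> k (an int bound) or "*" (unbounded).
    predicate[u] is a function on an attribute dict; attrs[v] is that dict."""
    succ = {v: set() for v in g_nodes}
    for v, w in g_edges:
        succ[v].add(w)
    dist = {v: hop_distances(succ, v) for v in g_nodes}
    # Candidates: data nodes whose attributes satisfy the search condition.
    sim = {u: {v for v in g_nodes if predicate[u](attrs[v])} for u in q_nodes}

    def within(v, v2, bound):
        d = dist[v].get(v2)
        return d is not None and (bound == "*" or d <= bound)

    changed = True
    while changed:
        changed = False
        for (u, u2), bound in q_edges.items():
            keep = {v for v in sim[u]
                    if any(within(v, v2, bound) for v2 in sim[u2])}
            if keep != sim[u]:
                sim[u] = keep
                changed = True
    return sim if all(sim[u] for u in q_nodes) else {}

# Hypothetical toy data: a CS node must reach a Bio node within 2 hops.
pattern_nodes = ["cs", "bio"]
predicate = {"cs": lambda a: a["dept"] == "CS",
             "bio": lambda a: a["dept"] == "Bio"}
graph_nodes = [1, 2, 3, 4]
attrs = {1: {"dept": "CS"}, 2: {"dept": "Soc"},
         3: {"dept": "Bio"}, 4: {"dept": "CS"}}
graph_edges = [(1, 2), (2, 3)]

result = bounded_simulation(pattern_nodes, {("cs", "bio"): 2},
                            predicate, graph_nodes, graph_edges, attrs)
tight = bounded_simulation(pattern_nodes, {("cs", "bio"): 1},
                           predicate, graph_nodes, graph_edges, attrs)
```

With bound 2 the CS node 1 matches through the path 1 → 2 → 3; tightening the bound to 1 leaves no match, so the result is empty.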

Page 16

Bounded simulation in social graphs

The set of all suspects involved in a drug ring

• edges to paths

• relation instead of function

[Figure: the pattern graph (nodes B, A, S, W; edge labels 3, 3, 1) next to the data graph (nodes B, A1, …, Am, W)]

Page 17

Complexity

Input: Pattern Q and data graph G

Output: Q(G)

Bounded simulation: O(|V||E| + |EQ||V|² + |VQ||V|), cubic time

Graph simulation: O((|V| + |VQ|)(|E| + |EQ|)), a special case of bounded simulation

o The same bound 1 on all pattern edges (edge-to-edge mapping)

o Unique attributes vs. search conditions: label equality

The two bounds are comparable: Q is small in practice

Subgraph isomorphism: intractable

Capture more sensible matches in social graphs (by 80%)

Page 18

Homeomorphism and monomorphism

Graph homeomorphism: G = (V, E) matches Q = (VQ, EQ)

• an injective function from VQ to V (function rather than relation)

• edges to pairwise node-disjoint simple paths in G (constraints on paths)

Monomorphism revised: G = (V, E) matches Q = (VQ, EQ)

• an injective function from VQ to V

• edges to nonempty paths in G

Intractable, even when Q is a tree and G is a DAG

Strike a balance between expressive power and complexity

Page 19

Incorporating edge relationships

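Incorporating edge “colors”

S: supervise

C: co-author

[Figure: data graph with nodes (Ann, CS), (Pat, DB), (John, DB), (Mat, DB), (Bill, Bio), (Don, Gen), and (Tom, Bio), with edges colored S or C; pattern with nodes DB, CS, Bio, and Bio and edges labeled C, C, and S+]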

Page 20

Regular patterns

Pattern: Q = (VQ, EQ, fv, fe)

fv(u): a conjunction of A op a, op in <, ≤, =, ≠, >, ≥

fe(u,u’): a regular expression of the form F ::= c | c^k | c+ | FF

Mapping edges to paths satisfying associated regular expressions

Simple regular expressions:

• fairly common

• optimizing patterns (checking containment in linear-time)

[Figure: pattern with nodes DB, CS, Bio, and Bio and edges labeled C, C, and S+]
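To make the edge-to-path mapping concrete, the sketch below checks whether some path between two data nodes spells a word of such an expression. It is an illustration under stated assumptions, not the matching algorithm from the talk: an expression is a list of `(color, bound)` atoms read left to right, an integer bound k is taken to mean "between 1 and k edges of that color" (consistent with the hop bounds of bounded simulation), and `None` stands for c+. The toy edges are hypothetical.

```python
def step(nodes, color, colored_succ):
    """Nodes reachable from `nodes` by exactly one edge of the given color."""
    return {w for v in nodes for w in colored_succ.get((v, color), ())}

def reach_by_atom(nodes, color, bound, colored_succ):
    """Nodes reachable via 1..bound edges of one color (bound=None means c+)."""
    reached, frontier, hops = set(), set(nodes), 0
    while frontier and (bound is None or hops < bound):
        frontier = step(frontier, color, colored_succ) - reached
        reached |= frontier
        hops += 1
    return reached

def matches_path(v, v2, expr, edges):
    """True if some path from v to v2 spells a word in the language of expr.
    expr is a list of (color, bound) atoms concatenated left to right."""
    colored_succ = {}
    for a, b, color in edges:
        colored_succ.setdefault((a, color), set()).add(b)
    current = {v}
    for color, bound in expr:
        current = reach_by_atom(current, color, bound, colored_succ)
    return v2 in current

# Hypothetical colored edges: S = supervise, C = co-author.
edges = [("a", "b", "S"), ("b", "c", "S"), ("c", "d", "C")]

s_plus_then_c = matches_path("a", "d", [("S", None), ("C", 1)], edges)
one_s_then_c = matches_path("a", "d", [("S", 1), ("C", 1)], edges)
```

Here the path a → b → c → d satisfies S+ C, but no path satisfies a single S followed by a single C.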

Page 21

Complexity

Input: Pattern Q and data graph G

Output: Q(G)

O(|V||E| + m|EQ||V|² + |VQ||V|), where m is the number of distinct colors in Q

bounded simulation: a special case

o single color c (hence m = 1)

o fe(u,u’) = c^k | c+

Adding edge colors does not incur extra complexity

Page 22

Various notions for graph pattern matching

matching               complexity       |Q(G)|
subgraph isomorphism   NP-complete      |V|^|VQ|
graph simulation       quadratic time   |V| |VQ|
bounded simulation     cubic time       |V| |VQ|
regular matching       cubic time       |V| |VQ|

Which one to use for social network analysis?

Page 23

Making big graphs small

Page 24

How to make a query tractable on big data?

Can we effectively query big graphs?

Querying big graphs:

• Input: Query Q, and a big graph G,

• Output: Q(G), the set of matches of Q in G

Making big graphs small

Make the cost of query processing “independent” of |G|!

The cost of query processing: a function of |G| and |Q|, e.g., O(n²) or O(n³) time

O(|G|) time is already beyond reach in practice!

A number of techniques:

1. Distributed query processing

2. Query preserving data compression

3. Query answering using views

4. Bounded incremental evaluation

5. …

Page 25

Distributed query processing

The cost of evaluation algorithm: f(|G|, |Q|)

It is unlikely that we can lower its complexity, but can we reduce the size of its parameter |G|?

Divide and conquer

• partition G into fragments (G1, …, Gn) of manageable sizes, distributed to various sites

• upon receiving a query Q,

o evaluate Q( Gi ) in parallel, i.e., evaluate Q on smaller Gi

o collect partial answers at a coordinator site, and assemble them to find the answer Q( G ) in the entire G

Performance guarantees for evaluating graph pattern queries

Network traffic and response time: Independent of |G|

Page 26

Partial evaluation and distributed query answering

Partial evaluation: a promising approach

compute f( x ), with the input x split into s (the part of known input) and d (the yet unavailable input), i.e., f( s, d )

• conduct the part of computation that depends only on s

• generate a partial answer: a residual function

Partial evaluation in distributed query processing

• at each site, Gi as the known input

• Gj as the yet unavailable input

• evaluate Q( Gi ) in parallel

• collect partial matches (functions) at a coordinator site, and assemble them to find the answer Q( G ) in the entire G
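As a rough illustration of this idea, the sketch below answers a reachability query over a partitioned graph. It is a simplified stand-in, not the algorithm from the referenced work: each site explores only its own fragment and reports, for every entry node, whether the target was reached locally and which nodes of other fragments were reached (the residual part); the coordinator then chains these partial answers. All function names and the toy partition are hypothetical, and the per-site loop runs sequentially here although it is conceptually parallel.

```python
def local_partial_answer(fragment_nodes, edges, entries, target):
    """Runs at one site. For every entry node, report whether the target is
    reached locally and which nodes outside the fragment are reached."""
    succ = {}
    for a, b in edges:               # edges whose source is stored at this site
        succ.setdefault(a, set()).add(b)
    answer = {}
    for entry in entries:
        seen, stack = {entry}, [entry]
        hits_target, exits = entry == target, set()
        while stack:
            node = stack.pop()
            for nxt in succ.get(node, ()):
                if nxt == target:
                    hits_target = True
                if nxt not in fragment_nodes:
                    exits.add(nxt)   # belongs to another fragment: stop here
                elif nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        answer[entry] = (hits_target, exits)
    return answer

def assemble(source, partial_answers):
    """Runs at the coordinator: follow the dependencies between entry nodes."""
    merged = {}
    for answer in partial_answers:
        merged.update(answer)
    seen, stack = {source}, [source]
    while stack:
        hits_target, exits = merged[stack.pop()]
        if hits_target:
            return True
        for nxt in exits - seen:
            seen.add(nxt)
            stack.append(nxt)
    return False

def distributed_reachability(fragments, edges, source, target):
    """fragments is a list of node sets; each edge is stored with its source."""
    owner = {v: i for i, nodes in enumerate(fragments) for v in nodes}
    partial_answers = []
    for i, nodes in enumerate(fragments):          # conceptually in parallel
        local_edges = [(a, b) for a, b in edges if owner[a] == i]
        in_nodes = {b for a, b in edges if owner[b] == i and owner[a] != i}
        entries = in_nodes | ({source} & nodes)
        partial_answers.append(
            local_partial_answer(nodes, local_edges, entries, target))
    return assemble(source, partial_answers)

# Hypothetical partition of a 5-node chain into three fragments.
fragments = [{1, 2}, {3, 4}, {5}]
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
forward = distributed_reachability(fragments, edges, 1, 5)
backward = distributed_reachability(fragments, edges, 5, 1)
```

In this sketch the data sent to the coordinator depends on the number of entry nodes and cross-fragment edges rather than on the size of each fragment.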

Page 27

Query preserving data compression

The cost of query processing: f(|G|, |Q|); reduce the parameter?

Compress big G into a smaller Gc

Query preserving compression <R, P> for a class L of queries

• For any data collection G, Gc = R(G)

• For any Q in L, Q( G ) = P(Q, Gc)

[Figure: diagram relating G, Gc = R(G), Q( G ), and Q( Gc ) via R, Q, and P]

Page 28

What is new about query preserving compression?

18 times faster on average for reachability queries

In contrast to lossless compression, no need to

restore the original graph G

Relative to a class L of queries of users’ choice

Better compression ratio: only information about L queries

Query preserving compression <R, P> for a class L of queries

For any dataset G, Gc = R(G)

For any Q in L, Q( G ) = P(Q, Gc)

For any Q in L, Q(Gc) can be directly computed

Any algorithms and indexing structures for G can be used for Gc

no need to decompress Gc

Gc is computed once for all queries Q in L

Incrementally maintained
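A minimal sketch of the <R, P> idea for reachability queries, assuming a simple compression that merges mutually reachable nodes (strongly connected components). This is a deliberately simpler stand-in for the compression scheme in the referenced work, and the preprocessing below is quadratic for brevity; the function names are hypothetical.

```python
def reachable_from(succ, start):
    """All nodes reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in succ.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def compress(nodes, edges):
    """R: merge mutually reachable nodes (strongly connected components)."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    reach = {v: reachable_from(succ, v) for v in nodes}
    rep = {}
    for v in nodes:
        # Representative = smallest node of v's strongly connected component.
        rep[v] = min(w for w in reach[v] if v in reach[w])
    gc_edges = {(rep[a], rep[b]) for a, b in edges if rep[a] != rep[b]}
    return rep, gc_edges

def answer_on_compressed(source, target, rep, gc_edges):
    """P: rewrite reach(source, target) on G into a query on Gc."""
    succ = {}
    for a, b in gc_edges:
        succ.setdefault(a, set()).add(b)
    return rep[target] in reachable_from(succ, rep[source])

# Hypothetical graph: the cycle 1 -> 2 -> 3 -> 1 collapses to a single node.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]
rep, gc_edges = compress(nodes, edges)
```

Gc is computed once; every later reachability query is answered on Gc without restoring G.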

Page 29

Answering queries using views

The cost of query processing: f(|G|, |Q|)

can we compute Q(G) without accessing G, i.e., independent of |G|?

Query answering using views: given a query Q in a language L and a set V of views, find another query Q’ such that

• Q and Q’ are equivalent: for any G, Q(G) = Q’(G)

• Q’ only accesses V(G )

The complexity is no longer a function of |G|

Answering graph pattern queries on big social graphs:

• Regardless of how big G is, the cost is “independent” of G

• V(G ) is often much smaller than G (4% to 12% on real-life data)

• Improvement: 31 times faster for graph pattern matching

Page 30

Incremental query answering

Minimizing unnecessary recomputation

Real-life data is dynamic: it constantly changes (∆G), e.g., 5%/week in Web graphs

Re-compute Q(G⊕∆G) starting from scratch?

Changes ∆G are typically small

Compute Q(G) once, and then incrementally maintain it

Incremental query processing:

Input: Q, G, Q(G), ∆G (Q(G): old output; ∆G: changes to the input)

Output: ∆M such that Q(G⊕∆G) = Q(G) ⊕ ∆M (∆M: changes to the output; Q(G⊕∆G): new output)

When changes ∆G to the data G are small, typically so are the changes ∆M to the output Q(G⊕∆G)

Page 31

Complexity of incremental problems

Complexity analysis in terms of the size of changes

Incremental query answering

Input: Q, G, Q(G), ∆G

Output: ∆M such that Q(G⊕∆G) = Q(G) ⊕ ∆M

The cost of query processing: a function of |G| and |Q|

incremental algorithms: |CHANGED|, the size of changes in

• the input: ∆G, and

• the output: ∆M

|CHANGED| is the updating cost that is inherent to the incremental problem itself: the amount of work absolutely necessary to perform for any incremental algorithm

Bounded: the cost is expressible as f(|CHANGED|, |Q|)?

Optimal: in O(|CHANGED| + |Q|)?

graph simulation: bounded
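The input/output contract of incremental evaluation can be illustrated on a much simpler query than graph simulation: the set of nodes reachable from a fixed source, maintained under edge insertions only. The class below is a hypothetical sketch, not an algorithm from the talk; each update returns ∆M and explores only newly reachable nodes, so its work depends on the size of the change rather than on |G|. Deletions are not handled.

```python
class ReachableSet:
    """Maintains Q(G) = nodes reachable from a fixed source under edge insertions."""

    def __init__(self, source, edges=()):
        self.succ = {}
        self.reached = {source}
        for a, b in edges:
            self.insert_edge(a, b)

    def insert_edge(self, a, b):
        """Apply the change +(a, b) to G and return the newly reachable nodes."""
        self.succ.setdefault(a, set()).add(b)
        if a not in self.reached or b in self.reached:
            return set()                 # the output is unchanged
        delta, stack = {b}, [b]
        while stack:
            # Explore only nodes that were not reachable before this update.
            for nxt in self.succ.get(stack.pop(), ()):
                if nxt not in self.reached and nxt not in delta:
                    delta.add(nxt)
                    stack.append(nxt)
        self.reached |= delta
        return delta

# Hypothetical updates: the edge 3 -> 4 only matters once 3 becomes reachable.
r = ReachableSet(1, [(1, 2), (3, 4)])
before = set(r.reached)
delta_1 = r.insert_edge(2, 3)
delta_2 = r.insert_edge(4, 1)
```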

Page 32

Graph pattern matching on big graphs

Partial evaluation for distributed query processing?

Query preserving compression: convert big data to small data

Query answering using views: make big data small

Bounded incremental query answering: depending on the size of

the changes rather than the size of the original big data

. . .

Combinations of these can do better than MapReduce!

Make big data small

Yes, MapReduce is useful, but it is not the only way!

Page 33

Approximate query answering

Page 34

Relaxing the semantics of query answering

Subgraph isomorphism: NP-complete; exponentially many matches

Graph simulation or bounded simulation: quadratic/cubic time; |Q(G)| bounded by |VQ||V|

• Effectiveness: capture more sensible matches in social graphs

• Efficiency: from intractable to low polynomial time

Use “cheaper” queries whenever possible

Page 35

Top-k query answering

Traditional query answering: compute Q(G)

• It is expensive to compute when G is large

• The result Q(G) is excessively large for the users to inspect (larger than G)

Top-k query answering:

Input: Query Q, dataset G and a positive integer k.

Output: A top-ranked set of k elements in Q(G)

Early termination: return top-k matches without computing Q(G)

Improvement: 1.8 times as fast, graph pattern matching

Page 36

The approximation theory revisited

Traditional approximation algorithms A: for an NPO problem

• for each instance x, A(x) computes a feasible solution y

• quality metric f(x, y)

• performance ratio (minimization): for all x, OPT(x) ≤ f(x, y) ≤ ε · OPT(x), where OPT(x) is the optimal solution and ε ≥ 1

A revision of approximation for querying big data

Big graphs?

Approximation: for even low PTIME problems, not just NPO

Quality metric: the answer to a query is typically a set, not a number

Approach: it does not help much if A(x) conducts computation on “big” data x directly!

Page 37

Data-driven approximation

Resource-bounded query answering

Input: A class Q of queries, and a resource ratio α ∈ [0, 1)

Question: Develop an algorithm that, given any query Q in the class and graph G, computes Q(G) by accessing at most α|G| amount of data in the entire process, with best performance ratio

Data-driven approximation algorithm A(G)

• Dynamic reduction: given Q and G, find GQ such that |GQ| ≤ α|G|

• Compute Q(GQ) as approximate query answers to Q(G)

Performance ratio: F-measure of precision and recall

precision = |Q(GQ) ∩ Q(G)| / |Q(GQ)|

recall = |Q(GQ) ∩ Q(G)| / |Q(G)|
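The two ratios combine into a single score. A minimal sketch, assuming the F-measure is the usual harmonic mean 2 · precision · recall / (precision + recall) and that answers are represented as plain sets; the function name and the sample answer sets are hypothetical.

```python
def f_measure(approx, exact):
    """F-measure of approximate answers Q(GQ) against exact answers Q(G)."""
    approx, exact = set(approx), set(exact)
    common = len(approx & exact)
    if common == 0:
        return 0.0
    precision = common / len(approx)   # fraction of returned answers that are correct
    recall = common / len(exact)       # fraction of correct answers that are returned
    return 2 * precision * recall / (precision + recall)

# Hypothetical answer sets: 2 of 3 returned answers are correct, 2 of 4 are found.
score = f_measure({1, 2, 3}, {2, 3, 4, 5})
```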

Page 38

Personalized social search queries

Graph Search, Facebook

Localized patterns:

• Find me all my friends who live in Edinburgh and like cycling

• Find me restaurants in London my friends have been to

• Find me photos of my friends in New York

Non-localized reachability:

• Does Michael connect to Lady Gaga through social links?

We can do personalized social search with α = 0.0015%, with 100% accuracy!

1.5 × 10^-5 × 1PB (10^15 B) = 15 × 10^9 B = 15GB

We are making big graphs of PB size as small as 15GB!

We can make big graphs of PB size fit into our memory!

Page 39

Scale independence

Input: A class Q of queries

Question: Can we find, for any query Q in the class and any (possibly big) graph G, a fraction GQ of G such that

• |GQ| ≤ M, where M is independent of the size of G, and

• Q(G) = Q(GQ)?

Scalable with big graph G, when G grows!

Desirable, but hard

Characterizing scale independent queries

Page 40

Summing up

Page 41

Challenges and opportunities

Challenges: querying big graphs is hard!

Introduce new fundamental problems: a departure from our familiar terrain

Any technique alone may not work very well: MapReduce is not the only way, and may not be the best way

Nonetheless, we can do it!

Exact query answers: making big data small!

• combinations of effective techniques

Approximate query answering:

• relax the semantics of query answering

• data-driven approximation

Querying big graphs: A rich source of questions and vitality!

Page 42

References

Complexity theory for big data

W. Fan, F. Geerts, and F. Neven. Making Queries Tractable on Big Data with Preprocessing, VLDB 2013.

W. Fan, F. Geerts, and L. Libkin. On Scale Independence for Querying Big Data, PODS 2014.

Querying big social data:

W. Fan, X. Wang, and Y. Wu. Diversified Top-k Graph Pattern Matching, VLDB 2014.

W. Fan, X. Wang, and Y. Wu. Answering Graph Pattern Queries using Views, ICDE 2014.

S. Ma, Y. Cao, W. Fan, J. Huai, and T. Wo. Strong Simulation: Capturing Topology in Graph Pattern Matching, TODS 39(1), 2014.

Page 43

References

Querying big social data:

W. Fan, X. Wang, and Y. Wu. Incremental Graph Pattern Matching, TODS 38(3), 2013.

W. Fan. Graph Pattern Matching Revised for Social Network Analysis, ICDT 2012.

W. Fan, J. Li, X. Wang, and Y. Wu. Query Preserving Graph Compression, SIGMOD 2012.

W. Fan, X. Wang, and Y. Wu. Performance Guarantees for Distributed Reachability Queries, VLDB 2012.

W. Fan, J. Li, S. Ma, N. Tang, and Y. Wu. Adding Regular Expressions to Graph Reachability and Pattern Queries, ICDE 2011.

W. Fan, J. Li, S. Ma, H. Wang, and Y. Wu. Graph Homomorphism Revisited for Graph Matching, VLDB 2010.

W. Fan, J. Li, S. Ma, N. Tang, and Y. Wu. Graph Pattern Matching: From Intractable to Polynomial Time, VLDB 2010.