Testing Forest-Isomorphism in the Adjacency List Model. Mitsuru Kusumoto†, Yuichi Yoshida†*. † : Preferred Infrastructure, Inc. * : National Institute of Informatics.
Page 1: Testing Forest-Isomorphism in the Adjacency List Model

Testing Forest-Isomorphism in the Adjacency List Model

Mitsuru Kusumoto†, Yuichi Yoshida†*

† : Preferred Infrastructure, Inc.

* : National Institute of Informatics.

Page 2: Testing Forest-Isomorphism in the Adjacency List Model

Overview

Given two forests G and H, determine whether G ≅ H or G and H are far from being isomorphic, by looking at only very small parts of G and H.

Outline

Introduction

Property testing

Problem setting

Our algorithms

≅ ?

Page 3: Testing Forest-Isomorphism in the Adjacency List Model

Introduction

Page 4: Testing Forest-Isomorphism in the Adjacency List Model

Property Testing

We want to solve decision problems as efficiently as possible.

Example : Graph connectivity

Standard setting : BFS is enough. → Θ(n) time.

Property testing : Check if G is connected or G is far from being connected. → O(1) time!?
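To make the contrast with BFS concrete, here is a minimal sketch of one well-known idea for testing connectivity: sample a few vertices and look for a small connected component around each. This is an illustration, not the authors' algorithm. The names `component_smaller_than` and `looks_connected` and the parameters `samples` and `limit` are hypothetical, and the claim of a query count independent of n is only meaningful when degrees are bounded.

```python
import random
from collections import deque


def component_smaller_than(adj, start, limit):
    """Return True if the component containing `start` has fewer than `limit` vertices."""
    seen = {start}
    queue = deque([start])
    # Stop exploring as soon as `limit` vertices have been discovered.
    while queue and len(seen) < limit:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) < limit


def looks_connected(adj, samples, limit, rng):
    """Accept unless some sampled vertex lies in a component smaller than `limit`."""
    n = len(adj)
    limit = min(limit, n)  # a component with fewer than n vertices proves disconnection
    for _ in range(samples):
        v = rng.randrange(n)
        if component_smaller_than(adj, v, limit):
            return False
    return True
```

A connected graph is always accepted, because every bounded search reaches `limit` vertices. A graph with many small components is rejected as soon as one of them is sampled.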

[Figure: a connected graph and a graph that is not connected]

Page 5: Testing Forest-Isomorphism in the Adjacency List Model

Property Testing

A property testing algorithm is a (randomized) algorithm that, with high probability (e.g., ≥ 2/3), distinguishes inputs that satisfy a property P from inputs that are far from P, using sublinear query or time complexity.

Main Interest

What kinds of properties are testable efficiently?

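[Figure: connected graphs versus graphs that are not connected. We want to distinguish connected graphs from graphs that are far from being connected; graphs that are close to being connected lie in between.]

Page 6: Testing Forest-Isomorphism in the Adjacency List Model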

Graph Property Testing - Review

The efficiency of property testing algorithms depends on the input model.

Adjacency matrix model

G =
[01010]
[10110]
[01001]
[11001]
[00110]

• Input model for dense graphs. [GGR’98]

• Many properties are testable. (e.g., connectivity, △-freeness, ...)

• Necessary and sufficient conditions for constant-time testability are known. [Alon+’09]

Adjacency list model

Vertex v has neighbors A, B, C at list positions 1, 2, 3:

O(v, 1) = A

O(v, 2) = B

O(v, 3) = C

• Input model for sparse graphs. [GR’02] [KKR’04]

• Many properties are testable. (e.g., connectivity, H-minor-freeness.)

• But many results assume the bounded-degree condition: degrees of vertices must be bounded by some constant.

Page 7: Testing Forest-Isomorphism in the Adjacency List Model

Graph Property Testing - Review

Only a few efficient algorithms are known.

Many hardness results: △-freeness, k-colorability, etc., require Ω(√n) queries. [A+08, B+08, K+04]

Question : Is it possible to obtain efficient algorithms for fundamental problems without the bounded-degree condition?


What happens if we do not assume the bounded-degree condition?

Page 8: Testing Forest-Isomorphism in the Adjacency List Model

Forest-Isomorphism

We focus on forest-isomorphism in the adjacency list model.

Input : Two forests G and H represented by adjacency lists, and a proximity parameter ε > 0.

Query Model : We can access G and H via the following queries:

deg(v): returns the degree of vertex v.

adj(v, i): returns the vertex adjacent to v via its i-th edge.

random(): returns a randomly chosen vertex.
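The three queries can be pictured as a thin wrapper around adjacency lists that also counts how many queries were issued. This is a minimal sketch under stated assumptions: the class name `ForestOracle`, the `queries` counter, the 0-based vertex labels and the 1-based edge index are all illustrative choices, not part of the slides.

```python
import random


class ForestOracle:
    """Query access to a forest stored as adjacency lists (hypothetical wrapper)."""

    def __init__(self, adjacency, seed=None):
        self._adj = adjacency            # list of neighbor lists, vertices 0..n-1
        self._rng = random.Random(seed)
        self.queries = 0                 # number of queries issued so far

    def deg(self, v):
        """Degree query."""
        self.queries += 1
        return len(self._adj[v])

    def adj(self, v, i):
        """Neighbor query: the vertex reached by the i-th edge of v (i starts at 1)."""
        self.queries += 1
        return self._adj[v][i - 1]

    def random(self):
        """Random-vertex query (uniform over vertices in this sketch)."""
        self.queries += 1
        return self._rng.randrange(len(self._adj))
```

Counting queries this way makes it easy to check that a tester stays within a poly(log n) budget on a given input.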

≅ ?

Page 9: Testing Forest-Isomorphism in the Adjacency List Model

Forest-Isomorphism

We focus on forest-isomorphism in the adjacency list model.

Input : Two forests G and H represented by adjacency lists, and a proximity parameter ε > 0.

ε-Farness : d(G, H) := the minimum number of edge additions and deletions needed to transform G into a graph isomorphic to H (graph edit distance). For ε > 0, G and H are ε-far from being isomorphic ⇔ d(G, H) ≥ εn.

Objective: Determine whether G ≅ H or d(G, H) ≥ εn.
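For very small inputs, d(G, H) can be computed directly from the definition by trying every bijection between the vertex sets. The sketch below does exactly that. It is exponential in n and is meant only to pin down the definition; the function names `edit_distance` and `is_eps_far` are hypothetical.

```python
from itertools import permutations


def edit_distance(n, edges_g, edges_h):
    """Brute-force graph edit distance between two graphs on vertices 0..n-1."""
    h = {frozenset(e) for e in edges_h}
    best = None
    for perm in permutations(range(n)):
        # Relabel G by the bijection and count the edges that must be added or deleted.
        mapped = {frozenset((perm[u], perm[v])) for u, v in edges_g}
        cost = len(mapped ^ h)
        if best is None or cost < best:
            best = cost
    return best


def is_eps_far(n, edges_g, edges_h, eps):
    """True if (G, H) are eps-far from being isomorphic, i.e. d(G, H) >= eps * n."""
    return edit_distance(n, edges_g, edges_h) >= eps * n
```

For example, a star and a path on 4 vertices are at distance 2: delete one star edge and add one path edge.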

≅ ?

Page 10: Testing Forest-Isomorphism in the Adjacency List Model

Forest-Isomorphism

We focus on forest-isomorphism in the adjacency list model.

Motivation

The problem is fundamental: a forest is a simple structure, and isomorphism is a theoretically important problem.

Isomorphism has been considered several times in the property testing literature. [AS’05, AS’08, NS’11]

≅ ?

Page 11: Testing Forest-Isomorphism in the Adjacency List Model

Forest-Isomorphism

We focus on forest-isomorphism in the adjacency list model.

Related Work

If there is no restriction on the input, graph isomorphism testing in the adjacency list model requires Ω(√n) queries. [FM’08]

This is good motivation for our focus on forests.

If the input is a bounded-degree hyperfinite graph, then graph isomorphism is constant-time testable. [NS’11]

But without a degree bound, testability was unknown.

≅ ?

Page 12: Testing Forest-Isomorphism in the Adjacency List Model

Our Contribution

Query complexity of testing forest-isomorphism

Upper bound: poly(log n)

Lower bound: Ω(√log n)

Furthermore, we obtained a more general result:

If the input is a forest, every graph property is testable with poly(log n) queries in the adjacency list model.

We use a technique similar to that of [Newman and Sohler’11].

Page 13: Testing Forest-Isomorphism in the Adjacency List Model

Overview of Our Algorithm

Page 14: Testing Forest-Isomorphism in the Adjacency List Model

Overview of Our Method

1. Partitioning oracle: We define a procedure that removes a small fraction of the edges to partition the graph into several parts with “good” properties.

[Figure: the partitioning oracle applied to G and to H]

2. We check whether each pair of corresponding parts in G and H is isomorphic or far from being isomorphic.

If G and H are far from being isomorphic, at least one pair of corresponding parts in G and H is also far from being isomorphic.

Page 15: Testing Forest-Isomorphism in the Adjacency List Model

Partitioning Oracle

Partitioning Oracle: Given ε > 0 and access to G, there exist an integer s = s(ε) and a subgraph G’ ⊆ G such that:

|E(G) – E(G’)| ≤ εn / 3

Each connected component of G’ is either an s-bounded-degree tree or an s-rooted tree.

s-rooted tree: A tree T in which there exists a vertex v ∈ V(T) such that deg(v) ≥ s and the size of each subtree below v is < s. (We call the vertex v the root.)

s-bounded-degree tree: A tree in which the degree of every vertex is < s.
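The two definitions can be checked mechanically on a single tree. The sketch below reads the whole tree, so it is not a sublinear procedure. It assumes that "size of a subtree" means the number of vertices hanging off one neighbor of the root. The names `subtree_size` and `classify_tree` are hypothetical.

```python
def subtree_size(adj, start, blocked):
    """Number of vertices reachable from `start` without passing through `blocked`."""
    seen = {start, blocked}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) - 1  # the blocked root itself is not counted


def classify_tree(adj, s):
    """Classify a tree given as {vertex: [neighbors]}; return None if neither case holds."""
    if all(len(nbrs) < s for nbrs in adj.values()):
        return "s-bounded-degree"
    for v, nbrs in adj.items():
        # v qualifies as a root if its degree is at least s and every subtree below it is small.
        if len(nbrs) >= s and all(subtree_size(adj, u, v) < s for u in nbrs):
            return "s-rooted"
    return None
```

A return value of `None` marks a tree that the partitioning oracle would still have to cut further.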

Page 16: Testing Forest-Isomorphism in the Adjacency List Model

Partitioning Oracle

Partitioning Oracle: Given ε > 0 and access to G, there exist an integer s = s(ε) and a subgraph G’ ⊆ G such that:

|E(G) – E(G’)| ≤ εn / 3

Each connected component of G’ is either an s-bounded-degree tree or an s-rooted tree.

We can provide query access to G’.

Alive Edge Query: Check if edge (v, i) still exists in G’.

The subgraph G’ is chosen deterministically.

If G ≅ H, then G’ ≅ H’.

Example: vertex v has neighbors A, B, C at positions 1, 2, 3.

(v, 1) : not alive

(v, 2) : not alive

(v, 3) : alive

Page 17: Testing Forest-Isomorphism in the Adjacency List Model

Partitioning Oracle

Partitioning Oracle: Given ε > 0 and access to G, there exist an integer s = s(ε) and a subgraph G’ ⊆ G such that:

|E(G) – E(G’)| ≤ εn / 3

Each connected component of G’ is either an s-bounded-degree tree or an s-rooted tree.

So…

d(G, H) = 0 ⇒ d(G’, H’) = 0, because G’ and H’ are chosen deterministically.

d(G, H) ≥ εn ⇒ d(G’, H’) ≥ εn / 3, because we remove at most εn / 3 edges from each of G and H.

Thus, it is enough to consider the partitioned graphs G’ and H’.
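The second implication follows from the triangle inequality for the edit distance. A short worked derivation, assuming G’ and H’ keep the vertex sets of G and H so that d(G, G’) ≤ |E(G) – E(G’)|:

```latex
\begin{align*}
\varepsilon n \le d(G, H)
  &\le d(G, G') + d(G', H') + d(H', H) \\
  &\le \frac{\varepsilon n}{3} + d(G', H') + \frac{\varepsilon n}{3},
\end{align*}
\[
\text{hence}\quad d(G', H') \ge \varepsilon n - \frac{2\varepsilon n}{3} = \frac{\varepsilon n}{3}.
\]
```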

Page 18: Testing Forest-Isomorphism in the Adjacency List Model

Graph Partition

Suppose that G is obtained through the partitioning oracle.

We split G into the following parts for some constants α, γ > 1.

G[0] := s-bounded-degree trees in G

G[1] := s-rooted trees in G with root degrees in [s, αγ)

G[2] := s-rooted trees in G with root degrees in [αγ, αγ^2)

G[3] := s-rooted trees in G with root degrees in [αγ^2, αγ^3)

...

O(log n) parts in total
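The part that an s-rooted tree falls into depends only on its root degree. A minimal sketch of that lookup, assuming s < αγ; the function name `part_index` is hypothetical:

```python
def part_index(root_degree, s, alpha, gamma):
    """Return i >= 1 such that the root degree lies in the degree range of G[i]."""
    if root_degree < s:
        raise ValueError("an s-rooted tree has root degree at least s")
    index = 1
    upper = alpha * gamma          # G[1] covers [s, alpha * gamma)
    while root_degree >= upper:
        upper *= gamma             # G[i] covers [alpha * gamma^(i-1), alpha * gamma^i)
        index += 1
    return index
```

Because the ranges grow geometrically and a root degree is below n, the loop runs O(log n) times, which matches the number of parts.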

[Figure: G split into the parts G[0], G[1], G[2], ...]

Page 19: Testing Forest-Isomorphism in the Adjacency List Model

Isomorphism between Corresponding Parts

Graph partition is useful in the following sense.

Lemma. d(G, H) ≤ Σ_i d(G[i], H[i]).

Proof. Transforming G[i] into H[i] for every i transforms G into H. □

Corollary. If d(G, H) ≥ εn, then for any β_i > 0 with Σ_i β_i = ε, there exists i such that d(G[i], H[i]) ≥ β_i n. (Otherwise d(G, H) ≤ Σ_i d(G[i], H[i]) < Σ_i β_i n = εn.) □

Thus, it suffices to test isomorphism between G[i] and H[i] for each i = 0, 1, 2, ….

We set β_0 = ε/2 and β_1 = β_2 = … = O(ε / log n).
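One way to realize these budgets so that they sum to exactly ε is to give half of ε to part 0 and split the other half evenly among the rooted parts. The even split and the name `proximity_budgets` are assumptions made for illustration; the slides fix only the orders of magnitude.

```python
def proximity_budgets(eps, num_rooted_parts):
    """Return [beta_0, beta_1, ..., beta_k] with beta_0 = eps / 2 and the rest equal."""
    share = eps / (2 * num_rooted_parts)   # O(eps / log n) when there are O(log n) parts
    return [eps / 2] + [share] * num_rooted_parts
```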

Page 20: Testing Forest-Isomorphism in the Adjacency List Model

Isomorphism between Corresponding Parts

Testing G[i]≅H[i]

For i=0 : We can use a tester for the bounded-degree model [NS’11].

For i≥1 : We develop a new algorithm.

Sketch : We randomly sample root vertices.

For each root vertex, we randomly sample its subtrees and create a histogram of subtrees.

After this, we compute the minimum matching between the histograms in G and H.

This minimum matching turns out to be a good approximation to d(G, H).
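The histogram step can be sketched as follows: each sampled subtree is reduced to a canonical string, so that isomorphic subtrees get the same key, and the strings are counted. The comparison below uses a plain L1 distance between histograms as a simplified stand-in for the minimum matching mentioned in the sketch; it is not the authors' procedure. All function names are hypothetical.

```python
import random
from collections import Counter


def canonical_form(adj, vertex, parent):
    """Canonical string of the subtree rooted at `vertex`; isomorphic subtrees agree."""
    child_codes = sorted(
        canonical_form(adj, child, vertex)
        for child in adj[vertex]
        if child != parent
    )
    return "(" + "".join(child_codes) + ")"


def subtree_histogram(adj, root, samples, rng):
    """Sample subtrees below `root` uniformly with replacement and count their shapes."""
    hist = Counter()
    for _ in range(samples):
        child = rng.choice(adj[root])
        hist[canonical_form(adj, child, root)] += 1
    return hist


def histogram_distance(hist_g, hist_h):
    """L1 distance between two shape histograms (stand-in for the minimum matching)."""
    shapes = set(hist_g) | set(hist_h)
    return sum(abs(hist_g[shape] - hist_h[shape]) for shape in shapes)
```

The recursion in `canonical_form` is fine here because every subtree of an s-rooted tree has fewer than s vertices.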

[Figure: histogram of sampled subtree shapes, with counts 2, 2, 1, …]

Page 21: Testing Forest-Isomorphism in the Adjacency List Model

Conclusion

If the input is a forest, every graph property is testable in poly(log n) queries.

Future Work?

Can we obtain similar results for graph classes larger than forests?

Outerplanar graphs, bounded-treewidth graphs, scale-free graphs, …

Query complexity

Upper bound: poly(log n)

Lower bound: Ω(√log n)

Actually, the upper bound is O(log^{2^{poly(1/ε)}} n).

Page 22: Testing Forest-Isomorphism in the Adjacency List Model

Appendix : Lower bound

Page 23: Testing Forest-Isomorphism in the Adjacency List Model

Lower bound - Overview

1. We construct two input distributions, D1 and D2.

∀(G, H) ∈ D1, G ≅ H

∀(G, H) ∈ D2, d(G, H) ≥ n/8

2. We reduce isomorphism testing to checking whether two probability distributions are the same or not. This requires Ω(√N) queries.

≅ ?

≅ ?

Page 24: Testing Forest-Isomorphism in the Adjacency List Model

Lower bound

Let F_k := (n / (2^k log n)) copies of a star graph with 2^k vertices.

(Note that |F_k| = n / log n for every k.)
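The remark can be checked by building the family explicitly. The sketch below uses integer division, so it matches the formula only when 2^k log n divides n, which is an assumption the slides leave implicit. The names `star_family` and `family_vertex_count` are hypothetical.

```python
def star_family(k, n, log_n):
    """Return F_k as a list of stars, each given by its edge list (center is vertex 0)."""
    star_size = 2 ** k
    copies = n // (star_size * log_n)
    stars = []
    for _ in range(copies):
        # One star: the center 0 joined to star_size - 1 leaves.
        stars.append([(0, leaf) for leaf in range(1, star_size)])
    return stars


def family_vertex_count(k, n, log_n):
    """Total number of vertices in F_k."""
    return len(star_family(k, n, log_n)) * 2 ** k
```

With n = 256 and log n = 8, every F_k for k = 0, …, 5 has 32 = n / log n vertices: larger stars are exactly compensated by fewer copies.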

[Figure: the star forests F_0, F_1, F_2, F_3, …, F_{log n}]

Page 25: Testing Forest-Isomorphism in the Adjacency List Model

Lower bound

Construct two distributions D1, D2 :

D1 : G = H, where

G = F_0 ∪ F_1 ∪ … ∪ F_{log n}

H = F_0 ∪ F_1 ∪ … ∪ F_{log n}

D2 : Randomly assign each F_k to either G or H so that |V(G)| = |V(H)|.

[Figure: under D2, each of F_0, F_1, …, F_{log n} appears in exactly one of G and H]
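Since every F_k has the same number of vertices, a balanced assignment under D2 amounts to giving G and H the same number of families. The sketch below tracks only which family indices go to which graph; the function name `sample_pair` and the requirement of an even number of families are assumptions made for illustration.

```python
import random


def sample_pair(num_families, far, rng):
    """Return (families of G, families of H) as sorted lists of indices k."""
    indices = list(range(num_families))
    if not far:
        return indices, list(indices)       # D1: G and H contain every family
    if num_families % 2:
        raise ValueError("this sketch needs an even number of families")
    rng.shuffle(indices)
    half = num_families // 2
    # D2: each family goes to exactly one side, half of them to each.
    return sorted(indices[:half]), sorted(indices[half:])
```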

Page 26: Testing Forest-Isomorphism in the Adjacency List Model

Lower bound

Because we can only perform random sampling and degree/neighbor queries, checking whether G ≅ H is equivalent to checking whether two probability distributions are the same.

Lemma. We need Ω(√log n) queries to distinguish D1 and D2.

[Figure: probability of observing each of F_0, F_1, F_2, …, F_{log n} by random sampling, shown for G and H separately and for the case G = H]

Page 27: Testing Forest-Isomorphism in the Adjacency List Model

Lower bound

Lemma. ∀(G, H) ∈ D2, d(G, H) ≥ n/8

Proof.

Let Φ : V(G) → V(H) be a bijection that achieves the minimum graph edit distance. It holds that

d(G, H) ≥ Σ_{v ∈ V(G)} |deg(v) – deg(Φ(v))| / 2.

If we restrict v in the sum to the roots of the stars, we obtain d(G, H) ≥ Σ_{k=2,3,4,...} (n / (2^k log n)) ∙ 2^{k-1} / 2 ≥ n/8. □

Thus, the Ω(√log n) lower bound holds.
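The degree inequality used in the proof gives a bound that is easy to compute: each edge edit changes two degrees by one, and pairing the two sorted degree sequences minimizes the sum of differences over all bijections. A minimal sketch, with the hypothetical name `degree_lower_bound`:

```python
def degree_lower_bound(degrees_g, degrees_h):
    """Lower bound on d(G, H) obtained from the degree sequences alone."""
    if len(degrees_g) != len(degrees_h):
        raise ValueError("G and H must have the same number of vertices")
    # Sorted pairing minimizes the total degree difference over all bijections.
    paired = zip(sorted(degrees_g), sorted(degrees_h))
    return sum(abs(a - b) for a, b in paired) / 2
```

For a star and a path on 4 vertices this gives 1, which is consistent with (and weaker than) their true distance of 2.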
