
Fast Random Walk with Restart and Its Applications Hanghang Tong, Christos Faloutsos and Jia-Yu (Tim) Pan ICDM 2006 Dec. 18-22, HongKong.

Dec 21, 2015


Fast Random Walk with Restart and Its Applications

Hanghang Tong, Christos Faloutsos and Jia-Yu (Tim) Pan

ICDM 2006 Dec. 18-22, HongKong


Motivating Questions

• Q: How to measure the relevance?

• A: Random walk with restart

• Q: How to do it efficiently?

• A: This talk tries to answer!

Random walk with restart

[Figure: example graph with 12 nodes, labeled 1 to 12.]

Random walk with restart

Query: Node 4

Ranking vector $\vec{r}_4$:

Node     1     2     3     4     5     6     7     8     9     10    11    12
Score    0.13  0.10  0.13  0.22  0.13  0.05  0.05  0.08  0.04  0.03  0.04  0.02

• More red, more relevant
• Nearby nodes, higher scores

[Figure: the 12-node example graph, each node colored and labeled with its score.]

Automatic Image Caption

• Q: Given images captioned {Sea, Sun, Sky, Wave} and {Cat, Forest, Grass, Tiger}, what is the caption {?, ?, ?} of a new image?

• A: RWR! [Pan KDD2004]

[Figure: graph with three kinds of nodes (Image, Region, Keyword), including the test image; keywords: Sea, Sun, Sky, Wave, Cat, Forest, Tiger, Grass.]

Result for the test image: {Grass, Forest, Cat, Tiger}

Neighborhood Formulation

• Q: What is the most related conference to ICDM?

• A: RWR! [Sun ICDM2005]

[Figure: conference-author graph. Conferences: ICDM, KDD, SDM, IJCAI, NIPS, AAAI, … Authors: Philip S. Yu, M. Jordan, Ning Zhong, R. Ramakrishnan, …]

NF: example

[Figure: ICDM connected to its related conferences KDD, SDM, ECML, PKDD, PAKDD, CIKM, DMKD, SIGMOD, ICML, and ICDE, with relevance scores between 0.004 and 0.011.]

Center-Piece Subgraph (CePS)

• Q: Given the original graph and the query nodes A, B, C (black: query nodes), what is the CePS?

• A: RWR! [Tong KDD 2006]

CePS: Example

[Figure: center-piece subgraph over the authors R. Agrawal, Jiawei Han, V. Vapnik, M. Jordan, H.V. Jagadish, Laks V.S. Lakshmanan, Heikki Mannila, Christos Faloutsos, Padhraic Smyth, Corinna Cortes, and Daryl Pregibon; edges carry integer weights.]


Other Applications

• Content-based Image Retrieval [He]

• Personalized PageRank [Jeh], [Widom], [Haveliwala]

• Anomaly Detection (for node; link) [Sun]

• Link Prediction [Getoor], [Jensen]

• Semi-supervised Learning [Zhu], [Zhou]

• …

Roadmap

• Background
  – RWR: Definitions
  – RWR: Algorithms
• Basic Idea
• FastRWR
  – Pre-Compute Stage
  – On-Line Stage
• Experimental Results
• Conclusion

Computing RWR

$$\vec{r}_i = c\,\tilde{W}\,\vec{r}_i + (1 - c)\,\vec{e}_i$$

• $\vec{r}_i$: ranking vector (n x 1)
• $\tilde{W}$: adjacency matrix, normalized (n x n)
• $\vec{e}_i$: starting vector (n x 1)
• $1 - c$: restart probability

Example on the 12-node graph, query node 4, with $c = 0.9$ and $1 - c = 0.1$:

$\vec{r}_4 = (0.13, 0.10, 0.13, 0.22, 0.13, 0.05, 0.05, 0.08, 0.04, 0.03, 0.04, 0.02)^T$

$\vec{e}_4 = (0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)^T$

$\tilde{W}$ =

0    1/3  1/3  1/3  0    0    0    0    0    0    0    0
1/3  0    1/3  0    0    0    0    1/4  0    0    0    0
1/3  1/3  0    1/3  0    0    0    0    0    0    0    0
1/3  0    1/3  0    1/4  0    0    0    0    0    0    0
0    0    0    1/3  0    1/2  1/2  1/4  0    0    0    0
0    0    0    0    1/4  0    1/2  0    0    0    0    0
0    0    0    0    1/4  1/2  0    0    0    0    0    0
0    1/3  0    0    1/4  0    0    0    1/2  0    1/3  0
0    0    0    0    0    0    0    1/4  0    1/3  0    0
0    0    0    0    0    0    0    0    1/2  0    1/3  1/2
0    0    0    0    0    0    0    1/4  0    1/3  0    1/2
0    0    0    0    0    0    0    0    0    1/3  1/3  0
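The equation above can be checked numerically on the 12-node example. The following is a minimal sketch, not the authors' code: the edge list is read off the nonzero entries of the slide's matrix, and the names `EDGES`, `normalized_adjacency`, and `rwr_direct` are hypothetical. It rewrites the equation as the linear system (I - cW) r = (1 - c) e_i and solves it directly.

```python
import numpy as np

# Undirected edges of the 12-node example, read off the nonzero entries
# of the matrix on the slide (1-based node labels). Assumed, not given as a list.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 8), (3, 4), (4, 5), (5, 6),
         (5, 7), (5, 8), (6, 7), (8, 9), (8, 11), (9, 10), (10, 11),
         (10, 12), (11, 12)]


def normalized_adjacency(edges, n):
    """Column-normalized adjacency matrix: every column sums to 1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u - 1, v - 1] = A[v - 1, u - 1] = 1.0
    return A / A.sum(axis=0, keepdims=True)


def rwr_direct(W, i, c=0.9):
    """Solve r = c W r + (1 - c) e_i as (I - c W) r = (1 - c) e_i."""
    n = W.shape[0]
    e = np.zeros(n)
    e[i] = 1.0
    return np.linalg.solve(np.eye(n) - c * W, (1 - c) * e)


W = normalized_adjacency(EDGES, 12)
r4 = rwr_direct(W, 3)  # node 4 has 0-based index 3
print(np.round(r4, 2))
```

The query node itself receives the largest score, and the scores sum to 1 because every column of the normalized matrix sums to 1.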

Beyond RWR

• P-PageRank [Haveliwala]
• PageRank [Haveliwala]
• RWR [Pan, Sun]
• SM Learning [Zhou, Zhu]
• RL in CBIR [He]

Fast RWR finds the root solution!

[Equation]: Maxwell Equation for Web! [Chakrabarti]

• Q: Given query i, how to solve it?

$$\vec{r}_i = 0.9\,\tilde{W}\,\vec{r}_i + 0.1\,\vec{e}_i$$

Here $\tilde{W}$ is the 12 x 12 normalized adjacency matrix of the example graph, $\vec{e}_i$ is the starting vector, and the ranking vector $\vec{r}_i$ is the unknown (??).

OnTheFly:

$$\vec{r}_i \leftarrow 0.9\,\tilde{W}\,\vec{r}_i + 0.1\,\vec{e}_i$$

Iterates for query node 4, starting from $\vec{e}_4$:

Node   Start  Iter 1  Iter 2  Iter 3  Iter 4  Iter 5  Final
1      0      0.3     0.12    0.19    0.14    0.16    0.13
2      0      0       0.18    0.09    0.13    0.10    0.10
3      0      0.3     0.12    0.19    0.14    0.16    0.13
4      1      0.1     0.35    0.18    0.26    0.21    0.22
5      0      0.3     0.03    0.18    0.10    0.15    0.13
6      0      0       0.07    0.04    0.06    0.05    0.05
7      0      0       0.07    0.04    0.06    0.05    0.05
8      0      0       0.07    0.06    0.08    0.07    0.08
9      0      0       0       0.02    0.01    0.02    0.04
10     0      0       0       0       0.01    0.01    0.03
11     0      0       0       0.02    0.01    0.02    0.04
12     0      0       0       0       0       0.01    0.02

• No pre-computation / light storage
• Slow on-line response: O(mE)

[Figure: the example graph, with node scores updated at each iteration until they reach $\vec{r}_i$.]
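The OnTheFly approach amounts to repeating the update until the ranking vector stops changing. Below is a minimal sketch under the same assumptions as the slide's example (c = 0.9, query node 4); the edge list is read off the slide's matrix, and `rwr_on_the_fly`, `tol`, and `max_iter` are hypothetical names and settings. A dense matrix is used for brevity; with a sparse matrix each iteration touches every edge once, which is where a cost of the form O(mE) for m iterations comes from.

```python
import numpy as np

# Edges of the 12-node example (1-based), read off the slide's matrix. Assumed.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 8), (3, 4), (4, 5), (5, 6),
         (5, 7), (5, 8), (6, 7), (8, 9), (8, 11), (9, 10), (10, 11),
         (10, 12), (11, 12)]


def normalized_adjacency(edges, n):
    """Column-normalized adjacency matrix: every column sums to 1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u - 1, v - 1] = A[v - 1, u - 1] = 1.0
    return A / A.sum(axis=0, keepdims=True)


def rwr_on_the_fly(W, i, c=0.9, tol=1e-9, max_iter=1000):
    """Iterate r <- c W r + (1 - c) e_i until the L1 change drops below tol."""
    n = W.shape[0]
    e = np.zeros(n)
    e[i] = 1.0
    r = e.copy()
    for m in range(1, max_iter + 1):
        r_next = c * (W @ r) + (1 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next, m
        r = r_next
    return r, max_iter


W = normalized_adjacency(EDGES, 12)
r4, iterations = rwr_on_the_fly(W, 3)  # node 4 has 0-based index 3
print(iterations, np.round(r4, 2))
```

The first update from the starting vector gives 0.3 for nodes 1, 3, and 5 and 0.1 for node 4, matching the first iterate listed on the slide.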

PreCompute [Haveliwala]

R: the matrix whose columns $\vec{r}_1, \vec{r}_2, \ldots, \vec{r}_{12}$ correspond to the 12 possible query nodes, computed ahead of time; a query on node 4 reads column 4.

[Figure: the example graph with the scores for query node 4.]

PreCompute:

2.20 1.28 1.43 1.29 0.68 0.56 0.56 0.63 0.44 0.35 0.39 0.34
1.28 2.02 1.28 0.96 0.64 0.53 0.53 0.85 0.60 0.48 0.53 0.45
1.43 1.28 2.20 1.29 0.68 0.56 0.56 0.63 0.44 0.35 0.39 0.33
1.29 0.96 1.29 2.06 0.95 0.78 0.78 0.61 0.43 0.34 0.38 0.32
0.91 0.86 0.91 1.27 2.41 1.97 1.97 1.05 0.73 0.58 0.66 0.56
0.37 0.35 0.37 0.52 0.98 2.06 1.37 0.43 0.30 0.24 0.27 0.22
0.37 0.35 0.37 0.52 0.98 1.37 2.06 0.43 0.30 0.24 0.27 0.22
0.84 1.14 0.84 0.82 1.05 0.86 0.86 2.13 1.49 1.19 1.33 1.13
0.29 0.40 0.29 0.28 0.36 0.30 0.30 0.74 1.78 1.00 0.76 0.79
0.35 0.48 0.35 0.34 0.44 0.36 0.36 0.89 1.50 2.45 1.54 1.80
0.39 0.53 0.39 0.38 0.49 0.40 0.40 1.00 1.14 1.54 2.28 1.72
0.22 0.30 0.22 0.21 0.28 0.22 0.22 0.56 0.79 1.20 1.14 2.05

• Fast on-line response
• Heavy pre-computation/storage cost: $O(n^3)$ / $O(n^2)$

[Figure: the example graph with the scores for query node 4: 0.13, 0.10, 0.13, 0.22, 0.13, 0.05, 0.05, 0.08, 0.04, 0.03, 0.04, 0.02.]
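The PreCompute trade-off can be sketched as follows, assuming the pre-computed matrix is the full inverse of (I - cW) with c = 0.9, so that a query only reads one column and scales it by 1 - c. The function names `precompute_all` and `query` are hypothetical, and the edge list is read off the slide's example matrix.

```python
import numpy as np

# Edges of the 12-node example (1-based), read off the slide's matrix. Assumed.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 8), (3, 4), (4, 5), (5, 6),
         (5, 7), (5, 8), (6, 7), (8, 9), (8, 11), (9, 10), (10, 11),
         (10, 12), (11, 12)]


def normalized_adjacency(edges, n):
    """Column-normalized adjacency matrix: every column sums to 1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u - 1, v - 1] = A[v - 1, u - 1] = 1.0
    return A / A.sum(axis=0, keepdims=True)


def precompute_all(W, c=0.9):
    """Off-line: invert the full n x n matrix (cubic time, quadratic storage)."""
    n = W.shape[0]
    return np.linalg.inv(np.eye(n) - c * W)


def query(Q_inv, i, c=0.9):
    """On-line: read column i of the stored inverse and scale by (1 - c)."""
    return (1 - c) * Q_inv[:, i]


W = normalized_adjacency(EDGES, 12)
Q_inv = precompute_all(W)
r4 = query(Q_inv, 3)  # node 4 has 0-based index 3
print(np.round(r4, 2))
```

The on-line step is a single column lookup, which is why the response is fast, while the dense n x n inverse is what makes the off-line cost and storage heavy.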


Q: How to Balance?

On-line Off-line


Basic Idea

• Find Community
• Fix the remaining
• Combine

[Figure: the 12-node example graph is split into communities, the remaining links are handled separately, and the two parts are combined.]

Pre-computational stage

• Q: Efficiently compute and store $Q^{-1}$
• A: A few small, instead of ONE BIG, matrix inversions

On-Line Query Stage

• Q: Efficiently recover one column of $Q^{-1}$
• A: A few, instead of MANY, matrix-vector multiplications

[Figure: the query vector $\vec{e}_i$ (all zeros except a 1 at the query node) is mapped to the ranking vector $\vec{r}_i$.]


Pre-compute Stage

• p1: B_Lin Decomposition
  – P1.1 partition
  – P1.2 low-rank approximation
• p2: Q matrices
  – P2.1 computing $Q_1^{-1}$ (for each partition)
  – P2.2 computing $\tilde{\Lambda}$ (for concept space)

P1.1: partition

[Figure: the example graph partitioned into communities, shown as within-partition links and cross-partition links.]

P1.1: block-diagonal

[Figure: the within-partition links of the example graph, which form a block-diagonal matrix.]

P1.2: LRA for $\tilde{W}_2$

$$|S| \ll |\tilde{W}_2|$$

[Figure: the cross-partition links of the example graph and their low-rank approximation.]
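The two decomposition steps can be illustrated on the 12-node example. This is a sketch under stated assumptions, not the authors' procedure: the partition in `PARTS` is hypothetical (the slides show the partition only as a picture), the low-rank approximation is done with a plain SVD, and `split_links` and `low_rank` are made-up helper names.

```python
import numpy as np

# Edges of the 12-node example (1-based), read off the slide's matrix. Assumed.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 8), (3, 4), (4, 5), (5, 6),
         (5, 7), (5, 8), (6, 7), (8, 9), (8, 11), (9, 10), (10, 11),
         (10, 12), (11, 12)]
# Hypothetical partition (0-based indices): nodes 1-4, 5-7, 8-12.
PARTS = [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9, 10, 11]]


def normalized_adjacency(edges, n):
    """Column-normalized adjacency matrix: every column sums to 1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u - 1, v - 1] = A[v - 1, u - 1] = 1.0
    return A / A.sum(axis=0, keepdims=True)


def split_links(W, parts):
    """P1.1: W1 keeps within-partition links (block-diagonal), W2 the rest."""
    W1 = np.zeros_like(W)
    for p in parts:
        idx = np.ix_(p, p)
        W1[idx] = W[idx]
    return W1, W - W1


def low_rank(W2, t):
    """P1.2: rank-t approximation W2 ~ U S V from a truncated SVD."""
    U, s, Vt = np.linalg.svd(W2)
    return U[:, :t], np.diag(s[:t]), Vt[:t, :]


W = normalized_adjacency(EDGES, 12)
W1, W2 = split_links(W, PARTS)
t = int(np.linalg.matrix_rank(W2))
U, S, V = low_rank(W2, t)
print(S.shape, W2.shape)
```

With this partition only three edges cross partitions, so the 12 x 12 cross-partition matrix is reproduced by a 4 x 4 core S; choosing a smaller t would trade accuracy for size.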

[Figure: the matrix written as a sum of two parts (+, =), following the partition (P1.1) and low-rank approximation (P1.2) steps.]

p2.1 Computing $Q_1^{-1}$

Comparing $Q_1^{-1}$ and $Q^{-1}$

• Computing Time
  – 100,000 nodes; 100 partitions
  – Computing $Q_1^{-1}$ is 10,000x faster!
• Storage Cost
  – 100x saving!

$$Q_1^{-1} = \mathrm{diag}\left(Q_{1,1}^{-1},\; Q_{1,2}^{-1},\; \ldots,\; Q_{1,k}^{-1}\right)$$
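The two factors follow from simple counting. The sketch below assumes equal-size partitions, a cubic cost for inverting a dense matrix, and dense storage of each block; these assumptions are illustrative and are not stated on the slide.

```python
n = 100_000          # nodes
k = 100              # partitions
block = n // k       # nodes per partition, assuming equal sizes

# Inversion cost, assuming it grows with the cube of the matrix size.
one_big_inversion = n ** 3
k_small_inversions = k * block ** 3
speedup = one_big_inversion // k_small_inversions

# Storage, assuming every block is stored densely.
one_big_storage = n ** 2
k_small_storage = k * block ** 2
saving = one_big_storage // k_small_storage

print(speedup, saving)  # 10000 100
```

In general, under these assumptions, k equal partitions give a k^2 saving in inversion time and a k saving in storage.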

• Q: How to fix the green portions?

[Figure: $\tilde{W}$ split into a sum of two parts; $Q_1^{-1}$ is known for the first part, and the contribution of the remaining part is marked "?".]

p2.2 Computing $\tilde{\Lambda}$:

$$\tilde{\Lambda} = \left(S^{-1} - c\,V Q_1^{-1} U\right)^{-1}$$

[Figure: the example graph and $Q_1^{-1}$ with its blocks $Q_{1,1}^{-1}, Q_{1,2}^{-1}, \ldots, Q_{1,k}^{-1}$.]

SM Lemma says:

[Equation: Sherman-Morrison lemma]

We have:

$$Q^{-1} \approx \underbrace{Q_1^{-1}}_{\text{Communities}} + \underbrace{c\,Q_1^{-1} U \tilde{\Lambda} V Q_1^{-1}}_{\text{Bridges}}$$
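The community-plus-bridges form can be derived in a few lines. The derivation below is a sketch that assumes the decomposition $\tilde{W} \approx \tilde{W}_1 + USV$ (block-diagonal part plus low-rank part), writes $Q = I - c\tilde{W}$ and $Q_1 = I - c\tilde{W}_1$, and uses the general matrix-inversion identity in its Woodbury form; the symbols $A$, $X$, $C$, $Y$ are placeholders introduced only for this illustration.

```latex
% General identity (Sherman-Morrison-Woodbury):
%   (A + X C Y)^{-1} = A^{-1} - A^{-1} X (C^{-1} + Y A^{-1} X)^{-1} Y A^{-1}
% Substitute A = Q_1, X = U, C = -cS, Y = V:
\begin{align*}
Q &\approx I - c\tilde{W}_1 - c\,USV = Q_1 - c\,USV \\
Q^{-1} &\approx Q_1^{-1} - Q_1^{-1} U \left(-\tfrac{1}{c} S^{-1} + V Q_1^{-1} U\right)^{-1} V Q_1^{-1} \\
       &= Q_1^{-1} + Q_1^{-1} U \left(\tfrac{1}{c}\left(S^{-1} - c\,V Q_1^{-1} U\right)\right)^{-1} V Q_1^{-1} \\
       &= Q_1^{-1} + c\,Q_1^{-1} U \tilde{\Lambda} V Q_1^{-1},
\qquad \tilde{\Lambda} = \left(S^{-1} - c\,V Q_1^{-1} U\right)^{-1}
\end{align*}
```

The only matrices that must be inverted are the blocks of $Q_1$ and the small matrix $S^{-1} - c\,V Q_1^{-1} U$, whose size is the rank of the low-rank part.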


On-Line Stage

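• Q: Given the query vector $\vec{e}_i$ (all zeros except a 1 at the query node), what is the result $\vec{r}_i$?

• A: (SM lemma)

[Figure: query vector $\vec{e}_i$ + pre-computation = result $\vec{r}_i$.]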

On-Line Query Stage

[Figure: the six on-line steps, q1 through q6.]
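One way the two stages could fit together is sketched below. It is an illustration under stated assumptions, not necessarily the slide's exact steps: the partition `PARTS` is hypothetical, the low-rank part comes from a plain SVD, `Lam` stands for the small matrix (S^-1 - c V Q1^-1 U)^-1, and the function names are made up. With the full rank kept, the result matches the exact solution; a smaller `rank` would give an approximation.

```python
import numpy as np

# Edges of the 12-node example (1-based), read off the slide's matrix. Assumed.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 8), (3, 4), (4, 5), (5, 6),
         (5, 7), (5, 8), (6, 7), (8, 9), (8, 11), (9, 10), (10, 11),
         (10, 12), (11, 12)]
# Hypothetical partition (0-based indices): nodes 1-4, 5-7, 8-12.
PARTS = [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9, 10, 11]]


def normalized_adjacency(edges, n):
    """Column-normalized adjacency matrix: every column sums to 1."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u - 1, v - 1] = A[v - 1, u - 1] = 1.0
    return A / A.sum(axis=0, keepdims=True)


def precompute(W, parts, c=0.9, rank=None):
    """Off-line: small per-partition inversions plus one small core inversion."""
    n = W.shape[0]
    W1 = np.zeros((n, n))
    Q1_inv = np.zeros((n, n))
    for p in parts:
        idx = np.ix_(p, p)
        W1[idx] = W[idx]
        Q1_inv[idx] = np.linalg.inv(np.eye(len(p)) - c * W[idx])
    U, s, V = np.linalg.svd(W - W1)          # cross-partition links = U diag(s) V
    if rank is None:
        rank = int(np.sum(s > 1e-10))        # keep every nonzero singular value
    U, s, V = U[:, :rank], s[:rank], V[:rank, :]
    Lam = np.linalg.inv(np.diag(1.0 / s) - c * V @ Q1_inv @ U)
    return Q1_inv, U, Lam, V


def query(pre, i, c=0.9):
    """On-line: a handful of matrix-vector products, no iteration."""
    Q1_inv, U, Lam, V = pre
    r0 = Q1_inv[:, i]                                   # Q1^-1 e_i
    bridge = Q1_inv @ (U @ (Lam @ (V @ r0)))            # Q1^-1 U Lam V Q1^-1 e_i
    return (1 - c) * (r0 + c * bridge)


W = normalized_adjacency(EDGES, 12)
pre = precompute(W, PARTS)
r4 = query(pre, 3)  # node 4 has 0-based index 3
print(np.round(r4, 2))
```

A dense `Q1_inv` is used here only for brevity; storing just the diagonal blocks is what yields the storage saving.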


Experimental Setup

• Dataset
  – DBLP/authorship
  – Author-Paper
  – 315k nodes
  – 1,800k edges

• Approx. Quality: Relative Accuracy

• Application: Center-Piece Subgraph

Query Time vs. Pre-Compute Time

[Figure: log query time vs. log pre-compute time.]

• Quality: 90%+
• On-line:
  – Up to 150x speedup
• Pre-computation:
  – Two orders of magnitude saving

Query Time vs. Pre-Storage

[Figure: log query time vs. log storage.]

• Quality: 90%+
• On-line:
  – Up to 150x speedup
• Pre-storage:
  – Three orders of magnitude saving


Conclusion

• FastRWR
  – Reasonable quality preservation (90%+)
  – 150x speed-up: query time
  – Orders of magnitude saving: pre-compute & storage
• More in the paper
  – The variant of FastRWR and theoretical justification
  – Implementation details
    • normalization, low-rank approximation, sparse
  – More experiments
    • Other datasets, other applications


Q&A

Thank you!

[email protected]

www.cs.cmu.edu/~htong