Random walk on Graphs

Pavan Kapanipathi, Reading Group (Kno.e.sis)

Reference: Purnamrita Sarkar, Random Walks on Graphs: An Overview

Agenda

• Introduction
  – Motivation
• Background
  – Graphs
  – Matrices
• Random Walk
  – PageRank
  – Personalized PageRank
  – Topic Sensitive PageRank
• Applications
  – Specifically in Recommender Systems

Random Walk

A drunk man will find his way home, but a drunk bird may get lost forever.

Motivation: Link prediction in social networks

Motivation: Basis for recommendation

Since I had very few slides and more time: Graphs

• Undirected Graphs

Since I had very few slides and more time on hand: Graphs

• Directed Graphs

Since I had very few slides and more time on hand: Matrix

[Figure: a matrix with rows and columns indexed by i, j, k, highlighting the entry (i, j)]

Adjacency and Transition Matrix

[Figure: an example graph with its adjacency matrix A (entries 1) and transition matrix P (entries 1 and 1/2)]
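A minimal sketch of how a transition matrix can be obtained from an adjacency matrix by dividing each row by the node's out-degree. The 3-node directed graph below is a hypothetical example, not necessarily the one pictured on the slide, and `transition_matrix` is an illustrative helper name.

```python
import numpy as np

# Hypothetical directed graph: 0 -> 1, 1 -> 0, 1 -> 2, 2 -> 0
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [1, 0, 0]], dtype=float)

def transition_matrix(A):
    """Row-normalize A so that P[i, j] = A[i, j] / out_degree(i)."""
    out_deg = A.sum(axis=1, keepdims=True)
    if np.any(out_deg == 0):
        raise ValueError("every node needs at least one outgoing edge")
    return A / out_deg

P = transition_matrix(A)
print(P)  # each row sums to 1; node 1 splits its probability 1/2, 1/2
```

Row normalization makes P row-stochastic: P[i, j] is the probability of stepping from i to j.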

Markov Property (Basic)

• Given the present state, the future and past states are independent

Stochastic

• Wikipedia: In probability theory, a purely stochastic system is one whose state is non-deterministic so that the subsequent state of the system is determined probabilistically.

• Matrix: row-stochastic or column-stochastic? (Do the rows or the columns sum to 1?)

Start at a node and repeatedly move to a neighbor chosen at random. The random sequence of points selected this way is a random walk on the graph.
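A small sketch of simulating such a walk, assuming an unweighted graph stored as an adjacency list and a uniform choice among neighbors. The graph, the function name `random_walk`, and the fixed seed are illustrative assumptions.

```python
import random

# Hypothetical adjacency list of a small undirected graph.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walk(neighbors, start, steps, seed=0):
    """Return the list of nodes visited, beginning with `start`."""
    rng = random.Random(seed)  # seeded so the walk is reproducible
    path = [start]
    for _ in range(steps):
        # Move to a uniformly chosen neighbor of the current node.
        path.append(rng.choice(neighbors[path[-1]]))
    return path

walk = random_walk(neighbors, start=0, steps=10)
print(walk)
```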

Again: Transition Matrix

Probability?

[Figure: transition matrix P with entries 1 and 1/2, rows and columns indexed by i, j, k]

Probability Distributions

• $x_t(i)$ = probability that the surfer is at node i at time t

• $x_{t+1}(i) = \sum_j (\text{probability of being at node } j) \cdot \Pr(j \to i) = \sum_j x_t(j)\,P(j,i)$

• $x_{t+1} = x_t P = x_{t-1} P^2 = x_{t-2} P^3 = \dots = x_0 P^{t+1}$

Matrix Multiplication?

• What happens when the surfer keeps walking for a long time?
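A minimal sketch of the update above: the distribution is a row vector that is multiplied by P once per step, which matches multiplying the start distribution by a power of P. The 3-node matrix and the name `distribution_after` are assumptions for illustration.

```python
import numpy as np

# Hypothetical row-stochastic transition matrix.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])

x0 = np.array([1.0, 0.0, 0.0])  # surfer starts at node 0

def distribution_after(x0, P, t):
    """Apply x <- x P for t steps."""
    x = x0.copy()
    for _ in range(t):
        x = x @ P
    return x

x3 = distribution_after(x0, P, 3)
x3_closed = x0 @ np.linalg.matrix_power(P, 3)  # same thing as x0 P^3
print(x3)  # [0.5 0.5 0. ]
```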

Property of Adjacency Matrix

What does A × A (that is, $A^2$) represent in the graph?

Similarly, what does P × P represent for the transition matrix?

[Figure: adjacency matrix A of the example graph]
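One way to explore the question numerically: entry (i, j) of the squared adjacency matrix counts the walks of length 2 from i to j, and entry (i, j) of the squared transition matrix is the probability of going from i to j in exactly two steps. The graph below is a hypothetical example.

```python
import numpy as np

# Hypothetical directed graph: 0 -> 1, 1 -> 0, 1 -> 2, 2 -> 0
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [1, 0, 0]])

A2 = A @ A  # A2[i, j] = number of 2-step walks from i to j
# A2[0, 2] == 1: the only 2-step walk from 0 to 2 is 0 -> 1 -> 2

P = A / A.sum(axis=1, keepdims=True)
P2 = P @ P  # P2[i, j] = probability of reaching j from i in 2 steps

print(A2)
print(P2)
```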

What is a stationary distribution? Intuitively and Mathematically

• The stationary distribution at a node is related to the amount of time a random walker spends visiting that node.

• Remember that we can write the probability distribution over nodes as
  – $x_{t+1} = x_t P$

• For the stationary distribution $v_0$ we have
  – $v_0 = v_0 P$

• Whoa! That's just the left eigenvector of the transition matrix with eigenvalue 1!
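A sketch of computing the stationary distribution as a left eigenvector, assuming NumPy and a small irreducible, aperiodic chain. Left eigenvectors of P are obtained as right eigenvectors of its transpose. The matrix and the name `stationary_distribution` are illustrative assumptions.

```python
import numpy as np

# Hypothetical irreducible, aperiodic chain (cycles of length 2 and 3).
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])

def stationary_distribution(P):
    # Left eigenvectors of P are right eigenvectors of P.T.
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1.0))  # eigenvalue closest to 1
    v = np.real(eigvecs[:, k])
    return v / v.sum()                    # normalize to a distribution

v0 = stationary_distribution(P)
print(v0)  # [0.4 0.4 0.2]
```

Dividing by the sum also removes the arbitrary sign and scale that an eigen-solver may return.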

Eigenvalue and Eigenvector?

The basic equation is $Ax = \lambda x$. The number $\lambda$ is an eigenvalue of A, and x is the corresponding eigenvector.

Interesting questions

• Does a stationary distribution always exist? Is it unique?
  – Yes, if the graph is “well-behaved”.

• What is “well-behaved”?
  – We shall talk about this soon.

• How fast will the random surfer approach this stationary distribution?
  – Mixing Time!

Well behaved graphs

• Irreducible: There is a path from every node to every other node.

[Figure: two example graphs, one irreducible and one not irreducible]

What about a connected undirected graph?

Well behaved graphs

• Aperiodic: The GCD of all cycle lengths is 1. The GCD is also called the period.

[Figure: two example graphs, one aperiodic and one with period 3]
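A sketch of checking both "well-behaved" conditions on a small directed graph. Irreducibility is tested as strong connectivity (every node reachable from a start node in the graph and in its reverse). The period uses a standard breadth-first-search trick: for an irreducible graph, the GCD of all cycle lengths equals the GCD of dist(u) + 1 - dist(w) over all edges u -> w. The function names and example graphs are assumptions for illustration.

```python
from collections import deque
from math import gcd

def is_irreducible(adj):
    """adj maps each node to its list of successors."""
    def reaches_all(graph):
        start = next(iter(graph))
        seen, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return len(seen) == len(graph)

    reverse = {u: [] for u in adj}
    for u, succs in adj.items():
        for w in succs:
            reverse[w].append(u)
    return reaches_all(adj) and reaches_all(reverse)

def period(adj):
    """GCD of all cycle lengths; assumes the graph is irreducible."""
    start = next(iter(adj))
    dist, queue, g = {start: 0}, deque([start]), 0
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
            g = gcd(g, dist[u] + 1 - dist[w])
    return g

triangle = {0: [1], 1: [2], 2: [0]}   # a single 3-cycle
mixed = {0: [1], 1: [0, 2], 2: [0]}   # cycles of length 2 and 3
chain = {0: [1], 1: []}               # no path back from 1 to 0

print(period(triangle), period(mixed), is_irreducible(chain))  # 3 1 False
```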

Implications of the Perron Frobenius Theorem

• If a Markov chain is irreducible and aperiodic then the largest eigenvalue of the transition matrix will be equal to 1 and all the other eigenvalues will be strictly less than 1 in absolute value.
  – Let the eigenvalues of P be $\{\sigma_i \mid i = 0, \dots, n-1\}$ in non-increasing order of $|\sigma_i|$.
  – $\sigma_0 = 1 > |\sigma_1| \ge |\sigma_2| \ge \dots \ge |\sigma_{n-1}|$

• These results imply that for a well-behaved graph there exists a unique stationary distribution.

• More details when we discuss PageRank.

Some fun stuff about undirected graphs

• A connected undirected graph is irreducible

• A connected non-bipartite undirected graph has a stationary distribution proportional to the node degrees!

• Makes sense, since the larger the degree of a node, the more likely a random walk is to come back to it.
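A quick numerical check of the degree claim, assuming a small connected, non-bipartite undirected graph (a triangle with one pendant node, chosen only for illustration). The candidate distribution is degree divided by the sum of degrees.

```python
import numpy as np

# Hypothetical undirected graph: triangle 0-1-2 plus pendant node 3 on node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

deg = A.sum(axis=1)
P = A / deg[:, None]       # transition matrix of the simple random walk
pi = deg / deg.sum()       # candidate stationary distribution

# One step of the walk leaves pi unchanged.
one_step = pi @ P

# A long walk from node 0 ends up distributed like pi.
x_long = np.array([1.0, 0.0, 0.0, 0.0]) @ np.linalg.matrix_power(P, 200)

print(pi)  # [0.25  0.25  0.375 0.125]
```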

Proximity measures from random walks

• How long does it take to hit node b in a random walk starting at node a? --- Hitting time.

• How long does it take to hit node b and come back to node a? --- Commute time.

Hitting and Commute times

• Hitting time from node i to node j
  – Expected number of hops to hit node j starting at node i.
  – Is not symmetric: in general $h(a,b) \ne h(b,a)$.
  – $h(i,j) = 1 + \sum_{k \in \mathrm{nbs}(i)} p(i,k)\,h(k,j)$ for $i \ne j$, with $h(j,j) = 0$
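A sketch of computing hitting times from the recurrence h(i,j) = 1 + Σ_k p(i,k) h(k,j) with h(j,j) = 0. For a fixed target j this is a linear system over the other nodes. The path graph and the name `hitting_times_to` are illustrative assumptions.

```python
import numpy as np

def hitting_times_to(P, j):
    """Expected number of steps h(i, j) to first reach j from each i."""
    n = P.shape[0]
    others = [i for i in range(n) if i != j]
    # Restrict P to nodes other than j and solve (I - Q) h = 1.
    Q = P[np.ix_(others, others)]
    h_others = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    h = np.zeros(n)          # h[j] stays 0
    h[others] = h_others
    return h

# Hypothetical undirected path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)

h_to_1 = hitting_times_to(P, 1)
h_to_0 = hitting_times_to(P, 0)
print(h_to_1[0], h_to_0[1])  # h(0,1) = 1.0 but h(1,0) = 3.0
```

The end node 0 always steps to node 1, while node 1 may wander to node 2 first, which is why the two directions differ.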

Hitting and Commute times

• Commute time between node i and j

– Is expected time to hit node j and come back to i

– c(i,j) = h(i,j) + h(j,i)

– Is symmetric. c(a,b) = c(b,a)
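A sketch of computing all commute times at once for a connected undirected graph. It relies on a known identity that is not stated on the slide: c(i,j) equals the sum of degrees times (L⁺_ii + L⁺_jj - 2 L⁺_ij), where L⁺ is the pseudo-inverse of the graph Laplacian. The graph and the name `commute_times` are assumptions for illustration.

```python
import numpy as np

def commute_times(A):
    """All-pairs commute times for a connected undirected graph."""
    deg = A.sum(axis=1)
    L = np.diag(deg) - A            # graph Laplacian
    Lp = np.linalg.pinv(L)          # Moore-Penrose pseudo-inverse
    d = np.diag(Lp)
    # c(i,j) = vol(G) * (L+_ii + L+_jj - 2 L+_ij), vol(G) = sum of degrees
    return deg.sum() * (d[:, None] + d[None, :] - 2 * Lp)

# Hypothetical undirected path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

C = commute_times(A)
print(C[0, 1], C[0, 2])  # approximately 4.0 and 8.0
```

On this path graph h(0,1) = 1 and h(1,0) = 3, so c(0,1) = 4, which agrees with the first printed value.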

Random Walk (versions)

• PageRank
  – Personalized PageRank
  – Topic Sensitive PageRank

• Recommender Systems (My interests)

Recommender Networks

• For a customer node i define similarity as
  – the hitting time H(i,j)
  – the commute time C(i,j)
  – or the cosine similarity $\dfrac{L^{+}_{ij}}{\sqrt{L^{+}_{ii}\,L^{+}_{jj}}}$, where $L^{+}$ is the pseudo-inverse of the graph Laplacian

• Now the question is how to compute these quantities quickly for very large graphs.
  – Fast iterative techniques (Brand 2005)
  – Fast Random Walk with Restart (Tong, Faloutsos 2006)
  – Finding nearest neighbors in graphs (Sarkar, Moore 2007)

PageRank (Initial)

• Intuition
  – The PageRank of “A” is higher if the pages that link to “A” have higher PageRank
  – Models user behavior where a surfer clicks on links at random with no regard for content
  – A page's PageRank is not passed on in full to each page it links to, but is divided by the number of links on the page.

PageRank

• Intuitively: $v(i) = \sum_{j \to i} \dfrac{v(j)}{\deg_{out}(j)}$

• v works out to be the stationary distribution of the Markov chain corresponding to the web.

PageRank & Perron-Frobenius

• Perron-Frobenius only holds if the graph is irreducible and aperiodic.

• But how can we guarantee that for the web graph?
  – Do it with a small restart probability c.

• At any time step the random surfer
  – jumps (teleports) to any other node with probability c
  – jumps to its direct neighbors with total probability 1-c.

$\tilde{P} = (1-c)\,P + c\,U$, where $U_{ij} = \dfrac{1}{n}$ for all $i, j$

PageRank

• We are looking for the vector v s.t. $v = (1-c)\,vP + c\,r$

• r is a distribution over web pages.

• If r is the uniform distribution we get PageRank.

• What happens if r is non-uniform? Personalization.
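A minimal power-iteration sketch for the fixed point v = (1-c) vP + c r. The restart probability c = 0.15, the tolerance, the 3-node matrix, and the function name `pagerank` are assumptions chosen for illustration. A uniform r gives ordinary PageRank, and a non-uniform r gives a personalized vector.

```python
import numpy as np

def pagerank(P, r, c=0.15, tol=1e-12, max_iter=1000):
    """Iterate v <- (1 - c) v P + c r until v stops changing."""
    v = r.copy()
    for _ in range(max_iter):
        v_next = (1 - c) * (v @ P) + c * r
        if np.abs(v_next - v).sum() < tol:
            return v_next
        v = v_next
    return v

# Hypothetical row-stochastic transition matrix of a tiny web graph.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])

uniform = np.full(3, 1 / 3)
v = pagerank(P, uniform)                          # ordinary PageRank

prefer_page_2 = np.array([0.0, 0.0, 1.0])         # teleport only to page 2
v_personal = pagerank(P, prefer_page_2)           # personalized PageRank

print(v, v_personal)
```

Teleporting only to page 2 raises that page's score relative to the uniform case, which is the sense in which r personalizes the ranking.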

Personalized PageRank [1, 2, 3]

• The only difference is that we use a non-uniform teleportation distribution, i.e. at any time step teleport to a set of web pages.

• In other words we are looking for the vector v s.t. $v = (1-c)\,vP + c\,r$

• r is a non-uniform preference vector specific to a user.

• v gives “personalized views” of the web.

1. Jeh and Widom, Scaling Personalized Web Search, 2003

2. Haveliwala, Topic-sensitive PageRank, 2001

3. Fogaras and Rácz, Towards Scaling Fully Personalized PageRank, 2004

Topic-sensitive PageRank (Haveliwala '01)

• Divide the web pages into 16 broad categories

• For each category compute the biased personalized PageRank vector by uniformly teleporting to websites under that category.

• At query time the probability of the query being from each of the above categories is computed, and the final PageRank vector is computed as a linear combination of the biased PageRank vectors computed offline.
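A sketch of the offline/online split described above, under simplifying assumptions: two hypothetical topics instead of 16 categories, made-up query-time class probabilities, and a direct linear solve of v = (1-c) vP + c r instead of an iterative method. It also illustrates why the combination is valid: the solution is linear in r, so mixing the per-topic vectors equals solving once with the mixed preference vector.

```python
import numpy as np

def personalized_pagerank(P, r, c=0.15):
    """Solve v = (1 - c) v P + c r exactly as a linear system."""
    n = P.shape[0]
    return np.linalg.solve((np.eye(n) - (1 - c) * P).T, c * r)

# Hypothetical 4-page web graph.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)
n = P.shape[0]

# Hypothetical topics, each given as the set of pages in that category.
topics = {"sports": [0, 1], "news": [2, 3]}

def topic_vector(pages, n):
    r = np.zeros(n)
    r[pages] = 1 / len(pages)   # teleport uniformly within the topic
    return r

# Offline: one biased PageRank vector per topic.
offline = {t: personalized_pagerank(P, topic_vector(pages, n))
           for t, pages in topics.items()}

# Query time: hypothetical class probabilities for the query.
weights = {"sports": 0.7, "news": 0.3}
combined = sum(w * offline[t] for t, w in weights.items())

# Same result as solving directly with the mixed preference vector.
mixed_r = sum(w * topic_vector(topics[t], n) for t, w in weights.items())
direct = personalized_pagerank(P, mixed_r)
```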

Random Walk for Recommendations

• Collaborative Filtering by Shang et al.

• Graph
  – Vertices: Users (U), Items (I), Item Information (T) and User Profiles (P)
  – Edges with weights
    • u has a rating for i
    • i has a tag for t
    • u belongs to profile category p
    • u to u if they are connected in the social network

• Edge weight assignments

Random Walk for Recommendations

• ItemRank (IJCAI)
  – Connectivity/transition matrix
  – Preference vector
  – Decay factor: generally 0.85

Most of it from

• Purnamrita Sarkar, Random Walks on Graphs: An Overview

• László Lovász, Random Walks on Graphs: A Survey

• OBVIOUSLY: Wikipedia :D

• Ankit Agarwal, Random Walk on Graphs

http://www.cs.iastate.edu/~pavan/633/lec7.pdf

Thanks
