Chaos, Complexity, and Inference (36-462)
Lecture 21: More Networks: Models and Origin Myths

Cosma Shalizi

31 March 2009

Outline:
Erdős-Rényi Again
Watts-Strogatz Graphs
Exponential Family Random Graphs
Generative Models, Preferential Attachment
References

New Assignment: Implement Butterfly Mode in R

Real Agenda: Models of Networks, with Origin Myths

Erdős-Rényi Encore
Erdős-Rényi with Node Types
Watts-Strogatz “Small World” Graphs
Exponential-Family Random Graphs
Preferential Attachment

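Erdős-Rényi Again

$n$ nodes, edges are IID binary variables with probability $p$
Degree of node $i$ = $K_i$

\[ K_i \sim \mathrm{Binom}(n-1, p) \approx \mathrm{Pois}(np) \quad \text{(for large } n \text{ and small } p\text{)} \]

Problems: in real networks, typically
  Degree distribution: not Poisson
  Reciprocity: $\Pr(A_{ji} = 1 \mid A_{ij} = 1) \neq p$
  Transitivity: $\Pr(A_{ik} = 1 \mid A_{ij} = A_{jk} = 1) \neq p$
  Homophily/Assortativeness: $\Pr(A_{ij} = 1 \mid \mathrm{type}_i = \mathrm{type}_j) \neq p$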
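The baseline these problems are measured against can be checked by simulation. The following is a minimal sketch, assuming a directed graph stored as a 0/1 adjacency matrix; the function names and the choices $n = 300$, $p = 0.05$ are illustrative, not part of the lecture. In a homogeneous E-R graph the mean degree should sit near $(n-1)p$, the degree variance near the mean, and the reciprocity near $p$.

```python
import random

def simulate_er(n, p, seed=0):
    """Directed E-R graph as an adjacency matrix: each ordered pair (i, j),
    i != j, gets an edge independently with probability p."""
    rng = random.Random(seed)
    return [[1 if i != j and rng.random() < p else 0 for j in range(n)]
            for i in range(n)]

def out_degrees(a):
    """Out-degree K_i of every node."""
    return [sum(row) for row in a]

def reciprocity(a):
    """Empirical Pr(A_ji = 1 | A_ij = 1): share of edges whose reverse exists."""
    n = len(a)
    edges = sum(a[i][j] for i in range(n) for j in range(n))
    mutual = sum(a[i][j] * a[j][i] for i in range(n) for j in range(n))
    return mutual / edges

n, p = 300, 0.05          # illustrative values, not from the slides
a = simulate_er(n, p, seed=1)
degrees = out_degrees(a)
mean_degree = sum(degrees) / n
var_degree = sum((k - mean_degree) ** 2 for k in degrees) / n

print("mean degree:", mean_degree, "vs (n-1)p =", (n - 1) * p)
print("degree variance:", var_degree)  # Poisson-like: close to the mean
print("reciprocity:", reciprocity(a), "vs p =", p)
```

Running the same three summaries on a real network and comparing them with these values is one quick way to see the listed problems.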

Inhomogeneous E-R Models

Give each node $i$ a type $T_i \in \{1, \ldots, k\}$
Mixing matrix $P_{ab}$ = probability of link from type $a$ to type $b$
Edges are still independent given type
Edges are not independent ignoring type
Example: $k = 2$, types uniform and independent

\[ P = \begin{bmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{bmatrix} \]

Obviously gives homophily

\begin{align*}
p = \Pr(A_{ij} = 1) &= 0.9 \Pr(T_i = T_j = 1) + 0.1 \Pr(T_i = 1, T_j = 2) \\
&\quad + 0.1 \Pr(T_i = 2, T_j = 1) + 0.9 \Pr(T_i = T_j = 2) \\
&= 0.9 \times 0.25 + 0.1 \times 0.25 + 0.1 \times 0.25 + 0.9 \times 0.25 = 0.5
\end{align*}
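The marginal edge probability above is a sum over type pairs, which is easy to reproduce for any mixing matrix. A small sketch, assuming independent types with probabilities `type_probs` (the function names are made up for illustration):

```python
from itertools import product

def edge_probability(P, type_probs):
    """Marginal Pr(A_ij = 1) = sum over (a, b) of Pr(T_i=a) Pr(T_j=b) P[a][b],
    assuming node types are drawn independently."""
    k = len(type_probs)
    return sum(type_probs[a] * type_probs[b] * P[a][b]
               for a, b in product(range(k), repeat=2))

def same_type_edge_probability(P, type_probs):
    """Pr(A_ij = 1 | T_i = T_j), the homophily side of the comparison."""
    joint = sum(q * q * P[a][a] for a, q in enumerate(type_probs))
    same = sum(q * q for q in type_probs)
    return joint / same

P = [[0.9, 0.1],
     [0.1, 0.9]]
type_probs = [0.5, 0.5]

p = edge_probability(P, type_probs)
p_same = same_type_edge_probability(P, type_probs)
print("Pr(A_ij = 1) =", p)
print("Pr(A_ij = 1 | same type) =", p_same)
```

With the slide's matrix this gives 0.5 overall and 0.9 within a type, up to floating-point rounding, which is the homophily gap.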

Also gives reciprocity:

\begin{align*}
\Pr(A_{ji} = 1, A_{ij} = 1) &= 0.81 \Pr(T_i = T_j = 1) + 0.01 \Pr(T_i = 1, T_j = 2) \\
&\quad + 0.01 \Pr(T_i = 2, T_j = 1) + 0.81 \Pr(T_i = T_j = 2) \\
&= 0.41
\end{align*}

\[ \Pr(A_{ji} = 1 \mid A_{ij} = 1) = \frac{\Pr(A_{ji} = 1, A_{ij} = 1)}{\Pr(A_{ij} = 1)} = \frac{0.41}{0.5} = 0.82 > 0.5 \]

EXERCISE: Show that this model has transitivity of edges as well
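The reciprocity figure of 0.82 can also be checked by simulating the two-type model. This is a sketch under stated assumptions: the graph size, the seed, and the function names are arbitrary choices for illustration, and the estimate varies a little with the realized type counts.

```python
import random

def simulate_typed_er(n, P, type_probs, seed=0):
    """Draw independent node types, then draw each directed edge i -> j
    independently with probability P[T_i][T_j]."""
    rng = random.Random(seed)
    types = rng.choices(range(len(type_probs)), weights=type_probs, k=n)
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < P[types[i]][types[j]]:
                a[i][j] = 1
    return types, a

def edge_density(a):
    """Fraction of ordered pairs (i, j), i != j, that carry an edge."""
    n = len(a)
    return sum(map(sum, a)) / (n * (n - 1))

def reciprocity(a):
    """Empirical Pr(A_ji = 1 | A_ij = 1)."""
    n = len(a)
    edges = sum(map(sum, a))
    mutual = sum(a[i][j] * a[j][i] for i in range(n) for j in range(n))
    return mutual / edges

P = [[0.9, 0.1],
     [0.1, 0.9]]
types, a = simulate_typed_er(n=200, P=P, type_probs=[0.5, 0.5], seed=2)
print("edge density:", edge_density(a))  # compare with p = 0.5
print("reciprocity:", reciprocity(a))    # compare with 0.82
```

Edges here are independent given the types, so the excess reciprocity comes entirely from ignoring type, as the slide says.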

One direction for extending this: block models (“block” = type), indicating “type A gets links from type B, gives links to type C, never gets links from D or E...”
Community structure or modularity is a limiting case of this, where the mixing matrix has big diagonal entries, small off-diagonal ones
References: Reichardt and White (2007) for discovering block models; Clauset et al. (2007) for discovering hierarchies of modules; http://bactra.org/notebooks/community-discovery.html for references on community structure and community discovery

Watts-Strogatz “Small World” Graphs

Watts and Strogatz (1998)
Regular lattices have a lot of reciprocity and transitivity/clustering
but are “large worlds”: in $d$ dimensions, diameter $= O(n^{1/d}) \gg O(\log n)$
Somehow interpolate between lattices and E-R graphs to get all three properties
but work with undirected graphs for simplicity

Solution: start with regular lattice, add “long-range shortcuts” at random
First approach: for each edge, with probability $\rho$, re-wire one end of the edge to a uniformly random new node (avoiding self-loops)
  As $\rho \to 0$, go to regular lattice
  As $\rho \to 1$, go to E-R graph with same density as lattice
  Can create disconnected graphs
Second approach: add random edges without removing old ones
  Easier to manipulate, doesn't quite go to E-R as $\rho \to 1$
Will do more with this in the EXERCISES
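The first approach can be sketched in a few lines. This is an illustration under assumptions rather than the construction from Watts and Strogatz (1998) verbatim: the lattice is a one-dimensional ring with `k` neighbours on each side, the lower-numbered end of each edge is kept fixed, and duplicate edges are avoided as well as self-loops. All names are hypothetical.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Undirected ring: each node is linked to its k nearest neighbours per side."""
    edges = set()
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            edges.add((min(i, j), max(i, j)))
    return edges

def rewire(edges, n, rho, seed=0):
    """With probability rho, move one end of each original edge to a uniformly
    random node (assumption: no self-loops and no duplicate edges)."""
    rng = random.Random(seed)
    result = set(edges)
    for i, j in sorted(edges):
        if rng.random() < rho:
            candidates = [v for v in range(n)
                          if v != i and (min(i, v), max(i, v)) not in result]
            if candidates:
                new_j = rng.choice(candidates)
                result.remove((i, j))
                result.add((min(i, new_j), max(i, new_j)))
    return result

def mean_path_length(edges, n):
    """Average shortest-path length over reachable ordered pairs, by BFS."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    total = pairs = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

n = 200
lattice = ring_lattice(n, k=2)
small_world = rewire(lattice, n, rho=0.1, seed=3)
print("lattice mean path length:", mean_path_length(lattice, n))
print("rewired mean path length:", mean_path_length(small_world, n))
```

Because rewiring only moves edges, the edge count (and so the density) stays fixed while the typical path length drops.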

Exponential Family Random Graphs

Measure graph properties like density, reciprocity, transitivity; specify graph probabilities in terms of them
Exponential families are the easiest way to do this

\[ \Pr(X = x) = \frac{h(x) \exp\left\{\sum_{i=1}^{d} \theta_i T_i(x)\right\}}{\int dx\, h(x) \exp\left\{\sum_{i=1}^{d} \theta_i T_i(x)\right\}} = \frac{h(x) \exp\left\{\sum_{i=1}^{d} \theta_i T_i(x)\right\}}{Z(\theta)} \]

$T_i$ are sufficient statistics, $\theta_i$ are natural parameters
Acronym: ERGM, Exponential family Random Graph Model (“err-gim” or “err-gum”)

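E-R model is an exponential family:

\begin{align*}
\Pr(A = a) &= \prod_{i=1}^{n} \prod_{j \neq i} p^{a_{ij}} (1-p)^{1-a_{ij}} \\
&= p^{\sum_{ij} a_{ij}} (1-p)^{n(n-1) - \sum_{ij} a_{ij}} \\
&= (1-p)^{n(n-1)} \left(\frac{p}{1-p}\right)^{\sum_{ij} a_{ij}} \\
&= (1-p)^{n(n-1)} \exp\left\{\left(\log \frac{p}{1-p}\right) \sum_{ij} a_{ij}\right\}
\end{align*}

so $T = \sum_{ij} a_{ij}$, $\theta = \log \frac{p}{1-p}$, $Z(\theta) = (1-p)^{-n(n-1)}$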
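As a numerical sanity check on this rewriting, the product form and the exponential-family form can be evaluated on a small directed graph. The sketch below takes $h(x) = 1$ and uses an arbitrary 3-node example; the function names are illustrative.

```python
import math

def er_prob_direct(a, p):
    """Product over ordered pairs i != j of p^a_ij (1-p)^(1-a_ij)."""
    n = len(a)
    prob = 1.0
    for i in range(n):
        for j in range(n):
            if i != j:
                prob *= p if a[i][j] else (1 - p)
    return prob

def er_prob_expfam(a, p):
    """exp(theta * T(a)) / Z(theta) with T = edge count, theta = log(p/(1-p)),
    Z = (1-p)^(-n(n-1))."""
    n = len(a)
    theta = math.log(p / (1 - p))
    t = sum(a[i][j] for i in range(n) for j in range(n) if i != j)
    z = (1 - p) ** (-n * (n - 1))
    return math.exp(theta * t) / z

a = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
p = 0.3
theta = math.log(p / (1 - p))

print(er_prob_direct(a, p), er_prob_expfam(a, p))
# Z can also be written purely in terms of theta: (1 + e^theta)^(n(n-1))
print((1 + math.exp(theta)) ** 6, (1 - p) ** -6)
```

The second line of output uses $1/(1-p) = 1 + e^{\theta}$, which expresses $Z$ as a function of the natural parameter alone.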

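Exponential family models are easy to fit by maximum likelihood, if you can find $Z(\theta)$ or $E_\theta[T_i(X)]$

\begin{align*}
\frac{\partial \log \Pr(X = x)}{\partial \theta_i} &= \frac{\partial}{\partial \theta_i} \log h(x) + \frac{\partial}{\partial \theta_i} \sum_{j=1}^{d} \theta_j T_j(x) - \frac{\partial}{\partial \theta_i} \log Z(\theta) \\
&= 0 + T_i(x) - \frac{1}{Z(\theta)} \frac{\partial Z(\theta)}{\partial \theta_i}
\end{align*}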

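The last term is worth a look:

\begin{align*}
\frac{1}{Z(\theta)} \frac{\partial Z(\theta)}{\partial \theta_i} &= \frac{1}{Z(\theta)} \frac{\partial}{\partial \theta_i} \int dx\, h(x) \exp\left\{\sum_{j=1}^{d} \theta_j T_j(x)\right\} \\
&= \frac{1}{Z(\theta)} \int dx\, \frac{\partial}{\partial \theta_i} h(x) \exp\left\{\sum_{j=1}^{d} \theta_j T_j(x)\right\} \\
&= \frac{1}{Z(\theta)} \int dx\, h(x) \exp\left\{\sum_{j \neq i} \theta_j T_j(x)\right\} \frac{\partial}{\partial \theta_i} \exp\left\{\theta_i T_i(x)\right\} \\
&= \frac{1}{Z(\theta)} \int dx\, h(x) \exp\left\{\sum_{j \neq i} \theta_j T_j(x)\right\} T_i(x) \exp\left\{\theta_i T_i(x)\right\}
\end{align*}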

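continued:

\[ \frac{1}{Z(\theta)} \frac{\partial Z(\theta)}{\partial \theta_i} = \int dx\, T_i(x) \frac{h(x) \exp\left\{\sum_{j=1}^{d} \theta_j T_j(x)\right\}}{Z(\theta)} = E_\theta[T_i(X)] \]

Go back to the likelihood equation:

\begin{align*}
\frac{\partial \log \Pr(X = x)}{\partial \theta_i} &= T_i(x) - \frac{1}{Z(\theta)} \frac{\partial Z(\theta)}{\partial \theta_i} \\
&= T_i(x) - E_\theta[T_i(X)]
\end{align*}

The derivatives are zero at the MLE $\hat{\theta}$:

\[ T_i(x) = E_{\hat{\theta}}[T_i(X)] \]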
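The identity "score = observed statistic minus expected statistic" can be verified by brute force on a model small enough to enumerate. The sketch below is an illustration under assumptions: directed graphs on 3 nodes (64 graphs), $h(x) = 1$, and two statistics (edge count and number of mutual dyads) chosen for the example; the parameter values are arbitrary.

```python
import math
from itertools import product

N = 3
PAIRS = [(i, j) for i in range(N) for j in range(N) if i != j]
# Every directed graph on N nodes, as a dict: ordered pair -> 0/1
ALL_GRAPHS = [dict(zip(PAIRS, bits))
              for bits in product((0, 1), repeat=len(PAIRS))]

def stats(x):
    """Sufficient statistics (T_1, T_2) = (edge count, mutual dyad count)."""
    edges = sum(x.values())
    mutual = sum(x[(i, j)] * x[(j, i)] for (i, j) in PAIRS if i < j)
    return (edges, mutual)

def weight(x, theta):
    """Unnormalized probability exp(sum_j theta_j T_j(x)), taking h(x) = 1."""
    return math.exp(sum(t * s for t, s in zip(theta, stats(x))))

def partition(theta):
    """Z(theta), here a finite sum over all graphs instead of an integral."""
    return sum(weight(x, theta) for x in ALL_GRAPHS)

def log_prob(x, theta):
    return math.log(weight(x, theta)) - math.log(partition(theta))

def expected_stats(theta):
    z = partition(theta)
    return tuple(sum(weight(x, theta) * stats(x)[i] for x in ALL_GRAPHS) / z
                 for i in range(2))

def numeric_score(x, theta, i, eps=1e-6):
    """Central finite difference of log Pr(X = x) with respect to theta_i."""
    up, down = list(theta), list(theta)
    up[i] += eps
    down[i] -= eps
    return (log_prob(x, up) - log_prob(x, down)) / (2 * eps)

observed = {pair: 0 for pair in PAIRS}
observed.update({(0, 1): 1, (1, 0): 1, (1, 2): 1})
theta = (-0.5, 0.8)

for i in range(2):
    analytic = stats(observed)[i] - expected_stats(theta)[i]
    print(i, numeric_score(observed, theta, i), analytic)
```

Enumeration like this stops being feasible very quickly, since the number of directed graphs is $2^{n(n-1)}$.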

For E-R model, $E_\theta\left[\sum_{ij} A_{ij}\right] = n(n-1)p$, so

\[ \hat{p}_{MLE} = \frac{\sum_{ij} a_{ij}}{n(n-1)} \]

What about more complicated ERGMs?
“$p_1$ model”: sufficient statistics are total number of edges, and total number of reciprocal edges
Not so easy to solve but can be done (Wasserman and Faust, 1994; Hunter et al., 2008)
$p^*$: general ERGM, can add more features, homophily as such vs. reciprocity or transitivity as such...
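For the E-R case, the estimate is simply the observed edge density. A minimal sketch, assuming a directed graph given as a 0/1 adjacency matrix with a zero diagonal (the function name is made up for illustration):

```python
import math

def er_mle(a):
    """Maximum-likelihood p for a directed E-R graph: edges / (n(n-1))."""
    n = len(a)
    edges = sum(a[i][j] for i in range(n) for j in range(n) if i != j)
    return edges / (n * (n - 1))

a = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]           # 3 edges among 12 ordered pairs

p_hat = er_mle(a)
theta_hat = math.log(p_hat / (1 - p_hat))   # matching natural parameter
print(p_hat)       # 0.25
print(theta_hat)
```

At `p_hat` the expected edge count $n(n-1)\hat{p}$ equals the observed count, which is the moment-matching condition $T(x) = E_{\hat{\theta}}[T(X)]$ specialized to this model.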

Example of ERGMs Working

High school friendship network (Goodreau et al., 2005)

Fit model including homophily by sex, grade, race; also different overall probability of forming edges (“main effect”)

Best R package: statnet (on CRAN); see special issue (vol. 24) of the Journal of Statistical Software, http://www.jstatsoft.org/v24
The likelihood equations are generally not possible to solve exactly
Use simulation to approximate $Z(\theta)$ and/or $E_\theta[T(X)]$ (Hunter and Handcock, 2006)
Even then there can be pathologies from bad choice of model (e.g. model says probability of these network statistics is $10^{-50}$)
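To make "use simulation to approximate $E_\theta[T(X)]$" concrete, here is a toy Metropolis sampler. It is a sketch under assumptions, not the algorithm of Hunter and Handcock (2006) and not how statnet is implemented: the model has two statistics (edges and mutual dyads) with $h(x) = 1$, each step proposes toggling one uniformly chosen ordered pair, and the run lengths are arbitrary.

```python
import math
import random

def metropolis_expected_stats(n, theta, steps=50000, burn_in=5000, seed=0):
    """Estimate E_theta[(edges, mutual dyads)] for a directed graph on n nodes
    by single-dyad-toggle Metropolis sampling."""
    rng = random.Random(seed)
    a = [[0] * n for _ in range(n)]
    edges = mutual = 0
    sums = [0.0, 0.0]
    kept = 0
    for step in range(steps):
        i, j = rng.sample(range(n), 2)      # uniform ordered pair, i != j
        sign = -1 if a[i][j] else 1         # removing or adding the edge
        d_edges = sign
        d_mutual = sign * a[j][i]           # mutual count changes only if j -> i exists
        log_ratio = theta[0] * d_edges + theta[1] * d_mutual
        # Symmetric proposal, so accept with probability min(1, exp(log_ratio))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            a[i][j] = 1 - a[i][j]
            edges += d_edges
            mutual += d_mutual
        if step >= burn_in:
            sums[0] += edges
            sums[1] += mutual
            kept += 1
    return sums[0] / kept, sums[1] / kept

# Check against a case with a known answer: theta_mutual = 0 is plain E-R,
# where E[edges] = n(n-1)p with p = e^theta / (1 + e^theta).
er_theta = (math.log(0.3 / 0.7), 0.0)
er_est = metropolis_expected_stats(5, er_theta, seed=4)
recip_est = metropolis_expected_stats(5, (er_theta[0], 2.0), seed=4)
print("E-R estimate:", er_est, "exact edges:", 5 * 4 * 0.3)
print("with reciprocity bonus:", recip_est)
```

Fitting then amounts to adjusting $\theta$ until these simulated expectations match the observed statistics; checking the sampler first on a case with a closed form, as here, is a cheap safeguard.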

Some Important Weaknesses of ERGMs

1. Possible pathologies in fitting
2. “Statistics convenient for us to measure” $\neq$ “important causal variables”
3. Matching some statistics doesn't mean matching others (Hunter et al., 2008)
4. No origin myth/generative model (typically)

Some Generative Models

E-R model: edges appear and disappear independently over time (works whether or not homogeneous)

$p_1$ model: Markov chain; an edge in one direction makes adding the reverse edge more likely, and losing one edge makes the other tend to go away

Watts-Strogatz models: see Clauset and Moore (2003) for a semi-plausible story about adaptive re-wiring

E-R again: add nodes one by one; each node adds links to existing nodes independently with probability $p$

Preferential attachment: graphical version of the Yule-Simon process

Preferential Attachment

Made famous by Barabási and Albert (1999); Albert and Barabási (2002)
At each time-step a new node arrives
With probability $\rho$, new node $i$ makes an edge to old node $j$, picking $j \propto k_j$, the degree of $j$
With probability $1 - \rho$, $i$ links to a completely random node
This is exactly the Yule-Simon process that produces power law tails (Bornholdt and Ebel, 2001)
Apparently first applied to networks by Price (1965)
Will see more in the EXERCISES
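The growth rule above can be simulated directly. In this sketch several details are assumptions not fixed by the slide: the graph starts from a single edge between two nodes, each new node adds exactly one undirected edge, and $k_j$ is the total degree. Keeping a list with one entry per edge endpoint makes degree-proportional sampling a uniform draw from that list.

```python
import random

def preferential_attachment(n, rho, seed=0):
    """Grow a graph to n nodes. Each new node i links to one old node j:
    with probability rho, j is chosen proportionally to its degree k_j;
    otherwise j is uniform over the existing nodes."""
    rng = random.Random(seed)
    degree = [1, 1]        # assumed seed graph: one edge between nodes 0 and 1
    endpoints = [0, 1]     # node v appears here once per unit of degree
    edges = [(0, 1)]
    for i in range(2, n):
        if rng.random() < rho:
            j = rng.choice(endpoints)   # Pr(j) proportional to k_j
        else:
            j = rng.randrange(i)        # completely random old node
        edges.append((i, j))
        degree.append(1)
        degree[j] += 1
        endpoints.extend((i, j))
    return degree, edges

n = 5000
deg_pa, edges_pa = preferential_attachment(n, rho=1.0, seed=5)
deg_uniform, _ = preferential_attachment(n, rho=0.0, seed=5)
print("max degree, rho = 1:", max(deg_pa))
print("max degree, rho = 0:", max(deg_uniform))
```

The heavy tail shows up as a much larger maximum degree when $\rho = 1$ than when attachment is uniform; tabulating `deg_pa` on log-log axes is the usual next step.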

References

Albert, Réka and Albert-László Barabási (2002). “Statistical Mechanics of Networks.” Reviews of Modern Physics, 74: 47–97. URL http://arxiv.org/abs/cond-mat/0106096.

Barabási, Albert-László and Réka Albert (1999). “Emergence of Scaling in Random Networks.” Science, 286: 509–512. URL http://arxiv.org/abs/cond-mat/9910332.

Bornholdt, Stefan and Holger Ebel (2001). “World-Wide Web scaling exponent from Simon's 1955 model.” Physical Review E, 64: 035104. URL http://arxiv.org/abs/cond-mat/0008465.

Clauset, Aaron and Cristopher Moore (2003). “How Do Networks Become Navigable?” Physical Review Letters, submitted. URL http://www.arxiv.org/abs/cond-mat/0309415.

Clauset, Aaron, Cristopher Moore and Mark E. J. Newman (2007). “Structural Inference of Hierarchies in Networks.” In Statistical Network Analysis: Models, Issues, and New Directions (Edo Airoldi and David M. Blei and Stephen E. Fienberg and Anna Goldenberg and Eric P. Xing and Alice X. Zheng, eds.), vol. 4503 of Lecture Notes in Computer Science, pp. 1–13. New York: Springer-Verlag. URL http://arxiv.org/abs/physics/0610051.

Goodreau, Steven M., David R. Hunter and Martina Morris (2005). Statistical Modeling of Social Networks: Practical Advances and Results. Tech. Rep. 05-01, Center for Studies in Demography and Ecology, University of Washington. URL http://csde.washington.edu/downloads/05-01.pdf.

Hunter, David R., Steven M. Goodreau and Mark S. Handcock (2008). “Goodness of Fit of Social Network Models.” Journal of the American Statistical Association, 103: 248–258. URL http://www.csss.washington.edu/Papers/wp47.pdf. doi:10.1198/016214507000000446.

Hunter, David R. and Mark S. Handcock (2006). “Inference in curved exponential family models for networks.” Journal of Computational and Graphical Statistics, 15: 565–583. URL http://www.stat.psu.edu/%7Edhunter/papers/cef.pdf.

Price, Derek J. de Solla (1965). “Networks of Scientific Papers.” Science, 149.

Reichardt, Jörg and Douglas R. White (2007). “Role models for complex networks.” E-print, arxiv.org, 0708.0958. URL http://arxiv.org/abs/0708.0958.

Wasserman, Stanley and Katherine Faust (1994). Social Network Analysis: Methods and Applications. Cambridge, England: Cambridge University Press.

Watts, Duncan J. and Steven H. Strogatz (1998). “Collective Dynamics of ‘Small-World’ Networks.” Nature, 393: 440–442.
