AOT LAB, DII, UNIPR: Social Network Analysis, Enrico Franchi ([email protected])

Social Network Analysis

May 11, 2015


Introductory presentation on static social network models.
Page 1: Social Network Analysis

AOT LAB, DII, UNIPR

SOCIAL NETWORK ANALYSIS

Enrico Franchi ([email protected])

Page 2: Social Network Analysis

SNA = Complex Network Analysis on Social Networks

Outline

• Notation & Metrics: Degree Distribution, Path Lengths, Transitivity
• Models: Random Graphs, Small-Worlds, Preferential Attachment
• Models Discussion
• Conclusion

Page 3: Social Network Analysis

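Network

$G = (V, E)$, $E \subset V^2$, $\{(x, x) \mid x \in V\} \cap E = \emptyset$

Adjacency Matrix

$A_{ij} = \begin{cases} 1 & \text{if } (i,j) \in E \\ 0 & \text{otherwise} \end{cases}$

Directed Network

$k_i^{in} = \sum_j A_{ji}$, $k_i^{out} = \sum_j A_{ij}$, $k_i = k_i^{in} + k_i^{out}$

Undirected Network

$A$ symmetric, $k_i = \sum_j A_{ji} = \sum_j A_{ij}$

Degree Distribution

$p_x = \frac{1}{n} \#\{ i \mid k_i = x \}$

Average Degree

$\langle k \rangle = n^{-1} \sum_{x \in V} k_x$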
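The definitions on this slide can be checked on a toy graph. The following is a minimal sketch, assuming the adjacency matrix is stored as a plain list of lists; the function names (`degrees`, `degree_distribution`, `average_degree`) are illustrative and not part of the original slides.

```python
from collections import Counter


def degrees(adj):
    """Return (k_in, k_out) for an adjacency matrix given as a list of lists."""
    n = len(adj)
    k_in = [sum(adj[j][i] for j in range(n)) for i in range(n)]   # column sums
    k_out = [sum(adj[i][j] for j in range(n)) for i in range(n)]  # row sums
    return k_in, k_out


def degree_distribution(ks):
    """p_x = fraction of nodes whose degree equals x."""
    n = len(ks)
    return {x: c / n for x, c in sorted(Counter(ks).items())}


def average_degree(ks):
    """Mean degree over all nodes."""
    return sum(ks) / len(ks)


# Undirected path graph 0 - 1 - 2 - 3: the matrix is symmetric,
# so in-degree and out-degree coincide with the degree.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
k_in, k_out = degrees(A)
p = degree_distribution(k_in)
k_mean = average_degree(k_in)
```

For the path graph the two end nodes have degree 1 and the two inner nodes have degree 2, so the distribution puts half of the mass on each value.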

Page 4: Social Network Analysis

Local Clustering Coefficient: $C_i = \binom{k_i}{2}^{-1} T(i)$

$T(i)$: # distinct triangles with $i$ as vertex

Clustering Coefficient: $C = \frac{1}{n} \sum_{i \in V} C_i$

Measure of Transitivity

$C = \frac{\text{number of closed paths of length 2}}{\text{number of paths of length 2}} = \frac{(\text{number of triangles}) \times 3}{\text{number of connected triples}}$
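The two definitions of C generally give different numbers on the same graph. The sketch below assumes an undirected graph stored as a dict of neighbor sets, and it assumes the convention that nodes with fewer than two neighbors contribute a local coefficient of 0 (the slide does not state a convention). All names are illustrative.

```python
from itertools import combinations


def triangles_at(neighbors, i):
    """T(i): number of distinct triangles that have i as a vertex."""
    return sum(1 for u, v in combinations(neighbors[i], 2) if v in neighbors[u])


def local_clustering(neighbors, i):
    """C_i = T(i) / binom(k_i, 2); assumed to be 0 when k_i < 2."""
    k = len(neighbors[i])
    if k < 2:
        return 0.0
    return triangles_at(neighbors, i) / (k * (k - 1) / 2)


def average_clustering(neighbors):
    """C as the mean of the local coefficients."""
    return sum(local_clustering(neighbors, i) for i in neighbors) / len(neighbors)


def transitivity(neighbors):
    """C as 3 * triangles / connected triples."""
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in neighbors.values())
    # Summing T(i) over all nodes counts every triangle three times.
    closed = sum(triangles_at(neighbors, i) for i in neighbors)
    return closed / triples if triples else 0.0


# Triangle 0-1-2 with a pendant node 3 attached to node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c_avg = average_clustering(g)   # (1 + 1 + 1/3 + 0) / 4
c_trans = transitivity(g)       # 3 closed triples out of 5
```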

Page 5: Social Network Analysis

Shortest Path Length and Diameter

Set of Adjacency Matrices: $(A, +, \cdot)$

$AB = A \mathbin{+.\cdot} B$, $[AB]_{ij} = \sum_k A_{ik} \cdot B_{kj}$ (scalar operations)

The matrix product depends on the operations of the semi-ring.

Other matrix products make sense: e.g., $(A, +, \wedge)$ or $(A, \wedge, +)$, where $\wedge$ is min.

We consider: $S_k(M) = M \mathbin{+.\wedge} (M_k \mathbin{\wedge.+} M^k)$

Shortest path lengths matrix: $L = (S_n \cdots S_1)(M)$

Diameter: $d = \max_{ij} L_{ij}$

Average shortest path: $\langle L_{ij} \rangle$
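One way to read the composition of the n steps is as the classic Floyd-Warshall update in the (min, +) semi-ring, M_ij ← min(M_ij, M_ik + M_kj). That reading is an interpretation of the slide, not a statement of the author's exact operator. The sketch below assumes an unweighted graph given as a 0/1 adjacency matrix; the function names are illustrative.

```python
INF = float("inf")


def min_plus_step(M, k):
    """One step, read here as M_ij <- min(M_ij, M_ik + M_kj)."""
    n = len(M)
    return [[min(M[i][j], M[i][k] + M[k][j]) for j in range(n)] for i in range(n)]


def shortest_path_lengths(adj):
    """Apply the n steps in sequence, starting from the one-hop distances."""
    n = len(adj)
    M = [[0 if i == j else (1 if adj[i][j] else INF) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        M = min_plus_step(M, k)
    return M


def diameter(L):
    """Largest finite entry of the shortest path lengths matrix."""
    return max(x for row in L for x in row if x != INF)


def average_shortest_path(L):
    """Mean of the finite off-diagonal entries."""
    vals = [L[i][j] for i in range(len(L)) for j in range(len(L))
            if i != j and L[i][j] != INF]
    return sum(vals) / len(vals)


# Undirected path graph 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
L = shortest_path_lengths(A)
d = diameter(L)
aspl = average_shortest_path(L)
```

Each step touches every matrix entry once, so the whole computation costs n steps of n² operations each.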

Page 6: Social Network Analysis

Computational Complexity of ASPL:

All pairs shortest path matrix based (parallelizable): $O(n^{3+\alpha})$, $\alpha \approx 3/4$
All pairs shortest path Bellman-Ford: $O(n^3)$
All pairs shortest path Dijkstra w. Fibonacci Heaps: $O(n^2 \log n + nm)$

Computing the CPL

$x = M_q(S)$: $q\#S$ elements are ≤ x and $(1-q)\#S$ are > x

$x = L_{q\delta}(S)$: $q\#S(1-\delta)$ elements are ≤ x and $(1-q)\#S(1-\delta)$ are > x

Huber Algorithm

$s = \frac{2}{q^2} \ln\frac{2}{\epsilon} \frac{(1-\delta)^2}{\delta^2}$

Let R be a random sample of S such that #R = s; then $L_{q\delta}(S) = M_q(R)$ with probability p = 1 − ε.
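The sampling idea can be sketched as follows, reading the sample size as s = (2/q²) ln(2/ε) (1−δ)²/δ². That reading is an assumption, because ε appears in the flattened formula only through the success probability 1 − ε. Here S is just a list of numbers; for the CPL it would hold shortest path lengths. The function names are illustrative.

```python
import math
import random


def huber_sample_size(q, delta, eps):
    """Sample size s, under the assumed reading of the slide's formula."""
    return math.ceil((2 / q**2) * math.log(2 / eps) * ((1 - delta) ** 2 / delta**2))


def quantile(values, q):
    """M_q: smallest element with at least a fraction q of the values <= it."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(q * len(ordered)) - 1)]


def approximate_quantile(values, q, delta, eps, rng):
    """Estimate the q-quantile of `values` from a random sample of size s."""
    s = min(huber_sample_size(q, delta, eps), len(values))
    return quantile(rng.sample(values, s), q)


rng = random.Random(0)
S = list(range(100_000))
s = huber_sample_size(0.5, 0.1, 0.05)
exact = quantile(S, 0.5)
approx = approximate_quantile(S, 0.5, 0.1, 0.05, rng)
```

The sample size depends only on q, δ and ε, not on #S, which is what makes the approach attractive on large networks.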

Page 7: Social Network Analysis

$s = \frac{2}{q^2} \ln\frac{2}{\epsilon} \frac{(1-\delta)^2}{\delta^2}$

Page 8: Social Network Analysis

[Figure: Facebook Hugs Degree Distribution (log-log plot)]

Nodes: 1322631
Edges: 1555597
m/n: 1.17
CPL: 11.74
Clustering Coefficient: 0.0527
Number of Components: 18987
Isles: 0
Largest Component Size: 1169456

For small k power-laws do not hold

For large k we have statistical fluctuations

Page 9: Social Network Analysis

[Figure: Power-Law, gamma=3 (log-log plot)]

Many networks have power-law degree distribution. $p_k \propto k^{-\gamma}$, $\gamma > 1$

• Citation networks

• Biological networks

• WWW graph

• Internet graph

• Social Networks

kr = ?

Page 10: Social Network Analysis

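Erdős-Rényi Random Graphs

$G(n, p)$, $G(n, m)$

$\Pr(A_{ij} = 1) = p$

Ensembles of Graphs

When we describe values of properties, we actually mean the expected value of the property

$d := \langle d \rangle = \sum_G \Pr(G) \cdot d(G) \propto \frac{\log n}{\log \langle k \rangle}$

$\Pr(G) = p^m (1-p)^{\binom{n}{2} - m}$

$\langle m \rangle = \binom{n}{2} p$, $\langle k \rangle = (n-1)p$

$p_k = \binom{n-1}{k} p^k (1-p)^{n-1-k}$

$p_k = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$ for $n \to \infty$

$C = \langle k \rangle (n-1)^{-1}$

Connectedness Threshold: $\log n / n$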
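A quick numerical check of the expected average degree (n − 1)p is easy to write. This is a minimal sketch of G(n, p) sampling, assuming an undirected graph without self-loops; `gnp` and the chosen values of n and p are illustrative.

```python
import random
from itertools import combinations


def gnp(n, p, rng):
    """Sample G(n, p): each of the binom(n, 2) pairs is linked with probability p."""
    neighbors = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


rng = random.Random(42)
n, p = 500, 0.02
g = gnp(n, p, rng)
mean_k = sum(len(nb) for nb in g.values()) / n  # measured average degree
expected_k = (n - 1) * p                        # ensemble average
```

A single sample fluctuates around the ensemble value, which is why the slide states properties as expectations over the ensemble.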

Page 11: Social Network Analysis

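Watts-Strogatz Model

In the modified model, we only add the edges.

$k_i = \kappa + s_i$ ($\kappa$: edges in the lattice; $s_i$: # added shortcuts)

$p_s = e^{-\kappa p} \frac{(\kappa p)^s}{s!}$

$p_k = e^{-\kappa p} \frac{(\kappa p)^{k-\kappa}}{(k-\kappa)!}$

$C = \frac{3(\kappa - 2)}{4(\kappa - 1) + 8\kappa p + 4\kappa p^2}$

CPL $\approx \frac{\log(np\kappa)}{\kappa^2 p}$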
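The modified model (shortcuts are only added, nothing is rewired) can be sketched as below. Assumptions: κ is even, one shortcut trial is made per lattice edge, shortcut endpoints are uniformly random distinct nodes, and repeated shortcuts between the same pair are allowed so that k_i = κ + s_i holds exactly. The name `modified_watts_strogatz` is illustrative.

```python
import random


def modified_watts_strogatz(n, kappa, p, rng):
    """Ring lattice of degree kappa plus random shortcuts; returns (lattice, s)."""
    half = kappa // 2
    lattice = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, half + 1):
            j = (i + d) % n
            lattice[i].add(j)
            lattice[j].add(i)
    s = [0] * n  # s[i]: number of shortcut ends attached to node i
    for _ in range(n * half):  # one trial per lattice edge
        if rng.random() < p:
            u = rng.randrange(n)
            v = rng.randrange(n)
            while v == u:  # assumed: no self-loops
                v = rng.randrange(n)
            s[u] += 1
            s[v] += 1
    return lattice, s


rng = random.Random(1)
n, kappa, p = 2000, 4, 0.1
lattice, s = modified_watts_strogatz(n, kappa, p, rng)
mean_s = sum(s) / n                                   # should be close to kappa * p
degrees = [len(lattice[i]) + s[i] for i in range(n)]  # k_i = kappa + s_i
```

With n κ/2 trials and two ends per shortcut, the mean number of shortcut ends per node is κp, which is the mean of the Poisson distribution for s.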

Page 12: Social Network Analysis

[Figure: Strogatz-Watts Model - 10000 nodes, k = 4. Curves: CPL(p)/CPL(0) and C(p)/C(0) as functions of p. Annotations: Short CPL Threshold, Large Clustering Coefficient Threshold]

Page 13: Social Network Analysis

[Figure © Matt Britt]

Page 14: Social Network Analysis

Barabási-Albert Model

BARABASI-ALBERT-MODEL(G, M0, STEPS)
    FOR K FROM 1 TO STEPS
        N0 ← NEW-NODE(G)
        ADD-NODE(G, N0)
        A ← MAKE-ARRAY()
        FOR N IN NODES(G)
            PUSH(A, N)
            FOR J FROM 1 TO DEGREE(N)
                PUSH(A, N)
        FOR J FROM 1 TO M0
            N ← RANDOM-CHOICE(A)
            ADD-LINK(N0, N)

$\Pr(V = x) = \sum_{e \in N(x)} \Pr(E = e) = \frac{k_x}{m} = \frac{2 k_x}{\sum_x k_x}$

$p_k \propto k^{-3}$

No analytical proof available

CPL $\approx \frac{\log n}{\log \log n}$

$C \approx n^{-3/4}$

Scale-free entails short CPL

Transitivity disappears with network size

Connectedness Threshold: $\frac{\log n}{\log \log n}$
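The pseudocode above can be turned into a small runnable sketch. It keeps the idea of an array in which every node appears once plus once per incident edge, so a uniform choice from the array is biased towards high-degree nodes. Three points are assumptions that the pseudocode does not state: the seed graph is a complete graph on m + 1 nodes, the array is built before the new node is added (so the new node cannot pick itself), and the m targets of a step are distinct.

```python
import random


def barabasi_albert(m, steps, rng):
    """Grow a graph by adding `steps` nodes, each with m links to existing nodes."""
    # Assumed seed: complete graph on m + 1 nodes.
    neighbors = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    for _ in range(steps):
        # Each node appears degree + 1 times, as in the pseudocode's array A.
        urn = [node for node, nb in neighbors.items() for _ in range(len(nb) + 1)]
        new = len(neighbors)
        targets = set()
        while len(targets) < m:  # assumed: distinct targets, no multi-edges
            targets.add(rng.choice(urn))
        neighbors[new] = set(targets)
        for t in targets:
            neighbors[t].add(new)
    return neighbors


rng = random.Random(7)
g = barabasi_albert(m=3, steps=200, rng=rng)
degree_sum = sum(len(nb) for nb in g.values())
max_degree = max(len(nb) for nb in g.values())
```

Rebuilding the array at every step costs time proportional to the number of edges; an implementation meant for large networks would append to it incrementally instead.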

Page 15: Social Network Analysis

OSN          Refs.            Users    Links     <k>     C      CPL    d γ r
Club Nexus   Adamic et al     2.5 K    10 K      8.2     0.2    4      13 n.a. n.a.
Cyworld      Ahn et al        12 M     191 M     31.6    0.2    3.2    16 -0.1
Cyworld T    Ahn et al        92 K     0.7 M     15.3    0.3    7.2    n.a. n.a. 0.4
LiveJournal  Mislove et al    5 M      77 M      17      0.3    5.9    20 0.2
Flickr       Mislove et al    1.8 M    22 M      12.2    0.3    5.7    27 0.2
Twitter      Kwak et al       41 M     1700 M    n.a.    n.a.   4      4.1 n.a.
Orkut        Mislove et al    3 M      223 M     106     0.2    4.3    9 1.5 0.1
Orkut        Ahn et al        100 K    1.5 M     30.2    0.3    3.8    n.a. 3.7 0.3
Youtube      Mislove et al    1.1 M    5 M       4.29    0.1    5.1    21 -0
Facebook     Gjoka et al      1 M      n.a.      n.a.    0.2    n.a.   n.a. 0.23
FB H         Nazir et al      51 K     116 K     n.a.    0.4    n.a.   29 n.a.
FB GL        Nazir et al      277 K    600 K     n.a.    0.3    n.a.   45 n.a.
BrightKite   Scellato et al   54 K     213 K     7.88    0.2    4.7    n.a. n.a.
FourSquare   Scellato et al   58 K     351 K     12      0.3    4.6    n.a. n.a.
LiveJournal  Scellato et al   993 K    29.6 M    29.9    0.2    4.9    n.a. n.a.
Twitter      Java et al       87 K     829 K     18.9    0.1    n.a.   6 0.59
Twitter      Scellato et al   409 K    183 M     447     0.2    2.8    n.a. n.a.

Page 16: Social Network Analysis

      Static   Deg       C         Rigid
ER    Yes      Poisson   Low       -
WS    Yes      Poisson   Ok        Yes
BA    No       PL γ=3    Fixable   Yes

• Moreover:

• Mostly no navigability

• Uniformity assumption

• Sometimes too complex for analytic study

• Few features studied

• Power-law?

Page 17: Social Network Analysis

Alternative models for degree distributions

Power-laws are difficult to fit. When they do fit, there are often better distributions.

Power-law with cutoff almost always fits better than plain power-law.

$f(x; \gamma, \beta) = x^{-\gamma} e^{\beta x}$

Sometimes the log-normal distribution is more appropriate

$f(x; \sigma, m) = \frac{1}{x \sigma (2\pi)^{1/2}} \exp\left( -\frac{(\log(x/m))^2}{2\sigma^2} \right)$

Most of the time random and preferential attachment processes concur

$F(x; r) = 1 - (rm)^{1+r} (x + rm)^{-(1+r)}$

$r \to 0$: scale-free

$r \to \infty$: negative exponential dist.
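The two limits of F(x; r) can be checked numerically. The sketch below rewrites F as 1 − (1 + x/(rm))^(−(1+r)), which is algebraically the same expression but avoids overflow for large r. The name `hybrid_cdf` and the sample values are illustrative.

```python
import math


def hybrid_cdf(x, r, m):
    """F(x; r) = 1 - (rm)^(1+r) (x + rm)^-(1+r), written in a numerically stable form."""
    return 1.0 - math.exp(-(1 + r) * math.log1p(x / (r * m)))


m = 2.0

# Large r: the CDF approaches the negative exponential 1 - exp(-x / m).
large_r = hybrid_cdf(3.0, 1e6, m)
exponential = 1.0 - math.exp(-3.0 / m)

# Small r: the tail 1 - F decays as a power of x, so its slope on a
# log-log plot is close to -(1 + r).
r = 0.01
tail_100 = 1.0 - hybrid_cdf(100.0, r, m)
tail_1000 = 1.0 - hybrid_cdf(1000.0, r, m)
slope = (math.log(tail_1000) - math.log(tail_100)) / math.log(10.0)
```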

Page 18: Social Network Analysis

Milgram’s Experiment

[Figure: map with Omaha (Nebraska), Wichita (Kansas) and Boston (Massachusetts)]

• Random people from Omaha & Wichita were asked to send a postcard to a person in Boston:

• Write the name on the postcard

• Forward the message only to personally known people who were more likely to know the target

1st run: 64/296 arrived, most delivered to him by 2 men

2nd run: 24/160 arrived, 2/3 delivered by “Mr. Jacobs”

2 ≤ hops ≤ 10; µ=5.x

6 Degrees

CPL, hubs, ... and Kleinberg’s Intuition

Page 19: Social Network Analysis

Biased Preferential Attachment

At each step:

A new node is added to the network and is assigned to one of the sets P, I and L according to a probability distribution h

$e_0 \in \mathbb{N}^+$ edges are added to the network

for each edge (u,v) u is chosen with distribution $D_0$ and:

if u ∈ I, v is a new node and is assigned to P;

if u ∈ L, v is chosen according to $D_\gamma$.

$D_\beta(u) \propto \begin{cases} (\beta + 1)(k_u + 1) & u \in L \\ k_u + 1 & u \in I \\ 0 & u \in P \end{cases}$

No analytic results available.
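Sampling a node according to D_β amounts to a weighted random choice. The following is a minimal sketch, assuming node degrees and set membership are kept in two dicts; the data layout and the function names are hypothetical and not taken from the slides.

```python
import random


def d_beta_weights(beta, degree, group):
    """Unnormalized D_beta weights: boosted for L, plain for I, zero for P."""
    weights = {}
    for u, k in degree.items():
        if group[u] == "L":
            weights[u] = (beta + 1) * (k + 1)
        elif group[u] == "I":
            weights[u] = k + 1
        else:  # group "P" is never chosen
            weights[u] = 0
    return weights


def sample_node(beta, degree, group, rng):
    """Draw one node with probability proportional to its D_beta weight."""
    weights = d_beta_weights(beta, degree, group)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[u] for u in nodes], k=1)[0]


degree = {"a": 1, "b": 3, "c": 5}
group = {"a": "L", "b": "I", "c": "P"}
w = d_beta_weights(1, degree, group)
rng = random.Random(3)
draws = [sample_node(1, degree, group, rng) for _ in range(200)]
```

With β = 1 the low-degree node in L and the higher-degree node in I end up with the same weight, which illustrates how the bias can compensate for degree.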

Page 20: Social Network Analysis

Transitive Linking Model [Davidsen 02]

• At each step:
  TL: a random node is chosen, and it introduces two other nodes that are linked to it; if the node does not have 2 edges, it introduces itself to a random node
  RM: with probability p a node is chosen and removed along with its edges and replaced with a node with one random edge

• When $p \ll 1$ the TL dominates the process:
  • the degree distribution is a power-law with cutoff
  • $1 - C = p(\langle k \rangle - 1)$, i.e., quite large in practice

• For larger values of p the two different processes concur to form an exponential degree distribution

• for $p \approx 1$ the degree distribution is essentially a Poisson distribution

Source: Bergenti, Franchi, Poggi (Univ. Parma), Models for Agent-based Simulation of SN, SNAMAS ’11

Instead of p it would make sense to have distinct p and r parameters for nodes leaving and entering the network

Few analytic results available.
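The two processes can be simulated in a few lines. This is a rough sketch under stated assumptions: the number of nodes stays fixed, a removed node is modelled by clearing its edges and reusing its identifier, and a random pick that would create a self-loop is simply skipped. None of these details are specified on the slide, and the function name is illustrative.

```python
import random


def transitive_linking(n, p, steps, rng):
    """Run `steps` rounds of TL followed by RM (with probability p) on n nodes."""
    neighbors = {i: set() for i in range(n)}

    def link(u, v):
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    for _ in range(steps):
        # TL: a random node introduces two of its neighbors to each other,
        # or introduces itself to a random node if it has fewer than 2 edges.
        i = rng.randrange(n)
        if len(neighbors[i]) >= 2:
            a, b = rng.sample(sorted(neighbors[i]), 2)
            link(a, b)
        else:
            link(i, rng.randrange(n))
        # RM: with probability p a node is removed with its edges and
        # replaced by a node with one random edge.
        if rng.random() < p:
            r = rng.randrange(n)
            for nb in neighbors[r]:
                neighbors[nb].discard(r)
            neighbors[r] = set()
            link(r, rng.randrange(n))
    return neighbors


rng = random.Random(11)
g = transitive_linking(n=300, p=0.05, steps=20_000, rng=rng)
mean_k = sum(len(nb) for nb in g.values()) / len(g)
```

Varying p in such a simulation is a simple way to observe the regimes listed above, since few analytic results are available.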

Page 21: Social Network Analysis

[1] Dorogovtsev, S. N. and Mendes, J. F. F. 2003. Evolution of Networks: From Biological Nets to the Internet and WWW (Physics). Oxford University Press, USA.

[2] Watts, D. J. 2003. Small Worlds: The Dynamics of Networks between Order and Randomness (Princeton Studies in Complexity). Princeton University Press.

[3] Jackson, M. O. 2010. Social and Economic Networks. Princeton University Press.

[4] Newman, M. 2010. Networks: An Introduction. Oxford University Press, USA.

[5] Wasserman, S. and Faust, K. 1994. Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press.

[6] Scott, J. P. 2000. Social Network Analysis: A Handbook. Sage Publications Ltd.

[7] Kepner, J. and Gilbert, J. 2011. Graph Algorithms in the Language of Linear Algebra (Software, Environments, and Tools). Society for Industrial & Applied Mathematics.

[8] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2009. Introduction to Algorithms. The MIT Press.

[9] Skiena, S. S. 2010. The Algorithm Design Manual. Springer.

[10] Bollobas, B. 1998. Modern Graph Theory. Springer.

[11] Watts, D. J. and Strogatz, S. H. 1998. Collective dynamics of ‘small-world’ networks. Nature. 393, 6684, 440-442.

[12] Barabási, A. L. and Albert, R. 1999. Emergence of scaling in random networks. Science. 286, 5439, 509.

[13] Kleinberg, J. 2000. The small-world phenomenon: an algorithm perspective. Proceedings of the thirty-second annual ACM symposium on Theory of computing. 163-170.

[14] Milgram, S. 1967. The small world problem. Psychology today. 2, 1, 60-67.

Page 22: Social Network Analysis

Thanks for your kind attention.

Enrico Franchi ([email protected])
AOTLAB, Dipartimento Ingegneria dell’Informazione, Università di Parma