Influence and Correlation in Social Networks Aris Anagnostopoulos Ravi Kumar Mohammad Mahdian

Dec 20, 2015


Influence and Correlation in Social Networks

Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian

Preliminaries

- Correlations exist in users' behaviors
- Representation:
    - individuals are nodes of a social graph, G
    - every node is "active" or "inactive"
- Formally, correlation = if u and v are adjacent in G, the event that u becomes active is correlated with v becoming active
- Want to distinguish between different sources of social correlation

Models of Social Correlation

- Homophily = tendency for individuals to choose friends with similar characteristics / preferences
- Confounding = external influence from elements in the environment (confounding factors)
- Influence = the action of one individual induces another individual to act in a similar way

Motivation

- Useful to know when social influence is the source of correlation
- Viral marketing -> want to target select individuals
- Influence behavior -> create "role models" (e.g. in fashion)
- We want to identify situations when such techniques can be applied
- Also useful for analysis (predicting future state of network)

Modeling Influence

1. Graph G is drawn according to some distribution
2. In each of the time steps 1, ..., T, each non-active agent decides whether to become active
3. An agent becomes active with probability p(a), a function of a, the number of its neighbors that are already active

$$p(a) = \frac{e^{\alpha \ln(a+1) + \beta}}{1 + e^{\alpha \ln(a+1) + \beta}}$$

or, alternatively,

$$\ln \frac{p(a)}{1 - p(a)} = \alpha \ln(a+1) + \beta$$
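As a minimal sketch of the activation probability, assuming the logistic form in which the log-odds of p(a) equal α ln(a+1) + β (the function names here are illustrative, not from the slides):

```python
import math

def activation_probability(a, alpha, beta):
    """Probability that an inactive node with `a` active neighbors activates.

    Assumes the logistic form: log-odds = alpha * ln(a + 1) + beta.
    """
    z = alpha * math.log(a + 1) + beta
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Inverse view of the same relation: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

if __name__ == "__main__":
    # With alpha > 0, the probability grows with the number of active neighbors.
    for a in range(4):
        print(a, activation_probability(a, alpha=1.0, beta=-3.0))
```

With α = 0 the probability no longer depends on a, which is why α is read as the measure of social correlation.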

Some remarks...

- The coefficient α measures social correlation
- Since actions are stored, a counts the neighbors that became active at any earlier time step
- This model is relatively simplistic:
    - the probability does not vary between nodes
    - or as time passes
- However, these simplifying assumptions are practical

Estimating α, β

- Can estimate using maximum likelihood logistic regression
- Maximize the expression

$$\prod_a p(a)^{Y_a} \, (1 - p(a))^{N_a}$$

where $Y_a = \sum_t Y_{a,t}$ and $Y_{a,t}$ is the number of users who at the beginning of time step $t$ had $a$ active friends and became active at time $t$; $N_a$ is defined in the same way for the users who did not become active.
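The estimation step can be sketched as follows, assuming the standard binomial likelihood for logistic regression with ln(a+1) as the explanatory variable. The input format (a dictionary mapping a to a pair of counts: users who became active, users who stayed inactive) and the use of Newton's method are assumptions made for this illustration; any logistic regression routine would serve.

```python
import math

def fit_alpha_beta(counts, iterations=50):
    """Maximum likelihood fit of (alpha, beta) by Newton's method.

    counts: dict mapping a -> (Y_a, N_a), where Y_a users with `a` active
    friends became active and N_a did not (hypothetical input format).
    """
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for a, (y, n) in counts.items():
            x = math.log(a + 1)
            p = 1.0 / (1.0 + math.exp(-(alpha * x + beta)))
            total = y + n
            resid = y - total * p          # gradient contribution
            w = total * p * (1.0 - p)      # curvature contribution
            g_a += resid * x
            g_b += resid
            h_aa += w * x * x
            h_ab += w * x
            h_bb += w
        det = h_aa * h_bb - h_ab * h_ab
        if abs(det) < 1e-12:
            break  # degenerate data (e.g. a single value of a)
        # Solve the 2x2 Newton system H * step = gradient.
        alpha += (h_bb * g_a - h_ab * g_b) / det
        beta += (h_aa * g_b - h_ab * g_a) / det
    return alpha, beta

if __name__ == "__main__":
    example = {0: (50, 950), 1: (80, 920), 2: (110, 890), 3: (130, 870)}
    print(fit_alpha_beta(example))
```

The fit needs at least two distinct values of a; otherwise α and β cannot be separated and the loop stops early.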

The Shuffle Test

- Idea: if influence does not play a role, then the activation times of different users should be independent of each other:

Pr(a active before b) = Pr(b active before a)

1. Estimate α for the initial graph
2. Randomly permute the order in which active nodes have been activated: set the activation time of node $w_i$ to $t_{\pi(i)}$, where $t_1, \dots, t_l$ are the original activation times of the active nodes $w_1, \dots, w_l$ and $\pi$ is a random permutation of $\{1, \dots, l\}$
3. Estimate α' for this configuration
4. If the values of α and α' are close to each other, the data show little or no social influence
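The permutation step of the shuffle test can be sketched as below. The dictionary representation (node -> activation time) and the function name are assumptions for illustration; α' would then be estimated by re-running the same regression on the shuffled times.

```python
import random

def shuffle_activation_times(activation_time, rng=None):
    """Return a copy in which the observed activation times are randomly
    permuted among the nodes that became active.

    The set of active nodes and the multiset of times stay the same;
    only the assignment of times to nodes changes.
    """
    if rng is None:
        rng = random.Random()
    nodes = list(activation_time)
    times = [activation_time[v] for v in nodes]
    rng.shuffle(times)
    return dict(zip(nodes, times))

if __name__ == "__main__":
    observed = {"u": 1, "v": 4, "w": 9}
    print(shuffle_activation_times(observed, random.Random(0)))
```

Because the graph and the set of active nodes are untouched, any change in the estimate can only come from the timing of activations.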

The Edge-reversal Test

1. Reverse the direction of all the edges
2. Run the same logistic regression on the data using the new graph

If correlation is not due to influence, then α should not change
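A minimal sketch of the edge-reversal step, together with the counts that the regression would consume. It assumes a directed edge (u, v) means that u is exposed to v's actions; that convention, the function names, and the data layout are illustrative assumptions.

```python
from collections import defaultdict

def reverse_edges(edges):
    """Flip every directed edge (u, v) into (v, u)."""
    return [(v, u) for (u, v) in edges]

def regression_counts(nodes, edges, activation_time, num_steps):
    """Return {a: (Y_a, N_a)}: over all time steps, how many inactive users
    with `a` active friends became active (Y_a) or stayed inactive (N_a).
    """
    friends = defaultdict(list)
    for u, v in edges:
        friends[u].append(v)  # assumption: u can see what v does
    counts = defaultdict(lambda: [0, 0])
    for t in range(1, num_steps + 1):
        for u in nodes:
            t_u = activation_time.get(u)
            if t_u is not None and t_u < t:
                continue  # already active before step t
            a = sum(1 for v in friends[u]
                    if activation_time.get(v) is not None
                    and activation_time[v] < t)
            if t_u == t:
                counts[a][0] += 1
            else:
                counts[a][1] += 1
    return {a: tuple(c) for a, c in counts.items()}

if __name__ == "__main__":
    nodes, edges = [1, 2], [(2, 1)]
    times = {1: 1, 2: 2}
    print(regression_counts(nodes, edges, times, 2))
    print(regression_counts(nodes, reverse_edges(edges), times, 2))
```

Comparing the α fitted on the original counts with the α fitted on the reversed-graph counts gives the test.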


Generative Models

- No Correlation

- Influence

- Correlation, no influence

Generative Models - No Correlation

- the network grows just as in the real data
- at every step, randomly pick n nodes and make them active
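A minimal sketch of the no-correlation model, under two simplifying assumptions that are not in the slides: the node set is fixed rather than growing, and the n nodes are drawn from those still inactive. The per-step counts are passed in as a list.

```python
import random

def simulate_no_correlation(nodes, per_step, rng=None):
    """Activate uniformly random inactive nodes, ignoring the graph.

    per_step[t - 1] is the number of nodes to activate at step t.
    Returns a dict node -> activation time.
    """
    if rng is None:
        rng = random.Random()
    inactive = list(nodes)
    activation_time = {}
    for t, n in enumerate(per_step, start=1):
        chosen = rng.sample(inactive, min(n, len(inactive)))
        for v in chosen:
            activation_time[v] = t
        chosen_set = set(chosen)
        inactive = [v for v in inactive if v not in chosen_set]
    return activation_time

if __name__ == "__main__":
    print(simulate_no_correlation(range(10), [2, 3], random.Random(0)))
```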

Influence Model

- the network grows just as in the real data
- at every step, every inactive node flips a coin and becomes active with probability p(a)
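The influence model can be sketched as a simple simulation, assuming a fixed graph, the logistic form with log-odds α ln(a+1) + β for the coin bias, and the convention that an edge (u, v) exposes u to v's actions. All three are assumptions made for this illustration.

```python
import math
import random

def simulate_influence(nodes, edges, alpha, beta, num_steps, rng=None):
    """At every step, each inactive node activates with probability p(a),
    where a is its number of friends that were active before the step.
    Returns a dict node -> activation time.
    """
    if rng is None:
        rng = random.Random()
    friends = {u: [] for u in nodes}
    for u, v in edges:
        friends[u].append(v)  # assumption: u can see what v does
    activation_time = {}
    for t in range(1, num_steps + 1):
        newly_active = []
        for u in nodes:
            if u in activation_time:
                continue
            a = sum(1 for v in friends[u] if v in activation_time)
            p = 1.0 / (1.0 + math.exp(-(alpha * math.log(a + 1) + beta)))
            if rng.random() < p:
                newly_active.append(u)
        # Apply activations after the sweep so that every node in step t
        # sees only the activations from earlier steps.
        for u in newly_active:
            activation_time[u] = t
    return activation_time

if __name__ == "__main__":
    ring = [(i, (i + 1) % 6) for i in range(6)]
    print(simulate_influence(range(6), ring, 2.0, -2.0, 5, random.Random(0)))
```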

Correlation, No Influence Model

- the network grows just as in the real data
- Pick a subset S of G:
    - randomly pick centers, add a ball of radius 2 from each to S
    - do this until |S| reaches parameter L
- Pick nodes to become active uniformly at random, from S
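The construction of S can be sketched as follows, assuming an undirected adjacency dictionary and a fixed graph; the function names and the choice to visit candidate centers in a random order (so the loop always terminates) are illustrative assumptions.

```python
import random
from collections import deque

def ball(adj, center, radius=2):
    """All nodes within `radius` hops of `center` (breadth-first search)."""
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        v, dist = frontier.popleft()
        if dist == radius:
            continue
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                frontier.append((w, dist + 1))
    return seen

def pick_correlated_set(adj, target_size, rng=None):
    """Add radius-2 balls around random centers until |S| >= target_size
    (or until every node has been used as a center)."""
    if rng is None:
        rng = random.Random()
    centers = list(adj)
    rng.shuffle(centers)
    S = set()
    for center in centers:
        if len(S) >= target_size:
            break
        S |= ball(adj, center)
    return S

def activate_from_set(S, per_step, rng=None):
    """Activate per_step[t - 1] uniformly random inactive nodes of S at step t."""
    if rng is None:
        rng = random.Random()
    pool = list(S)
    rng.shuffle(pool)
    activation_time = {}
    for t, n in enumerate(per_step, start=1):
        for _ in range(min(n, len(pool))):
            activation_time[pool.pop()] = t
    return activation_time
```

Active nodes end up clustered in the graph because S is built from balls, yet no node's activation depends on when its neighbors activated.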

Distinguishing Influence: Shuffle Test

[Figures: shuffle test results for the influence model and for the correlation model]

Distinguishing Influence: Edge Reversal

[Figures: edge-reversal test results for the correlation model and for the influence model]

Real Data: the Flickr Dataset

- analyzed 800K users over 16 months
- about 340K exhibited tagging behavior
- size of giant component: 160K
- 2.8M directed edges, 28.5% not mutual
- analyzed 1,700 tags independently
    - various types (event, color, object, etc.)
    - various numbers of users
    - various growth patterns (bursty, smooth, periodic)

Distinguishing Influence in Flickr

[Figure: shuffle test results]

[Figure: edge-reversal test results]

Some Influence

- can discover traces of influence by looking at similar tags
- for the tag "graffiti", the difference between the α values was 0
- however, for the misspelling "grafitti", the difference was slightly larger
- with the even less common misspelling "graffitti", the difference increased even more

Conclusions

- distinguishing between correlation and causation is difficult
- timing information can help answer the question (shuffle)
- knowledge of asymmetric social ties is also useful (edge-reversal)

Further research directions

- formal verification of results? (controlled experiments)
- quantification of the strength of influence?
- identify which nodes influence others
- what if social ties are symmetric?
- distinguishing between other forms of correlation
- distinguishing between different forms of social influence


Questions?