An introduction to social network challenges

Arnaud Martin
Arnaud.Martin@univ-rennes1.fr

Université de Rennes 1 - IRISA - DRUID, Lannion, France

Paris, January 22nd, 2018


Outline

1. What is a social network?

2. How to model a social network?

3. How to model information on social networks?

4. How to analyse a social network?

What is a social network?

Collaborative platforms

A definition: a finite set of social actors (individuals, organisations) with relations (collaboration, advice, control, influence, etc.) between them.

Remarks:

- Technical definition

- Is it really always finite?

- Relations and actors are never fixed

- Most of the time there is not only one social network, and not only one kind of group (community)

Application domains

- Sociology
- Ethnology
- Economy
- Demography
- Criminal networks
- Social media
- Literature
- Ecology
- etc.

Notion of community

Social network: a matter of scale

- 31% of the world population is connected to a social network
- Facebook: 1.8 billion users/month, 17.9 billion $
- Qzone: 653 million users/month
- Instagram: 600 million users/month
- Twitter: 317 million users/month
- LinkedIn: 106 million users/month
- Snapchat: 150 million users/day

Social network: main challenges

- Economic challenges: games, advertising, business image, marketing (viral marketing), etc.

- Political challenges: social influence, e.g. the Jasmine Revolution, the Obama elections, Trump's tweets, etc.

- Social challenges: knowledge sharing (all information at any time), communication (to find a job, a partner, etc.), etc.

Social network: main scientific challenges

- Big data management: How to access the data? How to query the data? How to reduce the complexity of the processes? etc.

- Social mining: How to extract information from the data? How to characterise the data? etc.

- Privacy and security: How to protect people's data? How to ensure people's security? etc.

How to model a social network?

A graph $G$ is a pair $(V, E)$, with $V = \{v_1, \ldots, v_{|V|}\}$ a set of vertices (nodes) and $E = \{e_1, \ldots, e_{|E|}\}$ a set of edges (links); each $e_k \in E$ is a couple $(v_i, v_j)$.

- $|V|$: order of the graph (number of vertices)
- $|E|$: number of edges
- $v_i$ and $v_j$ are neighbours (adjacent) if $\exists e_k \in E$ such that $e_k = (v_i, v_j)$
- $N(u) = \{v \in V : (u, v) \in E\}$: the neighbourhood of $u$
- Node degree: $d(u) = |N(u)|$, i.e. the number of edges from $u$
- Centrality of a node: $\frac{d(u)}{|E| - 1}$
- Link density: $D = \frac{2|E|}{|V|(|V| - 1)}$

See Ernesto Estrada's talk for more graph features.

A graph (drawn on the slide) can be stored as an adjacency matrix:

        1  2  3  4  5  6
    1   1  1  0  0  0  0
    2   1  1  1  0  1  1
    3   0  1  1  0  1  0
    4   0  0  0  1  0  1
    5   0  1  1  0  1  0
    6   0  1  0  1  0  1

The same graph can be stored as a list of adjacent nodes:

    1: 2
    2: 1, 3, 5, 6
    3: 2, 5
    4: 6
    5: 2, 3
    6: 2, 4
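As an illustration, here is a minimal Python sketch of the two representations for the six-node example graph, together with the node degree and the link density defined earlier. The function names are hypothetical, and the diagonal of the matrix is set to 1 only to mirror the matrix shown on the slide.

```python
# Adjacency list of the six-node example graph (node -> neighbours).
adjacency = {
    1: [2],
    2: [1, 3, 5, 6],
    3: [2, 5],
    4: [6],
    5: [2, 3],
    6: [2, 4],
}


def to_matrix(adj):
    """Build a 0/1 adjacency matrix; the diagonal is 1, as on the slide."""
    nodes = sorted(adj)
    return [[1 if (u == v or v in adj[u]) else 0 for v in nodes] for u in nodes]


def degree(adj, u):
    """Node degree d(u) = |N(u)|."""
    return len(adj[u])


def density(adj):
    """Link density D = 2|E| / (|V| (|V| - 1)) of an undirected graph."""
    n_vertices = len(adj)
    # Each undirected edge appears in two neighbour lists.
    n_edges = sum(len(neighbours) for neighbours in adj.values()) // 2
    return 2 * n_edges / (n_vertices * (n_vertices - 1))


matrix = to_matrix(adjacency)
```

The list form stores only existing links, which is usually preferable for sparse graphs, while the matrix gives constant-time lookup of any pair of nodes.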


Challenge: drawing large graphs

Specificity of social networks

Most social networks are scale-free networks: their degree distribution follows a power law,

$$P(k) = C k^{-\gamma}$$

where $P(k)$ is the proportion of nodes with degree $k$, $C$ is a constant and, in general, $2 \le \gamma \le 3$. The density of a graph depends on the application domain (Melançon, 2006).

Figure: Flickr online social network (Scholz, 2015).

Specificity: Milgram paradox

(Milgram, 1967): On average, the number of links between two persons (nodes) is small (around 6).

(Facebook, 2011): Each person is linked to any other by 4.74 relations on average.

Specificity: small-world network

In social networks:

- The number of neighbours of a given node is approximately the same as the number of neighbours of its neighbours

- The distance $L$ between two randomly chosen nodes is given by: $L \simeq \ln |E|$

Distance on graph

- Geodesic distance between two vertices: the length (number of edges) of the shortest path between them

- Eccentricity: $\varepsilon(u)$ is the greatest geodesic distance between $u$ and another vertex

- Radius: $\min_{u \in V} \varepsilon(u)$

- Graph diameter: $\max_{u \in V} \varepsilon(u)$

Problem: detection of cycles - NP-hard algorithms

- Intermediary centrality of a node:

$$IC(u) = \sum_{s \neq u,\, t \neq u,\, s \neq t} \frac{\sigma_{st}(u)}{\sigma_{st}}$$

where $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, and $\sigma_{st}(u)$ is the number of shortest paths between $s$ and $t$ passing through $u$.
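The distance notions above can be sketched with a breadth-first search. The Python example below is a minimal illustration on the six-node example graph of the earlier slides; the function names are hypothetical and the code assumes an unweighted, connected, undirected graph.

```python
from collections import deque

# Adjacency list of the six-node example graph (node -> neighbours).
adjacency = {
    1: [2],
    2: [1, 3, 5, 6],
    3: [2, 5],
    4: [6],
    5: [2, 3],
    6: [2, 4],
}


def geodesic_distances(adj, source):
    """Breadth-first search: number of edges on a shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricity(adj, u):
    """Greatest geodesic distance from u (assumes a connected graph)."""
    return max(geodesic_distances(adj, u).values())


eccentricities = {u: eccentricity(adjacency, u) for u in adjacency}
radius = min(eccentricities.values())
diameter = max(eccentricities.values())
```

On this graph the radius is 2 (reached at nodes 2 and 6) and the diameter is 3.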

A directed graph (e.g. followers on Twitter) can also be stored as an adjacency matrix:

        1  2  3  4  5  6
    1   0  1  0  0  0  0
    2   0  0  0  0  1  0
    3   0  1  0  0  0  0
    4   0  0  0  0  0  0
    5   0  0  1  0  0  0
    6   0  1  0  1  0  0

What is a community on a network?

Some properties of a community (Fortunato, 2010):

- Two neighbours in the same community are approximately the same

- Two neighbours in the same community must be near each other

- The nodes of a community have a high average degree

- A community contains a high proportion of triplets (high clustering coefficient)

- A community has a large embeddedness (share of the internal degree in the internal plus external degree)

Clustering coefficient of a node $u$ and of the whole graph $G$:

$$C(u) = \frac{2\,|\{e_{ij} = (v_i, v_j) \in E : v_i, v_j \in N(u)\}|}{|N(u)|\,(|N(u)| - 1)}$$

$$C(G) = \frac{1}{|V|} \sum_{u \in V} C(u)$$
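The clustering coefficient can be computed directly from an adjacency list. The following Python sketch applies the two formulas to the six-node example graph; the function names are hypothetical, and returning 0 for nodes with fewer than two neighbours is a convention assumed here, since the formula is undefined in that case.

```python
from itertools import combinations

# Adjacency list of the six-node example graph (node -> neighbours).
adjacency = {
    1: [2],
    2: [1, 3, 5, 6],
    3: [2, 5],
    4: [6],
    5: [2, 3],
    6: [2, 4],
}


def clustering(adj, u):
    """C(u): share of pairs of neighbours of u that are themselves linked."""
    neighbours = adj[u]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: the formula is undefined for k < 2
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """C(G): mean of C(u) over all nodes."""
    return sum(clustering(adj, u) for u in adj) / len(adj)
```

For node 2, only one pair of its four neighbours (3 and 5) is linked, so C(2) = 2 / 12 = 1/6.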

Embeddedness: for a given sub-graph $G_c$ of $G$, with $A$ the adjacency matrix and $u \in G_c$:

$$k^{int}_u = \sum_{j \in G_c} A_{uj}, \qquad k^{ext}_u = \sum_{j \notin G_c} A_{uj}$$

The embeddedness of $u$ is then:

$$\xi_u = \frac{k^{int}_u}{k^{int}_u + k^{ext}_u}$$
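A minimal Python sketch of the embeddedness computation follows. It reuses the six-node example graph, with two assumptions that are not in the slides: the diagonal of the adjacency matrix is set to 0 (no self-loops), and the community $G_c = \{2, 3, 5\}$ is chosen purely for illustration. The function name is hypothetical.

```python
# Adjacency matrix of the six-node example graph, without self-loops
# (assumption: the diagonal of the matrix on the slide is dropped).
A = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
]


def embeddedness(matrix, community, u):
    """xi_u = k_int / (k_int + k_ext) for node u (nodes are labelled from 1)."""
    row = matrix[u - 1]
    k_int = sum(row[j - 1] for j in community)
    k_ext = sum(row[j - 1] for j in range(1, len(matrix) + 1) if j not in community)
    # Raises ZeroDivisionError for an isolated node, where xi_u is undefined.
    return k_int / (k_int + k_ext)


community = {2, 3, 5}  # hypothetical community, chosen for illustration
```

Node 2 has two links inside the community and two outside, so its embeddedness is 0.5, whereas nodes 3 and 5 only have internal links and reach 1.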


Challenge: Give a definition of a community on a social network

See the talks by Mauro Sozio and Florence Sedes.

Challenges: classical problems

- Travelling salesman problem: find the shortest route that visits each given node exactly once (NP-hard); closely related to the vehicle routing problem

- Graph labelling and colouring: give a label to all nodes (or links) (NP-hard)

- Maximum flow: in a flow network (valued directed graph), find the largest possible total flow

- Large graph compression

- Maximal clique enumeration (NP-hard)

- Independent set problem: find the largest possible independent set, i.e. a set of vertices no two of which are adjacent (NP-hard)

Most problems on graphs are equivalent to NP-hard optimisation problems. Some approximation algorithms have been developed.

Challenge: realistic social network generation

For social networks with several communities, the Lancichinetti-Fortunato-Radicchi (LFR) benchmark, based on power-law distributions, needs:

- the number of nodes
- the minimum and maximum community sizes
- the average and maximum degree
- etc.

and defines:

- the number of edges
- the number of communities

Figure panels: Zachary's karate club network; LFR generation.

(Largeron et al., 2015), see Christine Largeron's talk.

- Local preferential attachment: new links are created preferentially with high-degree vertices

- Small world

- Community structure: vertices are more connected to vertices of the same group than to vertices of other groups (large embeddedness)

- Community homogeneity: similarity of the vertices within the same group

- Homophily: vertices in the same group are more similar to each other than to vertices of other groups

This allows:

- dynamic generation of social networks

- fixing the number of vertices

- fixing the number of communities

How to model information on social networks?

On a social network, several kinds of information can be considered:

- information on the links: LinkedIn, etc.

- information on the nodes: Facebook, LinkedIn, etc.

- information (messages) through the network: Twitter, collaborative platforms, etc.

Valued graphs

$G = (V, E, w)$ where $w : e \in E \longrightarrow X$

An adjacency matrix:

         1     2     3     4     5     6
    1    0    w12    0     0     0     0
    2   w12    0    w23    0    w25   w26
    3    0    w23    0     0    w35    0
    4    0     0     0     0     0    w46
    5    0    w25   w35    0     0     0
    6    0    w26    0    w46    0     0

$G = (V, E, p)$ where $p : e \in E \longrightarrow X$

An adjacency matrix:

         1     2     3     4     5     6
    1    0    p12    0     0     0     0
    2   p12    0    p23    0    p25   p26
    3    0    p23    0     0    p35    0
    4    0     0     0     0     0    p46
    5    0    p25   p35    0     0     0
    6    0    p26    0    p46    0     0

For example: p12(friend) = 0.8, p12(family) = 0.15, p12(colleague) = 0.05.

$G = (V, E, m)$ where $m : e \in E \longrightarrow X$

An adjacency matrix:

         1     2     3     4     5     6
    1    0    m12    0     0     0     0
    2   m12    0    m23    0    m25   m26
    3    0    m23    0     0    m35    0
    4    0     0     0     0     0    m46
    5    0    m25   m35    0     0     0
    6    0    m26    0    m46    0     0

Slide annotations: veracity of information, doubt, reliability.

Limits of the theory of probabilities

A probability is a positive and additive measure; $p$ is defined on a $\sigma$-algebra of $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$ and takes values in $[0, 1]$.

It verifies: $p(\emptyset) = 0$, $p(\Omega) = 1$, $\sum_{X \in \Omega} p(X) = 1$

- Difficulties in modelling the absence of knowledge (e.g. Sirius)

- Constraint on the classes (exhaustive and exclusive)

- Constraint on the measures (additivity)

If a symptom $f$ (for fever) is always present when a patient has an illness $A$ (flu), i.e. $p(f|A) = 1$, and if we observe this symptom $f$, then the probability that the patient has $A$ increases (because $p(A|f) = p(A)/p(f)$, so $p(A|f) \ge p(A)$).
The additivity constraint then requires that the probability that the patient does not have $A$ decreases: $p(\bar{A}|f) = 1 - p(A|f)$, so $p(\bar{A}|f) \le p(\bar{A})$. Yet there is no reason for this if the symptom $f$ can also be observed in other diseases.

Basics of belief functions

- Use of functions defined on subsets, instead of on singletons as for probabilities

- Discernment frame: $\Omega = \{\omega_1, \ldots, \omega_n\}$, where the $\omega_i$ are exclusive and exhaustive classes

- Power set: $2^\Omega = \{\emptyset, \{\omega_1\}, \{\omega_2\}, \{\omega_1 \cup \omega_2\}, \ldots, \Omega\}$

- Several functions in one-to-one correspondence model uncertainty and imprecision: mass functions, belief functions, plausibility functions

- Extension of $2^\Omega$ to $D^\Omega$, the hyper power set, in order to model conflicts
  - $D^\Omega$: set closed under the union and intersection operators
  - $D^\Omega_r$: reduced set with constraints ($\omega_2 \cap \omega_3 \equiv \emptyset$)

Mass functions

- The basic belief assignments (bba, or mass functions) are defined on $2^\Omega$ and take values in $[0, 1]$

- Normalisation condition: $\sum_{X \in 2^\Omega} m(X) = 1$

- A focal element is an element $X$ of $2^\Omega$ such that $m(X) > 0$

- Closed world: $m(\emptyset) = 0$

- We denote by $m_j$ the mass function of the source $S_j$


Special cases:

- If the only focal elements are the singletons $\omega_i$, then $m_j$ is a probability

- $m_j(\Omega) = 1$: total ignorance of $S_j$

- Categorical mass function: $m_j(X) = 1$ (denoted $m_X$): $S_j$ has an imprecise knowledge

- $m_j(\omega_i) = 1$: $S_j$ has a precise knowledge

- Simple mass functions $X^w$: $m_j(X) = w$ and $m_j(\Omega) = 1 - w$: $S_j$ has an uncertain and imprecise knowledge

Discounting

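From (Shafer, 1976):

$$m^\alpha_j(X) = \alpha_j m_j(X), \quad \forall X \in 2^\Omega,\ X \neq \Omega$$

$$m^\alpha_j(\Omega) = 1 - \alpha_j (1 - m_j(\Omega))$$

The discounting coefficient $\alpha_j \in [0, 1]$ can be seen as the reliability of the source $S_j$.
If $\alpha_j = 0$, the source is completely unreliable and all the mass is transferred to $\Omega$, the total ignorance.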
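The discounting operation can be sketched in a few lines of Python. In this illustration a mass function is stored as a dictionary mapping focal elements (frozensets of class names) to masses; this representation, the names `discount` and `OMEGA`, the class labels and the value $\alpha = 0.8$ are assumptions made for the example.

```python
# Frame of discernment with three classes (hypothetical labels).
OMEGA = frozenset({"w1", "w2", "w3"})

# Example mass function: focal element -> mass.
m1 = {frozenset({"w1"}): 0.5, frozenset({"w2"}): 0.1, OMEGA: 0.4}


def discount(mass, alpha, frame):
    """Discount a mass function by the reliability alpha in [0, 1]."""
    # Every focal element other than the frame is scaled by alpha.
    discounted = {X: alpha * value for X, value in mass.items() if X != frame}
    # The remaining mass is transferred to the frame (total ignorance).
    discounted[frame] = 1 - alpha * (1 - mass.get(frame, 0.0))
    return discounted


m1_discounted = discount(m1, 0.8, OMEGA)
```

With $\alpha = 0.8$, the mass on $\omega_1$ drops from 0.5 to 0.4 and the mass on $\Omega$ rises from 0.4 to 0.52; with $\alpha = 0$ the result is the total ignorance.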

Fusion architecture

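$s$ sources $S_1, S_2, \ldots, S_s$ must take a decision on an observation $x$ among a set of $n$ classes, $x \in \Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$:

$$
\begin{array}{c|ccccc}
 & \omega_1 & \ldots & \omega_i & \ldots & \omega_n \\
\hline
S_1 & m^1_1(x) & \ldots & m^1_i(x) & \ldots & m^1_n(x) \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
S_j & m^j_1(x) & \ldots & m^j_i(x) & \ldots & m^j_n(x) \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
S_s & m^s_1(x) & \ldots & m^s_i(x) & \ldots & m^s_n(x)
\end{array}
$$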

Conjunctive rules

- Assume two cognitively independent and reliable sources $S_1$ and $S_2$.

- The conjunctive rule is given, for $m_1$ and $m_2$ the bbas of $S_1$ and $S_2$, for all $X \in 2^\Omega$ with $X \neq \emptyset$, by:

$$m_{Conj}(X) = \sum_{Y_1 \cap Y_2 = X} m_1(Y_1)\, m_2(Y_2) \qquad (1)$$

          ∅      ω1     ω2     ω3     Ω
    m1    0      0.5    0.1    0      0.4
    m2    0      0.2    0      0.5    0.3
    m     0.32   0.33   0.03   0.2    0.12
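The table above can be reproduced with a short Python sketch of the conjunctive rule. Mass functions are stored as dictionaries mapping focal elements (frozensets) to masses; this representation and the names `conjunctive` and `OMEGA` are assumptions made for illustration.

```python
from collections import defaultdict

# Frame of discernment and the two mass functions of the table.
OMEGA = frozenset({"w1", "w2", "w3"})
m1 = {frozenset({"w1"}): 0.5, frozenset({"w2"}): 0.1, OMEGA: 0.4}
m2 = {frozenset({"w1"}): 0.2, frozenset({"w3"}): 0.5, OMEGA: 0.3}


def conjunctive(ma, mb):
    """Conjunctive rule: m(X) is the sum of ma(Y1) * mb(Y2) over Y1 ∩ Y2 = X."""
    combined = defaultdict(float)
    for Y1, v1 in ma.items():
        for Y2, v2 in mb.items():
            combined[Y1 & Y2] += v1 * v2
    return dict(combined)


m = conjunctive(m1, m2)
```

The mass 0.32 collected on the empty set comes from the three pairs of focal elements with an empty intersection: (ω1, ω3), (ω2, ω1) and (ω2, ω3).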

Dempster's rule

- Dempster's rule:

$$m_D(X) = \frac{1}{1 - \kappa}\, m_{Conj}(X) \qquad (2)$$

where $\kappa = \sum_{A \cap B = \emptyset} m_1(A)\, m_2(B)$ is generally called the conflict or global conflict. It is the sum of the partial conflicts.

- However, $\kappa$ is not a conflict measure.

- Conjunctive rules are not idempotent
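As a sketch of Dempster's rule, the Python example below normalises the conjunctive combination of the two mass functions used in the conjunctive-rule table. The dictionary representation and the function names are assumptions; raising an error when $\kappa = 1$ is a choice made here, since the rule is undefined in that case.

```python
from collections import defaultdict

OMEGA = frozenset({"w1", "w2", "w3"})
m1 = {frozenset({"w1"}): 0.5, frozenset({"w2"}): 0.1, OMEGA: 0.4}
m2 = {frozenset({"w1"}): 0.2, frozenset({"w3"}): 0.5, OMEGA: 0.3}


def conjunctive(ma, mb):
    """Conjunctive rule: products of masses are sent to the intersections."""
    combined = defaultdict(float)
    for Y1, v1 in ma.items():
        for Y2, v2 in mb.items():
            combined[Y1 & Y2] += v1 * v2
    return dict(combined)


def dempster(ma, mb):
    """Dempster's rule: conjunctive rule renormalised by 1 - kappa."""
    conj = conjunctive(ma, mb)
    kappa = conj.pop(frozenset(), 0.0)  # mass on the empty set
    if kappa >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {X: value / (1 - kappa) for X, value in conj.items()}, kappa


m_d, kappa = dempster(m1, m2)
```

Here $\kappa = 0.32$, so every remaining mass is divided by 0.68 and the result again sums to 1.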

Decision on Ω

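- In general the decision is made on $\Omega$ and not on $2^\Omega$

- Pessimist: $\max_{\omega \in \Omega} bel(\omega)$
- Optimist: $\max_{\omega \in \Omega} pl(\omega)$
- Compromise: $\max_{\omega \in \Omega} betP(\omega)$

Pignistic probability:

$$betP(\omega) = \sum_{Y \in 2^\Omega,\ \omega \cap Y \neq \emptyset} \frac{1}{|Y|}\, \frac{m(Y)}{1 - m(\emptyset)} \qquad (3)$$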
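The compromise decision can be illustrated with a small Python sketch of the pignistic probability. It takes as input the combined mass function of the conjunctive-rule table (including its mass on the empty set); the dictionary representation and the name `pignistic` are assumptions made for this example.

```python
OMEGA = frozenset({"w1", "w2", "w3"})

# Combined mass function from the conjunctive-rule table.
m = {
    frozenset(): 0.32,
    frozenset({"w1"}): 0.33,
    frozenset({"w2"}): 0.03,
    frozenset({"w3"}): 0.2,
    OMEGA: 0.12,
}


def pignistic(mass, frame):
    """betP(w): each focal element shares its mass equally among its classes."""
    empty = mass.get(frozenset(), 0.0)
    return {
        w: sum(value / len(Y) for Y, value in mass.items() if w in Y) / (1 - empty)
        for w in frame
    }


bet_p = pignistic(m, OMEGA)
decision = max(bet_p, key=bet_p.get)
```

The mass on $\Omega$ is split equally between the three classes, and the division by $1 - m(\emptyset)$ makes the result a probability distribution; the class with the highest pignistic probability here is ω1.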

Node-attributed graphs

$G = (V, E, F)$ where $F : V \longrightarrow X$ and $F(v) = [f_1(v), \ldots, f_a(v)]$

Attributes can be qualitative or quantitative (fuzzy, interval, probabilistic, belief, etc.).
See Christine Largeron's talk.

Node and link-attributed graphs

$G = (V, E, m_u, m_e)$ where $m_u : V \longrightarrow X$ and $m_e : e \in E \longrightarrow X$, with $m_u(v) = [m_1(v), \ldots, m_a(v)]$

(Ben Dhaou, 2014, 2017)

Information on social networks

Characteristics of the messages:

- A message is a text (possibly short: 140 characters on Twitter)

- It is generally not literary text (many typos, errors, etc.)

- A message has an author

- A message can be sent to some recipients

- A message generally has a date

- A message can have a label (type of message)

- A message can have an influence on the evolution of the network

Evolution of information on social networks

Information on:

- the existence of a node in the network
- the existence of a link between two nodes
- existence at time t, which can be modelled by a probability or a belief
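As a rough sketch of the difference between the two models, existence at time t can be described by a mass function on the frame {exists, not_exists}; the mass left uncommitted represents ignorance, and a probability is the special case where none is left. The function name and all numbers below are made up for illustration.

```python
def belief_interval(m_exists: float, m_not_exists: float) -> tuple:
    """Return (belief, plausibility) that a node or link exists at time t.

    The mass function lives on the frame {exists, not_exists}; whatever is
    not committed to either singleton is the mass of total ignorance.
    """
    m_unknown = 1.0 - m_exists - m_not_exists
    if min(m_exists, m_not_exists) < 0 or m_unknown < -1e-12:
        raise ValueError("masses must be non-negative and sum to at most 1")
    belief = m_exists                               # evidence committed to existence
    plausibility = m_exists + max(m_unknown, 0.0)   # evidence not contradicting existence
    return belief, plausibility


# Illustrative numbers only: a link observed recently vs. one not seen for a long time.
recent = belief_interval(m_exists=0.8, m_not_exists=0.1)
stale = belief_interval(m_exists=0.2, m_not_exists=0.3)
```

The wider the gap between belief and plausibility, the less is known about whether the node or link still exists.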

Challenge: privacy-preserving data mining

How can we protect our personal data?
How can we avoid sending personal information?
What is personal, and what is public?

- Cryptography and network security
- Watermarking (Gross-Amblard, 2003)
- Preference elicitation in Personal Information Management Systems (Allard et al., 2017)

See Oana Goga's talk.

Outline

1. What is a social network?

2. How to model a social network?

3. How to model information on social networks?

4. How to analyse a social network?

Message mining

Challenges:

- Understand the messages
- Characterise the emotion in a message
- Characterise the writer from the text (level of expertise, social level, etc.)
- Classify messages as positive, negative or neutral
- Detect fake news
- Detect new topics, centres of interest, etc.

Methods: those coming from text mining must be language independent, robust to the form of the message, time dependent, etc.

Person identification

Challenges:

- Find criminals on a social network
- Find influencers for viral marketing
- Find spammers on participatory platforms
- Find experts on participatory platforms
- etc.

Expert identification in Stack Overflow


(Attiaoui et al., 2017)


Evolution of the percentage of each class over 15 months.

Data set: 37 GB, 2 million users, 2.5 million answers, 1.7 million questions; data from December 2013 to March 2015

Influencers identification

Problem: Given a social network, find a set of influencers that are able to trigger a large cascade.
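One common way to attack this problem is greedy seed selection with lazy re-evaluation of marginal gains, the idea behind CELF (Leskovec et al., 2007). The sketch below is only illustrative: the `spread` function is a deterministic one-hop coverage count standing in for the expected cascade size, and the graph and node names are hypothetical. Lazy evaluation is valid here because this toy spread is monotone and submodular.

```python
import heapq


def spread(graph: dict, seeds: set) -> int:
    """Toy spread: seeds plus the nodes they directly reach (one-hop coverage)."""
    covered = set(seeds)
    for s in seeds:
        covered.update(graph.get(s, ()))
    return len(covered)


def celf(graph: dict, k: int) -> list:
    """Pick k seeds greedily, re-evaluating marginal gains lazily (CELF idea)."""
    seeds, current = [], 0
    # Heap of (-gain, node, number of seeds when the gain was computed).
    heap = [(-spread(graph, {v}), v, 0) for v in sorted(graph)]
    heapq.heapify(heap)
    while len(seeds) < k and heap:
        neg_gain, v, computed_at = heapq.heappop(heap)
        if computed_at == len(seeds):
            # Gain is up to date, so by submodularity v is the best choice.
            seeds.append(v)
            current += -neg_gain
        else:
            # Stale gain: recompute against the current seed set and push back.
            gain = spread(graph, set(seeds) | {v}) - current
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds


# Hypothetical follower graph: u -> users that u can reach directly.
graph = {
    "a": ["b", "c", "d"],
    "b": ["c"],
    "c": [],
    "d": ["e"],
    "e": ["f", "g"],
    "f": [],
    "g": [],
}
chosen = celf(graph, k=2)
```

In a realistic setting `spread` would be replaced by an estimate of the cascade size under a diffusion model, which is where most of the computing time goes and why avoiding re-evaluations matters.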


Solution: influencers on Twitter (Jendoubi et al., 2016, 2017)

- Define an influence measure based on belief functions:
  - $\Omega = \{I, P\}$, with I for influencer and P for passive
  - Calculate belief weights on each edge (u, v) from the number of common neighbours, the number of tweets where u mentions v, and the number of tweets where v retweets u
  - Integrate the opinion of the tweet:
    - Label each word in the tweet using the Stanford POS Tagger with the GATE Twitter part-of-speech tagger model
    - Use the SentiWordNet dictionary to get the polarity of each word in the tweet
    - Build a belief function on $\Theta = \{Pos, Neg, Neut\}$
  - Combine the mass functions
- Compute influence maximisation with the CELF algorithm (Leskovec et al., 2007)
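The combination step can be illustrated on the frame Ω = {I, P}. The slides do not say which combination rule is used; Dempster's rule is assumed here as a common choice, and the two input mass functions are made-up numbers, not values from the cited work.

```python
from itertools import product


def dempster_combine(m1: dict, m2: dict) -> dict:
    """Combine two mass functions with Dempster's rule.

    Focal elements are frozensets over the same frame; values are masses.
    """
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalise by the non-conflicting mass.
    return {focal: w / (1.0 - conflict) for focal, w in combined.items()}


I, P = frozenset({"I"}), frozenset({"P"})
OMEGA = I | P

# Made-up masses, e.g. one derived from mentions and one from retweets on an edge.
m_mentions = {I: 0.6, OMEGA: 0.4}
m_retweets = {I: 0.5, P: 0.2, OMEGA: 0.3}
m_edge = dempster_combine(m_mentions, m_retweets)
```

The mass kept on Ω after combination is what distinguishes this from a plain probability: it records how much of the evidence says nothing about whether the user is an influencer or passive.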

Community detection

First define the type of communities expected:

- Hard communities: each node v belongs to one and only one community in $\Omega = \{C_1, \ldots, C_n\}$:
  $\mu_{vk} = 1$ if $v \in C_k$, $\mu_{vk} = 0$ otherwise
- Fuzzy communities: each node v has a degree of membership $\mu_{vk} \in [0, 1]$ to each community, with $\sum_{k=1}^{n} \mu_{vk} = 1$


- Possibilistic communities: the condition $\sum_{k=1}^{n} \mu_{vk} = 1$ is relaxed. $\mu_{vk}$ can be interpreted as a degree of possibility that a node v belongs to the community $C_k$
- Rough communities: the membership of node v to community $C_k$ is described by a pair $(\underline{\mu}_{vk}, \overline{\mu}_{vk}) \in \{0, 1\}^2$ indicating its membership to the lower and upper approximations of community $C_k$
- Belief communities: the membership of each node v is described by a belief function $m_v$ over $\Omega$.
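The first three membership types differ only in the constraints placed on the vector of degrees µvk of a node. A small sketch of that distinction follows; the function name and tolerance are hypothetical, and rough and belief memberships are left out because they are not plain vectors of degrees.

```python
def membership_kind(mu: list, tol: float = 1e-9) -> str:
    """Classify one node's membership vector (mu_v1, ..., mu_vn)."""
    if any(x < -tol or x > 1 + tol for x in mu):
        raise ValueError("memberships must lie in [0, 1]")
    total = sum(mu)
    if abs(total - 1) <= tol:
        # Sum-to-one holds: crisp 0/1 values give a hard membership.
        is_crisp = all(min(abs(x), abs(x - 1)) <= tol for x in mu)
        return "hard" if is_crisp else "fuzzy"
    # The sum-to-one constraint is relaxed: degrees of possibility.
    return "possibilistic"


kinds = [
    membership_kind([0, 1, 0]),
    membership_kind([0.2, 0.5, 0.3]),
    membership_kind([0.9, 0.8, 0.1]),
]
```

A hard membership is thus a special case of a fuzzy one, which is itself a special case of a possibilistic one.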


- Overlapping communities: each node v can belong to more than one community in $\Omega$; $C_1, \ldots, C_n$ are not exclusive

With belief functions, work on the hyper power set $D^\Omega$ in order to model overlapping communities:

- $D^\Omega$: set closed under the union and intersection operators
- $D^\Omega_r$: reduced set with constraints (e.g. $C_2 \cap C_3 \equiv \emptyset$)

See Remy Cazabet's talk.


Methods: they depend on the information available as input and on the expected output.

1. Selection: from databases by queries, or by crawling the web
2. Preprocessing: depends on the data; transform the data into a graph, adjacency lists, belief function information, etc.
3. Transformation: compute the extracted features (by supervised or unsupervised methods)
4. Data mining: classify the data (by supervised or unsupervised methods)
5. Evaluation: compute some measures on the obtained patterns


Characterisation of classical clustering methods (a challenge):

1. Hierarchical methods, which build partitions by division or agglomeration. Examples: Louvain algorithm, spectral approaches, etc.
2. Partitioning methods. Examples: C-means, fuzzy C-means, evidential C-means (Zhou et al., 2015)
3. Label propagation methods

These methods need:

- a distance (or similarity) on the data (structure of the graph and information on the graph)
- an optimisation process

Community detection: SELP

Semi-supervised Evidential Label Propagation (SELP) algorithm (Zhou et al., 2018), flowchart steps (bba: basic belief assignment):

1. Input the labelled nodes in the graph.
2. Initialise the bba of each node in the network.
3. Select a node $n_x$ in $V_U$, find its $r_x$ direct neighbours and construct $r_x$ bbas.
4. Calculate the fused bba of node $n_x$.
5. Is the maximum of the mass assignment of $n_x$ larger than a given threshold? If yes, move node $n_x$ from set $V_U$ to set $V_L$.
6. Is there no node left in $V_U$? If yes, output the bba of each node; if no, go back to step 3.
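The loop of the flowchart can be sketched in a much simplified form. This is not the published SELP algorithm: the way each neighbour's bba is built (a fixed discount `alpha` on the neighbour's label), the use of Dempster's rule for fusion, the stopping rule when no node can be moved, and the toy graph are all assumptions made for illustration.

```python
def combine(m1, m2):
    """Dempster's rule on mass functions keyed by frozenset focal elements."""
    out, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                out[inter] = out.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict")
    return {f: w / (1.0 - conflict) for f, w in out.items()}


def propagate(adj, seeds, classes, alpha=0.6, threshold=0.5):
    """Simplified evidential label propagation; returns (bbas, labels)."""
    omega = frozenset(classes)
    bba = {v: {omega: 1.0} for v in adj}        # initialise: total ignorance
    for v, c in seeds.items():
        bba[v] = {frozenset({c}): 1.0}          # labelled nodes are certain
    labelled = dict(seeds)                      # V_L with decided labels
    unlabelled = sorted(set(adj) - set(seeds))  # V_U
    moved = True
    while unlabelled and moved:                 # stop if a full pass moves nothing
        moved = False
        for v in list(unlabelled):
            # One discounted bba per already-labelled direct neighbour, then fuse.
            fused = {omega: 1.0}
            for u in adj[v]:
                if u in labelled:
                    fused = combine(fused, {frozenset({labelled[u]}): alpha,
                                            omega: 1.0 - alpha})
            bba[v] = fused
            singletons = {f: w for f, w in fused.items() if len(f) == 1}
            if not singletons:
                continue
            best = max(singletons, key=singletons.get)
            if singletons[best] > threshold:
                labelled[v] = next(iter(best))  # move v from V_U to V_L
                unlabelled.remove(v)
                moved = True
    return bba, labelled


# Hypothetical graph: two triangles joined by the link 3-4, one seed per side.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
bba, labels = propagate(adj, seeds={1: "w1", 6: "w2"}, classes=["w1", "w2"])
```

In this run node 4 is undecided in the first pass, because its labelled neighbours disagree, and is only labelled once node 5 has joined the second community; part of its mass stays on the whole frame.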

Example on the Karate Club network

[Figure: the 34-node Karate Club graph, shown at initialisation and after iterations 1 to 5. Legend: labelled data in ω1, labelled data in ω2, labelled as noisy data, unlabelled data.]

Community detection

Challenges:

- How to learn information on graphs?
- How many communities? A difficult problem in clustering in general
- How to combine methods? Information fusion methods can be used
- How to properly take into account the dynamic aspect of social networks?
- How to reduce the running time of algorithms? Some algorithms can be parallelised
- How to evaluate the obtained communities? A difficult problem in clustering, and more difficult still when we do not know what a community is
- etc.

To end


Many challenges surround social networks:

- We do not know exactly what a social network is
- We are not sure of the information given on social networks (veracity, precision, existence, etc.)
- We do not know exactly what a community is
- We have a lot of information
- Almost all our problems are NP-hard

The next presentations during these two days will give you some answers.

My proposal: use the theory of belief functions to properly model the uncertainty and imprecision of information

References

- Stanford POS Tagger: http://nlp.stanford.edu/software/tagger.shtml
- GATE Twitter part-of-speech tagger: https://gate.ac.uk/wiki/twitter-postagger.html
- SentiWordNet: http://sentiwordnet.isti.cnr.it/
- Santo Fortunato, Community detection in graphs, Physics Reports, 486(3):75-174, 2010
- Santo Fortunato, Darko Hric, Community detection in networks: A user guide, Physics Reports, 659, pp. 1-44, 2016
- Guy Melancon, Just how dense are dense graphs in the real world?: a methodological note, in Proceedings of the 2006 AVI workshop on BEyond time and errors: novel evaluation methods for information visualization, pp. 1-7, ACM, 2006
- C. Largeron, P.N. Mougel, R. Rabbany, O.R. Zaiane, Generating attributed networks with communities, PLoS ONE 10(4), 2015


- David Gross-Amblard, Query-preserving watermarking of relational databases and XML documents, Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2003
- Tristan Allard, Tassadit Bouadi, Joris Dugueperoux, Virginie Sans, From Self-Data to Self-Preferences: Towards Preference Elicitation in Personal Information Management Systems, International Workshop on Personal Analytics and Privacy (in conjunction with ECML PKDD 2017)
- G. Shafer, A mathematical theory of evidence, Princeton University Press, 1976
- J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, Cost-effective outbreak detection in networks, KDD 2007

Personal references: with belief functions

- Kuang Zhou, Arnaud Martin, Quan Pan, Zhunga Liu, SELP: Semi-supervised evidential label propagation algorithm for graph data clustering, International Journal of Approximate Reasoning, Elsevier, 2018, 92, pp. 139-154
- Dorra Attiaoui, Arnaud Martin, Boutheina Ben Yaghlane, Belief Temporal Analysis of Expert Users: case study Stack Overflow, Big Data Analytics and Knowledge Discovery (DaWaK), Aug 2017, Lyon, France
- Dorra Attiaoui, Arnaud Martin, Boutheina Ben Yaghlane, Belief Measure of Expertise for Experts Detection in Question Answering Communities: case study Stack Overflow, 21st International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, Sep 2017, Marseille, France


- Siwar Jendoubi, Arnaud Martin, Ludovic Lietard, Hend Ben Hadji, Boutheina Ben Yaghlane, Two Evidential Data Based Models for Influence Maximization in Twitter, Knowledge-Based Systems, 2017
- Salma Ben Dhaou, Kuang Zhou, Mouloud Kharoune, Arnaud Martin, Boutheina Ben Yaghlane, The Advantage of Evidential Attributes in Social Networks, 20th International Conference on Information Fusion, Jul 2017, Xi'an, China
- Kuang Zhou, Arnaud Martin, Quan Pan, Zhun-Ga Liu, Median evidential c-means algorithm and its application to community detection, Knowledge-Based Systems, 2015, 74, pp. 69-88
- Kuang Zhou, Arnaud Martin, Quan Pan, A similarity-based community detection method with multiple prototype representation, Physica A: Statistical Mechanics and its Applications, Elsevier, 2015, pp. 519-531

Personal references: www-druid.irisa.fr

- Imen Ouled Dlala, Dorra Attiaoui, Arnaud Martin, Boutheina Ben Yaghlane, Trolls Identification within an Uncertain Framework, International Conference on Tools with Artificial Intelligence (ICTAI), Nov 2014, Limassol, Cyprus
- Salma Ben Dhaou, Mouloud Kharoune, Arnaud Martin, Boutheina Ben Yaghlane, Belief Approach for Social Networks, Belief 2014, Oxford, United Kingdom
- Kuang Zhou, Arnaud Martin, Quan Pan, Evidential Communities for Complex Networks, 15th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Jul 2014
