
Final_Presentation

Apr 11, 2017


Ardavan Afshar
Transcript
Page 1: Final_Presentation
Page 2: Final_Presentation

2

Page 3: Final_Presentation

3

Page 4: Final_Presentation

4

Page 5: Final_Presentation

5

Community Detection

Page 6: Final_Presentation

Social Networks can be represented in graphs

Nodes correspond to individuals

Edges represent interaction among them

A community can be defined as a group of entities that share similar properties
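
As a small illustration of the graph view described above, the sketch below stores a social network as an adjacency map. The names in `interactions` and the helper `build_graph` are made up for this example and do not come from the slides.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (person, person) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)  # an interaction links both individuals
        adjacency[v].add(u)
    return dict(adjacency)


# Hypothetical interactions: each pair is one edge of the graph.
interactions = [("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("cat", "dan")]
graph = build_graph(interactions)

# The degree of a node is the number of individuals it interacts with.
degrees = {person: len(friends) for person, friends in graph.items()}
```

Here `ann`, `bob` and `cat` all interact with each other, so they would be a natural candidate for a community, while `dan` is attached only through `cat`.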

6

Page 7: Final_Presentation

7

Page 8: Final_Presentation

8

Page 9: Final_Presentation

10

Page 10: Final_Presentation

Walktrap

1. Compute the distance between all vertices and communities
2. Choose two communities based on their similarity
3. Merge these two communities into a new community
4. Update the distance between communities
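
The slide does not say which distance Walktrap uses. The sketch below assumes the random-walk distance from the original Walktrap paper by Pons and Latapy, where two vertices are close if short random walks started from them end up in similar places. The function name `walktrap_distance`, the walk length `t = 3` and the example graph are choices made for this illustration.

```python
import numpy as np


def walktrap_distance(A, i, j, t=3):
    """Random-walk distance between vertices i and j (assumed Walktrap metric)."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                  # vertex degrees
    P = A / d[:, None]                 # one-step transition probabilities
    Pt = np.linalg.matrix_power(P, t)  # t-step transition probabilities
    # Differences are weighted by 1/degree, as in the assumed definition.
    return float(np.sqrt(np.sum((Pt[i] - Pt[j]) ** 2 / d)))


# Example graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1

same_triangle = walktrap_distance(A, 0, 1)
other_triangle = walktrap_distance(A, 0, 5)
```

Vertices in the same triangle come out much closer than vertices in different triangles, which is the signal the merge step relies on.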

11

Page 11: Final_Presentation

12

Page 12: Final_Presentation

13

Modularity is based on the idea that a random graph is not expected to have a community structure.

$$Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{d_i d_j}{2m} \right) \delta(C_i, C_j)$$

A: adjacency matrix
m: the total number of edges in the network
$d_i$: degree of node i
$\delta(C_i, C_j) = 1$ if $C_i = C_j$, and $0$ if $C_i \neq C_j$

The choice of null model is in principle arbitrary, and several possibilities exist.
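
The modularity formula can be evaluated directly from an adjacency matrix. The following is a minimal sketch, assuming an undirected, unweighted graph; the function name `modularity` and the example graph (two triangles joined by one edge) are illustrative only.

```python
import numpy as np


def modularity(A, labels):
    """Q = 1/(2m) * sum_ij (A_ij - d_i d_j / 2m) * delta(C_i, C_j)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    d = A.sum(axis=1)                  # degree of each node
    two_m = d.sum()                    # 2m: every edge is counted twice
    delta = labels[:, None] == labels[None, :]
    B = A - np.outer(d, d) / two_m     # observed minus expected edges
    return float((B * delta).sum() / two_m)


# Example graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1

q_triangles = modularity(A, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])     # everything in one community
```

Putting every node in a single community always gives Q = 0, while the two-triangle division gives a clearly positive value.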

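Page 13: Final_Presentation

$$Q_{max} = \max_p \sum_{c=1}^{n_c} \left[ \frac{I(c)}{m} - \left( \frac{d_c}{2m} \right)^2 \right]$$

$$= \frac{1}{m} \max_p \sum_{c=1}^{n_c} \left[ I(c) - Ex(I(c)) \right]$$

$$= -\frac{1}{m} \min_p \left( -\sum_{c=1}^{n_c} \left[ I(c) - Ex(I(c)) \right] \right)$$

$$= -\frac{1}{m} \min_p \left( \left[ m - \sum_{c=1}^{n_c} I(c) \right] - \left[ m - \sum_{c=1}^{n_c} Ex(I(c)) \right] \right)$$

$$= -\frac{1}{m} \min_p \left( Cut_p - Ex(Cut_p) \right)$$

Here $Ex(I(c)) = d_c^2 / 4m$ is the expected number of intra-community edges of c.

$I(c)$: intra-community edges
$Cut_p$: inter-community edges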

Page 14: Final_Presentation

15

FastQ

1. Each node is assigned to its own community
2. The algorithm repeatedly merges pairs of communities together
3. Choose the merger for which the resulting modularity is the largest
4. Repeat the procedure until only one community remains
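
The FastQ steps can be sketched as a naive greedy loop. This is a deliberately simple version that recomputes modularity for every candidate merge, so it is far slower than an optimized implementation. Returning the best partition seen along the way is an assumption about how the final answer is picked; the names `modularity` and `greedy_merge` are illustrative.

```python
from itertools import combinations


def modularity(edges, communities):
    """Sum over communities of I(c)/m - (D(c)/2m)^2."""
    m = len(edges)
    label = {v: k for k, comm in enumerate(communities) for v in comm}
    intra = [0] * len(communities)
    degree = [0] * len(communities)
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(i / m - (d / (2 * m)) ** 2 for i, d in zip(intra, degree))


def greedy_merge(edges):
    """Start from singletons; always apply the merge with the largest modularity."""
    nodes = sorted({v for e in edges for v in e})
    communities = [frozenset([v]) for v in nodes]
    best_q, best_partition = modularity(edges, communities), list(communities)
    while len(communities) > 1:
        candidates = []
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            candidates.append((modularity(edges, merged), merged))
        q, communities = max(candidates, key=lambda item: item[0])
        if q > best_q:  # remember the best division seen so far
            best_q, best_partition = q, list(communities)
    return best_partition, best_q


# Example graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
partition, q = greedy_merge(edges)
```

On this small graph the loop recovers the two triangles as communities before the final merge collapses everything into one community with Q = 0.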

Page 15: Final_Presentation

16

Proposed Method

Page 16: Final_Presentation

17

Most community detection methods try to optimize a global metric

Several methods need initial parameters to be set before they can solve the problem

Most algorithms rely on a centralized decision maker

Page 17: Final_Presentation

A distributed framework is proposed to detect communities in social networks

Each community acts as a selfish agent to maximize its utility function

We use local utility maximization

Modularity has been chosen as the community utility function

18

Page 18: Final_Presentation

Each community uses only local information to maximize its utility function

Each community has some pre-defined actions

Each community chooses the best action in order to have maximum utility

Our distributed framework can perform as well as the existing centralized approaches

19

Page 19: Final_Presentation

Local information is used to identify communities

Every community only utilizes the knowledge obtained from its neighbors

Nodes belonging to a community fall into two types:

1-Core Set (C): no node in C is linked to the outside of the community

2-Boundary Set (B): every node in B has at least one connection to the outside of the community
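
The two node types can be computed from nothing more than each node's neighbor list. The sketch below is one straightforward way to do it; the function name `core_and_boundary` and the example graph are assumptions made for illustration.

```python
def core_and_boundary(adjacency, community):
    """Split a community into its core set and its boundary set."""
    community = set(community)
    # A boundary node has at least one neighbor outside the community.
    boundary = {
        v for v in community
        if any(n not in community for n in adjacency[v])
    }
    core = community - boundary  # all neighbors are inside the community
    return core, boundary


# Example graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
adjacency = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
core, boundary = core_and_boundary(adjacency, {0, 1, 2})
```

For the community {0, 1, 2}, only node 2 touches the outside, so it forms the boundary set and nodes 0 and 1 form the core set.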

20

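Page 20: Final_Presentation

$$Q = \sum_{C=1}^{n_C} \left[ \frac{I(C)}{m} - \left( \frac{D(C)}{2m} \right)^2 \right]$$

$$D(C)^2 = (d_1 + d_2 + \cdots + d_n)^2 = d_1^2 + d_2^2 + \cdots + 2 d_1 d_2 + \cdots + d_n^2$$

$$I(C) = \sum_{i,j \in C,\; i<j} A_{ij}$$

C: a community
I(C): number of edges inside C
D(C): sum of the degrees of the nodes in C
m: total number of edges in the network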

Page 21: Final_Presentation

[Figure: communities C1 and C2 and the merged community C]

Page 22: Final_Presentation

There exist 2 possible merges for C1. Suppose C1 is a player.

$$U_1 = \frac{I(C_1)}{m} - \left( \frac{D(C_1)}{2m} \right)^2 = \frac{3}{11} - \left( \frac{10}{22} \right)^2 = 0.066$$

$$U_3 = \frac{I(C_3)}{m} - \left( \frac{D(C_3)}{2m} \right)^2 = \frac{1}{11} - \left( \frac{5}{22} \right)^2 = 0.039$$

$$U = \frac{I(C_1) + I(C_3) + x}{m} - \left( \frac{D(C_1) + D(C_3)}{2m} \right)^2 = \frac{7}{11} - \left( \frac{15}{22} \right)^2 = 0.171$$

where x is the number of edges between C1 and C3.

Merging between C1 and C3 occurs if and only if

$$U > U_1 + U_3 \iff x > \frac{D(C_1)\, D(C_3)}{2m}$$

Here $x = 3 > \frac{50}{22} = 2.27$.

23
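
The arithmetic of the C1 and C3 merge example can be checked with a few lines of code. The sketch below uses the slide's values (m = 11, I(C1) = 3, D(C1) = 10, I(C3) = 1, D(C3) = 5, x = 3). The function names are illustrative, and `merge_shortcut` is the closed-form test x > D(C1)D(C3)/2m.

```python
def utility(intra_edges, total_degree, m):
    """Community utility: I(C)/m - (D(C)/2m)^2."""
    return intra_edges / m - (total_degree / (2 * m)) ** 2


def merge_is_rational(i1, d1, i2, d2, x, m):
    """Compare the merged utility with the sum of the separate utilities."""
    merged = utility(i1 + i2 + x, d1 + d2, m)
    return merged > utility(i1, d1, m) + utility(i2, d2, m)


def merge_shortcut(d1, d2, x, m):
    """Equivalent local test: x > D(C1) * D(C2) / (2m)."""
    return x > d1 * d2 / (2 * m)


m = 11
u1 = utility(3, 10, m)               # C1 alone
u3 = utility(1, 5, m)                # C3 alone
u = utility(3 + 1 + 3, 10 + 5, m)    # C1 and C3 merged, x = 3 edges between them
```

Both tests agree that this merge is worthwhile. The shortcut needs only the two total degrees, the number of connecting edges and m, which is what lets each community decide locally.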

Page 23: Final_Presentation

25

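Our goal is to find a division in which modularity is maximized.

[Figure: community C divided into C1 and C2, with $s_i = +1$ for the nodes in one part and $s_i = -1$ for the nodes in the other]

$$Q = \frac{1}{4m} \sum_{ij} \left( A_{ij} - \frac{d_i d_j}{2m} \right) s_i s_j = \frac{1}{4m} s^T B s$$

This follows from the definition of Q with $\delta(C_i, C_j) = (1 + s_i s_j)/2$, since the entries of B sum to zero.

s is a vector whose elements are $s_i$:

$$s = \sum_{i=1}^{n} \alpha_i u_i$$

$u_i$ is the i-th eigenvector of B.

B is the modularity matrix, whose elements are:

$$B_{ij} = A_{ij} - \frac{d_i d_j}{2m}$$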

Page 24: Final_Presentation

$x > \frac{D(C_1)\, D(C_2)}{2m}$: merging C1 and C2 is rational

$x < \frac{D(C_1)\, D(C_2)}{2m}$: merging C1 and C2 is irrational

Page 25: Final_Presentation

The proposed method may get stuck at a local maximum of modularity

It is possible that no community can improve its own utility even though modularity is not yet maximized

27

Page 26: Final_Presentation

[Figure: communities C1 and C2 with utilities U1 and U2, the merged community C with utility U, and an alternative split of C with utilities U1' and U2']

$U < U_1 + U_2$: merging between C1 and C2 is irrational

$U_1 + U_2 < U'_1 + U'_2$: but splitting of C is rational

Page 27: Final_Presentation

[Figure: communities C1 and C2 (utilities U1, U2) joined by x edges inside C, and an alternative split of C into C1' and C2' (utilities U1', U2') joined by x' edges]

Irrational merge: split condition

$$U_1 + U_2 < U'_1 + U'_2$$

Both divisions cover the same nodes and edges, so

$$D(C_1) + D(C_2) = D(C'_1) + D(C'_2)$$

$$I(C_1) + I(C_2) + x = I(C'_1) + I(C'_2) + x'$$

Substituting these into the utilities gives the split condition:

$$\frac{x - x'}{m} > \frac{2 D(C_1) D(C_2) - 2 D(C'_1) D(C'_2)}{4m^2}$$

Page 28: Final_Presentation

30

Page 29: Final_Presentation

$x > \frac{D(C_1)\, D(C_2)}{2m}$: merging C1 and C2 is rational

$x < \frac{D(C_1)\, D(C_2)}{2m}$: merging C1 and C2 is irrational

$\frac{x - x'}{m} > \frac{2 D(C_1) D(C_2) - 2 D(C'_1) D(C'_2)}{4m^2}$: splitting C into C1' and C2' is rational

Page 30: Final_Presentation

32

Experimental Results

Page 31: Final_Presentation

DataSet    Number of nodes  Number of edges
Karate     34               77
Risk       42               83
Dolphin    62               159
Politics   105              441
AdjNoun    112              425
Football   115              613
Jazz       198              2742
USAir97    332              2126
Email      1133             5452
Power      4941             6594
Internet   22960            48436

Page 32: Final_Presentation

Figure: The ground-truth community structure and the structures detected by the 1st and 2nd proposed methods on Zachary's karate club network.

Page 33: Final_Presentation

Figure: The ground-truth community structure and the structures detected by the 1st and 2nd proposed methods on the Dolphin network.

Page 34: Final_Presentation

Figure: The ground-truth community structure and the structures detected by the 1st and 2nd proposed methods on the NCAA Football network.

Page 35: Final_Presentation

Figure: The community structures detected by the 1st and 2nd proposed methods on the Risk network.

Page 36: Final_Presentation

Figure: The community structures detected by the 1st and 2nd proposed methods on the Politics network.

Page 37: Final_Presentation

[Chart: Rank of Modularity per Dataset. x-axis: Dataset (Karate, Risk, Dolphin, Politics, AdjNoun, Football, Jazz, USAir97, Email, Power, Internet); y-axis: Rank (0 to 6)]

Page 38: Final_Presentation

DataSet   FastQ   Walktrap  Laplacian  SLAP    1st Proposed Method  2nd Proposed Method
Karate    0.252   0.36      0.255      0.399   0.4197               0.4197
Risk      0.624   0.624     0.624      0.626   0.631                0.637
Dolphin   0.341   0.517     0.365      0.511   0.509                0.529
Politics  0.447   0.524     0.527      0.494   0.52                 0.527
AdjNoun   0.1845  0.229     0.259      0.286   0.272                0.306
Football  0.577   0.604     0.604      0.6045  0.6043               0.6045
Jazz      0.403   0.437     0.441      0.428   0.425                0.444
USAir97   0.29    0.315     0.363      0.351   0.356                0.366
Email     0.506   0.534     0.543      0.47    0.548                0.566
Power     0.447   0.886     0.932      0.64    0.933                0.939
Internet  0.472   0.647     0.646      0.574   0.588                0.6489

Modularity obtained from several popular approaches and our proposed methods on real-world networks

Page 39: Final_Presentation

[Chart: Rank of Execution Time per Dataset. x-axis: Dataset (Karate, Risk, Dolphin, Politics, AdjNoun, Football, Jazz, USAir97, Email, Power); y-axis: Rank (0 to 6); series: FastQ, Walktrap, SLAP, 1st Proposed Method, 2nd Proposed Method]

Page 40: Final_Presentation

DataSet   FastQ  Walktrap  SLAP  1st Proposed Method  2nd Proposed Method
Karate    77     77        45    31                   39
Risk      93     84        38    18                   90
Dolphin   211    117       63    54                   141
Politics  414    197       88    107                  314
AdjNoun   426    194       82    126                  379
Football  350    190       100   152                  380
Jazz      740    314       295   625                  1100
USAir97   3600   497       211   1020                 4200
Email     5452   1833      458   2042                 8201
Power     31458  7153      762   9472                 39763

Page 41: Final_Presentation

[Chart: Rank of Modularity per Dataset (mu = 0.3). x-axis: Dataset (100 to 1000); y-axis: Rank (0 to 6); series: FastQ, Walktrap, Laplacian, SLAP, 1st Proposed Method, 2nd Proposed Method]

Page 42: Final_Presentation

DataSet  FastQ  Walktrap  Laplacian  SLAP    1st Proposed Method  2nd Proposed Method
100      0.35   0.365     0.365      0.365   0.324                0.365
200      0.500  0.549     0.549      0.549   0.523                0.549
300      0.541  0.593     0.593      0.563   0.549                0.593
400      0.562  0.606     0.606      0.6058  0.58                 0.606
500      0.574  0.613     0.613      0.613   0.602                0.613
600      0.597  0.608     0.608      0.587   0.589                0.608
700      0.591  0.612     0.612      0.612   0.604                0.612
800      0.59   0.613     0.613      0.613   0.596                0.613
900      0.595  0.611     0.611      0.610   0.579                0.613
1000     0.59   0.609     0.609      0.609   0.586                0.609

Modularity obtained from several popular approaches and our proposed methods on synthetic networks (mu = 0.3)

Page 43: Final_Presentation


Page 44: Final_Presentation

DataSet  FastQ  Walktrap  SLAP  1st Proposed Method  2nd Proposed Method
100      332    196       123   116                  240
200      649    364       212   485                  731
300      1162   480       390   780                  1340
400      1284   682       486   992                  2210
500      1555   873       685   1240                 3406
600      2060   1041      1047  1570                 4210
700      2776   1289      1358  1743                 5378
800      2745   1580      1637  2020                 6421
900      3980   1860      2438  2320                 8745
1000     3565   2179      2599  2610                 10255

Execution time of several popular approaches and our proposed methods on synthetic networks (mu = 0.3)

Page 45: Final_Presentation

[Chart: Rank of Modularity per Dataset (mu = 0.5). x-axis: Dataset (100 to 1000); y-axis: Rank (0 to 6); series: FastQ, Walktrap, Laplacian, SLAP, 1st Proposed Method, 2nd Proposed Method]

Page 46: Final_Presentation

DataSet  FastQ  Walktrap  Laplacian  SLAP   1st Proposed Method  2nd Proposed Method
100      0.233  0.202     0.238      0.229  0.231                0.253
200      0.27   0.356     0.356      0.332  0.288                0.355
300      0.344  0.407     0.402      0.395  0.352                0.407
400      0.363  0.431     0.425      0.406  0.391                0.431
500      0.372  0.433     0.433      0.433  0.406                0.434
600      0.367  0.439     0.426      0.406  0.403                0.44
700      0.377  0.435     0.427      0.425  0.400                0.436
800      0.374  0.428     0.429      0.416  0.396                0.432
900      0.365  0.429     0.43       0.424  0.408                0.43
1000     0.375  0.436     0.431      0.435  0.415                0.436

Modularity obtained from several popular approaches and our proposed methods on synthetic networks (mu = 0.5)

Page 47: Final_Presentation

[Chart: Rank of Execution Time per Dataset (mu = 0.5). x-axis: Dataset (100 to 1000); y-axis: Rank (0 to 6); series: FastQ, Walktrap, SLAP, 1st Proposed Method, 2nd Proposed Method]

Page 48: Final_Presentation

DataSet  FastQ  Walktrap  SLAP  1st Proposed Method  2nd Proposed Method
100      354    191       102   91                   221
200      723    377       199   463                  621
300      886    472       392   720                  1420
400      1288   677       544   1009                 2451
500      1654   879       734   1120                 3231
600      2180   1049      1061  1680                 4621
700      2864   1295      1738  1920                 5145
800      2848   1583      1896  2007                 6352
900      3305   1867      2366  2247                 8745
1000     3859   2172      2700  2670                 11471

Execution time of several popular approaches and our proposed methods on synthetic networks (mu = 0.5)

Page 49: Final_Presentation

[Chart: Rank of Modularity per Dataset (mu = 0.7). x-axis: Dataset (100 to 1000); y-axis: Rank (0 to 6); series: FastQ, Walktrap, Laplacian, SLAP, 1st Proposed Method, 2nd Proposed Method]

Page 50: Final_Presentation

DataSet  FastQ  Walktrap  Laplacian  SLAP   1st Proposed Method  2nd Proposed Method
100      0.234  0.196     0.244      0.242  0.23                 0.254
200      0.168  0.144     0.178      0.154  0.159                0.179
300      0.155  0.174     0.189      0.166  0.141                0.19
400      0.169  0.239     0.236      0.231  0.177                0.232
500      0.180  0.247     0.245      0.238  0.204                0.247
600      0.181  0.257     0.255      0.202  0.206                0.257
700      0.184  0.26      0.254      0.236  0.229                0.259
800      0.182  0.259     0.255      0.231  0.233                0.259
900      0.185  0.262     0.258      0.252  0.23                 0.262
1000     0.180  0.26      0.257      0.23   0.231                0.26

Modularity obtained from several popular approaches and our proposed methods on synthetic networks (mu = 0.7)

Page 51: Final_Presentation

[Chart: Rank of Execution Time per Dataset (mu = 0.7). x-axis: Dataset (100 to 1000); y-axis: Rank (0 to 6); series: FastQ, Walktrap, SLAP, 1st Proposed Method, 2nd Proposed Method]

Page 52: Final_Presentation

DataSet  FastQ  Walktrap  SLAP  1st Proposed Method  2nd Proposed Method
100      330    197       111   105                  320
200      608    382       259   370                  591
300      951    486       383   690                  1345
400      1321   693       687   997                  2684
500      1569   886       1045  1140                 3354
600      2011   1085      1041  1620                 4574
700      2198   1374      1492  1749                 5354
800      2813   1541      1968  1984                 6478
900      2836   1841      2408  2146                 8894
1000     3823   2200      2618  2541                 12577

Execution time of several popular approaches and our proposed methods on synthetic networks (mu = 0.7)

Page 53: Final_Presentation

55

Page 54: Final_Presentation

56

Page 56: Final_Presentation

58

Page 57: Final_Presentation

59