KONECT Cloud – Large Scale Network Mining in the Cloud. Jérôme Kunegis, Future SOC Lab Day, 18.04.2012

KONECT Cloud – Large Scale Network Mining in the Cloud

Jan 26, 2015


Technology

In the Winter 2011/2012 run at the Future SOC Lab, we used KONECT
(the Koblenz Network Collection framework) to compute ten different
network statistics on many downsampled versions of a large network
dataset. The goal was to determine whether sampling a large network
can reduce the computational effort needed to compute a network
statistic. Preliminary results show that this is indeed the case.
Transcript
Page 1: KONECT Cloud – Large Scale Network Mining in the Cloud


KONECT Cloud

Large Scale Network Mining in the Cloud

Jérôme Kunegis Future SOC Lab Day, 18.04.2012

Page 2: KONECT Cloud – Large Scale Network Mining in the Cloud

Networks are Everywhere

Communication

Authorship

Friendship


Interaction

Trust

Co-occurrence

Page 3: KONECT Cloud – Large Scale Network Mining in the Cloud

Social Networks

friend

Page 4: KONECT Cloud – Large Scale Network Mining in the Cloud

Trust Networks

trust

Page 5: KONECT Cloud – Large Scale Network Mining in the Cloud

Friend/Enemy Network

enemy

friend

Page 6: KONECT Cloud – Large Scale Network Mining in the Cloud

Interaction Network

listen

Page 7: KONECT Cloud – Large Scale Network Mining in the Cloud

KONECT – Koblenz Network Collection

148 network datasets

26 are undirected
38 are directed
84 are bipartite

59 have unweighted edges
77 allow multiple edges
4 have signed edges
8 have ratings as edges

78 have edge arrival times

konect.uni-koblenz.de

Page 8: KONECT Cloud – Large Scale Network Mining in the Cloud

Largest Network

Directed “who follows who” network

41 652 230 users

1 468 365 182 edges

konect.uni-koblenz.de/networks/twitter

Page 9: KONECT Cloud – Large Scale Network Mining in the Cloud

148 Network Datasets

authorship
communication
co-occurrence
features
folksonomy
interaction
physical
ratings
reference
semantic
social
trust

Page 10: KONECT Cloud – Large Scale Network Mining in the Cloud

What We Computed

Connected components
Network diameter
Clustering coefficients
Degree distributions
Spectral distribution
Eigenvector centrality
Graph drawing
Temporal analysis
Link prediction

← at Future SOC Lab

Page 11: KONECT Cloud – Large Scale Network Mining in the Cloud

Network Diameter

6

Page 12: KONECT Cloud – Large Scale Network Mining in the Cloud

90 Percentile Effective Diameter

5

Page 13: KONECT Cloud – Large Scale Network Mining in the Cloud

90 Percentile Effective Diameter

3

Page 14: KONECT Cloud – Large Scale Network Mining in the Cloud

90 Percentile Effective Diameter

3.75

Page 15: KONECT Cloud – Large Scale Network Mining in the Cloud

Computing the Effective Diameter

for each node i {                        (|V| iterations)
    count hops needed to reach 90%       (|E| steps per node)
}

Total runtime: |E| × |V|
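
The loop on this slide can be sketched in Python as one breadth-first search per source node. This is only an illustration under stated assumptions, not the KONECT implementation (which, per the later slide, ran in Matlab): the adjacency-dict representation and the names `bfs_distances` and `effective_diameter` are hypothetical, "90%" is read as 90% of reachable node pairs, and the result is an integer hop count without the interpolation that would yield fractional values such as 3.75.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every reachable node; one pass costs O(|E|)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def effective_diameter(adj, sources=None, fraction=0.9):
    """Smallest hop count d such that at least `fraction` of the reachable
    (source, target) pairs are within d hops of each other.

    `sources` defaults to all nodes (|V| searches, hence |E| x |V| overall);
    passing a subset of nodes gives a cheaper estimate.
    """
    if sources is None:
        sources = list(adj)
    histogram = {}  # hop count -> number of (source, target) pairs
    for s in sources:
        for node, d in bfs_distances(adj, s).items():
            if node != s:
                histogram[d] = histogram.get(d, 0) + 1
    total = sum(histogram.values())
    if total == 0:
        return 0  # no reachable pairs at all
    covered = 0
    for d in sorted(histogram):
        covered += histogram[d]
        if covered >= fraction * total:
            return d


# Undirected path 0 - 1 - 2, stored as a symmetric adjacency dict.
path = {0: [1], 1: [0, 2], 2: [1]}
print(effective_diameter(path))
```

The `sources` parameter is there because running one search from every node is what makes the full computation cost |E| × |V|; restricting the searches to a subset of start nodes trades accuracy for time.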

Page 16: KONECT Cloud – Large Scale Network Mining in the Cloud

Graph Sampling

Keep X% of edges
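
A minimal sketch of this sampling step, assuming the network is stored as a list of edge tuples. The slide does not say whether exactly X% of the edges are drawn or whether each edge is kept independently with probability X%; the version below assumes the former, and the name `sample_edges` is hypothetical.

```python
import random


def sample_edges(edges, percent, seed=None):
    """Return a uniformly random subset holding `percent`% of the edges.

    Edges are drawn without replacement, so no edge appears twice in the sample.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)  # a local generator keeps samplings reproducible
    keep = round(len(edges) * percent / 100)
    return rng.sample(edges, keep)


# Toy network: a chain of 100 edges, downsampled to 25%.
edges = [(i, i + 1) for i in range(100)]
sampled = sample_edges(edges, 25, seed=1)
print(len(sampled))
```

Passing a different `seed` for each repetition gives independent random samplings of the same size.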

Page 17: KONECT Cloud – Large Scale Network Mining in the Cloud

Computation

× 1 000 vertices (sampled)
× 120 840 391 edges
× 20 sample sizes (5%, 10%, …, 100%)
× 50 random samplings

Evaluation on single machine:

1 TiB memory
64 cores
Matlab 64 bit
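
To get a feel for the size of this experiment, the figures on the slide can be multiplied out. The sketch below reads the "×" items as multiplicative factors and assumes that one search from one start vertex touches each edge of the sampled network about once, and that a sample at X% has X% of the 120 840 391 edges. Both are assumptions for illustration, not statements from the slides, and all constant and function names are hypothetical.

```python
from itertools import product

SOURCE_VERTICES = 1_000           # sampled start vertices per run
FULL_EDGES = 120_840_391          # edge count given on the slide
SAMPLE_PERCENTS = range(5, 101, 5)  # 5%, 10%, ..., 100%
REPETITIONS = 50                  # random samplings per sample size


def experiment_grid():
    """All (sample size in percent, repetition index) combinations."""
    return list(product(SAMPLE_PERCENTS, range(REPETITIONS)))


def estimated_edge_visits():
    """Rough count of edge visits over the whole experiment."""
    total = 0
    for percent, _ in experiment_grid():
        sampled_edges = FULL_EDGES * percent // 100
        total += SOURCE_VERTICES * sampled_edges
    return total


print(len(experiment_grid()), "runs")
print(f"{estimated_edge_visits():.2e} estimated edge visits")
```

Under these assumptions the smaller samples account for only a small share of the total work, which is the point of the experiment: if a 5% sample already gives a usable estimate, most of the cost of the full-size runs can be avoided.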

Page 18: KONECT Cloud – Large Scale Network Mining in the Cloud

Results

Page 19: KONECT Cloud – Large Scale Network Mining in the Cloud

Dr. Jérôme Kunegis


west.uni-koblenz.de

Thank You!

konect.uni-koblenz.de