From complex systems to networks: discovering and modeling the higher-order network
Nitesh Chawla
Frank M. Freimann Professor of Computer Science and Engineering
Director, iCeNSA
3/2/17 11:17 AM
Apr 12, 2017
Our world is complex
Ship 1: Shanghai → Singapore → Los Angeles → …
Ship 2: Tokyo → Singapore → Seattle → …
Ship 3: Shanghai → Singapore → Hong Kong → …
Ship 4: Hong Kong → Singapore → Seattle → …
… … … …
Ship trajectories → Global shipping network
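To make the step from trajectories to a network concrete, here is a minimal sketch of the conventional (first-order) construction: count every consecutive pair of ports and use the count as the edge weight. The function name `first_order_network` is hypothetical, and the four truncated trajectories from the slide serve as toy input.

```python
from collections import Counter

def first_order_network(trajectories):
    """Count consecutive (source, destination) pairs across all trajectories.

    Returns a Counter mapping (source, destination) to the edge weight.
    """
    weights = Counter()
    for path in trajectories:
        for src, dst in zip(path, path[1:]):
            weights[(src, dst)] += 1
    return weights

# Toy input: the visible part of the four ship trajectories.
ships = [
    ["Shanghai", "Singapore", "Los Angeles"],
    ["Tokyo", "Singapore", "Seattle"],
    ["Shanghai", "Singapore", "Hong Kong"],
    ["Hong Kong", "Singapore", "Seattle"],
]
network = first_order_network(ships)
for (src, dst), w in sorted(network.items()):
    print(f"{src} -> {dst}: {w}")
```

Note that this representation forgets where a ship came from before Singapore, which is the limitation the higher-order network is meant to address.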
Our world is complex
User 1: Company ranking → Job listing → Apply → …
User 2: Weather → Homepage → News → …
User 3: News → Sports → Scores → …
User 4: Events → Homepage → Weather → …
… … … …
Web page clickstreams → Web traffic network
Data
• Ship movements
• Web clickstreams
• Phone call cascades
• … …
Network representation
• Global shipping network
• Web traffic network
• Social network
• … …
Network analysis
• Clustering
• Ranking
• Link prediction
• Anomaly detection
• … …
Complex systems: representation
How to best represent such big data, and reveal the intrinsic connections concisely and accurately?
Enriching the network
Conventionally: every node represents a single entity (location, state, etc.)
$$P(X_{t+1} = i_{t+1} \mid X_t = i_t) = \frac{W(i_t \to i_{t+1})}{\sum_j W(i_t \to j)}$$
Now: break down nodes into higher-order nodes that carry different dependency relationships
$$P(X_{t+1} = j \mid X_t = i|h) = \frac{W(i|h \to j)}{\sum_k W(i|h \to k)}$$
Compatible with existing tools!
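A small sketch of why this stays compatible with existing tools: the transition probability is the same weight normalization in both cases, and only the node labels change. The label convention `"Singapore|Shanghai"` (current entity, then history) and all edge weights below are assumptions made up for illustration.

```python
from collections import defaultdict

def transition_probabilities(weights):
    """Normalize edge weights W(src -> dst) by the total outgoing weight of src."""
    totals = defaultdict(float)
    for (src, _dst), w in weights.items():
        totals[src] += w
    return {(src, dst): w / totals[src] for (src, dst), w in weights.items()}

# Conventional network: one node per port (toy weights).
first_order = {
    ("Singapore", "Los Angeles"): 10,
    ("Singapore", "Seattle"): 10,
}

# Higher-order network: Singapore is split by where the ship came from.
higher_order = {
    ("Singapore|Shanghai", "Los Angeles"): 9,
    ("Singapore|Shanghai", "Seattle"): 1,
    ("Singapore|Tokyo", "Los Angeles"): 1,
    ("Singapore|Tokyo", "Seattle"): 9,
}

p1 = transition_probabilities(first_order)
p2 = transition_probabilities(higher_order)
print(p1[("Singapore", "Los Angeles")])
print(p2[("Singapore|Shanghai", "Los Angeles")])
```

The same function handles both networks unchanged, because a higher-order node is just another node as far as the random walk is concerned.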
Fixed-order vs. variable-order
Assuming a fixed order beyond the second order becomes impractical because “higher-order Markov models are more complex” due to combinatorial explosion
--- Rosvall et al. (Nature Comm. 2014)
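The combinatorial explosion can be illustrated with simple arithmetic: if every node of a fixed order-k model is a sequence of k entities, then N entities allow up to N^k such nodes. The sketch below computes only this upper bound; the figure of 1,000 entities is a made-up example, not a number from the talk.

```python
def max_fixed_order_nodes(num_entities: int, order: int) -> int:
    """Upper bound on distinct nodes when each node is a sequence of `order` entities."""
    return num_entities ** order

# Hypothetical system with 1,000 entities.
for k in (1, 2, 3, 5):
    print(f"order {k}: up to {max_fixed_order_nodes(1000, k):,} nodes")
```

A variable-order representation avoids this bound by adding higher-order nodes only where the data show a dependency.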
How to construct HON?
Raw data
• Sequential data
Rule extraction
• Which nodes need to be split into higher-order nodes, and how high the orders are
Network wiring
• Connecting nodes representing different orders of dependency
HON
• Use the HON like a conventional network for analyses
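The rule extraction step can be sketched as follows. The slide does not state the criterion for splitting a node, so this sketch assumes one: a second-order context becomes a rule when its next-step distribution differs from the first-order distribution by more than a fixed KL-divergence threshold. The function names, the threshold value, and the restriction to second order are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def next_step_counts(sequences, order):
    """Map each context (tuple of `order` consecutive entities) to a Counter of next entities."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[tuple(seq[i - order:i])][seq[i]] += 1
    return counts

def kl_divergence(p_counts, q_counts):
    """KL divergence (in bits) between two next-step distributions given as counts."""
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    return sum(
        (c / p_total) * math.log2((c / p_total) / (q_counts[x] / q_total))
        for x, c in p_counts.items()
    )

def second_order_rules(sequences, threshold=0.1):
    """Return second-order contexts whose behavior deviates from the first-order one."""
    first = next_step_counts(sequences, 1)
    second = next_step_counts(sequences, 2)
    return {
        ctx: dict(nxt)
        for ctx, nxt in second.items()
        # Assumed test: compare against the context with the oldest step dropped.
        if kl_divergence(nxt, first[ctx[-1:]]) > threshold
    }

# Toy data: the next port after Singapore depends on the previous port.
sequences = (
    [["Shanghai", "Singapore", "Los Angeles"]] * 9
    + [["Shanghai", "Singapore", "Seattle"]]
    + [["Tokyo", "Singapore", "Seattle"]] * 9
    + [["Tokyo", "Singapore", "Los Angeles"]]
)
rules = second_order_rules(sequences)
print(sorted(rules))
```

With memoryless data, where the previous port carries no extra information, the divergence is zero and no higher-order rule is produced.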
Network wiring
A. Convert all first-order rules into edges
B. Convert higher-order rules
   • Add higher-order nodes when necessary
C. Rewire edges
   • The edge weights are preserved
D. Rewire remaining edges
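One possible reading of steps A to D, as a simplified sketch rather than the authors' exact procedure. It assumes rules are given as a mapping from a context tuple (history, with the current entity last) to next-entity weights, and it rewires each edge to the highest-order existing node that matches the source's history plus the target. The function name `wire_hon` and the toy weights are hypothetical.

```python
from collections import defaultdict

def wire_hon(rules):
    """Build HON edges from rules of the form {context_tuple: {next_entity: weight}}."""
    edges = defaultdict(dict)
    # Steps A and B: each rule becomes out-edges of its context node.
    # Contexts longer than one entity are the higher-order nodes.
    for context, targets in rules.items():
        for nxt, weight in targets.items():
            edges[context][(nxt,)] = weight
    nodes = set(edges)

    # Steps C and D: point each edge at the highest-order node that matches
    # the source's history followed by the target; weights are preserved.
    rewired = {}
    for src, targets in edges.items():
        rewired[src] = {}
        for (nxt,), weight in targets.items():
            path = src + (nxt,)
            target = (nxt,)
            for start in range(len(path) - 1):  # longest candidate first
                if path[start:] in nodes:
                    target = path[start:]
                    break
            rewired[src][target] = weight
    return rewired

# Toy rules (weights are made up for illustration).
rules = {
    ("Shanghai",): {"Singapore": 10},
    ("Tokyo",): {"Singapore": 10},
    ("Singapore",): {"Los Angeles": 10, "Seattle": 10},
    ("Shanghai", "Singapore"): {"Los Angeles": 9, "Seattle": 1},
    ("Tokyo", "Singapore"): {"Los Angeles": 1, "Seattle": 9},
}
hon = wire_hon(rules)
for src, targets in hon.items():
    print(src, "->", targets)
```

In this toy case the edge Shanghai → Singapore is redirected to the node for "Singapore, arriving from Shanghai", so a walker leaving Shanghai follows the second-order probabilities at its next step.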
Higher-order dependencies revealed by HON
Data | # Records | Dependencies revealed | Similar observations
Ship movement | 3,415,577 | Up to 5th order | N/A
Clickstream | 3,047,697 | Up to 3rd order | Chierichetti et al. (2012), quoted below
Retweet | 23,755,810 | N/A
“… appear to saturate at k = 3 for Yahoo … browsing behavior across websites is definitely not Markovian but can be captured reasonably well by a not-too-high order Markov chain.”
--- Chierichetti et al. (2012)
Invasive species
Zebra mussels @ Great Lakes
Clogging water pipes, attaching to boats
Photos: Great Lakes Environmental Research Lab; TIME & LIFE Images, Getty Images
$120 billion / year in damage & control costs
Ranking on clickstream network
User 1: Company ranking → Job listing → Apply → …
User 2: Weather → Homepage → News → …
User 3: News → Sports → Scores → …
User 4: Events → Homepage → Weather → …
… … … …
• 26% of pages show more than a 10% change in ranking
• More than 90% of pages lose PageRank score, while a few pages gain significantly
No changes to the ranking algorithm
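To illustrate what "no changes to the ranking algorithm" can look like in practice, the sketch below runs an ordinary weighted PageRank (plain power iteration) on a tiny HON and then sums the scores of all nodes that represent the same page. Nodes are tuples with the current page last; the toy graph, the function names, and the summing step are assumptions for illustration.

```python
from collections import defaultdict

def pagerank(edges, damping=0.85, iterations=100):
    """Standard weighted PageRank by power iteration on {src: {dst: weight}}."""
    nodes = set(edges) | {d for targets in edges.values() for d in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for src in nodes:
            targets = edges.get(src, {})
            total = sum(targets.values())
            if total == 0:
                # Dangling node: spread its score evenly over all nodes.
                for node in nodes:
                    new[node] += damping * rank[src] / n
            else:
                for dst, w in targets.items():
                    new[dst] += damping * rank[src] * w / total
        rank = new
    return rank

def aggregate_by_page(rank):
    """Sum the scores of all (higher-order) nodes that represent the same page."""
    pages = defaultdict(float)
    for node, score in rank.items():
        pages[node[-1]] += score  # last element of the tuple is the current page
    return dict(pages)

# Toy HON: ("News", "Sports") is the Sports page reached from News.
hon_edges = {
    ("News",): {("News", "Sports"): 3, ("Weather",): 1},
    ("News", "Sports"): {("Scores",): 3},
    ("Sports",): {("Scores",): 1, ("News",): 1},
    ("Scores",): {("News",): 1},
    ("Weather",): {("News",): 1},
}
page_scores = aggregate_by_page(pagerank(hon_edges))
for page, score in sorted(page_scores.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

The PageRank routine itself is unaware of higher-order nodes; only the input network and the final aggregation differ from the conventional setup.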
Summary
Data
• Ship movements
• Web clickstreams
• Phone call cascades
• … …
Network representation
• Global shipping network
• Web traffic network
• Social network
• … …
Network analysis
• Clustering
• Ranking
• Link prediction
• Anomaly detection
• … …
Full paper
• Jian Xu, Thanuka L. Wickramarathne, and Nitesh V. Chawla. "Representing Higher-order Dependencies in Networks." Science Advances 2, e1600028 (2016).
• Jun Tao, Jian Xu, Chaoli Wang, and Nitesh V. Chawla. "HonVis: Visualizing and Exploring Higher Order Networks." IEEE PacificVis, 2017.