Saurabh Nagrecha and Nitesh V. Chawla University of Notre Dame Recurrent Subgraph Prediction PReSub Horst Bunke University of Bern

Recurrent Subgraph Prediction: PReSubsnagrech/papers/PresubASONAM2015_slides.pdf · Saurabh Nagrecha and Nitesh V. Chawla University of Notre Dame Recurrent Subgraph Prediction" PReSub

Jun 08, 2020

Page 1

Saurabh Nagrecha and Nitesh V. Chawla University of Notre Dame

Recurrent Subgraph Prediction: PReSub

Horst Bunke University of Bern

Page 2

Interactions in networks aren’t always dyadic.

Fig 1a: Simple group messages (hub-and-spoke).

Fig 1b: More complex hierarchies in businesses.

Page 3

Influence in social networks –  [Leskovec, J., et al. Advances in Knowledge Discovery and Data Mining (2006)]

Inferring attacks in anonymized social networks –  [Backstrom, L., et al. WWW (2007)]

Functional discovery in biological networks –  [Hu, H., et al. Bioinformatics (2005)]

Page 4

Fig 2: Weekly snapshots of  the Enron email corpus.

Source Node    Destination Node    t
432            23432               54
4254           437854              54
473743         32                  55
93535          35443               55

Snapshot timeline: t_{n-1}, t_n, t_{n+1}
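The edge list above can be turned into per-timestamp snapshots along the following lines. This is a minimal sketch, not the authors' code: the function name `build_snapshots` and the tuple layout `(source, destination, t)` are assumptions made for illustration.

```python
from collections import defaultdict

def build_snapshots(edges):
    """Group (source, destination, t) records into one edge set per timestamp."""
    snapshots = defaultdict(set)
    for src, dst, t in edges:
        snapshots[t].add((src, dst))
    # Order by timestamp so iteration follows t_{n-1}, t_n, t_{n+1}.
    return dict(sorted(snapshots.items()))

# The four rows shown in the table above.
edges = [
    (432, 23432, 54),
    (4254, 437854, 54),
    (473743, 32, 55),
    (93535, 35443, 55),
]
snapshots = build_snapshots(edges)
```

Each value in `snapshots` is a set of directed links, which matches the "bag of links" view used later in the slides.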

Page 5

Fig 3: Distribution of recurrent edges in the Enron network (across all timestamps).

[Plot axes: Recurrent Edges; Frequency of Recurrence]

Page 6

Say we have network snapshots as below:
G1 = {l1, l2, l3}
G2 = {l2, l3, l4}
G3 = {l1, l2, l4}
G4 = {l1, l2, l3}

We aren’t interested in the links that are static signals; instead we want to register the “blips”:
G’1 = { }
G’2 = { }
G’3 = {l1}
G’4 = {l3}
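The slide does not spell out the rule that produces the “blips”. One reading that reproduces the example is: a link is a blip at time t if it is present at t, absent at t-1, and was seen in some earlier snapshot. The sketch below implements that assumed rule; the name `recurrent_links` is hypothetical.

```python
def recurrent_links(snapshots):
    """For each snapshot, return the links that reappear after an absence."""
    seen = set()      # every link observed before the current snapshot
    previous = set()  # links in the immediately preceding snapshot
    blips = []
    for current in snapshots:
        # Recurrent: known from the past, but missing in the last snapshot.
        blips.append({l for l in current if l in seen and l not in previous})
        seen |= current
        previous = current
    return blips

G = [
    {"l1", "l2", "l3"},
    {"l2", "l3", "l4"},
    {"l1", "l2", "l4"},
    {"l1", "l2", "l3"},
]
G_prime = recurrent_links(G)
```

Under this rule the always-present link l2 never registers, and l4 does not register at G2 because it is new rather than recurring.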

Network instance = bag of links = set of transactions

Most frequently recurring links = GetFrequentTransactions(allTransactions, minSupport) = GetFrequentTransactions([G’1, …, G’n], minSupport)
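`GetFrequentTransactions` is named on the slide but not defined. A minimal stand-in, assuming it performs frequent itemset mining with `minSupport` as an absolute count, is a brute-force subset counter. It is exponential in transaction size and only meant to show the idea on small inputs; the example transactions are made up.

```python
from collections import Counter
from itertools import combinations

def get_frequent_transactions(transactions, min_support):
    """Count every non-empty subset of each transaction and keep the frequent ones."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}

# Hypothetical blip sets G'1..G'5 (not taken from the slides).
blips = [set(), {"l1", "l3"}, {"l1"}, {"l1", "l3"}, {"l4"}]
frequent = get_frequent_transactions(blips, min_support=2)
```

Here the pair (l1, l3) survives the support threshold, so it is a candidate recurrent subgraph rather than just two independent recurrent links.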

Page 7

There are predictable patterns in networks. Can we identify:
–  What these patterns are
–  When they occur
–  Effective “early warning” methods to predict them

Page 8

There are predictable patterns in networks. Can we identify:
–  What these patterns are
–  When they occur
–  Effective “early warning” methods to predict them

Solution: Frequent Subgraph Mining

[Plot axes: Recurrent Edges; Frequency of Recurrence]

Page 9

There are predictable patterns in networks. Can we identify:
–  What these patterns are
–  When they occur
–  Effective “early warning” methods to predict them

Solution: Subgraph Prediction

Page 10

There are predictable patterns in networks. Can we identify:
–  What these patterns are
–  When they occur
–  Effective “early warning” methods to predict them

Solution: Early Warning Subgraphs

Page 11

Predict individual links of the subgraph:
–  If l ∈ subgraph, predict for occurrence of l.
–  If l ∉ subgraph, predict for non-occurrence of l.
State-of-the-art link prediction methods employed using the LPMade suite [Lichtenwalter, R. & Chawla, N. JMLR (2011)].

Fig 4: Exact graph matching.
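The link-by-link baseline can be sketched as follows. The per-link probabilities would come from a link predictor; here they are made-up numbers, and combining them by multiplication assumes the links are independent, which the slides do not state. The function name `subgraph_score` is hypothetical and is not part of LPMade.

```python
def subgraph_score(link_probs, target_links):
    """Score an exact match: every target link present, every other candidate link absent."""
    score = 1.0
    for link, p in link_probs.items():
        # Links in the target should occur; all other candidate links should not.
        score *= p if link in target_links else (1.0 - p)
    return score

# Hypothetical predicted probabilities for the candidate links among nodes a, b, c.
link_probs = {("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.1}
target = {("a", "b"), ("a", "c")}
score = subgraph_score(link_probs, target)  # 0.9 * 0.8 * (1 - 0.1)
```

Because the score is a product over all candidate links, a single poorly predicted link drags the whole subgraph score down, which is one reason to treat the subgraph as a unit instead.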

Page 12

We use graph edit distances (GEDs) to contextualize the subgraph’s occurrence with respect to the global scenario in the network. Advantages of using GEDs:
o  Flexible definition allows for weighted/unweighted, directed/undirected graphs.
o  Inexact matching allows for “unknown” links in graph instances.
o  Near-linear time approximate implementation [Andoni, A. & Onak, K., SIAM Journal on Computing (2012)].

Fig 5: A pictorial summary of vector space embedding in GEDs.
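The embedding idea can be illustrated with a deliberately simplified distance. The slides rely on an approximate near-linear-time edit distance; the sketch below swaps in the size of the symmetric difference of edge sets, which equals a unit-cost edit distance only when node identities are known and shared across graphs. The function names and the prototype graphs are assumptions.

```python
def edit_distance(g1, g2):
    """Simplified GED for graphs on a shared labeled node set:
    the number of edge insertions and deletions that turn g1 into g2."""
    return len(g1 ^ g2)

def embed(graph, prototypes):
    """Map a graph to the vector of its distances to each prototype graph."""
    return [edit_distance(graph, p) for p in prototypes]

# Hypothetical prototype graphs and one network snapshot.
prototypes = [{("a", "b")}, {("a", "b"), ("b", "c")}, set()]
snapshot = {("a", "b"), ("a", "c")}
vector = embed(snapshot, prototypes)
```

Each snapshot becomes a fixed-length numeric vector, one coordinate per prototype, so standard classifiers can be trained on it.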

Page 13

Networks often exhibit a telltale “build up” to the desired structure. Use these early warning subgraphs as features to predict the target subgraph. Learn how these breadcrumbs lead to the target subgraph for the given data.

Fig 6: Early warning subgraphs as features to predict the target subgraph.
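A minimal sketch of the feature construction, under two assumptions not stated on the slide: features are binary presence indicators of each early warning subgraph, and the target is predicted exactly one snapshot ahead. The helper names and the toy snapshots are hypothetical.

```python
def contains(snapshot, subgraph):
    """A subgraph is present when all of its links appear in the snapshot."""
    return subgraph <= snapshot

def build_training_pairs(snapshots, early_warnings, target):
    """Pair early-warning indicators at time t with target occurrence at time t+1."""
    X, y = [], []
    for now, nxt in zip(snapshots, snapshots[1:]):
        X.append([int(contains(now, w)) for w in early_warnings])
        y.append(int(contains(nxt, target)))
    return X, y

ab, bc, ca = ("a", "b"), ("b", "c"), ("c", "a")
snapshots = [{ab}, {ab, bc}, {ab, bc, ca}, {ab}]
early_warnings = [{ab}, {ab, bc}]   # partial structures that precede the target
target = {ab, bc, ca}               # the triangle we want to predict
X, y = build_training_pairs(snapshots, early_warnings, target)
```

The resulting `X` and `y` can be passed to any binary classifier; the choice of classifier is left open here.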

Page 14

Our method achieves high AUROC performance in predicting subgraphs, and outperforms link prediction on:
–  Commercial cellular phone calls
–  Wikipedia co-authorship
–  Enron email corpus
–  Facebook wall posts

Conclusion: We need to think of subgraphs as emergent structures in their own right and not just a composition of links.

Fig 7: AUROC performance of our method vs. baseline link prediction.
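For readers unfamiliar with the metric, AUROC can be computed directly from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below is a generic quadratic-time illustration with made-up labels and scores, not the evaluation code behind Fig 7.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical ground truth (did the subgraph occur?) and predicted scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
result = auroc(labels, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of occurrences above non-occurrences.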

Page 15

Army Research Laboratory (ARL)

U.S. Air Force Office of Scientific Research (AFOSR)

Defense Advanced Research Projects Agency (DARPA)

National Science Foundation (NSF)

Research was supported in part by the Army Research Laboratory under Cooperative Agreement Number W911NF-09-2-0053, National Science Foundation (NSF) Grant OCI-1029584, and by the grant FA9550-12-1-0405 from the U.S. Air Force Office of Scientific Research (AFOSR) and the Defense Advanced Research Projects Agency (DARPA).

Page 16

–  Leskovec, J., Singh, A. & Kleinberg, J. Patterns of influence in a recommendation network. In Advances in Knowledge Discovery and Data Mining, 380-389 (Springer, 2006).

–  Backstrom, L., Dwork, C. & Kleinberg, J. Wherefore art thou r3579x?: Anonymized social networks, hidden patterns, and structural steganography. In Proceedings of the 16th international conference on World Wide Web, 181-190 (ACM, 2007).

–  Hu, H., Yan, X., Huang, Y., Han, J. & Zhou, X. J. Mining coherent dense subgraphs across massive biological networks for functional discovery. Bioinformatics 21, i213-i221 (2005).

–  Lichtenwalter, R. N. & Chawla, N. V. LPMade: Link prediction made easy. Journal of Machine Learning Research 12, 2489-2492 (2011).

–  Andoni, A. & Onak, K. Approximating Edit Distance in Near-Linear Time. SIAM Journal on Computing 41, 1635-1648 (2012).

Page 17

Q & A

Page 18

Fig 8a (above): How prototypes are selected.

Fig 8b (right): How many prototypes are enough?

Page 19

Fig 9: If a very broad window is chosen, the fine-grained aspects of recurrence may be lost, and if a very narrow window is chosen, there may be very little difference between two consecutive snapshots.
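One way to inspect the window-size trade-off is to rebuild the snapshots at several widths and look at how many snapshots result and how similar consecutive ones are. The sketch below is an assumed diagnostic, not the authors' procedure; the function names are hypothetical, and the toy edge list is not meant to reproduce Fig 9.

```python
def window_snapshots(edges, width):
    """Bucket (source, destination, t) records into consecutive windows of the given width."""
    buckets = {}
    for src, dst, t in edges:
        buckets.setdefault(t // width, set()).add((src, dst))
    return [buckets[k] for k in sorted(buckets)]

def mean_consecutive_jaccard(snapshots):
    """Average Jaccard similarity between each snapshot and the next; None if fewer than two."""
    pairs = list(zip(snapshots, snapshots[1:]))
    if not pairs:
        return None
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Hypothetical timestamped edges.
edges = [(1, 2, 0), (1, 2, 1), (1, 2, 2), (3, 4, 2), (1, 2, 3), (3, 4, 5)]
narrow = window_snapshots(edges, width=1)
broad = window_snapshots(edges, width=3)
```

Comparing `len(...)` and `mean_consecutive_jaccard(...)` across widths on real data gives a rough sense of where recurrence is still visible without the snapshots becoming near-duplicates.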

Page 20

Dataset     Number of Nodes    Number of Edges    Time Span
Mobile      8,321,119          712 million        65 days
Wiki        25,323,882         266 million        ~4 years
Enron       87,098             1,147,028          ~4 years
Facebook    46,715             803,744            ~2 years

Table 1: A summary of the data sources used.