CMU SCS Thank you! C. Faloutsos CMU
CMU SCS Thank you! C. Faloutsos CMU. CMU SCS Large Graph Mining C. Faloutsos CMU.

Dec 24, 2015

CMU SCS

Thank you!

C. Faloutsos

CMU

Large Graph Mining

C. Faloutsos

CMU

KDD'10 C. Faloutsos 4

Outline

• Credit where credit is due

• Technical part
  – Data mining
  – Can it be automated?
  – Research challenges

• Non-technical part: ‘Listen’
  – To the data
  – To non-experts

Nominator

• Jian Pei

Endorsers

• Charu C. Aggarwal (IBM Research)
• Ricardo Baeza-Yates (Yahoo! Research)
• Albert-Laszlo Barabasi (Northeastern University)
• Denilson Barbosa (University of Alberta)
• Yixin Chen (Washington University at St. Louis)

• William Cohen (Carnegie Mellon University)
• Diane J. Cook (Washington State University)
• Gautam Das (University of Texas at Arlington)
• Inderjit S. Dhillon (University of Texas at Austin)
• Chris H. Q. Ding (University of Texas at Arlington)

• Petros Drineas (Rensselaer Polytechnic Institute)
• Tina Eliassi-Rad (Lawrence Livermore National Laboratory)
• Greg Ganger (Carnegie Mellon University)
• Minos Garofalakis (Technical University of Crete)
• James Garrett (Carnegie Mellon University)

• Dimitrios Gunopulos (University of Athens)
• Xiaofei He (Zhejiang University)
• Panagiotis G. Ipeirotis (New York University)
• Eamonn Keogh (UCR)
• Hiroyuki Kitagawa (University of Tsukuba)
• Tamara Kolda (Sandia Nat. Labs)

• Flip Korn (AT&T Research)
• Nick Koudas (University of Toronto)
• Hans-Peter Kriegel
• Ravi Kumar (Yahoo! Research)
• Laks Lakshmanan (UBC)
• Jure Leskovec (Stanford University)

• Nikos Mamoulis (Hong Kong University)
• Heikki Mannila (Aalto University)
• Dharmendra S. Modha (IBM Research)
• Mario Nascimento (University of Alberta)
• Jennifer Neville (Purdue University)
• Beng Chin Ooi (National University of Singapore)

• Dimitris Papadias (Hong Kong University of Science and Technology)
• Spiros Papadimitriou (IBM Research)
• Jian Pei (Simon Fraser University)
• Foster Provost (New York University)
• Oliver Schulte (Simon Fraser University)
• Dennis Shasha (New York University)
• Srinivasan Parthasarathy (OSU)

• Jimeng Sun (IBM Research)
• Dacheng Tao (Nanyang University of Technology)
• Yufei Tao (The Chinese University of Hong Kong)
• Evimaria Terzi (Boston University)
• Alex Thomo (University of Victoria)
• Andrew Tomkins (Google Research)

• Caetano Traina (University of Sao Paulo)
• Vassilis Tsotras (University of California, Riverside)
• Alex Tuzhilin (New York University)
• Haixun Wang (Microsoft Research)

• Wei Wang (University of North Carolina at Chapel Hill)
• Philip S. Yu (University of Illinois, Chicago)
• Zhongfei Zhang (Binghamton University, State University of New York)

KDD committee

• Ramasamy Uthurusamy, Chair

• Robert Grossman (University of Illinois at Chicago)

• Jiawei Han (University of Illinois at Urbana-Champaign)

• Tom Mitchell (Carnegie Mellon University)

• Gregory Piatetsky-Shapiro (KDnuggets)

• Raghu Ramakrishnan (Yahoo! Research)

• Sunita Sarawagi (Indian Institute of Technology, Bombay)

• Padhraic Smyth (University of California at Irvine)

• Ramakrishnan Srikant (Google Research)

• Xindong Wu (University of Vermont)

• Mohammed J. Zaki (Rensselaer Polytechnic Institute)

Family

• Parents Nikos & Sophia

• Siblings Michalis*, Petros*, Maria

• Wife Christina#

(*) : and co-authors
(#) : and research impact evaluator (‘grandpa’ test - see later…)

Academic ‘parents’

• Christodoulakis, Stavros (T.U.C.)

• Sevcik, Ken (U of T)

• Roussopoulos, Nick (UMD)

Academic ‘children’

• King-Ip (David) Lin
• Ibrahim Kamel
• Flip Korn
• Byoung-Kee Yi
• Leejay Wu
• Deepayan Chakrabarti

• Jia-Yu (Tim) Pan

• Spiros Papadimitriou

• Jimeng Sun

• Jure Leskovec

• Hanghang Tong

• Mary McGlohon
• Fan Guo
• Lei Li
• Leman Akoglu
• Duen Horng (Polo) Chau
• Aditya Prakash
• U Kang

CMU colleagues

• Tom Mitchell
• Garth Gibson
• Greg Ganger
• M. (Satya) Satyanarayanan
• Howard Wactlar
• Jeannette Wing
• + +

Co-authors

• [dblp 7/2010:] All 300 of you

• Agma J. M. Traina (22)
• Caetano Traina Jr. (20)
• …

Funding agencies

• NSF (Maria Zemankova, Frank Olken, ++)

• DARPA, LLNL, PITA

• IBM, MS, HP, INTEL, Y!, Google, Symantec, Sony, Fujitsu, …

Outline

• Credit where credit is due

• Technical part
  – Data mining
  – Can it be automated?
  – Research challenges

• Non-technical part: ‘Listen’
  – To the data
  – To non-experts

Data mining = compression & …

Christos Faloutsos, Vasileios Megalooikonomou: On data mining, compression, and Kolmogorov complexity. Data Min. Knowl. Discov. 15(1): 3-20 (2007)

But: how can compression
• do forecasting?
• spot outliers?
• do classification?

OK – then, isn’t compression a solved problem (gzip, LZ)?

… compression is undecidable!

Theorem*: for an arbitrary string x, computing its Kolmogorov complexity K(x) is undecidable

(*) E.g., [T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley and Sons, 1991, section 7.7]

A.N. Kolmogorov

EVEN WORSE than NP-hard!
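
A minimal sketch of the gap between "gzip-style" compression and K(x), assuming Python's standard `zlib` module as a stand-in for gzip/LZ. The helper name `compressed_size` and both test strings are illustrative assumptions, not material from the talk. A real compressor gives only an upper bound on K(x): the pseudo-random string below does not shrink at all under zlib, although a very short program (a seed plus a generator) reproduces it exactly.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed form of `data`.

    This is only an upper bound (up to a constant) on the Kolmogorov
    complexity K(data); K itself cannot be computed in general.
    """
    return len(zlib.compress(data, 9))


# A string with an obvious repeating pattern: zlib finds it easily.
regular = b"ab" * 5000

# A pseudo-random string of the same length: zlib cannot shrink it,
# yet "seed 0, 10000 bytes" is a far shorter description of it.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(10000))

print("regular:", len(regular), "->", compressed_size(regular))
print("noisy:  ", len(noisy), "->", compressed_size(noisy))
```

The second string is the interesting case: a better model (here, knowing the generator) beats the general-purpose compressor by a wide margin, which is the sense in which there is always room for better models.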

…which means there will always be better data mining tools/models/patterns to be discovered

-> job security

-> job satisfaction

Let’s see some examples of models

[Figure: plot with axes ‘Body weight’ and ‘Response to new drug’]

[Figure: plot with axes ‘income’ and ‘$ spent’, built up over four slides; the last one adds the label ‘3/4’]

[Figure: plot with axes ‘mass’ and ‘Metabolic rate’, labeled ‘3/4’: Kleiber’s law. Source: http://universe-review.ca/R10-35-metabolic.htm]
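
A minimal sketch of how a ¾ exponent like the one in Kleiber's law can be recovered from data: a power law y = c·x^a is a straight line in log-log space, so an ordinary least-squares fit on (log x, log y) estimates a. The data here are SYNTHETIC, generated with exponent 0.75 plus noise; the constant 3.0, the noise level, and the function name `fit_power_law` are assumptions made for illustration only.

```python
import math
import random


def fit_power_law(xs, ys):
    """Least-squares fit of log y = log c + a * log x.

    Returns (a, c) for the model y = c * x**a. All inputs must be positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    sxx = sum((u - mean_x) ** 2 for u in lx)
    a = sxy / sxx
    c = math.exp(mean_y - a * mean_x)
    return a, c


# SYNTHETIC data, not real measurements: masses spread over six orders
# of magnitude, rates following mass**0.75 with multiplicative noise.
rng = random.Random(42)
masses = [10 ** rng.uniform(-2, 4) for _ in range(200)]
rates = [3.0 * m ** 0.75 * math.exp(rng.gauss(0, 0.1)) for m in masses]

exponent, constant = fit_power_law(masses, rates)
print(f"fitted exponent: {exponent:.3f}, fitted constant: {constant:.3f}")
```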

Outline

• Credit where credit is due

• Technical part
  – Data mining
  – Can it be automated? NO!
    • Always room for better models
  – Research challenges

• Non-technical part: ‘Listen’
  – To the data
  – To non-experts

Always room for better models

• E.g.: clustering – k-means (or our favorite clustering algorithm)

• How many clusters are in the Sierpinski triangle?

K=3 clusters? K=9 clusters?

• Piece-wise flat
• Mixture of (Gaussian) clusters
• ¾ power law
• ?? -> ONE, but self-similar, ‘cluster’

• Barnsley’s method of IFS (iterated function systems) can easily generate it [Barnsley+Sloan, BYTE, 1988]

~100 lines of C code: www.cs.cmu.edu/~christos/www/SRC/ifs.tar
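
A minimal sketch of an IFS for the Sierpinski triangle, written in Python rather than C and not taken from the tar file referenced above. It uses the "chaos game" form of an iterated function system: three maps, each shrinking the plane by 1/2 toward one corner, applied in random order. The corner coordinates, function names, and ASCII rendering are assumptions chosen for illustration.

```python
import random


def sierpinski_points(n_points, seed=0):
    """Generate points on the Sierpinski triangle with the chaos game.

    The IFS has three maps; each moves the current point halfway
    toward one of the three corners.
    """
    corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
    rng = random.Random(seed)
    x, y = 0.0, 0.0  # a corner, which already lies on the attractor
    points = []
    for _ in range(n_points):
        cx, cy = rng.choice(corners)
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def render(points, width=64, height=32):
    """Crude ASCII rendering of points in the unit square."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = min(int(x * width), width - 1)
        row = min(int((1.0 - y) * height), height - 1)
        grid[row][col] = "*"
    return "\n".join("".join(row) for row in grid)


points = sierpinski_points(20000)
print(render(points))
```

The whole point set is described by three tiny maps, which is why a single self-similar ‘cluster’ is a far more compact model here than K=3 or K=9 Gaussian clusters.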

• But, does self-similarity appear in real life?

Real, self-similar dataset

• the red is true
• origin: Norway
• but most other coastlines are ‘self-similar’, too!

How can we find better models?

• Obviously, an art (‘undecidable’!)

• Helps if we
  – Listen to domain experts and
  – Listen to the data (next)

Outline

• Credit where credit is due

• Technical part
  – Data mining
  – Can it be automated? NO!
  – Research challenges
    • Listen to the data (the more, the better!)

• Non-technical part: ‘Listen’
  – To the data
  – To non-experts

Scalability

• Google: > 450,000 processors in clusters of ~2000 processors each [Barroso, Dean, Hölzle, “Web Search for a Planet: The Google Cluster Architecture” IEEE Micro 2003]

• Yahoo: ~5 PB of data [Fayyad’07]

• ‘M45’: 4K processors, 3 TB RAM, 1.5 PB disk

Promising research direction: scalability

• challenges
  – Vast amounts of data; storing; cooling (!); …

• … and opportunities:
  – DATA: Easier to collect (clickstreams, sensors etc.)
  – S/W: Hadoop, HBase, Pig, …: open source
  – H/W: 1 TB disk: ~US$ 100

Promising research direction

• The more data, the more subtle patterns we may discover

• Examples of subtle patterns:

More data, more subtle patterns

[Figure: PDF, fraction of customers (log scale), against duration (log scale); the slides overlay one candidate model at a time:]

• (mixture of) Gaussians
• Zipf (Pareto, power-law)
• lognormal
• dPln (= doubly Pareto lognormal)

Mukund Seshadri, Sridhar Machiraju, Ashwin Sridharan, Jean Bolot, Christos Faloutsos, Jure Leskovec: Mobile call graphs: beyond power-law and lognormal distributions. KDD 2008: 596-604

So, dPln is the answer?

Yes, for the moment…

With more data, who knows?!

Outline

• Credit where credit is due

• Technical part
  – Data mining
  – Can it be automated? NO!
  – Research challenges
    • Listen to the data (the more, the better!)

• Non-technical part: ‘Listen’
  – To the data
  – To non-experts

Listen to non-experts

• Explain ‘why’, to a non-expert (‘grandpa’)

• (and, even harder, explain ‘how’), e.g.:
  – Perron-Frobenius for irreducible MC -> PageRank -> random surfer
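
A minimal sketch of the "random surfer" explanation, assuming the usual damped formulation of PageRank: with probability `damping` the surfer follows a random out-link, otherwise it jumps to a uniformly random page. The damping value 0.85, the four-page graph, and the function name `pagerank` are illustrative assumptions. With damping below 1 the chain is irreducible and aperiodic, so a unique stationary distribution exists and power iteration converges to it.

```python
def pagerank(out_links, damping=0.85, n_iter=100):
    """Power iteration for the random-surfer Markov chain.

    `out_links` maps every node to the list of nodes it links to;
    every link target must itself be a key of `out_links`.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(n_iter):
        # Random-jump part: every node receives the same base share.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dead end: spread this node's mass over all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[v] / n
        rank = new_rank
    return rank


# Hypothetical four-page web graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

The ‘grandpa’ version of the result: a page's score is the fraction of time a bored surfer, clicking at random, spends on it.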

Summary

• Data mining = compression = undecidable = job security
• Hence: always room for better models/patterns
  – Listen to the data (GB, TB and PB of them!)
  – Listen to domain experts (e.g., ¾ Kleiber’s law)
• Listen to non-experts (‘explain to grandpa’)

Compression, fun, recursion

• The shortest, recursive joke:

• There are 3 types of data miners
  – Those who can count
  – And those who cannot

Thank you! For the honor, and for making this wonderful research community
