Mar 16, 2020

Aaron Clauset @aaronclauset
Assistant Professor of Computer Science, University of Colorado Boulder
External Faculty, Santa Fe Institute

Network Analysis and Modeling

© 2017 Aaron Clauset

lecture 0: what are networks and how do we talk about them?

who are network scientists?

Physicists

Computer Scientists

Applied Mathematicians

Statisticians

Biologists

Ecologists

Sociologists

Political Scientists

it’s a big community!

• different traditions

• different tools

• different questions

increasingly, not ONE community, but MANY, only loosely interacting communities

who are network scientists?

Physicists: phase transitions, universality
Computer Scientists: data / algorithm oriented, predictions
Applied Mathematicians: dynamical systems, diff. eq.
Statisticians: inference, consistency, covariates
Biologists: experiments, causality, molecules
Ecologists: observation, experiments, species
Sociologists: individuals, differences, causality
Political Scientists: rationality, influence, conflict

what are networks?

• an approach

• a mathematical representation

• provide structure to complexity

• structure above individuals / components

• structure below system / population

CSCI 5352 Network Analysis and Modeling: learning goals

1. develop a network intuition for reasoning about how structural patterns are related, and how they influence dynamics in / on networks (building intuition)

2. master basic terminology and concepts (basic concepts, tools)

3. master practical tools for analyzing / modeling structure of network data (practical tools)

4. build familiarity with advanced techniques for exploring / testing hypotheses about networks (advanced tools)

Course schedule (roughly):

1. network basics
2. centrality measures
3. random graphs (simple)
4. configuration model
5. large-scale structure (communities, hierarchies, etc.)
6. probabilistic generative models (SBMs)
7. metadata, label and link prediction
8. spreading processes (social, biological, SI-type)
9. data wrangling + data sampling (artifacts)
10. role of statistics in hypothesis generation / testing
11. spatial networks
12. citation networks, dynamics, preferential attachment
13. temporal networks
14. student project presentations

Course webpage: http://www.santafe.edu/~aaronc/courses/5352/

Network data for assignments

lessons learned from past instances

what’s difficult:

1. students need to know many different things:

• some probability: Erdős-Rényi, configuration, calculations

• some mathematics: physics-style calculations, phase transitions

• some statistics: basic data analysis, correlations, distributions

• some machine learning: prediction, likelihoods, features, estimation algorithms

• some programming: data wrangling, coding up measures and algorithms

2. can’t teach all of these things to all types of students!

• vast amounts of advanced material in each of these directions

• students have little experience / intuition of what makes good science

what works well:

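1. simple mathematical problems: build intuition + practice with concepts

[figures: example problems: calculate the diameter (two groups A and B of sizes n_A and n_B); closeness centrality; betweenness of A; modularity Q(r) of a line graph divided into groups of n - r and r vertices]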
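The line-graph modularity problem can be checked numerically. The sketch below assumes the standard definition Q = Σ_c [e_c/m - (d_c/2m)²], which the slides do not spell out, and assumes the line is cut into two contiguous groups of r and n - r vertices. All function names are illustrative.

```python
def line_graph_edges(n):
    """Edges of a path (line graph) on vertices 0..n-1."""
    return [(i, i + 1) for i in range(n - 1)]


def modularity(edges, group_of):
    """Q = sum over groups c of e_c/m - (d_c/(2m))^2 for an undirected graph."""
    m = len(edges)
    internal = {}  # e_c: number of edges with both endpoints in group c
    degree = {}    # d_c: total degree of the vertices in group c
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        degree[gu] = degree.get(gu, 0) + 1
        degree[gv] = degree.get(gv, 0) + 1
        if gu == gv:
            internal[gu] = internal.get(gu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


def q_numeric(n, r):
    """Modularity of the split {0..r-1} vs {r..n-1}, computed from the edges."""
    group_of = {v: (0 if v < r else 1) for v in range(n)}
    return modularity(line_graph_edges(n), group_of)


def q_closed_form(n, r):
    """Same quantity by hand: m = n-1 edges, n-2 of them internal,
    and group degree sums 2r-1 and 2(n-r)-1. Valid for 1 <= r <= n-1."""
    m = n - 1
    return (n - 2) / m - ((2 * r - 1) ** 2 + (2 * (n - r) - 1) ** 2) / (4 * m ** 2)
```

Under these assumptions the two functions agree, and scanning r for a fixed n shows Q(r) peaking at the even split r = n/2.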

2. analyze real networks: test understanding + practice with implementing methods

[figure: mean geodesic path length (2 to 3.5) vs. network size n (10^2 to 10^5); labeled networks: USF, Haverford, Caltech, Penn]

mean geodesics and O(log n)
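A minimal sketch of how the mean geodesic path length could be measured, assuming an unweighted, undirected network stored as a dict of neighbor lists and averaging only over pairs that are connected. The helper names are illustrative, not taken from the course materials.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic (shortest-path) distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_geodesic_length(adj):
    """Average distance over all ordered pairs of distinct, connected vertices."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def ring(n):
    """Toy example: a cycle on n vertices."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

Running one breadth-first search per vertex costs O(n(n + m)) time. Computing this for networks of different sizes and plotting against log n gives the kind of comparison the figure caption describes.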

[figure: harmonic centrality (0.35 to 0.75) vs. vertex label (1 to 34); Karate club: real-world network compared with configuration model]

node centrality vs. configuration model (when is a pattern interesting?)
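The comparison of node centrality against a configuration model can be sketched as follows. This assumes harmonic centrality is defined as c_i = (1/(n-1)) Σ_{j≠i} 1/d_ij, and that the null model is built by uniform stub matching, which can create self-loops and multi-edges. The function names, the number of trials, and the seed are illustrative choices.

```python
import random
from collections import deque


def harmonic_centrality(adj):
    """c_i = (1/(n-1)) * sum_{j != i} 1/d_ij, with 1/d_ij = 0 if j is unreachable."""
    n = len(adj)
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[s] = sum(1 / d for t, d in dist.items() if t != s) / (n - 1)
    return result


def configuration_model(adj, rng):
    """Random multigraph with the same degree sequence, by uniform stub matching."""
    stubs = [u for u in adj for _ in adj[u]]  # one stub per edge end
    rng.shuffle(stubs)
    new_adj = {u: [] for u in adj}
    for a, b in zip(stubs[::2], stubs[1::2]):
        new_adj[a].append(b)
        new_adj[b].append(a)
    return new_adj


def null_mean_centrality(adj, trials=200, seed=0):
    """Mean harmonic centrality of each vertex across random rewirings."""
    rng = random.Random(seed)
    totals = {u: 0.0 for u in adj}
    for _ in range(trials):
        c = harmonic_centrality(configuration_model(adj, rng))
        for u in adj:
            totals[u] += c[u]
    return {u: totals[u] / trials for u in adj}
```

A vertex whose observed centrality sits far from its null average has a centrality that degree alone does not explain, which is one way to decide whether a pattern is interesting.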

[figure: density vs. assortativity (gender), x-axis from -0.1 to 0.1]

attribute assortativity
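A minimal sketch of an attribute assortativity calculation, assuming the standard coefficient for a categorical attribute on an undirected network: r = (Σ_i e_ii - Σ_i a_i²) / (1 - Σ_i a_i²), where e_ii is the fraction of edges joining two vertices of type i and a_i is the fraction of edge ends attached to type i. The function name and input format are illustrative.

```python
def attribute_assortativity(edges, attr):
    """Assortativity coefficient of a categorical vertex attribute.

    edges: list of (u, v) pairs of an undirected network (assumed non-empty)
    attr:  dict mapping each vertex to its attribute value
    """
    m = len(edges)
    same = 0   # edges whose two endpoints share an attribute value
    ends = {}  # number of edge ends attached to each attribute value
    for u, v in edges:
        if attr[u] == attr[v]:
            same += 1
        ends[attr[u]] = ends.get(attr[u], 0) + 1
        ends[attr[v]] = ends.get(attr[v], 0) + 1
    observed = same / m
    expected = sum((k / (2 * m)) ** 2 for k in ends.values())
    if expected == 1:
        return float("nan")  # undefined when every edge end has the same value
    return (observed - expected) / (1 - expected)
```

Values near 0 mean edges mix attribute values about as often as chance would predict given the degrees; positive values indicate homophily on the attribute.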

3. simple prediction tasks: test intuition + run numerical experiments

[figure: fraction of correct label predictions vs. fraction of labels observed, f (both 0 to 1); networks: malaria genes, HVR5 and Norwegian boards, net1m-2011-08-01]
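One simple way to run a label prediction experiment based on homophily is sketched below. It assumes a majority vote among a vertex's labeled neighbors, with the most common observed label as a fallback when no neighbor is labeled. The slides do not specify the exact predictor, so this is one plausible baseline, and the function names are illustrative.

```python
import random
from collections import Counter


def predict_labels(adj, observed):
    """Guess each hidden label as the most common label among labeled neighbors.

    Ties are broken arbitrarily (by the order in which labels are first seen).
    """
    fallback = Counter(observed.values()).most_common(1)[0][0]
    guesses = {}
    for u in adj:
        if u in observed:
            continue
        votes = Counter(observed[v] for v in adj[u] if v in observed)
        guesses[u] = votes.most_common(1)[0][0] if votes else fallback
    return guesses


def accuracy_at_fraction(adj, labels, f, trials=100, seed=0):
    """Fraction of hidden labels guessed correctly when a fraction f is observed."""
    rng = random.Random(seed)
    nodes = list(adj)
    n_obs = max(1, round(f * len(nodes)))
    correct = hidden = 0
    for _ in range(trials):
        observed = {u: labels[u] for u in rng.sample(nodes, n_obs)}
        for u, guess in predict_labels(adj, observed).items():
            correct += guess == labels[u]
            hidden += 1
    return correct / hidden if hidden else float("nan")
```

Sweeping f from 0 to 1 and plotting the returned accuracy gives a curve with the same axes as the figure.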

[figure: AUC (0.5 to 1) vs. fraction of edges observed, f (0 to 1); HVR5 malaria genes network; predictors: degree product, Jaccard coefficient, shortest path, baseline (guessing)]

label prediction via homophily; link prediction via heuristic
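The link prediction experiment can be sketched in the same spirit. The code below assumes a simple undirected network with sortable vertex names, hides a random fraction 1 - f of the edges, scores every unobserved pair with a heuristic, and reports the AUC: the probability that a hidden edge outscores a true non-edge, with ties counted as 1/2. Using 1/d as the shortest-path score is an assumption, and the function names are illustrative.

```python
import random
from collections import deque
from itertools import combinations


def build_adj(nodes, edges):
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree_product(adj, u, v):
    return len(adj[u]) * len(adj[v])


def jaccard(adj, u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def shortest_path_score(adj, u, v):
    """1/d(u, v) in the observed graph; 0 if v is unreachable from u."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return 1 / dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return 0.0


def auc(pos_scores, neg_scores):
    """P(hidden edge scores above a non-edge), counting ties as 1/2."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos_scores for q in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def link_prediction_auc(nodes, edges, f, score, seed=0):
    """AUC of a scoring heuristic when a fraction f < 1 of edges is observed."""
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    observed = set(rng.sample(edges, round(f * len(edges))))
    adj = build_adj(nodes, observed)
    held_out = [e for e in edges if e not in observed]
    true_edges = set(edges)
    non_edges = [p for p in combinations(sorted(nodes), 2) if p not in true_edges]
    return auc([score(adj, u, v) for u, v in held_out],
               [score(adj, u, v) for u, v in non_edges])
```

An AUC of 0.5 corresponds to the guessing baseline. Averaging over several seeds for each value of f would smooth the curve.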

4. simple simulations: explore dynamics vs. structure + numerical experiments

[figure: simulate epidemics (SIR) on planted partitions; axis labels: c_in - c_out, p, l]

[figure: simulate Price’s model; Pr(K ≥ k_in) vs. in-degree k_in on log-log axes (10^0 to 10^5 and 10^-6 to 10^0); curves: r=1, r=4, no preferential attachment]
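A minimal sketch of a Price's model simulation, assuming each new vertex makes c citations and that a citation lands on an existing vertex with probability proportional to (in-degree + r). It uses a common mixture shortcut: with probability c/(c + r) copy the target of a uniformly random earlier citation, otherwise pick an existing vertex uniformly. The shortcut matches the assumed rule only approximately while the network is small. The parameter names and the `preferential` switch are illustrative.

```python
import random
from collections import Counter


def price_model(n, c, r, preferential=True, seed=0):
    """Grow a citation network on n vertices; return each vertex's in-degree.

    With preferential=False every citation picks a uniformly random existing
    vertex, which gives a no-preferential-attachment comparison.
    Multiple citations to the same vertex are allowed.
    """
    rng = random.Random(seed)
    in_degree = [0] * n
    targets = []  # one entry per citation received so far
    for new in range(1, n):
        chosen = []
        for _ in range(c):
            if preferential and targets and rng.random() < c / (c + r):
                chosen.append(rng.choice(targets))  # proportional to in-degree
            else:
                chosen.append(rng.randrange(new))   # uniform over existing vertices
        for t in chosen:
            in_degree[t] += 1
            targets.append(t)
    return in_degree


def ccdf(values):
    """Pr(K >= k) for each distinct value k."""
    n = len(values)
    counts = Counter(values)
    out, remaining = {}, n
    for k in sorted(counts):
        out[k] = remaining / n
        remaining -= counts[k]
    return out
```

Plotting `ccdf(price_model(n, c, r))` on log-log axes for a few values of r, alongside the `preferential=False` case, gives curves of the kind shown in the figure.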

5. team projects: teamwork + exploring their own ideas

key takeaways

[figure: extracting highly variable regions from aligned sequences; panels labeled A to D; steps: calculate alignment scores, convert to alignment indicators, remove short aligned regions, extract highly variable regions; axes: a(t) vs. alignment position t]

• network intuition is hard to develop! good intuition draws on many skills (probability, statistics, computation, causal dynamics, etc.)

• best results come from (1) exercises to get practice with calculations, (2) practice analyzing diverse real-world networks, (3) carrying out numerical experiments & simulations

• practical tasks are a pedagogical tool (e.g., link and label prediction)

• interpreting the results requires a good intuition

• null models are a key conceptual idea: is a pattern interesting?

• networks are fun!

1. defining a network

2. describing a network

vertices: what is a vertex?

edges: when are two vertices connected?

V: distinct objects (vertices / nodes / actors)

E ⊆ V × V: pairwise relations (edges / links / ties)
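As a small illustration of the definition, the sketch below stores a network as a vertex set V and an edge list E and builds adjacency sets from them. It assumes an undirected, simple network; a directed network would keep the ordered pairs as they are. The function name is illustrative.

```python
def build_network(vertices, edges):
    """Adjacency sets for an undirected simple network (V, E)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u not in adj or v not in adj:
            raise ValueError(f"edge ({u}, {v}) uses a vertex outside V")
        adj[u].add(v)
        adj[v].add(u)
    return adj


# Toy example: three actors and two ties.
V = {"ana", "ben", "cy"}
E = [("ana", "ben"), ("ben", "cy")]
network = build_network(V, E)
degrees = {v: len(nbrs) for v, nbrs in network.items()}
```

Deciding what counts as a vertex and when two vertices are connected happens before any of this code runs, and those choices fix what V and E contain.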

network types: telecommunications, informational, transportation, social, biological

network v
