Network Analysis and Modeling - Santa Fe aaronc/courses/5352/csci5352_2017_L0.pdf CSCI 5352 Network Analysis and Modeling: learning goals 1. develop a network intuition for reasoning

Mar 16, 2020

  • Aaron Clauset (@aaronclauset), Assistant Professor of Computer Science, University of Colorado Boulder; External Faculty, Santa Fe Institute

    Network Analysis and Modeling

    © 2017 Aaron Clauset

    lecture 0: what are networks and how do we talk about them?

  • who are network scientists?

    Physicists

    Computer Scientists

    Applied Mathematicians

    Statisticians

    Biologists

    Ecologists

    Sociologists

    Political Scientists

    it’s a big community!

    • different traditions

    • different tools

    • different questions

    increasingly, not ONE community, but MANY, only loosely interacting communities

  • who are network scientists?

    Physicists: phase transitions, universality

    Computer Scientists: data / algorithm oriented, predictions

    Applied Mathematicians: dynamical systems, diff. eq.

    Statisticians: inference, consistency, covariates

    Biologists: experiments, causality, molecules

    Ecologists: observation, experiments, species

    Sociologists: individuals, differences, causality

    Political Scientists: rationality, influence, conflict

  • what are networks?

    • an approach

    • a mathematical representation

    • provide structure to complexity

    • structure above individuals / components

    • structure below system / population

  • CSCI 5352 Network Analysis and Modeling: learning goals

    1. develop a network intuition for reasoning about how structural patterns are related, and how they influence dynamics in / on networks

    2. master basic terminology and concepts

    3. master practical tools for analyzing / modeling structure of network data

    4. build familiarity with advanced techniques for exploring / testing hypotheses about networks

  • Course schedule (roughly):

    1. network basics
    2. centrality measures
    3. random graphs (simple)
    4. configuration model
    5. large-scale structure (communities, hierarchies, etc.)
    6. probabilistic generative models (SBMs)
    7. metadata, label and link prediction
    8. spreading processes (social, biological, SI-type)
    9. data wrangling + data sampling (artifacts)
    10. role of statistics in hypothesis generation / testing
    11. spatial networks
    12. citation networks, dynamics, preferential attachment
    13. temporal networks
    14. student project presentations

    [slide annotations grouping the schedule: building intuition; basic concepts, tools; practical tools; advanced tools]

  • Course webpage:

    http://www.santafe.edu/~aaronc/courses/5352/

  • Network data for assignments

  • lessons learned from past instances

    what’s difficult:

    1. students need to know many different things:

       • some probability: Erdős-Rényi, configuration, calculations
       • some mathematics: physics-style calculations, phase transitions
       • some statistics: basic data analysis, correlations, distributions
       • some machine learning: prediction, likelihoods, features, estimation algorithms
       • some programming: data wrangling, coding up measures and algorithms

    2. can’t teach all of these things to all types of students!

       • vast amounts of advanced material in each of these directions
       • students have little experience / intuition of what makes good science

  • what works well:

    1. simple mathematical problems: build intuition + practice with concepts

    [figure: example exercises: calculate the diameter; closeness centrality; betweenness of A; modularity Q(r) of a line graph]
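One of the exercises named on this slide, the modularity of a line graph, can be checked numerically. The sketch below is illustrative rather than the course's own solution. It assumes that "line graph" means a path on n vertices cut into a left group of r vertices and a right group of n − r, and it uses the standard modularity Q = Σ_g [e_g/m − (d_g/2m)²], where e_g is the number of edges inside group g, d_g is the total degree of group g, and m is the number of edges. All function names are hypothetical.

```python
def path_edges(n):
    """Edges of a path ("line") graph on vertices 0..n-1."""
    return [(i, i + 1) for i in range(n - 1)]


def modularity(edges, group):
    """Q = sum over groups g of e_g/m - (d_g/(2m))^2 for an undirected graph."""
    m = len(edges)
    internal = {}    # e_g: edges with both endpoints in group g
    degree_sum = {}  # d_g: total degree of the vertices in group g
    for u, v in edges:
        degree_sum[group[u]] = degree_sum.get(group[u], 0) + 1
        degree_sum[group[v]] = degree_sum.get(group[v], 0) + 1
        if group[u] == group[v]:
            internal[group[u]] = internal.get(group[u], 0) + 1
    return sum(internal.get(g, 0) / m - (d / (2 * m)) ** 2
               for g, d in degree_sum.items())


def q_line(n, r):
    """Modularity of a path on n vertices split after its first r vertices."""
    group = [0 if i < r else 1 for i in range(n)]
    return modularity(path_edges(n), group)


def q_line_closed_form(n, r):
    """Same quantity by hand, valid for 1 <= r <= n-1."""
    m = n - 1
    return (n - 2) / m - ((2 * r - 1) ** 2 + (2 * (n - r) - 1) ** 2) / (4 * m ** 2)


if __name__ == "__main__":
    n = 10
    for r in range(1, n):
        print(r, round(q_line(n, r), 4), round(q_line_closed_form(n, r), 4))
```

Under these assumptions the two functions agree for every cut point, and for even n the even split r = n/2 gives the largest Q.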

  • what works well:

    2. analyze real networks: test understanding + practice with implementing methods

    [figure: mean geodesic path length vs. network size, n (log scale), with USF, Haverford, Caltech, and Penn labeled; panel title: mean geodesics and O(log n)]

    [figure: harmonic centrality vs. vertex label for the Karate club, real-world network compared with the configuration model; panel title: node centrality vs. configuration model (when is a pattern interesting?)]

    [figure: density vs. assortativity (gender); panel title: attribute assortativity]
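The two distance-based quantities named on this slide, mean geodesic path length and harmonic centrality, both reduce to breadth-first search on an unweighted graph. The sketch below is a minimal illustration, not the course's reference code. It assumes the graph is given as a dictionary of adjacency lists, that harmonic centrality is normalized by n − 1, and that the mean geodesic is taken over reachable ordered pairs only. The function names and the toy star graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic (shortest-path) distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def harmonic_centrality(adj, i):
    """Mean of 1/d(i, j) over all j != i; unreachable vertices contribute 0."""
    dist = bfs_distances(adj, i)
    return sum(1.0 / d for j, d in dist.items() if j != i) / (len(adj) - 1)


def mean_geodesic(adj):
    """Average distance over all ordered pairs of distinct, mutually reachable vertices."""
    total, pairs = 0, 0
    for i in adj:
        for j, d in bfs_distances(adj, i).items():
            if j != i:
                total += d
                pairs += 1
    return total / pairs


if __name__ == "__main__":
    # Toy example (hypothetical): a star with center 0 and four leaves.
    star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(harmonic_centrality(star, 0), harmonic_centrality(star, 1))
    print(mean_geodesic(star))
```

On the star, the center reaches every other vertex in one step, so its harmonic centrality is 1, while each leaf scores (1 + 3 × 1/2) / 4 = 0.625.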

  • what works well:

    3. simple prediction tasks: test intuition + run numerical experiments

    [figure: fraction of correct label predictions vs. fraction of labels observed, f, for malaria genes (HVR5) and Norwegian boards (net1m-2011-08-01); panel title: label prediction via homophily]

    [figure: AUC vs. fraction of edges observed, f, on the HVR5 malaria genes network, for degree product, Jaccard coefficient, shortest path, and baseline (guessing); panel title: link prediction via heuristic]
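The link prediction experiment named on this slide scores unconnected pairs with a heuristic and asks how often a held-out true edge outranks a true non-edge (the AUC). The sketch below is one way to set that up, not the procedure behind the figure. It assumes the observed graph is a dictionary of neighbor sets, that ties count as half a win, and it covers only two of the listed heuristics (Jaccard coefficient and degree product). The toy graph and function names are hypothetical.

```python
from itertools import combinations


def jaccard(adj, i, j):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[i] | adj[j]
    return len(adj[i] & adj[j]) / len(union) if union else 0.0


def degree_product(adj, i, j):
    """Product of the two observed degrees."""
    return len(adj[i]) * len(adj[j])


def auc(score, adj, held_out):
    """Probability that a held-out edge outscores a true non-edge (ties count 1/2)."""
    held = {frozenset(e) for e in held_out}
    non_edges = [(i, j) for i, j in combinations(sorted(adj), 2)
                 if j not in adj[i] and frozenset((i, j)) not in held]
    wins = 0.0
    for u, v in held_out:
        s_true = score(adj, u, v)
        for i, j in non_edges:
            s_false = score(adj, i, j)
            if s_true > s_false:
                wins += 1.0
            elif s_true == s_false:
                wins += 0.5
    return wins / (len(held_out) * len(non_edges))


if __name__ == "__main__":
    # Toy example (hypothetical): triangles {0,1,2} and {3,4,5} joined by edge 2-3,
    # with the true edge 0-1 hidden from the observed graph.
    observed = {0: {2}, 1: {2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    hidden = [(0, 1)]
    print("Jaccard AUC:", auc(jaccard, observed, hidden))
    print("degree product AUC:", auc(degree_product, observed, hidden))
```

On this toy graph the Jaccard score ranks the hidden edge above every non-edge, while the degree product ranks it below every non-edge, which is a small reminder that a heuristic's performance depends on the structure of the network it is applied to.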

  • what works well:

    4. simple simulations: explore dynamics vs. structure + numerical experiments

    [figure: Pr(K ≥ k_in) vs. in-degree, k_in (log-log axes), for r = 1, r = 4, and no preferential attachment; panel title: simulate Price’s model]

    [figure: surface plot with axes labeled l, c_in − c_out, and p; panel title: simulate epidemics (SIR) on planted partitions]
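A simulation of Price's model, as named on this slide, fits in a few lines. The sketch below is one common formulation and may differ in detail from the version used in the course. It assumes each new vertex makes c citations, that a citation goes to a vertex chosen in proportion to in-degree with probability c / (c + r) and to a uniformly random earlier vertex otherwise, that repeated citations to the same vertex are allowed, and that the process starts from c vertices with no edges. The function names and the `preferential` switch are hypothetical.

```python
import random


def price_model(n, c, r, seed=0, preferential=True):
    """Return the in-degree of each of n vertices grown by a Price-style process."""
    rng = random.Random(seed)
    in_degree = [0] * n
    cited = []  # one entry per edge: the vertex that edge points to
    for new in range(c, n):
        choices = []
        for _ in range(c):
            if preferential and cited and rng.random() < c / (c + r):
                # Picking a random existing edge's target is proportional to in-degree.
                choices.append(rng.choice(cited))
            else:
                # Uniform attachment over the vertices that already exist.
                choices.append(rng.randrange(new))
        for target in choices:
            in_degree[target] += 1
            cited.append(target)
    return in_degree


def ccdf(values):
    """Empirical Pr(K >= k) for each distinct value k."""
    n = len(values)
    return {k: sum(1 for v in values if v >= k) / n for k in sorted(set(values))}


if __name__ == "__main__":
    for label, kwargs in [("r=1", {"r": 1.0}), ("r=4", {"r": 4.0}),
                          ("no preferential attachment", {"r": 1.0, "preferential": False})]:
        degrees = price_model(5000, 3, seed=1, **kwargs)
        print(label, "max in-degree:", max(degrees))
```

Plotting `ccdf(degrees)` on log-log axes for the three settings gives a comparison in the spirit of the slide's figure.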

  • what works well:

    5. team projects: teamwork + exploring their own ideas

  • key takeaways

    [figure: a(t) vs. alignment position t for example aligned sequences, in four steps: calculate alignment scores, convert to alignment indicators, remove short aligned regions, extract highly variable regions; panels A to D]

    • network intuition is hard to develop! good intuition draws on many skills (probability, statistics, computation, causal dynamics, etc.)

    • best results come from
      1. exercises to get practice with calculations
      2. practice analyzing diverse real-world networks
      3. conducting numerical experiments & simulations

    • practical tasks are a pedagogical tool (e.g., link and label prediction)

    • interpreting the results requires a good intuition

    • null models are a key conceptual idea: is a pattern interesting?

    • networks are fun!
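The null-model takeaway above can be made concrete with a small numerical experiment: measure a pattern on the observed network, then ask how often a random graph with the same degrees shows a pattern at least as strong. The sketch below is a minimal illustration under stated assumptions. It uses stub matching for the configuration model (self-loops and multi-edges are kept), and the statistic, the fraction of edges joining same-label vertices, is chosen here only for illustration. The toy graph, labels, and function names are hypothetical.

```python
import random
from collections import Counter


def degree_sequence(edges, n):
    """Degree of each vertex 0..n-1; a self-loop adds 2 to its vertex."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return [counts[i] for i in range(n)]


def configuration_model(degrees, rng):
    """Random multigraph with the given degrees, built by pairing shuffled stubs."""
    stubs = [v for v, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))


def same_label_fraction(edges, label):
    """Fraction of edges whose two endpoints carry the same label."""
    return sum(label[u] == label[v] for u, v in edges) / len(edges)


def null_tail_fraction(edges, label, draws=1000, seed=0):
    """Observed statistic and the share of null draws at least as large."""
    rng = random.Random(seed)
    degrees = degree_sequence(edges, len(label))
    observed = same_label_fraction(edges, label)
    hits = sum(same_label_fraction(configuration_model(degrees, rng), label) >= observed
               for _ in range(draws))
    return observed, hits / draws


if __name__ == "__main__":
    # Toy example (hypothetical): two labeled triangles joined by one edge.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    label = [0, 0, 0, 1, 1, 1]
    print(null_tail_fraction(edges, label))
```

A small tail fraction suggests the observed pattern is not explained by the degree sequence alone; a large one suggests it is unremarkable under this null.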

  • 1. defining a network

    2. describing a network

  • vertices and edges

    what is a vertex?

    when are two vertices connected?

    V: distinct objects (vertices / nodes / actors)

    E ⊆ V × V: pairwise relations (edges / links / ties)
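The definition on this slide maps directly onto basic data structures: V is a set of objects and E is a set of pairs drawn from V × V. The sketch below is only an illustration of that mapping. The vertex names, the example edges, and the `adjacency` helper are hypothetical.

```python
from itertools import product

# V: distinct objects; E: pairwise relations among them (example values are made up).
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("b", "c"), ("c", "a")}

# E must be a subset of V x V: every edge joins two known vertices.
assert E <= set(product(V, V))


def adjacency(vertices, edges, directed=False):
    """Neighbor sets for each vertex; undirected edges are stored in both directions."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        if not directed:
            neighbors[v].add(u)
    return neighbors


if __name__ == "__main__":
    print(adjacency(V, E))
    print(adjacency(V, E, directed=True))
```

Vertex "d" appears in V but in no edge, so it shows up with an empty neighbor set: a vertex exists independently of whether anything connects to it.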

  • [table: network categories listed: telecommunications, informational, transportation, social, biological; the column header is cut off in the source after "network v"]
