Random Dot Product Graphs

Post on 03-Jan-2016


DESCRIPTION

IPAM: Intelligent Extraction of Information from Graphs & High Dimensional Data, July 26, 2005. Random Dot Product Graphs. Ed Scheinerman, Applied Mathematics & Statistics, Johns Hopkins University. Coconspirators: Libby Beer, John Conroy (IDA), Paul Hand (Columbia), Miro Kraetzl (DSTO). PowerPoint presentation.

Transcript

Random Dot Product Graphs

Ed Scheinerman

Applied Mathematics & Statistics

Johns Hopkins University

IPAM: Intelligent Extraction of Information from Graphs & High Dimensional Data

July 26, 2005

Coconspirators

• Libby Beer

• John Conroy (IDA)

• Paul Hand (Columbia)

• Miro Kraetzl (DSTO)

• Christine Nickel

• Carey Priebe

• Kim Tucker

• Stephen Young (Georgia Tech)

Overview

• Mathematical context

• Modeling networks

• Random dot product model

• The inverse problem

Mathematical Context

Graphs I Have Loved

• Interval graphs & intersection graphs

• Random graphs

• Random intersection graphs

• Threshold graphs & dot product graphs

Interval Graphs

$v \mapsto I_v$

$v \sim w \iff I_v \cap I_w \neq \emptyset$

Intersection Graphs

$v \mapsto S_v$

$v \sim w \iff S_v \cap S_w \neq \emptyset$

[Figure: example graph with vertex sets {1}, {1}, {1,2}, {2}]

Random Graphs

Erdös-Rényi style…

[Figure: each pair of vertices is joined with probability $p$ and left unjoined with probability $1 - p$]

Randomness is “in” the edges. Vertices are “dumb” placeholders.

Random Intersection Graphs

• Assign random sets to vertices.

• Two vertices are adjacent iff their sets intersect.

• Randomness is “in” the vertices.

• Edges reflect relationships between vertices.

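Threshold Graphs

$v \mapsto x_v \in \mathbb{R}$

$v \sim w \iff x_v + x_w \ge 1$

[Figure: example graph with vertex weights 0.5, 0.6, 0.8, 0.3]

Dot Product Graphs

$v \mapsto x_v \in \mathbb{R}^d$

$v \sim w \iff x_v \cdot x_w \ge 1$

[Figure: example graph with vertex vectors [1 0], [2 0], [1 1], [0 1]]

Fractional intersection graphs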
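The dot product rule can be checked mechanically. A minimal Python sketch, assuming the four example vectors from the slide and threshold 1; the function name `dot_product_graph` and the vertex numbering 0..3 are illustrative, not from the talk:

```python
from itertools import combinations

def dot_product_graph(vectors, threshold=1.0):
    """Return the set of pairs (v, w), v < w, with x_v . x_w >= threshold."""
    edges = set()
    for v, w in combinations(range(len(vectors)), 2):
        dot = sum(a * b for a, b in zip(vectors[v], vectors[w]))
        if dot >= threshold:
            edges.add((v, w))
    return edges

# The four vectors from the slide, numbered 0..3 (numbering is an assumption).
vectors = [(1, 0), (2, 0), (1, 1), (0, 1)]
edges = dot_product_graph(vectors)
print(sorted(edges))
```

Under this rule [1 0] and [0 1] are not adjacent, since their dot product is 0.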

Communication Networks

Physical Networks

Telephone

Local area network

Power grid

Internet

Social Networks

[Figure: Alice (A) and Bob (B); label: 2003-4-10]

Social Network Graphs

Vertices (Actors)    Edges (Dyads)
Telephones           Calls
Email addresses      Messages
Computers            IP packets
Human beings         Acquaintance
Academicians         Coauthorship

Example: Email at HP

• 485 employees

• 185,000 emails

• Social network (who emails whom) identified 7 “communities”, validated by interviews with employees.

Properties of Social Networks

• Clustering: $P[a \sim c \mid a \sim b \sim c] > P[a \sim c]$ for vertices $a$, $b$, $c$

• Low diameter: “Six degrees of separation”

• Power law: degree histogram, plotted as $\log N(d)$ against $\log d$

Degree Histogram Example 1

[Figure: degree histogram, 2838 vertices; x-axis: degree, y-axis: number of vertices]

Degree Histogram Example 2

[Figure: degree histogram, 16142 vertices; x-axis: degree, y-axis: number of vertices]

Random Graph Models

Goal: Simple and realistic random graph models of social networks.

Erdös-Rényi?

• Low diameter!

• No clustering: P[a~c]=P[a~c|a~b~c].

• No power-law degree distribution.

Not a good model.
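The “no clustering” claim can be illustrated by simulation. A minimal sketch, assuming n = 200, p = 0.1 and a fixed seed (all chosen here for illustration): it estimates P[a~c | a~b~c] as the fraction of paths a ~ b ~ c whose endpoints are adjacent, and compares it with p.

```python
import random

def erdos_renyi(n, p, rng):
    """Adjacency sets of G(n, p): each pair is an edge independently with probability p."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def conditional_edge_frequency(adj):
    """Fraction of paths a ~ b ~ c (a != c) whose endpoints a and c are adjacent."""
    paths = closed = 0
    for nbrs in adj:                      # nbrs = neighbors of the middle vertex b
        ordered = sorted(nbrs)
        for s, a in enumerate(ordered):
            for c in ordered[s + 1:]:
                paths += 1
                closed += c in adj[a]     # True counts as 1
    return closed / paths

p = 0.1
adj = erdos_renyi(200, p, random.Random(0))
estimate = conditional_edge_frequency(adj)
print(f"P[a~c | a~b~c] is about {estimate:.3f}; P[a~c] = {p}")
```

The two numbers should agree up to sampling noise, which is the sense in which Erdös-Rényi graphs show no clustering.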

Model by Fan Chung et al

Consider only those graphs with

$N(d) = \lfloor \alpha d^{-\beta} \rfloor$

with all such graphs equally likely.

People as Vectors

$a = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}$

Coordinates, in order: Sports, Politics, Movies, Graph theory

Shared Interests

$P[a \sim b] = f(a \cdot b)$

Alice and Bob are more likely to communicate when they have more shared interests.

Selecting the Function

$P[a \sim b] = f(a \cdot b)$

• $f(t) = \frac{1}{\pi} \tan^{-1}(t) + \frac{1}{2}$, with $f : [-\infty, +\infty] \to [0,1]$

• $f(t) = \frac{t}{1+t}$ when $a \cdot b \ge 0$, with $f : [0, \infty] \to [0,1]$

• $f(t) = t^r$ when $a \cdot b \in [0,1]$, with $f : [0,1] \to [0,1]$
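The three candidate functions are easy to compare side by side. A minimal sketch; the function names and the two interest vectors for Alice and Bob are hypothetical values chosen for illustration:

```python
import math

def f_arctan(t):
    """f(t) = arctan(t)/pi + 1/2, defined for every real t."""
    return math.atan(t) / math.pi + 0.5

def f_ratio(t):
    """f(t) = t / (1 + t), defined for t >= 0."""
    if t < 0:
        raise ValueError("requires a.b >= 0")
    return t / (1.0 + t)

def f_power(t, r=1.0):
    """f(t) = t**r, defined for t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("requires a.b in [0, 1]")
    return t ** r

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical interest vectors: (sports, politics, movies, graph theory).
alice = (0.5, 0.1, 0.3, 0.6)
bob = (0.4, 0.0, 0.2, 0.7)
t = dot(alice, bob)

for name, value in [("arctan", f_arctan(t)), ("ratio", f_ratio(t)), ("power, r=2", f_power(t, 2))]:
    print(f"{name}: P[Alice ~ Bob] = {value:.3f}")
```

Each function is increasing in t, so a larger dot product always means a higher probability of communication.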

Random Dot Product Graphs, I

Given $x_1, x_2, \ldots, x_n \in \mathbb{R}^d$

$P[i \sim j] = x_i \cdot x_j$ (or $= f(x_i \cdot x_j)$)

Write $X = [x_1, x_2, \ldots, x_n]$

$P_X(G) = \left( \prod_{ij \in E} x_i \cdot x_j \right) \times \left( \prod_{ij \notin E,\ i \ne j} (1 - x_i \cdot x_j) \right)$
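The product formula for P_X(G) translates directly into code. A minimal sketch for the case f(t) = t; it works with logarithms to avoid underflow on large graphs, which is a choice made here and not something stated on the slide. The function name and the three example vectors are illustrative.

```python
import math
from itertools import combinations

def log_likelihood(X, edges):
    """log P_X(G) for f(t) = t.

    X is a list of n vectors; edges is a set of pairs (i, j) with i < j.
    """
    total = 0.0
    for i, j in combinations(range(len(X)), 2):
        p = sum(a * b for a, b in zip(X[i], X[j]))
        if not 0.0 <= p <= 1.0:
            raise ValueError("x_i . x_j must lie in [0, 1]")
        q = p if (i, j) in edges else 1.0 - p   # factor contributed by the pair ij
        if q == 0.0:
            return -math.inf                    # G cannot occur under X
        total += math.log(q)
    return total

# Illustrative example with d = 1 and n = 3: a path 0 ~ 1 ~ 2.
X = [(0.5,), (1.0,), (0.5,)]
edges = {(0, 1), (1, 2)}
print(math.exp(log_likelihood(X, edges)))
```

Here the pairs 01 and 12 are edges (probability 0.5 each) and 02 is a non-edge (probability 1 - 0.25), so P_X(G) = 0.5 × 0.5 × 0.75.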

Generalize Erdös-Rényi

Take $x_1 = x_2 = \cdots = x_n = x$ with $x \cdot x = p$.

Generalize Intersection Graphs

If $i \mapsto A_i \subseteq \{1, 2, \ldots, k\}$, take $x_i = \chi(A_i) \in \{0,1\}^k$ and

$f(t) = \begin{cases} 0 & t = 0 \\ 1 & t > 0 \end{cases}$

Whence the Vectors?

• Vectors are given in advance.

• Vectors chosen (iid) from some distribution.

$P(G) = \int P_X(G)\, dX$

Random Dot Product Graphs, II

• Step 1: Pick the vectors. Either they are given by fiat, or they are chosen iid from a distribution.

• Step 2: For all $i < j$, let $p = f(x_i \cdot x_j)$ and insert an edge from $i$ to $j$ with probability $p$.
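The two steps can be sketched in a few lines of Python. This is a minimal illustration, assuming d = 1, vectors drawn uniformly from [0, 1], and f(t) = t by default; the function name `random_dot_product_graph` is hypothetical.

```python
import random

def random_dot_product_graph(X, f=lambda t: t, rng=None):
    """Step 2: join each pair i < j independently with probability f(x_i . x_j)."""
    rng = rng or random.Random()
    edges = set()
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            p = f(sum(a * b for a, b in zip(X[i], X[j])))
            if rng.random() < p:
                edges.add((i, j))
    return edges

rng = random.Random(1)
# Step 1 (assumption for illustration): d = 1, x_i iid uniform on [0, 1].
X = [(rng.random(),) for _ in range(50)]
edges = random_dot_product_graph(X, rng=rng)
print(len(edges), "edges on", len(X), "vertices")
```

Passing a different `f`, for example `lambda t: t / (1 + t)`, switches the model without changing the generator.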

Megageneralization

• Generalization of:
  Intersection graphs (ordinary & random)
  Threshold graphs
  Dot product graphs
  Erdös-Rényi random graphs

• Randomness is “in” both the vertices and the edges.

• The events $a \sim b$ and $c \sim d$ are independent when $a, b, c, d$ are distinct.

Results in Dimension 1

Choose $x_i$ iid uniform in $[0,1]$. Use $f(t) = t^r$.

Equivalently: choose $x_i$ independently from $U^r[0,1]$, with $P(i \sim j) = x_i x_j$ and $f(t) = t$.

Probability/Number of Edges

$P[i \sim j] = \int_0^1 \int_0^1 (x_i x_j)^r \, dx_i \, dx_j = \frac{1}{(1+r)^2}$

Expected number of edges $= \binom{n}{2} (1+r)^{-2}$.
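The edge probability 1/(1+r)^2 can be checked against a simulated graph. A minimal sketch, assuming n = 1000, r = 2 and a fixed seed (all chosen here for illustration):

```python
import random

def edge_density(n, r, rng):
    """Simulate the d = 1 model with P[i ~ j] = (x_i x_j)^r; return edges / C(n, 2)."""
    x = [rng.random() for _ in range(n)]       # x_i iid uniform on [0, 1]
    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < (x[i] * x[j]) ** r:
                edges += 1
    return edges / (n * (n - 1) / 2)

r = 2
observed = edge_density(1000, r, random.Random(0))
expected = 1 / (1 + r) ** 2
print(f"observed density {observed:.4f}, expected {expected:.4f}")
```

The observed density fluctuates from run to run because both the vectors and the edges are random.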

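Clustering

$P[a \sim c \mid a \sim b \sim c] = \frac{P[a \sim c \ \&\ a \sim b \sim c]}{P[a \sim b \sim c]}$

$= \frac{\iiint (xy)^r (xz)^r (yz)^r \, dx\, dy\, dz}{\iiint (xy)^r (yz)^r \, dx\, dy\, dz}$

$= \frac{(1+r)^2 (1+2r)}{(1+2r)^3} = \left( \frac{1+r}{1+2r} \right)^2 > \frac{1}{(1+r)^2} = P[a \sim c]$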
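The integrals factor coordinate by coordinate, so the inequality can be verified exactly. A minimal sketch using rational arithmetic, assuming integer r for convenience; the function names are illustrative:

```python
from fractions import Fraction

def edge_probability(r):
    """P[a ~ c] = 1 / (1 + r)^2."""
    return Fraction(1, (1 + r) ** 2)

def conditional_probability(r):
    """P[a ~ c | a ~ b ~ c] as the ratio of the two triple integrals."""
    # Numerator: each of x, y, z appears with exponent 2r, giving 1 / (1 + 2r)^3.
    numerator = Fraction(1, (1 + 2 * r) ** 3)
    # Denominator: x and z have exponent r, y has exponent 2r.
    denominator = Fraction(1, (1 + r) ** 2 * (1 + 2 * r))
    return numerator / denominator

for r in (1, 2, 3):
    print(r, conditional_probability(r), edge_probability(r))
```

The gap reduces to (1+r)^2 > 1+2r, which holds for every r > 0.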

Power Law

Believe: $N(d) \propto d^{-c}$, $c = 1 - 1/r$

Can show: $N\big((1-\varepsilon)d,\ (1+\varepsilon)d\big) \propto 2\varepsilon d^{-c}$

Power Law Example

$n = 30000$

$P[i \sim j] = (x_i x_j)^3$

Isolated Vertices

$E[N(0)] \sim C_r\, n^{(r-1)/r} = o(n)$

where $C_r = \frac{(1+r)^{1/r}\, \Gamma(1/r)}{r}$.

Thus, the graph is not connected, but…
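To get a feel for the size of this estimate, the formula can be evaluated numerically. A minimal sketch that simply evaluates the asymptotic expression; the function name is illustrative, and the parameters n = 30000, r = 3 are borrowed from the power law example:

```python
import math

def expected_isolated(n, r):
    """Evaluate C_r * n^((r-1)/r) with C_r = (1+r)^(1/r) * Gamma(1/r) / r."""
    c_r = (1 + r) ** (1 / r) * math.gamma(1 / r) / r
    return c_r * n ** ((r - 1) / r)

n, r = 30000, 3
estimate = expected_isolated(n, r)
print(f"about {estimate:.0f} isolated vertices expected out of {n}")
```

The estimate is a small fraction of n, consistent with the o(n) statement.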

“Mostly” Connected

“Giant” connected component

A “few” isolated vertices

Six Degrees of Separation

Diameter ≤ 6

[Figure labels: attached; attached pair]

Diameter ≤ 6 Proof Outline

[Figure labels: $\{i : x_i \ge \tau\}$, diameter = 2; $\{i : x_i < \tau\}$; isolated]

Graphs to VectorsGraphs to Vectors

The Inverse Problem

Given Graphs, Find Vectors

• Given: A graph, or a series of graphs, on a common vertex set.

• Problem: Find vectors to assign to vertices that “best” model the graph(s).

Maximum Likelihood Method

$\arg\max_X \{ P_X(G) \}$

• Feasible in dimension 1. Awful for $d > 1$.

• Nice results for $f(t) = t / (1+t)$.

Gram Matrix Approach

Given $G_1, G_2, \ldots, G_m$.

Let $A = \frac{1}{m} \sum_{j=1}^{m} A(G_j)$.

$\therefore\ a_{ij} \approx P[i \sim j] = x_i \cdot x_j \quad (i \ne j)$

$X = [x_1, x_2, \ldots, x_n] \quad (d \times n)$

$A \approx X^T X$

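Wrong Best Solution

Minimize $f(X) = \| A - X^T X \|_F^2$

$A = U^T \Lambda U; \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$

$X = g_d(A) := \begin{bmatrix} \sqrt{\lambda_1^+} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2^+} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_d^+} \end{bmatrix} \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_d^T \end{bmatrix}$

(Here $\lambda^+$ denotes $\max\{\lambda, 0\}$.)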
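The map g_d is a truncated eigendecomposition. A minimal NumPy sketch, assuming A is symmetric and that the diagonal factor holds the square roots of the positive parts of the top d eigenvalues, so that X^T X is the projection of A onto those eigenvectors; the test matrix is illustrative:

```python
import numpy as np

def g_d(A, d):
    """Return the d x n matrix X built from the d largest eigenvalues of symmetric A."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)           # ascending order; eigenvectors are columns
    top = np.argsort(eigenvalues)[::-1][:d]                 # indices of the d largest eigenvalues
    scale = np.sqrt(np.clip(eigenvalues[top], 0.0, None))   # sqrt of positive parts
    return scale[:, None] * eigenvectors[:, top].T          # row i is sqrt(lambda_i^+) * u_i^T

# Illustrative check: a rank-2 Gram matrix is reproduced exactly.
X_true = np.array([[0.9, 0.5, 0.2],
                   [0.1, 0.4, 0.8]])                        # d = 2, n = 3
A = X_true.T @ X_true
X = g_d(A, 2)
print(np.round(X.T @ X - A, 12))
```

The recovered X generally differs from X_true by an orthogonal transformation, since only X^T X is determined by A.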

Real Problem

Minimize $f(X) = \| A - X^T X + I \circ (X^T X) \|_F^2$

We don’t want $x_i \cdot x_i \approx 0 = a_{ii}$.

Idea: $a_{ii} \leftarrow x_i \cdot x_i$

Iterative Algorithm

Minimize $f(X) = \| A - X^T X + I \circ (X^T X) \|_F^2$

1. $D = 0_{n \times n}$

2. $X = g_d(A + D)$

3. $D = I \circ (X^T X)$

4. Go to 2
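The four steps can be sketched with NumPy. This is a minimal illustration, assuming a fixed number of iterations in place of a convergence test and the same truncated eigendecomposition for g_d as in the “wrong best solution”; the test matrix (a rank-1 probability matrix with its diagonal zeroed) is illustrative.

```python
import numpy as np

def g_d(A, d):
    """d x n matrix from the d largest eigenvalues (positive parts) of symmetric A."""
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    top = np.argsort(eigenvalues)[::-1][:d]
    scale = np.sqrt(np.clip(eigenvalues[top], 0.0, None))
    return scale[:, None] * eigenvectors[:, top].T

def objective(A, X):
    """f(X) = || A - X^T X + I o (X^T X) ||_F^2, which ignores the diagonal of X^T X."""
    G = X.T @ X
    return np.linalg.norm(A - G + np.diag(np.diag(G)), "fro") ** 2

def fit_vectors(A, d, iterations=100):
    D = np.zeros_like(A, dtype=float)       # step 1
    for _ in range(iterations):             # step 4: go to 2
        X = g_d(A + D, d)                   # step 2
        D = np.diag(np.diag(X.T @ X))       # step 3: D = I o (X^T X)
    return X

x = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
A = np.outer(x, x)
np.fill_diagonal(A, 0.0)                    # adjacency-like matrix: a_ii = 0
X1 = fit_vectors(A, 1, iterations=1)
X = fit_vectors(A, 1, iterations=200)
print(objective(A, X1), objective(A, X))
```

Each pass first chooses the best X for the current diagonal and then the best diagonal for the current X, so the objective cannot increase from one iteration to the next.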

Convergence

If (when) the algorithm converges, then the rows of $X$ are eigenvectors of $A + I \circ (X^T X)$ and $X$ is a local min of $f(X) = \| A - X^T X + I \circ (X^T X) \|_F^2$.

Convergence

[Figure: diagonal entries vs. iteration; $G(n = 40, m = 115)$, $d = 2$]

[Figure: diagonal entries vs. iteration; $G(n = 40, m = 115)$, $d = 5$; $\max\{x_i \cdot x_j : i \ne j\} = 1.05$]

[Figure: diagonal entries vs. iteration; $G(n = 40, m = 115)$, $d = 12$; $\max\{x_i \cdot x_j : i \ne j\} = 1.152$]

[Figure: diagonal entries vs. iteration; $G = C_{12}$, $d = 1$, 30 iterations]

[Figure: diagonal entries vs. iteration; $G = C_{12}$, $d = 1$, 150 iterations]

[Figure: diagonal entries vs. iteration; $G = C_{12}$, $d = 1$, 500 iterations]

Enron example

Applications

Network Change/Anomaly Detection

Clustering

Change/Anomaly Detection

$\underbrace{G_1, G_2, \ldots, G_r}_{X} \qquad \underbrace{H_1, H_2, \ldots, H_s}_{Y}$

Align $X$, $Y$.

Find $i$ with $\| x_i - y_i \|$ large.
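The slide does not say how X and Y are aligned. One possibility, offered here purely as an assumption, is orthogonal Procrustes alignment: since A ≈ X^T X determines X only up to an orthogonal transformation, find the orthogonal Q that best maps X onto Y and then rank vertices by the remaining distance. The data below are synthetic.

```python
import numpy as np

def align(X, Y):
    """Orthogonal Q minimizing ||Q X - Y||_F (orthogonal Procrustes); X, Y are d x n."""
    U, _, Vt = np.linalg.svd(Y @ X.T)
    return U @ Vt

def change_scores(X, Y):
    """Per-vertex distance ||Q x_i - y_i|| after alignment."""
    Q = align(X, Y)
    return np.linalg.norm(Q @ X - Y, axis=0)

# Synthetic example: Y is a rotated copy of X, except vertex 3, which has moved.
X = np.array([[0.9, 0.1, 0.5, 0.7, 0.2, 0.6],
              [0.1, 0.8, 0.5, 0.3, 0.6, 0.2]])
theta = 0.7
Q_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
Y = Q_true @ X
Y[:, 3] += np.array([0.3, -0.3])

scores = change_scores(X, Y)
print("most changed vertex:", int(np.argmax(scores)))
```

Without the alignment step every vertex would appear to have moved, because of the rotation alone.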

Graph Clustering

Synthetic Lethality Graphs

• Vertices are genes in yeast

• Edge between u and v iff deleting one of u or v does not kill, but deleting both is lethal.

SL Graph Status

• Yeast has about 6000 genes.

• Full graph known on 126 “query” genes (about 1300 edges).

• Partial graph known on 1000 “library” genes.

What Next?

Random Dot Product Graphs

• Extension to higher dimension: cube; unit ball intersect positive orthant

• Small world measures: clustering coefficient

• Other random graph properties

$\gamma(v) = |E(N(v))| \div \binom{|N(v)|}{2}$

$\gamma(G) = \text{average of } \gamma(v) \text{ over } v \text{ with } d(v) \ge 2$
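The clustering coefficient is straightforward to compute from adjacency sets. A minimal sketch, reading γ(v) as the number of edges among the neighbors of v divided by the number of pairs of neighbors; the function name and the small example graph (a triangle with one pendant vertex) are illustrative:

```python
from itertools import combinations

def clustering_coefficient(adj):
    """Average of gamma(v) over vertices of degree >= 2; adj maps vertex -> set of neighbors."""
    values = []
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue                                   # gamma(v) is undefined for d(v) < 2
        links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        values.append(links / (k * (k - 1) // 2))      # edges among neighbors / C(k, 2)
    if not values:
        raise ValueError("no vertex has degree >= 2")
    return sum(values) / len(values)

# Triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
gamma = clustering_coefficient(adj)
print(gamma)
```

In this example γ(0) = γ(1) = 1 and γ(2) = 1/3, while vertex 3 is excluded, so γ(G) = 7/9.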

Vector Estimation

• MLE method
  Computationally efficient?
  More useful?

• Eigenvalue method
  Understand convergence
  Prove that it globally minimizes
  Extension to missing data

• Validate against real data

Network Evolution

• Communication influences interests:

$X = [x_1, x_2, \ldots, x_n]$

$X(k+1) = F[G, X(k)]$

Rapid Generation

• Can we generate a sparse random dot product graph with n vertices and m edges in time O(n+m)?

• Partial answer: Yes, but.

The End
