© 2007, Roman Schmidt Distributed Information Systems Laboratory Evergrow workshop, Jerusalem, Israel February 19, 2007 Efficient implementation of BP in P2P networks Roman Schmidt School of Computer and Communication Sciences Ecole Polytechnique Fédérale de Lausanne (EPFL) Evergrow Loopy Belief Propagation Algorithm and Applications Workshop Jerusalem, Israel, February 19-21, 2007

Efficient implementation of BP in P2P networks

Roman Schmidt

School of Computer and Communication Sciences
Ecole Polytechnique Fédérale de Lausanne (EPFL)

Evergrow Loopy Belief Propagation Algorithm and Applications Workshop
Jerusalem, Israel, February 19-21, 2007

Motivation

• Users share (correlated) data in P2P systems
  – currently mainly for retrieval
  – but correlations hold hidden knowledge
• Exploit correlations for new services
  – Distributed Knowledge Base (e.g., for software bugs)
  – Structure/cluster data (e.g., for better search results)
  – Recommendation system (e.g., for data annotation)
  – etc.
• Distributed Inference System on top of a P2P system
• Current focus and contribution
  – Message reduction

Outline

• Motivation

• Basic Concepts

– Belief Propagation

– The P-Grid Overlay

• P2P Belief Propagation

– Inference Architecture

– The Relaxation Algorithm

• Evaluation

• Conclusions

Belief Propagation

• Inference based on Bayesian networks
  – models dependencies between variables
• Iterative message-passing algorithm
  – compute marginal probabilities (“beliefs”)
  – provably efficient on trees, works for arbitrary networks

Example network: OS1 -> App1 <- Driver1

OS1          True   False
Installed    0.2    0.8

Driver1      True   False
Installed    0.2    0.8

App1
OS1   Driver1   Runs   Error
T     T         0.9    0.1
T     F         0.4    0.6
F     T         0.0    1.0
F     F         0.0    1.0
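
The belief for App1 can be checked by hand from the tables on this slide. The sketch below sums over both parents by brute force. On a tree-shaped network like this one, BP yields the same marginals. The names `p_os1`, `p_driver1` and `p_runs` are illustrative, and assigning one "Installed" prior table each to OS1 and Driver1 is an assumption based on the figure layout.

```python
from itertools import product

# Priors from the slide: P(Installed = True) = 0.2 for both parents.
p_os1 = {True: 0.2, False: 0.8}
p_driver1 = {True: 0.2, False: 0.8}

# CPT from the slide: P(App1 = Runs | OS1, Driver1); P(Error) = 1 - P(Runs).
p_runs = {
    (True, True): 0.9,
    (True, False): 0.4,
    (False, True): 0.0,
    (False, False): 0.0,
}


def belief_app1():
    """Marginal P(App1 = Runs), summing over both parents."""
    return sum(p_os1[o] * p_driver1[d] * p_runs[(o, d)]
               for o, d in product([True, False], repeat=2))


def posterior_driver1_given_runs():
    """P(Driver1 installed | App1 = Runs) via Bayes' rule."""
    joint = sum(p_os1[o] * p_driver1[True] * p_runs[(o, True)]
                for o in [True, False])
    return joint / belief_app1()


print(belief_app1())                   # approximately 0.1
print(posterior_driver1_given_runs())  # approximately 0.36
```

The marginal is 0.2·0.2·0.9 + 0.2·0.8·0.4 = 0.1, since App1 never runs without OS1.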

The BP message-passing algorithm

• Sends messages across edges

– 2 messages per edge and iteration

– if all messages from previous iteration were received

• Beliefs are updated per iteration

– algorithm terminates if beliefs stabilize

• Messages are vectors

– length corresponds to the number of node states

• Computational complexity grows exponentially with the number of states of nodes
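
The message-passing scheme above can be sketched as follows. This is a minimal illustration, not the implementation from the talk: it assumes a pairwise model with undirected edges instead of full Bayesian-network CPTs, and all names (`run_bp`, `unary`, `pairwise`, `exact_marginals`) and the potentials are made up. Each iteration sends two messages per edge, computed only from the previous iteration's messages, and stops once the beliefs stabilize.

```python
from itertools import product


def normalize(vec):
    total = sum(vec)
    return [x / total for x in vec]


def run_bp(unary, pairwise, max_iters=50, tol=1e-9):
    """Synchronous sum-product BP on a pairwise model.

    unary:    {var: [phi(x) for each state x]}
    pairwise: {(u, v): psi} with psi[x_u][x_v], one entry per undirected edge
    Returns (beliefs, iterations_used).
    """
    nbrs = {v: [] for v in unary}
    for u, v in pairwise:
        nbrs[u].append(v)
        nbrs[v].append(u)

    def psi(u, v, xu, xv):
        if (u, v) in pairwise:
            return pairwise[(u, v)][xu][xv]
        return pairwise[(v, u)][xv][xu]

    # Two directed messages per edge; each is a vector over the receiver's states.
    msgs = {}
    for u, v in pairwise:
        msgs[(u, v)] = normalize([1.0] * len(unary[v]))
        msgs[(v, u)] = normalize([1.0] * len(unary[u]))

    def beliefs(current):
        result = {}
        for v in unary:
            b = list(unary[v])
            for n in nbrs[v]:
                b = [bi * mi for bi, mi in zip(b, current[(n, v)])]
            result[v] = normalize(b)
        return result

    old = beliefs(msgs)
    iterations = 0
    for iterations in range(1, max_iters + 1):
        new_msgs = {}
        for u, v in msgs:  # every message uses only last iteration's messages
            out = []
            for xv in range(len(unary[v])):
                total = 0.0
                for xu in range(len(unary[u])):
                    incoming = unary[u][xu]
                    for n in nbrs[u]:
                        if n != v:
                            incoming *= msgs[(n, u)][xu]
                    total += incoming * psi(u, v, xu, xv)
                out.append(total)
            new_msgs[(u, v)] = normalize(out)
        msgs = new_msgs
        new = beliefs(msgs)
        change = max(abs(a - b) for v in unary for a, b in zip(new[v], old[v]))
        old = new
        if change < tol:  # beliefs have stabilized
            break
    return old, iterations


def exact_marginals(unary, pairwise):
    """Brute-force marginals, used only to check BP on a small tree."""
    names = list(unary)
    marg = {v: [0.0] * len(unary[v]) for v in names}
    for states in product(*(range(len(unary[v])) for v in names)):
        x = dict(zip(names, states))
        weight = 1.0
        for v in names:
            weight *= unary[v][x[v]]
        for (u, v), table in pairwise.items():
            weight *= table[x[u]][x[v]]
        for v in names:
            marg[v][x[v]] += weight
    return {v: normalize(m) for v, m in marg.items()}


# Chain A - B - C with binary variables (made-up potentials).
unary = {"A": [0.7, 0.3], "B": [0.5, 0.5], "C": [0.2, 0.8]}
pairwise = {("A", "B"): [[0.9, 0.1], [0.1, 0.9]],
            ("B", "C"): [[0.8, 0.2], [0.3, 0.7]]}
beliefs, iterations = run_bp(unary, pairwise)
print(beliefs, iterations)
```

On this chain (a tree) the beliefs agree with `exact_marginals`. On networks with loops the same code runs, but the result is only an approximation.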

The P-Grid Overlay

• Peers are organized in a binary trie structure
  – one node for every common prefix
  – trie is only virtual (exists only via routing tables)
  – all nodes remain at the leaf-level (no hierarchy)
• Multiple peers per key space partition
• Multiple routing entries (random choice)
  – per routing table level
• Logarithmic search complexity
  – even for skewed data distributions

P-Grid routing

[Figure: virtual binary trie with inner nodes 0* (children 00*, 01*) and 1* (children 10*, 11*); peers A, F, B, C, D, E at the leaves; example query for ‘100’]

Peer   Routing table           Stores data with key prefix
A      1* : C, D   01* : B     00
F      1* : E      01* : B     00
B      1* : C, D   00* : F     01
C      0* : A, B   11* : E     10
D      0* : A, F   11* : E     10
E      0* : B, F   10* : D     11

• Keys resolved by longest prefix matching
  – Ensures logarithmic search cost for skewed trees
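
Prefix routing over the tables on this slide can be sketched as follows. The peer paths and routing entries are transcribed from the figure. The data layout (`PEERS`, a list of references per level) and the function names are assumptions made for illustration, not P-Grid's actual API.

```python
import random

# Peer paths and routing tables transcribed from the slide.
# routing[level] lists peers whose path agrees on the first `level` bits
# and differs at the next bit (index 0 = first bit differs).
PEERS = {
    "A": {"path": "00", "routing": [["C", "D"], ["B"]]},
    "F": {"path": "00", "routing": [["E"], ["B"]]},
    "B": {"path": "01", "routing": [["C", "D"], ["F"]]},
    "C": {"path": "10", "routing": [["A", "B"], ["E"]]},
    "D": {"path": "10", "routing": [["A", "F"], ["E"]]},
    "E": {"path": "11", "routing": [["B", "F"], ["D"]]},
}


def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(start, key, rng=random):
    """Forward a query until a peer whose path is a prefix of `key` is reached.

    Returns the list of peers visited, starting with `start`.
    """
    visited = [start]
    peer = start
    while True:
        path = PEERS[peer]["path"]
        level = common_prefix_len(path, key)
        if level == len(path) or level == len(key):
            return visited  # this peer is responsible for the key
        # Resolve one more bit: pick a random reference at the first differing level.
        peer = rng.choice(PEERS[peer]["routing"][level])
        visited.append(peer)


print(route("F", "100", random.Random(0)))  # ['F', 'E', 'D'] with the tables above
```

Every hop extends the matched prefix by at least one bit, so the number of hops is bounded by the path length.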

The Distributed Inference System

• P-Grid
  – Bug reports, metadata, tags, etc.
  – Bayesian network
    • Variables (spread over P-Grid nodes)
    • Dependencies between variables
    • Distributive learning
• Belief Propagation
  – Distributed inference
    • Message-passing algorithm
• Identified problem
  – high message cost

Spring Relaxation

• Bayesian network as spring network

– find minimum energy configuration (relax springs)

– energy is proportional to the distance between P-Grid nodes

– variables at the same node require no energy

– optimal: all variables at one node (but this conflicts with load balancing)

• Decentralized algorithm

– nodes try to relax their springs

– move correlated variables close to each other

– optimally, at the same node (no physical message)

– considering load distribution
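
The spring analogy can be made concrete with a small sketch. The slides do not define the distance measure, so the trie distance used here is an assumption. Treating each key-space partition as one node, and all names and the toy placement, are also illustrative.

```python
def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def distance(p, q):
    """Hypothetical trie distance: bits of p below its common prefix with q."""
    return len(p) - common_prefix_len(p, q)


def total_energy(edges, placement):
    """Sum of spring energies; variables in the same partition contribute 0."""
    return sum(distance(placement[u], placement[v]) for u, v in edges)


def physical_messages_per_iteration(edges, placement):
    """BP sends 2 messages per edge; only edges crossing partitions use the network."""
    return 2 * sum(1 for u, v in edges if placement[u] != placement[v])


# Toy Bayesian network edges and a placement of variables on partitions.
edges = [("a", "h"), ("a", "t"), ("f", "o"), ("f", "r")]
before = {"a": "00", "f": "00", "h": "01", "t": "11", "o": "10", "r": "10"}
after = dict(before, f="10")  # move f next to its neighbours o and r

print(total_energy(edges, before), physical_messages_per_iteration(edges, before))
print(total_energy(edges, after), physical_messages_per_iteration(edges, after))
```

Moving `f` to the partition of its neighbours drops the energy from 7 to 3 and the physical messages per BP iteration from 8 to 4, at the price of one more variable stored at partition 10.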

Spring relations in P-Grid

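[Figure: the same virtual binary trie (0*, 1*, 00*, 01*, 10*, 11*) with peers A, F, B, C, D, E at the leaves; each peer lists its routing table and the dependencies of its local variables]

Peer   Routing table           Variable dependencies (springs)
A      1* : C, D   01* : B     a -> h, t    f -> o, r    …
F      1* : E      01* : B     a -> h, t    f -> o, r    …
B      1* : C, D   00* : F     h -> a, m    m -> h, u    …
C      0* : A, B   11* : E     o -> f       r -> f, t    …
D      0* : A, F   11* : E     o -> f       r -> f, t    …
E      0* : B, F   10* : D     t -> a, r    u -> m       …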
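
The figure's data is enough to work out which variables are pulled towards one routing level and which towards several. The sketch below uses only the dependencies shown (elided entries are omitted). Counting one unit of tension per spring is an assumption, and the names `PARTITION`, `SPRINGS`, `tensions` and `classify` are illustrative.

```python
# Partitions and dependencies transcribed from the figure (elided entries omitted).
PARTITION = {"a": "00", "f": "00", "h": "01", "m": "01",
             "o": "10", "r": "10", "t": "11", "u": "11"}
SPRINGS = {"a": ["h", "t"], "f": ["o", "r"], "h": ["a", "m"], "m": ["h", "u"],
           "o": ["f"], "r": ["f", "t"], "t": ["a", "r"], "u": ["m"]}


def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def tensions(var):
    """Springs of `var` per routing level (level 1 = paths differ in the first bit)."""
    home = PARTITION[var]
    per_level = {}
    for other in SPRINGS[var]:
        target = PARTITION[other]
        if target == home:
            continue  # same partition: no physical message, no tension
        level = common_prefix_len(home, target) + 1
        per_level[level] = per_level.get(level, 0) + 1
    return per_level


def classify(var):
    levels = tensions(var)
    if not levels:
        return "no tension"
    return "unidirectional" if len(levels) == 1 else "multidirectional"


for var in sorted(PARTITION):
    print(var, tensions(var), classify(var))
```

For example, `f` is pulled only towards level 1 (both neighbours live under 1*), while `a` is pulled towards level 1 (by `t`) and level 2 (by `h`).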

The Relaxation Algorithm (relax variables)

currentLoad = length(localVars);
overload = currentLoad - avgLoad / 2;
IF (overload <= 0)
    return;
ENDIF
unidirVars = variables having a tension only at one level;
WHILE ((overload > 0) AND (length(unidirVars) > 0))
    move variable to a peer from the level with the tension;
    removeFirst(unidirVars);
    overload = overload - 1;
ENDWHILE
…

The Relaxation Algorithm (balance load)

…
multidirVars = vars having tensions at multiple levels;
WHILE ((currentLoad > avgLoad) AND (length(multidirVars) > 0))
    FOR i = routingTable.levels TO 1
        IF (level i is underpopulated)
            cand = vars having a tension at level i;
            FOR j = 1 TO length(cand)
                IF (cand(j).tension(i) >= max(cand(j).tension))
                    move variable to a peer from level i;
                    remove(multidirVars, cand(j));
                    currentLoad = currentLoad - 1;
                    IF (currentLoad <= avgLoad)
                        break;
                    ENDIF;
                ENDIF;
            ENDFOR;
        ENDIF;
    ENDFOR;
ENDWHILE
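
Both phases of the pseudocode can be combined into one runnable sketch. It is one possible reading, not the author's Matlab code. The function signature and the `tensions` / `underpopulated` inputs are assumptions, and three choices are not on the slides: `current_load` is also decremented in phase 1, the function returns as soon as the load is balanced, and a guard stops the outer loop when no variable can be moved.

```python
def relax_peer(tensions, avg_load, underpopulated, num_levels):
    """One relaxation step at a single peer.

    tensions:       {var: {level: tension}} for locally stored variables
    avg_load:       average number of variables per peer
    underpopulated: set of routing levels considered underpopulated
    num_levels:     number of routing table levels (levels are 1-based)
    Returns a list of (variable, level) moves.
    """
    moves = []
    current_load = len(tensions)
    overload = current_load - avg_load / 2
    if overload <= 0:
        return moves

    # Phase 1 (relax variables): move variables pulled in one direction only.
    unidir = [v for v, t in tensions.items() if len(t) == 1]
    while overload > 0 and unidir:
        var = unidir.pop(0)
        (level,) = tensions[var]          # the only level with a tension
        moves.append((var, level))
        overload -= 1
        current_load -= 1                 # assumption: not explicit on the slide

    # Phase 2 (balance load): move multi-directional variables.
    multidir = [v for v, t in tensions.items() if len(t) > 1]
    while current_load > avg_load and multidir:
        moved = False
        for level in range(num_levels, 0, -1):
            if level not in underpopulated:
                continue
            for var in [v for v in multidir if level in tensions[v]]:
                # Move only if this level pulls at least as hard as any other.
                if tensions[var][level] >= max(tensions[var].values()):
                    moves.append((var, level))
                    multidir.remove(var)
                    current_load -= 1
                    moved = True
                    if current_load <= avg_load:
                        return moves      # assumption: stop as soon as balanced
        if not moved:
            break                         # guard: nothing movable, avoid looping forever
    return moves


example = {"a": {1: 1, 2: 1}, "f": {1: 2}, "b": {2: 3}, "c": {1: 1, 2: 2}}
print(relax_peer(example, avg_load=2, underpopulated={2}, num_levels=2))
# [('f', 1), ('b', 2)]
```

In the example the peer holds four variables against an average of two. The two unidirectional variables are moved first, which already brings the load down to the average, so phase 2 does nothing.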

Algorithm execution

• Executed at each node
  – Iteratively
  – Independently (evaluated simultaneously)
• Termination
  – Max. number of iterations
  – No free or multi-directional variables to move
  – No tension reduction in last two iterations
• Effort
  – Variable movements require only 1 message
  – Trade-off against message reduction
  – Dynamic variables require “remote” updates
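
The three termination criteria can be written as a single check. This is a sketch under assumptions: the function name, its parameters, and the reading of "no tension reduction in last two iterations" as "neither of the last two totals is below the one before them" are not from the slides.

```python
def should_terminate(iteration, max_iterations, movable_vars, tension_history):
    """Return True if a node should stop running the relaxation algorithm.

    iteration:       number of iterations completed so far
    max_iterations:  upper bound on iterations
    movable_vars:    count of variables that could still be moved
    tension_history: total local tension after each iteration, oldest first
    """
    if iteration >= max_iterations:
        return True
    if movable_vars == 0:
        return True
    # No reduction over the last two iterations compared with the value before them.
    if len(tension_history) >= 3 and min(tension_history[-2:]) >= tension_history[-3]:
        return True
    return False


print(should_terminate(3, 10, 2, [9, 8, 7]))     # False: tension still falling
print(should_terminate(4, 10, 2, [9, 7, 7, 7]))  # True: stalled for two iterations
```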

Evaluation

• Matlab implementation

• Diverse Bayesian networks

– random, binary trees, scale-free

– up to 2048 Bayesian nodes

– up to 512 P-Grid nodes

• 10 repetitions

• 2 main evaluation criteria

– message reduction

– load balance

Random network

1024 nodes, average node degree 4

Binary tree network

1023 nodes

Scale-free network

1024 nodes, average node degree 4

Message reduction (random)

[Figure: four panels, labeled 64 / 2048 / 4, 128 / 2048 / 4, 256 / 2048 / 4, 512 / 2048 / 4]

Message reduction (binary tree)

[Figure: four panels, labeled 64 / 2047, 128 / 2047, 256 / 2047, 512 / 2047]

Message reduction (scale-free)

[Figure: four panels, labeled 64 / 2048, 128 / 2048, 256 / 2048, 512 / 2048]

Load balancing (random)

[Figure: four panels, labeled 64 / 2048 / 4, 128 / 2048 / 4, 256 / 2048 / 4, 512 / 2048 / 4]

Load balancing (binary tree)

[Figure: four panels, labeled 64 / 2047, 128 / 2047, 256 / 2047, 512 / 2047]

Load balancing (scale-free)

[Figure: four panels, labeled 64 / 2048, 128 / 2048, 128 / 2048, 512 / 2048]

Number of iterations

• Until the relaxation algorithm terminates
• Scale-free network (128 nodes, 1024 vars)
• 100 runs

Reduction effort (random)

[Figure: four panels, labeled 64 / 2048 / 4, 128 / 2048 / 4, 256 / 2048 / 4, 512 / 2048 / 4]

Reduction effort (binary tree)

[Figure: four panels, labeled 64 / 2047, 128 / 2047, 256 / 2047, 512 / 2047]

Reduction effort (scale-free)

[Figure: four panels, labeled 64 / 2048, 128 / 2048, 128 / 2048, 512 / 2048]

Conclusions

• Decentralized relaxation algorithm

– Reduces message cost for Belief Propagation

– Considers load balance

• Several scenarios (Distributed Knowledge Base)

• First evaluation looks promising

• Intermediate steps are still missing

– Learning of Bayesian network

Thank you!

Questions?