Linearization of Belief Propagation on Pairwise Markov Random Fields
Wolfgang Gatterbauer [email protected]   Prakhar Ojha [email protected]

MOTIVATION & PROBLEM FORMULATION
• Node labeling over large graphs is important for applications such as fraud detection in e-commerce transaction graphs or node classification in social networks.
• Belief Propagation (BP) is an iterative message-passing algorithm for inference in Markov Random Fields (MRFs) that is often used for node labeling.

PAIRWISE LinBP IN A NUTSHELL
Our approach simplifies the problem of applying Belief Propagation to node labeling:

                  BP         FaBP           LinBP          this work
# node types      arbitrary  1              1              arbitrary
# node classes    arbitrary  2              const k        arbitrary
# edge types      arbitrary  1              1              arbitrary
edge symmetry     arbitrary  required       required       arbitrary
edge potential    arbitrary  doubly stoch.  doubly stoch.  arbitrary
closed form       no         yes            yes            yes

THEORETICAL RESULTS
Q1: How accurate is our approximation, and under which conditions is it reasonable?
• The linearization gives labeling accuracy comparable to BP for graphs with weak potentials.
• The accuracy deteriorates with denser networks and stronger potentials.

EXPERIMENTS & RESULTS
Q3: How fast is the linearized approximation compared to BP?
• The linearization is around 100 times faster than BP per iteration and often needs 10 times fewer iterations until convergence.
• In practice, this can lead to a speed-up of 1000 times.
• Experiments are run in Python.

CONCLUSIONS & FUTURE WORK
• Linearized BP with convergence guarantees and a closed-form solution for arbitrary pairwise MRFs, solvable with iterative updates.
• Compelling computational advantage and very simple code.
• Source code available at https://github.com/sslh/sslh
• Applying this idea to arbitrary potentials is more complicated:
  – hardness arises from the different re-centering in the two directions of an edge;
  – belief vectors no longer remain stochastic.
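As a sketch of what the iterative linearized updates might look like in NumPy: the simplified matrix update Y ← X + A·Y·H over centered residual beliefs on a toy path graph. The function and variable names here are illustrative, not the actual sslh API.

```python
import numpy as np

def linearized_propagation(A, X_res, H_res, max_iter=100, tol=1e-10):
    """Iterate Y <- X + A @ Y @ H until the residual beliefs stop changing.

    A:      (n, n) symmetric adjacency matrix
    X_res:  (n, k) centered (residual) prior beliefs
    H_res:  (k, k) centered (residual) edge potential
    The linear iteration converges whenever rho(A) * rho(H_res) < 1.
    """
    Y = X_res.copy()
    for _ in range(max_iter):
        Y_new = X_res + A @ Y @ H_res
        if np.max(np.abs(Y_new - Y)) < tol:
            break
        Y = Y_new
    return Y_new

# Toy example: a 4-node path, 2 classes, weak homophily potential,
# and only node 0 carrying a prior (leaning toward class 0).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H_res = np.array([[ 0.1, -0.1],
                  [-0.1,  0.1]])      # residual potential (centered, small)
X_res = np.zeros((4, 2))
X_res[0] = [0.1, -0.1]
Y = linearized_propagation(A, X_res, H_res)
labels = Y.argmax(axis=1)             # node 0's label propagates along the path
```

With the weak homophily potential, all four nodes end up labeled with node 0's class, since each node's strongest residual comes (directly or transitively) from the single labeled node.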
BACKGROUND: BELIEF PROPAGATION
• In loopy BP, node s sends a message to neighbor t for every class, combining s's prior, the edge potential, and the messages s received from its other neighbors:
    m_{s→t}(y_t) ∝ Σ_{y_s} H_{st}(y_s, y_t) · x_s(y_s) · Π_{u ∈ N(s)\{t}} m_{u→s}(y_s),
  with the reverse-direction potential given by H_{ts} = H_{st}ᵀ.
• But convergence is not guaranteed in loopy graphs.
• BP needs a lot of fine-tuning to make it converge.
This work derives an approach to approximate loopy BP on any pairwise MRF by linearizing the update equations and giving exact convergence guarantees.

Input:
• x_s: prior label distribution for node s
• A: (n × n) adjacency matrix between nodes
• potentials on each edge s–t

[Figure: worked example of a 3×2 potential matrix being re-centered around 1 (row re-centering via row sums r̂(j), column re-centering via column sums ĉ(i)), and of beliefs being un-centered around ½ and ⅓.]

Experimental settings: h = 2 (potential matrix H(h)), d = 5 and 10, f = 0.05.

Q2: How predictable is our formulation compared to BP?
• Our approximation can always be made to converge by scaling the potentials with a known scaling factor.
• In contrast, loopy BP sometimes needs damping and scaling with an unknown factor.

References:
FaBP: Koutra et al. Unifying guilt-by-association approaches: Theorems and fast algorithms. ECML/PKDD 2011.
LinBP: Gatterbauer et al. Linearized and single-pass belief propagation. PVLDB 2015.
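The "known scaling factor" claim of Q2 can be sketched numerically: for the linear update Y ← X + A·Y·H, the system matrix is kron(Hᵀ, A), so shrinking H by any factor below 1/ρ(kron(Hᵀ, A)) forces convergence. The function name and the safety margin eps_frac are our assumptions, not the poster's notation.

```python
import numpy as np

def scale_potential_for_convergence(A, H_res, eps_frac=0.9):
    """Shrink the residual potential so that Y <- X + A @ Y @ H converges.

    vec(A @ Y @ H) = kron(H.T, A) @ vec(Y), so the iteration converges iff
    the spectral radius of kron(H.T, A) is below 1. Scaling H by s scales
    that radius by s, so s = eps_frac / rho is a known, sufficient factor.
    """
    rho = np.max(np.abs(np.linalg.eigvals(np.kron(H_res.T, A))))
    s = min(1.0, eps_frac / rho) if rho > 0 else 1.0
    return s * H_res             # never enlarge an already-convergent potential

# A strong potential on a 2-node graph: spectral radius 4, so the raw
# iteration would diverge; the scaled potential is guaranteed to converge.
A = np.array([[0., 1.], [1., 0.]])
H_strong = np.array([[ 2., -2.],
                     [-2.,  2.]])
H_safe = scale_potential_for_convergence(A, H_strong)
rho_safe = np.max(np.abs(np.linalg.eigvals(np.kron(H_safe.T, A))))
```

This is the contrast with loopy BP drawn in Q2: here the scaling factor is computable in closed form from a spectral radius, rather than found by trial-and-error damping.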
OUR APPROACH
Key idea: by starting with messages and potentials that are appropriately re-centered around 1 and have small standard deviations, the resulting equations do not require further normalization. We replace multiplication with addition, and normalization is no longer necessary.

Worked example (beliefs centered around ½; each value = center + residual, and residuals sum to 0):
  [0.6, 0.4] = [0.5, 0.5] + [0.1, −0.1]
  Exact BP multiplies and normalizes:  [0.6, 0.4] ∘ [0.6, 0.4] = [0.36, 0.16], Σ = 0.52  ⇒  [0.69, 0.31]
  Linearized BP adds residuals:  [0.5, 0.5] + [0.1, −0.1] + [0.1, −0.1] = [0.7, 0.3] ≈ [0.69, 0.31], and the sum stays 1 automatically.

Row centering: center each element of a row on the average of its row-sum. Formally,
  Ĥ'(j, i) = Ĥ(j, i) − r̂(j)/k,   where r̂(j) = Σ_i Ĥ(j, i).
Column centering is defined analogously.

• After re-centering, the max-marginal solution for a partially labeled MRF can be approximated by the solution of the following equation system (with ĉ'* the constant term induced by the centering):
    ŷ = (x̂ + ĉ'*) + (Ĥ' − Ĥ'²) ŷ
• The solution can be found quickly with the iterative updates
    ŷ⁽ᵗ⁺¹⁾ ← (x̂ + ĉ'*) + (Ĥ' − Ĥ'²) ŷ⁽ᵗ⁾,
  which converge under the condition ρ(Ĥ' − Ĥ'²) < 1, where ρ is the spectral radius of the matrix.

Important practical take-away: these update equations come with exact convergence guarantees and can be implemented very efficiently with existing linear algebra packages.

[Figure: a 3-node example graph with directed edge potentials Ĥ'₁₂, Ĥ'₂₁, Ĥ'₂₃, Ĥ'₃₂ and the resulting beliefs ŷ₁, ŷ₂, ŷ₃.]

Output:
• y_s: posterior label distribution for each node s

Summary:
• A linearized system with labeling accuracy comparable to BP on graphs with weak potentials, while speeding up inference by orders of magnitude.
• The formalism can model arbitrary heterogeneous networks.
• It comes with exact convergence guarantees and a fast matrix implementation.
• Iterative updates use only linear equations.
• The code library is available on GitHub and is very easy to use.
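The "multiplication becomes addition" step can be checked with a few lines of NumPy: combine two stochastic vectors exactly (elementwise product plus normalization) and approximately (sum of centered residuals). This is a small numeric sketch, not code from the poster.

```python
import numpy as np

k = 2
center = np.full(k, 1.0 / k)            # beliefs centered around 1/k = 0.5
m = np.array([0.6, 0.4])                # two stochastic quantities to combine
b = np.array([0.6, 0.4])
m_res, b_res = m - center, b - center   # small residuals, each summing to 0

# Exact BP combines by elementwise product, then re-normalizes:
exact = m * b                           # [0.36, 0.16], sum = 0.52
exact = exact / exact.sum()             # -> roughly [0.69, 0.31]

# The linearization replaces the product by a sum of residuals; because
# every residual sums to 0, the result sums to 1 without re-normalizing:
linear = center + m_res + b_res         # -> [0.7, 0.3]
```

The two results differ by less than 0.01 here, which is the regime the poster targets: weak potentials mean small residuals, and the error of the linearization shrinks with the residuals.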