On the approximability of the link building problem
Authors - Martin Olsen, Anastasios Viglas
Speaker - Wayne Yang
Feb 22, 2016
Agenda
• Introduction
• Define LINK BUILDING PROBLEM
• Effect of receiving new links
• Hardness results: the complexity of LINK BUILDING
• An approximation algorithm for LINK BUILDING
• Lower bounds for the approximation ratios of greedy algorithms for LINK BUILDING
• Discussion and open problems
Introduction
• Search engine optimization (SEO) is a fast-growing industry that deals with optimizing the ranking of web pages in search engine results.
• The PageRank algorithm is one of the most well-known methods of defining a ranking among vertices according to the link structure of a graph.
LINK BUILDING PROBLEM
• Instance: A triple (G, x, k), where G(V, E) is a directed graph, x ∈ V is the target vertex, and k is the number of new links to create.
• Solution: A set S ⊆ V \ {x} with |S| = k.
• Objective: Maximize π_x in G(V, E ∪ (S × {x})).
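The objective can be made concrete with a small sketch. It is illustrative only and not taken from the paper: the function names, the damping value alpha = 0.85, the vertex labels 0..n-1, and the assumption that every vertex has at least one out-link are all choices made here. The exhaustive search is exponential in k and is only meant to show what an optimal solution S is on a toy graph.

```python
from itertools import combinations

def pagerank(n, edges, alpha=0.85, iters=200):
    """Power iteration for PageRank on vertices 0..n-1.

    Assumption: every vertex has at least one out-link (no dangling vertices).
    """
    out = [[] for _ in range(n)]
    for u, v in set(edges):
        out[u].append(v)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # Zapping: with probability 1 - alpha jump to a uniformly random vertex.
        nxt = [(1.0 - alpha) / n] * n
        for u in range(n):
            share = alpha * pi[u] / len(out[u])
            for v in out[u]:
                nxt[v] += share
        pi = nxt
    return pi

def objective(n, edges, x, S, alpha=0.85):
    """PageRank of x in G(V, E ∪ (S × {x}))."""
    new_edges = set(edges) | {(u, x) for u in S}
    return pagerank(n, new_edges, alpha)[x]

def brute_force(n, edges, x, k, alpha=0.85):
    """Try every S ⊆ V \\ {x} with |S| = k (hypothetical helper, tiny graphs only)."""
    candidates = [u for u in range(n) if u != x]
    return max(combinations(candidates, k),
               key=lambda S: objective(n, edges, x, S, alpha))

# Toy instance: target x = 0 has no incoming links yet.
edges = [(0, 1), (1, 2), (2, 3), (3, 1)]
best = brute_force(4, edges, x=0, k=1)
print(best, objective(4, edges, 0, best))
```

The hardness results later in the talk explain why this exhaustive approach is not expected to scale.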
Effect of receiving new links
• Avrachenkov and Litvak [6] study the effect on PageRank of adding new links with the same origin to the web graph.
• Theorem 1. Let each of the pages 2 to k + 1 create a link to page 1. If π̃_p denotes the updated PageRank value of page p, for p ∈ {1, ..., n}, then
    π̃_p = π_p + [π_2 π_3 ... π_{k+1}] M^(-1) q
• Roughly, the first factor concerns the PageRank values of the vertices involved, and the second factor, M^(-1) q, concerns the “distances” between the vertices involved in the update.
Ideal sources for backlinks
• Any vertex u in S satisfies at least one of the following two conditions:
• (4a) u is relatively popular compared to its out-degree, or
• (4b) u has a low out-degree and is within a short distance from x = 1 (z_xu is large).
• (4c) The vertices belong to different communities (z_uv is small for u, v ∈ S).
• (4d) The distances from the vertices to x = 1 are long (z_ux is small for u ∈ S).
Hardness results: the complexity of LINK BUILDING
• We show that LINK BUILDING is W[1]-hard and does not have a fully polynomial-time approximation scheme (FPTAS). These results are based on reductions from a variant of independent set.
• Consequently, if P ≠ NP, then there is no FPTAS for LINK BUILDING.
An approximation algorithm for LINK BUILDING
• r-Greedy is a greedy polynomial-time algorithm for LINK BUILDING. It computes a set of k new backlinks to a target vertex x that achieves a PageRank value within a constant factor of the optimal value.
• z_uv denotes the expected number of visits to vertex v, for the PageRank random walk starting from vertex u, before a zapping event occurs.
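The quantity z_uv can be computed by following the walk step by step. The sketch below is not from the paper and rests on assumptions made here: the walk follows a link with probability alpha = 0.85 and zaps otherwise, the start at u counts as a visit (so the values form the matrix (I − αP)^(-1)), every vertex has at least one out-link, and the function names are hypothetical.

```python
def visits_from(n, edges, u, alpha=0.85, iters=200):
    """Return [z_u0, ..., z_u(n-1)]: expected visits to each vertex from u
    before the first zapping event (the start at u counts as one visit).

    Assumption: every vertex has at least one out-link.
    """
    out = [[] for _ in range(n)]
    for a, b in set(edges):
        out[a].append(b)
    # dist[v] = probability that the walk is at v after t steps without zapping.
    dist = [0.0] * n
    dist[u] = 1.0
    z = [0.0] * n
    for _ in range(iters):
        for v in range(n):
            z[v] += dist[v]
        nxt = [0.0] * n
        for a in range(n):
            if dist[a]:
                share = alpha * dist[a] / len(out[a])
                for b in out[a]:
                    nxt[b] += share
        dist = nxt
    return z

def pagerank_from_visits(n, edges, alpha=0.85):
    """PageRank via pi_v = (1 - alpha) / n * sum over u of z_uv."""
    rows = [visits_from(n, edges, u, alpha) for u in range(n)]
    return [(1.0 - alpha) / n * sum(rows[u][v] for u in range(n))
            for v in range(n)]

edges = [(0, 1), (1, 2), (2, 0), (2, 1)]
print(visits_from(3, edges, 0))
print(pagerank_from_visits(3, edges))
```

Two sanity checks follow from the definition under these assumptions: each row sums to 1 / (1 − alpha), the expected number of steps before zapping, and the PageRank values recovered from the visit counts sum to 1.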
An approximation algorithm for LINK BUILDING
• r-Greedy(G, x, k)
• S := ∅
• repeat k times
•   Let u be a vertex which maximizes the value of π_x / z_xx in G(V, E ∪ {(u, x)})
•   S := S ∪ {u}
•   E := E ∪ {(u, x)}
• Report S as the solution
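A minimal Python sketch of the pseudocode above, under assumptions that are not stated on the slide: vertices are labelled 0..n-1, alpha = 0.85, every vertex has at least one out-link, candidates exclude x and vertices already chosen, and ties are broken by vertex order. The helper names are hypothetical. It uses the identity π_x = (1 − α)/n · Σ_u z_ux, so one fixed-point iteration for the column (z_ux)_u yields both π_x and z_xx.

```python
def visits_to(n, edges, x, alpha=0.85, iters=200):
    """Return c with c[u] = z_ux, the expected visits to x from u before zapping.

    Iterates c = e_x + alpha * P * c. Assumption: no dangling vertices.
    """
    out = [[] for _ in range(n)]
    for a, b in set(edges):
        out[a].append(b)
    c = [0.0] * n
    for _ in range(iters):
        c = [(1.0 if u == x else 0.0)
             + alpha * sum(c[w] for w in out[u]) / len(out[u])
             for u in range(n)]
    return c

def score(n, edges, x, alpha=0.85):
    """The greedy criterion pi_x / z_xx for the given edge set."""
    c = visits_to(n, edges, x, alpha)
    pi_x = (1.0 - alpha) / n * sum(c)
    return pi_x / c[x]

def r_greedy(n, edges, x, k, alpha=0.85):
    """Pick k sources one at a time, each maximizing pi_x / z_xx."""
    E = set(edges)
    S = []
    for _ in range(k):
        # Assumption: only vertices other than x and not yet chosen are considered.
        candidates = [u for u in range(n) if u != x and u not in S]
        best = max(candidates, key=lambda u: score(n, E | {(u, x)}, x, alpha))
        S.append(best)
        E.add((best, x))
    return S

edges = [(0, 1), (1, 2), (2, 1), (3, 4), (4, 3)]
print(r_greedy(5, edges, x=0, k=2))
```

Each round evaluates the criterion once per candidate, so this sketch performs on the order of k · n score computations. It favors readability over speed.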
Lower bounds for the approximation ratios of greedy algorithms for LINK BUILDING
• In order to force a greater approximation ratio, we would have to consider graph families that use the independent set aspect of link building, as discussed in Remark 1 and Section 4.
• We want to construct a graph with vertices that have the following properties:
• k cycle vertices c_1, c_2, ..., c_k that
  – have high PageRank compared to their out-degree
  – form a cycle, and therefore are in the same community
• k sink vertices s_1, s_2, ..., s_k that
  – have PageRank values (compared to their degrees) slightly lower than the cycle vertices
  – do not belong to the same community
  – have links from the target vertex x towards them, and therefore are within a short distance from the target
Discussion and open problems
• We present a lower bound for the approximation ratio achieved by a perhaps more intuitive and simpler greedy algorithm.
• A more interesting open problem is to develop a polynomial-time approximation scheme (PTAS) for LINK BUILDING.