1 Joint work with Shmuel Safra

1 Joint work with Shmuel Safra. 2 Motivation 3 Motivation.

Dec 22, 2015

Joint work with Shmuel Safra


Motivation

The Catalog Problem
Input:
A set of customers $C$.
A set of pages $P$.
A function $\pi : C \to 2^P$ giving the pages each customer is interested in.
The catalog size $r$.

Output: A catalog $P' \subseteq P$ of size $r$ s.t. $\sum_{c \in C} |\pi(c) \cap P'|$ is maximal.

The Catalog Problem (cont.)
Algorithm: Take the $r$ most popular pages.
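The single-catalog algorithm can be sketched in a few lines. The sketch below is illustrative only: the dictionary `interests`, the customer names and the page names are made up, and ties between equally popular pages are broken alphabetically, which the slides do not specify. Because the objective is just the sum of the popularities of the chosen pages, picking the top $r$ is optimal.

```python
from collections import Counter

def most_popular_catalog(interests, r):
    """Return the r pages wanted by the most customers.

    interests: dict mapping each customer to the set of pages they like
    (a hypothetical encoding of the interest function).
    """
    popularity = Counter(page for liked in interests.values() for page in liked)
    # Sort by descending popularity; break ties by page name for determinism.
    ranked = sorted(popularity, key=lambda page: (-popularity[page], page))
    return set(ranked[:r])

def hits(interests, catalog):
    """Number of (customer, page) pairs where the liked page is in the catalog."""
    return sum(len(liked & catalog) for liked in interests.values())

# Made-up toy instance.
interests = {
    "alice": {"p1", "p2"},
    "bob": {"p2", "p3"},
    "carol": {"p2", "p4"},
    "dave": {"p3"},
}
catalog = most_popular_catalog(interests, r=2)
total = hits(interests, catalog)
print(sorted(catalog), total)
```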

Catalog Segmentation

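The k-Catalog Segmentation
Input:
A set of customers $C$.
A set of pages $P$.
A function $\pi : C \to 2^P$ giving the pages each customer is interested in.
The catalog size $r$.

Output: $k$ catalogs $P_1, \dots, P_k \subseteq P$ of size $r$ each, s.t. $\sum_{c \in C} \max_{1 \le i \le k} |\pi(c) \cap P_i|$ is maximal.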
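To make the objective concrete, here is a small sketch that evaluates it and finds the optimum by exhaustive search. It is only meant for toy inputs (the search is exponential), and the instance, the names `segmentation_hits` and `brute_force_segmentation`, and the dictionary encoding of the interest function are all assumptions made for illustration.

```python
from itertools import combinations, combinations_with_replacement

def segmentation_hits(interests, catalogs):
    """Each customer receives the catalog that contains most of their pages."""
    return sum(max(len(liked & cat) for cat in catalogs)
               for liked in interests.values())

def brute_force_segmentation(interests, pages, k, r):
    """Try every choice of k catalogs of size r (tiny instances only)."""
    candidates = [frozenset(c) for c in combinations(sorted(pages), r)]
    return max(combinations_with_replacement(candidates, k),
               key=lambda catalogs: segmentation_hits(interests, catalogs))

# Made-up instance with two natural customer segments plus one outlier.
interests = {
    "a": {"p1", "p2"},
    "b": {"p1", "p2"},
    "c": {"p3", "p4"},
    "d": {"p3", "p4"},
    "e": {"p1", "p3"},
}
pages = set().union(*interests.values())
best = brute_force_segmentation(interests, pages, k=2, r=2)
best_hits = segmentation_hits(interests, best)
print([sorted(cat) for cat in best], best_hits)
```

On this instance the two segments each get their own catalog, and the outlier customer is served only partially.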

Representation as a Graph
We can consider the input as a bipartite graph $G = (C, P, E)$, where $E = \{ (c, p) \mid c \in C,\ p \in \pi(c) \}$.

Then, our goal is to find $k$ sets of vertices $P_1, \dots, P_k \subseteq P$ of size $r$ each, and a partition of $C$ into $k$ sets $C_1, \dots, C_k$ s.t. $|E \cap (C_1 \times P_1 \cup \dots \cup C_k \times P_k)|$ is maximal.

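Uniform Catalog Problem
Definition: A catalog problem is called uniform if there exists a number $d$ such that the degree of every vertex $p \in P$ is $d$.

The maximum possible number of hits for a uniform catalog problem is $k \cdot r \cdot d$.

Thus, we can normalize the number of hits and define
$\mathrm{sat}(G) = \max \frac{|E \cap (C_1 \times P_1 \cup \dots \cup C_k \times P_k)|}{k \cdot r \cdot d}$,
where the maximum is taken over all choices of the catalogs $P_1, \dots, P_k$ and of the partition $C_1, \dots, C_k$.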

Hardness
Theorem (Kleinberg, Papadimitriou and Raghavan): It is NP-hard to compute the optimal $k$ catalogs exactly.

Approximation
Proposition: Taking the $r$ most popular pages in all $k$ catalogs gives an approximation factor of $1/k$.

Proof: In the optimal solution, there is a catalog that gives at least $1/k$ of the hits. Thus, using only this catalog leaves us with at least $1/k$ of the hits. Replacing this catalog by the $r$ most popular pages can only increase the number of hits.
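The $1/k$ guarantee can be checked empirically on small random instances. The following is a sanity-check sketch, not part of the talk: the instance sizes, the random seed and all function names are arbitrary choices, and the optimum is found by exhaustive search, so it only scales to toy inputs.

```python
import random
from collections import Counter
from itertools import combinations, combinations_with_replacement

def segmentation_hits(interests, catalogs):
    """Each customer is served by the catalog covering most of their pages."""
    return sum(max(len(liked & cat) for cat in catalogs) for liked in interests)

def popular_baseline(interests, r):
    """The r most popular pages (ties broken by page id)."""
    popularity = Counter(p for liked in interests for p in liked)
    return frozenset(sorted(popularity, key=lambda p: (-popularity[p], p))[:r])

def optimum(interests, pages, k, r):
    """Exhaustive search over all choices of k catalogs of size r."""
    candidates = [frozenset(c) for c in combinations(pages, r)]
    return max(segmentation_hits(interests, choice)
               for choice in combinations_with_replacement(candidates, k))

rng = random.Random(0)
pages = list(range(5))
k, r = 2, 2
ratios = []
for _ in range(20):
    # Six customers, each liking between one and three random pages.
    interests = [frozenset(rng.sample(pages, rng.randint(1, 3))) for _ in range(6)]
    baseline = segmentation_hits(interests, [popular_baseline(interests, r)] * k)
    ratios.append(baseline / optimum(interests, pages, k, r))

print(min(ratios))
```

Every ratio should be at least $1/k$, as the proposition states; in practice the baseline is often much closer to the optimum on such small inputs.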

Dense Instances
Kleinberg, Papadimitriou and Raghavan gave an approximation scheme for dense instances, i.e. instances in which each customer is interested in at least an $\varepsilon$ fraction of the pages.

The PCP
A SAT instance $\Psi = (\psi_1, \dots, \psi_n)$ over 2 types of variables: $X$ and $Y$.
The range of the variables $x \in X$ is $R_X = \{0,1\}^l$.
The range of the variables $y \in Y$ is $\{0,1\}$.
Each $\psi_i$ depends on exactly one $x \in X$ and one $y \in Y$, s.t. the value assigned to $x$ determines the value of $y$. Thus, we can write it as a function $\psi_{x \to y} : R_X \to \{0,1\}$.

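The PCP (cont.)
It is NP-hard to distinguish between the following 2 cases:

Good: There exists an assignment $A$ s.t. $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] = 1$.

Bad: For any assignment $A$, $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] < \tfrac{1}{2} + \delta$.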

The Reduction
Given an instance $\Psi$ for the above PCP, let $G$ be the following instance for the 2-catalog segmentation problem:

$P = \{ (x, a, s) \mid x \in X,\ a \in R_X,\ s \in \{0,1\} \}$
$C = \{ (y, b) \mid y \in Y,\ b \in \{0,1\} \}$
$(x, a, s) \in \pi((y, b)) \iff \psi_{x \to y} \in \Psi$ and $\psi_{x \to y}(a) = b \oplus s$
$r = |X|$

Completeness
Theorem: If $\Psi$ is satisfiable then $\mathrm{sat}(G) = 1$.

Proof: Let $A$ be a satisfying assignment and consider the following segmentation:
$\forall i \in \{0,1\}$, $P_i = \{ (x, A(x), i) \mid x \in X \}$.
$\forall y \in Y$, $(y, A(y))$ gets $P_0$ and $(y, \neg A(y))$ gets $P_1$.
Thus, for every page in the catalogs, all the customers that are interested in it get it, and hence $\mathrm{sat}(G) = 1$.
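The reduction and the completeness argument can be traced on a toy instance. Everything concrete below is hypothetical: the variable names, the tables in `psi`, the choice $l = 1$, and the reading of the interest relation as "page $(x, a, s)$ interests customer $(y, b)$ iff the test on $(x, y)$ exists and maps $a$ to $b \oplus s$". The sketch builds the 2-catalog instance, uses the segmentation from the proof, and computes the normalized number of hits.

```python
from itertools import product

# Hypothetical toy PCP instance; all names and tables are illustrative.
l = 1
X = ["x0", "x1"]
Y = ["y0", "y1"]
R_X = list(product((0, 1), repeat=l))      # range of every x: {0,1}^l
# psi[(x, y)] maps a value a of x to the bit that a forces on y.
psi = {
    ("x0", "y0"): {(0,): 0, (1,): 1},
    ("x0", "y1"): {(0,): 1, (1,): 0},
    ("x1", "y0"): {(0,): 1, (1,): 0},
    ("x1", "y1"): {(0,): 0, (1,): 1},
}
# An assignment that satisfies every test above.
A = {"x0": (1,), "x1": (0,), "y0": 1, "y1": 0}

pages = [(x, a, s) for x in X for a in R_X for s in (0, 1)]
customers = [(y, b) for y in Y for b in (0, 1)]

def interested(customer, page):
    """Assumed interest relation: psi_{x->y}(a) = b XOR s."""
    (y, b), (x, a, s) = customer, page
    return (x, y) in psi and psi[(x, y)][a] == b ^ s

k, r = 2, len(X)
# The instance is uniform: every page has the same degree d.
degrees = {sum(interested(c, p) for c in customers) for p in pages}
d = degrees.pop()

# Segmentation from the completeness proof.
catalogs = [{(x, A[x], i) for x in X} for i in (0, 1)]
assigned = {(y, b): 0 if b == A[y] else 1 for (y, b) in customers}
hits = sum(interested(c, p) for c in customers for p in catalogs[assigned[c]])
sat_G = hits / (k * r * d)
print(d, hits, sat_G)
```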

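Soundness
We would like to show that $\forall \varepsilon$, $\exists \delta = \delta(\varepsilon)$ s.t. if $\mathrm{sat}(G) > \tfrac{1}{2} + \varepsilon$ then there exists an assignment $A$ s.t. $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] \ge \tfrac{1}{2} + \delta$.

We would like to construct an assignment according to the catalogs.

Problem: A catalog might contain many pages for the same $x$ with different assignments.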

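Refining the PCP
Solution: Changing the PCP.

Good: There exists an assignment $A$ s.t. $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] = 1$.

Bad: For any assignment $A$, the condition $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] < \tfrac{1}{2} + \delta$ is replaced by
$\Pr_{x \in X}\left[\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] > \tfrac{1}{2} + \delta\right] < \gamma$,
where the inner probability is over the tests $\psi_{x \to y}$ that involve $x$.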

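Choosing One Catalog
Now, assume $\mathrm{sat}(G) > \tfrac{1}{2} + \varepsilon$. Thus, for one of the catalogs, $P_{i'}$,
$\Pr_{p \in P_{i'},\ c \,:\, p \in \pi(c)}\left[c \in C_{i'}\right] > \tfrac{1}{2} + \varepsilon$
and hence
$\Pr_{p \in P_{i'}}\left[\Pr_{c \,:\, p \in \pi(c)}\left[c \in C_{i'}\right] > \tfrac{1}{2} + \tfrac{\varepsilon}{2}\right] > \tfrac{\varepsilon}{2}$.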

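Choosing a Subset of Pages
Let $P''_{i'} = \left\{ p \in P_{i'} \;\middle|\; \Pr_{c \,:\, p \in \pi(c)}\left[c \in C_{i'}\right] > \tfrac{1}{2} + \tfrac{\varepsilon}{2} \right\}$.

Thus, $|P''_{i'}| \ge \tfrac{\varepsilon}{2} |X|$.

Now, let us keep only one page in $P''_{i'}$ for each $x \in X$, and denote the set by $P'''_{i'}$. $|P'''_{i'}| \ge 2^{-l} \cdot \tfrac{\varepsilon}{2} |X|$.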

Enforcing the Same s
$\exists s' \in \{0,1\}$ s.t. $|\{ (x, a, s') \mid (x, a, s') \in P'''_{i'} \}| \ge 2^{-l-1} \cdot \tfrac{\varepsilon}{2} |X|$.

Denote the set of the corresponding x's by $X'$.

For an appropriate value of $\gamma$, $|X'| \ge \gamma |X|$.

Constructing an Assignment
We would like to construct an assignment as follows:
$\forall x \in X'$, assign the value of the appropriate page.
$\forall y \in Y$, if $(y, b)$ gets the catalog $P_{i'}$, assign the value $b \oplus s'$ to $y$.

Thus, $\forall x \in X'$, a $\tfrac{1}{2} + \tfrac{\varepsilon}{2}$ fraction of the clauses $\psi_{x \to y}$ are satisfied.

Problem
For a variable $y \in Y$, both $(y, 0)$ and $(y, 1)$ might get the same catalog. Thus, we cannot obtain an assignment to $Y$ as we would like to.

Taking Subsets of x's
Instead of taking one page for each $(x, a, s)$, we take a page for every tuple of:
A subset of $m$ x's
An assignment to the subset
A bit $s$

x

xA x

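The PCP
$\Psi = (\psi_1, \dots, \psi_n)$ over variables, $X$ and $Y$, s.t. it is NP-hard to distinguish between:

Good: There exists an assignment $A$ s.t. $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] = 1$.

Bad: For any assignment $A$, $\Pr_{x \in X}\left[\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] > \tfrac{1}{2} + \delta\right] < \gamma$.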

par[$\Phi$, k] - Definitions
For a 3SAT formula $\Phi$ over boolean variables $Y$, let $Y^{(k)}$ be the set of all $k$-subsets of $Y$, and let $\Phi^{(k)}$ be the set of all $k$-subsets of $\Phi$.

$\forall V \in Y^{(k)}$, let $S_V$ be the set of all assignments to $V$.

$\forall C \in \Phi^{(k)}$, let $S_C$ be the set of all satisfying assignments to $C$.

par[$\Phi$, k] - Definitions (cont.)
$\forall V \in Y^{(k)}$, $C \in \Phi^{(k)}$, let $V \sqsubseteq C$ if $V$ is a choice of one variable of each clause in $C$.

$\forall V \in Y^{(k)}$, $C \in \Phi^{(k)}$ s.t. $V \sqsubseteq C$, let $a|_V$ denote the natural restriction of an $a \in S_C$ to $S_V$.

par[$\Phi$, k]
Definition: For a 3SAT formula $\Phi$ over boolean variables $Y$, denote by par[$\Phi$, k] the following instance:

There are 2 types of variables:
$W$ : $x[V]$ for every $V \in Y^{(k)}$, over $S_V$
$Z$ : $x[C]$ for every $C \in \Phi^{(k)}$, over $S_C$

There is a local test $\psi[C, V]$ for every $V \sqsubseteq C$ that accepts iff $x[C]|_V = x[V]$.

par[$\Phi$, k] (cont.)
Definition: For a set of boolean clauses $\Psi$, let $\mathrm{sat}(\Psi)$ denote the maximal fraction of clauses of $\Psi$ that can be satisfied simultaneously.

Theorem:
If $\mathrm{sat}(\Phi) = 1$ then $\mathrm{sat}(\mathrm{par}[\Phi, k]) = 1$.
$\mathrm{sat}(\mathrm{par}[\Phi, k]) \le \mathrm{sat}(\Phi)^{c \cdot k}$ for some $c > 0$.

Long Code
Definition: An $R$-long-code has one bit for each boolean function $f : [R] \to \{0,1\}$.
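As a concrete illustration, the standard long-code encoding of an element $a \in [R]$ sets the bit indexed by $f$ to $f(a)$, so a codeword has $2^R$ bits. The sketch below assumes that this standard encoding is the one intended here; it represents each function by its truth table and, for convenience, indexes $[R]$ from 0. The name `long_code` is made up.

```python
from itertools import product

def long_code(a, R):
    """Long-code encoding of a in range(R): one bit f(a) per boolean f on [R].

    Each function f is represented by its truth table, a tuple of R bits,
    so f(a) is simply f[a].
    """
    return {f: f[a] for f in product((0, 1), repeat=R)}

codeword = long_code(a=2, R=3)
print(len(codeword))          # number of bits in the codeword
print(codeword[(0, 0, 1)])    # the function that is 1 only on input 2
```

Exactly half of the bits are 1, since half of all truth tables have a 1 in position $a$.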

The PCP of [ST]
For any bipartite graph $G = ([k], [k], E)$ we construct a SAT instance $\Psi(G)$, that contains one boolean function for every choice of:

$z \in Z$
$v_1, \dots, v_k \in LC[z]$
$w_1, \dots, w_k \in W$, s.t. $\forall 1 \le i \le k$, $w_i \sqsubseteq z$
$\forall 1 \le i \le k$, $u_i \in LC[w_i]$
$k^2$ perturbation functions $p_{1,1}, \dots, p_{k,k}$

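The PCP of [ST] (cont.)
$\psi(v_1, \dots, v_k, u_1, \dots, u_k, p_{1,1}, \dots, p_{k,k}) = \mathrm{TRUE} \iff \forall (i,j) \in E$, $v_i \oplus u_j = $ '$v_i \oplus u_j \oplus p_{i,j}$'.

Denote $p = \Pr_{v_i, u_j, p_{s,t}}\left[\psi(v_1, \dots, v_k, u_1, \dots, u_k, p_{1,1}, \dots, p_{k,k}) = \mathrm{TRUE}\right]$.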

The PCP of [ST] (cont.)
Theorem: $\forall \varepsilon > 0$, it is NP-hard to distinguish between the following 2 cases:

Good: $\forall G = ([k], [k], E)$, $p > (1 - \varepsilon)^{|E|}$

Bad: $\forall G = ([k], [k], E)$, $p < 2^{-|E|}$

Our PCP
A SAT instance $\Psi = (\psi_1, \dots, \psi_n)$ over 2 types of variables: $X$ and $Y$.
The range of the variables $x \in X$ is $R_X = \{0,1\}^l$.
The range of the variables $y \in Y$ is $\{0,1\}$.
Each $\psi_i$ is of the type $\psi_{x \to y} : R_X \to \{0,1\}$.

Our PCP (cont.)
Let $k = l/2$.
Given an instance $\Psi(G)$ as above, we construct an instance $\Psi$ as follows:
There is a variable $x \in X$ for every test $\psi \in \Psi(G)$. An assignment to $x$ is an assignment to the bits $v_1, \dots, v_k, u_1, \dots, u_k$.
$Y = LC[W]$.

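Our PCP (cont.)
Theorem: $\forall \varepsilon, \delta > 0$ and for some constant $c = c(\delta) > 0$, it is NP-hard to distinguish between:

Good: There exists an assignment $A$ s.t. $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] \ge 1 - \varepsilon$.

Bad: For any assignment $A$, $\Pr_{x \in X}\left[\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] > \tfrac{1}{2} + \delta\right] < 2^{-c l^2}$.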

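Our PCP (cont.)
Lemma: If there exists an assignment $A$ s.t.
$\Pr_{x \in X}\left[\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] > \tfrac{1}{2} + \delta\right] \ge \gamma$,
then there exists a graph $G = (V, U, E)$ and an assignment to $LC[W]$ and $LC[Z]$ s.t. $p \ge 2^{-|E|}$.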

Our PCP (cont.)
Proof: Assume there exists an assignment $A$ s.t.
$\Pr_{x \in X}\left[\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] > \tfrac{1}{2} + \delta\right] \ge \gamma$.

We assign the bits of $LC[W]$ the values assigned to them by $A$, and the bits of $LC[Z]$ are assigned random values.

Our PCP (cont.)
We now have to construct a graph $G$ that would satisfy the lemma.

We call an $x$ good if $\Pr_{\psi_{x \to y}}\left[\psi_{x \to y}(A(x)) = A(y)\right] > \tfrac{1}{2} + \delta$.

Let $x$ be good and let $V_0$, $U_0$ be the corresponding vertices.

Page 41: 1 Joint work with Shmuel Safra. 2 Motivation 3 Motivation.

41

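Our PCP (cont.)
The vertex sets $V_0$ and $U_0$ are refined as follows:
$V_1$: The set of vertices in $V_0$ for which at least $\tfrac{1}{2} + \tfrac{\delta}{2}$ of their edges are consistent with $x$. $|V_1| \ge \tfrac{\delta}{2} k$.
$U_1$: The set of vertices in $U_0$ that are consistent with $x$.
$U_2 = U_0 \setminus U_1$.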

Page 42: 1 Joint work with Shmuel Safra. 2 Motivation 3 Motivation.

42

Our PCP (cont.)
Proposition: There exists $i \in \{1, 2\}$ s.t. $|U_i| \ge \tfrac{\delta}{4} k$, and at least $\tfrac{1}{2} + \tfrac{\delta}{4}$ of the edges between $U_i$ and $V_1$ are consistent with $x$.

Page 43: 1 Joint work with Shmuel Safra. 2 Motivation 3 Motivation.

43

Our PCP (cont.)Our PCP (cont.)The set of vertices in V0 for which at least½ + /2 of their edges are consistent with x.

|V1| /2 k

The set of vertices in U0 that are consistent with x.

U0 \ U1

V1 U1

V’

U’

Page 44: 1 Joint work with Shmuel Safra. 2 Motivation 3 Motivation.

44

Our PCP (cont.)Our PCP (cont.)V1 U1

V1

U1

U2

The set of vertices in V0 for which at least½ + /2 of their edges are consistent with x.

|V1| /2 k

The set of vertices in U0 that are consistent with x.

U0 \ U1

Page 45: 1 Joint work with Shmuel Safra. 2 Motivation 3 Motivation.

45

Our PCP (cont.)
Let $U' \subseteq U_i$, $V' \subseteq V_1$, s.t. $|U'| = |V'| = \tfrac{\delta}{4} k$, and at least $\tfrac{1}{2} + \tfrac{\delta}{4}$ of the edges between $U'$ and $V'$ are consistent with $x$.

There are fewer than $2^{2k}$ possibilities to choose $U'$ and $V'$ $\Rightarrow$ there is a subset $X'$ containing at least a $2^{-2k}$ fraction of the good x's (and thus of size at least $2^{-2k} |X|$) with the same choice of $U'$ and $V'$.

Our PCP (cont.)
Let $X''$ be the subset of variables $x \in X'$ that are consistent with the random assignment to $LC[Z]$.

The probability that $A(x)$ is consistent with a random assignment to $LC[Z]$ is $2^{-k}$ $\Rightarrow$ the expected size of $X''$ is $2^{-k} |X'|$.

Therefore, since $|X'| \ge 2^{-2k} |X|$, there exists an assignment to $LC[Z]$ s.t. $|X''| \ge 2^{-3k} |X|$.

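Our PCP (cont.)
Let $\mathcal{G}$ be the multi-set of all graphs $G = (V', U', E)$ corresponding to the variables $x \in X''$, where $E$ is the set of all edges between $U'$ and $V'$ that are consistent with $x$.

$|\mathcal{G}| \ge 2^{-3k} |X|$.

$\forall G \in \mathcal{G}$, $|E| \ge \left(\tfrac{1}{2} + \tfrac{\delta}{4}\right) \left(\tfrac{\delta}{4} k\right)^2$.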

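Our PCP (cont.)
Lemma: Let $\mathcal{G}$ be a multi-set of bipartite graphs on $[k'] \times [k']$, s.t. each graph in $\mathcal{G}$ has at least $\left(\tfrac{1}{2} + \varepsilon'\right) k'^2$ edges. Then, $\forall t \le \tfrac{\varepsilon'}{2} k'^2$, $\exists G = ([k'], [k'], E)$, s.t. $|E| \ge t$ and
$\Pr_{G' = ([k'], [k'], E') \in \mathcal{G}}\left[E \subseteq E'\right] \ge \left(\tfrac{1}{2} \varepsilon'\right)^{t}$.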
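One natural way to look for such an edge set is greedy: repeatedly add the edge contained in the largest number of graphs that still contain all edges chosen so far. The sketch below is an illustration under assumptions, not necessarily the argument used in the talk: it reads the bound in the lemma as $(\varepsilon'/2)^t$, and the parameters, the random multiset and the function name `greedy_common_edges` are all made up.

```python
import random
from itertools import product

def greedy_common_edges(graphs, k_prime, t):
    """Greedily pick t edges contained in as many graphs as possible.

    graphs: list of edge sets over range(k_prime) x range(k_prime).
    Returns the chosen edges and the fraction of graphs containing all of them.
    """
    all_edges = list(product(range(k_prime), repeat=2))
    surviving = list(graphs)          # graphs containing every chosen edge
    chosen = []
    for _ in range(t):
        candidates = [e for e in all_edges if e not in chosen]
        best = max(candidates, key=lambda e: sum(e in g for g in surviving))
        chosen.append(best)
        surviving = [g for g in surviving if best in g]
    return chosen, len(surviving) / len(graphs)

rng = random.Random(1)
k_prime, eps_prime = 4, 0.25
all_edges = list(product(range(k_prime), repeat=2))
min_edges = int((0.5 + eps_prime) * k_prime ** 2)   # 12 of the 16 possible edges
graphs = [frozenset(rng.sample(all_edges, min_edges)) for _ in range(30)]
t = int(eps_prime / 2 * k_prime ** 2)               # largest t the lemma allows here
chosen, fraction = greedy_common_edges(graphs, k_prime, t)
print(chosen, fraction)
```

While fewer than $\tfrac{\varepsilon'}{2} k'^2$ edges have been chosen, every surviving graph still has more than half of the remaining edges, so an averaging argument shows that some edge lies in more than half of the surviving graphs. The greedy choice therefore comfortably meets the assumed bound.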

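Our PCP (cont.)
By the above lemma, for $k' = \tfrac{\delta}{4} k$ and $\varepsilon' = \tfrac{\delta}{2}$, $\exists G = \left(\left[\tfrac{\delta}{4} k\right], \left[\tfrac{\delta}{4} k\right], E\right)$, s.t. $|E| = t = c' \left(\tfrac{\delta}{4} k\right)^2$, where $c' < \tfrac{\delta}{4}$, and all the edges of this graph are consistent in at least a $2^{-3k} \left(\tfrac{\delta}{4}\right)^{t}$ fraction of the variables in $X$.

Considering this graph over the vertex sets $U$ and $V$ gives the desired result.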