Batch Codes and Their Applications Y.Ishai, E.Kushilevitz, R.Ostrovsky, A.Sahai Preliminary version in STOC 2004

Dec 22, 2015

Page 1

Batch Codes and Their Applications

Y.Ishai, E.Kushilevitz, R.Ostrovsky, A.Sahai

Preliminary version in STOC 2004

Page 2

Talk Outline

• Batch codes

• Amortized PIR
  – via hashing
  – via batch codes

• Constructing batch codes

• Concluding remarks

Page 3

A Load-Balancing Scenario

[Figure: data string x]

Page 4

What’s wrong with a random partition?

• Good on average for “oblivious” queries.
• However:
  – Can’t balance adversarial queries
  – Can’t balance few random queries
  – Can’t relieve “hot spots” in multi-user setting

Page 5

Example
• 3 devices, 50% storage overhead.
• By how much can the maximal load be reduced?
  – Replicating bits is no good: there is a device such that 1/6 of the bits can be found only at this device.
  – Factor 2 load reduction is possible: split x into halves L and R, and store L, R, and L⊕R on the three devices.

[Figure: x = (L, R); devices hold L, R, L⊕R]
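The three-device example (two halves L and R plus their XOR) can be sketched in a few lines. This is only an illustration under assumed names (`encode`, `retrieve_pair`) and an assumed even-length bit list; it shows that any two distinct bits can be fetched with at most one read per device.

```python
def encode(x):
    """Split x into halves L and R; the devices hold L, R, and L XOR R."""
    half = len(x) // 2
    L, R = x[:half], x[half:]
    return [L, R, [a ^ b for a, b in zip(L, R)]]

def retrieve_pair(devices, i, j):
    """Return (x[i], x[j], max reads on any device) for distinct i, j."""
    half = len(devices[0])
    probes = [0, 0, 0]                      # reads issued to each device

    def read(dev, pos):
        probes[dev] += 1
        return devices[dev][pos]

    side_i, side_j = i // half, j // half   # 0 = left half, 1 = right half
    pi, pj = i % half, j % half
    if side_i != side_j:
        xi, xj = read(side_i, pi), read(side_j, pj)
    else:
        # Both bits sit on the same device: read x[i] directly and rebuild
        # x[j] from the other half and the XOR device.
        xi = read(side_i, pi)
        xj = read(1 - side_i, pj) ^ read(2, pj)
    return xi, xj, max(probes)

x = [1, 0, 1, 1, 0, 0, 1, 0]
devices = encode(x)
```

With plain replication of half of the bits, two requests that hit the same device would force two reads there; here the XOR device absorbs the second request.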

Page 6

Batch Codes
• (n,N,m,k) batch code: [Figure: x of length n is encoded into m buckets y1, y2, …, ym of total length N; queries i1,…,ik]
• Notes
  – Rate = n / N
  – By default, insist on minimal load per bucket (one probe), hence m ≥ k.
  – Load measured by # of probes.
• Generalizations
  – Allow t probes per bucket
  – Larger alphabet

Page 7

Multiset Batch Codes
• (n,N,m,k) multiset batch code: [Figure: x of length n is encoded into m buckets y1, y2, …, ym of total length N; queries form a multiset < i1,…,ik >]
• Motivation
  – Models multiple users (with off-line coordination)
  – Useful as a building block for standard batch codes
• Nontrivial even for multisets of the form < i,i,…,i >
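To see why repeated requests are the interesting case, here is a small sketch that serves the multiset < i,i > from the three-bucket code holding L, R, and their XOR. The function name `retrieve_twice` and the sample data are assumptions for illustration; the two copies of the requested bit come from disjoint sets of buckets, one read per bucket.

```python
def retrieve_twice(L, R, LR, i):
    """Serve the multiset <i, i>: two copies of x[i] from disjoint buckets."""
    half = len(L)
    pos = i % half
    if i < half:
        first = L[pos]               # bucket L only
        second = R[pos] ^ LR[pos]    # buckets R and L XOR R
    else:
        first = R[pos]               # bucket R only
        second = L[pos] ^ LR[pos]    # buckets L and L XOR R
    return first, second

x = [0, 1, 1, 0, 1, 1]
L, R = x[:3], x[3:]
LR = [a ^ b for a, b in zip(L, R)]
```

Two users asking for the same bit can therefore be served simultaneously, which plain partitioning of x cannot do.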

Page 8

Examples
• Trivial codes
  – Replication: N=kn, m=k (multiset)
    • Optimal m, bad rate.
  – One bit per bucket: N=m=n
    • Optimal rate, bad m.
• (L,R,L⊕R) code: rate=2/3, m=3, k=2 (multiset).
• Goal: simultaneously obtain
  – High rate (close to 1)
  – Small m (close to k)

Page 9

Private Information Retrieval (PIR)

• Goal: allow user to query database while hiding the identity of the data-items she is after.
• Motivation: patent databases, web searches, ...
• Paradox(?): imagine buying in a store without the seller knowing what you buy.
  Note: Encrypting requests is useful against third parties; not against server holding the data.

Page 10

Modeling

• Database: n-bit string x
• User: wishes to
  – retrieve x_i and
  – keep i private

Page 11

[Figure: Server holds x ∈ {0,1}^n; User holds i ∈ {1,…,n} and obtains x_i; what passes between them is marked “???”]

Page 12

Some “Solutions”

1. User downloads entire database.
   Drawback: n communication bits (vs. log n + 1 w/o privacy).
   Main research goal: minimize communication complexity.

2. User masks i with additional random indices.
   Drawback: gives a lot of information about i.

3. Enable anonymous access to database.
   Note: addresses the different security concern of hiding user’s identity, not the fact that x_i is retrieved.

Fact: PIR as described so far requires Ω(n) communication bits.

Page 13

Two Approaches

• Computational PIR [KO97, CMS99,...]
  – Computational privacy
  – Based on cryptographic assumptions
• Information-Theoretic PIR [CGKS95,Amb97,...]
  – Replicate database among s servers
  – Unconditional privacy against t servers
  – Default: t=1
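As a toy illustration of the replicated-database idea with s=2 and t=1, the sketch below uses the simple XOR-based two-server scheme: each server sees a uniformly random selection vector, and the two vectors differ only at position i. It sends n bits to each server, so it is not one of the low-communication protocols cited on these slides; the function names are assumptions.

```python
import secrets

def query(n, i):
    """User: a uniformly random 0/1 vector, and the same vector with bit i flipped."""
    s1 = [secrets.randbelow(2) for _ in range(n)]
    s2 = s1.copy()
    s2[i] ^= 1
    return s1, s2

def answer(x, s):
    """Server: XOR of the database bits selected by the query vector."""
    bit = 0
    for xb, sb in zip(x, s):
        bit ^= xb & sb
    return bit

def reconstruct(a1, a2):
    """User: the two answers differ exactly by x[i]."""
    return a1 ^ a2

x = [1, 0, 0, 1, 1, 0, 1]
q1, q2 = query(len(x), 3)
x3 = reconstruct(answer(x, q1), answer(x, q2))
```

Each query vector on its own is uniformly distributed regardless of i, which is what privacy against a single server means here. Note that each server still touches the whole database to answer.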

Page 14

Communication Upper Bounds

• Computational PIR
  – O(n^ε), polylog(n), O(logn), O(+logn) [KO97,CMS99,…]
• Information-theoretic PIR
  – 2 servers, O(n^(1/3)) [CGKS95]
  – s servers, O(n^(1/c(s))) where c(s)=Ω(s log s / log log s) [CGKS95,Amb97,BIKR02]
  – O(log n / log log n) servers, polylog(n)

Page 15

Time Complexity of PIR

• Given low-communication protocols, efficiency bottleneck shifts to servers’ time complexity.
  – Protocols require (at least) linear time per query.
  – This is an inherent limitation!
• Possible workarounds:
  – Preprocessing
  – Amortize cost over multiple queries

Page 16

Previous Results [BIM00]

• PIR with preprocessing
  – s-server protocols with O(n^ε) communication and O(n^(1/s+ε)) work per query, requiring poly(n) storage.
  – Disadvantages:
    • Only work for multi-server PIR
    • Storage typically huge
• Amortized PIR
  – Slight savings possible using fast matrix multiplication
  – Require a large batch of queries and high communication
  – Apply also to queries originating from different users.
• This work:
  – Assume a batch of k queries originate from a single user.
  – Allow preprocessing (not always needed).
  – Nearly optimal amortization

Page 17

Model

[Figure: Server/s hold x ∈ {0,1}^n, preprocessed into y ∈ {0,1}^N; User holds i1, i2, …, ik ∈ {1,…,n} and obtains x_i1, x_i2, …, x_ik; what passes between them is marked “???”]

Page 18

Amortized PIR via Hashing

• Let P be a PIR protocol.
• Hashing-based amortized PIR:
  – User picks h ∈_R H, defining a random partition of x into k buckets of size n/k, and sends h to Server/s.
    • Except for 2- failure probability, at most t=O(log k) queries fall in each bucket.
  – P is applied t times for each bucket.
• Complexity:
  – Time: kt·T(n/k) ≤ t·T(n)
  – Communication: kt·C(n/k)
  – Asymptotically optimal up to “polylog factors”
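The load behaviour of the hashing approach can be explored with a small simulation. This sketch assumes an independent uniform bucket choice per position instead of a specific hash family H, and the names `bucket_loads` and `max_load` are illustrative only. It reports how many of the k queries land in the fullest bucket, which is the quantity t must cover.

```python
import random
from collections import Counter

def bucket_loads(n, k, queries, seed=0):
    """Assign each of the n positions to one of k buckets at random and
    count how many queried positions fall into each bucket."""
    rng = random.Random(seed)
    h = [rng.randrange(k) for _ in range(n)]   # stand-in for the hash function
    return Counter(h[i] for i in queries)

def max_load(n, k, queries, seed=0):
    """Largest number of queries that share a bucket."""
    return max(bucket_loads(n, k, queries, seed).values())

n, k = 10_000, 16
queries = random.Random(1).sample(range(n), k)   # k distinct query indices
loads = bucket_loads(n, k, queries)
worst = max_load(n, k, queries)
```

Running this for many seeds shows the maximum load hovering a little above 1 and occasionally spiking, which is why the protocol invokes P several times per bucket.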

Page 19

So what’s wrong?

• Not much…
• Still:
  – Not perfect
    • introduces either error or privacy loss
  – Useless for small k
    • t=O(log k) overhead dominates
  – Cannot hash “once and for all”
    • for every h there is a bad k-tuple of queries
• Sounds familiar?

Page 20

Amortized PIR via Batch Codes

• Idea: use batch-encoding instead of hashing.
• Protocol:
  – Preprocessing: Server/s encode x as y=(y1,y2,…,ym).
  – Based on i1,…,ik, User computes the index of the bit it needs from each bucket.
  – P is applied once for each bucket.
• Complexity (N_j denotes the length of bucket y_j)
  – Time: Σ_{1≤j≤m} T(N_j) ≤ T(N)
  – Communication: Σ_{1≤j≤m} C(N_j) ≤ m·C(n)
• Trivial batch codes imply trivial protocols.
• (L,R,L⊕R) code: 2 queries, 1.5× time, 3× communication
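The figures quoted for the three-bucket code can be reproduced with a short calculation. The sketch assumes that the server time T of the underlying PIR protocol is linear in the database length, and the helper name `relative_time` is hypothetical.

```python
from fractions import Fraction

def relative_time(bucket_sizes, n, T=lambda size: size):
    """Server time of one PIR run per bucket, relative to one PIR run on x.

    T is an assumed time function of the database length (linear by default).
    """
    return Fraction(sum(T(size) for size in bucket_sizes), T(n))

n = 1000
buckets = [n // 2, n // 2, n // 2]   # L, R, and their XOR, each of length n/2

time_factor = relative_time(buckets, n)   # total time vs. a single query on x
invocations = len(buckets)                # PIR runs, hence at most 3x communication
time_per_query = time_factor / 2          # the batch answers two queries
```

Under this linear-time assumption the batch of two queries costs 3/2 of a single query, that is 3/4 of a query's time per retrieved bit, at the price of three protocol invocations.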

Page 21

Constructing Batch Codes

Page 22

Overview
• Recall notion

[Figure: x of length n is encoded into m buckets y1, y2, …, ym of total length N; queries i1,…,ik]

• Main qualitative questions:
  1. Can we get arbitrarily high constant rate (n/N=1-ε) while keeping m feasible in terms of k (say m=poly(k))?
  2. Can we insist on nearly optimal m (say m=Õ(k)) and still get close to a constant rate?
• Several incomparable constructions
• Answer both questions affirmatively.

Page 23

Batch Codes from Unbalanced Expanders

• By Hall’s theorem, the graph represents an (n,N=|E|,m,k) batch code iff every set S containing at most k vertices on the left has at least |S| neighbors on the right.
• Fully captures replication-based batch codes.

[Figure: bipartite graph with n vertices on the left and m vertices on the right]
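The Hall-type condition above can be checked by brute force on small graphs. The sketch below is an illustration with assumed names: `neighbors[i]` is the set of buckets (right vertices) that hold a copy of data bit i, and the check enumerates every left set of size at most k, so it is only practical for tiny examples.

```python
from itertools import combinations

def is_batch_code_graph(neighbors, k):
    """True iff every set S of at most k left vertices has at least |S|
    neighbors on the right."""
    n = len(neighbors)
    for size in range(1, k + 1):
        for S in combinations(range(n), size):
            joint = set().union(*(neighbors[i] for i in S))
            if len(joint) < size:
                return False
    return True

# Three bits, three buckets, each bit stored in two buckets (N = |E| = 6).
good = [{0, 1}, {1, 2}, {0, 2}]
# Bits 0 and 1 live only in bucket 0, so the pair {0, 1} cannot be served.
bad = [{0}, {0}, {1}]
```

When the condition holds, Hall's theorem guarantees a matching that sends each requested bit to a distinct bucket holding it, which is exactly one probe per bucket.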

Page 24

Parameters

• Non-explicit: N=dn, m=O(k·(nk)^(1/(d-1)))
  – d=3: rate=1/3, m=O(k^(3/2) n^(1/2)).
  – d=log n: rate=1/log n, m=O(k) (settles Q2)
• Explicit (using [TUZ01],[CRVW02])
  – Nontrivial, but quite far from optimal
• Limitations:
  – Rate < ½ (unless m=Ω(n))
  – For const. rate, m must also depend on n.
  – Cannot handle multisets.

Page 25

The Subcube Code

• Generalize (L,R,L⊕R) example in two ways
  – Trade better rate for larger m
    • (Y1,Y2,…,Ys,Y1⊕…⊕Ys)
    • still k=2
  – Handle larger k via composition
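The first generalization (s blocks plus one parity bucket, still k=2) can be sketched as follows. The names `encode_blocks` and `read_two` are assumptions, and requests are given as (block, position) pairs; the point is that two requests, even to the same block, never need more than one read per bucket.

```python
from functools import reduce
from operator import xor

def encode_blocks(blocks):
    """Buckets Y1..Ys followed by their position-wise XOR."""
    parity = [reduce(xor, column) for column in zip(*blocks)]
    return list(blocks) + [parity]

def read_two(buckets, a, b):
    """Serve requests a=(block, pos) and b=(block, pos), one read per bucket."""
    (block_a, pos_a), (block_b, pos_b) = a, b
    first = buckets[block_a][pos_a]
    if block_b != block_a:
        return first, buckets[block_b][pos_b]
    # Same block: rebuild the second bit from all other buckets at pos_b.
    others = [buckets[j][pos_b] for j in range(len(buckets)) if j != block_a]
    return first, reduce(xor, others)

blocks = [[1, 0], [0, 0], [1, 1]]      # s = 3 blocks of length 2
buckets = encode_blocks(blocks)        # m = s + 1 = 4 buckets
```

The rate is s/(s+1), so larger s trades more buckets for less storage overhead.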

Page 26

Geometric Interpretation

[Figure: x arranged as a 2×2 grid with rows (A, B) and (C, D); buckets A, B, A⊕B, C, D, C⊕D, A⊕C, B⊕D, A⊕B⊕C⊕D]
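The nine buckets of this picture (the four quarters A, B, C, D, the XOR of each row and each column, and the XOR of all four) can be checked exhaustively for k=4. The sketch below is a brute-force verification under assumed names, not the decoding procedure from the talk: it treats each quarter as a single symbol and searches for pairwise disjoint bucket sets that rebuild every requested quarter, for every multiset of four requests.

```python
from itertools import combinations, combinations_with_replacement

# Quarters A, B, C, D as bit masks; a bucket is the XOR of some quarters.
A, B, C, D = 1, 2, 4, 8
BUCKETS = [A, B, A ^ B, C, D, C ^ D, A ^ C, B ^ D, A ^ B ^ C ^ D]

def recovery_sets(target):
    """All sets of bucket indices whose XOR equals the target quarter."""
    found = []
    for size in range(1, len(BUCKETS) + 1):
        for subset in combinations(range(len(BUCKETS)), size):
            acc = 0
            for j in subset:
                acc ^= BUCKETS[j]
            if acc == target:
                found.append(frozenset(subset))
    return found

RECOVERY = {quarter: recovery_sets(quarter) for quarter in (A, B, C, D)}

def decodable(requests, used=frozenset()):
    """True if all requests can be served from pairwise disjoint bucket sets."""
    if not requests:
        return True
    first, rest = requests[0], requests[1:]
    return any(decodable(rest, used | s)
               for s in RECOVERY[first] if not (s & used))

ALL_OK = all(decodable(ms)
             for ms in combinations_with_replacement((A, B, C, D), 4))
```

For example, four copies of A can be obtained from {A}, {B, A⊕B}, {C, A⊕C}, and {D, C⊕D, B⊕D, A⊕B⊕C⊕D}.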

Page 27

Parameters

• N ≈ k^(log(1+1/s))·n, m ≈ k^(log(s+1))
  – s=O(log k) gives an arbitrary constant rate with m=k^(O(log log k)). “Almost” resolves Q1.
• Advantages:
  – Arbitrary constant rate
  – Handles multisets
  – Very easy decoding
• Asymptotically dominated by subsequent construction.

Page 28

The Gadget Lemma

• From now on, we can choose a “convenient” n and get same rate and m(k) for arbitrarily larger n.

Primitive multiset batch code

Page 29

Batch Codes vs. Smooth Codes
• Def. A code C: Σ^n → Σ^m is q-smooth if there exists a (randomized) decoder D such that
  – D(i) decodes x_i by probing q symbols of C(x).
  – Each symbol of C(x) is probed w/prob ≤ q/m.
• Smooth codes are closely related to locally decodable codes [KT00].
• Two-way relation with batch codes:
  – q-smooth code ⇒ primitive multiset batch code with k=m/q^2 (ideally would like k=m/q).
  – Primitive multiset batch code ⇒ (expected) q-smooth for q=m/k
• Batch codes and smooth codes are very different objects:
  – Relation breaks when relaxing “multiset” or “primitive”
  – Gap between m/q and m/q^2 is very significant for high rate case
    • Best known smooth codes with rate>1/2 require q>n^(1/2)
    • These codes are provably useless as batch codes.

Page 30

Batch Codes from RM Codes

• (s,d) Reed-Muller code over F
  – Message viewed as s-variate polynomial p over F of total degree (at most) d.
  – Encoded by the sequence of its evaluations on all points in F^s
  – Case |F|>d is useful due to a “smooth decoding” feature: p(z) can be extrapolated from the values of p on any d+1 points on a line passing through z.
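The “smooth decoding” feature can be illustrated with Lagrange interpolation along a line. The sketch assumes a small prime field (integers mod 13), an arbitrary example polynomial of total degree d=2 in s=2 variables, and hypothetical helper names; it needs Python 3.8 or later for the modular inverse via `pow(den, -1, P)`.

```python
P = 13  # assumed small prime; F is the integers mod P, and |F| > d = 2

def p(x, y):
    """Example bivariate polynomial of total degree 2 (chosen arbitrarily)."""
    return (3 * x * x + 5 * x * y + 7 * y + 2) % P

def interpolate_at_zero(ts, values):
    """Lagrange interpolation of a univariate polynomial at t = 0, mod P."""
    total = 0
    for a, (ta, va) in enumerate(zip(ts, values)):
        num, den = 1, 1
        for b, tb in enumerate(ts):
            if b != a:
                num = num * (-tb) % P
                den = den * (ta - tb) % P
        total = (total + va * num * pow(den, -1, P)) % P
    return total

def decode_on_line(z, direction, ts):
    """Recover p(z) from p at the points z + t*direction, for nonzero t in ts."""
    values = [p((z[0] + t * direction[0]) % P, (z[1] + t * direction[1]) % P)
              for t in ts]
    return interpolate_at_zero(ts, values)

z = (4, 9)
recovered = decode_on_line(z, direction=(1, 2), ts=[1, 2, 3])   # d + 1 = 3 points
```

Restricted to a line, p becomes a univariate polynomial of degree at most d in t, so d+1 evaluations at nonzero t determine its value at t=0, which is p(z). Different directions give different point sets, which is what lets several requests avoid each other.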

Page 31

s=2, d ≈ (2n)^(1/2)

[Figure: data items x1, x2, …, xn]

• Two approaches for handling conflicts:
  1. Replicate each point t times
  2. Use redundancy to “delete” intersections
     • Slightly increases field size, but still allows constant rate.

Page 32

Parameters

• Rate = (1/s! − ε), m=k^(1+1/(s-1)+o(1))
  – Multiset codes with constant rate (< ½)
• Rate = Ω(1/k^ε), m=Õ(k) (resolves Q2 for multiset codes as well)
• Main remaining challenge: resolve Q1

Page 33

The Subset Code

• Choose s,d such that n ≤ (s choose d)
• Each data bit i∈[n] is associated with a set T ∈ ([s] choose d)
• Each bucket j∈[m] is associated with a set S ∈ ([s] choose ≤d)
• Primitive code: y_S = ⊕_{T⊇S} x_T

[Figure: x indexed by the subsets of [s] of size d; y indexed by the subsets of [s] of size at most d]

Page 34

Batch Decoding the Subset Code

• Lemma: For each T’⊆T, x_T can be decoded from all y_S such that S∩T=T’.
  – Let L_{T,T’} denote the set of such S.
  – Note: {L_{T,T’} : T’⊆T} defines a partition of ([s] choose ≤d).

[Figure: x_T and y_T’, illustrated by the pattern 0011110000**0110****]
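The decoding lemma can be checked numerically on a small instance. The sketch below makes two assumptions that the extracted slide text does not show clearly: buckets are indexed by all subsets S of [s] of size at most d, and y_S is the XOR of x_T over all d-subsets T that contain S. The parameter values and function names are illustrative.

```python
from itertools import combinations
import random

s, d = 5, 2
GROUND = range(s)
DATA_SETS = [frozenset(c) for c in combinations(GROUND, d)]     # one per data bit
BUCKET_SETS = [frozenset(c) for r in range(d + 1)
               for c in combinations(GROUND, r)]                # sizes 0..d

def encode(x):
    """y_S = XOR of x_T over all data sets T that contain S."""
    return {S: sum(x[T] for T in DATA_SETS if S <= T) % 2 for S in BUCKET_SETS}

def decode(y, T, T_prime):
    """XOR of y_S over all bucket sets S with S ∩ T = T'."""
    return sum(y[S] for S in BUCKET_SETS if S & T == T_prime) % 2

rng = random.Random(1)
x = {T: rng.randrange(2) for T in DATA_SETS}
y = encode(x)
T = frozenset({0, 3})
copies = [decode(y, T, frozenset(tp))
          for r in range(d + 1) for tp in combinations(T, r)]
```

Every choice of T’ ⊆ T reads a different, disjoint group of buckets, so a single data bit already has 2^d independent ways of being recovered. In the XOR over one group, x_T appears once while every other data bit appears an even number of times.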

Page 35

Batch Decoding the Subset Code (contd.)

• Goal: Given T1,…,Tk, find subsets T’1,…,T’k such that the sets L_{Ti,T’i} are pairwise disjoint.
  – Easy if all Ti are distinct or if all Ti are the same.
• Attempt 1: T’i is a random subset of Ti
  – Problem: if Ti,Tj are disjoint, L_{Ti,T’i} and L_{Tj,T’j} intersect w.h.p.
• Attempt 2: greedily assign to Ti the largest T’i such that L_{Ti,T’i} does not intersect any previous L_{Tj,T’j}
  – Problem: adjacent sets may “block” each other.
• Solution: pick random T’i with bias towards large sets.

[Figure: x3, x1, x2]

Page 36

Parameters

• Allows arbitrary constant rate with m=poly(k) (settles Q1)
• Both the subcube code and the subset code can be viewed as sub-codes of the binary RM code.
  – The full binary RM code cannot be batch decoded when the rate>1/2.

Page 37

Concluding Remarks: Batch Codes

• A common relaxation of very different combinatorial objects
  – Expanders
  – Locally-decodable codes
• Problem makes sense even for small values of m,k.
  – For multiset codes with m=3,k=2, rate 2/3 is optimal.
  – Open for m ≥ k+2.
• Useful building block for “distributed data structures”.

Page 38

Concluding Remarks: PIR

• Single-user amortization is useful in practice only if PIR is significantly more efficient than download.
  – Certainly true for multi-server PIR
  – Most likely true also for single-server PIR
• Killer app for lattice-based cryptosystems?

[Table: Single user / Multiple users versus Adaptive / Non-adaptive; three entries marked “?”]