The Communication and Streaming Complexity of Computing the Longest Common and Increasing Subsequences Xiaoming Sun Tsinghua University David Woodruff MIT

Mar 27, 2015


Audrey King

The Communication and Streaming Complexity of

Computing the Longest Common and Increasing Subsequences

Xiaoming Sun Tsinghua University

David Woodruff MIT


The Problem

• Stream of elements $a_1, \ldots, a_n \in \Sigma$

• Algorithm given one pass over stream

• Problem: Compute the longest increasing subsequence (LIS) – in this case answer is (3,7)

0113734


Previous Work

• Let k be the length of the LIS of the stream

• There exists an algorithm which computes the LIS with $O(k^2 \log |\Sigma|)$ space [LNVZ05]

• Trivial $\Omega(k)$ lower bound

• Our first result: Improve both bounds to a tight $\Theta(k^2 \log(|\Sigma|/k))$


Our Lower Bound

Reduction from the indexing function:

Alice holds $x \in \{0,1\}^n$; Bob holds $i \in [n] = \{1, 2, \ldots, n\}$

Question: what is $x_i$?

Randomized 1-way communication is $\Omega(n)$


Alice holds $x \in \{0,1\}^n$ and constructs a stream $A$; Bob holds $i \in [n] = \{1, 2, \ldots, n\}$ and constructs a stream $B$

Question: what is $x_i$?

1. From $\mathrm{LIS}(A, B)$, Bob can get $x_i$

2. $|\mathrm{LIS}(A, B)| = k$, where $k$ is an input parameter


Alice

Alice uses $x$ to create $k-1$ increasing sequences $A_1, \ldots, A_{k-1}$

For each $j$, $A_j$ has length $j$. Each bit of $x$ is encoded in some sequence $A_j$

Every element in $A_{k-1}$ is larger than every element in $A_{k-2}$, every element in $A_{k-2}$ is larger than every element in $A_{k-3}$, etc.

Set $A = A_{k-1}, \ldots, A_2, A_1$

[Figure: the stream $A$ built from $x \in \{0,1\}^n$, plotted as value against position in stream, with blocks $A_1, A_2, \ldots, A_{k-1}$ labeled]


Bob

$i \in [n]$

Bob uses $i$ to determine $j$, the index of the sequence $A_j$ encoding $x_i$

Bob creates an increasing sequence $B$ of length $k-j$

Every element in $B$ is greater than every element of $A_r$ if $r \le j$, and less than every element of $A_r$ if $r > j$

[Figure: value against position in stream, showing blocks $A_{j-1}$, $A_j$, $A_{j+1}$ and Bob's sequence $B$]


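Alice holds $x \in \{0,1\}^n$; Bob holds $i \in [n]$

Question: what is $x_i$?

Combined stream: $A_{k-1}, \ldots, A_2, A_1, B$

[Figure: value against position in stream, showing blocks $A_{j-1}$, $A_j$, $A_{j+1}$ followed by $B$]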

$\mathrm{LIS}(A, B) = A_j, B$, and $|\mathrm{LIS}(A, B)| = k$

But $x_i$ is encoded in $A_j$, so Bob recovers $x_i$
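The construction on the last few slides can be checked on small inputs. The following is a minimal Python sketch, not the authors' code. It makes several assumptions that are not on the slides: each block of the alphabet has a hypothetical width `width`, Alice draws $A_j$ from the lower half of block $j$, and Bob takes $B$ from the upper half of block $j$, so that $B$ lies above $A_j$ and below $A_{j+1}$. How the bits of $x$ map to the sequences is not modeled; the sketch only checks that the LIS of the combined stream is $A_j$ followed by $B$.

```python
import random

def lis(seq):
    """Return one longest strictly increasing subsequence (simple O(n^2) DP)."""
    if not seq:
        return []
    best = [1] * len(seq)    # best[i]: length of the best subsequence ending at i
    prev = [-1] * len(seq)   # prev[i]: predecessor index on that subsequence
    for i in range(len(seq)):
        for j in range(i):
            if seq[j] < seq[i] and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    i = max(range(len(seq)), key=best.__getitem__)
    out = []
    while i != -1:
        out.append(seq[i])
        i = prev[i]
    return out[::-1]

def alice_blocks(k, width, rng):
    """Assumed layout: A_j is a random increasing sequence of length j
    drawn from the lower half of block j = [j*width, (j+1)*width)."""
    return {j: sorted(rng.sample(range(j * width, j * width + width // 2), j))
            for j in range(1, k)}

def alice_stream(blocks, k):
    """A = A_{k-1}, ..., A_2, A_1 (highest block first)."""
    return [v for j in range(k - 1, 0, -1) for v in blocks[j]]

def bob_stream(j, k, width):
    """Assumed layout: B has length k-j and sits in the upper half of block j."""
    start = j * width + width // 2
    return list(range(start, start + (k - j)))

def check(k, seed=0):
    """Check LIS(A, B) = A_j, B for every j, under the layout assumed above."""
    width = 2 * k            # wide enough for both halves of every block
    blocks = alice_blocks(k, width, random.Random(seed))
    a = alice_stream(blocks, k)
    return all(lis(a + bob_stream(j, k, width)) == blocks[j] + bob_stream(j, k, width)
               for j in range(1, k))
```

Calling `check(6)` runs the check for every $j$ from 1 to 5. The LIS is unique here because an increasing subsequence of $A$ stays inside a single $A_r$, and it can only be extended by $B$ when $r \le j$.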


• Thus, any streaming algorithm must use $\Omega(n)$ space.

• But what is $n$? We need to construct $k-1$ increasing sequences that are different for different $x \in \{0,1\}^n$

• Assume $|\Sigma|$ is large. Divide $\Sigma$ into $k-1$ blocks of size $|\Sigma|/(k-1)$

• Let $A_j$ be a random increasing sequence of length $j$ in block $j$.

• The space to represent $A_j$ is $\Omega(k \log(|\Sigma|/k))$ for $j > k/2$

• Set $n = \Omega(k^2 \log(|\Sigma|/k))$.
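The counting step can be illustrated numerically. The sketch below is an illustration under assumptions, not part of the talk: it takes the number of bits carried by $A_j$ to be $\log_2 \binom{b}{j}$, where $b = |\Sigma|/(k-1)$ is the block size, sums this over $j = 1, \ldots, k-1$, and compares the total with $k^2 \log_2(|\Sigma|/k)$. The function names are hypothetical.

```python
from math import comb, log2

def encodable_bits(sigma_size, k):
    """Sum over j of log2(number of increasing length-j sequences in one block)."""
    block = sigma_size // (k - 1)
    return sum(log2(comb(block, j)) for j in range(1, k))

def target_bits(sigma_size, k):
    """The quantity k^2 * log2(|Sigma| / k) from the slide."""
    return k * k * log2(sigma_size / k)

def ratio(sigma_size, k):
    """Fraction of the target that the block construction can encode."""
    return encodable_bits(sigma_size, k) / target_bits(sigma_size, k)
```

For a fixed large alphabet, `ratio(sigma_size, k)` should stay bounded away from zero as $k$ grows, which is what the $\Omega(k^2 \log(|\Sigma|/k))$ claim asks for.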


Our Upper Bound

• When processing the stream, keep lists A[1], A[2], …, A[k].

• A[j] is an increasing subsequence of length j in the stream seen so far, chosen to have minimal last element.

• Let L[1], L[2], …, L[k] be the last elements of A[1], A[2], …, A[k]

• To process item x, find i for which L[i] < x < L[i+1], and replace A[i+1] with A[i], x
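The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it stores every list A[j] explicitly (no compression), uses 0-based Python lists so that `lists[j-1]` plays the role of A[j] and `lasts[j-1]` the role of L[j], and finds the index with `bisect_left`, which is a choice made here for brevity.

```python
from bisect import bisect_left

def streaming_lis(stream):
    """One pass over the stream; returns one longest strictly increasing subsequence."""
    lists = []   # lists[j-1] = A[j]: an increasing subsequence of length j
    lasts = []   # lasts[j-1] = L[j]: its last element (kept strictly increasing)
    for x in stream:
        # i = number of lists whose last element is strictly smaller than x,
        # so L[i] < x <= L[i+1] in the slide's 1-based notation.
        i = bisect_left(lasts, x)
        candidate = (lists[i - 1] if i > 0 else []) + [x]
        if i == len(lists):
            # x extends the longest list found so far.
            lists.append(candidate)
            lasts.append(x)
        else:
            # Replace A[i+1] by A[i], x; when x == L[i+1] the last element is unchanged.
            lists[i] = candidate
            lasts[i] = x
    return lists[-1] if lists else []
```

For example, `streaming_lis([5, 2, 8, 6, 3, 6, 9, 7])` returns `[2, 3, 6, 7]`. The number of lists never exceeds the LIS length k, and each list has length at most k.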


• So we have k arrays A[1], …, A[k], each of length at most k.

• Naively, this takes $O(k^2 \log |\Sigma|)$ space.

• But each A[i] is increasing, so we can compress each list by storing differences between consecutive elements.

• Total space is $O(k^2 \log(|\Sigma|/k))$.
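The compression step can be illustrated as follows. The slides only say "storing differences"; the sketch below assumes one concrete choice, namely writing each difference with an Elias gamma code, which is not necessarily the encoding used in the paper. All function names are hypothetical.

```python
def to_gaps(seq):
    """First element followed by differences between consecutive elements."""
    if not seq:
        return []
    return [seq[0]] + [b - a for a, b in zip(seq, seq[1:])]

def from_gaps(gaps):
    """Inverse of to_gaps: prefix sums recover the original increasing list."""
    out, total = [], 0
    for g in gaps:
        total += g
        out.append(total)
    return out

def gamma_bits(n):
    """Length in bits of the Elias gamma code of an integer n >= 1."""
    return 2 * n.bit_length() - 1

def gap_bits(seq):
    """Bits used when each gap g is stored as the gamma code of g + 1."""
    return sum(gamma_bits(g + 1) for g in to_gaps(seq))

def naive_bits(seq, alphabet_size):
    """Bits used when each element is stored with a fixed-width code."""
    return len(seq) * (alphabet_size - 1).bit_length()
```

For the list `[3, 10, 11, 40, 41]` over an alphabet of size $2^{20}$, `naive_bits` gives 100 while `gap_bits` gives 27. The saving comes from the gaps of an increasing list summing to at most $|\Sigma|$, so most of them are small.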


This talk

• First result: a tight space bound for the LIS problem

• Second result: tight bounds for longest common subsequence (LCS)


LCS Bounds

• Problem: Alice has a permutation $\sigma$ of $[N]$, Bob has a permutation $\pi$ of $[N]$. Decide if $|\mathrm{LCS}(\sigma, \pi)| \ge k$.

• Previous space bound: $\Omega(k)$ [LNVZ05]

• Our space bound: $\Omega(N)$ for $3 \le k \le N/2$

(holds for randomized O(1)-pass algorithms)


LCS Bounds

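• Why can we only prove $\Omega(N)$ for $3 \le k \le N/2$?

• If $k = 2$, the problem reduces to an equality test.

• If $k$ is large, then for Alice's permutation $\sigma$ there are at most $O(N^{2(N-k)})$ permutations $\pi$ with $|\mathrm{LCS}(\sigma, \pi)| > k$, so just use an equality test with error $O(1/N^{2(N-k)})$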


Our Lower Bound

• Padding lemma: if for $k = 3$ the randomized communication complexity is $\Omega(N)$, then it’s $\Omega(N)$ for all $k \le N/2$

• Proof: just pad each of the inputs by some common subsequence of length k-3
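The padding step can be sketched on a small example. The slide does not say where the common subsequence is placed; the sketch below assumes one simple choice, appending the same block of $k-3$ fresh symbols to the end of both permutations, and checks with a standard dynamic program that the LCS length grows by exactly $k-3$. The helper names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence, by the standard DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def pad(perm, k):
    """Append the fresh symbols n+1, ..., n+k-3 (assumed placement: at the end)."""
    n = len(perm)
    return list(perm) + list(range(n + 1, n + k - 2))

# Two permutations of [6] whose LCS has length 3 (the common subsequence 1, 3, 2).
sigma = [1, 3, 2, 4, 5, 6]
pi = [6, 5, 4, 1, 3, 2]
```

With $k = 7$, `lcs_length(pad(sigma, 7), pad(pi, 7))` is 7. The fresh symbols come after all original symbols in both inputs, so any common subsequence splits into an original part and a padded part.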


Remains to show high complexity for $k = 3$. We reduce from disjointness

Alice holds $x \in \{0,1\}^n$; Bob holds $y \in \{0,1\}^n$

Question: is there an $i$ such that $x_i = y_i = 1$?

Randomized multi-way communication is $\Omega(n)$


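Alice holds $x \in \{0,1\}^{N/3}$ and constructs $\sigma$; Bob holds $y \in \{0,1\}^{N/3}$ and constructs $\pi$

Question: is there an $i$ such that $x_i = y_i = 1$?

Want $|\mathrm{LCS}(\sigma, \pi)| \ge 3$ iff there is such an $i$, i.e. iff $x$ and $y$ are not disjoint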


Alice

$x \in \{0,1\}^{N/3}$

Divide $1, \ldots, N$ into $N/3$ groups $G_1 = (1, 2, 3)$, $G_2 = (4, 5, 6)$, …, $G_{N/3} = (N-2, N-1, N)$.

Use $x$ to choose $\sigma_1, \ldots, \sigma_{N/3}$

$\sigma_i$ acts on $G_i = (m+1, m+2, m+3)$, where $m = 3(i-1)$

If $x_i = 0$, $\sigma_i(m+1, m+2, m+3) = (m+1, m+2, m+3)$.

If $x_i = 1$, $\sigma_i(m+1, m+2, m+3) = (m+1, m+3, m+2)$.

$\sigma = \sigma_1, \sigma_2, \ldots, \sigma_{N/3}$


Bob

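$y \in \{0,1\}^{N/3}$

Divide $1, \ldots, N$ into $N/3$ groups $G_1 = (1, 2, 3)$, $G_2 = (4, 5, 6)$, …, $G_{N/3} = (N-2, N-1, N)$.

Use $y$ to choose $\pi_1, \ldots, \pi_{N/3}$

$\pi_i$ acts on $G_i = (m+1, m+2, m+3)$, where $m = 3(i-1)$

If $y_i = 0$, $\pi_i(m+1, m+2, m+3) = (m+3, m+2, m+1)$.

If $y_i = 1$, $\pi_i(m+1, m+2, m+3) = (m+1, m+3, m+2)$.

$\pi = \pi_{N/3}, \ldots, \pi_1$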


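$\sigma$: $\sigma_1(G_1), \sigma_2(G_2), \sigma_3(G_3), \ldots, \sigma_{N/3}(G_{N/3})$

$\pi$: $\pi_{N/3}(G_{N/3}), \ldots, \pi_3(G_3), \pi_2(G_2), \pi_1(G_1)$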

Claim: $|\mathrm{LCS}(\sigma, \pi)| \le 3$.

Proof: Use the fact that $\mathrm{LCS}(\sigma, \pi)$ intersects at most one $G_i$

Claim: $|\mathrm{LCS}(\sigma, \pi)| = 3$ iff there is some $i$ with $x_i = y_i = 1$

Proof: Use the way we defined $\sigma_i$ and $\pi_i$

Thus, can decide disjointness, so $\Omega(N)$ communication.
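Both claims can be checked exhaustively for small $N$. The sketch below is an illustration, not the authors' code. It writes `alice_perm` for Alice's permutation and `bob_perm` for Bob's (names chosen here), builds them from the group rules on the previous two slides with 0-based group indices, and compares the LCS length against the disjointness answer for every pair of inputs.

```python
from itertools import product

def lcs_length(a, b):
    """Length of the longest common subsequence, by the standard DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def alice_perm(x):
    """Groups in order G_1, ..., G_{N/3}; identity if x_i = 0, swap last two if x_i = 1."""
    out = []
    for i, bit in enumerate(x):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if bit else [m + 1, m + 2, m + 3]
    return out

def bob_perm(y):
    """Groups in reverse order; reversed group if y_i = 0, swap last two if y_i = 1."""
    out = []
    for i in reversed(range(len(y))):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if y[i] else [m + 3, m + 2, m + 1]
    return out

def check_reduction(groups):
    """True if, for all x and y, the LCS is at most 3 and equals 3 exactly when
    some i has x_i = y_i = 1."""
    inputs = list(product((0, 1), repeat=groups))
    for x, y in product(inputs, repeat=2):
        length = lcs_length(alice_perm(x), bob_perm(y))
        intersect = any(a and b for a, b in zip(x, y))
        if length > 3 or (length == 3) != intersect:
            return False
    return True
```

Calling `check_reduction(3)` covers all 64 input pairs for $N = 9$. Within a single group the four cases give LCS lengths 1, 2, 2 and 3, and the reversed group order prevents a common subsequence from using two different groups.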


Other results

• Tight space bounds for computing the LIS length.

• Generalization to approximate LIS and LCS. Still many gaps here.

• Example: for approximating the LIS length, we have $\Omega(1/\epsilon)$ and $O(k \log |\Sigma|)$. Recent work [GJKK07] has shown $O(\sqrt{N/\epsilon} \log |\Sigma|)$, but there is still a large gap.


Conclusion

• First result: a tight bound for the LIS

• Second result: an $\Omega(N)$ space bound for the LCS k-decision problem for $3 \le k \le N/2$

• Other results for approximation problems

• Another open question: extend our lower bound for LIS to randomized multi-round