A Linear Space Algorithm for Computing Maximal Common Subsequences
Author: D.S. Hirschberg
Publisher: Communications of the ACM, 1975
Presenter: Han-Chen Chen
Date: 2010/04/07


Outline

Introduction
Algorithm A
Algorithm B
Algorithm C


Introduction

The longest common subsequence (LCS) problem for two strings has previously been solved in quadratic time and quadratic space.

We present an algorithm that solves this problem in quadratic time and linear space.


Algorithm A

Input: strings $A_{1m}$ and $B_{1n}$. Output: matrix $L$.
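The pseudocode for this slide did not survive in the transcript. The following is a minimal Python sketch of the standard full-table dynamic program that matches the stated input and output. The function name `lcs_table` and the 0-based string indexing are assumptions made for illustration and are not taken from the slides.

```python
def lcs_table(a: str, b: str) -> list[list[int]]:
    """Fill the full (m+1) x (n+1) table of LCS lengths.

    table[i][j] is the LCS length of the prefixes a[:i] and b[:j].
    Row 0 and column 0 stay 0 (an empty prefix has an empty LCS).
    """
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # Matching symbols extend the LCS of the shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise drop the last symbol of one of the prefixes.
                table[i][j] = max(table[i][j - 1], table[i - 1][j])
    return table
```

The entry `lcs_table(a, b)[len(a)][len(b)]` is the LCS length of the whole strings, and the filled table is what makes the quadratic space cost visible.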


Analysis of Algorithm A

Time complexity:

the inner loop body executes m*n times → O(mn)

Space complexity:

input arrays: m + n

output array: (m+1)*(n+1)

space required → O(mn)


Algorithm B (I)

Space required: O(m+n)

It can output the length of a maximal common subsequence, but it cannot recover the subsequence itself.


Algorithm B (II)

Input: strings $A_{1m}$ and $B_{1n}$. Output: vector $LL$ of length n+1, where $LL(j) = L(m,j)$.
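The pseudocode for this slide is also missing from the transcript. A minimal sketch of the idea, assuming the usual two-row formulation, is given below: only the previous row and the current row of the table are kept, and the last row is returned. The name `lcs_last_row` is hypothetical.

```python
def lcs_last_row(a: str, b: str) -> list[int]:
    """Return the last row of the LCS table using two rows of storage.

    The result has length len(b) + 1, and result[j] is the LCS length
    of the whole of a and the prefix b[:j].
    """
    n = len(b)
    prev = [0] * (n + 1)          # row i-1 of the table
    for symbol in a:
        curr = [0] * (n + 1)      # row i of the table
        for j in range(1, n + 1):
            if symbol == b[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(curr[j - 1], prev[j])
        prev = curr               # row i becomes the previous row
    return prev
```

Because earlier rows are discarded, the length is available but the path through the table is not, which is why this step alone cannot reconstruct a subsequence.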


Analysis of Algorithm B

Time complexity:

the inner loop body executes m*n times → O(mn)

Space complexity:

input arrays: m + n

output array: n+1

space required → O(m+n)


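Algorithm C: Divide and conquer

[Figure: the matrix with String A on the vertical axis (1 to m) and String B on the horizontal axis (1 to n). String A is split at row i = m/2. ALG B is run on each half to find the column j, which gives the two subproblems ($A_{1i}$, $B_{1j}$) and ($A_{i+1,m}$, $B_{j+1,n}$).]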


Algorithm C

For $j = 0, \dots, n$:

$L(i,j)$: the maximum length of common subsequences of $A_{1i}$ and $B_{1j}$

$L^*(i,j)$: the maximum length of common subsequences of $A_{m,i+1}$ and $B_{n,j+1}$ (the reversals of $A_{i+1,m}$ and $B_{j+1,n}$)

Define $M(i) = \max_{0 \le j \le n} \{ L(i,j) + L^*(i,j) \}$

Theorem: $M(i) = L(m,n)$

Proof: first, for every $j$, $L(i,j) + L^*(i,j) \le L(m,n)$.

$S(i,j)$: any maximal common subsequence of $A_{1i}$ and $B_{1j}$

$S^*(i,j)$: any maximal common subsequence of $A_{i+1,m}$ and $B_{j+1,n}$

Then $C = S(i,j) \,\|\, S^*(i,j)$ is a common subsequence of $A_{1m}$ and $B_{1n}$ of length $L(i,j) + L^*(i,j)$. Thus $L(m,n) \ge L(i,j) + L^*(i,j)$ for every $j$, and therefore $L(m,n) \ge M(i)$.


Algorithm C

Proof (continued): there exists some $j$ with $L(i,j) + L^*(i,j) \ge L(m,n)$.

$S(m,n)$: any maximal common subsequence of $A_{1m}$ and $B_{1n}$

$S(m,n)$ is a subsequence of $A_{1m}$, so $S(m,n) = S_1 \,\|\, S_2$, where $S_1$ is a subsequence of $A_{1i}$ and $S_2$ is a subsequence of $A_{i+1,m}$.

$S(m,n)$ is also a subsequence of $B_{1n}$, so there exists $j$ such that $S_1$ is a subsequence of $B_{1j}$ and $S_2$ is a subsequence of $B_{j+1,n}$.

By the definitions of $L$ and $L^*$, $|S_1| \le L(i,j)$ and $|S_2| \le L^*(i,j)$.

Thus $L(m,n) = |S(m,n)| = |S_1| + |S_2| \le L(i,j) + L^*(i,j)$.

So $M(i) = \max_{0 \le j \le n} \{ L(i,j) + L^*(i,j) \} = L(m,n)$.
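The theorem can be spot-checked numerically. The sketch below is an illustration rather than part of the original slides: it computes $L(i,\cdot)$ with a two-row routine, computes $L^*(i,\cdot)$ by running the same routine on the reversed suffixes, and compares $M(i)$ with $L(m,n)$ for every split row $i$. The names `lcs_last_row` and `check_theorem` are hypothetical.

```python
def lcs_last_row(a: str, b: str) -> list[int]:
    """Last row of the LCS table: entry j is the LCS length of a and b[:j]."""
    prev = [0] * (len(b) + 1)
    for symbol in a:
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if symbol == b[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(curr[j - 1], prev[j])
        prev = curr
    return prev


def check_theorem(a: str, b: str) -> bool:
    """Return True if M(i) == L(m, n) for every i in 0..m."""
    n = len(b)
    total = lcs_last_row(a, b)[n]                       # L(m, n)
    for i in range(len(a) + 1):
        forward = lcs_last_row(a[:i], b)                # forward[j] = L(i, j)
        # Reversing both strings turns suffixes into prefixes, so
        # backward[n - j] is the LCS length of a[i:] and b[j:], i.e. L*(i, j).
        backward = lcs_last_row(a[i:][::-1], b[::-1])
        m_i = max(forward[j] + backward[n - j] for j in range(n + 1))
        if m_i != total:
            return False
    return True
```

A call such as `check_theorem("ABCBDAB", "BDCABA")` exercises every split row of the first string, not only the midpoint used by the algorithm.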


Algorithm C
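The pseudocode on the Algorithm C slides was not preserved in the transcript. The sketch below follows the divide-and-conquer scheme described on the preceding slides: split A at i = m/2, run the two-row routine forward on the first half and on the reversed second half, pick a column that maximizes $L(i,j) + L^*(i,j)$, and recurse on the two subproblems. The function names are hypothetical, and string slicing is used for readability; a version that passes index ranges instead of copies would be needed to keep the extra storage strictly linear.

```python
def lcs_last_row(a: str, b: str) -> list[int]:
    """Last row of the LCS table: entry j is the LCS length of a and b[:j]."""
    prev = [0] * (len(b) + 1)
    for symbol in a:
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if symbol == b[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(curr[j - 1], prev[j])
        prev = curr
    return prev


def hirschberg_lcs(a: str, b: str) -> str:
    """Return one maximal common subsequence of a and b."""
    m, n = len(a), len(b)
    if m == 0 or n == 0:
        return ""                       # trivial problem: nothing in common
    if m == 1:
        return a if a in b else ""      # single symbol: present in b or not
    i = m // 2                          # split row
    forward = lcs_last_row(a[:i], b)                # forward[j] = L(i, j)
    backward = lcs_last_row(a[i:][::-1], b[::-1])   # backward[n - j] = L*(i, j)
    # Choose a split column k that attains M(i).
    k = max(range(n + 1), key=lambda j: forward[j] + backward[n - j])
    return hirschberg_lcs(a[:i], b[:k]) + hirschberg_lcs(a[i:], b[k:])
```

When several columns attain the maximum, `max` keeps the first one, so the returned subsequence is one of possibly many maximal common subsequences.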



Analysis of Algorithm C (I)

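Time analysis: each level of recursion halves the total work, so

$O(mn) + O(\tfrac{1}{2}mn) + O(\tfrac{1}{4}mn) + \cdots = O(mn(1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots)) = O(mn)$,

because the geometric series $1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots$ sums to 2.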
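One way to justify the terms of this series is sketched below, under two assumptions that are not stated on the slide: m is a power of two, and ALG B costs at most $c \cdot m' \cdot n'$ on an $m' \times n'$ subproblem for some constant $c$. At recursion depth $k$ there are at most $2^k$ subproblems, each covering $m/2^k$ rows of A, and their segments of B are disjoint, so their column counts $n_t$ sum to at most $n$.

```latex
\begin{align*}
\text{cost at depth } k
  &\le \sum_{t=1}^{2^k} c \cdot \frac{m}{2^k} \cdot n_t
   = c \cdot \frac{m}{2^k} \sum_{t=1}^{2^k} n_t
   \le \frac{c\,mn}{2^k}, \\
\text{total cost}
  &\le \sum_{k \ge 0} \frac{c\,mn}{2^k}
   = 2c\,mn
   = O(mn).
\end{align*}
```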


Analysis of Algorithm C (II)

Space analysis:

The calls to ALG B use temporary storage proportional to m and n.

Exclusive of recursive calls to ALG C, ALG C uses a constant amount of memory space. There are 2m-1 calls to ALG C, so ALG C requires O(m+n) memory space.


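Proof: 2m-1 calls to ALG C

Let $m \le 2^r$.

Base case: for $m = 1$ there is $2 \cdot 1 - 1 = 1$ call to ALG C.

Inductive hypothesis: assume that for $m \le 2^r = M$ there are $2m - 1$ calls to ALG C.

Inductive step: for $m' = 2^{r+1} = 2M$, the first call to ALG C partitions the problem into two parts of size $m = M$, and each part makes $2m - 1$ calls to ALG C. So there are $1 + (2m - 1) + (2m - 1) = 4m - 1 = 2m' - 1$ calls.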
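The count can also be checked with a short recurrence. The sketch below assumes that every call with more than one row of A makes exactly two recursive calls, and it ignores any early exit on an empty segment of B, so it gives the worst-case count. The function name `count_alg_c_calls` is hypothetical.

```python
def count_alg_c_calls(m: int) -> int:
    """Count calls to ALG C on m >= 1 rows, assuming every call with
    m > 1 splits the rows in half and recurses on both halves."""
    if m <= 1:
        return 1                      # base case: a single call, no recursion
    half = m // 2
    # One call for this level plus the calls made by each half.
    return 1 + count_alg_c_calls(half) + count_alg_c_calls(m - half)
```

Under this assumption the recurrence gives 2m-1 for the powers of two covered by the induction, and the same value for other m, since a binary recursion tree with m leaves has 2m-1 nodes.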