Fixed-Parameter Algorithms for CLOSEST STRING and Related Problems, Algorithmica (2003), Jens Gramm, Rolf Niedermeier, Peter Rossmanith

Dec 16, 2015

Page 1:

Fixed-Parameter Algorithms for CLOSEST STRING and Related Problems

Algorithmica (2003)

Jens Gramm, Rolf Niedermeier, Peter Rossmanith

Page 2:

Outline

Introduction
Preliminaries
Linear-Time solution for constant d
Related Problems
Linear-Time solution for fixed k
Conclusion

Page 3:

Intro : Problem Definition

Input: Strings s1, s2, …, sk over alphabet Σ, each of length L, and a nonnegative integer d.

Question: Is there a string s of length L such that dH(s, si) ≤ d for all i = 1,…,k?

Here dH is the Hamming distance: dH(s1, s2) = |{i | s1[i] ≠ s2[i]}| for |s1| = |s2|.
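The definition can be made concrete with a minimal Python sketch. The function names `hamming` and `is_center` are illustrative and do not come from the slides; the four test strings are the example matrix used later in the talk.

```python
def hamming(a: str, b: str) -> int:
    """Hamming distance dH(a, b): number of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def is_center(s: str, strings: list[str], d: int) -> bool:
    """True iff dH(s, si) <= d for every given string si."""
    return all(hamming(s, si) <= d for si in strings)


strings = ["abcd", "aadb", "bcda", "accc"]
print(hamming("abcd", "aadb"))        # positions 2, 3, 4 differ
print(is_center("acdd", strings, 2))  # a candidate within distance 2 of all four
```

A yes-instance of CLOSEST STRING is exactly one for which some string `s` makes `is_center(s, strings, d)` true.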

Page 4:

NP-completeness

CLOSEST STRING is NP-complete.

d is usually small in biological applications.

Result in this paper: an O(kL + kd·d^d) algorithm.

PTAS by Li et al.

Page 5:

Extended problems

d-MISMATCH

DISTINGUISHING STRING SELECTION

DISTINGUISHING SUBSTRING SELECTION

Page 6:

Preliminaries

Given a set of strings S={s1,…,sk}, each of length L:

s is an optimal center string iff there is no s' such that max_{i=1,…,k} dH(s',si) < max_{i=1,…,k} dH(s,si).

s is an optimal median string iff there is no s' such that Σ_{i=1,…,k} dH(s',si) < Σ_{i=1,…,k} dH(s,si).

Page 7:

Given a set of k strings of length L, think of the set as a k × L matrix.

Optimal median string : a c c a

s1 a b c d

s2 a a d b

s3 b c d a

s4 a c c c
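An optimal median string can be read off column by column, since the sum of distances splits into independent column costs. The sketch below is a minimal Python illustration; `median_string` and `total_distance` are illustrative names, and ties between equally frequent symbols are broken arbitrarily, so the result may differ from "a c c a" while having the same total distance.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of positions where a and b differ (equal lengths assumed)."""
    return sum(x != y for x, y in zip(a, b))


def median_string(strings: list[str]) -> str:
    """Pick a most frequent symbol in every column of the k x L matrix."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*strings))


def total_distance(s: str, strings: list[str]) -> int:
    """Sum of Hamming distances from s to all given strings."""
    return sum(hamming(s, si) for si in strings)


strings = ["abcd", "aadb", "bcda", "accc"]
m = median_string(strings)
print(m, total_distance(m, strings), total_distance("acca", strings))
```

An optimal center string has no such column-wise shortcut, which is why the rest of the talk needs a search.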

Page 8:

Main idea

Search!

Fixed-parameter tractability

Reduction to problem kernel

Page 9:

LEMMA 1. Let S={s1,…,sk} be a set of strings, each of length L, and let σ: {1,…,L} → {1,…,L} be a permutation. Then s is an optimal center string for {s1,…,sk} iff σ(s) is an optimal center string for {σ(s1), σ(s2), …, σ(sk)}.

Page 10:

LEMMA 2. To compute an optimal center string, it is sufficient to solve a normalized and reordered instance. From this, the solution of the original instance can be derived in linear time.

Original instance:

s1 a b c d

s2 a a d b

s3 b c d a

s4 a c c c

Normalized:

s1 a b a a

s2 a c b b

s3 b a b c

s4 a a a d

Normalized and reordered:

s1 b a a a

s2 c a b b

s3 a b b c

s4 a a a d
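The normalization step can be sketched in Python as follows. The renaming rule is an assumption inferred from the example matrices: within each column, symbols are renamed a, b, c, … in order of decreasing frequency, with ties broken by first occurrence from top to bottom. The function name `normalize` is illustrative, and the sketch assumes at most 26 strings so that lowercase letters suffice.

```python
from collections import Counter
from string import ascii_lowercase


def normalize(strings: list[str]) -> list[str]:
    """Rename the symbols of every column independently (assumed rule:
    most frequent symbol -> 'a', next -> 'b', ...; ties by first occurrence)."""
    k, L = len(strings), len(strings[0])
    new_columns = []
    for p in range(L):
        column = [s[p] for s in strings]
        counts = Counter(column)  # keeps first-occurrence order
        # sorted() is stable, so equally frequent symbols keep that order
        ranked = sorted(counts, key=lambda ch: -counts[ch])
        rename = {ch: ascii_lowercase[rank] for rank, ch in enumerate(ranked)}
        new_columns.append([rename[ch] for ch in column])
    return ["".join(new_columns[p][i] for p in range(L)) for i in range(k)]


print(normalize(["abcd", "aadb", "bcda", "accc"]))
```

Because every column is renamed with a bijection, Hamming distances between rows are unchanged, and a solution of the normalized instance maps back by inverting the per-column renaming.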

Page 11:

LEMMA 3. A CLOSEST STRING instance with an arbitrary alphabet Σ, |Σ|>k, is isomorphic to a CLOSEST STRING instance with alphabet Σ', |Σ'|=k. This follows by normalization, since each column contains at most k different symbols.

Page 12:

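LEMMA 4. Given a CLOSEST STRING instance with strings s1,…,sk of length L and an integer d. If the resulting k × L matrix has more than kd dirty columns, then there is no string s with max_{i=1,…,k} dH(s,si) ≤ d.

A column is dirty iff it contains at least two different symbols from the alphabet Σ.

Proof idea: pigeonhole principle. Any s mismatches at least one si in every dirty column, and the k strings together tolerate at most kd mismatches.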
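Lemma 4 gives a simple kernelization, sketched below in Python. The names `dirty_columns` and `reduce_to_kernel` are illustrative; the sketch keeps only the dirty columns, on the understanding that in a clean column the center string simply copies the common symbol.

```python
def dirty_columns(strings: list[str]) -> list[int]:
    """Indices of columns that contain at least two different symbols."""
    return [p for p in range(len(strings[0]))
            if len({s[p] for s in strings}) > 1]


def reduce_to_kernel(strings: list[str], d: int):
    """Return the strings restricted to their dirty columns, or None when
    Lemma 4 already shows that no center string with radius d exists."""
    dirty = dirty_columns(strings)
    if len(dirty) > len(strings) * d:
        return None  # more than k*d dirty columns: no solution
    return ["".join(s[p] for p in dirty) for s in strings]


print(reduce_to_kernel(["aaaa", "aaba", "abaa"], 1))  # two dirty columns remain
print(reduce_to_kernel(["aaaa", "aaba", "abaa"], 0))  # rejected by Lemma 4
```

One pass over the matrix suffices, which matches the O(kL) preprocessing cost stated for the reduction.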

Page 13:

A Linear-Time solution for constant d

Bounded search tree algorithm

LEMMA 5. Given a set of strings S={s1,…,sk} and a positive integer d. If there are i, j ∈ {1,…,k} with dH(si,sj) > 2d, then there is no string s with max_{i=1,…,k} dH(s,si) ≤ d. This follows from the triangle inequality: dH(si,sj) ≤ dH(si,s) + dH(s,sj) ≤ 2d.
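Lemma 5 can be used as a cheap rejection test before any search. A minimal Python sketch follows; `rejected_by_lemma5` is an illustrative name, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions where a and b differ (equal lengths assumed)."""
    return sum(x != y for x, y in zip(a, b))


def rejected_by_lemma5(strings: list[str], d: int) -> bool:
    """True iff some pair of input strings is more than 2d apart,
    in which case no center string with radius d can exist."""
    return any(hamming(a, b) > 2 * d for a, b in combinations(strings, 2))


strings = ["abcd", "aadb", "bcda", "accc"]
print(rejected_by_lemma5(strings, 1))  # s1 and s3 differ in all 4 positions
print(rejected_by_lemma5(strings, 2))
```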

Page 14:

Page 15:

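Theorem 1. Given a set of strings S={s1,…,sk} and an integer d, Algorithm D determines in O(kL + kd·d^d) time whether there is a string s with max_{i=1,…,k} dH(s,si) ≤ d.

By Lemma 4, the input instance is reduced to O(kd) columns in O(kL) time.

The search tree has depth d. Steps D0 to D3 together take O(kd) time per node, by building a table containing the distances of the candidate string (initially s1) to all other given strings.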
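The slide showing Algorithm D itself did not survive in this transcript, so the following Python code is only a sketch of a bounded search tree consistent with what the slides state: start from the candidate s1, branch on d+1 mismatch positions with a string that is too far away, and recurse to depth at most d. The names `closest_string` and `search` are illustrative, and the sketch recomputes distances at every node instead of maintaining the distance table mentioned above.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where a and b differ (equal lengths assumed)."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings: list[str], d: int):
    """Return a string within Hamming distance d of all inputs, or None."""

    def search(candidate: str, budget: int):
        if budget < 0:
            return None
        dists = [hamming(candidate, s) for s in strings]
        if max(dists) > d + budget:
            return None       # too far away to repair with the moves left
        if max(dists) <= d:
            return candidate  # candidate is a center string
        # pick a string that is too far away and branch on d+1 mismatches
        s_i = next(s for s, dist in zip(strings, dists) if dist > d)
        mismatches = [p for p in range(len(candidate))
                      if candidate[p] != s_i[p]]
        for p in mismatches[:d + 1]:
            moved = candidate[:p] + s_i[p] + candidate[p + 1:]
            found = search(moved, budget - 1)
            if found is not None:
                return found
        return None

    return search(strings[0], d)


strings = ["abcd", "aadb", "bcda", "accc"]
print(closest_string(strings, 2))  # some string within distance 2 of all four
print(closest_string(strings, 1))  # None: s1 and s3 are at distance 4 > 2*1
```

Each node has at most d+1 children and the depth is at most d, which gives the O(d^d) factor in the running time.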

Page 16:

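Correctness

We show only the correctness of the first branching step.

Suppose s1 is not a solution but there exists a center string s. Let si be a string with dH(s1,si) > d.

P: a set of d+1 positions p with s1[p] ≠ si[p], so |P| = d+1.

P_{s1≠s=si} := {p | s1[p] ≠ s[p] = si[p]}. Goal: show that P contains a position from P_{s1≠s=si}.

P_{s≠si} := {p | s[p] ≠ si[p]}, with |P_{s≠si}| ≤ d.

Every p in P lies in exactly one of P_{s1≠s=si} and P_{s≠si} (the two sets are disjoint). At most d positions of P lie in P_{s≠si}, so at least one lies in P_{s1≠s=si}.

So d+1 subcases are sufficient.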

Page 17:

Related Problems: the d-MISMATCH problem

s_{i,p,L} denotes the length-L substring of a given string si starting at position p; n is the length of the given strings.

Question: Is there a string s of length L and a position p with 1 ≤ p ≤ n-L+1 such that dH(s, s_{i,p,L}) ≤ d for all i = 1,…,k?

Stojanovic et al. give a linear-time algorithm for 1-MISMATCH.

Theorem 2. d-MISMATCH is solvable in O(kL + (n-L)·kd·d^d) time, which is O(nk) for fixed d.

Naively: O(n·(kL + kd·d^d)).

Maintain a queue of dirty columns.

Considering only the first L columns, we can build a FIFO queue in O(kL) time.

Update the queue at each position in O(k) time.
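The sliding-window bookkeeping can be sketched with a double-ended queue. The slides only say that a FIFO queue of dirty columns is maintained, so the details below are assumptions: `dirty_windows` is an illustrative name, positions are 0-based, and copying the queue into a list for every window is done only so that the result can be inspected.

```python
from collections import deque


def dirty_windows(strings: list[str], L: int) -> list[list[int]]:
    """For every window start p, list the dirty columns among p .. p+L-1."""
    n = len(strings[0])

    def is_dirty(c: int) -> bool:
        # O(k): compare every string against the first one in column c
        return any(s[c] != strings[0][c] for s in strings)

    queue = deque(c for c in range(L) if is_dirty(c))  # first window, O(kL)
    windows = [list(queue)]
    for p in range(1, n - L + 1):
        if queue and queue[0] == p - 1:
            queue.popleft()        # column p-1 has left the window
        new_column = p + L - 1
        if is_dirty(new_column):
            queue.append(new_column)  # new rightmost column enters
        windows.append(list(queue))
    return windows


print(dirty_windows(["aaaa", "aaba"], 2))
```

A window with more than kd dirty columns can be skipped by Lemma 4; for the remaining windows the CLOSEST STRING search runs on the dirty columns only.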

Page 18:

DSS problem: DISTINGUISHING STRING SELECTION

Given S={s1,…,sk1} and S'={s'1,…,s'k2}, all strings of the same length L, and d1, d2 ≥ 0, is there a string s such that

$$\max_{i=1,\dots,k_1} d_H(s, s_i) \le d_1 \quad\text{and}\quad \min_{j=1,\dots,k_2} d_H(s, s'_j) \ge L - d_2 \,?$$

LEMMA 6. Given two sets of strings S={s1,…,sk1} and S'={s'1,…,s'k2} and positive integers d1, d2. If there are i ∈ {1,…,k1} and j ∈ {1,…,k2} with dH(si,s'j) < L-(d1+d2), then there is no string s satisfying both max_{i=1,…,k1} dH(s,si) ≤ d1 and min_{j=1,…,k2} dH(s,s'j) ≥ L-d2.

Proof idea: by the triangle inequality, dH(s,s'j) ≤ dH(s,si) + dH(si,s'j) < d1 + L-(d1+d2) = L-d2.
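The two DSS conditions and the Lemma 6 rejection test translate directly into code. In this minimal Python sketch, `good` stands for S, `bad` stands for S', and the function names are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where a and b differ (equal lengths assumed)."""
    return sum(x != y for x, y in zip(a, b))


def is_distinguishing(s: str, good: list[str], bad: list[str],
                      d1: int, d2: int) -> bool:
    """s is within d1 of every string in S and at least L-d2 from every string in S'."""
    L = len(s)
    return (all(hamming(s, g) <= d1 for g in good)
            and all(hamming(s, b) >= L - d2 for b in bad))


def rejected_by_lemma6(good: list[str], bad: list[str],
                       d1: int, d2: int) -> bool:
    """True iff some si and s'j are closer than L-(d1+d2), so no s can exist."""
    L = len(good[0])
    return any(hamming(g, b) < L - (d1 + d2) for g in good for b in bad)


print(is_distinguishing("aaaa", ["aaaa", "aaab"], ["bbbb"], 1, 1))
print(rejected_by_lemma6(["aaaa"], ["aaab"], 1, 1))
```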

Page 19:

A Linear-Time Solution for Fixed k

Is CLOSEST STRING fixed-parameter tractable with respect to k?

Use integer linear programming (ILP).

Lenstra: an ILP with a fixed number of variables can be solved in linear time (exponential space).

Page 20:

CLOSEST STRING in ILP

Column types for k

For k=3: (a,a,a)^t, (a,a,b)^t, (a,b,a)^t, (b,a,a)^t, (a,b,c)^t

|column types| = B(k) ≤ k!

Variables x_{t,φ}, where t is a column type and φ ∈ Σ: the number of columns of type t whose corresponding character in the desired solution string of CLOSEST STRING is set to φ.

B(k)·k variables are needed.

Minimize

$$\max_{1 \le i \le k} \; \sum_{t} \; \sum_{\varphi \in \Sigma \setminus \{\varphi_{t,i}\}} x_{t,\varphi}$$

where φ_{t,i} denotes the alphabet symbol at the i-th entry of column type t.
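The count B(k) can be checked by enumerating column types directly. The Python sketch below lists one canonical representative per type, in which the first new symbol from top to bottom is always the next unused letter; under this convention the slide's (b,a,a)^t appears as "abb". The function name `column_types` is illustrative, and the code only enumerates types; it does not build or solve the ILP.

```python
from string import ascii_lowercase


def column_types(k: int) -> list[str]:
    """All normalized column types for k strings, one representative each.
    Their number is the Bell number B(k)."""
    types: list[str] = []

    def extend(prefix: list[str], used: int) -> None:
        if len(prefix) == k:
            types.append("".join(prefix))
            return
        # reuse one of the `used` symbols seen so far, or introduce the next one
        for r in range(used + 1):
            extend(prefix + [ascii_lowercase[r]], max(used, r + 1))

    extend([], 0)
    return types


print(column_types(3))
print(len(column_types(4)))
```

With B(k) types and k candidate symbols per type, the number of ILP variables depends on k alone, which is what makes the fixed-variable ILP result applicable.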

Page 21:

Conclusion

Fixed-parameter tractability of CLOSEST STRING with respect to d and with respect to k

Improves on previous work on d-MISMATCH

DSS, CLOSEST SUBSTRING?