TY - GEN

T1 - On the closest string via rank distance

AU - Dinu, Liviu P.

AU - Popa, Alexandru

PY - 2012/7/4

Y1 - 2012/7/4

N2 - Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ ≤ n) such that each string s ∈ S has a substring of length ℓ of "distance" at most d to t. The closest string problem (CSP) is a special case of CSSP where ℓ = n. CSP and CSSP arise in many applications in bioinformatics and are extensively studied in the context of Hamming and edit distance. In this paper we consider a recently introduced distance measure, namely the rank distance. First, we show that the CSP and CSSP via rank distance are NP-hard. Then, we present a polynomial time k-approximation algorithm for the CSP problem. Finally, we give a parametrized algorithm for the CSP (the parameter is the number of input strings) if the alphabet is binary and each string has the same number of 0's and 1's.

AB - Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ ≤ n) such that each string s ∈ S has a substring of length ℓ of "distance" at most d to t. The closest string problem (CSP) is a special case of CSSP where ℓ = n. CSP and CSSP arise in many applications in bioinformatics and are extensively studied in the context of Hamming and edit distance. In this paper we consider a recently introduced distance measure, namely the rank distance. First, we show that the CSP and CSSP via rank distance are NP-hard. Then, we present a polynomial time k-approximation algorithm for the CSP problem. Finally, we give a parametrized algorithm for the CSP (the parameter is the number of input strings) if the alphabet is binary and each string has the same number of 0's and 1's.

UR - http://www.scopus.com/inward/record.url?scp=84861902988&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84861902988&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-31265-6_33

DO - 10.1007/978-3-642-31265-6_33

M3 - Conference contribution

AN - SCOPUS:84861902988

SN - 9783642312649

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 413

EP - 426

BT - Combinatorial Pattern Matching - 23rd Annual Symposium, CPM 2012, Proceedings

T2 - 23rd Annual Symposium on Combinatorial Pattern Matching, CPM 2012

Y2 - 3 July 2012 through 5 July 2012

ER -