On the closest string via rank distance

Liviu P. Dinu, Alexandru Popa

Research output: Chapter in Book/Report/Conference proceedingConference contribution

9 Citations (Scopus)

Abstract

Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ ≤ n) such that each string s ∈ S has a substring of length ℓ of "distance" at most d to t. The closest string problem (CSP) is a special case of CSSP where ℓ = n. CSP and CSSP arise in many applications in bioinformatics and are extensively studied in the context of Hamming and edit distance. In this paper we consider a recently introduced distance measure, namely the rank distance. First, we show that the CSP and CSSP via rank distance are NP-hard. Then, we present a polynomial time k-approximation algorithm for the CSP problem. Finally, we give a parametrized algorithm for the CSP (the parameter is the number of input strings) if the alphabet is binary and each string has the same number of 0's and 1's.

Original languageEnglish
Title of host publicationLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages413-426
Number of pages14
Volume7354 LNCS
DOIs
Publication statusPublished - 2012
Externally publishedYes
Event23rd Annual Symposium on Combinatorial Pattern Matching, CPM 2012 - Helsinki, Finland
Duration: Jul 3 2012Jul 5 2012

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume7354 LNCS
ISSN (Print)03029743
ISSN (Electronic)16113349

Other

Other23rd Annual Symposium on Combinatorial Pattern Matching, CPM 2012
CountryFinland
CityHelsinki
Period7/3/127/5/12

Fingerprint

Approximation algorithms
Bioinformatics
Strings
Polynomials
Edit Distance
Hamming Distance
Distance Measure
Approximation Algorithms
Polynomial time
NP-complete problem
Binary
Integer

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Dinu, L. P., & Popa, A. (2012). On the closest string via rank distance. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7354 LNCS, pp. 413-426). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7354 LNCS). https://doi.org/10.1007/978-3-642-31265-6_33

On the closest string via rank distance. / Dinu, Liviu P.; Popa, Alexandru.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7354 LNCS 2012. p. 413-426 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7354 LNCS).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Dinu, LP & Popa, A 2012, On the closest string via rank distance. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 7354 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7354 LNCS, pp. 413-426, 23rd Annual Symposium on Combinatorial Pattern Matching, CPM 2012, Helsinki, Finland, 7/3/12. https://doi.org/10.1007/978-3-642-31265-6_33
Dinu LP, Popa A. On the closest string via rank distance. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7354 LNCS. 2012. p. 413-426. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-642-31265-6_33
Dinu, Liviu P. ; Popa, Alexandru. / On the closest string via rank distance. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7354 LNCS 2012. pp. 413-426 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{8269619e209d4474aaf0707048de7caa,
title = "On the closest string via rank distance",
abstract = "Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ ≤ n) such that each string s ∈ S has a substring of length ℓ of {"}distance{"} at most d to t. The closest string problem (CSP) is a special case of CSSP where ℓ = n. CSP and CSSP arise in many applications in bioinformatics and are extensively studied in the context of Hamming and edit distance. In this paper we consider a recently introduced distance measure, namely the rank distance. First, we show that the CSP and CSSP via rank distance are NP-hard. Then, we present a polynomial time k-approximation algorithm for the CSP problem. Finally, we give a parametrized algorithm for the CSP (the parameter is the number of input strings) if the alphabet is binary and each string has the same number of 0's and 1's.",
author = "Dinu, {Liviu P.} and Alexandru Popa",
year = "2012",
doi = "10.1007/978-3-642-31265-6_33",
language = "English",
isbn = "9783642312649",
volume = "7354 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "413--426",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - On the closest string via rank distance

AU - Dinu, Liviu P.

AU - Popa, Alexandru

PY - 2012

Y1 - 2012

N2 - Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ ≤ n) such that each string s ∈ S has a substring of length ℓ of "distance" at most d to t. The closest string problem (CSP) is a special case of CSSP where ℓ = n. CSP and CSSP arise in many applications in bioinformatics and are extensively studied in the context of Hamming and edit distance. In this paper we consider a recently introduced distance measure, namely the rank distance. First, we show that the CSP and CSSP via rank distance are NP-hard. Then, we present a polynomial time k-approximation algorithm for the CSP problem. Finally, we give a parametrized algorithm for the CSP (the parameter is the number of input strings) if the alphabet is binary and each string has the same number of 0's and 1's.

AB - Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ ≤ n) such that each string s ∈ S has a substring of length ℓ of "distance" at most d to t. The closest string problem (CSP) is a special case of CSSP where ℓ = n. CSP and CSSP arise in many applications in bioinformatics and are extensively studied in the context of Hamming and edit distance. In this paper we consider a recently introduced distance measure, namely the rank distance. First, we show that the CSP and CSSP via rank distance are NP-hard. Then, we present a polynomial time k-approximation algorithm for the CSP problem. Finally, we give a parametrized algorithm for the CSP (the parameter is the number of input strings) if the alphabet is binary and each string has the same number of 0's and 1's.

UR - http://www.scopus.com/inward/record.url?scp=84861902988&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84861902988&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-31265-6_33

DO - 10.1007/978-3-642-31265-6_33

M3 - Conference contribution

SN - 9783642312649

VL - 7354 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 413

EP - 426

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -