Restricted common superstring and restricted common supersequence

Raphaël Clifford, Zvi Gotthilf, Moshe Lewenstein, Alexandru Popa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

The shortest common superstring and the shortest common supersequence are two well-studied problems with a wide range of applications. In this paper we consider both problems with resource constraints, denoted as the Restricted Common Superstring (RCSstr for short) problem and the Restricted Common Supersequence (RCSseq for short) problem. In the RCSstr (RCSseq) problem we are given a set $S$ of $n$ strings, $s_1, s_2, \ldots, s_n$, and a multiset $t = \{t_1, t_2, \ldots, t_m\}$, and the goal is to find a permutation $\pi: \{1, \ldots, m\} \to \{1, \ldots, m\}$ that maximizes the number of strings in $S$ that are substrings (subsequences) of $\pi(t) = t_{\pi(1)} t_{\pi(2)} \cdots t_{\pi(m)}$ (we call this ordering of the multiset, $\pi(t)$, a permutation of $t$). We first show that in its most general setting the RCSstr problem is NP-complete and hard to approximate within a factor of $n^{1-\varepsilon}$, for any $\varepsilon > 0$, unless P = NP. Afterwards, we present two separate reductions to show that the RCSstr problem remains NP-hard even when the elements of $t$ are drawn from a binary alphabet, or when all input strings are of length two. We then present some approximation results for several variants of the RCSstr problem. In the second part of this paper, we turn to the RCSseq problem, where we present some hardness results, tight lower bounds and approximation algorithms.
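To make the problem statement concrete, here is a minimal brute-force sketch of the objective described in the abstract. It is not the authors' method: it simply tries every distinct ordering of the multiset, which takes exponential time and is only usable on tiny inputs (as the hardness results would suggest). The function names and the `mode` parameter are hypothetical, and the elements of $t$ are assumed to be Python strings (single characters being a special case).

```python
from itertools import permutations


def is_subsequence(s: str, text: str) -> bool:
    """Return True if s can be obtained from text by deleting characters."""
    it = iter(text)
    # `ch in it` advances the iterator, so matches must occur in order.
    return all(ch in it for ch in s)


def count_covered(strings, text, mode="substring"):
    """Count how many input strings are substrings (or subsequences) of text."""
    if mode == "substring":
        return sum(1 for s in strings if s in text)
    return sum(1 for s in strings if is_subsequence(s, text))


def brute_force_restricted(strings, multiset, mode="substring"):
    """Try every distinct ordering of the multiset and keep the best one.

    Illustrative only: the number of orderings grows factorially in m.
    """
    best_text, best_count = None, -1
    # sorted(set(...)) removes duplicate orderings and fixes the scan order.
    for perm in sorted(set(permutations(multiset))):
        text = "".join(perm)
        covered = count_covered(strings, text, mode)
        if covered > best_count:
            best_text, best_count = text, covered
    return best_text, best_count


if __name__ == "__main__":
    S = ["ab", "ba", "aa"]
    t = ["a", "b", "a"]
    print(brute_force_restricted(S, t, mode="substring"))
    print(brute_force_restricted(S, t, mode="subsequence"))
```

On this small instance no ordering of the multiset contains all three strings as substrings, while the ordering "aba" contains all three as subsequences, which illustrates why the two variants are treated separately.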

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 467-478
Number of pages: 12
Volume: 6661 LNCS
DOI: https://doi.org/10.1007/978-3-642-21458-5_39
Publication status: Published - 2011
Externally published: Yes
Event: 22nd Annual Symposium on Combinatorial Pattern Matching, CPM 2011 - Palermo, Italy
Duration: Jun 27 2011 to Jun 29 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6661 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 22nd Annual Symposium on Combinatorial Pattern Matching, CPM 2011
Country: Italy
City: Palermo
Period: 6/27/11 to 6/29/11

Fingerprint

Superstring
Computational complexity
Approximation algorithms
Strings
Multiset
Hardness
Permutation
NP-complete problem
Lower Approximation
Resource Constraints
NP-hard Problems
Subsequence
Maximise
Binary
Lower bound
Approximation
Range of data

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Clifford, R., Gotthilf, Z., Lewenstein, M., & Popa, A. (2011). Restricted common superstring and restricted common supersequence. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6661 LNCS, pp. 467-478). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6661 LNCS). https://doi.org/10.1007/978-3-642-21458-5_39

@inproceedings{78b9477201b54be7b90b001e0617878b,
title = "Restricted common superstring and restricted common supersequence",
author = "Rapha{\"e}l Clifford and Zvi Gotthilf and Moshe Lewenstein and Alexandru Popa",
year = "2011",
doi = "10.1007/978-3-642-21458-5_39",
language = "English",
isbn = "9783642214578",
volume = "6661 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "467--478",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Restricted common superstring and restricted common supersequence

AU - Clifford, Raphaël

AU - Gotthilf, Zvi

AU - Lewenstein, Moshe

AU - Popa, Alexandru

PY - 2011

Y1 - 2011

UR - http://www.scopus.com/inward/record.url?scp=79960081879&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79960081879&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-21458-5_39

DO - 10.1007/978-3-642-21458-5_39

M3 - Conference contribution

SN - 9783642214578

VL - 6661 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 467

EP - 478

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -