Better heuristic algorithms for the repetition free LCS and other variants

Radu Stefan Mincu, Alexandru Popa

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

In Discrete Applied Mathematics 2010, Adi et al. introduce and study a variant of the well known Longest Common Subsequence problem, named Repetition Free Longest Common Subsequence (RFLCS). In RFLCS the input consists of two strings A and B over an alphabet Σ and the goal is to find the longest common subsequence containing only distinct characters from Σ. Adi et al. prove that the problem is APX -hard and show three approximation algorithms. Castelli et al. (Operations Research Letters 2013) propose a heuristic genetic algorithm and Blum and Blesa introduce metaheuristic algorithms (International Conference on Artificial Evolution 2013 and Evolutionary Computation in Combinatorial Optimization 2016). In this paper we design and test several new heuristic algorithms for RFLCS. The first algorithm, uses dynamic programming and in our testing setup outperforms the algorithms of Adi et al. The second heuristic algorithm improves upon the first and becomes comparable to the state-of-the-art algorithms of Blum and Blesa. The third algorithm transforms the RFLCS instance into an instance of the Maximum Independent Set (MIS) problem with the same value of the optimum solution. Then, we apply known algorithms for the MIS problem. We also augment one of the approximation algorithms of Adi et al. and we prove that we achieve an approximation of factor (Formula presented). Finally, we introduce a new variant of the LCS problem, named Multiset Restricted Common Subsequence (MRCS), that is a generalization of RFLCS. We present an exact polynomial time algorithm for MRCS for constant size alphabet. Additionally, we show that MRCS admits a (Formula presented) approximation.

Original languageEnglish
Title of host publicationString Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings
EditorsTravis Gagie, Alistair Moffat, Gonzalo Navarro, Ernesto Cuadros-Vargas
PublisherSpringer Verlag
Pages297-310
Number of pages14
ISBN (Print)9783030004781
DOIs
Publication statusPublished - Jan 1 2018
Event25th International Symposium on String Processing and Information Retrieval, SPIRE 2018 - Lima, Peru
Duration: Oct 9 2018Oct 11 2018

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11147 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other25th International Symposium on String Processing and Information Retrieval, SPIRE 2018
CountryPeru
CityLima
Period10/9/1810/11/18

Fingerprint

Longest Common Subsequence
Heuristic algorithms
Heuristic algorithm
Multiset
Subsequence
Maximum Independent Set
Approximation algorithms
Approximation Algorithms
Operations research
Combinatorial optimization
Combinatorial Optimization
Operations Research
Discrete mathematics
Evolutionary Computation
Exact Algorithms
Approximation
Repetition
Applied mathematics
Dynamic programming
Metaheuristics

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Mincu, R. S., & Popa, A. (2018). Better heuristic algorithms for the repetition free LCS and other variants. In T. Gagie, A. Moffat, G. Navarro, & E. Cuadros-Vargas (Eds.), String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings (pp. 297-310). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11147 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-00479-8_24

Better heuristic algorithms for the repetition free LCS and other variants. / Mincu, Radu Stefan; Popa, Alexandru.

String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings. ed. / Travis Gagie; Alistair Moffat; Gonzalo Navarro; Ernesto Cuadros-Vargas. Springer Verlag, 2018. p. 297-310 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11147 LNCS).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Mincu, RS & Popa, A 2018, Better heuristic algorithms for the repetition free LCS and other variants. in T Gagie, A Moffat, G Navarro & E Cuadros-Vargas (eds), String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11147 LNCS, Springer Verlag, pp. 297-310, 25th International Symposium on String Processing and Information Retrieval, SPIRE 2018, Lima, Peru, 10/9/18. https://doi.org/10.1007/978-3-030-00479-8_24
Mincu RS, Popa A. Better heuristic algorithms for the repetition free LCS and other variants. In Gagie T, Moffat A, Navarro G, Cuadros-Vargas E, editors, String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings. Springer Verlag. 2018. p. 297-310. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-00479-8_24
Mincu, Radu Stefan ; Popa, Alexandru. / Better heuristic algorithms for the repetition free LCS and other variants. String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings. editor / Travis Gagie ; Alistair Moffat ; Gonzalo Navarro ; Ernesto Cuadros-Vargas. Springer Verlag, 2018. pp. 297-310 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{a57b0707c95b4d2f9fdaf2fa1c90e632,
title = "Better heuristic algorithms for the repetition free LCS and other variants",
abstract = "In Discrete Applied Mathematics 2010, Adi et al. introduce and study a variant of the well known Longest Common Subsequence problem, named Repetition Free Longest Common Subsequence (RFLCS). In RFLCS the input consists of two strings A and B over an alphabet Σ and the goal is to find the longest common subsequence containing only distinct characters from Σ. Adi et al. prove that the problem is APX -hard and show three approximation algorithms. Castelli et al. (Operations Research Letters 2013) propose a heuristic genetic algorithm and Blum and Blesa introduce metaheuristic algorithms (International Conference on Artificial Evolution 2013 and Evolutionary Computation in Combinatorial Optimization 2016). In this paper we design and test several new heuristic algorithms for RFLCS. The first algorithm, uses dynamic programming and in our testing setup outperforms the algorithms of Adi et al. The second heuristic algorithm improves upon the first and becomes comparable to the state-of-the-art algorithms of Blum and Blesa. The third algorithm transforms the RFLCS instance into an instance of the Maximum Independent Set (MIS) problem with the same value of the optimum solution. Then, we apply known algorithms for the MIS problem. We also augment one of the approximation algorithms of Adi et al. and we prove that we achieve an approximation of factor (Formula presented). Finally, we introduce a new variant of the LCS problem, named Multiset Restricted Common Subsequence (MRCS), that is a generalization of RFLCS. We present an exact polynomial time algorithm for MRCS for constant size alphabet. Additionally, we show that MRCS admits a (Formula presented) approximation.",
author = "Mincu, {Radu Stefan} and Alexandru Popa",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-00479-8_24",
language = "English",
isbn = "9783030004781",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "297--310",
editor = "Travis Gagie and Alistair Moffat and Gonzalo Navarro and Ernesto Cuadros-Vargas",
booktitle = "String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings",
address = "Germany",

}

TY - GEN

T1 - Better heuristic algorithms for the repetition free LCS and other variants

AU - Mincu, Radu Stefan

AU - Popa, Alexandru

PY - 2018/1/1

Y1 - 2018/1/1

N2 - In Discrete Applied Mathematics 2010, Adi et al. introduce and study a variant of the well known Longest Common Subsequence problem, named Repetition Free Longest Common Subsequence (RFLCS). In RFLCS the input consists of two strings A and B over an alphabet Σ and the goal is to find the longest common subsequence containing only distinct characters from Σ. Adi et al. prove that the problem is APX -hard and show three approximation algorithms. Castelli et al. (Operations Research Letters 2013) propose a heuristic genetic algorithm and Blum and Blesa introduce metaheuristic algorithms (International Conference on Artificial Evolution 2013 and Evolutionary Computation in Combinatorial Optimization 2016). In this paper we design and test several new heuristic algorithms for RFLCS. The first algorithm, uses dynamic programming and in our testing setup outperforms the algorithms of Adi et al. The second heuristic algorithm improves upon the first and becomes comparable to the state-of-the-art algorithms of Blum and Blesa. The third algorithm transforms the RFLCS instance into an instance of the Maximum Independent Set (MIS) problem with the same value of the optimum solution. Then, we apply known algorithms for the MIS problem. We also augment one of the approximation algorithms of Adi et al. and we prove that we achieve an approximation of factor (Formula presented). Finally, we introduce a new variant of the LCS problem, named Multiset Restricted Common Subsequence (MRCS), that is a generalization of RFLCS. We present an exact polynomial time algorithm for MRCS for constant size alphabet. Additionally, we show that MRCS admits a (Formula presented) approximation.

AB - In Discrete Applied Mathematics 2010, Adi et al. introduce and study a variant of the well known Longest Common Subsequence problem, named Repetition Free Longest Common Subsequence (RFLCS). In RFLCS the input consists of two strings A and B over an alphabet Σ and the goal is to find the longest common subsequence containing only distinct characters from Σ. Adi et al. prove that the problem is APX -hard and show three approximation algorithms. Castelli et al. (Operations Research Letters 2013) propose a heuristic genetic algorithm and Blum and Blesa introduce metaheuristic algorithms (International Conference on Artificial Evolution 2013 and Evolutionary Computation in Combinatorial Optimization 2016). In this paper we design and test several new heuristic algorithms for RFLCS. The first algorithm, uses dynamic programming and in our testing setup outperforms the algorithms of Adi et al. The second heuristic algorithm improves upon the first and becomes comparable to the state-of-the-art algorithms of Blum and Blesa. The third algorithm transforms the RFLCS instance into an instance of the Maximum Independent Set (MIS) problem with the same value of the optimum solution. Then, we apply known algorithms for the MIS problem. We also augment one of the approximation algorithms of Adi et al. and we prove that we achieve an approximation of factor (Formula presented). Finally, we introduce a new variant of the LCS problem, named Multiset Restricted Common Subsequence (MRCS), that is a generalization of RFLCS. We present an exact polynomial time algorithm for MRCS for constant size alphabet. Additionally, we show that MRCS admits a (Formula presented) approximation.

UR - http://www.scopus.com/inward/record.url?scp=85054885077&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054885077&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-00479-8_24

DO - 10.1007/978-3-030-00479-8_24

M3 - Conference contribution

AN - SCOPUS:85054885077

SN - 9783030004781

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 297

EP - 310

BT - String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Proceedings

A2 - Gagie, Travis

A2 - Moffat, Alistair

A2 - Navarro, Gonzalo

A2 - Cuadros-Vargas, Ernesto

PB - Springer Verlag

ER -