Inference Algorithms for Pattern-Based CRFs on Sequence Data

Vladimir Kolmogorov, Rustem Takhanov

Research output: Contribution to journalArticle

Abstract

We consider Conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) x1… xn is the sum of terms over intervals [i, j] where each term is non-zero only if the substring xi… xj equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) computing the MAP. Their complexities are respectively O(nL) , O(nLℓmax) and O(nLmin{|D|,log(ℓmax+1)}) where L is the combined length of input patterns, ℓmax is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009) whose complexities are respectively O(nL| D|) , O(n|Γ|L2ℓmax2) and O(nL| D|) , where

Original languageEnglish
Pages (from-to)17-46
Number of pages30
JournalAlgorithmica
Volume76
Issue number1
DOIs
Publication statusPublished - Sep 1 2016

Keywords

  • Conditional random fields
  • Sequence tagging
  • String algorithms

ASJC Scopus subject areas

  • Computer Science(all)
  • Computer Science Applications
  • Applied Mathematics

Fingerprint Dive into the research topics of 'Inference Algorithms for Pattern-Based CRFs on Sequence Data'. Together they form a unique fingerprint.

  • Cite this