### Abstract

We consider Conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) x_{1}… x_{n} is the sum of terms over intervals [i, j] where each term is non-zero only if the substring x_{i}… x_{j} equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) computing the MAP. Their complexities are respectively O(nL) , O(nLℓ_{max}) and O(nLmin{|D|,log(ℓmax+1)}) where L is the combined length of input patterns, ℓ_{max} is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009) whose complexities are respectively O(nL| D|) , O(n|Γ|L2ℓmax2) and O(nL| D|) , where

Original language | English |
---|---|

Pages (from-to) | 17-46 |

Number of pages | 30 |

Journal | Algorithmica |

Volume | 76 |

Issue number | 1 |

DOIs | |

Publication status | Published - Sep 1 2016 |

### Keywords

- Conditional random fields
- Sequence tagging
- String algorithms

### ASJC Scopus subject areas

- Computer Science(all)
- Computer Science Applications
- Applied Mathematics

## Fingerprint Dive into the research topics of 'Inference Algorithms for Pattern-Based CRFs on Sequence Data'. Together they form a unique fingerprint.

## Cite this

*Algorithmica*,

*76*(1), 17-46. https://doi.org/10.1007/s00453-015-0017-7