Kazakh text summarization using fuzzy logic

Altanbek Zulkhazhav, Zhanibek Kozhirbayev, Zhandos Yessenbayev, Altynbek Sharipbay

Research output: Contribution to journal, Article

Abstract

In this paper, we present an extractive summarization method for the Kazakh language based on fuzzy logic. The method extracts important sentences from the source text and concatenates them to obtain a shorter version. With the rapid growth of information on the Internet, there is a demand for efficient and cost-effective summarization, so the creation of automatic summarization methods is considered an important task in natural language processing. Our approach first preprocesses the sentences by applying morphological analysis and pronoun resolution, in order to avoid rejecting them early. We then determine the features of the processed sentences needed to apply fuzzy logic methods. Since no data was available for this task, we collected our own dataset from various Kazakh-language Internet resources and annotated it manually for the experiments. We also applied our method to the CNN/Daily Mail dataset. ROUGE-N indicators were calculated to assess the quality of the proposed method. The ROUGE-L F-score of the proposed method with pronoun resolution is 0.40 on the Kazakh dataset and 0.38 on the CNN/Daily Mail dataset.
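For readers unfamiliar with the reported metric, the following is a minimal sketch of a sentence-level ROUGE-L F-score, computed from the longest common subsequence (LCS) of candidate and reference tokens. It is not the authors' evaluation code: the function names, the lowercased whitespace tokenization, and the default `beta=1.0` weighting are assumptions made for illustration, and the paper's exact ROUGE configuration is not stated in the abstract.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-score (hypothetical helper).

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

A morphologically rich language such as Kazakh would likely need stemming or lemmatization before token matching; the plain whitespace split here is only a placeholder for that step.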

Original language: English
Pages (from-to): 851-859
Number of pages: 9
Journal: Computación y Sistemas
Volume: 23
Issue number: 3
Publication status: Published - Jan 1 2019

Keywords

  • Extractive text summarization
  • Fuzzy logic
  • Natural language processing

ASJC Scopus subject areas

  • Computer Science (all)
