VITA Search - An Intelligent Multimodal Search and Archive System for Online Media Resources

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper we present work on an intelligent multimodal search and archive system that applies our scientific findings on Kazakh and Russian speech recognition, language identification, and spoken term detection. The paper describes the goals and objectives, the architecture, and the subsystem modules of the developed system. The VITA Search system accurately locates the time at which the requested spoken information occurs in Kazakh and Russian audio from various broadcast channels. The speech recognition unit uses the Kaldi toolkit to generate lattices from the raw audio data. An acoustic model trained using deep neural networks gives strong recognition results: the word error rate on the training set was 3.86 for Kazakh speech and 9.85 for Russian speech. Moreover, we integrated a language identification model trained using Long Short-Term Memory Recurrent Neural Networks in order to select the correct recognition model for the input audio. For spoken term detection, we applied word-based and proxy-based approaches to search for keyword terms in the lattices.
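The word error rate quoted in the abstract is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` is hypothetical, and returning a percentage is an assumption about how the figures above are scaled.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER as a percentage (assumed scaling, see lead-in)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because the denominator is the reference length, a hypothesis with many insertions can score above 100.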

Original language: English
Title of host publication: 13th IEEE International Conference on Application of Information and Communication Technologies, AICT 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728139005
DOI: https://doi.org/10.1109/AICT47866.2019.8981781
Publication status: Published - Oct 2019
Event: 13th IEEE International Conference on Application of Information and Communication Technologies, AICT 2019 - Baku, Azerbaijan
Duration: Oct 23 2019 to Oct 25 2019

Publication series

Name: 13th IEEE International Conference on Application of Information and Communication Technologies, AICT 2019 - Proceedings

Conference

Conference: 13th IEEE International Conference on Application of Information and Communication Technologies, AICT 2019
Country: Azerbaijan
City: Baku
Period: 10/23/19 to 10/25/19

Keywords

  • automatic language identification
  • intelligent search system
  • speech recognition
  • spoken term detection

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Signal Processing
  • Decision Sciences (miscellaneous)


Cite this

Kozhirbayev, Z., Yessenbayev, Z., & Myrzakhmetov, B. (2019). VITA Search - An Intelligent Multimodal Search and Archive System for Online Media Resources. In 13th IEEE International Conference on Application of Information and Communication Technologies, AICT 2019 - Proceedings [8981781] (13th IEEE International Conference on Application of Information and Communication Technologies, AICT 2019 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/AICT47866.2019.8981781