Implementation of the intelligent voice system for Kazakh

Zhandos Yessenbayev, Nurbek Saparkhojayev, Timur Tibeyev

Research output: Contribution to journal, Article

Abstract

Modern speech technologies are highly advanced and widely used in day-to-day applications. However, they mostly serve the languages of well-developed countries, such as English, German, Japanese and Russian. For Kazakh, the situation is less advanced, and research in this field is only starting to evolve. In this research and application-oriented project, we introduce an intelligent voice system for the fast deployment of call centers and information desks supporting Kazakh speech. The demand for such a system is clear given the country's large size and small population: landline and cell phones are the only means of communication for distant villages and suburbs. The system features Kazakh speech recognition and synthesis modules as well as a web GUI for efficient dialog management. For speech recognition we use the CMU Sphinx engine, and for speech synthesis, MaryTTS. The web GUI is implemented in Java, enabling operators to quickly create and manage dialogs in a user-friendly graphical environment. Call routines are handled by the Asterisk PBX and the JBoss application server. The system supports technologies and protocols such as VoIP, VoiceXML, FastAGI, the Java Speech API and J2EE. For the speech recognition experiments, we compiled and used the first Kazakh speech corpus, with utterances from 169 native speakers. The speech recognizer achieves a word error rate (WER) of 4.1% on isolated word recognition and 6.9% on clean continuous speech recognition tasks. The speech synthesis experiments include the training of male and female voices.
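
For readers unfamiliar with the WER figures quoted in the abstract, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions; the record does not describe the scoring tool the authors actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level edit distance divided by the reference length.

    Illustrative helper only; not the authors' scoring code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a four-word reference.
    print(word_error_rate("open the account balance", "open that account"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.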

Original language: English
Pages (from-to): 814-816
Number of pages: 3
Journal: World Applied Sciences Journal
Volume: 28
Issue number: 6
DOI: 10.5829/idosi.wasj.2013.28.06.13814
Publication status: Published - 2013

Keywords

  • Dialog systems
  • Intelligent systems
  • Kazakh language
  • Speech recognition
  • Speech synthesis

ASJC Scopus subject areas

  • General

Cite this

Implementation of the intelligent voice system for Kazakh. / Yessenbayev, Zhandos; Saparkhojayev, Nurbek; Tibeyev, Timur.

In: World Applied Sciences Journal, Vol. 28, No. 6, 2013, p. 814-816.

@article{e8dd285e1bf04a6d92b5695cbb246f73,
title = "Implementation of the intelligent voice system for Kazakh",
keywords = "Dialog systems, Intelligent systems, Kazakh language, Speech recognition, Speech synthesis",
author = "Zhandos Yessenbayev and Nurbek Saparkhojayev and Timur Tibeyev",
year = "2013",
doi = "10.5829/idosi.wasj.2013.28.06.13814",
language = "English",
volume = "28",
pages = "814--816",
journal = "World Applied Sciences Journal",
issn = "1818-4952",
publisher = "International Digital Organization for Scientific Information",
number = "6",

}
