Speech recognition application using deep learning neural network

Akzharkyn Izbassarova, Aziza Duisembay, Alex James Pappachen

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

Deep neural networks (DNNs) have demonstrated great potential in speech recognition systems. This chapter presents two successful implementations of DNN-based speech recognition. The first is a DNN model developed by Apple for its personal assistant, Siri. To detect and recognize the “Hey Siri” phrase, the program runs a detector based on a 5-layer network with 32 and 192 hidden units. The acoustic model uses sigmoid and softmax activation functions together with a recurrent network. The second is a region-based convolutional recurrent neural network (R-CRNN) designed by Amazon for rare sound detection in home speakers. This system is used in a security package called Alexa Guard. Running complex machine learning algorithms with efficient power and memory utilization requires special hardware. The chapter also describes the hardware solutions used in mobile phones and home speakers to process complex DNN models.

Original language: English
Title of host publication: Modeling and Optimization in Science and Technologies
Publisher: Springer Verlag
Pages: 69-79
Number of pages: 11
DOIs: https://doi.org/10.1007/978-3-030-14524-8_5
Publication status: Published - Jan 1 2020

Publication series

Name: Modeling and Optimization in Science and Technologies
Volume: 14
ISSN (Print): 2196-7326
ISSN (Electronic): 2196-7334

ASJC Scopus subject areas

  • Modelling and Simulation
  • Medical Assisting and Transcription
  • Applied Mathematics

Cite this

Izbassarova, A., Duisembay, A., & James Pappachen, A. (2020). Speech recognition application using deep learning neural network. In Modeling and Optimization in Science and Technologies (pp. 69-79). (Modeling and Optimization in Science and Technologies; Vol. 14). Springer Verlag. https://doi.org/10.1007/978-3-030-14524-8_5