Speech recognition application using deep learning neural network

Akzharkyn Izbassarova, Aziza Duisembay, Alex James Pappachen

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Deep neural networks (DNNs) have demonstrated great potential in speech recognition systems. This chapter presents two successful implementations of DNN-based speech recognition. The first is a DNN model developed by Apple for its personal assistant, Siri. To detect and recognize the phrase “Hey Siri”, the program runs a detector based on a 5-layer network with 32 and 192 hidden units. The acoustic model uses sigmoid and softmax activation functions together with a recurrent network. The second is a region-based convolutional recurrent neural network (R-CRNN) designed by Amazon for rare sound detection in home speakers; this system is used in a security package called Alexa Guard. Running complex machine learning algorithms with efficient power and memory utilization requires special hardware, so the chapter also describes the hardware solutions used in mobile phones and home speakers to process complex DNN models.
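To make the detector architecture in the abstract more concrete, here is a minimal sketch of a feed-forward pass through sigmoid hidden layers followed by a softmax output. It is not the authors' or Apple's implementation. It assumes that "5-layer" means five hidden layers of equal width and that 32 and 192 are two alternative widths; the input size (`N_INPUTS`), the number of output classes (`N_CLASSES`), the random untrained weights, and all function names are hypothetical. The recurrent part mentioned in the abstract is left out.

```python
import math
import random

N_INPUTS = 40    # hypothetical size of one acoustic feature vector
N_CLASSES = 20   # hypothetical number of output sound classes


def sigmoid(x):
    """Logistic activation, used here for the hidden layers."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(weights, biases, inputs):
    """Affine map: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def build_detector(n_inputs, hidden_units, n_classes, n_layers=5, seed=0):
    """Stack n_layers hidden layers of equal width plus an output layer."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [hidden_units] * n_layers + [n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, features):
    """Sigmoid on every hidden layer, softmax on the output layer."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = [sigmoid(z) for z in dense(weights, biases, activations)]
    weights, biases = layers[-1]
    return softmax(dense(weights, biases, activations))


# Two variants that differ only in hidden width (32 vs. 192 units).
small = build_detector(N_INPUTS, hidden_units=32, n_classes=N_CLASSES)
large = build_detector(N_INPUTS, hidden_units=192, n_classes=N_CLASSES)

frame = [0.1] * N_INPUTS          # placeholder feature vector, not real audio
probs = forward(small, frame)     # one probability per hypothetical class
```

The softmax output can be read as a probability distribution over sound classes for one frame of audio. Under the assumptions above, a narrow network is cheaper to evaluate and a wide one has more capacity, which is the power and memory trade-off the abstract points to.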

Original language: English
Title of host publication: Modeling and Optimization in Science and Technologies
Publisher: Springer Verlag
Pages: 69-79
Number of pages: 11
DOIs: https://doi.org/10.1007/978-3-030-14524-8_5
Publication status: Published - Jan 1 2020

Publication series

Name: Modeling and Optimization in Science and Technologies
Volume: 14
ISSN (Print): 2196-7326
ISSN (Electronic): 2196-7334

Fingerprint

Neural Networks (Computer)
Speech Recognition
Neural Network Model
Learning
Neural Networks
Hardware
Recurrent Networks
Acoustic Model
Cell Phones
Apple
Activation Function
Recurrent Neural Networks
Mobile Phone
Acoustics
Complex Networks

ASJC Scopus subject areas

  • Modelling and Simulation
  • Medical Assisting and Transcription
  • Applied Mathematics

Cite this

Izbassarova, A., Duisembay, A., & James Pappachen, A. (2020). Speech recognition application using deep learning neural network. In Modeling and Optimization in Science and Technologies (pp. 69-79). (Modeling and Optimization in Science and Technologies; Vol. 14). Springer Verlag. https://doi.org/10.1007/978-3-030-14524-8_5

@inbook{eb72a5acf5bd4261bf2ec9ae0a97f3ec,
title = "Speech recognition application using deep learning neural network",
author = "Akzharkyn Izbassarova and Aziza Duisembay and {James Pappachen}, Alex",
year = "2020",
month = "1",
day = "1",
doi = "10.1007/978-3-030-14524-8_5",
language = "English",
series = "Modeling and Optimization in Science and Technologies",
publisher = "Springer Verlag",
pages = "69--79",
booktitle = "Modeling and Optimization in Science and Technologies",
address = "Germany",
}

TY - CHAP

T1 - Speech recognition application using deep learning neural network

AU - Izbassarova, Akzharkyn

AU - Duisembay, Aziza

AU - James Pappachen, Alex

PY - 2020/1/1

Y1 - 2020/1/1

UR - http://www.scopus.com/inward/record.url?scp=85064747169&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064747169&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-14524-8_5

DO - 10.1007/978-3-030-14524-8_5

M3 - Chapter

T3 - Modeling and Optimization in Science and Technologies

SP - 69

EP - 79

BT - Modeling and Optimization in Science and Technologies

PB - Springer Verlag

ER -