Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology

Zhandos Yessenbayev, Zhanibek Kozhirbayev

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

We present preliminary work on the topological analysis of audio and text data for unsupervised speech processing. The work rests on the assumption that phoneme frequencies and contextual relationships are similar in the acoustic and text domains of the same language. This assumption makes it possible to construct a mapping between the two spaces that takes their geometric structure into account. As a first step, generative methods based on variational autoencoders were chosen to map the audio and text data into two latent vector spaces. In the next stage, persistent homology methods were used to analyze the topological structure of the two spaces. Although the results support the idea that the two spaces are similar, further research is needed to map the acoustic and text spaces correctly and to evaluate the actual effect of including topological information in the autoencoder training process.
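As a rough illustration of the second stage, the sketch below compares two point clouds by their 0-dimensional persistent homology. It is not the authors' implementation: the abstract does not say which tools or homology dimensions were used. The function names `h0_deaths` and `max_death_gap` and the toy "latent" points are hypothetical. In a Vietoris-Rips filtration every connected component is born at scale 0 and dies when an edge merges it into another component, so the finite death times are the edge lengths of a minimum spanning tree.

```python
import math
from itertools import combinations


def h0_deaths(points):
    """Finite death times of 0-dimensional classes in a Vietoris-Rips filtration.

    Computed with Kruskal's algorithm and union-find: each edge that merges
    two components kills one class at that edge's length.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 values in increasing order


def max_death_gap(cloud_a, cloud_b):
    """Largest gap between sorted death times of two equal-sized clouds.

    A crude similarity score (assumption of this sketch), not the
    bottleneck or Wasserstein distance between persistence diagrams.
    """
    da, db = h0_deaths(cloud_a), h0_deaths(cloud_b)
    if len(da) != len(db):
        raise ValueError("clouds must have the same number of points")
    return max((abs(x - y) for x, y in zip(da, db)), default=0.0)


# Toy stand-ins for latent vectors (hypothetical data, not from the paper).
audio_latents = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
# Same shape, translated and with the points listed in a different order.
text_latents = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (15.0, 15.0)]

gap = max_death_gap(audio_latents, text_latents)
```

Because the score depends only on pairwise distances, it ignores translation and point order, which is the property that makes this kind of comparison usable on unaligned data. Loops and higher-dimensional features would need a dedicated persistent homology library.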

Original language: English
Title of host publication: Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings
Editors: S.R. Mahadeva Prasanna, Alexey Karpov, K. Samudravijaya, Shyam S. Agrawal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 700-711
Number of pages: 12
ISBN (Print): 9783031209796
Publication status: Published - 2022
Event: 24th International Conference on Speech and Computer, SPECOM 2022 - Gurugram, India
Duration: Nov 14 2022 → Nov 16 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13721 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th International Conference on Speech and Computer, SPECOM 2022
Country/Territory: India
City: Gurugram
Period: 11/14/22 → 11/16/22

Keywords

  • Persistent homology and diagram
  • Topological data analysis
  • Unsupervised processing
  • Word embeddings

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
