Generating expressive summaries for speech and musical audio using self-similarity clues

Mustafa Sert, Buyurman Baykal, Adnan Yazici

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

We present a novel algorithm for the structural analysis of audio that detects repetitive patterns suitable for content-based audio information retrieval systems; such patterns can provide valuable information about the content of audio, such as a chorus or a concept. The Audio Spectrum Flatness (ASF) feature of the MPEG-7 standard, although it has received less attention than other feature types, is utilized and evaluated as the underlying feature set. Expressive summaries are chosen as the longest patterns by the k-means clustering algorithm. The proposed approach is evaluated with the ASF feature on a test bed consisting of popular song and speech clips. The well-known Mel Frequency Cepstral Coefficients (MFCCs) are also considered in the experiments for feature evaluation. Experiments show that all repetitive patterns and their locations are obtained with an accuracy of 93% for music and 78% for speech.
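
To make the self-similarity idea concrete, the following is a minimal sketch and not the authors' implementation. It computes a simplified per-band spectral flatness (geometric mean over arithmetic mean of the power spectrum) for each frame and compares all frame pairs by cosine similarity. The function names, frame size, hop size, equal-width bands and the cosine measure are all assumptions made for illustration; the MPEG-7 ASF descriptor uses logarithmically spaced bands, and the abstract does not specify the similarity measure or how k-means selects the longest patterns.

```python
import numpy as np

def spectral_flatness_features(signal, frame_len=1024, hop=512, n_bands=8):
    """Per-frame, per-band flatness: geometric mean / arithmetic mean of power.

    Hypothetical simplification of ASF: equal-width bands instead of the
    logarithmically spaced bands used by MPEG-7.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    feats = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        # Small offset keeps the logarithm finite for silent bins.
        power = np.abs(np.fft.rfft(frame)) ** 2 + 1e-12
        bands = np.array_split(power[1:], n_bands)  # skip the DC bin
        for b, band in enumerate(bands):
            feats[i, b] = np.exp(np.mean(np.log(band))) / np.mean(band)
    return feats

def self_similarity(feats):
    """Cosine similarity between every pair of frame feature vectors."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.maximum(norms, 1e-12)
    return unit @ unit.T

# Toy signal: tone, noise, tone. The repeated tone plays the role of a
# repeated section such as a chorus.
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
noise = rng.standard_normal(sr)
signal = np.concatenate([tone, noise, tone])

feats = spectral_flatness_features(signal)
ssm = self_similarity(feats)
print(feats.shape, ssm.shape)
```

In a matrix like `ssm`, repeated sections tend to show up as regions of high similarity away from the main diagonal. Extracting those regions and ranking them by length is the step the abstract attributes to k-means clustering, which this sketch does not attempt to reproduce.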

Original language: English
Title of host publication: 2006 IEEE International Conference on Multimedia and Expo, ICME 2006 - Proceedings
Pages: 941-944
Number of pages: 4
DOIs
Publication status: Published - 2006
Event: 2006 IEEE International Conference on Multimedia and Expo, ICME 2006 - Toronto, ON, Canada
Duration: Jul 9 2006 → Jul 12 2006

Publication series

Name: 2006 IEEE International Conference on Multimedia and Expo, ICME 2006 - Proceedings
Volume: 2006

Conference

Conference: 2006 IEEE International Conference on Multimedia and Expo, ICME 2006
Country/Territory: Canada
City: Toronto, ON
Period: 7/9/06 → 7/12/06

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering
