A flexible and scalable audio information retrieval system for mixed-type audio signals

Ebru Doǧan, Mustafa Sert, Adnan Yazici

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

The content-based classification and retrieval of real-world audio clips is one of the challenging tasks in multimedia information retrieval. Although the problem has been well studied in the last two decades, most current retrieval systems cannot provide flexible querying of audio clips because audio information in the real world takes a mixed-type form (e.g., speech over music and speech over environmental sound). We present here a complete, scalable, and extensible content-based classification and retrieval system for mixed-type audio clips. The system lets users query audio data semantically and flexibly in four alternative ways: querying by mixed-type audio classes, querying by domain-based fuzzy classes, querying by temporal information and temporal relationships, and querying by example (QBE). To reduce retrieval time, a hash-based indexing technique is introduced. Two kinds of experiments were conducted on the audio tracks of the TRECVID news broadcasts to evaluate the performance of the proposed system. The results demonstrate that the Audio Spectrum Flatness feature in the MPEG-7 standard performs better on music audio samples than on other kinds of audio samples, and that the system is robust under different conditions.
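The abstract names the Audio Spectrum Flatness feature without defining it. As a rough illustration only, spectral flatness is commonly computed as the ratio of the geometric mean to the arithmetic mean of a set of power-spectrum coefficients; MPEG-7 applies this kind of measure per frequency band. The sketch below assumes a single band whose power values are already available. The function name and its input format are hypothetical and are not taken from the article.

```python
import math


def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of power-spectrum values.

    `power` is an assumed input: a non-empty sequence of strictly positive
    power coefficients for one frequency band of one audio frame.
    """
    if not power or any(p <= 0 for p in power):
        raise ValueError("power values must be non-empty and strictly positive")
    # Compute the geometric mean in the log domain to avoid underflow.
    log_mean = sum(math.log(p) for p in power) / len(power)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(power) / len(power)
    return geometric_mean / arithmetic_mean


if __name__ == "__main__":
    flat_band = [1.0] * 8            # noise-like: energy spread evenly
    peaky_band = [1e-6] * 7 + [1.0]  # tone-like: energy in one coefficient
    print(spectral_flatness(flat_band))
    print(spectral_flatness(peaky_band))
```

Values near 1 indicate a noise-like band and values near 0 indicate a tonal band, which is one plausible reason such a feature separates music from other audio.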

Original language: English
Pages (from-to): 952-970
Number of pages: 19
Journal: International Journal of Intelligent Systems
Volume: 26
Issue number: 10
DOI: 10.1002/int.20508
Publication status: Published - Oct 1 2011

Fingerprint

Information retrieval systems
Information Retrieval
Motion Picture Experts Group standards
Retrieval
Information retrieval
Music
Experiments
Acoustic waves
MPEG-7
Flatness
Indexing
Broadcast
Experiment
Multimedia
Evaluate
Alternatives
Demonstrate

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Human-Computer Interaction
  • Artificial Intelligence

Cite this

A flexible and scalable audio information retrieval system for mixed-type audio signals. / Doǧan, Ebru; Sert, Mustafa; Yazici, Adnan.

In: International Journal of Intelligent Systems, Vol. 26, No. 10, 01.10.2011, p. 952-970.

@article{5e551ee7183a4fecbb5b8ff315512c00,
title = "A flexible and scalable audio information retrieval system for mixed-type audio signals",
abstract = "The content-based classification and retrieval of real-world audio clips is one of the challenging tasks in multimedia information retrieval. Although the problem has been well studied in the last two decades, most of the current retrieval systems cannot provide flexible querying of audio clips due to the mixed-type form (e.g., speech over music and speech over environmental sound) of audio information in real world. We present here a complete, scalable, and extensible content-based classification and retrieval system for mixed-type audio clips. The system gives users an opportunity for flexible querying of audio data semantically by providing four alternative ways, namely, querying by mixed-type audio classes, querying by domain-based fuzzy classes, querying by temporal information and temporal relationships, and querying by example (QBE). In order to reduce the retrieval time, a hash-based indexing technique is introduced. Two kinds of experiments were conducted on the audio tracks of the TRECVID news broadcasts to evaluate the performance of the proposed system. The results obtained from our experiments demonstrate that the Audio Spectrum Flatness feature in MPEG-7 standard performs better in music audio samples compared to other kinds of audio samples and the system is robust under different conditions.",
author = "Ebru Doǧan and Mustafa Sert and Adnan Yazici",
year = "2011",
month = "10",
day = "1",
doi = "10.1002/int.20508",
language = "English",
volume = "26",
pages = "952--970",
journal = "International Journal of Intelligent Systems",
issn = "0884-8173",
publisher = "John Wiley and Sons Ltd",
number = "10"
}

TY - JOUR

T1 - A flexible and scalable audio information retrieval system for mixed-type audio signals

AU - Doǧan, Ebru

AU - Sert, Mustafa

AU - Yazici, Adnan

PY - 2011/10/1

Y1 - 2011/10/1

AB - The content-based classification and retrieval of real-world audio clips is one of the challenging tasks in multimedia information retrieval. Although the problem has been well studied in the last two decades, most of the current retrieval systems cannot provide flexible querying of audio clips due to the mixed-type form (e.g., speech over music and speech over environmental sound) of audio information in real world. We present here a complete, scalable, and extensible content-based classification and retrieval system for mixed-type audio clips. The system gives users an opportunity for flexible querying of audio data semantically by providing four alternative ways, namely, querying by mixed-type audio classes, querying by domain-based fuzzy classes, querying by temporal information and temporal relationships, and querying by example (QBE). In order to reduce the retrieval time, a hash-based indexing technique is introduced. Two kinds of experiments were conducted on the audio tracks of the TRECVID news broadcasts to evaluate the performance of the proposed system. The results obtained from our experiments demonstrate that the Audio Spectrum Flatness feature in MPEG-7 standard performs better in music audio samples compared to other kinds of audio samples and the system is robust under different conditions.

UR - http://www.scopus.com/inward/record.url?scp=81155159648&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=81155159648&partnerID=8YFLogxK

U2 - 10.1002/int.20508

DO - 10.1002/int.20508

M3 - Article

VL - 26

SP - 952

EP - 970

JO - International Journal of Intelligent Systems

JF - International Journal of Intelligent Systems

SN - 0884-8173

IS - 10

ER -