A flexible and scalable audio information retrieval system for mixed-type audio signals

Ebru Doǧan, Mustafa Sert, Adnan Yazici

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


The content-based classification and retrieval of real-world audio clips is one of the challenging tasks in multimedia information retrieval. Although the problem has been well studied over the last two decades, most current retrieval systems cannot provide flexible querying of audio clips because real-world audio information is of mixed type (e.g., speech over music and speech over environmental sound). We present here a complete, scalable, and extensible content-based classification and retrieval system for mixed-type audio clips. The system lets users query audio data semantically in four alternative ways: querying by mixed-type audio classes, querying by domain-based fuzzy classes, querying by temporal information and temporal relationships, and querying by example (QBE). To reduce retrieval time, a hash-based indexing technique is introduced. Two kinds of experiments were conducted on the audio tracks of the TRECVID news broadcasts to evaluate the performance of the proposed system. The results demonstrate that the Audio Spectrum Flatness feature in the MPEG-7 standard performs better on music audio samples than on other kinds of audio samples, and that the system is robust under different conditions.
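As an illustration of the kind of feature the abstract refers to, the sketch below computes a generic spectral flatness value: the ratio of the geometric mean to the arithmetic mean of a set of power-spectrum coefficients. It is a minimal sketch of that general definition only; it is not the normative MPEG-7 Audio Spectrum Flatness extraction (which is computed per frequency band) and not the authors' implementation. The function name `spectral_flatness`, the `eps` floor, and the sample values are assumptions made for illustration.

```python
import math


def spectral_flatness(power_spectrum, eps=1e-12):
    """Return geometric mean / arithmetic mean of the power values.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tonal spectrum. `eps` is an assumed floor that
    keeps the logarithm defined for zero-power coefficients.
    """
    values = [max(p, eps) for p in power_spectrum]
    # Geometric mean computed in the log domain for numerical stability.
    log_mean = sum(math.log(v) for v in values) / len(values)
    geometric_mean = math.exp(log_mean)
    arithmetic_mean = sum(values) / len(values)
    return geometric_mean / arithmetic_mean


# Hypothetical power coefficients for two bands.
flat_band = [1.0, 1.0, 1.0, 1.0]         # equal power everywhere
peaky_band = [100.0, 0.01, 0.01, 0.01]   # one dominant coefficient

print(spectral_flatness(flat_band))
print(spectral_flatness(peaky_band))
```

The flat band yields a value close to 1 and the peaky band a value close to 0, which is the contrast a classifier could exploit when separating tonal from noise-like content.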

Original language: English
Pages (from-to): 952-970
Number of pages: 19
Journal: International Journal of Intelligent Systems
Issue number: 10
Publication status: Published - Oct 1 2011

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Human-Computer Interaction
  • Artificial Intelligence
