Automatic threat classification without human intervention is a popular research topic in wireless multimedia sensor networks (WMSNs), especially in the context of surveillance applications. This paper explores the effect of fusing audio-visual multimedia data with scalar data collected by the sensor nodes in a WMSN for energy-efficient and accurate object detection and classification. To this end, we implemented a wireless multimedia sensor node with video and audio capturing and processing capabilities in addition to traditional scalar sensors. To save energy, the multimedia sensors are kept in sleep mode until they are activated by the scalar sensors, which are always active. The object recognition results obtained from the video and audio applications are fused to increase the object recognition performance of the sensor node. Final results are forwarded to the sink in text format, which greatly reduces the size of the data transmitted in the network. Performance test results of the implemented prototype system show that fusing audio data with visual data significantly improves the automatic object recognition capability of a sensor node. Since auditory data requires less processing power than visual data, the overhead of processing the auditory data is low, which helps to extend the network lifetime of WMSNs.
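The node behaviour summarised above (always-on scalar sensors waking the multimedia sensors, fusion of the audio and video recognition results, and a short text report to the sink) can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so the weighted sum below is only an assumed stand-in, and the names `fuse_scores`, `handle_scalar_reading`, the weights, and the threshold are all hypothetical.

```python
# Hypothetical values: the abstract gives no weights or trigger threshold.
VIDEO_WEIGHT = 0.6
AUDIO_WEIGHT = 0.4
SCALAR_THRESHOLD = 0.5


def fuse_scores(video_scores, audio_scores,
                video_weight=VIDEO_WEIGHT, audio_weight=AUDIO_WEIGHT):
    """Weighted-sum fusion of per-class confidence scores (assumed rule).

    Both arguments map a class label to a confidence in [0, 1].
    A label missing from one modality contributes 0 for that modality.
    """
    labels = set(video_scores) | set(audio_scores)
    fused = {
        label: video_weight * video_scores.get(label, 0.0)
        + audio_weight * audio_scores.get(label, 0.0)
        for label in labels
    }
    # Sorting first makes tie-breaking deterministic (alphabetical order).
    best = max(sorted(fused), key=fused.get)
    return best, fused[best]


def handle_scalar_reading(reading, classify_video, classify_audio,
                          threshold=SCALAR_THRESHOLD):
    """Return the text message for the sink, or None if nothing is sent.

    The classifier callables stand in for the multimedia sensors: they are
    only invoked (woken up) when the scalar reading reaches the threshold.
    """
    if reading < threshold:
        return None  # multimedia sensors stay asleep
    label, score = fuse_scores(classify_video(), classify_audio())
    # Only a short text result is forwarded, not raw audio or video.
    return f"{label}:{score:.2f}"


if __name__ == "__main__":
    message = handle_scalar_reading(
        0.9,
        classify_video=lambda: {"human": 0.5, "vehicle": 0.5},
        classify_audio=lambda: {"human": 0.9, "vehicle": 0.1},
    )
    print(message)
```

In this example the video scores alone are ambiguous, and the audio scores tip the fused decision toward one class, which is the kind of gain the abstract attributes to audio-visual fusion.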
- Object detection
- Visual and auditory data fusion
- Wireless multimedia sensor
ASJC Scopus subject areas
- Electrical and Electronic Engineering