TY - GEN
T1 - Identification of coreferential chains in video texts for semantic annotation of news videos
AU - Küçük, Dilek
AU - Yazici, Adnan
PY - 2008
Y1 - 2008
AB - People can benefit from today's huge video archives only through appropriate and effective ways of querying the video data. In order to query the video data through high-level semantic entities such as objects, events, and relations, these entities should be properly extracted and the corresponding video shots should be annotated accordingly. Video texts, which comprise the caption texts on the frames as well as transcription texts obtained through automatic speech recognition techniques, are valuable sources of information for semantic modeling of the videos. In this paper, we present an approach for the extraction of semantic objects from videos by utilizing lexical resources along with the identification of coreference chains in the corresponding video texts. Coreference is a phenomenon in natural language texts where a number of expressions in the text refer to the same real-world entity. Therefore, while the domain-specific lexical resources aid in the determination of salient entities in the video text, the identification of coreference chains prevents the superfluous extraction of the same underlying entities due to their different surface forms in the video texts. The proposed approach is significant as the first attempt to address the importance of the coreference phenomenon in video texts for precise entity extraction during the semantic modeling of news videos with a hands-on application. The approach has been evaluated on Turkish political news texts from the METU Turkish corpus. Evaluation problems encountered, such as the sparseness of annotated evaluation data for Turkish, are also pointed out, together with directions for further research.
UR - http://www.scopus.com/inward/record.url?scp=58449097890&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58449097890&partnerID=8YFLogxK
U2 - 10.1109/ISCIS.2008.4717886
DO - 10.1109/ISCIS.2008.4717886
M3 - Conference contribution
AN - SCOPUS:58449097890
SN - 9781424428816
T3 - 2008 23rd International Symposium on Computer and Information Sciences, ISCIS 2008
BT - 2008 23rd International Symposium on Computer and Information Sciences, ISCIS 2008
T2 - 2008 23rd International Symposium on Computer and Information Sciences, ISCIS 2008
Y2 - 27 October 2008 through 29 October 2008
ER -