ABSTRACT: Retrieving videos with keywords requires obtaining the semantic features of the videos. Most work reported in the literature annotates a video shot with a fixed number of keywords, regardless of how much information the shot contains. In this paper, we propose a new approach that automatically annotates a video shot with an adaptive number of keywords according to the richness of the video content. A semantic candidate set (SCS) of fixed size is first discovered using visual features. The final annotation set, which contains a variable number of keywords, is then obtained from the SCS by Bayesian inference, which combines static and dynamic inference to remove irrelevant candidate keywords. We have applied our approach to video retrieval, and the experiments demonstrate that retrieval using our annotation approach outperforms retrieval using a fixed number of annotation words.
Pattern Recognition, 2008. ICPR 2008. 19th International Conference on; 01/2009
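The abstract above describes selecting a variable-size annotation set from a fixed-size SCS. As a hedged illustration only (the paper's actual static/dynamic Bayesian inference is not specified here), the sketch below approximates that selection by thresholding invented posterior scores, so a content-rich shot keeps more keywords than a simple one; all keyword names and scores are hypothetical.

```python
# Hypothetical sketch: adaptive annotation from a fixed-size semantic
# candidate set (SCS). Each candidate keyword carries a posterior score
# (values invented); the paper's Bayesian inference is approximated by
# a simple threshold, so the final annotation set has no fixed size.

def annotate_adaptive(scs, threshold=0.5):
    """Keep only candidates whose posterior exceeds the threshold,
    ordered from most to least probable."""
    ranked = sorted(scs.items(), key=lambda kv: kv[1], reverse=True)
    return [kw for kw, p in ranked if p > threshold]

# A content-rich shot retains several keywords...
rich_shot = {"sky": 0.9, "beach": 0.8, "people": 0.7, "boat": 0.6, "car": 0.2}
# ...while a simpler shot retains only one.
simple_shot = {"sky": 0.9, "beach": 0.3, "people": 0.1, "boat": 0.05, "car": 0.02}

print(annotate_adaptive(rich_shot))    # → ['sky', 'beach', 'people', 'boat']
print(annotate_adaptive(simple_shot))  # → ['sky']
```

This is only meant to show why an adaptive set size is useful: both shots pass through the same fixed-size SCS, yet the pruning step leaves annotation sets of different lengths.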
Proceedings of the 7th ACM International Conference on Image and Video Retrieval, CIVR 2008, Niagara Falls, Canada, July 7-9, 2008; 01/2008
Advances in Multimedia Modeling, 13th International Multimedia Modeling Conference, MMM 2007, Singapore, January 9-12, 2007. Proceedings, Part I; 01/2007
Computer Vision/Computer Graphics Collaboration Techniques, Third International Conference, MIRAGE 2007, Rocquencourt, France, March 28-30, 2007, Proceedings; 01/2007
Advanced Data Mining and Applications, Second International Conference, ADMA 2006, Xi'an, China, August 14-16, 2006, Proceedings; 01/2006
Advances in Image and Video Technology, First Pacific Rim Symposium, PSIVT 2006, Hsinchu, Taiwan, December 10-13, 2006, Proceedings; 01/2006
ABSTRACT: Automatically extracting the semantic annotation of a video shot is an important task, since this high-level semantic information can improve the performance of video retrieval. In this paper, we propose a novel approach that automatically annotates a new video shot with a non-fixed number of concepts. The process is carried out in three steps. First, the semantic importance degree (SID) is introduced, and a simple method is proposed to extract the semantic candidate set (SCS) while accounting for the SIDs of several concepts co-occurring in the same shot. Second, a semantic network is constructed using an improved K2 algorithm. Finally, the final annotation set is chosen by Bayesian inference. Experimental results show that our method significantly improves the performance of automatically annotating a new video shot compared with classical classifiers such as Naïve Bayes and K-Nearest Neighbor.
Pages 447-466;
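The three-step pipeline in this abstract (SID-based candidate extraction, a learned semantic network, Bayesian pruning) can be sketched in miniature. This is an assumption-laden toy, not the paper's method: the SID scores and co-occurrence values are invented, and mean pairwise co-occurrence stands in for the K2-learned Bayesian network when deciding which candidates to drop.

```python
# Hypothetical three-step sketch (all scores and concept names invented):
# 1) rank concepts by semantic importance degree (SID) to build a
#    fixed-size SCS;
# 2) use pairwise co-occurrence statistics as a crude stand-in for the
#    semantic network the paper learns with an improved K2 algorithm;
# 3) prune candidates that get weak support from the other candidates.

def build_scs(sid_scores, k=4):
    """Step 1: fixed-size candidate set of the k highest-SID concepts."""
    return sorted(sid_scores, key=sid_scores.get, reverse=True)[:k]

def prune(scs, cooccur, threshold=0.3):
    """Step 3: keep a candidate only if its mean co-occurrence with the
    rest of the SCS (a proxy for Bayesian-network support) is high enough."""
    kept = []
    for c in scs:
        others = [o for o in scs if o != c]
        support = sum(cooccur.get(frozenset((c, o)), 0.0) for o in others) / len(others)
        if support >= threshold:
            kept.append(c)
    return kept

sid = {"sky": 0.9, "beach": 0.8, "boat": 0.6, "office": 0.5, "desk": 0.1}
cooccur = {frozenset(("sky", "beach")): 0.8,
           frozenset(("sky", "boat")): 0.5,
           frozenset(("beach", "boat")): 0.6,
           frozenset(("office", "sky")): 0.05}

scs = build_scs(sid)        # → ['sky', 'beach', 'boat', 'office']
print(prune(scs, cooccur))  # → ['sky', 'beach', 'boat'] ('office' is pruned)
```

Note how the final set size is a consequence of the inference, not a preset parameter: 'office' makes the fixed-size SCS on raw SID alone but is discarded once the surrounding concepts fail to support it.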