Conference Paper
... Both of these techniques have performed well when detecting cuts but resulted in poor detection performance for gradual transitions. Tahaghoghi et al. [2002] proposed a ranking technique in a moving window of frames for effective cut detection. This approach applies the concepts of Query-By-Example (QBE) and ranking to the video segmentation problem and has been shown to be very effective with a feature derived from the wavelet transform of the frame data [Tahaghoghi et al., 2002]. ...
... Tahaghoghi et al. [2002] proposed a ranking technique in a moving window of frames for effective cut detection. This approach applies the concepts of Query-By-Example (QBE) and ranking to the video segmentation problem and has been shown to be very effective with a feature derived from the wavelet transform of the frame data [Tahaghoghi et al., 2002]. This feature is generated by computing the six-tap Daubechies wavelet transform coefficients from the native YCbCr colour data of a frame. ...
... The feature is therefore called the Wavelet transform for Re-ordered data (RWav) and is relatively expensive to compute. With this feature, their system outperformed most other systems in cut detection that participated in TRECVID 2002, but produced poor results in gradual transition detection [Tahaghoghi et al., 2002]. ...
... In this paper, we present our techniques for shot boundary detection and video search. In the shot boundary detection task, we used the moving query window technique that we successfully applied in previous years [17] [18] [21] [22]. The focus of our experiments this year was to test our algorithm on a wider range of video content. ...
... At TRECVID 2005 we used a two-pass implementation of our moving query window algorithm [17] [23]. This did not exhibit any improvement over the one-pass algorithm that we used in 2004 [21] [22]. ...
Article
Run overview: We participated in the shot boundary detection and video search tasks. This page provides a summary of our experiments. Shot Boundary Detection: Our approach uses the moving query window technique [17, 18, 21, 22]. We applied the system that we used in 2004 [22] and varied algorithm parameters around the optimal settings that we obtained with training runs on the TRECVID 2005 test set. The results of all runs are close together at a high standard in terms of recall and precision, but we could not match the performance that we achieved in previous years. In particular, precision for gradual transition detection has suffered significantly in all runs. We use localised HSV colour histograms with 16 regions and 32 bins per dimension. Our system uses different weights for histogram regions when computing frame differences. We observe decreased performance compared to previous years because of many falsely reported gradual transitions. Cut detection performance suffered due to high brightness levels in most video clips. Some of the fixed thresholds that we use do not allow the algorithm to adapt well to different types of footage. Our system performed best on videos that are similar to the 2004 and 2005 test sets, such as those from CNN or NBC. Video Search: We combine visual high-level concept terms with an index built from the speech transcripts in an early-fusion approach. We experimented with expanding concept terms by lexical semantic referencing before combining them with the speech transcripts. All runs are fully automatic search runs. Table 1 shows an overview of the submitted runs. We used different inverted indexes built using text from speech transcripts (T); semantic high-level concept terms (S); and terms from expanding the concept terms using lexical semantic referencing (E). In Run 2, the system automatically used a text-based index (T) for person-x queries or a combined index (T+S+E) for other queries.
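A minimal sketch of the localised histogram feature described in the abstract above, written in Python. It is an illustration under stated assumptions rather than the authors' implementation: the 16 regions are assumed to form a 4x4 grid, each region is assumed to hold three one-dimensional 32-bin histograms (one per HSV component), and the frame difference is assumed to be a weighted sum of per-region L1 distances. The names `localised_histogram` and `frame_difference`, and the weight values passed in, are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # (h, s, v), each component in [0, 1]
Frame = List[List[Pixel]]           # rows of pixels


def localised_histogram(frame: Frame, grid: int = 4, bins: int = 32) -> List[List[float]]:
    """Return one normalised histogram per region (grid * grid regions).

    Assumption: each region stores three concatenated 1-D histograms,
    one per HSV component, with `bins` bins each.
    """
    rows, cols = len(frame), len(frame[0])
    hists = [[0.0] * (3 * bins) for _ in range(grid * grid)]
    counts = [0] * (grid * grid)
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            # Map the pixel position to its region in the grid.
            region = (y * grid // rows) * grid + (x * grid // cols)
            counts[region] += 1
            for channel, value in enumerate(pixel):
                b = min(int(value * bins), bins - 1)
                hists[region][channel * bins + b] += 1.0
    # Normalise by the pixel count so regions of different size are comparable.
    for region, n in enumerate(counts):
        if n:
            hists[region] = [c / n for c in hists[region]]
    return hists


def frame_difference(a: Sequence[Sequence[float]],
                     b: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> float:
    """Weighted sum of per-region L1 distances (the distance is an assumption)."""
    return sum(
        w * sum(abs(p - q) for p, q in zip(ha, hb))
        for w, ha, hb in zip(weights, a, b)
    )
```

Giving lower weights to some regions is one way to make the difference less sensitive to activity in those parts of the frame; the abstract does not say which regions are weighted, or by how much.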
... Our approach to shot boundary detection uses the moving query window technique [8] [9] [10]. In 2005, we have applied a new implementation of our system and experimented with different feature representations. ...
... We use our moving query window approach previously presented at TRECVID [8] [11] [12]. However, this year we have used a new implementation of this method and experimented with different histogram representations. ...
... For cut detection, we use our ranking-based method [9]. This has been proven to work very effectively [8] with features derived from the Daubechies wavelet transform [3]; however, computation of wavelets is expensive. In 2003, to reduce computational cost, we used the ranking-based method in combination with one-dimensional, global histograms in the HSV colour space [12]. ...
Article
Run overview: We participated in the Shot Boundary Detection task. This page provides a summary of: (1) the approaches tested in the submitted runs; (2) differences in results between the runs; (3) the overall relative contribution of the techniques; and (4) our overall conclusions.

1. Our approach to shot boundary detection uses the moving query window technique [8, 9, 10]. In 2005, we have applied a new implementation of our system and experimented with different feature representations. We submitted ten runs using only visual features, exploring different colour histogram representations. The first two runs were used as a baseline in which we have used our system as it was applied in 2004, with the settings as in our best runs of that year (Run 3 and Run 5) [11]. An overview of all submitted runs is shown in Table 1.

Run   Feature (cuts)   Feature (gradual)   hws (cuts)   hws (gradual)   Threshold method
1     HSVl             HSVl                6            14              old
2     HSVl             HSVl                8            16              old
3     HSVl             HSVl                8            14              new
4     HSVl             HSVl                10           16              new
5     HSV3             HSVl                8            14              new
6     HSV3             HSVl                10           16              new
7     HSV3             HSV3                8            14              old
8     HSV3             HSV3                10           16              old
9     HSVl             HSV3                8            14              old
10    HSVl             HSV3                10           16              old

Table 1: Overview over our ten submitted runs in 2005, the features that we have used, and variations in the settings for the half-window size (hws). Runs 1 and 2 were carried out with our 2004 system and serve as a baseline.

2. In our submissions we have tested a new implementation of our system that is designed as a two-pass algorithm, rather than the single-pass algorithm used in previous years. We have applied different combinations of a localised HSV histogram (HSVl) feature and a true three-dimensional colour histogram (HSV3) representation in Run 3 through to Run 10. We have also implemented a new dynamic threshold computation that was applied in Run 3 through to Run 6. This comes into effect during gradual transition detection and is designed to minimise the number of false positives in clips with few transitions.

3. Our three-dimensional colour histogram expresses each colour as a point in the three-dimensional space. While this representation has been shown to produce promising results in content-based image retrieval, performance gains are often outweighed by computational overhead. Due to the type of footage in 2005, the new threshold computation has had only very limited influence on the results.

4. The baseline runs which performed very well in 2004 were again our best runs. Despite improved results during training on the 2003 test set with our new implementation, we could not achieve improvements on the 2005 test set. We see the reasons for this mainly in the limited training that we were able to undertake with our new two-pass algorithm and the different feature combinations.

[Figure 1 (labels: pre frames, post frames, moving window, current frame, frame number): Moving query window with a half-window size (hws) of 5. The five frames before and the five frames after the current frame form a collection on which the current frame is used as a query example.]
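The moving query window described above can be sketched in Python as follows. This is a simplified illustration, not the authors' system: each frame is reduced to a single number, the distance is an absolute difference, and the decision rule (flag a cut when the number of pre-frames ranked in the top half of the window drops from at least `upper` to at most `lower` between consecutive frames) is an assumption. The function names, `upper`, and `lower` are hypothetical.

```python
from typing import Callable, List, Sequence


def abs_difference(a: float, b: float) -> float:
    """Toy distance between two scalar frame features."""
    return abs(a - b)


def num_pre_frames(features: Sequence[float], current: int, hws: int,
                   distance: Callable[[float, float], float] = abs_difference) -> int:
    """Rank the 2 * hws surrounding frames by similarity to the current frame
    and count how many of the top hws results are pre-frames."""
    window = list(range(current - hws, current)) + \
             list(range(current + 1, current + hws + 1))
    # sorted() is stable, so ties keep window order (pre-frames first).
    ranked = sorted(window, key=lambda i: distance(features[current], features[i]))
    return sum(1 for i in ranked[:hws] if i < current)


def detect_cuts(features: Sequence[float], hws: int, upper: int, lower: int,
                distance: Callable[[float, float], float] = abs_difference) -> List[int]:
    """Report frame numbers where the pre-frame count falls sharply.

    Assumed rule: the count was >= upper at the previous frame and is
    <= lower at the current frame, i.e. the current frame suddenly
    resembles the frames after it rather than the frames before it.
    """
    cuts = []
    previous = None
    for current in range(hws, len(features) - hws):
        count = num_pre_frames(features, current, hws, distance)
        if previous is not None and previous >= upper and count <= lower:
            cuts.append(current)
        previous = count
    return cuts
```

In this sketch, the first `hws` and last `hws` frames of a clip are never used as query frames because they lack a full window on one side.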
... The effectiveness of this approach for cut detection has been demonstrated with the collections of the TREC Video Retrieval Evaluation [26, 27, 29]. However, without modification, this scheme is less effective on gradual transitions. ...
... False detections are regarded as less problematic, as they can be filtered out in later processing steps [19]: We developed our algorithm using the shot boundary detection task subset of the TREC-10 video collection. Detailed results of blind runs of TREC-12, including comparison with other approaches, appear elsewhere [29]. We have since improved our technique through further training on the TREC-11 and TREC-12 test sets. ...
... Our algorithm performs well relative to comparable systems. In the TREC-12 shot boundary detection task, an earlier implementation was among the better-performing systems, and obtained the highest precision of all participants for gradual transitions [29]. It achieved average recall, above-average frame precision, and the best results for frame recall. ...
Conference Paper
Segmenting digital video into its constituent basic semantic entities, or shots, is an important step for effective management and retrieval of video data. Recent automated techniques for detecting transitions between shots are highly effective on abrupt transitions. However, automated detection of gradual transitions, and the precise determination of the corresponding start and end frames, remains problematic. In this paper, we present a gradual transition detection approach based on average frame similarity and adaptive thresholds. We report good detection results on the TREC video track collections - particularly for dissolves and fades - and very high accuracy in identifying transition boundaries. Our technique is a valuable new tool for transition detection.
... When we near an abrupt transition, NumPreFrames rises above the upper threshold (ub). ...
[Figure 6 (labels: HWS, DMZ, current frame): Moving query window with a half window size (hws) of 8, and a demilitarised zone (dmz) of three frames on either side of the current frame; the eight frames preceding and the eight frames following the current frame form a collection, against which the current frame is used as a query example. Figure reproduced from [28].]
Article
Digital video is widely used in multimedia databases and requires effective retrieval techniques. Shot boundary detection is a common first step in analysing video content. The effective detection of gradual transitions is an especially difficult task. Building upon our past research work, we have designed a novel decision stage for detection of gradual transitions. Its strength lies particularly in the accurate detection of gradual transition boundaries. In this paper, we describe our moving query window method and discuss its performance in the context of the TREC-12 shot boundary detection task. We believe this approach is a valuable contribution to video retrieval and worth pursuing in the future.
... This is a problem we all face with our home video collections; it is a far more pressing issue for content providers and defence intelligence organisations. This paper incorporates work from TREC-11 (Tahaghoghi, Thom & Williams 2002) and TREC-12 (Volkmer, Tahaghoghi, Thom & Williams 2003). ...
Conference Paper
Segmentation is the first step in managing data for many information retrieval tasks. Automatic audio transcriptions and digital video footage are typically continuous data sources that must be pre-processed for segmentation into logical entities that can be stored, queried, and retrieved. Shot boundary detection is a common low-level video segmentation technique, where a video stream is divided into shots that are typically composed of similar frames. In this paper, we propose a new technique for finding cuts (abrupt transitions that delineate shots) that combines evidence from a fixed size window of video frames. We experimentally show that our techniques are accurate using the well-known TREC experimental testbed.
Article
Digital libraries have become necessary vehicles that provide users with powerful and easy-to-use tools for searching, browsing, and retrieving media information. The starting point for these endeavors is the same: segmentation of video material into shots. The aim of this study is to segment MPEG video streams into shots. A fully automatic detection for both abrupt and gradual transitions (dissolve and fade groups) with minimal decoding in real time is developed. Each detection process was explored through two phases: macroblock type analysis in bidirectional predictive pictures (B-frames), and on-demand intensity information analysis. The abrupt transition detection is explored first by examining the number of forward and backward macroblocks (p- and b-MBs) in consecutive B-frames, and then an intensity histogram comparison is applied to confirm detected transitions. The gradual transition is detected first by examining the intracoded predicted macroblocks (i-MBs) within successive B-frames, and then the detection is confirmed by checking the parabolic shape of the frame variances of the candidate sequence. Results of the study show a remarkable detection rate for both abrupt and gradual transitions. (C) 2008 SPIE and IS&T.
Article
In this paper, we first discuss several video shot detection technologies. An edited video contains two kinds of shot boundaries, known as straight cuts and optical cuts. Experimental results using a variety of videos are presented to demonstrate that the moving window detection algorithm and the 10-step difference histogram comparison algorithm are effective for detecting both kinds of shot cuts. After shot isolation, methods for shot characterization were investigated. We present a detailed discussion of key-frame extraction and review the visual features of key-frames, particularly the color feature based on the HSV model. Video retrieval methods based on key-frames are presented at the end of this section. This paper also presents an integrated system solution for computer-assisted video parsing and content-based video retrieval. The application software package was programmed on the Visual C++ development platform.
Article
Multiresolution representations are very effective for analyzing the information content of images. We study the properties of the operator which approximates a signal at a given resolution. We show that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^j$ can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$. In $L^2(\mathbb{R})$, a wavelet orthonormal basis is a family of functions $\left(\sqrt{2^j}\,\psi(2^j x - n)\right)_{(j,n) \in \mathbb{Z}^2}$, which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. For images, the wavelet representation differentiates several spatial orientations. We study the application of this representation to data compression in image coding, texture discrimination and fractal analysis.
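The pyramidal algorithm mentioned in the abstract above can be illustrated with a short Python sketch. The abstract does not name a specific filter pair, so the sketch assumes the Haar filters, the simplest quadrature mirror pair, as a stand-in. The function names `haar_step` and `haar_pyramid` are hypothetical, and the input length is assumed to be a power of two.

```python
import math
from typing import List, Optional, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One pyramid level: filter with the Haar low-pass and high-pass
    filters and keep every second sample.

    Returns (approximation, detail). The detail coefficients hold the
    information lost when moving to the next coarser resolution.
    """
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_pyramid(signal: Sequence[float],
                 levels: Optional[int] = None) -> Tuple[List[float], List[List[float]]]:
    """Apply haar_step repeatedly to the approximation.

    Returns the coarsest approximation and the detail coefficients of
    each level, finest level first.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details: List[List[float]] = []
    while len(approx) > 1 and (levels is None or len(details) < levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details
```

Because the Haar basis is orthonormal, the sum of squared coefficients across the final approximation and all detail levels equals the sum of squared samples of the input, which is a convenient sanity check on any implementation.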
Article
An algorithm is proposed for the detection of abrupt scene change and special editing effects such as dissolve in a compressed MPEG/MPEG-2 bitstream with minimal decoding of the bitstream. Scene changes are easily detected with DCT DC coefficients and motion vectors. By performing minimal decoding on the compressed bitstream, the processing speed for searching a video database of compressed image sequences can be dramatically improved. In addition, the algorithm may also be applied in video scene browsing and video indexing as well.
Article
Various methods of automatic shot boundary detection have been proposed and claimed to perform reliably. Although the detection of edits is fundamental to any kind of video analysis since it segments a video into its basic components, the shots, only few comparative investigations on early shot boundary detection algorithms have been published. These investigations mainly concentrate on measuring the edit detection performance, but do not consider the algorithms' ability to classify the types and to locate the boundaries of the edits correctly. This paper extends these comparative investigations. More recent algorithms designed explicitly to detect specific complex editing operations such as fades and dissolves are taken into account, and their ability to classify the types and locate the boundaries of such edits is examined. The algorithms' performance is measured in terms of hit rate, number of false hits, and miss rate for hard cuts, fades, and dissolves over a large and diverse set of video sequences. The experiments show that while hard cuts and fades can be detected reliably, dissolves are still an open research issue. The false hit rate for dissolves is usually unacceptably high, ranging from 50% up to over 400%. Moreover, all algorithms seem to fail under roughly the same conditions.
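The three measures named in the abstract above can be computed with a few lines of Python. The sketch below assumes that all three rates are normalised by the number of true edits; the abstract does not define them, but this reading is consistent with false hit rates above 100%. The function name `evaluate`, the greedy one-to-one matching, and the `tolerance` parameter (in frames) are hypothetical choices for illustration.

```python
from typing import Dict, Sequence


def evaluate(truth: Sequence[int], detected: Sequence[int],
             tolerance: int = 0) -> Dict[str, float]:
    """Compare detected edit positions with ground-truth positions.

    A detection counts as a hit if it lies within `tolerance` frames of a
    true edit that has not been matched yet; otherwise it is a false hit.
    True edits left unmatched are misses.
    """
    if not truth:
        raise ValueError("at least one true edit is required")
    unmatched = sorted(truth)
    hits = 0
    false_hits = 0
    for position in sorted(detected):
        match = next((t for t in unmatched if abs(t - position) <= tolerance), None)
        if match is None:
            false_hits += 1
        else:
            unmatched.remove(match)
            hits += 1
    n = len(truth)
    return {
        "hit_rate": hits / n,
        "miss_rate": len(unmatched) / n,
        "false_hit_rate": false_hits / n,  # may exceed 1.0
    }
```

Under this normalisation, a detector that reports three spurious edits in a clip with two true edits has a false hit rate of 150%, even if it finds both true edits.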
Article
The aim of this paper is to provide an introduction to the subject of wavelet analysis for engineering applications. The paper selects from the recent mathematical literature on wavelets the results necessary to develop wavelet-based numerical algorithms. In particular, we provide extensive details of the derivation of Mallat's transform and Daubechies' wavelet coefficients, since these are fundamental to gaining an insight into the properties of wavelets. The potential benefits of using wavelets are highlighted by presenting results of our research in one- and two-dimensional data analysis and in wavelet solutions of partial differential equations.
Article
In the QBIC (Query By Image Content) project we are studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content we use include color, texture, shape, position, and dominant edges of image objects and regions. Potential applications include medical (Give me other images that contain a tumor with a texture like this one), photo-journalism (Give me images that have blue at the top and red at the bottom), and many others in art, fashion, cataloging, retailing, and industry. We describe a set of novel features and similarity measures allowing query by image content, together with the QBIC system we implemented. We demonstrate the effectiveness of our system with normalized precision and recall experiments on test databases containing over 1000 images and 1000 objects populated from commercially available photo clip art images, and of images of airplane silhouettes. We also present new methods for efficient processing of QBIC queries that consist of filtering and indexing steps. We specifically address two problems: (a) non-Euclidean distance measures; and (b) the high dimensionality of feature vectors. For the first problem, we introduce a new theorem that makes efficient filtering possible by bounding the non-Euclidean, full cross-term quadratic distance expression with a simple Euclidean distance. For the second, we illustrate how orthogonal transforms, such as Karhunen-Loeve, can help reduce the dimensionality of the search space. Our methods are general and allow some false hits but no false dismissals. The resulting QBIC system offers effective retrieval using image content, and for large image databases significant speedup over straightforward indexing alternatives. The system is implemented in X/Motif and C running on an RS/6000.