1. INTRODUCTION

With advances in both communication and multimedia technologies, there is a critical need to have

visual management systems or user-friendly tools to assist in information retrieval from digital

multimedia databases. The salient features that could be used to define visual content include, for

example, pixel intensity, colour, texture, shape and motion. Of these, motion is the most obvious

and effective feature for providing global and local understanding, as well as for describing the dynamic

content within a video sequence. The extraction of motion parameters from video sequences has,

therefore, been one of the key elements in a range of applications from computer vision through to

Fast Block-Based True Motion Estimation Using Distance

Dependent Thresholds

Golam Sorwar

School of Multimedia and Information Technology

Southern Cross University

Coffs Harbour, NSW 2457, Australia

Email: gsorwar@scu.edu.au

Fax: +61-2-6659-3612

Manzur Murshed and Laurence Dooley

Gippsland School of Computing and Information Technology

Monash University

Churchill, Vic 3842, Australia

Email: {Manzur.Murshed,Laurence.Dooley}@infotech.monash.edu.au

Fax: +61-3-9902-6879

A fast motion estimation algorithm, called distance-dependent thresholding search

(DTS), is presented for block-based true motion estimation applications. It

introduces the novel concept of variable distance-dependent thresholds. The

performance of the DTS algorithm is analysed and quantitatively compared with

both the traditional exhaustive full-search (FS) technique and the computationally

faster, non-exhaustive three-step-search (TSS) algorithm. Experimental

results show that by applying an appropriate threshold function, the DTS algorithm

not only matches the speed of the TSS algorithm, but also retains a block distortion

error comparable to the global minimum produced by the FS algorithm, and avoids

the problem of identifying a large number of spurious motion vectors in the search

process.

ACM Classification: I.4 (Image processing and computer vision)

Manuscript received: 3 June 2003

Communicating Editor: Robyn Owens

Copyright © 2004, Australian Computer Society Inc. General permission to republish, but not for profit, all or part of this

material is granted, provided that the JRPIT copyright notice is given and that reference is made to the publication, to its

date of issue, and to the fact that reprinting privileges were granted by permission of the Australian Computer Society Inc.

Journal of Research and Practice in Information Technology, Vol. 36, No. 3, August 2004

popular video compression standards such as the Moving Picture Experts Group (MPEG-1/2/4)

family.

Motion in video sequences may generally be categorised by: camera movement, the movement

of objects within a frame, and movement of both camera and objects. Many different motion

estimation algorithms have been proposed, including pel-recursive (Robbins and Netravali, 1983;

Walker and Rao, 1984), block-matching (Jain and Jain, 1981), and the optical flow-based method

(Horn and Schunck, 1981; Lucas and Kanade, 1981). The block-matching algorithm (BMA) has

proved to be very popular because of its simplicity, robustness, and ease of implementation. The

algorithm, which estimates motion on a block-by-block basis, has been widely exploited in video

coding standards such as MPEG-1/2 and H.261/263. One important feature (Dufaux and Moscheni,

1995) of the BMA is that it exhibits superior performance for larger-sized pixel block

displacements.

The exhaustive BMA, known as the full search (FS) algorithm, searches each candidate block

for the closest match within the entire search region to minimise the block-distortion measure

(BDM). The BDM of image blocks may be measured using various criteria, such as the mean

absolute error (MAE), the mean square error (MSE), and the matching pel-count (MPC).
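
As an illustration, the three criteria can be sketched in Python as follows. The function names, the representation of a block as a list of pixel rows, and the `pel_threshold` parameter of the MPC are assumptions made for this sketch rather than definitions taken from the paper.

```python
def _pairs(block_a, block_b):
    """Yield corresponding pixel pairs of two equally sized blocks (lists of rows)."""
    for row_a, row_b in zip(block_a, block_b):
        yield from zip(row_a, row_b)


def mae(block_a, block_b):
    """Mean absolute error: average of |a - b| over all pixel pairs."""
    diffs = [abs(a - b) for a, b in _pairs(block_a, block_b)]
    return sum(diffs) / len(diffs)


def mse(block_a, block_b):
    """Mean square error: average of (a - b)^2 over all pixel pairs."""
    diffs = [(a - b) ** 2 for a, b in _pairs(block_a, block_b)]
    return sum(diffs) / len(diffs)


def mpc(block_a, block_b, pel_threshold):
    """Matching pel-count: number of pixel pairs with |a - b| <= pel_threshold.

    Unlike MAE and MSE, a larger count indicates a better match.
    """
    return sum(1 for a, b in _pairs(block_a, block_b)
               if abs(a - b) <= pel_threshold)
```

MAE and MSE are distortion measures to be minimised, whereas the MPC is a similarity count to be maximised, so a search using the MPC would select the candidate with the largest value.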

Since the FS algorithm exhaustively searches for a global minimum block-difference error for

each candidate block, it generally provides the lowest possible distortion error of any BMA. The

algorithm, however, suffers from two major drawbacks. Its exhaustive nature inevitably means it is very

computationally expensive and, in addition, the algorithm tends to capture many false motion

vectors even when there is no object motion within the search region. This is because the

distortion of an object in a video frame is related to its velocity as well as to the zoom factor of the

camera and therefore, as the length of a motion vector grows, so does the block difference distortion

error. Although this observation has very little impact when the algorithm is used for video coding,

severe artifacts can arise when the algorithm is applied to estimate the true motion vectors, where

object and/or camera motion is present.

A number of fast non-exhaustive block matching approaches have been proposed including the

three-step search algorithm (TSS) by Koga et al (1981), the new three-step search algorithm

(NTSS) by Li et al (1994), the 2D-logarithmic search algorithm (2DLOG) by Jain and Jain (1981),

the four-step search algorithm (4SS) by Po and Ma (1996), and the cross-search algorithm by

Ghanbari (1990). Of these, TSS has been recommended by both the Reference Model 8 (RM8) of

CCITT and Simulation Model 3 (SM3) of MPEG, because of its simplicity, regularity and

performance (Po and Ma, 1996; Kim and Choi, 1998). It is still considered today to be one of the best

algorithms against which to compare the performance of new fast algorithms (Zhou and Chen,

2001; Lai and Wong, 2002; Nam et al, 2000).

All the aforementioned fast algorithms have been based upon the assumption that the BDM

increases as the checking points move away from the global minimum. According to Chow and Liou

(1993), however, this assumption does not hold true for real world video sequences. Any directional

search algorithm can, therefore, be ambiguous and converge to one of the local minima. Moreover,

none of the above fast algorithms addresses the key issue of avoiding the capture of significant numbers

of spurious motion vectors in the search process (Dufaux and Moscheni, 1995).

This paper directly addresses these issues by introducing a new distance-dependent thresholding

search (DTS) algorithm, which not only avoids picking a large number of false motion vectors, but

also simultaneously exhibits the characteristics of a fast search and low BDM. It is our assumption

that true motion, caused by moving objects, will produce an error surface that would stretch the

search even with thresholding.

The paper is structured as follows. Section 2 explains the FS and the TSS algorithms, while the

novel distance dependent threshold search (DTS) algorithm using both linear and exponential

thresholding functions, is described in Section 3. Experimental results to verify the performance of

the DTS algorithm in terms of both its search speed and corresponding BDM error measure are

presented in Section 4, which also discusses the selection of the threshold function and related

parameters, as well as explaining how the DTS algorithm avoids a large number of spurious motion

vectors in the search process. Conclusions are provided in Section 5.

2. BLOCK MATCHING ALGORITHMS

Block-based motion estimation algorithms assume that objects are rigid and move translationally

for at least a few frames, and that occlusion of one object by another, and uncovered

background, can be neglected. In a block-matching algorithm, the current frame is divided into

equi-sized, non-overlapping small rectangular blocks of M × N pixels, as shown in Figure 1. Throughout

the paper, the pixels in a frame are numbered using the Cartesian coordinate system with the origin

being in the upper-left corner. $B_n(k,l)$ denotes the M × N sized block containing all the pixels $p_{x,y}$ of

frame number n, where k ≤ x < k + N and l ≤ y < l + M.

For each block of the current frame n, a motion vector is obtained by finding a suitably matched

block, within the search window, of the next frame n+1. For example, in Figure 1, block $B_{n+1}(k+u, l+v)$

of the next frame is suitably matched with the block $B_n(k,l)$ of the current frame, so that the

motion vector for the block $B_n(k,l)$ is computed as (u, v).
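
The indexing convention above is easy to misread, since a block spans N pixels horizontally and M pixels vertically. The short Python sketch below makes it concrete. It assumes a frame stored as a list of pixel rows, so that pixel (x, y) is `frame[y][x]`; the function name `get_block` is hypothetical.

```python
def get_block(frame, k, l, M, N):
    """Return block B(k, l): all pixels (x, y) with k <= x < k + N and l <= y < l + M.

    The frame is a list of rows with the origin in the upper-left corner,
    so the block has M rows (vertical extent) and N columns (horizontal extent).
    """
    return [row[k:k + N] for row in frame[l:l + M]]
```

With this convention, a motion vector (u, v) for block (k, l) means that the best match in the next frame is `get_block(next_frame, k + u, l + v, M, N)`.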

2.1 The full search (FS) algorithm

In selecting a suitably matched block, the FS algorithm searches the entire search region for a block

such that the BDM is a global minimum. If more than one block generates a minimum BDM, the

FS algorithm selects the block whose motion vector has the smallest magnitude, in order to exploit

the centre-biased motion-vector distribution characteristics of a real-world video sequence (Li et al,

1994; Po and Ma, 1996). To achieve this, checking points are used in a spiral trajectory starting at

the centre of the search region. If the maximum displacement of a motion vector in both the

horizontal and vertical directions is ±d pixels, the total number of search points used to locate the

motion vector for each block can be as high as $(2d+1)^2$. The spiral trajectory of the checking points

used by the FS algorithm with the maximum displacement, d = 7, is shown in Figure 2.
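
A minimal sketch of the exhaustive search is given below, under stated assumptions. Frames are lists of pixel rows, the BDM is the MAE, and candidates are visited in order of increasing motion-vector magnitude as a stand-in for the spiral trajectory of Figure 2, whose exact order is not reproduced here. Because only a strictly smaller error replaces the current best, ties resolve to the smallest-magnitude vector. All function names are hypothetical.

```python
def get_block(frame, k, l, M, N):
    """Block with top-left pixel (k, l), M rows and N columns."""
    return [row[k:k + N] for row in frame[l:l + M]]


def mae(block_a, block_b):
    """Mean absolute error between two equally sized blocks."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def full_search(cur, nxt, k, l, M, N, d):
    """Exhaustively search displacements in [-d, d] x [-d, d].

    Returns (motion_vector, minimum_mae, number_of_checking_points).
    """
    height, width = len(nxt), len(nxt[0])
    target = get_block(cur, k, l, M, N)
    # Centre-outwards order (assumption: sorted by squared vector length).
    candidates = sorted(
        ((u, v) for u in range(-d, d + 1) for v in range(-d, d + 1)),
        key=lambda mv: mv[0] ** 2 + mv[1] ** 2)
    best_mv, best_err, checked = None, float("inf"), 0
    for u, v in candidates:
        x, y = k + u, l + v
        if x < 0 or y < 0 or x + N > width or y + M > height:
            continue  # candidate block falls outside the frame
        checked += 1
        err = mae(target, get_block(nxt, x, y, M, N))
        if err < best_err:  # strict comparison keeps the earlier, shorter vector
            best_mv, best_err = (u, v), err
    return best_mv, best_err, checked
```

For d = 7 and a block far enough from the frame boundary, the loop evaluates all $(2 \cdot 7 + 1)^2 = 225$ checking points.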

Figure 1: Frame-block coordinate system

2.2 The three step search (TSS) algorithm

The TSS algorithm is based on a coarse-to-fine approach with logarithmically decreasing step sizes

as shown in the example of Figure 3, which has a maximum displacement d = 7.

The initial step size is ⌈d/2⌉ (4 for d = 7), where d is the maximum motion displacement. At each step, nine

checking points are matched and the point with the minimum BDM is chosen as the starting centre

of the next step. It is straightforward to prove that the total number of checking points used with

maximum motion d is $1 + 8\log_2(d+1)$.
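
The coarse-to-fine procedure can be sketched as follows. This is an illustrative version that assumes d + 1 is a power of two and that the BDM is supplied as a callable `cost(u, v)`, a hypothetical interface chosen to keep the sketch independent of any frame representation. The initial step is taken as (d + 1) // 2 so that d = 7 gives the step sizes 4, 2 and 1.

```python
def three_step_search(cost, d):
    """Coarse-to-fine search with step sizes (d + 1) / 2, ..., 2, 1.

    cost(u, v) returns the block distortion of motion vector (u, v).
    Returns (motion_vector, minimum_cost, number_of_checking_points).
    """
    centre = (0, 0)
    best = cost(0, 0)
    evaluations = 1
    step = (d + 1) // 2
    while step >= 1:
        best_point = centre
        # Eight neighbours of the current centre at the current step size.
        for du in (-step, 0, step):
            for dv in (-step, 0, step):
                if du == 0 and dv == 0:
                    continue  # the centre has already been evaluated
                u, v = centre[0] + du, centre[1] + dv
                c = cost(u, v)
                evaluations += 1
                if c < best:
                    best, best_point = c, (u, v)
        centre = best_point
        step //= 2
    return centre, best, evaluations
```

Each step adds eight new checking points to the single centre point, so for d = 7 the sketch evaluates 1 + 8 × 3 = 25 points, compared with 225 for the full search over the same range.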

3. THE DISTANCE DEPENDENT THRESHOLDING SEARCH (DTS) ALGORITHM

In the FS algorithm, the suitability of a block match is measured based on the optimal (minimum)

BDM. FS works well when there is no distortion, but as alluded to in Section 1, the level of distortion

in any video frame increases with the velocity of the moving objects and/or the zoom factor used

Figure 2: The spiral trajectory of the checking points in the FS algorithm

Figure 3: Three step search path (• first step, ■ second step, ♦ third step)

by the camera. Locating a block with the minimum difference, but with a motion vector of high

magnitude, is both ineffectual in the prevailing distorted search space, and may also lead to many

false motion vectors being erroneously selected. The exhaustive FS algorithm, therefore, becomes

increasingly inefficient as the spiral trajectory (search pattern) expands.

The basis for the solution proposed in this paper is that the suitability measure of the FS

algorithm is relaxed from the optimal criterion as the spiral search trajectory moves away from the centre,

and becomes distance dependent, thereby exploiting the aforementioned observation. This gives rise to the

Distance-dependent Thresholding Search (DTS) algorithm.

Definition 1: Search Squares $SS_i$. The search space with maximum displacement d, centred at

pixel $p_{c_x,c_y}$, can be divided into d+1 mutually exclusive concentric search squares $SS_i$, for all 0 ≤ i

≤ d, such that a checking point at pixel $p_{x,y}$ is in $SS_k$ if and only if $\max(|x - c_x|, |y - c_y|) = k$, for all

$c_x - d \le x \le c_x + d$ and $c_y - d \le y \le c_y + d$.
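
Definition 1 translates directly into code. The sketch below enumerates the checking points of one search square; the function name and the row-by-row enumeration order are assumptions of the sketch, since the definition fixes only membership and not the order in which points are visited.

```python
def search_square(i, cx=0, cy=0):
    """Checking points (x, y) of search square i centred at (cx, cy).

    A point belongs to square i if and only if max(|x - cx|, |y - cy|) == i.
    """
    return [(x, y)
            for y in range(cy - i, cy + i + 1)
            for x in range(cx - i, cx + i + 1)
            if max(abs(x - cx), abs(y - cy)) == i]
```

The first three squares contain 1, 8 and 16 points respectively. In general, square i > 0 is a square of side 2i + 1 with the inner square of side 2i - 1 removed, giving $(2i+1)^2 - (2i-1)^2 = 8i$ points, and the squares for 0 ≤ i ≤ d together cover all $(2d+1)^2$ positions of the search space exactly once.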

The checking points used in the first three search squares $SS_0$, $SS_1$ and $SS_2$ are clearly shown in

Figure 4. From this figure it can be easily identified that

(1)

Figure 4 legend: • checking points in $SS_0$, ■ checking points in $SS_1$, ♦ checking points in $SS_2$

3.1 The Formal DTS Algorithm

Like all block-based motion estimation search techniques, the DTS algorithm starts at the centre of

the search space. The search then progresses outwards by using the search squares, $SS_i$, in order while

monitoring the current minimum MAE. A parametric thresholding function, Threshold(i), is used to

determine the various thresholds to be used in the search involving each $SS_i$. After searching each

$SS_i$, the current minimum MAE is compared against the threshold value of that specific search

square and the search is terminated if this MAE value is not higher than the threshold value. The

DTS algorithm is formally presented in Figure 5.
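
The description above can be sketched in Python as follows. This is an informal illustration, not a transcription of Figure 5. It assumes that the MAE is supplied as a callable `cost(u, v)` and the thresholding function as a callable `threshold(i)`, both hypothetical interfaces, and that points within a search square are visited row by row. No particular form of Threshold(i) is assumed.

```python
def search_square(i):
    """Displacements (u, v) with max(|u|, |v|) == i, i.e. search square i."""
    return [(u, v)
            for v in range(-i, i + 1)
            for u in range(-i, i + 1)
            if max(abs(u), abs(v)) == i]


def dts_search(cost, d, threshold):
    """Distance-dependent thresholding search over search squares 0..d.

    cost(u, v) returns the MAE of motion vector (u, v);
    threshold(i) returns the threshold applied after searching square i.
    Returns (motion_vector, minimum_mae, number_of_checking_points).
    """
    best_mv, best_err, evaluations = None, float("inf"), 0
    for i in range(d + 1):
        for u, v in search_square(i):
            err = cost(u, v)
            evaluations += 1
            if err < best_err:
                best_mv, best_err = (u, v), err
        # Stop as soon as the current minimum is not higher than the threshold.
        if best_err <= threshold(i):
            break
    return best_mv, best_err, evaluations
```

If the MAE at the centre is already within `threshold(0)`, as would be expected for a static block, the search stops after a single checking point. If no square satisfies its threshold, all d + 1 squares are searched and the result coincides with the minimum over the whole search space.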

Figure 4: DTS search squares