# Rectification-Based View Interpolation and Extrapolation for Multiview Video Coding.


IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 6, JUNE 2011 693


Xiaoyu Xiu, Student Member, IEEE, Derek Pang, Student Member, IEEE, and Jie Liang, Member, IEEE

Abstract—In this paper, we first develop improved projective

rectification-based view interpolation and extrapolation methods,

and apply them to view synthesis prediction-based multiview

video coding (MVC). A geometric model for these view synthesis

methods is then developed. We also propose an improved model to

study the rate-distortion (R-D) performances of various practical

MVC schemes, including the current joint multiview video coding

standard. Experimental results show that our schemes achieve

superior view synthesis results, and can lead to better R-D

performance in MVC. Simulation results with the theoretical

models help explain the experimental results.

Index Terms—Multiview video coding, rate-distortion theory,

view extrapolation, view interpolation.

I. Introduction

Recent advances in computer, display, camera, and signal processing make it possible to deploy next generation visual communication services such as 3-D TV and free viewpoint video [1]. The former offers a 3-D depth impression of the observed scenery, while the latter further allows interactive selection of viewpoints and generation of new views from any viewpoint. Since multiple cameras are used to capture the scenes, efficient compression of the multiview video data is crucial to these services.

Many methods have been developed for multiview video coding (MVC), ranging from disparity compensated prediction to view synthesis prediction (VSP). In addition, the theoretical performance of some approaches has also been analyzed. In this section, we give a brief review of the practical and theoretical MVC works, and point out the contributions of this paper in these two aspects.

Manuscript received October 27, 2009; revised June 26, 2010; accepted November 7, 2010. Date of publication March 17, 2011; date of current version June 3, 2011. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada, under Grants RGPIN312262, EQPEQ330976-2006, STPGP350740-07, and STPGP380875-09. This paper was recommended by Associate Editor Y.-S. Ho.

X. Xiu and J. Liang are with the School of Engineering Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada (e-mail: xxa4@sfu.ca; jiel@sfu.ca).

D. Pang was with Simon Fraser University, Burnaby, BC V5A 1S6, Canada. He is now with Stanford University, Stanford, CA 94305 USA (e-mail: dcypang@stanford.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSVT.2011.2129230

A. Review of MVC Algorithms

A straightforward way to exploit the statistical dependencies among different viewpoints is to use disparity-compensated prediction. Similar to the motion-compensated prediction in single-view video coding, for each block in the current view,

the disparity compensation finds the best matched block in

its neighboring views and encodes the prediction residual. In

[2], the motion compensation and disparity compensation are

combined to encode stereo sequences. The concept of group of

group of pictures for inter-view prediction is introduced in [3],

which allows a picture to refer to decoded pictures of other

views even at different time instants. In [4] and [5], various

modified hierarchical B structures are developed for inter-view

prediction. One of them is implemented in the H.264-based

joint multiview video coding (JMVC) software [6], which uses

the hierarchical B structure in the temporal direction and the

I-B-P disparity prediction structure in the inter-view direction.

To reduce the complexity of finding the best matching, the

multiview geometry is employed in [7] to predict the disparity

values, but only multiview image coding is considered.

However, the translational inter-view motion assumed by the disparity compensation method cannot accurately represent the geometric relationships between different cameras; therefore, this method is not always efficient. For example, disparities larger than the search window size can frequently occur, due to different depths of an object in different views [5]. In addition, effects such as rotation and zooming are difficult to model as pure translational motion.
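The search-window limitation can be illustrated with a minimal sketch of block-based disparity search. This is not the JMVC implementation: the function names, the sum-of-absolute-differences (SAD) cost, and the toy images are assumptions made only for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(image, x, y, size):
    """Extract a size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in image[y:y + size]]


def best_disparity(cur, ref, x, y, size, search_range):
    """Try horizontal shifts in [-search_range, search_range] and return
    (shift, cost) of the reference block with the minimum SAD."""
    width = len(ref[0])
    target = get_block(cur, x, y, size)
    best = None
    for d in range(-search_range, search_range + 1):
        xr = x + d
        if xr < 0 or xr + size > width:
            continue  # candidate block would fall outside the reference view
        cost = sad(target, get_block(ref, xr, y, size))
        if best is None or cost < best[1]:
            best = (d, cost)
    return best


# Toy 2x8 views: the current view is the reference shifted left by 3 pixels.
ref = [[0, 0, 0, 10, 20, 30, 0, 0],
       [0, 0, 0, 40, 50, 60, 0, 0]]
cur = [[10, 20, 30, 0, 0, 0, 0, 0],
       [40, 50, 60, 0, 0, 0, 0, 0]]

wide = best_disparity(cur, ref, 0, 0, 2, 4)    # true shift lies inside the window
narrow = best_disparity(cur, ref, 0, 0, 2, 2)  # true shift lies outside the window
print(wide, narrow)
```

With the wider window the exact match is found and the residual is zero; with the narrower window the best candidate leaves a nonzero residual, which is the inefficiency described above.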

An alternative to disparity-compensated prediction is VSP, where a synthesized view for a target view is created using the geometric relationship between different views. The synthesized view is then used as an additional reference to predictively encode the target view.

Some VSP methods are based on depth estimation [8]–[10]. In this paper, VSP schemes that do not involve depth information are investigated. In particular, we focus on VSP

schemes that do not need camera parameters, which are not

always available. In this case, the disparity estimation (or

stereo matching) is usually used to calculate the disparity

map between two neighboring views, and the virtual view

is then synthesized using the disparity information. Disparity

estimation has been extensively studied in computer vision.

In [11], the cost function for disparity estimation considers

the smoothness of disparity transition. This method is used

in [12] for view interpolation-based MVC. In this paper, the disparity estimation method in [13] is adopted, which achieves better performance in terms of accuracy, disparity smoothness, and robustness to occlusions.

Most view synthesis methods are designed for stereo vision

and assume aligned cameras, i.e., the two cameras are parallel


and only differ from each other by a small horizontal shift. To deal with more general camera setups, a rectification-based view interpolation (RVI) method is proposed in [14]. It first rectifies the two views using the projective rectification method in [15] and [16]. This involves calculating the fundamental matrix between the two views and resampling them such that they have horizontal and matched epipolar lines. A modified version of the disparity estimation method in [17] is then used to create the interpolated view before mapping back to the original domain. The algorithm does not require camera parameters, and places few requirements on the camera setup, as long as the cameras are not too far apart. Therefore, it is suitable for multiview video systems with unaligned cameras and unknown camera parameters.

In this paper, by modifying the method in [14] and [15], we first develop an improved RVI method and apply it to MVC.

Most view synthesis methods deal with view interpolation from a left view and a right view. If these methods are used in MVC, the VSP can only benefit half of the views. To overcome this limitation, in this paper we also develop a rectification-based view extrapolation (RVE) algorithm using two left views or two right views; hence VSP can be applied to the coding of all views after the first two views. Our results show that although the average quality of the extrapolated views is lower than that of the interpolated views, the overall R-D performance of all views of the entire MVC system can outperform that of the view interpolation-based approach as the number of views in the system increases.

B. Review of Theoretical Analyses of View Synthesis and MVC

Another important topic in MVC is the theoretical R-D

analysis of various MVC algorithms. Such an analysis can

provide important guidelines for the design of practical MVC

systems. The R-D analysis can be achieved by generalizing

that of the traditional single view video coding. The key

problem is how to model the inter-view correlations and the

underlying inter-view prediction algorithm.

The theory of the R-D analysis of motion compensation-based single view video coding was established by Girod [19]–[21]. It was generalized to wavelet-based video coding in [22] and light field coding in [23], where the impacts of the statistical properties of multiple light field images, the accuracy of the disparity, and the transform coding on the compression efficiency are studied. In [24], the R-D analyses of multiview image coding with texture-based and model-aided methods are presented, using the same model as in [23]. Recently, these theories were generalized to multiview video coding in [25], where the R-D efficiency of motion and disparity estimation (MDE)-based MVC is investigated.

However, the R-D analysis of VSP-based MVC has not been reported in the literature. In fact, even the mathematical models of view synthesis algorithms have not been well established. Some important progress in this direction has been made recently. In [26], a model is proposed to describe the relationship between the accuracy of disparity and the quality of the interpolated view, based on the framework in [19], [23], and [24]. A prefilter method is also proposed to improve the view interpolation quality. However, the model for the disparity error is oversimplified, and only parallel cameras are considered. In [27], a similar model for view interpolation is used to analyze the theoretical R-D performance of a view subsampling-based multiview image coding scheme. However, MVC is not considered in either [26] or [27].

Fig. 1. Block diagram of the proposed RVI algorithm.

In this paper, we develop a more accurate geometric model

than that in [26]. Our model enables the study of the impact

of projective rectification on the quality of the interpolated or

extrapolated view when unaligned cameras are used. To the

best of our knowledge, this is the first attempt to quantify the

improvement of the projective rectification in view synthesis.

Another contribution of this paper is that we develop an

improved R-D model to study the performances of different

practical MVC schemes, e.g., the MDE-based JMVC and our

VSP-based schemes. Compared to the models in [22] and [25],

our model characterizes the practical MVC schemes more

accurately. Simulation results of this model agree well with

the experimental results of various MVC schemes.

This paper is organized as follows. In Section II, we

present the proposed RVI method and its application in MVC.

Section III extends the result to view extrapolation and applies

it to MVC. In Section IV, a geometric model is developed to

analyze the performance of rectification-based view synthesis.

An improved R-D model for practical MVC schemes is

developed in Section V. Experimental and simulation results of

the proposed methods and models are presented in Sections VI

and VII, respectively, followed by the concluding remarks in

Section VIII.

II. Projective Rectification-Based View

Interpolation and Application in MVC

In this section, we propose an improved version of the RVI

algorithms in [14] and [15], and apply it to MVC. In particular,

a more robust method is used to rectify the two reference

views to reduce their vertical mismatches. A sub-pixel view

interpolation is also developed to improve the accuracy of the

integer-pixel interpolation in [14].

A. The Proposed RVI Algorithm

Fig. 1 shows the main steps in the proposed RVI algorithm,

which are explained below.

1) Projective View Rectification: To rectify two non-parallel input views, we first estimate the fundamental matrix,

which characterizes the epipolar geometry between the two

views [16]. The matrix can be obtained without using any

camera parameter.

Suppose a point X in the 3-D space is projected to point $x_l$ in one view. Its projection point $x_r$ in the other view lies on the line $F x_l$, where F is the 3 × 3 rank-2 fundamental matrix with seven degrees of freedom [16]. In addition, $x_l$ and $x_r$ satisfy $x_r^T F x_l = 0$, where $x_l$ and $x_r$ are 3 × 1 homogeneous coordinates. This equation is a linear function of the entries of F. If enough point correspondences between two views are known, various algorithms can be used to calculate F, such as the 7-point, the 8-point, or the least-squares algorithm [16].

In this paper, the point correspondences are selected using corner detection and the random sample consensus (RANSAC) algorithms [16]. The implementation in [28] is modified to calculate F from the selected point correspondences. Note that other correspondence matching algorithms such as the scale invariant feature transform [28] can also be used to find the point correspondences.

Given F, the epipoles of the two views (the intersections between the line joining the two camera centers and the two image planes) can be obtained from the left and right null spaces of F. After this, the rectification matrix of each view can be obtained as follows [14], [15]. First, the coordinate origin is translated to the image center via a transform

$$T = \begin{bmatrix} 1 & 0 & -c_x \\ 0 & 1 & -c_y \\ 0 & 0 & 1 \end{bmatrix} \tag{1}$$

where $c = (c_x, c_y)$ is the image center. Suppose the epipole of a view is at $e = (e_x, e_y, 1)^T$ after the translation. The next step is to rotate the image such that the epipole moves to the x-axis, i.e., its homogeneous coordinate has the format $(v, 0, 1)^T$. The required rotation R is thus

$$R = \begin{bmatrix} \alpha e_x & \alpha e_y & 0 \\ -\alpha e_y & \alpha e_x & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{2}$$

where $\alpha = 1$ if $e_x \geq 0$ and $\alpha = -1$ otherwise.

Given the new epipole position $(v, 0, 1)^T$, the following transformation is applied to map the epipole to infinity:

$$G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1/v & 0 & 1 \end{bmatrix}. \tag{3}$$

As a result, the rectification matrix for a view is

$$H = GRT. \tag{4}$$
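The composition in (1)–(4) can be checked numerically with a short sketch. It assumes that R has rows (αe_x, αe_y, 0), (−αe_y, αe_x, 0), and (0, 0, 1), so that the translated epipole is sent to (v, 0, 1)^T with v = α(e_x² + e_y²). The function names and the sample epipole are hypothetical, and the sketch does not handle an epipole located exactly at the image center (v = 0).

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, p):
    """Apply a 3x3 matrix to a homogeneous 3-vector."""
    return [sum(m[i][k] * p[k] for k in range(3)) for i in range(3)]


def rectification_matrix(epipole, center):
    """Build H = G R T from an epipole (x, y) in pixel coordinates and
    the image center (cx, cy)."""
    cx, cy = center
    t = [[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]]      # (1)
    ex, ey = epipole[0] - cx, epipole[1] - cy   # epipole after translation
    alpha = 1.0 if ex >= 0 else -1.0
    r = [[alpha * ex, alpha * ey, 0.0],
         [-alpha * ey, alpha * ex, 0.0],
         [0.0, 0.0, 1.0]]                                        # (2)
    v = alpha * (ex * ex + ey * ey)             # x-coordinate of R e
    g = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0 / v, 0.0, 1.0]]  # (3)
    return matmul(g, matmul(r, t))                               # (4)


# Hypothetical epipole far to the right of a 640x480 image.
h = rectification_matrix((1000.0, 300.0), (320.0, 240.0))
mapped = apply(h, [1000.0, 300.0, 1.0])
print(mapped)  # second and third components should be numerically zero
```

A third homogeneous component of zero means that the epipole has been moved to infinity along the x-axis, so the epipolar lines of the transformed view are horizontal.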

In [14], the scheme in (4) is used to obtain the rectification matrices $H_l$ and $H_r$ for the left and right view, respectively, in order to create two parallel views. However, its performance relies mainly on the accuracy of the calculated epipoles. In [15], a more robust and accurate matching transform method is used, where the transformation $H_l$ for the left view is still obtained by (4), but $H_r$ for the right view is obtained by finding a matching transform that minimizes the mismatch of the two rectified views. However, this method needs to solve the camera matrices, which are not always available.

In this paper, we optimize the rectification matrix $H_r$ for the right view by minimizing the distances between a group of rectified corresponding points in the two views, that is

$$\arg\min_{H_r} \sum_i \left\| H_l x_{li} - H_r x_{ri} \right\|^2 \tag{5}$$

where xli and xri are some of the most accurate point cor-

respondences in the two images, selected by the RANSAC

algorithm. The Levenberg–Marquardt algorithm [16] is used

to find the optimal solution of Hr, with the initial value given

by the method in (4). Our experimental results show that using

(5) can reduce the average vertical mismatch of the two views

by as much as 80% compared to the method in [14].
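The objective in (5) can be sketched as a plain cost function that an iterative optimizer such as Levenberg–Marquardt would minimize over the entries of the right-view matrix. The optimizer itself is not shown. Dehomogenizing the mapped points before taking the distance is an assumption of this sketch, and all names and sample points are hypothetical.

```python
def apply_h(h, p):
    """Map an image point (x, y) through a 3x3 matrix and dehomogenize."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def mismatch_cost(h_l, h_r, pts_l, pts_r):
    """Sum of squared distances between rectified correspondences, as in (5)."""
    total = 0.0
    for pl, pr in zip(pts_l, pts_r):
        ul, vl = apply_h(h_l, pl)
        ur, vr = apply_h(h_r, pr)
        total += (ul - ur) ** 2 + (vl - vr) ** 2
    return total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift_up = [[1.0, 0.0, 0.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]]

# Two correspondences whose right-view points sit 2 pixels too low in y.
pts_l = [(10.0, 20.0), (30.0, 40.0)]
pts_r = [(10.0, 18.0), (30.0, 38.0)]

print(mismatch_cost(identity, identity, pts_l, pts_r))  # cost before correction
print(mismatch_cost(identity, shift_up, pts_l, pts_r))  # cost after correction
```

In this toy case a vertical shift of the right view removes the mismatch entirely; with real correspondences the optimizer would start from the matrix given by (4) and reduce the cost iteratively.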

After the rectification, the resolutions of some regions in the rectified views are down-scaled, which can decrease the quality of the interpolated view. The down-scaling factor at a pixel position $(\tilde{x}, \tilde{y})$ in the rectified view is given by [14]

$$m(\tilde{x}, \tilde{y}) = \begin{vmatrix} \dfrac{\partial \tilde{x}}{\partial x} & \dfrac{\partial \tilde{x}}{\partial y} \\ \dfrac{\partial \tilde{y}}{\partial x} & \dfrac{\partial \tilde{y}}{\partial y} \end{vmatrix}. \tag{6}$$

To compensate for the loss of resolution, the pixel value at position $(\tilde{x}, \tilde{y})$ is extended to the unfilled pixels in a square region around $(\tilde{x}, \tilde{y})$ with a side length of $\sqrt{m(\tilde{x}, \tilde{y})}$.

2) Disparity Estimation: Since two parallel views are created after rectification, disparity estimation can be performed in 1-D, which has been studied extensively in computer vision. A 1-D dynamic programming method is used in [17] to estimate the disparity. However, independent processing of different scan lines leads to horizontal stripes in the disparity map. Several graph cut algorithms have been proposed [29], which achieve more accurate disparity estimation, but they cannot handle occlusions well, because they assume that each pixel in the left view can be mapped into multiple pixels in the right view, but in reality some pixels in the left view can be occluded and do not correspond to any pixel in the right view. In [13], a smoothness term is introduced into the cost function to favor solutions with small changes between neighbors, while preserving the advantages of graph cut. The energy cost function for a pixel at (x,y) is defined as

$$E(x,y) = E_{\mathrm{data}}(x,y) + E_{\mathrm{occ}}(x,y) + E_{\mathrm{smooth}}(x,y) \tag{7}$$

where $E_{\mathrm{data}}$ results from the intensity differences between corresponding pixels, $E_{\mathrm{occ}}$ imposes a penalty for labeling a pixel as occluded, and the smoothness term $E_{\mathrm{smooth}}$ ensures that neighboring pixels have similar disparities. Moreover, a uniqueness constraint is imposed in [13] to deal with occlusions, in which a pixel can correspond to at most one pixel in the other view, i.e., a pixel can only be labeled as either a matching point that corresponds to one pixel, or an occluded point that corresponds to no pixel in the other view.

The disparity estimation in [14] is based on the method

in [17], by adding an extra term in the cost function to

improve the smoothness of the disparity map. However, our

experimental results show that the improvement is not always

satisfactory. In this paper, we use the more accurate method

in [13] for disparity estimation.

3) Sub-Pixel View Interpolation: View interpolation can be

performed after disparity estimation. Although two neighbor-

ing views are available, there is no guarantee that every pixel

in one view has its corresponding pixel in the other view, due

to occlusion. Therefore different cases need to be considered.

In addition, in [10] and [17], the interpolated coordinates of the pixels in the middle view are directly rounded to integers, which reduces the quality of the interpolated view and creates more occlusion regions. Although an occlusion padding algorithm is performed in [10] to improve the view interpolation, the quality of the synthesized images is still not satisfactory.

Fig. 2. View interpolation. (a) Pixels from both $v'_{i-1}(t_j)$ and $v'_{i+1}(t_j)$ are visible. (b) Pixels whose correspondences are out of the boundary of $v'_{i-1}(t_j)$ or $v'_{i+1}(t_j)$. (c) Occlusion pixels in $v'_{i-1}(t_j)$. (d) Occlusion pixels in $v'_{i+1}(t_j)$.

In this paper, we propose a sub-pixel interpolation method,

by distributing the contribution of each interpolated pixel

with floating-point coordinates to the two nearest horizontal

neighbors with integer coordinates.
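A minimal sketch of this idea for one scan line is given below, limited to pixels that are visible in both views. It follows the relations formalized later in this section as (8) and (13): a left-view pixel at column x with disparity d is placed at the floating-point column x − αd, its value is blended with weights 1 − α and α, and the result is distributed to the integer columns within unit distance using the weight 1/(|x − x0| + c0) with c0 = 0.1. The function name, the data layout, and the use of `None` for unmatched or unfilled pixels are assumptions of the sketch.

```python
import math


def interpolate_row(left_row, right_row, disparity, alpha, c0=0.1):
    """Synthesize one scan line of the virtual view from two rectified views.

    disparity[x] is the disparity of left pixel x (its match in the right
    view is at column x - disparity[x]); None marks pixels without a match,
    which this sketch simply skips.
    """
    width = len(left_row)
    num = [0.0] * width   # weighted sum of contributions per integer column
    den = [0.0] * width   # sum of weights per integer column
    for x, d in enumerate(disparity):
        if d is None or not 0 <= x - d < width:
            continue
        pos = x - alpha * d                       # floating-point column
        val = (1 - alpha) * left_row[x] + alpha * right_row[x - d]
        lo = math.floor(pos)
        for x0 in (lo, lo + 1):                   # nearest integer columns
            if 0 <= x0 < width and abs(pos - x0) < 1:
                gamma = 1.0 / (abs(pos - x0) + c0)
                num[x0] += gamma * val
                den[x0] += gamma
    return [round(n / s) if s > 0 else None for n, s in zip(num, den)]


left = [10, 10, 10, 10, 10, 40, 80, 10]
right = [10, 10, 10, 40, 80, 10, 10, 10]
disp = [None, None, None, None, None, 2, 2, None]

row = interpolate_row(left, right, disp, alpha=0.25)
print(row)
```

With α = 0.25 the two matched pixels land at columns 4.5 and 5.5, so column 5 receives equal contributions from both while columns 4 and 6 each receive one. Columns left as `None` would be filled by the border and occlusion cases described in this section.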

Let $v_m(t_n)$ be the image of View m at time $t_n$, $v'_m(t_n)$ the rectified $v_m(t_n)$, and $w'_i(t_j)$ the generated virtual image for View i at time $t_j$. Also let $v'_m(x,y,t_n)$ and $w'_i(x,y,t_j)$ be the pixel value of $v'_m(t_n)$ and $w'_i(t_j)$ at position (x,y), respectively, and $d^m_n(x,y,t_j)$ the disparity of View m relative to View n at position (x,y) and time $t_j$. As in [10] and [17], we interpolate the middle view by considering three cases.

If a pixel is visible in both views, as shown in Fig. 2(a), the corresponding pixel position in the intermediate view can be easily obtained by scaling the disparity value, and the pixel value of the intermediate pixel is interpolated from the correspondences in the left and right views by

$$w'_i\left(x - \alpha d^{i+1}_{i-1}(x,y,t_j),\, y,\, t_j\right) = (1 - \alpha)\, v'_{i-1}(x,y,t_j) + \alpha\, v'_{i+1}\left(x - d^{i+1}_{i-1}(x,y,t_j),\, y,\, t_j\right) \tag{8}$$

where α is the ratio between the distance from View i − 1 to View i and that from View i − 1 to View i + 1. Note that $x - \alpha d^{i+1}_{i-1}(x,y,t_j)$ generally has a floating-point value.

For pixels whose corresponding pixels are out of the valid image area in the other view [Fig. 2(b)], we extend the disparity of the border pixel, and the pixel color is copied accordingly. That is, if the correspondence of $v'_{i-1}(x,y,t_j)$ is invalid in $v'_{i+1}(t_j)$, the interpolated pixel is taken as

$$w'_i\left(x - \alpha \cdot d^{i+1}_{i-1}(x_r,y,t_j),\, y,\, t_j\right) = v'_{i-1}(x,y,t_j). \tag{9}$$

Similarly, if the correspondence of $v'_{i+1}(x,y,t_j)$ is invalid in $v'_{i-1}(t_j)$, the interpolated pixel is

$$w'_i\left(x + (1 - \alpha) \cdot d^{i+1}_{i-1}(x_l,y,t_j),\, y,\, t_j\right) = v'_{i+1}(x,y,t_j). \tag{10}$$

In (9) and (10), $x_r$ and $x_l$ are the horizontal coordinates of the first neighbors of $v'_{i-1}(x,y,t_j)$ and $v'_{i+1}(x,y,t_j)$ with valid point correspondence in the other view, as shown in Fig. 2(b).

Due to occlusions, some pixels are only seen in one view. Their disparity values are therefore unavailable. In our system, these pixels are detected by the disparity estimation method in [13]. As shown in Fig. 2(c) and (d) (see also [14]), the occlusion areas in the left view (View i − 1) are occluded by the objects at their right side, and the occlusion pixels in the right view (View i + 1) are occluded by objects at their left side. Therefore, view interpolation can use the disparities of the neighboring background pixels. For view interpolation involving occlusion pixels in $v'_{i-1}(t_j)$, the disparity of the first available pixel to the left is used

$$w'_i\left(x - \alpha \cdot d^{i+1}_{i-1}(x_l,y,t_j),\, y,\, t_j\right) = v'_{i-1}(x,y,t_j). \tag{11}$$

For view interpolation involving occlusion pixels in $v'_{i+1}(t_j)$, the disparity of the first available pixel to the right is used

$$w'_i\left(x + (1 - \alpha) \cdot d^{i+1}_{i-1}(x_r,y,t_j),\, y,\, t_j\right) = v'_{i+1}(x,y,t_j). \tag{12}$$

In (11) and (12), $x_l$ and $x_r$ are shown in Fig. 2(c) and (d).

Finally, to obtain the interpolated pixels at an integer location $(x_0, y_0)$, we use the weighted combination of all pixels within unit distance from $(x_0, y_0)$, that is

$$w'_i(x_0,y_0,t_j) = \mathrm{round}\left( \frac{1}{C(x,x_0)} \sum_{|x - x_0| < 1} \gamma(x,x_0) \cdot w'_i(x,y_0,t_j) \right) \tag{13}$$

where $C(x,x_0) = \sum_{|x - x_0| < 1} \gamma(x,x_0)$, $\gamma(x,x_0) = 1/(|x - x_0| + c_0)$, and $c_0$ is a constant to prevent overflow, and is set to be 0.1 in our implementation.

Note that if the distance between the left/right view and the target view is equal, the factor α in (8) to (12) will be 0.5, and the interpolated coordinates will be either integer or half-integer. In this case, the complexity of (13) can be simplified.

4) Projective Un-Rectification: Similar to [16], the rectification algorithm above could generate non-rectangular interpolated images. Therefore, the last step of the RVI method is to back-project the intermediate view to the original coordinates at the same position. To do so, we first locate the positions of the four corners from the interpolated image $w'_i(t_j)$, denoted $x'_i$, i = 1,...,4. Our goal is to find a 3 × 3 un-rectification matrix B that minimizes the mapping error from these points to the four corners of the unrectified image $w_i(t_j)$, that is

$$\arg\min_{B} \sum_{i=1,\ldots,4} \left\| B x'_i - x_i \right\|^2 \tag{14}$$

where $x_i$ are homogeneous coordinates of the four corners in $w_i(t_j)$. The direct linear transform method in [16] can be applied to convert (14) into a constrained least-squares problem

$$\arg\min_{b} \|Ab\| \quad \text{s.t.} \quad \|b\| = 1 \tag{15}$$

where $b = [b_1\ b_2\ b_3]^T$ ($b_i$ is the ith row of B), i.e., the vectorized version of B. Matrix A is an 8 × 9 matrix, and each pair of corner correspondences contributes to two rows of A. The optimal solution to (15) is the unit singular vector that corresponds to the smallest singular value of A.

Fig. 3. Proposed MVC schemes using (a) view interpolation and (b) view extrapolation.
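As an illustration of how four corner correspondences determine B, the sketch below uses a simpler substitute for the singular-vector solution of (15): it fixes the last entry of B to 1 and solves the resulting 8 × 8 linear system by Gaussian elimination. This is valid only when that entry is nonzero, and it is not the SVD-based method described above. The function names and the corner coordinates are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def unrectification_matrix(src, dst):
    """Find B (3x3 with B[2][2] = 1) mapping four src corners onto dst corners."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two equations, as for A in (15).
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    b = solve_linear([[float(e) for e in row] for row in rows],
                     [float(e) for e in rhs])
    return [b[0:3], b[3:6], b[6:8] + [1.0]]


def project(b, p):
    """Map a point (x, y) through B and dehomogenize."""
    x, y = p
    w = b[2][0] * x + b[2][1] * y + b[2][2]
    return ((b[0][0] * x + b[0][1] * y + b[0][2]) / w,
            (b[1][0] * x + b[1][1] * y + b[1][2]) / w)


# Corners of a non-rectangular interpolated image and of the target rectangle.
src = [(10, 5), (110, 15), (105, 95), (5, 85)]
dst = [(0, 0), (100, 0), (100, 80), (0, 80)]
B = unrectification_matrix(src, dst)
print([project(B, p) for p in src])  # should reproduce dst up to rounding error
```

The unit-norm formulation in (15) avoids the assumption that the last entry of B is nonzero, which is why the singular-vector solution is the more general choice.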

B. RVI-Based MVC

In this section, we apply our RVI method to H.264-based

MVC, by modifying the MDE-based JMVC software [6],

which uses hierarchical B structure in the temporal direction

and I-B-P prediction structure in the inter-view direction.

The coding structure of our RVI-based MVC is illustrated in

Fig. 3(a) for a system with five views and a group of pictures

(GOP) size of 8. The coding of the even-indexed views is identical to that in the JMVC. That is, $v_0$ is coded using hierarchical B structure in the temporal direction.

Other even-indexed views are coded by hierarchical B struc-

ture in the temporal direction, as well as disparity-compensated

inter-view prediction using the previously reconstructed even-

indexed view as reference.

For the odd-indexed views v2k+1, in addition to temporal B

references, two inter-view reference pictures are used in our

method. The first is a synthesized frame w2k+1(tj) generated

by the proposed RVI method. The second is the left view. The

encoder then uses R-D optimization to find the best coding

mode for each block, by treating the synthesized view as

an additional reference picture. The synthesized views can

be generated at the decoder using the reconstructed reference

views, thus no additional bits need to be sent to the decoder.

It should be mentioned that the frames of $v_{2k+1}$ are coded

as B pictures in the inter-view direction in the JMVC, using

the left view and the right view as references. Therefore our

scheme has the same number of inter-view references as the

JMVC. However, since the quality of our view interpolation-

based prediction is usually better than that of the disparity

compensation, the proposed MVC scheme can achieve a better

coding efficiency than JMVC, as shown in Section VI.
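The inter-view reference structure described in this section can be summarized by a small sketch. The function name and the string labels are hypothetical; temporal references (hierarchical B pictures) are used for every view and are not listed.

```python
def inter_view_references(view_index):
    """List the inter-view reference pictures used for one view in the
    RVI-based scheme."""
    if view_index == 0:
        return []                                  # base view: temporal prediction only
    if view_index % 2 == 0:
        return ["v%d" % (view_index - 2)]          # previous even-indexed view
    return ["w%d (RVI-synthesized)" % view_index,  # interpolated from the two neighbors
            "v%d" % (view_index - 1)]              # left view


for k in range(5):
    print(k, inter_view_references(k))
```

Each odd-indexed view keeps two inter-view references, the same number as in the JMVC, with the right view replaced by the synthesized view.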

III. Projective Rectification-Based View

Extrapolation and Application in MVC

View interpolation requires a left view and a right view. When it is applied to MVC, VSP can therefore only be applied to half of the views in order to get satisfactory performance. In this section, we generalize the RVI method to obtain an RVE algorithm using two left views or two right views. We then apply the RVE method to MVC to encode all views after the first two views.

A. The Proposed RVE Algorithm

In this paper, we assume that the view extrapolation algorithm uses two left views to synthesize a right view. Similar to the view interpolation algorithm in Section II, the extrapolation algorithm first performs projective rectification and disparity estimation on the two left views. After that, instead of interpolating the disparity to find the corresponding pixel locations in the middle view, the algorithm extrapolates the disparity and estimates the pixel locations in the right view. The final step of un-rectification is similar to that of the view interpolation method. The disparity extrapolation is described below, since it is the only different step.

Using the same notations as in Section II-A3, two frames from the two previous views, $v_{i-2}(t_j)$ and $v_{i-1}(t_j)$, are used to extrapolate a frame for View i. Let $v'_{i-2}(t_j)$, $v'_{i-1}(t_j)$ and $u'_i(t_j)$ be the rectified frames of $v_{i-2}(t_j)$, $v_{i-1}(t_j)$ and the synthesized View i at $t_j$, respectively.

If the horizontal camera distance between $u'_i(t_j)$ and $v'_{i-1}(t_j)$ is c times that between $v'_{i-2}(t_j)$ and $v'_{i-1}(t_j)$, we assume their disparities have the same scaling factor, that is

$$d^{i}_{i-1}(x,y,t_j) = c \cdot d^{i-1}_{i-2}(x,y,t_j). \tag{16}$$
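The consequence of (16) for a pixel that is visible in both left views can be traced in a few lines. The function and parameter names are hypothetical; the sketch only computes where such a pixel lands in the extrapolated view.

```python
def extrapolate_position(x, d_prev, c):
    """Column in the extrapolated view for a pixel at column x of the
    rectified View i-2.

    d_prev: disparity between the rectified Views i-2 and i-1 at this pixel,
            so the pixel appears at column x - d_prev in View i-1.
    c:      ratio of the camera distance (View i-1 to View i) to the
            camera distance (View i-2 to View i-1).
    """
    x_mid = x - d_prev        # position in View i-1
    d_next = c * d_prev       # disparity assumption (16)
    return x_mid - d_next     # equals x - (1 + c) * d_prev


print(extrapolate_position(40, 6, 1))    # equally spaced cameras
print(extrapolate_position(40, 6, 0.5))  # target camera at half the spacing
```

For equally spaced cameras (c = 1) the pixel simply moves by the same disparity once more, from column 40 to 34 to 28 in this example.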

The following three cases need to be handled.

If a pixel is visible in both $v'_{i-2}(t_j)$ and $v'_{i-1}(t_j)$, as shown in Fig. 4(a), we extrapolate their disparity, and the synthesized pixel in $u'_i(t_j)$ is the average of the pixel pair. That is

$$u'_i\left(x - (1 + c)\, d^{i-1}_{i-2}(x,y,t_j),\, y,\, t_j\right) = \frac{1}{2}\left[ v'_{i-2}(x,y,t_j) + v'_{i-1}\left(x - d^{i-1}_{i-2}(x,y,t_j),\, y,\, t_j\right) \right]. \tag{17}$$

For pixels whose correspondences are out of the valid region of $v'_{i-2}(t_j)$, as shown in Fig. 4(b), we scale the disparity of the first left pixel $(x_l, y)$ with valid point correspondence

$$u'_i\left(x - c \cdot d^{i-2}_{i-1}(x_l,y,t_j),\, y,\, t_j\right) = v'_{i-1}(x,y,t_j). \tag{18}$$

If a pixel at (x,y) is only visible in $v'_{i-1}(t_j)$, as shown in Fig. 4(c), it is also assumed to be visible in the extrapolated view, and the first available disparity to the right of this pixel,