
The Panum Proxy algorithm for dense stereo matching over a volume of interest

A. Agarwal and A. Blake

Microsoft Research Ltd.

7 J J Thomson Ave, Cambridge, CB3 0FB, UK

http://research.microsoft.com/vision/cambridge

Abstract

Stereo matching algorithms conventionally match over a

range of disparities sufficient to encompass all visible 3D

scene points. Human vision however does not do this. It

works over a narrow band of disparities — Panum’s fu-

sional band — whose typical range may be as little as 1/20

of the full range of disparities for visible points. Points in-

side the band are fused visually and the remainder of points

are seen as “diplopic” — that is with double vision. The

Panum band restriction is important also in machine vision,

both with active (pan/tilt) cameras, and with high resolution

cameras and digital pan/tilt.

A probabilistic approach is presented for dense stereo matching under the Panum band restriction. First it is shown that existing dense stereo algorithms are inadequate in this problem setting. Secondly it is shown that the main problem is segmentation, separating the (left) image into the areas that fall respectively inside and outside the band. Thirdly, an approximation is derived that makes up for missing out-of-band information with a “proxy” based on image autocorrelation. Lastly it is shown that the Panum Proxy algorithm achieves accuracy close to what can be obtained when the full disparity band is available.

1. Introduction

In attentional stereo vision, the viewer steers a volume

of interest around the scene. This is a problem that has re-

ceived a good deal of attention in the realms of oculomotor

control[5, 8] and sparse stereo e.g.[14]. In the area of dense

stereo however e.g.[12, 6, 2, 3, 10, 15] the issue of restrict-

ing attention to a volume, with a limited range of depth or

equivalently disparity, has not been addressed. It is of considerable importance from the point of view of efficiency, particularly with high resolution or head-mounted cameras, in restricting computation to a volume of interest which may be only a small fraction of the visible volume. In principle also, it is most unsatisfying that conventional stereo algorithms need to explore an irrelevant background, simply in order to establish significant properties of the foreground (a form of the celebrated “frame” problem of Artificial Intelligence).

1.1. The Panum band

The geometry of the situation is illustrated in figure 1. For a particular field of view of each camera, potential matches between left and right images form a diamond-shaped region in each epipolar plane. In human vision [13] the space of possible matches is restricted further to the “Panum band” (see figure). This is typically around 5 mrad wide, and cuts down the number of possible foveal matches by around an order of magnitude. High quality stereo cameras with narrow fields of view can also benefit from a Panum band restriction in a similar way.

The motivation for studying Panum band stereo is then threefold.

1. It is conceptually appealing to develop a stereo algorithm which focuses on a volume of interest, in the manner known to prevail in human vision. Why should a stereo algorithm expend needless attention on the entire background of a scene?

2. Computational cost for stereo matching grows linearly (or faster) with volume of interest. This is true for both main components of stereo matching: cost computation and global optimization (whether by graph-cut (GC), dynamic programming (DP) or belief propagation (BP)). Restricting the size of the matching volume is therefore critical for efficiency. For stereo geometry similar to human vision, the saving in computational cost is at least an order of magnitude, due to the reduced range of depth (disparity). Usually there is a further factor of saving, due to the concomitant restriction in image area over which matching occurs. The best stereo algorithms (GC, DP or BP [15]) do not currently come close to real time. This is not going to be solved any time soon by Moore’s law because camera resolution is increasing faster than processing power.

3. Computational cost, we have argued, necessitates the restriction of stereo matching to a Panum volume. However, existing dense stereo algorithms are not capable of satisfactory operation over a Panum band, as this paper will show. A new algorithm is needed.

Panum Proxy Dense Stereo Matching, Agarwal and Blake, Proc CVPR 2006

Figure 1. The space of possible matches restricted to a Panum band. a) View from above of rays in a single epipolar plane, forming a diamond-shaped match space; the Panum band forms a ribbon across the diamond and thus cuts down the set of possible matches. b) The match-space represents the situation in (a) in a standardised diagram, in which the diamond-shaped match-space becomes a square, whose sides are respectively a left and right epipolar line, and restricted to the Panum band as shown. A possible matching path is shown dashed. [Diagram labels: Left and Right in (a); axes m (Left) and n (Right), origin 0, and the disparity direction in (b).]

1.2. The Panum Proxy algorithm

The principle of the Panum Proxy stereo algorithm is

therefore as follows.

1. Compute match scores or likelihoods for disparities within the Panum band.

2. Aggregate those scores to compute a total likelihood, at

each point, that there is a within-band (foreground) match.

3. The same cannot be done for the background likelihood,

as that would require match scores outside the band.

However, it is shown that an autocorrelation-like measure

can be used to estimate the background likelihood.

4. Use the true foreground likelihood and the estimated

background likelihood, in a graph cut algorithm, to achieve

a segmentation.

5. Once segmentation is complete, perform conventional

stereo matching e.g. [3], but restricted to the image regions

that have been labelled as in-band.

Note that the restriction to the Panum band in the seg-

mentation step 4 is indeed essential, because the complex-

ity of segmentation is dominated by the cost of computing

stereo match scores, and this is linear in match volume.

The resulting stereo disparity map can be used, for example, to synthesise a new view, as in figure 2, in which case the view is fused within the Panum band, but diplopic outside it, just as in human stereo vision.

Figure 2. Fusion and diplopia with the Panum Proxy algorithm. Results of the Panum Proxy algorithm are illustrated here for a frame from one of the six Microsoft stereo datasets. The matched stereogram shows fusion within the Panum band but diplopia (double vision) elsewhere.

2. Probabilistic framework for stereo matching

First we outline the notation for probabilistic stereo matching. Pixels in the rectified left and right images are $L = \{L_m\}$ and $R = \{R_n\}$ respectively, and jointly we denote the two images $z = (L, R)$. Left and right pixels are associated by any particular matching path (fig. 1). Frequently in stereo matching the so-called “ordering constraint” is imposed, and this means that each move in figure 1b) is allowed only in the positive quadrant [1, 12]. Stereo “disparity” is $d = \{d_m,\ m = 0, \ldots, N\}$ and disparity is simply related to image coordinates as $d_m = m - n$.

In algorithms that deal explicitly with occlusion [10, 7] an array $x = \{x_m\}$ of state variables takes values $x_m \in \{M, O\}$ according to whether the pixel is matched or occluded.

This sets up the notation for a path in epipolar match-space which is a sequence $((d_1, x_1), (d_2, x_2), \ldots)$ of disparities and states. A Gibbs energy $E(z, d, x; \Theta, \Phi)$ can be defined for the posterior over all epipolar paths taken together and notated $(d, x)$, given the image data $z$. Parameters $\Phi$ and $\Theta$ relate respectively to prior and likelihood terms in the posterior. Then the Gibbs energy can be globally minimised to obtain a segmentation $x$ and disparities $d$.

2.1. Prior distribution over matching paths

A Bayesian model for the posterior distribution $p(x, d \mid z)$ is set up as a product of prior and likelihood:

$$p(x, d \mid z) \propto p(x, d)\, p(z \mid x, d). \tag{1}$$

The prior distribution $p(x, d) \propto \exp(-\lambda E_0(x, d))$ is frequently decomposed, in the interests of tractability, as a Markov model. An MRF (Markov Random Field) prior for $(x, d)$ is specified as a product of clique potentials $V_{m,m'}$ over all pixel pairs $(m, m') \in \mathcal{N}$ deemed to be neighbouring in the left image. The potentials are chosen to favour matches over occlusions, to impose limits on disparity change along an epipolar line, and to favour figural continuity between matching paths in adjacent epipolar line-pairs.

2.2. Stereo matching likelihood

The stereo likelihood is:

$$p(z \mid x, d) \propto \prod_m \exp\bigl(-U^M_m(x_m, d_m)\bigr) \tag{2}$$

where the pixelwise negative log-likelihood ratio, for match vs. non-match, is

$$U^M_m(x_m, d_m) = \begin{cases} M(L^P_m, R^P_n) & \text{if } x_m = M \\ M_0 & \text{if } x_m = O, \end{cases} \tag{3}$$

where $M(\ldots)$ is a suitable measure of goodness of match between two patches, often based on normalised sum-squared difference (SSD) or correlation scores [15].
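As an illustration of (2) and (3), the following is a minimal sketch of the per-pixel match energy on a single epipolar line. It assumes a plain (unnormalised) mean squared patch difference for $M(\ldots)$, edge-padded 1-D patches, and a constant `M0`; the function names, the patch half-width and the value of `M0` are hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np

M0 = 1.0  # hypothetical constant energy for the occluded state (assumption)


def match_energy(left_row, right_row, m, d, state, half_width=1):
    """U^M_m(x_m, d_m) of (3): patch cost if matched ('M'), M0 if occluded ('O')."""
    if state == "O":
        return M0
    n = m - d  # right-image coordinate, from d_m = m - n
    if not (0 <= m < len(left_row) and 0 <= n < len(right_row)):
        raise ValueError("pixel index outside the epipolar line")
    lp = np.pad(np.asarray(left_row, dtype=float), half_width, mode="edge")
    rp = np.pad(np.asarray(right_row, dtype=float), half_width, mode="edge")
    width = 2 * half_width + 1
    # After padding, pixel i sits at index i + half_width, so the patch
    # centred on pixel i occupies padded indices i .. i + 2 * half_width.
    l_patch = lp[m:m + width]
    r_patch = rp[n:n + width]
    # Assumed goodness-of-match measure: mean squared difference.
    return float(np.mean((l_patch - r_patch) ** 2))


def likelihood_ratio(left_row, right_row, m, d, half_width=1):
    """Match/non-match ratio exp(-U^M_m) for the matched state."""
    return float(np.exp(-match_energy(left_row, right_row, m, d, "M", half_width)))
```

With a right line that is the left line shifted by two pixels, the energy at the true disparity is zero and the ratio is one, while wrong disparities give a positive energy.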

3. Restricting conventional stereo matching to

a Panum band

We looked at two dense stereo matching algorithms which are considered competitive [15], one referred to as BVZ [4] that uses graph-cut optimization; the other, KZ, also using graph-cut but with explicit allowance for occlusion [10]. The question is whether these algorithms can be applied to the Panum problem simply by reducing the disparity range available for matching. Following conventions for stereo testing, we took the four image pairs Tsukuba, sawtooth, venus and map on the Middlebury database (http://cat.middlebury.edu/stereo), together with supplied ground truth, and calculated error measures. Over foreground, an error is counted wherever computed disparity is in error by more than 1 pixel. For background regions, the true disparity is of course out of range, so an incorrect disparity is considered to be as follows:

BVZ: not at the endstop of the Panum band;

KZ: wherever the state is not occluded, $x_m \neq O$, and the disparity is not at the endstop of the Panum band.

Figure 3. Stereo matching error rates for the KZ algorithm constrained to the Panum band. The error rate data show that background error is greatly magnified when the Panum band constraint is imposed, while foreground error barely changes.

In each case we used the operating parameters recommended for the algorithms: for BVZ, disparity gradient penalty [3] $\lambda = 20$, and for KZ, $\lambda = 10$ with occlusion penalty [10] $K = 50$.
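The error-counting rules described above can be sketched as follows. This is only an illustrative reading of those rules: the function names, the boolean `in_band` mask derived from ground truth, and the single `endstop` disparity value are assumptions made for the example, and only the BVZ background rule is shown.

```python
import numpy as np


def foreground_error_rate(d_est, d_true, in_band, tol=1):
    """Fraction of in-band pixels whose disparity is off by more than `tol` pixels."""
    d_est = np.asarray(d_est)
    d_true = np.asarray(d_true)
    fg = np.asarray(in_band, dtype=bool)
    wrong = np.abs(d_est - d_true) > tol
    return float(wrong[fg].mean())


def background_error_rate_bvz(d_est, in_band, endstop):
    """BVZ rule: a background pixel is wrong unless its disparity sits at the endstop."""
    bg = ~np.asarray(in_band, dtype=bool)
    return float((np.asarray(d_est)[bg] != endstop).mean())
```

For KZ, the same background rule would additionally require the pixel state to be non-occluded before counting an error.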

Results for the KZ algorithm are shown in figure 3.

Results for the (simpler) BVZ algorithm are similar, but

omitted here. In both cases, disparity error over foreground

regions is not much affected by the Panum band restriction

(in fact improved slightly because of the added constraint).

Over background regions, error for both algorithms rises

substantially. The conventional stereo algorithms simply

fail over the background, generating many random dispari-

ties.

The conclusion from this experiment is that the con-

ventional algorithms, when restricted to the Panum band,

work perfectly well over foreground regions. All that is

required to make the algorithms usable, is reliable iden-

tification of those pixels whose disparities fall within the

band. In other words, successful Panum-band stereo could

be achieved if only segmentation into foreground (within

band) and background could be achieved reliably. There-

fore the remainder of the paper considers the problem

of foreground/background segmentation under the Panum

band constraint.

3.1. Can graph-cut stereo be adapted for segmentation?

One possibility, for a more subtle adaptation of the existing KZ algorithm, is that its ability to label occlusions could be extended to label background points. This is reasonable because, given the restricted Panum band, both occlusions and background points represent failures to obtain a stereo match. In order to give the KZ algorithm every chance of success, parameter value $K$ was explored to minimise labelling error rate and this yielded parameters $\lambda = 10$, $K = 10$, quite different from the optimal operating point for regular use of KZ for stereo matching. Results are given in figure 4, showing segmentation error for each of the six test videos in the Microsoft stereo-segmentation database (research.microsoft.com/vision/cambridge/i2i). Labelling error-rates (equal error-rate) for the 6 datasets vary between 7% and 41%, and are in all cases many times worse than are obtainable from full, unconstrained stereo segmentation, in the form of LGC (Layered Graph Cut) [9]. Since the aim, with Panum-band stereo, is to approach the quality of full, unconstrained stereo, the performance of KZ in this mode is far from acceptable.

Figure 4. Segmentation error from conventional graph-cut stereo. The KZ algorithm is tuned here to use its occlusion labels to indicate background, but error rates are very high compared with what is attainable using LGC segmentation with the full range of disparities.

3.2. Segmentation of the in-band image region

Given the results and discussion so far, the aim of the remainder of the paper is to develop a segmentation algorithm, to label all “foreground” points with an accuracy approaching full LGC, but without any computation out of the Panum band. Segmentation could be done in one of two ways. Either it could proceed simultaneously with computation of disparity; or in a separate pass, preceding the computation of disparity. Simultaneous segmentation and disparity determination perhaps has the attraction of greater elegance. On the other hand, separate segmentation could be achieved by marginalising the stereo likelihood over disparities $d$, and then performing energy minimisation with respect to labels $x$ only. A separate labelling pass should surely be more efficient, since full consideration of disparity need then only occur within the foreground region.

4. Stereo segmentation

First we summarise the full LGC (Layered Graph Cut)

algorithm [9] for segmentation by marginalisation of stereo

likelihoods. Then in the next section the full LGC energy

function is approximated to stay within the Panum band re-

striction.

For LGC, the matched state M is further subdivided into

foreground match F and background match B. LGC deter-

mines segmentation x as the minimum of an energy func-

tion E(z,x;Θ), in which stereo disparity d does not appear

explicitly. Instead, the stereo match likelihood (2) in sec-

tion 2.2 is marginalised over disparity, aggregating support

from each putative match, to give a likelihood p(L | x,R)

for each of the three label-types occurring in x: fore-

ground, background and occlusion (F,B,O). Segmentation

is therefore a ternary problem, and it can be solved (approx-

imately) by iterative application of a binary graph-cut al-

gorithm, augmented for a multi-label problem by so-called

α-expansion [4]. The energy function for LGC is composed

of two terms:

$$E(z, x; \Theta, \Phi) = V(z, x; \Theta) + U^S(z, x; \Phi) \tag{4}$$

representing energies for spatial coherence/contrast and

stereo likelihood.

4.1. Encouraging coherence

The coherence energy $V(z, x; \Theta)$ is a sum, over cliques, of pairwise energies with potential coefficients $F_{m,m'}$ now defined as follows. Cliques consist of horizontal, vertical and diagonal neighbours on the square grid of pixels. For vertical and diagonal cliques the potential acts as a switch active across a transition in or out of the foreground state: $F_{m,m'}[x, x'] = \gamma$ if exactly one of the variables $x, x'$ equals $F$, and $F_{m,m'}[x, x'] = 0$ otherwise. Horizontal cliques, along epipolar lines, inherit the same cost structure, except that certain transitions are disallowed on geometric grounds. These constraints are imposed via infinite cost penalties:

$$F_{m,m'}[x = F, x' = O] = \infty; \qquad F_{m,m'}[x = O, x' = B] = \infty.$$

Here [9] $\gamma = \log(2\sqrt{W_M W_O})$, where parameters $W_M$ and $W_O$ are the mean widths (in pixels) of matched and occluded regions respectively.
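The pairwise cost structure just described can be written down directly. The sketch below is a hedged illustration: the function names and the string labels `"F"`, `"B"`, `"O"` are choices made for this example, and any numerical values passed for $W_M$ and $W_O$ are hypothetical rather than taken from the paper.

```python
import math


def gamma(w_m, w_o):
    """gamma = log(2 * sqrt(W_M * W_O)), with widths given in pixels."""
    return math.log(2.0 * math.sqrt(w_m * w_o))


def pairwise_cost(x, x_next, g, horizontal):
    """Coherence potential F[x, x'] for labels in {'F', 'B', 'O'}.

    A cost g is paid across any transition in or out of the foreground.
    Along epipolar lines (horizontal cliques) the transitions F -> O and
    O -> B are disallowed through an infinite penalty.
    """
    if horizontal and (x, x_next) in {("F", "O"), ("O", "B")}:
        return math.inf
    return g if (x == "F") != (x_next == "F") else 0.0
```

For example, with assumed widths `w_m = 8` and `w_o = 2` the switch cost is `log(8)`; a vertical F-to-O transition pays that cost, whereas the same transition along an epipolar line is forbidden.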

4.2. Encouraging boundaries where contrast is high

A tendency for segmentation boundaries in images to align with contours of high contrast is achieved by defining prior penalties $F_{k,k'}$ which are suppressed where image contrast is high [3, 4, 11], multiplying them by a discount factor $C^*_{m,m'}(L_m, L_{m'})$ which suppresses the penalty by a factor $\epsilon/(1 + \epsilon)$ wherever the contrast across $(L_m, L_{m'})$ is high; see [9] for details. Previously, maximal discounting has been obtained [3] by setting $\epsilon = 0$. Here, as in stereo segmentation [10], $\epsilon = 1$ tends to give the best results, though sensitivity to the precise value of $\epsilon$ is relatively mild.

4.3. Foreground likelihood

The remaining term in (4) is $U^S(z, x)$ which captures the influence of stereo matching likelihood on the probability of a particular segmentation. It is defined to be

$$U^S(z, x) = \sum_m U^S_m(x_m) \tag{5}$$

where

$$U^S_m(x_m) = -\log p(L_m \mid x_m = F, R). \tag{6}$$

Now, marginalising out disparity, foreground likelihood is

$$p(L_m \mid x_m = F, R) = \sum_d p(L_m \mid d_m = d, R)\, p(d_m = d \mid x_m = F) \tag{7}$$

where, from (2),

$$p(L_m \mid d_m = d, R) \propto f(L, d, R) = \exp\bigl(-U^M_m(x_m, d_m)\bigr), \tag{8}$$

using the log-likelihood ratio defined in (3). As a shorthand, we write:

$$p(L \mid F) = \sum_d p(L \mid d, R)\, p(d \mid F) \tag{9}$$

and, as before, in terms of likelihood ratios, this becomes:

$$\mathcal{L}(L \mid F) \equiv \frac{p(L \mid F)}{p(L \mid O)} = \sum_d f(L, d, R)\, p(d \mid F) \tag{10}$$

where $f(L, d, R)$ is the match/non-match likelihood ratio as above.
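The marginalisation in (10) amounts to a weighted sum of per-disparity ratios over the band. A minimal sketch, assuming the values $f(L, d, R)$ for the in-band disparities have already been computed and stored in an array; the function name and the default of a uniform prior over the band are illustrative assumptions.

```python
import numpy as np


def foreground_likelihood_ratio(f_band, prior_band=None):
    """Sum over d of f(L, d, R) * p(d | F), taken over in-band disparities.

    f_band     : match/non-match ratios f(L, d, R), one per in-band disparity.
    prior_band : p(d | F) over the same disparities; uniform if omitted.
    """
    f_band = np.asarray(f_band, dtype=float)
    if prior_band is None:
        prior_band = np.full(f_band.shape, 1.0 / f_band.size)
    prior_band = np.asarray(prior_band, dtype=float)
    if prior_band.shape != f_band.shape or not np.isclose(prior_band.sum(), 1.0):
        raise ValueError("prior must match f_band in shape and sum to one")
    return float(np.dot(f_band, prior_band))
```

Because $p(d \mid F)$ vanishes outside the band, only in-band match scores are needed, which is what makes this quantity computable under the Panum restriction.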

4.4. Background likelihood

Since the distribution $p(d_m = d \mid x_m = F)$ is defined to be zero outside the Panum fusional area, it is perfectly possible, under the Panum assumptions, to compute $\mathcal{L}(L \mid F)$ in (10). However, the same cannot be said for

$$\mathcal{L}(L \mid B) \equiv \frac{p(L \mid B)}{p(L \mid O)} = \sum_d \mathcal{L}(L \mid d)\, p(d \mid B) \tag{11}$$

since the corresponding summation is entirely outside the Panum band $D_F$ of disparities, in that $p(d \mid B)$ is non-zero only outside the Panum band. Each pixel $L_m$ would therefore have to be compared with pixels in the right image $R$ that are unreachable because they are outside the band.

Figure 5. Segmentation error using a simple threshold in place

of background likelihood. Error curves are shown as a func-

tion of threshold θ for six subjects from the Microsoft database.

(Error-rates are total foreground and background error, averaged

over each sequence.) Horizontal dashed lines show corresponding

error rates for full (non-Panum) LGC segmentation, as a bench-

mark. The substantial shortfall suggests that it should be possible

to improve considerably on simple thresholding.

4.5. A simple threshold as proxy for the background likelihood?

Before going to some trouble to approximate the background likelihood, it is worth looking at the simplest possible approach, and treating the problem as novelty detection. In that view, we have a model $\mathcal{L}(L \mid F)$ for the positive class, and no model of the background class. Then the likelihood ratio classifier $\mathcal{L}(L \mid F) > \mathcal{L}(L \mid B)$ is simplified to a threshold rule, replacing the background likelihood by a constant $\mathcal{L}(L \mid B) = \theta$. Segmentation under this model, for variable threshold $\theta$, is exhibited in figure 5. It appears that a constant threshold $\theta = 1$ yields close to the best error for each of the 6 datasets, so there would be no need for an adaptive algorithm. However, the best error rate achieved is between 2 and 8 times higher than the error achieved (dashed lines) by full LGC. Again, therefore, there is strong motivation to look for a model and an algorithm that performs better under the Panum-band restriction.
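The threshold rule itself is a one-line classifier. The sketch below shows only the per-pixel decision, with the foreground ratios assumed to be precomputed; it deliberately omits the spatial coherence and contrast terms of (4), so it should not be read as the segmentation procedure behind figure 5. The function name is hypothetical.

```python
import numpy as np


def threshold_labels(ratio_fg, theta=1.0):
    """Label a pixel as foreground wherever its foreground ratio exceeds theta."""
    return np.asarray(ratio_fg, dtype=float) > theta
```

With `theta = 1` a pixel is called foreground exactly when an in-band match is more likely than occlusion on average over the band.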

5. The Panum Proxy algorithm

In the previous section, it was shown that the Panum-band constraint means that information required for computing background likelihood $\mathcal{L}(L \mid B)$ is missing, and that replacing $\mathcal{L}(L \mid B)$ with a simple threshold constant gives poor results. Therefore in this section an approximation for $\mathcal{L}(L \mid B)$ is developed.

5.1. Deriving the approximate likelihood

We assume that $p(d \mid F)$ is uniform over the Panum band so that $p(d \mid F) = 1/|D_F|$ and similarly, for the background, $p(d \mid B) = 1/|D_B|$. Then, defining $D = D_B \cup D_F$, we can write

$$S(L) \equiv \sum_{d \in D} f(L, d, R) \tag{12}$$

$$\phantom{S(L)} = |D_F|\, \mathcal{L}(L \mid F) + |D_B|\, \mathcal{L}(L \mid B), \tag{13}$$

from (10) and (11). If $S(L)$ were known, then it would be possible, having computed $\mathcal{L}(L \mid F)$ as in (10), to compute $\mathcal{L}(L \mid B)$ from the constraint (12). Of course $S$ in (12) cannot be computed exactly, because the summation extends outside the Panum band. However, and this is the key idea of the Panum Proxy, we can approximate it by using the left image $L$ as a proxy for the right image $R$ in the match likelihood ratio:

$$\tilde{S}(L) = \sum_{d = -d_S}^{d_S} f(L, d, L) \tag{14}$$

(see figure 6 for diagrams illustrating how this works). The approximation rests on the assumption that each match is a good one, since it is matching the left image with itself. Note that the value of $f(L, d, L)$ at $d = 0$ is an upper bound on the value of $f(L, d, R)$ at the true match value of $d$, since the match of the left image directly onto itself is of course perfect; $d_S$ has to be chosen just big enough to capture the peak of the match-likelihood, but it is reasonably assumed that $d_S \ll |D_F|$ so that the additional work in computing (14) is smaller than the amount of matching work done already in the Panum band. Note that a factor of 2 can be saved in computing (14) by exploiting the symmetry of autocorrelation, that is that $f(L_m, d, L) = f(L_{m+d}, -d, L)$. Finally, having estimated $S(L)$, we can estimate the background likelihood-ratio from the approximate constraint

$$\tilde{S}(L) = |D_F|\, \mathcal{L}(L \mid F) + |D_B|\, \mathcal{L}(L \mid B), \tag{15}$$

giving

$$\mathcal{L}(L \mid B) = \bigl[\tilde{S}(L) - |D_F|\, \mathcal{L}(L \mid F)\bigr] / |D_B|. \tag{16}$$

[Figure 6 plots, panels a), b), c). Upper row: $f(d, L, R)$ against disparity $d$ from 0 to 60, with ranges labelled B and F. Lower row: $f(d, L, L)$ against $d$ from −20 to 20.]

Figure 6. Using the left image as a proxy for the right, to approximate likelihoods. Upper: likelihood function $f(L, d, R)$ for three sample image points; Panum band is $30 \le d \le 60$. Lower: “autocorrelation-like” proxy $f(L, d, L)$ for the same three sample points. In the first two examples a,b), the proxy satisfactorily mimics the shape of the true likelihood. In c) the sample happens to fall on a textureless area, with consequent stereo ambiguity, and the proxy fails, in that $f(L, d, L)$ has a dominant peak whereas there is none in $f(L, d, R)$. A test will be developed for this case.

5.2. Complementary likelihood

Now given the weakness of evidence, resulting from the Panum band restriction, for distinguishing background match from occlusion, we do not attempt to distinguish the hypotheses $B$ and $O$. Therefore we lump them together as the complementary hypothesis $\bar{F} = B \cup O$, so that

$$p(L \mid \bar{F})\, p(\bar{F}) = p(L \mid B)\, p(B) + p(L \mid O)\, p(O), \tag{17}$$

and again dividing by $p(L \mid O)$:

$$\mathcal{L}(L \mid \bar{F})\, p(\bar{F}) = \mathcal{L}(L \mid B)\, p(B) + p(O). \tag{18}$$

Dividing through by $p(\bar{F}) = 1 - p(F) = p(B) + p(O)$, this is expressed as

$$\mathcal{L}(L \mid \bar{F}) = (1 - \nu)\, \mathcal{L}(L \mid B) + \nu, \tag{19}$$

where $\nu = p(O)/(1 - p(F))$, so that $1 - \nu = p(B)/(1 - p(F))$. A typical value would be $\nu = 0.1$, reflecting the empirical fact that normally a small proportion of background points are occluded.
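The chain from the autocorrelation proxy (14) through the background ratio (16) to the complementary ratio can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the match ratio is taken as the exponential of a negative mean squared patch difference on edge-padded 1-D patches, and all function names and the patch half-width are hypothetical. The complementary ratio uses the weight $1 - \nu$ on the background term, which is what results from dividing (18) by $p(B) + p(O)$.

```python
import numpy as np


def patch_ratio(row_a, i, row_b, j, half_width=1):
    """exp(-mean squared difference) between patches centred at a[i] and b[j]."""
    a = np.pad(np.asarray(row_a, dtype=float), half_width, mode="edge")
    b = np.pad(np.asarray(row_b, dtype=float), half_width, mode="edge")
    width = 2 * half_width + 1
    return float(np.exp(-np.mean((a[i:i + width] - b[j:j + width]) ** 2)))


def proxy_sum(left_row, m, d_s, half_width=1):
    """Proxy of (14): sum of f(L, d, L) for d in [-d_S, d_S], left image only."""
    total = 0.0
    for d in range(-d_s, d_s + 1):
        n = m - d
        if 0 <= n < len(left_row):  # skip shifts that leave the line
            total += patch_ratio(left_row, m, left_row, n, half_width)
    return total


def background_ratio(s_tilde, ratio_fg, n_fg, n_bg):
    """Background ratio of (16); n_fg = |D_F| and n_bg = |D_B|."""
    return (s_tilde - n_fg * ratio_fg) / n_bg


def complementary_ratio(ratio_bg, nu=0.1):
    """Ratio for the lumped hypothesis (B or O), with nu = p(O) / (1 - p(F))."""
    return (1.0 - nu) * ratio_bg + nu
```

Note that `background_ratio` can return a negative value when the proxy underestimates the full sum, so a caller would still need to guard against that case.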


5.3. Kurtosis test

Earlier in figure 6, we saw that although $f(L, d, L)$ is often a good predictor of the shape of $f(L, d, R)$, as in figure 6a,b), it can fail where there is no clear peak in $f(L, d, L)$, as in figure 6c). The kurtosis $k = k(L, L)$ of $f(L, d, L)$, as a function of $d$, is computed as a diagnostic. Figure 7 shows that high kurtosis is associated with low error in the proxy. Therefore the likelihood estimate is predicted to be reliable if $k > k_0$.

Figure 7. Kurtosis of $f(L, d, L)$ as an indication of proxy accuracy. High kurtosis $k$ is associated with reduced magnitude of error $\tilde{S} - S$, suggesting a validity check based on kurtosis.

In fact low kurtosis occurs in practice over relatively textureless image areas, just the situation that gives rise to ambiguous disparity, as in figure 6c). A threshold value of $k_0 = 2.5$ has proved effective, catching 86% of points on the tails of the error distribution (defined to be those outside 1 standard deviation). Then the definition of $\tilde{S}$ from (14) is replaced by

$$\tilde{S}(L) = r(k) \sum_{d = -d_S}^{d_S} f(L, d, L) + (1 - r(k))\, |D|\, \mathcal{L}(L \mid F), \tag{20}$$

where $r(k)$ is a soft threshold function, taking the value $r(k) = 1$ when $k \gg k_0$ and $r(k) = 0$ when $k \ll k_0$. In this way, the estimated complementary likelihood $\mathcal{L}(L \mid \bar{F})$ (19) is unchanged in the reliable case $r(k) = 1$. In the unreliable case $r(k) = 0$, $\tilde{S}(L) = |D|\, \mathcal{L}(L \mid F)$, and $\mathcal{L}(L \mid B) = \mathcal{L}(L \mid F)$ from (16), and then (19) defaults towards the no-information condition $\mathcal{L}(L \mid \bar{F}) = \mathcal{L}(L \mid F)$ as $r(k) \to 0$.
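A sketch of the kurtosis check and the blend in (20) follows. Two choices here are assumptions rather than details given in the text: kurtosis is computed as the normalised fourth central moment of the profile $f(L, d, L)$ treated as an unnormalised distribution over $d$, and the soft threshold $r(k)$ is modelled as a logistic function with a hypothetical `sharpness` parameter. Only the threshold $k_0 = 2.5$ comes from the text.

```python
import numpy as np


def kurtosis_of_profile(d_values, f_values):
    """Fourth central moment over squared variance, weighting each d by f."""
    d = np.asarray(d_values, dtype=float)
    w = np.asarray(f_values, dtype=float)
    w = w / w.sum()
    mean = np.sum(w * d)
    var = np.sum(w * (d - mean) ** 2)
    if var == 0.0:
        return float("inf")  # all mass on one disparity: maximally peaked
    return float(np.sum(w * (d - mean) ** 4) / var ** 2)


def soft_threshold(k, k0=2.5, sharpness=4.0):
    """Assumed logistic form of r(k): near 1 for k >> k0, near 0 for k << k0."""
    return float(1.0 / (1.0 + np.exp(-sharpness * (k - k0))))


def blended_proxy(proxy_sum, ratio_fg, n_total, r):
    """Blend of (20): r * proxy sum + (1 - r) * |D| * foreground ratio."""
    return r * proxy_sum + (1.0 - r) * n_total * ratio_fg
```

Under these assumptions a flat profile over five disparities has kurtosis 1.7 and is treated as unreliable, while a sharply peaked profile scores far above 2.5 and leaves the proxy sum essentially unchanged.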

5.4. Positivity check

[Figure 8 plot: bar chart of mean segmentation error (%), on a scale from 0 to 5, for the datasets AC, IU, IU-JW, JM, MS and VK; legend: LGC Full, LGC Full + colour, LGC PP, LGC PP + colour; an annotation of 9.5% appears on the plot.]

Figure 8. Segmentation error for Panum Proxy approaches that of full stereo. The Panum Proxy (LGC PP) algorithm achieves error rates that approach very nearly the level achieved by LGC segmentation with the full range of disparities (LGC full), at considerably reduced computational cost. For one test set (MS) the error rate for (LGC PP) is relatively high, though this is restored by adding in colour information.

The other condition that must be dealt with is the possible negativity of the estimated $\mathcal{L}(L \mid B)$ (16). In the case of negativity, we simply replace (16) with

$$\mathcal{L}(L \mid B) = \mathcal{L}(L \mid F)/\eta \tag{21}$$

and use this to evaluate the complementary hypothesis (19). The value of $\eta$ is set using the statistics of $\mathcal{L}(L \mid F)/\mathcal{L}(L \mid B)$ in the negativity condition, collected from a variety of images, and this gives a working value of $\eta = 3$.
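The positivity check in (21) is a simple element-wise substitution. A minimal sketch, assuming the background and foreground ratios are held as arrays over pixels; the function name is hypothetical, and $\eta = 3$ is the working value quoted above.

```python
import numpy as np


def positive_background_ratio(ratio_bg, ratio_fg, eta=3.0):
    """Replace negative background ratios from (16) by foreground ratio / eta."""
    ratio_bg = np.asarray(ratio_bg, dtype=float)
    ratio_fg = np.asarray(ratio_fg, dtype=float)
    return np.where(ratio_bg < 0.0, ratio_fg / eta, ratio_bg)
```

Pixels whose estimate is already non-negative are left untouched, so the substitution only affects the cases where the proxy sum fell below the in-band contribution.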

6. Results

First we show mean error rates, averaged over the en-

tire stereo video sequence for each of the six subjects from

the Microsoft database. These are extensive tests, repre-

senting measurements taken from several hundred stereo

pairs. Figure 8 shows that error for the Panum Proxy algo-

rithm approaches quite closely that for full Layered Graph

Cut (LGC), with unrestricted stereo disparity. Note that er-

ror rates are mostly an order of magnitude better than for

the conventional graph cut stereo algorithm KZ (figure 4).

Compared with the naive thresholding scheme (figure 5),

in which background likelihood is replaced by a constant,

error rates for the Panum Proxy algorithm are lower by fac-

tors ranging from 1.5 to 7 across the six subjects. Error rates

fall even further when colour information is used in the segmentation, following the paradigm used in LGC [9]. These

results can be examined in more detail along their timelines,

and we show this here just for the VK dataset, in figures 9

and 10.

Discussion

We have shown that stereo within a Panum band can be solved effectively using a conventional stereo algorithm, together with a pre-segmentation step that selects those pixels that are within the band. This allows stereo to operate within a volume of interest, fusing over that volume, and with diplopic vision elsewhere, as in figure 2. Results of the Panum Proxy algorithm are close in quality to what is obtainable under unconstrained conditions, using the full range of available disparity. It remains for future work to test the algorithm under more stringent circumstances, with greater ranges of disparity in the scenes.

Figure 9. Segmentation error for Panum Proxy, over time, for one subject (VK). The Panum Proxy (LGC PP) algorithm achieves error rates close to those achieved by LGC segmentation with the full range of disparities (LGC full), especially when colour information is fused in with stereo. See also figure 10.

Figure 10. Some examples of segmentation. Segmentations are shown for the frames with lowest and highest error (best case: frame 120; worst case: frame 90), for the dataset of fig 9 (VK). Results are for LGC PP plus colour.

Acknowledgements

We acknowledge helpful discussions with A. Criminisi, A. Fitzgibbon and V. Kolmogorov.

References

[1] H.H. Baker and T.O. Binford. Depth from edge and

intensity based stereo. In Proc. Int. Joint Conf. Artifi-

cial Intelligence, pages 631–636, 1981. 2

[2] P.N. Belhumeur, J.P. Hespanha, and D.J. Kriegman. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. In Proc. European Conf. Computer Vision, number 800 in Lecture notes in computer science, pages 45–58. Springer-Verlag, 1996. 1

[3] Y.Y. Boykov and M-P. Jolly. Interactive graph cuts for

optimal boundary and region segmentation of objects

in N-D images. In Proc. Int. Conf. on Computer Vi-

sion, pages 105–112, 2001. 1, 2, 3, 5

[4] Y.Y. Boykov, O. Veksler, and R.D. Zabih. Fast ap-

proximate energy minimization via graph cuts. IEEE

Trans. on Pattern Analysis and Machine Intelligence,

23(11), 2001. 3, 4, 5

[5] C.M. Brown, D. Coombs, and J. Soong. Real-time

smooth pursuit tracking. In A. Blake and A.L. Yuille,

editors, Active Vision, pages 123–136. MIT, 1992. 1

[6] I.J. Cox, S.L. Hingorani, and S.B. Rao. A maximum

likelihood stereo algorithm. Computer vision and im-

age understanding, 63(3):542–567, 1996. 1

[7] A. Criminisi, J. Shotton, A. Blake, and P.H.S. Torr.

Gaze manipulation for one to one teleconferencing. In

Proc. Int. Conf. on Computer Vision, pages 191–198,

2003. 2

[8] T. Uhlin K. Pahlavan and J-O. Eklundh. Dynamic fix-

ation. In Proc. Int. Conf. on Computer Vision, pages

412–419, 1993. 1

[9] V. Kolmogorov, A. Criminisi, A. Blake, G. Cross, and

C. Rother. Bi-layer segmentation of binocular stereo

video. In Proc. Conf. Computer Vision and Pattern

Recognition, 2005. 4, 5, 7

[10] V. Kolmogorov and R. Zabih. Computing visual correspondences with occlusions using graph cuts. In Proc. Int. Conf. on Computer Vision, 2001. 1, 2, 3, 5

[11] V. Kolmogorov and R. Zabih. Multi-camera scene re-

construction via graph cuts. In Proc. European Conf.

Computer Vision, pages 82–96, 2002. 5

[12] Y. Ohta and T. Kanade. Stereo by intra- and inter-

scan line search using dynamic programming. IEEE

Trans. on Pattern Analysis and Machine Intelligence,

7(2):139–154, 1985. 1, 2

[13] T. Poggio and W. Reichardt. Visual control of ori-

entation behaviour in the fly. Quart. Rev. Biophys.,

9(3):377–438, 1984. 1

[14] I.D. Reid and D.W. Murray. Tracking foveated corner

clusters using affine structure. In Proc. Int. Conf. on

Computer Vision, pages 76–83, 1993. 1

[15] D. Scharstein and R. Szeliski. A taxonomy and evalu-

ation of dense two-frame stereo correspondence algo-

rithms. Int. J. Computer Vision, 47(1–3):7–42, 2002.

1, 3