The Panum Proxy algorithm for dense stereo matching over a volume of interest
A. Agarwal and A. Blake
Microsoft Research Ltd.
7 J J Thomson Ave, Cambridge, CB3 0FB, UK
Stereo matching algorithms conventionally match over a
range of disparities sufficient to encompass all visible 3D
scene points. Human vision however does not do this. It
works over a narrow band of disparities — Panum’s fu-
sional band — whose typical range may be as little as 1/20
of the full range of disparities for visible points. Points in-
side the band are fused visually and the remainder of points
are seen as “diplopic” — that is with double vision. The
Panum band restriction is important also in machine vision,
both with active (pan/tilt) cameras, and with high resolution
cameras and digital pan/tilt.
A probabilistic approach is presented for dense stereo
matching under the Panum band restriction. First it is
shown that existing dense stereo algorithms are inadequate
in this problem setting. Secondly it is shown that the main
problem is segmentation, separating the (left) image into
the areas that fall respectively inside and outside the band.
ing out-of-band information with a “proxy” based on image
autocorrelation. Lastly it is shown that the Panum Proxy
algorithm achieves accuracy close to what can be obtained
when the full disparity band is available.
In attentional stereo vision, the viewer steers a volume
of interest around the scene. This is a problem that has re-
ceived a good deal of attention in the realms of oculomotor
control [5, 8] and sparse stereo e.g.. In the area of dense
stereo, however, e.g. [12, 6, 2, 3, 10, 15], the issue of restrict-
ing attention to a volume, with a limited range of depth or
equivalently disparity, has not been addressed. It is of con-
siderable importance from the point of view of efficiency,
particularly with high resolution or head-mounted cameras,
where the volume of interest may
be only a small fraction of the visible volume. In principle
also, it is most unsatisfying that conventional stereo algo-
rithms need to explore an irrelevant background, simply in
order to establish significant properties of the foreground,
a form of the celebrated “frame” problem of Artificial
Intelligence.
1.1. The Panum band
The geometry of the situation is illustrated in figure 1.
For a particular field of view of each camera, potential
matches between left and right images form a diamond-
shaped region in each epipolar plane.
In human vision, the space of possible matches is restricted further to
the “Panum band” (see figure). This is typically around 5
mrad wide, and cuts down the number of possible foveal
matches by around an order of magnitude. High quality
stereo cameras with narrow fields of view can also benefit
from a Panum band restriction in a similar way.
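
As a rough illustration of how a band restriction limits the matching work, the sketch below computes matching costs along a single epipolar line for disparities inside a band only. It is a minimal sketch under stated assumptions and not the method of this paper: the function name `band_costs`, the band limits `d_min` and `d_max`, the toy intensities, and the use of a plain absolute-difference cost are all illustrative choices.

```python
import numpy as np

def band_costs(left_row, right_row, d_min, d_max):
    """Absolute-difference matching costs for one epipolar line.

    Only disparities d in [d_min, d_max] are evaluated. Entry [k, x] is
    the cost of matching left pixel x to right pixel x - (d_min + k).
    Candidates that fall outside the right image keep cost +inf.
    """
    left_row = np.asarray(left_row, dtype=float)
    right_row = np.asarray(right_row, dtype=float)
    width = left_row.shape[0]
    disparities = np.arange(d_min, d_max + 1)
    costs = np.full((disparities.size, width), np.inf)
    x = np.arange(width)
    for k, d in enumerate(disparities):
        xr = x - d                          # matching column in the right row
        valid = (xr >= 0) & (xr < width)    # stay inside the right image
        costs[k, valid] = np.abs(left_row[x[valid]] - right_row[xr[valid]])
    return disparities, costs

# Toy scan line (assumed data): the true disparity is 2 wherever it is visible.
left = np.array([10, 20, 30, 40, 50, 60, 70, 80])
right = np.roll(left, -2)                   # right[x - 2] == left[x] for x >= 2

full_d, full_c = band_costs(left, right, 0, 7)   # full range: 8 disparities
band_d, band_c = band_costs(left, right, 1, 3)   # narrow band: 3 disparities

best = band_d[np.argmin(band_c[:, 4])]      # best in-band disparity at x = 4
```

In this toy case the band evaluates 3 disparity levels per pixel instead of 8, so the cost array shrinks in proportion to the band width. Pixels whose true disparity lies outside the band have no correct candidate within it.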
The motivation for studying Panum band stereo is then threefold.
1. It is conceptually appealing to develop a stereo algo-
rithm which focuses on a volume of interest, in the manner
known to prevail in human vision. Why should a stereo al-
gorithm expend needless attention to the entire background
of a scene?
2. Computational cost for stereo matching grows linearly
(or faster) with volume of interest. This is true for both
main components of stereo matching: cost computation and
global optimization (whether by graph-cut (GC), dynamic
programming (DP) or belief propagation (BP)). Restricting
the size of the matching volume is therefore critical for ef-
ficiency. For stereo geometry similar to human vision, the
saving in computational cost is at least an order of magni-
tude, due to the reduced range of depth (disparity). Usually
there is a further factor of saving, due to the concomitant
restriction in image area over which matching occurs. The
best stereo algorithms (GC, DP or BP) do not currently
come close to real time. This is not going to be solved any
time soon by Moore’s law because camera resolution is in-
creasing faster than processing power.
3. Computational cost, we have argued, necessitates the re-
striction of stereo matching to a Panum volume. However,
existing dense stereo algorithms are not capable of satisfac-
Panum Proxy Dense Stereo Matching, Agarwal and Blake, Proc CVPR 2006
Figure 9. Segmentation error for Panum Proxy, over time,
for one subject (VK). The Panum Proxy (LGC PP) algorithm
achieves error rates close to those achieved by LGC segmenta-
tion with the full range of disparities (LGC full), especially when
colour information is fused in with stereo. See also figure 10.
Best case: frame 120. Worst case: frame 90.
Figure 10. Some examples of segmentation. Segmentations are
shown for the frames with lowest and highest error, for the dataset
of fig 9. Results are for LGC PP plus colour.
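
The segmentation error reported in figures 9 and 10 can be illustrated with a small sketch. The error measure is not defined in this section, so the code assumes it is the percentage of pixels whose in-band/out-of-band label differs from a ground-truth mask. The function name `segmentation_error` and the toy masks are hypothetical.

```python
import numpy as np

def segmentation_error(predicted, ground_truth):
    """Percentage of pixels whose binary label differs from ground truth.

    Assumed measure: both inputs are masks of the same shape, with
    nonzero meaning "inside the band" and zero meaning "outside".
    """
    predicted = np.asarray(predicted, dtype=bool)
    ground_truth = np.asarray(ground_truth, dtype=bool)
    if predicted.shape != ground_truth.shape:
        raise ValueError("masks must have the same shape")
    return 100.0 * np.mean(predicted != ground_truth)

# Toy 2x4 masks (assumed data): two of the eight pixels are mislabelled.
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0]])

err = segmentation_error(pred, truth)
```

Evaluating such a measure frame by frame would give a curve over time of the kind plotted in figure 9.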
of the Panum Proxy algorithm are close in quality to what
is obtainable under unconstrained conditions, using the full
range of available disparity. It remains for future work to
test the algorithm under more stringent circumstances, with
greater ranges of disparity in the scenes.
We acknowledge helpful discus-
sions with A. Criminisi, A. Fitzgibbon and V. Kolmogorov.
[1] H.H. Baker and T.O. Binford. Depth from edge and
intensity based stereo. In Proc. Int. Joint Conf. Artifi-
cial Intelligence, pages 631–636, 1981. 2
[2] P.N. Belhumeur, J.P. Hespanha, and D.J. Krieg-
man. Eigenfaces vs. Fisherfaces: recognition using
class specific linear projection. In Proc. European
Conf. Computer Vision, number 800 in Lecture notes
in computer science, pages 45–58. Springer-Verlag,
[3] Y.Y. Boykov and M-P. Jolly. Interactive graph cuts for
optimal boundary and region segmentation of objects
in N-D images. In Proc. Int. Conf. on Computer Vi-
sion, pages 105–112, 2001. 1, 2, 3, 5
[4] Y.Y. Boykov, O. Veksler, and R.D. Zabih. Fast ap-
proximate energy minimization via graph cuts. IEEE
Trans. on Pattern Analysis and Machine Intelligence,
23(11), 2001. 3, 4, 5
[5] C.M. Brown, D. Coombs, and J. Soong. Real-time
smooth pursuit tracking. In A. Blake and A.L. Yuille,
editors, Active Vision, pages 123–136. MIT, 1992. 1
[6] I.J. Cox, S.L. Hingorani, and S.B. Rao. A maximum
likelihood stereo algorithm. Computer vision and im-
age understanding, 63(3):542–567, 1996. 1
[7] A. Criminisi, J. Shotton, A. Blake, and P.H.S. Torr.
Gaze manipulation for one to one teleconferencing. In
Proc. Int. Conf. on Computer Vision, pages 191–198,
[8] K. Pahlavan, T. Uhlin, and J-O. Eklundh. Dynamic fix-
ation. In Proc. Int. Conf. on Computer Vision, pages
412–419, 1993. 1
[9] V. Kolmogorov, A. Criminisi, A. Blake, G. Cross, and
C. Rother. Bi-layer segmentation of binocular stereo
video. In Proc. Conf. Computer Vision and Pattern
Recognition, 2005. 4, 5, 7
[10] V. Kolmogorov and R. Zabih. Computing visual corre-
spondences with occlusions using graph cuts. In Proc.
Int. Conf. on Computer Vision, 2001. 1, 2, 3, 5
[11] V. Kolmogorov and R. Zabih. Multi-camera scene re-
construction via graph cuts. In Proc. European Conf.
Computer Vision, pages 82–96, 2002. 5
[12] Y. Ohta and T. Kanade. Stereo by intra- and inter-
scan line search using dynamic programming. IEEE
Trans. on Pattern Analysis and Machine Intelligence,
7(2):139–154, 1985. 1, 2
[13] T. Poggio and W. Reichardt. Visual control of ori-
entation behaviour in the fly. Quart. Rev. Biophys.,
9(3):377–438, 1984. 1
[14] I.D. Reid and D.W. Murray. Tracking foveated corner
clusters using affine structure. In Proc. Int. Conf. on
Computer Vision, pages 76–83, 1993. 1
[15] D. Scharstein and R. Szeliski. A taxonomy and evalu-
ation of dense two-frame stereo correspondence algo-
rithms. Int. J. Computer Vision, 47(1–3):7–42, 2002.