Binocular disparities used for stereopsis. (A) Two views of a simple 3D scene. The eyes are fixating point F1, which lies straight ahead. Point P is positioned above and to the right of the viewer's face, and is closer in depth than F1. The upper panel shows a side view and the lower panel a view from behind the eyes. Lines of equal azimuth and elevation in Helmholtz coordinates are drawn on each eye. (B) Retinal projections of P from the viewing geometry in (A). The yellow and orange dots correspond to the projections in the left and right eyes, respectively. The difference between the left and right eye projections is binocular disparity. The difference in azimuth is horizontal disparity, and the difference in elevation is vertical disparity. In this example, point P has crossed horizontal disparity, because it is closer than point F1 and the image of P is therefore shifted leftward in the left eye and rightward in the right eye. (C) For a given point in the scene, the disparity at the retinas can change substantially depending on where the viewer is fixating. In the left panel, the same point P is observed, but with a different fixation point F2 that is now closer to the viewer than P (indicated by the arrow). The original fixation point F1 is overlaid in gray. In the right panel, the retinal disparities projected by P are shown for both fixations [disparities from (B) are semitransparent]. For this viewing geometry, point P now has uncrossed horizontal disparity: that is, the image of P is shifted rightward in the left eye and leftward in the right eye.
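To make the caption's geometry concrete, here is a minimal sketch that computes the horizontal disparity of a midline point as the difference between the vergence angle the point subtends and the vergence angle demanded by fixation. The function name, the pinhole-eye simplification, the example distances, and the sign convention (positive = crossed) are illustrative assumptions, not part of the figure:

```python
import numpy as np

def horizontal_disparity(z_point, z_fix, ipd=0.064):
    """Horizontal disparity (radians) of a midline point at distance z_point
    (meters) while the eyes fixate the midline at z_fix (meters).
    Positive = crossed (point nearer than fixation); negative = uncrossed."""
    vergence_point = 2.0 * np.arctan(ipd / (2.0 * z_point))  # angle P subtends
    vergence_fix = 2.0 * np.arctan(ipd / (2.0 * z_fix))      # angle at fixation
    return vergence_point - vergence_fix

# Like panel (B): fixating a far point F1 (5 m), a nearer point P (1 m) is crossed.
print(horizontal_disparity(1.0, 5.0) > 0)   # True: crossed
# Like panel (C): fixating a near point F2 (0.5 m), the same P becomes uncrossed.
print(horizontal_disparity(1.0, 0.5) < 0)   # True: uncrossed
```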


Source publication
Article
Full-text available
Humans and many animals have forward-facing eyes providing different views of the environment. Precise depth estimates can be derived from the resulting binocular disparities, but determining which parts of the two retinal images correspond to one another is computationally challenging. To aid the computation, the visual system focuses the search o...

Contexts in source publication

Context 1
... do, provides an advantage and a challenge. The advantage is that the differences in the two views can be used to compute very precise depth information about the three-dimensional (3D) scene. The differing viewpoints create binocular disparity, namely, horizontal and vertical shifts between the retinal images generated by the visual scene (Fig. 1). These binocular signals are integrated in the primary visual cortex via cells tuned for the magnitude and direction of disparity (1,2), and this information is sent to higher visual areas where it is used to compute the 3D geometry of the ...
Context 2
... disparities would be very broadly distributed. However, it is actually more complicated than that: If the viewer fixated a very distant point, all other points in the scene would be closer than where the viewer was fixating, and the disparity distribution would be entirely composed of crossed horizontal disparities (illustrated for one point in Fig. 1, A and B). Similarly, fixation of a very near point would produce a wide distribution of uncrossed horizontal disparities (Fig. 1C). In this environment with fixations of random distance, the search for solutions to binocular correspondence would have to occur over a very large range of ...
Context 3
... point, all other points in the scene would be closer than where the viewer was fixating, and the disparity distribution would be entirely composed of crossed horizontal disparities (illustrated for one point in Fig. 1, A and B). Similarly, fixation of a very near point would produce a wide distribution of uncrossed horizontal disparities (Fig. 1C). In this environment with fixations of random distance, the search for solutions to binocular correspondence would have to occur over a very large range of ...
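The claim in these contexts, that fixation distance sets the sign of the whole disparity distribution, can be checked with a quick simulation. The scene-distance range, interocular distance, and sign convention below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
ipd = 0.064                                   # assumed interocular distance (m)
scene_z = rng.uniform(0.5, 20.0, 10_000)      # hypothetical scene distances (m)

def disparity(z, z_fix):
    # Vergence-angle difference; positive = crossed, negative = uncrossed.
    return 2 * np.arctan(ipd / (2 * z)) - 2 * np.arctan(ipd / (2 * z_fix))

far = disparity(scene_z, 100.0)   # fixate a very distant point
near = disparity(scene_z, 0.4)    # fixate a very near point
print((far > 0).mean())    # ~1.0: essentially all disparities crossed
print((near < 0).mean())   # ~1.0: essentially all disparities uncrossed
```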
Context 4
... that are statistically the same as the distances of all visible scene points (19,21). We examined this assumption by comparing the distribution of fixation distances to the distribution of distances of all scene points that are visible in the central 20° of the visual field. The relationship between the two distances varies across tasks (fig. S1), but the overall distribution of actual fixations is significantly biased toward closer points than the distribution of potential fixations (Fig. 3B). Specifically, the median weighted-combination fixation distance is 0.87 diopters (114 cm), whereas the median weighted-combination scene distance is 0.47 diopters (211 cm): that is, ...
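The diopter values and the distances in parentheses are reciprocals of one another (diopters = 1 / distance in meters); a one-line check, with small rounding differences expected against the reported figures:

```python
def diopters_to_cm(d):
    """Distance in cm corresponding to d diopters (d = 1 / distance in m)."""
    return 100.0 / d

print(diopters_to_cm(0.87))  # ≈ 115 cm, vs. the reported 114 cm (rounding)
print(diopters_to_cm(0.47))  # ≈ 213 cm, vs. the reported 211 cm (rounding)
```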
Context 5
... material for this article is available at http://advances.sciencemag.org/cgi/content/full/1/4/e1400254/DC1 Fig. S1. Distributions of fixation distances and scene distances for the four tasks. Fig. S2. Median, SD, skewness, and kurtosis of the distributions of horizontal and vertical disparities for each subject and task. Fig. S3. Cumulative version error distribution across all subjects and all calibrations. Table S1. Experimental setups and ...

Citations

... It is clear that the visual system is biologically optimized (tuned) to process and encode disparity information that aligns with naturally occurring and ecologically relevant disparities [57,58]. Disparity tuning curves of the human stereoscopic system have been obtained by using visual evoked potentials [59][60][61], functional magnetic resonance imaging [62] or by measuring simple reaction times [22]. ...
Article
Full-text available
Previous work on visual short-term memory (VSTM) has encompassed various stimulus attributes including spatial frequency, color, and contrast, revealing specific time courses and a dependence on stimulus parameters. This study investigates visual short-term memory for binocular depth, using dynamic random dot stereograms (DRDS) featuring disparity planes in front of or behind the plane of fixation. In a delayed match-to-sample paradigm, we employed four distinct reference disparities (17.5′, 28.8′, either crossed or uncrossed) at two contrast levels (20%, 80%), spanning interstimulus intervals (ISI) of up to 4 s. Test stimuli represented a range of equally spaced values centered around the reference disparity of the ongoing trial. In addition, the impact of a memory masking stimulus was also tested in a separate experiment. Accuracy and point of subjective equality (PSE) served as performance markers. Performance, indicated by the accuracy of responses, was better for smaller reference disparities (±17.5′) than for larger ones (±28.8′), but both deteriorated as a function of ISI. The PSE demonstrated a consistent shift with increasing ISIs, irrespective of the magnitude of the initial disparity, converging gradually toward the range of 20-22′ and deviating from the reference disparity. Notably, the influence of masking stimuli on the PSE was more marked when the mask disparity diverged from the reference value. The findings from our study indicate that the retention of absolute disparity in memory is imprecise; it deteriorates with retention time or due to perturbation by dissimilar masking stimuli. As a result, the memory trace is gradually replaced by a default depth value. This value could potentially signify an optimal point within low-level perceptual memory; however, our results are better explained by perceptual averaging, whereby the visual system computationally derives a statistical summary of the presented disparities over time. The latter mechanism would aid in the computation of relative disparity in a dynamically changing environment.
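A toy version of the perceptual-averaging account in the abstract's final sentences is a leaky integrator that pulls a stored disparity estimate toward each new sample. The function name, leak rate, and sample values are illustrative assumptions, not quantities from the study:

```python
def leaky_average(disparity_samples, leak=0.2):
    """Running summary of disparity (arcmin) over time: each new sample pulls
    the stored estimate toward itself by a fraction `leak`, so over a delay
    the memory trace drifts away from the reference toward the average of
    recent input, mirroring the reported PSE shift."""
    estimate = disparity_samples[0]
    for d in disparity_samples[1:]:
        estimate += leak * (d - estimate)
    return estimate

# A 28.8' reference followed by samples clustered near 20' drifts downward:
print(leaky_average([28.8, 22.0, 20.0, 21.0, 19.0]))  # ≈ 23.8, heading toward ~20'
```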
... Another Scene + Eye Tracker: Banks and colleagues [36][37][38][39] developed a custom mobile device that measures distances across a wide field of view and binocular eye fixations while people perform everyday tasks. The device measures distance using stereo cameras. ...
... In everyday activities, people tend to look down. 36,37 Thus, we aimed the 2 sensors 18 degrees downward to take into account the likely gaze directions. The sensors are positioned as close as possible to the eyes to minimize parallax mismatches. ...
... We observed that most gaze directions fall within 10 degrees to 15 degrees of straight ahead and slightly down. 36,44 In the other condition, we made the same measurements of distances to scene points in front of the subject but did not include the eye-tracking data. Instead, we assumed that the subjects were fixating the scene point that was straight ahead and downward by 18 degrees. ...
Article
Full-text available
Purpose: In the past few decades, the prevalence of myopia, where the eye grows too long, has increased dramatically. The visual environment appears to be critical to regulating eye growth. Thus, it is very important to determine the properties of the environment that put children at risk for myopia. Researchers have suggested that the intensity of illumination and the range of distances to which a child's eyes are exposed are important, but this has not been confirmed.
Methods: We designed, built, and tested an inexpensive, child-friendly, head-mounted device that can measure the intensity and spectral content of illumination approaching the eyes and can also measure the distances to which the central visual field of the eyes is exposed. The device is mounted on a child's bicycle helmet. It includes a camera that measures distances over a substantial range and a six-channel spectral sensor. The sensors are hosted by a lightweight, battery-powered microcomputer. We acquired pilot data from children while they were engaged in various indoor and outdoor activities.
Results: The device proved to be comfortable, easy, and safe to wear, and able to collect very useful data on the statistics of illumination and distances.
Conclusions: The designed device is an ideal tool to be used in a population of young children, some of whom will later develop myopia and some of whom will not.
Translational Relevance: Such data would be critical for determining the properties of the visual environment that put children at risk for becoming myopic.
... Thus, despite the impressive scope of some datasets (Grauman et al., 2022), it is unclear from these data, for example, where objects typically fall in the visual field, because little or no gaze data were collected. By contrast, several studies have also collected samples of mobile eye tracking data (e.g., Hayhoe & Ballard, 2014; Kothari et al., 2020; Peterson, Lin, Zaun, & Kanwisher, 2016; Sprague, Cooper, Tošić, & Banks, 2015), but these have tended to focus on eye movements in particular circumstances, such as specific environments (Matthis, Yates, & Hayhoe, 2018), specific tasks such as making food or coffee (Fathi, Hodgins, & Rehg, 2011; Hayhoe & Ballard, 2005), or gaze patterns to specific stimuli such as faces (Peterson et al., 2016). Other mobile eye tracking datasets have been collected for developing robust gaze tracking pipelines (Fuhl, Kasneci, & Kasneci, 2021; Kothari et al., 2020). ...
Article
Full-text available
We introduce the Visual Experience Dataset (VEDB), a compilation of more than 240 hours of egocentric video combined with gaze- and head-tracking data that offer an unprecedented view of the visual world as experienced by human observers. The dataset consists of 717 sessions, recorded by 56 observers ranging from 7 to 46 years of age. This article outlines the data collection, processing, and labeling protocols undertaken to ensure a representative sample and discusses the potential sources of error or bias within the dataset. The VEDB's potential applications are vast, including improving gaze-tracking methodologies, assessing spatiotemporal image statistics, and refining deep neural networks for scene and activity recognition. The VEDB is accessible through established open science platforms and is intended to be a living dataset with plans for expansion and community contributions. It is released with an emphasis on ethical considerations, such as participant privacy and the mitigation of potential biases. By providing a dataset grounded in real-world experiences and accompanied by extensive metadata and supporting code, the authors invite the research community to use and contribute to the VEDB, facilitating a richer understanding of visual perception and behavior in naturalistic settings.
... A smaller, but consistent, overrepresentation of crossed disparities has also been observed in early visual areas. Sprague et al. (2015) 56 have argued that this bias is a byproduct of the oversampling of cells from the lower hemifield. The same argument is unlikely to apply for the PPC, first because the overrepresentation of crossed disparities is far stronger, second because receptive fields in the PPC are very large, with very few cells being exclusively selective to the upper or lower hemifield 7,44,45,57 , and finally because area VIP is not organized in a retinotopic manner 54 making it less likely that receptive fields from the lower visual hemifield have been oversampled. ...
Preprint
Full-text available
The encoding of three-dimensional visual spatial information is of utmost importance in everyday life, in particular for successful navigation toward targets or threat avoidance. Eye movements challenge this spatial encoding: 2-3 times per second, they shift the image of the outside world across the retina. The macaque ventral intraparietal area (VIP) stands out from other areas of the dorsal 'where' pathway of the primate visual cortical system: many neurons encode visual information irrespective of horizontal and vertical eye position. But does this gaze invariance of spatial encoding at the single neuron level also apply to egocentric distance? Here, concurrent with recordings from area VIP, monkeys fixated a central target at one of three distances (vergence), while a visual stimulus was shown at one of seven distances (disparity). Most neurons' activity was modulated independently by both disparity and eye vergence, demonstrating a different type of invariance than for visual directions. By using population activity, we were able to decode the egocentric distance of a stimulus, which demonstrates that egocentric distances are nonetheless represented within the neuronal population. Our results provide further strong evidence for a role of area VIP in 3D space encoding.
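As a rough illustration of what decoding egocentric distance from population activity can involve, here is a cross-validated linear readout from trial-by-trial firing rates. The arrays, decoder choice, and seven-distance design are our assumptions; the toy rates carry no real distance signal, so the score sits near (or below) zero, whereas real VIP data would do better:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
rates = rng.poisson(10.0, size=(300, 50)).astype(float)    # trials x neurons
distance = rng.choice(np.linspace(0.3, 1.5, 7), size=300)  # stimulus distances (m)

decoder = RidgeCV(alphas=np.logspace(-3, 3, 13))
scores = cross_val_score(decoder, rates, distance, cv=5, scoring="r2")
print(scores.mean())  # ~0 for these synthetic rates: no distance information
```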
... Center bias is driven, in part, by the tendency to recenter the eyes in their orbits. Observers exhibit a strong tendency to fixate the center of a computer display or picture (14,15), or near the head orientation in 360 • virtual environments (35,36) and real environments (32)(33)(34)(37)(38)(39). Past work has shown that there are a number of factors that contribute to the dispersion and center of this center bias, including task demands, environmental layout, FOV size, motor biases, photographer bias, and the tendency to re-center the eyes in the orbit (14,15). ...
... There is a possibility of an upward vertical bias in the gaze data, if certain participants accidentally held the calibration card below their interocular axis. That said, we did not observe fixation locations across tasks that were systematically above those measured in past studies with similar mobile eye tracking methods (32, 33, 37-39), so we do not suspect such a bias is significant in our dataset. We observed a larger spread vertically than horizontally of gaze-in-head positions in some tasks but not others (SI Appendix, Fig. S1). ...
... We observed a larger spread vertically than horizontally of gaze-in-head positions in some tasks but not others (SI Appendix, Fig. S1). We do not believe that this was an artifact caused by device slippage given that 1) the eye tracker was affixed well to the head, 2) we do not see a consistent vertical spread bias across tasks or observers, and 3) past studies have observed similar task-dependent anisotropies (see ref. 39, their figure 3C, "Make a Sandwich" task vs. others; also see ref. 33). ...
Article
Humans coordinate their eye, head, and body movements to gather information from a dynamic environment while maximizing reward and minimizing biomechanical and energetic costs. However, such natural behavior is not possible in traditional experiments employing head/body restraints and artificial, static stimuli. Therefore, it is unclear to what extent mechanisms of fixation selection discovered in lab studies, such as inhibition-of-return (IOR), influence everyday behavior. To address this gap, participants performed nine real-world tasks, including driving, visually searching for an item, and building a Lego set, while wearing a mobile eye tracker (169 recordings; 26.6 h). Surprisingly, in all tasks, participants most often returned to what they just viewed and saccade latencies were shorter preceding return than forward saccades, i.e., consistent with facilitation, rather than inhibition, of return. We hypothesize that conservation of eye and head motor effort (“laziness”) contributes. Correspondingly, we observed center biases in fixation position and duration relative to the head’s orientation. A model that generates scanpaths by randomly sampling these distributions reproduced all return phenomena we observed, including distinct 3-fixation sequences for forward versus return saccades. After controlling for orbital eccentricity, one task (building a Lego set) showed evidence for IOR. This, along with small discrepancies between model and data, indicates that the brain balances minimization of motor costs with maximization of rewards (e.g., accomplished by IOR and other mechanisms) and that the optimal balance varies according to task demands. Supporting this account, the orbital range of motion used in each task traded off lawfully with fixation duration.
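The random-sampling model described in this abstract can be sketched in a few lines: draw fixation positions and durations independently from their empirical distributions and string them into a scanpath. The independence assumption, names, and toy distributions here are ours; the authors' model may couple these quantities differently:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_scanpath(gaze_positions, fixation_durations, n_fixations=100):
    """Resample empirical gaze-in-head positions (N x 2 array, degrees) and
    fixation durations (seconds) with replacement to synthesize a scanpath."""
    pos = gaze_positions[rng.integers(0, len(gaze_positions), n_fixations)]
    dur = fixation_durations[rng.integers(0, len(fixation_durations), n_fixations)]
    return pos, dur

# With center-biased positions, return fixations arise by chance alone:
positions = rng.normal(0.0, 5.0, size=(5000, 2))  # toy center-biased gaze
durations = rng.gamma(2.0, 0.15, size=5000)       # toy fixation durations (s)
path, times = sample_scanpath(positions, durations)
```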
... As such, this cue supports a variety of perceptual tasks such as figure/ground segregation, three-dimensional motion perception, and breaking camouflage [10][11][12]. The magnitude and direction of horizontal binocular disparity varies lawfully as a function of how far objects are from the observer as well as where the observer is fixating, and prior work has shown that these variations result in predictable statistical regularities in the binocular disparities encountered during natural tasks [13][14][15][16]. We hypothesized that early representations of horizontal binocular disparity maximize the information carried about typical disparities encountered during natural behavior, while later representations may reallocate neuronal resources to facilitate discrimination of disparity in support of perceptual tasks. ...
... To test this sensory transformation hypothesis, we first need an understanding of the distribution of horizontal binocular disparities that the visual system is tasked with processing (hereafter simply referred to as disparities). In recent years, there has been a concerted effort to characterize the visual "diet" of disparities that is typical of natural experience [13][14][15][16]. This prior work suggests several robust statistical properties of typical disparities, most notably that small disparities (near zero) tend to be much more likely than large disparities in central vision. ...
... B. Disparities encountered in the central visual field (10° radius of fixation) tend to be small. Each plot shows the probability density of disparity obtained from data collected in [13] while human participants either navigated an outdoor environment (left) or prepared a sandwich (right). Densities are estimated from over 100 million samples of disparity recorded while three human participants performed these tasks. ...
Article
Full-text available
Neurons throughout the brain modulate their firing rate lawfully in response to sensory input. Theories of neural computation posit that these modulations reflect the outcome of a constrained optimization in which neurons aim to robustly and efficiently represent sensory information. Our understanding of how this optimization varies across different areas in the brain, however, is still in its infancy. Here, we show that neural sensory responses transform along the dorsal stream of the visual system in a manner consistent with a transition from optimizing for information preservation towards optimizing for perceptual discrimination. Focusing on the representation of binocular disparities—the slight differences in the retinal images of the two eyes—we re-analyze measurements characterizing neuronal tuning curves in brain areas V1, V2, and MT (middle temporal) in the macaque monkey. We compare these to measurements of the statistics of binocular disparity typically encountered during natural behaviors using a Fisher Information framework. The differences in tuning curve characteristics across areas are consistent with a shift in optimization goals: V1 and V2 population-level responses are more consistent with maximizing the information encoded about naturally occurring binocular disparities, while MT responses shift towards maximizing the ability to support disparity discrimination. We find that a change towards tuning curves preferring larger disparities is a key driver of this shift. These results provide new insight into previously-identified differences between disparity-selective areas of cortex and suggest these differences play an important role in supporting visually-guided behavior. Our findings emphasize the need to consider not just information preservation and neural resources, but also relevance to behavior, when assessing the optimality of neural codes.
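The Fisher information framework mentioned in this abstract has a standard population form; the setup below assumes independent Poisson neurons with disparity tuning curves, which is the conventional formulation and not necessarily the paper's exact one:

```latex
% Fisher information of a population of N independent Poisson neurons with
% disparity tuning curves f_n(s):
J(s) = \sum_{n=1}^{N} \frac{f_n'(s)^2}{f_n(s)}
% Infomax allocation predicts \sqrt{J(s)} \propto p(s), the natural-scene
% prior over disparity; a discrimination objective instead concentrates J(s)
% where behaviorally relevant distinctions must be made.
```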
... However, the formulation of flexible models for perceived contrast in complex imagery, let alone binocular contrast perception, remains an ongoing area of research [18,34,37]. In our previous work, we explored how a Bayesian ideal observer model, which assumed binocular percepts are determined through a statistically optimal combination of binocular visual input and prior assumptions about the structure of the natural world, could explain specific properties of binocular depth perception [49]. However, this model did not account for any other properties of binocular appearance, like contrast, luster, and rivalry. ...
Article
Augmented reality (AR) devices seek to create compelling visual experiences that merge virtual imagery with the natural world. These devices often rely on wearable near-eye display systems that can optically overlay digital images to the left and right eyes of the user separately. Ideally, the two eyes should be shown images with minimal radiometric differences (e.g., the same overall luminance, contrast, and color in both eyes), but achieving this binocular equality can be challenging in wearable systems with stringent demands on weight and size. Basic vision research has shown that a spectrum of potentially detrimental perceptual effects can be elicited by imagery with radiometric differences between the eyes, but it is not clear whether and how these findings apply to the experience of modern AR devices. In this work, we first develop a testing paradigm for assessing multiple aspects of visual appearance at once, and characterize five key perceptual factors when participants viewed stimuli with interocular contrast differences. In a second experiment, we simulate optical see-through AR imagery using conventional desktop LCD monitors and use the same paradigm to evaluate the multifaceted perceptual implications when the AR display luminance differs between the two eyes. We also include simulations of monocular AR systems (i.e., systems in which only one eye sees the displayed image). Our results suggest that interocular contrast differences can drive several potentially detrimental perceptual effects in binocular AR systems, such as binocular luster, rivalry, and spurious depth differences. In addition, monocular AR displays tend to have more artifacts than binocular displays with a large contrast difference in the two eyes. A better understanding of the range and likelihood of these perceptual phenomena can help inform design choices that support a high-quality user experience in AR.
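One simple way to quantify the interocular contrast differences studied in this abstract is the log ratio of the two eyes' Michelson contrasts; this particular parameterization is our assumption rather than the paper's definition:

```python
import math

def michelson_contrast(lum_max, lum_min):
    """Michelson contrast of a patch from its max and min luminance."""
    return (lum_max - lum_min) / (lum_max + lum_min)

def interocular_contrast_log_ratio(c_left, c_right):
    """0 when the eyes match; positive when left-eye contrast is higher."""
    return math.log(c_left / c_right)

# A large mismatch of the kind associated with luster and rivalry:
print(interocular_contrast_log_ratio(0.8, 0.2))
```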
... Consistent with prior observations, the nearby terrestrial samples were also dark biased. The dominance of dark contrasts in the zebrafish habitat is primarily in the lower visual field. Several properties of visible light vary systematically as a function of elevation in the visual field, in both terrestrial and aquatic imagery 12,16,31,50,51. These variations suggest that natural environments can place different demands on cells and circuits that typically encode the upper and lower visual fields. ...
Article
Full-text available
Animal sensory systems are tightly adapted to the demands of their environment. In the visual domain, research has shown that many species have circuits and systems that exploit statistical regularities in natural visual signals. The zebrafish is a popular model animal in visual neuroscience, but relatively little quantitative data is available about the visual properties of the aquatic habitats where zebrafish reside, as compared to terrestrial environments. Improving our understanding of the visual demands of the aquatic habitats of zebrafish can enhance the insights about sensory neuroscience yielded by this model system. We analyzed a video dataset of zebrafish habitats captured by a stationary camera and compared this dataset to videos of terrestrial scenes in the same geographic area. Our analysis of the spatiotemporal structure in these videos suggests that zebrafish habitats are characterized by low visual contrast and strong motion when compared to terrestrial environments. Similar to terrestrial environments, zebrafish habitats tended to be dominated by dark contrasts, particularly in the lower visual field. We discuss how these properties of the visual environment can inform the study of zebrafish visual behavior and neural processing and, by extension, can inform our understanding of the vertebrate brain.
... Nevertheless, the VMC has often been used in binocular vision research as the benchmark reference against which other horopters are judged; see (Schreiber et al., 2006) and (Sprague et al., 2015), for example. In (Schreiber et al., 2006), the VMC passing through the eyes' rotation centers was used to discuss the extrapolation of the VMC to a full horopteric surface that quantified binocular alignment and visualized its dependence on eye position. ...
... In (Schreiber et al., 2006), the VMC passing through the eyes' rotation centers was used to discuss the extrapolation of the VMC to a full horopteric surface that quantified binocular alignment and visualized its dependence on eye position. In (Sprague et al., 2015), it was suggested that the empirical horopter deviation from the VMC results from the visual system allocating resources according to natural disparity statistics for binocular correspondence matches. These papers' statements do not conform to anatomically and geometrically precise results obtained in (Turski, 2020), the high impact of which was discussed above. ...
Article
Full-text available
The horopter's history may partly be responsible for its ambiguous psychophysical definitions and obscured physiological significance. However, the horopter is a useful clinical tool integrating physiological optics and binocular vision. This article aims to help understand how it could come to such different attitudes toward the horopter. After the basic concepts underlying binocular space perception and stereopsis are presented, the horopter's old ideas that influence today's research show their inconsistencies with the conceptualized binocular vision. Two recent geometric theories of the horopter with progressively higher eye model fidelity that resolve the inconsistencies are reviewed. The first theory corrects the 200-year-old Vieth-Müller circle still used as a geometric horopter. The second theory advances Ogle's classical work by modeling empirical horopters as conic sections in the binocular system with the asymmetric eye model that accounts for the observed misalignment of optical components in human eyes. Its extension to iso-disparity conics is discussed.
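For reference, the Vieth-Müller circle (VMC) discussed here has a one-line geometric definition: with the eyes' nodal (or rotation) centers and the fixation point, the VMC is the circle through all three, so by the inscribed-angle theorem every point on it subtends the same binocular parallax:

```latex
% With nodal points N_L, N_R and fixation point F, the VMC is the circle
% through N_L, N_R, and F; for every point P on it,
\angle N_L P N_R = \angle N_L F N_R
```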
... This inhomogeneity persists along visual pathways and is related to cortical magnification factor (Wassle et al., 1989; Chaplin et al., 2013), acuity (Albus, 1975; Wilson and Sherman, 1976; Tusa et al., 1978; Van Essen et al., 1984; van Beest et al., 2021; Tan et al., 2022), color sensitivity (Rhim et al., 2017), and irregular ocular dominance domains (Adams and Horton, 2002). Additional inhomogeneities emerge within cortex, such as the irregular distribution of orientation representation in mice (Tan et al., 2022) and disparity selectivity in mice and nonhuman primates (Sprague et al., 2015; La Chioma et al., 2019; Samonds et al., 2019). By making gaze changes to cover different regions of the scene, an internal representation may be constructed by integrating novel receptive field information over successive fixations (Gottlieb, 2007; Melcher and Colby, 2008; Cavanagh et al., 2010; Ganmor et al., 2015; Wolf and Schutz, 2015; Stewart et al., 2020). ...
Preprint
Full-text available
Most vertebrates use saccadic eye movements to quickly change gaze orientation and sample different portions of the environment. Visual information is integrated across several fixations to construct a more complete perspective. In concert with this sampling strategy, neurons adapt to unchanging input to conserve energy and ensure that only information for novel fixations is processed. We demonstrate how adaptation recovery times and saccade properties interact, and thus shape spatiotemporal tradeoffs observed in the motor and visual systems of different species. These tradeoffs predict that in order to achieve similar visual coverage over time, animals with smaller receptive field sizes require faster saccade rates. We find comparable sampling of the visual environment by neuronal populations across mammals when integrating measurements of saccadic behavior with receptive field sizes and V1 neuronal density. We propose that these mammals share a common statistically driven strategy of maintaining coverage of their visual environment over time calibrated to their respective visual system characteristics.
Significance Statement: Mammals rapidly move their eyes to sample their visual environment over successive fixations, but they use different spatial and temporal strategies for this sampling. We demonstrate that these different strategies achieve similar neuronal receptive field coverage over time. Because mammals have distinct sensory receptive field sizes and neuronal densities for sampling and processing information, they require different eye movement strategies to encode natural scenes.
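The stated tradeoff (smaller receptive fields require faster saccade rates for equal coverage over time) can be written as a back-of-the-envelope invariance; the formulation below is ours, not the authors' model:

```latex
% If each fixation refreshes roughly one receptive field's worth of scene
% per neuron, constant visual coverage per unit time requires
r_{\mathrm{saccade}} \times A_{\mathrm{RF}} \approx \mathrm{const}
% so smaller receptive-field areas A_RF demand proportionally faster
% saccade rates r_saccade.
```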