Journal of Vision

Published by Association for Research in Vision and Ophthalmology (ARVO)

Online ISSN: 1534-7362

Articles


Change for the better

October 2013

Figure 1. Experimental methods. (A) Stimuli. Stimuli were 24 images of objects with a planned hierarchical category structure. The exemplar level is the individual images. These images group into six lower tier categories: animal bodies, animal faces, human bodies, human faces, man-made objects, and natural objects. The animate objects can be grouped into the intermediate tier categories faces and bodies (Level 2a), or alternatively into the intermediate tier categories humans and animals (Level 2b). The highest tier distinguishes animate and inanimate objects. (B) Trial sequence. Each image was displayed for 533 ms with a variable interstimulus interval that ranged from 900 to 1200 ms. In the center of each image was a letter, and the participants' task was to report whether the letter was a consonant or a vowel. (C) Same exemplar decoding. The classifier is trained to decode the category of the stimulus: 90% of the data are used to train the classifier, and the trained classifier is then tested with the remaining 10% of the data. Color coding in the figure corresponds to the colors in Figure 1A. In the example shown, the classifier is trained to decode whether or not the image shown to the observer is an animal body (denoted by the asterisk). (D) Novel exemplar decoding. The classifier is again trained to classify animal bodies (denoted by the asterisk) against other stimuli. Here, the classifier is trained with blocks of exemplar data, with a 2:1 ratio of training to test data: two thirds of the exemplars from each category are used to train the classifier, and the excluded exemplars' data are used to test it.
Figure 3. Emergence of exemplar and category discriminability for IMCV. (A) Average discriminability (d′) for all exemplar pairs. (B) Within-category exemplar discriminability (d′) for Level 1 exemplar pairs. Exemplars are discriminable within each category. (C), (D), & (E) Discriminability of each Level 1, Level 2, and Level 3 category from stimuli outside the categories. The dashed line is average category decoding performance for 100 arbitrary categories (i.e., categories comprised of randomly assigned exemplars). Solid lines are d′ averaged across subjects. The shaded region is 1 SEM across subjects. Color-coded asterisks below the plots indicate above-chance performance, as evaluated by a Wilcoxon signed rank test with a threshold of p < 0.01. Peak performance is indicated by color-coded arrows above the plots. The onset and peak latencies are reported in the figure legends. The thick solid line below Plot A indicates the time the stimulus was on the screen (stimulus duration: 500 ms).
Figure 4. Peak discriminability across categories for IMCV. (A) Decoding peak latencies for each level of the stimulus hierarchy (averages of within-category pairs). Central red lines indicate the median peak latency, box edges indicate the 25th and 75th percentiles, and whiskers indicate the most extreme values that are not outliers. Outliers are plotted as red crosses; those outside of 0 to 400 ms are plotted on the lower and upper bounds. Above the figure, the outcomes of Wilcoxon signed rank tests comparing levels of the stimulus hierarchy are summarized. Thick lines indicate the base comparison; thin lines indicate the comparison. A single asterisk indicates significance at p < 0.05; double asterisks indicate significance at p < 0.01. (B) Correlations between category differences evaluated by visual models and the observed onset and peak latency estimates. Asterisks indicate significant correlations (Spearman bootstrap test, uncorrected for multiple comparisons).
Figure 5. Category decoding with IECV. Panels A–C show decoding accuracy as a function of time for the three category levels. Here the decoder predicts whether a novel exemplar (not used in training) belongs inside or outside the indicated category. The dashed line is average category decoding performance for 100 arbitrary categories (i.e., categories comprised of randomly assigned exemplars). Solid lines are d′ averaged across subjects. The shaded region is 1 SEM across subjects. Color-coded asterisks below the plot indicate above-chance performance, as evaluated by a Wilcoxon signed rank test with a threshold of p < 0.01. The onset and peak latency for each category is shown in the figure legends. (A) Performance for lower tier category comparisons. (B) Performance for intermediate tier category comparisons. (C) Performance for the highest tier category comparison (animacy). (D) Summary boxplots for IECV. Central red lines indicate the median peak latency, edges indicate the 25th and 75th percentiles, and whiskers indicate the extreme values that are not outliers. Outliers are shown as red crosses; those outside of 0 to 400 ms are plotted on the lower and upper bounds. Above the figure, the outcomes of Wilcoxon signed rank tests comparing levels of the stimulus hierarchy are summarized. Thick lines indicate the base comparison; thin lines indicate the comparison. Asterisks indicate significance at p < 0.05. (E) Correlations between category differences evaluated by visual models and the observed onset and peak latency estimates. Asterisks indicate significant correlations (Spearman bootstrap test, uncorrected for multiple comparisons).

Representational dynamics of object vision: The first 1000 ms

August 2013

Human object recognition is remarkably efficient. In recent years, significant advancements have been made in our understanding of how the brain represents visual objects and organizes them into categories. Recent studies using pattern analysis methods have characterized a representational space of objects in human and primate inferior temporal cortex in which object exemplars are discriminable and cluster according to category (e.g., faces and bodies). In the present study we examined how category structure in object representations emerges in the first 1000 ms of visual processing. Participants viewed 24 object exemplars with a planned categorical structure comprising four levels, ranging from highly specific (individual exemplars) to highly abstract (animate vs. inanimate), while their brain activity was recorded with magnetoencephalography (MEG). We used a sliding time window decoding approach to decode, on a moment-to-moment basis, the exemplar and the exemplar's category that participants were viewing. We found that exemplar and category membership could be decoded from the neuromagnetic recordings shortly after stimulus onset (<100 ms), with peak decodability following thereafter. Latencies for peak decodability varied systematically with the level of category abstraction, with more abstract categories emerging later, indicating that the brain hierarchically constructs category representations. In addition, we examined the stationarity of the patterns of brain activity that encode object category information and show that these patterns vary over time, suggesting the brain might use flexible, time-varying codes to represent visual object categories.
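
The sliding time window decoding described above can be sketched in a few lines. The code below is a minimal illustration, not the authors' pipeline: it simulates MEG-like data (trials × sensors × time points), trains a linear classifier on each short window, and converts cross-validated accuracy to d′ under the assumption of an unbiased classifier.

```python
# Minimal sketch of sliding time-window decoding (illustrative; simulated data,
# not the authors' pipeline).
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_sensors, n_times = 120, 30, 200             # hypothetical sizes
X = rng.standard_normal((n_trials, n_sensors, n_times))
y = np.repeat([0, 1], n_trials // 2)                     # e.g., animate vs. inanimate
X[y == 1, :, 80:120] += 0.3                              # inject a weak category signal

win = 5                                                  # samples per sliding window
dprime = np.zeros(n_times - win)
for t in range(n_times - win):
    Xw = X[:, :, t:t + win].reshape(n_trials, -1)        # window -> feature vector
    acc = cross_val_score(LogisticRegression(max_iter=1000), Xw, y, cv=10).mean()
    dprime[t] = 2 * norm.ppf(np.clip(acc, 0.01, 0.99))   # d' for an unbiased classifier

print(f"peak d' = {dprime.max():.2f} at window index {dprime.argmax()}")
```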

James Jurin (1684-1750): A pioneer of crowding research?

January 2015

James Jurin wrote an extended essay on distinct and indistinct vision in 1738. In it, he distinguished between "perfect," "distinct," and "indistinct vision" as perceptual categories, and his meticulous descriptions and analyses of perceptual phenomena contained observations that are akin to crowding. Remaining with the concepts of his day, however, he failed to recognize crowding as separate from spatial resolution. We present quotations from Jurin's essay and place them in the context of the contemporary concerns with visual resolution and crowding. © 2015 ARVO.

On the decline of 1st and 2nd order sensitivity with eccentricity

February 2008

We studied the relationship between the decline in sensitivity that occurs with eccentricity for stimuli of different spatial scale defined by either luminance (LM) or contrast (CM) modulation. We show that the detectability of CM stimuli declines with eccentricity in a spatial frequency-dependent manner, and that the rate of sensitivity decline for CM stimuli is roughly that expected from their 1st order carriers, except, possibly, at finer scales. Using an equivalent noise paradigm, we investigated why foveal sensitivity for detecting LM and CM stimuli differs, as well as why the detectability of 1st order stimuli declines with eccentricity. We show the former can be modeled by an increase in internal noise whereas the latter involves both an increase in internal noise and a loss of efficiency. To encompass both the threshold and suprathreshold transfer properties of peripheral vision, we propose a model in terms of the contrast gain of the underlying mechanisms.
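
The equivalent noise logic can be made concrete with the standard linear amplifier model, in which the squared contrast threshold grows with external noise as (N_eq + N_ext)/η. The sketch below is illustrative only; the noise levels and thresholds are made up, not the study's data.

```python
# Schematic equivalent-noise (linear amplifier model) fit; illustrative only.
# Squared threshold = (internal equivalent noise + external noise) / efficiency.
import numpy as np
from scipy.optimize import curve_fit

def lam(n_ext, n_eq, eta):
    """Squared contrast threshold as a function of external noise variance."""
    return (n_eq + n_ext) / eta

# Hypothetical external noise levels and measured contrast thresholds
n_ext = np.array([0.0, 0.01, 0.02, 0.05, 0.1, 0.2])
thresholds = np.array([0.08, 0.09, 0.10, 0.13, 0.17, 0.23])   # made-up data

(n_eq, eta), _ = curve_fit(lam, n_ext, thresholds**2, p0=[0.01, 1.0])
print(f"equivalent internal noise = {n_eq:.4f}, efficiency = {eta:.2f}")
# A difference driven only by internal noise shifts n_eq; a loss of efficiency
# additionally lowers eta, which is how the two accounts are separated.
```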

Figure 1 . Two-stroke apparent motion sequence. Two pattern frames (Frames 1 and 2) are presented repeatedly. An interstimulus interval (ISI) intervenes at one of the two frame transitions. The Frame 1 – Frame 2 transition in this example should generate a rightward motion signal in the visual system (arrows). The Frame 2 – Frame 1 transition would normally generate a leftward motion signal, but the effect of the ISI reverses this signal, so the sequence appears unidirectionally rightward. Reproduced with permission from Mather and Challinor (2009). 
Figure 2 . Center – surround interaction indicated by effects of size and contrast on motion perception. (A) Duration thresholds of direction discrimination as a function of stimulus size at different contrasts. (B) Log threshold change relative to the optimal size at each contrast level. Reproduced with permission from Tadin and Lappin (2005). 
Figure 3. Two types of spatial motion pooling (Amano et al., 2009a). (Left) One-dimensional motion pooling. When local motion elements are directionally ambiguous 1D patterns, as in the case of global Gabor motion, 1D local motion signals are integrated across orientation and space at the same time. IOC: intersection of constraints. (Right) Two-dimensional motion pooling. When local motion elements are 2D patterns, as in the case of global plaid motion, 1D local motion signals are first locally integrated across orientation (stage in red), and the resulting local 2D motion signals are integrated over space (stage in blue). VA: vector average. Modified with permission from Amano et al. (2009a).
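
The two pooling rules named in the caption above (IOC and VA) can be computed directly for a pair of 1D components: each grating with unit normal n_i and normal speed s_i constrains the pattern velocity v through n_i · v = s_i. The numbers below are hypothetical.

```python
# Intersection of constraints (IOC) vs. vector average (VA) for two 1D motions.
# Each component grating with unit normal n_i and normal speed s_i constrains
# the 2D pattern velocity v by n_i . v = s_i.  Hypothetical component values.
import numpy as np

def unit(theta_deg):
    t = np.deg2rad(theta_deg)
    return np.array([np.cos(t), np.sin(t)])

normals = np.stack([unit(30.0), unit(-30.0)])    # component normal directions
speeds = np.array([2.0, 2.0])                    # deg/s along each normal

v_ioc = np.linalg.solve(normals, speeds)         # solves n_i . v = s_i exactly
v_va = (speeds[:, None] * normals).mean(axis=0)  # vector average of components

print("IOC velocity:", v_ioc)   # ~[2.31, 0.0]: faster than either component
print("VA  velocity:", v_va)    # ~[1.73, 0.0]: same direction here, slower speed
```
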
Figure 5. Motion-based integration of object properties. (A) Trajectory integration of color. Space–time plots of multipath displays in which integration of color signals along a rightward color-alternating path results in color mixing, whereas integration along a leftward color-keeping path results in color segregation. When the path-length ratio of the color-keeping path is 1 (left), the color-keeping path predominates in motion perception. When the path-length ratio is 4 (right), the color-alternating path predominates. In accordance with this direction change, apparent color also changes. Reproduced with permission from Watanabe and Nishida (2007). (B) Mobile computing. In each patch, color alternates between red and green and motion alternates between inward and outward. The task is to report the direction of the red dots while fixating the central cross. When the observers attend to one location, they cannot judge the binding between color and direction when the alternation rate is fast (say 4 Hz). However, when the observers are shown a guide ring that allows them to attentively track a specific combination of color and motion over space and time, they can perform the binding task due to spatiotemporal integration of object features. Modified with permission from Cavanagh et al. (2008).
Advancement of motion psychophysics: Review 2001-2010
This is a survey of psychophysical studies of motion perception carried out mainly in the last 10 years. It covers a wide range of topics, including the detection and interactions of local motion signals, motion integration across various dimensions for vector computation and global motion perception, second-order motion and feature tracking, motion aftereffects, motion-induced mislocalizations, timing of motion processing, cross-attribute interactions for object motion, motion-induced blindness, and biological motion. While traditional motion research has benefited from the notion of the independent "motion processing module," recent research efforts have also been directed to aspects of motion processing in which interactions with other visual attributes play critical roles. This review tries to highlight the richness and diversity of this large research field and to clarify what has been done and what questions have been left unanswered.

Bullet trains and steam engines: Exogenous attention zips but endogenous attention chugs along

April 2011

Analyzing a scene requires shifting attention from object to object. Although several studies have attempted to determine the speed of these attentional shifts, there are large discrepancies in their estimates. Here, we adapt a method pioneered by T. A. Carlson, H. Hogendoorn, and F. A. J. Verstraten (2006) that directly measures pure attentional shift times. We also test if attentional shifts can be handled in parallel by the independent resources available in the two cortical hemispheres. We present 10 "clocks," with single revolving hands, in a ring around fixation. Observers are asked to report the hand position on one of the clocks at the onset of a transient cue. The delay between the reported time and the veridical time at cue onset can be used to infer processing and attentional shift times. With this setup, we use a novel subtraction method that utilizes different combinations of exogenous and endogenous cues to determine shift times for both types of attention. In one experiment, subjects shift attention to an exogenously cued clock (baseline condition) in one block, and in other blocks, subjects perform one further endogenous shift to a nearby clock (test condition). In another experiment, attention is endogenously cued to one clock (baseline condition), and on other trials, an exogenous cue further shifts attention to a nearby clock (test condition). Subtracting report delays in the baseline condition from those obtained in the test condition allows us to isolate genuine attentional shift times. In agreement with previous studies, our results reveal that endogenous attention is much slower than exogenous attention (endogenous: 250 ms; exogenous: 100 ms). Surprisingly, the dependence of shift time on distance is minimal for exogenous attention, whereas it is steep for endogenous attention. In the final experiment, we find that endogenous shifts are faster across hemifields than within a hemifield suggesting that the two hemispheres can simultaneously process at least parts of these shifts.
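
The subtraction logic is simple arithmetic on report delays; the sketch below uses hypothetical delay distributions (not the reported data) to show how a shift time and a bootstrap confidence interval might be computed.

```python
# Sketch of the subtraction method for isolating attentional shift time.
# Report delays (ms) are hypothetical, not the data from the study.
import numpy as np

rng = np.random.default_rng(1)
baseline = rng.normal(150, 40, 200)   # delay: exogenous cue only
test = rng.normal(400, 60, 200)       # delay: exogenous cue + one endogenous shift

shift_time = np.median(test) - np.median(baseline)
print(f"estimated endogenous shift time: {shift_time:.0f} ms")

# Bootstrap a confidence interval on the difference of medians
boot = [np.median(rng.choice(test, test.size)) -
        np.median(rng.choice(baseline, baseline.size)) for _ in range(2000)]
print("95% CI:", np.percentile(boot, [2.5, 97.5]).round(0))
```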

Attention alters decision criteria but not appearance: A reanalysis of Anton-Erxleben, Abrams, and Carrasco (2010)

November 2011

Paying attention to a stimulus affords it many behavioral advantages, but whether attention also changes its subjective appearance is controversial. K. A. Schneider and M. Komlos (2008) demonstrated that the results of previous studies suggesting that attention increased perceived contrast could also be explained by a biased decision mechanism. This bias could be neutralized by altering the methodology to ask subjects whether two stimuli were equal in contrast or not rather than which had the higher contrast. K. Anton-Erxleben, J. Abrams, and M. Carrasco (2010) claimed that, even using this equality judgment, attention could still be shown to increase perceived contrast. In this reply, we analyze their data and conclude that the effects that they reported resulted from fitting symmetric functions that poorly characterized the individual subject data, which exhibited significant asymmetries between the high- and low-contrast tails. The strength of the effect attributed to attentional enhancement in each subject was strongly correlated with this skew. By refitting the data with a response model that included a non-zero asymptotic response in the low-contrast regime, we show that the reported attentional effects are better explained as changes in subjective criteria. Thus, the conclusion of Schneider and Komlos that attention biases the decision mechanism but does not alter appearance is still valid and is in fact supported by the data from Anton-Erxleben et al.
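
The key analytic move described above is fitting the proportion of "same" responses with a function whose low-contrast tail does not fall to zero. Below is one such response model, a Gaussian profile on contrast plus a floor; the data points and parameter values are hypothetical, not those of either study.

```python
# Sketch: fit "same" responses with a non-zero low-contrast asymptote.
# Data and parameter values are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def p_same(c, mu, sigma, amp, floor):
    """Gaussian 'same'-response profile with a non-zero asymptote (floor)."""
    return floor + amp * np.exp(-(c - mu) ** 2 / (2 * sigma ** 2))

contrast = np.array([6, 8, 11, 16, 22, 31, 45, 64])        # test contrast (%)
prop_same = np.array([0.15, 0.2, 0.35, 0.7, 0.8, 0.5, 0.2, 0.05])

(mu, sigma, amp, floor), _ = curve_fit(p_same, contrast, prop_same,
                                       p0=[20, 8, 0.7, 0.1])
print(f"PSE = {mu:.1f}%, low-contrast floor = {floor:.2f}")
# Skewed data forced through a symmetric, zero-asymptote function can shift the
# apparent PSE; letting the floor vary separates that bias from appearance.
```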

On the "special" status of emotional faces ... Comment on Yang, Hong, and Blake (2010)

March 2011

A wealth of literature suggests that emotional faces are given special status as visual objects: Cognitive models suggest that emotional stimuli, particularly threat-relevant facial expressions such as fear and anger, are prioritized in visual processing and may be identified by a subcortical “quick and dirty” pathway in the absence of awareness (Tamietto & de Gelder, 2010). Both neuroimaging studies (Williams, Morris, McGlone, Abbott, & Mattingley, 2004) and backward masking studies (Whalen, Rauch, Etcoff, McInerney, & Lee, 1998) have supported the notion of emotion processing without awareness. Recently, our own group (Adams, Gray, Garner, & Graf, 2010) showed adaptation to emotional faces that were rendered invisible using a variant of binocular rivalry: continuous flash suppression (CFS; Tsuchiya & Koch, 2005). Here we (i) respond to Yang, Hong, and Blake's (2010) criticisms of our adaptation paper and (ii) provide a unified account of adaptation to facial expression, identity, and gender under conditions of unawareness.

Figure 1. PSE (c max) in the test-cued (*) and standard-cued ($) conditions.
Figure 2. Hypothetical effects of attention on the distribution of “same” responses in an equality task with different changes in amplitude and standard deviation but the same shift in PSE between test-cued (blue) and standard-cued (red) conditions. (A) This figure shows two hypothetical psychometric functions with a PSE shift of approximately 5% contrast and no changes in amplitude or standard deviation. In the other figures, either (B, C) the amplitude, (D, E) the standard deviation, or (F, G) both change.
Equality judgments cannot distinguish between attention effects on appearance and criterion: A reply to Schneider (2011)

November 2011

Whether attention modulates the appearance of stimulus features is debated. Whereas many previous studies using a comparative judgment have found evidence for such an effect, two recent studies using an equality judgment have not. Critically, these studies have relied on the assumption that the equality paradigm yields bias-free PSE estimates and is as sensitive as the comparative judgment, without testing these assumptions. Anton-Erxleben, Abrams, and Carrasco (2010) compared comparative judgments and equality judgments with and without the manipulation of attention. They demonstrated that the equality paradigm is less sensitive than the comparative judgment and also bias-prone. Furthermore, they reported an effect of attention on the PSE using both paradigms. Schneider (2011) questions the validity of the latter finding, stating that the data in the equality experiment are corrupted because of skew in the response distributions. Notably, this argument supports the original conclusion by Anton-Erxleben et al.: that the equality paradigm is bias-prone. Additionally, the necessary analyses to show that the attention effect observed in Anton-Erxleben et al. was due to skew in the data were not conducted. Here, we provide these analyses and show that although the equality judgment is bias-prone, the effects we observe are consistent with an increase of apparent contrast by attention.

Does spatio-temporal filtering account for nonretinotopic motion perception? Comment on Pooresmaeili, Cicchini, Morrone, and Burr (2012)

August 2013

Keywords: Ternus-Pikler display; retinotopic processing; nonretinotopic processing; spatio-temporal filters. doi:10.1167/13.10.19

Head and eye gaze dynamics during visual attention shifts in complex environments (vol 12, pg 1, 2012)

February 2012

The dynamics of overt visual attention shifts evoke certain patterns of responses in eye and head movements. In this work, we detail novel findings regarding the interaction of eye gaze and head pose under various attention-switching conditions in complex environments and safety critical tasks such as driving. In particular, we find that sudden, bottom-up visual cues in the periphery evoke a different pattern of eye-head movement latencies as opposed to those during top-down, task-oriented attention shifts. In laboratory vehicle simulator experiments, a unique and significant (p < 0.05) pattern of preparatory head motions, prior to the gaze saccade, emerges in the top-down case. This finding is validated in qualitative analysis of naturalistic real-world driving data. These results demonstrate that measurements of eye-head dynamics are useful data for detecting driver distractions, as well as in classifying human attentive states in time and safety critical tasks.

Figure 2. Reaction time distributions for the speed and accuracy conditions in Experiment 1. Correct trials = thick line. Error trials = thin line. Vertical lines depict mean MRTs.  
Fast saccades toward numbers: Simple number comparisons can be made in as little as 230 ms

April 2011

Visual psychophysicists have recently developed tools to measure the maximal speed at which the brain can accurately carry out different types of computations (H. Kirchner & S. J. Thorpe, 2006). We use this methodology to measure the maximal speed with which individuals can make magnitude comparisons between two single-digit numbers. We find that individuals make such comparisons with high accuracy in 306 ms on average and are able to perform above chance in as little as 230 ms. We also find that maximal speeds are similar for "larger than" and "smaller than" number comparisons and in a control task that simply requires subjects to identify the number in a number-letter pair. The results suggest that the brain contains dedicated processes involved in implementing basic number comparisons that can be deployed in parallel with processes involved in low-level visual processing.
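
The "minimum comparison time" is typically estimated from the RT distributions by testing, bin by bin, whether correct saccades reliably outnumber errors (in the spirit of Kirchner & Thorpe, 2006). The sketch below is illustrative and uses simulated RTs rather than the study's data or exact procedure.

```python
# Sketch: earliest RT bin with reliably more correct than error saccades.
# RTs are simulated; this is not the study's data.
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(2)
correct_rt = rng.normal(310, 50, 900)   # hypothetical correct-trial RTs (ms)
error_rt = rng.normal(290, 60, 100)     # hypothetical error-trial RTs (ms)

bins = np.arange(150, 500, 10)          # 10-ms bins
for lo, hi in zip(bins[:-1], bins[1:]):
    n_c = np.sum((correct_rt >= lo) & (correct_rt < hi))
    n_e = np.sum((error_rt >= lo) & (error_rt < hi))
    if n_c + n_e >= 10:
        p = binomtest(n_c, n_c + n_e, 0.5, alternative="greater").pvalue
        if p < 0.05:
            print(f"earliest above-chance bin starts at {lo} ms")
            break
```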

The perception of 2D orientation is categorically biased

July 2011

Three experimental paradigms were used to investigate the perception of orientation relative to internal categorical standards of vertical and horizontal. In Experiment 1, magnitude estimation of orientation (in degrees) relative to vertical and horizontal replicated a previously reported spatial orientation bias also measured using verbal report: Orientations appear farther from horizontal than they are, whether numeric judgments are made relative to vertical or to horizontal. Analyses of verbal response patterns, however, suggested that verbal reports underestimate the true spatial bias. A non-verbal orientation bisection task (Experiment 2) confirmed that spatial errors are not due to numeric coding and are larger than the 6° error replicated using verbal methods. A spatial error of 8.6° was found in the bisection task, such that an orientation of about 36.4° from horizontal appears equidistant from vertical and horizontal. Finally, using a categorization ("ABX") paradigm in Experiment 3, it was found that there is less memory confusability for orientations near horizontal than for orientations near vertical. Thus, three different types of measures, two of them non-verbal, provide converging evidence that the coding of orientation relative to the internal standards of horizontal and vertical is asymmetrically biased and that horizontal appears to be the privileged axis.

Low-level motion analysis of color and luminance for perception of 2D and 3D motion

June 2012

We investigated the low-level motion mechanisms for color and luminance and their integration process using 2D and 3D motion aftereffects (MAEs). The 2D and 3D MAEs obtained with equiluminant color gratings showed that the visual system has a low-level motion mechanism for color motion as well as for luminance motion. The 3D MAE is an MAE for motion in depth after monocular motion adaptation. Apparent 3D motion can be perceived after prolonged exposure of one eye to lateral motion because the difference in motion signal between the adapted and unadapted eyes generates interocular velocity differences (IOVDs). Since IOVDs cannot be analyzed by the high-level motion mechanism of feature tracking, we conclude that a low-level motion mechanism is responsible for the 3D MAE. Since we found different temporal frequency characteristics between the color and luminance stimuli, MAEs in the equiluminant color stimuli cannot be attributed to a residual luminance component in the color stimulus. Although a similar MAE was found with a luminance and a color test, both for 2D and 3D motion judgments, after adapting to either color or luminance motion, the temporal frequency characteristics differed between color and luminance adaptation. The visual system must therefore have a low-level motion mechanism for color signals as well as for luminance ones. We also found that color and luminance motion signals are integrated monocularly before IOVD analysis, as shown by a cross-adaptation effect between color and luminance stimuli. This was supported by an experiment with dichoptic presentations of color and luminance tests, in which color and luminance tests were presented to different eyes with four different combinations of test and adaptation: a color or luminance test in the adapted eye after color or luminance adaptation. The finding of little or no influence of the adaptation/test combination indicates that color and luminance motion signals are integrated prior to the binocular IOVD process.

Both parallelism and orthogonality are used to perceive 3D slant of rectangles from 2D images

February 2007

A 2D perspective image of a slanted rectangular object is sufficient for a strong 3D percept. Two computational assumptions that could be used to interpret 3D from images of rectangles are as follows: (1) converging lines in an image are parallel in the world, and (2) skewed angles in an image are orthogonal in the world. For an accurate perspective image of a slanted rectangle, either constraint implies the same 3D interpretation. However, if an image is rescaled, the 3D interpretations based on parallelism and orthogonality generally conflict. We tested the roles of parallelism and orthogonality by measuring perceived depth within scaled perspective images. Stimuli were monocular images of squares, slanted about a horizontal axis, with an elliptical hole. Subjects judged the length-to-width ratio of the holes, which provided a measure of perceived depth along the object. The rotational alignment of squares within their surface plane was varied from 0 degrees (trapezoidal projected contours) to 20 degrees (skewed projected contours). In consistent-cue conditions, images were accurate projections of either a 10 degree- or 20 degree-wide square, with slants of 75 degrees and 62 degrees, respectively. In cue-conflict conditions, images were generated either by magnifying a 10-degree image to have a projected size of 20 degrees or by minifying a 20-degree image to have a projected size of 10 degrees. For the aligned squares, which do not produce a conflicting skew cue, we found that subjects' judgments depended primarily on projected size and not on the size used to generate the prescaled images. This is consistent with reliance on the convergence cue, corresponding to a parallelism assumption. As squares were rotated away from alignment, producing skewed projected contours, judgments were increasingly determined by the original image size. This is consistent with use of the skew cue, corresponding to an orthogonality assumption. Our results demonstrate that both parallelism and orthogonality constraints are used to perceive depth from linear perspective.

Figure 2. Schematic illustrations of the construction of some 3D and 2D stimuli used in the study. Top shows the original composite image I_com and the 2D offset images I_com(x) and I_com(−x), modified from I_com by a horizontal position offset of I_ir to the right or left by x to get I_ir(x) and I_ir(−x), respectively. Hence, I_com(±x) = I_rel + I_ir(±x). Bottom shows the 2D offset stimulus S(x, x), in which the 2D offset image I_com(x) is presented to both eyes; the 3D stimulus S(0, x), in which the positional offset is present in only the right eye image; and the 3D stimulus S(−x, x), in which the positional offset is in opposite directions in the two eyes. The relative disparity between I_rel and I_ir in the 3D stimuli is x for S(0, x) and 2x for S(−x, x).
Figure 4 shows mean RTs in Experiment 1, when I_rel had a high (90°) orientation contrast, for the various stimulus types listed in Figure 3. RTs vary substantially between subjects; for example, the mean RT(2D_a) of individual subjects varied between 460 and 1768 milliseconds (ms). Hence, for better visualization, the RTs in Figure 4 are normalized by the mean RT(2D_a) of the corresponding subject.
Figure 6. Stimulus characteristics and average RTs in Experiment 2. (A) The schematics of the textures I_rel and I_com(0), with the vertical texture border in the middle. (B) The RTs, each normalized by RT(2D_a) of the corresponding subject, averaged over eight subjects, in the same format as that of Figure 4B. By matched-sample t-test, the average normalized RT(2D_x) is significantly different (p < 0.003) from those of RT(Ground_x) and RT(Figure_x), for both x = a and 2a.
Relative contributions of 2D and 3D cues in a texture segmentation task, implications for the roles of striate and extrastriate cortex in attentional selection

October 2009

Experimental evidence has given strong support to the theory that the primary visual cortex (V1) realizes a bottom-up saliency map (A. R. Koene & L. Zhaoping, 2007; Z. Li, 2002; L. Zhaoping, 2008a; L. Zhaoping & K. A. May, 2007). Unlike the conventional models of texture segmentation, this theory predicted that segmenting two textures in an image I(rel) comprising obliquely oriented bars would become much more difficult when a task-irrelevant texture I(ir) of spatially alternating horizontal and vertical bars is superposed on the original texture I(rel). The irrelevant texture I(ir) interferes with I(rel)'s ability to direct attention. This predicted interference was confirmed (L. Zhaoping & K. A. May, 2007) in the form of a prolonged task reaction time (RT). In this study, we investigate whether and how 3D depth perception, believed to be processed mostly beyond V1 and starting in V2 (J. S. Bakin, K. Nakayama, & C. D. Gilbert, 2000; B. G. Cumming & A. J. Parker, 2000; F. T. Qiu & R. von der Heydt, 2005; R. von der Heydt, H. Zhou, & H. S. Friedman, 2000), contribute additionally to direct attention. We measured the reduction of the interference or the RT when the position of the texture grid for I(ir) was offset horizontally from that for I(rel), forming an offset, 2D, stimulus. This reduction was compared with that when this positional offset was only present in the input image to one eye, or when it was in the opposite directions in the images for the two eyes, creating a 3D stimulus with a depth separation between I(ir) and I(rel). The contribution by 3D processes to attentional guidance would be manifested by any extra RT reduction associated with the 3D stimulus over the offset 2D stimulus. This 3D contribution was not present unless the task was so difficult that RT (by button press) based on 2D cues alone was longer than about 1 second. Our findings suggest that, without other top-down factors, V1 plays a dominant role in attentional guidance during an initial window of processing, while cortical areas beyond V1 play an increasing role in later processing. Subject-dependent variations in the manifestations of the 3D effects also suggest that this later, 3D, contribution to attentional guidance can be easily influenced by top-down control.

Figure 3. Results of the experiment. This plot shows the proportion of "left" or "near" responses as a function of the latency. The data are pooled across the five participants. The visual stimuli were defined by 3D or 2D motion (blue or red) and by DO or by DL (light or dark). A logit function was fitted to the data from the four experimental conditions.
Disparity-based stereomotion detectors are poorly suited to track 2D motion

October 2012

A study was conducted to examine the time required to process lateral motion and motion-in-depth for luminance- and disparity-defined stimuli. In a 2 × 2 design, visual stimuli oscillated sinusoidally in either 2D (moving left to right at a constant disparity of 9 arcmin) or 3D (looming and receding in depth between 6 and 12 arcmin) and were defined either purely by disparity (change of disparity over time [CDOT]) or by a combination of disparity and luminance (providing CDOT and interocular velocity differences [IOVD]). Visual stimuli were accompanied by an amplitude-modulated auditory tone that oscillated at the same rate and whose phase was varied to find the latency producing synchronous perception of the auditory and visual oscillations. In separate sessions, oscillations of 0.7 and 1.4 Hz were compared. For the combined CDOT + IOVD stimuli (disparity and luminance [DL] conditions), audiovisual synchrony required a 50 ms auditory lag, regardless of whether the motion was 2D or 3D. For the CDOT-only stimuli (disparity-only [DO] conditions), we found that a similar lag (∼60 ms) was needed to produce synchrony for the 3D motion condition. However, when the CDOT-only stimuli oscillated along a 2D path, the auditory lags required for audiovisual synchrony were much longer: 170 ms for the 0.7 Hz condition, and 90 ms for the 1.4 Hz condition. These results suggest that stereomotion detectors based on CDOT are well suited to tracking 3D motion, but are poorly suited to tracking 2D motion.
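
Converting a fitted phase offset into a latency is a small calculation: a phase advance of φ degrees at oscillation frequency f corresponds to a lag of φ/(360·f) seconds. The phases below are hypothetical and simply chosen to reproduce a 50 ms lag at both rates.

```python
# Sketch: convert an auditory phase offset at the oscillation frequency into a
# latency difference in ms.  Phase values are illustrative, not the study's data.
def phase_to_lag_ms(phase_deg, freq_hz):
    """A phase advance needed for perceived synchrony maps to a lag of
    phase/(360*f) seconds, returned here in milliseconds."""
    return 1000.0 * (phase_deg / 360.0) / freq_hz

for phase, f in [(12.6, 0.7), (25.2, 1.4)]:      # hypothetical fitted phases
    print(f"{f} Hz: phase {phase} deg -> lag {phase_to_lag_ms(phase, f):.0f} ms")
```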

Figure 1. (A) Stimuli from the masked condition. (B) Stimuli from the unmasked condition. (C) The shapes and positions of the area lights with which the objects were illuminated. (D) The patterns of isointensity contours for the unmasked displays. (E) The pattern of probe points used for each object. 
Figure 3. The linear and affine correlations between the simulated object and the response surface computed from the average of observers' settings in each condition. The small square panels depict the patterns of illumination.
Figure 4. A polar plot of the directions and magnitudes of shear in each condition relative to the average shear for each individual object. The squares and circles represent the masked and unmasked conditions, respectively. The different directions of illumination are coded by color, and the black circle represents a shear magnitude of 0.075.
The Effects of Smooth Occlusions and Directions of Illumination on the Visual Perception of 3D Shape from Shading

February 2015

Human observers made local orientation judgments of smoothly shaded surfaces illuminated from different directions by large area lights, both with and without visible smooth occlusion contours. Test-retest correlations between the first and second halves of the experiment revealed that observers' judgments were highly reliable, with a residual error of only 2%. Over 88% of the variance between observers' judgments and the simulated objects could be accounted for by an affine correlation, but there was also a systematic nonaffine component that accounted for approximately 10% of the perceptual error. The presence or absence of visible smooth occlusion contours had a negligible effect on performance, but there was a small effect of the illumination direction, such that the response surfaces were sheared slightly toward the light source. These shearing effects were much smaller, however, than the effects produced by changes in illumination on the overall pattern of luminance or luminance gradients. Implications of these results for current models of estimating 3-D shape from shading are considered. © 2015 ARVO.
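
One common way to compute an affine correlation of the kind reported above is to regress the judged depths on the simulated depths plus linear terms in image position (allowing a depth scaling and shear) and correlate the fit with the judgments. The sketch below uses simulated data and made-up coefficients; it is not the authors' analysis code.

```python
# Sketch: affine correlation between judged and simulated depth maps.
# Judged depth is regressed on simulated depth plus linear (shear) terms in x, y;
# the correlation between fit and judgments gives the affine correlation.
import numpy as np

rng = np.random.default_rng(4)
n = 200
x, y = rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)        # probe positions
z_sim = np.sin(2 * x) * np.cos(2 * y)                       # simulated depths
z_judged = 0.8 * z_sim + 0.3 * x + rng.normal(0, 0.1, n)    # scaled + sheared + noise

A = np.column_stack([z_sim, x, y, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, z_judged, rcond=None)
z_fit = A @ coef
r_affine = np.corrcoef(z_fit, z_judged)[0, 1]
print(f"affine correlation = {r_affine:.3f}; shear coefficients = {coef[1:3].round(2)}")
```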

Infants and adults use line junction information to perceive 3D shape

January 2012

Two experiments investigated infants' and adults' perception of 3D shape from line junction information. Participants in both experiments viewed a concave wire half-cube frame. In Experiment 1, adults reported that the concave wire frame appeared to be convex when it was viewed monocularly (with one eye covered) and that it appeared to be concave when it was viewed binocularly. In Experiment 2, 5- and 7-month-old infants were shown the concave wire frame under monocular and binocular viewing conditions, and their reaching behavior was recorded. The infants in both age groups reached preferentially toward the center of the wire frame in the monocular condition and toward its edges in the binocular condition. Because infants typically reach to what they perceive to be closest to them, these reaching preferences provide evidence that they perceived the wire frame as convex when they viewed it monocularly and as concave when they viewed it binocularly. These findings suggest that, by 5 months of age, infants, like adults, use line junction information to perceive depth and object shape.

A Bayesian model of binocular perception of 3D mirror symmetrical polyhedra

April 2011

In our previous studies, we showed that monocular perception of 3D shapes is based on a priori constraints, such as 3D symmetry and 3D compactness. The present study addresses the nature of perceptual mechanisms underlying binocular perception of 3D shapes. First, we demonstrate that binocular performance is systematically better than monocular performance, and it is close to perfect in the case of three out of four subjects. Veridical shape perception cannot be explained by conventional binocular models, in which shape was derived from depth intervals. In our new model, we use ordinal depth of points in a 3D shape provided by stereoacuity and combine it with monocular shape constraints by means of Bayesian inference. The stereoacuity threshold used by the model was estimated for each subject. This model can account for binocular shape performance of all four subjects. It can also explain the fact that when viewing distance increases, the binocular percept gradually reduces to the monocular one, which implies that monocular percept of a 3D shape is a special case of the binocular percept.

Figure 1. The virtual scene. The semi-transparent and opaque parts of the figure represent the scene in Intervals 1 and 2, respectively. The participants were required to move from side to side to generate motion parallax (red arrow). In a 2IFC task, participants compared the distance to two squares presented in separate intervals.
Cue combination for 3D location judgements

February 2010

Cue combination rules have often been applied to the perception of surface shape but not to judgements of object location. Here, we used immersive virtual reality to explore the relationship between different cues to distance. Participants viewed a virtual scene and judged the change in distance of an object presented in two intervals, where the scene changed in size between intervals (by a factor of between 0.25 and 4). We measured thresholds for detecting a change in object distance when there were only 'physical' (stereo and motion parallax) or 'texture-based' cues (independent of the scale of the scene) and used these to predict biases in a distance matching task. Under a range of conditions, in which the viewing distance and position of the target relative to other objects was varied, the ratio of 'physical' to 'texture-based' thresholds was a good predictor of biases in the distance matching task. The cue combination approach, which successfully accounts for our data, relies on quite different principles from those underlying traditional models of 3D reconstruction.
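
The prediction from threshold ratios follows standard inverse-variance cue weighting, with discrimination thresholds standing in for the cue standard deviations. A schematic calculation with hypothetical numbers:

```python
# Sketch: predict distance-matching bias from 'physical' vs 'texture' thresholds.
# Standard inverse-variance cue weighting; all numbers are hypothetical.
thr_physical = 0.10      # threshold for stereo/motion-parallax cues (relative)
thr_texture = 0.20       # threshold for texture-based cues (relative)

w_phys = (1 / thr_physical**2) / (1 / thr_physical**2 + 1 / thr_texture**2)
w_tex = 1 - w_phys

dist_physical = 1.00     # distance change signalled by physical cues (m)
dist_texture = 1.50      # distance change signalled by texture cues (m)
predicted_match = w_phys * dist_physical + w_tex * dist_texture
print(f"weights: physical {w_phys:.2f}, texture {w_tex:.2f}; "
      f"predicted matched distance {predicted_match:.2f} m")
```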

Figure 1. (A) Condition 1 (left): Six light-emitting diodes (LEDs) were positioned on a table at eye level. The fixation point (FP) was the central LED at either 20 cm (close) or at 150 cm (far). When the close FP was lit, the distracter could be (1) either one of the two lateral LEDs at 20 cm (10° version) or (2) the central LED at 150 cm (14.7° divergence). When the FP was the central LED at 150 cm, the distracter could be (1) either one of the two lateral LEDs at 150 cm (10° version) or (2) the center at 20 cm (14.7° convergence). Condition 2 (middle): Five LEDs were positioned on the same table. The FP was the central LED placed 40 cm away from the subject's eyes. The distracter LED could be (1) either one of the two lateral LEDs at 40 cm (10° version) or (2) the center one at 20 cm (8.3° convergence) or the center one at 150 cm (6.4° divergence). Conditions 3 and 4 (right): Three LEDs were embedded in the vertical plane at a viewing distance of either 40 cm (condition 3) or 150 cm (condition 4). The FP was the central LED. The distracter LED could be one of the two altitudinal LEDs (7.5° version). (B) The gap paradigm was used for the distracter task. After a fixation period varying from 2 to 2.5 s, a temporal gap of 200 ms was introduced before the appearance of the distracter. The distracter was on for 1.5 s, then a beep occurred indicating the 2-s pause (the subject was in a dark room). In the no-distracter task (control), the fixation period lasted 2-2.5 s. Then, the fixation LED was switched off and no distracter appeared. A beep occurred 1.7 s after the extinction of the FP to indicate the pause of 2 s. Dotted rectangles indicate the time window of interest to which the analyses were restricted.
Inhibition of saccade and vergence eye movements in 3D space

February 2005

Inhibitory capacity was investigated by measuring the eye movements of normal subjects asked to fixate a central point, and to suppress eye movements toward visual distracters appearing in the periphery or in depth. Eight right-handed young adults performed such a suppression or distracter task. In different conditions, the distracter could appear at 10 degrees left or right at a distance of 20, 40, or 150 cm (calling for horizontal saccades), or in a central position far or close (calling for convergence or divergence), or 7.5 degrees up or down at 40 or 150 cm (calling for vertical saccades). Eye movements were recorded binocularly with an infrared light eye-movement device. Results showed that (1) suppression performance was not perfect, as the subjects still produced eye movements; (2) errors were distributed unequally in three-dimensional space, with more frequent errors toward distracters calling for convergence, or leftward and downward saccades at a close distance; (3) distracters calling for saccade suppression yielded saccades in the direction of the distracter (that we called prosaccades), and saccades directed away from it (that we called spontaneous antisaccades); (4) for vergence, only distracters calling for convergence yielded errors, which were always promovements; (5) in addition, a small convergent drift was found for convergence distracters. Differences in the errors between saccade and vergence suggest that different inhibitory mechanisms may be involved in the two systems. Spatial left/right, up/down, and close/far asymmetries are interpreted in terms of attentional biases.

Experience affects the use of ego-motion signals during 3D shape perception

December 2010

Experience has long-term effects on perceptual appearance (Q. Haijiang, J. A. Saunders, R. W. Stone, & B. T. Backus, 2006). We asked whether experience affects the appearance of structure-from-motion stimuli when the optic flow is caused by observer ego-motion. Optic flow is an ambiguous depth cue: a rotating object and its oppositely rotating, depth-inverted dual generate similar flow. However, the visual system exploits ego-motion signals to prefer the percept of an object that is stationary over one that rotates (M. Wexler, F. Panerai, I. Lamouret, & J. Droulez, 2001). We replicated this finding and asked whether this preference for stationarity, the "stationarity prior," is modulated by experience. During training, two groups of observers were exposed to objects with identical flow, but that were either stationary or moving as determined by other cues. The training caused identical test stimuli to be seen preferentially as stationary or moving by the two groups, respectively. We then asked whether different priors can exist independently at different locations in the visual field. Observers were trained to see objects either as stationary or as moving at two different locations. Observers' stationarity bias at the two respective locations was modulated in the directions consistent with training. Thus, the utilization of extraretinal ego-motion signals for disambiguating optic flow signals can be updated as the result of experience, consistent with the updating of a Bayesian prior for stationarity.

Figure 1. (A) Surface completion in 2D. (B) Amodal boundaries generated by contour interpolation (in red). The blue disks within amodal boundaries appear as holes unified with the three blue protrusions. The blue disk outside the amodal boundaries and the yellow disks appear as occluding spots. White disks outside amodal boundaries are seen as holes.
Figure 2 . Two stereograms used by Fantoni, Hilger et al. (2005) and Hilger et al. (2006) to demonstrate the occurrence of visual completion in the absence of explicit edge information. (A) A unitary monotonic surface connecting the two random dot patches is perceived. (B) Where a depth offset separates the two patches, they are perceived as unconnected surfaces. A 3D view of the simulated patterns is shown on the left (the arrow indicates the cyclopean visual axis). 
Figure 4. Example of the displays used by Kellman, Garrigan, Shipley, Yin et al. (2005) to test the effect of 3D contour relatability on the speeded classification of parallel/converging displays (rows) either relatable or not (columns). In each quadrant, the upper image is a stereo pair of the two illusory planes slanted in depth, and the lower image is a side view of the same planes [redrawn with permission from Kellman, Garrigan, Shipley, Yin et al. (2005)].
Surface interpolation and 3D relatability

February 2008

Although the role of surface-level processes has been demonstrated, visual interpolation models often emphasize contour relationships. We report two experiments on geometric constraints governing 3D interpolation between surface patches without visible edges. Observers were asked to classify pairs of planar patches specified by random dot disparities and visible through circular apertures (aligned or misaligned) in a frontoparallel occluder. On each trial, surfaces appeared in parallel or converging planes with vertical (in Experiment 1) or horizontal (in Experiment 2) tilt and variable amounts of slant. We expected the classification task to be facilitated when patches were perceived as connected. We found enhanced sensitivity and speed for 3D relatable vs. nonrelatable patches. Here 3D relatability does not involve oriented edges but rather inducing patches' orientations computed from stereoscopic information. Performance was markedly affected by slant anisotropy: both sensitivity and speed were worse for patches with horizontal tilt. We found nearly identical advantages of 3D relatability on performance, suggesting an isotropic unit formation process. Results are interpreted as evidence that inducing slant constrains surface interpolation in the absence of explicit edge information: 3D contour and surface interpolation processes share common geometric constraints as formalized by 3D relatability.

The perception of 3D shape from texture based on directional width gradients

May 2010

A new computational analysis is described that is capable of estimating the 3D shapes of continuously curved surfaces with anisotropic textures that are viewed with negligible perspective. This analysis assumes that the surface texture is homogeneous, and it makes specific predictions about how the apparent shape of a surface should be distorted in cases where that assumption is violated. Two psychophysical experiments are reported in an effort to test those predictions, and the results confirm that observers' ordinal shape judgments are consistent with what would be expected based on the model. The limitations of this analysis are also considered, and a complementary model is discussed that is only appropriate for surfaces viewed with large amounts of perspective.

The effects of task difficulty on visual search strategy in virtual 3D displays

February 2013

Analyzing the factors that determine our choice of visual search strategy may shed light on visual behavior in everyday situations. Previous results suggest that increasing task difficulty leads to more systematic search paths. Here we analyze observers' eye movements in an "easy" conjunction search task and a "difficult" shape search task to study visual search strategies in stereoscopic search displays with virtual depth induced by binocular disparity. Standard eye-movement variables, such as fixation duration and initial saccade latency, as well as new measures proposed here, such as saccadic step size, relative saccadic selectivity, and x-y target distance, revealed systematic effects on search dynamics in the horizontal-vertical plane throughout the search process. We found that in the "easy" task, observers start with the processing of display items in the display center immediately after stimulus onset and subsequently move their gaze outwards, guided by extrafoveally perceived stimulus color. In contrast, the "difficult" task induced an initial gaze shift to the upper-left display corner, followed by a systematic left-right and top-down search process. The only consistent depth effect was a trend of initial saccades in the easy task with smallest displays to the items closest to the observer. The results demonstrate the utility of eye-movement analysis for understanding search strategies and provide a first step toward studying search strategies in actual 3D scenarios.

Stereo improves 3D shape discrimination even when rich monocular shape cues are available

September 2011

We measured the ability to discriminate 3D shapes across changes in viewpoint and illumination based on rich monocular 3D information and tested whether the addition of stereo information improves shape constancy. Stimuli were images of smoothly curved, random 3D objects. Objects were presented in three viewing conditions that provided different 3D information: shading-only, stereo-only, and combined shading and stereo. Observers performed shape discrimination judgments for sequentially presented objects that differed in orientation by rotation of 0°-60° in depth. We found that rotation in depth markedly impaired discrimination performance in all viewing conditions, as evidenced by reduced sensitivity (d') and increased bias toward judging same shapes as different. We also observed a consistent benefit from stereo, both in conditions with and without change in viewpoint. Results were similar for objects with purely Lambertian reflectance and shiny objects with a large specular component. Our results demonstrate that shape perception for random 3D objects is highly viewpoint-dependent and that stereo improves shape discrimination even when rich monocular shape cues are available.
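
Sensitivity (d′) and response bias in a same/different task can be summarized from hit and false-alarm rates in the usual signal detection way, here treating "different" as the signal response. The counts below are hypothetical.

```python
# Sketch: d' and criterion from a same/different shape task (hypothetical counts).
from scipy.stats import norm

hits, misses = 70, 30       # "different" responses to physically different pairs
fas, crs = 25, 75           # "different" responses to physically same pairs

h = hits / (hits + misses)
f = fas / (fas + crs)
dprime = norm.ppf(h) - norm.ppf(f)               # sensitivity
criterion = -0.5 * (norm.ppf(h) + norm.ppf(f))   # positive = bias toward "same"
print(f"d' = {dprime:.2f}, criterion c = {criterion:.2f}")
# Rotation in depth would lower d' and, per the abstract, shift the criterion
# toward judging same shapes as "different" (more negative c in this convention).
```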

Figure 2. Photograph of the kitchen mold used for making the objects (left image), and side views of the biggest (middle image) and smallest (right image) objects used in the experiments. 
Figure 3. A photograph of a typical object array used in the experiments. The pointers indicating the reference object (double cursor) and its target alternatives (single cursors) have been darkened and enlarged for illustration. 
Color and size interactions in a real 3D object similarity task

September 2004

In the natural world, objects are characterized by a variety of attributes, including color and shape. The contributions of these two attributes to object recognition are typically studied independently of each other, yet they are likely to interact in natural tasks. Here we examine whether color and size (a component of shape) interact in a real three-dimensional (3D) object similarity task, using solid domelike objects whose distinct apparent surface colors are independently controlled via spatially restricted illumination from a data projector hidden to the observer. The novel experimental setup preserves natural cues to 3D shape from shading, binocular disparity, motion parallax, and surface texture cues, while also providing the flexibility and ease of computer control. Observers performed three distinct tasks: two unimodal discrimination tasks, and an object similarity task. Depending on the task, the observer was instructed to select the indicated alternative object which was "bigger than," "the same color as," or "most similar to" the designated reference object, all of which varied in both size and color between trials. For both unimodal discrimination tasks, discrimination thresholds for the tested attribute (e.g., color) were increased by differences in the secondary attribute (e.g., size), although this effect was more robust in the color task. For the unimodal size-discrimination task, the strongest effects of the secondary attribute (color) occurred as a perceptual bias, which we call the "saturation-size effect": Objects with more saturated colors appear larger than objects with less saturated colors. In the object similarity task, discrimination thresholds for color or size differences were significantly larger than in the unimodal discrimination tasks. We conclude that color and size interact in determining object similarity, and are effectively analyzed on a coarser scale, due to noise in the similarity estimates of the individual attributes, inter-attribute attentional interactions, or coarser coding of attributes at a "higher" level of object representation.

World-centered perception of 3D object motion during visually guided self-motion

February 2009

We investigated how human observers estimate an object's three-dimensional (3D) motion trajectory during visually guided self-motion. Observers performed a task in an immersive virtual reality system consisting of front, left, right, and floor screens of a room-sized cube. In one experiment, we found that the presence of an optic flow simulating forward self-motion in the background induces a world-centered frame of reference, instead of an observer-centered frame of reference, for the perceived rotation of a 3D surface from motion. In another experiment, we found that the perceived direction of 3D object motion is biased toward a world-centered frame of reference when an optic flow pattern is presented in the background. In a third experiment, we confirmed that the effect of the optic flow pattern on the perceived direction of 3D object motion was not caused only by local motion detectors responsible for the change of the retinal size of the target. These results suggest that visually guided self-motion from optic flow induces world-centered criteria for estimates of 3D object motion.

Stereo and motion parallax cues in human 3D vision: Can they vanish without a trace?

February 2006

·

38 Reads

In an immersive virtual reality environment, subjects fail to notice when a scene expands or contracts around them, despite correct and consistent information from binocular stereopsis and motion parallax, resulting in gross failures of size constancy (A. Glennerster, L. Tcheang, S. J. Gilson, A. W. Fitzgibbon, & A. J. Parker, 2006). We determined whether the integration of stereopsis/motion parallax cues with texture-based cues could be modified through feedback. Subjects compared the size of two objects, each visible when the room was of a different size. As the subject walked, the room expanded or contracted, although subjects failed to notice any change. Subjects were given feedback about the accuracy of their size judgments, where the "correct" size setting was defined either by texture-based cues or (in a separate experiment) by stereo/motion parallax cues. Because of feedback, observers were able to adjust responses such that fewer errors were made. For texture-based feedback, the pattern of responses was consistent with observers weighting texture cues more heavily. However, for stereo/motion parallax feedback, performance in many conditions became worse such that, paradoxically, biases moved away from the point reinforced by the feedback. This can be explained by assuming that subjects remap the relationship between stereo/motion parallax cues and perceived size or that they develop strategies to change their criterion for a size match on different trials. In either case, subjects appear not to have direct access to stereo/motion parallax cues.
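The interpretation offered above (feedback shifting observers toward heavier weighting of texture cues) follows the standard reliability-weighted cue-combination account. The sketch below is a minimal illustration of that account, not the authors' model; the size values and noise parameters are made up for illustration.

```python
# Minimal sketch of reliability-weighted cue combination (illustrative, not the authors' model).
# s_tex / s_stereo: object size signalled by texture vs. stereo/motion-parallax cues;
# each cue's weight is proportional to its assumed reliability (1 / variance).

def combined_size(s_tex, s_stereo, sigma_tex, sigma_stereo):
    w_tex = 1.0 / sigma_tex**2
    w_st = 1.0 / sigma_stereo**2
    return (w_tex * s_tex + w_st * s_stereo) / (w_tex + w_st)

# In an "expanded room", texture cues signal an unchanged size while stereo/motion
# parallax signal a larger one (illustrative numbers).
before = combined_size(s_tex=10.0, s_stereo=14.0, sigma_tex=1.0, sigma_stereo=1.0)
# Texture-based feedback would, on this account, down-weight the stereo/motion-parallax
# cue, modelled here as a larger sigma_stereo.
after = combined_size(s_tex=10.0, s_stereo=14.0, sigma_tex=1.0, sigma_stereo=2.0)
print(before, after)  # 12.0 -> 10.8: the combined estimate moves toward the texture cue
```

Note that this simple reweighting scheme cannot by itself reproduce the paradoxical result for stereo/motion-parallax feedback, which is why the abstract appeals to remapping or criterion changes instead.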

Figure 2. Stimulus presentation arrangement. Figure not to scale. See text for details.
Figure 3. Speed discrimination JNDs for monocular motion (x-axis), plotted in terms of the equivalent stereomotion speed, against those for stereomotion (y-axis). Different plot symbols represent subjects L.S. (patch), B.B. (patch), C.N. (noise), L.L. (noise), and L.S. (faster motion; noise), respectively. Relative trajectory conditions DD, DL, DR, LD, and RD are represented by points colored black, green, red, blue, and orange, respectively. Error bars represent ±1 SEM. Two outliers (points for observer L.L., high above the line) were omitted for clarity (DL: 0.242, 0.417; LD: 0.092, 0.327).
Figure 4. Stereomotion speed discrimination JNDs. The histogram bars represent stereomotion thresholds for all six observers under all five trajectory conditions. These show no systematic pattern of variation. Error bars represent ±1 SEM.
Stereomotion suppression and the perception of speed: Accuracy and precision as a function of 3D trajectory

February 2006

·

68 Reads

The precision and accuracy of speed discrimination performance for stereomotion stimuli were assessed for several receding 3D trajectories confined to the horizontal meridian. It has previously been demonstrated in a variety of tasks that detection thresholds are substantially higher when subjects observe a stereomotion stimulus than when simply viewing one of its component monocular half-images--a phenomenon known as stereomotion suppression (C. W. Tyler, 1971). Using monocularly visible motion in depth targets, we found mean speed discrimination thresholds to be higher for stereomotion, compared with monocular lateral speed discrimination thresholds for equivalent stimuli, demonstrating a disadvantage for binocular viewing in the case of speed discrimination as well. Furthermore, speed discrimination thresholds for motion in depth were not systematically affected by trajectory angle; hence, the disadvantage of binocular viewing persists even when there are concurrent changes in binocular visual direction. Lastly, there was a tendency for oblique trajectories of stereomotion to be perceived as faster than equally rapid motion receding directly away from the subject along the midline. Our data, in addition to earlier stereomotion suppression observations, are consistent with a stereomotion system that takes a noisy, weighted difference of the stimulus velocities in the two eyes to compute motion in depth.
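The final sentence describes the computation as a noisy, weighted difference of the two eyes' velocities. A minimal Monte Carlo sketch of that idea is given below; it is illustrative only (the velocities, noise level, and equal weights are assumptions, not the study's parameters), but it shows why a difference of two noisy monocular signals is expected to be more variable than either signal alone, consistent with higher stereomotion thresholds.

```python
# Illustrative sketch (not the authors' model) of a stereomotion mechanism that
# takes a noisy, weighted difference of the two monocular velocities.
import random

def monocular_sample(v, noise_sd):
    """One noisy velocity estimate from a single eye."""
    return v + random.gauss(0.0, noise_sd)

def motion_in_depth_sample(v_left, v_right, noise_sd, w_left=1.0, w_right=1.0):
    """Weighted interocular velocity difference; each eye contributes its own noise."""
    return w_left * monocular_sample(v_left, noise_sd) - w_right * monocular_sample(v_right, noise_sd)

def sd(samples):
    m = sum(samples) / len(samples)
    return (sum((s - m) ** 2 for s in samples) / (len(samples) - 1)) ** 0.5

random.seed(0)
n = 20000
mono = [monocular_sample(1.0, 0.2) for _ in range(n)]
stereo = [motion_in_depth_sample(1.0, -1.0, 0.2) for _ in range(n)]
# The difference signal inherits noise from both eyes (roughly sqrt(2) larger SD here),
# which by itself would predict poorer speed discrimination for stereomotion.
print(sd(mono), sd(stereo))
```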

Relative flattening between velvet and matte 3D shapes: Evidence for similar shape-from-shading computations

January 2012

·

40 Reads

Among other cues, the visual system uses shading to infer the 3D shape of objects. The shading pattern depends on the illumination and on the reflectance properties (BRDF). In this study, we compared 3D shape perception between identical shapes with different BRDFs. The stimuli were photographs of 3D-printed random smooth shapes that were either painted matte gray or covered with a gray velvet layer. We used the gauge figure task (J. J. Koenderink, A. J. van Doorn, & A. M. L. Kappers, 1992) to quantify 3D shape perception. We found that the shape of velvet objects was systematically perceived to be flatter than that of the matte objects. Furthermore, observers' judgments were more similar for matte shapes than for velvet shapes. Lastly, we compared subjective with veridical reliefs and found large systematic differences: Both matte and velvet shapes were perceived as flatter than the actual shape. The isophote pattern of a flattened Lambertian shape resembles the isophote pattern of an unflattened velvet shape. We argue that the visual system uses a similar shape-from-shading computation for matte and velvet objects that partly discounts material properties.

Perception can influence the vergence responses associated with open-loop gaze shifts in 3D

February 2003

·

16 Reads

We sought to determine if perceived depth can elicit vergence eye movements independent of binocular disparity. A flat surface in the frontal plane appears slanted about a vertical axis when the image in one eye is vertically compressed relative to the image in the other eye: the induced size effect (Ogle, 1938). We show that vergence eye movements accompany horizontal gaze shifts across such surfaces, consistent with the direction of the perceived slant, despite the absence of a horizontal disparity gradient. All images were extinguished during the gaze shifts so that eye movements were executed open-loop. We also used vertical compression of one eye's image to null the perceived slant resulting from prior horizontal compression of that image, and show that this reduces the vergence accompanying horizontal gaze shifts across the surface, even though the horizontal disparity is unchanged. When this last experiment was repeated using vertical expansions in place of the vertical compressions, the perceived slant was increased and so too was the vergence accompanying horizontal gaze shifts, although the horizontal disparity again remained unchanged. We estimate that the perceived depth accounted, on average, for 15-41% of the vergence in our experiments depending on the conditions.

Figure 1 . An example of a shape-from-texture stimulus. 
Figure 2 . An example of a shape-from-motion stimulus. (This image depicts the first frame of the surface shown in Figure 1. Click on the image to view the movie.) 
Identification of 3D shape from texture and motion across the visual field

February 2006

·

265 Reads

Little is known about the perception of 3D shape in the visual periphery. Here we ask whether identification accuracy in shape-from-texture and shape-from-motion tasks can be equated across the visual field with sufficient stimulus magnification. Both tasks employed 3D surfaces comprising hills, valleys, and plains in three possible locations, yielding a 27-alternative forced-choice task (27AFC). Participants performed the task at eccentricities of 0 to 16 deg in the right visual field over a 64-fold range of stimulus sizes. Performance reached ceiling levels at all eccentricities, indicating that stimulus magnification was sufficient to compensate for eccentricity-dependent sensitivity loss. The parameter E(2) (in the equation F = 1 + E / E(2)) was used to characterize the rate at which stimulus size must increase with eccentricity (E) to achieve foveal levels of performance. Three-parameter models (μ, σ, and E(2)) captured most of the variability in the psychometric functions relating stimulus size and eccentricity to accuracy for all participants' data in the two experiments. For the shape-from-texture task, the average E(2) was 1.52, and for the shape-from-motion task, it was 0.61. These E(2) values indicate that sensitivity to structure from motion declines at a faster rate with eccentricity than does sensitivity to structure from texture. Although size scaling with F = 1 + E / E(2) eliminated most eccentricity variation from the structure-from-motion data, there was some evidence that E(2) increases as accuracy decreases in the shape-from-texture task, suggesting that there may be more than one eccentricity-dependent limitation on performance in this task.
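The scaling rule quoted in the abstract, F = 1 + E / E(2), can be applied directly. The short sketch below evaluates the magnification factor at a few of the tested eccentricities for the two reported E(2) values; the chosen eccentricity samples are illustrative.

```python
# Size-scaling factor F = 1 + E / E2, using the E2 estimates reported above
# (1.52 for shape from texture, 0.61 for shape from motion).

def scaling_factor(eccentricity_deg, e2):
    """Magnification needed at a given eccentricity to match foveal performance."""
    return 1.0 + eccentricity_deg / e2

for ecc in (0, 4, 8, 16):
    f_texture = scaling_factor(ecc, e2=1.52)  # shape from texture
    f_motion = scaling_factor(ecc, e2=0.61)   # shape from motion
    print(f"E = {ecc:2d} deg: F_texture = {f_texture:5.1f}, F_motion = {f_motion:5.1f}")
# The smaller E2 for motion means stimuli must be magnified much more steeply with
# eccentricity to reach foveal levels of performance.
```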

Figure 1. Velocity visuomotor transformation for manual tracking. (a) Schematic representation of the central nervous system areas implicated in the visuomotor velocity transformation. V1: primary visual cortex. MT: middle temporal area. MST: medial superior temporal area. PPC: posterior parietal cortex. M1: primary motor cortex. PMv: ventral premotor cortex. PMd: dorsal premotor cortex. (b) Example of manual tracking in a challenging dynamic environment: clay shooting. Before shooting, the shooter must react rapidly and track the target as accurately as possible. If the head is slightly tilted, the projection of the velocity vector onto the retina will be tilted (black arrow on the retinal projection, inset on the left). If the brain does not take the head posture into account, the tracking movement will start in the direction indicated by the orange arrow. If it is taken into account, the initiation of the movement will be correct (blue arrow).
Figure 4. Typical trial in the head-roll paradigm. Panel (a) represents trajectories in the screen coordinates after the TT movement onset: TT trajectory (blue line) and projection of the eye-finger vector trajectory onto the screen (black line). Gaze fixation is also represented (red dot) as well as the head-roll indication (black dashed line) for this particular trial. Orange line shows the direction predicted using retinal information only. Panel (b) shows head-in-space 3D angular position and eye-in-head 3D Fick position, as well as the horizontal and vertical components of tracking position and velocity (thick traces for the tracking, dotted lines for the target) as a function of time during the trial. TT onset is indicated by the black vertical line, whereas TT occlusion is represented by the gray area. 
Accurate planning of manual tracking requires a 3D visuomotor transformation of velocity signals

May 2012

·

139 Reads

Humans often perform visually guided arm movements in a dynamic environment. To accurately plan visually guided manual tracking movements, the brain should ideally transform the retinal velocity input into a spatially appropriate motor plan, taking the three-dimensional (3D) eye-head-shoulder geometry into account. Indeed, retinal and spatial target velocity vectors generally do not align because of different eye-head postures. Alternatively, the planning could be crude (based only on retinal information) and the movement corrected online using visual feedback. This study aims to investigate how accurate the motor plan generated by the central nervous system is. We computed predictions about the movement plan if the eye and head position are taken into account (spatial hypothesis) or not (retinal hypothesis). For the motor plan to be accurate, the brain should compensate for the head roll and resulting ocular counterroll as well as the misalignment between retinal and spatial coordinates when the eyes lie in oblique gaze positions. Predictions were tested on human subjects who manually tracked moving targets in darkness and were compared to the initial arm direction, reflecting the motor plan. Subjects tracked the target in a spatially accurate, although imperfect, manner. Therefore, the brain takes the 3D eye-head-shoulder geometry into account for the planning of visually guided manual tracking.
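The contrast between the two hypotheses can be sketched with a simple 2D rotation: under the retinal hypothesis the movement plan follows the retinal velocity vector as-is, whereas under the spatial hypothesis the retinal vector is first rotated back through the effective torsion of the eye in space (head roll minus the partial ocular counterroll). The head-roll angle and counterroll gain below are illustrative assumptions, not the study's measured values.

```python
# Sketch of the retinal vs. spatial planning hypotheses for manual tracking.
import math

def rotate(vx, vy, angle_deg):
    a = math.radians(angle_deg)
    return (vx * math.cos(a) - vy * math.sin(a),
            vx * math.sin(a) + vy * math.cos(a))

target_velocity_screen = (1.0, 0.0)   # rightward motion on the screen
head_roll_deg = 30.0                  # illustrative head tilt
counterroll_gain = 0.1                # illustrative partial compensation by the eyes
effective_torsion = head_roll_deg * (1.0 - counterroll_gain)

# What lands on the (tilted) retina:
retinal_velocity = rotate(*target_velocity_screen, -effective_torsion)

# Retinal hypothesis: plan the hand movement along the retinal vector as-is.
plan_retinal = retinal_velocity
# Spatial hypothesis: rotate the retinal vector back into spatial coordinates first.
plan_spatial = rotate(*retinal_velocity, effective_torsion)

print(plan_retinal)  # tilted away from the true target direction
print(plan_spatial)  # realigned with the screen-based target direction
```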

To CD or not to CD: Is there a 3D motion aftereffect based on changing disparities?

April 2012

·

23 Reads

Recently, T. B. Czuba, B. Rokers, K. Guillet, A. C. Huk, and L. K. Cormack (2011) and Y. Sakano, R. S. Allison, and I. P. Howard (2012) published very similar studies using the motion aftereffect to probe the way in which motion through depth is computed. Here, we compare and contrast the findings of these two studies and incorporate their results with a brief follow-up experiment. Taken together, the results leave no doubt that the human visual system incorporates a mechanism that is uniquely sensitive to the difference in velocity signals between the two eyes, but--perhaps surprisingly--evidence for a neural representation of changes in binocular disparity over time remains elusive.

Effects of surface reflectance and 3D shape on perceived rotation axis

September 2013

·

34 Reads

Surface specularity distorts the optic flow generated by a moving object in a way that provides important cues for identifying surface material properties (Doerschner, Fleming et al., 2011). Here we show that specular flow can also affect the perceived rotation axis of objects. In three experiments, we investigate how three-dimensional shape and surface material interact to affect the perceived rotation axis of unfamiliar irregularly shaped and isotropic objects. We analyze observers' patterns of errors in a rotation axis estimation task under four surface material conditions: shiny, matte textured, matte untextured, and silhouette. In addition to the expected large perceptual errors in the silhouette condition, we find that the patterns of errors for the other three material conditions differ from each other and across shape category, yielding the largest differences in error magnitude between shiny and matte, textured isotropic objects. Rotation axis estimation is a crucial implicit computational step in perceiving structure from motion; therefore, we test whether a structure-from-motion-based model can predict the perceived rotation axis for shiny and matte, textured objects. Our model's predictions closely follow observers' data, even yielding the same reflectance-specific perceptual errors. Unlike previous work (Caudek & Domini, 1998), our model does not rely on the assumption of affine image transformations; however, a limitation of our approach is its reliance on projected correspondence, and thus it has difficulty accounting for the perceived rotation axis of smooth shaded objects and silhouettes. In general, our findings are in line with earlier research demonstrating that shape from motion can be extracted based on several different types of optical deformation (Koenderink & Van Doorn, 1976; Norman & Todd, 1994; Norman, Todd, & Orban, 2004; Pollick, Nishida, Koike, & Kawato, 1994; Todd, 1985).

Figure 1. Three-dimensional objects (upper row: group a; lower row: group b) used in all five experiments.
Figure 6. Accuracy of original and reverse conditions in all five experiments. Error bars indicate one SE.
Decomposing the spatiotemporal signature in dynamic 3D object recognition

August 2010

·

38 Reads

The current study investigated the long-term representation of the spatiotemporal signature (J. V. Stone, 1998) and its coding nature in a dynamic object recognition task. In Experiment 1, the observers' recognition performance was impaired by an overall reversal of the studied objects' learned view sequences even when the sequences were unsmooth, suggesting that the spatiotemporal appearance of the objects was used for recognition and that this effect was not restricted to the smooth-motion condition. In another four experiments, a feature-reversal paradigm was applied in which only the global-scale or local-scale dynamic feature of the view sequences was reversed at a time. The reversal effect still held, but it was selective to the sequence's feature saliency, suggesting that a statistical representation based on specific features, rather than the whole view sequence, was used for recognition. Furthermore, top-down regulation of sequence smoothness was observed, such that the observers perceived the objects as moving in a smoother manner than they actually were. These results extend an emerging framework that argues that the spatiotemporal appearance of a dynamic object contributes to its recognition. The spatiotemporal signature might be coded in a feature-based manner under the law of perceptual organization, and the coding process is adaptive to variation in the sequence's temporal order.

The effects of viewing angle, camera angle and sign of surface curvature on the perception of 3D shape from texture

February 2007

·

2,421 Reads

Computational models for determining three-dimensional shape from texture based on local foreshortening or gradients of scaling are able to achieve accurate estimates of surface relief from an image when it is observed from the same visual angle with which it was photographed or rendered. These models produce conflicting predictions, however, when an image is viewed from a different visual angle. An experiment was performed to test these predictions, in which observers judged the apparent depth profiles of hyperbolic cylinders under a wide variety of conditions. The results reveal that the apparent patterns of relief from texture are systematically underestimated; convex surfaces appear to have greater depth than concave surfaces, large camera angles produce greater amounts of perceived depth than small camera angles, and the apparent depth-to-width ratio for a given image of a surface is greater for small viewing angles than for large viewing angles. Because these results are incompatible with all existing computational models, a new model is presented based on scaling contrast that can successfully account for all aspects of the data.

The extended horopter: Quantifying retinal correspondence across changes of 3D eye position

February 2006

·

40 Reads

The theoretical horopter is an interesting qualitative tool for conceptualizing binocular correspondence, but its quantitative applications have been limited because they have ignored ocular kinematics and vertical binocular sensory fusion. Here we extend the mathematical definition of the horopter to a full surface over visual space, and we use this extended horopter to quantify binocular alignment and visualize its dependence on eye position. We reproduce the deformation of the theoretical horopter into a spiral shape in tertiary gaze as first described by Helmholtz (1867). We also describe a new effect of ocular torsion, where the Vieth-Müller circle rotates out of the visual plane for symmetric vergence conditions in elevated or depressed gaze. We demonstrate how these deformations are reduced or abolished when the eyes follow the modification of Listing's law during convergence called L2, which enlarges the extended horopter and keeps its location and shape constant across gaze directions.
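For concreteness, the classical case underlying the theoretical horopter can be written down directly: with zero torsion and symmetric fixation in the visual plane, the Vieth-Müller circle is the circle passing through the two nodal points and the fixation point. The sketch below covers only that simplified textbook case, not the extended-horopter construction developed in the paper; the interocular distance and fixation distance are illustrative.

```python
# Vieth-Müller circle for symmetric fixation in the visual plane (simplified case;
# the extended horopter additionally accounts for ocular kinematics and vertical fusion).
# Nodal points at (+/- a, 0), fixation point at (0, d) straight ahead; units in meters.

def vieth_muller_circle(interocular_distance, fixation_distance):
    a = interocular_distance / 2.0
    d = fixation_distance
    center_y = (d**2 - a**2) / (2.0 * d)   # circle through (-a, 0), (a, 0), and (0, d)
    radius = (d**2 + a**2) / (2.0 * d)
    return center_y, radius

center_y, radius = vieth_muller_circle(interocular_distance=0.065, fixation_distance=0.5)
print(center_y, radius)  # locus of (theoretically) zero horizontal disparity
```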

The perception of 3D shape from planar cut contours

October 2011

·

74 Reads

A new computational analysis is described for estimating 3D shapes from orthographic images of surfaces that are textured with planar cut contours. For any given contour pattern, this model provides a family of possible interpretations that are all related by affine scaling and shearing transformations in depth, depending on the specific values of its free parameters that are used to compute the shape estimate. Two psychophysical experiments were performed in an effort to compare the model predictions with observers' judgments of 3D shape for developable and non-developable surfaces. The results reveal that observers' perceptions can be systematically distorted by affine scaling and shearing transformations in depth and that the magnitude and direction of these distortions vary systematically with the 3D orientations of the contour planes.
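The "family of possible interpretations related by affine scaling and shearing transformations in depth" has a compact form: under orthographic projection, replacing each surface point's depth z by λz + μx + νy leaves the image unchanged. The sketch below applies that transformation family to a toy surface; the parameter values are illustrative, and this is the ambiguity class described in the abstract rather than the authors' estimation model itself.

```python
# Affine depth ambiguity under orthographic projection: scaling (lam) and shearing
# (mu, nu) of depth leave the image coordinates (x, y) untouched, so every member
# of this family is consistent with the same contour image.

def affine_depth_transform(points, lam, mu, nu):
    """points: list of (x, y, z) surface points; returns a depth-scaled/sheared copy."""
    return [(x, y, lam * z + mu * x + nu * y) for (x, y, z) in points]

surface = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.5), (0.0, 1.0, 0.5)]
flattened_and_sheared = affine_depth_transform(surface, lam=0.7, mu=0.2, nu=0.0)

# Orthographic image positions (x, y) are identical for both interpretations:
print([(x, y) for (x, y, z) in surface])
print([(x, y) for (x, y, z) in flattened_and_sheared])
```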

Visual detection of symmetry of 3D shapes

June 2010

·

64 Reads

This study tested the perception of symmetry of 3D shapes from single 2D images. In Experiment 1, performance in discriminating between symmetric and asymmetric 3D shapes from single 2D line drawings was tested. In Experiment 2, performance in discriminating between different degrees of asymmetry of 3D shapes from single 2D line drawings was tested. The results showed that human performance in the discrimination was reliable. Based on these results, a computational model that performs the discrimination from single 2D images is presented. The model first recovers the 3D shape using a priori constraints: 3D symmetry, maximal 3D compactness, minimum surface area, and maximal planarity of contours. Then the model evaluates the degree of symmetry of the 3D shape. The model provided a good fit to the subjects' data.

Color constancy improves for real 3D objects

April 2009

·

1,469 Reads

In this study human color constancy was tested for two-dimensional (2D) and three-dimensional (3D) setups with real objects and lights. Four different illuminant changes, a natural selection task and a wide choice of target colors were used. We found that color constancy was better when the target color was learned as a 3D object in a cue-rich 3D scene than in a 2D setup. This improvement was independent of the target color and the illuminant change. We were not able to find any evidence that frequently experienced illuminant changes are better compensated for than unusual ones. Normalizing individual color constancy hit rates by the corresponding color memory hit rates yields a color constancy index, which is indicative of observers' true ability to compensate for illuminant changes.
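The normalization described in the last sentence is straightforward to state explicitly; the sketch below uses made-up hit rates purely for illustration.

```python
# Color constancy index as described above: the constancy hit rate normalized by the
# corresponding color memory hit rate (illustrative numbers only).

def constancy_index(constancy_hit_rate, memory_hit_rate):
    """Values near 1.0 indicate compensation for the illuminant change that is as good
    as the observer's color memory allows; values near 0 indicate little compensation."""
    return constancy_hit_rate / memory_hit_rate

print(constancy_index(constancy_hit_rate=0.45, memory_hit_rate=0.60))  # 0.75
```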

Where is the moving object now? Judgments of instantaneous position show poor temporal precision (SD=70 ms)

December 2009

·

75 Reads

Humans can precisely judge relative location between two objects moving with the same speed and direction, as numerous studies have shown. However, the precision for localizing a single moving object relative to stationary references remains a neglected topic. Here, subjects reported the perceived location of a moving object at the time of a cue. The variability of the reported positions increased steeply with the speed of the object, such that the distribution of responses corresponds to the distance that the object traveled in 70 ms. This surprisingly large temporal imprecision depends little on the characteristics of the trajectory of the moving object or of the cue that indicates when to judge the position. We propose that the imprecision reflects a difficulty in identifying which position of the moving object occurs at the same time as the cue. This high-level process may involve the same low temporal resolution binding mechanism that, in other situations, pairs simultaneous features such as color and motion.
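The central quantitative claim above, that the spread of reported positions corresponds to the distance travelled in about 70 ms, translates into a simple prediction of positional variability as a function of speed. The speeds in the sketch below are illustrative, not the study's conditions.

```python
# Positional imprecision implied by a fixed temporal uncertainty of ~70 ms:
# the SD of reported positions should grow linearly with object speed.

TEMPORAL_SD_S = 0.070  # 70 ms, the value reported in the abstract

def positional_sd(speed_deg_per_s, temporal_sd_s=TEMPORAL_SD_S):
    """Predicted SD of the reported position, in degrees of visual angle."""
    return speed_deg_per_s * temporal_sd_s

for speed in (2.0, 5.0, 10.0, 20.0):
    print(f"{speed:4.1f} deg/s -> positional SD ~ {positional_sd(speed):.2f} deg")
```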

Effects of task and coordinate frame of attention in area 7a of the primate posterior parietal cortex

February 2010

·

94 Reads

The activity of neurons in the primate posterior parietal cortex reflects the location of visual stimuli relative to the eye, body, and world, and is modulated by selective attention and task rules. It is not known, however, how these effects interact with each other. To address this question, we recorded neuronal activity from area 7a of monkeys trained to perform two variants of a delayed match-to-sample task. The monkeys attended a spatial location defined in either spatiotopic (world-centered) or retinotopic (eye-centered) coordinates. We found neuronal responses to be remarkably plastic depending on the task. In contrast to previous studies using the simple version of the delayed match-to-sample task, we discovered that after training in a task where the locus of attention shifted during the trial, neural responses were typically enhanced for a match stimulus. Our results further revealed that responses were mostly enhanced for stimuli matching in spatiotopic coordinates, although the proportion of neurons modulated by either coordinate frame was influenced by the behavioral task being executed.

Figure 2. Scleral contact lens movement errors in horizontal translation, vertical translation, and rotation with respect to the underlying wavefront error, sampled on five different days for 1 hr per day at 5-min intervals (10 s per interval). The five colors represent data collected on the five different days. Error bars are smaller than the symbols and therefore not visible.
Optimizing wavefront-guided corrections for highly aberrated eyes in the presence of registration uncertainty

June 2013

·

85 Reads

Dynamic registration uncertainty of a wavefront-guided correction with respect to the underlying wavefront error (WFE) inevitably decreases retinal image quality. A partial correction may improve average retinal image quality and visual acuity in the presence of registration uncertainties. The purpose of this paper is to (a) develop an algorithm to optimize a wavefront-guided correction that improves visual acuity given registration uncertainty and (b) test the hypothesis that these corrections provide improved visual performance in the presence of these uncertainties as compared to a full-magnitude correction or a correction by Guirao, Cox, and Williams (2002). A stochastic parallel gradient descent (SPGD) algorithm was used to optimize the partial-magnitude correction for three keratoconic eyes based on measured scleral contact lens movement. Given its high correlation with logMAR acuity, the retinal image quality metric log visual Strehl was used as a predictor of visual acuity. Predicted values of visual acuity with the optimized corrections were validated by regressing measured acuity loss against predicted loss. Measured loss was obtained from normal subjects viewing acuity charts that were degraded by the residual aberrations generated by the movement of the full-magnitude correction, the correction by Guirao, and the optimized SPGD correction. Partial-magnitude corrections optimized with an SPGD algorithm provide at least one line of improvement in average visual acuity over the full-magnitude correction and the correction by Guirao, given the registration uncertainty. This study demonstrates that it is possible to improve average visual acuity by optimizing a wavefront-guided correction in the presence of registration uncertainty.
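The optimization step relies on stochastic parallel gradient descent, a generic derivative-free method: perturb all correction coefficients simultaneously by a small random vector, evaluate the image-quality metric with the positive and negative perturbations, and update the coefficients in proportion to the metric change. Below is a minimal, generic SPGD loop; the metric function, step sizes, and toy objective are assumptions for illustration, not the authors' implementation or the log visual Strehl computation.

```python
# Generic stochastic parallel gradient descent (SPGD) sketch for tuning a
# partial-magnitude wavefront correction. `image_quality` stands in for a metric
# such as log visual Strehl averaged over sampled lens-registration errors;
# its implementation is assumed, not reproduced here. The metric is maximized.
import random

def spgd_optimize(coeffs, image_quality, step=0.05, perturb=0.1, n_iter=2000, seed=0):
    rng = random.Random(seed)
    coeffs = list(coeffs)
    for _ in range(n_iter):
        # Simultaneous random perturbation of all coefficients (+/- perturb).
        delta = [perturb if rng.random() < 0.5 else -perturb for _ in coeffs]
        plus = [c + d for c, d in zip(coeffs, delta)]
        minus = [c - d for c, d in zip(coeffs, delta)]
        dj = image_quality(plus) - image_quality(minus)   # metric change
        # Move each coefficient along its perturbation, scaled by the metric change.
        coeffs = [c + step * dj * d for c, d in zip(coeffs, delta)]
    return coeffs

# Toy usage: the "metric" peaks when each coefficient reaches 60% of the measured
# aberration, mimicking a partial correction being optimal under lens movement.
measured = [0.8, -0.5, 0.3]
toy_metric = lambda c: -sum((ci - 0.6 * mi) ** 2 for ci, mi in zip(c, measured))
print(spgd_optimize([0.0, 0.0, 0.0], toy_metric))  # should end up near 0.6 * measured
```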

Binocular visual performance with aberration correction as a function of light level

December 2014

·

411 Reads

The extent to which monocular visual performance of subjects with normal amounts of ocular aberrations can be improved with adaptive optics (AO) depends on both the pupil diameter and the luminance for visual testing. Here, the benefit of correction of higher order aberrations for binocular visual performance was assessed over a range of luminances for natural light-adapted pupil sizes with a binocular AO visual simulator. Results show that binocular aberration correction benefits for visual acuity and contrast sensitivity increase with decreasing luminances. Also, the advantage of binocular over monocular viewing increases when visual acuity becomes worse. The findings suggest that binocular summation mitigates poor visual performance under low luminance conditions. © 2014 ARVO.

Figure 3. Ratio of the coma (Dm) of the posterior corneal surface to that of the anterior corneal surface as a function of age. Weighted linear regression showed a significant age dependence: Dm_post/Dm_ant = 2.0 (±0.1) − 0.012 (±0.003) × Age; n = 114; r = −.34; p < .0001.
The contribution of the posterior surface to the coma aberration of the human cornea

February 2007

·

93 Reads

Scheimpflug imaging was used to measure, in six meridians, the shape of the anterior and posterior cornea of the right eye of 114 subjects, ranging in age from 18 to 65 years. Subsequently, a three-dimensional model of the shape of the whole cornea was reconstructed, from which the coma aberration of the anterior and whole cornea could be calculated. This made it possible to investigate how the compensatory contribution of the posterior surface to the coma aberration of the anterior corneal surface changes with age. Results show that, on average, the posterior surface compensates approximately 3.5% of the coma of the anterior surface. The compensation tends to be larger for young subjects (6%) than for older subjects (0%). This small effect of the posterior cornea on the coma aberration makes it clear that, for the coma aberration of the whole eye, only the anterior corneal surface and the crystalline lens play a role. Consequently, for the design of an intraocular lens that is able to correct for coma aberration, it would be sufficient to take only the anterior corneal surface into account.

The change of spherical aberration during accommodation and its effect on the accommodation response

November 2010

·

484 Reads

Theoretical and ray-tracing calculations on an accommodative eye model based on published anatomical data, together with wavefront measurements on 15 eyes, were used to study the change of spherical aberration during accommodation and its influence on the accommodation response. The three methodologies show that primary spherical aberration should decrease during accommodation, while secondary spherical aberration should increase. The hyperbolic shape of the lens surfaces is the main factor responsible for the change of those aberrations during accommodation. Assuming that the eye accommodates to optimize image quality by minimizing the RMS of the wavefront, it is shown that primary spherical aberration decreases the accommodation response, while secondary spherical aberration slightly increases it. The total effect of the spherical aberration is a reduction of around 1/7 D per diopter of stimulus approximation, although that value depends on the pupil size and its reduction during accommodation. The apparent accommodation error (lead and lag), typically present in the accommodation response curve, could then be explained as a consequence of the strategy used by the visual system, and by the measurement apparatus, to select the best image plane, which can be affected by the change of spherical aberration during accommodation.
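The quoted net effect (a reduction of roughly 1/7 D per diopter of stimulus approach) implies a simple linear prediction for the response curve. The sketch below works through a few illustrative accommodative demands; it ignores the pupil-size dependence noted in the abstract.

```python
# Accommodation response predicted from the net spherical-aberration effect quoted
# above: roughly 1/7 D reduction per diopter of stimulus approximation
# (pupil-size dependence ignored; illustrative only).

REDUCTION_PER_D = 1.0 / 7.0

def predicted_response(stimulus_d):
    """Accommodation response (D) for a given accommodative demand (D)."""
    return stimulus_d * (1.0 - REDUCTION_PER_D)

for demand in (1.0, 2.0, 3.0, 4.0):
    response = predicted_response(demand)
    lag = demand - response
    print(f"demand {demand:.1f} D -> response {response:.2f} D (lag {lag:.2f} D)")
```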

Binocular visual acuity for the correction of spherical aberration in polychromatic and monochromatic light

February 2014

·

73 Reads

Correction of spherical aberration (SA) and longitudinal chromatic aberration (LCA) significantly improves monocular visual acuity (VA). In this work, the visual effect of SA correction in polychromatic and monochromatic light on binocular visual performance is investigated. A liquid-crystal-based binocular adaptive optics visual analyzer capable of operating in polychromatic light is employed in this study. Binocular VA improves when SA is corrected and LCA effects are reduced, both separately and in combination, with the highest value obtained for SA correction in monochromatic light. However, the binocular summation ratio is highest for the baseline condition of uncorrected SA in polychromatic light. Although SA correction in monochromatic light has a greater impact monocularly than binocularly, bilateral correction of both SA and LCA may further improve binocular spatial visual acuity, which may support the use of aspheric-achromatic ophthalmic devices, in particular intraocular lenses (IOLs).
