ABSTRACT: Perception of scenes has typically been investigated by using static or simplified visual displays. How attention is used to perceive and evaluate dynamic, realistic scenes is more poorly understood, in part due to the problem of comparing eye fixations to moving stimuli across observers. When the task and stimulus are common across observers, consistent fixation location can indicate that that region has high goal-based relevance. Here we investigated these issues when an observer has a specific, and naturalistic, task: closed-circuit television (CCTV) monitoring. We concurrently recorded eye movements and ratings of perceived suspiciousness as different observers watched the same set of clips from real CCTV footage. Trained CCTV operators showed greater consistency in fixation location and greater consistency in suspiciousness judgements than untrained observers. Training appears to increase between-operator consistency through operators learning "what to look for" in these scenes. We used a novel "Dynamic Area of Focus (DAF)" analysis to show that in CCTV monitoring there is a temporal relationship between eye movements and subsequent manual responses, as we have previously found for a sports video watching task. For trained CCTV operators and for untrained observers, manual responses were most highly related to between-observer eye position spread when a temporal lag was introduced between the fixation and response data. Several hundred milliseconds after between-observer eye positions became most similar, observers tended to push the joystick to indicate perceived suspiciousness. Conversely, several hundred milliseconds after between-observer eye positions became dissimilar, observers tended to rate suspiciousness as low. These data provide further support for this DAF method as an important tool for examining goal-directed fixation behavior when the stimulus is a real moving image.
Frontiers in Human Neuroscience 08/2013; 7:441. DOI:10.3389/fnhum.2013.00441 · 3.63 Impact Factor
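The "between-observer eye position spread" mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how the DAF analysis defines spread, so the measure below (mean distance of each observer's gaze point from the group centroid, per video frame) is an assumption chosen for illustration, and the names `gaze_spread` and `spread_series` are hypothetical.

```python
import math

def gaze_spread(frame_points):
    """Mean Euclidean distance of gaze points from their centroid.

    frame_points: list of (x, y) gaze positions, one per observer,
    all sampled on the same video frame. This spread measure is an
    assumption, not the published DAF definition.
    """
    n = len(frame_points)
    cx = sum(x for x, _ in frame_points) / n
    cy = sum(y for _, y in frame_points) / n
    return sum(math.hypot(x - cx, y - cy) for x, y in frame_points) / n

def spread_series(frames):
    """Spread for each frame; frames is a list of per-frame point lists."""
    return [gaze_spread(points) for points in frames]

if __name__ == "__main__":
    # Four hypothetical observers: tightly clustered, then dispersed.
    clustered = [(100, 100), (101, 100), (100, 101), (101, 101)]
    dispersed = [(0, 0), (200, 0), (0, 200), (200, 200)]
    print(spread_series([clustered, dispersed]))
```

A low value means observers are looking at roughly the same place on that frame; a high value means their gaze is scattered across the display.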
ABSTRACT: Clutter is something that is encountered in everyday life, from a messy desk to a crowded street. Such clutter may interfere with our ability to search for objects in such environments, like our car keys or the person we are trying to meet. A number of computational models of clutter have been proposed and shown to work well for artificial and other simplified scene search tasks. In this paper, we correlate the performance of different models of visual clutter to human performance in a visual search task using natural scenes. The models we evaluate are Feature Congestion (Rosenholtz, Li, & Nakano, 2007), Sub-band Entropy (Rosenholtz et al., 2007), Segmentation (Bravo & Farid, 2008), and Edge Density (Mack & Oliva, 2004) measures. The correlations were performed across a range of target-centered subregions to produce a correlation profile, indicating the scale at which clutter was affecting search performance. Overall, clutter was rather weakly correlated with performance (r ≈ 0.2). However, different measures of clutter appear to reflect different aspects of the search task: correlations with Feature Congestion are greatest for the actual target patch, whereas the Sub-band Entropy is most highly correlated in a region 12° × 12° centered on the target.
Journal of Vision 04/2013; 13(5). DOI:10.1167/13.5.25 · 2.39 Impact Factor
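The idea of measuring clutter in target-centered subregions of different sizes can be sketched as below. This is a simplified stand-in, not any of the published measures: it counts pixels whose finite-difference gradient exceeds a threshold, which is a crude assumption in place of a proper edge detector. The function names, the threshold value, and the image representation (a list of rows of grey levels) are all hypothetical.

```python
def crop_centered(image, cy, cx, half):
    """Square subregion of side 2*half+1 centered on (cy, cx), clipped to the image."""
    rows = image[max(0, cy - half): cy + half + 1]
    return [row[max(0, cx - half): cx + half + 1] for row in rows]

def edge_density(image, threshold=0.5):
    """Fraction of pixels whose forward-difference gradient magnitude exceeds threshold.

    image: list of equal-length rows of numeric grey levels.
    The last row and column are skipped because they have no forward neighbour.
    """
    h, w = len(image), len(image[0])
    edges = 0
    total = 0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = image[y][x + 1] - image[y][x]
            gy = image[y + 1][x] - image[y][x]
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges += 1
            total += 1
    return edges / total if total else 0.0

if __name__ == "__main__":
    # Toy image with one vertical luminance step; target assumed at (1, 1).
    step = [[0, 0, 1, 1] for _ in range(4)]
    for half in (1, 2):
        print(half, edge_density(crop_centered(step, 1, 1, half)))
```

Repeating the measurement over a range of `half` values, and correlating each with search performance across images, would give a profile of the kind the abstract describes.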
ABSTRACT: An innovative motoric measure of slant based on gait is proposed as the angle between the foot and the walking surface during walking. This work investigates whether the proposed action-based measure is affected by factors such as material and inclination of the walking surface. Experimental studies were conducted in a real environment set-up and in its virtual simulation counterpart evaluating behavioural fidelity and user performance in ecologically-valid simulations. In the real environment, the measure slightly overestimated the inclined path whereas in the virtual environment it slightly underestimated the inclined path. The results imply that the proposed slant measure is modulated by motoric caution. Since the “reality” of the synthetic environment was relatively high, performance results should have revealed the same degree of caution as in the real world; however, that was not the case. People became more cautious when the ground plane was steep, slippery, or virtual.
International Journal of Human-Computer Studies 11/2012; 70(11):781–793. DOI:10.1016/j.ijhcs.2012.07.001 · 1.29 Impact Factor
ABSTRACT: Various visual functions decline in ageing and even more so in patients with Alzheimer's disease (AD). Here we investigated whether the complex visual processes involved in ignoring illumination-related variability (specifically, cast shadows) in visual scenes may also be compromised. Participants searched for a discrepant target among items which appeared as posts with shadows cast by light-from-above when upright, but as angled objects when inverted. As in earlier reports, young participants gave slower responses with upright than inverted displays when the shadow-like part was dark but not white (control condition). This is consistent with visual processing mechanisms making shadows difficult to perceive, presumably to assist object recognition under varied illumination. Contrary to predictions, this interaction of “shadow” colour with item orientation was maintained in healthy older and AD groups. Thus, the processing mechanisms which assist complex light-independent object identification appear to be robust to the effects of both ageing and AD. Importantly, this means that the complexity of a function does not necessarily determine its vulnerability to age- or AD-related decline.
We also report slower responses to dark than light “shadows” of either orientation in both ageing and AD, in keeping with increasing light scatter in the ageing eye. Rather curiously, AD patients showed further slowed responses to “shadows” of either colour at the bottom than the top of items as if they applied shadow-specific rules to non-shadow conditions. This suggests that in AD, shadow-processing mechanisms, while preserved, might be applied in a less selective way.
PLoS ONE 09/2012; 7(9):e45104. DOI:10.1371/journal.pone.0045104 · 3.23 Impact Factor
ABSTRACT: Over the last decade, television screens and display monitors have increased in size considerably, but has this improved our televisual experience? Our working hypothesis was that audiences adopt a general strategy that "bigger is better." However, as our visual perceptions do not tap directly into basic retinal image properties such as retinal image size (C. A. Burbeck, 1987), we wondered whether object size itself might be an important factor. To test this, we needed a task that would tap into the subjective experiences of participants watching a movie on different-sized displays with the same retinal subtense. Our participants used a line bisection task to self-report their level of "presence" (i.e., their involvement with the movie) at several target locations that were probed in a 45-min section of the movie "The Good, The Bad, and The Ugly." Measures of pupil dilation and reaction time to the probes were also obtained. In Experiment 1, we found that subjective ratings of presence increased with physical screen size, supporting our hypothesis. Face scenes also produced higher presence scores than landscape scenes for both screen sizes. In Experiment 2, reaction time and pupil dilation results showed the same trends as the presence ratings, and pupil dilation correlated with presence ratings, providing some validation of the method. Overall, the results suggest that real-time measures of subjective presence might be a valuable tool for measuring audience experience for different types of (i) display and (ii) audiovisual material.
ABSTRACT: Background / Purpose:
The purpose of this study was to assess how well image measures applied to natural scenes predict human search performance.
Clutter metrics have a predictive capability, but the target has to be considered.
Vision Sciences Society Annual Meeting 2012; 05/2012
ABSTRACT: Low-level stimulus salience and task relevance together determine the human fixation priority assigned to scene locations (Fecteau and Munoz in Trends Cogn Sci 10(8):382-390, 2006). However, surprisingly little is known about the contribution of task relevance to eye movements during real-world visual search where stimuli are in constant motion and where the 'target' for the visual search is abstract and semantic in nature. Here, we investigate this issue when participants continuously search an array of four closed-circuit television (CCTV) screens for suspicious events. We recorded eye movements whilst participants watched real CCTV footage and moved a joystick to continuously indicate perceived suspiciousness. We find that when multiple areas of a display compete for attention, gaze is allocated according to relative levels of reported suspiciousness. Furthermore, this measure of task relevance accounted for twice the amount of variance in gaze likelihood as the amount of low-level visual changes over time in the video stimuli.
Experimental Brain Research 08/2011; 214(1):131-7. DOI:10.1007/s00221-011-2812-y · 2.04 Impact Factor
ABSTRACT: We conducted suprathreshold discrimination experiments to compare how natural-scene information is processed in central and peripheral vision (16° eccentricity). Observers' ratings of the perceived magnitude of changes in naturalistic scenes were lower for peripheral than for foveal viewing, and peripheral orientation changes were rated less than peripheral colour changes. A V1-based Visual Difference Predictor model of the magnitudes of perceived foveal change was adapted to match the sinusoidal grating sensitivities of peripheral vision, but it could not explain why the ratings for changes in peripheral stimuli were so reduced. Perceived magnitude ratings for peripheral stimuli were further reduced by simultaneous presentation of flanking patches of naturalistic images, a phenomenon that could not be replicated foveally, even after M-scaling the foveal stimuli to reduce their size and the distances from the flankers. The effects of the peripheral flankers are very reminiscent of crowding phenomena demonstrated with letters or Gabor patches.
Vision Research 05/2011; 51(14):1686-98. DOI:10.1016/j.visres.2011.05.010 · 1.82 Impact Factor
ABSTRACT: In a virtual environment (VE), efficient techniques are often needed to economize on rendering computation without compromising the information transmitted. The reported experiments devise a functional fidelity metric by exploiting research on memory schemata. According to the proposed measure, similar information would be transmitted across synthetic and real-world scenes depicting a specific schema. This would ultimately indicate which areas in a VE could be rendered in lower quality without affecting information uptake. We examine whether computationally more expensive scenes of greater visual fidelity affect memory performance after exposure to immersive VEs, or whether they are merely more aesthetically pleasing than their diminished visual quality counterparts. Results indicate that memory schemata function in VEs similarly to real-world environments. “High-level” visual cognition related to late visual processing is unaffected by ubiquitous graphics manipulations such as polygon count and depth of shadow rendering; “normal” cognition operates as long as the scenes look acceptably realistic. However, when the overall realism of the scene is greatly reduced, such as in wireframe, then visual cognition becomes abnormal. Effects that distinguish schema-consistent from schema-inconsistent objects change because the whole scene now looks incongruent. We have shown that this effect is not due to a failure of basic recognition.
ABSTRACT: The Euclidean and MAX metrics have been widely used to model cue summation psychophysically and computationally. Both rules happen to be special cases of a more general Minkowski summation rule, where m = 2 and ∞, respectively. In vision research, Minkowski summation with power m = 3-4 has been shown to be a superior model of how subthreshold components sum to give an overall detection threshold. We have previously reported that Minkowski summation with power m = 2.84 accurately models summation of suprathreshold visual cues in photographs. In four suprathreshold discrimination experiments, we confirm the previous findings with new visual stimuli and extend the applicability of this rule to cue combination in auditory stimuli (musical sequences and phonetic utterances, where m = 2.95 and 2.54, respectively) and cross-modal stimuli (m = 2.56). In all cases, Minkowski summation with power m = 2.5-3 outperforms the Euclidean and MAX operator models. We propose that this reflects the summation of neuronal responses that are not entirely independent but which show some correlation in their magnitudes. Our findings are consistent with electrophysiological research that demonstrates signal correlations (r = 0.1-0.2) between sensory neurons when these are presented with natural stimuli.
Proceedings of the Royal Society B: Biological Sciences 10/2010; 278(1710):1365-72. DOI:10.1098/rspb.2010.1888 · 5.05 Impact Factor
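For readers unfamiliar with Minkowski summation, the standard form combines cue magnitudes as (Σ|cᵢ|^m)^(1/m). The sketch below shows how m = 2 reduces to the Euclidean rule and m = ∞ to the MAX rule, with an intermediate exponent falling between the two. The function name `minkowski_sum` and the example cue values are assumptions for illustration; the exponent 2.84 is the value quoted in the abstract.

```python
import math

def minkowski_sum(cues, m):
    """Combine cue magnitudes with the standard Minkowski rule (sum |c|^m)^(1/m).

    m = 2 gives the Euclidean rule; m = math.inf gives the MAX rule.
    """
    mags = [abs(c) for c in cues]
    if math.isinf(m):
        return max(mags)
    return sum(c ** m for c in mags) ** (1.0 / m)

if __name__ == "__main__":
    cues = [3.0, 4.0]  # hypothetical magnitudes of two separate cues
    for m in (2, 2.84, math.inf):
        print(m, minkowski_sum(cues, m))
```

As m grows, the combined value moves from the Euclidean result toward the largest single cue, so the fitted exponent indicates how strongly the largest cue dominates.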
ABSTRACT: We measured the temporal relationship between eye movements and manual responses while experts and novices watched a videotaped football match. Observers used a joystick to continuously indicate the likelihood of an imminent goal. We measured correlations between manual responses and between-subjects variability in eye position. To identify the lag magnitude, we repeated these correlations over a range of possible delays between these two measures and searched for the most negative correlation coefficient. We found lags on the order of 2 sec and an effect of expertise on lag magnitude, suggesting that expertise has its effect by directing eye movements to task-relevant areas of a scene more quickly, facilitating a longer processing duration before behavioral decisions are made. This is a powerful new method for examining the eye movement behavior of multiple observers across complex moving images.
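The lag search described in this abstract (repeat the correlation over a range of delays and keep the most negative coefficient) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: both signals are assumed to be equally sampled lists, the manual response is assumed to trail the eye-position variability, and the names `pearson` and `best_lag` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (both must vary)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_lag(spread, response, max_lag):
    """Return (lag, r) for the delay giving the most negative correlation.

    spread: between-subjects eye position variability per sample.
    response: manual (joystick) response per sample, same length.
    lag is in samples; response[t + lag] is paired with spread[t].
    """
    best = None
    for lag in range(max_lag + 1):
        x = spread[:len(spread) - lag]
        y = response[lag:]
        r = pearson(x, y)
        if best is None or r < best[1]:
            best = (lag, r)
    return best

if __name__ == "__main__":
    # Synthetic data: the response mirrors the spread three samples later.
    spread = [(i * 7) % 11 for i in range(40)]
    response = [0, 0, 0] + [-s for s in spread[:-3]]
    print(best_lag(spread, response, max_lag=5))
```

Multiplying the returned lag by the sampling interval converts it to time; a strongly negative coefficient at that lag means low spread (observers looking at the same place) tends to precede high manual responses.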
ABSTRACT: We are studying how people perceive naturalistic suprathreshold changes in the colour, size, shape or location of items in images of natural scenes, using magnitude estimation ratings to characterise the sizes of the perceived changes in coloured photographs. We have implemented a computational model that tries to explain observers' ratings of these naturalistic differences between image pairs. We model the action-potential firing rates of millions of neurons, having linear and non-linear summation behaviour closely modelled on real V1 neurons. The numerical parameters of the model's sigmoidal transducer function are set by optimising the same model to experiments on contrast discrimination (contrast 'dippers') on monochrome photographs of natural scenes. The model, optimised on a stimulus-intensity domain in an experiment reminiscent of the Weber-Fechner relation, then produces tolerable predictions of the ratings for most kinds of naturalistic image change. Importantly, rating rises roughly linearly with the model's numerical output, which represents differences in neuronal firing rate in response to the two images under comparison; this implies that rating is proportional to the neuronal response.
Seeing and Perceiving 10/2010; 23(4):349-72. DOI:10.1163/187847510X532676 · 1.32 Impact Factor