
An Investigation of Linear Separability in Visual Search for Color Suggests a Role of Recognizability

American Psychological Association
Journal of Experimental Psychology: Human Perception and Performance
Authors: Garry Kong and David Alais (The University of Sydney), Erik Van der Burg (The University of Sydney and Vrije Universiteit Amsterdam)
Abstract

Visual search for color is thought to be performed either using color-opponent processes or through the comparison of unique color categories. In the present study, we investigate these theories by using displays with a red or green hue but varying levels of saturation. The linearly inseparable nature of this display makes search for the midsaturated target inefficient. A genetic algorithm was employed, which evolved the distractors in a search display to reveal the processes that people use to search for color. Results show that participants were able to search within only the midsaturated red items, but not within only the midsaturated green items, providing evidence for color categories, as English has a basic color label for midsaturated red (i.e., pink) but not for midsaturated green. A follow-up experiment revealed that it was possible to search within midsaturated green items if the exact target color was primed before each trial. We therefore suggest that both priming and a unique color category increase the recognizability of the target color, which has been speculated to increase visual search performance.
Keywords: visual search, attention, genetic algorithm, priming, linguistic relativity
Visual search is a commonly used task in psychophysical experiments. It was originally used to identify basic features through which the visual system analyzed a stimulus. This idea stemmed from feature integration theory (Treisman & Gelade, 1980), which postulated that "pop-out" search indicated the presence of a basic feature in the visual system. While this idea has been refuted, in part through the existence of "pop-out" with stimuli not defined by a basic feature (e.g., Enns & Rensink, 1990; Wang, Cavanagh, & Green, 1994), it has initiated the debate on whether visual search is driven by early or late visual processes. For example, visual search for color has been argued to be based on the low-level color-opponent channels (Lindsey et al., 2010), but others have argued that it is based on a limited number of basic color terms (D'Zmura, 1991; Yokoi & Uchikawa, 2005).
The idea that visual search for color is driven by color-opponent processes is derived from low-level color perception in the lateral geniculate nucleus (LGN). Here, cells that respond to color are tuned to one of three axes: red-green, blue-yellow, and luminance (Derrington, Krauskopf, & Lennie, 1984). Support for the idea that these cells affect visual search comes from Lindsey et al. (2010), who investigated visual search with targets of desaturated colors among white and saturated distractors. They found that search for some colors, mainly the reddish colors, was faster than for others, and that this search advantage was modeled well by the theoretical outputs of the three types of opponent cells in the LGN. On the other hand, evidence against this theory comes from D'Zmura (1991; see also Bauer, Jolicoeur, & Cowan, 1998), who found that the difficulty of visual search for color depended on whether or not the target was collinearly flanked by its distractors when represented in CIE color space, that is, the principle of linear separability. This was the case for all color pairings, including those that could not be explained by the color-opponent channels. For example, search for an orange target among yellow-green and purple distractors was efficient, despite the red-green axis being unable to differentiate orange and purple, and the blue-yellow axis being unable to differentiate orange and yellow-green. More recently, Wool et al. (2015) performed a visual search task with a target that was a subjectively defined "unique hue" (a color that lies purely at one end of a color-opponent axis) among distractors that were "complementary" (the color that occupies the opposite end of an objectively defined color circle). They found no search advantage for the unique hue, concluding that because a color that maximizes the response of one axis had no search advantage over a color that spreads its response over two axes, visual search for color must not be using opponent processes.
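To make the linear separability principle concrete, the sketch below (my illustration, not code from any of the cited studies; the chromaticity values are invented, as the original experiments used calibrated stimuli) tests whether a target can be separated from its distractors by a line in CIE xy space, which amounts to asking whether the target lies outside the convex hull of the distractor coordinates.

```python
# A minimal sketch: a target is linearly separable from its distractors in
# CIE xy chromaticity space exactly when it lies outside the convex hull of
# the distractor points, tested here with a small feasibility linear program.
import numpy as np
from scipy.optimize import linprog

def is_linearly_separable(target, distractors):
    """True if some line puts the target on one side and all distractors on
    the other, i.e., the target is not a convex combination of them."""
    d = np.asarray(distractors, dtype=float)
    t = np.asarray(target, dtype=float)
    n = len(d)
    # Feasibility LP: find weights w >= 0 with d.T @ w = t and sum(w) = 1.
    A_eq = np.vstack([d.T, np.ones(n)])
    b_eq = np.append(t, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return not res.success  # infeasible -> target outside hull -> separable

# Hypothetical chromaticities in the spirit of D'Zmura's (1991) examples.
orange, yellow, red = (0.53, 0.41), (0.45, 0.47), (0.61, 0.35)
yellow_green, purple = (0.39, 0.52), (0.29, 0.15)
# Orange midway between yellow and red: collinearly flanked, inefficient search.
print(is_linearly_separable(orange, [yellow, red]))           # False
# Orange off the yellow-green/purple line: separable, efficient search.
print(is_linearly_separable(orange, [yellow_green, purple]))  # True
```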
The proposal that visual search for color is based on a limited number of basic color terms stems from research motivated by the idea of linguistic relativity, a theory that suggests that the way we process and encode colors is influenced by the terms we use to label color (Brown & Lenneberg, 1954). Specifically for visual search, it suggests that search performance depends on the color category with which one encodes the target and distractors. This idea has found some support in a visual search study by Yokoi and Uchikawa (2005), who used a heterogeneous display of colors and found that reaction time (RT) was correlated with the number of distractors that shared a linguistic label with the target. Another study that somewhat supports this idea is by Pilling and Davies
This article was published Online First June 20, 2016.
Garry Kong and David Alais, School of Psychology, The University of Sydney; Erik Van der Burg, School of Psychology, The University of Sydney, and Department of Cognitive Psychology, Vrije Universiteit Amsterdam.
Correspondence concerning this article should be addressed to Garry Kong, School of Psychology, The University of Sydney, Camperdown, 2006, Australia. E-mail: garry.kong@sydney.edu.au
© 2016 American Psychological Association. Journal of Experimental Psychology: Human Perception and Performance, 2016, Vol. 42, No. 11, 1724–1738. http://dx.doi.org/10.1037/xhp0000249
... For instance, it is difficult to recognize a visual object in peripheral vision when surrounded by nearby clutter (i.e., visual crowding; see Whitney & Levi, 2011, for a review), and it is difficult to find a visual target in a cluttered environment (i.e., visual search; Wolfe, 1994). However, performance is only affected by clutter when the target and clutter have similar features, such as orientation, color, shape, luminance, motion, or flicker (see, e.g., Bouma, 1970; Cass et al., 2011; Cass & Van der Burg, 2023; Duncan & Humphreys, 1989; Kong et al., 2016, 2017; Kooi et al., 1994; Van der Burg et al., 2017). In contrast, if a single target feature differs from the clutter feature (e.g., being the only red item among green items), then it becomes easy to find such a salient item (Itti et al., 1998; Wolfe & Horowitz, 2004). ...
... Nevertheless, in our study the presence of clutter had no significant effect on the objective performance measures. The lack of a clutter effect was surprising, since it is known from the literature that it is difficult to find or identify a visual target when it is surrounded by nearby similar distractor objects (Bouma, 1970; Cass et al., 2011; Cass & Van der Burg, 2023; Duncan & Humphreys, 1989; Kong et al., 2016, 2017; Kooi et al., 1994; Rosenholtz et al., 2007; Van der Burg et al., 2017; Whitney & Levi, 2011). In our case, the clutter was similar to the target in terms of size, color, luminance, orientation, and even shape, and the clutter made it even more difficult to track the moving target due to occlusion (see Figure 4). ...
Article
BACKGROUND: In this study, we investigated the impact of a loss of horizon due to atmospheric conditions on flight performance and workload of helicopter pilots during a low-altitude, dynamic flight task in windy conditions at sea. We also examined the potential benefits of a helmet-mounted display (HMD) for this specific task. METHODS: In a fixed-base helicopter simulator, 16 military helicopter pilots were asked to follow a maneuvering go-fast vessel in a good visual environment (GVE) and in a degraded visual environment (DVE). DVE was simulated by fog, obscuring the horizon and reducing contrast. Both visual conditions were performed once with and once without an HMD, which was simulated by projecting head-slaved symbology in the outside visuals. Objective measures included flight performance, control inputs, gaze direction, and relative positioning. Subjective measures included self-ratings on performance, situation awareness, and workload. RESULTS: The results showed that in DVE the pilots perceived higher workload and were flying closer to the go-fast vessel than in GVE. Consequently, they responded with larger control inputs to maneuvers of the vessel. The availability of an HMD hardly improved flight performance but did allow the pilots to focus their attention more outside, significantly improving their situation awareness and reducing workload. These benefits were found in DVE as well as GVE conditions. DISCUSSION: DVE negatively affects workload and flight performance of helicopter pilots in a dynamic, low-altitude following task. An HMD can help improve situation awareness and lower the workload during such a task, irrespective of the visual conditions. Ledegang WD, van der Burg E, Valk PJL, Houben MMJ, Groen EL. Helicopter pilot performance and workload in a following task in a degraded visual environment. Aerosp Med Hum Perform. 2024; 95(1):16–24.
... Interestingly, the camouflage technique had a major impact on the detection rate, as dynamically adapting the camouflage to its environment was more efficient than the adaptive static camouflage technique and the standard Dutch woodland camouflage suit. That the dynamic adaptive camouflage pattern resulted in the best performance is not surprising, as it is known from the visual search literature that a target is more difficult to find when it is very similar to its surroundings than when it is dissimilar to its surroundings [12,18,32]. ...
... Interestingly, Van der Burg and colleagues found compelling evidence that a motion transient did pop out, but only when this transient resulted in a (temporarily) unique motion direction compared to the motion direction of the other moving objects (e.g., when all the other objects move in the opposite direction). What is most likely important for visual search and for camouflage efficiency is that an object (like a static or dynamic soldier) stands out from its surroundings when at least a single feature is unique compared to its environment (and/or surrounding distractor elements) [17,32,33]. ...
Article
Targets that are well camouflaged under static conditions are often easily detected as soon as they start moving. We investigated and evaluated ways to design camouflage that dynamically adapts to the background and conceals the target while taking the variation in potential viewing directions into account. In a human observer experiment, recorded imagery was used to simulate moving (either walking or running) and static soldiers, equipped with different types of camouflage patterns and viewed from different directions. Participants were instructed to detect the soldier and to make a rapid response as soon as they had identified the soldier. Mean target detection rate was compared between soldiers in standard (Netherlands) Woodland uniform, in static camouflage (adapted to the local background), and in dynamically adapting camouflage. We investigated the effects of background type and variability on detection performance by varying the soldiers' environment (such as bushland and urban). In general, detection was easier for dynamic soldiers compared to static soldiers, confirming that motion breaks camouflage. Interestingly, we show that motion onset, and not motion itself, is an important feature for capturing attention. Furthermore, camouflage performance of the static adaptive pattern was generally much better than for the standard Woodland pattern. Also, camouflage performance was found to be dependent on the background and the local structures around the soldier. Interestingly, our dynamic camouflage design outperformed a method which simply displays the 'exact' background on the camouflage suit (as if it were transparent), since it is better capable of taking the variability in viewing directions into account. By combining new adaptive camouflage technologies with dynamic adaptive camouflage designs such as the one presented here, it may become feasible to prevent detection of moving targets in the (near) future.
... p < .001, indicating that the detection rate decreases when search becomes more difficult 9,11,25. ...
... First, the conspicuity experiment was relatively brief compared to the visual search experiment. One reason is that for a typical visual search study one needs approximately 20 trials per condition to estimate the search efficiency for each condition, as the search path can vary largely from one trial to another (especially when search is very inefficient and a scan through the scene occurs serially; see, e.g., 25,26). Moreover, the strong correlation that we observed between the two sessions in the conspicuity experiment suggests that the responses were very accurate, and indicates that a single session may already be sufficient to measure target conspicuity. ...
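As a concrete illustration of the efficiency estimate mentioned above, the search slope is simply a least-squares fit of mean RT against set size; the numbers below are invented for the example, not data from any of these studies.

```python
# A minimal sketch (invented numbers): search efficiency as the slope of
# mean reaction time over set size, fitted by least squares.
import numpy as np

set_sizes = np.array([4, 8, 16, 32])
mean_rt_ms = np.array([520, 610, 790, 1150])  # hypothetical condition means
slope, intercept = np.polyfit(set_sizes, mean_rt_ms, 1)
print(f"search slope: {slope:.1f} ms/item")   # steep slopes indicate inefficient search
```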
... Given the earlier-mentioned similarity effects, we expected vertical distractors to interfere more with target discrimination than horizontal distractors, and we sought to determine the spatial range within which they would do so. Given that the large number of possible stimulus configurations (2^284) made it impossible to evaluate each potential configuration using a standard factorial design, we applied a technique based on genetic algorithms (Holland, 1975; see also Kong, Alais, & Van der Burg, 2016a, 2016b; Van der Burg, Cass, Theeuwes, & Alais, 2015). The main idea behind this optimization technique is that displays evolve, such that flanker arrays that yield little crowding and thus good performance survive, while displays that yield strong crowding become extinct (i.e., a survival-of-the-fittest principle). ...
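That evolutionary loop can be sketched in a few lines. The toy version below is my illustration, not the authors' code: the fitness function is a placeholder (in the actual experiments fitness comes from observed participant performance on each display), and the population size, mutation rate, and gene layout are assumptions; only the 284 binary genes echo the 2^284 configurations mentioned in the snippet.

```python
# A toy sketch of display evolution by survival of the fittest (placeholder
# fitness and parameters; not the procedure of Kong et al. or Van der Burg et al.).
import numpy as np

rng = np.random.default_rng(1)
N_DISPLAYS, N_GENES, N_GENERATIONS = 20, 284, 50  # 284 binary genes per display

def fitness(display):
    # Placeholder: real fitness is a participant's speed/accuracy on the
    # display; here, displays with fewer items in the first 40 slots score higher.
    return -display[:40].sum()

pop = rng.integers(0, 2, size=(N_DISPLAYS, N_GENES))
for _ in range(N_GENERATIONS):
    scores = np.array([fitness(d) for d in pop])
    survivors = pop[np.argsort(scores)[-N_DISPLAYS // 2:]]  # best half survive
    children = []
    for _ in range(N_DISPLAYS - len(survivors)):
        a, b = survivors[rng.choice(len(survivors), size=2, replace=False)]
        cut = rng.integers(1, N_GENES)                       # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(N_GENES) < 0.01                    # rare mutations
        child[flip] = 1 - child[flip]
        children.append(child)
    pop = np.vstack([survivors, *children])                  # next generation
```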
Article
Visual crowding is arguably the strongest limitation imposed on extrafoveal vision, and is a relatively well-understood phenomenon. However, most investigations and theories are based on sparse displays consisting of a target and at most a handful of flanker objects. Recent findings suggest that the laws thought to govern crowding may not hold for densely cluttered displays, and that grouping and nearest neighbour effects may be more important. Here we present a computational model that accounts for crowding effects in both sparse and dense displays. The model is an adaptation and extension of an earlier model that has previously successfully accounted for spatial clustering, numerosity and object-based attention phenomena. Our model combines grouping by proximity and similarity with a nearest neighbour rule, and defines crowding as the extent to which target and flankers fail to segment. We show that when the model is optimized for explaining crowding phenomena in classic, sparse displays, it also does a good job in capturing novel crowding patterns in dense displays, in both existing and new data sets. The model thus ties together different principles governing crowding, specifically Bouma's law, grouping, and nearest neighbour similarity effects.
... Some of this is due to the inability to use Becker's relational guidance when targets are not linearly separable from distractors, and some of the difficulty is due to added bottom-up (DD similarity) noise produced by the highly salient contrast between the two types of yellow and red distractors. Note, however, that attention can still be guided to the orange targets, showing that top-down guidance is not based entirely on a single relationship (for more, see Kong, Alais, & Van der Burg, 2016; Lindsey et al., 2010). Moreover, Buetti et al. (2020) have cast doubt on the whole idea of linear separability, arguing that performance in the inseparable case can be explained as a function of performance on each of the component simple feature searches. ...
Article
This paper describes Guided Search 6.0 (GS6), a revised model of visual search. When we encounter a scene, we can see something everywhere. However, we cannot recognize more than a few items at a time. Attention is used to select items so that their features can be "bound" into recognizable objects. Attention is "guided" so that items can be processed in an intelligent order. In GS6, this guidance comes from five sources of preattentive information: (1) top-down and (2) bottom-up feature guidance, (3) prior history (e.g., priming), (4) reward, and (5) scene syntax and semantics. These sources are combined into a spatial "priority map," a dynamic attentional landscape that evolves over the course of search. Selective attention is guided to the most active location in the priority map approximately 20 times per second. Guidance will not be uniform across the visual field. It will favor items near the point of fixation. Three types of functional visual fields (FVFs) describe the nature of these foveal biases. There is a resolution FVF, an FVF governing exploratory eye movements, and an FVF governing covert deployments of attention. To be identified as targets or rejected as distractors, items must be compared to target templates held in memory. The binding and recognition of an attended object is modeled as a diffusion process taking > 150 ms/item. Since selection occurs more frequently than that, it follows that multiple items are undergoing recognition at the same time, though asynchronously, making GS6 a hybrid of serial and parallel processes. In GS6, if a target is not found, search terminates when an accumulating quitting signal reaches a threshold. Setting of that threshold is adaptive, allowing feedback about performance to shape subsequent searches. Simulation shows that the combination of asynchronous diffusion and a quitting signal can produce the basic patterns of response time and error data from a range of search experiments.
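The quitting-signal idea lends itself to a toy simulation. The sketch below is my own simplification, not Wolfe's implementation: the drift, noise, cycle time, and threshold values are invented, and it models only the target-absent termination rule described in the abstract.

```python
# A toy accumulator for a GS6-style quitting signal (invented parameters):
# target-absent search ends when the accumulated signal reaches a threshold.
import numpy as np

rng = np.random.default_rng(2)

def absent_trial_rt(threshold=20.0, cycle_ms=50.0, drift=1.0, noise=0.5):
    """Each ~50 ms selection cycle adds a noisy increment; the trial's RT
    is the time at which the quitting signal crosses the threshold."""
    signal, t = 0.0, 0.0
    while signal < threshold:
        signal += drift + noise * rng.standard_normal()
        t += cycle_ms
    return t

rts = [absent_trial_rt() for _ in range(1000)]
print(f"mean target-absent RT: {np.mean(rts):.0f} ms")  # about 1,000 ms here
```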
... Interestingly, the camouflage technique had a major impact on the detection rate, as dynamically adapting the camouflage to its environment was by far more efficient than the adaptive static camouflage technique and the standard Dutch woodland camouflage suit. That the dynamic adaptive camouflage pattern resulted in the best performance is not surprising, as it is known from the visual search literature that a target is more difficult to find when it is very similar to its surroundings than when it is dissimilar to its surroundings [34–36]. ...
Article
Objective We examined whether active head aiming with a Helmet Mounted Display (HMD) can draw the pilot's attention away from a primary flight task. Furthermore, we examined whether visual clutter increases this effect. Background Head up display symbology can result in attentional tunneling, and clutter makes it difficult to identify objects. Method Eighteen military pilots had to simultaneously perform an attitude control task while flying in clouds and a head aiming task in a fixed-base flight simulator. The former consisted of manual compensation for roll disturbances of the aircraft, while the latter consisted of keeping a moving visual target inside a small or large head-referenced circle. A "no head aiming" condition served as a baseline. Furthermore, all conditions were performed with or without visual clutter. Results Head aiming led to deterioration of the attitude control task performance and an increase in the number of roll-reversal errors (RREs). This was even the case when head aiming required minimal effort. Head aiming accuracy was significantly lower when the roll disturbances in the attitude control task were large compared to when they were small. Visual clutter had no effect on either task. Conclusion We suggest that active head aiming of HMD symbology can cause attentional tunneling, as expressed by an increased number of RREs and less accuracy on a simultaneously performed attitude control task. Application This study improves our understanding of the perceptual and cognitive effects of (military) HMDs, and has implications for operational use and possibly (re)design of HMDs.
Conference Paper
Natural scenes are typically highly heterogeneous, making it difficult to assess camouflage effectiveness for moving objects since their local contrast varies with their momentary position. Camouflage performance is usually assessed through visual search and detection experiments involving human observers. However, such studies are time-consuming and expensive since they involve many observers and repetitions. Here, we show that a (state-of-the-art) convolutional neural network (YOLO) can be applied to measure the camouflage effectiveness of stationary and moving persons in a natural scene. The network is trained on human observer data. For each detection, it also provides the probability that the detected object is correctly classified as a person, which is directly related to camouflage effectiveness: more effective camouflage yields lower classification probabilities. By plotting the classification probability as a function of a person’s position in the scene, a ‘camouflage efficiency heatmap’ is obtained, that reflects the variation of camouflage effectiveness over the scene. Such a heatmap can for instance be used to determine locations in a scene where the person is most effectively camouflaged. Also, YOLO can be applied dynamically during a scene traversal, providing real-time feedback on a person’s detectability. We compared the YOLO-predicted classification probability for a soldier in camouflage clothing moving through a rural scene to human performance. Camouflage effectiveness predicted by YOLO agrees closely with human observer assessment. Thus, YOLO appears an efficient tool for the assessment of camouflage of static as well as dynamic objects.
Article
How does the brain find objects in cluttered visual environments? For decades researchers have employed the classic visual search paradigm to answer this question using factorial designs. Although such approaches have yielded important information, they represent only a tiny fraction of the possible parametric space. Here we use a novel approach, by using a genetic algorithm (GA) to discover the way the brain solves visual search in complex environments, free from experimenter bias. Participants searched a series of complex displays, and those supporting fastest search were selected to reproduce (survival of the fittest). Their display properties (genes) were crossed and combined to create a new generation of "evolved" displays. Displays evolved quickly over generations towards a stable, efficiently searched array. Color properties evolved first, followed by orientation. The evolved displays also contained spatial patterns suggesting a coarse-to-fine search strategy. We argue that this behavioral performance-driven GA reveals the way the brain selects information during visual search in complex environments. We anticipate that our approach can be adapted to a variety of sensory and cognitive questions that have proven too intractable for factorial designs. © 2015 ARVO.
Article
Ss classified visually presented verbal units into the categories "in your vocabulary" or "not in your vocabulary." The primary concern of the experiment was to determine if making a prior decision on a given item affects the latency of a subsequent lexical decision for the same item. Words of both high and low frequency showed a systematic reduction in the latency of a lexical decision as a consequence of prior decisions (priming) but did not show any reduction due to nonspecific practice effects. Nonwords showed no priming effect but did show shorter latencies due to nonspecific practice. The results also indicated that many (at least 36) words can be in the primed state simultaneously and that the effect persists for at least 10 min. The general interpretation was that priming produces an alteration in the representation of a word in memory and can facilitate the terminal portion of the memory search process which is assumed to be random.
Article
An experiment was designed to investigate the locus of persistence of information about presentation modality for verbal stimuli. Twenty-four Ss were presented with a continuous series of 672 letter sequences for word/nonword categorization. The sequences were divided equally between words and nonwords, and each item was presented twice in the series, either in the same or in a different modality. Repetition facilitation, the advantage resulting from a second presentation, was greatest in the intramodality conditions for both words (+ve responses) and nonwords (-ve responses). Facilitation in these conditions declined from 170 msec at Lag 0 (4 sec) to approximately 40 msec at Lag 63. Facilitation was reduced in the cross-modality condition for words and was absent from the cross-modality condition for nonwords. The modality-specific component of the repetition effect found in the word/nonword categorization paradigm may be attributed to persistence in the nonlexical, as distinct from lexical, component of the word categorization process.
Article
The simplex plays an important role as sample space in many practical situations where compositional data, in the form of proportions of some whole, require interpretation. It is argued that the statistical analysis of such data has proved difficult because of a lack both of concepts of independence and of rich enough parametric classes of distributions in the simplex. A variety of independence hypotheses are introduced and interrelated, and new classes of transformed‐normal distributions in the simplex are provided as models within which the independence hypotheses can be tested through standard theory of parametric hypothesis testing. The new concepts and statistical methodology are illustrated by a number of applications.
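The log-ratio device behind those transformed-normal classes fits in a few lines. This is a generic additive log-ratio transform (Aitchison's alr), shown for illustration rather than taken from the article.

```python
# A minimal sketch: the additive log-ratio (alr) transform, mapping a
# composition in the simplex to unconstrained real coordinates, where
# ordinary multivariate normal models can then be applied.
import numpy as np

def alr(composition):
    """Additive log-ratio transform, using the last part as the reference."""
    x = np.asarray(composition, dtype=float)
    return np.log(x[:-1] / x[-1])

print(alr([0.2, 0.3, 0.5]))  # -> [log(0.4), log(0.6)]
```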
Article
The unique hues (blue, green, yellow, red) form the fundamental dimensions of opponent-color theories, are considered universal across languages, and provide useful mental representations for structuring color percepts. However, there is no neural evidence for them from neurophysiology or low-level psychophysics. Tapping a higher prelinguistic perceptual level, we tested whether unique hues are particularly salient in search tasks. We found no advantage for unique hues over their nonunique complementary colors. However, yellowish targets were detected faster, more accurately, and with fewer saccades than their complementary bluish targets (including unique blue), while reddish-greenish pairs were not significantly different in salience. Similarly, local field potentials in primate V1 exhibited larger amplitudes and shorter latencies for yellowish versus bluish stimuli, whereas this effect was weaker for reddish versus greenish stimuli. Consequently, color salience is affected more by early neural response asymmetries than by any possible mental or neural representation of unique hues. © 2015 ARVO.
Article
Although response times (RTs) are the dependent measure of choice in the majority of studies of visual attention, changes in RTs can be hard to interpret. First, they are inherently ambiguous, since they may reflect a change in the central tendency or skew (or both) of a distribution. Second, RT measures may lack sensitivity, since meaningful changes in RT patterns may not be picked up if they reflect two or more processes having opposing influences on mean RTs. Here we describe RT distributions for repetition priming in visual search, fitting ex-Gaussian functions to RT distributions. We focus here on feature and conjunction search tasks, since priming effects in these tasks are often thought to reflect similar mechanisms. As expected, both tasks resulted in strong priming effects when target and distractor identities repeated, but a large difference between feature and conjunction search was also seen, in that the σ parameter (reflecting the standard deviation of the Gaussian component) was far more affected by search repetition in conjunction than in feature search. Although caution should clearly be used when particular parameter estimates are matched to specific functions or processes, our results suggest that analyses of RT distributions can inform theoretical accounts of priming in visual search tasks, in this case showing quite different repetition effects for the two differing search types, suggesting that priming in the two paradigms partly reflects different mechanisms.
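For readers who want to reproduce this kind of analysis, an ex-Gaussian can be fitted with scipy's exponnorm distribution; the sketch below uses simulated rather than real RTs, and the generative parameter values are invented.

```python
# A minimal sketch (simulated data): fitting an ex-Gaussian to reaction
# times via scipy's exponnorm distribution, which is parameterized as
# K = tau / sigma, loc = mu, scale = sigma.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, tau = 0.45, 0.05, 0.15                  # generative parameters (s)
# Ex-Gaussian RTs: a Gaussian component plus an exponential tail.
rts = rng.normal(mu, sigma, 2000) + rng.exponential(tau, 2000)

K, loc, scale = stats.exponnorm.fit(rts)            # maximum-likelihood fit
print(f"mu = {loc:.3f}, sigma = {scale:.3f}, tau = {K * scale:.3f}")
```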