Fig. 5 Stimulus contrast does not explain blur selectivity. a Profile schematic of how a blurred stimulus (red) has a decreased mean foreground intensity relative to a sharp (β = 0.005) stimulus (black). Low/high values correspond to background/foreground image intensities, respectively. An intensity control is constructed from a sharp shape with an identical mean foreground intensity (blue). b Example blurred stimulus and intensity-matched controls. Stimuli are shown in black but were presented at either positive or negative luminance contrast. c, d Responses of preferred (red) to non-preferred (blue) shapes that were presented either blurred (left) or as intensity-matched controls (right). While blur and contrast control tuning curves are remarkably different in c, they are quite similar in d. e Center-of-Mass analysis (see Methods: 'Analysis and model fitting') reveals that for a majority of cells (n = 31 of 34) blur and contrast-control tuning curves have significantly different (black) tuning profiles (t-test, p < 0.05); other cells (gray) exhibited blur and contrast-control tuning curves with CoM values that were not significantly different. For these neurons, blur tuning may be explained in the context of intensity tuning. Neurons above the diagonal typically exhibited a tuning preference for intermediate intensities while remaining largely invariant to all but the highest levels of blur; a more conservative estimate that discounts these cells (n = 6, ≈ 18%) still finds the majority of neurons unexplained by tuning for stimulus contrast (n = 25, ≈ 74%)
Fig. 7 Responses explained by a joint model of shape and blur. a NRMS fit error of an APC model plotted as a function of NRMS fit error of an APCB model for each neuron. Example cells are filled and labeled. b, d, g Even though the APCB is a generalization of the APC model, i.e., APCB models form a superset of APC models with two additional parameters, cross-validation demonstrates the APCB model to better predict responses to blurred stimuli: b Average NRMSE across training and testing stimuli reveals the APCB model to better predict responses without overfitting; relative error, computed as APC−APCB error, is positive in all significant cases (g). d Relative testing NRMSE between models, plotted as a function of blur selectivity PV, demonstrates the APCB model to better predict responses over a range of neuron tuning profiles. g Significant (black) and not significant (gray) cases of prediction improvement were determined with a pair-wise t-test across hold-out validation stimuli (see Methods: 'Analysis and model fitting'). c, e, h Observed responses (open circles, dashed lines) and APCB model fits (filled circles, solid lines) for three example neurons (Fig. 2a, c, d). Qualitative assessment of fits suggests that the APCB model captures blur-tuned response properties remarkably well. f Comparison of the APCB gain model against an APCB additive variant with equal degrees of freedom (see Methods: 'Analysis and model fitting'). Most neurons are better fit by the APCB gain model, particularly when error is small, consistent with blur selectivity being explained by gain modulation
ARTICLE
Joint coding of shape and blur in area V4
Timothy D. Oleskiw1,2, Amy Nowack2 & Anitha Pasupathy2
Edge blur, a prevalent feature of natural images, is believed to facilitate multiple visual processes including segmentation and depth perception. Furthermore, image descriptions that explicitly combine blur and shape information provide complete representations of naturalistic scenes. Here we report the first demonstration of blur encoding in primate visual cortex: neurons in macaque V4 exhibit tuning for both object shape and boundary blur, with observed blur tuning not explained by potential confounds including stimulus size, intensity, or curvature. A descriptive model wherein blur selectivity is cast as a distinct neural process that modulates the gain of shape-selective V4 neurons explains observed data, supporting the hypothesis that shape and blur are fundamental features of a sufficient neural code for natural image representation in V4.
DOI: 10.1038/s41467-017-02438-8 OPEN
1Department of Applied Mathematics, University of Washington, Seattle, WA, USA. 2Department of Biological Structure, University of Washington, Seattle,
WA, USA. Correspondence and requests for materials should be addressed to T.D.O. (email: oleskiw@uw.edu)
NATURE COMMUNICATIONS | (2018) 9:466 |DOI: 10.1038/s41467-017-02438-8 |www.nature.com/naturecommunications 1
Content courtesy of Springer Nature, terms of use apply. Rights reserved
In any natural scene, visual information is carried by boundaries of contrast that exist throughout an image1. For example, figure–ground contrast along the borders of solid objects may provide robust cues for object shape2, while internal boundaries and surface texture may reveal 3D structure and material composition of those objects3,4. Much work has quantified the extent to which these edge cues contribute to complex visual tasks, such as segmentation and recognition5, and progress is being made toward understanding the neural mechanisms responsible (e.g., see refs. 4,6–9).
However, physical environments under naturalistic viewing conditions often produce edges that are blurred, i.e., exhibit a spatial gradient of image intensity across the edge (Fig. 1a, b). Specifically, edges without blur are sharp step transitions in intensity, whereas blurred edges vary smoothly in intensity from one side to the other. Such blurred boundaries within natural images can arise from a number of physical scenarios, such as defocus, cast shadows, or surface shading10, and thus themselves convey relevant scene information such as object depth11,12. Importantly, computational studies of luminance boundaries find that visual scenes may be sufficiently reconstructed from information contained in edge features, including the magnitude of blur at each edge10. Further, psychophysical results demonstrate that in addition to shape13, the visual system is tuned to detect cast shadows during segmentation14, of which shape and blur are diagnostic features.
The human visual system is also adept at discriminating blur15 and detecting blurred boundaries14,16,17. While biophysically plausible computational models have been proposed to explain how the brain could utilize blur information18, neural mechanisms that underlie the computation and representation of blur remain unclear. Since sharp and blurred boundaries differ greatly in their high spatial frequency (SF) content, V1 populations tuned to various SFs implicitly encode blur. However, at intermediate stages of form processing, such as in area V4, simple gratings are ineffective at driving responses of shape-selective neurons19, and complex shape stimuli that do elicit responses have typically been defined by sharp boundaries20–22. Thus, it is unknown whether and how blur is encoded and combined with shape information along the ventral pathway to form a sufficient representation of natural scenes.
Here we present results from a study targeting single V4 neurons using customized sets of shape stimuli to test the hypothesis that V4 neurons jointly encode object shape and boundary blur. Our results demonstrate that shape-selective V4 neurons also exhibit tuning for blur and that single-unit responses are well described by a joint model explicitly encoding both shape and blur information.
Results
Selectivity for blur in area V4. To understand how blur, i.e., the gradient of image intensity, is encoded in the intermediate stages of the ventral pathway, we examined the responses of well-isolated V4 neurons to shape stimuli as a function of blur magnitude. For each neuron, we first assessed shape selectivity using a standard stimulus set (Fig. 1c, d; ref. 22). Based on these responses, we identified a subset of preferred and non-preferred stimuli that evoked a range of responses for the neuron in question (see Methods: Visual stimulation). We then examined the responses to this chosen subset of stimuli under various levels of blur (Fig. 1e).
Blurring a stimulus boundary, as implemented here (see Methods: Visual stimulation), broadens the intensity gradient across a shape's boundary. Because the responses of roughly 80% of V4 neurons increase as figure–ground stimulus contrast is increased23, one may expect blur to reduce the response of shape-selective neurons as edge intensity gradients are broadened. Indeed, many neurons in our population follow this trend. For example, cell a23 in Fig. 2a exhibited a range of responses from 15 to 45 spk/s for a variety of sharp stimuli subjected to minimal blur (β = 0.005). This response pattern was maintained for small amounts of blur (β < 0.04), but for intermediate and high blur factors (β ≥ 0.04) responses to preferred stimuli gradually declined; at the highest levels of blur tested, i.e., β = 0.64, all stimuli were effectively amorphous with little discernible form (Fig. 1e), and responses approached baseline (dashed line). Thus, for this neuron both response magnitude and shape selectivity declined with increasing levels of blur. Figure 2b, c illustrate additional examples of this general behavior, but rather than a gradual decline as in Fig. 2a, cells a08 and b29 of Fig. 2b, c maintained shape selectivity up to blur factor β = 0.16 before transitioning sharply to a baseline level that is not selective for
Fig. 1 Examples of blur in natural images and stimuli used to explore selectivity for shape and blur. a, b Examples of different types of blur in natural scenes. a Focal blur (white arrows) conveys information about depth while shading blur (red arrows) conveys information about 3D structure. b Penumbral blur is associated with cast shadows (blue arrows); during grouping, cast shadows do not interfere with perception of physical object boundaries and shading. c–e Stimulus set used to assess tuning for shape and blur in V4 neurons. c A standard set of 51 shapes were used to assess shape selectivity of V4 neurons. Stimulus size is defined relative to the diameter of the large circle (black arrow). d Each shape was presented at up to 8 unique orientations at 45° increments; all rotations for one example shape are shown. For shapes with radial symmetry, duplicates were excluded. e To assess tuning for blur, a subset of preferred and non-preferred shapes were presented at up to 9 levels of Gaussian blur (see Methods: Visual stimulation). Example stimuli β ≥ 0.32 were cropped here for display purposes.
shape. Note that limited shape selectivity may occur at high blur (β ≥ 0.32) since all stimuli retain low-SF oriented energy as blur magnitude is increased.
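The construction of blurred stimuli can be sketched as follows. This is our reconstruction for illustration, not the authors' code: we assume a binary shape image smoothed by a Gaussian whose s.d. is the blur factor β times the stimulus diameter, and use a simple disc as the shape; the β-to-sigma mapping and image size are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_stimulus(shape_img, beta, diameter_px):
    # Assumed mapping: Gaussian s.d. = blur factor * stimulus diameter.
    return gaussian_filter(shape_img.astype(float), sigma=beta * diameter_px)

yy, xx = np.mgrid[:128, :128]
img = ((yy - 64) ** 2 + (xx - 64) ** 2 < 32 ** 2).astype(float)  # 64-px disc

sharp = blur_stimulus(img, 0.005, 64)   # essentially a step edge
heavy = blur_stimulus(img, 0.64, 64)    # nearly amorphous

# Blur diffuses intensity (lower foreground peak) and enlarges the
# foreground footprint -- the two stimulus confounds examined later.
print(heavy.max() < sharp.max())                    # True
print((heavy > 0.05).sum() > (sharp > 0.05).sum())  # True
```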
In striking contrast, many other V4 neurons exhibited a marked increase in response magnitude over intermediate blur levels (Fig. 2d–f). In other words, the activity of these three example neurons was non-monotonic as a function of blur. Further, this blur modulation appears to depend on stimulus shape, facilitating responses of preferred shapes more than non-preferred shapes. As a result, in these neurons shape selectivity is strongest over intermediate blur factors. For all three examples, responses to blurred stimuli are highest for blur values between β = 0.16 and β = 0.32, overlapping with the range of blur values associated with a decline in responses seen in Fig. 2a–c.
To quantify the effect of blur across the population of V4 neurons, we performed a model-free analysis of blur modulation illustrated in Fig. 3. For each neuron we first constructed an average tuning curve as a function of boundary blur, based on the interpolated responses to a subset of preferred stimuli (see Methods: Analysis and model fitting). We then calculated two metrics from each tuning curve: the extremal blur factor (Fig. 3a–d, triangles) that is associated with the maximal response modulation relative to average activity evoked by non-blurred (sharp) stimuli, and a modulation index taken as the tuning curve integral across blur factors (Fig. 3a–d, hatching). Figure 3e depicts the modulation index (MI, y-axis) as a function of extremal blur factor (β, x-axis) for all neurons in our population (n = 65); in this space our recorded data span a continuum, and we demarcate cells via modulation index and extremal blur factor criteria.
Immediately visible is a sub-population (n = 42, 65%) with a high extremal blur factor and negative modulation index (MI < 0, β > 0.32); these neurons exhibit responses that decrease with increasing blur, collapsing to a near-baseline response at highest blur levels (e.g., cell a19 of Fig. 3d). Note also that a few cells (n = 5, 8%) have weak tuning for intermediate blur coupled with a strong fall-off at high blur values (e.g., cell b13 of Fig. 3b) to produce a non-negative modulation index at high extremal blur factors (MI ≥ 0, β > 0.32). Conversely, other neurons (n = 11, 17%) exhibit a non-negative modulation index with intermediate extremal blur factor values (MI ≥ 0, 0.1 < β < 0.32), indicative of neurons tuned to intermediate blur magnitudes (e.g., cell a15 of Fig. 3a). Interestingly, some cells (n = 7, 10%) demonstrate intermediate inhibition, i.e., negative modulation at intermediate blur values (MI < 0, 0.1 < β < 0.32); these neurons exhibit strong shape selectivity at both low and high blur magnitudes. In Fig. 3e the first principal component of our population in this space, calculated under a scaling to equalize variance along each dimension (shaded line), aids in segregating our neurons: cells with a positive principal value (PV, more red) respond best to intermediate blur, while those with a negative PV (more blue) show declining responses with increasing blur. While it is not perfect, we see in Fig. 3f that by superimposing blur tuning curves across the population, colored according to each neuron's PV, this measure is diagnostic of selectivity for low blur (more blue) and intermediate blur (more red).
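The two model-free metrics can be sketched directly from their description. The blur grid, example tuning curves, and trapezoidal integration below are illustrative assumptions, not the authors' exact interpolation procedure.

```python
import numpy as np

betas = np.array([0.005, 0.04, 0.16, 0.64])  # example blur-factor grid

def blur_metrics(mean_resp):
    """Extremal blur factor and modulation index of one blur tuning curve."""
    rel = mean_resp - mean_resp[0]            # response relative to sharp
    extremal = betas[np.argmax(np.abs(rel))]  # largest absolute deviation
    # Modulation index: integral of the relative-response curve (trapezoidal).
    mi = np.sum(0.5 * (rel[1:] + rel[:-1]) * np.diff(betas))
    return extremal, mi

tuned = np.array([20.0, 35.0, 50.0, 5.0])      # peaks at intermediate blur
declining = np.array([50.0, 40.0, 15.0, 2.0])  # falls off with blur
print(blur_metrics(tuned))      # intermediate extremal beta, positive MI
print(blur_metrics(declining))  # high extremal beta, negative MI
```

In this toy example the intermediate-blur-tuned curve lands in the (MI ≥ 0, intermediate β) region of Fig. 3e, and the declining curve in the (MI < 0, high β) region, mirroring the two main sub-populations described above.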
Controlling for stimulus size. In addition to the gradient width of edge intensity, blurring alters many other stimulus characteristics. For example, as depicted in Fig. 4a, the foreground area of a blurred shape stimulus, defined as the number of pixels distinct from the background, increases with blur magnitude. Thus, tuning for boundary blur might arise from a simple tuning for stimulus size. To test this hypothesis, in a subsequent size control experiment, we presented shape stimuli that were first scaled by ±10% and then subjected to a diagnostic subset of blur levels (Fig. 4b). If preference for intermediate blur was simply due to a preference for stimulus size we would expect a shift in the blur tuning peak as size was varied. In Fig. 4c, d we plot the responses of two example neurons that respond preferentially to intermediate levels of blur. For both examples the modulation of responses with respect to blur was consistent across changes in stimulus size. Across the neurons subjected to this size control (n = 26), in every case we found that blur modulation was similar across stimulus size. Importantly, while one might expect a systematic variation in responses across all stimuli with respect to
Fig. 2 Shape-selective V4 neurons are tuned for blur. a–f For each neuron we plot the mean responses (y-axis) to several stimuli as a function of the magnitude of blur factor (x-axis, β). Line color indicates shape identity and is ordered from preferred (red) to non-preferred (blue) stimuli for each neuron based on responses to the sharp versions of each stimulus (β = 0.005). Error bars indicate s.e.m. a Responses of an example V4 neuron that was strongly selective to sharp stimuli, i.e., β = 0.005; responses declined gradually to baseline levels (dashed line) as blur magnitude was increased. b, c Two additional examples that also exhibited a monotonic decrease in responses with increasing blur. Unlike a, these neurons maintained their response level across low blur levels, sharply declining to baseline beyond a critical blur factor (β ≈ 0.16). d–f Example V4 neurons that respond best at intermediate levels of blur; responses for preferred stimuli dramatically increase for intermediate blur factors.
size, we did not find a significant interaction effect between size and blur for any of the neurons (two-way ANOVA, p > 0.14 for all cells, median 0.96). Rather, blur accounted for a significant fraction of variance (p < 0.05) in the majority of these neurons (n = 23, 88%). As a result, our analysis suggests that selectivity for blur cannot be explained in the context of overall stimulus size. These findings were confirmed by separate analyses in which models of V4 shape selectivity were used to predict size control data. Briefly, while V4 responses to shape stimuli were adequately explained by existing models of boundary conformation22, these models failed to predict responses to scaled and blurred shape stimuli (see Methods: Control experiments).
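The size × blur two-way ANOVA can be sketched for a balanced design (equal trial counts per condition). The data below are synthetic, with a blur effect but no size effect and no interaction; the factor levels, trial counts, and effect sizes are assumptions for illustration.

```python
import numpy as np
from scipy.stats import f as f_dist

def two_way_anova(data):
    """data[i, j, k]: firing rate on trial k at size level i, blur level j."""
    a, b, n = data.shape
    grand = data.mean()
    m_size = data.mean(axis=(1, 2))   # marginal means over size levels
    m_blur = data.mean(axis=(0, 2))   # marginal means over blur levels
    m_cell = data.mean(axis=2)        # per-condition means
    ss_size = b * n * np.sum((m_size - grand) ** 2)
    ss_blur = a * n * np.sum((m_blur - grand) ** 2)
    ss_int = n * np.sum((m_cell - m_size[:, None] - m_blur[None, :] + grand) ** 2)
    ss_err = np.sum((data - m_cell[:, :, None]) ** 2)
    df_err = a * b * (n - 1)
    ms_err = ss_err / df_err

    def pval(ss, d):  # F-test of each mean square against the error term
        return f_dist.sf((ss / d) / ms_err, d, df_err)

    return pval(ss_size, a - 1), pval(ss_blur, b - 1), pval(ss_int, (a - 1) * (b - 1))

rng = np.random.default_rng(0)
blur_tuning = np.array([10.0, 25.0, 40.0, 5.0])  # mean rate at each blur level
rates = blur_tuning[None, :, None] + rng.normal(0.0, 3.0, size=(3, 4, 8))
p_size, p_blur, p_int = two_way_anova(rates)
print(p_blur < 0.05)  # True: blur drives significant variance
```

As in the recorded population, a neuron whose blur modulation is unchanged across sizes yields a significant blur main effect with no size × blur interaction.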
Controlling for stimulus contrast. A further confound arising from blur is the stimulus intensity contained within a shape's boundary. If we simply define the boundary contour of a blurred shape as the level set of stimulus intensities distinct from the background, we note from Fig. 1e that the average intensity within that boundary decreases as blur magnitude increases. That is to say, blur diffuses stimulus intensity, reducing the average contrast between figure and ground. Therefore, the preference for an intermediate level of blur could arise from a preference for a specific average stimulus intensity which differs from that of sharp stimuli.
To test this hypothesis, for each blurred stimulus we constructed a non-blurred (sharp) version matched in mean intensity (Fig. 5a, b) and compared the responses to these two stimulus sets. Here, mean stimulus intensity is determined from the area subtended by pixels differing from background intensity on our 24-bit color display. Figure 5c, d plots the results for two example neurons. For cell b26, the tuning curves as a function of blur and contrast are dramatically different: responses are strongest for intermediate blur, but fall off as contrast is reduced for sharp stimuli, inconsistent with blur tuning explained by contrast. On the other hand, cell b32 demonstrates very similar selectivity across both stimulus sets, suggesting that blur selectivity in this case could be explained by a simple figure–ground contrast preference. To quantitatively compare the two tuning curves, we calculated the center-of-mass (CoM) for each: tuning curves that peak at intermediate contrasts will garner a large CoM, and curves which monotonically decrease will retain smaller CoM values. Here, bootstrapping is employed to estimate the distribution of tuning curve CoM measurements under the variance of observed responses (see Methods: Analysis and model fitting). Across our sub-population of neurons subjected to the intensity control (Fig. 5e, 34 cells), the majority (n = 31, 91%) had an intensity-matched CoM significantly different than that of blur (t-test, p < 0.05), indicating neural responses are not consistent with tuning for intermediate stimulus intensity. Thus, while selectivity for blur could be attributed in some neurons to a simple tuning for intermediate edge contrast, e.g., cell b32, the majority of cells cannot be explained in this context.
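A minimal sketch of the CoM comparison, assuming a paired bootstrap over trials (the exact resampling scheme and level grid below are our assumptions; the Methods are not reproduced in this excerpt):

```python
import numpy as np

rng = np.random.default_rng(1)
levels = np.array([0.005, 0.02, 0.08, 0.32])  # blur / intensity-control factors

def com(curve):
    """Center of mass of a tuning curve over the level grid."""
    return np.sum(levels * curve) / np.sum(curve)

def bootstrap_com(trials, n_boot=2000):
    """trials[j, k]: trial k at level j; returns bootstrap CoM samples."""
    n = trials.shape[1]
    idx = rng.integers(0, n, size=(n_boot, n))
    return np.array([com(trials[:, i].mean(axis=1)) for i in idx])

# A curve peaking at intermediate levels vs a monotonically decreasing one:
blur_trials = np.array([[20.0], [35.0], [50.0], [10.0]]) + rng.normal(0, 2, (4, 40))
ctrl_trials = np.array([[50.0], [30.0], [15.0], [5.0]]) + rng.normal(0, 2, (4, 40))
d = bootstrap_com(blur_trials) - bootstrap_com(ctrl_trials)
print((d > 0).mean() > 0.95)  # CoMs differ across virtually all resamples
```

A cell like b26 behaves like this toy example (intermediate-peaked blur curve, decreasing control curve, clearly separated CoMs), whereas a cell like b32 would yield overlapping bootstrap CoM distributions.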
Controlling for curvature modification. A more subtle confound of blurred stimuli arises when considering exactly how to define object shape in the presence of blur. Specifically, a blurred shape can be associated with any of a family of closed contours, each defined from the level set of stimulus intensity (Fig. 6b), and the magnitude of boundary curvature along each threshold contour may change as a function of blur. This confound between curvature and blur is significant from the perspective of shape coding, as previous studies have leveraged the manner in which smoothed boundary contours devolve into an ellipse to represent object shape24. Therefore, since the responses of many V4 neurons to shape stimuli can be explained in the context of tuning for
Fig. 3 Model-free analysis of blur selectivity across cells. a–d Average blur tuning curves of four example neurons (dark red) constructed by averaging responses to preferred shape stimuli (light red). Relative response (y-axis) as a function of blur factor (x-axis, β) was computed with respect to mean response across sharp preferred stimuli, i.e., relative response is zero for lowest blur factors (β = 0.005). An extremal blur factor (triangle) was defined as the magnitude of blur that evoked the largest absolute deviation relative to responses to sharp stimuli. Response modulation was determined by calculating the integral of relative responses across blur factors (hatching). e Response modulation (y-axis) is plotted as a function of extremal blur factor (x-axis) for the population of neurons (n = 65) in our data set. The principal value, calculated from the first principal component of the population (shaded line), demarcates neurons with peak responses at intermediate blur values (more red) from those that show declining activity as blur increases (more blue). Example cells of a–d are filled and labeled. f Superposition of blur tuning curves computed in a–d, scaled to have a unit-variance of relative response (y-axis) and colored according to e, demonstrate complementary tuning with respect to blur magnitude (x-axis) within our population.
boundary curvature22, we wondered whether the selectivity for intermediate blur may be due to a preference for modified curvature values that arise in blurred stimuli that were not a part of the original sharp stimuli.

To test the hypothesis that blur selectivity is an epiphenomenon of shape selectivity we first described each neuron's shape preference in terms of tuning for boundary curvature. As done elsewhere25–27, we identified the 2D Gaussian function in a shape space spanned by angular position and curvature (APC) that best predicts responses to the preliminary shape screen conducted using sharp stimuli (see Methods: Visual stimulation). Bootstrapping was used to calculate the normalized root mean-squared prediction error (Training NRMSE, Fig. 6d), as a measure of goodness of fit (see Methods: Analysis and model fitting). We then evaluated how well this best-fitting APC model could predict responses to blurred stimuli by considering the curvature descriptions associated with each blurred stimulus at a range of intensity thresholds (see Fig. 6a for a schematic of this procedure). For each intensity threshold, we quantified goodness of fit as the normalized root mean-squared error (NRMSE) between the predicted and observed responses. Then, for each neuron, the intensity threshold that minimized NRMSE was selected as the exemplar threshold (Threshold Curvature NRMSE, Fig. 6c, d). In Fig. 6c for each neuron we compare this threshold NRMSE against the NRMSE of a mean model that is agnostic to blur, i.e., a model that predicts identical responses to sharp and blurred versions of the same shapes. For most neurons in our population, model predictions derived from optimized intensity thresholds were associated with larger errors than simple predictions equal to mean responses across blur. Furthermore, Fig. 6d demonstrates that the APC model's failure to explain blur response variance is not due to an inability of the model to capture shape selectivity exhibited in our population; we find that, for the majority of neurons, prediction error for responses to blurred stimuli was higher than prediction error of models trained and validated on sharp stimuli alone.
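As a concrete reading of the error metric: the normalization used for NRMSE is not specified in this excerpt, so the sketch below assumes division by the s.d. of the observed responses, under which a flat mean-rate model scores NRMSE = 1 and better models score below 1. The response values are hypothetical.

```python
import numpy as np

def nrmse(pred, obs):
    # Assumed normalization: RMSE divided by the s.d. of observed responses.
    return np.sqrt(np.mean((pred - obs) ** 2)) / np.std(obs)

obs = np.array([12.0, 30.0, 44.0, 8.0, 21.0])  # hypothetical mean rates
mean_model = np.full_like(obs, obs.mean())     # blur-agnostic mean model
good_model = obs + np.array([1.0, -2.0, 1.5, -0.5, 0.0])  # near-perfect model
print(round(nrmse(mean_model, obs), 3))  # 1.0
print(nrmse(good_model, obs) < 1.0)      # True
```

Under this convention, the finding above says that threshold-curvature APC predictions scored worse than the mean model's NRMSE of 1 for most neurons.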
Joint coding of shape and blur. Results thus far demonstrate that neuronal responses in V4 are modulated by boundary blur, and this modulation cannot be explained on the basis of tuning for size, contrast, or curvature. Therefore, our findings support the hypothesis of an underlying neural code for object shape and boundary blur, e.g., a representation of both boundary conformation and spatial gradient of edge intensity in the rate responses of single V4 neurons.
To rigorously test this hypothesis, we evaluate whether a joint model of shape and blur performs significantly better at predicting V4 responses than a marginal model that is tuned for shape alone. Importantly, this analysis builds on the well-studied APC model22,25–27 that is known to capture tuning for boundary conformation in shape-selective V4 neurons (see Methods: Analysis and model fitting). For each cell we first fit a standard APC model to blurred responses under the assumption that neurons are invariant to boundary blur of shape stimuli. Thus, the APC model fits to the mean response of each shape across blur factors. We then augment the APC model to include a blur-selective term, taken to be log-normal in blur factor β. Simply put, responses R based on the angular position, curvature, and blur (APCB) model are predicted via R = APC × B, where the APC model, a shape-selective function of angular position and curvature, is multiplicatively scaled by B, a blur-selective function (see Methods: Analysis and model fitting). Note that APC and APCB models were fit to the blurred data without including the preliminary screen data to ensure that any fit differences were not due to the number and diversity of stimuli.
In Fig. 7a, we plot accuracy as determined by prediction error (NRMSE) across all trained stimuli and responses for shape-coding models with (APCB, x-axis) and without (APC, y-axis) the influence of blur. Leave-one-out cross-validation demonstrates (see Fig. 7b, d, g, Methods: Analysis and model fitting) an increased NRMSE prediction error of the APC model relative to the APCB model (APC−APCB error). Thus, the APCB model better captures the behavior of neurons to blurred stimuli, and this increased performance is not due to overfitting of additional parameters. In the majority of neurons, inclusion of blur information significantly improved prediction performance to stimuli in our data set (p < 0.05 for n = 53 of 65, 82%), including neurons selective for intermediate blur (blur selectivity PV > 0.25, n = 12 of 15, Fig. 7g). Examples in Fig. 7c, e, h illustrate the effectiveness of the APCB model: it accurately captures a
Fig. 4 Stimulus size does not explain blur selectivity. a A blurred stimulus (β = 0.16) has an increased foreground area, defined as the number of pixels distinct from the background, compared to its original boundary prior to blurring (red). Stimuli scaled by ±10% are shown for comparison (blue), and correspond to luminance thresholds approximately 1/3 and 2/3 of maximum, respectively. b Example scaled and blurred stimuli used to assess a potential size confound. c, d Results of the size control experiment for cells a15 and b27 demonstrate increased responses for intermediate blur irrespective of stimulus size. Line color represents stimulus identity per Fig. 2d, f. For both neurons, responses were not significantly influenced by size (p = 0.45 and p = 0.18, respectively) but significant variance was found with respect to blur (p < 0.0001). Error bars indicate s.e.m.
range of response behaviors including selectivity for intermediate
blur (Fig. 7h) and fall-off of responses to baseline at high blur
levels (Fig. 7c, e).
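The leave-one-out comparison can be illustrated with toy one-dimensional stand-ins for the two models: a blur-agnostic flat model (APC analogue, which predicts one response across blur) and a blur-gated peaked model (APCB analogue). The data, models, and parameter choices are synthetic, not the actual shape-space fits.

```python
import numpy as np
from scipy.optimize import curve_fit

def flat(beta, c):            # APC analogue: identical response at every blur
    return np.full_like(beta, c)

def peaked(beta, c, mu, s):   # APCB analogue: Gaussian in log blur factor
    return c * np.exp(-0.5 * ((np.log(beta) - mu) / s) ** 2)

betas = np.array([0.005, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64])
obs = 40 * np.exp(-0.5 * ((np.log(betas) - np.log(0.16)) / 0.9) ** 2) + 1.0

def loo_rmse(model, p0):
    """RMSE over held-out points, refitting with one point left out each time."""
    errs = []
    for i in range(len(betas)):
        keep = np.arange(len(betas)) != i
        popt, _ = curve_fit(model, betas[keep], obs[keep], p0=p0, maxfev=10000)
        errs.append((model(betas[i:i + 1], *popt)[0] - obs[i]) ** 2)
    return np.sqrt(np.mean(errs))

loo_peaked = loo_rmse(peaked, [30.0, np.log(0.1), 1.0])
loo_flat = loo_rmse(flat, [20.0])
print(loo_peaked < loo_flat)  # True: blur-aware model generalizes better
```

Because the comparison is on held-out points, the peaked model's advantage here reflects genuine predictive power rather than its extra parameters, which is the logic behind Fig. 7b, d, g.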
This choice of model parameterization, wherein blur tuning multiplicatively scales selectivity for boundary curvature, is but one of many possible descriptive models. For example, another formulation could have a selectivity for blur that additively facilitates position and curvature tuning, i.e., APC(Γ;Θ) + B(Γ;Θ). In Fig. 7f, however, we find that the multiplicative APCB model outperforms the additive variant in the majority of neurons. Thus, we conclude that neurons in V4 jointly encode information of shape and blur, and that this coding is best explained by a gain modulation, with tuning for blur multiplicatively scaling shape selectivity. It is important to note that, by construction, the APCB model is separable, i.e., it does not rely on an interaction between boundary conformation and blur to accurately predict neural responses. As will now be addressed, this suggests a distinct neural mechanism regulating blur selectivity in V4.
Distinct dynamic properties of blur-selective responses. Finally,
we ask how and when tuning for blur emerges in the the
responses of individual neurons. In Fig. 8a, we illustrate, for the
preferred shape of an example neuron (cell a23), the peristimulus
time histograms (PSTH) as blur is varied. For this neuron
responses to blurred shape stimuli decreased with increasing blur
(Fig. 2a) and this decrease was uniform across the stimulus
presentation interval. Given that shape and blur tuning is best
explained by a modulation of gain, in Fig. 8c we quantify blur
modulation by plotting the s.d. of responses across blur factors for
both preferred (black) and non-preferred (gray) shape PSTHs.
The difference between these curves (hatching) then captures the
timecourse over which blur modulation is applied to shape-
selective activity. The utility of this analysis is seen when con-
sidering a blur-selective neuron (cell a15) in Fig. 8b. Here, the
PSTH of preferred shape stimuli demonstrates a nonuniform
increase in responses, most pronounced over the initial (transient) wave of activity (approximately 50–150 ms after stimulus
onset). This effect is captured by the transient response mod-
ulation seen in Fig. 8d. Thus, these example neurons suggest a
potential difference in dynamics between two groups of cells:
those which are selective for intermediate blur factors, and those
which are not. If we demarcate neurons in our population with a
Fig. 5 Stimulus contrast does not explain blur selectivity. a Profile schematic of how a blurred stimulus (red) has a decreased mean foreground intensity relative to a sharp (β = 0.005) stimulus (black). Low/high values correspond to background/foreground image intensities, respectively. An intensity control is constructed from a sharp shape with an identical mean foreground intensity (blue). b Example blurred stimulus and intensity-matched controls. Stimuli are shown in black but were presented at either positive or negative luminance contrast. c, d Responses of preferred (red) to non-preferred (blue) shapes that were presented either blurred (left) or as intensity-matched controls (right). While blur and contrast control tuning curves are remarkably different in c, they are quite similar in d. e Center-of-Mass analysis (see Methods: 'Analysis and model fitting') reveals that for a majority of cells (n = 31 of 34) blur and contrast-control tuning curves have significantly different (black) tuning profiles (t-test, p < 0.05); other cells (gray) exhibited blur and contrast-control tuning curves with CoM values that were not significantly different. For these neurons, blur tuning may be explained in the context of intensity tuning. Neurons above the diagonal typically exhibited a tuning preference for intermediate intensities while remaining largely invariant to all but the highest levels of blur; a more conservative estimate that discounts these cells (n = 6, ≈18%) still finds the majority of neurons unexplained by tuning for stimulus contrast (n = 25, ≈74%)
blur-selective principal value (Fig. 3e, f, shading) PV > 0.25 as
tuned for intermediate blur from those tuned for sharp (minimal
blur) stimuli, the average normalized PSTHs for intermediate-
selective and sharp-selective sub-populations (Fig. 8e, f) exhibit
the distinct qualitative differences observed in cells a23 (Fig. 8a)
and a15 (Fig. 8b). For the blur-selective sub-population (Fig. 8f)
differences between responses with respect to blur are transient,
restricted to the early time period, while such differences exhib-
ited by the sharp-selective sub-population (Fig. 8e) are uniform.
To quantify these differences we first plot in Fig. 8g blur-selective PV versus the difference of blur modulation between preferred and non-preferred shape stimuli for each cell (Fig. 8c, d, hatching) over the sustained period of activity (shaded, 200–300 ms). We note a significant anti-correlation between these quantities (Pearson's r = −0.492, p = 0.052); for cells with a larger PV, sustained modulation of gain is smaller. There was also a significant difference in sustained blur modulation between the intermediate-selective and sharp-selective sub-populations (t-test, p = 0.018). Further, we found no significant difference in shape-dependent blur modulation during the initial period of activation between the two sub-populations (t-test, p = 0.154), suggesting that the more selective for intermediate blur a cell is (high blur-selective PV), the more transient blur modulation appears to be (low average-sustained blur modulation).
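The blur-modulation measure used above can be sketched on synthetic PSTHs; the response shapes, gain values, and time windows in this example are illustrative assumptions, not fitted data. The s.d. of responses across blur factors is computed at each time point, and the preferred-minus-non-preferred difference isolates shape-dependent gain modulation.

```python
import numpy as np

t = np.arange(0, 400)                        # ms after stimulus onset
n_blur = 9

# Synthetic PSTHs (spk/s): a shape-selective drive whose gain is scaled by
# blur only during an early transient window, qualitatively as for cell a15.
drive = {"pref": 60.0, "nonpref": 10.0}
transient = np.exp(-((t - 100.0) ** 2) / (2 * 30.0 ** 2))
blur_gain = np.linspace(1.0, 2.0, n_blur)    # hypothetical facilitation with blur

psth = {k: np.array([d * (1.0 + (g - 1.0) * transient) for g in blur_gain])
        for k, d in drive.items()}

# Blur modulation: s.d. of responses across blur factors at each time point.
mod = {k: v.std(axis=0) for k, v in psth.items()}
diff = mod["pref"] - mod["nonpref"]          # shape-dependent gain modulation

early = diff[(t >= 50) & (t <= 150)].mean()
sustained = diff[(t >= 200) & (t <= 300)].mean()
assert early > sustained                     # modulation is transient in this sketch
```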
While the response latency of cell a23 was unaffected by blur, cell a15 exhibited a consistent shift in tuning peaks: responses at high blur are significantly delayed. To examine this effect of response latency across the population, for each neuron we quantified the half-rise time as the duration between stimulus onset and half-of-maximum response at each blur factor. At low blur factors (β ≤ 0.16), blur-selective cells tend to have a slightly (though not significantly) shorter mean half-rise time on average (Fig. 8h). However, half-rise time increased significantly (t-test, p = 0.036) for blur-selective cells as the magnitude of blur increased (arrows). To factor out increased response latency as a population artifact, we analyzed latency differences based on the ratio of half-rise times as a function of blur factor, i.e., response latency normalized by each neuron's baseline half-rise time to sharp stimuli (β = 0.005), and find the same trend holds (not shown). This suggests the underlying circuitry of V4 neurons tuned for intermediate blur is distinct from that of neurons which consistently reduce activity as blur increases; blur-selective neurons receive a transient multiplicative facilitation of their shape-selective response as high-SF content is removed.
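Half-rise time as defined here is straightforward to compute from a PSTH. The sketch below uses synthetic sigmoidal response profiles whose rise times are invented for illustration, with linear interpolation between samples for sub-bin precision.

```python
import numpy as np

def half_rise_time(psth, t):
    """Latency from stimulus onset to half-of-maximum response (linear interp)."""
    half = psth.max() / 2.0
    i = np.argmax(psth >= half)                    # first bin at/above half-max
    if i == 0:
        return t[0]
    # Interpolate between the bracketing samples for sub-bin precision.
    frac = (half - psth[i - 1]) / (psth[i] - psth[i - 1])
    return t[i - 1] + frac * (t[i] - t[i - 1])

t = np.arange(0, 300)                              # ms
sharp = 50 / (1 + np.exp(-(t - 70) / 10.0))        # synthetic sharp-stimulus PSTH
blurred = 50 / (1 + np.exp(-(t - 95) / 10.0))      # delayed rise under heavy blur

assert half_rise_time(blurred, t) > half_rise_time(sharp, t)
```

Dividing each neuron's half-rise times by its value at the sharpest blur factor gives the latency-ratio normalization described above.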
Discussion
We studied the responses of primate V4 neurons to determine whether, and to what extent, blur influences neuronal activity. We
found that many V4 neurons are jointly tuned to shape and blur,
with responses explained by a blur-dependent gain-modulation of
shape tuning. Importantly, our results are not a simple by-
product of changes in stimulus size, contrast, and boundary
curvature that co-vary with blur. Further, blur-dependent mod-
ulation of responses does not appear to be a strictly local
mechanism, as modulation is consistent across multiple shapes
with features presented in various RF subregions. This finding is
reinforced by our size and curvature control experiments which
demonstrate blur tuning to hold over local stimulus perturba-
tions. Our demonstration of blur tuning implicates a role for V4
Fig. 6 Curvature modification does not explain blur selectivity. a Schematic of analysis performed across shape and blur datasets to assess the contribution of curvature modification toward blur selectivity. An APC model is fit to data collected from shape screening (red), which then predicts responses to modified curvature threshold contours computed from blurred stimuli at different thresholds (blue). b A blurred stimulus (β = 0.32) generated from a shape contour (red) and a family of closed contours defined by the level set of an intensity threshold (blue), each with reduced curvature magnitudes. c For each cell, the minimum prediction error across all intensity thresholds (threshold curvature NRMSE; see Results) plotted as a function of a blur-invariant mean model's prediction error. The latter predicts responses to different shapes in accordance with the APC model ignoring blur; responses are identical over all blur levels for a given shape. Example cells are filled and labeled. d Threshold prediction error as a function of bootstrapped training error, i.e., a baseline estimate of how well the APC model predicts shape data
in processes cued by blurred boundaries, i.e., segmentation, depth
perception, and shading, and supports the hypothesis that V4 can
provide an explicit and sufficient representation of natural scenes.
Our results identify blur as a novel tuning dimension in visual
cortex; while some V4 neurons exhibit a monotonic decline in
shape-selective responses with increasing levels of blur, others
maintain shape selectivity over a wide range of blur values, i.e.,
Fig. 2b, c. This latter group of neurons may be robust against the
many physical scenarios in which contours may be blurred in
naturalistic scenes. A separate group of neurons show maximal
responses at intermediate blur levels, responding best when high-
SF content is removed from a stimulus while preserving lower-
band content. This effect cannot be explained by simple mid-
band SF tuning, but rather indicates a preference for intermediate
blur that requires high-SF information to suppress shape-selective
responses. In V4, intermediate blur tuning is associated with
response peaks between blur factors β=0.08 and 0.16. At 3°
eccentricity, for example, this corresponds to a response
enhancement to frequency content between 2.6 and 1.2 cyc/°,
respectively, which is consistent with SF tuning distributions
observed in macaque V1 (ref. 28). Given that V4 responses are explained
by a joint model of shape and blur, where shape-selective
responses are multiplicatively scaled as a function of boundary
blur, blur tuning in V4 may arise from the aggregation of SF
information reported by V1, consistent with previous demon-
strations of V4 selectivity for non-Cartesian gratings19 and illu-
mination vectors6. Furthermore, tuned responses to shape occur
at either low or intermediate levels of blur in each neuron,
indicating that blur-tolerant shape identity may be decoded from a V4 population response. This is significant for visual computation in natural environments, as defocus, due to a finite depth of field or improper accommodation, may introduce optical blur within a scene; it has been argued that the visual system responds very differently to artificial versus naturalistic stimuli presented under blur at various depths from the plane of
focus12,15. For example, blur may aid in solving the correspon-
dence problem of binocular disparity: a V4 neuron tuned for
relative depth from the focal plane should respond strongly to a
single blurred edge presented in depth and be suppressed when
sharp edges of two physical objects are presented on the focal
plane, which coincidentally fall within displaced binocular
receptive fields. Such a mechanism would be consistent with
previous work suggesting that binocular V4 neurons solve the
correspondence problem by attenuating disparity signals that do
Fig. 7 Responses explained by a joint model of shape and blur. a NRMS fit error of an APC model plotted as a function of NRMS fit error of an APCB model for each neuron. Example cells are filled and labeled. b, d, g Even though the APCB is a generalization of the APC model, i.e., APCB models form a superset of APC models with two additional parameters, cross-validation demonstrates the APCB model to better predict responses to blurred stimuli: b Average NRMSE across training and testing stimuli reveals the APCB model to better predict responses without overfitting; relative error, computed as APC − APCB error, is positive in all significant cases (g). d Relative testing NRMSE between models, plotted as a function of blur selectivity PV, demonstrates the APCB model to better predict responses over a range of neuron tuning profiles. g Significant (black) and not significant (gray) cases of prediction improvement were determined with a pair-wise t-test across hold-out validation stimuli (see Methods: 'Analysis and model fitting'). c, e, h Observed responses (open circles, dashed lines) and APCB model fits (filled circles, solid lines) for three example neurons (Fig. 2a, c, d). Qualitative assessment of fits suggests that the APCB model captures blur-tuned response properties remarkably well. f Comparison of the APCB gain model against an APCB additive variant with equal degrees of freedom (see Methods: 'Analysis and model fitting'). Most neurons are better fit by the APCB gain model, particularly when error is small, consistent with blur selectivity being explained by gain modulation
not agree in SF content29. Thus, V4 plays a critical role in not
only detecting shapes subjected to blur for judgments of object
depth, but also for segmenting naturalistic scenes into blur-
invariant object representations.
While we did not find blur selectivity to correlate significantly with many physiological properties of neurons within our data set, i.e., receptive field eccentricity, preference for size, color, luminance, or cortical depth of the recording site, a significant relationship was found between selectivity for intermediate blur and preferred stimulus saturation (Supplementary Fig. 1): neurons with higher blur selectivity tend to prefer shape stimuli defined by low chromatic contrasts. Given that blur is a critical cue for the perception of shadows, this finding is consistent with blur-selective neurons acting as shadow detectors, since shadows cast across physical objects produce blurred edges dominated by luminance contrast14.
Shape theorists and psychophysicists have long argued that attached and cast shadows, formed on the occluding object or another surface, respectively, contribute to the perception of 3D shape and scene understanding. For example, the relative position of shadows may be used to infer the relative location of scene illuminants30–34. However, even though dark and blurry boundaries may be quickly identified as shadows35, perceptual judgments on these shadows are difficult and slow14,36. While some have argued that the poor access to shadow-specific information is due to low-level shadow detection and discounting14, others have proposed a higher-level process related to the perceptual segregation of physical objects from nuisance factors related to illumination36. Our results identify V4 as a plausible locus for processing shadows within the ventral stream, where shape and blur information coalesce in the activity of individual neurons. While the early emergence of blur-selective responses in V4 could underlie the rapid detection of shadows, more experiments are needed to determine whether V4 differentially encodes shadow versus non-shadow blurred boundaries, and how this difference could underlie the limited salience of shadows during perceptual judgments. An adaptive stimulus presentation protocol to first identify shape-selective and blur-selective V4 neurons, followed by an investigation of how these neurons respond to naturalistic images, will give insight into how neurons selective for blur and/or shape participate in the perceptual organization of natural scenes.
We note, however, that the descriptive model presented here
does little to explain how computations encoding shape and blur
arise in vivo. One model of V4 activity, the Spectral Receptive
Field (SRF) model, is a tempting candidate to explain the band-
selective nature of blur tuning, but previous work has shown such
linear combinations of spectral power as incapable of explaining
V4 shape selectivity: SRFs are unable to disambiguate stimuli that
contain identical spectral power, such as any shape and its 180°-
Fig. 8 Response dynamics differ with respect to blur selectivity. a, b PSTH of preferred stimuli for low (blue) to high (red) blur factors for sharp-selective and blur-selective cells. a For the sharp-selective cell, increasing blur decreases neuronal response throughout the stimulus presentation interval. b For the blur-selective cell, blur modulation is transient. c, d Blur modulation (y-axis), calculated as the s.d. of responses with respect to blur, for preferred and non-preferred shape stimuli from example cells in a and b. The difference between the blur modulation timecourse (hatching) of preferred and non-preferred shapes captures the period in time at which blur multiplicatively scales shape-selective responses. e, f Average normalized PSTHs demonstrating distinct response dynamics between these cell groups. g The principal value of blur selectivity (Fig. 2e) plotted as a function of the average difference in blur modulation between preferred and non-preferred shape stimuli across the sustained period (200–300 ms; shaded region of e, f). h For each blur factor, mean latency to half-of-maximum response (half-rise time) for each cell group. Error bars denote s.e.m. A significant increase in response latency (t-test, p = 0.036) occurs across cells selective for intermediate blur (PV > 0.25) between blur factors 0.16 and 0.64 (arrows)
rotated counterpart25. Therefore, further modeling studies are
required to determine how V4 selectivity for shape and blur could
be constructed from upstream populations. Fortunately, the
dynamics of blur selectivity reported here provide key insights
into potential underlying mechanisms. Our results demonstrate
that preference for intermediate levels of blur arise early,
approximately 60 to 100 ms after stimulus onset, comparable to
the time at which shape selectivity arises in V4 (ref. 37). Furthermore,
this activity is transient, lasting until approximately 150 to 200 ms
after stimulus onset. One candidate blur-selective circuit consists
of a simple difference of SF power within V1, where intermediate
spatial frequency responses are inhibited by activity selective for
higher spatial frequencies. This is a markedly different compu-
tation from simple band-pass SF tuning, which would not be
associated with stronger responses for intermediate blur levels,
since blurring never increases SF power within a stimulus. While
such a spectral difference model38 cannot capture shape selec-
tivity of V4 neurons25, computations similar to SRFs may explain
blur tuning, consistent with contrast energy models of blur discrimination17. Documented effects of high-SF gratings inhibiting V1 activity39 (see also refs. 40–45) may underlie these computations, though the extent of such SF-based modulation in naturalistic contexts, e.g., focal blur or illumination shading, remains
unknown. Such blur signals, selective for stimuli containing
strong intermediate-SF and little higher-SF content, could then
bypass V2 to apply fast gain reduction in shape-selective V4 units.
Recurrent inhibition, either within V4 or between V4 and pre-
vious areas, would then suppress the contribution of blur over
sustained periods. Alternatively, tuning for intermediate blur
could arise from latent normalization of V1 activity, including
faster magnocellular inputs, as high-SF responses are removed. It
is unknown, however, if such a normalization-based circuit of
blur tuning would reproduce these observed dynamics. It must be
stressed that these circuits only describe potential mechanisms for
blur tuning: blur-selective activity must then converge upon
shape-selective activity within V4 to produce separable shape and
blur tuning.
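The intuition behind the spectral-difference circuit can be checked numerically. In the hypothetical 1D sketch below (band edges, blur widths, and the bar stimulus are arbitrary choices for illustration), Gaussian blurring only ever removes spectral power, so no single band-pass energy measure can peak at intermediate blur, while a mid-band-minus-high-band difference signal can.

```python
import numpy as np

n = 512
x = np.arange(n)
stim = ((x > 128) & (x < 384)).astype(float)       # 1D bar "stimulus"
f = np.fft.rfftfreq(n)                             # frequency in cycles/sample

def band_power(sigma, lo, hi):
    """Spectral power of the Gaussian-blurred stimulus in a frequency band."""
    spec = np.fft.rfft(stim) * np.exp(-0.5 * (2 * np.pi * f * sigma) ** 2)
    band = (f >= lo) & (f < hi)
    return np.sum(np.abs(spec[band]) ** 2)

sigmas = [0.0, 2.0, 4.0, 8.0, 16.0]                # blur kernel s.d. in samples
mid = np.array([band_power(s, 0.02, 0.06) for s in sigmas])
high = np.array([band_power(s, 0.06, 0.25) for s in sigmas])

# Blurring only removes power, so each band's energy is non-increasing ...
assert np.all(np.diff(mid) <= 1e-9) and np.all(np.diff(high) <= 1e-9)
# ... but a mid-minus-high difference signal can peak at intermediate blur.
diff = mid - high
assert diff.argmax() not in (0, len(sigmas) - 1)
```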
Computational studies have often argued that ideal repre-
sentations within the earliest stages of visual processing are
general-purpose codes, supporting a diversity of tasks, from
which perception can emerge10. Consistent with this argument,
V1 receptive fields, tuned for local orientation46–48 and spatial frequency49,50, form a wavelet-like representation of visual space51. Further, it has been shown that natural scenes can be efficiently decomposed into scale-localized and space-localized Gabor-like bases selective for orientation, remarkably similar to the receptive fields of V1 neurons52,53. Local populations of V1 units therefore form a complete and efficient neural representation of naturalistic scenes54. Beyond V1, however,
sensory representations of higher visual areas are thought to
instead participate in solving specic visual tasks. For example,
face-selective neurons in inferotemporal (IT) cortex facilitate face
recognition55, and border-ownership signals in V2 may underlie figure–ground organization56. Thus, rather than a general-purpose code, representations within each module beyond V1 appear to reflect the computations required to solve well-defined problems. Previous studies of V4 have shown that many neurons explicitly encode the curvature of object boundary fragments,
thought to provide a structural code for complex object shape57.
Our demonstration of tuning for both shape and blur is especially significant since such an encoding framework may provide a sufficient representation of naturalistic scenes10. Therefore, in addition to supporting a neural code for object recognition, V4 may also efficiently encode visual scenes for use in higher visual areas, e.g., IT. Furthermore, while the representations of V1 and V4 may both be complete, single-unit V4 activity, unlike V1, includes an explicit code of object-centric boundary conformation. This interpretation is consistent with V1 encoding 'stuff' and V4 building an intermediate annotation of 'things', both of which are likely prominent in higher ventral computations58. However, the notion of shape and blur underlying a sufficient representation of natural images does not imply that these features are the sole dimensions of selectivity within V4; while higher ventral visual areas like IT may in fact decode the entirety of an image from V4, an overcomplete representation incorporating additional visual features may further benefit complex visual tasks such as object identification or scene categorization.
Methods
Animals and surgery. Two rhesus monkeys (Macaca mulatta, one female and one male) were surgically implanted with custom-built head posts attached to the skull with orthopedic screws. After fixation training, a recording chamber was implanted; a craniotomy (10 mm diameter) was subsequently performed to expose dorsal area V4. See ref. 37 for detailed surgical procedures. All animal procedures
conformed to NIH guidelines and were approved by the Institutional Animal Care
and Use Committee at the University of Washington.
Animals were seated in front of a CRT monitor at a distance of 57 cm and were trained to fixate on a 0.1° white dot within 0.5–0.75° of visual angle for 3–5 s for water reward. Eye position was monitored using a 1 kHz infrared eye-tracking system (Eyelink 1000; SR Research). Stimulus presentation and animal behavior were controlled by customized software PYPE (originally developed in the Gallant Laboratory, University of California, Berkeley, Berkeley, CA). Each trial began with the presentation of a fixation spot at the center of the screen. Once fixation was acquired, four to six stimuli were presented in succession, each for 300 ms,
separated by interstimulus intervals of 200 ms. Stimulus onset and offset times were
based on photodiode detection of synchronized pulses in the lower left corner of
the monitor.
Data collection. During each recording session, a single transdural tungsten
microelectrode was lowered into cortex with an electromechanical microdrive
system (Gray Matter Research). Electrode signals were amplied and single-unit
activity was isolated using online spike sorting (Plexon Systems). Electrode pene-
trations targeted dorsal V4 from structural MRI scans localizing the prelunate
gyrus. Single-unit waveforms that responded briskly to the onset of shape stimuli were identified for further recording. After data collection, spikes were sorted offline with custom software (Plexon Systems) and exported for analysis.
Visual stimulation. For each recorded neuron, we first characterized the preferred RF location, size, luminance contrast and chromaticity with custom shape stimuli
under mouse control. Shape stimuli were presented on an achromatic gray back-
ground of mean luminance 5.4 cd/m2. Foreground luminance was chosen from
four values (2.7, 5.4, 8.1, or 12.1 cd/m2) that were darker, equiluminant, or brighter
than the background; chromaticity was selected from 25 gamma-corrected hues
spanning the CIE color space58. Next, we assessed shape selectivity with a standard
set of 366 shape stimuli generated by rotating 51 shapes (Fig. 1a) by increments of
45° (Fig. 1b), and discarding duplicates due to radial symmetry. The design of these
stimuli is described in detail elsewhere22. All stimuli were presented in the center of
the RF and were scaled such that all parts of the stimuli were within the estimated
RF of the cell: the largest shape stimulus typically had outermost edges at a distance
of 75% RF radius. Stimuli were presented in random order without replacement
with three repeats per stimulus.
To assess how stimulus blur influences V4 responses, we identified 5–8 shape stimuli that evoked a range of responses, from weak (non-preferred) to strong (preferred), during the shape screen described above; we then studied responses to these shapes subjected to different levels of blur. As V4 neurons respond selectively to shape orientation22, it was often the case that preferred and non-preferred stimuli were chosen to be 180° rotation pairs of the same shape. This had the added benefit of controlling for spectral content, since such stimuli have identical spectral power25. Additionally, neutral curvature (circular) stimuli were also included for most neurons. Each of the chosen shapes was presented at up to 9 blur factors along an approximately exponential scale, i.e., β ∈ {0.005, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.48, 0.64} (Fig. 1e). Stimuli were blurred by convolving the discretized raster image with a circular 2D Gaussian blur kernel. The kernel standard deviation, denoted by a blur factor β, is written in units relative to the radius of the large circle (Fig. 1a, black arrow). Due to the limited color gamut of the display, dithering was employed by noising each pixel with zero-mean Gaussian noise with s.d. of two 8-bit greylevels. The resulting pixel intensities were linearly interpolated between background luminance and preferred color, rounded to the nearest calibrated RGB values. Finally, to prevent aliasing, stimuli were down-sampled by a factor of two. During the shape screen to assess shape selectivity, sharp stimuli were presented under a minimal blur factor of β = 0.005. Blur stimuli were randomly chosen without replacement with a median of 20 repetitions.
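The blurring step can be sketched as a direct 2D convolution; the kernel truncation at ±3 s.d., the image sizes, and the disk test shape are implementation choices of this sketch, and the dithering and down-sampling steps are omitted.

```python
import numpy as np

def blur_stimulus(img, beta, circle_radius_px):
    """Convolve a raster stimulus with a circular 2D Gaussian kernel whose
    s.d. is the blur factor beta times a reference circle radius (pixels)."""
    sigma = max(beta * circle_radius_px, 1e-6)
    half = max(int(np.ceil(3 * sigma)), 1)       # truncate kernel at +/- 3 s.d.
    ky, kx = np.mgrid[-half:half + 1, -half:half + 1]
    kernel = np.exp(-(kx ** 2 + ky ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()                       # unit mass: flat regions unchanged
    pad = np.pad(img, half, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(2 * half + 1):               # direct convolution, for clarity
        for dx in range(2 * half + 1):
            out += kernel[dy, dx] * pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

# A disk stimulus: heavier blur lowers the mean interior intensity (cf. Fig. 5a).
n = 64
yy, xx = np.mgrid[:n, :n]
disk = (((xx - 32) ** 2 + (yy - 32) ** 2) <= 20 ** 2).astype(float)
interior = disk > 0.5
sharp = blur_stimulus(disk, 0.005, 20)
heavy = blur_stimulus(disk, 0.32, 20)
assert sharp[interior].mean() > heavy[interior].mean()
```

This interior-intensity drop with blur is exactly the confound the intensity-matched controls of Fig. 5 are designed to rule out.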
Control experiments. On a subset of cells, we conducted control experiments to
evaluate whether preferred responses to intermediate blur factors could be
explained on the basis of selectivity for stimulus size or stimulus contrast. Because
stimulus blur increases the number of pixels distinct from the background (Fig. 4a),
preference for a specific level of blur could represent preference for stimulus size. If this were the case, then blur preference would depend on the absolute size of the stimulus. To control for this, we presented blurred stimuli at multiple sizes and
asked whether blur preference depended on the size of the stimulus. Size control
stimuli were generated from up to three exemplar blur factors, consisting of the
extremal blur levels β=0.005 and 0.64, along with an intermediate factor, typically
the blur factor that evoked the strongest responses from the neuron. Shape stimuli
at each of the blur factors were resized with scaling factors of 0.9 and 1.1, i.e., scaled
by ±10%. These factors were chosen for their visual correspondence to stimuli
subjected to intermediate blur factors of 0.08 and 0.16, i.e., Fig. 4a, and approx-
imate contours generated from luminance thresholds at 1/3 and 2/3 of maximum
stimulus intensity under blurring of β = 0.16. Each stimulus was presented randomly with 10 to 20 repeats. A model-based analysis of size control data was performed in two ways. First, for each neuron an APC model was fit to the full set of shape data used for shape selectivity characterization. Contours of scaled and blurred stimuli were then constructed via level-set contours as described for the curvature modification analysis (see Fig. 6 for details). Fitted APC models were then used to predict responses to size control stimuli. For the overwhelming majority of cells, predictions of size control responses fell significantly below fit performance across shape selectivity data, suggesting that responses to scaled and blurred stimuli cannot be explained on the basis of selectivity for boundary curvature alone. In a second analysis we evaluated the ability of size or blur information to explain the variance of responses to control data. Here, for each neuron we fit APC×B and APC×S models to responses to scaled and blurred stimuli, where S is a Gaussian tuning function of size computed from the arc length of a stimulus's level-set contour. Again, responses were overwhelmingly better fit across our population by the joint shape- and blur-selective APC×B model, despite the APC×S model having an identical number of free parameters and functional form.
To evaluate whether preference for an intermediate level of blur could be explained on the basis of preference for average stimulus contrast, we also studied responses to contrast control stimuli generated by first computing mean stimulus intensity across the interior of a blurred shape stimulus, where the shape interior is defined by the half-contrast level set. Then, for each blur factor, a control stimulus was generated with a foreground intensity equal to the mean intensity within this level set (Fig. 5). Control stimuli were subjected to only the minimal blur factor of β = 0.005, resulting in stimuli with sharp boundaries of reduced figure–ground contrast.
Analysis and model fitting. Neural responses to individual stimuli were calculated
as the mean firing rate observed during stimulus presentation, 300 ms in duration
with a 50 ms lag relative to onset, averaged across repeats. Peristimulus time
histograms were computed for each stimulus by filtering spike rasters with a
centered (noncausal) decaying exponential filter consistent with a membrane's
integration time constant (37 ms).
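A minimal sketch of the PSTH computation, assuming the "centered (noncausal) decaying exponential" is a symmetric two-sided kernel exp(−|t|/τ) with τ = 37 ms; the function and argument names are ours, not the authors' code.

```python
import numpy as np

def psth(spike_raster, dt_ms=1.0, tau_ms=37.0):
    """Estimate a PSTH by convolving the trial-averaged spike raster with a
    centered (noncausal) decaying-exponential kernel, exp(-|t| / tau).

    spike_raster : (n_trials, n_bins) array of spike counts per dt_ms bin.
    """
    rate = spike_raster.mean(axis=0) / (dt_ms / 1000.0)  # spikes/s per bin
    half = int(np.ceil(5 * tau_ms / dt_ms))              # truncate at 5 tau
    t = np.arange(-half, half + 1) * dt_ms
    kernel = np.exp(-np.abs(t) / tau_ms)
    kernel /= kernel.sum()  # unit area, so mean firing rate is preserved
    return np.convolve(rate, kernel, mode="same")
```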
Preferred-shape blur tuning curves (see Figs. 3 and 5) were constructed by first
identifying preferred shapes, i.e., sharp stimuli (β = 0.005) that evoked a response
greater than the mean across shapes (24 preferred shapes per cell), and averaging
across preferred shapes for each blur factor. A cubic spline was then fit to average
preferred-shape responses. Center-of-mass (CoM) was calculated by integrating
preferred tuning curves across blur factors in log space (or similarly across
intensity-matched factors during the intensity control analysis), and returning the
median cumulative factor. Significance of CoM measurements from intensity-matched
controls was determined by bootstrapped estimates of tuning curves sampled from
response distributions of recorded means and variances for each stimulus (100
repetitions) under a two-way t-test of unequal variance.
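The CoM computation can be sketched as follows: integrate the tuning curve over log blur factor and return the blur factor at which the cumulative integral reaches half its total (the median cumulative factor). This is an illustration of the described analysis under a trapezoidal-integration assumption, not the authors' code.

```python
import numpy as np

def center_of_mass(blur_factors, responses):
    """Center-of-mass of a blur tuning curve: the blur factor at which the
    cumulative integral of the curve (taken in log blur space) reaches half
    of its total, i.e., the median cumulative factor."""
    x = np.log(np.asarray(blur_factors, dtype=float))
    r = np.asarray(responses, dtype=float)
    # Cumulative trapezoidal integral of the tuning curve in log space.
    seg = 0.5 * (r[1:] + r[:-1]) * np.diff(x)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    half = cum[-1] / 2.0
    return float(np.exp(np.interp(half, cum, x)))
```

For a flat tuning curve over log-spaced blur factors, the CoM falls at the geometric midpoint of the tested range; a curve peaked at high blur pulls the CoM upward, which is what the CoM comparison between blur and contrast-control curves detects.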
The APC model has been shown to accurately capture shape tuning properties
of single-unit V4 spike-rate responses22,25. Here, each shape stimulus is
represented by 48 points in the space of angular position θ and curvature κ,
corresponding to locations of curvature inflection along the contour. In particular,
a stimulus Γ = (γ^1, γ^2, …, γ^n) is represented by n critical points
γ^i = (γ^i_θ, γ^i_κ). The model predicts responses to shape stimuli by evaluating a
Gaussian energy function (von Mises in periodic angular position) at each of these
points, and returning the maximum. When applicable, Γ is augmented with a blur
factor γ_β proportional to the kernel size of a Gaussian-blurred shape stimulus.
Thus,

$$\mathrm{APC}(\Gamma;\,\omega,\alpha,\mu_\theta,\sigma_\theta,\mu_\kappa,\sigma_\kappa) = \omega + \alpha \exp\!\left( \frac{\cos(\gamma^i_\theta - \mu_\theta)}{\sigma_\theta} - \frac{(\gamma^i_\kappa - \mu_\kappa)^2}{\sigma_\kappa^2} \right) \qquad (1)$$

describes a model over dimensions of angular position and curvature. Note that the
tuning peak μ and width σ are represented along each selectivity dimension θ and
κ. Further, the baseline parameter ω captures spontaneous activity in the absence of
stimulation, and the gain α is fit to produce maximal responses for preferred stimuli.
We extend the APC model to also predict neural responses as a function of
boundary blur. Here, blur selectivity is modeled as Gaussian in the logarithm of
blur factors, i.e.,

$$B(\Gamma;\,\mu_\beta,\sigma_\beta) = \exp\!\left( -\frac{(\log(\gamma_\beta) - \mu_\beta)^2}{\sigma_\beta^2} \right). \qquad (2)$$
Again, note the preferred blur factor μ_β and blur tuning width σ_β. Every model is
fit to minimize squared error between the mean firing rate $\vec{r}$ evoked by
stimuli $\vec{\Gamma}$, averaged across repetitions, and the responses predicted by
the model. For example, a fit Θ* of neural data to an angular position, curvature,
and blur (APC×B) model is written

$$\Theta^{*} = \arg\min_{\Theta} \left\| \max_i \left[ \mathrm{APC}(\vec{\Gamma};\Theta)\, B(\vec{\Gamma};\Theta) \right] - \vec{r} \right\|^2, \qquad (3)$$
such that the model predicts a response to any stimulus as the maximum over each
critical point γ^i evaluated under that model. The optimal model therefore
minimizes the $L_2$ norm between recorded $\vec{r}$ and predicted responses for
stimulus set $\vec{\Gamma}$. Model fitting is complicated by the fact that this
optimization is highly non-convex. While standard gradient descent methods are
quick to converge, solutions are typically only locally optimal: we employ a
repeated randomized initialization procedure to approximate globally optimal fits,
described elsewhere25.
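The APC×B prediction and its random-restart fit can be sketched as below. This is a hedged illustration of Eqs. (1)–(3), not the authors' implementation: parameter packing, initialization ranges, and the use of Nelder–Mead via `scipy.optimize.minimize` are our assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def apcxb_response(points, log_blur, theta):
    """Predicted APC×B response: evaluate the shape energy (Eq. 1) at every
    critical point (angular position, curvature), take the maximum, and
    scale by the Gaussian blur term (Eq. 2). `theta` packs
    (omega, alpha, mu_t, sig_t, mu_k, sig_k, mu_b, sig_b); names are ours."""
    omega, alpha, mu_t, sig_t, mu_k, sig_k, mu_b, sig_b = theta
    ang, cur = points[:, 0], points[:, 1]
    shape = omega + alpha * np.exp(
        np.cos(ang - mu_t) / sig_t - (cur - mu_k) ** 2 / sig_k ** 2)
    blur = np.exp(-(log_blur - mu_b) ** 2 / sig_b ** 2)
    # blur is a nonnegative scalar, so max_i(APC_i * B) = max_i(APC_i) * B.
    return shape.max() * blur

def fit_apcxb(stimuli, rates, n_restarts=20, seed=0):
    """Least-squares fit with repeated randomized initialization, since the
    objective is highly non-convex and a single descent run typically finds
    only a local optimum. `stimuli` is a list of (points, log_blur) pairs."""
    rng = np.random.default_rng(seed)

    def sse(theta):
        if min(theta[3], theta[5], theta[7]) <= 0:
            return 1e12  # guard: tuning widths must stay positive
        pred = np.array([apcxb_response(p, b, theta) for p, b in stimuli])
        return ((pred - rates) ** 2).sum()

    best = None
    for _ in range(n_restarts):
        theta0 = rng.uniform([0, 1, -np.pi, 0.1, -1, 0.1, -6, 0.1],
                             [5, 50, np.pi, 2.0, 1, 1.0, 0, 3.0])
        res = minimize(sse, theta0, method="Nelder-Mead",
                       options={"maxiter": 2000, "xatol": 1e-6})
        if best is None or res.fun < best.fun:
            best = res
    return best.x
```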
Model selection is performed using leave-one-out cross-validation. APC and
APC×B models were trained on all but one blurred stimulus and then used to predict
the held-out response. This procedure was repeated for all blurred stimuli for each
cell. Training and testing error were computed by averaging NRMSE over every
training session (i.e., over every stimulus in the blur data set, independently for
each neuron). To measure the significance of validation performance for each cell
we estimated the distribution of average hold-out prediction error for the APC×B
model relative to the APC model. Significance was determined via a paired t-test of
relative testing error, paired across blurred stimuli for each neuron.
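The leave-one-out loop is generic and can be sketched independently of the model; `fit_fn`, `predict_fn`, and `error_fn` are placeholders for any model interface (e.g., the APC or APC×B fits), not names from the paper.

```python
import numpy as np

def loocv_error(stimuli, rates, fit_fn, predict_fn, error_fn):
    """Leave-one-out cross-validation: for each stimulus, fit the model on
    all other stimuli, predict the held-out response, then score all
    hold-out predictions against the recorded rates."""
    n = len(stimuli)
    preds = np.empty(n)
    for i in range(n):
        train = [s for j, s in enumerate(stimuli) if j != i]
        train_r = np.delete(rates, i)
        params = fit_fn(train, train_r)
        preds[i] = predict_fn(stimuli[i], params)
    return error_fn(preds, rates)
```

Running this once with the APC model and once with the APC×B model, per neuron, yields the paired hold-out errors that the t-test above compares.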
For each blurred stimulus, we computed the boundary contours by binary-
thresholding images at a range of intensity levels. These contours were then
represented as points in the 2D space of curvature and angular position. Each
shape and its blurred counterparts were represented by the same set of angular
position values, but the curvature values were reduced under blur (Fig. 6a, b).
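The mapping from a level-set contour to (angular position, curvature) pairs can be sketched with the standard parametric curvature formula κ = (x′y″ − y′x″) / (x′² + y′²)^{3/2}, with angular position measured from the contour's center of mass. This is our illustration of the representation described in the text, not the authors' code.

```python
import numpy as np

def contour_to_apc_points(x, y):
    """Represent a closed contour as (angular position, curvature) pairs.

    x, y : arrays of contour coordinates, sampled around the boundary.
    Angular position is taken relative to the contour's center of mass;
    curvature uses the parametric formula with finite-difference derivatives.
    """
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    ang = np.arctan2(y - y.mean(), x - x.mean())
    dx, dy = np.gradient(x), np.gradient(y)        # first derivatives
    ddx, ddy = np.gradient(dx), np.gradient(dy)    # second derivatives
    kappa = (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5
    return np.column_stack([ang, kappa])
```

Applying this to contours thresholded from progressively blurrier images leaves the angular positions unchanged while shrinking the curvature values, which is the effect illustrated in Fig. 6a, b.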
To compare model fit performance we measure prediction accuracy on a subset
of shape stimuli. For each stimulus, the set of critical points along the contour was
evaluated under a trained APC model to produce a response prediction $\vec{p}$. We
compute the normalized root-mean-squared error (NRMSE) between the predicted
and recorded responses $\vec{r}$, i.e.,

$$\mathrm{NRMSE} = \frac{\| \vec{p} - \vec{r} \|_2}{\max(\vec{r}) - \min(\vec{r})}. \qquad (4)$$
NRMSE training estimates were computed via bootstrapping: a random subset of
shape stimuli was selected, equal in number to the number of blurred stimuli
recorded, and the NRMSE was again calculated between the best-fitting APC model
prediction and the responses elicited by sharp stimuli from the entire shape set.
Data availability. The data and analysis code that support the findings of this
study are available from the corresponding author upon reasonable request.
Received: 20 January 2017 Accepted: 30 November 2017
References
1. Attneave, F. Some informational aspects of visual perception. Psychol. Rev. 61, 183–193 (1954).
2. Geisler, W. S., Perry, J. S., Super, B. J. & Gallogly, D. P. Edge co-occurrence in natural images predicts contour grouping performance. Vision Res. 41, 711–724 (2001).
3. Ikeuchi, K. & Horn, B. K. Numerical shape from shading and occluding boundaries. Artif. Intell. 17, 141–184 (1981).
4. Fleming, R. W., Holtmann-Rice, D. & Bulthoff, H. H. Estimation of 3D shape from image orientations. PNAS 108, 20438–20443 (2011).
5. Wagemans, J. et al. A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure–ground organization. Psychol. Bull. 138, 1172–1217 (2012).
6. Hanazawa, A. & Komatsu, H. Influence of the direction of elemental luminance gradients on the responses of V4 cells to textured surfaces. J. Neurosci. 21, 4490–4497 (2001).
7. Cadieu, C. et al. A model of V4 shape selectivity and invariance. J. Neurophysiol. 98, 1733–1750 (2007).
8. Tao, X. et al. Local sensitivity to stimulus orientation and spatial frequency within the receptive fields of neurons in visual area 2 of macaque monkeys. J. Neurophysiol. 107, 1094–1110 (2012).
NATURE COMMUNICATIONS | (2018) 9:466 | DOI: 10.1038/s41467-017-02438-8 | www.nature.com/naturecommunications
9. Drewes, J., Goren, G., Zhu, W. & Elder, J. H. Recurrent processing in the formation of shape percepts. J. Neurosci. 36, 185–192 (2016).
10. Elder, J. H. Are edges incomplete? Int. J. Comput. Vision 34, 97–122 (1999).
11. Held, R. T., Cooper, E. A. & Banks, M. S. Blur and disparity are complementary cues to depth. Curr. Biol. 22, 426–431 (2012).
12. Burge, J. & Geisler, W. S. Optimal disparity estimation in natural stereo images. J. Vis. 14, 1 (2014).
13. Elder, J. H. & Velisavljević, L. Cue dynamics underlying rapid detection of animals in natural scenes. J. Vis. 9, 7 (2009).
14. Rensink, R. A. & Cavanagh, P. The influence of cast shadows on visual search. Perception 33, 1339–1358 (2004).
15. Sebastian, S., Burge, J. & Geisler, W. S. Defocus blur discrimination in natural images with natural optics. J. Vis. 15, 16 (2015).
16. Watt, R. & Morgan, M. A theory of the primitive spatial code in human vision. Vision Res. 25, 1661–1674 (1985).
17. Watson, A. B. & Ahumada, A. J. Blur clarified: a review and synthesis of blur discrimination. J. Vis. 11, 10 (2011).
18. Georgeson, M. A., May, K. A., Freeman, T. C. A. & Hesse, G. S. From filters to features: scale space analysis of edge and blur coding in human vision. J. Vis. 7, 7 (2007).
19. Gallant, J., Connor, C. E., Rakshit, S., Lewis, J. W. & Van Essen, D. C. Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. J. Neurophysiol. 76, 2718–2739 (1996).
20. Kobatake, E. & Tanaka, K. Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. J. Neurophysiol. 71, 856–867 (1994).
21. Carlson, E. T., Rasquinha, R. J., Zhang, K. & Connor, C. E. A sparse object coding scheme in area V4. Curr. Biol. 21, 288–293 (2011).
22. Pasupathy, A. & Connor, C. E. Shape representation in area V4: position-specific tuning for boundary conformation. J. Neurophysiol. 86, 2505–2519 (2001).
23. Bushnell, B. N., Harding, P. J., Kosai, Y., Bair, W. & Pasupathy, A. Equiluminance cells in visual cortical area V4. J. Neurosci. 31, 12398–12412 (2011).
24. Mokhtarian, F. & Mackworth, A. Scale-based description and recognition of planar curves and two-dimensional shapes. IEEE T. Pattern Anal. 8, 34–43 (1986).
25. Oleskiw, T. D., Pasupathy, A. & Bair, W. Spectral receptive fields do not explain tuning for boundary curvature in V4. J. Neurophysiol. 112, 2114–2122 (2014).
26. Kosai, Y., El-Shamayleh, Y., Fyall, A. M. & Pasupathy, A. The role of visual area V4 in the discrimination of partially occluded shapes. J. Neurosci. 34, 8570–8584 (2014).
27. El-Shamayleh, Y. & Pasupathy, A. Contour curvature as an invariant code for objects in visual area V4. J. Neurosci. 36, 5532–5543 (2016).
28. Foster, K. H., Gaska, J. P., Nagler, M. & Pollen, D. A. Spatial and temporal frequency selectivity of neurones in visual cortical areas V1 and V2 of the macaque monkey. J. Physiol. 365, 331–363 (1985).
29. Kumano, H., Tanabe, S. & Fujita, I. Spatial frequency integration for binocular correspondence in macaque area V4. J. Neurophysiol. 99, 402–408 (2008).
30. Attwood, C. I., Harris, J. P. & Sullivan, G. D. Learning to search for visual targets defined by edges or by shading: evidence for non-equivalence of line drawings and surface representations. Vis. Cogn. 8, 751–767 (2001).
31. Ramachandran, V. S. Perception of shape from shading. Nature 331, 163–166 (1988).
32. Sun, J. Y. & Perona, P. Preattentive perception of elementary three-dimensional shapes. Vision Res. 36, 2515–2529 (1996).
33. Allen, B. P. Shadows as sources of cues for distance of shadow-casting objects. Percept. Mot. Skills 89, 571–584 (1999).
34. Hubona, G. S., Shirah, G. W. & Jennings, D. K. The effects of cast shadows and stereopsis on performing computer-generated spatial tasks. IEEE Sys. Man Cybern. 34, 483–493 (2004).
35. Elder, J. H., Trithart, S., Pintilie, G. & MacLean, D. Rapid processing of cast and attached shadows. Perception 33, 1319–1338 (2004).
36. Porter, G., Tales, A. & Leonards, U. What makes cast shadows hard to see? J. Vis. 10, 1–18 (2010).
37. Bushnell, B. N., Harding, P. J., Kosai, Y. & Pasupathy, A. Partial occlusion modulates contour-based shape encoding in primate area V4. J. Neurosci. 31, 4012–4024 (2011).
38. David, S., Hayden, B. & Gallant, J. Spectral receptive field properties explain shape selectivity in area V4. J. Neurophysiol. 96, 3492–3505 (2006).
39. De Valois, K. K. & Tootell, R. B. H. Spatial-frequency-specific inhibition in cat striate cortex cells. J. Physiol. 336, 359–376 (1983).
40. DeAngelis, G. C., Robson, J. G., Ohzawa, I. & Freeman, R. D. Organization of suppression in receptive fields of neurons in cat visual cortex. J. Neurophysiol. 68, 144–163 (1992).
41. Mazer, J. A., Vinje, W. E., McDermott, J., Schiller, P. H. & Gallant, J. L. Spatial frequency and orientation tuning dynamics in area V1. PNAS 99, 1645–1650 (2002).
42. Bredfeldt, C. E. & Ringach, D. L. Dynamics of spatial frequency tuning in macaque V1. J. Neurosci. 22, 1976–1984 (2002).
43. Ninomiya, T., Sanada, T. M. & Ohzawa, I. Contributions of excitation and suppression in shaping spatial frequency selectivity of V1 neurons as revealed by binocular measurements. J. Neurophysiol. 107, 2220–2231 (2012).
44. Nolt, M. J., Kumbhani, R. D. & Palmer, L. A. Suppression at high spatial frequencies in the lateral geniculate nucleus of the cat. J. Neurophysiol. 98, 1167–1180 (2007).
45. Webb, B. S., Dhruv, N. T., Solomon, S. G., Tailby, C. & Lennie, P. Early and late mechanisms of surround suppression in striate cortex of macaque. J. Neurosci. 25, 11666–11675 (2005).
46. Hubel, D. H. & Wiesel, T. N. Receptive fields of single neurones in the cat's striate cortex. J. Physiol. 148, 574–591 (1959).
47. Hubel, D. H. & Wiesel, T. N. Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat. J. Neurophysiol. 28, 229–289 (1965).
48. Hubel, D. H. & Wiesel, T. N. Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195, 215–243 (1968).
49. Movshon, J. A., Thompson, I. D. & Tolhurst, D. J. Spatial summation in the receptive fields of simple cells in the cat's striate cortex. J. Physiol. 283, 53–77 (1978).
50. Albrecht, D. G., De Valois, R. L. & Thorell, L. G. Visual cortical neurons: are bars or gratings the optimal stimuli? Science 207, 88–90 (1980).
51. Olshausen, B. A. & Field, D. J. Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Res. 37, 3311–3325 (1997).
52. Bell, A. J. & Sejnowski, T. J. The 'independent components' of natural scenes are edge filters. Vision Res. 37, 3327–3338 (1997).
53. Zylberberg, J., Murphy, J. T. & DeWeese, M. R. A sparse coding model with synaptically local plasticity and spiking neurons can account for the diverse shapes of V1 simple cell receptive fields. PLoS Comput. Biol. 7, e1002250 (2011).
54. Olshausen, B. A. & Field, D. J. Natural image statistics and efficient coding. Network 7, 333–339 (1996).
55. Tsao, D. Y., Freiwald, W. A., Knutsen, T. A., Mandeville, J. B. & Tootell, R. B. H. Faces and objects in macaque cerebral cortex. Nat. Neurosci. 6, 989–995 (2003).
56. Zhou, H., Friedman, H. S. & von der Heydt, R. Coding of border ownership in monkey visual cortex. J. Neurosci. 20, 6594–6611 (2000).
57. Kourtzi, Z. & Connor, C. E. Neural representations for object perception: structure, category, and adaptive coding. Annu. Rev. Neurosci. 34, 45–67 (2011).
58. Adelson, E. H. & Bergen, J. R. in Computational Models of Visual Processing (eds Landy, M. S. & Movshon, J. A.) 3–20 (The MIT Press, Cambridge, MA, USA, 1991).
Acknowledgements
We thank W. Bair and J. Elder for helpful discussions and comments on the
manuscript and A. Fyall for assistance with animal training and data collection.
Technical support was provided by the Bioengineering group at the Washington
National Primate Research Center. This work was funded by NEI grant R01EY018839 to
A.P., Vision Core grant P30EY01730 to the University of Washington, P51 grant
OD010425 to the Washington National Primate Research Center, a Natural Sciences
and Engineering Research Council of Canada PGS-D to T.D.O., and a University of
Washington Computational Neuroscience Training Grant to T.D.O.
Author contributions
T.D.O. and A.P. contributed experiment design, analysis methodology, and manuscript
preparation; T.D.O. and A.N. contributed data collection. T.D.O. contributed data
analysis and modeling. All authors contributed to interpretation of findings and
manuscript revisions.
Additional information
Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467-
017-02438-8.
Competing interests: The authors declare no competing financial interests.
Reprints and permission information is available online at http://npg.nature.com/
reprintsandpermissions/
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article's Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this license, visit http://creativecommons.org/
licenses/by/4.0/.
© The Author(s) 2018
1.
2.
3.
4.
5.
6.
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
apply.
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
not:
use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access
control;
use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is
otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in
writing;
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal
content.
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at
onlineservice@springernature.com
... In contrast to simple artificial stimuli, natural images can vary in many features, and these features are jointly encoded by the responses of populations of neurons in visual cortex [11][12][13][14][15][16] . It is well known that neurons in visual cortex are tuned for various features 4,17 , so a single neuron's response may not allow unique identification of multiple image feature values. ...
... It is possible that other feature pairs that we did not study, such as purely lateral shifts in background objects relative to the central object, may interact either in behavior or neurally. In recent years, many studies have demonstrated that neural networks trained to categorize natural images produce representations that strongly resemble neural representations in the ventral visual stream 12,15,[29][30][31] . These models provide an opportunity to understand the conditions under which aspects of natural stimuli are most likely to be represented orthogonally and which aspects might best be targeted in future studies to probe potential failures of orthogonality 25,32-37 . ...
Article
Full-text available
In natural visually guided behavior, observers must separate relevant information from a barrage of irrelevant information. Many studies have investigated the neural underpinnings of this ability using artificial stimuli presented on blank backgrounds. Natural images, however, contain task-irrelevant background elements that might interfere with the perception of object features. Recent studies suggest that visual feature estimation can be modeled through the linear decoding of task-relevant information from visual cortex. So, if the representations of task-relevant and irrelevant features are not orthogonal in the neural population, then variation in the task-irrelevant features would impair task performance. We tested this hypothesis using human psychophysics and monkey neurophysiology combined with parametrically variable naturalistic stimuli. We demonstrate that (1) the neural representation of one feature (the position of an object) in visual area V4 is orthogonal to those of several background features, (2) the ability of human observers to precisely judge object position was largely unaffected by those background features, and (3) many features of the object and the background (and of objects from a separate stimulus set) are orthogonally represented in V4 neural population responses. Our observations are consistent with the hypothesis that orthogonal neural representations can support stable perception of object features despite the richness of natural visual scenes. Supplementary Information The online version contains supplementary material available at 10.1038/s41598-025-88910-8.
... Recent work has further revealed that defocus blur provides an important cue for depth perception 44 . Neurophysiological studies have also found that shapeselective neurons in the visual cortex can be tuned to varying degrees of blur, with some neurons preferring blurry over clear depictions of 2D object shapes 45 . Thus, blur appears to be an important feature that is encoded by the visual system. ...
... We reasoned that the existing gap between CNN models and biological visual systems 5,19,22,25,31,47,50,61,62 may be ascribed, at least in part, to inadequate diversity in the set of images that are commonly used to train CNNs. In particular, we hypothesized that blur may be a critical property of natural vision 34,44,45 that contributes to the development and maintenance of robust visual systems. Although we and others have previously posited that exposure to blurry visual input may have the potential to confer some robustness to biological or artificial visual systems, the evidence to support this notion so far has been mixed 6,52,63-65 . ...
Article
Full-text available
Whenever a visual scene is cast onto the retina, much of it will appear degraded due to poor resolution in the periphery; moreover, optical defocus can cause blur in central vision. However, the pervasiveness of blurry or degraded input is typically overlooked in the training of convolutional neural networks (CNNs). We hypothesized that the absence of blurry training inputs may cause CNNs to rely excessively on high spatial frequency information for object recognition, thereby causing systematic deviations from biological vision. We evaluated this hypothesis by comparing standard CNNs with CNNs trained on a combination of clear and blurry images. We show that blur-trained CNNs outperform standard CNNs at predicting neural responses to objects across a variety of viewing conditions. Moreover, blur-trained CNNs acquire increased sensitivity to shape information and greater robustness to multiple forms of visual noise, leading to improved correspondence with human perception. Our results provide multi-faceted neurocomputational evidence that blurry visual experiences may be critical for conferring robustness to biological visual systems.
... Neurons in V4 are selective for complex shapes (Kobatake and Tanaka, 1994;Connor, 1999, 2001) and lesions to area V4 profoundly disrupt form-processing behaviors (Merigan, 1996). Many neurons in V4 are selective to the sharpness of object edges (Oleskiw et al., 2018) and monocular cues for three-dimensional shape (Srinath et al., 2021), both of which are disrupted by image scrambling. ...
... This is also consistent with many other studies that have described slowly-developing signals within V4. Some stimulus-driven properties, such as selectivity for complex contours (Yau et al., 2013) or shapes with blurred edges (Oleskiw et al., 2018), emerge more quickly than scrambled image modulation. Others, like the V4 population's selectivity to complex, perceptually-salient features of texture, emerge over a time course (Kim et al., 2022) similar to that of scrambled image modulation. ...
Preprint
Full-text available
Humans and monkeys can rapidly recognize objects in everyday scenes. While it is known that this ability relies on neural computations in the ventral stream of visual cortex, it is not well understood where this computation first arises. Previous work suggests selectivity for object shape first emerges in area V4. To explore the mechanisms of this selectivity, we generated a continuum of images between "scrambled" textures and photographic images of both natural and man-made environments, using techniques that preserve the local statistics of the original image while discarding information about scene and shape. We measured image responses from single units in area V4 from two awake macaque monkeys. Neuronal populations in V4 could reliably distinguish photographic from scrambled images, could more reliably discriminate between photographic images than between scrambled images, and responded with greater dynamic range to photographic images than scrambled images. Responses to partially scrambled images were more similar to fully scrambled responses than photographic responses, even for perceptually subtle changes. This same pattern emerged when these images were analyzed with an image-computable similarity metric that predicts human judgements of image degradation (DISTS - Deep Image Structure and Texture Similarity). Finally, analysis of response dynamics showed that sensitivity to differences between photographic and scrambled responses grew slowly, peaked 190 ms after response onset, and persisted for hundreds of milliseconds following response offset, suggesting that this signal may arise from recurrent mechanisms.
... Recent work has further revealed that defocus blur provides an important cue for depth perception 44 . Neurophysiological studies have also found that shape-selective neurons in the visual cortex can be tuned to varying degrees of blur, with some neurons preferring blurry over clear depictions of 2D object shapes 45 . Thus, blur appears to be an important feature that is encoded by the visual system. ...
... We reasoned that the existing gap between CNN models and biological visual systems 5,19,22,25,31,47,50,59,60 may be ascribed, at least in part, to inadequate diversity in the set of images that are commonly used to train CNNs. In particular, we hypothesized that blur may be a critical property of natural vision 34,44,45 that contributes to the development and maintenance of robust visual systems. Although we and others have previously posited that exposure to blurry visual input may have the potential to confer some robustness to biological or artificial visual systems, the evidence to support this notion so far has been mixed 6, 52, 61, 62, 63 . ...
Preprint
Full-text available
Whenever a visual scene is cast onto the retina, much of it will appear degraded due to poor resolution in the periphery; moreover, optical defocus can cause blur in central vision. However, the pervasiveness of blurry or degraded input is typically overlooked in the training of convolutional neural networks (CNNs). We hypothesized that the absence of blurry training inputs may cause CNNs to rely excessively on high spatial frequency information for object recognition, thereby causing systematic deviations from biological vision. We evaluated this hypothesis by comparing standard CNNs with CNNs trained on a combination of clear and blurry images. We show that blur-trained CNNs outperform standard CNNs at predicting neural responses to objects across a variety of viewing conditions. Moreover, blur-trained CNNs acquire increased sensitivity to shape information and greater robustness to multiple forms of visual noise, leading to improved correspondence with human perception. Our results provide novel neurocomputational evidence that blurry visual experiences are very important for conferring robustness to biological visual systems.
... This curvature tuning is robust to changes in position (Pasupathy and Connor, 1999), scale (El-Shamayleh and Pasupathy, 2016), and color (Bushnell et al., 2011). The tuning o V4 neurons or combinations o shape and texture (Kim et al., 2019), or combinations o shape and blur (Oleskiw et al., 2018) is largely separable, suggesting that decoders constructed according to the principles we have employed here could be used to represent these and other stimulus dimensions o interest and relevance to behavior. ...
Preprint
Full-text available
Sensory stimuli vary across a variety of dimensions, like contrast, orientation, or texture. The brain must rely on population representations to disentangle changes in one dimension from changes in another. To understand how the visual system might extract separable stimulus representations, we recorded multiunit neuronal responses to texture images varying along two dimensions: contrast, a property represented as early as the retina, and naturalistic statistical structure, a property that modulates neuronal responses in V2 and V4, but not in V1. We measured how sites in these three cortical areas responded to variation in both dimensions. Contrast modulated responses in all areas. In V2 and V4, the presence of naturalistic structure both modulated responses and increased contrast sensitivity. Tuning for naturalistic structure was strongest in V4; tuning in both dimensions was most heterogeneous in V4. We measured how well populations in each area could support the linear readout of both dimensions. Populations in V2 and V4 could support the linear readout of naturalistic structure, but only in V4 did we find evidence for a robust representation that was contrast-invariant.
Significance Statement: Single neurons in visual cortex respond selectively to multiple stimulus dimensions, so signals from single neurons cannot distinguish changes in one dimension from changes in another. We measured responses from simultaneously recorded neural populations in three hierarchically linked visual areas – V1, V2, and V4 – using texture stimuli that varied in two dimensions, contrast and naturalistic image structure. We used linear decoding methods to extract information about each dimension. In all three areas, contrast could be decoded independently of image structure. Only in V4, however, could image structure be decoded independently of contrast. The reason is that selectivity for texture and contrast in V4 was much more diverse than in V1 or V2. This heterogeneity allows V4 to faithfully represent naturalistic image structure independent of contrast.
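The linear-readout idea in the abstract above can be illustrated with a small simulation. The sketch below is not the study's analysis pipeline: the population tuning weights, noise level, and stimulus ranges are all invented for illustration. It simulates a heterogeneous population driven jointly by contrast and naturalistic structure, then fits a least-squares linear decoder for structure and checks that the readout is accurate despite trial-to-trial contrast variation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: each unit mixes contrast (c) and naturalistic
# structure (s) with its own weights -- a toy stand-in for the heterogeneous
# tuning described in the abstract. All parameters are illustrative.
n_units, n_trials = 50, 2000
w_c = rng.normal(1.0, 0.5, n_units)   # per-unit contrast weight
w_s = rng.normal(0.0, 1.0, n_units)   # per-unit structure weight

c = rng.uniform(0.1, 1.0, n_trials)   # stimulus contrast per trial
s = rng.uniform(-1.0, 1.0, n_trials)  # naturalistic structure level per trial
R = np.outer(c, w_c) + np.outer(s, w_s) + 0.1 * rng.normal(size=(n_trials, n_units))

# Linear readout of structure, fit by least squares on half the trials.
X = np.column_stack([R, np.ones(n_trials)])  # responses plus a bias column
beta, *_ = np.linalg.lstsq(X[:1000], s[:1000], rcond=None)
s_hat = X[1000:] @ beta

# If tuning is diverse enough, the decoded structure tracks the true
# structure even though contrast varies independently across trials.
r = np.corrcoef(s_hat, s[1000:])[0, 1]
print(f"decoded-vs-true structure correlation: {r:.3f}")
```

With diverse weights the decoder can cancel the contrast contribution, which is the intuition behind the contrast-invariant readout reported for V4.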
... Recently, Oleskiw et al. [41] demonstrated that V4 neurons in the primate visual cortex exhibit tuning to both the curvature of object boundaries and the degree of blur at these boundaries. Their findings suggest that the primate visual system, including humans, possesses specialized neural mechanisms to process blurred edges. ...
Article
Full-text available
Systems with occlusion capabilities, such as those used in vision augmentation, image processing, and optical see-through head-mounted display (OST-HMD), have gained popularity. Achieving precise (hard-edge) occlusion in these systems is challenging, often requiring complex optical designs and bulky volumes. On the other hand, utilizing a single transparent liquid crystal display (LCD) is a simple approach to create occlusion masks. However, the generated mask will appear defocused (soft-edge) resulting in insufficient blocking or occlusion leakage. In our work, we delve into the perception of soft-edge occlusion by the human visual system and present a preference-based optimal expansion method that minimizes perceived occlusion leakage. In a user study involving 20 participants, we made a noteworthy observation that the human eye perceives a sharper edge blur of the occlusion mask when individuals see through it and gaze at a far distance, in contrast to the camera system's observation. Moreover, our study revealed significant individual differences in the perception of soft-edge masks in human vision when focusing. These differences may lead to varying degrees of demand for mask size among individuals. Our evaluation demonstrates that our method successfully accounts for individual differences and achieves optimal masking effects at arbitrary distances and pupil sizes.
... One line of evidence suggests that V4 is tuned in a high-dimensional space that facilitates the joint encoding of shape and surface characteristics of object parts [1,2] (e.g. sensitivity to luminance [3], texture [4] and chromatic contrasts [5], blurry boundaries [6], and luminance gradients [7]). Although very insightful, these experiments are constrained to a relatively small number of stimulus feature directions that potentially miss other important aspects of V4 function that could be unlocked with richer natural stimuli. ...
Article
Full-text available
Responses to natural stimuli in area V4—a mid-level area of the visual ventral stream—are well predicted by features from convolutional neural networks (CNNs) trained on image classification. This result has been taken as evidence for the functional role of V4 in object classification. However, we currently do not know if and to what extent V4 plays a role in solving other computational objectives. Here, we investigated normative accounts of V4 (and V1 for comparison) by predicting macaque single-neuron responses to natural images from the representations extracted by 23 CNNs trained on different computer vision tasks including semantic, geometric, 2D, and 3D types of tasks. We found that V4 was best predicted by semantic classification features and exhibited high task selectivity, while the choice of task was less consequential to V1 performance. Consistent with traditional characterizations of V4 function that show its high-dimensional tuning to various 2D and 3D stimulus directions, we found that diverse non-semantic tasks explained aspects of V4 function that are not captured by individual semantic tasks. Nevertheless, jointly considering the features of a pair of semantic classification tasks was sufficient to yield one of our top V4 models, solidifying V4’s main functional role in semantic processing and suggesting that V4’s selectivity to 2D or 3D stimulus properties found by electrophysiologists can result from semantic functional goals.
Preprint
Full-text available
To correctly parse the visual scene, one must detect edges and determine their underlying cause. Previous work has demonstrated that image-computable neural networks trained to differentiate natural shadow and occlusion edges exhibited sensitivity to boundary sharpness and texture differences. Although these models showed a strong correlation with human performance on an edge classification task, this previous study did not directly investigate whether humans actually make use of boundary sharpness and texture cues when classifying edges as shadows or occlusions. Here we directly investigated this using synthetic image patch stimuli formed by quilting together two different natural textures, allowing us to parametrically manipulate boundary sharpness, texture modulation, and luminance modulation. In a series of initial training experiments, observers were trained to correctly identify the cause of natural image patches taken from one of three categories (occlusion, shadow, uniform texture). In a subsequent series of test experiments, these same observers then classified 5 sets of synthetic boundary images defined by varying boundary sharpness, luminance modulation, and texture modulation cues using a set of novel parametric stimuli. These three visual cues exhibited strong interactions to determine categorization probabilities. For sharp edges, increasing luminance modulation made it less likely the patch would be classified as a texture and more likely it would be classified as an occlusion, whereas for blurred edges, increasing luminance modulation made it more likely the patch would be classified as a shadow. Boundary sharpness had a profound effect, so that in the presence of luminance modulation increasing sharpness decreased the likelihood of classification as a shadow and increased the likelihood of classification as an occlusion. Texture modulation had little effect on categorization, except in the case of a sharp boundary with zero luminance modulation. 
Results were consistent across all 5 stimulus sets, showing these effects are not due to the idiosyncrasies of the particular texture pairs. Human performance was found to be well explained by a simple linear multinomial logistic regression model defined on luminance, texture and sharpness cues, with slightly improved performance for a more complicated nonlinear model taking multiplicative parameter combinations into account. Our results demonstrate that human observers make use of the same cues as our previous machine learning models when detecting edges and determining their cause, helping us to better understand the neural and perceptual mechanisms of scene parsing.
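A minimal sketch of the kind of multinomial (softmax) logistic regression model described above can make the reported cue interactions concrete. The weights below are hand-picked placeholders chosen to mimic the qualitative pattern in the abstract (sharp edge plus luminance modulation favors "occlusion", blurred edge plus luminance modulation favors "shadow"); they are not the fitted values from the study.

```python
import numpy as np

# Toy softmax model over three cues: luminance modulation (L), texture
# modulation (T), and boundary sharpness (S). Rows are categories, columns
# are [L, T, S, bias]. Weights are illustrative, not fitted.
CATEGORIES = ["occlusion", "shadow", "texture"]
W = np.array([
    [ 2.0, 0.5,  2.5, -2.0],  # occlusion: favored by luminance + sharpness
    [ 2.0, 0.0, -2.5,  0.0],  # shadow: favored by luminance, penalized by sharpness
    [-2.0, 1.0,  0.0,  1.0],  # texture: favored when luminance modulation is low
])

def classify(lum, tex, sharp):
    """Return P(category | cues) under the softmax model."""
    z = W @ np.array([lum, tex, sharp, 1.0])
    p = np.exp(z - z.max())              # subtract max for numerical stability
    return dict(zip(CATEGORIES, p / p.sum()))

sharp_edge = classify(lum=1.0, tex=0.2, sharp=1.0)    # sharp + luminance cue
blurred_edge = classify(lum=1.0, tex=0.2, sharp=0.0)  # blurred + luminance cue
print(max(sharp_edge, key=sharp_edge.get))    # -> "occlusion"
print(max(blurred_edge, key=blurred_edge.get))  # -> "shadow"
```

The multiplicative (nonlinear) variant mentioned in the abstract would add products such as L*S as extra input columns; the linear form above already captures the sign of the sharpness-luminance interaction.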
Article
In this response paper, we start by addressing the main points made by the commentators on the target article's main theoretical conclusions: the existence and characteristics of the intermediate shape-centered representations (ISCRs) in the visual system, their emergence from edge detection mechanisms operating on different types of visual properties, and how they are eventually reunited in higher order frames of reference underlying conscious visual perception. We also address the much-commented issue of the possible neural mechanisms of the ISCRs. In the final section, we address more specific and general comments, questions, and suggestions which, albeit very interesting, were less directly focused on the main conclusions of the target paper.
Article
Full-text available
Size-invariant object recognition, the ability to recognize objects across transformations of scale, is a fundamental feature of biological and artificial vision. To investigate its basis in the primate cerebral cortex, we measured single neuron responses to stimuli of varying size in visual area V4, a cornerstone of the object-processing pathway, in rhesus monkeys (Macaca mulatta). Leveraging two competing models for how neuronal selectivity for the bounding contours of objects may depend on stimulus size, we show that most V4 neurons (∼70%) encode objects in a size-invariant manner, consistent with selectivity for a size-independent parameter of boundary form: for these neurons, "normalized" curvature, rather than "absolute" curvature, provided a better account of responses. Our results demonstrate the suitability of contour curvature as a basis for size-invariant object representation in the visual cortex, and posit V4 as a foundation for behaviorally relevant object codes.
Significance statement: Size-invariant object recognition is a bedrock for many perceptual and cognitive functions. Despite growing neurophysiological evidence for invariant object representations in the primate cortex, we still lack a basic understanding of the encoding rules that govern them. Classic work in the field of visual shape theory has long postulated that a representation of objects based on information about their bounding contours is well suited to mediate such an invariant code. In this study, we provide the first empirical support for this hypothesis, and its instantiation in single neurons of visual area V4.
Article
Full-text available
The human visual system must extract reliable object information from cluttered visual scenes several times per second, and this temporal constraint has been taken as evidence that the underlying cortical processing must be strictly feedforward. Here we use a novel rapid reinforcement paradigm to probe the temporal dynamics of the neural circuit underlying rapid object shape perception and thus test this feedforward assumption. Our results show that two shape stimuli are optimally reinforcing when separated in time by ∼60 ms, suggesting an underlying recurrent circuit with a time constant (feedforward + feedback) of 60 ms. A control experiment demonstrates that this is not an attentional cueing effect. Instead, it appears to reflect the time course of feedback processing underlying the rapid perceptual organization of shape.
Significance statement: Human and nonhuman primates can spot an animal shape in complex natural scenes with striking speed, and this has been taken as evidence that the underlying cortical mechanisms are strictly feedforward. Using a novel paradigm to probe the dynamics of shape perception, we find that two shape stimuli are optimally reinforcing when separated in time by 60 ms, suggesting a fast but recurrent neural circuit. This work (1) introduces a novel method for probing the temporal dynamics of cortical circuits underlying perception, (2) provides direct evidence against the feedforward assumption for rapid shape perception, and (3) yields insight into the role of feedback connections in the object pathway.
Article
Full-text available
The mid-level visual cortical area V4 in the primate is thought to be critical for the neural representation of visual shape. Several studies agree that V4 neurons respond to contour features, e.g., convexities and concavities along a shape boundary, that are more complex than the oriented segments encoded by neurons in the primary visual cortex. Here we compare two distinct approaches to modeling V4 shape selectivity: one based on a spectral receptive field (SRF) map in the orientation and spatial frequency domain and the other based on a map in an object-centered angular-position and contour curvature space. We test the ability of these two characterizations to account for the responses of V4 neurons to a set of parametrically designed two-dimensional shapes recorded previously in the awake macaque. We report two lines of evidence suggesting that the SRF model does not capture the contour sensitivity of V4 neurons. First, the SRF model discards spatial phase information, which is inconsistent with the neuronal data. Second, the amount of variance explained by the SRF model was significantly less than that explained by the contour curvature model. Notably, cells best fit by the curvature model were poorly fit by the SRF model, the latter being appropriate for a subset of V4 neurons that appear to be orientation tuned. These limitations of the SRF model suggest that a full understanding of mid-level shape representation requires more complicated models that preserve phase information and perhaps deal with object segmentation.
Article
Full-text available
The primate brain successfully recognizes objects, even when they are partially occluded. To begin to elucidate the neural substrates of this perceptual capacity, we measured the responses of shape-selective neurons in visual area V4 while monkeys discriminated pairs of shapes under varying degrees of occlusion. We found that neuronal shape selectivity always decreased with increasing occlusion level, with some neurons being notably more robust to occlusion than others. The responses of neurons that maintained their selectivity across a wider range of occlusion levels were often sufficiently sensitive to support behavioral performance. Many of these same neurons were distinctively selective for the curvature of local boundary features and their shape tuning was well fit by a model of boundary curvature (curvature-tuned neurons). A significant subset of V4 neurons also signaled the animal's upcoming behavioral choices; these decision signals had short onset latencies that emerged progressively later for higher occlusion levels. The time course of the decision signals in V4 paralleled that of shape selectivity in curvature-tuned neurons: shape selectivity in curvature-tuned neurons, but not others, emerged earlier than the decision signals. These findings provide evidence for the involvement of contour-based mechanisms in the segmentation and recognition of partially occluded objects, consistent with psychophysical theory. Furthermore, they suggest that area V4 participates in the representation of the relevant sensory signals and the generation of decision signals underlying discrimination.
Article
The spatial receptive fields of simple cells in mammalian striate cortex have been reasonably well described physiologically and can be characterized as being localized, oriented, and bandpass, comparable with the basis functions of wavelet transforms. Previously, we have shown that these receptive field properties may be accounted for in terms of a strategy for producing a sparse distribution of output activity in response to natural images. Here, in addition to describing this work in a more expansive fashion, we examine the neurobiological implications of sparse coding. Of particular interest is the case when the code is overcomplete, i.e., when the number of code elements is greater than the effective dimensionality of the input space. Because the basis functions are non-orthogonal and not linearly independent of each other, sparsifying the code will recruit only those basis functions necessary for representing a given input, and so the input-output function will deviate from being purely linear. These deviations from linearity provide a potential explanation for the weak forms of non-linearity observed in the response properties of cortical simple cells, and they further make predictions about the expected interactions among units in response to naturalistic stimuli.
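The sparse-coding behavior described above, where sparsification recruits only the basis functions needed for a given input, can be sketched with a standard iterative soft-thresholding (ISTA) solver for the L1-penalized reconstruction objective. The dictionary and input here are random placeholders, not learned natural-image filters, and the penalty and step size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Overcomplete dictionary: more basis functions (columns) than input
# dimensions, as in the sparse-coding account of simple cells.
n_dim, n_basis = 16, 64
D = rng.normal(size=(n_dim, n_basis))
D /= np.linalg.norm(D, axis=0)             # unit-norm basis functions

# Input constructed from just two basis functions (indices 3 and 40).
x = D[:, [3, 40]] @ np.array([1.5, -1.0])

# ISTA: minimize 0.5*||x - D a||^2 + lam*||a||_1 by gradient steps
# followed by soft thresholding.
lam = 0.1
step = 0.5 / np.linalg.norm(D, 2) ** 2     # below 1/Lipschitz, so it converges
a = np.zeros(n_basis)
for _ in range(500):
    grad = D.T @ (D @ a - x)
    a = a - step * grad
    a = np.sign(a) * np.maximum(np.abs(a) - step * lam, 0.0)  # soft threshold

active = np.flatnonzero(np.abs(a) > 1e-3)
print("active coefficients:", active)      # only a few units are recruited
```

Because the penalty zeroes out coefficients that are not needed, the map from input to coefficients is nonlinear even though the generative model is linear, which is the deviation from linearity the abstract highlights.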
Article
The lens system in the human eye is able to best focus light from only one distance at a time. Therefore, many objects in the natural environment are not imaged sharply on the retina. Furthermore, light from objects in the environment is subject to the particular aberrations of the observer's lens system (e.g., astigmatism and chromatic aberration). We refer to blur created by the observer's optics as "natural" or "defocus" blur as opposed to "on-screen" blur created by software on a display screen. Although blur discrimination has been studied extensively, human ability to discriminate defocus blur in images of natural scenes has not been systematically investigated. Here, we measured discrimination of defocus blur for a collection of natural image patches, sampled from well-focused photographs. We constructed a rig capable of presenting stimuli at three physical distances simultaneously. In Experiment 1, subjects viewed monocularly two simultaneously presented natural image patches through a 4-mm artificial pupil at ±1° eccentricity. The task was to identify the sharper patch. Discrimination thresholds varied substantially between stimuli but were correlated between subjects. The lowest thresholds were at or below the lowest thresholds ever reported. In a second experiment, we paralyzed accommodation and retested a subset of conditions from Experiment 1. A third experiment showed that removing contrast as a cue to defocus blur had only a modest effect on thresholds. Finally, we describe a simple masking model and evaluate how well it can explain our experimental results and the results from previous blur discrimination experiments.
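The "identify the sharper patch" task above can be mimicked computationally. The sketch below is not the authors' masking model: it uses a separable Gaussian blur as a simple stand-in for defocus (true defocus is closer to a disk-shaped point-spread function), a synthetic 1/f-like patch as a crude natural-image surrogate, and high-spatial-frequency energy as a rough proxy for perceived sharpness. All parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def gaussian_blur(img, sigma):
    """Separable Gaussian blur: convolve rows, then columns, with a 1-D kernel."""
    radius = int(3 * sigma) + 1
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda row: np.convolve(row, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda col: np.convolve(col, k, mode="same"), 0, out)

# Synthetic patch with an approximately 1/f amplitude spectrum, a common
# crude surrogate for natural-image statistics (hypothetical stimulus).
f = np.fft.fftfreq(64)
amp = 1.0 / np.maximum(np.hypot(*np.meshgrid(f, f)), 1 / 64)
patch = np.real(np.fft.ifft2(np.fft.fft2(rng.normal(size=(64, 64))) * amp))

def hf_energy(img):
    """High-spatial-frequency energy: a crude proxy for perceived sharpness."""
    F = np.abs(np.fft.fft2(img))
    r = np.hypot(*np.meshgrid(f, f))
    return F[r > 0.25].sum()

sharper = gaussian_blur(patch, sigma=1.0)
blurrier = gaussian_blur(patch, sigma=2.0)
print(hf_energy(sharper) > hf_energy(blurrier))  # stronger blur removes more HF energy
```

A model observer that picks the patch with more high-frequency energy will reliably choose the less-blurred patch, which is the comparison the psychophysical task probes; the paper's masking model additionally accounts for how the patch's own content masks the blur cue.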
Article
Humans are good at rapidly detecting animals in natural scenes, and evoked potential studies indicate that the corresponding neural signals emerge in the brain within 100 msec of stimulus onset (Kirchner & Thorpe, 2006). Given this speed, it has been suggested that the cues underlying animal detection must be relatively primitive. Here we report on the role and dynamics of four potential cues: luminance, colour, texture and contour shape. We employed a set of natural images drawn from the Berkeley Segmentation Dataset (BSD, Martin et al, 2001), comprised of 180 test images (90 animal, 90 non-animal) and 45 masking images containing humans. In each trial a randomly-selected test stimulus was briefly displayed, followed by a randomly-selected and block-scrambled masking stimulus. Stimulus duration ranged from 30–120 msec. Hand-segmentations provided by the BSD allow for relatively independent manipulation of cues. Contour cues can be isolated using line drawings representing segment boundaries. Texture cues can be removed by painting all pixels within each segment with the mean colour of the segment. Shape cues can also be removed by replacing segmented images with Voronoi tessellations based on the centres of mass of the BSD segments. In this manner, we created nine different stimulus classes involving different combinations of cues, and used these to estimate the dynamics of the mechanisms underlying animal detection in natural scenes. Results suggest that the fastest mechanisms use contour shape as a principal discriminative cue, while slower mechanisms integrate texture cues. Interestingly, dynamics based on machine-generated edge maps are similar to dynamics for hand-drawn contours, suggesting that rapid detection can be based upon contours extracted in bottom-up fashion.
Consistent with prior studies, we find little role for luminance and colour cues throughout the time course of visual processing, even though information relevant to the task is available in these signals.