Reconstruction of shape contours from V1 activity at high resolution
Guy Zurawel, Itay Shamir, Hamutal Slovin ⁎
The Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat Gan, Israel
Article info
Article history:
Received 30 June 2015
Accepted 24 October 2015
Available online 27 October 2015
Keywords:
Monkey
Primary visual cortex
Image reconstruction
High-resolution visual perception
Decoding
Imaging
Abstract

The role of primary visual cortex (V1) in encoding physical stimulus features is well known, while stimulus categorization is mainly attributed to higher visual areas. However, visual experience is not stripped down to invariant, categorical-only "labels." Rather, visual experiences are remarkably rich with details, resulting in high-resolution perception of objects. If V1 is involved in this process, high-resolution readout of shape contours should be possible from V1 activity. To test this, we presented various shapes to awake, fixating monkeys while recording V1 population activity using voltage-sensitive dye imaging. A simplified bottom-up model was constructed based on known cortical properties and without an image prior. Contours were reconstructed from single trials, at sub-degree resolution, by applying the inverse model to neuronal responses. These novel reconstruction results suggest V1 can be an important constituent in the detailed internal representation of visual experiences.
© 2015 Elsevier Inc. All rights reserved.
Introduction
Visual experiences are remarkably detailed and quickly processed
under variable conditions. The role of primary visual cortex (V1) in
encoding stimulus attributes of visual input is crucial to this capability
and consists of various processing mechanisms: retinotopy (Tootell
et al., 1988), receptive fields (RFs, Hubel and Wiesel, 1968), contrast
coding (DeValois and DeValois, 1988), neuronal non-linearity
(Albrecht, 1995), and several others (orientations, color, surfaces,
etc.). In contrast, high-level processing of complex features, such as invariant categorization and face processing, is associated with the convergence of RFs and stimulus attributes to, and their processing by, higher visual areas (Freiwald and Tsao, 2010; Hung et al., 2005). However, visual experience is not stripped down to invariant, categorical-only "labels." Rather, detailed visual information is maintained, resulting in a detailed, high-resolution perception of visual objects. The presence of fine details of visual stimuli (those not fully captured by categorical labels) in V1 could contribute to detailed visual perception (Gur, 2015; Hochstein and Ahissar, 2002; Wokke et al., 2013), potentially via readout by higher visual areas. Consequently, visual features such as shape contours should be reconstructible (decodable) from V1 neuronal activity.
Decoding visual contents from neuronal activity has been explored
recently using various approaches. Using multivariate classification
and other statistical tools, stimulus category or orientation can be
decoded among several possibilities from cortical activity (Haynes and
Rees, 2005; Hung et al., 2005). Supplementing the observed activation
with a basis of visual elements was shown to improve decoding results,
facilitating reconstruction of edge-defined visual stimuli (Miyawaki
et al., 2008). Use of a prior bank of images facilitated the identification
of an image that resembled the original one, visually (Kay et al., 2008)
as well as semantically (Horikawa et al., 2013; Naselaris et al., 2009).
Despite these great advances, reconstructing visual stimuli in high detail directly from cortical activity without an image prior has yet to be accomplished. However, several important steps have already been made in this direction: Gollisch and Meister (2008) reconstructed the
contours of natural images using retinal spike latencies, and Stanley
et al. (1999) reconstructed features of natural image sequences from
cat LGN spiking activity. Finally, Thirion et al. (2006) were able to de-
code pattern identity and the position of pattern elements based on
fMRI activity and retinotopy.
Here, we set out to reconstruct high-resolution contours from V1 activity without prior knowledge regarding shape identity or position. We presented achromatic shapes to awake, fixating monkeys, while imaging V1 activity using voltage-sensitive dye imaging (VSDI). Contours were reconstructed by applying the inverse of a simplified forward model to observed responses.
Materials and methods
V1 population responses were recorded using voltage-sensitive dye imaging (VSDI) from 3 hemispheres of 3 awake, fixating macaque monkeys, over a total of 27 imaging sessions. All experimental procedures were approved by the Animal Care and Use Guidelines Committee of Bar-Ilan University, were supervised by the Israeli authorities for animal experiments, and conformed to NIH guidelines.
NeuroImage 125 (2016) 1005–1012
⁎Corresponding author at: Gonda Multidisciplinary Brain Research Center, Bar-Ilan
University, 52900, Ramat Gan, Israel.
E-mail addresses: Guy.zur@gmail.com (G. Zurawel), Itay.shamir@live.biu.ac.il
(I. Shamir), Hamutal.Slovin@biu.ac.il (H. Slovin).
http://dx.doi.org/10.1016/j.neuroimage.2015.10.072
1053-8119/© 2015 Elsevier Inc. All rights reserved.
Animals
Three adult male monkeys (Macaca fascicularis; 12, 13, and 8 kg) were used for the current study. The surgical procedure has been reported in detail elsewhere and is outlined briefly below.
Surgical procedures and voltage-sensitive dye imaging
All methods were approved by the Animal Care and Use Guidelines
Committee of Bar-Ilan University, supervised by the Israeli authorities
for animal experiments and conformed to the NIH guidelines. The surgical, staining, and VSD imaging procedures have been reported in detail elsewhere (Slovin et al., 2002; Zurawel et al., 2014).
Behavioral paradigm
The monkeys were trained on a simple fixation task. On each trial, after a short fixation period (3–4 s), monkeys were briefly (200–300 ms) presented with a single visual stimulus, as described below. The monkeys were required to maintain tight fixation throughout the whole trial and were rewarded with a drop of juice for each correct trial. Stimulated trials were interleaved with blank trials, in which the monkeys fixated but no visual stimulus appeared.
Experimental setup
The experimental setup has been described in detail elsewhere
(Ayzenshtat et al., 2010; Zurawel et al., 2014).
Visual stimuli
Visual stimuli consisted of achromatic rings (diameter 0.6°–1.6°), squares and square surfaces (1.0°–3.0°), or bars (2 co-aligned bars, 0.25° each, spaced 1° apart). Stimulus contrast ranged from 38% to 100% (Weber), iso-luminant background luminance was 17, 37 or 74 cd/m², and eccentricity ranged from 1.9° to 3.1° (see Table S1). Contour width was 0.03°–0.09°. In two sessions, 3° square stimuli were presented in the left hemi-field and positioned such that the right side crossed the vertical meridian into the right hemi-field (e.g., Fig. 3a, 9th row). All other stimuli were presented entirely in a single hemi-field.
Eye movements
Eye positions were monitored by a monocular infrared eye tracker (Dr. Bouis, Karlsruhe, Germany), sampled at 1 kHz and recorded at 250 Hz. Only trials in which the animals maintained tight fixation were analyzed.
Data analysis
Typically, we analyzed 15–30 correct trials for each visual stimulus in a recording session (see Table S1). Matlab software (version 2011b, The MathWorks, Inc.) was used for statistical analyses and calculations. The basic VSDI analysis consisted of (a) defining a region of interest (only pixels with fluorescence level ≥15% of maximal fluorescence were analyzed); (b) normalizing to background fluorescence; (c) average blank subtraction for removal of artifacts (e.g., heartbeat; see more details and a schematic illustration of the basic VSDI analysis in Supplementary Fig. S12 of Ayzenshtat et al., 2010); and (d) removal of pixels located on blood vessels. VSDI responses were then averaged over 150–200 ms. This time range was used to avoid the effect of fast eye movements, i.e., saccades and micro-saccades (saccade and micro-saccade suppression occurs up to 150–200 ms after stimulus onset; Engbert and Kliegl, 2003). Motion artifacts were avoided, mostly owing to the animals' training to calmly maintain fixation while fixed to the head-post. Further screening for very large saccades or irregular ECG activity, followed by examination for irregular noise patterns associated with motion artifacts, eliminated the introduction of noise due to motion.
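The analyses were performed in Matlab; purely as an illustration, the preprocessing steps (a)–(d) and the 150–200 ms temporal averaging can be sketched in numpy as below. The function name, argument layout, and the dF/F-style normalization details are our assumptions, not the authors' code.

```python
import numpy as np

def preprocess_vsdi(trial, avg_blank, fluo, vessel_mask, frame_times_ms):
    """Illustrative sketch of the basic VSDI analysis (steps a-d),
    followed by averaging over the 150-200 ms window.

    trial, avg_blank : (n_frames, n_pixels) raw fluorescence of one
        stimulated trial and the session's average blank trial.
    fluo : (n_pixels,) background (resting) fluorescence per pixel.
    vessel_mask : (n_pixels,) bool, True for pixels on blood vessels.
    frame_times_ms : (n_frames,) frame times relative to stimulus onset.
    """
    # (a) region of interest: pixels with fluorescence >= 15% of maximum
    roi = fluo >= 0.15 * fluo.max()
    # (b) normalize each pixel to its background fluorescence
    norm = trial / fluo
    # (c) subtract the average blank trial to remove common artifacts
    #     (e.g., heartbeat)
    clean = norm - (avg_blank / fluo)
    # (d) drop blood-vessel pixels, together with out-of-ROI pixels
    clean[:, ~(roi & ~vessel_mask)] = np.nan
    # average the response over 150-200 ms after stimulus onset
    win = (frame_times_ms >= 150) & (frame_times_ms <= 200)
    return clean[win].mean(axis=0)
```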
Retinotopic mapping of V1
Retinotopic mapping of V1 and the V1/V2 border was obtained in a separate set of imaging sessions using VSD and optical imaging of intrinsic signals, and has been described elsewhere (Ayzenshtat et al., 2012).
Encoding/decoding model
A simplified bottom-up forward model was constructed based on
known cortical mechanisms: (1) population RFs (the collective RF of
all neurons imaged in a single pixel) and absolute contrast coding as
captured by cortical point spread, (2) non-linear spiking responses,
and (3) retinotopic mapping. Three corresponding model components
were constructed as detailed below. In the forward direction
(encoding), model components were applied to the stimulus in the
above sequence. To reconstruct stimuli (reverse direction), the inverse
form of each component was used, and components were applied in re-
verse order to cortical data.
To arrive at a reversible model with a limited parameter set, the model was simplified extensively, accounting only for the major V1 mechanisms necessary to reconstruct high-contrast stimuli (detailed above). Other known coding mechanisms, such as orientations, spatial frequencies, phase, and the weaker surface coding (Zurawel et al., 2014), were left out to maintain reversibility and minimal assumptions. Introducing these mechanisms would likely result in improved encoding performance and enable reconstructions of more elaborate stimuli. However, such a complex model would entail difficult parameter acquisition and validation, requiring a large parameterizing data set. Importantly, inverting such a model would require more assumptions, thus making it less naïve. We therefore restricted stimuli to high contrasts and a narrow band of frequencies and used the PSF to capture typical contrast responses, while neglecting contributions of other stimulus dimensions not addressed in the model. Therefore, prior to encoding, the absolute, non-normalized contrast of stimuli was taken. As a result, negative-contrast (black), positive-contrast and surface stimuli were all represented as contour-only versions thereof (see Fig. S1b). The model's building blocks are detailed below.
Estimation of the point-spread function (PSF)
Presentation of a small contrast stimulus ("point") elicits a large spread of V1 activity (Van Essen et al., 1984) that results from (1) RF mapping: each contrast-responsive neuron whose RF falls within the stimulus' position responds to some degree, and RFs overlap in V1; and (2) the point-spread VSD activation reflecting neuronal populations' membrane potential, encompassing both supra- and sub-threshold responses. The model is simplified by representing the contribution of contrast responses for all orientations and their population RF in a single PSF.
The PSF was obtained empirically as the average (150–200 ms post stimulus) VSD pattern evoked by a high-contrast point stimulus (0.1° × 0.1° white square; see point-spread time course in Fig. S4). Patterns were expectedly anisotropic (average anisotropy 1.6) and extended over several mm of cortex (Grinvald et al., 1994). This is expected from a PSF with a sub-threshold component, which typically extends beyond the spiking PSF (Das and Gilbert, 1995; Sincich and Blasdel, 2001; Xing et al., 2009). The transition to spiking was expressed in the non-linear part of the model (see Non-linearity section). In cases where the observed PSF was noisy or not available, a well-fit 2D exponential model was used (⟨r²⟩ = 0.66, see Fig. S1a):

f(R, θ) = a·exp(−b·R·g(θ))    (1)

g(θ) = c + ((1 − c)/2)·(cos(2(θ − θ₀)) + 1)    (2)

where R and θ are polar coordinates, a is a scaling coefficient, b is the spatial decay rate at the widest direction (θ = θ₀), and g is the anisotropy function, modulating b as a function of angle, wherein c defines the anisotropy ratio and θ₀ the orientation of the spread. We used an exponential model rather than a Gaussian (same number of parameters) to achieve a better spatial fit, specifically at the outer regions of the patterns. The PSF pattern was then transformed to the VF (see Retinotopic transformation section), and its diameter was kept within 1°, in accordance with known summation field sizes at the imaged eccentricity (Angelucci et al., 2002). Finally, the PSF pattern was normalized to a sum of 1 (PSF, Fig. S1a).
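Under these definitions, a model PSF per Eqs. (1)–(2) can be generated directly. The sketch below uses illustrative parameter values (b, grid resolution), with c set to the reported average anisotropy of 1.6; the fitted per-session parameters are not given in the text.

```python
import numpy as np

def exp_psf(shape, a=1.0, b=2.0, c=1.6, theta0=0.0, deg_per_px=0.05):
    """Anisotropic exponential PSF of Eqs. (1)-(2), sampled on a pixel
    grid. Parameter values are illustrative, not the fitted ones."""
    h, w = shape
    y, x = np.mgrid[:h, :w]
    dx = (x - w // 2) * deg_per_px          # horizontal offset (deg)
    dy = (y - h // 2) * deg_per_px          # vertical offset (deg)
    R = np.hypot(dx, dy)                    # polar radius
    theta = np.arctan2(dy, dx)              # polar angle
    # Eq. (2): g = 1 along theta0 (widest direction), g = c perpendicular
    g = c + 0.5 * (1 - c) * (np.cos(2 * (theta - theta0)) + 1)
    psf = a * np.exp(-b * R * g)            # Eq. (1)
    return psf / psf.sum()                  # normalize to a sum of 1
```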
PSF convolution and de-convolution
The PSF was convolved with the stimulus (encoding direction) or
de-convolved to arrive at the stimulus (decoding model, reconstruc-
tion). De-convolution was performed using Richardson–Lucy (RL) de-
convolution (Lucy, 1974; Richardson, 1972), an iterative procedure
that converges to the maximum-likelihood solution of a source image
given a known PSF. RL operates under the single assumption of
Poissonian noise and receives only a blurred image and a known PSF
as input, with no other parameters or assumptions regarding the source
image.
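A minimal RL implementation makes the procedure concrete. The numpy sketch below is the textbook multiplicative update with the mirrored PSF, not the authors' implementation, and omits refinements such as edge handling.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(blurred, psf, n_iter=50):
    """Minimal Richardson-Lucy de-convolution: iterative multiplicative
    updates that converge toward the maximum-likelihood source image
    under Poisson noise, given only the blurred image and the PSF."""
    est = np.full_like(blurred, blurred.mean())   # flat initial estimate
    psf_mirror = psf[::-1, ::-1]
    for _ in range(n_iter):
        blur = fftconvolve(est, psf, mode="same")
        ratio = blurred / np.maximum(blur, 1e-12)  # avoid divide-by-zero
        est *= fftconvolve(ratio, psf_mirror, mode="same")
    return est
```

As a sanity check, de-convolving a blurred point source concentrates the estimate back at the source location.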
Non-linearity
Non-linearity is a well-known property of cortical neurons
(e.g., Albrecht, 1995). In the context of VSDI, non-linearity may represent the "thresholding" transition between membrane potential (reflected in the VSD signal) and spiking activity. We used the sigmoid function in the inverse (decoding) model:

S(x) = 1 / (1 + exp(−β(x − x₀)))    (3)

where x is the input value, β is the slope, and x₀ is the value at which S(x) = 0.5. Parameters were dependent on imaged pixel distributions: x₀ was set as a percentile (65th–90th), and β was inversely proportional to a percentile (65th–87th) of the distribution, scaled by a fixed factor:

β = 13 / percentile(x)    (4)

In the forward (encoding) model, the inverse form was used:

S⁻¹(x) = x₀ + (log(x) − log(1 − x)) / β    (5)

The x₀ and β percentiles were fit per session based on session noise levels, and the constant scale factor was determined using data and reconstruction from a single session and maintained as a constant for all data.
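Eqs. (3)–(5) can be summarized in a few lines; the specific percentile choices below are illustrative picks from the ranges given in the text, not the fitted per-session values.

```python
import numpy as np

def sigmoid(x, beta, x0):
    """Decoding non-linearity, Eq. (3)."""
    return 1.0 / (1.0 + np.exp(-beta * (x - x0)))

def inv_sigmoid(x, beta, x0):
    """Encoding non-linearity, Eq. (5): the inverse of Eq. (3)."""
    return x0 + (np.log(x) - np.log(1.0 - x)) / beta

def fit_params(pixels, x0_pct=80.0, beta_pct=75.0):
    """Percentile-based parameters, Eq. (4): x0 as a percentile of the
    pixel distribution and beta = 13 / percentile. The percentiles used
    here are illustrative picks from the 65th-90th / 65th-87th ranges."""
    x0 = np.percentile(pixels, x0_pct)
    beta = 13.0 / np.percentile(pixels, beta_pct)
    return beta, x0
```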
Retinotopic transformation from the visual field to the cortical surface
Retinotopic mapping between the cortical surface and the VF (bidirectional) was performed using a log-polar model (Schira et al., 2010), as described previously (Zurawel et al., 2014). Parameters were determined for each animal using independent control points.
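The study used the full Schira et al. (2010) model; as a simpler illustration of a bidirectional log-polar mapping, the classic monopole approximation w = k·log(z + a) can be written as below. The values of k and a here are assumed for illustration, not the parameters fitted to the animals.

```python
import numpy as np

def vf_to_cortex(ecc_deg, polar_rad, k=15.0, a=0.7):
    """Monopole log-polar approximation w = k*log(z + a): maps a visual
    field point (eccentricity in deg, polar angle in rad) to cortical
    coordinates in mm. k (magnification) and a (foveal offset) are
    illustrative values."""
    z = ecc_deg * np.exp(1j * polar_rad)    # VF point as a complex number
    w = k * np.log(z + a)
    return w.real, w.imag

def cortex_to_vf(x_mm, y_mm, k=15.0, a=0.7):
    """Exact inverse mapping, cortex -> visual field."""
    z = np.exp((x_mm + 1j * y_mm) / k) - a
    return np.abs(z), np.angle(z)
```

A useful property of this form is that cortical magnification falls off with eccentricity, matching the skew seen in the evoked patterns.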
Reconstruction and discrimination performance evaluation
To evaluate reconstruction performance, we calculated pixel-wise Pearson correlation coefficients (r) for each reconstruction–stimulus pair. To overcome edge-width discrepancies (see Discussion), stimulus representations were taken as a 9-pixel (0.25°) wide mask of the stimulus contour. To allow for small retinotopic inaccuracies (arising from small variance in the animal's gaze position, retinotopic model fit, etc.), each correlation was performed with limited transformation freedom (±0.5° translation, ±15° rotation, ±15% scaling, not exceeding the next smaller/larger stimulus). All correlations were computed on a 3° × 5° region of the VF that covered the entire portion of the VF corresponding to the imaged area. For some sessions, background noise near the boundaries of the imaging region created a visible contrast artifact (e.g., Fig. 2c v). These boundary regions were excluded from the single-trial correlation calculation.
Discrimination among stimuli using reconstructions was performed, for each reconstruction, as follows. First, r values (described above) were computed against all candidate stimuli. For averages, d prime (d′) was computed between the "correct" r value (correlation with the correct stimulus) and the distribution of the remaining r values (correlations with all other stimuli). For single trials, an ROC (receiver operating characteristic) curve was computed between the distribution of single-trial "correct" r values and the distribution of all other single-trial r values (obtained for all other stimuli), followed by area-under-the-curve (AUC) calculation. To eliminate variation due to VF position, the analysis was repeated with all stimuli co-located with (translated to) the same VF position.
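Both discrimination measures can be sketched briefly. The pairwise-comparison (Mann–Whitney) form of the ROC area and the (mean, SD)-normalized form of d′ are standard definitions assumed here, since the text does not spell out the exact formulas.

```python
import numpy as np

def auc(correct_rs, other_rs):
    """ROC area between r values for the correct stimulus and pooled
    r values for all other stimuli, via pairwise comparison (equivalent
    to the Mann-Whitney U statistic); 1 = perfect, 0.5 = chance."""
    correct = np.asarray(correct_rs, float)[:, None]
    other = np.asarray(other_rs, float)[None, :]
    wins = (correct > other).sum() + 0.5 * (correct == other).sum()
    return wins / (correct.size * other.size)

def dprime(r_correct, other_rs):
    """d' between the single 'correct' r value and the distribution of
    the remaining r values (standard (x - mean) / SD normalization)."""
    other = np.asarray(other_rs, float)
    return (r_correct - other.mean()) / other.std()
```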
Results
Our goal was to evaluate V1 as the neural substrate for high-resolution visual details (e.g., contours). This was done by reconstructing shape contours, along with their precise visual field (VF) positions, at high resolution and without the use of prior image or shape examples. Three fixating monkeys were presented with various circle, square, or bar stimuli for 200–300 ms. V1 population responses were recorded using voltage-sensitive dye imaging (VSDI; a total of 27 imaging sessions over all stimulus types, 3 hemispheres), reflecting supra- and sub-threshold membrane potentials from the underlying neuronal elements. To reconstruct stimuli, neuronal data were fed into the inverse of a simplified bottom-up model, constructed from well-known V1 processing mechanisms: receptive fields (RFs) and the cortical response to local spatial contrast that elicits a cortical point-spread function (PSF), non-linearity, and retinotopy.
Population responses to achromatic shapes
Monkeys were presented with 10 different stimuli in various VF locations (17 stimulus/location combinations in total). VSD responses were averaged 150–200 ms (see Fig. S4) after stimulus onset, capturing the late cortical response that is more closely linked to perception (Gilad et al., 2013; Lamme, 1995) and preceding the onset of eye movements (Engbert and Kliegl, 2003). Responses evoked by large shapes (>1.5°) showed some resemblance to the stimulus: circle stimuli (Fig. 1a i, iv) elicited elliptical, annular patterns over the imaged V1 area (Fig. 1b i, iv; two different animals), and similarly squares (Fig. 1a v) elicited a rectangular pattern (Fig. 1b v). Cortical patterns were skewed and blurred relative to the stimuli, which can be explained by retinotopic mapping, cortical spread and magnification in V1 (Tootell et al., 1988). Some stimulus parts fell very close to the vertical meridian (VM; also the V1/V2 border on the cortical surface). This occasionally resulted in weaker activation of the corresponding V1 regions (Figs. 1a–b iv, a–b v, red arrows; see Discussion). Consequently, these stimulus parts would likely be more difficult to reconstruct (see Results).
Responses to smaller stimuli showed less resemblance to the original stimulus, possibly as a result of the cortical point spread and the imaging resolution: smaller circles (0.8°, Fig. 1a iii) evoked a "filled" pattern (Fig. 1b iii), and responses to smaller squares (Fig. 1a ii) lacked clear sharp corners (Fig. 1b ii). Surface ("filled") stimuli also yielded contour-like activations, accompanied by a weaker center (surface) related activation (Fig. S1b), a phenomenon we addressed recently (Zurawel et al., 2014). For simplicity, here we focus on reconstructing stimulus contours and neglect the surface-related contribution.

These responses hint at possible size and resolution bounds of a reconstruction procedure relying solely on V1 population activity imaged at the current spatial resolution (see Discussion). We next constructed a simplified forward model that could explain the data.
Stimulus encoding—forward model
A forward model (illustrated in Fig. 1c) that accounted for local contrast coding and RFs, non-linearity, and retinotopy was applied to stimuli via the following steps, performed in sequence: convolution of the original stimulus (Fig. 1c i) with a contrast-evoked PSF (see Materials and Methods, Eqs. (1) and (2), Fig. S1a), capturing the contrast RF contribution of neuronal populations in each pixel (Fig. 1c ii); a pixel-wise non-linearity (Eqs. (4) and (5), Fig. 1c iii); and a retinotopic transformation from the cortical surface to visual space (Fig. 1c iv; see Materials and Methods and Schira et al., 2010). For simplification and reversibility purposes, additional stimulus features (e.g., orientation, frequency and surface processing) were not explicitly addressed in the model (see Materials and Methods).

Fig. 1. Typical stimuli, V1 VSD responses and forward model predictions. (a) Visual stimuli: (i) circular ring, diameter (d) = 1.6°, contrast (C) = 38%, positioned at (P) = −1°, −2.5° (left and below, respectively) relative to the fixation point (fp, small red circle). (ii) 1.0° × 1.0° square, C = 100%, P = −1°, −2.25°. (iii) Circular ring, d = 0.8°, C = 38%, P = −1.2°, −2.5°. (iv) Circular ring, d = 1.6°, C = 38%, P = −1°, −2°. (v) 2.0° × 2.0° black square outline, C = −100%, P = −1.5°, −2.5°. (b) Evoked responses corresponding to a(i–v), calculated by averaging VSD maps 150–200 ms after stimulus onset. Imaging window outlined in black; the upper border of the imaged region marks the vertical meridian (see Materials and Methods). Major blood vessels marked in gray. Red arrows mark weakly activated regions corresponding to parts of the stimulus contour. Responses i, iii are from monkey S; responses in ii, iv, and v are from monkey T. (c) Forward model illustration (right to left): (i) example stimulus presented atop a gray screen in the lower left quadrant; fp illustrated in red. (ii) Stimulus convolved with the contrast PSF (see Materials and Methods). (iii) Convolved stimulus (ii) after non-linearity (see Materials and Methods); iso-eccentricity (concentric) and iso-polarity (straight) lines in blue. (iv) Projection of the encoded image onto the V1 surface following retinotopic transformation (imaging window outline illustrated in black). Inset: imaging window illustrated on a schematic macaque brain (not to scale). (d) Response patterns predicted by the forward model for stimuli in (a). (e) Encoding performance: Pearson correlation coefficient (r) between observed and predicted patterns for each stimulus type (n = 27 sessions): circles (d = 0.6°–1.6°), squares (1° × 1° to 3° × 3°), and two co-aligned bars (0.25° each, spaced 1° apart).
To validate the forward model, each stimulus (defined by shape, size and VF position, see Table S1) was encoded, and the resulting patterns (Fig. 1d shows several examples) were compared to the corresponding observed patterns (Fig. 1b). Similarity in shape and cortical position was evident. Moreover, some unique, size-dependent features were also reproduced: (1) elliptical patterns for large circles (Fig. 1d i, iv; compare with observed patterns in Fig. 1b i, iv) in contrast to "filled" patterns for smaller circles (Figs. 1d iii, b iii); (2) apparent square-like patterns for large square stimuli (Figs. 1d v, b v) compared to smaller squares (Figs. 1d ii, b ii); and (3) reduced activations for portions of the pattern that were near or crossed the vertical meridian (Figs. 1b iv–v, d iv–v, red arrows).

Predicted and observed patterns were similar for all stimuli, as indicated by Pearson correlation coefficient (r) values (Fig. 1e, r = 0.62 ± 0.02, mean ± SEM, n = 27 sessions, p < 0.001). Similarity to single-trial responses was expectedly lower, but still consistently positive (0.35 ± 0.02, n = 703 trials, p < 0.001). Overall, qualitative and quantitative examination of the encoded patterns attested to the validity of the model. We next turned to reconstructing the presented shapes using the model.

Fig. 2. (a) Reconstruction model (inverse model) illustration: (i) example VSD response evoked by a 1.6° ring; large blood vessels marked in pink. Inset: imaging window illustrated on a schematic macaque brain (not to scale). (ii) Computed retinotopic projection of response pattern (i) onto the VF (see Materials and Methods). (iii) Image from (ii) after applying non-linearity (see Materials and Methods). (iv) Image from (iii) following contrast PSF de-convolution. (b, c) Reconstructions from average responses. (b, i–v) Example stimuli (same as Fig. 1a); insets (above arrows): average evoked responses, with weakly activated regions marked by red arrows. (c) Reconstructions corresponding to b, obtained by applying the inverse model to the average responses (b insets). Original stimuli outlined in dashed gray. An artifact arising from background noise near the edges of the imaging region is visible in v (sporadic activations left of and above the square contour); see Materials and Methods for more detail. (d, i) Reconstruction–stimulus correlations (r) for all stimuli (defined by shape, size and VF position, see Table S1). (ii) Average correlation over time. Each value represents the mean calculated for 9 time points; data averaged over 20 ms. Shading denotes ±1 SEM (n = 27 sessions).
Stimulus contour reconstruction
To reconstruct stimulus contours, the inverse form of the model was applied to observed responses, in reverse order. First, measured VSDI patterns (Fig. 2a i) were retinotopically transformed onto visual space (Fig. 2a ii; see Materials and Methods). Next, the non-linearity was applied (see Materials and Methods and Eq. (3)), resulting in a "thresholded" image (Fig. 2a iii), possibly isolating the supra-threshold component of the VSD signal. Finally, the contrast PSF was de-convolved from the image using Richardson–Lucy (RL) de-convolution (see Materials and Methods), an iterative, maximum-likelihood-based algorithm (Fig. 2a iv and inset). The input to the RL procedure was the blurred (thresholded) image from the previous step and the contrast PSF, with no additional prior knowledge regarding stimulus shape, size or position.
Reconstruction successfully reproduced the contour characteristics of the presented stimuli; large circle reconstructions (Fig. 2c i, iv) resembled the outline, size and VF position of the original stimuli (dashed circles and Fig. 2b i, iv). Similarly, reconstructed large squares (Fig. 2c v) matched stimulus contours in apparent shape, size, and position (Fig. 2b v). Reconstructions of smaller stimuli (Fig. 2b ii, iii) were accurately sized and positioned, but with contours that deviated from the originals (rounded corners in Fig. 2c ii, near-filled circle in Fig. 2c iii). Expectedly, weak activations corresponding to stimulus regions that were near or crossed the vertical meridian (described above; red arrows in Fig. 2c iv, v insets, Fig. 1b iv, v) resulted in incomplete contours (Fig. 2c iv, v). The apparent size- and position-dependent reconstruction quality bounds may originate from limitations imposed by the corresponding evoked patterns, discussed above. Additionally, reconstructed edges appeared thicker than the originals and with varying intensity, as further addressed in the Discussion. Pixel-wise correlations between reconstructions and stimulus contours (r, see Materials and Methods) were positive and relatively high (Fig. 2d i, 0.55 ± 0.03, mean ± SEM, n = 27 sessions, p < 0.0001 for all).

To examine how the choice of time window (150–200 ms, Fig. S4) affected performance, we re-calculated r at various times (Fig. 2d ii). Correlations were low at early times, rose gradually, and plateaued (~100 ms) at a steady level.
Single-trial contour reconstruction
Averaging is efficient in overcoming noise, but natural processing occurs without repetitions; reconstruction from single-trial data is therefore important. Most single-trial reconstructions (Fig. 3a iii, examples from multiple sessions, with different animals and VF positions in each row; see Fig. S5 for single-trial reconstructions from the same session) resembled the stimuli (Fig. 3a i, centered for illustration). As with the averages, contours of smaller stimuli were more difficult to reconstruct (e.g., Fig. 3a iii, rows 4–6 from top). Single-trial reconstructions bore several differences from their trial-averaged counterparts (Fig. 3a ii, single-session average examples). Some contours appeared "weaker" (e.g., Fig. 3a iii, 1st column vs. averages in 3a ii). Additionally, background regions contained more visible noise (e.g., trials marked with a solid black contour) that arose from within the imaged area or its edges. In other cases, reconstructions were slightly scaled (e.g., dashed black contour). Finally, some had weaker or missing contour parts compared to the averages (e.g., dotted black contours). Overall, single-trial reconstructions were expectedly noisier, but in many cases stimulus contours could be detected. Consequently, r values were lower than those of the averages (Fig. 3b, 0.42 ± 0.005, n = 703) yet significantly positive (>0) for all stimuli (maximal p < 0.0001, Wilcoxon signed rank computed separately for each session's distribution of r values).
Discrimination performance
Discrimination was measured by computing r values for each reconstruction against all candidate stimuli (Fig. 3d bottom, normalized within each column for visualization; see Materials and Methods). A high r between a reconstructed pattern and the original stimulus (diagonal entry in 3d, bottom) indicates that stimulus identity could be decoded with high certainty. Discrimination was quantified using d prime (d′) for averages (original stimulus' r vs. the distribution of all others) and ROC analysis for single trials (distribution of the original stimulus' r's vs. all others; see Materials and Methods). Average reconstructions showed high d′ discrimination (Fig. 3c, 2.46 ± 0.57, mean ± SEM, n = 17 unique stimuli). For single trials, ROC area-under-the-curve measures (AUCs, ranging from 1 to 0.5, perfect to chance level) were mostly high (Fig. 3d top, 0.87 ± 0.03, mean AUC ± SEM, n = 17) and above chance (Wilcoxon signed rank, p < 0.0003). Discrimination among similar, closely positioned stimuli (see Fig. S2a) was more difficult, especially for small-to-medium-sized shapes (e.g., Fig. 3d bottom, sporadic high values). Consequently, AUCs for these reconstructions were lower.

To isolate the contributions of stimulus contour and size, we repeated the analysis with all stimuli co-located (see Materials and Methods and Fig. S2b). As expected, discrimination performance decreased, but remained above chance for all stimuli (0.8 ± 0.05, p < 0.002, Fig. S3). We attribute this decrease to the previous contribution of VF position (Table S1).
Discussion
To evaluate V1 as the neural substrate underlying the representation of fine visual details that can contribute to detailed visual perception and be read out by higher visual areas, shape contours were reconstructed from relatively late (150–200 ms) V1 population activity. Reconstructed shape, size, and location accuracy were in the sub-degree range and accomplished via a brain-inspired model, without the use of prior image or shape examples.

Brain-inspired modeling enabled novel, high-resolution contour reconstruction that extends beyond coarse reconstruction of VF position (Thirion et al., 2006). Reconstruction was accomplished by embedding theoretical and empirical knowledge of brain processing into the
model. In contrast, other studies used a prior collection or training set
Fig. 3. Single-trial stimulus reconstruction. (a) Reconstructions. (i) All stimulus categories, defined by shape and size (see Table S1). (ii) Typical single-session average reconstructions corresponding to (i). (iii) Typical single-trial reconstruction examples from various imaging sessions, animals and stimulus VF positions (hence the variability in reconstructed contours and their positions among trials); same model parameters as used for the averages. Note that the 3° square stimulus was positioned such that its right side crossed the vertical meridian (9th row from the top). (b) Single-trial reconstruction–stimulus correlations (r) averaged for each stimulus (error bars denote ±1 SEM, n = 703 trials total), defined by shape, size and VF position (see Table S1); same x-axis as c. (c) d′ for average reconstructions. (d) Bottom: correlation matrix between reconstructions and stimuli. Each column holds the average r values (single-trial correlations) of one reconstruction (x-axis) with all stimuli (y-axis), defined as in b (shape, size, and VF position; x/y-axis entries with the same stimulus specification differ in VF position, see Table S1). Values are normalized (rₙ) to a sum of 1 within each column for visualization purposes; values along the diagonal (solid gray contour) are the rₙ values for the actual (correct) stimulus. Top: single-trial discrimination; AUCs computed for each reconstruction (same x-axis as bottom panel) between the distribution of r values for the "correct" stimulus and that of all other r's (for all other stimuli).
1010 G. Zurawel et al. / NeuroImage 125 (2016) 1005–1012
of images to reconstruct more complex stimulus features or natural
stimuli (Kay et al., 2008; Miyawaki et al., 2008). The use of high-resolution VSDI enabled more refined reconstructions. Reconstruction
resolution (bounds) was directly affected by the imaging resolution
and per-pixel averaging, but is likely also eccentricity dependent and
therefore affected by the relative location of the imaged V1 region.
By constructing the model using separate encoding modules in se-
quence, representations of different encoding mechanisms were kept
separate, and model reversibility was easily maintained. Such modular
construction also enables the future introduction of additional, currently
excluded components such as spatial interactions and surface processing. The
latter may facilitate readout of additional features, provided inter-
component interactions are modeled properly.
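The modular, reversible construction described above can be sketched as a chain of stages, each exposing a forward map and its inverse; inverting the full model then amounts to applying the stage inverses in reverse order. The module names and functional forms below are hypothetical stand-ins, not the authors' implementation:

```python
import numpy as np

class Nonlinearity:
    """Pointwise power-law response (a stand-in for neuronal non-linearity)."""
    def __init__(self, gamma):
        self.gamma = gamma
    def encode(self, x):
        return np.power(x, self.gamma)
    def decode(self, y):
        return np.power(y, 1.0 / self.gamma)

class Gain:
    """Global multiplicative gain (a stand-in for contrast coding)."""
    def __init__(self, g):
        self.g = g
    def encode(self, x):
        return self.g * x
    def decode(self, y):
        return y / self.g

def forward(modules, stimulus):
    # Encoding model: apply each stage in sequence.
    for m in modules:
        stimulus = m.encode(stimulus)
    return stimulus

def inverse(modules, response):
    # Inverse model: apply each stage's inverse in reverse order.
    for m in reversed(modules):
        response = m.decode(response)
    return response

modules = [Nonlinearity(gamma=0.5), Gain(g=3.0)]
stim = np.linspace(0.1, 1.0, 10)          # toy "stimulus"
recovered = inverse(modules, forward(modules, stim))
```

Keeping each stage's representation separate in this way is what makes it straightforward to add, remove, or refit one component without touching the others, provided each stage remains invertible.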
To avoid model over-fitting to the reconstructed stimuli, retinotopy
and PSF parameters were estimated via independent experiments using
stimuli that had minimal to no overlap with the test stimuli. Non-
linearity parameters were estimated directly from observed activation
distributions. This eliminated the possibility of a circular contribution
from the encoding stage to the reconstruction stage that could artificially
inflate reconstruction performance.
Reconstruction evaluation had low sensitivity to contour width (see
Materials and Methods), and reconstructed contours were thicker than
actual ones, which likely contributed to the presence of some elevated
correlations to stimuli other than the original. This may be due to corti-
cal magnification, explicitly modeled in the retinotopic component but
also partly expressed in the PSF. PSF imperfections and deconvolution
performance under noisy conditions may also have contributed.
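The deconvolution step follows the iterative scheme of Richardson (1972) and Lucy (1974), cited in the references. A minimal 1-D numpy sketch of that scheme is given below; the toy "contour" signal, kernel widths, and iteration count are illustrative, not taken from the study:

```python
import numpy as np

def richardson_lucy_1d(observed, psf, n_iter=30, eps=1e-12):
    """Minimal 1-D Richardson-Lucy deconvolution (Richardson, 1972; Lucy, 1974).

    observed: blurred (PSF-convolved) signal, non-negative.
    psf: point-spread function, assumed known; normalized to sum to 1.
    """
    psf = psf / psf.sum()
    psf_flipped = psf[::-1]
    estimate = np.full_like(observed, observed.mean())  # flat initial guess
    for _ in range(n_iter):
        blurred = np.convolve(estimate, psf, mode="same")
        ratio = observed / (blurred + eps)      # eps avoids division by zero
        estimate *= np.convolve(ratio, psf_flipped, mode="same")
    return estimate

# Toy example: blur a sharp "contour" with a Gaussian PSF, then deconvolve.
x = np.zeros(64)
x[24:40] = 1.0
g = np.exp(-0.5 * (np.arange(-8, 9) / 2.0) ** 2)
blurred = np.convolve(x, g / g.sum(), mode="same")
restored = richardson_lucy_1d(blurred, g, n_iter=100)
```

With noise present or iterations stopped early, the recovered edges remain somewhat spread out, consistent with reconstructed contours coming out thicker than the actual ones.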
For some stimuli positioned close to the VM, evoked activations cor-
responding to near-VM contour parts (0.5° and closer) were weaker
(Fig. 1b iv, v). Consequently, reconstruction of these contour portions
was more difficult (Fig. 2c iv, v). There are a few possible explanations
for this phenomenon: (1) VSD staining quality; (2) effect of small eye
movements/drift, which may cause the contour located close to the
VM to shift to the opposite hemi-field; and (3) the simplified PSF repre-
sentation, which does not fully capture the special characteristics of V1
spread profile near the VM (where activation in V1 can show a steep
decay near the V1/V2 border, which is also the VM).
Edge intensity varied along reconstructed contours. This could be
explained by variable SNR within the imaged area, but also by the
model's over-simplifications; PSFs were constant throughout the VF,
while both RF and PSF size vary with eccentricity (Angelucci et al.,
2002; Van Essen et al., 1984). Similarly, contrast gain is influenced locally, yet non-linearity parameters were fixed for the entire image. Both
simplifications were important in constructing a simple, reversible
model.
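The tradeoff behind the fixed-PSF simplification can be made concrete: once kernel width varies with eccentricity, blurring is no longer a single shift-invariant convolution, and the simple reversibility above is lost. The sketch below uses a hypothetical Gaussian width scaling (the constants sigma0 and slope are invented; real V1 scaling would be fit to data such as Van Essen et al., 1984):

```python
import numpy as np

def gaussian_psf(sigma, radius=8):
    """Unit-sum 1-D Gaussian kernel."""
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def eccentric_blur(signal, ecc_deg, sigma0=0.5, slope=0.2):
    """Blur each sample with a PSF whose width grows with eccentricity.

    Because the kernel changes per position, this is a position-dependent
    spreading operation, not one shift-invariant convolution, which is
    what makes it harder to invert than a fixed PSF.
    """
    out = np.zeros_like(signal)
    for i, e in enumerate(ecc_deg):
        k = gaussian_psf(sigma0 + slope * e)
        r = len(k) // 2
        lo, hi = max(0, i - r), min(len(signal), i + r + 1)
        out[lo:hi] += signal[i] * k[lo - (i - r): hi - (i - r)]
    return out

ecc = np.linspace(0.5, 3.0, 64)          # eccentricity (deg) per sample
sig = np.zeros(64)
sig[[16, 48]] = 1.0                      # two identical point stimuli
resp = eccentric_blur(sig, ecc)          # the far point spreads more
```

The point at higher eccentricity is spread over a wider, lower-amplitude profile than the near one, mirroring the eccentricity dependence that the fixed-PSF model deliberately ignores for the sake of a simple inverse.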
Acknowledgments
We thank Inbal Ayzenshtat for valuable input concerning retinotopic
transformations and computational considerations. This work was
supported by the DFG: Program of German-Israeli Project cooperation
(DIP grant, ref: 185/1-1) and by the Israeli Center of Research Excellence
(I-CORE) in Cognition (I-CORE Program 51/11).
Appendix A. Supplementary data
Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.neuroimage.2015.10.072.
References
Albrecht, D.G., 1995. Visual cortex neurons in monkey and cat: effect of contrast on the
spatial and temporal phase transfer functions. Vis. Neurosci. 12, 1191–1210.
Angelucci, A., Levitt, J.B., Walton, E.J.S., Hupe, J.M., Bullier, J., Lund, J.S., 2002. Circuits for
local and global signal integration in primary visual cortex. J. Neurosci. 22, 8633–8646.
Ayzenshtat, I., Meirovithz, E., Edelman, H., Werner-reiss, U., Bienenstock, E., Abeles, M.,
Slovin, H., 2010. Precise spatiotemporal patterns among visual cortical areas and
their relation to visual stimulus processing. J. Neurosci. 30, 11232–11245.
Ayzenshtat, I., Gilad, A., Zurawel, G., Slovin, H., 2012. Population response to natural im-
ages in the primary visual cortex encodes local stimulus attributes and perceptual
processing. J. Neurosci. 32, 13971–13986.
Das, A., Gilbert, C.D., 1995. Long-range horizontal connections and their role in cortical reorganization revealed by optical recording of cat primary visual cortex. Nature 375, 780–784.
DeValois, R.L., DeValois, K.K., 1988. Spatial Vision. Oxford University Press, USA.
Engbert, R., Kliegl, R., 2003. Microsaccades uncover the orientation of covert attention. Vis.
Res. 43, 1035–1045.
Freiwald, W.A., Tsao, D.Y., 2010. Functional compartmentalization and viewpoint generalization within the macaque face-processing system. Science 330, 845–851.
Gilad, A., Meirovithz, E., Slovin, H., 2013. Population responses to contour integration:
early encoding of discrete elements and late perceptual grouping. Neuron 78,
389–402.
Gollisch, T., Meister, M., 2008. Rapid neural coding in the retina with relative spike laten-
cies. Science 319, 1108–1111.
Grinvald, A., Lieke, E.E., Frostig, R.D., Hildesheim, R., 1994. Cortical point-spread function
and long-range lateral interactions revealed by real-time optical imaging of macaque
monkey primary visual cortex. J. Neurosci. 14, 2545–2568.
Gur, M., 2015. Space reconstruction by primary visual cortex activity: a parallel, non-
computational mechanism of object representation. Trends Neurosci. 38, 207–216.
Haynes, J.D., Rees, G., 2005. Predicting the orientation of invisible stimuli from activity in
human primary visual cortex. Nat. Neurosci. 8, 686–691.
Hochstein, S., Ahissar, M., 2002. View from the top: hierarchies and reverse hierarchies in the visual system. Neuron 36, 791–804.
Horikawa, T., Tamaki, M., Miyawaki, Y., Kamitani, Y., 2013. Neural decoding of visual im-
agery during sleep. Science 340, 639–642.
Hubel, D.H., Wiesel, T.N., 1968. Receptive fields and functional architecture of monkey
striate cortex. J. Physiol. 195, 215–243.
Hung, C.P., Kreiman, G., Poggio, T., DiCarlo, J.J., 2005. Fast readout of object identity from
macaque inferior temporal cortex. Science 310, 863–866.
Kay, K.N., Naselaris, T., Prenger, R.J., Gallant, J.L., 2008. Identifying natural images from
human brain activity. Nature 452, 352–355.
Lamme, V.A.F., 1995. The neurophysiology of figure-ground segregation in primary visual cortex. J. Neurosci. 15, 1605–1615.
Lucy, L., 1974. An iterative technique for the rectification of observed distributions. Astron. J. 79, 745.
Miyawaki, Y., Uchida, H., Yamashita, O., Sato, M., Morito, Y., Tanabe, H.C., Sadato, N.,
Kamitani, Y., 2008. Visual image reconstruction from human brain activity using a
combination of multiscale local image decoders. Neuron 60, 915–929.
Naselaris, T., Prenger, R.J., Kay, K.N., Oliver, M., Gallant, J.L., 2009. Bayesian reconstruction
of natural images from human brain activity. Neuron 63, 902–915.
Richardson, W.H., 1972. Bayesian-based iterative method of image restoration. JOSA 62,
55–59.
Schira, M.M., Tyler, C.W., Spehar, B., Breakspear, M., 2010. Modeling magnification and anisotropy in the primate foveal confluence. PLoS Comput. Biol. 6, e1000651.
Sincich, L.C., Blasdel, G.G., 2001. Oriented axon projections in primary visual cortex of the
monkey. J. Neurosci. 21, 4416–4426.
Slovin, H., Arieli, A., Hildesheim, R., Grinvald, A., 2002. Long-term voltage-sensitive dye
imaging reveals cortical dynamics in behaving monkeys. J. Neurophysiol. 88,
3421–3438.
Stanley, G.B., Li, F.F., Dan, Y., 1999. Reconstruction of natural scenes from ensemble re-
sponses in the lateral geniculate nucleus. J. Neurosci. 19, 8036–8042.
Thirion, B., Duchesnay, E., Hubbard, E., Dubois, J., Poline, J.-B., Lebihan, D., Dehaene, S.,
2006. Inverse retinotopy: inferring the visual content of images from brain activation
patterns. Neuroimage 33, 1104–1116.
Tootell, R.B., Silverman, M.S., Hamilton, S.L., De Valois, R.L., Switkes, E., 1988. Functional
anatomy of macaque striate cortex. III. Color. J. Neurosci. 8, 1569–1593.
Van Essen, D.C., Newsome, W.T., Maunsell, J.H., 1984. The visual field representation in
striate cortex of the macaque monkey: asymmetries, anisotropies, and individual var-
iability. Vis. Res. 24, 429–448.
Wokke, M.E., Vandenbroucke, A.R.E., Scholte, H.S., Lamme, V.A.F., 2013. Confuse your illusion: feedback to early visual cortex contributes to perceptual completion. Psychol. Sci. 24, 63–71.
Xing, D., Yeh, C.I., Shapley, R.M., 2009. Spatial spread of the local field potential and its
laminar variation in visual cortex. J. Neurosci. 29, 11540–11549.
Zurawel, G., Ayzenshtat, I., Zweig, S., Shapley, R., Slovin, H., 2014. A contrast and surface
code explains complex responses to black and white stimuli in V1. J. Neurosci. 34,
14388–14402.