Eero P. Simoncelli
New York University · Center for Neural Science (CNS)

PhD, Electrical Engineering & Computer Science, MIT

About

365 Publications
92,376 Reads
96,131 Citations
Citations since 2017: 53 Research Items, 53,447 Citations
Additional affiliations
January 2001 - present
September 1996 - present: New York University
January 1993 - August 1996: University of Pennsylvania

Publications (365)
Preprint
Full-text available
Perceptual sensitivity often improves with training, a phenomenon known as 'perceptual learning'. Another important perceptual dimension is appearance, the subjective sense of stimulus magnitude. Are training-induced improvements in sensitivity accompanied by more accurate appearance? Here, we examine this question by measuring both discrimination...
Preprint
Full-text available
A fraction of the visual information arriving at the retina is transmitted to the brain by signals in the optic nerve, and the brain must rely solely on these signals to make inferences about the visual world. Previous work has probed the visual information contained in retinal signals by reconstructing images from retinal activity using linear reg...
Preprint
Full-text available
The perception of sensory attributes is often quantified through measurements of discriminability (an observer's ability to detect small changes in a stimulus), as well as direct judgements of appearance or intensity. Despite their ubiquity, the relationship between these two measurements is controversial and unresolved. Here, we propose a framework...
Article
Full-text available
Neurons in primate visual cortex (area V1) are tuned for spatial frequency, in a manner that depends on their position in the visual field. Several studies have examined this dependency using functional magnetic resonance imaging (fMRI), reporting preferred spatial frequencies (tuning curve peaks) of V1 voxels as a function of eccentricity, but the...
Article
Denoising is a fundamental challenge in scientific imaging. Deep convolutional neural networks (CNNs) provide the current state of the art in denoising photographic images. However, their potential has been inadequately explored for scientific imaging. Denoising CNNs are typically trained on clean images corrupted with artificial noise, but in scie...
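As a rough sketch of the supervised denoising setup this abstract refers to (clean images corrupted with synthetic Gaussian noise, and a network trained to undo the corruption), one could write the following in PyTorch. The architecture, noise level, and loss are illustrative assumptions, not the networks evaluated in the paper.

```python
import torch
import torch.nn as nn

class SmallDenoiser(nn.Module):
    """Toy convolutional denoiser: a stack of 3x3 conv + ReLU layers."""
    def __init__(self, channels=1, width=64, depth=5):
        super().__init__()
        layers = [nn.Conv2d(channels, width, 3, padding=1), nn.ReLU()]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(width, channels, 3, padding=1)]
        self.net = nn.Sequential(*layers)

    def forward(self, noisy):
        return self.net(noisy)

def training_step(model, optimizer, clean_batch, sigma=0.1):
    # Corrupt clean images with artificial Gaussian noise, then regress back to them.
    noisy = clean_batch + sigma * torch.randn_like(clean_batch)
    loss = nn.functional.mse_loss(model(noisy), clean_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```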
Article
Full-text available
Many sensory-driven behaviors rely on predictions about future states of the environment. Visual input typically evolves along complex temporal trajectories that are difficult to extrapolate. We test the hypothesis that spatial processing mechanisms in the early visual system facilitate prediction by constructing neural representations that follow...
Preprint
Full-text available
Neurons in primate visual cortex (area V1) are tuned for spatial frequency, in a manner that depends on their position in the visual field. Several studies have examined this dependency using fMRI, reporting preferred spatial frequencies (tuning curve peaks) of V1 voxels as a function of eccentricity, but their results differ by as much a...
Article
Full-text available
A deep convolutional neural network has been developed to denoise atomic-resolution transmission electron microscope image datasets of nanoparticles acquired using direct electron counting detectors, for applications where the image signal is severely limited by shot noise. The network was applied to a model system of CeO2-supported Pt nanopartic...
Article
Full-text available
Sensory processing necessitates discarding some information in service of preserving and reformatting more behaviorally relevant information. Sensory neurons seem to achieve this by responding selectively to particular combinations of features in their inputs, while averaging over or ignoring irrelevant combinations. Here, we expose the perceptual...
Preprint
Deep convolutional neural networks (CNNs) for image denoising are usually trained on large datasets. These models achieve the current state of the art, but they have difficulties generalizing when applied to data that deviate from the training distribution. Recent work has shown that it is possible to train denoisers on a single noisy image. These...
Article
Full-text available
Significance: Humans have a remarkable ability to remember images they have seen, even after seeing thousands, each only once and for a few seconds. One important step toward understanding how the primate brain supports this remarkable form of memory involves pinpointing the neural activity patterns that enable image memory behavior. This paper pres...
Preprint
Full-text available
Sensory-guided behavior requires reliable encoding of stimulus information in neural responses, and task-specific decoding through selective combination of these responses. The former has been the topic of intensive study, but the latter remains largely a mystery. We propose a framework in which shared stochastic modulation of task-informative neu...
Article
Full-text available
The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human quality judgments. Perceptual datasets gathered for this purpose have provided useful benchmarks for improving IQA methods, but their heavy use creates a risk of overfitting. Here, we perform a large-scale comparis...
Preprint
A deep learning-based convolutional neural network has been developed to denoise atomic-resolution in situ TEM image datasets of catalyst nanoparticles acquired on high speed, direct electron counting detectors, where the signal is severely limited by shot noise. The network was applied to a model catalyst of CeO2-supported Pt nanoparticles. We lev...
Article
Objective measures of image quality generally operate by comparing pixels of a “degraded” image to those of the original. Relative to human observers, these measures are overly sensitive to resampling of texture regions (e.g., replacing one patch of grass with another). Here, we develop the first full-reference image quality model with explicit tol...
Preprint
Deep convolutional neural networks (CNNs) currently achieve state-of-the-art performance in denoising videos. They are typically trained with supervision, minimizing the error between the network output and ground-truth clean videos. However, in many applications, such as microscopy, noiseless videos are not available. To address these cases, we bu...
Preprint
Denoising is a fundamental challenge in scientific imaging. Deep convolutional neural networks (CNNs) provide the current state of the art in denoising natural images, where they produce impressive results. However, their potential has barely been explored in the context of scientific imaging. Denoising CNNs are typically trained on real natural im...
Preprint
Full-text available
Prior probability models are a central component of many image processing problems, but density estimation is notoriously difficult for high-dimensional signals such as photographic images. Deep neural networks have provided state-of-the-art solutions for problems such as denoising, which implicitly rely on a prior probability model of natural imag...
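The connection between denoising and an implicit prior that this abstract alludes to can be made concrete with a classical empirical-Bayes identity: for additive Gaussian noise of variance $\sigma^2$, the least-squares-optimal denoiser $\hat{x}(y)$ satisfies

$$\hat{x}(y) = y + \sigma^{2}\,\nabla_{y}\log p_{\sigma}(y),$$

so the denoising residual $\hat{x}(y) - y$ gives, up to the factor $\sigma^{2}$, the gradient of the log density of noisy images. The notation here is generic rather than taken from the paper, and is included only to indicate how a trained denoiser can expose a usable prior gradient.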
Preprint
Full-text available
Memories of the images that we have seen are thought to be reflected in the reduction of neural responses in high-level visual areas such as inferotemporal (IT) cortex, a phenomenon known as repetition suppression (RS). We challenged this hypothesis with a task that required rhesus monkeys to report whether images were novel or repeated while ignor...
Preprint
Full-text available
We develop a model for representing visual texture in a low-dimensional feature space, along with a novel self-supervised learning objective that is used to train it on an unlabeled database of texture images. Inspired by the architecture of primate visual cortex, the model uses a first stage of oriented linear filters (corresponding to cortical ar...
Preprint
Neural populations do not perfectly encode the sensory world: their capacity is limited by the number of neurons, metabolic and other biophysical resources, and intrinsic noise. The brain is presumably shaped by these limitations, improving efficiency by discarding some aspects of incoming sensory streams, while preferentially preserving commonly o...
Preprint
Full-text available
The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human judgments. Perceptual datasets (e.g., LIVE and TID2013) gathered for this purpose provide useful benchmarks for improving IQA methods, but their heavy use creates a risk of overfitting. Here, we perform a large-sca...
Preprint
Full-text available
Objective measures of image quality generally operate by making local comparisons of pixels of a "degraded" image to those of the original. Relative to human observers, these measures are overly sensitive to resampling of texture regions (e.g., replacing one patch of grass with another). Here we develop the first full-reference image quality model...
Article
Full-text available
Responses of sensory neurons are often modeled using a weighted combination of rectified linear subunits. Since these subunits often cannot be measured directly, a flexible method is needed to infer their properties from the responses of downstream neurons. We present a method for maximum likelihood estimation of subunits by soft-clustering spike-t...
Article
Motion selectivity in primary visual cortex (V1) is approximately separable in orientation, spatial frequency, and temporal frequency ("frequency-separable"). Models for area MT neurons posit that their selectivity arises by combining direction-selective V1 afferents whose tuning is organized around a tilted plane in the frequency domain, specifyin...
Preprint
Motion selectivity in primary visual cortex (V1) is approximately separable in orientation, spatial frequency, and temporal frequency (“frequency-separable”). Models for area MT neurons posit that their selectivity arises by combining direction-selective V1 afferents whose tuning is organized around a tilted plane in the frequency domain, specifyin...
Preprint
Deep convolutional networks often append additive constant ("bias") terms to their convolution operations, enabling a richer repertoire of functional mappings. Biases are also used to facilitate training, by subtracting mean response over batches of training images (a component of "batch normalization"). Recent state-of-the-art blind denoising meth...
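To make the notion of removing additive constants concrete, a bias-free convolutional stack in PyTorch simply sets bias=False in every convolution, as in the hypothetical example below. A fully bias-free network would also replace batch normalization with a variant that rescales without subtracting the batch mean, which standard nn.BatchNorm2d does not provide, so that detail is omitted here.

```python
import torch.nn as nn

class BiasFreeCNN(nn.Module):
    """Toy convolutional stack with every additive ("bias") constant removed."""
    def __init__(self, channels=1, width=64, depth=4):
        super().__init__()
        layers = []
        in_ch = channels
        for _ in range(depth - 1):
            # bias=False drops the additive constant from each convolution.
            layers += [nn.Conv2d(in_ch, width, 3, padding=1, bias=False), nn.ReLU()]
            in_ch = width
        layers += [nn.Conv2d(in_ch, channels, 3, padding=1, bias=False)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```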
Article
Full-text available
Many behaviors rely on predictions derived from recent visual input, but the temporal evolution of those inputs is generally complex and difficult to extrapolate. We propose that the visual system transforms these inputs to follow straighter temporal trajectories. To test this ‘temporal straightening’ hypothesis, we develop a methodology for estima...
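A simple proxy for the "straightness" of a trajectory of frames (or of their representations in some model) is the average turning angle between successive difference vectors. The sketch below illustrates that computation; it is not the estimation procedure developed in the paper, which infers curvature in perceptual space from human discrimination judgments.

```python
import numpy as np

def mean_curvature(trajectory):
    """Average turning angle (radians) along a sequence of points.

    trajectory: array of shape (T, D), one D-dimensional point per time step.
    A perfectly straight trajectory returns 0.
    """
    diffs = np.diff(trajectory, axis=0)                 # (T-1, D) segment vectors
    diffs /= np.linalg.norm(diffs, axis=1, keepdims=True)
    cosines = np.sum(diffs[:-1] * diffs[1:], axis=1)    # angles between neighboring segments
    return float(np.mean(np.arccos(np.clip(cosines, -1.0, 1.0))))
```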
Article
Full-text available
The original and corrected figures are shown in the accompanying Author Correction.
Preprint
Full-text available
Sensory-guided behavior requires reliable encoding of information (from stimuli to neural responses) and flexible decoding (from neural responses to behavior). In typical decision tasks, a small subset of cells within a large population encode task-relevant stimulus information and need to be identified by later processing stages for relevant infor...
Preprint
Full-text available
Integration of rectified synaptic inputs is a widespread nonlinear motif in sensory neuroscience. We present a novel method for maximum likelihood estimation of nonlinear subunits by soft-clustering spike-triggered stimuli. Subunits estimated from parasol ganglion cells recorded in macaque retina partitioned the receptive field into compact regions...
Article
Full-text available
Sensory neurons represent stimulus information with sequences of action potentials that differ across repeated measurements. This variability limits the information that can be extracted from momentary observations of a neuron's response. It is often assumed that integrating responses over time mitigates this limitation. However, temporal response...
Article
Full-text available
The stimulus selectivity of neurons in V1 is well known, as is the finding that their responses can be affected by visual input to areas outside of the classical receptive field. Less well understood are the ways selectivity is modified as signals propagate to visual areas beyond V1, such as V2. We recently proposed a role for V2 neurons in represe...
Article
Full-text available
We develop a method for comparing hierarchical image representations in terms of their ability to explain perceptual sensitivity in humans. Specifically, we utilize Fisher information to establish a model-derived prediction of sensitivity to local perturbations around a given natural image. For a given image, we compute the eigenvectors of the Fish...
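In outline, the most- and least-noticeable perturbations predicted by a differentiable model at a given image are the extremal right singular vectors of the model's Jacobian at that image (equivalently, the extremal eigenvectors of the Fisher matrix J^T J under an additive Gaussian noise assumption). The snippet below is a generic illustration of that computation, not the authors' implementation, and assumes the model maps an image tensor directly to a response vector.

```python
import torch

def extremal_distortions(model, image):
    """Most- and least-visible unit-norm perturbations of `image` under `model`,
    computed as the top and bottom right singular vectors of the Jacobian."""
    x = image.flatten().detach()
    func = lambda v: model(v.view(image.shape)).flatten()
    J = torch.autograd.functional.jacobian(func, x)       # (response_dim, pixel_dim)
    _, _, Vh = torch.linalg.svd(J, full_matrices=False)
    most_visible = Vh[0].view(image.shape)                 # direction most amplified by the model
    least_visible = Vh[-1].view(image.shape)               # direction least amplified
    return most_visible, least_visible
```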
Article
We compare several functional models of LGN population response in terms of their ability to predict human judgments of visual distortion. The model-derived Fisher Information matrix provides a bound on discrimination thresholds for the visibility of arbitrary distortions. In particular, the largest and smallest eigenvectors of this matrix represen...
Article
Full-text available
Responses of individual task-relevant sensory neurons can predict monkeys' trial-by-trial choices in perceptual decision-making tasks. Choice-correlated activity has been interpreted as evidence that the responses of these neurons are causally linked to perceptual judgements. To further test this hypothesis, we studied responses of orientation-sele...
Article
We develop a framework for rendering photographic images, taking into account display limitations, so as to optimize perceptual similarity between the rendered image and the original scene. We formulate this as a constrained optimization problem, in which we minimize a measure of perceptual dissimilarity, the Normalized Laplacian Pyramid Distance (...
Conference Paper
Full-text available
We describe an image compression method, consisting of a nonlinear analysis transformation, a uniform quantizer, and a nonlinear synthesis transformation. The transforms are constructed in three successive stages of convolutional linear filters and nonlinear activation functions. Unlike most convolutional neural networks, the joint nonlinearity is...
Article
We describe an image compression system, consisting of a nonlinear encoding transformation, a uniform quantizer, and a nonlinear decoding transformation. Like many deep neural network architectures, the transforms consist of layers of convolutional linear filters and nonlinear activation functions, but we use a joint nonlinearity that implements a...
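The three-stage structure described here (nonlinear analysis transform, uniform quantizer, nonlinear synthesis transform) can be sketched as follows. The layer sizes are arbitrary, and plain ReLUs stand in for the joint normalization nonlinearity used in the actual system, so this illustrates the pipeline rather than the published model.

```python
import torch
import torch.nn as nn

class ToyTransformCoder(nn.Module):
    """Analysis transform -> uniform (rounding) quantizer -> synthesis transform."""
    def __init__(self, channels=3, latent=32):
        super().__init__()
        self.analysis = nn.Sequential(
            nn.Conv2d(channels, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, latent, 5, stride=2, padding=2),
        )
        self.synthesis = nn.Sequential(
            nn.ConvTranspose2d(latent, 64, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, channels, 5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, x):
        code = self.analysis(x)
        quantized = torch.round(code)    # uniform scalar quantizer (applied at test time)
        return self.synthesis(quantized)
```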
Article
Linear-nonlinear (LN) models and their extensions have proven successful in describing transformations from stimuli to spiking responses of neurons in early stages of sensory hierarchies. Neural responses at later stages are highly nonlinear and have generally been better characterized in terms of their decoding performance on prespecified tasks. H...
Conference Paper
We introduce a general framework for end-to-end optimization of the rate–distortion performance of nonlinear transform codes assuming scalar quantization. The framework can be used to optimize any differentiable pair of analysis and synthesis transforms in combination with any differentiable perceptual metric. As an example, we consider a code buil...
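During training, the non-differentiable scalar quantizer is commonly replaced by additive uniform noise, and the objective is a weighted sum of a rate proxy (the negative log-likelihood of the code under a learned entropy model) and a distortion term. The loss below is a generic sketch under those assumptions; the weighting, the entropy model, and the use of mean squared error rather than a perceptual metric are illustrative choices, not the paper's.

```python
import torch

def rate_distortion_loss(x, analysis, synthesis, code_log_likelihood, lam=0.01):
    """Generic end-to-end rate-distortion objective for a nonlinear transform code."""
    code = analysis(x)
    noisy_code = code + (torch.rand_like(code) - 0.5)  # differentiable stand-in for rounding
    x_hat = synthesis(noisy_code)
    rate = -code_log_likelihood(noisy_code).sum()      # proxy for expected code length
    distortion = torch.mean((x_hat - x) ** 2)          # or any differentiable perceptual metric
    return distortion + lam * rate
```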
Article
Full-text available
Significance: The brain generates increasingly complex representations of the visual world to recognize objects, to form new memories, and to organize visual behavior. Relatively simple signals in the retina are transformed through a cascade of neural computations into highly complex responses in visual cortical areas deep in the temporal lobe. The...
Article
Full-text available
The mammalian brain is a metabolically expensive device, and evolutionary pressures have presumably driven it to make productive use of its resources. For sensory areas, this concept has been expressed more formally as an optimality principle: the brain maximizes the information that is encoded about relevant sensory variables, given available reso...
Article
We present an image quality metric based on the transformations associated with the early visual system: local luminance subtraction and local gain control. Images are decomposed using a Laplacian pyramid, which subtracts a local estimate of the mean luminance at multiple scales. Each pyramid coefficient is then divided by a local estimate of ampli...
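A stripped-down version of the two operations named here, local luminance subtraction at each pyramid scale followed by division by a local amplitude estimate, might look like the sketch below. The Gaussian filters, pyramid depth, and stabilizing constant are simplifying assumptions, so this illustrates the structure of the metric rather than reproducing the published one.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def normalized_pyramid(image, levels=4, sigma=1.0):
    """Laplacian-pyramid-like decomposition of a 2D grayscale image,
    with each band divided by a local estimate of its amplitude."""
    bands = []
    current = np.asarray(image, dtype=float)
    for _ in range(levels):
        lowpass = gaussian_filter(current, sigma)
        band = current - lowpass                        # local luminance subtraction
        gain = gaussian_filter(np.abs(band), sigma) + 0.1
        bands.append(band / gain)                       # local gain control
        current = lowpass[::2, ::2]                     # downsample for the next scale
    return bands

def quality_distance(img_a, img_b, levels=4):
    """Root-mean-square difference between normalized pyramid coefficients."""
    pyr_a = normalized_pyramid(img_a, levels)
    pyr_b = normalized_pyramid(img_b, levels)
    return float(np.sqrt(np.mean([np.mean((a - b) ** 2) for a, b in zip(pyr_a, pyr_b)])))
```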
Article
We perceive a stable environment despite the fact that visual information is essentially acquired in a sequence of snapshots separated by saccadic eye movements. The resolution of these snapshots varies—high in the fovea and lower in the periphery—and thus the formation of a stable percept presumably relies on the fusion of information acquired at...
Article
Full-text available
Two-photon imaging of calcium indicators allows simultaneous recording of responses of hundreds of neurons over hours and even days, but provides a relatively indirect measure of their spiking activity. Existing deconvolution algorithms attempt to recover spikes from observed imaging data, which are then commonly subjected to the same analyses that...
Conference Paper
Full-text available
We introduce a parametric nonlinear transformation that is well-suited for Gaussianizing data from natural images. The data are linearly transformed, and each component is then normalized by a pooled activity measure, computed by exponentiating a weighted sum of rectified and exponentiated components and a constant. We optimize the parameters of th...
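Spelled out, the pooled normalization described in this abstract (generalized divisive normalization) takes the form

$$y_i = \frac{x_i}{\bigl(\beta_i + \sum_j \gamma_{ij}\,|x_j|^{\alpha_{ij}}\bigr)^{\varepsilon_i}},$$

where the constants $\beta_i$, weights $\gamma_{ij}$, and exponents $\alpha_{ij}$, $\varepsilon_i$ are the parameters being optimized. The symbols are chosen here for illustration and may not match the paper's notation.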
Article
Full-text available
We develop a new method for visualizing and refining the invariances of learned representations. Given two reference images (typically, differing by some transformation), we synthesize a sequence of images lying on a path between them that is of minimal length in the space of a representation (a "representational geodesic"). If the transformation r...
Article
Neurons in visual cortex vary in their orientation selectivity. We measured responses of V1 and V2 cells to orientation mixtures and fit them with a model whose stimulus selectivity arises from the combined effects of filtering, suppression, and response nonlinearity. The model explains the diversity of orientation selectivity with neuron-to-neuron...
Article
The response properties of neurons in the early stages of the visual system can be described using the rectified responses of a set of self-similar, spatially shifted linear filters. In macaque primary visual cortex (V1), simple cell responses can be captured with a single filter, whereas complex cells combine a set of filters, creatin...
Article
Visual acuity rapidly declines with eccentricity. Consequently humans move their eyes several times per second, repeatedly placing a new target region of the scene under the high-resolution scrutiny of the fovea. But what happens to the low resolution information acquired in the periphery just before the fovea lands on its new target? Integrating t...
Article
What determines the shape of the spatiotemporal tuning curves of visual neurons? In V1, direction-selective simple cells have selectivity that is roughly separable in orientation, spatial frequency, and temporal frequency ("frequency separable"). Models for tuning in area MT predict that signals from V1 inputs with suitable spatial and temporal fre...
Article
Many have proposed that peripheral vision operates by computing statistical summaries over local portions of the visual field, and that the loss of information associated with this process underlies the phenomenon of "crowding" (Parkes et al. 2001; Pelli et al. 2004; Greenwood et al. 2009; Balas et al. 2009; Freeman and Simoncelli, 2011). Here,...
Article
What are the basic image attributes sensed by human vision? This fundamental question has proved difficult to answer experimentally. We introduce a novel psychophysical method that provides leverage for addressing this question in the context of visual texture perception. On each trial, the participant sees a brief display comprising a randomly pos...
Article
Full-text available
Neural responses are highly variable, and some portion of this variability arises from fluctuations in modulatory factors that alter their gain, such as adaptation, attention, arousal, expected or actual reward, emotion, and local metabolic resource availability. Regardless of their origin, fluctuations in these signals can confound or bias the inf...
Article
Full-text available
The perception of complex visual patterns emerges from neuronal activity in a cascade of areas in the primate cerebral cortex. We have probed the early stages of this cascade with "naturalistic" texture stimuli designed to capture key statistical features of natural images. Humans can recognize and classify these synthetic images and are insensitiv...
Article
Full-text available
We examine properties of perceptual image distortion models, computed as the mean squared error in the response of a 2-stage cascaded image transformation. Each stage in the cascade is composed of a linear transformation, followed by a local nonlinear normalization operation. We consider two such models. For the first, the structure of the linear t...
Conference Paper
Full-text available
Independent Component Analysis (ICA) is a generalization of Principal Component Analysis that optimizes a linear transformation to whiten and sparsify a family of source signals. The computational costs of ICA grow rapidly with dimensionality, and application to high-dimensional data is generally achieved by restricting to small windows, violating...
Article
Orientation selectivity emerges in primary visual cortex of primates and carnivores, but with striking diversity. Some neurons exhibit high selectivity for the orientation of bars and gratings, while others are hardly selective at all. It is not known how this diversity arises, nor what function it serves. In macaque V1, orientation selectivity ori...
Article
Many visual tasks, including object recognition and target search, require an “untangling” computation, in which inaccessible task-relevant information contained in an initial population representation is transformed into a more accessible linearly separable format. How are untangling computations implemented in the brain? Here we propose that a li...