Chapter

Spiking, Salience, and Saccades: Using Cognitive Models to Bridge the Gap Between “How” and “Why”

Abstract

Cognitive models, which describe cognition in terms of processes and representations, are ideally suited to help build bridges between “how” cognition works at the level of individual neurons and “why” cognition occurs at the level of goal-directed whole-organism behavior. This chapter presents an illustrative example of such a model, Salience by Competitive and Recurrent Interactions (SCRI; Cox et al., Psychol Rev, 2022), a theory of how neurons in the frontal eye field (FEF) integrate localization and identification information over time to represent the relative salience of objects in visual search. SCRI is framed in cognitive terms but is able to explain the millisecond-by-millisecond spiking activity of individual FEF neurons. By accounting for their dynamics, SCRI helps identify differences between neurons in terms of the computational mechanisms they instantiate. Such neural data also provide valuable constraints on SCRI that illuminate the relative importance of different types of competitive and recurrent interactions. Simulated activity from SCRI, coupled with a Gated Accumulator Model (GAM) of FEF movement neurons, reproduces the details of response time distributions in visual search behavior. The chapter includes extensive discussion of the difficult choices and exciting prospects involved in developing joint neuro-cognitive models like SCRI, developments enabled by recent advances in dynamic cognitive models and neural recording technologies.

References
Article
Full-text available
Decision field theory provides for a mathematical foundation leading to a dynamic, stochastic theory of decision behavior in an uncertain environment. This theory is used to explain (a) violations of stochastic dominance, (b) violations of strong stochastic transitivity, (c) violations of independence between alternatives, (d) serial position effects on preference, (e) speed–accuracy tradeoff effects in decision making, (f) the inverse relation between choice probability and decision time, (g) changes in the direction of preference under time pressure, (h) slower decision times for avoidance as compared with approach conflicts, and (i) preference reversals between choice and selling price measures of preference. The proposed theory is compared with 4 other theories of decision making under uncertainty.
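As a pointer to the formal structure behind this summary, the following is a minimal sketch of the linear stochastic difference equation commonly used to present decision field theory; the symbols and their roles follow the standard textbook presentation and are listed only for orientation, not as the exact system fitted in the article.

```latex
% Minimal sketch of the linear stochastic difference equation commonly used to
% present decision field theory (standard presentation; not the fitted system).
% P(t): vector of preference states, one entry per choice alternative.
P(t+h) = S\,P(t) + V(t+h), \qquad V(t+h) = C\,M\,W(t+h)
% S: feedback matrix (self-recurrence plus lateral inhibition between alternatives)
% C: contrast matrix comparing each alternative against the average of the others
% M: matrix of subjective values of each alternative on each attribute
% W(t+h): stochastic attention weights over attributes, fluctuating from moment to
%         moment, which is what makes the preference trajectory a random walk.
% A response is made when a component of P(t) first crosses its threshold.
```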
Preprint
Full-text available
The Weber-Fechner law proposes that our perceived sensory input increases with physical input on a logarithmic scale. Hippocampal "time cells" carry a record of recent experience by firing sequentially during a circumscribed period of time after a triggering stimulus. Different cells have "time fields" at different delays up to at least tens of seconds. Past studies suggest that time cells represent a compressed timeline by demonstrating that fewer time cells fire late in the delay and their time fields are wider. This paper asks whether the compression of time cells obeys the Weber-Fechner Law. Time cells were studied with a hierarchical Bayesian model that simultaneously accounts for the firing pattern at the trial level, cell level, and population level. This procedure allows separate estimates of the within-trial receptive field width and the across-trial variability. The analysis at the trial level suggests the time cells represent an internally coherent timeline as a group. Furthermore, even after isolating across-trial variability, time field width increases linearly with delay. Finally, we find that the time cell population is distributed evenly on a logarithmic time scale. Together, these findings provide strong quantitative evidence that the internal neural temporal representation is logarithmically compressed and obeys a neural instantiation of the Weber-Fechner Law.
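Two of the quantitative claims above (time-field width growing linearly with delay, and peaks distributed evenly on a log scale) can be stated compactly; the parameterization below is an illustrative paraphrase, not the hierarchical Bayesian model itself.

```latex
% Weber-Fechner form: perceived magnitude grows logarithmically with physical input.
\psi(t) = k \log\!\left(\frac{t}{t_0}\right)
% Two population-level signatures of a logarithmically compressed timeline:
% 1. A time field peaking at delay t* has a width that grows linearly with t*,
%    so relative temporal precision is constant across the delay:
\sigma(t^{*}) \propto t^{*}
% 2. Time-field peaks are spread evenly on a logarithmic axis, so their density
%    on a linear axis falls off approximately as 1/t*:
p(t^{*}) \propto \frac{1}{t^{*}}
```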
Article
Full-text available
We present a model of the encoding of episodic associations between items, extending the dynamic approach to retrieval and decision making of Cox and Shiffrin (2017) to the dynamics of encoding. This model is the first unified account of how similarity affects associative encoding and recognition, including why studied pairs consisting of similar items are easier to recognize, why it is easy to reject novel pairs that recombine items that were studied alongside similar items, and why there is an early bias to falsely recognize novel pairs consisting of similar items that is later suppressed (Dosher, 1984; Dosher & Rosedale, 1991). Items are encoded by sampling features into limited-capacity parallel channels in working memory. Associations are encoded by conjoining features across these channels. Because similar items have common features, their channels are correlated which increases the capacity available to encode associative information. The model additionally accounts for data from a new experiment illustrating the importance of similarity for associative encoding across a variety of stimulus types (objects, words, and abstract forms) and types of similarity (perceptual or conceptual), illustrating the generality of the model. (PsycINFO Database Record (c) 2020 APA, all rights reserved).
Article
Full-text available
Modeling human cognition is challenging because there are infinitely many mechanisms that can generate any given observation. Some researchers address this by constraining the hypothesis space through assumptions about what the human mind can and cannot do, while others constrain it through principles of rationality and adaptation. Recent work in economics, psychology, neuroscience, and linguistics has begun to integrate both approaches by augmenting rational models with cognitive constraints, incorporating rational principles into cognitive architectures, and applying optimality principles to understanding neural representations. We identify the rational use of limited resources as a unifying principle underlying these diverse approaches, expressing it in a new cognitive modeling paradigm called resource-rational analysis. The integration of rational principles with realistic cognitive constraints makes resource-rational analysis a promising framework for reverse-engineering cognitive mechanisms and representations. It has already shed new light on the debate about human rationality and can be leveraged to revisit classic questions of cognitive psychology within a principled computational framework. We demonstrate that resource-rational models can reconcile the mind's most impressive cognitive skills with people's ostensive irrationality. Resource-rational analysis also provides a new way to connect psychological theory more deeply with artificial intelligence, economics, neuroscience, and linguistics.
Article
Full-text available
Frontal eye field (FEF) in macaque monkeys contributes to visual attention, visual-motor transformations and production of eye movements. Traditionally, neurons in FEF have been classified by the magnitude of increased discharge rates following visual stimulus presentation, during a waiting period, and associated with eye movement production. However, considerable heterogeneity remains within the traditional visual, visuomovement, and movement categories. Cluster analysis is a data-driven method of identifying self-segregating groups within a dataset. Because many cluster analysis techniques exist and outcomes vary with analysis assumptions, consensus clustering aggregates over multiple analyses, identifying robust groups. To describe more comprehensively the neuronal composition of FEF, we applied a consensus clustering technique for unsupervised categorization of patterns of spike rate modulation measured during a memory-guided saccade task. We report 10 functional categories, expanding on the traditional 3 categories. Categories were distinguished by latency, magnitude, and sign of visual response; the presence of sustained activity; and the dynamics, magnitude and sign of saccade-related modulation. Consensus clustering can include other metrics and can be applied to datasets from other brain regions to provide better information guiding microcircuit models of cortical function.
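The consensus-clustering step can be illustrated with a short, generic sketch: run a base clustering many times on resampled data, accumulate a co-assignment matrix, and then cluster that matrix. The choice of k-means as the base algorithm, the subsampling scheme, and all parameter values below are illustrative assumptions, not the published pipeline.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

def consensus_cluster(X, n_clusters=10, n_runs=100, subsample=0.8, seed=0):
    """Generic consensus-clustering sketch (illustrative, not the published pipeline).

    X: (n_neurons, n_features) array, e.g., normalized spike-rate modulation profiles.
    Returns consensus labels and the co-assignment matrix."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    co = np.zeros((n, n))       # how often two neurons fall in the same cluster
    counts = np.zeros((n, n))   # how often two neurons were subsampled together
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(subsample * n), replace=False)
        labels = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=int(rng.integers(1 << 31))).fit_predict(X[idx])
        same = (labels[:, None] == labels[None, :]).astype(float)
        co[np.ix_(idx, idx)] += same
        counts[np.ix_(idx, idx)] += 1.0
    consensus = np.divide(co, counts, out=np.zeros_like(co), where=counts > 0)
    # Cluster the consensus matrix itself, treating 1 - consensus as a distance.
    # (older scikit-learn versions use `affinity=` instead of `metric=`)
    final = AgglomerativeClustering(n_clusters=n_clusters, metric="precomputed",
                                    linkage="average").fit_predict(1.0 - consensus)
    return final, consensus

# Toy usage on random data standing in for per-neuron modulation profiles:
labels, C = consensus_cluster(np.random.default_rng(1).standard_normal((60, 20)), n_clusters=4)
```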
Article
Full-text available
In studies of voluntary movement, a most elemental quantity is the reaction time (RT) between the onset of a visual stimulus and a saccade toward it. However, this RT demonstrates extremely high variability which, in spite of extensive research, remains unexplained. It is well established that, when a visual target appears, oculomotor activity gradually builds up until a critical level is reached, at which point a saccade is triggered. Here, based on computational work and single-neuron recordings from monkey frontal eye field (FEF), we show that this rise-to-threshold process starts from a dynamic initial state that already contains other incipient, internally driven motor plans, which compete with the target-driven activity to varying degrees. The ensuing conflict resolution process, which manifests in subtle covariations between baseline activity, buildup rate, and threshold, consists of fundamentally deterministic interactions, and explains the observed RT distributions while invoking only a small amount of intrinsic randomness.
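The rise-to-threshold account summarized here is often written in a simple linear form (as in LATER-style models); the sketch below is offered only to show how baseline, buildup rate, and threshold jointly determine reaction time, not as the specific model fitted in the article.

```latex
% Linear rise-to-threshold sketch: activity starts at baseline A_0, grows at buildup
% rate r, and a saccade is triggered when it first reaches threshold \theta.
A(t) = A_0 + r\,t, \qquad \mathrm{RT} = \frac{\theta - A_0}{r} + T_{\mathrm{nd}}
% T_nd: non-decision (afferent and efferent) delays.
% Variability in RT follows from trial-to-trial variability in A_0, r, and \theta;
% the abstract's point is that covariations among these quantities, induced by
% competing incipient motor plans, reproduce the RT distribution while requiring
% only a small amount of intrinsic randomness.
```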
Article
Full-text available
We present a dynamic model of memory that integrates the processes of perception, retrieval from knowledge, retrieval of events, and decision making as these evolve from 1 moment to the next. The core of the model is that recognition depends on tracking changes in familiarity over time from an initial baseline generally determined by context, with these changes depending on the availability of different kinds of information at different times. A mathematical implementation of this model leads to precise, accurate predictions of accuracy, response time, and speed–accuracy trade-off in episodic recognition at the levels of both groups and individuals across a variety of paradigms. Our approach leads to novel insights regarding word frequency, speeded responding, context reinstatement, short-term priming, similarity, source memory, and associative recognition, revealing how the same set of core dynamic principles can help unify otherwise disparate phenomena in the study of memory.
Article
Full-text available
A key problem in computational neuroscience is to find simple, tractable models that are nevertheless flexible enough to capture the response properties of real neurons. Here we examine the capabilities of recurrent point process models known as Poisson generalized linear models (GLMs). These models are defined by a set of linear filters and a point nonlinearity and are conditionally Poisson spiking. They have desirable statistical properties for fitting and have been widely used to analyze spike trains from electrophysiological recordings. However, the dynamical repertoire of GLMs has not been systematically compared to that of real neurons. Here we show that GLMs can reproduce a comprehensive suite of canonical neural response behaviors, including tonic and phasic spiking, bursting, spike rate adaptation, type I and type II excitation, and two forms of bistability. GLMs can also capture stimulus-dependent changes in spike timing precision and reliability that mimic those observed in real neurons, and can exhibit varying degrees of stochasticity, from virtually deterministic responses to greater-than-Poisson variability. These results show that Poisson GLMs can exhibit a wide range of dynamic spiking behaviors found in real neurons, making them well suited for qualitative dynamical as well as quantitative statistical studies of single-neuron and population response properties.
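A minimal simulation of the model class examined here is sketched below: a conditionally Poisson spiking model with a stimulus filter, a post-spike history filter, and an exponential nonlinearity. The filter shapes and parameter values are arbitrary choices for illustration, not values from the paper.

```python
import numpy as np

def simulate_poisson_glm(stimulus, dt=0.001, seed=0):
    """Simulate a recurrent Poisson GLM (stimulus filter + spike-history filter).

    stimulus: 1-D array of stimulus values sampled every dt seconds.
    Returns a binary spike train of the same length. Illustrative parameters only."""
    rng = np.random.default_rng(seed)
    t_k = np.arange(0, 0.10, dt)                       # 100 ms stimulus filter
    k = 40.0 * np.exp(-t_k / 0.02) * np.sin(2 * np.pi * t_k / 0.05)
    t_h = np.arange(dt, 0.05, dt)                      # 50 ms post-spike history filter
    h = -8.0 * np.exp(-t_h / 0.005) + 1.5 * np.exp(-t_h / 0.03)  # refractoriness + rebound
    b = np.log(5.0)                                    # baseline log-rate (~5 spikes/s)

    drive = np.convolve(stimulus, k, mode="full")[:len(stimulus)] * dt
    spikes = np.zeros(len(stimulus))
    for t in range(len(stimulus)):
        hist = 0.0
        for j, w in enumerate(h):                      # feedback from recent spikes
            if t - 1 - j >= 0:
                hist += w * spikes[t - 1 - j]
        rate = np.exp(b + drive[t] + hist)             # conditional intensity (spikes/s)
        spikes[t] = rng.poisson(rate * dt) > 0         # conditionally Poisson spiking
    return spikes

# Example: response to a noisy step stimulus lasting two seconds.
stim = np.concatenate([np.zeros(500), np.ones(1500)]) \
       + 0.2 * np.random.default_rng(1).standard_normal(2000)
print("simulated spike count:", int(simulate_poisson_glm(stim).sum()))
```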
Article
Full-text available
We generalize the integrated system model of Smith and Ratcliff (2009) to obtain a new theory of attentional selection in brief, multielement visual displays. The theory proposes that attentional selection occurs via competitive interactions among detectors that signal the presence of task-relevant features at particular display locations. The outcome of the competition, together with attention, determines which stimuli are selected into visual short-term memory (VSTM). Decisions about the contents of VSTM are made by a diffusion-process decision stage. The selection process is modeled by coupled systems of shunting equations, which perform gated where-on-what pathway VSTM selection. The theory provides a computational account of key findings from attention tasks with near-threshold stimuli. These are (a) the success of the MAX model of visual search and spatial cuing, (b) the distractor homogeneity effect, (c) the double-target detection deficit, (d) redundancy costs in the post-stimulus probe task, (e) the joint item and information capacity limits of VSTM, and (f) the object-based nature of attentional selection. We argue that these phenomena are all manifestations of an underlying competitive VSTM selection process, which arise as a natural consequence of our theory.
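To show what coupled shunting (competitive) equations look like in practice, here is a small numerical sketch of detectors competing for selection; the equation and parameters are a generic shunting form chosen for illustration and omit the attention gating, VSTM, and diffusion decision stages of the full theory.

```python
import numpy as np

def competitive_shunting(E, w_cross=0.5, decay=2.0, B=1.0, dt=0.001, T=0.5):
    """Integrate generic shunting competition between feature detectors.

    E: excitatory drive at each display location (strength of the task-relevant
       feature there). Returns the trajectory of detector activities over time.
    dV_i/dt = -decay*V_i + (B - V_i)*E_i - V_i * w_cross * sum_{j != i} V_j"""
    n_steps = int(T / dt)
    V = np.zeros(len(E))
    traj = np.zeros((n_steps, len(E)))
    for t in range(n_steps):
        inhibition = w_cross * (V.sum() - V)          # competition from the other detectors
        dV = -decay * V + (B - V) * E - V * inhibition
        V = np.clip(V + dt * dV, 0.0, B)              # activity bounded between 0 and B
        traj[t] = V
    return traj

# A target (drive 3.0) among three weaker distractors (drive 2.0):
traj = competitive_shunting(np.array([3.0, 2.0, 2.0, 2.0]))
print("final activities:", np.round(traj[-1], 3))     # the target ends with the highest activity
```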
Article
Full-text available
The time course of perceptual choice is discussed in a model of gradual, leaky, stochastic, and competitive information accumulation in nonlinear decision units. Special cases of the model match a classical diffusion process, but leakage and competition work together to address several challenges to existing diffusion, random walk, and accumulator models. The model accounts for data from choice tasks using both time-controlled (e.g., response signal) and standard reaction time paradigms and its adequacy compares favorably with other approaches. A new paradigm that controls the time of arrival of information supporting different choice alternatives provides further support. The model captures choice behavior regardless of the number of alternatives, accounting for the log-linear relation between reaction time and number of alternatives (Hick's law) and explains a complex pattern of visual and contextual priming in visual word identification.
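The leaky, competitive accumulation process described here has a standard discretized form; the single-trial simulation below uses illustrative parameter values, not fitted ones.

```python
import numpy as np

def lca_trial(inputs, leak=1.0, beta=1.5, noise=0.3, threshold=0.8,
              dt=0.01, max_t=3.0, seed=None):
    """One trial of a leaky competing accumulator race (illustrative parameters).

    inputs: evidence favoring each alternative. Returns (choice, reaction_time)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(len(inputs))
    t = 0.0
    while t < max_t:
        inhibition = beta * (x.sum() - x)               # competition from the other units
        dx = (np.asarray(inputs) - leak * x - inhibition) * dt \
             + noise * np.sqrt(dt) * rng.standard_normal(len(x))
        x = np.maximum(x + dx, 0.0)                     # activations cannot go negative
        t += dt
        if x.max() >= threshold:
            return int(np.argmax(x)), t
    return None, max_t                                  # no decision within the deadline

trials = [lca_trial([1.2, 1.0, 1.0], seed=s) for s in range(200)]
decided = [(c, rt) for c, rt in trials if c is not None]
print("mean RT:", round(float(np.mean([rt for _, rt in decided])), 2),
      "P(choose strongest):", round(float(np.mean([c == 0 for c, _ in decided])), 2))
```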
Article
Full-text available
Develops a theory of memory retrieval and shows that it applies over a range of experimental paradigms. Access to memory traces is viewed in terms of a resonance metaphor. The probe item evokes the search set on the basis of probe–memory item relatedness, just as a ringing tuning fork evokes sympathetic vibrations in other tuning forks. Evidence is accumulated in parallel from each probe–memory item comparison, and each comparison is modeled by a continuous random walk process. In item recognition, the decision process is self-terminating on matching comparisons and exhaustive on nonmatching comparisons. The mathematical model produces predictions about accuracy, mean reaction time, error latency, and reaction time distributions that are in good accord with data from 2 experiments conducted with 6 undergraduates. The theory is applied to 4 item recognition paradigms (Sternberg, prememorized list, study–test, and continuous) and to speed–accuracy paradigms; results are found to provide a basis for comparison of these paradigms. It is noted that neural network models can be interfaced to the retrieval theory with little difficulty and that semantic memory models may benefit from such a retrieval scheme. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Tested the 2-process theory of detection, search, and attention presented by the current authors (1977) in a series of experiments. The studies (a) demonstrate the qualitative difference between 2 modes of information processing: automatic detection and controlled search; (b) trace the course of the learning of automatic detection, of categories, and of automatic-attention responses; and (c) show the dependence of automatic detection on attending responses and demonstrate how such responses interrupt controlled processing and interfere with the focusing of attention. The learning of categories is shown to improve controlled search performance. A general framework for human information processing is proposed. The framework emphasizes the roles of automatic and controlled processing. The theory is compared to and contrasted with extant models of search and attention. (3½ p ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
A 2-process theory of human information processing is proposed and applied to detection, search, and attention phenomena. Automatic processing is activation of a learned sequence of elements in long-term memory that is initiated by appropriate inputs and then proceeds automatically--without S control, without stressing the capacity limitations of the system, and without necessarily demanding attention. Controlled processing is a temporary activation of a sequence of elements that can be set up quickly and easily but requires attention, is capacity-limited (usually serial in nature), and is controlled by the S. A series of studies, with approximately 8 Ss, using both reaction time and accuracy measures is presented, which traces these concepts in the form of automatic detection and controlled search through the areas of detection, search, and attention. Results in these areas are shown to arise from common mechanisms. Automatic detection is shown to develop following consistent mapping of stimuli to responses over trials. Controlled search was utilized in varied-mapping paradigms, and in the present studies, it took the form of serial, terminating search. (60 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Visual psychophysics has shown that the perceptual representation of a stimulus has complex time-varying properties that depend on the response characteristics of the channel on which it is encoded. A fundamental expression of these properties is the distinction between sustained and transient processing channels. A theoretical and mathematical framework is introduced that allows such properties to be incorporated into fully stochastic models of simple reaction time (RT). These models, the multichannel leaky stochastic integrators, combine a linear filter model of stimulus encoding with an accumulative decision process and yield a stimulus representation described by a time-inhomogeneous Ornstein-Uhlenbeck diffusion process. Methods for obtaining RT distributions for these models are described, together with comparative fits to luminance-increment data obtained under conditions of channel pooling and channel independence. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
We describe a stochastic accumulator model demonstrating that visual search performance can be understood as a gated feedforward cascade from a salience map to multiple competing accumulators. The model quantitatively accounts for behavior and predicts neural dynamics of macaque monkeys performing visual search for a target stimulus among different numbers of distractors. The salience accumulated in the model is equated with the spike trains recorded from visually responsive neurons in the frontal eye field. Accumulated variability in the firing rates of these neurons explains choice probabilities and the distributions of correct and error response times with search arrays of different set sizes if the accumulators are mutually inhibitory. The dynamics of the stochastic accumulators quantitatively predict the activity of presaccadic movement neurons that initiate eye movements if gating inhibition prevents accumulation before the representation of stimulus salience emerges. Adjustments in the level of gating inhibition can control trade-offs in speed and accuracy that optimize visual search performance.
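A hedged sketch of the gated feedforward cascade described here: salience inputs (standing in for visual-neuron firing rates) reach racing accumulators only to the extent that they exceed a gate, and the accumulators inhibit one another. The exact form of the gating and inhibition, and all parameter values, are illustrative assumptions rather than the fitted model.

```python
import numpy as np

def gated_accumulator_trial(salience, gate=0.3, beta=2.0, leak=0.5,
                            threshold=0.25, noise=0.05, dt=0.001, seed=None):
    """One trial of a gated, mutually inhibitory accumulator race (illustrative only).

    salience: (n_steps, n_locations) array of moment-by-moment salience at each
              search location (e.g., smoothed visual-neuron firing rates).
    Returns (chosen_location, reaction_time_in_seconds) or (None, trial_duration)."""
    rng = np.random.default_rng(seed)
    m = np.zeros(salience.shape[1])                    # movement-unit accumulators
    for t in range(salience.shape[0]):
        drive = np.maximum(salience[t] - gate, 0.0)    # gating inhibition on the input
        inhibition = beta * (m.sum() - m)              # mutual inhibition between accumulators
        dm = (drive - leak * m - inhibition) * dt \
             + noise * np.sqrt(dt) * rng.standard_normal(len(m))
        m = np.maximum(m + dm, 0.0)                    # activity cannot go negative
        if m.max() >= threshold:
            return int(np.argmax(m)), (t + 1) * dt
    return None, salience.shape[0] * dt

# Toy input: the target location becomes more salient ~100 ms after array onset;
# because baseline salience never exceeds the gate, nothing accumulates before then.
n_steps, n_loc = 1000, 8
sal = np.full((n_steps, n_loc), 0.3)
sal[100:, 0] = 1.0                                     # location 0 holds the target
print(gated_accumulator_trial(sal, seed=2))
```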
Article
Full-text available
Stochastic accumulator models account for response time in perceptual decision-making tasks by assuming that perceptual evidence accumulates to a threshold. The present investigation mapped the firing rate of frontal eye field (FEF) visual neurons onto perceptual evidence and the firing rate of FEF movement neurons onto evidence accumulation to test alternative models of how evidence is combined in the accumulation process. The models were evaluated on their ability to predict both response time distributions and movement neuron activity observed in monkeys performing a visual search task. Models that assume gating of perceptual evidence to the accumulating units provide the best account of both behavioral and neural data. These results identify discrete stages of processing with anatomically distinct neural populations and rule out several alternative architectures. The results also illustrate the use of neurophysiological data as a model selection tool and establish a novel framework to bridge computational and neural levels of explanation.
Article
Full-text available
The goal of this study was to obtain a better understanding of the physiological basis of errors of visual search. Previous research has shown that search errors occur when visual neurons in the frontal eye field (FEF) treat distractors as if they were targets. We replicated this finding during an inefficient form search and extended it by measuring simultaneously a macaque homologue of an event-related potential indexing the allocation of covert attention known as the m-N2pc. Based on recent work, we expected errors of selection in FEF to propagate to areas of extrastriate cortex responsible for allocating attention and implicated in the generation of the m-N2pc. Consistent with this prediction, we discovered that when FEF neurons selected a distractor instead of the search target, the m-N2pc shifted in the same, incorrect direction prior to the erroneous saccade. This suggests that such errors are due to a systematic misorienting of attention from the initial stages of visual processing. Our analyses also revealed distinct neural correlates of false alarms and guesses. These results demonstrate that errant gaze shifts during visual search arise from errant attentional processing.
Article
Full-text available
A theory of discrimination which assumes that subjects compare psychological values evoked by a stimulus to a subjective referent is proposed. Momentary differences between psychological values for the stimulus and the referent are accumulated over time until one or the other of two response thresholds is first exceeded. The theory is analyzed as a random walk bounded between two absorbing barriers. A general solution to response conditioned expected response times is computed and the important role played by the moment generating function (mgf) for increments to the random walk is examined. From considerations of the mgf it is shown that unlike other random walk models [Stone, 1960; Laming, 1968] the present theory does not imply that response conditioned mean correct and error times must be equal. For two fixed stimuli and a fixed referent it is shown that by controlling values of response thresholds, subjects can produce Receiver Operating Characteristics similar or identical to those predicted by Signal Detection Theory, High Threshold Theory, or Low Threshold Theory.
Article
Full-text available
Visual search for a target object among distractors often takes longer when more distractors are present. To understand the neural basis of this capacity limitation, we recorded activity from visually responsive neurons in the frontal eye field (FEF) of macaque monkeys searching for a target among distractors defined by form (randomly oriented T or L). To test the hypothesis that the delay of response time with increasing number of distractors originates in the delay of attention allocation by FEF neurons, we manipulated the number of distractors presented with the search target. When monkeys were presented with more distractors, visual target selection was delayed and neuronal activity was reduced in proportion to longer response time. These findings indicate that the time taken by FEF neurons to select the target contributes to the variation in visual search efficiency.
Article
Full-text available
This article provides a systems framework for the analysis of cortical and subcortical interactions in the control of saccadic eye movements. A major thesis of this model is that a topography of saccade direction and amplitude is preserved through multiple projections between brain regions until they are finally transformed into a temporal pattern of activity that drives the eyes to the target. The control of voluntary saccades to visual and remembered targets is modeled in terms of interactions between posterior parietal cortex, frontal eye fields, the basal ganglia (caudate and substantia nigra), superior colliculus, mediodorsal thalamus, and the saccade generator of the brainstem. Interactions include the modulation of eye movement motor error maps by topographic inhibitory projections, dynamic remapping of spatial target representations in saccade motor error maps, and sustained neural activity that embodies spatial memory. Models of these mechanisms implemented in our Neural Simulation Language simulate behavior and neural activity described in the literature, and suggest new experiments.
Article
Full-text available
A new theory of search and visual attention is presented. Results support neither a distinction between serial and parallel search nor between search for features and conjunctions. For all search materials, instead, difficulty increases with increased similarity of targets to nontargets and decreased similarity between nontargets, producing a continuum of search efficiency. A parallel stage of perceptual grouping and description is followed by competitive interaction between inputs, guiding selective access to awareness and action. An input gains weight to the extent that it matches an internal description of that information needed in current behavior (hence the effect of target-nontarget similarity). Perceptual grouping encourages input weights to change together (allowing "spreading suppression" of similar nontargets). The theory accounts for harmful effects of nontargets resembling any possible target, the importance of local nontarget grouping, and many other findings.
Article
Full-text available
Describes 2 contrasting models for response latency in the yes-no signal detection situation and outlines their main characteristics. The 1st model is a generalization of the notion that latency in detection is some inverse function of distance from the criterion; the 2nd proposes that instead of a single observation on any 1 trial the S makes multiple observations and a count of these observations determines the response and its latency. Although the predictions of the models are similar in many respects, there are some points concerning the ordering of mean latencies, reaction time receiver operating characteristic curves, latency-probability relations, and the constancy of d' which differentiate them. Particularly important, the multiple observations model predicts that response bias and sensitivity are interdependent. The possibility of multiple observations in detection is briefly considered. (2 p. ref.)
Article
Full-text available
Discusses how competition between afferent data and learned feedback expectancies can stabilize a developing code by buffering committed populations of detectors against continual erosion by new environmental demands. The gating phenomena that result lead to dynamically maintained critical periods and to attentional phenomena such as overshadowing in the adult. The functional unit of cognitive coding is suggested to be an adaptive resonance, or amplification and prolongation of neural activity, that occurs when afferent data and efferent expectancies reach consensus through a matching process. The resonant state embodies the perceptual event, and its amplified and sustained activities are capable of driving slow changes of long-term memory. These mechanisms help to explain and predict (a) positive and negative aftereffects, the McCollough effect, spatial frequency adaptation, monocular rivalry, binocular rivalry and hysteresis, pattern completion, and Gestalt switching; (b) analgesia, partial reinforcement acquisition effect, conditioned reinforcers, underaroused vs overaroused depression; (c) the contingent negative variation, P300, and pontogeniculo-occipital waves; and (d) olfactory coding, corticogeniculate feedback, matching of proprioceptive and terminal motor maps, and cerebral dominance. (125 ref)
Article
Full-text available
The primate visual system consists of at least two processing streams, one passing ventrally into temporal cortex that is responsible for object vision, and the other running dorsally into parietal cortex that is responsible for spatial vision. How information from these two streams is combined for perception and action is not understood. Visually guided eye movements require information about both feature identity and location, so we investigated the topographic organization of visual cortex connections with frontal eye field (FEF), the final stage of cortical processing for saccadic eye movements. Multiple anatomical tracers were placed either in parietal and temporal cortex or in different parts of FEF in individual macaque monkeys. Convergence from the dorsal and ventral processing streams occurred in lateral FEF but not in medial FEF. Certain extrastriate areas with retinotopic visual field organizations projected topographically onto FEF. The dorsal bank of the superior temporal sulcus projected to medial FEF; the ventral bank, to lateral FEF, and the fundus, throughout FEF. Thus, lateral FEF, which is responsible for generating short saccades, receives visual afferents from the foveal representation in retinotopically organized areas, from areas that represent central vision in inferotemporal cortex and from other areas having no retinotopic order. In contrast, medial FEF, which is responsible for generating longer saccades, is innervated by the peripheral representation of retinotopically organized areas, from areas that emphasize peripheral vision or are multimodal and from other areas that have no retinotopic order or are auditory.
Article
Full-text available
When humans respond to sensory stimulation, their reaction times tend to be long and variable relative to neural transduction and transmission times. The neural processes responsible for the duration and variability of reaction times are not understood. Single-cell recordings in a motor area of the cerebral cortex in behaving rhesus monkeys (Macaca mulatta) were used to evaluate two alternative mathematical models of the processes that underlie reaction times. Movements were initiated if and only if the neural activity reached a specific and constant threshold activation level. Stochastic variability in the rate at which neural activity grew toward that threshold resulted in the distribution of reaction times. This finding elucidates a specific link between motor behavior and activation of neurons in the cerebral cortex.
Article
Decisions about where to move the eyes depend on neurons in frontal eye field (FEF). Movement neurons in FEF accumulate salience evidence derived from FEF visual neurons to select the location of a saccade target among distractors. How visual neurons achieve this salience representation is unknown. We present a neuro-computational model of target selection called salience by competitive and recurrent interactions (SCRI), based on the competitive interaction model of attentional selection and decision-making (Smith & Sewell, 2013). SCRI selects targets by synthesizing localization and identification information to yield a dynamically evolving representation of salience across the visual field. SCRI accounts for neural spiking of individual FEF visual neurons, explaining idiosyncratic differences in neural dynamics with specific parameters. Many visual neurons resolve the competition between search items through feedforward inhibition between signals representing different search items, some also require lateral inhibition, and many act as recurrent gates to modulate the incoming flow of information about stimulus identity. SCRI was tested further by using simulated spiking representations of visual salience as input to the gated accumulator model of FEF movement neurons (Purcell et al., 2010, 2012). Predicted saccade response times fit those observed for search arrays of different set sizes and different target-distractor similarities, and accumulator trajectories replicated movement neuron discharge rates. These findings offer new insights into visual decision-making through converging neuro-computational constraints and provide a novel computational account of the diversity of FEF visual neurons. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
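To make the three kinds of interaction named in this abstract concrete, here is a deliberately simplified sketch of a salience map receiving localization and identification inputs, with feedforward inhibition between items, lateral inhibition between salience units, and a recurrent gate on the identification input. The update rule, the sigmoid gate, and every parameter value are illustrative assumptions; the published SCRI equations differ in detail and should be consulted directly.

```python
import numpy as np

def scri_like_step(v, loc, ident, dt=0.001, leak=2.0,
                   w_ff=0.2, w_lat=0.4, w_gate=3.0, gate_mid=0.2, B=1.0):
    """One Euler step of a simplified salience map (NOT the published SCRI equations).

    v:     current salience activations, one per search item
    loc:   localization evidence for each item (something is at this location)
    ident: identification evidence for each item (how target-like the item looks)"""
    ff_inhib = w_ff * (loc.sum() - loc)        # feedforward inhibition: localization signals
                                               # from the other items suppress this one
    lat_inhib = w_lat * (v.sum() - v)          # lateral inhibition among salience units
    gate = 1.0 / (1.0 + np.exp(-w_gate * (v - gate_mid)))
                                               # recurrent gate: a unit's own activity modulates
                                               # the inflow of identification information
    drive = np.maximum(loc + gate * ident - ff_inhib, 0.0)
    dv = -leak * v + (B - v) * drive - v * lat_inhib
    return np.clip(v + dt * dv, 0.0, B)

# Toy search array: item 0 is the target (highest identification evidence).
loc = np.array([1.0, 1.0, 1.0, 1.0])
ident = np.array([1.5, 0.4, 0.4, 0.4])
v = np.zeros(4)
for _ in range(500):                           # let the map evolve for ~500 ms
    v = scri_like_step(v, loc, ident)
print(np.round(v, 3))                          # the target's salience exceeds the distractors'
```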
Article
The ultimate test of the validity of a cognitive theory is its ability to predict patterns of empirical data. Cognitive models formalize this test by making specific processing assumptions that yield mathematical predictions, and the mathematics allow the models to be fitted to data. As the field of cognitive science has grown to address increasingly complex problems, so too has the complexity of models increased. Some models have become so complex that the mathematics detailing their predictions are intractable, meaning that the model can only be simulated. Recently, new Bayesian techniques have made it possible to fit these simulation-based models to data. These techniques have even allowed simulation-based models to transition into neuroscience, where tests of cognitive theories can be biologically substantiated.
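One family of techniques that makes simulation-only models fittable is approximate Bayesian computation; the rejection sampler below is a generic sketch paired with a toy race model whose likelihood is never evaluated. The specific algorithms referred to in the article may differ (for example, kernel-based likelihood approximations or MCMC variants).

```python
import numpy as np

def simulate_race_rts(drift, n_trials, seed):
    """Toy simulation-only model: two accumulators race to a bound of 1.0; each trial's
    rates are drawn anew, and the RT is the faster finishing time plus non-decision time."""
    rng = np.random.default_rng(seed)
    rates = np.clip(rng.normal(loc=[drift, 1.0], scale=0.3, size=(n_trials, 2)), 0.05, None)
    return (1.0 / rates).min(axis=1) + 0.2

def abc_rejection(observed, n_samples=2000, tol=0.03, seed=0):
    """Generic ABC rejection sampler: keep parameter proposals whose simulated summary
    statistics land within `tol` of the observed summaries."""
    rng = np.random.default_rng(seed)
    obs_summary = np.array([observed.mean(), observed.std()])
    kept = []
    for i in range(n_samples):
        drift = rng.uniform(1.0, 4.0)                       # draw from the prior on the drift rate
        sim = simulate_race_rts(drift, n_trials=len(observed), seed=i)
        sim_summary = np.array([sim.mean(), sim.std()])
        if np.linalg.norm(sim_summary - obs_summary) < tol:
            kept.append(drift)                              # accepted posterior sample
    return np.array(kept)

data = simulate_race_rts(drift=2.0, n_trials=200, seed=42)  # pretend these are observed RTs
posterior = abc_rejection(data)
print("accepted:", len(posterior), "posterior mean drift:", round(float(posterior.mean()), 2))
```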
Book
Available again, an influential book that offers a framework for understanding visual perception and considers fundamental questions about the brain and its functions. David Marr's posthumously published Vision (1982) influenced a generation of brain and cognitive scientists, inspiring many to enter the field. In Vision, Marr describes a general framework for understanding visual perception and touches on broader questions about how the brain and its functions can be studied and understood. Researchers from a range of brain and cognitive sciences have long valued Marr's creativity, intellectual power, and ability to integrate insights and data from neuroscience, psychology, and computation. This MIT Press edition makes Marr's influential work available to a new generation of students and scientists. In Marr's framework, the process of vision constructs a set of representations, starting from a description of the input image and culminating with a description of three-dimensional objects in the surrounding environment. A central theme, and one that has had far-reaching influence in both neuroscience and cognitive science, is the notion of different levels of analysis—in Marr's framework, the computational level, the algorithmic level, and the hardware implementation level. Now, thirty years later, the main problems that occupied Marr remain fundamental open problems in the study of perception. Vision provides inspiration for the continuing efforts to integrate knowledge from cognition and computation to understand vision and the brain.
Article
A logarithmic assessment of the performance of a predicting density is found to lead to asymptotic equivalence of choice of model by cross-validation and Akaike's criterion, when maximum likelihood estimation is used within each model.
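The equivalence can be stated compactly; the notation below is a standard paraphrase of the result, with k the number of free parameters and with maximum-likelihood estimates from the full data and from the data with observation i left out.

```latex
% Leave-one-out log predictive assessment (the cross-validatory criterion):
\mathrm{CV} = \sum_{i=1}^{n} \log p\!\left(x_i \mid \hat\theta_{(-i)}\right)
% Akaike's criterion for a model with k free parameters:
\mathrm{AIC} = -2\,\log p\!\left(x \mid \hat\theta\right) + 2k
% The asymptotic equivalence (under regularity conditions, with maximum-likelihood
% estimation within each model):
\mathrm{CV} \;\approx\; \log p\!\left(x \mid \hat\theta\right) - k \;=\; -\tfrac{1}{2}\,\mathrm{AIC},
% so rankings of models by CV and by AIC agree asymptotically.
```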
Article
Every scientist chooses a preferred level of analysis and this choice shapes the research program, even determining what counts as evidence. This contribution revisits Marr's (1982) three levels of analysis (implementation, algorithmic, and computational) and evaluates the prospect of making progress at each individual level. After reviewing limitations of theorizing within a level, two strategies for integration across levels are considered. One is top-down in that it attempts to build a bridge from the computational to algorithmic level. Limitations of this approach include insufficient theoretical constraint at the computation level to provide a foundation for integration, and that people are suboptimal for reasons other than capacity limitations. Instead, an inside-out approach is forwarded in which all three levels of analysis are integrated via the algorithmic level. This approach maximally leverages mutual data constraints at all levels. For example, algorithmic models can be used to interpret brain imaging data, and brain imaging data can be used to select among competing models. Examples of this approach to integration are provided. This merging of levels raises questions about the relevance of Marr's tripartite view. Copyright © 2015 Cognitive Science Society, Inc.
Article
Normalization models of visual sensitivity assume that the response of a visual mechanism is scaled divisively by the sum of the activity in the excitatory and inhibitory mechanisms in its neighborhood. Normalization models of attention assume that the weighting of excitatory and inhibitory mechanisms is modulated by attention. Such models have provided explanations of the effects of attention in both behavioral and single-cell recording studies. We show how normalization models can be obtained as the asymptotic solutions of shunting differential equations, in which stimulus inputs and the activity in the mechanism control growth rates multiplicatively rather than additively. The value of the shunting equation approach is that it characterizes the entire time course of the response, not just its asymptotic strength. We describe two models of attention based on shunting dynamics, the integrated system model of Smith and Ratcliff (2009) and the competitive interaction theory of Smith and Sewell (2013). These models assume that attention, stimulus salience, and the observer’s strategy for the task jointly determine the selection of stimuli into visual short-term memory (VSTM) and the way in which stimulus representations are weighted. The quality of the VSTM representation determines the speed and accuracy of the decision. The models provide a unified account of a variety of attentional phenomena found in psychophysical tasks using single-element and multi-element displays. Our results show the generality and utility of the normalization approach to modeling attention.
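The central claim here, that normalization arises as the asymptotic solution of a shunting differential equation, can be written in one line for a single mechanism; the symbols below (excitatory drive E, inhibitory drive I, decay α, ceiling β) follow the standard single-unit case rather than the full multi-mechanism models described above.

```latex
% Shunting equation: inputs multiply the distance of the activation V from its
% bounds rather than adding to it (E: excitatory drive, I: inhibitory drive).
\frac{dV}{dt} = -\alpha V + (\beta - V)\,E - V\,I
% Setting dV/dt = 0 gives the steady state, which has the divisive-normalization form:
V_{\infty} = \frac{\beta E}{\alpha + E + I}
% Attention enters such models by reweighting E and I, shifting the balance of the
% divisive pool rather than simply adding activation.
```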
Article
A mathematical characterization of serial, parallel and hybrid processes is given, and this characterization is related to several current experimental paradigms. Non-identifiability (mimicking) between two systems (i.e. models of systems) is defined as equivalence of probability distributions on element completion times for the two systems, where n elements are available for processing by each. Results are then presented for a class of systems with exponential processing times, and it is seen that several interesting cases of parallel and serial systems are equivalent to systems of the opposite type. Evidence that will allow accurate discrimination between parallel and serial processing for this and other classes of systems either requires more complete and precise information about the actual probability distributions of the systems or more specialized sets of converging operations than is usually obtained in psychological experimentation. For example, it is noted that at the level of first moments (means), even a parallel independent system can predict results usually associated with a serial system (an overall increasing linear mean reaction time curve as a function of the number of elements to be processed). Next, a functional equation is developed that must hold in order for mimicking to occur between parallel and serial systems within the same general family of probability distributions, and three special cases are examined. A parallel system with gamma-distributed processing times for element completion is then investigated, and it is shown that a strictly serial system cannot mimic it, but an interesting hybrid system can. This is followed by discussion of two kinds of partial identifiability, mimicking at the level of means and possible predicted differences at higher levels, and mimicking by approximation. Some qualitative considerations that may enter into conclusions as to parallelity or seriality of processing are then introduced. Last, it is suggested that in a broad sense questions related to parallel and serial systems concern fundamental aspects of information-processing structure and distribution of processing energy and hence merit further mathematical investigation.
Article
Statistical mimicking issues involving reaction time measures are introduced and discussed in this article. Often, discussions of mimicking have concerned the question of the serial versus parallel processing of inputs to the cognitive system. We will demonstrate that there are several alternative structures that mimic various existing models in the literature. In particular, single-process models have been neglected in this area. When parameter variability is incorporated into single-process models, resulting in discrete or continuous mixtures of reaction time distributions, the observed reaction time distribution alone is no longer as useful in allowing inferences to be made about the architecture of the process that produced it. Many of the issues are raised explicitly in examination of four different case studies of mimicking. Rather than casting a shadow over the use of quantitative methods in testing models of cognitive processes, these examples emphasize the importance of examining reaction time data armed with the tools of quantitative analysis, the importance of collecting data from the context of specific process models, and the importance of expanding the database to include other dependent measures.
Article
This article offers a synthesis of Bayesian and sample-reuse approaches to the problem of high structure model selection geared to prediction. Similar methods are used for low structure models. Nested and nonnested paradigms are discussed and examples given.
Article
A statistical model or a learning machine is called regular if the map taking a parameter to a probability distribution is one-to-one and if its Fisher information matrix is always positive definite. If otherwise, it is called singular. In regular statistical models, the Bayes free energy, which is defined by the minus logarithm of Bayes marginal likelihood, can be asymptotically approximated by the Schwarz Bayes information criterion (BIC), whereas in singular models such approximation does not hold. Recently, it was proved that the Bayes free energy of a singular model is asymptotically given by a generalized formula using a birational invariant, the real log canonical threshold (RLCT), instead of half the number of parameters in BIC. Theoretical values of RLCTs in several statistical models are now being discovered based on algebraic geometrical methodology. However, it has been difficult to estimate the Bayes free energy using only training samples, because an RLCT depends on an unknown true distribution. In the present paper, we define a widely applicable Bayesian information criterion (WBIC) by the average log likelihood function over the posterior distribution with the inverse temperature 1/log n, where n is the number of training samples. We mathematically prove that WBIC has the same asymptotic expansion as the Bayes free energy, even if a statistical model is singular for and unrealizable by the true distribution. Since WBIC can be numerically calculated without any information about a true distribution, it is a generalized version of BIC onto singular statistical models.
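The definition referred to here can be written out explicitly; the notation below follows the common presentation, with n training samples X1, ..., Xn, prior φ(w), and inverse temperature β = 1/log n.

```latex
% Bayes free energy (minus the log marginal likelihood), which WBIC approximates:
F = -\log \int \prod_{i=1}^{n} p(X_i \mid w)\,\varphi(w)\,dw
% WBIC: the posterior average of the minus log likelihood, with the posterior taken
% at inverse temperature \beta = 1/\log n instead of \beta = 1:
\mathrm{WBIC} =
\frac{\displaystyle\int \Big(-\sum_{i=1}^{n} \log p(X_i \mid w)\Big)
      \prod_{i=1}^{n} p(X_i \mid w)^{\beta}\,\varphi(w)\,dw}
     {\displaystyle\int \prod_{i=1}^{n} p(X_i \mid w)^{\beta}\,\varphi(w)\,dw},
\qquad \beta = \frac{1}{\log n}
% Watanabe's result: WBIC shares the leading asymptotic terms of F, in which the
% real log canonical threshold \lambda replaces BIC's k/2 as the coefficient of \log n,
% so WBIC remains valid for singular models.
```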
Article
In a visual-detection experiment, a display of several letters was presented, and S was to report the presence or absence of a given target letter. Results clearly are incompatible with a self-terminating visual-scanning process as hypothesized by Sternberg (1967). Two models are considered, a serial exhaustive scanning process and a parallel exhaustive process, but findings from the present study do not provide a basis for differentiating between them.
Article
The accumulator model of two-choice discrimination conceives the decision process as a race between competing evidence totals in discrete time and continuous state space. General expressions for the terminating probabilities for the model are derived, and a tractable version considered in which the increment distributions are exponential. Expressions for response probabilities, first passage time distributions, and a “balance of evidence” theory of response confidence are presented, and the resulting model compared to that obtained under the more usual assumption of truncated normal increments. The exponential model is fitted to data from an experiment exhibiting both faster and slower mean error times across a range of discriminability levels, and is shown to satisfactorily account for these when augmented with the assumption of negatively correlated within-condition variation in discriminability and decision criterion. Cross-paradigmatic support for the fitted model is provided by an estimate of sampling time of the order of 100 msec, in agreement with estimates obtained from a technique employing brief stimulus exposures.
Article
There is increasing evidence that the brain relies on a set of canonical neural computations, repeating them across brain regions and modalities to apply similar operations to different problems. A promising candidate for such a computation is normalization, in which the responses of neurons are divided by a common factor that typically includes the summed activity of a pool of neurons. Normalization was developed to explain responses in the primary visual cortex and is now thought to operate throughout the visual system, and in many other sensory modalities and brain regions. Normalization may underlie operations such as the representation of odours, the modulatory effects of visual attention, the encoding of value and the integration of multisensory information. Its presence in such a diversity of neural systems in multiple species, from invertebrates to mammals, suggests that it serves as a canonical neural computation.
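The canonical computation described here is usually written as a single divisive equation; the form below is the standard textbook statement, with D_j the driving input to neuron j, σ a semi-saturation constant, n an exponent, and γ a gain factor.

```latex
% Canonical divisive normalization: each neuron's drive is divided by the summed
% activity of a normalization pool plus a semi-saturation constant.
R_j = \gamma\,\frac{D_j^{\,n}}{\sigma^{n} + \sum_{k \in \mathrm{pool}} D_k^{\,n}}
```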
Article
The problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion. These terms are a valid large-sample criterion beyond the Bayesian context, since they do not depend on the a priori distribution.
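The criterion derived here comes from the leading terms of a large-sample (Laplace) expansion of the log marginal likelihood; the standard statement, with k parameters and n observations, is:

```latex
% Large-sample (Laplace) expansion of the log marginal likelihood of a model:
\log p(x \mid \mathcal{M}) \;=\; \log p\!\left(x \mid \hat\theta\right) - \frac{k}{2}\log n + O(1)
% Dropping the O(1) terms and multiplying by -2 gives the Bayesian information criterion:
\mathrm{BIC} = -2\,\log p\!\left(x \mid \hat\theta\right) + k \log n
% The model with the smallest BIC approximates the Bayes choice; the prior affects
% only the neglected O(1) terms, which is why the criterion does not depend on it.
```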
Article
In the two-choice situation, the Wald sequential probability ratio decision procedure is applied to relate the mean and variance of the decision times, for each alternative separately, to the error rates and the ratio of the frequencies of presentation of the alternatives. For situations involving more than two choices, a fixed sample decision procedure (selection of the alternative with highest likelihood) is examined, and the relation is found between the decision time (or size of sample), the error rate, and the number of alternatives.
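The Wald procedure applied in this article has a compact standard form; the boundary and mean-decision-time expressions below are the usual Wald approximations that ignore boundary overshoot, with α and β denoting the two error rates.

```latex
% Accumulate the log likelihood ratio of the two hypotheses across observations:
\Lambda_m = \sum_{i=1}^{m} \log \frac{p(x_i \mid H_1)}{p(x_i \mid H_0)}
% Respond H_1 when \Lambda_m \ge a, respond H_0 when \Lambda_m \le b, otherwise sample again.
% Wald's boundary approximations in terms of the error rates
% \alpha = P(\mathrm{say}\ H_1 \mid H_0) and \beta = P(\mathrm{say}\ H_0 \mid H_1):
a \approx \log\frac{1-\beta}{\alpha}, \qquad b \approx \log\frac{\beta}{1-\alpha}
% Mean number of observations (decision time), from Wald's identity, e.g. under H_1:
E[N \mid H_1] \approx
\frac{(1-\beta)\,a + \beta\,b}{E\!\left[\log\frac{p(x \mid H_1)}{p(x \mid H_0)} \;\middle|\; H_1\right]}
```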
Article
Attention has been found to have a wide variety of effects on the responses of neurons in visual cortex. We describe a model of attention that exhibits each of these different forms of attentional modulation, depending on the stimulus conditions and the spread (or selectivity) of the attention field in the model. The model helps reconcile proposals that have been taken to represent alternative theories of attention. We argue that the variety and complexity of the results reported in the literature emerge from the variety of empirical protocols that were used, such that the results observed in any one experiment depended on the stimulus conditions and the subject's attentional strategy, a notion that we define precisely in terms of the attention field in the model, but that has not typically been completely under experimental control.
Article
Neuronal activity in the frontal eye field (FEF) identifies locations of behaviorally important objects for guiding attention and eye movements. We recorded neural activity in the FEF of monkeys trained to manually turn a lever towards the location of a pop-out target of a visual search array without shifting gaze. We examined whether the reliability of the neural representation of the salient target location predicted the monkeys' accuracy of reporting target location. We found that FEF neurons reliably encoded the location of the target stimulus not only on correct trials but also on error trials. The representation of target location in FEF persisted until the manual behavioral report but did not increase in magnitude. This result suggests that, in the absence of an eye movement report, FEF encodes the perceptual information necessary to perform the task but does not accumulate this sensory evidence towards a perceptual decision threshold. These results provide physiological evidence that, under certain circumstances, accurate perceptual representations do not always lead to accurate behavioral reports and that variability in processes outside of perception must be considered to account for the variability in perceptual choice behavior.
Article
Simple cells in the striate cortex have been depicted as half-wave-rectified linear operators. Complex cells have been depicted as energy mechanisms, constructed from the squared sum of the outputs of quadrature pairs of linear operators. However, the linear/energy model falls short of a complete explanation of striate cell responses. In this paper, a modified version of the linear/energy model is presented in which striate cells mutually inhibit one another, effectively normalizing their responses with respect to stimulus contrast. This paper reviews experimental measurements of striate cell responses, and shows that the new model explains a significantly larger body of physiological data.
Article
Recent theoretical approaches to the problem of psychophysical discrimination have produced what may be classified as ‘statistical decision’ or ‘data accumulation’ models. While the former have received much attention, their application to judgment and choice meets with some difficulties. Among the latter, the two types which have received most attention are a ‘runs’ and a ‘recruitment’ model, but neither seems able to account for all of the relevant data. It is suggested instead that an ‘accumulator’ model, in which sampled events may vary in magnitude as well as probability, can be developed to give a good account of much of the available data on psychophysical discrimination. Two experiments are reported, in which the subject presses one of two keys as soon as he has decided whether the longer of two simultaneously presented lines is on the left or right. Results are found to be inconsistent with a runs or recruitment process, but to accord well with predictions from the accumulator model. Other evidence consistent with such a mechanism is briefly reviewed.
Article
Linking propositions are statements that relate perceptual states to physiological states, and as such are one of the fundamental building blocks of visual science. A brief history of the concept of linking proposition is presented. Five general families of linking propositions--Identity, Similarity, Mutual Exclusivity, Simplicity and Analogy--are discussed, and examples of each are developed. Two specific linking propositions, involving the explanation of perceptual phenomena on the basis of the activity of single neurons, are explicated and their limitations are explored in detail. Finally, the question of the empirical testability and falsifiability of linking propositions is discussed.
Article
A new hypothesis about the role of focused attention is proposed. The feature-integration theory of attention suggests that attention must be directed serially to each stimulus in a display whenever conjunctions of more than one separable feature are needed to characterize or distinguish the possible objects presented. A number of predictions were tested in a variety of paradigms including visual search, texture segregation, identification and localization, and using both separable dimensions (shape and color) and local elements or parts of figures (lines, curves, etc. in letters) as the features to be integrated into complex wholes. The results were in general consistent with the hypothesis. They offer a new set of criteria for distinguishing separable from integral features and a new rationale for predicting which tasks will show attention limits and which will not.