Jonathan D. Cohen’s research while affiliated with Princeton University and other places


Publications (458)


Sculpting new visual categories into the human brain
  • Article

December 2024 · 10 Reads

Proceedings of the National Academy of Sciences

Coraline Rinn Iordan · Victoria J H Ritvo · Kenneth A Norman · [...] · Jonathan D Cohen

Learning requires changing the brain. This typically occurs through experience, study, or instruction. We report an alternate route for humans to acquire visual knowledge, through the direct sculpting of activity patterns in the human brain that mirror those expected to arise through learning. We used neurofeedback from closed-loop real-time functional MRI to create new categories of visual objects in the brain, without the participants’ explicit awareness. After neural sculpting, participants exhibited behavioral and neural biases for the learned, but not for the control categories. The ability to sculpt new perceptual distinctions into the human brain offers a noninvasive research paradigm for causal testing of the link between neural representations and behavior. As such, beyond its current application to perception, our work potentially has broad relevance for advancing understanding in other domains of cognition such as decision-making, memory, and motor control.


Figure 1. Effort cost by diagnostic group and effort type. (a) Mean and standard error of the mean of individual differences in effort cost (y axis) by effort type (x axis). (b) Histograms of individual differences; the x axis indicates effort cost (larger values indicate more effort avoidance), and the y axis indicates the proportion of the diagnostic group.
Figure 2. Relationships between effort costs and individual MDD symptom domains. Blue indicates cognitive effort and red indicates physical effort. y axes: effort costs from the MVT model; x axes: symptom severity (z scores) for overall depression (Hamilton Depression Rating Scale Total), anhedonia, anxiety, and behavioral apathy (MDD group only).
Foraging environment parameters and results of best threshold simulation
Symptom effort cost regressions (MDD group only)
Symptom overall exit threshold regressions (MDD group only)
Major depression symptom severity associations with willingness to exert effort and patch foraging strategy
  • Article
  • Full-text available

December 2024 · 16 Reads

Psychological Medicine

Background: Individuals with major depressive disorder (MDD) can experience reduced motivation and cognitive function, leading to challenges with goal-directed behavior. When selecting goals, people maximize ‘expected value’ by selecting actions that maximize potential reward while minimizing associated costs, including effort ‘costs’ and the opportunity cost of time. In MDD, differential weighing of costs and benefits are theorized mechanisms underlying changes in goal-directed cognition and may contribute to symptom heterogeneity. Methods: We used the Effort Foraging Task to quantify cognitive and physical effort costs, and patch leaving thresholds in low effort conditions (reflecting perceived opportunity cost of time) and investigated their shared versus distinct relationships to clinical features in participants with MDD (N = 52, 43 in-episode) and comparisons (N = 27). Results: Contrary to our predictions, none of the decision-making measures differed with MDD diagnosis. However, each of the measures was related to symptom severity, over and above effects of ability (i.e. performance). Greater anxiety symptoms were selectively associated with lower cognitive effort cost (i.e. greater willingness to exert effort). Anhedonia and behavioral apathy were associated with increased physical effort costs. Finally, greater overall depression was related to decreased patch leaving thresholds. Conclusions: Markers of effort-based decision-making may inform understanding of MDD heterogeneity. Increased willingness to exert cognitive effort may contribute to anxiety symptoms such as worry. Decreased leaving threshold associations with symptom severity are consistent with reward rate-based accounts of reduced vigor in MDD. Future research should address subtypes of depression with or without anxiety, which may relate differentially to cognitive effort decisions.
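As a rough illustration of the patch-leaving measure described above, the following sketch assumes a simplified foraging rule in which a forager keeps harvesting a patch while the expected next reward stays at or above an exit threshold. The function name, the multiplicative reward decay, and all numeric values are hypothetical and are not taken from the Effort Foraging Task itself.

```python
def harvests_before_exit(initial_reward: float, decay: float, threshold: float) -> int:
    """Count harvests taken while the expected next reward is at or above the threshold.

    Assumption (hypothetical): each harvest multiplies the expected reward
    of the next harvest by `decay`, so the patch depletes over time.
    """
    if not (0.0 < decay < 1.0) or threshold <= 0.0:
        raise ValueError("decay must be in (0, 1) and threshold must be positive")
    harvests = 0
    expected = initial_reward
    while expected >= threshold:
        harvests += 1          # take the harvest
        expected *= decay      # the patch depletes
    return harvests


# Hypothetical foragers in the same patch (10 apples at first, halving each harvest).
high_threshold_stay = harvests_before_exit(10.0, 0.5, threshold=3.0)
low_threshold_stay = harvests_before_exit(10.0, 0.5, threshold=1.0)
print(high_threshold_stay, low_threshold_stay)
```

Under this assumed rule, a lower exit threshold means staying longer in each patch, which is the direction of the association the abstract reports for greater overall depression.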


Figure 9: Human Evaluation
Visual analogy results: GPT-4o.
Visual analogy results: Claude Sonnet 3.5.
Visual analogy results: Gemini Ultra 1.5.
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem

October 2024 · 13 Reads

Recent work has documented striking heterogeneity in the performance of state-of-the-art vision language models (VLMs), including both multimodal language models and text-to-image models. These models are able to describe and generate a diverse array of complex, naturalistic images, yet they exhibit surprising failures on basic multi-object reasoning tasks -- such as counting, localization, and simple forms of visual analogy -- that humans perform with near perfect accuracy. To better understand this puzzling pattern of successes and failures, we turn to theoretical accounts of the binding problem in cognitive science and neuroscience, a fundamental problem that arises when a shared set of representational resources must be used to represent distinct entities (e.g., to represent multiple objects in an image), necessitating the use of serial processing to avoid interference. We find that many of the puzzling failures of state-of-the-art VLMs can be explained as arising due to the binding problem, and that these failure modes are strikingly similar to the limitations exhibited by rapid, feedforward processing in the human brain.


A protective role for the cerebellum in cognitive aging

October 2024 · 48 Reads

Background: Brain reserve - the brain's resilience to age-related change or damage - provides protection against cognitive decline. The cerebellum is relatively unstudied as a contributor to brain reserve. This study investigates cerebellar brain reserve in the largest cohort to date. Methods: We used data from the Human Connectome Project (n=708, 36-100yrs), UK Biobank (n=45,013, 44-81yrs), and ADNI (n=1,423, 56-95yrs). ADNI participants were cognitively normal or had a diagnosis of mild cognitive impairment or Alzheimer's disease (AD) dementia. We examined associations between cerebellar tissue volume, age, Montreal Cognitive Assessment (MoCA) scores, global PET amyloid burden, and APOE genotype. Findings: HCP-Aging data revealed heterogeneous aging-associated changes in cerebellar volume, with the greatest effects in posterior hemispheric regions (crus I) (Bonferroni-corrected, p<0.05). MoCA scores were associated with higher tissue density in the cerebellum (p<0.0001) to the same extent as neocortex, and MoCA scores coupled most strongly with posterior cerebellar cortex. Strikingly, greater volume in MoCA visuospatial-related cerebellar cortex protected against aging-related cognitive decline (p=0.0001). We replicated tissue aging results in the UK Biobank with the greatest aging-related effects in posterior cerebellum (p<0.0001), and an association of greater cerebellar volumes with less cognitive decline (Trails Making-B: p<0.00001; Digit Symbol Substitution: p=0.034). AD patients with low amyloid-beta burden (AB-) exhibited the strongest cerebellar association with MoCA (volume x group, AB- AD: p=0.0001). In AB- individuals, APOE e4/e4 carriers showed the greatest effect with MoCA (volume x APOE, e4/e4: p=0.017). Interpretation: Our large-scale study demonstrates a potentially strong role for the cerebellum in mitigating cognitive decline. The persistence of this protection in APOE e4 carriers reshapes our understanding of reserve and AD risk. Our findings open the cerebellum as a novel target for future clinical research on brain reserve in aging populations. Funding: National Science Foundation, National Academies of Sciences, Engineering and Medicine, National Institutes of Health.


Structural covariation parallels cerebello-thalamo-cortico circuit
a Circuit diagram of the cerebellar ascending pathway demonstrating inhibitory influence from cerebellar cortex on deep cerebellar nuclei which synapse onto thalamic nuclei via excitatory projections, ultimately innervating cerebral cortex. Marginal effects plots of the relationship between cerebellar cortex and dentate cerebellar nuclei (b); dentate cerebellar nuclei and thalamic nuclei (c); and thalamic nuclei and sensorimotor cortex (d), corrected for sex, age, and estimated intracranial volume. e Correlation matrix, thresholded at p < 0.0001, in typically developing youth generally showing negative correlations between gray matter volume in cerebellar cortex and deep cerebellar dentate nuclei, positive correlations between dentate nuclei and thalamic nuclei, and positive correlations between thalamic nuclei (central lateral, CL, laterodorsal, LD, medial ventral, MV; combined into a meta-Thalamus ROI) and sensorimotor cortex.
Cerebellar influence on sensorimotor cortex via thalamus and not pons
a, b Marginal effects plots, corrected for sex and intracranial volume, of the thalamic nuclei moderation effect on the relationship between gray matter volume in deep cerebellar dentate nuclei and sensorimotor cortex in typically developing children. Model shown per age tercile. c, d Similar to (a, b), but depicting autism cohort data. The thalamic moderation effect across age in ASD was stronger in the left cerebellar dentate × right thalamus × age model, relative to the contralateral side. In the typically developing cohort, the thalamic moderation effect across age was stronger in the right cerebellar dentate × left thalamus × age model than the contralateral side. Terciles approximated for lower: 9.64 years; middle: 13.74 years; upper: 17.84 years. e Illustration of the descending pathway between cortex, the pons, and the cerebellum. f Similar to (a–d) but using the pons as a moderator between right somatosensory cortex and left cerebellar cortex. Only lower age tercile shown, other tercile data were similar. L left, R right.
Cognitive trajectories converge in pre-adolescence and relate to cerebellum
Dimensions of cognitive function measured by the BRIEF parent-teacher questionnaire in a subset of individuals with complete behavioral data. In fitted models of age² × group across the sub-dimensions of cognitive function measured, corrected for sex and intracranial volume, there was greater impairment in the autism cohort along (a) inhibition (β = −100.10, se = 46.14, Cohen’s f = 0.16, p = 0.031), b planning (β = −119.62, se = 48.69, Cohen’s f = 0.19, p = 0.015), and c emotional regulation (β = −117.55, se=44.85, Cohen’s f = 0.20, p = 0.009). Groups did not differ in their shifting (p = 0.530), organization (p = 0.810), or working memory domains (p = 0.100). No cubic models were significant. When averaging all sub-scales into a composite measure (d), anti-correlated nonlinear cognitive trajectories between groups were evident (β = −80.03, se = 34.89, Cohen’s f = 0.17, p = 0.022). In ASD this was observed as a drop in impairment scores followed by a gradual rise with increasing age. e Correcting for biological sex and intracranial volume, greater dentate cerebellar nuclei volume predicted greater impairment scores in ASD (β = −93.88, se = 43.15, Cohen’s f = 0.16, p = 0.030).
Cerebellar-neocortical covariance network
(a:left) Vertex-wise differences in dentate nuclei-related cortical thickness, corrected for sex and intracranial volume, between autism and typically developing cohorts (p < 0.05). Results surviving multiple-comparisons correction (a:right) encompassed left insular cortex, ventral temporal cortex, bilateral lateral occipital cortex, post central cortex, left opercularis, pars orbitalis, precuneus, and bilateral supramarginal gyri. Kamada–Kawai force-directed graphs for typically developing (b), and autism cohorts (c). Size of node represents nodal hub score, edges represent covariation, and the position of a node in the graph depicts its level of covariance to the rest of the network, with highly influential nodes placed in the center, and less influential nodes placed in the periphery. Purple = neocortex, blue = cerebellum, orange = thalamic nuclei.
ASD differences in resting state functional connectivity and their relation to structure
a Seed-to-voxel analysis comparing the functional connectivity with dentate nuclei between ASD and TD revealed decreased functional connectivity in ASD (pFWE < 0.05) across voxels encompassing bilateral precuneus, left lateral occipital and cuneate cortex, left supramarginal gyrus, left middle frontal gyrus, and ventral postero-medial thalamus, as well as left sensorimotor cortex as shown in (b). c Further, the association between cerebellar dentate nuclei size and cerebellar dentate nuclei functional connectivity-to-sensorimotor cortex was different in ASD and TD, such that in ASD, smaller dentate nuclei predicted lower functional connectivity, while in TD smaller dentate nuclei predicted greater functional connectivity (p < 0.025).
Multimodal evidence for cerebellar influence on cortical development in autism: structural growth amidst functional disruption

October 2024 · 48 Reads · 1 Citation

Molecular Psychiatry

Despite perinatal damage to the cerebellum being one of the highest risk factors for later being diagnosed with autism spectrum disorder (ASD), it is not yet clear how the cerebellum might influence the development of cerebral cortex and whether this co-developmental process is distinct between neurotypical and ASD children. Leveraging a large structural brain MRI dataset of neurotypical children and those diagnosed with ASD, we examined whether structural variation in cerebellar tissue across individuals was correlated with neocortical variation during development, including the thalamus as a coupling factor. We found that the thalamus plays a distinct role in moderating cerebro-cerebellar structural coordination in ASD. Notably, structural coupling between cerebellum, thalamus, and neocortex was strongest in younger childhood and waned by early adolescence, mirroring a previously undescribed trajectory of behavioral development between ASD and neurotypical children. Complementary functional connectivity analyses likewise revealed atypical connectivity between cerebellum and neocortex in ASD. This relationship was particularly prominent in a model of cerebellar structure predicting functional connectivity, where ASD and neurotypical children showed divergent patterns. Interestingly, these functional-structural relationships became more prominent with age, while structural effects were most prominent earlier in childhood, and showed significant lateralization. This pattern may suggest a developmental sequence where early uncoordinated structural growth amongst regions is followed by increasingly atypical functional synchronization. These findings provide multimodal evidence in the living brain for a cerebellar diaschisis model of autism, where both increased cerebellar-cerebral structural coupling and altered functional connectivity in cerebro-cerebellar pathways contribute to the ontogeny of this neurodevelopmental disorder.


Fig. 2. State Space Analysis. A) Top: Schematic of the latent state space model. Bottom: temporal basis set spanning the epoch. B) The log-likelihood from a held-out test set (y-axis), across different numbers of factors (x-axis). Dots indicate single subjects; log-likelihood is mean-centered within-participant to show relative differences. Inset: Bayesian model selection. C) Parameter recovery on a synthetic dataset generated from participant 2's parameters. D) Predictive accuracy in one example trial from the test set across a subset of the EEG electrodes. Thick black lines indicate EEG voltage, red lines indicate next-timestep predictions from a Kalman filter (112 factor model). E) Cross-validated coefficient of determination for SSA models across factor size, relative to autoregressive encoding models fit directly to the observations. Horizontal green line indicates the fit of parameter-matched RNNs; shaded area indicates the 95% interval of the RNN fit across participants.
Active reconfiguration of neural task states

September 2024 · 18 Reads

The ability to switch between different tasks is a critical component of adaptive cognitive functioning, but a mechanistic understanding of this capacity has remained elusive. Longstanding debates over whether task switching requires active preparation remain hotly contested, in large part due to the difficulty of inferring task preparation from behavior alone. We make progress on this debate by quantifying neural task representations through high-dimensional linear dynamical systems fit to human electroencephalographic recordings. We find that these dynamical systems are highly predictive of macroscopic neural activity, and reveal neural signatures of active preparation that are shared with task-optimized neural networks. These findings help inform a classic debate about how we control our cognition, and offer a promising new paradigm for neuroimaging analysis.


Learning expectations shape cognitive control allocation

August 2024 · 12 Reads

Current models frame the allocation of cognitive control as a process of expected utility maximization. The benefits of a candidate control signal are weighed against its costs (e.g., opportunity costs). Recent theorizing has found that, despite promoting the counterintuitive behavior of longer deliberation, which is less rewarding in the short term, it is nevertheless normative to account for the value of learning when determining control allocation. Here, we sought to test this proposal by examining whether people were willing to allocate greater control and thereby expend greater effort (e.g., deliberate for longer) when they perceived a task to be learnable compared to when they did not. We found that participants' proficiency and learning rate in the first block of a simple perceptual dot-motion task were able to predict their willingness to deliberate in a second block. These findings support the hypothesis that agents consider learnability when allocating cognitive control, and comply with a formal model of control allocation that considers the future discounted value of learning on reward.


Figure 2. Generation of test analogies from training analogies (region marked in blue) by: (a) translating both dimension values of A, B, C, D by the same amount; and (b) scaling both dimension values of A, B, C, D by the same amount. Since both dimension values are transformed by the same amount, each input gets transformed along the diagonal.
Determinantal point process attention over grid cell code supports out of distribution generalization

August 2024 · 10 Reads · 1 Citation

eLife

Deep neural networks have made tremendous gains in emulating human-like intelligence, and have been used increasingly as ways of understanding how the brain may solve the complex computational problems on which this relies. However, these still fall short of, and therefore fail to provide insight into how the brain supports strong forms of generalization of which humans are capable. One such case is out-of-distribution (OOD) generalization – successful performance on test examples that lie outside the distribution of the training set. Here, we identify properties of processing in the brain that may contribute to this ability. We describe a two-part algorithm that draws on specific features of neural computation to achieve OOD generalization, and provide a proof of concept by evaluating performance on two challenging cognitive tasks. First we draw on the fact that the mammalian brain represents metric spaces using grid cell code (e.g., in the entorhinal cortex): abstract representations of relational structure, organized in recurring motifs that cover the representational space. Second, we propose an attentional mechanism that operates over the grid cell code using determinantal point process (DPP), that we call DPP attention (DPP-A) – a transformation that ensures maximum sparseness in the coverage of that space. We show that a loss function that combines standard task-optimized error with DPP-A can exploit the recurring motifs in the grid cell code, and can be integrated with common architectures to achieve strong OOD generalization performance on analogy and arithmetic tasks. This provides both an interpretation of how the grid cell code in the mammalian brain may contribute to generalization performance, and at the same time a potential means for improving such capabilities in artificial neural networks.
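The test-set construction described in the Figure 2 caption for this entry (translating or scaling both dimension values of A, B, C, D by the same amount) can be sketched as follows. The sketch assumes, hypothetically, that an analogy A:B::C:D holds when the offset from A to B equals the offset from C to D; that criterion, the function names, and the example coordinates are illustrative and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Analogy = List[Point]  # four points, in the order A, B, C, D


def is_analogy(analogy: Analogy) -> bool:
    """Assumed criterion: the offset A->B equals the offset C->D."""
    a, b, c, d = analogy
    return (b[0] - a[0], b[1] - a[1]) == (d[0] - c[0], d[1] - c[1])


def translate(analogy: Analogy, shift: int) -> Analogy:
    """Shift both dimension values of every point by the same amount."""
    return [(x + shift, y + shift) for x, y in analogy]


def scale(analogy: Analogy, factor: int) -> Analogy:
    """Multiply both dimension values of every point by the same factor."""
    return [(x * factor, y * factor) for x, y in analogy]


# Hypothetical training analogy inside a small "training region".
train = [(1, 2), (3, 5), (4, 4), (6, 7)]
test_translated = translate(train, 100)  # moved along the diagonal
test_scaled = scale(train, 10)           # stretched away from the origin

print(is_analogy(train), is_analogy(test_translated), is_analogy(test_scaled))
```

Both transformations move the points outside the training region while preserving the assumed relational structure, which is what makes the resulting items a test of out-of-distribution generalization rather than of memorized coordinates.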


An Integrated Model of Semantics and Control

July 2024 · 15 Reads · 2 Citations

Psychological Review

Understanding the mechanisms enabling the learning and flexible use of knowledge in context-appropriate ways has been a major focus of research in the study of both semantic cognition and cognitive control. We present a unified model of semantics and control that addresses these questions from both perspectives. The model provides a coherent view of how semantic knowledge, and the ability to flexibly access and deploy that knowledge to meet current task demands, arises from end-to-end learning of the statistics of the environment. We show that the model addresses unresolved issues from both literatures, including how control operates over features that covary with one another and how control representations themselves are structured and emerge through learning, through a series of behavioral experiments and simulations. We conclude by discussing the implications of our approach to other fundamental questions in cognitive science, machine learning, and artificial intelligence.



Citations (52)


... Critically, the elevated effort costs in the high-effort orchards are thought to decrease participants' estimate of the average reward rate, leading to over-harvesting of trees and a lower exit threshold (i.e., the number of expected apples received for the next harvest at the time participants choose to exit the current tree) (Bustamante et al., 2023, 2024). Accordingly, we interpret the difference in exit thresholds between low- and high-effort environments as the perceived (effort) cost associated with travel. ...

Reference:

Non-invasive brain stimulation over the Frontopolar Cortex promotes willingness to exert cognitive effort in a foraging-like sequential choice task
Major depression symptom severity associations with willingness to exert effort and patch foraging strategy

... Its performance during the test phase was much better in the blocked condition compared to the interleaved condition (Fig. 3f, g). This is because the model's prior on LC inference was set to be sticky; this assumption that contexts are autocorrelated in time has been incorporated in other models of continual learning [3,43,45], including models of this particular task [9,46]. In our case, the stickiness prior matches the autocorrelation structure in the blocked curriculum, where LCs are persistent, compared to the interleaved curriculum, where LCs have low autocorrelation. ...

Toward the Emergence of Intelligent Control: Episodic Generalization and Optimization
  • Citing Article
  • May 2024

Open Mind

... The systematic generalization is a crucial property for neural networks, required to achieve strong out-of-distribution (OOD) generalization [13][14][15][16][17][18]. It has been widely studied to improve model generalization in various fields, including natural language processing [18,20,25], computer vision [26][27][28], and robotic agents [16,29,30]. ...

The relational bottleneck as an inductive bias for efficient abstraction
  • Citing Article
  • May 2024

Trends in Cognitive Sciences

... Each component measured (cognitive and physical effort cost, cognitive task ability, subjective opportunity cost of time) has been proposed as an underlying mechanism of specific MDD symptoms. To test these accounts, we had participants who met diagnostic criteria for MDD (most in-episode) and demographically matched comparison participants with no psychiatric diagnoses complete the cognitive and physical Effort Foraging Task (Bustamante et al., 2023). To our knowledge, all previous MDD studies used explicit tasks in which participants choose between low-effort/low-reward and high-effort/high-reward options. ...

Effort Foraging Task reveals positive correlation between individual differences in the cost of cognitive and physical effort in humans

Proceedings of the National Academy of Sciences

... There exists a range of views on how unattended WM storage is implemented mechanistically in the brain (Beukers et al., 2021;Stokes, 2015;Van Loon et al., 2018;Wan, 2022;Wolff et al., 2017;Yu et al., 2020) and the extent to which the underlying processes are distinguished (or not) from episodic LTM remains debated (Beukers et al., 2023;Oberauer & Awh, 2022). A previous study found no evidence that unattended WM maintenance would improve subsequent LTM (LaRocque et al., 2015), and we likewise observed no LTM-benefits for deprioritized materials (see Exp. 2, NP items) unless the material was explicitly tested. ...

When Working Memory May Be Just Working, Not Memory

Psychological Review

... Task-Driven Representations Task-driven representations aim to summarize the robot's state and environment sufficiently for a given task (i.e., the robot has enough information to successfully complete the task). Humans are well studied for employing such representations (e.g., the gaze heuristic [16,17]) to acquire strong robustness to task-irrelevant distractors in the environment, improved planning efficiency for real-time decision making, and generalization to new tasks [18,19]. Various approaches for robotics have sought to construct or learn task-driven representations using information bottlenecks [20,2,21], minimizing the size complexity of the representation conditioned on the task [22,23], or leveraging state abstraction theory for Markov Decision Processes [24,25,26]. ...

Rational Simplification and Rigidity in Human Planning
  • Citing Article
  • October 2023

Psychological Science

... This result was replicated using 56 triaxial OPM sensors, each featuring three channels measuring different directions during presentation of a movie. Connectomes differed between participants but showed high test-retest reliability within participants, comparable to the quality of measures reported with SQUID-MEG systems [42]. A summary of the application properties discussed in the preceding sections and how they compare between systems is given in Table 1. ...

Test-retest reliability of the human connectome: An OPM-MEG study

Imaging Neuroscience

... Tile-revealing task Data source: [87] Number Example prompt: You are playing a game where you are revealing patterns on a binary grid. ...

Disentangling Abstraction from Statistical Pattern Matching in Human and Machine Learning

... Our paper additionally shows how such learned values may be combined to enable flexible behaviour. Additionally, recent theoretical work has demonstrated that RL agents including separate modules predicting reward in different dimensions have advantages in tasks requiring acquisition of multiple resource types [89]. ...

Having multiple selves helps learning agents explore and adapt in complex changing worlds

Proceedings of the National Academy of Sciences

... representations are partly embodied in modality-specific neural systems that process different aspects of our experiences [5][6][7][8][11][12][13][34][35][36] and 2. that semantic activation is shaped by control processes and the need for these varies with concept type and level of contextual support [13,58,61,62,91]. ...

An Integrated Model of Semantics and Control
  • Citing Preprint
  • June 2023