Shiva Farashahi's research while affiliated with the Flatiron Institute and other places


Publications (33)


[Figure previews]
  • A simplified schematic of a mushroom body compartment
  • Linear separation of odors in the space of KC activities
  • Performance of Algorithm 1 (accuracy on the synthetic and KC datasets)
  • Performance of competing MBONs (2 parallel runs of Algorithm 1 on the KC dataset)

A linear discriminant analysis model of imbalanced associative learning in the mushroom body compartment
  • Article
  • Full-text available

February 2023 · 17 Reads · 5 Citations

David Lipshutz · Aneesh Kashalikar · Shiva Farashahi · [...]

To adapt to their environments, animals learn associations between sensory stimuli and unconditioned stimuli. In invertebrates, olfactory associative learning primarily occurs in the mushroom body, which is segregated into separate compartments. Within each compartment, Kenyon cells (KCs) encoding sparse odor representations project onto mushroom body output neurons (MBONs) whose outputs guide behavior. Associated with each compartment is a dopamine neuron (DAN) that modulates plasticity of the KC-MBON synapses within the compartment. Interestingly, DAN-induced plasticity of the KC-MBON synapse is imbalanced in the sense that it only weakens the synapse and is temporally sparse. We propose a normative mechanistic model of the MBON as a linear discriminant analysis (LDA) classifier that predicts the presence of an unconditioned stimulus (class identity) given a KC odor representation (feature vector). Starting from a principled LDA objective function and under the assumption of temporally sparse DAN activity, we derive an online algorithm which maps onto the mushroom body compartment. Our model accounts for the imbalanced learning at the KC-MBON synapse and makes testable predictions that provide clear contrasts with existing models.
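
As a concrete illustration of the imbalanced, DAN-gated plasticity described in this abstract, here is a minimal sketch (not the authors' published code): weights change only when the DAN is co-active, and the change can only weaken synapses of active KCs. The KC count, sparseness level, learning rate, and the bias update are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_kc = 200     # number of Kenyon cells (illustrative)
eta = 0.05     # learning rate (illustrative)

w = rng.normal(0.0, 0.1, n_kc)   # KC -> MBON synaptic weights
b = 0.0                          # MBON bias / decision threshold

def mbon_readout(x, w, b):
    """Linear MBON readout; its sign can be read as 'conditioned' vs 'neutral'."""
    return float(np.dot(w, x) - b)

def dan_gated_update(x, dan_active, w, b, eta=eta):
    """Imbalanced plasticity: updates occur only when the DAN fires, and the
    weight change is -eta * x, so synapses of co-active KCs can only weaken.
    The bias shift below is one simple choice, not necessarily the paper's rule."""
    if dan_active:
        w = w - eta * x
        b = b - eta
    return w, b

# Toy stream of odors; punished odors co-activate the DAN (temporally sparse).
for t in range(1000):
    x = (rng.random(n_kc) < 0.05).astype(float)   # sparse KC odor representation
    punished = rng.random() < 0.1                 # sparse DAN activity
    w, b = dan_gated_update(x, punished, w, b)
```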


Coordinated drift of receptive fields in Hebbian/anti-Hebbian network models during noisy representation learning

January 2023 · 124 Reads · 34 Citations · Nature Neuroscience

Shanshan Qin · Shiva Farashahi · David Lipshutz · [...]

Recent experiments have revealed that neural population codes in many brain areas continuously change even when animals have fully learned and stably perform their tasks. This representational ‘drift’ naturally leads to questions about its causes, dynamics and functions. Here we explore the hypothesis that neural representations optimize a representational objective with a degenerate solution space, and noisy synaptic updates drive the network to explore this (near-)optimal space causing representational drift. We illustrate this idea and explore its consequences in simple, biologically plausible Hebbian/anti-Hebbian network models of representation learning. We find that the drifting receptive fields of individual neurons can be characterized by a coordinated random walk, with effective diffusion constants depending on various parameters such as learning rate, noise amplitude and input statistics. Despite such drift, the representational similarity of population codes is stable over time. Our model recapitulates experimental observations in the hippocampus and posterior parietal cortex and makes testable predictions that can be probed in future experiments. A computational model predicts coordinated drift of neural receptive fields during noisy representation learning and recapitulates experimental observations in the posterior parietal cortex and hippocampal CA1.
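
For intuition, the snippet below is a rough, self-contained sketch of the class of models the paper studies: a Hebbian/anti-Hebbian similarity-matching network with nonnegative outputs, trained on a ring-shaped stimulus manifold with additive Gaussian noise on every synaptic update. The preferred angle of a neuron then wanders slowly, which is the drift phenomenon in question. The specific update equations, parameters, and noise model are simplified assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

d, k = 2, 10               # 2-D ring inputs, 10 output neurons (illustrative)
eta, sigma = 0.05, 1e-3    # learning rate and synaptic noise amplitude (illustrative)

W = rng.normal(0.0, 0.1, (k, d))   # feedforward (Hebbian) weights
M = np.eye(k)                      # lateral (anti-Hebbian) weights

def nonneg_outputs(x, W, M, gamma=0.1, n_iter=100):
    """Projected-gradient neural dynamics that approximately solve for
    nonnegative outputs y of the similarity-matching circuit."""
    y = np.zeros(W.shape[0])
    for _ in range(n_iter):
        y = np.maximum(0.0, y + gamma * (W @ x - M @ y))
    return y

preferred_angle = []
for t in range(20000):
    theta = rng.uniform(0.0, 2.0 * np.pi)
    x = np.array([np.cos(theta), np.sin(theta)])   # stimulus on the ring manifold
    y = nonneg_outputs(x, W, M)
    # Hebbian feedforward and (effectively anti-Hebbian) lateral updates,
    # each corrupted by additive Gaussian synaptic noise -> drift
    W += eta * (np.outer(y, x) - W) + sigma * rng.normal(size=W.shape)
    M += 0.5 * eta * (np.outer(y, y) - M) + sigma * rng.normal(size=M.shape)
    # optional numerical safeguard: keep lateral self-inhibition nonnegative
    np.fill_diagonal(M, np.maximum(np.diag(M), 0.0))
    if t % 500 == 0:
        # the receptive-field centroid of neuron 0 performs a slow random walk
        preferred_angle.append(float(np.arctan2(W[0, 1], W[0, 0])))
```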


A linear discriminant analysis model of imbalanced associative learning in the mushroom body compartment

September 2022 · 23 Reads

To adapt to their environments, animals learn associations between sensory stimuli and unconditioned stimuli. In invertebrates, olfactory associative learning primarily occurs in the mushroom body, which is segregated into separate compartments. Within each compartment, Kenyon cells (KCs) encoding sparse odor representations project onto mushroom body output neurons (MBONs) whose outputs guide behavior. Associated with each compartment is a dopamine neuron (DAN) that modulates plasticity of the KC-MBON synapses within the compartment. Interestingly, DAN-induced plasticity of the KC-MBON synapse is imbalanced in the sense that it only weakens the synapse and is temporally sparse. We propose a normative mechanistic model of the MBON as a linear discriminant analysis (LDA) classifier that predicts the presence of an unconditioned stimulus (class identity) given a KC odor representation (feature vector). Starting from a principled LDA objective function and under the assumption of temporally sparse DAN activity, we derive an online algorithm which maps onto the mushroom body compartment. Our model accounts for the imbalanced learning at the KC-MBON synapse and makes testable predictions that provide clear contrasts with existing models.


Computational mechanisms of distributed value representations and mixed learning strategies

December 2021 · 50 Reads · 7 Citations

Learning appropriate representations of the reward environment is challenging in the real world where there are many options, each with multiple attributes or features. Despite the existence of alternative solutions for this challenge, neural mechanisms underlying emergence and adoption of value representations and learning strategies remain unknown. To address this, we measure learning and choice during a multi-dimensional probabilistic learning task in humans and trained recurrent neural networks (RNNs) to capture our experimental observations. We find that human participants estimate stimulus-outcome associations by learning and combining estimates of reward probabilities associated with the informative feature followed by those of informative conjunctions. Through analyzing representations, connectivity, and lesioning of the RNNs, we demonstrate this mixed learning strategy relies on a distributed neural code and opponency between excitatory and inhibitory neurons through value-dependent disinhibition. Together, our results suggest computational and neural mechanisms underlying the emergence of complex learning strategies in naturalistic settings. Real-world learning is particularly challenging because reward can be associated with many features of choice options. Here, the authors show that humans can learn complex learning strategies and reveal their underlying computational and neural mechanisms.
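
To make the "mixed learning strategy" concrete, here is a toy sketch (a stand-in for the task, not the actual experiment or the paper's model): reward probabilities tied to the informative feature and to an informative conjunction are learned in parallel with simple delta rules, and the two estimates are combined with a weight that gradually shifts toward the conjunction. The stimulus space, reward schedule, learning rate, and mixing schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy task: stimuli have 3 binary dimensions; reward probability depends mainly
# on feature 0 (informative feature) and on the conjunction of features 1 and 2.
def reward_prob(stim):
    return 0.2 + 0.5 * stim[0] + 0.2 * (stim[1] == stim[2])   # ranges 0.2 - 0.9

alpha = 0.1                 # delta-rule learning rate (illustrative)
v_feat = np.zeros(2)        # reward estimate per level of the informative feature
v_conj = np.zeros((2, 2))   # reward estimate per informative conjunction (dims 1 x 2)

estimates = []
for t in range(2000):
    stim = rng.integers(0, 2, size=3)
    r = float(rng.random() < reward_prob(stim))
    # feature-based and conjunction-based estimates updated in parallel
    v_feat[stim[0]] += alpha * (r - v_feat[stim[0]])
    v_conj[stim[1], stim[2]] += alpha * (r - v_conj[stim[1], stim[2]])
    # mixed strategy: weight on the conjunction estimate grows with experience
    w_conj = t / (t + 500.0)
    estimates.append((1 - w_conj) * v_feat[stim[0]] + w_conj * v_conj[stim[1], stim[2]])
```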


Neural optimal feedback control with local learning rules

November 2021 · 84 Reads

A major problem in motor control is understanding how the brain plans and executes proper movements in the face of delayed and noisy stimuli. A prominent framework for addressing such control problems is Optimal Feedback Control (OFC). OFC generates control actions that optimize behaviorally relevant criteria by integrating noisy sensory stimuli and the predictions of an internal model using the Kalman filter or its extensions. However, a satisfactory neural model of Kalman filtering and control is lacking because existing proposals have the following limitations: not considering the delay of sensory feedback, training in alternating phases, and requiring knowledge of the noise covariance matrices as well as of the system dynamics. Moreover, the majority of these studies considered Kalman filtering in isolation, and not jointly with control. To address these shortcomings, we introduce a novel online algorithm which combines adaptive Kalman filtering with a model-free control approach (i.e., a policy gradient algorithm). We implement this algorithm in a biologically plausible neural network with local synaptic plasticity rules. This network performs system identification and Kalman filtering, without the need for multiple phases with distinct update rules or knowledge of the noise covariances. It can perform state estimation with delayed sensory feedback, with the help of an internal model. It learns the control policy without requiring any knowledge of the dynamics, thus avoiding the need for weight transport. In this way, our implementation of OFC solves the credit assignment problem needed to produce the appropriate sensory-motor control in the presence of stimulus delay.
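
The algorithmic skeleton of "Kalman filtering plus model-free policy gradient" can be sketched on a toy 1-D plant as below. Unlike the paper, this sketch assumes the plant and noise parameters are known, omits sensory delay, and is not a neural-network implementation with local rules; it only shows how a Kalman state estimate can feed a feedback gain that is tuned by a REINFORCE-style update. All parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 1-D plant: x' = a*x + b*u + process noise; noisy observation y = x + noise.
a, b = 0.95, 1.0
q, r = 0.01, 0.1          # process / observation noise variances (assumed known here)
sig_u = 0.1               # exploration noise of the Gaussian policy
eta = 1e-4                # policy-gradient learning rate (illustrative)

K, baseline = 0.0, 0.0    # feedback gain and a running return baseline

for episode in range(2000):
    x, x_hat, p = rng.normal(0.0, 1.0), 0.0, 1.0
    grad, ret = 0.0, 0.0
    for t in range(50):
        u_mean = -K * x_hat                    # policy acts on the state estimate
        u = u_mean + sig_u * rng.normal()      # exploratory action
        # score of the Gaussian policy's log-probability with respect to K
        grad += (u - u_mean) / sig_u**2 * (-x_hat)
        x = a * x + b * u + np.sqrt(q) * rng.normal()
        y = x + np.sqrt(r) * rng.normal()
        # Kalman filter: predict with the internal model, correct with the observation
        x_hat, p = a * x_hat + b * u, a * a * p + q
        k_gain = p / (p + r)
        x_hat, p = x_hat + k_gain * (y - x_hat), (1.0 - k_gain) * p
        ret += -(x**2 + 0.01 * u**2)           # reward = negative quadratic cost
    baseline += 0.05 * (ret - baseline)
    K += eta * (ret - baseline) * grad         # model-free update of the feedback gain
```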


[Figure previews]
  • Figure 3: Drift of a single localized RF learned from a ring data manifold
  • Figure 4: Drift of manifold-tiling localized RFs
  • Figure 6: Representational drift in PPC
  • Figure S1: Relative change of the output similarity matrix and the output ensemble in the PSP task

Coordinated drift of receptive fields during noisy representation learning

September 2021 · 98 Reads · 4 Citations

Long-term memories and learned behavior are conventionally associated with stable neuronal representations. However, recent experiments showed that neural population codes in many brain areas continuously change even when animals have fully learned and stably perform their tasks. This representational "drift" naturally leads to questions about its causes, dynamics, and functions. Here, we explore the hypothesis that neural representations optimize a representational objective with a degenerate solution space, and noisy synaptic updates drive the network to explore this (near-)optimal space causing representational drift. We illustrate this idea in simple, biologically plausible Hebbian/anti-Hebbian network models of representation learning, which optimize similarity matching objectives, and, when neural outputs are constrained to be nonnegative, learn localized receptive fields (RFs) that tile the stimulus manifold. We find that the drifting RFs of individual neurons can be characterized by a coordinated random walk, with the effective diffusion constants depending on various parameters such as learning rate, noise amplitude, and input statistics. Despite such drift, the representational similarity of population codes is stable over time. Our model recapitulates recent experimental observations in hippocampus and posterior parietal cortex, and makes testable predictions that can be probed in future experiments.


Figure 4. RNNs can capture main behavioral results. (A) Plotted are average estimates at the end of a simulated session of our learning task, for 50 instances of the simulated RNNs (filled) and average value of participants' reward estimate (hollow) vs. actual reward probabilities associated with each stimulus (each symbol represents one stimulus). Error bars represent s.e.m., and the dashed line is the identity. (B) The plot shows the time course of explained variance (R 2 ) in RNNs' estimates based on different models. Error bars represent s.e.m. The solid line is the average of exponential fits to RNNs' data, and the shaded areas indicate s.e.m. of the fit. (C) Time course of adopted learning strategies measured by fitting the RNNs' output. Plotted is the weight of the informative feature and informative conjunction in the F+C1 model. Error bars represent s.e.m. The solid line is the average of exponential fits to RNNs' data, and the shaded areas indicate s.e.m. of the fit.
Figure 5. Response of different types of recurrent units show differential degrees of similarity to reward probabilities based on different learning strategies. (A-D) Plotted are the estimated weights for predicting the response dissimilarity matrix of different types of recurrent populations (indicated by the inset diagrams explained in Figure 3B) using the dissimilarity of reward probabilities based on the informative feature, informative conjunction, and object. Error bars represent s.e.m. The solid line is the average of fitted exponential functions to RNNs' data, and the shaded areas indicate s.e.m. of the fit. (E-H) Same as A-D but for inhibitory recurrent populations. Dissimilarity of reward probabilities of the informative feature can best predict dissimilarity of response in inhibitory populations, while dissimilarity of reward probabilities of the objects can best predict dissimilarity of response in excitatory populations.
Figure 7. Lesioning recurrent connections from inhibitory populations with plastic sensory input to excitatory populations with plastic sensory input results in drastic changes in the behavior of the RNNs. (A) The plot shows the time course of explained variance (R 2 ) in RNNs' estimates based on different models. Error bars represent s.e.m. The solid line is the average of exponential fits to RNNs' data, and the shaded areas indicate s.e.m. of the fit. (B) Plotted is the average rate of value-dependent changes in the connection weights from feature-encoding, conjunction-encoding, and object-identity encoding units to recurrent units with plastic sensory input, during the simulation of our task. Plus sign indicates one-sided significant rates of change (P<0.05), and hatched squares indicate connections with rates of change that were not significantly different from zero (P>0.05). Highlighted rectangles in cyan, magenta, and red indicate the values for input from sensory units encoding the informative feature, the informative conjunction, and object-identity, respectively.
Neural mechanisms of distributed value representations and learning strategies

April 2021 · 16 Reads · 1 Citation

Learning appropriate representations of the reward environment is extremely challenging in the real world, where there are many options to learn about and these options have many attributes or features. Despite the existence of alternative solutions for this challenge, neural mechanisms underlying emergence and adoption of value representations and learning strategies remain unknown. To address this, we measured learning and choice during a novel multi-dimensional probabilistic learning task in humans and trained recurrent neural networks (RNNs) to capture our experimental observations. We found that participants estimate stimulus-outcome associations by learning and combining estimates of reward probabilities associated with the informative feature followed by those of informative conjunctions. Through analyzing representations, connectivity, and lesioning of the RNNs, we demonstrate this mixed learning strategy relies on a distributed neural code and distinct contributions of inhibitory and excitatory neurons. Together, our results reveal neural mechanisms underlying the emergence of complex learning strategies in naturalistic settings.


Learning arbitrary stimulus-reward associations for naturalistic stimuli involves transition from learning about features to learning about objects

September 2020 · 38 Reads · 14 Citations · Cognition

Most cognitive processes are studied using abstract or synthetic stimuli with specific features to fully control what is presented to subjects. However, recent studies have revealed enhancements of cognitive capacities (such as working memory) when processing naturalistic versus abstract stimuli. Using abstract stimuli constructed from distinct visual features (e.g., color and shape), we have recently shown that human subjects can learn multidimensional stimulus-reward associations via initially estimating reward value of individual features (feature-based learning) before gradually switching to learning about reward value of individual stimuli (object-based learning). Here, we examined whether similar strategies are adopted during learning about naturalistic stimuli that are clearly perceived as objects (instead of a combination of features) and contain both task-relevant and irrelevant features. We found that similar to learning about abstract stimuli, subjects initially adopted feature-based learning more strongly before transitioning to object-based learning. However, there were three key differences between learning about naturalistic and abstract stimuli. First, compared with abstract stimuli, the initial learning strategy was less feature-based for naturalistic stimuli. Second, subjects transitioned to object-based learning faster for naturalistic stimuli. Third, unexpectedly, subjects were more likely to adopt feature-based learning for naturalistic stimuli, both at the steady state and overall. These results suggest that despite the stronger tendency to perceive naturalistic stimuli as objects, which leads to greater likelihood of using object-based learning as the initial strategy and a faster transition to object-based learning, the influence of individual features on learning is stronger for these stimuli such that ultimately the object-based strategy is adopted less. Overall, our findings suggest that feature-based learning is a general initial strategy for learning about reward value of all types of multi-dimensional stimuli.
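
A back-of-envelope illustration of why feature-based learning tends to dominate early (not taken from the paper; the feature counts are arbitrary and this is only the standard dimensionality intuition the abstract alludes to): a feature-based learner maintains far fewer reward estimates and updates a larger fraction of them on every trial than an object-based learner does.

```python
# Illustrative parameter-count comparison, assuming stimuli built from
# 4 features that each take one of 3 values.
n_features, n_values = 4, 3
feature_table = n_features * n_values     # 12 estimates for a feature-based learner
object_table = n_values ** n_features     # 81 estimates for an object-based learner

# Fraction of each value table that receives a delta-rule update on one trial:
feature_coverage = n_features / feature_table   # 4/12 ~ 0.33
object_coverage = 1 / object_table              # 1/81 ~ 0.012
print(feature_coverage, object_coverage)
```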


Flexible combination of reward information across primates

November 2019 · 354 Reads · 92 Citations · Nature Human Behaviour

A fundamental but rarely contested assumption in economics and neuroeconomics is that decision-makers compute subjective values of risky options by multiplying functions of reward probability and magnitude. By contrast, an additive strategy for valuation allows flexible combination of reward information required in uncertain or changing environments. We hypothesized that the level of uncertainty in the reward environment should determine the strategy used for valuation and choice. To test this hypothesis, we examined choice between risky options in humans and rhesus macaques across three tasks with different levels of uncertainty. We found that whereas humans and monkeys adopted a multiplicative strategy under risk when probabilities are known, both species spontaneously adopted an additive strategy under uncertainty when probabilities must be learned. Additionally, the level of volatility influenced relative weighting of certain and uncertain reward information, and this was reflected in the encoding of reward magnitude by neurons in the dorsolateral prefrontal cortex.
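
The multiplicative-versus-additive distinction at the heart of this abstract can be written down in a few lines. The sketch below is only a worked illustration with made-up weights, not the paper's fitted model; it shows how the two combination rules can rank the same pair of gambles differently, which is the kind of behavioral signature the task comparisons exploit.

```python
def subjective_value(p, m, strategy="multiplicative", w_p=0.6, w_m=0.4):
    """Value of a risky option with reward probability p and normalized magnitude m
    under the two candidate combination rules. The additive weights are illustrative."""
    if strategy == "multiplicative":
        return p * m              # expected-value-style combination
    return w_p * p + w_m * m      # additive combination of the two attributes

# Toy comparison of two gambles: the two rules can reverse the preference.
safe_ish = (0.8, 0.3)             # high probability, small reward
risky_ish = (0.3, 0.9)            # low probability, large reward
mult = [subjective_value(*g) for g in (safe_ish, risky_ish)]                         # [0.24, 0.27]
addv = [subjective_value(*g, strategy="additive") for g in (safe_ish, risky_ish)]    # [0.60, 0.54]
```

Under the multiplicative rule the low-probability, large-magnitude gamble is preferred, whereas the additive rule with these weights flips the choice toward the high-probability option.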



Citations (15)


... While previous studies have investigated learning algorithms in the Drosophila olfactory circuit where the association of inputs and rewards is given (i.e., supervised learning) (62)(63)(64)(65), this study proposes an unsupervised learning algorithm that operates without labels. Some studies have suggested that odor exposure without reward or punishment alters MBON activity and fly behavior through synaptic plasticity induced by DANs (47,66). ...

Reference:

A biological model of nonlinear dimensionality reduction
A linear discriminant analysis model of imbalanced associative learning in the mushroom body compartment

... Aside from plausible mechanisms of BCI learning, another notable observation in both BCI and non-BCI studies is the occurrence of representational drift, defined as a consistent shift in the representation of task variables even though the associated behavioral and environmental conditions remain unchanged [32][33][34][35][36][37][38]. In BCI settings, this adversely affects performance and learning progress and necessitates frequent decoder calibration and possible corrective approaches to misclassified/missing data 9,[39][40][41]. ...

Coordinated drift of receptive fields in Hebbian/anti-Hebbian network models during noisy representation learning

Nature Neuroscience

... In reinforcement learning, when people are confronted with multi-dimensional stimuli and environments, they often adopt a feature-based learning strategy to manage complexity and dimensionality. This approach involves the learning of reward values of individual features shared across different options, which facilitates rapid learning and generalization when consistent rules are present (Dezfouli & Balleine, 2019; Farashahi et al., 2017, 2020; Farashahi & Soltani, 2021; Franklin & Frank, 2019; Schaaf et al., 2022). For instance, if a child enjoys eating green grapes, they might assume they'll like other green fruits as well. ...

Computational mechanisms of distributed value representations and mixed learning strategies

... Exploratory random walk behaviour has been universally seen across species [8,22,34], along neural manifolds [34,66], gait cycles [12,42] and trial-by-trial reaching behaviour [8,9,33,40]. Humans show greater movement variability ...

Coordinated drift of receptive fields during noisy representation learning

... For example, one can associate reward to individual features of stimuli and combine this information to estimate values associated with each stimulus (feature-based learning) instead of directly learning the value of individual stimuli (object-based learning). Recent studies showed that in response to multi-dimensional stimuli, the learning strategy also depends on the volatility, generalizability (i.e., how well features of stimuli or options predict their values), and dimensionality of the environment [21][22][23]. ...

Neural mechanisms of distributed value representations and learning strategies

... Finally, we investigated whether participants would adopt a feature-based or object-based learning approach in the subsequent value-based decision-making phase. Farashahi et al. (2020) compared value learning with naturalistic and abstract stimuli, showing that while both groups initially used a feature-based strategy, those exposed to naturalistic stimuli shifted more quickly to object-based learning. ...

Learning arbitrary stimulus-reward associations for naturalistic stimuli involves transition from learning about features to learning about objects
  • Citing Article
  • September 2020

Cognition

... The aim of this study was to investigate whether bifrontal tDCS could normalize reinforcement learning deficits in low mood. DLPFC is part of a brain network involved in reinforcement learning and is activated in response to volatility [24][25][26]. In our previous study, we found that bifrontal tDCS increased reward learning rates in healthy volunteers [27]. ...

Flexible combination of reward information across primates

Nature Human Behaviour

... We determined the most appropriate model by evaluating model performance using two metrics: the average negative log likelihood and the Bayesian Information Criterion (BIC). Furthermore, we also computed the Bayesian Information Criterion per trial, denoted as , following the method of Farashahi et al. (2018), to assess how well each model adapted to changes in participants' decisions over time. The formula for is: ...

Influence of learning strategy on response time during complex value-based learning and choice

... The vmPFC and OFC are predominantly associated with the evaluation of the relative value of alternative offers [15][16][17][18][19][20][21] , while the ACC plays a role in monitoring conflicting information and actions [22][23][24][25][26] , as well as the estimation of expected outcomes and the cognitive cost of the decision performance [27][28][29][30] . In particular, the dorsal ACC (dACC) has been linked to reward anticipation and cognitive effort computation 31,32 , and to the processing of delayed rewards across various decision-making contexts [33][34][35][36][37][38] . Neural signals in ACC have been previously associated with multi-trial 39 and virtual reward expectation 40 , with remarked impact on behavioral adjustments 41,42 . ...

On the Flexibility of Basic Risk Attitudes in Monkeys

The Journal of Neuroscience : The Official Journal of the Society for Neuroscience

... However, while the anatomical overlap between the distributed circuits of decision-making and sensorimotor control is well established, theoretical models of these processes remain largely separate. Decision-making is often modeled as the accumulation of evidence until a threshold is reached [11][12][13][14][15][16][17][18][19], at which time a target is chosen. Models of movement control usually begin with that chosen target, toward which the system is guided through feedback and feedforward mechanisms [20][21][22]. ...

Dynamic combination of sensory and reward information under time pressure