Generative Environment-Representation Instance-Based Learning: A Cognitive Model

Abstract

Instance-Based Learning Theory (IBLT) suggests that humans learn to engage in dynamic decision making tasks through the accumulation of experiences, represented by the decision task features, the actions performed, and the utility of decision outcomes. This theory has been applied to the design of Instance-Based Learning (IBL) models of human behavior in a variety of contexts. One key feature of all IBL model applications is the method of accumulating instance-based memory and performing recognition-based retrieval. In simple tasks with few features, this knowledge representation and retrieval could hypothetically be done using all relevant information. However, these methods do not scale well to complex tasks when exhaustive enumeration of features is unfeasible. This requires cognitive modelers to design task-specific representations of state features, as well as similarity metrics, which can be time consuming and fail to generalize to related tasks. To address this issue, we leverage recent advancements in Artificial Neural Networks, specifically generative models (GMs), to learn representations of complex dynamic decision making tasks without relying on domain knowledge. We evaluate a range of GMs in their usefulness in forming representations that can be used by IBL models to predict human behavior in a complex decision making task. This work connects generative and cognitive models by using GMs to form representations and determine similarity.
Tyler Malloy,1 Yinuo Du,1,2 Fei Fang,2 Cleotilde Gonzalez1
1,2 Carnegie Mellon University
1Department of Social and Decision Sciences
2Software and Societal System Department
5000 Forbes Ave, Pittsburgh PA, USA
Introduction
Instance-Based Learning Theory (IBLT) represents the
cognitive processes for human decision making based
on cognitive memory mechanisms (i.e., recognition, recall,
decay, noise) relevant to dynamic decision making
tasks (Gonzalez, Lerch, and Lebiere 2003). IBLT
brings together the following characteristics: accumula-
tion of examples in memory through training and task
repetition, development of pattern recognition and se-
lective alternative search, similarity-based memory re-
trieval, gradual withdrawal of attention while increas-
ing memory retrieval, and transition from rule-based to
exemplar-based performance.
Copyright © 2023, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
Although IBLT models have been applied to dynamic
tasks involving complex information, this has previ-
ously relied on the use of hand-crafted features of the
environment being represented in an IBL model and,
therefore, the features are unique to each environment.
Another issue in applications of IBL modeling is the
requirement of a static definition of similarity in the space
of environment states throughout modeling.
In contrast, Generative Models (GM) are trained
to learn from a data set the underlying distribution
that is causally responsible for generating those data
(Salakhutdinov 2015). In other words, in GMs, the at-
tributes are not hand-crafted, but are learned from the
data. GMs have been integrated with other learning
models to demonstrate impressive success in improving
learning speed (Higgins et al. 2017).
One useful application of such GMs is in unsupervised
and semi-supervised learning, where the data is
not categorized, or only a small fraction has relevant
categories (Kingma et al. 2014). The learning of representations
useful for behavioral goals is an important
area of research in modelling human utility-based learn-
ing (Radulescu, Shin, and Niv 2021). However, to date,
the integration of GMs with cognitive models is lacking.
In this work, we propose integrating GMs and IBLT
into a new algorithm called Generative
Environment-Representation Instance-Based Learning
(GERIBL) (pronounced as “jur-bl”). This algorithm
seeks to enable IBLT models to leverage pre-trained
models that form representations of environments
for dynamic decision making. It does so by pairing
IBLT with GMs that are trained to learn, from a data
set, the underlying distribution that is causally
responsible for generating such data (Salakhutdinov 2015).
GMs have previously been integrated with Reinforce-
ment Learning (RL) to predict human learning of the
utility of visual stimuli (Malloy, Klinger, and Sims 2022)
and fast generalization to novel tasks (Malloy et al.
2022). This integration of GMs and RL has demon-
strated the usefulness of pre-trained GMs in forming
representations of environments that can be used in cognitive
models of learning. We expect that a similar approach
can be taken by integrating GMs into IBLT, taking
advantage of the strong cognitive foundations of
IBLT in cognitive architectures (i.e., ACT-R (Thomson
et al. 2015)).
GERIBL is used as a test bed for the potential integration
of GMs with cognitive models by comparing
different GM approaches. The learning task of generative
models is closely related to the human experience
of making decisions based on visual information. Humans
can leverage their experience observing visual information
outside of the context of decision making to
improve their speed of learning and their generalization
(i.e., transfer of learning). Part of the reason for this
is that humans observe visual information in an unsupervised
context and form representations of that information
that are useful for a variety of tasks. This is similar
to the unsupervised training of Deep GMs, which
enables them to form useful representations of information
that are generalizable. GERIBL leverages these
useful features of GMs to integrate with the cognitive
mechanisms of IBLT.
GERIBL describes the general framework for integrating
environment representations learned by a generative
model into an IBL model. We evaluated several GM
approaches, including AutoEncoders (AEs), which form
representations of stimuli that are useful for reconstruction,
and Generative Adversarial Networks (GANs),
which learn to discriminate genuine environment stimuli
from stimuli not in the original data set while simultaneously
learning to generate environment stimuli that
are similar to the underlying data. The results show the
advantages of the integration of GMs and IBLT.
Preliminaries: Instance-Based Learning Theory
In IBLT, the memory of agents consists of instances
$(s, a, x)$ defined by the state $s$, their action $a$, and the
outcome $x$ (Gonzalez, Lerch, and Lebiere 2003). All instances
are stored in memory as outcomes $x$ and options
$k = (s, a)$. This means that an IBL model requires the
storage of all instances in memory in the form of these
triplets.
At time $t$ there may be $n_{k,t}$ generated instances
$(k, x_{i,k,t})$. Calculating the expected utility of an action
requires an aggregation of all similar instances to determine
their memory activation and probability of retrieval.
Among a set of actions considered at each time step,
agents take the action with the expected maximum util-
ity. Expected utility is calculated through a “blending”
function according to:
$$V_{k,t} = \sum_{i=1}^{n_{k,t}} p_{i,k,t} \, x_{i,k,t} \tag{1}$$
where $n_{k,t}$ is the number of instances in memory, $x_{i,k,t}$ are
the outcomes, and the probability of retrieval $p_{i,k,t}$ is
calculated as:
$$p_{i,k,t} = \frac{\exp(\Lambda_{i,k,t}/\tau)}{\sum_{j=1}^{n_{k,t}} \exp(\Lambda_{j,k,t}/\tau)} \tag{2}$$
where $\tau$ is a temperature parameter and $\Lambda_{i,k,t}$ is the
activation value, which represents the ease of recall of
a specific instance in memory, calculated according to:
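$$\Lambda_{i,k,t} = \ln\left(\sum_{t' \in T_{i,k,t}} (t - t')^{-d}\right) + \alpha \sum_{j} \mathrm{Sim}_j\left(f^{k}_{j}, f^{k_i}_{j}\right) + \sigma \ln\frac{1 - \xi_{i,k,t}}{\xi_{i,k,t}} \tag{3}$$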
where $d$ and $\sigma$ are decay and noise parameters, and
$T_{i,k,t} \subset \{0, \ldots, t-1\}$ is the set of previous observations of
instance $i$. The similarity function $\mathrm{Sim}(f, f')$ calculates
the similarity of instances in memory with the current
instance (Nguyen, Phan, and Gonzalez 2022). Because
of the relationship between noise $\sigma$ and temperature $\tau$
in IBLT, the temperature parameter $\tau$ is typically set
to $\sigma\sqrt{2}$.
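As an illustration of Eqs. 1-3, the following is a minimal sketch in plain Python rather than the PyIBL implementation. The function names, the default values for `d`, `sigma`, and `alpha`, and the treatment of $\xi$ as a uniform draw on (0, 1) are assumptions made for this example and are not specified in the text above.

```python
import math
import random


def activation(t, past_times, similarities, d=0.5, sigma=0.25, alpha=1.0, rng=random):
    """Activation of one stored instance at time t, following Eq. 3.

    past_times: earlier time steps (all < t) at which the instance was observed.
    similarities: per-feature similarity values Sim_j for this instance.
    The defaults for d, sigma, and alpha are illustrative assumptions.
    """
    # Base-level term: recent and frequent observations raise activation.
    base = math.log(sum((t - tp) ** (-d) for tp in past_times))
    # Similarity term: weighted sum of per-feature similarities.
    sim = alpha * sum(similarities)
    # Noise term: xi is assumed to be a uniform draw, clamped away from 0 and 1.
    xi = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    noise = sigma * math.log((1.0 - xi) / xi)
    return base + sim + noise


def retrieval_probabilities(activations, tau):
    """Softmax over activations with temperature tau (Eq. 2)."""
    shift = max(activations)  # subtract the maximum for numerical stability
    weights = [math.exp((a - shift) / tau) for a in activations]
    total = sum(weights)
    return [w / total for w in weights]


def blended_value(activations, outcomes, tau):
    """Expected utility of an option as a retrieval-weighted outcome sum (Eq. 1)."""
    probs = retrieval_probabilities(activations, tau)
    return sum(p * x for p, x in zip(probs, outcomes))
```

An agent would compute `blended_value` for each available option and select the option with the highest value.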
Pattern Recognition
One potential challenge with the use of IBL models in
practice, for real-world problems, is that states can be
significantly complex. This motivates the formation of
hand-crafted representations of the state by cognitive
modelers. A cognitive modeler often represents the features
in the state of an instance by using the observable
attributes in the environment that are relevant to perform
a task. This has the benefit of more accurately
representing cognitive realities, compared to the alternative
of storing complex visual information in memory
or using hand-crafted features. The model proposed in
this work seeks to determine whether storing represen-
tations of complex information learned from a GM can
still be useful for modeling cognition, or if the task-
relevant information is lost.
Although the hand-crafted features that cognitive
modelers define might be practical, a disadvantage of
this approach is that they cannot be formed automati-
cally. The representations depend on the cognitive mod-
elers’ judgment of what is important for the task. There
are no general principles or guidelines to decide on the
features that are relevant for the state in a task. Al-
though cognitive modelers rely on what is “observable”
in the task, the selection of features may be arbitrary,
highly determined by the experience of the cognitive
modeler on the task. The model proposed in this work
seeks to address this requirement on cognitive modelers.
Similarity-Based Memory Retrieval
A key feature of IBLT is that the activation function depends
on the similarity $\mathrm{Sim}(f^{k}, f^{k_i})$ between the characteristics
of the environment and the attributes of the
stored instances. This means that recognition, judg-
ment, and choice depend on the method of determining
similarity (Gonzalez, Lerch, and Lebiere 2003). IBLT
also proposes that decision makers learn to focus their
attention on task-relevant features and, in turn, select
the limited information they attend to based on this
similarity (Gonzalez, Lerch, and Lebiere 2003). How-
ever, until now, there has been no principled method to
achieve this goal.
Although measuring similarity is highly relevant in
models designed in IBLT, relatively little work has been
done in IBLT to compare different approaches to measuring
similarity. The similarity function used is often
linear similarity, but sometimes a non-linear similarity
function is chosen instead through a trial-and-error
modeling process. In the next section, the proposed
model will attempt to address the challenge of automat-
ically producing instance attributes, and consistently
and meaningfully measuring similarity, through the in-
tegration of an IBL model with a generative model.
Preliminaries: Generative Models
Generative Models (GM) are a class of machine learn-
ing methods that attempt to learn from a data set
by assuming that a probability distribution generated
the data and attempting to learn the underlying dis-
tribution (Harshvardhan et al. 2020). In this research,
we propose a set of methods to integrate IBL models
with three major classes of generative models, Vari-
ational Autoencoders (VAEs), Generative Adversarial
Networks (GANs), and Visual Transformers (ViT) to
address the current limitations of IBL models described
above.
Figure 1 illustrates the proposed Generative
Environment-Representation Instance-Based Learning
(GERIBL) cognitive model. In this proposed model,
the environment representation can be generated from
the GM, producing an environment state that the IBL
model can use to make decisions from experience. Furthermore,
the figure illustrates how the execution of actions
from an IBL model can influence the environment
presented in the GM.
While other types of generative models exist, these
three classes were chosen because of their general applicability to
various input modalities (image, text, audio, etc.) and
their usefulness in applications of the learning setting
described later. The remainder of this section provides
background information on these types of generative
models, as well as insight into the usefulness of representations
learned by these approaches in IBL models.
AutoEncoders
Autoencoder (AE) models function by assuming that
there is a set of generative factors $\zeta$ that causally explain
the data in a set $x \in X$. The goal of training these
models is to learn an encoding function $p(z|x)$ and a
decoding function $p(x|z)$ that reflect these generative
factors. The result is a model that can approximate the
true environmental distribution $p(x)$.
When used with image data, these models typically
use the general structure of Convolutional Neural Networks
(CNNs) to learn low-dimensional representations
of visual information that can be used to form reconstructions
of unobserved visual stimuli, such as human
faces (Zhang 2018).
Figure 1: GERIBL: Generative Environment Representation
Instance Based Learning Model consisting of a
generative model producing environment stimuli representations
that are used by an instance-based learning
model to make decisions from experience.
Variational Autoencoder: (VAE) models use a
deep neural network to learn an encoder function
$q_\phi(z|x)$ that outputs constrained representations $z$ of
visual stimuli $x$ (Kingma and Welling 2014). These representations
define a vector of means $\mu_z$ and variances
$\sigma_z$ that form a Normal distribution $\mathcal{N}(\mu_z, \sigma_z)$. This distribution
is sampled to form a vector $z$ that is passed
through the decoder layers $p_\theta(x|z)$ to produce a
reconstruction. These VAE models are trained to minimize
the difference between input and reconstruction
by maximizing the objective function (Pu et al. 2016):
$$\mathcal{L}(\theta, \phi; x, z) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] \tag{4}$$
This learning objective is guaranteed to learn a generative
model that will approximate the true environmental
distribution $p(x)$. However, there is no guarantee
of any meaningful connection between the learned
latent representation $z$ and the true generative factors
$\zeta$ (Chen et al. 2016). This lack of connection could be
problematic for decision models based on these inter-
nal representations, potentially motivating the use of
alternative training (Aridor, da Silveira, and Woodford
2023).
β-Variational Autoencoder: models seek to connect
generative factors $\zeta$ and latent representations $z$
by adjusting the training of traditional VAEs through
a $\beta$ parameter that further controls the information
bottleneck (Burgess et al. 2018). This is done
by penalizing a metric of informational complexity of
the representations, the KL-divergence between
the encoder distribution $q_\phi(z|x)$ and the latent prior $p(z)$, using the training
function (Higgins et al. 2016):
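$$\mathcal{L}(\theta, \phi; x, z, \beta) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - \beta \, D_{KL}\left(q_\phi(z|x) \,\|\, p(z)\right) \tag{5}$$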
The $\beta$ parameter allows for additional control over
the information bottleneck of the model by adding a
weight to the informational complexity of the latent
representations defining the multivariate Gaussian distribution.
The result is that the entire model is trained
to balance the accuracy of reconstruction and the com-
plexity of latent representation in an adjustable fashion.
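To make the trade-off in Eq. 5 concrete, the sketch below evaluates a single-sample estimate of the β-VAE objective in plain Python. It is illustrative only: the standard-normal prior, the unit-variance Gaussian decoder, the log-variance parameterization, and the default `beta=4.0` are assumptions of this example, not details reported in the paper.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Reparameterized draw z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def beta_vae_objective(x, x_recon, mu, log_var, beta=4.0):
    """Single-sample estimate of Eq. 5 (to be maximized).

    Assumes a unit-variance Gaussian decoder, so log p(x|z) equals the
    negative half squared error up to an additive constant (dropped here).
    """
    log_likelihood = -0.5 * sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return log_likelihood - beta * kl_to_standard_normal(mu, log_var)
```

Setting `beta=1.0` weights the KL term as in a standard VAE objective, while larger values penalize informationally complex latent codes more heavily.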
Image Transformers
Pre-trained transformer models have the advantage of
wide applicability on a variety of different tasks and domains,
particularly in the context of Natural Language
Processing (NLP) (Wolf et al. 2019). However, concerns
have been raised over the use and usability of massive
pre-trained transformer models, suggesting that their
output may be the results of spurious correlations and
stochasticity (Bender et al. 2021). Part of the testing
of the Transformer-based GMs with GERIBL will be to
compare models pre-trained using the exact same stim-
uli with ones trained using similar stimuli.
Image-based transformers apply transformer-based
self-attention mechanisms to machine learning domains
with visual data (Parmar et al. 2018; Dosovitskiy et al.
2020). The two models used to test the GERIBL model
use transformers, and differ in their training methods
and the size and form of their representations of visual
information.
Vision Transformer VAE: Variational Autoen-
coders trained using transformer models are able to
learn constrained representations of images of vari-
able size that are still useful for reconstruction (He
et al. 2022). These models can be integrated into the
GERIBL cognitive model using the encodings learned
by a Visual Transformer Variational Autoencoder (ViT-
VAE) model.
The ViT-VAE model uses 4 attention heads, and 2
NN layers of 64 nodes for the multi-layer perceptron
layers. The loss function is based on the difference between
the input and reconstruction. The VAE encoding
representation is used by the GERIBL model as an en-
vironment state representation, and takes the form of a
vector of real numbers of size 100.
Attention: The second transformer-based GM that
is compared using the GERIBL model uses learned values
from the self-attention heads of the transformer network
when processing visual information. This model is
referred to as the Attention model.
The Attention model has the same general structure
as the ViT-VAE model with the main difference being
that it is not trained to reconstruct lossy versions of input
stimuli. The second difference is the form and size of
the representation that is used by the GERIBL model.
In the case of the Attention model, the values of the 4
self-attention heads are used as the representations for
the GERIBL model.
Generative Adversarial Networks
Generative Adversarial Network (GAN) models are
trained using generator and discriminator networks
(Salakhutdinov 2015). The goal of the generator is to
produce images that appear similar to those in the
training data set so that the discriminator network cannot
tell the difference. The goal of the discriminator is to
determine if a given image was produced by the genera-
tor or is a genuine original data set member. These mod-
els are trained in tandem in an adversarial structure.
Two GAN-based models are compared when integrated
with the proposed GERIBL model. Both
models have the same general structure and training,
differing only in the size of their internal representation
space and other network features.
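The adversarial objectives described above can be sketched with the commonly used binary cross-entropy formulation. This is an assumption for illustration: the paper does not state which GAN loss was used, and `discriminator_loss` and `generator_loss` are hypothetical names. Here `d_real` stands for the discriminator output on a genuine image and `d_fake` for its output on a generated image, both probabilities in (0, 1).

```python
import math


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss for one genuine and one generated sample.

    The loss is low when the discriminator assigns a high probability to the
    genuine sample (d_real) and a low probability to the generated one (d_fake).
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return -math.log(d_fake)
```

Training in tandem then alternates between updating the discriminator to lower `discriminator_loss` and updating the generator to lower `generator_loss`.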
GAN Model: The first GAN model uses representations
of size 100 to complete the learning objectives
of the generator and discriminator networks. This is
considered to be an ‘unconstrained’ version of a GAN,
analogous to the VAE model which has a larger repre-
sentation size and information complexity compared to
the β-VAE model. The calculation of similarity of the
GAN model is determined by the
Constrained GAN: The second GAN-based model
shares the motivation of the β-VAE model: using an
information bottleneck to produce constrained
representations that are less informationally complex,
allowing for faster generalization, while
still being useful for the IBL module. This is done by
reducing the size of the internal representation from
100 to 3, the same size as the latent representation of
the β-VAE model. Additionally, the generator and discriminator
network feature map is reduced from 64 to
8, imposing a stricter information bottleneck.
All other model structures and hyper-parameters
are kept the same.
GERIBL: Proposed Model
The proposed Generative Environment-Representation
Instance-Based Learning (GERIBL) model is the inte-
gration of IBLT (the Python implementation of IBLT
called PyIBL) and generative models. We compare a va-
riety of GMs, including VAEs and GANs, in their abil-
ity to form representations of visual information that
can be used in a cognitive architecture model of dynamic
decision making. This change is made primarily
by replacing environment state $s$ with the corresponding
GM internal representations $p(z)$. The result is a
cognitive architecture that predicts human recognition,
judgement, choice, and execution based on constrained
representations of visual information.
Figure 2: Contextual bandit learning task stimuli used in Experiment 1 (Right Panel) (https://nivlab.princeton.edu/data)
and in Experiment 2 (Left, Middle, and Right Panel). Left panel: The first set of stimuli shown to participants
in Experiment 2. Middle panel: The second set of stimuli shown to participants in Experiment 2. Right panel: the
third and final set of stimuli shown to participants in Experiment 2. This is also the stimulus used in Experiment 1,
to learn which of the 9 possible features (shape, color, texture) was associated with a higher reward.
Furthermore, the GERIBL model alters the IBLT activation
function (Eq. 3) by replacing the feature-based
similarity function $\mathrm{Sim}(f, f')$ with a similarity based
on the internal representation $z$ of the GM and the similarity
metric of the GM, $\mathrm{Sim}_{GM}$, as follows:
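$$\Lambda_{i,k,t} = \ln\left(\sum_{t' \in T_{i,k,t}} (t - t')^{-d}\right) + \alpha \sum_{j} \mathrm{Sim}_{GM}\left(p(z_k|k), p(z_{k_j}|k_j)\right) + \sigma \ln\frac{1 - \xi_{i,k,t}}{\xi_{i,k,t}} \tag{6}$$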
where $p(z_s|s)$ is the GM internal representation of observed
state $s$ and $p(z_{s_j}|s_j)$ are the GM internal representations
of each instance in memory $s_j$. Importantly,
this altered activation function avoids the necessity of
storing the full original environment stimuli, instead al-
lowing for cognitive mechanisms to use low-dimensional
representations of environments.
The type of GM that is used in the GERIBL model
results in differences based on how the internal representations
of each GM are formed and how those models
determine representation similarity. For example, the β-
VAE determines similarity based on the loss in Eq. 5,
according to the KL divergence between the two rep-
resentation distributions and their informational com-
plexities.
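One way a KL-based similarity could enter the altered activation function is sketched below. This is a hypothetical construction for illustration, not the authors' implementation: each representation is assumed to be a diagonal Gaussian given as a `(means, variances)` pair, the mapping `exp(-KL)` from divergence to similarity is an assumed choice, the noise term is omitted, and the sum over $j$ is reduced to a single term.

```python
import math


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Closed-form KL( N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)) )."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


def sim_gm(rep_a, rep_b):
    """Hypothetical similarity in (0, 1]: identical representations give 1."""
    (mu_a, var_a), (mu_b, var_b) = rep_a, rep_b
    return math.exp(-gaussian_kl(mu_a, var_a, mu_b, var_b))


def geribl_activation(t, past_times, current_rep, stored_rep, d=0.5, alpha=1.0):
    """Noise-free activation of one stored instance using a GM similarity term."""
    # Base-level term, as in the original IBLT activation function.
    base = math.log(sum((t - tp) ** (-d) for tp in past_times))
    # Similarity computed on low-dimensional GM representations instead of
    # hand-crafted features.
    return base + alpha * sim_gm(current_rep, stored_rep)
```

Because only the `(means, variances)` pairs are compared, memory holds a few numbers per instance rather than the full stimulus image.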
Model Representations
Another benefit of using GM-acquired representations
as instances of IBL models is that they can be updated
as the IBL model learns the utility of choice options.
This can reect the tendency of decision makers to at-
tend to features that are more relevant for a task at
hand, which in turn changes how they represent in-
formation internally. Previous work has compared how
β-VAE model representations can change as utility is
learned in a bandit task involving images of human
faces (Malloy et al. 2022). This is integrated into the
proposed model by training the generative model with
feedback from the GERIBL model blending function $V_{k,t}$,
which uses the activation function in Eq. 6, according to:
$$\mathcal{L}(\upsilon, k) = \upsilon \left\lVert V_{k,t} - x_k \right\rVert^2 \tag{7}$$
where $V_{k,t}$ is the predicted utility of the IBL model
before choice selection, and $x_k$ is the true observed outcome.
This functionality of the proposed model allows
for the updating of representation of environments as
the relevance to utility of different features is learned.
This utility-based training of generative model repre-
sentations has demonstrated more human-like decision-
making, reproducing biases in utility selection (Aridor,
da Silveira, and Woodford 2023), and fast generaliza-
tion (Malloy et al. 2022).
Learning Tasks
Experiment 1: Visual Utility Learning
The first learning task was originally described in (Niv
et al. 2015); the data were collected by the Princeton Niv Neuroscience
Lab and made publicly available on their lab website (footnote 1).
The experiment study was approved by the Princeton
University Institutional Review Board.
This task consisted of a contextual n-armed bandit
in which participants were shown 3 different choice options
consisting of a shape (square, circle, triangle),
color (red, green, blue), and texture (hatched, dotted,
wavy), as shown in Figure 2 (Right panel). On each
trial, the color, shape, and texture of each option are
randomized, with one instance of each feature type oc-
curring across the stimuli options (i.e., there is always
1 green option, 1 square option, etc.).
Experiment episodes were of variable length, roughly 20-25
decision trials, in which the same 1 of the 9
possible features was associated with a higher probability
(75% vs. 25%) of observing a reward of 1 instead of a
reward of 0. Data from 22 participants were collected in
this task, each making a total of 500 choice selections.
1 https://nivlab.princeton.edu/data
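The trial structure of Experiment 1 can be sketched as a small simulation. This is an illustrative reconstruction from the description above, not the original experiment code; `make_trial` and `reward` are hypothetical names, and the feature labels follow the task description.

```python
import random

SHAPES = ["square", "circle", "triangle"]
COLORS = ["red", "green", "blue"]
TEXTURES = ["hatched", "dotted", "wavy"]


def make_trial(rng):
    """Return 3 options; each feature value appears exactly once across them."""
    shapes = rng.sample(SHAPES, 3)
    colors = rng.sample(COLORS, 3)
    textures = rng.sample(TEXTURES, 3)
    return list(zip(shapes, colors, textures))


def reward(option, target_feature, rng, p_high=0.75, p_low=0.25):
    """Binary reward: probability p_high if the option has the target feature."""
    p = p_high if target_feature in option else p_low
    return 1 if rng.random() < p else 0


# Example episode: "green" is the high-reward feature and choices are random.
rng = random.Random(0)
episode = []
for _ in range(20):
    options = make_trial(rng)
    choice = rng.choice(options)
    episode.append((choice, reward(choice, "green", rng)))
```

Since exactly one of the three options carries the target feature on every trial, a random chooser picks it with probability 1/3.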
Experiment 2: Transfer of Learning
The data for this second experiment were originally collected and
detailed in (Malloy et al. 2023) by the Dynamic Decision
Making lab at Carnegie Mellon University, and made
publicly available on OSF (footnote 2). Sixty participants were
recruited online through Amazon Mechanical Turk. The
experiment was pre-registered on OSF and approved by
the Carnegie Mellon University Institutional Review Board.
For full methods see (Malloy et al. 2023).
This experiment sought to test human Transfer of
Learning (ToL), referring to the application of previ-
ously learned skills onto a new task. The learning task
in Experiment 2 involves ToL in which participants first
learned the values associated with shapes alone, then
shapes and colors, and finally the same shape-color-texture
features described in (Niv et al. 2015). The rewards
ranged from roughly 4-6 points, determined by
the features of the chosen option, with random noise
added to the reward points to make the learning task
more challenging.
The experiment episodes consisted of 14 trials of each
type in the order shown in Figure 2. During one set
of trials, one of the three feature options was associ-
ated with a higher reward (roughly 7 vs. 5). As the
experiment progressed, the previously high-valued fea-
ture continued to indicate that an option had a higher
value. For example, if a square is associated with a
higher expected utility initially, then red squares will
have a higher expected utility than red triangles for the
remainder of the experiment block. The same is true for
the higher utility color once the texture is introduced.
Model and Human Performance
This section compares the 6 previously mentioned GMs
in their ability to be integrated with the proposed
GERIBL model. These GMs are pre-trained with a sub-
set of the stimuli shown in Figure 2, either the 3 shape
stimuli, 9 shape-color stimuli, or 27 shape-color-texture
stimuli. After this pre-training, the models are used to
produce a representation that the IBL module of the
GERIBL model takes in as an environment state. We
use the two learning experiments to compare human
participant performance, the 6 proposed GM instantia-
tions of GERIBL, and a handcrafted version of the IBL
model.
Visual Utility Learning
In the first experiment on visual utility learning, GMs
are pre-trained using only the shape-color-texture stimuli
set of 27 images. The results in Figure 3 compare the three
types of GMs (VAE, Transformer, and GAN) with
human performance and an IBL model using hand-crafted
features. These results demonstrate that all
GMs roughly emulate human-like performance, with
the worst performing GMs being the GAN and ViT-VAE
models.
2 https://osf.io/mt4ws/
Figure 3: Model and participant average probability of
selecting the correct option in the contextual bandit
task within each episode. The chance rate is 1/3.
In Figure 3, the blue models correspond to the GMs
with smaller representation sizes than the orange models,
which correspond to the GMs with larger representation
sizes. As shown, the GMs with smaller representations
are a better fit to human behavior compared to
those with larger representations. This is likely due to
the fact that smaller representations are less informa-
tionally complex and thus are easier to quickly gener-
alize. These results indicate that one important factor
of GMs when integrating them in the GERIBL model
is the informational complexity of representations.
However, when using simple representations it is im-
portant to retain enough information for behavioral
goals. If the GM representations were too simple, they
could remove information relevant to the task, making
it difficult for the IBL module to learn. This would
be a detriment to applying the GERIBL model, since
the main benefit is the possibility of automatically generating
environment features, as well as a metric for
comparing them.
Transfer of Learning
Transfer of learning is related to the goals of applying
GMs to cognitive modeling through the potential application
of pre-trained models to novel environments. To
compare the ability of GM representations to be applied
to new tasks, we limit the training data sets in
Experiment 2 by including only the shape images, only
the shape-color images, and finally only the shape-color-texture
images (see Figure 2). This produces 3 sets of
GMs, one for each image type, that are used to produce
representations of the visual information used to make
decisions in the other two types of tasks.
Figure 4: GERIBL model average reward in the second experiment separated by the experiment condition. Generative
Models were trained only on a subset of the stimuli space indicated by color shade. Purple lines represent IBL model
performance using hand-crafted features. Green lines represent participant performance.
The first noticeable aspect of these results is that
the majority of GMs had a higher transfer of learning
compared to the IBL model with hand-crafted features.
This can be observed in the asymptotic reward (measured
by the average reward on the final 5 trials) of each
GM trained on a subset of stimuli and tested in each
of the experiment task conditions. Of these GMs, the
best performing is the Transformer model using Attention
values as its representation, which matches human
performance regardless of the stimuli it was trained on.
This indicates that this model has learned an efficient
representation of the stimuli applicable to related tasks.
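The asymptotic reward measure used here, the average reward over the final 5 trials, is a simple computation. The sketch below assumes per-trial rewards are available as a list; `asymptotic_reward` is a hypothetical helper name.

```python
def asymptotic_reward(rewards, n_final=5):
    """Average reward over the last n_final trials of a block."""
    if not rewards:
        raise ValueError("rewards must contain at least one trial")
    tail = rewards[-n_final:]  # uses all trials if fewer than n_final exist
    return sum(tail) / len(tail)
```

Computing this value for each GM, training subset, and task condition gives the quantities being compared against the hand-crafted IBL model and human participants.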
In addition to testing GMs in their ability to be
applied to a novel experiment task, these results
strengthen the two other motivations of GERIBL:
automatically determining relevant stimuli features and a
metric of similarity. If GMs required a unique training
approach for each stimuli space limited to that task,
then the applicability of pre-trained models would be
significantly diminished. We show that GMs with small
representation spaces can produce
human-like learning patterns even with novel stimuli.
Conclusions
The GERIBL model uses GMs to improve an IBL model
in three areas. Firstly, it uses representations of task
environments that are generated automatically, without
requiring cognitive modellers to develop a feature
set for each new task. Secondly, it allows for a metric
of similarity defined by the GM training, instead of by
cognitive modellers. Thirdly, it allows for improved prediction
of human behavior in transfer of learning tasks,
also demonstrating the ability of pre-trained GMs to be
applied to novel tasks.
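As an illustration of the second point, the sketch below shows how a similarity metric could be derived from GM representations rather than from hand-crafted features. Cosine similarity, the function names, and the example latent vectors are all assumptions made for illustration; the metric actually used with each GM may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two representation vectors.

    Assumption: cosine similarity stands in for whatever metric the
    trained GM actually induces over its representation space.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, memory):
    """Return (label, similarity) of the stored instance closest to `query`.

    `memory` maps a hypothetical instance label to its GM representation.
    """
    if not memory:
        raise ValueError("memory is empty")
    scored = ((label, cosine_similarity(query, rep)) for label, rep in memory.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical latent vectors; in practice these would come from a GM encoder.
memory = {"circle": [0.9, 0.1, 0.0], "square": [0.1, 0.8, 0.2]}
label, similarity = most_similar([0.8, 0.2, 0.1], memory)
print(label, round(similarity, 3))
```

Because the vectors come from the trained GM, no task-specific feature set or similarity function has to be written by the modeller.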
Of the GMs tested using the GERIBL model, the
β-VAE based model has the closest connection to biological
visual processing, which has been related to the
disentanglement objective (Higgins et al. 2021). However,
performing a complete analysis and comparison of
different types of GMs provides support for our proposed
model as a general framework for integrating GMs into
cognitive architectures that replicate human learning.
In addition to these main benefits, the results shown
here point towards future research investigating the impact
of utility on the representations learned by GMs.
This could be one area where GMs differ highly in their
connection to human cognition, as they would likely
react differently to training that incorporated utility
prediction. Previous work has compared GM representations
as utility is learned in simulated settings (Malloy,
Klinger, and Sims 2022), but these have not yet been
compared to behavior from human participants.
Acknowledgements
This research was sponsored by the Army Research
Office and accomplished under Australia-US MURI
Grant Number W911NF-20-S-000 and by the Army Research
Laboratory under Cooperative Agreement Number
W911NF-13-2-0045 (ARL Cyber Security CRA).
References
Aridor, G.; da Silveira, R. A.; and Woodford, M. 2023. Information-Constrained Coordination of Economic Behavior.
Bender, E. M.; Gebru, T.; McMillan-Major, A.; and Shmitchell, S. 2021. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, 610–623.
Burgess, C. P.; Higgins, I.; Pal, A.; Matthey, L.; Watters, N.; Desjardins, G.; and Lerchner, A. 2018. Understanding disentangling in β-VAE. arXiv preprint arXiv:1804.03599.
Chen, X.; Kingma, D. P.; Salimans, T.; Duan, Y.; Dhariwal, P.; Schulman, J.; Sutskever, I.; and Abbeel, P. 2016. Variational Lossy Autoencoder. In International Conference on Learning Representations.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. 2020. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations.
Gonzalez, C.; Lerch, J. F.; and Lebiere, C. 2003. Instance-based learning in dynamic decision making. Cognitive Science, 27(4): 591–635.
Harshvardhan, G.; Gourisaria, M. K.; Pandey, M.; and Rautaray, S. S. 2020. A comprehensive survey and analysis of generative models in machine learning. Computer Science Review, 38: 100285.
He, K.; Chen, X.; Xie, S.; Li, Y.; Dollár, P.; and Girshick, R. 2022. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 16000–16009.
Higgins, I.; Chang, L.; Langston, V.; Hassabis, D.; Summerfield, C.; Tsao, D.; and Botvinick, M. 2021. Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons. Nature communications, 12(1): 6456.
Higgins, I.; Matthey, L.; Pal, A.; Burgess, C.; Glorot, X.; Botvinick, M.; Mohamed, S.; and Lerchner, A. 2016. beta-vae: Learning basic visual concepts with a constrained variational framework. In International conference on learning representations.
Higgins, I.; Pal, A.; Rusu, A.; Matthey, L.; Burgess, C.; Pritzel, A.; Botvinick, M.; Blundell, C.; and Lerchner, A. 2017. Darla: Improving zero-shot transfer in reinforcement learning. In International Conference on Machine Learning, 1480–1490. PMLR.
Kingma, D. P.; Mohamed, S.; Jimenez Rezende, D.; and Welling, M. 2014. Semi-supervised learning with deep generative models. Advances in neural information processing systems, 27.
Kingma, D. P.; and Welling, M. 2014. Auto-Encoding Variational Bayes. stat, 1050: 1.
Malloy, T.; Du, Y.; Fang, F.; and Gonzalez, C. 2023. Accounting for Transfer of Learning using Human Behavior Models. Human Computation and Crowdsourcing.
Malloy, T.; Klinger, T.; and Sims, C. R. 2022. Modeling human reinforcement learning with disentangled visual representations. Reinforcement Learning and Decision Making (RLDM).
Malloy, T. J.; Sims, C. R.; Klinger, T.; Riemer, M. D.; Liu, M.; and Tesauro, G. 2022. Learning in Factored Domains with Information-Constrained Visual Representations. In NeurIPS 2022 Workshop on Information-Theoretic Principles in Cognitive Systems.
Nguyen, T. N.; Phan, D. N.; and Gonzalez, C. 2022. SpeedyIBL: A comprehensive, precise, and fast implementation of instance-based learning theory. Behavior Research Methods, 1–24.
Niv, Y.; Daniel, R.; Geana, A.; Gershman, S. J.; Leong, Y. C.; Radulescu, A.; and Wilson, R. C. 2015. Reinforcement learning in multidimensional environments relies on attention mechanisms. Journal of Neuroscience, 35(21): 8145–8157.
Parmar, N.; Vaswani, A.; Uszkoreit, J.; Kaiser, L.; Shazeer, N.; Ku, A.; and Tran, D. 2018. Image transformer. In International conference on machine learning, 4055–4064. PMLR.
Pu, Y.; Gan, Z.; Henao, R.; Yuan, X.; Li, C.; Stevens, A.; and Carin, L. 2016. Variational autoencoder for deep learning of images, labels and captions. Advances in neural information processing systems, 29.
Radulescu, A.; Shin, Y. S.; and Niv, Y. 2021. Human representation learning. Annual Review of Neuroscience, 44: 253–273.
Salakhutdinov, R. 2015. Learning deep generative models. Annual Review of Statistics and Its Application, 2: 361–385.
Thomson, R.; Lebiere, C.; Anderson, J. R.; and Staszewski, J. 2015. A general instance-based learning framework for studying intuitive decision-making in a cognitive architecture. Journal of Applied Research in Memory and Cognition, 4(3): 180–190.
Wolf, T.; Debut, L.; Sanh, V.; Chaumond, J.; Delangue, C.; Moi, A.; Cistac, P.; Rault, T.; Louf, R.; Funtowicz, M.; et al. 2019. transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771.
Zhang, Y. 2018. A better autoencoder for image: Convolutional autoencoder. In International Conference on Neural Information Processing ICONIP17-DCEC.