Sailing to the Model’s Edge:
Testing the Limits of Parameter Space and Scaling
Amy Santamaria
Walter Warwick
Alion Science and Technology
MA&D Operation
4949 Pearl East Circle, Suite 200
Boulder, CO 80301
303-442-6947
asantamaria@alionscience.com, wwarwick@alionscience.com
Keywords:
Naturalistic Decision Making, Human Performance Modeling
ABSTRACT: Using the MS/RPD integrated modeling approach, we have modeled a variety of tasks. We
typically try to capture aspects of human performance and evaluate the qualitative and quantitative fit of
model behavior to human data. A collection of individual models and demonstrations of fit to human data
constitute an important validation of a modeling approach. However, there are problems with focusing
solely on the “good fit” and “typical model” section of model complexity and parameter space. In this
paper, we argue that as modelers, we need to examine our approaches in a broader context, going beyond
the comfort zone of good fit and typical models. Using a very simple "generic" model, we examined a
relatively small search space, with the goal of better covering and understanding a wider range of
complexity and parameter values than our typical models utilize. We investigated scaling by systematically
increasing the number of cues and COAs, and we investigated a range of values for three key model
parameters. We identified practical limits of scaling: increasing either cues or COAs alone slowed learning only slightly, but increasing both beyond three slowed learning substantially. In our parameter exploration, the results
underscored the importance of exploring the full range of possible values because parameter values did not
always affect performance and learning in a monotonic way.
1. Introduction
Over the past ten years, we have constructed and
presented models of a variety of tasks using our MS/RPD
approach (Warwick, McIlwaine, Hutton, & McDermott,
2001; Warwick & Hutchins, 2004; Warwick &
Fleetwood, 2006; Warwick & Santamaria, 2006;
Santamaria & Warwick, 2007; 2008). Our approach
combines Micro Saint task network modeling (the MS
component) with underlying learning and memory
mechanisms that capture key aspects of recognition-
primed decision making (the RPD component) in an
integrated architecture. The MS component breaks down
tasks into their constituent processes, creating a kind of
“dynamic flowchart,” represented as a network of tasks.
The RPD component uses a multiple-trace model of long-
term memory, a similarity-based recall mechanism, and
simple reinforcement-based learning to set values or
determine the flow of control in the task network. Using
this integrated modeling approach, we typically focus
on a single task, constructing a model, trying to capture
aspects of human performance, and evaluating the
qualitative or quantitative fit of model behavior to the
human data.
A collection of individual models and demonstrations of
fit to human data constitute an important validation of a
modeling approach. However, there are bigger issues to
take into consideration when developing, exploring, and
evaluating a modeling framework. There are problems
with goodness of fit as the sole criterion (see Roberts &
Pashler, 2000, Collyer, 1985). But more critically, there
are problems with focusing solely on the “good fit” and
“typical model” section of model size and parameter
space.
Several important points related to issues of scaling are
brought out in Gluck et al. (2007). The authors describe
three levels of theory that are implemented in models of
cognition: architecture and control mechanisms (Type 1),
internal component/module implementation (Type 2) and
knowledge (Type 3). Gluck et al. point out that the
parameter space for each of those levels is very large and
that a typical modeling effort only selects a single point at
the intersection of these spaces. From their paper:
A thorough search of even a modest portion of
the total possible theoretical state space will
require an unprecedented amount of computing
power because of the combinatorics associated
with searching a multi-dimensional
space…seemingly innocuous assumptions and
implementation decisions can have dramatic
consequences downstream in a complex system
like a cognitive architecture that interacts with a
simulation environment
The tendency in modeling is to focus on “pet problems”
where the model succeeds. However, the potential
parameter space for any given model is huge. We
modelers need to examine our approaches in a broader
context, not just the “good fit” space, or comfort zone.
This problem is well laid out in Best et al. (2009):
The de-facto approach to cognitive modeling is
more often a focus on maximizing fit to human
data. This is done through either hand-tuning
based on the intuition and experience of the
modeler or automated optimizing of the fit…Any
of these approaches can be sufficiently
successful, but they provide little data about the
performance of the model outside of the ultimate
parameter values used in presenting the final fit.
Best et al. also point out the benefits of such exploration
of parameter space:
Information about how a model performs outside
the best-fitting parameter combination provides
modelers with information about…the full range
of behavior possible from the model and how
different parameters interact to generate possibly
complex behavioral dynamics.
Our modeling approach is simpler than the typical
cognitive architecture of the type Gluck et al. and Best et
al. describe (e.g., ACT-R or Soar), but issues of scaling
still apply. For this paper, we examined a relatively small
search space with a very simple model, but our goals were
similar – to cover and better understand a wider space
than our typical models explore.
In a recent paper (Santamaria & Warwick, 2009), we gave
an overview of our MS/RPD modeling approach, the
ground we have covered and tasks we have modeled, and
our vision for the next steps to take. In our “next steps”
section, we promised to “systematically investigate the
computational limits of our algorithms, scaling up a
simple model by adding cues and courses of actions.”
To follow through on this promise, we constructed a
“generic” model without built-in assumptions about tasks
or processes (and the expectations that come with them);
the inputs to the model are cue 1 through cue n, and the
values of these cues determine the selection of one of m
courses of action (COAs). We used this model to explore
issues of scaling by systematically increasing the number
of cues and COAs. We went beyond the typical size for
MS/RPD models, on the order of 2 cues and 2 COAs, to
explore up to 15 cues and 5 COAs. Using the same
generic model, we also investigated a range of values for
three key model parameters: the activation exponent, the
COA selection mechanism, and confidence.
2. The Generic Model: A Testbed
The generic model was developed to explore scaling and
parameter space issues. Why did we construct a generic
model? In our models, closed form analytic solutions are
not obvious or even tractable. Even the simplest
cognitive models are fairly complicated pieces of
software, and they need to be explored empirically. The
generic model can be incrementally scaled up in the
number of cues and the number of courses of action. In
this section, we describe the underlying learning,
memory, and recognition mechanisms and the
construction and cue structure of the generic model.
2.1 Learning, Memory, and Recognition Mechanisms
Our decision modeling mechanism was inspired by
Klein’s theory of the recognition primed decision, or RPD
(see Klein, 1998). It uses a multiple-trace mechanism
based on the multiple-trace model of memory (see
Hintzman, 1984; 1986a; 1986b). Following Klein, the
major features of our modeling approach are cues and
COAs, and the associations between them. Models learn
the associations between cues and COAs through
experience, and the accumulation of this experience can
be modified by several recognition and learning
parameters. These parameters include the activation
exponent, the COA selection mechanism, and confidence,
each of which is described in more detail below.
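To make these mechanisms concrete, the following sketch shows one way a multiple-trace, similarity-based recognizer with simple reinforcement could be implemented. It is a minimal illustration under our own simplifying assumptions (binary cue vectors, a proportion-of-matches similarity score, and +1/-1 outcome tags), not the MS/RPD implementation itself.

import random

class MultiTraceMemory:
    """Minimal multiple-trace recognizer sketch; not the MS/RPD implementation."""

    def __init__(self, n_cues, n_coas, activation_exponent=15):
        self.n_cues = n_cues
        self.n_coas = n_coas
        self.exponent = activation_exponent
        # Each stored episode: (cue vector, chosen COA, outcome of +1 for success or -1 for failure)
        self.traces = []

    def similarity(self, probe, trace_cues):
        # Proportion of cue values shared by the probe and a stored episode.
        matches = sum(1 for p, t in zip(probe, trace_cues) if p == t)
        return matches / self.n_cues

    def recognition_strengths(self, probe):
        # Every episode contributes in proportion to similarity raised to the activation exponent.
        strengths = [0.0] * self.n_coas
        for cues, coa, outcome in self.traces:
            strengths[coa] += (self.similarity(probe, cues) ** self.exponent) * outcome
        return strengths

    def choose(self, probe):
        # Default-style selection: the most strongly (positively) recognized COA,
        # falling back to a random guess when nothing is recognized as successful.
        strengths = self.recognition_strengths(probe)
        best = max(range(self.n_coas), key=lambda c: strengths[c])
        return best if strengths[best] > 0.0 else random.randrange(self.n_coas)

    def learn(self, probe, coa, correct):
        # Reinforcement-based learning: store a new episodic trace tagged with the outcome.
        self.traces.append((tuple(probe), coa, 1 if correct else -1))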
2.2 Construction and Cue Structure
The high-level task structure of the generic model is
shown in Figure 1. The first task sets the model
parameters, including number of cues, number of COAs,
runtime, number of situations, and cue-to-COA mapping.
We explored several different cue-to-COA mappings in
order to reduce the chance that we had hidden or
"smuggled in" informative structure that essentially gave
extra help to the model. Standard experimental
paradigms are carefully crafted to have internal structure
that is predictable and learnable. The model can latch on
to certain kinds of structure, but what happens when the
structure is completely arbitrary? We tested several
mappings, including random assignment of cue
combinations (situations) to COAs (“random”), a list-
based mapping covering all possible combinations
(“alternating”), an offset list-based mapping (“offset”),
and a mapping based on the location of cues in the
situation vector (“left-right”). Results were similar for all
mappings; the results reported in this paper used either the
random or the alternating mapping.
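The mappings can be viewed as functions from cue situations to COAs. The sketch below is our own reconstruction for binary cues; the names follow the paper, but the exact construction of the "random" and "alternating" mappings is an assumption.

import itertools
import random

def all_situations(n_cues):
    # Enumerate every binary cue combination, e.g. (0, 0), (0, 1), (1, 0), (1, 1) for two cues.
    return list(itertools.product([0, 1], repeat=n_cues))

def random_mapping(n_cues, n_coas, seed=0):
    # "random"-style mapping: each cue situation is assigned an arbitrary COA.
    rng = random.Random(seed)
    return {situation: rng.randrange(n_coas) for situation in all_situations(n_cues)}

def alternating_mapping(n_cues, n_coas):
    # "alternating"-style mapping: cycle through the COAs down the ordered list of situations.
    return {situation: i % n_coas for i, situation in enumerate(all_situations(n_cues))}

# A 3-cue, 2-COA problem has eight situations, each with one "correct" COA.
print(alternating_mapping(3, 2))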
Figure 1. The task structure of the generic model.
After setting model parameters, the task network model
passes control to the RPD (decision) model, which selects
a COA. Figure 2 shows the screen where cues are
specified in the RPD model.
Figure 2. Specifying cues in the decision (RPD) model.
Next, the task network model resumes, goes to the
“continue” task, and if runtime is not yet up, loops back to
make another decision. There are no actual consequences
in the task network model of choosing one COA over
another; that is why we call this a “generic” model
that does not have built-in task assumptions.
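Putting the pieces together, each run amounts to a loop: draw a situation, let the decision model select a COA, score the choice against the cue-to-COA mapping, and reinforce. The sketch below assumes a decision model with the choose/learn interface from the sketch in Section 2.1; it illustrates the loop, not the task network software itself.

import random

def run_generic_model(mapping, model, runtime=500, seed=1):
    # mapping: dict from cue situation to its "correct" COA (see the mapping sketches above).
    # model: any object exposing choose(situation) and learn(situation, coa, correct).
    rng = random.Random(seed)
    situations = list(mapping.keys())
    correct_count = 0
    for trial in range(runtime):
        situation = rng.choice(situations)     # set the cues for this decision
        coa = model.choose(situation)          # the decision (RPD) model selects a COA
        correct = (coa == mapping[situation])  # the only "consequence" is right or wrong
        model.learn(situation, coa, correct)   # reinforce and loop until runtime is up
        correct_count += int(correct)
    return correct_count / runtime             # overall proportion correct for the run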
3. Scaling Up Model Complexity
To test effects of scale and explore a wider range of
model size than we typically investigate, we
systematically changed the number of cues and the
number of COAs in the model.
We tested all combinations of cues and COAs from one to
five cues and from two to five COAs. To ensure that all
cue situations deterministically predicted a COA, we
omitted combinations with fewer cue situations than
COAs. An example is the combination of three COAs
and one cue (3-1); with one cue, there are two cue
situations that cannot uniquely map to three different
COAs. The combinations tested are listed in Table 1.
Table 1. Combinations of cues and COAs tested (cells are COAs-cues; X = not tested).
              COAs
Cues        2      3      4      5
  1        2-1     X      X      X
  2        2-2    3-2    4-2     X
  3        2-3    3-3    4-3    5-3
  4        2-4    3-4    4-4    5-4
  5        2-5    3-5    4-5    5-5
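Assuming binary cues, the exclusion rule behind the X cells reduces to requiring at least as many cue situations as COAs:

def valid_combination(n_coas, n_cues):
    # A COAs-cues combination is tested only if every COA can be the unique answer
    # for at least one cue situation (with binary cues, there are 2**n_cues situations).
    return 2 ** n_cues >= n_coas

print(valid_combination(3, 1))  # False: the 3-1 cell is excluded
print(valid_combination(4, 2))  # True: the 4-2 cell is tested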
We tested each model holding confidence at medium and
the activation exponent at 15. The cue-to-COA mapping
was the “alternating” mapping and runtime was 500 trials.
Figure 3 and Figure 4 show the results of these tests.
They present the same data but group them differently,
with Figure 3 showing the effect of number of COAs by
grouping the models by number of cues, and Figure 4
showing the effect of number of cues by grouping the
models by number of COAs.
Figure 3 shows the effect of number of COAs on learning
for models with 2 cues (top left), 3 cues (top right), 4 cues
(bottom left), and 5 cues (bottom right). Learning
differences are very small for 2 or 3 cues. However,
when the number of cues increases to 4 or 5, adding
COAs slows learning. Tests with long runs showed that it
takes much longer for model 5-5 to reach asymptote than
for model 2-5 to reach asymptote.
Figure 4 shows the effect of number of cues on learning
for models with 2 COAs (top left), 3 COAs (top right), 4
COAs (bottom left), and 5 COAs (bottom right). Again,
learning differences are small for a small number of
COAs but grow larger as the number of COAs increases.
Figure 3. Effect of number of COAs on learning for 2, 3, 4, and 5 cues (left to right, top to bottom). Models are referred to as A-B, where A is the number of COAs and B is the number of cues. Time is on the x-axis (trial/50).
Figure 4. Effect of number of cues on learning for 2, 3, 4, and 5 COAs (left to right, top to bottom). Models are referred to as A-B, where A is the number of COAs and B is the number of cues. Time is on the x-axis (trial/50).
4. Exploring Parameter Values
With our generic model, we explored three of the parameters that are available in the MS/RPD modeling approach: the activation exponent, the COA selection mechanism, and confidence.
4.1 Activation Exponent
The first parameter we explored with the generic model
was the activation exponent. Remember that the
MS/RPD approach uses a similarity-based recall
mechanism. The similarity value between the current
episode and all the episodes in long-term memory is
raised to a power, the activation exponent. The similarity
value determines the proportion that each remembered
episode contributes to the recognition process. A higher
value for the activation exponent means that the match
must be more exact for the remembered episode to
contribute to the current decision.
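A small numerical example (our own, not taken from the paper) shows how the exponent sharpens recall: with an exponent of 3, a trace matching 9 of 10 cues contributes about 3.4 times as much as a trace matching 6 of 10, but with an exponent of 15 the ratio grows to roughly 440, so near-exact matches dominate the decision.

# Illustrative only: relative weight of a near-exact match (similarity 0.9)
# versus a partial match (similarity 0.6) as the activation exponent grows.
for exponent in (3, 7, 15):
    near, partial = 0.9 ** exponent, 0.6 ** exponent
    print(f"exponent={exponent:2d}  near/partial weight ratio = {near / partial:,.1f}")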
We tested the 2-10 model (2 COAs and 10 cues), holding
confidence at medium and COA selection at default. The
cue-to-COA mapping was the “random” mapping, and
runtime was 5000 trials. With 2 COAs, chance
performance is 50 percent correct. As shown in Figure 5,
all versions of the model performed above chance. A
higher activation exponent yielded better performance and
a faster learning curve.
Figure 5. Learning (percent correct over time) as a
function of activation exponent for the 2-10 model, for
a run of 5000 trials. The x-axis is trial/500.
For this model, activation exponent is an important
parameter. Holding everything else constant, it can
improve overall performance from 64 percent correct to
85 percent correct. Figure 6 shows overall percent correct
(across all trials) for the 2-10 model for activation
exponent values of 3 to 15.
4.2 COA Selection Mechanism
The second parameter we explored with the generic
model was the COA selection mechanism. The COA
selection mechanism controls how the model will choose
among recognized courses of action. By default, the
model will always choose the COA most strongly
recognized as successful among those that exceed a
recognition threshold; conversely, the model will not
choose any COAs that have been recognized as
unsuccessful. This selection strategy is referred to as
“default”.
Figure 6. Overall percent correct as a function of
activation exponent for the 2-10 model, for 5000 trials.
The default strategy is intended to steer the model toward
the most successful COAs. The model can also employ a
“fuzzy” selection strategy where it tends to choose the
COA recognized as most successful, but not always. The
fuzzy option uses a probabilistic draw weighted with
respect to the normalized strength of recognition for each
COA.
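A sketch of the two strategies, reconstructed from the description above (threshold handling and normalization details are our assumptions):

import random

def select_default(strengths, threshold=0.0):
    # Default: the most strongly recognized COA among those above the recognition
    # threshold; negatively recognized (unsuccessful) COAs are never chosen.
    candidates = [c for c, s in enumerate(strengths) if s > threshold]
    return max(candidates, key=lambda c: strengths[c]) if candidates else None

def select_fuzzy(strengths, threshold=0.0, rng=random):
    # Fuzzy: a probabilistic draw weighted by normalized recognition strength, so the
    # most successful COA is favored but not always chosen.
    candidates = [(c, s) for c, s in enumerate(strengths) if s > threshold]
    if not candidates:
        return None
    coas, weights = zip(*candidates)
    total = sum(weights)
    return rng.choices(coas, weights=[w / total for w in weights], k=1)[0]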
We tested the 2-10 model (2 COAs and 10 cues), holding confidence at none, and compared the default and fuzzy selection mechanisms. The cue-to-COA mapping was the “alternating” mapping.
The effect of COA selection mechanism on learning for
the first 200 trials is shown in Figure 7. Both default and
fuzzy mechanisms result in similar performance, but they
differ in the initial spin-up over the first 50 trials. On
average, across a batch of ten runs, the model using the
default mechanism spins up more quickly.
Figure 7. Effect of COA selection mechanism on
learning for the first 200 trials. (Default and fuzzy are
each averaged over 10 runs.)
4.3 Confidence
The third parameter we explored with the generic model
was confidence. Confidence sets a threshold above which
the model will recognize a COA. The lower the
threshold, the less “confident” you can be that the
recognition is due to systematic associations in long term
memory between situations and COAs rather than the
noise inherent in the similarity-based recognition process.
Viewing long-term memory as a “population” of
experience, the threshold corresponds to the number of
standard deviations from the mean recognition value one
would expect from a population of random experiences.
Low confidence corresponds to one standard deviation,
medium to two standard deviations, and high to three
standard deviations.
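In other words, the recognition threshold can be computed as the mean plus k standard deviations of the recognition values produced by a population of random experiences, with k = 1, 2, or 3 for low, medium, and high confidence. A minimal sketch under that reading:

import statistics

def confidence_threshold(random_baseline_values, level="medium"):
    # random_baseline_values: recognition values obtained from a population of random
    # experiences; the threshold sits k standard deviations above their mean.
    k = {"low": 1, "medium": 2, "high": 3}[level]
    return statistics.mean(random_baseline_values) + k * statistics.stdev(random_baseline_values)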
The effects of confidence should show in early trials, as
the model spins up. Early trials are especially important
in models that are very sensitive to noise and initial
effects. We have seen confidence affect early
performance and spin-up in other models. However, our
tests did not reveal differences in the generic model for
different levels of confidence across a variety of
conditions (specific results are not reported here).
5. Discussion
We used the generic model to investigate 1) scaling
beyond our typical model size and 2) a range of values for
several key model parameters. In the exploration of
scaling, we found that we could increase either cues or
COAs with only a very minor slowing of learning, but
that increasing both beyond three led to a much larger
slowdown in learning.
These results demonstrate the syntactic nature of the
model. It is not learning anything about specific COAs or
cues; it is learning about the combination of COAs and
cues. This is evident in the symmetry of the effect of
scaling up in number of cues and COAs on performance.
It doesn't matter if the increase in decision space size is
due to cues or COAs; the model is sensitive to the size of
the decision space, not the source of the complexity.
In addition to the results presented here, we built models
that scaled up even further: a 2 COA, 10 cue model (2-
10), a 2 COA, 15 cue model (2-15), and a 5 COA, 10 cue
model (5-10). The 2-10 model was able to learn to
asymptote, although it took longer to reach asymptote
than did models whose numbers of cues and COAs
were capped at 5. The 2-15 and 5-10 models were not
able to converge, even with runtimes of 25,000 trials.
This was because of the very large space to learn (all
combinations of cues were possible and had an assigned
“correct answer”). For example, the 2-10 model had 2^10, or 1,024, possible cue combinations. The 2-15 model had 2^15, or 32,768, and the 5-10 model had 5^10, or 9,765,625!
When we limited the number of possible cue
combinations the model could face (to 50, 100, even 500),
the 2-15 and 5-10 models were able to learn without a
problem. So scaling up the cue and COA space and
scaling up the situation space are actually separate issues.
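Numerically, the distinction is easy to see: the full situation space grows exponentially with the number of cues, while the limited-situation runs expose the model to only a fixed sample of that space. The sketch below is illustrative; the cue cardinalities are assumptions chosen to match the counts reported above.

import random

def full_situation_space(values_per_cue, n_cues):
    # Size of the full space: every cue combination has an assigned "correct answer".
    return values_per_cue ** n_cues

def limited_situation_set(values_per_cue, n_cues, limit, seed=0):
    # Capping the situation space: sample only `limit` situations (repeats possible
    # in this toy sketch) for the model to face, however large the full space is.
    rng = random.Random(seed)
    return [tuple(rng.randrange(values_per_cue) for _ in range(n_cues)) for _ in range(limit)]

print(full_situation_space(2, 10))                   # 1,024 combinations for the 2-10 model
print(full_situation_space(2, 15))                   # 32,768 for the 2-15 model
print(len(limited_situation_set(2, 15, limit=100)))  # a capped set of 100 situations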
Two of the parameters we examined provided interesting
results: activation exponent and the COA selection
mechanism. The value of the activation exponent made a
substantial difference in the model's learning and
performance. The higher the activation exponent, the
faster the learning. Differences were largest among
smaller activation exponents (3 to 7), and learning curves
became more similar for higher values (9 to 15). Overall
performance (percent correct) also improved as activation
exponent increased, with the largest differences at the
small end of the parameter scale.
It was important to explore the full range of possible
activation exponent values because they did not uniformly
affect performance and learning. The lesson from our
exploration of this parameter is that you need to make
sure the activation exponent is high enough (maybe 7 or
higher), but beyond a certain point, it does not make much
of a difference in the model's performance.
The COA selection mechanism showed a difference in
learning but not performance. On average, the model
reached similar levels of accuracy with default and fuzzy
mechanisms, but it learned faster with default, showing
better performance than fuzzy on the first 50 trials.
There were two puzzling results with the generic model
that have not yet been explained. The first puzzling result
was that model performance on the 3 COA, 4 cue (3-4)
and 3 COA, 5 cue (3-5) models stagnated at chance
performance. We suspect this is an anomaly resulting
from the way cues were mapped to COAs (the "right
answers" for which the model was reinforced).
The second puzzling result was the absence of a result for
confidence. Earlier models have shown effects of
confidence, particularly on early performance and spin-
up. However, the generic model failed to show an effect
of confidence under a variety of conditions. An effect of
confidence should show up where there are systematic
associations over and above the noise present. However,
in the generic model, we deliberately built random cue-to-
COA mappings - this is only noise! So there are no
systematic associations inherent in cue structure. Finding
no effect of confidence in this model is actually a
validation that we haven't smuggled in any informative
internal structure or biases, providing a purer test of the
model's ability to learn essentially arbitrary relationships.
6. Conclusions
In this paper, we have described our integrated modeling
approach and our attempts to push its boundaries a bit.
While it is important for a modeling approach to build a
repertoire of single-task models validated with human
performance data, we have argued that it is also important
to explore beyond the "good fit" areas of parameter space
and the "typical model" areas of complexity space/scale.
Examining a relatively small search space with a very
simple "generic" model, we attempted to gain a better
understanding of a larger space than we typically explore
with our models. We learned some interesting things as
we tried to scale up the model and systematically move
across parameter space.
This is just the beginning of this effort. It is critical to go
beyond holding all parameters but one constant in order to
explore the intersection of parameter space and to
understand how model parameters interact. These explorations
are a very small step in an enormous and intimidating
effort that is emerging in the modeling community:
putting our modeling endeavors in a broader context and
moving outside our modeling comfort zones.
7. References
Best, B. J., Furjanic, C., Gerhart, N., Fincham, J., Gluck, K.
A., Gunzelmann, G., & Krusmark, M. A. (2009).
Adaptive mesh refinement for efficient exploration of
cognitive architectures and cognitive methods.
Proceedings of the ?? International Conference on
Cognitive Modeling.
Collyer, C. E. (1985). Comparing strong and weak models
by fitting them to computer-generated data. Perception
& Psychophysics, 38, 476-481.
Gluck, K., Scheutz, M., Gunzelmann, G., Harris, J., &
Kershner, J. (2007). Combinatorics meets processing
power: Large-scale computational resources for
BRIMS. In Proceedings of the Sixteenth Conference on
Behavior Representation in Modeling and Simulation
(pp. 73-83). Orlando, FL: Simulation Interoperability
Standards Organization.
Hintzman, D. L. (1984). MINERVA 2: A simulation
model of human memory. Behavior Research Methods,
Instruments & Computers, 16, 96-101.
Hintzman, D. L. (1986a). Judgments of Frequency and
Recognition Memory in a Multiple-Trace Memory
Model. Eugene, OR: Institute of Cognitive and
Decision Sciences.
Hintzman, D. L. (1986b). "Schema Abstraction" in a
Multiple-Trace Memory Model. Psychological Review,
93, 411-428.
Klein, G. (1998). Sources of Power: How People Make
Decisions. Cambridge, MA: The MIT Press.
Roberts, S. & Pashler, H. (2000). How persuasive is a
good fit? A comment on theory testing. Psychological
Review, 107, 358-367.
Santamaria, A. & Warwick, W. (2007). A naturalistic
approach to adversarial behavior: Modeling the
prisoner’s dilemma. In Proceedings of the 16th
Conference on Behavioral Representations in Modeling
and Simulation.
Santamaria, A. & Warwick, W. (2008). Modeling
probabilistic category learning in a task network model.
In Proceedings of the 17th Conference on Behavioral
Representations in Modeling and Simulation.
Warwick, W. & Fleetwood, M. (2006). A bad Hempel day:
The decoupling of explanation and prediction in
computational cognitive modeling. In Proceedings of
the Fall 2006 Simulation Interoperability Workshop.
Orlando, FL. SISO.
Warwick, W. & Hutchins, S. (2004). Initial comparisons
between a "naturalistic" model of decision making and
human performance data. In Proceedings of the 13th
Conference on Behavior Representation in Modeling
and Simulation.
Warwick, W., McIlwaine, S., Hutton, R. J. B., &
McDermott, P. (2001). Developing computational
models of recognition-primed decision making. In
Proceedings of the 10th Conference on Computer
Generated Forces.
Warwick, W. & Santamaria, A. (2006). Giving up
vindication in favor of application: Developing
cognitively-inspired widgets for human performance
modeling tools. Proceedings of the 7th International
Conference on Cognitive Modeling.
Author Biographies
AMY SANTAMARIA is a Senior Cognitive Scientist at
Alion Science and Technology. Her research focuses on
modeling human behavior and cognition and
experimentation for robotics interfaces. She received her
Ph.D. in Cognitive Psychology and Neuroscience and an
M.A. in Cognitive Psychology from the University of
Colorado Boulder.
WALTER WARWICK is a Principal Systems Analyst
at Alion Science and Technology. He is working on
several projects having to do with the modeling and
simulation of human behavior. He received his Ph.D. in
History and Philosophy of Science, an Area Certificate in
Pure and Applied Logic, and an M.S. in Computer
Science from Indiana University.
Quantitative theories with free parameters often gain credence when they closely fit data. This is a mistake. A good fit reveals nothing about the flexibility of the theory (how much it cannot fit), the variability of the data (how firmly the data rule out what the theory cannot fit), or the likelihood of other outcomes (perhaps the theory could have fit any plausible result), and a reader needs all 3 pieces of information to decide how much the fit should increase belief in the theory. The use of good fits as evidence is not supported by philosophers of science nor by the history of psychology; there seem to be no examples of a theory supported mainly by good fits that has led to demonstrable progress. A better way to test a theory with free parameters is to determine how the theory constrains possible outcomes (i.e., what it predicts), assess how firmly actual outcomes agree with those constraints, and determine if plausible alternative outcomes would have been inconsistent with the theory, allowing for the variability of the data.