Dissociations Among Judgments Do Not Reflect Cognitive Priority:
An Associative Explanation of Memory for Frequency Information in
Contingency Learning
Miguel A. Vadillo
University of Deusto
David Luque
University of Málaga
Previous research on causal learning has usually made strong claims about the relative complexity and
temporal priority of some processes over others based on evidence about dissociations between several
types of judgments. In particular, it has been argued that the dissociation between causal judgments and
trial-type frequency information is incompatible with the general cognitive architecture proposed by
associative models. In contrast with this view, we conduct an associative analysis of this process showing
that this need not be the case. We conclude that any attempt to gain a better insight on the cognitive
architecture involved in contingency learning cannot rely solely on data about these dissociations.
Keywords: contingency learning, probability learning, statistical models, associative models
Inferring the causal structure of the environment is an invaluable
skill for survival in an ever-changing environment. As first noted by
Hume (1739/1964), we are unable to directly perceive the link con-
necting causes and effects, which means that causal relations can only
be inferred on the basis of indirect evidence. Unfortunately, it is not
easy to determine when a given event is a real cause of an effect
because some events can co-occur regularly without any causal link
between them. Psychological models of causal learning try to explain
when and how humans can learn new causal links.
Following the advice of Marr (1982), some authors have fo-
cused on developing computational models of causal learning
(e.g., Allan, 1980; Cheng, 1997; Cheng & Novick, 1992; Holyoak
& Cheng, 2011; Pearl, 2000). According to the popular levels-of-analysis
framework, computational models do not aim at specifying every step
that the cognitive system has to take to solve a
problem. Instead, these models should clarify, for a given task,
what function maps the input of the cognitive system
onto its output, while remaining agnostic about the algorithm involved
in that computation. Following this general perspective, several
statistical models of causal learning have been proposed to de-
scribe what input allows us to determine that two events are
causally related. These models usually consist of simple mathe-
matical equations that provide a numerical index of the strength of
the relationship between a candidate cause and an effect: Values
different from zero usually indicate that a causal relation exists.
Note that, in principle, computational models are not concerned
about how humans actually acquire this causal knowledge: Com-
putational models only establish when a causal link must be
inferred (if the system works well).
However, some authors have gone one step further suggesting
that the statistical calculations proposed by these computational
models could also be appropriate theories about how humans
actually learn new causal associations. According to these authors,
the mathematical formulas of statistical models of causal learning
could also be considered algorithm-level theories of causal learning
that specify the real steps that people take when they solve a
causal induction problem. From this algorithm-level viewpoint,
people act as intuitive statisticians who first gather information
about the joint occurrence of two events and then combine this
information following certain rules to decide whether there is a
statistical connection between those two events. For instance,
when faced with a sequence of trials in which a cause, C, and an
effect, E, appear together or in isolation, people are assumed to
first encode this information in a mental representation to some
extent isomorphic to the 2 × 2 contingency table depicted in Table
1 (Beyth-Marom, 1982; Busemeyer, 1991; Shaklee & Mims,
1982). This contingency table would summarise the evidence
experienced by the participant regarding the joint occurrence or
absence of the target cause and effect, including the number of
occasions in which both the cause and the effect have appeared
together (a), the number of occasions in which only the cause or
only the effect has appeared (b or c, respectively), and the number
of occasions in which both the cause and the effect have been
simultaneously absent (d). Once the participant has stored this
information, he or she can use these frequency data to compute a
contingency index that describes the strength of the causal link
between the cause and the effect. For example, it has been argued
that one of the contingency indexes that participants might try to
compute from frequency information is the so-called $\Delta p$ index,
which is defined as the extent to which the occurrence of the cause
increases the probability of the effect (Allan, 1980; Jenkins &
Ward, 1965). As can be seen in the following equation, this index
can be easily computed from the data contained in the 2 × 2
contingency table described in Table 1:
$$\Delta p = p(E \mid C) - p(E \mid \neg C) = \frac{a}{a + b} - \frac{c}{c + d}. \tag{1}$$
This article was published Online First April 16, 2012.
Miguel A. Vadillo, Department of Foundations and Methods of Psychology,
University of Deusto, Bilbao, Spain; David Luque, Department of
Basic Psychology, University of Málaga, Málaga, Spain.
Support for this research was provided by Grant IT363–10 from Departamento
de Educación, Universidades e Investigación of the Basque Government
and Grants PSI2011–26965 and PSI2011–24662 from Ministerio
de Ciencia e Innovación. We thank Lorraine Allan and an anonymous
reviewer for their helpful comments on a previous draft of this article.
Correspondence concerning this article should be addressed to Miguel
A. Vadillo, Dpto. de Fundamentos y Métodos de la Psicología, Universidad
de Deusto, Apartado 1, 48080 Bilbao, Spain. E-mail:
Canadian Journal of Experimental Psychology / Revue canadienne de psychologie expérimentale, 2013, Vol. 67, No. 1, 60–71. © 2012 Canadian Psychological Association. 1196-1961/13/$12.00. DOI: 10.1037/a0027617
Many experiments have shown that people judge causal
strengths in a way consistent with the predictions of the $\Delta p$ rule
(e.g., Shanks, 1987; Shanks & Dickinson, 1987, 1991; Wasserman,
1990; Wasserman, Elek, Chatlosh, & Baker, 1993). Moreover,
although it has sometimes been found that people's judgments
deviate from the $\Delta p$ rule in certain conditions (e.g., Allan &
Jenkins, 1980; Alloy & Abramson, 1979; Jenkins & Ward, 1965;
Smedslund, 1963), it is usually easy to propose alternative statis-
tical rules that participants could be applying on frequency data
that could potentially account for these deviations (Allan & Jen-
kins, 1980, 1983; Busemeyer, 1991; White, 2003).
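As a concrete illustration of the contingency index in Equation 1, the following minimal Python sketch computes it from the four cell counts of Table 1. The function name `delta_p` and the error handling are our own illustrative choices, not part of any published model code.

```python
def delta_p(a, b, c, d):
    """Equation 1: p(E|C) - p(E|not C) from the cell counts of Table 1.

    a: cause and effect present    b: cause present, effect absent
    c: cause absent, effect present    d: cause and effect absent
    """
    if a + b == 0 or c + d == 0:
        # Both conditional probabilities must be defined.
        raise ValueError("need at least one cause-present and one cause-absent trial")
    return a / (a + b) - c / (c + d)


# Hypothetical example: 40 a, 10 b, 10 c, and 40 d trials,
# i.e., p(E|C) = .80 and p(E|not C) = .20, so the index is about .60.
print(delta_p(40, 10, 10, 40))
```

A value near zero indicates no contingency, and negative values indicate that the cause reduces the probability of the effect.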
In contrast with this general framework, some researchers have
proposed an alternative algorithm-level account of contingency
detection. From this alternative approach, contingency detection,
and causal learning in general, is just another instance of associa-
tive learning, comparable to simpler processes such as classical or
instrumental conditioning (Alloy & Abramson, 1979; Dickinson,
Shanks, & Evenden, 1984). In other words, people would associate
causes and effects just the same way they associate, in general, any
type of cues and outcomes that are regularly paired in the envi-
ronment. The key assumption of these models is that whereas
participants are exposed to a sequence of cause-effect pairings (or
cue-outcome pairings, in the more neutral, associative terminol-
ogy), an association between the mental representations of the
cause and the effect is formed. The strength of the cause-effect
association is assumed to change on a trial-by-trial basis according
to a simple error-correction algorithm similar to the ones used in
connectionist simulations of cognitive processes. One of the most
common algorithms used to model contingency learning is the rule
initially proposed by Rescorla and Wagner (1972) in the domain of
classical conditioning (R–W rule, henceforth). According to this
model, in any given trial, the strength of a cause-effect association,
$V_{C \to E}$, changes according to the following equation:
$$\Delta V_{C \to E} = \alpha \beta (\lambda - V_{Total}), \tag{2}$$
where $\lambda$ denotes the presence or absence of the effect in that trial
(usually coded as 1 or 0, respectively), $V_{Total}$ is the sum of the
associative strengths of all the cues presented on that trial (usually
including not only the strength of the target cause, $V_{C \to E}$, but also that
of a constant context, $V_{CTX \to E}$), and $\alpha$ and $\beta$ are learning-rate
parameters dependent on the salience of the cause and the effect,
respectively, with values ranging from 0 to 1. The associative strength of the
context would also change trial-by-trial according to the same
equation, although with a different $\alpha$ that would represent the salience of
the context. The associative strength of the context would be updated
even in trials in which the target cause is absent.
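The update described by Equation 2 can be sketched for a single trial as follows. This is an illustrative Python sketch, not the authors' simulation code: the function name `rw_trial` and its argument names are hypothetical, and the default learning rates (.5 for the cause, .2 for the context, .5 for the effect) are simply the values reported later for Simulation 1.

```python
def rw_trial(v_cause, v_context, cause_present, effect_present,
             alpha_cause=0.5, alpha_context=0.2, beta=0.5):
    """Apply Equation 2 once; return the updated (v_cause, v_context)."""
    lam = 1.0 if effect_present else 0.0
    # V_Total sums the strengths of every cue present on this trial:
    # the context is always present, the cause only on cause-present trials.
    v_total = v_context + (v_cause if cause_present else 0.0)
    error = lam - v_total
    if cause_present:
        v_cause += alpha_cause * beta * error
    # The context is updated on every trial, even when the cause is absent.
    v_context += alpha_context * beta * error
    return v_cause, v_context


# First cause-effect pairing, starting from zero strengths.
print(rw_trial(0.0, 0.0, cause_present=True, effect_present=True))  # -> (0.25, 0.1)
```

Because both cues share the same prediction error, the context absorbs part of the associative strength that the cause would otherwise acquire.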
Many of the experiments conducted during the last two decades
aimed at discriminating between rule-based and associative ac-
counts of causal learning (Allan, 1993, 2003). However, this task
is more difficult than might be expected prima facie, as the
predictions made by both accounts are virtually indistinguishable
under many conditions. For example, it is well known that in
situations in which a single cue-outcome association is being
learned, the asymptotic strength of a cue-outcome association as
computed with the R–W rule converges to the value of the
contingency as computed with the $\Delta p$ rule (Chapman & Robbins,
1990; Wasserman et al., 1993; see also Danks, 2003). Therefore,
any evidence showing that participants’ judgments of a causal
relation fit with the predictions of the $\Delta p$ rule is usually also
explainable in terms of the R–W learning algorithm. More generally,
both types of theories share important assumptions: They assume
a similar representation format (a single scalar parameter describ-
ing the strength of the cause-effect relationship) and also a similar
normative analysis of what causal induction is. The main differ-
ence between these models lies in the algorithmic details specify-
ing how these computations are carried out. In the case of rule-
based models, people are assumed to store the frequency
information and use that information to compute an index of
contingency when asked to do so, whereas in associative models
they are assumed to constantly update the strength of the cause-
effect association on a trial-by-trial basis by means of an error-
correction mechanism. Although the algorithmic details and the
cognitive architecture assumed by both theories are very different,
the final output of the process can be remarkably similar.
Given the similarities between the predictions made by rule-based
and associative accounts of causal learning, some studies have at-
tempted to test them not by contrasting their differential predictions,
but by checking the plausibility of the different mechanisms invoked
by both types of models. In particular, researchers have relied on at
least three types of evidence to draw conclusions about the relative
success of rule-based models over associative models.
First, in contrast to associative models, rule-based accounts
assume that people store a mental representation of frequency data
and can use this information flexibly to compute conditional
probabilities or contingency indexes. Therefore, any evidence that
participants exposed to a series of cause-effect pairings can later
recall the frequencies of each of the trial types considered in Table
1 (a, b, c, and d) has usually been interpreted as supporting
rule-based models over associative ones. Just as an example,
although a classical study conducted by Price and Yates (1995)
concluded that associative elements were necessary to account for
some types of contingency judgments, the authors nevertheless
considered that frequency estimates begged for a completely dif-
ferent kind of explanation. In their own words,
to answer the question “How often have B and A co-occurred?”
however, seems to suggest, if not require, a much different strategy for
generating a judgment. Specifically, it suggests that one directly
accesses co-occurrences of B and A, perhaps by processes similar to
those described in current exemplar models of memory. (p. 1646)
Footnote 1: This prediction (the convergence of the R–W associative strength to the value of $\Delta p$) only holds if the learning-rate parameter $\beta$ is assumed
to have the same value in effect-present and effect-absent trials.
Table 1
Contingency Table
Cause                Effect present (E)    Effect absent (¬E)
Cause present (C)    a                     b
Cause absent (¬C)    c                     d
Second, participants are able not only to make causal judgments
but also to flexibly compute other indexes of the relationship between
the cause and the effect such as, for example, the conditional
probability of the effect given the cause (Gredebäck, Winman, & Juslin,
2000; Vadillo & Matute, 2007; Vadillo, Miller, & Matute, 2005). This
implies that participants have encoded the information in a format on
which several statistical rules can be applied. At first glance, this
flexibility fits better with the cognitive architecture proposed by
rule-based models than with the more automatic and simple encoding
and retrieval processes assumed to be at work in associative mechanisms.
Third, some studies have shown that manipulations that are
known to have a strong impact on causal judgments sometimes
have little or no effect on other types of judgments, such as
conditional probability estimates, effect predictions in the presence
of the cause, and most important, participants’ estimations of the
frequency of each type of trial across the sequence of trials. For
instance, it has been found that cue competition (i.e., the fact that
increasing the contingency between a cause and an effect has a
detrimental effect in the assessment of the causal role of other
potential causes of the same effect), which can be readily observed
in causal judgments, has no impact on conditional probability
estimates (Gredebäck et al., 2000; Matute, Arcediano, & Miller,
1996; but see Cobos, Caño, López, Luque, & Almaraz, 2000) and
trial-type frequency estimates (Ramos-Álvarez & Catena, 2005).
Similarly, the overall probabilities of the cause, p(C), and the
effect, p(E), which are known to bias causal judgments, seem to
have no impact on trial-by-trial effect predictions made by partic-
ipants across the learning phase (Allan, Siegel, & Tangen, 2005;
Perales, Catena, Shanks, & González, 2005; but see Vadillo,
Musca, Blanco, & Matute, 2011). The trial-type frequency esti-
mates also seem to be immune to effect-density biases (Crump,
Hannah, Allan, & Hord, 2007). In a similar vein, Cándido et al.
(2006) showed that exposing participants to aversive stimuli that
induce a negative mood, intermixed with the cue-outcome pair-
ings, can have a biasing effect on causal judgments, without a
parallel impact on frequency estimates.
The fact that some manipulations influence certain dependent vari-
ables but not others has been taken as a clue for discovering the
cognitive priority of some processes over others. For example, in light
of the fact that cue competition and emotion induction have an
influence on causal judgments but not on frequency estimates, re-
searchers have assumed that causal judgments are based on frequency
estimates (and not the opposite) and whatever mechanisms are re-
sponsible for these effects, they must be acting at a relatively late stage
of processing, before the causal judgment is produced but after the
frequency data have been properly encoded and stored (Cándido et al.,
2006; Ramos-Álvarez & Catena, 2005). Following the same logic,
other researchers have concluded that estimations of the probability of
the effect given the cause and effect predictions based on the presence
of the cause must be based on earlier (and more automatic) cognitive
processes than causal judgments (Allan et al., 2005; Gredebäck et al.,
2000; Perales et al., 2005).
This interpretation fits nicely with the cognitive architecture
underlying rule-based models of causal judgment: Participants
would first store a representation of the raw frequency data and
then they would use this information to compute conditional
probabilities and contingency indexes. Manipulations such as cue
interaction or emotion induction would influence causal judgments
at this latter point. In contrast, it is difficult for associative models
to explain why there are manipulations that only affect some
judgments but not others. If participants’ judgments are based on
associations, should not all types of judgments be biased in a
similar manner? Moreover, why should participants have memo-
ries of frequency data at all?
Inferring Frequency Data From Probabilities and
The preceding evidence has generally been taken as support for
the idea that people must store some kind of mental representation
of the raw data contained in a contingency matrix and that this
information is later used to make inferences about probabilities or
contingencies. However, as we try to show in the present section,
the opposite view is also plausible and cannot be disregarded in
light of just this evidence. The fact that participants remember the
frequencies of each trial type and that they are able to judge
probabilities and contingencies does not, per se, prove that the
former are the basis for the latter. In fact, it could happen that
during training participants encode some information about the
different conditional probabilities that relate cause and effect and
that, if asked to do so, they use this information to infer the
frequencies that must have been experienced. In other words,
maybe it is frequency estimates and not causal judgments or
conditional probability ratings that require an inference; and, ac-
cordingly, maybe it is conditional probabilities or even causal
relationships and not frequency information that are directly en-
coded in memory.
Consider the following situation: Like most participants in a
causal learning experiment, imagine that you are exposed to a
series of trials in which a potential cause might be present or
absent (e.g., a patient taking or not taking a medicine) and an effect
is also present or absent (e.g., the patient suffers an allergic
reaction or not). Based on the instructions given to you and on the
general structure of the task, you suspect that your goal is to learn
to use the information about the cause to better predict the effect.
However, at the end of training, you are suddenly and unexpect-
edly asked to estimate the number of times you saw the cause and
the effect together, the number of times you saw the cause but not
the effect, the number of times you saw the effect without the
cause, and the number of times both elements were absent.
Even if you did not pay attention to this information, it might
nevertheless be relatively easy to make a good guess based on just
a little information you do remember. For instance, you might
remember that you were exposed to approximately 50 trials and
that about half of them were trials in which the patient took the
medicine. Thus, you already know that you experienced about 25
medicine-present trials and 25 medicine-absent trials. In addition,
you may remember that the chances of suffering an allergic
reaction were noticeably higher for patients taking the medicine,
compared to those not taking it, although the probability of suffering
the allergy was positive even for the patients that did not take the
medicine. Let’s say that your guess is that 80% of the patients who
took the medicine suffered the allergic reaction, but only 20% of
those who did not take it. Based on all this information, you
could easily infer that about 20 of the 25 medicine-present trials
must have been medicine-allergy pairings, whereas only five of the
25 medicine-absent trials must have been no medicine-allergy pairings.
Footnote 2: In fact, this information is usually given to participants in many causal
learning studies. For example, when participants are presented with
information about a fictitious patient taking a medicine and suffering or not an
allergic reaction, it is not uncommon that each trial begins with the
sentence “On day X, the patient took the medicine.” Therefore, participants
can easily remember how many trials they have been exposed to by simply
paying attention to the number of days.
Therefore, a good estimate of the relative frequencies of each
type of trial can be made on the basis of some knowledge of (a) the
probability of the cause, $p(C)$, (b) the probability of the effect
given the cause, $p(E \mid C)$, and (c) the probability of the effect given
the absence of the cause, $p(E \mid \neg C)$. As shown by the previous
example, the estimated relative frequency of type a trials can be
computed as:
$$\mathit{erf}(a) = p(C) \cdot p(E \mid C). \tag{3}$$
The relative frequencies of other trial types can be computed in
a similar fashion:
$$\mathit{erf}(b) = p(C) \cdot [1 - p(E \mid C)]. \tag{4}$$
$$\mathit{erf}(c) = [1 - p(C)] \cdot p(E \mid \neg C). \tag{5}$$
$$\mathit{erf}(d) = [1 - p(C)] \cdot [1 - p(E \mid \neg C)]. \tag{6}$$
If you also have a general feeling of the total amount of trials
you may have seen, it is very easy to infer even the absolute
frequencies of each of these trial types, simply by multiplying this
total amount of trials by each relative frequency.
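The inference described in this section, from three remembered probabilities to the four trial-type frequencies, can be sketched in a few lines of Python. The function and parameter names are hypothetical and chosen only for illustration; the example call uses the numbers of the medicine example above (about 50 trials, half of them with the medicine, and allergy rates of 80% and 20%).

```python
def inferred_frequencies(p_c, p_e_given_c, p_e_given_not_c, n_trials=1):
    """Estimated frequencies of trial types a-d from three probabilities.

    With n_trials=1 the result holds relative frequencies; with a larger
    n_trials it holds absolute frequencies (relative frequency x total).
    """
    relative = {
        "a": p_c * p_e_given_c,
        "b": p_c * (1 - p_e_given_c),
        "c": (1 - p_c) * p_e_given_not_c,
        "d": (1 - p_c) * (1 - p_e_given_not_c),
    }
    return {kind: n_trials * rf for kind, rf in relative.items()}


# Medicine example: roughly 20 a, 5 b, 5 c, and 20 d trials.
print(inferred_frequencies(0.5, 0.8, 0.2, n_trials=50))
```

Note that the four relative frequencies always sum to 1, so the inferred absolute frequencies always add up to the remembered number of trials.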
As we have already discussed, the asymptotic value of the
cause-effect association converges to the value of the objective
cause-effect contingency. In a similar vein, other probabilistic
parameters that relate the cause and the effect, apart from
contingency, can be estimated on the basis of associations computed by
means of the R–W rule. For instance, the probability of the effect
given the cause, $p(E \mid C)$, can be computed by adding the associative
strength of the cause and the strength of the association between
the context and the effect:
$$p(E \mid C) = V_{C \to E} + V_{CTX \to E}. \tag{7}$$
This is a natural consequence of the fact that in cause-present trials
the R–W rule tries to minimise the error made when predicting the
effect based on the associative strengths of the cause and the
context. This can only be accomplished by gradually developing
cause-effect and context-effect associative strengths whose sum is
$p(E \mid C)$.
In a similar fashion, in cause-absent trials, the R–W rule
minimises the error made in predicting the effect on the basis of the
context-effect association, which can only be accomplished by
developing a context-effect association whose associative strength
approaches $p(E \mid \neg C)$. Therefore, the probability of the effect in the
absence of the cause is asymptotically equivalent to the associative
strength of the context,
$$p(E \mid \neg C) = V_{CTX \to E}. \tag{8}$$
For the same reason, if there is a single, constant context, the
overall probability of the cause, $p(C)$, is equal to the probability of
the cause given the context, $p(C \mid CTX)$, which is equivalent to the
asymptotic value of the association between the context and the
cause:
$$p(C) = V_{CTX \to C}. \tag{9}$$
As in Equations 7 and 8, Equation 9 follows from the fact that the
context-cause association computed with the R–W rule would
minimise the error made when trying to predict the cause on the
basis of the context. This error is minimal when the associative
strength of the context-cause association equals the probability of
the cause given the context, $p(C \mid CTX)$, which in situations in
which the context is constant also equals the overall probability of
the cause, $p(C)$. The strength of this association, $V_{CTX \to C}$, can also
be computed with the general R–W rule using $\lambda$ to code for the
presence or absence of the cause (instead of the effect), $\beta$ to
represent the salience of the cause (also instead of that of the
effect) and $V_{Total}$ to represent the strength of the context-cause
association in the previous trial.
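The equilibrium behind Equations 7 to 9 can be made explicit with a short derivation. It is only a sketch under assumptions that go beyond the text: the associative strengths are treated as constant at their asymptotic values, $\beta$ takes the same value on outcome-present and outcome-absent trials, and $0 < p(C) < 1$. The labels $\alpha_C$ and $\alpha_{CTX}$ (the $\alpha$ of the cause and of the context) are introduced here for convenience. Setting the expected trial-by-trial change of each association to zero gives:

$$
\begin{aligned}
E[\Delta V_{C \to E}] &= p(C)\,\alpha_C \beta\,[\,p(E \mid C) - (V_{C \to E} + V_{CTX \to E})\,] = 0 \\
&\Rightarrow\; V_{C \to E} + V_{CTX \to E} = p(E \mid C), \\
E[\Delta V_{CTX \to E}] &= \alpha_{CTX} \beta\,\{\,p(C)\,[\,p(E \mid C) - (V_{C \to E} + V_{CTX \to E})\,] + [1 - p(C)]\,[\,p(E \mid \neg C) - V_{CTX \to E}\,]\,\} = 0 \\
&\Rightarrow\; V_{CTX \to E} = p(E \mid \neg C), \\
E[\Delta V_{CTX \to C}] &= \alpha_{CTX} \beta\,[\,p(C) - V_{CTX \to C}\,] = 0 \\
&\Rightarrow\; V_{CTX \to C} = p(C).
\end{aligned}
$$

Subtracting the second result from the first yields $V_{C \to E} = p(E \mid C) - p(E \mid \neg C) = \Delta p$, in line with the convergence of the R–W rule to the objective contingency.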
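Equations 7 to 9 can be combined with Equations 3 to 6, so that
the estimated relative frequencies of each trial type can be inferred
from the value of these three associations,
$$\mathit{erf}(a) = V_{CTX \to C} \cdot (V_{C \to E} + V_{CTX \to E}), \tag{10}$$
$$\mathit{erf}(b) = V_{CTX \to C} \cdot [1 - (V_{C \to E} + V_{CTX \to E})], \tag{11}$$
$$\mathit{erf}(c) = (1 - V_{CTX \to C}) \cdot V_{CTX \to E}, \tag{12}$$
$$\mathit{erf}(d) = (1 - V_{CTX \to C}) \cdot (1 - V_{CTX \to E}), \tag{13}$$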
which means that, contrary to the logic followed by the authors of
the studies previously considered, the fact that people are able to
“recall” the frequency of each trial type need not be taken as
evidence that these data are directly encoded in their memory and
that, therefore, all other estimates have to be inferences made on
the basis of this information. At least from a formal point of view,
the possibility that causal judgments are based on encoded
frequency estimates is just as likely as the opposite: that frequency
estimates are the result of an inference made on the basis of
probability information, which could be encoded as associations
formed by means of the R–W rule. Equations 10 to 13 are just
an example of how this can be done. The following simulations
were conducted to better assess the ability of Equations 10 to 13 to
correctly infer frequency estimates under a number of conditions.
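The frequency-retrieval rules of Equations 10 to 13 can be written as a small Python function that takes the three associative strengths as input. This is a hedged sketch: the function and argument names are hypothetical, and the example strengths (.6, .2, and .5) are the asymptotic values that would be expected, by Equations 7 to 9, for a condition with $p(E \mid C) = .80$, $p(E \mid \neg C) = .20$, and $p(C) = .50$.

```python
def frequencies_from_associations(v_ce, v_ctx_e, v_ctx_c):
    """Equations 10-13: relative trial-type frequencies from three associations.

    v_ce: cause-effect strength, v_ctx_e: context-effect strength,
    v_ctx_c: context-cause strength.
    """
    p_e_given_c = v_ce + v_ctx_e   # Equation 7
    p_e_given_not_c = v_ctx_e      # Equation 8
    p_c = v_ctx_c                  # Equation 9
    return {
        "a": p_c * p_e_given_c,
        "b": p_c * (1 - p_e_given_c),
        "c": (1 - p_c) * p_e_given_not_c,
        "d": (1 - p_c) * (1 - p_e_given_not_c),
    }


# Roughly .40, .10, .10, and .40 for trial types a, b, c, and d.
print(frequencies_from_associations(v_ce=0.6, v_ctx_e=0.2, v_ctx_c=0.5))
```

Before the associations reach their asymptotes the inputs are not yet good probability estimates, so the inferred frequencies are correspondingly inaccurate.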
Simulation 1: Frequency Assessment Under Different
First, we were interested in knowing how these frequency-retrieval
rules behaved under different cause-effect contingencies.
Therefore, we conducted a simulation comprising four conditions
in which the probability of the effect given the cause and the
probability of the effect given the absence of the cause were
manipulated, yielding four different contingency values.
Footnote 3: The simulation of the R–W rule conducted by Vadillo and Matute
(2007, p. 441) shows that conditions differing in their cause-effect
contingencies but nevertheless similar regarding $p(E \mid C)$ give rise to different
cause-effect associations, but similar values of the sum $V_{C \to E} + V_{CTX \to E}$.
The design summary of the conditions included in Simulation 1
is shown in Table 2. As can be seen, the simulation included four
conditions. Each condition consisted of a different combination of
frequencies of each type of trial (a, b, c, and d) that resulted in
different values for $p(E \mid C)$ and $p(E \mid \neg C)$ and, consequently, in a
different cause-effect contingency, as defined by $\Delta p$. The two
numbers used to denote the experimental conditions refer to the
probability of the effect given the cause, $p(E \mid C)$, and the probability
of the effect in the absence of the cause, $p(E \mid \neg C)$, respectively. In
two conditions (.20 – .50 and .20 – .80), the cause-effect
contingency was negative (−.30 and −.60, respectively) and in the
other two (.50 – .20 and .80 – .20) the contingency was positive
(.30 and .60, respectively). Each condition comprised a sequence
of 100 trials.
In most simulations of the R–W rule, the learning-rate parameter
$\alpha$ is larger for the cause than for the constant context, whose
salience is assumed to be relatively low (e.g., Mercier, 1996).
Therefore, the learning-rate parameter $\alpha$ was set to .5 for the cause
and .2 for the context. The learning-rate parameter $\beta$ was set to .5
for the computation of all the associations (cause-effect,
context-effect, and context-cause associations).
To properly measure the model's ability to infer the relative
frequencies of each trial type, in each trial we computed the root
mean squared error made by the model when estimating the
relative frequency of trials a through d on the basis of Equations 10
to 13. This error was computed according to the following equation:
$$RMS = \sqrt{\frac{\sum_{i \in \{a, b, c, d\}} [\mathit{erf}(i) - \mathit{rf}(i)]^2}{4}}. \tag{14}$$
That is, the error was computed as the square root of the mean
squared difference between the estimated relative frequencies,
$\mathit{erf}(i)$, and the actual relative frequencies, $\mathit{rf}(i)$.
On each simulated trial, the program first chose a trial type
(a, b, c, or d) from the list of trials in that condition (see Table
2). Then, the strengths of the cause-effect association, the
context-effect association, and the context-cause associations
were updated following Equation 2 and using the learning-rate
parameters mentioned above. After updating the associative
strengths, the frequency-retrieval rules instantiated in Equations
10 to 13 were used to reconstruct the relative frequencies of
each trial type that could be inferred on the basis of those
associations. Finally, the error in these inferences was com-
puted following Equation 14, and the simulation proceeded to
the next simulated trial. To obtain smooth learning curves, we
conducted 500 simulations for each condition, each one with a
randomly ordered sequence of trials.
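The procedure just described can be sketched in Python as follows. This is a minimal re-implementation written for illustration, not the original simulation program: the function name `simulate`, the `seed` argument, and the use of the standard `random` module are our own assumptions, whereas the learning rates, the number of runs, and the error measure follow the description above.

```python
import math
import random


def simulate(freqs, n_runs=500, alpha_cause=0.5, alpha_ctx=0.2, beta=0.5, seed=0):
    """Return per-trial means of V(C->E) and of the RMS error (Equation 14)."""
    rng = random.Random(seed)  # seed is an illustrative choice for reproducibility
    # Trial types coded as (cause present, effect present), as in Table 1.
    kinds = {"a": (1, 1), "b": (1, 0), "c": (0, 1), "d": (0, 0)}
    trials = [kinds[k] for k, n in freqs.items() for _ in range(n)]
    n_trials = len(trials)
    true_rf = [freqs[k] / n_trials for k in "abcd"]
    mean_v = [0.0] * n_trials
    mean_rms = [0.0] * n_trials
    for _ in range(n_runs):
        rng.shuffle(trials)  # each run uses a randomly ordered sequence
        v_ce = v_ctx_e = v_ctx_c = 0.0
        for t, (cause, effect) in enumerate(trials):
            # Equation 2 for the cause-effect and context-effect associations.
            error = effect - (v_ctx_e + (v_ce if cause else 0.0))
            if cause:
                v_ce += alpha_cause * beta * error
            v_ctx_e += alpha_ctx * beta * error
            # Same rule for the context-cause association (lambda codes the cause).
            v_ctx_c += alpha_ctx * beta * (cause - v_ctx_c)
            # Equations 10 to 13: frequencies inferred from the associations.
            p_ec, p_enc, p_c = v_ce + v_ctx_e, v_ctx_e, v_ctx_c
            est = [p_c * p_ec, p_c * (1 - p_ec),
                   (1 - p_c) * p_enc, (1 - p_c) * (1 - p_enc)]
            # Equation 14: root mean squared error over the four trial types.
            rms = math.sqrt(sum((e - r) ** 2 for e, r in zip(est, true_rf)) / 4)
            mean_v[t] += v_ce / n_runs
            mean_rms[t] += rms / n_runs
    return mean_v, mean_rms


# Condition .80 - .20 of Table 2 (40 a, 10 b, 10 c, and 40 d trials).
v, rms = simulate({"a": 40, "b": 10, "c": 10, "d": 40})
print(round(v[-1], 2), round(rms[0], 2), round(rms[-1], 2))
```

Averaging over runs is what produces smooth learning curves; a single run would show the trial-by-trial fluctuations typical of error-correction learning with fixed learning rates.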
Results and Discussion
The results of the simulation are depicted in Figure 1. The top
panel shows the strengthening of the cause-effect association in
each condition over trials. As could be expected, this associative
strength gradually approaches the value of the objective cause-
effect contingency, as predicted by the computational analysis of
the R–W model mentioned in the Introduction (Chapman & Rob-
bins, 1990; Danks, 2003; Wasserman et al., 1993).
The lower panel shows the average error made by the model
when estimating the frequency of each trial type. As can be
seen, this error is large in the beginning of each simulation,
before the associative strengths used in the estimations have
reached their asymptotic value. However, as soon as the three
associative strengths, $V_{C \to E}$, $V_{CTX \to E}$, and $V_{CTX \to C}$, reach the
learning asymptote, the global error made in the frequency
assessment decreases noticeably in all conditions. It is
interesting to note that the global errors tend to be rather similar in both
conditions with positive contingency, but they are
systematically larger in negative contingencies. In fact, the more negative
the contingency, the larger this overall error seems to be. Figure
2 depicts the estimated relative frequencies of each trial type in
Simulation 1.
Table 2
Summary of the Conditions Included in Simulations 1, 2, and 4
Condition      Frequencies               p(E|C)    p(E|¬C)    q      Δp
Simulation 1
.20 – .50      10a, 40b, 25c, 25d        .20       .50               −.30
.20 – .80      10a, 40b, 40c, 10d        .20       .80               −.60
.50 – .20      25a, 25b, 10c, 40d        .50       .20               .30
.80 – .20      40a, 10b, 10c, 40d        .80       .20               .60
Simulation 2
.50 – .00      20a, 20b, 0c, 40d         .50       .00               .50
.75 – .25      30a, 10b, 10c, 30d        .75       .25               .50
1.00 – .50     40a, 0b, 20c, 20d         1.00      .50               .50
Simulation 4
.25/.25        25a, 75b, 0c, 100d        .25       .00        .25    .25
.25/.10        70a, 30b, 60c, 40d        .70       .60        .25    .10
.50/.25        75a, 25b, 50c, 50d        .75       .50        .50    .25
.50/.10        90a, 10b, 80c, 20d        .90       .80        .50    .10
Note. E = effect; C = cause.
Simulation 2: Frequency Assessment and
Outcome-Density Effects
Most of the evidence arguing that people store a memory trace
of trial frequencies is based on manipulations that affect causal
judgments but leave frequency assessments unaltered (Cándido et
al., 2006; Crump et al., 2007; Price & Yates, 1995; Ramos-Álvarez
& Catena, 2005). In our next simulation we wanted to show that
this type of evidence is not at odds with the R–W learning
algorithm per se. In particular, we wanted to assess to what extent
a well-studied manipulation that is known to have a strong impact
on causal judgment, namely the outcome-density effect (e.g., Allan
& Jenkins, 1983; Alloy & Abramson, 1979; Matute, Yarritu, &
Vadillo, 2011; Musca, Vadillo, Blanco, & Matute, 2010;
Wasserman, Kao, van Hamme, Katagiri, & Young, 1996), also has
an influence on frequency estimates. Therefore, we contrasted
three conditions, all of them with the same cause-effect
contingency, $\Delta p = .50$, but each one with a different overall probability
of the effect. It is widely known that the R–W learning algorithm
predicts that this manipulation should have at least a
pre-asymptotic biasing effect on the development of cue-outcome
associations. However, as we try to show in this simulation, this
need not imply that the same biasing effect should appear in
frequency estimations made on the basis of these associations.
The specific probabilities of the effect, in the presence and in the
absence of the cause, in each of the three conditions included in
Simulation 2 are shown in Table 2, along with the frequencies of
each trial type, a to d. In this case, each condition comprised a
sequence of 80 trials, instead of 100. All the other procedural
details (the learning-rate parameters and the number of simulations
per condition) were kept the same as in Simulation 1.
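The outcome-density manipulation can be illustrated with a minimal sketch of R–W learning for one cue and a constant context. The trial frequencies are those of Simulation 2 in Table 2; the function names, the salience values (all set to 0.5), and the random seed are placeholders chosen for illustration and are not taken from the simulations reported here.

```python
import random

def rescorla_wagner(trials, alpha_cue=0.5, alpha_ctx=0.5, beta=0.5):
    """Run one R-W pass and return the cue-outcome strength after each trial.

    Trial codes: a (cause, effect), b (cause, no effect),
    c (no cause, effect), d (no cause, no effect). The context is always present.
    """
    v_cue = v_ctx = 0.0
    history = []
    for t in trials:
        cue_present = t in "ab"
        outcome = 1.0 if t in "ac" else 0.0
        # Prediction error: outcome minus the summed strength of present stimuli.
        error = outcome - (v_ctx + (v_cue if cue_present else 0.0))
        v_ctx += alpha_ctx * beta * error
        if cue_present:
            v_cue += alpha_cue * beta * error
        history.append(v_cue)
    return history

def mean_curve(freqs, n_runs=500, seed=0):
    """Average the cue-outcome learning curve over randomly ordered sequences."""
    rng = random.Random(seed)
    trials = [t for t, n in freqs.items() for _ in range(n)]
    totals = [0.0] * len(trials)
    for _ in range(n_runs):
        rng.shuffle(trials)
        for i, v in enumerate(rescorla_wagner(trials)):
            totals[i] += v
    return [s / n_runs for s in totals]

conditions = {
    ".50 - .00": {"a": 20, "b": 20, "c": 0, "d": 40},
    ".75 - .25": {"a": 30, "b": 10, "c": 10, "d": 30},
    "1.00 - .50": {"a": 40, "b": 0, "c": 20, "d": 20},
}
for name, freqs in conditions.items():
    curve = mean_curve(freqs)
    # Mean cue-outcome strength after 10 trials and at the end of training.
    print(name, round(curve[9], 3), round(curve[-1], 3))
```

Comparing the early and final values across conditions gives a rough impression of how far the three learning curves diverge before they approach their common asymptote.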
Results and Discussion
As can be seen in the top panel of Figure 3, the strength of the
cue-outcome association is noticeably biased by this manipulation,
though only pre-asymptotically. This is a well-known property of
the R–W model (Shanks, 1995). However, as shown in the lower
panel, this outcome-density manipulation has little impact on
frequency estimations. Furthermore, the condition that should give
rise to larger outcome-density biases, 1.00 – .50, is actually the one
with the best overall performance, and there are few, if any,
differences between the other two conditions. This happens
because, even though cue-outcome associations develop at a different
rate in each condition, the frequency estimates do not depend only
on that single association, but also on a complex equilibrium of
associations that can remain globally accurate despite minor
individual pre-asymptotic biases in each association.
Figure 1. Results of Simulation 1: The top panel shows the strength of
the cause-effect association in each condition. The lower panel shows the
global error made by the model when attempting to infer the relative
frequency of each type of trial. RMS = root mean squared error.
Figure 2. Results of Simulation 1: Estimated relative frequencies of each trial type (a, b, c, and d).
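The idea that frequency estimates rest on several associations at once can be made concrete with a small sketch. It assumes, as a simplified reading of the model rather than its exact equations, that a context-cause association approximates p(C), that the sum of the cause-effect and context-effect associations approximates p(E|C), and that the context-effect association alone approximates the probability of the effect without the cause. All function and variable names are hypothetical, and the error measure is taken to be the root mean squared difference across the four trial types.

```python
import math

def estimate_frequencies(v_cue_effect, v_ctx_effect, v_ctx_cue):
    """Infer the relative frequencies of trial types a-d from three associations."""
    p_c = v_ctx_cue                       # assumed estimate of p(C)
    p_e_c = v_cue_effect + v_ctx_effect   # assumed estimate of p(E|C)
    p_e_nc = v_ctx_effect                 # assumed estimate of p(E|not C)
    return {
        "a": p_c * p_e_c,
        "b": p_c * (1 - p_e_c),
        "c": (1 - p_c) * p_e_nc,
        "d": (1 - p_c) * (1 - p_e_nc),
    }

def rms_error(estimated, counts):
    """Root mean squared difference between estimated and true relative frequencies."""
    total = sum(counts.values())
    squared = [(estimated[k] - counts[k] / total) ** 2 for k in "abcd"]
    return math.sqrt(sum(squared) / len(squared))

# Condition .75 - .25 of Simulation 2 (Table 2).
counts = {"a": 30, "b": 10, "c": 10, "d": 30}
unbiased = estimate_frequencies(0.50, 0.25, 0.50)  # associations at their asymptotes
biased = estimate_frequencies(0.60, 0.25, 0.50)    # cause-effect strength inflated by .10
print(rms_error(unbiased, counts), rms_error(biased, counts))
```

In this example a bias of .10 in the cause-effect association shifts only the a and b estimates, and the global error stays well below the size of the bias itself.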
This last simulation is particularly relevant because it shows that
a manipulation can have a biasing effect on a cue-outcome asso-
ciation without a parallel effect on other inferences that can be
made on the basis of that association (in combination with other
associations). Thus, all the evidence presented above showing that
there are manipulations that have an effect on contingency or
causal judgments without a similar impact on frequency estimates
(Cándido et al., 2006; Crump et al., 2007; Price & Yates, 1995;
Ramos-Álvarez & Catena, 2005) or other dependent variables
(Allan et al., 2005; Gredebäck et al., 2000; Perales et al., 2005;
Vadillo & Matute, 2007; Vadillo et al., 2005) need not imply that
these latter judgments are cognitively simpler than the former. Nor
do they imply that causal judgments are based on a memory trace
of the frequency data, instead of the opposite.
Simulation 3: Manipulating the Learning Rates of
Cues and Contexts
One of the experiments that we discussed above (Cándido et al.,
2006) found that the presentation of aversive stimuli intermixed with
the cause-effect pairings delayed the perception of contingency, but
did not have a parallel effect on frequency estimations. As the authors
argued, associative models could potentially account for the former
effect by means of a change in the learning rates. For example, in the
R–W model the learning rate depends on the salience of the cause,
the salience of the outcome, and the salience of a constant context.
However, according to Cándido et al. (2006), as the participants
also remembered the frequency information and, moreover, this
memory was not affected by the aversive stimulation, these data
beg for a nonassociative explanation. Contrary to this claim, in
Simulation 3 we show that even from our associative view,
manipulating the learning rates can have complex effects on the
cause-effect association and on the frequency estimations. Although
some manipulations have a similar effect on both variables, other
manipulations have a stronger impact on the cause-effect association
than on the frequency estimations, consistent with the results of
Cándido et al.
In all the conditions, the contingency was kept constant at .60,
with the following trial frequencies: 40 a trials, 10 b trials, 10 c
trials, and 40 d trials. As in Simulations 1 and 2, the learning-rate
parameter was set to .5 and 500 randomly ordered trial sequences
were simulated for each condition. The saliences of the cause and
of the context were manipulated orthogonally. The salience of the
cause was .2 for half of the conditions and .8 for the other half.
Similarly, the salience of the context was .2 for half of the
conditions and .8 for the other half.
Results and Discussion
Figure 4 shows the results of Simulation 3. The first number in the
name of the data series refers to the salience of the cause (.2 or .8)
and the second number to the salience of the context (.2 or .8). As
can be seen in the top panel, the manipulation of the learning rates
had a clear impact on the cause-effect association. The conditions
with the higher salience of the cause were the first ones to reach
the learning asymptote. The salience of the context also had an
impact (though seemingly smaller) on the acquisition of the
cause-effect association.
Figure 3. Results of Simulation 2: The top panel shows the strength of
the cause-effect association in each condition. The lower panel shows the
global error made by the model when attempting to infer the relative
frequency of each type of trial. RMS = root mean squared error.
Figure 4. Results of Simulation 3: The top panel shows the strength of
the cause-effect association in each condition. The lower panel shows the
global error made by the model when attempting to infer the relative
frequency of each type of trial. RMS = root mean squared error.
The lower panel shows that the manipulation of the learning
rates also had a remarkable effect on the relative frequency esti-
mations, although the effects on this variable are somewhat more
complex. The condition in which both learning rates were highest
(.8/.8) yields the worst frequency estimations. The reason for this
is that the high learning rates result in very large oscillations
around the learning asymptote after every trial (recall that the smooth
learning curves that can be seen in the top panel of Figure 4 show
the average results of 500 simulations with different trial orders),
which in turn gives rise to less than perfect frequency estimations
on each trial. By contrast, in the condition with the lowest learning
rates, .2/.2, these oscillations are minimal and have a less disturb-
ing effect on the relative frequency estimations.
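The link between learning rate and trial-by-trial oscillation can be seen even in a deliberately simplified case with a single association trained on a probabilistic outcome. The sketch below is not the three-association model used in Simulation 3; the outcome probability, the number of trials, the burn-in period, and the seed are arbitrary illustrative values, and the two rates are the products .2 x .5 and .8 x .5.

```python
import random
import statistics

def strengths_after_burn_in(rate, p_outcome=0.8, n_trials=5000, burn_in=500, seed=1):
    """Track one association updated by error correction on every trial."""
    rng = random.Random(seed)
    v = 0.0
    values = []
    for t in range(n_trials):
        outcome = 1.0 if rng.random() < p_outcome else 0.0
        v += rate * (outcome - v)   # error-correction update
        if t >= burn_in:            # keep only values recorded near the asymptote
            values.append(v)
    return values

low = strengths_after_burn_in(rate=0.2 * 0.5)
high = strengths_after_burn_in(rate=0.8 * 0.5)
# Both series hover around the same asymptote, but with different spread.
print(statistics.mean(low), statistics.pstdev(low))
print(statistics.mean(high), statistics.pstdev(high))
```

Because both runs share the same seed, they receive the same outcome sequence, so any difference in spread is due to the learning rate alone.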
The most interesting conditions for our present purposes are
those in which the cause and the context have different saliences.
As can be seen in the top panel, conditions .8/.2 and .2/.8 have the
most different cause-effect associations; however, they show a
relatively similar accuracy in their relative frequency estimations.
This shows that manipulations of the learning rates can have very
different effects on the development of the cause-effect association
and on the frequency estimates: Manipulations that have the stronger
impact on the former might have a rather small impact on the
latter, and vice versa. Thus, contrary to the theoretical
interpretation of Cándido et al. (2006), their dissociation between
contingency judgments and frequency estimations can be accounted for
in purely associative terms.
Simulation 4: Frequency Estimations Based on Other
Associative-Learning Rules
The previous simulations rely on the R–W learning rule, which
is perhaps the most popular associative learning theory in the area
of animal conditioning and also in human contingency learning
research (Allan, 1993; Shanks, López, Darby, & Dickinson, 1996).
However, the same idea can be implemented in alternative learning
algorithms. For example, the comparator hypothesis (Miller &
Matzel, 1988; Stout & Miller, 2007) assumes that cause-effect
associations are developed by means of a much simpler learning
algorithm which is only sensitive to the conditional probability of
the effect given the cause instead of the cause-effect contingency
(Bush & Mosteller, 1951). It is easy to develop an associative
learning model that reconstructs the frequency data on the basis of
cause-effect associations computed with that algorithm.
Similarly, some researchers (Cheng, 1997; Holyoak & Cheng,
2011; Novick & Cheng, 2004) have argued that human causal
learning is not sensitive to the cause-effect contingency, but to a
related concept called causal power. Under some conditions, the
generative causal power (i.e., the ability of a cause to produce an
effect) can be computed from the observable contingency infor-
mation by means of the following rule:
$$q = \frac{\Delta p}{1 - p(E \mid \neg C)}. \tag{15}$$
In principle, this equation is not proposed as a specific rule that
people apply consciously to infer causal relations, but as a norma-
tive standard for the rational analysis of causal learning. However,
some authors have proposed an associative learning rule, similar to
the R–W model, that computes causal power asymptotically and
that could provide an algorithmic explanation of people's ability to
estimate causal power (Danks, Griffiths, & Tenenbaum, 2003). As
we try to show below, it is equally plausible to make inferences
about the relative frequencies of each trial type on the basis of the
output of this associative learning rule.
The associative rule proposed by Danks et al. (2003) only differs
from the R–W rule in the assumptions about how the associative
strengths of several stimuli (e.g., the cause and the context) are
combined to produce the total associative strength, $V_{TOTAL}$.
Although in the standard R–W model $V_{TOTAL}$ is a linear sum of
the associative strengths of the stimuli that are present on any
single trial, in the model proposed by Danks et al. (2003) these
associative strengths are combined by means of a noisy-OR
integration rule. In situations in which only one potential cause
and a constant context are involved, this rule amounts to computing
$$V_{TOTAL} = V_{C \to E} + V_{CXT \to E} - (V_{C \to E} \times V_{CXT \to E}). \tag{16}$$
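When this associative learning rule is used instead of the
standard R–W rule, the probability of the effect given the cue, p(E|C),
can no longer be estimated following Equation 7, but it can be
computed as follows:
$$p(E \mid C) = V_{C \to E} + V_{CXT \to E} - (V_{C \to E} \times V_{CXT \to E}). \tag{17}$$
Consequently, Equations 10 and 11 also need to be rewritten as:
$$a = V_{CXT \to C} \times [V_{C \to E} + V_{CXT \to E} - (V_{C \to E} \times V_{CXT \to E})], \tag{18}$$
$$b = V_{CXT \to C} \times \{1 - [V_{C \to E} + V_{CXT \to E} - (V_{C \to E} \times V_{CXT \to E})]\}. \tag{19}$$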
In this simulation we show that, as expected, when using these
equations to compute the strength of the cause-effect association,
its asymptotic value depends on causal power (as computed by
Equation 15) and not on contingency. Furthermore, regardless of
the precise value of causal power or contingency, the frequency
estimations that can be made on the basis of this associative
learning algorithm (based on Equations 12, 13, 18, and 19) are
similarly accurate across conditions and do not differ remarkably
from those observed in Simulations 1 to 3, when the standard R–W
rule was used.
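That an error-correction rule with noisy-OR integration settles on causal power rather than on contingency can be checked with a short sketch. Instead of simulating random trial sequences, it iterates the expected (trial-averaged) update, which is a simplification introduced here for illustration; the function names, the step size, and the number of steps are likewise assumptions and not part of the published rule.

```python
def noisy_or(v_cue, v_ctx):
    """Combine two generative strengths with a noisy-OR rule."""
    return v_cue + v_ctx - v_cue * v_ctx

def expected_noisy_or_learning(a, b, c, d, rate=0.1, n_steps=5000):
    """Iterate the expected error-correction update for a cue and a constant context.

    The cue is updated only on cause-present trials; the context is updated on
    every trial, using the noisy-OR prediction when the cue is present.
    """
    n = a + b + c + d
    p_c = (a + b) / n
    p_e_c = a / (a + b)
    p_e_nc = c / (c + d)
    v_cue = v_ctx = 0.0
    for _ in range(n_steps):
        err_present = p_e_c - noisy_or(v_cue, v_ctx)  # mean error on a and b trials
        err_absent = p_e_nc - v_ctx                   # mean error on c and d trials
        v_cue += rate * p_c * err_present
        v_ctx += rate * (p_c * err_present + (1 - p_c) * err_absent)
    return v_cue, v_ctx

# Cell frequencies (a, b, c, d) of the Simulation 4 conditions in Table 2.
simulation_4 = {
    ".25/.25": (25, 75, 0, 100),
    ".25/.10": (70, 30, 60, 40),
    ".50/.25": (75, 25, 50, 50),
    ".50/.10": (90, 10, 80, 20),
}
for name, cells in simulation_4.items():
    v_cue, v_ctx = expected_noisy_or_learning(*cells)
    print(name, round(v_cue, 3), round(v_ctx, 3))
```

At the fixed point the context strength matches the probability of the effect in the absence of the cause, and the cue strength is the value that makes the noisy-OR prediction equal p(E|C), which is the quantity defined in Equation 15.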
The formulation proposed by Danks et al. (2003) includes both a
noisy-OR rule for the combination of generative causes and an AND-NOT
integration rule for the combination of preventive causes. For the sake of
simplicity, we have only simulated positive cause-effect relationships and,
consequently, our formalization neglects the AND-NOT integration rule
necessary to take into account preventive causal powers.
Four conditions were included in Simulation 4. In half of the
conditions, the generative power of the cause, as computed by
Equation 15, was .25 and in the other half it was .50. Orthogonally,
in half of the conditions the cause-effect contingency as measured
by Δp was .25 and in the other half it was .10. The trial frequencies
of each condition are shown in Table 2. All the parameter values
were kept as in Simulations 1 and 2. As in previous simulations,
500 replications, each one with a randomized trial sequence, were
simulated for each condition.
Results and Discussion
The results of Simulation 4 are depicted in Figure 5. The first
number in the name of each data series refers to the causal power
and the second one to the Δp value of that condition. As can be
seen in the top panel of Figure 5, the asymptotic associative
strength converges to the causal power, with contingency playing
a relatively minor role in the pre-asymptotic strength of the
associations. The lower panel shows that the accuracy of the relative
frequency estimations made on the basis of the associations was
relatively similar in all conditions and rather similar to that found
in previous simulations, when the standard R–W rule was used for
the computation of the associative strengths. We found it
interesting that conditions with remarkably different learning curves,
such as conditions .50/.25 and .25/.10, reach similar levels
of accuracy in the relative frequency estimations. This provides
additional support for our claim that manipulations that influence
the value of the cause-effect association need not have a similar
impact on the accuracy of frequency estimates.
General Discussion
In the Introduction, we argued that researchers have often made
use of three arguments to favour rule-based accounts of causal
learning over associative explanations. First, the fact that partici-
pants can estimate the number of trials of each type they have been
exposed to is assumed to fit better with rule-based accounts.
Admittedly, the associative models most widely cited in the area of
causal learning do not include any mechanism by which frequency
data can be encoded and stored. However, as we have shown, the
associations computed by means of the R–W rule and related
associative algorithms contain all the information necessary to
reconstruct the frequency data. Therefore, any evidence that peo-
ple “remember” this information cannot be taken as a strong
argument in favour of rule-based models.
Second, the abundant literature showing that people can flexibly
compute several statistical indexes to describe the relationship
between the cause and the effect (Gredebäck et al., 2000; Matute
et al., 1996; Vadillo & Matute, 2007; Vadillo et al., 2005) has
sometimes been interpreted as uniquely supporting rule-based
accounts over associative ones (see, e.g., Cobos et al., 2000;
Gredebäck et al., 2000). However, just as associations contain
all the information necessary to reconstruct the contingency
table that gave rise to those associations, that information can also
be used to infer other statistical indexes relating cause and effect.
For example, as noted by Vadillo and Matute (2007), the proba-
bility of the effect given the cause can be computed by combining
the associative strengths of the cause and the context. Moreover,
this associative perspective makes new predictions that cannot be
easily accommodated by rule-based models without making ad
hoc assumptions. The associative framework described in the
present paper is nothing but an extension of these ideas.
Perhaps the strongest arguments in favour of rule-based ac-
counts are the ones related to the third point mentioned in the
Introduction: That is, results showing that there are experimental
manipulations that have an effect on some dependent variables but
not others (Gredebäck et al., 2000; Matute et al., 1996; Pineño,
Denniston, Beckers, Matute, & Miller, 2005). Dissociations
between causal judgments and frequency estimations are particularly
compelling for the present work (Cándido et al., 2006; Crump et
al., 2007; Price & Yates, 1995; Ramos-Álvarez & Catena, 2005).
However, our Simulation 2, aimed at addressing this particular
issue, illustrates how, at least sometimes, inferences made on the
basis of biased associations need not exhibit the same biases. In
particular, we showed that, although an outcome-density manipu-
lation does have an effect on the cause-effect associations com-
puted by the R–W algorithm, that manipulation has no observable
impact on the frequency estimations based on those associations.
In a similar fashion, in Simulation 3 we showed that manipulations
of the learning rates need not have the same impact on cause-effect
associations and on the relative frequency estimations. The results
of Simulation 4 also support this view.
Some of the published reports that have used dissociations in
judgments to test associative models have relied on cue competition
manipulations. In a typical cue competition design, several
cues (or potential causes) are trained as predictors of the outcome
(or effect), with the usual result that the contingency between one
of those cues and the outcome has an effect not only on judgments
about that cue but also on judgments about the second cue whose
contingency is not manipulated. As mentioned above, extant
evidence shows that although cue competition affects causal and
contingency learning, again it has little or no impact on predictive
judgments (Gredebäck et al., 2000; Matute et al., 1996; Pineño et
al., 2005) or on frequency estimates (Ramos-Álvarez & Catena,
2005). The frequency-retrieval rules that we have developed here
deal only with situations involving just one cue and one context
and, therefore, they are not well suited to account for these
dissociations. However, as the astute reader may have noticed, the logic
behind this set of rules can be easily extended to multiple-cue
settings, so that unbiased estimations of frequency data can be
observed in multiple-cause settings that usually give rise to cue
competition effects. In this case, frequency estimates would rely
not on three associations but on more, including associations of
the context with each cue, associations between both cues, and so
on. Such a model would certainly look more complex from a
purely formal point of view, but would be based on a similarly
simple logic: Instead of recalling frequency data, participants would
just infer these data from their intuitions about conditional
probabilities.
Figure 5. Results of Simulation 4: The top panel shows the strength of
the cause-effect association in each condition. The lower panel shows the
global error made by the model when attempting to infer the relative
frequency of each type of trial. RMS = root mean squared error.
Regardless of the potential merits and shortcomings of the
specific formalization that we have advanced here, our simulations
show that researchers should be more cautious when making
strong claims about the incompatibility of associative models with
any evidence of flexibility of judgments. If we want to have a more
detailed picture of the cognitive priority of some processes over
others, then we cannot simply rely on information about these
dissociations. This strategy should be complemented with alterna-
tive methods that allow us to more directly assess the relative
complexity or automaticity of several processes and their precise
sequence over time. For example, an important (although not
necessary) assumption behind associative models is that learning
mechanisms operate in a relatively automatic fashion and pose
few demands in terms of cognitive resources. Based on this idea
many researchers have tested the plausibility of these mechanisms
by trying to measure learning effects using more implicit measures,
instead of the traditional verbal causal ratings. Some of these studies
(e.g., Morís, Cobos, & Luque, 2010; Sternberg & McClelland, 2012)
have obtained data consistent with associative models, whereas
others have failed to find some learning effects with these
implicit measures (e.g., De Houwer & Vandorpe, 2010; Ratliff
& Nosek, 2010). Although the evidence is not conclusive yet,
we think that these alternative measures of learning are a
promising tool for any attempt to discriminate between asso-
ciative and rule-based accounts of contingency learning. Ma-
nipulations of time-pressure (Vadillo & Matute, 2010) or sec-
ondary task (De Houwer & Beckers, 2003; Wills, Graham, Koh,
McLaren, & Rolland, 2011) are complementary tools to address
the automaticity of the processes involved in contingency learn-
ing. In a similar way, physiological data can also be used to test
the plausibility of different theories by looking for physiolog-
ical correlates of the processes assumed by each theory. For
example, recent research has found correlates of error-
correction processes or attention modulation that are consistent
with associative learning theory (e.g., Fletcher et al., 2001;
Luque, López, Marco-Pallares, Càmara, & Rodríguez-Fornells,
in press; Walsh & Anderson, 2011; Wills, Lavric, Croft, &
Hodgson, 2007). We think that any serious attempt to contrast
the predictions of rule-based and associative accounts of con-
tingency learning should benefit from the fruitful combination
of these and other methods, instead of relying solely on relatively
simplistic interpretations of judgment dissociations
which, as we have tried to show in the present paper, are usually
open to alternative and equally plausible interpretations.
Résumé
Previous research on causal learning has generally led to strong
claims about the complexity and temporal priority of some processes
over others, based on observations of dissociations between
different types of judgments. In particular, it has been argued
that a dissociation between causal judgment and trial-type frequency
information is incompatible with the general cognitive architecture
proposed by associative models. In contrast with this approach, we
carry out an associative analysis of this process showing that this
is not necessarily the case. We conclude that any attempt to gain a
better understanding of the cognitive architecture involved in
contingency learning cannot rely solely on data about these
dissociations.
Keywords: contingency learning, probability learning, statistical
models, associative models.
References
Allan, L. G. (1980). A note on measurement of contingency between two binary variables in judgement tasks. Bulletin of the Psychonomic Society, 15, 147–149.
Allan, L. G. (1993). Human contingency judgments: Rule-based or associative? Psychological Bulletin, 114, 435–448. doi:10.1037/0033-
Allan, L. G. (2003). Assessing power PC. Learning & Behavior, 31, 192–204. doi:10.3758/BF03195982
Allan, L. G., & Jenkins, H. M. (1980). The judgment of contingency and the nature of the response alternatives. Canadian Journal of Psychology, 34, 1–11. doi:10.1037/h0081013
Allan, L. G., & Jenkins, H. M. (1983). The effect of representations of binary variables on judgment of influence. Learning and Motivation, 14, 381–405. doi:10.1016/0023-9690(83)90024-3
Allan, L. G., Siegel, S., & Tangen, J. M. (2005). A signal detection analysis of contingency data. Learning & Behavior, 33, 250–263. doi:10.3758/
Alloy, L. B., & Abramson, L. Y. (1979). Judgment of contingency in depressed and nondepressed students: Sadder but wiser? Journal of Experimental Psychology: General, 108, 441–485. doi:10.1037/0096-
Beyth-Marom, R. (1982). Perception of correlation re-examined. Memory & Cognition, 10, 511–519. doi:10.3758/BF03202433
Busemeyer, J. R. (1991). Intuitive statistical estimation. In N. H. Anderson (Ed.), Contributions to information integration theory: Vol. 1. Cognition (pp. 187–215). Hillsdale, NJ: Erlbaum.
Bush, R. R., & Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58, 313–323.
Cándido, A., Perales, J. C., Catena, A., Maldonado, A., Guadarrama, L., Beltrán, R., . . . Herrera, A. (2006). Efectos de inducción emocional en el aprendizaje causal [Effects of emotion induction on causal learning]. Psicológica, 27, 243–267.
Chapman, G. B., & Robbins, S. J. (1990). Cue interaction in human contingency judgment. Memory & Cognition, 18, 537–545. doi:10.3758/
Cheng, P. W. (1997). From covariation to causation: A causal power theory. Psychological Review, 104, 367–405. doi:10.1037/0033-
Cheng, P. W., & Novick, L. R. (1992). Covariation in natural causal induction. Psychological Review, 99, 365–382. doi:10.1037/0033-
Cobos, P. L., Caño, A., López, F. J., Luque, J. L., & Almaraz, J. (2000). Does the type of judgement required modulate cue competition? Quarterly Journal of Experimental Psychology, 53B, 193–207.
Crump, M. J. C., Hannah, S. D., Allan, L. G., & Hord, L. K. (2007). Contingency judgments on the fly. Quarterly Journal of Experimental Psychology, 60, 753–761. doi:10.1080/17470210701257685
Danks, D. (2003). Equilibria of the Rescorla–Wagner model. Journal of Mathematical Psychology, 47, 109–121. doi:10.1016/S0022-
Danks, D., Griffiths, T. L., & Tenenbaum, J. B. (2003). Dynamical causal learning. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in neural information processing systems (Vol. 15, pp. 67–74). Cambridge, MA: MIT Press.
De Houwer, J., & Beckers, T. (2003). Secondary task difficulty modulates forward blocking in human contingency learning. Quarterly Journal of Experimental Psychology, 56B, 345–357.
De Houwer, J., & Vandorpe, S. (2010). Using the Implicit Association Test as a measure of causal learning does not eliminate effects of rule learning. Experimental Psychology, 57, 61–67. doi:10.1027/1618-3169/
Dickinson, A., Shanks, D. R., & Evenden, J. (1984). Judgement of act-outcome contingency: The role of selective attribution. Quarterly Journal of Experimental Psychology, 36A, 29–50.
Fletcher, P. C., Anderson, J. M., Shanks, D. R., Honey, R., Carpenter, T. A., Donovan, T., . . . Bullmore, E. T. (2001). Responses of human frontal cortex to surprising events are predicted by formal associative learning theory. Nature Neuroscience, 4, 1043–1048. doi:10.1038/nn733
Gredebäck, G., Winman, A., & Juslin, P. (2000). Rational assessments of covariation and causality. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the 22nd annual conference of the Cognitive Science Society (pp. 190–195). Hillsdale, NJ: Erlbaum.
Holyoak, K. J., & Cheng, P. W. (2011). Causal learning and inference as a rational process: The new synthesis. Annual Review of Psychology, 62, 135–163. doi:10.1146/annurev.psych.121208.131634
Hume, D. (1964). Treatise of human nature (L. S. Selby-Bigge, Ed.). London, England: Oxford University Press. (Original work published
Jenkins, H. M., & Ward, W. C. (1965). Judgment of contingency between responses and outcomes. Psychological Monographs, 79, 1–17. doi:
Luque, D., López, F. J., Marco-Pallares, J., Càmara, E., & Rodríguez-Fornells, A. (in press). Feedback-related brain potential activity complies with basic assumptions of associative learning theory. Journal of Cognitive Neuroscience.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco, CA:
Matute, H., Arcediano, F., & Miller, R. R. (1996). Test question modulates cue competition between causes and between effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 182–196.
Matute, H., Yarritu, I., & Vadillo, M. A. (2011). Illusions of causality at the heart of pseudoscience. British Journal of Psychology, 102, 392–405.
Mercier, P. (1996). Computer simulations of the Rescorla–Wagner and Pearce–Hall models in conditioning and contingency judgment. Behavior Research Methods, 28, 55–60.
Miller, R. R., & Matzel, L. D. (1988). The comparator hypothesis: A response rule for the expression of associations. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 22, pp. 51–92). San Diego, CA: Academic Press.
Morís, J., Cobos, P. L., & Luque, D. (2010). What priming techniques can tell us about associative representations acquired during human contingency learning. Open Psychology Journal, 3, 97–104. doi:10.2174/
Musca, S. C., Vadillo, M. A., Blanco, F., & Matute, H. (2010). The role of cue information in the outcome-density effect: Evidence from neural network simulations and a causal learning experiment. Connection Science, 22, 177–192. doi:10.1080/09540091003623797
Novick, L. R., & Cheng, P. W. (2004). Assessing interactive causal influence. Psychological Review, 111, 455–485. doi:10.1037/0033-
Pearl, J. (2000). Causality: Models, reasoning, and inference. New York, NY: Cambridge University Press.
Perales, J. C., Catena, A., Shanks, D. R., & González, J. A. (2005). Dissociation between judgments and outcome expectancy measures in covariation learning: A signal detection theory approach. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 1105–1120. doi:10.1037/0278-7393.31.5.1105
Pineño, O., Denniston, J. C., Beckers, T., Matute, H., & Miller, R. R. (2005). Contrasting predictive and causal values of predictors and causes. Learning & Behavior, 33, 184–196. doi:10.3758/BF03196062
Price, P. C., & Yates, J. F. (1995). Associative and rule-based accounts of cue interaction in contingency judgment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1639–1655. doi:
Ramos-Álvarez, M. M., & Catena, A. (2005). The dissociation between the recall of stimulus frequencies and the judgment of contingency allows the placement of the competition effect in the final causal processing stages. Psicológica, 26, 293–303.
Ratliff, K. A., & Nosek, B. A. (2010). Creating distinct implicit and explicit attitudes with an illusory correlation paradigm. Journal of Experimental Social Psychology, 46, 721–728. doi:10.1016/j.jesp.2010
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64–99). New York, NY:
Shaklee, H., & Mims, M. (1982). Sources of error in judging event covariations: Effects of memory demands. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 208–224. doi:
Shanks, D. R. (1987). Acquisition functions in contingency judgment. Learning and Motivation, 18, 147–166. doi:10.1016/0023-
Shanks, D. R. (1995). Is human learning rational? Quarterly Journal of Experimental Psychology, 48A, 257–279.
Shanks, D. R., & Dickinson, A. (1987). Associative accounts of causality judgement. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 21, pp. 229–261). San Diego, CA: Academic Press.
Shanks, D. R., & Dickinson, A. (1991). Instrumental judgment and performance under variations in action-outcome contingency and contiguity. Memory & Cognition, 19, 353–360. doi:10.3758/BF03197139
Shanks, D. R., Lo´pez, F. J., Darby, R. J., & Dickinson, A. (1996).
Distinguishing associative and probabilistic contrast theories of human
contingency judgment. In D. R. Shanks, K. J. Holyoak, & D. L. Medin
(Eds.), The psychology of learning and motivation, Vol. 34: Causal
learning (pp. 265–311). San Diego, CA: Academic Press.
Smedslund, J. (1963). The concept of correlation in adults. Scandinavian
Journal of Psychology, 4, 165–173. doi:10.1111/j.1467-9450.1963
Sternberg, D. A., & McClelland, J. L. (2012). Two mechanisms of human
contingency learning. Psychological Science, 23, 59 – 68. doi:10.1177/
Stout, S. C., & Miller, R. R. (2007). Sometimes-competing retrieval
(SOCR): A formalization of the comparator hypothesis. Psychological
Review, 114, 759 –783. doi:10.1037/0033-295X.114.3.759
Vadillo, M. A., & Matute, H. (2007). Predictions and causal estimations
are not supported by the same associative structure. Quarterly Jour-
nal of Experimental Psychology, 60, 433– 447. doi:10.1080/
Vadillo, M. A., & Matute, H. (2010). Augmentation in contingency learn-
ing under time pressure. British Journal of Psychology, 101, 579 –589.
Vadillo, M. A., Miller, R. R., & Matute, H. (2005). Causal and predictive-value
judgments, but not predictions, are based on cue-outcome contingency.
Learning & Behavior, 33, 172–183. doi:10.3758/BF03196061
Vadillo, M. A., Musca, S. C., Blanco, F., & Matute, H. (2011). Contrasting
cue-density effects in causal and prediction judgments. Psychonomic
Bulletin & Review, 18, 110–115. doi:10.3758/s13423-010-0032-2
Walsh, M. M., & Anderson, J. R. (2011). Modulation of the feedback-related
negativity by instruction and experience. Proceedings of the
National Academy of Sciences, 108, 19048–19053. doi:10.1073/
Wasserman, E. A. (1990). Detecting response-outcome relations: Toward
an understanding of the causal texture of the environment. In G. H.
Bower (Ed.), The psychology of learning and motivation (Vol. 26, pp.
27–82). San Diego, CA: Academic Press.
Wasserman, E. A., Elek, S. M., Chatlosh, D. L., & Baker, A. G. (1993).
Rating causal relations: The role of probability in judgments of
response-outcome contingency. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 19, 174–188. doi:10.1037/0278-
Wasserman, E. A., Kao, S.-F., Van Hamme, L. J., Katagari, M., & Young,
M. E. (1996). Causation and association. In D. R. Shanks, K. J. Holyoak,
& D. L. Medin (Eds.), The psychology of learning and motivation, Vol.
34: Causal learning (pp. 207–264). San Diego, CA: Academic Press.
White, P. A. (2003). Making causal judgments from the proportion of
confirming instances: The pCI rule. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 29, 710–727. doi:10.1037/
Wills, A. J., Graham, S., Koh, Z., McLaren, I. P. L., & Rolland, M. D.
(2011). Effects of concurrent load on feature- and rule-based generalization
in human contingency learning. Journal of Experimental Psychology:
Animal Behavior Processes, 37, 308–316. doi:10.1037/
Wills, A. J., Lavric, A., Croft, G. S., & Hodgson, T. L. (2007). Predictive
learning, prediction errors, and attention: Evidence from event-related
potentials and eye tracking. Journal of Cognitive Neuroscience, 19,
843–854. doi:10.1162/jocn.2007.19.5.843
Received October 6, 2011
Accepted January 31, 2012
... These overestimations are larger when the outcome (1E) or the cue (1F) is very frequent, and even larger when both of them are very frequent (1G). Therefore, the model also provides a nice explanation for cue- and outcome-density biases (Matute, Vadillo, Blanco, & Musca, 2007; Shanks, 1995; Vadillo & Luque, 2013). ...
Decades of research in causal and contingency learning show that people’s estimations of the degree of contingency between two events are easily biased by the relative probabilities of those two events. If two events co-occur frequently, then people tend to overestimate the strength of the contingency between them. Traditionally, these biases have been explained in terms of relatively simple single-process models of learning and reasoning. However, more recently some authors have found that these biases do not appear in all dependent variables and have proposed dual-process models to explain these dissociations between variables. In the present paper we review the evidence for dissociations supporting dual-process models and we point out important shortcomings of this literature. Some dissociations seem to be difficult to replicate or poorly generalizable and others can be attributed to methodological artefacts. Overall, we conclude that support for dual-process models of biased contingency detection is scarce and inconclusive.
In this study we focus on the influence of emotions on causal learning. In two experiments, we show that subjective estimates of causal strength between a potential cause (a fertilizer) and an effect (the blooming of a plant) are hampered by concurrent negative emotions, elicited by means of emotional pictures (IAPS). In Experiment 1, participants were exposed to two consistently correlated events (the fertilizer and the blooming) in a trial-by-trial, sequential manner (32 trials) and were asked to estimate the degree to which the fertilizer made the plant bloom. The procedure in Experiment 2 was identical to the one in Experiment 1, but in this case the cause and the effect were inversely correlated (the fertilizer prevented the plant from blooming). In both cases three different groups of participants were presented with negative, positive, or neutral pictures during the task. The positive and neutral picture groups did not differ from each other, but both performed better than the negative picture group. In the two experiments, mean judgments in the negative picture group were lower (in absolute value) and less well tuned to the objective contingency than in the positive and neutral picture groups. However, once the task had finished, the participants were able to accurately retrieve the frequencies of the different types of pairings (cause-effect, cause-no effect, no cause-effect, and no cause-no effect) presented during the task. Accordingly, trial-by-trial predictions (made between the cause and the effect in each trial) were insensitive to the emotion manipulation. These data indicate that the judgmental bias induced by negative emotions was not due to a faulty coding of the information on which the judgments are based. Coding was equally accurate in all groups, which means the effect of emotion must occur in a higher information integration stage.
Specifically, reasoners in the negative emotion groups seemed to be unable to update previous beliefs on the basis of newly acquired causal evidence, an effect that can be easily described by the belief-adjustment model (Catena et al., 1998), but is clearly at odds with single algorithm models.
This chapter describes the potential explanatory power of a specific response rule and its implications for models of acquisition. This response rule is called the “comparator hypothesis.” It was originally inspired by Rescorla's contingency theory. Rescorla noted that if the number and frequency of conditioned stimulus–unconditioned stimulus (CS–US) pairings are held constant, unsignaled presentations of the US during training attenuate conditioned responding. This observation complemented the long recognized fact that the delivery of nonreinforced presentations of the CS during training also attenuates conditioned responding. The symmetry of the two findings prompted Rescorla to propose that during training, subjects inferred both the probability of the US in the presence of the CS and the probability of the US in the absence of the CS and they then established a CS–US association based upon a comparison of these quantities. The comparator hypothesis is a qualitative response rule, which, in principle, can complement any model of acquisition.
The study of the mechanism that detects the contingency between events, in both humans and non-human animals, is a matter of considerable research activity. Two broad categories of explanations of the acquisition of contingency information have received extensive evaluation: rule-based models and associative models. This article assesses the two categories of models for human contingency judgments. The data reveal systematic departures in contingency judgments from the predictions of rule-based models. Recent studies indicate that a contiguity model of Pavlovian conditioning is a useful heuristic for conceptualizing human contingency judgments.
Written by one of the preeminent researchers in the field, this book provides a comprehensive exposition of modern analysis of causation. It shows how causality has grown from a nebulous concept into a mathematical theory with significant applications in the fields of statistics, artificial intelligence, economics, philosophy, cognitive science, and the health and social sciences. Judea Pearl presents and unifies the probabilistic, manipulative, counterfactual, and structural approaches to causation and devises simple mathematical tools for studying the relationships between causal connections and statistical associations. The book will open the way for including causal analysis in the standard curricula of statistics, artificial intelligence, business, epidemiology, social sciences, and economics. Students in these fields will find natural models, simple inferential procedures, and precise mathematical definitions of causal concepts that traditional texts have evaded or made unduly complicated. The first edition of Causality has led to a paradigmatic change in the way that causality is treated in statistics, philosophy, computer science, social science, and economics. Cited in more than 5,000 scientific publications, it continues to liberate scientists from the traditional molds of statistical thinking. In this revised edition, Judea Pearl elucidates thorny issues, answers readers’ questions, and offers a panoramic view of recent advances in this field of research. Causality will be of interest to students and professionals in a wide variety of fields. Anyone who wishes to elucidate meaningful relationships from data, predict effects of actions and policies, assess explanations of reported events, or form theories of causal understanding and causal speech will find this book stimulating and invaluable.
In the first experiment subjects were presented with a number of sets of trials on each of which they could perform a particular action and observe the occurrence of an outcome in the context of a video game. The contingency between the action and outcome was varied across the different sets of trials. When required to judge the effectiveness of the action in controlling the outcome during a set of trials, subjects assigned positive ratings for a positive contingency and negative ratings for a negative contingency. Furthermore, the magnitude of the ratings was related systematically to the strength of the actual contingency. With a fixed probability of an outcome given the action, judgements of positive contingencies decreased as the likelihood that the outcome would occur without the action was raised. Correspondingly, the absolute value of ratings of negative contingencies was increased both by an increment in the probability of the outcome in the absence of the action and by a decrement in the probability of the outcome following the action. A systematic bias was observed, however, in that positive judgements were given under a non-contingent relationship when the outcome frequency was relatively high. However, this bias could be reduced by giving extended exposure to the non-contingent schedule (Experiment 2). This pattern of contingency judgements can be explained if it is assumed that a process of selective attribution operates, whereby people are less likely to attribute an outcome to some potential target cause if another effective cause is present. Experiments 2 and 3 demonstrated the operation of this process by showing that initially establishing another agent as an effective cause of the outcome subsequently reduced or blocked the extent to which the subjects attributed the outcome to the action. 
Finally, we argue that the pattern and bias in contingency judgements based upon interactions with a causal process can be explained in terms of contemporary conditioning models of associative learning.
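The abstract above attributes the outcome-density bias, and its reduction with extended training, to conditioning models without naming one. As a minimal sketch, assuming the Rescorla-Wagner rule with a constant context cue and expected-value updates in place of trial-by-trial sampling, the following shows how a noncontingent action can gain positive strength early on and lose it later. The function name, learning rate, and probabilities are hypothetical choices made for illustration, not values taken from the study.

```python
def expected_rw_noncontingent(p_outcome, n_steps, rate=0.2, p_action=0.5):
    """Expected-value Rescorla-Wagner updates for an action trained in a context.

    The outcome occurs with probability p_outcome whether or not the action is
    performed, so the programmed contingency is zero. All parameter values are
    illustrative assumptions.
    """
    v_action, v_context = 0.0, 0.0
    history = []
    for _ in range(n_steps):
        # Expected prediction error on trials with the action (action + context).
        err_action_trials = p_outcome - (v_action + v_context)
        # Expected prediction error on trials without the action (context only).
        err_context_trials = p_outcome - v_context
        # The action is updated only on trials where it is present.
        v_action += rate * p_action * err_action_trials
        # The context is present, and therefore updated, on every trial.
        v_context += rate * (p_action * err_action_trials
                             + (1 - p_action) * err_context_trials)
        history.append(v_action)
    return history


strengths = expected_rw_noncontingent(p_outcome=0.75, n_steps=200)
print(f"after 10 steps:  {strengths[9]:.3f}")
print(f"after 200 steps: {strengths[-1]:.3f}")
```

Under these assumptions the action's strength is clearly positive after a few updates and approaches zero with extended training, because the context gradually absorbs the prediction of the outcome.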
This chapter discusses how experimental psychology is no longer a unified field of scholarship. The most obvious sign of disintegration is the division of the Journal of Experimental Psychology into specialized periodicals. Many forces propel this fractionation. First, the explosion of interest in many small spheres of inquiry has made it extremely difficult for an individual to master more than one. Second, the recent popularity of interdisciplinary research has lured many workers away from the central issues of experimental psychology. Third, there is a growing division between researchers of human and animal behavior; this division has been primarily driven by contemporary cognitive psychologists, who see little reason to refer to the behavior of animals or to inquire into the generality of behavioral principles. The chapter considers the study of causal perception. This area is certainly at the core of experimental psychology. Although recent research in animal cognition has taken the tack of bringing human paradigms into the animal laboratory, the experimental research described here has adopted the reverse strategy of bringing animal paradigms into the human laboratory. A further unfortunate fact is that today's experimental psychologists are receiving little or no training in the history and philosophy of psychology. This neglect means that investigations of a problem area are often undertaken without a full understanding of the analytical issues that would help guide empirical inquiry.
This chapter discusses the associative accounts of causality judgment. The perceptual and cognitive approaches to causal attribution can be contrasted with a more venerable tradition of associationism. The only area of psychology that offered an associative account of a process sensitive to causality is that of conditioning. An instrumental or operant conditioning procedure presents a subject with a causal relationship between an action and an outcome, the reinforcer; performing the action either causes the reinforcer to occur under a positive contingency or prevents its occurrence under a negative one, and the subjects demonstrate sensitivity to these causal relationships by adjusting their behavior appropriately. Most of these associative theories are developed to explain classic or Pavlovian conditioning rather than the instrumental or operant variety, but there are good reasons for assuming that the two types of conditioning are mediated by a common learning process.
This chapter discusses theoretical issues concerning contingency judgment. One empirical result exists that appears straightaway to challenge the idea that contingency judgments can be modeled by the Rescorla-Wagner theory. This is the finding that judgments under noncontingent schedules do not always appear to converge across trials. The idea that stimuli are represented configurally allows the results of the experiments to be accommodated, although it should be acknowledged that there are a number of problems facing this approach. An account of retrospective revaluation effects requires an elemental rather than a configural analysis: in an AB → 0, B → 0 design, subjects are assumed to relate what they learn in the second stage about element B to what they already know about compound AB, such that the balance of associative strengths of A and B is altered. It is difficult to see how a configural analysis, whereby the compound AB is represented quite independently of its elements, would allow this to happen. Some recent data raise the possibility that subjects behave configurally only under certain conditions. Many researchers agree that the appropriate normative theory is provided by the Δp metric: contingency judgments should then be evaluated for their objective accuracy against Δp and are assumed to be biased whenever they deviate from that statistic. Rather than proving that contingency judgment is nonnormative, however, results should be viewed in the same way as visual illusions: manifestations of an incorrect output from a system that fundamentally does provide a true picture of the world but that can be misled as a result of having to produce a response on the basis of insufficient evidence.
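For readers unfamiliar with the Δp metric mentioned above, it is conventionally defined as the probability of the outcome given the cue minus the probability of the outcome given no cue. A minimal sketch, assuming the four trial-type frequencies are labelled a, b, c, and d (hypothetical names for cue-outcome, cue-no outcome, no cue-outcome, and no cue-no outcome trials):

```python
def delta_p(a, b, c, d):
    """Return the contingency P(outcome | cue) - P(outcome | no cue).

    a: cue present, outcome present
    b: cue present, outcome absent
    c: cue absent, outcome present
    d: cue absent, outcome absent
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("Need at least one cue trial and one no-cue trial.")
    return a / (a + b) - c / (c + d)


# Positive contingency: the outcome is more likely when the cue is present.
print(delta_p(15, 5, 5, 15))
# Noncontingent schedule with a frequent outcome: the contingency is zero,
# which is the kind of condition where density biases are typically reported.
print(delta_p(12, 4, 12, 4))
```

A judgment that is reliably above zero in the second case would count as biased relative to this statistic.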