All content in this area was uploaded by Lena Nadarevic on Oct 30, 2020
Research article
Cognitive processes in implicit attitude tasks: An experimental
validation of the Trip Model
LENA NADAREVIC* AND EDGAR ERDFELDER
Department of Psychology, University of Mannheim, Mannheim, Germany
Abstract
Implicit attitude tasks have become very popular in various areas of psychology. However, little is known about the cognitive
processes they involve. To address this issue, we investigated the underlying processes of the Go/No-go Association Task (GNAT),
a go/no-go variant of the well-known Implicit Association Test (IAT). More precisely, we tested two alternative multinomial
processing tree (MPT) models of GNAT performance, the Trip Model and the generalized Quad Model. Both models assume that
GNAT performance is influenced not only by automatic associations but also by response biases and a controlled discrimination
process. However, the two models differ with respect to an additional overcoming bias process. In contrast to the Quad Model, the
Trip Model assumes that overcoming bias does not play a major role in GNAT performance. Instead, the Trip Model emphasizes
the role of response biases. We report three experiments that demonstrate the validity of the Trip Model for GNAT data. Copyright
© 2010 John Wiley & Sons, Ltd.
Several methods to measure implicit attitudes have recently
been proposed (Steffens & Jonas, 2010). Examples include the
Implicit Association Test (IAT; Greenwald, McGhee, &
Schwartz, 1998; Sriram & Greenwald, 2009), the Extrinsic
Affective Simon Task (EAST; De Houwer, 2003; Degner &
Wentura, 2008), and the Go/No-go Association Task (GNAT;
Nosek & Banaji, 2001). All these techniques are based on a
discrimination task that involves speeded responding to object-
evaluation pairs (Banaji, 2001). For instance, the IAT is based
on the idea that it is easier to assign exemplars of two
associated concepts (e.g., flowers and positive words) to the
same response key than exemplars of two unassociated
concepts (e.g., flowers and negative words). Indeed, Greenwald
et al. (1998) demonstrated that people are better in IAT
performance (i.e., faster and more accurate) when they are
instructed to respond to flower exemplars and positive words
with the same response key and to insect exemplars and
negative words with a different key (compatible block). In
contrast, they are poorer in performance when flowers and
negative words are assigned to the same response key and both
insects and positive words are assigned to the other key
(incompatible block). Thus, the average reaction time
difference between incompatible and compatible IAT blocks
can be used as an indirect preference measure for one target
concept (e.g., flower) over the other (e.g., insect).
In spite of some significant advantages indirect attitude
measures have over self-report measures (e.g., enhanced
validity and being less susceptible to faking, cf. Boysen, Vogel,
& Madon, 2006; Nosek & Smyth, 2007; Steffens, 2004), they
also share a serious problem — the lack of a theoretical
foundation. As Fazio and Olson (2003, p. 301) concluded, ‘‘...
research concerning implicit measures has been surprisingly
atheoretical. It largely has been a methodological, empirically
driven enterprise.’’
THE QUAD MODEL
A promising theoretical model of the cognitive processes
involved in implicit attitude measures was introduced by
Conrey, Sherman, Gawronski, Hugenberg, and Groom (2005).
The authors developed the Quad Model, a multinomial
processing tree (MPT) model (Batchelder & Riefer, 1999;
Erdfelder, Auer, Hilbig, Aßfalg, Moshagen, & Nadarevic,
2009) of IAT performance. MPT models are stochastic models
for discrete categorical data as typically obtained in implicit
attitude tasks (e.g., correct vs. incorrect responses for different
conditions). They are based on the idea that observed
responses rarely depend on a single cognitive process. In
general, multiple processes may affect each possible response.
Moreover, it is assumed that each of these underlying
processes occurs with a certain probability. By means of
MPT modeling of observed data, estimates of these process
probabilities can be obtained. Hence, MPT models are useful
tools for disentangling the contributions of different cognitive
processes to observed response frequencies, provided these
models have been shown to be psychologically valid (for a
brief introduction and a useful practical guide to MPT
modeling, see Klauer and Wegener (1998), Appendix A).
According to Conrey et al. (2005), performance in the IAT
and related tasks also depends on multiple cognitive processes.
In contrast to the standard interpretation of IAT scores as pure
measures of automatic associations, they proposed that IAT
performance is based on a quadruplet of processes: (1)
European Journal of Social Psychology, Eur. J. Soc. Psychol. 41, 254–268 (2011)
Published online 12 October 2010 (wileyonlinelibrary.com) DOI: 10.1002/ejsp.776
*Correspondence to: Lena Nadarevic, Department of Psychology, University of Mannheim, D-68131 Mannheim, Germany.
E-mail: nadarevic@psychologie.uni-mannheim.de
Copyright © 2010 John Wiley & Sons, Ltd. Received 4 March 2010, Accepted 19 August 2010
Automatically activated stimulus–valence associations (probability
AC), (2) the discriminability of the correct response
(probability D), (3) the ability to inhibit automatically
activated associations (‘‘overcoming bias’’, probability OB),
and (4) guessing (probability G). Specifically, the Quad Model
makes the following predictions (see Figure 1): When an IAT
stimulus is presented, a valence association is activated
automatically with probability AC that depends on the strength
of a stimulus–valence association. Regardless of whether an
association has been activated (probability AC) or not
(probability 1-AC), the IAT stimulus can be discriminated
(i.e., categorized) correctly with probability D. Because
discrimination is a controlled process, the size of D should
mainly depend on cognitive capacity. Whereas both association
activation and discrimination result in correct responses
in compatible IAT blocks, they interfere in incompatible
blocks. In the latter case, association activation is suppressed
with probability OB to select the correct response. Thus, OB
reflects ‘‘overcoming bias,’’ a controlled inhibition process.
When no association has been activated (probability 1-AC)
and there is no correct answer available (probability 1-D), the
response is determined by guessing. Whether this leads to a left
hand guess (probability G) or a right hand guess (probability 1-G)
depends on automatic response tendencies, strategic
guessing, or random influences.
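The branch structure just described can be sketched in a few lines of Python. This is only an illustration read off the verbal description above, not the model equations from the Appendix; the function name `quad_p_correct` and the example parameter values are hypothetical.

```python
def quad_p_correct(ac, d, ob, g, compatible, required_left):
    """Probability of a correct response under the Quad Model as described in the text.

    ac: association activation, d: discrimination, ob: overcoming bias,
    g: probability of a left-hand guess (all in [0, 1]).
    """
    # Guessing is correct only if the guessed hand matches the required hand.
    guess_correct = g if required_left else 1.0 - g

    if compatible:
        # Association and discrimination point to the same (correct) response.
        p_association_branch = ac
    else:
        # The association points to the wrong response; the answer is correct
        # only if discrimination succeeds AND the bias is overcome.
        p_association_branch = ac * d * ob

    p_discrimination_only = (1.0 - ac) * d
    p_guessing = (1.0 - ac) * (1.0 - d) * guess_correct
    return p_association_branch + p_discrimination_only + p_guessing


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for block in (True, False):
        p = quad_p_correct(0.3, 0.8, 0.5, 0.5, compatible=block, required_left=True)
        print("compatible" if block else "incompatible", round(p, 3))
```

With these made-up values, accuracy is lower in the incompatible block because the association branch contributes AC·D·OB instead of AC.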
One of the advantages the Quad Model shares with other
MPT models is the fact that it allows for model tests by means of
goodness-of-fit statistics. These statistics indicate whether the
observed data match the data that are predicted by the model’s
assumptions, treating the four unknown probabilities AC, D,
OB, and G as free parameters in the interval [0,1]. Additionally,
the Quad Model is capable of disentangling and quantifying the
contributions of its proposed four cognitive processes by means
of maximum likelihood (ML) parameter estimation. By now,
several studies have shown that the Quad Model provides
an excellent fit to IAT data (Conrey et al., 2005;
Gonsalkorale, Sherman, & Klauer, 2009; Sherman, Gawronski,
Gonsalkorale, Hugenberg, Allen, & Groom, 2008). Moreover,
these studies demonstrate that the Quad Model’s parameters
mirror experimental manipulations of study materials and
context conditions in a meaningful, psychologically plausible
way. For instance, D and OB proved to be substantially reduced
under time constraints, and G has been shown to be sensitive to
experimentally induced response biases.
However, despite all these advantages of the Quad Model,
the model has one serious drawback. Because MPT models are
models for categorical data, the Quad Model captures error
rates only and ignores response latencies entirely. This feature
of the Quad Model is of course problematic, particularly with
regard to IAT data. Because the IAT does not make use of response
deadlines, errors can easily be avoided by taking as much time as
needed to determine the correct response. Consequently, IAT
effect sizes based on error rates are usually much smaller than IAT
effect sizes based on response latencies (Steffens et al., 2004).
Considering this fact, MPT models seem more appropriate for
implicit attitude tasks that focus on error rates rather than
response latencies, for example, the GNAT (Nosek & Banaji,
2001). According to Rudolph, Schröder-Abé, Schütz, Gregg, and
Sedikides (2008), one main difference between the IAT and the
GNAT is that ‘‘whereas in the IAT accuracy is held constant and
reaction time varies, in the GNAT reaction time is held constant
and accuracy varies’’ (p. 279). Unlike the IAT, the GNAT is
capable of measuring implicit attitudes toward a single target
concept. Therefore, the GNAT employs a set of distractor items
instead of a second target concept. Furthermore, GNAT responses
are restricted by a response deadline. When performing
the GNAT, participants are required to respond to target stimuli
and positive words in one block and to target stimuli and negative
words in another block (‘‘go’’ stimuli) whereas other stimuli, the
distractors, have to be ignored (‘‘no-go’’ stimuli). Accuracy
differences between both blocks (usually assessed by means of
the signal detection sensitivity measure d′) indicate the implicit
attitude toward the target concept. Hence, because the GNAT
focuses on accuracy differences between blocks of trials, it is
clearly suitable for multinomial modeling approaches.
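For readers unfamiliar with the sensitivity measure d′ mentioned above, a minimal Python sketch follows. It uses the standard definition d′ = z(hit rate) − z(false-alarm rate). The function name, the example counts, and the +0.5 count correction (which keeps rates away from 0 and 1) are assumptions made for illustration and are not taken from the GNAT literature cited here.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Signal detection sensitivity: z(hit rate) - z(false-alarm rate)."""
    n_go = hits + misses
    n_nogo = false_alarms + correct_rejections
    # Assumed correction: add 0.5 to each count so that the rates never
    # equal exactly 0 or 1, where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (n_go + 1.0)
    fa_rate = (false_alarms + 0.5) / (n_nogo + 1.0)
    z = NormalDist().inv_cdf  # standard normal quantile function
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Hypothetical response counts for one compatible and one incompatible block.
    print("compatible  ", round(d_prime(18, 2, 3, 17), 2))
    print("incompatible", round(d_prime(15, 5, 6, 14), 2))
```

A larger d′ in one block than in the other would then be read as the accuracy difference that the GNAT uses as its attitude index.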
Consequently, Gonsalkorale, von Hippel, Sherman, and
Klauer (2009) recently proposed a generalized Quad Model to
analyze GNAT data. This generalized Quad Model differs from
the original Quad Model in a single aspect: Gonsalkorale, von
Hippel, et al. (2009) assume that the discrimination accuracy D
in the GNAT differs between (a) stimuli of the category that
serves as the go category throughout all GNAT blocks and (b)
stimuli of the other categories. More precisely, it is assumed that
because stimuli of the constant go category capture attention
continuously in every GNAT block, participants become better
at detecting these stimuli after a while. Therefore, the Quad
Model variant for the GNAT uses two separate D parameters
(one parameter D1 for the stimuli of the constant go category
and a different parameter D2 for all other stimuli). Note that
Conrey et al.’s (2005) original Quad Model can be derived from
Gonsalkorale, von Hippel et al.’s (2009) generalized Quad
Model by setting D1 = D2. Thus, the former model is a nested
submodel of the latter. In other words, the original Quad Model
can fit the data only if the generalized Quad Model does.
Apart from the additional assumption concerning D1 and D2,
Gonsalkorale, von Hippel, et al. (2009) do not believe that the
cognitive processes underlying the IAT and the GNAT differ
from each other. In other words, the Quad Model should be
applicable to both paradigms, either with a single D parameter
(IAT) or with two different D parameters (GNAT). The latter
hypothesis has so far only been tested in a single experiment.
This study revealed a good fit of the generalized Quad Model for
GNAT data (Gonsalkorale, von Hippel, et al., 2009).
THE TRIP MODEL
Despite the generalized Quad Model’s good fit to GNAT data
in the Gonsalkorale, von Hippel, et al. (2009) study, we doubt
that the IAT and the GNAT involve the same cognitive
processes. Our skepticism is based on empirical findings as
well as theoretical considerations.
Empirical studies have shown that correlations between IAT
and GNAT measures are rather low (Nosek & Banaji, 2001;
Rudolph et al., 2008). If both measures reflect the same
implicit attitude, these weak correlations can only be explained
by low reliability of the measures involved or by method
specific variance caused by different underlying cognitive
processes. The latter explanation is supported (a) by the finding
of Rudolph et al. (2008) that both the IAT and the GNAT
measures exhibit satisfactory to good levels of reliability and
(b) by the fact that divergent results between two-choice
procedures and go/no-go procedures have also been observed
for other psychological tasks, for example, the Simon task
(Ansorge & Wühr, 2004, 2008) and lexical-decision tasks
(Gomez, Ratcliff, & Perea, 2007).
From a theoretical perspective there is also reason to believe
that the IAT and the GNAT differ in their underlying cognitive
processes. There are two essential differences between the two
paradigms. First, the GNAT involves a (short) response
Figure 1. Structure and parameters of the Quad Model. Each branch of the model represents a possible sequence of cognitive processes
resulting in either a correct response (+) or an incorrect response (−). Rectangles represent observable stimuli, rounded rectangles indicate
non-observable cognitive processes. Parameters attached to the branches denote transition probabilities from left to right. The model refers to four
independent processing trees for the following four conditions: (1) compatible blocks, required response = left, (2) compatible blocks, required
response = right, (3) incompatible blocks, required response = left, and (4) incompatible blocks, required response = right. For the sake of
brevity, processing trees (1) and (2) are illustrated as a single tree that differs in the responses assigned to the branches only (separately for ‘‘left
key’’ and ‘‘right key’’ trials). The same holds for processing trees (3) and (4).
deadline whereas the IAT does not. Therefore, we believe that
a controlled overcoming bias process is not involved in GNAT
performance simply because participants do not have enough
time to overcome bias when performing a GNAT. Second,
when performing a GNAT, participants have to respond to go
stimuli and to withhold responses to no-go stimuli. In contrast,
when performing an IAT, participants have to respond to all
stimuli. Research on go/no-go tasks has demonstrated that it is
more difficult to withhold responses to no-go stimuli than to
respond to go stimuli. For instance, Nieuwenhuis, Yeung, van
den Wildenberg, and Ridderinkhof (2003) observed 8.3% false
alarms (responses to no-go stimuli) in comparison to only
0.6% misses (omissions to go stimuli) in a go/no-go task with
an equal number of go and no-go stimuli. Consequently, we
assume that GNAT performance is more strongly influenced by
general response biases than IAT performance is.
Based on these two major differences between the IAT and
the GNAT, we designed a new MPT model of GNAT performance
which complies with the task specific features of the GNAT. This
model, called the Trip Model, is illustrated in Figure 2. The Trip
Model differs from the Quad Model in the following manner.
First, the Trip Model does not include an overcoming bias
process. Because GNATs typically involve relatively short
response deadlines, it is very unlikely that people are able to
overcome bias when performing a typical GNAT. Hence, in
contrast to the Quad Model, the Trip Model uses a triplet of
parameters only: AC, D, and G.
Second, whereas AC and D are assumed to measure
independent processes in the Quad Model, the Trip Model
assumes that association activation and discrimination reflect
strongly correlated processes, with block type moderating the
sign of this correlation. In other words, the model predicts that,
depending on block type, an automatically activated association
either strongly facilitates or impairs discrimination.
Regarding the compatible GNAT block, the Trip Model
assumes that associations already suggest the correct response
and thus facilitate discrimination. Consequently, discrimination
should be perfect in the compatible block if an
automatic association has been activated. In contrast,
associations in incompatible GNAT blocks counteract the
discrimination process because they suggest the incorrect
response. Thus, if an automatic association has been activated
in the incompatible GNAT block, participants should be unable
to determine the correct response in time.
Third, whereas the Quad Model assumes that response
biases can drive responses only when no association is activated,
the Trip Model assumes that response biases always determine
responses when people are unable to detect the correct response
(irrespective of whether an association has been activated or not).
With this assumption the Trip Model tries to account for the
strong impact of response activation and inhibition in go/no-go
tasks (e.g., Johnstone, Pleffer, Barry, Clarke, & Smith, 2005). If
response activation is dominant, participants should show a
response bias toward pressing the go-key (probability G). In
contrast, if response inhibition is dominant they should show a
bias toward not responding (probability 1-G).
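The three assumptions above can be turned into a small Python sketch of the predicted accuracy in each GNAT condition. It is derived from the verbal description only, uses a single AC parameter for simplicity, and is not a substitute for the model equations given in the Appendix; the function name `trip_p_correct` and the parameter values are hypothetical.

```python
def trip_p_correct(ac, d, g, compatible, required_go):
    """Probability of a correct response under the Trip Model as described in the text.

    ac: association activation, d: discrimination,
    g: response bias toward pressing the go key (all in [0, 1]).
    """
    # A bias-driven response is correct only if it matches the required response.
    bias_correct = g if required_go else 1.0 - g

    if compatible:
        # An activated association makes discrimination perfect.
        p_association_branch = ac
    else:
        # An activated association prevents timely discrimination, so the
        # response bias decides (no overcoming-bias process in this model).
        p_association_branch = ac * bias_correct

    p_discrimination_only = (1.0 - ac) * d
    p_bias_only = (1.0 - ac) * (1.0 - d) * bias_correct
    return p_association_branch + p_discrimination_only + p_bias_only


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for compatible in (True, False):
        for required_go in (True, False):
            p = trip_p_correct(0.3, 0.8, 0.6, compatible, required_go)
            print(compatible, required_go, round(p, 3))
```

With G above 0.5, the sketch predicts more errors on incompatible no-go trials than on incompatible go trials, which is the false-alarm asymmetry the response bias parameter is meant to capture.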
Note that the Trip Model shares some features with the
ABC Model (Stahl & Degner, 2007), another MPT model that
has recently been developed for analyzing data of the EAST
(De Houwer, 2003). Both the Trip Model and the ABC Model
include an association activation parameter, a response bias
parameter, and a controlled discrimination parameter, whereas
they omit an overcoming bias parameter. Moreover, in contrast
to the Quad Model, neither model conceives of association
activation and discriminability as independent processes.
However, unlike the ABC Model, the Trip Model assumes that
response bias determines responses in the incompatible block
whenever people are unable to detect the correct response. This
difference is due to the fact that the models have been
developed to comply with specific features of two different
tasks—the GNAT and the EAST, respectively.
In this article, we assess two different multinomial models
of GNAT performance, namely the Trip Model and the
generalized Quad Model. In particular, we evaluate the
models’ fit for different paradigms by applying them to two
different GNAT variants and to the IAT (Experiment 1). In
addition, we assess the validity of the Trip Model’s parameters,
that is, the AC parameter (in Experiment 2), the G parameter,
and the D parameter (both of them in Experiment 3).
EXPERIMENT 1
The goal of Experiment 1 was to test the Trip Model and the
generalized Quad Model of GNAT performance against each
other. Experiment 1 used a standard IAT and two different
GNAT variants. The IAT and the two GNAT variants assessed
flower–positive and insect–negative associations. Therefore,
‘‘flowers’’ and ‘‘insects’’ were used as target concepts and
‘‘positive’’ and ‘‘negative’’ as attribute concepts. The two
GNAT variants differed in their go categories only. One variant
was a standard GNAT that applied the target concepts
‘‘flowers’’ (or ‘‘insects’’, respectively) as go concepts
throughout all GNAT blocks. The other GNAT variant applied
the attribute concepts ‘‘positive’’ (or ‘‘negative’’, respectively)
as go concepts throughout all blocks. In line with these
differences, we refer to the former GNAT variant as Target
GNAT and to the latter as Attribute GNAT.
Based on the assumption that the IAT and the GNAT differ
in their underlying cognitive processes, we made the following
predictions. (1) Regarding the Quad Model and the generalized
Quad Model, we expected a good model fit for IAT data, as has
previously been found by Conrey et al. (2005) and Sherman
et al. (2008). In contrast, we expected that the fit of the Quad
Model would be worse for the Target GNAT and Attribute
GNAT data. (2) Regarding the Trip Model, we predicted the
reverse pattern of results. That is, we expected a good Trip
Model fit for both GNAT variants. Primarily because an
overcoming bias process is not taken into account in the Trip
Model, we did not expect a good model fit for IAT data.
Method
Participants
Sixty-one students of the University of Mannheim participated
in the experiment. One non-native German speaker had to be
excluded from analysis because of difficulties in language
processing. The age of the remaining 60 participants ranged
from 19 to 40 years (M = 23.35, SD = 3.56). Fifteen
participants were male and 45 were female.
Materials
Stimulus material consisted of 60 German nouns (15 flower
names, 15 insect names, 15 positive words, and 15 negative
words; see Appendix).
Design
Task type was manipulated between participants. Each
participant had to accomplish four Target GNATs, four
Attribute GNATs, or four IATs in a row. In the Target GNAT
Figure 2. Structure and parameters of the Trip Model. Each branch of the model represents a possible sequence of cognitive processes
resulting in either a correct response (+) or an incorrect response (−). Rectangles represent observable stimuli, rounded rectangles indicate
non-observable cognitive processes. Parameters attached to the branches denote transition probabilities from left to right. The model refers to four
independent processing trees for the following four conditions: (1) compatible blocks, required response = go, (2) compatible blocks, required
response = no-go, (3) incompatible blocks, required response = go, and (4) incompatible blocks, required response = no-go. For the sake of
brevity, processing trees (1) and (2) are illustrated as a single tree that differs in the responses assigned to the branches only (separately for go and
no-go trials). The same holds for processing trees (3) and (4).
condition, the target concept switched between the four
GNATs, with the order (Flower GNAT first vs. Insect GNAT
first) counter-balanced across participants. Likewise, for the
Attribute GNAT and the IAT, the attribute assignment changed
alternately, and the order of attribute assignment (positive
GNAT first vs. negative GNAT first) was counter-balanced.
Participants were randomly assigned to the experimental
conditions.
Procedure
Up to four participants simultaneously performed the
experiment in experimental cubicles on standard PCs. To
familiarize participants with their task, the experiment started
with three warm-up blocks that consisted of a target
discrimination task (20 trials), an attribute discrimination
task (20 trials), and a combined task with response compatible
concept assignment (20 trials). The concept-key assignment of
the warm-up blocks was opposite to the concept-key assignment
of the first actual task block. For example, when the first
GNAT was a Flower GNAT, participants had to identify insects
and negative words in the warm-up blocks.
Following the standard IAT procedure (e.g., Greenwald,
Nosek, & Banaji, 2003), each of the subsequent GNATs or
IATs contained seven blocks. The first four blocks consisted of
a target discrimination task (e.g., ‘‘respond to flowers and
ignore all other stimuli’’; 20 trials), an attribute discrimination
task (e.g., ‘‘respond to positive words and ignore all other
stimuli’’; 20 trials), a compatible combined practice block
(e.g., ‘‘respond to flowers and positive words and ignore all
other stimuli’’; 20 trials), and a compatible combined test
block (40 trials) with identical task instructions. The fifth block
(20 trials) differed between the experimental paradigms.
Participants performing the Target GNAT had to accomplish a
reversed attribute discrimination task (e.g., ‘‘respond to
negative words and ignore all other stimuli’’). Because
valence assignment was not supposed to change for the other
two paradigms, the fifth block consisted of a reversed target
discrimination block for the IAT and the Attribute GNAT (e.g.,
‘‘respond to insects and ignore all other stimuli’’). In the final
two blocks, participants had to accomplish an incompatible
combined practice block (e.g., ‘‘respond to insects and positive
stimuli’’; 20 trials) and an incompatible combined test block
(40 trials) with identical task instructions. In each GNAT
block, the same number of go stimuli and no-go stimuli was
presented. Likewise, the IAT presented equal numbers of
stimuli assigned to the left response key and stimuli assigned to
the right response key.
For the Target GNATs and the Attribute GNATs, every
block started with new instructions introducing the go concepts
during that block. To remind participants of these go concepts,
category labels were displayed in the upper corners of the screen
throughout all events of the block. When participants pressed ‘‘enter’’,
the instructions disappeared and a fixation mark remained
in the center of the screen for 500 milliseconds. Subsequently, a
black screen was shown for 200 milliseconds until the first
stimulus was presented. Stimuli were randomly sampled and
presented in the center of the screen. To facilitate discrimination,
targets were presented in white letters whereas attributes were
presented in light blue letters against a black background. Each
stimulus remained on the screen until the participant responded
by pressing the space bar or until the response deadline ran out
(900 milliseconds). When the stimulus disappeared, a green
circle (correct response) or a red cross (incorrect response) was
displayed for 200 milliseconds. This feedback slide was followed
by a 200-millisecond black screen preceding the next stimulus.
For the IAT, the procedure was almost identical. The only
difference was that response time was not restricted and that ‘‘d’’
served as left response key and ‘‘l’’ served as right response key.
Task completion took about 15–20 minutes. At the end of the
experiment, participants were fully debriefed and thanked.
Results
As usual in IAT and GNAT applications, only data from the
combined test blocks were analyzed.
Analysis of Error Rates
Performance was significantly better in the compatible blocks
than in the incompatible blocks for the Target GNAT,
t(19) = 6.23, p < .001, d_z = 1.39, the Attribute GNAT,
t(19) = 7.37, p < .001, d_z = 1.65, and the IAT, t(19) = 3.70,
p = .002, d_z = 0.83. Mean error rates are displayed in Table I.
As expected, error rates differed only slightly between the
compatible and the incompatible IAT blocks. In comparison,
performance differences between these blocks were much
more pronounced for the two GNAT variants.
Model-based Analyses
The computer program multiTree (Moshagen, 2010) was
employed for testing the Trip Model and the Quad Model. The
model specifications for both models are presented in the
Appendix. The equation files and the corresponding data files
for the following analyses can be obtained from the first author.
Goodness of fit of the different models was evaluated by means
of the log-likelihood ratio statistic G², which is asymptotically
χ²-distributed if the model holds (Read & Cressie, 1988), with
degrees of freedom (df) corresponding to the difference
between the number of independent data categories and the
number of estimated parameters.
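To make the goodness-of-fit logic concrete, the following Python sketch computes G² from observed and model-expected frequencies and converts a G² value into a p-value. It is a stand-alone illustration, not the multiTree implementation; the function names are hypothetical, and the closed-form χ² tail probability used here is valid only for even df, an assumption that happens to cover the df = 2, 4, and 8 tests reported in this article.

```python
import math


def g_squared(observed, expected):
    """Log-likelihood ratio statistic: G^2 = 2 * sum(obs * ln(obs / exp))."""
    # Categories with zero observed frequency contribute nothing to the sum.
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square distribution (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Sum (x/2)^i / i! for i = 0 .. df/2 - 1, then scale by exp(-x/2).
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # G^2 values and df as reported for the Experiment 1 model tests.
    for label, g2, df in [("Trip, Target GNAT", 6.39, 4),
                          ("Trip, Attribute GNAT", 0.35, 4),
                          ("Gen. Quad, Target GNAT", 14.21, 8)]:
        print(label, round(chi2_sf_even_df(g2, df), 2))
```

Rounded to two decimals, the tail probabilities for these three reported statistics agree with the p-values given in the Results (.17, .99, and .08).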
To determine the sensitivity of our goodness-of-fit tests for
model violations, we ran a power analysis using G*Power 3.1
(Faul, Erdfelder, Buchner, & Lang, 2009). In doing so, we
Table I. Mean error rates for the different experimental groups of Experiments 1, 2, and 3
Experiment  Group               Compatible blocks (%)  Incompatible blocks (%)
1           Target GNAT          6.87                  13.84
1           Attribute GNAT       6.78                  14.81
1           IAT                  6.75                   9.88
2           Pro-stereotype       8.04                  17.32
2           Counter-stereotype  10.67                  16.36
3           75% go stimuli      14.61                  21.04
3           25% go stimuli       6.81                  15.50
applied the following default values. First, for the sake of
comparability between studies, we used the conventional
significance level of α = .05 in all of the following analyses.
Second, in order to detect possible deviations from the models
reliably, we set the target power of the model test at 1 − β = .99.
Finally, to be as conservative as possible, we assumed N = 3200
observations, which is the smallest number of observations for a
model test within this paper (i.e., in Experiment 3). The results of
the G*Power sensitivity analysis revealed that, given these
specifications, the following goodness-of-fit tests are powerful
enough to detect deviations from the Trip Model of size w = 0.09
and deviations from both the Quad Model and the generalized
Quad Model of size w = 0.10. According to Cohen (1988, p.
227), effects of size w = 0.10 can be considered ‘‘small.’’ Thus,
given the present sample sizes and significance levels, our
goodness-of-fit tests are powerful enough to detect even small
model violations reliably.
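The sensitivity analysis can be approximated with a short Python sketch. It treats the test statistic under a model violation as noncentral χ²-distributed with noncentrality λ = N·w², and evaluates the power as a Poisson-weighted mixture of central χ² tail probabilities. This is not G*Power's code: the function names are hypothetical, the closed-form tail probability requires even df, and df = 4 is an assumption borrowed from the Trip Model tests of Experiment 1.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a central chi-square distribution (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def chi2_critical_even_df(alpha, df):
    """Critical value with upper-tail probability alpha, found by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def gof_power(w, n, df, alpha=0.05, max_terms=200):
    """Power of a chi-square goodness-of-fit test for effect size w (even df only)."""
    lam = n * w ** 2  # noncentrality parameter
    crit = chi2_critical_even_df(alpha, df)
    power = 0.0
    weight = math.exp(-lam / 2.0)  # Poisson(lam / 2) probability of j = 0
    for j in range(max_terms):
        power += weight * chi2_sf_even_df(crit, df + 2 * j)
        weight *= (lam / 2.0) / (j + 1)
    return power


if __name__ == "__main__":
    # N and w as reported in the text; df = 4 is an assumption.
    print(gof_power(w=0.09, n=3200, df=4))
```

Under these assumptions the computed power should land close to the .99 target; small differences from G*Power's output are to be expected because the reported w is rounded.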
Trip Model Analyses
When testing the Trip Model, we observed a good model fit for
the Target GNAT, G²(4) = 6.39, p = .17, as well as for the
Attribute GNAT, G²(4) = 0.35, p = .99. The four Trip Model
parameter estimates (AC1, AC2, G, and D) did not
significantly differ between the two GNAT versions,
ΔG²(4) = 6.10, p = .19 (see Figure 3). Parameter estimates
showed flower–positive associations (AC1) significantly larger
than zero, ΔG²(2) = 100.50, p < .001, and significant insect–negative
associations (AC2), ΔG²(2) = 132.34, p < .001. D
parameter estimates were relatively high, indicating that
people were quite good at determining the correct response.
Furthermore, G parameter estimates suggest a general
response bias toward pressing the go-key. More precisely,
despite an equal number of go and no-go stimuli, G parameters
were significantly larger than 0.50, ΔG²(2) = 129.37, p < .001.
This observation is in line with the common finding that false
alarms (responses to no-go stimuli) are more frequent than
misses (omissions to go stimuli) in go/no-go tasks (e.g.,
Falkenstein, Hoormann, & Hohnsbein, 1999; Menon, Adleman,
White, Glover, & Reiss, 2001; Nieuwenhuis et al., 2003).
The implications of the Trip Model findings are twofold.
First, the Trip Model seems to be an appropriate model of the
cognitive processes underlying GNAT performance. Second,
the underlying cognitive processes of the GNAT do not depend
on whether an attribute concept or a target concept is used as
the constant go category. In contrast, when IAT data were
analyzed with the Trip Model, the goodness-of-fit test
indicated only a marginal model fit, G²(4) = 9.29, p = .05.
Parameter estimates of the Trip Model for the IAT data are also
displayed in Figure 3. As can be seen, the AC parameters
were smaller for the IAT compared to the GNAT. This is
consistent with the idea that the IAT tends to underestimate
automatic associations. However, given the marginal fit of the
Trip Model, this result should be interpreted with caution.
Quad Model Analyses
The Quad Model was tested in two variants. The generalized
variant of Gonsalkorale, von Hippel, et al. (2009) involving
two separate D parameters was used for the GNAT data
analyses whereas the classical variant with a single D
parameter was used for the IAT data analysis.
When analyzing the GNAT data, the fit of the generalized
Quad Model was not as good as the one that had been observed
with the Trip Model. More precisely, the generalized Quad
Model only fit the data of the Target GNAT, G²(8) = 14.21,
p = .08, but failed to fit the data of the Attribute GNAT,
G²(8) = 31.73, p < .001.¹ As explained above, misfit of the
generalized Quad Model implies misfit of the classical Quad
Model involving only a single D parameter. Hence, neither of
the two Quad Model variants can account for the Attribute
GNAT data of Experiment 1. Interestingly, Quad Model
parameter estimates for the two GNAT variants were similar to
the Trip Model parameter estimates. AC parameters differed
significantly from zero, ΔG²(4) = 189.49, p < .001, D
parameters were rather high, and G parameters indicated a
significant response bias, ΔG²(2) = 67.57, p < .001 (see
Figure 4). Moreover, parameter estimates for D1 and D2
did not significantly differ from each other, ΔG²(2) = 2.81,
p = .25, and the OB parameters did not differ significantly
from zero, ΔG²(2) = 1.44, p = .49.
When analyzing the IAT data with the Quad Model, the
goodness-of-fit test indicated a good model fit, G²(7) = 10.93,
p = .14. This result replicates previous findings (Conrey et al.,
2005; Sherman et al., 2008) and corroborates the Quad Model
as an appropriate model of IAT performance. Parameter
estimates of the Quad Model for both the GNAT and the IAT
data are displayed in Figure 4. Again, the AC parameters were
lower for IAT as compared to GNAT data.
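The p-values attached to the G² statistics above can be checked by hand, because G² is referred to a chi-square distribution with the stated degrees of freedom. The following is a minimal sketch using only the Python standard library; the function name `chi2_sf` is an illustrative choice and is not taken from the article or from any MPT software.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^r / r!.
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite correction sum.
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    correction = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + correction


# G² values and degrees of freedom as reported for Experiment 1.
reported = [
    ("Trip Model, IAT", 9.29, 4),
    ("Generalized Quad Model, Target GNAT", 14.21, 8),
    ("Quad Model, IAT", 10.93, 7),
]
for label, g2, df in reported:
    print(f"{label}: G2({df}) = {g2}, p = {chi2_sf(g2, df):.3f}")
```

Rounded to two decimals, these values agree with the reported p = .05, p = .08, and p = .14.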
Discussion
As predicted, the Quad Model fit the IAT data better than the
GNAT data whereas the Trip Model showed a better model fit
to the GNAT data. These findings (a) support our assumption
that the cognitive processes underlying the GNAT differ from
those underlying the IAT and (b) favor the Trip Model as a
measurement model of GNAT performance.
EXPERIMENT 2
The goal of Experiment 2 was to investigate the validity of the
Trip Model in further detail. More precisely, we wanted (1) to
replicate the Trip Model’s fit using independent GNAT data
and (2) to assess the construct validity of the Trip Model’s AC
parameter. Because the AC parameter is supposed to reflect
automatic associations, it is probably the most interesting
parameter of the Trip Model for researchers and practitioners.
¹ To check the possibility that the generalized Quad Model’s misfit to the data of
the Attribute GNAT was influenced by the unusually large number of GNAT
trials, we repeated the analysis separately for the first half of trials and for the
second half of trials. However, the generalized Quad Model still failed to fit the
data of the Attribute GNAT, G²(8) ≥ 20.49, p = .01.
Copyright © 2010 John Wiley & Sons, Ltd. Eur. J. Soc. Psychol. 41, 254–268 (2011)
One method to manipulate automatic associations is to
confront participants with a short scenario before performing
an implicit attitude task. For instance, Foroni and Mayr (2005)
used apocalyptic scenarios that either described insects as
radioactively contaminated and flowers as indispensable to life
(pro-stereotype scenario) or flowers as radioactively contaminated
and insects as indispensable to life (counter-stereotype
scenario). They found that these scenarios affected GNAT
performance significantly. After presentation of the pro-stereotype
scenario, participants exhibited highly reliable
GNAT effects, indicating significant flower–positive and
insect–negative associations. However, when the counter-stereotype
scenario had been presented, the GNAT effects did
not significantly differ from zero.
In Experiment 2, we used the manipulation of Foroni and
Mayr (2005) to test the validity of the Trip Model’s AC
parameter. We hypothesized that reading the pro-stereotype
scenario would enhance the accessibility of flower–positive
and insect–negative associations whereas reading the counter-
stereotype scenario would decrease the accessibility of these
associations. Therefore, we predicted that the AC parameters
would be significantly higher in the pro-stereotype condition
than in the counter-stereotype condition. Moreover, we
assumed that the manipulation should not affect response
biases (parameter G). Concerning the D parameter, we had no
specific prediction. Although we did not intend to manipulate
the D parameter in a specific direction with the presented
scenarios, we considered it to be plausible that the scenario
manipulation affects discrimination performance. For
instance, people who hold a negative attitude toward insects
are forced in the counter-stereotypical condition to think of
insects as essential for surviving. Keeping this new conflicting
association in mind might require additional cognitive
resources and thus might reduce discrimination performance.
Figure 3. Parameter estimates for the Trip Model, separately for the Target GNAT, the Attribute GNAT, and the IAT data of Experiment 1.
Error bars represent 95% confidence intervals.
Figure 4. Parameter estimates for the Quad Model, separately for the Target GNAT, the Attribute GNAT, and the IAT data of Experiment 1.
Error bars represent 95% confidence intervals.
Method
Participants
Fifty-seven University of Mannheim psychology undergraduates
participated in the experiment. The age of the participants
ranged from 18 to 38 years (M = 21.98, SD = 4.67). Fifteen
participants were male and 42 were female.
Materials
The presented scenarios were taken from Foroni and Mayr
(2005) and translated into German. The GNAT stimuli were
the same as used in Experiment 1. However, in contrast to
Experiment 1, only Target GNATs (i.e., Flower GNATs and
Insect GNATs) were used.
Design
The presented scenarios were manipulated between participants.
Half of the participants read a pro-stereotype scenario
and the other half read a counter-stereotype scenario before
performing two GNATs. One GNAT assessed flower–positive
associations and the other one assessed insect–negative
associations. The order of the two GNATs (Flower GNAT
vs. Insect GNAT first) and the order of combined blocks within
each GNAT (compatible block vs. incompatible block first)
were counter-balanced across participants. Participants were
assigned randomly to the experimental conditions.
Procedure
Up to six participants simultaneously performed the experiment
in experimental cubicles on standard PCs. The first three
task blocks were warm-up blocks. Following these warm-up
blocks, either the pro-stereotype scenario or the counter-
stereotype scenario was presented on the computer screen.
Participants read the scenario before completing two GNATs.
To keep the procedure as short as possible, each GNAT
consisted of only five instead of seven blocks: A target
discrimination block (20 trials), an attribute discrimination
block (20 trials), a compatible combined block (80 trials), a
reversed attribute discrimination block (20 trials), and an
incompatible combined block (80 trials). A reminder of the
scenario preceded each combined block. The settings for the
single blocks and response trials were the same as in
Experiment 1.
Task completion took about 15 minutes. At the end of the
experiment, participants were fully debriefed and thanked.
Results
Analysis of Error Rates
In both experimental groups, participants made significantly
more errors in the incompatible blocks than in the
compatible blocks (pro-stereotype condition: t(27) = 7.42,
p < .001, d_z = 1.40; counter-stereotype condition: t(28) = 4.52,
p < .001, d_z = 0.84). The effectiveness of the scenario manipulation
was apparent in the smaller GNAT effect size of the
counter-stereotype condition compared to the pro-stereotype
condition (as indicated by Cohen’s d_z). Mean error rates for both
conditions are displayed in Table I.
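The reported effect sizes can be related to the t statistics with a short calculation. The sketch below assumes that d_z was obtained as t divided by the square root of the number of participants (n = df + 1), the usual definition for within-subject comparisons; the article does not spell out the formula at this point, and the function name `cohens_dz` is illustrative.

```python
import math


def cohens_dz(t: float, df: int) -> float:
    """Cohen's d_z for a paired comparison, assuming d_z = t / sqrt(n) with n = df + 1."""
    n = df + 1
    return t / math.sqrt(n)


# t statistics as reported for the two scenario conditions of Experiment 2.
for label, t, df in [("pro-stereotype", 7.42, 27), ("counter-stereotype", 4.52, 28)]:
    print(f"{label}: d_z = {cohens_dz(t, df):.2f}")
```

Under this assumption the calculation gives 1.40 and 0.84, matching the values reported above.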
Model-based Analyses
When analyzing the GNAT data with the Trip Model, the
model fit the data both in the counter-stereotype condition,
G²(4) = 6.63, p = .16, and in the pro-stereotype condition,
G²(4) = 9.12, p = .06. The Trip Model’s parameter estimates
for the two groups are displayed in Figure 5. As predicted, the
AC parameters reflecting the flower–positive association
(AC1) and the insect–negative association (AC2) were smaller
in the counter-stereotype condition than in the pro-stereotype
condition. This difference in the AC parameters was
statistically significant, ΔG²(2) = 13.51, p = .001. Replicating
Experiment 1, G parameter estimates were significantly larger
than 0.50, ΔG²(2) = 121.32, p < .001, but did not differ
between the two experimental conditions, ΔG²(1) = 1.57,
p = .21. Thus, as expected, the scenario manipulation did not
affect the G parameter. In contrast, there was a numerically
small but statistically significant scenario effect on the D
parameter, ΔG²(1) = 13.38, p < .001, indicating lower
discrimination ability in the counter-stereotype condition.
This finding is consistent with our conjecture that cognitive capacity in
the counter-stereotype group might be reduced. In contrast to
participants in the pro-stereotype condition, participants in the
counter-stereotype condition might be more distracted when
performing the GNAT because they are forced to keep new
associations in mind that contradict their intrinsic automatic
associations.
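The ΔG² statistics used for these parameter tests follow the usual logic of nested model comparisons: a restricted model (e.g., AC parameters set equal across conditions) is compared with the unrestricted model, and the difference in G² is referred to a chi-square distribution with df equal to the number of restrictions. The sketch below illustrates this under two assumptions that are not stated in the article: that the baseline is the joint fit of both groups (the sum of the two group-wise G² values), and that the restricted model's G² equals this baseline plus the reported ΔG². The restricted value is therefore back-computed for illustration only, and the function names are hypothetical.

```python
import math


def delta_g2(g2_restricted: float, df_restricted: int,
             g2_full: float, df_full: int) -> tuple[float, int]:
    """Return (delta G², delta df) for a restricted model nested in a full model."""
    return g2_restricted - g2_full, df_restricted - df_full


def p_from_delta(delta: float, ddf: int) -> float:
    """Upper-tail chi-square p-value; closed forms for 1 or 2 df only."""
    if ddf == 1:
        return math.erfc(math.sqrt(delta / 2.0))
    if ddf == 2:
        return math.exp(-delta / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


# Joint baseline: sum of the two reported group-wise Trip Model fits (assumption).
g2_full, df_full = 6.63 + 9.12, 4 + 4
# Back-computed, illustrative value; NOT reported in the article.
g2_restricted, df_restricted = g2_full + 13.51, df_full + 2

delta, ddf = delta_g2(g2_restricted, df_restricted, g2_full, df_full)
print(f"delta G2({ddf}) = {delta:.2f}, p = {p_from_delta(delta, ddf):.4f}")
```

The same helper applied to the reported ΔG²(1) = 1.57 gives a p-value of about .21, in line with the test of the G parameter across conditions.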
The GNAT data were also analyzed with the generalized
Quad Model of Gonsalkorale, von Hippel, et al. (2009).
However, this model clearly failed to fit the data in
both conditions (pro-stereotype condition: G²(8) = 58.71,
p < .001; counter-stereotype condition: G²(8) = 43.56,
p < .001). Therefore, parameter estimates of the Quad Model
are not reported.
Discussion
Experiment 2 successfully replicated the major findings of
Experiment 1. The Trip Model fit the GNAT data in both
experimental conditions whereas the generalized Quad Model
did not fit the data in either condition. Furthermore, by using
short scenarios influencing the accessibility of stereotypical
associations, we successfully tested the validity of the Trip
Model’s AC parameter. After reading a counter-stereotypical
scenario, the AC parameters of the Trip Model were
significantly reduced compared with those obtained after a pro-stereotypical scenario.
Thus, the Trip Model’s AC parameter indeed reflects cognitive
associations.
EXPERIMENT 3
The primary goal of Experiment 3 was to complete the series
of validation studies by testing the construct validity of the Trip
Model’s G and D parameters. Because the G parameter is
supposed to reflect response biases toward pressing the go key,
we manipulated response bias by implementing GNATs with
different base rates of go and no-go stimuli. We expected that
G would exceed 0.50 and fall below 0.50 for high and low
proportions of go-stimuli, respectively. Moreover, in order to
assess the validity of the D parameter, which is supposed to
reflect a controlled discrimination process, we implemented
GNATs with different response deadlines. We predicted that
shorter response deadlines impair discriminability and thus
should result in smaller D parameter estimates. Because
associations should not be affected by design features of the
GNAT, such as base rates and response deadlines, we expected
that the AC parameters would not differ between the different
response bias and response deadline conditions.
Method
Participants
Forty-two University of Mannheim psychology undergraduates
participated in the experiment. One participant had to be
excluded from the analysis because of an extremely high error
rate (75%) in one GNAT block. The age of the remaining 41
participants ranged from 19 to 47 years (M = 23.54,
SD = 5.98). Nine participants were male and 32 were female.
Materials
GNAT stimuli were the same as used in Experiments 1 and 2.
Design
The ratio of go to no-go stimuli was manipulated between
participants. Half of the participants processed 75% go and
25% no-go stimuli. This ratio was inverted for the other half of
the participants. Each participant performed four GNATs, the
first two with a response deadline of 700 milliseconds and
the other two with a response deadline of 900 milliseconds, or
vice versa. Thus, response deadline was a within-subject
factor, with the order of deadlines (700 milliseconds vs.
900 milliseconds deadline first) counter-balanced across
participants. Two of the four GNATs assessed flower–positive
associations, whereas the other two assessed insect–negative
associations. Flower and Insect GNATs were presented in an
alternating order. In doing so, the sequence of the different
GNATs (Flower GNAT first vs. Insect GNAT first) was also
counter-balanced across participants. Participants were randomly
assigned to the experimental groups.
Procedure
Up to four persons simultaneously performed the tasks in
experimental cubicles on standard PCs. Again, the first three
blocks were warm-up blocks. The response deadline of these
warm-up blocks matched that of the first actual GNAT.
Subsequently, participants completed four GNATs, that is,
one GNAT for each combination of target concept (flower vs.
insect) and response deadline (700 milliseconds vs. 900 milliseconds).
Each GNAT consisted of five blocks: a target
discrimination block (20 trials), an attribute discrimination
block (20 trials), a compatible combined block (40 trials), a
reversed attribute discrimination block (20 trials), and an
incompatible combined block (40 trials). The order of the
compatible and the incompatible blocks was randomized.
The settings for the single blocks and response trials were
identical to those in Experiments 1 and 2, except for the response
deadlines (700 milliseconds for one half of the GNATs and
900 milliseconds for the other half of the GNATs) and the
proportions of go stimuli (75% vs. 25% for the two groups).
Figure 5. Parameter estimates for the Trip Model, separately for the GNAT data of the pro-stereotype condition and of the counter-stereotype
condition of Experiment 2. Error bars represent 95% confidence intervals.
The experiment took about 15–20 minutes. After completion,
participants were fully debriefed and thanked.
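The block structure described in this procedure can be written down as data, which makes the trial counts easy to verify. The following sketch is illustrative only: the names `Block`, `GNAT_BLOCKS`, and `go_trials` are hypothetical, and it assumes that the stated go/no-go ratio applies within every block, which the text does not state explicitly. The list order is the order of description; in the experiment the order of the two combined blocks was randomized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    trials: int


# Five blocks per GNAT with the trial counts given for Experiment 3.
GNAT_BLOCKS = [
    Block("target discrimination", 20),
    Block("attribute discrimination", 20),
    Block("compatible combined", 40),
    Block("reversed attribute discrimination", 20),
    Block("incompatible combined", 40),
]


def go_trials(block: Block, go_ratio: float) -> int:
    """Number of go trials in a block, assuming the ratio holds within each block."""
    return round(block.trials * go_ratio)


trials_per_gnat = sum(block.trials for block in GNAT_BLOCKS)
total_trials = 4 * trials_per_gnat  # four GNATs per participant

print("trials per GNAT:", trials_per_gnat)
print("trials across four GNATs:", total_trials)
for ratio in (0.75, 0.25):
    counts = [go_trials(block, ratio) for block in GNAT_BLOCKS]
    print(f"go trials per block at {ratio:.0%} go stimuli:", counts)
```

Under this reading, a combined block of 40 trials contains 30 go trials in the 75% group and 10 go trials in the 25% group.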
Results and Discussion
Analysis of Error Rates
The typical GNAT effect (more errors in the incompatible
blocks than in the compatible blocks) was observed for
participants who were exposed to 75% go stimuli, t(20) = 5.88,
p < .001, d_z = 1.28, and also for participants who were exposed
to 25% go stimuli, t(19) = 7.41, p < .001, d_z = 1.66. Mean
error rates are summarized in Table I. Surprisingly, error rates
were much higher in the 75% go stimuli group than in the 25%
go stimuli group. Although this outcome was unexpected, it
can be explained in terms of interference between responding
and visual encoding (e.g., Danielmeier, Zysset, Müsseler, &
von Cramon, 2004; Koch & Prinz, 2005; Müsseler & Wühr,
2002). For instance, Müsseler and Wühr (2002) used a
dual-task paradigm including a simple go/no-go keypress task and a
visual identification task. They found that identification
performance was better in no-go trials than in go trials. Thus,
given this finding, it is plausible that participants in the 75% go
stimuli group were more strongly impaired in stimulus
discrimination simply because they had to respond more
frequently. If this explanation is correct, D parameters should
turn out to be significantly lower in the 75% go stimuli
condition compared to the 25% go stimuli condition. We will
assess this prediction below.
Model-based Analyses
When analyzing the GNAT data with the Trip Model, we
obtained a good model fit for three of the four experimental
conditions, G²(4) ≤ 4.60, p ≥ .33. No satisfactory fit was
observed for the 75% go stimuli group under the 700 milliseconds
response deadline condition, G²(4) = 13.91, p = .01.
However, given the high statistical power of our goodness-of-
fit tests, this slight discrepancy between model and data is
presumably due to small and unsystematic model deviations.
This interpretation is supported by the fact that (a) this is the
only marginal misfit observed in eight model tests based on
three experiments and (b) despite the marginal misfit,
parameter estimates varied as predicted.
Within each of the two response bias groups, D parameter
estimates were significantly higher in the 900 milliseconds
than in the 700 milliseconds response deadline condition,
ΔG²(2) = 113.73, p < .001. However, D parameter estimates
not only varied between the two response deadline conditions
but also between the two base rate conditions, ΔG²(2) =
122.09, p < .001. The observation that the D parameter was
significantly reduced when participants were exposed to 75%
go stimuli compared to only 25% go stimuli is consistent
with the idea that responding interferes with stimulus
discrimination. As hypothesized, G parameter estimates were
significantly larger than 0.50 when the ratio of go stimuli was
75% and significantly lower than 0.50 when the ratio of go
stimuli was 25%, ΔG²(4) = 1300.45, p < .001 (see Figure 6).
Importantly, the two AC parameters measuring flower–positive
associations (AC1) and insect–negative associations
(AC2) did not differ significantly between the four experimental
conditions, ΔG²(6) = 7.44, p = .28. This finding
supports the Trip Model’s AC parameter as an uncontaminated,
pure measure of automatic attitudes. More precisely, AC
is affected neither by response base rates nor by response speed
manipulations.
Finally, GNAT data were also analyzed with the generalized
Quad Model of Gonsalkorale, von Hippel, et al. (2009).
Replicating Experiment 2, the Quad Model again failed to fit
the data in any of the four conditions, G²(8) ≥ 21.37, p ≤ .01.
GENERAL DISCUSSION
Validity of the Trip Model
The results of our experiments provide strong evidence for the
validity of the Trip Model. For seven of the eight GNAT data
sets, goodness-of-fit tests showed a good model fit for the Trip
Model. In contrast, the generalized Quad Model fit only
one of the eight GNAT data sets. This result
clearly demonstrates that the Trip Model captures the
processes underlying GNAT performance better than the
generalized Quad Model. Although not the primary goal of the
present paper, we also compared the Trip Model to the ABC
Model (Stahl & Degner, 2007). As outlined above, the latter
model was originally suggested for the EAST paradigm and
not for the GNAT. In principle, however, it can also be adapted
to the GNAT. Interestingly, despite some conceptual similarities
between the Trip and the ABC Model, our analyses
revealed that the GNAT-adaptation of the ABC Model fit only
one of the eight GNAT data sets. The Trip Model should,
therefore, be preferred to both the Quad and the ABC Model
when analyzing GNAT data. Table II provides a summary of
the goodness-of-fit tests for the Trip Model, the generalized
Quad Model, and the ABC Model.
Apart from the Trip Model’s superiority in terms of model
fit, additional evidence for the validity of the Trip Model
derives from the finding that its parameters are affected by
experimental manipulations in meaningful ways. As demonstrated
in Experiments 2 and 3, (1) both AC parameters were
smaller after reading counter-stereotype scenarios compared to
pro-stereotype scenarios, (2) the G parameter varied with
the GNAT’s base rate of go stimuli, and (3) the D parameter
was smaller when the GNAT’s response deadline was set to
700 milliseconds compared to 900 milliseconds. These findings
support three important conclusions regarding the Trip
Model’s parameters: First, the AC parameters provide valid,
uncontaminated measures of implicit associations; second, the
G parameter assesses response bias selectively; and third, the
D parameter captures the cognitive capacity to detect the
correct response.
Because the D parameter represents a controlled process
that requires cognitive capacity, any manipulation that
directly or indirectly affects cognitive resources should also
affect D. Presumably, this is the reason why D was sensitive to
(1) the response deadline manipulation, (2) the scenario
manipulation, and (3) the base rate manipulation. The counter-
stereotype scenario in Experiment 2 not only affected
associations as measured by AC but also impaired cognitive
capacity as indexed by D because participants experienced
interference between the newly created associations and their
conflicting automatic associations. Similarly, the base rate
manipulation in Experiment 3 not only influenced response
biases but also discrimination ability D. We explain this
finding in terms of interference between action and visual
encoding, a phenomenon that was previously found by
Danielmeier et al. (2004), Koch and Prinz (2005), and
Müsseler and Wühr (2002). Moreover, we would not be
surprised to find significant correlations between the D
parameter and individual differences in executive control
or intelligence, as recently reported for the IAT (e.g., Klauer,
Schmitz, Teige-Mocigemba, & Voss, 2010; von Stülpnagel &
Steffens, 2010).
Generalizability of Results
The superior fit of the Trip Model to GNAT data compared to
the generalized Quad Model implies that the GNAT and the
IAT do not rely on exactly the same processes. As previously
stated, we believe that this difference is due to two procedural
features of the GNAT. First, unlike the IAT, the GNAT is not a
two-choice task but a go/no-go task. Because go/no-go
procedures usually provoke significant response biases, GNAT
performance should also be strongly influenced by response
biases. Second, in contrast to the IAT, the GNAT requires
response deadlines. Typically, short response deadlines are
chosen to avoid floor effects in the error rates (i.e., deadlines
shorter than 1000 milliseconds, as recommended by Nosek and
Banaji, 2001). With deadlines shorter than 1000 milliseconds,
it is very unlikely that people have sufficient time to overcome
bias when performing a GNAT. Hence, this process is ignored
in the Trip Model. Of course, overcoming bias could possibly
play a role if substantially longer response deadlines are
applied. We thus do not expect that the Trip Model holds for all
conceivable GNAT procedures. Instead we claim that the Trip
Model is a valid model for the processes underlying a typical
GNAT, that is, a GNAT with response deadlines shorter than
1000 milliseconds.
Figure 6. Parameter estimates for the Trip Model, separately for the GNAT data of the different base rate conditions (75% vs. 25% go-stimuli)
and response deadline conditions (700 milliseconds vs. 900 milliseconds) of Experiment 3. Error bars represent 95% confidence intervals.
Table II. Summary of goodness-of-fit test results for the eight different GNAT data sets
Experiment  Task            Instruction         ‘‘Go’’-ratio (%)  Deadline (milliseconds)  Trip Model fit    Quad Model fit     ABC Model fit
1           Target GNAT     None                50                900                      G²(4) = 6.39      G²(8) = 14.21      G²(4) = 11.10*
1           Attribute GNAT  None                50                900                      G²(4) = 0.35      G²(8) = 31.73***   G²(4) = 4.83
2           Target GNAT     Pro-stereotype      50                900                      G²(4) = 9.12      G²(8) = 58.71***   G²(4) = 16.06**
2           Target GNAT     Counter-stereotype  50                900                      G²(4) = 6.63      G²(8) = 43.56***   G²(4) = 12.71*
3           Target GNAT     None                75                700                      G²(4) = 13.91**   G²(8) = 33.38***   G²(4) = 21.71***
3           Target GNAT     None                75                900                      G²(4) = 4.60      G²(8) = 21.37**    G²(4) = 13.55**
3           Target GNAT     None                25                700                      G²(4) = 4.27      G²(8) = 30.98***   G²(4) = 24.94***
3           Target GNAT     None                25                900                      G²(4) = 4.42      G²(8) = 34.07***   G²(4) = 24.71***
Note: Significant G² values indicate model misfit.
*p ≤ .05; **p ≤ .01; ***p ≤ .001.
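The summary in Table II can be checked by counting, for each model, the data sets whose G² statistic is not significant at the .05 level. The sketch below is a minimal illustration using only the standard library; the names `chi2_sf_even`, `FITS`, and `count_fitting` are hypothetical, and the closed-form tail probability is valid only for even degrees of freedom, which covers the df of 4 and 8 in the table.

```python
import math


def chi2_sf_even(x: float, df: int) -> float:
    """Upper-tail chi-square probability for even df (closed-form finite sum)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for r in range(1, df // 2):
        term *= half / r
        total += term
    return math.exp(-half) * total


# (degrees of freedom, G² values) per model, in the row order of Table II.
FITS = {
    "Trip": (4, [6.39, 0.35, 9.12, 6.63, 13.91, 4.60, 4.27, 4.42]),
    "Quad": (8, [14.21, 31.73, 58.71, 43.56, 33.38, 21.37, 30.98, 34.07]),
    "ABC": (4, [11.10, 4.83, 16.06, 12.71, 21.71, 13.55, 24.94, 24.71]),
}


def count_fitting(df: int, values: list[float], alpha: float = 0.05) -> int:
    """Number of data sets for which the model is not rejected at level alpha."""
    return sum(chi2_sf_even(g2, df) > alpha for g2 in values)


summary = {model: count_fitting(df, values) for model, (df, values) in FITS.items()}
print(summary)
```

With these inputs the tally is seven of eight data sets for the Trip Model and one of eight for each of the other two models, as stated in the text.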
Although the length of response deadline is certainly
the most evident variable that might influence the validity of
the Trip Model, effects of other procedural variables are
also possible. For example, when reanalyzing the data of
Gonsalkorale, von Hippel, et al. (2009), we found good fit for
the Trip Model only if two separate D parameters were used—
one for stimuli of the category to which people had to respond
throughout all GNAT blocks (D1) and one for stimuli of the
remaining three categories (D2). One possible explanation for
not finding such an effect in our own data could be that we used
words as stimuli whereas Gonsalkorale, von Hippel, et al.
(2009) used images. Foroni and Bel-Bahar (2010) argue that
words and images usually differ in their level of stimulus
representation and thus can produce divergent results. Effects of
these and other procedural GNAT features on the validity of the
Trip Model remain an interesting topic for subsequent research.
Robustness Against Interindividual Variability
One possible objection against our data analyses (and also
against those of other authors that previously made use of MPT
analyses of IAT and GNAT data) could address data
aggregation across participants prior to submitting these data
to the model fitting procedures. To test whether the observed
estimates for the Trip Model are unbiased by data aggregation
across participants, we first checked our data for parameter
heterogeneity using the computer program HMMTree (Stahl &
Klauer, 2007).² Details of this analysis can be obtained from
the first author. As expected, parameter heterogeneity was
present in each of the eight experimental conditions. Therefore,
in a second step, we fitted latent-class hierarchical MPT
models (Klauer, 2006) to our data that allow for variability
between participants. In almost all cases, a rather good model
fit was observed with either a two-class or a three-class solution.
Most importantly, however, in all cases the basic pattern of the
Trip Model’s parameters describing participants within latent
classes mirrored the effects found for the aggregate data. Hence,
our results are robust against aggregation across participants.
Parameter heterogeneity between participants was present but
did not affect the pattern of results reported here. Moreover,
inspection of the parameter estimates for different latent classes
also revealed that the variability is mainly due to different ability
(or motivation) levels of participants, as indicated by large
differences in the D parameter between latent classes.
Implications
The present research emphasizes the utility of MPT models in
studying the processes underlying implicit attitude measures.
Multinomial models combine two important advantages: They
are capable of disentangling and measuring the contributions
of different cognitive processes, and they allow for the
assessment of model fit. Based on goodness-of-fit tests, we
were able to show that a typical IAT and a typical GNAT differ
in their underlying cognitive processes.
As demonstrated by previous research, IAT performance is
influenced by an overcoming bias process. However, as
indicated by the results of our three experiments, overcoming
bias can safely be ignored for typical GNATs. Based on
this finding, we recommend using a typical GNAT procedure
rather than an IAT when research interests focus on implicit
attitude assessments. Although overcoming bias is a theoretically
interesting process for cognitive psychologists, applied
researchers interested in measuring implicit attitudes will primarily
regard it as a confounding variable that needs to be controlled.
One might argue that this problem can be solved by analyzing
IAT data with the Quad Model, a model previously shown to be
capable of separating and quantifying the influence of
overcoming bias. However, in doing so, the strength of
implicit attitudes is typically underestimated because IAT
response latency information is ignored by the Quad Model.
Therefore, we prefer the standard GNAT procedure combined
with the Trip measurement model. Using this measurement
tool has three important advantages: First, differences in
overcoming bias cannot affect GNAT performance (at least
if response deadlines are sufficiently short). Second, the
Trip Model disentangles and quantifies the contribution of
implicit attitudes, response biases, and discriminability in
GNAT performance. Finally, in line with conventional GNAT
measures, the Trip Model is based on error rates so that no
important information is lost when analyzing GNAT data with
the Trip Model.
² Parameter homogeneity across participants is implicitly assumed whenever
MPT models are applied to data that have been aggregated across participants.
Violations of this assumption may cause incorrect model rejections, biased
parameter estimates, and biased confidence intervals; the assumption thus
needs to be tested.
REFERENCES
Ansorge, U., & Wühr, P. (2004). A response-discrimination account of the
Simon effect. Journal of Experimental Psychology: Human Perception and
Performance, 30, 365–377.
Ansorge, U., & Wühr, P. (2008). Transfer of response codes from
choice-response to go/no-go tasks. The Quarterly Journal of Experimental
Psychology, 62, 1216–1235.
Banaji, M. R. (2001). Implicit attitudes can be measured. In H. L. Roedinger,
I. N. Nairne, & A. M. Sprenant (Eds.), The nature of remembering: Essays in
honor of Robert G. Crowder (pp. 117–149). Washington D.C.: APA.
Batchelder, W. H., & Riefer, D. M. (1999). Theoretical and empirical review of
multinomial process tree modeling. Psychonomic Bulletin and Review, 6,
57–86.
Boysen, G. A., Vogel, D. L., & Madon, S. (2006). A public versus private
administration of the Implicit Association Test. European Journal of Social
Psychology, 36, 845–856.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences
(second edition). Hillsdale, NJ: Erlbaum.
Conrey, F. R., Sherman, J. W., Gawronski, B., Hugenberg, K., & Groom, C. J.
(2005). Separating multiple processes in implicit social cognition: The quad
model of implicit task performance. Journal of Personality and Social
Psychology, 89, 469–487.
Danielmeier, C., Zysset, S., Müsseler, J., & von Cramon, D. Y. (2004). Where
action impairs visual encoding: An event-related fMRI study. Cognitive
Brain Research, 21, 39–48.
Degner, J., & Wentura, D. (2008). The extrinsic affective Simon task as an
instrument for indirect assessment of prejudice. European Journal of Social
Psychology, 38, 1033–1043.
De Houwer, J. (2003). The extrinsic affective Simon task. Experimental
Psychology, 50, 77–85.
Erdfelder, E., Auer, T.-S., Hilbig, B. E., Aßfalg, A., Moshagen, M., &
Nadarevic, L. (2009). Multinomial processing tree models: A review of
the literature. Zeitschrift für Psychologie/Journal of Psychology, 217,
108–124.
Falkenstein, M., Hoormann, J., & Hohnsbein, J. (1999). ERP components in
Go/Nogo tasks and their relation to inhibition. Acta Psychologica,101,
267–291.
Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power
analysis using G
Power 3.1: Tests for correlation and regression analyses.
Behavior Research Methods,41, 1149–1160.
Fazio, R. H., & Olson, M. A. (2003). Implicit measures in social cognition
research: Their meaning and uses. Annual Review of Psychology,54, 297–
327.
Foroni, F., & Bel-Bahar, T. (2010). Picture-IAT versus Word-IAT: Level of
stimulus representation influences on the IAT. European Journal of Social
Psychology,40, 321–337.
Foroni, F., & Mayr, U. (2005). The power of a story: New, automatic
associations from a single reading of a short scenario. Psychonomic Bulletin
& Review,12, 139–144.
Gomez, P., Ratcliff, R., & Perea, M. (2007). A model of the go/no-go task.
Journal of Experimental Psychology: General,136, 389–413.
Gonsalkorale, K., Sherman, J. W., & Klauer, K. C. (2009). Aging and
prejudice: Diminished regulation of automatic race bias among older adults.
Journal of Experimental Social Psychology,45, 410–414.
Gonsalkorale, K., von Hippel, W., Sherman, J. W., & Klauer, K. C. (2009). Bias
and regulation of bias in intergroup interactions: Implicit attitudes toward
Muslims and interaction quality. Journal of Experimental Social Psychol-
ogy,45, 161–166.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring
individual differences in implicit cognition: The Implicit Association Test.
Journal of Personality and Social Psychology, 74, 1464–1480.
Greenwald, A. G., Nosek, B. A., & Banaji, M. R. (2003). Understanding and
using the Implicit Association Test: I. An improved scoring algorithm.
Journal of Personality and Social Psychology, 85, 197–216.
Johnstone, S. J., Pleffer, C. B., Barry, R. J., Clarke, A. R., & Smith, J. L. (2005).
Development of inhibitory processing during the go/nogo task: A behavioral
and event-related potential study of children and adults. Journal of
Psychophysiology, 19, 11–23.
Klauer, K. C. (2006). Hierarchical multinomial processing tree models: A
latent-class approach. Psychometrika, 71, 7–31.
Klauer, K. C., Schmitz, F., Teige-Mocigemba, S., & Voss, A. (2010).
Understanding the role of executive control in the Implicit Association Test: Why
flexible people have small IAT effects. Quarterly Journal of Experimental
Psychology, 63, 595–619.
Klauer, K. C., & Wegener, I. (1998). Unraveling social categorization in the
"who said what?" paradigm. Journal of Personality and Social Psychology,
75, 1155–1178.
Koch, I., & Prinz, W. (2005). Response preparation and code overlap in dual
tasks. Memory & Cognition, 33, 1085–1095.
Menon, V., Adleman, N. E., White, C. D., Glover, G. H., & Reiss, A. L. (2001).
Error-related brain activation during a Go/NoGo response inhibition task.
Human Brain Mapping, 12, 131–141.
Moshagen, M. (2010). MultiTree: A computer program for the analysis of
multinomial processing tree models. Behavior Research Methods, 42,
42–50.
Müsseler, J., & Wühr, P. (2002). Response-evoked interference in visual
encoding. In W. Prinz, & B. Hommel (Eds.), Common mechanisms in
perception and action: Attention and performance, Vol. XIX (pp. 520–537).
Oxford: Oxford University Press.
Nieuwenhuis, S., Yeung, N., van den Wildenberg, W., & Ridderinkhof, K. R.
(2003). Electrophysiological correlates of anterior cingulate function in a
go/no-go task: Effects of response conflict and trial type frequency.
Cognitive, Affective, & Behavioral Neuroscience, 3, 17–26.
Nosek, B. A., & Banaji, M. R. (2001). The Go/No-go Association Task. Social
Cognition, 19, 625–666.
Nosek, B. A., & Smyth, F. L. (2007). A multitrait-multimethod validation of
the Implicit Association Test: Implicit and explicit attitudes are related but
distinct constructs. Experimental Psychology, 54, 14–29.
Read, T. R. C., & Cressie, N. A. C. (1988). Goodness-of-fit statistics for
discrete multivariate data. New York: Springer-Verlag.
Rudolph, A., Schröder-Abé, M., Schütz, A., Gregg, A. P., & Sedikides, C.
(2008). Through a glass, less darkly? Reassessing convergent and
discriminant validity in measures of implicit self-esteem. European Journal of
Psychological Assessment, 24, 273–281.
Sherman, J. W., Gawronski, B., Gonsalkorale, K., Hugenberg, K., Allen, T. J.,
& Groom, C. J. (2008). The self-regulation of automatic associations and
behavioral impulses. Psychological Review, 115, 314–335.
Sriram, N., & Greenwald, A. G. (2009). The brief Implicit Association Test.
Experimental Psychology, 56, 283–294.
Stahl, C., & Degner, J. (2007). Assessing automatic activation of valence: A
multinomial model of EAST performance. Experimental Psychology, 54,
99–112.
Stahl, C., & Klauer, K. C. (2007). HMMTree: A computer program for
hierarchical multinomial models. Behavior Research Methods, 39,
267–273.
Steffens, M. C. (2004). Is the Implicit Association Test immune to faking?
Experimental Psychology, 51, 165–179.
Steffens, M. C., & Jonas, K. J. (2010). Implicit attitude measures. Zeitschrift
für Psychologie/Journal of Psychology, 218, 1–3.
Steffens, M. C., Lichau, J., Still, Y., Jelenec, P., Anheuser, J., & Goergens,
N. K., et al. (2004). Individuum oder Gruppe, Exemplar oder Kategorie? Ein
Zweifaktorenmodell zur Erklärung der Reaktionszeitunterschiede im
Implicit Association Test (IAT). [Individual or group, exemplar or category?
A two-factor model for the explanation of reaction time differences in the
Implicit Association Test (IAT)]. Zeitschrift für Psychologie, 212, 57–65.
von Stülpnagel, R., & Steffens, M. C. (2010). Prejudiced or just smart?
Intelligence as a confounding factor in the IAT effect. Zeitschrift für
Psychologie/Journal of Psychology, 218, 51–53.
STIMULUS MATERIAL
Positive Words
Diamant (diamond), Ehre (honor), Freiheit (freedom), Freund
(friend), Frieden (peace), Geschenk (gift), Gesundheit (health),
Glück (luck), Himmel (heaven), Humor (humor), Liebe (love),
Regenbogen (rainbow), Sympathie (sympathy), Treue (loyalty),
and Vertrauen (trust).
Negative Words
Aids (AIDS), Armut (poverty), Bombe (bomb), Desaster
(disaster), Gefängnis (prison), Gestank (stink), Hölle (hell),
Krankheit (disease), Missbrauch (abuse), Mord (murder),
Schmutz (dirt), Sünde (sin), Tod (death), Verbrechen (crime),
and Zerstörung (destruction).
Flower Names
Flieder (lilac), Gänseblume (daisy), Lavendel (lavender), Lilie
(lily), Löwenzahn (dandelion), Maiglöckchen (lily of the
valley), Mohn (poppy), Narzisse (daffodil), Nelke (carnation),
Orchidee (orchid), Rose (rose), Seerose (water lily),
Sonnenblume (sunflower), Tulpe (tulip), and Veilchen (violet).
Insect Names
Ameise (ant), Floh (flea), Grille (cricket), Heuschrecke (locust),
Hornisse (hornet), Kakerlake (roach), Mistkäfer (dung beetle),
Moskito (mosquito), Motte (moth), Spinne (spider), Stubenfliege
(housefly), Biene (bee), Termite (termite), Wanze (bug), and
Wespe (wasp).
MODEL SPECIFICATIONS
Quad Model
As specified by Gonsalkorale, von Hippel, et al. (2009), the
following parameters were estimated: one AC parameter
representing the flower–positive association (AC1), one AC
parameter representing the insect–negative association (AC2),
one G parameter, and one OB parameter for target stimuli only.
Furthermore, again following Gonsalkorale, von Hippel, et al.
(2009), we used two D parameters for the GNAT analyses, one
for stimuli of the constant go category (D1), and one for the
stimuli of the remaining three categories (D2). Thus, six
parameters (AC1, AC2, G, OB, D1, and D2) were estimated.
In contrast to Gonsalkorale, von Hippel, et al. (2009), who
did not collapse data across associated categories (e.g., flower
and positive), we adhered to the aggregation level originally
suggested by Conrey et al. (2005). That is, data were
aggregated across the categories flower and positive as well as
insect and negative whenever the model equations did not
differ for these categories. Aggregating independent data
generated by the same process probabilities is an effective
means of reducing sampling error. We thus preferred the
aggregation procedure of Conrey et al. (2005) because it is
statistically more efficient than the procedure of
Gonsalkorale, von Hippel, et al. (2009). Furthermore, the Quad
Model performed better when the aggregation procedure of
Conrey et al. (2005) was applied than when the procedure of
Gonsalkorale, von Hippel, et al. (2009) was applied.
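
The efficiency argument can be illustrated with the standard error of an observed proportion under binomial sampling. The following is a minimal sketch rather than part of the reported analyses: the error probability (0.10) and the number of trials per category (60) are made-up values, and `se_proportion` is a hypothetical helper.

```python
import math


def se_proportion(p: float, n: int) -> float:
    """Standard error of an observed proportion under binomial sampling."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical values, chosen only for illustration (not taken from the study).
p_error = 0.10       # assumed common error probability in both categories
n_per_category = 60  # assumed number of trials per stimulus category

# One category analyzed on its own (e.g., flowers only).
se_separate = se_proportion(p_error, n_per_category)
# Two categories with identical process probabilities pooled
# (e.g., flowers and positive words), which doubles the trial count.
se_pooled = se_proportion(p_error, 2 * n_per_category)

print(f"SE separate: {se_separate:.4f}")
print(f"SE pooled:   {se_pooled:.4f}")
print(f"ratio:       {se_pooled / se_separate:.4f}")
```

Under these assumptions, pooling two equally sized categories shrinks the standard error by a factor of 1/sqrt(2), regardless of the error probability chosen.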
Based on these model specifications, the model for the
GNAT consisted of 14 model trees. For the Target GNATs,
there were six model trees for the compatible blocks: three
model trees for the Flower GNAT (one tree for flowers as the
constant go category, one for positive words as the temporary
go category, and one for the aggregated data of the no-go
categories insects and negative words) and three model trees
for the Insect GNAT (one tree for insects as constant go
category, one for negative words as temporary go category, and
one for the aggregated data of flowers and positive words as no-
go categories). The same logic applies to the Attribute GNATs.
Because the OB parameter was estimated for targets (flowers
and insects) but not for attributes (positive and negative
words), we could not aggregate data across the associated
stimulus categories for incompatible GNAT blocks. Thus, the
Quad Model consisted of eight trees for the incompatible
blocks, resulting from the possible combinations of four
different stimulus categories (flower, insect, positive, negative)
and the required responses (go vs. no-go). Because each model
tree involves two response categories only (i.e., correct
response vs. incorrect response), Quad Model tests for GNAT
data were based on 14 (independent data categories) − 6
(estimated parameters) = 8 degrees of freedom.
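
The tree count and the degrees of freedom for the Quad Model GNAT analysis can be retraced in a few lines. This is a sketch of the bookkeeping only, for the Target GNATs described above; the tree labels are informal descriptions, not identifiers from any model file.

```python
from itertools import product

# Compatible blocks: associated categories are aggregated wherever the
# model equations coincide, which gives three trees per GNAT.
compatible_trees = [
    ("Flower GNAT", "flowers (constant go)"),
    ("Flower GNAT", "positive words (temporary go)"),
    ("Flower GNAT", "insects + negative words (no-go)"),
    ("Insect GNAT", "insects (constant go)"),
    ("Insect GNAT", "negative words (temporary go)"),
    ("Insect GNAT", "flowers + positive words (no-go)"),
]

# Incompatible blocks: no aggregation is possible, so every stimulus
# category is crossed with the required response.
categories = ["flower", "insect", "positive", "negative"]
responses = ["go", "no-go"]
incompatible_trees = list(product(categories, responses))

n_trees = len(compatible_trees) + len(incompatible_trees)

# Each tree has two response categories (correct vs. incorrect), so it
# contributes 2 - 1 = 1 independent data category.
independent_categories = n_trees * (2 - 1)

quad_gnat_parameters = ["AC1", "AC2", "G", "OB", "D1", "D2"]
df = independent_categories - len(quad_gnat_parameters)

print(n_trees, df)  # 14 8
```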
For Quad Model analyses of IAT data only one D parameter
was estimated so that in the compatible blocks data could
always be aggregated for associated categories. Therefore,
the model for the IAT consisted of only 12 model trees, that
is, four trees for the compatible blocks (flower–positive vs.
insect–negative crossed with required response left vs. right)
and eight trees for the incompatible blocks (flower vs. positive
vs. insect vs. negative crossed with required response left
vs. right). Thus, the Quad Model test for IAT data was based
on 12 (independent data categories) − 5 (estimated
parameters) = 7 degrees of freedom.
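
The same count for the IAT can be sketched as follows. The parameter labels are an assumption carried over from the GNAT specification, with the single D parameter replacing D1 and D2.

```python
from itertools import product

responses = ["left", "right"]

# Compatible blocks: data are aggregated across associated categories.
compatible = list(product(["flower-positive", "insect-negative"], responses))

# Incompatible blocks: each stimulus category keeps its own trees.
incompatible = list(
    product(["flower", "positive", "insect", "negative"], responses)
)

n_trees = len(compatible) + len(incompatible)

# Assumed labels: same as for the GNAT, but with one D parameter only.
quad_iat_parameters = ["AC1", "AC2", "G", "OB", "D"]

# Two response categories per tree leave one independent category each.
df = n_trees * (2 - 1) - len(quad_iat_parameters)

print(len(compatible), len(incompatible), n_trees, df)  # 4 8 12 7
```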
Trip Model
Only four parameters were estimated for the Trip Model (i.e.,
AC1, AC2, G, and D). Due to the lack of an OB parameter,
data could be aggregated across the associated categories
(flower and positive; insect and negative) not only in the
compatible but also in the incompatible blocks. Hence,
the model consisted of eight model trees (compatible block
vs. incompatible block crossed with flower–positive vs.
insect–negative crossed with go vs. no-go instructions) with
two response categories (correct response vs. false response)
each. Thus, Trip Model tests were based on 8 (independent
data categories) − 4 (estimated parameters) = 4 degrees of
freedom.
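
Because the Trip Model design is fully crossed, its eight trees can be listed directly. The sketch below only enumerates the design cells and recomputes the degrees of freedom; it does not encode the model equations.

```python
from itertools import product

blocks = ["compatible", "incompatible"]
category_pairs = ["flower-positive", "insect-negative"]
instructions = ["go", "no-go"]

# Fully crossed 2 x 2 x 2 design: one model tree per cell.
trip_trees = list(product(blocks, category_pairs, instructions))

trip_parameters = ["AC1", "AC2", "G", "D"]

# Two response categories per tree leave one independent category each.
df = len(trip_trees) * (2 - 1) - len(trip_parameters)

for tree in trip_trees:
    print(" x ".join(tree))
print("trees:", len(trip_trees), "df:", df)
```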