This is the pre-peer reviewed version of the following article:
Mansour, J. K., Beaudry, J. L., Nguyen, M-T. & Groncki, R. (in press). Eyewitness decision
processes: A valid reflector variable. Applied Cognitive Psychology.
This article will be published in final form at
https://onlinelibrary.wiley.com/journal/10990720. This article may be used for non-
commercial purposes in accordance with Wiley Terms and Conditions for Use of Self
Archived Versions.
Eyewitness Decision Processes: A Valid Reflector Variable
Jamal K. Mansour,1 Jennifer L. Beaudry,2 Mai-Tram Nguyen3, & Roy Groncki3
1 Department of Psychology, University of Lethbridge
2 Research Development and Support, Flinders University
3 Independent scientist
Author Note
Jamal K. Mansour https://orcid.org/0000-0001-7162-8493, Jennifer L. Beaudry
https://orcid.org/0000-0003-1596-6708, jen.beaudry@flinders.edu.au, Mai-Tram Nguyen,
maitramn23@gmail.com, Roy Groncki, super_roy@hotmail.com
Much of this work occurred while Jennifer L. Beaudry, Mai-Tram Nguyen, and Roy
Groncki were with the Department of Psychological Sciences; School of Health Sciences;
Faculty of Health, Arts and Design; Swinburne University of Technology and Jamal K.
Mansour was with Psychology, Sociology, and Education; Queen Margaret University.
We have no known conflicts of interest to disclose. The videos and images used are
available upon reasonable request to the first author for research use. The questions asked, data,
and analysis scripts are available on the Open Science Framework (OSF):
https://osf.io/yj8mx/?view_only=7529a7ee6d944609b5e4c9b0b67963a5
Correspondence concerning this article should be addressed to Jamal K. Mansour,
Department of Psychology, University of Lethbridge, Lethbridge, Alberta, Canada, T1K 3M4
Phone: +001 (403) 329-2077. Email: jamal.mansour@uleth.ca
This research partially fulfilled the requirements for the degree of Bachelor of Arts
in Psychology (Honours) for the third author. This research was presented at the Society for
Applied Research in Memory and Cognition, Sydney, Australia, and the American
Psychology-Law Society, Seattle, United States, in 2017. This research was partially supported by
an Australian Government Research Training Program Scholarship to the fourth author.
Abstract
Identification accuracy can be predicted from eyewitnesses’ self-reported decision processes,
but evidence of their ability to improve prediction when confidence and response time are
also considered is mixed and minimal. Typically, decision processes are measured via one or five self-report
questions; we explored whether a more nuanced questionnaire could improve prediction.
Participants viewed a mock-crime video, made a target-present or -absent lineup decision,
and completed 17 decision process items. An exploratory factor analysis on choosers’ (n =
391) responses revealed three correlated factors, broadly reflecting Automatic Responses,
Relative Judgment, and Absolute Judgment. The three-factor solution had good internal
reliability (McDonald’s ωs = .93, .89, and .74, respectively). Scores produced from the
questions loading on the Automatic Responses and Relative Judgment factors improved predictions
of accuracy compared to using confidence and response time alone. Self-reported decision
processes may be an easy-to-administer and useful reflector of identification accuracy.
Keywords: eyewitness identification; decision processes; absolute judgments; relative
judgments; automatic recognition; reflector variables
Eyewitness Decision Processes: A Valid Reflector Variable
Eyewitness identifications are a common source of evidence in criminal justice but
evaluating their reliability is not straightforward. Triers of fact and research participants tend
to believe eyewitnesses regardless of other evidence (Boyce, Beaudry, & Lindsay, 2007).
Unfortunately, eyewitnesses sometimes make errors. Criminal justice could benefit from
tools for judging identification evidence. Measuring behaviors that covary with accuracy (i.e.,
reflector variables) can help meet this need (Wells, 2020).
The most well-established reflector variable is eyewitness confidence, partly because
jurors intuitively rely on confidence (Cutler, Penrod, & Dexter, 1990; Cutler, Penrod &
Stuve, 1988; Penrod & Cutler, 1995; Slane & Dodson, 2022). Confidence is a valid reflector
under certain circumstances (Sauer, Brewer, & Palmer, 2019; Wixted & Wells, 2017).
Eyewitness accuracy has also been reliably associated with how long an eyewitness takes to
make their decision (Brewer et al., 2006; Dunning & Perretta, 2002; Nyman et al., 2019;
Quigley-McBride & Wells, 2023; Seale-Carlisle et al., 2019; Sporer, 1992; Weber et al.,
2004) and their decision process (Dunning & Stern, 1994; Kneller, Memon, & Stevenage,
2001; Ross et al., 2007; Smith, Stinson, & Prosser, 2004; Wittwer et al., 2022). Other
potential reflector variables are less well established: memory for fillers (Charman & Cahill,
2012), metamemory (Saraiva et al., 2020a; 2020b), response bias (Baldassari, Kantner, &
Lindsay, 2019), and (perhaps) measurements of individual differences such as face
identification ability (Bindemann et al., 2012)1. We were concerned with decision processes.
Historically, researchers have considered eyewitness decision processes in two ways:
using Wells’ (1984) absolute/relative judgment conceptualization and Dunning and Stern’s
1 Wells (2020) did not provide a formal definition but rather outlined features of reflector variables. We believe
any formal definition could be expanded to include individual differences because 1) reflector variables covary
with whether the lineup includes the culprit, and different levels of specific individual difference variables may
be associated with a higher probability of identifying a culprit versus an innocent suspect and 2) a key function
of reflector variables is to assist the criminal justice system in predicting the likelihood that an individual
eyewitness’ identification is accurate, and measuring individual differences may do so.
(1994) automatic recognition/deliberation conceptualization. Wells defined relative
judgement as looking for the person in a lineup who looks most like the culprit. Eyewitnesses
who do this are highly likely to make an identification (cf. reject the lineup) and will choose
based on comparisons between lineup members. In contrast, absolute judgment—wherein the
eyewitness compares each lineup member to their memory—can improve accuracy by
reducing identifications of lineup members who poorly resemble the perpetrator (even while
being the closest match to their memory). Incorrect lineup decisions are reduced when
eyewitnesses report or appear to engage more in absolute than relative judgments (e.g., Clark,
Erickson, & Brenneman, 2011; Gronlund, 2005; Lindsay & Bellinger, 1999; Lindsay et al.,
1991; Smith, Lindsay, & Pryke, 2000; Smith et al., 2001, but see Smith, Stinson, & Prosser,
2004).
Dunning and Stern (1994) aimed to differentiate accurate and inaccurate lineup
decisions. Participants answered questions about their decision process in four eyewitness
identification experiments. A principal components analysis of the responses elicited two
components, though the second was judged unnecessary. Dunning and Stern considered the
two items loading positively on their first component as reflecting automatic recognition and
the three items loading negatively on their first component as reflecting a deliberative process
normally instantiated via a process of elimination. An almost identical structure was elicited
by Robinson and Johnson (1998) using the same questions. Compared with inaccurate
eyewitnesses, accurate identifiers in Dunning and Stern's studies endorsed more automatic
recognition items and fewer process of elimination items, were more likely to state that
non-chosen photos had little influence on their decision and that their memories had a
greater influence on their decision than the photos, and were less likely to say that the
photos had a greater influence than their memories.
One limitation, however, is that the scope of the self-reported decision processes used
as reflectors of eyewitness accuracy has been limited. The absolute/relative judgment
conceptualization was never intended to be predictive2, but rather was an explanation for why
(simultaneous) lineup decisions tend to elicit high rates of innocent suspect identifications.
Although the questions created by Dunning and Stern (1994) were aimed at differentiating
correct and incorrect eyewitnesses, they comprise only five items and a single dimension.
Furthermore, self-reported decision processes will be more practically useful to
criminal justice systems if they improve one’s ability to predict accuracy beyond what
existing measures (confidence, response time) already provide. Only a few researchers have
explored this. Sauerland and Sporer (2007) found that while Dunning and Stern’s (1994)
questions predicted identification accuracy, they did not improve predictability when
confidence and response time were also considered (see also Sauerland & Sporer, 2009).
Recently, Wittwer et al. (2022) examined two different 17-item questionnaires in the
context of confidence and response time.3 Multiple factors representing decision processes
interacted with target presence in models that also included confidence and response time.
Their Automatic Recognition factor was associated with more accurate than inaccurate target-
present decisions but more inaccurate than accurate target-absent lineup decisions. Lack of
Familiarity (Experiment 1) was associated with more accurate responses to target-absent
lineups only. In Experiment 2, Process of Elimination was associated with more inaccurate
than accurate responses regardless of target presence, but the effect was larger for target-
present than -absent lineups while Familiarity-Search was associated with more accurate than
inaccurate responses to target-absent lineups but fewer accurate than inaccurate responses to
target-present lineups. Wittwer et al. did not test the three-way interaction of Chooser
(whether participants made an identification or a rejection), Presence (target presence), and
2 Strictly speaking, any measure collected after the lineup decision is a postdictor rather than a predictor of
eyewitness performance. However, for simplicity and clarity we use the term predictive.
3 We were not aware of Wittwer et al.’s (2022) work when we designed this study as we collected our data
before their work was published.
each decision process or conduct a separate analysis of identifiers and test for the two-way
interaction of Presence and each decision process. However, inspection of their 95%
confidence intervals suggests that Lack of Familiarity in Experiment 1 and that Familiarity-
Search and Process of Elimination in Experiment 2 differentiated accurate from inaccurate
identifications. Wittwer and colleagues therefore demonstrated that self-reported decision
processes can reflect accuracy, even when confidence and response time are considered (cf.
Sauerland & Sporer, 2007, 2009). However, more research and explicit tests of the ability of
decision processes to predict identification accuracy alongside confidence and response time
would be informative.
Even fewer studies have examined rejectors, and those that have done so report mixed
results (Charman & Cahill, 2012; Kneller et al., 2001; Robinson & Johnson, 1998; Sauerland,
Sagana, & Sporer, 2012; Sauerland & Sporer, 2007; 2009). For example, Kneller et al. (2001)
found that accurate rejectors were more likely to endorse items resembling Dunning and
Stern’s (1994) automatic recognition items but found no relationship for
elimination/deliberation items. In contrast, Charman and Cahill (2012) reported that
inaccurate rejectors endorsed more deliberative items but did not differ in their endorsement
of automatic recognition items. More research is needed.
Another consideration is that eyewitness decision processes are likely more complex
than unidimensional conceptualizations (i.e., Dunning & Stern, 1994; Wells, 1984) can
account for. For example, Mansour et al. (2009) used eye tracking to monitor eyewitnesses
making lineup decisions and noted that participants appeared to use a combination of relative
and absolute judgments (see also Flowe, 2011; Flowe & Cottrell, 2010; Josephson & Holmes,
2011). Indeed, Wittwer et al. (2022) found four factors in each of their experiments.
Likewise, more nuanced decision processes have been detected when participants are asked
to think aloud while making a lineup decision (Mansour & da Costa, 2015). Finally, the
WITNESS model, a mathematical model of eyewitness decision processes, adequately fits
identification data when absolute/relative judgments are represented as a continuous
parameter along with response criterion (and memory strength; Clark, 2003; 2008; Clark et
al., 2011).
Yet, not all evidence points to eyewitness decision processes as multidimensional.
Fife et al. (2014) argued that their modification to the WITNESS model (WITNESS-
Restricted) could account for eyewitness decision processes by holding the absolute/relative
judgment parameter from the original model constant and relying on response criterion as the
sole decision process parameter. However, they tested their model only under circumstances
where response criterion was very liberal. When eyewitnesses are highly likely to choose to
begin with, it is unsurprising that the contribution of decision processes would be minimal.
The findings from WITNESS and WITNESS-Restricted (Clark, 2003; Fife et al., 2014)
highlight that a more advanced understanding of eyewitness decision processes may not only
benefit practice, but also theory. Consider applied lineup theory (Charman & Wells, 2007)
which implies that an eyewitness’ decision process mediates the relationship between
memory quality and identification accuracy such that when memory quality is strong,
eyewitnesses are more likely to use an effective decision process (cf. a counterproductive one), which in
turn increases their likelihood of being accurate. Other perspectives, such as the Diagnostic-
feature-detection theory (DFD; Wixted & Mickes, 2014) and the differential filler siphoning
hypothesis (Smith et al., 2022) may also benefit from an understanding of the
phenomenology of eyewitness decision processes by providing evidence of the extent to
which different processes occur and how these affect other strategies. If eyewitnesses’
decisions can be understood in terms of multiple, differentiable decision processes, these could inform
refinements to these models.
Another reason to further explore eyewitness identification decision processes is to
elucidate the relationship between existing conceptualizations. The terms absolute and
relative judgment are often used interchangeably with automatic recognition and process of
elimination (e.g., Harvey et al., 2020; Fife et al., 2014; Kneller et al., 2001). Yet, there is
reason to differentiate between them. Consider that automatic recognition—which is fast and
accurate (Moors, 2016)—likely involves absolute judgment, whereby only a single
comparison between a lineup member and memory is made—it just happens to be made very
quickly. In contrast, it would not make sense to say that all absolute judgments are also
examples of automatic recognition. That is, it is easy to imagine a witness comparing
multiple lineup members to their memory but doing so slowly because they do not experience
a strong feeling of recognition as they look at the lineup members. Thus, automatic
recognition is likely better understood as a special case of absolute judgment or as an extreme
point on the decision process continuum.
Likewise, we can think of the process of elimination as a type of relative judgment.
An eyewitness could make comparisons between lineup members in a variety of ways,
including engaging in a process of elimination (Dunning & Stern, 1994) or surveying the
lineup for diagnostic features (Wixted & Mickes, 2014). Notably, a process of elimination
could be conducted in an “absolute” way whereby the eyewitness compares each lineup
member to their memory and eliminates them one at a time if they do not meet their response
criterion. As such, a process of elimination could be a type of deliberative judgment that more
or less resembles an absolute or relative judgment, depending on how the eyewitness uses it.
Relative judgments may take a variety of forms.
Although Wells (1984) conceptualized them on a single continuum, absolute and
relative judgment processes may represent separately measurable dimensions. Indeed, it is
reasonable to expect that most, if not all, eyewitnesses first examine lineups for a match to
their memory. If they do not find a strong match quickly (i.e., experience automatic
recognition), they may engage in either an absolute or a relative judgment process.
Whichever process they engage in, they may follow up with other processes. One can
imagine an eyewitness who looks carefully at each lineup member, comparing them to their
memory (absolute judgment) and when they do not find one who sufficiently matches their
memory, begins comparing between them (relative judgment). Likewise, one could imagine
an eyewitness who compares between lineup members until they come to a decision about
who is the closest to their memory (relative judgment) and then decides whether that person
is a sufficient match to their memory to identify them as the perpetrator (absolute judgment).
With the current study, we aimed to better understand the extent to which the
eyewitness identification decision process can be understood as falling on a single continuous
dimension versus multiple correlated (or uncorrelated) dimensions. We hoped to produce a
tool for criminal justice but also to inform theory. Although our decision process questions
were aimed at identifications, those who rejected the lineup also answered these questions. Therefore,
we also examined their responses to add to that small literature. We hypothesized that
eyewitness decision processes would be better accounted for by multiple correlated
dimensions than a single dimension. Importantly, we also examined whether these
dimensions reflect (i.e., predict) eyewitness identification accuracy and hypothesized that
they would do so above and beyond what confidence and response time provide. Finally, the
nature of our method allowed us to directly test applied lineup theory (Charman & Wells,
2007); therefore, we did so and hypothesized that we would find support for it.
Method
This study was conducted in compliance with the requirements of the ethical review
board at the second author’s university. This study was not preregistered; however, the
questions asked, data, and analysis scripts are available on the Open Science Framework
(OSF): https://osf.io/yj8mx/?view_only=7529a7ee6d944609b5e4c9b0b67963a5.
Participants
First-year undergraduate psychology and criminology students from an Australian
university (N = 1038) participated in exchange for course credit. We removed those who did
not complete the study (n = 141), self-reported that they did not follow the mock-crime video
instructions (n = 214; i.e., they did not watch the video or watched it more than once), and
those who did not make a lineup decision (n = 7). The final sample (N = 676) ranged in age
from 18 to 68 years (M = 32.00, SD = 10.06) and was mostly female (83.43%) and Caucasian
(88.31%; 4.34% Asian, 1.18% Indigenous, and 6.21% Other).
We collected data from as many participants as possible while our participant pool was
available. Nunnally (1978, as cited in Boateng et al., 2018) recommended a minimum of 10
participants per question (i.e., 170 participants for a scale with 17 items). Comrey and Lee
(1992, as cited in Boateng et al.) considered a sample of 300 as good, 500 as very good, and
1000 or more as excellent. Our primary aim was to produce a scale for identifiers and our
final sample of identifiers for the scale development was 391, which is broadly considered a
good sample size for scale development.
Design
We randomly assigned participants in a 2 (Memory Strength: strong, weak) x 2
(Target Presence: present, absent) x 4 (Target: two, eight, twelve, fourteen) between-subjects
factorial design. Participants watched a mock-crime video that depicted one of four culprits
(Target). To create a strong or weak memory for the culprit, we manipulated both opportunity
to encode the mock-crime video and the retention interval between the video and lineup (see
Materials). The purpose of including Memory Strength and Target manipulations was to
increase variability in memory quality and therefore responding, which mitigates, to a degree,
stimulus sampling considerations (Wells & Windschitl, 1999). Participants attempted to
identify the culprit from a six-person photographic simultaneous lineup containing either the
target (target-present) or a replacement filler (target-absent).
Materials
We used Qualtrics (Provo, UT) to present the online study to participants. The videos
and lineups comprised a subset of those originally created for Mansour et al. (2020).
Mock-crime Videos
The target was recorded head-on from the shoulders up against a green screen, acting
out one of two mock-crime scenarios: discussing on the phone a plot to murder someone or
being questioned by an (off-screen) police officer after a robbery. We used four White male
targets. In the strong Memory Strength condition, the mock-crime video was 30 seconds long
and presented full-screen, with the lineup shown immediately after the video. In the weak
condition, the mock-crime video was 5 seconds long, the video was half as large as in the
strong condition, and there was a three-minute delay between the video and lineup during
which participants answered questions about a Where’s Waldo?4 image.
Lineups
The target-present lineup included the target from the mock-crime video and five
fillers who matched the general description of the target. The target-absent lineup included
the same five fillers and one additional filler. The photos (heads only) were organized in a 3 x
2 simultaneous array. The lineups were constructed using a modified match-to-description
procedure. Tredoux’s E (Tredoux, 1998) estimates how many lineup members were considered
reasonable choices by the sample of eyewitnesses. The resultant Es with 95% BCa
confidence intervals were 3.25 [1.08, 3.31] for target two, 3.14 [1.08, 4.07] for
target eight, 2.90 [1.34, 3.17] for target twelve, and 3.16 [1.00, 4.39] for target fourteen.
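For illustration, Tredoux’s E is the reciprocal of the sum of squared choice proportions. The following is a minimal R sketch of that computation under assumed inputs (the choice vector and object names are hypothetical, not our data); the boot package, which we also used in the analyses below, supplies the BCa interval.

```r
library(boot)

# Tredoux's E (1998): reciprocal of the sum of squared choice
# proportions, estimating how many lineup members are plausible picks.
tredoux_e <- function(choices, indices = seq_along(choices)) {
  p <- table(factor(choices[indices], levels = 1:6)) / length(indices)
  1 / sum(p^2)
}

# Hypothetical data: the lineup position (1-6) picked by each chooser
set.seed(1)
choices <- sample(1:6, size = 80, replace = TRUE,
                  prob = c(.30, .20, .15, .15, .10, .10))
tredoux_e(choices)  # point estimate

# BCa bootstrap confidence interval, as reported above
b <- boot(choices, statistic = tredoux_e, R = 2000)
boot.ci(b, type = "bca")
```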
Measures
Perceived Memory Strength - Manipulation Check
4 TM and © 2008 Entertainment Rights Distribution Limited. All rights reserved.
Five questions assessed participants’ perceived memory strength for the perpetrator
(attention, view, memory quality, feature visibility, confidence to identify) on 7-point scales
(e.g., 1 = very poor; 7 = very good). We also asked participants for how long they saw the
culprit’s face (seconds) and the distance (meters) between the camera and the culprit.
Lineup
Participants could select one of the six lineup members or “none of the above.” We
did not designate an innocent suspect. Qualtrics (Provo, UT) recorded the time between when
the lineup appeared and when the participant clicked a response, which took them to the next
screen. Participants rated their confidence in their decision on a scale from 0% = Not at all
confident to 100% = Very confident.
Decision Process Questions
We first asked, “How did you make your decision?” and “Approximately how long
(in seconds) did it take you to make your decision?” However, these questions were
exploratory and were not analysed. We also created two sets of 17 Likert-style decision
process items: One asked participants to predict how they would make their lineup decision
(pre-lineup questions; see the project’s OSF page) and one asked how they made their lineup
decision (post-lineup questions; see Appendix A). The pre-lineup questions were exploratory,
but we report their analysis in the supplementary materials. Finally, after explaining absolute
and relative judgements, we asked “Which type of judgment did you use more in your
decision?” (1 = Absolute to 7 = Relative).
Data Quality
Three items assessed whether participants followed the video instructions.
Participants first indicated if they had watched a video. If they reported that they had, two
open-ended questions asked how many times they watched the video and what the video
depicted. Participants were included only if they reported watching the video once.
Procedure
Participants consented to participate by clicking a button after reading the letter of
information. After reporting their age, sex, and ethnicity, participants were informed that they
were about to view a video (“You are about to view a video. The video you are about to see
might last for only a few seconds. Please resist the desire to watch the video more than once.
The validity of this study depends on you watching it only once.”). They then viewed a
mock-crime video. After the video, participants were informed that they were a witness to a
mock crime and would later view a lineup. Participants either advanced to the next section of
the survey (strong Memory Strength condition) or completed the Where’s Waldo task for
three minutes (weak Memory Strength condition). Participants then answered the memory
strength manipulation check questions and the pre-lineup decision process questions. All
participants next read unbiased lineup instructions, made their lineup decision, rated their
confidence in their decision, and answered the post-lineup decision process questions.
Finally, participants answered the data quality questions and were debriefed.
Results
We conducted all analyses in R (R Core Team, 2024). Packages used included boot
(Canty & Ripley, 2021; Davison & Hinkley, 1997), dplyr (Wickham et al., 2023), effectsize
(Ben-Shachar, Lüdecke, & Makowski, 2020), ggplot2 (Wickham, 2016), here (Müller,
2020), Hmisc (Harrell, 2022), janitor (Firke, 2021), lavaan (Rosseel, 2012), multiUS (Žiberna
& Cugmas, 2023), readr (Wickham, Hester, & Bryan, 2022), psych (Revelle, 2022), r4lineups
(Tredoux & Naylor, 2018), stringr (Wickham, 2019), and tidyr (Wickham & Girlich, 2022).
All confidence intervals are 95% confidence intervals and presented in square brackets.
Perceived Memory Strength - Manipulation Check
Multivariate analysis of variance (MANOVA) confirmed our memory strength
manipulation influenced participants’ perceived memory strength. The multivariate main
effect of Memory Strength was significant, Wilks’ λ = 0.59, F(7, 649) = 65.30, p < .001, ηp² =
.41 [.36, .46]5. The supplemental materials report the univariate results.
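This multivariate test can be run with base R’s manova(); in the sketch below, dat and the dependent variable column names are hypothetical stand-ins for the seven perceived memory strength measures described above.

```r
# MANOVA for the manipulation check (hypothetical names)
fit <- manova(cbind(attention, view, memory_quality, feature_visibility,
                    confidence_to_identify, exposure_seconds,
                    distance_m) ~ memory_strength, data = dat)
summary(fit, test = "Wilks")  # Wilks' lambda, F, and p value
```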
Lineup Decisions
First, we examined whether identification accuracy—that is, whether choosing the
culprit from a target-present lineup or a filler from a target-absent lineup—was predicted by
memory strength. We entered Memory Strength, Target Presence, and their interaction as
predictors into a probit binary logistic regression. The probit approach allows one to test
whether discriminability (per Signal Detection Theory; DeCarlo, 1998) differs as a function
of memory strength. Discriminability was significantly higher for the strong (d = 1.07) than
the weak Memory Strength condition (d = .12), z = 4.78, p < .001, as expected.
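A minimal sketch of this probit approach (DeCarlo, 1998) in R follows; the variable names are hypothetical, and target presence and memory strength are assumed to be 0/1 indicators.

```r
# Probit regression with suspect identification (0/1) as the outcome
fit <- glm(suspect_id ~ target_present * memory_strength,
           family = binomial(link = "probit"), data = dat)
summary(fit)

# Under 0/1 coding, the target_present coefficient is d' in the weak
# condition, and the interaction is the d' difference (strong minus
# weak), whose z test is the one reported above.
coef(fit)["target_present:memory_strength"]
```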
Table 1 illustrates the distribution of lineup decisions by Memory Strength in target-
present and -absent lineups. The association between lineup decisions and Memory Strength
was significant for target-present lineups, χ2(2, n = 340) = 28.08, p < .001. Participants in the
strong condition made more correct identifications than those in the weak condition, z = 4.91,
p < .001, while those in the weak condition made significantly more incorrect rejections, z =
2.21, p = .027, and filler identifications, z = 4.21, p < .001, than those in the strong condition.
There was no association between lineup response and Memory Strength for target-absent
lineups, χ2(1, n = 336) = 2.94, p = .086.
Scale Development
Identifications
We included only participants who chose someone from the lineup (n = 393) in these
analyses. Two additional participants were dropped because they did not respond to all the
scale questions. We first examined the data (n = 391) for their appropriateness for factor
analysis and found that they were appropriate. Next, we used exploratory factor analysis to determine the
5 This analysis is based on 657 participants because 19 did not complete all the measures.
underlying structure of the 17 items used to probe eyewitnesses’ decision processes. The
supplementary materials provide the specific details of these analyses. We retained the three-
factor model as the most appropriate (see Table 2).
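A sketch of this step using the psych package (which we used) follows; items is a hypothetical data frame of the 17 post-lineup ratings from choosers, and the maximum likelihood estimator shown is an assumption rather than a record of our exact call (see the OSF scripts for that).

```r
library(psych)

KMO(items)               # sampling adequacy
cortest.bartlett(items)  # Bartlett's test of sphericity

# Three-factor solution with an oblique rotation (we drew identical
# conclusions with Oblimin and Quartimin rotations)
efa3 <- fa(items, nfactors = 3, rotate = "oblimin", fm = "ml")
print(efa3, cut = .30, sort = TRUE)  # loadings > .30, as in Table 2
```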
Table 1
Proportion of Lineup Decisions by Memory Strength for Target-present and Target-absent
Lineups

Memory Strength | n | Suspect Selection | Filler Selection | Rejection
Target-Present Lineups
Weak | 165 | .50 (.06) | .21 (.07) | .29 (.06)
Strong | 175 | .75 (.04) | .06 (.07) | .19 (.07)
Overall | 340 | .63 (.03) | .13 (.05) | .24 (.05)
Target-Absent Lineups
Weak | 165 | - | .45 (.06) | .55 (.05)
Strong | 171 | - | .35 (.06) | .65 (.04)
Overall | 336 | - | .40 (.04) | .60 (.03)

Note: Standard errors are provided in parentheses.
We named Factor 1 Automatic Response because the questions reflect a quick and
easy (effortless) decision with minimal conscious input (e.g., “I recognized the culprit
easily”). Factor 2 was named Relative Judgment because the questions suggest use of the
lineup photos to aid one’s decision via comparisons (e.g., “[The non-chosen pictures] had
some influence on my decision because more than one face had the feature/features I
remembered about the perpetrator”). Finally, we named Factor 3 Absolute Judgment because
the questions indicate conscious use of one’s memory trace (e.g., “I searched the faces for the
feature/features I remembered”).
That we found three factors is consistent with our expectation that the
automatic/deliberative and absolute/relative conceptualizations do not reflect identical
decision processes. The results are also consistent with Dunning and Stern (1994), who found
that a single factor reflected automatic versus deliberative processing. Interestingly, our process
of elimination item was dropped because it loaded moderately on all three factors.
Table 2
Summary of the Final Factor Structure for Identifiers’ Post-lineup Decision Process Items

Item | Auto (1) | Relative (2) | Absolute (3) | λ²
I recognized the culprit easily. | .95 | .02 | -.02 | .88
The culprit’s face just ‘popped out’ at me. | .94 | .09 | -.09 | .79
To what extent did the perpetrator stand out to you in the lineup? | .89 | .01 | -.03 | .78
How easy was it for you to make your decision? | .85 | -.05 | .04 | .78
Rate your memory for the culprit. | .66 | .05 | .11 | .43
[The non-chosen pictures] had little influence on my decision. I knew straight away who to pick.* | .56 | -.04 | -.02 | .34
[The non-chosen pictures] had some influence on my decision because more than one face had the feature/features I remember about the perpetrator. | .06 | 1.00 | -.02 | .92
[The non-chosen pictures] had some influence on my decision because more than one face seemed familiar. | .05 | .89 | .02 | .76
[The non-chosen pictures] confused me, and made the task more difficult. I didn’t know whom to pick. | -.24 | .58 | -.06 | .54
I searched the faces for the feature/features I remembered. | -.04 | -.09 | .82 | .64
I compared the photos to my memory in order to pick someone who was the closest match to what I remembered. | -.05 | .06 | .67 | .47
I searched the photos for a familiar face. | .004 | .10 | .59 | .39
Unrotated sums of squared loadings | 4.91 | 1.55 | 1.26
Rotated sums of squared loadings | 4.13 | 2.16 | 1.50
Rotated proportion of variance | .34 | .18 | .12

Factor correlations | 1 | 2 | 3
1. Auto | - | |
2. Relative | -.55 | - |
3. Absolute | .08 | .21 | -

Note. Factor loadings > 0.30 are bolded. λ² = communality. Auto = Automatic Response.
Relative = Relative Judgment. Absolute = Absolute Judgment.
The correlations between factors aligned with our expectations. Automatic Response
and Relative Judgement were strongly and negatively correlated (r = -.55) indicating
participants tended to endorse only one of these. Automatic recognition has often been
treated as equivalent to absolute judgment, but our Automatic Response and Absolute
Judgment factors correlated weakly (r = .08). We expected a proportion of absolute
judgments to also be automatic; although this correlation is weaker than anticipated, its
positive direction is consistent with that expectation. Finally, Absolute Judgment and Relative Judgment correlated moderately and
positively (r = .21), suggesting participants may engage in both processes but rely more on
one or the other, potentially as a function of other variables (such as memory strength).
Reliability. Table 3 provides internal consistency estimates for the subscales, item-
total correlations, and descriptives (DiStefano, Zhu, & Mîndrilă, 2009). All factors exceeded
the minimum acceptable value for internal consistency (≥ .70). Automatic Response met the
criterion for excellent reliability (≥ .90) and Relative Judgment met the criterion for good
reliability (≥ .80; George & Mallery, 2019; Nunnally, 1978, as cited in Petersen, 1994).
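These estimates can be obtained with the psych package, as sketched below for one subscale; the object names are hypothetical, and the composite is shown as an item mean on the 1-7 metric (the summed composite described later is equivalent for prediction up to scaling).

```r
library(psych)

# 'auto_items': hypothetical data frame of the six Automatic Response items
alpha(auto_items)                # Cronbach's alpha + item-total statistics
omega(auto_items, nfactors = 1)  # McDonald's omega total

# Composite score used in the predictive models below
choosers$auto <- rowMeans(auto_items, na.rm = TRUE)
```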
Rejections
We examined rejections, though our sample was relatively small for factor analysis (n =
283). Four participants were excluded from this analysis for not responding to all the items.
The data (n = 279) were not ideal for factor analysis but did meet the minimum requirements
for suitability (see supplemental materials). Therefore, we proceeded with the exploratory
factor analysis (see supplementary materials). Table 4 summarizes our results, which
indicated the two-factor solution was preferred.
We named the first factor Automatic Response because the items indicated a decision
made quickly and easily, despite being a rejection; the three items also appeared in the
Automatic Response factor for Identifiers. Our second factor we named Relative Judgment as
it comprised the same items as the Relative Judgment factor on our identifications scale.
Table 3
Internal Consistency, Item-total Correlations, and Descriptives for the Post-Lineup Identifier
Scale

Factors & Items | M (SD) | Item-total Correlation | ɑ if item deleted
Automatic Response (ɑ = .91, 95% CI [.90, .93]; ω = .93) | 4.73 (1.44) | |
I recognised the culprit easily. | 4.65 (1.86) | .93 | .88
The culprit’s face just “popped out” at me. | 4.82 (1.88) | .88 | .89
To what extent did the culprit stand out to you in the lineup? | 4.81 (1.69) | .88 | .89
How easy was it for you to make your decision? | 4.70 (1.65) | .88 | .89
[The non-chosen pictures] had little influence on my decision. I knew straight away who to pick. | 3.81 (1.89) | .58 | .93
Rate the extent to which your memory for the culprit influenced your decision. | 5.57 (1.31) | .65 | .92
Relative Judgment (ɑ = .87, 95% CI [.85, .89]; ω = .89) | 3.35 (1.63) | |
[The non-chosen pictures] had some influence on my decision because more than one face seemed familiar. | 3.63 (1.84) | .86 | .80
[The non-chosen pictures] had some influence on my decision because more than one face had the feature/features I remember about the perpetrator. | 3.55 (1.80) | .91 | .74
[The non-chosen pictures] confused me, and made the task more difficult. I didn’t know whom to pick. | 2.88 (1.85) | .69 | .91
Absolute Judgment (ɑ = .73, 95% CI [.68, .78]; ω = .74) | 5.52 (1.14) | |
I searched the faces for the feature/features I remembered. | 5.62 (1.31) | .71 | .60
I compared the photos to my memory in order to pick someone who was the closest match to what I remembered. | 5.54 (1.46) | .66 | .65
I searched the photos for a familiar face. | 5.41 (1.45) | .60 | .70

Note. ɑ = Cronbach’s alpha. ω = McDonald’s omega. In all cases, ratings ranged from 1-7.
Bolded words are shorthand terms for items.
Table 4
Summary of the Final Factor Structure for Rejectors’ Post-lineup Decision Process Items

Item | Auto (1) | Relative (2) | λ²
The culprit’s face just ‘popped out’ at me. | .96 | .08 | .84
I recognized the culprit easily. | .94 | .01 | .87
To what extent did the perpetrator stand out to you in the lineup? | .85 | -.02 | .74
[The non-chosen pictures] had some influence on my decision because more than one face had the feature/features I remember about the perpetrator. | .07 | 1.01 | .94
[The non-chosen pictures] had some influence on my decision because more than one face seemed familiar. | .06 | .89 | .74
[The non-chosen pictures] confused me, and made the task more difficult. I didn’t know whom to pick. | -.25 | .56 | .53
Unrotated sums of squared loadings | 2.04 | 1.80
Rotated sums of squared loadings | 2.61 | 2.13
Rotated proportion of variance | .44 | .36

Factor correlations | 1 | 2
1. Auto | - |
2. Relative | .56 | -

Note. Factor loadings > 0.30 are bolded. λ² = communality. Auto = Automatic Response.
Relative = Relative Judgment.
Reliability. We next examined the internal consistency estimates for the subscales,
item-total correlations, and descriptive statistics (Table 5). Both factors met the criterion for
good reliability (≥ .80; George & Mallery, 2019; Nunnally, 1978, as cited in Petersen, 1994).
Scale Validity
Convergent Validity
As an indicator of convergent validity, we examined the correlations between
participants’ composite scores on each factor and participants’ responses to the question
about the extent to which they made an absolute versus relative judgment (higher values
reflected greater agreement that a relative judgment was made compared to an absolute
judgment). We first considered identifications. As expected, the absolute/relative item
correlated negatively with Automatic Response, r(389) = -.44, p < .001, and positively with
Relative Judgment, r(389) = .40, p < .001. Unexpectedly, the absolute/relative item did not
correlate with Absolute Judgement, r(389) = .06, p = .20. For rejections, neither Automatic
Response, r(276) = -.02, p = .73, nor Relative Judgment, r(276) = .12, p = .053, correlated
with the absolute/relative item, although Relative Judgment approached significance.
Table 5
Internal Consistency, Item-total Correlations, and Descriptives for the Post-lineup Rejector
Scale

Factors & Items | M (SD) | Item-Total Correlation | ɑ if item deleted
Automatic Response (ɑ = .82, 95% CI [.78, .86]; ω = .84) | 1.92 (1.26) | |
The culprit’s face just “popped out” at me. | 1.81 (1.42) | .82 | .71
I recognised the culprit easily. | 1.86 (1.41) | .86 | .68
To what extent did the culprit stand out to you in the lineup? | 2.09 (1.57) | .61 | .87
Relative Judgment (ɑ = .81, 95% CI [.77, .85]; ω = .84) | 3.47 (1.68) | |
[The non-chosen pictures] had some influence on my decision because more than one face seemed familiar. | 3.40 (1.92) | .83 | .69
[The non-chosen pictures] had some influence on my decision because more than one face had the feature/features I remember about the perpetrator. | 3.58 (1.93) | .86 | .65
[The non-chosen pictures] confused me, and made the task more difficult. I didn’t know whom to pick. | 3.44 (2.06) | .58 | .88

Note. ɑ = Cronbach’s alpha. ω = McDonald’s omega. In all cases, ratings ranged from 1-7.
Bolded words are shorthand terms for items.
External Validity
We conducted additional analyses to test the external validity of the scale for
identifications (see supplementary materials). First, we considered the extent to which the
obtained factor structure generalized across our Memory Strength conditions for
identification. Confirmatory factor analysis indicated the factor structure was acceptable for
both conditions across multiple, though not all, fit indices and was a better fit to the strong
than the weak condition. Second, we considered how scores on the scale differed across
Memory Strength conditions. They differed in logical ways. Participants in the strong
condition (cf. the weak condition) scored higher on Automatic Response and lower on
Relative Judgment, though the latter depended on whether raw scores or regression factor
scores were used (with no difference when regression factor scores were used).
Predictive Validity
Identifications. We conducted logit logistic regressions to investigate whether
responses on the decision process scale were related to the accuracy of lineup selections. For
each model, we predicted identification accuracy such that suspect selections from target-
present lineups were considered accurate and all other selections were considered inaccurate.
First, we entered the composite scores from the questions comprising the three factors in our
questionnaire and found that two were reliably predictive: Automatic Response, β = 0.84, SE =
0.14, z = 5.90, p < .001, OR = 1.10 [1.07, 1.14], and Relative Judgment, β = -0.46, SE = 0.14, z =
3.36, p = .001, OR = 0.91 [0.86, 0.96], were significant predictors, whereas Absolute Judgment,
β = 0.18, SE = 0.12, z = 1.47, p = .14, OR = 1.05 [0.98, 1.13], was not.
Our scale will be more useful if it provides predictive power over and above that
provided by confidence and response time. We first confirmed these variables predicted
identification accuracy. Both confidence, β = 0.93, SE = 0.13, z = 7.14, p < .001, OR = 1.04
CI [1.03, 1.05], and response time, β = -0.27, SE = 0.12, z = 2.19, p = .028, OR = 0.98 CI
[0.96, 0.998], were reliable predictors. We then tested whether a model with confidence,
response time, Automatic Response, Relative Judgment, and Absolute Judgment predicted
identification accuracy. Compared to the confidence and response time only model (AIC =
463.65), the current model (AIC = 442.51) was a significantly better fit to the data, χ2 (3) =
27.14, p < .001. This indicates that self-reported decision processes are a useful reflector
variable. Within the current model, Relative Judgment, β = -0.48, SE = 0.14, z = 3.41, p =
.001, OR = 0.90 [0.86, 0.96], and confidence, β = 0.48, SE = 0.17, z = 2.82, p = .005, OR =
1.02 [1.01, 1.03], were the strongest predictors, followed by Automatic Response, β = 0.42,
SE = 0.19, z = 2.22, p = .026, OR = 1.05 [1.00, 1.10]. Neither Absolute Judgement, β = 0.19,
SE = 0.12, z = 1.52, p = .13, OR = 1.06 [0.98, 1.13], nor response time, β = -0.21, SE = 0.13, z
= 1.59, p = .11, OR = 0.98 [0.96, 1.00], were significant predictors in the current model.
Given that Absolute Judgement was not a significant predictor, we ran a fourth model,
removing Absolute Judgement as a predictor. Response time was still not a significant
predictor (p = .13), but the significance and relative priority of the remaining predictors was
maintained. The fourth model was not a significantly better fit to the data than the third
model, χ2 (1) = 2.29, p = .13, although it is more parsimonious and therefore preferable.
These data support our expectation that a more nuanced self-report questionnaire about
eyewitness identification decision processes can improve our ability to predict eyewitness
identification accuracy.6
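The model comparison above can be sketched in R as follows; the variable names are hypothetical, and the AIC and likelihood ratio values reported in the text come from our data rather than from this sketch.

```r
# accurate = 1 when the culprit was chosen from a target-present
# lineup, 0 for any other selection (choosers only)
m_base <- glm(accurate ~ confidence + response_time,
              family = binomial(link = "logit"), data = choosers)
m_full <- update(m_base, . ~ . + auto + relative + absolute)

AIC(m_base, m_full)                    # 463.65 vs. 442.51 in the text
anova(m_base, m_full, test = "Chisq")  # likelihood ratio test, df = 3
```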
Rejections. As with identifications, we produced a composite score for each factor
6 Best practice is to use factor scores rather than a summed composite score; however, we wanted to produce a
tool that was as easy as possible for practitioners to use. Importantly, when we repeated the analyses in this
section using factor scores, there was no difference in our conclusions and the variance accounted for by the
final model was identical as measured by pseudo-R2 measures (Cox & Snell = .24; Nagelkerke = .32). However,
the relative priority of the predictors changed such that Automatic Recognition and Relative Judgments were
stronger predictors (βs = 0.80) than confidence (β = .49). Furthermore, we drew identical conclusions when we
produced factors with different oblique rotations (Quartimin and Oblimin).
directly from the scale responses. In a logit logistic regression, neither Automatic Response (p
= .35) nor Relative Judgment (p = .70) was a significant predictor of rejection accuracy. This
was also true using regression factor scores as predictors rather than the composite scores.
Testing Applied Lineup Theory
Given that we successfully manipulated memory strength and obtained reliable
measures of decision processes, we examined whether our data supported Applied Lineup
Theory (Charman & Wells, 2007). Specifically, we examined whether decision processes—
operationalized by participants’ composite scores on the three factors derived from our factor
analysis—mediated the relationship between memory strength and identification accuracy.
We first constructed a model with our manipulation of memory strength as a predictor
of identification accuracy. As expected given our prior analyses, memory strength was a
reliable predictor, β = 0.45, SE = 0.10, z = 4.32, p < .001, OR = 2.46 [1.64, 3.71]. Next, we
examined whether Memory Strength predicted each of the three decision processes using
three separate models. The models for Automatic Response, β = 1.69, SE = 0.43, z = 3.92, p <
.001, OR = 29.08 [5.40, 156.48], and Relative Judgment, β = -0.49, SE = 0.25, z = 1.99, p =
.047, OR = 0.37 [0.14, 0.98], were significant, but the model for Absolute Judgment was not,
β = 0.11, SE = 0.17, z = 0.65, p = .52, OR = 1.25 [0.64, 2.46]. As such, we next tested
whether Automatic Response and Relative Judgment mediated the relationship between
Memory Strength and identification accuracy by constructing models with both Memory
Strength and the decision process as predictors of identification accuracy.
Figure 1
Path Models Predicting Identification Accuracy from Memory Strength and Decision
Processes
Note: path coefficients are standardized.
In the two-predictor model with Automatic Response, both Automatic Response, β = 1.02, SE
= 0.13, z = 7.76, p < .001, OR = 1.12 [1.09, 1.16], and Memory Strength, β = 0.34, SE = 0.12,
z = 2.94, p = .003, OR = 1.97 [1.25, 3.10], were significant. Notably, the effect of Memory
Strength (β = 0.34) was smaller than in the model with only Memory Strength (β = 0.45),
indicating partial mediation. The effect was tested by producing a 95% confidence interval
via bootstrapping with 1000 bootstrapped samples. The indirect effect of Memory Strength
[Figure 1 is not reproduced here. Panel A: memory strength to identification accuracy via Automatic Response. Panel B: memory strength to identification accuracy via Relative Judgment (a* = -0.10, b* = -0.34, c' = 0.19, c = 0.22). Panel C: memory strength to identification accuracy via Absolute Judgment (a* = 0.03, b* = 0.03, c' = 0.22, c = 0.22).]
through Automatic Response was .08 [.04, .13], which is 36% of the overall effect of
Memory Strength. The 95% confidence interval did not include zero, indicating the partial
mediation was reliably different from zero. Figure 1A depicts the mediation path model.
Both Relative Judgment, β = -0.78, SE = 0.12, z = 6.57, p < .001, OR = 0.85 [0.81,
0.89], and Memory Strength, β = 0.43, SE = 0.11, z = 3.86, p < .001, OR = 2.36 [1.53, 3.67],
were significant in the two-predictor mediation model. The regression coefficient for
Memory Strength was slightly lower than in the Memory Strength only model, again
suggesting partial mediation. The indirect effect of Memory Strength through Relative
Judgment was .03 [0, .07], which is 14% of the total effect—smaller than for Automatic
Response. The bootstrapped 95% confidence interval included zero, indicating that this
indirect effect was not reliable. Figure 1B depicts this path model with standardized
path coefficients. Figure 1C depicts the path model for Absolute Judgement for completeness.
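The bootstrapped indirect effects can be sketched with the boot package as follows; the links shown (a linear a-path and a logistic b-path) are illustrative assumptions, and our exact specifications are in the analysis scripts on the OSF.

```r
library(boot)

# Indirect (a*b) effect of memory strength on accuracy through
# Automatic Response (hypothetical column names)
indirect <- function(data, indices) {
  d <- data[indices, ]
  a <- coef(lm(auto ~ memory_strength, data = d))[["memory_strength"]]
  b <- coef(glm(accurate ~ memory_strength + auto,
                family = binomial, data = d))[["auto"]]
  a * b
}

set.seed(2024)
boot_out <- boot(choosers, statistic = indirect, R = 1000)
boot.ci(boot_out, type = "perc")  # interval excluding zero => mediation
```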
In summary, our results are consistent with applied lineup theory (Charman & Wells,
2007) regarding how memory strength, decision processes, and identification accuracy relate.
Discussion
Our goal was to develop a self-report decision process questionnaire more nuanced
than Dunning and Stern’s (1994) or asking eyewitnesses whether they made a relative versus
absolute judgment (e.g., Smith et al., 2000; 2001), and in doing so, enhance the ability of
decision processes to predict identification accuracy beyond what confidence and response
time already provide. We also hoped to inform theory: whether a unidimensional or
multidimensional approach to decision processes is more appropriate and the extent to which
applied lineup theory (Charman & Wells, 2007) can account for identification performance.
A secondary goal was to examine the decision processes of eyewitnesses who reject a lineup.
Critically, we demonstrated that self-reports of decision processes (Automatic
Response, Relative Judgment) explained variability in eyewitness identification decisions that
was not accounted for when only confidence and response time were used. This indicates that
administering our scale to eyewitnesses can improve predictions of whether the eyewitness is
accurate. Thus, consistent with Wittwer et al. (2022), we find that self-reports of decision
processes are a valid reflector variable.
Eyewitness decision processes were best accounted for using multiple, correlated
factors, consistent with our expectations. Building on Dunning and Stern (1994), we found
one dimension represented automatic responses, and—in the case of identifications—two
dimensions represented the type of deliberation used (only one deliberative dimension was
found for rejections). The nature of these dimensions reflected Wells’ (1984)
conceptualizations of how people approach lineups—using primarily relative or absolute
judgment.
Although researchers have often thought of eyewitnesses as using one or the other of
absolute versus relative judgment, research suggests eyewitnesses may use a combination
(e.g., Mansour et al., 2009). Our results support this conclusion: our Relative Judgment and
Absolute Judgment factors were positively correlated. Moreover, Relative Judgment was
strongly and negatively correlated with Automatic Response, as would be expected.
Surprisingly though, the correlation between Absolute Judgment and Automatic Response
was negligible. This indicates that automatic recognition and absolute judgments are not
interchangeable and that eyewitness decision processes are not best reflected by a single
continuous dimension, as suggested by prior research (e.g., Dunning & Stern, 1994; Fife et
al., 2014) and theory (e.g., Charman & Wells, 2007). Indeed, a one-factor solution was an
inadequate fit to the identification data.
Our Automatic Response factor seemed to reflect a quick and effortless process, like
Dunning and Stern’s (1994) automatic recognition and Wittwer et al.’s (2022) Automatic
factor. The questions associated with this factor asked participants to indicate the extent to
which a face popped out or stood out for them, whether they found the decision easy, and
how much they relied on their memory. High ratings on these questions align well with conceptions of automatic
responses as being uncontrolled (i.e., occurring regardless of the task one is asked to do),
unconscious (i.e., one need not think to produce the response), efficient (i.e., requiring little
or no attention), and fast (Moors, 2016). That is, our questions about whether the face popped
out or stood out get at the response being uncontrolled and fast while our questions about the
ease with which the identification was made reflect unconsciousness and efficiency. The
question about memory reflects the mechanism that leads to automatic responses—repeated
or good quality exposure. Consistent with the notion that automatic responses are fast, the
inclusion of the Automatic Response factor alongside confidence and response time as
predictors of identification accuracy noticeably reduced the power of confidence and
response time to predict accuracy, based on standardized betas. The contribution of response time became non-
significant and the contribution of confidence became less than the contribution of Relative
Judgment. In contrast, Automatic Response accounted for the most variance when only the
decision process composite scores were used as predictors. It is notable that Automatic
Response is the factor on which the largest number of items from our questionnaire loaded,
even though its unique contribution may be more modest. Future research examining more
items related to other decision processes may produce a better questionnaire.
We expected an automatic response to be negatively correlated or uncorrelated with
counterproductive or less efficacious deliberative processes. This was true in our data. The
Automatic Response factor was negatively and strongly correlated with the Relative Judgment
factor, which was associated with lower identification accuracy. However, it is perhaps
surprising that the correlation between Automatic Response and Absolute Judgment was as
weak as it was (r = .08). Earlier we noted that we do not consider automatic recognition and
absolute judgments as equivalent, but we do think that automatic recognition can be
considered a special case of absolute judgment. We expected a moderate correlation between
those processes but instead found a small positive correlation between Absolute Judgment
and Relative Judgment (r = .21). This suggests a tendency to sometimes engage in relative
judgments even when one can and does engage in absolute judgments.
Another reason for the low correlation between Automatic Response and Absolute
Judgments may be related to our identification procedure. We used simultaneous lineups in
this experiment; it is reasonable to expect that placing all lineup members side by side
encourages relative judgments even when they may be unnecessary. We can imagine that the
correlation may have been stronger had we used sequential lineups—and the correlation
between Absolute Judgments and Relative Judgments weaker. Future research should
consider the generalizability of our scale across identification procedures.
A third possibility is that we should rethink our conceptualization of Factor 3.
We were surprised to find that identifiers’ scores for our Absolute Judgment factor did not
correlate with our Likert-style absolute/relative judgment question, although the question did
correlate with our Automatic Response and Relative Judgment factors. Although each
question that loaded on the Absolute Judgement factor implied a comparison between the
eyewitness’ memory and the lineup members (“I searched the faces for the feature/features I
remembered”, “I compared the photos to my memory”, “I searched for a familiar face”), this
factor may actually tap into participants’ perception of their memory strength. Yet, if that was
the case, we would expect the Absolute Judgment factor to predict identification accuracy and
to be related to memory strength, neither of which was the case, as demonstrated in our
consideration of applied lineup theory (Charman & Wells, 2007).
On the other hand, it seems logical that eyewitnesses who do not experience
automatic recognition will engage in a more extended absolute judgment process, that is, a
search for a match to memory. Regardless, if an absolute judgment strategy fails,
eyewitnesses may turn to a relative judgment process—or they may start with a relative
judgment process and finish with an absolute judgment process. Thus, we might expect
accurate and inaccurate eyewitnesses to engage in absolute judgment. To that end, our third
factor may reflect a search that occurs regardless of memory strength and which is unrelated
to accuracy. Still, this explanation is unsatisfactory because we found a small positive
correlation between Automatic Recognition and Absolute Judgement. We would expect a
negative correlation in this case because eyewitnesses who experience automatic recognition
would be expected to engage in no or a very short search. We suggest that future researchers
consider asking questions about the temporal nature of the eyewitness’ decision process to
tease apart whether the eyewitness started with an absolute judgment process and then
followed up with a relative judgment process or vice versa.
To what extent do our items align with prior efforts to produce a decision process
questionnaire? Many of our items derived from Dunning and Stern’s (1994) seminal work,
and our final set incorporated items that resonated with four of their five.
Although we had an item that asked about the extent to which participants used an
elimination strategy, this item did not survive the exploratory factor analysis. This was
surprising; however, the item may have been double-barreled (see Appendix A), which may
have reduced its utility. Nonetheless, our items were broadly consistent with Dunning and
Stern’s work.
Wittwer et al. (2022) also found evidence that people may engage in multiple types of
judgments. Like us, they found an Automatic factor that was related to accuracy (though not
identification accuracy in their case7). That some eyewitnesses experience automatic
recognition, and that this is relevant to performance, appears ubiquitous. In both studies they
labelled a factor Elimination, and the items loading on this factor resonate with our Relative
Judgment factor. As argued earlier, an elimination process can be considered a type of
relative judgment, and the questions on this factor were consistent with that notion (e.g., “I
compared the lineup members to each other, especially their facial features, looking for ways
in which they were dissimilar. This helped me make my decision.”). While we labelled our
remaining factor as Absolute Judgment, they labelled their remaining factors as Lack of
Familiarity and Difficulty Feeling (Experiment 1) and Familiarity-Search (Experiment 2;
their fourth factor, labelled Facial Matching Detail, was not considered sufficiently reliable to
maintain). The items on their “familiarity” factors resonate with the items on our Absolute
Judgment factor, as they point to examination of features (e.g., “There were some features that
I remembered about the perpetrator that were missing from all the lineup members’ faces.”)
and the impact of non-chosen faces (e.g., “When I inspected the photographs, I immediately
noticed that no face matched the culprit.”). Interestingly, their “familiarity” factors tended to
be associated with accuracy while their Elimination factor was not. This contrasts with our
finding that Relative Judgment, but not Absolute Judgment, was predictive. However, because
our scale and Wittwer et al.’s used different items, it is impossible to draw firm conclusions.

7 In two studies, Wittwer et al. (2022) reported a significant interaction between their Automatic Recognition
factor and target presence, such that scores on the Automatic Recognition factor were higher for (correct)
identifications from target-present lineups than for any other target-present decision, but lower for (filler)
identifications than for correct rejections from target-absent lineups.
How can we contextualize our findings in relation to theory? First, these results are
contrary to existing conceptualizations of the lineup decision process as comprising a single
dimension (e.g., Charman & Wells, 2007). We found that multiple, correlated dimensions
accounted for our participants’ characterizations of their lineup decision, whether it was an
identification or a rejection. Moreover, our Automatic Response factor was separable from
our Absolute Judgment factor, suggesting that automatic recognition and absolute judgments
are not interchangeable. Their very low correlation suggests they may reflect quite different
processes—or at least that people perceive them differently. That (for identifications) our
Relative Judgment factor correlated with the Absolute Judgment factor supports previous
findings and speculations that eyewitnesses engage in a mix of absolute and relative
judgments (e.g., Mansour et al., 2009) but adds to our understanding by highlighting their
relationship with automatic recognition.
The fact that we manipulated memory strength and measured both decision processes
and identification accuracy provided an opportunity to test a key tenet of applied lineup
theory (Charman & Wells, 2007). The theory suggests that decision processes mediate or
partially mediate the relationship between memory strength and identification accuracy. In
the current study, we found that memory strength and decision processes were strongly
related to identification accuracy; therefore, we tested a mediation model for the decision
process factors we found. Automatic Response was a significant partial mediator, providing
support for applied lineup theory. As shown in Figure 1A, a stronger memory was associated
with greater endorsement of Automatic Response, which in turn increased the likelihood of
identification accuracy. Because mediation was partial, memory strength also directly
affected the likelihood of accuracy.
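
For readers who wish to fit a model of this general shape, a minimal sketch in lavaan
(Rosseel, 2012) follows. The variable names (memory, automatic, accurate) and the data
frame dat are hypothetical placeholders, not our analysis script (our scripts are available on
the OSF):

```r
library(lavaan)

# Hypothetical variables: memory (0 = weak, 1 = strong), automatic
# (Automatic Response composite), accurate (0/1 identification accuracy).
model <- '
  automatic ~ a * memory                  # memory strength -> decision process
  accurate  ~ b * automatic + c * memory  # mediator and direct (c) paths
  indirect := a * b                       # mediated effect
  total    := c + (a * b)                 # total effect
'

# Declaring the binary outcome as ordered invokes a probit model (WLSMV).
fit <- sem(model, data = dat, ordered = "accurate", estimator = "WLSMV")
summary(fit, standardized = TRUE)
```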
As a secondary question, we examined decision processes by those who rejected the
lineup. We were interested to find that a two-factor solution explained substantial variance
(79%) in rejectors’ responses to the items and that the factors mirrored those produced for
identifiers: Automatic Response and Relative Judgment. Yet, these factors were
not predictive of rejection accuracy. Given that our items were constructed with identifiers in
mind, we are reluctant to make much of this result. Future research should take a more
systematic approach to producing items relevant to rejectors. Indeed, other research has
suggested that rejections may involve rather different decision processes from identifications
(Lindsay et al., 2013; Sporer, 1992).
Limitations
A few limitations are worth considering. Although we used a variety of mock-crime
videos, they were similar in content—with a strong and obvious focus on the culprit. If our
scale is to be useful, it will need to be shown to be effective across a broader range of
memory qualities. Similarly, our targets were all White males and our participants were
primarily White, so it is unlikely that many of our participants experienced a cross-race
effect, which may significantly influence one’s decision process. Another limitation is that, in
constructing our target-absent lineups, we simply replaced the culprit with another filler. It is
possible that this influenced our identification results such that the specific filler chosen may
have been a particularly good or poor filler, and therefore affected how the participants made
their decision. However, as we did not designate a specific innocent suspect for our analysis,
this concern is mitigated to some extent (i.e., the decision processes considered were for all
target-absent lineup identifications). Encouragingly, the rates of filler identifications and
correct rejections from our target-absent lineups almost perfectly aligned with the mean rates
reported by Steblay et al. (2011) in their meta-analysis (Steblay et al.: 43% filler
identifications, 57% correct rejections; us: 40% filler identifications, 60% correct rejections).
A final limitation is that prior to the lineup decision, participants were asked to predict how
they would make their lineup decision using modified versions of the questions they were
asked after they made their lineup decision. Doing so may have affected their decision
process or responses to the subsequent decision process questions.
Implications
We have directly demonstrated that 12 self-report items improve the ability to
predict identification accuracy when confidence and response time are also considered,
providing further evidence that asking participants to self-report about their decision
processes can enhance predictions of eyewitness accuracy. Given how ubiquitous
identification evidence is, how difficult errors are to detect, and that many police jurisdictions
do not ask eyewitnesses to judge their confidence, more reflectors of eyewitness accuracy are
needed (Wells, 2020). Self-reports, which would require minimal resources to administer,
deserve further attention as reflectors of eyewitness identification accuracy. People are
relatively familiar with answering Likert-style questions. This short questionnaire can be
completed within a minute or two and, if needed, officers could even read out the questions
to people with literacy challenges. The use of scales as evidence is already commonplace
within criminal justice (e.g., for risk assessments, competency assessments) and therefore
there may be less resistance to their use than other tools. For example, many jurisdictions are
reluctant to ask eyewitnesses how confident they are because any response other than 100%
suggests the possibility that the eyewitness is inaccurate. In contrast, questions about how the
eyewitness made their decision may be less controversial because no single response
obviously indicates inaccuracy.
After sufficient replication and evidence of generalizability, this questionnaire could
be completed by eyewitnesses immediately after they express their confidence and provided
as part of the evidence presented in court. We could see this tool being used by investigators
to help them judge how confident they should be about an identification. If the eyewitness’
score on the scale (in combination with their confidence if possible) suggests a moderate or
low probability of accuracy, we would hope they would be more inclined to investigate
alternative suspects. Prosecutors might use this tool to inform their considerations of plea
deals. Lawyers might use this tool to better inform fact finders about how to weight
identification evidence. Of course, like any tool, this one would have to be used carefully to
ensure it is not abused. It is certainly plausible that an eyewitness could be coached about
how to respond, which would undermine the tool’s effectiveness. However, such concerns
should not be new to the criminal justice system: many assessment tools used regularly in
criminal justice can be abused similarly. It is important that best practices for administering
such a tool be studied and adhered to.
Conclusions
To summarize, the factors we found and the pattern of correlations amongst them
highlight the multidimensional nature of eyewitness identifications, which has implications
for theories of eyewitness decision processes; we also found support for applied lineup
theory (Charman & Wells, 2007). An Automatic Response factor and two deliberative
decision process factors (Relative Judgment and Absolute Judgment) reflected the decision
processes of participants who made a lineup identification, and all three factors had adequate
internal reliability. Importantly, scores based on these factors not only predicted
identification accuracy but improved predictions of accuracy compared to using only
confidence and response time. Although validation and replication are clearly necessary, it
appears that eyewitnesses’ self-reports of their decision processes can be an effective reflector
variable—one that can easily be incorporated into extant identification procedures—and one
that provides predictive power over and above confidence and response time.
References
Žiberna, A., & Cugmas, M. (2023). multiUS: Functions for the courses multivariate analysis
and computer intensive methods (Version 1.2.3) [R package]. https://CRAN.R-
project.org/package=multiUS
Baldassari, M. J., Kantner, J., & Lindsay, D. S. (2019). The importance of decision bias for
predicting eyewitness lineup choices: toward a Lineup Skills Test. Cognitive
Research: Principles and Implications, 4, 1-13. https://doi.org/10.1186/s41235-018-
0150-3
Ben-Shachar, M., Lüdecke, D., & Makowski, D. (2020). effectsize: Estimation of effect size
indices and standardized parameters. Journal of Open Source Software, 5(56), 2815.
https://doi.org/10.21105/joss.02815
Bindemann, M., Brown, C., Koyas, T., & Russ, A. (2012). Individual differences in face
identification postdict eyewitness accuracy. Journal of Applied Research in Memory
and Cognition, 1(2), 96-103. https://doi.org/10.1016/j.jarmac.2012.02.001
Boateng, G. O., Neilands, T. B., Frongillo, E. A., Melgar-Quiñonez, H. R., & Young, S. L.
(2018). Best practices for developing and validating scales for health, social, and
behavioral research: A primer. Frontiers in Public Health, 6, 149.
https://doi.org/10.3389/fpubh.2018.00149
Boyce, M., Beaudry, J., & Lindsay, R. C. L. (2007). Belief of eyewitness identification
evidence. In The handbook of eyewitness psychology: Volume II (pp. 515-540).
Psychology Press.
Brewer, N., Caon, A., Todd, C., & Weber, N. (2006). Eyewitness identification accuracy and
response latency. Law and Human Behavior, 30(1), 31–50.
https://doi.org/10.1007/s10979-006-9002-7
Canty, A., & Ripley, B. (2021). boot: Bootstrap R (S-Plus) functions (Version 1.3-28) [R
package]. https://CRAN.R-project.org/package=boot
Charman, S. D., & Cahill, B. S. (2012). Witnesses’ memories for lineup fillers postdicts their
identification accuracy. Journal of Applied Research in Memory and Cognition, 1(1),
11-17. https://doi.org/10.1016/j.jarmac.2011.08.001
Charman, S., & Wells, G. L. (2007). Applied lineup theory. In R. C. L. Lindsay, D. F. Ross,
J. Read, M. P. Toglia (Eds.). The handbook of eyewitness psychology, Vol II: Memory
for people (pp. 219–254). Lawrence Erlbaum Associates, Publishers.
Clark, S. E. (2003). A memory and decision model for eyewitness identification. Applied
Cognitive Psychology, 17(6), 629-654. https://doi.org/10.1002/acp.891
Clark, S. E., Erickson, M. A., & Breneman, J. (2011). Probative value of absolute and
relative judgments in eyewitness identification. Law and Human Behavior, 35, 364–
380. https://doi.org/10.1007/s10979-010-9245-1
Cutler, B. L., Penrod, S. D., & Dexter, H. R. (1990). Juror sensitivity to eyewitness
identification evidence. Law and Human Behavior, 14(2), 185-191.
https://doi.org/10.1007/BF01062972
Cutler, B. L., Penrod, S. D., & Stuve, T. E. (1988). Juror decision making in eyewitness
identification cases. Law and Human Behavior, 12(1), 41-55.
https://doi.org/10.1007/BF01064273
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application.
Cambridge University Press.
DeCarlo, L. T. (1998). Signal detection theory and generalized linear models. Psychological
Methods, 3(2), 186–205. https://doi.org/10.1037/1082-989X.3.2.186
Dunning, D., & Stern, L. B. (1994). Distinguishing accurate from inaccurate eyewitness
identifications via inquiries about decision processes. Journal of Personality and Social
Psychology, 67(5), 818–835. https://doi.org/10.1037/0022-3514.67.5.818
Fife, D., Perry, C., & Gronlund, S. D. (2014). Revisiting absolute and relative judgments in
the WITNESS model. Psychonomic Bulletin Review, 21, 479–487.
https://doi.org/10.3758/s13423-013-0493-1
Firke, S. (2021). janitor: Simple tools for examining and cleaning dirty data (Version 2.1.0)
[R package]. https://CRAN.R-project.org/package=janitor
Flowe, H. (2011). An exploration of visual behaviour in eyewitness identification tests.
Applied Cognitive Psychology, 25(2), 244-254. https://doi.org/10.1002/acp.1670
Flowe, H., & Cottrell, G. W. (2011). An examination of simultaneous lineup identification
decision processes using eye tracking. Applied Cognitive Psychology, 25(3), 443–451.
https://doi.org/10.1002/acp.1711
George, D., & Mallery, P. (2019). IBM SPSS statistics 25 step by step: A simple guide and
reference. Taylor & Francis.
Gronlund, S. D. (2005). Sequential lineup advantage: Contributions of distinctiveness and
recollection. Applied Cognitive Psychology, 19(1), 23-37.
https://doi.org/10.1002/acp.1047
Harrell Jr, F. (2022). Hmisc: Harrell miscellaneous (Version 4.7-0) [R package].
https://CRAN.R-project.org/package=Hmisc
Harvey, A. J., Shrimpton, B., Azzopardi, Z., O'Grady, K., Hicks, E., Hirst, E., & Atkinson-
Cox, K. (2020). The influence of alcohol and weapon presence on eyewitness memory
and confidence. Applied Cognitive Psychology, 34(2), 489-503.
https://doi.org/10.1002/acp.3636
Josephson, S., & Holmes, M. E. (2011). Selecting the suspect: An eye-tracking comparison
of viewing of same-race vs. cross-race photographs in eyewitness identification. Visual
Communication Quarterly, 18(4), 236–249.
http://doi.org/10.1080/15551393.2011.627280
Kneller, W., Memon, A., & Stevenage, S. (2001). Simultaneous and sequential lineups:
Decision processes of accurate and inaccurate eyewitnesses. Applied Cognitive
Psychology, 15(6), 659-671. https://doi.org/10.1002/acp.739
Lindsay, R. C. L., & Bellinger, K. (1999). Alternatives to the sequential lineup: The
importance of controlling the pictures. Journal of Applied Psychology, 84(3), 315–
321. https://doi.org/10.1037/0021-9010.84.3.315
Lindsay, R. C. L., Kalmet, N., Leung, J., Bertrand, M., Sauer, J. D., & Sauerland, M. (2013).
Confidence and accuracy of lineup selections and rejections: Postdicting rejection
accuracy with confidence. Journal of Applied Research in Memory and Cognition, 2(3),
179–184. https://doi.org/10.1016/j.jarmac.2013.06.002
Lindsay, R. C. L., Lea, J. A., Nosworthy, G. J., Fulford, J. A., Hector, J., Le Van, V., &
Seabrook, C. (1991). Biased lineups: Sequential presentation reduces the problem.
Journal of Applied Psychology, 76(6), 796–802. https://doi.org/10.1037/0021-
9010.76.6.796
Mansour, J. K., Beaudry, J. L., Bertrand, M. I., Kalmet, N., Melsom, E. I., & Lindsay, R. C.
L. (2020). Impact of disguise on identification decisions and confidence with
simultaneous and sequential lineups. Law and Human Behavior, 44(6), 502–515.
https://doi.org/10.1037/lhb0000427
Mansour, J. K., & da Costa, J. R. (2015, June 24-27). Testing applied lineup theory using
metacognitions: The relationship between memory strength, decision strategy, and
identification accuracy [Conference presentation]. Society for Applied Research in
Memory and Cognition, Victoria, Canada.
Mansour, J. K., Lindsay, R. C. L, Brewer, N., & Munhall, K. G. (2009). Characterizing visual
behaviour on a lineup task. Applied Cognitive Psychology, 23(7), 1012–1026.
https://doi.org/10.1002/acp.1570
Moors, A. (2016). Automaticity: Componential, causal, and mechanistic explanations.
Annual Review of Psychology, 67, 263–287. https://doi.org/10.1146/annurev-psych-
122414-033550
Müller, K. (2020). here: A simpler way to find your files (Version 1.0.1) [R package].
https://CRAN.R-project.org/package=here
Nyman, T. J., Lampinen, J. M., Antfolk, J., Korkman, J., & Santtila, P. (2019). The distance
threshold of reliable eyewitness identification. Law and Human Behavior, 43(6), 527–
541. https://doi.org/10.1037/lhb0000342
Penrod, S., & Cutler, B. (1995). Witness confidence and witness accuracy: Assessing their
forensic relation. Psychology, Public Policy, and Law, 1(4), 817–845.
https://doi.org/10.1037/1076-8971.1.4.817
Qualtrics software. (2015). Provo, UT, USA. http://www.qualtrics.com
Quigley-McBride, A., & Wells, G. L. (2023). Eyewitness confidence and decision time
reflect identification accuracy in actual police lineups. Law and Human Behavior,
47(2), 333–347. https://doi.org/10.1037/lhb0000518
R Core Team (2024). R: A language and environment for statistical computing. R Foundation
for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
Revelle, W. (2022). psych: Procedures for personality and psychological research (Version
2.2.5) [R package]. https://CRAN.R-project.org/package=psych
Robinson, M. D., & Johnson, J. T. (1998). How not to enhance the confidence—accuracy
relation: The detrimental effects of attention to the identification process. Law and
Human Behavior, 22(4), 409-428. https://doi.org/10.1023/A:1025770926338
Ross, D. F., Benton, T. R., McDonnell, S., Metzger, R., & Silver, C. (2007). When accurate
and inaccurate eyewitnesses look the same: A limitation of the ‘pop-out’ effect and the
10- to 12-second rule. Applied Cognitive Psychology, 21(5), 677–690.
https://doi.org/10.1002/acp.1308
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of
Statistical Software, 48(2), 1-36. https://doi.org/10.18637/jss.v048.i02
Saraiva, R. B., van Boeijen, I., Hope, L., Horselenberg, R., Sauerland, M., & van Koppen, P.
J. (2020a). Eyewitness metamemory predicts identification performance in biased and
unbiased lineups. Legal and Criminological Psychology, 25(2), 111-132.
https://doi.org/10.1111/lcrp.12166
Saraiva, R. B., Hope, L., Horselenberg, R., Ost, J., Sauer, J. D., & van Koppen, P. J. (2020b).
Using metamemory measures and memory tests to estimate eyewitness free recall
performance. Memory, 28(1), 94-106. https://doi.org/10.1080/09658211.2019.1688835
Sauer, J. D., Palmer, M. A., & Brewer, N. (2019). Pitfalls in using eyewitness confidence to
diagnose the accuracy of an individual identification decision. Psychology, Public
Policy, and Law, 25(3), 147–165. https://doi.org/10.1037/law0000203
Sauerland, M., Sagana, A., & Sporer, S. L. (2012). Assessing nonchoosers' eyewitness
identification accuracy from photographic showups by using confidence and response
times. Law and Human Behavior, 36(5), 394–403. https://doi.org/10.1037/h0093926
Sauerland, M., & Sporer, S. L. (2007). Post-decision confidence, decision time, and self-
reported decision processes as postdictors of identification accuracy. Psychology,
Crime & Law, 13(6), 611–625. https://doi.org/10.1080/10683160701264561
Sauerland, M., & Sporer, S. L. (2009). Fast and confident: Postdicting eyewitness
identification accuracy in a field study. Journal of Experimental Psychology: Applied,
15(1), 46–62. https://doi.org/10.1037/a0014560
Seale-Carlisle, T. M., Colloff, M. F., Flowe, H. D., Wells, W., Wixted, J. T., & Mickes, L.
(2019). Confidence and response time as indicators of eyewitness identification
accuracy in the lab and in the real world. Journal of Applied Research in Memory and
Cognition, 8(4), 420-428. https://doi.org/10.1016/j.jarmac.2019.09.003
Slane, C. R., & Dodson, C. S. (2022). Eyewitness confidence and mock juror decisions of
guilt: A meta-analytic review. Law and Human Behavior, 46(1), 45.
https://doi.org/10.1037/lhb0000481
Smith, A. M., Smalarz, L., Wells, G. L., Lampinen, J. M., & Mackovichova, S. (2022). Fair
lineups improve outside observers’ discriminability, not eyewitnesses’ discriminability:
Evidence for differential filler-siphoning using empirical data and the WITNESS
computer-simulation architecture. Journal of Applied Research in Memory and
Cognition, 11(4), 534–544. https://doi.org/10.1037/mac0000021
Smith, S. M., Lindsay, R. C. L., & Pryke, S. (2000). Postdictors of eyewitness errors: Can
false identifications be diagnosed? Journal of Applied Psychology, 85(4), 542–550.
https://doi.org/10.1037/0021-9010.85.4.542
Smith, S. M., Lindsay, R. C. L., Pryke, S., & Dysart, J. E. (2001). Postdictors of eyewitness
errors: Can false identifications be diagnosed in the cross-race situation? Psychology,
Public Policy, and Law, 7(1), 153–169. https://doi.org/10.1037/1076-8971.7.1.153
Smith, S. M., Stinson, V., & Prosser, M. A. (2004). Do they all look alike? An exploration of
decision-making strategies in cross-race facial identifications. Canadian Journal of
Behavioural Science / Revue Canadienne des Sciences du Comportement, 36(2), 146–
154. https://doi.org/10.1037/h0087225
Sporer, S. L. (1992). Post-dicting eyewitness accuracy: Confidence, decision-times and
person descriptions of choosers and non-choosers. European Journal of Social
Psychology, 22(2), 157–180. https://doi.org/10.1002/ejsp.2420220205
Steblay, N. K., Dysart, J. E., & Wells, G. L. (2011). Seventy-two tests of the sequential
lineup superiority effect: A meta-analysis and policy discussion. Psychology, Public
Policy, and Law, 17(1), 99–139. https://doi.org/10.1037/a0021650
Tredoux, C. G. (1998). Statistical inference on measures of lineup fairness. Law and Human
Behavior, 22(2), 217-237. https://doi.org/10.1023/A:1025746220886
Tredoux, C., & Naylor, T. (2018). r4lineups: Statistical inference on lineup fairness (Version
0.1.1) [R package]. https://CRAN.R-project.org/package=r4lineups
Weber, N., Brewer, N., Wells, G. L., Semmler, C., & Keast, A. (2004). Eyewitness
identification accuracy and response latency: The unruly 10-12-second rule. Journal of
Experimental Psychology: Applied, 10(3), 139–147. https://doi.org/10.1037/1076-
898X.10.3.139
Wells, G. L. (1984). The psychology of lineup identifications. Journal of Applied Social
Psychology, 14(2), 89–103. https://doi.org/10.1111/j.1559-1816.1984.tb02223.x
Wells, G. L. (2020). Psychological science on eyewitness identification and its impact on
police practices and policies. American Psychologist, 75(9), 1316–1329.
https://doi.org/10.1037/amp0000749
Wells, G. L., & Windschitl, P. (1999). Stimulus sampling and social psychological
experimentation. Personality and Social Psychology Bulletin, 25(9), 1115–1125.
https://doi.org/10.1177/01461672992512005
Wickham, H. (2019). stringr: Simple, consistent wrappers for common string operations
(Version 1.4.0) [R package]. https://CRAN.R-project.org/package=stringr
Wickham, H., Chang, W., Henry, L., Pedersen, T. L., Takahashi, K., Wilke, C., Woo, K.,
Yutani, H., Dunnington, D., & Posit, PBC. (2016). ggplot2: Elegant graphics for data
analysis (Version 3.4.2) [R package]. https://cran.r-project.org/package=ggplot2
Wickham, H., Franҫois, R., Henry, L., Müller, K., Vaughan, D., & Posit Software, PBC.
(2023). dplyr: A grammar of data manipulation (Version 1.1.2) [R package].
https://cran.r-project.org/package=dplyr
Wickham, H., & Girlich, M. (2022). tidyr: Tidy messy data (Version 1.2.0) [R package].
IDENTIFICATION DECISION PROCESSES 43
https://CRAN.R-project.org/package=tidyr
Wickham, H., Hester, J., & Bryan, J. (2022). readr: Read rectangular text data (Version
2.1.2) [R package]. https://CRAN.R-project.org/package=readr
Wittwer, T., Tredoux, C. G., Py, J., Nortje, A., Kempen, K., & Launay, C. (2022). Automatic
recognition, elimination strategy and familiarity feeling: Cognitive processes predict
accuracy from lineup identifications. Consciousness and Cognition, 98, 103266.
https://doi.org/10.1016/j.concog.2021.103266
Wixted, J. T., & Mickes, L. (2014). A signal-detection-based diagnostic-feature-detection
model of eyewitness identification. Psychological Review, 121(2), 262–276.
https://doi.org/10.1037/a0035940
Wixted, J. T., & Wells, G. L. (2017). The relationship between eyewitness confidence and
identification accuracy: A new synthesis. Psychological Science in the Public Interest,
18(1), 10-65. https://doi.org/10.1177/1529100616686966
Appendix A: Post-lineup Decision Process Questions

How easy was it for you to make your decision?* (1 = extremely difficult – 7 = extremely easy)

To what extent did the perpetrator stand out to you in the lineup?*† (1 = not at all distinct –
7 = extremely distinct)

Rate the extent to which the following factors influenced your decision (1 = no influence –
7 = strong influence):
- The lineup pictures.
- Your memory for the culprit.*

Please rate the extent to which you agree with each of the following statements (1 = strongly
disagree – 7 = strongly agree):
- The culprit’s face just ‘popped out’ at me.†‡
- I recognized the culprit easily.†‡
- I searched the photos for a familiar face.†
- I searched the faces for the feature/features I remembered.†
- I compared the photos to my memory in order to pick someone who was the closest match
  to what I remembered.†
- I first eliminated the ones that were definitely not the culprit, and then chose from those
  remaining. I didn’t recognize the culprit easily.
- I looked for clues in the lineup as to whom I should pick.

Regarding the influence of the other (non-chosen) photos (1 = strongly disagree – 7 = strongly
agree):
- They helped confirm my decision, to reinforce my decision after it was made.
- They had little influence on my decision. I knew straight away who to pick.†
- They had some influence on my decision because more than one face seemed familiar.†‡
- They had some influence on my decision because more than one face had the
  feature/features I remember about the perpetrator.†‡
- They confused me, and made the task more difficult. I didn’t know whom to pick.†‡
- I was able to pick the person because the other lineup members weren’t really plausible.

Note. Items with a † were included in the final scale for identifiers. Items with a ‡ were
included in the final scale for rejectors.
Supplementary Materials
Demographic Questions
Participants were asked in three separate questions to “Please indicate your” ethnic
background, age, and gender. The response options for ethnic background were Caucasian,
Indigenous, Asian, and Other (specify). Age was entered into a text box. Participants could
select Male or Female for gender (we acknowledge that a greater number of options would
have been more informative; however, these data were collected in 2015). Participants could
also choose not to respond.
Perceived Memory Strength - Manipulation Check
To follow up the significant multivariate result, we examined the univariate results.
Participants in the strong condition reported a significantly better memory than participants in
the weak condition across four of the five manipulation check measures (see Table S1). A
non-significant effect emerged for distance. However, given that we manipulated distance
only by changing the size of the video on screen, we acknowledge that this item was a poor
measure of memory strength because a correct response would be the same across our
manipulation. Based on our analysis, we deemed our manipulation successful.
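
As an illustration of this analysis, the multivariate test and its univariate follow-ups could be
scripted as below with base R and the effectsize package (Ben-Shachar et al., 2020); the
variable names are hypothetical stand-ins for our measures:

```r
# Multivariate test across the perceived-memory-strength measures,
# followed by univariate F tests; names are placeholders.
mc <- manova(cbind(view, features, memory, attention, pickout) ~ condition,
             data = dat)
summary(mc)       # multivariate result
summary.aov(mc)   # univariate follow-up for each measure

# Standardized mean difference with a 95% CI for one measure:
effectsize::cohens_d(view ~ condition, data = dat)
```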
Table S1
Descriptive and Inferential Statistics for Perceived Memory Strength - Manipulation Check

Measure     Weak (n = 315)  Strong (n = 342)  F(1, 655)   d [95% CI]
            M (SD)          M (SD)
View        5.07 (1.65)     5.90 (1.27)       48.92***    0.55 [0.39, 0.70]
Features    4.61 (1.57)     5.65 (1.17)       89.34***    0.74 [0.58, 0.90]
Memory      4.33 (1.40)     5.34 (1.17)       98.59***    0.78 [0.62, 0.93]
Attention   4.46 (1.56)     5.36 (1.32)       60.22***    0.61 [0.45, 0.76]
Pick out    4.42 (1.56)     5.59 (1.22)       109.72***   0.82 [0.66, 0.98]
Time        6.40 (8.87)     26.6 (19.0)       343.40***   1.45 [1.28, 1.62]
Distance    2.01 (3.56)     2.06 (6.11)       0.02        0.01 [-0.14, 0.16]

Note. Time was reported in seconds and distance in meters. *** p < .0001.
Factor Analysis
Identifications
Checking Appropriateness. Prior to conducting an exploratory factor analysis, it is
appropriate to check the suitability of the data for factor analysis. For identifications, we first
examined the post-lineup questions to determine if they elicited a normal distribution of
responses because significant departures from normality suggest transformations of those
variables may improve results. Inspection of the histograms of responses for each of the 17
items indicated some departure from normality but none were extreme. We examined
whether the two most non-normal items showed a curvilinear relationship; they did not,
therefore we concluded that curvilinearity was unlikely to be an issue in our sample.
Next, we examined our data for factorability. A correlation matrix with each of the 17
items indicated that of the 136 correlations, 55 were above .30 (i.e., factorable). Consistent
with our conclusion regarding factorability, Bartlett’s test of sphericity was significant,
χ²(136) = 3692.64, p < .001, and the Kaiser-Meyer-Olkin (KMO) measure of sampling
adequacy exceeded the recommended value of .60 (KMO = .86; Tabachnick & Fidell, 2013).
A KMO between .8 and .9 is considered meritorious (Kaiser & Rice, 1974). Finally, we examined the
anti-image matrix which provides information about how strongly pairs of items are related
when the influence of other items is removed. An item should be highly correlated with itself
if sampling is adequate; consistent with this conclusion, all 17 values on the diagonal were
above 0.5. Items should be less highly correlated with other items in order to conduct factor
analysis. Indeed, only four of the off-diagonal values were above 0.5 (indicating a lack of
redundancy in our items). These results further support the conclusion of factorability.
We returned to the correlation matrix to assess whether multicollinearity was an issue
in our sample. All 17 items correlated below .90. Indeed, the determinant of the correlation
matrix was .00006. A value greater than .00001 indicates a lack of multicollinearity.
We next checked that the data set was not singular, which we achieved by conducting
an unrotated principal components analysis. That analysis ran successfully indicating a lack
of singularity. This analysis was followed up with an examination of communalities from an
unrotated principal axis factor analysis. High communalities (>.60) mitigate the effects of
small sample sizes (Tabachnick & Fidell, 2013). Communalities ranged from .44 to .98 with
eight items below .60. We concluded that the data were appropriate for factor analysis.
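
The checks described in this section can be reproduced with the psych package (Revelle,
2022) along the following lines; items stands in for a data frame holding the 17 post-lineup
responses:

```r
library(psych)

R <- cor(items, use = "pairwise.complete.obs")

cortest.bartlett(R, n = nrow(items))  # Bartlett's test of sphericity
KMO(R)                                # overall and per-item sampling adequacy,
                                      # derived from the anti-image correlations
det(R)                                # determinant; values > .00001 suggest
                                      # multicollinearity is not a problem
```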
Exploratory Factor Analysis. Satisfied that our data were appropriate for factor
analysis, we conducted Horn’s parallel analysis of factors to assess how many factors were
appropriate for our data; the analysis indicated three factors. We thus
began by conducting our factor analysis for three factors using maximum likelihood
extraction with Promax rotation. The solution explained 53% of the total variance.
Per Tabachnick and Fidell (2013), we inspected the pattern matrix for weak items—
those loading below .32 and those that cross-loaded with other items (75% or greater loading
on the non-maximum loaded factor; Samuels, 2016), which indicates that an item does not
discriminate between factors. One item loaded below .32 (pictures9). Two items cross-loaded
above our criterion (eliminated, confirm). Inspection of the communalities indicated that
pictures and clues had communalities below .20 (Child, 2006). All four items were dropped.
We repeated the factor analysis using the remaining 13 items. This three-factor
solution explained 61% of the total variance and all loadings remained above .32 without
cross-loadings. Inspection of the communalities indicated one additional item had a low
communality (plausible). Thus, we repeated the factor analysis without this item. Doing so
resulted in a solution that explained 65% of the variance with all items loading above .32, no
cross-loadings, and all communalities above .20. We retained this solution as the final three-
factor solution.
9 Appendix A provides the full question text with the shorthand term (used here) in bold.
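
A minimal sketch of the extraction steps just described, again using psych (Revelle, 2022)
with the hypothetical data frame items:

```r
library(psych)

fa.parallel(items, fm = "ml", fa = "fa")  # Horn's parallel analysis

# Maximum likelihood extraction with Promax (oblique) rotation.
efa3 <- fa(items, nfactors = 3, fm = "ml", rotate = "promax")
print(efa3$loadings, cutoff = .32)  # pattern matrix; flags loadings < .32
round(efa3$communality, 2)          # check for communalities < .20
efa3$Phi                            # factor intercorrelations
```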
Upon inspection of the magnitude of the three correlations amongst the factors, we
noted that Factors 1 and 2 correlated quite highly (r = -.55). This indicates that an
uncorrelated solution is not appropriate for our data. However, Field (2018) notes that when
two factors correlate above .30, researchers should consider combining them, therefore we
next considered a two-factor solution.
We conducted another factor analysis on the 17 items but constrained the solution to
two factors, allowing the factors to correlate (with Promax rotation). The two-factor solution
explained 42% of the sample’s variance with a lower absolute value for the correlation
between the factors (r = -.37) compared to the three-factor solution. In this solution,
remembered fell below our .32 loading criterion and confirm cross-loaded above our
criterion. Furthermore, inspection of the communalities indicated that pictures, familiar,
remembered, compared, clues, confirm, and plausible were all below .20. We removed these
seven items and repeated the factor analysis. The resultant solution explained 66% of the
variance and the two factors correlated similarly to our three-factor solution (r = -.59). Factor
loadings and communalities were all acceptable in this final two-factor solution.
The factors comprising the two-factor solution strongly resembled the first two factors
of our three-factor solution. Notably, Factor 1 of the two-factor solution included six of the
seven items loading on Factor 1 of the three-factor solution; only knew was dropped.
Likewise, Factor 2 of the two-factor solution resembled Factor 2 of the three-factor solution:
all three items from the three-factor solution were included as well as one additional item
(eliminated).
The high correlation between the factors in the final two-factor solution suggests that
combining the factors may be appropriate (Field, 2018). A one-factor solution was produced
following the same procedures used above. The result was a 9-item solution which explained
53% of the variance. This factor comprised the same six items as Factor 1 of the two-factor
solution (and were included on Factor 1 of the three-factor solution) and the three items that
appeared on Factor 2 of both the two- and three-factor solutions.
Choosing Between Solutions. To choose between the solutions, we examined model
fit. For each solution we compared the reproduced correlation matrix (i.e., correlations
predicted/implied by the factor loadings) to the correlation matrix from the (unrotated)
solutions (i.e., observed correlations). Fewer than 50% of the residual differences being
greater than
which involves calculating the sum of the squared residuals divided by the sum of the
squared correlations and subtracting this value from one. A value greater than .95 indicates
good fit.
For the three-factor solution, just 3% of the residuals were above .05, specifically:
easy-popped out (.054) and knew-confused (.075). Field (2018) suggests that the mean of
the residuals should be below .08, and it was (M = .02). Residuals should be normally
distributed and a histogram showed this to be the case. The three-factor solution (.954) was a
good fit to our data based on Bikos’ (2022) measure. For the two-factor solution, 11% of
residuals were above .05, the mean of the residuals was .03, the histogram of residuals was
normal, and fit was .945. Thus, the solution was acceptable but did not reach Bikos’ .95
criterion. Finally, for the one-factor solution, the proportion of residuals above .05 was
acceptable (28%) but the mean of the residuals was greater than .08 (.13) which, according to
Field, suggests more factors should be extracted. Furthermore, the residuals were positively
skewed (cf. normal) and fit was only .76. Thus, the three-factor solution was the preferred
solution.
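
The residual-based comparisons used here can be computed directly from a psych factor
solution; the following sketch assumes the efa3 object and correlation matrix R from the
earlier sketches:

```r
res <- residuals(efa3)       # residual (observed minus reproduced) correlations
off <- res[upper.tri(res)]   # off-diagonal residuals only

mean(abs(off) > .05)         # proportion of residuals above .05 (Field, 2018)
mean(abs(off))               # mean residual; should fall below .08
hist(off)                    # residuals should be roughly normally distributed

1 - sum(off^2) / sum(R[upper.tri(R)]^2)  # Bikos' (2022) fit; > .95 = good
```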
Rejections
Checking Appropriateness. We started with a check of whether these data were
appropriate for factor analysis. Inspection of the histograms of responses for each of the 17
items indicated considerable departure from normality. When we examined whether the two
most non-normal items showed a curvilinear relationship, a curvilinear pattern was present,
indicating that curvilinearity is an issue with these data. However, because factor analysis is
robust to non-normality (though normality is preferred), we did not transform our items.
Next, we considered factorability. A correlation matrix with each of the 17 items
indicated that of the 136 correlations, only 13 were above .30 which is a small number but
still potentially enough for factor analysis. Bartlett’s test of sphericity was significant,
χ²(136) = 1366.387, p < .001, and the Kaiser-Meyer-Olkin (KMO) measure of sampling
adequacy exceeded the recommended value of .60 (KMO = .70; Tabachnick & Fidell, 2013).
A KMO between .5 and .7 is considered mediocre while .7 to .8 is considered good (Kaiser & Rice,
1974). Finally, we examined the anti-image matrix. All 17 values on the diagonal were above
0.5 and only six of the off-diagonal values were above 0.5. These results support a conclusion
of factorability, despite the small number of items with correlations above .30.
There was no concern with multicollinearity for the items as all 17 items correlated
below .90. The determinant of the correlation matrix was .006, much higher than the
threshold of .00001 for a lack of multicollinearity.
An unrotated principal components analysis ran successfully, therefore the structure
was not singular. Communalities from an unrotated principal axis factor analysis ranged from
.32 to .85, with six items below .60. High communalities (>.60) mitigate the effects of small
sample sizes (Tabachnick & Fidell, 2013), a slight concern in this case.
In conclusion, although the rejection data are not ideal for factor analysis (non-normal
items, small number of items correlating with multiple other items above .30), they did meet
the minimum requirements for being suitable for factor analysis.
Exploratory Factor Analysis. Factor analysis was used to determine the underlying
structure of the 17-item questionnaire. Horn’s parallel analysis indicated a three-factor
solution was appropriate. Thus, we conducted a factor analysis for a three-factor solution
with Promax rotation. The solution explained 37% of the total variance in our data set.
Following Tabachnick and Fidell (2013), we again dropped items which loaded below
.32 or cross-loaded with other items based on Samuels’ (2016) rule. The pattern matrix
indicated that pictures loaded below .32 on all three factors (max loading = .20). In addition,
eliminated cross-loaded with Factors 2 and 3. Finally, the communalities of pictures,
compared, eliminated, clues, confirm, knew, and plausible were below .20.
We repeated our factor analysis excluding the 7 items identified as problematic above.
This solution explained 69% of the variance. None of the remaining items loaded below .32,
cross-loaded with other items above our criterion, or had communalities below .20.
The next step was to consider the correlations amongst our factors. As with the scale
we developed for identifications, Factors 1 and 2 correlated quite highly (r = -.54), which
indicated that an uncorrelated solution (i.e., varimax rotation) would be inappropriate.
However, Field (2018) suggests combining factors that correlate above .30, therefore we
considered whether a two-factor solution would be more appropriate for these data.
We conducted a new factor analysis with Promax rotation, using all 17 items, but
constrained the analysis to two factors. This solution explained 28% of the sample’s variance.
We again applied our minimum loading criterion, cross-loading criterion, and minimum
communality criterion. Eight items: pictures, memory, familiar, remembered, compared,
eliminated, clues, and confirm failed to load above .32. No items cross-loaded. Inspection of
the communalities indicated 11 items could be dropped: easy, pictures, memory, familiar,
remembered, compared, eliminated, clues, confirm, knew, and plausible.
Repeating the factor analysis without the 11 items resulted in a solution that explained
79% of the variance. This two-factor solution resulted in no items with loadings below .32,
cross-loaded, or with communalities below .20. However, the stability of this six-item
solution must be questioned, as feature loaded above 1.00 (1.007 on Factor 2), which can
signal a problem. The correlation between the factors was relatively high (r = .56).
Given this high correlation, we considered a one-factor solution. The initial solution
explained 13% of the sample’s response variance and only three items met our loading
criterion (face, feature, confused). Repeated with only those three items, the one-factor
solution explained 63% of the sample’s response variance, with all three items loading
above .32 and communalities above .20.
Choosing Between Solutions. For the three-factor solution, 9% of residuals were
above .05 with a mean of .046, and a normal distribution. These values are acceptable based
on Field (2018); however, model fit was .83, well below the acceptable threshold suggested
by Bikos (2022). The two-factor solution produced no residuals above .05, with a mean of
.012, and a normal distribution. In contrast to the three-factor model, fit of the two-factor
model was .97. The one-factor solution had no residuals above .05, a mean residual of .014,
and a non-normal (positively skewed) distribution of residuals with a fit of .97. The two- and
one-factor solutions were similar on all metrics except the distribution of residuals, where the
two-factor solution was preferable because its residuals were normal (vs. skewed). Thus, we
judged the two-factor solution as preferable.
Generalizability Across Memory Strength Conditions (Identifications)
We examined whether our memory strength manipulation affected eyewitness
decision processes. First, we tested the generalizability of the three-factor structure for
identifications across the two memory strength conditions. Second, we created composite
scores for each of the factors and compared them across the two memory strength conditions.
Generalizability of Factor Structure Across Memory Strength Conditions
We conducted a confirmatory factor analysis (CFA) to ensure our memory strength
manipulation did not interact with our observed factor structure. Following Brown (2006), we
calculated absolute fit, fit adjusting for model parsimony, and comparative fit for the CFA
models. As measures of absolute fit, we used χ2 and standardized root mean square residual
(SRMR). As a measure of fit adjusting for model parsimony, we used the root mean-square
error of approximation (RMSEA). Finally, as a measure of comparative fit, we used the
comparative fit index (CFI). Poor, acceptable, and good model fit parameters for the
indices—per Bikos (2022) and Awang (2015)—are provided in Table S2.
Table S2
Parameters Indicating Poor, Acceptable, and Good Model Fit by Fit Index

Fit Index       Poor    Acceptable   Good
Chi-square (p)  ≤ .05   > .05        > .05
SRMR            ≥ .10   < .10*       ≤ .05
RMSEA           ≥ .10   < .10*       ≲ .05
CFI             ≤ .90   > .90        ≥ .95

Note. Based on Bikos (2022) and Awang (2015). SRMR = standardized root mean square
residual. RMSEA = root mean-square error of approximation. CFI = comparative fit index.
* Hooper et al. (2008) and West, Taylor, and Wu (2012) consider ≤ .08 as the appropriate
cutoff.
We used a Maximum Likelihood extraction method with a multi-sample approach to
determine if there was model invariance between the weak and strong Memory Strength
conditions (for a detailed description of this approach, see Brown, 2006). We set the first
model to be unconstrained and set the second with constraints on measurement weights
(factor loadings), measurement intercepts, and covariances (residuals and latent variables).
The difference between the goodness-of-fit statistics for the two models was not significant,
∆χ²(21) = 29.19, p = .11, indicating that the constraints held equally across the groups (i.e.,
model invariance). Per Table S3, the χ² for both the constrained and unconstrained models
was significant, indicating a cause for concern. However, this index is overly sensitive,
including when one’s sample size is greater than 200, as in this case (Awang, 2015; Babyak &
Green, 2010; Bikos, 2022; Hooper et al., 2008; West, Taylor, & Wu, 2012). Kline (2016)
notes that a failed exact-fit test (like the chi-square test) gives preliminary evidence against a
model while passing it provides preliminary support; therefore, local fit testing (via
approximate fit indices) is also necessary. Our other fit indices indicated adequate model fit,
although if .08 is used as the criterion for acceptable fit for SRMR and RMSEA, the
unconstrained model does not meet that criterion for RMSEA.
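
In lavaan (Rosseel, 2012), the unconstrained and constrained multi-sample models could be
specified as follows; the item names (a1-a6, r1-r3, b1-b3) are placeholders for the 12 retained
items, and condition codes memory strength:

```r
library(lavaan)

model <- '
  automatic =~ a1 + a2 + a3 + a4 + a5 + a6  # Automatic Response items
  relative  =~ r1 + r2 + r3                 # Relative Judgment items
  absolute  =~ b1 + b2 + b3                 # Absolute Judgment items
'

fit_free  <- cfa(model, data = dat, group = "condition")
fit_equal <- cfa(model, data = dat, group = "condition",
                 group.equal = c("loadings", "intercepts",
                                 "residuals", "lv.covariances"))

lavTestLRT(fit_free, fit_equal)  # Δχ² test of the constrained vs. free model
fitMeasures(fit_free, c("chisq", "df", "srmr", "rmsea", "cfi"))
```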
Table S3
Model Fit Indices for the Confirmatory Factor Analyses for Multiple-Sample Models and for
the Strong and Weak Memory Strength Conditions

             Multi-Sample Models                                     Unconstrained Revised
                                                                     Models by Memory Strength
Fit Index    Unconstrained  Constrained  Unconstrained  Constrained  Strong     Weak
             Original       Original     Revised        Revised      (df = 50)  (df = 50)
             (df = 102)     (df = 123)   (df = 100)     (df = 122)
Chi-Square   269.47*        298.67*      223.54*        251.73*      86.76*     136.78*
SRMR         .06            .07          .05            .06          .04        .06
RMSEA        .09            .08          .08            .07          .06        .10
CFI          .94            .94          .96            .96          .98        .94

Note. SRMR = standardized root mean square residual. RMSEA = root mean-square error of
approximation. CFI = comparative fit index. * p ≤ .001.
Given that fit may not be ideal and per Kline (2016), we considered why this may be
the case. We noted that the cross-loading for confused on Factor 1 was markedly higher than
for other items. Inspection of modification indices showed that the path between confused
and the latent variable represented by Factor 1 had the second largest index. Thus, we added
this path to our factor structure. Confused loads negatively on the Automatic Response
factor; therefore, inclusion of this item makes some sense: an eyewitness who automatically
recognizes the culprit is not going to be confused by the presence of other images, much as
recognizing one’s mother in a crowd of people is no more difficult than recognizing her on
her own. Adding this path had a significant effect on our fit indices as can be seen in the
Unconstrained Revised and Constrained Revised sections of Table S3. Although the chi-
square statistics remained significant, the remaining fit metrics reflected acceptable fit.
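
Continuing the lavaan sketch above, modification indices can be inspected and the suggested
cross-loading added as follows (r3 stands in for the confused item):

```r
mi <- modindices(fit_free)
mi[order(-mi$mi), ][1:10, ]  # largest suggested model changes first

# Add the cross-loading of the confused item on the Automatic Response factor.
model_rev <- paste(model, 'automatic =~ r3', sep = '\n')
fit_rev   <- cfa(model_rev, data = dat, group = "condition")
fitMeasures(fit_rev, c("chisq", "df", "srmr", "rmsea", "cfi"))
```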
We next conducted separate unconstrained confirmatory factor analyses to test model
fit for each of the two memory strength conditions using our revised structure. As seen in
Table S3, the χ2 was significant for both conditions. The other model fit indices were
acceptable, except for the RMSEA for the weak memory strength condition which is outside
the acceptable range if .08 (cf. .10) is used as the criterion for acceptable fit. These results
suggest the weak memory strength condition has marginal model fit. Notably, the model fit
the strong memory condition data better than the weak memory condition data.
Composite Scores Across Memory Strength Conditions
To further investigate how memory strength affects decision processes and as a basic
check of criterion validity, we created composite scores for each factor by summing the raw
scores for each participant on the items comprising each factor. Independent samples t-tests
were then used to compare the composite scores for each Memory Strength group for each
factor. For Automatic Response, scores could range from 7-49 and for both Relative
Judgment and Absolute Judgment, scores could range from 3-21. Participants in the strong
condition scored higher on Automatic Response (M = 32.65, SD = 7.48, Range = 10-47),
t(383.16) = 3.70, p < .001, d = 0.38 [0.18, 0.58], and lower on Relative Judgment (M = 9.58,
SD = 4.66, Range = 3-21), t(379.91) = 1.98, p = .048, d = 0.20 [0.00, 0.41], than those in the
weak condition (M = 29.77, SD = 7.91, Range = 9-49 for Automatic Response; M = 10.57, SD
= 5.10, Range = 3-21 for Relative Judgment). There was no evidence of a difference in
Absolute Judgment scores for the strong (M = 16.67, SD = 3.24) and weak conditions (M =
16.44, SD = 3.59), t(378.13) = 0.64, p = .52, d = 0.07 [-0.14, 0.27]. We also conducted this
analysis on the regression factor scores and found slightly different results: in this case, there
was no difference by memory condition for scores on the Relative Judgment factor (p = .89).
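
The composite-score comparisons take only a few lines of R; automatic_items and
relative_items stand in for vectors of the relevant column names:

```r
dat$automatic_sum <- rowSums(dat[, automatic_items])  # 7 items, range 7-49
dat$relative_sum  <- rowSums(dat[, relative_items])   # 3 items, range 3-21

# t.test() defaults to Welch's test, matching the fractional dfs reported above.
t.test(automatic_sum ~ condition, data = dat)
effectsize::cohens_d(automatic_sum ~ condition, data = dat)
```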
Exploratory Factor Analysis of Pre-lineup Questionnaire
Responses to our pre-lineup questionnaire were of secondary interest. Only choosers
were included in these analyses. One participant who did not respond to all the scale questions
was dropped, so our sample comprised 392 participants. The 17 items we factor analysed
were similar to those used in the post-lineup questionnaire (see Appendix A); specific
wording can be found on the project’s OSF page.
The histograms of responses indicated minimal departure from normality. Inspection
of the scatterplot for the two most non-normal items confirmed that curvilinearity was not a
concern; the curvilinear and linear best-fitting lines were very similar.
Next, we examined our data for factorability. Fewer correlations between items were
above .30 (i.e., factorable) on this questionnaire (51) compared to the post-lineup
questionnaire (85). Nonetheless, Bartlett’s test of sphericity was significant, χ²(136) =
2715.24, p < .001, and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was
greater than .60 (KMO = .81, or meritorious; Kaiser & Rice, 1974; Tabachnick & Fidell,
2013). All 17 values on the diagonal and only four of the off-diagonal values were above 0.5.
Thus, we concluded that our data were factorable.
Multicollinearity was not a concern as no items correlated above .90 and the
determinant of the correlation matrix was much greater than .00001 (det = .001). Next, using
an unrotated principal components analysis we confirmed the data were not singular. An
unrotated principal axis factor analysis indicated that communalities ranged from .45 to .91
with five items below .60. We concluded that our data were appropriate for factor analysis.
Horn’s parallel analysis of factors suggested three factors. Given this, we conducted
our factor analysis for three factors using maximum likelihood extraction with Promax
rotation. The solution explained 44% of the total variance. Inspection of the factor loadings
showed eliminate cross-loaded (i.e., 75% or greater loading on the non-maximum loaded
factor; Samuels, 2016) but all items loaded above .32. Two items had communalities below
.20 (pictures and clues). All three items were dropped.
Factor analysis on the remaining items explained 50% of the total variance but
confirm was below the .32 loading criterion and the .20 communality criterion (no items
cross-loaded). Thus, we dropped confirm. All loadings in a factor analysis with the remaining
12 items were above .32 without cross-loadings and all communalities were above .20. At
this point, we checked whether the three-factor solution met the criteria outlined by Field
(2018) for maintaining more factors based on Kaiser’s criterion. Specifically, we looked at
whether either the communalities of all items were above .70 or the mean of the
communalities was above .60. Only four of the 12 items had communalities above .70 and
the mean of the communalities was .53. Thus, we concluded that our three-factor solution
was unacceptable and conducted an exploratory factor analysis with Promax rotation on the
original 17 items but constrained it to two factors.
The two-factor solution explained 39% of the sample’s variance. No items had factor
loadings below .32 but three items cross-loaded (familiar, compare, and confuse) and three
items had communalities below .20 (pictures, clues, and plausible). We removed these items
and repeated the factor analysis on the remaining 11 items. The resultant solution explained
48% of the variance with all loadings above .32, no cross-loading, and all communalities
above .20. There was a negligible correlation between the factors (r = -.03) indicating
orthogonal factors. Thus, we rotated the final two-factor solution with a Varimax instead of a
Promax rotation. Table S4 depicts our final factor structure and other relevant indices.
Table S4
Summary of the Final Factor Structure for Identifiers’ Pre-lineup Decision Process Items

                                                               Factor loadings
Item                                                          Automatic  Relative    λ²
I expect to recognize the culprit easily.                        .89        .06      .79
When viewing the lineup, I expect the culprit’s face to
  just ‘pop out’ at me.                                          .78        .08      .61
To what extent do you think the culprit will stand out to
  you in the lineup?                                             .75        .01      .56
How easy do you think it will be for you to make your
  decision?                                                      .71        .03      .51
[The non-chosen pictures] will probably have little
  influence on my decision. I will know straight away who
  to pick.                                                       .55       -.14      .32
Rate the extent to which you expect your memory for the
  culprit influenced your decision.                              .47        .11      .23
[The non-chosen pictures] will have some influence on my
  decision because more than one face may have the
  feature/features I remember about the perpetrator.            -.07        .93      .87
[The non-chosen pictures] will have some influence on my
  decision because more than one face may seem familiar.        -.03        .87      .77
I expect to search the faces for the feature/features I
  remember.                                                      .26        .42      .24
[The non-chosen pictures] will help confirm my decision,
  to reinforce my decision after it is made.                     .22        .39      .20
I expect I will first eliminate the ones that are
  definitely not the culprit, and then choose from those
  remaining. I don’t expect to recognise the culprit easily.    -.23        .38      .20

Unrotated sums of squared loadings                              3.17       2.13
Rotated sums of squared loadings                                3.16       2.14
Proportion variance (rotated)                                    .29        .48

Note. Factor loadings > 0.30 are bolded. λ² = communality. Automatic = Automatic
Response. Relative = Relative Judgment.
We also checked how well our final two-factor solution fit the data. Specifically, we considered the extent to which the correlation matrix reproduced from the factor loadings differed from the observed correlation matrix for the (unrotated) solution; fewer than 50% of these residuals should be greater than .05 (Field, 2018). For our solution, 29% of the residuals were above .05. The mean of the residuals was .05 and the histogram of residuals was approximately normal, which also indicates good fit. Finally, we examined fit by dividing the sum of the squared residuals by the sum of the squared correlations (Bikos, 2022); at .83, we were well below the .95 criterion. Thus, we found mixed results in terms of model fit.
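In code, these residual checks amount to comparing the observed correlations with those reproduced from the loadings. A sketch under the same assumptions as above follows; `items_final` is a hypothetical DataFrame of the 11 retained items, and because the final rotation (Varimax) is orthogonal, the reproduced matrix ΛΛᵀ is unaffected by the rotation.

```python
# Residual-based fit checks for the two-factor solution. `fa2` is the
# fitted FactorAnalyzer from the earlier sketch.
import numpy as np

observed = items_final.corr().values
lam = fa2.loadings_
reproduced = lam @ lam.T           # common variance implied by the factors
np.fill_diagonal(reproduced, 1.0)  # diagonal is not part of the residual check

mask = ~np.eye(observed.shape[0], dtype=bool)
residuals = (observed - reproduced)[mask]

# Field (2018): fewer than 50% of absolute residuals should exceed .05.
print("Proportion |residual| > .05:", np.mean(np.abs(residuals) > .05))
print("Mean |residual|:", np.abs(residuals).mean())

# Bikos (2022): sum of squared residuals relative to the sum of squared
# correlations should fall below .95.
print("Fit ratio:", (residuals ** 2).sum() / (observed[mask] ** 2).sum())
```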
We also subjected the pre-lineup scale to a reliability analysis (see Table S5). The extent to which each factor was reliable depended on the measure considered. Based on Cronbach’s alpha, the Automatic Response factor met the criterion for good reliability (≥ .80; George & Mallery, 2019) while the Relative Judgment factor met the criterion for acceptable reliability (≥ .70). Based on McDonald’s omega, however, Automatic Response met the criterion for excellent reliability (≥ .90) while Relative Judgment met the criterion for good reliability.
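A sketch of how these two coefficients can be computed, assuming `subscale` is a DataFrame containing one factor's items (all positively keyed); the omega here is McDonald's omega total from a one-factor model of the subscale:

```python
# Cronbach's alpha from item variances, and McDonald's omega (total) from
# the loadings of a single-factor model. Variable names are hypothetical.
import pandas as pd
from factor_analyzer import FactorAnalyzer

def cronbach_alpha(subscale: pd.DataFrame) -> float:
    k = subscale.shape[1]
    item_vars = subscale.var(axis=0, ddof=1)
    total_var = subscale.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def mcdonald_omega(subscale: pd.DataFrame) -> float:
    fa = FactorAnalyzer(n_factors=1, rotation=None, method="ml")
    fa.fit(subscale)
    lam = fa.loadings_[:, 0]  # standardized loadings on the single factor
    return lam.sum() ** 2 / (lam.sum() ** 2 + (1 - lam ** 2).sum())

automatic = items_final[automatic_items]  # hypothetical list of item columns
print("alpha:", cronbach_alpha(automatic))
print("omega:", mcdonald_omega(automatic))
```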
The items on the final pre-lineup scale were similar to those on the post-lineup scale. The six items that comprised Factor 1 on the post-lineup questionnaire (three-factor solution) also comprised Factor 1 on the pre-lineup questionnaire; therefore, we likewise labelled the first factor on this scale Automatic Response. As discussed in the main paper, these items reflect a quick and effortless decision process with minimal conscious input (e.g., “I expect to recognize the culprit easily”). Factor 2 comprised all three items that loaded on Factor 2 of the post-lineup scale plus two additional items, eliminate and confirm, which were not included in the final post-lineup questionnaire. We labelled Factor 2 on the pre-lineup scale Relative Judgment because most of the items point to comparisons between lineup members (e.g., “[The non-chosen pictures] will have some influence on my decision because more than one face may have the feature/features I remember about the perpetrator”).
Table S5
Internal Consistency, Item-total Correlations, and Descriptives for Pre-lineup Identifier Scale

Factors & Items | M (SD) | Item-total correlation | α if item deleted

Automatic Response (α = .84, 95% CI [.82, .86]; ω = .92) | 4.47 (1.01) | |
I expect to recognize the culprit easily. | | .86 | .78
To what extent do you think the culprit will stand out to you in the lineup? | 4.19 (1.23) | .80 | .80
How easy do you think it will be for you to make your decision? | 4.24 (1.24) | .76 | .80
When viewing the lineup, I expect the culprit’s face to just ‘pop out’ at me. | 4.84 (1.41) | .74 | .81
[The non-chosen pictures] will probably have little influence on my decision. I will know straight away who to pick. | 3.64 (1.57) | .68 | .84
Rate the extent to which you expect your memory for the culprit influenced your decision. | 5.26 (1.27) | .47 | .85

Relative Judgment (α = .71, 95% CI [.66, .75]; ω = .80) | 4.93 (0.96) | |
[The non-chosen pictures] will have some influence on my decision because more than one face may have the feature/features I remember about the perpetrator. | 4.94 (1.28) | .86 | .58
[The non-chosen pictures] will have some influence on my decision because more than one face may seem familiar. | 4.87 (1.24) | .83 | .60
I expect to search the faces for the feature/features I remember. | 5.55 (1.17) | .42 | .70
[The non-chosen pictures] will help confirm my decision, to reinforce my decision after it is made. | 4.55 (1.41) | .44 | .69
I expect I will first eliminate the ones that are definitely not the culprit, and then choose from those remaining. I don't expect to recognise the culprit easily. | 4.72 (1.83) | .42 | .73

Note. α = Cronbach’s alpha. ω = McDonald’s omega. In all cases, ratings ranged from 1-7. Bolded words are shorthand terms for items.
Predicting Identification Accuracy from the Pre-lineup Questions
As we were able to develop a reliable scale from our pre-lineup questionnaire, we
wanted to determine whether these questions had any predictive ability and therefore
probative value. To that end, we considered whether composite scores derived from our two-
factor solution for the pre-lineup questionnaire predicted identification accuracy. We entered
the composite scores for Automatic Response and Relative Judgment into a binomial logistic
regression predicting identification accuracy. Automatic Response, β = 0.26, SE = 0.10, z =
2.51, p = .012, OR = 1.04 [1.01, 1.08], was a significant predictor, but Relative Judgment, β =
0.08, SE = 0.10, z = 0.80, p = .42, OR = 1.02 [0.98, 1.06], was not.
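A sketch of this regression using statsmodels follows; `df` and its column names (`accurate` coded 0/1, plus the two composite scores) are hypothetical placeholders.

```python
# Binomial logistic regression predicting identification accuracy from the
# Automatic Response and Relative Judgment composite scores.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

model = smf.logit("accurate ~ automatic + relative", data=df).fit()
print(model.summary())

# Odds ratios with 95% confidence intervals, from the coefficients.
ors = np.exp(pd.concat([model.params, model.conf_int()], axis=1))
ors.columns = ["OR", "2.5%", "97.5%"]
print(ors)
```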
Pre-lineup and Post-lineup Questionnaire Item-Pair Correlations
We examined our decision process questionnaire item-pair correlations to determine
whether the pre-lineup questionnaire primed participants before their lineup decision and the
post-lineup questionnaire. As seen in Table S6, ratings by participants who made an
identification tended to be moderately positively correlated, .15 ≤ rs ≤ .49, ps ≤ .002. For
those who made rejections, the picture was slightly more complex. For most items, the
correlations were moderate and positive, .13 ≤ rs ≤ .56, ps ≤ .02. However, three items
(standout, poppedout, and features) showed low, nonsignificant correlations.
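The item-pair correlations themselves are straightforward to compute; here is a sketch with scipy, where the pre-/post-column naming and group DataFrames are hypothetical.

```python
# Pre/post item-pair correlations, computed separately for participants who
# made identifications versus rejections.
from scipy.stats import pearsonr

item_names = ["easy", "standout", "picture", "memory", "poppedout"]  # etc.
for group_name, group in [("identifications", identifiers),
                          ("rejections", rejecters)]:
    for item in item_names:
        r, p = pearsonr(group[f"pre_{item}"], group[f"post_{item}"])
        print(f"{group_name} {item}: r = {r:.2f}, p = {p:.3f}")
```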
Table S6
Item-Pair Correlations from the Pre-lineup and Post-lineup Decision Process Questionnaires

Item | Identifications (df = 389): Pearson r | Sig. | Rejections (df = 277): Pearson r | Sig.
easy | .29 | < .001 | .20 | < .001
standout | .15 | .002 | .06 | .29
picture | .31 | < .001 | .35 | < .001
memory | .37 | < .001 | .26 | < .001
poppedout | .18 | < .001 | .07 | .23
easily | .27 | < .001 | .17 | .005
familiar | .39 | < .001 | .30 | < .001
remembered | .41 | < .001 | .20 | < .001
compared | .26 | < .001 | .16 | .007
eliminated | .35 | < .001 | .38 | < .001
clues | .49 | < .001 | .56 | < .001
confirm | .27 | < .001 | .34 | < .001
knew | .26 | < .001 | .28 | < .001
face | .16 | .001 | .14 | .02
features | .24 | < .001 | .09 | .15
confused | .28 | < .001 | .23 | < .001
plausible | .29 | < .001 | .20 | .001
References
Awang, Z. (2015). Validating the measurement model: CFA. In SEM made simple: A gentle approach to learning structural equation modelling (pp. 54-74). MPWS.
Babyak, M. A., & Green, S. B. (2010). Confirmatory factor analysis: An introduction for psychosomatic medicine researchers. Psychosomatic Medicine, 72(6), 587-597. https://doi.org/10.1097/PSY.0b013e3181de3f8a
Bikos, L. H. (2022). ReCentering psych stats: Psychometrics. https://lhbikos.github.io/ReC_Psychometrics/index.html
Brown, T. A. (2006). Confirmatory factor analysis for applied research. Guilford Press.
Child, D. (2006). The essentials of factor analysis. A&C Black.
DeVellis, R. F. (2017). Scale development: Theory and applications (4th ed.). Sage.
DiStefano, C., Zhu, M., & Mîndrilă, D. (2009). Understanding and using factor scores: Considerations for the applied researcher. Practical Assessment, Research & Evaluation, 14, 1-11. https://doi.org/10.7275/da8t-4g52
Field, A. (2018). Exploratory factor analysis. In Discovering statistics using IBM SPSS Statistics (pp. 989-1060). Sage.
George, D., & Mallery, P. (2019). IBM SPSS statistics 25 step by step: A simple guide and reference. Taylor & Francis.
Hooper, D., Coughlan, J., & Mullen, M. R. (2008). Structural equation modelling: Guidelines for determining model fit. Electronic Journal of Business Research Methods, 6(1), 53-60.
Kaiser, H. F., & Rice, J. (1974). Little jiffy, mark IV. Educational and Psychological Measurement, 34(1), 111-117. https://doi.org/10.1177/001316447403400115
Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). Guilford Publications.
Samuels, P. (2016). Advice on exploratory factor analysis (Working paper). https://doi.org/10.13140/RG.2.1.5013.9766
Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Allyn and Bacon.
West, S. G., Taylor, A. B., & Wu, W. (2012). Model fit and model selection in structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 209-231). Guilford Press.
Appendix B: Parameters Indicating Poor, Acceptable, and Good Model Fit by Fit Index

Fit Index | Poor | Acceptable | Good
Chi-square (p) | ≤ .05 | > .05 | > .05
SRMR | ≥ .10 | < .10* | ≤ .05
RMSEA | ≥ .10 | < .10* | ≲ .05
CFI | ≤ .90 | > .90 | ≥ .95

Note. Based on Bikos (2022) and Awang (2015). SRMR = standardized root mean square residual. RMSEA = root mean square error of approximation. CFI = comparative fit index. * Hooper et al. (2008) and West, Taylor, and Wu (2012) consider ≤ .08 the appropriate cutoff.