The Rule Out Procedure: Increasing the Potential for Police Investigators to Detect Suspect Innocence from Eyewitness Lineup Procedures

THE RULE OUT PROCEDURE
The Rule Out Procedure: Increasing the Potential for Police Investigators to Detect Suspect
Innocence from Eyewitness Lineup Procedures
Nydia T. Ayala, Andrew M. Smith, & Rebecca C. Ying
Department of Psychology, Iowa State University
Author Note
Nydia T. Ayala https://orcid.org/0000-0003-4073-8141
Andrew M. Smith https://orcid.org/0000-0002-4184-9364
Rebecca C. Ying https://orcid.org/0000-0002-5666-7790
Correspondence concerning this article should be addressed to Nydia T. Ayala, Dept. of
Psychology, Lagomarcino Hall, 1347 Stange Rd., Ames, IA 50011. Email: ntayala@iastate.edu
The preregistration, raw data, and analysis scripts are available at
https://osf.io/ksrp3/?view_only=06f37a3acafd4fb0b8dccba0f6ca6808.
© 2022, American Psychological Association. This paper is not the copy of record and may
not exactly replicate the final, authoritative version of the article. Please do not copy or cite
without authors' permission. The final article will be available, upon publication, via its
DOI: 10.1037/mac0000018
Abstract
When following scientific best-practice recommendations, the simultaneous lineup is effective at
demonstrating guilt. The simultaneous lineup is less effective at demonstrating innocence. A
critical problem is that when a witness identifies a filler or indicates the culprit is not present,
confidence does not measure the strength of match between the suspect and the witness’ memory
for the culprit. We propose a novel rule out procedure as a potential remedy. After making an
identification decision and expressing their confidence, participants indicated for each person
they did not identify, how confident they were this person was not the culprit. The rule out
procedure better discriminated guilty suspects from innocent suspects than did the simultaneous
lineup. This improvement was strictly attributable to increased potential to rule out the innocent.
Interestingly, both witnesses who made rejections and witnesses who mistakenly identified fillers
possessed additional memorial information that was useful for ruling out the innocent.
Keywords: eyewitness lineups, eyewitness identification, signal detection theory, receiver
operating characteristic analysis, rule out, exonerate
General Audience Summary
Psychological scientists have developed several practices and procedures that help to prevent
mistaken identifications from lineups. However, merely preventing a mistaken identification
does little to establish that someone is innocent and little research has focused on increasing the
potential of lineups to demonstrate that a suspect is innocent. We introduce a novel rule out
procedure that is intended to extract more information from eyewitness memory so that police
investigators can do a better job of detecting when a suspect is innocent. After participant-witnesses
made an identification decision and expressed their level of confidence in that decision, we asked
them to indicate, for each person they did not identify, how confident they were
that person was not the culprit. The rule out procedure was better able to distinguish guilty
suspects from innocent suspects than was the traditional simultaneous lineup. This effect was
completely attributable to the rule out procedure increasing the potential for eyewitness memory
to demonstrate innocence.
The Rule Out Procedure: Increasing the Potential for Police Investigators to Detect Suspect
Innocence from Eyewitness Lineup Procedures
Preventing the mistaken identification of innocent suspects has long been in the
foreground of the eyewitness lineup literature. After all, innocent-suspect identifications can lead
to wrongful arrest and conviction. Since the 1970s, psychological scientists have developed
several practices that help to prevent mistaken identification and increase the ability of police
investigators to discriminate guilty-suspect identifications from innocent-suspect identifications.
But there is a fundamental question that has lurked in the background and received little
attention. How do we demonstrate that a suspect is innocent? Preventing mistaken identifications
should not be confused with demonstrating innocence. Maybe the witness did not pick out the
suspect because of weak memory. Maybe the witness just lacked the confidence required to pick
anyone out of the lineup. Focusing on mere prevention of innocent-suspect identifications sells
the potential of eyewitness memory short. We show that eyewitness memory has immense
potential to demonstrate innocence, but only to the extent that identification procedures are
designed to extract the right information from witness memory.
A lineup is a procedure for measuring how strongly the police suspect matches the
witness’ memory for the culprit. If the suspect is guilty, then that person should provide a
relatively strong match to the witness’ memory, and if the suspect is innocent then that person
should provide a relatively weak match to the witness’ memory. Because investigators cannot
directly observe the match between the suspect and the witness’ memory trace, the witness is
asked to make an identification decision and to accompany that decision with a confidence
statement. If the witness behaves in a manner that suggests the suspect provides a strong match
to memory (e.g., high-confidence suspect identification), the investigator might infer guilt, and if
the witness behaves in a manner that suggests the suspect provides a weak match to memory
(e.g., high-confidence rejection), the investigator might infer innocence. What we wish to stress
here is that, at least in theory, eyewitness memory has the potential to both rule in (inculpate) the
guilty and to rule out (exculpate) the innocent (Smith & Ayala, 2021; Smith, Yang, et al., 2020;
Wells & Lindsay, 1980; Wells & Olson, 2002; Wells et al., 2015).
Although eyewitness memory has the potential to rule out innocent suspects, the
traditional simultaneous lineup is unlikely to ever realize that potential. A simultaneous lineup is
constructed by surrounding a photograph of the suspect with photographs of known-innocent
persons called fillers. The logic for surrounding the suspect with fillers is that the suspect might
be innocent, and if so, fillers offer that person protection from mistaken identification because many
witnesses who would otherwise have identified the innocent suspect identify a filler instead (Lee
& Penrod, 2019; Smith, Mackovichova, et al., 2020; Smith et al., 2017; Smith et al., 2018). That
fillers reduce innocent-suspect identifications is clearly beneficial, but fillers may also come with
the cost of undermining the potential for confidence to help with ruling out innocent suspects.
When a witness identifies a suspect, confidence measures the degree of match between the
suspect and the witness’ memory for the culprit and therefore indexes the likely guilt of the
suspect (e.g., Brewer & Wells, 2006; Juslin et al., 1996; Palmer et al., 2013; Sauer et al., 2010;
Wixted & Wells, 2017). But when a witness identifies a filler, confidence measures the match
between the identified filler and the witness’ memory for the culprit and does not index the likely
guilt of the suspect. What exactly confidence measures when a witness rejects a lineup is not
entirely clear, but one possibility is that it reflects the discrepancy between the best-matching
lineup member and the witness’ memory for the culprit. Because the suspect is not always the
best-matching lineup member, confidence often will not measure match-to-memory for the
suspect (Smith & Ayala, 2021). This reasoning might explain why confidence often does not
provide a good index of culprit presence for rejection decisions (Brewer & Wells, 2006; Sporer
et al., 1995; Wixted & Wells, 2017).
What is most frustrating is that the potential of lineups to rule out innocent suspects
might easily be increased by asking witnesses better questions. It is easy to envision scenarios
where a witness who rejected an entire lineup with low confidence, or even a witness who
identified a filler, could more definitively rule out an innocent suspect. For example, maybe the
witness had low confidence in the decision to reject the entire simultaneous lineup because
“number three looked sort of familiar”. But what if number three was not the suspect? Maybe
number three was a filler and number four was the suspect. If the investigator had followed up by
asking the witness about number four, the witness might have rejected the suspect with high
confidence. Likewise, even a witness who identified a filler might be able to tell the investigator
that number four, the suspect, was “definitely not” the perpetrator. A recent Signal Detection-
based simulation confirms that, in theory, a simultaneous lineup would be better able to
discriminate between guilty suspects and innocent suspects if it were designed so that it always
measured how strongly the suspect matched the witness’ memory for the culprit (Smith & Ayala,
2021).
One identification procedure that always secures a measure of the match between the
suspect and the witness’ memory for the culprit is the ratings-based identification procedure
(Brewer & Doyle, 2021; Brewer et al., 2012; Sauer & Brewer, 2021; Sauer et al., 2008). The
ratings-based procedure eschews categorical identification decisions altogether and instead asks
the witness to provide a confidence rating for each lineup member. These ratings can then be
leveraged to determine the likely guilt of the suspect. One clear benefit of a ratings-based
procedure is that it always provides a direct measure of how strongly the suspect matches the
witness’ memory for the culprit. This might explain why the ratings-based procedure has proven
more effective at demonstrating innocence than a categorical rejection (Sauer et al., 2008). Yet,
despite several potential benefits, the legal system may be resistant to the ratings-based
procedure. One basis for this resistance is that a lineup is the point in an investigation when a
witness either formally accuses the suspect or does not (Steblay & Brooks, 2021). Because the
ratings-based procedure eschews the categorical identification decision, no formal accusation is
ever made.
In lieu of a ratings-only approach to eyewitness identification, we propose a hybrid in which,
after the witness completes a traditional simultaneous lineup and provides a confidence rating, the
investigator extracts additional ratings from the witness. We call this hybrid the rule out
procedure. The goal of this procedure is to extract more and better information from eyewitness
memory than does the simultaneous lineup, while at the same time, securing the categorical
identification decisions that are so coveted by the legal system. We presented a simultaneous
lineup to witness-participants. Witnesses viewed the lineup, made an identification decision, and
indicated their confidence in that decision. We then presented witnesses with the same lineup for
a second viewing but this time we attached a rating scale to the photo of each lineup member that
the witness did not initially identify. We asked witnesses to indicate, for each person that they
did not initially identify, how confident they were that this person was not the culprit. The rule
out procedure ensures there is always a confidence rating that provides nuanced information
about how strongly the suspect matches the witness’ memory for the culprit. We hypothesized
that the rule out procedure would better discriminate guilty suspects from innocent suspects than
would the traditional simultaneous lineup.
Method
The experiment and data analysis plan were preregistered prior to data collection and can
be accessed on the Open Science Framework (Ayala et al., 2022). De-identified data for this
experiment are also available on the Open Science Framework. The institutional review board at
Iowa State University approved this experiment. Our hypothesis was that the rule out procedure
would increase the potential for eyewitness memory to rule out innocent suspects. Specifically,
we predicted that the rule out procedure would better discriminate between guilty suspects and
innocent suspects than would the traditional simultaneous lineup. An a priori power analysis
using G*Power (Version 3.1, Faul et al., 2007) revealed that a sample size of 1100 would give us
95% power to detect a small effect size (dz = 0.10) using a dependent-samples t-test. We used
this as a proxy for estimating the sample required to find a difference in Area Under the Receiver
Operating Characteristic Curve (AUC). Because we expected to exclude some participants due to
failed attention checks, technological issues, or prior exposure to our stimuli, we oversampled by
recruiting 1500 Mechanical Turk workers.
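As a rough check on the scale of this power analysis, the following sketch computes a normal-approximation sample size for a dependent-samples t-test. It is not the G*Power computation itself. The alpha level (.05) and the choice between a one-tailed and a two-tailed test are assumptions, because neither is stated in the text, and the function name `paired_t_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_t_sample_size(dz, power=0.95, alpha=0.05, two_tailed=False):
    """Approximate N for a dependent-samples t-test via the normal approximation.

    dz is the standardized mean difference of the paired scores.
    alpha and two_tailed are assumptions; the source does not report them.
    """
    z = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = z.inv_cdf(1 - tail_alpha)  # critical value for the test
    z_beta = z.inv_cdf(power)            # quantile matching the desired power
    return ceil(((z_alpha + z_beta) / dz) ** 2)


n_one_tailed = paired_t_sample_size(0.10)
n_two_tailed = paired_t_sample_size(0.10, two_tailed=True)
print(n_one_tailed, n_two_tailed)
```

Under these assumptions, the one-tailed estimate lands near the reported figure of 1100, whereas the two-tailed estimate is closer to 1300. An exact calculation based on the noncentral t distribution would differ slightly from this approximation.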
Participants
In total, 1474 Mechanical Turk workers participated in this study in exchange for $0.50
USD. We fell slightly short of our 1500-participant target because we detected 26 Mechanical
Turk workers who either submitted an unauthorized completion code or completed the study
more than once. We used the TurkPrime platform to facilitate data collection (Litman et al.,
2017). Of the 1474 Mechanical Turk workers who participated in this experiment, 251 were
excluded because they failed the attention check (n = 109), encountered a technological issue
with the encoding video (n = 33), encountered a technological issue with the lineup task (n = 27),
or recognized the stimuli from another experiment (n = 114). The number of exclusion reasons
exceeds the number of excluded participants because some of these participants reported
multiple problems. In addition, we include in our list of exclusions participants who wrote
incomprehensible responses to these questions. Of the 1223 participants who provided valid data,
142 skipped one or more of the confidence ratings that were associated with the rule out
procedure. We analyzed our data both including all 1223 participants and including only the
1081 participants who provided full data. Because we reached the same conclusions, we present
analyses based only on the 1081 participants who provided complete data.
Of the 1081 participants who provided valid and complete data, 75% (n = 810) identified
as White or Caucasian, 9% (n = 101) identified as Black or African American, 8% (n = 89)
identified as Asian or Asian-American, 5% (n = 58) identified as Hispanic or Latino/Latina, 2%
(n = 22) identified as other, and one person did not identify. Fifty-seven percent (n = 613) of
participants identified as female, 43% (n = 462) identified as male, and less than 1% identified as
other (n = 6). On average, participants were 39.09 years of age (SD = 12.25).
Design
We used a 2 (culprit: present, absent) x 2 (lineup: simultaneous, rule out) mixed design.
Culprit presence was manipulated between participants, with participants randomly assigned to
the culprit-present or culprit-absent condition, and lineup type was manipulated within
participants. After completing a
simultaneous lineup and providing a confidence statement, participants were asked to indicate
for any lineup member that they did not affirmatively identify, how confident they were that this
person was not the culprit (viz. to complete the rule out procedure). Participants were also
randomly assigned to view one of two different target videos of a man discussing a crime. The
sole purpose of this manipulation was to create some degree of stimulus sampling (Wells &
Windschitl, 1999).
Materials
Culprit Videos
Each participant viewed a short video depicting a man speaking about a crime to a person
who is not in view of the camera. The videos did not have any sound and the camera frame
captured the man in the video from the chest up. The video was 16 seconds in length, including a
3-second-long blank screen at both the beginning and end of the video. The first target was a
clean-shaven White male in his early 20s with short-to-medium-length, wavy blonde hair. The
second target was a White male in his early 20s with medium-length, dark brown hair and short
stubble.
Lineups
Culprit-present lineups included a photo of the man from the encoding video along with
five fillers who the participants had never seen before. Culprit-absent lineups included six fillers
who the participants had never seen before. All lineup fillers matched the description of the
culprit provided above. (See the project page on the Open Science Framework for examples of
each lineup type). Lineups were presented to participants as a 2 (rows) x 3 (columns)
simultaneous photo array with a separate option to indicate that the culprit was not present.
Because we planned to estimate the innocent-suspect identification rate by dividing the total
number of culprit-absent false alarms by the number of lineup members (6), we counterbalanced
which of the six culprit-absent fillers appeared alongside the culprit in the culprit-present lineup. We
created six versions of each of our two culprit-present lineups, each time excluding a different
filler who was present in the culprit-absent lineup. (See Quigley-McBride and Wells (2021) for
discussion on why this counterbalancing is essential when estimating innocent-suspect
identification rates from the total culprit-absent false-alarm rate).
Dividing the total culprit-absent false alarm rate by the number of culprit-absent lineup
members generates an estimate of the innocent-suspect identification rate that assumes a fair
lineup. We also measured resultant lineup fairness to evaluate how uniformly witness responses
were distributed across lineup members (Quigley-McBride & Wells, 2021; Smith et al., 2019).
Resultant lineup fairness is computed in the same way that mock-witness lineup fairness is
computed (for example, using Tredoux’s (1998) E'), but rather than relying on a proxy for lineup
fairness—which may not be accurate—resultant lineup fairness measures the fairness that was
actually achieved in the experiment. Because recognition memory for the target will obviously
influence the distribution of responses, resultant lineup fairness is calculated only for culprit-absent
conditions (see Quigley-McBride & Wells, 2021, for a detailed discussion). For a six-person
lineup, Tredoux’s E' can vary from 1 (perfectly biased) to 6 (perfectly fair). The
Tredoux’s E' value for our first culprit (the blonde male) was 3.50, suggesting a moderately fair
lineup. The Tredoux’s E' for our second culprit (the brunette male) was 5.03, suggesting a very
fair lineup.
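The fairness index described above can be illustrated with a short sketch. We assume the commonly cited form of Tredoux's E', namely the reciprocal of the sum of squared choice proportions across lineup members; the text does not spell out the formula, so this form is an assumption. The function name and all counts below are hypothetical and are not the experimental data.

```python
def tredoux_e_prime(counts):
    """Effective lineup size, assuming E' = 1 / sum(p_i ** 2).

    counts holds the number of culprit-absent identifications of each
    lineup member. The formula is an assumed form of Tredoux's (1998) E'.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one identification is required")
    proportions = [c / total for c in counts]
    return 1.0 / sum(p * p for p in proportions)


# Hypothetical identification counts for a six-person lineup.
fair = tredoux_e_prime([20, 20, 20, 20, 20, 20])   # choices spread evenly
biased = tredoux_e_prime([120, 0, 0, 0, 0, 0])     # every choice on one member
uneven = tredoux_e_prime([50, 30, 20, 10, 5, 5])   # an intermediate case
print(fair, biased, uneven)
```

With evenly spread choices the index reaches the nominal lineup size of 6, and when every choice falls on one member it drops to 1, which matches the range described in the text.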
Procedure
After providing informed consent, participants were told that they were going to
watch a video and that, because they would have only one opportunity to watch it, they should not
advance to the next page until they were ready to do so. After viewing the video, participants
completed a 2-minute word-unscrambling (anagrams) task. The purpose of the anagrams task
was to create a delay between viewing the encoding video and completing the lineup task. After
participants completed the anagrams task, they were instructed that they would view a lineup,
that the culprit may or may not be present, that they should identify the culprit if he is present,
and that otherwise, they should indicate that he is not present. After making an identification
decision, participants were asked to indicate their confidence in that decision: 0%, 20%, 40%,
60%, 80%, or 100%. At this point the simultaneous lineup procedure was complete and the rule
out lineup procedure began. Participants were told that they would now view the same lineup
again but this time, we wanted them to indicate for any person they did not initially identify, how
confident they were that person was not the culprit. If a participant had initially indicated that the
culprit was not present, a rating scale (0%, 20%, 40%, 60%, 80%, 100%) appeared above each of
the three photos in the top row of the lineup and below each of the three photos in the bottom
row of the lineup (there were six rating scales in total—one for each photo). If a participant made
an affirmative identification during the initial lineup presentation, the identified person’s photo
was present during the rule out procedure, but there was no rating scale attached to that photo
(there were five rating scales in total—one for each photo depicting a person not previously
identified). During the rule out procedure all lineup photos appeared in the same configuration as
they did during the initial lineup.
Results
We start our results section by providing a descriptive overview of performance on the
traditional simultaneous lineup. We then provide a brief tutorial on creating Receiver Operating
Characteristic (ROC) curves for eyewitness lineup procedures (Smith & Ayala, 2021; Smith,
Yang, et al., 2020; Yang & Smith, 2021). Following this tutorial, we present the results of an
ROC analysis comparing the rule out procedure with the traditional simultaneous lineup. Finally,
we present supplemental analyses that further clarify why the rule out procedure was superior to
the traditional simultaneous lineup.
Categorical Eyewitness Identification Performance
Table 1 provides a summary of eyewitness identification performance on the traditional
simultaneous lineup. When the culprit was present in the lineup, 63.59% (n = 344) of
eyewitnesses identified him, 11.09% (n = 60) identified a filler, and 25.32% (n = 137) indicated
that the culprit was not present. When the culprit was absent from the lineup, 29.44% (n = 159)
mistakenly identified an innocent lineup member and 70.56% (n = 381) indicated that the culprit
was not present. Because we did not have a designated innocent suspect, we estimated the
innocent-suspect identification rate by dividing the total false-alarm rate from the culprit-absent
lineup (29.44%) by six. Hence, our estimate of the innocent-suspect identification rate was
4.91% and our estimate of the culprit-absent filler-identification rate was 24.53% (29.44% -
4.91%).
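The 1-over-6 estimate described above amounts to the following arithmetic. The counts (159 false alarms and 381 rejections in the culprit-absent condition) come from the text; the function name `one_over_n_estimates` is hypothetical.

```python
def one_over_n_estimates(false_alarms, n_witnesses, lineup_size=6):
    """Split the culprit-absent false-alarm rate into suspect and filler parts.

    Assumes a fair lineup, so the innocent suspect attracts 1/lineup_size
    of all culprit-absent identifications.
    """
    total_rate = false_alarms / n_witnesses
    innocent_suspect_rate = total_rate / lineup_size
    filler_rate = total_rate - innocent_suspect_rate
    return total_rate, innocent_suspect_rate, filler_rate


# 159 culprit-absent witnesses identified someone and 381 rejected the lineup.
total, suspect, filler = one_over_n_estimates(159, 159 + 381)
print(f"{total:.2%} {suspect:.2%} {filler:.2%}")
```

Working from unrounded values gives a filler rate of about 24.54%. The 24.53% reported in the text results from subtracting the already rounded percentages (29.44% - 4.91%).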
Table 1
Identification decisions as a function of culprit presence
                                  Culprit Presence
          Culprit Present                              Culprit Absent
Suspect ID     Filler ID      Rejection       Suspect ID   Filler ID      Rejection
63.59% (344)   11.09% (60)    25.32% (137)    4.91%        24.53% (159)   70.56% (381)
Note. Values in parentheses represent the number of participants who provided that response. The suspect and filler
ID rates for the culprit-absent lineup were estimated using the 1-over-6 method, as explained in text.
Eyewitness Lineup ROC Curves
We initiate our discussion of ROC analysis with a brief tutorial on creating ROC curves
for eyewitness lineups. For this tutorial we used the simulated data from Smith and Ayala
(2021). There are two qualitative differences between the simulated data that we used for this
tutorial and the experimental data that we collected. First, whereas in our experiment, we had
participants indicate their confidence on a six-point scale (0%, 20%, 40%, 60%, 80%, 100%), for
simplicity we used a 3-point confidence scale (low, moderate, high) in this tutorial. Second,
whereas in our experimental data we estimated the innocent-suspect identification rate from the
overall false-alarm rate, for the purposes of this tutorial we used a designated innocent suspect.
We first explain how to create an ROC curve for the rule out procedure. For witnesses
who initially identified the suspect, we focused on their expressed level of confidence that the
suspect was the culprit (low, moderate, high). For witnesses who did not initially identify the
suspect, we focused on their expressed level of confidence that the suspect was not the culprit
(low, moderate, high). We then combined these two scales to form a 6-point scale that ranged
from high-confidence that the suspect was the culprit to high-confidence that the suspect was not
the culprit. This 6-point scale reflects how strongly the suspect matched the witness’ memory for
the culprit, which served as a proxy for the likelihood that the suspect was guilty. To create an
ROC curve, we then plotted the cumulative distribution of culprit-present responses on this 6-
point scale against the cumulative distribution of culprit-absent responses on this 6-point scale.
The area under the ROC curve (AUC) measures the ability of the procedure to discriminate
guilty suspects from innocent suspects.
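The construction just described can be sketched in a few lines. The rating labels, the function name `operating_points`, and all counts below are hypothetical illustrations; they are not the simulated data from Smith and Ayala (2021) or our experimental data.

```python
from itertools import accumulate

# Combined 6-point scale, ordered from most diagnostic of guilt
# to most diagnostic of innocence (labels are hypothetical).
RATINGS = [
    "suspect-high", "suspect-moderate", "suspect-low",
    "not-culprit-low", "not-culprit-moderate", "not-culprit-high",
]


def operating_points(present_counts, absent_counts):
    """Return cumulative (false-positive, true-positive) pairs, one per rating."""
    if len(present_counts) != len(absent_counts):
        raise ValueError("both conditions need one count per rating")
    n_present = sum(present_counts)
    n_absent = sum(absent_counts)
    tp = [c / n_present for c in accumulate(present_counts)]
    fp = [c / n_absent for c in accumulate(absent_counts)]
    return list(zip(fp, tp))


# Hypothetical counts of suspect ratings, 500 witnesses per condition.
present = [80, 75, 95, 170, 50, 30]
absent = [5, 10, 25, 120, 95, 245]
points = operating_points(present, absent)
for label, (fp, tp) in zip(RATINGS, points):
    print(f"{label:22s} FP = {fp:.3f}  TP = {tp:.3f}")
```

Each pair is one operating point. The final pair is always (1, 1) because every culprit-present and every culprit-absent response has been accumulated by that stage.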
Figure 1 displays the ROC curves for both the rule out procedure and simultaneous lineup
procedure (discussed below) and Table 2 displays the underlying identification rates. We plotted
cumulative ratings as operating points in the ROC space by starting with the rating that was most
diagnostic of guilt and finishing with the rating that was most diagnostic of innocence. The
leftmost operating point pits high-confidence culprit identifications against high-confidence
innocent-suspect identifications. The second to leftmost operating point pits culprit
identifications made with at least moderate confidence against innocent-suspect identifications
made with at least moderate confidence. In other words, the second to leftmost operating point
includes both high- and moderate-confidence suspect identifications. One continues plotting
operating points in this fashion until all culprit-present outcomes and all culprit-absent outcomes
are included in the single operating point depicted in the upper righthand corner of the ROC
space. These operating points are then connected by line segments to form a non-parametric
ROC curve. The cumulative rates depicted in Table 2 correspond to the operating points in
Figure 1.¹
The simultaneous-lineup ROC curve was created in similar fashion to the rule out ROC
curve. We started by plotting operating points for suspect identifications by descending levels of
confidence (see Figure 1 and Table 2). But then we ran into a complexity that we did not
encounter with the rule out procedure. Witnesses who do not identify the suspect can either reject
the lineup or identify a filler, and there is no inherent ordering of rejections versus filler picks.
To resolve this issue, we determined an a priori rule that would order these operating points from
most to least diagnostic of guilt. The Signal Detection model used to generate the simulated data
in Figure 1 predicted that confidence would be related to guilt for rejection decisions but not for
filler identifications. In addition, this model also predicted that a filler identification made with
any level of confidence was about as diagnostic of innocence as was a low-confidence rejection
(Smith & Ayala, 2021). Hence, we ordered the operating points as follows: high-confidence
suspect, moderate-confidence suspect, low-confidence suspect, fillers, low-confidence rejection,
moderate-confidence rejection, and high-confidence rejection. We use that same ordering both
for this tutorial and for our subsequent experimental comparison.²

¹ Unlike in our simulated data, we did not have a designated innocent suspect in our experimental data. Because our
goal was to generate estimates of the innocent-suspect identification rate that were analogous to dividing the total
culprit-absent false-alarm rate by the number of lineup members (6), we used all six confidence ratings that each
participant provided to produce the rule out ROC curve for our experimental data. The result is that each operating
point in the ROC space is equivalent to having divided the total culprit-absent false alarm rate by the number of
lineup fillers (6).
Figure 1
ROC curves for the traditional simultaneous lineup procedure and the rule out procedure
derived from simulated eyewitness identification decisions
[Figure: ROC plot. X-axis: 1 − Specificity (Culprit Absent), 0.00 to 1.00. Y-axis: Sensitivity (Culprit Present),
0.00 to 1.00. Curves: Rule Out; Traditional Simultaneous. Operating-point labels: SuspectHigh, SuspectModerate,
SuspectLow, Fillers, NotPresentLow, NotPresentModerate.]
Note. The AUC for the rule out procedure was .8368 and the AUC for the traditional simultaneous lineup was .7492.
Hence, the model predicted that the rule out procedure would better discriminate guilty suspects from innocent
suspects than would the traditional simultaneous lineup.

² For our experimental comparison of the simultaneous lineup and the rule out procedure, we also compared the two
procedures with an alternative a priori ordering based on the empirical literature (see Smith, Yang, et al., 2020) and
our AUC estimates were virtually identical to the ones we obtained with the model-predicted ordering that we use in
the main body of this paper. More importantly, both orderings led to the same conclusion: the rule out procedure
was superior to the simultaneous lineup.
Table 2
Noncumulative and cumulative true-positive and false-positive rates for both the simultaneous
lineup and the rule out identification procedures
                        Simultaneous Lineup                  Rule Out Procedure
                    Noncumulative    Cumulative         Noncumulative    Cumulative
Decision / Rating   TP      FP       TP      FP         TP      FP       TP      FP
SuspectHigh         .156    .007     .156    .007       .158    .007     .158    .007
SuspectModerate     .139    .016     .295    .023       .149    .017     .307    .024
SuspectLow          .215    .063     .510    .086       .190    .046     .498    .071
Filler              .255    .399     .765    .484       -       -        -       -
RejectLow           .104    .162     .869    .646       .345    .240     .843    .311
RejectModerate      .104    .240     .973    .886       .094    .194     .938    .504
RejectHigh          .027    .114     1.00    1.00       .062    .496     1.00    1.00
Note. TP = True Positive. FP = False Positive. Each pair of cumulative true-positive rates and false-positive rates
corresponds to a different operating point on the ROC curves depicted in Figure 1.
Beyond clarifying how to generate ROC curves for the rule out and simultaneous lineup
procedures, this tutorial reveals an interesting prediction of Signal Detection Theory: the rule out
procedure should be better able to discriminate guilty suspects from innocent suspects than
should the simultaneous lineup. Even though the rule out procedure and simultaneous lineup
were based on the same underlying memory signals (See Smith & Ayala, 2021), the rule out
procedure was associated with better discriminability (AUC = .8368) than was the simultaneous
lineup (AUC = .7492). Given that these two ROC curves were based on the same underlying
memory signals, the only explanation for this pattern of results is that the rule out procedure
extracts more and better information from memory than does the simultaneous lineup.
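Because the operating points are connected by line segments, the area under each curve can be approximated with the trapezoidal rule. The sketch below assumes this is how the AUC values were obtained, and it uses the rounded cumulative rates from Table 2, so small rounding differences from the reported values are expected. The function name `trapezoid_auc` is hypothetical.

```python
def trapezoid_auc(points):
    """Area under a piecewise-linear ROC curve through (FP, TP) points.

    points must be ordered from the leftmost operating point to (1, 1);
    the origin (0, 0) is prepended automatically.
    """
    pts = [(0.0, 0.0)] + list(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # one trapezoid per line segment
    return area


# Cumulative (FP, TP) pairs copied from Table 2.
RULE_OUT = [(.007, .158), (.024, .307), (.071, .498),
            (.311, .843), (.504, .938), (1.0, 1.0)]
SIMULTANEOUS = [(.007, .156), (.023, .295), (.086, .510), (.484, .765),
                (.646, .869), (.886, .973), (1.0, 1.0)]

auc_rule_out = trapezoid_auc(RULE_OUT)
auc_simultaneous = trapezoid_auc(SIMULTANEOUS)
print(f"{auc_rule_out:.4f} {auc_simultaneous:.4f}")
```

Applied to the tabled rates, this calculation comes out within rounding error of the AUC values of .8368 and .7492 reported for the simulated data.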
It is evident from Figure 1 that the rule out procedure has better
discriminability than does the simultaneous lineup because it is better able to detect
innocence. For example, consider the operating point (triangle) for the rule out procedure at X =
.502, Y = .929. The distance from this operating point to the upper righthand corner of the ROC
THE RULE OUT PROCEDURE
18
space reflects the high-confidence rejection rate for the culprit-present (7.1%) and culprit-absent
(49.8%) conditions; 87.5% of high-confidence rejections from the rule out procedure came from
the culprit-absent condition. High-confidence rejections from the simultaneous lineup were also
relatively diagnostic of innocence as is evident from the distance between the
“NotPresentModerate” operating point and the upper righthand corner. This line segment reflects
the high-confidence rejection rates for culprit-present (2.7%) and culprit-absent (11.4%)
conditions; 81% of high-confidence rejections came from the culprit-absent condition. Given that
the rule out procedure only slightly increased high-confidence rejection accuracy, one might
question why it leads to a much larger AUC value than does the simultaneous lineup. The reason
is that the rule out procedure leads to a massive increase in the yield of high-confidence
correct rejections. When the culprit was absent, 49.8% of rule out procedures led to a high-
confidence correct rejection but only 11.4% of culprit-absent simultaneous lineups led to a high-
confidence correct rejection.
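The percentages in this paragraph follow from Bayes' rule. The sketch below recomputes the share of high-confidence rejections that came from the culprit-absent condition, assuming equal numbers of culprit-present and culprit-absent lineups (a base rate of .5, which is our assumption for this illustration). The function name is hypothetical.

```python
def share_from_culprit_absent(rate_absent, rate_present, base_rate_present=0.5):
    """P(culprit absent | outcome) by Bayes' rule for a given culprit-present base rate."""
    absent = rate_absent * (1 - base_rate_present)
    present = rate_present * base_rate_present
    return absent / (absent + present)


# High-confidence rejection rates quoted in the text (culprit-absent, culprit-present).
rule_out_share = share_from_culprit_absent(0.498, 0.071)
simultaneous_share = share_from_culprit_absent(0.114, 0.027)
print(f"rule out: {rule_out_share:.1%}, simultaneous: {simultaneous_share:.1%}")
```

The two shares are similar, whereas the culprit-absent rates that enter them (.498 versus .114) differ by more than a factor of four, which is the yield difference the paragraph describes.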
ROC Analysis
Recall that the simultaneous lineup and rule out procedure share the same overall suspect
identification, filler identification, and rejection rates. Indeed, the difference between the
simultaneous lineup procedure and the rule out procedure is that after completing the
simultaneous lineup and providing a confidence statement, participants were then asked to
indicate for each non-identified lineup member, how confident they were this individual was not
the culprit. Hence, the ROC curves associated with these two procedures are identical from the
origin (0,0) up until the operating point that contrasts the overall guilty-suspect identification rate
(.6359 on the ordinate) with the overall innocent-suspect identification rate (.0491 on the
abscissa). Anywhere to the right or above that point, the two procedures are based on different
data. For both procedures, we ordered the data as we did in the above tutorial. The result is that
we had 12 operating points (plus a point of origin) for the rule out procedure and 13 operating
points (plus a point of origin) for the traditional simultaneous lineup.
The ROC curves for both the traditional simultaneous lineup procedure and the
simultaneous lineup followed by a rule out procedure are displayed in Figure 2. It is apparent
from Figure 2 that the rule out procedure was better able to discriminate between guilty suspects
and innocent suspects than was the traditional simultaneous lineup procedure. We used the
bootstrap method (n = 2000) from the pROC package (Robin et al., 2011) to assess whether the
apparent difference in AUCs was statistically significant. As predicted, the rule out procedure
better discriminated between guilty-suspects and innocent-suspects (AUC = .8714) than did the
traditional simultaneous lineup (AUC = .8264), D = 2.85, p = .004 (unpaired test).
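For readers unfamiliar with bootstrap comparisons of AUCs, the following Python sketch shows the general logic of an unpaired test: resample each procedure's culprit-present and culprit-absent scores independently, recompute the AUC difference on every resample, and divide the observed difference by the standard deviation of the bootstrap differences. It is loosely modeled on that idea and is not the pROC implementation. The data are simulated, and all names, sample sizes, and the seed are hypothetical.

```python
import random
from statistics import NormalDist, stdev


def auc(present, absent):
    """Probability that a culprit-present score outranks a culprit-absent score (ties count half)."""
    wins = sum((p > a) + 0.5 * (p == a) for p in present for a in absent)
    return wins / (len(present) * len(absent))


def resample(xs, rng):
    """Draw a bootstrap sample of the same size, with replacement."""
    return [rng.choice(xs) for _ in xs]


def bootstrap_auc_test(proc_a, proc_b, n_boot=500, seed=1):
    """Unpaired bootstrap comparison; each procedure is a (present, absent) pair of score lists."""
    rng = random.Random(seed)
    observed = auc(*proc_a) - auc(*proc_b)
    diffs = []
    for _ in range(n_boot):
        boot_a = auc(resample(proc_a[0], rng), resample(proc_a[1], rng))
        boot_b = auc(resample(proc_b[0], rng), resample(proc_b[1], rng))
        diffs.append(boot_a - boot_b)
    d = observed / stdev(diffs)                 # standardized difference
    p = 2 * (1 - NormalDist().cdf(abs(d)))      # two-sided p from the normal distribution
    return observed, d, p


# Simulated evidence scores for two hypothetical procedures (not the study's data).
data_rng = random.Random(0)


def fake_scores(n, mean):
    return [round(data_rng.gauss(mean, 1.0), 1) for _ in range(n)]


procedure_a = (fake_scores(40, 1.5), fake_scores(40, 0.0))
procedure_b = (fake_scores(40, 1.0), fake_scores(40, 0.0))
diff, d_stat, p_value = bootstrap_auc_test(procedure_a, procedure_b)
print(f"AUC difference = {diff:.3f}, D = {d_stat:.2f}, p = {p_value:.3f}")
```

An unpaired test resamples the two procedures independently, which suits comparisons in which the two ROC curves are not treated as coming from matched observations.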
Figure 2
ROC curves for the traditional simultaneous lineup procedure and the rule out procedure
Note. The AUC for the rule out procedure was .8714 and the AUC for the traditional simultaneous lineup was .8264.
The rule out procedure better discriminated guilty suspects from innocent suspects than did the traditional
simultaneous lineup procedure.
Our supplemental materials include six additional tests of the rule-out advantage. For
each of these tests, we treated a different culprit-absent lineup member as the designated
innocent suspect. Each of these six additional tests also showed a statistically significant rule out
advantage. We also show 12 additional descriptive tests, where we treated a different culprit-
absent lineup member as the designated innocent suspect separately for each of our two stimulus
sets. Although the size of the AUC difference varied some across these 12 tests, every single test
showed a rule out advantage. There was not a single case where the simultaneous lineup fared
better than the rule out procedure.
Interpreting Rule Out Performance
To further clarify why the rule out procedure was superior to the traditional simultaneous
lineup, we examined both confidence-accuracy calibration of rejection decisions and the yield of
culprit-absent outcomes (Figure 3). Figure 3 also includes a calibration-like function for filler
identifications. We included a calibration-like function for filler identifications for two reasons.
First, filler identifications do have modest rule out potential. Second, it is the presence of fillers
that undermines the traditional simultaneous lineup’s potential to rule out innocent suspects and
the calibration-like function for fillers helps to clarify why.
The points in Figure 3 depict the calibration functions or the probability of suspect
innocence for each outcome and associated level of confidence. The bars depict the yield or the
total proportion of culprit-absent lineups producing that particular outcome. For example, on
48% of rule out procedures, the innocent suspect was rejected with 100% confidence, which is
evident from the black bar at the 100% confidence bin.
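To make the two quantities in Figure 3 concrete, the sketch below computes a calibration point and a yield bar for each confidence bin from response counts. The counts, the bin labels, and the total number of culprit-absent lineups are hypothetical placeholders rather than the study's data. The probability of innocence is computed from raw counts, which implicitly assumes equal numbers of culprit-present and culprit-absent lineups.

```python
# Hypothetical counts of rejections per confidence bin (not the study's data).
bins = ["0-40", "60", "80", "100"]
culprit_absent_counts = [30, 40, 60, 120]
culprit_present_counts = [25, 25, 20, 15]
n_culprit_absent = 400  # total culprit-absent lineups (hypothetical)


def calibration_and_yield(absent, present, n_absent):
    """Return (P(culprit absent | bin), yield) for each confidence bin."""
    rows = []
    for a, p in zip(absent, present):
        p_innocent = a / (a + p)   # point: probability of suspect innocence
        yield_ = a / n_absent      # bar: share of all culprit-absent lineups
        rows.append((p_innocent, yield_))
    return rows


rows = calibration_and_yield(culprit_absent_counts, culprit_present_counts, n_culprit_absent)
for label, (p_innocent, yield_) in zip(bins, rows):
    print(f"confidence {label}: P(innocent) = {p_innocent:.2f}, yield = {yield_:.2f}")
```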
Figure 3 demonstrates that filler identifications were only modestly diagnostic of
innocence and their diagnostic value did not differ as a function of confidence (as evidenced by
the lack of positive slope). Conversely, confidence was related to culprit-absence for both
rejections from the traditional simultaneous lineup and rejections from the rule out procedure.
Qualitatively, the calibration function had a steeper slope for the rule out procedure than it did
for rejections from the traditional simultaneous lineup. This is evident from the fact that at 0% to
40% confidence, rejections from the rule out procedure were less diagnostic of innocence than
were rejections from the traditional simultaneous lineup, but at 100% confidence, rejections from
the rule out procedure were more diagnostic of innocence than were rejections from the
traditional simultaneous lineup.
Figure 3 clarifies that the superiority of the rule out procedure over the traditional
simultaneous lineup is a product of two factors. First, 100%-confidence rejections from the rule
out procedure were slightly more diagnostic of innocence (88%) than were 100%-confidence
rejections from the traditional simultaneous lineup (81%) (this is evidenced from the non-
overlapping error bars). Second, rejections made with 100% confidence were more diagnostic of
innocence than were rejections made with lower levels of confidence and the rule out procedure
produced a much larger yield of 100%-confidence rejections (48% of all culprit-absent lineups)
than did the traditional simultaneous lineup (15% of culprit-absent lineups). This difference in
yield was also evidenced in the ROC curves depicted in Figure 2. The horizontal distance
between the rightmost and second to rightmost operating points is equal to the proportion of
culprit-absent lineups that resulted in 100%-confidence rejections. For the rule out procedure,
this value is equal to 48% and for the simultaneous lineup, this value is equal to only 15%.
Figure 3
Yield of confidence ratings for culprit-absent lineups (bars) and the probability of suspect
innocence given the level of confidence (points)
Note. Points represent the probability that the culprit was absent given the witness’ response and level of confidence.
Bars represent the proportion of all culprit-absent lineups that produced that response and associated level of
confidence (i.e., yield).
Are Rule Out Ratings That Follow the Mistaken Identification of a Filler Diagnostic?
An interesting question is whether rule out ratings discriminate between guilty suspects
and innocent suspects even for witnesses who mistakenly identified a known-innocent filler. A
witness who identifies a filler from a lineup clearly does not have the most reliable memory for
the culprit. Yet, there is some empirical evidence that even witnesses who identify fillers may
have additional memorial information that could be used to discriminate between guilty suspects
and innocent suspects (e.g., Brewer et al., 2020; McAdoo & Gronlund, 2016). For example,
Brewer et al. (2020) found that even for witnesses who identified fillers, reasonably high
confidence that the suspect was guilty was diagnostic of guilt. Accordingly, we examined the
efficacy of rule out ratings separately for witnesses who initially identified a filler and for
witnesses who initially rejected the lineup.
For each culprit-absent witness, we computed an average rule out rating. For culprit-
absent witnesses who initially identified a filler, this average was based on the ratings assigned to
the five non-identified lineup members and for culprit-absent witnesses who initially rejected,
this average was based on the rule out ratings assigned to all six non-identified lineup members.
We then compared these averages to the rule out ratings that culprit-present witnesses assigned
to the culprit. We used Welch two-sample t-tests for these comparisons. Culprit-present
witnesses who identified a filler assigned lower rule out ratings to the culprit (M = 52.00, SD =
33.39) than culprit-absent witnesses who identified a filler assigned to non-identified innocent
persons (M = 63.90, SD = 29.95), t(96.93) = 2.42, p = .02, d = 0.37, (95% CI: .07 to .67).
Likewise, culprit-present witnesses who indicated the culprit was not present assigned lower rule
out ratings to the culprit (M = 49.93, SD = 33.09) than culprit-absent witnesses who indicated the
culprit was not present assigned to non-identified innocent persons (M = 78.02, SD = 23.41),
t(187.19) = 9.15, p < .001, d = 0.91 (95% CI: 0.71 to 1.11). These findings suggest that the rule
out procedure extracts diagnostic memorial information both from witnesses who reject lineups
and even from witnesses who mistakenly identify fillers.
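As a reference for the test used above, the following Python sketch computes Welch's t statistic, the Welch-Satterthwaite degrees of freedom, and a standardized mean difference. The ratings are hypothetical, not the study's data. The pooled-standard-deviation form of Cohen's d is our assumption, because the text does not state which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for mean(y) - mean(x)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(y) - mean(x)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled = sqrt(((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2))
    return (mean(y) - mean(x)) / pooled


# Hypothetical rule out ratings on a 0-100 scale (not the study's data).
culprit_present = [20, 45, 60, 35, 80, 50, 10, 70]       # ratings assigned to the culprit
culprit_absent = [65, 90, 75, 55, 100, 80, 70, 85, 60]   # mean ratings for innocent persons

t_stat, df = welch_t(culprit_present, culprit_absent)
print(f"t({df:.2f}) = {t_stat:.2f}, d = {cohens_d(culprit_present, culprit_absent):.2f}")
```

Welch's test does not assume equal variances across groups, which is why its degrees of freedom are generally fractional, as in the values reported above.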
Discussion
We tested the Signal Detection-based prediction that collecting confidence ratings for all
lineup members would increase the abilities of investigators to discriminate guilty suspects from
innocent suspects (Smith & Ayala, 2021). After making an identification decision from a
simultaneous lineup and expressing their confidence in that decision, witnesses indicated for
each person they did not identify, how confident they were that person was not the culprit. We
called this the rule out procedure. As predicted, the rule out procedure better discriminated guilty
suspects from innocent suspects than did the traditional simultaneous lineup. This increase in
discriminability was solely attributable to extracting additional information from witness
memory that could be used to improve detection of suspect innocence.
A substantial limitation of the traditional simultaneous lineup is that confidence does not
always scale the match between the suspect and the witness’ memory for the culprit. When a
witness identifies a filler, confidence scales match-to-memory for that individual. When a
witness rejects a lineup, confidence likely scales match-to-memory for the best-matching lineup
member, who will often not be the suspect (Smith & Ayala, 2021; Starns et al., 2021).
Conversely, the rule out procedure always secures a confidence rating that reflects how strongly
the suspect matches the witness’ memory for the culprit. A witness who identifies a filler from a
culprit-absent lineup may still be able to reject the innocent suspect with high confidence.
Likewise, a witness who rejects the entire lineup with only a modest degree of confidence may
still be able to reject the innocent suspect with high confidence. The rule out procedure has better
discriminability than does the simultaneous lineup because it extracts more and better
information from eyewitness memory that investigators can leverage to detect suspect innocence.
Others have demonstrated that the confidence-accuracy relation for rejection decisions is
stronger when witnesses provide confidence ratings about individual faces rather than confidence
ratings about sets of faces (e.g., Lindsay et al., 2013; Sauer et al., 2012; Sauerland et al., 2012;
Yilmaz et al., 2022). The rule out procedure advances this work in several ways. First, the rule
out procedure is more expansive. Past work has focused only on the manner of collecting
confidence statements from witnesses who had rejected a lineup and did not follow up with
witnesses who selected fillers. In contrast, the rule out procedure collects confidence ratings for
all lineup members no matter the categorical outcome of the lineup. No other procedure that
collects categorical identification decisions has followed a filler identification by asking
witnesses to provide confidence ratings for all non-selected lineup members. Yet, the present
findings demonstrate that even witnesses who identify fillers may have additional information in
memory that can be used to discriminate between guilty suspects and innocent suspects. Second,
our focus was not on assessing the confidence-accuracy relation for rejection decisions but on
developing a procedure that could increase discriminability. The present results demonstrate that
simply asking witnesses to provide confidence ratings for each lineup member increases the
potential for investigators to discriminate guilty suspects from innocent suspects.
Like all studies, the present work is not without limitation. We used encoding videos that
were short in duration and that likely do not capture the complexity of some real-world crimes.
We also used a short retention interval between encoding and recognition. Finally, we used only
two different stimulus sets. While future empirical work assessing the generalizability of the
present findings is essential, there is good reason to expect generalizability. We used a
mathematical model to formulate our predictions (Smith & Ayala, 2021) and designed our
experiment only after examining the predictions of our theoretical model. Consistent with our
theoretical model, all inferential tests that we conducted showed a rule out advantage. Our
findings were also consistent with past work comparing the ratings-based procedure to the
simultaneous lineup (e.g., Brewer & Doyle, 2021; Brewer et al., 2020; Sauer et al., 2008; Sauer
et al., 2012). For instance, Sauer et al. (2008) found that the ratings-based procedure was more
effective at demonstrating innocence than was a simultaneous lineup. Likewise, Brewer et al.
(2020) found that even for witnesses who rated a filler as the strongest match to memory,
reasonably high confidence that the suspect was guilty was diagnostic of guilt. Hence, there is
good reason to expect our results to generalize beyond the current context.
From an applied perspective, members of the legal system might find the rule out
procedure an appealing alternative to the traditional simultaneous lineup. Rather than requiring
complete overhaul of the simultaneous lineup, the rule out procedure can be tacked on to the end
of the simultaneous lineup simply by asking the witness for a handful of additional confidence
ratings. Assuming a culprit-present base rate of 50%, the present findings suggest the rule out
procedure (AUC = .8714) would lead to about 5 more correct classifications per 100 cases than
does the traditional simultaneous lineup (AUC = .8264). When one considers the relevant
context, that effect size is more impressive than it might seem at first blush. First, even an
investigator who was flipping a coin would be able to correctly classify the suspect 50 times out
of 100, so the maximum possible improvement is only 50 (and not 100). Second, because the
rule out procedure follows the traditional lineup, it could not possibly have improved the accuracy
of the initial identification decision. It seems likely that the rule out procedure will only improve performance in instances
in which the witness does not initially identify the suspect. Yet, simply by asking the witness for
a few additional confidence ratings, police investigators might increase their ability to
discriminate guilty suspects from innocent suspects.
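The arithmetic behind this estimate can be laid out in a few lines. The sketch below follows the paragraph's reading of the AUC as the expected proportion of correct classifications at a 50% base rate; that reading is an interpretive assumption of this illustration, and the variable names are ours.

```python
auc_rule_out = 0.8714
auc_simultaneous = 0.8264
chance = 0.5   # a coin-flipping investigator
cases = 100

# Additional correct classifications per 100 cases under the stated reading of the AUC.
extra_correct = (auc_rule_out - auc_simultaneous) * cases

# Room for improvement over chance, and the share of that room the difference covers.
headroom = (1 - chance) * cases
share_of_headroom = extra_correct / headroom

print(f"extra correct per {cases}: {extra_correct:.1f}")
print(f"share of the maximum possible improvement: {share_of_headroom:.0%}")
```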
Empirical progress often lags the development of statistical tools (Gigerenzer, 1991).
Historically, those who studied eyewitness lineup procedures did not have a statistical tool that
permitted comparison of total informational value across lineup procedures. Given the absence
of such a tool, the empirical literature came to focus on the diagnostic value of suspect
identifications while ignoring the diagnostic value of rejections and filler identifications. Recent
progress has led to the development of both the full ROC and expected information gain
approaches to comparing eyewitness lineups (Smith et al., 2020; Starns et al., 2021; Yang &
Smith, 2021). Both approaches account for all lineup outcomes—suspect identifications, filler
identifications, and rejections—and measure the total potential for a procedure to discriminate
guilty suspects from innocent suspects. These novel tools have brought to light many
shortcomings of current eyewitness identification procedures including the discovery that the
traditional simultaneous lineup undermines the ability of eyewitness memory to rule out
innocent suspects (Smith & Ayala, 2021). It was only through examining the full ROC curve that
the limitations of the simultaneous lineup came to light and that a superior alternative—the rule
out procedure—surfaced. From an applied perspective, the present work demonstrates a new and
intriguing procedure that could increase the abilities of police investigators to discriminate guilty
suspects from innocent suspects. From a theoretical perspective, the present work adds to a
growing body of research demonstrating the importance of considering the total informational
value of lineups (Smith et al., 2020; Starns et al., 2021). Inferences about which of two lineups is
superior based only on the relative rates of suspect identifications are unwarranted.
Conclusion
For over 40 years, attempts to improve discriminability have focused on increasing the
potential for police investigators to rule in guilty suspects. The present work demonstrates a
novel pathway for improving discriminability: increasing rule out potential. The traditional
simultaneous lineup undermines the potential for eyewitness memory to rule out innocent
suspects and thus, undermines the potential for police investigators to discriminate guilty
suspects from innocent suspects. This happens because when a witness identifies a filler or
rejects a lineup, confidence does not directly measure the strength of match between the suspect
and the witness’ memory for the culprit. A lineup should not conclude with a categorical
identification decision followed by a single confidence rating. Instead, the present work suggests
that police investigators would do well to follow a categorical identification decision by
soliciting confidence ratings for all lineup members. This feature of the rule out procedure
ensures that there is always a rating that directly reflects the match between the suspect and the
eyewitness’ memory for the culprit. Both theoretical simulations and empirical data show
evidence that the rule out procedure increases the potential for investigators to discriminate
guilty suspects from innocent suspects.
References
Ayala, N. T., Smith, A. M., & Ying, R. C. (2022). The rule out procedure: Increasing the
potential for police investigators to detect suspect innocence from eyewitness lineup
procedures. Open Science Framework.
https://osf.io/ksrp3/?view_only=06f37a3acafd4fb0b8dccba0f6ca6808. Registered 18
August 2021.
Akan, M., Robinson, M. M., Mickes, L., Wixted, J. T., & Benjamin, A. S. (2021). The effect of
lineup size on eyewitness identification. Journal of Experimental Psychology:
Applied, 27(2), 369–392. http://dx.doi.org/10.1037/xap0000340
Brewer, N., & Doyle, J. (2021). Changing the face of police lineups: Delivering more
information from witnesses. Journal of Applied Research in Memory and Cognition,
10(2), 180–195. http://dx.doi.org/10.1016/j.jarmac.2020.12.004
Brewer, N., Weber, N., Wootton, D., & Lindsay, D. S. (2012). Identifying the bad guy in a
lineup using confidence judgments under deadline pressure. Psychological Science,
23(10), 1208–1214. http://dx.doi.org/10.1177/0956797612441217
Brewer, N., Weber, N., & Guerin, N. (2020). Police lineups of the future? American
Psychologist, 75(1), 76–91. http://dx.doi.org/10.1037/amp0000465
Brewer, N., & Wells, G. L. (2006). The confidence-accuracy relationship in eyewitness
identification: Effect of lineup instructions, foil similarity, and target-absent base rates.
Journal of Experimental Psychology: Applied, 12(1), 11–30.
http://dx.doi.org/10.1037/1076-898x.12.1.11
Clark, S. E. (2003). A memory and decision model for eyewitness identification. Applied
Cognitive Psychology, 17(6), 629–654. http://dx.doi.org/10.1002/acp.891
Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible statistical
power analysis program for the social, behavioral, and biomedical sciences. Behavior
Research Methods, 39(2), 175–191. http://dx.doi.org/10.3758/bf03193146
Gigerenzer, G. (1991). From tools to theories: A heuristic of discovery in cognitive psychology.
Psychological Review, 98(2), 254–267. http://dx.doi.org/10.1037/0033-295X.98.2.254
Juslin, P., Olsson, N., & Winman, A. (1996). Calibration and diagnosticity of confidence in
eyewitness identification: Comments on what can be inferred from the low confidence-
accuracy correlation. Journal of Experimental Psychology: Learning, Memory, &
Cognition, 22(5), 1304–1316. http://dx.doi.org/10.1037/0278-7393.22.5.1304
Lee, J., & Penrod, S. D. (2019). New signal detection theory-based framework for eyewitness
performance in lineups. Law and Human Behavior, 43(5), 436–454.
http://dx.doi.org/10.1037/lhb0000343
Lindsay, R. C. L., Kalmet, N., Leung, J., Bertrand, M. I., Sauer, J. D., & Sauerland, M. (2013).
Confidence and accuracy of lineup selections and rejections: Postdicting rejection
accuracy with confidence. Journal of Applied Research in Memory and Cognition, 2(3),
179–184. http://dx.doi.org/10.1016/j.jarmac.2013.06.002
Litman, L., Robinson, J., & Abberbock, T. (2017). TurkPrime.com: A versatile crowdsourcing
data acquisition platform for the behavioral sciences. Behavior Research Methods,
49(2), 433–442. http://dx.doi.org/10.3758/s13428-016-0727-z
McAdoo, R. M., & Gronlund, S. D. (2016). Relative judgment theory and the mediation of facial
recognition: Implications for theories of eyewitness identification. Cognitive Research:
Principles and Implications, 1(1), 1–12. http://dx.doi.org/10.1186/s41235-016-0014-7
Palmer, M. A., Brewer, N., Weber, N., & Nagesh, A. (2013). The confidence-accuracy
relationship for eyewitness identification decisions: Effects of exposure duration,
retention interval, and divided attention. Journal of Experimental Psychology: Applied,
19(1), 55–71. http://dx.doi.org/10.1037/a0031602
Quigley-McBride, A., & Wells, G. L. (2021). Methodological considerations in eyewitness
identification experiments. In A. M. Smith, M. P. Toglia, & J. M. Lampinen (Eds.).
Methods, Measures, and Theories in Eyewitness Identification Tasks. Taylor & Francis
(pp. 85–112).
Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J., & Müller, M. (2011).
pROC: An open-sourced package for R and S+ to analyze and compare ROC curves.
BMC Bioinformatics, 12, 77–85. http://dx.doi.org/10.1186/1471-2105-12-77
Sauer, J. D., & Brewer, N. (2021). Ratings-based identification procedures. In A. M. Smith, M.
P. Toglia, & J. M. Lampinen (Eds.). Methods, Measures, and Theories in Eyewitness
Identification Tasks. Taylor & Francis (pp. 192–210).
Sauer, J. D., Brewer, N., & Weber, N. (2008). Multiple confidence estimates as indices of
eyewitness memory. Journal of Experimental Psychology: General, 137(3), 528–547.
http://dx.doi.org/10.1037/a0012712
Sauer, J. D., Brewer, N., Zweck, T., & Weber, N. (2010). The effect of retention interval on the
confidence-accuracy relationship for eyewitness identification. Law and Human
Behavior, 34(4), 337–347. http://dx.doi.org/10.1007/s10979-009-9192-x
Sauer, J. D., Weber, N., & Brewer, N. (2012). Using ecphoric confidence ratings to discriminate
seen from unseen faces: The effects of retention interval and distinctiveness.
Psychonomic Bulletin & Review, 19(3), 490–498. http://dx.doi.org/10.3758/s13423-012-0239-5
Sauerland, S., Sagana, A., & Sporer, S. (2012). Assessing nonchoosers’ eyewitness identification
accuracy from photographic showups by using confidence and response times. Law and
Human Behavior, 36(5), 394–403. http://dx.doi.org/10.1037/h0093926
Smith, A. M., & Ayala, N. T. (2021). Do traditional lineups undermine the capacity for
eyewitness memory to rule out innocent suspects? Journal of Applied Research in
Memory and Cognition, 10(2), 215–220. http://dx.doi.org/10.1016/j.jarmac.2021.03.003
Smith, A. M., Mackovichova, S., Jalava, S. T., & Pozzulo, J. (2020). Fair forensic-object lineups
are superior to forensic-object showups. Journal of Applied Research in Memory and
Cognition, 9(1), 68–82. http://dx.doi.org/10.1016/j.jarmac.2019.11.001
Smith, A. M., Wells, G. L., Lindsay, R. C. L., & Penrod, S. D. (2017). Fair lineups are better
than biased lineups and showups, but not because they increase underlying
discriminability. Law and Human Behavior, 41(2), 127–145.
http://dx.doi.org/10.1037/lhb0000219
Smith, A. M., Wells, G. L., Smalarz, L., & Lampinen, J. M. (2018). Increasing the similarity of
lineup fillers to the suspect improves applied value of lineups without improving memory
performance. Psychological Science, 29(9), 1548–1551.
http://dx.doi.org/10.1177/0956797617698528
Smith, A. M., Wilford, M. M., Quigley-McBride, A., & Wells, G. L. (2019). Mistaken
eyewitness identification rates increase when either witnessing or testing conditions get
worse. Law and Human Behavior, 43(4), 358–368. http://dx.doi.org/10.1037/lhb0000334
Smith, A. M., Yang, Y., & Wells, G. L. (2020). Distinguishing between investigator
discriminability and eyewitness discriminability: A method for creating full Receiver
Operating Characteristic curves of lineup identification procedures. Perspectives on
Psychological Science, 15(3), 589–607. http://dx.doi.org/10.1177/1745691620902426
Sporer, S. L., Penrod, S. D., Read, D., & Cutler, B. (1995). Choosing, confidence, and accuracy:
A meta-analysis of the confidence-accuracy relation in eyewitness identification studies.
Psychological Bulletin, 118(3), 315–327. http://dx.doi.org/10.1037/0033-2909.118.3.315
Starns, J. J., Cohen, A., & Rotello, C. M. (2021). A complete method for assessing the
effectiveness of eyewitness identification procedures: Expected information gain.
Psychological Review. http://dx.doi.org/10.31234/osf.io/az9xf
Steblay, N. K., & Brooks, W. G. (2021). Practical concerns for investigations and Courtroom: A
commentary on Brewer and Doyle (2021). Journal of Applied Research in Memory and
Cognition, 10(2), 208–211. http://dx.doi.org/10.1016/j.jarmac.2021.03.005
Tredoux, C. G. (1998). Statistical inference on measures of lineup fairness. Law and Human
Behavior, 22(2), 217–237. http://dx.doi.org/10.1023/A:1025746220886
Wells, G. L., & Lindsay, R. C. L. (1980). On estimating the diagnosticity of eyewitness
nonidentifications. Psychological Bulletin, 88(3), 776–784.
http://dx.doi.org/10.1037/0033-2909.88.3.776
Wells, G. L., & Olson, E. A. (2002). Eyewitness testimony. Annual Review of Psychology, 54(1),
277–295. http://dx.doi.org/10.1146/annurev.psych.54.101601.145028
Wells, G. L., Yang, Y., & Smalarz, L. (2015). Eyewitness identification: Bayesian information
gain, base-rate effect equivalency curves, and reasonable suspicion. Law and Human
Behavior, 39(2), 99–122. http://dx.doi.org/10.1037/lhb0000125
Wells, G. L., & Windschitl, P. D. (1999). Stimulus sampling and social psychological
experimentation. Personality and Social Psychology Bulletin, 25(9), 1115–1125.
http://dx.doi.org/10.1177/01461672992512005
Wixted, J. T., & Wells, G. L. (2017). The relationship between eyewitness confidence and
identification accuracy: A new synthesis. Psychological Science in the Public Interest,
18(1), 10–65. http://dx.doi.org/10.1177/1529100616686966
Yang, Y., & Smith, A. M. (2022). fullROC: An R package for generating and analyzing
eyewitness-lineup ROC curves. Behavior Research Methods.
Yilmaz, A. S., Lebensfeld, T., & Wilson, B. M. (2022). The reveal procedure: A way to enhance
evidence of innocence from police lineups. Law and Human Behavior. Advance online
publication. https://doi.org/10.1037/lhb0000478
... We then use this same model to imagine what a better eyewitness identification procedure might look like. To that end, we introduce the simultaneous lineup plus rule out procedure (Ayala et al., 2022). We demonstrate that the rule out procedure is superior to the two identification procedures most used in real cases: simultaneous lineups and showups. ...
... We recently demonstrated that the rule out procedure does a better job of ruling out innocent suspects and has better overall discriminability than does the standard simultaneous lineup procedure. Moreover, rule-out ratings discriminated guilty suspects from innocent suspects both when witnesses rejected the lineup and even when witnesses mistakenly identified fillers (Ayala et al., 2022). ...
... A witness who mistakenly identifies a filler from a lineup clearly does not have the most reliable memory for the culprit. Yet, past research suggests that even these Rejections demonstrably unreliable eyewitnesses possess memorial information that can be used to discriminate between guilty suspects and innocent suspects (e.g., Ayala et al., 2022;McAdoo & Gronlund, 2016). In our initial test of the rule out procedure, we found that rule out ratings discriminated between guilty suspects and innocent suspects even for those witnesses who mistakenly identified a filler (Ayala et al., 2022). ...
Article
Full-text available
Police lineups are widely used despite evidence that eyewitnesses frequently err by failing to identify the culprit or mistakenly identifying innocent suspects or lineup fillers. We examine how police investigators, prosecutors, defense attorneys, and judges interpret the information witnesses provide when confronted with a traditional lineup and describe one alternative procedure that departs significantly from current practice. This alternative abandons the categorical witness decision, replacing it with confidence judgments about each lineup member’s match to memory for the culprit. Compared with the traditional lineup, this approach provides more nuanced information about suspect guilt. We challenge criminal justice researchers and professionals to examine the potential informational gains this approach offers—and to explore other genuine alternatives—rather than merely tinkering with the existing approach. Finally, we emphasize that a truly collaborative engagement with justice system professionals will be more productive than researchers being the sole arbiters of model practices.
Article
Eyewitness identification via lineup procedures is an important and widely used source of evidence in criminal cases. However, the scientific literature provides inconsistent guidance on a very basic feature of lineup procedure: lineup size. In two experiments, we examined whether the number of fillers affects diagnostic accuracy in a lineup, as assessed with receiver-operating characteristic (ROC) analysis. Showups (identification procedures with one face) led to lower discriminability than simultaneous lineups. However, in neither experiment did the number of fillers in a lineup affect discriminability. We also evaluated competing models of decision-making from lineups. This analysis indicated that the standard Independent Observations (IO) model, which assumes a decision rule based on the comparison of memory strength signals generated by each face in a lineup, is incapable of reproducing the lower level of performance evident in showups. We could not adjudicate between the Ensemble model, which assumes a decision rule based on the comparison of the strength of each face with the mean strength across the lineup, and a newly introduced Dependent Observations model, which adopts the same decision rule as the IO model, but with correlated signals across faces. We draw lessons for users of lineup procedures and for basic research on eyewitness decision making. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
Article
When presenting a suspect to a witness for an identification attempt, fair lineups are superior to one-person showups. Relative to showups, fair lineups decrease innocent-suspect identifications to a greater extent than culprit identifications (Steblay et al., 2003). We examined whether the lineup advantage extends from facial identification to forensic-object identification. Participants (N = 1906) watched a short video of a car theft and then completed two culprit-present or culprit-absent showups or lineups. Participants first attempted to identify the culprit from the video and then attempted to identify the vehicle from the video. Forensic-object lineups were superior to forensic-object showups to the extent that the cost of an innocent-suspect identification exceeded the cost of a missed culprit identification or to the extent that the base rate of culprit presence was low. Importantly, we are referring to actual costs and base rates in the real world rather than to methods of manipulating witness decision criteria (see Clark, 2012 for a similar approach). Finally, confidence discriminated between accurate and inaccurate suspect identifications for forensic-object lineups, but not for forensic-object showups.
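The cost and base-rate reasoning in this abstract can be illustrated with a minimal expected-cost sketch. The function name `expected_cost`, the hit and false-identification rates, and the cost values below are all hypothetical and chosen only for illustration; they are not taken from the study.

```python
def expected_cost(base_rate, hit_rate, false_id_rate, cost_miss, cost_false_id):
    """Expected cost of one identification attempt (hypothetical sketch).

    base_rate      -- probability that the culprit (or target object) is present
    hit_rate       -- P(suspect identified | culprit present)
    false_id_rate  -- P(innocent suspect identified | culprit absent)
    cost_miss      -- cost of a missed culprit identification
    cost_false_id  -- cost of an innocent-suspect identification
    """
    miss_cost = base_rate * (1 - hit_rate) * cost_miss
    false_id_cost = (1 - base_rate) * false_id_rate * cost_false_id
    return miss_cost + false_id_cost


# Hypothetical rates: the showup yields more hits but also more false IDs.
SHOWUP = {"hit_rate": 0.60, "false_id_rate": 0.20}
LINEUP = {"hit_rate": 0.50, "false_id_rate": 0.05}


def lineup_is_cheaper(base_rate, cost_miss, cost_false_id):
    """True when the lineup has the lower expected cost under these assumptions."""
    lineup = expected_cost(base_rate, cost_miss=cost_miss,
                           cost_false_id=cost_false_id, **LINEUP)
    showup = expected_cost(base_rate, cost_miss=cost_miss,
                           cost_false_id=cost_false_id, **SHOWUP)
    return lineup < showup


if __name__ == "__main__":
    # Low base rate, equal costs.
    print(lineup_is_cheaper(base_rate=0.3, cost_miss=1, cost_false_id=1))
    # High base rate, equal costs.
    print(lineup_is_cheaper(base_rate=0.9, cost_miss=1, cost_false_id=1))
    # High base rate, but a false ID is ten times as costly as a miss.
    print(lineup_is_cheaper(base_rate=0.9, cost_miss=1, cost_false_id=10))
```

Under these assumed rates, the lineup's advantage grows as the base rate of culprit presence falls or as the cost of an innocent-suspect identification rises relative to a miss, which is the qualitative pattern the abstract describes.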
Article
Objectives: Eyewitness research has adapted signal detection theory (SDT) to investigate eyewitness performance. SDT-based measures in yes/no tasks fit well for the measurement of eyewitness performance in show-ups, but not in lineups, because the application of the measures to eyewitness identifications neglects the role of fillers. In the present study, we introduce an SDT-based framework for eyewitness performance in lineups: the Multi-d' model. Method: The Multi-d' model provides multiple discriminability measures which can be used as parameters to investigate eyewitness performance. We apply the Multi-d' model to issues in eyewitness research, such as the comparison of eyewitness discriminability between show-ups and lineups; the influence of lineup bias on eyewitness performance; filler selection methods (match-to-description vs. match-to-suspect); eyewitness confidence; and lineup presentation modes (simultaneous vs. sequential lineups). Results: The Multi-d' model demonstrates that the discriminability of a guilty suspect from an innocent suspect is a function of discriminability involving fillers, and underscores that the decisions that eyewitnesses make in lineups can be regarded from two perspectives: detection and identification. Conclusions: We propose that the Multi-d' model is a useful tool to understand decisionmakers' performance in a variety of compound decision tasks, as well as eyewitness identifications in lineups. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
Article
We examined how giving eyewitnesses a weak recognition experience impacts their identification decisions. In 2 experiments we forced a weak recognition experience for lineups by impairing either encoding or retrieval conditions. In Experiment 1 (n = 245), undergraduate participants were randomly assigned to watch either a clear or a degraded culprit video and then viewed either a culprit-present or culprit-removed lineup identification procedure. In Experiment 2 (n = 227), all participants watched the same clear culprit video but were then randomly assigned to either view a clear or noise-degraded lineup procedure. Half of the participants viewed a culprit-present lineup procedure and the remaining participants viewed a culprit-removed lineup procedure. Not surprisingly, degrading either encoding or retrieval conditions led to a sharp drop in culprit identifications. Critically, and as predicted, degrading either encoding or retrieval conditions also led to a sharp increase in the identification of innocent persons. These results suggest that when a lineup procedure gives a witness a weak match-to-memory experience, the witness will lower her criterion for making an affirmative identification decision. This pattern of results is troubling because it suggests that witnesses who encounter lineups that do not include the culprit might have a tendency to use a lower criterion for identification than do witnesses who encounter lineups that actually include the culprit. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
Article
Objective: Recent work has established that high-confidence identifications (IDs) from a police lineup can provide compelling evidence of guilt. By contrast, when a witness rejects the lineup, it may offer only limited evidence of innocence. Moreover, confidence in a lineup rejection often provides little additional information beyond the rejection itself. Thus, although lineups are useful for incriminating the guilty, they are less useful for clearing the innocent of suspicion. Here, we test predictions from a signal-detection-based model of eyewitness ID to create a lineup that is capable of increasing information about innocence. Hypotheses: Our model-based simulations suggest that high-confidence rejections should exonerate many more innocent suspects and do so with higher accuracy if, after a witness rejects a lineup but before they report their confidence, they are shown the suspect and asked, "How sure are you that this person is not the perpetrator?" Method: Participants (N = 3,346) recruited from Amazon Mechanical Turk watched a 30-s mock-crime video of a perpetrator. Afterward, they were randomly assigned to lineup procedures using a 2 (standard control vs. reveal condition) × 2 (target present vs. target absent) design. A standard simultaneous lineup served as the control condition. The reveal condition was identical to the control condition except in cases of lineup rejection: When a lineup rejection occurred, the suspect appeared on the screen, and participants provided a confidence rating indicating their belief that the suspect was not the perpetrator. Results: The reveal procedure increased both the accuracy and frequency of high-confidence rejections relative to the standard simultaneous lineup. Conclusions: Collecting a confidence rating about the suspect after a lineup is rejected may make it possible to quickly clear innocent suspects of suspicion and reduce the amount of contact that innocent people have with the legal system. 
(PsycInfo Database Record (c) 2022 APA, all rights reserved).
Preprint
We present a method for measuring the efficacy of eyewitness identification procedures by applying fundamental principles of information theory. The resulting measure evaluates the Expected Information Gain (EIG) for an identification attempt, a single value that summarizes an identification procedure’s overall potential for reducing uncertainty about guilt or innocence across all possible witness responses. In a series of theoretical demonstrations, we show that EIG often disagrees with existing measures (e.g., diagnosticity ratios or area under the ROC) about the relative effectiveness of different identification procedures. Each demonstration is designed to highlight “blind spots” of the existing measures as a contrast to EIG, which considers every factor relevant to a procedure’s potential for decreasing uncertainty about guilt or innocence. Collectively, these demonstrations show that EIG has substantial potential to inspire new discoveries in eyewitness research. For research designed to identify procedures that will be most effective in criminal investigations, EIG supersedes all other measures, on both theoretical and practical grounds.
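One common information-theoretic reading of this idea is the expected reduction in Shannon entropy about guilt, averaged over all possible witness responses. The sketch below follows that reading; it is an assumption about the general form of such a measure, not the preprint's exact formulation. The function names, response categories, and probabilities are all hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a guilty/innocent belief with P(guilty) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_information_gain(prior_guilt, p_given_guilty, p_given_innocent):
    """Prior entropy minus expected posterior entropy across witness responses.

    p_given_guilty / p_given_innocent map each response label to its
    probability when the suspect is guilty / innocent (hypothetical inputs).
    """
    expected_posterior_entropy = 0.0
    for response, p_g in p_given_guilty.items():
        p_i = p_given_innocent[response]
        # Marginal probability of observing this response.
        p_response = prior_guilt * p_g + (1 - prior_guilt) * p_i
        if p_response == 0.0:
            continue
        # Bayes' rule: belief in guilt after observing this response.
        posterior = prior_guilt * p_g / p_response
        expected_posterior_entropy += p_response * binary_entropy(posterior)
    return binary_entropy(prior_guilt) - expected_posterior_entropy


if __name__ == "__main__":
    # Hypothetical response distributions for a single lineup procedure.
    guilty = {"suspect_id": 0.50, "filler_id": 0.20, "rejection": 0.30}
    innocent = {"suspect_id": 0.05, "filler_id": 0.35, "rejection": 0.60}
    print(expected_information_gain(0.5, guilty, innocent))
```

Because the sum runs over every response category, filler identifications and rejections contribute to the total alongside suspect identifications, which is the property the abstract contrasts with measures that consider only some responses.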