Comparing Witness Performance in the Field versus the Lab: How Real-World Conditions Affect Eyewitness Decision-making

WITNESS PERFORMANCE IN THE FIELD VERSUS THE LAB
This article has been published in Law and Human Behavior, 46(3), 175-188.
Available open access online:
https://psycnet.apa.org/PsycARTICLES/journal/lhb/46/3
Comparing Witness Performance in the Field versus the Lab: How Real-World
Conditions Affect Eyewitness Decision-making
Mitchell L. Eisen, Rebecca C. Ying, Charmaine Chui, and Monique Swaby
California State University, Los Angeles
Author Note
Mitchell L. Eisen https://orcid.org/0000-0003-0395-6705
Rebecca C. Ying https://orcid.org/0000-0002-5666-7790
Charmaine Chui https://orcid.org/0000-0002-5976-6719
Monique Swaby https://orcid.org/0000-0003-4395-5512
Rebecca Ying is now at Iowa State University.
Charmaine Chui is now at Arizona State University.
This work was supported by a grant from Arnold Ventures, 400-61-12C (Andrew Smith
PI). The views and opinions expressed in this work are those of the authors and do not
necessarily reflect the views of Arnold Ventures. We would like to acknowledge Jennifer
Jones, T’awna Williams, Lauren Ristrom, Diana Fajardo Castellanos, Jordyn Mullinax,
Gabriella Cedre, and Marianne Lacsamana for their work in collecting and
entering the data. We are also grateful to the California State University
Police, who supported this work, and to the officers who ran the identification procedures.
Finally, we want to thank Andrew Smith for his assistance in handling and analyzing the
data. The data that support the findings of this study are available upon reasonable
request.
Abstract
Objectives: This field-simulation experiment was designed to compare eyewitness
performance when conducting showups and lineups under field versus laboratory
conditions.
Hypotheses: We expected to replicate the findings from previous field-simulation
experiments showing over-confidence in showup identifications made under field but not
lab-conditions, and further predicted that under field-conditions, high-confidence
identifications are more likely to be correct when using lineups compared to showups. It
was also expected that field-conditions would lead witnesses to lower their criterion for
choosing with showups, but we did not know how field-conditions would affect lineup
decision-making.
Method: Participants (N = 719) witnessed the theft of a laptop computer and were
asked to identify a suspect from either a live-showup, a photographic-showup, or a
photographic-lineup administered under either field or lab conditions. In the field-
condition, uniformed officers functioned as experimenters and participants were
immersed in what they were led to believe was an actual police investigation. In the lab-
condition, participants were debriefed before the identification procedure that the theft
was staged for research purposes and that their identifications were being made as part of
a study on eyewitness memory.
Results: As predicted, witnesses were overconfident in their showup
identifications made under field but not lab-conditions, and high-confidence
identifications were more likely to be correct when using lineups compared to showups.
Also as expected, field-conditions led witnesses to lower their criterion for choosing with
showups regardless of culprit presence. However, the opposite was true for lineups, as
field-conditions resulted in witnesses raising their criterion for choosing.
Conclusions: Field-conditions had a very different effect on witness performance
when conducting showups compared to lineups. When witnesses were led to believe their
identification would result in the arrest and prosecution of the suspect, they became more
liberal in their decision-making when showups were used, but more conservative when
lineups were employed.
Keywords: Eyewitness, Lineups, Showups, Field Experiment, Field-Simulation
Public Significance Statement
This study examined how real-world conditions affect eyewitnesses beyond what can be
assessed in a lab study. This was accomplished by comparing eyewitness performance
when witnesses knew they were participating in research with performance under
field-conditions, in which their identification would presumably result in the arrest of
the suspect. Results
demonstrated that many important findings from previous lab studies generalized well to
field-conditions, but also revealed that the situational pressures of being a witness in an
actual police investigation can affect eyewitness performance differently depending on
the procedures used to obtain the identification.
Comparing Witness Performance in the Field versus the Lab: How Real-World
Conditions Affect Eyewitness Decision-making
When there is an eyewitness to a crime, police will generally use one of two
procedures to obtain identifications: showups or lineups. Showups involve presenting a
single suspect to the witness, while lineups most commonly involve presenting the
suspect alongside five other individuals. Previous research has consistently shown that
showups are inherently suggestive and yield substantially higher rates of false
identifications compared to lineups (Clark, 2012; Steblay et al., 2003; Gronlund et al.,
2012; Mickes, 2015; Smith et al., 2020; Wetmore et al., 2014; Yarmey et al., 1996).
Although there is no debate about the highly suggestive nature of showups, it is important
to note that almost all of the research to date comparing showups to lineups has used
traditional laboratory paradigms, in which participants watched a staged crime and were
then asked to identify the perpetrator either from a group of pictures or when viewing a
single photo of the suspect (Steblay et al., 2003; Gronlund et al., 2012; Mickes, 2015;
Smith et al., 2020; Wetmore et al., 2014; Yarmey et al., 1996). Although the laboratory is
arguably the ideal environment for examining the relatively intellectual aspects of
memory and decision-making in isolation, this controlled environment is more limited for
studying other factors that might influence eyewitness identification performance in
actual cases, including certain situational pressures, witness motivations, and other ‘hot’
affective components of human decision-making that can be aroused when participating
in a police investigation.
Psychological scientists have long recognized the importance of demonstrating
the ecological validity of eyewitness research. Indeed, the National Academy of Sciences
(NAS) recently published a report calling for more collaborative work between
researchers and law enforcement to develop quality field experiments to improve the
broader impact of this research on the criminal justice system (NRC, 2014). The current
study was designed to address this call for more field experiments done in collaboration
with law enforcement and directly compared witness performance under field versus lab-
conditions, when using lineups and showups. This was accomplished by using the field-
simulation paradigm, which combines the ecological validity of field experiments with
the experimental control of lab studies (Eisen, Skerritt-Perta et al., 2017; Eisen, Smith et
al., 2017). Like field experiments, the police are the experimenters, and witnesses are led
to believe that their identifications will lead to the arrest of the suspect. However, like lab
studies, the investigators maintain experimental control over all aspects of the witnessing
experience and identification procedures. Before discussing the experiment in detail, we
will review previous research examining how eyewitness decision-making is
differentially affected when using showups versus lineups, and how eyewitness
identifications can be affected by real-world field conditions.
Showups Versus Lineups
Although showups result in higher rates of innocent-suspect identifications than
lineups, previous research has shown that witnesses are actually more likely to make
affirmative identifications (i.e., are more likely to choose) when making identifications
from lineups compared to showups (Gonzalez et al., 1993; Wells, 2001). This apparent
contradiction is easily explained by the protective effect of presenting five known
innocent-fillers in the lineup alongside the suspect. When innocent suspects are presented
in fair photographic lineups, false-positive choices will be spread across the group. By
contrast, with showups, since there are no fillers to draw choices away from the target, all
false-positive choices will fall on the single innocent suspect.
When the actual culprit is presented in a lineup, fillers will still siphon some mistaken
choices away from the culprit, but should attract far fewer choices than they would from
the innocent suspect (Wells, 2001). This is because witnesses are more likely to
recognize and select the culprit when he is present, and are therefore less likely to
erroneously choose a filler.
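The filler-siphoning argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a perfectly fair culprit-absent procedure in which false-positive choices are spread evenly across everyone shown, and the choosing rates passed in are hypothetical values, not data from this or any other study.

```python
def innocent_suspect_id_rate(choosing_rate: float, lineup_size: int) -> float:
    """Expected innocent-suspect identification rate in a culprit-absent procedure.

    Assumes (for illustration) a perfectly fair procedure, so that choices are
    spread evenly over all `lineup_size` people shown. A showup has size 1,
    so every choice lands on the innocent suspect.
    """
    if not 0.0 <= choosing_rate <= 1.0:
        raise ValueError("choosing_rate must be between 0 and 1")
    if lineup_size < 1:
        raise ValueError("lineup_size must be at least 1")
    return choosing_rate / lineup_size


# Hypothetical culprit-absent choosing rates (not taken from the article):
# witnesses choose MORE often from the lineup than from the showup.
showup_rate = innocent_suspect_id_rate(choosing_rate=0.30, lineup_size=1)
lineup_rate = innocent_suspect_id_rate(choosing_rate=0.60, lineup_size=6)

print(f"showup innocent-suspect ID rate: {showup_rate:.2f}")
print(f"lineup innocent-suspect ID rate: {lineup_rate:.2f}")
```

Under these assumed numbers, doubling the choosing rate in the lineup still leaves the innocent suspect at lower risk than in the showup, because five of every six mistaken choices fall on known-innocent fillers.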
Hot and Cold Cognition: Field vs. Lab Conditions
In order to develop more effective procedures to collect eyewitness evidence, it is
obviously important to understand how recognition memory, discriminability and related
unemotional aspects of “cold” cognition can influence witness performance. It is also
important to examine the influence of non-memorial “hot” affective components of
human decision-making that might affect witnesses in actual police investigations, such
as situational pressures and witness motivations. However, capturing the variance in
witness performance associated with hot affective components of decision-making has
proven to be a real challenge for eyewitness researchers, even when conducting
traditional field experiments, which involve assessing witness performance in actual
cases (Wells, Steblay et al., 2015). This is because ground truth is generally unknown in
actual cases, as we rarely know with 100% certainty whether the suspect presented to the
witness in any given case is factually innocent or guilty (Horry et al., 2014). However,
elements of hot cognition can be directly examined with the field-simulation paradigm,
because ground truth is always known, and all elements of the crime and identification
procedures are controlled to the same degree as a typical lab study.
Eisen, Smith et al. (2017) examined the impact of hot affective components
of decision-making in a series of field-simulation experiments that directly compared
showup identifications obtained under highly realistic field versus lab-conditions. In the
field-condition, participants were immersed in what they were led to believe was an
actual police investigation, and were asked to make an identification of the suspect at a
live-showup conducted by law enforcement. In the lab-condition, participants were
debriefed shortly after the crime that the theft was staged for research purposes and were
asked to make identifications under equivalent conditions, but without law enforcement
involvement. Across three different experiments, these investigators found that innocent-
suspect identifications were substantially higher in the field compared to lab-conditions.
Indeed, witnesses who were led to believe their identification would result in the arrest
and prosecution of the suspect lowered their criterion for choosing, regardless of culprit
presence (Eisen, Smith et al., 2017). Notably, witnesses in the field-condition were also
consistently more likely to report feeling pressured to identify the suspect presented to
them when compared to the lab-condition. The authors argued that the lowered criterion for
choosing evidenced in the field-condition was likely due to the pressures witnesses
experienced when being presented with a suspect in police custody at a showup. Eisen
and his colleagues also reported that witnesses at showups in the field-condition were
overconfident in their decisions, whereas this was not true in the lab-condition.
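One common way to express a "lowered criterion for choosing" numerically is the signal detection index c, computed from the culprit-present identification rate (hits) and the culprit-absent identification rate (false alarms). The sketch below assumes that standard equal-variance formulation; the text above does not state which index the cited experiments used, and the rates in the example are hypothetical, chosen only to show the direction of the effect.

```python
from statistics import NormalDist


def criterion_c(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance signal detection criterion: c = -(z(H) + z(F)) / 2.

    Lower (more negative) values indicate a more liberal willingness to choose.
    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return -0.5 * (z(hit_rate) + z(false_alarm_rate))


# Hypothetical showup identification rates (not data from the article).
lab_c = criterion_c(hit_rate=0.60, false_alarm_rate=0.20)
field_c = criterion_c(hit_rate=0.75, false_alarm_rate=0.35)

print(f"lab criterion:   {lab_c:.2f}")
print(f"field criterion: {field_c:.2f}")
```

In this made-up example both hits and false alarms rise in the field, so c drops: witnesses choose more often whether or not the culprit is present, which is the pattern described for showups.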
What About Lineups Administered in the Field?
It cannot be assumed that the pressures of making an identification in an actual
police investigation will have the same effect on witness performance when lineups are
used. With showups, witnesses simply need to decide whether the culprit is present or
not. As such, increased choosing will always boost both accurate and false
identifications. However, with lineups, the suspect’s picture is embedded in a group of
similar photos, and witnesses need to engage in what is arguably a different sort of
discrimination task (Smith et al., 2019). Because these tasks are quite different, it is not
clear that the elements of hot cognition which have been found to lower witnesses’
criterion for choosing at showups conducted under real-world field conditions will
generalize to decision-making with lineups. The current study was designed to directly
address this issue.
The Current Study
This experiment compared identifications made when using showups and lineups
under field versus lab-conditions. This was accomplished by using the field-simulation
paradigm (Eisen, Smith et al., 2017; Eisen, Skerritt-Perta et al., 2017). Participants
witnessed what they were led to believe was an actual theft of a laptop computer. In the
field-condition, participants were immersed in what they were led to believe was an
actual police investigation and the identification procedures were administered by
uniformed officers. In the lab-condition, prior to making their identifications, witnesses
were debriefed that the theft was staged for research purposes as part of a study on
eyewitness memory. Otherwise, the identifications were obtained using the same
procedures, but with no law enforcement involvement. Identifications were obtained in
one of three ways: live-showups, photographic-showups, or photographic-lineups.
For live-showups, it was predicted that we would replicate previous field-
simulation research showing increased choosing and overconfidence in showup
identifications made under field compared to lab-conditions. Also, since previous studies
have found no difference in witness performance when viewing suspects live versus in
photographs (Fitzgerald et al., 2018), we expected witness performance when making
identifications from photo-showups to follow the same pattern as when live-showups
were used. Specifically, we predicted higher rates of choosing for both live and photo-
showups conducted in the field versus the lab-conditions, and expected overconfidence in
showup identifications made under field, but not lab conditions.
Although we did not know how field-conditions would affect lineup decision-
making, based on the previous field-simulation research with showups, we expected that
field-conditions would result in increased choosing with lineups, without a commensurate
increase in accuracy. Also, based on previous lab research comparing showups to lineups,
we expected higher false-identification rates for showups compared to lineups in both the
field and lab-conditions. Additionally, we predicted that identifications made with high
levels of confidence would be more likely to be correct if the identification was made from
a lineup compared to a showup. Finally, based on previous field-simulation research, we
expected that witnesses in the field-condition would feel more pressure to make an
identification from both showups and lineups, and that quicker response time would be
associated with greater accuracy in both lab and field-conditions.
Method
Participants
Participants were 719 undergraduate students from introductory psychology
courses at a large state university in Southern California who took part in the study in
exchange for course credit. Participants were 66.9% female. The age of the sample
ranged from 17 to 64, with a mean age of 20.17 (SD = 3.79). The racial background of
the sample was primarily Latino (78.7%), with 4.7% identifying as White, 10.0% as
Asian American, 1.9% as African American, and 4.5% indicating that they did not
belong to any of these groups.
Power analysis. We did not conduct a power analysis a priori, and instead ran as
many participants as possible with the funds available and time allotted. An effect-size
sensitivity test was conducted using G*Power (Version 3.1; Faul et al., 2007). According
to this test, in a logistic regression with three predictors (procedure, ecological
validity, and culprit presence), .80 power, and our sample size of 719, we could detect a
main effect with an odds ratio of 1.30.
Design
This study conformed to a 2 (Field vs. lab) × 2 (Culprit-absent vs. culprit-present)
× 3 (Computer-tablet lineup vs. computer-tablet showup vs. live showup) design.
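The factorial structure described above can be enumerated directly. The following is a minimal sketch that simply lists the twelve cells implied by the 2 × 2 × 3 design; the variable names and level labels are our own shorthand, and nothing here reflects how participants were actually allocated to cells.

```python
from itertools import product

# Factor levels as named in the Design section (labels are shorthand).
SETTINGS = ("field", "lab")
CULPRIT_PRESENCE = ("culprit-absent", "culprit-present")
PROCEDURES = ("computer-tablet lineup", "computer-tablet showup", "live showup")

# Every combination of one level per factor is one cell of the design.
cells = list(product(SETTINGS, CULPRIT_PRESENCE, PROCEDURES))

for number, (setting, culprit, procedure) in enumerate(cells, start=1):
    print(f"{number:2d}. {setting:5s} | {culprit:15s} | {procedure}")

print(f"total cells: {len(cells)}")
```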
Procedure
Participants signed up for a study listed as “Personality and Memory”. In order to
avoid participants learning about the study through word-of-mouth, data collection for the
field-condition was limited to a single day per academic term across the span of two
years. Up to 100 participants were gathered into two separate waiting rooms. They were
told that the study needed to be conducted in smaller groups and that they would need to
wait for their turn. While waiting, they were given a NEO-PI personality test to work on.
Groups of up to 10 participants at a time were escorted from the waiting rooms to smaller
rooms in different parts of the building to take part in the study. A confederate/thief was
embedded into each group.
The Staged Crime and Exposure to the Thief
When participants arrived at the smaller room, they were seated around a
rectangular table. The confederate/thief (henceforth, the thief) always sat at the head of
the table opposite the experimenter, so he would be clearly visible to witnesses at every
other seat around the table. During the reading of the consent, the thief interrupted the
proceedings and drew attention to himself by rudely taking a phone call. He continued
talking on his phone for approximately 20 seconds while the experimenter repeatedly
asked him to end the call. This was done to draw attention to the thief, to ensure that each
participant would take note of his presence in the group. After taking the call, the thief set
the timer of his phone to go off seven minutes later (see below). This was done to
standardize the interval between drawing attention to himself with the call and the theft.
After the consent forms were completed, participants were seated at computer
workstations around the room and were asked to complete a lengthy filler task. The
confederate was always seated at the laptop workstation positioned closest to the exit.
When the thief’s phone timer went off silently, seven minutes after ending the phone call,
he fled the room with the laptop computer that he was working on. After waiting a few
seconds, the experimenter ran out to chase the thief. The experimenter then returned to
the room and stated aloud in front of the group that the student had stolen one of the
laptop computers. The second experimenter then stated aloud, “You had better call Dr.
Eisen” (the study’s PI).
Field vs. Lab Condition. After the culprit fled, participants were either debriefed
that the theft was staged for research purposes (lab-condition), or they were immersed in
what they were led to believe was an actual police investigation (field-condition).
Thief Suspect Pairs
The experiment was conducted over multiple sessions across a two-year period.
Multiple thief suspect pairs were used across sessions. Each time the study was run, the
two actors would rotate roles so that they were each either the thief or the suspect an
equal number of times in each condition (Culprit-Absent live-showup, Culprit-Present
live-showup, Culprit-Absent photo-showup, Culprit-Present photo-showup, Culprit-
Absent lineup, Culprit-Present lineup).
Recruitment of confederate actor/pairs and lineup fillers. Members of the
research team recruited potential confederate/actors who were Hispanic males, college
age, medium height, medium weight, medium complexion, with close-cut hair. Actors
were paid $250 per day, and agreed to have one of our research assistants cut their hair
with a #2 trimmer on the day of the experiment to control for differences in hairstyle. In
addition, they agreed to shave that morning. Once we found our first actor who agreed to
participate in the project for the foreseeable length of the study, members of the research
team recruited additional prospective confederate actors to create our initial
thief/suspect pair (and additional pairs if needed). Lab team members recruited friends,
acquaintances, and students from different local college campuses in Los Angeles who
were judged to be a good match to the original confederate/actor recruited for the study.
The lab members took photos of these individuals and gathered their contact information.
The photos of these potential confederate/actors were presented to the lab group as a
whole to judge which actors were a good match to our initial recruit. Over the course of
the two years we conducted this experiment, we used six different thief/suspect pairs.
The same process was used to find adequate filler photos for the lineup.
Composing the Photographic-Lineup. We created the lineup using the first
confederate/actor recruited for the experiment. Specifically, we composed a large pool of
head-shot/photos that generally matched our original confederate/actor (Hispanic males,
mid-20s, medium height, medium weight, medium complexion, with close-cut hair). The
pool was narrowed based on similarity/match judgements from the lab group. We took
the remaining photos in the pool and obtained ratings from 40 students for how well the
potential fillers matched the first confederate/actor we had recruited for the study. Based
on these ratings we selected five fillers to be used throughout the experiment.
Post-hoc lineup fairness analysis for the experiment yielded a Tredoux’s E of 4.2.
Specifically, when the culprit was absent in the lab condition, choices were fairly well
distributed: 28% rejected the lineup, 13% chose the innocent suspect, 24% chose filler
1, 10.7% chose filler 2, 18.7% chose filler 3, 2.7% chose filler 4, and 2.7% chose filler 5.
Conversely, when the culprit was present, 10.1% rejected the lineup, 66.7% accurately
identified the culprit, 10% chose filler 1, 4.3% chose filler 2, 5.8% chose filler 3, 1.4%
chose filler 4, and 1.4% chose filler 5.
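The effective-size figure reported above can be roughly reproduced from the culprit-absent, lab-condition percentages. The sketch below assumes the usual definition of effective size as the reciprocal of the sum of squared choice proportions among witnesses who chose someone (rejections excluded); the text does not spell out the exact computation, so this is a plausibility check rather than the authors' analysis.

```python
def effective_size(choices: list[float]) -> float:
    """Effective lineup size: 1 / sum(p_i ** 2) over lineup members.

    `choices` holds counts or percentages of witnesses selecting each lineup
    member (lineup rejections excluded). Values are normalised internally,
    so either counts or percentages work.
    """
    total = sum(choices)
    if total <= 0:
        raise ValueError("at least one lineup member must have been chosen")
    # Algebraically identical to 1 / sum((c / total) ** 2 for c in choices).
    return total ** 2 / sum(c * c for c in choices)


# Culprit-absent, lab-condition percentages as reported in the text:
# innocent suspect, then fillers 1 through 5.
culprit_absent_lab = [13.0, 24.0, 10.7, 18.7, 2.7, 2.7]

e = effective_size(culprit_absent_lab)
print(f"effective size: {e:.2f}")  # rounds to 4.2, in line with the reported value
```

A perfectly fair six-person lineup, with choices spread evenly, would give an effective size of 6; values below that reflect members who attracted few choices, such as fillers 4 and 5 here.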
Each time we ran the experiment we used two confederate/actors. The actor who
was not the thief for any given trial served as the innocent suspect for both the
showup and lineup conditions on that day. As noted above, the procedures were
counterbalanced throughout the day so that each actor was both the thief and the suspect
an equal number of times in each condition.
The Field-Condition
After the theft, the experimenter made a phone call to report the theft in front of
the participant/witnesses, so that they could clearly hear what was being said. The
experimenter stated that the laptop had been stolen by one of the participants who was
described as a medium height, medium weight Latino male with short, closely cut hair, a
dark t-shirt, and jeans. This accurate generic description of the thief was scripted, and
each line was fed to the experimenter by the person on the other end of the call to ensure
uniform reporting across all groups. When the phone call was finished, the experimenter
told the participants that Dr. Eisen would call the university police and that they should
continue their work so they could receive credit for the study. If participants began to talk
amongst themselves, they were redirected and asked not to talk to each other during the
session.
After a 20-minute delay, the experimenter received a call. The caller fed the
following scripted lines to the experimenter to tell the group: “The police have detained a
person who matches the description of the thief and they want us to come down to make
an identification”. They were also told that the police asked that they not discuss the
details of the crime among themselves. Participants were led downstairs by one of the
experimenters who actively kept the participants from discussing the crime.
The Identification
The group was greeted by a campus police officer who was accompanied by two
student campus security patrol assistants. The officer gathered the witnesses together and
explained that they had detained a suspect who matched the description of the thief. The
officer explained that they would be taken individually to view the suspect, but they first
needed to be read an instruction. The officer then read the police department’s standard
pre-identification admonition to the group. For lineups, the standard departmental
instruction read,
“You are under no obligation to identify this person as the suspect. We
want to have guilty persons identified, but we also want to make sure that
innocent persons are cleared of any suspicion in this matter. The person
who committed the crime may or may not be in the set of photographs
being presented.”
For showups, the standard departmental instruction read,
“You are under no obligation to identify this person as the suspect. We want to
have guilty persons identified, but we also want to make sure that innocent
persons are cleared of any suspicion in this matter. You should not draw any
conclusions about a person just because he is in our custody, or handcuffed.”
For the photo-showup condition, the admonition had to be modified, because the
police department that participated in the study only had an instruction for live-showups
(they had no specific policy or instructions for photo-showups). Thus, the same
instruction for live showups was used, but the final line, “You should not draw
any conclusions about a person just because he is in our custody, or handcuffed,” was
replaced with “You should not draw any conclusions about the person just because he is
being presented to you.”
After admonishing the witnesses as a group, the officer escorted each witness
one at a time to a separate area to make an identification and asked the students not to
discuss the case while waiting. The witnesses were taken to an area out of sight from the
group to participate in either a live showup administered in front of a squad car, or
a photographic showup or photographic lineup presented on a computer tablet
inside the squad car (see below). One student campus security patrol
member stayed with the group to make sure they did not discuss the case while waiting.
The other student security patrol member accompanied the officer and the witness.
Participants were randomly assigned to see either the actual culprit (culprit-
present condition) or an innocent suspect (culprit-absent condition). The procedure was
administered in a double-blind manner so that the officers did not know if the actual thief
was present in the showup or the lineup, or what position the suspect occupied in the lineup.
The Live Showup
Each participant-witness was led to a spot out of sight and earshot from the group,
approximately 200 feet away (around the corner and down a long hallway). They were
brought to an exit door which had a one-foot by one-foot square window looking outside
the building where the suspect was being detained by police, 12 feet in front of that
window. Before the witness viewed the suspect, the officer stopped several feet away from
the door and wrote down the witness’s information (e.g., name and age). Witnesses were told
they were going to view the suspect through a one-way window so he could not see them. The
officer then instructed the witnesses to stand in front of the window to view the suspect
and said, “Tell me if that’s the person you saw”. The thief was detained by a uniformed
officer with his hands held tightly behind his back as if he were handcuffed. They were
standing up against a full-length glass door out in the sunlight, which was 12 feet away
from the smaller window that the witness was looking through. The officer and thief were
standing in front of a patrol car parked on the sidewalk just outside the building’s doors.
The officer and the witness were accompanied by a student security patrol assistant who
kept a redundant record of all aspects of the identification procedure (name, age,
decision, confidence) and timed how long it took the participant-witnesses to report their
identification decision, from the moment they stepped up to the window to view the
suspect, to the moment they voiced their decision.
The Photographic Lineup and Showup
Witnesses were escorted one at a time, out of the building to a squad car which
was positioned on the sidewalk just outside the doors of the building, in the same spot as
for the live showups. For photographic showups and photo lineups, witnesses were asked
to sit in the front passenger seat, next to an officer (with the door open). The witnesses
were told that they were going to view a picture of the suspect to see if they recognized
him as the person they saw. Before showing them the photograph(s), the officers took
down their personal information just as they would do in a typical police investigation
(i.e., name and age). For the lineup, the witness was shown a picture of the suspect along
with five fillers displayed on a computer tablet. For the showup, they were shown a
single picture of the suspect on the tablet. The officer handed the tablet to the witness,
and then pressed a button to turn on the display. For the lineup, as he handed the tablet to
the witness, the officer said, “Let me know if you recognize any of these people as the
person you saw”. For the showup, as he handed the tablet to the witness, he stated, “Let
me know if you recognize this person as the person you saw”. The student security
patrol assistant sat in the back seat of the squad car and kept a redundant record of all
aspects of the identification procedure and timed how long it took the witnesses to report
their decisions from the moment the tablet was handed to them and turned on, to the
moment they voiced their identification decision.
Recording the Identification Decision and Witness Confidence
The officer recorded whether the witnesses identified the suspect or not, and
asked each participant/witness how confident they were in their decision. If the witness
identified the suspect, they were asked how confident they were that the person they
identified was the thief on a 0–100 scale, with 0 being not confident at all and 100 being
perfectly confident. If the witness did not identify the suspect, they were asked how
confident they were that this person was not the thief using the same scale. The student
patrol assistant quietly recorded the same information as the officer to enhance reliability.
Debriefing
After the identification procedure (showup or lineup), witnesses were directed to a
waiting area which was supervised by an experimenter. They were debriefed that the
crime and identification had been staged for research and were instructed not to talk to
the others or communicate with other participants on their phones. After the whole group
had participated in the showups or lineups, they were more fully debriefed as a group and
escorted to the final room to complete the post-event questionnaire (see supplemental
materials).
The Lab Procedure
For the lab procedure, exposure to the culprit and the staged crime was conducted
in exactly the manner described above. However, for the lab-condition, five minutes after
the experimenters reported the crime, they debriefed participants that the theft was staged
for research purposes as part of a study on eyewitness memory. After a 20-minute delay,
participants were brought to the same location where the field identifications were made,
and the same procedures were used to obtain the identifications. However, the procedures
were conducted by student experimenters rather than police.
Post-Event Questionnaire
After making their identifications, participant-witnesses in both the field and lab-
conditions were led to a final room to complete a post-event questionnaire asking about
their identification experience. The post-event questionnaire included questions about
whether they identified the suspect, confidence in their decision, and how much pressure
they felt to identify (see supplemental materials).
Manipulation Check. The post-event questionnaire included an item asking
participants if they suspected the crime and procedures were fake and staged for research.
If they responded yes to this question, they were asked how confident they were that the
crime and identification procedure was staged for research on a 0-100 scale. After the
students completed the post-event questionnaire, they were thanked for their participation
and were asked to not tell their classmates about the study.
Was It Real?: True Believers vs. Doubters. Twenty-five percent of the sample
reported that they were more than 50% sure that the procedures were staged for research
(hereafter referred to as the ‘doubters’). This is typical of the proportion of
doubters found in previous field-simulation experiments that have employed a similar
paradigm (Eisen, Smith et al., 2017; Eisen, Skerritt-Perta et al., 2017). Previous field-
simulation experiments have found that participants who did not express significant
doubts that the crime and identification procedures were real (hereafter referred
to as the ‘true believers’) were significantly more likely to positively identify the suspect
presented to them at a showup (Eisen, Smith et al., 2017; Eisen, Skerritt-Perta et al.,
2017). This finding was replicated in the current study, as true believers were
significantly more likely than doubters to make suspect identifications when showups
were used, χ²(1, 382) = 8.76, p < .01. However, this finding did not generalize to witness
performance when lineups were administered, χ²(1, 435) = 1.01, p < .30. Similarly,
when examining choosing, true believers were no more likely than the doubters to make
a selection (i.e., choose), rather than rejecting the lineups presented to them, χ²(1, 435) =
1.83, p = .18. These preliminary data suggest that field conditions may not have the same
effect on witnesses when making identifications from lineups versus showups.
All participants were included in the analyses reported, but the data in each table
are presented with and without the doubters. Although the effect sizes of some analyses
differed when restricting the analyses to only true believers, none of the results changed
significantly regardless of whether doubters were included or not. This protocol was
approved by the institutional review board at California State University, Los Angeles
(#M14-154).
Cross-Race Effects. We originally set out to examine same-race bias in our
analysis plan. However, we decided to remove the same-race variable from our analyses,
as the manipulation was not adequately crossed and the results did not interact with other
effects. Removing this variable also improved power.
Results
Live Versus Picture Showups
Previous research has shown no difference in witness performance when viewing
suspects live versus in photographs (Fitzgerald et al., 2018). Based on this work, we did
not expect to see differences in witness performance when using live versus photo-
showups under field or lab-conditions. Preliminary chi-square analysis revealed that
under field-conditions, no difference was found in suspect identifications between the
live and picture showups, either when the actual culprit was presented, χ²(1, 102) = .22, p
= .64, or when an innocent suspect was shown, χ²(1, 103) = .25, p = .62. Similarly, in the
lab-condition, no difference in suspect identifications was found between the two showup
modalities used when the culprit was present, χ²(1, 139) = 1.99, p = .16, or absent, χ²(1,
130) = .10, p = .75. Based on these preliminary analyses (see Table 1), the two showup
conditions were combined for all analyses that follow. Although data from the two
showup conditions were combined for the analyses described below, each table provides
complete data showing witness performance separately for the live and photo-showup
conditions. Moreover, for interested readers, supplemental analyses were conducted
without combining the showup conditions, treating procedure as a three-level variable
(lineup vs. live-showup vs. photo-showup).
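As a rough check on the preliminary comparisons above, the p-value for a chi-square statistic with one degree of freedom can be recovered from the complementary error function. The sketch below is illustrative only and is not the authors' analysis code. The function name is hypothetical, and the inputs are the four test statistics reported in this section.

```python
import math


def chi2_p_value_df1(chi2_stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is a squared standard normal deviate, so
    P(X > x) = erfc(sqrt(x / 2)).
    """
    if chi2_stat < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi2_stat / 2.0))


# Test statistics reported above for live versus picture showups.
reported = {
    "field, culprit present": 0.22,
    "field, culprit absent": 0.25,
    "lab, culprit present": 1.99,
    "lab, culprit absent": 0.10,
}
for label, stat in reported.items():
    print(f"{label}: p = {chi2_p_value_df1(stat):.2f}")
```

Under these assumptions, the recomputed values agree with the reported p-values to two decimal places.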
Logistic Regression Analyses
Logistic regression analyses were conducted to examine if ecological validity
(field vs. lab), identification condition (showup vs. lineup), or culprit presence (present
vs. absent), were related to witness performance. Hierarchical multilevel logistic
regression models were used to account for the shared variance among participants nested
in the many small groups run in this study, many of which viewed different thieves and
suspects.
Choosing
The first multilevel logistic regression was conducted with choosing as the
dependent variable. Choosing was defined as making an affirmative identification, as
opposed to rejecting the lineup or showup. In the first model, the main effects were
entered, and two- and three-way interactions were entered in subsequent models. Based
on previous field-simulation research, it was predicted that for showups, choosing would
be higher in the field compared to the lab-condition regardless of culprit presence.
However, we did not know how field conditions would affect choosing when lineups
were used. Based on previous lab research, it was also expected that witnesses
would choose more when presented with a lineup versus a showup. As expected, a main
effect for procedure was revealed, as witnesses were more likely to choose in the lineup
compared to the showup condition, B = 1.54, SE = .45, Wald’s z = 3.40, p < .001, e^B =
4.66 (95% CI [.65, 2.43]). Also, choosing was greater when the culprit was present,
compared to when he was absent, B = 2.56, SE = .49, Wald’s z = 5.23, p < .001, e^B =
12.94 (95% CI [1.60, 3.52]). Notably, an interaction between procedure and culprit
presence was revealed, B = -1.70, SE = .62, Wald’s z = -2.74, p < .01, e^B = .18 (95% CI
[-2.92, -.48]). When the culprit was absent, choosing was substantially higher when lineups
were used (59.2%) compared to showups (16.7%), B = 4.05, SE = .63, Wald’s z = 6.46, p
< .001, e^B = 57.40 (95% CI [2.82, 5.29]). When the culprit was present, choosing was
still higher in lineups (80.0%) compared to showups (67.6%), but the difference was not
as large, B = 2.35, SE = .63, Wald’s z = 3.75, p < .001, e^B = 10.49 (95% CI [1.12, 3.58]).
Although ecological validity did not directly predict choosing, B = -.05, SE = .41,
Wald’s z = -.12, p = .91, e^B = .95 (95% CI [-.85, .75]), a two-way interaction emerged
between ecological validity and procedure, B = -3.35, SE = .64, Wald’s z = -5.21, p
< .001, e^B = .04 (95% CI [-4.61, -2.09]). As predicted, for showups, choosing was
significantly higher in the field (49.3%) compared to the lab-condition (37.5%), B = 1.84,
SE = .55, Wald’s z = 3.37, p < .001, e^B = 6.30 (95% CI [.77, 2.92]). However, an opposite
pattern emerged for lineups, as choosing was significantly lower in the field (53.5%)
compared to the lab-condition (80.6%), B = -1.51, SE = .53, Wald’s z = -2.87, p < .01, e^B
= .22 (95% CI [-2.54, -.48]). This finding indicates that field-conditions had a very
different effect on witness decision-making when using showups versus lineups; leading
witnesses to lower their criterion for choosing when showups were used, but raising their
criterion when viewing a lineup. Although all key elements of analyses are provided in
the text above, these data are also provided in table form in the supplemental materials.
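For readers who wish to relate the reported quantities to one another, the sketch below shows how a Wald z statistic, an odds ratio (e raised to the power B), and a 95% interval follow from a coefficient and its standard error. It is a minimal illustration, not the multilevel model fitted in this study, and the function name is hypothetical. The reported intervals appear to be on the log-odds scale, since they bracket B rather than the odds ratio, and the sketch follows that convention.

```python
import math


def wald_summary(b, se, z_crit=1.96):
    """Return the Wald z, the odds ratio, and a 95% CI for B (log-odds scale)."""
    if se <= 0:
        raise ValueError("the standard error must be positive")
    z = b / se
    odds_ratio = math.exp(b)
    ci_for_b = (b - z_crit * se, b + z_crit * se)
    return z, odds_ratio, ci_for_b


# Main effect of procedure on choosing as reported above (B = 1.54, SE = .45).
z, odds_ratio, (lower, upper) = wald_summary(1.54, 0.45)
print(f"z = {z:.2f}, odds ratio = {odds_ratio:.2f}, CI for B = [{lower:.2f}, {upper:.2f}]")
```

Because B and SE are themselves rounded in the text, recomputed values can differ from the reported ones in the second decimal place.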
Accuracy
The next logistic regression examined accuracy while using the same set of
predictors. Accuracy was defined by choosing the suspect when he was present in the
identification task or rejecting the lineup/showup when he was absent. Specifically,
accuracy = (correct identifications + correct rejections)/ (all decisions). As done in the
previous regression, the main effects were entered in the first block and two- and three-way
interactions were entered in subsequent blocks. A main effect for procedure was revealed,
as participants were more accurate when using showups (75.3%) compared to lineups
(51.0%), B = -1.26, SE = 0.33, Wald’s z = -3.84, p < .001, e^B = .28 (95% CI [-1.90,
-.62]). No main effects were found for culprit presence, B = -.14, SE = 0.36, Wald’s z =
-0.37, p = .71, e^B = .87 (95% CI [-.85, .58]), or ecological validity, B = -.18, SE = 0.36,
Wald’s z = -0.50, p = .62, e^B = .84 (95% CI [-.90, .53]).
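The accuracy definition given above can be written as a small function. This is a sketch with hypothetical counts, chosen only to show that filler picks in a culprit-present lineup count as inaccurate decisions. The counts are not data from this study.

```python
def accuracy(correct_ids, correct_rejections, total_decisions):
    """Proportion of decisions that were correct IDs or correct rejections."""
    if total_decisions <= 0:
        raise ValueError("total_decisions must be positive")
    return (correct_ids + correct_rejections) / total_decisions


# Hypothetical lineup counts (not study data).
culprit_present = {"suspect_id": 37, "filler_pick": 11, "rejection": 12}
culprit_absent = {"choice": 35, "rejection": 25}

total = sum(culprit_present.values()) + sum(culprit_absent.values())
overall = accuracy(culprit_present["suspect_id"], culprit_absent["rejection"], total)
print(f"overall accuracy = {overall:.3f}")
```

In this illustration, only suspect identifications in the culprit-present condition and rejections in the culprit-absent condition enter the numerator.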
A significant two-way interaction was found between culprit presence and
procedure, B = 2.47, SE = 0.61, Wald’s z = 4.06, p < .001, e^B = 11.82 (95% CI [1.28,
3.66]). For culprit present conditions, accuracy was not significantly higher for showups
(67.6%) compared to lineups (61.7%), B = -.62, SE = .45, Wald’s z = -.30, p = .16, e^B
= .54 (95% CI [-1.50, .25]). However, in culprit absent conditions, accuracy was
significantly higher for showups (83.3%) compared to lineups (40.8%), B = -3.09, SE
= .50, Wald’s z = -6.17, p < .001, e^B = .05 (95% CI [-4.08, -2.11]).
There was also a statistically significant three-way interaction between ecological
validity, procedure, and culprit presence, B = -4.28, SE = 1.15, Wald’s z = -3.74, p
< .001, e^B = .014 (95% CI [-6.53, -2.03]). This interaction revealed that in the field
condition, culprit presence was not directly related to accuracy for showups or lineups.
However, in the lab-condition, for showups, accuracy was higher when the culprit was
absent (91.5%) compared to when he was present (64.7%). In contrast, for lineups done
in the lab, the opposite was true, as accuracy was higher when the culprit was present
(66.7%) compared to when he was absent (28.0%). This is related to the fact that when
showups were conducted in the lab, participants easily dismissed the innocent suspect
presented to them, resulting in a very high accurate rejection rate. In contrast, with
lineups, choosing rate was much higher in the lab, resulting in a substantially lower rate
of correct rejections when the culprit was absent. Again, although all key elements of
analyses are provided in the text above, a regression table can be found in the
supplemental materials. Also, for interested readers, additional logistic regression
analyses examining suspect identifications are provided in the supplemental materials.
ROC Analyses
ROC curves are formed by plotting the true-positive identification rate against the
false-positive identification rate for each procedure across varying levels of witness
confidence. The procedure that yields greater area under the curve (AUC) demonstrates a
better tradeoff between accurate culprit-identifications and false identifications (i.e., a
higher proportion of suspect-identifications relative to false identifications at each level
of confidence examined).
Figure 1 shows the full ROC curves for lineups versus showups in the field-
condition. Three levels of confidence were used to plot the curves: high (90%-100%),
medium (70%-89%), and low (<70%). The first three confidence points on the curve
starting from the lower-left corner of the chart plot the proportion of accurate culprit
identifications against the proportion of innocent-suspect identifications at high, medium,
and low confidence. The data points on the curve are cumulative. Accordingly, the first
point on the curve (after the 0, 0, origin point), plots suspect identifications made at 90-
100% confidence, the second plots suspect identifications made with at least medium
confidence (70-100%), and the third point plots all suspect identifications. Note that the
lineup curve has seven points, while the showup curve has only six points. This is
because lineups can result in three types of choices (suspect picks, filler picks, and
rejections), but showups only result in two types of decisions (suspect picks or
rejections). The middle point on the lineup curve plots the proportion of mistaken filler
picks against accurate decisions for all confidence levels, as there is no strong evidence
for a confidence-accuracy relationship amongst filler picks. The final three points on both
curves plot accurate rejections when the culprit was absent against mistaken rejections
when he was present. These points are plotted cumulatively at each of the three levels of
confidence, but in reverse order of confidence: from rejections made with the lowest
confidence to rejections with the highest confidence. Taken together, when using the full
ROC space, decisions are plotted from the suspect being the best match-to-memory at the
far left (suspect picks made with the highest confidence) to the worst match-to-memory
at the far right (rejections made with the highest confidence). To examine differences
between ROCs and their respective AUC values, we used the fullROC package
developed by Yang and Smith (2020).
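The construction described above can be sketched in a few lines. The example below is a simplified illustration, not the fullROC package of Yang and Smith (2020) and not the authors' code. The function names are hypothetical, the counts are invented showup data, and the area is computed with a plain trapezoidal rule. Response categories are ordered from high-confidence suspect picks to high-confidence rejections, and each point plots the cumulative proportion of culprit-present responses against the cumulative proportion of culprit-absent responses.

```python
from itertools import accumulate


def full_roc_points(culprit_present_counts, culprit_absent_counts):
    """Cumulative (false-positive rate, true-positive rate) pairs.

    Both lists must use the same category order, from the strongest evidence
    that the suspect is the culprit to the strongest evidence that he is not.
    """
    if len(culprit_present_counts) != len(culprit_absent_counts):
        raise ValueError("both conditions need the same response categories")
    n_present = sum(culprit_present_counts)
    n_absent = sum(culprit_absent_counts)
    tpr = [c / n_present for c in accumulate(culprit_present_counts)]
    fpr = [c / n_absent for c in accumulate(culprit_absent_counts)]
    return list(zip(fpr, tpr))


def trapezoid_auc(points):
    """Area under the curve through the origin and the given points."""
    path = [(0.0, 0.0)] + list(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(path, path[1:]))


# Hypothetical showup counts (not study data), in the order: suspect picks at
# high, medium, low confidence, then rejections at low, medium, high confidence.
present = [30, 20, 15, 10, 15, 10]
absent = [5, 5, 10, 15, 25, 40]

points = full_roc_points(present, absent)
print(points)
print(f"AUC = {trapezoid_auc(points):.3f}")
```

With six response categories, the sketch returns six points beyond the origin, matching the six-point showup curve described in the text. Adding a filler-pick category between the suspect picks and the rejections would yield the seven points of a lineup curve.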
Field-Conditions
Results revealed no differences between the showup curve (AUC = .77) and the
lineup curve (AUC = .72) in field conditions, D = -.06, 95% CI [-0.17, .05], p = .33.
Notably, Figure 1 shows there is an overlap in the curves, as the lineup curve starts above
the showup curve but is ultimately overtaken by the showup curve. This is because when
lineups were used under real-world field-conditions, very few witnesses made high-
confidence false-identifications of the innocent suspect. Thus, the first point on the lineup
curve, which plots identifications made with the highest level of confidence, clings
closely to the y-axis (which plots accurate suspect identifications), which initially puts it
above the showup curve. In contrast, high-confidence false-identifications of innocent
suspects were common when showups were used. In essence, when considering only
decisions made at 90-100% confidence, the very low rate of high-confidence innocent
suspect identifications gave lineups a clear advantage over showups in the field.
However, this advantage was lost when considering suspect-identifications made with
medium and low levels of confidence; mainly because suspect-identifications were so
much higher overall when showups were used. These results suggest that the use of the
more conservative lineup procedure successfully reduced the chances of obtaining high-
confidence false-identifications, but also resulted in fewer suspect-identifications overall.
When considering rejections, showups had superior predictive value. This is
evidenced by how the showup line stays over the lineup curve when looking at the next
three points in the ROC space, from left to right. As noted above, the final three points on
both curves plot rejections made at each of the three levels of confidence, from lowest to
highest. Showups have an advantage when it comes to rejections because witnesses
choose more from lineups than showups.
The Lab Condition
Figure 2 shows that in lab-conditions, the showup curve dominates the lineup curve
across the entire ROC space. Results revealed that the showup curve (AUC = .87) has a
higher AUC than the lineup curve (AUC = .78) in laboratory conditions, D = -.09, 95%
CI [-.18, .00], p = .03. The superiority of showups in the lab condition in this study
appears to be related to the very low false-identification rate made by lab participants
when showups were used with this paradigm. These data suggest that under the
conditions examined, exposure to the culprit was strong enough, and the retention
interval was short enough, that when showups were used, lab participants had very little
difficulty discriminating between the innocent suspect and the actual culprit; resulting in
very low rates of mistaken identifications (See Table 1). However, lab participants still
had difficulty with the lineup task, resulting in more choosing and fewer correct
rejections when the culprit was absent. For interested readers, additional regression
analyses examining discriminability are provided in the supplemental materials.
Additionally, ROC figures including descriptive confidence intervals can be found in the
supplemental materials.
Confidence
It was predicted that for showups, witnesses in the field-condition would be
overconfident in their decisions regardless of culprit presence. When viewing the ROCs,
it is clear that high-confidence innocent-suspect identifications were relatively rare with
lineups conducted under field conditions, but quite common when showups were used.
Table 3 displays the proportion of accurate and false identifications in the field and lab
conditions for both showups and lineups binned for decisions made at three confidence
levels: 90-100%, 70-89%, and below 70%. Notably, Table 3 shows that 50% of all false-
identifications from showups in the field-condition were made at 90-100% confidence. In
contrast, when using lineups under field conditions, only 6.6% of all false-identifications
(filler and suspect-identifications) were made at 90-100% confidence. Also as predicted,
overconfidence in showup decisions was restricted to the field-conditions. Under lab-
conditions, less than 10% of participants falsely identified the innocent suspect when a
showup was used, and only 20% of those false-identifications were made with 90%
confidence or greater.
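The binning behind these percentages can be illustrated with a short sketch. The function names and the confidence ratings below are hypothetical and are not data from this study. The cut points follow the three levels used in the text (90-100%, 70-89%, and below 70%).

```python
def confidence_bin(confidence):
    """Map a 0-100 confidence rating to the three bins used in the text."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= 90:
        return "high"
    if confidence >= 70:
        return "medium"
    return "low"


def share_high_confidence(ratings):
    """Proportion of the given ratings that fall in the high bin."""
    if not ratings:
        raise ValueError("at least one rating is required")
    bins = [confidence_bin(r) for r in ratings]
    return bins.count("high") / len(bins)


# Hypothetical confidence ratings for ten false identifications (not study data).
false_id_ratings = [95, 100, 90, 92, 99, 60, 75, 80, 50, 85]
print(f"share made with high confidence = {share_high_confidence(false_id_ratings):.2f}")
```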
Response-Time
It was predicted that response times would be longer under field conditions, and
that accurate decisions would be made more quickly for both showups and lineups
conducted under both field and lab conditions. To address this question a 2 (Ecological
validity: Field vs. Lab) x 2 (Procedure: Showup vs. Lineup) x 2 (Accuracy: Yes vs. No) x
2 (Culprit presence: Yes vs. No) ANOVA was conducted with response-time in seconds
as the dependent variable. Means and standard deviations from these analyses can be
found in Table 4. As expected, a significant main effect for ecological validity was found,
F(1, 717) = 15.09, p < .001, η² = .02, as response-times were substantially longer when
identifications were made in the field, M = 8.19, SD = 7.49, compared to the lab
conditions, M = 6.45, SD = 5.26. Also as predicted, accurate witnesses made significantly
quicker identification decisions, M = 6.26, SD = 6.11, than their inaccurate counterparts,
M = 9.09, SD = 6.46, F(1, 717) = 6.66, p < .01, η² = .01. Notably, showup decisions, M =
5.19, SD = 5.40, were made substantially more quickly than lineup decisions, M = 11.08,
SD = 6.29, F(1, 717) = 125.72, p < .001, η² = .15. An interaction was found between
accuracy and culprit presence, F(1, 717) = 12.49, p < .001, η² = .02. Table 4 shows that
for showups, accurate decisions, M = 4.58, SD = 5.01, were made more quickly than
inaccurate decisions, M = 7.04, SD = 6.12, Cohen’s d = .44. However, contrary to
expectations, when lineups were used, there was no difference in response-time between
accurate, M = 11.06, SD = 6.44, and inaccurate decisions, M = 11.10, SD = 6.16,
Cohen’s d = .01. No other statistically significant interactions were found.
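The effect sizes reported above can be approximated from the summary statistics in the text. The sketch below is illustrative and rests on two assumptions that the text does not confirm: that Cohen's d was computed with a pooled standard deviation equal to the root mean square of the two group SDs (which presumes equal group sizes), and that the eta-squared values behave like partial eta squared derived from F and its degrees of freedom. The function names are hypothetical.

```python
import math


def cohens_d_equal_n(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the root mean square of the two SDs (assumes equal n)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_b - mean_a) / pooled_sd


def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# Accurate versus inaccurate decisions, using the means and SDs reported above.
d_showup = cohens_d_equal_n(4.58, 5.01, 7.04, 6.12)
d_lineup = cohens_d_equal_n(11.06, 6.44, 11.10, 6.16)
# Main effect of procedure on response-time, F(1, 717) = 125.72.
eta_procedure = partial_eta_squared(125.72, 1, 717)

print(f"d (showups) = {d_showup:.2f}")
print(f"d (lineups) = {d_lineup:.2f}")
print(f"eta squared (procedure) = {eta_procedure:.2f}")
```

Under these assumptions, the recomputed values round to the figures reported in the text, although the exact formulas used in the original analyses may differ.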
Pressure to Identify: Field vs. Lab
Based on previous field-simulation research done with live showups (Eisen, Smith et
al., 2017), it was predicted that witnesses in the field condition would report feeling more
pressure to make an identification. To address this question, a 2 (field vs. lab) x 2
(showup vs. lineup) chi-square analysis was conducted to examine whether participants
in the field-condition were more likely to report feeling pressured to make an
identification than participants in the lab-condition. Results of these analyses replicate the
findings of Eisen, Smith et al. (2017), and revealed that witnesses in the field were
significantly more likely to report feeling pressured to make an identification (35.9%)
than participants in the lab-condition (25.2%), χ²(1, 718) = 9.60, p = .002, ϕ = .12. Also,
the percentage of witnesses in the field-condition who reported feeling pressured to
choose was generally comparable among those who viewed lineups (35.6%) and
showups (36.6%), χ²(1, 718) = .03, p = .86, ϕ = .01.
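For a 2 x 2 table, the phi coefficient is the square root of the chi-square statistic divided by the sample size. The sketch below applies that relationship to the two statistics reported above. The function name is hypothetical, and the sample size of 718 is taken from the reported test statistics.

```python
import math


def phi_coefficient(chi2_stat, n):
    """Phi effect size for a 2 x 2 chi-square test: sqrt(chi2 / N)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if chi2_stat < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.sqrt(chi2_stat / n)


phi_field_vs_lab = phi_coefficient(9.60, 718)
phi_lineup_vs_showup = phi_coefficient(0.03, 718)
print(f"phi (field vs. lab) = {phi_field_vs_lab:.2f}")
print(f"phi (lineup vs. showup) = {phi_lineup_vs_showup:.2f}")
```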
Discussion
This study was designed to directly compare witness performance when using
showups and lineups under lab versus field-conditions. This was accomplished by using
the field-simulation paradigm, in which half of the witnesses were immersed in what they
were led to believe was an actual police investigation while the others were informed
their identifications were being made as part of a study on eyewitness memory. Results
showed that the field-conditions had a very different effect on witness performance when
using showups compared to lineups. For showups, field-conditions led witnesses to lower
their criterion for choosing regardless of culprit presence. Notably, this difference was
even greater when only examining the true-believers, who expressed no significant
doubts about the reality of the field-simulation. However, the field-conditions appeared to
have the opposite effect on lineup decision-making; resulting in decreased choosing
regardless of culprit presence, which led to a commensurate decrease in both false-
identifications and accurate culprit-identifications.
Overconfidence of Showup Identifications Made in the Field
The increased incidence of false-identifications when using showups compared to
lineups is not a particularly novel finding and has been demonstrated in numerous lab
studies (Colloff et al., 2016; Colloff et al., 2020; Gronlund et al., 2014; Wetmore et al.,
2014). However, the field-simulation data also revealed that when witnesses were led to
believe that their identification would likely result in the arrest of the suspect, they tended
to be overconfident in their showup identifications. Table 3 shows that when showups
were used under field-conditions, half of all false-identifications were made at 90-100%
confidence. As expected, the overconfidence in showup decisions was unique to
identifications made under field-conditions and did not generalize to participants in the
lab group. Indeed, when witnesses were aware that their identifications were being made
as part of a research study on eyewitness memory, participants rarely misidentified the
innocent suspect in the showup condition (8.5%), and only a small fraction of decisions
made at 90-100% confidence were made in error (2.6%). Eisen, Smith et al. (2017) also
found a similar divergence between lab and field data when examining showup
identifications and proposed that this difference could be attributed to the influence of
non-memorial “hot” affective components of human decision-making driven by the
pressures of being put on the spot to make a showup identification in an actual police
investigation; which are largely absent in lab studies. This might explain why the results
of field-simulation studies, which have now consistently shown overconfidence in
showup identifications, differ from some lab studies which have found that witness
confidence was aligned with accuracy among choosers making identifications from single
photo-showups (Sauerland et al., 2018; Sauerland et al., 2012). Indeed, in the current
study, confidence appeared to be relatively well aligned with accuracy for choosers in the
lab-showup condition.
Notably, field conditions did not appear to boost confidence in lineup
identifications, as less than 5% of all innocent suspect-identifications, and fewer than
10% of all mistaken selections (fillers and innocent-suspects combined) from lineups
were made with 90-100% confidence. Taken together, these data suggest that under field-
conditions, high-confidence identification decisions tend to be predictive of accuracy for
lineups, but not for showups.
Field Conditions and Choosing from Lineups Versus Showups
As predicted, when showups were used, choosing increased significantly in the
field compared to the lab-condition, regardless of culprit presence. This replicates the
findings from three previously published field-simulation experiments (Eisen, Smith et
al., 2017). Eisen and his colleagues proposed that the increased choosing with showups
conducted under field compared to lab-conditions was the result of situational pressures
associated with participating in what they were led to believe was an actual police
investigation: specifically, witnesses feeling increased pressure to make an
identification. As expected, this was the case in the current experiment, as witnesses in
the field-condition were more likely to report feeling pressured to identify the suspect
presented to them than their counterparts in the lab group. However, field conditions
appeared to have a very different effect on witnesses when lineups were used. Rather
than choosing more, when witnesses were led to believe that their lineup identification
was being made as part of an actual police investigation, they became more conservative
in their decision-making, and actually chose less in the field compared to the lab-
conditions. Notably, this decrease in choosing resulted in a commensurate decrease in
both false-identifications and accurate culprit-identifications. Moreover, this conservative
shift was magnified when the doubters were removed from the analysis. Indeed, in both
experiments, when considering only the true believers, choosing was lower for lineups in
the field compared to the lab-condition, while the opposite was true for showups.
Response-Time
As expected, accurate decisions were made more quickly for showups and lineups
in both the field and lab-conditions. Also as predicted, response-time was significantly
longer in the field compared to the lab-conditions for both lineups and showups. The
difference in response-time between field and lab-conditions replicates the findings from
previous field-simulation research done with showups (Eisen, Smith et al., 2017), and
demonstrates that this effect extended to decisions made when using lineups. Eisen,
Smith et al. (2017) argued that the longer response time evidenced under field-conditions
is related to elements of “hot” cognition; as witnesses in the field wrestled with their
decisions for much longer because they were aware of the real-world consequences of
making the identification. However, in the lab-condition, participants were able to make
relatively quick decisions driven by “cold” cognitive factors of the clear-cut
discrimination task presented to them, unencumbered by the inherent situational
pressures of being a witness in an actual police investigation.
Showups versus lineups. The response-time analyses also indicated that showup
decisions were made substantially more quickly than lineup decisions. These data
demonstrate that in general, witnesses wrestled with lineup decisions for significantly
longer than showup decisions, and suggest that many witnesses found the lineup to be a
more difficult task. It is possible that the shift towards more conservative decision-
making evidenced when making identifications from lineups in the field may be related
to the relative difficulty of the two tasks. As noted earlier, making an identification from
a showup involves a relatively simple discrimination task, as the witness simply needs to
decide whether the person presented is the culprit or not. In contrast, lineup decision-
making involves a more complex task. Unlike showups, witnesses first need to decide
whether the culprit is present or not; and then, if they believe he is present in the group,
they must identify which of the alternatives is the perpetrator. In the absence of
experiencing an immediate sense of recognition when initially viewing the suspect’s
picture in the group, witnesses must decide which, if any, of the pictures is likely to be the
culprit. Horry et al. (2014) observed that as fillers become more similar to the suspect, the
task becomes increasingly difficult. Weber and Perfect (2012) speculated that when some
witnesses are presented with a fair lineup and don’t experience a powerful sense of
recognition, they may reject the lineup simply because they do not know which one to
choose. In the current study, it is possible that when faced with this more difficult task in
the context of the presumed real-world consequences of the field-condition, some
witnesses simply gave up rather than pressing forward and guessing.
Live Versus Picture Showups
Intuitively, one might think that the live confrontation between a witness and a
suspect in police custody would be more arousing and/or anxiety-provoking than viewing
a single picture, and that this increased arousal might influence witness decision-making.
We were able to control for this factor by adding the picture showup-condition, and
found no difference in witness performance when viewing the suspect live in police
custody versus being presented with a single picture of the suspect under similar
circumstances. Although the mode of viewing (live vs. photo) did not affect witness
performance in either the field or the lab-condition, the belief that their identification
would presumably lead to the arrest and prosecution of the person they identified had a
significant effect on decision-making. Indeed, regardless of whether the showup was
conducted live or with a photo, witnesses lowered their criterion for choosing in the field
compared to the lab-conditions, and witnesses were overconfident in their decisions in
the field, but not the lab.
Limitations
The most notable limitation of this study is related to its modest sample size. This
limited the range of analytic techniques available to us: we were unable to conduct
CAC analyses, and the sample was arguably a bit small to construct stable ROC curves.
That said, the findings from the current experiment build on previous studies which also
found that witnesses lowered their criterion for choosing and were overconfident in their
showup identifications made under real-world field conditions, and this replication
increases our confidence in these particular findings. However, because this is the first
study of its kind to compare witness performance when making identifications from
photographic-lineups conducted under field versus lab conditions, additional field-
simulation experiments are needed to see if field conditions consistently lead witnesses to
become more conservative in their decision-making when photo-lineups are used.
Although this pattern of results is intriguing, replication is required before we can invest
too much confidence in this novel finding.
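For readers unfamiliar with the CAC analyses mentioned in this section, the following minimal Python sketch illustrates the general idea: suspect-identification accuracy is computed separately at each confidence level. The function name, the three confidence bins, and all counts are hypothetical and chosen only for illustration; they are not data from this experiment. The sketch also assumes equal numbers of culprit-present and culprit-absent procedures.

```python
def cac_accuracy(culprit_ids, innocent_ids):
    """Return suspect-ID accuracy at each confidence level.

    culprit_ids:  dict mapping confidence level -> number of correct
                  identifications of the culprit (culprit-present trials).
    innocent_ids: dict mapping confidence level -> number of identifications
                  of the innocent suspect (culprit-absent trials).
    Accuracy at a level is correct IDs / (correct IDs + innocent-suspect IDs).
    """
    curve = {}
    for level, hits in culprit_ids.items():
        false_ids = innocent_ids.get(level, 0)
        total = hits + false_ids
        # Leave the point undefined when nobody chose at this level.
        curve[level] = hits / total if total else None
    return curve


# Hypothetical counts, for illustration only (not data from this study).
culprit_ids = {"low": 10, "medium": 20, "high": 30}
innocent_ids = {"low": 10, "medium": 5, "high": 0}

curve = cac_accuracy(culprit_ids, innocent_ids)
for level, accuracy in curve.items():
    print(level, accuracy)
```

Because each point on the curve rests only on the identifications made at that confidence level, small cell counts make the estimates unstable, which is consistent with the sample-size concern raised above.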
Regarding the group nature of the paradigm, although witness interactions were
strictly limited after the crime and before the identifications were made, the group nature
of the procedures used still introduced the potential for co-witness effects. Notably, it was
necessary to have the experimenter provide a general description of the suspect to the
authorities in front of the witnesses to explain why the suspect was detained by the
police. Although this description was entirely accurate (i.e., medium height, medium weight
Hispanic male with a dark tee shirt, jeans, and close cut hair), it is still possible that
hearing this description may have affected some witnesses’ memory for the culprit.
Moreover, since the witnesses were taken one at a time to view the suspect, it is also
possible that some witnesses may have made inferences of culprit presence based on how
long other witnesses took to make their identification decisions (Douglass et al., 2020).
Regarding the sample, participants were primarily Latinx female college students.
In general, college students tend to be more educated and often wealthier than typical
members of the surrounding community (Gosling, Sandy, John, & Potter, 2010). That
said, it is worth noting that this study was done at a very diverse urban commuter campus
in East Los Angeles, and many of the participants came from low-income homes and were
the first in their families ever to attend college. As such, the students at this university
are arguably more representative of the urban community surrounding the school than
typical samples from wealthier educational institutions.
Conclusions
The current study builds on previous work showing that when participants were
led to believe their identifications were being made as part of an actual police
investigation, witnesses at showups lowered their criterion for choosing, and were
overconfident in their identification decisions. Moreover, findings from the current study
demonstrated that the observed effects were robust enough to generalize to identifications
made from both live and photo showups. Notably, these effects were even larger after
excluding participants who expressed significant doubts about the field-simulation (i.e.,
the doubters) and only looking at witnesses who presumably believed that their
identification would result in the arrest and prosecution of the suspect (i.e., the true
believers). This pattern of results suggests that the observed differences between
witnesses in the lab and field-conditions are driven primarily by the situational pressures
of being a witness in an actual police investigation, rather than any subtle procedural
differences between the lab and field-conditions.
Notably, field conditions appeared to have a very different effect on witnesses
when making identifications from lineups, as witnesses were actually less likely to
choose from lineups when they were led to believe their identification was being made as
part of an actual police investigation. Essentially, something about being a witness in an
actual police investigation made witnesses more liberal in their decision-making when
showups were used, but more conservative when lineups were employed. This more
conservative approach to making identifications from lineups in the field presents both
advantages and disadvantages relative to showups. The clearest advantage of using lineups
is that under field conditions, witnesses were unlikely to make high-confidence innocent-
suspect identifications. As a result, high-confidence identifications were much more
likely to be accurate. Of course, the downside is that, because witnesses were more
conservative in their decision-making, lineups administered under the field conditions
also yielded fewer accurate culprit identifications overall.
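The notions of a more liberal or more conservative criterion can be made concrete with the standard equal-variance signal detection measures. The minimal Python sketch below computes discriminability (d') and response criterion (c) from a hit rate and a false-alarm rate. The function name and all rates are hypothetical and chosen only to illustrate the direction of a criterion shift; they are not estimates from this experiment, and the equal-variance Gaussian model is an assumption of the sketch rather than the analysis reported here.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion_c) under an equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1.
    A lower criterion_c means a more liberal tendency to choose.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion_c = -0.5 * (z_hit + z_fa)
    return d_prime, criterion_c


# Hypothetical rates, for illustration only (not data from this study).
lab_d, lab_c = sdt_measures(hit_rate=0.60, false_alarm_rate=0.20)
field_d, field_c = sdt_measures(hit_rate=0.70, false_alarm_rate=0.30)

print(f"lab:   d' = {lab_d:.2f}, c = {lab_c:.2f}")
print(f"field: d' = {field_d:.2f}, c = {field_c:.2f}")
```

In this illustration, both the hit rate and the false-alarm rate rise in the hypothetical field condition, so c falls while d' stays roughly similar. That is the signature of a more liberal criterion; a conservative shift would show the reverse, with both rates falling and c rising.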
The overconfidence of witnesses when making identifications from showups
conducted under field conditions presents a major concern for the criminal justice system,
as identifications that are initially asserted with a high degree of confidence are likely to
carry more weight with the police and prosecutors who make the decisions on charging
the crime and prosecuting the case.
Future Directions
Although controlled laboratory research will always be the gold standard for
studying how recognition memory, discriminability and related unemotional aspects of
"cold" cognition can influence witness performance when making identifications, the
field would surely benefit from more field-simulation studies done in collaboration with
law enforcement. Field-simulation studies have the potential to both improve the broader
impact of eyewitness research and to expand our understanding of how the situational
pressures of being a witness in an actual police investigation can influence witness
performance in ways that cannot be examined with typical lab paradigms. The current
experiment showed that many important findings from previous lab research generalized
well to field-conditions (e.g., greater false identifications with showups compared to
lineups, more choosing from lineups compared to showups, no live-superiority effect),
but also revealed that the situational pressures of being a witness in an actual police
investigation can affect eyewitness performance differently depending on the procedures
used to obtain the identification. Future field-simulation experiments can be used to test
promising new advances in procedures used to obtain eyewitness evidence under real-world
field conditions while still maintaining absolute control over the witnessing
conditions and circumstances surrounding the identification procedures used. Most
notably, new methods are urgently needed to reduce law enforcement’s reliance on the
use of dangerously suggestive showups. Testing new procedures under field-simulation
conditions would be a solid first step in demonstrating the efficacy of promising new
approaches before asking law enforcement to test these methods in actual cases, as done
with traditional field-experiments.
References
Berkowitz, S. R., Garrett, B. L., Fenn, K. M., & Loftus, E. F. (2020). Convicting with
confidence? Why we should not over-rely on eyewitness confidence. Memory, 1–6.
https://doi.org/10.1080/09658211.2020.1849308
Colloff, M. F., Wade, K. A., & Strange, D. (2016). Unfair lineups make witnesses more
likely to confuse innocent and guilty suspects. Psychological Science, 27(9),
1227–1239. https://doi.org/10.1177/0956797616655789
Colloff, M. F., & Wixted, J. T. (2020). Why are lineups better than showups? A test of
the filler siphoning and enhanced discriminability accounts. Journal of
Experimental Psychology: Applied, 26(1), 124–143.
https://doi.org/10.1037/xap0000218
Clark, S. E. (2012). Costs and benefits of eyewitness identification reform. Perspectives
on Psychological Science, 7(3), 238–259.
https://doi.org/10.1177/1745691612439584
Douglass, A. B., Lucas, C. A., & Brewer, N. (2020). Cowitness identification speed
affects choices from target-absent photo-spreads. Law and Human Behavior,
44(6), 474–484. https://doi.org/10.1037/lhb0000420
Eisen, M. L., Skerritt-Perta, A., Jones, J. M., Owen, J., & Cedré, G. C. (2017).
Pre-admonition suggestion in live showups: When witnesses learn that the cops
caught ‘the’ guy. Applied Cognitive Psychology, 31(5), 520–529.
https://doi.org/10.1002/acp.3349
Eisen, M. L., Smith, A. M., Olaguez, A. P., & Skerritt-Perta, A. S. (2017). An
examination of showups conducted by law enforcement using a field-simulation
paradigm. Psychology, Public Policy, and Law, 23(1), 1–22.
https://doi.org/10.1037/law0000115
Fitzgerald, R. J., Price, H. L., & Valentine, T. (2018). Eyewitness identification: Live,
photo, and video lineups. Psychology, Public Policy, and Law, 24(3), 307–325.
https://doi.org/10.1037/law0000164
Garrett, B. L. (2011). Convicting the innocent: Where criminal prosecutions go wrong.
Harvard University Press.
Gonzalez, R., Ellsworth, P. C., & Pembroke, M. (1993). Response biases in lineups and
showups. Journal of Personality and Social Psychology, 64(4), 525–537.
https://doi.org/10.1037/0022-3514.64.4.525
Gosling, S. D., Sandy, C. J., John, O. P., & Potter, J. (2010). Wired but not WEIRD: The
promise of the Internet in reaching more diverse samples. Behavioral and Brain
Sciences, 33, 94.
Gronlund, S. D., Carlson, C. A., Neuschatz, J. S., Goodsell, C. A., Wetmore, S. A.,
Wooten, A., & Graham, M. (2012). Showups versus lineups: An evaluation using
ROC analysis. Journal of Applied Research in Memory and Cognition, 1(4), 221–228.
https://doi.org/10.1016/j.jarmac.2012.09.003
Gronlund, S. D., Wixted, J., & Mickes, L. (2014). Evaluating eyewitness identification
procedures using receiver operating characteristic analysis. Current Directions in
Psychological Science, 23(1), 3–10. https://doi.org/10.1177/0963721413498891
Horry, R., Halford, P., Brewer, N., Milne, R., & Bull, R. (2014). Archival analyses of
eyewitness identification test outcomes: What can they tell us about eyewitness
memory? Law and Human Behavior, 38(1), 94.
https://doi.org/10.1037/lhb0000060
Mickes, L. (2015). Receiver operating characteristic analysis and confidence–accuracy
characteristic analysis in investigations of system variables and estimator
variables that affect eyewitness memory. Journal of Applied Research in Memory
and Cognition, 4(2), 93–102. https://doi.org/10.1016/j.jarmac.2015.01.003
National Research Council (2014). Identifying the Culprit: Assessing Eyewitness
Identification. Washington, DC: National Academy Press.
Sauerland, M., Sagana, A., & Sporer, S. L. (2012). Assessing nonchoosers’ eyewitness
identification accuracy from photographic showups by using confidence and
response-times. Law and Human Behavior, 36(5), 394–403.
https://doi.org/10.1037/h0093926
Sauerland, M., Sagana, A., Sporer, S. L., & Wixted, J. T. (2018). Decision time and
confidence predict choosers’ identification performance in photographic showups.
PLoS ONE, 13(1). https://doi.org/10.1371/journal.pone.0190416
Smith, A. M., Lampinen, J. M., Wells, G. L., Smalarz, L., & Mackovichova, S. (2019).
Deviation from perfect performance measures the diagnostic utility of eyewitness
lineups but partial Area Under the ROC Curve does not. Journal of Applied
Research in Memory and Cognition, 8(1), 50–59.
https://doi.org/10.1016/j.jarmac.2018.09.003
Smith, A. M., Smalarz, L., Ditchfield, R., & Ayala, N. T. (2021). Evaluating the Claim
that High Confidence Implies High Accuracy in Eyewitness Identification.
Psychology, Public Policy, and Law.
https://doi.org/10.13140/RG.2.2.28211.25122
Smith, A. M., Wells, G. L., Lindsay, R. C. L., & Penrod, S. D. (2017). Fair lineups are
better than biased lineups and showups, but not because they increase underlying
discriminability. Law and Human Behavior, 41(2), 127–145.
https://doi.org/10.1037/lhb0000219
Smith, A. M., Yang, Y., & Wells, G. L. (2020). Distinguishing between investigator
discriminability and eyewitness discriminability: A method for creating full
receiver operating characteristic curves of lineup identification
performance. Perspectives on Psychological Science, 15(3), 589–607.
https://doi.org/10.1177/1745691620902426
Steblay, N., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2003). Eyewitness accuracy rates
in police showup and lineup presentations: A meta-analytic comparison. Law and
Human Behavior, 27(5), 523–540. https://doi.org/10.1023/a:1025438223608
Weber, N., & Perfect, T. J. (2012). Improving eyewitness identification accuracy by
screening out those who say they don’t know. Law and Human Behavior, 36(1),
28–36. https://doi.org/10.1037/h0093976
Wells, G. L. (2001). Police lineups: Data, theory, and policy. Psychology, Public Policy,
and Law, 7(4), 791–801. https://doi.org/10.1037/1076-8971.7.4.791
Wells, G. L., Smalarz, L., & Smith, A. M. (2015). ROC analysis of lineups does not
measure underlying discriminability and has limited value. Journal of Applied
Research in Memory and Cognition, 4(4), 313–317.
https://doi.org/10.1016/j.jarmac.2015.08.008
Wells, G. L., Steblay, N. K., & Dysart, J. E. (2015). Double-blind photo lineups using
actual eyewitnesses: An experimental test of a sequential versus simultaneous
lineup procedure. Law and Human Behavior, 39, 1–14.
https://doi.org/10.1037/lhb0000096
Wetmore, S. A., Neuschatz, J. S., & Gronlund, S. D. (2014). On the power of secondary
confession evidence. Psychology, Crime & Law, 20(4), 339–357.
https://doi.org/10.1080/1068316X.2013.777963
Wixted, J. T., & Mickes, L. (2014). A signal-detection-based diagnostic-feature-detection
model of eyewitness identification. Psychological Review, 121(2), 262–276.
https://doi.org/10.1037/a0035940
Yarmey, A. D., Yarmey, M. J., & Yarmey, A. L. (1996). Accuracy of eyewitness
identifications in showups and lineups. Law and Human Behavior, 20(4), 459–477.
https://doi.org/10.1007/BF01498981