Australian Journal of Forensic Sciences, 2022, Vol. 54, No. 5, 664–680
https://doi.org/10.1080/00450618.2020.1865457
Published online: 24 Aug 2021
They are not destined to fail: a systematic examination of scores on embedded performance validity indicators in patients with intellectual disability
Isabelle Messa^a, Matthew Holcomb^b, Jonathan D Lichtenstein^c, Brad T Tyson^d, Robert M Roth^c and Laszlo A Erdodi^a

^a Department of Psychology, University of Windsor, Windsor, ON, Canada; ^b Jefferson Neurobehavioral Group, New Orleans, LA, USA; ^c Department of Psychiatry, Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA; ^d Neuropsychological Service, EvergreenHealth Medical Center, Kirkland, WA, USA
ABSTRACT
This study was designed to determine the clinical utility of embedded performance validity indicators (EVIs) in adults with intellectual disability (ID) during neuropsychological assessment. Based on previous research, unacceptably high (>16%) base rates of failure (BR_Fail) were predicted on EVIs based on the method of threshold, but not on EVIs based on alternative detection methods. A comprehensive battery of neuropsychological tests was administered to 23 adults with ID (M_Age = 37.7 years, M_FSIQ = 64.9). BR_Fail was computed at two levels of cut-offs for 32 EVIs. Patients produced very high BR_Fail on 22 EVIs (18.2%–100%), indicating unacceptable levels of false positive errors. However, on the remaining ten EVIs BR_Fail was <16%. Moreover, six of the EVIs had a zero BR_Fail, indicating perfect specificity. Consistent with previous research, individuals with ID failed the majority of EVIs at high BR_Fail. However, they produced BR_Fail similar to cognitively higher functioning patients on select EVIs based on recognition memory and unusual patterns of performance, suggesting that the high BR_Fail reported in the literature may reflect instrumentation artefacts. The implications of these findings for clinical and forensic assessment are discussed.

ARTICLE HISTORY
Received 29 October 2020
Accepted 25 November 2020

KEYWORDS
Intellectual disability; performance validity; embedded validity indicators; false positive rate; forensic assessment
Introduction

The prevalence of intellectual disability (ID) is difficult to determine due to differences in epidemiological and diagnostic practices around the world. However, it is thought to affect approximately 1% of the global population, with rates in low- and middle-income countries nearly double those in high-income ones [1]. Calculating the prevalence of ID within the United States is complicated by the inclusion of mild ID (traditionally defined as an IQ falling between 50 and 70), resulting in estimates ranging from 8.7 to 36.8 per 1,000 births [2-4]. The most recent edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [5] defines ID as a disorder that impacts an individual's general intellectual functioning, produces impairments in several domains of adaptive functioning, and has its onset in the developmental period.
CONTACT Laszlo A Erdodi lerdodi@gmail.com
© 2020 Australian Academy of Forensic Sciences
Historically, an IQ of 70 (i.e., two or more standard deviations below the normative mean) was a proverbial 'line in the sand' for diagnosing ID. In contrast, the DSM-5 allows for a 'floating' criterion, where the individual's adaptive behavioural functioning is taken into account. This distinction is important for several reasons. First, a flexible cut-off recognizes the measurement error inherent in the IQ test. Second, the floating criteria for diagnosing ID have been accepted by the United States Supreme Court (Hall v Florida).

In addition, the spectrum of ID is divided into severity ranges. The present study was restricted to individuals with mild ID (i.e., IQ between 50 and 70), for several reasons. First, given the focus on neurocognitive profiles, the scope of this investigation is restricted to 'testable' patients (i.e., individuals who are capable of completing a comprehensive battery of neurocognitive tests). Performance validity in examinees who cannot participate in psychometric testing due to severe sensory-motor deficits is a moot point. Second, mild ID is the most prevalent severity range within the United States (85% of ID cases). Third, within the paediatric neuropsychological literature there is a lack of consensus about the developmental abilities needed to understand engagement, effort, and deception [6,7]. Individuals within the mild range of ID are thought to function at a mental age between 9 and 12 years, which coincides with the lower age limit for meaningful performance validity testing [8-12].
Nevertheless, a DSM-5 diagnosis of ID requires both a clinical assessment and standardized intelligence testing. As such, the validity of the diagnosis depends on the integrity of the psychometric data. To make an accurate diagnosis, clinicians must be confident that test scores provide a true reflection of an individual's intellectual and adaptive functioning. The assessment of adaptive functioning incorporates several independent sources of information, such as self- and informant-reports, objective measures of activities of daily living, and the individual's medical, educational and social history, including current life circumstances. The validity of the FSIQ, however, is difficult to assess through behavioural observations or clinical judgement [13,14]. As such, assessors typically rely on psychometric measures to determine the validity of an individual's performance.

Performance validity is the extent to which scores on cognitive tests accurately reflect an examinee's true ability level. Bigler [15,16] refers to it as the 'Achilles heel' of cognitive testing, as it exposes a serious weakness in an otherwise robust system. In clinical neuropsychology, the assessment of performance validity has become a standard component of evaluations, and its routine use is recommended by various professional organizations [17-20] and PVT researchers [21-25]. Moreover, best practice guidelines prescribe the use of multiple performance validity tests (PVTs) throughout the assessment, tapping varying cognitive domains [20].
By design, there are two types of PVTs: free-standing instruments and embedded validity indicators (EVIs). Free-standing PVTs were developed specifically to assess the credibility of a given neurocognitive profile, whereas EVIs are nested within standard neuropsychological tests. As such, EVIs provide a second function without requiring additional resources, enabling clinicians to simultaneously assess both cognitive function and credibility of performance. Although free-standing PVTs have long been considered the gold standard, and generally possess superior signal detection properties relative to their embedded counterparts [26], they are more cumbersome in that they require additional test material, take up valuable assessment time, and contribute to patient fatigue while failing to provide information regarding an individual's cognitive functioning.

EVIs, on the other hand, are more efficient and cost-effective, as they use data already being collected for clinical purposes [27]. They are also less susceptible to coaching, as they are less readily identifiable as validity indicators [28]. Though EVIs in isolation tend to have inferior signal detection relative to free-standing PVTs, aggregating several EVIs across the testing battery not only provides for the inconspicuous measurement of performance validity at several points, but has also been shown to improve classification accuracy to levels comparable with standalone measures [23]. A fundamental limitation of EVIs, however, is their potential to confound invalid performance with genuine deficits in cognitive ability [29]. One context in which this confound may be especially consequential is in the assessment of ID.
Indeed, a primary concern of performance validity assessment in individuals with ID is whether the classification accuracy of commonly used PVTs generalizes to this population [30]. Genuine and severe cognitive deficits have been shown to result in unacceptably high false positive rates on many PVTs. Given that ability and effort are especially intertwined in EVIs, at face value they are particularly susceptible to this confound. To further complicate matters, differentiating bona fide and feigned ID is extremely challenging in the presence of external incentive to appear impaired [30-32]. Chafetz [30] aptly described this as a 'chicken-egg problem' (i.e., is a FSIQ <70 an extenuating circumstance for PVT failures, or an invalid result because of them?), leaving clinicians sensitized to a significant diagnostic issue without offering concrete, practical solutions.

Although feigning ID to qualify for disability payments or other social services seems a plausible scenario [30], a more extreme and compelling incentive to malinger emerged at the turn of the century, when the US Supreme Court determined that individuals with ID should be exempt from the death penalty in the case of Atkins v. Virginia [33]. At the time, Justice Scalia voiced his dissent, citing concerns that this would incentivize malingered ID among defendants trying to evade the death penalty [34]. Given the potentially extreme consequences of a false positive error (erroneously labelling an individual with genuine ID as a malingerer) in these cases, there has since been increased emphasis on the development of PVTs that are appropriate for use in determining the presence or absence of ID [34-36].
Though some measures have been identified as potentially useful in this population [34,35], most of the evidence suggests that the PVTs applied to the assessment of ID have an unacceptably high (>16%) [37] false positive rate [36,38-43]. Progress on the issue of high false positive rates on PVTs among patients with ID is hindered by the common practice of excluding individuals with FSIQ <75 from cross-validation studies as a methodological safeguard against contaminating criterion grouping [44-52].
Despite these discouraging findings, there may be a psychometric solution to the 'chicken-egg problem' [30]. Rather than relying on the method of threshold (i.e., identifying a cut-off on the ability scale below which patients with genuine cognitive impairment rarely score), assessors could explore PVTs based on alternative detection mechanisms. On rational grounds, PVTs designed to identify internal inconsistencies and neurologically implausible patterns of performance may provide an accurate measure of performance validity even in the presence of credible severe cognitive deficits. EVIs could be particularly well-suited to this task, effectively transforming their primary weakness (i.e., that the measurement of cognitive ability and performance validity is inextricably intertwined within a task) into a strength. For example, Erdodi et al. [53] found that although two EVIs nested within measures of visuomotor processing speed had adequate specificity overall, they were associated with unacceptably high false positive rates in patients with severe traumatic brain injury. However, the absolute value of the difference between the two scores maintained high specificity. Their findings were subsequently replicated in a forensic sample [54-59], reinforcing the potential of derivative EVIs to detect non-credible responding even in examinees with genuine and severe cognitive impairment.
The current study was designed to systematically evaluate a large number of EVIs representing a variety of cognitive domains, difficulty levels, and detection mechanisms to determine their suitability for clinical and forensic use in patients with ID. Based on previous research, we predicted a high overall failure rate (>50%). At the same time, we hypothesized that patients would pass EVIs designed to detect neurologically implausible patterns of performance, because these indices monitor the internal consistency of test-taking behaviour and do not penalize examinees for genuine impairment.
Method

Participants

A consecutive case sequence of 23 adults with ID was selected from an archival data set of patients referred for neuropsychological testing at a tertiary care hospital in the Northeastern US. Inclusion criteria were: 1. FSIQ <75 (to account for the standard error of measurement around the point estimate of 70); 2. A documented history of developmental delay and deficits in adaptive functioning; and 3. A full administration of the California Verbal Learning Test – Second Edition (CVLT-II). Having data on the CVLT-II (an extensive measure of verbal memory) ensured that patients were able to complete a comprehensive battery of tests and thus provided the opportunity to examine their full EVI profile. The majority of the sample was male (12 or 55%) and right-handed (15 or 68%). Mean age was 37.7 years (SD = 14.1). Mean FSIQ was 64.9 (SD = 5.1; range: 57–74).
Materials and procedure

A comprehensive battery of neuropsychological tests was administered to all patients, encompassing five core domains: attention/processing speed, memory, language, executive function and manual dexterity. Psychometric testing was administered and scored by Masters- or doctoral-level psychometrists working under the supervision of licenced clinical neuropsychologists. The study was approved by the hospital's research ethics board. APA ethical guidelines regulating research with human participants were followed throughout the process.
Data analysis

Given the scope of the study, the base rate of failure (BR_Fail) was the main outcome of interest. BR_Fail is a descriptive statistic, representing the percentage of people in the sample who failed a given cut-off. As ID is considered a category exempt from PVTs, the performance of the entire sample is considered credible on neuropsychological testing. As such, BR_Fail is conceptually equivalent to the false positive rate (i.e., 100 – specificity). In clinical and forensic settings, the highest acceptable false positive rate is 16% [37], although ≤10% is the emerging threshold.
Where possible, BR_Fail was calculated at two different levels: liberal and conservative. Liberal cut-offs prioritize sensitivity over specificity. Consequently, they are more likely to detect non-credible responding, but are prone to false positive errors. In contrast, conservative cut-offs prioritize specificity over sensitivity. As such, they are less likely to detect non-credible responding, but when they do, they are more likely to be correct. In the data tables, BR_Fail values ≤16% meet the minimum specificity standard, and instances of zero BR_Fail indicate perfect specificity.
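Because BR_Fail is a simple descriptive statistic, it can be computed directly from raw scores once a cut-off and a failure direction are fixed. A minimal sketch in Python, using hypothetical scores (the function and data below are ours for illustration, not part of the study's analysis pipeline):

```python
from typing import Sequence

def br_fail(scores: Sequence[float], cutoff: float, fail_low: bool = True) -> float:
    """Base rate of failure: % of a sample failing a validity cut-off.

    In a sample whose performance is assumed credible (here, patients with
    bona fide ID), BR_Fail estimates the false positive rate (100 - specificity).
    """
    if fail_low:   # most ability-based EVIs: fail at or below the cut-off
        fails = sum(score <= cutoff for score in scores)
    else:          # discrepancy-based EVIs: fail at or above the cut-off
        fails = sum(score >= cutoff for score in scores)
    return 100.0 * fails / len(scores)

# Hypothetical Reliable Digit Span scores for a small credible sample
rds = [6, 7, 8, 7, 5, 9, 6, 7, 8, 6]
print(br_fail(rds, cutoff=7))   # liberal cut-off (RDS <= 7) -> 70.0
print(br_fail(rds, cutoff=6))   # conservative cut-off (RDS <= 6) -> 40.0
```

The liberal cut-off flags more of this (credible) sample, which is exactly why conservative cut-offs are preferred when the priority is protection against false positive errors.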
Results

BR_Fail on EVIs within tests of attention and processing speed

The majority of the sample (mean BR_Fail = 75.7%; SD = 30.3) failed the liberal cut-offs, with one notable exception: the absolute value of the difference between age-corrected scaled scores on the Coding and Symbol Search subtests of the WAIS-IV (|CD – SS|), which had a BR_Fail of zero. The logic behind this EVI is that the normative difference between the scores on these two tests measuring similar constructs (i.e., visuomotor speed) is small [60]. A large discrepancy suggests that the lower score of the pair likely underestimates the examinee's true ability level. The present results suggest that patients demonstrated an even performance across both tests, producing a valid profile. At conservative cut-offs, BR_Fail declined (M = 51.3%; SD = 24.1), but remained well above the acceptable threshold (42.9%–85.7%). Further details are provided in Table 1.
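To make the derivative logic concrete, here is a minimal sketch (hypothetical scaled scores; the ≥3 cut-off is the liberal one from Table 1):

```python
def cd_ss_discrepancy_fail(coding: int, symbol_search: int, cutoff: int = 3) -> bool:
    """|CD - SS| EVI: flag an implausibly uneven pair of scores on two
    subtests that measure the same construct (visuomotor processing speed)."""
    return abs(coding - symbol_search) >= cutoff

print(cd_ss_discrepancy_fail(4, 5))   # uniformly low, credible ID profile -> False (pass)
print(cd_ss_discrepancy_fail(3, 9))   # uneven 'slip-up' profile -> True (fail)
```

A genuinely impaired examinee scores low on both subtests and passes; only an implausibly uneven pair of scores is flagged, which is why this EVI does not penalize low ability.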
Table 1. EVIs within tests of attention and processing speed.

EVI               LIB Cut-Off   LIB BR_Fail   CON Cut-Off   CON BR_Fail   References
RDS               ≤7            86.4          ≤6            50.0          61,62,63,53
DS_WAIS-IV        ≤6            100.0         ≤4            42.9          29,63
CD_WAIS-IV        ≤5            93.3          ≤4            85.7          65,66,54,67
SS_WAIS-IV        ≤6            94.7          ≤4            68.4          68,69
|CD – SS|         ≥3            0.0           –             –             70,71
Trails_D-KEFS 1   ≤5            64.3          ≤4            42.9          127,72
Trails_D-KEFS 2   ≤5            78.6          ≤4            64.3          72,73
Trails_D-KEFS 3   ≤5            85.7          ≤4            64.3          74,75
Trails_D-KEFS 5   ≤8            78.6          ≤5            42.9          76,77

EVI: Embedded validity indicator; RDS: Reliable Digit Span; DS: Digit Span age-corrected scaled score (ACSS; M = 10; SD = 3); WAIS-IV: Wechsler Adult Intelligence Scale – Fourth Edition; CD: Coding ACSS; SS: Symbol Search ACSS; Trails_D-KEFS: Trails of the Delis-Kaplan Executive Function System (ACSS; M = 10; SD = 3); LIB: Liberal (optimized for sensitivity); CON: Conservative (optimized for specificity); BR_Fail: Base rate of failure (% of the sample classified as non-credible at a given cut-off).

BR_Fail on EVIs within memory tests

Across nine EVIs at liberal cut-offs, mean BR_Fail was 42.7% (SD = 28.2). Minimum acceptable specificity was achieved on the CVLT-II Forced Choice Recognition (FCR), RCFT Yes/No Recognition and True Positives. Predictably, mean BR_Fail at conservative cut-offs was
notably lower (19.1%; SD = 19.5). More important, only 9.1% of the sample failed the FCR_CVLT-II cut-off, achieving >.90 specificity. Remarkably, no one failed the RCFT recognition cut-offs (Table 2). Likewise, all three trials of Logical Memory (Immediate and Delayed Free Recall; Yes/No Recognition) produced BR_Fail <16%. However, BR_Fail remained high on the validity cut-offs embedded within the CVLT-II acquisition trials (63.6%) and the RCFT copy trial (33.3%).
BR_Fail on EVIs within language tests

At liberal cut-offs, BR_Fail on all five EVIs within language tests was above the minimum acceptable threshold (mean = 58.1%; SD = 26.8). However, at conservative cut-offs, mean BR_Fail was lower (38.4%; SD = 26.4). In addition, one derivative EVI based on a discrepancy score (Vocabulary minus Digit Span) had a zero BR_Fail even at the most liberal cut-off available (Table 3).
BR_Fail on EVIs within tests of executive function

Mean BR_Fail was 45.1% (SD = 35.8) at liberal cut-offs. One derivative EVI based on the ratio between two raw scores achieved .90 specificity. At conservative cut-offs, BR_Fail was slightly lower (M = 34.1%; SD = 34.6). Although two of the EVIs had a zero BR_Fail, the remaining three had unacceptably high failure rates (35.7–78.6%). Table 4 provides additional details.
BR_Fail on EVIs within tests of manual dexterity

On motor measures, BR_Fail remained high across levels of cut-off (liberal: M = 68.8%, SD = 9.7; conservative: M = 45.1%, SD = 18.2), tests (GPB and FTT), and whether the dominant or non-dominant hand was used (Table 5).
Table 2. EVIs within memory tests.

EVI             LIB Cut-Off   LIB BR_Fail   CON Cut-Off   CON BR_Fail   References
T 1–5_CVLT-II   ≤37           72.7          ≤31           63.6          78,79
RH_CVLT-II      ≤11           27.3          ≤10           18.2          80,81
FCR_CVLT-II     ≤15           13.6          ≤14           9.1           82,83,84,85,86
RCFT Copy       ≤26           88.9          ≤23           33.3          87,127,129
RCFT REC        ≤16           11.1          ≤13           0.0           88
RCFT TP         ≤6            12.5          ≤4            0.0           88,89,130
LM I            ≤3            47.4          ≤1            15.8          90
LM II           ≤4            57.9          ≤2            15.8          91
LM REC          ≤20           52.6          ≤16           15.8          76,92

EVI: Embedded validity indicator; T 1–5_CVLT-II: Acquisition trials (sum of scores on Trials 1 through 5) on the California Verbal Learning Test – Second Edition (raw scores); RH_CVLT-II: Yes/No Recognition hits on the CVLT-II (raw scores); FCR_CVLT-II: Forced Choice Recognition trial on the CVLT-II (raw scores); RCFT: Rey Complex Figure Test (raw scores); REC: Yes/No Recognition (raw scores); TP: Yes/No Recognition true positives (raw scores); LM: Logical Memory; I: Immediate Recall age-corrected scaled score (ACSS; M = 10; SD = 3); II: Delayed Recall age-corrected scaled score (ACSS; M = 10; SD = 3); LM REC: Recognition (raw scores); LIB: Liberal (optimized for sensitivity); CON: Conservative (optimized for specificity); BR_Fail: Base rate of failure (% of the sample classified as non-credible at a given cut-off).
Table 3. EVIs within language tests.

EVI                 LIB Cut-Off   LIB BR_Fail   CON Cut-Off   CON BR_Fail   References
Animals             ≤33           75.0          ≤29           56.3          93,94
BNT                 ≤37           78.9          ≤33           47.4          62,63,94,95
CIM_BDAE            ≤9            41.2          ≤8            23.5          69,96,97,98
FAS                 ≤33           76.5          ≤29           64.7          69,99,100
VC – DS_WAIS-IV     ≥3            19.0          ≥5            0.0           101,102,103,131

EVI: Embedded validity indicator; Animals: Category fluency test T-scores (M = 50; SD = 10) based on norms by Heaton et al. (2004) [134]; BNT: Boston Naming Test T-scores (M = 50; SD = 10) based on norms by Heaton et al. (2004) [134]; CIM_BDAE: Complex Ideational Material subtest of the Boston Diagnostic Aphasia Examination (raw scores); FAS: Letter fluency test T-scores (M = 50; SD = 10) based on norms by Heaton et al. (2004) [134]; VC – DS: Vocabulary minus Digit Span age-corrected scaled scores (M = 10; SD = 3); WAIS-IV: Wechsler Adult Intelligence Scale – Fourth Edition; LIB: Liberal (optimized for sensitivity); CON: Conservative (optimized for specificity); BR_Fail: Base rate of failure (% of the sample classified as non-credible at a given cut-off).

Table 4. EVIs within tests of executive function.

EVI                 LIB Cut-Off   LIB BR_Fail   CON Cut-Off   CON BR_Fail   References
FMS_WCST            ≥2            15.4          ≥3            0.0           64,104,105,106
UE_WCST             ≥1            66.7          ≥4            41.7          64,65,107
PMM_WCST            –             –             ≥1            55.6          108
Trails_D-KEFS 4     ≤4            84.6          ≤1            76.9          109
Trails_D-KEFS 4/2   ≤1.60         15.4          ≤1.50         0.0           110,111,112,113,114

EVI: Embedded validity indicator; FMS_WCST: Failures to maintain set on the Wisconsin Card Sorting Test (raw score); UE_WCST: Unique errors ('Other' responses) on the Wisconsin Card Sorting Test (raw score); PMM_WCST: Perfect matches missed on the Wisconsin Card Sorting Test (raw score); Trails_D-KEFS 4: Trails 4 of the Delis-Kaplan Executive Function System (ACSS; M = 10; SD = 3); Trails_D-KEFS 4/2: Raw score ratio of Trails 4/Trails 2 of the Delis-Kaplan Executive Function System; LIB: Liberal (optimized for sensitivity); CON: Conservative (optimized for specificity); BR_Fail: Base rate of failure (% of the sample classified as non-credible at a given cut-off).

Table 5. EVIs within tests of manual dexterity.

EVI      LIB Cut-Off   LIB BR_Fail   CON Cut-Off   CON BR_Fail   References
GPB DH   ≤31           73.3          ≤27           66.7          115
GPB ND   ≤31           80.0          ≤27           53.3          116
FTT DH   ≤33           58.3          ≤25           33.0          117
FTT ND   ≤33           63.6          ≤25           27.3          117

EVI: Embedded validity indicator; GPB: Grooved Pegboard Test T-scores (M = 50; SD = 10) based on norms by Heaton et al. (2004) [134]; DH: Dominant hand; ND: Non-dominant hand; FTT: Finger Tapping Test T-scores (M = 50; SD = 10) based on norms by Heaton et al. (2004) [134]; LIB: Liberal (optimized for sensitivity); CON: Conservative (optimized for specificity); BR_Fail: Base rate of failure (% of the sample classified as non-credible at a given cut-off).

Discussion

This study was designed to empirically evaluate the doctrine within clinical neuropsychology that patients with ID should be exempt from performance validity testing because of the genuine and severe cognitive impairment inherent in the diagnosis. We predicted that on EVIs nested within tests that are globally sensitive to diffuse neurological deficits, BR_Fail would be high – well above the acceptable threshold. However, on derivative indices designed to measure neurologically implausible fluctuations in cognitive ability, we expected to find significantly lower BR_Fail.
Results support both hypotheses. Overall BR_Fail was unacceptably high across all five neurocognitive domains, consistent with previous research [43]. Although, predictably, conservative cut-offs were more efficient at containing BR_Fail overall (19.1%–51.3%) compared to liberal cut-offs (42.7%–75.7%), they failed to provide a uniformly effective safeguard against false positive errors.
In contrast, on ten out of the 32 EVIs examined in this study, BR_Fail was below 16%. Among these, six EVIs had a zero false positive rate, converging with the promising results of earlier investigations [35]. This is an important finding, as zero BR_Fail is rare even in cognitively high functioning samples. As a reference, An et al. [118] reported BR_Fail on EVIs within Digit Span and verbal fluency ranging from 5.8% to 10.9% in healthy university students who volunteered for academic research and were instructed to perform to the best of their abilities. Their findings were subsequently replicated by Abeare et al. [55] and Roye et al. [119]. Therefore, having identified several EVIs with perfect specificity in patients with ID is a remarkable (although not unprecedented) finding.
Clinical implications

Detection methods matter

Feigning symptoms and disability is a ubiquitous phenomenon in the general population [28,120] and relatively common in clinical settings [121,122]. The most common approach to differentiating credible from non-credible responding is the method of threshold: identifying a cut-off score that most (≥90%) individuals with genuine impairment can pass. However, the cumulative evidence (including the present study) suggests that cognitive impairment in patients with ID is so severe and pervasive that traditional PVTs routinely misclassify them as non-credible [41,43].
Some EVIs remain effective in patients with ID

Our findings revealed that alternative detection methods based on patterns of performance provide a promising psychometric solution to the dilemma, consistent with previous research [35,54]. Three of the six EVIs with zero BR_Fail are derivative indices: they combine information from two test scores and evaluate the credibility of the profile based on the relationship between them, not absolute performance on individual tasks. As such, patients with ID who demonstrate their true ability on testing will produce consistently low scores and thus easily pass the validity cut-offs. In contrast, individuals attempting to feign low cognitive ability in general, or ID specifically, may be prone to 'slip-ups' (i.e., accidentally performing well on one of the tests and thus failing the derivative EVI).

Another detection method for PVTs is the violation of the inherent difficulty gradient in cognitive tasks (i.e., doing well on difficult tests while doing poorly on easy ones). An example would be higher performance on the genuinely difficult acquisition trials and delayed free recall of a memory test compared to the significantly easier recognition testing. Results show that, as a group, the patients within the sample demonstrated the normative pattern of performance (high BR_Fail on EVIs based on acquisition trials, with a marked improvement on recognition trials), indicating credible responding.
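A minimal sketch of this detection method (the flagging rule and scores below are hypothetical illustrations of the principle, not a validated cut-off from the study):

```python
def violates_difficulty_gradient(free_recall: int, recognition: int) -> bool:
    """Flag a neurologically implausible profile: free recall (hard) should
    not exceed recognition (easy) of the same learned material."""
    return free_recall > recognition

print(violates_difficulty_gradient(3, 14))   # poor recall, better recognition -> False (credible)
print(violates_difficulty_gradient(10, 6))   # better on the harder task -> True (implausible)
```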
Overall, results suggest that a handful of EVIs continue to be useful in patients with ID, as they protect against false positive errors. Equally important, the findings consolidate the signal detection profile of a number of EVIs, providing another source of empirical evidence for their specificity and refuting reflexive claims that failing a validity cut-off is likely a false positive error even in cognitively high functioning examinees.
Forensic implications

In the presence of strong external incentives to appear impaired [5], it is incumbent on the assessor to verify the veracity of test scores suggesting extremely low ability [17-20,22,25], as failing to detect non-credible deficits comes at a high societal cost [123,124]. In the context of criminal justice, establishing or ruling out ID can play a central role in high-stakes legal arguments such as competency to stand trial or carrying out the death penalty [35,36,125]. Given the 'exempt status' from performance validity assessment in ID, assessors are faced with the 'chicken-egg problem' [30]: are PVT failures the natural consequence of an ID diagnosis, or evidence that the examinee is feigning the condition?

The present study provides a list of PVTs with perfect specificity in patients with bona fide ID. Consequently, if examinees with an external incentive to feign ID and insufficient evidence of a developmental history of adaptive deficits fail these EVIs, assessors can make the case for non-credible presentation with reasonable certainty. Naturally, the evidence presented in this study is far from conclusive. However, it provides forensic experts with a compelling example that not all PVTs are contaminated by the genuine and severe cognitive deficits characteristic of ID. Therefore, a mere claim of ID would not automatically render a carefully calibrated PVT arsenal null and void.
Limitations

The findings should be interpreted in light of the study's limitations. The most obvious one is the small sample size, although it is actually larger than in some previously published studies [40]. To some extent, the small sample is an inevitable trade-off for having data on a comprehensive battery of commonly used neuropsychological tests. In contrast, many assessors evaluating patients with ID administer abbreviated batteries with lower cognitive load, and rely on informant report of their adaptive functioning. Second, this is a relatively high functioning sample (reading level was broadly within the normal range), which is likely an artefact of a full CVLT-II administration being one of the inclusion criteria. While this was a necessary condition to obtain a sample with sufficient data to answer the main research question, it also means that the study was restricted to 'testable' patients (i.e., with high enough cognitive functioning and mental stamina to complete several hours' worth of psychometric testing). Thus, results may not generalize to patients with significantly lower overall cognitive functioning. Finally, BR_Fail was reported only for individual EVIs. While this is a necessary step in the early stages of the investigation, combining the evidence using composite measures of performance validity has been shown to improve classification accuracy in general [26,126,127] and in examinees with ID specifically [42]. Future investigations would benefit from exploring multivariate models of EVIs.
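As a rough illustration of the multivariate idea, failures can be aggregated across several EVIs into a single composite (hypothetical examinee scores and a simple failure count; the cited studies use more refined, validated composites):

```python
def composite_failures(scores: dict[str, float], cutoffs: dict[str, float]) -> int:
    """Count how many EVIs an examinee fails (score at or below each cut-off).

    All three example EVIs fail in the 'low' direction; discrepancy-based
    EVIs would additionally need a direction flag.
    """
    return sum(scores[evi] <= cutoffs[evi] for evi in cutoffs)

# Conservative cut-offs drawn from Tables 1-2; the examinee's scores are hypothetical
cutoffs = {"RDS": 6, "FCR_CVLT-II": 14, "RCFT_REC": 13}
examinee = {"RDS": 7, "FCR_CVLT-II": 16, "RCFT_REC": 18}
print(composite_failures(examinee, cutoffs))   # 0 failed EVIs -> credible profile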
Strengths

The study also has several strengths. Patients were administered a comprehensive battery of neuropsychological tests that contained 32 EVIs, providing an opportunity to simultaneously examine, compare and contrast a large number of validity indices tapping various cognitive domains and using different detection mechanisms. The present investigation extended and consolidated previous sporadic reports that carefully chosen PVTs can maintain appropriate levels of specificity. As such, it provides clinical and forensic assessors with actionable practical knowledge on test selection and interpretation.
Conclusions

Assessors should continue to exercise great caution when they encounter EVI failures in patients with ID, as this population is indeed vulnerable to high false positive rates. At the same time, assuming that all examinees with ID will fail all EVIs by virtue of their diagnosis is an overgeneralization that is inconsistent with the empirical evidence. Rather, the present study identified six EVIs with a zero false positive rate – lower than what was observed in higher functioning samples in previous research [55,128-131]. The fact that a minority of EVIs performed surprisingly well underscores the importance of psychometric diversity (in cognitive domain, difficulty level, and detection mechanism) in PVT development and deployment, a frequently echoed conclusion of the research literature [21,23,132].

Although our results suggest that the unacceptably high BR_Fail on most EVIs cannot be contained by simply applying more conservative cut-offs, we uncovered a number of EVIs that demonstrated high specificity even at commonly used cut-offs. If replicated by future investigations, these instruments can provide effective psychometric tools for monitoring the credibility of neurocognitive profiles in this patient population. Since non-credible responding can coexist with any type of neurological disorder, it is important to have EVIs that can carry out their mission even in the unusually challenging signal detection environment created by the presence of genuine and severe cognitive deficits. Moreover, they may be used to detect feigned ID in forensic settings, filling an existing void in the current assessment methods [36,38,43,125,133].
Acknowledgments

Relevant ethical guidelines were followed throughout the project. All data collection, storage and processing were done with the approval of the relevant institutional authorities regulating research involving human participants, in compliance with the 1964 Helsinki Declaration and its subsequent amendments or comparable ethical standards.
Disclosure statement

No potential conflict of interest was reported by the author(s).
ORCID
Brad T Tyson http://orcid.org/0000-0002-5113-790X
Laszlo A Erdodi http://orcid.org/0000-0003-0575-9991
References
1. Maulik PK, Mascarenhas MN, Mathers CD, Dua T, Saxena S. Prevalence of intellectual disability:
a meta-analysis of population-based studies. Res Dev Disabil. 2011;32(2):419–436.
2. Bhasin TK, Brocksen S, Avchen RN, Braun KVN. Prevalence of four developmental disabilities
among children aged 8 years: metropolitan Atlanta developmental disabilities surveillance
program, 1996 and 2000. Morbidity Mortality Week Rep. 2006;55(SS01):1–9.
3. Boyle CA, Lary JM. Prevalence of selected developmental disabilities in children 3-10 years of
age: the Metropolitan Atlanta developmental disabilities surveillance program, 1991.
Morbidity Mortality Week Rep. 1996;45(SS02):1–14.
4. Camp BW, Broman SH, Nichols PL, Le M. Maternal and neonatal risk factors for mental
retardation: defining the "at-risk" child. Early Hum Dev. 1998;50(2):159–173.
5. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (5th
ed.). Washington (DC): Author; 2013.
6. Kirkwood MW, Kirk JW. The base rate of suboptimal effort in a pediatric mild TBI sample: performance on the medical symptom validity test. Clin Neuropsychol. 2010;24(5):860–872.
7. Kirkwood MW, Kirk JW, Blaha RZ, Wilson P. Noncredible effort during pediatric neuropsychological exam: a case series and literature review. Child Neuropsychol. 2010;16(6):604–618.
8. Abeare CA, Messa I, Zuccato BG, Merker B, Erdodi LA. Prevalence of invalid performance on
baseline testing for sport-related concussion by age and validity indicator. JAMA Neurol.
2018;75(6):697–703. doi:10.1001/jamaneurol.2018.0031.
9. Blaskewitz N, Merten T, Kathmann N. Performance of children on symptom validity tests:
TOMM, MSVT, and FIT. Arch Clin Neuropsychol. 2008;23(4):379–391.
10. Constantinou M, McCaffrey RJ. Using the TOMM for evaluating children's effort to perform
optimally on neuropsychological measures. Child Neuropsychol. 2003;9(2):81–90.
11. Holcomb MJ. Pediatric performance validity testing: state of the field and current research.
J Pedia Neuropsychol. 2018;4:83–85. doi:10.1007/s40817-018-00062-y.
12. Lichtenstein JD, Erdodi LA, Linnea KS. Introducing a forced-choice recognition task to the
California verbal learning test – children’s version. Child Neuropsychol. 2017;23(3):284–299.
doi:10.1080/09297049.2015.1135422.
13. Dandachi-FitzGerald B, Ponds RW, Merten T. Symptom validity and neuropsychological
assessment: a survey of practices and beliefs of neuropsychologists in six European
countries. Arch Clin Neuropsychol. 2013;28(8):771–783.
14. Heaton RK, Smith HH, Lehman RAW, Vogt AT. Prospects for faking believable deficits on
neuropsychological testing. J Consult Clin Psychol. 1978;46(5):892–900. doi:10.1037/0022-
006X.46.5.892.
15. Lezak MD, Howieson DB, Bigler ED, Tranel D. Neuropsychological assessment. New York:
Oxford University Press; 2012.
16. Bigler ED. Neuroimaging as a biomarker in symptom validity and performance validity
testing. Brain Imaging Behav. 2015;9(3):421–444. doi:10.1007/s11682-015-9409-1.
17. Bush SS, Heilbronner RL, Ruff RM. Psychological assessment of symptom and performance validity, response bias, and malingering: official position of the association for scientific advancement in psychological injury and law. Psychol Inj Law. 2014;7(3):197–205.
18. Bush SS, Ruff RM, Troster AI, Barth JT, Koffler SP, Pliskin NH, . . . Silver CH. Symptom validity assessment: practice issues and medical necessity (NAN Policy and Planning Committee). Arch Clin Neuropsychol. 2005;20:419–426.
19. Chafetz MD, Williams MA, Ben-Porath YS, Bianchini KJ, Boone KB, Kirkwood MW, Larrabee GJ, Ord JS. Official position of the American academy of clinical neuropsychology social security administration policy on validity testing: guidance and recommendations for change. Clin Neuropsychol. 2015;29(6):723–740.
20. Heilbronner RL, Sweet JJ, Morgan JE, Larrabee GJ, Millis SR, Conference Participants. American academy of clinical neuropsychology consensus conference statement on the neuropsychological assessment of effort, response bias, and malingering. Clin Neuropsychol. 2009;23:1093–1129. doi:10.1080/13854040903155063.
21. Boone KB. The need for continuous and comprehensive sampling of effort/response bias during neuropsychological examination. Clin Neuropsychol. 2009;23(4):729–741. doi:10.1080/13854040802427803.
22. Chafetz M. Reducing the probability of false positives in malingering detection of social
security disability claimants. Clin Neuropsychol. 2011;25(7):1239–1252. doi:10.1080/
13854046.2011.586785.
23. Larrabee GJ. Assessment of malingering. In: Larrabee GJ, editor. Forensic neuropsychology:
a scientific approach. Second ed. New York: Oxford University Press; 2012. p. 116–159.
24. Millis SR. What clinicians really need to know about symptom exaggeration, insufficient effort,
and malingering: statistical and measurement matters. In: Morgan JE, Sweet JJ, editors.
American academy of clinical neuropsychology/psychology press continuing education ser-
ies. Neuropsychology of malingering casebook. Psychology Press; 2009. p. 21–37.
25. Schutte C, Axelrod BN, Montoya E. Making sure neuropsychological data are meaningful: use
of performance validity testing in medicolegal and clinical contexts. Psychol Inj Law. 2015;8
(2):100–105.
26. Larrabee GJ. Aggregation across multiple indicators improves the detection of malingering:
relationship to likelihood ratios. Clin Neuropsychol. 2008;22:410–425. doi:10.1080/
13854040701494987.
27. Miele AS, Gunner JH, Lynch JK, McCaffrey RJ. Are embedded validity indices equivalent to
free-standing symptom validity tests? Arch Clin Neuropsychol. 2012;27(1):10–22.
28. Boone KB. Clinical practice of forensic neuropsychology. New York (NY): Guilford; 2013.
29. Erdodi LA, Lichtenstein JD. Invalid before impaired: an emerging paradox of embedded validity
indicators. Clin Neuropsychol. 2017;31(6–7):1029–1046. doi:10.1080/13854046.2017.1323119.
30. Chafetz MD. Symptom validity issues in the psychological consultative examination for social
security disability. Clin Neuropsychol. 2010;24(6):1045–1063.
31. Chafetz MD, Abrahams JP, Kohlmaier J. Malingering on the Social Security disability con-
sultative exam: a new rating scale. Arch Clin Neuropsychol. 2007;22(1):1–14.
32. Chafetz MD, Prentkowski E, Rao A. To work or not to work: motivation (not low IQ) determines
symptom validity test findings. Arch Clin Neuropsychol. 2011;26(4):306–313.
33. Atkins v. Virginia, 536 U.S. 304 (2002).
34. Chafetz MD, Biondolillo A. Validity issues in Atkins death cases. Clin Neuropsychol. 2012;26
(8):1358–1376.
35. Barker A, Musso MW, Jones GN, Roid G, Gouvier D. Unreliable block span reveals simulated
intellectual disability on the Stanford-Binet intelligence scales. Appl Neuropsychol Adult.
2014;21(1):51–59.
36. Salekin KL, Doane BM. Malingering intellectual disability: the value of available measures and
methods. Appl Neuropsychol. 2009;16(2):105–113.
37. Larrabee GJ. Detection of malingering using atypical performance patterns on standard
neuropsychological tests. Clin Neuropsychol. 2003;17(3):410–425. doi:10.1076/
clin.17.3.410.18089.
38. Graue LO, Berry DT, Clark JA, Sollman MJ, Cardi M, Hopkins J, Werline D. Identification of feigned mental retardation using the new generation of malingering detection instruments: preliminary findings. Clin Neuropsychol. 2007;21(6):929–942.
39. Hurley KE, Deal WP. Assessment instruments measuring malingering used with individuals
who have mental retardation: potential problems and issues. Ment Retard. 2006;44
(2):112–119.
40. Love CM, Glassmire DM, Zanolini SJ, Wolf A. Specificity and false positive rates of the test of memory malingering, Rey 15-item test, and Rey word recognition test among forensic inpatients with intellectual disabilities. Assessment. 2014;21(5):618–627.
41. Marshall P, Happe M. The performance of individuals with mental retardation on cognitive tests assessing effort and motivation. Clin Neuropsychol. 2007;21(5):826–840.
42. Smith K, Boone K, Victor T, Miora D, Cottingham M, Ziegler E, Zeller M, Wright M. Comparison
of credible patients of very low intelligence and non-credible patients on neurocognitive
performance validity indicators. Clin Neuropsychol. 2014;28(6):1048–1070.
43. Victor TL, Boone KB. Identification of feigned mental retardation. In: Boone KB, editor.
Assessment of feigned cognitive impairment: a neuropsychological perspective. New York
(NY): The Guilford Press; 2007. p. 310–345.
44. Arentsen TJ, Boone KB, Lo TT, Goldberg HE, Cottingham ME, Victor TL, . . . Zeller MA.
Effectiveness of the Comalli Stroop Test as a measure of negative response bias. Clin
Neuropsychol. 2013;27(6):1060–1076.
45. Bell-Sprinkel TL, Boone KB, Miora D, Cottingham M, Victor T, Ziegler E, Zeller M, Wright M. Re-examination of the Rey word recognition test. Clin Neuropsychol. 2013;27(3):516–527.
46. Boone KB, Salazar X, Lu P, Warner-Chacon K, Razani J. The Rey 15-item recognition trial:
a technique to enhance sensitivity of the Rey 15-item memorization test. J Clin Exp
Neuropsychol. 2002;24(5):561–573. doi:10.1076/jcen.24.5.561.1004.
47. Lichtenstein JD, Erdodi LA, Rai JK, Mazur-Mosiewicz A, Flaro L. Wisconsin card sorting test
embedded validity indicators developed for adults can be extended to children. Child
Neuropsychol. 2018;24(2):247–260. doi:10.1080/09297049.2016.1259402.
48. Lichtenstein JD, Greenacre MK, Cutler L, Abeare K, Baker SD, Kent K, Ali J, Erdodi LA. Geographic
variation and instrumentation artifacts: in search of confounds in performance validity assessment
in adults with mild TBI. Psychol Inj Law. 2019;12(2):127–145. doi:10.1007/s12207-019-0935.
49. Marshall P, Schroeder R, O'Brien J, Fischer R, Ries A, Blesi B, Barker J. Effectiveness of symptom validity measures in identifying cognitive and behavioral symptom exaggeration in adult attention deficit hyperactivity disorder. Clin Neuropsychol. 2010;24:1204–1237. doi:10.1080/13854046.2010.514290.
50. Morse CL, Douglas-Newman K, Mandel S, Swirsky-Sacchetti T. Utility of the Rey-15 recognition
trial to detect invalid performance in a forensic neuropsychological sample. Clin
Neuropsychol. 2013;1–13. doi:10.1080/13854046.2013.832385
51. Nitch S, Boone KB, Wen J, Arnold G, Alfano K. The utility of the Rey word recognition test in
the detection of suspect effort. Clin Neuropsychol. 2006;20(4):873–887.
52. Poynter K, Boone KB, Ermshar A, Miora D, Cottingham M, Victor TL, Ziegler E, Zeller MA,
Wright M. Wait, there's a baby in this bath water! Update on quantitative and qualitative cut-offs for Rey 15-item recall and recognition. Arch Clin Neuropsychol. 2019;34(8):1367–1380.
doi:10.1093/arclin/acy087.
53. Erdodi LA, Abeare CA, Lichtenstein JD, Tyson BT, Kucharski B, Zuccato BG, Roth RM. WAIS-IV
processing speed scores as measures of non-credible responding – the third generation of
embedded performance validity indicators. Psychol Assess. 2017;29(2):148–157. doi:10.1037/
pas0000319.
54. Glassmire DM, Wood ME, Ta MT, Kinney DI, Nitch SR. Examining false-positive rates of
Wechsler Adult Intelligence Scale (WAIS-IV) processing speed based embedded validity
indicators among individuals with schizophrenia spectrum disorders. Psychol Assess.
2019;31(1):120–125. doi:10.1037/pas0000650.
55. Abeare C, Messa I, Whitfield C, Zuccato B, Casey J, Rykulski N, Erdodi L. Performance validity in
collegiate football athletes at baseline neurocognitive testing. J Head Trauma Rehabil.
2019;34(4):E20–E31.
56. Chafetz MD, Biondolillo AM. Feigning a severe impairment profile. Arch Clin Neuropsychol.
2013;28(3):205–212.
57. Erdodi LA, Pelletier CL, Roth RM. Elevations on select Conners’ CPT-II scales indicate non-
credible responding in adults with traumatic brain injury. Appl Neuropsychol Adult. 2018;25
(1):19–28. doi:10.1080/23279095.2016.1232262.
58. Erdodi LA, Tyson BT, Shahein A, Lichtenstein JD, Abeare CA, Pelletier CL, Zuccato BG, Kucharski B, Roth RM. The power of timing: adding a time-to-completion cutoff to the word choice test and recognition memory test improves classification accuracy. J Clin Exp Neuropsychol. 2017;39(4):369–383. doi:10.1080/13803395.2016.1230181.
59. Greve KW, Curtis KL, Bianchini KJ, Ord JS. Are the original and second edition of the California
Verbal Learning Test equally accurate in detecting malingering? Assessment. 2009;16
(3):237–248.
60. Wechsler D. Wechsler Adult Intelligence Scale – Fourth Edition (WAIS-IV). San Antonio (TX): Pearson; 2008.
61. Babikian T, Boone KB, Lu P, Arnold G. Sensitivity and specificity of various digit span scores in the detection of suspect effort. Clin Neuropsychol. 2006;20(1):145–159.
62. Erdodi LA. Aggregating validity indicators: the salience of domain specificity and the indeterminate range in multivariate models of performance validity assessment. Appl Neuropsychol Adult. 2019;26(2):155–172. doi:10.1080/23279095.2017.1384925.
63. Erdodi LA, Abeare CA. Stronger together: the Wechsler adult intelligence scale – fourth
Edition as a multivariate performance validity test in patients with traumatic brain injury.
Arch Clin Neuropsychol. 2020;35(2):188–204. doi:10.1093/arclin/acz032/5613200.
64. Erdodi LA, Hurtubise JL, Charron C, Dunn A, Enache A, McDermott A, Hirst R. The D-KEFS Trails
as performance validity tests. Psychol Assess. 2018;30(8):1081–1095.
65. Erdodi LA, Lichtenstein JD. Information processing speed tests as PVTs. In: Boone KB, editor.
Assessment of feigned cognitive impairment. A neuropsychological perspective. New York (NY):
Guilford; 2020. p. 218–247.
66. Etherton JL, Bianchini KJ, Heinly MT, Greve KW. Pain, malingering, and performance on the
WAIS-III processing speed index. J Clin Exp Neuropsychol. 2006;28(7):1218–1237. doi:10.1080/
13803390500346595.
67. Greiffenstein MF, Baker WJ, Gola T. Validation of malingered amnesia measures with a large
clinical sample. Psychol Assess. 1994;6(3):218–224. https://doi.org/10.1037/1040-3590.6.3.218
68. Heinly MT, Greve KW, Bianchini K, Love JM, Brennan A. WAIS Digit-Span-based indicators of
malingered neurocognitive dysfunction: classication accuracy in traumatic brain injury.
Assessment. 2005;12(4):429–444.
69. Hurtubise J, Baher T, Messa I, Cutler L, Shahein A, Hastings M, Carignan-Querqui M, Erdodi L.
Verbal fluency and digit span variables as performance validity indicators in experimentally
induced malingering and real world patients with TBI. Appl Neuropsychol Child. 2020;1–18.
doi:10.1080/21622965.2020.1719409
70. Jasinski LJ, Berry DT, Shandera AL, Clark JA. Use of the Wechsler adult intelligence scale digit
span subtest for malingering detection: a meta-analytic review. J Clin Exp Neuropsychol.
2011;33(3):300–314.
71. Mathias CW, Greve KW, Bianchini KJ, Houston RJ, Crouch JA. Detecting malingered neuro-
cognitive dysfunction using the reliable digit span in traumatic brain injury. Assessment.
2002;9(3):301–308.
72. Reese CS, Suhr JA, Riddle TL. Exploration of malingering indices in the Wechsler adult
intelligence scale – fourth edition digit span subtest. Arch Clin Neuropsychol.
2012;27:176–181.
73. Schroeder RW, Twumasi-Ankrah P, Baade LE, Marshall PS. Reliable digit span: a systematic
review and cross-validation study. Assessment. 2012;19(1):21–30.
74. Shura RD, Martindale SL, Taber KH, Higgins AM, Rowland JA. Digit Span embedded validity
indicators in neurologically-intact veterans. Clin Neuropsychol. 2020;34(5):1025–1037.
75. Spencer RJ, Axelrod BN, Drag LL, Waldron-Perrine B, Pangilinan PH, Bieliauskas LA. WAIS-IV
reliable digit span is no more accurate than age corrected scaled score as an indicator of
invalid performance in a veteran sample undergoing evaluation for mTBI. Clin Neuropsychol.
2013;27(8):1362–1372.
76. Trueblood W. Qualitative and quantitative characteristics of malingered and other invalid
WAIS-R and clinical memory data. J Clin Exp Neuropsychol. 1994;14(4):697–707. doi:10.1080/
01688639408402671.
77. Webber TA, Soble JR. Utility of various WAIS-IV digit span indices for identifying noncredible performance validity among cognitively impaired and unimpaired examinees. Clin Neuropsychol. 2018;32(4):657–670.
78. Ashendorf L. Rey auditory verbal learning test and Rey–Osterrieth complex figure test
performance validity indices in a VA polytrauma sample. Clin Neuropsychol. 2019;33
(8):1388–1402.
79. Axelrod BN, Schutte C. Concurrent validity of three forced-choice measures of symptom
validity. Appl Neuropsychol. 2011;18(1):27–33.
80. Bauer L, Yantz CL, Ryan LM, Warden DL, McCaffrey RJ. An examination of the California verbal learning Test II to detect incomplete effort in a traumatic brain injury sample. Appl
Neuropsychol. 2005;12(4):202–207. doi:10.1207/s15324826an1204_3.
81. Blaskewitz N, Merten T, Brockhaus R. Detection of suboptimal effort with the Rey complex figure test and recognition trial. Appl Neuropsychol. 2009;16:54–61.
82. Bortnik KE, Boone KB, Marion SD, Amano S, Ziegler E, Victor TL, Zeller MA. Examination of
various WMS-III logical memory scores in the assessment of response bias. Clin Neuropsychol.
2010;24(2):344–357. doi:10.1080/13854040903307268.
83. Curtis KL, Greve KW, Bianchini KJ, Brennan A. California verbal learning test indicators of
malingered neurocognitive dysfunction: sensitivity and specificity in traumatic brain injury.
Assessment. 2006;13(1):46–61.
84. Donders J, Strong CH. Embedded effort indicators on the California verbal learning test – second
edition (CVLT-II): an attempted cross-validation. Clin Neuropsychol. 2011;25:173–184.
85. Erdodi LA, Abeare CA, Medoff B, Seke KR, Sagar S, Kirsch NL. A single error is one too many:
the forced choice recognition trial on the CVLT-II as a measure of performance validity in
adults with TBI. Arch Clin Neuropsychol. 2018;33(7):845–860. doi:10.1093/arclin/acx110.
86. Langeluddecke PM, Lucas SK. Quantitative measures of memory malingering on the Wechsler
memory scale—third edition in mild head injury litigants. Arch Clin Neuropsychol. 2003;18
(2):181–197.
87. Moore BA, Donders J. Predictors of invalid neuropsychological performance after traumatic
brain injury. Brain Inj. 2004;18(10):975–984. doi:10.1080/02699050410001672350.
88. Rai J, An KY, Charles J, Ali S, Erdodi LA. Introducing a forced choice recognition trial to the Rey
Complex Figure Test. Psychol Neurosci. 2019;12(4):451–472. doi:10.1037/pne0000175.
89. Resch ZJ, Pham AT, Abramson DA, White DJ, DeDios-Stern S, Soble JR. Examining indepen-
dent and combined accuracy of embedded performance validity tests in the California verbal
learning Test-II and brief visuospatial memory-revised for detecting invalid performance.
Appl Neuropsychol Adult. 2020. Advance online publication.
90. Root JC, Robbins RN, Chang L, Van Gorp WG. Detection of inadequate effort on the California
verbal learning Test - second edition: forced choice recognition and critical item analysis.
J Int Neuropsychol Soc. 2006;12:688–696. doi:10.1017/S1355617706060838.
91. Schwartz ES, Erdodi L, Rodriguez N, Jyotsna JG, Curtain JR, Flashman LA, Roth RM. CVLT-II
forced choice recognition trial as an embedded validity indicator: a systematic review of the
evidence. J Int Neuropsychol Soc. 2016;22(8):851–858. doi:10.1017/S1355617716000746.
92. Shura RD, Miskey HM, Rowland JA, Yoash-Gatz RE, Denning JH. Embedded performance validity
measures with postdeployment veterans: cross-validation and efficiency with multiple measures.
Appl Neuropsychol Adult. 2016;23:94–104. doi:10.1080/23279095.2015.1014556.
93. Abramson DA, Resch ZJ, Ovsiew GP, White DJ, Bernstein MT, Basurto KS, Soble JR. Impaired or
invalid? Limitations of assessing performance validity using the Boston naming test.
Appl Neuropsychol Adult. 2020. Advance online publication. doi:10.1080/23279095.2020.1774378
94. An KY, Charles J, Ali S, Enache A, Dhuga J, Erdodi LA. Re-examining performance validity
cutoffs within the complex ideational material and the Boston naming test-short form using an experimental malingering paradigm. J Clin Exp Neuropsychol. 2019;41(1):15–25.
doi:10.1080/13803395.2018.1483488.
95. Curtis KL, Thompson LK, Greve KW, Bianchini KJ. Verbal fluency indicators of malingering in traumatic brain injury: classification accuracy in known groups. Clin Neuropsychol. 2008;22:930–945. doi:10.1080/13854040701563591.
96. Erdodi LA, Dunn AG, Seke KR, Charron C, McDermott A, Enache A, Maytham C, Hurtubise J.
The Boston naming test as a measure of performance validity. Psychol Inj Law. 2018;11:1–8.
doi:10.1007/s12207-017-9309-3.
97. Erdodi LA, Roth RM. Low scores on BDAE complex ideational material are associated with
invalid performance in adults without aphasia. Appl Neuropsychol Adult. 2017;24(3):264–274.
doi:10.1080/23279095.2017.1298600.
98. Erdodi LA, Tyson BT, Abeare CA, Lichtenstein JD, Pelletier CL, Rai JK, Roth RM. The BDAE
complex ideational material – a measure of receptive language or performance validity?
Psychol Inj Law. 2016;9:112–120. doi:10.1007/s12207-016-9254-6.
99. Iverson GL, Binder LM. Detecting exaggeration and malingering in neuropsychological
assessment. J Head Trauma Rehabil. 2000;15(2):829–858.
100. Millis SR, Ross SR, Ricker JH. Detection of incomplete effort on the Wechsler adult intelligence
scale-revised: a cross-validation. J Clin Exp Neuropsychol. 1998;20(2):167–173.
101. Mittenberg W, Theroux-Fichera S, Zielinski RE, Heilbronner RL. Identification of malingered
head injury on the Wechsler adult intelligence scale-revised. Prof Psychol Res Pr. 1995;26
(5):491–498.
102. Mittenberg W, Theroux S, Aguila-Puentes G, Bianchini K, Greve K, Rayls K. Identification of
malingered head injury on the Wechsler adult intelligence scale-3rd Edition. Clin
Neuropsychol. 2001;15(4):440–445.
103. Sugarman MA, Axelrod BN. Embedded measures of performance validity using verbal fluency
tests in a clinical sample. Appl Neuropsychol Adult. 2015;22(2):141–146.
104. Abeare C, Sabelli A, Taylor B, Holcomb M, Dumitrescu C, Kirsch N, Erdodi L. The importance of
demographically adjusted cutoffs: age and education bias in raw score cutoffs within the Trail
Making Test. Psychol Inj Law. 2019;12(2):170–182. doi:10.1007/s12207-019-09353.
105. Ashendorf L, O'Bryant SE, McCaffrey RJ. Specificity of malingering detection strategies in older
adults using the CVLT and WCST. Clin Neuropsychol. 2003;17(2):255–262.
106. DenBoer JW, Hall S. Neuropsychological test performance of successful brain injury
simulators. Clin Neuropsychol. 2007;21(6):943–955.
107. Gligorović M, Buha N. Conceptual abilities of children with mild intellectual disability: analysis
of Wisconsin card sorting test performance. J Intell Dev Disab. 2013;38(2):134–140.
doi:10.3109/13668250.2013.772956.
108. Greve KW, Bianchini KJ, Mathias CW, Houston RJ, Crouch JA. Detecting malingered neuro-
cognitive dysfunction with the Wisconsin card sorting test: a preliminary investigation in
traumatic brain injury. Clin Neuropsychol. 2002;16(2):179–191.
109. Greve KW, Heinly MT, Bianchini KJ, Love JM. Malingering detection with the Wisconsin card
sorting test in mild traumatic brain injury. Clin Neuropsychol. 2009;23:343–362.
110. Iverson GL, Lange RT, Green P, Franzen MD. Detecting exaggeration and malingering with the trail
making test. Clin Neuropsychol. 2002;16(3):398–406. doi:10.1076/clin.16.3.398.13861.
111. Jodzio K, Biechowska D. Wisconsin card sorting test as a measure of executive function
impairments in stroke patients. Appl Neuropsychol. 2010;17(4):267–277.
112. Merten T, Bossink L, Schmand B. On the limits of effort testing: symptom validity tests and
severity of neurocognitive symptoms in nonlitigant patients. J Clin Exp Neuropsychol.
2007;29(3):308–318.
113. Merten T, Green P, Henry M, Blaskewitz N, Brockhaus R. Analog validation of
German-language symptom validity tests and the influence of coaching. Arch Clin
Neuropsychol. 2005;20:719–726.
114. Suhr JA, Boyer D. Use of the Wisconsin Card Sorting Test in the detection of malingering in
student simulator and patient samples. J Clin Exp Neuropsychol. 1999;21(5):701–708. doi:10.1076/
jcen.21.5.701.868.
115. Erdodi LA, Kirsch NL, Sabelli AG, Abeare CA. The Grooved Pegboard Test as a validity
indicator – A study on psychogenic interference as a confound in performance validity
research. Psychol Inj Law. 2018;11(4):307–324. doi:10.1007/s12207-018-9337-7.
116. Erdodi LA, Seke KR, Shahein A, Tyson BT, Sagar S, Roth RM. Low scores on the Grooved
Pegboard Test are associated with invalid responding and psychiatric symptoms. Psychol
Neurosci. 2017;10(3):325–344. doi:10.1037/pne0000103.
117. Erdodi LA, Taylor B, Sabelli A, Malleck M, Kirsch NL, Abeare CA. Demographically adjusted
validity cutoffs in the Finger tapping test are superior to raw score cutoffs. Psychol Inj Law.
2019;12(2):113–126. doi:10.1007/s12207-019-09352-y.
118. An KY, Kaploun K, Erdodi LA, Abeare CA. Performance validity in undergraduate research
participants: a comparison of failure rates across tests and cutoffs. Clin Neuropsychol. 2017;31
(1):193–206.
119. Roye S, Calamia M, Bernstein JP, De Vito AN, Hill BD. A multi-study examination of perfor-
mance validity in undergraduate research participants. Clin Neuropsychol. 2019;33
(6):1138–1155.
120. Dandachi-FitzGerald B, Duits AA, Leentjens AF, Verhey FR, Ponds RW. Performance and
symptom validity assessment in patients with apathy and cognitive impairment.
J Int Neuropsychol Soc. 2020;26(3):314–321.
121. Larrabee GJ, Millis SR, Meyers JE. 40 plus or minus 10, a new magical number: reply to Russell.
Clin Neuropsychol. 2009;23:841–849.
122. Martin PK, Schroeder RW. Base rates of invalid test performance across clinical non-forensic
contexts and settings. Arch Clin Neuropsychol. 2020;35(6):717–725. doi:10.1093/arclin/
acaa017.
123. Chafetz M, Underhill J. Estimated costs of malingered disability. Arch Clin Neuropsychol.
2013;28(7):633–639.
124. Denning JH, Shura RD. Cost of malingering mild traumatic brain injury-related cognitive
deficits during compensation and pension evaluations in the veterans benefits
administration. Appl Neuropsychol Adult. 2019;26(1):1–16.
125. Grossi LM, Green D, Einzig S, Belfi B. Evaluation of the response bias scale and improbable
failure scale in assessing feigned cognitive impairment. Psychol Assess. 2017;29(5):531–541.
doi:10.1037/pas0000364.
126. Odland AP, Lammy AB, Martin PK, Grote CL, Mittenberg W. Advanced administration and
interpretation of multiple validity tests. Psychol Inj Law. 2015;8(1):46–63.
127. Pearson. Advanced clinical solutions for the WAIS-IV and WMS-IV – technical manual. San
Antonio (TX): Author; 2009.
128. Erdodi LA, Roth RM, Kirsch NL, Lajiness-O'Neill R, Medoff B. Aggregating validity indicators
embedded in Conners’ CPT-II outperforms individual cutos at separating valid from invalid
performance in adults with traumatic brain injury. Arch Clin Neuropsychol. 2014;29
(5):456–466. doi:10.1093/arclin/acu026.
129. Lu PH, Boone KB, Cozolino L, Mitchell C. Effectiveness of the Rey-Osterrieth complex figure test and the Meyers and Meyers recognition trial in the detection of suspect effort. Clin
Neuropsychol. 2003;17(3):426–440. doi:10.1076/clin.17.3.426.18083.
130. Reedy SD, Boone KB, Cottingham ME, Glaser DF, Lu PH, Victor TL, Ziegler EA, Zeller MA,
Wright MJ. Cross validation of the Lu and colleagues (2003) Rey-Osterrieth complex figure test effort equation in a large known-group sample. Arch Clin Neuropsychol. 2013;28:30–37.
doi:10.1093/arclin/acs106.
131. Tyson BT, Baker S, Greenacre M, Kent KJ, Lichtenstein JD, Sabelli A, Erdodi LA. Differentiating
epilepsy from psychogenic nonepileptic seizures using neuropsychological test data.
Epilepsy Behav. 2018;87:39–45.
132. Lace JW, Grant AF, Kosky KM, Teague CL, Lowell KT, Gfeller JD. Identifying novel embedded
performance validity test formulas within the repeatable battery for the assessment of
neuropsychological status: a simulation study. Psychol Inj Law. 2020;13:303–315.
133. Johnstone L, Cooke DJ. Feigned intellectual deficits on the Wechsler adult intelligence scale-
revised. British J Clin Psychol. 2003;42(3):303–318.
134. Heaton RK, Miller SW, Taylor MJ, Grant I. Revised comprehensive norms for an expanded Halstead-Reitan battery: demographically adjusted neuropsychological norms for African American and Caucasian adults. Psychological Assessment Resources; 2004.