A Single Error Is One Too Many: The Forced Choice Recognition Trial
of the CVLT-II as a Measure of Performance Validity in Adults with TBI
Laszlo A. Erdodi¹,*, Christopher A. Abeare², Brent Medoff³, Kristian R. Seke⁴, Sanya Sagar⁵, Ned L. Kirsch⁶
¹ Department of Psychology, University of Windsor, 168 Chrysler Hall South, Windsor, Canada ON N9B3P4
² Department of Psychology, University of Windsor, 170 Chrysler Hall South, Windsor, Canada ON N9B3P4
³ The Commonwealth Medical College, 525 Pine St, Scranton, PA 18509, USA
⁴ University of Windsor, Brain-Cognition-Neuroscience Program, G105 Chrysler Hall North, Windsor, Canada ON N9B3P4
⁵ Department of Psychology, University of Windsor, 109 Chrysler Hall North, Windsor, Canada ON N9B3P4
⁶ Department of Physical Medicine and Rehabilitation, University of Michigan, Briarwood Circle #4 Ann Arbor, MI 48108, USA
*Corresponding author at: Department of Psychology, University of Windsor, 168 Chrysler Hall South, 401 Sunset Avenue, Windsor, Canada ON N9B 3P4.
Tel.: +519 253 3000x2202. E-mail address: lerdodi@gmail.com (Laszlo Erdodi)
Editorial Decision 5 October 2017; Accepted 20 October 2017
Abstract
Objective: The Forced Choice Recognition (FCR) trial of the California Verbal Learning Test—Second Edition (CVLT-II) was designed
to serve as a performance validity test (PVT). The present study was designed to compare the classification accuracy of a more liberal
alternative (≤15) to the de facto FCR cutoff (≤14).
Method: The classification accuracy of the two cutoffs was computed in reference to psychometrically defined invalid performance,
across various criterion measures, in a sample of 104 adults with TBI clinically referred for neuropsychological assessment.
Results: The FCR was highly predictive (AUC: .71–.83) of Pass/Fail status on reference PVTs, but unrelated to performance on measures
known to be sensitive to TBI. On average, FCR ≤15 correctly identified an additional 6% of invalid response sets compared to FCR ≤14,
while maintaining .92 specificity. Patients who failed the FCR reported higher levels of emotional distress.
Conclusions: Results suggest that even a single error on the FCR is a reliable indicator of invalid responding. Further research is needed
to investigate the clinical significance of the relationship between failing the FCR and level of self-reported psychiatric symptoms.
Keywords: CVLT-II; Forced choice recognition; Traumatic brain injury; Performance validity assessment; Alternative cutoffs
Introduction
The interpretation of neuropsychological tests rests on the assumption that examinees are able and willing to fully engage
with the tasks presented to them and therefore, demonstrate their maximal ability level (Delis, Kramer, Kaplan, & Ober,
2000). There is a growing consensus within the field of neuropsychology that valid performance cannot be assumed by
default, but should be objectively evaluated (Boone, 2009; Chafetz et al., 2015; Heilbronner, Sweet, Morgan, Larrabee, &
Millis, 2009). Some even consider assessments that omit formal measures of test-taking effort incomplete (Iverson, 2003).
Along with free-standing performance validity tests [PVTs; Test of Memory Malingering (TOMM; Tombaugh, 1996); Word
Memory Test (WMT; Green, 2003); Validity Indicator Profile (VIP; Frederick, 2003)] that represent the traditional approach
to performance validity assessment, a growing array of embedded validity indicators (EVIs) have also been developed to help
clinicians determine the credibility of a given response set (Arnold et al., 2005; Erdodi, Sagar, et al., 2017; Greiffenstein,
Baker, & Gola, 1994; Larrabee, 2003).
The Forced Choice Recognition (FCR) task of the California Verbal Learning Test—Second Edition (CVLT-II; Delis
et al., 2000) falls somewhere in between these two categories of validity measures. It was introduced as an optional module
© The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
doi:10.1093/acn/acx110
Downloaded from https://academic.oup.com/acn/advance-article-abstract/doi/10.1093/acn/acx110/4774665
by University of Windsor Paul Martin Law Library, lerdodi@uwindsor.ca
on 27 December 2017
with the explicit purpose of evaluating test-taking effort, and is administered 10 min after the original recall and recognition
trials are completed. These features are consistent with a free-standing PVT. However, FCR is dependent on the prior admin-
istration of the rest of the CVLT-II, which makes it an EVI.
The technical manual references a study by Connor, Drake, Bondi and Delis (1997) on an early experimental version of
FCR administered in conjunction with the original CVLT. On this instrument, a cutoff of ≤13 produced impressive classifica-
tion accuracy (.80 sensitivity at .97 specificity) separating credible and simulated memory deficits. Although the authors re-
frained from endorsing this or any other cutoff, they reported that over 90% of the CVLT-II normative sample obtained a
perfect score on FCR (16/16), with ≤1% scoring ≤14. Nobody scored ≤13. They suggested that its pronounced ceiling effect
in neurologically healthy individuals makes FCR a viable instrument for detecting non-credible responding in unsophisticated
examinees who blatantly exaggerate their memory problems.
Early studies on FCR in clinical samples provided indirect support for this claim. Baldo, Delis, Kramer and Shimamura
(2002) reported that all of the 11 patients with neuroradiologically confirmed focal frontal lesions and notable impairment in
acquisition, recall and recognition on the CVLT-II obtained perfect scores on FCR. Demonstrating that performance on FCR
is unrelated to brain lesions or credibly poor performance on the rest of the CVLT-II was an essential first step in gaining
acceptance as a validity indicator.
The other requirement for validation was examining whether FCR can correctly identify individuals who fail other estab-
lished PVTs. Moore and Donders (2004) conducted the first large scale concordance rate analysis against the TOMM in 132
clinically referred adults with TBI. Most of the sample was male (62.1%) and classified as mild TBI (54.5%). Mean age was
35.8 (SD = 14.2) and mean level of education was 12.3 (SD = 2.6). The majority (81.1%) obtained a perfect score on FCR;
6.8% scored 15 and 11.4% scored ≤14. The authors turned the base rate argument advanced by the test developers into an
explicit diagnostic threshold, defining failure on FCR as ≤14. The reference PVT was TOMM Trial 2 ≤44 resulting in an
8.3% base rate of failure (BR_Fail). FCR ≤14 produced a respectable combination of sensitivity (.55) and specificity (.93)
against this criterion. No alternative cutoff was considered. The authors expressed concerns that both the TOMM and FCR
might be too transparent as PVTs and thus, highly specific, but not very sensitive to invalid responding.
Bauer, Yantz, Ryan, Warden and McCaffrey (2005) examined the relationship between FCR and the WMT in a military
sample of 120 patients with TBI. Mean age was 28.4 (SD = 9.2) and mean level of education was 13.4 (SD = 2.3). The
BR_Fail, defined by the WMT at standard cutoffs, was 24.2%. Although the authors did not provide enough detail to compute
classification accuracy, the mean FCR value in the “invalid” group (14.9) was significantly lower than in the “valid” group
(15.7). Also, there was a positive linear relationship between WMT performance as a continuous variable (average of the IR,
DR and CNS subtests) and FCR: those with M_WMT ≥91% produced a M_FCR of 15.8, while those with M_WMT 61–70%
produced a M_FCR of 14.2. The authors concluded that while FCR was effective at discriminating those who passed and those
who failed the WMT, the mean difference was lower (0.8) than what was observed on Yes/No recognition hits raw scores
(2.0). They attributed this to the inherently lower cognitive demands of the FCR paradigm compared to the Yes/No recognition
trial, which has a 3:1 distractor-to-target ratio. They also emphasized that FCR has high specificity, but low sensitivity to
invalid performance.
Root, Robbins, Chang and van Gorp (2006) investigated the relationship among FCR scores, memory impairment and
performance validity across three groups: a mixed clinical sample (n = 25), a forensic sample with adequate effort (n = 27) and
a forensic sample with inadequate effort (n = 25). Performance validity was operationalized as passing or failing the TOMM
and/or the VIP among forensic patients, resulting in an overall BR_Fail of 48%. Performance validity was not formally assessed
in the clinically referred patients; instead, they were assumed to have valid neuropsychological profiles based on the lack of
apparent secondary gain. Given the emerging evidence that even university students with no incentive to appear impaired fail
PVTs (An, Zakzanis, & Joordens, 2012; An, Kaploun, Erdodi, & Abeare, 2017; Ross et al., 2016; Santos, Kazakov, Reamer,
Park, & Osmon, 2014), this logic (“lack of apparent motivation to perform poorly = evidence of valid performance”) seems
flawed by current methodological standards that emphasize the importance of objective evidence from multiple independent
sources in establishing the credibility of a given neurocognitive profile (Boone, 2009, 2013; Larrabee, 2012).
Nevertheless, Root et al. (2006) found that FCR scores were unrelated to delayed free recall performance. An FCR cutoff
of ≤15 produced .60 sensitivity at .81 specificity. Lowering the cutoff to ≤14 resulted in improved specificity (.93), but
decreased sensitivity (.44). The authors endorsed FCR as a “brief screen of effort” given its quick and easy administration and
strong positive predictive power. At the same time, they cautioned against using a passing score on FCR as evidence for cred-
ible performance. They also acknowledged the modality specificity as a potential confound in establishing the optimal cutoff
on FCR: the TOMM is a visually based PVT, while the VIP is non-memory based. As such, they may not be ideal reference
PVTs to cross-validate FCR.
Once FCR’s ability to separate valid and invalid response sets had been established, later studies used it as an EVI. Some
of these reports provide indirect evidence that further consolidates the evidence base supporting its diagnostic utility.
L.A. Erdodi et al. / Archives of Clinical Neuropsychology (2017); 1–16
For example, the investigation by Donders and Strong (2011) based on 100 clinically referred adults with TBI found that the
majority (85%) of the patients obtained a perfect score on FCR, 6% scored 15 and 9% scored ≤14. Although concordance rates
were not provided, 24% of the sample failed the WMT. The authors noted that FCR and WMT were unrelated to injury severity,
while other CVLT-II measures (Trials 1–5, LD-FR, d’, Total Recall Discriminability) covaried with duration of coma.
Another method for assessing FCR’s ability to differentiate invalid responding from credible impairment is to examine its
distribution in clinical populations with severe, credible neurological deficits. Extremely low intellectual functioning (FSIQ <
70) and dementia are two conditions that have been identified as being at risk for high false positive rates on PVTs (Boone,
2009, 2013). Marshall and Happe (2007) examined the BR_Fail in several PVTs in a sample of 100 adults with intellectual
disability (52% male, M_Age = 36.6; M_FSIQ = 63). Mean FCR score was 15.1 (SD = 1.9). A frequency distribution for a subset of
71 participants for which FCR data were available revealed that the majority (66.2%) obtained a perfect score, 18.3% of the
sample scored 15 and 15.5% scored ≤14.
Clark and colleagues (2012) demonstrated that FCR performance is a useful clinical marker of anterograde amnesia in later
stages of Alzheimer’s disease (AD). As such, in conjunction with other CVLT-II measures, it can aid the subtyping of late
life memory disorders and track disease progression in individuals diagnosed with a neurodegenerative disorder. Mean FCR
was 13.9 in the Alzheimer’s sample (n = 37), 15.8 in the amnestic MCI sample (n = 18), 15.7 in the non-amnestic MCI
sample (n = 19) and a near-perfect score in the control sample (n = 35).
Research on FCR appears to converge in a few areas. First, the exceptionally good signal detection profile of the ≤13 cut-
off in the original experimental version has not been replicated. Second, the ≤14 cutoff performed well against established
PVTs, with classification accuracy hovering around the “Larrabee limit”: .50 sensitivity at .90 specificity (Lichtenstein,
Erdodi, & Linnea, 2017). Third, no alternative cutoff has been systematically evaluated, despite accumulating evidence that
the vast majority of credible individuals produce perfect scores on FCR, making ≤15 a logical candidate for a more liberal
cutoff. A recent systematic review of the literature on the FCR’s classification accuracy found that the ≤14 cutoff tended to
sacrifice sensitivity for specificity, and identified investigating the more liberal alternative cutoff (≤15) as a direction for future
research (Schwartz et al., 2016).
The implication of these findings is that a score of 15 on FCR is more likely to be a Fail than a Pass. Even if it might not
constitute strong enough evidence to render the whole profile invalid, FCR =15 should be considered at least a red flag in
the evaluation of performance validity (D. Delis, personal communication, 10 May 2012). In fact, some researchers started
treating an FCR score of 15 as the first level of invalid performance (Erdodi, Kirsch, Lajiness-O’Neill, Vingilis & Medoff,
2014; Erdodi et al., 2016). Similarly, the authors of the newly introduced FCR trial to the child version of CVLT suggested
that even one incorrect response is indicative of suboptimal effort (Lichtenstein et al., 2017).
The present study was designed to investigate two psychometric issues involving FCR. First, we proposed to compare the
de facto FCR cutoff of ≤14 to its more liberal alternative (≤15) in a sample of clinically referred adults with TBI. We hypoth-
esized that raising the cutoff to ≤15 would improve the sensitivity of FCR, while maintaining acceptable specificity, as re-
ported in the child version of CVLT.
Second, based on earlier reports that active psychiatric conditions increase the BR_Fail on PVTs (Moore & Donders, 2004),
we also hypothesized that performance on FCR would be related to self-ratings of emotional distress. Although previous
research suggests that failing one type of validity indicator is predictive of failing other types of validity measures
(Constantinou, Bauer, Ashendorf, Fisher, & McCaffrey, 2005), most clinicians seem to agree that the credibility of symptom
report and performance on cognitive tests are conceptually distinct and hence, should be assessed separately (van Dyke,
Millis, Axelrod, & Hanks, 2013). Overall, the link between performance validity and psychiatric functioning remains
controversial. Some investigators found that PVT failure was unrelated to depression (Considine et al., 2011; Egeland et al., 2005;
Rees, Tombaugh, & Boulay, 2001), while others reported an increase in the BR_Fail in patients with psychiatric disorders (Erdodi
et al., 2016). Recent research suggests that while PVT failure is unrelated to self-reported depression and anxiety, it has a
strong relationship with somatic symptoms (Erdodi, Sagar et al., 2017).
Method
Participants
The sample consisted of 104 outpatients medically referred for neuropsychological assessment after TBI. The majority were
males (55.8%) and right-handed (90.4%). Mean age was 38.8 years (SD = 16.7) and mean level of education was 13.7 years
(SD = 2.6). Mean FSIQ_WAIS-IV was 92.6 (SD = 15.9). Using data on injury severity indices gathered through clinical interview
and the review of medical records (presence/length of loss of consciousness, neuroradiological findings, peri-traumatic amnesia,
GCS score), 75.0% were classified as mild (mTBI). The rest were classified as moderate-to-severe. All patients were in the post-
acute stage of recovery (>3 months post mTBI and >1 year post moderate-to-severe TBI). As the assessments were conducted
for clinical purposes, no data were available on litigation status.
Materials
A fixed battery of commonly used, standardized measures of general intelligence, learning, memory, attention, executive
functions, language, visuoperceptual and motor skills was administered to all patients (Table 1). Emotional functioning was
assessed with self-report inventories. Performance validity was evaluated using a combination of stand-alone and embedded
PVTs. As a free-standing PVT based on multiple trials separated by time-delay, the WMT represented the traditional approach
to performance validity research.
To address concerns about instrumentation artifacts as a threat to the internal validity of signal detection analyses (Root
et al., 2006), we developed two composites using five independent validity measures to complement the WMT. The first one
consists of PVTs based on recognition memory, labeled “Erdodi Index Five—Recognition” (EI-5_REC). The second consists of
EVIs that are not based on recognition memory, labeled “Erdodi Index Five—Non-Recognition” (EI-5_NR). Each component
of the EI-5s was recoded into a four-point scale: the performance reflecting an incontrovertible Pass was assigned a value of
zero, while the most extreme level of failure was assigned the value of three, with intermediate levels of failure coded as one
and two, following the methodology described by Erdodi (2017). Table 2 provides the details of the re-scaling process and
references to the cutoffs used.
In addition to aggregating multiple independent validity indicators and thus, increasing the overall diagnostic power of the
measurement model (Larrabee, 2003), an essential feature of the EI-5s is that they recapture the underlying continuity in
performance validity, distinguishing between near-passes and clear failures. An EI-5 score provides a summary of both the
“number” and “extent” of PVT failures. Since the practical demands of validity assessment require a dichotomous outcome, the
first two levels (EI-5 ≤1) were considered a Pass, and values of ≥4 were considered a Fail. EI-5 values 2–3 were considered borderline
(Table 3), and excluded from further analyses involving the EI-5 to ensure the purity of the criterion groups (Pass/Fail), a
methodological standard in calibrating new PVTs (Erdodi, Tyson, Abeare et al., 2017; Greve & Bianchini, 2004; Sugarman
& Axelrod, 2015).
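The recoding and classification logic described above can be illustrated with a minimal sketch. It assumes the Word Choice Test cutoffs listed in Table 2 and the Pass/Borderline/Fail ranges listed in Table 3; the function names (`recode_wct`, `classify_ei5`) and the example component values are hypothetical and are not taken from the study.

```python
from typing import Dict


def recode_wct(raw_score: int) -> int:
    """Recode a Word Choice Test raw score onto the 0-3 EI-5 scale.

    Cutoffs follow Table 2: >47 -> 0, 44-47 -> 1, 40-43 -> 2, <=39 -> 3.
    """
    if raw_score > 47:
        return 0
    if raw_score >= 44:
        return 1
    if raw_score >= 40:
        return 2
    return 3


def classify_ei5(components: Dict[str, int]) -> str:
    """Sum five recoded components (each 0-3) and classify the composite.

    Ranges follow Table 3: 0-1 Pass, 2-3 Borderline, >=4 Fail.
    """
    if len(components) != 5:
        raise ValueError("An EI-5 composite requires exactly five components")
    if any(value not in (0, 1, 2, 3) for value in components.values()):
        raise ValueError("Each component must be recoded to 0, 1, 2 or 3")
    total = sum(components.values())
    if total <= 1:
        return "Pass"
    if total <= 3:
        return "Borderline"
    return "Fail"


# Hypothetical examinee: one near-pass on the WCT, clean on everything else.
example = {"WMT": 0, "WCT": recode_wct(45), "VR": 0, "LM": 0, "VPA": 0}
print(classify_ei5(example))
```

Under this scheme a single marginal failure leaves the composite in the Pass range, whereas either several marginal failures or one or two severe failures are needed to reach the Fail range.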
To complement the WMT and the EI-5s, several other validity measures were used as reference PVTs to provide a more
representative sample of sensory modalities, testing paradigms and sensitivity to invalid responding. Including a variety of
independent PVTs is essential to keep multicollinearity at a minimum and thus, maximize the predictive power of the
multivariate model of performance validity assessment (Boone, 2013; Larrabee, 2012).
The Word Choice Test (WCT) is a single-trial free-standing PVT based on the FCR paradigm (Pearson, 2009). Number of
hits on the Yes/No recognition trial of the CVLT-II (RH_CVLT-II) was selected because it is nested within the same test as FCR
Table 1. List of Tests Administered: Abbreviations, Scales and Norms
Test Name                                              Abbreviation     Norms
Beck Depression Inventory, 2nd edition                 BDI-II           —
Booklet Category Test                                  BCT              Heaton
California Verbal Learning Test, 2nd edition           CVLT-II          Manual
Conners’ Continuous Performance Test, 2nd editionᵃ     CPT-II           Manual
Letter and Category Fluency Test                       FAS & Animals    Heaton
Finger Tapping Test                                    FTT              Heaton
Green’s Word Memory Testᵃ                              WMT              Manual
Peabody Picture Vocabulary Test, 4th edition           PPVT-4           Manual
Symptom Checklist-90-Revisedᵃ                          SCL-90-R         Manual
Tactual Performance Test                               TPT              Heaton
Trail Making Test                                      TMT              Heaton
Wechsler Adult Intelligence Scale, 4th edition         WAIS-IV          Manual
Wechsler Memory Scale, 4th edition                     WMS-IV           Manual
Wide Range Achievement Test, 4th edition               WRAT-4           Manual
Wisconsin Card Sorting Test                            WCST             Manual
Word Choice Test                                       WCT              Manual
Note: Heaton: Demographically adjusted norms published by Heaton, Miller, Taylor, & Grant (2004); Manual: Normative data published in the technical
manual.
ᵃ Administered and scored on a computer.
and there are previous comparisons between the two tasks. The logistic regression equation developed by Wolfe and
colleagues (2010; LRE_Wolfe) was the alternative CVLT-II-based reference PVT. Given reports of high false positive rates
associated with the original cutoff (≥.50), the more conservative alternative (≥.625) was used in cross-validation analyses
(Donders & Strong, 2011). The WAIS-IV Digit Span age-corrected scaled score (DS_ACSS) is a measure of auditory attention
and working memory that has been shown to be effective at detecting invalid responding (Axelrod, Fichtenberg, Millis &
Wertheimer, 2006; Reese, Suhr, & Riffle, 2012; Spencer et al., 2013).
Self-reported emotional functioning was assessed using the Beck Depression Inventory—Second Edition (BDI-II) and the
Symptom Checklist-90-Revised (SCL-90-R). The BDI-II was designed to measure depressive symptoms consistent with the
DSM-IV (APA, 1996) diagnostic criteria (Beck, Steer, & Brown, 1996). Its brevity (21 items rated on a 4-point ordinal scale
[0–3]) combined with excellent psychometric properties and discriminant validity in both healthy controls and psychiatric pa-
tients make the BDI-II a popular screening tool for depression (Sprinkle et al., 2002;Storch, Roberti, & Roth, 2004).
The SCL-90-R is a widely used screening tool for a broad range of psychiatric symptoms in clinical populations with a
broad range of etiology (Derogatis, 1994) and in patients with TBI specifically (Hoofien, Barak, Vakil, & Gilboa, 2005). As
the name indicates, it contains 90 items self-rated on a 5-point ordinal scale [0–4] that converge into nine scales: somatization
(SOM), obsessive-compulsive symptoms (O-C), interpersonal sensitivity (I-S), depression (DEP), anxiety (ANX), hostility
(HOS), phobic anxiety (PHO), paranoid ideation (PAR) and psychotic symptoms (PSY). In addition, a Global Severity Index
(GSI) is computed to reflect the mean of all items. The GSI has been found to be the most sensitive of the SCL-90-R
indicators to disruptions in emotional and social functioning post TBI (Baker, Schmidt, Heinemann, Langley & Miranti, 1998;
Marschark, Richtsmeier, Richardson, Crovitz, & Henry, 2000; Westcott & Alfano, 2005). Clinical elevations (T ≥63;
Table 2. The Components of the EI-5s and Base Rates of Failure at Given Cutoffs
EI-5_REC Components    EI-5 Values                       EI-5_NR Components    EI-5 Values
                       0       1       2       3                               0       1       2        3
WMT                    0       1       2       3         RDS                   ≥8      7       6        ≤5
  Base Rate            60.6    4.8     11.5    23.1        Base Rate           76.0    13.5    4.8      5.8
WCT                    >47     44–47   40–43   ≤39       WCST FMS              ≤1      2       3        ≥4
  Base Rate            74.0    10.6    8.7     6.7         Base Rate           87.5    7.7     1.9      2.9
VR Recognition         >4      4       3       ≤2        FTT                   None    One     Both     —
  Base Rate            68.3    16.3    8.7     6.7         Base Rate           85.6    8.7     5.8      —
LM Recognition         >20     20–21   18–19   ≤17       Animals               >13     12–13   10–11    ≤9
  Base Rate            80.8    13.5    1.0     4.8         Base Rate           86.5    6.7     2.9      3.8
VPA Recognition        >35     32–35   28–31   ≤27       CPT-II OMI            ≤65     66–80   81–100   >100
  Base Rate            71.2    16.3    6.7     5.8         Base Rate           74.0    4.8     6.7      14.4
Note: WMT (IR, DR & CNS): Word Memory Test—Number of failures on Immediate Recall, Delayed Recall & Consistency trials at standard cutoffs; WCT:
Word Choice Test (Pearson, 2009); VR: WMS-IV Visual Reproduction (Pearson, 2009); LM: WMS-IV Logical Memory (Bortnik et al., 2010; Pearson,
2009); VPA: WMS-IV Verbal Paired Associates (Pearson, 2009); RDS: Reliable Digit Span (Greiffenstein et al., 1994; Pearson, 2009); WCST FMS:
Wisconsin Card Sorting Test Failures to Maintain Set (Greve, Bianchini, Mathias, Houston & Crouch, 2002; Larrabee, 2003; Suhr & Boyer, 1999); FTT:
Finger Tapping Test, number of cutoffs failed of dominant hand raw score ≤28/35 and combined raw scores ≤58/66 (Arnold et al., 2005); Animals: Animal
fluency raw score (Sugarman & Axelrod, 2015); CPT-II OMI: Conners’ Continuous Performance Test, 2nd edition Omissions T-scores (Erdodi et al., 2014; Lange
et al., 2013; Ord, Boettcher, Greve, & Bianchini, 2010). The Base Rate values represent the percent of the sample that scored within a given range of cutoffs.
Table 3. Frequency, Cumulative Frequency and Classification Range for the First Eight Levels of the EI-5s
EI-5    EI-5_REC                EI-5_NR                 Classification
        f     Cumulative %      f     Cumulative %
0       42    40.4              49    47.1              PASS
1       12    51.9              15    61.5              Pass
2       12    63.5              10    71.2              Borderline
3       6     69.2              16    86.5              Borderline
4       2     71.2              4     90.4              Fail
5       7     77.9              2     92.3              Fail
6       6     83.7              2     94.2              FAIL
7       7     90.4              1     95.2              FAIL
8       1     91.3              3     98.1              FAIL
Note: EI-5_REC: Erdodi Index Five—Recognition memory based; EI-5_NR: Erdodi Index Five—Non-recognition memory based.
Derogatis, 1994) were also commonly observed on the O-C, I-S, DEP and PHO scales (Baker et al., 1998; Marschark et al.,
2000; Palav, Ortega, & McCaffrey, 2001; Westcott & Alfano, 2005).
Procedure
Participants were assessed in two half-day appointments through the neurorehabilitation service of a Midwestern academic
medical center. Psychometric testing was completed in an outpatient setting by trained psychometricians. A staff neuro-
psychologist conducted the clinical interview and review of medical records, wrote the integrative report and provided feed-
back to patients. Data were collected through an archival retrospective chart review of a consecutive series of TBI referrals.
The study was approved by the Institutional Review Board. Ethical guidelines regulating research with human participants
were followed throughout the project.
Data Analysis
Descriptive statistics (frequency, percentage and cumulative percentage; mean, standard deviation) were computed for the
key variables. Significance testing was performed using the F- and t-tests as well as χ². ANOVAs were followed up with
uncorrected t-tests. Since all post hoc contrasts were a priori planned comparisons, no statistical correction was applied
(Rothman, 1990; Perneger, 1998). In addition, the tension between statistical versus clinical significance was resolved by
consistently reporting effect size estimates associated with each relevant contrast: partial eta squared (ηp²), Cohen’s d and Φ².
Receiver operating characteristics (ROC) analyses [area under the curve (AUC) with 95% CI] were performed using SPSS
22.0. The rest of the classification accuracy parameters [sensitivity, specificity, positive and negative likelihood ratio (+LR
and −LR)] were computed using standard formulas.
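The standard formulas referred to above can be sketched as follows. The 2×2 counts in the usage example are hypothetical and chosen only to illustrate the arithmetic; they are not data from this study, and the names `Accuracy` and `classification_accuracy` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    pos_lr: float  # +LR
    neg_lr: float  # -LR


def classification_accuracy(tp: int, fn: int, fp: int, tn: int) -> Accuracy:
    """Compute classification accuracy from a 2x2 table.

    tp: failed the criterion PVT and failed the index test
    fn: failed the criterion PVT but passed the index test
    fp: passed the criterion PVT but failed the index test
    tn: passed the criterion PVT and passed the index test
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # +LR = sensitivity / (1 - specificity); undefined when specificity is 1.
    pos_lr = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    # -LR = (1 - sensitivity) / specificity; undefined when specificity is 0.
    neg_lr = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return Accuracy(sensitivity, specificity, pos_lr, neg_lr)


# Hypothetical counts: 40 invalid and 100 valid profiles on the criterion PVT.
acc = classification_accuracy(tp=20, fn=20, fp=5, tn=95)
print(f"sensitivity={acc.sensitivity:.2f}, specificity={acc.specificity:.2f}, "
      f"+LR={acc.pos_lr:.1f}, -LR={acc.neg_lr:.2f}")
```

With these illustrative counts the index test sits at .50 sensitivity and .95 specificity, so a failing score is about ten times more likely in an invalid profile than in a valid one.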
Results
Mean scores on tests of cognitive ability ranged from Low Average to Average (Table 4). The mean FCR score in the
sample was 15.4 (SD = 1.4; range: 9–16). The median value was 16. The distribution was negatively skewed (−2.48) and had a
strong positive kurtosis (+6.47). The majority of the sample (75.0%) obtained a perfect score on FCR; 6.7% scored 15 and
18.3% scored ≤14.
The Effect of Age, Education, Cognitive Functioning and Injury Severity on FCR
As the study focused on comparing the discriminant power of two cutoffs (FCR ≤14 and ≤15) against a perfect score, the
sample was divided into three groups: FCR =16, FCR =15 and FCR ≤14. This trichotomy was used as the independent var-
iable (IV) for a series of ANOVAs with age, education and cognitive functioning as dependent variables (DVs).
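As a brief illustration of the two cutoffs and the resulting trichotomy, the sketch below applies them to raw FCR scores and recovers the reported percentages from the group sizes given in the note to Table 5 (n = 78, 7 and 19). The function names are hypothetical.

```python
def fails_fcr(score: int, cutoff: int) -> bool:
    """Return True if an FCR raw score (0-16) is at or below the cutoff."""
    return score <= cutoff


def fcr_group(score: int) -> str:
    """Assign an FCR raw score to one of the three analysis groups."""
    if not 0 <= score <= 16:
        raise ValueError("FCR raw scores range from 0 to 16")
    if score == 16:
        return "FCR = 16"
    if score == 15:
        return "FCR = 15"
    return "FCR <= 14"


# Group sizes as reported in the note to Table 5.
group_sizes = {"FCR = 16": 78, "FCR = 15": 7, "FCR <= 14": 19}
total = sum(group_sizes.values())
percentages = {group: round(100 * n / total, 1) for group, n in group_sizes.items()}
print(total, percentages)
```

A score of 15 is the only value on which the two cutoffs disagree: it passes at ≤14 and fails at ≤15, which is why the FCR = 15 group is analyzed separately.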
There was no difference in age among groups. However, there was a significant overall effect on level of education, driven
by the higher mean of FCR =16 subsample. A medium effect was observed on word knowledge, picture vocabulary and sin-
gle word reading performance. ANOVAs were not significant for BCT (Total Errors), TPT (Total Time) and TMT-B T-scores
(Table 5).
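The effect sizes for the significant pairwise contrasts can be checked from the group means and standard deviations in Table 5. The sketch below assumes that Cohen's d was computed with the unweighted pooled standard deviation (the square root of the mean of the two variances); this is an inference from the fact that it reproduces the tabled values, not a formula stated in the text.

```python
import math


def cohens_d(mean_1: float, sd_1: float, mean_2: float, sd_2: float) -> float:
    """Cohen's d using the unweighted pooled SD (assumed, see lead-in)."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# (mean, SD) pairs taken from Table 5 for the contrasts flagged as significant.
contrasts = {
    "ED, FCR=16 vs FCR=15": (14.0, 2.8, 12.0, 1.3),
    "VC, FCR=16 vs FCR<=14": (9.8, 3.0, 7.9, 2.2),
    "PPVT, FCR=16 vs FCR<=14": (100.1, 14.1, 90.1, 9.4),
    "WRAT Reading, FCR=16 vs FCR<=14": (95.7, 12.0, 87.5, 12.4),
    "TMT-B, FCR=16 vs FCR<=14": (44.8, 13.3, 37.7, 17.0),
}

for label, values in contrasts.items():
    print(f"{label}: d = {cohens_d(*values):.2f}")
```

Because the groups differ greatly in size (78 vs. 7 or 19), a sample-size-weighted pooled SD would yield somewhat different values, so the choice of formula matters when comparing these estimates with other studies.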
Likewise, the three groups did not differ in TBI severity (percentage of mTBI patients and those with positive
neuroradiological findings). In addition, the mTBI subsample was almost three times more likely to fail the old FCR cutoff
(≤14; BR_Fail = 21.8%) than the subsample with moderate-to-severe TBI (BR_Fail = 7.7%). Similarly, patients with mTBI were
twice as likely to fail the alternative FCR cutoff (≤15; BR_Fail = 28.2%) than the subsample with moderate-to-severe TBI
(BR_Fail = 15.4%).
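The ratios quoted above can be reproduced with a short calculation. The counts below are not reported directly; they are back-calculated from the sample size (104), the proportion of mTBI cases (75.0%) and the reported base rates of failure, and should be read as an assumption. The function name `risk_ratio` is illustrative.

```python
def risk_ratio(fail_a: int, n_a: int, fail_b: int, n_b: int) -> float:
    """Ratio of the failure rate in group A to the failure rate in group B."""
    return (fail_a / n_a) / (fail_b / n_b)


# Back-calculated (assumed) group sizes: 75.0% of 104 patients had mTBI.
N_MTBI, N_MOD_SEVERE = 78, 26

# Back-calculated (assumed) numbers of FCR failures per cutoff:
# (mTBI failures, moderate-to-severe failures).
failures = {
    "FCR <= 14": (17, 2),
    "FCR <= 15": (22, 4),
}

for cutoff, (fail_mild, fail_severe) in failures.items():
    rate_mild = 100 * fail_mild / N_MTBI
    rate_severe = 100 * fail_severe / N_MOD_SEVERE
    ratio = risk_ratio(fail_mild, N_MTBI, fail_severe, N_MOD_SEVERE)
    print(f"{cutoff}: mTBI {rate_mild:.1f}% vs moderate-to-severe "
          f"{rate_severe:.1f}% (ratio {ratio:.2f})")
```

Note that the moderate-to-severe estimates rest on only two and four failing patients, so the ratios are sensitive to a single reclassified case.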
The Classification Accuracy of FCR Against Reference PVTs
All ROC models evaluating the level of agreement between FCR and reference PVTs were statistically significant
(p < .01). AUC values ranged from .71 (DS_ACSS) to .83 (RH_CVLT-II). The most stable AUC estimates were obtained against the
WMT (95% CI: .65–.85), while the least stable estimates were observed on EI-5_NR (95% CI: .61–.93).
ROC analyses were followed up with direct comparisons between the classification accuracy of the old FCR cutoff (≤14)
and the proposed alternative (≤15) against the reference PVTs. All cross-validation analyses met the minimum standard of
specificity (.84; Larrabee, 2003), with values ranging from .85 to .98. Sensitivity was more variable, fluctuating between .40
and .72. The BR_Fail in reference PVTs ranged from 10.6% (TMT-A) to 38.5% (WMT).
FCR ≤14 produced a sensitivity of .40 against the WMT, at .95 specificity. The switch to ≤15 increased sensitivity to .47,
while preserving the same specificity. Classification accuracy was comparable between the two cutoffs against WCT (.48–.50
sensitivity at .93 specificity). The new cutoff outperformed the old one against EI-5_REC in sensitivity (.52/.44) while both
Table 4. Group-Level Performance on the Tests Administered
Test Name   Measure                              M        SD      Descriptive Range
Animals     T-score                              42.6     12.0    Low Average
BDI-II      Total Raw Score                      15.3     11.5    Mild Depression
BCT         Total Errors T-score                 41.7     13.8    Low Average
CVLT-II     Trials 1–5 T-score                   45.1     14.5    Average
            LD-FR z-score                        −1.03    1.54    Low Average
CPT-II      Omissions T-score                    73.4     61.5    Elevated
            Commissions T-score                  52.2     11.2    Within Normal Limits
            Hit RT T-score                       53.7     14.6    Within Normal Limits
FTT         Dominant Hand T-score                45.4     12.5    Average
WMT         % Fail                               38.5     N/A
PPVT-4      Standard Score                       98.1     13.8    Average
SCL-90-R    GSI T-score                          62.5     12.7    Within Normal Limits
TPT         Total Time T-score                   45.0     13.7    Average
TMT         Trails A T-score                     43.0     13.5    Low Average
            Trails B T-score                     43.3     13.9    Low Average
WAIS-IV     VCI Standard Score                   95.1     15.3    Average
            PRI Standard Score                   96.2     16.4    Average
            WMI Standard Score                   92.7     15.6    Average
            PSI Standard Score                   89.4     16.4    Low Average
WMS-IV      LM I Age-Corrected Scaled Score      8.1      3.5     Low Average
            LM II Age-Corrected Scaled Score     7.6      3.4     Low Average
            VPA I Age-Corrected Scaled Score     8.5      3.5     Low Average
            VPA II Age-Corrected Scaled Score    8.6      3.7     Average
            VR I Age-Corrected Scaled Score      8.6      3.7     Average
            VR II Age-Corrected Scaled Score     7.9      3.1     Low Average
WRAT-4      Word Reading Standard Score          93.9     12.5    Average
WCST        Perseverative Errors T-score         46.9     11.1    Average
WCT         Total Accuracy Raw Score             47.6     3.7     Pass
Note: LD-FR: Long-delay free recall; RT: Reaction Time; GSI: Global Severity Index; VCI: Verbal Comprehension Index; PRI: Perceptual Reasoning Index;
WMI: Working Memory Index; PSI: Processing Speed Index; LM: Logical Memory; I: Immediate Recall; II: Delayed Recall; VPA: Verbal Paired
Associates; VR: Visual Reproduction. Values for standard deviations were italicized.
Table 5. Age, Education, Injury Severity and Performance on Select Neuropsychological Tests as a Function of Trichotomized FCR Scores
FCR            Age    ED     VC     PPVT   WRAT     BCT    TPT    TMT    % mTBI  +NR
                                           Reading  Total  Total  B
16    M        37.5   14.0   9.8    100.1  95.7     42.6   46.1   44.8   71.8    67.7
      SD       15.6   2.8    3.0    14.1   12.0     13.9   14.1   13.3
15    M        47.0   12.0   9.0    96.4   91.4     43.2   44.7   41.1   71.4    66.7
      SD       12.1   1.3    3.1    14.1   13.8     13.2   8.2    7.9
≤14   M        41.4   12.8   7.9    90.1   87.5     38.1   39.2   37.7   89.5    71.4
      SD       10.2   1.9    2.2    9.4    12.4     13.6   12.2   17.0
p              .18    .05    <.05   <.05   <.05     .43    .28    .12    .27     .96
η²             .03    .06    .06    .08    .07      .02    .03    .04    —       —
Φ²             —      —      —      —      —        —      —      —      .03     .00
Sig. post hocs None   0–1    0–2    0–2    0–2      None   None   0–2    —       —
d              —      .92    .72    .83    .67      —      —      .47    —       —
Note. FCR: CVLT-II Forced Choice Recognition trial raw score; ED: years of formal education; VC: WAIS-IV Vocabulary age-corrected scale score; PPVT:
Peabody Picture Vocabulary Test, 4th edition; WRAT-4 Reading: Wide Range Achievement Test, 4th edition, Reading subtest standard score; BCT: Booklet
Category Test Total Errors T-score; TPT: Tactual Performance Test Total Time T-score; TMT-B: Trail Making Test B T-score; % mTBI: % patients with mild
traumatic brain injury (vs. those with moderate-to-severe TBI); +NR: positive neuroradiological findings; Sig. post hocs: pairwise comparisons with p < .05; 0:
FCR = 16 (n = 78); 1: FCR = 15 (n = 7); 2: FCR ≤14 (n = 19). Values for standard deviations were italicized.
maintained very high specificity (.98). A similar pattern of increased sensitivity (.50/.58) and steady specificity (.91/.90) was
observed against EI-5_NR as the analyses shifted from the old to the new cutoff. Sensitivity spiked against RH_CVLT-II with both
cutoffs (.65/.72) in the backdrop of good specificity (.93/.92). Again, the new cutoff outperformed the old one against DS_ACSS
in sensitivity (.45/.53) while producing the same specificity (.89). Overall, the new cutoff increased sensitivity from .50 to .56
compared to the old one, while preserving the same specificity (.92).
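The overall figures can be approximated by averaging the per-criterion values. The following is a minimal sketch, assuming that "overall" refers to an unweighted mean across the seven reference PVTs in Table 6 (the averaging method is not stated in the text). Because the inputs are already rounded to two decimals, the results may differ by .01 from values computed on unrounded data. The names `TABLE6` and `summarize` are illustrative only.

```python
from statistics import mean

# Sensitivity and specificity as printed in Table 6, in the order
# WMT, WCT, EI-5_REC, EI-5_NR, RH_CVLT-II, LRE_Wolfe, DS_ACSS.
TABLE6 = {
    "<=15": {
        "sens": [0.47, 0.50, 0.52, 0.58, 0.72, 0.59, 0.53],
        "spec": [0.95, 0.93, 0.98, 0.90, 0.92, 0.88, 0.89],
    },
    "<=14": {
        "sens": [0.40, 0.48, 0.44, 0.50, 0.65, 0.56, 0.45],
        "spec": [0.95, 0.93, 0.98, 0.91, 0.93, 0.89, 0.89],
    },
}


def summarize(cutoff):
    """Return (mean sensitivity, mean specificity) for one FCR cutoff."""
    row = TABLE6[cutoff]
    return mean(row["sens"]), mean(row["spec"])


if __name__ == "__main__":
    for cutoff in TABLE6:
        sens, spec = summarize(cutoff)
        print(f"FCR {cutoff}: mean sensitivity {sens:.2f}, mean specificity {spec:.2f}")
```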
This pattern of consistently higher sensitivity and essentially unchanged specificity associated with the new cutoff was also
observed at the level of LRs (Table 6). With the exception of WCT, FCR ≤15 produced higher +LRs than FCR ≤14 against
the reference PVTs. The new cutoff had consistently lower −LRs against all reference PVTs than the old cutoff, suggesting
superior classification accuracy.
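For readers who wish to check the likelihood ratios, the standard definitions are +LR = sensitivity / (1 − specificity) and −LR = (1 − sensitivity) / specificity. The sketch below applies them to the rounded WMT values for FCR ≤15; the function name `likelihood_ratios` is an illustrative choice. Results computed from two-decimal inputs will not exactly match Table 6, which was presumably derived from unrounded values.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) for a dichotomous test.

    +LR = sensitivity / (1 - specificity)
    -LR = (1 - sensitivity) / specificity
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")
    # Guard against division by zero at the extremes.
    pos_lr = math.inf if specificity == 1.0 else sensitivity / (1.0 - specificity)
    neg_lr = math.inf if specificity == 0.0 else (1.0 - sensitivity) / specificity
    return pos_lr, neg_lr


if __name__ == "__main__":
    # Rounded values for FCR <=15 against the WMT (Table 6).
    pos_lr, neg_lr = likelihood_ratios(0.47, 0.95)
    print(f"+LR = {pos_lr:.2f}, -LR = {neg_lr:.2f}")
```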
The Relationship Between FCR and Emotional Functioning
The majority of the sample (54.1%) scored in the non-clinical range on the SCL-90-R using a GSI T-score ≥63 as the cut-
off. However, only 38.5% had fewer than two elevations (T ≥63) on the nine clinical scales, the other criterion for establish-
ing the presence of clinically significant distress (Derogatis, 1994). The number of clinical elevations (M=3.6, SD =3.3)
produced a bimodal distribution with two distinct clusters: patients with either zero (25.0%) or nine (14.6%) scores ≥63.
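The two SCL-90-R criteria described above (GSI T ≥63; two or more clinical scales at T ≥63) can be expressed as a short scoring routine. This is a sketch only: the patient profile is hypothetical, the function names are illustrative, and the two criteria are reported separately rather than combined, because the text does not specify a combination rule.

```python
CLINICAL_T = 63  # T-score threshold used in the text
CLINICAL_SCALES = ("SOM", "O-C", "I-S", "DEP", "ANX", "HOS", "PHOB", "PAR", "PSY")


def count_elevations(t_scores):
    """Number of the nine clinical scales with T >= 63."""
    return sum(1 for scale in CLINICAL_SCALES if t_scores[scale] >= CLINICAL_T)


def distress_flags(t_scores, gsi):
    """Evaluate the two criteria separately for one patient."""
    elevations = count_elevations(t_scores)
    return {
        "elevations": elevations,
        "gsi_elevated": gsi >= CLINICAL_T,
        "two_or_more_elevations": elevations >= 2,
    }


if __name__ == "__main__":
    # Hypothetical profile, not taken from the study sample.
    profile = {"SOM": 58, "O-C": 66, "I-S": 52, "DEP": 64, "ANX": 55,
               "HOS": 50, "PHOB": 48, "PAR": 51, "PSY": 57}
    print(distress_flags(profile, gsi=60))
```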
ANOVAs using the trichotomized FCR (16, 15 and ≤14) as IV and the SCL-90-R scales as DVs produced significant
main effects for all SCL-90-R scales except ANX and PHO. Effect sizes (η_p²) ranged from .08 (medium) on HOS to .18
(large) on PSY. All post hoc contrasts were significant between FCR =16 and FCR =15 subsamples except ANX and PHO.
Effect sizes (d) ranged from .87 (large) on SOM to 1.67 (very large) on O-C. All post hoc contrasts were significant between
FCR =16 and FCR ≤14 subsamples except HOS. Effect sizes (d) ranged from .62 (medium) on PHO to .83 (large) on PSY.
When SCL-90-R scores were dichotomized around the T ≥63 cutoff into “clinical” versus “non-clinical”, non-parametric
contrasts produced essentially the same results (Table 7). One comparison (PAR) became non-significant. All effect size
estimates (Φ²) were within .02 of η_p² values produced by ANOVAs with the exception of the GSI.
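As an illustration of the non-parametric effect size, Φ² can be obtained as χ²/N from a contingency table. The sketch below uses SOM counts that were back-calculated from the percentages in Table 7 (34.2%, 71.4% and 72.2% in the clinical range) under assumed subsample sizes of 73, 7 and 18. These counts and subsample sizes are inferences, not reported data. Under these assumptions the result lands close to the .11 printed for SOM.

```python
def chi_square(table):
    """Pearson chi-square statistic and total N for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("expected count of zero; statistic undefined")
            statistic += (observed - expected) ** 2 / expected
    return statistic, n


def phi_squared(table):
    """Phi-squared effect size, defined here as chi-square divided by N."""
    statistic, n = chi_square(table)
    return statistic / n


# Rows: FCR = 16, FCR = 15, FCR <= 14; columns: clinical, non-clinical on SOM.
# ASSUMPTION: counts inferred from percentages in Table 7, not reported directly.
SOM_COUNTS = [[25, 48], [5, 2], [13, 5]]

if __name__ == "__main__":
    print(f"phi-squared = {phi_squared(SOM_COUNTS):.2f}")
```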
All three groups produced saw-tooth profiles, with distinct spikes on O-C, DEP and PSY (Fig. 1). The FCR = 16 subsample
had only one mean ≥63 (on O-C), and on average had 2.9 elevations (SD = 3.2). The FCR = 15 subsample produced mean
T ≥63 on all scales, and on average had 6.9 elevations (SD = 1.9). The FCR ≤14 subsample produced mean T ≥63 on SOM, O-C,
DEP, PSY and the GSI, and on average had 5.4 elevations (SD = 3.1).
ANOVAs were repeated on the BDI-II, producing a large effect (η_p² = .17), driven by the non-clinical range score of the
FCR = 16 group (M = 12.5, SD = 10.5). The FCR = 15 group (M = 24.9, SD = 6.9) did not differ from the FCR ≤14 group
(M = 22.9, SD = 11.8). Both of these means were in the range of moderate clinical depression, and significantly higher than
the FCR = 16 mean (d = 1.40 and .93, respectively; large).
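The BDI-II effect sizes can be recovered from the group means and standard deviations reported above. The sketch assumes Cohen's d with the root mean square of the two standard deviations as the denominator (unweighted by group size). The text does not state which formula was used, but this variant matches the values in Table 7 (1.40 for FCR = 15 and .93 for FCR ≤14, each against FCR = 16). The function name `cohens_d` is illustrative.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the root mean square of the two SDs.

    ASSUMPTION: group sizes are ignored when pooling the SDs.
    """
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# BDI-II means and SDs as reported in the text.
D_15_VS_16 = cohens_d(24.9, 6.9, 12.5, 10.5)
D_14_VS_16 = cohens_d(22.9, 11.8, 12.5, 10.5)

if __name__ == "__main__":
    print(f"FCR = 15 vs FCR = 16: d = {D_15_VS_16:.2f}")
    print(f"FCR <= 14 vs FCR = 16: d = {D_14_VS_16:.2f}")
```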
Table 6. A Direct Comparison between the Classification Accuracy of the Two FCR Cutoffs against Reference PVTs
                        WMT       WCT      EI-5_REC  EI-5_NR  RH_CVLT-II  LRE_Wolfe  DS_ACSS
Cutoff                  Standard  ≤47      ≥4        ≥4       ≤10         ≥.625      ≤6
BR_Fail                 38.5      27.0     37.2      17.9     19.2        18.0       21.2
AUC                     .75       .72      .78       .77      .83         .74        .71
p                       <.001     <.01     <.001     <.001    <.001       <.01       <.01
95% CI                  .65–.85   .59–.85  .67–.90   .61–.93  .71–.95     .60–.89    .59–.85
FCR ≤15 (BR_Fail = 25.0)
  SENS                  .47       .50      .52       .58      .72         .59        .53
  SPEC                  .95       .93      .98       .90      .92         .88        .89
  +LR                   9.88      6.70     27.5      6.13     9.51        5.03       4.56
  −LR                   0.56      0.54     0.49      0.46     0.30        0.47       0.54
FCR ≤14 (BR_Fail = 18.3)
  SENS                  .40       .48      .44       .50      .65         .56        .45
  SPEC                  .95       .93      .98       .91      .93         .89        .89
  +LR                   8.53      7.03     23.6      5.33     9.10        5.06       4.14
  −LR                   0.63      0.57     0.57      0.55     0.38        0.50       0.61
Note: WMT: Word Memory Test (Green, 2003); WCT: Word Choice Test (Pearson, 2009); EI-5_REC: Erdodi Index Five—Recognition based; EI-5_NR: Erdodi
Index Five—Non-recognition based; RH_CVLT-II: CVLT-II Yes/No Recognition hits raw score (Wolfe et al., 2010); LRE_Wolfe: Logistic regression equation based
on a combination of three CVLT-II scores: long-delay free recall raw score, total recall discriminability z-score and d′ raw score (Donders & Strong, 2011;
Wolfe et al., 2010); DS_ACSS: Digit Span age-corrected scaled score (Axelrod et al., 2006; Spencer et al., 2013); BR_Fail: Base rate of failure (%); AUC: Area under
the curve; FCR: CVLT-II forced choice recognition; SENS: Sensitivity; SPEC: Specificity; +LR: Positive likelihood ratio; −LR: Negative likelihood ratio;
Number of participants with FCR ≤15 is 26; Number of participants with FCR ≤14 is 19. The italic values represent base rates of failure.
To investigate whether these findings would generalize to other PVTs, a series of independent t-tests were performed
between patients who passed and those who failed the WMT on SCL-90-R and BDI-II scores. All contrasts were significant,
with the Fail group reporting higher levels of symptoms. Effect size estimates ranged from .46 (medium) to 1.01 (large).
The analyses were repeated using a series of ANOVAs with the EI-5_REC as trichotomous independent variable (Pass/
Borderline/Fail) and the SCL-90-R and BDI-II scores as dependent variables. All ANOVAs were significant (η_p²: .06–.12;
medium-large effects) with the exception of the SOM scale (Table 8). The only post hoc contrast that consistently reached
significance was between the Pass and Fail groups, with effect sizes ranging from .43 (medium) to .87 (large). Unlike with
FCR, there was a linear relationship between level of PVT failure and self-reported emotional distress, with the Pass group re-
porting the least, the Fail group reporting the most emotional distress, with the Borderline group in the middle (Fig. 2).
Table 7. SCL-90-R Scores as a Function of FCR Performance
FCR             SOM    O-C    I-S    DEP    ANX    HOS    PHO    PAR    PSY    GSI    Σ63    BDI-II
16    M         58.3   63.6   54.3   60.2   55.5   54.3   55.5   53.0   56.4   59.8   2.9    12.5
      SD        12.4   11.8   13.2   11.5   11.9   11.3   11.9   12.8   11.7   12.2   3.2    10.5
      %CLIN     34.2   47.9   29.2   39.7   31.5   23.3   31.9   21.9   35.6   32.9   49.3   19.2
15    M         67.7   78.9   70.4   73.3   63.5   65.1   63.4   66.3   72.3   75.0   6.9    24.9
      SD        9.0    3.5    9.7    5.6    15.6   7.9    15.6   11.6   7.1    6.6    1.9    6.9
      %CLIN     71.4   100    85.7   100    57.1   57.1   57.1   57.1   100    100    100    57.1
≤14   M         66.4   71.6   63.6   67.9   62.1   58.8   62.1   61.2   65.9   68.9   5.4    22.9
      SD        12.5   12.3   12.7   10.2   12.7   10.5   13.4   10.6   11.1   11.7   3.1    11.8
      %CLIN     72.2   83.3   55.6   72.2   50.0   44.4   50.0   38.9   72.2   77.8   94.4   61.1
ANOVA p         <.05   <.01   <.01   <.01   NS     <.05   NS     <.01   <.01   <.01   <.01   <.01
η_p²            .09    .15    .14    .13    —      .08    —      .11    .18    .15    .16    .17
Sig. post hocs  0–1    0–1    0–1    0–1    NS     0–1    NS     0–1    0–1    0–1    0–1    0–1
                0–2    0–2    0–2    0–2    0–2    NS     0–2    0–2    0–2    0–2    0–2    0–2
d for 0–1       .87    1.67   1.39   1.45   —      1.11   —      1.09   1.64   1.55   1.55   1.40
d for 0–2       .65    .66    .74    .71    .61    —      .52    .63    .83    .76    .79    .93
χ² p            <.01   <.01   <.01   <.01   NS     .05    NS     NS     <.01   <.01   <.01   <.01
Φ²              .11    .13    .12    .14    —      .06    —      —      .17    .21    .18    .15
Note. All SCL-90-R scales are in T-scores (M = 50, SD = 10); FCR: CVLT-II Forced Choice Recognition trial raw score; SCL-90-R: Symptom Checklist-90-Revised;
SOM: somatic distress; O-C: obsessive-compulsive symptoms; I-S: interpersonal sensitivity; DEP: depression; ANX: anxiety; HOS: hostility; PHO: phobic anxiety; PAR:
paranoia; PSY: psychotic symptoms; GSI: Global Severity Index; Σ63: Sum of T-scores ≥63 on the SCL-90-R clinical scales; BDI-II: Beck Depression Inventory—
Second Edition; %CLIN: Percent of the subsample scoring T ≥63 on the SCL-90-R clinical scales; percent of the subsample with two or more scores T ≥63 on Σ63; and
percent of the subsample with BDI-II raw score ≥20 (cutoff for Moderate Depression); Sig. post hocs: pairwise comparisons with p < .05; 0: FCR = 16 (n = 78); 1: FCR
= 15 (n = 7); 2: FCR ≤14 (n = 19). Italic and bold values represent standard deviations and percent of the sample above the clinical threshold/phi-squared, respectively.
[Figure: T-score (y-axis, 50–80) by SCL-90-R scale (x-axis: SOM, O-C, I-S, DEP, ANX, HOS, PHOB, PAR, PSY, GSI), plotted separately for FCR = 16, FCR = 15 and FCR ≤14.]
Fig. 1. SCL-90-R profiles associated with three levels of FCR performance; number of participants with perfect score on the FCR is 78; number of
participants with FCR = 15 is 7; number of participants with FCR ≤14 is 19.
Discussion
The present study was designed to compare the de facto FCR cutoff (≤14) to a more liberal alternative (≤15) in a sample
of clinically referred patients with TBI. Both cutoffs performed around the “Larrabee limit” (.50 sensitivity at .90 specificity).
The hypothesis that increasing the cutoff will improve sensitivity while maintaining specificity was supported by the data. On
average, FCR ≤15 correctly classified an additional 6% of the invalid response sets, while maintaining a false positive rate of
<10%. Likewise, the alternative cutoff produced comparable or better classification accuracy at the level of likelihood ratios.
This pattern of findings was remarkably consistent across a wide range of reference PVTs, including auditory and visual,
univariate and multivariate criteria, free-standing and embedded PVTs, indicators based on the FCR paradigm and those
derived from tests of attention. The replicable superiority of the new cutoff against a variety of criterion measures addresses
previous concerns about modality specificity (Erdodi, Tyson, Abeare et al., 2017; Erdodi, Tyson, Shahein et al., 2017), and
Table 8. SCL-90-R and BDI-II Scores as a Function of Passing or Failing the WMT and the EI-5_REC
                          SOM    O-C    I-S    DEP    ANX    HOS    PHO    PAR    PSY    GSI    Σ63    BDI-II
WMT
Pass                M     57.1   62.0   54.1   60.0   54.2   53.4   53.7   53.2   56.2   58.8   2.6    12.3
                    SD    12.5   11.8   13.1   11.2   13.5   11.2   10.9   12.6   11.9   12.2   3.0    10.4
Fail                M     65.3   73.0   62.4   66.7   62.2   60.0   63.1   59.1   64.2   68.7   5.3    20.3
                    SD    11.6   10.0   13.4   11.3   13.6   10.3   13.4   12.9   11.7   11.0   3.2    11.7
                    p     <.01   <.01   <.01   <.01   <.01   <.01   <.01   <.05   <.01   <.01   <.01   <.01
                    d     .68    1.01   .63    .60    .59    .61    .77    .46    .68    .85    .87    .72
EI-5_REC
Pass (n = 51)       M     58.8   62.3   53.8   59.9   53.4   52.2   54.1   52.8   56.1   58.7   2.6    12.3
                    SD    12.7   11.4   13.3   11.5   13.9   11.2   11.1   13.0   12.1   12.7   3.0    10.7
Borderline (n = 18) M     61.1   67.6   56.9   63.8   60.4   58.1   58.6   56.7   59.0   63.7   4.4    17.9
                    SD    11.1   13.3   13.1   10.5   10.4   11.7   12.3   11.9   10.6   11.0   3.2    12.3
Fail (n = 29)       M     64.8   72.1   63.3   66.3   62.0   59.3   62.0   59.3   64.9   68.6   5.0    19.0
                    SD    12.7   11.0   13.3   11.8   14.5   10.3   14.2   13.0   12.4   11.6   3.4    11.2
                    p     .06    <.01   <.01   .06    <.05   <.05   <.05   .09    <.01   <.01   <.01   <.05
                    η_p²  .06    .13    .09    .06    .09    .06    .08    .05    .10    .12    .12    .08
                    d_P-F .43    .87    .72    .55    .61    .66    .62    .50    .72    .81    .75    .61
Note: All SCL-90-R scales are in T-scores (M = 50, SD = 10); SCL-90-R: Symptom Checklist-90-Revised; SOM: somatic distress; O-C: obsessive-
compulsive symptoms; I-S: interpersonal sensitivity; DEP: depression; ANX: anxiety; HOS: hostility; PHO: phobic anxiety; PAR: paranoia; PSY: psychotic
symptoms; GSI: Global Severity Index; Σ63: Sum of T-scores ≥63 on the SCL-90-R clinical scales; BDI-II: Beck Depression Inventory—Second Edition;
WMT: Word Memory Test; EI-5_REC: Erdodi Index Five—Recognition based; d_P-F: Cohen’s d for the Pass vs. Fail post hoc contrast. Italic and bold values
represent standard deviations and Cohen’s d, respectively.
[Figure: T-score (y-axis, 50–80) by SCL-90-R scale (x-axis: SOM, O-C, I-S, DEP, ANX, HOS, PHOB, PAR, PSY, GSI), plotted separately for Pass (0–1), Borderline (2–3) and Fail (≥4).]
Fig. 2. SCL-90-R profiles associated with the three levels of EI-5_REC performance; number of participants in the Pass range (0–1) is 51; number of
participants in the Borderline range (2–3) is 18; number of participants in the Fail range (≥4) is 29.
provides empirical validation to earlier predictions that even a single error on FCR may indicate invalid responding (D. Delis,
personal communication, 10 May 2012). Our results are also consistent with research on the child version of FCR
(Lichtenstein et al., 2017). In addition, the consistently high specificity and +LR of the new cutoff against multiple reference
PVTs suggests that the more liberal FCR cutoff does not inflate false positive rates.
Equally importantly, subsamples with FCR scores 16, 15 and ≤14 did not differ from each other in injury severity, neuro-
radiological findings, or on the measures known to be sensitive to TBI (Booklet Category Test, Tactual Performance Test and
Trails B; Grant & Adams, 1996). These findings suggest that FCR is independent of objective measures of impairment,
consistent with previous reports (Baldo et al., 2002; Donders & Strong, 2011). The fact that, paradoxically, a significant
difference emerged on “hold” tests (Boone, 2013) known to be resistant to the deleterious effects of TBI (i.e., word knowledge,
picture vocabulary and single word reading) provides further evidence that FCR is unrelated to cognitive impairment
subsequent to TBI. In fact, internally inconsistent patterns of test scores have been identified as emergent markers of invalid
responding (Boone, 2013; Larrabee, 2012; Slick, Sherman & Iverson, 1999).
Furthermore, there was a “reverse injury severity effect” on FCR. In other words, patients with mTBI were two to three
times more likely to fail FCR cutoffs compared to patients with moderate-to-severe TBI. Although counterintuitive, this
phenomenon is well-replicated in the research literature (Carone, 2008; Erdodi & Rai, 2017; Green, Iverson, & Allen, 1999;
Green, Flaro, & Courtney, 2009; Sweet, Goldman, & Guidotti Breting, 2013). In the broader context of this well-established
apparent paradox of elevated BR_Fail in mTBI, the current results should alleviate concerns about false positive errors on FCR
due to genuine neurological impairment.
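The size of the reverse injury severity effect can be illustrated with a risk ratio and an odds ratio. The sketch below relies on mTBI counts back-calculated from the percentages and group sizes in Table 5 (56 of 78, 5 of 7 and 17 of 19). These counts are inferences rather than reported data, and the text does not specify which index underlies the "two to three times" statement. Under these assumptions the ratios fall between roughly 1.8 and 3.3, depending on the cutoff and the index chosen.

```python
# (group size, number with mTBI) per FCR group.
# ASSUMPTION: mTBI counts inferred from the percentages in Table 5.
GROUPS = {"16": (78, 56), "15": (7, 5), "<=14": (19, 17)}


def fail_counts(failing_groups):
    """Return (mTBI fails, mTBI total, mod-severe fails, mod-severe total)."""
    mild_fail = sum(GROUPS[g][1] for g in failing_groups)
    modsev_fail = sum(GROUPS[g][0] - GROUPS[g][1] for g in failing_groups)
    mild_total = sum(mild for _, mild in GROUPS.values())
    modsev_total = sum(size - mild for size, mild in GROUPS.values())
    return mild_fail, mild_total, modsev_fail, modsev_total


def risk_ratio(fail_a, total_a, fail_b, total_b):
    """Failure rate in group A divided by failure rate in group B."""
    return (fail_a / total_a) / (fail_b / total_b)


def odds_ratio(fail_a, total_a, fail_b, total_b):
    """Odds of failure in group A divided by odds of failure in group B."""
    return (fail_a / (total_a - fail_a)) / (fail_b / (total_b - fail_b))


if __name__ == "__main__":
    for label, failing in (("FCR <=14", ["<=14"]), ("FCR <=15", ["15", "<=14"])):
        counts = fail_counts(failing)
        print(f"{label}: RR = {risk_ratio(*counts):.2f}, OR = {odds_ratio(*counts):.2f}")
```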
The second hypothesis that performance on FCR would be related to self-reported emotional distress was also supported.
Patients who obtained a perfect score on FCR had the lowest level of depression on SCL-90-R and BDI-II, both as continuous
scales and as percentage in the clinical range. Those who made any error on FCR reported more severe psychiatric symptoms
globally, with large to very large effect sizes. These findings are consistent with some of the existing literature that documents
a link between psychiatric history and invalid performance on neurocognitive testing (Martens, Donders, & Millis, 2001;
Moore & Donders, 2004), but contradict other reports that anxiety and depression are unrelated to PVT failure (Ashendorf,
Constantinou & McCaffrey, 2004; Considine et al., 2011; Egeland et al., 2005; Rees et al., 2001).
The divergence between our study and some previous investigations on PVTs and psychological distress may be driven by
two main factors. First, many of them operationalized performance validity using a single criterion measure, such as the
TOMM or the Rey 15-item test at traditional cutoffs (Trial 2 <45 and free recall <9, respectively), which are known to have
limited sensitivity to invalid responding (Green, 2007; Reznek, 2005). Therefore, those negative findings may reflect
undetected invalid profiles. Second, those studies focused on psychiatric disorders, whereas our sample consisted of patients
with TBI, some of whom also reported emotional problems. As such, our positive findings could be due to the additive effect
of neuropsychological deficits subsequent to TBI, pre-existing or emerging deficits in emotional regulation, or other contex-
tual factors uniquely related to TBI and post-TBI depression and anxiety.
While the evidence linking depression and memory deficits is mixed both within and between studies (Bearden et al., 2006;
Ilsley, Moffoot, & Carroll, 1995; Keiski, Shore, & Hamilton, 2007; Kessels, Ruis, & Kappelle, 2007; Langenecker et al., 2005;
Raskin, Mateer, & Tweeten, 1998), there is growing evidence that memory tests are impacted more by invalid responding than
psychiatric disorders (Boone, 2013; Coleman, Rapport, Millis, Ricker, & Farchione, 1998; Larrabee, 2012; Suhr, Tranel, Wefel,
& Barrash, 1997; Trueblood, 1994). In fact, Rohling, Green, Allen and Iverson (2002) argue that a meaningful investigation of
the interaction between depression and cognitive functioning must exclude individuals who fail PVTs. Our findings are congru-
ent with this line of research on co-existing TBI, self-reported emotional problems and PVT failures.
As FCR performance correlates with scores on both PVTs and psychiatric symptom inventories, the clinical interpretation
of failing this validity indicator is a challenge. The group-level pattern of scores observed in this sample fits several criteria of
“Cogniform Disorder” introduced by Delis and Wetter (2007): internally inconsistent neurocognitive profiles, combination of
test scores that are rare in patients with genuine neurological impairment, and objective evidence of poor effort. Although the
observational data presented in this study do not allow for causal attributions, they raise some important questions. Does
genuine emotional distress increase vulnerability to PVT failures? Do patients with non-credible presentation exaggerate both
emotional distress and cognitive deficits? Are certain PVTs more sensitive than others to both forms of invalid responding?
Although there is an emerging consensus that symptom and performance validity are distinct constructs and therefore,
should be evaluated separately (van Dyke et al., 2013), it is plausible that they share part of their etiology. If the link between
FCR and psychiatric symptoms is replicated in future studies, failing FCR might become a marker of not only invalid perfor-
mance, but perhaps also of “psychogenic interference”—a failure to demonstrate one’s true ability level on cognitive testing
due to acute psychiatric symptoms. It is interesting that the FCR =15 group reported more severe psychiatric symptomatol-
ogy than the FCR ≤14 group. Also, the FCR =16 group produced a pattern of performance that is consistent with the bona
fide cognitive sequelae subsequent to TBI (i.e., intact performance on “hold” tests, and mild deficits on measures known to be
sensitive to head injury). In contrast, the FCR ≤14 group demonstrated uniformly low performance across both types of tests,
with the FCR =15 group in between.
It is possible that there are group-level differences in the etiology of PVT failures, with the more heavily psychogenic
influences having a milder impact than other factors that are known to have strong effects on PVT performance, such as external
incentives to appear impaired (Boone, 2013; Larrabee, 2012). However, this cannot be determined with the current sample,
given the absence of data on litigation status. While previous research found that certain PVTs appear to be uniquely sensitive
to emotional distress (Erdodi et al., 2016), it failed to disentangle the relative contribution of psychogenic interference and
volitional suppression of performance on cognitive testing.
The cumulative clinical evidence suggests that the etiology of invalid performance is likely multifactorial. A PVT failure
can be the expression of several contributing and potentially interacting factors and hence, does not automatically mean delib-
erate suppression of cognitive ability (i.e., malingering). A full consideration of alternative explanations to non-credible pre-
sentation is instrumental in providing an accurate, nuanced and clinically helpful interpretation of neurocognitive profiles
(Bigler, 2012, 2015). Developing a conceptually sound and empirically supported model for subtyping non-credible respond-
ing has important forensic and clinical applications.
For example, in personal injury litigation, multiple unequivocal PVT failures raise the possibility of malingering and thus,
have obvious implications for the legitimacy of the lawsuit. In contrast, a neuropsychologist’s conclusion that a plaintiff failed
to put forth adequate effort, but not deliberately so, may shift the focus to exploring other plausible clinical issues that may or
may not be related to the accident (depression, unresolved developmental trauma, exacerbation of a pre-existing psychological
vulnerability, righteous anger towards the perpetrator of the injury, etc.). In those cases, the assessor’s responsibility is to (a)
determine whether the data are consistent with an alternative accident-related etiology; (b) render an opinion that even if psy-
chogenic factors are operative, they cannot account for the level of impairment demonstrated during testing, or (c) conclude
that regardless of the reason behind unexpectedly low scores, they cannot be attributed to accident-related factors.
Even in clinical settings and in the absence of apparent external incentives to appear impaired, assessors often face the
complex task of interpreting co-existing PVT failures and medically verified neurological problems (Erdodi et al., 2016). In
such cases, it is the neuropsychologist’s responsibility to determine whether (a) low scores are a manifestation of a legitimate
disease process; (b) even in the context of documented severe impairments the low scores are still not credible; or (c)
independent of neurological manifestations, ancillary issues are contributing to low performance, such as unremitting
dependence or chronic resignation in the face of cognitive demands after living with a debilitating neurological impairment
for many years.
These considerations are important for optimizing the clinical management of the patient. If an evaluation is deemed valid
(i.e., PVT failures are attributable to despondent resignation that deflated scores throughout testing), certain aspects of the pa-
tient’s impairment might be reversible. In such cases referral for psychotherapy or cognitive rehabilitation has the potential to
restore some of the cognitive functioning.
For example, in the present sample elevations on SCL-90-R were related to errors on FCR and failures on other PVTs. If
self-reported psychopathology is causally related to invalid responding, treating the psychiatric symptoms could conceivably
improve cognitive performance. Although speculations about the reasons behind poor effort are epistemologically risky,
providing a sound, albeit tentative, explanation could be important, as the clinical outcome hinges on the correct interpretation of
non-credible presentation. Beyond the simple “valid/invalid” dichotomy, the assessor carries the responsibility of determining
whether a meaningful intervention is feasible. Erring on either side can be costly. Dismissing a patient as non-credible may
deprive the individual of the opportunity to recover lost function. Recommending therapy for a malingerer may allocate lim-
ited health care services to an individual who is invested in appearing impaired and thus, is unlikely to benefit from the
intervention.
In conclusion, FCR scores should be interpreted in the larger context of injury severity, clinical and psycho-social history,
incentive status as well as the rest of the neurocognitive profile. Marginal failures (FCR =15) likely have a different clinical
meaning in patients with medically verified severe pathology and those with mild or questionable TBI. In the former group, a
single error may be a direct manifestation of the injury. Conversely, in the latter group it may raise concerns about non-
neurological factors contributing to the presentation.
The present study has several strengths. It provided a direct comparison of the classification accuracy of two different FCR
cutoffs across a wide range of reference PVTs in a clinically referred sample with mild and moderate-to-severe TBI. We also
examined the link between FCR failures and self-reported emotional functioning. Inevitably, the study has a number of limita-
tions, too: the sample is relatively small and geographically restricted. More importantly, the FCR =15 subsample was too
small to draw definite conclusions about the neurocognitive profile of patients who only failed the liberal cutoff on FCR.
In addition, as psychiatric symptoms were assessed using face-valid self-report inventories without built-in validity scales,
the veracity of these data is unknown, which is a considerable limitation of our measurement model. However, given the
limited research on the link between emotional functioning and performance validity, documenting a systematic difference in
the level of self-reported psychiatric symptoms as a function of passing or failing PVTs is a meaningful initial step towards a
better understanding of this complex relationship. The fact that previous research that controlled for response bias in
self-report produced similar results (Erdodi, Sagar et al., 2017; Erdodi, Seke et al., 2017) suggests that the shared variance between
elevated symptom report and PVT failure cannot be attributed to a common “malingering factor” (i.e., the same people
fabricate/exaggerate both psychiatric problems and cognitive deficits). More importantly, the nature of the data
(archival/observational) precludes causal modeling of the main effects. Prospective experimental and longitudinal studies that can separate
invalid performance from psychiatric history by design are needed to determine the clinical meaning of FCR failures—evi-
dence of non-credible responding, emotional distress or both?
Conclusion
Even a single error on the FCR is a reliable marker of invalid responding. Based on its superior classification accuracy,
≤15 should replace the current de facto FCR cutoff of ≤14. Failing the FCR was associated with elevated self-reported psy-
chiatric symptoms. Given that the link between invalid performance and emotional distress is poorly understood, further
research is needed to explore the underlying causal mechanisms.
Funding
None.
Conflict of Interest
None declared.
Acknowledgments
The authors would like to thank Drs. Donders and Marshall for providing additional data on the clinical samples used in
their studies that were not included in the original publications.
References
American Psychiatric Association. (1996). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
An, K. Y., Kaploun, K., Erdodi, L. A., & Abeare, C. A. (2017). Performance validity in undergraduate research participants: A comparison of failure rates
across tests and cutoffs. The Clinical Neuropsychologist, 31, 193–206. doi:10.1080/13854046.2016.1217046.
An, K. Y., Zakzanis, K. K., & Joordens, S. (2012). Conducting research with non-clinical healthy undergraduates: Does effort play a role in
neuropsychological test performance? Archives of Clinical Neuropsychology, 27, 849–857.
Arnold, G., Boone, K. B., Lu, P., Dean, A., Wen, J., Nitch, S., et al. (2005). Sensitivity and specificity of finger tapping test scores for the detection of suspect
effort. The Clinical Neuropsychologist, 19, 105–120.
Ashendorf, L., Constantinou, M., & McCaffrey, R. J. (2004). The effect of depression and anxiety on the TOMM in community-dwelling older adults.
Archives of Clinical Neuropsychology, 19, 125–130.
Axelrod, B. N., Fichtenberg, N. L., Millis, S. R., & Wertheimer, J. C. (2006). Detecting incomplete effort with Digit Span from the Wechsler Adult
Intelligence Scale—Third Edition. The Clinical Neuropsychologist, 20, 513–523.
Baker, K. A., Schmidt, M. F., Heinemann, A. W., Langley, M., & Miranti, S. V. (1998). The validity of the Katz Adjustment Scale among people with
traumatic brain injury. Rehabilitation Psychology, 43, 30–40.
Baldo, J. V., Delis, D., Kramer, J., & Shimamura, A. (2002). Memory performance on the California Verbal Learning Test-II: Findings from patients with
focal frontal lesions. Journal of the International Neuropsychological Society, 8, 539–546.
Bauer, L., Yantz, C. L., Ryan, L. M., Warden, D. L., & McCaffrey, R. J. (2005). An examination of the California Verbal Learning Test II to detect
incomplete effort in a traumatic brain injury sample. Applied Neuropsychology, 12, 202–207.
Bearden, C. E., Glahn, D. C., Monkul, E. S., Barrett, J., Najt, P., Villarreal, V., et al. (2006). Patterns of memory impairment in bipolar disorder and unipolar
major depression. Psychiatry Research, 142, 139–150.
Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Beck Depression Inventory (2nd ed.). San Antonio, TX: Psychological Corporation.
Bigler, E. D. (2012). Symptom validity testing, effort and neuropsychological assessment. Journal of the International Neuropsychological Society, 18,
632–642.
Bigler, E. D. (2015). Neuroimaging as a biomarker in symptom validity and performance validity testing. Brain Imaging and Behavior, 9, 421–444.
Boone, K. B. (2013). Clinical practice of forensic neuropsychology. New York: Guilford.
Boone, K. B. (2009). The need for continuous and comprehensive sampling of effort/response bias during neuropsychological examinations. The Clinical
Neuropsychologist,23, 729–741.
Bortnik, K. E., Boone, K. B., Marion, S. D., Amano, S., Ziegler, E., Victor, T. L., et al. (2010). Examination of various WMS-III logical memory scores in
the assessment of response bias. The Clinical Neuropsychologist,24, 344–357.
Carone, D. A. (2008). Children with moderate/severe brain damage/dysfunction outperform adults with mild-to-no brain damage on the Medical Symptom
Validity Test. Brain Injury,22, 960–971.
Chafetz, M. D., Williams, M. A., Ben-Porath, Y. S., Bianchini, K. J., Boone, K. B., Kirkwood, M. W., et al. (2015). Official position of the American
Academy of Clinical Neuropsychology Social Security Administration policy on validity testing: Guidance and recommendations for change. The Clinical
Neuropsychologist,29, 723–740.
Connor, D. J., Drake, A. I., Bondi, M. W., & Delis, D. C. (1997). Detection of feigned cognitive impairments in patients with a history of mild to severe
closed head injury. Paper presented at the American Academy of Neurology, Boston.
Clark, L. R., Stricker, N. H., Libon, D. J., Delano-Wood, L., Salmon, D. P., Delis, D. C., et al. (2012). Yes/No versus forced-choice recognition memory in
mild cognitive impairment and Alzheimer’s disease: Patterns of impairment and associations with dementia severity. The Clinical Neuropsychologist, 26,
1201–1216.
Coleman, R. D., Rapport, L. J., Millis, S. R., Ricker, J. H., & Farchione, T. J. (1998). Effects of coaching on the California Verbal Learning Test. Journal of
Clinical and Experimental Neuropsychology,20, 201–210.
Considine, C., Weisenbach, S. L., Walker, S. J., McFadden, E. M., Franti, L. M., Bieliauskas, L. A., et al. (2011). Auditory memory decrements, without dis-
simulation, among patients with major depressive disorder. Archives of Clinical Neuropsychology,26, 445–453.
Constantinou, M., Bauer, L., Ashendorf, L., Fisher, J. M., & McCaffrey, R. J. (2005). Is poor performance on recognition memory effort measures indicative
of generalized poor performance on neuropsychological tests? Archives of Clinical Neuropsychology,20, 191–198.
Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). California Verbal Learning Test—Second edition. San Antonio, TX: Psychological Corporation.
Delis, D., & Wetter, S. R. (2007). Cogniform disorder and cogniform condition: Proposed diagnoses for excessive cognitive symptoms. Archives of Clinical
Neuropsychology,22, 589–604.
Derogatis, L. R. (1994). SCL-90-R: Administration, scoring, and procedures manual (3rd ed.). Minneapolis, MN: National Computer Systems.
Donders, J., & Strong, C. A. (2011). Embedded effort indicators on the California Verbal Learning Test—Second Edition: An attempted cross-validation. The
Clinical Neuropsychologist,25, 173–184.
Egeland, J., Lund, A., Landro, N. I., Rund, B. R., Sundet, K., Asbjornsen, A., et al. (2005). Cortisol level predicts executive and memory function in
depression, symptom level predicts psychomotor speed. Acta Psychiatrica Scandinavica, 112, 434–441.
Erdodi, L. A. (2017). Aggregating validity indicators: The salience of domain specificity and the indeterminate range in multivariate models of performance
validity assessment. Applied Neuropsychology: Adult. Advance online publication. doi: 10.1080/23279095.2017.1384925.
Erdodi, L. A., & Rai, J. K. (2017). A single error is one too many: Examining alternative cutoffs on Trial 2 on the TOMM. Brain Injury,31, 1362–1368.
doi: 10.1080/02699052.2017.1332386.
Erdodi, L. A., Kirsch, N. L., Lajiness-O’Neill, R., Vingilis, E., & Medoff, B. (2014). Comparing the Recognition Memory Test and the Word Choice Test in
a mixed clinical sample: Are they equivalent? Psychological Injury and Law,7, 255–263. doi:10.1007/s12207-014-9197-8.
Erdodi, L. A., Roth, R. M., Kirsch, N. L., Lajiness-O’Neill, R., & Medoff, B. (2014). Aggregating validity indicators embedded in Conners’ CPT-II
outperforms individual cutoffs at separating valid from invalid performance in adults with traumatic brain injury. Archives of Clinical Neuropsychology, 29,
456–466.
Erdodi, L. A., Sagar, S., Seke, K., Zuccato, B. G., Schwartz, E. S., & Roth, R. M. (2017). The Stroop Test as a measure of performance validity in adults clin-
ically referred for neuropsychological assessment. Psychological Assessment. doi:10.1037/pas0000525.
Erdodi, L. A., Seke, K. R., Shahein, A., Tyson, B. T., Sagar, S., & Roth, R. M. (2017). Low scores on the Grooved Pegboard Test are associated with invalid
responding and psychiatric symptoms. Psychology and Neuroscience,10, 325–344. doi: 10.1037/pne0000103.
Erdodi, L. A., Tyson, B. T., Abeare, C. A., Lichtenstein, J. D., Pelletier, C. L., Rai, J. K., et al. (2016). The BDAE Complex Ideational Material—A measure of
receptive language or performance validity? Psychological Injury and Law, 9, 112–120. doi: 10.1007/s12207-016-9254-6.
Erdodi, L. A., Tyson, B. T., Abeare, C. A., Zuccato, B. G., Rai, J. K., Seke, K. R., et al. (2017). Utility of critical items within the Recognition Memory Test
and Word Choice Test. Applied Neuropsychology: Adult. Advance online publication. doi:10.1080/23279095.2017.1298600.
Erdodi, L. A., Tyson, B. T., Shahein, A., Lichtenstein, J. D., Abeare, C. A., Pelletier, C. L., et al. (2017). The power of timing: Adding a time-to-completion
cutoff to the Word Choice Test and Recognition Memory Test improves classification accuracy. Journal of Clinical and Experimental Neuropsychology,
39, 369–383. doi:10.1080/13803395.2016.1230181.
Frederick, R. I. (2003). VIP: Validity indicator profile. Manual (2nd ed.). Minneapolis, MN: NCS Pearson.
Grant, I., & Adams, K. M. (Eds.) (1996). Neuropsychological assessment of neuropsychiatric disorders. New York: Oxford University Press.
Greve, K. W., & Bianchini, K. J. (2004). Setting empirical cut-offs on psychometric indicators of negative response bias: A methodological commentary with
recommendations. Archives of Clinical Neuropsychology,19, 533–541.
Greve, K. W., Bianchini, K. J., Mathias, C. W., Houston, R. J., & Crouch, J. A. (2002). Detecting malingered neurocognitive dysfunction with the Wisconsin
Card Sorting Test: A preliminary investigation in traumatic brain injury. The Clinical Neuropsychologist,16, 179–191.
Green, P. (2003). Green’s Word Memory Test. Edmonton, Canada: Green’s Publishing.
Green, P. (2007). Spoiled for choice: Making comparisons between forced-choice effort tests. In Boone K. B. (Ed.), Assessment of feigned cognitive
impairment (pp. 50–77). New York, NY: Guilford.
Green, P., Iverson, G., & Allen, L. (1999). Detecting malingering in head injury litigation with the Word Memory Test. Brain Injury,13, 813–819.
Green, P., Flaro, L., & Courtney, J. (2009). Examining false positives on the word memory test in adults with mild traumatic brain injury. Brain Injury,23,
741–750.
Greiffenstein, M. F., Baker, W. J., & Gola, T. (1994). Validation of malingered amnesic measures with a large clinical sample. Psychological Assessment,6,
218–224.
Heaton, R. K., Miller, S. W., Taylor, M. J., & Grant, I. (2004). Revised comprehensive norms for an expanded Halstead-Reitan battery: Demographically
adjusted neuropsychological norms for African American and Caucasian adults. Lutz, FL: PAR.
Heilbronner, R. L., Sweet, J. J., Morgan, J. E., Larrabee, G. J., & Millis, S. R. (2009). American Academy of Clinical Neuropsychology Consensus Conference
Statement on the neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist, 23, 1093–1129.
Hoofien, D., Barak, O., Vakil, E., & Gilboa, A. (2005). Symptom Checklist 90 Revised scores in persons with traumatic brain injury: Affective reactions or
neurobehavioral outcomes of the injury? Applied Neuropsychology, 12, 30–39.
Ilsley, J. E., Moffoot, A. P. R., & O’Carroll, R. E. (1995). An analysis of memory dysfunction in major depression. Journal of Affective Disorders,35 (1-2), 1–9.
Iverson, G. L. (2003). Detecting malingering in civil forensic evaluations. In Horton A. M., & Hartlage L. C. (Eds.), Handbook of forensic neuropsychology
(pp. 137–177). New York: Springer Publishing Company.
Keiski, M. A., Shore, D. L., & Hamilton, J. M. (2007). The role of depression in verbal memory following traumatic brain injury. The Clinical
Neuropsychologist,21, 744–761.
Kessels, R. P. C., Ruis, C., & Kappelle, L. J. (2007). The impact of self-reported depressive symptoms on memory function in neurological outpatients.
Clinical Neurology and Neurosurgery,109, 323–326.
Lange, R. T., Iverson, G. L., Brickell, T. A., Staver, T., Pancholi, S., Bhagwat, A., et al. (2013). Clinical utility of the Conners’ Continuous Performance
Test-II to detect poor effort in U.S. military personnel following traumatic brain injury. Psychological Assessment, 25, 339–352.
Langenecker, S. A., Bieliauskas, L. A., Rapport, L. J., Zubieta, J. K., Wilde, E. A., & Berent, S. (2005). Face emotion perception and executive functioning
deficits in depression. Journal of Clinical and Experimental Neuropsychology, 27, 320–333.
Larrabee, G. J. (2003). Detection of malingering using atypical performance on standard neuropsychological tests. The Clinical Neuropsychologist,17,
410–425.
Larrabee, G. J. (2012). Assessment of malingering. In Larrabee G. J. (Ed.), Forensic neuropsychology: A scientific approach. NY: Oxford University Press.
Leighton, A., Weinborn, M., & Maybery, M. (2014). Bridging the gap between neurocognitive processing theory and performance validity assessment among
the cognitively impaired: A review and methodological approach. Journal of the International Neuropsychological Society,20, 873–886. doi:10.1017/
S135561771400085X.
Lichtenstein, J. D., Erdodi, L. A., & Linnea, K. S. (2017). Introducing a forced-choice recognition task to the California Verbal Learning Test—Children’s
Version. Child Neuropsychology,23, 284–299. doi:10.1080/09297049.2015.1135422.
Marschark, M., Richtsmeier, L. M., Richardson, J. T. E., Crovitz, H. F., & Henry, J. (2000). Intellectual and emotional functioning in college students follow-
ing mild traumatic brain injury in childhood and adolescence. Journal of Head Trauma Rehabilitation,15, 1227–1245.
Marshall, P., & Happe, M. (2007). The performance of individuals with mental retardation on cognitive tests assessing effort and motivation. The Clinical
Neuropsychologist,21, 826–840.
Martens, M., Donders, J., & Millis, S. R. (2001). Evaluation of invalid response sets after traumatic head injury. Journal of Forensic Neuropsychology,2(1),
1–18.
Moore, B. A., & Donders, J. (2004). Predictors of invalid neuropsychological test performance after traumatic brain injury. Brain Injury,18, 975–984.
Ord, J. S., Boettcher, A. C., Greve, K. W., & Bianchini, K. J. (2010). Detection of malingering in mild traumatic brain injury with the Conners’ Continuous
Performance Test-II. Journal of Clinical and Experimental Neuropsychology, 32, 380–387.
Palav, A., Ortega, A., & McCaffrey, R. J. (2001). Incremental validity of the MMPI-2 content scales: A preliminary study with brain-injured patients. Journal
of Head Trauma Rehabilitation,16, 275–283.
Pearson (2009). Advanced clinical solutions for the WAIS-IV and WMS-IV—Technical manual. San Antonio, TX: Author.
Perneger, T. V. (1998). What’s wrong with Bonferroni adjustments. BMJ (Clinical research ed.),316, 1236–1238.
Raskin, S. A., Mateer, C. A., & Tweeten, R. (1998). Neuropsychological assessment of individuals with mild traumatic brain injury. The Clinical
Neuropsychologist, 12, 21–30.
Rees, L. M., Tombaugh, T. N., & Boulay, L. (2001). Depression and the Test of Memory Malingering. Archives of Clinical Neuropsychology,16, 501–506.
Reese, C. S., Suhr, J. A., & Riddle, T. L. (2012). Exploration of malingering indices in the Wechsler Adult Intelligence Scale—Fourth Edition Digit Span
subtest. Archives of Clinical Neuropsychology,27, 176–181.
Reznek, L. (2005). The Rey 15-item memory test for malingering: A meta-analysis. Brain Injury,19, 539–543. doi:10.1080/02699050400005242.
Rohling, M. L., Green, P., Allen, L. M., & Iverson, G. L. (2002). Depressive symptoms and neurocognitive test scores in patients passing symptom validity
tests. Archives of Clinical Neuropsychology,17, 205–222.
Root, J. C., Robbins, R. N., Chang, L., & van Gorp, W. (2006). Detection of inadequate effort on the California Verbal Learning Test-Second edition: Forced
choice recognition and critical item analysis. Journal of the International Neuropsychological Society,12, 688–696.
Ross, T. P., Poston, A. M., Rein, P. A., Salvatore, A. N., Wills, N. L., & York, T. M. (2016). Performance invalidity base rates among healthy undergraduate
research participants. Archives of Clinical Neuropsychology, 31, 97–104.
Rothman, K. J. (1990). No adjustments are needed for multiple comparisons. Epidemiology (Cambridge, Mass.), 1, 43–46.
Santos, O. A., Kazakov, D., Reamer, M. K., Park, S. E., & Osmon, D. C. (2014). Effort in college undergraduates is sufficient on the Word Memory Test.
Archives of Clinical Neuropsychology, 29, 609–613.
Schwartz, E. S., Erdodi, L., Rodriguez, N., Jyotsna, J. G., Curtain, J. R., Flashman, L. A., et al. (2016). CVLT-II forced choice recognition trial as an embed-
ded validity indicator: A systematic review of the evidence. Journal of the International Neuropsychological Society,22, 851–858. doi:10.1017/
S1355617716000746.
Slick, D. J., Sherman, E. M. S., & Iverson, G. L. (1999). Diagnostic criteria for malingered neurocognitive dysfunction: Proposed standards for clinical
practice and research. The Clinical Neuropsychologist, 13, 545–561.
Spencer, R. J., Axelrod, B. N., Drag, L. L., Waldron-Perrine, B., Pangilinan, P. H., & Bieliauskas, L. A. (2013). WAIS-IV reliable digit span is no more accu-
rate than age corrected scaled score as an indicator of invalid performance in a veteran sample undergoing evaluation for mTBI. The Clinical
Neuropsychologist,27, 1362–1372.
Sprinkle, S. D., Lurie, D., Insko, S. L., Atkinson, G., Jones, G. L., Logan, A. R., et al. (2002). Criterion validity, severity cut scores, and test-retest reliability
of the Beck Depression Inventory-II in a university counseling center sample. Journal of Counseling Psychology,49, 381.
Storch, E. A., Roberti, J. W., & Roth, D. A. (2004). Factor structure, concurrent validity, and internal consistency of the Beck Depression Inventory—Second
edition in a sample of college students. Depression and Anxiety,19, 187–189.
Suhr, J. A., & Boyer, D. (1999). Use of the Wisconsin Card Sorting Test in the detection of malingering in student simulator and patient samples. Journal of
Clinical and Experimental Neuropsychology, 21, 701–708. doi:10.1076/jcen.21.5.701.868.
Suhr, J., Tranel, D., Wefel, J., & Barrash, J. (1997). Memory performance after head injury: Contributions of malingering, litigation status, psychological
factors, and medication use. Journal of Clinical and Experimental Neuropsychology, 19, 500–514.
Sugarman, M. A., & Axelrod, B. N. (2014). Embedded measures of performance validity using verbal fluency tests in a clinical sample. Applied
Neuropsychology: Adult. DOI:10.1080/23279095.2013.873439.
Sugarman, M. A., & Axelrod, B. N. (2015). Embedded measures of performance validity using verbal fluency tests in a clinical sample. Applied
Neuropsychology: Adult,22, 141–146.
Sweet, J. J., Goldman, D. J., & Guidotti Breting, L. M. (2013). Traumatic brain injury: Guidance in a forensic context from outcome, dose-response, and
response bias research. Behavioral Sciences and the Law,31, 756–778.
Tombaugh, T. N. (1996). Test of Memory Malingering. New York: Multi-Health Systems.
Trueblood, W. (1994). Qualitative and quantitative characteristics of malingered and other invalid WAIS-R and clinical memory data. Journal of Clinical and
Experimental Neuropsychology, 16, 597–607.
van Dyke, S. A., Millis, S. R., Axelrod, B. N., & Hanks, R. A. (2013). Assessing effort: Differentiating performance and symptom validity. The Clinical
Neuropsychologist,27, 1234–1246.
Westcott, M. C., & Alfano, D. P. (2005). The Symptom Checklist-90-Revised and mild traumatic brain injury. Brain Injury,19, 1261–1267.
Wolfe, P. L., Millis, S. R., Hanks, R., Fichtenberg, N., Larrabee, G. J., & Sweet, J. J. (2010). Effort indicators within the California Verbal Learning Test-II
(CVLT-II). The Clinical Neuropsychologist,24, 153–168.