Table 9 - uploaded by Laszlo A Erdodi
VI-11 scores as a function of passing or failing the traditional RMT cutoff or select cutoffs on the critical items. 

Source publication
Article
Full-text available
This study was designed to examine the clinical utility of critical items within the Recognition Memory Test (RMT) and the Word Choice Test (WCT). Archival data were collected from a mixed clinical sample of 202 patients clinically referred for neuropsychological testing (54.5% male; mean age = 45.3 years; mean level of education = 13.9 years). The...

Contexts in source publication

Context 1
... of patients who passed the traditional cutoff on the RMT (>42) failed select CR RMT cutoffs (Table 9). Within the WCT, between 5.6% and 17.5% of those who passed the traditional cutoff (>45) failed select CR WCT cutoffs (Table 10). ...
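The percentages in the context above come from cross-tabulating two cutoffs on the same instrument: among examinees who pass the traditional total-score cutoff, what fraction fails a critical-item cutoff? A minimal sketch with synthetic scores and a made-up critical-item cutoff (the actual RMT/WCT critical items are withheld in the source for test security):

```python
def discordant_rate(total_scores, critical_scores, total_cutoff, critical_cutoff):
    """Among examinees passing the traditional cutoff (score > total_cutoff),
    return the fraction failing the critical-item cutoff (score <= critical_cutoff)."""
    passers = [(t, c) for t, c in zip(total_scores, critical_scores) if t > total_cutoff]
    if not passers:
        return 0.0
    failures = sum(1 for _, c in passers if c <= critical_cutoff)
    return failures / len(passers)

# Synthetic example: RMT-style totals (max 50, traditional cutoff >42) paired
# with hypothetical critical-item subscores (max 7, made-up cutoff <=5).
totals = [50, 48, 44, 43, 41, 38, 49, 46]
criticals = [7, 6, 5, 4, 3, 2, 7, 7]
rate = discordant_rate(totals, criticals, total_cutoff=42, critical_cutoff=5)
```

With these illustrative numbers, two of the six traditional-cutoff passers fail the critical-item cutoff, which is the kind of discordance rate the contexts report.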

Citations

... A potential solution for improving an instrument's precision without increasing the related costs (administration and scoring time, complex interpretive algorithms, logistic regression equations, etc.) is critical item analysis. The assumption underlying critical item analysis is that predictive power is not equally distributed along a scale: certain items contribute more information about the target construct than others (Dunn et al., 2021; Erdodi, Tyson, et al., 2018). Identifying a small subset of items with comparable, better, or differential predictive power can abbreviate the testing process, improve classification accuracy or, ideally, both (Denning, 2012). ...
... This engineered method variance in criterion grouping was developed to protect against instrumentation artifacts (Campbell & Fiske, 1959) and to provide an empirical estimate of the generalizability of findings. Based on previous research (Dunn et al., 2021; Erdodi, Tyson, et al., 2018), we hypothesized that critical items would improve both the sensitivity (by identifying invalid response sets missed by traditional cutoffs) and specificity (by identifying potentially valid response sets that failed the traditional cutoffs) of the WCT. ...
... The measures in this study are available to qualified users through the test publishers. Critical item analysis on the WCT was performed based on the string of items (CR-7, CR-5, and CR-3) identified by the original study (Erdodi, Tyson, et al., 2018). The actual item numbers are not disclosed here to protect test security. ...
Article
Full-text available
Objective: This study was designed to replicate previous research on critical item analysis within the Word Choice Test (WCT). Method: Archival data were collected from a mixed clinical sample of 119 consecutively referred adults (MAge = 51.7, MEducation = 14.7). The classification accuracy of the WCT was calculated against psychometrically defined criterion groups. Results: Critical item analysis identified an additional 2%–5% of the sample that passed traditional cutoffs as noncredible. Passing critical items after failing traditional cutoffs was associated with weaker independent evidence of invalid performance, alerting the assessor to the elevated risk for false positives. Failing critical items in addition to failing select traditional cutoffs increased overall specificity. Non-White patients were 2.5 to 3.5 times more likely to fail traditional WCT cutoffs, but select critical item cutoffs limited the risk to 1.5–2. Conclusions: Results confirmed the clinical utility of critical item analysis. Although the improvement in sensitivity was modest, critical items were effective at containing false positive errors in general, and especially in racially diverse patients. Critical item analysis appears to be a cost-effective and equitable method to improve an instrument's classification accuracy. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
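The "X times more likely to fail" figures above are ratios of failure rates between groups. A minimal sketch with made-up counts (not the study's data):

```python
def risk_ratio(fail_a, n_a, fail_b, n_b):
    """Ratio of failure rates: group A relative to group B."""
    return (fail_a / n_a) / (fail_b / n_b)

# Illustrative counts only: if 6 of 20 examinees in group A fail a cutoff
# versus 3 of 30 in group B, group A's failure rate is 3 times higher.
rr = risk_ratio(fail_a=6, n_a=20, fail_b=3, n_b=30)
```

A risk ratio near 1 indicates the cutoff flags both groups at similar rates, which is the equity criterion the study uses to compare traditional and critical-item cutoffs.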
... Results provided tentative support for the PIH, by confirming an association between self-reported emotional distress and underperformance on cognitive tests. Specifically, patients with elevated somatic or depressive symptoms on the MMPI-2 were 2–2.5 times more likely to fail PVTs, consistent with previous reports (Erdodi & Roth, 2017; Jak et al., 2019; Miskey et al., 2020; Qureshi et al., 2011; Rock et al., 2014; Rowland et al., 2017). At the same time, given the overlap between symptom and performance validity observed within this study and in previous research (Gaasedelen et al., 2019; Gervais et al., 2007; Haggerty et al., 2007; Larrabee et al., 2017; Mathias et al., 2002; Merten et al., 2016; Whiteside et al., 2009), the credibility of clinical elevations on the MMPI-2 scales cannot be taken at face value. ...
Article
Full-text available
This study was designed to examine the relative contribution of symptom (SVT) and performance validity tests (PVTs) to the evaluation of the credibility of neuropsychological profiles in mild traumatic brain injury (mTBI). An archival sample of 326 patients with mTBI was divided into four psychometrically defined criterion groups: pass both SVT and PVT; pass one, but fail the other; and fail both. Scores on performance-based tests of neurocognitive ability and self-reported symptom inventories were compared across the groups. As expected, PVT failure was associated with lower scores on ability tests (ηp² = .042–.184; d = 0.56–1.00; medium-large effects), and SVT failure was associated with higher levels of symptom report (ηp² = .039–.312; d = 0.32–1.58; small-very large effects). However, SVT failure also had a marginal deleterious effect on performance-based measures (ηp² = .017–.023; d = 0.23–0.46; small-medium effects), and elevations on self-report inventories were observed in the context of PVT failure (ηp² = .026; d = 0.23–0.57; small-medium effects). SVT failure was associated with not only inflated symptom reports but also distorted configural patterns of psychopathology. Patients with clinically elevated somatic and depressive symptoms were twice as likely to fail PVTs. Consistent with previous research, SVTs and PVTs provide overlapping, but non-redundant information about the credibility of neuropsychological profiles associated with mTBI. Therefore, they should be used in combination to afford a comprehensive evaluation of cognitive and emotional functioning. The heuristic value of validity tests has both clinical and forensic relevance.
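The d values reported above are standardized mean differences (Cohen's d). As a point of reference, here is a minimal sketch of the pooled-standard-deviation formulation, using small made-up score lists rather than the study's data:

```python
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d: standardized mean difference with pooled sample SD."""
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = (((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)) ** 0.5
    return (mean_a - mean_b) / pooled_sd

# Made-up ability scores for a "pass" group and a "fail" group.
d = cohens_d([10, 12, 14], [8, 10, 12])
```

For these illustrative lists the groups differ by exactly one pooled standard deviation (d = 1.0), a "large" effect in the conventional benchmarks the abstract uses.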
... Initially, PVTs were free-standing instruments dedicated exclusively to detecting invalid responding (Boone, 2013). Compared with EVIs, they require extra administration time and test material, measure performance validity only at discrete points during the test battery (Erdodi et al., 2018), and place an additional burden on examinees' mental stamina (Lichtenstein, Erdodi, & Linnea, 2017). In contrast, EVIs simultaneously measure cognitive ability and performance validity throughout an evaluation at no additional cost (Boone, 2013). ...
Article
Full-text available
To assess noncredible performance on the NIH Toolbox Cognitive Battery (NIHTB-CB), we developed embedded validity indicators (EVIs). Data were collected from 98 adults (54.1% female) as part of a prospective multicenter cross-sectional study at four mild traumatic brain injury (mTBI) specialty clinics. Traditional EVIs and novel item-based EVIs were developed for the NIHTB-CB using the Medical Symptom Validity Test (MSVT) as criterion. The signal detection profile of individual EVIs varied greatly. Multivariate models had superior classification accuracy. Failing ≥4 traditional EVIs at the liberal cutoff or ≥3 at the conservative cutoff produced a good combination of sensitivity (.57 to .61) and specificity (.92 to .94) to MSVT. Combining the traditional and item-based EVIs improved sensitivity (.65 to .70) at comparable specificity (.91 to .95). In conclusion, newly developed EVIs within the NIHTB-CB effectively discriminated between patients who passed versus failed the MSVT. Aggregating EVIs within the same category into validity composites improved signal detection over univariate cutoffs. Item-based EVIs improved classification accuracy over that of traditional EVIs. However, the marginal gains hardly justify the burden of extra calculations. The newly introduced EVIs require cross-validation before widespread research or clinical application. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
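The multivariate decision rule described above (failing ≥3 or ≥4 EVIs) is a simple threshold count over univariate cutoffs. A minimal sketch, with hypothetical EVI names and cutoff values standing in for the actual NIHTB-CB indicators:

```python
def count_failures(scores, cutoffs):
    """Count EVIs on which the score falls at or below its validity cutoff."""
    return sum(1 for name, cut in cutoffs.items() if scores[name] <= cut)

def flag_invalid(scores, cutoffs, threshold=3):
    """Flag a profile as noncredible when EVI failures meet the threshold."""
    return count_failures(scores, cutoffs) >= threshold

# Hypothetical placeholder EVIs and conservative cutoffs (not the study's).
conservative_cutoffs = {"evi_1": 30, "evi_2": 5, "evi_3": 12, "evi_4": 7}
profile = {"evi_1": 28, "evi_2": 4, "evi_3": 11, "evi_4": 9}

n_fail = count_failures(profile, conservative_cutoffs)
invalid = flag_invalid(profile, conservative_cutoffs, threshold=3)
```

Aggregation of this kind trades the noisy signal of any single indicator for the joint evidence of several, which is why the composite outperformed the univariate cutoffs.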
... In addition to measuring cognitive ability, the credibility of the response set was continuously monitored using embedded validity indicators. Administering multiple performance validity tests (PVTs) throughout the assessment is consistent with established guidelines in clinical neuropsychology (Boone, 2009; Bush, Heilbronner, & Ruff, 2014; Schutte, Axelrod, & Montoya, 2015), and continues to be supported by recent empirical evidence (Critchfield et al., 2019; Erdodi, Tyson et al., 2018; Lichtenstein et al., 2018a; Schroeder, Olsen, & Martin, 2019). Free-standing PVTs were designed explicitly to evaluate the credibility of a response set and, therefore, have been considered the gold standard instruments. ...
Article
Full-text available
This observational study examined the acute cognitive effects of cannabis. We hypothesized that cognitive performance would be negatively affected by acute cannabis intoxication. Twenty-two medical cannabis patients from Southwestern Ontario completed the study. The majority (n = 13) were male. Mean age was 36.0 years, and mean level of education was 13.7 years. Participants were administered the same brief neurocognitive battery three times during a six-hour period: at baseline (“Baseline”), once after they consumed a 20% THC cannabis product (“THC”), and once again several hours later (“Recovery”). The average self-reported level of cannabis intoxication prior to the second assessment (i.e., during THC) was 5.1 out of 10. Contrary to expectations, performance on neuropsychological tests remained stable or even improved during the acute intoxication stage (THC; d: .49−.65, medium effect), and continued to increase during Recovery (d: .45−.77, medium-large effect). Interestingly, the failure rate on performance validity indicators increased during THC. Contrary to our hypothesis, there was no psychometric evidence for a decline in cognitive ability following THC intoxication. There are several possible explanations for this finding but, in the absence of a control group, no definitive conclusion can be reached at this time.
... In a recent study with a mixed clinical sample of 234 adult outpatients, Erdodi (2019) demonstrated that the similarity in predictor and criterion PVTs (e.g., in terms of sensory modality, cognitive domain, and/or testing paradigm) influenced classification accuracy. This phenomenon was termed the modality specificity effect and was subsequently replicated in a number of investigations (Erdodi, Hurtubise, et al., 2018; Erdodi, Kirsch, Sabelli, & Abeare, 2018; Erdodi, Tyson, et al., 2018), although isolated exceptions (i.e., where the match between the predictor and criterion PVT was orthogonal to classification accuracy) were also reported. ...
Article
Full-text available
This study was designed to examine the effect of various criterion measures on the classification accuracy of Trial 1 of the Test of Memory Malingering (TOMM-1), a free-standing performance validity test (PVT). Archival data were collected from a case sequence of 91 patients (MAge = 42.2 years; MEducation = 12.7 years) clinically referred for neuropsychological assessment. Trial 2 and the Retention trial of the TOMM, the Word Choice Test, and three validity composites were used as criterion PVTs. Classification accuracy varied systematically as a function of criterion PVT. TOMM-1 ≤ 43 emerged as the optimal cutoff, resulting in a wide range of sensitivity (.47–1.00), with perfect overall specificity. Failing the TOMM-1 was unrelated to age, education, or gender, but was associated with elevated self-reported depression. Results support the utility of TOMM-1 as an independent, free-standing, single-trial PVT. Consistent with previous reports, the choice of criterion measure influences parameter estimates of the PVT being calibrated. The methodological implications of modality specificity for PVT research and clinical/forensic practice should be considered when evaluating cutoffs or interpreting scores in the failing range.
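The sensitivity and specificity figures above come from dichotomizing a PVT score at a cutoff and scoring it against a criterion-defined validity grouping. A minimal sketch with synthetic scores (not the study's sample), using the "score ≤ cutoff means fail" convention:

```python
def sens_spec(scores, criterion_invalid, cutoff):
    """Sensitivity/specificity of the rule `score <= cutoff` for detecting
    criterion-defined invalid performance."""
    tp = sum(1 for s, inv in zip(scores, criterion_invalid) if inv and s <= cutoff)
    fn = sum(1 for s, inv in zip(scores, criterion_invalid) if inv and s > cutoff)
    tn = sum(1 for s, inv in zip(scores, criterion_invalid) if not inv and s > cutoff)
    fp = sum(1 for s, inv in zip(scores, criterion_invalid) if not inv and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Synthetic TOMM-1-style scores (max 50) with a criterion-defined grouping.
scores = [50, 49, 45, 44, 43, 40, 38, 50, 42, 45]
invalid = [False, False, False, False, True, True, True, False, True, True]
sensitivity, specificity = sens_spec(scores, invalid, cutoff=43)
```

Swapping in a different criterion grouping changes the `invalid` vector and therefore the estimates, which is the abstract's point about criterion measures driving parameter estimates.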
... Although this suggests that TMT validity cutoffs, especially the derivative indices, may be less sensitive to aspects of invalid performance that are responsible for the commonly observed reverse dose-response relationship, they also have the benefit of not being sensitive to genuine impairment. This finding is consistent with previous reports (Arnold et al., 2005; Erdodi, Tyson, et al., 2018; Jasinski, Berry, Shandera, & Clark, 2011). The variability in the signal detection profile across samples occasionally leads to accidental discoveries that reaffirm the utility of derivative EVIs (Axelrod, Meyers, & Davis, 2014; Dean, Victor, Boone, Philpott, & Hess, 2009; Glassmire, Wood, Ta, Kinney, & Nitch, 2019). ...
Article
Full-text available
This study was designed to develop validity cutoffs by utilizing demographically adjusted T-scores on the Trail Making Test (TMT), with the goal of eliminating potential age- and education-related biases associated with the use of raw score cutoffs. Failure to correct for the effect of age and education on TMT performance may lead to increased false positive errors for older adults and examinees with lower levels of education. Data were collected from an archival sample of 100 adult outpatients (MAge = 38.8, 56% male; MEducation = 13.7) who were clinically referred for neuropsychological assessment at an academic medical center in the Midwestern USA after sustaining a traumatic brain injury (TBI). Performance validity was psychometrically determined using the Word Memory Test and two multivariate validity composites based on five embedded performance validity indicators. Cutoffs on the demographically corrected TMT T-scores had generally superior classification accuracy compared to the raw score cutoffs reported in the literature. As expected, the T-scores also eliminated the age and education bias that was observed in the raw score cutoffs. Both T-score and raw score cutoffs were orthogonal to injury severity. Multivariate models based on T-score cutoffs failed to improve classification accuracy over univariate T-score cutoffs. The present findings provide support for the use of demographically adjusted validity cutoffs within the TMT. They produced superior classification accuracy relative to raw score-based cutoffs, in addition to eliminating the bias against older adults and examinees with lower levels of education.
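The demographic adjustment above amounts to converting a raw score to a T-score against a norm group matched on age/education, then applying the validity cutoff on the T scale. A minimal sketch, assuming hypothetical norm parameters for one demographic cell (these are not the TMT norms used in the study):

```python
def raw_to_t(raw, norm_mean, norm_sd, higher_is_better=True):
    """Linear T transform (M = 50, SD = 10) relative to a demographic norm.
    For timed tests such as the TMT, lower raw scores are better, so the
    z-score is reversed before scaling."""
    z = (raw - norm_mean) / norm_sd
    if not higher_is_better:
        z = -z
    return 50 + 10 * z

# Hypothetical norms for a TMT-B-style completion time (seconds, lower = better):
# norm mean 90 s, SD 30 s for this illustrative age/education cell.
t = raw_to_t(raw=150, norm_mean=90, norm_sd=30, higher_is_better=False)
fails_validity_cutoff = t <= 33  # illustrative T-score validity cutoff
```

Because the norm mean and SD differ by demographic cell, the same raw score can pass the cutoff in one cell and fail it in another, which is exactly how the T-score approach removes the raw-score bias against older and less educated examinees.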
... Regardless of the specific configuration, this range of performance fails to deliver sufficiently strong evidence to render the entire response set invalid. At the same time, it signals subthreshold evidence of non-credible responding (Abeare, Messa, Whitfield, et al., 2018a; Erdodi, Hurtubise, et al., 2018c; Erdodi, Seke, Shahein, et al., 2017c; Erdodi et al., 2018f; Proto et al., 2014). ...
Article
Full-text available
This study was designed to develop validity cutoffs within the Finger Tapping Test (FTT) using demographically adjusted T-scores, and to compare their classification accuracy to existing cutoffs based on raw scores. Given that FTT performance is known to vary with age, sex, and level of education, failure to correct for these demographic variables poses the risk of elevated false positive rates in examinees who, at the level of raw scores, have inherently lower FTT performance (women, older, and less educated individuals). Data were collected from an archival sample of 100 adult outpatients (MAge = 38.8 years, MEducation = 13.7 years, 56% men) consecutively referred for neuropsychological assessment at an academic medical center in the Midwestern USA after sustaining a traumatic brain injury (TBI). Performance validity was psychometrically defined using the Word Memory Test and two validity composites based on five embedded performance validity indicators. Previously published raw score-based validity cutoffs disproportionately sacrificed sensitivity (.13–.33) for specificity (.98–1.00). Worse yet, they were confounded by sex and education. Newly introduced demographically adjusted cutoffs (T ≤ 33 for the dominant hand, T ≤ 37 for both hands) produced high levels of specificity (.89–.98) and acceptable sensitivity (.36–.55) across criterion measures. Equally importantly, they were robust to injury severity and demographic variables. The present findings provide empirical support for a growing trend of demographically adjusted performance validity cutoffs. They provide a practical and epistemologically superior alternative to raw score cutoffs, while also reducing the potential bias against examinees inherently vulnerable to lower raw score level FTT performance.
... For example, on the Complex Ideational Material (Goodglass & Kaplan, 1972), the suggested clinical cutoff for aphasia is ≤ 7/12 (Goodglass & Kaplan, 1983). However, subsequent research found that a score as high as ≤ 9/12 (or even ≤ 10/12, depending on the examinee's demographic characteristics) provides strong evidence of non-credible responding (Erdodi, 2019a; Erdodi et al., 2016; Erdodi & Roth, 2017). Similarly, an age-corrected scaled score of ≤ 6 (traditionally considered a Borderline range performance, so technically not impaired) on the Digit Span or Symbol Search subtests of the Wechsler Adult Intelligence Scale has been shown to be specific to noncredible responding (Axelrod, Fichtenberg, Millis, & Wertheimer, 2006; Erdodi et al., 2017a; Spencer et al., 2013). ...
Article
Full-text available
Regional fluctuations in cognitive ability have been reported worldwide. Given perennial concerns that the outcome of performance validity tests (PVTs) may be contaminated by genuine neuropsychological deficits, geographic differences may represent a confounding factor in determining the credibility of a given neurocognitive profile. This pilot study was designed to investigate whether geographic location affects base rates of failure (BRFail) on PVTs. BRFail were compared across a number of free-standing and embedded PVTs in patients with mild traumatic brain injury (mTBI) from two regions of the US (Midwest and New England). Retrospective archival data were collected from clinically referred patients with mTBI at two different academic medical centers (nMidwest = 76 and nNew England = 84). One free-standing PVT (Word Choice Test) and seven embedded PVTs were administered to both samples. The embedded validity indicators were combined into a single composite score using two different previously established aggregation methods. The New England sample obtained a higher score on the Verbal Comprehension Index of the WAIS-IV (d = .34, small-medium). The difference between the two regions in Full Scale IQ (FSIQ) was small (d = .28). When compared with the omnibus population mean (100), the effect of mTBI on FSIQ was small (d = .22) in the New England sample and medium (d = .53) in the Midwestern one. However, contrasts using estimates of regional FSIQ produced equivalent effect sizes (d: .47–.53). BRFail was similar on free-standing PVTs, but varied at random for embedded PVTs. Aggregating individual indices into a validity composite effectively neutralized regional variability in BRFail. Classification accuracy varied as a function of both geographic region and instruments. Despite small overall effect sizes, regional differences in cognitive ability may potentially influence clinical decision making, both in terms of diagnosis and performance validity assessment. 
There was an interaction between geographic region and instruments in terms of the internal consistency of PVT profiles. If replicated, the findings of this preliminary study have potentially important clinical, forensic, methodological, and epidemiological implications.
... Regardless of the configuration of PVT failures, this range of performance does not meet the threshold to deem the entire neurocognitive profile invalid (Boone, 2013; Pearson, 2009). However, previous studies demonstrated that scores in the borderline range contain significantly stronger evidence of noncredible responding than scores in the pass range (Erdodi, Hurtubise, et al., 2018; Erdodi, Tyson, et al., 2018). Therefore, they are excluded from analyses requiring a dichotomous outcome (pass/fail) to preserve the purity of the criterion groups (Erdodi, 2019). ...
Article
Full-text available
This study was designed to introduce and validate a forced-choice recognition trial for the Rey Complex Figure Test (FCR-RCFT). Healthy undergraduate students at a midsized Canadian university were randomly assigned to the control (n = 80) or experimental malingering (n = 60) conditions. All participants were administered a brief battery of neuropsychological tests. The FCR-RCFT had good overall classification accuracy (area under the curve: .79–.88) against various criterion variables. The conservative cutoff (≤16) was highly specific (.93–.96) but not very sensitive (.38–.51). Conversely, the liberal cutoff (≤18) was sensitive (.57–.72) but less specific (.88–.90). The FCR-RCFT provided unique information about performance validity above and beyond the existing yes/no recognition trial. Combining multiple RCFT validity indices improved classification accuracy. The utility of previously published validity indicators embedded in the RCFT was also replicated. The FCR-RCFT extends the growing trend of enhancing the clinical utility of widely used standard memory tests by developing a built-in validity check. Multivariate models were superior to univariate cutoffs. Although the FCR-RCFT performed well in the current sample, replication in clinical/forensic patients is needed to establish its utility in differentiating genuine memory deficits from noncredible responding.
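The area-under-the-curve values above summarize classification accuracy across all possible cutoffs. A minimal sketch of the ranking formulation of AUC (the probability that a randomly chosen noncredible case scores below a randomly chosen credible one, ties counted as 0.5), with synthetic scores rather than the study's data:

```python
def auc_lower_worse(scores, invalid_flags):
    """AUC for a validity index where lower scores indicate
    non-credible responding; ties contribute 0.5."""
    inv = [s for s, f in zip(scores, invalid_flags) if f]
    val = [s for s, f in zip(scores, invalid_flags) if not f]
    total = 0.0
    for i in inv:
        for v in val:
            if i < v:
                total += 1.0
            elif i == v:
                total += 0.5
    return total / (len(inv) * len(val))

# Synthetic recognition-style scores: first four from simulated malingerers,
# last four from controls.
scores = [14, 16, 18, 19, 17, 20, 21, 22]
invalid = [True, True, True, True, False, False, False, False]
area = auc_lower_worse(scores, invalid)
```

An AUC of .5 is chance-level discrimination and 1.0 is perfect separation; the .79–.88 range reported above corresponds to "good" discrimination in the conventional benchmarks.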