Table 5 - uploaded by Laszlo A Erdodi
Sensitivity, Specificity, and Base Rates of Failure of Various FMS Cutoffs against Reference PVTs. 

Source publication
Article
Past studies have examined the ability of the Wisconsin Card Sorting Test (WCST) to discriminate valid from invalid performance in adults using both individual embedded validity indicators (EVIs) and multivariate approaches. This study is designed to investigate whether the two most stable of these indicators—failures to maintain set (FMS) and the...

Contexts in source publication

Context 1
... a Pass as <2 and a Fail as ≥4 produces good specificity (.89-.92), but low and fluctuating sensitivity (.00-.33). Further increasing the cutoff to ≥5 produces significant gains in specificity (.95-.99), without a notable loss in sensitivity (Table 5). ...
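The Pass/Fail cutoff logic in this excerpt reduces to a confusion matrix computed against the reference PVTs. A minimal Python sketch of that computation; the FMS counts and validity classifications below are invented for illustration, not the study's data:

```python
# Sensitivity/specificity of an FMS cutoff against a reference PVT.
# All data below are illustrative, not taken from the study.

def sens_spec(fms_scores, invalid_flags, cutoff):
    """Treat 'FMS >= cutoff' as a sign of noncredible performance.

    fms_scores    -- failures-to-maintain-set counts, one per examinee
    invalid_flags -- True where the reference PVT classified the examinee invalid
    """
    tp = sum(f >= cutoff and inv for f, inv in zip(fms_scores, invalid_flags))
    fn = sum(f < cutoff and inv for f, inv in zip(fms_scores, invalid_flags))
    tn = sum(f < cutoff and not inv for f, inv in zip(fms_scores, invalid_flags))
    fp = sum(f >= cutoff and not inv for f, inv in zip(fms_scores, invalid_flags))
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity

fms = [0, 1, 5, 2, 6, 0, 4, 1, 4, 7]             # hypothetical FMS counts
invalid = [False, False, True, False, True,
           False, True, False, False, True]       # hypothetical reference PVT results

print(sens_spec(fms, invalid, 4))  # lower cutoff: more sensitive, less specific
print(sens_spec(fms, invalid, 5))  # → (0.75, 1.0)
```

Raising the cutoff trades sensitivity for specificity, which is the pattern the excerpt reports when moving from ≥4 to ≥5.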

Similar publications

Article
The aim of the present study was to specify if cerebellar lesions cause memory impairment in children. The study sample consisted of 44 children with low-grade cerebellar astrocytoma, who underwent surgical treatment and 30 healthy controls, matched with regard to age and sex. Memory was tested using the Rey Auditory Verbal Learning Test AVLT, Cors...

Citations

... Beyond memory-apparent tasks, there are stand-alone, embedded, and/or derived PVTs from essentially every neuropsychological domain. The Failure to Maintain Set score of the Wisconsin Card-Sorting Test (WCST) for instance, is an executive functioning PVT (Lichtenstein et al., 2018), the Dot Counting Test (Rey, 1941) offers a PVT in the complex attention domain, and the Judgment of Line Orientation Test even has embedded performance validity markers for visuoperception (Meyers et al., 2011;Meyers & Volbrecht, 2003). ...
Article
The present study evaluated whether Grooved Pegboard (GPB), when used as a performance validity test (PVT), can incrementally predict psychiatric symptom report elevations beyond memory-apparent PVTs. Participants (N = 111) were military personnel and were predominantly White (84%), male (76%), with a mean age of 43 (SD = 12) and having on average 16 years of education (SD = 2). Individuals with disorders potentially compromising motor dexterity were excluded. Participants were administered GPB, three memory-apparent PVTs (Medical Symptom Validity Test, Non-Verbal Medical Symptom Validity Test, Reliable Digit Span), and a symptom validity test (Personality Assessment Inventory Negative Impression Management [NIM]). Results from the three memory-apparent PVTs were entered into a model for predicting NIM, where failure of two or more PVTs was categorized as evidence of non-credible responding. Hierarchical regression revealed that non-dominant hand GPB T-score incrementally predicted NIM beyond memory-apparent PVTs (F(2,108) = 16.30, p < .001; R2 change = .05, β = -0.24, p < .01). In a second hierarchical regression, GPB performance was dichotomized into pass or fail, using T-score cutoffs (≤29 for either hand, ≤31 for both). Non-dominant hand GPB again predicted NIM beyond memory-apparent PVTs (F(2,108) = 18.75, p <.001; R2 change = .08, β = -0.28, p < .001). Results indicated that noncredible/failing GPB performance adds incremental value over memory-apparent PVTs in predicting psychiatric symptom report.
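The hierarchical regressions above rest on an F test for the R² change when a predictor set is added. A sketch of that statistic; only the .05 R² change is reported in the abstract, so the R² levels, predictor counts, and step size below are hypothetical:

```python
def f_change(r2_reduced, r2_full, n, k_full, q):
    """F statistic for the R^2 gain when q predictors are added to a model
    that then has k_full predictors in total, fit on n observations."""
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator

# Hypothetical: adding one predictor (q = 1) to a 3-predictor model lifts
# R^2 from .20 to .25 in a sample of n = 111.
print(round(f_change(0.20, 0.25, 111, 4, 1), 2))  # → 7.07
```

The test's degrees of freedom are (q, n − k_full − 1), matching the "(df1, df2)" pairs quoted with the F values in abstracts like this one.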
... One appeal of embedded PVTs is that little to no additional testing time is needed in order to obtain evidence of validity on that particular score or battery, although we agree that the time needed for stand-alone PVTs is medically necessary for clinical evaluations , and the combination of multiple stand-alone and embedded PVTs is most appropriate. Embedded PVTs have started to receive more attention over the past decade, with evidence being provided for both non-memory tests [e.g., Reliable Digit Span (Kirkwood, Hargrave, & Kirk, 2011;Welsh, Bender, Whitman, Vasserman, & Macallister, 2012); Matrix Reasoning (Sussman, Peterson, Connery, Baker, & Kirkwood, 2017); CNS Vital Signs (Brooks, Sherman, & Iverson, 2014); Automatized Sequences Test ; Conners Continuous Performance Test (Erdodi, Lichtenstein, Rai, & Flaro, 2017;Lichtenstein, Flaro, Baldwin, Rai, & Erdodi, 2019); Wisconsin Card Sorting Test (Lichtenstein, Erdodi, Rai, Mazur-Mosiewicz, & Flaro, 2018)] and memory tests [e.g., California Verbal Learning Test, Children's Version (CVLT-C) Brooks & Ploetz, 2015;Lichtenstein, Erdodi, & Linnea, 2017); Child and Adolescent Memory Profile (ChAMP) Lists subtest (Brooks, Plourde, MacAllister, & Sherman, 2018); and ChAMP Objects subtest (Brooks, MacAllister, Fay-McClymont, Vasserman, & Sherman, 2019b)]. ...
Article
Objective It is essential to interpret performance validity tests (PVTs) that are well-established and have strong psychometrics. This study evaluated the Child and Adolescent Memory Profile (ChAMP) Validity Indicator (VI) using a pediatric sample with traumatic brain injury (TBI). Method A cross-sectional sample of N = 110 youth (mean age = 15.1 years, standard deviation [SD] = 2.4, range = 8–18) on average 32.7 weeks (SD = 40.9) post TBI (71.8% mild/concussion; 3.6% complicated mild; 24.6% moderate-to-severe) were administered the ChAMP and two stand-alone PVTs. Criterion for valid performance was scores above cutoffs on both PVTs; criterion for invalid performance was scores below cutoffs on both PVTs. Classification statistics were used to evaluate the existing ChAMP VI and establish a new VI cutoff score if needed. Results There were no significant differences in demographics or time since injury between those deemed valid (n = 96) or invalid (n = 14), but all ChAMP scores were significantly lower in those deemed invalid. The original ChAMP VI cutoff score was highly specific (no false positives) but also highly insensitive (sensitivity [SN] = .07, specificity [SP] = 1.0). Based on area under the curve (AUC) analysis (0.94), a new cutoff score was established using the sum of scaled scores (VI-SS). A ChAMP VI-SS score of 32 or lower achieved strong SN (86%) and SP (92%). Using a 15% base rate, positive predictive value was 64% and negative predictive value was 97%. Conclusions The originally proposed ChAMP VI has insufficient SN in pediatric TBI. However, this study yields a promising new ChAMP VI-SS, with classification metrics that exceed any other current embedded PVT in pediatrics.
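The predictive values quoted in this abstract follow from Bayes' theorem given sensitivity, specificity, and base rate. A short sketch that approximately reproduces the abstract's figures (the one-point discrepancy on PPV plausibly reflects rounding of the reported SN/SP):

```python
def predictive_values(sn, sp, base_rate):
    """Convert sensitivity/specificity into PPV and NPV at a given base rate."""
    ppv = sn * base_rate / (sn * base_rate + (1 - sp) * (1 - base_rate))
    npv = sp * (1 - base_rate) / (sp * (1 - base_rate) + (1 - sn) * base_rate)
    return ppv, npv

# SN = .86, SP = .92, 15% base rate, as reported in the abstract:
ppv, npv = predictive_values(sn=0.86, sp=0.92, base_rate=0.15)
print(round(ppv, 2), round(npv, 2))  # → 0.65 0.97 (abstract: 64% and 97%)
```

This is also why PVT predictive values shift with the referral context: the same SN/SP yields a much lower PPV in low-base-rate settings.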
... emphasize the importance of routinely administering multiple PVTs (Chafetz et al., 2015;Sweet et al., 2021). On the other hand, systemic pressures to abbreviate batteries in an increasingly cost-conscious health care climate (Lichtenstein et al., 2018) are incompatible with this mandate. The tension between guidelines for best practice and the "reality on the ground" is often resolved by yielding to the latter (Cottingham, 2021;MacAllister et al., 2019). ...
Article
Objective: This study was designed to replicate previous research on critical item analysis within the Word Choice Test (WCT). Method: Archival data were collected from a mixed clinical sample of 119 consecutively referred adults (M age = 51.7, M education = 14.7). The classification accuracy of the WCT was calculated against psychometrically defined criterion groups. Results: Critical item analysis identified an additional 2%-5% of the sample that passed traditional cutoffs as noncredible. Passing critical items after failing traditional cutoffs was associated with weaker independent evidence of invalid performance, alerting the assessor to the elevated risk for false positives. Failing critical items in addition to failing select traditional cutoffs increased overall specificity. Non-White patients were 2.5 to 3.5 times more likely to Fail traditional WCT cutoffs, but select critical item cutoffs limited the risk to 1.5-2. Conclusions: Results confirmed the clinical utility of critical item analysis. Although the improvement in sensitivity was modest, critical items were effective at containing false positive errors in general, and especially in racially diverse patients. Critical item analysis appears to be a cost-effective and equitable method to improve an instrument's classification accuracy. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
... Though some measures have been identified as potentially useful in this population 34,35, most of the evidence suggests that the PVTs applied to the assessment of ID have an unacceptably high [>16%; 37] false positive rate 36,38-43. Progress on the issue of high false positive rates on PVTs among patients with ID is hindered by the common practice of excluding individuals with FSIQ <75 from cross-validation studies as a methodological safeguard against contaminating criterion grouping [44][45][46][47][48][49][50][51][52]. ...
Article
This study was designed to determine the clinical utility of embedded performance validity indicators (EVIs) in adults with intellectual disability (ID) during neuropsychological assessment. Based on previous research, unacceptably high (>16%) base rates of failure (BRFail) were predicted on EVIs using the method of threshold, but not on EVIs based on alternative detection methods. A comprehensive battery of neuropsychological tests was administered to 23 adults with ID (M age = 37.7 years, M FSIQ = 64.9). BRFail were computed at two levels of cut-offs for 32 EVIs. Patients produced very high BRFail on 22 EVIs (18.2%-100%), indicating unacceptable levels of false positive errors. However, on the remaining ten EVIs BRFail was <16%. Moreover, six of the EVIs had a zero BRFail, indicating perfect specificity. Consistent with previous research, individuals with ID failed the majority of EVIs at high BRFail. However, they produced BRFail similar to cognitively higher functioning patients on select EVIs based on recognition memory and unusual patterns of performance, suggesting that the high BRFail reported in the literature may reflect instrumentation artefacts. The implications of these findings for clinical and forensic assessment are discussed.
... Although the WCST may be considered a test of executive functioning and problem solving, FMS individually may indicate reductions in sustained attention and increased distractibility (Figueroa & Youmans, 2013;Greve, Williams, Haas, Littell, & Reinoso, 1996). Unusually high numbers of FMS are rare in persons even with bona fide neurological insult (Heaton, Chelune, Talley, Kay, & Curtis, 1993), with increased numbers of FMS related to noncredible presentations across diverse samples (Bernard, McGrath, & Houston, 1996;Greve, Bianchini, Mathias, Houston, & Crouch, 2002;Kosky, Lace, Austin, Seitz, & Clark, 2020;Larrabee, 2003;Lichtenstein et al., 2018;Suhr & Boyer, 1999). Notably, the WCST is not mentioned in recommended batteries for MS (Benedict et al., 2006;Kalb et al., 2018;Langdon et al., 2012;Merz et al., 2018). ...
Article
Objective: Research regarding performance validity tests (PVTs) in patients with multiple sclerosis (MS) is scant, with recommended batteries for neuropsychological evaluations in this population lacking suggestions to include PVTs. Moreover, limited work has examined embedded PVTs in this population. As previous investigations indicated that non-memory-based embedded PVTs provide clinical utility in other populations, this study sought to determine if a logistic regression-derived PVT formula can be identified from selected non-memory variables in a sample of patients with MS. Method: One hundred eighty-four patients (M age = 48.45; 76.6% female) with MS were referred for neuropsychological assessment at a large, Midwestern academic medical center. Patients were placed into “credible” (n = 146) or “noncredible” (n = 38) groups according to performance on standalone PVT. Missing data were imputed with HOTDECK. Results: Classification statistics for a variety of embedded PVTs were examined, with none appearing psychometrically appropriate in isolation (AUCs = .48-.64). Four exponentiated equations were created via logistic regression. Six, five, and three predictor equations yielded acceptable discriminability (AUC = .71-.74) with modest sensitivity (.34-.39) while maintaining good specificity (≥.90). The two predictor equation appeared unacceptable (AUC = .67). Conclusions: Results suggest that multivariate combinations of embedded PVTs may provide some clinical utility while minimizing test burden in determining performance validity in patients with MS. Nonetheless, the authors recommend routine inclusion of several PVTs and utilization of comprehensive clinical judgment to maximize signal detection of noncredible performance and avoid incorrect conclusions. Clinical implications, limitations, and avenues for future research are discussed.
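The "exponentiated equations" this abstract describes are logistic-regression score combinations: a weighted sum of embedded PVT scores passed through the logistic function, with the decision threshold then chosen so that specificity stays at or above .90. A minimal sketch; the intercept, coefficients, and scores below are placeholders, not the published equations:

```python
import math

def invalidity_probability(intercept, coefs, scores):
    """Logistic ('exponentiated') equation: P(noncredible) from embedded
    PVT scores. The weights here are illustrative placeholders."""
    z = intercept + sum(b * x for b, x in zip(coefs, scores))
    return 1.0 / (1.0 + math.exp(-z))

# With all coefficients at zero, the model is maximally uncertain:
print(invalidity_probability(0.0, [0.0, 0.0, 0.0], [12, 3, 40]))  # → 0.5
# In practice a case is flagged only when its probability clears a cutoff
# selected to hold specificity >= .90 in the derivation sample.
```

Classification statistics (AUC, sensitivity at fixed specificity) are then computed on these probabilities rather than on any single embedded score.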
... Further research is needed to determine the relationship between the MVP and ChAMP List Recognition subtest. At this time, and in line with research on other ePVTs' use in children (Lichtenstein et al., 2018, 2019), the ChAMP LR subtest should be used in conjunction with PVT(s) versus alone. ...
Article
The primary purpose of this study was to evaluate performance on the Memory Validity Profile (MVP) in a mixed pediatric clinical population. Goal 1, assessing MVP pass rates, yielded a slightly higher pass rate (98.58%) compared to research on other performance validity tests (PVTs) in youth when using manual-based cutoffs and a slightly lower pass rate (84.40%) using an experimental cutoffs (PASS Total ≥31) similar to others’ research. Goal 2, determining if MVP performance was contingent on variables other than effort, yielded significant differences in age, sex, and intelligence (p < 0.05); but not parental education or occurrence/nonoccurrence of previous neurological issues, ADHD, or psychiatric disorders. Goal 3, investigating the agreement of an embedded PVT (Children and Adolescent Memory Profile [ChAMP] List Recognition [LR] subtest) with the MVP in classification of adequate vs. suboptimal effort, showed that the highest levels of consistency (81%) were achieved when experimental MVP and LR ss ≤5 cutoffs were utilized. In conclusion, the MVP is a useful tool in detecting suboptimal effort in children in a broad clinical sample and the ChAMP LR subtest adds to identification of suboptimal effort as an ePVT with the MVP. The established cutoffs stated in the MVP manual should be used, as these better identify suboptimal effort in children by age than experimental cutoffs (i.e., 31 and 32 PASS). The ChAMP LR cutoff should be ss ≤5, with MVP manual-based (75%) or experimental cutoffs (81%).
... King et al. (2002) indicated that a logistic regression-derived formula with CAT, FMS, and Conceptual Level Responses (CLR) correctly classified between 70% and 99% of suspected noncredible performers from those with TBIs across three studies. Extensive follow-up research has sought to cross-validate these results in other samples (e.g., King et al., 2002;Larrabee, 2003;Lichtenstein et al., 2018;Miller et al., 2000), while other work has suggested limited utility for WCST variables at detecting noncredible performers in various populations (Ashendorf et al., 2003;Greve et al., 2009;Inman & Berry, 2002). In all, these reviewed studies indicate the need for continued research in this regard. ...
Article
Clinicians who evaluate patients with concerns related to attention-deficit/hyperactivity disorder (ADHD) are encouraged to include validity indicators throughout clinical assessment procedures. To date, no known previous literature has examined the Wisconsin Card Sorting Test (WCST) specifically to address noncredible ADHD, and none has attempted to identify an embedded PVT within the 64-card version. The present study sought to address these gaps in the literature with a simulation study. Sixty-seven undergraduate participants (M age = 19.30) were grouped as credible (combining healthy controls and individuals with ADHD) or noncredible (combining coached and uncoached participants simulating ADHD-related impairment) and administered a battery of neuropsychological tests. Results revealed the noncredible group performed significantly worse on several WCST-64 variables, including failure to maintain set, number of trials to first category, and total categories. Raw scores from these variables were entered as predictors as one set in a logistic regression (LR) with group membership as the outcome variable. An exponentiated equation (EE) derived from LR results yielded acceptable discriminability (area under receiver operating characteristic curve = .73) with modest sensitivity (.38) while maintaining ideal specificity (.91), generally commensurate with a standalone forced-choice memory PVT and better than an embedded attention-based PVT. These findings suggested the WCST-64 may be sensitive to noncredible performance in the context of ADHD and reiterate the importance of considering tests of various cognitive abilities in the evaluation of performance validity. Implications of these findings, limitations of the present study, and directions for future inquiry, including cross-validation in clinical samples, were discussed.
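The AUC figures in these abstracts have a simple nonparametric reading: the probability that a randomly chosen noncredible examinee scores worse (here, higher on the flagged WCST-64 variables) than a randomly chosen credible one, with ties counted as half. A sketch with invented scores:

```python
def auc(noncredible, credible):
    """Mann-Whitney view of AUC: P(noncredible score > credible score),
    counting ties as half a win."""
    wins = 0.0
    for a in noncredible:
        for b in credible:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(noncredible) * len(credible))

# Illustrative failure counts per group, not the study's data:
print(auc([5, 6, 4, 7], [1, 2, 4, 0]))  # → 0.96875
```

An AUC of .5 means the score carries no group information; values in the low .70s, as reported above, indicate acceptable but imperfect separation.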
... The Wisconsin Card Sorting Test (WCST) (Kongs, Thompson, Iverson, & Heaton, 2000), an adaptation of the task switching paradigm, is one of the assessment tools commonly used to evaluate the level of the cognitive flexibility of children (e.g. Gao et al., 2018;Kieffer, Vukovic, & Berry, 2013;Lichtenstein, Erdodi, Rai, Mazur-Mosiewicz, & Flaro, 2018;Stad, Wiedl, Vogelaar, Bakker, & Resing, 2019). Similar to the WCST, but with a reduced number of sorting dimensions, the Dimensional Change Card Sort task is another measure of cognitive flexibility that is used amongst young children in pre-school age (Carlson & Moses, 2001;Muller, Gela, Dick, Overton, & Zelazo, 2006) and primary school years (Cantin, Gnaedinger, Gallaway, Hesson-McInnis, & Hund, 2016). ...
... The Wisconsin Card Sorting Test – 64 card version (WCST; Kongs et al., 2000) has previously been used to measure the cognitive flexibility of children (e.g. Gao et al., 2018;Kieffer et al., 2013;Lichtenstein et al., 2018;Stad et al., 2019). The task was implemented with the aid of a computer programme. ...
Article
The present study aims to examine the contribution of cognitive flexibility to metalinguistic skills and reading comprehension during primary school years. Forty-nine third-grade primary school children completed the measures of cognitive flexibility, metalinguistic skills including syntactic awareness (word order knowledge), morphosyntactic skill and discourse awareness (sentence order knowledge), and reading comprehension. Hierarchical regression analysis results showed that syntactic awareness, morphosyntactic skill, discourse awareness and cognitive flexibility are significantly predictive of reading comprehension. However, cognitive flexibility is a unique predictor of reading comprehension over and above age and all other metalinguistic measures. Cognitive flexibility was not significantly associated with syntactic awareness, morphosyntactic skill, or discourse awareness. The results extended the understanding of the role of cognitive flexibility in the reading comprehension process.
... In addition to measuring cognitive ability, the credibility of the response set was continuously monitored using embedded validity indicators. Administering multiple performance validity tests (PVTs) throughout the assessment is consistent with established guidelines in clinical neuropsychology (Boone, 2009;Bush, Heilbronner, & Ruff, 2014;Schutte, Axelrod, & Montoya, 2015), and continues to be supported by recent empirical evidence (Critchfield et al., 2019;Erdodi, Tyson et al., 2018;Lichtenstein et al., 2018a;Schroeder, Olsen, & Martin, 2019). Free-standing PVTs were designed explicitly to evaluate the credibility of a response set and, therefore, have been considered the gold standard instruments. ...
Article
This observational study examined the acute cognitive effects of cannabis. We hypothesized that cognitive performance would be negatively affected by acute cannabis intoxication. Twenty-two medical cannabis patients from Southwestern Ontario completed the study. The majority (n = 13) were male. Mean age was 36.0 years, and mean level of education was 13.7 years. Participants were administered the same brief neurocognitive battery three times during a six-hour period: at baseline (“Baseline”), once after they consumed a 20% THC cannabis product (“THC”), and once again several hours later (“Recovery”). The average self-reported level of cannabis intoxication prior to the second assessment (i.e., during THC) was 5.1 out of 10. Contrary to expectations, performance on neuropsychological tests remained stable or even improved during the acute intoxication stage (THC; d: .49−.65, medium effect), and continued to increase during Recovery (d: .45−.77, medium-large effect). Interestingly, the failure rate on performance validity indicators increased during THC. Contrary to our hypothesis, there was no psychometric evidence for a decline in cognitive ability following THC intoxication. There are several possible explanations for this finding but, in the absence of a control group, no definitive conclusion can be reached at this time.
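The effect sizes (d) reported in this abstract are standardized mean differences. A minimal Cohen's d sketch using a pooled standard deviation; the study's exact variant for repeated measures may differ, and the scores below are invented:

```python
import statistics

def cohens_d(x, y):
    """Cohen's d: mean difference expressed in pooled-standard-deviation units."""
    nx, ny = len(x), len(y)
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    pooled_sd = (((nx - 1) * var_x + (ny - 1) * var_y) / (nx + ny - 2)) ** 0.5
    return (statistics.mean(x) - statistics.mean(y)) / pooled_sd

# Invented scores for two assessment points (e.g., Recovery vs. Baseline):
print(cohens_d([2, 4, 6], [1, 3, 5]))  # → 0.5 (a medium effect by convention)
```

By the usual conventions, d around .5 is a medium effect and d around .8 is large, which is how the abstract's ranges of .45-.77 should be read.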
... Moreover, discriminant analysis showed that the WCST measures clearly differentiated subjects who did not pass the MMSE from healthy ones. This is consistent with previous studies that used different clinical samples (Lange et al., 2018;Lichtenstein, Erdodi, Rai, Mazur-Mosiewicz, & Flaro, 2018). In this regard, Sherer, Nick, Millis, and Novack (2003) found that the WCST is a sensitive test for detecting deficits in patients with traumatic brain injury. ...
Article
The Wisconsin Card Sorting Test (WCST) is a widely used neuropsychological assessment of executive functioning. The aim of this study was to provide norm values and analyze the psychometric properties of WCST in healthy Argentinian adults aged from 18 to 89 years old (N = 235). Descriptive statistics are reported as means, standard deviations and percentiles, with the effects of age, education and gender being investigated by ANOVA, and with the effect sizes being calculated. The psychometrics were studied using the WCST structure, reliability, convergent validity, and discriminant validity, and WCST norms adjusted for age and educational level are proposed. This instrument is a reliable and valid tool for the assessment of executive functions. However, as the age- and educational-related effects were demonstrated, these characteristics need to be considered before interpreting WCST scores. Regarding gender, no differences were found. Our results expand the geographical and sociocultural applicability of WCST.