Article

Non-Memory Composite Embedded Performance Validity Formulas in Patients with Multiple Sclerosis

Abstract

Objective: Research regarding performance validity tests (PVTs) in patients with multiple sclerosis (MS) is scant, and recommended batteries for neuropsychological evaluations in this population lack suggestions to include PVTs. Moreover, limited work has examined embedded PVTs in this population. As previous investigations indicated that non-memory-based embedded PVTs provide clinical utility in other populations, this study sought to determine whether a logistic regression-derived PVT formula can be identified from selected non-memory variables in a sample of patients with MS. Method: One hundred eighty-four patients (M age = 48.45; 76.6% female) with MS were referred for neuropsychological assessment at a large, Midwestern academic medical center. Patients were placed into “credible” (n = 146) or “noncredible” (n = 38) groups according to performance on a standalone PVT. Missing data were imputed with HOTDECK. Results: Classification statistics for a variety of embedded PVTs were examined, with none appearing psychometrically appropriate in isolation (AUCs = .48-.64). Four exponentiated equations were created via logistic regression. The six-, five-, and three-predictor equations yielded acceptable discriminability (AUCs = .71-.74) with modest sensitivity (.34-.39) while maintaining good specificity (≥.90). The two-predictor equation appeared unacceptable (AUC = .67). Conclusions: Results suggest that multivariate combinations of embedded PVTs may provide some clinical utility while minimizing test burden in determining performance validity in patients with MS. Nonetheless, the authors recommend routine inclusion of several PVTs and utilization of comprehensive clinical judgment to maximize signal detection of noncredible performance and avoid incorrect conclusions. Clinical implications, limitations, and avenues for future research are discussed.
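
The abstract reports the derived formulas only at the level of classification statistics, so the coefficients below are illustrative placeholders rather than the study's values. A minimal Python sketch of how an exponentiated logistic-regression PVT formula of this kind is applied:

```python
import math

def noncredible_probability(scores, intercept, betas):
    """Apply a logistic-regression-derived PVT formula: the weighted
    sum of embedded PVT scores (the logit z) is exponentiated into a
    probability, p = e^z / (1 + e^z)."""
    z = intercept + sum(b * x for b, x in zip(betas, scores))
    return math.exp(z) / (1 + math.exp(z))

# Hypothetical three-predictor equation; intercept and weights are
# placeholders, NOT the published coefficients.
INTERCEPT = -1.25
BETAS = [-0.08, -0.11, -0.05]   # one weight per embedded PVT score
patient_scores = [42, 31, 55]   # raw scores on three embedded PVTs

p = noncredible_probability(patient_scores, INTERCEPT, BETAS)
# In practice the probability cutoff is calibrated on the credible
# group so that specificity stays at or above .90.
print(f"P(noncredible) = {p:.3f}")
```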

... Employing a dedicated free-standing PVT as a single criterion measure follows a longstanding practice in performance validity research (Abeare, Erdodi, et al., 2021; Giromini, Viglione, Zennaro, Maffei, & Erdodi, 2020; Green, Kirk, Connery, Baker, & Kirkwood, 2014; Greve, Bianchini, & Roberson, 2007; Iverson, Lange, Green, & Franzen, 2002; Lace et al., 2021a; Lange et al., 2013; Langeluddecke & Lucas, 2003; Larrabee, Rohling, & Meyers, 2019; Shura, Miskey, Rowland, Yoash-Gantz, & Denning, 2016; Stevens, Friedel, Mehren, & Merten, 2008; Suhr, Hammers, Dobbins-Buckland, Zimak, & Hughes, 2008; Sussman et al., 2019; Whiteside, Wald, & Busse, 2011; Whitney, Hook, Steiner, Shepard, & Callaway, 2008), grounded in the observation that free-standing PVTs have superior classification accuracy to embedded validity cutoffs (Lau et al., 2017; Miele, Gunner, Lynch, & McCaffrey, 2012) and, more importantly, that they are robust to genuine and severe cognitive deficits by design (Brockhaus & Merten, 2004; Carone, 2008; Carone, Green, & Drane, 2014; Erdodi, Green, et al., 2019; Green, Montijo, & Brockhaus, 2011; Larochette & Harrison, 2012). Granted, most of the PVTs in the studies above were multi-trial instruments with individual cutoffs (TOMM, Word Memory Test, and Medical Symptom Validity Test), with the resulting outcome (Pass/Fail) based on performance across multiple subtests. ...
... Consequently, results may not generalize to different geographic regions (Lichtenstein et al., 2019). The persistent difference in the classification accuracy of the BNT as a function of criterion measures serves as a reminder of potential methodological artifacts (Lace et al., 2020; Lace, Merz, & Galioto; Rai & Erdodi, 2021; Schroeder et al., 2019) that are worth exploring further in future research. Specifically, it underlines the importance of relying on multivariate models of performance validity assessment (Boone, 2009; Larrabee, 2008; Pearson, 2009) instead of a single instrument (Schroeder, Boone, & Larrabee, 2021). ...
Article
Full-text available
This study was designed to examine alternative validity cutoffs on the Boston Naming Test (BNT). Archival data were collected from 206 adults assessed in a medicolegal setting following a motor vehicle collision. Classification accuracy was evaluated against three criterion PVTs. The first cutoff to achieve minimum specificity (.87-.88) was T ≤ 35, at .33-.45 sensitivity. T ≤ 33 improved specificity (.92-.93) at .24-.34 sensitivity. BNT validity cutoffs correctly classified 67–85% of the sample. Failing the BNT was unrelated to self-reported emotional distress. Although constrained by its low sensitivity, the BNT remains a useful embedded PVT.
... Clinicians who perform neuropsychological assessments for patients with MS are encouraged to continually assess validity (both performance and symptom) throughout their evaluations, as is recommended by other experts (Boone, 2009; Sweet et al., 2021). Continued research into validity of all types in neuropsychological evaluations of patients with MS and the critical psychometric evaluation of the SIMS and related instruments (e.g., PVTs across several domains; Lace et al., 2021b) in this and other neurological populations is warranted. ...
Article
Full-text available
Greater attention is being paid to performance validity in patients with multiple sclerosis (MS), though limited work has examined symptom validity measures in this population. Extensive previous literature has examined the Structured Inventory of Malingered Symptomatology (SIMS) across diverse populations with variable results. The purpose of the present study was to determine the extent to which the SIMS and/or its associated variables/items are useful in detecting noncredible cognitive performance in patients with MS. Sixty-seven patients with MS (M age = 48.64; 71.6% women; 83.6% White) were referred for neuropsychological evaluation within a large, Midwestern academic medical center. Participants were categorized into credible (n = 50) and noncredible (n = 17) groups based on standalone performance validity test (PVT) criteria. The noncredible group reported significantly higher SIMS Total scores than the credible group (d = 1.69). Results revealed that the SIMS Total score was broadly elevated across the entire sample (M = 17.55, SD = 8.30) and significantly correlated with depressive (r = .56) and anxiety (r = .38) symptoms. Publisher-recommended cutoffs for SIMS Total scores of >14 (specificity = .50) and >16 (specificity = .64) yielded unacceptable levels of false positives. A novel subscale (SIMS-MS) was created from six items that were endorsed by ≥25% of the noncredible group and ≤10% of the credible group. The SIMS-MS demonstrated good differentiability between groups (AUC = .87), with good sensitivity (.53) and excellent specificity (.96) at a cutoff of >1 (i.e., ≥2), and it did not significantly correlate with depressive (r = .21) or anxiety (r = .23) symptoms, suggesting appropriate discriminant validity from face-valid psychiatric symptoms. The SIMS-MS showed preliminary clinical utility at identifying individuals with MS who demonstrate noncredible cognitive performance on standalone PVTs. Implications, limitations, and directions for further inquiry, including validation against other measures of both performance and symptom validity, were discussed.
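
The SIMS-MS item-selection rule described above (items endorsed by ≥25% of the noncredible group and ≤10% of the credible group) is straightforward to express in code. The item labels and endorsement rates below are hypothetical, not taken from the study:

```python
# Hypothetical per-item endorsement rates (proportion of each group
# endorsing the item); names and values are illustrative only.
noncredible_rates = {"item_03": 0.41, "item_12": 0.29, "item_20": 0.12}
credible_rates = {"item_03": 0.06, "item_12": 0.10, "item_20": 0.18}

# Keep items endorsed by >= 25% of the noncredible group and by
# <= 10% of the credible group, mirroring the SIMS-MS rule.
selected = [
    item for item in noncredible_rates
    if noncredible_rates[item] >= 0.25 and credible_rates[item] <= 0.10
]
print(selected)  # item_03 and item_12 qualify; item_20 does not
```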
... On reevaluation, all PVTs continued to be outside of expectations, including VSVT (total = 20/48, hard items = 9/24), Dot Counting Test E-Score (22), ACS Word Choice Test (20/50), CVLT-FC (7/16), RDS (6), BVMT-R RD (0), and a composite formula (EE-3 = .72; Lace et al., 2021). Of note, her scores on three forced-choice PVTs (VSVT, ACS Word Choice Test, and CVLT-FC) were below chance level and it was determined to be highly unlikely that this pattern of unusual PVT scores was a false positive (Odland et al., 2015). ...
Article
Full-text available
Determining the validity of data during clinical neuropsychological assessment is crucial for proper interpretation, and extensive literature has emphasized myriad methods of doing so in diverse samples. However, little research has considered noncredible presentation in persons with multiple sclerosis (pwMS). PwMS often experience one or more factors known to impact validity of data, including major neurocognitive impairment, psychological distress/psychogenic interference, and secondary gain. This case series aimed to illustrate the potential relationships between these factors and performance validity testing in pwMS. Six cases involving at least one of the above-stated factors were identified from an IRB-approved database of pwMS referred for neuropsychological assessment at a large academic medical center. Backgrounds, neuropsychological test data, and clinical considerations for each were reviewed. Interestingly, no pwMS diagnosed with major neurocognitive impairment was found to have noncredible performance, nor was noncredible performance observed in any patient without notable psychological distress. Given the variability of noncredible performance and the multiplicity of factors affecting performance validity in pwMS, clinicians are strongly encouraged to consider psychometrically appropriate methods for evaluating validity of cognitive data in pwMS. Additional research aiming to elucidate base rates of, mechanisms begetting, and methods for assessing noncredible performance in pwMS is imperative.
Preprint
Full-text available
Within neuropsychological assessment, clinicians are responsible for ensuring the validity of obtained cognitive data. As such, increased attention is being paid to performance validity in patients with multiple sclerosis (pwMS). Experts have proposed batteries of neuropsychological tests for use in this population, though none contain recommendations for standalone performance validity tests (PVTs). The California Verbal Learning Test, Second Edition (CVLT-II) and Brief Visuospatial Memory Test, Revised (BVMT-R)—both of which are included in the aforementioned recommended neuropsychological batteries—include previously validated embedded PVTs (which offer some advantages, including expedience and reduced costs), with no prior work exploring their utility in pwMS. One hundred thirty-three (133) patients (M age = 48.28; 76.7% women; 85.0% White) with MS were referred for neuropsychological assessment at a large, Midwestern academic medical center. Patients were placed into "credible" (n = 100) or "noncredible" (n = 33) groups based on a standalone PVT criterion. Classification statistics for four CVLT-II and BVMT-R PVTs of interest in isolation were poor (AUCs = .58-.62) and nonsignificant. Several arithmetic and logistic regression-derived multivariate formulas were calculated, all of which similarly demonstrated poor discriminability (AUCs = .61-.64). Although embedded PVTs may arguably maximize efficiency and minimize test burden in pwMS, common PVTs in the CVLT-II and BVMT-R do not appear psychometrically appropriate, sufficiently useful, or substitutable for standalone PVTs in this population. Clinical neuropsychologists who evaluate such patients are encouraged to include standalone PVTs in their assessment batteries to ensure that clinical care conclusions drawn from neuropsychological data are valid.
Article
Full-text available
The Victoria Symptom Validity Test (VSVT) is a performance validity test (PVT) with over two decades of empirical backing, although methodological limitations within the extant literature restrict its clinical and research generalizability. Chief among these constraints is limited consensus on the most accurate index within the VSVT and the most appropriate cut-scores within each VSVT validity index. The current systematic review synthesizes existing VSVT validation studies and provides additional cross-validation in an independent sample using a known-groups design. We completed a systematic search of the literature, identifying 17 peer-reviewed studies for synthesis (7 simulation designs, 7 differential prevalence designs, and 3 known-groups designs). The independent cross-validation sample consisted of 200 mixed clinical neuropsychiatric patients referred for outpatient neuropsychological evaluation. Across all indices, Total item accuracy produced the strongest psychometric properties at an optimal cut-score of ≤ 40 (62% sensitivity/88% specificity). However, ROC curve analyses for all VSVT indices yielded statistically significant areas under the curve (AUCs = .73–.81), suggestive of moderate classification accuracy. Cut-scores derived using the independent cross-validation sample converged with some previous findings supporting cut-scores of ≤ 22 for Easy item accuracy and ≤ 40 for Total item accuracy, although divergent findings were noted for Difficult item accuracy. Overall, VSVT validity indicators have adequate diagnostic accuracy across populations, with the current study providing additional support for its use as a psychometrically sound PVT in clinical settings. However, caution is recommended among patients with certain verified clinical conditions (e.g., dementia) and those with pronounced working memory deficits due to concerns for increased risk of false positives.
Article
Full-text available
While literature on performance validity tests (PVTs) in neuropsychological assessment has examined memory-based paradigms, other research has suggested that tests of attention, visuospatial ability, and language may also detect noncredible performance. Previous work has identified several PVTs in the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS), though all of them emphasize memory-based subtests. This study sought to determine if PVT formulas can be derived from exclusively non-memory RBANS subtests (i.e., Figure Copy, Line Orientation, Picture Naming, Semantic Fluency, Digit Span, and Coding) using an analog simulation study. Seventy-two undergraduate participants (M age = 18.9) were assigned to either an asymptomatic (AS) group, which was instructed to perform optimally, or a simulated mild traumatic brain injury (S-mTBI) group, which received symptom and test coaching to help simulate mTBI-related impairment. Participants were administered a battery of neuropsychological tests, including the RBANS and standalone PVTs. Differences were found between groups for all RBANS subtests of interest except Picture Naming. Five subtests showing meaningful group differences were entered as predictor variables as one set in logistic regressions (LR); raw and norm-based scores were considered separately. Both LRs accurately classified 90.3% of cases with good sensitivity (.89) while maintaining ideal specificity (.92). Two exponentiated equations were described from LR results, with both yielding good discriminability (AUCs = .94), generally comparable with other PVTs. These findings suggested that non-memory RBANS subtests may be sensitive to noncredible performance and reiterate the importance of considering tests of various cognitive abilities when assessing performance validity during neuropsychological assessment. Limitations of this study and directions for future inquiry, including necessity for validation in a clinical sample, were discussed.
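
The AUC values cited throughout these studies can be obtained nonparametrically from the two groups' scores via the Mann-Whitney statistic. A self-contained sketch with illustrative scores (not the study's data):

```python
def empirical_auc(noncredible, credible):
    """AUC as the probability that a randomly drawn noncredible case
    scores lower (more impaired) than a randomly drawn credible case,
    counting ties as half."""
    pairs = [(n, c) for n in noncredible for c in credible]
    wins = sum(1.0 if n < c else 0.5 if n == c else 0.0 for n, c in pairs)
    return wins / len(pairs)

# Illustrative composite scores (lower = more suspect performance).
noncredible_scores = [18, 22, 25, 27, 30]
credible_scores = [26, 31, 33, 35, 38, 40]
print(f"AUC = {empirical_auc(noncredible_scores, credible_scores):.2f}")
```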
Article
Full-text available
Loring and Goldstein presented a case of a woman with multiple sclerosis (MS) who failed the traditional performance validity criteria of the WMT. Scoring lower than the mean of patients with Alzheimer's disease on extremely easy subtests, the patient went on to produce a WMT profile typical of someone with invalid test results, based on the usual interpretation, which is standardized within the Advanced Interpretation Program. Several incorrect statements were made, including the claims that there are no available data on the WMT in MS patients, that the minor tranquilizer lorazepam can explain WMT failure even in healthy adults, and that this patient produced a neuropsychological profile that is credible and typical of MS. We report data from MS patients given comprehensive neuropsychological assessment, including the WMT. Loring and Goldstein's interpretation of this case does not fit the facts.
Article
Full-text available
A sound performance validity test is accurate for detecting invalid neuropsychological test performance and relatively insensitive to actual cognitive ability or impairment. This study explored the relationship of several cognitive abilities to several performance indices on the Victoria Symptom Validity Test (VSVT), including accuracy and response latency. This cross-sectional study examined data from a mixed clinical sample of 88 adults identified as having valid neurocognitive test profiles via independent validity measures, and who completed the VSVT along with objective measures of working memory, processing speed, and verbal memory during their clinical neuropsychological evaluation. Results of linear regression analyses indicated that cognitive test performance accounted for 5% to 14% of total variance for VSVT performance across indices. Working memory was the only cognitive ability to predict significant, albeit minimal, variance on the VSVT response accuracy indices. Results show that VSVT performance is minimally predicted by working memory, processing speed, or delayed verbal memory recall.
Article
Full-text available
Objective: This study was designed to examine the classification accuracy of verbal fluency (VF) measures as performance validity tests (PVT). Method: Student volunteers were assigned to the control (n = 57) or experimental malingering (n = 24) condition. An archival sample of 77 patients with TBI served as a clinical comparison. Results: Among students, FAS T-score ≤29 produced a good combination of sensitivity (.40–.42) and specificity (.89–.95). Animals T-score ≤31 had superior sensitivity (.53–.71) at .86-.93 specificity. VF tests performed similarly to commonly used PVTs embedded within Digit Span: RDS ≤7 (.54–.80 sensitivity at .93–.97 specificity) and age-corrected scaled score (ACSS) ≤6 (.54–.67 sensitivity at .94–.96 specificity). In the clinical sample, specificity was lower at liberal cutoffs [animals T-score ≤31 (.89–.91), RDS ≤7 (.86–.89) and ACSS ≤6 (.86–.96)], but comparable at conservative cutoffs [animals T-score ≤29 (.94–.96), RDS ≤6 (.95–.98) and ACSS ≤5 (.92–.96)]. Conclusions: Among students, VF measures had higher signal detection performance than previously reported in clinical samples, likely due to the absence of genuine impairment. The superior classification accuracy of animal relative to letter fluency was replicated. Results suggest that existing validity cutoffs can be extended to cognitively high functioning examinees, and emphasize the importance of population-specific cutoffs.
Article
Full-text available
Objective: While the Neuropsychological Assessment Battery, Screening Module (S-NAB) is a commonly used cognitive screening measure, no composite embedded performance validity test (PVT) formula has yet been described within it. This study sought to empirically derive PVT formulas within the S-NAB using an analog simulation paradigm. Method: Seventy-two university students (M age = 18.92) were randomly assigned to either an Asymptomatic (AS) or simulated mild traumatic brain injury (S-mTBI) group and were administered a neuropsychological test battery that included the S-NAB and standalone and embedded PVTs. The AS group was instructed to perform optimally, and the S-mTBI group received symptom and test coaching to help simulate mTBI-related impairment. Both groups received warnings regarding the presence of PVTs throughout the test battery. Results: Groups showed significant differences (all ps < .001) on all S-NAB domain scores and PVTs. In the S-NAB, the Attention (S-ATT) and Executive Function (S-EXE) domains showed the largest effect sizes (Cohen’s ds = 2.02 and 1.79, respectively). Seven raw scores from S-ATT and S-EXE subtests were entered as predictor variables in a direct logistic regression (LR). The model accurately classified 90.3% of cases. Two PVT formulas were described: (1) an exponentiated equation from LR results and (2) an arithmetic formula using four individually meaningful variables. Both formulas demonstrated outstanding discriminability between groups (AUCs = .96–.97) and yielded good classification statistics compared to other PVTs. Conclusions: This study is the first to describe composite, embedded PVT formulas within the S-NAB. Implications, limitations, and appropriate future directions of inquiry are discussed.
Article
Full-text available
Objective: This study was designed to evaluate the classification accuracy of a multivariate model of performance validity assessment using embedded validity indicators (EVIs) within the Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV). Method: Archival data were collected from 100 adults with traumatic brain injury (TBI) consecutively referred for neuropsychological assessment in a clinical setting. The classification accuracy of previously published individual EVIs nested within the WAIS-IV and a composite measure based on six independent EVIs were evaluated against psychometrically defined non-credible performance. Results: Univariate validity cutoffs based on age-corrected scaled scores on Coding, Symbol Search, Digit Span, Letter-Number-Sequencing, Vocabulary minus Digit Span, and Coding minus Symbol Search were strong predictors of psychometrically defined non-credible responding. Failing ≥3 of these six EVIs at the liberal cutoff improved specificity (.91-.95) over univariate cutoffs (.78-.93). Conversely, failing ≥2 EVIs at the more conservative cutoff increased and stabilized sensitivity (.43-.67) compared to univariate cutoffs (.11-.63) while maintaining consistently high specificity (.93-.95). Conclusions: In addition to being a widely used test of cognitive functioning, the WAIS-IV can also function as a measure of performance validity. Consistent with previous research, combining information from multiple EVIs enhanced the classification accuracy of individual cutoffs and provided more stable parameter estimates. If the current findings are replicated in larger, diagnostically and demographically heterogeneous samples, the WAIS-IV has the potential to become a powerful multivariate model of performance validity assessment. Brief summary: Using a combination of multiple performance validity indicators embedded within the subtests of the Wechsler Adult Intelligence Scale, the credibility of the response set can be established with a high level of confidence. Multivariate models improve classification accuracy over individual tests. Relying on existing test data is a cost-effective approach to performance validity assessment.
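
The multivariate decision rules evaluated here (failing ≥3 of six EVIs at liberal cutoffs, or ≥2 at more conservative cutoffs) reduce to counting cutoff failures. The sketch below uses placeholder cutoff values, not the published WAIS-IV cutoffs:

```python
# Placeholder liberal cutoffs for six WAIS-IV embedded validity
# indicators (scaled scores or difference scores); values are
# illustrative, NOT the published cutoffs.
LIBERAL = {"Coding": 6, "SymbolSearch": 6, "DigitSpan": 6,
           "LetterNumSeq": 6, "Vocab_minus_DS": 5, "CD_minus_SS": 4}
CONSERVATIVE = {k: v - 1 for k, v in LIBERAL.items()}  # stricter by one point

def n_failures(scores, cutoffs):
    """Count EVIs at or below their validity cutoff."""
    return sum(scores[k] <= cutoffs[k] for k in cutoffs)

scores = {"Coding": 5, "SymbolSearch": 7, "DigitSpan": 6,
          "LetterNumSeq": 8, "Vocab_minus_DS": 3, "CD_minus_SS": 5}

flag_liberal = n_failures(scores, LIBERAL) >= 3            # >=3 of 6 EVIs
flag_conservative = n_failures(scores, CONSERVATIVE) >= 2  # >=2 of 6 EVIs
print(flag_liberal, flag_conservative)
```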
Code
Full-text available
Statistical significance indicates whether a result is unlikely to be due to random variation within the data. However, not every significant result reflects an effect with high impact; it may even describe a phenomenon that is barely perceivable in everyday life. Statistical significance mainly depends on the sample size, the quality of the data, and the power of the statistical procedures. If large data sets are at hand, as is often the case, e.g., in epidemiological studies or in large-scale assessments, very small effects may reach statistical significance. To describe whether effects have a relevant magnitude, effect sizes are used to quantify the strength of a phenomenon. The most popular effect size measure is surely Cohen's d (Cohen, 1988), but there are many more. On https://www.psychometrica.de/effect_size.html you will find online calculators for Cohen's d, Glass' Delta, Hedges' g, Odds Ratio, Eta Square, calculation of effects from dependent and independent t-tests, ANOVAs and other repeated-measures designs, non-parametric effect sizes (Kruskal-Wallis, Number Needed to Treat, Common Language Effect Size), conversion tools, and tables for interpretation. The code for computing these measures is available as JavaScript in the header of the webpage's source code.
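
Since Cohen's d recurs throughout the studies in this listing, a short self-contained sketch of the standard pooled-variance formula (Cohen, 1988) may be useful; the sample data are arbitrary:

```python
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference scaled by the pooled standard
    deviation (sample variances weighted by degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) +
                  (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

a = [12, 14, 15, 16, 18]  # arbitrary example data
b = [9, 10, 12, 13, 11]
print(f"d = {cohens_d(a, b):.2f}")
```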
Article
Full-text available
Multiple sclerosis is a chronic autoimmune disease of the central nervous system that results in varying degrees of disability. Progressive multiple sclerosis, characterized by a steady increase in neurological disability independently of relapses, can occur from onset (primary progressive) or after a relapsing–remitting course (secondary progressive). As opposed to active inflammation seen in the relapsing–remitting phases of the disease, the gradual worsening of disability in progressive multiple sclerosis results from complex immune mechanisms and neurodegeneration. A few anti-inflammatory disease-modifying therapies with a modest but significant effect on measures of disease progression have been approved for the treatment of progressive multiple sclerosis. The treatment effect of anti-inflammatory agents is particularly observed in the subgroup of patients with younger age and evidence of disease activity. For this reason, a significant effort is underway to develop molecules with the potential to induce myelin repair or halt the degenerative process. Appropriate trial methodology and the development of clinically meaningful disability outcome measures along with imaging and biological biomarkers of progression have a significant impact on the ability to measure the efficacy of potential medications that may reverse disease progression. In this issue, we will review current evidence on the physiopathology, diagnosis, measurement of disability, and treatment of progressive multiple sclerosis.
Article
Full-text available
Objective: Discrimination of patients passing vs. failing the Word Memory Test (WMT) by performance on 11 performance and symptom validity tests (PVTs, SVTs) from the Meyers Neuropsychological Battery (MNB) at per-test false positive cutoffs ranging from 0 to 15%. PVT and SVT intercorrelation in subgroups passing and failing the WMT, as well as the degree of skew of the individual PVTs and SVT in the pass/fail subgroups, were also analyzed. Method: In 255 clinical and forensic cases, 100 failed and 155 passed the WMT, at a base-rate of invalid performance of 39.2%. Performance was contrasted on 10 PVTs and 1 SVT from the MNB, using per-test false positive rates of 0.0%, 3.3%, 5.0%, 10.0%, and 15.0% in discriminating WMT pass and WMT fail groups. These two WMT groups were also contrasted using the 10 PVTs and 1 SVT as continuous variables in a logistic regression. Results: The per-PVT false positive rate of 10% yielded the highest WMT pass/fail classification, and more closely approximated the classification obtained by logistic regression than other cut scores. PVT and SVT correlations were higher in cases failing the WMT, and data were more highly skewed in those passing the WMT. Conclusions: The optimal per-PVT and SVT cutoff is at a false positive rate of 10%, with failure of ≥3 PVTs/SVTs out of 11 yielding sensitivity of 61.0% and specificity of 90.3%. PVTs with the best classification had the greatest degree of skew in the WMT pass subgroup.
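
Setting a per-test cutoff at a fixed false positive rate, as done in this study, amounts to taking a low percentile of the credible (here, WMT-pass) group's score distribution. A sketch with illustrative scores, not MNB data:

```python
def cutoff_at_fp_rate(credible_scores, fp_rate=0.10):
    """Return the score at roughly the fp_rate-th percentile of the
    credible group, so that only ~fp_rate of credible cases fall at
    or below the cutoff."""
    ordered = sorted(credible_scores)
    k = max(0, int(fp_rate * len(ordered)) - 1)
    return ordered[k]

credible = [55, 58, 60, 61, 63, 64, 66, 68, 70, 72,
            74, 75, 77, 79, 80, 81, 83, 84, 86, 88]  # illustrative
cut = cutoff_at_fp_rate(credible, 0.10)
print(f"Fail if score <= {cut}")  # ~10% of credible cases fail
```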
Article
Full-text available
Purpose: To promote understanding of cognitive impairment in multiple sclerosis (MS), recommend optimal screening, monitoring, and treatment strategies, and address barriers to optimal management. Methods: The National MS Society ("Society") convened experts in cognitive dysfunction (clinicians, researchers, and lay people with MS) to review the published literature, reach consensus on optimal strategies for screening, monitoring, and treating cognitive changes, and propose strategies to address barriers to optimal care. Recommendations: Based on current evidence, the Society makes the following recommendations, endorsed by the Consortium of Multiple Sclerosis Centers and the International Multiple Sclerosis Cognition Society: increased professional and patient awareness/education about the prevalence, impact, and appropriate management of cognitive symptoms; for adults and children (8+ years of age) with clinical or magnetic resonance imaging (MRI) evidence of neurologic damage consistent with MS, as a minimum, early baseline screening with the Symbol Digit Modalities Test (SDMT) or a similarly validated test when the patient is clinically stable, and annual re-assessment with the same instrument, or more often as needed, to (1) detect acute disease activity, (2) assess for treatment effects (e.g., starting/changing a disease-modifying therapy) or for relapse recovery, (3) evaluate progression of cognitive impairment, and/or (4) screen for new-onset cognitive problems; for adults (18+ years), more comprehensive assessment for anyone who tests positive on initial cognitive screening or demonstrates significant cognitive decline, especially if there are concerns about comorbidities or the individual is applying for disability due to cognitive impairment; for children (<18 years), neuropsychological evaluation for any unexplained change in school functioning (academic or behavioral); and remedial interventions/accommodations for adults and children to improve functioning at home, work, or school.
Article
Full-text available
To supplement memory-based performance validity tests (PVTs) in identifying noncredible performance, we examined the validity of the two most commonly used nonmemory-based PVTs—Dot Counting Test (DCT) and Wechsler Adult Intelligence Scale–Fourth Edition (WAIS-IV) Reliable Digit Span (RDS)—as well as two alternative WAIS-IV Digit Span (DS) subtest PVTs. Examinees completed the DCT, WAIS-IV DS, and the following criterion PVTs: Test of Memory Malingering, Word Memory Test, and Word Choice Test. Validity groups were determined by passing 3 (valid; n = 69) or failing ≥2 (noncredible; n = 30) criterion PVTs. DCT, RDS, RDS–Revised (RDS-R), and WAIS-IV DS Age-Corrected Scaled Score (ACSS) were significantly correlated (but uncorrelated with memory-based PVTs). Combining RDS, RDS-R, and ACSS with DCT improved classification accuracy (particularly for DCT/ACSS) for detecting noncredible performance among valid-unimpaired, but largely not valid-impaired, examinees. Combining DCT with ACSS may uniquely assess and best supplement memory-based PVTs to identify noncredible neuropsychological test performance in cognitively unimpaired examinees.
Article
Full-text available
Objective: This study was designed to cross-validate previously published performance validity cutoffs embedded within the Complex Ideational Material (CIM) and the Boston Naming Test–Short Form (BNT–15). Method: Seventy healthy undergraduate students were randomly assigned to either a control condition (n = 40), with instructions to perform to the best of their ability, or an experimental malingering condition (n = 30), with instructions to feign cognitive impairment while avoiding detection. All participants were administered the same battery of neuropsychological tests. Results: Previously published validity cutoffs within the CIM (raw score ≤9 or T-score ≤29) and BNT–15 (≤12) produced good classification accuracy using both experimental malingering and psychometrically defined invalid responding as criterion variables. However, a BNT–15 completion time ≥85 s produced a better signal detection profile than BNT–15 accuracy scores. Conclusions: Results support the clinical utility of existing cutoffs. Given the relatively high base rate of failure even in the control group (5–15%), and the perfect specificity of CIM ≤9 and BNT–15 ≤11 to noncredible responding, relabeling this range of performance as “Abnormal” instead of “Impaired” would better capture the uncertainty in its clinical interpretation.
Article
Full-text available
Objective: Neurocognitive deficits are commonly an accompanying feature of multiple sclerosis (MS). A brief, yet comprehensive neuropsychological battery is desirable for assessing the extent of these deficits. Therefore, the present study examined the validity of the Mercy Evaluation of Multiple Sclerosis (MEMS) for use with the MS population. Methods: Archival data from individuals diagnosed with MS (N = 378) by independent neurologists were examined. Cognitive domains assessed included processing speed and attention, learning and memory, visuospatial functioning, language, and executive functioning. A mean battery index was calculated to provide a general indicator of cognitive impairment within the current sample. Results: Overall performance across participants was found to be in the lower limits of the average range. Results of factor analytic statistical procedures yielded a four-factor solution, accounting for 67% of total variance within the MEMS. Four neurocognitive measures exhibited the highest sensitivity in detecting cognitive impairment, constituting a psychometrically established brief cognitive screening battery, which accounted for 83% of total variance within the mean battery index score. Conclusion: Overall, the results of the current study suggest appropriate construct validity of the MEMS for use with individuals with MS, as well as provide support for previously established cognitive batteries.
Article
Full-text available
This study was designed to evaluate the potential of the Boston Naming Test (BNT) as a performance validity test (PVT). The classification accuracy of the BNT was examined against several criterion PVTs in a mixed clinical sample of 214 adult outpatients physician referred for neuropsychological assessment. Mean age was 46.7 (SD = 12.5); mean education was 13.5 (SD = 2.5). All participants were native speakers of English. A BNT raw score ≤ 50 produced high specificity (.87–.95), but low and variable sensitivity (.15–.41). Similarly, a T score ≤ 37 was specific (.87–.95), but not very sensitive (.15–.35) to psychometrically defined non-credible responding. Ipsative analyses (i.e., case-by-case review of individual PVT profiles) suggest that failing these cutoffs was associated with zero false positives when all available PVTs were taken into account. Results are consistent with previous reports that the validity cutoffs on the BNT have high positive predictive power, but low negative predictive power. As such, they are useful in ruling in invalid performance, but they cannot be used to rule it out.
Article
Full-text available
Objective: Embedded performance validity tests (PVTs) within the Hopkins Verbal Learning Test-Revised (HVLT-R) and Brief Visuospatial Memory Test-Revised (BVMT-R) were recently identified. This study aimed to further validate/replicate these embedded PVTs. Method: Eighty clinically referred veterans who underwent neuropsychological evaluation were included. Validity groups were established by passing/failing 2-3 well-validated PVTs, with 75% (n = 60) classified as valid and 25% (n = 20) noncredible. Fifty-two percent of valid participants were cognitively impaired. Results: HVLT-R Recognition Discrimination (RD) of ≤5 yielded 67% sensitivity/80% specificity for identifying noncredible performance. Removal of seven valid participants with an amnestic profile who produced a false positive, improved specificity to 92%, which replicated the original findings. Replication efforts failed for BVMT-R Percent Retained; however, significant findings for RD were elucidated. Conclusion: Replication efforts were positive for the HVLT-R embedded PVT, corroborating its ability to identify invalid performance in this heterogeneous clinical veteran sample with and without cognitive impairment.
Article
Full-text available
In verbal fluency (VF) tests, subjects articulate words in a specified category during a short test period (typically 60 s). Verbal fluency tests are widely used to study language development and to evaluate memory retrieval in neuropsychiatric disorders. Performance is usually measured as the total number of correct words retrieved. Here, we describe the properties of a computerized VF (C-VF) test that tallies correct words and repetitions while providing additional lexical measures of word frequency, syllable count, and typicality. In addition, the C-VF permits (1) the analysis of the rate of responding over time, and (2) the analysis of the semantic relationships between words using a new method, Explicit Semantic Analysis (ESA), as well as the established semantic clustering and switching measures developed by Troyer et al. (1997). In Experiment 1, we gathered normative data from 180 subjects ranging in age from 18 to 82 years in semantic (“animals”) and phonemic (letter “F”) conditions. The number of words retrieved in 90 s correlated with education and daily hours of computer use. The rate of word production declined sharply over time during both tests. In semantic conditions, correct-word scores correlated strongly with the number of ESA- and Troyer-defined semantic switches as well as with an ESA-defined semantic organization index (SOI). In phonemic conditions, ESA revealed significant semantic influences in the sequence of words retrieved. In Experiment 2, we examined the test-retest reliability of different measures across three weekly tests in 40 young subjects. Different categories were used for each semantic (“animals”, “parts of the body”, and “foods”) and phonemic (letters “F”, “A”, and “S”) condition. After regressing out the influences of education and computer use, we found that correct-word z-scores in the first session did not differ from those of the subjects in Experiment 1. Word production was uniformly greater in semantic than phonemic conditions. Intraclass correlation coefficients (ICCs) of correct-word z-scores were higher for phonemic (0.91) than semantic (0.77) tests. In semantic conditions, good reliability was also seen for the SOI (ICC = 0.68) and ESA-defined switches in semantic categories (ICC = 0.62). In Experiment 3, we examined the performance of subjects from Experiment 2 when instructed to malinger: 38% showed abnormal (p < 0.05) performance in semantic conditions. Simulated malingerers with abnormal scores could be distinguished with 80% sensitivity and 89% specificity from subjects with abnormal scores in Experiment 1 using lexical, temporal, and semantic measures. In Experiment 4, we tested patients with mild and severe traumatic brain injury (mTBI and sTBI). Patients with mTBI performed within the normal range, while patients with sTBI showed significant impairments in correct-word z-scores and category shifts. The lexical, temporal, and semantic measures of the C-VF provide an automated and comprehensive description of verbal fluency performance.
Article
Full-text available
Past studies have examined the ability of the Wisconsin Card Sorting Test (WCST) to discriminate valid from invalid performance in adults using both individual embedded validity indicators (EVIs) and multivariate approaches. This study is designed to investigate whether the two most stable of these indicators—failures to maintain set (FMS) and the logistic regression equation S-BLRE—can be extended to pediatric populations. The classification accuracy of FMS and S-BLRE was examined in a mixed clinical sample of 226 children aged 7 to 17 years (64.6% male, M age = 13.6 years) against a combination of established performance validity tests (PVTs). The results show that at adult cutoffs, FMS and S-BLRE produce an unacceptably high failure rate (33.2% and 45.6%) and low specificity (.55–.72), but an upward adjustment in cutoffs significantly improves classification accuracy. Defining Pass as <2 and Fail as ≥4 on FMS results in consistently good specificity (.89–.92) but low and variable sensitivity (.00–.33). Similarly, cutting the S-BLRE distribution at 3.68 produces good specificity (.90–.92) but variable sensitivity (.06–.38). Passing or failing FMS or S-BLRE is unrelated to age, gender, and IQ. The data from this study suggest that in a pediatric sample, adjusted cutoffs on the FMS and S-BLRE ensure good specificity, but with low or variable sensitivity. Thus, they should not be used in isolation to determine the credibility of a response set. At the same time, they can make valuable contributions to pediatric neuropsychology by providing empirically supported, expedient, and cost-effective indicators to enhance performance validity assessment.
Article
Full-text available
Objectives: The Forced Choice Recognition (FCR) trial of the California Verbal Learning Test, 2nd Edition, was designed as an embedded performance validity test (PVT). To our knowledge, this is the first systematic review of its classification accuracy against reference PVTs. Methods: Results from peer-reviewed studies with FCR data published since 2002, encompassing a variety of clinical, research, and forensic samples, were summarized, including 37 studies with FCR failure rates (N = 7575) and 17 with concordance rates with established PVTs (N = 4432). Results: All healthy controls scored >14 on FCR. On average, 16.9% of the entire sample scored ≤14, while 25.9% failed reference PVTs. Presence or absence of external incentives to appear impaired (as identified by researchers) resulted in different failure rates (13.6% vs. 3.5%), as did failing or passing reference PVTs (49.0% vs. 6.4%). FCR ≤14 produced an overall classification accuracy of 72%, demonstrating higher specificity (.93) than sensitivity (.50) to invalid performance. Failure rates increased with the severity of cognitive impairment. Conclusions: In the absence of serious neurocognitive disorder, FCR ≤14 is highly specific, but only moderately sensitive, to invalid responding. Passing FCR does not rule out a non-credible presentation, but failing FCR rules it in with high accuracy. The heterogeneity in sample characteristics and reference PVTs, as well as the quality of the criterion measure across studies, is a major limitation of this review and of the basic methodology of PVT research in general.
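
The asymmetry described in the conclusions (failing FCR rules noncredible performance in; passing does not rule it out) follows from the predictive values implied by the reported sensitivity, specificity, and base rate. A quick check using Bayes' rule:

```python
def ppv_npv(sens, spec, base_rate):
    """Positive and negative predictive values from sensitivity,
    specificity, and the base rate of noncredible performance."""
    tp = sens * base_rate
    fp = (1 - spec) * (1 - base_rate)
    fn = (1 - sens) * base_rate
    tn = spec * (1 - base_rate)
    return tp / (tp + fp), tn / (tn + fn)

# Figures reported above: sensitivity .50, specificity .93, and a
# reference-PVT failure base rate of roughly .26.
ppv, npv = ppv_npv(0.50, 0.93, 0.26)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # ~.72 and ~.84
```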
Article
Full-text available
Objective: This study compared failure rates on performance validity tests (PVTs) across liberal and conservative cutoffs in a sample of undergraduate students participating in academic research. Method: Participants (n = 120) were administered four free-standing PVTs (Test of Memory Malingering, Word Memory Test, Rey 15-Item Test, Hiscock Forced-Choice Procedure) and three embedded PVTs (Digit Span, letter and category fluency). Participants also reported their perceived level of effort during testing. Results: At liberal cutoffs, 36.7% of the sample failed ≥1 PVTs, 6.7% failed ≥2, and .8% failed 3. At conservative cutoffs, 18.3% of the sample failed ≥1 PVTs, 2.5% failed ≥2, and .8% failed 3. Participants were 3 to 5 times more likely to fail embedded (15.8-30.8%) compared to free-standing PVTs (3.3-10.0%). There was no significant difference in failure rates between native and non-native English speaking participants at either liberal or conservative cutoffs. Additionally, there was no relation between self-reported effort and PVT failure rates. Conclusions: Although PVT failure rates varied as a function of PVTs and cutoffs, between a third and a fifth of the sample failed ≥1 PVTs, consistent with high initial estimates of invalid performance in this population. Embedded PVTs had notably higher failure rates than free-standing PVTs. Assuming optimal effort in research using students as participants without a formal assessment of performance validity introduces a potentially significant confound in the study design.
Article
Full-text available
The present study examined the impact of performance validity test (PVT) failure on the Test of Premorbid Functioning (TOPF) in a sample of 252 neuropsychological patients. Word reading performance differed significantly according to PVT failure status, and number of PVTs failed accounted for 7.4% of the variance in word reading performance, even after controlling for education. Furthermore, individuals failing ≥2 PVTs were twice as likely as individuals passing all PVTs (33% vs. 16%) to have abnormally low obtained word reading scores relative to demographically predicted scores when using a normative base rate of 10% to define abnormality. When compared with standardization study clinical groups, those failing ≥2 PVTs were twice as likely as patients with moderate to severe traumatic brain injury and as likely as patients with Alzheimer’s dementia to obtain abnormally low TOPF word reading scores. Findings indicate that TOPF word reading based estimates of premorbid functioning should not be interpreted in individuals invalidating cognitive testing.
Article
Full-text available
The Victoria Symptom Validity Test (VSVT) is one of the most accurate performance validity tests. Previous research has recommended several cutoffs for performance invalidity classification on the VSVT. However, only one of these studies used a known groups design and no study has investigated these cutoffs in an exclusively mild traumatic brain injury (mTBI) medico-legal sample. The current study used a known groups design to validate VSVT cutoffs among mild traumatic brain injury litigants and explored the best approach for using the multiple recommended cutoffs for this test. Cutoffs of <18 Hard items correct, <41 Total items correct, an Easy - Hard items correct difference >6, and <5 items correct on any block yielded the strongest classification accuracy. Using multiple cutoffs in conjunction reduced classification accuracy. Given convergence across studies, a cutoff of <18 Hard items correct is the most appropriate for use with mTBI litigants.
Article
Objective: Citation and download data pertaining to the 2009 AACN consensus statement on validity assessment indicated that the topic maintained high interest in subsequent years, during which key terminology evolved and relevant empirical research proliferated. With a general goal of providing current guidance to the clinical neuropsychology community regarding this important topic, the specific update goals were to: identify current key definitions of terms relevant to validity assessment; learn what experts believe should be reaffirmed from the original consensus paper, as well as new consensus points; and incorporate the latest recommendations regarding the use of validity testing, as well as current application of the term 'malingering.' Methods: In the spring of 2019, four of the original 2009 work group chairs and additional experts for each work group were impaneled. A total of 20 individuals shared ideas and writing drafts until reaching consensus on January 21, 2021. Results: Consensus was reached regarding affirmation of prior salient points that continue to garner clinical and scientific support, as well as creation of new points. The resulting consensus statement addresses definitions and differential diagnosis, performance and symptom validity assessment, and research design and statistical issues. Conclusions/Importance: In order to provide bases for diagnoses and interpretations, the current consensus is that all clinical and forensic evaluations must proactively address the degree to which results of neuropsychological and psychological testing are valid. There is a strong and continually growing evidence-based literature on which practitioners can confidently base their judgments regarding the selection and interpretation of validity measures.
Article
Clinicians who evaluate patients with concerns related to attention-deficit/hyperactivity disorder (ADHD) are encouraged to include validity indicators throughout clinical assessment procedures. To date, no known previous literature has examined the Wisconsin Card Sorting Test (WCST) specifically to address noncredible ADHD, and none has attempted to identify an embedded PVT within the 64-card version. The present study sought to address these gaps in the literature with a simulation study. Sixty-seven undergraduate participants (M age = 19.30) were grouped as credible (combining healthy controls and individuals with ADHD) or noncredible (combining coached and uncoached participants simulating ADHD-related impairment) and administered a battery of neuropsychological tests. Results revealed that the noncredible group performed significantly worse on several WCST-64 variables, including failure to maintain set, number of trials to first category, and total categories. Raw scores from these variables were entered as predictors as one set in a logistic regression (LR) with group membership as the outcome variable. An exponentiated equation (EE) derived from LR results yielded acceptable discriminability (area under the receiver operating characteristic curve = .73) with modest sensitivity (.38) while maintaining ideal specificity (.91), generally commensurate with a standalone forced-choice memory PVT and better than an embedded attention-based PVT. These findings suggested that the WCST-64 may be sensitive to noncredible performance in the context of ADHD and reiterate the importance of considering tests of various cognitive abilities in the evaluation of performance validity. Implications of these findings, limitations of the present study, and directions for future inquiry, including cross-validation in clinical samples, were discussed.
Article
Disagreements in science and medicine are not uncommon, and formal exchanges of disagreements serve a variety of valuable roles. As identified by a Nature Methods editorial entitled "The Power of Disagreement" (Nature Methods, 2016, 13(3), 185, https://doi.org/10.1038/nmeth.3798), disagreements bring attention to best practices so that differences in interpretation do not result from inferior data sets or confirmation bias, "prompting researchers to take a second look at evidence that is not in agreement with their hypothesis, rather than dismiss it as artifacts." Graver and Green published reasons why they disagree with a recent clinical case report and a decades-old randomized controlled trial characterizing the effect of an acute 2 mg dose of lorazepam on the Word Memory Test. In this article, we formally responded to their commentary to further clarify the reasons for our data interpretations. These two opposing views provide an excellent learning opportunity, particularly for students, demonstrating the importance of careful articulation of the rationale behind certain conclusions from different perspectives. We encourage careful review of the original articles being discussed so that neuropsychologists can read both positions and decide which interpretation of the findings they consider most sound.
Article
Performance validity tests (PVTs) are widely used in attempts to quantify effort and/or detect negative response bias during neuropsychological testing. However, it can be challenging to interpret the meaning of poor PVT performance in a clinical context. Compensation-seeking populations predominate in the PVT literature. We aimed to establish base rates of PVT failure in clinical populations without known external motivation to underperform. We searched MEDLINE, EMBASE and PsycINFO for studies reporting PVT failure rates in adults with defined clinical diagnoses, excluding studies of active or veteran military personnel, forensic populations or studies of participants known to be litigating or seeking disability benefits. Results were summarised by diagnostic group and implications discussed. Our review identified 69 studies, and 45 different PVTs or indices, in clinical populations with intellectual disability, degenerative brain disease, brain injury, psychiatric disorders, functional disorders and epilepsy. Various pass/fail cut-off scores were described. PVT failure was common in all clinical groups described, with failure rates for some groups and tests exceeding 25%. PVT failure is common across a range of clinical conditions, even in the absence of obvious incentive to underperform. Failure rates are no higher in functional disorders than in other clinical conditions. As PVT failure indicates invalidity of other attempted neuropsychological tests, the finding of frequent and unexpected failure in a range of clinical conditions raises important questions about the degree of objectivity afforded to neuropsychological tests in clinical practice and research.
Article
Objective: This study examined the specificity of both individual PVTs and three different PVT batteries in individuals undergoing neuropsychological evaluation for dementia in order to establish both appropriate individual test cutoffs and multiple-PVT failure criteria. Methods: Participants were 311 validly performing patients with no cognitive impairment (n = 24), mild cognitive impairment (MCI; n = 115), mild dementia (n = 122), or moderate dementia (n = 50). Cutoffs associated with ≥90% specificity were established for 11 individual PVTs across impairment severity groups. Aggregate false positive rates according to number of PVTs failed were examined for two 4-PVT batteries and one 7-PVT battery. One-way ANOVAs with post-hoc comparisons were conducted for each PVT. Results: Performance on 9 of 11 PVTs significantly differed according to impairment severity. PVT cutoffs achieving ≥90% specificity also generally varied by group. For PVTs previously validated in non-dementia samples, slight adjustments from established cutoffs were generally required to maintain adequate specificity in MCI and mild dementia groups, with greater modifications required in the moderate dementia group. A criterion of ≥2 PVT failures resulted in ≥90% specificity in both 4-PVT batteries across groups. In the 7-PVT battery, adequate specificity was achieved with ≥2 failures in MCI and ≥3 failures in the mild dementia group. Conclusions: The incorporation and interpretation of several easily assimilated multiple-PVT batteries in dementia evaluations are explored. Additionally, data regarding individual PVT performance according to cognitive impairment severity are provided to aid validity assessment of both patients undergoing dementia evaluation and examinees who are less impaired.
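
Aggregate false positive rates for multiple-PVT failure criteria like those above can be approximated by assuming independent tests; this is a simplification, since PVT failures are known to correlate (see the Meyers battery study earlier in this listing), which is why empirically derived aggregate rates are preferable. A sketch of the independence approximation:

```python
from itertools import combinations
from math import prod

def aggregate_fp_rate(per_test_fp, k_required):
    """P(a validly performing examinee fails >= k_required of the
    PVTs) under an independence assumption."""
    n = len(per_test_fp)
    total = 0.0
    for k in range(k_required, n + 1):
        for failed in combinations(range(n), k):
            total += prod(per_test_fp[i] if i in failed
                          else 1 - per_test_fp[i] for i in range(n))
    return total

# Four PVTs each set at ~10% false positives: requiring >=2 failures
# keeps the aggregate false positive rate near 5%.
print(f"{aggregate_fp_rate([0.10] * 4, 2):.3f}")  # ~0.052
```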
Article
Introduction: Embedded performance validity tests (PVTs) allow for continuous and economical validity assessment during neuropsychological evaluations; however, similar to their freestanding counterparts, a limitation of well-validated embedded PVTs is that the majority are memory-based. This study cross-validated several previously identified non-memory-based PVTs derived from language, processing speed, and executive functioning tests within a single mixed clinical neuropsychiatric sample with and without cognitive impairment. Method: This cross-sectional study included data from 124 clinical patients who underwent outpatient neuropsychological evaluation. Validity groups were determined by four independent criterion PVTs (failing ≤1 or ≥2), resulting in 98 valid (68% cognitively impaired) and 26 invalid performances. In total, 23 previously identified embedded PVTs derived from Verbal Fluency (VF), Trail Making Test (TMT), Stroop (SCWT), and Wisconsin Card Sorting Test (WCST) were examined. Results: All VF, SCWT, and TMT PVTs, along with WCST Categories, significantly differed between validity groups (ηp² = .05-.22), with areas under the curve (AUCs) of .65-.81 and 19-54% sensitivity (≥89% specificity) at optimal cut-scores. When subdivided by impairment status, all PVTs except for WCST Failures to Maintain Set were significant (AUCs = .75-.94) with 33-85% sensitivity (≥90% specificity) in the cognitively unimpaired group. Among the cognitively impaired group, most VF, TMT, and SCWT PVTs remained significant, albeit with decreased accuracy (AUCs = .65-.76) and sensitivities (19-54%) at optimal cut-scores, whereas all WCST PVTs were nonsignificant. Across groups, SCWT embedded PVTs evidenced the strongest psychometric properties. Conclusion: VF, TMT, and SCWT embedded PVTs generally demonstrated moderate accuracy for identifying invalid neuropsychological performance. However, performance on these non-memory-based PVTs from processing speed and executive functioning tests is not immune to the effects of cognitive impairment, such that alternate cut-scores (with reduced sensitivity if adequate specificity is maintained) are indicated in cases where the clinical history is consistent with cognitive impairment. In contrast, WCST indices generally had poor accuracy.
Article
Objective: Performance validity tests (PVTs) are designed to detect nonvalid responding on neuropsychological testing, but their associations with disease-specific and other factors are not well understood in multiple sclerosis (MS). We examined PVT performance among MS patients and associations with clinical characteristics, cognition, mood, and disability status. Method: Retrospective data analysis was conducted on a sample of patients with definite MS (n = 102) who were seen for a clinical neuropsychological evaluation. Comparison samples included patients with intractable epilepsy seen for presurgical workup (n = 102) and patients with nonacute mild traumatic brain injury (mTBI; n = 50). Patients completed the Victoria Symptom Validity Test (VSVT), and validity cutoffs were defined as <16/24 and <18/24 on the hard items. Results: In this MS cohort, 14.4% of patients scored <16 on the VSVT hard items and 21.2% scored <18. VSVT hard item scores were associated with disability status and depression, but not with neuropsychological scores, T2 lesion burden, atrophy, disease duration, or MS subtype. Patients applying for disability benefits were 6.75 times more likely to score <18 relative to those who were not seeking disability. Rates of nonvalid scores were similar to the mTBI group and greater than the epilepsy group. Conclusions: This study demonstrates that nonvalid VSVT scores are relatively common among MS patients seen for clinical neuropsychological evaluation. VSVT performance in this group relates primarily to disability status and psychological symptoms and does not reflect factors specific to MS (i.e., cognitive impairment, disease severity). Recommendations for future clinical and research practices are provided.
Article
Objective: Base rates of invalidity in forensic neuropsychological contexts are well explored and believed to approximate 40%, whereas base rates of invalidity across clinical non-forensic contexts are relatively less known. Methods: Adult-focused neuropsychologists (n = 178) were surveyed regarding base rates of invalidity across various clinical non-forensic contexts and practice settings. Median values were calculated and compared across contexts and settings. Results: The median estimated base rate of invalidity across clinical non-forensic evaluations was 15%. When examining specific clinical contexts and settings, base rate estimates varied from 5% to 50%. Patients with medically unexplained symptoms (50%), external incentives (25%-40%), and oppositional attitudes toward testing (37.5%) were reported to have the highest base rates of invalidity. Patients with psychiatric illness, patients evaluated for attention deficit hyperactivity disorder, and patients with a history of mild traumatic brain injury were also reported to invalidate testing at relatively high base rates (approximately 20%). Conversely, patients presenting for dementia evaluation and patients with none of the previously mentioned histories and for whom invalid testing was unanticipated were estimated to produce invalid testing in only 5% of cases. Regarding practice setting, Veterans Affairs providers reported base rates of invalidity nearly twice those of any other clinical setting. Conclusions: Non-forensic clinical patients presenting with medically unexplained symptoms, external incentives, or oppositional attitudes are reported to invalidate testing at base rates similar to those of forensic examinees. The impact of context-specific base rates on the clinical evaluation of invalidity is discussed.
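Base rates matter because they determine how much a PVT failure should shift the estimated probability of invalidity. A small Bayes sketch makes the point; the sensitivity and specificity values are assumed for illustration and are not figures from the survey.

```python
# Sketch: how the base rate of invalidity changes the positive predictive
# value (PPV) of a single PVT failure. Sensitivity and specificity are
# assumed illustrative values, not figures reported in the survey above.
def ppv(base_rate: float, sensitivity: float, specificity: float) -> float:
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

for rate in (0.05, 0.15, 0.40):  # e.g., dementia clinic, general clinical, forensic
    print(f"base rate {rate:.0%}: PPV of a failure = "
          f"{ppv(rate, sensitivity=0.50, specificity=0.90):.2f}")
```

At a 5% base rate, even a failure on a PVT with .90 specificity leaves invalidity more unlikely than likely, which is the caution implied by the survey's low dementia-setting estimate.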
Article
Pediatric neuropsychologists are increasingly recognizing the importance of performance validity testing during evaluations. The use of such measures to detect insufficient effort is of particular importance in pediatric epilepsy evaluations, where test results are often used to guide surgical decisions and failure to detect poor task engagement can result in postsurgical cognitive decline. The present investigation assesses the utility of the Medical Symptom Validity Test (MSVT) in 104 clinically referred children and adolescents with epilepsy. Though the overall failure rate across the total group was 15.4%, children with 2nd-grade or higher reading skills (a requirement of the task) passed at a very high rate (96.6%). Of the three failures in this subgroup, two were unequivocally deemed true positives, while the third failed due to extreme somnolence during testing. Notably, for those with ≥2nd grade reading levels, MSVT validity indices were unrelated to patient age, intellectual functioning, or age of epilepsy onset, while modest relations were seen with specific memory measures, number of epilepsy medications, and seizure frequency. Despite these associations, however, this did not result in more failures in this population of children and adolescents with substantial neurologic involvement, as pass rates exceeded 92% for those with intellectual disability, high seizure frequency, high medication burden, and even prior surgical resection of critical memory structures.
Article
Objective: Data for the use of embedded performance validity tests (ePVTs) with multiple sclerosis (MS) patients are limited. The purpose of the current study was to determine whether ePVTs previously validated in other neurological samples perform similarly in an MS sample. Methods: In this retrospective study, the prevalence of below-criterion responding at different cut-off scores was calculated for each ePVT of interest among patients with MS who passed a stand-alone PVT. Results: Previously established PVT cut-offs generally demonstrated acceptable specificity when applied to our sample. However, the overall cognitive burden of the sample was limited relative to that observed in prior large-scale MS studies. Conclusion: The current study provides initial data regarding the performance of select ePVTs among an MS sample. Results indicate most previously validated cut-offs avoid excessive false positive errors in a predominantly relapsing remitting MS sample. Further validation among MS patients with more advanced disease is warranted.
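The core computation in a specificity study of this kind is simple: apply each previously published cut-off to a sample presumed credible (here, patients who passed a stand-alone PVT) and count false positives. A sketch follows; the measures, score distributions, and cut-offs are hypothetical placeholders rather than the study's actual values.

```python
# Sketch: estimating specificity of previously published ePVT cut-offs in a
# presumed-credible sample (all passed a stand-alone PVT). The measure
# names, distributions, and cut-offs below are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(1)
credible_scores = {"trails_b_seconds": rng.normal(75, 20, 120),
                   "stroop_word_raw": rng.normal(90, 15, 120)}
# Direction matters: Trails B is failed above its cut-off, Stroop below it.
cutoffs = {"trails_b_seconds": (">=", 130.0), "stroop_word_raw": ("<=", 60.0)}

for name, scores in credible_scores.items():
    op, cut = cutoffs[name]
    fails = scores >= cut if op == ">=" else scores <= cut
    specificity = 1 - fails.mean()
    print(f"{name}: specificity = {specificity:.2f} "
          f"(false positives = {fails.sum()}/{fails.size})")
```

A cut-off is typically deemed acceptable in such designs when specificity stays at or above .90, i.e., no more than about 10% of credible patients are flagged.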
Article
Objectives: Descriptive labels of performance test scores are a critical component of communicating outcomes of neuropsychological and psychological evaluations. Yet, no universally accepted system exists for assigning qualitative descriptors to scores in specific ranges. In addition, the definition and use of the term “impairment” lacks specificity and consensus. Consequently, test score labels and the denotation of impairment are inconsistently applied by clinicians, creating confusion among consumers of neuropsychological services, including referral sources, trainees, colleagues, and the judicial system. To reduce this confusion, experts in clinical and forensic neuropsychological and psychological assessment convened in a consensus conference at the 2018 Annual Meeting of the American Academy of Clinical Neuropsychology (AACN). The goals of the consensus conference were to recommend (1) a system of qualitative labels to describe results from performance-based tests with normal and non-normal distributions and (2) a definition of impairment and its application in individual case determinations. Results: The goals of the consensus conference were met resulting in specific recommendations for the application of uniform labels for performance tests and for the definition of impairment, which are described in this paper. In addition, included in this consensus statement is a description of the conference process and the rationales for these recommendations. Conclusions/Importance: This consensus conference is the first formal attempt by the professional neuropsychological community to make recommendations for uniform performance test score labels and to advance a consistent definition of impairment. Using uniform descriptors and terms will reduce confusion and enhance report comprehensibility by the consumers of our reports as well as our trainees and colleagues.
Article
Background: Performance Validity Testing (PVT) decision-making rules may be indeterminate in patients with neurological disease in which PVT characteristics have not been adequately studied. We report a patient with multiple sclerosis (MS) who failed computerized PVT testing but had normal memory scores with a neuropsychological profile consistent with expected MS disease-related weaknesses. Method: Neuropsychological testing was conducted on two occasions in a middle-aged woman with an established MS diagnosis to address concerns of possible memory decline. Testing was discontinued after PVT scores below recommended cut-points were obtained during the first evaluation. During the second assessment, subthreshold PVT scores on a different computerized PVT were obtained, but unlike the first assessment, the entire neuropsychological protocol was administered. Results: Despite subthreshold computerized PVT scores, normal learning and memory performance was obtained, providing objective data to answer the referral question. Other neuropsychological findings included decreased processing speed, poor working memory, and poor executive function consistent with her MS diagnosis. Embedded PVT scores were normal. Conclusions: We speculate that poor computerized PVT scores resulted from the disease-related features of MS, although we also discuss approaches to reconcile apparently contradictory PVT versus neuropsychological results if the contributions of disease-related variables to PVT scores are discounted. This case demonstrates the value of completing the assessment protocol despite obtaining PVT scores below publisher-recommended cutoffs in clinical evaluations. If subthreshold PVT scores are considered evidence of performance invalidity, it is still necessary to have an approach for interpreting seemingly credible neuropsychological test results rather than simply dismissing them as invalid.
Article
Longitudinal studies have shown inconsistent findings regarding the association between cognition, demographic characteristics, and clinical decline in relapsing-remitting multiple sclerosis (RRMS). Our objective was to further explore these relations over time, while also considering age and sex. A total of 183 patients with RRMS were assessed at two time points, using a neuropsychological battery and the Expanded Disability Status Scale (EDSS). For the first assessment, participants were divided by age (<29, 30–39, 40–49, 50–60) and sex. Next, they were divided according to their participation in one of three interval assessment points: 2–3, 4–5, and 6–8 years. Cognitive function was not correlated with disease duration but was negatively correlated with EDSS score. Men under 29 and women under 39 showed negative correlations between cognitive and clinical impairment. Executive functions, attention, and information processing speed (IPS) declined between the first and second assessments. Furthermore, at the 4–5 year interval IPS predicted EDSS scores, while at the 6–8 year interval both IPS and visuo-spatial ability did. Therefore, the relation between clinical status and cognition is not consistent across age and sex groups. Additionally, cognitive deterioration is only partially evident longitudinally; however, IPS appears to be the most sensitive predictor of one's future clinical condition.
Article
Introduction: Performance validity testing has developed into an indispensable element of neuropsychological assessment, mostly applied in forensic determinations. Its aim is to distinguish genuine patient performance from invalid test profiles. Limits to the applicability of performance validity tests (PVTs) may arise when genuine cognitive symptoms are present. Method: We studied the robustness of four commonly used PVTs in a sample of 15 acute patients after cerebrovascular stroke, with first manifestations of aphasia. Severity of aphasia varied from very mild to severe. Subsequent neuroimaging revealed left-hemisphere infarction for all participants. Results: The Test of Memory Malingering was the only measure found to be robust against effects of genuine language impairment (one positive on Trials 1 and 2, none on Trial 3), while unacceptable false-positive rates were found for the Fifteen-Item Test (60%) and two embedded measures, Reliable Spatial Span (40%) and Reliable Digit Span (73.3%). Four patients (26.7%) scored positive on at least three of the four PVTs. Conclusions: These data add to the ongoing discussion about the risk of false-positive classifications in genuine patient populations. Misdiagnosis with severe consequences for the patient in question may arise if results of PVTs are interpreted without concurrently considering the whole context of clinical evidence.
Article
Objective: This study investigated whether indices within the Brief Visuospatial Memory Test – Revised (BVMT-R) could function as embedded performance validity measures in an outpatient clinical sample. Method: A sample of 138 neuropsychological outpatients was utilized; approximately 45% had a known or suspected external incentive. Patients were determined to be valid performers if they passed all criterion performance validity tests (PVTs) and invalid performers if they failed two or more PVTs. BVMT-R indices met criteria for optimal embedded PVTs if they were not significantly correlated with genuine cognitive dysfunction and if they adequately differentiated the validly performing from the invalidly performing patient groups. Classification accuracy statistics for the indices were then calculated. Supplementary analyses were also conducted for a separate dementia sample. Results: Recognition Hits and Recognition Discrimination were identified as two optimal embedded PVTs for patients without dementia. Recognition Hits showed a sensitivity rate of 41% and a specificity rate of 95%. Recognition Discrimination showed a sensitivity rate of 54% and a specificity rate of 93%. Conclusion: Embedded BVMT-R PVTs are discussed in relation to previous research findings, which were obtained from veteran samples. Recognition Hits and Recognition Discrimination are now validated in a non-veteran clinical sample.
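The two-step screen described above (a candidate index should not track genuine cognitive ability, but should separate validity groups) can be sketched as follows. All arrays are simulated, and the significance conventions are common defaults rather than necessarily the study's exact criteria.

```python
# Sketch of a two-step screen for a candidate embedded PVT:
# (1) it should not correlate with genuine cognitive ability among valid
#     performers, and (2) it should separate valid from invalid groups.
# Simulated data; thresholds are conventional, not the study's criteria.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
valid_index = rng.normal(10, 2, 100)    # candidate index, valid group
invalid_index = rng.normal(7, 2, 30)    # candidate index, invalid group
cognition = valid_index * 0.1 + rng.normal(0, 2, 100)  # ability composite

# Step 1: correlation with genuine cognition among valid performers
r, p_corr = stats.pearsonr(valid_index, cognition)

# Step 2: group discrimination (rank-based test; U/(n1*n2) equals the AUC)
u, p_group = stats.mannwhitneyu(valid_index, invalid_index,
                                alternative="greater")
auc = u / (valid_index.size * invalid_index.size)

print(f"step 1: r = {r:.2f} (p = {p_corr:.3f}); "
      f"step 2: AUC = {auc:.2f} (p = {p_group:.4f})")
```

An index passes the screen when the step-1 correlation is nonsignificant (or trivially small) and the step-2 discrimination is both significant and of usable magnitude.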
Article
Objective: Assessment of performance validity is a necessary component of any neuropsychological evaluation. Prior research has shown that cutoff scores of ≤6 or ≤7 on Reliable Digit Span (RDS) can detect suboptimal effort across numerous adult clinical populations; however, these scores have not been validated for that purpose in an adult epilepsy population. This investigation aims to determine whether these previously established RDS cutoff scores could detect suboptimal effort in adults with epilepsy. Method: Sixty-three clinically referred adults with a diagnosis of epilepsy or suspected seizures were administered the Digit Span subtest of the Wechsler Adult Intelligence Scale (WAIS-III or WAIS-IV). Most participants (98%) passed Trial 2 of the Test of Memory Malingering (TOMM), achieving a score of ≥45. Results: The previously established cutoff scores of ≤6 and ≤7 on RDS yielded specificity rates of 85% and 77%, respectively. Findings also revealed that RDS scores were positively related to attention and intellectual functioning. Given the less than ideal specificity associated with each of these cutoff scores, together with their strong association with cognitive factors, secondary analyses were conducted to identify more optimal cutoff scores. Preliminary results suggest that an RDS cutoff score of ≤4 may be more appropriate in a clinically referred adult epilepsy population with low average IQ or below. Conclusions: Preliminary findings indicate that cutoff scores of ≤6 and ≤7 on RDS are not appropriate in adults with epilepsy, especially in individuals with low average IQ or below.
Article
The development of more sophisticated performance validity measures is important due to concerns about coaching as well as the need to provide clinicians with a greater variety of options when assessing performance validity. Examinees with noncredible performance may find it more difficult to elude detection by PVTs derived from arithmetical summation or logistic regression. The present study evaluated the classification accuracy of several executive functioning (EF) variables as PVTs, both individually and when combined into derived variables: a simple mathematical summation of embedded PVT scores and a logistic regression-based formula built from embedded PVTs from EF measures. A total of 155 consecutive patients who completed neuropsychological evaluation after sustaining a mild traumatic brain injury (MTBI) were studied and placed into a PVT-PASS (N = 95, mean age = 44.9, SD = 12.55, mean education = 13.45, SD = 2.23, 38% male, 97% Caucasian) or PVT-FAIL (N = 60, mean age = 44.1, SD = 15.47, mean education = 13.05, SD = 2.58, 55% male, 92% Caucasian) group. Scores from Trail Making Test B, the Wisconsin Card Sorting Test, and the Stroop Color Word Test were summed and also entered into a logistic regression to predict whether patients had credible performance. Both the mathematical summation and the logistic regression methods achieved excellent classification accuracy (summation AUC = .79; logistic regression AUC = .82) with higher sensitivity than individual PVTs.
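The two combination strategies compared above can be sketched side by side: a count of failed embedded indicators versus a logistic-regression composite fit to the raw scores. Everything below is simulated; the per-test cutoffs and fitted coefficients are illustrative assumptions, not the published formula.

```python
# Sketch: summation of dichotomized embedded-PVT failures versus a
# logistic-regression composite of the same EF scores. Simulated data;
# cutoffs and coefficients are illustrative, not the published formula.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n_pass, n_fail = 95, 60
y = np.concatenate([np.zeros(n_pass), np.ones(n_fail)])
# Columns: Trails B time (s), WCST errors, Stroop color-word raw score
X = np.vstack([
    np.column_stack([rng.normal(70, 20, n_pass),
                     rng.normal(20, 8, n_pass),
                     rng.normal(45, 8, n_pass)]),
    np.column_stack([rng.normal(100, 25, n_fail),
                     rng.normal(30, 10, n_fail),
                     rng.normal(35, 8, n_fail)]),
])

# Strategy 1: count of failed embedded indicators at per-test cutoffs
fails = ((X[:, 0] >= 95).astype(int)
         + (X[:, 1] >= 28).astype(int)
         + (X[:, 2] <= 38).astype(int))
auc_sum = roc_auc_score(y, fails)

# Strategy 2: logistic-regression composite fit to the raw scores
probs = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]
auc_lr = roc_auc_score(y, probs)

# AUCs here are in-sample, for illustration only; published formulas
# should be evaluated on independent samples.
print(f"summation AUC = {auc_sum:.2f}; logistic regression AUC = {auc_lr:.2f}")
```

The regression typically edges out the summation because it weights indicators by their discriminative value and exploits the continuous scores rather than dichotomized failures.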
Article
Objective: Over the past two decades, there has been much research on measures of response bias and myriad measures have been validated in a variety of clinical and research samples. This critical review aims to guide clinicians through the use of performance validity tests (PVTs) from test selection and administration through test interpretation and feedback. Method/results: Recommended cutoffs and relevant test operating characteristics are presented. Other important issues to consider during test selection, administration, interpretation, and feedback are discussed including order effects, coaching, impact on test data, and methods to combine measures and improve predictive power. When interpreting performance validity measures, neuropsychologists must use particular caution in cases of dementia, low intelligence, English as a second language/minority cultures, or low education. Conclusions: PVTs provide valuable information regarding response bias and, under the right circumstances, can provide excellent evidence of response bias. Only after consideration of the entire clinical picture, including validity test performance, can concrete determinations regarding the validity of test data be made.
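One common way to "combine measures and improve predictive power," as the review discusses, is to chain likelihood ratios across PVT outcomes. The sketch below assumes the PVTs are conditionally independent, which is a simplification, and the operating characteristics and base rate are assumed values for illustration.

```python
# Sketch: combining several PVT outcomes by chaining likelihood ratios.
# Assumes conditional independence of the PVTs (a simplification) and
# uses illustrative sensitivity/specificity values, not published figures.
def posttest_probability(pretest_p: float,
                         results: list[tuple[bool, float, float]]) -> float:
    odds = pretest_p / (1 - pretest_p)
    for failed, sens, spec in results:
        # LR+ for a failure, LR- for a pass
        lr = sens / (1 - spec) if failed else (1 - sens) / spec
        odds *= lr
    return odds / (1 + odds)

# Two failures and one pass, starting from a 15% clinical base rate
pvts = [(True, 0.50, 0.90), (True, 0.40, 0.92), (False, 0.55, 0.88)]
print(f"posttest probability = {posttest_probability(0.15, pvts):.2f}")
```

Because real PVTs are correlated, this chaining overstates the evidence somewhat; it is best read as an upper bound on how much multiple failures should move the clinician's estimate.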
Article
Objective : To examine the association between caregiver proxy report of executive function (EF) and dysregulated eating behavior in children with obesity. Participants were 195 youth with obesity aged 8-17 years, and their legal guardians. Youth height, weight, demographics, depressive symptoms, eating behaviors, and EF were assessed cross-sectionally during a medical visit. Analyses of covariance, adjusted for child age, gender, race/ethnicity, standardized BMI, depressive symptoms, and family income were used to examine differences in youth EF across caregiver and youth self-report of eating behaviors. Youth EF differed significantly by caregiver report of eating behavior but not youth self-report. Post hoc analyses showed that youth with overeating or binge eating had poorer EF than youth without these eating behaviors. Executive dysfunction, as reported by caregivers, in youth with obesity may be associated with dysregulated eating behaviors predictive of poor long-term psychosocial and weight outcomes. Further consideration of EF-specific targets for assessment and intervention in youth with obesity may be warranted.
Article
Objective: The present study evaluated strategies used by healthy adults coached to simulate traumatic brain injury (TBI) during neuropsychological evaluation. Method: Healthy adults (n = 58) were coached to simulate TBI while completing a test battery consisting of multiple performance validity tests (PVTs), neuropsychological tests, a self-report scale of functional independence, and a debriefing survey about strategies used to feign TBI. Results: "Successful" simulators (n = 16) were classified as participants who failed 0 or 1 PVT and also scored as impaired on one or more neuropsychological index. "Unsuccessful" simulators (n = 42) failed ≥2 PVTs or passed PVTs but did not score impaired on any neuropsychological index. Compared to unsuccessful simulators, successful simulators had significantly more years of education, higher estimated IQ, and were more likely to use information provided about TBI to employ a systematic pattern of performance that targeted specific tests rather than performing poorly across the entire test battery. Conclusion: Results contribute to a limited body of research investigating strategies utilized by individuals instructed to feign neurocognitive impairment. Findings signal the importance of developing additional embedded PVTs within standard cognitive tests to assess performance validity throughout a neuropsychological assessment. Future research should consider specifically targeting embedded measures in visual tests sensitive to slowed responding (e.g. response time).
Article
Objective: Various research studies and neuropsychology practice organizations have reiterated the importance of developing embedded performance validity tests (PVTs) to detect potentially invalid neurocognitive test data. This study investigated whether measures within the Hopkins Verbal Learning Test - Revised (HVLT-R) and the Brief Visuospatial Memory Test - Revised (BVMT-R) could accurately classify individuals who fail two or more PVTs during routine clinical assessment. Method: The present sample consisted of 109 United States military veterans (mean age = 52.4, SD = 13.3), all clinically referred patients who received a battery of neuropsychological tests. Based on performance validity findings, veterans were assigned to valid (n = 86) or invalid (n = 23) groups. Of the 109 patients in the overall sample, 77 were administered the HVLT-R and 75 were administered the BVMT-R, which were examined for classification accuracy. Results: The HVLT-R Recognition Discrimination Index and the BVMT-R Retention Percentage showed good to adequate discrimination with an area under the curve of .78 and .70, respectively. The HVLT-R Recognition Discrimination Index showed sensitivity of .53 with specificity of .93. The BVMT-R Retention Percentage demonstrated sensitivity of .31 with specificity of .92. Conclusions: When used in conjunction with other PVTs, these new embedded PVTs may be effective in the detection of invalid test data, although they are not intended for use in patients with dementia.
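For a fixed cutoff on a single embedded index, the reported sensitivity and specificity reduce to two proportions, as in this brief sketch on simulated scores; the index values, group sizes, and cutoff are placeholders, not study data.

```python
# Sketch: sensitivity and specificity of one embedded index at a fixed
# cutoff. Scores and the cutoff are simulated placeholders.
import numpy as np

rng = np.random.default_rng(4)
valid = rng.normal(10, 1.5, 86)    # e.g., a recognition discrimination index
invalid = rng.normal(8, 1.5, 23)
cutoff = 8.5                       # fail if score <= cutoff

sensitivity = (invalid <= cutoff).mean()
specificity = (valid > cutoff).mean()
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```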
Article
In the present study, faculty who teach in clinical and counseling doctor of philosophy (PhD) or doctor of psychology (PsyD) programs completed surveys regarding preferences for prospective students' preparation for graduate programs. Faculty expectations of minimum and ideal undergraduate training were highest for scientific methods, though expectations systematically varied among clinical PhD, counseling PhD, and clinical PsyD programs. Faculty preferences for applicants' research and clinical “fit” within the program to which they are applying, as well as general interpersonal skills and intellect, also emerged as important admissions factors. These results describe the desirable undergraduate preparation and qualities of applicants for advanced study in clinical and counseling psychology. The findings have implications for prospective graduate students, faculty who train and mentor undergraduates, and faculty who serve on admissions committees.
Article
Introduction: Recognition and visual working memory tasks from the Wechsler Memory Scale-Fourth Edition (WMS-IV) have previously been documented as useful indicators for suboptimal performance. The present study examined the clinical utility of the Dutch version of the WMS-IV (WMS-IV-NL) for the identification of suboptimal performance using an analogue study design. Method: The patient group consisted of 59 mixed-etiology patients; the experimental malingerers were 50 healthy individuals who were asked to simulate cognitive impairment as a result of a traumatic brain injury; the third group consisted of 50 healthy controls who were instructed to put forth full effort. Results: Experimental malingerers performed significantly lower on all WMS-IV-NL tasks than did the patients and healthy controls. A binary logistic regression analysis was performed on the experimental malingerers and the patients. The first model contained the visual working memory subtests (Spatial Addition and Symbol Span) and the recognition tasks of the following subtests: Logical Memory, Verbal Paired Associates, Designs, Visual Reproduction. The results showed an overall classification rate of 78.4%, and only Spatial Addition explained a significant amount of variance (p < .001). Subsequent logistic regression analysis and receiver operating characteristic (ROC) analysis supported the discriminatory power of the subtest Spatial Addition. A scaled score cutoff of <4 produced 93% specificity and 52% sensitivity for detection of suboptimal performance. Conclusion: The WMS-IV-NL Spatial Addition subtest may provide clinically useful information for the detection of suboptimal performance.
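The analytic pattern above (a binary logistic regression over several subtests, followed by inspection of which predictor carries significant unique variance) can be sketched with statsmodels. The data below are simulated to loosely mirror the group design; the subtest columns follow the abstract, but none of the fitted values correspond to the study's results.

```python
# Sketch: regress group membership (simulated malingerers vs. patients) on
# several subtest scores and inspect which predictor is significant.
# Simulated data only; subtest names follow the abstract for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n_pat, n_mal = 59, 50
# Columns: Spatial Addition, Symbol Span, Logical Memory recognition
X = np.vstack([
    np.column_stack([rng.normal(9, 3, n_pat), rng.normal(9, 3, n_pat),
                     rng.normal(20, 4, n_pat)]),
    np.column_stack([rng.normal(5, 3, n_mal), rng.normal(8, 3, n_mal),
                     rng.normal(18, 4, n_mal)]),
])
y = np.concatenate([np.zeros(n_pat), np.ones(n_mal)])

model = sm.Logit(y, sm.add_constant(X)).fit(disp=False)
print(model.pvalues)  # which subtests explain significant unique variance
print(f"classification rate = {((model.predict() >= 0.5) == y).mean():.1%}")
```

In a setup like this, a predictor that separates the groups strongly (here, the simulated Spatial Addition column) tends to absorb the shared variance, leaving the remaining subtests nonsignificant, which mirrors the abstract's finding that only one subtest carried the model.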