Short and Precise Patient Self-Assessment of Heart Failure Symptoms Using a Computerized Adaptive Test
ABSTRACT Assessment of dyspnea, fatigue, and physical disability is fundamental to the monitoring of patients with heart failure (HF). A plethora of patient-reported measures exist, but most are too burdensome or imprecise to be useful in clinical practice. New techniques used for computer adaptive tests (CATs) may be able to address these problems. The purpose of this study was to build a CAT for patients with HF.
Item banks of 74 queries ("items") were developed to assess self-reported physical disability, fatigue, and dyspnea. All queries were administered to 658 adults with HF to build 3 item banks. The resulting HF-CAT was administered to 100 patients with ancillary HF (New York Heart Association I, 11%; II, 53%; III and IV, 36%). In addition, the physical function and vitality domains of the SF-36 Health Survey questionnaire, an established shortness-of-breath scale, and the Minnesota Living with Heart Failure Questionnaire were applied. The HF-CAT assessment took 3:09±1:52 minutes to complete and score. All HF-CAT scales demonstrated good construct validity through high correlations with the corresponding SF-36 Health Survey physical function (r=-0.87), vitality (r=-0.85), and shortness-of-breath (r=0.84) scales. Simulation studies showed a more precise measurement of all HF-CAT scales over a larger range than comparable static tools. The HF-CAT scales identified significant differences between patients classified by the New York Heart Association symptom criteria, similar to the Minnesota Living with Heart Failure Questionnaire.
A new CAT for patients with HF was built using modern psychometric methods. Initial results demonstrate its potential to increase the feasibility and precision of patient self-assessments of symptoms of HF with minimized respondent burden. CLINICAL TRIAL REGISTRATION- URL: http://www.projectreporter.nih.gov. Unique identifier: 1R43HL083622-01.
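As a rough illustration of the adaptive principle behind a CAT, the sketch below selects, at each step, the unused item with the highest Fisher information at the current score estimate and stops once a precision target or item limit is reached. It is not the HF-CAT implementation: the dichotomous two-parameter logistic model, the item parameters, the standard normal prior, the stopping rule, and all function names are assumptions made for this example, and patient-reported items are usually polytomous.

```python
import math

# Hypothetical 2PL item bank: (discrimination a, difficulty b). Values are invented.
ITEM_BANK = [(1.8, -1.5), (1.2, -0.5), (2.0, 0.0), (1.5, 0.8), (1.7, 1.6)]
GRID = [g / 10 for g in range(-40, 41)]  # theta grid from -4.0 to 4.0


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses):
    """Posterior mean and SD of theta on GRID under a standard normal prior.

    responses: list of (item_index, answer) pairs with answer in {0, 1}.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for idx, answer in responses:
            a, b = ITEM_BANK[idx]
            p = p_endorse(theta, a, b)
            w *= p if answer == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(answer_fn, max_items=3, se_target=0.4):
    """Administer items adaptively until the SE target or item limit is reached."""
    responses = []
    theta, se = eap_estimate(responses)
    while len(responses) < max_items and se > se_target:
        used = {idx for idx, _ in responses}
        remaining = [i for i in range(len(ITEM_BANK)) if i not in used]
        if not remaining:
            break
        # Pick the unused item that is most informative at the current estimate.
        nxt = max(remaining, key=lambda i: item_information(theta, *ITEM_BANK[i]))
        responses.append((nxt, answer_fn(nxt)))
        theta, se = eap_estimate(responses)
    return theta, se, responses
```

Calling `run_cat(lambda i: 1)` simulates a respondent who endorses every item; the returned tuple holds the score estimate, its posterior standard deviation, and the items that were administered.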
ABSTRACT: Item response theory is increasingly used in the development of psychometric tests. This paper evaluates whether these modern psychometric methods can improve self-reported screening for depression and anxiety in patients with heart failure. The mental health status of 194 patients with heart failure was assessed using six screening tools for depression (Patient Health Questionnaire-9 (9 items), Hospital Anxiety and Depression Scale (HADS) (7 items), PROMIS-Depression Short Form 8a (8 items)) and anxiety (GAD-7 (7 items), HADS (7 items), PROMIS-Anxiety Short Form 8a (8 items)). An in-person structured clinical interview was used as the current gold standard to identify the presence of a mental disorder. The diagnostic accuracy of all static tools was compared when item response theory (IRT)-based person parameters were estimated instead of sum scores. Furthermore, we compared the performance of static instruments with post hoc simulated, individually tailored computer-adaptive tests (CATs) for both disorders and a common negative affect CAT. In general, screening for depression was highly efficient and performed better than screening for anxiety, with only minimal differences among the assessed instruments. IRT-based person parameters yielded the same diagnostic accuracy as sum scores. CATs showed similar screening performance compared to legacy instruments but required significantly fewer items to identify patients without mental conditions. Ideal cutoffs varied between male and female samples. Overall, the diagnostic performance of all investigated instruments was similar, regardless of the methods being used. However, CATs can individually tailor the test to each patient, thus significantly decreasing the respondent burden for patients with and without mental conditions. Such an approach could efficiently increase the acceptability of mental health screening in clinical practice settings. Quality of Life Research 12/2013; DOI:10.1007/s11136-013-0599-y · 2.86 Impact Factor
ABSTRACT: To provide a standardized metric for the assessment of depression severity to enable comparability among results of established depression measures. A common metric for 11 depression questionnaires was developed applying item response theory (IRT) methods. Data of 33,844 adults were used for secondary analysis including routine assessments of 23,817 in- and outpatients with mental and/or medical conditions (46% with depressive disorders) and a general population sample of 10,027 randomly selected participants from three representative German household surveys. A standardized metric for depression severity was defined by 143 items, and scores were normed to a general population mean of 50 (standard deviation = 10) for easy interpretability. It covers the entire range of depression severity assessed by established instruments. The metric allows comparisons among included measures. Large differences were found in their measurement precision and range, providing a rationale for instrument selection. Published scale-specific threshold scores of depression severity showed remarkable consistencies across different questionnaires. An IRT-based instrument-independent metric for depression severity enables direct comparisons among established measures. The "common ruler" simplifies the interpretation of depression assessment by identifying key thresholds for clinical and epidemiologic decision making and facilitates integrative psychometric research across studies, including meta-analysis. Journal of Clinical Epidemiology 01/2014; 67(1):73-86. DOI:10.1016/j.jclinepi.2013.04.019 · 5.48 Impact Factor
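The norming step described in this abstract (general population mean of 50, standard deviation of 10) corresponds to a linear T-score transformation. The sketch below shows that arithmetic only; the function name, and the assumption that the latent IRT score is expressed relative to a reference population with known mean and standard deviation, are illustrative and not taken from the study.

```python
def to_t_score(theta, pop_mean=0.0, pop_sd=1.0):
    """Rescale a latent score so the reference population has mean 50 and SD 10.

    pop_mean and pop_sd describe the reference (general population) distribution
    of theta; the defaults assume theta is already standardized to that population.
    """
    if pop_sd <= 0:
        raise ValueError("pop_sd must be positive")
    return 50.0 + 10.0 * (theta - pop_mean) / pop_sd
```

On this scale, a respondent one population standard deviation above the population mean receives a score of 60.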
ABSTRACT: Computerized adaptive testing (CAT) based on item response theory (IRT) offers an efficient way to measure patient-reported outcomes accurately. The efficiency lies in a minimal response burden and a high measurement precision over a broad measurement range. The objective of the study was to evaluate and compare the responsiveness of CATs measuring anxiety, depression, and stress reaction to standard static self-assessment tools. Longitudinal data of n=595 psychosomatic inpatients were analyzed to evaluate the retest reliability and sensitivity to change of the CATs compared to static measures (GAD-7, PHQ-9, and PSQ) using correlational and ANOVA statistics. The study hypothesized that CATs are at least as retest-reliable and as sensitive to change as static tools. The three CATs show a low burden for patients, administering on average 5-7 (±2-6 SD) items with similar retest reliability compared to the static tools applied (A-CAT: r=.78 vs. GAD-7: r=.75, D-CAT: r=.71 vs. PHQ-9: r=.75, S-CAT: r=.80 vs. PSQ worries scale: r=.80). The CATs were overall as sensitive to change as the static tools (Cohen's d ranged between .19 and .69). This is a monocenter, observational, longitudinal study without external clinical criteria; thus generalization to other settings may be limited. The tested CATs belong to the first generation of CATs, which have been used in daily routine for more than a decade. They are as retest-reliable and sensitive to change as static tools. Newer CATs may provide further practical advantages. Copyright © 2014 Elsevier B.V. All rights reserved. Journal of Affective Disorders 11/2014; DOI:10.1016/j.jad.2014.10.063 · 3.71 Impact Factor