Article

Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms

Department of Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA.
Quality of Life Research (Impact Factor: 2.49). 11/2009; 19(1):125-36. DOI: 10.1007/s11136-009-9560-5
Source: PubMed

ABSTRACT

Short-form patient-reported outcome measures are popular because they minimize patient burden. We assessed the efficiency of static short forms and computer adaptive testing (CAT) using data from the Patient-Reported Outcomes Measurement Information System (PROMIS) project.
We evaluated the 28-item PROMIS depressive symptoms bank. We used post hoc simulations based on the PROMIS calibration sample to compare several short-form selection strategies and the PROMIS CAT to the total item bank score.
Compared with full-bank scores, all short forms and CAT produced highly correlated scores, but CAT outperformed each static short form in almost all criteria. However, short-form selection strategies performed only marginally worse than CAT. The performance gap observed in static forms was reduced by using a two-stage branching test format.
Using several polytomous items in a calibrated unidimensional bank to measure depressive symptoms yielded a CAT that provided marginally superior efficiency compared to static short forms. The efficiency of a two-stage semi-adaptive testing strategy was so close to CAT that it warrants further consideration and study.

Available from: Paul A Pilkonis
    • "Test information and standard error curves for the severity of substance use item bank. item bank based on the observed data from our calibration sample, expected information under the standard normal distribution with a mean of 0 and SD of 1, and expected information under a normal distribution with a larger SD, i.e., a mean of 0 and SD of 1.5 (Choi et al., 2010). The CAT simulations were performed using the Firestar program (Choi, 2009)."
    ABSTRACT: Background: Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use. Methods: Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network. Results: Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank. Conclusions: Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings.
    Full-text · Article · Oct 2015 · Drug and alcohol dependence
    • "Despite their shortness, they are able to cover a substantially larger measurement range compared to short forms (Choi et al., 2010). CAT scores are highly correlated to conventional questionnaires measuring the same construct (r = .56-.66; Becker et al., 2008, .68–.77; Fliege et al., 2009)."
    ABSTRACT: Computerized adaptive testing (CAT) based on Item Response Theory (IRT) offers an efficient way for accurate measurement of patient reported outcomes. The efficiency lies within a minimal response burden and a high measurement precision over a broad measurement range. The objective of the study was to evaluate and compare the responsiveness of CATs measuring anxiety, depression, and stress reaction to standard static self-assessment tools. Longitudinal data of n=595 psychosomatic inpatients were analyzed for evaluating retest-reliability and sensitivity to change of the CATs compared to static measures (GAD-7, PHQ-9, and PSQ) using correlational and ANOVA statistics. The study hypothesized that CATs are at least as retest-reliable and as sensitive to change as static tools. The three CATs show a low burden for patients, administering on average 5-7 (±2-6 SD) items with similar retest-reliability compared to the static tools applied (A-CAT: r=.78 vs. GAD-7: r=.75, D-CAT: r=.71 vs. PHQ-9: r=.75, S-CAT: r=.80 vs. PSQ worries scale: r=.80). The CATs were overall as sensitive to change as the static tools (Cohen's d ranged between .19 and .69). This is a monocenter, observational, longitudinal study without external clinical criteria; thus generalization to other settings may be limited. The tested CATs belong to the first generation of CATs being used in daily routine for more than a decade. They are as retest reliable and sensitive to change as static tools. Newer CATs may provide further practical advantages.
    No preview · Article · Nov 2014 · Journal of Affective Disorders
    • "Analyses of potential differential item functioning due to gender, age, and educational attainment were performed during the development of the item banks to ensure that items performed comparably regardless of variations in these background characteristics. In general, experience with CAT suggests that the PROMIS depression item bank provides excellent precision with 4-6 items (Choi et al., 2010). A generic 8-item short form is also available, and this short form was one of the cross-cutting dimensional measures used in the DSM-5 field trials, where its feasibility was established and where it performed well with regard to test-retest reliability (Narrow et al., 2013)."
    ABSTRACT: The Patient-Reported Outcomes Measurement Information System (PROMIS®) is an NIH Roadmap initiative devoted to developing better measurement tools for assessing constructs relevant to the clinical investigation and treatment of all diseases: constructs such as pain, fatigue, emotional distress, sleep, physical functioning, and social participation. Following creation of item banks for these constructs, our priority has been to validate them, most often in short-term observational studies. We report here on a three-month prospective observational study with depressed outpatients in the early stages of a new treatment episode (with assessments at intake, one-month follow-up, and three-month follow-up). The protocol was designed to compare the psychometric properties of the PROMIS depression item bank (administered as a computerized adaptive test, CAT) with two legacy self-report instruments: the Center for Epidemiological Studies Depression scale (CESD; Radloff, 1977) and the Patient Health Questionnaire (PHQ-9; Spitzer et al., 1999). PROMIS depression demonstrated strong convergent validity with the CESD and the PHQ-9 (with correlations in a range from .72 to .84 across all time points), as well as responsiveness to change when characterizing symptom severity in a clinical outpatient sample. Identification of patients as "recovered" varied across the measures, with the PHQ-9 being the most conservative. The use of calibrations based on models from item response theory (IRT) provides advantages for PROMIS depression both psychometrically (creating the possibility of adaptive testing, providing a broader effective range of measurement, and generating greater precision) and practically (these psychometric advantages can be achieved with fewer items, a median of 4 items administered by CAT, resulting in less patient burden).
    Full-text · Article · May 2014 · Journal of Psychiatric Research