Differential Item Functioning Impact in a Modified Version of the Roland-Morris Disability Questionnaire

Department of Medicine, University of Washington, Harborview Medical Center, 325 Ninth Avenue, Box 359780, Seattle, WA 98104, USA.
Quality of Life Research (Impact Factor: 2.49). 09/2007; 16(6):981-90. DOI: 10.1007/s11136-007-9200-x
Source: PubMed


To evaluate a modified version of the Roland-Morris Disability Questionnaire for differential item functioning (DIF) related to several covariates.
DIF occurs in an item when, after controlling for the underlying trait measured by the test, the probability of endorsing the item varies across groups.
Secondary data analysis of two studies of participants with back pain (total n = 875). We used a hybrid item response theory/logistic regression approach for detecting DIF. We obtained scores that accounted for DIF, evaluated the impact of DIF on individual and group scores, and compared scores that ignored or accounted for DIF in terms of the strength of association with SF-36 subscale scores.
DIF was found in 18 of 23 items. Salient scale-level differential functioning was found related to age, education, and employment. Overall, 24 participants (3%) had salient scale-level differential functioning. Mean scores across demographic groups differed minimally when accounting for DIF. The strength of association with SF-36 scores was similar whether scores ignored or accounted for DIF.
The modified version of the Roland-Morris Disability Questionnaire appears to have largely negligible DIF related to the covariates assessed here.
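The detection approach described in the methods above (an item response theory trait estimate used as the matching variable in nested logistic regression models) can be illustrated with a minimal sketch. The simulated data, the single binary item, and the effect sizes below are invented, and the published algorithm's exact models, covariates, and DIF criteria are not reproduced; a simulated latent trait stands in for a precomputed IRT score.

```python
# Minimal sketch of logistic-regression DIF testing with an IRT trait
# estimate as the matching variable (hypothetical data and effect sizes).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
n = 875
theta = rng.normal(size=n)                 # stand-in for an IRT disability score
group = rng.integers(0, 2, size=n)         # e.g., 0 = younger, 1 = older
# Simulate one binary RMDQ-style item with uniform DIF for illustration.
logit = -0.5 + 1.5 * theta + 0.6 * group
item = rng.binomial(1, 1 / (1 + np.exp(-logit)))
df = pd.DataFrame({"item": item, "theta": theta, "group": group})

m1 = smf.logit("item ~ theta", df).fit(disp=0)          # no DIF
m2 = smf.logit("item ~ theta + group", df).fit(disp=0)  # adds uniform DIF
m3 = smf.logit("item ~ theta * group", df).fit(disp=0)  # adds non-uniform DIF

def lr_test(restricted, full, df_diff):
    """Likelihood-ratio test between two nested fitted models."""
    stat = 2 * (full.llf - restricted.llf)
    return stat, stats.chi2.sf(stat, df_diff)

print("uniform DIF:     LR=%.2f p=%.4f" % lr_test(m1, m2, 1))
print("non-uniform DIF: LR=%.2f p=%.4f" % lr_test(m2, m3, 1))
```

In this framing, the group term tests uniform DIF and the trait-by-group interaction tests non-uniform DIF, with the trait estimate serving as the matching variable throughout.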

    • "guage ) . We used differences between the naive scores and the scores accounting for DIF with respect to all the covariates to address questions of cumulative DIF impact . Differences smaller than the median standard error of measure - ment ( SEM ) of the scale are considered negligible , while differences larger than this amount are " salient " ( Crane et al . 2007a , 2008b ) . The SEM quanti - fies the amount of noise that is present in the instrument . The median value of the SEM quantifies the center of the " noise distribution " that is tolerated for the instrument . Thus , DIF impact larger than this amount represents impact greater than the tolerated level of noise for the instrument . A more"
    ABSTRACT: To evaluate psychometric properties of a widely used patient experience survey. English-language responses to the Clinician & Group Consumer Assessment of Healthcare Providers and Systems (CG-CAHPS®) survey (n = 12,244) from a 2008 quality improvement initiative involving eight southern California medical groups. We used an iterative hybrid ordinal logistic regression/item response theory differential item functioning (DIF) algorithm to identify items with DIF related to patient sociodemographic characteristics, duration of the physician-patient relationship, number of physician visits, and self-rated physical and mental health. We accounted for all sources of DIF and determined its cumulative impact. The upper end of the CG-CAHPS® performance range is measured with low precision. With sensitive settings, some items were found to have DIF. However, overall DIF impact was negligible, as 0.14 percent of participants had salient DIF impact. Latinos who spoke predominantly English at home had the highest prevalence of salient DIF impact at 0.26 percent. The CG-CAHPS® functions similarly across commercially insured respondents from diverse backgrounds. Consequently, previously documented racial and ethnic group differences likely reflect true differences rather than measurement bias. The impact of low precision at the upper end of the scale should be clarified.
    Health Services Research 12/2011; 46(6pt1):1778-802. DOI:10.1111/j.1475-6773.2011.01299.x · 2.78 Impact Factor
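The excerpt and abstract above decide whether DIF impact is salient by comparing each person's naive-versus-adjusted score difference to the median standard error of measurement of the scale. A small numeric sketch of that rule, using invented score estimates and standard errors, might look like this:

```python
# Sketch of the SEM-based salience rule: a person's DIF impact is "salient"
# if the difference between the naive score and the DIF-adjusted score
# exceeds the median standard error of measurement. Score estimates and
# standard errors would come from an IRT scoring step; values here are
# illustrative only.
import numpy as np

naive_theta    = np.array([-1.20, 0.35, 0.80, 1.95])   # scores ignoring DIF
adjusted_theta = np.array([-1.05, 0.33, 0.45, 1.90])   # scores accounting for DIF
se_theta       = np.array([ 0.30, 0.28, 0.25, 0.40])   # per-person standard errors

median_sem = np.median(se_theta)            # tolerated "noise" level for the scale
impact = np.abs(naive_theta - adjusted_theta)
salient = impact > median_sem

print(f"median SEM = {median_sem:.2f}")
print(f"salient DIF impact for {salient.sum()} of {salient.size} people "
      f"({100 * salient.mean():.1f}%)")
```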
    • "Thus, it has been suggested that item response theory (IRT) scoring should be used to derive the matching variable, even when IRT is not itself used for DIF detection. This hybrid logistic regression/IRT method has been used in a number of recent studies and free software is available for this purpose [2,62,63]. It also has the advantage of incorporating purification by using an iterative approach that can account for DIF in other items [63,64]. "
    ABSTRACT: Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application. A review of logistic regression DIF analyses in HRQoL was undertaken. Methodological articles from other fields and using other DIF methods were also included if considered relevant. There are many competing approaches for the conduct of DIF analyses and many criteria for determining what constitutes significant DIF. DIF in short scales, as commonly found in HRQL instruments, may be more difficult to interpret. Qualitative methods may aid interpretation of such DIF analyses. A number of methodological choices must be made when applying logistic regression for DIF analyses, and many of these affect the results. We provide recommendations based on reviewing the current evidence. Although the focus is on logistic regression, many of our results should be applicable to DIF analyses in general. There is a need for more empirical and theoretical work in this area.
    Health and Quality of Life Outcomes 08/2010; 8:81. DOI:10.1186/1477-7525-8-81 · 2.12 Impact Factor
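The purification step mentioned in the excerpt above (re-estimating the matching score from items currently judged DIF-free, then retesting all items, and iterating until the flagged set stabilizes) can be sketched as a simple loop. Here score_items and test_item_for_dif are hypothetical placeholders for an IRT scoring routine and a per-item DIF test such as the logistic regression comparison sketched earlier.

```python
# Sketch of an iterative purification loop for DIF detection. The helper
# functions are hypothetical stand-ins, not part of any published software.
def purify(responses, group, score_items, test_item_for_dif, max_iter=10):
    n_items = responses.shape[1]
    flagged = set()                                   # items currently flagged with DIF
    for _ in range(max_iter):
        anchors = [j for j in range(n_items) if j not in flagged]
        theta = score_items(responses[:, anchors])    # matching score from anchor items only
        new_flags = {j for j in range(n_items)
                     if test_item_for_dif(responses[:, j], theta, group)}
        if new_flags == flagged:                      # converged: flagged set is stable
            break
        flagged = new_flags
    return flagged, theta
```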
    ABSTRACT: A post hoc simulation of a computer adaptive administration of the items of a modified version of the Roland-Morris Disability Questionnaire. To evaluate the effectiveness of adaptive administration of back pain-related disability items compared with a fixed 11-item short form. Short form versions of the Roland-Morris Disability Questionnaire have been developed. An alternative to paper-and-pencil short forms is to administer items adaptively so that items are presented based on a person's responses to previous items. Theoretically, this allows precise estimation of back pain disability with administration of only a few items. Data were gathered from 2 previously conducted studies of persons with back pain. An item response theory model was used to calibrate scores based on all items, items of a paper-and-pencil short form, and several computer adaptive tests (CATs). Correlations between each CAT condition and scores based on a 23-item version of the Roland-Morris Disability Questionnaire ranged from 0.93 to 0.98. Compared with an 11-item short form, an 11-item CAT produced scores that were significantly more highly correlated with scores based on the 23-item scale. CATs with even fewer items also produced scores that were highly correlated with scores based on all items. For example, scores from a 5-item CAT had a correlation of 0.93 with full scale scores. Seven- and 9-item CATs correlated at 0.95 and 0.97, respectively. A CAT with a standard-error-based stopping rule produced scores that correlated at 0.95 with full scale scores. A CAT-based back pain-related disability measure may be a valuable tool for use in clinical and research contexts. Use of CAT for other common measures in back pain research, such as other functional scales or measures of psychological distress, may offer similar advantages.
    Spine 06/2008; 33(12):1378-83. DOI:10.1097/BRS.0b013e3181732acb · 2.30 Impact Factor
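The Spine abstract above simulates computerized adaptive administration with a standard-error-based stopping rule. A generic sketch of that kind of CAT loop, using maximum-information item selection, is shown below; the 2PL item parameters, the true trait value, and the stopping threshold are invented for illustration and do not reproduce the published calibration.

```python
# Generic CAT sketch: 2PL items, maximum-information item selection,
# and a standard-error-based stopping rule (all values illustrative).
import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(1.0, 2.5, size=23)          # discrimination parameters (invented)
b = rng.normal(0.0, 1.0, size=23)           # difficulty parameters (invented)
true_theta = 0.7                            # simulated person's true disability level

def p_yes(theta, a, b):                     # 2PL endorsement probability
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    p = p_yes(theta, a, b)
    return a**2 * p * (1 - p)

theta_hat, administered, responses = 0.0, [], []
for _ in range(23):
    info = item_information(theta_hat, a, b)
    info[administered] = -np.inf            # do not reuse administered items
    j = int(np.argmax(info))                # most informative remaining item
    administered.append(j)
    responses.append(rng.binomial(1, p_yes(true_theta, a[j], b[j])))
    # Crude maximum-likelihood update of theta via grid search.
    grid = np.linspace(-4, 4, 401)
    loglik = sum(r * np.log(p_yes(grid, a[k], b[k])) +
                 (1 - r) * np.log(1 - p_yes(grid, a[k], b[k]))
                 for k, r in zip(administered, responses))
    theta_hat = grid[int(np.argmax(loglik))]
    se = 1.0 / np.sqrt(item_information(theta_hat, a[administered], b[administered]).sum())
    if se < 0.32:                           # SE-based stopping rule (illustrative cutoff)
        break

print(f"administered {len(administered)} items, theta_hat={theta_hat:.2f}, SE={se:.2f}")
```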