Modifying Measures Based on Differential Item Functioning (DIF) Impact Analyses

Columbia University Stroud Center, New York, NY 10471, USA.
Journal of Aging and Health (Impact Factor: 1.56). 03/2012; 24(6). DOI: 10.1177/0898264312436877

ABSTRACT OBJECTIVE: Measure modification can impact comparability of scores across groups and settings. Changes in items can affect the percent admitting to a symptom. METHODS: Using item response theory (IRT) methods, well-calibrated items can be used interchangeably, and the exact same item does not have to be administered to each respondent, theoretically permitting wider latitude in terms of modification. RESULTS: Recommendations regarding modifications vary, depending on the use of the measure. In the context of research, adjustments can be made at the analytic level by freeing and fixing parameters based on findings of differential item functioning (DIF). The consequences of DIF for clinical decision making depend on whether or not the patient's performance level approaches the scale decision cutpoint. High-stakes testing may require item removal or separate calibrations to ensure accurate assessment. DISCUSSION: Guidelines for modification based on DIF analyses and illustrations of the impact of adjustments are presented.

  • Source
    ABSTRACT: Objective: Previous studies have identified differential item functioning (DIF) in depressive symptoms measures, but the impact of DIF has rarely been reported. Given the critical importance of depressive symptoms assessment among older adults, we examined whether DIF due to demographic characteristics resulted in salient score changes in commonly used measures. Methods: Four longitudinal studies of cognitive aging provided a sample size of 3754 older adults and included individuals both with and without a clinical diagnosis of major depression. Each study administered at least one of the following measures: the Center for Epidemiologic Studies Depression scale (20-item ordinal response or 10-item dichotomous response versions), the Geriatric Depression Scale, and the Montgomery–Åsberg Depression Rating Scale. Hybrid logistic regression-item response theory methods were used to examine the presence and impact of DIF due to age, sex, race/ethnicity, and years of education on the depressive symptoms items. Results: Although statistically significant DIF due to demographic factors was present on several items, its cumulative impact on depressive symptoms scores was practically negligible. Conclusions: The findings support the substantive meaningfulness of previously reported demographic differences in depressive symptoms among older adults, showing that these individual differences were unlikely to have resulted from item bias attributable to the demographic characteristics we examined. Copyright © 2014 John Wiley & Sons, Ltd.
    International Journal of Geriatric Psychiatry 01/2015; 30(1):88-96. DOI:10.1002/gps.4121 · 3.09 Impact Factor
  • Source
    ABSTRACT: Objective: To investigate the impact of differences in depressive symptom reporting across clinical groups (healthcare setting, chronic illness, depression diagnosis and anxiety diagnosis) on clinical interpretability and comparability of depression scores. Methods: Participants from the Netherlands Study of Depression and Anxiety (n = 2981) completed the self-report Inventory of Depressive Symptomatology (IDS-SR). Differences in depressive symptom reporting between distinct clinical subpopulations were assessed using a Differential Item Functioning (DIF) analysis. The effects of DIF on symptom level were evaluated by examining whether DIF-adjustment had clinically relevant effects. Results: Significant DIF was detected across all tested clinical subpopulation groupings. Clinically relevant DIF was found on the symptom level for 13 IDS-SR items. However, impact of DIF on the aggregate level ranged from small to negligible: adjustment for DIF only led to salient changes in aggregate scores for 0.2-12.7% of individuals across tested sources of DIF. Conclusion: Differences in endorsement patterns of depressive symptoms were observed across clinical populations, challenging the assumptions regarding the measurement properties of self-reported depression. However, effects of DIF on the aggregate level of IDS-SR total scores were found to be minimal and not clinically important. The IDS-SR thus seems robust against DIF across clinical populations.
    Journal of Psychosomatic Research 08/2014; 78(2). DOI:10.1016/j.jpsychores.2014.08.014 · 2.84 Impact Factor
  • Source
    ABSTRACT: Individuals with medical conditions are likely to have elevated health anxiety; however, research has not demonstrated how medical status impacts response patterns on health anxiety measures. Measurement bias can undermine the validity of a questionnaire by overestimating or underestimating scores in groups of individuals. We investigated whether the Short Health Anxiety Inventory (SHAI), a widely used measure of health anxiety, exhibits medical condition-based bias on item and subscale levels, and whether the SHAI subscales adequately assess the health anxiety continuum. Data were from 963 individuals with diabetes, breast cancer, or multiple sclerosis, and 372 healthy individuals. Mantel-Haenszel tests and item characteristic curves were used to classify the severity of item-level differential item functioning in all three medical groups compared to the healthy group. Test characteristic curves were used to assess scale-level differential item functioning and whether the SHAI subscales adequately assess the health anxiety continuum. Nine out of 14 items exhibited differential item functioning. Two items exhibited differential item functioning in all medical groups compared to the healthy group. In both Thought Intrusion and Fear of Illness subscales, differential item functioning was associated with mildly deflated scores in medical groups with very high levels of the latent traits. Fear of Illness items poorly discriminated between individuals with low and very low levels of the latent trait. While individuals with medical conditions may respond differentially to some items, clinicians and researchers can confidently use the SHAI with a variety of medical populations without concern about significant bias. Copyright © 2014. Published by Elsevier Inc.
    Journal of Psychosomatic Research 01/2015; 78(4). DOI:10.1016/j.jpsychores.2014.12.014 · 2.84 Impact Factor