Item response theory facilitated cocalibrating cognitive tests and reduced bias in estimated rates of decline.

Department of Medicine, University of Washington, Seattle, WA, USA.
Journal of clinical epidemiology (Impact Factor: 5.48). 05/2008; 61(10):1018-27.e9. DOI: 10.1016/j.jclinepi.2007.11.011
Source: PubMed

ABSTRACT To cocalibrate the Mini-Mental State Examination, the Modified Mini-Mental State, the Cognitive Abilities Screening Instrument, and the Community Screening Instrument for Dementia using item response theory (IRT), to compare screening cut points used to identify cases of dementia from different studies, to compare measurement properties of the tests, and to explore the implications of these measurement properties for longitudinal studies of cognitive functioning.
We used cross-sectional data from three large (n>1000) community-based studies of cognitive functioning in the elderly. We used IRT to cocalibrate the scales and performed simulations of longitudinal studies.
Screening cut points varied quite widely across studies. The four tests have curvilinear scaling and varied levels of measurement precision, with more measurement error at higher levels of cognitive functioning. In longitudinal simulations, IRT scores always performed better than standard scoring, whereas a strategy to account for varying measurement precision had mixed results.
Cocalibration allows direct comparison of cognitive functioning in studies using any of these four tests. Standard scoring appears to be a poor choice for analysis of longitudinal cognitive testing data. More research is needed into the implications of varying levels of measurement precision.
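
The curvilinear scaling and uneven measurement precision described above can be illustrated with a minimal sketch, assuming a two-parameter logistic (2PL) IRT model. The item parameters in `ITEMS` and the function names are hypothetical, chosen to mimic an easy screening test; they are not estimates from the study and this is not the authors' code.

```python
import math

# Hypothetical 2PL item parameters (discrimination a, difficulty b) for an
# easy screening test. Illustrative values only, not taken from the study.
ITEMS = [(1.5, -2.5), (1.2, -2.0), (1.8, -1.5), (1.0, -1.0), (1.4, -0.5)]


def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def expected_score(theta, items=ITEMS):
    """Test characteristic curve: expected sum score at ability theta."""
    return sum(p_correct(theta, a, b) for a, b in items)


def standard_error(theta, items=ITEMS):
    """Standard error of measurement, 1 / sqrt(test information)."""
    info = 0.0
    for a, b in items:
        p = p_correct(theta, a, b)
        info += a * a * p * (1.0 - p)  # 2PL item information
    return 1.0 / math.sqrt(info)


if __name__ == "__main__":
    for theta in (-3, -2, -1, 0, 1, 2):
        print(f"theta={theta:+d}  "
              f"expected score={expected_score(theta):.2f}  "
              f"SE={standard_error(theta):.2f}")
```

With easy items like these, a one-unit change in ability moves the expected sum score much less at the high end than at the low end, and the standard error grows as ability increases. Under these assumptions, equal changes in a standard sum score do not correspond to equal changes in underlying ability.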

  • ABSTRACT: Objectives. We describe and compare the expected performance trajectories of older adults on the Mini-Mental Status Examination (MMSE) across six independent studies from four countries in the context of a collaborative network of longitudinal studies of aging. A coordinated analysis approach is used to compare patterns of change conditional on sample composition differences related to age, sex, and education. Such coordination accelerates evaluation of particular hypotheses. In particular, we focus on the effect of educational attainment on cognitive decline. Method. Regular and Tobit mixed models were fit to MMSE scores from each study separately. The effects of age, sex, and education were examined based on more than one centering point. Results. Findings were relatively consistent across studies. On average, MMSE scores were lower for older individuals and declined over time. Education predicted MMSE score, but, with two exceptions, was not associated with decline in MMSE over time. Conclusion. A straightforward association between educational attainment and rate of cognitive decline was not supported. Thoughtful consideration is needed when synthesizing evidence across studies, as methodologies adopted and sample characteristics, such as educational attainment, invariably differ.
    The Journals of Gerontology Series B Psychological Sciences and Social Sciences 10/2012; 68(3). DOI:10.1093/geronb/gbs077 · 2.85 Impact Factor
  • ABSTRACT: To evaluate psychometric properties of a widely used patient experience survey. English-language responses to the Clinician & Group Consumer Assessment of Healthcare Providers and Systems (CG-CAHPS®) survey (n = 12,244) from a 2008 quality improvement initiative involving eight southern California medical groups. We used an iterative hybrid ordinal logistic regression/item response theory differential item functioning (DIF) algorithm to identify items with DIF related to patient sociodemographic characteristics, duration of the physician-patient relationship, number of physician visits, and self-rated physical and mental health. We accounted for all sources of DIF and determined its cumulative impact. The upper end of the CG-CAHPS® performance range is measured with low precision. With sensitive settings, some items were found to have DIF. However, overall DIF impact was negligible, as 0.14 percent of participants had salient DIF impact. Latinos who spoke predominantly English at home had the highest prevalence of salient DIF impact at 0.26 percent. The CG-CAHPS® functions similarly across commercially insured respondents from diverse backgrounds. Consequently, previously documented racial and ethnic group differences likely reflect true differences rather than measurement bias. The impact of low precision at the upper end of the scale should be clarified.
    Health Services Research 12/2011; 46(6pt1):1778-802. DOI:10.1111/j.1475-6773.2011.01299.x · 2.49 Impact Factor
  • ABSTRACT: Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group-specific item parameters to account for DIF in the trait scores, analogous to purification approaches used in other DIF detection frameworks. The current investigation advances the technique by developing a computational platform integrating both statistical and IRT procedures into a single program. Furthermore, a Monte Carlo simulation approach was incorporated to derive empirical criteria for various DIF statistics and effect size measures. For purposes of illustration, the procedure was applied to data from a questionnaire of anxiety symptoms for detecting DIF associated with age from the Patient-Reported Outcomes Measurement Information System.
    Journal of statistical software 03/2011; 39(8):1-30. · 3.80 Impact Factor
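
The basic logistic regression test for uniform DIF mentioned in the last abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the program the abstract describes: it covers a single binary item and one nested-model comparison, uses the simulated latent trait as a stand-in for an IRT-based trait score, and omits the iterative purification and the Monte Carlo thresholds. The names `fit_logistic`, `uniform_dif_lr`, and `simulate`, and all parameter values, are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson.

    Returns (coefficients, maximized log-likelihood).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)            # score vector
        hess = X.T @ (X * w[:, None])   # observed information matrix
        beta = beta + np.linalg.solve(hess, grad)
    p = np.clip(1.0 / (1.0 + np.exp(-X @ beta)), 1e-12, 1.0 - 1e-12)
    loglik = float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
    return beta, loglik


def uniform_dif_lr(trait, group, response):
    """Likelihood-ratio statistic (1 df) for adding a group main effect
    to a model that already conditions on the trait score."""
    ones = np.ones_like(trait)
    _, ll_base = fit_logistic(np.column_stack([ones, trait]), response)
    _, ll_dif = fit_logistic(np.column_stack([ones, trait, group]), response)
    return 2.0 * (ll_dif - ll_base)


def simulate(dif_shift, n=4000, seed=0):
    """Simulate one binary item; dif_shift is the group effect in logits."""
    rng = np.random.default_rng(seed)
    trait = rng.normal(size=n)  # stand-in for an IRT trait score
    group = rng.integers(0, 2, size=n).astype(float)
    logit = 1.2 * trait - 0.3 + dif_shift * group
    prob = 1.0 / (1.0 + np.exp(-logit))
    response = (rng.random(n) < prob).astype(float)
    return trait, group, response


if __name__ == "__main__":
    for shift in (0.0, 1.0):
        lr = uniform_dif_lr(*simulate(shift))
        print(f"DIF shift={shift:.1f}  LR statistic={lr:.2f}")
```

A large statistic relative to a chi-square distribution with one degree of freedom suggests that, at the same trait level, the two groups differ in their probability of endorsing the item. Testing nonuniform DIF would add a trait-by-group interaction term to the larger model.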