Practical issues in the application of item response theory: a demonstration using items from the pediatric quality of life inventory (PedsQL) 4.0 generic core scales.

RTI Health Solutions, Research Triangle Park, North Carolina 27709-2194, USA.
Medical Care (Impact Factor: 2.94). 06/2007; 45(5 Suppl 1):S39-47. DOI: 10.1097/01.mlr.0000259879.05499.eb
Source: PubMed

ABSTRACT Background: Item response theory (IRT) is increasingly being applied to health-related quality of life instrument development and refinement. This article discusses results obtained using categorical confirmatory factor analysis (CCFA) to check IRT model assumptions and the application of IRT in item analysis and scale evaluation.
Objectives: To demonstrate the value of CCFA and IRT in examining a health-related quality of life measure in children and adolescents.
Methods: This illustration uses data from 10,241 children and their parents on items from the 4 subscales of the PedsQL 4.0 Generic Core Scales. CCFA was applied to confirm domain dimensionality and identify possible locally dependent items. IRT was used to assess the strength of the relationship between the items and the constructs of interest and the information available across the latent construct.
Results: CCFA showed generally strong support for 1-factor models for each domain; however, several items exhibited evidence of local dependence. IRT revealed that the items generally exhibit favorable characteristics and are related to the same construct within a given domain. We discuss the lessons that can be learned by comparing alternate forms of the same scale, and we assess the potential impact of local dependence on the item parameter estimates.
Conclusions: This article describes CCFA methods for checking IRT model assumptions and provides suggestions for using these methods in practice. It offers insight into ways information gained through IRT can be applied to evaluate items and aid in scale construction.

  • ABSTRACT: Purpose: In child–parent agreement studies in the field of paediatric health-related quality of life (HRQoL), little attention has been paid to the effect of gender in parental proxy rating of children’s HRQoL. This study aims to test the potential interchangeability of parent dyads in reporting children’s HRQoL on both item and scale levels of the PedsQL™ 4.0 instrument, using the approach of differential item functioning (DIF). Methods: The PedsQL™ 4.0 Generic Core Scales were completed by 576 father-and-mother dyads. A polytomous item response theory model, the graded response model, was used to detect DIF across fathers and mothers. Results: Assessment at item level showed that fathers and mothers perceived the meaning of items of the PedsQL™ 4.0 consistently. At the scale level, a moderate to high level of agreement was observed between mothers’ and fathers’ reports on all corresponding subscales. Although the significant mean score differences in total, physical and emotional functioning indicated that fathers gave higher scores to their children, the small effect size implied that this difference may not be practically meaningful. Conclusion: Our findings revealed that discrepancy in parent dyads in rating children’s HRQoL is a “real” difference and not an artefact due to measurement non-invariance. Fathers were seen to have slightly different insights into their children, especially for emotional functioning, but overall the results differed little. This suggests that paternal proxy-reports can be included in studies along with maternal proxy-reports, and the two may be combined when looking at parent–child agreement. Parent–child agreement studies in Iran are not affected by parents’ gender, and therefore, researchers may rely on the assumption of the interchangeability of fathers and mothers in these studies.
    Quality of Life Research 02/2015; DOI:10.1007/s11136-015-0931-9 · 2.86 Impact Factor
  • ABSTRACT: The violation of the assumption of local independence when applying item response theory (IRT) models has been shown to have a negative impact on all estimates obtained from the given model. Numerous indices and statistics have been proposed to aid analysts in the detection of local dependence (LD). A Monte Carlo study was conducted to evaluate the relative performance of selected LD measures in conditions considered typical of studies collecting psychological assessment data. Both the Jackknife Slope Index and likelihood ratio statistic G² are available across the two IRT models used and displayed adequate to good performance in most simulation conditions. The use of these indices together is the final recommendation for applied analysts. Future research areas are discussed.
    Applied Psychological Measurement 10/2013; 37(7):541-562. DOI:10.1177/0146621613491456 · 1.49 Impact Factor
  • ABSTRACT: Performance on psychometric tests is key to diagnosis and monitoring treatment of dementia. Results are often reported as a total score, but there is additional information in individual items of tests which vary in their difficulty and discriminatory value. Item difficulty refers to an ability level at which the probability of responding correctly is 50%. Discrimination is an index of how well an item can differentiate between patients of varying levels of severity. Item response theory (IRT) analysis can use this information to examine and refine measures of cognitive functioning. This systematic review aimed to identify all published literature which had applied IRT to instruments assessing global cognitive function in people with dementia. A systematic review was carried out across Medline, Embase, PsycINFO and CINAHL articles. Search terms relating to IRT and dementia were combined to find all IRT analyses of global functioning scales of dementia. Of 384 articles identified, four studies met inclusion criteria, including a total of 2,920 people with dementia from six centers in two countries. These studies used three cognitive tests (MMSE, ADAS-Cog, BIMCT) and three IRT methods (Item Characteristic Curve analysis, Samejima's graded response model, the 2-Parameter Model). Memory items were most difficult. Naming the date in the MMSE and memory items, specifically word recall, of the ADAS-Cog were most discriminatory. Four published studies were identified which used IRT on global cognitive tests in people with dementia. This technique increased the interpretative power of the cognitive scales, and could be used to provide clinicians with key items from a larger test battery which would have high predictive value. There is need for further studies using IRT in a wider range of tests involving people with dementia of different etiology and severity.
    BMC Psychiatry 02/2014; 14(1):47. DOI:10.1186/1471-244X-14-47 · 2.24 Impact Factor
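
The last abstract above defines item difficulty as the ability level at which the probability of a correct response is 50%, and discrimination as how well an item separates respondents of differing levels. A minimal Python sketch of those two definitions follows, assuming the standard two-parameter logistic (2PL) form. The function names and the item parameter values are hypothetical, chosen for illustration only, and are not taken from any study listed here.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta.

    a is the discrimination (slope), b is the difficulty (location).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item parameters (illustrative assumptions, not estimates).
items = {
    "easy_flat": {"a": 0.8, "b": -1.0},
    "hard_steep": {"a": 2.0, "b": 1.0},
}

for name, par in items.items():
    # At theta == b the response probability is 0.5 by construction.
    at_b = p_correct(par["b"], **par)
    # Change in probability over one ability unit centred on b:
    # a larger value means the item separates respondents more sharply.
    spread = p_correct(par["b"] + 0.5, **par) - p_correct(par["b"] - 0.5, **par)
    info = item_information(par["b"], **par)
    print(f"{name}: P(theta=b)={at_b:.2f}, spread={spread:.3f}, info at b={info:.3f}")
```

Under this form, information peaks at theta = b and grows with the square of a, which is one way to read the "information available across the latent construct" mentioned in the main abstract.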

