Reliability and validity of a scoring instrument for clinical performance during Pediatric Advanced Life Support simulation scenarios

Division of Pediatric Critical Care Medicine, Children's Hospital of Philadelphia, PA 19104, United States.
Resuscitation (Impact Factor: 4.17). 03/2010; 81(3):331-6. DOI: 10.1016/j.resuscitation.2009.11.011
Source: PubMed


To assess the reliability and validity of scoring instruments designed to measure clinical performance during simulated resuscitations requiring the use of Pediatric Advanced Life Support (PALS) algorithms.
Pediatric residents were invited to participate in an educational trial involving simulated resuscitations employing PALS algorithms. Each subject participated in a session comprising four scenarios (asystole, dysrhythmia, respiratory arrest, shock). Video-recorded sessions were independently reviewed and scored by four raters using instruments designed to measure performance in terms of timing, sequence, and quality. Validity was assessed by two-factor analysis of variance (ANOVA) with postgraduate year (PGY-1 versus PGY-2) as an independent variable. Reliability was assessed by calculating overall interrater reliability (IRR) and by a generalizability study estimating the variance components of individual measurement facets (scenarios, raters) and their interactions.
Twenty subjects were scored by four raters. Based on the two-factor ANOVA, PGY-2 residents outperformed PGY-1 residents (p < 0.05), and significant differences in difficulty existed between the four scenarios, with dysrhythmia scores the lowest. Overall IRR was high (0.81), and most variance was attributable to subject (17%), scenario (13%), and the subject × scenario interaction (52%); variance attributable to rater was minimal (1.4%).
The instruments assessed in this study measure clinical performance during PALS scenarios in a reliable and valid manner. Measurement error could be further reduced by using additional scenarios, but additional raters for a given scenario would not improve reliability. Further studies should assess the validity of these measurements against actual clinical performance during resuscitations.
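The scenarios-versus-raters conclusion follows directly from the variance components: the subject × scenario interaction dwarfs anything rater-related. As a minimal decision-study (D-study) sketch, not the paper's own analysis, the Python below projects a relative generalizability coefficient from the reported proportions. The subject × rater, triple-interaction, and error components were not broken out in the abstract, so the unreported remainder (~16.6%) is lumped into a single residual term here; this assumption makes the absolute values illustrative only.

```python
# D-study sketch using the variance proportions reported above:
# subject 17%, scenario 13%, subject x scenario 52%, rater 1.4%.
# The unreported remainder is lumped into one residual term (assumption).
# Scenario and rater main effects drop out of the relative error because,
# in a fully crossed design, they shift every subject's score equally.

VAR_SUBJECT = 0.17          # object of measurement ("true score" variance)
VAR_SUBJ_X_SCENARIO = 0.52  # dominant error source in the reported G-study
VAR_RESIDUAL = 1.0 - (0.17 + 0.13 + 0.52 + 0.014)  # unreported remainder

def g_coefficient(n_scenarios: int, n_raters: int) -> float:
    """Relative generalizability coefficient for a mean score over
    n_scenarios scenarios, each scored by n_raters raters."""
    relative_error = (VAR_SUBJ_X_SCENARIO / n_scenarios
                      + VAR_RESIDUAL / (n_scenarios * n_raters))
    return VAR_SUBJECT / (VAR_SUBJECT + relative_error)

if __name__ == "__main__":
    for n_s in (4, 8, 12):
        for n_r in (1, 2, 4):
            print(f"{n_s:2d} scenarios x {n_r} raters: "
                  f"G = {g_coefficient(n_s, n_r):.2f}")
```

Under these assumptions, going from four to eight scenarios raises the projected coefficient from roughly 0.50 to 0.66, whereas quadrupling the raters at four scenarios only reaches about 0.55, mirroring the authors' conclusion.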

  • ABSTRACT: To compare the psychometric performance of two rating instruments used to assess trainee performance in three clinical scenarios. This study was part of a two-phase, randomized trial with a wait-list control condition assessing the effectiveness of a pediatric emergency medicine curriculum targeting general emergency medicine residents. Residents received 6 hours of instruction either before or after the first assessment. Separate pairs of raters completed either a dichotomous checklist for each of three cases or the Global Performance Assessment Tool (GPAT), an anchored multidimensional scale. A fully crossed person × rater × case generalizability study was conducted, and the effect of training year on performance was assessed using multivariate analysis of variance. The person and person × case components accounted for most of the score variance for both instruments. With either instrument, scores increased modestly but significantly with training level. The inter-rater reliability coefficient was >0.9 for both instruments. We demonstrate that our checklist and anchored global rating instrument performed in a psychometrically similar fashion with high reliability. Provided that proper attention is given to instrument design, testing, and rater training, both checklists and anchored assessment scales can produce reproducible data for a given population of subjects. The validity of the data arising from either instrument type must be assessed rigorously, with a focus, when practicable, on patient care outcomes.
    Simulation in Healthcare: Journal of the Society for Simulation in Healthcare 02/2011; 6(1):18-24. DOI: 10.1097/SIH.0b013e318201aa90 · 1.48 Impact Factor
  • ABSTRACT: Robustly tested instruments for quantifying clinical performance during pediatric resuscitation are lacking. The Examining Pediatric Resuscitation Education through Simulation and Scripting Collaborative was established to conduct multicenter trials of simulation education in pediatric resuscitation, evaluating performance with multiple instruments, one of which is the Clinical Performance Tool (CPT). We hypothesized that the CPT would measure clinical performance during simulated pediatric resuscitation in a reliable and valid manner. A scoring system of 21 tasks, based on Pediatric Advanced Life Support algorithms, was designed around a pediatric resuscitation scenario. Each task was scored as follows: task not performed (0 points); task performed partially, incorrectly, or late (1 point); task performed completely, correctly, and within the recommended time frame (2 points). Study teams at 14 children's hospitals went through the scenario twice (PRE and POST) with an interposed 20-minute debriefing. Both scenario runs for each of eight study teams were scored by multiple raters. A generalizability study based on the PRE scores was conducted to investigate the sources of measurement error in the CPT total scores, and inter-rater reliability was estimated from the variance components. Validity was assessed by repeated-measures analysis of variance comparing PRE and POST scores. Sixteen resuscitation scenarios were reviewed and scored by seven raters. Inter-rater reliability for the overall CPT score was 0.63. POST scores were significantly improved compared with PRE scores when controlled for within-subject covariance (F(1,15) = 4.64, p < 0.05). The variance component ascribable to rater was 2.4%. Reliable and valid measures of performance in simulated pediatric resuscitation can be obtained from the CPT. Future studies should examine the applicability of trichotomous scoring instruments to other clinical scenarios, as well as performance during actual resuscitations. (Minimal sketches of this rubric and of the variance-component reliability computation follow these entries.)
    Simulation in Healthcare: Journal of the Society for Simulation in Healthcare 02/2011; 6(2):71-7. DOI: 10.1097/SIH.0b013e31820c44da · 1.48 Impact Factor
  • ABSTRACT: As the use of simulation-based assessment expands for healthcare workers, there is a growing need for research to quantify the psychometric properties of the associated process and outcome measures.
    Simulation in Healthcare: Journal of the Society for Simulation in Healthcare 06/2011; 6 Suppl:S48-51. DOI: 10.1097/SIH.0b013e31822237d0 · 1.48 Impact Factor
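For concreteness, here is a minimal encoding of the trichotomous rubric described for the CPT in the second entry above; the type and function names are ours, not the instrument's.

```python
from enum import IntEnum

class TaskScore(IntEnum):
    """Trichotomous rubric described for the CPT (names are ours)."""
    NOT_PERFORMED = 0              # task not performed
    PARTIAL_INCORRECT_OR_LATE = 1  # performed partially, incorrectly, or late
    COMPLETE_CORRECT_ON_TIME = 2   # complete, correct, within time frame

def cpt_total(task_scores: list[TaskScore]) -> int:
    """Total CPT score across the 21 PALS-based tasks (range 0-42)."""
    if len(task_scores) != 21:
        raise ValueError("the CPT comprises 21 tasks")
    return int(sum(task_scores))
```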
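All of the studies listed here estimate inter-rater reliability from variance components in a generalizability framework. The sketch below shows that computation for the simplest fully crossed case, subjects × raters with one score per cell, via two-way random-effects ANOVA and ICC(2,1). The published designs also crossed a case or scenario facet, which this sketch omits, and the demo data are synthetic.

```python
import numpy as np

def variance_components(scores: np.ndarray):
    """Two-way random-effects decomposition for a fully crossed
    subjects x raters score matrix (one score per cell).
    Returns (var_subject, var_rater, var_residual)."""
    n, k = scores.shape
    grand = scores.mean()
    row = scores.mean(axis=1, keepdims=True)  # per-subject means
    col = scores.mean(axis=0, keepdims=True)  # per-rater means

    ms_subj = k * np.sum((row - grand) ** 2) / (n - 1)
    ms_rater = n * np.sum((col - grand) ** 2) / (k - 1)
    resid = scores - row - col + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    # Expected-mean-square estimators, floored at zero.
    var_subj = max((ms_subj - ms_err) / k, 0.0)
    var_rater = max((ms_rater - ms_err) / n, 0.0)
    return var_subj, var_rater, ms_err

def icc_2_1(scores: np.ndarray) -> float:
    """ICC(2,1): absolute-agreement reliability of a single rater."""
    vs, vr, ve = variance_components(scores)
    return vs / (vs + vr + ve)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_ability = rng.normal(30, 4, size=(8, 1))  # synthetic subjects
    rater_bias = rng.normal(0, 0.5, size=(1, 4))   # small rater effects
    noise = rng.normal(0, 2, size=(8, 4))
    scores = true_ability + rater_bias + noise
    print(f"ICC(2,1) = {icc_2_1(scores):.2f}")
```

As in the studies above, a small rater variance component relative to the subject component is what drives the reliability coefficient toward 1.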