Longitudinal evaluation of interobserver and intraobserver agreement of cervical intraepithelial neoplasia diagnosis among an experienced panel of gynecologic pathologists.
ABSTRACT Histologic diagnoses of cervical intraepithelial neoplasia grades 2 and 3 (CIN 2/3) are the key end points in clinical trials that evaluate the efficacy of a prophylactic quadrivalent human papillomavirus vaccine against cervical cancer. Adjudication of end points uses a panel of 4 pathologists. Quality control slides (n=185) from a nonclinical trial study with preestablished gold standard CIN diagnoses were used to characterize the panel's agreement on CIN diagnoses and monitor performance longitudinally. At 3-month intervals over 2 years, 1 of 6 different batches of quality control slides (n=30-31) was included with clinical trial slides for independent review by each of the 4 panelists. Unweighted kappas (κ) were estimated within each panelist pair by dichotomizing the diagnoses as CIN+ versus non-CIN+ (including normal, unsatisfactory, and atypical immature metaplasia) or CIN 2/3+ versus non-CIN 2/3+ (including normal, unsatisfactory, atypical immature metaplasia, and CIN 1). Quadratic weighted kappa was calculated within each panelist pair using 4 diagnostic categories: normal, CIN 1, CIN 2, and CIN 3 or worse. Substantial interobserver agreement was observed (weighted kappa=0.765 to 0.865). Agreement with weighted kappa=0.779 to 0.887 was observed between the individual panelists and the gold standard, which is almost perfect agreement by Landis-defined categories. Intraobserver agreement was very high (weighted kappa=0.756 to 0.883). Some fluctuation in intraobserver and interobserver agreement was observed over the study period but there was no decreasing time trend. These data indicate that the interpretation of histologic end points used in the quadrivalent vaccine clinical trial program is highly valid and reliable.
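The two statistics named in the abstract, unweighted kappa on dichotomized diagnoses and quadratic weighted kappa on four ordered categories, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the category labels, and the six example ratings are hypothetical, and the sketch simplifies the abstract's scheme by omitting the "unsatisfactory" and "atypical immature metaplasia" diagnoses that the study folded into the negative group.

```python
from collections import Counter

# Hypothetical ordered grading scale (four categories, as in the abstract).
GRADES = ["normal", "CIN 1", "CIN 2", "CIN 3+"]


def cohen_kappa(rater_a, rater_b, categories, weights="unweighted"):
    """Cohen's kappa for two raters, computed as 1 - observed/expected disagreement.

    weights="unweighted": every disagreement counts equally.
    weights="quadratic": disagreement grows with the squared rank distance.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k, n = len(categories), len(rater_a)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}

    def disagreement(i, j):
        if weights == "quadratic":
            return ((i - j) / (k - 1)) ** 2
        return 0.0 if i == j else 1.0

    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    d_obs = sum(disagreement(i, j) * c for (i, j), c in observed.items()) / n
    d_exp = sum(disagreement(i, j) * marg_a[i] * marg_b[j]
                for i in range(k) for j in range(k)) / (n * n)
    if d_exp == 0:
        raise ValueError("kappa undefined: both raters used a single category")
    return 1.0 - d_obs / d_exp


def dichotomize(ratings, threshold):
    """Collapse graded ratings to 'pos'/'neg' at a threshold grade (inclusive)."""
    cut = GRADES.index(threshold)
    return ["pos" if GRADES.index(r) >= cut else "neg" for r in ratings]


# Made-up ratings from two hypothetical panelists on six slides.
panelist_1 = ["normal", "normal", "CIN 1", "CIN 2", "CIN 3+", "CIN 3+"]
panelist_2 = ["normal", "CIN 1", "CIN 1", "CIN 3+", "CIN 3+", "CIN 2"]

print(cohen_kappa(panelist_1, panelist_2, GRADES, "quadratic"))
print(cohen_kappa(dichotomize(panelist_1, "CIN 2"),
                  dichotomize(panelist_2, "CIN 2"), ["neg", "pos"]))
```

In this toy data every disagreement is between adjacent grades, so the quadratic weighted kappa is much higher than the unweighted four-category kappa, and the CIN 2/3+ dichotomy shows complete agreement. That pattern is why weighted and dichotomized statistics are usually reported alongside each other.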
- Available from: Edward Neely Atkinson
ABSTRACT: As part of a program project to evaluate emerging optical technologies for cervical neoplasia, we performed fluorescence and reflectance spectroscopic examinations of patients with abnormal Papanicolaou smears. Biopsy specimens were taken from each area and measured optically, and study pathologists performed qualitative histopathologic readings. Several methodologic issues arose in this analysis: (1) the interpathologist and intrapathologist agreement between institutions for the 1790 biopsy specimens; (2) the interinstitutional agreement among the two institutions conducting the trials on 117 randomly chosen biopsy specimens; (3) the interinstitutional agreement among the two institutions and a third expert gynecologic pathologist to ensure the expert readings were comparable to those outside both institutions on 117 randomly chosen biopsy specimens; and (4) an additional three reviews of the 106 difficult biopsy specimens by all three institutions. All 1790 specimens from 850 patients were reviewed three times at each institution in blinded fashion; those for which the first and second reviews were identical were not reviewed a third time. A randomly selected sample of 117 specimens was randomly ordered and read by study pathologists at The University of Texas M. D. Anderson Cancer Center, British Columbia Cancer Agency (BCCA), and Brigham and Women's Hospital (BWH). The 106 difficult cases were treated in the same manner as the randomized and random-ordered cases. Generalized, unweighted, and weighted kappas and their 95% confidence intervals were used to assess agreement. Binary comparisons were used to compare diagnostic categories. 
For the three readings of the overall data set using the eight-category World Health Organization (WHO) criteria, the kappas were 0.66 (generalized), 0.72 (weighted), and 0.59 to 0.94 (unweighted binary categories); using the four-category Bethesda criteria, they were 0.70 (generalized), 0.69 (weighted), and 0.56 to 0.94 (unweighted binary categories). For the pool versus the study pathologist readings, the eight-category kappas were 0.51 (generalized), 0.72 (weighted), and 0.56 to 0.82 (unweighted binary categories); using the Bethesda criteria, they were 0.70 (generalized), 0.70 (weighted), and 0.59 to 0.82 (unweighted binary categories). The interpathologist and intrapathologist readings were fair by Landis standards at the low end of the diagnostic scale (atypia, human papillomavirus, and CIN1) and substantial to almost perfect at the high end (CIN2, CIN3, and CIS). The randomly selected and randomly ordered sample of 117 specimens read with the WHO system yielded a generalized kappa of 0.45; among the three institution pairs (M. D. Anderson Cancer Center vs. BCCA, M. D. Anderson vs. BWH, and BCCA vs. BWH), the unweighted kappas were 0.46, 0.41, and 0.49 and the weighted kappas were 0.65, 0.66, and 0.68, respectively. With the Bethesda system, the same sample yielded a generalized kappa of 0.65, unweighted kappas of 0.66, 0.65, and 0.47, and weighted kappas of 0.74, 0.72, and 0.74. The difficult specimens read with the WHO system yielded a generalized kappa of 0.23; among the three institution pairs, the unweighted kappas were 0.20, 0.30, and 0.37 and the weighted kappas were 0.17, 0.34, and 0.31. With the Bethesda system, the difficult specimens yielded a generalized kappa of 0.25, unweighted kappas of 0.21, 0.32, and 0.37, and weighted kappas of 0.07, 0.21, and 0.37, respectively. Kappas in this expert group of pathologists were in the moderate, substantial, and almost perfect ranges for the overall and randomized samples. The randomized sample was representative of the larger sample.
The kappa of the specimens for which disagreements arose was, predictably, in the slight range. Our findings will aid both the correlations with optical measurements using fluorescence and reflectance spectroscopy and the quantitative histopathologic analysis of these study specimens. Gynecologic Oncology 01/2006; 99(3 Suppl 1):S38-52. · 3.93 Impact Factor
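The verbal labels used in these abstracts ("slight", "fair", "moderate", "substantial", "almost perfect") follow the Landis and Koch (1977) benchmark bands for kappa. A minimal sketch of that mapping is shown below; the function name is hypothetical, and how the exact band edges (0.20, 0.40, and so on) are assigned is an assumption of this sketch.

```python
def landis_koch_label(kappa):
    """Map a kappa value to its Landis and Koch (1977) descriptive band."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0:
        return "poor"
    # Upper edge of each band; edges are treated as inclusive (an assumption).
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


for value in (0.07, 0.45, 0.72, 0.94):
    print(value, landis_koch_label(value))
```

These bands are conventions for describing agreement, not statistical tests, so a label should be read together with the confidence interval of the underlying kappa.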
ABSTRACT: The Bethesda 2001 Workshop was convened to evaluate and update the 1991 Bethesda System terminology for reporting the results of cervical cytology. A primary objective was to develop a new approach to broaden participation in the consensus process. Forum groups composed of 6 to 10 individuals were responsible for developing recommendations for discussion at the workshop. Each forum group included at least 1 cytopathologist, cytotechnologist, clinician, and international representative to ensure a broad range of views and interests. More than 400 cytopathologists, cytotechnologists, histopathologists, family practitioners, gynecologists, public health physicians, epidemiologists, patient advocates, and attorneys participated in the workshop, which was convened by the National Cancer Institute and cosponsored by 44 professional societies. More than 20 countries were represented. Literature review, expert opinion, and input from an Internet bulletin board were all considered in developing recommendations. The strength of evidence of the scientific data was considered of paramount importance. Bethesda 2001 was a year-long iterative review process. An Internet bulletin board was used for discussion of issues and drafts of recommendations. More than 1000 comments were posted to the bulletin board over the course of 6 months. The Bethesda Workshop, held April 30-May 2, 2001, was open to the public. Postworkshop recommendations were posted on the bulletin board for a last round of critical review prior to finalizing the terminology. Bethesda 2001 was developed with broad participation in the consensus process. The 2001 Bethesda System terminology reflects important advances in biological understanding of cervical neoplasia and cervical screening technology. JAMA: The Journal of the American Medical Association 05/2002; 287(16):2114-9. · 29.98 Impact Factor
ABSTRACT: To assess interobserver variation in reporting cervical colposcopic biopsy specimens and to determine whether a modified Bethesda grading system results in better interobserver agreement than the traditional cervical intraepithelial neoplasia (CIN) grading system. One hundred and twenty-five consecutive cervical colposcopic biopsy specimens were assessed independently by six histopathologists. Specimens were classified using the traditional CIN grading system as normal, koilocytosis, CIN I, CIN II, or CIN III. The specimens were also classified using a modified Bethesda grading system as either normal, low grade squamous intraepithelial lesion (LSIL) or high grade squamous intraepithelial lesion (HSIL). Participants were also asked to categorise biopsy specimens by the CIN system with the addition of the recently proposed category "basal abnormalities of uncertain significance (BAUS)". The degree of agreement between participants was assessed by kappa statistics. Using the CIN system, interobserver agreement was generally poor: unweighted and weighted kappa values between individual pairs of observers ranging from 0.05 to 0.34 (average 0.20) and from 0.20 to 0.54 (average 0.36), respectively. With the modified Bethesda system, interobserver agreement was better but still poor: unweighted and weighted kappa values ranging from 0.15 to 0.58 (average 0.30) and from 0.21 to 0.61 (average 0.36), respectively. There was little or no agreement between observers in the diagnosis of BAUS. Interobserver agreement in the reporting of cervical colposcopic biopsy specimens using the CIN grading system is poor. Agreement, while still poor, is better when a modified Bethesda grading system is used. There is little or no consensus in the diagnosis of BAUS. Journal of Clinical Pathology 11/1996; 49(10):833-5. · 2.44 Impact Factor
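Reporting a range and an average of kappa "between individual pairs of observers", as this abstract does, amounts to computing one kappa per observer pair and then summarizing. The sketch below shows that bookkeeping for unweighted kappa only. The observer names and ratings are made up, and the helper names are hypothetical. With six observers, as in the study, there would be 15 pairs.

```python
from collections import Counter
from itertools import combinations


def unweighted_kappa(a, b):
    """Cohen's unweighted kappa for two equal-length lists of labels."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    p_exp = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_exp == 1:
        raise ValueError("kappa undefined: both raters used a single category")
    return (p_obs - p_exp) / (1 - p_exp)


def pairwise_kappas(ratings_by_observer):
    """Return {(observer_1, observer_2): kappa} for every unordered pair."""
    return {
        (r1, r2): unweighted_kappa(ratings_by_observer[r1], ratings_by_observer[r2])
        for r1, r2 in combinations(sorted(ratings_by_observer), 2)
    }


def summarize(kappas):
    """Return (minimum, maximum, mean) of the pairwise kappa values."""
    values = list(kappas.values())
    return min(values), max(values), sum(values) / len(values)


# Made-up ratings from three hypothetical observers on four specimens.
ratings = {
    "A": ["normal", "normal", "LSIL", "HSIL"],
    "B": ["normal", "LSIL", "LSIL", "HSIL"],
    "C": ["normal", "normal", "LSIL", "HSIL"],
}

kappas = pairwise_kappas(ratings)
print(kappas)
print(summarize(kappas))
```

An average of pairwise kappas is a descriptive summary. It is not the same quantity as a single multi-rater statistic such as the "generalized" kappa mentioned in another abstract on this page.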