Variability of Interpretive Accuracy Among Diagnostic Mammography Facilities

Department of Internal Medicine, University of Washington School of Medicine, Box 359854, Seattle, WA 98104, USA.
Journal of the National Cancer Institute (Impact Factor: 12.58). 07/2009; 101(11):814-27. DOI: 10.1093/jnci/djp105
Source: PubMed


Interpretive performance of screening mammography varies substantially by facility, but performance of diagnostic interpretation has not been studied.
Facilities performing diagnostic mammography within three registries of the Breast Cancer Surveillance Consortium were surveyed about their structure, organization, and interpretive processes. Performance measures (false-positive rate, sensitivity, and likelihood of cancer among women referred for biopsy [positive predictive value of biopsy recommendation {PPV2}]) were measured prospectively from January 1, 1998, through December 31, 2005. Logistic regression and receiver operating characteristic (ROC) curve analyses, adjusted for patient and radiologist characteristics, were used to assess the association between facility characteristics and interpretive performance. All statistical tests were two-sided.
Forty-five of the 53 facilities completed a facility survey (85% response rate), and 32 of the 45 facilities performed diagnostic mammography. The analyses included 28 100 diagnostic mammograms performed as an evaluation of a breast problem, and data were available for 118 radiologists who interpreted diagnostic mammograms at the facilities. Performance measurements demonstrated statistically significant interpretive variability among facilities (sensitivity, P = .006; false-positive rate, P < .001; and PPV2, P < .001) in unadjusted analyses. However, after adjustment for patient and radiologist characteristics, only false-positive rate variation remained statistically significant and facility traits associated with performance measures changed (false-positive rate = 6.5%, 95% confidence interval [CI] = 5.5% to 7.4%; sensitivity = 73.5%, 95% CI = 67.1% to 79.9%; and PPV2 = 33.8%, 95% CI = 29.1% to 38.5%). Facilities reporting that concern about malpractice had moderately or greatly increased diagnostic examination recommendations at the facility had a higher false-positive rate (odds ratio [OR] = 1.48, 95% CI = 1.09 to 2.01) and a non-statistically significantly higher sensitivity (OR = 1.74, 95% CI = 0.94 to 3.23). Facilities offering specialized interventional services had a non-statistically significantly higher false-positive rate (OR = 1.97, 95% CI = 0.94 to 4.1). No characteristics were associated with overall accuracy by ROC curve analyses.
Variation in diagnostic mammography interpretation exists across facilities. Failure to adjust for patient characteristics when comparing facility performance could lead to erroneous conclusions. Malpractice concerns are associated with interpretive performance.
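
The three performance measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions: sensitivity is the share of cancers with a positive interpretation, false-positive rate is the share of non-cancers with a positive interpretation, and PPV2 is the share of biopsy recommendations followed by a cancer diagnosis. The `Counts` class, the function names, and every number in `example` are hypothetical and are not taken from the study data. The sketch does not reproduce the study's adjusted logistic regression or ROC analyses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical tallies for one facility (illustrative only)."""
    true_pos: int    # cancer present, interpretation positive
    false_neg: int   # cancer present, interpretation negative
    false_pos: int   # no cancer, interpretation positive
    true_neg: int    # no cancer, interpretation negative
    biopsy_recommended: int              # exams with a biopsy recommendation
    biopsy_recommended_with_cancer: int  # of those, exams followed by a cancer diagnosis


def sensitivity(c: Counts) -> float:
    """Proportion of cancers that received a positive interpretation."""
    return c.true_pos / (c.true_pos + c.false_neg)


def false_positive_rate(c: Counts) -> float:
    """Proportion of non-cancers that received a positive interpretation."""
    return c.false_pos / (c.false_pos + c.true_neg)


def ppv2(c: Counts) -> float:
    """Proportion of biopsy recommendations followed by a cancer diagnosis."""
    return c.biopsy_recommended_with_cancer / c.biopsy_recommended


# Made-up counts, chosen only to give round numbers.
example = Counts(
    true_pos=75,
    false_neg=25,
    false_pos=65,
    true_neg=935,
    biopsy_recommended=100,
    biopsy_recommended_with_cancer=34,
)

print(f"sensitivity         = {sensitivity(example):.1%}")
print(f"false-positive rate = {false_positive_rate(example):.1%}")
print(f"PPV2                = {ppv2(example):.1%}")
```

These are crude, unadjusted proportions. The abstract's point is that comparing facilities on such figures without adjusting for patient characteristics could lead to erroneous conclusions.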

Available from: Berta M Geller
  • "This is supported by the fact that agreement increases considerably after the second reading using the proposed criteria to an overall good agreement. This, for instance, compares quite favorably with other screening methods, like for instance mammography [27–29]."
    ABSTRACT: To assess the inter-observer agreement of adenosine "stress"-only visual analysis of perfusion MR images in relation to experience and reading criteria. 106 adenosine perfusion MR examinations out of 350, 46 consecutive positive examinations and 60 randomly selected negative examinations, were visually analysed by three individual readers (two residents and a technician) with different levels of experience. Readings (blinded to any information) were compared with the reading of an expert radiologist. After a month the examinations were presented again (in random order) without knowledge of the first readings. This time readings were performed with the systematic use of reading criteria. Agreement with the expert reading was good for the most experienced resident (k = 0.88). Kappa was 0.48 for the least experienced resident and 0.57 for the technician. After the second, systematic reading, agreement with the expert increased to 0.9, 0.68 and 0.77, respectively. Overall kappa increased from 0.59 to 0.71. The use of reading criteria significantly improved the performance of the least experienced reader (P = 0.01). Visual analysis of adenosine "stress"-only first-pass perfusion MR images has moderate to very good agreement. Performance is experience related, but the systematic use of reading criteria significantly increased performance for the least experienced observer.
    Article · Sep 2010 · The international journal of cardiovascular imaging
  • ABSTRACT: To review epidemiological data from a breast diagnostic clinic. Mammograms from 35,041 patients were studied over a period of 2 years and 7 months, from 2004 to 2006; 32,049 (91.5%) were screening patients and 2,992 (8.5%) were symptomatic patients. The calculated parameters were: detection rate among screening patients, percentage of cancer among symptomatic patients, rate of biopsy recommendation, percentage of minimal, in situ, and stage 0-1 carcinomas, recall rate, and positive predictive value of mammograms considered abnormal and of biopsy recommendations in screening patients. A total of 228 breast cancers were diagnosed, 111 in screening patients (0.34% detection rate) and 117 in symptomatic patients (3.91% detection rate). Biopsy was recommended for 544 screening patients (1.7% of those patients). Among the screening patients, 28% of carcinomas were minimal, 10% were in situ, and 93% were stage 0-1. Recall rate was 19%. The positive predictive value of mammograms considered abnormal (PPV1) was 1.65%. The positive predictive value of biopsy recommendations (PPV2) was 21.9%. This study provides important epidemiological data for the audit of mammographic screening, which are rare in our setting. Compared with the values recommended in the literature, the detection rate and the percentages of minimal and in situ carcinomas were comparable to established values, but the PPV was lower than the ideal.
    Article · Oct 2009 · Revista brasileira de ginecologia e obstetrícia: revista da Federação Brasileira das Sociedades de Ginecologia e Obstetrícia
  • ABSTRACT: Breast cancer missed on diagnostic mammography may contribute to delayed diagnoses, whereas false-positive results may lead to unnecessary invasive procedures. Whether accuracy of diagnostic mammography at facilities serving vulnerable women differs from other facilities is unknown. To compare the interpretive performance of diagnostic mammography at facilities serving vulnerable women to those serving nonvulnerable women. We examined 168,251 diagnostic mammograms performed at Breast Cancer Surveillance Consortium facilities from 1999 to 2005. We used hierarchical logistic regression to compare sensitivity, false-positive rates, and cancer detection rates. Women aged between 40 and 80 years underwent diagnostic mammography to evaluate an abnormal screening mammogram or breast problem. Facilities were assigned vulnerability indices according to the populations served based on the proportion of mammograms performed on women with lower educational attainment, racial/ethnic minority status, limited household income, or rural residences. Sensitivity of diagnostic mammography did not vary significantly across vulnerability indices adjusted for patient-level characteristics, but false-positive rates for diagnostic mammography examinations to evaluate a breast problem were higher at facilities serving vulnerable women defined as those with lower educational attainment (odds ratio [OR], 1.39; 95% confidence interval [CI]: 1.08, 1.79); racial/ethnic minorities (OR, 1.32; 95% CI: 0.98, 1.76); limited income (OR, 1.34; 95% CI: 1.08, 1.66); and rural residence (OR, 1.55; 95% CI: 1.27, 1.88). Diagnostic mammography to evaluate a breast problem at facilities serving vulnerable women had higher false-positive rates than at facilities serving nonvulnerable women. This may reflect concerns that vulnerable populations may be less likely to follow up after abnormal diagnostic mammography or concerns that such populations have higher cancer prevalence.
    Article · Oct 2010 · Medical care