
Some practical issues of experimental design and data analysis in radiological ROC studies.

Department of Radiology, University of Chicago, IL 60637.
Investigative Radiology (Impact Factor: 5.46). 04/1989; 24(3):234-45. DOI: 10.1097/00004424-198903000-00012
Source: PubMed

ABSTRACT Receiver operating characteristic (ROC) analysis has been used in a broad variety of medical imaging studies during the past 15 years, and its advantages over more traditional measures of diagnostic performance are now clearly established. But despite the essential simplicity of the approach, workers in the field often find--sometimes only after an ROC study is under way--that a number of subtle issues related to experimental design and data analysis must be confronted in practice. Many of these issues have not been discussed in the literature in detail, and most are not well known. The purposes of this paper are to make users of ROC methodology in medical imaging aware of potential problems that should be confronted before an ROC study is begun and to indicate, at least broadly, how those problems may be dealt with, given the present state of the art. Some of the issues raised here can be addressed adequately by easily prescribed techniques, whereas others remain difficult and will be resolved fully only by new methodologic developments.

  • Source
    ABSTRACT: OBJECTIVE. The purpose of this study was to compare the diagnostic accuracy achieved with and without the calibration method established by the DICOM standard in both medical-grade gray-scale displays and consumer-grade color displays. MATERIALS AND METHODS. This study involved 76 cases, six radiologists, three displays, and two display calibrations for a total of 2736 observations in a multireader multicase factorial design. The evaluated conditions were interstitial opacities, pneumothorax, and nodules. CT was adopted as the reference standard. One medical-grade gray-scale display and two consumer-grade color displays were evaluated. Analyses of ROC curves, diagnostic accuracy (measured as AUC), accuracy of condition classification, and false-positive and false-negative rate comparisons were performed. The degree of agreement between readers was also evaluated. RESULTS. No significant differences in image quality perception by the readers in the presence or absence of calibration were observed. Similar forms of the ROC curves were observed. No significant differences were detected in the observed variables (diagnostic accuracy, accuracy of condition classification, false-positive rates, false-negative rates, and image-quality perception). Strong agreement between readers was also determined for each display with and without calibration. CONCLUSION. For the chest conditions and selected observers included in this study, no significant differences were observed between the three evaluated displays with respect to accuracy performance with and without calibration.
    AJR. American journal of roentgenology. 06/2014; 202(6):1272-80.
  • Source
    ABSTRACT: Ovarian cancer is particularly deadly because it is usually diagnosed after it has metastasized. We have previously identified features of ovarian cancer using optical coherence tomography (OCT) and second-harmonic generation (SHG) microscopy (targeting collagen). OCT provides an image of the ovarian microstructure, while SHG provides a high-resolution map of collagen fiber bundle arrangement. Here, we investigated the diagnostic potential of dual-modality OCT and SHG imaging. We conducted a fully crossed, multireader, multicase study using seven human observers. Each observer classified 44 ex vivo mouse ovaries (16 normal and 28 abnormal) as normal or abnormal from OCT, SHG, and simultaneously viewed, coregistered OCT and SHG images and provided a confidence rating on a six-point scale. We determined the average receiver operating characteristic (ROC) curves, area under the ROC curves (AUC), and other quantitative figures of merit. The results show that OCT has diagnostic potential with an average AUC of 0.91 ± 0.06. The average AUC for SHG was less promising at 0.71 ± 0.13. The average AUC for simultaneous OCT and SHG was not significantly different from OCT alone, possibly due to the limited SHG field of view. The high performance of OCT and coregistered OCT and SHG warrants further investigation.
    Journal of Medical Imaging. 07/2014; 1(2):025501.
  • Source
    ABSTRACT: Introduction: In teleradiology services and in hospitals, the extensive use of visualization displays requires affordable devices. The purpose of this study was to compare three differently priced displays (a medical-grade grayscale display and two consumer-grade color displays) for image visualization of digitized chest X-rays. Materials and Methods: The evaluated conditions were interstitial opacities, pneumothorax, and nodules using computed tomography as the gold standard. The comparison was accomplished in terms of receiver operating characteristic (ROC) curves, the diagnostic power measured as the area under ROC curves, accuracy in conditions classification, and main factors affecting accuracy, in a factorial study with 76 cases and six radiologists. Results: The ROC curves for all of the displays and pathologies had similar shapes and no differences in diagnostic power. The proportion of cases correctly classified for each display was greater than 71.9%. The correctness proportions of the three displays were different (p<0.05) only for interstitial opacities. The evaluation of the main factors affecting these proportions revealed that the display factor was not significant for either nodule size or pneumothorax size (p>0.05). Conclusions: Although the image quality variables showed differences in the radiologists' perceptions of the image quality of the three displays, significant differences in the accuracy did not occur. The main effect on the variability of the proportions of correctly classified cases did not come from the display factor. This study confirms previous findings that medical-grade displays could be replaced by consumer-grade color displays with the same image quality.
    Telemedicine and e-Health (Impact Factor: 1.40). 02/2014.