ABSTRACT: The aim of this study was to establish when a second-stage diagnostic test may be of value in cases where a primary diagnostic test has given an uncertain diagnosis of the benign or malignant nature of an adnexal mass.
The diagnostic performance with regard to discrimination between benign and malignant adnexal masses for mathematical models including ultrasound variables and for subjective evaluation of ultrasound findings by an experienced ultrasound examiner was expressed as area under the receiver-operating characteristics curve (AUC), sensitivity and specificity. These were calculated for the total study population of 1938 patients with an adnexal mass as well as for subpopulations defined by the certainty with which the diagnosis of benignity or malignancy was made. The effect of applying a second-stage test to the tumors where risk estimation was uncertain was determined.
The best mathematical model (LR1) had an AUC of 0.95, sensitivity of 92% and specificity of 84% when applied to all tumors. When model LR1 was applied to the 10% of tumors in which the calculated risk fell closest to the risk cut-off of the model, the AUC was 0.59, sensitivity 90% and specificity 21%. A strategy where subjective evaluation was used to classify these 10% of tumors for which LR1 performed poorly and where LR1 was used in the other 90% of tumors resulted in a sensitivity of 91% and specificity of 90%. Applying subjective evaluation to all tumors yielded an AUC of 0.95, sensitivity of 90% and specificity of 93%. Sensitivity was 81% and specificity 47% for those patients where the ultrasound examiner was uncertain about the diagnosis (n = 115; 5.9%). No mathematical model performed better than did subjective evaluation among the 115 tumors where the ultrasound examiner was uncertain.
When model LR1 is used as a primary test for discriminating between benign and malignant adnexal masses, the use of subjective evaluation of ultrasound findings by an experienced examiner as a second-stage test in the 10% of cases for which the model yields a risk of malignancy closest to its risk cut-off will improve specificity without substantially decreasing sensitivity. However, none of the models tested proved suitable as a second-stage test in tumors where subjective evaluation yielded an uncertain result.
Ultrasound in Obstetrics and Gynecology 01/2011; 37(1):100-6. · 3.56 Impact Factor
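The two-stage strategy described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the default cut-off of 0.10, and the use of absolute distance to define the cases "closest" to the cut-off are assumptions made for illustration; the abstract states only that the 10% of tumors nearest the cut-off were reclassified by subjective evaluation.

```python
from typing import List, Sequence, Tuple


def two_stage_classify(
    risks: Sequence[float],
    second_stage_calls: Sequence[bool],
    cutoff: float = 0.10,              # assumed cut-off, not stated in this abstract
    uncertain_fraction: float = 0.10,  # the 10% of tumors nearest the cut-off
) -> List[bool]:
    """Return one malignancy call per tumor (True = malignant).

    The primary test (a model risk) decides every case except the fraction
    whose risk lies closest to the cut-off; those cases take the call of the
    second-stage test (for example, subjective evaluation).
    """
    n = len(risks)
    k = int(round(n * uncertain_fraction))
    # Assumption: "closest" is measured as absolute distance on the risk scale.
    by_distance = sorted(range(n), key=lambda i: abs(risks[i] - cutoff))
    uncertain = set(by_distance[:k])
    return [
        second_stage_calls[i] if i in uncertain else risks[i] >= cutoff
        for i in range(n)
    ]


def sensitivity_specificity(
    predicted: Sequence[bool], truth: Sequence[bool]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    return tp / (tp + fn), tn / (tn + fp)
```

Calling `two_stage_classify` with the model risks and the examiner's calls, then passing the result to `sensitivity_specificity`, gives the kind of combined-strategy figures the abstract reports.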
ABSTRACT: Two logistic regression models have been developed for the characterization of adnexal masses. The goal of this prospective analysis was to see whether these models perform differently according to the prevalence of malignancy and whether the cut-off levels of risk assessment for malignancy by the models require modification in different centers.
Centers were categorized into those with a prevalence of malignancy below 15%, between 15% and 30%, and above 30%. The areas under the receiver-operating characteristics curves (AUC) were compared using bootstrapping. The optimal cut-off level of risk assessment for malignancy was chosen per center, corresponding to the highest sensitivity level possible while still keeping a good specificity.
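As an illustration of comparing AUCs by bootstrapping, the following Python sketch estimates the difference in AUC between two independent groups of cases with a percentile interval. The abstract does not describe the bootstrap procedure, so the resampling scheme, the 95% percentile interval, and all function names here are assumptions rather than the authors' method.

```python
import random
from typing import List, Sequence, Tuple

Case = Tuple[float, bool]  # (predicted risk, True if malignant)


def auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC as the share of malignant/benign pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _group_auc(group: Sequence[Case]) -> float:
    return auc([s for s, _ in group], [y for _, y in group])


def bootstrap_auc_difference(
    group_a: Sequence[Case],
    group_b: Sequence[Case],
    n_boot: int = 2000,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (observed AUC difference, lower, upper) with a 95% percentile interval.

    Each group is resampled with replacement; resamples that contain only one
    class are redrawn because AUC is undefined for them.
    """
    rng = random.Random(seed)

    def resampled_auc(group: Sequence[Case]) -> float:
        while True:
            sample: List[Case] = [rng.choice(group) for _ in group]
            labels = [y for _, y in sample]
            if any(labels) and not all(labels):
                return _group_auc(sample)

    observed = _group_auc(group_a) - _group_auc(group_b)
    diffs = sorted(
        resampled_auc(group_a) - resampled_auc(group_b) for _ in range(n_boot)
    )
    lo_idx = int(0.025 * n_boot)
    hi_idx = max(lo_idx, int(0.975 * n_boot) - 1)
    return observed, diffs[lo_idx], diffs[hi_idx]
```

An interval that excludes zero would suggest that the two groups of centers differ in AUC under these assumptions.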
Both models performed better in centers with a lower prevalence of malignant cases. The AUCs of the two models for centers with fewer than 15% malignant cases were 0.97 and 0.95, those of centers with 15-30% malignancy were 0.95 and 0.93 and those of centers with more than 30% malignant cases were 0.94 and 0.92. This decrease in performance was due mainly to the decrease in specificity from over 90% to around 76%. In the centers with a higher percentage of malignant cases, a sensitivity of at least 90% with a good specificity could not be obtained by choosing a different cut-off level.
Overall, the models performed well in all centers. The performance of the logistic regression models worsened with increasing prevalence of malignancy, due to a case mix with more borderline and complex benign masses seen in those centers. Because the cut-off of 0.10 is optimal for all three types of center, it seems reasonable to use this cut-off for both models in all centers.
Ultrasound in Obstetrics and Gynecology 09/2010; 37(2):226-31. · 3.56 Impact Factor
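The per-center cut-off rule in the abstract above (the highest sensitivity possible while keeping a good specificity) can be sketched in Python as follows. The abstract does not define "good" specificity, so the `min_specificity` threshold of 0.75, the candidate cut-offs, and the function name are hypothetical choices for illustration only.

```python
from typing import Optional, Sequence, Tuple


def choose_cutoff(
    risks: Sequence[float],
    truth: Sequence[bool],
    candidates: Sequence[float],
    min_specificity: float = 0.75,  # hypothetical definition of "good" specificity
) -> Optional[Tuple[float, float, float]]:
    """Return (cutoff, sensitivity, specificity) for the best candidate, or None.

    Among candidate cut-offs whose specificity is at least min_specificity,
    pick the one with the highest sensitivity; ties go to higher specificity.
    A case is called malignant when its risk is at or above the cut-off.
    """
    n_pos = sum(1 for t in truth if t)
    n_neg = len(truth) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malignant and one benign case")

    best: Optional[Tuple[float, float, float]] = None
    for cutoff in candidates:
        tp = sum(1 for r, t in zip(risks, truth) if t and r >= cutoff)
        tn = sum(1 for r, t in zip(risks, truth) if not t and r < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        if spec < min_specificity:
            continue  # specificity too low to count as "good"
        if best is None or (sens, spec) > (best[1], best[2]):
            best = (cutoff, sens, spec)
    return best
```

Running this separately on each center's data would show whether one cut-off wins everywhere; the abstract reports that 0.10 was optimal for all three types of center.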