StAR: a simple tool for the statistical comparison of ROC curves.

Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile.
BMC Bioinformatics (Impact Factor: 2.67). 02/2008; 9:265. DOI: 10.1186/1471-2105-9-265
Source: PubMed

ABSTRACT As in many areas of science and technology, many important problems in bioinformatics rely on the proper development and assessment of binary classifiers. The performance of binary classifiers is typically assessed through the analysis of their receiver operating characteristic (ROC) curves, and the area under the ROC curve (AUC) is a popular indicator of that performance. However, assessing the statistical significance of the difference between any two classifiers based on this measure is not straightforward, because few freely available tools exist. Most existing software is either not free, difficult to use, or not easy to automate when the performance of many binary classifiers is to be compared. This is the typical scenario when optimizing parameters during the development of new classifiers, and also when validating their performance by comparison with previous art.
In this work we describe and release new software to assess the statistical significance of the observed difference between the AUCs of any two classifiers for a common task estimated from paired data or unpaired balanced data. The software is able to perform a pairwise comparison of many classifiers in a single run, without requiring any expert or advanced knowledge to use it. The software relies on a non-parametric test for the difference of the AUCs that accounts for the correlation of the ROC curves. The results are displayed graphically and can be easily customized by the user. A human-readable report is generated and the complete data resulting from the analysis are also available for download, which can be used for further analysis with other software. The software is released as a web server that can be used in any client platform and also as a standalone application for the Linux operating system.
New software for the statistical comparison of ROC curves is released here as a web server and also as a standalone application for the Linux operating system.
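To make the kind of test described in the abstract concrete, the following is a minimal sketch of a non-parametric comparison of two correlated AUCs on paired data, in the style of the DeLong et al. approach. It assumes this is the family of test meant; the abstract itself does not name the method. The function names (`delong_paired_test`, `_placements`) and the toy data are hypothetical and are not part of the StAR software.

```python
import math
import numpy as np

def _placements(pos, neg):
    """Per-sample placement values for one classifier.

    psi(x, y) = 1 if x > y, 0.5 if x == y, 0 otherwise, where x is the score
    of a positive sample and y the score of a negative sample.
    """
    psi = (pos[:, None] > neg[None, :]).astype(float)
    psi += 0.5 * (pos[:, None] == neg[None, :])
    # Mean over negatives (one value per positive) and over positives
    # (one value per negative).
    return psi.mean(axis=1), psi.mean(axis=0)

def delong_paired_test(labels, scores_a, scores_b):
    """Two-sided test for AUC_a == AUC_b when both classifiers scored the same samples.

    Returns (auc_a, auc_b, z, p_value). Requires at least two positive and
    two negative samples. Illustrative sketch, not the StAR implementation.
    """
    labels = np.asarray(labels).astype(bool)
    v10, v01, aucs = [], [], []
    for scores in (scores_a, scores_b):
        s = np.asarray(scores, dtype=float)
        per_pos, per_neg = _placements(s[labels], s[~labels])
        v10.append(per_pos)
        v01.append(per_neg)
        aucs.append(per_pos.mean())  # Mann-Whitney estimate of the AUC
    m, n = int(labels.sum()), int((~labels).sum())
    # 2x2 covariance of the two AUC estimates; the off-diagonal term is what
    # accounts for the correlation between the two ROC curves.
    cov = np.cov(np.vstack(v10)) / m + np.cov(np.vstack(v01)) / n
    var = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]
    diff = aucs[0] - aucs[1]
    if var <= 0.0:
        # Degenerate case (e.g. identical scores): no evidence of a difference.
        return aucs[0], aucs[1], 0.0, 1.0
    z = diff / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return aucs[0], aucs[1], z, p_value

# Toy paired data: 4 positives followed by 4 negatives, two classifiers.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
scores_b = [0.9, 0.4, 0.7, 0.3, 0.5, 0.8, 0.2, 0.6]
auc_a, auc_b, z, p = delong_paired_test(labels, scores_a, scores_b)
print(f"AUC A = {auc_a:.4f}, AUC B = {auc_b:.4f}, z = {z:.3f}, p = {p:.4f}")
```

Comparing many classifiers in a single run, as the abstract describes, would amount to calling such a function on every pair of score vectors; correcting for multiple comparisons is left out of this sketch.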

  • ABSTRACT: In this paper, we propose an automatic method to segment the five main brain sub-regions (i.e. left/right hemispheres, left/right cerebellum and brainstem) from magnetic resonance images. The proposed method uses a library of pre-labeled brain images in a stereotactic space in combination with a non-local label fusion scheme for segmentation. The main novelty of the proposed method is the use of a multi-label block-wise label fusion strategy specifically designed to deal with the classification of main brain sub-volumes; it processes only specific parts of the brain images, significantly reducing the computational burden. The proposed method has been quantitatively evaluated against manual segmentations. The evaluation showed that the proposed method was faster while producing more accurate segmentations than a current state-of-the-art method. We also present evidence suggesting that the proposed method was more robust against brain pathologies than the compared method. Finally, we demonstrate the clinical value of our method compared to the state-of-the-art approach in terms of asymmetry quantification in Alzheimer's disease. Copyright © 2015. Published by Elsevier Inc.
    Magnetic Resonance Imaging 02/2015; DOI:10.1016/j.mri.2015.02.005 · 2.02 Impact Factor
  • ABSTRACT: Background: The purpose of this study was to evaluate serum HE4 as a biomarker to detect recurrent disease during follow-up of patients with endometrial adenocarcinoma (EAC). Methods: We performed a retrospective analysis of 98 EAC patients treated at Innsbruck Medical University between 1999 and 2009. Twenty-six patients developed recurrent disease. Median follow-up was 5 years. Serum HE4 and CA125 levels were analyzed using the ARCHITECT assay (Abbott, Wiesbaden, Germany) pre-operatively (baseline), post-operatively (interval) and after histological confirmation of recurrent disease or when patients returned for clinical review with no evidence of recurrent disease (recurrence/final). Receiver operating characteristic (ROC) curves, Spearman rank correlation coefficient, chi-squared and Mann-Whitney tests were used for statistical analysis. Results: HE4 levels decreased after initial treatment (p = 0.001) and increased again at recurrence (p = 0.002). HE4 was elevated (>70 pmol/L) in 21 of 26 (81%) and CA125 was elevated (>35 U/ml) in 12 of 26 (46%) patients at recurrence. In endometrioid histology (n = 69), serum HE4 measured during follow-up (area under the curve (AUC) = 0.87, 95% CI 0.79-0.95) was a better indicator of recurrence than CA125 (AUC = 0.67, 95% CI 0.52-0.83). An HE4 level of 70 pmol/L was associated with a sensitivity of 84%, a specificity of 74% and a negative predictive value of 93% when assessing for recurrent endometrioid EAC. Conclusion: This is a preliminary description of HE4 serum levels measured during routine follow-up of EAC patients. Serum HE4 measured during clinical follow-up may identify recurrent disease, particularly in patients with endometrioid histology. Further prospective validation of HE4 is warranted.
    BMC Cancer 02/2015; 15(1):33. DOI:10.1186/s12885-015-1028-0 · 3.32 Impact Factor
  • ABSTRACT: To assess the potential of flicker-defined form (FDF) perimetry to detect functional loss in patient groups with beginning glaucoma, and to evaluate the dynamic range of the FDF stimulus in individual patients and at individual test positions. FDF perimetry and standard automated perimetry (SAP) were performed at identical test locations (adapted G1 protocol) in 60 healthy subjects and 111 glaucoma patients. All patients showed glaucomatous optic disc appearance. Grouping within the glaucoma cohort was based on SAP performance: 33 "preperimetric" open-angle glaucoma (OAG) patients, 28 "borderline" OAG (focal defects and SAP mean defect (MD) < 2 dB), 33 "early" OAG (SAP-MD < 5 dB), 17 "advanced" OAG. All participants were experienced in psychophysical and perimetric tests. Defect values and the areas under receiver operating characteristic (ROC) curves in patient groups were statistically compared. The values of FDF-MD in the preperimetric, borderline, and early OAG groups were 2.7 ± 3.4 dB, 5.5 ± 2.6 dB, and 8.5 ± 3.4 dB respectively (all significantly above normal). The percentage of patients exceeding normal FDF-MD was 27.3%, 60.7%, and 87.9% respectively. The age-adjusted FDF-MD of the G1X protocol was not significantly correlated with refractive error, lens opacity, pupil size, or gender. Occurrence of ceiling effects (inability to detect targets at highest contrast) showed a high correlation with visual field losses (R = 0.72, p < 0.001). Local analysis indicates that SAP losses exceeding 5 dB could not be distinguished with the FDF technique. The FDF stimulus was able to detect beginning glaucoma damage. Patients with SAP-MD values exceeding 5 dB should be monitored with conventional perimetry because of its larger dynamic range.
