ROCR: Visualizing classifier performance in R

Department of Computational Biology and Applied Algorithmics, Max-Planck-Institute for Informatics, Saarbrücken, Germany.
Bioinformatics (Impact Factor: 4.98). 11/2005; 21(20):3940-1. DOI: 10.1093/bioinformatics/bti623
Source: PubMed


ROCR is a package for evaluating and visualizing the performance of scoring classifiers in the statistical language R. It
features over 25 performance measures that can be freely combined to create two-dimensional performance curves. Standard methods
for investigating trade-offs between specific performance measures are available within a uniform framework, including receiver
operating characteristic (ROC) graphs, precision/recall plots, lift charts and cost curves. ROCR integrates tightly with R's
powerful graphics capabilities, thus allowing for highly adjustable plots. Being equipped with only three commands and reasonable
default values for optional parameters, ROCR combines flexibility with ease of usage.
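
The three commands mentioned above are, in the released package, prediction(), performance() and plot(). The following is a minimal sketch of that workflow, not an excerpt from the paper: it assumes the ROCR package is installed, and the scores and labels are hypothetical toy data chosen only for illustration.

```r
# Minimal sketch of the three-command ROCR workflow.
# Assumption: ROCR is installed, e.g. via install.packages("ROCR").
library(ROCR)

# Hypothetical toy data: classifier scores and true binary labels.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3)
labels <- c(1, 1, 0, 1, 0, 0)

# 1. prediction(): bundle scores and labels into a standardized object.
pred <- prediction(scores, labels)

# 2. performance(): combine two measures into a curve (here TPR vs. FPR).
roc <- performance(pred, measure = "tpr", x.measure = "fpr")

# 3. plot(): draw the curve with R's graphics; extra arguments adjust the plot.
plot(roc, main = "ROC curve (toy data)")
abline(a = 0, b = 1, lty = 2)  # chance diagonal for reference

# Other measure pairs use the same calls, e.g. precision vs. recall.
pr <- performance(pred, measure = "prec", x.measure = "rec")

# Scalar summaries such as the area under the ROC curve are also available.
auc <- performance(pred, measure = "auc")@y.values[[1]]
print(auc)
```

Swapping the measure and x.measure arguments is how different trade-off plots are obtained from the same prediction object, which is the "freely combined" design the abstract describes.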

Availability: ROCR can be used under the terms of the GNU General Public License. Running within R, it is platform-independent.

Contact: tobias.sing{at}

    • "We applied coefficients from our top regression model to our investigated cluster data to calculate a prediction value for each cluster. We then used a receiver operating characteristic (ROC) curve generated in package ROCR (Sing et al. 2005) in R to determine an optimal cutoff value for our model predictions (Boyce et al. 2002; Knopff et al. 2009; Miller et al. 2013). Selection of cutoff values was imperative for optimal prediction performance of the model (Webb et al. 2008; Knopff et al. 2009; Merrill et al. "

    Journal of Mammalogy 11/2015; DOI:10.1093/jmammal/gyv183 · 1.84 Impact Factor
    • "Those data were derived from the GBIF database (retrieved in April 2013). We calculated the AUC value (Fielding & Bell, 1997) where NPP was used as the prediction and GBIF presence as label (Sing et al., 2005; Robin et al., 2011), and Pearson correlation coefficient (COR) between the modelling result and the independent GBIF occurrence data (Elith et al., 2006), two frequently used measures of model performance (Elith & Leathwick, 2009). We furthermore calculated Cohen's kappa (κ) with a threshold value for NPP selected to maximize κ (Gamer et al., 2012). "

    Journal of Biogeography 11/2015; DOI:10.1111/jbi.12646 · 4.59 Impact Factor
    • "This curve represents the relationship between the true-positive rate and the true negative rate at different thresholds and ranges from 0.5 (low accuracy) to 1 (high accuracy). All the analyses were performed using R 2.15.1 (R Development Core Team 2012) with the ROCR (Sing et al. 2005) package. "
    ABSTRACT: The management of animal endangered species requires detailed information on their distribution and abundance, which is often hard to obtain. When animals communicate using sounds, one option is to use automatic sound recorders to gather information on the species for long periods of time with low effort. One drawback of this method is that processing all the information manually requires large amounts of time and effort. Our objective was to create a relatively "user-friendly" (i.e., that does not require big programming skills) automatic detection algorithm to improve our ability to get basic data from sound-emitting animal species. We illustrate our algorithm by showing two possible applications with the Hawai'i 'Amakihi, Hemignathus virens virens, a forest bird from the island of Hawai'i. We first characterized the 'Amakihi song using recordings from areas where the species is present in high densities. We used this information to train a classification algorithm, the support vector machine (SVM), in order to identify 'Amakihi songs from a series of potential songs. We then used our algorithm to detect the species in areas where its presence had not been previously confirmed. We also used the algorithm to compare the relative abundance of the species in different areas where management actions may be applied. The SVM had an accuracy of 86.5% in identifying 'Amakihi. We confirmed the presence of the 'Amakihi at the study area using the algorithm. We also found that the relative abundance of 'Amakihi changes among study areas, and this information can be used to assess where management strategies for the species should be better implemented. Our automatic song detection algorithm is effective, "user-friendly" and can be very useful for optimizing the management and conservation of those endangered animal species that communicate acoustically.
    Ecology and Evolution 10/2015; DOI:10.1002/ece3.1743 · 2.32 Impact Factor