Is it possible to plot a ROC curve for a multiclass classification algorithm to study its performance, or is it better to analyze the confusion matrix?
I have a multiclass dataset which I am analyzing using classification algorithms, but I am having difficulty plotting the ROC curve. I have searched through many papers and sites, but most discussions concern binary classification. Is it better to plot a ROC curve for the multiclass case, or just to analyze the confusion matrix, which could give us a fair idea of the performance of different algorithms?
One common answer is to treat the multiclass problem as a set of binary ones: consider each class versus all the rest, calculate the operating points for each class, and then average them over the entire classifier. I am not sure how faithful a picture this gives of the classifier's performance, but it is the approach I have seen in many papers.
Yes Dheeb, you can take the average of the three AUCs. Alternatively, using the levels argument of the multiclass.roc function in the pROC library, all levels are used and combined to compute the multiclass AUC. See David J. Hand and Robert J. Till (2001), "A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems", Machine Learning 45(2), pp. 171–186.
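Both options above can be sketched in a few lines. This is a minimal sketch using scikit-learn rather than R's pROC (the dataset and model are illustrative assumptions); `multi_class="ovr"` averages the per-class one-vs-rest AUCs, while `multi_class="ovo"` with macro averaging gives the Hand-and-Till pairwise multiclass AUC:

```python
# Sketch: averaging per-class AUCs vs. the Hand & Till multiclass AUC.
# scikit-learn stands in for pROC here; data and model are illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_classes=3, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)

# Average of the three one-vs-rest AUCs:
ovr_auc = roc_auc_score(y_te, proba, multi_class="ovr", average="macro")
# Hand & Till (2001) multiclass AUC: average over all class pairs:
ht_auc = roc_auc_score(y_te, proba, multi_class="ovo", average="macro")
print(ovr_auc, ht_auc)
```

On most datasets the two numbers are close; they diverge mainly under strong class imbalance, since the pairwise version ignores class prevalence.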
An approach that almost no one uses (because it is little known), but that I consider far superior, is to compute the consequential information: an information measure that captures all the performance issues in a single scalar, where bigger is always better. If you are interested, the development for the binary case is in a paper of mine published in IEEE Access, where it is referred to as "separation", and I would be happy to provide the as-yet-unpublished notes on extending it to the multiclass problem. The problem with both ROC curves and confusion matrices is that they are ambiguous measures of performance; this one is not.
In a multi-class classification problem, you either formulate the problem as one-vs-all, in which case you have one ROC curve per class, or you formulate it as one-vs-one, in which case you have one ROC curve per class-pair combination.
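The one-vs-all formulation can be sketched as follows (scikit-learn and the iris dataset are illustrative assumptions): binarize the labels, then compute one ROC curve per class from that class's score column.

```python
# Sketch of the one-vs-rest formulation: one ROC curve per class.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)
y_bin = label_binarize(y_te, classes=[0, 1, 2])  # one indicator column per class

for k in range(3):
    fpr, tpr, _ = roc_curve(y_bin[:, k], proba[:, k])  # class k vs. rest
    print(f"class {k}: AUC = {auc(fpr, tpr):.3f}")
```

For the one-vs-one formulation you would instead restrict the test set to each pair of classes and compute a binary ROC curve for every pair.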
The confusion matrix is nice, but it is only a point estimate at a single operating threshold, so it is not statistically robust on its own, unless you plot the whole Precision-Recall curve and calculate its area.
I have a three-class problem. My labels for the three classes are 0, 1, and 2 for class 1, class 2, and class 3, respectively. I used the one-vs-all approach to compute the sensitivities and specificities, and once I know these two values, I can compute the area under the curve.
But the problem is that, since I am using the one-vs-all approach, I have three sensitivities, three specificities, and three AUCs. How can I determine the overall values of these parameters for the classifier?
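One common convention (an assumption, not the only one) is to macro-average the three per-class values, optionally weighting by class support. The numbers below are hypothetical per-class results, just to show the arithmetic:

```python
# Sketch: combining per-class one-vs-all metrics into overall values.
# The per-class numbers below are hypothetical, for illustration only.
import numpy as np

sens = np.array([0.90, 0.80, 0.70])   # per-class sensitivities
spec = np.array([0.95, 0.85, 0.90])   # per-class specificities
aucs = np.array([0.96, 0.88, 0.85])   # per-class one-vs-rest AUCs
support = np.array([50, 30, 20])      # number of test samples per class

# Macro average: unweighted mean over the three classes.
macro = {"sens": sens.mean(), "spec": spec.mean(), "auc": aucs.mean()}
# Weighted average: classes contribute in proportion to their prevalence.
weighted_auc = np.average(aucs, weights=support)
print(macro, weighted_auc)
```

The macro average treats all classes equally; the weighted average lets frequent classes dominate, so the choice depends on whether rare classes matter as much as common ones in your application.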
What happens with a multi-class "ranking" classifier? I have a classifier that assigns an input to one of five available classes by simply taking the highest score (a score is generated for each class). How can I vary the classifier's threshold in such a situation? Is it possible?
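Following the one-vs-all idea discussed above, each class's score column can itself serve as the decision statistic: sweep a threshold t over that column and predict "class k" whenever its score exceeds t. A minimal sketch (the score matrix is a random stand-in for a real classifier's output):

```python
# Sketch: thresholding one class's score column in a 5-class ranking classifier.
# The score matrix is a hypothetical stand-in for real classifier outputs.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
scores = rng.random((100, 5))     # one score per class for 100 inputs
y = scores.argmax(axis=1)         # argmax rule: predicted class = top score

# For class 0, sweep a threshold over scores[:, 0]: predict "class 0" whenever
# its score >= t. roc_curve enumerates every distinct operating point.
fpr, tpr, thresholds = roc_curve(y == 0, scores[:, 0])
print(len(thresholds), "operating points for class 0 vs. rest")
```

Repeating this for each of the five classes yields the five one-vs-rest ROC curves, even though the deployed classifier only ever uses the argmax.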
- For a multi-class problem, you can use vROC (Volume ROC) instead of ROC; indeed, the ROC curve is defined only for binary classification tasks.
- If you want to stick with ROC, you can evaluate your model by computing the AUC of each pairwise (two-class) ROC and then averaging them all.
- Besides ROC, there are many other evaluation metrics, such as sensitivity, specificity, Jaccard, F1 score, overall accuracy, and MCC. You can find the definitions of these metrics in my research papers here on ResearchGate.
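Most of the metrics in that list are available off the shelf; this sketch shows the scikit-learn calls (the labels are hypothetical, and macro-averaged recall is used as the multiclass sensitivity):

```python
# Sketch: computing several of the listed metrics with scikit-learn.
# y_true / y_pred are hypothetical three-class predictions.
from sklearn.metrics import (accuracy_score, f1_score, jaccard_score,
                             matthews_corrcoef, recall_score)

y_true = [0, 0, 1, 1, 2, 2, 2, 1, 0, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 1, 0, 2]

print("accuracy:", accuracy_score(y_true, y_pred))          # 8/10 correct -> 0.8
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
print("macro Jaccard:", jaccard_score(y_true, y_pred, average="macro"))
print("MCC:", matthews_corrcoef(y_true, y_pred))
# Macro recall = macro-averaged sensitivity across the three classes:
print("macro sensitivity:", recall_score(y_true, y_pred, average="macro"))
```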
Both PR and ROC curves are used for binary classification, but for multiple classes you can consider one class as positive and all the others as negative. With this one-vs-rest strategy, the multiclass problem can be handled.
With growing access to the Web, a large amount of content is produced daily. The study of such content enables the discovery of new knowledge. In that vein, this work presents an analysis of algorithms for detecting emotions in tweets in Brazilian Portuguese. Ten algorithms are considered, ranging from...
This paper presents the intrinsic limit determination algorithm (ILD Algorithm), a novel technique to determine the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features, regardless of the model used. T...