Commonly used evaluation measures, including Recall, Precision, F-Measure and Rand Accuracy, are biased and should not be used without a clear understanding of the biases and a corresponding identification of the chance or base-case levels of the statistic. Using these measures, a system that performs worse in the objective sense of Informedness can appear...
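
As a rough illustration of the bias described above, the sketch below computes Recall, Precision, F-Measure, Rand Accuracy and Informedness from a 2x2 confusion matrix. The function name `confusion_measures` and all counts are hypothetical, chosen only for illustration; Informedness is taken in its two-class form, recall plus inverse recall minus one.

```python
def confusion_measures(tp, fp, fn, tn):
    """Return common measures for a two-class confusion matrix.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives. Ratios with an empty
    denominator are reported as 0.0 (a convention assumed here).
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    inverse_recall = tn / (tn + fp) if (tn + fp) else 0.0
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Two-class Informedness: zero for a system that ignores its input.
    informedness = recall + inverse_recall - 1
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "informedness": informedness,
    }


# Made-up data set: 90 positive cases and 10 negative cases.
# System A labels every case positive, regardless of the input.
always_positive = confusion_measures(tp=90, fp=10, fn=0, tn=0)

# System B makes errors but does discriminate between the classes.
informed = confusion_measures(tp=70, fp=2, fn=20, tn=8)

for name, scores in (("always positive", always_positive), ("informed", informed)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On these made-up counts, the always-positive system scores higher on F-Measure and Rand Accuracy than the informed one, even though its Informedness is zero.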


... Even the SWIR 1 and SWIR 2 regions also include a large built-up class has been chosen to assess the performance of the classifier. F1 is defined as the harmonic mean of recall and precision (Powers 2020). A high F1 score is also taken as an indication of good classification performance. ...
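
The definition of F1 as a harmonic mean can be sketched in a few lines. The helper name `f1_score` and the example values are hypothetical, and the standard-library `statistics.harmonic_mean` is used only as a cross-check.

```python
from statistics import harmonic_mean


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    if total == 0:
        return 0.0
    return 2 * precision * recall / total


# Illustrative values only: a precise classifier with modest recall.
precision, recall = 0.9, 0.5

f1 = f1_score(precision, recall)
arithmetic = (precision + recall) / 2

# The harmonic mean is pulled toward the smaller of the two values,
# so F1 is lower than the arithmetic mean whenever they differ.
print(f"F1 = {f1:.4f}, arithmetic mean = {arithmetic:.4f}")
print(f"cross-check = {harmonic_mean([precision, recall]):.4f}")
```

Because the harmonic mean penalises imbalance, a classifier cannot reach a high F1 by maximising recall or precision alone.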