A Comparative Experimental Assessment of a Threshold Selection Algorithm in Hierarchical Text Categorization

Conference Paper · April 2011
DOI: 10.1007/978-3-642-20161-5_6 · Source: DBLP
Conference: Advances in Information Retrieval - 33rd European Conference on IR Research, ECIR 2011, Dublin, Ireland, April 18-21, 2011. Proceedings


    Most of the research on text categorization has focused on mapping text documents to a set of categories among which structural
    relationships hold, i.e., on hierarchical text categorization. For solutions of a hierarchical problem that make use of an
    ensemble of classifiers, the behavior of each classifier typically depends on an acceptance threshold, which turns a degree
    of membership into a dichotomous decision. In principle, finding the best acceptance thresholds for a set of
    classifiers linked by taxonomic relationships is a hard problem. Hence, devising effective ways of finding suboptimal
    solutions to this problem can be of great importance. In this paper, we assess a greedy threshold selection algorithm aimed
    at finding a suboptimal combination of thresholds in a hierarchical text categorization setting. Comparative experiments,
    performed on Reuters, report the performance of the proposed threshold selection algorithm against a relaxed brute-force algorithm
    and against two state-of-the-art algorithms. Results highlight the effectiveness of the approach.
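
To make the notions of an acceptance threshold and of a greedy search over threshold combinations concrete, the following is a minimal illustrative sketch. It is not the algorithm assessed in the paper: the function names (`accept`, `f1`, `greedy_thresholds`), the pipeline model (one classifier per level of a root-to-leaf path), the candidate grid, and the use of F1 as the utility function are all assumptions made for illustration.

```python
from typing import List, Sequence


def accept(scores: Sequence[float], thresholds: Sequence[float]) -> bool:
    """Turn degrees of membership into a yes/no decision.

    Assumption: a document is accepted by a pipeline only if every
    classifier on the root-to-leaf path scores at or above its threshold.
    """
    return all(s >= t for s, t in zip(scores, thresholds))


def f1(scores: Sequence[Sequence[float]],
       labels: Sequence[bool],
       thresholds: Sequence[float]) -> float:
    """F1 of the pipeline decisions on a validation set."""
    tp = fp = fn = 0
    for doc_scores, label in zip(scores, labels):
        predicted = accept(doc_scores, thresholds)
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def greedy_thresholds(scores: Sequence[Sequence[float]],
                      labels: Sequence[bool],
                      grid: Sequence[float]) -> List[float]:
    """Hypothetical greedy search: fix one threshold at a time, top-down.

    Each level tries every candidate in `grid` while the other thresholds
    are held fixed, and keeps the first candidate that strictly improves F1.
    This visits len(grid) * depth combinations instead of len(grid) ** depth,
    so the result is suboptimal in general.
    """
    depth = len(scores[0])
    thresholds = [min(grid)] * depth
    for level in range(depth):
        best_t = thresholds[level]
        best_f = f1(scores, labels, thresholds)
        for t in grid:
            trial = thresholds[:level] + [t] + thresholds[level + 1:]
            trial_f = f1(scores, labels, trial)
            if trial_f > best_f:
                best_t, best_f = t, trial_f
        thresholds[level] = best_t
    return thresholds
```

With a two-level pipeline, `greedy_thresholds(validation_scores, validation_labels, [0.0, 0.25, 0.5, 0.75])` returns one threshold per level; an exhaustive search over the same grid would serve as the brute-force baseline for comparison.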