Conference Paper

Discrimination of Old Document Images Using Their Style

L3i Labs., Univ. of La Rochelle, La Rochelle, France
DOI: 10.1109/ICDAR.2011.86 Conference: 2011 International Conference on Document Analysis and Recognition, ICDAR 2011, Beijing, China, September 18-21, 2011
Source: IEEE Xplore


Based on the principle described by Pareti et al. in [1], [2], and by Chouaib et al. in [3], this paper proposes combining the Zipf law with a bag of patterns to implement a document indexing processing scheme. In contrast to these two approaches, we retain the most important patterns according to the TF-IDF criterion, and the pattern selection is local. This paper presents the different stages of our indexing process, as well as their application to historical documents. Results on complex images are given, illustrated and discussed.

Available from: Mickaël Coustaty
  • Source
    • "As a consequence, we can use them as global document features and thus describe the style of an image. 4) From Zipf law to Image description: Starting from the results obtained in [2], we keep patterns which are in the left-hand portion. To do this, we retain patterns where the TF-IDF is higher than t% of the max value of TF-IDF for each image."
    ABSTRACT: This paper deals with cultural heritage preservation and the indexing of ancient documents. In the management of historical documents, ancient images are described using semantic information, often annotated manually by historians. We propose an approach that interactively propagates the historians' knowledge to a database of drop cap images, which historians have manually populated with annotations. Built on a novel document indexing processing scheme that combines the Zipf law with a bag of patterns, our approach extends the Bag of Words model to represent this knowledge through visual features, using relevance feedback. Annotations are then propagated automatically to the drop cap image database. The approach is presented together with preliminary experimental results and an illustrative example.
    Document Analysis and Recognition (ICDAR), 2013 12th International Conference on; 01/2013
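
The selection rule quoted in the excerpt above (keep, for each image, the patterns whose TF-IDF exceeds t% of that image's maximum TF-IDF) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that patterns have already been extracted and are represented as plain labels, and that TF-IDF takes the common form of relative frequency times ln(N / document frequency); the function names `tf_idf_per_image` and `select_patterns` are hypothetical.

```python
import math
from collections import Counter


def tf_idf_per_image(images):
    """Compute a TF-IDF score for every pattern of every image.

    `images` is a list of images, each given as a list of pattern labels
    (an assumed representation). Returns one {pattern: score} dict per image.
    """
    n_images = len(images)

    # Document frequency: in how many images does each pattern occur?
    doc_freq = Counter()
    for patterns in images:
        doc_freq.update(set(patterns))

    scores = []
    for patterns in images:
        counts = Counter(patterns)
        total = len(patterns)
        # Assumed weighting: relative term frequency times ln(N / df).
        scores.append({
            p: (c / total) * math.log(n_images / doc_freq[p])
            for p, c in counts.items()
        })
    return scores


def select_patterns(image_scores, t):
    """Keep patterns scoring strictly above t percent of this image's maximum."""
    if not image_scores:
        return set()
    cutoff = (t / 100.0) * max(image_scores.values())
    return {p for p, s in image_scores.items() if s > cutoff}


# Toy data: three "images" described by hypothetical pattern labels.
images = [["a", "a", "b", "c"], ["b", "c", "c"], ["c", "d"]]
scores = tf_idf_per_image(images)
selected = [select_patterns(s, t=50) for s in scores]
```

Because the cutoff is computed from each image's own maximum, the selection is local to the image: a pattern kept in one image may be dropped in another. A pattern that occurs in every image gets a score of zero under this weighting and is never retained.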