Conference Paper

Discrimination of Old Document Images Using Their Style.

L3i Labs., Univ. of La Rochelle, La Rochelle, France
DOI: 10.1109/ICDAR.2011.86 Conference: 2011 International Conference on Document Analysis and Recognition, ICDAR 2011, Beijing, China, September 18-21, 2011
Source: IEEE Xplore

ABSTRACT Based on the principle described by Pareti et al. in [1], [2], and by Chouaib et al. in [3], this paper proposes combining the Zipf law with a bag of patterns to implement a document indexing scheme. Contrary to these two approaches, we retain the most important patterns according to the TF-IDF criterion, and the pattern selection is local. This paper presents the different stages of our indexing process, as well as their application to historical documents. Results on complex images are given, illustrated and discussed.
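The TF-IDF selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each image has already been reduced to a list of hashable pattern tokens (for instance, tuples of quantised pixel values), and the function name `top_patterns_by_tfidf` and the parameter `k` are hypothetical. The selection is local in the sense that the top patterns are chosen separately for each image.

```python
import math
from collections import Counter


def top_patterns_by_tfidf(images, k):
    """Return, for each image, its k patterns with the highest TF-IDF score.

    `images` is a list of pattern lists, one list per image (an assumed
    input format; pattern extraction itself is not shown here).
    """
    n_images = len(images)

    # Document frequency: in how many images does each pattern occur?
    df = Counter()
    for patterns in images:
        df.update(set(patterns))

    selected = []
    for patterns in images:
        tf = Counter(patterns)
        total = len(patterns)
        # Classic TF-IDF: relative frequency in this image times the log
        # of the inverse fraction of images containing the pattern.
        scores = {
            p: (count / total) * math.log(n_images / df[p])
            for p, count in tf.items()
        }
        # Sort by decreasing score; break ties deterministically.
        ranked = sorted(scores, key=lambda p: (-scores[p], repr(p)))
        selected.append(ranked[:k])
    return selected
```

A pattern that occurs in every image gets a score of zero, so it is only retained when an image has fewer than `k` more distinctive patterns.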

  • ABSTRACT: Discrimination between graphical drawings is a difficult problem. Depending on the application, it can be considered at different levels: fine details can be observed, or, more globally, what could be called the style. Here we are concerned with a global view of initial letters extracted from early Renaissance printed documents. We present a new method to index and classify ornamental letters in ancient books. We show how the Zipf law, originally used in mono-dimensional domains, can be adapted to the image domain. We use it as a model to characterize the distribution of patterns occurring in these special drawings, the initial letters. Based on this model, some new features are extracted, and we show their efficiency for style discrimination.
    Graphics Recognition. Ten Years Review and Future Perspectives, 6th International Workshop, GREC 2005, Hong Kong, China, August 25-26, 2005, Revised Selected Papers; 01/2005
  • ABSTRACT: This paper deals with the difficult problem of indexing ancient graphic images. It tackles the particular case of indexing drop caps (also called lettrines) and, specifically, considers the problem of letter extraction from these complex graphic images. Based on an analysis of the features of the images to be indexed, an original strategy is proposed. This approach relies on filtering the relevant information by means of the Meyer decomposition. Then, in order to accommodate the variability of representation of the information, Zipf’s law modeling enables detection of the regions belonging to the letter, which allows it to be segmented. The overall process is evaluated using a relevant set of images, which shows the relevance of the approach.
    IJDAR. 01/2011; 14:243-254.
  • ABSTRACT: In the field of statistical discrimination k-nearest neighbor classification is a well-known, easy and successful method. In this paper we present an extended version of this technique, where the distances of the nearest neighbors can be taken into account. In this sense there is a close connection to LOESS, a local regression technique. In addition we show possibilities to use nearest neighbor for classification in the case of an ordinal class structure. Empirical studies show the advantages of the new techniques.
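The adaptation of the Zipf law to images mentioned in the abstracts above can be sketched as follows. This is only an illustration under stated assumptions, not the method of the cited papers: the image is assumed to be a 2D list of already quantised integer values, every 3x3 neighbourhood is treated as one pattern, and the helper names `patterns_3x3` and `zipf_slope` are hypothetical. The sketch ranks patterns by frequency and fits a least-squares line to the log-log rank/frequency curve; the slope is one possible style feature.

```python
import math
from collections import Counter


def patterns_3x3(image):
    """Yield every 3x3 neighbourhood of a 2D list as a flat tuple."""
    height, width = len(image), len(image[0])
    for r in range(height - 2):
        for c in range(width - 2):
            yield tuple(
                image[r + dr][c + dc] for dr in range(3) for dc in range(3)
            )


def zipf_slope(image):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Pattern frequencies sorted from most to least frequent.
    freqs = sorted(Counter(patterns_3x3(image)).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]

    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct patterns to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Because frequencies are sorted in decreasing order, the fitted slope is never positive; a distribution that follows the Zipf law closely gives a nearly straight log-log curve.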
