-
ABSTRACT: In this paper, we propose the combination of the self organizing map (SOM) and of the tangent distance for effective clustering in document image analysis. The proposed model (SOM_TD) is used for character and layout clustering, with applications to word retrieval and to page classification. By using the tangent distance it is possible to improve the SOM clustering so as to be more tolerant with respect to small local transformations of the input patterns.
Image Analysis and Processing, 2007. ICIAP 2007. 14th International Conference on; 10/2007
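The way a tangent distance makes SOM matching tolerant to small transformations can be illustrated with a minimal sketch. It assumes the one-sided variant (tangent vectors attached to each prototype only) and that the tangent vectors are supplied by the caller; the abstract specifies neither, and all function names are hypothetical. Only the best-matching-unit search is shown, not the SOM weight update.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: return an orthonormal basis spanning `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = _dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(_dot(w, w))
        if norm > eps:  # skip (near-)dependent tangent vectors
            basis.append([wi / norm for wi in w])
    return basis


def one_sided_tangent_distance(x, prototype, tangents):
    """Distance from x to the affine subspace prototype + span(tangents).

    Assumption: one-sided tangent distance; with no tangents this
    reduces to the Euclidean distance.
    """
    diff = [xi - pi for xi, pi in zip(x, prototype)]
    # Remove the component of the difference that lies along the tangents.
    for b in orthonormalize(tangents):
        c = _dot(diff, b)
        diff = [di - c * bi for di, bi in zip(diff, b)]
    return math.sqrt(_dot(diff, diff))


def best_matching_unit(x, prototypes, tangents_per_prototype):
    """Index of the prototype closest to x under the tangent distance."""
    dists = [
        one_sided_tangent_distance(x, p, t)
        for p, t in zip(prototypes, tangents_per_prototype)
    ]
    return min(range(len(dists)), key=dists.__getitem__)
```

For a prototype at the origin with a single tangent vector along the first axis, the input (5, 1, 0) lies at tangent distance 1, whereas its Euclidean distance is the square root of 26: displacement along the modeled transformation is not penalized.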
-
ABSTRACT: In this paper we propose recognizing logo images by using an adaptive model referred to as a recursive artificial neural network. First, logo images are converted into a structured representation based on contour trees. Recursive neural networks are then trained using the contour trees as inputs. A contour tree is constructed by associating a node with each exterior or interior contour extracted from the logo instance. Nodes in the tree are labeled by a feature vector, which describes the contour by means of its perimeter, surrounded area, and a synthetic representation of its curvature plot. The contour-tree representation contains the topological structural information of the logo and continuous values pertaining to each contour node. Hence symbolic and sub-symbolic information coexist in the contour-tree representation of a logo image. Experimental results are reported on 40 real logos distorted with artificial noise, and the performance of the recursive neural network is compared with that of two other neural approaches.
11/2006: pages 104-117;
-
ABSTRACT: Recurrent neural networks are powerful learning machines capable of processing sequences. A recent extension of these machines can conveniently be used to also process general data structures like trees and graphs, which opens the door to a number of new and very interesting applications previously unexplored.
In this paper, we show that when the problem of learning is restricted to purely symbolic data structures, like trees, the continuous representation developed after learning can also be given a symbolic interpretation. In particular, we show that a proper quantization of the neuron activation trajectory makes it possible to induce tree automata. We present preliminary experiments for small-size problems that, however, are very promising, especially when considering that this methodology is very robust with respect to accidental or malicious corruption of the learning set.
11/2006: pages 36-47;
-
ABSTRACT: We describe a system for the retrieval on the basis of layout similarity of document images belonging to collections stored in digital libraries. Layout regions are extracted and represented with the XY tree. The proposed indexing method combines a new tree clustering algorithm (based on self organizing maps) with principal component analysis. The combination of these techniques allows us to retrieve the most similar pages from large collections without the need for a direct comparison of the query page with each indexed document.
Document Image Analysis for Libraries, 2006. DIAL '06. Second International Conference on; 05/2006
-
ABSTRACT: In this paper, we describe a system capable of extracting textual information from images of structured documents. In particular, the model and the algorithms we describe are used to process forms in which the information fields cannot be located by their position on the page alone, but can be identified after locating the corresponding instruction fields. The proposed model is based on attributed relational graphs and performs form registration and location of information fields using algorithms based on the hypothesize-and-verify paradigm. The location of instruction fields is carried out in a holistic way, by using connectionist models.
04/2006: pages 438-448;
-
ABSTRACT: This paper deals with furnishing object-oriented database models with a paradigm of taxonomic reasoning, i.e., an inference capability characteristic of the knowledge representation systems developed within the KL-ONE family. This particular ability is based on subsumption computation, i.e., on deducing subset relationships among classes from their structural descriptions. We endow a data model developed in an object-oriented database environment with a formal framework for dealing with taxonomic reasoning. In particular, we define the model's intensional and extensional levels, and an interpretation function specifying their way of interacting; a subsumption algorithm is given and its soundness, completeness, and polynomial complexity are proven.
04/2006: pages 124-140;
-
ABSTRACT: This paper addresses the problem of locating and recognizing graphic items in document images. The proposed approach allows us to recognize such items also in the presence of high noise, scaling, and rotation. This is accomplished by a hybrid model which performs graphic item location by morphological operations and connected component analysis, and item recognition by a proper connectionist model. Some very promising experimental results are reported to support the proposed algorithms.
01/2006: pages 135-147;
-
ABSTRACT: This paper deals with endowing object-oriented data models with taxonomic reasoning, i.e., an inference capability characteristic of the knowledge representation systems developed within the KL-ONE family. For this purpose, the main point is to introduce the concept of a defined class, i.e., a class whose structural description gives necessary and sufficient conditions for an object to belong to it. Object-oriented data models usually do not refer to this concept, because only necessary conditions are expressed by the type descriptions. We take into account a data model, developed in a database environment, and show how a uniform formal framework can be defined so that this model fits taxonomic reasoning.
01/2006: pages 375-384;
-
ABSTRACT: In this paper we propose a neural model conceived for problems of word recognition and understanding of small protocol-driven sentences. The model is based on a unified approach to integrating a priori knowledge and learning by example. The a priori knowledge, injected into the network connections, can be of different levels, while learning is mainly conceived as a refinement process and is responsible for dealing with uncertainty. We describe a small prototype for problems of isolated word recognition.
01/2006: pages 450-454;
-
ABSTRACT: Large collections of scanned documents (books and journals) are now available in digital libraries. The most common method for retrieving relevant information from these collections is image browsing, but this approach is not feasible for books with more than a few dozen pages. Printed text can be recognized in the images by OCR systems, in which case retrieval by textual content can be performed. However, the results heavily depend on the quality of the original documents. More sophisticated navigation can be performed when an electronic table of contents of the book is available with links to the corresponding pages. An opposite approach relies on reducing the amount of symbolic information to be extracted at storage time. This approach is taken by document image retrieval systems. We describe a system that we developed in order to retrieve information from digitized books and journals belonging to digital libraries. The main feature of the system is the ability to combine two principal retrieval strategies in several ways. The first strategy allows a user to find pages with a layout similar to a query page. The second strategy is used to retrieve words in the collection matching a user-defined query, without performing OCR. The combination of these basic strategies allows users to retrieve meaningful pages with little effort during the indexing phase. We describe the basic tools used in the system (layout analysis, layout retrieval, word retrieval) and the integration of these tools for answering complex queries. The experiments are performed on 1287 pages and show the effectiveness of the integrated retrieval.
Document Image Analysis for Libraries, 2004. Proceedings. First International Workshop on; 02/2004
-
ABSTRACT: This paper describes a system for efficient indexing and retrieval of words in collections of document images. The proposed method is based on two main principles: unsupervised prototype clustering, and string encoding for efficient string matching. During indexing, a self organizing map (SOM) is trained so as to cluster together similar symbols (character-like objects) in a sub-set of the documents to be stored. By using the trained SOM the words in the whole collection can be stored and represented with a fixed-length description that can be easily compared in order to score most similar words in response to a user query. The system can be automatically adapted to different languages and font styles. The most appropriate applications are for the processing of old documents (18th and 19th Centuries) where current OCRs have more difficulties. Experimental results describe three application scenarios having various levels of difficulty for current OCR systems.
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on; 09/2003
-
ABSTRACT: We describe an approach for table location in document images. The documents are described by means of a hierarchical representation that is based on the MXY tree. The presence of a table is hypothesized by searching for parallel lines in the MXY tree of the page. This hypothesis is afterwards verified by locating perpendicular lines or white spaces in the region included between the parallel lines. Lastly, located tables can be merged on the basis of proximity and similarity criteria. The use of an optimization method, which relies on the definition of an appropriate table location index, allows us to identify the optimal values of the thresholds involved in the algorithm. In this way the algorithm can be adapted to recognize tables with different features by maximizing the performance on an appropriate training set. The algorithm has been evaluated on two data sets containing more than 1500 pages, and its results have been compared with the tables identified by two commercial OCRs.
Pattern Recognition, 2002. Proceedings. 16th International Conference on; 02/2002
-
ABSTRACT: In this paper we present an architecture for understanding documents of a domain that can be grouped into classes. Documents are grouped with respect to their physical structure. The architecture is based on two knowledge descriptions of the domain: one is independent of the classes and one is related to the classes. These knowledge levels are used to understand the documents of the domain. The understanding phase is described in relation to the phases of analysis and classification of such documents.
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on; 10/1999
-
ABSTRACT: We describe a top-down approach to the segmentation and representation of documents containing tabular structures. Examples of these documents are invoices and technical papers with tables. The segmentation is based on an extension of X-Y trees, where the regions are split by means of cuts along separators (e.g. lines), in addition to cuts along white spaces. The leaves describe regions containing homogeneous information and cutting separators. Adjacency links among leaves of the tree describe local relationships between corresponding regions.
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on; 10/1999
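The white-space cuts that the X-Y tree segmentation starts from can be sketched as follows. This is a simplified illustration, not the extended tree of the abstract: it assumes a binary image given as a list of rows (1 = ink), cuts only along white space (not along separator lines), and at each step splits at the single widest gap in either direction. The function names and the `min_gap` parameter are hypothetical.

```python
def widest_gap(profile, min_gap):
    """Return (start, end) of the widest empty run with ink on both sides, or None."""
    best = None
    run_start = None
    seen_ink = False
    for i, has_ink in enumerate(profile):
        if has_ink:
            if seen_ink and run_start is not None:
                length = i - run_start
                if length >= min_gap and (best is None or length > best[1] - best[0]):
                    best = (run_start, i)
            run_start = None
            seen_ink = True
        elif run_start is None:
            run_start = i
    return best


def xy_cut(image, region=None, min_gap=1):
    """Recursively split `region` (top, left, bottom, right; end-exclusive)."""
    if region is None:
        region = (0, 0, len(image), len(image[0]))
    top, left, bottom, right = region
    # Projection profiles: does each row / column of the region contain ink?
    rows = [any(image[r][left:right]) for r in range(top, bottom)]
    cols = [any(image[r][c] for r in range(top, bottom)) for c in range(left, right)]
    h = widest_gap(rows, min_gap)
    v = widest_gap(cols, min_gap)
    node = {"region": region, "cut": None, "children": []}
    if h is None and v is None:
        return node  # no white-space gap left: this region is a leaf
    h_len = h[1] - h[0] if h else 0
    v_len = v[1] - v[0] if v else 0
    if h_len >= v_len:
        node["cut"] = "horizontal"
        parts = [(top, left, top + h[0], right), (top + h[1], left, bottom, right)]
    else:
        node["cut"] = "vertical"
        parts = [(top, left, bottom, left + v[0]), (top, left + v[1], bottom, right)]
    node["children"] = [xy_cut(image, p, min_gap) for p in parts]
    return node


def leaves(node):
    """Collect the leaf regions of the tree, left to right."""
    if not node["children"]:
        return [node["region"]]
    return [r for child in node["children"] for r in leaves(child)]
```

The leaf regions returned by `leaves(xy_cut(image))` are the blocks that contain no further interior white-space gap of at least `min_gap` rows or columns.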
-
ABSTRACT: We introduce a probabilistic graphical model for supervised learning on databases with categorical attributes. The proposed belief network contains hidden variables that play a role similar to nodes in decision trees and each of their states either corresponds to a class label or to a single attribute test. As a major difference with respect to decision trees, the selection of the attribute to be tested is probabilistic. Thus, the model can be used to assess the probability that a tuple belongs to some class, given the predictive attributes. Unfolding the network along the hidden states dimension yields a trellis structure having a signal flow similar to second order connectionist networks. The network encodes context specific probabilistic independencies to reduce parametric complexity. We present a custom tailored inference algorithm and derive a learning procedure based on the expectation-maximization algorithm. We propose decision trellises as an alternative to decision trees in the context of tuple categorization in databases, which is an important step for building data mining systems. Preliminary experiments on standard machine learning databases are reported, comparing the classification accuracy of decision trellises and decision trees induced by C4.5. In particular, we show that the proposed model can offer significant advantages for sparse databases in which many predictive attributes are missing.
IEEE Transactions on Knowledge and Data Engineering 10/1999; · 1.66 Impact Factor
-
ABSTRACT: We describe a flexible form-reader system capable of extracting textual information from accounting documents, like invoices and bills of service companies. In this kind of document, the extraction of some information fields cannot take place without having detected the corresponding instruction fields, which are only constrained to range in given domains. We propose modeling the document's layout by means of attributed relational graphs, which turn out to be very effective for form registration, as well as for performing a focused search for instruction fields. This search is carried out by means of a hybrid model, where proper algorithms, based on morphological operations and connected components, are integrated with connectionist models. Experimental results are given in order to assess the actual performance of the system.
IEEE Transactions on Pattern Analysis and Machine Intelligence 08/1998; · 4.91 Impact Factor
-
ABSTRACT: Recurrent neural networks processing symbolic strings can be regarded as adaptive neural parsers. Given a set of positive and negative examples, picked up from a given language, adaptive neural parsers can effectively be trained to infer the language grammar. In this paper we use adaptive neural parsers to face the problem of inferring grammars from examples that are corrupted by a kind of noise that simply changes their membership. We propose a training algorithm, referred to as hybrid finite state filter, which is based on a parsimony principle that penalizes the development of complex rules. We report very promising experimental results showing that the proposed inductive inference scheme is indeed capable of capturing rules, while removing noise.
IEEE Transactions on Neural Networks 06/1998; · 2.95 Impact Factor
-
ABSTRACT: This paper presents a declarative knowledge base, the Conceptual Model, which describes the invoice domain as generally as possible. The model is based on a semantic network that is able to describe the invoice domain at different levels of abstraction. The Conceptual Model can be used for the labelling procedure of physical rectangles, extracted from invoices, in order to construct a model (Document Model) for each class of invoices. The Document Model contains physical coordinates for each rectangle, which can be estimated from an invoice, and the related semantic label. Once the Document Model is constructed, it can be applied to understand an invoice instance, whose class is uniquely identified by its logo.
Database and Expert Systems Applications, 1997. Proceedings., Eighth International Workshop on; 10/1997
-
ABSTRACT: We present a method for the logical labelling of physical rectangles, extracted from invoices, based on a conceptual model which describes, as generally as possible, the invoice universe. This general knowledge is used in the semi-automatic construction of a model for each class of invoices. Once the model is constructed, it can be applied to understand an invoice instance, whose class is uniquely identified by its logo. This approach is used to design a flexible system which is able to learn, from a nucleus of general knowledge, a monotonic set of specific knowledge for each class of invoices (document models), in terms of physical coordinates for each rectangle and the related semantic label.
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on; 09/1997
-
ABSTRACT: Much attention has recently been paid to the recognition of graphical objects, such as company logos and trademarks. Recognizing these objects facilitates the recognition of document classes. Some promising results have been achieved by using autoassociator-based artificial neural networks (AANN) in the presence of homogeneously distributed noise. However, the performance drops significantly when dealing with spot-noisy logos, where strips or blobs produce a partial obstruction of the pictures. We propose a new approach for training AANNs especially conceived for dealing with spot noise. The basic idea is to introduce new metrics for assessing the reproduction error in AANNs. The proposed algorithm, referred to as spot-backpropagation (S-BP), is significantly more robust with respect to spot noise than classical Euclidean norm-based backpropagation (BP). Our experimental results are based on a database of 88 real logos that are artificially corrupted by spot noise.
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on; 09/1997