Conference Paper

Text/Graphics Separation in Maps

Authors: R. Cao and C. L. Tan

Abstract

The separation of overlapping text and graphics is a challenging problem in document image analysis. This paper proposes a specific method of detecting and extracting characters that are touching graphics. It is based on the observation that the constituent strokes of characters are usually short segments in comparison with those of graphics. It combines line continuation with the feature line width to decompose and reconstruct segments underlying the region of intersection. Experimental results showed that the proposed method improved the percentage of correctly detected text as well as the accuracy of character recognition significantly.
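As a rough illustration of the pairing rule sketched in the abstract (line continuation combined with stroke width), the following Python sketch pairs segments incident to a junction. It is not the authors' implementation; the segment directions and widths are assumed to come from an earlier vectorisation step, and the tolerances are illustrative.

```python
import numpy as np

def pair_segments_at_junction(segments, angle_tol_deg=15.0, width_tol=0.4):
    """Greedily pair segments incident to one junction by line continuation
    (near-collinear directions) and similar stroke width.

    segments: list of dicts with
        'direction': 2D unit vector pointing away from the junction,
        'width'    : estimated stroke width of the segment.
    Returns index pairs judged to belong to the same underlying line."""
    pairs, used = [], set()
    for i in range(len(segments)):
        if i in used:
            continue
        best_j, best_score = None, None
        for j in range(i + 1, len(segments)):
            if j in used:
                continue
            di = np.asarray(segments[i]['direction'], dtype=float)
            dj = np.asarray(segments[j]['direction'], dtype=float)
            # Continuation test: the two directions should be nearly opposite,
            # i.e. the line runs straight through the junction.
            cos_a = float(np.clip(np.dot(di, -dj), -1.0, 1.0))
            angle = np.degrees(np.arccos(cos_a))
            # Width test: relative difference of the two stroke widths.
            wi, wj = segments[i]['width'], segments[j]['width']
            width_diff = abs(wi - wj) / max(wi, wj)
            if angle <= angle_tol_deg and width_diff <= width_tol:
                score = angle + 10.0 * width_diff   # smaller is better
                if best_score is None or score < best_score:
                    best_j, best_score = j, score
        if best_j is not None:
            pairs.append((i, best_j))
            used.update((i, best_j))
    return pairs
```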


... Several attempts to address the problem of separating text and other graphical features in topographic maps have been reported. In reference [16], map text and other map features were separated based on the differences in their constituent strokes. Tombre et al. [17] proposed an approach for locating overlapping text by searching for seed strings. ...
... Once text units are available as a result of text unit separation, the task of text recognition consists of text localization and text classification [2], [3], [16], [12]. A direct and simple approach is to exploit optical character recognition (OCR) engines, as reported in many previous works on topographic map text recognition [15], [16], [21]. In particular, the emergence of deep learning approaches has greatly facilitated OCR tasks [24]-[27]. ...
Article
Full-text available
Text features in topographic maps are important for helping users to locate the area that a map covers and to understand the map’s content. Previous works on the optical detection of map text from topographic maps have used geometric features, the Hough transform, and segmentation. However, these approaches still face challenges when detecting map text in complicated contexts, especially when the map text is touching other map features, such as contours or geographical features. Thus, state-of-the-art techniques for map text and feature recognition as well as manual interpretation and correction are always required to produce accurate results when optically converting topographic maps into a readable format. This paper proposes a methodological framework called the intelligent map reader that enables the automatic and accurate optical understanding of the content of a topographic map using deep learning techniques in combination with a gazetteer. The intelligent map reader framework includes the detection of map text via deep learning, the separation of text units via graph-based segmentation and clustering, optical character recognition (OCR) via an OCR engine, and digital-gazetteer-based map content understanding. Experimental results validate the efficiency and robustness of our proposed methodology for map text recognition and map content understanding. We expect the proposed intelligent map reader to contribute to various applications in the GeoAI field.
... Here, researchers prefer various shape features of the connected components (CCs) for their classification as text/non-text. Examples of such features are the height, width, aspect ratio and thickness of CCs [64][65][66]. ...
... Methods discussed above are designed to perform text/non-text separation in book and/or magazine covers, and CD covers for example. A significant number of methods are also reported in the literature which perform text/non-text separation for different types of maps [53,64,105]. Figure 9 shows the stages of a typical map processing system presented by Chiang et al. in [15]. ...
... Cao and Tan [64] initially separate the sub-layer from the map images by defining a threshold on pixel intensity. Solid graphical components are then eliminated using a morphology-based method described in [111], and dashed lines are removed using total line regression. ...
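A minimal sketch of the first two stages summarised in this excerpt (intensity thresholding of the sub-layer and morphology-based removal of solid graphical components) might look as follows; the threshold and kernel size are illustrative assumptions, and the dashed-line removal by total line regression is omitted.

```python
import cv2

def separate_text_sublayer(gray_map, intensity_thresh=100, graphics_kernel=7):
    """Rough two-stage sketch: threshold the dark sub-layer, then use
    morphological opening to isolate solid graphical components, which are
    subtracted to leave candidate text pixels."""
    # Stage 1: keep dark pixels (text and line work) as the sub-layer.
    _, sublayer = cv2.threshold(gray_map, intensity_thresh, 255,
                                cv2.THRESH_BINARY_INV)
    # Stage 2: opening with a large structuring element keeps only solid,
    # thick graphical components, not thin character strokes.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT,
                                       (graphics_kernel, graphics_kernel))
    solid_graphics = cv2.morphologyEx(sublayer, cv2.MORPH_OPEN, kernel)
    # Remove the solid graphics, leaving text and thin lines behind.
    text_candidates = cv2.subtract(sublayer, solid_graphics)
    return text_candidates, solid_graphics
```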
Article
Full-text available
Separation of text and non-text is an essential processing step for any document analysis system. Therefore, it is important to have a clear understanding of the state-of-the-art of text/non-text separation in order to facilitate the development of efficient document processing systems. This paper first summarizes the technical challenges of performing text/non-text separation. It then categorizes offline document images into different classes according to the nature of the challenges one faces, in an attempt to provide insight into various techniques presented in the literature. The pros and cons of various techniques are explained wherever possible. Along with evaluation protocols and benchmark databases, this paper also presents a performance comparison of different methods. Finally, this article highlights the future research challenges and directions in this domain.
... Methods employed by Goto and Aso (1999) and Pouderoux et al. (2007) which identify text in maps based on the geometry of individual connected components do not consider characters of various sizes. Cao and Tan (2002) made use of individual thresholds to detach the black map layer consisting of text and contours as well as of connected components to differentiate between those. Although this is considered a much faster approach compared to a Hough transform, their tailor-made size filters cannot handle overlaps between text and other map features apart from specific line types (Tombre et al., 2002). ...
... Due to their occasionally recurring patterns, textures are often mistakenly identified as text by automated detection processes. Tofani and Kasturi (1998), Cao and Tan (2002), as well as Nazari et al. (2016) defined different thresholds based on connected components to distinguish between text and other map elements. This laborious task is certainly not adaptable to a large variance of maps. ...
Article
Full-text available
Historical maps are frequently neither readable, searchable nor analyzable by machines due to lacking databases or ancillary information about their content. Identifying and annotating map labels is seen as a first step towards an automated legibility of those. This article investigates a universal and transferable methodology for the work with large-scale historical maps and their comparability to others while reducing manual intervention to a minimum. We present an end-to-end approach which increases the number of true positive identified labels by combining available text detection, recognition, and similarity measuring tools with own enhancements. The comparison of recognized historical with current street names produces a satisfactory accordance which can be used to assign their point-like representatives within a final rough georeferencing. The demonstrated workflow facilitates a spatial orientation within large-scale historical maps by enabling the establishment of relating databases. Assigning the identified labels to the geometries of related map features may contribute to machine-readable and analyzable historical maps.
... Codes and annotations in different fonts and styles are used to distinguish symbols with a similar geometry, identify connectors and clarify additional information; however, text characters may overlap with symbols, connectors, or other characters. Works such as Cao et al. [18] and Roy et al. [104] have pointed out the difficulty of identifying overlapping characters in document images. Furthermore, three challenges have been identified once all text characters have been detected: (1) strings of text describing symbols and connectors are represented using arbitrary lengths and sizes as shown in Fig. 2, (2) associating the corresponding text to symbols and connectors is not a straightforward task and (3) text interpretation is prone to errors, and thus some information can be misinterpreted. ...
... One of the most representative forms of segmenting text in images is TGS. It is possible to identify a vast amount of literature related to TGS methods which may have a general purpose [44,110], or be designed for certain types of document images, such as maps [18,80,104,108], book pages [26,42,115] and EDs [15,36,54,66,71,79]. TGS frameworks consist of two steps: character detection and string grouping. ...
Article
Full-text available
Engineering drawings are commonly used across different industries such as oil and gas, mechanical engineering and others. Digitising these drawings is becoming increasingly important. This is mainly due to the legacy of drawings and documents that may provide a rich source of information for industries. Analysing these drawings often requires applying a set of digital image processing methods to detect and classify symbols and other components. Despite the recent significant advances in image processing, and in particular in deep neural networks, automatic analysis and processing of these engineering drawings is still far from being complete. This paper presents a general framework for complex engineering drawing digitisation. A thorough and critical review of relevant literature, methods and algorithms in machine learning and machine vision is presented. A real-life industrial scenario on how to contextualise the digitised information from a specific type of these drawings, namely piping and instrumentation diagrams, is discussed in detail. A discussion of how new trends in machine vision such as deep learning could be applied to this domain is presented with conclusions and suggestions for future research directions.
... The method consists of first applying CC analysis to the drawing in order to locate each character and graphic, discarding the ones longer than a size threshold and a height-to-width ratio threshold. To group characters into strings, the authors introduce a methodology for linear analysis based on applying the Hough transform [37] to the centroids of the text CCs, which is a widely used method that has been applied to find lines [23], [54], arbitrary shapes [5] and, in more recent work, to locate partial images within their full counterparts [57]. This system has become a largely replicated solution due to its versatility and simplicity; however, one of its greatest disadvantages is its inability to correctly identify individual characters and text overlapping lines or even other characters. ...
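The centroid-plus-Hough grouping idea described above can be sketched with OpenCV as below; the voting threshold is an assumed parameter, and the snippet only returns candidate string axes rather than final character groups.

```python
import cv2
import numpy as np

def find_string_alignments(text_mask, hough_votes=5):
    """Plot the centroid of every text-like connected component on a blank
    image and run the Hough line transform over the centroids, so that
    collinear characters vote for a common text-string axis."""
    _, _, _, centroids = cv2.connectedComponentsWithStats(text_mask)
    votes = np.zeros(text_mask.shape, dtype=np.uint8)
    for cx, cy in centroids[1:]:                     # skip background label 0
        votes[int(round(cy)), int(round(cx))] = 255
    # Each detected (rho, theta) pair corresponds to one candidate string axis.
    lines = cv2.HoughLines(votes, 1, np.pi / 180.0, hough_votes)
    return [] if lines is None else [tuple(line[0]) for line in lines]
```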
... Finally, geometrical symbols such as circles and polygons can be located. Circles may be found within dot-dash connectors or representing symbols, and can be segmented through the application of the Hough circles method [5], taking into consideration factors such as size and localisation to avoid false positives within text. On the other hand, polygons can be detected through contour detection and approximation, by means of methods such as the Douglas-Peucker algorithm [22]. ...
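A hedged OpenCV sketch of this geometric-symbol step (Hough circles plus Douglas-Peucker polygon approximation) is given below; all parameter values are assumptions rather than values from the cited work.

```python
import cv2

def detect_circles_and_polygons(gray, binary):
    """Locate circular symbols with the Hough circle transform and polygonal
    symbols via contour detection plus Douglas-Peucker approximation."""
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=20,
                               param1=100, param2=30, minRadius=5, maxRadius=60)
    polygons = []
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contours:
        # Douglas-Peucker: tolerance proportional to the contour perimeter.
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        if 3 <= len(approx) <= 8 and cv2.contourArea(approx) > 50:
            polygons.append(approx.reshape(-1, 2))
    return circles, polygons
```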
Conference Paper
Full-text available
The demand for digitisation of complex engineering drawings is becoming increasingly important for the industry given the pressure to improve the efficiency and time effectiveness of operational processes. There have been numerous attempts to solve this problem, either by proposing a general form of document interpretation or by establishing an application-dependent framework. Moreover, text/graphics segmentation has been presented as a particular form of addressing the document digitisation problem, with the main aim of splitting text and graphics into different layers. Given the challenging characteristics of complex engineering drawings, this paper presents a novel sequential heuristics-based methodology which is aimed at localising and detecting the most representative symbols of the drawing. This implementation enables the subsequent application of a text/graphics segmentation method in a more effective form. The experimental framework is composed of two parts: first we show the performance of the symbol detection system and then we present an evaluation of three different state-of-the-art text/graphics segmentation techniques to find text on the remaining image.
... These studies in which typically text labels are extracted from map images and incorporated into subsequent processing steps of Optical Character Recognition (OCR) have a wide range of applications such as building gazetteers, carrying out historical research on location name changes or studying changes in the landscape and land-use. In addition, extracting and removing map text can improve the recognition of other geographic features such as cadastral boundaries (Cao and Tan, 2002), vegetation features (Leyk et al., 2006), elevation contours (Khotanzad and Zink, 2003) or roads (Li et al., 2000; Chiang and Knoblock, 2013). ...
... The discussion is structured by the major characteristics of text labels and map content: language (script), font, curvature and spacing, print and image quality, text color as well as map complexity. In general, the aim in most studies on text recognition in maps is to detect, extract, and transfer text labels to an OCR component, which then performs the final recognition process (Nagy et al., 1997; Cao and Tan, 2002; Li et al., 2000; Velázquez and Levachkine, 2004; Gelbukh et al., 2004; Pouderoux et al., 2007). How well map labels can be identified and recognized heavily depends on the characteristics described below. ...
Article
Converting geographic features (e.g., place names) in map images into a vector format is the first step for incorporating cartographic information into a geographic information system (GIS). With the advancement in computational power and algorithm design, map processing systems have been considerably improved over the last decade. However, the fundamental map processing techniques such as color image segmentation, (map) layer separation, and object recognition are sensitive to minor variations in graphical properties of the input image (e.g., scanning resolution). As a result, most map processing results would not meet user expectations if the user does not “properly” scan the map of interest, pre-process the map image (e.g., using compression or not), and train the processing system, accordingly. These issues could slow down the further advancement of map processing techniques as such unsuccessful attempts create a discouraged user community, and less sophisticated tools would be perceived as more viable solutions. Thus, it is important to understand what kinds of maps are suitable for automatic map processing and what types of results and process-related errors can be expected. In this paper, we shed light on these questions by using a typical map processing task, text recognition, to discuss a number of map instances that vary in suitability for automatic processing. We also present an extensive experiment on a diverse set of scanned historical maps to provide measures of baseline performance of a standard text recognition tool under varying map conditions (graphical quality) and text representations (that can vary even within the same map sheet). Our experimental results help the user understand what to expect when a fully or semi-automatic map processing system is used to process a scanned map with certain (varying) graphical properties and complexities in map content.
... The method was demonstrated using the fundamental procedures human beings use to locate and extract spatial features in maps, an idea this research intends to work out in an OOA context. Cao & Tan (2002) proposed a method for separating overlapping text and graphics, which works with the assumption that the constituent strokes of characters are usually short segments in comparison with those of graphics. It combines line continuation with the feature line width to decompose and reconstruct segments underlying the region of intersection. ...
... Then, terminal and crossing points were resolved, and finally broken contour lines were matched and reconnected to automatically extract contour lines from the paper map. Furthermore, Cao & Tan (2002) tested the concept of line continuation based on similar slopes, adjacency and size measures, while Chen et al. (1999) applied graphical line tracing based on tests for slope equality and offsets. Since features on the topographic map lack discrete boundaries, making their delineation fuzzy, a reliable fuzzy set and fuzzy logic method will be appropriate for semantically defining the boundaries of geo-spatial objects due to their semantic fuzziness. ...
Thesis
Full-text available
Historical topographic maps are distinct sources of spatial information for hind-cast studies. They are acclaimed to be one of the most reliable legacy archives representing and describing geographic features prior to aerial photography and present-day satellite imagery. However, two major challenges are encountered in extracting information from these sources. These challenges are conceptual and technical, emanating from scanning artefacts, inherent map complexity and analogousness; information extraction has so far been done manually through digitizing, pixel-based methods and visual map analysis, which are time-consuming and tedious. Hence, there is an urgent need to explore robust and reliable methods such as object-oriented analysis (OOA) to efficiently develop new information extraction techniques for scanned topographic paper maps. Therefore, this research investigated and answered questions about the conceptualization, development, implementation and transferability of an OOA-based information extraction method for complex paper maps and its potential applications. This study demonstrates the OOA-based information extraction technique on a 1967 topographic map of Nigeria. The work was structured broadly into three major parts. The first part investigated the underpinning theoretical concepts of saliency and semantics to conceptualize a generic OOA-based information extraction framework. The second part consequently translated the conceptualized procedure to develop multi-step object-centered rules and implemented the developed algorithm on input maps to robustly and accurately extract tangible thematic information, despite the typical complexities known with paper maps. Thirdly, the research probed how far OOA rule-sets can be transferred to comparable data sets of the same series with slightly changed imaging conditions, to see how robustly and reliably the once-developed rule-sets perform. Results show the suitability of saliency and semantics to conceptualize and develop an OOA-based information extraction formalism for complex paper maps. Similarly, the created OOA rule-sets robustly and reliably extracted thematic information that corresponds to targeted map objects, with accuracies of 95% for the hydrographic layer, 97% and 92% for correctness and completeness of symbols respectively, 70% for texts and 55% for contour lines. Interestingly, transferability of the once-created rule-set proved realistic after testing it on a different map section of the same map series with slight modifications. Empirical observation of the developed method reveals that the OOA-based procedure was swifter than the manual method and is thus useful for environmental modelling/monitoring programs. More importantly, the method is suited for the applied earth sciences, especially disaster risk management programs where rapid understanding and mapping of multi-temporal hazards, elements at risk and vulnerability assessment are increasingly demanded. Therefore, the increasing demand for quick insights on hazards and risk assessment over time represents a critical milestone that the OOA-based method and the extracted information can help achieve, since relevant information can be rapidly unlocked from historical paper maps using this approach. Index Terms—Information extraction, paper maps, object-oriented analysis, saliency, semantics, transferability, robustness and reliability
... Features of character templates are used to localize touching characters. In [4], Cao and Tan have proposed a three-stage method, including sub-layer separation, solid graphical component removal and dashed line removal, for text-graphics segmentation. It has been shown that long graphical components with touching text can be decomposed into short segments by breaking the graphical components at the intersection points, thereby decreasing the computational load [4]. To classify each component as text or graphics, Hoang and Tabbone [5] have used the morphological component analysis (MCA) method, based on sparse representation. MCA allows the separation of the text and graphics features contained in an image when these features present divergent morphological aspects, and promoting sparse representations of these features facilitates this separation. ...
Conference Paper
Full-text available
The present paper proposes a technique for text-graphics separation of geographical maps. The novelty comes from the fact that the process is based on a structural representation of the map and on a decision tree design using a priori knowledge. In the proposed method, every map image is vectorized in order to extract a set of features for characterizing text and graphics. Vectorization provides structural primitives. We associate features to these structural primitives. A decision tree is then designed by an expert to discriminate text and graphics in map images, considering the features extracted from the vectorized images. This method provides a binary decision for every vectorized component, classifying the components into graphic or text. The proposed method was tested on a Bangla (a popular Indian regional language) maps dataset composed of a set of grey level images. The proposed text-graphic separation method provides 72.6% and 67.01% character and word-level text extraction accuracy respectively, when tested on map images.
... If the ED designs are not in compliance with the ISO's standard ED formatting structures, the accuracy of the built-in OCR in CAD inspection software for text and character recognition could drop below 80%; however, using neural network models such as the EAST model and an LSTM model for text detection and recognition can increase accuracy to 86%, although these models still cannot identify overlapping text, symbols, and characters, as highlighted in Fig. 4 and previously discussed by Cao and Tan [54] and Roy et al. [55], due to the arbitrary lengths and sizes of text strings that describe symbols [4], [56]-[59]. Although the built-in OCR in CAD inspection software can make inspecting ED documents or images for text and characters easier, misinterpretation can still occur due to error-prone text interpretation. ...
Article
Full-text available
Engineering Drawing (ED) digitization is a crucial aspect of modern industrial processes, enabling efficient data management and facilitating automation. However, the accurate detection and recognition of ED elements pose significant challenges. This paper presents a comprehensive review of existing research on ED element detection and recognition, focusing on the role of neural networks in improving the analysis process. The study evaluates the performance of the YOLOv7 model in detecting ED elements through rigorous experimentation. The results indicate promising precision and recall rates of up to 87.6% and 74.4%, respectively, with a mean average precision (mAP) of 61.1% at IoU threshold 0.5. Despite these advancements, achieving 100% accuracy remains elusive due to factors such as symbol and text overlapping, limited dataset sizes, and variations in ED formats. Overcoming these challenges is vital to ensuring the reliability and practical applicability of ED digitization solutions. By comparing the YOLOv7 results with previous research, the study underscores the efficacy of neural network-based approaches in handling ED element detection tasks. However, further investigation is necessary to address the challenges above effectively. Future research directions include exploring ensemble methods to improve detection accuracy, fine-tuning model parameters to enhance performance, and incorporating domain adaptation techniques to adapt models to specific ED formats and domains. To enhance the real-world viability of ED digitization solutions, this work highlights the importance of conducting testing on diverse datasets representing different industries and applications. Additionally, fostering collaborations between academia and industry will enable the development of tailored solutions that meet specific industrial needs. Overall, this research contributes to understanding the challenges in ED digitization and paves the way for future advancements in this critical field.
... R. Cao and C. L. Tan [18] proposed a method for detecting and extracting text that touches graphics. The connection between text and graphics is resolved by combining line continuation with line width. ...
Article
Traditionally, paper documents were used for communication and data storage. A large number of old documents and books around the world are kept in archives and are threatened with vanishing. Thus, it is essential to preserve this heritage and make it available to everyone in an easily understandable form, and the management of paper documents must be done in a well-organized and integrated way. The ultimate key would be a computer that deals with a paper document as effectively as it does with other digital media. To involve computers in the processing of paper documents, it becomes necessary to convert hard copies of documents into soft copies. This conversion can be done through scanners or cameras, which store the paper documents in the form of document images. Accessibility is one of the major concerns of users. OCR (Optical Character Recognition) recognizes characters in a document, but OCR processes only the textual part of a document image, and non-text components are skipped. There are several applications (for example, a scanned copy of a technical book or journal may contain diagrams or tables, and official documents or application forms may contain special symbols or organization logos) where each part of the document is equally important. Document layout analysis (DLA) can be used for such tasks. DLA is the task of splitting document images into different sections such as scripts, pictures, charts, logos, symbols and tables. It is a complex problem because of the diversity of document structures. The major focus of this paper is a detailed study of common approaches and features used for splitting a document image into separate parts, and on giving future research directions for researchers.
... distortions in low-resolution images. This approach provides separation of text from non-text pixels and allows us to skip additional steps such as background line removal [61,63,69,70]. Some of the limitations of this approach, and concomitant future directions for research, are noted in our discussion. ...
Article
Full-text available
A great deal of information is contained within archival maps—ranging from historic political boundaries, to mineral resources, to the locations of cultural landmarks. There are many ongoing efforts to preserve and digitize historic maps so that the information contained within them can be stored and analyzed efficiently. A major barrier to such map digitizing efforts is that the geographic location of each map is typically unknown and must be determined through an often slow and manual process known as georeferencing. To mitigate the time costs associated with the georeferencing process, this paper introduces a fully automated method based on map toponym (place name) labels. It is the first study to demonstrate these methods across a wide range of both simulated and real-world maps. We find that toponym-based georeferencing is sufficiently accurate to be used for data extraction purposes in nearly half of all cases. We make our implementation available to the wider research community through fully open-source replication code, as well as an online georeferencing tool, and highlight areas of improvement for future research. It is hoped that the practical implications of this research will allow for larger and more efficient processing and digitizing of map information for researchers, institutions, and the general public.
... Similarly, considering map images, Cao et al. [7] initially perform a thinning operation on the large graphic components and then identify the junction points. At the junction points, each component is decomposed into its constituent segments. ...
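The thinning-plus-junction decomposition referred to in this excerpt can be illustrated with a simplified sketch: thin the component to a one-pixel skeleton, mark pixels with three or more skeleton neighbours as junctions, and delete them so the skeleton falls apart into its constituent segments. This is an illustration, not the cited algorithm.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize

def break_at_junctions(binary_mask):
    """Thin a binary component, locate junction pixels on the skeleton and
    remove them, leaving the skeleton split into separate segments."""
    skeleton = skeletonize(binary_mask > 0)
    # Count 8-connected skeleton neighbours of every skeleton pixel.
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    neighbour_count = convolve(skeleton.astype(np.uint8), kernel,
                               mode='constant', cval=0)
    junctions = skeleton & (neighbour_count >= 3)
    segments = skeleton & ~junctions     # skeleton broken into pieces
    return segments, junctions
```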
Article
Segmentation of a touching component to separate out its constituent text and non-text parts is always a very crucial but challenging task towards developing a comprehensive document image processing (DIP) system. This is because, irrespective of document types, either printed or handwritten, the non-text parts need to be suppressed first before processing the text parts through an optical character recognition (OCR) system. Although a good number of attempts have been made to address this issue for printed documents, almost none exist for regular handwritten document images. However, the appearance of touching components, where a non-text part gets joined with a text part, is a common issue in freestyle handwriting. To this end, in this work, we tailor-make a generative adversarial network (GAN) based model with a suitable loss function, which we name tsegGAN. We also prepare an in-house dataset by collecting touching components from different real-world handwritten documents to evaluate our model. Performance comparison of our model with state-of-the-art GAN models shows that tsegGAN outperforms the others by a significant margin.
... Approaches commonly used connected component analysis or sliding windows [1] for that matter. In particular, a family of approaches known as Text/Graphics Separation (TGS) methods [2] were used for drawings such as general purpose EDs [3], circuit diagrams, maps [4] and musical scores, with moderate success. ...
Conference Paper
Engineering drawings such as Piping and Instrumentation Diagrams contain a vast amount of text data which is essential to identify shapes, pipeline activities and tags, amongst others. These diagrams are often stored in an undigitised format, such as paper copies, meaning the information contained within the diagrams is not readily accessible to inspect and use for further data analytics. In this paper, we make use of the benefits of recent deep learning advances by selecting models for both text detection and text recognition, and apply them to the digitisation of text from within real-world complex engineering diagrams. Results show that 90% of text strings were detected, including vertical text strings; however, certain non-text diagram elements were detected as text. Text strings were obtained by the text recognition method for 86% of detected text instances. The findings show that whilst the chosen Deep Learning methods were able to detect and recognise text which occurred in simple scenarios, more complex representations of text, including those text strings located in close proximity to other drawing elements, were highlighted as a remaining challenge.
... Bhowmik et al. [18] developed a novel method for text/non-text separation for handwritten document images using rotation invariant texture features. Separation of the text string from graphics in mixed text/graphics document images is reported in [19][20]. A novel segmentation approach for document level text/non-text segmentation using textural information and neural network is reported in [21]. ...
Article
Full-text available
Text detection and localization from text-embedded natural images is still considered a challenging problem in complex imagery environments. Foreground object segmentation followed by classification is a popular approach for this task. Component-level object classification in a cluttered environment is therefore an important sub-problem. Appropriate extraction of foreground objects leads to effective classification that may certainly increase the performance of text detection. In this paper, a novel feature vector based on the area occupancy profile of equidistant pixels is reported for text/non-text classification. The generated feature descriptors are script invariant and effective in practical scenarios. The proposed feature set is evaluated on our dataset using five different pattern classifiers, and experimental results show that the said feature set yields more than 86% classification accuracy irrespective of scripts.
... All prevalent graphical text searching studies involve time-consuming and inefficient techniques. Correctly locating graphical text within an image, i.e. the graphical text localization algorithms discussed in [1,2,3,4], is used in content-based information retrieval. For accurate graphical text reading, it is important to separate graphical texts from the remaining non-text pixels. ...
... This is simply because we have not encountered such a problem within the collection of drawings we used for the experiment. However, the method can easily be expanded to take such a problem into consideration with methods such as [32]. Once the above heuristics are applied, all the elements of the drawings such as text, circles, dashed segments, etc. are extracted. ...
Conference Paper
Full-text available
Technical drawings are commonly used across different industries such as Oil and Gas, construction, mechanical and other types of engineering. In recent years, the digitization of these drawings has become increasingly important. In this paper, we present a semi-automatic and heuristic-based approach to detect and localise symbols within these drawings. This includes generating a labeled dataset from real-world engineering drawings and investigating the classification performance of three different state-of-the-art supervised machine learning algorithms. In order to improve the classification accuracy, the dataset was pre-processed using unsupervised learning algorithms to identify hidden patterns within classes. Testing and evaluating the proposed methods on a dataset of symbols representing one standard of drawings, namely Process and Instrumentation Diagrams (P&ID), showed very competitive results.
... Cheng and Liu [6] proposed a method based on the assumption that a line acting as an interfering curve in the text image must be detected and then separated from it. The graph representation of the input image is obtained using a thinning process. ...
Article
Full-text available
Text segmentation is a live research field with vast new areas to be explored. Separating the text layer from graphics is a fundamental step to exploit text and graphics information. The language used in the map is a challenging issue in the text layer separation problem. All current methods are proposed for non-Persian language maps. In Persian, text strings are composed of one or more subwords. Each subword is in turn composed of one to several letters connected together. Therefore, the components of text strings in Persian are more diverse in terms of size and geometric form than in English. Thus, the overlapping of Persian text and lines usually produces a complex structure that the existing methods cannot handle with the necessary efficiency. For this purpose, the variety of stroke widths in the input map is calculated, and the average line width of the graphics is then estimated by analyzing this stroke width content. After finding the average width of the graphical lines, we classify the complex structure into text and graphics at the pixel level. We evaluate our method on a variety of Persian maps with fully crossing text and graphics and show that promising results in terms of precision and recall (above 80% and 90%, respectively) are obtained.
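In the spirit of the stroke-width idea described in this abstract, though not the authors' algorithm, a crude split could be sketched as follows; the distance transform sampled on the skeleton serves as a stroke-width proxy, and width_factor is an assumed tuning parameter.

```python
import cv2
import numpy as np
from skimage.morphology import skeletonize

def split_by_stroke_width(binary, width_factor=0.5):
    """Estimate the dominant (graphics) line width from the distance
    transform along the skeleton, then keep skeleton pixels whose local
    width deviates strongly from it as text-candidate centrelines."""
    fg = (binary > 0).astype(np.uint8)
    dist = cv2.distanceTransform(fg, cv2.DIST_L2, 3)
    skel = skeletonize(fg > 0)
    widths = 2.0 * dist                               # width estimate, valid on the skeleton
    line_width = float(np.median(widths[skel]))       # dominant graphics line width
    # Skeleton pixels whose local width deviates strongly from the dominant
    # line width are kept as text-candidate centrelines; a full pixel-level
    # labelling would propagate these labels back onto the strokes.
    text_centrelines = skel & (np.abs(widths - line_width) > width_factor * line_width)
    return text_centrelines, line_width
```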
... Because of the removal of overlapping objects, text and line segments, some portions of the road network disappeared. We have applied structuring elements to reconstruct [13] the missing portions of the road. ...
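A minimal sketch of such a reconstruction step with structuring elements is shown below; the directional closings and the kernel length are assumptions and would depend on the map resolution and road width.

```python
import cv2

def reconstruct_road_gaps(road_mask, length=15):
    """Close small breaks left in the road mask after removing overlapping
    text and line segments, using horizontal and vertical morphological
    closings whose results are merged."""
    horiz = cv2.getStructuringElement(cv2.MORPH_RECT, (length, 1))
    vert = cv2.getStructuringElement(cv2.MORPH_RECT, (1, length))
    closed_h = cv2.morphologyEx(road_mask, cv2.MORPH_CLOSE, horiz)
    closed_v = cv2.morphologyEx(road_mask, cv2.MORPH_CLOSE, vert)
    return cv2.bitwise_or(closed_h, closed_v)
```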
Research
Full-text available
Extraction of Regions of Interest (ROI) from geographical map images is an important task of document analysis and recognition. The extracted segments are applied to different machine vision and embedded systems. The task is very complex because of overlapping objects, intersecting lines, etc. in the map. Keeping this in mind, the present thesis paper describes two methods that have been applied to extract ROIs for both the road network and the waterway from geographical maps: one is color-based segmentation applying K-means clustering and the other is template-based matching, which overcomes the previous limitations. Different from the existent methods, these proposed approaches are efficient both in segmentation results and in further reconstruction. Our experimental results are close to human perception; therefore our methods provide better and more robust performance than either of the individual methods. We hope these methods will find diverse applications in ROI extraction from geographical maps and also in image analysis.
... Here again, our maps are mainly composed of black connected components on a noisy background, and a lot of overlapping text and graphics exists. Modelling each component in a generic way would require modelling all the different kinds of details in all maps, finally obtaining an overfitted model, or manually post-processing images as in [3]. ...
... Maps consist of many simple elements: points, lines, text and complex objects, for instance areas with points, lines and text labels. Frequent attempts of automatic extraction were related to map text (Myers et al., 1996;Luyang et al., 2000;Cao and Tan, 2001;Pezeshk and Tutwiler, 2011;Chiang and Knoblock, 2015) and linear objects, e.g. contour lines that were later processed into Digital Elevation Models, rivers or roads (Kaneko, 1992;Arrighi and Soille, 1999;Wu et al., 2009;Ghircoias and Brad, 2011;Oka et al., 2012). ...
Article
Full-text available
This study aimed to obtain accurate binary forest masks which might be directly used in analysis of land cover changes over large areas. A sequence of image processing operations was conceived, parameterized and tested using various topographic maps from mountain areas in Poland and Switzerland. First, the input maps were filtered and binarized by thresholding in Hue-Saturation-Value colour space. The second step consisted of a set of morphological image analysis procedures leading to final forest masks. The forest masks were then assessed and compared to manual forest boundary vectorization. The Polish topographical map published in the 1930s showed low accuracy which could be attributed to methods of cartographic presentation used and degradation of original colour prints. For maps published in the 1970s, the automated forest extraction performed very well, with accuracy exceeding 97%, comparable to accuracies of manual vectorization of the same maps performed by nontrained operators. With this method, we obtained a forest cover mask for the entire area of the Polish Carpathians, easily readable in any Geographic Information System software.
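The two-step procedure summarised in this abstract (HSV thresholding followed by morphological clean-up) could be sketched roughly as follows; the green hue range and kernel sizes are assumptions, not the values used in the study.

```python
import cv2
import numpy as np

def extract_forest_mask(bgr_map, hsv_low=(35, 40, 40), hsv_high=(85, 255, 255),
                        open_size=3, close_size=7):
    """Threshold a scanned map in HSV colour space and clean the result with
    morphological opening and closing to obtain a binary forest mask."""
    hsv = cv2.cvtColor(bgr_map, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(hsv_low), np.array(hsv_high))
    # Opening removes isolated speckles (symbols or text printed in green).
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                                      (open_size, open_size)))
    # Closing fills small holes inside forest polygons.
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                                      (close_size, close_size)))
    return mask
```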
... There have been other attempts to introduce mathematical tools to the field of epigraphy, but this discipline of research is still in its infancy (recent work on digital epigraphy can be found in [8,17,22,28,29,43]; some works related to Iron Age Hebrew epigraphy can be found in [11][12][13][38][39][40]). Nevertheless, those attempts are not connected directly to the techniques presented below. There are works dealing with the reconstruction of handwritten characters damaged as a result of graphics removal (e.g., see the works [1,3,6,23]). However, the deficiencies dealt with in these articles are relatively easy to model, in contrast to the case of natural deterioration processes. ...
Article
Full-text available
This work suggests a new variational approach to the task of computer-aided restoration of incomplete characters residing in a highly noisy document. We model character strokes as the movement of a pen with a varying radius. Following this model, a cubic spline representation is utilized to perform gradient descent steps, while maintaining interpolation at some initial (manually sampled) points. The proposed algorithm was utilized in the process of restoring approximately 1000 ancient Hebrew characters (dating to ca. 8th-7th century BCE), some of which are presented herein and show that the algorithm yields plausible results when applied to deteriorated documents.
... Moreover, heatmap recognition has not been extensively researched. Relevant work in the related area of map recognition includes the use of knowledge of the colourisation schemes in maps for automatically segmenting them based on their semantic contents (e.g., roads) [9], and the development of techniques for improving segmentation quality of text and graphics in colour maps through the cleaning up of possible errors (e.g., dashed lines) [2]. Map recognition has also been investigated at TRECVID (http://trecvid. ...
Conference Paper
This work proposes a framework for the discovery of environmental Web resources providing air quality measurements and forecasts. Motivated by the frequent occurrence of heatmaps in such Web resources, it exploits multimedia evidence at different stages of the discovery process. Domain-specific queries generated using empirical information and machine learning driven query expansion are submitted both to the Web and Image search services of a general-purpose search engine. Post-retrieval filtering is performed by combining textual and visual (heatmap-related) evidence in a supervised machine learning framework. Our experimental results indicate improvements in the effectiveness when performing heatmap recognition based on SURF and SIFT descriptors using VLAD encoding and when combining multimedia evidence in the discovery process.
... This means that the data from this unit may undergo further processing consistent with the application under consideration. For example, the application can be either text and graphics separation (47)(48)(49)(50)(51)(52)(53) or simply pruning by removing small connected components. In other words, to focus on graphics efficiently, one needs to separate texts from the document images. ...
Chapter
The chapter focuses on one of the key issues in document image processing i.e., graphical symbol recognition. Graphical symbol recognition is a sub-field of a larger research domain: pattern recognition. The domain covers several approaches (i.e., statistical, structural and syntactic) and specially designed symbol recognition techniques inspired by real-world industrial problems. The chapter, in general, contains research problems, state-of-the-art methods that convey basic steps as well as prominent concepts or techniques and research standpoints/directions that are associated with graphical symbol recognition.
... Toponym recognition in scanned maps is an area of active research. The vast majority of this work, however, focuses on contemporary maps. Cao and Tan (2002), for example, present an approach that separates text and graphics in scanned maps, and subsequently feeds the extracted text into state of the art OCR software for toponym identification. In similar work, Velázquez and Levachkine (2003) propose a refined approach to enhance separation between overlapping text and graphics, including c ...
Article
Full-text available
Present-day digitization methods produce data that is semantically opaque; that is, to a machine, a digitized map is merely a collection of bits and bytes. The area it depicts, the places it mentions, any text contained within legends or written on its margins remain unknown - unless a human appraises the image and manually adds this information to its metadata. This problem is especially severe in the case of old maps: these are typically handwritten, may contain text in varying orientations and sizes, and can be in a bad condition due to varying levels of deterioration or damage. As a result, searching for the contents of these documents remains challenging, which makes them hard to discover for users, unusable for matching, processing and analysis, and thus effectively lost to many forms of public, scientific or commercial utilization. Fully automatic detection and transcription of place names and legends is, likely, not achievable with today’s technology. We argue, however, that semi-automated methods can eliminate much of the tedious effort required to annotate map scans entirely by hand. In this paper, we showcase early work on semi-automatic place name annotation. In our experiment, we utilize open source tools to identify potential locations on the map representing toponyms. We present how, in next steps, we aim to extend our experiment by exploiting the spatial layout of identified candidates to deduce possible place names based on existing toponym lists. Ultimately, our goal is to combine this work with a toolset for manual image annotation into a convenient online environment. This will allow curators, researchers, and potentially also the general public to “tag” and annotate toponyms on digitised maps rapidly.
Article
Oil & Gas facilities are extremely huge and have complex industrial structures that are documented using thousands of printed sheets. During the last years, it has been a tendency to migrate these paper sheets towards a digital environment, with the final end of regenerating the original computer-aided design (CAD) projects which are useful to visualise and analyse these facilities through diverse com- puter applications. Usually, this was done manually by re-sketching each page using CAD applications. Nevertheless, some applications have appeared which generate the CAD document automatically given the paper sheets. In this last case, the final document is always verified by an engineer due to the need of being a zero-error process. Since the need of an engineer is absolutely accepted, we present a new method to reduce the required engineer working time. This is done by highlighting the digitised components in the CAD document that the automatic method could have incorrectly identified. Thus, the engineer is required only to look at these components. The experimental section shows our method achieves a reduction of approximately 40% of the human effort keeping a zero-error proces.
Chapter
Ancient maps are a historical and cultural heritage widely recognized as a very important source of information; for dialectological research in particular, this cartographical heritage provides first-rate data. However, exploiting such maps is quite a difficult task, and we focus our attention on this major issue. In this paper, we consider the Linguistic Atlas of France (ALF), built between 1902 and 1910, and we propose an original approach using a tree of connected components to separate the content into layers, facilitating the extraction, analysis, viewing and diffusion of the data contained in these ancient linguistic atlases.
Chapter
This chapter provides a fundamental study of graphics recognition systems. It covers the pipeline from data acquisition and processing to data representation, recognition, retrieval, and spotting. It further discusses datasets and their availability (for research purposes) as well as the way graphics recognition systems are validated (validation protocols).
Article
Full-text available
Detection of the transportation network from geographical map images is an important task of document analysis and recognition. The extracted segments are applied to different machine vision and embedded systems. The task is very complex because of overlapping objects, intersecting lines, etc. in the map. Keeping this in mind, the present thesis paper describes an adaptive method which has been applied to extract the waterway (an important portion of the transportation network) and which overcomes the previous limitations. Different from the existent methods, the proposed approach is efficient both in segmentation results and in further reconstruction. The experimental results are close to human perception; therefore this method provides better and more robust performance than either of the individual methods. We hope this method will find diverse applications in automatic waterway extraction from geographical maps and also in image analysis.
Article
In optical character recognition, text strings should be extracted from images first. But only complete text strings can accurately express the meanings of the words, so the extracted individual characters should be grouped into text strings before recognition. There are lots of text strings in topographic maps, and these texts consist of multi-colored, multi-sized and multi-oriented characters, which the existing methods cannot effectively group. In this paper, a dynamic character grouping method is proposed to group the characters into text strings based on four consistency constraints, namely color, size, spacing and direction. Characters in the same word have similar colors, sizes and inter-character distances, and they also lie on curved lines with limited bending, whereas characters of different words do not. Based on these features of the characters, the background pixels around the characters are expanded to link the characters into text strings. In this method, due to the introduction of the color consistency constraint, characters with different colors can be grouped well. This method can also deal with curved character strings more accurately through the improved direction consistency constraint. The final experimental results show that this method can group the characters more efficiently, especially when the beginning or end characters of a word are close to the characters of other words.
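A very rough analogue of the linking step described above, leaving out the colour, size and direction consistency checks, is to dilate the character mask so that closely spaced characters merge into one component per candidate string; the sketch below illustrates only this simplified idea, and link_distance is an assumed spacing bound.

```python
import cv2
import numpy as np

def link_characters_by_dilation(char_mask, link_distance=7):
    """Dilate the binary character mask so that nearby characters merge,
    then label the merged components as candidate text strings and map
    every original character pixel to its string label."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                       (2 * link_distance + 1,
                                        2 * link_distance + 1))
    linked = cv2.dilate(char_mask, kernel)
    n, labels = cv2.connectedComponents(linked)
    string_labels = np.where(char_mask > 0, labels, 0)
    return n - 1, string_labels     # number of candidate strings, label image
```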
Article
With the fast development of computer science, traditional manual processing of documents and literature has been replaced by computer auto-digitization. Computer auto-digitization is a popular technique that not only has the merit of low labor cost, but also offers comprehensive functional extensions (multi-status, covering detection, recognition, classification and storage). However, the problem is centered on reliability and adaptability; poor compatibility and error-proneness are common. In this paper, taking a complex material, the civil aerial meteorological map, as the experimental target, we develop a novel system for map drawing analysis and introduce its detailed algorithms. The superiority of the algorithms and the efficiency of the system in generating meaningful interpretations are clear from the experimental results.
Conference Paper
In this work, we propose a novel method for robust, scale and rotation independent text/graphics separation for early maps. We apply a connected component analysis with density, minimum and maximum diameter as main features. In addition, we use a combined threshold region for the density and the ratio of maximum and minimum diameter, extended by an analysis of neighboring components to recognize text with large variations in style, size and orientations. Our method reaches an F1-score of 0.73 which is 0.19 higher than the 0.54 achieved by a state-of-the-art approach from the literature on the same test data set.
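A sketch of such a connected-component filter, using the bounding-box fill as a stand-in for density and the major/minor axis lengths as stand-ins for the maximum and minimum diameters, is given below; the threshold region values are assumptions, not those of the cited method.

```python
from skimage.measure import label, regionprops

def text_candidates_by_cc_features(binary, density_range=(0.1, 0.9),
                                   ratio_max=8.0):
    """Label connected components and keep those whose density and
    max/min diameter ratio fall inside an accepted threshold region.
    Returns the labels of components accepted as text candidates."""
    lbl = label(binary > 0)
    accepted = []
    for region in regionprops(lbl):
        h = region.bbox[2] - region.bbox[0]
        w = region.bbox[3] - region.bbox[1]
        density = region.area / float(h * w)          # fill of the bounding box
        dmax = region.major_axis_length
        dmin = max(region.minor_axis_length, 1e-6)
        if density_range[0] <= density <= density_range[1] and dmax / dmin <= ratio_max:
            accepted.append(region.label)
    return accepted
```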
Article
In topographic maps, only complete text strings can accurately express the properties of the geographic elements, so individual characters should be grouped into text strings before recognition. This paper presents a novel character grouping method based on the graph model. In this method, undirected graphs are used to describe different words, where the color and size of the characters serve as the properties of the nodes, while the distance and angle between the characters serve as the weights of the edges connecting pairs of characters. Therefore, the nodes can be connected to construct undirected graphs according to their properties. Then the constructed graphs are simplified according to the weights of the edges. Finally, we obtain the final results corresponding to the grouped characters. Experimental results show that this method can in particular group characters with significantly wide spacing. Moreover, it achieves higher efficiency by using graph processing instead of image processing.
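A simplified sketch of the graph-model idea, omitting the colour and angle constraints, is shown below: characters become nodes, edges connect close and similarly sized characters, and connected components of the pruned graph are taken as words. The thresholds are assumptions.

```python
import networkx as nx
import numpy as np

def group_characters_graph(chars, dist_ratio=2.5, size_ratio=1.5):
    """Build an undirected graph over characters and return connected
    components as candidate words.

    chars: list of dicts with 'centroid' (x, y) and 'height'."""
    g = nx.Graph()
    g.add_nodes_from(range(len(chars)))
    for i in range(len(chars)):
        for j in range(i + 1, len(chars)):
            ci = np.asarray(chars[i]['centroid'], dtype=float)
            cj = np.asarray(chars[j]['centroid'], dtype=float)
            hi, hj = chars[i]['height'], chars[j]['height']
            # Edge only if the pair is close relative to character size
            # and the two characters have similar heights.
            close = np.linalg.norm(ci - cj) <= dist_ratio * max(hi, hj)
            similar = max(hi, hj) / max(min(hi, hj), 1e-6) <= size_ratio
            if close and similar:
                g.add_edge(i, j)
    return [sorted(c) for c in nx.connected_components(g)]
```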
Article
The color layers of contour lines separated from a scanned topographic map are the basis of contour-line extraction, but separating them well is difficult because of color aliasing and mixed colors. This paper focuses on contour-line color layer separation and presents a novel approach based on fuzzy clustering and Single-prototype Region Growing for Contour-line Layer (SRGCL). It targets scanned topographic maps on which contour lines are abundant and densely distributed, as in hilly or mountainous regions, where contour lines form the largest proportion of linear features and their separation is the most difficult task. The approach proceeds as follows. First, line features are extracted from the map to reduce interference from area features during fuzzy clustering. Second, a fuzzy clustering algorithm produces the membership matrix of the pixels in the line map. Third, based on the membership matrix, the most-similar and second-most-similar prototypes of each pixel are obtained and used as indicators in SRGCL. SRGCL uses spatial relationships and the fuzzy similarity of color features to overcome the misclassification of ambiguous pixels, and its focus on a single contour-line layer improves segmentation accuracy relative to general segmentation methods. We verified the algorithm on several USGS historical maps; the experimental results show that it produces contour-line color layers with good continuity and little noise, confirming its improvement over two general segmentation methods.
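The fuzzy-clustering step can be illustrated with a plain fuzzy c-means implementation that returns the membership matrix and the most- and second-most-similar prototypes per pixel; the SRGCL region growing itself is not shown. Function names and parameters here are assumptions for the sketch.

```python
import numpy as np

def fcm_memberships(pixels, c=6, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means on an (N, 3) array of color samples.

    Returns the (N, c) membership matrix and the (c, 3) cluster prototypes.
    """
    rng = np.random.default_rng(seed)
    u = rng.random((len(pixels), c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(iters):
        um = u ** m
        protos = (um.T @ pixels) / um.sum(axis=0)[:, None]            # prototype update
        d = np.linalg.norm(pixels[:, None, :] - protos[None, :, :], axis=2) + 1e-9
        inv = d ** (-2.0 / (m - 1))                                    # membership update
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, protos

def top_two_prototypes(u):
    """Most-similar and second-most-similar prototype index for every pixel."""
    order = np.argsort(u, axis=1)
    return order[:, -1], order[:, -2]
```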
Conference Paper
Text lines in graphical documents (e.g., maps and engineering drawings) and artistic documents are often laid out along curves to indicate different locations or symbols. For optical character recognition of such documents, individual text lines must be extracted and recognized. Because the characters in such non-structured layouts are multi-oriented, word recognition is a challenging task. In this paper, we present an approach to recognizing scale- and orientation-invariant text words in graphical documents using Hidden Markov Models (HMM). First, a line extraction method based on the foreground and background information of the text components segments the text lines; a water reservoir concept is used to exploit the background information effectively. For recognition of curved text lines, a sliding-window path is estimated, and features extracted from the sliding window are fed to the HMM for recognition. A local gradient histogram (LGH) based frame-wise feature is used in the HMM. The approach is evaluated on a dataset of graphical words, with encouraging results.
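The frame-wise feature extraction can be sketched as follows: each frame cut along the estimated sliding-window path yields a local gradient histogram. The HMM training and decoding stages are omitted, and the function name and bin count are illustrative.

```python
import numpy as np

def lgh_features(window_frames, bins=8):
    """Local gradient histogram (LGH)-style features: one orientation histogram per frame.

    `window_frames` is a list of 2-D grayscale arrays (frames cut along the estimated
    sliding-window path); each frame yields a `bins`-dimensional descriptor.
    """
    feats = []
    for frame in window_frames:
        gy, gx = np.gradient(frame.astype(float))
        mag = np.hypot(gx, gy)
        ang = np.arctan2(gy, gx) % np.pi                 # orientation folded to [0, pi)
        hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
        feats.append(hist / (hist.sum() + 1e-9))         # normalize so frames are comparable
    return np.vstack(feats)
```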
Thesis
The Internet contains a very large amount of spatial data in the form of raster and vector maps depicting different parts of the world. The information contained in these maps, however, cannot be found automatically, because it is encoded by means of specific map elements; its semantics only become explicit when a viewer interprets them. Yet map information should be interpretable not only by humans but also by machines, if only because of the sheer volume of data to be interpreted. The automatic derivation of semantics from maps is referred to as automatic map interpretation, a process that makes the implicit knowledge of a map collection explicit. This thesis offers solutions in the form of such map interpretation.
The map interpretation in this work is carried out on vector maps found on the Internet. A dedicated web crawler is developed for the targeted search for such vector maps. The crawler is a search engine that looks specifically for the Shapefile format, which has become a de facto standard in the GIS field and in which vector maps are usually stored. To find as many Shapefiles as possible, the search is run on servers where the probability of finding Shapefiles is high; these servers are first located through a Google search for the keyword "shapefile download".
Map interpretation comprises procedures for interpreting map objects, map types and map scale. First, the procedure for interpreting the objects of a map is presented. The aim is to recognize objects automatically on the basis of their specific characteristics. Object recognition is based on the Self-Organizing Map (SOM) known from artificial intelligence. Map objects are divided into classes such as building footprints or road networks. For each class, characteristic features are identified and encoded in a form accessible to the SOM, here as a parameter vector. These parameter vectors form the input patterns learned in the SOM's training phase. After the input patterns of all object classes have been learned, the parameter vector of each object on a map is computed and fed into the SOM, which assigns the object to the corresponding class.
A further procedure interprets the map type. Maps are categorized by content and purpose into types such as river maps, road maps or contour maps. As for object interpretation, the SOM is used: input patterns representing the geometric characteristics of the map types are learned, derived both from the structure of individual objects and from the topology between the objects of a map. When a map is fed into the SOM, the SOM recognizes the corresponding map type from the learned input patterns. In addition, the file names of the maps and the content of the web pages on which they were found are available, and the thesis also investigates to what extent this additional information can support the interpretation of the map type.
Besides the interpretation of map objects and map types, the automatic interpretation of the map scale is a further procedure discussed in this thesis. Scale interpretation is pursued in two ways: through multiple representation and through levels of detail. In the first case, the scale can be derived from the representation used, since an identical object is depicted on maps in representations of varying fidelity to reality. In the second case, the scale can be derived from the level of detail, based on the fact that maps at different scales are depicted with different degrees of detail.
Conference Paper
Full-text available
In this chapter, we describe an automatic procedure for capturing features on old maps. Early maps contain specific information that allows trajectories over time and space to be reconstructed for land use/cover studies or analyses of urban development. The most common approach to extracting these elements requires manual digitizing, which greatly limits its use, so automatic methods are essential for establishing reproducible procedures. Capturing features automatically from scanned paper maps is a major challenge in GIS for several reasons: (1) many planimetric elements overlap, (2) the scanning procedure may result in poor image quality, and (3) the limited color range makes elements hard to distinguish. Building on the state of the art, we propose a method based on color image segmentation and unsupervised classification (the K-means algorithm) to extract forest features from the historical 'Map of France'. The first part of the procedure cleans the maps and eliminates elevation contour lines with filtering techniques. We then convert the image from RGB to the L*a*b color space to improve its uniformity. Finally, a post-processing step based on morphological operators and contextual rules cleans up the extracted features. Results show a high overall accuracy of the proposed scheme on different excerpts of this historical map.
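A rough sketch of the color-space conversion, K-means clustering and morphological clean-up, assuming the forest layer corresponds to one of the clusters (which cluster that is must be chosen by the user or by prototype-color matching); it is not the authors' exact pipeline.

```python
import numpy as np
from skimage import color, morphology
from sklearn.cluster import KMeans

def extract_color_class(rgb_image, n_clusters=5, target_cluster=0):
    """Cluster map colors in L*a*b space and clean the chosen layer morphologically."""
    lab = color.rgb2lab(rgb_image)                               # RGB -> L*a*b for better uniformity
    h, w, _ = lab.shape
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(lab.reshape(-1, 3)).reshape(h, w)
    mask = labels == target_cluster                              # pick the cluster of interest (e.g. forest green)
    mask = morphology.binary_opening(mask, morphology.disk(2))   # remove speckle
    mask = morphology.binary_closing(mask, morphology.disk(3))   # fill small holes
    return mask
```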
Article
Word searching in non-structural layouts such as graphical documents is difficult because of the arbitrary orientations of text words and the presence of graphical symbols. This paper presents an efficient indexing and retrieval approach for word searching in such documents. The proposed indexing scheme stores the spatial information of the text characters of a document in a character spatial feature table (CSFT); the spatial feature of a text component is derived from neighboring component information. The labeling of multi-scaled and multi-oriented characters is performed with support vector machines. For searching, the positional information of characters is obtained from the query string by splitting it into its possible character pairs. Each pair is used to look up the positions of the corresponding text in the document via the CSFT; the retrieved text components are then joined into sequences by spatial information matching, and a string matching algorithm matches the query word against the character-pair sequences. Experimental results are presented on two datasets of graphical documents, a maps dataset and a seal/logo image dataset, and show that the method effectively retrieves query words from unconstrained document layouts of arbitrary orientation.
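The character-pair lookup can be sketched as a simple spatial index keyed by character-label pairs; the classifier, the sequence joining and the final string matching of the paper are omitted. All names and the distance threshold are assumptions.

```python
from collections import defaultdict
from itertools import combinations
import math

def build_pair_index(characters, max_pair_dist=80.0):
    """Index spatially close character pairs by their sorted (label, label) key.

    `characters` is a list of (label, x, y) tuples, e.g. produced by a
    multi-oriented character classifier.
    """
    index = defaultdict(list)
    for (la, xa, ya), (lb, xb, yb) in combinations(characters, 2):
        if math.hypot(xa - xb, ya - yb) <= max_pair_dist:
            key = tuple(sorted((la, lb)))
            index[key].append(((xa, ya), (xb, yb)))
    return index

def query_word(index, word):
    """Return candidate pair locations for every adjacent character pair of the query word."""
    hits = {}
    for a, b in zip(word, word[1:]):
        hits[(a, b)] = index.get(tuple(sorted((a, b))), [])
    return hits
```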
Article
Full-text available
Raster maps are easily accessible and contain rich road information; however, converting the road information to vector format is challenging because of varying image quality, overlapping features, and typical lack of metadata (e.g., map geocoordinates). Previous road vectorization approaches for raster maps typically handle a specific map series and require significant user effort. In this paper, we present a general road vectorization approach that exploits common geometric properties of roads in maps for processing heterogeneous raster maps while requiring minimal user intervention. In our experiments, we compared our approach to a widely used commercial product using 40 raster maps from 11 sources. We showed that overall our approach generated high-quality results with low redundancy with considerably less user input compared with competing approaches.
Article
Text labels in maps provide valuable geographic information by associating place names with locations. This information is especially important in historical maps, which are often the only source of past information about the earth. Recognizing the text labels is challenging because heterogeneous raster maps have varying image quality and complex map contents; in addition, the labels within a map do not follow a fixed orientation and can have various font types and sizes. Previous approaches typically handle a specific type of map or require intensive manual work. This paper presents a general approach that requires a small amount of user effort to semi-automatically recognize text labels in heterogeneous raster maps. Our approach exploits a few examples of text areas to extract text pixels and employs cartographic labeling principles to locate individual text labels. Each text label is then rotated automatically to horizontal and processed by conventional OCR software for character recognition. We compared our approach with a state-of-the-art commercial OCR product using 15 raster maps from 10 sources. Our evaluation shows that our approach enabled the commercial OCR product to handle raster maps and together they produced significantly higher text recognition accuracy than the commercial OCR alone.
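A hedged sketch of the rotate-then-OCR step, assuming a single text label is available as a binary mask and that Tesseract (via `pytesseract`) stands in for the commercial OCR engine; the angle normalization is a rough heuristic, not the paper's procedure.

```python
import cv2
import numpy as np
import pytesseract   # assumes the Tesseract engine is installed

def recognize_label(binary_label):
    """Rotate a single text-label mask to horizontal and pass it to an OCR engine.

    `binary_label` is a uint8 image with text pixels = 255 on a black background.
    """
    pts = cv2.findNonZero(binary_label)
    (cx, cy), (w, h), angle = cv2.minAreaRect(pts)    # oriented bounding box of the label
    if w < h:                                         # rough normalization so text becomes horizontal
        angle += 90.0
    M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
    rows, cols = binary_label.shape[:2]
    upright = cv2.warpAffine(binary_label, M, (cols, rows), flags=cv2.INTER_NEAREST)
    # invert so the text is dark on a light background, which OCR engines prefer
    return pytesseract.image_to_string(255 - upright, config="--psm 7").strip()
```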
Book
This book covers up-to-date methods and algorithms for the automated analysis of engineering drawings and digital cartographic maps. The Non-Deterministic Agent System (NDAS) offers a parallel computational approach to such image analysis. The book describes techniques suitable for persistent and explicit knowledge representation for engineering drawings and digital maps. It also highlights more specific techniques, e.g., applying robot navigation and mapping methods to this problem. Also included are more detailed accounts of the use of unsupervised segmentation algorithms to map images. Finally, all these threads are woven together in two related systems: NDAS and AMAM (Automatic Map Analysis Module).
Article
Digitization of newspaper articles is important for registering historical events. Layout analysis of Indian newspapers is challenging because of varying font sizes, font styles and the irregular placement of text and non-text regions. In this paper we propose a novel framework for learning optimal parameters for text/graphics separation in the presence of complex layouts. The learning problem is formulated as an optimization problem solved with the EM algorithm, so that the parameters adapt to the nature of the document content.
Conference Paper
Environmental data are of utmost importance for human life, since weather conditions, air quality and pollen are strongly related to health and affect everyday activities. This paper addresses the discovery of air quality and pollen forecast Web resources, which are usually presented as heatmaps (graphical representations of matrix data with colors). We propose a discovery methodology that builds upon a general-purpose search engine and a novel post-processing heatmap recognition layer. The first step generates domain-specific queries, which are submitted to the search engine; the second applies image classification based on low-level visual features to identify Web sites that include heatmaps. Experimental results comparing various combinations of visual features show that relevant environmental sites can be recognized and retrieved efficiently.
Conference Paper
Focussed crawlers enable the automatic discovery of Web resources about a given topic by automatically navigating the Web link structure and selecting the hyperlinks to follow by estimating their relevance to the topic based on evidence obtained from the already downloaded pages. This work proposes a classifier-guided focussed crawling approach that estimates the relevance of a hyperlink to an unvisited Web resource based on the combination of textual evidence representing its local context, namely the textual content appearing in its vicinity in the parent page, with visual evidence associated with its global context, namely the presence of images relevant to the topic within the parent page. The proposed focussed crawling approach is applied towards the discovery of environmental Web resources that provide air quality measurements and forecasts, since such measurements (and particularly the forecasts) are not only provided in textual form, but are also commonly encoded as multimedia, mainly in the form of heatmaps. Our evaluation experiments indicate the effectiveness of incorporating visual evidence in the link selection process applied by the focussed crawler over the use of textual features alone, particularly in conjunction with hyperlink exploration strategies that allow for the discovery of highly relevant pages that lie behind apparently irrelevant ones.
Article
Full-text available
A one-pass parallel thinning algorithm based on a number of criteria, including connectivity, unit-width convergence, medial axis approximation, noise immunity, and efficiency, is proposed. A pipeline processing model is assumed for the development. Precise analysis of the thinning process is presented to show its properties, and proofs of skeletal connectivity and convergence are provided. The proposed algorithm is further extended to the derived-grid to attain an isotropic medial axis representation. A set of measures based on the desired properties of thinning is used for quantitative evaluation of various algorithms. Image reconstruction from connected skeletons is also discussed. Evaluation shows that the procedures compare favorably to others
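The paper's one-pass parallel algorithm is not reproduced here; as a stand-in, the snippet below shows the effect of thinning using scikit-image's `skeletonize`, which likewise reduces strokes to unit-width skeletons.

```python
import numpy as np
from skimage.morphology import skeletonize

def thin(binary):
    """Reduce foreground strokes to unit-width skeletons (stand-in for the paper's one-pass method)."""
    return skeletonize(binary.astype(bool))

# Example: a thick vertical bar thins to a single-pixel line.
bar = np.zeros((20, 20), dtype=bool)
bar[2:18, 8:12] = True
print(thin(bar).sum(), "skeleton pixels")
```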
Article
Full-text available
A system for interpretation of images of paper-based line drawings is described. Since a typical drawing contains both text strings and graphics, an algorithm has been developed to locate and separate text strings of various font sizes, styles, and orientations. This is accomplished by applying the Hough transform to the centroids of connected components in the image. The graphics in the segmented image are processed to represent thin entities by their core-lines and thick objects by their boundaries. The core-lines and boundaries are segmented into straight line segments and curved lines. The line segments and their interconnections are analyzed to locate minimum redundancy loops which are adequate to generate a succinct description of the graphics. Such a description includes the location and attributes of simple polygonal shapes, circles, and interconnecting lines, and a description of the spatial relationships and occlusions among them. Hatching and filling patterns are also identified. The performance of the system is evaluated using several test images, and the results are presented. The superiority of these algorithms in generating meaningful interpretations of graphics, compared to conventional data compression schemes, is clear from these results
Article
This paper describes the development of a system using pyramid to extract text strings from a mixed text/graphics image, such as a road map. The pyramid helps to identify and locate words or phrases in the image efficiently and quickly. The system is thus able to isolate the text from the graphics so that practical electronic versions of each kind can be treated and processed independently.
Conference Paper
A method for segmenting text that may be connected to graphics in engineering drawings is presented. It consists of three steps: growing individual character-box regions using a recursive merging scheme based on stroke linking; merging the detected character boxes into a text box and determining its orientation; and re-segmenting the text box into refined character boxes that can be input to an OCR subsystem. The method can segment dimensioning text as well as other classes of text, and it handles both isolated and touching characters aligned at any slant. The ability to segment characters that touch either each other or graphics, which is important for real-life drawings, is obtained by focusing on intermediate vector information rather than on the raw pixel data. We present the details of the algorithm and show both successful and unsuccessful examples from an experimental set of 36 dimensioning text boxes, in which a 94% segmentation rate was achieved with a 3% false-alarm rate.
Conference Paper
The global interpolation method (GIM) we propose evaluates segment pattern continuity and connectedness to produce characters with smooth edges while correctly interpreting blank or missing segments, e.g., when extracting a handwritten character that overlaps a border. Characters touching a border, for example, are extracted after the border itself is labeled and removed. The missing character segments are then interpolated based on segment continuity. Interpolated segments are relabeled and checked for a match against the original labeled pattern; if a match cannot be made, segments are re-interpolated until they can be identified. Experimental results show that global interpolation correctly interprets missing character segments and generates characters with smooth edges.
Conference Paper
Correct recognition of dashed lines is essential for high-level understanding of technical drawings. An automatic solution is quite difficult because of the limitations of machine vision algorithms. To promote the development of better techniques, a dashed line detection contest was held at the Pennsylvania State University during the First International Workshop on Graphics Recognition, August 9–11, 1995. The contest required automatic detection of dashed lines on test drawings at three difficulty levels (simple, medium and complex), which contained dashed and dash-dotted lines in straight and curved shapes, and even interwoven text. This paper presents the dashed line detection technique that won first place in the contest; it successfully detected the dashed lines in all drawings. The underlying mechanism is a sequential stepwise recovery of components that meet certain continuity conditions. Results of experiments are presented and discussed.
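The stepwise recovery under continuity conditions can be sketched as a greedy chaining of short segments whose gap and direction change stay within limits. The thresholds and the assumption that segments are listed roughly start-to-end are illustrative; the winning system also handles dash-dot patterns and curves.

```python
import numpy as np

def chain_dashes(segments, max_gap=15.0, max_angle=np.radians(10)):
    """Greedy stepwise recovery: chain segments whose gap and direction stay within limits.

    Each segment is ((x1, y1), (x2, y2)); returns lists of segment indices forming dashed lines.
    """
    def direction(s):
        (x1, y1), (x2, y2) = s
        v = np.array([x2 - x1, y2 - y1], float)
        return v / (np.linalg.norm(v) + 1e-9)

    used, chains = set(), []
    for i in range(len(segments)):
        if i in used:
            continue
        chain, used_now = [i], {i}
        end, d = np.array(segments[i][1], float), direction(segments[i])
        grown = True
        while grown:
            grown = False
            for j in range(len(segments)):
                if j in used or j in used_now:
                    continue
                gap = np.linalg.norm(np.array(segments[j][0], float) - end)
                angle = np.arccos(np.clip(abs(np.dot(d, direction(segments[j]))), 0.0, 1.0))
                if gap <= max_gap and angle <= max_angle:          # continuity conditions
                    chain.append(j)
                    used_now.add(j)
                    end, d = np.array(segments[j][1], float), direction(segments[j])
                    grown = True
                    break
        used |= used_now
        chains.append(chain)
    return chains
```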
Conference Paper
It is very common for the filled-in character images in form documents to touch, cross, or overlap the pre-printed line images. In that case it is not easy to extract the characters correctly, because their shapes are distorted by the line images. In this paper, we propose a new method to reconstruct character images damaged by the pre-printed lines of documents. The method consists of two stages: character decomposition and character reconstruction. In the decomposition stage, an input character is decomposed hierarchically into line-elements, which are the units of reconstruction. In the reconstruction stage, different reconstruction methods are used to restore the characters according to the four types of line-elements. To evaluate the performance of the proposed method objectively, we used simple recognition modules on CENPARMI handwritten digits and NIST handwritten alphabets. Experimental results showed that the difference in recognition rate between the original, undamaged characters and the characters reconstructed by the proposed method is within about 1%, and that the shapes of the reconstructed character images are almost the same as those of the originals.
Conference Paper
In this text, we briefly overview some of the basic issues and trends in the vectorization and segmentation of graphics, and provide a short list of relevant references on these topics. It is intended as an introduction to various papers on related topics, elsewhere in this collection.
Article
The global interpolation method we propose evaluates segment pattern continuity and connectedness to produce characters with smooth edges while interpreting blank or missing segments based on global label connectivities, e.g., when extracting a handwritten character that overlaps a border. Conventional character segmentation for characters overlapping a border concentrates on removing the thin border based on known format information rather than on extracting the character. This generates discontinuous segments that cause distortion during thinning and errors in direction codes, making the extracted character hard to recognize. In our method, characters contacting a border are extracted after the border itself is labeled and removed automatically, using a scheme for extracting the wavy and oblique borders that arise in fax communication. The missing character segments are then interpolated based on segment continuity. Interpolated segments are relabeled and checked for a match against the original labeled pattern; if a match cannot be made, segments are re-interpolated until they can be identified. Experimental results show that global interpolation correctly interprets missing character segments and generates characters with smooth edges.
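A morphological approximation of the idea, not the paper's label-based interpolation: after the border is removed, stroke gaps are bridged only inside the band the border used to occupy. A roughly horizontal border and the `bridge_len` value are assumptions for the sketch.

```python
import numpy as np
from scipy import ndimage

def interpolate_across_border(char_mask, border_mask, bridge_len=7):
    """Bridge character strokes broken by removing a (roughly horizontal) border.

    `char_mask`: character pixels after the border has been removed (bool array).
    `border_mask`: pixels that belonged to the removed border (bool array).
    """
    # Close small vertical gaps; a vertical structuring element spans a horizontal border band.
    closed = ndimage.binary_closing(char_mask, structure=np.ones((bridge_len, 1), bool))
    # Accept interpolated pixels only where the border used to be, so nothing else changes.
    return char_mask | (closed & border_mask)
```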
Article
The development and implementation of an algorithm for automated text string separation that is relatively independent of changes in text font style and size and of string orientation are described. It is intended for use in an automated system for document analysis. The principal parts of the algorithm are the generation of connected components and the application of the Hough transform in order to group components into logical character strings that can then be separated from the graphics. The algorithm outputs two images, one containing text strings and the other graphics. These images can then be processed by suitable character recognition and graphics recognition systems. The performance of the algorithm, both in terms of its effectiveness and computational efficiency, was evaluated using several test images and showed superior performance compared to other techniques.
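The centroid-based grouping can be sketched with a small point-wise Hough accumulator over connected-component centroids; collinear centroids fall into the same (theta, rho) cell and are grouped. The resolution and vote thresholds are illustrative, and the full algorithm's string-level post-processing is omitted.

```python
import cv2
import numpy as np

def collinear_centroid_groups(binary, n_theta=180, rho_res=5.0, min_votes=3):
    """Group connected-component centroids that vote for the same (theta, rho) line.

    `binary` is a foreground mask; returns lists of component indices that lie
    roughly on a common line, a simple stand-in for centroid-based string grouping.
    """
    n, _, _, centroids = cv2.connectedComponentsWithStats(binary.astype(np.uint8))
    pts = centroids[1:]                                   # drop the background component
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    votes = {}
    for idx, (x, y) in enumerate(pts):
        for t_i, t in enumerate(thetas):
            rho = x * np.cos(t) + y * np.sin(t)
            key = (t_i, int(round(rho / rho_res)))
            votes.setdefault(key, []).append(idx)
    groups, assigned = [], set()
    for key, members in sorted(votes.items(), key=lambda kv: -len(kv[1])):
        members = [m for m in members if m not in assigned]
        if len(members) >= min_votes:
            groups.append(members)
            assigned.update(members)
    return groups
```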
Conference Paper
The black layer is extracted from a USGS topographic map digitized at 1000 dpi. The connected components of this layer are analyzed and separated into line art, text, and icons in two passes. The paired street casings are converted to polylines by vectorization and associated with street labels from the character recognition phase. The accuracy of character recognition is shown to improve by taking account of the frequently occurring overlap of line art with street labels. The experiments show that complete vectorization of the black line-layer bitmap is the major remaining problem.
Conference Paper
The off-line handwritten characters recorded on prescribed form documents may be overwritten by the lines of the form documents. Overwritten characters should be isolated in order to be recognized more effectively. However, removal of the lines causes breaks in the overwritten characters. Consequently, a character restoration process is necessary. In this paper, the shape types of overwritten characters are analyzed and a method of restoring characters that have been broken by line removal is proposed. A 97% correct restoration ratio was obtained through this method
Conference Paper
The use of computer aided design requires line drawings and maps to be digitized and stored in databases. The input of line drawings and maps into databases requires vectorization of lines, and recognition of symbols and characters. The paper addresses two aspects related to the input process. The first aspect is an automatic algorithm for the separation of character strings from maps. The second aspect is an algorithm for line thinning. The proposed algorithms are based on directional morphology operations. The character string extraction algorithm is independent of font style, size, and language and is suitable for a variety of map styles with straight or curved lines. The presented experimental results demonstrate very good performance of the algorithms even in cases where the character strings touch or intersect lines in the map
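Directional morphology can be illustrated by opening the image with long linear structuring elements in a few orientations to detect lines and subtracting them, leaving mostly short, text-like strokes. The kernel length and the four orientations are assumptions for the sketch, not the paper's operator set.

```python
import cv2
import numpy as np

def remove_long_lines(binary, line_len=41):
    """Directional morphological opening: detect long linear structures and subtract them.

    `binary` is a uint8 image with foreground = 255. What remains is dominated by
    short, text-like strokes.
    """
    kernels = [
        cv2.getStructuringElement(cv2.MORPH_RECT, (line_len, 1)),   # horizontal
        cv2.getStructuringElement(cv2.MORPH_RECT, (1, line_len)),   # vertical
        np.eye(line_len, dtype=np.uint8),                           # 45 degrees
        np.flipud(np.eye(line_len, dtype=np.uint8)),                # 135 degrees
    ]
    lines = np.zeros_like(binary)
    for k in kernels:
        lines = cv2.bitwise_or(lines, cv2.morphologyEx(binary, cv2.MORPH_OPEN, k))
    return cv2.subtract(binary, lines)       # candidate character pixels
```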
Article
The contributions to document image analysis of 99 papers published in the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) are clustered, summarized, interpolated, interpreted, and evaluated
Article
An algorithm for text/graphics separation is presented in this paper. The basic principle of the algorithm is to erase non-text regions from mixed text and graphics engineering drawings, rather than to extract text regions directly. The algorithm can extract both Chinese and Western characters, dimensions, and symbols, and it places few restrictions on the kind of engineering drawing and the noise level. It is robust to text-graphics touching, text fonts, and text orientations.
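The erase-nontext principle can be sketched by the equivalent step of keeping only small, compact connected components and discarding the rest; the size and aspect thresholds are illustrative, not the paper's values.

```python
import cv2
import numpy as np

def erase_nontext(binary, max_size=60, max_aspect=8.0):
    """Erase components that look like graphics (large or very elongated), keep text-like ones.

    `binary` is a uint8 image with foreground = 255; thresholds are illustrative.
    """
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    keep = np.zeros_like(binary)
    for i in range(1, n):                     # skip the background label 0
        w, h = stats[i, cv2.CC_STAT_WIDTH], stats[i, cv2.CC_STAT_HEIGHT]
        aspect = max(w, h) / max(1, min(w, h))
        if max(w, h) <= max_size and aspect <= max_aspect:
            keep[labels == i] = 255           # small, compact components are likely characters
    return keep
```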
• D. Wang and S. N. Srihari, Analysis of form images, in Document Image Analysis.
• D. Dori, W. Liu, and M. Peleg, How to win a dashed line detection contest, in Graphics Recognition: Methods and Applications.
• D. S. Doermann, An introduction to vectorization and segmentation, in Graphics Recognition: Algorithms and Systems, K. Tombre and A. K. Chhabra (eds.), Lecture Notes in Computer Science 1389, Springer, pp. 1–8, 1998.