Mathias Seuret

  • University of Fribourg

About

87
Publications
13,205
Reads
1,407
Citations
Current institution
University of Fribourg

Publications (87)
Article
Full-text available
Handwritten document layout analysis is a fundamental step in digitizing scanned ancient documents for further processing (e.g., optical character recognition). So far, single branch-based fully convolutional networks (FCN) dominate this field. However, we contend that this task faces significant challenges, particularly in layouts with only semant...
Article
Full-text available
Traditional methods in handwritten text recognition primarily focus on generating basic transcriptions, which often fall short for in-depth humanities research. Our study enhances this by providing diplomatic transcriptions for German studies, meticulously reproducing the original manuscripts, including layout and expanded abbreviations. State-of-t...
Preprint
The imitation of cursive handwriting is mainly limited to generating handwritten words or lines. Multiple synthetic outputs must be stitched together to create paragraphs or whole pages, whereby consistency and layout information are lost. To close this gap, we propose a method for imitating handwriting at the paragraph level that also works for un...
Article
Full-text available
Transformers have emerged as the leading methods in natural language processing, computer vision, and multi-modal applications due to their ability to capture complex relationships and dependencies in data. In this study, we explore the potential of transformers as feature aggregators in the context of patch-based writer retrieval, with the objecti...
Chapter
This competition investigates the performance of glyph detection and recognition on a very challenging type of historical document: Greek papyri. The detection and recognition of Greek letters on papyri is a preliminary step for computational analysis of handwriting that can lead to major steps forward in our understanding of this major source of i...
Chapter
In this paper, we investigate the usage of fine-grained font recognition on OCR for books printed from the 15th to the 18th century. We used a newly created dataset for OCR of early printed books for which fonts are labeled with bounding boxes. We know not only the font group used for each character, but the locations of font changes as well. In bo...
Chapter
The Rey-Osterrieth Complex Figure Test (ROCFT) is a widely used neuropsychological tool for assessing the presence and severity of different diseases. It involves presenting a complex illustration to the patient who is asked to copy it, followed by recall from memory after 3 and 30 min. In clinical practice, a human rater evaluates each component o...
Chapter
Text-to-Image synthesis is the task of generating an image according to a specific text description. Generative Adversarial Networks have been considered the standard method for image synthesis virtually since their introduction. Denoising Diffusion Probabilistic Models have recently set a new baseline, with remarkable results in Text-to-Image s...
Chapter
Methods for detecting forged handwriting are usually based on the assumption that the forged handwriting is produced by humans. Authentic-looking handwriting, however, can also be produced synthetically. Diffusion-based generative models have recently gained popularity as they produce striking natural images and are also able to realistically mimic...
Article
Full-text available
Optical character recognition (OCR) has proved a powerful tool for the digital analysis of printed historical documents. However, its ability to localize and identify individual glyphs is challenged by the tremendous variety in historical type design, the physicality of the printing process, and the state of conservation. We propose to mitigate the...
Preprint
Full-text available
Text-to-Image synthesis is the task of generating an image according to a specific text description. Generative Adversarial Networks have been considered the standard method for image synthesis virtually since their introduction; today, Denoising Diffusion Probabilistic Models are setting a new baseline, with remarkable results in Text-to-...
Article
Full-text available
Historical documents contain essential information about the past, including places, people, or events. Many of these valuable cultural artifacts cannot be further examined due to aging or external influences, as they are too fragile to be opened or turned over, so their rich contents remain hidden. Terahertz (THz) imaging is a nondestructive 3D im...
Chapter
The analysis of digitized historical manuscripts is typically addressed by paleographic experts. Writer identification refers to the classification of known writers while writer retrieval seeks to find the writer by means of image similarity in a dataset of images. While automatic writer identification/retrieval methods already provide promising re...
Article
Full-text available
This paper presents a systematic literature review of image datasets for document image analysis, focusing on historical documents, such as handwritten manuscripts and early prints. Finding appropriate datasets for historical document analysis is a crucial prerequisite to facilitate research using different machine learning algorithms. However, bec...
Chapter
This paper studies the effect of using various synthesis-based data augmentation approaches on historical image data, particularly for font classification. Historical document image datasets often lack the appropriate size to train and evaluate deep learning models, motivating data augmentation and synthetic document generation techniques for cre...
Chapter
Binarization of document images is an important pre-processing step in the field of document analysis. Traditional image binarization techniques usually rely on histograms or local statistics to identify a valid threshold to differentiate between different aspects of the image. Deep learning techniques are able to generate binarized versions of the...
Preprint
Full-text available
We propose the use of fractals as a means of efficient data augmentation. Specifically, we employ plasma fractals for adapting global image augmentation transformations into continuous local transforms. We formulate the diamond square algorithm as a cascade of simple convolution operations allowing efficient computation of plasma fractals on the GP...
Preprint
Full-text available
This paper presents a systematic literature review of image datasets for document image analysis, focusing on historical documents, such as handwritten manuscripts and early prints. Finding appropriate datasets for historical document analysis is a crucial prerequisite to facilitate research using different machine learning algorithms. However, bec...
Chapter
This competition investigated the performance of historical document classification. The analysis of historical documents is a difficult challenge commonly solved by trained humanists. We provided three different classification tasks, which can be solved individually or jointly: font group/script type, location, date. The document images are provid...
Chapter
Recently, generative adversarial networks have allowed for big leaps in the realism of generated images in diverse domains, not least in handwritten text generation. The generation of realistic-looking handwritten text is important because it can be used for data augmentation in handwritten text recognition (HTR) systems or human...
Preprint
Recently, generative adversarial networks have allowed for big leaps in the realism of generated images in diverse domains, not least in handwritten text generation. The generation of realistic-looking handwritten text is important because it can be used for data augmentation in handwritten text recognition (HTR) systems or huma...
Chapter
In order to detect lesions on medical images, deep learning models commonly require information about the size of the lesion, either through a bounding box or through the pixel-/voxel-wise annotation of the lesion, which is in turn extremely expensive to produce in most cases. In this paper, we aim at demonstrating that by having a single central p...
Article
Books printed before 1800 present major problems for OCR. One of the main obstacles is the lack of diversity of historical fonts in training data. The OCR-D project, consisting of book historians and computer scientists, aims to address this deficiency by focussing on three major issues. Our first target was to create a tool that identifies font gr...
Preprint
This competition continues a line of competitions for writer and style analysis of historical document images. In particular, we investigate the performance of large-scale retrieval of historical document fragments in terms of style and writer identification. The analysis of historic fragments is a difficult challenge commonly solved by trained...
Chapter
Notarial instruments are a category of documents. A notarial instrument can be distinguished from other documents by its notary sign, a prominent symbol in the certificate, which also identifies the document’s issuer. Naturally, notarial instruments are underrepresented relative to other documents. This makes a classification difficult bec...
Chapter
Full-text available
Automatic writer identification is a common problem in document analysis. State-of-the-art methods typically focus on the feature extraction step with traditional or deep-learning-based techniques. In retrieval problems, re-ranking is a commonly used technique to improve the results. Re-ranking refines an initial ranking result by using the knowled...
Article
Full-text available
Tree-based classifiers provide easy-to-understand outputs. Artificial neural networks (ANN) commonly outperform tree-based classifiers; nevertheless, understanding their outputs requires specialized knowledge in most cases. The highly redundant architecture of ANN is typically designed through an expensive trial-and-error scheme. We aim at (1) inve...
Preprint
The type used to print an early modern book can give scholars valuable information about the time and place of its production as well as the printer responsible. Currently, type recognition is done manually using the shapes of 'M' or 'Qu' and the size of a type to look it up in a large reference work. This is reliable, but slow and requires speciali...
Preprint
Full-text available
Notarial instruments are a category of documents. A notarial instrument can be distinguished from other documents by its notary sign, a prominent symbol in the certificate, which also identifies the document's issuer. Naturally, notarial instruments are underrepresented relative to other documents. This makes a classification difficult bec...
Preprint
Automatic writer identification is a common problem in document analysis. State-of-the-art methods typically focus on the feature extraction step with traditional or deep-learning-based techniques. In retrieval problems, re-ranking is a commonly used technique to improve the results. Re-ranking refines an initial ranking result by using the knowled...
Article
Full-text available
For several years, great efforts have been made to catalogue and digitize the printed works of the 16th to 18th centuries published in the German-speaking world. Preparing their full-text transformation conceptually and technically is the overarching goal of the DFG project OCR-D, which is concerned with the further development of methods...
Conference Paper
Notarial instruments are a category of documents. A notarial instrument can be distinguished from other documents by its notary sign, a prominent symbol in the certificate, which also identifies the document's issuer. Naturally, notarial instruments are underrepresented relative to other documents. This makes a classification difficult bec...
Preprint
Full-text available
Most people think that their handwriting is unique and cannot be imitated by machines, especially not using completely new content. Current cursive handwriting synthesis is visually limited or needs user interaction. We show that subdividing the process into smaller subtasks makes it possible to imitate someone's handwriting with a high chance to b...
Chapter
In this paper, we introduce a mechanism for designing the architecture of a Sparse Multi-Layer Perceptron network, for classification, called ForestNet. Networks built using our approach are capable of handling high-dimensional data and learning representations of both visual and non-visual data. The proposed approach first builds an ensemble of ra...
Conference Paper
In this paper, we introduce a mechanism for designing the architecture of a Sparse Multi-Layer Perceptron network, for classification, called ForestNet. Networks built using our approach are capable of handling high-dimensional data and learning representations of both visual and non-visual data. The proposed approach first builds an ensemble of ra...
Chapter
Most people think that their handwriting is unique and cannot be imitated by machines, especially not using completely new content. Current cursive handwriting synthesis is visually limited or needs user interaction. We show that subdividing the process into smaller subtasks makes it possible to imitate someone’s handwriting with a high chance to b...
Preprint
Full-text available
This competition investigates the performance of large-scale retrieval of historical document images based on writing style. Based on large image data sets provided by cultural heritage institutions and digital libraries, providing a total of 20 000 document images representing about 10 000 writers, divided in three types: writers of (i) manuscript...
Conference Paper
Based on contemporary scripts, early printers developed a large variety of different fonts. While fonts may slightly differ from one printer to another, they can be divided into font groups, such as Textura, Antiqua, or Fraktur. The recognition of font groups is important for computer scientists to select adequate OCR models, and of high interest t...
Preprint
Global pooling layers are an essential part of Convolutional Neural Networks (CNN). They are used to aggregate activations of spatial locations to produce a fixed-size vector in several state-of-the-art CNNs. Global average pooling or global max pooling are commonly used for converting convolutional features of variable-size images to a fixed-size e...
Preprint
This paper introduces a new way for text-line extraction by integrating deep-learning based pre-classification and state-of-the-art segmentation methods. Text-line extraction in complex handwritten documents poses a significant challenge, even to the most modern computer vision algorithms. Historical manuscripts are a particularly hard class of doc...
Chapter
We propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering with the data used for training instead of generating attacks on trained models. Our network-agnostic method creates a backdoor during training which can be exploited at test time to force a neural network to exhibit abnormal behaviour. We demonstrate...
Preprint
Full-text available
We propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering with the data used for training instead of generating attacks on trained models. Our network-agnostic method creates a backdoor during training which can be exploited at test time to force a neural network to exhibit abnormal behaviour. We demonstrate...
Article
The point of this paper is to question typical assumptions in deep learning and suggest alternatives. A particular contribution is to prove that even if a Stacked Convolutional Auto-Encoder is good at reconstructing pictures, it is not necessarily good at discriminating their classes. When using Auto-Encoders, intuitively one assumes that features...
Conference Paper
We have developed a simple but powerful extension for two well-known line segmentation methods which makes them more robust when working on historical manuscripts with almost regular line spacing. Contrary to the intuitive impression that such manuscripts are easy to handle, existing methods and tools fail to correctly segment some columns, mainly...
Article
Full-text available
In this paper, we present a novel approach to layer-wise weight initialization of deep neural networks using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with random values, with greedy layer-wise pre-training (usually as a Deep Belief Network or as an auto-encoder), or by re-using the layers from a...
Preprint
In this paper, we present a novel approach to layer-wise weight initialization of deep neural networks using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with random values, with greedy layer-wise pre-training (usually as a Deep Belief Network or as an auto-encoder), or by re-using the layers from a...
Article
Full-text available
This paper presents a Convolutional Neural Network (CNN) based page segmentation method for handwritten historical document images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as one of the predefined classes. Traditional methods in this area rely on carefully hand-crafted features or large amounts of p...
Article
Historical documents usually have a complex layout, making them one of the most challenging types of documents for automatic image analysis. In the pipeline of automatic document image analysis (DIA), layout analysis is an important prerequisite for further steps including optical character recognition, script analysis, and image recognition. It ai...
Article
In this paper, we thoroughly investigate the quality of features produced by deep neural network architectures obtained by stacking and convolving Auto-Encoders. In particular, we are interested in the relation between their reconstruction score and their performance on document layout analysis. When using Auto-Encoders, intuitively one could assume...
Article
Full-text available
In this paper, we present a novel approach for initializing deep neural networks, i.e., by turning PCA into neural layers. Usually, the initialization of the weights of a deep neural network is done in one of the three following ways: 1) with random values, 2) layer-wise, usually as Deep Belief Network or as auto-encoder, and 3) re-use of layers fr...
Article
In historical manuscripts, humans can detect handwritten words, lines, and decorations with ease even if they do not know the language or the script. Yet for automatic processing this task has proven elusive, especially in the case of handwritten documents with complex layouts, which is why semiautomatic methods that integrate the human user i...
Conference Paper
Automatic layout analysis of historical documents has to cope with a large number of different scripts, writing supports, and digitalization qualities. Under these conditions, the design of robust features for machine learning is a highly challenging task. We use convolutional autoencoders to learn features from the images. In order to increase the...
Conference Paper
Full-text available
The term "historical documents" encompasses an enormous variety of document types considering different scripts, languages, writing supports, and degradation degrees. For automatic processing with machine learning and pattern recognition methods, it would be ideal to share labeled learning samples and trained statistical models across similar docum...
Conference Paper
Full-text available
In this paper, we present an unsupervised feature learning method for page segmentation of historical handwritten documents available as color images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as either periphery, background, text block, or decoration. Traditional methods in this area rely on carefu...
Conference Paper
We present a novel method for adding realistic degradations to historical document images in order to generate more training data. Degradation patches are extracted from other documents and applied to the target document in the gradient domain. Working in the gradient domain has not been done for this purpose in document image analysis so far. It...
Article
Full-text available
In this paper, we propose a new dataset and a ground-truthing methodology for layout analysis of historical documents with complex layouts. The dataset is based on a generic model for ground-truth presentation of the complex layout structure of historical documents. To extract the document contents uniformly, our model defines fi...
Conference Paper
We present a novel method for adding realistic degradations to historical document images in order to generate more training data. Degradation patches are extracted from other documents and applied to the target document in the gradient domain. Working in the gradient domain has not been done for this purpose in document image analysis so far. It...
Conference Paper
In this paper, we present an unsupervised feature learning method for page segmentation of historical handwritten documents available as color images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as either periphery, background, text block, or decoration. Traditional methods in this area rely on carefu...
Conference Paper
Automatic layout analysis of historical documents has to cope with a large number of different scripts, writing supports, and digitalization qualities. Under these conditions, the design of robust features for machine learning is a highly challenging task. We use convolutional autoencoders to learn features from the images. In order to increase t...
Article
Classification of the content of a scanned document as either printed or handwritten is typically tackled as a segmentation problem of pages into text lines or words. However, these methods are not applicable to documents where handwritten annotations overlay printed text. In this paper, we propose to treat the task as a pixel classification task, i....
