Tom Monnier's research while affiliated with Université Gustave Eiffel and other places

Publications (11)

Chapter
Approaches for single-view reconstruction typically rely on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry. We avoid all such supervision and assumptions by explicitly leveraging the consistency between images of different object instances. As a result, our method ca...
Article
Progress in the digitization of cultural assets leads to online databases that become too large for a human to analyze. Moreover, some analyses might be challenging, even for experts. In this paper, we explore two applications of computer vision to analyze historical data: watermark recognition and one-shot repeated pattern detection in artwork col...
Preprint
Approaches to single-view reconstruction typically rely on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry. We avoid all of these supervisions and hypotheses by leveraging explicitly the consistency between images of different object instances. As a result, our method...
Preprint
In this paper, we revisit the classical representation of 3D point clouds as linear shape models. Our key insight is to leverage deep learning to represent a collection of shapes as affine transformations of low-dimensional linear shape models. Each linear model is characterized by a shape prototype, a low-dimensional shape basis and two neural net...
Preprint
We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. Contrary to recent approaches that model image layers with autoencoder networks, we represent them as explicit transformations of a small set of prototypical images. Our model has three main components: (i) a set of object pro...
Preprint
We present docExtractor, a generic approach for extracting visual elements such as text lines or illustrations from historical documents without requiring any real data annotation. We demonstrate that it provides high-quality performance as an off-the-shelf system across a wide variety of datasets and leads to results on par with the state of the art when...
Article
The study of watermarks is a key step for archivists and historians as it enables them to reveal the origin of paper. Although highly practical, automatic watermark recognition comes with many difficulties and is still considered an unsolved challenge. Nonetheless, Shen et al. [2019] recently introduced a new approach for this specific task which s...
Preprint
Recent advances in image clustering typically focus on learning better deep representations. In contrast, we present an orthogonal approach that does not rely on abstract features but instead learns to predict image transformations and performs clustering directly in image space. This learning process naturally fits in the gradient-based training o...

Citations

... DOVE [68] learns coarse articulated objects by exploiting temporal information from videos with optical flow supervision. All these methods require object masks for supervision, except Unsup3D [70] which exploits bilateral symmetry for reconstructing roughly frontal objects, like faces, and UNICORN [42] which uses a progressive conditioning strategy with a heavily constrained bottleneck, leading to coarse reconstructions. Another emerging paradigm is to leverage generative models [7,46,50,56] that encourage images rendered from viewpoints sampled from a prior distribution to be realistic. ...
... Spatial warps, as implemented in [39], have proven useful for various tasks, e.g., automatic image rectification for text recognition [65], semantic segmentation [24], and the contextual synthesis of images [56,96] or videos [1,2,6,7,25,31,47,52,77,84,85]. Here, we parameterize the warp with thin-plate splines (TPS) [8], whose parameters are motion vectors sampled at a small set of control points. ...
... describe poles, trunks and branches. Recently, deep learning approaches proposed to train neural networks to recognize complex shapes [32], [33]. In order to train and test such networks, [34] proposed a dataset composed of annotated 3D CAD models comprising a large variety of objects. ...
... Web application. The ultimate goal of watermark recognition is to develop an application to simplify the search of watermarks. We developed a first version of such a web application [8] which can be accessed at https://filigranes.inria.fr/. ...
... Text line detection. Many open-source models have been proposed to tackle text line detection from historical documents, mainly ARU-Net [19], dhSegment [3], docExtractor [30] and Doc-UFCN [6]. These models have been pre-trained on multiple datasets [40,18], making them very efficient on a wide variety of historical documents. ...