Carlo Colombo

University of Florence | UNIFI · Dipartimento di Ingegneria dell'Informazione

Professor

About

133
Publications
28,028
Reads
2,071
Citations
Citations since 2017
26 Research Items
557 Citations
Introduction
My research work focuses on geometric image and video analysis and 3D computer vision, with applications to autonomous robotics, biomedicine and aids to disabled people, cultural heritage preservation and valorization, advanced human-machine interaction, multimedia, and information forensics (recent research projects: http://cvg.dsi.unifi.it/cvg/index.php?id=research)

Publications (133)
Article
Full-text available
We present a nonintrusive system based on computer vision for human-computer interaction in three-dimensional (3-D) environments controlled by hand pointing gestures. Users are allowed to walk around in a room and manipulate information displayed on its walls by using their own hands as pointing devices. Once captured and tracked in real-time using...
Article
Full-text available
Image analysis and computer vision can be effectively employed to recover the three-dimensional structure of imaged objects, together with their surface properties. In this paper, we address the problem of metric reconstruction and texture acquisition from a single uncalibrated view of a surface of revolution (SOR). Geometric constraints induced in...
Conference Paper
Full-text available
Broadcasters are demonstrating interest in systems that ease the process of annotating the huge amount of live and archived video materials. Exploitation of such assets is considered a key method for the improvement of production quality, and sport videos are one of the most marketable assets. In particular, in Europe, soccer is one of the most rel...
Conference Paper
Full-text available
We present an approach for camera calibration from the image of at least two circles arranged in a coaxial way. Such a geometric configuration arises in static scenes of objects with rotational symmetry or in scenes including generic objects undergoing rotational motion around a fixed axis. The approach is based on the automatic localization of...
Conference Paper
Full-text available
This paper discusses and compares the best and most recent local descriptors, evaluating them on increasingly complex image matching tasks, encompassing planar and non-planar scenarios under severe viewpoint changes. This evaluation, aimed at assessing descriptor suitability for real-world applications, leverages the concept of Approximated Overla...
Chapter
Full-text available
Image matching, as the task of finding correspondences in images, is the upstream component of vision and photogrammetric applications aiming at the reconstruction of 3D scenes, their understanding and comparison. Such applications are of special importance in the context of cultural heritage, as they can support archaeologists to digitally preserv...
Article
Full-text available
Assessing if an image comes from a specific device is fundamental in many application scenarios. The most promising techniques to solve this problem rely on the Photo Response Non Uniformity (PRNU), a unique trace left during image acquisition. A PRNU fingerprint is computed from several images of a given device, then it is compared with the probe...
Conference Paper
Full-text available
This paper presents a closed form solution for the problem of computing a set of projective cameras from the fundamental matrices of a given viewing graph. The approach is incremental, exploits trifocal constraints, and does not rely on either image or structure points. Represented by a vector of four parameters that uniquely ensure its consistency...
Article
Full-text available
Restoration of digital visual media acquired from repositories of historical photographic and cinematographic material is of key importance for the preservation, study and transmission of the legacy of past cultures to the coming generations. In this paper, a fully automatic approach to the digital restoration of historical stereo photographs is pr...
Chapter
Full-text available
Restoration of digital visual media acquired from repositories of historical photographic and cinematographic material is of key importance for the preservation, study and transmission of the legacy of past cultures to the coming generations. In this paper, a fully automatic approach to the digital restoration of historical stereo photographs is pr...
Article
Full-text available
This paper explores content-based image registration as a means of dealing with and understanding better Electronic Image Stabilization (EIS) in the context of Photo Response Non-Uniformity (PRNU) alignment. A novel and robust solution to extrapolate the transformation relating the different image output formats for a given device model is proposed...
Article
Full-text available
SIFT is a classical hand-crafted, histogram-based descriptor that has deeply affected research on image matching for more than a decade. In this paper, a critical review of the aspects that affect SIFT matching performance is carried out, and novel descriptor design strategies are introduced and individually evaluated. These encompass quantization,...
Article
Full-text available
This paper introduces an extension of the sGLOH2 local image descriptor inspired by RootSIFT "square rooting" as a way to indirectly alter the matching distance used to compare the descriptor vectors. The extended descriptor, named RootsGLOH2, achieved the best results in terms of matching accuracy and robustness among the latest state-of-the-art...
Conference Paper
Full-text available
This paper proposes a novel approach for registering the PRNU pattern between different camera acquisition modes by relying on the imaged scene content. First, images are aligned by establishing correspondences between local descriptors: The result can then optionally be refined by maximizing the PRNU correlation. Comparative evaluations show that...
Article
Full-text available
The definition of valid and robust methodologies for assessing the authenticity of digital information is nowadays critical to counter social manipulation through the media. A key research topic in multimedia forensics is the development of methods for detecting tampered content in large image collections without any human intervention. This paper...
Conference Paper
Full-text available
Matching with local image descriptors is a fundamental task in many computer vision applications. This paper describes the WISW contest held within the framework of the CAIP 2019 conference, aimed at benchmarking recent descriptors in challenging planar and non-planar real image matching scenarios. According to the contest results, the descriptors...
Article
Full-text available
Tampered images nowadays spread over all visual media, influencing our judgement in many aspects of our life. This is particularly critical for face splicing manipulations, where recognizable identities are put out of context. To counter these activities on a large scale, automatic detectors are required. In this paper, we present a novel method fo...
Conference Paper
Full-text available
Tracking the structural evolution of a site has important fields of application, ranging from documenting the excavation progress during an archaeological campaign, to hydro-geological monitoring. In this paper, we propose a simple yet effective method that exploits vision-based reconstructed 3D models of a time-changing environment to automaticall...
Article
Full-text available
We present a practical, robust, and effective pipeline to compute a high-resolution (HR) image of the corneal endothelium starting from a low-resolution (LR) video sequence obtained with a general purpose slit lamp biomicroscope. An image quality typical of dedicated and more expensive confocal microscopes is achieved via software magnification by...
Conference Paper
Full-text available
The analysis of early photographic sources is fundamental for documenting and understanding the evolution of a city so rich in history and art as Florence. Indeed, by the 1860s several photographers used to work in town, and their images (often obtained through stereoscopic set-ups) can help us to reconstruct Florence in 3D as it was by the time of...
Article
Full-text available
This paper introduces a new compositional framework for classifying color correction methods according to their two main computational units. The framework was used to dissect fifteen among the best color correction algorithms and the computational units so derived, with the addition of four new units specially designed for this work, were then rea...
Article
Full-text available
sGLOH (shifting GLOH) is a histogram-based keypoint descriptor that can be associated with multiple quantized rotations of the keypoint patch without any recomputation. This property can be exploited to define the best distance between two descriptor vectors, thus avoiding computing the dominant orientation. In addition, sGLOH can reject incongruous c...
Article
Full-text available
In this paper we present a stereo visual odometry system developed for autonomous underwater vehicle localization tasks. The main idea is to make use of only highly reliable data in the estimation process, employing a robust keypoint tracking approach and an effective keyframe selection strategy, so that camera movements are estimated with high acc...
Article
Full-text available
Although quite recent as a forensic research domain, computer vision analysis of scenes is likely to become more and more important in the near future, thanks to its robustness to image alterations at the signal level, such as image compression and filtering. However, the experimental assessment of vision-based forensic algorithms is a particularly...
Article
Full-text available
This paper presents a novel stereo visual odometry (VO) framework based on structure from motion, where a robust keypoint tracking and matching is combined with an effective keyframe selection strategy. In order to track and find correct feature correspondences a robust loop chain matching scheme on two consecutive stereo pairs is introduced. Keyfr...
Conference Paper
Full-text available
This paper proposes an extension of the sGLOH keypoint descriptor which improves its robustness and discriminability. The sGLOH descriptor can handle discrete rotations by a cyclic shift of its elements thanks to its circular structure, but its performance can decrease when the keypoint relative rotation is in between two sGLOH discrete rotations....
Conference Paper
Full-text available
One of the main problems that visually impaired people have to deal with is moving autonomously in an unknown environment. Currently, the most used autonomous walking aid is still the white cane, though in the last few years more technological devices, referred to as electronic travel aids (ETAs), have been introduced. In this paper, we present a nov...
Conference Paper
Full-text available
This paper proposes a novel color correction scheme for image stitching where the color map transfer is modelled by a monotone Hermite cubic spline and smoothly propagated into the target image. A three-segment monotone cubic spline minimizing color distribution statistics and gradient differences with respect to both the source and target images...
Conference Paper
A commonly ignored problem in planar mosaics, yet often present in practice, is the selection of the reference homography reprojection frame to which the successive image frames of the mosaic are attached. A bad choice for the reference frame can lead to severe distortions in the mosaic and can degenerate into incorrect configurations after some sequential f...
Conference Paper
Full-text available
MARTA (MARine Tool for Archaeology) is a modular AUV (Autonomous Underwater Vehicle) designed and developed by the University of Florence in the framework of the ARROWS (ARchaeological RObot systems for the World’s Seas) FP7 European project. The ARROWS project challenge is to provide the underwater archaeologists with technological tools for cost...
Conference Paper
Full-text available
The ARchaeological RObot systems for the World's Seas (ARROWS) EU Project proposes to adapt and develop low-cost Autonomous Underwater Vehicle (AUV) technologies to significantly reduce the cost of archaeological operations, covering the full extent of an archaeological campaign. The ARROWS methodology is to identify the archaeologists' requirements in all pha...
Conference Paper
Full-text available
This paper proposes a novel strategy to find the best reference homography in mosaics from video sequences. The reference homography globally minimizes the distortions induced on each image frame by the mosaic homography itself. This method is designed for planar mosaics on which a bad choice of the first reference image frame can lead to severe d...
Conference Paper
Full-text available
This paper presents a new online preprocessing strategy to detect and discard ongoing bad frames in video sequences. These include frames where an accurate localization between corresponding points is difficult, such as for blurred frames, or which do not provide relevant information with respect to the previous frames in terms of texture, image co...
Article
Full-text available
Computer vision is one of the most exciting areas of all information science and technology. It appeals both to the scientist looking for challenging research topics, and to the industrialist aiming at developing successful new products. In the last few years, the proliferation of vision-related material (tutorials, publications, software, datasets,...
Conference Paper
ARROWS is the acronym for Archaeological RObot systems for the World’s Seas. The project, started in September 2012, is funded by the EU in the framework of the FP7 call ENV-2012, challenge 6.2-6, devoted to “Development of advanced technologies and tools for mapping, diagnosing, excavating, and securing underwater and coastal archaeological sites”...
Conference Paper
Full-text available
This paper presents a novel stereo SLAM framework, where a robust loop chain matching scheme for tracking keypoints is combined with an effective frame selection strategy. The proposed approach, referred to as selective SLAM (SSLAM), relies on the observation that the error in the pose estimation propagates from the uncertainty of the three-dimens...
Conference Paper
Full-text available
This paper proposes a novel monocular SLAM approach. For a triplet of successive keyframes, the approach interleaves the registration of the three 3D maps associated with each image pair in the triplet and the refinement of the corresponding poses, by progressively limiting the allowable reprojection error according to a simulated annealing scheme. Th...
Conference Paper
Full-text available
We present a tool for the acquisition of 3D textured models of objects of desktop size using a hybrid computer vision framework. This framework combines active laser-based triangulation with passive motion estimation. The 3D models are obtained by motion-based alignment (with respect to a fixed world frame) of imaged laser profiles backprojected o...
Conference Paper
Full-text available
We present a fast and effective method to compute a high-resolution image of the corneal endothelium starting from a low-resolution video sequence obtained with a general purpose biomicroscope. Our goal is to exploit information redundancy in the sequence so as to achieve via software a magnification power and an image quality typical of dedicated...
Conference Paper
Full-text available
The Thesaurus Project, funded by the Regione Toscana, combines humanistic and technological research aiming at developing a new generation of cooperating Autonomous Underwater Vehicles and at documenting ancient and modern Tuscany shipwrecks. Technological research will allow performing an archaeological exploration mission through the use of a swa...
Conference Paper
Full-text available
In this paper we propose an original framework for the description and the subsequent recognition of objects of limited size. Although of general applicability, the framework is presented here as a way to trace different yet similar metal tools employed in the mechanical constructions industry. For the purpose of object description, time-varying si...
Conference Paper
Full-text available
The THESAURUS project (2011-2013) is financed by Regione Toscana (Italy) in the framework of the "FAS" program 2007-2013 under Deliberation CIPE (Italian government) 166/2007. The overall goal of THESAURUS project is to develop multidisciplinary methodologies and technologies to detect, catalogue and document underwater artifacts and wreckage with...
Conference Paper
Full-text available
We present an approach for merging into a single super-image a set of uncalibrated images of a general 3D scene taken from multiple viewpoints. To this aim, the content of either image is augmented with visual information taken from the others, while maintaining projective coherence. The approach extends the usual mosaicing techniques to image coll...
Article
Full-text available
In this paper it is shown how to obtain a three-dimensional, textured object model by relying exclusively on image warping 2D–2D transformations. To achieve this goal, a dual (laser and natural light) illumination of the scene is exploited. Shape reconstruction is not based on triangulation, but on the planar rectification and collation of laser pr...
Conference Paper
Full-text available
In this paper, machine learning and geometric computer vision are combined for the purpose of automatically reading bus line numbers with a smartphone. This can prove very useful to improve the autonomy of visually impaired people in urban scenarios. The problem is a challenging one, since standard geometric image matching methods fail due to the abun...
Conference Paper
Full-text available
We describe a computationally fast and effective approach to 2D-3D conversion of an image pair for the three-dimensional rendering on stereoscopic displays of scenes including a ground plane. The stereo disparities of all the other scene elements (background, foreground objects) are computed after statistical segmentation and geometric localizati...
Article
Full-text available
A self-calibrated approach to visual servoing with respect to non-planar targets modeled through a pair of coaxial circles plus one point is discussed. Full calibration data (fixed internal parameters) are obtained from two views, and used to recover the Euclidean structure of an auxiliary virtual plane associated to the target, together with the r...
Article
Full-text available
Non-intrusive human body tracking is a key issue in advanced human–computer interaction, with applications ranging from virtual reality to videoconference and telepresence. This paper describes a system for vision-based tracking of body posture. The system is explicitly designed to provide a robust yet simple and inexpensive solution to real-time b...
Article
Full-text available
A single-camera iris-tracking and remapping approach based on passive computer vision is presented. Tracking is aimed at obtaining accurate and robust measurements of the iris/pupil position. To this purpose, a robust method for ellipse fitting is used, employing search constraints so as to achieve better performance with respect to the standard RA...
Conference Paper
Full-text available
A robust iris localization and tracking algorithm based on computer vision is presented. The iris localization algorithm acts as a bootstrap for the tracking algorithm, providing it with a set of multiple hypotheses to restart from in the case of a tracking failure. Tracking is performed with a RANSAC-like robust method for ellipse fitting that inc...
Conference Paper
Full-text available
This paper addresses the problem of classifying human actions in a video sequence. A representation eigenspace approach based on the PCA algorithm is used to train the classifier according to an incremental learning scheme based on a "one action, one eigenspace" approach. Before dimensionality reduction, a high dimensional description of each frame...
Conference Paper
Full-text available
This paper addresses the problem of classifying actions performed by a human subject in a video sequence. A representation eigenspace approach based on the visual appearance is used to train the classifier. Before dimensionality reduction exploiting the PCA/LLE algorithms, a high dimensional description of each frame of the video sequence is...
Article
Full-text available
We present a low-cost, hybrid active/passive 3D scanning system based on an off-the-shelf camera, a laser stripe illuminator and a turntable. The system combines the good accuracy of active triangulation approaches with the flexibility of self-calibration based approaches. System operation is based on the construction of a single view of a virtual...
Conference Paper
Full-text available
An uncalibrated approach to visual servoing with respect to non planar targets modeled through a pair of coaxial circles plus one point is discussed. Full calibration data (fixed internal parameters) are obtained from two views, and used to recover Euclidean target structure and camera relative pose. Pose disambiguation is achieved without requirin...
Conference Paper
Full-text available
We describe a low-cost system for metric 3D scanning from uncalibrated images based on rotational kinematic constraints. The system is composed of a turntable, an off-the-shelf camera and a laser stripe illuminator. System operation is based on the construction of the virtual image of a surface of revolution (SOR), from which two imaged SOR cross-s...