
Erickson R. Nascimento
Federal University of Minas Gerais | UFMG · Departamento de Ciência da Computação
PhD
About
124
Publications
21,168
Reads
2,739
Citations
Introduction
I am an Associate Professor in the Department of Computer Science at Federal University of Minas Gerais (UFMG), Brazil. My research interests include Computer Vision, Pattern Recognition, and Graphics. I am working on low-level description for images and geometrical data, three-dimensional reconstruction and modeling, participating media, and hyperlapse for egocentric videos.
Additional affiliations
July 2017 - December 2018
December 2012 - present
Publications (124)
Supervised machine learning methods require large-scale training datasets to perform well in practice. Synthetic data has shown great progress recently and has been used as a complement to real data. However, there is still a pressing need to assess the usability of synthetically generated data. To this end, we propose a novel UCB-based training...
Visual correspondence is a crucial step in key computer vision tasks, including camera localization, image registration, and structure from motion. The most effective techniques for matching keypoints currently involve using learned sparse or dense matchers, which need pairs of images. These neural networks have a good general understanding of feat...
From the dawn of the digital revolution until today, data has grown exponentially, especially in images and videos. Smartphones and wearable devices with high storage and long battery life contribute to continuous recording and massive uploads to social media. This rapid increase in visual data, combined with users’ limited time, demands methods to...
We propose a novel learned keypoint detection method to increase the number of correct matches for the task of non-rigid image correspondence. By leveraging true correspondences acquired by matching annotated image pairs with a specified descriptor extractor, we train an end-to-end convolutional neural network (CNN) to find keypoint locations that...
Local feature extraction is a standard approach in computer vision for tackling important tasks such as image matching and retrieval. The core assumption of most methods is that images undergo affine transformations, disregarding more complicated effects such as non-rigid deformations. Furthermore, incipient works tailored for non-rigid corresponde...
We present a novel learned keypoint detection method designed to maximize the number of correct matches for the task of non-rigid image correspondence. Our training framework uses true correspondences, obtained by matching annotated image pairs with a predefined descriptor extractor, as a ground-truth to train a convolutional neural network (CNN)....
Background: Cardiac involvement seems to impact the prognosis of COVID-19. Bedside echocardiography (echo) holds promise for early outcome prediction, and artificial intelligence has been shown to be an additional tool to overcome personnel limitations. We propose a spatial-temporal deep learning-based approach for automatic prediction of mortality...
We tackle the problem of learning classification models with very small amounts of labeled data (e.g., less than 10% of the dataset) by introducing a novel Single View Co-Training strategy supported by Reinforcement Learning (CoRL). CoRL is a novel semi-supervised learning framework that can be used with a single view (representation). Differently...
With the advance in technology and social media usage, recording first-person videos has become a common habit. These videos are usually very long and tiring to watch, bringing the need to speed them up. Despite recent progress of fast-forward methods, they do not consider inserting background music in the videos, which could make them more enjoyab...
This thesis investigates the problem of transferring human motion and appearance from video to video preserving motion features, body shape, and visual quality. In other words, given two input videos, we investigate how to synthesize a new video, where a target person from the first video is placed into a new context performing different motions fr...
Recent semantic segmentation models perform well under standard weather conditions and sufficient illumination but struggle with adverse weather conditions and nighttime. Collecting and annotating training data under these conditions is expensive, time-consuming, error-prone, and not always practical. Usually, synthetic data is used as a feasible d...
Recent semantic segmentation models perform well under standard weather conditions and sufficient illumination but struggle with adverse weather conditions and night-time. Collecting and annotating training data under these conditions is expensive, time-consuming, error-prone, and not always practical. Usually, synthetic data is used as a feasible...
With the recent growth in the use of social media and new digital devices like smartphones and wearable cameras, people are often recording long first-person videos of their daily activities. These videos are usually very long and tiring to watch, bringing the need to speed them up. Recent fast-forward methods do not consider the background music t...
Video stabilization plays a central role in improving video quality. However, despite the substantial progress made by stabilization methods, they have mainly been tested under standard weather and lighting conditions and may perform poorly under adverse conditions. In this paper, we propose a synthetic-aware adverse weather robust algorithm for video stabili...
Rheumatic heart disease (RHD) affects approximately 39 million people worldwide and is the most common acquired heart disease among children and adolescents. Echocardiograms are the gold standard for diagnosing RHD, but a shortage of qualified professionals prevents the large-scale implementation of programs for prevention and early identification...
The growth of videos in our digital age and the users' limited time raise the demand for processing untrimmed videos to produce shorter versions conveying the same information. Despite the remarkable progress that summarization methods have made, most of them can only select a few frames or skims, creating visual gaps and breaking the video context...
Most of the existing handcrafted and learning-based local descriptors are still at best approximately invariant to affine image transformations, often disregarding deformable surfaces. In this paper, we take one step further by proposing a new approach to compute descriptors from RGB-D images (where RGB refers to the pixel color brightness and D st...
The growth of videos in our digital age and the users’ limited time raise the demand for processing untrimmed videos to produce shorter versions conveying the same information. Despite the remarkable progress that summarization methods have made, most of them can only select a few frames or skims, creating visual gaps and breaking the video context...
We present a convolutional neural network (CNN)-based approach for scene-level change detection in aerial images with registration errors. Thousands of aerial images and long videos are routinely acquired for monitoring large areas, such as forests and oil pipelines. Annotating changes in those videos and images can be tedious, error-prone, or even...
Despite the advances in extracting local features achieved by handcrafted and learning-based descriptors, they are still limited by the lack of invariance to non-rigid transformations. In this paper, we present a new approach to compute features from still images that are robust to non-rigid deformations to circumvent the problem of matching deform...
This paper proposes a new end-to-end neural rendering architecture to transfer appearance and reenact human actors. Our method leverages a carefully designed graph convolutional network (GCN) to model the human body manifold structure, jointly with differentiable rendering, to synthesize new videos of people in different contexts from where they we...
Learning to move naturally from music, i.e., to dance, is one of the most complex motions humans often perform effortlessly. Synthesizing human motion through learning techniques is becoming an increasingly popular approach to alleviating the requirement of new data capture to produce animations. Most approaches, addressing the problem of automatic...
With the advance of technology and social media usage, the recording of first-person videos has become a widespread habit. These videos are usually very long and tiring to watch, bringing the need to speed them up. Despite recent progress of fast-forward methods, they generally do not consider inserting background music in the videos, which could m...
Objective: Rheumatic heart disease (RHD) affects an estimated 39 million people worldwide and is the most common acquired heart disease in children and young adults. Echocardiograms are the gold standard for diagnosis of RHD, but there is a shortage of skilled experts to allow widespread screenings for early detection and prevention of the disease...
Learning to move naturally from music, i.e., to dance, is one of the most complex motions humans often perform effortlessly. Existing techniques of automatic dance generation with classical CNN and RNN models undergo training and variability issues due to the non-Euclidean geometry of the motion manifold. We design a novel method based on GCNs to t...
In this paper, we hypothesize that the effects of the degree of typicality in natural semantic categories can be generated based on the structure of artificial categories learned with deep learning models. Motivated by the human approach to representing natural semantic categories and based on the Prototype Theory foundations, we propose a novel Co...
Transferring human motion and appearance between videos of human actors remains one of the key challenges in Computer Vision. Despite the advances from recent image-to-image translation approaches, there are several transferring contexts where most end-to-end learning-based retargeting methods still perform poorly. Transferring human appearance fro...
Autonomous mobile devices operating in confined environments, such as pipes, underground tunnel systems, and cave networks, face multiple open challenges from the robotics perspective. Those challenges, such as mobility, localization, and mapping in GPS-denied scenarios, are receiving particular attention from academia and industry. One example...
Synthesizing human motion through learning techniques is becoming an increasingly popular approach to alleviating the requirement of new data capture to produce animations. Learning to move naturally from music, i.e., to dance, is one of the more complex motions humans often perform effortlessly. Each dance movement is unique, yet such movements ma...
This research aims to build a model for the semantic description of objects based on visual features extracted from images. We introduce a novel semantic description approach inspired by the Prototype Theory. Inspired by the human approach used to represent categories, we propose a novel Computational Prototype Model (CPM) that encodes and stores t...
Technological advances in sensors have paved the way for digital cameras to become increasingly ubiquitous, which, in turn, led to the popularity of the self-recording culture. As a result, the amount of visual data on the Internet is moving in the opposite direction of the available time and patience of the users. Thus, most of the uploaded videos...
This paper addresses the problem of building augmented metric representations of scenes with semantic information from RGB-D images. We propose a complete framework to create an enhanced map representation of the environment with object-level information to be used in several applications such as human-robot interaction, assistive robotics, visual...
The availability of low-cost and high-quality wearable cameras combined with the unlimited storage capacity of video-sharing websites has evoked a growing interest in First-Person Videos. Such videos are usually composed of long-running unedited streams captured by a device attached to the user's body, which makes them tedious and visually unpleasan...
The growing data sharing and life-logging cultures are driving an unprecedented increase in the amount of unedited First-Person Videos. In this paper, we address the problem of accessing relevant information in First-Person Videos by creating an accelerated version of the input video and emphasizing the important moments to the recorder. Our method...
The rapid increase in the amount of published visual data and the limited time of users bring the demand for processing untrimmed videos to produce shorter versions that convey the same information. Despite the remarkable progress that has been made by summarization methods, most of them can only select a few frames or skims, which creates visual g...
Technological advances in sensors have paved the way for digital cameras to become increasingly ubiquitous which, in turn, led to the popularity of the self-recording culture. As a result, the amount of visual data on the Internet is moving in the opposite direction of the available time and patience of the users. Thus, most of the uploaded videos...
The safety and comfort of drivers have been improved over the decades as a result of our broadened understanding of driver modeling and behavior prediction. Despite these remarkable advances in autonomous and interactive systems, there is a significant lack of approaches that consider the passengers and the vehicle as components of a dynamical vibr...
Creating plausible virtual actors from images of real actors remains one of the key challenges in computer vision and computer graphics. Marker-less human motion estimation and shape modeling from images in the wild bring this challenge to the fore. Despite the recent advances in view synthesis and image-to-image translation, currently available f...
The growth of Social Networks has fueled the habit of people logging their day-to-day activities, and long First-Person Videos (FPVs) are one of the main tools in this new habit. Semantic-aware fast-forward methods are able to decrease the watch time and select meaningful moments, which is key to increasing the chances of these videos being watched....
Different applications in remote sensing, such as crop monitoring and visual surveillance, demand the automatic detection of changes from sets of images acquired over time. Most traditional approaches use satellite imagery, which, besides the known issues such as cloud cover and image acquisition frequency for nongeostationary satellites, are very...
The availability of low-cost, high-quality personal wearable cameras combined with the unlimited storage capacity of video-sharing websites has evoked a growing interest in First-Person Videos (FPVs). Such videos are usually composed of long-running unedited streams captured by a device attached to the user's body, which makes them tedious and visual...
One of the steps to provide fundamental data for planning a mining effort is the magnetic surveying of a target area, which is typically carried out by conventional aircraft campaigns. However, besides the high cost, fixed‐wing aerial vehicles present shortcomings especially for drape flights on mountainous regions, where steep slopes are often pre...
In this paper, we introduce a novel semantic description approach inspired by Prototype Theory foundations. We propose a Computational Prototype Model (CPM) that encodes and stores the central semantic meaning of an object category: the semantic prototype. Also, we introduce a Prototype-based Description Model that encodes the semantic meaning of an...
The detection of fiducial points on faces has benefited significantly from the rapid progress in the field of machine learning, in particular in convolutional networks. However, the accuracy of most detectors strongly depends on an enormous amount of annotated data. In this work, we present a domain adaptation approach based on a two-step...
The image processing community has witnessed remarkable advances in enhancing and restoring images. Nevertheless, restoring the visual quality of underwater images remains a great challenge. End-to-end frameworks might fail to enhance the visual quality of underwater images since in several scenarios it is not feasible to provide the ground truth o...
Creating textured 3D meshes of objects for real-time applications can be a laborious, slow and expensive task, demanding specific, highly specialized human resources such as 2D and 3D artists. In this paper, we present a fully automatic 3D modeling methodology based on silhouette carving, capable of creating textured 3D meshes from three pieces of...
In this paper, we present a novel method to restore the visual quality of images from scenes immersed in participating media, in particular water. Our method builds upon an existing physics-based model and estimates the scene radiance by removing the medium interference on light propagation. Our approach requires a single image as input and, by combin...
The remarkable technological advance in well-equipped wearable devices is driving an increasing production of long first-person videos. However, since most of these videos have long and tedious parts, they are forgotten or never seen. Despite a large number of techniques proposed to fast-forward these videos by highlighting relevant moments, most...