Claudio Rosito Jung

Universidade Federal do Rio Grande do Sul | UFRGS · Institute of Informatics

PhD

About

159
Publications
83,800
Reads
3,655
Citations
Citations since 2017
53 Research Items
1,997 Citations
[Chart: citations per year, 2017 to 2023; vertical axis from 0 to 400]
Additional affiliations
November 2009 - present
Universidade Federal do Rio Grande do Sul

Publications (159)
Article
The Coronavirus Disease 2019 (COVID-19) has drastically overwhelmed most countries in the last two years, and image-based approaches using computerized tomography (CT) have been used to identify pulmonary infections. Recent methods based on deep learning either require time-consuming per-slice annotations (2D) or are highly data- and hardware-deman...
Conference Paper
Omnidirectional media are becoming widespread with the increasing popularization of devices for capture and visualization. Unlike traditional pinhole-based images, omnidirectional images are defined on the surface of a sphere, present a full field of view, and store light intensities from a whole scene. In particular, applications exploring immers...
Preprint
Defocus blur is a physical consequence of the optical sensors used in most cameras. Although it can be used as a photographic style, it is commonly viewed as an image degradation modeled as the convolution of a sharp image with a spatially-varying blur kernel. Motivated by the advance of blur estimation methods in the past years, we propose a non-b...
Conference Paper
Pose estimation is a crucial problem in several computer vision and robotics applications. For the two-view scenario, the typical pipeline consists of finding point correspondences between the two views and using them to estimate the pose. However, most available keypoint extraction and matching methods were designed to work with perspective images...
Conference Paper
Full-text available
Object detection is a classical problem in computer vision, and the vast majority of approaches require large annotated datasets for training and evaluation purposes. The most popular representations are bounding boxes (BBs), usually defined as the minimal-area rectangle that encompasses the whole object region. However, the annotation process pre...
Article
This paper provides a comprehensive survey on pioneer and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured under omnidirectional optics. We first revisit the basic concepts of the spherical camera model and review the most common acquisition technologies and representation formats suitabl...
Article
Full-text available
This paper presents an edge-based defocus blur estimation method from a single defocused image. We first distinguish edges that lie at depth discontinuities (called depth edges, for which the blur estimate is ambiguous) from edges that lie at approximately constant depth regions (called pattern edges, for which the blur estimate is well-defined). T...
Conference Paper
Monocular depth inference methods based on 360° images allow 3D reconstruction of entire rooms with a single capture. However, most state-of-the-art approaches assume gravity aligned images and are highly sensitive to camera rotations. Such limitations result in poor depth estimates, which may jeopardize further 3D-based applications. Here, we pres...
Article
Superpixels are fundamental in many visual computing applications, and most existing algorithms are designed to work with pinhole-based images. However, immersive applications are gaining visibility with the growing number of devices for capturing and visualizing 360° media. This paper introduces two fast and accurate superpixel algorithms tailored to...
Article
View synthesis allows observers to explore static scenes using aligned color images and depth maps captured in a preset camera path. Among the options, depth-image-based rendering (DIBR) approaches have been effective and efficient since only one pair of color and depth map is required, saving storage and bandwidth. The present work proposes a nove...
Article
Full-text available
Human cortical and subcortical areas integrate emotion, memory, and cognition when interpreting various environmental stimuli for the elaboration of complex, evolved social behaviors. Pyramidal neurons occur in developed phylogenetic areas advancing along with the allocortex to represent 70–85% of the neocortical gray matter. Here, we illustrate an...
Preprint
In this work, we explore techniques to improve performance for rare classes in the task of Automatic Chord Recognition (ACR). We first explored the use of the focal loss in the context of ACR, which was originally proposed to improve the classification of hard samples. In parallel, we adapted a self-learning technique originally designed for image...
Conference Paper
Techniques for 3D reconstruction of scenes based on images are popular and support a number of secondary applications. Traditional approaches require several captures for covering whole environments due to the narrow field of view (FoV) of the pinhole-based/perspective cameras. This paper summarizes the main contributions of the homonymous Ph.D. Thesi...
Conference Paper
This paper presents a methodology for image classification using Graph Neural Network (GNN) models. We transform the input images into region adjacency graphs (RAGs), in which regions are superpixels and edges connect neighboring superpixels. Our experiments suggest that Graph Attention Networks (GATs), which combine graph convolutions with self-at...
Preprint
This paper presents an edge-based defocus blur estimation method from a single defocused image. We first distinguish edges that lie at depth discontinuities (called depth edges, for which the blur estimate is ambiguous) from edges that lie at approximately constant depth regions (called pattern edges, for which the blur estimate is well-defined). T...
Preprint
Weakly supervised object detection (WSOD) aims to tackle the object detection problem using only labeled image categories as supervision. A common approach used in WSOD to deal with the lack of localization information is Multiple Instance Learning, and in recent years methods started adopting Multiple Instance Detection Networks (MIDN), which allo...
Article
Automatic License Plate Recognition (ALPR) is an important task with many applications in Intelligent Transportation and Surveillance systems. This work presents an end-to-end ALPR method based on a hierarchical Convolutional Neural Network (CNN). The core idea of the proposed method is to identify the vehicle and the license plate region using two...
Preprint
This document reports the use of Graph Attention Networks for classifying oversegmented images, as well as a general procedure for generating oversegmented versions of image-based datasets. The code and learnt models for/from the experiments are available on GitHub. The experiments were run from June 2019 until December 2019. We obtained better res...
Conference Paper
Full-text available
In crowds, one important aspect that has been studied in the literature is the sociability of groups dealing with aspects based on personality and emotions. In this paper we contribute to the space design area while considering the cultural, personality and thermal aspects to provide spatial group distribution. Our method applies a thermal comfort meth...
Conference Paper
This paper presents a perturbation analysis for the estimate of epipolar matrices using the 8-Point Algorithm (8-PA). Our approach explores existing bounds for singular subspaces and relates them to the 8-PA, without assuming any kind of error distribution for the matched features. In particular, if we use unit vectors as homogeneous image coordina...
Conference Paper
In this paper we compare the quality of synthesized views produced by four DIBR methods when fed by depth maps estimated by five state-of-the-art stereo matching algorithms. Also, we compute the correlation between four popular metrics for ranking stereo matching algorithms and two metrics commonly used to evaluate synthesized views (PSNR and SSIM)...
Preprint
This paper presents a novel hierarchical approach for collective behavior recognition based solely on ground-plane trajectories. In the first layer of our classifier, we introduce a novel feature called Personal Interaction Descriptor (PID), which combines the spatial distribution of a pair of pedestrians within a temporal window with a pyramidal r...
Conference Paper
In this paper we propose a framework for inferring depth from a single spherical image, which can be coupled to any generic planar image monocular depth estimation algorithm. It consists of first inferring depth from overlapping planar patches extracted from the spherical image, and then using a regularized minimization scheme to stitch the patches...
Conference Paper
Full-text available
Despite the large number of both commercial and academic methods for Automatic License Plate Recognition (ALPR), most existing approaches are focused on a specific license plate (LP) region (e.g. European, US, Brazilian, Taiwanese, etc.), and frequently explore datasets containing approximately frontal images. This work proposes a complete ALPR sys...
Chapter
Despite the large number of both commercial and academic methods for Automatic License Plate Recognition (ALPR), most existing approaches are focused on a specific license plate (LP) region (e.g. European, US, Brazilian, Taiwanese, etc.), and frequently explore datasets containing approximately frontal images. This work proposes a complete ALPR sys...
Article
Full-text available
Crowd simulation addresses algorithmic approaches to steering, navigation, perception, and behavioral models. Significant progress has been achieved in modeling interactions between agents and the environment to avoid collisions, exploit empirical local decision data, and plan efficient paths to goals. We address a relatively unexplored dimension o...
Article
Full-text available
Objects that do not lie at the focal distance of a digital camera generate defocused regions in the captured image. This paper presents a new edge-based method for spatially varying defocus blur estimation using a single image based on reblurred gradient magnitudes. The proposed approach initially computes a scale-consistent edge map of the input i...
Chapter
People, when part of a crowd, are able to perform unusual behavior, which would not be performed by a single person (LeBon 1895). A crowd is a powerful entity and its understanding is very important, especially regarding safety issues. The understanding of crowd motion can provide enough information in order to map people features and behaviors that...
Chapter
This chapter introduces some important aspects of crowd dynamics and crowd evacuation, and discusses regulations concerning evacuation processes as presented in some countries.
Chapter
When evaluating or simulating crowd egress situations, there are several parameters that should be taken into account. For instance, the initial distribution of the people and/or local densities are important to assess possible hazardous events; tracking people or detecting main flows can be very useful to identify main escape routes or bottlenecks...
Chapter
This chapter presents some current technologies in crowd simulation and it is organized into two parts. First, we present and discuss the main existing state-of-the-art technologies developed with an explicit goal to simulate crowds. Then we present in detail CrowdSim, a new crowd simulation software developed by the authors. This is not a commerci...
Chapter
When analyzing an egress process simulation, it is possible to extract data that can be used to provide a deep analysis of scenarios. Before something goes wrong, a simulation project can identify attention points related to people’s comfort and safety when in egress.
Book
This book describes from a computer science viewpoint the software, methods of simulating and analysing crowds with a particular focus on the effects of panic in emergency situations. The power of modern technology impacts on modern life in multiple ways every day. A variety of scientific models and computational tools have been developed to impro...
Conference Paper
Full-text available
Automatic License Plate Recognition (ALPR) is an important task with many applications in Intelligent Transportation and Surveillance systems. As in other computer vision tasks, Deep Learning (DL) methods have been recently applied in the context of ALPR, focusing on country-specific plates, such as American or European, Chinese, Indian and Korean....
Conference Paper
Full-text available
Keypoint extraction and matching has been widely studied by the computer vision community, mostly focused on pinhole camera models. In this paper we perform a comparative analysis of four keypoint extraction algorithms applied to full spherical images, particularly in the context of pose estimation. Two of the methods chosen for the comparative stu...
Article
Background: Different approaches aim to unravel detailed morphological features of neural cells. Dendritic spines are multifunctional units that reflect cellular connectivity, synaptic strength and plasticity. New method: A novel three-dimensional (3D) reconstruction procedure is introduced for visualization of dendritic spines from human postmo...
Data
MATLAB code for our paper "Adaptive image denoising and edge enhancement in scale-space using the wavelet transform", Pattern Recognition Letters, April 2003. DOI: 10.1016/S0167-8655(02)00220-9
Article
In this work, we propose a novel approach for Facial Expression Recognition (FER). We introduce the TPOEM (Temporal Patterns of Oriented Edge Magnitudes) features, which are volumetric extensions of the known POEM features by exploring adjacent frames and also temporal derivatives. To cope with the increase in the code length produced by TPOEM, we...
Article
This paper presents a new method for spatio-temporally coherent disparity map estimation and view interpolation for multiview linear camera arrays based on 2D domain triangulation. In the first frame of the sequence, a 3D mesh is computed for each camera, leading to a spatially coherent view interpolation. For the remaining frames of the sequence,...
Conference Paper
Crowd simulation has become an important area, mainly in entertainment and security applications. In particular, this area has been explored in safety systems to evaluate environments in terms of people comfort and security. In general, the evaluation involves the execution of one or more simulations in order to provide statistical information abou...
Article
Evacuation planning is an important and difficult task in building design. The proposed framework can identify optimal evacuation plans using decision points, which control the ratio of agents that select a particular route at a specific spatial location. The authors optimize these ratios to achieve the best evacuation based on a quantitatively val...
Article
Solfège is a general technique used in the music learning process that involves the vocal performance of melodies, regarding the time and duration of musical sounds as specified in the music score, properly associated with the meter-mimicking performed by hand movement. This article presents an audiovisual approach for automatic assessment of this...
Article
Full-text available
Pedestrian detection reliability is a key problem for autonomous or aided driving, and methods that use Histogram of Oriented Gradients (HOG) are very popular. Embedded Graphics Processing Units (GPUs) are exploited to run HOG in a very efficient manner. Unfortunately, GPU architecture has been shown to be particularly vulnerable to radiation-indu...
Conference Paper
This paper presents a new approach for pedestrian detection in the context of Driver Assistance Systems (DAS). Given a camera with known intrinsic parameters, a flexible online calibration scheme that explores the expected road geometry is used to obtain the extrinsic parameters. With the full camera parameters, the expected geometry and size of a...
Conference Paper
The processing time to simulate crowds for games or simulations is a real challenge. While the increasing power of processing capacity is a reality in the hardware industry, it also means that more agents, better rendering and more sophisticated Artificial Intelligence (AI) methods can be used, so again the computational time is an issue. Despite t...
Conference Paper
Full-text available
This paper presents a new image retargeting method that explores blur information. Given the input image, we compute the blur map and estimate in-focus regions. For retargeting, we first try to crop image boundaries as much as possible (preserving in-focus regions). If cropping is not enough, we use seam carving exploring a novel blur-aware energy...
Article
Crowds arise in a variety of situations, such as public concerts and sporting matches. In typical conditions, the crowd moves in an orderly manner, but panic situations may lead to catastrophic results. We propose a computer vision method to identify motion pattern changes in human crowds that can be related to an unusual event. The proposed approa...
Article
Speaker diarization (SD) is the process of assigning speech segments of an audio stream to their corresponding speakers, thus comprising the problem of voice activity detection (VAD), speaker labeling/identification, and often sound source localization (SSL). Most research activities in the past aimed towards applications such as broadcast news, meetings,...
Conference Paper
Full-text available
We introduce a new approach for hand and object segmentation using RGB-D cameras suitable for gesture-based Human-Computer Interfaces (HCIs) that involve an interaction plane. The technique consists of detecting the interaction plane using a temporally coherent version of RANSAC, followed by segmenting off-plane objects using a markers-based waters...
Conference Paper
Full-text available
This paper presents a new approach for automatic license plate detection using an embedded camera inside a moving vehicle, implemented in a low-cost prototype. The key idea of the proposed method is to initially identify shadowed regions under the rear of visible vehicles, and then explore information from a calibrated camera to reduce the search s...
Article
This paper presents a new approach for road lane classification using an onboard camera. Initially, lane boundaries are detected using a linear–parabolic lane model, and an automatic on-the-fly camera calibration procedure is applied. Then, an adaptive smoothing scheme is applied to reduce noise while keeping close edges separated, and pairs of loc...
Conference Paper
One of the biggest challenges in view interpolation is to fill the regions without projective information in the synthesized view. In this paper, we present a new approach that identifies and corrects different types of missing information. In the first stage, we propose a fast solution to tackle the problems of cracks and ghost, common artifacts i...
Article
Full-text available
This paper describes a new approach for event detection in video sequences. A tracking algorithm for oblique camera setups is initially used to extract trajectories in a training period, and a map of spatial occupancy of the scene is built. In the test stage, Voronoi Diagrams are used to obtain some information regarding interpersonal relationships, suc...
Article
Musical performance by an ensemble of performers often requires a conductor. This paper presents a tool to aid the study of basic conducting gestures, also known as meter-mimicking gestures, performed by beginners. It is based on the automatic detection of musical metrics and their subdivisions by analysis of hand gestures. Musical metrics are rep...
Article
In Depth Image-Based Rendering (DIBR), interpolated views generated using one or two cameras usually present artifacts and holes due to occlusions and/or inconsistencies in the input disparity maps. In this paper we propose a multiple (3 or more) camera view interpolation technique that is able to combine redundant projections in a single interpola...
Article
In this paper we propose a head-shoulder contour estimation model for human figures in still images, captured in a frontal pose. The contour estimation is guided by a learned head-shoulder shape model, initialized automatically by a face detector. A graph is generated around the detected face with an omega-like shape, and the estimated head-shoulde...
Article
This paper presents a new approach for self-calibration of static cameras in the context of surveillance applications. Initially, a pedestrian detector is applied and the responses are validated using background removal. Then, foreground-related pixels within the detection results are used to estimate the feet-head line segments of each person (calle...
Conference Paper
Full-text available
This paper presents a new method for defocus blur estimation using a single image. The proposed method exploits the ratio of gradient magnitude images computed at multiple scales, using the scale-space theory to estimate the number of reliable scales. Experimental results on synthetic and real images show that the proposed method is robust to noise...
Article
Full-text available
Humans can extract speech signals that they need to understand from a mixture of background noise, interfering sound sources, and reverberation for effective communication. Voice Activity Detection (VAD) and Sound Source Localization (SSL) are the key signal processing components that humans perform by processing sound signals received at both ears...
Conference Paper
This paper explores a simple yet effective way to generate temporally coherent disparity maps from binocular video sequences based on kinematic constraints. Given the disparity map at a certain frame, the proposed approach computes the set of possible disparity values for each pixel in the subsequent frame, assuming a maximum displacement constrain...
Article
This paper presents a new approach for tracking multiple people in monocular calibrated cameras combining patch matching and pedestrian detection. Initially, background removal and pedestrian detection are used in conjunction with the vertical standing hypothesis to initialize the targets with multiple patches. In the tracking step, each patch rel...
Article
This paper presents an on-the-fly procedure for obtaining extrinsic camera parameters of onboard vehicular cameras, as well as augmented reality applications for driver assistance systems and road inspection. The proposed approach employs a lane detection algorithm to extract the lane boundaries of the road, and estimates the distance between adjac...
Conference Paper
In this paper we propose a self-occlusion and 3D pose estimation model for human figures in still images based on a user-provided 2D skeleton. An initial segmentation model is used to capture labeled human body parts in a 2D image. Then, occluded body parts are detected when different body parts overlap, and are disambiguated by analyzing the energ...
Conference Paper
Full-text available
This paper presents a method to detect unusual behavior in human crowds based on histograms of velocities in world coordinates. A combination of background removal and optical flow is used to extract the global motion at each image frame, discarding small motion vectors due to artifacts such as noise, non-stationary background pixels and compression i...
Conference Paper
Full-text available
This paper presents a method for detection and recognition of road lane markings using an uncalibrated onboard camera. Initially, lane boundaries are detected based on a linear parabolic model. Then, we build a simple model to represent pixels related to the pavement, and explore this model to estimate pixels related to lane markings. A set of feat...
Article
Full-text available
The aim of most microphone array applications is to localize sound sources in a noisy and reverberant environment. For that purpose, many different sound source localization (SSL) algorithms have been proposed, where the SRP-PHAT (steered response power using the phase transform) has been known as one of the state-of-the-art methods. Its original f...
Article
Full-text available
This paper presents a new approach for stereo matching and view interpolation problem based on triangular tessellations suitable for a linear array of rectified cameras. The domain of the reference image is initially partitioned into triangular regions using edge and scale information, aiming to place vertices along image edges and increase the num...
Article
Full-text available
This paper presents an approach for object tracking based on multiple disjoint patches. Initially, the target is subdivided into a set of rectangular patches, and each patch is represented parametrically by the mean vector and covariance matrix computed from a set of feature vectors that represent each pixel of the target. Each patch is tracked ind...
Article
Full-text available
Audiovisual voice activity detection is a necessary stage in several problems, such as advanced teleconferencing, speech recognition, and human-computer interaction. Lip motion and audio analysis provide a large amount of information that can be integrated to produce more robust audiovisual voice activity detection (VAD) schemes, as we discuss in t...
Conference Paper
Full-text available
In this paper we propose a skeleton-based model for human segmentation in static images. Our approach explores edge information, orientation coherence and anthropometric-estimated parameters to generate a graph, and the desired contour is a path with maximal cost. Experimental results show that the proposed technique works well in non-trivial image...