Sung-eui Yoon

  • Korea Advanced Institute of Science and Technology

About

Publications: 195
Reads: 23,785
Citations: 3,860
Current institution
Korea Advanced Institute of Science and Technology

Publications (195)
Preprint
Full-text available
Visual re-ranking using Nearest Neighbor graph (NN graph) has been adapted to yield high retrieval accuracy, since it is beneficial to exploring a high-dimensional manifold and applicable without additional fine-tuning. The quality of visual re-ranking using NN graph, however, is limited to that of connectivity, i.e., edges of the NN graph. Some e...
Preprint
Full-text available
Domain generalizable person re-identification (DG re-ID) aims to learn discriminative representations that are robust to distributional shifts. While data augmentation is a straightforward solution to improve generalization, certain augmentations exhibit a polarized effect in this task, enhancing in-distribution performance while deteriorating out-...
Article
Generating reliable pseudo masks from image-level labels is challenging in the weakly supervised semantic segmentation (WSSS) task due to the lack of spatial information. Prevalent class activation map (CAM)-based solutions are challenged to discriminate the foreground (FG) objects from the suspicious background (BG) pixels (a.k.a. co-occurring) an...
Article
Physically based differentiable rendering allows an accurate light transport simulation to be differentiated with respect to the rendering input, i.e., scene parameters, and it enables inferring scene parameters from target images, e.g., photos or synthetic images, via an iterative optimization. However, this inverse Monte Carlo rendering inherits...
Preprint
Existing open-set recognition (OSR) studies typically assume that each image contains only one class label, and the unknown test set (negative) has a disjoint label space from the known test set (positive), a scenario termed full-label shift. This paper introduces the mixed OSR problem, where test images contain multiple class semantics, with known...
Preprint
Full-text available
Generating reliable pseudo masks from image-level labels is challenging in the weakly supervised semantic segmentation (WSSS) task due to the lack of spatial information. Prevalent class activation map (CAM)-based solutions are challenged to discriminate the foreground (FG) objects from the suspicious background (BG) pixels (a.k.a. co-occurring) an...
Preprint
Full-text available
This paper presents a novel method designed to enhance the efficiency and accuracy of both image retrieval and pixel retrieval. Traditional diffusion methods struggle to propagate spatial information effectively in conventional graphs due to their reliance on scalar edge weights. To overcome this limitation, we introduce a hypergraph-based framewor...
Preprint
Full-text available
Audio-visual segmentation (AVS) aims to segment sound sources in the video sequence, requiring a pixel-level understanding of audio-visual correspondence. As the Segment Anything Model (SAM) has strongly impacted extensive fields of dense prediction problems, prior works have investigated the introduction of SAM into AVS with audio as a new modalit...
Article
Mobile robot navigation in crowded indoor environments is a challenging task due to the limited sensing capabilities of onboard sensors. In this study, we propose a mobile robot navigation framework that utilizes external CCTV data to address the limitations of local sensors in a crowded environment. This approach enables mobile robots to navigate...
Article
Auxiliary features such as geometric buffers (G-buffers) and path descriptors (P-buffers) have been shown to significantly improve Monte Carlo (MC) denoising. However, recent approaches implicitly learn to exploit auxiliary features for denoising, which could lead to insufficient utilization of each type of auxiliary features. To overcome such an i...
Article
Full-text available
This research proposes a deep-learning paradigm, termed functional learning (FL), to physically train a loose neuron array, a group of non-handcrafted, non-differentiable, and loosely connected physical neurons whose connections and gradients are beyond explicit expression. The paradigm targets training non-differentiable hardware, and therefore so...
Preprint
Full-text available
Auxiliary features such as geometric buffers (G-buffers) and path descriptors (P-buffers) have been shown to significantly improve Monte Carlo (MC) denoising. However, recent approaches implicitly learn to exploit auxiliary features for denoising, which could lead to insufficient utilization of each type of auxiliary features. To overcome such an i...
Preprint
Full-text available
Deep neural networks are susceptible to adversarial attacks due to the accumulation of perturbations in the feature level, and numerous works have boosted model robustness by deactivating the non-robust feature activations that cause model mispredictions. However, we claim that these malicious activations still contain discriminative cues and that...
Preprint
Full-text available
In this paper, we learn a diffusion model to generate 3D data on a scene-scale. Specifically, our model crafts a 3D scene consisting of multiple objects, while recent diffusion research has focused on a single object. To realize our goal, we represent a scene with discrete class labels, i.e., categorical distribution, to assign multiple objects int...
Article
We present a novel sound source localization method that leverages microphone pair training, designed to deliver robust performance in various real-world environments. Existing deep learning (DL)-based approaches face scalability issues when dealing with various types of microphone arrays. To address these issues, our approach has been structured i...
Article
For a realistic and immersive experience in a virtual environment, it is important to estimate and reflect the acoustic characteristics of real indoor scenes. This article proposes a method of directly measuring the reflection coefficient of a surface, which is an acoustic characteristic of the real environment. Because expensive optimization...
Chapter
A training pipeline for optical flow CNNs consists of a pretraining stage on a synthetic dataset followed by a fine tuning stage on a target dataset. However, obtaining ground truth flows from a target video requires a tremendous effort. This paper proposes a practical fine tuning method to adapt a pretrained model to a target dataset without groun...
Preprint
Full-text available
With increasing demands for high-quality semantic segmentation in the industry, hard-to-distinguish semantic boundaries have posed a significant threat to existing solutions. Inspired by real-life experience, i.e., combining varied observations contributes to higher visual recognition confidence, we present the equipotential learning (EPL) method....
Preprint
Full-text available
Adversarial attacks with improved transferability - the ability of an adversarial example crafted on a known model to also fool unknown models - have recently received much attention due to their practicality. Nevertheless, existing transferable attacks craft perturbations in a deterministic manner and often fail to fully explore the loss surface,...
Preprint
Full-text available
A training pipeline for optical flow CNNs consists of a pretraining stage on a synthetic dataset followed by a fine tuning stage on a target dataset. However, obtaining ground truth flows from a target video requires a tremendous effort. This paper proposes a practical fine tuning method to adapt a pretrained model to a target dataset without groun...
Article
Full-text available
A redundant manipulator can have many trajectories for joints that follow a given end-effector path in the Cartesian space, since it has multiple inverse kinematics solutions per end-effector pose. While maintaining accuracy with the given end-effector path, it is challenging to quickly synthesize a feasible trajectory that satisfies robot-specific...
Preprint
Unsupervised person re-identification (re-ID) aims at learning discriminative representations for person retrieval from unlabeled data. Recent techniques accomplish this task by using pseudo-labels, but these labels are inherently noisy and deteriorate the accuracy. To overcome this problem, several pseudo-label refinement methods have been propose...
Preprint
Full-text available
Super-resolution of LiDAR range images is crucial to improving many downstream tasks such as object detection, recognition, and tracking. While deep learning has made remarkable advances in super-resolution techniques, typical convolutional architectures limit upscaling factors to specific output resolutions in training. Recent work has shown tha...
Article
With increasing demands for high-quality semantic segmentation in the industry, hard-to-distinguish semantic boundaries have posed a significant threat to existing solutions. Inspired by real-life experience, i.e., combining varied observations contributes to higher visual recognition confidence, we present the equipotential learning (EPL) method....
Article
In this paper, we present real-time collision-free inverse kinematics (RCIK) that accurately performs consecutively provided six-degrees-of-freedom commands in environments containing static and dynamic obstacles. Our method is based on an optimization-based IK approach to generate IK candidates with high feasibility for the command. While checking...
Article
In this article, we present a novel localization method for multiple sources in indoor environments. Our approach can estimate different propagation paths, including the reflection and diffraction paths of sound waves based on a backward ray tracing technique. To estimate diffraction propagation paths, we combine a ray tracing algorithm with a unif...
Chapter
The intellectual value of digitized 3D properties in scientific, artistic, historical, and entertainment domains is increasing. However, little attention has been paid to designing an immutable, secure database for their management. We propose a secure 3D property management platform powered by blockchain and decentralized storage. The platform conne...
Article
Image-space auxiliary features such as surface normal have significantly contributed to the recent success of Monte Carlo (MC) reconstruction networks. However, path-space features, another essential piece of light propagation, have not yet been sufficiently explored. Due to the curse of dimensionality, information flow between a regression loss an...
Article
Full-text available
Monte Carlo (MC) integration is used ubiquitously in realistic image synthesis because of its flexibility and generality. However, the integration has to balance estimator bias and variance, which causes visually distracting noise with low sample counts. Existing solutions fall into two categories, in-process sampling schemes and post-processing re...
Preprint
Full-text available
Although bipedal locomotion provides the ability to traverse unstructured environments, it requires careful planning and control to safely walk across without falling. This poses an integrated challenge for the robot to perceive, plan, and control its movements, especially with dynamic motions where the robot may have to adapt its swing-leg traject...
Article
Serious noise affects the rendering of global illumination using Monte Carlo (MC) path tracing when insufficient samples are used. The two common solutions to this problem are filtering noisy inputs to generate smooth but biased results and sampling the MC integrand with a carefully crafted probability distribution function (PDF) to produce unbiase...
Article
Full-text available
A recent trend in optimal motion planning has broadened the research area toward the hybridization of sampling, optimization, and grid-based approaches. A synergy from such integrations can be expected to bring an overall performance improvement, but seamless integration and generalization remain an open problem. In this paper, we suggest a hybr...
Preprint
Full-text available
A redundant manipulator has multiple inverse kinematics solutions per end-effector pose. Accordingly, there can be many trajectories for joints that follow a given end-effector path in Cartesian space. In this paper, we present a trajectory optimization of a redundant manipulator (TORM) to synthesize a trajectory that follows a given end-effec...
Preprint
This paper proposes a real-time system integrating acoustic material estimation from visual appearance and on-the-fly mapping in 3-D. The proposed method estimates the acoustic materials of surroundings in indoor scenes and incorporates them into a 3-D occupancy map as a robot moves around the environment. To estimate the acoustic...
Preprint
We propose a novel method utilizing an objectness score for maintaining the locations and classes of objects detected from Mask R-CNN during mobile robot navigation. The objectness score is defined to measure how well the detector identifies the locations and classes of objects during navigation. Specifically, it is designed to increase when there...
Article
Full-text available
We present a new outlier removal technique for gradient‐domain path tracing (G‐PT), which computes image gradients as well as colors. Our approach rejects gradient outliers whose estimated errors are much higher than those of the other gradients to improve the reconstruction quality of G‐PT. We formulate our outlier removal problem as a least t...
Preprint
Recently, deep learning based single image reflection separation methods have been exploited widely. To benefit the learning approach, a large number of training image pairs (i.e., with and without reflections) were synthesized in various ways, yet these synthesis methods are not physically based. In this paper, physically based rendering is used fo...
Preprint
We present a novel, robust sound source localization algorithm considering back-propagation signals. Sound propagation paths are estimated by generating direct and reflection acoustic rays based on ray tracing in a backward manner. We then compute the back-propagation signals by designing and using the impulse response of the backward sound propaga...
Article
In this paper, we present two novel approaches, super rays and culling region, for efficiently updating grid-based occupancy maps with point clouds. Rays, which traverse from the sensor origin to the sensor data, update the occupancy probabilities of a map representing an environment. Based on the ray model, we define a super ray as a representat...
Article
In this paper, we propose a new technique to incorporate recent adaptive rendering approaches built upon local regression theory into a gradient‐domain path tracing framework, in order to achieve high‐quality rendering results. Our method aims to reduce random artifacts introduced by random sampling on image colors and gradients. Our high‐level app...
Preprint
Full-text available
We present a novel sound localization algorithm for a non-line-of-sight (NLOS) sound source in indoor environments. Our approach exploits the diffraction properties of sound waves as they bend around a barrier or an obstacle in the scene. We combine a ray tracing based sound propagation algorithm with a Uniform Theory of Diffraction (UTD) model, wh...
Preprint
Full-text available
Mobile manipulation planning commonly adopts a decoupled approach that performs planning separately for the base and manipulator, to make each planning space low dimensional. While this approach is fast, it can generate suboptimal paths or cannot find solutions. Another approach, i.e., a coupled approach, is to jointly adjust the base and manipulat...
Preprint
We introduce an optical neural network system made of off-the-shelf components. Compared to electronic systems, the proposed system can execute advanced inference tasks at the speed of light and with much lower power consumption. By using see-through screens and optical components, it can support a programmable optical neural network, see-through displa...
Book
https://sglab.kaist.ac.kr/~sungeui/render/ Rendering is a way of visualizing various 3D models in 2D images or videos. It is one of the fundamental tools in the field of computer graphics. Thanks to its ubiquitous demand, it is used not only for applications in computer graphics but also in many other fields. There have been tremendous...
Article
Approximate K-nearest neighbor search is a fundamental problem in computer science. The problem is especially important for high-dimensional and large-scale data. Recently, many techniques encoding high-dimensional data to compact codes have been proposed. The product quantization and its variations that encode the cluster index in each subspace ha...
Article
Full-text available
We present a novel, reflection-aware method for 3D sound localization in indoor environments. Unlike prior approaches, which are mainly based on continuous sound signals from a stationary source, our formulation is designed to localize the position instantaneously from signals within a single frame. We consider direct sound and indirect sound signa...
Conference Paper
We present a timeline based scheduling method for Monte Carlo ray tracing of out-of-core models on distributed memory clusters. We abstract different setups of various compute and memory devices into a graph-based representation, and estimate the time for job execution and data transfer in a simple timing model. Our scheduler allocates not only job...
Article
Full-text available
We present a rank-based voting technique utilizing inclusion relationship for high-quality image search. Since images can have multiple regions of interest, we extract representative object regions using a state-of-the-art region proposal method tailored for our search problem. We then extract CNN features locally from those representative regions...
Article
We present an interactive technique for generating realistic lightning. Our method captures the main characteristics of the dielectric breakdown model, a physical model for lightning formation. Our algorithm uses a distance-based approximation to quickly compute the electric potentials of different charge types. In particular, we use a rational fun...
Article
Full-text available
Performance of interactive graphics walkthrough systems depends on the time taken to fetch the required data from the secondary storage to main memory. It has been earlier established that a large fraction of this fetch time is spent on seeking the data on the hard disk. In order to reduce this seek time, redundant data storage has been proposed in...
Article
We present a novel, compact bounding volume hierarchy, TSS BVH, for ray tracing subdivision surfaces computed by the Catmull-Clark scheme. We use Tetrahedron Swept Sphere (TSS) as a bounding volume to tightly bound limit surfaces of such subdivision surfaces given a user tolerance. Geometric coordinates defining our TSS bounding volumes are implici...
Article
Monte Carlo ray tracing has been widely used for simulating a diverse set of photo-realistic effects. However, this technique typically produces noise when insufficient numbers of samples are used. As the number of samples allocated per pixel is increased, the rendered images converge. However, this approach of generating sufficient numbers of samp...
Article
We propose to use discriminative subgraphs to discover family photos from group photos in an efficient and effective way. Group photos are represented as face graphs by identifying social contexts such as age, gender, and face position. The previous work utilized bag-of-word models and considered frequent subgraphs from all group photos as features...
Article
Under the conditional independence assumption among local features, the Naive Bayes Nearest Neighbor (NBNN) classifier has been recently proposed and performs classification without any training or quantization phases. While the original NBNN shows high classification accuracy without adopting an explicit training phase, the conditional independenc...
Article
Many binary code embedding schemes have been actively studied recently, since they can provide efficient similarity search, and compact data representations suitable for handling large scale image databases. Existing binary code embedding techniques encode high-dimensional data by using hyperplane-based hashing functions. In this paper we propose a...
Article
Full-text available
We present a recursive path-planning method that efficiently generates a path by using reduced states of the search space and taking into account the kinematics, shape, and turning space of a car-like vehicle. Our method is based on a kinematics-aware node expansion method that checks for collisions based on the shape and turning space of a vehicle...
