Junghyun Cho's research while affiliated with The Seoul Institute and other places

Publications (23)

Preprint
We propose a scene-level inverse rendering framework that uses multi-view images to decompose the scene into geometry, an SVBRDF, and 3D spatially-varying lighting. Because multi-view images provide a variety of information about the scene, their use in object-level inverse rendering has been taken for granted. However, owing to the absence...
Preprint
In this paper, we propose a new challenge that synthesizes a novel view in a more practical environment, where the number of input multi-view images is limited and illumination variations are significant. Despite recent success, neural radiance fields (NeRF) require a massive amount of input multi-view images taken under constrained illuminations....
Chapter
Many skeleton-based action recognition models have been introduced with the application of graph convolutional networks (GCNs). Most of these models proposed new ways to aggregate information from adjacent joints. In this paper, we propose a novel way to define the adjacency matrix from the geometrical distance between joints. By combining this method wi...
Article
We present Rank-GCN, a robust skeleton-based action recognition method built on a graph convolutional network (GCN) with a new adjacency matrix. In Rank-GCN, the biggest change from previous approaches is how the adjacency matrix is generated to accumulate features from neighboring nodes by re-defining “adjacency.” The new adjacency matrix,...
Article
Customs inspection using X-ray imaging is a very promising application of modern pattern recognition technology. However, the lack of data or renewal of tariff items makes the application of such technology difficult. In this paper, we present a data augmentation technique based on a new image-to-image translation method to deal with these difficult...
Preprint
In this paper, we introduce a new large-scale face database from KIST, denoted as K-FACE, and describe a novel capturing device specifically designed to obtain the data. The K-FACE database contains more than 1 million high-quality images of 1,000 subjects selected by considering the ratio of gender and age groups. It includes a variety of attribut...
Preprint
Face recognition research now requires a large number of labelled masked face images in the era of this unprecedented COVID-19 pandemic. Unfortunately, the rapid spread of the virus has left us little time to prepare such a dataset in the wild. To circumvent this issue, we present a 3D model-based approach called WearMask3D for augmenting face im...
Article
To train deep learning models for vision-based action recognition of elders’ daily activities, we need large-scale activity datasets acquired under various daily living environments and conditions. However, most public datasets used in human action recognition either differ from or have limited coverage of elders’ activities in many aspects, making...
Preprint
To train deep learning models for vision-based action recognition of elders' daily activities, we need large-scale activity datasets acquired under various daily living environments and conditions. However, most public datasets used in human action recognition either differ from or have limited coverage of elders' activities in many aspects, making...
Article
Facial expressions are one of the important non-verbal ways used to understand human emotions during communication. Thus, acquiring and reproducing facial expressions is helpful in analyzing human emotional states. However, owing to complex and subtle facial muscle movements, facial expression modeling from images with face poses is difficult to ac...
Preprint
Existing techniques to encode spatial invariance within deep convolutional neural networks only model 2D transformation fields. This does not account for the fact that objects in a 2D space are a projection of 3D ones, and thus they have limited ability to handle severe object viewpoint changes. To overcome this limitation, we introduce a learnable module...
Article
This study proposes a novel face detector called DEFace that focuses on two challenging face detection tasks: detecting small faces under 12 pixels and detecting faces occluded by a mask or human body parts. It proposes an extended feature pyramid network (FPN) module to detect small faces by expanding the range of the P layer, and the net...
Article
Face recognition is one research area that has benefited from the recent popularity of deep learning, namely the convolutional neural network (CNN) model. Nevertheless, the recognition performance is still compromised by the model’s dependency on the scale of input images and the limited number of feature maps in each layer of the network. To circu...
Article
In this paper, we present a novel framework for automatically assessing facial attractiveness that considers four ratio feature sets as objective elements of facial attractiveness. In our framework, these feature sets are combined with three regression-based predictors to estimate a facial beauty score. To enhance the system's performance to make i...
Conference Paper
We propose an efficient computational method to acquire diffuse and specular normal maps with three types of illumination patterns by removing the redundancy of the conventional method. By analyzing the relationship between the four reflectances under XYZ-gradient and constant patterns, the number of patterns needed is reduced to three.
Conference Paper
Generating a user-specific 3D face model is useful for a variety of applications, such as facial animation, games, and the movie industry. Recently, there have been spectacular developments in 3D sensors; however, accurately recovering the 3D shape model from a single image remains a major challenge in computer vision and graphics. In this paper, we present...
Chapter
Facial modelling is a fundamental technique in a variety of applications in computer graphics, computer vision and pattern recognition areas. As 3D technologies evolved over the years, the quality of facial modelling greatly improved. To enhance the modelling quality and controllability of the model further, parametric methods, which represent or m...
Article
Reconstructing 3D models from single-view 2D information is inherently an ill-posed problem and requires additional information such as a shape prior or user input. We introduce a method to generate multiple 3D models of a particular category given corresponding photographs when the topological information is known. While there is a wide range of...
Conference Paper
What if your electronics with cheap cameras can reveal 3D faces of captured people? In daily life, we use a lot of consumer electronics employing cameras such as a mobile phone, a tablet PC, a CCTV, a car black box, and so on. If such devices provide 3D facial shapes of 2-dimensionally framed people, it would benefit new applications and services i...
Conference Paper
Since character expressions are high-dimensional, it is not easy to control them intuitively with a simple interface. So far, existing controlling and animating methods have mainly relied on three-dimensional motion capture systems for high-quality animation. However, using the three-dimensional motion capture system is not only unhandy but also qui...

Citations

... • We address the issue of calculating adjacency matrices by using the geometrical distance measure and introducing the rank graph convolution algorithm. We use distance rankings instead of using the distance threshold directly as in [6]. By using ranks to determine the adjacent groups of joints, neighboring nodes are better utilized and provide performance and robustness in activity recognition over those of the state-of-the-art methods. ...
... However, for this AI-based system, there is a limitation in that high-quality X-ray image datasets of large size must be available to achieve good results [5,13]. In addition, because most of the publicly available X-ray inspection benchmarks are X-ray images for baggage inspection purposes, another problem is that they cannot be used for cargo inspections with different X-ray imaging energy levels [5,14-19]. ...
... The performance received a testing accuracy score of 88.92%. WearMask3D is a 3D model-based method that Hong et al. [11] presented for enhancing face photos in various positions to their masked face counterparts. They have also shown that practicing with artificially generated 3D masks can increase the recognition accuracy of masked faces. ...
... List of Tables [1][2][3][4][5][6][7]. With the rising use of smartphones and other mobile devices with built-in sensors, human activity identification and localization have become a significant topic of research in the field of ubiquitous computing [8][9][10][11][12][13][14][15][16][17][18][19][20]. Walking, sitting, climbing stairs, coming down steps, standing, and running are a few instances of human movement. ...
... In order to solve the problem of insufficient supervision in semantic alignment and object landmark detection, Jeon et al. [47] designed a joint loss function to impose constraints between tasks, and only reliable matched pairs were used to improve the model robustness with weak supervision. Joung et al. [48] solved the problem of object viewpoint changes in 3D object detection and viewpoint estimation with a cylindrical convolutional network, which obtains view-specific features with structural information at each viewpoint for both tasks. Luo et al. [49] presented a multi-task framework for referring expression comprehension and segmentation. ...
... Multiscale features extracted from the backbone network and FPN (P5, P4, P3, and P2) were exploited by using the RPN. Feature maps obtained via the FPN could contribute to the outstanding detection result without compromising the representational power, speed, or memory (Hoang et al. 2020). For instance, Mask R-CNN effectively performs on the detection of small objects because of the integration of the FPN mechanism (Cai et al. 2020). ...
... In the second family of methods, the multilinear structure is imposed on the latent space of deep autoencoder-based models. More specifically, [178], [179], [202] employ CNN-based encoders to obtain a set of $M - 1$ latent vector representations $\{u_i^{(m)}\}_{m=2}^{M}$ for each image using label information. Each factor accounts for a source of variability in the data, while the multiplicative interactions of these factors emulate the entangled variability using (26). ...
... In [20], the proposed PSI-CNN architecture improves face recognition performance by extracting untrained feature maps across multiple image resolutions, allowing the network to learn scale-independent information and outperforming the VGG face model in terms of accuracy. Paper [21] introduces a real-time ITS that uses video imaging and deep learning to detect vehicles driving the wrong way. ...
... Earlier research on FBP focused mostly on a set of hand-crafted features (geometric and texture) that led to the shallow machine-learning algorithms used to estimate facial aesthetics. Hong et al. [23] considered a set of facial ratios as an objective of facial beauty criteria to be incorporated into ensemble-regression-based predictors to obtain the beauty score. However, geometric-feature-based techniques have limited performance due to the influence of facial-expression variation, and it demands a computational burden through landmark localizations. ...
... Kim Pallister, the Director of Content Strategy for Intel's Visual Computing Team, vividly describes how VR works by deceiving the senses. The development of VR technology is closely related to the progress of computer technology because the quality of three-dimensional (3D) images [9][10][11] and the effectiveness of sensors depend on computing speed. Some time ago, the slowness of machines, small memory capacities, and the immaturity of integrated circuits and imaging technology all caused the development of VR to proceed very slowly, and associated equipment was extremely expensive. ...