
Hyung Jin Chang, PhD
University of Birmingham · School of Computer Science
About
74 Publications
21,655 Reads
3,776 Citations
Introduction
Additional affiliations
January 2018 - present
May 2013 - present
March 2006 - May 2013
Education
March 2006 - February 2013
March 2001 - February 2006
Publications (74)
We propose a new gaze-initialized optimization framework to generate aesthetically pleasing image crops based on user descriptions. We extend the existing description-based image cropping dataset by collecting user eye movements corresponding to the image captions. To best leverage the contextual information to initialize the optimization framew...
In this paper, we propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images. The proposed method leverages an efficient 3D data augmentation and a novel vector-based decoupled rotation representation. Specifically, we first design an orientation-aware autoencoder with 3D graph con...
Tracking in 3D scenes is gaining momentum because of its numerous applications in robotics, autonomous driving, and scene understanding. Currently, 3D tracking is limited to specific model-based approaches involving point clouds, which prevents 3D trackers from being applied in natural 3D scenes. RGBD sensors provide a more reasonable and acceptable solu...
Despite the recent efforts in accurate 3D annotations in hand and object datasets, there still exist gaps in 3D hand and object reconstructions. Existing works leverage contact maps to refine inaccurate hand-object pose estimations and generate grasps given object models. However, they require explicit 3D supervision which is seldom available and t...
Recent unsupervised domain adaptation methods have utilized vicinal space between the source and target domains. However, the equilibrium collapse of labels, a problem where the source labels are dominant over the target labels in the predictions of vicinal instances, has never been addressed. In this paper, we propose an instance-wise minimax stra...
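Vicinal instances of the kind discussed above are commonly built by interpolating a source and a target sample, mixup-style. A minimal sketch of that construction follows; the instance-wise minimax strategy itself is not shown, and the function name is illustrative:

```python
import numpy as np

def vicinal_instance(x_src, x_tgt, alpha=1.0, rng=None):
    """Mix a source and a target sample to obtain a vicinal instance (mixup-style).

    The mixing ratio is drawn from Beta(alpha, alpha); it can be reused to mix
    the corresponding label predictions in the same proportion.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x_src + (1.0 - lam) * x_tgt, lam

# Example: a batch of 8 mixed feature vectors between the two domains.
src, tgt = np.random.randn(8, 128), np.random.randn(8, 128)
mixed, lam = vicinal_instance(src, tgt)
```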
We propose a new transformer model for the task of unsupervised learning of skeleton motion sequences. The existing transformer model used for unsupervised skeleton-based action learning learns the instantaneous velocity of each joint from adjacent frames without global motion information. Thus, the model has difficulty learning the at...
Estimating the pose and shape of hands and objects under interaction finds numerous applications including augmented and virtual reality. Existing approaches for hand and object reconstruction require explicitly defined physical constraints and known objects, which limits their application domains. Our algorithm is agnostic to object models, and it l...
We created a novel dataset based on MS-COCO, similar to the refCOCO and Visual Genome datasets. We randomly selected 100 images from the MS-COCO test set, so that none of the images had ever been seen by any of our pre-trained networks, and created our captions for each image. The photos can be found under images/. In our new dataset, we are interested in e...
Fast and accurate tracking of an object's motion is one of the key functionalities of a robotic system for achieving reliable interaction with the environment. This paper focuses on the instance-level six-dimensional (6D) pose tracking problem with a symmetric and textureless object under occlusion. We propose a Temporally Primed 6D pose tracking f...
We propose a novel optimization framework that crops a given image based on user description and aesthetics. Unlike existing image cropping methods, where one typically trains a deep network to regress to crop parameters or cropping actions, we propose to directly optimize for the cropping parameters by repurposing pre-trained networks on image cap...
Utilizing vicinal space between the source and target domains is one of the recent unsupervised domain adaptation approaches. However, the problem of the equilibrium collapse of labels, where the source labels are dominant over the target labels in the predictions of vicinal instances, has never been addressed. In this paper, we propose an instance...
In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image. Previous methods suffer from inefficient category-level pose feature extraction, which leads to low accuracy and inference speed. To tackle this problem, we propose a fast shape-based network (FS-Net) with efficient category-level feature extraction...
To tackle problems arising from unexpected camera motions in unmanned aerial vehicles (UAVs), we propose a three-mode ensemble tracker where each mode specializes in a distinct set of situations. The proposed ensemble tracker is composed of an appearance-based tracking mode, a homography-based tracking mode, and a momentum-based tracking mode. The appearance-ba...
Most of the existing literature on hyperbolic embedding concentrates on supervised learning, whereas the use of unsupervised hyperbolic embedding is less well explored. In this paper, we analyze how unsupervised tasks can benefit from learned representations in hyperbolic space. To explore how well the hierarchical structure of unlabeled da...
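For reference, the geodesic distance in the Poincaré-ball model is the quantity such hyperbolic embeddings are typically trained and evaluated with. The snippet below is a generic implementation of that standard formula, not code from the paper:

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball."""
    uu = np.sum(u * u)
    vv = np.sum(v * v)
    diff = np.sum((u - v) ** 2)
    x = 1.0 + 2.0 * diff / ((1.0 - uu) * (1.0 - vv) + eps)
    return np.arccosh(x)

# Points near the boundary are exponentially far from the origin, which is what
# lets hyperbolic space encode tree-like hierarchies compactly.
print(poincare_distance(np.array([0.0, 0.0]), np.array([0.9, 0.0])))
```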
Assistive robots in home environments are steadily increasing in popularity. Due to significant variabilities in human behaviour, as well as physical characteristics and individual preferences, personalising assistance poses a challenging problem. In this paper, we focus on an assistive dressing task that involves physical contact with a human’s up...
We propose a symmetric graph convolutional autoencoder which produces a low-dimensional latent representation from a graph. In contrast to the existing graph autoencoders with asymmetric decoder parts, the proposed autoencoder has a newly designed decoder which builds a completely symmetric autoencoder form. For the reconstruction of node features,...
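A minimal sketch of the idea of pairing a smoothing encoder with a mirror-image sharpening decoder on the graph. The operator forms below are an assumption based on standard Laplacian smoothing/sharpening; the full model, training objective, and reconstruction losses are in the paper:

```python
import numpy as np

def smoothing_op(A):
    # Encoder propagation: renormalised Laplacian smoothing, D^-1/2 (A + I) D^-1/2.
    A_hat = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * np.outer(d_inv_sqrt, d_inv_sqrt)

def sharpening_op(A):
    # Decoder propagation (assumed form): the mirror image of the smoothing
    # operator about the identity, 2I - D^-1/2 (A + I) D^-1/2.
    return 2.0 * np.eye(len(A)) - smoothing_op(A)

# One-layer encode/decode with random weights, just to show the symmetry.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
X = np.random.randn(3, 4)
W_enc, W_dec = np.random.randn(4, 2), np.random.randn(2, 4)
Z = np.tanh(smoothing_op(A) @ X @ W_enc)   # latent node representations
X_rec = sharpening_op(A) @ Z @ W_dec       # node-feature reconstruction
```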
This paper proposes a new high dimensional regression method by merging Gaussian process regression into a variational autoencoder framework. In contrast to other regression methods, the proposed method focuses on the case where output responses are on a complex high dimensional manifold, such as images. Our contributions are summarized as follows:...
The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popula...
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine,...
We present a novel framework for finding the kinematic structure correspondences between two articulated objects in videos via hypergraph matching. In contrast to appearance and graph alignment based matching methods, which have been applied among two similar static images, the proposed method finds correspondences between two dynamic kinematic str...
Hand-eye coordination is a requirement for many manipulation tasks including grasping and reaching. However, accurate hand-eye coordination has proven especially difficult to achieve in complex robots like the iCub humanoid. In this work, we solve the hand-eye coordination task using a visuomotor deep neural network predictor that estimates th...
In this work, we consider the problem of robust gaze estimation in natural environments. Large camera-to-subject distances and high variations in head pose and eye gaze angles are common in such environments. This leads to two main shortfalls in state-of-the-art methods for gaze estimation: hindered ground truth gaze annotation and diminished gaze...
We propose a new context-aware correlation filter based tracking framework to achieve both high computational speed and state-of-the-art performance among real-time trackers. The major contribution to the high computational speed lies in the proposed deep feature compression that is achieved by a context-aware scheme utilizing multiple expert auto-...
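The correlation-filter core behind such real-time trackers admits a closed-form frequency-domain solution. Below is a generic single-channel sketch of that core; the paper's actual contributions (context-aware deep feature compression with multiple expert autoencoders) are not reproduced here:

```python
import numpy as np

def train_correlation_filter(patch, target_response, lam=1e-2):
    """Closed-form (MOSSE-style) filter: H* = (G . conj(F)) / (F . conj(F) + lam)."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target_response)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(H_star, patch):
    """Correlate a new patch with the learned filter; the response peak gives the target shift."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * H_star))
    return np.unravel_index(np.argmax(response), response.shape)
```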
In this paper, we present a novel framework for unsupervised kinematic structure learning of complex articulated objects from a single-view 2D image sequence. In contrast to prior motion-based methods, which estimate relatively simple articulations, our method can generate arbitrarily complex kinematic structures with skeletal topology via a succes...
We propose a new tracking framework with an attentional mechanism that chooses a subset of the associated correlation filters for increased robustness and computational efficiency. The subset of filters is adaptively selected by a deep attentional network according to the dynamic properties of the tracking target. Our contributions are manifold, an...
We propose a novel training procedure for Generative Adversarial Networks (GANs) to improve stability and performance by using an adaptive hinge loss objective function. We estimate the appropriate hinge loss margin with the expected energy of the target distribution, and derive both a principled criterion for updating the margin and an approximate...
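A sketch of what an energy-based hinge objective with an adaptively estimated margin could look like. The exact margin-update criterion derived in the paper is not reproduced; the EMA rule below is an illustrative assumption:

```python
import torch
import torch.nn.functional as F

def discriminator_hinge_loss(e_real, e_fake, margin):
    # Energy-based hinge: pull real-sample energy down, push fake energy above the margin.
    return e_real.mean() + F.relu(margin - e_fake).mean()

def generator_loss(e_fake):
    # The generator tries to lower the energy the discriminator assigns to fakes.
    return e_fake.mean()

def update_margin(margin, e_real, momentum=0.99):
    # Illustrative rule: track the expected energy of real samples with an EMA.
    return momentum * margin + (1.0 - momentum) * e_real.mean().item()
```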
We propose an online iterative path optimisation method to enable a Baxter humanoid robot to assist human users to dress. The robot searches for the optimal personalised dressing path using vision and force sensor information: vision information is used to recognise the human pose and model the movement space of upper-body joints; force sensor info...
In the future, robots will support humans in their everyday activities. One particular challenge that robots will face is understanding and reasoning about the actions of other agents in order to cooperate effectively with humans. We propose to tackle this using a developmental framework, where the robot incrementally acquires knowledge, and in pa...
The Visual Object Tracking challenge VOT2016 aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 70 trackers are presented, with a large number of trackers being published at major computer vision conferences and journals in recent years. The number of tested state-of-...
In this paper, we present a novel framework for finding the kinematic structure correspondence between two objects in videos via hypergraph matching. In contrast to prior appearance and graph alignment based matching methods which have been applied among two similar static images, the proposed method finds correspondences between two dynamic kinema...
In this paper, we present an approach to enable a humanoid robot to provide personalised dressing assistance for human users using multi-modal information. A depth sensor is mounted on top of the robot to provide visual information, and the robot end effectors are equipped with force sensors to provide haptic information. We use visual information...
A new blockwise collaborative representation-based classification method using the L2-norm of test data is presented for accurate face recognition. For training, we divide images into several blocks and estimate the representation coefficients of each block via L2-norm minimisation. For testing, the L2-norms of test image blocks are scaled by the trained representat...
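Collaborative representation coefficients under an L2 penalty have a closed form, which is what makes a blockwise variant cheap. A minimal per-block sketch follows; the block layout and the exact scaling rule from the paper are not reproduced, and the function names are illustrative:

```python
import numpy as np

def crc_coefficients(D, y, lam=1e-3):
    """L2-regularised collaborative representation: argmin_a ||y - D a||^2 + lam ||a||^2."""
    n = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)

def classify_block(D, labels, y, lam=1e-3):
    """Assign the block to the class with the smallest class-wise reconstruction residual.

    D: (d, n) dictionary of training blocks; labels: (n,) integer class labels per column.
    """
    a = crc_coefficients(D, y, lam)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - D[:, labels == c] @ a[labels == c]) for c in classes]
    return classes[int(np.argmin(residuals))]
```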
In this paper we present the latent regression forest (LRF), a novel framework for real-time, 3D hand pose estimation from a single depth image. Prior discriminative methods often fall into two categories: holistic and patch-based. Holistic methods are efficient but less flexible due to their nearest neighbour nature. Patch-based methods can genera...
We present a Spatio-Temporal Attention Relocation (STARE) method, an information-theoretic approach for efficient detection of simultaneously occurring structured activities. Given multiple human activities in a scene, our method dynamically focuses on the currently most informative activity. Each activity can be detected without complete observati...
Assistive robots can improve the well-being of disabled or frail human users by reducing the burden that activities of daily living impose on them. To enable personalised assistance, such robots benefit from building a user-specific model, so that the assistance is customised to the particular set of user abilities. In this paper, we present an end...
In this paper we present a novel framework for unsupervised kinematic structure learning of complex articulated objects from a single-view image sequence. In contrast to prior motion information based methods, which estimate relatively simple articulations, our method can generate arbitrarily complex kinematic structures with skeletal topology by a...
In this paper we present a novel framework for simultaneous detection of click action and estimation of occluded fingertip positions from egocentric viewed single-depth image sequences. For the detection and estimation, a novel probabilistic inference based on knowledge priors of clicking motion and clicked position is presented. Based on the detec...
In this paper we present the Latent Regression Forest (LRF), a novel framework for real-time, 3D hand pose estimation from a single depth image. In contrast to prior forest-based methods, which take dense pixels as input, classify them independently and then estimate joint positions afterwards, our method can be considered as a structured coarse-to...
Recognizing actions in a video is a critical step for making many vision-based applications possible and has attracted much attention recently. However, action recognition in a video is a challenging task due to wide variations within an action, camera motion, cluttered background, and occlusions, to name a few. While dense sampling based approache...
In this paper we propose an object tracking method for cases of inaccurate initialization. To track objects accurately in such situations, the proposed method uses the "motion saliency" and "descriptor saliency" of local features and performs tracking based on the generalized Hough transform (GHT). The proposed motion saliency of a local feature emphasizes f...
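In a GHT-based tracker, each matched local feature casts a vote for the object centre through its stored offset. A bare-bones voting sketch with per-feature weights is shown below; the motion-saliency and descriptor-saliency definitions from the paper are not reproduced:

```python
import numpy as np

def ght_vote(feature_xy, offsets, weights, accumulator_shape):
    """Accumulate weighted votes for the object centre from local features.

    feature_xy: (N, 2) feature positions, offsets: (N, 2) stored feature-to-centre
    offsets, weights: (N,) saliency weights for each feature.
    """
    acc = np.zeros(accumulator_shape)
    votes = np.round(feature_xy + offsets).astype(int)
    for (x, y), w in zip(votes, weights):
        if 0 <= y < accumulator_shape[0] and 0 <= x < accumulator_shape[1]:
            acc[y, x] += w
    return np.unravel_index(np.argmax(acc), acc.shape)  # (row, col) of the centre estimate
```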
Conventional minutiae-based fingerprint recognition approaches consider only local characteristics, and their accuracy decreases dramatically as the number of available minutiae decreases. We propose new features based on the Abstracted Radon Profile (ARP). The proposed method uses global properties of an image and does not necessitate any heavy preproce...
Detecting moving objects on mobile cameras in real-time is a challenging problem due to the computational limits and the motions of the camera. In this paper, we propose a method for moving object detection on non-stationary cameras running within 5.8 milliseconds (ms) on a PC, and real-time on mobile devices. To achieve real time capability with s...
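The usual first step for moving-object detection on a non-stationary camera is compensating the camera motion before differencing. The OpenCV sketch below illustrates only that step under stated assumptions; the paper's actual background model and the optimisations that give its reported speed are not shown:

```python
import cv2
import numpy as np

def motion_compensated_diff(prev_gray, curr_gray):
    """Warp the previous frame onto the current one with a homography, then difference."""
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=8)
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts_prev, None)
    good = status.ravel() == 1
    H, _ = cv2.findHomography(pts_prev[good], pts_curr[good], cv2.RANSAC, 3.0)
    warped_prev = cv2.warpPerspective(prev_gray, H, curr_gray.shape[::-1])
    diff = cv2.absdiff(curr_gray, warped_prev)
    return cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)[1]
```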
In this paper we propose an efficient method for the recognition of long and complex action streams. First, we design a new motion feature flow descriptor by composing low-level local features. Then a new data embedding method is developed in order to represent the motion flow as a one-dimensional sequence, whilst preserving useful motion informat...
In this paper, we propose a hierarchical feature grouping method for multiple object segmentation and tracking. The proposed method aims to segment and track objects at the object level without prior knowledge of the scene or objects. We first group the motion features at the region level using the proposed region features, which represent a homoge...
In this paper, we present an active sampling method to speed up conventional pixel-wise background subtraction algorithms. The proposed active sampling strategy is designed to focus on attentional regions such as foreground regions. The attentional region is estimated from the detection results of the previous frame in a recursive probabilistic way. For the e...
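A toy illustration of the active-sampling idea: only evaluate or update the background model at pixels drawn from a probability map biased toward the previous frame's foreground. The recursive probabilistic estimate used in the paper is not reproduced, and the rates below are illustrative:

```python
import numpy as np

def sampling_mask(prev_fg_mask, base_rate=0.05, fg_boost=0.9, rng=None):
    """Return a boolean mask of pixels to process in the current frame.

    Pixels that were foreground in the previous frame are sampled with high
    probability; the rest are sampled sparsely, saving per-frame computation.
    """
    rng = rng or np.random.default_rng()
    prob = np.where(prev_fg_mask, fg_boost, base_rate)
    return rng.random(prev_fg_mask.shape) < prob
```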
To solve the problems caused by fast illumination changes in a visual surveillance system, we propose a novel moving object detection algorithm for which we develop an illumination change model, a chromaticity difference model, and a brightness ratio model. When a fast illumination change occurs, background pixels as well as moving object pixels are detec...
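The chromaticity-difference and brightness-ratio cues mentioned above can be illustrated as follows: under a fast, roughly global illumination change, a background pixel keeps its chromaticity while its brightness scales, whereas a true object pixel changes chromaticity as well. The models and thresholds below are illustrative, not the paper's:

```python
import numpy as np

def chromaticity(rgb, eps=1e-6):
    # Normalised chromaticity coordinates (r, g); invariant to brightness scaling.
    s = rgb.sum(axis=-1, keepdims=True) + eps
    return rgb[..., :2] / s

def illumination_robust_fg(frame, background, chroma_thr=0.04, ratio_band=(0.4, 2.5)):
    """Keep only pixels whose chromaticity differs from the background; pixels that
    merely scaled in brightness within a plausible band are treated as illumination change."""
    frame = frame.astype(float)
    background = background.astype(float)
    chroma_diff = np.abs(chromaticity(frame) - chromaticity(background)).sum(axis=-1)
    ratio = (frame.sum(axis=-1) + 1e-6) / (background.sum(axis=-1) + 1e-6)
    brightness_only = (chroma_diff < chroma_thr) & (ratio > ratio_band[0]) & (ratio < ratio_band[1])
    naive_diff = np.abs(frame - background).sum(axis=-1) > 60
    return naive_diff & ~brightness_only
```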
This paper proposes a new approach to modeling the sequential flow characteristics of data patterns for detecting and classifying faulty processes in semiconductor manufacturing. Unlike conventional methods, which consider the spatial pattern distributions, the proposed approach models the spatial patterns local in time, transition time, staying ti...
In this paper, we introduce a new platform for the integrated development of visual surveillance algorithms, named the PIL-EYE system. In our system, any functional module or algorithm can be added or removed without affecting other modules. Also, the functional flow can be designed by simply scheduling the order of modules. Algorithm optimization becomes e...
In this paper, we present a tracking failure detection method that imitates the human visual system. By adopting a log-polar transformation, we can simulate properties of the retinal image, such as rotation and scaling invariance and foveal predominance. The rotation and scaling invariance helps to reduce false alarms caused by pose changes and intensify tra...
This paper proposes a trajectory analysis method that handles the spatio-temporal properties of trajectories. Rather than using similarity measures between two trajectories, our model analyzes the overall path of a trajectory. Learning of the spatial property is presented as semantic regions (e.g. go straight, turn left, turn right) that are clustered effectively using topic...
A new optical image stabilizing system for a small mobile device camera is presented. A gyro sensor is used to detect the amount of shaking, and a charge-coupled device (CCD) is shifted to correct the deviated optical axis using a voice coil motor (VCM). Because the VCM is nonlinear, unstable, and time-varying, a new adaptive control technique--mul...
In this paper, we present a fast incremental one-class classifier algorithm for large scale problems. The proposed method reduces space and time complexities by reducing training set size during the training procedure using a criterion based on sample margin. After introducing the sample margin concept, we present the proposed algorithm and apply i...
There has been a lot of research on developing intelligent robots that can effectively perform human-robot interaction. Our motivation for this research is to develop a robot that can interact with people and assist them in their daily routines, in common places such as homes, supermarkets, hospitals, or offices. In order to accomplish these tasks...
Support Vector Data Description (SVDD) has a limitation in dealing with large data sets, for which the computational load increases drastically as the training data size becomes large. To handle this problem, we propose a new fast SVDD method using K-means clustering. Our method uses a divide-and-conquer strategy: it trains each decomposed sub-problem t...
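A sketch of the divide-and-conquer idea under stated assumptions: partition the training set with K-means, solve a small one-class problem per cluster (scikit-learn's OneClassSVM is used below as a stand-in for SVDD), and keep only the resulting support vectors for a final merged model. The merge rule and parameters are illustrative, not the paper's:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

def fast_one_class_fit(X, n_clusters=8, nu=0.1, gamma="scale"):
    """Divide-and-conquer one-class training: per-cluster models, then a merge step
    trained only on the support vectors collected from the sub-problems."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    support = []
    for k in range(n_clusters):
        Xk = X[labels == k]
        if len(Xk) < 2:
            support.append(Xk)
            continue
        sub = OneClassSVM(nu=nu, gamma=gamma).fit(Xk)
        support.append(Xk[sub.support_])
    X_sv = np.vstack(support)
    return OneClassSVM(nu=nu, gamma=gamma).fit(X_sv)

# Example: model = fast_one_class_fit(np.random.randn(5000, 16)); model.predict(...) returns +1/-1.
```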
Structural and optical properties of Mg-doped AlGaN/GaN superlattices have been investigated by photoluminescence, scanning electron microscopy, cathodoluminescence (CL), and transmission electron microscopy (TEM). We found that the edge blue-band emission shows a strong optical anisotropy. Through the combination of the CL and TEM images, we clear...