Xiaolong Zhu

The University of Hong Kong, Hong Kong, Hong Kong

Publications (6) · 5.35 total impact

  • Xiaolong Zhu · Xuhui Jia · Kwan-Yee K. Wong
    ABSTRACT: Hand detection has many important applications in human-computer interaction, yet it is a challenging problem because the appearance of hands can vary greatly in images. In this paper, we present a new approach that exploits the inherent contextual information from structured hand labelling for pixel-level hand detection and hand part labelling. Using a random forest framework, our method predicts hand masks and hand part labels in an efficient and robust manner. Through experiments, we demonstrate that our method outperforms other state-of-the-art pixel-level detection methods on ego-centric videos, and can further parse hand parts in detail.
    No preview · Article · Dec 2015 · Computer Vision and Image Understanding
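The core idea above — predicting a per-pixel hand mask with a random forest — can be illustrated with a minimal sketch. This is not the paper's structured-labelling method; the features and labels here are synthetic placeholders, and the forest simply classifies each pixel independently:

```python
# Minimal sketch: pixel-level detection as per-pixel classification with a
# random forest. All data below is synthetic and purely illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic training pixels: a 3-dim feature per pixel; pixels whose first
# channel exceeds 0.5 are labelled "hand" (1), the rest "background" (0).
X_train = rng.random((2000, 3))
y_train = (X_train[:, 0] > 0.5).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

# Predict a binary mask for a new 10x10 "image" of pixels.
X_test = rng.random((100, 3))
mask = forest.predict(X_test).reshape(10, 10)
accuracy = (mask.ravel() == (X_test[:, 0] > 0.5)).mean()
```

A real system would replace the random features with per-pixel appearance descriptors and add the structured-context machinery the abstract describes.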
  • Xuhui Jia · Xiaolong Zhu · Angran Lin · K. P. Chan
    ABSTRACT: Face alignment involves locating several facial parts, such as the eyes, nose and mouth, and has popularly been tackled by fitting deformable models. In this paper, we explore the effect of combining structured random regressors with Constrained Local Models (CLMs). Unlike most previous CLMs, we propose novel structured random regressors that give a joint prediction, rather than pursuing independence, while learning the response map for each facial part. In our method, we first present a fast algorithm to learn a local graph, which is then efficiently incorporated into the random regressors. Finally, we regularize the output using a global shape model. The benefits of our method are: (i) the random regressors allow the integration of votes from nearby regions, which can handle various appearance variations; (ii) the local graph encodes local geometry and enables joint learning of the features of facial parts; (iii) the global model regularizes the result to ensure a plausible final shape. Experimentally, we found our method to converge easily. We conjecture that structured random regressors can efficiently select good candidate points. Encouraging experimental results are obtained on several publicly available face databases.
    No preview · Conference Paper · Nov 2013
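The voting idea in (i) — nearby regions casting votes for a part location via a regressor — can be sketched in one dimension. This is only an illustration with synthetic data, not the paper's structured regressors or CLM fitting:

```python
# Illustrative sketch: regression-based landmark voting. A random-forest
# regressor maps a local patch position to an offset toward the landmark,
# and votes from several patches are aggregated. Data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

# Synthetic setup: the true landmark sits at 0.5, so a patch at position x
# should vote with offset (0.5 - x); offsets are observed with noise.
x = rng.random((500, 1))
offsets = (0.5 - x[:, 0]) + rng.normal(0, 0.02, 500)

reg = RandomForestRegressor(n_estimators=50, random_state=0).fit(x, offsets)

# Each test patch votes for landmark = position + predicted offset;
# averaging the votes integrates evidence from nearby regions.
patches = rng.random((20, 1))
votes = patches[:, 0] + reg.predict(patches)
landmark = votes.mean()
```

In the paper's setting the votes are additionally coupled through the learned local graph and regularized by a global shape model.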
  • Xiaolong Zhu · Ruoxin Sang · Xuhui Jia · Kwan-Yee K. Wong
    ABSTRACT: Hand shape recognition is one of the most important techniques used in human-computer interaction. However, it often takes developers great effort to customize their hand shape recognizers. In this paper, we present a novel method that enables a hand shape recognizer to be built automatically from simple sketches, such as a "stick figure" of a hand shape. We introduce the Hand Boltzmann Machine (HBM), a generative model built upon unsupervised learning, to represent the hand shape space of a binary image, and formulate the user-provided sketches as initial guidance for sampling to generate realistic hand shape samples. Such samples are then used to train a hand shape recognizer. We evaluate our method and compare it with other state-of-the-art models in three respects, namely i) its capability of handling different sketch inputs, ii) its classification accuracy, and iii) its ability to handle occlusions. Experimental results demonstrate the great potential of our method in real-world applications.
    No preview · Conference Paper · Nov 2013
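The generative-model-plus-sampling idea can be sketched with an off-the-shelf restricted Boltzmann machine: fit it to binary shape images, then run Gibbs sampling from an initial guess. This is a loose analogy on toy data — the paper's HBM architecture and its sketch-guided sampling are not reproduced here:

```python
# Illustrative sketch: a Bernoulli RBM as a generative model of binary
# "shape" images, with Gibbs sampling to draw new samples. Toy data only.
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(3)

# Toy binary shapes: 6x6 images (flattened to 36) whose left or right
# half is on, plus 5% bit-flip noise.
left_on = np.zeros((50, 36)); left_on[:, :18] = 1.0
right_on = np.zeros((50, 36)); right_on[:, 18:] = 1.0
data = np.vstack([left_on, right_on])
flips = rng.random(data.shape) < 0.05
data = np.abs(data - flips)

rbm = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20,
                   random_state=0)
rbm.fit(data)

# Gibbs sampling from a random initial state; in the paper the chain would
# instead be initialized from the user's sketch.
v = (rng.random((1, 36)) < 0.5).astype(float)
for _ in range(200):
    v = rbm.gibbs(v)   # one Gibbs step: visible -> hidden -> visible
sample = v.astype(int)
```

The resulting samples could then serve as synthetic training data for a discriminative recognizer, mirroring the pipeline the abstract describes.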
  • Source
    ABSTRACT: In this paper, we introduce a novel method for depth acquisition based on the refraction of light. A scene is captured twice by a fixed camera: once directly, and once with a transparent medium placed between the scene and the camera. A depth map of the scene is then recovered from the displacements of scene points in the images. Unlike other existing depth-from-refraction methods, our method does not require prior knowledge of the pose and refractive index of the transparent medium, but instead recovers them directly from the input images. By analyzing the displacements of corresponding scene points in the images, we derive closed-form solutions for recovering the pose of the transparent medium and develop an iterative method for estimating its refractive index. Experimental results on both synthetic and real-world data are presented, which demonstrate the effectiveness of the proposed method.
    Preview · Article · Mar 2012 · International Journal of Computer Vision
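The physical effect this entry builds on — a transparent medium laterally displacing the apparent position of a scene point — follows from Snell's law. The sketch below computes the standard lateral shift of a ray through a flat slab; it illustrates why the displacement carries depth information, but it is not the paper's calibration or depth-recovery algorithm:

```python
# Lateral displacement of a ray passing through a flat transparent slab.
# Standard geometric optics; units of t and of the result are the same.
import math

def slab_displacement(t, incidence_deg, n):
    """Lateral shift of a ray through a slab of thickness t, index n."""
    i = math.radians(incidence_deg)
    r = math.asin(math.sin(i) / n)      # Snell's law: sin i = n sin r
    return t * math.sin(i - r) / math.cos(r)

# A 10 mm slab of glass (n = 1.5) at 30 degrees incidence shifts the ray
# by roughly 1.94 mm; at normal incidence there is no shift at all.
shift = slab_displacement(t=10.0, incidence_deg=30.0, n=1.5)
```

Because the shift depends on the viewing geometry, observing it for many corresponding points constrains both the medium's parameters and the scene depth, which is the intuition behind the method.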
  • Xiaolong Zhu · K.-Y.K. Wong
    ABSTRACT: This paper presents a flexible method for single-frame hand gesture recognition that fuses information from color and depth images. Existing methods usually focus on designing intuitive features for color and depth images separately. In contrast, our method first extracts common patch-level features and fuses them by means of kernel descriptors. A linear SVM is then adopted to predict the class label efficiently. In our experiments on two American Sign Language (ASL) datasets, we demonstrate that our approach recognizes each sign accurately with only a small number of training samples, and is robust to changes in the distance between the hand and the camera.
    No preview · Conference Paper · Jan 2012
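The pipeline above — per-image color and depth features, fused, then fed to a linear SVM — can be sketched as follows. The paper fuses via kernel descriptors; plain feature concatenation here is a stand-in, and all data is synthetic:

```python
# Illustrative sketch: fuse color and depth descriptors and classify
# gestures with a linear SVM. Synthetic, well-separated toy data.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
n_per_class, n_classes = 30, 3

# Synthetic per-image features: an 8-dim "color" and an 8-dim "depth"
# descriptor, each drawn around a class-specific mean.
color = np.concatenate(
    [rng.normal(c, 0.3, (n_per_class, 8)) for c in range(n_classes)])
depth = np.concatenate(
    [rng.normal(-c, 0.3, (n_per_class, 8)) for c in range(n_classes)])
X = np.hstack([color, depth])              # simple feature-level fusion
y = np.repeat(np.arange(n_classes), n_per_class)

clf = LinearSVC(C=1.0).fit(X, y)
train_acc = clf.score(X, y)
```

With only 30 samples per class the linear model separates these toy classes easily, which loosely mirrors the abstract's claim that few training samples suffice.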
  • Source
    ABSTRACT: In this paper, we introduce a novel method for depth acquisition based on the refraction of light. A scene is captured twice by a fixed perspective camera, with the first image captured directly and the second with a transparent medium placed between the scene and the camera. A depth map of the scene is then recovered from the displacements of scene points in the images. Unlike other existing depth-from-refraction methods, our method does not require knowledge of the pose and refractive index of the transparent medium, but recovers them directly from the input images. We hence call our method self-calibrating depth from refraction. Experimental results on both synthetic and real-world data are presented, which demonstrate the effectiveness of the proposed method.
    Preview · Conference Paper · Jan 2011