Xiangjian He

University of Technology Sydney, Sydney, New South Wales, Australia


Publications (16) · 4.35 Total Impact Points

  • Lei Zhou, Yu Qiao, Yijun Li, Xiangjian He, Jie Yang
    ABSTRACT: This paper proposes a novel interactive segmentation method based on a conditional random field (CRF) model to utilize the location and color information contained in user input. The CRF is configured with the optimal weights between two features, which are the color Gaussian Mixture Model (GMM) and a probability model of location information. To construct the CRF model, we propose a method to collect samples for the training tasks of learning the optimal weights on a single-image basis and updating the parameters of the features. To refine the segmentation results iteratively, our method applies an active learning strategy to guide the process of CRF model updating, or to guide users to input minimal training data for training the optimal weights and updating the parameters of the features. Experimental results show that the proposed method demonstrates qualitative and quantitative improvement compared with state-of-the-art interactive segmentation methods. The proposed method is also a convenient tool for interactive object segmentation.
    Neurocomputing 07/2014; 135:240–252. · 2.01 Impact Factor
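As a rough illustration of the weighted-feature idea in the CRF model above, the sketch below combines a color-GMM likelihood and a location-model likelihood into a single unary energy. The function name, the example probabilities, and the weights are illustrative assumptions, not values from the paper.

```python
import math

def unary_energy(p_color, p_loc, w_color=0.6, w_loc=0.4):
    """Weighted unary energy for one pixel: negative log-likelihoods of the
    color model and the location model, combined with learnt weights
    (weights and probabilities here are made up for illustration)."""
    eps = 1e-12  # guard against log(0)
    return -w_color * math.log(p_color + eps) - w_loc * math.log(p_loc + eps)

# A pixel well explained by both foreground models has lower energy
# (is cheaper to label "foreground") than one explained poorly.
fg_likely = unary_energy(p_color=0.9, p_loc=0.8)
fg_unlikely = unary_energy(p_color=0.1, p_loc=0.2)
```

In a full graph cut / CRF setup these unary energies would be combined with pairwise smoothness terms before optimization.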
  •
    ABSTRACT: Salient object detection is essential for applications such as image classification, object recognition and image retrieval. In this paper, we design a new approach to detect salient objects in an image by describing what salient objects and backgrounds look like using statistics of the image. Firstly, we introduce a saliency-driven clustering method to reveal distinct visual patterns of images by generating image clusters. The Gaussian Mixture Model (GMM) is applied to represent the statistics of each cluster, which are used to compute the color spatial distribution. Secondly, three kinds of regional saliency measures, i.e., regional color contrast saliency, regional boundary prior saliency and regional color spatial distribution, are computed and combined. Then, a region selection strategy integrating the color contrast prior, the boundary prior and the visual pattern information of images is presented. The pixels of an image are divided adaptively into either a potential salient region or a background region based on the combined regional saliency measures. Finally, a Bayesian framework is employed to compute the saliency value for each pixel, taking the regional saliency values as the prior. Our approach has been extensively evaluated on two popular image databases. Experimental results show that our approach achieves considerable performance improvement in terms of commonly adopted performance measures in salient object detection.
    Signal Processing Image Communication 01/2014; · 1.29 Impact Factor
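The final Bayesian step described above can be sketched as follows: a pixel's posterior saliency is computed from feature likelihoods under the salient and background models, with the combined regional saliency value acting as the prior. The likelihood values below are made up for illustration; in the paper they would come from the per-cluster GMM statistics.

```python
def bayes_saliency(p_f_given_sal, p_f_given_bg, prior_sal):
    """Posterior probability that a pixel is salient, with the regional
    saliency value as the prior (a sketch of the Bayesian step only)."""
    num = p_f_given_sal * prior_sal
    den = num + p_f_given_bg * (1.0 - prior_sal)
    return num / den if den > 0 else 0.0

# A pixel whose feature is better explained by the salient model, in a
# region with a high prior, gets a high posterior saliency value.
s = bayes_saliency(p_f_given_sal=0.8, p_f_given_bg=0.2, prior_sal=0.7)
```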
  • Sheng Wang, Xiangjian He, Qiang Wu, Jie Yang
    ABSTRACT: The Local Binary Pattern (LBP) has been well recognised and widely used in various texture analysis applications of computer vision and image processing. It integrates the properties of structural and statistical texture analysis. LBP is invariant to monotonic grey-scale variations and also has extensions for rotation-invariant texture analysis. In recent years, various improvements have been achieved based on LBP. One extensive development replaced the binary representation with a ternary representation, yielding the Local Ternary Pattern (LTP). This paper further generalises the local pattern representation by formulating it as the generalised weight problem of Bachet de Meziriac, and proposes the Local N-ary Pattern (LNP). Encouraging performance is achieved on three benchmark datasets when LNP is compared with its predecessors.
    Advanced Video and Signal Based Surveillance (AVSS), 2013 10th IEEE International Conference on; 01/2013
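For reference, a minimal sketch of the binary and ternary encodings that LNP generalises (LNP itself assigns each neighbour one of N levels; only the well-known LBP/LTP cases are shown here, with a fixed neighbour ordering assumed):

```python
def lbp_code(center, neighbours):
    """Classic 8-neighbour LBP: threshold each neighbour at the centre
    value and read the resulting bits as an 8-bit code."""
    return sum((1 << i) for i, n in enumerate(neighbours) if n >= center)

def ltp_codes(center, neighbours, t=2):
    """LTP splits the ternary pattern (+1/0/-1, with a tolerance band of
    width t around the centre) into upper and lower binary codes."""
    upper = sum((1 << i) for i, n in enumerate(neighbours) if n >= center + t)
    lower = sum((1 << i) for i, n in enumerate(neighbours) if n <= center - t)
    return upper, lower

flat = lbp_code(5, [5] * 8)          # a uniform patch sets every bit
up, lo = ltp_codes(5, [9, 5, 1, 5, 5, 5, 5, 5], t=2)
```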
  • Chen Gong, Keren Fu, Enmei Tu, Jie Yang, Xiangjian He
    ABSTRACT: Object tracking is widely used in many applications such as intelligent surveillance, scene understanding, and behavior analysis. Graph-based semisupervised learning has been introduced to deal with specific tracking problems. However, existing algorithms following this idea focus solely on the pairwise relationships between samples and hence can decrease the classification accuracy for unlabeled samples. In contrast, we regard tracking as a one-class classification issue and present a novel graph-based semisupervised tracker. The proposed tracker uses linear neighborhood propagation, which aims to exploit the local information around each data point. Moreover, the manifold structure embedded in the whole sample set is discovered to allow the tracker to better model the target appearance, which is crucial to resisting appearance variations of the object. Experiments on public-domain sequences show that the proposed tracker exhibits reliable tracking performance in the presence of partial occlusions, complicated backgrounds, and appearance changes.
    Journal of Electronic Imaging 01/2013; · 1.06 Impact Factor
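A toy sketch of graph-based label propagation on a 4-node chain illustrates the semisupervised idea: labeled samples are clamped and unlabeled samples absorb their neighbours' scores. This is a generic propagation scheme under assumed uniform edge weights, not the paper's exact linear neighborhood propagation.

```python
def propagate(neigh, labels, iters=200):
    """Iterative label propagation: unlabeled nodes repeatedly take the
    average score of their neighbours; labeled nodes stay clamped to
    +1 (target) or -1 (background)."""
    f = {i: labels.get(i, 0.0) for i in neigh}
    for _ in range(iters):
        for i in neigh:
            if i not in labels:  # only unlabeled nodes are updated
                f[i] = sum(f[j] for j in neigh[i]) / len(neigh[i])
    return f

# A 4-node chain: node 0 is the target (+1), node 3 is background (-1).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = propagate(chain, labels={0: 1.0, 3: -1.0})
```

The unlabeled nodes settle at +1/3 and -1/3, each taking the sign of its nearer labeled node.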
  •
    ABSTRACT: High-frequency energy distributions are important characteristics of blurry images. In this paper, directional high-pass filters are proposed to analyze blurry images. Firstly, we show that the proposed directional high-pass filters can effectively estimate the motion direction of motion blurred images. A closed-form solution for motion direction estimation is derived. It achieves a higher estimation accuracy and is also faster than previous methods. Secondly, the paper suggests two important applications of the directional high-frequency energy analysis. It can be employed to identify out-of-focus blur and motion blur, and to detect motion blurred regions in observed images. Experiments on both synthetic and real blurred images are conducted. Encouraging results demonstrate the efficacy of the proposed methods.
    Signal Processing: Image Communication. 08/2012; 27(7):760–771.
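The direction-estimation idea can be illustrated with a crude stand-in for the directional high-pass filters: measure high-frequency energy along candidate directions and take the direction of minimum energy as the motion direction, since blur smooths the image along its own direction. The finite-difference energy below is an assumption for illustration, not the paper's closed-form estimator.

```python
def directional_energy(img, dx, dy):
    """High-frequency energy along direction (dx, dy): sum of squared
    finite differences between each pixel and its neighbour in that
    direction (a simplified directional high-pass response)."""
    h, w = len(img), len(img[0])
    e = 0.0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                e += (img[y2][x2] - img[y][x]) ** 2
    return e

# An image blurred horizontally is smooth along x, so the x direction
# carries the least high-frequency energy -> estimated motion direction.
img = [[row % 7 for _ in range(8)] for row in range(8)]  # rows constant
energies = {"x": directional_energy(img, 1, 0),
            "y": directional_energy(img, 0, 1)}
motion_dir = min(energies, key=energies.get)
```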
  • Ruo Du, Qiang Wu, Xiangjian He, Jie Yang
    ABSTRACT: Multi-instance learning (MIL) is a variant of supervised learning. Instead of receiving a set of labeled instances, the learner receives a set of labeled bags, each containing many instances. In this paper, we present a novel MIL algorithm that can efficiently learn classifiers in a large instance space. We achieve this by estimating the instance distribution using a proposed extended kernel density estimation (eKDE), which is an alternative to the previous diverse density estimation (DDE). A fast method is devised to approximately locate the multiple modes of eKDE. Compared to DDE, eKDE is more efficient and more robust to labeling noise (mislabeled training data). We compare our approach with other state-of-the-art MIL methods in object categorization on the popular Caltech-4 and SIVAL datasets; the results illustrate that our approach provides superior performance.
    IEEE International Conference on Multimedia and Expo (ICME) Workshops; 01/2012
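A 1-D sketch of the density-estimation idea: a plain Gaussian KDE over instances with a coarse grid search for its modes. This stands in for the paper's eKDE and its fast mode-location method, whose exact forms are not reproduced here.

```python
import math

def kde(x, samples, h=0.5):
    """Gaussian kernel density estimate at x (a generic KDE standing in
    for the paper's extended KDE over instance space)."""
    norm = len(samples) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-((x - s) / h) ** 2 / 2) for s in samples) / norm

def grid_modes(samples, lo, hi, step=0.05, h=0.5):
    """Coarse mode finding: evaluate the density on a grid and keep the
    grid points that are local maxima."""
    xs = [lo + i * step for i in range(int((hi - lo) / step) + 1)]
    ds = [kde(x, samples, h) for x in xs]
    return [xs[i] for i in range(1, len(xs) - 1)
            if ds[i - 1] < ds[i] >= ds[i + 1]]

# Instances drawn around two concepts (0 and 4) give two density modes.
instances = [-0.1, 0.0, 0.1, 3.9, 4.0, 4.1]
modes = grid_modes(instances, -1.0, 5.0)
```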
  • Sheng Wang, Qiang Wu, Xiangjian He, Min Xu
    ABSTRACT: In this paper, we study the impact of learning an AdaBoost classifier with a small sample set (i.e., with few training examples). In particular, we use car localization as the underlying application, because car localization applies to various real-world applications. To evaluate the performance of AdaBoost learning with few examples, we apply AdaBoost learning to a recently proposed feature descriptor, the Locally Adaptive Regression Kernel (LARK). As a state-of-the-art feature descriptor, LARK is robust against illumination changes and noise. More importantly, we use LARK because its spatial property is also favorable for our purpose (i.e., each patch in the LARK descriptor corresponds to one unique pixel in the original image). In addition to learning a detector from the entire training dataset, we also split the original training dataset into several sub-groups and train one detector for each sub-group. We compare the features selected by the detector of each sub-group with those of the detector learnt from the entire training dataset, and propose improvements based on the comparison results. Our experimental results indicate that AdaBoost learning is only successful on a small dataset when the learnt features simultaneously satisfy two conditions: 1. the features are learnt from the Region of Interest (ROI), and 2. the features are sufficiently far away from each other.
    Control Automation Robotics & Vision (ICARCV), 2012 12th International Conference on; 01/2012
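For context, a minimal discrete AdaBoost over 1-D threshold stumps shows the training loop being studied (reweighting examples and greedily picking weak learners each round); the real system would boost LARK-patch features rather than raw 1-D values.

```python
import math

def stump_error(xs, ys, ws, thr, sign):
    """Weighted error of the stump h(x) = sign if x >= thr else -sign."""
    return sum(w for x, y, w in zip(xs, ys, ws)
               if (sign if x >= thr else -sign) != y)

def adaboost(xs, ys, rounds=3):
    """Minimal discrete AdaBoost: each round picks the stump with the
    lowest weighted error, then reweights the examples."""
    n = len(xs)
    ws = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        cands = [(thr, s) for thr in xs for s in (1, -1)]
        thr, s = min(cands, key=lambda c: stump_error(xs, ys, ws, *c))
        err = max(stump_error(xs, ys, ws, thr, s), 1e-12)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, s))
        ws = [w * math.exp(-alpha * y * (s if x >= thr else -s))
              for x, y, w in zip(xs, ys, ws)]
        z = sum(ws)
        ws = [w / z for w in ws]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (s if x >= thr else -s) for a, thr, s in ensemble)
    return 1 if score >= 0 else -1

# A tiny separable "small sample set": negatives below 3, positives above.
xs, ys = [1.0, 2.0, 4.0, 5.0], [-1, -1, 1, 1]
model = adaboost(xs, ys)
```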
  • Lei Zhou, Yu Qiao, Jie Yang, Xiangjian He
    ABSTRACT: Graph cut based on a color model is sensitive to the statistical information of images. Integrating prior information, such as geodesic distance information, into the graph cut approach may overcome the well-known bias towards shorter paths that occurs frequently with graph cut methods. In this paper, a conditional random field (CRF) model is formulated to combine a color model and geodesic distance information into a graph cut optimization framework. A discriminative model is used to capture more comprehensive statistical information for the geodesic distance. A simple and efficient parameter learning scheme based on feature fusion is proposed for CRF model construction. The method is evaluated by applying it to the segmentation of natural images, medical images and low-contrast images. The experimental results show that the geodesic information obtained by learning can provide more reliable object features. The dynamic parameter learning scheme is able to select the best cues from the geodesic map and the color model for image segmentation.
    Image Processing (ICIP), 2012 19th IEEE International Conference on; 01/2012
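One common way to obtain a geodesic map, shown here as a hedged sketch, is a shortest-path computation over the image grid with intensity-difference edge costs; the paper learns how to weight its geodesic cues, so treat the cost function below as an assumption.

```python
import heapq

def geodesic_map(img, seeds):
    """Geodesic distance from a set of seed pixels: Dijkstra shortest
    paths on the 4-connected grid, where stepping between pixels costs a
    small constant plus the intensity difference."""
    h, w = len(img), len(img[0])
    dist = {s: 0.0 for s in seeds}
    heap = [(0.0, s) for s in seeds]
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if d > dist.get((y, x), float("inf")):
            continue  # stale heap entry
        for y2, x2 in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if 0 <= y2 < h and 0 <= x2 < w:
                nd = d + 0.01 + abs(img[y2][x2] - img[y][x])
                if nd < dist.get((y2, x2), float("inf")):
                    dist[(y2, x2)] = nd
                    heapq.heappush(heap, (nd, (y2, x2)))
    return dist

# Two flat regions (0s and 1s): crossing the boundary is expensive, so
# pixels in the seed's own region stay geodesically close.
img = [[0, 0, 1, 1] for _ in range(3)]
d = geodesic_map(img, seeds=[(1, 0)])
```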
  •
    ABSTRACT: Recently, object tracking has been widely studied as a binary classification problem. Semi-supervised learning is particularly suitable for improving classification accuracy when large quantities of unlabeled samples are generated, as in the tracking procedure. The purpose of this paper is to achieve robust and stable tracking by using collaborative learning, which belongs to the scope of semi-supervised learning, among three classifiers. Different from [1], a random fern classifier is incorporated to deal with the newly added 2bitBP feature, and certain constraints are specially implemented in our framework. Besides, we also alter the way positive samples are selected in order to achieve more stable tracking. The algorithm proposed in this paper is validated by tracking a pedestrian and a cup under occlusion. Experiments and comparisons show that our algorithm can avoid the drifting problem to some degree and make the tracking result more robust and adaptive.
    Multimedia and Expo (ICME), 2012 IEEE International Conference on; 01/2012
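A minimal random fern classifier of the kind mentioned above: each fern hashes a patch through an ordered set of binary tests into a posterior table of positive/negative counts. Plain pixel comparisons stand in for the 2bitBP feature here, and the whole setup is an illustrative sketch rather than the paper's configuration.

```python
def fern_key(patch, tests):
    """Run the fern's ordered binary pixel tests and pack the outcomes
    into an integer key that indexes the posterior table."""
    key = 0
    for i, (a, b) in enumerate(tests):
        if patch[a] > patch[b]:
            key |= 1 << i
    return key

class Fern:
    def __init__(self, tests):
        self.tests = tests
        self.pos = {}  # key -> positive (target) count
        self.neg = {}  # key -> negative (background) count

    def train(self, patch, label):
        k = fern_key(patch, self.tests)
        table = self.pos if label else self.neg
        table[k] = table.get(k, 0) + 1

    def posterior(self, patch):
        k = fern_key(patch, self.tests)
        p, n = self.pos.get(k, 0), self.neg.get(k, 0)
        return p / (p + n) if p + n else 0.5  # unseen key: uninformative

# Train on two toy 4-pixel patches and query the posterior.
fern = Fern(tests=[(0, 1), (2, 3)])
fern.train([9, 1, 8, 2], label=True)    # bright-dark structure: target
fern.train([1, 9, 2, 8], label=False)   # inverted structure: background
p_target = fern.posterior([9, 2, 7, 3])
```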
  • Xiaogang Chen, Xiangjian He, Jie Yang, Qiang Wu
    ABSTRACT: Deblurring camera-based document images is an important task in digital document processing, since it can improve both the accuracy of optical character recognition systems and the visual quality of document images. Traditional deblurring algorithms have been proposed to work on natural-scene images. However, natural-scene images are not consistent with document images. In this paper, the distinct characteristics of document images are investigated. We propose a content-aware prior for document image deblurring, based on document image foreground segmentation. Besides, an upper-bound constraint combined with a total variation based method is proposed to suppress ringing in the deblurred image. Compared with traditional general-purpose deblurring methods, the proposed deblurring algorithm produces more pleasing results on document images. Encouraging experimental results demonstrate the efficacy of the proposed method.
    The 24th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2011, Colorado Springs, CO, USA, 20-25 June 2011; 01/2011
  • Chunhua Du, Jie Yang, Qiang Wu, Xiangjian He
    ABSTRACT: The active shape model (ASM) plays an important role in face research such as face recognition, pose estimation and gaze estimation. A crucial step of the common ASM is finding a new position for each facial landmark at each iteration. Mahalanobis distance minimisation is used for this search, provided there are enough training data that the grey-level profiles for each landmark follow a multivariate Gaussian distribution. However, this condition cannot be satisfied in most cases. In this paper, a novel method, the support vector machine-based active shape model (SVMBASM), is proposed for this task. It approaches the search task as a small-sample-size classification problem. Moreover, considering the poor classification performance caused by an imbalanced dataset, which contains more negative instances (incorrect candidates for the new position) than positive instances (correct candidates for the new position), a multi-class classification framework is further proposed. Performance evaluation on the Shanghai Jiao Tong University face database shows that the proposed SVMBASM outperforms the original ASM in terms of average error and average frequency of convergence.
    IJISTA. 01/2011; 10:151-170.
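The classic ASM search step that SVMBASM replaces can be sketched directly: among candidate profiles, pick the one minimising the Mahalanobis distance to the training mean. A 2-D profile keeps the inverse covariance explicit; real grey-level profiles are longer, and the numbers below are illustrative.

```python
def mahalanobis2(x, mean, cov_inv):
    """Squared Mahalanobis distance between a 2-D grey-level profile x
    and the training mean, given the inverse covariance matrix."""
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    return (d0 * (cov_inv[0][0] * d0 + cov_inv[0][1] * d1)
            + d1 * (cov_inv[1][0] * d0 + cov_inv[1][1] * d1))

def best_candidate(cands, mean, cov_inv):
    """Pick the candidate landmark position whose profile minimises the
    Mahalanobis distance (the step the paper replaces with an SVM)."""
    return min(cands, key=lambda c: mahalanobis2(c, mean, cov_inv))

mean = [2.0, 3.0]
cov_inv = [[1.0, 0.0], [0.0, 1.0]]  # identity: reduces to Euclidean
cands = [[0.0, 0.0], [2.0, 3.0], [5.0, 5.0]]
best = best_candidate(cands, mean, cov_inv)
```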
  • Sheng Wang, Ruo Du, Qiang Wu, Xiangjian He
    ABSTRACT: Human detection has been widely used in many applications. In the meantime, it is still a difficult problem with many open questions, due to challenges caused by various factors such as clothing and posture. By investigating several benchmark methods and frameworks in the literature, this paper proposes a novel method which successfully implements the Real AdaBoost training procedure on multi-scale images. Various object features are exposed at multiple levels. To further boost the overall performance, a fusion scheme is established using scores obtained at the various levels, which integrates decision results at different scales to make the final decision. Unlike other score-based fusion methods, this paper re-formulates the fusion process through supervised learning. Therefore, our fusion approach can better distinguish subtle differences between human objects and non-human objects. Furthermore, in our approach we are able to use simpler weak features for boosting, and hence alleviate the training complexity that exists in most AdaBoost training approaches. Encouraging results are obtained on a well-recognized benchmark database.
    International Conference on Digital Image Computing: Techniques and Applications, DICTA 2010, Sydney, Australia, 1-3 December, 2010; 01/2010
  • Jia Liu, Jie Yang, Yi Zhang, Xiangjian He
    ABSTRACT: In this paper we propose a novel framework for action recognition based on multiple features, to improve action recognition in videos. The fusion of multiple features is important for recognizing actions, as often a single feature-based representation is not enough to capture the imaging variations (view-point, illumination, etc.) and attributes of individuals (size, age, gender, etc.). Hence, we use two kinds of features: i) a quantized vocabulary of local spatio-temporal (ST) volumes (cuboids and 2-D SIFT), and ii) higher-order statistical models of interest points, which aim to capture the global information of the actor. We construct a video representation in terms of local space-time features and global features, and integrate such representations with a hyper-sphere multi-class SVM. Experiments on publicly available datasets show that our proposed approach is effective. An additional experiment shows that using both local and global features provides a richer representation of human action than the use of a single feature type.
    20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 23-26 August 2010; 01/2010
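The first feature type, a quantized vocabulary of local descriptors, reduces to nearest-word assignment plus a normalised histogram. A toy sketch with made-up two-dimensional descriptors (real ST descriptors are much higher-dimensional):

```python
def quantize(feature, vocabulary):
    """Assign a local descriptor to its nearest visual word by squared
    Euclidean distance over equal-length vectors."""
    return min(range(len(vocabulary)),
               key=lambda i: sum((f - v) ** 2
                                 for f, v in zip(feature, vocabulary[i])))

def bow_histogram(features, vocabulary):
    """Bag-of-words representation: a normalised histogram of visual
    word counts over all local descriptors of a clip."""
    hist = [0.0] * len(vocabulary)
    for f in features:
        hist[quantize(f, vocabulary)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy 2-word vocabulary and three toy descriptors.
vocab = [[0.0, 0.0], [10.0, 10.0]]
feats = [[0.5, 0.2], [9.5, 9.9], [0.1, 0.4]]
hist = bow_histogram(feats, vocab)
```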
  • Yin Li, Yue Zhou, Junchi Yan, Jie Yang, Xiangjian He
    ABSTRACT: Multi-channel images and video clips naturally take the form of tensors. The values of a tensor can be corrupted by noise in the acquisition process. We consider the problem of recovering a tensor L of visual data from its corrupted observation X = L + S, where the corrupted entries S are unknown and unbounded, but are assumed to be sparse. Our work builds on recent studies of recovering corrupted low-rank matrices via trace norm minimization. We extend the matrix case to the tensor case using a definition of the tensor trace norm. Furthermore, the tensor problem is formulated as a convex optimization, which is much harder than its matrix form. Thus, we develop a high-quality algorithm to solve the problem efficiently. Our experiments show potential applications of our method and indicate a robust and reliable solution.
    Proceedings of the International Conference on Image Processing, ICIP 2010, September 26-29, Hong Kong, China; 01/2010
  • Ruo Du, Sheng Wang, Qiang Wu, Xiangjian He
    ABSTRACT: Many machine learning tasks can be achieved using multiple-instance learning (MIL) when the target features are ambiguous. As a general MIL framework, Diverse Density (DD) provides a way to learn those ambiguous features by maximising the DD estimator, and a maximum of the DD estimator is called a concept. However, modeling and finding multiple concepts is often difficult, especially without prior knowledge of the concept number; i.e., every positive bag may contain multiple coexistent and heterogeneous concepts, but we do not know how many concepts exist. In this work, we present a new approach to finding multiple concepts of DD by using a supervised mean shift algorithm. Unlike classic mean shift (an unsupervised clustering algorithm), our approach for the first time introduces class labels to feature points, and each point contributes differently to the mean shift iterations according to its label and position. A feature point derives from an MIL instance and takes the corresponding bag label. Our supervised mean shift starts from positive points and converges to the local maxima that are close to the positive points and far away from the negative points. Experiments qualitatively indicate that our approach has better properties than other DD methods.
    International Conference on Digital Image Computing: Techniques and Applications, DICTA 2010, Sydney, Australia, 1-3 December, 2010; 01/2010
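A 1-D sketch of the label-aware mean shift idea: signing each point's kernel weight by its bag label pulls the iterate toward positive instances and pushes it away from negative ones. The exact weighting scheme in the paper is more elaborate; this simple signed-weight rule is an illustrative assumption.

```python
import math

def supervised_mean_shift(start, points, labels, h=1.0, iters=50):
    """Mean shift where each point's Gaussian kernel weight is signed by
    its bag label (+1 positive, -1 negative), biasing the mode estimate
    toward positive points and away from negative ones."""
    x = start
    for _ in range(iters):
        ws = [l * math.exp(-((x - p) / h) ** 2 / 2)
              for p, l in zip(points, labels)]
        total = sum(ws)
        if total <= 0:
            break  # no positive support nearby; stop iterating
        x = sum(w * p for w, p in zip(ws, points)) / total
    return x

# Positive instances cluster near 0, negatives near 5; starting between
# them, the iterate converges to the positive concept.
pts = [-0.2, 0.0, 0.2, 4.8, 5.0, 5.2]
lbl = [1, 1, 1, -1, -1, -1]
mode = supervised_mean_shift(start=1.5, points=pts, labels=lbl)
```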
  • Fanglin Wang, Jie Yang, Xiangjian He, Artur Loza
    ABSTRACT: The recently proposed use of the structural similarity measure in particle filter-based video trackers has been shown to improve tracking performance compared with similar methods using colour or edge histograms and the Bhattacharyya distance. However, the combined use of the structural similarity and a particle filter results in a computationally complex tracker that may not be suitable for some real-time applications. In this paper, a novel fast approach to the use of the structural similarity in video tracking is proposed. The tracking algorithm presented in this work determines the state of the target (location, size) based on a gradient ascent procedure applied to the structural similarity surface of the video frame, thus avoiding computationally expensive sampling of the state space. The new method, while being computationally less expensive, performs better than the standard mean shift and structural similarity particle filter trackers, as shown on exemplary surveillance video sequences.
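The matching score at the heart of this tracker, the structural similarity (SSIM) index, can be sketched on whole patches (global statistics only, no sliding window; the constants follow the common choice for 8-bit data). The tracker would then hill-climb this score over candidate target states instead of sampling particles.

```python
def ssim(a, b, c1=6.5025, c2=58.5225):
    """Structural similarity between two equally sized patches, computed
    from patch means, variances and covariance (a simplified global
    variant of the standard windowed SSIM)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((x - mu_b) ** 2 for x in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return (((2 * mu_a * mu_b + c1) * (2 * cov + c2))
            / ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)))

target = [10.0, 200.0, 30.0, 120.0]
same = ssim(target, target)                       # identical patches
shifted = ssim(target, [t + 40 for t in target])  # brightness change
```

Identical patches score 1; a brightness shift lowers only the luminance term, which is what makes SSIM a more structural score than plain intensity distance.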

Publication Stats

11 Citations
4.35 Total Impact Points


  • 2012–2014
    • University of Technology Sydney 
      • Centre for Innovation in IT Services Applications (iNEXT)
      Sydney, New South Wales, Australia
  • 2010
    • Shanghai University
      Shanghai, Shanghai Shi, China