Dacheng Tao

University of Technology Sydney, Sydney, New South Wales, Australia

Publications (378) · 581.76 Total Impact Points

  • Lin Zhao, Xinbo Gao, Dacheng Tao, Xuelong Li
    ABSTRACT: Highlights • We present a multi-layer Markov network for detecting parts at different granularities. • A composite model structure is designed to better describe spatial constraints. • The proposed model can be trained jointly and favors exact inference. • Extensive experiments show the superior performance of the proposed method.
    Signal Processing 03/2015; 108:36–45. · 2.24 Impact Factor
  • Chen Gong, Tongliang Liu, Dacheng Tao, Keren Fu, Enmei Tu, Jie Yang
    ABSTRACT: The graph Laplacian has been widely exploited in traditional graph-based semi-supervised learning (SSL) algorithms to regulate the labels of examples that vary smoothly on the graph. Although it achieves a promising performance in both transductive and inductive learning, it is not effective for handling ambiguous examples (shown in Fig. 1). This paper introduces the deformed graph Laplacian (DGL) and presents label prediction via DGL (LPDGL) for SSL. The local smoothness term used in LPDGL, which regularizes examples and their neighbors locally, is able to improve classification accuracy by properly dealing with ambiguous examples. Theoretical studies reveal that LPDGL obtains the globally optimal decision function, and the free parameters are easy to tune. The generalization bound is derived based on the robustness analysis. Experiments on a variety of real-world data sets demonstrate that LPDGL achieves top-level performance in both transductive and inductive settings compared with popular SSL algorithms, such as harmonic functions, AnchorGraph regularization, linear neighborhood propagation, Laplacian regularized least squares, and the Laplacian support vector machine.
    IEEE Transactions on Neural Networks and Learning Systems 01/2015; · 4.37 Impact Factor
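For intuition, here is a minimal NumPy sketch of graph-based label propagation with a deformed Laplacian. It assumes the common deformation Delta_s = (1 - s^2) I + s^2 D - s A (s = 1 recovers the ordinary Laplacian D - A) and a single ridge-style fidelity weight mu; LPDGL's actual objective, with its local smoothness term, is more elaborate.

```python
import numpy as np

def deformed_laplacian(A, s=0.5):
    # Deformed graph Laplacian: (1 - s^2) I + s^2 D - s A.
    # s = 1 recovers the combinatorial Laplacian D - A.
    n = A.shape[0]
    D = np.diag(A.sum(axis=1))
    return (1 - s ** 2) * np.eye(n) + s ** 2 * D - s * A

def propagate_labels(A, y, s=0.5, mu=0.1):
    # Closed-form prediction f = (Delta_s + mu I)^(-1) y;
    # y holds +1/-1 on labeled nodes and 0 on unlabeled ones.
    n = A.shape[0]
    return np.linalg.solve(deformed_laplacian(A, s) + mu * np.eye(n), y)

# Two connected pairs; each label propagates to its unlabeled neighbor.
A = np.array([[0., 1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., 1.],
              [0., 0., 1., 0.]])
y = np.array([1., 0., -1., 0.])
f = propagate_labels(A, y)
```

The sign of each entry of f is the predicted label; the magnitude decays with distance from the labeled nodes.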
  • Changxing Ding, Chang Xu, Dacheng Tao
    ABSTRACT: Face images captured in unconstrained environments usually contain significant pose variation, which dramatically degrades the performance of algorithms designed to recognize frontal faces. This paper proposes a novel face identification framework capable of handling the full range of pose variations within ±90° of yaw. The proposed framework first transforms the original pose-invariant face recognition problem into a partial frontal face recognition problem. A robust patch-based face representation scheme is then developed to represent the synthesized partial frontal faces. For each patch, a transformation dictionary is learnt under the proposed multitask learning scheme. The transformation dictionary transforms the features of different poses into a discriminative subspace. Lastly, face matching is performed at patch level rather than at the holistic level. Extensive and systematic experimentation on the FERET, CMU-PIE, and Multi-PIE databases shows that the proposed method consistently outperforms single-task-based baselines as well as state-of-the-art methods for the pose problem. We further extend the proposed algorithm to the unconstrained face verification problem and achieve top-level performance on the challenging LFW dataset.
    IEEE Transactions on Image Processing 01/2015;
  •
    ABSTRACT: Objective: Glaucoma is an irreversible chronic eye disease that leads to vision loss. As it can be slowed down through treatment, detecting the disease in time is important. However, many patients are unaware of the disease because it progresses slowly without easily noticeable symptoms. Currently, there is no effective method for low-cost population-based glaucoma detection or screening. Recent studies have shown that automated optic nerve head assessment from 2D retinal fundus images is promising for low-cost glaucoma screening. In this paper, we propose a method for cup-to-disc ratio (CDR) assessment using 2D retinal fundus images. Methods: In the proposed method, the optic disc is first segmented and reconstructed using a novel sparse dissimilarity-constrained coding (SDC) approach which considers both the dissimilarity constraint and the sparsity constraint from a set of reference discs with known CDRs. Subsequently, the reconstruction coefficients from the SDC are used to compute the CDR for the testing disc. Results: The proposed method has been tested for CDR assessment in a database of 650 images with CDRs previously measured manually by trained professionals. Experimental results show an average CDR error of 0.064 and a correlation coefficient of 0.67 compared with the manual CDRs, better than the state-of-the-art methods. Our proposed method has also been tested for glaucoma screening. The method achieves areas under the curve of 0.83 and 0.88 on datasets of 650 and 1676 images, respectively, outperforming other methods. Conclusion: The proposed method achieves good accuracy for glaucoma detection. Significance: The method has great potential to be used for large-scale population-based glaucoma screening.
    IEEE Transactions on Biomedical Engineering 01/2015; · 2.15 Impact Factor
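A toy sketch of the coefficient-weighted CDR idea: reconstruct a test disc's feature vector as a sparse nonnegative combination of reference discs, then average the reference CDRs by those coefficients. The ISTA solver and the plain sparsity penalty are stand-ins; the paper's SDC additionally enforces a dissimilarity constraint, which is omitted here.

```python
import numpy as np

def sparse_code(D, y, lam=0.05, n_iter=500):
    # Nonnegative sparse coding via projected ISTA:
    # min_w 0.5 * ||y - D w||^2 + lam * sum(w), subject to w >= 0.
    lr = 1.0 / np.linalg.norm(D, 2) ** 2   # step from the Lipschitz constant
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ w - y)
        w = np.maximum(w - lr * (grad + lam), 0.0)  # prox of nonneg L1
    return w

def estimate_cdr(D, ref_cdrs, y):
    # CDR of a test disc = coefficient-weighted mean of the CDRs of
    # the reference discs that reconstruct it.
    w = sparse_code(D, y)
    return float(w @ ref_cdrs / max(w.sum(), 1e-12))

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 5))             # 5 reference disc feature vectors
ref_cdrs = np.array([0.3, 0.4, 0.5, 0.6, 0.7])
y = 0.5 * D[:, 1] + 0.5 * D[:, 2]        # test disc mixes references 1 and 2
cdr = estimate_cdr(D, ref_cdrs, y)       # expected near (0.4 + 0.5) / 2
```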
  •
    ABSTRACT: Example learning-based super-resolution (SR) algorithms show promise for restoring a high-resolution (HR) image from a single low-resolution (LR) input. The most popular approaches, however, are either time- or space-intensive, which limits their practical applications in many resource-limited settings. In this paper, we propose a novel computationally efficient single image SR method that learns multiple linear mappings (MLM) to directly transform LR feature subspaces into HR subspaces. Specifically, we first partition the large non-linear feature space of LR images into a cluster of linear subspaces. Multiple LR subdictionaries are then learned, followed by inferring the corresponding HR subdictionaries based on the assumption that the LR-HR features share the same representation coefficients. We establish MLM from the input LR features to the desired HR outputs in order to achieve fast yet stable SR recovery. Furthermore, in order to suppress displeasing artifacts generated by the MLM-based method, we apply a fast non-local means (NLM) algorithm to construct a simple yet effective similarity-based regularization term for SR enhancement. Experimental results indicate that our approach is both quantitatively and qualitatively superior to other application-oriented SR methods, while maintaining relatively low time and space complexity.
    IEEE Transactions on Image Processing 01/2015;
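The MLM recipe (cluster the LR feature space, fit one linear map per cluster, route new features to the nearest cluster's map) can be sketched with plain k-means plus ridge regression. Feature extraction, subdictionary learning, and the NLM regularization of the actual method are omitted; all names below are illustrative.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    # Plain k-means to partition the LR feature space into linear subspaces.
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(n_iter):
        lbl = np.argmin(((X[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(lbl == j):
                C[j] = X[lbl == j].mean(axis=0)
    return C, lbl

def learn_mappings(X_lr, X_hr, k=4, reg=1e-3):
    # One ridge-regression map W_j per cluster:
    # min_W ||X_hr - X_lr W||^2 + reg * ||W||^2 on that cluster's samples.
    C, lbl = kmeans(X_lr, k)
    d = X_lr.shape[1]
    maps = [np.linalg.solve(X_lr[lbl == j].T @ X_lr[lbl == j] + reg * np.eye(d),
                            X_lr[lbl == j].T @ X_hr[lbl == j])
            for j in range(k)]
    return C, maps

def upscale(x_lr, C, maps):
    # Route an LR feature to its nearest cluster's linear map.
    j = int(np.argmin(((C - x_lr) ** 2).sum(-1)))
    return x_lr @ maps[j]

# Synthetic check: when the true LR->HR relation is linear, every
# cluster recovers (approximately) the same map.
rng = np.random.default_rng(1)
X_lr = rng.normal(size=(200, 3))
W_true = rng.normal(size=(3, 6))
X_hr = X_lr @ W_true
C, maps = learn_mappings(X_lr, X_hr, k=4)
x = rng.normal(size=3)
y_hat = upscale(x, C, maps)
```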
  •
    ABSTRACT: In this paper, we introduce an efficient tensor-to-vector projection algorithm for human gait feature representation and recognition. The proposed approach is based on multi-dimensional or tensor signal processing technology, which finds a low-dimensional tensor subspace of the original input gait sequence tensors while capturing most of the data variation. In order to further enhance the class separability and avoid potential overfitting, we adopt a discriminative locality preserving projection with sparse regularization to transform the refined tensor data to the final vector feature representation for subsequent recognition. Numerous experiments are carried out to evaluate the effectiveness of the proposed sparse and discriminative tensor-to-vector projection algorithm, and the proposed method achieves good performance for human gait recognition using the sequences from the University of South Florida (USF) HumanID Database.
    Signal Processing 01/2015; 106:245–252. · 2.24 Impact Factor
  •
    ABSTRACT: Whereas transform coding algorithms have proved efficient and practical for grey-level and color image compression, they cannot directly deal with hyperspectral images (HSI) by simultaneously considering both the spatial and spectral domains of the data cube. The aim of this paper is to present an HSI compression and reconstruction method based on the multi-dimensional or tensor data processing approach. By representing the observed hyperspectral image cube as a third-order tensor, we introduce a tensor decomposition technology to approximately decompose the original tensor data into a core tensor multiplied by a factor matrix along each mode. Thus, the HSI is compressed to the core tensor and can be reconstructed by multi-linear projection via the factor matrices. Experimental results on particular applications of hyperspectral remote sensing images, such as unmixing and detection, suggest that the data reconstructed by the proposed approach significantly preserves the HSI's quality in several aspects.
    Neurocomputing 01/2015; 147:358–363. · 2.01 Impact Factor
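The core-tensor-plus-factor-matrices scheme the abstract describes matches a truncated Tucker/HOSVD decomposition. Below is a self-contained NumPy sketch; the factor-per-mode construction is standard HOSVD, and the paper's exact decomposition algorithm may differ.

```python
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: mode-n fibers become the columns of a matrix.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    # n-mode product T x_n M: multiply M along the given tensor mode.
    return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=1), 0, mode)

def hosvd_compress(T, ranks):
    # Truncated HOSVD: per-mode factor from the leading left singular
    # vectors of each unfolding, then project T onto the small core.
    U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
         for m, r in enumerate(ranks)]
    core = T
    for m in range(T.ndim):
        core = mode_product(core, U[m].T, m)
    return core, U

def reconstruct(core, U):
    T = core
    for m in range(core.ndim):
        T = mode_product(T, U[m], m)
    return T

# A 6x6x6 cube of multilinear rank (2, 2, 2) compresses losslessly to a
# 2x2x2 core plus three 6x2 factors: 216 values -> 8 + 36 = 44 values.
rng = np.random.default_rng(0)
G = rng.normal(size=(2, 2, 2))
factors = [rng.normal(size=(6, 2)) for _ in range(3)]
T = reconstruct(G, factors)
core, U = hosvd_compress(T, (2, 2, 2))
T_hat = reconstruct(core, U)
```

For a real HSI cube the truncation ranks trade reconstruction error against compression ratio rather than being exact.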
  •
    ABSTRACT: Distance metric learning (DML) is successful in discovering intrinsic relations in data. However, most algorithms become computationally demanding when the problem size grows large. In this paper, we propose a discriminative metric learning algorithm, develop a distributed scheme that learns metrics on moderate-sized subsets of data, and aggregate the results into a global solution. The technique leverages the power of parallel computation. The aggregated DML (ADML) algorithm scales well with the data size and can be controlled by the partition. We theoretically analyze and provide bounds for the error induced by the distributed treatment. We have conducted experimental evaluations of ADML, both on specially designed tests and on practical image annotation tasks. These tests show that ADML achieves state-of-the-art performance at only a fraction of the cost incurred by most existing methods.
    IEEE Transactions on Neural Networks and Learning Systems 12/2014; · 4.37 Impact Factor
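The divide-and-aggregate pattern can be sketched as follows. The per-subset learner here is a regularized inverse covariance (Mahalanobis whitening), a deliberately simple stand-in, and aggregation is a plain average; ADML's own subset learner and aggregation rule are its discriminative algorithm, not this.

```python
import numpy as np

def local_metric(X, reg=1e-3):
    # Stand-in per-subset learner: regularized inverse covariance.
    # Any learner returning a PSD Mahalanobis matrix fits this slot.
    d = X.shape[1]
    S = np.cov(X, rowvar=False) + reg * np.eye(d)
    return np.linalg.inv(S)

def aggregated_metric(X, n_parts=4, seed=0):
    # Partition the data, learn one metric per moderate-sized subset
    # (each call is independent, hence parallel-friendly), then
    # aggregate the per-subset metrics into a global solution.
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), n_parts)
    Ms = [local_metric(X[p]) for p in parts]
    return sum(Ms) / n_parts

# The high-variance axis should be down-weighted in the learned metric.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2)) * np.array([2.0, 1.0])
M = aggregated_metric(X)
```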
  •
    ABSTRACT: Technical analysis with numerous indicators and patterns has been regarded as important evidence for making trading decisions in financial markets. However, it is extremely difficult for investors to find useful trading rules based on numerous technical indicators. This paper innovatively proposes the use of biclustering mining to discover effective technical trading patterns that contain a combination of indicators from historical financial data series. This is the first attempt to use a biclustering algorithm on trading data. The mined patterns are regarded as trading rules and can be classified into three trading actions (buy, sell, and no-action signals) with respect to the maximum support. A modified K-nearest-neighbor (K-NN) method is applied to classify trading days in the testing period. The proposed method [biclustering with K-nearest neighbor (BIC-K-NN)] was implemented on four historical datasets, and the average performance was compared with the conventional buy-and-hold strategy and three previously reported intelligent trading systems. Experimental results demonstrate that the proposed trading system outperforms its counterparts and will be useful for investment in various financial markets.
    IEEE Transactions on Cybernetics 12/2014;
  •
    ABSTRACT: Random Forest is a very important classifier with applications in various machine learning tasks, but its promising performance relies heavily on the amount of labeled training data. In this paper, we investigate constructing Random Forests with a small amount of labeled data and find that the performance bottleneck is located in the node splitting procedures; existing solutions fail to properly partition the feature space when training data are insufficient. To achieve robust node splitting with insufficient data, we present semi-supervised splitting, which splits nodes with the guidance of both labeled and abundant unlabeled data. In particular, an accurate quality measure of node splitting is obtained by carrying out kernel-based density estimation, whereby a multi-class version of the asymptotic mean integrated squared error (AMISE) criterion is proposed to adaptively select the optimal bandwidth of the kernel. To avoid the curse of dimensionality, we project the data points from the original high-dimensional feature space onto a low-dimensional subspace before estimation. A unified optimization framework is proposed to select a coupled pair of subspace and separating hyperplane such that the smoothness of the subspace and the quality of the splitting are guaranteed simultaneously. Compared with conventional margin-maximization-based semi-supervised methods, our algorithm efficiently avoids overfitting caused by bad initialization and local maxima. We demonstrate the effectiveness of the proposed algorithm by comparing it with state-of-the-art supervised and semi-supervised algorithms for typical computer vision applications such as object categorization, face recognition and image segmentation on publicly available datasets.
    IEEE Transactions on Image Processing 12/2014; · 3.11 Impact Factor
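A toy version of the guidance idea, semi-supervised split scoring on a 1-D projection: class densities are fitted by KDE on the few labeled points and evaluated on the abundant unlabeled points. The bandwidth here is fixed rather than AMISE-selected, the agreement score is an illustrative surrogate for the paper's quality measure, and the subspace/hyperplane optimization is omitted.

```python
import numpy as np

def kde(x, samples, h=0.3):
    # Gaussian kernel density estimate of `samples`, evaluated at `x`.
    z = (x[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))

def split_quality(t, proj_l, labels, proj_u, h=0.3):
    # Score a threshold t: fit one density per class on the labeled
    # projections, then measure how often an unlabeled point's dominant
    # class agrees with the side of the split it falls on.
    p0 = kde(proj_u, proj_l[labels == 0], h)
    p1 = kde(proj_u, proj_l[labels == 1], h)
    agree = np.where(proj_u < t, p0 > p1, p1 > p0)
    return agree.mean()

rng = np.random.default_rng(0)
proj_l = np.array([-1.1, -0.9, 0.9, 1.1])    # only four labeled points
labels = np.array([0, 0, 1, 1])
proj_u = np.concatenate([rng.normal(-1, 0.3, 50), rng.normal(1, 0.3, 50)])
good = split_quality(0.0, proj_l, labels, proj_u)    # split at the gap
bad = split_quality(-1.5, proj_l, labels, proj_u)    # split inside a class
```

The unlabeled points are what reveal that the threshold at -1.5 cuts through one class.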
  • Jun Yu, Yong Rui, Yuan Yan Tang, Dacheng Tao
    ABSTRACT: How do we find all images in a larger set of images that have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, such as the support vector machine (supervised) and the transductive support vector machine (semi-supervised), are invaluable tools for content-based image retrieval, pose estimation, and optical character recognition. However, these methods can only handle images represented by a single feature. In many cases, different features (or multiview data) are available, and efficiently utilizing them is a challenge. The traditional schema of concatenating the features of different views into a long vector is inappropriate, because each view has its own statistical properties and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with existing strategies, our approach adopts the high-order distance obtained from a hypergraph to replace the pairwise distance in estimating the probability matrix of the data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternating optimization is designed to solve the objective function of HD-MSL and obtain the view combination coefficients and classification scores simultaneously. Experiments on two real-world datasets demonstrate the effectiveness of HD-MSL in image classification.
    IEEE Transactions on Cybernetics 12/2014; 44(12):2431–2442.
  • Tongliang Liu, Dacheng Tao
    ABSTRACT: In this paper, we study a classification problem in which sample labels are randomly corrupted. In this scenario, there is an unobservable sample with noise-free labels. However, before being observed, the true labels are independently flipped with a probability $\rho\in[0,0.5)$, and the random label noise can be class-conditional. Here, we address two fundamental problems raised by this scenario. The first is how to best use the abundant surrogate loss functions designed for the traditional classification problem when there is label noise. We prove that any surrogate loss function can be used for classification with noisy labels by using importance reweighting, with consistency assurance that the label noise does not ultimately hinder the search for the optimal classifier of the noise-free sample. The other is the open problem of how to obtain the noise rate $\rho$. We show that the rate is upper bounded by the conditional probability $P(y|x)$ of the noisy sample. Consequently, the rate can be estimated, because the upper bound can be easily reached in classification problems. Experimental results on synthetic and real datasets confirm the efficiency of our methods.
    11/2014;
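For binary labels, the importance-reweighting weight has a closed form in terms of the noisy-label posterior and the class-conditional flip rates; a sketch of that formula follows (the posterior estimation and the loss-reweighted training loop are omitted, and variable names are illustrative).

```python
def importance_weight(p_noisy, y, rho_pos, rho_neg):
    # beta(x, y) = (P_rho(y|x) - rho_{-y}) / ((1 - rho_pos - rho_neg) * P_rho(y|x)),
    # where p_noisy = P_rho(y|x) is the posterior of the observed noisy
    # label y in {+1, -1}, rho_pos = P(flip | true +1), rho_neg = P(flip | true -1).
    # Clipped at zero: a label less probable than the flip rate of the
    # opposite class is most likely noise and gets zero weight.
    rho_other = rho_neg if y == 1 else rho_pos
    return max(p_noisy - rho_other, 0.0) / ((1.0 - rho_pos - rho_neg) * p_noisy)

w_clean = importance_weight(0.9, +1, 0.0, 0.0)    # no noise: weight is 1
w_conf = importance_weight(0.9, +1, 0.2, 0.2)     # confident label, upweighted
w_suspect = importance_weight(0.25, +1, 0.3, 0.3) # likely flipped, zeroed out
```

Any surrogate loss is then evaluated with each example scaled by its weight, which is the reweighting consistency result the abstract states.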
  •
    ABSTRACT: This paper presents a fast and robust level set method for image segmentation. To enhance the robustness against noise, we embed a Markov random field (MRF) energy function into the conventional level set energy function. This MRF energy function models the correlation of a pixel with its neighbors and encourages them to fall into the same region. To obtain a fast implementation of the MRF-embedded level set model, we explore algebraic multigrid (AMG) and the sparse field method (SFM) to increase the time step and decrease the computation domain, respectively. Both AMG and SFM can be conducted in a parallel fashion, which facilitates the processing of our method for big image databases. By comparing the proposed fast and robust level set method with the standard level set method and its popular variants on noisy synthetic images, synthetic aperture radar (SAR) images, medical images, and natural images, we comprehensively demonstrate that the new method is robust against various kinds of noise. In particular, the new level set method can segment an image of size 500×500 within three seconds in MATLAB R2010b on a computer with a 3.30 GHz CPU and 4 GB of memory.
    IEEE Transactions on Image Processing 11/2014;
  • Chang Xu, Tongliang Liu, Dacheng Tao, Chao Xu
    ABSTRACT: We analyze the local Rademacher complexity of empirical risk minimization (ERM)-based multi-label learning algorithms, and in doing so propose a new algorithm for multi-label learning. Rather than using the trace norm to regularize the multi-label predictor, we instead minimize the tail sum of the singular values of the predictor in multi-label learning. Benefiting from the use of the local Rademacher complexity, our algorithm, therefore, has a sharper generalization error bound and a faster convergence rate. Compared to methods that minimize over all singular values, concentrating on the tail singular values results in better recovery of the low-rank structure of the multi-label predictor, which plays an important role in exploiting label correlations. We propose a new conditional singular value thresholding algorithm to solve the resulting objective function. Empirical studies on real-world datasets validate our theoretical results and demonstrate the effectiveness of the proposed algorithm.
    10/2014;
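The key proximal step can be sketched directly: soft-threshold only the tail singular values (index >= k), leaving the leading ones, which carry the low-rank structure, untouched. This is a sketch of the thresholding operator alone, not the full learning algorithm.

```python
import numpy as np

def tail_singular_threshold(W, k, tau):
    # Shrink only the tail singular values (index >= k) by tau, the
    # "tail sum" counterpart of the full soft thresholding used for
    # trace-norm regularization. The top-k values pass through intact.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s[k:] = np.maximum(s[k:] - tau, 0.0)
    return U @ np.diag(s) @ Vt

# Build a 6x5 matrix with singular values [5, 3, 0.2, 0.1]; thresholding
# the tail at tau = 0.25 zeroes the small values and keeps the head.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(6, 4)))
V, _ = np.linalg.qr(rng.normal(size=(5, 4)))
W = U @ np.diag([5.0, 3.0, 0.2, 0.1]) @ V.T
W_low = tail_singular_threshold(W, k=2, tau=0.25)
```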
  •
    ABSTRACT: This paper presents a comprehensive survey of facial feature point detection with the assistance of abundant manually labeled images. Facial feature point detection benefits many applications such as face recognition, animation, tracking, hallucination, expression analysis and 3D face modeling. Existing methods can be categorized into the following four groups: constrained local model (CLM)-based, active appearance model (AAM)-based, regression-based, and other methods. CLM-based methods consist of a shape model and a number of local experts, each of which is utilized to detect a facial feature point. AAM-based methods fit a shape model to an image by minimizing texture synthesis errors. Regression-based methods directly learn a mapping function from facial image appearance to facial feature points. Besides the above three major categories of methods, there are also minor categories of methods which we classify into other methods: graphical model-based methods, joint face alignment methods, independent facial feature point detectors, and deep learning-based methods. Though significant progress has been made, facial feature point detection is limited in its success by wild and real-world conditions: variations across poses, expressions, illuminations, and occlusions. A comparative illustration and analysis of representative methods provide a holistic understanding of and deep insight into facial feature point detection, and motivate us to explore promising future directions.
    10/2014;
  •
    ABSTRACT: In the 3D facial animation and synthesis community, input faces are usually required to be labeled with a set of landmarks for parameterization. Because of variations in pose, expression and resolution, automatic 3D face landmark localization remains a challenge. In this paper, a novel landmark localization approach is presented. The approach is based on Local Coordinate Coding (LCC) and consists of two stages. In the first stage, we perform nose detection, relying on the fact that the nose shape is usually invariant under variations in pose, expression and resolution. Then, we use the Iterative Closest Points (ICP) algorithm to find a 3D affine transformation that aligns the input face to a reference face. In the second stage, we perform re-sampling to build correspondences between the input 3D face and the training faces. Then, an LCC-based localization algorithm is proposed to obtain the positions of the landmarks in the input face. Experimental results show that the proposed method is comparable to state-of-the-art methods in terms of its robustness, flexibility and accuracy.
    IEEE Transactions on Image Processing 10/2014; · 3.11 Impact Factor
  •
    ABSTRACT: Co-training is a major multi-view learning paradigm that alternately trains two classifiers on two distinct views and maximizes the mutual agreement on the two-view unlabeled data. Traditional co-training algorithms usually train a learner on each view separately and then force the learners to be consistent across views. Although many co-training algorithms have been developed, it is quite possible that a learner will receive erroneous labels for unlabeled data when the other learner has only mediocre accuracy. This usually happens in the first rounds of co-training, when there are only a few labeled examples. As a result, co-training algorithms often have unstable performance. In this paper, Hessian-regularized co-training is proposed to overcome these limitations. Specifically, each Hessian is obtained from a particular view of examples; Hessian regularization is then integrated into the learner training process of each view by penalizing the regression function along the potential manifold. The Hessian can properly exploit the local structure of the underlying data manifold. Hessian regularization significantly boosts the generalizability of a classifier, especially when there are a small number of labeled examples and a large number of unlabeled examples. To evaluate the proposed method, extensive experiments were conducted on the unstructured social activity attribute (USAA) dataset for social activity recognition. Our results demonstrate that the proposed method outperforms baseline methods, including the traditional co-training and LapCo algorithms.
    PLoS ONE 09/2014; 9(9):e108474. · 3.53 Impact Factor
  • Ruxin Wang, Dacheng Tao
    ABSTRACT: This paper comprehensively reviews the recent development of image deblurring, including non-blind/blind and spatially invariant/variant deblurring techniques. These techniques share the same objective of inferring a latent sharp image from one or several corresponding blurry images, while blind deblurring techniques must also derive an accurate blur kernel. Considering the critical role of image restoration in modern imaging systems to provide high-quality images under complex environments such as motion, undesirable lighting conditions, and imperfect system components, image deblurring has attracted growing attention in recent years. From the viewpoint of how to handle the ill-posedness that is a crucial issue in deblurring tasks, existing methods can be grouped into five categories: Bayesian inference framework, variational methods, sparse representation-based methods, homography-based modeling, and region-based methods. Despite a certain level of progress, image deblurring, especially in the blind case, is limited in its success by complex application conditions that make the blur kernel hard to obtain and often spatially variant. We provide a holistic understanding of and deep insight into image deblurring in this review. An analysis of the empirical evidence for representative methods and practical issues, as well as a discussion of promising future directions, are also presented.
    09/2014;
  •
    ABSTRACT: We address the problem of removing the video color tone jitter that is common in amateur videos recorded with hand-held devices. To achieve this, we introduce the color state to represent the exposure and white balance state of a frame. The color state of each frame can be computed by accumulating the color transformations of neighboring frame pairs. The tonal changes of the video can then be represented by a time-varying trajectory in color state space. To remove the tone jitter, we smooth the original color state trajectory by solving an L1 optimization problem with PCA dimensionality reduction. In addition, we propose a novel selective strategy to remove small tone jitter while retaining extreme exposure and white balance changes to avoid serious artifacts. Quantitative evaluation and visual comparison with previous work demonstrate the effectiveness of our tonal stabilization method. This system can also be used as a preprocessing tool for other video editing methods.
    IEEE Transactions on Image Processing 09/2014; 23(11):4838-4849. · 3.11 Impact Factor
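The pipeline (accumulate neighboring-frame color transforms into a color-state trajectory, smooth the trajectory, correct each frame onto the smoothed path) in miniature, with per-channel scalar gains and a moving average standing in for the paper's L1/PCA smoothing and its selective strategy:

```python
import numpy as np

def stabilize_tones(frame_gains, window=5):
    # frame_gains[t, c]: multiplicative color change of channel c from
    # frame t to t+1 (a scalar-gain reduction of the paper's transforms).
    # Accumulate in log space to get the color-state trajectory, smooth
    # it, and return the per-frame correction gain that moves each
    # frame onto the smoothed path.
    log_states = np.cumsum(np.log(frame_gains), axis=0)
    pad = window // 2
    padded = np.pad(log_states, ((pad, pad), (0, 0)), mode='edge')
    kernel = np.ones(window) / window
    smoothed = np.stack([np.convolve(padded[:, c], kernel, mode='valid')
                         for c in range(log_states.shape[1])], axis=1)
    return np.exp(smoothed - log_states)

rng = np.random.default_rng(0)
gains = np.exp(rng.normal(0.0, 0.1, size=(30, 3)))  # jittery R/G/B gains
corr = stabilize_tones(gains)
log_states = np.cumsum(np.log(gains), axis=0)
log_smooth = log_states + np.log(corr)              # corrected trajectory
```

The corrected trajectory has less frame-to-frame variation than the original, which is the jitter-removal effect.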
  • Tianyi Zhou, Dacheng Tao
    ABSTRACT: This paper proposes the multi-task copula (MTC), which can handle a much wider class of tasks than the mean regression with Gaussian noise assumed in most previous multi-task learning (MTL). While previous MTL emphasizes shared structure among models, MTC aims at joint prediction to exploit inter-output correlation. Given the input, the outputs of MTC are allowed to follow an arbitrary joint continuous distribution. MTC captures the joint likelihood of the multiple outputs by first learning the marginal of each output and then a sparse and smooth output dependency graph function. While the former can be achieved by classical MTL, learning graphs that vary dynamically with the input is quite a challenge. We address this issue by developing sparse graph regression (SpaGraphR), a non-parametric estimator incorporating kernel smoothing, maximum likelihood, and sparse graph structure to obtain a fast learning algorithm. It starts from a few seed graphs on a few input points, and then updates the graphs on other input points by a fast operator via coarse-to-fine propagation. Owing to the power of copulas in modeling semi-parametric distributions, SpaGraphR can model a rich class of dynamic non-Gaussian correlations. We show that MTC can address flexible and difficult tasks that do not fit the assumptions of previous MTL nicely, and can fully exploit their relatedness. Experiments on robotic control and stock price prediction justify its appealing performance in challenging MTL problems.
    08/2014;

Publication Stats

5k Citations
581.76 Total Impact Points

Institutions

  • 2010–2015
    • University of Technology Sydney 
      • Centre for Quantum Computation and Intelligent Systems (QCIS)
      Sydney, New South Wales, Australia
    • Shanghai Jiao Tong University
      Shanghai, Shanghai Shi, China
  • 2013–2014
    • University of Petroleum (East China)
      Qingdao, Shandong, China
    • China University of Petroleum
      Changping, Beijing, China
  • 2011–2013
    • National University of Defense Technology
      • National Key Laboratory of Parallel and Distributed Processing
      Changsha, Hunan, China
    • École Polytechnique Fédérale de Lausanne
      • Section d'informatique
      Lausanne, VD, Switzerland
    • Northeast Institute of Geography and Agroecology
      • National Pattern Recognition Laboratory
      Beijing, Beijing Shi, China
    • Wuhan University
      • State Key Laboratory of Information engineering in Surveying, Mapping and Remote Sensing
      Wuhan, Hubei, China
    • State Key Laboratory Of Transient Optics And Photonics
      Xi'an, Shaanxi, China
    • Peking University
      Beijing, China
    • Xiamen University
      • Department of Computer Science
      Xiamen, Fujian, China
  • 2008–2012
    • Zhejiang University
      • College of Computer Science and Technology
      Hangzhou, Zhejiang Sheng, China
    • Xidian University
      • School of Life Sciences and Technology
      Xi'an, Shaanxi, China
    • Tianjin University
      • Department of Electronic Information Engineering
      Tianjin, China
    • University of Vermont
      • Department of Computer Science
      Burlington, Vermont, United States
  • 1970–2012
    • Nanyang Technological University
      • School of Computer Engineering
      Singapore
  • 2009–2010
    • Aston University
      • School of Engineering and Applied Science
      Birmingham, ENG, United Kingdom
    • Hong Kong Baptist University
      Kowloon, Hong Kong
    • Western Kentucky University
      Kentucky, United States
    • Brunel University
      • School of Engineering and Design
      Uxbridge, England, United Kingdom
  • 2008–2010
    • Chinese Academy of Sciences
      • National Pattern Recognition Laboratory
      Beijing, China
  • 2007–2010
    • The University of Hong Kong
      • Department of Computer Science
      Hong Kong, Hong Kong
  • 2007–2009
    • The Hong Kong Polytechnic University
      • Department of Computing
      Hong Kong, Hong Kong
  • 2005–2009
    • Birkbeck, University of London
      • Department of Computer Science and Information Systems
      London, England, United Kingdom
    • University of London
      London, England, United Kingdom
    • Nanchang University
      Nanchang, Jiangxi, China
  • 2006–2008
    • The University of Sheffield
      • Department of Electronic and Electrical Engineering
      Sheffield, ENG, United Kingdom
    • Tongji Hospital
      Wuhan, Hubei, China
  • 2004–2005
    • The Chinese University of Hong Kong
      • Department of Information Engineering
      Hong Kong, Hong Kong