Publications: 53
Reads: 2,874
Citations: 667
Publications (53)
In this paper we present a novel reflective method to estimate 2D-3D face shape across large pose. We include the knowledge that a face is a 3D object into the learning pipeline, and formulate face alignment as a 3DMM fitting problem, where the camera projection matrix and 3D shape parameters are learned by an extended cascaded pose regression fram...
Supervised Descent Method (SDM) has shown good performance in solving non-linear least squares problems in computer vision, giving state of the art results for the problem of face alignment. However, when SDM learns the generic descent maps, it is very difficult to avoid over-fitting due to the high dimensionality of the input features. In this pap...
Despite the great success of recent facial landmarks localization approaches, the presence of occlusions significantly degrades the performance of the systems. However, very few works have addressed this problem explicitly due to the high diversity of occlusion in the real world. In this paper, we address the face mask reasoning and facial landmarks l...
Face alignment involves locating several facial parts such as eyes, nose and mouth, and has been popularly tackled by fitting deformable models. In this paper, we explore the effect of the combination of structured random regressors and Constrained Local Models (CLMs). Unlike most previous CLMs, we propose novel structured random regressors to g...
We propose an adaptive Bayesian hidden Markov model for fully unsupervised part-of-speech (POS) induction. The proposed model with its inference algorithm has two extensions to the first-order Bayesian HMM with Dirichlet priors. First, our algorithm infers the optimal number of hidden states from the training corpus rather than fixing the dimensional...
In recent years, cross-domain learning algorithms have attracted much attention as a way to solve the problem of insufficient labeled data.
However, these cross-domain learning algorithms cannot be applied to subspace learning, which plays a key role in multimedia
processing. This paper envisions the cross-domain discriminative subspace learning and provides an e...
This paper presents a discriminative temporal topic model (DTTM) for facial expression recognition. Our DTTM is developed by introducing temporal and categorical information into the Latent Dirichlet Allocation (LDA) topic model. Temporal information is integrated by placing an asymmetric Dirichlet prior over document-topic distributions. The discrimin...
Distribution calibration plays an important role in cross-domain learning. However, existing distribution distance metrics are not geodesic; therefore, they cannot measure the intrinsic distance between two distributions. In this paper, we calibrate two distributions by using the geodesic distance in Riemannian symmetric space. Our method learns a...
In this paper we extend the latent Dirichlet allocation (LDA) topic model to model facial expression dynamics. Our topic model
integrates the temporal information of image sequences through redefining topic generation probability without involving new
latent variables or increasing inference difficulties. A collapsed Gibbs sampler is derived for ba...
Near-duplicate video retrieval is becoming more and more important with the exponential growth of the Web. Though various approaches have been proposed to address this problem, they mainly focus on retrieval accuracy and are infeasible for querying Web-scale video databases in real time. This paper proposes a novel method to address the effi...
Dimension reduction algorithms have attracted much attention in face recognition because they can select a subset of effective and efficient discriminative features in the face images. Most dimension reduction algorithms cannot model both the intra-class geometry and the inter-class discrimination well at the same time. In this paper, we introduc...
In recent years, cross-domain learning algorithms have attracted much attention as a way to solve the problem of insufficient labeled data. However, these cross-domain learning algorithms cannot be applied to subspace learning, which plays a key role in multimedia, e.g., Web image annotation. This paper envisions the cross-domain discriminative subspace learning...
This paper presents an unsupervised Chinese Part-of-Speech (POS) tagging model based on the first-order HMM. Unlike the conventional HMM, the number of hidden states is not fixed and is increased to fit the training data. To favor sparse distributions, Dirichlet priors are introduced with a variational inference method. To reduce the emi...
In recent years, transfer learning has attracted much attention in multimedia. In this paper, we propose an efficient transfer
dimensionality reduction algorithm called transfer discriminative Logmaps (TDL). TDL finds a common feature so that 1) the
quadratic distance between the distribution of the training set and that of the testing set is minim...
Is it possible to train a learning model to separate tigers from elks when we have 1) labeled samples of leopard and zebra and 2) unlabelled samples of tiger and elk at hand? Cross-domain learning algorithms can be used to solve the above problem. However, existing cross-domain algorithms cannot be applied to dimension reduction, which plays a key...
This paper presents a nonparametric discriminant HMM and applies it to facial expression recognition. In the proposed HMM, we introduce an effective nonparametric output probability estimation method to increase the discrimination ability at both hidden state level and class level. The proposed method uses a nonparametric adaptive kernel to utilize...
We present a model which integrates dependency parsing with reinforcement learning based on a Markov decision process. At each time step, a transition is picked to construct the dependency tree in terms of the long-run reward. The optimal policy for choosing transitions can be found with the SARSA algorithm. In SARSA, an approximation of the s...
This paper discusses a new convolution tree kernel by introducing local alignments. The main idea of the new kernel is to allow some syntactic alternations during each match between subtrees. In this paper, we give an algorithm to calculate the composite kernel. The experiment results show promising improvements on two tasks: semantic role labeling...
We present Temporal Exemplar-based Bayesian Networks (TEBNs) for facial expression recognition. The proposed Bayesian Networks (BNs) consist of three layers: Observation layer, Exemplars layer and Prior Knowledge layer. In the Exemplars layer, an exemplar-based model is integrated with BNs to improve the accuracy of probability estimation. In the P...
In this article, we propose a new postprocessing strategy, word suggestion, based on a multiple word trigger-pair language model for Chinese character recognizers. With the word suggestion strategy, Chinese character recognizers may even achieve a recognition rate greater than the top-n candidate recognition rate. To construct the multiple word tri...
Association rules have evolved from the primitive form of single-dimension intratransaction rules to the form of multi-dimension intertransaction rules. The challenge in mining multi-dimension intertransaction rules is the formidable search space. Researchers have proposed various methods to handle this problem, such as restricting the number of dimensions, con...
We introduce a C.G. constraint on adaptive random testing (ART) for programs with numerical input. One rationale behind adaptive random testing is to have the test candidates be as widespread over the input domain as possible. However, the computation may be quite expensive in some cases. The C.G. constraint is introduced to maintain the widespr...
In this paper, we introduce a C. G. constraint on Adaptive Random Testing (ART) for programs with numerical input. One rationale behind Adaptive Random Testing is to have the test candidates be as widespread over the input domain as possible. However, the computation may be quite expensive in some cases. The C. G. constraint is introduced to mai...
A good language model is essential to a postprocessing algorithm for recognition systems. The trigger pair model has been used to investigate long-distance dependencies. However, previous trigger pair models have only one word as the trigger. It is desirable that more words can be observed in the trigger for a better prediction of the triggere...
An efficient initialisation algorithm is presented for
linear-phase paraunitary filter banks by incorporating
near-power-complementary constraints on spectrally adjacent filters.
Experiments show that the proposed initialisation helps the design
parameters in the lattice structures quickly converge to a set of
initial values to be used in further r...
A new class of transmultiplexers is designed with the filter
responses spread in both the time and frequency domains. In contrast to
design algorithms with passband flatness and stopband attenuation
criteria, the new algorithm includes a time/frequency property.
Transmultiplexers with five users and linear-phase property are
experimentally constru...
An algorithm for complex-valued paraunitary filter banks with
conjugate symmetry is proposed which leads to the filters having
linear-phase frequency responses. Lattice structures are proposed that
can be treated as a generalisation of the factorisation of real-valued
linear-phase filter banks.
In this paper, a new design algorithm for complex-valued
paraunitary filter banks with conjugate symmetry is proposed which leads
the analysis and synthesis filters to have linear-phase frequency
responses. The lattice structures proposed in this paper can be treated
as a generalization of the factorization of real-valued linear-phase
filter banks....
Filter banks have been shown to be efficient in several emerging
signal communication applications. A new class of time-frequency spread
coders for transmultiplexer systems using multirate filter banks is
presented. As compared with conventional filter banks designed with
stopband attenuation and passband flatness criteria, the user coders
with the...
A new algorithm for a subclass of linear-phase paraunitary filter
banks with generalized filter length and symmetry polarity is reported.
New properties in the polyphase matrix are derived, and the lattice
factorizations for filter banks with an even and odd number of channels
are examined in a generalized algorithm.
Subband filter banks with non-uniform passband distribution in
frequency domain are studied. Several design examples are presented and
compared with conventional uniform bandwidth filter banks. Image coding
results show that filter banks with non-uniform bandwidth outperform
filter banks with uniform bandwidth, especially in low bit rate coding.
In this paper, a new design algorithm is presented for a family of
linear phase paraunitary filter banks with generalized filter length and
symmetric polarity. A number of new constraints on the distributions of
filter length and symmetry polarity among the channels are derived. In
the algorithm, the lengths of the filters are gradually reduced thr...
Fuzzy-Attribute Graph (FAG) was proposed to handle fuzziness in
the pattern primitives in structural pattern recognition. FAG has the
advantage that we can combine several possible definitions into a single
template, and hence only one matching is required instead of one for
each definition. Also, each vertex or edge of the graph can contain
fuzzy...
The symmetric extension method has been shown to be an efficient
way for subband processing of finite-length sequences. This paper
presents an extension of this method to general linear-phase
perfect-reconstruction filter banks. We derive constraints on the length
and symmetry polarity of the permissible filter banks and propose a new
design algori...
This paper describes a deformable elastic matching approach to
handwritten Chinese character recognition (HCCR). A handwritten character
is regarded as a deformable object with elastic properties. For
the same category of character, we assume that different handwriting
variations share the same topological structure, but may differ in shape
de...
A supervised learning ART model (SART) is proposed which is based on the structure of ARTMAP but is much simpler. The techniques of match tracking and complement coding have been implemented to ensure the correct selection of category and stability during the training and testing phases. Two simulations have been done in order to verify and evaluate t...
In this paper we study support preservative (SP) symmetric
extension methods for M-channel perfect-reconstruction linear-phase FIR
analysis/synthesis systems. For a given finite-duration sequence and a
given FIR linear-phase filter bank, necessary and sufficient conditions
are derived such that SP symmetric extensions exist. Moreover, explicit
meth...
High-dimensional grammars such as web grammars and plex grammars were used in syntactic recognition of complex 2-D or 3-D objects. In this paper, we present a simple modification to the attributed grammar proposed by D. E. Knuth, borrowing the concept of guards from concurrent programming. We show that the resultant grammar can handle patterns describe...
In everyday life, many properties or concepts we encounter are fuzzy in nature. To include those fuzzy properties in solving some types of problems, we have extended the attributed graph to fuzzy-attribute graph (FAG). With such extension, equality of attributes can no longer be used when matching of FAG's is considered, as equality of two fuzzy set...
An algorithm for the clustering of existing clusters is introduced in this paper. The algorithm was adapted from fuzzy c-means and modified to take into account the extra information, i.e. that some data samples already form clusters. Partition coefficients, together with some other criteria, are used for testing cluster validity. The method wa...
In everyday life, many properties or concepts we encounter are fuzzy in nature. To include those fuzzy properties in solving some types of problems, we have extended the attributed graph to fuzzy-attribute graph (FAG). With such extension, equality of attributes can no longer be used when matching of FAG's is considered, as equality of two fuzzy se...
Abstract of thesis entitled 'Fuzzy Set Theoretic Approach to Handwritten Chinese Character Recognition', submitted by Chan Kwok Ping for the degree of Doctor of Philosophy at the University of Hong Kong in June 1989. Fuzzy set theory was incorporated into two main approaches of pattern recognition: the decision-theoretic approach...
A distortion model is proposed to expand a given database so that it represents more variations in handwriting than the original. The model performs appropriate shearing and warping operations on the sample characters in order to generate additional samples with variations in stroke directions and in the relative sizes between different subpatterns...
Thesis (Ph.D.)--University of Hong Kong, 1989. Mode of Access: World Wide Web.