Publications: 46
Reads: 2,747
Citations: 256
Publications (46)
Despite the significant progress of conditional image generation, it remains difficult to synthesize a ground-view panorama image from a top-view aerial image. Among the core challenges are the vast differences in image appearance and resolution between aerial images and panorama images, and the limited side information available for top-to-ground...
In this paper, we tackle the problem of synthesizing a ground-view panorama image conditioned on a top-view aerial image, which is a challenging problem due to the large gap between the two image domains with different view-points. Instead of learning cross-view mapping in a feedforward pass, we propose a novel adversarial feedback GAN framework na...
Retrieving given objects hidden in a gallery set is important for public safety and decision-making. Heterogeneous pedestrian retrieval (person re-identification) aims to retrieve images of the same person from sets of different modalities for identification. To address this problem, we contribute a new character-illustration-style image and norma...
Unsupervised domain adaptation aims at learning a classification model robust to data distribution shift between a labeled source domain and an unlabeled target domain. Most existing approaches have overlooked the multi-dimensional nature of visual data, building their models in vector space. Meanwhile, the issue of limited training samples is rare...
Matching video clips of people across non-overlapping surveillance cameras (video-based person re-identification) is of significant importance in many real-world applications. In this paper, we address video-based person re-identification by developing a Local and Global Aligned Spatiotemporal Attention (LGASA) network. Our LGASA network consis...
This work focuses on unsupervised visual domain adaptation, which is still challenging in visual recognition. Most attention has been dedicated to seeking domain-invariant features of cross-domain data, but these approaches ignore the valuable discriminative information in the source domain. In this paper, we propose a Discriminative Dictionary Evol...
Learning to generate natural scenes has always been a daunting task in computer vision. This is even more laborious when generating images with very different views. When the views are very different, the view fields have little overlap or objects are occluded, making the task very challenging. In this paper, we propose to use Generative Adversari...
Synthesizing high-resolution realistic images from text descriptions is a challenging task. Almost all existing text-to-image methods employ stacked generative adversarial networks as the backbone, utilize cross-modal attention mechanisms to fuse text and image features, and use extra networks to ensure text-image semantic consistency. The existing...
Different views of one object usually represent different aspects of the object, and a single view is unlikely to comprehensively describe the object. In multi-view learning, comprehensive utilization of multi-view information is helpful. In this paper, we propose a novel supervised latent subspace learning method called multi-view intact discrimin...
Unsupervised domain adaptation (UDA) attempts to transfer knowledge learned from a labeled source domain to an unlabeled target domain. Its main challenge is the distribution gap between the two domains. Most works focus on reducing domain shift by domain alignment methods. Although these methods can reduce the domain shift, the samples far from the class ce...
Unsupervised Domain Adaptation (UDA) addresses the problem of performance degradation due to domain shift between training and testing sets, which is common in computer vision applications. Most existing UDA approaches are based on vector-form data, although the typical format of data or features in visual applications is a multi-dimensional tensor. B...
Multi-view subspace clustering aims to divide a set of multisource data into several groups according to their underlying subspace structure. Although spectral-clustering-based methods have improved multi-view clustering, their utility is limited by the separate learning manner in which affinity matrix construction and cluster indicator...
The assumption that training samples and test samples obey the same distribution is grossly violated when images are from distinct domains, which can lead to degradation in classification performance. In this paper, we propose a sparse representation based domain adaptation approach to tackle the cross-domain image classification problem. Specifica...
Existing zero-shot recognition (ZSR) approaches generally learn a projection function from the labelled training (source) dataset. However, applying the learned projection function without adaptation to the test (target) dataset is prone to the domain shift problem. In this paper, we propose a semantic double-autoencoder with attribute constraint (...
Recognizing face images in low-resolution (LR) scenarios poses greater challenges than recognizing those in high-resolution (HR) scenarios because LR images usually lack discriminative details. Previous methods ignore the existence of occlusions in the LR probe images. To alleviate this problem, we propose a low-rank representation and locality-c...
Multiple image variations occur in natural face images, such as changes of pose, illumination, occlusion and expression. For face recognition under non-specific variations, learning effective features is an important research topic. Subspace learning is a widely used face recognition technique; however, numerous subspace analysis methods do not...
Recent years have witnessed great progress in semantic segmentation using deep convolutional neural networks (DCNNs). This paper presents a novel fully convolutional network for semantic segmentation using multi-scale contextual convolutional features. Since objects in natural images tend to have various scales and aspect ratios, capturing...
Recently, position-patch based face hallucination methods have received much attention and have made promising progress due to their effectiveness and efficiency. A locality-constrained double low-rank representation (LCDLRR) method is proposed for effective face hallucination in this paper. LCDLRR attempts to directly use the image matrix based...
In human-machine interaction, the captured faces are usually low-resolution (LR), which degrades the performance of subsequent face detection and face recognition. Face hallucination is the technology of obtaining a high-resolution (HR) face from its observed LR one. In order to recover more facial details, we propose a novel method called k...
Regression techniques, such as ridge regression (RR) and logistic regression (LR), have been widely used in supervised learning for pattern classification. However, these methods mainly exploit the class label information for linear mapping function learning. They will become less effective when the number of training samples per class is small. In...
In this paper, we propose a Local Tensor Subspace Alignment algorithm (LTESA) for capturing the underlying geometric structure of images distributed on a manifold embedded in high-dimensional ambient space. The basic idea is to represent images as tensor objects and obtain locally linear structures by local rank one tensor projection (ROTP), then...
Color image recognition is one of the most important fields in pattern recognition. Both multi-set canonical correlation analysis and the kernel method are important techniques in the field of color image recognition. In this paper, we combine the two methods and propose a novel color image recognition approach: color image kernel canonical correlati...
In this paper, the Local Tensor Subspace Alignment algorithm (LTESA) is proposed to explore the substantial geometry of the image manifold by regarding images as tensor objects. LTESA characterizes the local geometry of tensors in a local tensor subspace with rank-one tensor approximation, then aligns the local tensor subspaces to achieve a global low-dimensional...
This paper develops a Local Discriminative Orthogonal Rank-One Tensor Projection (LDOROTP) technique for image feature extraction. The goal of LDOROTP is to learn a compact feature for images while endowing the feature with prominent discriminative ability. LDOROTP achieves the goal through a series of rank-one tensor projections with orthogonal c...
This paper develops a manifold-oriented stochastic neighbor projection (MSNP) technique for feature extraction. MSNP is designed to find a linear projection for the purpose of capturing the underlying pattern structure of observations that actually lie on a nonlinear manifold. In MSNP, the similarity information of observations is encoded with stoc...
Distance metric learning has attracted wide attention for its prominent influence on many methods in machine learning and pattern recognition. The earlier metric learning algorithms mostly focus on finding a distance metric from sample vectors. However, when dealing with images, the existing approaches often encounter high computational cost due to huge...
Two common ways of network comparison are global property statistics and subgraph enumeration. However, each has its own advantages and disadvantages. Global properties of networks are easy to compute, but provide only limited information; the subgraph-based method can extract the essential structure of the studied networks, but for...
Dimensionality reduction has been demonstrated to be an effective way of extracting features in pattern recognition tasks. In this paper, a new manifold learning algorithm, Local Discriminant Space Alignment (LDSA), is developed for nonlinear dimensionality reduction. In LDSA, the discriminant structure and the local geometry of the data manifold are...