About
43
Publications
5,973
Reads
1,389
Citations
Publications (43)
We explore using body gestures for hidden emotional state analysis. As an important form of non-verbal communication, human body gestures are capable of conveying emotional information during social communication. Previous work has focused mainly on facial expressions, speech, or expressive body gestures to interpret classical expre...
Emotion recognition from body gestures is challenging since similar emotions can be expressed by arbitrary spatial configurations of joints, which results in relying on modeling spatial-temporal patterns from a more global level. However, most recent powerful graph convolution networks (GCNs) separate the spatial and temporal modeling into isolated...
We introduce a new dataset for emotional artificial intelligence research: the identity-free video dataset for Micro-Gesture Understanding and Emotion analysis (iMiGUE). Unlike existing public datasets, iMiGUE focuses on nonverbal body gestures without using any identity information, while the predominant research on emotion analysis conc...
Gesture recognition has attracted considerable attention owing to its great potential in applications. Although great progress has been made recently in multi-modal learning methods, existing methods still lack effective integration to fully exploit synergies among spatio-temporal modalities for gesture recognition. The problems are...
Skeleton-based action recognition plays a critical role in computer vision research, and its applications have been widely deployed in many areas. Currently, benefiting from graph convolutional networks (GCNs), the performance of this task has improved dramatically owing to the powerful ability of GCNs to model non-Euclidean data. However,...
To address the problem of training on small datasets for action recognition tasks, most prior works either rely on a large number of training samples or require pre-trained models transferred from other large datasets to tackle overfitting. However, this limits such research to organizations with strong computational capabilities....
Learning 3D skeleton-based representation for gesture recognition has progressively stood out because of its invariance to the viewpoint and background dynamics of video. Typically, existing techniques use absolute coordinates to determine human motion features. The recognition of gestures, however, is irrespective of the position of the performer,...
Classifying and rendering volumes of the structure are two essential goals of the visualization process. However, loss of some voxels can cause poor visualization results, such as small holes or non-smooth patches in visualized volumes. Beginning with the classified volumes, we propose a modified Allen-Cahn equation, which has the motion of mean cu...
Modeling temporal dynamics is an open issue for human body gestures. One solution is to resort to generative models, such as the hidden Markov model (HMM). Nevertheless, most of this work assumes fixed anchors for each hidden state, which makes it hard to describe the explicit temporal structure of gestures. Based on the observation that a gesture...
Online segmentation and recognition of skeleton-based gestures are challenging. Compared with offline cases, inference in online settings can rely only on the current few frames and always completes before the whole temporal movement is performed. However, incompletely performed gestures are ambiguous and their early recognition is easy to fall...
Fine-grained image classification aims at recognizing different subordinates in one basic-level category, for example, distinguishing species of birds. Compared with basic-level classification, it has both low inter-class and high intra-class variances. Therefore, utilization of discriminative parts is crucial for fine-grained classification. In thi...
Saliency integration has aroused general concern on unifying saliency maps from multiple saliency models. Previous offline integration methods usually face two challenges: 1. if most of the candidate saliency models misjudge the saliency on an image, the integration result will lean heavily on those inferior candidate models; 2. an unawareness of t...
In this paper, we propose two novel regularization models, patch-wise and pixel-wise respectively, which efficiently reconstruct a high-resolution (HR) face image from a low-resolution (LR) input. Unlike the conventional patch-based models, which depend on the assumption of local geometry consistency in LR and HR spaces, the proposed method direc...
Recent saliency models rely on propagation to compute the saliency map. Previous propagation methods are single directional, where foreground propagation and background propagation are separate (e.g., only foreground propagation, or background propagation after foreground propagation). Different from the previous approaches, we propose a bi-directi...
Inspired by the fact that the matrix formed by nonlocal similar patches in a natural image is of low rank, the nuclear norm minimization (NNM) has been widely used in various image processing studies. Nonetheless, nuclear norm based convex surrogate of the rank function usually over-shrinks the rank components since it treats different components e...
Patch-based sparse representation modeling has shown great potential in image compressive sensing (CS) reconstruction. However, this model usually suffers from some limitations, such as computationally expensive dictionary learning and neglect of the relationships among similar patches. In this paper, a group-based sparse representation method wit...
Compressed sensing (CS) has been successfully utilized by many computer vision applications. However, the task of signal reconstruction is still challenging, especially when we only have the CS measurements of an image (CS image reconstruction). Compared with the task of traditional image restoration (e.g., image denoising, deblurring and inpainting,...
Background subtraction is a key step for a wide spectrum of video applications such as object tracking and human behavior analysis. Compressive sensing based methods, which make few specific assumptions about the background, have recently attracted wide attention in background subtraction. Within the framework of compressive sensing, backgroun...
The visualization of arteries and heart usually plays a crucial role in the clinical diagnosis, but researchers face the problems of region selection and mutual occlusion in clinical visualization. Therefore, the arteries and the heart cannot be easily visualized by current visualization methods. To solve the problems, we propose a new framework fo...
Sparse coding has achieved great success in various image processing studies. However, there is no benchmark to measure the sparsity of an image patch/group because sparse discriminant conditions do not remain unchanged. This paper analyzes the sparsity of a group based on the strategy of rank minimization. Firstly, an adaptive dictionary for e...
Millions of images on the web enable us to explore images from social events such as a family party; it is therefore of interest to understand and model the affect exhibited by a group of people in images. But analysis of the affect expressed by multiple people is challenging due to varied indoor and outdoor settings, and interactions taking place betwe...
Group sparsity has shown great potential in various low-level vision tasks (e.g., image denoising, deblurring and inpainting). In this paper, we propose a new prior model for image denoising via group sparsity residual constraint (GSRC). To enhance the performance of group sparse-based image denoising, the concept of group sparsity residual is propo...
As the matrix formed by nonlocal similar patches in a natural image is of low rank, nuclear norm minimization (NNM) has been widely studied for image processing. Although the singular values have clear meanings and should be treated differently, NNM regularizes each of them equally, which often restricts its capability and flexibility. Recent ad...
Recently, there has been increasing interest in inferring micro-expressions from facial image sequences. Due to the subtle facial movement of micro-expressions, feature extraction has become an important and critical issue for spontaneous facial micro-expression recognition. Recent works usually used spatiotemporal local binary pattern for micro-expression...
Low-rank and sparse representation based methods, which make few specific assumptions about the background, have recently attracted wide attention in background modeling. With these methods, moving objects in the scene are modeled as pixel-wise sparse outliers. However, in many practical scenarios, the distributions of these moving parts are not t...
In this paper, a novel two-phase framework is presented to deal with the face hallucination problem. In the first phase, an initial high-resolution (HR) face image is produced patch-wise. Each input low-resolution (LR) patch is represented as a linear combination of training patches and the corresponding HR patch is estimated by the same combina...
In this paper, a novel foreground detection method based on a two-stage framework is presented. In the first stage, a class of structured sparsity-inducing norms is introduced to model moving objects in videos, so that the observed sequence is regarded as the sum of a low-rank matrix and a structured sparse outlier matrix. In virtue of ad...
Mixture of Gaussians (MoG) is well known for effectively handling background variations and has been widely adopted for background subtraction. However, in complex backgrounds, MoG often struggles to balance model convergence speed and stability. The main difficulty is the selection of learning rates. In this paper, an effec...