78
Publications
8,414
Reads
2,436
Citations
Publications (78)
Federated learning (FL) enables multiple client medical institutes to collaboratively train a deep learning (DL) model with privacy protection. However, the performance of FL can be constrained by the limited availability of labeled data in small institutes and the heterogeneous (i.e., non-i.i.d.) data distribution across institutes. Though data augme...
Deep learning is usually data-starved, and unsupervised domain adaptation (UDA) was developed to transfer knowledge from the labeled source domain to the unlabeled target domain. Recently, deep self-training has emerged as a powerful means for UDA, involving an iterative process of predicting on the target domain and then taking the confident predicti...
There has been a growing interest in unsupervised domain adaptation (UDA) to alleviate the data scalability issue, while the existing works usually focus on classifying independently discrete labels. However, in many tasks (e.g., medical diagnosis), the labels are discrete and successively distributed. The UDA for ordinal classification requires in...
Unsupervised domain adaptation (UDA) has been successfully applied to transfer knowledge from a labeled source domain to target domains without their labels. The recently introduced transferable prototypical network (TPN) further addresses class-wise conditional alignment. In TPN, while the closeness of class centers between source and target domains...
This paper presents a new nested U-shape attention network (NUA-Net) with improved robustness to lesions for effective vascular segmentation in retinal imaging. Unlike most current deep learning approaches, which rely on a vanilla upsampling module to recover distinguishable features for segmentation, our attention-based multi-scale network exten...
In this work, we propose an adversarial unsupervised domain adaptation (UDA) approach with the inherent conditional and label shifts, in which we aim to align the distributions w.r.t. both $p(x|y)$ and $p(y)$. Since the label is inaccessible in the target domain, the conventional adversarial UDA assumes $p(y)$ is invariant across domains, and relie...
How to extract effective expression representations that are invariant to identity-specific attributes is a long-standing problem in facial expression recognition (FER). Most of the previous methods process the RGB images of a sequence, while we argue that the off-the-shelf and valuable expression-related muscle movement is already embedded in the...
Recent advances in unsupervised domain adaptation (UDA) show that transferable prototypical learning presents a powerful means for class conditional alignment, which encourages the closeness of cross-domain class centroids. However, the cross-domain inner-class compactness and the underlying fine-grained subtype structure remained largely underexpl...
The widely-used cross-entropy (CE) loss-based deep networks achieved significant progress w.r.t. the classification accuracy. However, the CE loss can essentially ignore the risk of misclassification which is usually measured by the distance between the prediction and label in a semantic hierarchical tree. In this paper, we propose to incorporate t...
Unsupervised domain adaptation (UDA) aims to transfer knowledge from a labeled source domain distribution to perform well on an unlabeled target domain. Recent deep self-training involves an iterative process of predicting on the target domain and then taking the confident predictions as hard pseudo-labels for retraining. However, the pseu...
This paper explores facial expression representations, free of inter-subject variations, in the compressed video domain. Most previous methods process the RGB images of a sequence, while the off-the-shelf and valuable expression-related muscle movement is already embedded in the compression format. In the up to two orders of magni...
This paper targets learning-based novel view synthesis from a single or a limited number of 2D images without pose supervision. In the viewer-centered coordinates, we construct an end-to-end trainable conditional variational framework to disentangle the relative pose/rotation learned without supervision and the implicit global 3D representation (shape, texture an...
Semantic segmentation (SS) is an important means of perception for self-driving cars and robotics, which classifies each pixel into a pre-determined class. The widely used cross-entropy (CE) loss-based deep networks have achieved significant progress w.r.t. the mean Intersection-over-Union (mIoU). However, the cross-entropy loss cannot take the diffe...
Semantic segmentation, which predicts the class of each pixel, is important for many real-world systems, e.g., autonomous vehicles. Recently, deep networks have achieved significant progress w.r.t. the mean Intersection-over-Union (mIoU) with the cross-entropy loss. However, the cross-entropy loss can essentially ignore the difference of severity for an a...
Semantic segmentation, which predicts the class of each pixel, is important for many real-world systems, e.g., autonomous vehicles. Recently, deep networks have achieved significant progress w.r.t. the mean Intersection-over-Union (mIoU) with the cross-entropy loss. However, the cross-entropy loss can essentially ignore the difference of severity for an a...
There is a large number of publicly available labeled image-based facial expression recognition datasets. How these images could help audio emotion recognition with limited labeled data, given their inherent correlations, is a meaningful and challenging question. In this paper, we propose a semi-supervised adversarial network that allows...
Semantic segmentation is a class of methods that classify each pixel in an image into semantic classes, which is critical for autonomous vehicles and surgery systems. Cross-entropy (CE) loss-based deep neural networks (DNN) have achieved great success w.r.t. accuracy-based metrics, e.g., mean Intersection-over-Union. However, the CE loss has a limitat...
Deep neural networks are usually data-starved, but manual annotation can be costly in many specific tasks, for instance, emotion recognition from audio. However, there is a large number of publicly available labeled image-based facial expression recognition datasets. How could these images help for the audio emotion recognition with limited...
Object detection locates the objects with bounding boxes and identifies their classes, which is valuable in many computer vision applications (e.g. autonomous driving). Most existing deep learning-based methods output a probability vector for instance classification trained with the one-hot label. However, the limitation of these models lies in att...
Mu Li, Kede Ma, Jane You, [...], Wangmeng Zuo
Precise estimation of the probabilistic structure of natural images plays an essential role in image compression. Despite the recent remarkable success of end-to-end optimized image compression, the latent codes are usually assumed to be fully statistically factorized in order to simplify entropy modeling. However, this assumption generally does no...
Learning-based lossy image compression usually involves the joint optimization of rate-distortion performance, and requires coping with the spatial variation of image content and contextual dependence among learned codes. Traditional entropy models can spatially adapt the local bit rate based on the image content, but usually are limited in exploi...
This paper presents a new training model for orientation-invariant object detection in aerial images by extending the deep learning-based RetinaNet, a single-stage detector based on feature pyramid networks and focal loss for dense object detection. Unlike R3Det which applies feature refinement to handle rotating objects, we proposed further...
Due to the powerful capability of the data representation, deep learning has achieved a remarkable performance in supervised hash function learning. However, most of the existing hashing methods focus on point-to-point matching that is too strict and unnecessary. In this article, we propose a novel deep supervised hashing method by relaxing the mat...
The existing residual attention network (RAN) method mainly utilizes the deeper network layer for the image objects which are to be classified. However, when the network depth is simply increased, it will lead to gradient dispersion (or explosion) effect. To address the problem, we propose a new improvement method of residual attention network for...
Mu Li, Kede Ma, Jane You, [...], Wangmeng Zuo
It has long been understood that precisely estimating the probabilistic structure of natural visual images is crucial for image compression. Despite the remarkable success of recent end-to-end optimized image compression, the latent code representation is assumed to be fully statistically factorized such that the entropy modeling is feasible. Here...
Multiview learning has been widely studied in various fields and achieved outstanding performances in comparison to many single-view-based approaches. In this paper, a novel multiview learning method based on the Gaussian process latent variable model (GPLVM) is proposed. In contrast to existing GPLVM methods which only assume that there are transf...
In this paper, we present the correlated logistic (CorrLog) model for multilabel image classification. CorrLog extends the conventional logistic regression model to multilabel cases by explicitly modeling the pairwise correlation between labels. In addition, we propose to learn the model parameters of CorrLog with elastic net regularization, which help...
A Hyperspectral Image (HSI) contains a great number of spectral bands for each pixel; however, the spatial resolution of HSI is low. Hyperspectral image super-resolution is effective to enhance the spatial resolution while preserving the high-spectral-resolution by software techniques. Recently, the existing methods have been presented to fuse HSI...
Hyperspectral remote sensing image unsupervised classification, which assigns each pixel of the image into a certain land-cover class without any training samples, plays an important role in the hyperspectral image processing but still leaves huge challenges due to the complicated and high-dimensional data observation. Although many advanced hypers...
A hyperspectral image (HSI) contains a great number of spectral bands for each pixel, which will limit the conventional image classification methods to distinguish land-cover types of each pixel. Dimensionality reduction is an effective way to improve the performance of classification. Linear discriminant analysis (LDA) is a popular dimensionality...
Hyperspectral image (HSI) classification is a widely used application to provide important information of land covers. Each pixel of an HSI has hundreds of spectral bands, which are often considered as features. However, some features are highly correlated and nonlinear. To address these problems, we propose a new discrimination analysis framework...
In recent years, the number of people suffering from diabetes mellitus (DM) has increased remarkably, and the detection of DM has attracted much attention. Unlike some existing methods, which are invasive, Traditional Chinese Medicine (TCM) provides a non-invasive strategy for DM diagnosis by exploiting some features in the...
Data clustering is the task of grouping data samples into clusters based on the relationships among samples and the structures hidden in the data, and it is a fundamental and important topic in data mining and machine learning. In the literature, spectral clustering is one of the most popular approaches and has many variants in recent years...
In recent years, various data clustering algorithms have been proposed in the data mining and engineering communities. However, traditional clustering methods still have drawbacks that are worth further investigation, such as clustering for the high dimensional data, learning an ideal affinity matrix which optimally reveals the global...
Fetal movement is an important index of fetal well-being. The absence or a reduction in fetal movement is a symptom or an alarming sign of fetal compromise or even death. The timely detection of abnormalities in fetal movement is vital to reduce the incidence of fetal loss, perinatal morbidity and maternal distress. This paper presents a smart feta...
Recently, a growing number of advanced hyperspectral remote sensing image classification techniques have been proposed and have reported superior accuracy on publicly available urban datasets, e.g., the Washington DC, and the Pavia Centre and University. Since the task of hyperspectral image classification is basically a special case of pattern...
Low-rank matrix approximation has been widely used for data subspace clustering and feature representation in many computer vision and pattern recognition applications. However, in order to enhance the discriminability, most of the matrix approximation based feature extraction algorithms usually generate the cluster labels by certain clustering alg...
Conventional dictionary learning algorithms suffer from the following problems when applied to face recognition. First, since in most face recognition applications there are only a limited number of original training samples, it is difficult to obtain a reliable dictionary with a large number of atoms from these samples. Second, because the face im...
Sparse representation has shown an attractive performance in a number of applications. However, the available sparse representation methods still suffer from some problems, and it is necessary to design more efficient methods. Particularly, to design a computationally inexpensive, easily solvable, and robust sparse representation method is a signif...
In this paper, we present the correlated logistic model (CorrLog) for multilabel image classification. CorrLog extends the conventional logistic regression model to multilabel cases by explicitly modelling the pairwise correlation between labels. In addition, we propose to learn model parameters of CorrLog with Elastic Net regularization, which helps e...
Nonnegative matrix factorization (NMF) has been successfully used in many fields as a low-dimensional representation method. Projective nonnegative matrix factorization (PNMF) is a variant of NMF that was proposed to learn a subspace for feature extraction. However, both original NMF and PNMF are sensitive to noise and are unsuitable for feature ex...
This paper presents a new system to monitor retinal microaneurysms, which are regarded as the first sign of diabetic retinopathy (DR). The proposed approach to automatic microaneurysm detection aims to enhance the screening of large populations. Most of the existing computer-aided systems for microaneurysm detection are based on the sophisticated medical dev...
Local Binary Pattern (LBP) has been successfully used in computer vision and pattern recognition applications such as texture recognition. It could effectively address gray-scale and rotation variation. However, it failed to get desirable performance for texture classification with scale transformation. In this paper, a new method based on dominant...
Principal component analysis (PCA) is widely applied in various areas; one of its typical applications is face recognition. Many versions of PCA have been developed for face recognition. However, most of these approaches are sensitive to grossly corrupted entries in a 2D matrix representing a face image. In this paper, we try to reduce the influence of gros...
Online learning is very important for processing sequential data and also helps alleviate the computational burden on large-scale data. In particular, one-pass online learning predicts the label of each incoming sample and updates the model based on the prediction, where each incoming sample is used only once and never stored. So far, existing one-pass o...
The representation based classification has achieved promising performance in high-dimensional pattern classification problems. As we know, in real-world applications the samples are usually corrupted by noise. However, representation based classification can take only noise in the test sample into account and is not able to deal with noise in the...
The image of a face varies with the illumination, pose, and facial expression, thus we say that a single face image is of high uncertainty for representing the face. In this sense, a face image is just an observation and it should not be considered as the absolutely accurate representation of the face. As more face images from the same person provi...
Effective and efficient texture feature extraction and classification is an important problem in image understanding and recognition. Recently, texton learning based texture classification approaches have been widely studied, where the textons are usually learned via K-means clustering or sparse coding methods. However, the K-means clustering is to...
In this paper, we improve the minimum squared error (MSE) algorithm for classification by modifying its classification rule. Differing from the conventional MSE algorithm which first obtains the mapping that can best transform the training sample into its class label and then exploits the obtained mapping to predict the class label of the test samp...
Video content analysis and understanding are active research topics in modern visual computing and communication. In this context, a particularly challenging problem that attracts much attention is human action recognition. In this chapter, we propose a new methodology to solve the problem using geometric statistical information. Two new approaches,...
Human hand back skin texture (HBST) is often consistent for a person and distinctive from person to person. In this paper, we study the HBST pattern recognition problem with applications to personal identification and gender classification. A specially designed system is developed to capture HBST images, and an HBST image database was established,...
Automated segmentation of blood vessels in retinal images can tell us about retinal, ophthalmic and even systemic diseases so that it can help ophthalmologists screen larger populations for vessel abnormalities. For example, the vessel width shows the abnormality of arterial narrowing, a serious damage caused by hypertension. Because the width of r...
Texture classification is a classical yet still active topic in computer vision and pattern recognition. Recently, several new texture classification approaches by modeling texture images as distributions over a set of textons have been proposed. These textons are learned as the cluster centers in the image patch feature space using the K-means clu...
The early diagnosis of proliferative diabetic retinopathy (PDR), a common complication of diabetes that damages the retina, is crucial to protecting the vision of diabetes sufferers. The onset of PDR is signaled by the appearance of neovascular nets. Such neovascular nets might be identified using retinal vessel extraction techniques. The com...
This paper presents a new approach to content-based image retrieval by using dynamic indexing and guided search in a hierarchical structure, and extending data mining and data warehousing techniques. The proposed algorithms include: a wavelet-based scheme for multiple image feature extraction, the extension of a conventional data warehouse and an i...
Wheat storage is an important part of grain logistics and is vital for the security of food that uses wheat as a raw material. Currently, the estimated storage period for wheat in China is 3-5 years. This paper establishes a mathematical model based on a Markov chain and applies it to data from 70 national reserved grain depots of Henan Provin...
With the rapid advances in computing and electronic imaging technology, there has been increasing interest in developing computer aided medical diagnosis systems to improve the medical service for the public. Images of ocular fundus provide crucial observable features for diagnosing many kinds of pathologies such as diabetes, hypertension, and arte...
This paper presents a new approach to automated retinal vessel segmentation based on multiscale analysis and adaptive thresholding. The accurate identification of the appearance of blood vessels in ocular fundus plays an important role in medical diagnosis of many diseases. In contrast to the existing methods for computer aided diagnosis which are...
This paper presents an approach to the parallel implementation of wavelet transforms in a distributed computing environment. To achieve robustness and efficiency, we propose a parallel algorithm for the wavelet transform which can be implemented in SIMD, MIMD and pipeline architectures on the configured system. Our experimental results show that our propo...
We present a wavelet-based, high performance, hierarchical scheme for image matching which includes (1) dynamic detection of interesting points as feature points at different levels of subband images via the wavelet transform, (2) adaptive thresholding selection based on compactness measures of fuzzy sets in image feature space, and (3) a guided se...