About
290
Publications
50,401
Reads
11,942
Citations
Publications (290)
Object detection methods have achieved remarkable performance when the training and testing data satisfy the i.i.d. assumption. However, the training and testing data may be collected from different domains, and the gap between the domains can significantly degrade the detectors. Test Time Adaptive Object Detection (TTA-OD) is a novel online app...
Due to the inefficiency of pixel-level annotations, weakly supervised salient object detection with image-category labels (WSSOD) has been receiving increasing attention. Previous works usually endeavor to generate high-quality pseudolabels to train the detectors in a fully supervised manner. However, we find that the detection performance is often...
Recently, deep hashing-based cross-modal retrieval has attracted much attention from researchers due to its fast retrieval and low storage overhead. However, the existing deep hashing-based cross-modal retrieval methods typically 1) suffer from inadequately capturing the semantic relevance and coexistent information for...
Unconstrained palmprint images have shown great potential for recognition applications due to their lower restrictions regarding hand poses and backgrounds during contactless image acquisition. However, they face two challenges: 1) Unclear palm contours and finger-valley points of unconstrained palmprint images make it difficult to locate landmarks...
Fire monitoring is an important task. We can better monitor the fires by enhancing the coverage rate of the fire wireless sensor networks. However, the maximization of the network coverage to achieve the best monitoring effect is widely recognized as a challenging problem. Thus, it is necessary to develop new technologies for the deployment of the...
Effective fire detection can identify the source of the fire faster, and reduce the risk of loss of life and property. Existing methods still fail to efficiently improve models’ multi-scale feature learning capabilities, which are significant to the detection of fire targets of various sizes. Besides, these methods often overlook the accumulation o...
As a combination of emerging multi-view learning methods and traditional multi-label classification tasks, multi-view multi-label classification has shown broad application prospects. The diverse semantic information contained in heterogeneous data effectively enables the further development of multi-label classification. However, the widespread in...
Diabetic Retinopathy (DR), the leading cause of blindness in diabetic patients, is diagnosed by the condition of multiple retinal lesions. As a difficult task in medical image segmentation, DR multi-lesion segmentation faces the following main concerns. On the one hand, retinal lesions vary in location, shape, and size. On the other hand, because...
Despite remarkable progress in multifocus image fusion, most existing methods generate only a low-resolution image if the given source images suffer from low resolution. An obvious but naive strategy is to conduct image fusion and image super-resolution independently. However, this two-step approach would inevitably...
Super-resolving the magnetic resonance (MR) image of a target contrast under the guidance of the corresponding auxiliary contrast, which provides additional anatomical information, is a new and effective solution for fast MR imaging. However, current multi-contrast super-resolution (SR) methods tend to concatenate different contrasts directly, igno...
Federated learning enables multiple hospitals to cooperatively learn a shared model without privacy disclosure. Existing methods often take a common assumption that the data from different hospitals have the same modalities. However, such a setting is difficult to fully satisfy in practical applications, since the imaging guidelines may be differen...
Unsupervised domain adaptive object detection (UDA-OD) is a challenging task that aims to improve the generalization of detectors across domains. Although the existing UDA-OD methods have demonstrated their capabilities, they fail to investigate two critical correlations in the adaptation procedure, i.e., i) the correlation between the features insi...
3D semantic occupancy has garnered considerable attention due to its abundant structural information encompassing the entire autonomous driving scene. However, existing 3D occupancy prediction methods are typically tailored for single-frame inputs, resulting in unsatisfactory performance and temporal inconsistencies in real-world continuous scenari...
Thanks to its powerful ability to depict high-resolution anatomical information, magnetic resonance imaging (MRI) has become an essential non-invasive scanning technique in clinical practice. However, excessive acquisition time often leads to the degradation of image quality and psychological discomfort among subjects, hindering its further popular...
Unsupervised domain adaptive object detection (UDA-OD) is a challenging problem since it needs to locate and recognize objects while maintaining the generalization ability across domains. Most existing UDA-OD methods directly integrate the adaptive modules into the detectors. This integration procedure can significantly sacrifice the detection perf...
Over the past few years, there has been growing interest in developing a broad, universal, and general-purpose computer vision system. Such systems have the potential to address a wide range of vision tasks simultaneously, without being limited to specific problems or data domains. This universality is crucial for practical, real-world computer vis...
The 0-1 grid method is commonly used to divide a fire building into fully passable and fully impassable areas. Firefighters are only able to perform rescue tasks in the fully passable areas. However, in an actual building fire environment, there are three types of areas: fully impassable areas (areas blocked by obstacles or with heavy smoke and fir...
The goal of camouflaged object detection (COD) is to detect objects that are visually embedded in their surroundings. Existing COD methods focus only on detecting camouflaged objects from seen classes, and their performance degrades when detecting unseen classes. However, in a real-world scenario, collecting sufficient data for seen clas...
Incomplete multiview clustering (IMC) is a hot and emerging topic. It is well known that unavoidable data incompleteness greatly weakens the effective information of multiview data. To date, existing IMC methods usually bypass unavailable views according to prior missing information, which is considered a second-best scheme based on evasion. Other...
In recent years, multi-view multi-label learning has aroused extensive research enthusiasm. However, multi-view multi-label data in the real world is commonly incomplete due to the uncertain factors of data collection and manual annotation, which means that not only are multi-view features often missing, but label completeness is also difficult to...
Diabetic retinopathy (DR) is the main cause of irreversible blindness for working-age adults. The previous models for DR detection have difficulties in clinical application. The main reason is that most of the previous methods only use single-view data, and the single field of view (FOV) only accounts for about 13% of the FOV of the retina, resulti...
As we all know, multi-view data is more expressive than single-view data and multi-label annotation enjoys richer supervision information than single-label, which makes multi-view multi-label learning widely applicable for various pattern recognition tasks. In this complex representation learning problem, three main challenges can be characterized...
Palmprint recognition provides a potential solution for noninvasive personal authentication due to its excellent contactless property and user-security, and it has attracted tremendous research interest in recent years. However, most existing methods focus on intraspectral palmprint recognition, which requires gallery and probe images to be capture...
Incomplete multi-view clustering is a hot and emerging topic. It is well known that unavoidable data incompleteness greatly weakens the effective information of multi-view data. To date, existing incomplete multi-view clustering methods usually bypass unavailable views according to prior missing information, which is considered as a second-best sch...
Fall event detection has been a research hotspot in recent years in the fields of medicine and health. Currently, vision-based fall detection methods have been considered the most promising methods due to their advantages of a non-contact characteristic and easy deployment. However, the existing vision-based fall detection methods mainly use superv...
View missing and label missing are two challenging problems in multi-view multi-label classification scenarios. In the past years, many efforts have been made to address the incomplete multi-view learning or incomplete multi-label learning problem. However, few works can simultaneously handle the challenging case with both the inc...
Due to the rapid development of multimedia technology and sensor technology, multi-view clustering (MVC) has become a research hotspot in machine learning, data mining, and other fields and has been developed significantly in the past decades. Compared with single-view clustering, MVC improves clustering performance by exploiting complementary and...
Diabetic retinopathy (DR), the main cause of irreversible blindness, is one of the most common complications of diabetes. At present, deep convolutional neural networks have achieved promising performance in automatic DR detection tasks. The convolution operation of methods is a local cross‐correlation operation, whose receptive field determines th...
Falls are a major health threat for older people. Timely assistance can reduce the extent of physical injury caused by falls. Currently, low-cost and convenient video surveillance systems based on ordinary RGB cameras are widely used for improving the safety of people. Fall detection is a research hotspot in intelligent video surveillance...
In recent years, many incomplete multi-view clustering methods have been proposed to address the new and challenging task of clustering incomplete multi-view data, in which some view representations are not collected for some samples. Although extensive experiments have validated the effectiveness of these methods for handling the incomplete l...
Weakly supervised video anomaly detection is generally formulated as a multiple instance learning (MIL) problem, where an anomaly detector learns to generate frame-level anomaly scores under the supervision of MIL-based video-level classification. However, most previous works suffer from two drawbacks: 1) they lack the ability to model temporal relatio...
Weakly supervised object detection (WSOD) has received widespread attention since it requires only image-category annotations for detector training. Many advanced approaches solve this problem by a two-phase learning framework, that is, instance mining that classifies generated proposals via multiple instance learning, and instance refinement that...
This article proposes a novel data reconstruction method, called projective cross-reconstruction (PCR) for cross-domain recognition. The intrinsic philosophy behind PCR is that the data from different domains but with the same label have a strong correlation and thus they can be reconstructed with each other. To this end, we first rearrange the dat...
Dictionary-based Classification (DC) has been a promising learning theory in multimedia computing. Previous studies focused on learning a discriminative dictionary as well as the sparsest representation based on the dictionary, to cope with the complex conditions in real-world applications. However, robustness by learning only one single dictionary...
Preserving the intrinsic structure of data is very important for unsupervised dimensionality reduction. For structure preservation, graph embedding techniques are widely used. However, most existing unsupervised graph embedding-based methods cannot effectively preserve the intrinsic structure of data since these methods either use the cons...
Open-set semi-supervised learning (OSSL), which investigates a more practical scenario where out-of-distribution (OOD) samples are contained only in unlabeled data, has attracted growing interest. Existing OSSL methods like OpenMatch learn an OOD detector to identify outliers, which often update all model parameters (i.e., full fine-tuning) to propa...
Federated learning (FL) can be used to improve data privacy and efficiency in magnetic resonance (MR) image reconstruction by enabling multiple institutions to collaborate without needing to aggregate local data. However, the domain shift caused by different MR imaging protocols can substantially degrade the performance of FL models. Recent FL tech...
Conventional multi-view clustering seeks to partition data into respective groups based on the assumption that all views are fully observed. However, in practical applications, such as disease diagnosis, multimedia analysis, and recommendation system, it is common to observe that not all views of samples are available in many cases, which leads to...
Incomplete multi-view clustering, which aims to solve the clustering problem on the incomplete multi-view data with partial view missing, has received more and more attention in recent years. Although numerous methods have been developed, most of the methods either cannot flexibly handle the incomplete multi-view data with arbitrary missing views o...
With the dramatic increase in the amount of multimedia data, cross-modal similarity retrieval has become one of the most popular yet challenging problems. Hashing offers a promising solution for large-scale cross-modal data searching by embedding the high-dimensional data into the low-dimensional similarity preserving Hamming space. However, most e...
In this article, we propose a new linear regression (LR)-based multiclass classification method, called discriminative regression with adaptive graph diffusion (DRAGD). Different from existing graph embedding-based LR methods, DRAGD introduces a new graph learning and embedding term, which explores the high-order structure information between four...
Accelerated multi-modal magnetic resonance (MR) imaging is a new and effective solution for fast MR imaging, providing superior performance in restoring the target modality from its undersampled counterpart with guidance from an auxiliary modality. However, existing works simply combine the auxiliary modality as prior information, lacking in-depth...
Therapeutic peptide prediction is critical for drug development and therapy. Researchers have developed several computational methods to identify different therapeutic peptide types. However, most computational methods focus on identifying the specific type of therapeutic peptides and fail to accurately predict all types of therapeutic...
Breast cancer accounts for the largest number of patients among all cancers in the world. Intervention treatment for early breast cancer can dramatically extend a woman's 5-year survival rate. However, the lack of publicly available breast mammography databases in the field of Computer-aided Diagnosis and the insufficient feature extraction ability f...
Multiview learning has a great potential to achieve a better performance than the conventional single-view based methods. However, some views of multiview data may be missing in the real-world applications. To cluster such incomplete multiview data, researchers tend to fill in the absent instances with the average vector of the available data for e...
In the earlier days, part segmentation methods for vehicle re-id were based on segmenting the feature map of the last convolutional layer. However, by calculating the receptive field, we can see that the size of the receptive field of each point in the feature map of the last convolutional layer exceeds that of the original input image. Therefore,...
Deep learning has been widely used in the field of mammographic image classification owing to its superiority in automatic feature extraction. However, general deep learning models cannot achieve very satisfactory classification results on mammographic images because these models are not specifically designed for mammographic images and do not take...
In this paper, we propose a Global-Supervised Contrastive loss and a view-aware-based post-processing (VABPP) method for the field of vehicle re-identification. The traditional supervised contrastive loss calculates the distances of features within the batch, so it has the local attribute. While the proposed Global-Supervised Contrastive loss has n...
Video anomaly detection (VAD) refers to the discrimination of unexpected events in videos. The deep generative model (DGM)-based method learns the regular patterns on normal videos and expects the learned model to yield larger generative errors for abnormal frames. However, DGM cannot always do so, since it usually captures the shared patterns betw...
In this article, we propose a collaborative palmprint-specific binary feature learning method and a compact network consisting of a single convolution layer for efficient palmprint feature extraction. Unlike most existing palmprint feature learning methods, such as deep-learning, which usually ignore the inherent characteristics of palmprints and l...
In this paper, we tackle the under-sampled MRI reconstruction problem with auxiliary contrasts. Instead of adopting a naive fusion scheme before feeding the multi-contrast features into the networks, we propose a more interpretable early fusion approach. By utilizing the inherent multi-contrast nature of MR image feature maps, we propose a Cross-co...
Conventional multiview clustering seeks to partition data into respective groups based on the assumption that all views are fully observed. However, in practical applications, such as disease diagnosis, multimedia analysis, and recommendation system, it is common to observe that not all views of samples are available in many cases, which leads to t...
Deep autoencoders (AEs) have demonstrated promising performance in visual anomaly detection (VAD). Having learned normal patterns from normal data, a deep AE is expected to yield larger reconstruction errors for anomalous samples, which are used as the criterion for detecting anomalies. However, this hypothesis is not always tenable since the deep AE usu...
Center loss is widely used as a supervision tool in deep learning methods. However, the center loss also has some shortcomings, the most important of which is that it must be combined with softmax loss to run well. In this article, we sum up five shortcomings of center loss and solve all of them by proposing a dual distance center loss (DDCL). Compa...
Weakly supervised object detection (WSOD) aims to train object detectors by using only image-level annotations. Many recent works on WSOD adopt multiple instance detection networks (MIDN), which usually generate a certain number of proposals and regard proposal classification as a latent model learning within image classification. However, these meth...
The continuous developments of urban and industrial environments have increased the demand for intelligent video surveillance. Deep learning has achieved remarkable performance for anomaly detection in surveillance videos. Previous approaches achieve anomaly detection with a single pretext task (image reconstruction or prediction) and detect anomal...
Magnetic resonance (MR) imaging is a commonly used scanning technique for disease detection, diagnosis and treatment monitoring. Although it is able to produce detailed images of organs and tissues with better contrast, it suffers from a long acquisition time, which makes the image quality vulnerable to, say, motion artifacts. Recently, many approach...
Magnetic Resonance Imaging (MRI) has been widely used in clinical application and pathology research to help doctors provide better diagnoses. However, accurate diagnosis by MRI remains a great challenge, as images obtained via current MRI techniques usually have low resolutions. Improving MRI image quality and resolution has thus become a critical...
The core problem of Magnetic Resonance Imaging (MRI) is the trade-off between acceleration and image quality. Image reconstruction and super-resolution are two crucial techniques in MRI. Current methods are designed to perform these tasks separately, ignoring the correlations between them. In this work, we propose an en...
Super-resolving the Magnetic Resonance (MR) image of a target contrast under the guidance of the corresponding auxiliary contrast, which provides additional anatomical information, is a new and effective solution for fast MR imaging. However, current multi-contrast super-resolution (SR) methods tend to concatenate different contrasts directly, igno...
Super-resolution (SR) plays a crucial role in improving the image quality of magnetic resonance imaging (MRI). MRI produces multi-contrast images and can provide a clear display of soft tissues. However, current super-resolution methods only employ a single contrast, or use a simple multi-contrast fusion mechanism, ignoring the rich relations among...
Magnetic resonance (MR) image acquisition is an inherently prolonged process, whose acceleration has long been the subject of research. This is commonly achieved by obtaining multiple undersampled images, simultaneously, through parallel imaging. In this article, we propose the dual-octave network (DONet), which is capable of learning multiscale sp...
Accelerating multi-modal magnetic resonance (MR) imaging is a new and effective solution for fast MR imaging, providing superior performance in restoring the target modality from its undersampled counterpart with guidance from an auxiliary modality. However, existing works simply introduce the auxiliary modality as prior information, lacking in-dep...
Since human-labeled samples are free for the target set, unsupervised person re-identification (Re-ID) has attracted much attention in recent years, by additionally exploiting the source set. However, due to the differences in camera styles, illumination, and backgrounds, there exists a large gap between source domain and target domain, introducing...
Dictionary-based classification has been promising in knowledge discovery from image data, due to its good performance and interpretable theoretical system. Dictionary learning effectively supports both small- and large-scale datasets, while its robustness and performance depend on the atoms of the dictionary most of the time. Empirically, using a...