Dong Huang

  • PhD
  • Professor (Associate) at South China Agricultural University

About

130
Publications
41,156
Reads
3,981
Citations
Introduction
Dong Huang received the B.S. degree in computer science from South China University of Technology, Guangzhou, China, in 2009, and the M.Sc. and Ph.D. degrees in computer science from Sun Yat-sen University, Guangzhou, in 2011 and 2015, respectively. He joined South China Agricultural University, Guangzhou, in 2015, where he is currently an Associate Professor with the College of Mathematics and Informatics. From Jul. 2017 to Jul. 2018, he was a Visiting Fellow with the School of Computer Science and Engineering, Nanyang Technological University, Singapore. His research interests include data mining and machine learning, and more specifically focus on ensemble clustering, multi-view clustering, and large-scale clustering.
Current institution
South China Agricultural University
Current position
  • Professor (Associate)


Publications (130)
Article
Full-text available
The rapid emergence of high-dimensional data in various areas has brought new challenges to current ensemble clustering research. To deal with the curse of dimensionality, recently considerable efforts in ensemble clustering have been made by means of different subspace-based techniques. However, besides the emphasis on subspaces, rather limited at...
Article
Full-text available
Ensemble clustering has been a popular research topic in data mining and machine learning. Despite its significant progress in recent years, there are still two challenging issues in the current ensemble clustering research. First, most of the existing algorithms tend to investigate the ensemble information at the object-level, yet often lack the a...
Article
Full-text available
This paper focuses on scalability and robustness of spectral clustering for extremely large-scale datasets with limited resources. Two novel algorithms are proposed, namely, ultra-scalable spectral clustering (U-SPEC) and ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative selection strategy and a fast approximation meth...
Article
Full-text available
Despite significant progress, there remain three limitations to the previous multi-view clustering algorithms. First, they often suffer from high computational complexity, restricting their feasibility for large-scale datasets. Second, they typically fuse multi-view information via one-stage fusion, neglecting the possibilities in multi-stage fusio...
Article
Full-text available
Graph learning has emerged as a promising technique for multi-view clustering due to its ability to learn a unified and robust graph from multiple views. However, existing graph learning methods mostly focus on the multi-view consistency issue, yet often neglect the inconsistency between views, which makes them vulnerable to possibly low-quality or...
Article
Ensemble clustering aims to combine different base clusterings into a better clustering than that of the individual one. In general, a co-association matrix depicting the pairwise affinity between different data samples is constructed by average fusion or weighted fusion of the connective matrices from multiple base clusterings. Despite the signifi...
Article
Multi-view clustering (MVC) is essential for integrating heterogeneous data from multiple sources. However, many existing approaches are hindered by high computational complexity and the separate optimization of similarity and cluster structures. In light of these challenges, this paper presents a novel anchor-based MVC method termed simple one-ste...
Article
Purpose: This study aims to investigate whether artificial intelligence can improve the diagnostic accuracy of vertigo-related diseases. Experimental Design: Based on the clinical guidelines, clinical symptoms and laboratory test results were extracted from electronic medical records as variables. These variables were then input into a machine learn...
Article
Most existing large-scale multiview clustering algorithms attempt to capture data distribution in multiple views by selecting view-wise anchor representations beforehand with $k$-means, or by direct matrix factorization on the original observations. Despite impressive performance, few of them have paid attention to the semantic correlations betw...
Article
Deep contrastive clustering has recently gained significant attention due to its advantageous ability to leverage the contrastive learning paradigm for joint representation learning and clustering. However, previous deep contrastive clustering approaches mostly focus on instance discrimination or cluster discrimination, which often overlook the ric...
Article
Multi-view spectral clustering has achieved impressive performance by learning multiple robust and meaningful similarity graphs for clustering. Generally, the existing literature constructs multiple similarity graphs using a certain similarity measure (e.g., the Euclidean distance), which lacks the desired ability to learn sparse and reliable connec...
Article
Though many deep attributed graph clustering approaches have been developed in recent years, most still suffer from two limitations. First, in the input space, they primarily rely on the original topology structure as the input (to some graph network), lacking the ability to jointly leverage local and global topology information to refine the graph...
Article
Full-text available
Deep clustering has shown its promising capability in joint representation learning and clustering via deep neural networks. Despite the significant progress, the existing deep clustering works mostly utilize some distribution-based clustering loss, lacking the ability to unify representation learning and multi-scale structure learning. To address...
Article
Full-text available
This paper focuses on two limitations to previous multi-view clustering approaches. First, they frequently suffer from quadratic or cubic computational complexity, which restricts their feasibility for large-scale datasets. Second, they often rely on a single graph on each view, yet lack the ability to jointly explore many versatile graph structure...
Article
Full-text available
Deep clustering has recently emerged as a promising technique for complex data clustering. Despite the considerable progress, previous deep clustering works mostly build or learn the final clustering by only utilizing a single layer of representation, e.g., by performing the $K$-means clustering on the last fully-connected layer or by associatin...
Article
Incomplete multi-view clustering (IMC) has recently received widespread attention in the field of clustering analysis. In spite of the great success, we observe that the current IMC approaches are still faced with three common demerits. First, they mostly fail to recover the inherent (especially nonlinear) subspace structure during incomplete clust...
Article
Full-text available
Despite significant progress, previous multi-view unsupervised feature selection methods mostly suffer from two limitations. First, they generally utilize either cluster structure or similarity structure to guide the feature selection, which neglects the possibility of a joint formulation with mutual benefits. Second, they often learn the simila...
Article
Full-text available
This paper focuses on the problem of motorcyclist helmet detection in single images. Although some previous works have been developed to deal with this problem, most of them are designed for videos and are not suitable for single-image helmet detection. In view of this, we propose a novel dual-detection framework for single-image mot...
Preprint
Full-text available
The bipartite graph structure has shown its promising ability in facilitating the subspace clustering and spectral clustering algorithms for large-scale datasets. To avoid the post-processing via k-means during the bipartite graph partitioning, the constrained Laplacian rank (CLR) is often utilized for constraining the number of connected component...
Article
Full-text available
Recently, deep learning has shown its advantage in representation learning and clustering for time series data. Despite the considerable progress, the existing deep time series clustering approaches mostly seek to train the deep neural network with some instance-reconstruction-based or cluster-distribution-based objective, which, however, lack the...
Article
Increasing concerns about public health and safety have led to a practical need to detect smoking behaviors (or smokers) in public places. Previous smoker detection methods often focus on cigarette detection, overlooking the interaction between the smoker and the cigarette. In light of this, this paper presents a single-image smoker detection frame...
Article
Full-text available
Electroencephalogram (EEG) is an important technology for exploring the central nervous mechanism of tinnitus. However, many previous studies have struggled to obtain consistent results because of the high heterogeneity of tinnitus. In order to identify tinnitus and provide theoretical guidance for the diagnosis and treatment, we propose a robust, data-effici...
Article
Full-text available
Although previous graph-based multi-view clustering (MVC) algorithms have gained significant progress, most of them are still faced with three limitations. First, they often suffer from high computational complexity, which restricts their applications in large-scale scenarios. Second, they usually perform graph learning either at the single-view le...
Article
Incomplete multi-view clustering, which involves missing data in different views, is more challenging than multi-view clustering. To eliminate the negative influence of incomplete data, researchers have proposed a series of solutions. However, the present incomplete multi-view clustering methods still confront three major issues:...
Article
Full-text available
Deep clustering has attracted increasing attention in recent years due to its capability of joint representation learning and clustering via deep neural networks. In its latest developments, contrastive learning has emerged as an effective technique to substantially enhance the deep clustering performance. However, the existing contrastive lear...
Article
Full-text available
Contrastive deep clustering has recently gained significant attention with its ability of joint contrastive learning and clustering via deep neural networks. Despite the rapid progress, previous works mostly require both positive and negative sample pairs for contrastive clustering, which relies on a relatively large batch size. Moreover, they typicall...
Preprint
Full-text available
Contrastive deep clustering has recently gained significant attention with its ability of joint contrastive learning and clustering via deep neural networks. Despite the rapid progress, previous works mostly require both positive and negative sample pairs for contrastive clustering, which relies on a relatively large batch size. Moreover, they typicall...
Preprint
Full-text available
Recently, deep learning has shown its advantage in representation learning and clustering for time series data. Despite the considerable progress, the existing deep time series clustering approaches mostly seek to train the deep neural network with some instance-reconstruction-based or cluster-distribution-based objective, which, however, lack the...
Data
This repository provides the Matlab source code for three ensemble clustering algorithms, namely, MDEC-HC, MDEC-SC, and MDEC-BG. You can run 'demo_1.m' to test the algorithms. If you find the code helpful for your research, please cite the paper below. D. Huang, C.-D. Wang, J.-H. Lai, and C.-K. Kwoh, Toward Multidiversified Ensemble Clustering o...
Article
Low-rank multi-view subspace clustering has recently attracted increasing attention in the multi-view learning research. Despite significant progress, most existing approaches still suffer from two issues. First, they mostly focus on exploiting the low-rank consistency across multiple views, but often ignore the low-rank structure within each view....
Chapter
Multi-view subspace clustering has attracted an increasing amount of attention; its impressive performance is achieved by means of the self-expressive property, under the assumption of linear relations between multi-view data samples. However, most existing methods fail to recover the nonlinear relations between multi-view data for deeper stu...
Article
Multi-view subspace clustering aims to discover the hidden subspace structures from multiple views for robust clustering, and has been attracting considerable attention in recent years. Despite significant progress, most of the previous multi-view subspace clustering algorithms are still faced with two limitations. First, they usually focus on the...
Preprint
Full-text available
Although previous graph-based multi-view clustering algorithms have gained significant progress, most of them are still faced with three limitations. First, they often suffer from high computational complexity, which restricts their applications in large-scale scenarios. Second, they usually perform graph learning either at the single-view level or...
Chapter
Deep clustering has recently emerged as a promising direction in clustering analysis, which aims to leverage the representation learning power of deep neural networks to enhance the clustering of highly-complex data. However, most of the existing deep clustering algorithms tend to utilize a single layer (typically the last fully-connected layer) of...
Preprint
Multiview clustering has been extensively studied to take advantage of multi-source information to improve the clustering performance. In general, most of the existing works typically compute an $n \times n$ affinity graph by some similarity/distance metrics (e.g., the Euclidean distance) or learned representations, and explore the pairwise correlations ac...
Article
Full-text available
Multi-view clustering (MVC) has attracted more and more attention in the recent few years by making full use of complementary and consensus information between multiple views to cluster objects into different partitions. Although there have been two existing works for MVC survey, neither of them jointly takes the recent popular deep learning-based...
Preprint
Full-text available
Deep clustering has recently attracted significant attention. Despite the remarkable progress, most of the previous deep clustering works still suffer from two limitations. First, many of them focus on some distribution-based clustering loss, lacking the ability to exploit sample-wise (or augmentation-wise) relationships via contrastive learning. S...
Preprint
Full-text available
Vision Transformer (ViT) has shown its advantages over the convolutional neural network (CNN) with its ability to capture global long-range dependencies for visual representation learning. Besides ViT, contrastive learning is another popular research topic recently. While previous contrastive learning works are mostly based on CNNs, some recent stu...
Preprint
Full-text available
Deep clustering has attracted increasing attention in recent years due to its capability of joint representation learning and clustering via deep neural networks. In its latest developments, contrastive learning has emerged as an effective technique to substantially enhance the deep clustering performance. However, the existing contrastive lear...
Preprint
Full-text available
Deep clustering has recently emerged as a promising technique for complex image clustering. Despite the significant progress, previous deep clustering works mostly tend to construct the final clustering by utilizing a single layer of representation, e.g., by performing $K$-means on the last fully-connected layer or by associating some clustering lo...
Preprint
Full-text available
Despite the recent progress, the existing multi-view unsupervised feature selection methods mostly suffer from two limitations. First, they generally utilize either cluster structure or similarity structure to guide the feature selection, neglecting the possibility of a joint formulation with mutual benefits. Second, they often learn the similarity...
Preprint
Full-text available
Despite significant progress, there remain three limitations to the previous multi-view clustering algorithms. First, they often suffer from high computational complexity, restricting their feasibility for large-scale datasets. Second, they typically fuse multi-view information via one-stage fusion, neglecting the possibilities in multi-stage fusio...
Preprint
Full-text available
Multi-view subspace clustering aims to discover the hidden subspace structures from multiple views for robust clustering, and has been attracting considerable attention in recent years. Despite significant progress, most of the previous multi-view subspace clustering algorithms are still faced with two limitations. First, they usually focus on the...
Article
Full-text available
Multi-view subspace clustering has been an important and powerful tool for partitioning multi-view data, especially multi-view high-dimensional data. Despite great success, most of the existing multi-view subspace clustering methods still suffer from three limitations. First, they often recover the subspace structure in the original space, which ca...
Chapter
This paper deals with the problem of helmet detection on motorcyclists in single images. Some previous attempts have been made to detect helmets on motorcyclists, most of which are designed for videos and not suitable for single-image helmet detection. In this paper, we propose a single-image motorcyclist helmet detection method via deep neural net...
Chapter
This paper addresses the problem of smoker detection in a single image. Previous smoker detection works usually focus on cigarette detection, yet often neglect the rich information of smoking behavior (especially the interaction information between smoker and cigarette). Though some attempts have been made to detect the smoking behavior, they typic...
Chapter
In the paper, we propose a link-based consensus clustering approach with random walk propagation (LCC-RW), which is able to incorporate common neighborhood information as well as multi-scale indirect relationships. Specifically, the microcluster representation is adopted to facilitate the computation. With the ensemble of base clusterings represent...
Article
Since the Lipschitz properties of convolutional neural networks (CNNs) are widely considered to be related to adversarial robustness, we theoretically characterize the L-1 norm and L-infinity norm of 2D multi-channel convolutional layers and provide efficient methods to compute the exact L-1 norm and L-infinity norm. Based on our theorem, we propos...
Chapter
Multi-view subspace clustering has emerged as a crucial tool to solve the multi-view clustering problem. However, many of the existing methods merely focus on the consistency issue when learning the multi-view representations, failing to capture the latent inconsistency across different views (which can be caused by the view-specificity or diversit...
Article
Although many multi-view clustering approaches have been developed recently, one common shortcoming of most of them is that they generally rely on the original feature space or consider the two components of the similarity-based clustering separately (i.e., similarity matrix construction and cluster indicator matrix calculation), which may negative...
Article
Recently, network embedding has received a large amount of attention in network analysis. Although some network embedding methods have been developed from different perspectives, on one hand, most of the existing methods only focus on leveraging the plain network structure, ignoring the abundant attribute information of nodes. On the other hand, fo...
Data
Attached is the Matlab source code for the ensemble clustering algorithms in our TSMC-S 2021 paper. If you find it useful for your research, please cite the paper below. Dong Huang, Chang-Dong Wang, Hongxing Peng, Jianhuang Lai, and Chee-Keong Kwoh. Enhanced Ensemble Clustering via Fast Propagation of Cluster-wise Similarities, IEEE Transactions o...
Article
Multiview subspace clustering (MVSC) is a recently emerging technique that aims to discover the underlying subspace in multiview data and thereby cluster the data based on the learned subspace. Though quite a few MVSC methods have been proposed in recent years, most of them cannot explicitly preserve the locality in the learned subspaces and also n...
Article
Network embedding aims to learn the low-dimensional node representations for networks, which has attracted an increasing amount of attention in recent years. Most existing efforts in this field attempt to embed the network based on node similarity, which generally relies on edge existence statistics of the network. Instead of relying on the global...
Preprint
Full-text available
Since the Lipschitz properties of convolutional neural network (CNN) are widely considered to be related to adversarial robustness, we theoretically characterize the $\ell_1$ norm and $\ell_\infty$ norm of 2D multi-channel convolutional layers and provide efficient methods to compute the exact $\ell_1$ norm and $\ell_\infty$ norm. Based on our theo...
Article
Multi-view subspace clustering has made remarkable achievements in the field of multi-view learning for high-dimensional data. However, many existing multi-view subspace clustering methods still have two disadvantages. First, most of them only recover the subspace structure from either consistent or specific perspective. Second, they often fail to...
Preprint
Graph learning has emerged as a promising technique for multi-view clustering with its ability to learn a unified and robust graph from multiple views. However, existing graph learning methods mostly focus on the multi-view consistency issue, yet often neglect the inconsistency across multiple views, which makes them vulnerable to possibly low-qual...
Data
This repository provides the Matlab source code and experimental data for two large-scale clustering algorithms, namely, Ultra-Scalable Spectral Clustering (U-SPEC) and Ultra-Scalable Ensemble Clustering (U-SENC), both of which have nearly linear time and space complexity and are capable of robustly and efficiently partitioning ten-million-level no...
Conference Paper
Full-text available
Subspace clustering has been gaining increasing attention in recent years due to its promising ability in dealing with high-dimensional data. However, most of the existing subspace clustering methods tend to only exploit the subspace information to construct a single affinity graph (typically for spectral clustering), which often lack the ability t...
Data
This repository provides the MATLAB code for the Spectral Clustering by Subspace Randomization and Graph Fusion (SC-SRGF) algorithm. If you find it helpful for your research, please cite the paper below. X. Cai, D. Huang, C.-D. Wang, C.-K. Kwoh, Spectral Clustering by Subspace Randomization and Graph Fusion for High-Dimensional Data, in Proc. of th...
Chapter
Subspace clustering has been gaining increasing attention in recent years due to its promising ability in dealing with high-dimensional data. However, most of the existing subspace clustering methods tend to only exploit the subspace information to construct a single affinity graph (typically for spectral clustering), which often lack the ability t...
Article
Previous multi-view clustering algorithms mostly partition the multi-view data in their original feature space, the efficacy of which heavily and implicitly relies on the quality of the original feature presentation. In light of this, this paper proposes a novel approach termed Multi-view Clustering in Latent Embedding Space (MCLES), which is able...
Conference Paper
Full-text available
Previous multi-view clustering algorithms mostly partition the multi-view data in their original feature space, the efficacy of which heavily and implicitly relies on the quality of the original feature presentation. In light of this, this paper proposes a novel approach termed Multi-view Clustering in Latent Embedding Space (MCLES), which is able...
Chapter
Consensus clustering aims to combine multiple base clusters into a probably better and more robust clustering result. Despite the significant progress in recent years, the existing consensus clustering approaches are mostly designed for general-purpose scenarios, yet often lack the ability to effectively and efficiently deal with high-dimensional d...
Conference Paper
Full-text available
Random forest has been an important technique in ensemble classification, due to its effectiveness and robustness in handling complex data. But many of the previous random forest models tend to treat all features equally and often lack the ability to well reflect the potentially different importance of different features, especially in high-dimensi...
Chapter
Full-text available
Consensus clustering has in recent years become one of the most popular topics in the clustering research, due to its promising ability in combining multiple weak base clusterings into a strong consensus result. In this paper, we aim to deal with three challenging issues in consensus clustering, i.e., the high-order integration issue, the local rel...
Article
Full-text available
Community detection (or graph clustering) is crucial for unraveling the structural properties of complex networks. As an important technique in community detection, label propagation has shown the advantage of finding a good community structure with nearly linear time complexity. However, despite the progress that has been made, there are still sev...
Conference Paper
Full-text available
Graph Learning has emerged as a promising technique for multi-view clustering, and has recently attracted lots of attention due to its capability of adaptively learning a unified and probably better graph from multiple views. However, the existing multi-view graph learning methods mostly focus on the multi-view consistency, but neglect the potentia...
Data
Attached is the MATLAB source code of the SRCFS algorithm in our KBS 2019 paper. If you find it helpful for your research, please cite the paper below. D. Huang, X. Cai, and C.-D. Wang, Unsupervised Feature Selection with Multi-Subspace Randomization and Collaboration, Knowledge-Based Systems, 2019, vol.182, pp.104856.
Article
Multi-view subspace clustering is essential to many scientific problems. However, most existing methods suffer from three issues. First, these methods usually adopt a two-step framework, lacking the ability to achieve an optimal common affinity matrix across multiple views. Second, these methods are intended to solve the clustering probl...
Article
Owing to its ability to address the data sparsity and cold-start problems, Cross-Domain Collaborative Filtering (CDCF) has received a significant amount of attention. Despite significant success, most of the existing CDCF algorithms assume that all the domains are correlated, which, however, is not always guaranteed in practice. In this paper, we pr...
Article
Full-text available
Unsupervised feature selection has been an important technique in high-dimensional data analysis. Despite significant success, most of the existing unsupervised feature selection methods tend to estimate the underlying structure of data in the original feature space, but lack the ability to explore various subspaces in the high-dimensional space. I...
Conference Paper
Network embedding, which learns low-dimensional representations from networks for network information preservation, has gained considerable attention in recent years. Network embedding has been shown to outperform many traditional node representation learning methods on the tasks such as clustering, classification and visualization. However, focusi...
Article
Full-text available
Recently, low-dimensional embedding of nodes has received a large amount of attention in the field of network analysis. While the existing methods mostly focus on the network embedding of the entire network, there are also some situations where people may only be interested in some nodes (i.e., partial nodes) rather than all nodes, especially in la...
Chapter
In recent years, multi-view clustering has been widely used in many areas. As an important category of multi-view clustering, multi-view spectral clustering has recently shown promising advantages in partitioning clusters of arbitrary shapes. Despite significant success, there are still two challenging issues in multi-view spectral clustering, i.e....
Preprint
Full-text available
This paper focuses on scalability and robustness of spectral clustering for extremely large-scale datasets with limited resources. Two novel algorithms are proposed, namely, ultra-scalable spectral clustering (U-SPEC) and ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative selection strategy and a fast approximation meth...
Conference Paper
Full-text available
As one of the most classical clustering techniques, the k-means clustering has been widely used in various areas over the past few decades. Despite its significant success, there are still several challenging issues in the k-means clustering research, one of which lies in its high sensitivity to the selection of the initial cluster centers. In this...
Conference Paper
Full-text available
Consensus clustering has in recent years become one of the most popular topics in the clustering research, due to its promising ability in combining multiple weak base clusterings into a strong consensus result. In this paper, we aim to deal with three challenging issues in consensus clustering, i.e., the high-order integration issue, the local rel...
Conference Paper
Ensemble clustering has recently emerged as a powerful tool for aggregating multiple clustering results into a probably better and more robust clustering result. While many ensemble clustering techniques have been developed in recent years, there is still a potential drawback in most of the existing algorithms. They generally neglect the possible n...
Preprint
Ensemble clustering has been a popular research topic in data mining and machine learning. Despite its significant progress in recent years, there are still two challenging issues in the current ensemble clustering research. First, most of the existing algorithms tend to investigate the ensemble information at the object-level, yet often lack the a...
Article
Full-text available
Network community detection with higher-order features has become a hot research topic recently, since higher-order features that are captured at the level of small network subgraphs can help to gain new insights into the higher-order organizations of the overall network. However, most of the existing higher-order community detection methods only l...
Article
Full-text available
Given the importance of central reorganization and tinnitus, we undertook the current study to investigate changes to electroencephalogram (EEG) microstates and their association with the clinical symptoms in tinnitus. High-density (128 channel) EEG was used to explore changes in microstate features in 15 subjects with subjective tinnitus and 17 ag...
Article
Full-text available
Due to its ability to combine multiple base clusterings into a probably better and more robust clustering, the ensemble clustering technique has been attracting increasing attention in recent years. Despite the significant success, one limitation to most of the existing ensemble clustering methods is that they generally treat all base clusterings e...
