Fig. 1
The structure of the proposed Convolutional AutoEncoder (CAE) for MNIST. In the middle is a fully connected autoencoder whose embedded layer is composed of only 10 neurons. The remaining layers are convolutional layers and convolutional transpose layers (which some works refer to as deconvolutional layers). The network can be trained directly in an end-to-end manner.
Source publication
Deep clustering utilizes deep neural networks to learn feature representations that are suitable for clustering tasks. Though demonstrating promising performance in various applications, we observe that existing deep clustering algorithms either do not take full advantage of convolutional neural networks or do not considerably preserve the local structure of the data ...
Contexts in source publication
Context 1
... The DCEC structure is composed of a CAE (see Fig. 1) and a clustering layer connected to the embedded layer of the CAE, as depicted in Fig. 2. The clustering layer maps each embedded point z_i of input image x_i to a soft label. The clustering loss L_c is then defined as the Kullback-Leibler (KL) divergence between the distribution of soft labels and a predefined target distribution. The CAE is used to learn embedded features, and the clustering loss guides the embedded features to be prone to forming ...
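The snippet stops short of the formulas, so here is a minimal NumPy sketch of the DEC-style soft assignment and target distribution that this line of work builds on. The function names and toy data are illustrative assumptions, not the authors' code; alpha, the Student's t degrees of freedom, is conventionally set to 1.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignment q_ij between embedded points z_i and cluster centers mu_j."""
    # squared distances ||z_i - mu_j||^2, shape (n, k)
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target p_ij = (q_ij^2 / f_j) / sum_j' (q_ij'^2 / f_j'), with f_j = sum_i q_ij."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def clustering_loss(q, p):
    """Clustering loss L_c = KL(P || Q) = sum_i sum_j p_ij log(p_ij / q_ij)."""
    return float((p * np.log(p / q)).sum())

# toy usage: 6 embedded points in 10-D, 2 cluster centers
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 10))
mu = rng.normal(size=(2, 10))
q = soft_assign(z, mu)
p = target_distribution(q)
print(clustering_loss(q, p))
```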
Context 2
... propose a new Convolutional AutoEncoder (CAE) that does not need tedious layer-wise pretraining, as shown in Fig. 1. First, some convolutional layers are stacked on the input images to extract hierarchical features. Then all units in the last convolutional layer are flattened to form a vector, followed by a fully connected layer with only 10 units, which is called the embedded layer. The input 2D image is thus transformed into a 10-dimensional feature space. To train the network in an unsupervised manner, we use a fully connected layer and some convolutional transpose layers to transform the embedded features back to the original image. The parameters of the encoder h = F_ω(x) and the decoder x′ = G_ω′(h) are updated by minimizing the reconstruction ...
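For concreteness, below is a minimal PyTorch sketch of such a CAE for 28×28 MNIST images: stacked stride-2 convolutions, a 10-unit fully connected embedded layer, and a decoder built from a fully connected layer plus convolutional transpose layers. The specific filter counts, kernel sizes, and strides are plausible assumptions for illustration, not values quoted from the paper.

```python
import torch
import torch.nn as nn

class CAE(nn.Module):
    """Convolutional autoencoder with a 10-unit embedded layer, trainable end to end."""
    def __init__(self, embed_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),   # 28x28 -> 14x14
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),  # 14x14 -> 7x7
            nn.Conv2d(64, 128, 3, stride=2), nn.ReLU(),            # 7x7  -> 3x3
            nn.Flatten(),
            nn.Linear(128 * 3 * 3, embed_dim),                     # embedded layer h
        )
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, 128 * 3 * 3), nn.ReLU(),
            nn.Unflatten(1, (128, 3, 3)),
            nn.ConvTranspose2d(128, 64, 3, stride=2), nn.ReLU(),   # 3x3 -> 7x7
            nn.ConvTranspose2d(64, 32, 5, stride=2, padding=2,
                               output_padding=1), nn.ReLU(),       # 7x7 -> 14x14
            nn.ConvTranspose2d(32, 1, 5, stride=2, padding=2,
                               output_padding=1),                  # 14x14 -> 28x28
        )

    def forward(self, x):
        h = self.encoder(x)           # h = F_ω(x): 10-D embedding
        return self.decoder(h), h     # x′ = G_ω′(h): reconstruction

model = CAE()
x = torch.randn(4, 1, 28, 28)
x_rec, h = model(x)
loss = nn.functional.mse_loss(x_rec, x)  # reconstruction loss
```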
Similar publications
In the computation of flow problems, the modeling of moving boundaries and interfaces is still challenging. There are different techniques to model such problems: interface-tracking and interface-capturing methods. Here we study an interface-capturing method called the convected level-set. The difference from the standard level-set method is that the rei...
Citations
... We consider the item recommendation problem in the FL context, which involves learning a cluster-wise semi-global model from the set of clusters [26]. Cluster formation is an important aspect; considering the diverse amount of interactions at each client, the privacy settings of the FL environment, and the geographic characteristics of users [45], we perform clustering through the KL divergence between the local data distributions of each client and the information entropy of the clients' local data [16]. Depending on the particular production system, other clustering frameworks could be utilized. ...
Privacy and trust are highly demanding in practical recommendation engines. Although Federated Learning (FL) has significantly addressed privacy concerns, commercial operators still worry about several technical challenges when bringing FL into production. Additionally, classical FL has several intrinsic operational limitations, such as a single point of failure, data and model tampering, and heterogeneous clients participating in the FL process. To address these challenges in practical recommenders, we propose a responsible recommendation generation framework based on blockchain-empowered asynchronous FL that can be adopted for any model-based recommender system. On top of standard FL settings, we build an additional aggregation layer in which multiple trusted nodes, guided by a mediator component, perform gradient aggregation to achieve an optimal model locally in a parallel fashion. The mediator partitions users into clusters, and each cluster is represented by a cluster head. Once a cluster reaches semi-global convergence, the cluster head transmits its model gradients to the FL server for global aggregation. Additionally, the trusted cluster heads are responsible for submitting the converged semi-global model to a blockchain to ensure tamper resilience. In our settings, the mediator component acts as an independent observer that monitors the performance of each cluster head, updates a reward score, and records it in a digital ledger. Finally, evaluation results on three diversified benchmarks illustrate that the recommendation performance on the selected measures is considerably comparable with the standard and federated versions of a well-known neural collaborative filtering recommender.
... With the rapid development of deep learning in computer vision [21], recent studies have used neural networks for this task [22][23][24][25]. Autoencoders [26][27][28][29][30][31] and GANs [32,33] are the two major areas of interest within the field of reconstruction-based methods. In these methods, the training data consist of normal samples, and in the test phase a reconstruction error is used to distinguish abnormal samples. ...
... In most cases, the mean squared error (MSE) is used as the loss function. Some deep clustering algorithms use an autoencoder for pre-training and the latent-space variable z for clustering and optimization [27,28,30,48]. At the same time, some scholars have also used the low reconstruction error of autoencoders to identify abnormal data [8,13]. ...
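As a sketch of the recipe these citations describe (pre-train with an MSE reconstruction loss, then cluster the latent variable z), and assuming the illustrative CAE class from the earlier sketch together with scikit-learn's KMeans; this is a generic illustration, not any single cited paper's pipeline.

```python
import torch
from sklearn.cluster import KMeans

def pretrain_and_cluster(model, data, n_clusters=10, epochs=50, lr=1e-3):
    """Pre-train an autoencoder with MSE, then run k-means on the latent codes z."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        x_rec, _ = model(data)
        loss = torch.nn.functional.mse_loss(x_rec, data)  # reconstruction (MSE) loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        _, z = model(data)                     # latent space variable z
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(z.numpy())
```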
... From the above review, it can be seen that sampling from the perspective of the data distribution is a better solution, and clustering algorithms are more advantageous in this regard [27,28,30,48]. Therefore, we propose a clustering-based downsampling method to further improve the above methods, aiming to reduce the number of data pairs without destroying the data distribution. ...
Under X-ray scanning, a mobile phone modified with explosives at the battery is stealthy, which increases the difficulty for security inspections of detecting prohibited phones. This critical issue has not received enough attention. In this paper, we contribute the first modified-mobile-phone X-ray image benchmark for object recognition, named MPXray. MPXray focuses on the detection of prohibited items that are imbalanced and fine-grained. To deal with this anomaly detection task, in which few abnormal samples are available, we propose a few-shot prohibited phone detection (FSPPD) model based on contrastive learning. FSPPD uses an unsupervised sampling module (USM) to obtain anchors that are more representative of the data distribution, so as to construct balanced inputs for contrastive learning. To handle the hard-to-classify cases caused by fine-grained samples, an anchor-wise contrastive loss (AW-CL) is designed to supervise the model in speeding up the drawing together of intra-class samples and the separation of between-class samples. FSPPD is particularly suitable for applications where electronic products need to be checked individually. We evaluate our model on MPXray from both the classification perspective and the anomaly detection perspective. Experimental results show that our model achieves better recall for modified mobile phones. Additionally, we verify the generalization ability of the proposed model on the CIFAR10 dataset. Compared with widely used algorithms, our model shows an advantage in recall metrics.
... Transformers, initially introduced for natural language processing tasks, use self-attention mechanisms to assign different weights to input features. This way, they yield state-of-the-art results across a variety of NLP tasks [4][5][6][7][8][9][10][11][12][13]. ...
The aim of this research was to develop and deploy efficient deep convolutional neural network (DCNN) frameworks for detecting and discriminating between various categories of designer drugs. These are of particular relevance in forensic contexts, aiding efforts to prevent and counter drug use and trafficking and supporting associated legal investigations. Our multinomial classification architectures, based on Attenuated Total Reflectance Fourier-Transform Infrared (ATR-FTIR) spectra, are primarily tailored to accurately identify synthetic cannabinoids. Within the scope of our dataset, they also adeptly detect other forensically significant drugs and misused prescription medications. The artificial intelligence (AI) models we developed use two platforms: our custom-designed, pre-trained Convolutional Autoencoder (CAE) and a structure derived from the Vision Transformer Trained on ImageNet Competition Data (ViT-B/32) model. In order to compare and refine our models, various loss functions (cross-entropy and focal loss) and optimization algorithms (Adaptive Moment Estimation, Stochastic Gradient Descent, Sign Stochastic Gradient Descent, and Root Mean Square Propagation) were tested and evaluated at differing learning rates. This study shows that innovative transfer learning methods, which integrate both unsupervised and supervised techniques with spectroscopic data pre-processing (ATR correction, normalization, smoothing), present significant benefits. Their effectiveness in training AI systems on limited, imbalanced datasets is particularly notable. The strategic deployment of CAEs, complemented by data augmentation and synthetic sample generation using the Synthetic Minority Oversampling Technique (SMOTE) and class weights, effectively addresses the challenges posed by such datasets. The robustness and adaptability of our DCNN models are discussed, emphasizing their reliability and portability for real-world applications. Beyond their primary forensic utility, these systems demonstrate versatility, making them suitable for broader computer vision tasks, notably image classification and object detection.
... Several other deep clustering models can be found in [1,27]. Deep clustering convolutional neural networks for image data, such as the DAE or DCAE models, can be found in the work of Huang et al. [20] and Guo et al. [17]. Recently, some interesting methods have been proposed to enhance the representations in the encoding part, such as the self-organizing map (SOM), the growing neural gas (GNG), and generative adversarial networks (GANs). ...
Deep clustering is an approach that uses deep learning to cluster data by training a neural network model to learn a data representation that is suitable for clustering. Deep clustering has been applied to a wide range of data types, including images, texts, and time series, and has the advantage of being able to automatically learn features from the data, which can be more effective than using hand-crafted features. It is also able to handle high-dimensional data, such as time series with many variables, which can be challenging for traditional clustering techniques. In this paper, we introduce a novel deep neural network type that improves the performance of the autoencoder part by ignoring unnecessary extra noise and labelling the input data. Our approach is helpful when only a limited amount of labelled data is available and labelling a large amount of data would be costly or time-consuming. It also applies when the data are high-dimensional and it is difficult to define a good set of features for clustering.
... Pooling layers are used to eliminate feature noise and to reduce the size of the input data. Activation functions are also applied in convolutional layers [27]. ...
The recognition of handwritten characters and numbers is a complex challenge in the field of pattern recognition, especially for the Arabic language. While significant progress has been made in the automatic recognition of Latin handwritten characters, methods and approaches for the Arabic language remain insufficient. Deep learning technologies, in particular autoencoders (AEs), offer new perspectives for handwriting recognition. In this article, we introduce the most popular types of AE, such as the Convolutional AE (CAE), Sparse AE (SAE), Denoising AE (DAE), and Variational AE (VAE), and evaluate their performance on two reference databases: the Modified Arabic Digits dataBase (MADBase) and the Arabic Handwritten Character Dataset (AHCD). Using data augmentation to improve results, the VAE showed higher accuracy than the other deep learning algorithms on both databases, with very encouraging results of 98.77% on MADBase and 98.42% on AHCD.
... It must be noted that these techniques make use of a deep neural network for feature extraction and then use a traditional clustering algorithm to classify each pixel. While some techniques embed the clustering step into the deep learning architecture, these are not adapted to HSI or 3D convolutions [24]. The development of such an architecture should be validated on the well-known Earth remote sensing data sets, a process which lies outside the scope of this work. ...
The remnants of explosive volcanism on Mercury have been observed in the form of vents and pyroclastic deposits, termed faculae, using data from the Mercury Atmospheric and Surface Composition Spectrometer (MASCS) onboard the MErcury Surface, Space ENvironment, GEochemistry, and Ranging (MESSENGER) spacecraft. Although these features present a wide variety of sizes, shapes, and spectral properties, the large number of observations and the lack of high-resolution hyperspectral images complicate their detailed characterisation. We investigate the application of unsupervised deep learning to explore the diversity and constrain the extent of the Hermean pyroclastic deposits. We use a three-dimensional convolutional autoencoder (3DCAE) to extract the spectral and spatial attributes that characterise these features and to create cluster maps, constructing a unique framework for comparing different deposits. From the cluster maps we define the boundaries of 55 irregular deposits covering 110 vents and compare the results with previous radius and surface estimates. We find that the network is capable of extracting spatial information, such as the borders of the faculae, and spectral information, to altogether highlight the pyroclastic deposits against the background terrain. Overall, we find the 3DCAE an effective technique for analysing sparse observations in planetary sciences.
... In this study, we chose a nonlinear approach by fitting and deploying deep learning autoencoders to our dataset. Whilst the WIBS does not generate many dimensions, AEs have been shown to improve the performance of clustering methods, including K-means, in a number of applications [48][49][50]. Following this, at the third level, we employ a range of clustering approaches and evaluate each through a side-by-side comparison of cluster properties. ...
In a comparative study contrasting new and traditional clustering techniques, the capabilities of K-means, the hierarchical clustering algorithm (HCA), and GenieClust were examined. Both K-means and HCA demonstrated strong consistency in cluster profiles and sizes, emphasizing their effectiveness in differentiating particle types and confirming that the fundamental patterns within the data were captured reliably. An added dimension to the study was the integration of an autoencoder (AE). When coupled with K-means, the AE enhanced outlier detection, particularly in identifying the compositional loadings of each cluster. Conversely, whilst the AE's application to all methods revealed a potential for noise reduction by removing infrequent, larger particles, in the case of HCA the information distortion during the encoding process may have affected the clustering outcomes by reducing the number of observably distinct clusters. The findings from this study indicate that GenieClust, when applied both with and without an AE, was effective in delineating a notable number of distinct clusters. Furthermore, each cluster's compositional loadings exhibited greater internal variability, distinguishing up to 3× more particle types per cluster compared to traditional means, underscoring the algorithms' ability to differentiate subtle data patterns. The work here postulates that applying GenieClust both with and without an AE may provide important information: initial outlier detection without an AE, and enriched speciation with an AE applied, evidenced by a greater number of distinct clusters within the main body of the data.
... Among various neural networks, the convolutional autoencoder is the most suitable architecture for extracting features from high-dimensional images, for two primary reasons: 1. The CNN-based architecture better retains the connectivity information between the pixels of an image (Guo et al. 2017). 2. Slicing and stacking the data in other neural networks leads to a large loss of information. ...
Streamflow prediction of rivers is crucial for making decisions in watershed and inland waterways management. The US Army Corps of Engineers (USACE) uses a river routing model called RAPID to predict water discharges for thousands of rivers in the network for watershed and inland waterways management. However, the calibration of hydrological streamflow parameters in RAPID is time-consuming and requires streamflow measurement data which may not be available for some ungauged locations. In this study, we aim to address the calibration aspect of the RAPID model by exploring machine learning (ML)-based methods to facilitate efficient calibration of hydrological model parameters without the need for streamflow measurements. Various ML models are constructed and compared to learn a relationship between hydrological model parameters and various river parameters, such as length, slope, catchment size, percentage of vegetation, and elevation contours. The studied ML models include Gaussian process regression, Gaussian mixture copula, Random Forest, and XGBoost. This study has shown that ML models that are carefully constructed by considering causal and sensitive input features offer a potential approach that not only obtains calibrated hydrological model parameters with reasonable accuracy but also bypasses the current calibration challenges.
HIGHLIGHTS
Calibration of a hydrology model using machine learning;
Learning the unknown relationship between model parameters and river features;
Rapid calibration of hydrological model parameters;
Comparative study of different machine learning techniques.
... Depending on the type of network, DC algorithms can be classified into four types: autoencoder (AE)-based, variational AE (VAE)-based, generative adversarial network (GAN)-based, and convolutional neural network (CNN)-based [15]. AE-based models are currently the most popular, and the associated structures have been used and have shown good performance in a large number of experiments [16], [17], [18], [19], [20]. In the past five years, Guo et al. [21] proposed ignoring the network loss and using only the clustering loss as the loss of the entire model. ...
Multitask learning uses external knowledge to improve internal clustering and single-task learning. Existing multitask learning algorithms mostly use shallow-level correlations to aid judgment, and boundary factors on high-dimensional datasets often lead such algorithms to poor performance: their initial parameters cause the border samples to fall into a local optimal solution. In this study, a multitask-guided deep clustering (DC) method with boundary adaptation (MTDC-BA), based on a convolutional neural network autoencoder (CNN-AE), is proposed. In the first stage, dubbed multitask pretraining (M-train), we construct an autoencoder named CNN-AE using a DenseNet-like structure, which performs deep feature extraction and stores the captured multitask knowledge in the model parameters. In the second stage, termed single-task fitting (S-fit), the parameters from M-train are shared with the CNN-AE, and clustering results are obtained from the deep features. To eliminate the boundary effect, we use data augmentation and improved self-paced learning to construct the boundary adaptation, integrating boundary adaptors into the M-train and S-fit stages as appropriate. The interpretability of MTDC-BA is achieved through data transformation: the model relies on the principle that features become important as the reconstruction loss decreases. Experiments on a series of typical datasets confirm the performance of the proposed MTDC-BA. Compared with other clustering methods, including single-task DC algorithms and the latest multitask clustering algorithms, MTDC-BA achieves better clustering performance with higher computational efficiency. Deep-feature clustering results demonstrate the stability of MTDC-BA through visualization and convergence verification. Through visualization experiments, we explain and analyze the model's data input and intermediate feature layers, giving a further understanding of the principle of MTDC-BA. Additional experiments show that the proposed MTDC-BA makes efficient use of multitask knowledge. Finally, we carry out sensitivity experiments on the hyper-parameters to verify their optimal performance.
... Namely, autoencoders (AEs), Deep Belief Networks (DBNs) [23], Convolutional Neural Networks (CNNs) [24,25], and Generative Adversarial Networks (GANs) [26] have been introduced and used in various deep clustering applications. In particular, autoencoders have been widely adapted to address challenges relevant to the deep clustering architectures [21,[27][28][29]. A recent survey [30] distinguishes deep clustering methods in terms of methodology, prior knowledge, and architecture. ...
... For AE-based deep clustering approaches, the encoder layers of the trained autoencoder are the ones leveraged for the feature transformation into a lower-dimensional space, which serves as the input for the clustering algorithm. Recently, an AE-based deep clustering architecture, deep embedded clustering (DEC) [21], was proposed; it was then followed by a number of variants [19,[27][28][29][31][32][33][34][35][36] that used DEC as the basis for their frameworks. Since then, deep clustering has become a growing research field, with DEC [21] serving as the experimental benchmark for many deep clustering approaches [29,34]. ...
... Specifically, the IDEC model [37] extended DEC by preserving the local structure of the data in the feature space: it keeps the decoder layers to avoid distortion of the feature space by the clustering loss. Another model, introduced in [27], extended IDEC by adopting a convolutional autoencoder (CAE) as the deep network. Later, several deep clustering works based on CAEs were suggested [32]. ...
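The IDEC-style objective these snippets describe can be written compactly. The following is the standard formulation assembled from the quantities defined earlier on this page (the reconstruction loss L_r, the clustering loss L_c, and the encoder/decoder F_ω, G_ω′); the weighting symbol γ follows the usual convention and is assumed here rather than quoted from [37]:

```latex
% Joint objective: reconstruction loss plus weighted clustering loss.
% Keeping L_r (and hence the decoder) preserves local structure and
% prevents the clustering loss from distorting the feature space.
L = L_r + \gamma L_c,
\qquad
L_r = \frac{1}{n}\sum_{i=1}^{n}\lVert x_i - G_{\omega'}(F_{\omega}(x_i))\rVert_2^2,
\qquad
L_c = \mathrm{KL}(P \,\|\, Q) = \sum_i \sum_j p_{ij}\log\frac{p_{ij}}{q_{ij}}.
```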
Semi-supervised clustering typically relies on both labeled and unlabeled data to guide the learning process towards the optimal data partition and to prevent falling into local minima. However, the efforts made to improve existing semi-supervised clustering approaches are relatively scarce compared to the contributions made to enhance state-of-the-art fully unsupervised clustering approaches. In this paper, we propose a novel semi-supervised deep clustering approach, named Soft Constrained Deep Clustering (SC-DEC), that aims to address the limitations exhibited by existing semi-supervised clustering approaches. Specifically, the proposed approach leverages a deep neural network architecture and generates fuzzy membership degrees that better reflect the true partition of the data. In particular, the proposed approach uses side information, formulated as a set of soft pairwise constraints, to supervise the machine learning process. This supervision information is expressed using rather relaxed constraints named "should-link" constraints; such constraints indicate whether a pair of data instances should be assigned to the same cluster or to different clusters. The clustering task is formulated as an optimization problem via the minimization of a novel objective function. Moreover, the proposed approach's performance was assessed via extensive experiments using benchmark datasets. Furthermore, the proposed approach was compared to relevant state-of-the-art clustering algorithms, and the obtained results demonstrate the impact of using minimal prior knowledge about the data in improving the overall clustering performance.