Applied Intelligence

Published by Springer Nature
Online ISSN: 1573-7497
Recent publications
Article
The urgent need for copyright protection in the digital era makes the accuracy of iris-recognition-based authentication a central concern. Achieving high-precision authentication poses substantial challenges for imperceptibility and robustness, in particular because longer iris features used as watermarks reduce imperceptibility. We propose IrisMarkNet, a novel digital watermarking method that embeds the copyright owner’s binary iris features into the cover image using deep neural networks to protect image copyright. It employs a novel pyramid feature fusion (PFF) module based on a multiscale feature fusion strategy to achieve better imperceptibility and enhanced robustness, outperforming watermarking algorithms that rely on single-scale feature fusion. Additionally, for each mini-batch, the noise applied in the noise layer is selected at random for adversarial training, further improving the robustness of the proposed model. We also adopt the Convolutional Block Attention Module (CBAM) of Woo et al. (1), which helps the decoder learn better iris features, and propose a novel authenticator to verify the image copyright owner. Extensive experimental and comparative results demonstrate the superior performance of the proposed scheme over state-of-the-art watermarking algorithms. Under all tested distortions, including JPEG compression, crop attack, Gaussian filtering, salt-and-pepper noise, Gaussian noise, and median filtering, IrisMarkNet achieves improved robustness and imperceptibility together with a high accuracy rate in authenticating digital images.
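For reference, the sketch below shows the standard CBAM formulation cited above (channel attention followed by spatial attention) as introduced by Woo et al.; it is a minimal PyTorch illustration, not IrisMarkNet’s actual decoder, and the reduction ratio and kernel size are the usual defaults rather than values taken from this paper.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM channel attention: shared MLP over average- and max-pooled descriptors."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3)))           # (B, C) from global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))            # (B, C) from global max pooling
        scale = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
        return x * scale

class SpatialAttention(nn.Module):
    """CBAM spatial attention: 7x7 convolution over channel-pooled maps."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)            # (B, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)             # (B, 1, H, W)
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```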
 
Article
Cloud computing is widely used in various fields and can provide sufficient computing resources to address users’ demands (workflows) quickly and effectively. However, resource failure is inevitable, and accounting for fault tolerance is a major challenge when optimizing workflow scheduling. Most previous algorithms rely on failure prediction or fault-tolerant strategies, which can cause time delays and waste resources. In this paper, combining these two approaches within a deep reinforcement learning framework, an adaptive fault-tolerant workflow scheduling framework called RLFTWS is proposed, aiming to minimize both the makespan and the resource usage rate. In this framework, fault-tolerant workflow scheduling is formulated as a Markov decision process, with resubmission and replication strategies as the two actions. A heuristic algorithm is designed for task allocation and execution according to the selected fault-tolerant strategy. A double deep Q-network (DDQN) framework is developed to adaptively select the fault-tolerant strategy for each task under the current environment state; it does not merely predict but also learns while interacting with the environment. Simulation results show that the proposed RLFTWS efficiently balances the makespan and resource usage rate while achieving fault tolerance.
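The abstract does not detail the update rule, but double deep Q-learning itself is standard: the online network selects the next action and the target network evaluates it. A minimal, framework-agnostic sketch of the target computation in NumPy; the two Q-functions here are hypothetical callables standing in for the online and target networks, not part of RLFTWS itself.

```python
import numpy as np

def ddqn_targets(rewards, next_states, dones, q_online, q_target, gamma=0.99):
    """Double DQN targets: the online net picks the action, the target net scores it.

    q_online, q_target: callables mapping a batch of states to an array of shape
    (batch, n_actions) of Q-values (placeholders for the two networks).
    """
    next_q_online = q_online(next_states)                 # (B, A)
    best_actions = np.argmax(next_q_online, axis=1)       # action chosen by the online net
    next_q_target = q_target(next_states)                 # (B, A)
    bootstrap = next_q_target[np.arange(len(best_actions)), best_actions]
    return rewards + gamma * (1.0 - dones) * bootstrap    # zero bootstrap for terminal states
```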
 
Article
Although recent stereo matching methods based on deep learning achieve unprecedented state-of-the-art performance, their accuracy drops drastically in environments whose context differs substantially from that observed at training time. In this paper, we propose a novel Scene-Aware Network (SA-Net) that integrates scene information to achieve cross-domain stereo matching. Specifically, we design a Scene-Aware Module (SAM) to extract rich scene details, which gives the network better generalization ability across domains. To use this rich scene information to guide shallow features during cost aggregation, we introduce a new Multi-element Feature Fusion Strategy (MFFS). Extensive quantitative and qualitative evaluations on different domains show that our SA-Net achieves competitive performance and, in particular, better domain generalization.
 
Article
In unsupervised learning tasks, one of the most significant and challenging aspects is estimating the optimal number of clusters (NC) for a particular set of data. Identifying the NC in a given dataset is an essential criterion of cluster validity in clustering analysis. The purpose of cluster analysis is to group data points with similar characteristics, which helps determine distributions and correlations of patterns in large datasets. Recently, the availability and diversity of vast data have inspired researchers to identify the optimal NC in such data. In this paper, an ensemble approach called the Ensemble Cluster Validity Index (ECVI) is proposed to determine the optimal NC by integrating and optimising several clustering validity indices, namely the Silhouette (Sil) index, the Davies–Bouldin (DB) index, the Calinski–Harabasz (CH) index, and the Gap statistic. The proposed ECVI aims to enhance the selection of the proper NC, which can be used as a measure of a dataset’s partitioning correctness to represent the actual structure of the dataset. The clustering solution (outcome) of the proposed ECVI is used as an input parameter for the k-means clustering algorithm. In other words, the proposed ECVI develops and validates an internal validity method to identify a suitable NC. The experimental comparison with the ground-truth labels for datasets collected from the UCI repository demonstrates that the proposed ECVI outperforms existing indices and produces promising outcomes when identifying the optimal NC in such datasets. The ECVI evaluates the clustering results obtained with a specific algorithm (e.g., k-means or affinity propagation) and identifies the optimal NC for twenty-two UCI datasets. The effectiveness of the proposed ECVI is illustrated by theoretical analysis and then demonstrated by extensive experiments. ECVI was compared to fifteen recently published and state-of-the-art validity indices, including the DB, SIL, CH, Gap, STR, EM with STR, K-means with STR, KL, Hart, Wint, IGP, Dunn, BWC, PBM, and SC indices. The experimental results show that ECVI surpasses all the compared indices in terms of the optimal NC and accuracy rate.
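The ensemble itself is specific to the paper, but the individual indices it combines are standard and available in scikit-learn. A minimal sketch that scores candidate NC values with three of them; the combination rule here, a simple rank average, is an illustrative assumption rather than the paper’s method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (silhouette_score, davies_bouldin_score,
                             calinski_harabasz_score)

def score_candidate_nc(X, nc_range=range(2, 11), random_state=0):
    """Evaluate each candidate number of clusters with three validity indices."""
    rows = []
    for k in nc_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(X)
        rows.append((k,
                     silhouette_score(X, labels),         # higher is better
                     davies_bouldin_score(X, labels),     # lower is better
                     calinski_harabasz_score(X, labels))) # higher is better
    return rows

def pick_nc_by_rank_average(rows):
    """Illustrative ensemble: average the per-index ranks and take the best k."""
    ks = np.array([r[0] for r in rows])
    sil = np.array([r[1] for r in rows])
    db = np.array([r[2] for r in rows])
    ch = np.array([r[3] for r in rows])
    ranks = (np.argsort(np.argsort(-sil)) +   # rank 0 = best silhouette
             np.argsort(np.argsort(db)) +     # rank 0 = best (lowest) DB
             np.argsort(np.argsort(-ch)))     # rank 0 = best CH
    return int(ks[np.argmin(ranks)])
```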
 
Article
Most existing action quality assessment (AQA) methods provide only an overall quality score for the input video and lack an evaluation of each substage of the movement process; thus, these methods cannot provide detailed feedback for users. Moreover, the existing datasets do not provide labels for substage quality assessment. To address these problems, in this work, a new label-reconstruction-based pseudo-subscore learning (PSL) method is proposed for AQA in sporting events. In the proposed method, the overall score of an action is not only regarded as a quality label but also used as a feature of the training set. A label-reconstruction-based learning algorithm is built to generate pseudo-subscore labels for the training set. Moreover, based on the pseudo-subscore labels and overall score labels, a multi-substage AQA model is fine-tuned from the PSL model to predict the action quality score of each substage and the overall score for an athlete. Several ablation experiments are performed to verify the effectiveness of each module. The experimental results show that our approach achieves state-of-the-art performance.
 
Article
Empathy is the ability to spontaneously or purposefully place oneself in another’s situation. Under the continuous effect of empathy, an individual’s preference for things will inevitably be affected by the local and non-local social environment. Inspired by neuropsychology, this paper constructs an extended empathy model to compensate for the shortcomings of previous models in describing the global preference (utility) coupling between individuals, and analyzes how to make efficient decisions based on this model in a large-scale multiagent system. Empathy is abstracted as a random experience process in the form of nonstationary Markov chains, and empathetic utility is defined as the expectation of preference experienced under the corresponding transition probability distribution. By structurally introducing a self-other separation mechanism and an energy attenuation mechanism, the model can exhibit social attributes, including absorbency, inhibition, and anisotropy. An extended iterative candidate elimination (EICE) algorithm is designed for the decision problem defined by the proposed model. This algorithm correlates the error upper bound of the objective function with that of the empathetic utility to perform iterative estimation of the candidate strategies. For a polynomial objective function, EICE under affective empathy can reduce the algorithmic complexity from $O(n^{x})$ to $O(n^{y})$ (1 ≤ y ≤ 2 ≤ x ≤ 3). In terms of application prospects, the model and the corresponding decision algorithm are shown to be suitable not only for human society but also for engineering scenarios such as human-machine interaction and unmanned aerial vehicle (UAV) formation under specific requirements.
 
Article
Bayesian networks (BNs) are one of the most compelling theoretical models for uncertain knowledge representation and inference. However, many domains face the dilemma of insufficient data, and learning BN parameters from raw data alone may lead to low learning accuracy. Therefore, this paper seeks to solve the problem via two novel data extension methods. First, a constraint-based nonparametric bootstrap (CNB) method is proposed, which extends the raw data and guides the parameter distribution of the extended data through a constraint-based sample scoring function. The experimental results on 12 BNs show that the extended data can improve parameter learning accuracy and enhance existing parameter learning approaches. The CNB remains valid for medium and large networks with relatively large data. When the original data are of inferior quality, however, the CNB cannot extend them effectively. Therefore, a constraint-based parametric bootstrap (CPB) method is proposed, which creates a new parameter distribution from the constraints and the original samples. The experimental results for missing data demonstrate that the extended data perform better. The CPB is insensitive to the proportion of missing data and remains superior with relatively large data.
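The constraint-based scoring function is specific to the paper, but the nonparametric bootstrap underlying CNB is standard: draw new samples with replacement from the observed data. A minimal NumPy sketch of that resampling step; the constraint-based scoring and filtering of extended samples is deliberately omitted here.

```python
import numpy as np

def nonparametric_bootstrap(data, n_new, rng=None):
    """Extend a dataset by resampling rows with replacement.

    data:  array of shape (n_samples, n_variables), e.g. discrete BN observations
    n_new: number of extended samples to draw
    """
    rng = np.random.default_rng(rng)
    idx = rng.integers(0, len(data), size=n_new)
    return data[idx]

# Example: extend 50 observations of 4 binary variables to 500 samples.
raw = np.random.default_rng(0).integers(0, 2, size=(50, 4))
extended = nonparametric_bootstrap(raw, n_new=500, rng=1)
print(extended.shape)  # (500, 4)
```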
 
Article
Anomaly detection is a challenging problem in science and engineering that appeals to numerous scholars, and it is of great relevance to detect anomalies and analyze their potential implications. In this study, a multi-level anomaly detection framework with information granules of higher type and higher order is developed based on the principle of justifiable granularity and the Fuzzy C-Means (FCM) clustering algorithm, comprising two different approaches, namely the abstract level approach (ALA) and the detailed level approach (DLA). The ALA is implemented at a comparatively abstract level (viz., level-1), in which two distinct kinds of information granules of order-1 (viz., information granules of type-1 and type-2) are employed for anomaly detection. The DLA is formulated and derived from the ALA at a more detailed level (viz., level-2); it generates more detailed information granules, namely information granules of order-2, by successively splitting information granules with the FCM clustering algorithm to refine the problem at various levels. Furthermore, a similarity measurement algorithm is designed for anomaly detection utilizing information granules of higher type and higher order. Comprehensive performance indexes are produced to quantify the performance of the proposed framework against two single-level approaches and two multi-level approaches. Synthetic data and several real-world datasets from various areas are used to demonstrate the superiority of the proposed approaches over other classical methods in terms of detection accuracy and data anomaly resolution.
 
Article
Privacy Preserving Data Mining (PPDM) is an important research area in data mining that aims to protect privacy during the data mining process so that personal data and sensitive information are not revealed to unauthorized persons. PPDM is a critical task because data often contain sensitive information about individuals that can easily compromise their privacy, such as their financial status, political beliefs, or medical history. In particular, many algorithms have been proposed to hide sensitive frequent itemsets (values that frequently co-occur) in a transaction database. However, a major problem is that those algorithms remove entire transactions (records) instead of single items (values); hence, many changes are made to databases, which leads to the loss of useful information. To avoid making too many changes to a database while still preserving privacy, this paper proposes a novel PPDM algorithm called NSGAII4ID for hiding sensitive frequent itemsets by removing only some items from transactions rather than whole transactions. To the best of our knowledge, this is the first paper exploring this approach. Sanitization is treated as a multi-objective optimization problem whose goal is to reduce four side effects, namely hiding failure, missing cost, artificial cost, and database dissimilarity; however, only three side effects matter in our case, since the artificial cost is always zero. Another key difference is that NSGAII4ID sanitizes a database by performing optimization at two levels. At the transaction level, a multi-objective optimization algorithm is applied to find a subset of candidate transactions to be modified. Then, at the item level, NSGAII4ID searches for an optimal subset of items to be removed from each candidate transaction. We show that this subproblem is exactly the Set Cover Problem (SCP), which we solve using a fast greedy polynomial-time algorithm. We have conducted extensive experiments on four datasets to compare NSGAII4ID with state-of-the-art PPDM algorithms in terms of runtime, memory cost, and total number of items removed during sanitization. The results show that NSGAII4ID achieves a very good balance in minimizing side effects compared with the state-of-the-art algorithms.
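The greedy polynomial-time heuristic for set cover mentioned above is a classical algorithm: repeatedly pick the set that covers the most still-uncovered elements. A minimal sketch; the mapping from items and transactions to sets is the paper’s, so the inputs here are generic placeholders.

```python
def greedy_set_cover(universe, subsets):
    """Classical greedy set-cover heuristic (logarithmic approximation).

    universe: set of elements to cover
    subsets:  dict mapping a set id to the set of elements it covers
    Returns the list of chosen set ids, or None if the universe cannot be covered.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best_id, best_gain = None, 0
        for sid, elems in subsets.items():
            gain = len(elems & uncovered)
            if gain > best_gain:
                best_id, best_gain = sid, gain
        if best_id is None:            # no subset covers any remaining element
            return None
        chosen.append(best_id)
        uncovered -= subsets[best_id]
    return chosen

# Example
cover = greedy_set_cover({1, 2, 3, 4, 5},
                         {"a": {1, 2, 3}, "b": {2, 4}, "c": {4, 5}})
print(cover)  # e.g. ['a', 'c']
```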
 
Illustration of our proposed model. Given an input image I, the retrieval-based method first retrieves similar images with their corresponding descriptions T from the dataset. The region-based visual features V of I are extracted by Faster R-CNN. a) The Cross-modal Feature Distilling module distills V with reference to T at the word level, and distills T with reference to V at the region level. b) The Gated Feature Fusion module dynamically fuses the matched region-word features and suppresses mismatched feature fusion. c) The fused features $\hat{T}$ and $\hat{V}$ are further aggregated and concatenated as the enhanced feature Vs. The decoder takes Vs and the visual relationship feature Vr for decoding
Illustration of the gated feature fusion process. (a) denotes the fusion for the distilled textual features with the original region-based features. (b) denotes the fusion for the distilled visual features with the original word-based features
Sampled test results on the MSCOCO Karpathy test split. Query denotes the input image with the corresponding annotated ground truth on the right. Similar denotes the retrieved image that is most similar both visually and semantically, together with its annotated sentences. The descriptions generated by the baseline and by our model are listed for an intuitive comparison. Red indicates correct content, and green indicates incorrect content that does not match the query image
The wrongly predicted descriptions of the baseline and our model on the MSCOCO Karpathy test split
Article
Image captioning is a cross-modal task that describes an image with natural-language descriptions. Commonly used image captioning methods include generation-based and retrieval-based methods. In this paper, we propose a feature-enhanced image captioning model that consists of three main parts: a cross-modal feature distilling (CFD) module, a gated feature fusion (GFF) module, and a cross-modal decoder. The retrieval-based component first retrieves semantically related similar sentences for each image. CFD coarsely aligns the region-based visual features with the word-based similar sentences. GFF then performs a deeper interaction between the coarsely aligned visual and semantic features through a dynamic gate that controls the fusion level, yielding finely aligned features. We concatenate the two sets of finely aligned features as the enhanced features. Both the visual relationship features and the enhanced features guide the cross-modal decoder to generate the description. Our model achieves CIDEr scores of 131.0 and 68.3 on MSCOCO and Flickr30k, respectively, when compared with different methods. Ablation studies further demonstrate the effectiveness of each component.
 
Article
Recent developments in big data applications have heightened the need for understanding and processing high-dimensional data, where it is necessary to extract features that affect learning performance. Feature selection based on rough set theory is an important preprocessing method that has been widely used in practical applications. Meanwhile, it should be noted that different attributes have different effects on model evaluation. Nevertheless, in rough set models for interval-valued information systems, every feature or attribute is treated as equally important, ignoring the imbalance between features. Moreover, the monotonic classification effect of interval-valued data is easily affected by noise. To address these two issues, we introduce different weights into neighborhood relations and propose a novel feature selection approach based on weighted neighborhood rough sets for interval-valued information systems. First, weighted neighborhood relations and some important properties are proposed by considering different attribute weights in the interval-valued information system. Then, we construct an interval-valued weighted neighborhood rough set (IVWNRS) model to resolve the contradiction between the degree of dependency and the classification ability of the attribute subset. Furthermore, a heuristic algorithm is designed, based on the degree of dependency, to select an attribute subset with both strong correlation and high dependency. Finally, we compare the proposed algorithm with six representative feature selection algorithms on fifteen public datasets. Experimental results on different classifiers show that the IVWNRS algorithm achieves higher classification performance and is significantly effective.
 
Article
Density-based clustering has gained increasing attention during the past decades because it allows the discovery of clusters with arbitrary shapes and is robust to noisy objects. However, existing density-based clustering approaches tend to fail when multiple clusters with different densities lie in a sea of noise. In this paper, we propose a new multi-level clustering method that exploits the dynamic local density with the wavelet transform. Specifically, the concept of the dynamic reverse k-nearest neighbor is first introduced, and its count distribution is modeled as a Poisson distribution. The dynamic local density, which is robust to density variations, is then defined via the cumulative Poisson distribution function. Afterward, a cluster order is constructed based on the derived dynamic local density and finally used to yield the clusters by employing the wavelet transform. Compared to existing approaches, our proposed method can detect clusters with different densities and provides additional clustering information such as the number of clusters, break points between clusters, and cluster boundaries. Extensive experiments on both synthetic and real-world datasets demonstrate that our proposed method is effective and produces better clustering results than many state-of-the-art clustering algorithms.
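The dynamic local density itself is defined in the paper, but its ingredients, reverse k-nearest-neighbor counts and the Poisson cumulative distribution function, are standard. A small sketch computing both with scikit-learn and SciPy; using the mean count as the Poisson rate is an illustrative assumption, not the paper’s choice.

```python
import numpy as np
from scipy.stats import poisson
from sklearn.neighbors import NearestNeighbors

def reverse_knn_counts(X, k=10):
    """For each point, count how many other points include it among their k nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)   # +1 because each point is its own neighbor
    _, idx = nn.kneighbors(X)
    counts = np.zeros(len(X), dtype=int)
    for neighbors in idx:
        for j in neighbors[1:]:                       # skip the point itself
            counts[j] += 1
    return counts

def poisson_cdf_density(counts):
    """Map reverse-kNN counts through a Poisson CDF; higher values indicate denser regions."""
    lam = counts.mean()                               # illustrative choice of the rate parameter
    return poisson.cdf(counts, mu=lam)
```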
 
Article
Outlier detection is an important research direction in data mining, with applications including fraud detection, activity monitoring, medical research, and network intrusion detection. Many outlier detection methods have been proposed; however, most of them are not suitable for complex patterns because they do not extract appropriate neighbor information and cannot estimate the density accurately. Additionally, their performance is not stable and depends heavily on the number of nearest neighbors (k) selected. To overcome these drawbacks, we propose a neighborhood weighted-based outlier detection (NWOD) algorithm that can obtain correct detection results in a variety of situations. In our algorithm, the local density of an object is measured by constructing a weighted nearest neighbor graph and quantifying how difficult it is for the object and its nearest neighbors to reach each other. Furthermore, the proposed neighborhood weighted local outlier factor (NWLOF) compares the differences in the neighborhood weighted local density between a given object and the objects in its neighborhood, so that the degree to which an object is an outlier can be judged: the larger the NWLOF of an object, the more likely it is to be an outlier. In addition, because the proposed algorithm is based on the concept of a natural stable structure, its performance does not rely on the value of k. Experiments conducted on both synthetic and real-world datasets show the superiority of our algorithm.
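NWLOF is the paper’s own factor, but it belongs to the family of density-ratio outlier scores exemplified by the classical Local Outlier Factor, which is available off the shelf. A minimal scikit-learn sketch shown for comparison only; it is not the NWOD algorithm.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),        # dense cluster
               rng.uniform(-8, 8, size=(10, 2))])      # scattered outliers

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)                 # -1 marks detected outliers
scores = -lof.negative_outlier_factor_      # larger score = more outlying
print(np.sum(labels == -1), "points flagged as outliers")
```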
 
Article
This work explores the effectiveness and robustness of quantum computing by conjoining the principles of quantum computing with the conventional computational paradigm for the automatic clustering of colour images. In order to develop such a computationally efficient algorithm, two population-based meta-heuristic algorithms, viz., Particle Swarm Optimization (PSO) algorithm and Enhanced Particle Swarm Optimization (EPSO) algorithm have been consolidated with the quantum computing framework to yield the Quantum Inspired Particle Swarm Optimization (QIPSO) algorithm and the Quantum Inspired Enhanced Particle Swarm Optimization (QIEPSO) algorithm, respectively. This paper also presents a comparison between the proposed quantum inspired algorithms with their corresponding classical counterparts and also with three other evolutionary algorithms, viz., Artificial Bee Colony (ABC), Differential Evolution (DE) and Covariance Matrix Adaption Evolution Strategies (CMA-ES). In this paper, twenty different sized colour images have been used for conducting the experiments. Among these twenty images, ten are Berkeley images and ten are real life colour images. Three cluster validity indices, viz., PBM, CS-Measure (CSM) and Dunn index (DI) have been used as objective functions for measuring the effectiveness of clustering. In addition, in order to improve the performance of the proposed algorithms, some participating parameters have been adjusted using the Sobol’s sensitivity analysis test. Four segmentation evaluation metrics have been used for quantitative evaluation of the proposed algorithms. The effectiveness and efficiency of the proposed quantum inspired algorithms have been established over their conventional counterparts and the three other competitive algorithms with regards to optimal computational time, convergence rate and robustness.
 
Matrix storage in tables
Example feed-forward neural network
Equivalence rule for array query rewriting: “avg(a + b) ≡ avg(a) + avg(b)”
Training (red) and predicted (blue) wind speed values for the linear regression algorithm
Article
This paper provides an in-depth survey of the integration of machine learning and array databases. First, machine learning support in modern database management systems is introduced: from straightforward implementations of linear algebra operations in SQL to the machine learning capabilities of specialized database managers designed to process specific types of data, a number of different approaches are overviewed. Then, the paper covers the database features already implemented in current machine learning systems; features such as rewriting, compression, and caching allow users to implement more efficient machine learning applications. The underlying linear algebra computations in some of the most widely used machine learning algorithms are studied in order to determine which linear algebra operations should be efficiently implemented by array databases. An exhaustive overview of array data and relevant array database managers is also provided. Database features that have proven especially important for efficient execution of machine learning algorithms are analyzed in detail for each relevant array database management system. Finally, the current state of array database capabilities for machine learning is demonstrated through two example implementations in Rasdaman and SciDB.
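As a concrete instance of the rewriting feature surveyed here, the equivalence rule from the figure caption above, avg(a + b) ≡ avg(a) + avg(b), lets a query planner replace one pass over a materialized combined array with two independent aggregates. A NumPy check of the identity, purely illustrative and not using an actual array database:

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.normal(size=(1000, 1000))
b = rng.normal(size=(1000, 1000))

lhs = np.mean(a + b)              # aggregate over the materialized sum
rhs = np.mean(a) + np.mean(b)     # rewritten form: two independent aggregates
assert np.isclose(lhs, rhs)       # holds by linearity of the mean
print(lhs, rhs)
```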
 
Article
With the development of depth sensors and pose estimation algorithms, action recognition based on the human skeleton has attracted wide attention from researchers. Skeleton action recognition methods that embed semantic information achieve excellent performance in terms of computational cost and recognition results by extracting spatio-temporal features of all joints; nevertheless, they cause information redundancy and are limited in extracting long-term contextual spatio-temporal features. In this work, we propose a semantic-guided multi-scale neural network (SGMSN) for skeleton action recognition. For spatial modeling, the key insight of our approach is to achieve multi-scale graph convolution by manipulating the data level (without adding computational cost). For temporal modeling, we build a multi-scale temporal convolutional network with a multi-scale receptive field across the temporal dimension. Experiments have been carried out on two publicly available large-scale skeleton datasets, NTU RGB+D and NTU RGB+D 120. On NTU RGB+D, the accuracy is 90.1% (cross-subject) and 95.8% (cross-view). The experimental results show that the proposed network architecture outperforms most current state-of-the-art action recognition models.
 
Article
X-ray images are essential data sources for checking the condition of the teeth, gums, jaws, and bone structure of the mouth. Tooth recognition is fundamental in image-processing-based diagnoses. In most previous recognition studies, only four-axis-based object-detection models have been considered because they perform normal object detection while the object is resting on a flat surface. However, because the teeth have various orientations, the existing four-axis-based model leads to inaccurate and inefficient recognition results. Thus, in this study, we propose a five-axis-based object-detection model that considers the orientation of the tooth. Based on a tooth-image dataset labeled using the five-axis ground truth, our proposed method processed five-axis annotated data by employing a variant of the faster region-based convolutional neural network. In the experiment, our proposed method outperformed the existing four-axis approach, both qualitatively and quantitatively. The experimental results indicated that the proposed five-axis-based recognition model will be an important basis for a dental-image-based diagnosis.
 
Article
In this paper, we propose a novel video-sequence-based pedestrian re-identification method using the feature distribution similarity measurement between pedestrian video sequences (PRI-FDSM). We use the multiple granularity network combined with generative adversarial skew correction to extract and generate the feature point sets of the corrected pedestrian sequences. Then, we construct the corresponding probability function estimators for each pedestrian sequence using a radial basis function neural network to describe the feature distributions of specific sequences. Finally, we measure the similarity between the feature distributions of sequences to obtain re-identification results. The proposed method uses the distribution similarity measurement of the feature point sets of different sequences to make full use of all the image information of the specific pedestrian in a sequence. Thus, our method can mitigate the problem of insufficient use of the details of some images in a sequence, which commonly occurs in existing fusion feature point measurement methods. Besides, we correct the input skewed pedestrian sequences and achieve posture unification for the input sequences, which effectively mitigates the posture skewing problem of the photographed pedestrians in real-world surveillance scenes. We also build a dataset that more accurately represents the real-world surveillance scenes that contain pedestrian sequences with skewed postures. The results of the ablation experiment on iLIDS-VID and this dataset demonstrate the effectiveness of the proposed distribution-based similarity measurement method. We also compare the performance of the proposed method and several state-of-the-art methods on our dataset. Experimental results show that the indices of our method are all higher than those of the existing methods, and its mAP, Rank-1 and Rank-5 surpass the second best by 3.7%, 1.3% and 1.7% respectively.
 
Article
This paper proposes a prevention method of block withholding attack (PMBWA) based on miners’ mining behavior in blockchain to prevent the block withholding attack. The PMBWA first performs the data pre-processing based on the box chart detection algorithm for data cleaning and preliminary verification. Then the PMBWA uses the behavior reward, punishment mechanism, and credit model to comprehensively evaluate the contribution of miners. The PMBWA proposes a credit level classification algorithm (CLCA) of miners that weighs posterior probability and similarity to detect the malicious miners. Finally, the PMBWA allocates the corresponding income weight for miners of different credit levels. The simulation results show that regardless of how the numbers of blocks and malicious computing power change, the PMBWA can allocate low-income weight to the corresponding malicious computing power, and significantly improve the precision rate and recall rate of malicious computing power detection in the defensive mining pool. The PMBWA can largely reduce the average cumulative income of malicious computing power and improve the average cumulative income of non-malicious computing power. The PMBWA outperforms the state-of-the-art methods such as ICIAS, SRIAS, and IASCM.
 
Article
We present a novel, robust target detection method that locates a target in a reference image (UAV image) according to a target image (remote sensing satellite image). The approach models the nonrigid transformation between the reference image and the target image using sparse-parameterization diffeomorphic matching based on multiscale kernels to complete target detection. Furthermore, a feature point matching scheme fusing intensity and phase information is designed to determine the corresponding keypoints, which solves the cross-view problem. The displacements of the corresponding keypoint sets are then classified into several subsets using a probabilistic mixture model. The sparse-parameterization diffeomorphic matching is executed on these subsets, removing the influence of outliers among the corresponding keypoints, and the subset with the maximum evaluation of the transformation is used to locate the target. Finally, multiscale kernels based on sparse parameterization are integrated into the diffeomorphic matching, solving the large-deformation problems between target and reference images. The proposed approach incorporates the stationary velocity field into the diffeomorphism and uses the Lie group formulation of the stationary velocity to trade off matching accuracy against computational time. On the University-1652 image dataset, which has multi-view and multisource properties, experimental results show that the proposed approach is robust to noise and large deformations.
 
Article
Software repositories are increasingly essential to support the management of the typical artifacts that build up projects, including source code, documentation, and bug reports. GitHub is at the forefront of such platforms, providing developers with a reservoir of code contained in more than 28M repositories. To help developers find the right artifacts, GitHub uses topics, which are short texts assigned to the stored artifacts. However, assigning inappropriate topics to a repository might hamper its popularity and reachability. In our previous work, we implemented MNBN and TopFilter to recommend GitHub topics: MNBN exploits a stochastic network to predict topics, while TopFilter relies on a syntactic-based function to recommend topics. In this paper, we extend our work by building HybridRec, a recommender system based on stochastic and collaborative-filtering techniques to generate more relevant topics. To deal with unbalanced datasets, we employ a Complement Naïve Bayesian Network (CNBN). Furthermore, we apply a preprocessing phase to clean and refine the input data before feeding the recommendation engine. An empirical evaluation demonstrates that HybridRec outperforms three state-of-the-art baselines, obtaining better performance with respect to various metrics. We conclude that the conceived framework can be used to help developers increase their projects’ visibility.
 
Article
Paraphrase generation is one of the long-standing and important tasks in natural language processing. Existing literature has mainly focused on the generation of sentence-level paraphrases, ignoring relationships between sentences such as sentence reordering, sentence splitting, and sentence merging. In this paper, while paying attention to relationships within sentences, we also explore relationships between sentences. For document-level paraphrase generation, we focus on reordering the sentences of a document to enhance inter-sentence diversity. We use an attention-enhanced graph long short-term memory (LSTM) network to encode the relationship graph between sentences, so that each sentence yields a coherent representation that conforms to the context. Based on a sentence-level paraphrase generation model, we constructed a pseudo-document-level paraphrase dataset. The automatic evaluation shows that our model achieves higher semantic relevance and diversity scores than other strong baseline models, and the manual evaluation also confirms its validity. Experiments show that our model retains the semantics of the source document while generating paraphrase documents with high diversity, and that when the sentences are reordered, the output paraphrase documents still preserve inter-sentence coherence with higher scores.
 
Article
In recent years, numerous convolutional neural networks (CNNs) for single image super-resolution (SISR) have shown powerful image reconstruction capability. Accurate and compact networks in particular receive widespread attention because of their superior running speed on resource-limited devices. However, in most lightweight CNN-based methods, limited extraction of informative features results in mediocre reconstructed images. In this paper, we propose a spatial-temporal feature refine network (STFRN) to alleviate this problem by extracting features from diverse dimensions. Specifically, we first introduce the spatial-temporal stage learning (STSL) of our STFRN from two distinct views: 1) for temporal feature extraction, we enhance the relevance between stages and introduce a persistent memory via multi-stage learning, thereby boosting reconstruction capability; 2) for spatial feature extraction, we enlarge the receptive fields by densely deepening the network to capture more contextual features. In the time dimension, we maintain only the main stage, i.e., the last stage, for efficient training. In addition to STSL, we design an effective multi-attention enhanced residual block (MERB), which further refines informative features at the depth level. In detail, by combining multiple attention blocks, we obtain complementary features from diverse views (i.e., point-wise, channel-wise, and spatial dimensions). Compared with ordinary residual blocks, MERB achieves better results while consuming fewer computational resources. As in most of the literature, we also utilize residual learning and dense connections to promote performance. Extensive experiments show that our STFRN, leveraging STSL and MERBs, is superior to other state-of-the-art methods in both quality and quantity.
 
Article
Deep convolutional neural networks (DCNNs) have been widely used in image denoising because of their fast inference and good performance. However, the design of these networks is mostly empirical, and their interpretability and robustness remain major challenges. Inspired by the total generalized variation (TGV) method, this paper proposes a novel adaptive denoising network. It improves the TGV algorithm in two respects: first, the first- and second-order derivative terms are replaced by learnable operators; second, the regularization terms are learned from the training data using convolutional networks rather than being fixed. The network design, derived from solving the denoising problem with the primal-dual hybrid gradient optimization algorithm, is called TGVLNet (Total Generalized Variation-Like Network); it allows the image prior and the linear operators to be tuned differently in each iteration, enhancing the flexibility and generalization ability of the network. Experimental results on Gaussian noise removal and signal-dependent noise removal show that the proposed network has superior performance and generalization. Compared to most blind denoising methods for additive white Gaussian noise, the proposed TGVLNet performs better at unseen noise levels. It is worth noting that we train the model only on synthetic images for signal-dependent noise removal and then use it to denoise images from two real datasets, i.e., NC12 and Nam; the denoising results also present much better visual quality and performance, further verifying the generalization and robustness of our method.
 
An example of SGG. An input image with bounding boxes is shown on the left. The corresponding scene graph is shown on the right. This graph contains the various objects, such as the “woman”, “kite”, and “dog”, that are localized in the image by the colour-coded bounding boxes, and the relationships between those objects, such as “holding”
The overall framework of the TASG model. Given an image, we first apply the Faster R-CNN algorithm to obtain the visual features and locations of object proposals. Our framework also includes three new modules for SGG: (1) an enhanced object detection module with Bi-LSTM for object-to-object information exchange, (2) a new contextual information capture module that uses Transformer layers to compute the context between boundary areas, and (3) an adaptive inference module with a special feature fusion strategy used to handle deviations in the dataset
The architecture of the Transformer-based contextual information extraction module described in Section 3.2, including a detailed illustration of the internal structure of the encoder, which consists of a multihead attention layer, an add&norm layer, a feedforward layer and another add&norm layer
R@100 values of MOTIFS and the proposed TASG method under PredCls for each Top-50 category ranked by frequency
Top 10 relationship retrieval results of the TASG model under the SGCls protocol. Green indicates detected objects or predicates that have been predicted correctly and overlap with the ground truth, blue indicates correct predictions that are not labelled in the ground truth, and red indicates misclassified results
Article
Understanding a visual scene requires not only identifying single objects in isolation but also inferring the relationships and interactions between object pairs. In this study, we propose a novel scene graph generation framework based on Transformer to convert image data into linguistic descriptions characterized as nodes and edges of a graph describing the <subject–predicate–object> information of the given image. The proposed model consists of three components. First, we propose an enhanced object detection module with bidirectional long short-term memory (Bi-LSTM) for object-to-object information exchange to generate the classification probabilities for object bounding boxes and classes. Second, we introduce a novel context information capture module containing Transformer layers that outputs object categories containing object context as well as edge information for specific object pairs with context. Finally, since the relationship frequencies follow a long-tailed distribution, an adaptive inference module with a special feature fusion strategy is designed to soften the distribution and perform adaptive reasoning about relationship classification based on the visual appearance of object pairs. We have conducted detailed experiments on three popular open-source datasets, namely, Visual Genome, OpenImages, and Visual Relationship Detection, and have performed ablation experiments on each module, demonstrating significant improvements under different settings and in terms of various metrics.
 
Article
In this paper, we study how to denoise medical ultrasound images and improve instance segmentation performance using deep learning. Since medical ultrasound images usually contain a lot of noise, we first propose a novel unsupervised learning approach called Dual Image (DI) for denoising medical ultrasound images. DI has three main features. First, unlike many existing supervised denoising methods, it does not need clean medical ultrasound images for denoising; instead, it uses Computed Tomography (CT) images and noise patches extracted from medical ultrasound images. Second, to effectively select noise patches from medical ultrasound images, a patch selection algorithm based on entropy is formulated. Third, to minimize structural variation in the denoised images, a new reconstruction block is designed to combine the structural information from the structural enhancement block. After denoising, since medical ultrasound images are usually feature-poor, we further extend SOLOv2 to Segmenting on Ultrasound Image (SOUI) by proposing the Double Feature Pyramid Network (D-FPN) and a mask fusion branch to strengthen the communication and fusion of different feature layers. Extensive experiments have been performed to study the performance of DI and SOUI on practical medical ultrasound images. We demonstrate that DI greatly improves the quality of medical ultrasound images and minimizes structural variation in the denoised images. SOUI achieves 53.7% AP (Average Precision) on practical medical ultrasound images and outperforms most state-of-the-art instance segmentation methods, including SOLOv2. Code is available at: https://github.com/ztt0821/SOUI.
 
Article
Genome editing is a modern technology in the service of mankind; its main idea is derived from RNA-mediated nucleases, namely the natural CRISPR/Cas9 process of the bacterial genome. In this paper, we develop an algorithm, CMT-MARL, for finding multiple editable target sites in similar sequences. Among different types of genes there are many common regions that are important for protein production or other biological functions in organisms. Tracing multiple target sites matters for gene duplication, gene fusion, finding mutations in co-expressed genes, and transcripts from genes. The complexity of finding common editable targets in similar sequences with a brute-force method is O(lⁿ), where l is the genome sequence length and n is the number of sequences; as n grows, the problem becomes computationally infeasible. We apply a reinforcement learning algorithm using eligibility traces and the Monte Carlo method to tackle this problem; the time complexity of CMT-MARL is O(nl²). We compare our result set with the existing algorithm CRISPR-MultiTargeter [1] (http://www.multicrispr.net/) with respect to the goodness of editing, using data from Ensembl BioMart (http://www.ensembl.org), and run our methodology on mouse, rat, zebrafish, chicken, and human genes. Finally, we locate the optimal regions for editing diseased or duplicated genes according to our hybrid scoring mechanism incorporating all types of biological factors.
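To make the target-site problem concrete, the sketch below enumerates candidate 20-nt sites (a typical Cas9 protospacer length) shared verbatim by all input sequences; it is a naive illustration of the search space only, not the CMT-MARL algorithm or its hybrid scoring.

```python
def shared_target_sites(sequences, site_len=20):
    """Return the set of length-`site_len` substrings present in every sequence."""
    def kmers(seq):
        return {seq[i:i + site_len] for i in range(len(seq) - site_len + 1)}

    common = kmers(sequences[0])
    for seq in sequences[1:]:
        common &= kmers(seq)
        if not common:
            break
    return common

# Toy example with a shared 20-nt region embedded in three sequences.
core = "ACGTACGTGGCCTTAAGGCC"
seqs = ["TTT" + core + "AAA", "GG" + core + "CCCC", core + "ATAT"]
print(shared_target_sites(seqs))
```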
 
Article
Nonnegative matrix factorization (NMF) is a novel paradigm for feature representation and dimensionality reduction. However, the performance of the NMF model is affected by two critical and challenging problems. One is that the original NMF does not consider the distribution information of data and parameters, resulting in inaccurate representations. The other is the high computational complexity in online processing. Bayesian approaches are proposed to address the former problem of NMF. However, most existing Bayesian-based NMF models utilize an exponential prior, which only guarantees the nonnegativity of parameters without fully considering the prior information of the parameters. Thus, a new Bayesian-based NMF model is constructed based on the Gaussian likelihood and a truncated Gaussian prior, called the truncated Gaussian-based NMF (TG-NMF) model, in which a truncated Gaussian prior can prevent overfitting while ensuring nonnegativity. Furthermore, Bayesian inference-based incremental learning is introduced to reduce the high computational complexity of TG-NMF; this model is called TG-INMF. We adopt variational Bayesian to estimate all parameters of TG-NMF and TG-INMF. Experiments on genetic data-based tumor recognition demonstrate that our models are competitive with other existing methods for classification problems.
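The truncated-Gaussian Bayesian treatment and its incremental variant are the paper’s contributions; the baseline they build on, plain NMF as a nonnegative low-rank factorization, is available in scikit-learn. A minimal sketch of that baseline, with an arbitrarily chosen factorization rank:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(100, 500)))    # nonnegative data, e.g. a gene-expression-like matrix

model = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(X)                 # sample representations (100 x 10)
H = model.components_                      # basis/features (10 x 500)
print("reconstruction error:", model.reconstruction_err_)
```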
 
Article
During graduate education, postgraduates have to spend considerable time finding papers to explore the development branches of their field. However, existing paper recommendation methods focus on only a few attributes (title, author, keyword, venue, etc.). The network schema constructed from these attributes is extremely sparse, which easily causes the loss of important semantic paths between attributes; this results in a lack of correlations among relevant papers and hurts recommendation efficiency. Moreover, the relationships between multiple semantic paths can be found through common homogeneous and heterogeneous attributes, and these relationships can establish many correlations among relevant papers. To address the above problems, this paper proposes a new approach that fuses multiple semantic paths into a heterogeneous educational network (HEN) for personalized paper recommendation. After data processing, a new HEN schema is built by enriching the nodes and edges of heterogeneous networks. Then, different semantic meta-paths are generated by projection onto sub-nets. Next, a new HEN embedding method based on multi-semantic path fusion is proposed to generate rich HEN node sequences. Finally, personalized paper recommendation for postgraduates is performed based on targeted path similarity. The proposed method was evaluated on two paper datasets, covering educational intergenerational mobility from 1987 to 2021 and data mining and intelligent media from 1997 to 2021. Substantial experiments demonstrate that the proposed approach is effective.
 
Article
During recent decades, multi-objective optimization has attracted extensive attention, and a variety of related algorithms have been proposed. This paper proposes a hybrid multi-objective optimization algorithm based on angle competition and a neighborhood protection mechanism (HCPMOEA). First, an environmental selection strategy based on neighborhood protection is introduced to make a good compromise between optimization performance and time consumption. Then, the difference between the genetic algorithm and differential evolution is analyzed from the perspective of offspring distribution, and a hybrid operator is proposed to achieve a good balance between exploration and exploitation. In addition, an elite set is employed to improve the chances that superior solutions generate offspring, and an angle competition strategy is adopted to realize optimized matching of parents, thereby improving offspring quality. The performance of HCPMOEA is demonstrated by comparison with 13 classic or state-of-the-art algorithms on 19 standard benchmarks, and the results show competitive advantages in effectiveness and efficiency. The practicality of the proposed HCPMOEA is further verified on two real-world instances. These results establish the superiority of the proposed HCPMOEA in solving bi-objective and tri-objective problems.
 
The credibility distribution of the triangular fuzzy variable $\tilde{\gamma}$
Article
This paper aims to offer a group decision making (GDM) method based on intuitionistic triangular fuzzy information. Toward this end, a new ranking method for intuitionistic triangular fuzzy numbers (ITFNs) is first introduced based on credibility measure theory, which provides a total order on ITFNs. Considering that interactions may exist among the decision makers (DMs) or the criteria, the 2-additive Shapley intuitionistic triangular fuzzy aggregation (2ASITFA) operator is further proposed. Based on the Wasserstein distance, a new distance measure for intuitionistic triangular fuzzy sets (ITFSs) is defined. Moreover, under partial weak-order information on the importance of and interaction among the DMs and the criteria, programming models are constructed to obtain the optimal 2-additive measures of the DMs and the criteria, respectively. Then, an intuitionistic triangular fuzzy multi-criteria GDM (ITFMCGDM) method is developed. Finally, a practical example of evaluating the service quality of medical and nursing institutions (MNIs) illustrates the application of the proposed method.
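For context, the figure caption above refers to the credibility distribution of a triangular fuzzy variable. In standard credibility theory (Liu), for a triangular fuzzy variable $\tilde{\gamma} = (a, b, c)$ this distribution has the piecewise-linear form below; it is quoted from the general theory, not from this paper's own definitions:

$$
\Phi(x) \;=\; \mathrm{Cr}\{\tilde{\gamma} \le x\} \;=\;
\begin{cases}
0, & x \le a,\\[2pt]
\dfrac{x-a}{2(b-a)}, & a \le x \le b,\\[6pt]
\dfrac{x+c-2b}{2(c-b)}, & b \le x \le c,\\[6pt]
1, & x \ge c.
\end{cases}
$$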
 
Article
An important but very time-consuming part of the research process is the literature review. An already large and nevertheless growing body of publications, as well as a steadily increasing publication rate, continues to worsen the situation. Consequently, automating this task as far as possible is desirable. Experimental results of systems are key insights of high importance during literature review and are usually presented in the form of tables. Our pipeline KIETA exploits these tables to contribute to this automation effort by extracting them and their contained knowledge from scientific publications. The pipeline is split into multiple steps to guarantee modularity and analyzability, and it remains agnostic to the specific scientific domain up until the knowledge extraction step, which is based upon an ontology. Additionally, a dataset of corresponding articles has been manually annotated with information regarding table and knowledge extraction. Experiments show promising results that signal the possibility of an automated system, while also indicating the limits of extracting knowledge from tables without any context.
 
Article
With the advent of transformers and attention mechanisms, advancements in Natural Language Processing (NLP) have been manifold. However, these models possess huge complexity and enormous computational overhead, and their performance relies on the feature representation strategy used to encode the input text. To address these issues, we propose a novel transformer encoder architecture with a Selective Learn-Forget Network (SLFN) and contextualized word representations enhanced through Parts-of-Speech Characteristics Embedding (PSCE). The novel SLFN selectively retains significant information in the text through a gated mechanism; it enables parallel processing, captures long-range dependencies, and simultaneously increases the transformer’s efficiency when processing long sequences. The intuitive PSCE deals with polysemy, distinguishes word inflections based on context, and effectively captures the syntactic as well as semantic information in the text. The single-block architecture is extremely efficient, with 96.1% fewer parameters than BERT. The proposed architecture yields 6.8% higher accuracy than the vanilla transformer architecture and appreciable improvement over various state-of-the-art models for sentiment analysis on three datasets from diverse domains.
 
Article
Deep neural networks are known to be vulnerable to malicious perturbations. Current methods for improving adversarial robustness make use of either implicit or explicit regularization, with the latter usually based on adversarial training. Randomized smoothing, the averaging of the classifier outputs over a random distribution centered on the sample, has been shown to guarantee a classifier’s performance subject to bounded perturbations of the input. In this work, we study the application of randomized smoothing to improve performance on unperturbed data and increase robustness to adversarial attacks. We propose to combine smoothing with adversarial training and randomization approaches, and find that doing so significantly improves resilience compared to the baseline. We examine our method’s performance on common white-box (FGSM, PGD) and black-box (transferable attack and NAttack) attacks on CIFAR-10 and CIFAR-100, and determine that for a low number of iterations, smoothing provides a significant performance boost that persists even for perturbations with a high attack norm 𝜖. For example, under a PGD-10 attack on CIFAR-10 using Wide-ResNet28-4, we achieve 60.3% accuracy for infinity norm $\epsilon_{\infty}=8/255$ and 13.1% accuracy for $\epsilon_{\infty}=35/255$, outperforming previous art by 3% and 6%, respectively. We achieve nearly twice the accuracy at $\epsilon_{\infty}=35/255$ and even more for perturbations with higher infinity norm. An implementation of the proposed method is provided at https://github.com/yanemcovsky/SIAM.
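Randomized smoothing itself, independently of the combination studied in this paper, has a simple prediction rule: average the classifier’s outputs over Gaussian-perturbed copies of the input and take the majority class. A minimal NumPy sketch, where classify is a hypothetical callable standing in for the trained base network:

```python
import numpy as np

def smoothed_predict(x, classify, n_classes, sigma=0.25, n_samples=100, rng=None):
    """Predict with a randomly smoothed classifier.

    classify: callable mapping a batch of inputs to integer class labels
              (placeholder for the trained base network).
    """
    rng = np.random.default_rng(rng)
    noisy = x[None, ...] + sigma * rng.standard_normal((n_samples,) + x.shape)
    labels = classify(noisy)                          # (n_samples,) predicted classes
    counts = np.bincount(labels, minlength=n_classes)
    return int(np.argmax(counts))                     # majority vote over noisy copies
```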
 
Figures (captions only): comparisons of all methods against each other per evaluation criterion using the Nemenyi test; sensitivity analyses of parameters λA, λS and λβ; convergence of MV2ML on the Emotions and Yeast datasets.
Article
Multi-view multi-label learning (MVML) is an important paradigm in machine learning, where each instance is represented by several heterogeneous views and associated with a set of class labels. However, label incompleteness and the neglect of both the relationships among views and the correlations among labels cause performance degradation in MVML algorithms. Accordingly, a novel method, label recovery and label correlation co-learning for Multi-View Multi-Label classification with incoMplete Labels (MV2ML), is proposed in this paper. First, a label correlation-guided, kernel-based binary classifier is constructed for each label. Then, we adopt a multi-kernel fusion method to effectively fuse the multi-view data by utilizing the individual and complementary information among multiple views and distinguishing the contribution of each view. Finally, we propose a collaborative learning strategy that simultaneously considers the exploitation of asymmetric label correlations, the fusion of multi-view data, the recovery of the incomplete label matrix and the construction of the classification model. In this way, the recovery of the incomplete label matrix and the learning of label correlations interact and boost each other to guide the training of the classifiers. Extensive experimental results demonstrate that MV2ML achieves highly competitive classification performance against state-of-the-art approaches on various real-world multi-view multi-label datasets in terms of six evaluation criteria.
 
Article
We propose a framework for the assessment of uncertainty quantification in deep regression. The framework is based on regression problems in which the regression function is a linear combination of nonlinear functions. Essentially, any level of complexity can be realized through the choice of the nonlinear functions and the dimensionality of their domain. Results of an uncertainty quantification for deep regression are compared against those obtained by a statistical reference method. The reference method utilizes knowledge about the underlying nonlinear functions and is based on Bayesian linear regression using a reference prior. The flexibility, together with the availability of a reference solution, makes the framework suitable for defining benchmark sets for uncertainty quantification. Reliability of uncertainty quantification is assessed in terms of coverage probabilities, and accuracy in terms of the size of the calculated uncertainties. We illustrate the proposed framework by applying it to current approaches for uncertainty quantification in deep regression. In addition, results for three real-world regression tasks are presented.
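A small NumPy sketch of the kind of statistical reference the framework relies on: Bayesian linear regression over known nonlinear basis functions, here with a simple Gaussian prior rather than the reference prior used in the paper (the basis choice, prior scale and noise level are illustrative assumptions):

import numpy as np

basis = [np.sin, np.cos, lambda x: x ** 2]  # known nonlinear functions (illustrative)

def posterior(X, y, alpha=1e-2, noise_var=0.1):
    # Posterior mean and covariance of the linear-combination weights.
    Phi = np.column_stack([f(X) for f in basis])
    S = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + Phi.T @ Phi / noise_var)
    m = S @ Phi.T @ y / noise_var
    return m, S

def predictive_interval(X, m, S, noise_var=0.1, z=1.96):
    # 95% predictive interval, against which the coverage of a UQ method is checked.
    Phi = np.column_stack([f(X) for f in basis])
    mean = Phi @ m
    var = noise_var + np.sum((Phi @ S) * Phi, axis=1)
    return mean - z * np.sqrt(var), mean + z * np.sqrt(var)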
 
Figures (captions only): the alternating optimization for (5); decision boundaries of SVM (red) and the element-wise kernel learning framework (black) with a Gaussian kernel for μ = 10 and μ = 0.1; performance variation with respect to λ on the Fertility, Sonar, Ionosphere, Monks1, Monks2 and Monks3 datasets; objective-function values for solving Z, G and α versus iterations on the Ionosphere and Sonar datasets.
Article
An effective kernel learning framework is a fundamental issue that has attracted considerable attention during the past decade. However, existing multiple kernel learning algorithms follow the assumption that the optimal kernel is a weighted combination of pre-specified kernels, leading to limited kernel representations and insufficient flexibility. Moreover, data-dependent kernel learning approaches explore a flexible kernel matrix only in the neighborhood of a fixed initial kernel matrix, restricting the kernel search space. To address these limitations, we propose element-wise kernel learning via the connection between representative kernel learning and parameter-free kernel learning. A data-adaptive kernel matrix without any specific formulation is imposed on the representative kernels in an element-wise manner. To diminish the adverse effect of correlated information among pre-specified base kernels, representative kernels are diversely determined by a kernel selection process. Extensive experiments on benchmark and real-world datasets indicate that our proposed framework achieves superior performance against well-known kernel-based algorithms.
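A rough NumPy sketch of the element-wise idea (a Hadamard adaptation of a representative Gaussian kernel; keeping the modifier PSD via a low-rank factor relies on the Schur product theorem and is an illustrative choice, not the paper's optimization procedure):

import numpy as np
from scipy.spatial.distance import cdist

def gaussian_kernel(X, mu=1.0):
    # Representative (pre-specified) base kernel.
    return np.exp(-mu * cdist(X, X, "sqeuclidean"))

def elementwise_kernel(X, A, mu=1.0):
    # Element-wise (Hadamard) adaptation of the base kernel.  Z = A A^T is
    # PSD, so by the Schur product theorem K * Z remains a valid kernel.
    K = gaussian_kernel(X, mu)
    Z = A @ A.T
    return K * Z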
 
Article
Anomalies indicate impending failures in expensive industrial devices. Manufacturers of such devices and plant managers depend heavily on anomaly detection algorithms for monitoring and predictive maintenance activities. Since false alarms directly impact an industrial manufacturer’s revenue, it is crucial to reduce the number of false alarms generated by anomaly detection algorithms. In this paper, we propose multiple solutions to address this ongoing problem in industry. The proposed unsupervised solution, the Multi-Generations Tree (MGTree) algorithm, not only reduces false positive alarms but is also equally effective on small and large datasets. MGTree has been applied to multiple industrial datasets, such as Yahoo, AWS, GE, and machine sensor data, for evaluation purposes. Our empirical evaluation shows that MGTree compares favorably with Isolation Forest (iForest), One-Class Support Vector Machine (OCSVM), and Elliptic Envelope in terms of the correctness of anomaly identification (true positives and false positives). We also propose a time series prediction algorithm, Weighted Time-Window Moving Estimation (WTM), which does not rely on the dataset’s stationarity and is evaluated on multiple time series datasets. The hybrid combination of WTM and MGTree, Uni-variate Multi-Generations Tree (UVMGTree), works very well for anomaly identification in time series datasets and outperforms OCSVM, iForest, SARIMA, and Elliptic Envelope. Our approach can have a profound impact on the predictive maintenance and health monitoring of industrial systems across domains, where operations teams can save significant time and effort in handling false alarms.
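The baselines named above are available in scikit-learn, so a tiny comparison harness of the kind used for evaluation can be sketched as follows (the synthetic data and contamination level are illustrative assumptions; MGTree itself is not reproduced here):

import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM
from sklearn.covariance import EllipticEnvelope

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                          # nominal sensor readings
X_test = np.vstack([rng.normal(size=(95, 2)),          # mostly normal ...
                    rng.normal(loc=6.0, size=(5, 2))]) # ... plus 5 anomalies

detectors = {
    "iForest": IsolationForest(contamination=0.05, random_state=0),
    "OCSVM": OneClassSVM(nu=0.05),
    "EllipticEnvelope": EllipticEnvelope(contamination=0.05),
}
for name, det in detectors.items():
    pred = det.fit(X).predict(X_test)                  # +1 = normal, -1 = anomaly
    print(name, "flagged", int((pred == -1).sum()), "points")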
 
Article
Convolutional neural networks have been widely used in various application scenarios. To extend their application to areas where accuracy is critical, researchers have investigated methods to improve accuracy using deeper or broader network structures, which creates exponential growth in computation and storage costs as well as delays in response time. In this paper, we propose a self-distillation image classification algorithm that significantly improves performance while decreasing training costs. In traditional self-distillation, the student model needs to improve its ability to acquire global information and focus on key features due to the lack of guidance from a teacher model. For this reason, we improve the traditional self-distillation algorithm by using a positional attention module and a residual block with attention. Experimental results show that the method achieves better performance compared with traditional knowledge distillation methods and attention networks.
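A minimal PyTorch sketch of a generic self-distillation objective of the kind described above (cross-entropy on the labels plus a softened KL term from the network's own deepest head; the temperature and weighting are illustrative assumptions, and the attention modules are omitted):

import torch
import torch.nn.functional as F

def self_distillation_loss(shallow_logits, deep_logits, targets, T=3.0, alpha=0.5):
    # A shallow auxiliary head is supervised by the labels and by the softened
    # output of the deepest head, with no external teacher network.
    ce = F.cross_entropy(shallow_logits, targets)
    kl = F.kl_div(F.log_softmax(shallow_logits / T, dim=1),
                  F.softmax(deep_logits.detach() / T, dim=1),
                  reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kl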
 
Article
Authentication systems have evolved immensely ever since the use of biometrics in the field of security has started. Nonetheless, extensive usage of biometric systems has also resulted in growing fear of identity theft as storing biometric user templates in databases opens up severe security concerns. Consequently, the scope of biometrics invariably depends on the ability of the system to manifest security and robustness against biometric identity theft along with achieving acceptable recognition performance. This paper proposes a user template protection technique to secure a fingerprint based user template used in a biometric authentication system. It is based on the fingerprint shell, which computes alignment-free, singular point independent, and non-invertible user templates. The secure user template generated by the proposed technique comprises fingerprint shell curves computed using the transformed pair-polar structure of the minutia points. The proposed technique has been evaluated on five fingerprint databases of Fingerprint Verification Competitions, namely, FVC2002, FVC2004, and FVC2006. The effectiveness of the technique is analyzed in terms of revocability, diversity, security, and recognition performance. The comparison of results with that of the existing techniques demonstrates the robustness and efficacy of the proposed technique.
 
Article
The use of CMOS technology to generate neural session keys for integration with the Internet of Things (IoT) to improve security is presented in this research. Emerging technology advancements in the IoT era have exacerbated energy-efficiency and security difficulties, and current security solutions do not effectively address the security of the IoT. For IoT integration, a small-logic-area ASIC design of a re-keying enabled Triple Layer Vector-Valued Neural Network (TLVVNN) is presented, utilizing CMOS designs at the 65 and 130 nanometer nodes. There has been little study of optimizing the neural weights for faster neural synchronization; here, the Harris Hawks optimization algorithm is used to optimize the neural network’s weight vector for faster coordination. Once this process is completed, the synchronized weights become the session key. This method offers several advantages: (1) the session key is produced by mutual neural synchronization over a public channel; (2) it facilitates Hawks-based neural weight vector optimization for faster neural synchronization across public channels; (3) according to behavioral modeling, the synchronization duration can be reduced from 1.25 ms to less than 0.7 ms for a 20% weight imbalance in the re-keying phase; (4) geometric, brute-force, and majority attacks are all prevented. Experiments to validate the suggested method’s functionality are carried out, and the results show that the proposed approach outperforms existing similar techniques in terms of efficiency.
 
Article
In recent years, multi-agent deep reinforcement learning algorithms have developed rapidly, with value-based methods playing an important role (e.g., Monotonic Value Function Factorisation (QMIX) and Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning (QTRAN)). Nevertheless, the performance of current value-based multi-agent algorithms in complex scenes can still be improved. In value-function-based models, a mixing network is usually used to mix the local action values of each agent into a joint action value, but partial observability causes misalignment and unsatisfactory mixing results. This paper proposes a multi-agent model, called Transform Networks, that transforms the individual local action-value function obtained by each agent network into an individual global action-value function. This avoids the misalignment caused by partial observability when the individual action values are mixed, so that the joint action value represents the cooperative behaviour of all agents well. Using the StarCraft Multi-Agent Challenge (SMAC) as the experimental platform, a comparison of algorithm performance on five different maps shows that the proposed method outperforms the current state-of-the-art baseline algorithms.
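A toy PyTorch sketch of the local-to-global transformation idea (one state-conditioned affine transform per agent followed by a sum; the exact network used in the paper differs, and all layer shapes here are assumptions):

import torch
import torch.nn as nn

class TransformMixer(nn.Module):
    # Each agent's local Q-value is transformed into a global-view Q-value
    # using the global state, then the transformed values are summed into the
    # joint action value.
    def __init__(self, n_agents: int, state_dim: int):
        super().__init__()
        self.scale = nn.Linear(state_dim, n_agents)
        self.shift = nn.Linear(state_dim, n_agents)

    def forward(self, local_q, state):
        # local_q: (batch, n_agents), state: (batch, state_dim)
        global_q = torch.abs(self.scale(state)) * local_q + self.shift(state)
        return global_q.sum(dim=1, keepdim=True)  # joint action value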
 
Article
Temporal knowledge graphs (TKGs) have become an effective tool for numerous intelligent applications. Due to their incompleteness, TKG embedding methods have been proposed to infer the missing temporal facts, and work by learning latent representations for entities, relations and timestamps. However, these methods primarily focus on measuring the plausibility of the whole temporal fact, and ignore the semantic property that there exists a bias between any relation and its involved entities at various time steps. In this paper, we present a novel temporal knowledge graph completion framework, which imposes relational constraints to preserve the semantic property implied in TKGs. Specifically, we borrow ideas from two well-known transformation functions, i.e., tensor decomposition and hyperplane projection, and design relational constraints associated with timestamps. We then adopt suitable regularization schemes to accommodate specific relational constraints, which combat overfitting and enforce temporal smoothness. Experimental studies indicate the superiority of our proposal compared to existing baselines on the task of temporal knowledge graph completion.
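As a small illustration of the temporal-smoothness regularization mentioned above (a squared-difference penalty on adjacent timestamp embeddings; the weighting and the assumption that timestamps are ordered rows of one tensor are illustrative):

import torch

def temporal_smoothness(time_emb: torch.Tensor, weight: float = 1e-2) -> torch.Tensor:
    # Penalize large jumps between embeddings of adjacent timestamps so that
    # the learned relational biases change gradually over time.
    diffs = time_emb[1:] - time_emb[:-1]  # (T-1, dim)
    return weight * (diffs ** 2).sum()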
 
Figures (captions only): challenging crowd-counting scenarios (scale variation, occlusion, perspective distortion); scenes containing both sparse (red boxes) and crowded (yellow boxes) regions; the overall MACC pipeline with its three tasks (density-level classification, density map estimation, segmentation-guided attention); the detailed Scale Module architecture; density and segmentation map visualizations for ShanghaiTech Part A and HaCrowd images (input, ground truth, predicted density map, predicted segmentation map).
Article
Crowd counting and crowd density map estimation face several challenges, including occlusions, non-uniform density, and intra-scene scale and perspective variations. Significant progress has been made in crowd counting approaches in recent years, especially with the emergence of deep learning and massive crowd datasets. The purpose of this work is to address the problem of crowd density estimation in both sparse and crowded situations. In this paper, we propose a multi-task attention based crowd counting network (MACC Net), which consists of three contributions: 1) density level classification, which offers global contextual information for the density estimation network; 2) density map estimation; and 3) segmentation-guided attention to filter out background noise from the foreground features. The proposed MACC Net is evaluated on four popular datasets, including ShanghaiTech, UCF-CC-50, UCF-QNRF, and the recently launched HaCrowd dataset. MACC Net achieves state-of-the-art estimation performance on HaCrowd and UCF-CC-50, and obtains competitive results on the others.
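A compact PyTorch sketch of the segmentation-guided attention idea in task 3 (a predicted foreground probability map masks the features before density regression; the 1x1 convolution head is an illustrative assumption):

import torch
import torch.nn as nn

class SegmentationGuidedAttention(nn.Module):
    # Predict a 1-channel foreground probability map from the features and use
    # it to suppress background activations before density estimation.
    def __init__(self, channels: int):
        super().__init__()
        self.seg_head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        mask = torch.sigmoid(self.seg_head(feats))  # (B, 1, H, W) foreground prob.
        return feats * mask, mask                   # masked features + seg map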
 
Article
Striking a balance between convergence and diversity matters considerably for evolutionary algorithms in solving many-objective optimization problems. The performance of these algorithms depends on their capability of obtaining a set of uniformly distributed solutions as close to the Pareto optimal front as possible. However, most existing evolutionary algorithms encounter challenges in solving many-objective optimization problems. Thus, in this paper, an adaptive many-objective evolutionary algorithm with coordinated selection strategies, labeled ACS-MOEA, is proposed to balance the convergence and diversity. The coordinated selection strategies include three selection strategies, i.e., the selection based on shifted-dominated distance, the selection based on objective vector angle, and the selection based on Non-Euclidean geometry distance. The first is used in the mating selection process to select high-quality parents for the generation of good offspring. Both the second and the third selection strategies are employed in the environmental selection process to delete poor solutions one by one for preserving the elitist solutions of the next generation. The performance of ACS-MOEA is verified by comparing it with six state-of-the-art algorithms on several well-known benchmark test suites with up to 10 objectives. Experimental results have fully demonstrated the competitiveness of ACS-MOEA in balancing convergence and diversity. Moreover, the proposed ACS-MOEA has also been verified to be effective in solving constrained many-objective optimization problems.
 
Article
Most many-objective optimization algorithms focus on balancing convergence and diversity rather than on the contribution of boundary solutions. Boundary solutions are beneficial for enhancing PF coverage; therefore, we propose a many-objective evolutionary algorithm based on corner solutions and cosine distance (MaOEA-CSCD) to balance convergence and diversity while protecting the PF boundary. We maintain a corner solution archive and apply these corner solutions and the cosine distance in the mating strategy to improve the quality of the parents and thus generate high-quality offspring. In environmental selection, a greedy strategy is applied to select corner solutions and solutions with better convergence to overcome insufficient selection pressure while protecting the PF boundary and maintaining the search space. Then, a selection-deletion strategy is used to balance convergence and diversity: it first selects solutions based on the maximum cosine distance and then considers replacement solutions based on convergence. A comparison of MaOEA-CSCD with six algorithms on 25 benchmark and three real-world optimization problems shows that it is competitive.
 
Article
Incremental language learning, which involves retrieving pseudo-data from previous tasks, can alleviate catastrophic forgetting. However, previous methods require a large amount of pseudo-data to approach the performance of multitask learning, and the performance decreases dramatically when there is significantly less pseudo-data than new task data. This decrease occurs because the pseudo-data are learned inefficiently and deviate from the real data. To address these issues, we propose reminding the incremental language model via data-free self-distillation (DFSD), which includes 1) self-distillation based on the Earth mover’s distance (SD-EMD) and 2) hidden data augmentation (HDA). SD-EMD can increase the efficiency of the model by adaptively estimating the knowledge distribution in all GPT-2 layers and effectively transferring data from the teacher model to the student model via adaptive self-multilayer-to-multilayer mapping. HDA can reduce deviations by decomposing the generation process via data augmentation and bootstrapping. Our experiments on decaNLP and text classification tasks with low pseudo-data sampling ratios reveal that the DFSD model outperforms previous state-of-the-art incremental methods. The advantages of DFSD become more apparent when there is less pseudo-data and larger deviations.
 
Article
Zero-shot learning (ZSL) aims to classify samples of unseen categories for which no training data is available. At present, the VAEGAN framework which combines Generative Adversarial Networks (GAN) with Variational Auto-Encoder (VAE) has achieved good performance in zero-shot image classification. Based on the VAEGAN, we propose a new zero-shot image classification method named Enhanced VAEGAN (E-VAEGAN). Firstly, we design a feature alignment module to align visual features and attribute features. Then, the aligned features are fused with the hidden layer features of the encoder to improve output features of the encoder. Secondly, the triplet loss is applied during the encoder training, which further increases the discriminability of features. Finally, the hidden layer features of the discriminator are input into a transform module and then fed back to the generator, which improves the quality of the generated fake samples. The originality of this paper is that we design a new E-VAEGAN which employs the feature alignment module, triplet loss and transform module to reduce the ambiguity between categories and make the generated fake features similar to the real features. Experiments show that our method outperforms the compared methods on five zero-shot learning benchmarks.
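The triplet loss applied during encoder training is a standard component; a minimal PyTorch usage sketch (the feature dimension and margin are illustrative assumptions):

import torch
import torch.nn as nn

# Anchor features are pulled toward a same-class sample and pushed away from a
# different-class sample, sharpening class boundaries in the feature space.
triplet = nn.TripletMarginLoss(margin=1.0)
anchor, positive, negative = (torch.randn(8, 128) for _ in range(3))
loss = triplet(anchor, positive, negative)
print(loss.item())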
 
Article
The success of crowdsourcing projects relies critically on motivating a crowd to contribute. One particularly effective method of incentivising participants to perform tasks is to run contests in which participants compete against each other for rewards. However, there are numerous ways to implement such contests in specific projects, varying in how performance is evaluated, how participants are rewarded, and the sizes of the prizes. Moreover, the best way to implement contests in a particular project is still an open challenge, as the effectiveness of each contest implementation (henceforth, incentive) is unknown in advance. Hence, in a crowdsourcing project, a practical approach to maximise the overall utility of the requester (which can be measured by the total number of completed tasks or the quality of the task submissions) is to choose a set of incentives suggested by previous studies from the literature or from the requester’s experience. Then, an effective mechanism can be applied to automatically select appropriate incentives from this set over different time intervals so as to maximise the cumulative utility within a given financial budget and a time limit. To this end, we present a novel approach to this incentive selection problem. Specifically, we formalise it as an online decision-making problem, where each action corresponds to offering a specific incentive. After that, we detail and evaluate a novel algorithm to solve the incentive selection problem efficiently and adaptively. In theory, in the case that all the estimates in the algorithm (except the estimates of the effectiveness of each incentive) are correct, we show that the algorithm achieves a regret bound of $\mathcal{O}(\sqrt{B/c})$, where B denotes the financial budget and c is the average cost of the incentives. In experiments, the performance of the algorithm is about 93% (up to 98%) of the optimal solution and about 9% (up to 40%) better than state-of-the-art algorithms in a broad range of settings, which vary in budget sizes, time limits, numbers of incentives, values of the standard deviation of the incentives’ utilities, and group sizes of the contests (i.e., the numbers of participants in a contest).
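For intuition only, the online decision-making framing can be sketched as a budget-limited epsilon-greedy loop (this is a baseline sketch of the problem setting, not the paper's algorithm; the incentive objects with a cost attribute and the utility_fn callback are assumptions):

import random

def select_incentives(incentives, budget, utility_fn, epsilon=0.1):
    # Each time interval, pick an incentive epsilon-greedily from running
    # utility estimates and deduct its cost until the budget is exhausted.
    estimates = {i: 0.0 for i in incentives}
    counts = {i: 0 for i in incentives}
    total = 0.0
    while budget > 0:
        affordable = [i for i in incentives if i.cost <= budget]
        if not affordable:
            break
        if random.random() < epsilon:
            choice = random.choice(affordable)
        else:
            choice = max(affordable, key=lambda i: estimates[i])
        reward = utility_fn(choice)   # observed utility over this time interval
        counts[choice] += 1
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
        budget -= choice.cost
        total += reward
    return total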
 
Article
RGB-T salient object detection (SOD) combines thermal infrared and RGB images to overcome the light sensitivity of RGB images in low-light conditions. However, the quality of RGB-T images could be unreliable under complex imaging scenarios, and direct fusion of these low-quality images will lead to sub-optimal detection results. In this paper, we propose a novel Modal Complementary Fusion Network (MCFNet) to alleviate the contamination effect of low-quality images from both global and local perspectives. Specifically, we design a modal reweight module (MRM) to evaluate the global quality of images and adaptively reweight RGB-T features by explicitly modelling interdependencies between RGB and thermal images. Furthermore, we propose a spatial complementary fusion module (SCFM) to explore the complementary local regions between RGB-T images and selectively fuse multi-modal features. Finally, multi-scale features are fused to obtain the salient detection result. Experiments on three RGB-T benchmark datasets demonstrate that our MCFNet achieved outstanding performance compared with the latest state-of-the-art methods. We have also achieved competitive results in RGB-D SOD tasks, which proves the generalization of our method. The source code is released at https://github.com/dotaball/MCFNet.
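A small PyTorch sketch of the global reweighting idea behind the MRM (pooled descriptors of both modalities feed a tiny MLP that outputs one quality weight per modality; the layer sizes and softmax choice are illustrative assumptions, not the released code):

import torch
import torch.nn as nn

class ModalReweight(nn.Module):
    # Down-weight the less reliable modality using globally pooled statistics
    # of the RGB and thermal feature maps.
    def __init__(self, channels: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * channels, channels), nn.ReLU(),
                                 nn.Linear(channels, 2), nn.Softmax(dim=1))

    def forward(self, rgb, thermal):
        g = torch.cat([rgb.mean(dim=(2, 3)), thermal.mean(dim=(2, 3))], dim=1)
        w = self.mlp(g)  # (B, 2) per-modality quality weights
        return w[:, 0, None, None, None] * rgb + w[:, 1, None, None, None] * thermal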
 
Top-cited authors
Seyedali Mirjalili
  • Griffith University
Harish Garg
  • Thapar Institute of Engineering & Technology, Patiala
Ibrahim Aljarah
  • University of Jordan
Shahzad Saremi
  • Griffith University
Hossam Faris
  • University of Jordan