Article

Evolutionary Generative Adversarial Networks

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Generative adversarial networks (GAN) have been effective for learning generative models for real-world data. However, existing GANs (GAN and its variants) tend to suffer from training problems such as instability and mode collapse. In this paper, we propose a novel GAN framework called evolutionary generative adversarial networks (E-GAN) for stable GAN training and improved generative performance. Unlike existing GANs, which employ a pre-defined adversarial objective function alternately training a generator and a discriminator, we utilize different adversarial training objectives as mutation operations and evolve a population of generators to adapt to the environment (i.e., the discriminator). We also utilize an evaluation mechanism to measure the quality and diversity of generated samples, such that only well-performing generator(s) are preserved and used for further training. In this way, E-GAN overcomes the limitations of an individual adversarial training objective and always preserves the best offspring, contributing to progress in and the success of GANs. Experiments on several datasets demonstrate that E-GAN achieves convincing generative performance and reduces the training problems inherent in existing GANs.


... Figure 14 depicts the framework of a GAN, where a two-player adversarial game is played between a generator (G) and a discriminator (D). The generator's updating gradients are determined by the discriminator through an adaptive objective [66]. ...
... Figure 14. The framework of a GAN [66]. ...
... Furthermore, several models have been developed based on the Generative Adversarial Network (GAN) framework to address specific tasks. These models include Coupled GAN (Co-GAN) [65], Markovian GAN [79], Evolutionary GAN (E-GAN) [66], Unrolled GAN [80], Bayesian Conditional GAN [81], Relativistic GAN [82], Laplacian GAN (Lap-GAN) [83], Graph Embedding GAN (GE-GAN) [77], Wasserstein GAN (WGAN) [84], and Boundary Equilibrium GAN (BEGAN) [85]. ...
Preprint
Full-text available
Deep learning (DL) has emerged as a powerful subset of machine learning (ML) and artificial intelligence (AI), outperforming traditional ML methods, especially in handling unstructured and large datasets. Its impact spans across various domains, including speech recognition, healthcare, autonomous vehicles, cybersecurity, predictive analytics, and more. However, the complexity and dynamic nature of real-world problems present challenges in designing effective deep learning models. Consequently, several deep learning models have been developed to address different problems and applications. In this article, we conduct a comprehensive survey of various deep learning models, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Models, Deep Reinforcement Learning (DRL), and Deep Transfer Learning. We examine the structure, applications, benefits, and limitations of each model. Furthermore, we perform an analysis using three publicly available datasets: IMDB, ARAS, and Fruit-360. We compare the performance of six renowned deep learning models: CNN, Simple RNN, Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional GRU.
... Evolutionary Generative Adversarial Networks (E-GAN) were first proposed in [22] to mitigate the inherent deficiencies of conventional GANs, namely, mode collapse, training instability, and vanishing gradients. Unlike vanilla GANs, which update a generator and a discriminator iteratively under a specified adversarial optimization strategy, E-GAN introduced a neuro-evolutionary computation strategy (NECS) that evolves a population of generators (parents) {G_θ}, instead of a single generator, in a given static environment denoted by the discriminator D_φ to produce a set of new generators (offspring). ...
... In essence, the variation mutations of generators in IGASEN-EMWGAN are the different objective functions, whose purposes are to update the parameters in generators followed by each evaluation by the discriminators and narrow the distances between the generated distribution and the data distribution [22,33]. As shown in Figure 2, the [...] θ} in a given dynamic environment denoted by the discriminators D_{φ_n} (n = 1, 2, ..., τ) by NECS. ...
... In essence, the variation mutations of generators in IGASEN-EMWGAN are the different objective functions, whose purposes are to update the parameters in generators followed by each evaluation by the discriminators and narrow the distances between the generated distribution and the data distribution [22,33]. As shown in Figure 2, the performance of the original minimax mutation is disappointing because the gradient of the generator easily vanishes. ...
Article
Full-text available
During the process of ship coating, various defects occur due to improper operation by workers, environmental changes, etc. The special characteristics of ship coating limit the amount of data and result in class imbalance, which undermines the effectiveness of deep learning-based models. Therefore, a novel hybrid intelligent image generation algorithm for ship painting defect images, called the IGASEN-EMWGAN model, is proposed in this paper to tackle these limitations. First, based on a subset of imbalanced ship painting defect image samples obtained by a bootstrap sampling algorithm, a batch of different base discriminators is trained independently with the algorithm parameter and sample perturbation method. Then, an improved genetic algorithm based on the simulated annealing algorithm is used to search for the optimal subset of base discriminators. Further, the IGASEN-EMWGAN model is constructed by fusing the base discriminators in this subset through a weighted integration strategy. Finally, the trained IGASEN-EMWGAN model is used to generate new defect images of the minority classes to obtain a balanced dataset of ship painting defects. Extensive experiments on a real unbalanced ship coating defect database show that, compared with the baselines, the ID score improves by 4.92% and the FID score decreases by 7.29%, which demonstrates the effectiveness of the proposed model.
... The simple data augmentations based on basic image manipulations are flipping, cropping, rotation, translation, etc. [37], [38]. Recently, GAN-based approaches have been used to create artificial instances from a dataset such that they retain similar characteristics to the original set [39], [40]. In malware detection, several papers applied data augmentation methods to solve imbalance or data insufficiency issues [12], [41]. ...
... Moreover, the proposed method showed stable performance even with relatively little training data. We applied several recent GAN models (i.e., deep convolutional GAN (DCGAN) [42], least-squares GAN (LSGAN) [43], Wasserstein GAN with gradient penalty (WGAN-GP) [44], evolutionary GAN (E-GAN) [40]) to our design, showing that it is potentially a reliable adaptation for state-of-the-art GAN models. ...
... proposed E-GAN [40] using several objective functions (i.e., minimax, heuristic, and least-squares). Generators using each objective function are evaluated by a discriminator, and the best-performing generator is chosen to evolve to the next stage. ...
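The mutate-evaluate-select step described in this snippet can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the three loss forms and their scaling constants are written from the commonly cited formulation and should be checked against the paper, the function names are hypothetical, and the fitness below uses only a quality term (mean discriminator output), leaving out the diversity term that the abstract mentions.

```python
import numpy as np

EPS = 1e-12  # guards against log(0); an implementation detail, not from the paper


def minimax_mutation(d_fake):
    """Minimax objective: 0.5 * E[log(1 - D(G(z)))], minimized by the generator."""
    return 0.5 * float(np.mean(np.log(1.0 - d_fake + EPS)))


def heuristic_mutation(d_fake):
    """Heuristic (non-saturating) objective: -0.5 * E[log D(G(z))]."""
    return -0.5 * float(np.mean(np.log(d_fake + EPS)))


def least_squares_mutation(d_fake):
    """Least-squares objective: E[(D(G(z)) - 1)^2]."""
    return float(np.mean((d_fake - 1.0) ** 2))


def quality_fitness(d_fake):
    """Assumed quality-only fitness: mean discriminator score on generated samples."""
    return float(np.mean(d_fake))


def select_best(children_scores):
    """Return the name of the child whose samples the discriminator rates highest.

    children_scores maps a child name to an array of D(G(z)) values in (0, 1).
    """
    return max(children_scores, key=lambda name: quality_fitness(children_scores[name]))


# Hypothetical discriminator outputs for three offspring, one per mutation.
scores = {
    "minimax": np.array([0.20, 0.30, 0.25]),
    "heuristic": np.array([0.60, 0.70, 0.65]),
    "least_squares": np.array([0.50, 0.40, 0.45]),
}
survivor = select_best(scores)
```

In a full training loop each offspring would be produced by one gradient step on its own objective before being scored, and the surviving generator would become the parent for the next iteration.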
Article
Full-text available
Zero-day malicious software (malware) refers to a previously unknown or newly discovered software vulnerability. The fundamental objective of this paper is to enhance the detection of analogous zero-day malware by learning efficiently from plausible generated data. To detect zero-day malware, we propose a malware training framework based on analogous malware data generated using generative adversarial networks (PlausMal-GAN). The PlausMal-GAN can produce analogous zero-day malware images with high quality and high diversity from the existing malware data. The discriminator, as a detector, learns various malware features using both real and generated malware images. In terms of performance, the proposed framework showed higher and more stable performance on the generated images, which can be assumed to be analogous zero-day malware data. We obtained reliable accuracy in the proposed PlausMal-GAN framework with representative GAN models (i.e., deep convolutional GAN, least-squares GAN, Wasserstein GAN with gradient penalty, and evolutionary GAN). These results indicate that the use of the proposed framework is beneficial for the detection and prediction of numerous and analogous zero-day malware data from noted malware when developing and updating malware detection systems.
... Continuous monitoring of free calcium is highly important in the subsequent production processes, as the cement calcination process operates continuously. Therefore, after addressing the issue of limited labeled data, [22][23][24] constructing the generated labeled data as the input layer can enable the prediction of multi-node continuous time series for free lime content, which is important for improving the accuracy of soft measurement models. ...
Article
This paper proposes a method to address the issue of insufficient capture of temporal dependencies in cement production processes, based on a data-augmented Seq2Seq-WGAN (Sequence to Sequence-Wasserstein Generative Adversarial Network) model. Considering the existence of various temporal scales in cement production processes, we use WGAN to generate a large amount of f-CaO label data and employ Seq2Seq to solve the problem of unequal length input–output sequences. We use the unlabeled relevant variable data as the input to the encoder of the Seq2Seq-WGAN model and use the generated labels as the input to the decoder, thus fully exploring the temporal dependency relationships between input and output variables. We use the hidden vector produced by the encoder, which contains the temporal characteristics of cement production, as the initial state of the gated recurrent unit in the decoder to achieve accurate prediction of key points and continuous time. The experimental results show that the Seq2Seq-WGAN model can achieve accurate prediction of continuous time series of free calcium and offer direction for subsequent production planning. This method has high practicality and application prospects, and can provide strong support for the production scheduling of the cement industry.
... The Minimax game algorithm tries to minimize the maximum possible loss which results in multiple possibilities that can be used to generate new samples. In doing so, GAN projects the available simple distribution to a much more complex high-dimensional, real-world data distribution [18]. GAN trains two adversarial networks called the generator and the discriminator. ...
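For reference, the two-player minimax game mentioned in this snippet is usually written as the value function below. The symbols are the conventional ones and are assumptions here, since the snippet does not define them: $p_{\text{data}}$ is the data distribution, $p_z$ the noise prior, $G$ the generator, and $D$ the discriminator.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes $V$ by assigning high scores to real samples and low scores to generated ones, while the generator minimizes the second term by producing samples that the discriminator scores as real.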
Article
Full-text available
Recent ransomware attacks threaten not only personal files but also critical infrastructure like smart grids, necessitating early detection before encryption occurs. Current methods, reliant on pre-encryption data, suffer from insufficient and rapidly outdated attack patterns, despite efforts to focus on select features. Such an approach assumes that the same features remain unchanged. This approach proves ineffective due to the polymorphic and metamorphic characteristics of ransomware, which generate unique attack patterns for each new target, particularly in the pre-encryption phase where evasiveness is prioritized. As a result, the selected features quickly become obsolete. Therefore, this study proposes an enhanced Bi-Gradual Minimax (BGM) loss function for the Generative Adversarial Network (GAN) algorithm that compensates for the insufficiency of attack patterns to represent the polymorphic behavior at the earlier phases of the ransomware lifecycle. Unlike existing GAN-based models, the BGM-GAN gradually minimizes the maximum loss of the generator and discriminator in the network. This allows the generator to create artificial patterns that resemble the pre-encryption data distribution. The generator is used to craft evasive adversarial patterns and add them to the original data. Then, the generator and discriminator compete to optimize their weights during the training phase such that the generator produces realistic attack patterns, while the discriminator endeavors to distinguish between the real and crafted patterns. The experimental results show that the proposed BGM-GAN reached a maximum accuracy of 0.98, a recall of 0.96, and a minimum false positive rate of 0.14, all of which outperform those obtained by existing works. The application of BGM-GAN can be extended to the early detection of malware and other types of attacks.
... The rapid advancement of artificial intelligence (AI) and machine learning (ML) includes their integration into various aspects of our daily lives, from image recognition to language understanding [1]. Over the past few decades, AI systems have evolved from rudimentary, remote-controlled devices to sophisticated models capable of generating photorealistic images and interpreting complex language [2,3]. As AI development accelerates, driven by increasing investments and faster computational training, its potential impact on society grows [4,5]. ...
Chapter
Full-text available
Artificial Intelligence (AI) is making significant strides in the field of education, offering new opportunities for personalized learning and access to education for a more diverse population. Despite this potential, the adoption of AI in K-12 education is limited, and educators express hesitancy towards its integration due to perceived technological barriers and misconceptions. The purpose of this study is to examine the perceptions of K-12 educators in all 50 states of the USA towards AI, policies, training, and resources related to technology and AI, their comfort with technology, willingness to adopt new technologies for classroom instruction, and needs assessment for necessary infrastructure, such as reliable internet access, hardware, and software. Researchers analyzed regional differences in attitudes towards AI integration in the classroom. The findings suggest an overall positive perception of AI and openness towards its integration. However, disparities in access to technology and comfort levels with technology exist among different regions, genders, and age groups. These findings suggest that policymakers and educators need to develop targeted strategies to ensure equitable access to technology and AI integration in the classroom. This work implies the need for an authentic STEM model for integrating AI into K-12 education and offers recommendations for policymakers and educators to support the successful adoption of AI in the classroom.
... Image generation in our case specifically means unsupervised learning [11] and seed-based image synthesis [12,13]. Generative adversarial networks [14], or GANs, are typical tools used for this purpose [15][16][17]. Previous works have already shown the feasibility of this method [18][19][20][21][22][23] and have proposed improvements and adaptations. ...
Article
Full-text available
Image inpainting is a critical area of research in computer vision with a broad range of applications, including image restoration and editing. However, current inpainting models often struggle to learn the specific painting styles and fine-grained brushstrokes of individual artists when restoring Chinese landscape paintings. To address this challenge, this paper proposes a novel inpainting model specifically designed for Chinese landscape paintings, featuring a hierarchical structure that can be applied to restore the famous Dwelling in the Fuchun Mountains with remarkable fidelity. The proposed method leverages an image processing algorithm to extract the structural information of Chinese landscape paintings. This approach enables the model to decompose the inpainting process into two separate steps, generating less informative backgrounds and more detailed foregrounds. By seamlessly merging the generated results with the remaining portions of the original work, the proposed method can faithfully restore Chinese landscape paintings while preserving their rich details and fine-grained styles. Overall, the results of this study demonstrate that the proposed method represents a significant step forward in the field of image inpainting, particularly for the restoration of Chinese landscape paintings. The hierarchical structure and image processing algorithm used in this model are able to faithfully restore delicate and intricate details of these paintings, making it a promising tool for art restoration professionals and researchers.
... Subsequently, Lipizzaner is applied to medical image augmentations [62]. In [51], Toutouh hybridized E-GAN [49] and Lipizzaner to combine mutation and population approaches to improve the diversity of GANs. Based on neuro-evolution and coevolution in the GAN training, Costa et al. [50] devised COEGAN to provide a more stable training method and the automatic design of neural network architectures. ...
Preprint
Zero-shot learning (ZSL) aims to recognize novel classes for which no samples can be collected to train a prediction model. Accordingly, generative models (e.g., generative adversarial network (GAN)) are typically used to synthesize the visual samples conditioned by the class semantic vectors and achieve remarkable progress for ZSL. However, existing GAN-based generative ZSL methods are based on hand-crafted models, which cannot adapt to various datasets/scenarios and fail to model instability. To alleviate these challenges, we propose evolutionary generative adversarial network search (termed EGANS) to automatically design the generative network with good adaptation and stability, enabling reliable visual feature sample synthesis for advancing ZSL. Specifically, we adopt cooperative dual evolution to conduct a neural architecture search for both generator and discriminator under a unified evolutionary adversarial framework. EGANS is learned in two stages: evolution generator architecture search and evolution discriminator architecture search. During the evolution generator architecture search, we adopt a many-to-one adversarial training strategy to evolutionarily search for the optimal generator. Then the optimal generator is further applied to search for the optimal discriminator in the evolution discriminator architecture search with a similar evolution search algorithm. Once the optimal generator and discriminator are found, we incorporate them into various generative ZSL baselines for ZSL classification. Extensive experiments show that EGANS consistently improves existing generative ZSL methods on the standard CUB, SUN, AWA2 and FLO datasets. The significant performance gains indicate that the evolutionary neural architecture search explores a virgin field in ZSL.
... The image classification model facilitates discrimination of different infections in terms of their appearance and structure. To learn the approximate location information of the patch on the pulmonary image, the model uses relative distance-from-edge as an extra weight [24]. Therefore, relying on the answer and proposals of [64], an LSTM system is put forward for the judgment of COVID-19 linked cardiac association. Bear in mind that in feedforward neural networks, signals are permitted to move in only one direction, travelling onward from the input to the output. ...
Conference Paper
Full-text available
The outbreak of COVID-19 put the whole world in an unprecedentedly harsh situation, horribly disrupting life around the world and killing thousands. COVID-19 remains a real threat to the public health system as it spreads to 212 countries and territories and the number of cases of infection and deaths increases to 5,212,172 and 334,915 (as of May 22, 2020). This treatise provides a response to virus eradication via artificial intelligence (AI). Several deep learning (DL) methods have been described to achieve this goal, including GAN (Generative Adversarial Network), ELM (Extreme Learning Machine), and LSTM (Long Short-Term Memory). It describes an integrated bioinformatics approach that combines various aspects of information from a series of orthopedic and unstructured data sources to form a user-friendly platform for physicians and researchers. A major advantage of these AI-powered platforms is to facilitate the diagnosis and treatment process of the COVID-19 disease. The latest relevant publications and medical reports have been investigated to select inputs and targets for networks that will facilitate arriving at reliable artificial neural network-based tools for COVID-19-related challenges. There are also several specific inputs per platform, including clinical data and data in various formats, such as medical images, which can improve the performance of the introduced method for the best response in real application.
... Evolutionary computation (EC), inspired by natural evolutionary processes, is one of the global optimization methods [18], [19]. This method has been shown to be useful in solving optimization problems as a meta-heuristic method or in combination with other machine learning methods [20], [21]. Evolutionary computation starts with a number of candidate solutions and iteratively evaluates and reproduces the candidates, gradually converging to the optimal solution. ...
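The evaluate-and-reproduce loop described in this snippet can be sketched generically as follows. This is an illustrative toy under stated assumptions, not the method of any cited paper: the function names, the Gaussian mutation, the elitist (parents plus offspring) selection, and the one-dimensional test problem are all hypothetical choices.

```python
import random


def evolve(fitness, init_population, mutate, generations=50, seed=0):
    """Generic elitist evolutionary loop: mutate, evaluate, keep the fittest."""
    rng = random.Random(seed)
    population = list(init_population)
    for _ in range(generations):
        # Reproduction: each candidate yields one mutated offspring.
        offspring = [mutate(candidate, rng) for candidate in population]
        # Evaluation and selection: parents compete with offspring, so the
        # best fitness in the population can never decrease.
        pool = sorted(population + offspring, key=fitness, reverse=True)
        population = pool[: len(init_population)]
    return population[0]


def toy_fitness(x):
    """Toy objective with its maximum at x = 3."""
    return -((x - 3.0) ** 2)


def gaussian_mutate(x, rng):
    """Perturb a candidate with zero-mean Gaussian noise."""
    return x + rng.gauss(0.0, 0.5)


start_rng = random.Random(1)
initial = [start_rng.uniform(-10.0, 10.0) for _ in range(10)]
best = evolve(toy_fitness, initial, gaussian_mutate)
```

Keeping parents in the selection pool is one design choice among several; non-elitist schemes replace the whole population each generation and trade that guarantee for more exploration.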
Article
Full-text available
Federated learning, where the distribution of distributed data is unknown, is more difficult and costly to train a central model with than traditional machine learning. In this study, we propose Federated Learning with Genetic Algorithm, which enables faster central model training at lower cost by providing an appropriate client selection method. A client can have its own communication cost depending on its data sharing preference, and based on this cost and the result of the client’s local update, we can select the appropriate combination of clients each round with a genetic algorithm. In each round, client combinations are evaluated anew and continually explored. To evaluate the algorithm, we distributed the image dataset and communication costs in two ways and conducted federated learning for the image classification model. Experiments showed that the proposed algorithm can find a more efficient client combination and accelerate the training of federated learning.
... Chen et al. [15] developed a data augmentation method using centroidal Voronoi tessellation sampling and a modified conditional GAN to overcome a sample collection problem in soft-sensor development in chemical engineering. Ohno et al. [16][17][18] improved the kernel ridge regression learning performance using data augmented with a GAN-based real-valued non-volume-preserving model. Zhu et al. [19] applied a GAN to hyperspectral image classification problems and developed a 1D-GAN and 3D-GAN for spectral and spectral-space classifications, respectively. ...
Article
Full-text available
In this study, we present a methodology for deriving an optimal performance design in a new domain using a designable generative adversarial network (DGAN) structure based on domain-adaptive designable data augmentation (DADDA). In a generative adversarial network (GAN), two neural networks—the generator and discriminator—compete to learn virtual data that are difficult to distinguish from real data. The DGAN can estimate the corresponding design variables along with the generated virtual data by adding an inverse generator to the GAN structure. Designable data augmentation is possible by using a DGAN for unknown design variables in the virtual data created by the GAN. The advantage of a DGAN is that it is adaptable to both single and multiple design domains. In this study, we develop a DADDA method using a domain-adaptive DGAN. DADDA is a source/target domain-adaptive method that maximizes the design performance in a new target domain from the beginning based on an already optimized source domain that has accumulated a large amount of data. The methodology is verified using one mathematical example and two engineering analysis examples. First, the design is derived by increasing the goal performance step by step. The engineering analysis examples confirm that the design can be improved by up to 12.6%. In addition, we propose a method for deriving the maximum performance enhancement limit according to a virtual data-based optimization, without analyzing the goal performance individually, by grafting it with a genetic algorithm.
... Historically, the concept of Wasserstein distance has been a fundamental element in the field of generative models, particularly in the training of Generative Adversarial Networks (GANs). The introduction of the Wasserstein GAN (WGAN) [30] aimed to tackle the common issues associated with traditional GANs, such as unstable training, mode collapse, and vanishing gradients. The application of the Wasserstein distance in this context allowed for a more meaningful and smoother gradient during training, resulting in improved model stability and performance. ...
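To make the "smoother gradient" remark concrete, the sketch below computes the 1-Wasserstein distance between two equal-sized one-dimensional samples. It is a minimal illustration that relies on the known fact that, in one dimension with equal sample sizes, the distance equals the mean absolute difference of the sorted samples; the function name is hypothetical and this is not the WGAN critic itself.

```python
import numpy as np


def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equal-sized 1-D empirical samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal size")
    # Optimal transport in 1-D matches the i-th smallest point of x
    # with the i-th smallest point of y.
    return float(np.mean(np.abs(x - y)))


base = [0.0, 1.0, 2.0]
# Shifting a sample by d changes the distance by exactly |d|, even when
# the two samples do not overlap at all.
distances = [wasserstein_1d(base, [v + d for v in base]) for d in (1.0, 5.0, 10.0)]
# distances == [1.0, 5.0, 10.0]
```

Because the distance keeps growing linearly as the samples move apart, it still tells a generator which direction to move in, whereas divergences that saturate for non-overlapping distributions provide no such signal.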
Preprint
Full-text available
In recent years, deep learning models have revolutionized medical image interpretation, offering substantial improvements in diagnostic accuracy. However, these models often struggle with challenging images where critical features are partially or fully occluded, which is a common scenario in clinical practice. In this paper, we propose a novel curriculum learning-based approach to train deep learning models to handle occluded medical images effectively. Our method progressively introduces occlusion, starting from clear, unobstructed images and gradually moving to images with increasing occlusion levels. This ordered learning process, akin to human learning, allows the model to first grasp simple, discernable patterns and subsequently build upon this knowledge to understand more complicated, occluded scenarios. Furthermore, we present three novel occlusion synthesis methods, namely Wasserstein Curriculum Learning (WCL), Information Adaptive Learning (IAL), and Geodesic Curriculum Learning (GCL). Our extensive experiments on diverse medical image datasets demonstrate substantial improvements in model robustness and diagnostic accuracy over conventional training methodologies.
... The information inside the FNN can proceed from the input layer, via the hidden layer, and to the output layer. Recently, a lot of studies and applications have shown that the FNN has good function-fitting ability and pattern recognition ability [39], [40]. The FNN-based KLM can learn the mapping relationship between positions and directions of the experiences. ...
Article
Full-text available
Evolutionary computation (EC) is a kind of meta-heuristic algorithm that takes inspiration from natural evolution and swarm intelligence behaviors. In the EC algorithm, there is a huge amount of data generated during the evolutionary process. These data reflect the evolutionary behavior and therefore mining and utilizing these data can obtain promising knowledge for improving the effectiveness and efficiency of EC algorithms to better solve optimization problems. Considering this and inspired by the ability of human beings that acquire knowledge from the historical successful experiences of their predecessors, this paper proposes a novel EC paradigm, named knowledge learning EC (KLEC). The KLEC aims to learn from historical successful experiences to obtain a knowledge library and to guide the evolutionary behaviors of individuals based on the knowledge library. The KLEC includes two main processes named “learning from experiences to obtain knowledge” and “utilizing knowledge to guide evolution”. First, KLEC maintains a knowledge library model and updates this model by learning the successful experiences collected in every generation. Second, KLEC not only adopts the evolutionary operation but also utilizes the knowledge library model to guide individuals for better evolution. The KLEC is a generic and effective framework, and we propose two algorithm instances of KLEC, which are knowledge learning-based differential evolution and knowledge learning-based particle swarm optimization. Also, we combine the knowledge learning framework with several state-of-the-art EC algorithms, showing that the performance of the state-of-the-art algorithms can be significantly enhanced by incorporating the knowledge learning framework.
... ; if A is an action that has not been selected before represents the MRR. The use of a logarithmic function in the equation is motivated by previous studies [39], [40], which found that it leads to a stable loss. When the relevant files are ranked higher, the average precision tends to be higher. ...
Preprint
Full-text available
Software developers spend a significant portion of time fixing bugs in their projects. To streamline this process, bug localization approaches have been proposed to identify the source code files that are likely responsible for a particular bug. Prior work proposed several similarity-based machine-learning techniques for bug localization. Despite significant advances in these techniques, they do not directly optimize the evaluation measures. Instead, they use different metrics in the training and testing phases, which can negatively impact the model performance in retrieval tasks. In this paper, we propose RLocator, a Reinforcement Learning-based (RL) bug localization approach. We formulate the bug localization problem using a Markov Decision Process (MDP) to optimize the evaluation measures directly. We present the technique and experimentally evaluate it based on a benchmark dataset of 8,316 bug reports from six highly popular Apache projects. Our evaluation shows that RLocator achieves up to a Mean Reciprocal Rank (MRR) of 0.62 and a Mean Average Precision (MAP) of 0.59. Our results demonstrate that directly optimizing evaluation measures considerably contributes to performance improvement of the bug localization problem.
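The two evaluation measures named in this abstract can be computed as sketched below. This uses the standard definitions of reciprocal rank and average precision; the function names and the file names in the example are hypothetical, and the code is not taken from RLocator.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def mean_metric(metric, queries):
    """Average a per-query metric over (ranked list, relevant set) pairs."""
    return sum(metric(ranked, relevant) for ranked, relevant in queries) / len(queries)


# Two hypothetical bug reports, each with a ranked list of candidate files.
queries = [
    (["A.java", "B.java", "C.java"], {"B.java"}),
    (["A.java", "B.java", "C.java"], {"A.java", "C.java"}),
]
mrr = mean_metric(reciprocal_rank, queries)
mean_ap = mean_metric(average_precision, queries)
```

In the first query the only buggy file sits at rank 2, so both measures give 0.5; moving it to rank 1 would raise both to 1.0, which is the sense in which ranking relevant files higher raises the score.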
... The disease-related characteristics are present, and it will considerably improve the performance of the diagnostic conclusion by giving a complete identified section and corroboration for the complex audits and inspection results. The author studied the potential of establishing autonomous computerized programs for diagnosing risk levels and the existence of disease-related characteristics in endoscopic images [20]. The existence of disease-related characteristics in endoscopic images is first discovered, and afterwards, the extent of disease follows options on this. ...
... In terms of high mobility, tunnels and other environmental conditions, there are some problems such as low update rate and signal interruption, which seriously affect its navigation and positioning accuracy, whereas INS is an autonomous system. By continuously integrating the real angular velocity and specific force measured by the gyroscopes and accelerometers, the position, velocity and azimuth information can be obtained [3,4]. However, INS gyro drift error and accelerometer deviation will increase with time [5]. ...
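The growth of INS error over time can be illustrated with a toy one-dimensional dead-reckoning example. This is a sketch under simplifying assumptions (a stationary vehicle, a constant accelerometer bias, simple Euler integration); the function name and the numeric values are hypothetical and do not come from the cited work.

```python
def integrate_position(accelerations, dt):
    """Integrate acceleration twice (Euler steps) to obtain 1-D position."""
    velocity = 0.0
    position = 0.0
    for a in accelerations:
        velocity += a * dt         # first integration: acceleration to velocity
        position += velocity * dt  # second integration: velocity to position
    return position


BIAS = 0.01  # assumed constant accelerometer bias in m/s^2
DT = 0.1     # assumed sample period in seconds

# The vehicle is actually at rest, so everything integrated is pure error.
error_after_10s = integrate_position([BIAS] * 100, DT)
error_after_20s = integrate_position([BIAS] * 200, DT)
```

Doubling the elapsed time roughly quadruples the position error, because a constant bias is integrated twice. This is why a standalone INS is typically corrected by an external reference.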
... Due to the limitation of the special aviation controls of UAVs in the capital area, we were unable to fly more. According to this problem, we could employ algorithmic compensation, such as GAN (Generative Adversarial Network) methods [54,55]. This work will be listed in our next stage study. ...
Article
Pine wilt disease (PWD) is a great danger, due to two aspects: no effective cure and fast dissemination. One key to the prevention and treatment of pine wilt disease is the early detection of infected wood. Subsequently, appropriate treatment can be applied to limit the further spread of pine wilt disease. In this work, a UAV (Unmanned Aerial Vehicle) with an RGB (Red, Green, Blue) camera was employed as it provided high-quality images of pine trees in a timely manner. Seven flights were performed above seven sample plots in northwestern Beijing, China. Then, raw images captured by the UAV were further pre-processed, classified, annotated, and formed the research datasets. In the formal analysis, improved YOLOv5 frameworks that integrated four attention mechanism modules, i.e., SE (Squeeze-and-Excitation), CA (Coordinate Attention), ECA (Efficient Channel Attention), and CBAM (Convolutional Block Attention Module), were developed. Each of them was shown to improve the overall identification rate of infected trees at different ranges. The CA module was found to have the best performance, with an accuracy of 92.6%, a 3.3% improvement over the original YOLOv5s model. Meanwhile, the recognition speed was improved by 20 frames/second compared to the original YOLOv5s model. The comprehensive performance could well support the need for rapid detection of pine wilt disease. The overall framework proposed by this work shows a fast response to the spread of PWD. In addition, it requires a small amount of financial resources, which makes the method easy for forestry operators to replicate.
... In recent years, deep generative models have emerged as a powerful means of learning data distributions. These models, which include generative adversarial networks (GANs) [163,164,165,166], Vector-Quantized Variational Autoencoders (VQ-VAEs) [167,168], autoregressive models [169,170], and diffusion models [171,172], have demonstrated impressive capabilities in a wide range of applications. By learning the underlying probability distribution that generated the data, researchers can gain insights into the underlying mechanisms of the data-generating process. ...
Preprint
Automated machine learning (AutoML) seeks to build ML models with minimal human effort. While considerable research has been conducted in the area of AutoML in general, aiming to take humans out of the loop when building artificial intelligence (AI) applications, scant literature has focused on how AutoML works well in open-environment scenarios such as the process of training and updating large models, industrial supply chains or the industrial metaverse, where people often face open-loop problems during the search process: they must continuously collect data, update data and models, satisfy the requirements of the development and deployment environment, support massive devices, modify evaluation metrics, etc. Addressing the open-environment issue with pure data-driven approaches requires considerable data, computing resources, and effort from dedicated data engineers, making current AutoML systems and platforms inefficient and computationally intractable. Human-computer interaction is a practical and feasible way to tackle the problem of open-environment AI. In this paper, we introduce OmniForce, a human-centered AutoML (HAML) system that yields both human-assisted ML and ML-assisted human techniques, to put an AutoML system into practice and build adaptive AI in open-environment scenarios. Specifically, we present OmniForce in terms of ML version management; pipeline-driven development and deployment collaborations; a flexible search strategy framework; and widely provisioned and crowdsourced application algorithms, including large models. Furthermore, the (large) models constructed by OmniForce can be automatically turned into remote services in a few minutes; this process is dubbed model as a service (MaaS). Experimental results obtained in multiple search spaces and real-world use cases demonstrate the efficacy and efficiency of OmniForce.
... GANs are well known for performing image generation (Radford, Metz, and Chintala 2015) and text generation (Zhang et al. 2017). More recent exploration of novel uses of GANs includes evolutionary GAN (Wang et al. 2019). Our novel extension to this typical GAN setup is that we introduce the concept of a surrogate model that acts as the oracle to the discriminator. ...
Preprint
We propose a new Tipping Point Generative Adversarial Network (TIP-GAN) for better characterizing potential climate tipping points in Earth system models. We describe an adversarial game to explore the parameter space of these models, detect upcoming tipping points, and discover the drivers of tipping points. In this setup, a set of generators learn to construct model configurations that will invoke a climate tipping point. The discriminator learns to identify which generators are generating each model configuration and whether a given configuration will lead to a tipping point. The discriminator is trained using an oracle (a surrogate climate model) to test if a generated model configuration leads to a tipping point or not. We demonstrate the application of this GAN to invoke the collapse of the Atlantic Meridional Overturning Circulation (AMOC). We share experimental results of modifying the loss functions and the number of generators to exploit the area of uncertainty in model state space near a climate tipping point. In addition, we show that our trained discriminator can predict AMOC collapse with a high degree of accuracy without the use of the oracle. This approach could generalize to other tipping points, and could augment climate modeling research by directing users interested in studying tipping points to parameter sets likely to induce said tipping points in their computationally intensive climate models.
... In this paper, a multi-layer perceptron Wasserstein generative adversarial network is proposed, which can be expressed as Eq. (2) [43], where $D$ is the discriminator model for judging the type of sample, $G$ is the generator model for generating fake data, $z$ is the seed noise, $x \in \mathbb{R}^n$ are real samples, $\tilde{x} \in \mathbb{R}^n$ are generated samples, $y \in \mathbb{R}^2$ are the decision values of the discriminator, and $\theta_G$ and $\theta_D$ are the generator and discriminator model parameters, respectively. ...
... The GAN [23] is an unsupervised method that constructs a model through two neural networks: a generator and discriminator. The underlying concept, along with its many variations, represents one of the most innovative ideas in machine learning over the last decade. ...
Article
Anomaly detection is an important research topic in the field of artificial intelligence and visual scene understanding. The most significant challenge in real-world anomaly detection problems is the high imbalance of available data (i.e., non-anomalous versus anomalous data). This limits the use of supervised learning methods. Furthermore, the abnormal—and even normal—datasets in the airport field are relatively insufficient, causing them to be difficult to use to train deep neural networks when conducting experiments. Because generative adversarial networks (GANs) are able to effectively learn the latent vector space of all images, the present study adopted a GAN variant with autoencoders to create a hybrid model for detecting anomalies and hazards in the airport environment. The proposed method, which integrates the Wasserstein-GAN (WGAN) and Skip-GANomaly models to distinguish between normal and abnormal images, is called the Improved Wasserstein Skip-Connection GAN (IWGAN). In the experimental stage, we evaluated different hyper-parameters—including the activation function, learning rate, decay rate, training times of discriminator, and method of label smoothing—to identify the optimal combination. The proposed model’s performance was compared with that of existing models, such as U-Net, GAN, WGAN, GANomaly, and Skip-GANomaly. Our experimental results indicate that the proposed model yields exceptional performance.
Article
Change detection (CD) in synthetic aperture radar (SAR) images aims to detect changed areas by considering the changes in backscattering coefficients. However, the changes can be further divided into positive and negative changes in terms of the increase or decrease of backscattering coefficient, so the CD task can be divided into binary and ternary according to the number of existent categories. This paper introduces an M-nary (binary or ternary) SAR change detection procedure based on the generative adversarial network (GAN) and neural architecture search (NAS) strategy to detect which changes exist in the SAR image-pair and design specialized classifiers for both binary and ternary CD. First, a difference image generation approach based on the salient changed region extraction and neighborhood information is designed for a robust difference representation on the M-nary CD. Due to the further subdivision of changes, the insufficiency of labeled data presents the M-nary change detection with a dilemma. Concerning the lack of labeled information, this paper presents a labeled sample generation strategy based on the GAN architecture search to supplement sample data. Since GAN training is inherently unstable, NAS provides an effective means of searching GAN architecture automatically and ameliorates the reliability of generated samples. During the architecture search procedure, a double-phase evolutionary search strategy is introduced to further improve the stability of GAN training. The experimental results with theoretical analysis prove the validity, robustness, and potential of our method in synthetic as well as real SAR datasets.
Article
Generative Adversarial Networks (GANs) have been proposed as a method to generate multiple replicas from an original version combining a Discriminator and a Generator. The main applications of GANs have been the casual generation of audio and video content. GANs, as a neural method that generates populations of individuals, have emulated genetic algorithms based on biologically inspired operators such as mutation, crossover and selection. This article presents the Deep Learning Generative Adversarial Random Neural Network (RNN) with the same features and functionality as a GAN. Furthermore, the presented algorithm is proposed for an application, the Digital Creative, that generates tradeable replicas in a Data Marketplace, such as 1D functions or audio, 2D and 3D images and video content. The RNN Generator creates individuals mapped from a latent space while the GAN Discriminator evaluates them based on the true data distribution. The performance of the Deep Learning Generative Adversarial RNN has been assessed against several input vectors with different dimensions, in addition to 1D functions and 2D images. The presented results are successful: the learning objective of the RNN Generator creates tradeable replicas at low error, whereas the RNN Discriminator learning target identifies unfit individuals.
Article
Quantum computers are next-generation devices that hold promise to perform calculations beyond the reach of classical computers. A leading method towards achieving this goal is through quantum machine learning, especially quantum generative learning. Due to the intrinsic probabilistic nature of quantum mechanics, it is reasonable to postulate that quantum generative learning models (QGLMs) may surpass their classical counterparts. As such, QGLMs are receiving growing attention from the quantum physics and computer science communities, where various QGLMs that can be efficiently implemented on near-term quantum machines with potential computational advantages are proposed. In this paper, we review the current progress of QGLMs from the perspective of machine learning. Particularly, we interpret these QGLMs, covering quantum circuit Born machines, quantum generative adversarial networks, quantum Boltzmann machines, and quantum variational autoencoders, as the quantum extension of classical generative learning models. In this context, we explore their intrinsic relations and their fundamental differences. We further summarize the potential applications of QGLMs in both conventional machine learning tasks and quantum physics. Last, we discuss the challenges and further research directions for QGLMs.
Article
A cross-ribosome binding site (cRBS) adjusts the dynamic range of transcription factor-based biosensors (TFBs) by controlling protein expression and folding. The rational design of a cRBS with desired TFB dynamic range remains an important issue in TFB forward and reverse engineering. Here, we report a novel artificial intelligence (AI)-based forward-reverse engineering platform for TFB dynamic range prediction and de novo cRBS design with selected TFB dynamic ranges. The platform demonstrated superior performance in processing unbalanced minority-class datasets and was guided by sequence characteristics from trained cRBSs. The platform identified correlations between cRBSs and dynamic ranges to mimic bidirectional design between these factors based on Wasserstein generative adversarial network (GAN) with a gradient penalty (GP) (WGAN-GP) and balancing GAN with GP (BAGAN-GP). For forward and reverse engineering, the predictive accuracy was up to 98% and 82%, respectively. Collectively, we generated an AI-based method for the rational design of TFBs with desired dynamic ranges.
Article
A Generative Adversarial Network (GAN) can learn the relationship between two image domains and achieve unpaired image-to-image translation. One of the breakthroughs was Cycle-consistent Generative Adversarial Networks (CycleGAN), which is a popular method to transfer the content representations from the source domain to the target domain. Existing studies have gradually improved the performance of CycleGAN models by modifying the network structure or loss function of CycleGAN. However, these methods tend to suffer from training instability and the generators lack the ability to acquire the most discriminating features between the source and target domains, thus making the generated images of low fidelity and few texture details. To overcome these issues, this paper proposes a new method that combines Evolutionary Algorithms (EAs) and Attention Mechanisms to train GANs. Specifically, from an initial CycleGAN, binary vectors indicating the activation of the weights of the generators are progressively improved upon by means of an EA. At the end of this process, the best-performing configurations of generators can be retained for image generation. In addition, to address the issues of low fidelity and lack of texture details on generated images, we make use of the channel attention mechanism. The latter component allows the candidate generators to learn important features of real images and thus generate images with higher quality. The experiments demonstrate qualitatively and quantitatively that the proposed method, namely, Attention evolutionary GAN (AevoGAN) alleviates the training instability problems of CycleGAN training. In the test results, the proposed method can generate higher quality images and obtain better results than the CycleGAN training methods present in the literature, in terms of Inception Score (IS), Fréchet Inception Distance (FID) and Kernel Inception Distance (KID).
Article
In recent years, Reinforcement Learning and Gradient optimization were applied with Neural Architecture Search algorithms in Generative Adversarial Networks to achieve state-of-the-art (SOTA) performance. However, the existing RL-based methods utilised the calculation of Inception Score or Fréchet Inception Distance as the reward value to guide the controller, which wasted much of the search time. In order to improve the search efficiency without degradation of performance, this paper proposes recycling the discriminator to evaluate the performance of architectures; in other words, we propose to self-guide the search process. Meanwhile, we introduce the new concept of multiple controllers and the method of reward shaping to independently and effectively search the cell architectures. The experiments demonstrate the effectiveness and efficiency of our Multi-Self GAN, and the ablation study also exhibits its robustness.
Chapter
Generative adversarial network (GAN) is a powerful method to reproduce the distribution of a given data set. It is widely used for generating photo-realistic images or data collections that appear real. Evolutionary GAN (E-GAN) is one of the state-of-the-art GAN variations. E-GAN combines population-based search and evolutionary operators from evolutionary algorithms with GAN to enhance diversity and search performance. In this study we aim to improve E-GAN by adding transfer learning and crossover, which is a key evolutionary operator that is commonly used in evolutionary algorithms, but not in E-GAN.
Article
Learning and optimization are the two essential abilities of human beings for problem solving. Similarly, computer scientists have made great efforts to design artificial neural network (ANN) and evolutionary computation (EC) to simulate the learning ability and the optimization ability for solving real-world problems, respectively. These have been two essential branches in artificial intelligence (AI) and computer science. However, in humans, learning and optimization are usually integrated together for problem solving. Therefore, how to efficiently integrate these two abilities together to develop powerful AI remains a significant but challenging issue. Motivated by this, this paper proposes a novel learning-aided evolutionary optimization (LEO) framework that combines learning and evolution for solving optimization problems. The LEO is integrated with the evolution knowledge learned by ANN from the evolution process of EC to promote optimization efficiency. The LEO framework is applied to both classical EC algorithms and some state-of-the-art EC algorithms including a champion algorithm, with benchmarking against the IEEE Congress on Evolutionary Computation competition data. The experimental results show that the LEO can significantly enhance the existing EC algorithms to better solve both single-objective and multi-/many-objective global optimization problems, suggesting that learning plus evolution is more intelligent for problem solving. Moreover, the experimental results have also validated the time efficiency of the LEO, where the additional time cost for using LEO is well justified. Therefore, the promising LEO can lead to a new and more efficient paradigm for EC algorithms to solve global optimization problems by combining learning and evolution.
Conference Paper
Compressing convolutional neural networks (CNNs) is essential for transferring the success of CNNs to a wide variety of applications to mobile devices. In contrast to directly recognizing subtle weights or filters as redundant in a given CNN, this paper presents an evolutionary method to automatically eliminate redundant convolution filters. We represent each compressed network as a binary individual of specific fitness. Then, the population is upgraded at each evolutionary iteration using genetic operations. As a result, an extremely compact CNN is generated using the fittest individual, which has the original network structure and can be directly deployed in any off-the-shelf deep learning libraries. In this approach, either large or small convolution filters can be redundant, and filters in the compressed network are more distinct. In addition, since the number of filters in each convolutional layer is reduced, the number of filter channels and the size of feature maps are also decreased, naturally improving both the compression and speed-up ratios. Experiments on benchmark deep CNN models suggest the superiority of the proposed algorithm over the state-of-the-art compression methods, e.g. combined with the parameter refining approach, we can reduce the storage requirement and the floating-point multiplications of ResNet-50 by a factor of 14.64x and 5.19x, respectively, without affecting its accuracy.
Article
The tracking-by-detection framework consists of two stages, i.e., drawing samples around the target object in the first stage and classifying each sample as the target object or as background in the second stage. The performance of existing trackers using deep classification networks is limited by two aspects. First, the positive samples in each frame are highly spatially overlapped, and they fail to capture rich appearance variations. Second, there exists extreme class imbalance between positive and negative samples. This paper presents the VITAL algorithm to address these two problems via adversarial learning. To augment positive samples, we use a generative network to randomly generate masks, which are applied to adaptively dropout input features to capture a variety of appearance changes. With the use of adversarial learning, our network identifies the mask that maintains the most robust features of the target objects over a long temporal span. In addition, to handle the issue of class imbalance, we propose a high-order cost sensitive loss to decrease the effect of easy negative samples to facilitate training the classification network. Extensive experiments on benchmark datasets demonstrate that the proposed tracker performs favorably against state-of-the-art approaches.
Conference Paper
Generative Adversarial Networks (GANs) excel at creating realistic images with complex models for which maximum likelihood is infeasible. However, the convergence of GAN training has still not been proved. We propose a two time-scale update rule (TTUR) for training GANs with stochastic gradient descent on arbitrary GAN loss functions. TTUR has an individual learning rate for both the discriminator and the generator. Using the theory of stochastic approximation, we prove that the TTUR converges under mild assumptions to a stationary local Nash equilibrium. The convergence carries over to the popular Adam optimization, for which we prove that it follows the dynamics of a heavy ball with friction and thus prefers flat minima in the objective landscape. For the evaluation of the performance of GANs at image generation, we introduce the "Fréchet Inception Distance" (FID) which captures the similarity of generated images to real ones better than the Inception Score. In experiments, TTUR improves learning for DCGANs and Improved Wasserstein GANs (WGAN-GP) outperforming conventional GAN training on CelebA, CIFAR-10, SVHN, LSUN Bedrooms, and the One Billion Word Benchmark. https://papers.nips.cc/paper/7240-gans-trained-by-a-two-time-scale-update-rule-converge-to-a-local-nash-equilibrium
Article
A Triangle Generative Adversarial Network ($\Delta$-GAN) is developed for semi-supervised cross-domain joint distribution matching, where the training data consists of samples from each domain, and supervision of domain correspondence is provided by only a few paired samples. $\Delta$-GAN consists of four neural networks, two generators and two discriminators. The generators are designed to learn the two-way conditional distributions between the two domains, while the discriminators implicitly define a ternary discriminative function, which is trained to distinguish real data pairs and two kinds of fake data pairs. The generators and discriminators are trained together using adversarial learning. Under mild assumptions, in theory the joint distributions characterized by the two generators concentrate to the data distribution. In experiments, three different kinds of domain pairs are considered, image-label, image-image and image-attribute pairs. Experiments on semi-supervised image classification, image-to-image translation and attribute-based image generation demonstrate the superiority of the proposed approach.
Article
We propose in this paper a novel approach to tackle the problem of mode collapse encountered in generative adversarial network (GAN). Our idea is intuitive but proven to be very effective, especially in addressing some key limitations of GAN. In essence, it combines the Kullback-Leibler (KL) and reverse KL divergences into a unified objective function, thus it exploits the complementary statistical properties from these divergences to effectively diversify the estimated density in capturing multi-modes. We term our method dual discriminator generative adversarial nets (D2GAN) which, unlike GAN, has two discriminators; and together with a generator, it also has the analogy of a minimax game, wherein one discriminator rewards high scores for samples from the data distribution whilst the other discriminator, conversely, favors data from the generator, and the generator produces data to fool both discriminators. We develop theoretical analysis to show that, given the maximal discriminators, optimizing the generator of D2GAN reduces to minimizing both KL and reverse KL divergences between data distribution and the distribution induced from the data generated by the generator, hence effectively avoiding the mode collapsing problem. We conduct extensive experiments on synthetic and real-world large-scale datasets (MNIST, CIFAR-10, STL-10, ImageNet), where we have made our best effort to compare our D2GAN with the latest state-of-the-art GAN variants in comprehensive qualitative and quantitative evaluations. The experimental results demonstrate the competitive and superior performance of our approach in generating good quality and diverse samples over baselines, and the capability of our method to scale up to the ImageNet database.
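The complementary behaviour of the KL and reverse KL divergences mentioned in this abstract can be seen on a small discrete example. The sketch below is only an illustration, not the D2GAN objective itself: the distributions, the function name, and the unweighted sum are assumptions chosen for clarity.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical data distribution with two dominant modes.
data = [0.49, 0.49, 0.02]
# Hypothetical generator that has collapsed onto the first mode.
collapsed = [0.98, 0.01, 0.01]

forward_kl = kl_divergence(data, collapsed)   # KL(data || model)
reverse_kl = kl_divergence(collapsed, data)   # KL(model || data)

# Illustrative combined objective: an unweighted sum of both directions.
combined = forward_kl + reverse_kl
```

In this toy case the forward KL is the larger of the two, because it heavily penalizes the mode that the generator misses, whereas the reverse KL mainly penalizes mass placed where the data has little. Combining the two gives a signal that reflects both mode coverage and sample quality.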
Conference Paper
One of the challenges in the study of generative adversarial networks is the instability of their training. In this paper, we propose a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator. Our new normalization technique is computationally light and easy to incorporate into existing implementations. We tested the efficacy of spectral normalization on the CIFAR10, STL-10, and ILSVRC2012 datasets, and we experimentally confirmed that spectrally normalized GANs (SN-GANs) are capable of generating images of better or equal quality relative to the previous training stabilization techniques.
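As background for this abstract, spectral normalization rescales a weight matrix by its largest singular value, which is commonly estimated by power iteration. The sketch below shows that idea in plain Python under stated assumptions: the helper names, the iteration count, and the toy matrix are hypothetical, and real implementations typically keep the estimate between training steps rather than recomputing it from scratch.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def normalize(vector: List[float]) -> List[float]:
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def spectral_norm(weights: Matrix, iterations: int = 50, seed: int = 0) -> float:
    """Estimate the largest singular value of `weights` by power iteration."""
    rng = random.Random(seed)
    transposed = [list(column) for column in zip(*weights)]
    u = normalize([rng.gauss(0.0, 1.0) for _ in weights])
    v = normalize(matvec(transposed, u))
    for _ in range(iterations):
        u = normalize(matvec(weights, v))
        v = normalize(matvec(transposed, u))
    # sigma is approximately u^T W v once u and v have converged.
    return sum(ui * wi for ui, wi in zip(u, matvec(weights, v)))


def spectrally_normalize(weights: Matrix) -> Matrix:
    """Divide every entry by the estimated spectral norm."""
    sigma = spectral_norm(weights)
    return [[w / sigma for w in row] for row in weights]


toy_weights = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm(toy_weights)
normalized = spectrally_normalize(toy_weights)
```

After normalization the largest singular value of the matrix is approximately one, which bounds how much a linear layer can stretch its input.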
Article
Support Vector Machines are among the most powerful learning algorithms for classification tasks. However, these algorithms require a high computational cost during the training phase, which can limit their application on large-scale datasets. Moreover, it is known that their effectiveness highly depends on the hyper-parameters used to train the model. With the intention of dealing with these, this paper introduces an Evolutionary Multi-Objective Model and Instance Selection approach for support vector machines with Pareto-based Ensemble, whose goals are, precisely, to optimize the size of the training set and the classification performance attained by the selection of the instances, which can be done using either a wrapper or a filter approach. Due to the nature of multi-objective evolutionary algorithms, several Pareto optimal solutions can be found. We study several ways of using such information to perform a classification task. To accomplish this, our proposal performs a processing over the Pareto solutions in order to combine them into a single ensemble. This is done in five different ways, which are based on a global Pareto ensemble, error reduction, a complementary error reduction, maximized margin distance and boosting. Through a comprehensive experimental study we evaluate the suitability of the proposed approach and the Pareto processing, and we show its advantages over a single-objective formulation, traditional instance selection techniques and learning algorithms.
Article
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain $X$ to a target domain $Y$ in the absence of paired examples. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping $F: Y \rightarrow X$ and introduce a cycle consistency loss to push $F(G(X)) \approx X$ (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
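The cycle consistency idea in this abstract can be written out as a loss. The abstract itself does not state the norm, so the form below is a sketch that assumes the commonly used L1 reconstruction penalty. The symbols $x$, $y$, $p_{\text{data}}$ and $\mathcal{L}_{\text{cyc}}$ are introduced here for illustration, while $G$ and $F$ are the two mappings named in the abstract.

```latex
\mathcal{L}_{\text{cyc}}(G, F)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
  + \mathbb{E}_{y \sim p_{\text{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]
```

The first term enforces $F(G(x)) \approx x$ for source images and the second enforces the reverse direction, $G(F(y)) \approx y$, for target images.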
Article
Divide-and-Conquer (DC) is conceptually well suited to deal with high-dimensional optimization problems by decomposing the original problem into multiple low-dimensional sub-problems, and tackling them separately. Nevertheless, the dimensionality mismatch between the original problem and sub-problems makes it non-trivial to precisely assess the quality of a candidate solution to a sub-problem, which has been a major hurdle for applying the idea of DC to non-separable high-dimensional optimization problems. In this paper, we suggest that searching for a good solution to a sub-problem can be viewed as a computationally expensive problem and can be addressed with the aid of meta-models. As a result, a novel approach, namely Self-Evaluation Evolution (SEE), is proposed. Empirical studies have shown that the advantages of SEE over 4 representative compared algorithms increase with the problem size on the CEC2010 large scale global optimization benchmark. The weakness of SEE is also analysed in the empirical studies.
Article
With benefits of low storage cost and fast query speed, cross-modal hashing has received considerable attention recently. However, almost all existing methods on cross-modal hashing cannot obtain powerful hash codes due to directly utilizing hand-crafted features or ignoring heterogeneous correlations across different modalities, which will greatly degrade the retrieval performance. In this paper, we propose a novel deep cross-modal hashing method to generate compact hash codes through an end-to-end deep learning architecture, which can effectively capture the intrinsic relationships between various modalities. Our architecture integrates different types of pairwise constraints to encourage the similarities of the hash codes from an intra-modal view and an inter-modal view, respectively. Moreover, additional decorrelation constraints are introduced to this architecture, thus enhancing the discriminative ability of each hash bit. Extensive experiments show that our proposed method yields state-of-the-art results on two cross-modal retrieval datasets.
Article
Due to its strong representation learning ability and its facilitation of joint learning for representation and hash codes, deep learning-to-hash has achieved promising results and is becoming increasingly popular for the large-scale approximate nearest neighbor search. However, recent studies highlight the vulnerability of deep image classifiers to adversarial examples; this also introduces profound security concerns for deep retrieval systems. Accordingly, in order to study the robustness of modern deep hashing models to adversarial perturbations, we propose hash adversary generation (HAG), a novel method of crafting adversarial examples for Hamming space search. The main goal of HAG is to generate imperceptibly perturbed examples as queries, whose nearest neighbors from a targeted hashing model are semantically irrelevant to the original queries. Extensive experiments prove that HAG can successfully craft adversarial examples with small perturbations to mislead targeted hashing models. The transferability of these perturbations under a variety of settings is also verified. Moreover, by combining heterogeneous perturbations, we further provide a simple yet effective method of constructing adversarial examples for black-box attacks.
Chapter
The recent years have witnessed significant growth in constructing robust generative models to capture informative distributions of natural data. However, it is difficult to fully exploit the distribution of complex data, like images and videos, due to the high dimensionality of ambient space. Consequently, how to effectively guide the training of generative models is a crucial issue. In this paper, we present a subspace-based generative adversarial network (Sub-GAN) which simultaneously disentangles multiple latent subspaces and generates diverse samples correspondingly. Since high-dimensional natural data usually lies on a union of low-dimensional subspaces which contain semantically extensive structure, Sub-GAN incorporates a novel clusterer that can interact with the generator and discriminator via subspace information. Unlike traditional generative models, the proposed Sub-GAN can control the diversity of the generated samples via the multiplicity of the learned subspaces. Moreover, Sub-GAN works in an unsupervised fashion to explore not only the visual classes but also the latent continuous attributes. We demonstrate that our model can discover meaningful visual attributes that are hard to annotate via strong supervision, e.g., the writing style of digits, and thus avoids the mode collapse problem. Extensive experimental results show the competitive performance of the proposed method for both generating diverse images with satisfactory quality and discovering discriminative latent subspaces.
Article
Style transfer describes the rendering of an image's semantic content as different artistic styles. Recently, generative adversarial networks (GANs) have emerged as an effective approach in style transfer by adversarially training the generator to synthesize convincing counterfeits. However, traditional GAN suffers from the mode collapse issue, resulting in unstable training and making style transfer quality difficult to guarantee. In addition, the GAN generator is only compatible with one style, so a series of GANs must be trained to provide users with choices to transfer more than one kind of style. In this paper, we focus on tackling these challenges and limitations to improve style transfer. We propose adversarial gated networks (Gated-GAN) to transfer multiple styles in a single model. The generative networks have three modules: an encoder, a gated transformer, and a decoder. Different styles can be achieved by passing input images through different branches of the gated transformer. To stabilize training, the encoder and decoder are combined as an auto-encoder to reconstruct the input images. The discriminative networks are used to distinguish whether the input image is a stylized or genuine image. An auxiliary classifier is used to recognize the style categories of transferred images, thereby helping the generative networks generate images in multiple styles. In addition, Gated-GAN makes it possible to explore a new style by investigating styles learned from artists or genres. Our extensive experiments demonstrate the stability and effectiveness of the proposed model for multi-style transfer.
Conference Paper
Generative Adversarial Networks (GANs) have become one of the dominant methods for fitting generative models to complicated real-life data, and even found unusual uses such as designing good cryptographic primitives. In this talk, we will first introduce the basics of GANs and then discuss the fundamental statistical question about GANs: assuming the training can succeed with polynomial samples, can we have any statistical guarantees for the estimated distributions? In the work with Arora, Ge, Liang, and Zhang, we suggested a dilemma: powerful discriminators cause overfitting, whereas weak discriminators cannot detect mode collapse. Such a conundrum may be solved or alleviated by designing a discriminator class with strong distinguishing power against the particular generator class (instead of against all possible generators).
Article
Neural text generation models are often autoregressive language models or seq2seq models. These models generate text by sampling words sequentially, with each word conditioned on the previous word, and are state-of-the-art for several machine translation and summarization benchmarks. These benchmarks are often defined by validation perplexity even though this is not a direct measure of the quality of the generated text. Additionally, these models are typically trained via maximum likelihood and teacher forcing. These methods are well-suited to optimizing perplexity but can result in poor sample quality since generating text requires conditioning on sequences of words that may have never been observed at training time. We propose to improve sample quality using Generative Adversarial Networks (GANs), which explicitly train the generator to produce high quality samples and have shown a lot of success in image generation. GANs were originally designed to output differentiable values, so discrete language generation is challenging for them. We claim that validation perplexity alone is not indicative of the quality of text generated by a model. We introduce an actor-critic conditional GAN that fills in missing text conditioned on the surrounding context. We show, qualitatively and quantitatively, evidence that this produces more realistic conditional and unconditional text samples compared to a maximum likelihood trained model.
Article
The use of multiple features has been shown to be an effective strategy for visual tracking because of their complementary contributions to appearance modeling. The key problem is how to learn a fused representation from multiple features for appearance modeling. Different features extracted from the same object should share some commonalities in their representations while each feature should also have some feature-specific representation patterns which reflect its complementarity in appearance modeling. Different from existing multi-feature sparse trackers which only consider the commonalities among the sparsity patterns of multiple features, this paper proposes a novel multiple sparse representation framework for visual tracking which jointly exploits the shared and feature-specific properties of different features by decomposing multiple sparsity patterns. Moreover, we introduce a novel online multiple metric learning to efficiently and adaptively incorporate the appearance proximity constraint, which ensures that the learned commonalities of multiple features are more representative. Experimental results on tracking benchmark videos and other challenging videos demonstrate the effectiveness of the proposed tracker.
Article
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024^2. We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CelebA dataset.
Article
In the last 20 years, Evolutionary Algorithms (EAs) have been shown to be an effective method to solve Multi-objective Optimization Problems (MOPs). Due to their population-based nature, Multi-objective Evolutionary Algorithms (MOEAs) are able to generate a set of trade-off solutions (called nondominated solutions) in a single algorithmic execution instead of having to perform a series of independent executions, as normally done with mathematical programming techniques. Additionally, MOEAs can be successfully applied to problems with difficult features such as multifrontality, discontinuity and disjoint feasible regions, among others. On the other hand, Coevolutionary algorithms (CAs) are extensions of traditional evolutionary algorithms (EAs) which have become the subject of numerous studies in the last few years, particularly for dealing with large scale global optimization problems. CAs have also been applied to the solution of MOPs, motivating the development of new algorithmic and analytical formulations that have advanced the state of the art in coevolutionary algorithms research, while simultaneously opening a new research path within MOEAs. This paper presents a critical review of the most representative Coevolutionary MOEAs (CMOEAs) that have been reported in the specialized literature. This survey includes a taxonomy of approaches together with a brief description of their main features. In the final part of the paper, we also identify what we believe to be promising areas of future research in the field of CMOEAs.
Article
Generative adversarial networks (GANs) are a family of generative models that do not minimize a single training criterion. Unlike other generative models, the data distribution is learned via a game between a generator (the generative model) and a discriminator (a teacher providing training signal) that each minimize their own cost. GANs are designed to reach a Nash equilibrium at which each player cannot reduce their cost without changing the other players' parameters. One useful approach for the theory of GANs is to show that a divergence between the training distribution and the model distribution obtains its minimum value at equilibrium. Several recent research directions have been motivated by the idea that this divergence is the primary guide for the learning process and that every step of learning should decrease the divergence. We show that this view is overly restrictive. During GAN training, the discriminator provides learning signal in situations where the gradients of the divergences between distributions would not be useful. We provide empirical counterexamples to the view of GAN training as divergence minimization. Specifically, we demonstrate that GANs are able to learn distributions in situations where the divergence minimization point of view predicts they would fail. We also show that gradient penalties motivated from the divergence minimization perspective are equally helpful when applied in other contexts in which the divergence minimization perspective does not predict they would be helpful. This contributes to a growing body of evidence that GAN training may be more usefully viewed as approaching Nash equilibria via trajectories that do not necessarily minimize a specific divergence at each step.
Article
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
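The minimax game described above is usually written with the value function V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))], for which the best discriminator against a fixed generator is D*(x) = p_data(x) / (p_data(x) + p_g(x)). As a minimal sketch, assuming small discrete distributions chosen only for illustration, the following checks that D equals 1/2 everywhere and the value equals -log 4 when the generator matches the data.

```python
import math


def optimal_discriminator(p_data, p_g):
    """Best response D*(x) = p_data(x) / (p_data(x) + p_g(x)) for a fixed generator."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]


def value(p_data, p_g, d):
    """V(D, G) = E_{x~p_data}[log D(x)] + E_{x~p_g}[log(1 - D(x))] on a discrete support."""
    return sum(
        pd * math.log(dx) + pg * math.log(1.0 - dx)
        for pd, pg, dx in zip(p_data, p_g, d)
    )


# Illustrative distributions over three outcomes (not from the paper).
p_data = [0.5, 0.3, 0.2]
p_mismatched = [0.2, 0.3, 0.5]

d_matched = optimal_discriminator(p_data, p_data)
v_matched = value(p_data, p_data, d_matched)

d_mismatched = optimal_discriminator(p_data, p_mismatched)
v_mismatched = value(p_data, p_mismatched, d_mismatched)
```

Against an optimal discriminator, any mismatch between the generator and the data raises the value above -log 4, so the generator minimizes it by recovering the data distribution.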
Conference Paper
We introduce the "Energy-based Generative Adversarial Network" (EBGAN) model which views the discriminator in the GAN framework as an energy function that associates low energies with the regions near the data manifold and higher energies everywhere else. Similar to the probabilistic GANs, a generator is trained to produce contrastive samples with minimal energies, while the energy function is trained to assign high energies to those generated samples. Viewing the discriminator as an energy function allows the use of a wide variety of architectures and loss functionals in addition to the usual binary discriminant network. Among them, an instantiation of EBGANs is to use an auto-encoder architecture alongside the energy being the reconstruction error. We show that this form of EBGAN exhibits more stable behavior than regular GANs during training. We also show that a single-scale architecture can be trained to generate high-resolution images.
Conference Paper
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.
Article
The problem of maximizing monotone k-submodular functions under a size constraint arises in many applications, and it is NP-hard. In this paper, we propose a new approach which employs a multiobjective evolutionary algorithm to maximize the given objective and minimize the size simultaneously. For general cases, we prove that the proposed method can obtain the asymptotically tight approximation guarantee, which was also achieved by the greedy algorithm. Moreover, we further give instances where the proposed approach performs better than the greedy algorithm on applications of influence maximization, information coverage maximization, and sensor placement. Experimental results on real-world data sets exhibit the superior performance of the proposed approach.
Article
Combinatorial testing can efficiently test software that has various configurations for multiple parameters. This method is based on a set of test cases that guarantee a certain level of interaction among parameters. A mixed covering array can be used to represent a test suite. Each row of the array corresponds to a test case. In general, a smaller mixed covering array does not necessarily imply less testing time. There are certain combinations of parameter values which would take much longer than other cases. Based on this observation, it is more valuable to construct mixed covering arrays that are better in terms of testing effort rather than size. We present a method to find cost-aware mixed covering arrays. The method contains two steps. First, a simulated annealing algorithm is used to get a mixed covering array with a small size. Then we propose a novel nested differential evolution algorithm to improve the solution with respect to its testing effort. The experimental results indicate that our method succeeds in constructing cost-aware mixed covering arrays for real-world applications. The testing effort is significantly reduced compared with representative state-of-the-art algorithms.
Conference Paper
In this paper, we propose a principled Tag Disentangled Generative Adversarial Networks (TD-GAN) for re-rendering new images for the object of interest from a single image of it by specifying multiple scene properties (such as viewpoint, illumination, expression, etc.). The whole framework consists of a disentangling network, a generative network, a tag mapping net, and a discriminative network, which are trained jointly based on a given set of images that are completely/partially tagged (i.e., supervised/semi-supervised setting). Given an input image, the disentangling network extracts disentangled and interpretable representations, which are then used to generate images by the generative network. In order to boost the quality of disentangled representations, the tag mapping net is integrated to explore the consistency between the image and its tags. Furthermore, the discriminative network is introduced to implement the adversarial training strategy for generating more realistic images. Experiments on two challenging datasets demonstrate the state-of-the-art performance of the proposed framework in the problem of interest.
Conference Paper
Hashing has been a widely-adopted technique for nearest neighbor search in large-scale image retrieval tasks. Recent research has shown that leveraging supervised information can lead to high quality hashing. However, the cost of annotating data is often an obstacle when applying supervised hashing to a new domain. Moreover, the results can suffer from the robustness problem as the data at training and test stage may come from different distributions. This paper studies the exploration of generating synthetic data through semi-supervised generative adversarial networks (GANs), which leverage largely unlabeled and limited labeled training data to produce highly compelling data with intrinsic invariance and global coherence, for better understanding statistical structures of natural data. We demonstrate that the above two limitations can be well mitigated by applying the synthetic data for hashing. Specifically, a novel deep semantic hashing with GANs (DSH-GANs) is presented, which mainly consists of four components: a deep convolutional neural network (CNN) for learning image representations, an adversary stream to distinguish synthetic images from real ones, a hash stream for encoding image representations to hash codes and a classification stream. The whole architecture is trained end-to-end by jointly optimizing three losses, i.e., adversarial loss to correct label of synthetic or real for each sample, triplet ranking loss to preserve the relative similarity ordering in the input real-synthetic triplets and classification loss to classify each sample accurately. Extensive experiments conducted on both CIFAR-10 and NUS-WIDE image benchmarks validate the capability of exploiting synthetic images for hashing. Our framework also achieves superior results when compared to state-of-the-art deep hash models.
Article
An effective allocation of search effort is important in multi-objective optimization, particularly in many-objective optimization problems. This paper presents a new adaptive search effort allocation strategy for MOEA/D-M2M, a recent MOEA/D algorithm for challenging Many-Objective Optimization Problems (MaOPs). The proposed method adaptively adjusts the subregions of its subproblems by detecting the importance of different objectives. More specifically, it periodically resets the subregion setting based on the distribution of the current solutions in the objective space such that the search effort is not wasted on unpromising regions. The basic idea is that the current population can be regarded as an approximation to the Pareto front (PF) and thus one can implicitly estimate the shape of the PF and such estimation can be used for adjusting the search focus. The performance of the proposed algorithm has been verified by comparing it with eight representative and competitive algorithms on a set of degenerated many-objective optimization problems with disconnected and connected PFs. The performance of the proposed algorithm on a number of non-degenerated test instances with connected and disconnected PFs is also studied.
Article
Despite their growing prominence, optimization in generative adversarial networks (GANs) is still a poorly-understood topic. In this paper, we analyze the "gradient descent" form of GAN optimization (i.e., the natural setting where we simultaneously take small gradient steps in both generator and discriminator parameters). We show that even though GAN optimization does not correspond to a convex-concave game, even for simple parameterizations, under proper conditions, equilibrium points of this optimization procedure are still locally asymptotically stable for the traditional GAN formulation. On the other hand, we show that the recently-proposed Wasserstein GAN can have non-convergent limit cycles near equilibrium. Motivated by this stability analysis, we propose an additional regularization term for gradient descent GAN updates, which is able to guarantee local stability for both the WGAN and for the traditional GAN, and also shows practical promise in speeding up convergence and addressing mode collapse.
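To see why the stability of simultaneous gradient steps is not automatic, consider a toy bilinear game V(x, y) = x * y in which x descends and y ascends. This game is an assumption chosen for illustration and is not claimed to be the paper's construction; the sketch only shows that simultaneous updates can move away from the equilibrium at (0, 0).

```python
import math


def simultaneous_steps(x, y, lr, steps):
    """Simultaneous gradient descent on x and ascent on y for V(x, y) = x * y."""
    for _ in range(steps):
        grad_x = y  # dV/dx
        grad_y = x  # dV/dy
        # Both players update from the same old point.
        x, y = x - lr * grad_x, y + lr * grad_y
    return x, y


lr, steps = 0.1, 100
x0, y0 = 1.0, 0.0
x_final, y_final = simultaneous_steps(x0, y0, lr, steps)

initial_norm = math.hypot(x0, y0)
final_norm = math.hypot(x_final, y_final)
# Each step multiplies the squared distance to the origin by (1 + lr**2).
expected_norm = initial_norm * (1.0 + lr * lr) ** (steps / 2)
```

Because the distance to the equilibrium grows at every step, the iterates spiral outward. Behaviour of this kind is what motivates adding a regularization term to the updates.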
Article
We present a novel training framework for neural sequence models, particularly for grounded dialog generation. The standard training paradigm for these models is maximum likelihood estimation (MLE), or minimizing the cross-entropy of the human responses. Across a variety of domains, a recurring problem with MLE trained generative neural dialog models (G) is that they tend to produce 'safe' and generic responses ("I don't know", "I can't tell"). In contrast, discriminative dialog models (D) that are trained to rank a list of candidate human responses outperform their generative counterparts in terms of automatic metrics, diversity, and informativeness of the responses. However, D is not useful in practice since it cannot be deployed to have real conversations with users. Our work aims to achieve the best of both worlds -- the practical usefulness of G and the strong performance of D -- via knowledge transfer from D to G. Our primary contribution is an end-to-end trainable generative visual dialog model, where G receives gradients from D as a perceptual (not adversarial) loss of the sequence sampled from G. We leverage the recently proposed Gumbel-Softmax (GS) approximation to the discrete distribution -- specifically, an RNN augmented with a sequence of GS samplers, coupled with the straight-through gradient estimator to enable end-to-end differentiability. We also introduce a stronger encoder for visual dialog, and employ a self-attention mechanism for answer encoding along with a metric learning loss to aid D in better capturing semantic similarities in answer responses. Overall, our proposed model outperforms state-of-the-art on the VisDial dataset by a significant margin (2.67% on recall@10).
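The Gumbel-Softmax sampler mentioned above can be sketched in a few lines. This is a forward-pass illustration only, under the assumption of a single categorical variable with made-up logits; the straight-through estimator additionally needs automatic differentiation, which the standard library does not provide, so the gradient path is described in comments rather than implemented.

```python
import math
import random


def gumbel_softmax_sample(logits, tau, rng):
    """Return (soft, hard): a relaxed sample and its one-hot argmax."""
    # Gumbel(0, 1) noise: g = -log(-log(u)), with u kept away from 0.
    gumbels = [-math.log(-math.log(max(rng.random(), 1e-12))) for _ in logits]
    scores = [(l + g) / tau for l, g in zip(logits, gumbels)]
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    soft = [e / total for e in exps]
    # Straight-through idea: use `hard` in the forward pass, and let gradients
    # flow through `soft` in the backward pass (not implemented here).
    best = soft.index(max(soft))
    hard = [1 if i == best else 0 for i in range(len(soft))]
    return soft, hard


soft, hard = gumbel_softmax_sample([2.0, 0.5, -1.0], tau=0.5, rng=random.Random(0))
```

Lower values of `tau` push `soft` closer to a one-hot vector, at the cost of higher-variance gradients in a real training setup.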
Article
Training generative adversarial networks is unstable in high-dimensions when the true data distribution lies on a lower-dimensional manifold. The discriminator is then easily able to separate nearly all generated samples leaving the generator without meaningful gradients. We propose training a single generator simultaneously against an array of discriminators, each of which looks at a different random low-dimensional projection of the data. We show that individual discriminators then provide stable gradients to the generator, and that the generator learns to produce samples consistent with the full data distribution to satisfy all discriminators. We demonstrate the practical utility of this approach experimentally, and show that it is able to produce image samples with higher quality than traditional training with a single discriminator.
Book
This book introduces numerous algorithmic hybridizations between both worlds that show how machine learning can improve and support evolution strategies. The set of methods comprises covariance matrix estimation, meta-modeling of fitness and constraint functions, dimensionality reduction for search and visualization of high-dimensional optimization processes, and clustering-based niching. After giving an introduction to evolution strategies and machine learning, the book builds the bridge between both worlds with an algorithmic and experimental perspective. Experiments mostly employ a (1+1)-ES and are implemented in Python using the machine learning library scikit-learn. The examples are conducted on typical benchmark problems illustrating algorithmic concepts and their experimental behavior. The book closes with a discussion of related lines of research.
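For readers unfamiliar with the (1+1)-ES used in those experiments, the following is a minimal sketch using only the Python standard library. The sphere objective, fixed step size, and parameter values are assumptions made for illustration; the book's own experiments rely on scikit-learn and typically on step-size adaptation, neither of which is shown here.

```python
import random


def sphere(x):
    """A standard benchmark objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


def one_plus_one_es(f, x0, sigma=0.5, iterations=200, seed=0):
    """(1+1)-ES: one parent, one Gaussian-mutated child, keep the better of the two."""
    rng = random.Random(seed)
    parent, parent_f = list(x0), f(x0)
    history = [parent_f]
    for _ in range(iterations):
        child = [v + sigma * rng.gauss(0.0, 1.0) for v in parent]
        child_f = f(child)
        if child_f <= parent_f:  # elitist "plus" selection
            parent, parent_f = child, child_f
        history.append(parent_f)
    return parent, parent_f, history


start = [3.0, -2.0, 1.0]
best_x, best_f, history = one_plus_one_es(sphere, start)
```

Because selection is elitist, the recorded fitness never gets worse from one iteration to the next.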
Article
Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes significant progress toward stable training of GANs, but can still generate low-quality samples or fail to converge in some settings. We find that these training failures are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to pathological behavior. We propose an alternative method for enforcing the Lipschitz constraint: instead of clipping weights, penalize the norm of the gradient of the critic with respect to its input. Our proposed method converges faster and generates higher-quality samples than WGAN with weight clipping. Finally, our method enables very stable GAN training: for the first time, we can train a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models over discrete data.
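The gradient penalty can be sketched as follows. The abstract only states that the norm of the critic's input gradient is penalized; evaluating it at random interpolations between real and generated samples, pushing the norm toward 1, and weighting by lambda = 10 follow the commonly cited form of the method and should be read as assumptions here. A finite-difference gradient stands in for automatic differentiation, and the linear critics are hypothetical.

```python
import math
import random


def numerical_gradient(f, x, h=1e-5):
    """Central finite-difference gradient of a scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def gradient_penalty(critic, real_batch, fake_batch, lam=10.0, seed=0):
    """lam * mean((||grad critic(x_hat)|| - 1)^2) over interpolated points x_hat."""
    rng = random.Random(seed)
    total = 0.0
    for real, fake in zip(real_batch, fake_batch):
        eps = rng.random()
        x_hat = [eps * r + (1.0 - eps) * g for r, g in zip(real, fake)]
        grad = numerical_gradient(critic, x_hat)
        norm = math.sqrt(sum(g * g for g in grad))
        total += (norm - 1.0) ** 2
    return lam * total / len(real_batch)


def steep_critic(x):
    """Hypothetical linear critic with input-gradient norm 5."""
    return 3.0 * x[0] + 4.0 * x[1]


def unit_critic(x):
    """Hypothetical linear critic with input-gradient norm 1."""
    return 0.6 * x[0] + 0.8 * x[1]


real_batch = [[1.0, 2.0], [0.5, -1.0]]
fake_batch = [[0.0, 0.0], [2.0, 1.0]]
penalty_steep = gradient_penalty(steep_critic, real_batch, fake_batch)
penalty_unit = gradient_penalty(unit_critic, real_batch, fake_batch)
```

The steep critic is penalized by 10 * (5 - 1)^2 = 160, while the critic whose gradient norm is already 1 incurs essentially no penalty.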
Article
We explore the use of Evolution Strategies, a class of black box optimization algorithms, as an alternative to popular RL techniques such as Q-learning and Policy Gradients. Experiments on MuJoCo and Atari show that ES is a viable solution strategy that scales extremely well with the number of CPUs available: By using hundreds to thousands of parallel workers, ES can solve 3D humanoid walking in 10 minutes and obtain competitive results on most Atari games after one hour of training time. In addition, we highlight several advantages of ES as a black box optimization technique: it is invariant to action frequency and delayed rewards, tolerant of extremely long horizons, and does not need temporal discounting or value function approximation.
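As a rough illustration of the black-box character of ES, the sketch below estimates a search direction purely from reward evaluations at randomly perturbed parameters. The quadratic reward, the use of mirrored perturbation pairs, and all hyperparameter values are assumptions for this example; nothing here reproduces the MuJoCo or Atari setups, and the loop over pairs is the part that would be distributed across workers.

```python
import random


def evolution_strategy(reward, theta, sigma=0.1, lr=0.05, pairs=20,
                       iterations=200, seed=0):
    """Ascend a reward using only function evaluations at perturbed parameters."""
    rng = random.Random(seed)
    theta = list(theta)
    dim = len(theta)
    for _ in range(iterations):
        grad = [0.0] * dim
        for _ in range(pairs):
            eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            # Mirrored pair: evaluate the reward at theta + sigma*eps and theta - sigma*eps.
            plus = reward([t + sigma * e for t, e in zip(theta, eps)])
            minus = reward([t - sigma * e for t, e in zip(theta, eps)])
            for i in range(dim):
                grad[i] += (plus - minus) * eps[i]
        # Average the estimates and take a small step in the estimated direction.
        theta = [t + lr * g / (2.0 * pairs * sigma) for t, g in zip(theta, grad)]
    return theta


target = [1.0, -2.0, 0.5]


def toy_reward(theta):
    """Hypothetical reward: negative squared distance to a fixed target."""
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


start = [0.0, 0.0, 0.0]
solution = evolution_strategy(toy_reward, start)
```

No gradient of `toy_reward` is ever computed, which is why the same loop applies to rewards that are delayed, non-differentiable, or produced by a simulator.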