... However, other desirable characteristics were not considered for optimization in this work. In [12], the authors use properties such as molecular weight and sequence similarity score, among others, to measure the bioactivity of a peptide using the non-dominated sorting genetic algorithm-II (NSGA-II) [13]. However, that work determines the desirability of a peptide from features that were selected at random before being optimized. ...
... When the ideal number is unknown, the proposed process of variable-length chromosomes is preferable, especially if computational resources are limited. Genetic algorithms and GANs are an interesting field for hyperparameter optimisation, as shown above and in the additional research in [12], [13], [14], [15], [16]. Because Pix2Pix has a variable length when training the generator, due to the skip connections described in Section 3, each training cycle uses a different number of neurons, which leads to the variable length. The extension of the genetic algorithm described above would be interesting to test with Pix2Pix in future work. ...
Generative models and their possible applications are almost limitless, but such models still have problems. On the one hand, they are difficult to train: instability in training, mode collapse, and non-convergence, together with the huge parameter space, make generative models extremely costly and difficult to train and optimize. This paper proposes an optimization method, limited to a few hyperparameters, that combines grid search with early stopping and selects the best hyperparameter combination based on the Universal Image Quality Index (UIQ), computed by creating a copy of the source image and comparing it with the generated target. The proposed method makes it possible to measure the impact of hyperparameter tuning directly, by comparing the achieved UIQ score against a baseline.
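As a rough illustration of the selection loop this abstract describes, the sketch below grid-searches a few hyperparameters and scores each run with UIQ. Here `train_pix2pix` is a hypothetical helper standing in for a full training run, and the grid values are assumptions, not the paper's.

```python
# Minimal sketch: grid search over Pix2Pix hyperparameters, scored with UIQ.
import itertools
import numpy as np

def uiq(x: np.ndarray, y: np.ndarray) -> float:
    """Universal Image Quality Index (Wang & Bovik) computed over whole images."""
    x, y = x.astype(np.float64).ravel(), y.astype(np.float64).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return (4 * cov * mx * my) / ((vx + vy) * (mx**2 + my**2))

grid = {                      # assumed search values, for illustration only
    "lr": [1e-4, 2e-4],
    "batch_size": [1, 4],
    "lambda_l1": [50, 100],
}
best_score, best_params = -np.inf, None
for combo in itertools.product(*grid.values()):
    params = dict(zip(grid.keys(), combo))
    generated, source_copy = train_pix2pix(**params)  # hypothetical helper
    score = uiq(source_copy, generated)               # compare with the source copy
    if score > best_score:
        best_score, best_params = score, params
print("best UIQ:", best_score, "with", best_params)
```

For identical images the index evaluates to 1, so higher scores indicate generated targets closer to the source copy.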
... A stopping criterion halts the search once the criterion is met or the generations are no longer improving. There are several ways to combine genetic algorithms and GANs for hyperparameter optimisation, as shown in the additional research in [22], [23], [24], [25], [26], [27]. In terms of performance, grid search and genetic algorithms took nearly the same time to find results of equal accuracy, and both took roughly twice as long as random search to achieve the same accuracy on the CIFAR-10 dataset. ...
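A minimal sketch of such a stopping criterion, assuming a toy numeric genome and a stand-in fitness function; `patience` and all other values are illustrative.

```python
# GA loop that stops once the best fitness has stagnated for `patience` generations.
import random

def fitness(genome):                      # stand-in objective (maximise)
    return -sum((g - 0.5) ** 2 for g in genome)

def mutate(genome, rate=0.1):
    return [g + random.gauss(0, 0.1) if random.random() < rate else g for g in genome]

pop = [[random.random() for _ in range(8)] for _ in range(20)]
best_score, stale, patience = float("-inf"), 0, 5

for generation in range(200):
    pop.sort(key=fitness, reverse=True)
    top = fitness(pop[0])
    if top > best_score:
        best_score, stale = top, 0
    else:
        stale += 1
    if stale >= patience:                 # generations no longer improving: stop
        break
    parents = pop[:10]                    # truncation selection
    pop = parents + [mutate(random.choice(parents)) for _ in range(10)]

print("stopped at generation", generation, "best fitness", best_score)
```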
Hyperparameter tuning is an important aspect of machine learning, especially for deep generative models. Tuning models to stabilize training and obtain the best accuracy can be a protracted process. Generative models have a large search space, and exploring it requires resources and knowledge to find the best parameters. Therefore, in most cases the search space is reduced and the parameters are limited to a selected few in order to save time and computation. This paper explores three different strategies for predicting high-impact hyperparameters for Pix2Pix. The results show that binary classification and regression achieve good results and reliably predict good hyperparameter combinations.
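A hedged sketch of how the classification and regression strategies might look: learn a model that maps a hyperparameter combination to its expected quality from already-evaluated runs. The features, the quality scores, and the 0.7 "good" threshold are assumptions for illustration, not the paper's setup.

```python
# Predict promising hyperparameter combinations from a few evaluated ones.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 3))        # evaluated combos, e.g. (lr, batch, lambda), scaled
scores = rng.uniform(size=100)        # quality of each finished run (e.g. a UIQ score)

reg = RandomForestRegressor().fit(X, scores)         # regression: predict the score
clf = RandomForestClassifier().fit(X, scores > 0.7)  # binary: "good" vs "bad" combo

candidates = rng.uniform(size=(1000, 3))             # untried combinations
promising = candidates[clf.predict(candidates)]      # keep predicted-good combos
if len(promising):
    ranked = promising[np.argsort(-reg.predict(promising))]  # best predicted first
    print("top candidate:", ranked[0])
```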
This study introduces an augmented Long Short-Term Memory (LSTM) neural network architecture integrating Symbolic Genetic Programming (SGP), with the objective of forecasting cross-sectional price returns across a comprehensive dataset of 4500 listed stocks in the Chinese market over the period from 2014 to 2022. Using the S&P Alpha Pool Dataset for China as basic input, the architecture incorporates data augmentation and feature extraction techniques. The results demonstrate significant improvements in the Rank Information Coefficient (Rank IC) and the IC information ratio (ICIR), by 1128% and 5360% respectively, when applied to fundamental indicators. For technical indicators, the hybrid model achieves a 206% increase in Rank IC and an impressive surge of 2752% in ICIR. Furthermore, the proposed hybrid SGP-LSTM model outperforms major Chinese stock indexes, generating average annualized excess returns of 31.00%, 24.48%, and 16.38% relative to the CSI 300 index, the CSI 500 index, and the average portfolio, respectively. These findings highlight the effectiveness of the SGP-LSTM model in improving the accuracy of cross-sectional stock return predictions and provide valuable insights for fund managers, traders, and financial analysts.
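For readers unfamiliar with the metrics, the sketch below shows one common way to compute Rank IC and ICIR: the cross-sectional Spearman correlation between predicted and realized returns on each date, then its mean divided by its standard deviation over time. The column names are assumptions.

```python
# Rank IC per date and ICIR over the whole period.
import pandas as pd

def rank_ic_series(df: pd.DataFrame) -> pd.Series:
    """df has columns ['date', 'pred', 'ret'], one row per (date, stock)."""
    return df.groupby("date").apply(
        lambda g: g["pred"].corr(g["ret"], method="spearman")
    )

def icir(ic: pd.Series) -> float:
    return ic.mean() / ic.std()
```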
Deep learning has recently achieved great success in many areas due to its strong capacity for data processing. For instance, it has been widely used in financial areas such as stock market prediction, portfolio optimization, financial information processing, and trade execution strategies. Stock market prediction is one of the most popular and valuable areas in finance. In this paper, we propose a novel Generative Adversarial Network (GAN) architecture with a Multi-Layer Perceptron (MLP) as the discriminator and a Long Short-Term Memory (LSTM) network as the generator for forecasting the closing price of stocks. The LSTM generator learns the data distributions of stocks from the given stock-market data and generates data with the same distributions, whereas the MLP discriminator aims to discriminate between real stock data and generated data. We choose daily data on the S&P 500 Index and several stocks over a wide range of trading days and try to predict the daily closing price. Experimental results show that our GAN achieves promising performance in closing price prediction on real data compared with other machine learning and deep learning models.
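A schematic PyTorch sketch of the described generator/discriminator pairing; the layer sizes and the windowing scheme are assumptions, not the paper's configuration.

```python
# LSTM generator (window of past prices -> next price) vs. MLP discriminator.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):            # x: (batch, window, 1) past prices
        out, _ = self.lstm(x)
        return self.head(out[:, -1]) # predicted next closing price

class Discriminator(nn.Module):
    def __init__(self, window):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(window + 1, 72), nn.LeakyReLU(0.2),
            nn.Linear(72, 1), nn.Sigmoid(),
        )

    def forward(self, past, nxt):    # nxt is the real or generated next price
        return self.net(torch.cat([past.squeeze(-1), nxt], dim=1))
```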
Stock price prediction is an important issue in the financial world, as it contributes to the development of effective strategies for stock exchange transactions. In this paper, we propose a generic framework employing a Long Short-Term Memory (LSTM) network and a convolutional neural network (CNN) in adversarial training to forecast the high-frequency stock market. The model takes the publicly available indices provided by trading software as input, avoiding complex financial theory and difficult technical analysis, and thereby offering convenience to ordinary traders without a financial background. Our study simulates the trading mode of an actual trader and uses rolling partitions of the training and testing sets to analyze the effect of the model update cycle on prediction performance. Extensive experiments show that the proposed approach can effectively improve the accuracy of stock price direction prediction and reduce forecast error.
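The rolling partition can be sketched as a generator of index windows, as below; the window lengths and step are assumptions, and the step plays the role of the model update cycle studied in the paper.

```python
# Rolling train/test partition: refit every `step` days on the latest
# `train_len` days, then test on the following `test_len` days.
def rolling_splits(n_days, train_len=250, test_len=20, step=20):
    start = 0
    while start + train_len + test_len <= n_days:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += step

for train_idx, test_idx in rolling_splits(300):
    pass  # fit on train_idx, evaluate on test_idx; smaller step = more frequent retraining
```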
Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does no harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
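For reference, the gating mechanism described above is commonly written as follows (modern notation; the original 1997 formulation has no forget gate, so the cell state accumulates additively, which is precisely what keeps the error flow through the carousel constant):

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell input)}\\
c_t &= c_{t-1} + i_t \odot \tilde{c}_t && \text{(constant error carousel)}\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```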
Convolutional neural networks (CNNs) have achieved remarkable success on many image classification tasks in recent years. However, the performance of CNNs depends heavily on their architectures. For most state-of-the-art CNNs, the architecture is manually designed with expertise in both CNNs and the problem under investigation. It is therefore difficult for users without extensive expertise in CNNs to design optimal architectures for their own image classification problems. In this article, we propose an automatic CNN architecture design method based on genetic algorithms to effectively address image classification tasks. The chief merit of the proposed algorithm lies in its "automatic" character: users need no domain knowledge of CNNs, yet can still obtain a promising CNN architecture for the given images. The proposed algorithm is validated on widely used benchmark image classification datasets against state-of-the-art peer competitors, covering eight manually designed CNNs, seven semi-automatic (automatic plus manual tuning) algorithms, and five automatic CNN architecture design algorithms. The experimental results indicate that the proposed algorithm outperforms the existing automatic CNN architecture design algorithms in terms of classification accuracy, parameter count, and consumed computational resources. It also achieves classification accuracy comparable to the best of the manually designed and semi-automatic CNNs, while consuming fewer computational resources.
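A minimal sketch of the kind of GA loop such a method implies, with variable-length architecture genomes; `train_and_score` is a hypothetical helper that trains a candidate CNN and returns validation accuracy, and the gene pool is invented for illustration.

```python
# GA over variable-length CNN architecture genomes.
import random

LAYER_CHOICES = ["conv3-32", "conv3-64", "conv3-128", "maxpool"]  # assumed gene pool

def random_arch():
    return [random.choice(LAYER_CHOICES) for _ in range(random.randint(3, 8))]

def crossover(a, b):  # one-point crossover; offspring length may differ from parents
    return a[: random.randint(1, len(a) - 1)] + b[random.randint(1, len(b) - 1):]

def mutate(arch):
    arch[random.randrange(len(arch))] = random.choice(LAYER_CHOICES)
    return arch

pop = [random_arch() for _ in range(10)]
for _ in range(5):
    scored = sorted(pop, key=train_and_score, reverse=True)  # hypothetical scorer
    elite = scored[:4]                                       # keep the best architectures
    pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in range(6)]
```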
Financial time-series modeling is a challenging problem: such series exhibit various complex statistical properties, and the mechanism behind the process remains largely unknown. In this paper, a deep-neural-network approach to financial time-series modeling based on generative adversarial networks (GANs) is presented. GANs learn the properties of the data and generate realistic data in a data-driven manner. The GAN model produces time series that recover the statistical properties of financial time series: linear unpredictability, the heavy-tailed price return distribution, volatility clustering, leverage effects, the coarse-fine volatility correlation, and the gain/loss asymmetry.
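Three of the listed stylized facts can be checked in a few lines of NumPy, as sketched below on a stand-in heavy-tailed series; real (or well-generated) price returns would additionally show clearly positive autocorrelation in |returns|, i.e. volatility clustering.

```python
# Quick checks for heavy tails, linear unpredictability, volatility clustering.
import numpy as np

def autocorr(x, lag=1):
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).mean() / x.var()

returns = np.random.default_rng(0).standard_t(df=5, size=10_000)  # stand-in series

excess_kurtosis = ((returns - returns.mean()) ** 4).mean() / returns.var() ** 2 - 3
print("heavy tails (excess kurtosis > 0):", excess_kurtosis)
print("linear unpredictability (autocorr of returns ~ 0):", autocorr(returns))
print("volatility clustering (autocorr of |returns|, positive for real data):",
      autocorr(np.abs(returns)))
```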
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
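The minimax game described here is the standard GAN objective:

```latex
\min_G \max_D V(D, G) =
\mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] +
\mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

At the unique equilibrium the generator's distribution matches the data distribution and the discriminator outputs 1/2 everywhere.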
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state of the art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully connected layers we employed a recently developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
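A schematic PyTorch sketch of the architecture the abstract describes; the channel widths follow the published network, while the strides and padding are filled in as assumptions to make the sketch runnable on 224x224 inputs.

```python
# Five conv layers with interleaved max-pooling, three FC layers, dropout,
# and a 1000-way output (softmax applied by the loss).
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # logits for the 1000 ImageNet classes
)
```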
Many statistical models, and in particular autoregressive-moving average time series models, can be regarded as means of transforming the data to white noise, that is, to an uncorrelated sequence of errors. If the parameters are known exactly, this random sequence can be computed directly from the observations; when this calculation is made with estimates substituted for the true parameter values, the resulting sequence is referred to as the "residuals," which can be regarded as estimates of the errors. If the appropriate model has been chosen, there will be zero autocorrelation in the errors. In checking adequacy of fit it is therefore logical to study the sample autocorrelation function of the residuals. For large samples the residuals from a correctly fitted model resemble very closely the true errors of the process; however, care is needed in interpreting the serial correlations of the residuals. It is shown here that the residual autocorrelations are to a close approximation representable as a singular linear transformation of the autocorrelations of the errors so that they possess a singular normal distribution. Failing to allow for this results in a tendency to overlook evidence of lack of fit. Tests of fit and diagnostic checks are devised which take these facts into account.
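In practice this diagnostic is available off the shelf. A minimal statsmodels sketch, assuming an ARMA(1,1) fit to a stand-in series: `model_df` passes the fitted parameter count so the portmanteau test accounts for the residuals' singular distribution, and `boxpierce=True` additionally reports the original Box-Pierce statistic.

```python
# Fit an ARMA model, then test the residual autocorrelations.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

y = np.random.default_rng(0).standard_normal(500).cumsum()  # stand-in series
fit = ARIMA(y, order=(1, 0, 1)).fit()
print(acorr_ljungbox(fit.resid, lags=[10], boxpierce=True, model_df=2))  # df = p + q
```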
I Goodfellow, J Pouget-Abadie, M Mirza, B Xu, D Warde-Farley, S Ozair, et al. Generative adversarial nets.
Ricardo Alberto Carrillo Romero. Generative Adversarial Network for Stock Market Price Prediction.
D Whitley. A genetic algorithm tutorial. Statistics and Computing.