Article

Autoencoding Conditional GAN for Portfolio Allocation Diversification

... The 'ALL' dataset consists of an ETF that tracks the market indices of developed and emerging markets, a long-term bond ETF, and a commodity ETF. The Developed Market (DM) dataset, built from ETFs (1), (2), (3), (4), (7), (8), (9), (10), consists of an ETF that tracks the market indices of developed markets, a long-term bond ETF, and a commodity ETF. The Global Bond dataset, built from ETFs (7), (13), (14), (8), (9), (10), is constructed by referencing the Bloomberg Global Aggregate Total Return Index and taking into account the maturity and country of the bonds. The US Sector dataset, built from ETFs (16) through (26), consists of 11 sector ETFs in the United States, with reference to the Global Industry Classification Standard (GICS). ...
Article
Full-text available
Asset allocation methods using reinforcement learning are being actively researched. However, existing asset allocation methods overlook three issues: first, state designs that ignore portfolio management and financial market characteristics; second, model overfitting; third, training designs that ignore the statistical structure of financial time-series data. To address these problems, we propose a new reinforcement learning asset allocation method. First, the state combines a financial market state and an agent state. Second, Monte Carlo simulation data are used to increase the complexity of the training data. Third, the Monte Carlo simulation data are generated to reflect the various statistical structures of financial markets. We show experimentally that our method outperforms the benchmark over several test intervals.
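A minimal sketch of this kind of Monte Carlo data augmentation is shown below. It resamples synthetic multi-asset return paths from a multivariate Student-t calibrated to the historical mean and covariance; the distributional choice, degrees of freedom, and path sizes are illustrative assumptions, not details from the abstract.

```python
import numpy as np

def simulate_return_paths(hist_returns, n_paths=100, horizon=250, df=5, seed=0):
    """Generate synthetic multi-asset return paths for training augmentation.

    hist_returns: (T, n_assets) array of historical daily returns.
    Paths are drawn from a multivariate Student-t calibrated to the
    historical mean and covariance (heavier tails than a Gaussian).
    """
    rng = np.random.default_rng(seed)
    mu = hist_returns.mean(axis=0)
    cov = np.cov(hist_returns, rowvar=False)
    chol = np.linalg.cholesky(cov)
    n_assets = hist_returns.shape[1]

    paths = np.empty((n_paths, horizon, n_assets))
    for p in range(n_paths):
        z = rng.standard_normal((horizon, n_assets))
        # Scale Gaussian draws by a chi-square factor to obtain Student-t tails.
        g = rng.chisquare(df, size=(horizon, 1)) / df
        paths[p] = mu + (z / np.sqrt(g)) @ chol.T
    return paths

# Example: augment a small historical sample with 100 simulated one-year paths.
hist = np.random.default_rng(1).normal(0.0003, 0.01, size=(1000, 4))
synthetic = simulate_return_paths(hist, n_paths=100)
print(synthetic.shape)  # (100, 250, 4)
```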
... The advantage of GANs is that they can be trained in an unsupervised way. GANs were introduced in 2014 in a widely cited paper by Goodfellow et al. [6] and tested initially on image datasets [7,8]; they have since been applied in medicine [9,10], quantitative finance [11], portfolio optimization [12], fraud detection [13], trading model optimization [14], and the generation of time series [15,16]. ...
Preprint
Full-text available
Quantum machine learning (QML) is a cross-disciplinary subject made up of two of the most exciting research areas: quantum computing and classical machine learning (ML), with ML and artificial intelligence (AI) projected to be among the first fields impacted by the rise of quantum machines. Quantum computers are being used today in drug discovery, material and molecular modelling, and finance. In this work, we discuss some upcoming active research areas in the application of quantum machine learning (QML) to finance. We discuss certain QML models that have become areas of active interest in the financial world for various applications. Using a real-world financial dataset and simulated environments, we compare models such as the qGAN (quantum generative adversarial network) and QCBM (quantum circuit Born machine), among others. For the qGAN, we define quantum circuits for the discriminator and generator and show the promise of a future quantum advantage via QML in finance.
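A quantum circuit Born machine of the kind mentioned here can be sketched in a few lines. The example below uses PennyLane (a library choice not stated in the abstract); the qubit count, depth, and RY/CNOT layout are illustrative assumptions. The measurement probabilities over the computational basis play the role of the generative distribution; training would then minimize a divergence (for example MMD) between these probabilities and the empirical histogram of discretized returns.

```python
import numpy as np
import pennylane as qml

n_qubits, n_layers = 3, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def born_machine(weights):
    # Alternating single-qubit rotations and entangling CNOTs.
    for layer in weights:
        for w in range(n_qubits):
            qml.RY(layer[w], wires=w)
        for w in range(n_qubits - 1):
            qml.CNOT(wires=[w, w + 1])
    # Probabilities over the 2**n_qubits basis states act as the model distribution.
    return qml.probs(wires=range(n_qubits))

weights = np.random.uniform(0, 2 * np.pi, size=(n_layers, n_qubits))
print(born_machine(weights))  # 8 probabilities summing to 1
```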
... Fortunately, recent advances in neural networks and machine learning, for example the algorithm known as Generative Adversarial Networks, or GAN, have proven effective to this end in a number of applications (Goodfellow et al., 2014). Moreover, a number of authors have explored the use of GAN-based algorithms in portfolio optimization problems, albeit within the scope of a framework different from the one discussed in this paper (e.g., Lu and Yi (2022); Mariani et al. (2019); Pun, Wang, and Wong (2020); Takahashi, Chen, and Tanaka-Ishii (2019)). Lommers, Harzli, and Kim (2021). ...
Preprint
Full-text available
We propose a new approach to portfolio optimization that utilizes a unique combination of synthetic data generation and a CVaR constraint. We formulate the portfolio optimization problem as an asset allocation problem in which each asset class is accessed through a passive (index) fund. The asset-class weights are determined by solving an optimization problem which includes a CVaR constraint. The optimization is carried out by means of a Modified CTGAN algorithm which incorporates features (contextual information) and is used to generate synthetic return scenarios, which, in turn, are fed into the optimization engine. For contextual information we rely on several points along the U.S. Treasury yield curve. The merits of this approach are demonstrated with an example based on ten asset classes (covering stocks, bonds, and commodities) over a fourteen-and-a-half-year period (January 2008 to June 2022). We show that the synthetic generation process captures the key characteristics of the original data well, and that the optimization scheme results in portfolios exhibiting satisfactory out-of-sample performance. We also show that this approach outperforms the conventional equal-weights (1/N) asset allocation strategy and other optimization formulations based on historical data only.
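A CVaR constraint on scenario returns can be imposed via the standard Rockafellar-Uryasev linearization. The sketch below uses cvxpy; the asset count, confidence level, CVaR cap, and the random scenarios (standing in for CTGAN-generated ones) are illustrative assumptions, not the paper's settings.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
S, n = 1000, 10                        # scenarios x asset classes (assumed sizes)
R = rng.normal(0.0005, 0.01, (S, n))   # placeholder synthetic return scenarios

beta, cvar_cap = 0.95, 0.02            # confidence level and CVaR limit (assumed)

w = cp.Variable(n, nonneg=True)        # long-only weights
alpha = cp.Variable()                  # VaR auxiliary variable
u = cp.Variable(S, nonneg=True)        # scenario shortfalls above alpha

losses = -(R @ w)
cvar = alpha + cp.sum(u) / ((1 - beta) * S)

constraints = [
    cp.sum(w) == 1,
    u >= losses - alpha,
    cvar <= cvar_cap,
]
objective = cp.Maximize(cp.sum(R @ w) / S)  # maximize mean scenario return

cp.Problem(objective, constraints).solve()
print(np.round(w.value, 3))
```

With the shortfall variables u forced above losses minus alpha, the expression alpha + sum(u)/((1-beta)S) upper-bounds CVaR at level beta, so capping it caps the portfolio's CVaR over the generated scenarios.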
... Accurate models are not always available for the system whose operating conditions we want to capture in a digital twin (DT). The problem of data availability could possibly be solved using Generative Adversarial Networks (GANs) [21][22][23] and/or transfer learning [24][25][26][27][28], but the technology is still maturing. For this study, Generalized Additive Models (GAMs) are used: a category of ML that requires little to no hyperparameter tuning and approximates nonlinear relationships with a linear combination of a series of smoothing functions [29]. ...
Article
Full-text available
With the increasing constraints on energy and resource markets and the non-decreasing trend in energy demand, the need for relevant clean energy generation and storage solutions is growing and is gradually reaching the individual home. However, small-scale energy storage is still an expensive investment in 2022 and the risk/reward ratio is not yet attractive enough for individual homeowners. One solution is for homeowners not to store excess clean energy individually but to produce hydrogen for mutual use. In this paper, the collective production of hydrogen for the daily filling of a bus is considered. Following our previous work on the subject, the investigation consists of finding an optimal rule for buying from and selling to the grid, and for using the energy, with an additional objective: mobility. The dominant technique in the energy community is reinforcement learning, which, however, is difficult to use when the learning data are limited, as in our study. We chose a less data-intensive and yet technically well-documented approach. Our results show that rulebooks, different from but more interesting than the usual robust rule, exist and can be cost-effective. In some cases, they even show that it is worth occasionally missing the H2 production requirement in exchange for higher economic performance. However, they require fine-tuning so as not to degrade system performance.
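The GAM approach referenced in the citing context above fits a sum of smooth univariate functions, one per feature. A minimal sketch with the pygam package follows; the features (ambient temperature and solar irradiance) and the target are illustrative placeholders, not the variables used in the paper.

```python
import numpy as np
from pygam import LinearGAM, s

rng = np.random.default_rng(0)
# Placeholder features: ambient temperature (deg C) and solar irradiance (W/m^2).
X = np.column_stack([rng.uniform(-5, 35, 500), rng.uniform(0, 1000, 500)])
# Placeholder target: a nonlinear response plus noise.
y = 0.8 * np.tanh(X[:, 1] / 400) - 0.01 * (X[:, 0] - 20) ** 2 + rng.normal(0, 0.05, 500)

# One smoothing spline per feature; almost no hyperparameter tuning needed.
gam = LinearGAM(s(0) + s(1)).fit(X, y)
gam.summary()
print(gam.predict(X[:5]))
```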
Chapter
Over the decades, the Markowitz framework has been used extensively in portfolio analysis, though it puts too much emphasis on analyzing market uncertainty rather than on predicting trends. Meanwhile, the generative adversarial network (GAN), conditional GAN (CGAN), and autoencoding CGAN (ACGAN) have been explored to generate financial time series and extract features that can help portfolio analysis. The limitation of the CGAN and ACGAN frameworks lies in putting too much emphasis on generating series and finding the internal trends of the series rather than on predicting future trends. In this paper, we introduce a hybrid conditional GAN approach based on deep generative models that learns the internal trend of historical data while modeling market uncertainty and future trends. We evaluate the model on several real-world datasets from both the US and European markets, and show that the proposed HybridCGAN and HybridACGAN models lead to better portfolio allocation than the existing Markowitz, CGAN, and ACGAN approaches.
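As a rough illustration of the kind of conditional generator involved (not the authors' architecture; the layer sizes, window length, and training details below are assumptions), a CGAN that samples next-period returns conditioned on a recent return window could look like the following sketch. At allocation time, many conditional samples would be drawn for the latest window and their sample mean and covariance fed into a Markowitz-style optimizer.

```python
import torch
import torch.nn as nn

WINDOW, N_ASSETS, NOISE = 20, 5, 16  # illustrative sizes

class Generator(nn.Module):
    """Maps (noise, conditioning window) to a simulated next-period return vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE + WINDOW * N_ASSETS, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_ASSETS),
        )
    def forward(self, z, cond):
        return self.net(torch.cat([z, cond.flatten(1)], dim=1))

class Discriminator(nn.Module):
    """Scores whether a (next-period return, conditioning window) pair is real."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_ASSETS + WINDOW * N_ASSETS, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, x, cond):
        return self.net(torch.cat([x, cond.flatten(1)], dim=1))

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_next, cond):
    """One adversarial update on a batch of (next-period return, window) pairs."""
    b = real_next.size(0)
    # Discriminator: real pairs labeled 1, generated pairs labeled 0.
    fake_next = G(torch.randn(b, NOISE), cond).detach()
    d_loss = bce(D(real_next, cond), torch.ones(b, 1)) + \
             bce(D(fake_next, cond), torch.zeros(b, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator on fresh samples.
    g_loss = bce(D(G(torch.randn(b, NOISE), cond), cond), torch.ones(b, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```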
Chapter
Full-text available
Fog and haze appearing naturally or artificially in the environment limit human visibility. As a way of improving visibility, digital images are captured and many different image enhancement methods are applied to remove the fog and haze effects. One of the fundamental methods is the Dark Channel Prior (DCP) method. DCP can remove fog and haze from a single image by modelling the physical attenuation structure of fog. In this study, the spectral signature of the DCP method was investigated using the transmission maps produced by applying DCP to the Spectral Hazy Image Database (SHIA), which consists of hyperspectral images taken in the visible and near-infrared band range. It was observed that different regions of the image differ in how their transmission responds to increasing fog density. Using this distinctiveness on two images of the scene taken at two different high fog levels, this study reveals the silhouette of a scene that is entirely invisible to the human eye.
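The dark channel and transmission map at the core of DCP can be computed in a few lines. The sketch below follows common choices in the DCP literature (patch size 15, omega = 0.95); these constants and the atmospheric-light heuristic are assumptions, not values taken from this chapter.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Per-pixel minimum over color channels and a local patch (img in [0, 1], HxWx3)."""
    return minimum_filter(img.min(axis=2), size=patch)

def transmission_map(img, patch=15, omega=0.95):
    """Estimate transmission t(x) = 1 - omega * dark_channel(I / A)."""
    dark = dark_channel(img, patch)
    # Atmospheric light A: mean color of the brightest 0.1% dark-channel pixels.
    n_top = max(1, int(dark.size * 0.001))
    idx = np.unravel_index(np.argsort(dark, axis=None)[-n_top:], dark.shape)
    A = img[idx].mean(axis=0)
    return 1.0 - omega * dark_channel(img / A, patch)

# Usage: t = transmission_map(hazy_rgb_float_image); higher t means less fog.
```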
Article
Full-text available
We present a set of stylized empirical facts emerging from the statistical analysis of price variations in various types of financial markets. We first discuss some general issues common to all statistical studies of financial time series. Various statistical properties of asset returns are then described: distributional properties, tail properties and extreme fluctuations, pathwise regularity, linear and nonlinear dependence of returns in time and across stocks. Our description emphasizes properties common to a wide variety of markets and instruments. We then show how these statistical properties invalidate many of the common statistical approaches used to study financial data sets and examine some of the statistical problems encountered in each case.
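Several of these stylized facts, such as the absence of linear autocorrelation, heavy tails, and volatility clustering, can be checked directly on a return series. A minimal sketch, assuming daily log returns in a NumPy array, is below.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a given lag."""
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

def stylized_facts(returns, lags=(1, 5, 20)):
    return {
        # Excess kurtosis > 0 indicates heavier-than-Gaussian tails.
        "excess_kurtosis": ((returns - returns.mean()) ** 4).mean()
                           / returns.var() ** 2 - 3.0,
        # Raw returns: autocorrelations near zero (linear unpredictability).
        "acf_returns": {l: autocorr(returns, l) for l in lags},
        # Absolute returns: slowly decaying positive autocorrelation (volatility clustering).
        "acf_abs_returns": {l: autocorr(np.abs(returns), l) for l in lags},
    }

# Usage: print(stylized_facts(np.diff(np.log(prices))))
```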
Article
Full-text available
This article and the companion paper aim at reviewing recent empirical and theoretical developments usually grouped under the term Econophysics. Since the name was coined in 1995 by merging the words 'Economics' and 'Physics', this new interdisciplinary field has grown in various directions: theoretical macroeconomics (wealth distribution), microstructure of financial markets (order book modeling), econometrics of financial bubbles and crashes, etc. We discuss the interactions between Physics, Mathematics, Economics and Finance that led to the emergence of Econophysics. We then present empirical studies revealing the statistical properties of financial time series. We begin the presentation with the widely acknowledged 'stylized facts', which describe the returns of financial assets—fat tails, volatility clustering, autocorrelation, etc.—and recall that some of these properties are directly linked to the way 'time' is taken into account. We continue with the statistical properties observed on order books in financial markets. For the sake of illustrating this review, (nearly) all the stated facts are reproduced using our own high-frequency financial database. Finally, contributions to the study of correlations of assets such as random matrix theory and graph theory are presented. The companion paper will review models in Econophysics from the point of view of agent-based modeling.
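The random-matrix-theory analysis of asset correlations mentioned above typically compares the eigenvalue spectrum of the empirical correlation matrix with the Marchenko-Pastur bounds expected for purely random data. A minimal sketch, assuming a matrix of daily returns with one column per asset, is below.

```python
import numpy as np

def rmt_eigen_analysis(returns):
    """Compare correlation-matrix eigenvalues to Marchenko-Pastur bounds.

    returns: (T, N) array of asset returns, T observations per asset.
    Eigenvalues above lambda_max suggest genuine correlation structure
    (for example the market mode) rather than estimation noise.
    """
    T, N = returns.shape
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    corr = z.T @ z / T
    eigvals = np.linalg.eigvalsh(corr)

    q = N / T
    lam_min = (1 - np.sqrt(q)) ** 2
    lam_max = (1 + np.sqrt(q)) ** 2
    return eigvals, (lam_min, lam_max), int((eigvals > lam_max).sum())

# Purely random returns: almost all eigenvalues fall inside the bounds.
rng = np.random.default_rng(0)
vals, bounds, n_outliers = rmt_eigen_analysis(rng.standard_normal((2000, 100)))
print(bounds, n_outliers)
```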
Article
Full-text available
Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.
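The multiplicative updates that enforce the non-negativity constraints are compact. A minimal sketch minimizing squared reconstruction error with the Lee-Seung update rules is below; the rank and iteration count are arbitrary.

```python
import numpy as np

def nmf(V, rank=10, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (m x n) as W @ H with W, H >= 0,
    using Lee-Seung multiplicative updates for the Frobenius norm."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(n_iter):
        # Purely multiplicative updates never change sign, so non-negativity is preserved.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.abs(np.random.default_rng(1).random((100, 50)))
W, H = nmf(V, rank=5)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # relative reconstruction error
```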
Article
Full-text available
We investigate quantitatively the so-called "leverage effect," which corresponds to a negative correlation between past returns and future volatility. For individual stocks this correlation is moderate and decays over 50 days, while for stock indices it is much stronger but decays faster. For individual stocks the magnitude of this correlation has a universal value that can be rationalized in terms of a new "retarded" model which interpolates between a purely additive and a purely multiplicative stochastic process. For stock indices a specific amplification phenomenon seems to be necessary to account for the observed amplitude of the effect.
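The leverage correlation described here is commonly measured as the correlation between today's return and future squared returns over a range of lags. A minimal sketch of that measurement (a simplified variant of the measure used in the article) is below.

```python
import numpy as np

def leverage_correlation(returns, max_lag=50):
    """L(tau) ~ corr(r_t, r_{t+tau}^2): negative values for tau > 0
    indicate that negative returns precede higher volatility."""
    r = returns - returns.mean()
    return {tau: np.corrcoef(r[:-tau], r[tau:] ** 2)[0, 1]
            for tau in range(1, max_lag + 1)}

# Usage: L = leverage_correlation(daily_returns); plotting L[tau] against tau
# shows the negative, decaying profile reported for stocks and indices.
```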
Article
The SVR-GARCH model tends to "backward eavesdrop" when forecasting financial time-series volatility, in the sense that it tends to produce its prediction simply by slightly deviating from the previous volatility. Though the SVR-GARCH model achieves good performance on various performance measurements, trading opportunities and peak or trough behaviors in the time series are all hampered by underestimated or overestimated volatility. We propose a blending ARCH (BARCH) and an augmented BARCH (aBARCH) model to overcome this kind of problem and push the prediction towards better peak and trough behavior. The method is illustrated using real data sets including SH300 and S&P500. The empirical results obtained suggest that the augmented and blending models improve volatility forecasting ability.
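The "backward eavesdropping" issue can be illustrated with a naive persistence forecast: predicting tomorrow's volatility as today's value already scores well on MSE-type metrics while lagging every peak and trough by one step. The sketch below demonstrates this on a simulated GARCH(1,1) volatility path; the parameters are arbitrary and the example is not the authors' BARCH or aBARCH model.

```python
import numpy as np

rng = np.random.default_rng(0)
T, omega, alpha, beta = 2000, 1e-6, 0.08, 0.9

# Simulate a GARCH(1,1) volatility path.
sigma2 = np.empty(T)
sigma2[0] = omega / (1 - alpha - beta)
r = np.empty(T)
for t in range(T):
    r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    if t + 1 < T:
        sigma2[t + 1] = omega + alpha * r[t] ** 2 + beta * sigma2[t]

vol = np.sqrt(sigma2)
persistence_forecast = vol[:-1]     # "predict" sigma_{t+1} with sigma_t
mse = np.mean((vol[1:] - persistence_forecast) ** 2)
# Low MSE and high correlation, yet the forecast is just the series shifted
# by one step: every peak and trough is reproduced one period too late.
print(mse, np.corrcoef(vol[1:], persistence_forecast)[0, 1])
```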
Article
Financial time-series modeling is a challenging problem, as the series exhibit various complex statistical properties and the mechanism behind the process remains largely unknown. In this paper, a deep-neural-network-based approach to financial time-series modeling, using generative adversarial networks (GANs), is presented. GANs learn the properties of data and generate realistic data in a data-driven manner. The GAN model produces a time series that recovers the statistical properties of financial time series such as linear unpredictability, the heavy-tailed price return distribution, volatility clustering, leverage effects, the coarse-fine volatility correlation, and the gain/loss asymmetry.
Article
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
Article
The support vector machine (SVM) is a specific type of learning algorithm characterized by the capacity control of the decision function, the use of kernel functions, and the sparsity of the solution. In this paper, we investigate the predictability of the direction of financial movements with an SVM by forecasting the weekly movement direction of the NIKKEI 225 index. To evaluate the forecasting ability of the SVM, we compare its performance with those of Linear Discriminant Analysis, Quadratic Discriminant Analysis, and Elman Backpropagation Neural Networks. The experimental results show that the SVM outperforms the other classification methods. Further, we propose a combining model that integrates the SVM with the other classification methods. The combining model performs best among all the forecasting methods.
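Weekly direction forecasting with an SVM reduces to binary classification on lagged returns. A minimal sketch with scikit-learn is below; the lag structure, RBF kernel, and train/test split are illustrative choices, not necessarily those of the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def direction_forecast(weekly_returns, n_lags=4):
    """Predict the sign of next week's return from the previous n_lags returns."""
    X = np.column_stack([weekly_returns[i:len(weekly_returns) - n_lags + i]
                         for i in range(n_lags)])
    y = (weekly_returns[n_lags:] > 0).astype(int)

    split = int(0.8 * len(y))
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    model.fit(X[:split], y[:split])
    return model.score(X[split:], y[split:])   # out-of-sample hit ratio

rng = np.random.default_rng(0)
print(direction_forecast(rng.normal(0.001, 0.02, 800)))
```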
Article
The efficient market hypothesis gives rise to forecasting tests that mirror those adopted when testing the optimality of a forecast in the context of a given information set. However, there are also important differences arising from the fact that market efficiency tests rely on establishing profitable trading opportunities in ‘real time’. Forecasters constantly search for predictable patterns and affect prices when they attempt to exploit trading opportunities. Stable forecasting patterns are therefore unlikely to persist for long periods of time and will self-destruct when discovered by a large number of investors. This gives rise to non-stationarities in the time series of financial returns and complicates both formal tests of market efficiency and the search for successful forecasting approaches.
Article
Traditional econometric models assume a constant one-period forecast variance. To generalize this implausible assumption, a new class of stochastic processes called autoregressive conditional heteroscedastic (ARCH) processes are introduced in this paper. These are mean zero, serially uncorrelated processes with nonconstant variances conditional on the past, but constant unconditional variances. For such processes, the recent past gives information about the one-period forecast variance. A regression model is then introduced with disturbances following an ARCH process. Maximum likelihood estimators are described and a simple scoring iteration formulated. Ordinary least squares maintains its optimality properties in this set-up, but maximum likelihood is more efficient. The relative efficiency is calculated and can be infinite. To test whether the disturbances follow an ARCH process, the Lagrange multiplier procedure is employed. The test is based simply on the autocorrelation of the squared OLS residuals. This model is used to estimate the means and variances of inflation in the U.K. The ARCH effect is found to be significant and the estimated variances increase substantially during the chaotic seventies.
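The Lagrange multiplier test described here needs only the squared OLS residuals: regress them on their own lags and compare T times R-squared with a chi-square distribution. A minimal sketch is below; the number of lags is an arbitrary choice.

```python
import numpy as np
from scipy import stats

def arch_lm_test(residuals, n_lags=4):
    """Engle's ARCH LM test: regress squared residuals on their own lags
    and compare T * R^2 with a chi-square(n_lags) distribution."""
    e2 = residuals ** 2
    y = e2[n_lags:]
    X = np.column_stack([np.ones(len(y))] +
                        [e2[i:len(e2) - n_lags + i] for i in range(n_lags)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid.var() / y.var()
    lm = len(y) * r2
    return lm, 1.0 - stats.chi2.cdf(lm, df=n_lags)

# A small p-value indicates conditional heteroscedasticity (an ARCH effect).
rng = np.random.default_rng(0)
print(arch_lm_test(rng.standard_normal(1000)))
```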
Article
A natural generalization of the ARCH (Autoregressive Conditional Heteroskedastic) process introduced in Engle (1982) to allow for past conditional variances in the current conditional variance equation is proposed. Stationarity conditions and autocorrelation structure for this new class of parametric models are derived. Maximum likelihood estimation and testing are also considered. Finally an empirical example relating to the uncertainty of the inflation rate is presented.
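In practice this GARCH extension is usually fitted with an off-the-shelf package. A minimal sketch using the Python arch library (a tooling choice, not something from the paper) is below; the placeholder returns would be replaced with real data in percentage units.

```python
import numpy as np
from arch import arch_model

# Placeholder daily percentage returns; replace with real data.
rng = np.random.default_rng(0)
returns = 100 * rng.normal(0, 0.01, 2000)

# GARCH(1,1): sigma^2_t = omega + alpha * r^2_{t-1} + beta * sigma^2_{t-1}
model = arch_model(returns, mean="Constant", vol="GARCH", p=1, q=1)
result = model.fit(disp="off")
print(result.params)                        # mu, omega, alpha[1], beta[1]
print(result.forecast(horizon=5).variance)  # 5-step-ahead conditional variance
```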