
Nonlinear forecasting with many predictors by neural network factor models (Deep Learning in Finance)



This study proposes a nonlinear generalization of factor models based on artificial neural networks for forecasting financial time series with many predictors.
The proposed forecasting technique is based on an improved factor model with two neural network extensions. The model is able to capture both the non-linearity and the non-normality of a high-dimensional dataset. The specification (architecture) of the neural network factor model is determined on the basis of statistical inference, with special emphasis on data-driven specification.
Linear factor models can be represented as a special case of this neural network factor model: if there is no non-linearity between the variables, it behaves like a linear model.
Forecasting with factor models is a two-step process:

Factor Estimation, which summarizes the information contained in a large data set in a small number of factors:

X_it = Λ_i f_t + e_it  (1)

Forecasting Equation, which predicts the variable of interest (one of the X variables) using the common factors:

y_{t+1} = λ′ f_{t+1|t} + ε_{t+1}  (2)

Common factors and the idiosyncratic component can be forecast simultaneously or separately.
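The two-step procedure can be sketched numerically. The sketch below is a minimal illustration on simulated data, using principal components for step one and an OLS regression for step two; the sizes T, N, r and the simulated panel are assumptions, not the poster's data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated panel: T observations of N predictors driven by r common factors.
T, N, r = 200, 50, 3
F = rng.standard_normal((T, r))                     # latent factors f_t
Lam = rng.standard_normal((N, r))                   # loadings Lambda
X = F @ Lam.T + 0.5 * rng.standard_normal((T, N))   # X_t = Lambda f_t + e_t

# Step 1: factor estimation by principal components.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
f_hat = Xc @ Vt[:r].T                               # estimated factors

# Step 2: forecasting equation y_{t+1} = lambda' f_t + eps_{t+1};
# here y is taken to be the first predictor one step ahead.
y = X[1:, 0]
Z = np.column_stack([np.ones(T - 1), f_hat[:-1]])   # intercept + lagged factors
lam_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
y_forecast = np.array([1.0, *f_hat[-1]]) @ lam_hat  # forecast for period T+1
print(round(float(y_forecast), 3))
```
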
Figure 1: The standard auto-associative neural network architecture for nonlinear PCA (combination of two feed-forward NNs)
The first extension proposes NLPCA (neural network principal component analysis) as an alternative for factor estimation, which allows the factors to have a nonlinear relationship to the input variables. NLPCA generalizes the classical PCA method by a nonlinear mapping from data to factors. Both the neural network parameters and the unobservable factors (f) can be optimised simultaneously to minimise the reconstruction error:

MSE = E(||X̂ − X||²)
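A minimal numpy sketch of such an auto-associative (bottleneck) network: mapping and de-mapping layers are trained jointly by gradient descent to reduce the reconstruction error. The toy data, layer sizes and learning rate are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# One-dimensional nonlinear structure embedded in 3 observed variables (toy data).
s = rng.uniform(-1.0, 1.0, size=(300, 1))
X = np.hstack([s, s**2, s**3]) + 0.02 * rng.standard_normal((300, 3))

N, m, r = 3, 8, 1                       # inputs, hidden units, bottleneck factors
W1 = 0.5 * rng.standard_normal((N, m)); b1 = np.zeros(m)
W2 = 0.5 * rng.standard_normal((m, r)); b2 = np.zeros(r)
W3 = 0.5 * rng.standard_normal((r, m)); b3 = np.zeros(m)
W4 = 0.5 * rng.standard_normal((m, N)); b4 = np.zeros(N)

def forward(X):
    h1 = np.tanh(X @ W1 + b1)           # mapping layer
    f = h1 @ W2 + b2                    # bottleneck: estimated factors
    h2 = np.tanh(f @ W3 + b3)           # de-mapping layer
    Xh = h2 @ W4 + b4                   # reconstruction X_hat
    return h1, f, h2, Xh

lr, losses = 0.1, []
for _ in range(2000):
    h1, f, h2, Xh = forward(X)
    losses.append(np.mean((Xh - X) ** 2))
    G = 2.0 * (Xh - X) / X.size         # gradient of MSE w.r.t. X_hat
    dW4 = h2.T @ G; db4 = G.sum(0)
    dh2 = (G @ W4.T) * (1 - h2 ** 2)    # back through tanh
    dW3 = f.T @ dh2; db3 = dh2.sum(0)
    df = dh2 @ W3.T
    dW2 = h1.T @ df; db2 = df.sum(0)
    dh1 = (df @ W2.T) * (1 - h1 ** 2)
    dW1 = X.T @ dh1; db1 = dh1.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2
    W3 -= lr * dW3; b3 -= lr * db3; W4 -= lr * dW4; b4 -= lr * db4

print(round(losses[0], 4), round(losses[-1], 4))   # reconstruction error falls
```
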
The second extension is a nonlinear factor-augmented forecasting equation based on a single-hidden-layer feed-forward neural network, which can be built in a similar fashion as a statistical model. The neural network model can be defined as:

y_t = G(x_t; ψ) + ε_t = α′x̃_t + Σ_{i=1}^{h} λ_i F(ω̃_i′x_t − β_i) + ε_t

The function F(ω̃_i′x_t − β_i), often called the activation function, is a logistic function.
Figure 2: Artificial Neuron configuration
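As a sketch of this forecasting equation, the function below evaluates the linear part plus a sum of logistic hidden units; all parameter values are hypothetical, chosen only to illustrate the functional form:

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_forecast(x, alpha, lam, Omega, beta):
    """y = alpha' x_tilde + sum_i lam_i * F(omega_i' x - beta_i),
    where x_tilde = (1, x') adds an intercept and F is the logistic
    activation. Shapes: alpha (p+1,), lam (h,), Omega (h, p), beta (h,)."""
    x_tilde = np.concatenate(([1.0], x))
    hidden = logistic(Omega @ x - beta)     # h logistic hidden units
    return alpha @ x_tilde + lam @ hidden

# Hypothetical model with p = 2 inputs and h = 2 hidden units.
x = np.array([0.3, -1.2])
alpha = np.array([0.1, 0.4, -0.2])
lam = np.array([0.5, -0.3])
Omega = np.array([[1.0, 0.5], [-0.7, 0.2]])
beta = np.array([0.0, 0.1])
print(nn_forecast(x, alpha, lam, Omega, beta))
```

With lam set to zero the hidden units drop out and the model reduces to the linear specification α′x̃_t, which is the linear special case the poster notes.
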
[1] J. H. Stock and M. W. Watson. Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association, 97:1167–1179, 2002.
[2] M. Forni et al. The generalized dynamic factor model: Estimation and forecasting. Journal of the American Statistical Association, 100:830–840, 2005.
[3] M. C. Medeiros and T. Terasvirta. Building neural network models for time series: A statistical approach. Journal of Forecasting, 25:49–75, 2006.
[4] C. M. Kuan and H. White. Artificial neural networks: An econometric perspective. Econometric Reviews, 13:1–91, 1994.
[5] M. Deistler and E. Hamann. Identification of factor models for forecasting returns. Journal of Financial Econometrics, 3(2):256–281, 2005.
[6] A. N. Gorban and B. M. Kegl (Eds.). Principal Manifolds for Data Visualization and Dimension Reduction. Springer, 2008.
[7] M. A. Kramer. Nonlinear principal component analysis using autoassociative neural networks. AIChE Journal, 37:233–243, 1991.
Out-of-sample forecast evaluation based on different criteria (RMSE, hit rate and Theil's U) showed that the proposed neural network factor model (NNFM) significantly outperformed the linear factor model and the random-walk approach.
Figure 3: Nonlinear PCA can describe the inherent structure of the data by a curved subspace.
Three stages of model building:

Variable selection: linearize the model (approximate the NN model by a polynomial of sufficiently high order) and apply well-known techniques of linear variable selection to this approximation.

Parameter estimation: estimate the parameters by maximum likelihood, making use of the normality assumptions made on the residuals.

Determining the number of hidden units (neurons): apply Lagrange multiplier type tests. One possibility is to begin with a small model and sequentially add hidden units to the model.
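The linearization idea behind the variable-selection stage can be sketched as follows. Note this is a hedged stand-in, not the authors' exact procedure: it uses a third-order polynomial approximation and a BIC comparison instead of the specific linear-model tests, and the data and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 400
x1, x2, x3 = rng.standard_normal((3, T))
# Toy target: depends nonlinearly on x1 and x2 only; x3 is irrelevant.
y = np.tanh(x1) + 0.5 * x2**2 + 0.1 * rng.standard_normal(T)

def bic(y, Z):
    """BIC of an OLS fit of y on design matrix Z (columns = regressors)."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    ssr = np.sum((y - Z @ beta) ** 2)
    n = len(y)
    return n * np.log(ssr / n) + Z.shape[1] * np.log(n)

def poly_terms(x, order=3):
    """Polynomial expansion of one candidate variable (the linearization)."""
    return np.column_stack([x ** k for k in range(1, order + 1)])

blocks = {"x1": poly_terms(x1), "x2": poly_terms(x2), "x3": poly_terms(x3)}
full = np.column_stack([np.ones(T)] + list(blocks.values()))
# Keep a variable only if dropping its polynomial block worsens the BIC.
keep = [name for name in blocks
        if bic(y, np.column_stack([np.ones(T)]
               + [b for n_, b in blocks.items() if n_ != name])) > bic(y, full)]
print(keep)
```
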
Financial returns present special features and share the following stylised facts: comovements, non-linearity, non-Gaussianity (skewness and heavy tails) and the leverage effect, which makes modelling this variable challenging.

Figure 4: Monthly return observations of the 52 companies in the S&P 100 index
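These stylised facts can be checked directly on a return series. The minimal sketch below computes sample skewness and excess kurtosis; the Student-t draws are an assumption standing in for actual heavy-tailed returns:

```python
import numpy as np

rng = np.random.default_rng(2)
# Student-t draws as a stand-in for heavy-tailed monthly returns (assumption).
returns = 0.01 * rng.standard_t(df=5, size=5000)

def skewness(x):
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 3)          # 0 for a symmetric distribution

def excess_kurtosis(x):
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0    # 0 for a Gaussian; positive = heavy tails

print(round(skewness(returns), 3), round(excess_kurtosis(returns), 3))
```

For heavy-tailed data the excess kurtosis comes out clearly positive, which is exactly the non-Gaussianity a linear-Gaussian factor model cannot capture.
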