Boundary problem and data leakage: A caveat for wavelet-based forecasting

Preprint · December 2018
DOI: 10.13140/RG.2.2.28234.21446
JCER DISCUSSION PAPER
No.148
December 2018
Japan Center for Economic Research
Boundary problem and data leakage: A caveat for
wavelet-based forecasting
Ryo Hasumi
Yuto Kajita
Abstract
The application of machine learning to economics has drawn much attention in recent
years. Forecasting economic data with machine learning requires feature extraction to
obtain good performance. In time series forecasting, researchers often use the wavelet
transform to process the data, and have reported that combining a neural network model
with the wavelet transform improves the accuracy of the prediction. There are, however,
many papers on wavelet-based forecasting that do not provide sufficient information on how
the time series data were processed. We show that inappropriate procedures for applying
the wavelet decomposition to time series data easily lead to data leakage, in which
unobserved data are used, so that the forecasting results appear extremely precise. We find
that wavelet-based forecasting in which the time series data are processed appropriately
cannot outperform even a naive prediction. Prediction performance based on wavelets is
therefore unreliable if the researcher does not specify the data processing method.
Keywords: Wavelet Transformation, Data Leakage, Boundary Problem
1 Introduction
It has often been reported in the literature that the application of the wavelet decomposition
to time series data enhances the predictive power of a forecasting model. The early studies of
Aussem and Murtagh (1997) and Aussem et al. (1998) show the compatibility of the wavelet
transform with a recurrent neural network (RNN): the wavelet transform decomposes the
time series data into periodic components and a trend, and the RNN is suited to handling
the regularity of such signals. A number of papers have proposed combining neural network
models and the wavelet decomposition, and have reported a reasonable accuracy (Pahasa and
Theera-Umpon (2007), Minu et al. (2010), Hsieh et al. (2011), Ortega and Khashanah (2014),
Jothimani et al. (2015), Yu et al. (2017), Bao et al. (2017)). The wavelet decomposition has
also been applied to a factor-augmented model (Rua (2011)) and the GARCH model (Tan et
al. (2010)), as well as the canonical ARIMA model (Fernandez (2008), Al Wadia et al. (2011),
Kriechbaumer et al. (2014), Zhang et al. (2017)). In this way, many researchers have constructed
forecasting models based on the wavelet transform.
Although many papers have used the wavelet decomposition to process time series data, the
boundary problem involved is often ignored. The boundary problem refers to the variation
of the wavelet coefficients near the end point of the transformation window as the window shifts,
and is caused by the assumption of circularity, or by padding with artificial data beyond the endpoint.
The boundary problem is closely related to forecasting because it concerns the data yet to be
observed, located beyond the boundary.
In this study, we show that disregarding the boundary problem and taking inappropriate
procedures for employing the wavelet transform leads to a serious problem, especially in
forecasting: data leakage. Data leakage is the mishandling of a model or data, in which information
not observed in that period is used. In the presence of this problem, the obtained model and
results are unrealistic and often appear implausibly good. In fact, there are studies in
which data leakage occurs unintentionally or intentionally and the results are seemingly biased
in favor of the wavelet transform. Without getting rid of this mishandling, we cannot deny the
possibility that the usefulness of the wavelet transform has been overstated or even does not
exist.¹

Corresponding author, Japan Center for Economic Research. E-mail: hasumi@jcer.or.jp
Graduate School of Economics, Waseda University.
This paper is organized as follows. In Section 2, we describe a simple example of the wavelet
transform and the cause of the boundary problem. In Section 3, we construct simple forecast
models to clarify the data leakage associated with the wavelet transform. Section 4 is a discussion
of how to manage the boundary problem. The last section is the Conclusion.
2 The wavelet transform and the boundary problem
The discrete wavelet transform (DWT) is an orthogonal transformation of time series data $X$
by a discrete wavelet matrix $\mathcal{W}$. The resulting vector $W = \mathcal{W}X$ is called the DWT coefficients.
If we conduct the level-2 Daubechies(4) discrete wavelet transform (D(4) DWT), for which the
length of $X$, denoted by $N$, must be a multiple of 4, we first define two kinds of orthogonal
matrices

$$
\underbrace{B_j}_{N/2^{j} \times N/2^{j-1}} =
\begin{bmatrix}
h_1 & h_0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & h_3 & h_2 \\
h_3 & h_2 & h_1 & h_0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
 & & & & & \vdots & & & & & \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & h_3 & h_2 & h_1 & h_0
\end{bmatrix}, \tag{1}
$$

$$
\underbrace{A_j}_{N/2^{j} \times N/2^{j-1}} =
\begin{bmatrix}
g_1 & g_0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & g_3 & g_2 \\
g_3 & g_2 & g_1 & g_0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\
 & & & & & \vdots & & & & & \\
0 & 0 & 0 & 0 & 0 & \cdots & 0 & g_3 & g_2 & g_1 & g_0
\end{bmatrix}, \tag{2}
$$

where

$$
h_0 = \frac{1-\sqrt{3}}{4\sqrt{2}}, \quad
h_1 = \frac{\sqrt{3}-3}{4\sqrt{2}}, \quad
h_2 = \frac{3+\sqrt{3}}{4\sqrt{2}}, \quad
h_3 = -\frac{1+\sqrt{3}}{4\sqrt{2}},
$$

$$
g_0 = -h_3, \quad g_1 = h_2, \quad g_2 = -h_1, \quad g_3 = h_0,
$$

and then define

$$
\mathcal{W} = \begin{bmatrix} B_1 \\ B_2 A_1 \\ A_2 A_1 \end{bmatrix}. \tag{3}
$$

Since $\mathcal{W}^{\top}\mathcal{W} = I$, we can decompose $X$ as

$$
X = \underbrace{B_1^{\top} B_1 X}_{D_1}
  + \underbrace{A_1^{\top} B_2^{\top} B_2 A_1 X}_{D_2}
  + \underbrace{A_1^{\top} A_2^{\top} A_2 A_1 X}_{S}. \tag{4}
$$

Here, $D_j$ and $S$ are called the wavelet details and smooth, respectively.
The circularity assumption appears in the last two elements of the first row of $B_j$, $h_3$ and
$h_2$, which make a certain number of elements near the end of the wavelet details and smooth
dependent on the beginning of the data $X$. For example, the last two elements of $D_1$ are
affected by the first two observations of $X$.
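As a numerical check of Equations (1)-(4), the circulant filter matrices can be written out directly in NumPy. This is an illustrative sketch (the paper reports no code), with the D(4) filter signs following the standard convention:

```python
import numpy as np

# D(4) wavelet filter h and its quadrature-mirror scaling filter g
s2, s3 = np.sqrt(2.0), np.sqrt(3.0)
h = np.array([1 - s3, s3 - 3, 3 + s3, -(1 + s3)]) / (4 * s2)
g = np.array([-h[3], h[2], -h[1], h[0]])

def circ_filter(f, n):
    """(n/2) x n circulant filter matrix: first row (f1, f0, 0, ..., 0, f3, f2),
    each subsequent row shifted right by 2, as in Equations (1)-(2)."""
    row = np.zeros(n)
    row[[0, 1, -2, -1]] = f[1], f[0], f[3], f[2]
    return np.array([np.roll(row, 2 * k) for k in range(n // 2)])

n = 16                                    # must be a multiple of 4
B1, A1 = circ_filter(h, n), circ_filter(g, n)
B2, A2 = circ_filter(h, n // 2), circ_filter(g, n // 2)

W = np.vstack([B1, B2 @ A1, A2 @ A1])     # Equation (3)
assert np.allclose(W.T @ W, np.eye(n))    # orthogonality behind Equation (4)

x = np.random.default_rng(1).standard_normal(n)
D1 = B1.T @ B1 @ x                        # wavelet detail, level 1
D2 = A1.T @ B2.T @ B2 @ A1 @ x            # wavelet detail, level 2
S = A1.T @ A2.T @ A2 @ A1 @ x             # wavelet smooth
assert np.allclose(D1 + D2 + S, x)        # details and smooth sum back to X
```

The two assertions verify exactly the orthogonality and additive decomposition stated around Equations (3)-(4).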
Under the circularity assumption, the data yet to be observed, $X(N+1), X(N+2), \ldots$, are
replaced by $X(1), X(2), \ldots$. In case the circularity assumption is inappropriate, for example
because there is a discrepancy between $X(N)$ and $X(1)$, another assumption is
often used: reflection, in which $X(N+1), X(N+2), \ldots$ are replaced by $X(N), X(N-1), \ldots$.
Constant padding is also an option, in which the constant $X(N)$ substitutes for $X(N+1), X(N+2), \ldots$.

¹ As for deliberate data leakage, Herwartz and Schlüter (2017) refer to it as perfect foresight and Kriechbaumer
et al. (2014) as calibration.

Figure 1: Description of data processing
[Figure: the whole sample (2009Q4–2018Q2) is split into 30 pairs of a training sample (252
observations, used to estimate the parameters) and a test sample (64 observations, used to
evaluate the forecasting accuracy). The training windows run from 2009Q4–2010Q4 through
2017Q1–2018Q1, and the corresponding test quarters from 2011Q1 through 2018Q2.]
As is apparent from this explanation, if we obtain new observations, whose number must be a multiple
of 4 in this case, the elements near the end of the wavelet details $D_j$ and smooth $S$ that we already
have will change, since $X(1), X(2), \ldots$ under circularity, or $X(N), X(N-1), \ldots$ under reflection,
are replaced by the actual data $X(N+1), X(N+2), \ldots$. This drawback of the wavelet transform
is referred to as the boundary problem. It occurs in all wavelet transforms, including the
maximal overlap discrete wavelet transform (MODWT), except for the most primitive Haar
DWT, where the width of the wavelet is 2.
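The revision of already-computed coefficients can be demonstrated numerically: decompose a sample, append four new observations, decompose again, and compare. The following sketch rebuilds the filter matrices of Equations (1)-(2) (illustrative code, not the authors'):

```python
import numpy as np

# D(4) filters and the circulant filter matrices of Equations (1)-(2)
s2, s3 = np.sqrt(2.0), np.sqrt(3.0)
h = np.array([1 - s3, s3 - 3, 3 + s3, -(1 + s3)]) / (4 * s2)
g = np.array([-h[3], h[2], -h[1], h[0]])

def circ_filter(f, n):
    row = np.zeros(n)
    row[[0, 1, -2, -1]] = f[1], f[0], f[3], f[2]
    return np.array([np.roll(row, 2 * k) for k in range(n // 2)])

def details_smooth(x):
    """Level-2 D(4) DWT details and smooth under circularity, Equation (4)."""
    n = len(x)
    B1, A1 = circ_filter(h, n), circ_filter(g, n)
    B2, A2 = circ_filter(h, n // 2), circ_filter(g, n // 2)
    a1 = A1 @ x
    return B1.T @ (B1 @ x), A1.T @ B2.T @ (B2 @ a1), A1.T @ A2.T @ (A2 @ a1)

rng = np.random.default_rng(0)
x = rng.standard_normal(64)                          # observed sample
x_ext = np.concatenate([x, rng.standard_normal(4)])  # 4 new observations arrive

D1_old, D2_old, S_old = details_smooth(x)
D1_new, D2_new, S_new = details_smooth(x_ext)

# interior coefficients are untouched by the new data ...
interior_same = np.allclose(D1_old[10:40], D1_new[10:40])
# ... but coefficients near the old endpoint are revised once the actual
# observations replace the circular wrap-around values
end_revised = not np.allclose(D1_old[62:64], D1_new[62:64])
```

This is precisely the boundary problem: the end-of-sample coefficients a forecaster would use in real time are the ones that later get revised.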
3 Wavelet-based forecasting and data leakage: A simple example
We now construct a wavelet-based forecasting model for predicting the S&P 500 index. The
sample data are the daily closing prices covering the period from January 2010 to
June 2018. We split the whole data into 30 subsets consisting of pairs of a training sample $X^{(1)}$ and a
test sample $X^{(2)}$, as shown in Figure 1. The sample length of $X^{(1)}$ is the 252 trading days before
the beginning of each quarter (from 2011Q1 to 2018Q2) and that of $X^{(2)}$ is the 64 trading days from
the beginning of the quarter.
For each subset, we firstly transform the logarithm of the training and test sets by the D(4) DWT
and then obtain two sets of wavelet details and smooth, $(D_1^{(1)}, D_2^{(1)}, S^{(1)})$ and $(D_1^{(2)}, D_2^{(2)}, S^{(2)})$. In
the second step, we apply the AR(1) model to $D_1^{(1)}$, $D_2^{(1)}$ and the first differences of $S^{(1)}$,

$$D_1^{(1)}(t) = \alpha_1 + \beta_1 D_1^{(1)}(t-1) + \varepsilon_{D_1}(t), \tag{5}$$
$$D_2^{(1)}(t) = \alpha_2 + \beta_2 D_2^{(1)}(t-1) + \varepsilon_{D_2}(t), \tag{6}$$
$$\Delta S^{(1)}(t) = \alpha_3 + \beta_3 \Delta S^{(1)}(t-1) + \varepsilon_{S}(t), \tag{7}$$
and estimate the coefficients by OLS. In the third step, we perform out-of-sample predictions at
each trading day by using the estimated models and the wavelet details and smooth of the test
set $(D_1^{(2)}, D_2^{(2)}, S^{(2)})$:

$$\hat{D}_1^{(2)}(t) = \alpha_1 + \beta_1 D_1^{(2)}(t-1), \tag{8}$$
$$\hat{D}_2^{(2)}(t) = \alpha_2 + \beta_2 D_2^{(2)}(t-1), \tag{9}$$
$$\Delta\hat{S}^{(2)}(t) = \alpha_3 + \beta_3 \Delta S^{(2)}(t-1). \tag{10}$$

By summing the predicted values of the wavelet details and smooth, $(\hat{D}_1, \hat{D}_2, \hat{S})$, we have a
prediction of the logarithm of the test series,

$$\ln(\hat{X}^{(2)}(t)) = \hat{D}_1^{(2)}(t) + \hat{D}_2^{(2)}(t) + \hat{S}^{(2)}(t). \tag{11}$$

We call this forecasting method overall DWT(circ.) AR1.

Table 1: Summary of prediction performance (1)

method                      MDA     RMSE    MAE     ARR
overall DWT(circ.) AR1      0.7302  0.0063  0.0048  0.3778
overall DWT(circ.) naive    0.7143  0.0064  0.0048  0.3545
dln(X) AR1                  0.5238  0.0074  0.0056  0.0146
dln(X) naive                0.4841  0.0108  0.0083  -0.0059

Note: MDA (mean directional accuracy), RMSE (root mean squared error) and MAE (mean absolute
error) are the medians across the 30 sets of training and test series. ARR is the median of the average
return rate corresponding to each forecasting method, i.e., buy if the one-period-ahead forecast is positive
and sell if it is negative.
For comparison, we make a prediction assuming that the one-period-ahead $D_1$ is zero
($\hat{D}_1^{(2)}(t+1) = 0$) and that $D_2$ and $S$ keep the same level ($\hat{D}_2^{(2)}(t+1) = D_2^{(2)}(t)$ and
$\hat{S}^{(2)}(t+1) = S^{(2)}(t)$), and name this model overall DWT(circ.) naive. We also estimate the
AR(1) model directly applied to $\Delta\ln(X^{(1)})$ and make one-period-ahead predictions, which is
denoted by dln(X) AR1. dln(X) naive assumes that the one-day-ahead percentage change is the
same as today's.
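Each of Equations (5)-(10) amounts to a one-regressor OLS fit followed by a one-step projection. A minimal sketch of that step (illustrative only; function names and the synthetic series are ours, not from the paper):

```python
import numpy as np

def fit_ar1(y):
    """OLS estimates (alpha, beta) of y(t) = alpha + beta * y(t-1) + eps(t)."""
    Y = y[1:]
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    alpha, beta = np.linalg.lstsq(X, Y, rcond=None)[0]
    return alpha, beta

def ar1_one_step(y_prev, alpha, beta):
    """One-period-ahead forecast given the previous observation."""
    return alpha + beta * y_prev

# example: a stationary AR(1) series with alpha = 0.1, beta = -0.4
rng = np.random.default_rng(2)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.1 - 0.4 * y[t - 1] + 0.1 * rng.standard_normal()

alpha, beta = fit_ar1(y)
forecast = ar1_one_step(y[-1], alpha, beta)
```

With 300 observations the estimates land close to the true (0.1, -0.4); in the paper this fit is run separately on each wavelet component of each training window.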
The prediction performance is evaluated by four measures: mean directional
accuracy (MDA), root mean squared error (RMSE), mean absolute error (MAE), and average
rate of return (ARR). We calculate ARR as the average of the returns produced by the
trading strategy based on the model's forecasts.
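The note to Table 1 pins these measures down only verbally; as a sketch, they can be computed from realized and forecast returns as follows (function names and the toy arrays are ours, for illustration only):

```python
import numpy as np

def mda(actual_ret, forecast_ret):
    """Mean directional accuracy: fraction of days on which the forecast
    return has the same sign as the realized return."""
    return np.mean(np.sign(actual_ret) == np.sign(forecast_ret))

def rmse(actual, forecast):
    return np.sqrt(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2))

def mae(actual, forecast):
    return np.mean(np.abs(np.asarray(actual) - np.asarray(forecast)))

def arr(actual_ret, forecast_ret):
    """Average return of the strategy in the note to Table 1: buy (go long)
    if the one-period-ahead forecast is positive, sell (short) if negative."""
    return np.mean(np.sign(forecast_ret) * np.asarray(actual_ret))

a = np.array([0.01, -0.02, 0.005, -0.01])   # realized returns
f = np.array([0.02, -0.01, -0.004, -0.02])  # forecast returns
```

On these toy arrays mda(a, f) is 0.75 (three of four signs agree), and arr(a, f) averages the realized returns with the sign of each forecast.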
Table 1 shows a summary of the prediction performance of the models. As a whole, the
precision of overall DWT(circ.) AR1 and overall DWT(circ.) naive is much higher than that of the
direct predictions, dln(X) AR1 and dln(X) naive. The facts that $D_1$ takes over the high-frequency
components and that the coefficient of the AR(1) is negative increase the accuracy of the AR(1)
prediction compared to the naive prediction. Since $S$ plays the role of a centered moving
average, knowing its level greatly increases the prediction accuracy. The same logic applies to
$D_2$, since it takes over longer-period components. Figure 2 depicts an example of predictions
based on overall DWT(circ.) AR1 and dln(X) AR1 during 2018:Q1, as well as the actual values.
This figure also confirms the extremely high accuracy of the prediction based on the
DWT.
Although the procedures employing the wavelet transform seem to be quite sensible, the
result is spurious and cannot be applied in actual practice. The seemingly high accuracy
arises because the wavelet details and smooth, $(D_1^{(2)}, D_2^{(2)}, S^{(2)})$, obtained by applying
the level-2 D(4) DWT to the whole test series $X^{(2)}$, contain future information. More
specifically, $D_1(t)$, $D_2(t)$, and $S(t)$ depend on the data up to $X(t+3)$, $X(t+7)$, and $X(t+7)$, respectively,
by the definition exhibited in Equation (4), and the values of the higher-order wavelet
details and smooth, $D_2$ and $S$, are adjusted in line with the trend, which makes it easy to guess
the one-period-ahead value of the original series to be forecasted. This is typical data leakage: a
forecast based on information not yet obtainable at that time. It is similar to the drawback of
the Hodrick–Prescott filter shown by Hamilton (2017): the detrended cyclical component
obtained by the two-sided HP filter is highly predictable since it depends on the future error
terms.

Figure 2: An example path of the log difference of the S&P 500 index and its predictions
(2018:Q1)
[Figure: the log-differenced S&P 500 index over 64 trading days, with the actuals, the
overall_DWT(circ.)_AR1 forecast, and the dln(X)_AR1 forecast.]
Note: The solid black line is a test set of the log-differenced S&P 500 index from 04/02/2018 to 06/29/2018
(64 trading days). The dashed blue line is the one-period-ahead forecast based on the wavelet details and
smooth converted from the whole test set and the application of the AR(1) model. The dotted pink line
indicates the forecast based on the AR(1) model directly applied to $\Delta\ln(X^{(1)})$.
In the next section, we explain an appropriate procedure to make the out-of-sample prediction
in wavelet-based forecasting and examine its forecasting accuracy.
4 How to manage the boundary problem in forecasting
That the wavelet details and smooth are affected by data from future times is closely related
to the boundary problem. If the wavelet transform is performed sequentially in decomposing the
test data, as suggested in Aussem et al. (1998), we can avoid data leakage. Table 2 is a summary
of the prediction results when the DWT is sequentially applied to the series of 64 trading days
before the timing of the prediction, as follows:

Step 1: Set the reference date τ and initialize i = 0.
Step 2: Set a rolling window from τ − 63 + di to τ + di, where d stands for the interval at which the
DWT is employed.
Step 3: Employ the DWT and store the wavelet details and smooth from τ + (i − 1)d + 1 to
τ + di.
Step 4: Increase i by 1 and go back to Step 2 until the window goes beyond the end of the
period of the test set.
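Steps 1-4 can be sketched as a rolling-window loop. The sketch below implements only the circular-boundary variant; the reflection and constant-padding variants would additionally pad each window beyond its endpoint before transforming. All names are ours, not the paper's:

```python
import numpy as np

# D(4) filters and circulant filter matrices, as in Equations (1)-(2)
s2, s3 = np.sqrt(2.0), np.sqrt(3.0)
h = np.array([1 - s3, s3 - 3, 3 + s3, -(1 + s3)]) / (4 * s2)
g = np.array([-h[3], h[2], -h[1], h[0]])

def circ_filter(f, n):
    row = np.zeros(n)
    row[[0, 1, -2, -1]] = f[1], f[0], f[3], f[2]
    return np.array([np.roll(row, 2 * k) for k in range(n // 2)])

def dwt_level2(x):
    """Level-2 D(4) details and smooth under circularity, Equation (4)."""
    n = len(x)
    B1, A1 = circ_filter(h, n), circ_filter(g, n)
    B2, A2 = circ_filter(h, n // 2), circ_filter(g, n // 2)
    a1 = A1 @ x
    return B1.T @ (B1 @ x), A1.T @ B2.T @ (B2 @ a1), A1.T @ A2.T @ (A2 @ a1)

def sequential_details_smooth(x, start, window=64, d=1):
    """Steps 1-4: at each date the DWT sees only the `window` most recent
    observations, and only the last `d` values of each transform -- those
    actually available in real time -- are stored."""
    D1_seq, D2_seq, S_seq = [], [], []
    t = start
    while t <= len(x):
        D1, D2, S = dwt_level2(x[t - window:t])
        D1_seq.extend(D1[-d:]); D2_seq.extend(D2[-d:]); S_seq.extend(S[-d:])
        t += d
    return np.array(D1_seq), np.array(D2_seq), np.array(S_seq)

# the sequential coefficients differ from those of a single whole-sample
# transform, which is the content of Figure 3
x = np.cumsum(np.random.default_rng(3).standard_normal(100))
seq_D1, seq_D2, seq_S = sequential_details_smooth(x, start=64)
full_D1, full_D2, full_S = dwt_level2(x)
```

With `start=64` and `d=1` on 100 observations, the loop stores 37 coefficients per component, and the first stored value (computed from the first 64-day window alone) differs from the corresponding coefficient of the whole-sample transform.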
Table 2: Summary of prediction performance (2)

method                        MDA     RMSE    MAE     ARR
sequential DWT(circ.) AR1     0.4841  0.0384  0.0322  0.0110
sequential DWT(circ.) naive   0.4841  0.0168  0.0143  0.0065
sequential DWT(ref.) naive    0.5238  0.0085  0.0063  0.0130
sequential DWT(con.) naive    0.5238  0.0081  0.0060  0.0250

Note: See the note to Table 1.
sequential DWT(circ.) AR1 uses the same AR(1) coefficients as overall DWT(circ.) AR1.
sequential DWT(circ.) naive, sequential DWT(ref.) naive and sequential DWT(con.) naive are naive
predictions employing a common DWT, but each assumes a different boundary condition: circular,
reflection, and constant padding, respectively. The interval d is set to 1. This procedure ensures that the
results are based on out-of-sample predictions.
It is clear that none of the methods in Table 2 clearly outperforms dln(X) AR1 or dln(X) naive
shown in Table 1. sequential DWT(con.) naive performs relatively well, but is much worse
than overall DWT(circ.) AR1 or overall DWT(circ.) naive. This exercise suggests that assuming
reflection or constant padding instead of circularity may help to some extent, but not
decisively.²
Figure 3 depicts example paths of $(D_1^{(2)}, D_2^{(2)}, S^{(2)})$ obtained by the above procedure assuming
reflection, setting the reference date τ to the beginning of the period of 2018Q2 and the
interval d = 4 (the dashed red lines). For comparison, we also depict overall DWT(circ.) naive
(the solid black lines) for the corresponding period. The difference between the two lines reflects
the boundary problem in a broad sense, as they are based on different information. It is worth
noticing that one cannot obtain the solid black lines, which carry much richer information, until
the end of the period. Although the boundary problem occurs only near the beginning and end
of the wavelet details and smooth, the above exercises make clear that a researcher should not
naively discard the unstable part near the end of the series and use only the stable part, since
that stability is the result of incorporating future information.
² To alleviate the boundary problem, Arino (1995) proposes padding based on an ARIMA model. Herwartz
and Schlüter (2017) use future prices for padding, as their objective of forecasting is foreign exchange rates.
Figure 3: Example paths of wavelet details and smooth extracted by two different methods
(2018:Q1)
[Figure: three panels (D1, D2, S) over 64 trading days, each comparing the sequential and
overall extractions.]
Note: The solid black lines are the wavelet details and smooth obtained by transforming a whole test
set beginning from 04/02/2018. The dashed red lines are those obtained by sequentially conducting the
DWT assuming reflection, in which the number of samples is increased by 4 while the sample size is
restricted to 64, and the last 4 values of each calculation are stored.
5 Conclusion
In this study, we have shown that there is a close relationship between forecasting through the
wavelet transform and the boundary problem it involves. How to decompose the test data into
the wavelet details and smooth is not unique, especially in relation to the boundary problem,
and is important for ensuring reproducibility in a practical environment. As discussed in the
previous section, the wavelet transform should be applied repeatedly at each time of prediction;
otherwise, data leakage may occur. Although the above example of data leakage concerns a
misuse of the DWT, the same applies to the MODWT, where the length of the data to be
transformed is not restricted to a multiple of $2^J$. Despite its being called “perfect foresight,”
such a prediction is often not perfect but merely of good accuracy, which makes the problem
difficult to notice.
References
Al Wadia, S., Mohd Tahir Ismailb, M. H. Alkhahazaleh, and Samsul Ariffin Abdul Karim (2011)
‘Selecting wavelet transforms model in forecasting financial time series data based on arima
model.’ Applied Mathematical Sciences 5(7), 315–326
Arino, Miguel A. (1995) ‘Time series forecasts via wavelets: an application to car sales in the
Spanish market’
Aussem, Alex, and Fionn Murtagh (1997) ‘Combining neural network forecasts on wavelet-
transformed time series.’ Connection Science 9(1), 113–122
Aussem, Alex, Jonathan Campbell, and Fionn Murtagh (1998) ‘Wavelet-based feature extraction
and decomposition strategies for financial forecasting.’ Journal of Computational Intelligence
in Finance 6(1), 5–12
Bao, Wei, Jun Yue, and Yulei Rao (2017) ‘A deep learning framework for financial time series
using stacked autoencoders and long-short term memory.’ PloS One 12(7), e0180944
Fernandez, Viviana (2008) ‘Traditional versus novel forecasting techniques: how much do we
gain?’ Journal of Forecasting 27(7), 637–648
Hamilton, James D. (2017) ‘Why you should never use the hodrick-prescott filter.’ National
Bureau of Economic Research Working Paper 23429
Herwartz, Helmut, and Stephan Schlüter (2017) ‘On the predictive information of futures’ prices:
A wavelet-based assessment.’ Journal of Forecasting 36(4), 345–356
Hsieh, Tsung-Jung, Hsiao-Fen Hsiao, and Wei-Chang Yeh (2011) ‘Forecasting stock markets
using wavelet transforms and recurrent neural networks: An integrated system based on
artificial bee colony algorithm.’ Applied Soft Computing 11(2), 2510–2525
Jothimani, Dhanya, Ravi Shankar, and Surendra S. Yadav (2015) ‘Discrete wavelet transform-
based prediction of stock index: a study on national stock exchange fifty index.’ Journal of
Financial Management and Analysis 28(2), 35–49
Kriechbaumer, Thomas, Andrew Angus, David Parsons, and Monica Rivas Casado (2014) ‘An
improved wavelet–arima approach for forecasting metal prices.’ Resources Policy 39, 32–41
Minu, K.K., M.C. Lineesh, and C. Jessy John (2010) ‘Wavelet neural networks for nonlinear
time series analysis.’ Applied Mathematical Sciences 4(50), 2485–2495
Ortega, Luis, and Khaldoun Khashanah (2014) ‘A neuro-wavelet model for the short-term fore-
casting of high-frequency time series of stock returns.’ Journal of Forecasting 33(2), 134–146
Pahasa, Jonglak, and Nipon Theera-Umpon (2007) ‘Short-term load forecasting using wavelet
transform and support vector machines.’ In ‘Power Engineering Conference, 2007. IPEC 2007.
International’ IEEE pp. 47–52
Rua, António (2011) ‘A wavelet approach for factor-augmented forecasting.’ Journal of Forecasting 30(7), 666–678
Tan, Zhongfu, Jinliang Zhang, Jianhui Wang, and Jun Xu (2010) ‘Day-ahead electricity price
forecasting using wavelet transform combined with arima and garch models.’ Applied Energy
87(11), 3606–3610
Yu, Lean, Yang Zhao, and Ling Tang (2017) ‘Ensemble forecasting for complex time series using
sparse representation and neural networks.’ Journal of Forecasting 36(2), 122–138
Zhang, Keyi, Ramazan Gençay, and M. Ege Yazgan (2017) ‘Application of wavelet decomposition
in time-series forecasting.’ Economics Letters 158, 41–46