A Novel Time Series Forecasting Approach with
Multi-Level Data Decomposing and Modeling∗
Xuemei Han, Congfu Xu, Huifeng Shen and Yunhe Pan
College of Computer Science,
Zhejiang University,
Hangzhou, 310027, P.R.China
zjuwendy@yahoo.com.cn, xucongfu@cs.zju.edu.cn
useraddcn@yahoo.com.cn, panyh@zju.edu.cn
Abstract—Time series produced in complex systems are
governed by multi-level laws, both macroscopic and microscopic.
These laws combine long-memory effects with short-term irregular
fluctuations in the same series. Traditional analysis and
forecasting methods do not distinguish these multi-level
influences and build a single model for prediction; such a model
needs many parameters to describe the complex system, which costs
efficiency or accuracy. This paper examines the structure of the
series data, decomposes a time series into several simpler ones
of different smoothness, and then samples them at multiple
scales. Each resulting series is modeled and predicted
separately, and the results are finally integrated. Experimental
results on stock forecasting show that the method is effective,
even for time series with large fluctuations.
Index Terms—time series forecasting, complex system, data
decomposing, multi-scale sampling
I. INTRODUCTION
Time series are one of the most frequently encountered
forms of data. Time series forecasting is an important topic
in data mining, since it plays a major role in economic
decision making, the prevention of natural disasters, and so on.
This paper focuses on time series from complex systems, such as
financial and meteorological data. Since these data are
controlled by multi-level laws, including macroscopic and
microscopic laws, they are more stochastic than other types of
data, such as engineering and physical data. Statisticians,
economists and mathematicians have paid much attention to the
structure of time series in complex systems. Most economists
consider a time series to be composed of trends, cycles,
seasonal variations and irregular fluctuations [1]. In 1951,
Hurst [2] investigated the long-term storage capacity of
reservoirs and was the first to find the long-memory trait in
hydrological series. After the 1980s, researchers found that
this trait is common in time series from many different
areas [3].
On the other hand, Rosenblatt [4] presents the concept of the
strong mixing condition, which reflects the short-term process
in time series.
Many methods have been proposed for time series forecasting,
such as Box-Jenkins [5][6], Neural Networks [7][8],
Genetic Algorithms [9][10], the Kalman filter method [11], etc.
These methods generally construct a single model with
complicated parameters on the raw data when describing the
complex system, and ignore preprocessing before modeling.
However, since long-term trends and short-term fluctuations
coexist in the same sequence, balancing accuracy and efficiency
is a dilemma: discarding a large amount of historic data that
may be useful for analysis and forecasting decreases the
accuracy, while giving the same weight to all historic data
inevitably increases the processing time.
This paper proposes a new forecasting approach with
multi-level decomposing and multi-scale sampling. In our
method, the time series is decomposed into several simpler
ones, which we call separated-series; every separated-series is
then sampled at its own scale. The separated-series can be
described with different models, or with the same model under
different parameters. Each separated-series is then forecast,
and the final forecast of the original series is obtained by
integrating those of the separated-series.
The rest of this paper is organized as follows. Section II
gives several definitions and formulations of time series
preprocessing. Section III explains the modeling and
forecasting processes. Experimental results are shown in
Section IV. The last section offers our conclusion.
--------------------
∗This paper is supported by National Natural Science Foundation of China (No. 60402010), Advanced Research Project of China Defense Ministry (No. 413150804), and partially supported by the Aerospace Research Foundation (No. 2003-HT-ZJDX-13).
Congfu Xu is the corresponding author.
II. TIME SERIES PREPROCESSING
Data produced in complex systems are often influenced by
multi-level factors whose periods of influence differ. Some
factors drive long-term evolution; for example, the inherent
value mechanism of a stock is decisive for the long-term trend
of its price. Other factors act over much shorter durations,
such as investor psychology in the stock market, which tends to
cause short-term fluctuations. This calls for multi-scale
models that can handle both long-term trend forecasting and
short-term fluctuation prediction.
1-4244-0332-4/06/$20.00 ©2006 IEEE
Proceedings of the 6th World Congress on Intelligent Control
and Automation, June 21 - 23, 2006, Dalian, China
In this paper, the original time series is decomposed into
several series with different cycles. The traits of the new
series are more visible, and the new series are easier to
model. After decomposing, each new series is sampled with its
own cycle, which improves efficiency without decreasing
accuracy. This section gives the formulation of the approach.
A. Definitions
We start with some definitions related to time series that
are convenient for describing our approach.
Definition 1 (standard sampling cycle): The standard sampling
cycle, denoted $\delta$, is the original interval of recordings
or statistics. This paper assumes that $\delta$ is constant for
a given system, and that all other sampling cycles are integer
multiples of $\delta$.
Definition 2 (operator $\oplus$ and separated-series): $T$ is an
ordered set $\{t_1, t_2, \ldots, t_n, \ldots\}$ with
$t_{i+1} - t_i = \delta$, $i \in \mathbb{N}$, and $1 \le i \le n$.
A time series $X = \{x_{t_i}, t_i \in T\}$ can be decomposed into
several time series, and this operation is denoted as $\oplus$:
\[ X = X^{(1)} \oplus X^{(2)} \oplus \cdots \oplus X^{(m)}, \tag{1} \]
and the $i$-th positions of the sequences $X^{(j)}$, $1 \le j \le m$, satisfy
\[ \sum_{j=1}^{m} x^{(j)}_{t_i} = x_{t_i}. \]
$X^{(j)}$ is named a separated-series. It is equinumerous to
$X$, and elements of the same order in $X^{(j)}$ and $X$ have the
same sampling time.
B. Time Series Decomposing
In this subsection we present a method to decompose a
time series into several separated-series satisfying equation
(1). This method, which we call multi-smoothness-factor
decomposing, is an inductive process:
a) Set smoothness-factor array: Given an array
\[ L = \{l_1, l_2, \ldots, l_{m-1}\} \]
where $l_i \in \mathbb{N}$, $1 \le i \le m-1$, and $L$ is a
monotonically decreasing sequence, for instance a binary-base
exponent array like $\{2^6, 2^4, 2^2, 2^1\}$. $L$ can be
determined by experiment or experience.
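b) Evaluation of $X^{(1)}$: By using smoothness-factor $l_1$,
the $p$-th element in $X^{(1)}$ can be calculated by
\[
x^{(1)}_{t_p} =
\begin{cases}
\dfrac{1}{p} \sum\limits_{q=1}^{p} x_{t_q}, & p \le l_1 + 1 \\[2ex]
\dfrac{1}{l_1} \sum\limits_{q=p-l_1}^{p} x_{t_q}, & p > l_1 + 1
\end{cases}
\]
c) Evaluation of $X^{(j)}$, $1 < j < m$: Elements of time
series $X^{(j)}$ are obtained by subtracting the elements of
$X^{(1)}, X^{(2)}, \ldots, X^{(j-1)}$ from $X$ at the same
positions. Using smoothness-factor $l_j$, the $p$-th element in
$X^{(j)}$ can be calculated by
\[
x^{(j)}_{t_p} =
\begin{cases}
\dfrac{1}{p} \sum\limits_{q=1}^{p} \Bigl( x_{t_q} - \sum\limits_{k=1}^{j-1} x^{(k)}_{t_q} \Bigr), & p \le l_j + 1 \\[2ex]
\dfrac{1}{l_j} \sum\limits_{q=p-l_j}^{p} \Bigl( x_{t_q} - \sum\limits_{k=1}^{j-1} x^{(k)}_{t_q} \Bigr), & p > l_j + 1
\end{cases}
\]
d) Evaluation of $X^{(m)}$: Elements of time series $X^{(m)}$
are obtained by subtracting the elements of $X^{(1)}, X^{(2)},
\ldots, X^{(m-1)}$ from $X$ at the same positions.
Algorithm 1 shows the process of evaluating $x^{(j)}_{t_p}$.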
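Algorithm 1: Evaluate-Elements-of-X(j)
Data: $X$, $L$
Result: $x^{(j)}_{t_p}$, $1 \le j \le m$
begin
    $x^{(1)}_{t_p} \leftarrow x_{t_p}$
    for $j = 1$ to $m - 1$ do
        if $p \le l_j + 1$ then
            $x^{(j)}_{t_p} \leftarrow (x^{(j)}_{t_{p-1}} \cdot (p - 1) + x_{t_p}) / p$
        else
            $x^{(j)}_{t_p} \leftarrow (x^{(j)}_{t_{p-1}} \cdot l_j - x^{(j)}_{t_{p-l_j-1}} + x_{t_p}) / l_j$
        $x_{t_p} \leftarrow x_{t_p} - x^{(j)}_{t_p}$
    $x^{(m)}_{t_p} \leftarrow x_{t_p}$
end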
Thus the original time series is decomposed into several
series with different traits. The number $m$ of
separated-series can be determined by experiment; in most
cases, 3 to 5 is appropriate. The separated-series turn from
smooth to coarse as $j$ increases.
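As an illustration only, steps a) to d) can be sketched in Python as follows. The function name `decompose` and the list-based representation are assumptions made for this sketch and do not come from the paper. The sketch evaluates the closed-form definitions directly instead of the recursive update of Algorithm 1, and it keeps the window of $l_j + 1$ points together with the divisor $l_j$ exactly as the formulas state them.

```python
def decompose(x, factors):
    """Split x into len(factors) + 1 separated-series that sum back to x.

    x        list of observations taken at the standard sampling cycle
    factors  smoothness-factor array L, monotonically decreasing
    """
    residual = list(x)      # what is left after removing the smoother levels
    separated = []
    for l in factors:
        level = []
        for p in range(1, len(residual) + 1):   # p is 1-based, as in the text
            if p <= l + 1:
                # Start-up phase: plain mean of the first p residual values.
                level.append(sum(residual[:p]) / p)
            else:
                # q = p - l .. p covers l + 1 values; the divisor is l,
                # following the formula as stated.
                level.append(sum(residual[p - l - 1:p]) / l)
        separated.append(level)
        residual = [r - s for r, s in zip(residual, level)]
    separated.append(residual)   # X(m): everything the smoother levels missed
    return separated
```

By construction the separated-series add up to the original series at every position, which is the property required by equation (1).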
C. Multi-Scale Sampling
A typical decomposing result is shown in Fig. 1. The
original time series is decomposed into three separated-series
by two smoothness factors. The separated-series turn from
smooth to coarse, and the periods of their fluctuations become
shorter, as $j$ increases. Since the smoother separated-series
reflect the extension of long-term trends, their long-term
history must be considered in modeling. However, each point in
a smoother separated-series carries less information than a
point in a coarser one, so sampling the smoother
separated-series with the standard sampling cycle $\delta$
wastes time. Multi-scale sampling has been discussed in recent
years [12]. Its essence is that, within the same time series,
newer data are sampled at a high frequency and older data at a
lower frequency. This seems simple, but the varying sampling
interval causes extra trouble for later processing.
Fig. 1. A typical figure of separated-series (curves $X^{(1)}$, $X^{(2)}$ and $X^{(3)}$).
This paper proposes a new multi-scale sampling method.
Each separated-series is sampled at a constant frequency, while
the sampling frequency varies from one separated-series to
another. Let the ordered set $S = \{s_1, s_2, \ldots, s_m\}$ be
the sampling cycle array, where $s_j$ ($1 \le j \le m$) is the
sampling cycle of $X^{(j)}$ and the elements of $S$ are
monotonically decreasing. For convenience of calculation, we
choose $s_{j-1}/s_j \in \mathbb{N}$. The new time series sampled
from $X^{(j)}$ is denoted as $\dot{X}^{(j)}$.
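The sampling rule can be sketched as below. This is a minimal sketch under stated assumptions: the function name `sample_separated_series` is hypothetical, sampling cycles are expressed as integer multiples of the standard sampling cycle, and a point is kept when its 1-based position is a multiple of the cycle, which mirrors the counter test used in Algorithm 2.

```python
def sample_separated_series(separated, cycles):
    """Keep every s_j-th point of the j-th separated-series.

    separated  list of separated-series, smoothest first
    cycles     sampling cycle array S, in units of the standard cycle
    """
    if len(separated) != len(cycles):
        raise ValueError("one sampling cycle per separated-series is required")
    for coarse, fine in zip(cycles, cycles[1:]):
        # The text chooses s_{j-1} / s_j to be a natural number.
        if coarse % fine != 0:
            raise ValueError("each cycle must be a multiple of the next one")
    # series[s - 1::s] picks the 1-based positions s, 2s, 3s, ...
    return [series[s - 1::s] for series, s in zip(separated, cycles)]
```

With a cycle array such as {10, 5, 1}, the smoothest series keeps one point in ten while the coarsest series keeps every point, so the long history of the smooth series is covered by far fewer samples.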
III. MODELING AND FORECASTING
In the following subsections, we describe the process of
modeling and forecasting, including how new data enter and
other related details.
A. New Data Entering
Time series analysis and forecasting is typically an online
process. New data enter continuously during forecasting and
should be processed at once. Algorithm 2 shows how the sampled
separated-series are updated when new data enter.
Algorithm 2: New-Data-In
Data: Sampled separated-series $\{\dot{X}^{(j)}\}$, $x_{t_k}$, $L$, $S$;
      $\{CountS[j]\}$ is the set of sampling counters.
Result: $\{\dot{X}'^{(j)}\}$ with new data, $1 \le j \le m$
begin
    Evaluate-Elements-of-X(j)
    for $j = 1$ to $m$ do
        $CountS[j] \leftarrow CountS[j] + 1$
        if $CountS[j] \bmod S[j] = 0$ then
            $\dot{X}^{(j)} \leftarrow \dot{X}^{(j)} + \{x^{(j)}_{t_k}\}$
end
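One possible way to realize this online update is sketched below. The class name `OnlineSampler` and its attributes are hypothetical. The sketch assumes that each level averages the residual left by the smoother levels, using the window of $l_j + 1$ points and the divisor $l_j$ from the closed-form definitions of Section II, and it keeps only that window in memory.

```python
from collections import deque


class OnlineSampler:
    """Decompose each incoming point and update the sampled series."""

    def __init__(self, factors, cycles):
        if len(cycles) != len(factors) + 1:
            raise ValueError("need one sampling cycle per separated-series")
        self.factors = list(factors)
        self.cycles = list(cycles)
        # One bounded window per smoothing level: the last l + 1 inputs.
        self.windows = [deque(maxlen=l + 1) for l in self.factors]
        self.seen = 0                          # points received so far (p)
        self.counts = [0] * len(self.cycles)   # CountS in Algorithm 2
        self.sampled = [[] for _ in self.cycles]

    def push(self, x_new):
        """Process one new observation; return its m separated values."""
        self.seen += 1
        p = self.seen
        residual = x_new
        values = []
        for window, l in zip(self.windows, self.factors):
            window.append(residual)
            divisor = p if p <= l + 1 else l
            level_value = sum(window) / divisor
            values.append(level_value)
            residual = residual - level_value
        values.append(residual)                # the last, coarsest series
        for j, value in enumerate(values):
            self.counts[j] += 1
            if self.counts[j] % self.cycles[j] == 0:
                self.sampled[j].append(value)
        return values
```

Each call to `push` corresponds to one run of Algorithm 2: the new point is decomposed first, then every counter is advanced and the value is appended only when the counter reaches a multiple of the sampling cycle.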
B. Construct Forecasting Models
Neural Networks, Genetic Algorithms, etc., can be employed
to model the separated-series. Different models, or one model
with different parameters, can be adopted for different
separated-series. Here we illustrate the modeling and
forecasting process with the Box-Jenkins method.
A stationary zero-mean time series $\{y_k\}$ can be regarded
as the response of a linear time-invariant random system to
white-noise input. Then $y_k$ satisfies the difference equation
\[ y_k = \sum_{i=1}^{p} \phi_i y_{k-i} + \varepsilon_k - \sum_{j=1}^{q} \theta_j \varepsilon_{k-j}, \tag{2} \]
which is denoted as ARMA($p$, $q$), where
$\sum_{i=1}^{p} \phi_i y_{k-i}$ is the weighted sum of the most
recent $p$ responses, and $\sum_{j=1}^{q} \theta_j \varepsilon_{k-j}$
is the weighted sum of the most recent $q$ white-noise terms.
Estimates of $\phi_1, \phi_2, \ldots, \phi_p$, $\theta_1,
\theta_2, \ldots, \theta_q$ and $\sigma_\varepsilon^2$ can be
obtained by the least squares method or maximum likelihood
estimation, after which $y_k$ can be predicted by equation (2).
Details of constructing an ARMA model can be found in [3].
Assume that there are $H$ sampling points in $\dot{X}^{(j)}$.
The modeling process is then:
1) Select the most recent $h$ sampling points, $h \le H$. If
$\dot{X}^{(j)}$ is not a stationary sequence, it must be
transformed into a stationary one by first-order or
higher-order differencing.
2) Construct an ARMA model for the sampling points with
appropriate parameters. Determining the order of an
ARMA model is discussed in [13]. The adequacy of the
model should be checked with the Box-Pierce test [3].
3) Predict the value at the next moment by using equation (2).
4) When new data arrive, compare them with the prediction
from step 3 and calculate the residuals.
The model should be verified periodically. For a
time-variant complex system, the parameters may no longer be
appropriate for prediction as time passes. When the residuals
no longer satisfy the model assumptions, the most recent $h$
sampling points are selected again and the model is updated
through steps 1 and 2.
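Steps 3 and 4 can be illustrated with a small sketch that applies equation (2) directly. The function names `arma_predict` and `run_online` are hypothetical, the coefficients are assumed to have been estimated already (steps 1 and 2 are not shown), and the one-step residuals are used as stand-ins for the unobserved white-noise terms, which is an assumption of this sketch.

```python
def arma_predict(phi, theta, y_hist, eps_hist):
    """One-step forecast from equation (2), taking E[eps_k] = 0.

    phi, theta  AR and MA coefficients (phi[0] is phi_1, and so on)
    y_hist      past responses, oldest first
    eps_hist    past noise estimates, oldest first
    """
    p, q = len(phi), len(theta)
    if len(y_hist) < p or len(eps_hist) < q:
        raise ValueError("not enough history for the chosen orders")
    ar_part = sum(phi[i] * y_hist[-1 - i] for i in range(p))
    ma_part = sum(theta[j] * eps_hist[-1 - j] for j in range(q))
    return ar_part - ma_part


def run_online(phi, theta, observations):
    """Forecast each new point, then record its residual (step 4)."""
    p, q = len(phi), len(theta)
    y_hist = list(observations[:p])
    eps_hist = [0.0] * q     # assumption: unknown early noise is set to zero
    forecasts, residuals = [], []
    for y_new in observations[p:]:
        y_hat = arma_predict(phi, theta, y_hist, eps_hist)
        residual = y_new - y_hat
        forecasts.append(y_hat)
        residuals.append(residual)
        y_hist.append(y_new)
        eps_hist.append(residual)
    return forecasts, residuals
```

The list of residuals returned by `run_online` is what the periodic verification would inspect; when it stops looking like white noise, the coefficients would be re-estimated from the most recent sampling points.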
C. Result Integration
The prediction of each separated-series is now available.
Since the sampling cycle of each $\dot{X}^{(j)}$ is different,
the prediction cycle becomes shorter as the sampling frequency
becomes higher. These predictions are useful on their own; for
example, the prediction for the longer sampling cycle shows the
potential trend of the time series. For accurate short-term
prediction, however, the results of all the separated-series
must be integrated to obtain the final forecasting result.
Fig. 2. Relations between multi-scale frequency series
As shown in Fig. 2, the final forecasting result is
integrated from those of the sequences sampled at all
frequencies. To predict the value at moment $\tau$, the
prediction at $\tau$ of each separated-series is needed, so
that the predictions can be combined according to equation (1).
Since the sampling cycles differ, not every prediction at
$\tau$ can be obtained directly, but the nearest observations
and forecasts are available. Since this mostly happens for the
smoother separated-series, the missing predictions can simply
be calculated by polynomial interpolation. Summing the
predictions of all separated-series then gives the final result.
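The integration step might look like the sketch below. The paper only says "polynomial interpolation"; this sketch assumes the simplest case, a degree-1 polynomial between the nearest observation and the nearest forecast of each separated-series. The function names and the tuple layout are hypothetical.

```python
def interpolate_at(tau, t0, v0, t1, v1):
    """Linear interpolation (degree-1 polynomial) between two points."""
    if t1 == t0:
        return v0
    return v0 + (v1 - v0) * (tau - t0) / (t1 - t0)


def integrate_forecasts(tau, per_series):
    """Sum the value of every separated-series at time tau.

    per_series holds one (t_obs, v_obs, t_pred, v_pred) tuple per
    separated-series: the nearest observation at or before tau and the
    nearest forecast at or after tau.
    """
    total = 0.0
    for t_obs, v_obs, t_pred, v_pred in per_series:
        if not t_obs <= tau <= t_pred:
            raise ValueError("tau must lie between observation and forecast")
        total += interpolate_at(tau, t_obs, v_obs, t_pred, v_pred)
    return total
```

For the series sampled with the standard cycle the forecast time equals $\tau$, so the interpolation returns the forecast itself; only the more coarsely sampled series are actually interpolated.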
IV. EXPERIMENTS AND RESULTS
Stock data are chosen to validate our approach. Some
forecasting results are shown in Figs. 3, 4 and 5. We use 530
daily closing prices of Minsheng Bank for modeling and
forecasting. The smoothness-factor array is $L = \{60, 15\}$,
which decomposes the original time series into three
separated-series. In Fig. 3, the top left subplot is the
fitting curve of the original time series, and the other
subplots are the fitting curves of the separated-series.
The sampling cycle array $S$ is chosen as $\{10, 5, 1\}$
according to the smoothness of each separated-series. The final
fitting result is shown in Fig. 4 (closing price on the
vertical axis and time on the horizontal axis).
Fig. 3. The original time series and the separated-series
The curve of relative errors is shown in Fig. 5. The mean
relative error is 0.0105, and 92.83% of the relative errors are
smaller than 3%. Compared with approaches that model with fixed
parameters, such as those in [14][15], our approach obtains
better forecasting results.
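The two summary figures can be computed as in the following sketch. The paper does not spell out its definition of relative error; the sketch assumes |forecast - observed| / |observed|, and the function names are hypothetical.

```python
def relative_errors(observed, forecast):
    """Per-point relative error, assuming |f - o| / |o| as the definition."""
    return [abs(f - o) / abs(o) for o, f in zip(observed, forecast)]


def summarize(observed, forecast, threshold=0.03):
    """Return the mean relative error and the share of errors below threshold."""
    errors = relative_errors(observed, forecast)
    mean_error = sum(errors) / len(errors)
    share_below = sum(e < threshold for e in errors) / len(errors)
    return mean_error, share_below
```

Applied to the observed and forecast closing prices, `summarize` would return the pair of quantities reported above, namely the mean relative error and the fraction of points with an error below 3%.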
Fig. 4. The fitting and forecasting curve of Minsheng Bank stock (legend: observed data, forecasting data)
Fig. 5. Fitting curve of errors
V. CONCLUSION
In this paper, we propose a new approach for time series
analysis and forecasting. Through multi-level decomposing, a
complex time series is turned into a set of simpler ones; the
separated-series are then sampled according to their
smoothness and modeled separately. Treating the long-term
trends and short-term fluctuations with different weights
brings gains in both accuracy and efficiency.
REFERENCES
[1] B. L. Bowerman and R. T. O'Connell, Forecasting and Time Series: An
Applied Approach, 3rd ed. Florence, KY: Thomson Learning, 1993.
[2] H. Hurst, “Long-term storage capacity of reservoirs,” Trans. of the
American Society of Civil Engineers, vol. 116, pp. 770–779, 1951.
[3] Z. Shi-ying and F. Zhi, Cointegration Theory and SV Models, 1st ed.
Tsinghua University, 2004.
[4] M. Rosenblatt, “A central limit theorem and a strong mixing condition,”
in Proc. of the National Academy of Sciences, vol. 42, 1956, pp.
43–47.
[5] S. I. Wu and R.-P. Lu, “Combining artificial neural networks and
statistics for stock market forecasting,” in Proc. of ACM Conference
on Computer Science, May 1993, pp. 257–264.
[6] S.-J. Huang and K.-R. Shih, “Short-term load forecasting via ARMA
model identification including non-Gaussian process considerations,”
IEEE Trans. on Power Systems, vol. 18, no. 2, pp. 673–679, May 2003.
[7] R. S. T. Lee, “iJADE stock advisor: an intelligent agent based stock
prediction system using hybrid RBF recurrent network,” IEEE Trans. on
Systems, Man, and Cybernetics, vol. 34, no. 3, pp. 421–428, May 2004.
[8] D. V. P. Emad, W. Saad, I. Donald, and C. Wunsch, “Comparative study
of stock trend prediction using time delay, recurrent and probabilistic
neural networks,” IEEE Trans. on Neural Networks, vol. 9, no. 6, pp.
1456–1470, 1998.
[9] H. Iba and T. Sasaki, “Using genetic programming to predict financial
data,” in IEEE Proc. of Congress on Evolutionary Computation, Jul.
1999, pp. 244–251.
[10] S. H. Ling, F. H. F. Leung, H. K. Lam, and P. K. S. Tam, “Short-term
electric load forecasting based on a neural fuzzy network,” IEEE Trans.
on Industrial Electronics, vol. 50, no. 6, pp. 1305–1316, 2003.
[11] D. McGonigal and D. Ionescu, “An outline for a Kalman filter and recur-
sive parameter estimation approach applied to stock market forecasting,”
in IEEE Conf. on Electrical and Computer Engineering, vol. 2, Sep.
1995, pp. 1148–1151.
[12] E. Palpanas, M. Vlachos, and E. J. K. et al, “Online amnesic approxi-
mation of streaming time series,” in IEEE Conf. on ICDE, 2004, pp.
338–349.
[13] B. Steve and O. Cyril, “A comparison of Box-Jenkins and objective
methods for determining the order of a nonseasonal ARMA model,”
Journal of Forecasting, vol. 13, pp. 419–434, 1994.
[14] B. R. Chang, “A study of non-periodic short-term random walk forecast-
ing based on RBFNN, ARMA, or SVR-GM(1,1|τ) approach,” in
IEEE Conf. on Systems, Man, and Cybernetics, 2003.
[15] Y.-F. Wang, “On-demand forecasting of stock prices using a real-time
predictor,” IEEE Trans. on Knowledge and Data Engineering, vol. 15,
no. 4, pp. 1033–1037, 2003.