802 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 38, NO. 6, NOVEMBER 2008
A Hybrid System Integrating a Wavelet and TSK
Fuzzy Rules for Stock Price Forecasting
Pei-Chann Chang and Chin-Yuan Fan
Abstract—The prediction of future time series values based on
past and present information is very useful and necessary for various industrial and financial applications. In this study, a novel approach that integrates the wavelet and Takagi–Sugeno–Kang (TSK) fuzzy-rule-based systems for stock price prediction is developed. A wavelet transform using the Haar wavelet is applied to decompose the time series in the Haar basis. From the hierarchical scalewise decomposition provided by the wavelet transform, a number of interesting representations of the time series are selected for further analysis. Then, the TSK fuzzy-rule-based system is employed to predict the stock price based on a set of selected technical indices. To avoid rule explosion, the k-means algorithm is applied to cluster the data, and a fuzzy rule is generated for each cluster. Finally, a K nearest neighbor (KNN) method is applied as a sliding window to further fine-tune the forecasted result from the TSK model. Simulation results show that the model has successfully forecasted the price variation for stocks with accuracy up to 99.1% on the Taiwan Stock Exchange index. Comparative studies with existing prediction models indicate that the proposed model is very promising and can be implemented in a real-time trading system for stock price prediction.
Index Terms—Fuzzy ruled system, K-mean clustering, multi-
ple regression analysis (MRA), simulated annealing (SA), wavelet
preprocessing.
I. INTRODUCTION
MINING stock market tendencies is a challenging task due to the market’s high volatility and noisy environment. Many factors influence the performance of a stock market, including political events, general economic conditions, and traders’ expectations. Though stock and futures traders have relied heavily upon various types of intelligent systems to make trading decisions, the success so far has been quite limited [4].
Many attempts have been made to predict the financial markets, ranging from traditional time series approaches to artificial intelligence techniques such as fuzzy systems and, especially, artificial neural network (ANN) methodologies [1]. However, the main drawback of ANNs and other black-box techniques is the tremendous difficulty in interpreting the results. They do not provide insight into the nature of the interactions between the technical indicators and the stock market fluctuations. Thus, there is a need to develop methodologies that facilitate an increased understanding of market processes, in addition to providing temporally accurate predictions [17], [35], [65]. The dimensionality of financial time series data creates a further challenge for ANN approaches.
Manuscript received March 9, 2007; revised August 5, 2007 and January 3, 2008. First published September 26, 2008; current version published October 20, 2008. This paper was recommended by Associate Editor G. Papadimitriou.
P.-C. Chang is with the Department of Information Management, Yuan Ze University, Taoyuan 32026, Taiwan, R.O.C. (e-mail: iepchang@saturn.yzu.edu.tw).
C.-Y. Fan is with the Department of Industrial Engineering and Management, Yuan Ze University, Taoyuan 32026, Taiwan, R.O.C. (e-mail: S948906@mail.yzu.edu.tw).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMCC.2008.2001694
The development of a timely and accurate trading decision-making tool is key for stock traders to make profits. Since the stock price series is affected by a mixture of deterministic and random factors [17], new tools and techniques are needed to deal with noise and nonlinearity in stock price prediction. Data mining, which aims to find rules hidden in very large amounts of data, is a new and efficient approach for time series analysis. Data mining on time series first needs to translate the continuous time series into discrete symbol sequences. In this work, the wavelet transform using the Haar wavelet is applied to decompose the time series in the Haar basis. From the hierarchical scalewise decomposition provided by the wavelet transform, a number of interesting representations of the time series are then selected for further analysis. In addition, statistical analysis is employed to select the factors that most affect the performance of the stock market. These factors are chosen as the inputs of the Takagi–Sugeno–Kang (TSK) fuzzy-rule-based system to predict the future stock price. The TSK fuzzy system is chosen for its universal approximation capability [54] and the possibility of gaining insights into the data, which is of particular interest for stock price prediction.
The proposed framework combines several soft computing (SC) techniques: a wavelet transform, a TSK fuzzy system, data clustering, simulated annealing (SA), and K nearest neighbor (KNN). In addition to wavelet-based data representation, the k-means clustering algorithm is applied to cluster the data before the TSK fuzzy rules are generated. A fuzzy rule is then generated for each cluster, which enables us to determine the membership functions of the fuzzy subsets as well as the optimal number of fuzzy rules. Finally, a KNN is applied as a sliding window to further fine-tune the forecasted result from the TSK model.
The remainder of the paper is organized as follows. Section II reviews the different methods for stock forecasting using SC techniques such as neural networks (NNs) and fuzzy systems. Section III describes the proposed hybrid approach to stock price prediction, which integrates a wavelet with the TSK fuzzy-rule-based system. Section IV presents empirical results of the hybrid approach and compares it with three other approaches. Finally, conclusions and future directions of the research are discussed in Section V.
1094-6977/$25.00 © 2008 IEEE
Authorized licensed use limited to: Hong. Downloaded on February 8, 2009 at 00:06 from IEEE Xplore. Restrictions apply.
CHANG AND FAN: HYBRID SYSTEM INTEGRATING A WAVELET AND TSK FUZZY RULES 803
II. LITERATURE SURVEY
Conventional research addressing the stock forecasting problem has generally relied on time series analysis techniques, e.g., mixed autoregressive moving average (ARMA) models as well as multiple regression models (MRMs). However, these methods may yield ineffective results, since their assumptions leave out a number of factors, such as macroeconomic or political effects, that may seriously influence stock tendencies.
White [59] was the first to use NNs for market forecasting. He used a feedforward NN (FFNN) to study IBM daily common stock returns and found that his training results were overoptimistic, being the result of overfitting or of learning irrelevant features. In general, there are two different methodologies for stock price prediction using ANNs as a research tool [68]. The first methodology considers the stock price variations as a time series and predicts the future price based on its past values. In this approach, ANNs have been employed as the predictor; see, e.g., [5], [10], [14], [16], [17], [27], [32], [37], [42], [48], [63], and [64]. These prediction models, however, have their limitations owing to the tremendous noise and high dimensionality of stock price data. Therefore, the performance of the existing models is not satisfactory [65].
The second approach takes technical indices and qualitative factors, such as political effects, into account in stock market forecasting and trend analysis.
Yao and Poh [63] used technical indicators (%K and %D) along with price information to predict future price values. They achieved good returns and found that their models performed better using daily data rather than weekly data. Hobbs and Bourbakis [26] predicted prices of stocks based on the fluctuations in the rest of the market for the same day. They showed consistently high rates of return, although the investment was made in a frictionless environment; paying commissions on the large number of trades instigated would certainly erode much of the benefit of the proposed trading strategy. Austin and Looney [8] developed an NN that predicts the proper time to move money into and out of the stock market. They used two valuation indicators, two monetary policy indicators, and four technical indicators to predict the four-week forward excess return on the dividend-adjusted S&P 500 stock index. The results significantly outperformed the buy-and-hold strategy. Backpropagation ANNs have been applied to predict future elements in the price time series of the Korea composite stock price index (KOSPI) [30]. López et al. [38] used time delay connections in enhanced NNs (that is, the addition of time-dependent information in each weight) to forecast IBEX-35 (Spanish stock index) close prices one day ahead. Stochastic NNs have been applied to forecast the volatility of index returns in the TUNINDEX (Tunisian stock index), with out-of-sample NN results found to be superior to those of traditional generalized autoregressive conditional heteroskedastic (GARCH) models [52]. Nenortaite and Simutis [41] presented a trading approach based on one-step-ahead profit estimates created by combining NNs with particle swarm optimization algorithms. The method is profitable given small commission costs, but does not exceed the S&P 500 returns when realistic commissions are introduced. Jaruszewicz and Mandziuk [29] trained ANNs using both technical analysis variables and intermarket data to predict one-day changes in the NIKKEI index. They achieved good results using moving average convergence divergence (MACD), Williams, and two averages, along with related market data from the National Association of Securities Dealers Automated Quotation System (NASDAQ) and DAX.
A recent trend is to combine the SC technologies of NNs, fuzzy logic (FL), and genetic algorithms (GAs), which may significantly improve an analysis [1], [2], [9], [11]–[13], [19], [24], [25], [27], [31], [36], [39], [40], [51], [53], [55]–[57], [61]. In general, NNs are used for learning and curve fitting, FL is used to deal with imprecision and uncertainty, and GAs are used for search and optimization [13], [28], [39], [62]. Zadeh [65] pointed out that merging these technologies allows for the exploitation of a tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, and low solution cost.
Wavelet analysis is a relatively new field in signal processing [18]. Wavelets are mathematical functions that decompose data into different frequency components and then study each component with a resolution matched to its scale, where a scale refers to a time horizon [47]. Wavelet filtering is particularly relevant to the volatile and time-varying characteristics of real-world time series and is not restrained by the assumption of stationarity [44]. The wavelet transform decomposes a process into different scales, which makes it useful in differentiating seasonalities, revealing structural breaks and volatility clusters, and identifying local and global dynamic properties of a process at these timescales [5], [23], [45]. Wavelet analysis has been shown to be especially productive in analyzing, modeling, and predicting the behavior of financial instruments as diverse as stocks and exchange rates [43], [46].
In this paper, wavelet filtering is used because of the key property of wavelets for economic analysis: decomposition of the data by time scale. Economic and financial systems, like many other systems, contain variables that operate on a variety of time scales simultaneously, so that the relationships between variables may well differ across time scales. A wavelet decomposes the time series into a range of frequency scales [7], [19], [22]. The lower level of the decomposition can capture the long-range dependencies with only a few coefficients, while the higher levels capture the usual short-term dependencies. This research, motivated by the effective preprocessing capability of wavelets and the predictive power of the fuzzy rule system, presents a hybrid system that integrates the wavelet and a TSK fuzzy rule system for stock price prediction.
III. METHODOLOGY
Traditionally, there are two major approaches to be considered in stock forecasting: technical analysis and fundamental analysis. Technical analysis concentrates on the study of market action, and fundamental analysis concentrates on the economic forces of supply and demand that cause price movements. In addition, as explained in [68], statistics, technical analysis, fundamental analysis, and linear regression are all used to attempt to predict the market’s direction. However, technical indexes alone often miss many potential opportunities in the stock movement before the appropriate trading signal is generated. Thus, although technical analysis may yield insights into the market, its highly subjective nature and inherent time delay make it less than ideal for fast, dynamic trading markets. That is why a wavelet and TSK fuzzy-rule-based model is developed in this research.
Fundamental analysis must rely on the reasons for price movement, and this process is very complicated, since many factors may affect the price change, such as political and psychological events. Therefore, the basic assumption of this research is that the price movement is closely related to the variation of technical indexes, as is widely assumed in financial time series research. A set of technical indexes is applied as input factors, and the output is the stock price. To study their relationship, a hybrid method that integrates a wavelet and a Takagi and Sugeno fuzzy-system-based forecasting model is developed and implemented in this research for Taiwan stock price prediction. The main procedures of the hybrid system are shown in Fig. 1, and the inputs and outputs of each block are further explained in the following sections.
The notation of variables applied in the following sections is shown in Table I.
A. Data Preprocessing Using Wavelet Theory
Wavelet theory is applied as a data preprocessing method because, as mentioned by Ramsey [46], the wavelet representation is able to deal with the nonstationarity of economic and financial time series. One of the benefits of a wavelet approach is its flexibility in handling very irregular data series, as illustrated in [47]. Economic and financial systems contain variables that operate on a variety of time scales simultaneously, so that the relationship between variables may differ across time scales. The most important property of wavelets for economic analysis is decomposition by time scale.
In this research, the Haar wavelet is applied as our major wavelet transform tool. The Haar wavelet is a wavelet evolved from the continuous wavelet transform. According to [3], wavelets not only decompose the data in terms of time and frequency, but can also greatly reduce processing time. For a time series of size $n$, the wavelet decomposition used here can be determined in $O(n)$ time. Comparing the Haar wavelet with the Coiflets wavelet [18], the Coiflets wavelet considers more aspects than the Haar wavelet, especially in combining compact support with various degrees of smoothness and numbers of vanishing moments [3], but the Haar wavelet still offers simple and fast processing without losing much performance relative to other wavelet systems [48]. The Haar wavelet has been widely applied in time series forecasting [6], [48].
Depending on normalization rules, there are two types of Haar wavelets within a given function family: father and mother wavelets
$$\Phi_{a,b} = 2^{-a/2}\,\Phi\!\left(\frac{t - 2^{a} b}{2^{a}}\right), \qquad \psi_{a,b} = 2^{-a/2}\,\psi\!\left(\frac{t - 2^{a} b}{2^{a}}\right)$$
Father wavelets:
$$\int \Phi(t)\,dt = 1$$
Mother wavelets:
$$\int \psi(t)\,dt = 0. \tag{1}$$
Fig. 1. Framework of the hybrid model.
Father wavelets are used for the “lowest frequency” smooth components, i.e., those requiring wavelets with the widest support, and mother wavelets are used for the “higher frequency” detail components. In other words, father wavelets represent the “trend components” and mother wavelets represent all the deviations from trend. While a sequence of mother wavelets is used to represent a function, only one father wavelet is used.
A time series, i.e., a function $f(t)$, is the input to be represented by a wavelet analysis, and it can be built up as a sequence of projections onto father and mother wavelets indexed by both $\{b\}$, $b = \{0, 1, 2, \ldots\}$, and by $\{s\} = 2^{a}$, $a = \{1, 2, 3, \ldots\}$. In actual data analysis using discretely sampled data, it is necessary to create a lattice over which the calculations will be made. Mathematically, it is convenient to use a dyadic expansion, as illustrated in (1).
TABLE I
NOTATION OF VARIABLES APPLIED IN SECTION III
The coefficients in the expansion are given by the projections:
$$s_{A,b} = \int f(t)\,\Phi_{A,b}(t)\,dt, \qquad d_{a,b} = \int f(t)\,\psi_{a,b}(t)\,dt, \qquad a = 1, 2, \ldots, A \tag{2}$$
where $A$ is the maximum scale sustainable by the number of data points. The representation of the signal $f(t)$ can now be given by:
$$f(t) = \sum_{b} s_{A,b}\,\Phi_{A,b}(t) + \sum_{b} d_{A,b}\,\psi_{A,b}(t) + \sum_{b} d_{A-1,b}\,\psi_{A-1,b}(t) + \cdots + \sum_{b} d_{1,b}\,\psi_{1,b}(t). \tag{3}$$
The approximation can be represented as:
$$f(t) = S_{A}(t) + D_{A}(t) + D_{A-1}(t) + \cdots + D_{a}(t) + \cdots + D_{1}(t)$$
$$S_{A}(t) = \sum_{b} s_{A,b}\,\Phi_{A,b}(t)$$
$$D_{A}(t) = \sum_{b} d_{A,b}\,\psi_{A,b}(t). \tag{4}$$
When the number of observations $n$ is divisible by $2^{A}$, the number of coefficients of each type is given by the following:
1) at the finest scale, $2^{1}$: $n/2$ coefficients $d_{1,b}$;
2) at the next scale, $2^{2}$: $n/2^{2}$ coefficients $d_{2,b}$;
3) at the coarsest scale, $2^{A}$: $n/2^{A}$ coefficients $d_{A,b}$; and
4) at the coarsest scale, $2^{A}$: $n/2^{A}$ coefficients $s_{A,b}$
so that
$$n = \frac{n}{2} + \frac{n}{4} + \cdots + \frac{n}{2^{A-1}} + \frac{n}{2^{A}} + \frac{n}{2^{A}}. \tag{5}$$
Fig. 2. Wavelet transforming process from f(t).
As shown in Fig. 2, $f(t)$ represents the original data, $S_{1}$ represents an approximation signal, and $D_{1}$ represents a detail signal. We can define the multiresolution decomposition of a signal by specifying $S_{A}$ at the coarsest scale, and
$$S_{A-1} = S_{A} + D_{A}. \tag{6}$$
In general
$$S_{a-1} = S_{a} + D_{a} \tag{7}$$
where $\{S_{A}, S_{A-1}, \ldots, S_{1}\}$ is a sequence of multiresolution approximations of the function $f(t)$ at ever-increasing levels of refinement. The corresponding multiresolution decomposition of $f(t)$ is given by $\{S_{A}, D_{A}, D_{A-1}, \ldots, D_{a}, \ldots, D_{1}\}$.
The sequence of terms $S_{A}, D_{A}, D_{A-1}, \ldots, D_{a}, \ldots, D_{1}$ represents a set of orthogonal signal components that provide representations of the signal at resolutions 1 to $A$; each $D_{A-k}$ provides the orthogonal increment to the representation of the function $f(t)$ at the scale, or resolution, $2^{A-k}$. When the data pattern is very rough, the wavelet process is applied repeatedly. In the preprocessing, the target is to minimize the mean absolute percentage error (MAPE) between the signal before and after transformation. In this way, the noise in the original data can be removed.
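The decomposition in (2) to (7) can be illustrated with a minimal Python sketch. It assumes the orthonormal discrete Haar transform (pairwise sums and differences scaled by $1/\sqrt{2}$) and a series length divisible by $2^{A}$; the function names `haar_decompose`, `haar_reconstruct`, `haar_smooth`, and `mape` are hypothetical and are not taken from the paper.

```python
import math


def haar_step(signal):
    """One Haar level: return (approximation, detail) coefficient lists."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even at every level")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return (s_A, [d_1, ..., d_A]) after `levels` = A Haar steps."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, going from the coarsest level back to the data."""
    root2 = math.sqrt(2.0)
    current = list(approx)
    for detail in reversed(details):
        finer = []
        for s, d in zip(current, detail):
            finer.append((s + d) / root2)
            finer.append((s - d) / root2)
        current = finer
    return current


def haar_smooth(signal, levels):
    """Approximation S_A(t): reconstruct with every detail coefficient set to zero."""
    approx, details = haar_decompose(signal, levels)
    zeroed = [[0.0] * len(d) for d in details]
    return haar_reconstruct(approx, zeroed)


def mape(actual, estimate):
    """Mean absolute percentage error between two equally long series."""
    return 100.0 * sum(abs((a - e) / a) for a, e in zip(actual, estimate)) / len(actual)
```

For a series of eight prices and two levels, `haar_decompose` returns four $d_{1,b}$, two $d_{2,b}$, and two $s_{2,b}$ coefficients, which matches the count in (5). Calling `mape(prices, haar_smooth(prices, levels))` for several values of `levels` is one possible way to compare the signal before and after transformation; how the paper selects the level is not specified here.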
B. TSK-Fuzzy-System-Based Prediction
1) Input Selection Using Stepwise Regression Analysis: A set of important technical factors that affect the forecasting result, shown in Table II, has been identified in [14]. These input factors are further screened through a stepwise regression analysis (SRA) model in this research. Six indexes are available for selection: $X_{1}$ = six-day moving average; $X_{2}$ = six-day bias (BIAS); $X_{3}$ = six-day relative strength index (RSI); $X_{4}$ = nine-day stochastic line (KD); $X_{5}$ = moving average divergence (MABIAS); and $X_{6}$ = 13-day psychological line. The output is $Y$ = stock price.
The SRA is applied to determine the set of independent variables that most closely affect the dependent variable. This is accomplished by repeated variable selection. The step-by-step procedure of the SRA approach is explained in detail in the following.
Step 1) Calculate the correlation coefficient ($r$) between each input variable ($X_{1}, X_{2}, \ldots, X_{n}$) and the output data ($Y$). Then, a correlation matrix is derived.
Step 2) Rank each variable according to its squared correlation ($r^{2}$) from the correlation matrix (suppose $X_{i}$ is the largest one in the current stage), and check the linear regression of the output data on this variable, i.e., derive a regression model $\hat{Y} = f(X_{i})$. The significance level $\alpha$ is applied to assess the significance of each input variable. Repeat this process until all variables are tested. Finally, select the statistically significant variables for further verification and assume that these variables are ($X_{1}, X_{2}, \ldots, X_{k}$).
Step 3) Calculate the partial $F$ value for those statistically significant variables, as shown in (8), and choose the variable with the largest value among these input variables, as shown in (9) (assume that it is $X_{j}$). Then, derive another regression model $\hat{Y} = f(X_{i}, X_{j})$
$$F_{j} = \frac{\mathrm{MSR}(X_{j} \mid X_{i})}{\mathrm{MSE}(X_{j} \mid X_{i})} = \frac{\mathrm{SSR}(X_{j} \mid X_{i})/(k - 2)}{\mathrm{SSE}(X_{j} \mid X_{i})/(n - k)}, \qquad i \in I \tag{8}$$
$$F_{j}^{*} = \max_{1 \le j \le N,\ j \notin I} (F_{j}). \tag{9}$$
Step 4) Calculate the partial $F$ value of the original data for input variable $X_{j}$. If the value is smaller than a user-defined threshold, $X_{j}$ is removed from the model because it is not statistically significant for the output.
Step 5) Repeat steps 3 and 4. If every input variable’s partial $F$ value is greater than the user-defined threshold, stop. This means that every remaining input variable has a significant influence on the output value. According to [10] and [13], we always set the threshold value to 4. If the $F$ value of a specific variable is greater than the user-defined threshold, it is added to the model as a significant factor. When the $F$ value of a specific variable is smaller than the user-defined threshold, it is removed from the model. The statistical software Statistical Package for the Social Sciences (SPSS) for Windows 10.0 was used for SRA in this research. The flow diagram of SRA is shown in Fig. 3.
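The selection loop in Steps 3 to 5 can be sketched in Python as follows. This is a simplified illustration, not the SPSS procedure used in the paper: it performs forward selection only (the removal check of Step 4 is omitted), and it uses the textbook partial $F$ statistic with one numerator degree of freedom, which may differ from the degrees of freedom written in (8). The helper names `solve_linear`, `sse`, `partial_f`, and `forward_select` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def sse(columns, y):
    """Residual sum of squares of a least-squares fit of y on an intercept plus columns."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    return sum((yt - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yt in zip(rows, y))


def partial_f(candidate, selected, y):
    """Partial F of adding `candidate` to a model that already contains `selected`."""
    sse_reduced = sse(selected, y)
    sse_full = sse(selected + [candidate], y)
    dof = len(y) - (len(selected) + 2)  # observations minus fitted parameters
    return (sse_reduced - sse_full) / (sse_full / dof)


def forward_select(variables, y, threshold=4.0):
    """Add the variable with the largest partial F while it exceeds the threshold."""
    selected = []
    remaining = dict(variables)
    while remaining:
        current = [variables[name] for name in selected]
        scores = {name: partial_f(col, current, y) for name, col in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] <= threshold:
            break
        selected.append(best)
        del remaining[best]
    return selected
```

Here `variables` maps an index name to its list of observations, and the default `threshold=4.0` mirrors the threshold value quoted in Step 5. The sketch assumes more observations than fitted parameters and a non-singular design matrix.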
Fig. 3. Flow diagram of SRA.
2) TSK Fuzzy Rule Systems: The TSK fuzzy system is selected as a universal function approximator for the stock prediction problem due to its ability to explain nonlinear relations using a relatively low number of simple rules. The structure of our TSK model is a multiple-input single-output fuzzy system, and its associated fuzzy inference method comprises a set of $N$ IF-THEN rules in the following form:
$$R_{i}:\ \text{If } x_{1} \text{ is } A_{i1},\ x_{2} \text{ is } A_{i2},\ \ldots,\ x_{n} \text{ is } A_{in},\ \text{then } y_{i} = \beta_{i0} + \beta_{i1} x_{1} + \cdots + \beta_{in} x_{n}$$
where $x_{j}$, $j = 1, \ldots, n$, are the inputs of the fuzzy system, $n$ is the number of inputs, $A_{ij}$, $i = 1, \ldots, N$, are fuzzy subsets, $N$ is the number of fuzzy rules, $y_{i}$ is the output of the $i$th rule, and $\beta_{ij}$ are the parameters of the rule consequents. It is a first-order TSK fuzzy rule system, and in this paper, Gaussian fuzzy membership functions are adopted
$$A_{ij}(x_{j}) = \exp\!\left(-\frac{(x_{j} - a_{ij})^{2}}{\sigma_{ij}^{2}}\right) \tag{10}$$
where $a_{ij}$ and $\sigma_{ij}$ are the mean and standard deviation of the Gaussian functions. Given a crisp input vector $(x_{1}^{0}, \ldots, x_{n}^{0})$, the crisp output of the TSK model is described by
$$y = \frac{\sum_{i=1}^{N} w_{i}\, y_{i}}{\sum_{i=1}^{N} w_{i}} \tag{11}$$
where $w_{i}$ is the strength of rule $i$ determined by
$$w_{i} = \prod_{j=1}^{n} A_{ij}(x_{j}^{0}) \tag{12}$$
and
$$y_{i} = \beta_{i0} + \beta_{i1} x_{1}^{0} + \cdots + \beta_{in} x_{n}^{0}. \tag{13}$$
The main task in TSK fuzzy-rule-based prediction is to determine the parameters in the fuzzy membership functions and in the rule consequents using a learning algorithm, given a set of training data specifying the functional mapping between the inputs and the output.
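The inference in (10) to (13) can be written as a short Python sketch. The rule representation (a dictionary with `centers`, `sigmas`, and `beta`) and the function names are assumptions made for illustration; the product in (12) is taken as the rule firing strength.

```python
import math


def gaussian_membership(x, center, sigma):
    """Membership degree of x, following (10): exp(-(x - a)^2 / sigma^2)."""
    return math.exp(-((x - center) ** 2) / (sigma ** 2))


def tsk_predict(x, rules):
    """Crisp output of a first-order TSK system for the input vector x.

    Each rule is a dict with 'centers' and 'sigmas' (one value per input)
    and 'beta' (length n + 1, where beta[0] is the constant term).
    """
    numerator = 0.0
    denominator = 0.0
    for rule in rules:
        strength = 1.0
        for xj, a, s in zip(x, rule["centers"], rule["sigmas"]):
            strength *= gaussian_membership(xj, a, s)  # firing strength, (12)
        consequent = rule["beta"][0] + sum(
            b * xj for b, xj in zip(rule["beta"][1:], x)
        )  # linear rule output, (13)
        numerator += strength * consequent
        denominator += strength
    if denominator == 0.0:
        raise ValueError("no rule fires for this input")
    return numerator / denominator  # weighted average, (11)
```

As a usage example, with two single-input rules centered at 0 and 2 (both with $\sigma = 1$) and constant consequents 1 and 3, the input $x = 1$ fires both rules equally, so the output is their plain average, 2.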
3) Data Clustering: The purpose of data clustering is to partition the set of financial time series data into different groups, such that the data in each group have more homogeneous characteristics. It is very important to determine beforehand how many fuzzy rules should be generated. If a standard rule structure is used, rule explosion occurs when the number of inputs is high. To resolve this problem, we divide the training data into a number of clusters based on the output data (stock price), and one fuzzy rule is generated for each cluster [32]. By doing this, the number of fuzzy rules can be reduced effectively. Besides, we can determine the fuzzy membership functions using the mean and standard deviation of the data points that belong to each cluster.
The K-means clustering algorithm is employed for data clustering. K-means is a nonhierarchical clustering technique in which the dataset is partitioned into $K$ clusters. During the clustering, the data points, starting from a random assignment, are assigned to the clusters so as to minimize the following squared error (SE):
$$\mathrm{SE} = \sum_{i=1}^{K} \sum_{p \in C_{i}} |p - m_{i}|^{2} \tag{14}$$
where $p$ are the data points in the cluster $C_{i}$, $m_{i}$ is the center of cluster $C_{i}$, and $K$ is the number of clusters.
Once the training data are clustered, we can calculate the parameters of the membership functions for each cluster as follows:
$$a_{ij} = \frac{1}{s_{i}} \sum_{x \in C_{i}} x_{j}, \qquad \sigma_{ij} = \sqrt{\frac{1}{s_{i} - 1} \sum_{x \in C_{i}} (x_{j} - a_{ij})^{2}} \tag{15}$$
where $s_{i}$ is the number of data points in cluster $i$. In addition, the output of the training data is also normalized using the mean and standard deviation of the data in each cluster.
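A minimal Python sketch of this step is given below: the output values are clustered with a one-dimensional K-means, and the Gaussian parameters of (15) are then computed per cluster. The deterministic initialization of the centers is an assumption made so that the example is reproducible (the initialization used in the paper is not stated), and the names `kmeans_1d` and `membership_parameters` are hypothetical.

```python
import math


def kmeans_1d(values, k, iterations=100):
    """Cluster scalar values (e.g., stock prices); return one label per value."""
    ordered = sorted(values)
    # Assumed initialization: centers spread evenly over the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: (v - centers[c]) ** 2) for v in values
        ]
        if new_labels == labels:
            break  # assignments are stable, so the squared error cannot drop further
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def membership_parameters(inputs, labels, k):
    """Mean a_ij and sample standard deviation sigma_ij per cluster and input, as in (15).

    Every cluster needs at least two points, otherwise s_i - 1 is zero.
    """
    params = []
    for c in range(k):
        rows = [x for x, lab in zip(inputs, labels) if lab == c]
        s = len(rows)
        centers, sigmas = [], []
        for j in range(len(inputs[0])):
            column = [row[j] for row in rows]
            mean = sum(column) / s
            variance = sum((v - mean) ** 2 for v in column) / (s - 1)
            centers.append(mean)
            sigmas.append(math.sqrt(variance))
        params.append({"centers": centers, "sigmas": sigmas})
    return params
```

Each returned dictionary holds the membership parameters of one fuzzy rule, so the number of rules equals the number of clusters $K$.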
4) Optimization of the Parameters in Fuzzy Rules Using Simulated Annealing: The purpose of applying SA is to find a set of best values for the parameters within the fuzzy rules. Traditionally, the parameter settings of TSK rules are generated using the gradient method. The generalized gradient algorithm searches for the solution in a multidimensional space along the steepest ascent direction. However, such a search can be extremely slow and ineffective if the equation has many plateaus distributed throughout the landscape. Therefore, this method may not be able to derive an optimal solution and can be trapped in a local optimum, as shown in Fig. 4. In such cases, statistical search methods may offer a better strategy for resolving both problems [60], [66].
Fig. 4. Gradient method as compared with a simulated annealing approach.
One of the most widely used statistical search methods is SA [33], which uses the Metropolis algorithm to decide whether to accept or reject a configuration that results in an increased cost during its search for the minimum cost. The main characteristics of SA are its simplicity and rapid convergence. To properly adjust the parameter weights of the TSK fuzzy rules, the SA approach is effective if the chosen energy, or cost function, for the global system is appropriate. In this study, the cost function is defined as the MAPE for the set of testing data, i.e., a series of stock index values. The procedure of SA is well known, as described in [33]. First, it is necessary to generate random values of the parameter weights, and second, to compute the associated cost of the system. This cost is minimized when the parameter weights reach a global minimum, and the method allows escape from local minima along the way.
The detailed setup of the parameters for SA will be described in the next section. Through a proper setup of the cooling schedule, SA can finally be applied to derive a set of near-optimal parameters for these TSK rules, as shown in Fig. 4.
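The Metropolis acceptance rule and a cooling schedule can be sketched in Python as follows. The geometric cooling factor, the uniform perturbation, and all default values are illustrative assumptions rather than the settings used in the paper, and `simulated_annealing` is a hypothetical function name.

```python
import math
import random


def simulated_annealing(cost, initial, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.95, moves_per_temp=50, seed=0):
    """Minimize cost(params) over a list of real parameters; return (best, best_cost)."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    current = list(initial)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_temp):
            candidate = [v + rng.uniform(-step, step) for v in current]
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Metropolis rule: always accept an improvement; accept a worse
            # configuration with probability exp(-delta / temperature).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temperature *= cooling  # assumed geometric cooling schedule
    return best, best_cost
```

In the setting of this section, `cost` would map a flat list of TSK rule parameters to the MAPE of the resulting forecasts. Occasional acceptance of worse configurations at high temperature is what lets the search leave a local minimum.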
5) Using K-Nearest-Neighbor as a Sliding Window: The basic idea of KNN [20] is to identify patterns in the historic data that are similar to the current data trend. We use KNN as a sliding window to forecast the data value for the next day, using the most recent data as a time window to search within the historic data for similar patterns. Basically, our approach is categorized as a one-step-ahead prediction. The selected data are preprocessed with the wavelet, and then the TSK model is applied to generate a set of fuzzy rules for prediction of the stock price. In addition, the KNN sliding window is further applied to reduce the forecasting errors. The set of historic data is divided into training and testing sets for cross validation. The KNN is simpler than other SC approaches because there is no model to train on the data series. Instead, the data series is searched for situations similar to the current one each time a forecast needs to be made.
To describe the KNN process, several terms have to be defined first. Assume the window size is $L$, which means there are $L$ data points in each window to be considered. The final data points of the data series are the reference data, and the length of the reference is the window size. To forecast the data series’ next data point, the reference is compared to the first group of data points in the historical data series, called a candidate, and an error is computed. Then, the window is moved one data point forward to the next candidate and another error is computed, and so on. An error is calculated by subtracting the candidate value from the reference value. All errors of the testing data are sorted and stored in an array. Assume that the number of nearest neighbors is $H$. Then, the $H$ candidates corresponding to the $H$ smallest errors are selected. Finally, the forecasted value is equal to the average of these $H$ data points. Then, to forecast the next data point, the process is repeated with the previously forecasted data point appended to the end of the data series. This process can be repeated iteratively until all $n$ data points are calculated.
Use KNN to calculate the new forecasted value.
Step 1) Use the original data as the contrast data. Suppose
we want to forecast the value of the ith data series at index
$t+1$, i.e., $\hat{X}_{i,t+1}$.
Step 2) Use the data from $t-L+1$ to $t$ as the contrast
base. Using the sliding window method, compare it one by one
with the data from 1 to $t-L$, calculate the Euclidean
distance $D_j^{(1)}$ for every interval, and find the
corresponding forecasting value $F_j^{(1)}$:
$$D_j^{(1)} = \sum_{l=1}^{L} \left(X_{i,t-L+l} - X_{i,l+j-1}\right)^2, \qquad F_j^{(1)} = X_{i,j+L}. \tag{16}$$
In (16), $j = 1, \ldots, t-L$.
Step 3) Consider all $D_j^{(1)}$ and find the $k$th smallest
value. It is KNN’s K option value.
Step 4) Use the weighted voting method to find the final
forecasting value $\hat{X}_{i,t+1}$. The equation is
$$\hat{X}_{i,t+1} = \frac{\sum_{k=1}^{H} F_k / W_k}{\sum_{k=1}^{H} 1 / W_k}. \tag{17}$$
$W_k$ is the $k$th smallest $D_j^{(1)}$ value, and $F_k$ is
the $F_j^{(1)}$ value corresponding to the $k$th smallest
$D_j^{(1)}$ value, with $k = 1, \ldots, H$. The parameter set
of the sliding window is $(L, H)$. A simple example of KNN
forecasting with window size L = 3 and H = 2 is shown in
Fig. 5.
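Steps 1 to 4 can be illustrated with a short Python sketch. This is a minimal reading of (16) and (17) for a single series, not the authors' implementation: the function name `knn_window_forecast` is hypothetical, and the handling of an exact match (where $W_k = 0$ makes $1/W_k$ undefined) is an assumption that the text does not address.

```python
import numpy as np

def knn_window_forecast(series, L, H):
    """One-step-ahead forecast of X_{t+1} following (16) and (17).

    series: 1-D sequence X_1..X_t; L: window size; H: number of neighbors.
    """
    x = np.asarray(series, dtype=float)
    t = len(x)
    if t - L < H:
        raise ValueError("need at least H candidate windows")
    reference = x[t - L:]                  # X_{t-L+1} .. X_t
    # Candidate j (0-based here) covers x[j:j+L]; its successor is x[j+L].
    dist = np.array([np.sum((reference - x[j:j + L]) ** 2)
                     for j in range(t - L)])
    succ = x[L:t]                          # F_j = X_{j+L}
    nearest = np.argsort(dist, kind="stable")[:H]
    w = dist[nearest]
    f = succ[nearest]
    if np.any(w == 0):
        # Assumption: on an exact match, average the exact matches only.
        return float(f[w == 0].mean())
    # Inverse-distance weighted vote, as in (17).
    return float(np.sum(f / w) / np.sum(1.0 / w))

demo_forecast = knn_window_forecast([1, 2, 3, 4, 5, 6], L=2, H=2)
```

In the demo call the reference window is (5, 6); the two closest candidates are (4, 5) and (3, 4), whose successors 6 and 5 are combined with weights 1/2 and 1/8.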
C. Different Models to be Compared With
In this research, we use traditional back-propagation neural
networks (BPNs), the MRM, and a forecasting method that
integrates GA with Wang and Mendel’s algorithm for fuzzy rule
generation (GAWM) [13], [58] to compare with our wavelet
TSK fuzzy rule forecasting system. These three compared
models are briefly introduced below.
BPN [49] is a popular system that has been widely employed
in financial forecasting. The most popular training method for
BPN is supervised learning, i.e., learning by samples, which
is used in this research to train the system. After learning
(or training), the trained connection weights can be used for
the prediction of future occurrences. Technical indices
selected by SRA are applied as the inputs to BPN and the
output is the Taiwan stock index. The detailed parameter
settings for BPN are described in Section IV.
TABLE II
TECHNICAL INDICES USED AS INPUT VARIABLES
Multiple regression analysis (MRA) [12] is one of the most
popular methods applied in business forecasting. MRA is
employed to test hypotheses about the relationship between a
dependent variable ($Y$) and two or more independent
variables ($X$s). It is easy to establish a model when there
is a linear relationship between the independent variables
and the dependent variable. On the other hand, it is very
difficult to establish an accurate model for a nonlinear
relationship. In this research, the output factor ($Y$) is
the stock price and the input factors are the six-day moving
average ($X_1$), six-day bias ($X_2$), six-day RSI ($X_3$),
nine-day stochastic line ($X_4$), moving average divergence
($X_5$), and the 13-day psychological line ($X_6$). The
multiple regression formula of this problem can be defined as
follows:
$$Y = a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4 + a_5 X_5 + a_6 X_6 + b. \tag{18}$$
Among them, parameters $a_1$, $a_2$, $a_3$, $a_4$, $a_5$,
$a_6$, and $b$ are all calculated via SPSS statistics
software.
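As an illustration of how the coefficients in (18) can be estimated, the following sketch fits them by ordinary least squares with NumPy. The paper uses SPSS; the function name `fit_mra` and the synthetic data are hypothetical and serve only to show the computation.

```python
import numpy as np

def fit_mra(X, y):
    """Least-squares estimate of a_1..a_m and b in Y = sum_i a_i X_i + b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the intercept b is fitted with the slopes.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

# Synthetic, noise-free data with six inputs (not the TSE data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 6))
a_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0, -1.0])
y_demo = X_demo @ a_true + 4.0

a_hat, b_hat = fit_mra(X_demo, y_demo)
```

Because the synthetic target is exactly linear, the fitted coefficients recover `a_true` and the intercept 4.0 up to rounding; with real, nonlinear stock data the fit would be only approximate, which is the limitation noted above.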
Wang and Mendel [58] developed a method to create a fuzzy
rule base that combines rules generated from numerical
examples, i.e., historic stock data, with linguistic rules
supplied by human experts [15]. In GAWM, the Wang and Mendel
(WM) method is evolved with GAs, an idea similar to evolving
NNs [60]. Essentially, a simple GA is used to determine the
near-optimal number of fuzzy terms for each variable; as a
result, the objective function value can be improved by this
evolution.
Fig. 5. KNN forecasting with window size L = 3 and H = 2.
D. Performance Measures
There are many measures of prediction accuracy used to
compare forecasting methods out of sample [10]. Regarding the
dimensionality of the data, in this research there are six
input factors, one output value, and 494 training data. The
mean square error (MSE) of each model is very large after
training on this data set. Since the purpose of stock
prediction is to make a profit rather than just to predict
the future price accurately, we use MAPE as the performance
measure instead of MSE. The equation of MAPE is given below.
1) Mean Absolute Percentage Error: The accuracy of
predictions was measured with the following indicator, i.e.,
MAPE. The average forecast error is measured as a percentage
of the historical results. Taking the absolute value prevents
errors of opposite sign from cancelling each other. It is
calculated as follows:
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|X_t - F_t|}{X_t} \times 100 \tag{19}$$
where $X_t$ is the true value and $F_t$ is the predicted
value at time $t$. The MAPE is an average over n test sets.
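Equation (19) translates directly into code. The sketch below is a plain NumPy version; the function name `mape` and the sample numbers are hypothetical, and the zero check is an added safeguard that the paper does not discuss (stock index values are positive, so it never triggers there).

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error as in (19), in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if np.any(actual == 0):
        raise ValueError("MAPE is undefined when a true value is zero")
    return float(np.mean(np.abs(actual - predicted) / actual) * 100.0)

# Errors of 10% and 5% average to about 7.5%.
demo_mape = mape([100.0, 200.0], [110.0, 190.0])
```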
IV. SIMULATION RESULTS
The Taiwan Stock Exchange (TSE) began operations in 1962. At
the end of January 2005, the Taiwan Stock Exchange
Corporation (TSEC) had 699 listed companies with market
capitalization topping NT$13.7 trillion (US$396 billion).
Most stock trading goes to the listed IT companies, and the
trading value of the TSE stock market places it among the top
ten stock exchanges in the world.
The data set used for testing in this research is the TSE
index, and it has been divided into three sets: the training
data, test data, and validation data. The TSE index data run
from July 18, 2003 to December 31, 2005, 614 records in
total. During this period, the stock market went through
sharp rises and falls owing to national political issues.
Therefore, these data are very representative and suitable
for study and analysis. The first 492 records are used as
training and cross-validation data, and the remaining 122
records are used as out-of-sample test data. To avoid
interaction among the factors, we test each factor using SRA
and identify the factors that significantly affect the final
forecasted results. The final combination of factors is
settled after this analysis. The factors finally selected are
MA6 and BIAS6; these two indices are the input variables and
the output variable is the TSE index.
Fig. 6. Figures before and after a wavelet transformation for 492 stock price
data from Taiwan Stock Exchange index.
Fig. 7. MAPEs of the hybrid model for different number of data clusters.
TABLE III
PARAMETER SETTINGS FOR SIMULATED ANNEALING
Before training the TSK fuzzy model, a wavelet transformation
is applied to preprocess the data. Based on the resulting
MAPE, a three-level wavelet preprocessing is chosen. Through
this process, the noise in the original data can be removed.
The result of the wavelet-based decomposition process is
depicted in Fig. 6. According to the SRA method described in
Section III-B, six input variables are finally selected as
the inputs to the TSK model to predict the stock price. They
are the six-day moving average (MA), six-day bias (BIAS),
six-day RSI, nine-day stochastic line (KD), the moving
average divergence (MABIAS), and the 13-day psychological
line (PSY).
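The three-level Haar preprocessing mentioned above can be sketched as follows. This section does not state which wavelet coefficients are retained, so the example assumes the simplest form of denoising: decompose to three levels, discard all detail coefficients, and reconstruct. The function names are hypothetical, and the length check is a simplification (a 492-point series is not a multiple of 8 and would need padding or truncation first).

```python
import numpy as np

def haar_decompose(signal, levels):
    """Return (approximation, details), details ordered finest first."""
    approx = np.asarray(signal, dtype=float)
    if len(approx) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels")
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)          # scaling coefficients
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx

def haar_smooth(signal, levels=3):
    """Assumed denoising: zero every detail coefficient, then reconstruct."""
    approx, details = haar_decompose(signal, levels)
    return haar_reconstruct(approx, [np.zeros_like(d) for d in details])

x_demo = np.arange(16.0)
smoothed_demo = haar_smooth(x_demo, levels=3)
```

With all details removed, each block of eight consecutive values is replaced by its mean, which shows how the three-level approximation suppresses short-term fluctuations.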
In the k-means clustering algorithm, the number of clusters
must be predefined. This is an interesting subject for
further investigation, since there is no exact theorem
explaining the effect of the number of clusters on the
forecasting accuracy. To check the sensitivity of the
performance of our model to the number of clusters, various
numbers of data clusters are investigated. In the experiment,
the stock price data were clustered into two to eight
clusters. The performance (MAPE) of the algorithm with
different numbers of clusters is shown in Fig. 7.
As can be observed from the figure, MAPE starts to decrease
as the number of clusters increases. However, once the number
of clusters reaches a certain value, MAPE starts to increase.
Part of the reason is that when the number of clusters is too
large, the number of data points in each cluster is too
small. The data in each cluster are then not representative
enough to generate a model to forecast the future stock
index. Therefore, in this research, the number of clusters is
set to three since it provides the best performance (the
smallest MAPE). This number of clusters is not definitive and
has to be decided experimentally for different application
purposes.
The parameter settings at three different levels for the SA
are provided in Table III. We then use the statistical
software Minitab R14 to run the Taguchi experiments, and the
results for the different factor levels are shown in
Table IV. Table V lists the final setting for each factor in
the SA procedure. The factor response graph of these
experimental results is shown in Fig. 8.
The convergence diagram of the learning process of the rule
consequence parameters using the SA is shown in Fig. 9. The
MAPE of the forecasting model gradually decreases to 3.8%
after the temperature drops to a certain level. This
indicates that the proposed SA approach can find near-optimal
solutions for the set of parameters of the consequence rules.
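To make the SA step concrete, the sketch below tunes the parameters of a single linear consequent by simulated annealing with MAPE as the cost. It is a generic SA loop under assumed settings: the step size, initial temperature, cooling rate, and iteration counts are hypothetical and are not the values of Tables III and V, and the toy data stand in for one cluster of training samples.

```python
import math
import random

def simulated_annealing(cost, start, step=0.1, t0=1.0, cooling=0.95,
                        iters_per_temp=50, t_min=1e-3, seed=0):
    """Minimize cost(params) with a basic geometric-cooling SA loop."""
    rng = random.Random(seed)
    current = list(start)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temp = t0
    while temp > t_min:
        for _ in range(iters_per_temp):
            # Propose a neighbor by perturbing every parameter slightly.
            candidate = [p + rng.gauss(0.0, step) for p in current]
            cand_cost = cost(candidate)
            delta = cand_cost - current_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, cand_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temp *= cooling
    return best, best_cost

# Toy data generated by y = 2x + 1 (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

def mape_cost(params):
    a, b = params
    return 100.0 * sum(abs(y - (a * x + b)) / y for x, y in zip(xs, ys)) / len(xs)

best_params, best_cost = simulated_annealing(mape_cost, [0.0, 0.0])
```

The loop keeps the best parameter vector seen so far, so the returned cost can never exceed the cost of the starting point; how close it gets to zero depends on the assumed cooling schedule.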
We compared the proposed hybrid method combining the wavelet
and TSK fuzzy rules with three existing methods. To justify
the use of SA and the KNN sliding window, a set of
experimental results is listed in Table VI. In this table,
the rule number is decided by (14) and (15); the first number
is the window size L and the second number is the number of
nearest neighbors H. According to a series of experiments in
which L is set to 2, 4, 6, and 8 and H is set to 2, 3, 4, and
5, the best forecasting result in terms of the minimum MAPE
value from the hybrid model is obtained at (2, 5) with an
average of 0.792.
The four algorithms used for comparison are the traditional
back-propagation neural networks (BPNs), a standard TSK, the
MRM, and a forecasting method that integrates GA with Wang
and Mendel’s algorithm for fuzzy rule generation (GAWM) [13],
[58]. Tables VII and VIII give the best parameter settings
for BPN and GAWM obtained using design of experiments.
Fig. 10 shows that GAWM converged after 100 generations.
Table IX shows the MAPE value for all the different methods.
As observed from Table IX, MRM has the largest MAPE value,
partly because MRM cannot fully explain the nonlinear
relationship between the stock price and the technical
indexes. BPN also has a large error compared with the other
models, which is due to the tremendous noise and complex
dimensionality of stock price data. Besides, the quantity of
data itself and the input variables may also interfere with
each other. In addition, the BP learning algorithm is prone
to getting stuck in a local optimum, while the TSK is less
likely to do so. Therefore, the result may not be that
convincing. Furthermore, BPN methods do not provide an
insight into the nature of the interactions between the
technical indicators and the stock market fluctuations. As
for GAWM, the number of fuzzy rules generated from the
training data is very large compared with TSK, and these
rules may interact with each other.
TABLE IV
TAGUCHI EXPERIMENTAL RESULT OF DIFFERENT FACTOR LEVELS
TABLE V
BEST PARAMETER SET FOR SA
Fig. 8. Factor response graph.
Fig. 9. SA training convergence diagram.
TABLE VI
RESULTS OF DIFFERENT NUMBER OF SLIDING WINDOWS (L) AND H NEAREST
NEIGHBORS APPLIED IN TSE INDEX FORECASTING (IN PERCENT)
TABLE VII
BPN PARAMETER SET
From the experimental tests, we can observe that the TSK
fuzzy system is more suitable for handling a large amount of
data. The set of data is clustered according to its mean and
standard deviation. After this clustering, the data set is
decomposed into several subclusters. These subclusters have
more homogeneous characteristics within themselves, and each
subcluster is transformed into a TSK fuzzy rule. Therefore,
the number of fuzzy rules generated from TSK is quite small,
since each cluster of data generates only a single rule. For
our experimental tests, the number of clusters was preset
from two to eight; therefore, there are two to eight fuzzy
rules within each system. The best forecasting results are
obtained with three clusters. In addition, KNN has been
further applied to reduce the forecasting errors. That is why
TSK has a better forecasting accuracy.
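The one-rule-per-cluster idea can be sketched in a few lines. The example below is a simplified illustration rather than the paper's system: cluster labels are taken as given (k-means would supply them), the Gaussian membership form and product firing strength are assumptions based on the statement that memberships come from each cluster's mean and standard deviation, and the linear consequents are fitted by least squares in place of SA. All names are hypothetical.

```python
import numpy as np

def build_rules(X, y, labels):
    """One TSK rule per cluster: Gaussian antecedent plus linear consequent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)
    rules = []
    for c in np.unique(labels):
        Xc, yc = X[labels == c], y[labels == c]
        mean = Xc.mean(axis=0)
        std = Xc.std(axis=0) + 1e-9        # guard against zero width
        A = np.column_stack([Xc, np.ones(len(Xc))])
        # Assumption: least squares stands in for the SA optimization.
        coef, *_ = np.linalg.lstsq(A, yc, rcond=None)
        rules.append((mean, std, coef))
    return rules

def tsk_predict(rules, x):
    """Firing-strength weighted average of the rule outputs."""
    x = np.asarray(x, dtype=float)
    log_w = np.array([-0.5 * np.sum(((x - m) / s) ** 2) for m, s, _ in rules])
    w = np.exp(log_w - log_w.max())        # shift for numerical stability
    outputs = np.array([coef[:-1] @ x + coef[-1] for _, _, coef in rules])
    return float(np.dot(w, outputs) / w.sum())

# Two well-separated clusters with different local linear behavior.
X_tsk = np.concatenate([np.arange(5.0), np.arange(10.0, 15.0)]).reshape(-1, 1)
labels_tsk = np.array([0] * 5 + [1] * 5)
y_tsk = np.where(labels_tsk == 0, 2.0 * X_tsk[:, 0], 30.0 - X_tsk[:, 0])

rules_demo = build_rules(X_tsk, y_tsk, labels_tsk)
pred_demo = tsk_predict(rules_demo, [2.0])
```

An input at the center of the first cluster fires the first rule almost exclusively, so the prediction is essentially that rule's local output (2 × 2 = 4); the rule base stays as small as the number of clusters.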
TABLE VIII
PARAMETER SETTING FOR GAWM
Fig. 10. Convergence diagram of the GAWM model.
TABLE IX
MAPE PERFORMANCE FROM DIFFERENT METHODS
V. CONCLUSION
This paper proposed a TSK fuzzy model for stock price
prediction. To facilitate the prediction, the data are
preprocessed using the Haar wavelet. Then, the SRA technique
is employed to select the most relevant factors for
prediction. To avoid rule explosion, the k-means clustering
algorithm is employed to group the data into a number of
clusters, and one fuzzy rule is generated for each cluster.
As an additional benefit, the fuzzy membership function can
be determined automatically using the mean and variance of
the data in each cluster. The parameters in the consequences
of the TSK rules are optimized using the SA. A KNN sliding
window is applied to retrieve similar patterns in the
historical data and further adjust the forecasted value from
the TSK model.
The proposed model is compared with the BPN, TSK, MRM, and
GAWM for stock price prediction. Simulation results show that
the TSK model with wavelet-based preprocessing greatly
outperforms the other three models. To the best of our
knowledge, the combination of the TSK fuzzy model with
wavelet-based preprocessing is new for stock price
forecasting. Due to its very promising performance, we are
going to apply the system to real-time daily trading.
In the future, a different TSK fuzzy model, such as a
nonlinear model using NNs as the consequence, can be applied
to more complex time series problems. In addition, more
advanced pattern matching algorithms can be embedded in the
system to retrieve significant patterns from the historic
stock data for comparison with the current trend of the data.
As a result, intelligent trading signals instead of stock
prices can be identified.
REFERENCES
[1] A. Abraham, N. Baikunth, and P. K. Mahanti, “Hybrid intelligent systems
for stock market analysis,” in Lecture Notes Computer Science, London,
U.K.: Springer-Verlag, vol. 2074, pp. 337–345, 2001.
[2] A. Abraham, N. S. Philip, and P. Saratchandran, “Modeling chaotic be-
havior of stock indices using intelligent paradigms,” Neural, Parallel Sci.
Comput., vol. 11, pp. 143–160, 2003.
[3] F. Abramovich, P. Besbeas, and T. Sapatinas, “Empirical Bayes approach
to block wavelet function estimation,” Comput. Statist. Data Anal.,
vol. 39, no. 4, 28, pp. 435–451, 2002.
[4] Y. S. Abu-Mostafa and A. F. Atiya, “Introduction to financial forecasting,”
Appl. Intell., vol. 6, pp. 205–213, 1996.
[5] M. Aiken and M. Bsat, “Forecasting market trends with neural networks,”
Inf. Syst. Manag., vol. 16, no. 4, pp. 42–48, 1994.
[6] V. Alarcon-Aquino and J. A. Barria, “Multi resolution FIR neural-network-
based learning algorithm applied to network traffic prediction,” IEEE
Trans. Syst., Man Cybern., Part C: Appl. Rev., vol. 36, no. 2, pp. 208–209,
Mar. 2006.
[7] A. Aussem and F. Murtagh, “Combining neural network forecasts on
wavelet-transformed time series,” Connection Sci., vol. 9, pp. 113–122,
Mar. 1997.
[8] M. Austin and C. Looney, “Security market timing using neural network
models,” New Rev. Appl. Expert Syst., vol. 3, pp. 3–14, 1997.
[9] N. Baba, N. Inoue, and H. Asakawa, “Utilization of neural networks &
GAs for constructing reliable decision support systems to deal stocks,” in
Proc. IEEE-INNS-ENNS Int. Joint Conf. Neural Netw. (IJCNN’00), vol. 5,
Jul., pp. 5111–5116.
[10] D. Brownstone, “Using percentage accuracy to measure neural network
predictions in stock market movements,” Neurocomputing, vol. 10,
pp. 237–250, 1996.
[11] P. C. Chang and T. W. Liao, “Combining SOM and fuzzy rule base for
flow time prediction in semiconductor manufacturing factory,” Appl. Soft
Comput., vol. 6, no. 2, pp. 198–206, 2006a.
[12] P. C. Chang and Y. W. Wang, “Fuzzy delphi and back-propagation model
for sales forecasting in PCB industry,” Expert Syst. Appl., vol. 30, no. 4,
pp. 715–726, 2006b.
[13] P. C. Chang, C. H. Liu, and Y. W. Wang, “A hybrid model by clustering and
evolving fuzzy rules for sale forecasting in printed circuit board industry,”
Decision Support Syst., vol. 42, no. 3, pp. 1715–1729, 2006.
[14] P. C. Chang, Y. W. Wang, and W. N. Yang, “An investigation of the hybrid
forecasting models for stock price variation in Taiwan,” J. Chin. Inst. Ind.
Eng., vol. 21, no. 4, pp. 358–368, 2004.
[15] M. Y. Chen and D. A. Linkens, “Rule-base self-generation and simplifica-
tion for data-driven fuzzy models,” Fuzzy Sets Syst., vol. 142, pp. 243–265,
2004.
[16] A. S. Chen, M. T. Leung, and H. Daouk, “Application of neural
networks to an emerging financial market: Forecasting and trading the
Taiwan stock index,” Comput. Operations Res., vol. 30, pp. 901–923,
2003.
[17] S. C. Chi, H. P. Chen, and C. H. Cheng, “A forecasting approach for
stock index future using grey theory and neural networks,” in Proc.
IEEE Int. Joint Conf. Neural Netw., Washington, DC, 1999, pp. 3850–
3855.
[18] A. Cohen, I. Daubechies, and P. Vial, “Wavelets on the interval and fast
wavelet transform,” Appl. Comp. Harm. Anal., vol. 1, no. 1, pp. 54–81,
Dec. 1993.
[19] G. Corani and G. Guariso, “Coupling fuzzy modeling and neural networks
for river flood prediction,” IEEE Trans. Syst., Man Cybern., Part C: Appl.
Rev., vol. 35, no. 3, pp. 382–390, Aug. 2005.
[20] P. A. Devijver and J. Kittler, Pattern Recognition: Statistical Approach.
London, U.K.: Prentice-Hall, 1982.
[21] T. Dogaru and L. Carin, “Application of Haar-wavelet-based multiresolu-
tion time-domain schemes to electromagnetic scattering problems,” IEEE
Trans. Antennas Propag., vol. 50, no. 6, pp. 774–784, Jun. 2002.
[22] D. L. Donoho and I. M. Johnstone, “Adapting to unknown smoothness via
wavelet shrinkage,” J. Amer. Stat. Assoc., vol. 90, no. 432, pp. 1200–1224,
Dec. 1995.
[23] R. Gençay, F. Selcuk, and B. Whitcher, “Differentiating intraday
seasonalities through wavelet multi-scaling,” Phys. A, vol. 289,
pp. 543–556, 2001.
[24] R. H. Golan and W. Ziarko, “A methodology for stock market analysis
utilizing rough set theory,” in Proc. IEEE/IAFE 1996 Conf. Comput. Intell.
Financial Eng., New York, 1995, pp. 32–40.
[25] Y. Hoekstra, “A stock market forecasting support system based on fuzzy
logic,” in Proc. 27th Annu. Hawaii Int. Conf. Syst. Sci., Wailea, HI, 1994,
pp. 281–287.
[26] A. Hobbs and N. G. Bourbakis, “A neurofuzzy arbitrage simulator for
stock investing,” in Proc. Int. Conf. Comput. Intell. Financial Eng.
(CIFER), New York, 1995, pp. 160–177.
[27] K. Izumi and K. Ueda, “Analysis of exchange rate scenarios using an
artificial market approach,” in Proc. Int. Conf. Artif. Intell., vol. 2, 1999,
pp. 360–366.
[28] L. C. Jain and N. M. Martin, Fusion of Neural Networks, Fuzzy Sets, and
Genetic Algorithms. New York: CRC Press LLC, 1999.
[29] M. Jaruszewicz and J. Mandziuk, “One day prediction of NIKKEI index
considering information from other stock markets,” in Proc. Int. Conf.
Artif. Intell. Soft Comput. ICAISC 2004, vol. 3070, pp. 1130–1135.
[30] K.-J. Kim and W. B. Lee, “Stock market prediction using artificial neural
networks with optimal feature transformation,” Neural Comput. Appl.,
vol. 13, no. 3, pp. 255–260, 2004.
[31] K. J. Kim and I. Han, “Genetic algorithms approach to feature discretiza-
tion in artificial neural networks for the prediction of stock price index,”
Expert Syst. Appl., vol. 19, pp. 125–132, 2000.
[32] T. Kimoto and K. Asakawa, “Stock market prediction system with modular
neural network,” in Proc. IEEE Int. Joint Conf. Neural Netw., San Diego,
CA, 1990, pp. 1–6.
[33] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, “Optimization by
simulated annealing,” Science, vol. 220, pp. 671–680, 1983.
[34] A. J. Koning, P. H. Franses, M. Hibon, and H. O. Stekler, “The M3
competition: Statistical analysis of the results,” Int. J. Forecast., vol. 21,
pp. 397–409, 2005.
[35] A. Kusiak, M. R. Smith, and Z. Song, “Planning product configurations
based on sales data,” IEEE Trans. Syst., Man Cybern., Part C: Appl. Rev.,
vol. 37, no. 4, pp. 602–609, Jul. 2007.
[36] H. L. Larsen and R. R. Yager, “A framework for fuzzy recognition technol-
ogy,” IEEE Trans. Syst., Man, Cybern., part C, vol. 30, no. 1, pp. 65–76,
Feb. 2000.
[37] J. W. Lee, “Stock price prediction using reinforcement learning,” in Proc.
IEEE Int. Joint Conf. Neural Netw., Pusan, Korea, 2001, pp. 690–695.
[38] L. F. M. López, M. A. Díaz, V. Palencia, E. Santos, and P. Jiménez,
“IBEX-35 stock market forecasting using time delay connections in
enhanced neural networks,” World Multiconf. Syst., Cybern. Inf., vol. 67,
pp. 455–460, 2002.
[39] S. Mitra and Y. Hayashi, “Bioinformatics with soft computing,” IEEE
Trans. Syst., Man Cybern., Part C: Appl. Rev., vol. 36, no. 5, pp. 616–635,
Sep. 2006.
[40] D. Montana and L. Davis, “Training feed forward neural networks using
genetic algorithms,” in Proc. 11th Int. Joint Conf. Artif. Intell., Morgan
Kaufmann, San Mateo, CA, 1989, pp. 762–767.
[41] J. Nenortaite and R. Simutis, “Stocks’ trading systems based on the particle
swarm optimization algorithm,” Comput. Sci.—ICCS, vol. 3039, no. 4,
pp. 843–850, 2004.
[42] K. Papagiannaki, N. Taft, Z.-L. Zhang, and C. Diot, “Long-term forecast-
ing of internet backbone traffic,” IEEE Trans. Neural Netw., vol. 16, no. 5,
pp. 1110–1124, Sep. 2005.
[43] K. Parasuraman and A. Elshorbagy, “Wavelet networks: An alternative to
classical neural networks,” in Proc. 2005 IEEE Int. Joint Conf. Neural
Netw., IJCNN ’05, vol. 5, 2005, pp. 2674–2679.
[44] A. Popoola and K. Ahmad, “Testing the suitability of wavelet pre-
processing for TSK fuzzy models,” in Proc. FUZZ-IEEE: Int. Conf. Fuzzy
Syst. Netw., Vancouver, BC, Canada, Jul. 16–22, 2006, pp. 1305–1309.
[45] A. Popoola, S. Ahmad, and K. Ahmad, “Multi-scale wavelet preprocessing
for fuzzy systems,” in Proc. 2005 ICSC Congr. Comput. Intell. Methods
Appl., Dec. 2005, pp. 15–17.
[46] J. B. Ramsey, “The contribution of wavelets to the analysis of economic
and financial data,” Phil. Trans. R. Soc. London, vol. 357, pp. 2593–2606,
Sep. 1999.
[47] J. B. Ramsey and Z. Zhang, “The analysis of foreign exchange data using
waveform dictionaries,” J. Empirical Finance, vol. 4, pp. 341–372, 1997.
[48] O. Renaud, J. L. Starck, and F. Murtagh, “Prediction based on a multiscale
decomposition,” Int. J. Wavelets, Multiresolution Inf. Process., vol. 1,
no. 2, pp. 217–232, 2003.
[49] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning
representations by back-propagating errors,” Nature, vol. 323,
pp. 533–536, 1986.
[50] K. Schierholt and C. H. Dagli, “Stock market prediction using different
neural network classification architectures,” in Proc. IEEE/IAFE 1996
Conf. Comput. Intell. Financial Eng., New York, 1996, pp. 72–78.
[51] R. Sitte and J. Sitte, “Analysis of the predictive ability of time delay neural
networks applied to the S&P500 time series,” IEEE Trans. Syst., Man,
Cybern., part C, vol. 30, no. 4, pp. 568–572, Nov. 2000.
[52] C. Slim, “Forecasting the volatility of stock index returns: A stochastic
neural network approach,” Comput. Sci. Its Appl., vol. 3, pp. 935–944,
2004.
[53] M.-C. Su, C.-W. Liu, and S.-S. Tsay, “Neural-network-based fuzzy model
and its application to transient stability prediction in power systems,”
IEEE Trans. Syst., Man, Cybern., Part C: Appl. Rev., vol. 29, no. 1,
pp. 149–157, Feb. 1999.
[54] T. Takagi and M. Sugeno, “Fuzzy identification of systems and its appli-
cation to modeling and control,” IEEE Trans. Syst., Man Cybern., vol. 15,
no. 1, pp. 116–132, Jan. 1985.
[55] I. N. Tansel, S. Y. Yang, G. Venkataraman, A. Sasirathsiri, W. Y. Bao,
and N. Mahendrakar, “Modeling time series data by using neural net-
works and genetic algorithms,” Smart Engineering System Design: Neu-
ral Networks, Fuzzy Logic, Evolutionary Programming, Data Mining,
and Complex Systems (Proc. Artif. Neural Netw. Eng. Conf., ANNIE’99),
C. H. Dagli, A. L. Buczak, J. Ghosh, M. J. Embrechts, and O. Ersoy, Eds.
New York: ASME Press, 1999, pp. 1055–1060.
[56] A. Thammano, “Neuro-fuzzy model for stock market prediction,” in
Smart Engineering System Design: Neural Networks, Fuzzy Logic, Evo-
lutionary Programming, Data Mining, and Complex Systems (Proc. Artif.
Neural Netw. Eng. Conf., ANNIE’99), C. H. Dagli, A. L. Buczak, J. Ghosh,
M. J. Embrechts, and O. Ersoy, Eds. New York: ASME Press, 1999,
pp. 587–591.
[57] S. Wang and N. P. Archer, “A neural network based fuzzy set model for
organizational decision making,” IEEE Trans. Syst., Man, Cybern., Part
C: Appl. Rev., vol. 28, no. 2, pp. 194–203, May 1998.
[58] L. X. Wang and J. M. Mendel, “Generating fuzzy rules by learning from
examples,” IEEE Trans. Syst., Man, Cybern., vol. 22, no. 6, pp. 1414–
1427, Nov./Dec. 1992.
[59] H. White, “Economic prediction using neural networks: The case of IBM
daily stock returns,” in Proc. 2nd Annu. IEEE Conf. Neural Netw., II,
1988, pp. 451–458.
[60] W. L. Woo and S. S. Dlay, “Neural network approach to blind signal
separation of mono-nonlinearity mixed sources,” IEEE Trans. Circuits
Syst., vol. 52, no. 6, pp. 1236–1247, Jun. 2005.
[61] X. Yao, “Evolving artificial neural networks,” Proc. IEEE, vol. 87, no. 9,
pp. 1423–1447, Sep. 1999.
[62] L. Yu and Y.-Q. Zhang, “Evolutionary fuzzy neural networks for hybrid
financial prediction,” IEEE Trans. Syst., Man, Cybern., Part C: Appl.
Rev., vol. 35, no. 2, pp. 244–249, May 2005.
[63] J. Yao and H. L. Poh, “Forecasting the KLSE index using neural networks,”
in Proc. IEEE Int. Conf. Neural Netw., vol. 2, Nov./Dec. 1995, pp. 1012–
1017.
[64] Y. Yoon and J. Swales, “Predicting stock price performance: A neural
network approach,” in Proc. 24th Annu. Hawaii Int. Conf. Syst. Sci.,
1991, pp. 156–162.
[65] L. A. Zadeh, “The role of fuzzy logic in modeling, identification and
control,” Model. Identification Control, vol. 15, no. 3, pp. 191–203, 1994.
[66] C. Zanchettin and T. B. Ludermir, “Hybrid technique for artificial neural
network architecture and weight optimization,” A. Jorge et al., Eds.,
Proc. 9th Eur. Conf. Principles Practice Knowledge Discovery Databases
(PKDD 2005), Lecture Notes in Artificial Intelligence, vol. 3721, 2005,
pp. 709–716.
[67] G. P. Zhang, “Avoiding pitfalls in neural network research,” IEEE Trans.
Syst., Man, Cybern., Part C: Appl. Rev., vol. 37, no. 1, pp. 3–16, Jan. 2007.
[68] Y.-Q. Zhang, S. Akkaladevi, G. Vachtsevanos, and T. Y. Lin, “Granular
neural Web agents for stock prediction,” Soft Comput., vol. 6, pp. 406–
413, 2002.
Pei-Chann Chang received the M.S. and Ph.D. degrees
from Lehigh University, Bethlehem, PA, in
1985 and 1989, respectively.
He is currently a Professor at the Department
of Information Management, Yuan Ze University,
Taoyuan, Taiwan, R.O.C. His current research inter-
ests include financial time series forecasting, evolu-
tionary computation, fuzzy neural applications, pro-
duction scheduling, forecasting, case-based reason-
ing, and applications of soft computing. He is the
author or coauthor of more than 60 papers published
in international journals. He is the Senior Editor of the Journal of Chinese In-
stitute of Industrial Engineering (JCIIE).
Chin-Yuan Fan received the B.S. degree from the
Chinese-Culture University, Taipei, Taiwan, R.O.C., in
2001, and the M.S. degree from Da-Yeh University,
Changhua, Taiwan, in 2003, both in management. He
is currently working toward the Ph.D. degree at the
Industrial Engineering and Management Department,
Yuan Ze University, Taoyuan, Taiwan.
His current research interests include applications
of soft computing, financial time series forecasting,
multiobjective optimization problems, and multicri-
teria decision making.
Authorized licensed use limited to: Hong. Downloaded on February 8, 2009 at 00:06 from IEEE Xplore. Restrictions apply.
... Predicting future trends and price movements in the stock market necessitates the use of a number of different analytical methods [1]. Forecasting the stock market helps traders and investors plan their stock purchases and sales in light of future market circumstances. ...
... Initiate the random population 2. Set the 1 y and 2 y are parents, 1 x and 2 x are descendants 3. ...
Article
Full-text available
Stock market forecasting involves predicting fluctuations and trends in the value of financial assets, utilizing statistical and machine learning models to analyze historical market data for insights into future behavior. This practice aids investors, traders, financial institutions, and governments in making informed decisions, managing risks, and assessing economic conditions. Forecasting financial markets is difficult due to the intricate interplay of global economics, politics, and investor sentiment, making it inherently unpredictable. This study introduces a Deep Learning based Expert Framework for Stock Market forecasting (Portfolio prediction) called DLEF-SM. The methodology begins with an improved jellyfish-induced filtering (IJF-F) technique for preprocessing, effectively analyzing raw data and eliminating artifacts. To address imbalanced data and enhance data quality, pre-trained convolutional neural network (CNN) architectures, VGGFace2 and ResNet-50, are used for feature extraction. Additionally, an improved black widow optimization (IBWO) algorithm is designed for feature selection, reducing data dimensionality and preventing under-fitting. For precise stock market predictions, integrate deep reinforcement learning with artificial neural network (DRL-ANN) is proposed. Simulation outcomes reveal that the proposed framework achieves maximum forecasting accuracy, reaching 99.562%, 98.235%, and 98.825% for S&P500-S, S&P500-L, and DAX markets, respectively.
... Time series prediction has been widely studied due to its applications in signal processing, computer science, finance and health [1][2][3][4]. The main purpose of these studies is to predict the next value of a time series without observing future values. ...
... Here, we detail the optimization procedure to minimize the mean square error expressed in (2). In order to minimize the mean square error, we employ gradient descent through the back propagation of derivatives ...
Article
Full-text available
In this paper, we investigate the capability of modeling distant temporal interaction of Long Short-Term Memory (LSTM) and introduce a novel Long Short-Term Memory on time series problems. To increase the capability of modeling distant temporal interactions, we propose a hierarchical architecture (HLSTM) using several LSTM models and a linear layer. This novel framework is then applied to electric power consumption, real-life crime and financial data. We demonstrate in our simulations that this structure significantly improves the modeling of deep temporal connections compared to the classical architecture of LSTM and various studies in the literature. Furthermore, we analyze the sensitivity of the new architecture with respect to the hidden size of LSTM.
... Xiao-Ming and Cheng-Zhang [15] studied 10 ANNs models as base model in AdaBoost approach to predict stock price in the Shanghai Stock Exchange and foreign stock markets. The authors in [16] studied the Haar wavelet and Takagi-Sugeno-Kang (TSK) fuzzy rule-based system to forecast price fluctuations. The TSK fuzzy rule-based model is used with a number of indicators to predict stock prices. ...
... Step 6: Parameters w_i, c_ij, and b_ij are updated by Eqs. (14)-(16). ...
Article
This research employs the gradient descent learning (FIR.DM) approach as a learning process in a nonlinear spectral model of maximum overlapping discrete wavelet transform (MODWT) to improve volatility prediction of daily stock market prices using Saudi Arabia’s stock exchange (Tadawul) data. The MODWT comprises five mathematical functions and fuzzy inference rules. The inputs are the oil price (Loil) and repo rate (Repo), selected according to multiple regression correlation and the Engle and Granger causality test (Engle, 1987). The logarithm of the stock market price (LSCS) in Tadawul is the output variable. The correlation matrix reveals that there is no collinearity between the input variables, and the causality test demonstrates that the input variables significantly influence the output variable. According to the multiple regression, Loil has a substantial negative influence on LSCS, whereas Repo has a significant positive effect on it. For the 80% dataset under ME (0.000005), MAE (0.003214), and MAPE (0.064497), the MODWT-LA8 (ARIMA(1,1,0) with drift) for the LSCS variable performs better than other WT functions. In the novel hybrid model MODWT-FIR.DM, each function’s approximation coefficient (LSCS) is applied with input variables (Loil and Repo). We evaluate the performance of the proposed model (MODWT-LA8-FIR.DM) using different statistical measures (ME, RMSE, MAE, MPE) and compare it to two established models: the original FIR.DM and other MODWT-FIR.DM functions for forecasting 20% of datasets. The outcomes show that the MODWT-LA8-FIR.DM performs better than the traditional models based on lower ME (3.167586), RMSE (3.167638), MAE (3.167586), and MPE (80.860849). The proposed hybrid model may be a potential stock market forecasting model.
... Many feature extraction techniques have evolved to get the latent signal contained in the data for further processing [36,37]. Therefore, only the extracted features are fitted into various models, to eliminate noise, reduce redundancy and computational complexity, and improve the prediction accuracy [38][39][40]. Some of the feature selection techniques used in the literature are principal component analysis (PCA), decision trees (DT) [41], MARS, etc. MARS is a popular feature selection technique that uses a combination of linear and nonlinear regression to identify the most relevant features. ...
Article
Accurate prediction of time series data is crucial for informed decision-making and economic development. However, predicting noisy time series data is a challenging task due to their irregularity and complex trends. In the past, several attempts have been made to model complex time series data using both stochastic and machine learning techniques. This study proposed a CEEMDAN-based hybrid machine learning algorithm combined with stochastic models to capture the volatility of weekly potato price in major markets of India. The smooth decomposed component is predicted using stochastic models, while the coarser components, selected using MARS, are fitted into two different machine learning algorithms. The final predictions for the original series are obtained using optimization techniques such as PSO. The performance of the proposed algorithm is measured using various metrics, and it is found that the optimization-based combination of models outperforms the individual counterparts. Overall, this study presents a promising approach to predict price series using a hybrid model combining stochastic and machine learning techniques, with feature selection and optimization techniques for improved performance.
... Bernardo et al. (2013) presented a fuzzy logic system (FLS) for modeling and predicting financial applications using a new method that outperformed various machine learning models. Chang et al. (2008) presented a method using wavelet and Takagi-Sugeno-Kang (TSK)-fuzzy-rule-based systems. In simulations, their model forecast the price fluctuation of stocks in the Taiwan Stock Exchange index with an accuracy of up to 99.1%. ...
Article
Financial experts may make successful selections thanks to the stock market's research and forecasting capabilities, which is exciting. This study examines the stock market forecast outcomes through a simple feed-forward neural network (FFNN) model. Then, we contrast those outcomes with those produced using more sophisticated Elman, fuzzy logic, and radial basis function networks. Any problem with finite input-output mapping may be solved using the FFNN as long as it has at least one hidden layer and a sufficient number of neurons. An ANN in which RBFs are used as activation functions is called a radial basis function network (RBFN). Utilizing the Levenberg-Marquardt Back Propagation technique, the FFNN and Elman networks are trained in this study. A Fuzzy Inference System (FIS) of Sugeno type is employed to replicate the predictive procedure within the realm of fuzzy logic. We choose the optimal RBF values using several clustering techniques. The approaches were validated using public stock market data on the National Stock Exchange of Indonesia.
... The adaptive network-based fuzzy inference system (ANFIS) is a combination of fuzzy and ANN learning algorithms of five layers (Çatık et al., 2020;Geng & Wang, 2010;Khuntia & Hiremath, 2019;Li et al., 2020;Lv et al., 2020;Smyth & Narayan, 2018;Wang et al., 2020;Xiao et al., 2018) and (Fan et al., 2012) that is used in a variety of applications (Nadimi et al., 2010;Wang et al., 2020). The ANFIS model has achieved superior accuracy results in many applications as a forecasting model in comparison to the ANN model independently (Abiyev & Abiyev, 2012;Chang & Fan, 2008;Homayouni & Amiri, 2011;Honghui & Yongqiang, 2012;Munandar, 2015;Sehgal et al., 2014;Septiarini et al., 2016). Therefore, this paper exploits using the ANFIS model in forecasting the fluctuations in the cryptocurrency stock market. ...
Chapter
This study aims to model and enhance the forecasting accuracy of cryptocurrency market data patterns using the daily bitcoin (BTC) close price data with 1535 observations from December 2017 to January 2022. The model employs a nonlinear spectral model of maximum overlapping discrete wavelet transform (MODWT) with Haar mathematical functions in conjunction with an adaptive network-based fuzzy inference system (ANFIS). We have selected the logarithm volume of bitcoin (LV) and logarithm trade count (LCT) as input values according to correlation and multiple regressions. The input and output variables have been collected from the cryptocurrency market. The performance of the proposed model (MODWT-Haar-ANFIS) is compared with two traditional models: the autoregressive integrated moving average (ARIMA) model and the ANFIS model. The obtained results show that the performance of MODWT-Haar-ANFIS is better than that of the traditional models. Therefore, the proposed forecasting model is a promising approach that is capable of being deployed in the cryptocurrency markets.
Article
The intricacies and dynamism of financial markets pose challenges to models seeking to comprehensively capture the multitude of factors influencing stock price movements. As such, there remains room for improvement in forecasting accuracy. In response, we introduce a novel approach that unifies the Root Mean Square Error (RMSE) loss functions of Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN). By concurrently optimizing their RMSE loss functions, our novel approach makes use of the capabilities of LSTM for learning long-term time series relationships and CNN for extracting deep features from data. To maximize the efficacy of each model branch within this unified framework, we split the training set into two different representations, one consisting of standard time series data and the other of standard picture data. We compare our proposed model to others in the field to demonstrate its viability, particularly Backpropagation (BP), LSTM, CNN, and a fusion LSTM-CNN model. Experimental evaluations conducted on three diverse datasets (Development Bank, Stock Connect Index (SCI), and Composite Index (CI)) validate the robust predictive performance and applicability of our joint RMSE loss LSTM-CNN model, thus showcasing its potential in financial forecasting.
Article
Emerging markets, such as the Chinese financial market, are occasionally subject to extreme risk events that result in investor losses during the investment process. To address the challenge of investment selection amidst market fluctuations, considering the fuzzy uncertainty and tail risk compensation based on the asymmetric perspective, we propose to use the lower VaR ratio and the upper VaR ratio as investment objectives to construct a multi-period credibilistic portfolio selection model. The study reveals that the cumulative returns and terminal wealth of the constructed model surpassed those of the benchmark models, delivering greater social and economic welfare to investors. During extreme events, investors could promptly adjust their portfolio structure to achieve higher investment returns. Investors who prefer the lower VaR ratio tend to make conservative investment decisions and allocate a higher proportion to defensive assets, such as bonds and risk-free assets. Conversely, investors who favor the upper VaR ratio are inclined to adopt aggressive investment strategies and allocate a larger proportion to high-risk stocks. The findings demonstrate that the proposed model offers differentiated investment decisions, and the research conclusions serve as valuable references for investors engaged in multi-period asset allocation and risk management.
Article
This paper develops a neural network model that predicts the proper time to move investment funds in and out of the stock market to maximize investment return. The goal is to use data which is easy to obtain, inexpensive and timely, enabling any investor to exploit neural network technology. The model is estimated using two valuation indicators, two monetary policy indicators, and four technical market indicators to explain the four-week forward excess return on the Standard and Poor's 500 Stock Index over the U.S. Treasury Bill yield. The model is simulated out of sample and the results are compared to buy-and-hold strategies of investing in stocks and Treasury Bills alone. The model is shown to significantly outperform buy-and-hold strategies on both an absolute and a risk-adjusted basis.
Article
Learning and evolution are two fundamental forms of adaptation. There has been a great interest in combining learning and evolution with artificial neural networks (ANN's) in recent years. This paper: 1) reviews different combinations between ANN's and evolutionary algorithms (EA's), including using EA's to evolve ANN connection weights, architectures, learning rules, and input features; 2) discusses different search operators which have been used in various EA's; and 3) points out possible future research directions. It is shown, through a considerably large literature review, that combinations between ANN's and EA's can lead to significantly better intelligent systems than relying on ANN's or EA's alone.
Article
After summarizing the properties of wavelets that are most likely to be useful in economic and financial analysis, the literature on the application of wavelet techniques in these fields is reviewed. Special attention is given to the potential for insight into the development of economic theory or the enhancement of our understanding of economic phenomena. The paper is concluded with a section containing speculations about the relevance of wavelet analysis to economic and financial time series given the experience to date. This discussion includes some suggestions about improving our understanding and evaluation of forecasts using a wavelet approach.
Article
We attempt to recover a function of unknown smoothness from noisy sampled data. We introduce a procedure, SureShrink, that suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: A threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein unbiased estimate of risk (Sure) for threshold estimates. The computational effort of the overall procedure is order N · log(N) as a function of the sample size N. SureShrink is smoothness adaptive: If the unknown function contains jumps, then the reconstruction (essentially) does also; if the unknown function has a smooth piece, then the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothness adaptive: It is near minimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods (kernels, splines, and orthogonal series estimates), even with optimal choices of the smoothing parameter, would be unable to perform in a near-minimax way over many spaces in the Besov scale. Examples of SureShrink are given. The advantages of the method are particularly evident when the underlying function has jump discontinuities on a smooth background.
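The thresholding idea in the abstract above can be illustrated with a minimal sketch. It assumes the wavelet coefficients of one resolution level are already available and have unit noise variance, and it picks a soft threshold by minimizing the Stein unbiased risk estimate over the candidate values 0 and |c_i|. The function names are hypothetical, and the sketch omits refinements of the published procedure; it is not the authors' implementation.

```python
def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; values with |c| <= t become 0."""
    out = []
    for c in coeffs:
        if abs(c) > t:
            out.append((abs(c) - t) * (1.0 if c > 0 else -1.0))
        else:
            out.append(0.0)
    return out


def sure_risk(coeffs, t):
    """Stein unbiased risk estimate for soft thresholding at level t.

    Assumes (for this sketch) that the coefficients carry unit-variance noise.
    """
    n = len(coeffs)
    n_small = sum(1 for c in coeffs if abs(c) <= t)
    return n - 2 * n_small + sum(min(abs(c), t) ** 2 for c in coeffs)


def sure_threshold(coeffs):
    """Return the candidate threshold (0 or some |c_i|) with the smallest SURE."""
    candidates = [0.0] + sorted(abs(c) for c in coeffs)
    return min(candidates, key=lambda t: sure_risk(coeffs, t))


# Illustrative coefficients for one resolution level (made-up values).
level_coeffs = [0.1, -0.2, 5.0, 0.05]
t_star = sure_threshold(level_coeffs)
denoised = soft_threshold(level_coeffs, t_star)
```

In a full multi-level decomposition, the same selection would be repeated separately for each resolution level, which is what makes the thresholding level-adaptive.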