802 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 38, NO. 6, NOVEMBER 2008
A Hybrid System Integrating a Wavelet and TSK
Fuzzy Rules for Stock Price Forecasting
Pei-Chann Chang and Chin-Yuan Fan
Abstract—The prediction of future time series values based on
past and present information is very useful and necessary for var-
ious industrial and financial applications. In this study, a novel
approach that integrates the wavelet and Takagi–Sugeno–Kang
(TSK)-fuzzy-rule-based systems for stock price prediction is devel-
oped. A wavelet transform using the Haar wavelet will be applied
to decompose the time series in the Haar basis. From the hierar-
chical scalewise decomposition provided by the wavelet transform,
we will next select a number of interesting representations of the
time series for further analysis. Then, the TSK fuzzy-rule-based
system is employed to predict the stock price based on a set of
selected technical indices. To avoid rule explosion, the k-means al-
gorithm is applied to cluster the data and a fuzzy rule is generated
in each cluster. Finally, a K nearest neighbor (KNN) is applied as a
sliding window to further fine-tune the forecasted result from the
TSK model. Simulation results show that the model has success-
fully forecasted the price variation for stocks with accuracy up to
99.1% in Taiwan Stock Exchange index. Comparative studies with
existing prediction models indicate that the proposed model is very
promising and can be implemented in a real-time trading system
for stock price prediction.
Index Terms—Fuzzy ruled system, K-mean clustering, multi-
ple regression analysis (MRA), simulated annealing (SA), wavelet
preprocessing.
I. INTRODUCTION
MINING stock market tendencies is a challenging task
due to the market's high volatility and noisy environment. Many
factors influence the performance of a stock market including
political events, general economic conditions, and traders’ ex-
pectations. Though stocks and futures traders have relied heav-
ily upon various types of intelligent systems to make trading
decisions, the success so far is quite limited [4].
Many attempts have been made to predict the financial mar-
kets, ranging from traditional time series approaches to artificial
intelligence techniques such as fuzzy systems, and especially, ar-
tificial neural network (ANN) methodologies [1]. However, the
main drawback with ANNs, and other black-box techniques,
is the tremendous difficulty in interpreting the results. They
do not provide an insight into the nature of the interactions
between the technical indicators and the stock market fluctuations.
Thus, there is a need to develop methodologies that facilitate
an increased understanding of market processes, in addition
to providing temporally accurate predictions [17], [35], [65].
Another issue is that the high dimensionality of financial time
series data poses a further challenge for ANN approaches.
Manuscript received March 9, 2007; revised August 5, 2007 and January 3,
2008. First published September 26, 2008; current version published October
20, 2008. This paper was recommended by Associate Editor G. Papadimitriou.
P.-C. Chang is with the Department of Information Management, Yuan
Ze University, Taoyuan 32026, Taiwan, R.O.C. (e-mail: iepchang@saturn.yzu.edu.tw).
C.-Y. Fan is with the Department of Industrial Engineering and Management,
Yuan Ze University, Taoyuan 32026, Taiwan, R.O.C. (e-mail: S948906@mail.yzu.edu.tw).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMCC.2008.2001694
The development of a timely and accurate trading decision-
making tool is a key for stock traders to make profits. Since
the stock price series is affected by a mixture of deterministic
and random factors [17], new tools and techniques are needed
in dealing with noise and nonlinearity in stock price prediction.
Data mining, aimed at finding rules hidden in very large amounts
of data, is a new and efficient approach for time series analysis.
Data mining on time series needs to translate the continuous
time series into discrete symbol sequences first. In this work,
the wavelet transform using the Haar wavelet will be applied to
decompose the time series in the Haar basis. From the hierarchi-
cal scalewise decomposition provided by the wavelet transform,
we will next select a number of interesting representations of the
time series for further analysis. In addition, statistical analysis
is employed to select important factors that affect the perfor-
mance of the stock market the most. These factors are chosen as
the inputs of the Takagi–Sugeno–Kang (TSK) fuzzy-rule-based
system to predict the future stock price. The TSK fuzzy system is
chosen for its universal approximation capability [54] and for the
insights it offers into the data, which are of particular interest
for stock price prediction.
The proposed framework combines several soft computing
(SC) techniques such as a wavelet transform, TSK fuzzy sys-
tem, data clustering, simulated annealing (SA), and K nearest
neighbor (KNN). In addition to wavelet-based data representa-
tion, the k-means clustering algorithm is applied to cluster the
data before the TSK fuzzy rules are generated. A fuzzy rule is
then generated for each cluster, which enables us to determine
the membership functions of the fuzzy subsets, and the optimal
number of fuzzy rules as well. Finally, a KNN is applied as a
sliding window to further fine tune the forecasted result from
the TSK model.
The remainder of the paper is organized as follows. Section II
reviews the different methods for stock forecasting using SC
techniques such as neural networks (NNs) and fuzzy systems.
Section III describes the proposed hybrid approach to stock
price prediction by integrating a wavelet with the TSK
fuzzy-rule-based system. Section IV presents empirical results of the
hybrid approach and compares it with three other approaches.
Finally, conclusions and future directions of the research are
discussed in Section V.
1094-6977/$25.00 © 2008 IEEE
Authorized licensed use limited to: Hong. Downloaded on February 8, 2009 at 00:06 from IEEE Xplore. Restrictions apply.
CHANG AND FAN: HYBRID SYSTEM INTEGRATING A WAVELET AND TSK FUZZY RULES 803
II. LITERATURE SURVEY
Conventional research addressing the stock forecasting problem
has generally relied on time series analysis techniques, e.g.,
mixed autoregressive moving average (ARMA) models as well as
multiple regression models (MRMs). However, these methods may
yield ineffective results, since a number of omitted factors such
as macroeconomic or political effects may seriously influence
stock tendencies.
White [59] was the first to use NNs for market forecasting. He
used a feedforward NN (FFNN) to study the IBM daily common
stock returns and he found that his training results were overop-
timistic, being the result of overfitting or of learning irrelevant
features. In general, there are two different methodologies for
stock price prediction in using ANN as a research tool [68]. The
first methodology is to consider the stock price variations as a
time series and predict the future price based on its past values. In
this approach, ANNs have been employed as the predictor, see,
e.g., [5], [10], [14], [16], [17], [27], [32], [37], [42], [48], [63],
and [64]. These prediction models, however, have their limita-
tions owing to the tremendous noise and high dimensionality
of stock price data. Therefore, the performances of the existing
models are not satisfactory [65].
The second approach takes the technical indices and qualita-
tive factors such as political effects into account in stock market
forecasting and trend analysis.
Yao and Poh [63] use technical indicators (%K and %D)
along with price information to predict future price values. They
achieved good returns, and found that their models performed
better using daily data rather than weekly data. Hobbs and Bour-
bakis [26] predict prices of stocks based on the fluctuations in
the rest of the market for the same day. They show consistently
high rates of return, although the investment is done in a fric-
tionless environment. Paying commissions on the large number
of trades instigated would certainly erode much of the benefit
from the trading strategy proposed. Austin and Looney [8] de-
velop an NN that predicts the proper time to move money into
and out of the stock market. They used two valuation indicators,
two monetary policy indicators, and four technical indicators
to predict the four week forward excess return on the dividend
adjusted S&P 500 stock index. The results significantly
outperformed the buy-and-hold strategy. Backpropagation ANNs
have been applied to predict future elements in the price time series
of the Korea composite stock price index (KOSPI) [30]. López
et al. [38] use time delay connections in enhanced NNs (that is,
the addition of time-dependent information in each weight) to
forecast IBEX-35 (Spanish stock index) close prices one
day ahead. Stochastic NNs have been applied to forecast the
volatility of index returns in the TUNINDEX (Tunisian stock index),
with out-of-sample NN results found to be superior to those of
traditional generalized autoregressive conditional heteroskedastic
(GARCH) models [52]. Nenortaite and Simutis [41] present
a trading approach based on one-step ahead profit estimates
created by combining NNs with particle swarm optimization
algorithms. The method is profitable given small commission
costs, but does not exceed the S&P500 returns when realistic
commissions are introduced. Jaruszewicz and Mandziuk [29]
train ANNs using both technical analysis variables and inter-
market data, to predict one day changes in the NIKKEI index.
They achieve good results using moving average convergence
divergence (MACD), Williams, and two averages, along with
related market data from the National Association of Securities
Dealers Automated Quotation System (NASDAQ) and DAX.
A recent trend is to combine the SC technologies of NNs,
fuzzy logic (FL), and genetic algorithms (GAs), which
may significantly improve an analysis [1], [2], [9], [11]–[13],
[19], [24], [25], [27], [31], [36], [39], [40], [51], [53], [55]–[57],
[61]. In general, NNs are used for learning and curve
fitting, FL is used to deal with imprecision and uncertainty, and
GAs are used for search and optimization [13], [28], [39], [62].
Zadeh [65] pointed out that merging these technologies allows
for the exploitation of a tolerance for imprecision, uncertainty,
and partial truth to achieve tractability, robustness, and low
solution cost.
Wavelet analysis is a relatively new field in signal process-
ing [18]. Wavelets are mathematical functions that decompose
data into different frequency components, and then, study each
component with a resolution matched to its scale—a scale refers
to a time horizon [47]. Wavelet filtering is particularly rele-
vant to volatile and time-varying characteristics of real-world
time series and is not restrained by the assumption of station-
arity [44]. The wavelet transform decomposes a process into
different scales, which makes it useful in differentiating season-
alities, revealing structural breaks and volatility clusters, and
identifying local and global dynamic properties of a process
at these timescales [5], [23], [45]. Wavelet analysis has been
shown to be especially productive in analyzing, modeling, and
predicting the behavior of financial instruments as diverse as
stocks and exchange rates [43], [46].
In this paper, wavelet filtering is used because of the key
property of wavelets for economic analysis: the decomposition
of data by time scale. Economic and financial systems, like many
other systems, contain variables that operate on a variety of time
scales simultaneously so that the relationships between variables
may well differ across time scales. A wavelet will decompose
the time series into a range of frequency scales [7], [19], [22].
The lower level of the decomposition can capture the long range
dependencies with only a few coefficients, while the higher lev-
els capture the usual short-term dependencies. This research,
motivated by the effective preprocessing capability of wavelets
and the predictive power of fuzzy rule system, presents a hybrid
system by integrating the wavelet and a TSK fuzzy rule system
for stock price prediction.
III. METHODOLOGY
Traditionally, two major approaches are considered in stock
forecasting: technical analysis and fundamental analysis.
Technical analysis concentrates on the study
of market action, and fundamental analysis concentrates on the
economic forces of supply and demand that cause price movements.
In addition, as explained in [68], statistics, technical analysis,
fundamental analysis, and linear regression are all used to
attempt to predict the market's direction. However, technical
indexes alone often miss many potential opportunities
in the stock movement before the appropriate trading signal is
generated. Thus, although technical analysis may yield insights
into the market, its highly subjective nature and inherent time
delay make it less than ideal for fast, dynamic trading markets.
That is why a wavelet and TSK fuzzy-rule-based model is
developed in this research.
Fundamental analysis must rely on the reasons for price movement,
and this process is very complicated since so many factors,
such as political and psychological events, may affect the price
change. Therefore, the basic assumption of
this research is that the price movement is closely related to the
variation of technical indexes, as widely applied in financial time
series research. A set of technical indexes is applied as
input factors and the output is the stock price. To study
their relationship, a hybrid method that integrates a wavelet and
a Takagi and Sugeno fuzzy-system-based forecasting model is
developed and implemented in this research for Taiwan stock
price prediction. The main procedures of the hybrid system are
shown in Fig. 1, and the inputs and outputs of each block are further
explained in the following sections.
The notation of variables applied in the following sections is
shown in Table I.
A. Data Preprocessing Using Wavelet Theory
Wavelet theory is applied as a data preprocessing method
because, as mentioned by Ramsey [46], the wavelet
representation is able to deal with the nonstationarity
involved in economic and financial time series.
One of the benefits of a wavelet approach is its flexibility in
handling very irregular data series, as illustrated in [47].
Economic and financial systems contain variables that operate on a
variety of time scales simultaneously, so the relationship
between variables may differ across time scales. The most important
property of wavelets for economic analysis is decomposition by
time scale.
In this research, the Haar wavelet is applied as our major wavelet
transform tool. The Haar wavelet is a wavelet evolved from the
continuous wavelet transform. According to [3], wavelets not only
decompose the data in terms of time and frequency, but can also
greatly reduce processing time. For a time series of size n, the
wavelet decomposition used here can be computed in O(n)
time. Comparing the Haar wavelet with the Coiflets wavelet [18],
the Coiflets wavelet considers more aspects than the Haar wavelet,
especially in combining compact support with various degrees of
smoothness and numbers of vanishing moments [3], but the Haar
wavelet still offers simple and fast processing without
losing much in performance relative to other wavelet systems [48].
The Haar wavelet has been widely applied in time series
forecasting [6], [48].
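Depending on normalization rules, there are two types of
Haar wavelets within a given function family: father and mother
wavelets

$$
\begin{aligned}
\Phi_{a,b}(t) &= 2^{-a/2}\,\Phi\!\left(\frac{t - 2^{a} b}{2^{a}}\right) \\
\psi_{a,b}(t) &= 2^{-a/2}\,\psi\!\left(\frac{t - 2^{a} b}{2^{a}}\right) \\
\text{Father wavelets: } & \int \Phi(t)\,dt = 1 \\
\text{Mother wavelets: } & \int \psi(t)\,dt = 0.
\end{aligned}
\tag{1}
$$

Fig. 1. Framework of the hybrid model.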
Father wavelets are used for the “lowest frequency” smooth
components, i.e., those requiring wavelets with the widest support,
and mother wavelets are used for the “higher frequency” detail
components. In other words, father wavelets are used for the
“trend components” and mother wavelets are used for all the
deviations from trend. While a sequence of mother wavelets is used
to represent a function, only one father wavelet is used.
A time series, i.e., a function $f(t)$, is the input to be
represented by a wavelet analysis. It can be built up as a sequence
of projections onto father and mother wavelets indexed by both
$\{b\}$, $b = 0, 1, 2, \ldots$, and by $\{s\} = 2^{a}$, $a = 1, 2, 3, \ldots$. In
actual data analysis using discretely sampled data, it is necessary
to create a lattice over which the calculations will be made.
Mathematically, it is convenient to use a dyadic expansion as
illustrated in (1).

TABLE I. NOTATION OF VARIABLES APPLIED IN SECTION III
The coefficients in the expansion are given by the projections:

$$
s_{A,b} = \int f(t)\,\Phi_{A,b}(t)\,dt, \qquad
d_{a,b} = \int f(t)\,\psi_{a,b}(t)\,dt, \qquad a = 1, 2, \ldots, A
\tag{2}
$$

where A is the maximum scale sustainable by the number of
data points. The representation of the signal f(t) can now be
given by:
$$
f(t) = \sum_{b} s_{A,b}\,\Phi_{A,b}(t) + \sum_{b} d_{A,b}\,\psi_{A,b}(t)
+ \sum_{b} d_{A-1,b}\,\psi_{A-1,b}(t) + \cdots + \sum_{b} d_{1,b}\,\psi_{1,b}(t).
\tag{3}
$$
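The approximation can be represented as:

$$
\begin{aligned}
f(t) &= S_{A}(t) + D_{A}(t) + D_{A-1}(t) + \cdots + D_{a}(t) + \cdots + D_{1}(t) \\
S_{A}(t) &= \sum_{b} s_{A,b}\,\Phi_{A,b}(t) \\
D_{a}(t) &= \sum_{b} d_{a,b}\,\psi_{a,b}(t), \qquad a = 1, 2, \ldots, A.
\end{aligned}
\tag{4}
$$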
When the number of observations $n$ is divisible by $2^{A}$,
the number of coefficients of each type is given by the
following:
1) at the finest scale, $2^{1}$: $n/2$ coefficients $d_{1,b}$;
2) at the next scale, $2^{2}$: $n/2^{2}$ coefficients $d_{2,b}$;
3) at the coarsest scale, $2^{A}$: $n/2^{A}$ coefficients $d_{A,b}$; and
4) at the coarsest scale, $2^{A}$: $n/2^{A}$ coefficients $s_{A,b}$
so that the detail coefficients of all scales together with the
smooth coefficients of the coarsest scale add up to $n$:

$$
n = \frac{n}{2} + \frac{n}{4} + \cdots + \frac{n}{2^{A-1}} + \frac{n}{2^{A}} + \frac{n}{2^{A}}.
\tag{5}
$$

Fig. 2. Wavelet transforming process from f(t).
As shown in Fig. 2, $f(t)$ represents the original data, $S_{1}$
represents an approximation signal, and $D_{1}$ represents a detail
signal. We can define the multiresolution decomposition of a
signal by specifying $S_{A}$ at the coarsest scale, and

$$
S_{A-1} = S_{A} + D_{A}.
\tag{6}
$$

In general

$$
S_{a-1} = S_{a} + D_{a}
\tag{7}
$$

where $\{S_{A}, S_{A-1}, \ldots, S_{1}\}$ is a sequence of multiresolution
approximations of the function $f(t)$, at ever increasing levels of
refinement. The corresponding multiresolution decomposition
of $f(t)$ is given by $\{S_{A}, D_{A}, D_{A-1}, \ldots, D_{a}, \ldots, D_{1}\}$.
The sequence of terms $S_{A}, D_{A}, D_{A-1}, \ldots, D_{a}, \ldots, D_{1}$
represents a set of orthogonal signal components that provide
representations of the signal at resolutions 1 to $A$; each $D_{A-k}$
provides the orthogonal increment to the representation of the
function $f(t)$ at the scale, or resolution, $2^{A-k}$. When the data
pattern is very rough, the wavelet process is applied repeatedly.
In the preprocessing, the target is to minimize the mean
absolute percentage error (MAPE) between the signal before
and after transformation. In this way, the noise in the original
data can be removed.
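The decomposition in (3) to (7) can be illustrated with a minimal sketch. It assumes the orthonormal Haar filter (pairwise sums and differences scaled by 1/sqrt(2)) and a series whose length is divisible by 2^A; the function names, the toy price series, and the choice of discarding only the finest detail level are illustrative assumptions and are not taken from the paper.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decompose(signal, levels):
    """Return (s_A, [d_1, ..., d_A]) for an orthonormal Haar transform."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    smooth, details = list(signal), []
    for _ in range(levels):
        pairs = list(zip(smooth[0::2], smooth[1::2]))
        details.append([(x - y) / SQRT2 for x, y in pairs])  # detail coefficients d_a
        smooth = [(x + y) / SQRT2 for x, y in pairs]          # smooth coefficients s_a
    return smooth, details

def haar_reconstruct(smooth, details):
    """Invert haar_decompose, going from the coarsest scale back to the data."""
    current = list(smooth)
    for detail in reversed(details):
        finer = []
        for s, d in zip(current, detail):
            finer.extend([(s + d) / SQRT2, (s - d) / SQRT2])
        current = finer
    return current

def mape(actual, approx):
    """Mean absolute percentage error between two equally long series."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, approx)) / len(actual)

# Hypothetical price series with n = 8 = 2**3 observations.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0, 108.0]
smooth, details = haar_decompose(prices, levels=3)
restored = haar_reconstruct(smooth, details)

# Zero the finest details (D_1) to obtain the smoother approximation S_1.
denoised = haar_reconstruct(smooth, [[0.0] * len(details[0])] + details[1:])

print([len(d) for d in details], len(smooth))  # coefficient counts per scale
print(mape(prices, denoised))                  # distortion introduced by smoothing
```

For n = 8 and A = 3 the sketch yields 4, 2, and 1 detail coefficients plus 1 smooth coefficient, which matches the count in (5). Which detail levels to discard is a tuning decision; the MAPE between the original and the smoothed series is one way to monitor it.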
B. TSK-Fuzzy-System-Based Prediction
1) Input Selection Using Stepwise Regression Analysis: A
set of important technical factors that affect the forecasting
result, as shown in Table II, has been identified in [14].
These input factors are further selected through a
stepwise regression analysis (SRA) model in this research.
There are six indexes in total to select from:
$X_{1}$ = six-day moving average; $X_{2}$ = six-day bias
(BIAS); $X_{3}$ = six-day relative strength index (RSI); $X_{4}$ = nine-day
stochastic line (KD); $X_{5}$ = moving average divergence
(MABIAS); and $X_{6}$ = 13-day psychological line. The output
is $Y$ = stock price.
The SRA is applied to determine the set of independent variables
that most closely affect the dependent variable. This is
accomplished by repeated variable selection. The step-by-step
procedure of the SRA approach is explained in detail
in the following.
Step 1) Calculate the correlation coefficient ($r$) of each input
variable ($X_{1}, X_{2}, \ldots, X_{n}$) and the output data ($Y$). Then, a
correlation matrix is derived.
Step 2) Rank the variables according to their squared correlation
coefficients ($r^{2}$) from the correlation matrix (suppose $X_{i}$ is
the largest one at the current stage), and check the linear regression
of this variable on the output data, i.e., derive a regression
model $\hat{Y} = f(X_{i})$. The $\alpha$ value is applied to
assess the significance of each input variable. Repeat
this process until all variables are tested. Finally,
select the statistically significant variables for further
verification and assume that these variables are
($X_{1}, X_{2}, \ldots, X_{k}$).
Step 3) Calculate the partial F value for those statistically
significant variables, as shown in (8), and choose the variable
with the largest value among these input variables according to (9)
(assume that it is $X_{j}$). Then, derive another
regression model $\hat{Y} = f(X_{i}, X_{j})$ again

$$
F_{j} = \frac{\mathrm{MSR}(X_{j} \mid X_{i})}{\mathrm{MSE}(X_{j} \mid X_{i})}
= \frac{\mathrm{SSR}(X_{j} \mid X_{i})/(k - 2)}{\mathrm{SSE}(X_{j} \mid X_{i})/(n - k)}, \qquad i \in I
\tag{8}
$$

$$
F_{j}^{*} = \max_{1 \le j \le N,\; j \notin I} (F_{j}).
\tag{9}
$$

Step 4) Calculate the partial F value of the original data for
input variable $X_{j}$. If the value is smaller than a user-defined
threshold, $X_{j}$ is removed from the model because
it is not statistically significant for the output.
Step 5) Repeat steps 3 to 4. If every input variable's partial F
value is greater than the user-defined threshold, stop;
this means that every input variable has a significant
influence on the output value. According to [10]
and [13], we always set the threshold value to 4. If
the F value of a specific variable is greater than the
user-defined threshold, it is added to the model as
a significant factor. When the F value of a specific
variable is smaller than the threshold, it
is removed from the model. The statistical software
Statistical Package for the Social Sciences (SPSS) for
Windows 10.0 was used for SRA in this research. The
flow diagram of SRA is shown in Fig. 3.
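The add-and-remove logic of Steps 3 to 5 can be sketched as follows. This is only an illustration under stated assumptions: it uses the textbook one-variable partial F statistic (extra sum of squares over residual mean square), whose degrees of freedom may differ from those written in (8), it is not the SPSS procedure used in the study, and all function names and the synthetic data are hypothetical. Only the threshold of 4 is taken from the text.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def sse(rows, y, cols):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns `cols`."""
    design = [[1.0] + [row[j] for j in cols] for row in rows]
    p = len(design[0])
    xtx = [[sum(d[a] * d[b] for d in design) for b in range(p)] for a in range(p)]
    xty = [sum(d[a] * yi for d, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bk * dk for bk, dk in zip(beta, d))) ** 2
               for d, yi in zip(design, y))

def partial_f(rows, y, selected, candidate):
    """Partial F of `candidate` given the variables already in `selected`."""
    full_cols = selected + [candidate]
    sse_full = sse(rows, y, full_cols)
    dof = len(y) - len(full_cols) - 1
    return (sse(rows, y, selected) - sse_full) / (sse_full / dof)

def stepwise_select(rows, y, threshold=4.0):
    """Add the best remaining variable while its partial F exceeds the threshold;
    after each addition, drop earlier variables whose partial F fell below it."""
    selected, remaining = [], list(range(len(rows[0])))
    while remaining:
        scores = {j: partial_f(rows, y, selected, j) for j in remaining}
        best = max(scores, key=scores.get)
        if scores[best] <= threshold:
            break
        selected.append(best)
        remaining.remove(best)
        for j in selected[:-1]:
            others = [v for v in selected if v != j]
            if partial_f(rows, y, others, j) < threshold:
                selected.remove(j)
    return selected

# Synthetic data: only columns 0 and 2 actually drive the output.
random.seed(0)
rows = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
y = [3.0 * r[0] - 2.0 * r[2] + random.gauss(0.0, 0.5) for r in rows]
selected = stepwise_select(rows, y)
print(selected)
```

A variable that is dropped in the removal pass is not reconsidered in this sketch, which guarantees termination; a production stepwise routine would normally handle re-entry and perfectly fitted models (zero residual) more carefully.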
Fig. 3. Flow diagram of SRA.

2) TSK Fuzzy Rule Systems: The TSK fuzzy system is selected
as a universal function approximator for the stock prediction
problem due to its ability to explain nonlinear relations
using a relatively low number of simple rules. The structure of
our TSK model is a multiple-input single-output fuzzy system,
and its associated fuzzy inference method comprises a set of $N$
IF-THEN rules in the following form:

$$
R_{i}:\ \text{If } x_{1} \text{ is } A_{i1},\ x_{2} \text{ is } A_{i2},\ \ldots,\ x_{n} \text{ is } A_{in},\
\text{then } y_{i} = \beta_{i0} + \beta_{i1} x_{1} + \cdots + \beta_{in} x_{n}
$$

where $x_{j}$, $j = 1, \ldots, n$, are the inputs of the fuzzy system, $n$ is
the number of inputs, $A_{ij}$, $i = 1, \ldots, N$, are fuzzy subsets, $N$
is the number of fuzzy rules, $y_{i}$ is the output of the $i$th rule, and
$\beta_{ij}$ are the parameters of the rule consequents. It is a first-order TSK
fuzzy rule system, and in this paper, Gaussian fuzzy membership
functions are adopted

$$
A_{ij}(x_{j}) = \exp\!\left(-\frac{(x_{j} - a_{ij})^{2}}{\sigma_{ij}^{2}}\right)
\tag{10}
$$

where $a_{ij}$ and $\sigma_{ij}$ are the mean and standard deviation of the
Gaussian functions. Given a crisp input vector $(x_{1}^{0}, \ldots, x_{n}^{0})$, the
crisp output of the TSK model is described by

$$
y = \frac{\sum_{i=1}^{N} w_{i}\, y_{i}}{\sum_{i=1}^{N} w_{i}}
\tag{11}
$$

where $w_{i}$ is the firing strength of rule $i$ determined by

$$
w_{i} = \prod_{j=1}^{n} A_{ij}(x_{j}^{0})
\tag{12}
$$

and

$$
y_{i} = \beta_{i0} + \beta_{i1} x_{1}^{0} + \cdots + \beta_{in} x_{n}^{0}.
\tag{13}
$$

The main task in TSK fuzzy-rule-based prediction is to de-
termine the parameters in the fuzzy membership functions and
in the rule consequences using a learning algorithm, given a set
of training data specifying the functional mapping between the
inputs and the output.
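A minimal sketch of the inference in (10) to (13) may help to fix the notation. It assumes that rule strengths are combined by a product over the inputs, as in (12); the dictionary layout of a rule, the function names, and the two toy rules are hypothetical and only serve as an illustration.

```python
import math

def gaussian_membership(x, center, sigma):
    """Membership grade of x in a Gaussian fuzzy subset, as in (10)."""
    return math.exp(-((x - center) ** 2) / sigma ** 2)

def tsk_predict(x, rules):
    """First-order TSK output: firing-strength-weighted mean of the rule outputs."""
    numerator = denominator = 0.0
    for rule in rules:
        # Firing strength w_i: product of the membership grades, as in (12).
        w = 1.0
        for xj, a, s in zip(x, rule["centers"], rule["sigmas"]):
            w *= gaussian_membership(xj, a, s)
        # Linear consequent y_i = beta_i0 + sum_j beta_ij * x_j, as in (13).
        beta = rule["beta"]
        y_i = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
        numerator += w * y_i
        denominator += w
    return numerator / denominator  # weighted average, as in (11)

# Two hypothetical rules on a single input.
rules = [
    {"centers": [0.0], "sigmas": [1.0], "beta": [1.0, 0.0]},  # "x is near 0" -> y = 1
    {"centers": [2.0], "sigmas": [1.0], "beta": [3.0, 0.0]},  # "x is near 2" -> y = 3
]

print(tsk_predict([1.0], rules))  # midway between the centers: both rules fire equally
print(tsk_predict([0.0], rules))  # dominated by the first rule
```

If an input lies far from every rule center, all firing strengths can underflow to zero and the division in the last step fails; a practical implementation would guard against that case.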
3) Data Clustering: The purpose of data clustering is to cluster
the set of financial time series data into different groups, so that
the data in each group have more homogeneous characteristics.
However, it is very important to determine beforehand how many
fuzzy rules should be generated. If a standard rule structure
is used, rule explosion occurs when the number of inputs is
high. To resolve this problem, we divide the training data into
a number of clusters based on the output data (stock price), and
one fuzzy rule is generated for each cluster [32]. By doing this,
the number of fuzzy rules can be reduced effectively. Besides,
we can determine the fuzzy membership functions using the
mean and standard deviation of the data points that belong to
each cluster.
The K-means clustering algorithm is employed for data clustering.
K-means is a nonhierarchical clustering technique in
which the dataset is partitioned into $K$ clusters. During
clustering, the data points are first assigned to the clusters at
random and then reassigned iteratively to minimize the following
squared error (SE):

$$
SE = \sum_{i=1}^{K} \sum_{p \in C_{i}} |p - m_{i}|^{2}
\tag{14}
$$

where $p$ are the data points in cluster $C_{i}$, $m_{i}$ is the center of
cluster $C_{i}$, and $K$ is the number of clusters.
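Once the training data are clustered, we can calculate the
parameters of the membership functions for each cluster as
follows:

$$
a_{ij} = \frac{1}{s_{i}} \sum_{l=1}^{s_{i}} x_{j}^{(l)}, \qquad
\sigma_{ij} = \sqrt{\frac{1}{s_{i} - 1} \sum_{l=1}^{s_{i}} \left(x_{j}^{(l)} - a_{ij}\right)^{2}}
\tag{15}
$$

where $s_{i}$ is the number of data points in cluster $i$ and $x_{j}^{(l)}$
denotes the $j$th input of the $l$th data point in that cluster. In addition,
the output of the training data is also normalized using the mean
and standard deviation of the data in each cluster.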
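The clustering step and the membership parameters in (15) can be sketched as follows. Two simplifications are assumed for reproducibility and are not taken from the paper: the cluster centers are initialized at evenly spaced quantiles of the output rather than at random, and the clustering is performed on the scalar output only. All names and the toy data are hypothetical.

```python
import math

def kmeans_1d(values, k, max_iter=100):
    """Lloyd-style K-means on scalars; returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic initialization at quantile midpoints (an assumption).
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels, centers

def membership_params(inputs, labels, k):
    """Per cluster and per input: (mean a_ij, sample standard deviation sigma_ij)."""
    params = []
    for c in range(k):
        members = [row for row, lab in zip(inputs, labels) if lab == c]
        s = len(members)  # assumed to be at least 2 for every cluster
        cluster = []
        for j in range(len(inputs[0])):
            column = [row[j] for row in members]
            mean = sum(column) / s
            variance = sum((v - mean) ** 2 for v in column) / (s - 1)
            cluster.append((mean, math.sqrt(variance)))
        params.append(cluster)
    return params

# Toy training set: one technical index per row and the matching stock price.
inputs = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
prices = [10.0, 11.0, 12.0, 50.0, 51.0, 52.0]

labels, centers = kmeans_1d(prices, k=2)   # cluster on the output, one rule per cluster
params = membership_params(inputs, labels, k=2)
print(labels, centers)
print(params)
```

Each tuple in `params` supplies the center and width of one Gaussian membership function, so the number of clusters directly fixes the number of fuzzy rules.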
4) Optimization of the Parameters in Fuzzy Rules Using
Simulated Annealing: The purpose of applying SA is to
find a set of best values for the parameters within the fuzzy
rules. Traditionally, the parameter settings of TSK rules are
generated using the gradient method. The generalized gradient
algorithm searches for the solution in a multidimensional space
along the steepest ascent direction. However, such a search can
be extremely slow and ineffective if the objective function has many
plateaus distributed throughout the landscape. Therefore, this
method may not be able to derive an optimal solution and can
be trapped in a local optimum, as shown in Fig. 4. In such cases,
statistical search methods may offer a better strategy for resolving
both problems [60], [66].

Fig. 4. Gradient method as compared with a simulated annealing approach.
One of the most widely used statistical search methods is
SA [33], which uses the Metropolis algorithm to decide whether
to accept or reject a configuration that results in an increased
cost during its search for the minimum cost.
The main characteristics of SA are its simplicity and rapid
convergence. To properly adjust the parameters' weights of the TSK
fuzzy rules, the SA approach is effective if the chosen energy,
or cost function, for the global system is appropriate. In this
study, the cost function is defined as the MAPE for the set of
testing data, i.e., a series of stock index values. The procedure of
SA is well known, as described in [33]. First, random values of
the parameters' weights are generated, and second, the associated
cost of the system is computed. This cost is minimized when
the parameters' weights reach a global minimum, and because
cost-increasing configurations are occasionally accepted, the
method can escape from local minima.
The detailed setup of the SA parameters will be described
in the next section. Through a proper setup of the cooling
schedule, SA can be applied to derive a set of near-optimal
parameters for these TSK rules, as shown in Fig. 4.
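The Metropolis acceptance rule described above can be sketched in a few lines. The geometric cooling schedule, the step size, the number of moves per temperature, and the toy linear model are all illustrative assumptions; the parameter settings actually used in the study are given later in the paper and are not reproduced here.

```python
import math
import random

def mape(actual, predicted):
    """Mean absolute percentage error, used here as the SA cost."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def simulated_annealing(cost, start, step=0.5, t0=1.0, cooling=0.95,
                        moves_per_temp=50, t_min=1e-3, seed=0):
    """Minimize `cost` over a list of real parameters with Metropolis acceptance."""
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    temperature = t0
    while temperature > t_min:
        for _ in range(moves_per_temp):
            candidate = [v + rng.uniform(-step, step) for v in current]
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept a worse configuration
            # with probability exp(-delta / T) (Metropolis criterion).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost

# Toy problem: tune the consequent y = b0 + b1 * x of a single rule.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [12.0, 14.0, 16.0, 18.0, 20.0]

def cost(params):
    b0, b1 = params
    return mape(ys, [b0 + b1 * x for x in xs])

start = [0.0, 0.0]
best, best_cost = simulated_annealing(cost, start)
print(best, best_cost)
```

Because the best configuration seen so far is stored separately from the current one, the returned cost can never exceed the cost of the starting point, even though the search itself is allowed to move uphill.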
5) Using K-Nearest-Neighbor as a Sliding Window: The basic
idea of KNN [20] is to identify patterns in the historical data
that are similar to the current data trend. We use KNN as a sliding
window to forecast the data value for the next day, using the
most recent data points as a time window to search the historical
data for similar patterns. Basically,
our approach is categorized as a one-step-ahead prediction. The
selected data are preprocessed with the wavelet, and then the TSK
model is applied to generate a set of fuzzy rules for prediction
of the stock price. In addition, the KNN sliding window is
applied to further reduce the forecasting errors. The set of
historical data is divided into training and testing sets for cross
validation. KNN is simpler than other SC approaches because there
is no model to train on the data series. Instead, each time a
forecast needs to be made, the data series is searched for
situations similar to the current one.
To describe the KNN process, several terms have to be defined
first. Assume the window size is L, which means there are L data
points in each window. The final L data points of the
data series are the reference data, so the length of the reference
equals the window size. To forecast the next data point of the series,
the reference is compared to the first group of L data points in
the historical data series, called a candidate, and an error is
computed. Then, the window is moved one data point forward
to the next candidate and another error is computed, and so on.
An error is calculated by subtracting the candidate values from
the reference values. All errors of the testing data are sorted and
stored in an array. Assume that the number of nearest neighbors
is H. Then, the H candidates with the smallest errors
are selected. Finally, the forecasted value is
the average of the H data points that follow the selected
candidates. To forecast the next data point, the process is repeated
with the previously forecasted data point appended to the end of
the data series. This process can be repeated iteratively until all
n data points are calculated.
KNN is used to calculate the new forecasted value as follows.
Step 1) Use the original data as the contrast data. Suppose we forecast data number $i$ at index $t+1$, i.e., the value $\hat{X}_{i,t+1}$.
Step 2) Use the data from index $t-L+1$ to $t$ as the contrast base. Using the sliding window method, compare it one by one with the windows starting at indices $1$ to $t-L$, calculate the Euclidean distance $D_j^{(1)}$ for every interval, and find the corresponding forecasting value $F_j^{(1)}$:
$$D_j^{(1)} = \sqrt{\sum_{l=1}^{L} \left( X_{i,t-L+l} - X_{i,l+j-1} \right)^2}, \qquad F_j^{(1)} = X_{i,j+L}. \tag{16}$$
In (16), $j = 1, \ldots, t-L$.
Step 3) Consider all $D_j^{(1)}$ and find the $k$th smallest values for $k = 1, \ldots, H$, where $H$ is the K option value of KNN.
Step 4) Use the weighted voting method to find the final forecasting value $\hat{X}_{i,t+1}$. The equation is
$$\hat{X}_{i,t+1} = \frac{\sum_{k=1}^{H} F_k / W_k}{\sum_{k=1}^{H} 1 / W_k}. \tag{17}$$
Here, $W_k$ is the $k$th smallest $D_j^{(1)}$ value, $F_k$ is the $F_j^{(1)}$ value corresponding to the $k$th smallest $D_j^{(1)}$ value, and $k = 1, \ldots, H$. The parameter set of the sliding window is $(L, H)$. A simple example of KNN forecasting with window size $L = 3$ and $H = 2$ is shown in Fig. 5.
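The four steps above can be sketched in Python for a single series. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `knn_sliding_forecast` and the small constant `eps` (added so that an exact match with zero distance does not cause a division by zero in the weighted vote) are hypothetical and do not appear in the paper.

```python
import numpy as np


def knn_sliding_forecast(series, L, H, eps=1e-12):
    """One-step-ahead KNN sliding-window forecast in the spirit of (16)-(17).

    series : 1-D sequence X_1, ..., X_t of past values
    L      : window size
    H      : number of nearest neighbors
    eps    : hypothetical guard against zero distances (not in the paper)
    """
    x = np.asarray(series, dtype=float)
    t = len(x)
    if t <= L:
        raise ValueError("series must be longer than the window size L")

    reference = x[t - L:]            # X_{t-L+1}, ..., X_t
    n_candidates = t - L             # j = 1, ..., t - L
    H = min(H, n_candidates)

    # Row j (0-based) is the candidate window x[j : j + L].
    windows = np.lib.stride_tricks.sliding_window_view(x, L)[:n_candidates]
    distances = np.sqrt(((windows - reference) ** 2).sum(axis=1))  # D_j
    successors = x[L:]               # F_j = X_{j+L}, the value after window j

    nearest = np.argsort(distances, kind="stable")[:H]
    weights = 1.0 / (distances[nearest] + eps)                     # 1 / W_k
    return float(np.sum(weights * successors[nearest]) / np.sum(weights))
```

Because the weights are inverse distances, a candidate that matches the reference almost exactly dominates the vote, while with equal distances the result reduces to a plain average of the successor values.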
C. Different Models to be Compared With
In this research, we use traditional back-propagation neural
networks (BPNs), the MRM, and a forecasting method that
integrates a GA with Wang and Mendel’s algorithm for fuzzy rule
generation (GAWM) [13], [58] for comparison with our wavelet
TSK fuzzy rule forecasting system. These three models are
briefly introduced below.
BPN [49] is a popular system that has been widely employed
in financial forecasting. The most popular training method for
BPN is supervised learning, i.e., learning by samples, which
is adopted in this research to train the system. After learning
(or training), the trained connection weights can be used for
the prediction of future occurrences. Technical indices
selected by SRA are applied as the inputs to the BPN, and the
output is the Taiwan stock index. The detailed parameter
settings for BPN are described in Section IV.
TABLE II
TECHNICAL INDICES USED AS INPUT VARIABLES
Multiple regression analysis (MRA) [12] is one of the most
popular methods applied in business forecasting. MRA is
employed for testing hypotheses with regard to the relationship
between a dependent variable (Y) and two or more independent
variables (Xs). It is easy to establish a model when there is a
linear relationship between the independent variables and the
dependent variable. On the other hand, it is very difficult to
establish an accurate model for a nonlinear relationship. In
this research, the output factor (Y) is the stock price and the
input factors include the six-day moving average (X1), six-day
bias (X2), six-day RSI (X3), nine-day stochastic line (X4),
moving average divergence (X5), and the 13-day psychological
line (X6). The multiple regression formula of this problem can
be defined as follows:
$$Y = a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4 + a_5 X_5 + a_6 X_6 + b. \tag{18}$$
Among them, the parameters $a_1$, $a_2$, $a_3$, $a_4$, $a_5$, $a_6$, and $b$ are all
calculated via SPSS statistics software.
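As an illustration of how the coefficients in (18) can be estimated, the following sketch fits them by ordinary least squares with NumPy. The paper obtains them with SPSS; the functions `fit_mra` and `predict_mra` are hypothetical names introduced here, and the sketch assumes the six indices are supplied as the columns of a matrix.

```python
import numpy as np


def fit_mra(X, y):
    """Least-squares estimate of a_1..a_6 and b in Y = sum(a_i * X_i) + b.

    X : array of shape (n_samples, n_factors), one column per technical index
    y : array of shape (n_samples,), the stock price
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so that the intercept b is estimated as well.
    design = np.column_stack([X, np.ones(len(X))])
    params, *_ = np.linalg.lstsq(design, y, rcond=None)
    return params[:-1], params[-1]


def predict_mra(X, a, b):
    """Evaluate the fitted regression formula on new inputs."""
    return np.asarray(X, dtype=float) @ a + b
```

A model of this form can only capture the linear part of the relationship, which is consistent with the limitation noted in the paragraph above.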
Wang and Mendel [58] developed a method to create a fuzzy
rule base, which is a combination of rules generated from nu-
merical examples, i.e., historic stock data and linguistic rules
supplied by human experts [15]. The Wang and Mendel (WM)
method is evolved with GAs and the idea is similar to evolving
NN [60]. Essentially, a simple GA is used to determine the near-
optimal number of fuzzy terms for each variable; as a result, the
objective function can be better improved by this evolution.
Fig. 5. KNN forecasting with window size L = 3 and H = 2.
D. Performance Measures
There are many measures of prediction accuracy used to compare
forecasting methods out of sample [10]. Regarding the
dimensionality of the data in this research, there are six
input factors, one output value, and 494 training data. The
mean square error (MSE) of each model is very large after
training on this data set. Since the purpose of stock
prediction is to make a profit rather than just to predict the
future price accurately, we use MAPE instead of MSE as the
performance measure. The equation of MAPE is given below.
1) Mean Absolute Percentage Error: The accuracy of predictions
was measured with the following indicator, i.e., MAPE. The
average forecast error is measured as a percentage of the
historical results. Taking the absolute value prevents errors
of opposite signs from canceling each other. It is calculated
as follows:
$$\text{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|X_t - F_t|}{X_t} \times 100 \tag{19}$$
where $X_t$ is the true value and $F_t$ is the predicted value at time
$t$. The MAPE is an average over n test sets.
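Equation (19) translates directly into a few lines of Python. The sketch below is illustrative; the function name `mape` is an assumption, and the check for strictly positive true values is added here because (19) divides by $X_t$, which is unproblematic for a stock index but not for arbitrary data.

```python
import numpy as np


def mape(actual, predicted):
    """Mean absolute percentage error as in (19), returned in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    if np.any(actual <= 0):
        # Assumption of this sketch: true values are positive (e.g., prices).
        raise ValueError("true values must be strictly positive")
    return float(np.mean(np.abs(actual - predicted) / actual) * 100.0)
```

For example, true values of 100 and 200 with predictions of 110 and 180 each contribute a 10% error, so the MAPE is 10%.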
IV. SIMULATION RESULTS
The Taiwan Stock Exchange (TSE) began operations in 1962. At
the end of January 2005, the Taiwan Stock Exchange Corporation
(TSEC) had 699 listed companies with a market capitalization
topping NT$13.7 trillion (US$396 billion). Most stock trading
goes to the listed IT companies, and the trading value of the
TSE stock market places it in the top ten of stock exchanges in
the world.
The data set used for testing in this research is the TSE
index, and it has been divided into three different sets: the
training data, test data, and validation data. The data for the
TSE index are from July 18, 2003 to December 31, 2005, 614
records in total. During this period, the stock market went
through a rough up-and-down period owing to national political
issues. Therefore, these data are very representative and
suitable for study and analysis. The first 492 records are used
as training and cross-validation data, and the remaining 122
records are used as out-of-sample test data. To avoid
interaction among these factors, we test each factor using SRA
and identify the factors that significantly affect the final
forecasted results. The final combination of the factors is
determined after this analysis. The factors finally selected
are MA6 and BIAS6, and the output variable is the TSE index.
Fig. 6. Figures before and after a wavelet transformation for 492 stock price
data from Taiwan Stock Exchange index.
Before training the TSK fuzzy model, a wavelet transformation
is applied to preprocess the data. Based on the resulting MAPE,
a three-level wavelet preprocessing is chosen. Through this
process, the noise in the original data can be removed. The
result of the wavelet-based decomposition process is depicted
in Fig. 6. According to the SRA method described in Section
III-B, six input variables are finally selected as the inputs
to the TSK model to predict the stock price. They are the
six-day moving average (MA), six-day bias (BIAS), six-day RSI,
nine-day stochastic line (KD), the moving average divergence
(MABIAS), and the 13-day psychological line (PSY).
Fig. 7. MAPEs of the hybrid model for different numbers of data clusters.
TABLE III
PARAMETER SETTINGS FOR SIMULATED ANNEALING
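To make the three-level Haar preprocessing more concrete, the sketch below decomposes a series with the orthonormal Haar transform, discards the detail coefficients of all three levels, and reconstructs a smoothed series. This is one common way to remove noise with a wavelet and is an assumption of this example: the section does not state which coefficients the authors retain. The function names and the edge padding (needed because 492 is not a multiple of 8) are likewise hypothetical.

```python
import numpy as np


def haar_decompose(x, levels):
    """Orthonormal Haar transform: returns (approximation, [details per level])."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("length must be divisible by 2**levels")
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # detail coefficients
        approx = (even + odd) / np.sqrt(2.0)          # coarser approximation
    return approx, details


def haar_reconstruct(approx, details):
    """Inverse of haar_decompose."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx


def haar_smooth(x, levels=3):
    """Smooth a series by zeroing all Haar detail coefficients up to `levels`."""
    x = np.asarray(x, dtype=float)
    pad = (-len(x)) % (2 ** levels)
    padded = np.pad(x, (0, pad), mode="edge")         # repeat the last value
    approx, details = haar_decompose(padded, levels)
    zeroed = [np.zeros_like(d) for d in details]
    return haar_reconstruct(approx, zeroed)[:len(x)]
```

With all details removed, every block of eight consecutive points is replaced by its mean, so short-term fluctuations are suppressed while the coarse trend is kept.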
In the k-means clustering algorithm, the number of clusters
must be predefined. This is an interesting subject for further
investigation since there is no exact theorem explaining the
effect of the number of clusters on the forecasting accuracy.
To check the sensitivity of the performance of our model to
the number of clusters, various numbers of data clusters are
investigated. In the experiment, the stock price data were
clustered into two to eight clusters. The performance (MAPE)
of the algorithm with different numbers of clusters is shown
in Fig. 7.
As can be observed from the figure, MAPE initially decreases
as the number of clusters increases. However, once the number
of clusters reaches a certain value, MAPE starts to increase.
Part of the reason is that when the number of clusters is too
large, the number of data in each cluster is too small, and
the data in each cluster are not representative enough to
generate a model to forecast the future stock index.
Therefore, in this research, the number of clusters is set to
three since it provides the best performance (the smallest
MAPE). This number of clusters is not definitive and has to be
decided experimentally for different application purposes.
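The experimental selection of the number of clusters can be sketched as a simple sweep over two to eight clusters that keeps the value with the smallest validation MAPE. The example is a simplified stand-in rather than the authors' procedure: it uses a plain NumPy k-means and fits one linear model per cluster in place of a TSK rule, and all function names are hypothetical.

```python
import numpy as np


def nearest_center(X, centers):
    """Index of the closest center for every row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means; returns the cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest_center(X, centers)
        updated = centers.copy()
        for c in range(k):
            members = X[labels == c]
            if len(members):                 # keep old center if cluster is empty
                updated[c] = members.mean(axis=0)
        if np.allclose(updated, centers):
            break
        centers = updated
    return centers


def fit_linear(X, y):
    """Least-squares linear model with intercept (stand-in for one rule)."""
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def validation_mape(X_tr, y_tr, X_val, y_val, k):
    centers = kmeans(X_tr, k)
    tr_labels = nearest_center(X_tr, centers)
    fallback = fit_linear(X_tr, y_tr)        # used only for empty clusters
    coefs = [fit_linear(X_tr[tr_labels == c], y_tr[tr_labels == c])
             if np.any(tr_labels == c) else fallback
             for c in range(k)]
    val_labels = nearest_center(X_val, centers)
    design = np.column_stack([X_val, np.ones(len(X_val))])
    pred = np.array([design[i] @ coefs[c] for i, c in enumerate(val_labels)])
    return float(np.mean(np.abs(y_val - pred) / y_val) * 100.0)


def select_cluster_count(X_tr, y_tr, X_val, y_val, candidates=range(2, 9)):
    """Return the candidate k with the smallest validation MAPE and all scores."""
    scores = {k: validation_mape(X_tr, y_tr, X_val, y_val, k) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

The inputs are expected to be NumPy arrays with strictly positive target values. As in the text, the selected value depends on the data and should be re-evaluated for each application.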
The parameter settings at three different levels for the SA are
provided in Table III. We then use the statistical software
Minitab R14 to run the Taguchi experiments, and the results for
the different factor levels are shown in Table IV. Table V
lists the final setting for each factor in the SA procedure.
The factor response graph of these experimental results is
shown in Fig. 8.
The convergence diagram of the learning process of the rule
consequence parameters using the SA is shown in Fig. 9. The
MAPE of the forecasting model gradually decreases to 3.8%
after the temperature drops to a certain level. This indicates
that the proposed SA approach can find near-optimal solutions
for the set of parameters of the consequence rules.
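The role of SA in tuning a parameter vector can be illustrated with a generic annealing loop. The sketch below is not the authors' procedure: the initial temperature, cooling rate, step size, and number of moves per temperature are hypothetical placeholders (the tuned values of Table V are not reproduced here), and `cost` stands for any function that maps a parameter vector to an error such as the training MAPE.

```python
import numpy as np


def simulated_annealing(cost, x0, t0=1.0, cooling=0.95, steps_per_temp=20,
                        t_min=1e-3, step=0.1, seed=0):
    """Minimize `cost` over a real-valued parameter vector by simulated annealing.

    All schedule values are illustrative defaults, not the paper's settings.
    """
    rng = np.random.default_rng(seed)
    current = np.asarray(x0, dtype=float)
    current_cost = cost(current)
    best, best_cost = current.copy(), current_cost
    temp = t0
    while temp > t_min:
        for _ in range(steps_per_temp):
            candidate = current + rng.normal(scale=step, size=current.shape)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / temp) (Metropolis criterion).
            if delta <= 0 or rng.random() < np.exp(-delta / temp):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current.copy(), current_cost
        temp *= cooling                      # geometric cooling schedule
    return best, best_cost
```

At high temperature the search accepts many worsening moves and explores widely; as the temperature drops it behaves more and more like a local descent, which matches the convergence behavior described for Fig. 9.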
We compared the proposed hybrid method combining the wavelet
and TSK fuzzy rules with three existing methods. To justify
the use of SA and the KNN sliding window, a set of experimental
results is listed in Table VI. In this table, the rule number
is decided by (14) and (15); the first number denotes the
window size L, and the second number denotes the number of
nearest neighbors H. According to a series of experiments
where L is set to 2, 4, 6, and 8 and H is set to 2, 3, 4, and
5, the best forecasting result in terms of the minimum MAPE
value from the hybrid model is obtained at (2, 5), with an
average of 0.792.
The four different algorithms used for comparison are the
traditional back-propagation neural network (BPN), a standard
TSK, the MRM, and a forecasting method that integrates a GA
with Wang and Mendel’s algorithm for fuzzy rule generation
(GAWM) [13], [58]. Tables VII and VIII give the best parameter
settings for BPN and GAWM obtained using design of
experiments. Fig. 10 shows that GAWM converged after 100
generations. Table IX shows the MAPE values for all the
different methods.
As observed from Table IX, MRM has the largest MAPE value,
partly because MRM cannot fully explain the nonlinear
relationship between the stock price and the technical
indices. BPN also has a large error compared with the other
models, which is due to the tremendous noise and complex
dimensionality of stock price data. Besides, the quantity of
data itself and the input variables may also interfere with
each other. In addition, the BP learning algorithm is prone to
getting stuck in a local optimum, while the TSK is less likely
to do so. Therefore, the result may not be that convincing.
Moreover, BPN methods do not provide insight into the nature
of the interactions between the technical indicators and the
stock market fluctuations. As for GAWM, the number of fuzzy
rules generated from the training data is very large compared
with TSK, and these rules may interact with each other.
From the experimental tests, we can observe that the TSK fuzzy
system is more suitable for handling a large amount of data.
The set of data is clustered according to its mean and
standard deviation. After this clustering, the set of data is
decomposed into a number of subclusters. These subclusters
have more homogeneous characteristics within themselves, and
each subcluster is transformed into a TSK fuzzy rule.
Therefore, the number of fuzzy rules generated by TSK is quite
small since each cluster of data generates only one rule. For
our experimental tests, the number of clusters has been preset
to two to eight; therefore, there are two to eight fuzzy rules
within each system. The best forecasting results are obtained
with three clusters. In addition, KNN has been further applied
to reduce the forecasting errors. That is why the TSK model
has better forecasting accuracy.
TABLE IV
TAGUCHI EXPERIMENTAL RESULT OF DIFFERENT FACTOR LEVELS
TABLE V
BEST PARAMETER SET FOR SA
Fig. 8. Factor response graph.
Fig. 9. SA training convergence diagram.
TABLE VI
RESULTS OF DIFFERENT NUMBER OF SLIDING WINDOWS (L) AND H NEAREST
NEIGHBORS APPLIED IN TSE INDEX FORECASTING (IN PERCENT)
TABLE VII
BPN PARAMETER SET
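The idea that each cluster yields exactly one rule, with membership functions derived from the cluster mean and standard deviation, can be sketched with a generic first-order TSK system. The code below is an assumption-laden illustration and not the formulation of (14) and (15): it uses Gaussian memberships, a product for the firing strength, and a firing-strength-weighted average of linear consequents, and the names `build_rules` and `tsk_predict` are hypothetical.

```python
import numpy as np


def build_rules(X, labels, consequents):
    """Create one rule per cluster from the cluster mean and standard deviation.

    consequents maps a cluster label to a coefficient vector [a_1..a_d, b].
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    rules = []
    for c in np.unique(labels):
        members = X[labels == c]
        rules.append({
            "mean": members.mean(axis=0),
            "std": members.std(axis=0) + 1e-9,   # avoid a zero width
            "coef": np.asarray(consequents[c], dtype=float),
        })
    return rules


def tsk_predict(x, rules):
    """Weighted average of the rule outputs, weighted by firing strength."""
    x = np.asarray(x, dtype=float)
    strengths, outputs = [], []
    for rule in rules:
        membership = np.exp(-0.5 * ((x - rule["mean"]) / rule["std"]) ** 2)
        strengths.append(membership.prod())              # rule firing strength
        outputs.append(rule["coef"][:-1] @ x + rule["coef"][-1])
    strengths = np.array(strengths)
    outputs = np.array(outputs)
    total = strengths.sum()
    if total == 0.0:                                     # input far from all rules
        return float(outputs.mean())
    return float(strengths @ outputs / total)
```

With three clusters this yields three rules, so the rule base stays small regardless of how many training records are used.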
TABLE VIII
PARAMETER SETTING FOR GAWM
Fig. 10. Convergence diagram of the GAWM model.
TABLE IX
MAPE PERFORMANCE FROM DIFFERENT METHODS
V. CONCLUSION
This paper proposed a TSK fuzzy model for stock price pre-
diction. To facilitate the prediction, the data are preprocessed
using the Haar wavelet. Then, SRA technique is employed to
select the most relevant factors for prediction. To avoid rule ex-
plosion, the k-means clustering algorithm is employed to group
the data into a number of clusters and one fuzzy rule is generated
for each cluster. As an additional benefit, the fuzzy membership
function can be determined automatically using the mean and
variance of the data in each cluster. The parameters in the con-
sequences of the TSK rules are optimized using the SA. A KNN
sliding window is applied to retrieve the similar patterns in the
historical data and further adjust the forecasted value from the
TSK model.
The proposed model is compared with the BPN, TSK, MRM, and
GAWM for stock price prediction. Simulation results show that
the TSK model with wavelet-based preprocessing greatly
outperforms the other models. To the best of our knowledge,
the combination of the TSK fuzzy model with wavelet-based
preprocessing is new for stock price forecasting. Due to its
very promising performance, we are going to apply the system
to real-time daily trading.
In the future, a different TSK fuzzy model, such as a nonlinear
model using NNs as the consequence, can be applied to more
complex time series problems. In addition, a more advanced
pattern matching algorithm can be embedded in the system to
retrieve significant patterns from the historic stock data for
comparison with the current trend of the data. As a result,
intelligent trading signals instead of stock prices can be
identified.
REFERENCES
[1] A. Abraham, N. Baikunth, and P. K. Mahanti, “Hybrid intelligent systems
for stock market analysis,” in Lecture Notes Computer Science, London,
U.K.: Springer-Verlag, vol. 2074, pp. 337–345, 2001.
[2] A. Abraham, N. S. Philip, and P. Saratchandran, “Modeling chaotic be-
havior of stock indices using intelligent paradigms,” Neural, Parallel Sci.
Comput., vol. 11, pp. 143–160, 2003.
[3] F. Abramovich, P. Besbeas, and T. Sapatinas, “Empirical Bayes approach
to block wavelet function estimation,” Comput. Statist. Data Anal.,
vol. 39, no. 4, 28, pp. 435–451, 2002.
[4] Y. S. Abu-Mostafa and A. F. Atiya, “Introduction to financial forecasting,”
Appl. Intell., vol. 6, pp. 205–213, 1996.
[5] M. Aiken and M. Bsat, “Forecasting market trends with neural networks,”
Inf. Syst. Manag., vol. 16, no. 4, pp. 42–48, 1994.
[6] V. Alarcon-Aquino and J. A. Barria, “Multi resolution FIR neural-network-
based learning algorithm applied to network traffic prediction,” IEEE
Trans. Syst., Man Cybern., Part C: Appl. Rev., vol. 36, no. 2, pp. 208–209,
Mar. 2006.
[7] A. Aussem and F. Murtagh, “Combining neural network forecasts on
wavelet-transformed time series,” Connection Sci., vol. 9, pp. 113–122,
Mar. 1997.
[8] M. Austin and C. Looney, “Security market timing using neural network
models,” New Rev. Appl. Expert Syst., vol. 3, pp. 3–14, 1997.
[9] N. Baba, N. Inoue, and H. Asakawa, “Utilization of neural networks &
GAs for constructing reliable decision support systems to deal stocks,” in
Proc. IEEE-INNS-ENNS Int. Joint Conf. Neural Netw. (IJCNN’00), vol. 5,
Jul., pp. 5111–5116.
[10] D. Brownstone, “Using percentage accuracy to measure neural network
predictions in stock market movements,” Neurocomputing, vol. 10,
pp. 237–250, 1996.
[11] P. C. Chang and T. W. Liao, “Combing SOM and fuzzy rule base for
flow time prediction in semiconductor manufacturing factory,” Appl. Soft
Comput., vol. 6, no. 2, pp. 198–206, 2006a.
[12] P. C. Chang and Y. W. Wang, “Fuzzy delphi and back-propagation model
for sales forecasting in PCB industry,” Expert Syst. Appl., vol. 30, no. 4,
pp. 715–726, 2006b.
[13] P. C. Chang, C. H. Liu, and Y. W. Wang, “A hybrid model by clustering and
evolving fuzzy rules for sale forecasting in printed circuit board industry,”
Decision Support Syst., vol. 42, no. 3, pp. 1715–1729, 2006.
[14] P. C. Chang, Y. W. Wang, and W. N. Yang, “An investigation of the hybrid
forecasting models for stock price variation in Taiwan,” J. Chin. Inst. Ind.
Eng., vol. 21, no. 4, pp. 358–368, 2004.
[15] M. Y. Chen and D. A. Linkens, “Rule-base self-generation and simplifica-
tion for data-driven fuzzy models,” Fuzzy Sets Syst., vol. 142, pp. 243–265,
2004.
[16] A. S. Chen, M. T. Leung, and H. Daouk, “Application of neural
networks to an emerging financial market: Forecasting and trading the
Taiwan stock index,” Comput. Operations Res., vol. 30, pp. 901–923,
2003.
[17] S. C. Chi, H. P. Chen, and C. H. Cheng, “A forecasting approach for
stock index future using grey theory and neural networks,” in Proc.
IEEE Int. Joint Conf. Neural Netw., Washington, DC, 1999, pp. 3850–
3855.
[18] A. Cohen, I. Daubechies, and P. Vial, “Wavelets on the interval and fast
wavelet transform,” Appl. Comp. Harm. Anal., vol. 1, no. 1, pp. 54–81,
Dec. 1993.
[19] G. Corani and G. Guariso, “Coupling fuzzy modeling and neural networks
for river flood prediction,” IEEE Trans. Syst., Man Cybern., Part C: Appl.
Rev., vol. 35, no. 3, pp. 382–390, Aug. 2005.
[20] P. A. Devijver and J. Kittler, Pattern Recognition: Statistical Approach.
London, U.K.: Prentice-Hall, 1982.
[21] T. Dogaru and L. Carin, “Application of Haar-wavelet-based multiresolu-
tion time-domain schemes to electromagnetic scattering problems,” IEEE
Trans. Antennas Propag., vol. 50, no. 6, pp. 774–784, Jun. 2002.
[22] D. L. Donoho and I. M. Johnstone, “Adapting to unknown smoothness via
wavelet shrinkage,” J. Amer. Stat. Assoc., vol. 90, no. 432, pp. 1200–1224,
Dec. 1995.
[23] R. Gençay, F. Selcuk, and B. Whitcher, “Differentiating intraday
seasonalities through wavelet multi-scaling,” Phys. A, vol. 289, pp. 543–556,
2001.
[24] R. H. Golan and W. Ziarko, “A methodology for stock market analysis
utilizing rough set theory,” in Proc. IEEE/IAFE 1996 Conf. Comput. Intell.
Financial Eng., New York, 1995, pp. 32–40.
[25] Y. Hoekstra, “A stock market forecasting support system based on fuzzy
logic,” in Proc. 27th Annu. Hawaii Int. Conf. Syst. Sci., Wailea, HI, 1994,
pp. 281–287.
[26] A. Hobbs and N. G. Bourbakis, “A neurofuzzy arbitrage simulator for
stock investing,” in Proc. Int. Conf. Comput. Intell. Financial Eng.
(CIFER), New York, 1995, pp. 160–177.
[27] K. Izumi and K. Ueda, “Analysis of exchange rate scenarios using an
artificial market approach,” in Proc. Int. Conf. Artif. Intell., vol. 2, 1999,
pp. 360–366.
[28] L. C. Jain and N. M. Martin, Fusion of Neural Networks, Fuzzy Sets, and
Genetic Algorithms. New York: CRC Press LLC, 1999.
[29] M. Jaruszewicz and J. Mandziuk, “One day prediction of NIKKEI index
considering information from other stock markets,” in Proc. Int. Conf.
Artif. Intell. Soft Comput. ICAISC 2004, vol. 3070, pp. 1130–1135.
[30] K.-J. Kim and W. B. Lee, “Stock market prediction using artificial neural
networks with optimal feature transformation,” Neural Comput. Appl.,
vol. 13, no. 3, pp. 255–260, 2004.
[31] K. J. Kim and I. Han, “Genetic algorithms approach to feature discretiza-
tion in artificial neural networks for the prediction of stock price index,”
Expert Syst. Appl., vol. 19, pp. 125–132, 2000.
[32] T. Kimoto and K. Asakawa, “Stock market prediction system with modular
neural network,” in Proc. IEEE Int. Joint Conf. Neural Netw., San Diego,
CA, 1990, pp. 1–6.
[33] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, “Optimization by
simulated annealing,” Science, vol. 220, pp. 671–680, 1983.
[34] A. J. Koning, P. H. Franses, M. Hibon, and H. O. Stekler, “The M3
competition: Statistical analysis of the results,” Int. J. Forecast., vol. 21,
pp. 397–409, 2005.
[35] A. Kusiak, M. R. Smith, and Z. Song, “Planning product configurations
based on sales data,” IEEE Trans. Syst., Man Cybern., Part C: Appl. Rev.,
vol. 37, no. 4, pp. 602–609, Jul. 2007.
[36] H. L. Larsen and R. R. Yager, “A framework for fuzzy recognition technol-
ogy,” IEEE Trans. Syst., Man, Cybern., part C, vol. 30, no. 1, pp. 65–76,
Feb. 2000.
[37] J. W. Lee, “Stock price prediction using reinforcement learning,” in Proc.
IEEE Int. Joint Conf. Neural Netw., Pusan, Korea, 2001, pp. 690–695.
[38] L. F. M. López, M. A. Díaz, V. Palencia, E. Santos, and P. Jiménez,
“IBEX-35 stock market forecasting using time delay connections in enhanced
neural networks,” World Multiconf. Syst., Cybern. Inf., vol. 67, pp. 455–
460, 2002.
[39] S. Mitra and Y. Hayashi, “Bioinformatics with soft computing,” IEEE
Trans. Syst., Man Cybern., Part C: Appl. Rev., vol. 36, no. 5, pp. 616–635,
Sep. 2006.
[40] D. Montana and L. Davis, “Training feed forward neural networks using
genetic algorithms,” in Proc. 11th Int. Joint Conf. Artif. Intell., Morgan
Kaufmann, San Mateo, CA, 1989, pp. 762–767.
[41] J. Nenortaite and R. Simutis, “Stocks’ trading systems based on the particle
swarm optimization algorithm,” Comput. Sci.—ICCS, vol. 3039, no. 4,
pp. 843–850, 2004.
[42] K. Papagiannaki, N. Taft, Z.-L. Zhang, and C. Diot, “Long-term forecast-
ing of internet backbone traffic,” IEEE Trans. Neural Netw., vol. 16, no. 5,
pp. 1110–1124, Sep. 2005.
[43] K. Parasuraman and A. Elshorbagy, “Wavelet networks: An alternative to
classical neural networks,” in Proc. 2005 IEEE Int. Joint Conf. Neural
Netw., IJCNN ’05, vol. 5, 2005, pp. 2674–2679.
[44] A. Popoola and K. Ahmad, “Testing the suitability of wavelet pre-
processing for TSK fuzzy models,” in Proc. FUZZ-IEEE: Int. Conf. Fuzzy
Syst. Netw., Vancouver, BC, Canada, Jul. 16–22, 2006, pp. 1305–1309.
[45] A. Popoola, S. Ahmad, and K. Ahmad, “Multi-scale wavelet preprocessing
for fuzzy systems,” in Proc. 2005 ICSC Congr. Comput. Intell. Methods
Appl., Dec. 2005, pp. 15–17.
[46] J. B. Ramsey, “The contribution of wavelets to the analysis of economic
and financial data,” Phil. Trans. R. Soc. London, vol. 357, pp. 2593–2606,
Sep. 1999.
[47] J. B. Ramsey and Z. Zhang, “The analysis of foreign exchange data using
waveform dictionaries,” J. Empirical Finance, vol. 4, pp. 341–372, 1997.
[48] O. Renaud, J. L. Starck, and F. Murtagh, “Prediction based on a multiscale
decomposition,” Int. J. Wavelets, Multiresolution Inf. Process., vol. 1,
no. 2, pp. 217–232, 2003.
[49] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning
representations by back-propagating errors,” Nature, vol. 323, pp. 533–536,
1986.
[50] K. Schierholt and C. H. Dagli, “Stock market prediction using different
neural network classification architectures,” in Proc. IEEE/IAFE 1996
Conf. Comput. Intell. Financial Eng., New York, 1996, pp. 72–78.
[51] R. Sitte and J. Sitte, “Analysis of the predictive ability of time delay neural
networks applied to the S&P500 time series,” IEEE Trans. Syst., Man,
Cybern., part C, vol. 30, no. 4, pp. 568–572, Nov. 2000.
[52] C. Slim, “Forecasting the volatility of stock index returns: A stochastic
neural network approach,” Comput. Sci. Its Appl., vol. 3, pp. 935–944,
2004.
[53] M.-C. Su, C.-W. Liu, and S.-S. Tsay, “Neural-network-based fuzzy model
and its application to transient stability prediction in power systems,”
IEEE Trans. Syst., Man, Cybern., Part C: Appl. Rev., vol. 29, no. 1,
pp. 149–157, Feb. 1999.
[54] T. Takagi and M. Sugeno, “Fuzzy identification of systems and its appli-
cation to modeling and control,” IEEE Trans. Syst., Man Cybern., vol. 15,
no. 1, pp. 116–132, Jan. 1985.
[55] I. N. Tansel, S. Y. Yang, G. Venkataraman, A. Sasirathsiri, W. Y. Bao,
and N. Mahendrakar, “Modeling time series data by using neural net-
works and genetic algorithms,” Smart Engineering System Design: Neu-
ral Networks, Fuzzy Logic, Evolutionary Programming, Data Mining,
and Complex Systems (Proc. Artif. Neural Netw. Eng. Conf., ANNIE’99),
C. H. Dagli, A. L. Buczak, J. Ghosh, M. J. Embrechts, and O. Ersoy, Eds.
New York: ASME Press, 1999, pp. 1055–1060.
[56] A. Thammano, “Neuro-fuzzy model for stock market prediction,” in
Smart Engineering System Design: Neural Networks, Fuzzy Logic, Evo-
lutionary Programming, Data Mining, and Complex Systems (Proc. Artif.
Neural Netw. Eng. Conf., ANNIE’99), C. H. Dagli, A. L. Buczak, J. Ghosh,
M. J. Embrechts, and O. Ersoy, Eds. New York: ASME Press, 1999,
pp. 587–591.
[57] S. Wang and N. P. Archer, “A neural network based fuzzy set model for
organizational decision making,” IEEE Trans. Syst., Man, Cybern., Part
C: Appl. Rev., vol. 28, no. 2, pp. 194–203, May 1998.
[58] L. X. Wang and J. M. Mendel, “Generating fuzzy rules by learning from
examples,” IEEE Trans. Syst., Man, Cybern., vol. 22, no. 6, pp. 1414–
1427, Nov./Dec. 1992.
[59] H. White, “Economic prediction using neural networks: The case of IBM
daily stock returns,” in Proc. 2nd Annu. IEEE Conf. Neural Netw., II,
1988, pp. 451–458.
[60] W. L. Woo and S. S. Dlay, “Neural network approach to blind signal
separation of mono-nonlinearity mixed sources,” IEEE Trans. Circuits
Syst., vol. 52, no. 6, pp. 1236–1247, Jun. 2005.
[61] X. Yao, “Evolving artificial neural networks,” Proc. IEEE, vol. 87, no. 9,
pp. 1423–1447, Sep. 1999.
[62] L. Yu and Y.-Q. Zhang, “Evolutionary fuzzy neural networks for hybrid
financial prediction,” IEEE Trans. Syst., Man, Cybern., Part C: Appl.
Rev., vol. 35, no. 2, pp. 244–249, May 2005.
[63] J. Yao and H. L. Poh, “Forecasting the KLSE index using neural networks,”
in Proc. IEEE Int. Conf. Neural Netw., vol. 2, Nov./Dec. 1995, pp. 1012–
1017.
[64] Y. Yoon and J. Swales, “Prediction stock price performance: A neural
network approach,” in Proc. 24th Annu. Hawaii Int. Conf. Syst. Sci.,
1991, pp. 156–162.
[65] L. A. Zadeh, “The role of fuzzy logic in modeling, identification and
control,” Model. Identification Control, vol. 15, no. 3, pp. 191–203, 1994.
[66] C. Zanchettin and T. B. Ludermir, “Hybrid technique for artificial neural
network architecture and weight optimization,” in A. Jorge et al., Eds.,
Proc. 9th Eur. Conf. Principles Practice Knowledge Discovery Databases
(PKDD 2005), Lecture Notes in Artificial Intelligence, vol. 3721, 2005,
pp. 709–716.
[67] G. P. Zhang, “Avoiding pitfalls in neural network research,” IEEE Trans.
Syst., Man, Cybern., Part C: Appl. Rev., vol. 37, no. 1, pp. 3–16, Jan. 2007.
[68] Y.-Q. Zhang, S. Akkaladevi, G. Vachtsevanos, and T. Y. Lin, “Granular
neural Web agents for stock prediction,” Soft Comput., vol. 6, pp. 406–
413, 2002.
Pei-Chann Chang received the M.S. and Ph.D. degrees
from Lehigh University, Bethlehem, PA, in
1985 and 1989, respectively.
He is currently a Professor at the Department
of Information Management, Yuan Ze University,
Taoyuan, Taiwan, R.O.C. His current research inter-
ests include financial time series forecasting, evolu-
tionary computation, fuzzy neural applications, pro-
duction scheduling, forecasting, case-based reason-
ing, and applications of soft computing. He is the
author or coauthor of more than 60 papers published
in international journals. He is the Senior Editor of the Journal of Chinese In-
stitute of Industrial Engineering (JCIIE).
Chin-Yuan Fan received the B.S. degree from the
Chinese-Culture University, Taipei, Taiwan, R.O.C., in
2001, and the M.S. degree from Da-Yeh University,
Changhua, Taiwan, in 2003, both in management. He
is currently working toward the Ph.D. degree at the
Industrial Engineering and Management Department,
Yuan Ze University, Taoyuan, Taiwan.
His current research interests include applications
of soft computing, financial time series forecasting,
multiobjective optimization problems, and multicri-
teria decision making.
Authorized licensed use limited to: Hong. Downloaded on February 8, 2009 at 00:06 from IEEE Xplore. Restrictions apply.
... With the introduction of intelligent systems in fuzzy theory (Zadeh, 1965) that deal with uncertainty in data, a considerable amount of literature has been published discussing stock market forecasting models based on fuzzy theory. These models include fuzzy time-series (Cagcag Yolcu & Alpaslan, 2018;Chu et al., 2009), adaptive network-based fuzzy inference systems (Wei et al., 2011), Takagi-Sugeno-Kang (TSK) type fuzzy systems (Chang & Fan, 2008;Chang & Liu, 2008) and other variants Pal & Kar, 2019). ...
... The framework of this TSK fuzzy system was formed using step-wise regression (to select the essential factors), k-means clustering (to partition data into clusters), and a fuzzy inference system (to generate rules and set the parameters). A comprehensive study was conducted by Chang and Fan (2008) in which a novel prediction method that integrates a Haar wavelet transform and a TSK fuzzy rule-based system for forecasting the stock market was proposed. In this study, they also employed k-means clustering to avoid rule explosion and then generated fuzzy rules for each cluster. ...
... which is often used in time-series analyses, was ranked third among the preferred validation methods in 10% (5) studies. Notice that some studies (see, e.g., Chang & Fan, 2008;Mo & Wang, 2018) mentioned that they applied ''Crossvalidation'', but did not name the cross-validation technique they used explicitly. ...
Article
Full-text available
In this literature review, we investigate machine learning techniques that are applied for stock market prediction. A focus area in this literature review is the stock markets investigated in the literature as well as the types of variables used as input in the machine learning techniques used for predicting these markets. We examined 138 journal articles published between 2000 and 2019. The main contributions of this review are: 1) an extensive examination of the data, in particular, the markets and stock indices covered in the predictions, as well as the 2173 unique variables used for stock market predictions, including technical indicators, macro-economic variables, and fundamental indicators, and 2) an in-depth review of the machine learning techniques and their variants deployed for the predictions. In addition, we provide a bibliometric analysis of these journal articles, highlighting the most influential works and articles.
... In a TSK FLS as a data-driven approach, utilizing a training dataset, the parameters of the rules and membership functions are optimized for training error reduction [30]. Since different types of fine-tuned membership functions do not make considerable differences in inference results [8], in this study, we utilize Gaussian membership functions. In fuzzy system modeling, Gaussian membership functions are widely used. ...
Article
Fuzzy logic systems (FLSs) are suitable tools for learning and predicting real-world problems. Type-2 fuzzy sets extend the conventional type-1 fuzzy sets and are applied to prediction problems with uncertainty. The interval type-2 fuzzy logic system (IT2 FLS) is the most widely used type-2 FLS due to its efficiency and simplicity. Passenger demand prediction has a crucial role in the public transportation sector. Because of the nonlinearity and instability of passenger arrivals, an IT2 FLS can be an appropriate method for solving this problem. In this paper, we develop a fuzzy logic system named KIT2 TSK for passenger arrivals prediction in subway stations. In our proposed model, we utilize the Kumaraswamy distribution in the construction of an IT2 TSK FLS. Furthermore, we develop a new input selection measure that applies the Schweizer-Sklar t-conorm operator in the variable selection process. The flexibility of the Kumaraswamy distribution makes it possible to approximate several distributions with the same equation by using different values for its shape parameters. Utilizing this property, we apply our proposed model to passenger arrivals prediction for one line of the Tehran subway system as a case study. Moreover, to see the results on unusual days, passenger demand on public holidays, weekends, and special events is also taken into account. The results demonstrate that our proposed methodology has better performance in the hourly prediction of passenger arrivals compared to the benchmarks. The results for the chaotic Mackey-Glass problem also show the superiority of our proposed model.
... Over the years, Fama's conclusions have been repeatedly questioned. There are now numerous papers that try to forecast stock market prices (to name a few of them: Pai and Lin, 2005; Chang and Fan, 2008; Tsai and Wang, 2009; Sen, 2017; Khashei and Hajirahimi, 2018), using various techniques. ...
Article
Stock market prices are known to be very volatile and noisy, and their accurate forecasting is a challenging problem. Traditionally, both linear and non-linear methods (such as ARIMA and LSTM) have been proposed and successfully applied to stock market prediction, but there is room to develop models that further reduce the forecast error. In this paper, we introduce a Deep Convolutional Generative Adversarial Network (DCGAN) architecture to deal with the problem of forecasting the closing price of stocks. To test the empirical performance of our proposed model we use the FTSE MIB (Financial Times Stock Exchange Milano Indice di Borsa), the benchmark stock market index for the Italian national stock exchange. By conducting both single-step and multi-step forecasting, we observe that our proposed model performs better than standard widely used tools, suggesting that Deep Learning (and in particular GANs) is a promising field for financial time series forecasting.
... There are many articles that employed ANN for the prediction of the stock market. For example, the authors [13] also predicted price fluctuations using the Haar wavelet and Takagi-Sugeno-Kang (TSK) fuzzy rule-based system. The TSK fuzzy rule-based method is used to forecast stock prices using a number of technical indices. ...
Article
This study aims to model and enhance the forecasting accuracy of Saudi Arabia stock exchange (Tadawul) data patterns using the daily stock price indices data with 2026 observations from October 2011 to December 2019. This study employs a nonlinear spectral model of maximum overlapping discrete wavelet transform (MODWT) with five mathematical functions, namely, Haar, Daubechies (Db), Least Square (LA-8), Best localization (BL14), and Coiflet (C6) in conjunction with adaptive network-based fuzzy inference system (ANFIS). We have selected oil price (Loil) and repo rate (Repo) as input values according to correlation, the Engle and Granger Causality test, and multiple regressions. The input variables in this study have been collected from Saudi Authority for Statistics and Saudi Central Bank. The output variable is obtained from Tadawul. The performance of the proposed model (MODWT-LA8-ANFIS) is evaluated in terms of mean error (ME), root mean square error (RMSE), and mean absolute percentage error (MAPE). Also, we have compared the MODWT-LA8-ANFIS model with traditional models, which are autoregressive integrated moving average (ARIMA) model and ANFIS model. The obtained results show that the performance of MODWT-LA8-ANFIS is better than that of the traditional models. Therefore, the proposed forecasting model is capable of decomposing in the stock markets.
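The three evaluation measures named in this abstract (ME, RMSE, and MAPE) can be computed as in the following sketch. The sign convention for the error (actual minus forecast), the percentage scaling of MAPE, and the sample numbers are assumptions; the abstract does not state which conventions the authors used.

```python
import math

def forecast_errors(actual, forecast):
    """Return mean error, root mean square error, and mean absolute percentage error.

    Assumes error = actual - forecast and that no actual value is zero
    (MAPE is undefined otherwise).
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    me = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"ME": me, "RMSE": rmse, "MAPE": mape}

# Hypothetical index values and forecasts.
metrics = forecast_errors([100.0, 200.0], [110.0, 190.0])
```

ME can be close to zero even for a poor model because positive and negative errors cancel, which is why it is usually reported together with RMSE and MAPE.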
Article
With the availability of high-frequency data and new techniques for the management of noise in signals, we revisit the question: can we predict financial asset prices? The present work proposes an algorithm for next-step log-return prediction. Data at frequencies from 1 to 15 minutes for 25 high-capitalization assets in the Mexican market were used. The model applied consists of a wavelet followed by a Long Short-Term Memory neural network (LSTM). Applications of either wavelets or neural networks in finance are common; the novelty comes from the application of the particular architecture proposed. The results show that, on average, the proposed LSTM neuro-wavelet model outperforms both an ARIMA model and a benchmark dense neural network model. We conclude that, although further research (in other stock markets, at higher frequencies, etc.) is in order, given the ever-increasing technical capacity of market participants, the inclusion of the LSTM neuro-wavelet model is a valuable addition to the market participant toolkit, and might offer an advantage over traditional predictive tools.
Article
Global climatic changes and increased carbon footprints provided the main impetus for the decrease in the use of fossil fuels for electricity generation and transportation. Matured manufacturing technologies of solar PV panels and on-shore and off-shore windmills have brought down the cost of generation of electricity using solar energy on par with conventional fossil fuel. Initially, solar and wind power generation was envisioned for microgrids, serving small local communities. However, advancements in power electronics have now facilitated large solar and wind farms to be integrated with main power grids. In this context, hosting capacity, which is the amount of distributed energy resources a grid can accommodate, without significant infrastructure up-gradation, has gained importance. In determining the hosting capacity at a particular location, the uncertainties of wind and solar power generation play a role. Effective forecasting models using time-series weather data can be built to predict wind and solar power generation. This forecast is essential to ensure proper grid operation and control when renewable energy sources are already installed. The forecast is also useful in the planning stages for investment decisions and distribution system planning. While long-term forecasts are rarely needed for the operation of integrated grids, accurate short-term predictive models are necessary for scheduling. This paper presents an extensive review of various forecast models available in the literature. The study mainly focuses on the short-term forecast, providing a critical review of the duration of data used in each model and a synoptic comparison of their performance ind
Article
This paper analyzes the factor zoo, which has theoretical and empirical implications for finance, from a machine learning perspective. More specifically, we discuss feature selection in the context of deep neural network models to predict the stock price direction. We investigated a set of 124 technical analysis indicators used as explanatory variables in the recent literature and specialized trading websites. We applied three feature selection methods to shrink the feature set aiming to eliminate redundant information from similar indicators. Using daily data from stocks of seven global market indexes between 2008 and 2019, we tested neural networks with different settings of hidden layers and dropout rates. We compared various classification metrics, taking into account profitability and transaction costs levels to analyze economic gains. The results show that the variables were not uniformly chosen by the feature selection algorithms and that the out-of-sample accuracy rate of the prediction converged to two values – besides the 50% accuracy value that would suggest market efficiency, a “strange attractor” of 65% accuracy also was achieved consistently. We also found that the profitability of the strategies did not manage to significantly outperform the Buy-and-Hold strategy, even showing fairly large negative values for some hyperparameter combinations.
Article
Traditional hierarchical fuzzy classifiers have several main problems: the output of the previous layer influences the input of the next layers, the intra-layer and inter-layer behavior is inconsistent, and global optimization needs improvement. To address these problems, a novel quantitative-integration-based hierarchical Takagi-Sugeno-Kang (TSK) fuzzy classifier (QI-TSK-FC) is proposed. As a novel hierarchical structure, the proposed classifier is built in a stacked manner. Each base building unit consists of an optimized zero-order TSK fuzzy classifier. For good interpretability of each base building unit, the antecedent parameters are solved by random selections of input features, random combinations of fuzzy rules, divisions of fuzzy partitions and generations of cluster centers. In addition, an improved method based on classical ridge regression is proposed to solve the problem of the consistency between in-layer and inter-layer. In order to enhance the classification performance of the fuzzy classifier, the input of each layer is optimized to effectively resolve the problem that the output of the previous layer affects the input of the next base building unit. Moreover, a method that randomly selects a part of the sample from the original training set is adopted to open the manifold structure of the input set. Furthermore, the experimental results indicate that this method can indeed make the QI-TSK-FC fuzzy classifier obtain good classification performance. To improve the generalization ability of the fuzzy classifier, a method that quantitatively solves the integrated output is considered so that the QI-TSK-FC model can quickly obtain a satisfactory solution. Finally, the classification performance and interpretability of the proposed fuzzy classifier are verified on the MIT-BIH sleep datasets.
Article
This paper develops a neural network model that predicts the proper time to move investment funds in and out of the stock market to maximize investment return. The goal is to use data that are easy to obtain, inexpensive and timely, enabling any investor to exploit neural network technology. The model is estimated using two valuation indicators, two monetary policy indicators, and four technical market indicators to explain the four-week forward excess return on the Standard and Poor's 500 Stock Index over the U.S. Treasury Bill yield. The model is simulated out of sample and the results are compared to buy-and-hold strategies of investing in stocks and Treasury Bills alone. The model is shown to significantly outperform buy-and-hold strategies on both an absolute and a risk-adjusted basis.
Article
Learning and evolution are two fundamental forms of adaptation. There has been a great interest in combining learning and evolution with artificial neural networks (ANN's) in recent years. This paper: 1) reviews different combinations between ANN's and evolutionary algorithms (EA's), including using EA's to evolve ANN connection weights, architectures, learning rules, and input features; 2) discusses different search operators which have been used in various EA's; and 3) points out possible future research directions. It is shown, through a considerably large literature review, that combinations between ANN's and EA's can lead to significantly better intelligent systems than relying on ANN's or EA's alone.
Article
After summarizing the properties of wavelets that are most likely to be useful in economic and financial analysis, the literature on the application of wavelet techniques in these fields is reviewed. Special attention is given to the potential for insight into the development of economic theory or the enhancement of our understanding of economic phenomena. The paper is concluded with a section containing speculations about the relevance of wavelet analysis to economic and financial time series given the experience to date. This discussion includes some suggestions about improving our understanding and evaluation of forecasts using a wavelet approach.
Article
We attempt to recover a function of unknown smoothness from noisy sampled data. We introduce a procedure, SureShrink, that suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: A threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein unbiased estimate of risk (Sure) for threshold estimates. The computational effort of the overall procedure is order N · log(N) as a function of the sample size N. SureShrink is smoothness adaptive: If the unknown function contains jumps, then the reconstruction (essentially) does also; if the unknown function has a smooth piece, then the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothness adaptive: It is near minimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods (kernels, splines, and orthogonal series estimates), even with optimal choices of the smoothing parameter, would be unable to perform in a near-minimax way over many spaces in the Besov scale. Examples of SureShrink are given. The advantages of the method are particularly evident when the underlying function has jump discontinuities on a smooth background.
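The core idea of choosing a threshold by minimizing Stein's unbiased risk estimate can be sketched for the coefficients of a single resolution level. This is a simplified illustration, not the full SureShrink procedure: it assumes the coefficients have already been rescaled to unit noise variance, searches only over the observed magnitudes, and omits the refinements of the published method. The function names and sample coefficients are hypothetical.

```python
import math

def sure_soft(coeffs, t):
    """Stein's unbiased risk estimate for soft thresholding at t (unit noise variance)."""
    n = len(coeffs)
    n_small = sum(1 for c in coeffs if abs(c) <= t)
    return n - 2 * n_small + sum(min(abs(c), t) ** 2 for c in coeffs)

def sure_threshold(coeffs):
    """Pick the candidate threshold (0 or an observed magnitude) with the lowest SURE."""
    candidates = [0.0] + sorted(abs(c) for c in coeffs)
    return min(candidates, key=lambda t: sure_soft(coeffs, t))

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t, zeroing those with magnitude <= t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

# Hypothetical detail coefficients: three small (noise-like) values and one large one.
level_coeffs = [0.1, -0.2, 5.0, 0.05]
t_star = sure_threshold(level_coeffs)
denoised = soft_threshold(level_coeffs, t_star)
```

For these sample values the risk estimate is lowest at the largest of the small magnitudes, so the three small coefficients are set to zero and the large one is kept with a slight shrinkage. Applying the same selection separately to each resolution level gives the level-dependent thresholds the abstract describes.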