Uluslararası Mühendislik Araştırma ve Geliştirme Dergisi
International Journal of Engineering Research and Development
UMAGD, (2019) 11(1), 253-265.
DOI: 10.29137/umagd.425560
Volume 11, Issue 1, January 2019
Research Article
*Corresponding author: mdemirci@gazi.edu.tr
Predicting the Turkish Stock Market BIST 30 Index using Deep Learning
Halil Raşo 1, Mehmet Demirci *1
1Gazi University, Faculty of Engineering, Department of Computer Engineering, Ankara, Turkey
Received: 21/05/2018 Accepted: 12/12/2018 Final Version: 31/01/2019
Abstract
The non-linearity and high change rates of stock market index prices make prediction a challenging problem for traders and data
scientists. Data modeling and machine learning have been extensively utilized for proposing solutions to this difficult problem.
In recent years, deep learning has proved itself in solving such complex problems. In this paper, we tackle the problem of
forecasting the Turkish Stock Market BIST 30 index movements and prices. We propose a deep learning model fed with
technical indicators and oscillators calculated from historical index price data. Experiments conducted by applying our model
on a dataset gathered for a period of 27 months on www.investing.com demonstrate that our solution outperforms other similar
proposals and attains good accuracy, achieving 0.0332, 0.109, 0.09, 0.1069 and 0.2581 as mean squared error in predicting BIST
30 index prices for the next five trading days. Based on these results, we argue that using deep neural networks is advisable for
stock market index prediction.
Key Words
Stock market index prediction; Deep learning; Deep Neural Network; Stock index; BIST
1. INTRODUCTION
Stock market prediction is a difficult task due to the huge amount of data to be processed, frequent and nonlinear stock price
changes, and the diversity of influencing factors, such as national and global economic conditions, news, and investors' moods. In
addition, the efficient market hypothesis states that stock movements follow a random walk, which makes it highly improbable
to predict their movement directions and prices.
Some investors use technical indicators and oscillators to build charts and patterns to help them discover price trends, and devise
strategies for high-return investments. This method is called technical analysis and its proponents believe that the market discounts
everything, prices move in trends and historical trends repeat themselves. Technical analysis or charting indicators focus on
historical values and movements of index prices as the entry point of the financial market analysis, and build charts illustrating
hidden points that investors can use in their investment strategies.
Another camp of investors, called fundamental analysts, relies on fundamental indicators rather than technical ones.
They analyze the volume of shares, financial and political news, investors' moods and other factors that influence the market. The
main difference between these two schools of financial market analysis is the period of time that investment strategies consider.
Technical analysis focuses on the near term, whereas fundamental analysis considers longer time periods, i.e. at least
one quarter.
Many traditional machine learning approaches such as support vector machines, decision tree variants, k-nearest neighbors and
artificial neural networks have been suggested for stock market prediction. These algorithms are powerful in many problems, but
in highly volatile and non-linear problems such as this one they suffer from stability issues. Deep learning has proved itself a
promising solution for such environments and has shown good performance (Akita et al., 2016).
In this work, we propose a deep learning model trained on the most important technical oscillators of the BIST 30 Index of the
Turkish stock market. These indicators are noted by famous technical analysts and www.investing.com, a popular website among
traders and investors. To the best of our knowledge, this paper is the first in the literature using a deep learning model to predict
Turkish stock market index prices. We have conducted our experiments on historical index data obtained from www.investing.com
for a 27-month period from 01.01.2016 to 11.04.2018. Our model was able to produce good predictions with low error rates, as
discussed in the evaluation section.
The rest of the paper is organized as follows. A brief review of related work is given in Section 2. Section 3 describes the
technical indicators and oscillators used in our proposal and briefly mentions fundamental indicators. Section 4 explains our
proposal and walks through our method in detail. Section 5 describes our experiments and presents the evaluation results.
Finally, conclusions and future work areas are highlighted in Section 6.
2. RELATED WORK
Technical analysis and fundamental analysis are the two main schools of thought when it comes to analyzing the financial markets.
While technical analysis focuses on historical data of stock prices and volumes, fundamental analysis gives significant weight to
investors’ sentiments, economic and political conditions and news. Nassirtoussi et al. (2015) proposed an approach to predict
intraday forex currency-pair directions by analyzing breaking financial news headlines. Another work done by Shynkevich et al.
(2015) analyzed the effects of market-related articles on stock trends and prices. The authors have developed a model to predict
these values from industry-specific articles. Oliveir et al. (2017) analyzed the impact of microblogging data related to stock market
news on the investors’ sentiments. The authors forecast the returns, volatility and trading volume of diverse indices and portfolios
from tweet messages. Ni et al. (2015) investigated the effects of investors’ sentiments on stock market index prices and directions.
According to their findings, the impact of investor sentiment is considerable for up to 2 years. Its effect is asymmetric, that is, it is
positive and large for stocks with high returns in the short term, while negative and small in the long term. Leigh et al. (2002)
studied the effectiveness of technical analysis approaches using multiple technical indicators and how they are used to achieve
high return rates using decision making systems. They emphasized the importance of the “bull flag” price and volume pattern
heuristic in getting abnormal results. Later, the indicators noted by these schools are used as inputs or features to prediction systems.
Machine learning algorithms are the primary techniques used for predicting stock prices and directions. Gui et al. (2014) proposed
an interesting approach through which the prediction is not a specific number but a limit instead. The authors transformed financial
time series into fuzzy particle sequences and then used support vector machine to build a regression model on the lower and upper
bounds to decrease the estimation error. Dechow et al. (2001) showed how short-sellers benefit from those factors in refining their
investment strategies and maximizing their returns. Another work (Lewellen et al., 2010) emphasized the importance of key factors
of fundamental analysis and suggested some improvements in empirical tests. Dechow et al. (2010) reviewed the various measures
of “earning quality” and how it is related to the company's fundamental performance. Qian et al. (2007) used the Hurst exponent to
select a highly predictable period, and then generated training patterns or indicators by auto-mutual information and false nearest
neighbor methods. Trained by an ensemble of inductive machine learning approaches such as artificial neural networks, decision
trees and k-nearest neighbors, the model achieved 60-65% accuracy.
Sands et al. (2015) compared different classification proposals: Support vector machine using least squares implementation,
artificial neural networks, naïve Bayes classifier and SVM optimized by particle swarm optimization in building an investment
portfolio with maximum gain and minimum risk. According to their experiments, SVM optimized by particle swarm optimization
is capable of predicting the stock values with high accuracy. Another work (Ince et al., 2017) proposed a hybrid model for
forecasting stock market movements. This model is composed of ICA for selecting important features between some technical
indicators and then using kernel methods such as SVM, TWSVM, MPM, KFDA and random walk to build a model for predicting
stock movements. According to their experiments on Dow-Jones, Nasdaq and S&P500 indices, the models like ICA-SVM, ICA-
TWSVM, ICA-MPM and ICA-KFDA have achieved high accuracy.
Bastı et al. (2015) addressed the underpricing of Turkish companies in initial public offerings (IPOs) traded on the Istanbul Stock
Exchange. They employed decision trees and support vector machines to investigate the key factors affecting the short-term
performance of IPOs. Another approach (Chen et al., 2017) proposed a hybrid model that combines a feature-weighted support
vector machine with feature-weighted k-nearest neighbors. The authors applied the model to two well-known Chinese stock market
indices, the Shanghai and Shenzhen stock exchange indices. Teixeira et al. (2010) combined technical analysis and k-nearest
neighbors. Qian et al. (2007), Zhang et al.
(2009), Moghaddam et al. (2016), and Boyacioglu et al. (2010) have investigated the use of artificial neural networks in stock
market prediction. A recent work (Akita et al., 2016) using deep learning for stock market prediction was applied on Tokyo stock
exchange market. The authors used paragraph vector to convert newspaper articles into distributed representations and used them
with historical prices to predict values close to the actual stock prices.
We have noticed that most of the proposals based on technical analysis choose their indicators heuristically, without any feature
engineering and without regard to which indicators technical analysts currently prefer. For this reason, we performed feature
engineering and chose the most important indicators noted by a highly reputed global investment website. The closest study to our
proposal is the one by Akita et al. (2016) due to its use of deep learning techniques, but our work differs in applying deep learning
to the Turkish stock exchange and in using technical analysis for feature selection.
3. BACKGROUND: TECHNICAL ANALYSIS AND INDICATORS
In this section, we formalize the problem to be addressed. Prediction of stock index prices is a time series problem where each
sample or observation contains the values that an index takes during a trading day, such as the open, low, high, volume, trading
date and closing price. The goal is to predict the price value for the following trading days with low error. In this paper, we use
samples and observations on a daily basis. This could be adapted to other trading periods like minutes, weeks, months or
even years. We express this in mathematical terms as follows:
$$\hat{C}_{t+k} = f(O_t, H_t, L_t, C_t, V_t) \tag{1}$$
where $O_t$, $H_t$, $L_t$, $C_t$ and $V_t$ are the open, high, low, closing price and volume of trading day $t$, and
$\hat{C}_{t+k}$ is the predicted closing price $k$ trading days ahead.
We build a deep learning model using certain technical analysis indicators as features. The following subsection describes technical
analysis and the indicators we have used.
3.1. Technical Analysis
Technical analysts believe that historical prices of stock indices contain very important hidden information and that they are highly
related to the current prices. According to them, this information can be explored through what they call indicators, oscillators and
charts, so the prices and directions of stock indices can be predicted by using such indicators. In that case we can rewrite the
formula as given in Eq. 2.
$$\hat{C}_{t+k} = f\left(I^{(1)}_t, I^{(2)}_t, \dots, I^{(m)}_t\right) \tag{2}$$
where $I^{(j)}_t$ is the value of the $j$-th technical indicator on trading day $t$ and $\hat{C}_{t+k}$ is the predicted closing
price $k$ trading days ahead.
Technical analysis relies on analyzing certain indicators to extract information such as buy/sell signals from historical data and
construct high-return investment strategies. There are approximately 150 technical indicators, but we only provide a brief
description of the most important ones which are accepted by the popular investment portal www.investing.com.
3.1.1. Relative Strength Index RSI
RSI is the most important momentum indicator. It was developed by noted analyst J. Welles Wilder Jr. and is explained in
(Welles, 1978). It is used to identify overbought and oversold regions of the analyzed index. These regions are highly significant
to the technical analyst or the trader when giving buy or sell orders. RSI observes the magnitude of recent gains and losses over a
specified time period (14 trading days by default) to measure the speed and change of price movements of an index. RSI is
calculated by the following formula:
$$RSI = 100 - \frac{100}{1 + RS}, \qquad RS = \frac{\text{average gain over the period}}{\text{average loss over the period}} \tag{3}$$
There are two important RSI levels: (70, 30). When the value of RSI exceeds 70, this is interpreted as a sell signal as the price
becomes overvalued. On the other hand, when the RSI value falls under 30, a buy signal is generated. Some investors use an extreme
version of the RSI indicator where these two levels are (80, 20). It is important to mention that the time unit considered by the
technical indicators in our calculations is one day but could be other trading units like minutes, hours, months or years.
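As an illustration of the RSI definition, the following minimal Python sketch computes RSI = 100 - 100 / (1 + RS) for the most
recent window. It assumes that RS is the ratio of the plain average gain to the plain average loss over the window (Wilder's
original formulation smooths these averages). The function name `rsi` and the handling of windows without losses are choices
made here for illustration only, not details of the model described in this paper.

```python
def rsi(closes, period=14):
    """Return the RSI (0 to 100) of the most recent `period` price changes."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    window = closes[-(period + 1):]
    changes = [later - earlier for earlier, later in zip(window, window[1:])]
    # Assumption: plain averages of gains and losses, not Wilder's smoothing.
    average_gain = sum(c for c in changes if c > 0) / period
    average_loss = sum(-c for c in changes if c < 0) / period
    if average_loss == 0:
        # RS is undefined without losses: saturate at 100, or return 50 if flat.
        return 100.0 if average_gain > 0 else 50.0
    rs = average_gain / average_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

With the (70, 30) levels, a returned value above 70 would be read as a sell signal and a value below 30 as a buy signal.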
3.1.2. Bollinger Bands BB:
BB is a momentum indicator or chart developed in the 1980s by noted trader John Bollinger (2001), in which the price of the
index is bracketed by an upper and a lower band around a 21-day simple moving average (the default time period is 21 trading
units). The upper and lower bands lie two standard deviations above and below this middle band. According to Bollinger, when the
price exceeds the upper band, it becomes overvalued and there will be a correction, so a sell opportunity is generated. Conversely,
when it goes below the lower band, the price is undervalued and should be corrected, so a buy signal is generated.
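The band construction can be sketched in a few lines of Python. This is a minimal example under stated assumptions: it uses
the 21-period window mentioned above, the population standard deviation (`statistics.pstdev`), and a hypothetical function
name `bollinger_bands`; none of these implementation details are specified in the text.

```python
from statistics import mean, pstdev


def bollinger_bands(closes, period=21, width=2.0):
    """Return (lower, middle, upper) bands for the most recent window."""
    if len(closes) < period:
        raise ValueError("need at least `period` closing prices")
    window = closes[-period:]
    middle = mean(window)              # simple moving average of the window
    spread = width * pstdev(window)    # assumption: population standard deviation
    return middle - spread, middle, middle + spread
```

A closing price above the returned upper band would correspond to the sell opportunity described above, and a price below
the lower band to the buy signal.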
3.1.3. Stochastic Oscillator STOCH:
STOCH is a momentum indicator or oscillator frequently used by market traders. It compares the price of an index to the range
of its prices over a certain period of time (the default time period is 14 trading units). The stochastic oscillator is calculated using
the following formula:
$$\%K = 100 \times \frac{C - L_{14}}{H_{14} - L_{14}} \tag{4}$$
Where:
$C$ is the most recent closing price,
$L_{14}$ is the low of the 14 previous trading sessions,
$H_{14}$ is the highest price traded during the same 14-day period,
$\%K$ is the current value of the oscillator for the index,
$\%D$ is the 3-period moving average of $\%K$.
3.1.4. Williams %R:
Williams %R is a momentum indicator developed by famous technical analyst Larry Williams (1973), and it is the inverse of the
Fast Stochastic Oscillator. Williams %R reflects the level of the closing price relative to the highest high for the look-back period.
In contrast, the Stochastic Oscillator reflects the level of the closing price relative to the lowest low. Williams %R is calculated
by the following formula:
$$\%R = \frac{HH - C}{HH - LL} \times (-100) \tag{5}$$
Where:
$HH$ is the highest high,
$C$ is the closing price,
$LL$ is the lowest low.
The time period considered by the formula is 14 trading units.
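The relationship between the Stochastic Oscillator and Williams %R can be made concrete with a short Python sketch. It
assumes the standard definitions %K = 100 (C - LL) / (HH - LL) and %R = -100 (HH - C) / (HH - LL), from which %R = %K - 100
follows. The function names and the midpoint returned for a flat price range are illustrative assumptions.

```python
def _range_extremes(highs, lows, closes, period):
    """Validate the inputs and return (highest high, lowest low) of the window."""
    if not (len(highs) == len(lows) == len(closes)) or len(closes) < period:
        raise ValueError("need three equally long series of at least `period` bars")
    return max(highs[-period:]), min(lows[-period:])


def fast_stochastic_k(highs, lows, closes, period=14):
    """%K: position of the last close within the high-low range, 0 to 100."""
    highest, lowest = _range_extremes(highs, lows, closes, period)
    if highest == lowest:
        return 50.0  # flat range: the ratio is undefined, return the midpoint
    return 100.0 * (closes[-1] - lowest) / (highest - lowest)


def slow_stochastic_d(highs, lows, closes, period=14, smooth=3):
    """%D: average of the last `smooth` values of %K."""
    n = len(closes)
    if n < period + smooth - 1:
        raise ValueError("not enough bars to average %K")
    ks = [fast_stochastic_k(highs[:end], lows[:end], closes[:end], period)
          for end in range(n - smooth + 1, n + 1)]
    return sum(ks) / smooth


def williams_r(highs, lows, closes, period=14):
    """%R: distance of the last close from the highest high, -100 to 0."""
    highest, lowest = _range_extremes(highs, lows, closes, period)
    if highest == lowest:
        return -50.0  # flat range: mirror the midpoint convention of %K
    return -100.0 * (highest - closes[-1]) / (highest - lowest)
```

For any window with a non-zero range, `williams_r(...)` equals `fast_stochastic_k(...) - 100`, which is the sense in which
the two oscillators are inverses of each other.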
3.1.5. Price Rate of Change ROC:
The price rate of change (ROC) is a technical indicator of momentum that measures the percentage change in price between the
current price and the price n periods in the past. It is calculated by using the following formula:
$$ROC = \frac{C_t - C_{t-n}}{C_{t-n}} \times 100 \tag{6}$$
Where:
$C_t$ is the most recent closing price,
$C_{t-n}$ is the closing price n periods ago.
3.1.6. Simple Moving Average SMA:
SMA is the simplest momentum indicator used by many traders. It is calculated by adding the closing prices $C_{t-i}$ of the index
over a number of time periods $n$ (the usual time period, as for other momentum indicators, is 14 trading units) and then dividing
this total by the number of considered time periods, as in the following formula:
$$SMA = \frac{1}{n} \sum_{i=0}^{n-1} C_{t-i} \tag{7}$$
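The rate of change and the simple moving average are both one-line computations. The sketch below assumes the standard
definitions, ROC = 100 (C_t - C_{t-n}) / C_{t-n} and SMA as the mean of the last n closing prices; the function names are
hypothetical.

```python
def rate_of_change(closes, n=14):
    """Percentage change between the last close and the close n periods ago."""
    if len(closes) < n + 1:
        raise ValueError("need at least n + 1 closing prices")
    past = closes[-1 - n]
    return 100.0 * (closes[-1] - past) / past


def simple_moving_average(closes, n=14):
    """Mean of the last n closing prices."""
    if len(closes) < n:
        raise ValueError("need at least n closing prices")
    return sum(closes[-n:]) / n
```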
3.1.7. Exponential Moving Average EMA:
An exponential moving average (EMA) is a variation of the standard simple moving average that gives more importance to the
latest closing prices of the index. This type of moving average reacts faster to recent price changes than a simple moving average
and is calculated by using the following formulas:
$$EMA_t = (C_t - EMA_{t-1}) \times k + EMA_{t-1} \tag{8}$$
$$k = \frac{2}{N + 1} \tag{9}$$
Where:
$EMA_t$ is the current EMA value,
$EMA_{t-1}$ is the previous EMA value,
$C_t$ is the current closing price,
$N$ is the length of the EMA.
3.1.8. Commodity Channel Index CCI:
Another momentum indicator, the Commodity Channel Index (CCI), was developed by Donald Lambert. It measures the
current price level relative to an average price level over a given period of time (14 trading units). CCI is relatively high when
prices are far above their average, and relatively low when prices are far below their average. In this manner, CCI can be
used to identify overbought and oversold levels, which traders consider when making buy and sell orders. It is
calculated by using the following formula:
$$CCI = \frac{TP - SMA(TP)}{0.015 \times MD} \tag{10}$$
where $TP = (H + L + C) / 3$ is the typical price, $SMA(TP)$ is its simple moving average over the period and $MD$ is the mean
absolute deviation of $TP$ from that average.
3.1.9. On-Balance Volume - OBV:
On-balance volume or OBV is a momentum indicator developed by Joseph E. Granville (1976) that considers index volume flow
to predict changes in its price. According to him, the price of the index will eventually jump upward when volume increases sharply
without a significant change in price, and vice versa. It is computed by using the following formula:
$$OBV_t = OBV_{t-1} + V^{\pm}_t \tag{11}$$
Where:
$OBV_t$ is the current on-balance volume,
$OBV_{t-1}$ is the previous on-balance volume,
$V^{\pm}_t$ is the positive-negative volume (volume is positive if current volume is bigger than previous volume).
3.1.10. Moving average convergence divergence - MACD:
Moving average convergence divergence or MACD is a trend-following momentum indicator that shows the relationship between
two moving averages of prices. The MACD is calculated by subtracting the 26-day exponential moving average (EMA) from the
12-day EMA. A nine-day EMA of the MACD, called the "signal line", is then plotted on top of the MACD, functioning as a trigger
for buy and sell signals. MACD is calculated by the following formula:
$$MACD = EMA_{12} - EMA_{26} \tag{12}$$
3.1.11. STOCHRSI:
Some momentum indicators give a good performance when they are accompanied by other technical or momentum indicators.
STOCHRSI is one such indicator used in technical analysis that ranges between zero and one. It is created by applying the
Stochastic Oscillator formula to a set of Relative Strength Index (RSI) values rather than standard price data. Using RSI values
within the stochastic formula gives traders an idea of whether the current RSI value is overbought or oversold - a measure that
becomes specifically useful when the RSI value is confined between its signal levels of 20 and 80.
3.2. Fundamental Analysis:
Fundamental analysts believe in fundamental factors rather than technical ones. They care about the intrinsic values of stocks and
take into account everything related to the stocks such as earnings, market shares, financial conditions, news and investors'
sentiments. Contrary to technical analysts, fundamental analysts perform their analysis and calculations over a sufficiently long
time period and try to minimize their transactions. Eq. 13 gives the estimation from the fundamental analysts' point of view.
$$\hat{C}_{t+k} = f\left(F^{(1)}_t, F^{(2)}_t, \dots, F^{(m)}_t\right) \tag{13}$$
where $F^{(j)}_t$ is the value of the $j$-th fundamental factor at time $t$ and $\hat{C}_{t+k}$ is the estimated price $k$ periods
ahead.
4. PROPOSED MODEL
Historical data of the stock index are quite simple and contain only the few values that the index takes during a trading unit (hour,
day, month or year), such as the open, high, low, volume and closing price. Our goal is to predict the closing price from these
values, which is a challenging task due to the volatility of these prices. We need to calculate technical indicators from these values,
as the indicators hold valuable hidden information about prices. There are approximately 150 indicators; we use the most important
ones as noted by technical analysts and by a popular investment portal. Selecting and calculating these oscillators is the first step
of our model, which consists of two parts:
4.1. Calculation of Technical Indicators:
This step calculates technical indicators or oscillators from historical data of the BIST 30 index price and volume. While some of
these indicators depend only on the closing price of the index, others depend on the low and high as well as the closing price. One
of the oscillators, On-Balance Volume (OBV), depends on the volume value of the index. The calculation of these oscillators is
based on the default time period of each one. The output of this calculation is the input of our deep NN. In other words, the
indicator values serve as its features.
4.2. Deep Neural Network (Deep Learning):
Artificial Neural Network or ANN is one of the most important research areas in artificial intelligence and machine learning. The
main idea behind ANN is inspired by the natural neural network of the human nervous system. Neurons are imitated with
computing units connected with each other in the form of a network through axons and dendrites. Each neuron or node receives
inputs from other nodes through its dendrites, performs an operation on them and sends the result of that operation to other neurons.
The inputs to the ANN (also known as features) are technical indicators in our case.
A perceptron is a binary classifier that uses a linear prediction function. Most ANNs are networks of perceptrons, also known as
feed-forward neural networks, organized into fully connected layers. While a perceptron is suitable for building a linear
decision boundary, a simple ANN becomes infeasible when building a regression model with many features, hence deep
neural networks are needed. Deep NNs are simply ANNs with more hidden layers and more neurons in each of them, as illustrated
in Fig. 1. In the following subsection, we explain the steps taken in our work to build a regression model to predict the future prices
of the BIST 30 index.
Fig. 1. General Structure of an artificial neural network. Deep neural networks consist of more hidden layers and more
neurons in each hidden layer (Vázquez, F. 2017)
4.3. Walkthrough
Here, we provide a step-by-step explanation of the phases of our method.
4.3.1. Gathering data:
As our problem is analyzing the Turkish stock exchange market in order to predict future price movements of BIST 30, we needed
to gather a significant amount of financial data. One of the most reliable websites followed by many traders is
www.investing.com. We obtained our dataset from that website for a period of more than two years. The prices of stock indices
are generally given in csv format containing the closing, opening, low and high values and the volume of the index. Table 1 shows
a sample portion of the dataset.
Table 1. Sample of historical data of BIST 30

| Date | Price | Open | High | Low | Volume |
|---|---|---|---|---|---|
| 04.01.16 | 85,981.14 | 87,428.49 | 87,428.49 | 85,023.80 | 339.09 |
| 05.01.16 | 86,147.25 | 85,981.14 | 86,940.28 | 84,502.58 | 488.09 |
| 06.01.16 | 86,862.50 | 86,147.25 | 86,970.83 | 84,904.24 | 596.21 |
| 07.01.16 | 87,417.44 | 86,862.50 | 87,577.47 | 84,994.29 | 705.17 |
| 08.01.16 | 86,234.62 | 87,417.44 | 88,226.75 | 85,932.68 | 565.21 |
| 11.01.16 | 86,825.17 | 85,933.88 | 87,568.90 | 85,517.25 | 500.37 |
| 12.01.16 | 87,724.37 | 86,783.72 | 88,216.83 | 86,094.12 | 634.06 |
4.3.2. Data preparation:
Cleaning and processing data is necessary in most cases before applying machine learning algorithms. The datasets related to
financial markets suffer from several specific problems:
- Some companies may not exist any longer.
- The market is closed during national holidays and on the weekends.
- Due to technical problems, prices may contain erroneous negative values.
These issues should be addressed when constructing a machine learning model. Two important preprocessing steps are
normalization and finding correlated features. It is strongly advised to scale the features to the range [0, 1]. Fig. 2 and Fig. 3 show
the histograms of the features before and after normalization. We see that all of the features are normalized, and their values are in
the range [0, 1], except the price, since it is not a feature but the target value we are going to predict.
Another issue is finding out if, and which, features are correlated with each other. Such features should be eliminated, as correlated
features cause an ANN to overfit and have a bad impact on its performance. According to the technical indicator formulas, we
expect high correlation between SMA and EMA as they both represent moving averages. If so, we should eliminate one of them,
as it adds no information. Fig. 4 confirms our intuition that these two features are highly correlated. It is important to notice that
some indicators have more than one output; although these outputs are correlated with each other, we keep them as they are. Fig. 5
shows the pairwise feature correlations. We see that the OBV indicator is highly correlated with the Bollinger Bands (BB)
indicator, so we drop it from our calculations. As a result, we have dropped two indicators (SMA and OBV), and used the
remaining nine indicators from Section 3.1 in our model.
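The two preprocessing steps can be illustrated with a small, dependency-free Python sketch: min-max scaling to [0, 1] and
removal of one feature from each highly correlated pair. The correlation threshold of 0.95, the rule of keeping the first feature
encountered, and all function names are assumptions made for this example; the text does not state the threshold that was used.

```python
def min_max_scale(column):
    """Scale a list of numbers linearly to the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]  # constant column carries no information
    return [(x - low) / (high - low) for x in column]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column
    return cov / (var_x * var_y) ** 0.5


def drop_correlated(features, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept.

    `features` maps a feature name to its list of values.
    """
    kept = {}
    for name, column in features.items():
        if all(abs(pearson(column, other)) < threshold for other in kept.values()):
            kept[name] = column
    return kept
```

Applied to a feature table that lists EMA before SMA, this rule keeps EMA and drops SMA, which matches the choice described
above.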
4.3.3. Choosing a model:
Selecting a suitable model is critical for the performance of machine learning. In this work, we try to predict stock index values,
so we focus on regression. For the reasons outlined in the introduction, we pick a deep neural network trained on technical
oscillators obtained from a technical indicator calculator.
Fig. 2. Histogram of features before normalization
Fig. 3. Histogram of features after normalization
Fig. 4. High correlation between SMA and EMA
Fig. 5. Pairwise features correlations. OBV indicator is highly correlated with BB indicators.
4.3.4. Training:
Training a machine learning model means adjusting the model parameters to reduce the loss and achieve the desired prediction.
Parameters in our case are neuron weights and biases. We train our deep NN using the Keras API over the TensorFlow framework
developed by Google. We have used the Keras sequential model API with the ReLU (Rectified Linear Unit) activation function for
hidden layers. As our target model is a regression model, there is no need for an activation function in the output layer. The last
point that we should mention is the optimizer algorithm, which is responsible for adjusting the weights and biases. We have used
the Adam (adaptive moment estimation) optimizer.
4.3.5. Evaluation:
A common split ratio between the training set and the test set is 80-20, and we use this ratio. We could not use cross validation
when splitting the test set from the training set because our problem is time-series prediction. In such a situation the algorithm
learns on the first portion of the dataset (the training set) and is then evaluated on the last portion of the data (the test set). In other
words, the algorithm must not be trained on recent data and tested on older data. So, our model is trained on the first 80% of our
dataset (BIST 30's historical data for 27 months of trading) and it is tested on the last 20% after shifting the y values according
to the target trading day.
4.3.6. Hyperparameter tuning:
Typically, it is hard to generate a robust and highly accurate model on the first run of an algorithm. Thus, some parameters of the
model should be readjusted to decrease the loss of the regression model. Possible changes include increasing or decreasing the
number of hidden layers, the number of neurons in these layers, the activation function and the optimizer algorithm used for
training the network. We have achieved the desired performance with 7 hidden layers and 2 dropout layers, ReLU as the activation
function and Adam as the optimizer. Fig. 6 shows our network, where the input layer has 15 neurons (the outputs of the technical
indicators that remain after dropping SMA and OBV, some of which produce more than one value) and the output layer has only
one neuron, as we try to predict a single index value. Additionally, there are 7 hidden layers with 512, 256, 128, 64, 32, 16, and 8
neurons. There are two dropout layers, after the first and the second hidden layer, with dropout rates of 30% and 25% respectively.
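The architecture described above can be written down as a short Keras sketch. This is a hedged reconstruction from the layer
sizes, dropout rates, activation and optimizer given in the text, with MSE as the loss; the constant and function names, the
additional MAE metric, and the lazy TensorFlow import are assumptions of this example, and settings such as batch size or
number of epochs are not given in the text and are therefore left out.

```python
N_FEATURES = 15                              # inputs after dropping SMA and OBV
HIDDEN_UNITS = [512, 256, 128, 64, 32, 16, 8]
DROPOUT_AFTER = {0: 0.30, 1: 0.25}           # hidden-layer index -> dropout rate


def count_parameters(n_inputs=N_FEATURES, hidden=HIDDEN_UNITS):
    """Number of weights and biases of the fully connected stack."""
    sizes = [n_inputs, *hidden, 1]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def build_model():
    """Build and compile the regression network (requires TensorFlow)."""
    from tensorflow import keras  # imported lazily so the module loads without it

    model = keras.Sequential()
    model.add(keras.Input(shape=(N_FEATURES,)))
    for index, units in enumerate(HIDDEN_UNITS):
        model.add(keras.layers.Dense(units, activation="relu"))
        if index in DROPOUT_AFTER:
            model.add(keras.layers.Dropout(DROPOUT_AFTER[index]))
    model.add(keras.layers.Dense(1))  # no activation: linear output for regression
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model
```

Under these assumptions the network has 183,425 trainable parameters, most of them in the first two hidden layers, which is
where the two dropout layers are placed.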
Fig. 6. Deep neural network used in our work
4.3.7. Prediction:
After adjusting the parameters to obtain an acceptable model, this model is applied to the test set data to make
predictions and evaluate the performance using various metrics.
5. EXPERIMENTAL RESULTS
We have conducted our experiments on a dataset gathered from 01.01.2016 to 11.04.2018 on www.investing.com. Each
observation or row contains the trading date, the closing, opening, low and high price values as well as the volume and the change
percentage with respect to the previous trading day. After preprocessing our data and clearing out negative and null values, we
calculate the technical indicators or oscillators to be used as features in our model. We separate the features X from the target
values y, split the dataset into a training set and a test set, and train the deep neural network. We use 80/20 as the training/test
split ratio, where the first 80% of the data (BIST's historical data for 27 months of trading) is used as the training set and the last
20% of the data is used as the test set. The y values in each training and test portion are shifted according to the target trading
day. For example, to predict the index value for the next trading day we shift the y values by one, for the second trading day by
two, and so on.
One important point we should mention here is that, as our problem is a time-series problem, in order to predict the price value
after one or two trading days we should shift the target column by as many rows as needed. For example, to predict the index
closing price for the next day, we shift y by one row, and by two rows when predicting the price two days ahead. This
mechanism is known as the window mechanism.
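The target shifting and the chronological 80/20 split can be sketched as follows. The example assumes that each feature row is
paired with the closing price `horizon` trading days later and that the split is a plain cut without shuffling; the function names
are hypothetical.

```python
def shift_targets(features, closes, horizon):
    """Pair each feature row with the closing price `horizon` days later."""
    if len(features) != len(closes):
        raise ValueError("features and closes must have the same length")
    if not 1 <= horizon < len(closes):
        raise ValueError("horizon must be between 1 and len(closes) - 1")
    # The last `horizon` rows have no future price yet, so they are dropped.
    return features[:-horizon], closes[horizon:]


def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling: the oldest rows train, the newest rows test."""
    cut = int(len(X) * train_fraction)
    return X[:cut], X[cut:], y[:cut], y[cut:]
```

For five prediction horizons, `shift_targets` would be called with `horizon` from 1 to 5, giving one shifted dataset (and one
trained model) per target trading day.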
As our problem is a regression problem, we use the mean squared error (MSE), R2 score, mean absolute error (MAE) and mean
absolute percentage error (MAPE) metrics to evaluate the performance of our model and compare it with other methods in the
literature (Patel et al., 2015), (Sakarya et al., 2015). We have used multiple performance metrics as each metric yields some valuable
information not supplied by the others. For example, sometimes the MSE is very low but the R2 score is negative, which means
that the model predicts worse than a constant prediction of the mean and did not train well. Generally, the metrics except R2 are
considered better when close to zero, whereas the best value for R2 is 1.
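The four metrics follow directly from their standard definitions, as the dependency-free Python sketch below shows. MAPE is
expressed in percent, and the function name `regression_metrics` is hypothetical. The second call in the usage note
illustrates the case mentioned above, in which the MSE is small but the R2 score is negative.

```python
def regression_metrics(actual, predicted):
    """Return MSE, MAE, MAPE (in percent) and the R2 score."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    squared_error = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    total_variation = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "mse": squared_error / n,
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when an actual value is zero.
        "mape": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        # R2 compares the model with a constant prediction of the mean;
        # it is undefined when all actual values are equal.
        "r2": 1.0 - squared_error / total_variation,
    }
```

For example, `regression_metrics([1.00, 1.01, 1.02], [1.10, 1.10, 1.10])` yields an MSE below 0.01 together with a strongly
negative R2, because the actual values vary far less than the prediction errors.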
Fig. 7 shows the loss achieved by our model. As the loss trends towards and stays close to 0, our model is trained
well. Table 4 gives the performance metrics achieved by our deep learning model for the first five trading days and compares them
with two other techniques: SVR (support vector regression) and regular ANN (artificial neural network). Our deep learning model
clearly outperforms ANN for the first five trading days and SVR for the first four trading days, whereas SVR gives better results
for the fifth trading day. Also, our deep model outperforms the proposals presented by Patel et al. (2015) and by Sakarya et al.
(2015), as illustrated in Table 2 and Table 3 using the metrics reported in those works. Fig. 8 plots predicted closing prices vs. real
closing prices for the next five trading days. We observe that the predicted prices closely follow the actual trends.
Table 2. Comparison between our proposal and the work by Patel et al. (2015), using MSE as the metric.

| Trading day | Our proposal | Other proposal |
|---|---|---|
| 1st trading day | 0.0332 | 0.4427 |
| 2nd trading day | 0.1090 | 0.8748 |
| 3rd trading day | 0.0900 | 1.3556 |
| 4th trading day | 0.1069 | 1.8445 |
| 5th trading day | 0.2581 | 2.3455 |
Table 3. Comparison between our proposal and the work by Sakarya et al. (2015), using MAPE as the metric.

| Trading day | Our proposal | Other proposal |
|---|---|---|
| Next trading day | 1.0676 | 2.015 |
Fig. 7. Loss achieved by our model for the next trading day.
[Figure: training loss and validation loss (loss function: MSE at the output layer). x-axis: steps to reach the desired MSE in the output layer (0 to 38); y-axis: MSE values (0 to 1).]
Fig. 8. Actual vs. predicted prices for the next five trading days.
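Table 4. Performance comparison for the next five trading days.

| Trading day | Model | MSE | R² Score | MAE | MAPE |
|---|---|---|---|---|---|
| 1st Trading Day | Deep Model | 0.0332 | 0.8937 | 0.1487 | 1.0676 |
| 1st Trading Day | SVR | 0.1846 | 0.410 | 0.3823 | 2.7185 |
| 1st Trading Day | ANN | 0.8866 | -1.8311 | 0.7703 | 5.537 |
| 2nd Trading Day | Deep Model | 0.1090 | 0.6518 | 0.2714 | 1.6726 |
| 2nd Trading Day | SVR | 0.1448 | 0.5374 | 0.3188 | 2.2634 |
| 2nd Trading Day | ANN | 0.6971 | -1.2260 | 0.6747 | 4.8565 |
| 3rd Trading Day | Deep Model | 0.0900 | 0.7135 | 0.2532 | 1.6052 |
| 3rd Trading Day | SVR | 0.1009 | 0.6789 | 0.2580 | 1.8394 |
| 3rd Trading Day | ANN | 0.4554 | -0.4488 | 0.5635 | 4.0630 |
| 4th Trading Day | Deep Model | 0.1069 | 0.6598 | 0.2669 | 1.6103 |
| 4th Trading Day | SVR | 0.0996 | 0.6831 | 0.2609 | 1.8694 |
| 4th Trading Day | ANN | 5.9874 | -18.045 | 2.2313 | 15.8569 |
| 5th Trading Day | Deep Model | 0.2581 | 0.1787 | 0.4115 | 3.1046 |
| 5th Trading Day | SVR | 0.1496 | 0.5240 | 0.3160 | 2.2551 |
| 5th Trading Day | ANN | 2.9615 | -8.4205 | 1.6230 | 11.5143 |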
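As a quick cross-check of Table 4, the short sketch below finds the model with the lowest MSE for each forecast horizon. The MSE values are transcribed from the table; the helper `best_per_day` is a hypothetical name introduced only for this illustration, and the check looks at MSE alone, not at the other three metrics.

```python
# MSE values transcribed from Table 4, one entry per forecast horizon (days 1 to 5).
mse = {
    "Deep Model": [0.0332, 0.1090, 0.0900, 0.1069, 0.2581],
    "SVR":        [0.1846, 0.1448, 0.1009, 0.0996, 0.1496],
    "ANN":        [0.8866, 0.6971, 0.4554, 5.9874, 2.9615],
}


def best_per_day(scores):
    """Return, for each day, the name of the model with the smallest score."""
    n_days = len(next(iter(scores.values())))
    return [min(scores, key=lambda model: scores[model][day]) for day in range(n_days)]


winners = best_per_day(mse)
for day, model in enumerate(winners, start=1):
    print(f"day {day}: lowest MSE -> {model}")
```

By MSE, the deep model leads on days 1 to 3 and SVR on days 4 and 5, with the day-4 gap being small (0.0996 against 0.1069).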
6. CONCLUSIONS AND FUTURE WORK
The non-linearity and high volatility of stock market index prices make these prices challenging to forecast. Successful prediction
of stock market index values would greatly help investors devise a high-return investment strategy. In terms of the features used
to build prediction models, stock market prediction proposals generally fall into two camps: technical analysis-based proposals
and fundamental analysis-based proposals. We addressed the BIST 30 index prediction problem using deep learning, with features
selected from common, important technical indicators. Using data from 01.01.2016 to 11.04.2018, we trained and tested our model
and showed that it outperforms other techniques such as ANN and SVR, as well as comparable proposals in the literature
(Patel et al., 2015; Sakarya et al., 2015). We therefore conclude that deep learning is a promising solution for such a complex
task.
In terms of the output of the predictions, stock market index prediction can be divided into two broad categories: stock index price
prediction (a regression problem, which is what we have focused on) and stock index direction prediction (a classification problem),
where the direction is either up or down. The latter is important for building investment strategies that involve more than one
index. In future work, we plan to predict the index direction using deep learning with the same indicators. Another direction is to
combine fundamental and technical indicators and use them together as features of the deep neural network. Breaking news could
also be added to the feature set to make it more complete and improve learning performance. Finally, all proposals are currently
applied to offline datasets, and it would be useful to extend the model to handle live data as well.
REFERENCES
Akita, R., Yoshihara, A., Matsubara, T. & Uehara, K. (2016). Deep learning for stock prediction using numerical and textual
information. IEEE, Computer and Information Science (ICIS).
Bastı, E., Kuzey, C. & Delen, D. (2015). Analyzing initial public offerings' short-term performance using decision trees and SVMs,
73, 15-27.
Bollinger, J. (2001). Bollinger on Bollinger Bands, McGraw-Hill Education.
Boyacioglu, M. A. & Avci, D. (2010). An Adaptive Network-Based Fuzzy Inference System (ANFIS) for the prediction of stock
market return: The case of the Istanbul Stock Exchange. Elsevier, Expert Systems with Applications, 37(12), 7908-7912.
Chen, Y. & Hao, Y. (2017). A feature weighted support vector machine and K-nearest neighbor algorithm for stock market indices
prediction. Elsevier, Expert Systems with Applications, 80, 340-355.
Dechow, P., Ge, W. & Schrand, C. (2010). Understanding earnings quality: A review of the proxies, their determinants and their
consequences. Elsevier, Journal of Accounting and Economics, 50(2-3), 344-40.
Dechow, P. M., Hutton, A. P., Meulbroek, L. & Sloan, R. G. (2001). Short-sellers, fundamental analysis, and stock returns.
Elsevier, Journal of Financial Economics, 61(1), 77-106.
Gui, B., Wei, X., Shen, Q., Qi, J. & Guo, L. (2014). Financial Time Series Forecasting Using Support Vector Machine. IEEE, CIS,
15-16.
Ince, H. & Trafalis, T. B. (2017). A Hybrid forecasting model for stock market prediction. Economic Computation and Economic
Cybernetics Studies and Research, 51(3).
Leigh, W., Purvis, R. & Ragusa, J. M. (2002). Forecasting the NYSE Composite Index with Technical Analysis, Pattern
Recognizer, Neural Network, and Genetic Algorithm: A Case Study in Romantic Decision Support. Elsevier, Decision Support
Systems, 32, 361-377.
Lewellen, J. (2010). Accounting anomalies and fundamental analysis: An alternative view. Elsevier, Journal of Accounting and
Economics, 50(2-3), 455-466.
Moghaddam, A. H., Moghaddam, M. H. & Esfandyari, M. (2016). Stock market index prediction using artificial neural network.
Elsevier, Journal of Economics, Finance and Administrative Science, 21(41), 89-93.
Nassirtoussi, A. K., Aghabozorgi, S., Wah, T. Y. & Ngo, D. C. L. (2015). Text mining of news-headlines for FOREX market
prediction: A Multi-layer Dimension Reduction Algorithm with semantics and sentiment. Elsevier, Expert Systems with
Applications, 42(1), 306-324.
Ni, Z., Wang, D. & Xue, W. (2015). Investor sentiment and its nonlinear effect on stock returns: New evidence from the Chinese
stock market based on panel quantile regression model. Elsevier, Economic Modelling, 50, 266-274.
Oliveira, N., Cortez, P. & Areal, N. (2017). The impact of microblogging data for stock market prediction: Using Twitter to predict
returns, volatility, trading volume and survey sentiment indices. Elsevier, Expert Systems with Applications, 73, 125-144.
Patel, J., Shah, S., Thakkar, P. & Kotecha, K. (2015). Predicting stock market index using fusion of machine learning techniques.
Elsevier, Expert Systems with Applications, 42, 2162-2172.
Qian, B. & Rasheed, K. (2007). Stock market prediction with multiple classifiers. Springer, Applied Intelligence, 26(1), 25-33.
Sakarya, Ş., Yavuz, M., Karaoğlan, A. D. & Özdemir, N. (2015). Stock market index prediction with neural network during
financial crises: A review on Bist-100. 1, 2, 53-67.
Sands, T. M., Tayal, D., Morris, M. E. & Monteiro, S. T. (2015). Robust stock value prediction using support vector machines
with particle swarm optimization. IEEE, Evolutionary Computation (CEC).
Shynkevich, Y., McGinnity, T. M., Coleman, S. & Belatreche, A. (2015). Stock price prediction based on stock-specific and
sub-industry-specific news articles. IEEE, Neural Networks (IJCNN), 12-17.
Teixeira, L. A. & De Oliveira, A. L. I. (2010). A method for automatic stock trading combining technical analysis and nearest
neighbor classification. Elsevier, Expert Systems with Applications, 37(10), 6885-6890.
Vázquez, F. (2017). Deep Learning made easy with Deep Cognition.
https://becominghuman.ai/deep-learning-made-easy-with-deep-cognition-403fbe445351.
Welles Jr, J. (1978). New Concepts in Technical Trading Systems, Hunter Publishing Company, Greensboro, NC.
Williams, L. (1973). How I Made One Million Dollars Last Year Trading Commodities, Conceptual Management, Monterey, CA.
Granville, J. E. (1976). Granville’s New Strategy of Daily Stock Market Timing for Maximum Profit, Prentice-Hall, Inc.,
ISBN 0-13-363432-9.
Zhang, Y. & Wu, L. (2009). Stock market prediction of S&P 500 via combination of improved BCO approach and BP neural
network. Elsevier, Expert Systems with Applications, 36(5), 8849-8854.