# Forecasting Foreign Exchange Rates With Artificial Neural Networks: A Review

**Abstract**

Forecasting exchange rates is an important financial problem that is receiving increasing attention, especially because of its difficulty and practical applications. Artificial neural networks (ANNs) have been widely used as a promising alternative approach for forecasting tasks because of several distinguishing features. Research efforts on ANNs for forecasting exchange rates are considerable. In this paper, we attempt to provide a survey of research in this area. Several design factors significantly impact the accuracy of neural network forecasts, including the selection of input variables, data preparation, and network architecture. There is no consensus on these factors; different choices prove effective in different cases. We also describe the integration of ANNs with other methods, report comparisons between the performance of ANNs and that of other forecasting methods, and find mixed results. Finally, future research directions in this area are discussed.

April 2, 2004 22:13 WSPC/173-IJITDM 00096

International Journal of Information Technology & Decision Making, Vol. 3, No. 1 (2004) 145–165

© World Scientific Publishing Company


WEI HUANG
Institute of Systems Science, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1, Asahidai, Ishikawa 923-1292, Japan

K. K. LAI
Department of Management Sciences, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong

Y. NAKAMORI
School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1, Asahidai, Ishikawa 923-1292, Japan

SHOUYANG WANG∗
Institute of Systems Science, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
Tel: 86-10-62651381; Fax: 86-10-62568364; sywang@amss.ac.cn


Keywords: Artificial neural networks; exchange rate; forecasting.

∗Corresponding author. This author is also with the School of Business, Hunan University.


1. Introduction

The foreign exchange market is the largest and most liquid of the financial markets, with an estimated $1 trillion traded every day. Exchange rates are amongst the most important economic indices in the international monetary markets. For large multinational firms, which conduct substantial currency transfers in the course of business, being able to accurately forecast exchange rate movements can result in substantial improvement in the firm's overall profitability.

Exchange rates are affected by many highly correlated economic, political and even psychological factors, and these factors interact in a very complex fashion. Exchange rate series exhibit high volatility, complexity and noise that result from an elusive market mechanism generating daily observations.49 Evidence has clearly shown that while there is little linear dependence, the null hypothesis of independence can be strongly rejected, demonstrating the existence of non-linearities in exchange rates.11

Much research effort has been devoted to exploring the nonlinearity of exchange rate data and to developing specific nonlinear models to improve exchange rate forecasting. Parametric nonlinear models such as the autoregressive random variance (ARV) model,44 autoregressive conditional heteroscedasticity (ARCH),16 generalized autoregressive conditional heteroskedasticity (GARCH),1 chaotic dynamic31 and self-exciting threshold autoregressive4 models have been proposed and applied to foreign exchange rate forecasting. While these models may be good for a particular situation, they perform poorly for other applications. The pre-specification of the model form restricts the usefulness of these parametric nonlinear models, since many other possible nonlinear patterns can be considered. One particular nonlinear specification will not be general enough to capture all the nonlinearities in the data. Some nonparametric methods have also been proposed to forecast exchange rates.7,28,29 However, the nonparametric methods investigated in these studies are still unable to improve upon a simple random walk model in out-of-sample predictions of exchange rates.

There has been growing interest in the adoption of state-of-the-art artificial intelligence technologies to solve the problem. One stream of these advanced techniques focuses on the use of artificial neural networks (ANNs) to analyze the historical data and provide predictions on future movements in the foreign exchange market. An ANN is a system loosely modeled on the human brain, which detects the underlying functional relationships within a set of data and performs tasks such as pattern recognition, classification, evaluation, modeling, prediction and control. ANNs are particularly well suited to finding accurate solutions in an environment characterized by complex, noisy, irrelevant or partial information. Several distinguishing features of ANNs make them valuable and attractive in forecasting. First, as opposed to traditional model-based methods, ANNs are data-driven, self-adaptive methods in that there are few a priori assumptions about the models for the problems under study. Second, ANNs can generalize. Third, ANNs are universal functional approximators. Finally, ANNs are nonlinear.59

The idea of using ANNs for forecasting exchange rates is not new. Weigend et al.53 find that neural networks are better than random walk models in predicting the DEM/USD exchange rate. Refenes et al.37 apply a multi-layer perceptron network to predict the USD/DEM exchange rate and to study the convergence issue related to network architecture. Refenes36 develops a constructive learning algorithm to find the best neural network configuration in forecasting DEM/USD. Poddig33 studies the problem of predicting the trend of the USD/DEM, and compares results to those obtained through regression analysis. Pi32 proposes a test for dependence among exchange rates. Shin41 applies an ANN model and moving average trading rules to investigate the return predictability of exchange rates. Zhang and Hutchinson62 report the experience of forecasting the tick-by-tick CHF/USD. Kuan and Liu24 use both feed-forward and recurrent neural networks to forecast GBP, CAD, DEM, JPY and CHF against USD. Wu55 compares neural networks with ARIMA models in forecasting Taiwan/USD exchange rates. Hann and Steurer15 make comparisons between neural network and linear models in USD/DEM forecasting. Episcopos and Davis10 investigate the problem of predicting daily returns based on five Canadian exchange rates using ANNs and a heteroskedastic model, EGARCH. Tenti48 proposes the use of recurrent neural networks to forecast exchange rates. Other earlier examples using ANNs in exchange rate applications include Zhang61 and Yao et al.56

Considerable research effort has gone into ANNs for forecasting exchange rates. In this paper, we attempt to provide a survey of research in this area. Forecasting exchange rates using ANNs is a process that can be divided into several steps, and our goal in this paper is to find the consensus and disagreements in each step. Hence, the comparisons of the various methods used by different researchers follow the whole forecasting process. For the areas of consensus, guidelines are summarized. For the disagreements, we analyze the reasons and point out the advantages and disadvantages of the various methods.

The paper is organized as follows. Section 2 covers input selection. Section 3 deals with preparing data. In Sec. 4, we give a brief presentation of ANN architectures. Section 5 describes the integration of ANNs with other methods. The comparison between the performance of ANNs and that of other forecasting methods is reported in Sec. 6. Finally, conclusions and directions for future research are discussed in Sec. 7.

2. Input Selection

There are two kinds of inputs — fundamental inputs and technical inputs. Fundamental inputs include the consumer price index, foreign reserves, GDP, export and import volumes, interest rates, etc. Technical inputs include the delayed time series data, moving averages, the relative strength index, etc. Besides the above two kinds of inputs, individual forecast results can be used as inputs when ANNs are employed as combined forecasting tools. In order to provide improved volatility forecasts, Hu and Tsoukalas17 combine GARCH, EGARCH, IGARCH and MAV volatility forecasts through ANNs. A preliminary effort to maximize the output performance is conducted by ensuring adequate domain knowledge representation from the input variables.47,51

While Walczak et al.52 claim that multivariate inputs are necessary, most neural network inputs for exchange rate prediction are univariate. Univariate inputs utilize data directly from the time series being forecast, while multivariate inputs utilize information from outside the time series in addition to the time series itself. Univariate inputs rely on the predictive capabilities of the time series itself, corresponding to a technical analysis as opposed to a fundamental analysis. For a univariate time series forecasting problem, the network inputs are the past, lagged observations of the data series and the output is the future value. Each input pattern is composed of a moving window of fixed length along the series. In this sense, the feed-forward network used for time series forecasting is a general autoregressive model. The question is how many lag periods should be included in predicting the future. Some authors designed experiments to help select the number of input nodes, while others adopted intuitive or empirical ideas. Mixed results are often reported in the literature. The lack of systematic approaches to neural network model building is probably the primary cause of the inconsistencies in the reported findings.
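The moving-window construction described above can be sketched as follows (an illustrative Python fragment; the function name and toy data are ours, not from any of the surveyed studies):

```python
import numpy as np

def make_lagged_patterns(series, n_lags):
    """Slide a fixed-length window along the series: each input pattern
    holds n_lags past observations, and the target is the next value."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])   # lagged observations (the window)
        y.append(series[t])              # future value to forecast
    return np.array(X), np.array(y)

rates = np.array([1.10, 1.12, 1.11, 1.15, 1.14, 1.16])
X, y = make_lagged_patterns(rates, n_lags=3)
# first pattern: inputs [1.10, 1.12, 1.11], target 1.15
```

Choosing `n_lags` here is exactly the open question discussed above: each extra lag adds an input node to the network.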

Ideally, we desire a small number of lag periods that can unveil the unique features embedded in the data. The inclusion of excessive periods will adversely affect the training time of the network, and the algorithm will likely be trapped in locally optimal solutions. On the other hand, if the lag is smaller than required, forecasting accuracy will be jeopardized because the search is restricted to a subspace. Too few or too many lag periods affect either the learning or the prediction capability of the network. It is desirable to reduce the number of input nodes to an absolute minimum of essential nodes.

The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), as well as several extensions, have been used as information-based in-sample model selection criteria in selecting neural networks for foreign exchange rate time series forecasting.34 However, the in-sample model selection criteria are not able to provide a reliable guide to out-of-sample performance, and there is no apparent connection between in-sample model fit and out-of-sample forecasting performance.

Huang et al.18 propose a general approach called the Autocorrelation Criterion (AC) to determine lag structures in the application of ANNs to univariate time series forecasting. They apply the approach to the determination of input variables for foreign exchange rate forecasting and conduct comparisons between AC and information-based in-sample model selection criteria. Experimental results show that AC outperforms the information-based in-sample model selection criteria in terms of forecasting performance.
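Huang et al.'s exact criterion is given in their paper; as a simplified illustration of the general idea of autocorrelation-driven lag selection, lags can be screened by whether their sample autocorrelation exceeds the usual approximate 95% significance band of 2/√N (the threshold and selection rule here are our assumptions):

```python
import numpy as np

def significant_lags(series, max_lag):
    """Keep the lags whose sample autocorrelation exceeds the
    approximate 95% significance band 2/sqrt(N)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    band = 2.0 / np.sqrt(len(x))
    lags = []
    for k in range(1, max_lag + 1):
        r_k = np.dot(x[:-k], x[k:]) / denom   # sample autocorrelation at lag k
        if abs(r_k) > band:
            lags.append(k)
    return lags

# A strongly persistent series (a random walk): low lags should be selected.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=500))
lags = significant_lags(x, max_lag=10)
```

The selected lags then become the network's input nodes, keeping the input layer close to the "absolute minimum of essential nodes" discussed above.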


We suggest that practitioners employ the Autocorrelation Criterion in the case of univariate inputs. It does not require any assumptions and is completely independent of any particular class of model. The selection of input variables is data-driven, making full use of the information among sample observations even if the underlying relationships are unknown or hard to describe. Thus, it is well suited for time series problems whose solutions require knowledge that is difficult to specify but for which there are enough data or observations. It provides a practical way to solve input selection for neural networks in time series forecasting.

Multivariate inputs are based on economics and finance theory. El Shazly et al.8 use fundamental inputs including the one-month Eurorate on US dollar deposits, the one-month Eurorate on the foreign currency deposit, the spot exchange rate, and the one-month forward premium on the foreign currency. El Shazly et al.9 use inputs including the 90-day Euro deposit rate on the US dollar (INTUS); the 90-day Euro deposit rates on the British pound (INTBP), German mark (INTDM), Japanese yen (INTJY), and the Swiss franc (INTSF); the spot exchange rate of the foreign currency: SBP, SDM, SJY, and SSF, expressed in direct form; the 90-day forward exchange rate on the foreign currency: FDBP, FDDM, FDJY, FDSF; and the 90-day futures exchange rate on the foreign currency: FTBP, FTDM, FTJY, FTSF. The above selection of input variables comes from Interest Rate Parity, a principle by which forward exchange rates reflect relative interest rates on default risk-free instruments denominated in alternative currencies. Currencies of countries with high interest rates are expected by the market to depreciate over time, and currencies of countries with low interest rates are expected to appreciate over time.
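As a small numeric illustration of Interest Rate Parity (the rates and quoting convention below are hypothetical, and the formula is the simplified covered-parity relation with no transaction costs):

```python
def forward_rate(spot, i_dom, i_for, days=90, basis=360):
    """Covered interest rate parity: the forward rate offsets the interest
    differential on default risk-free deposits in the two currencies.
    spot is quoted as domestic units per unit of foreign currency."""
    tau = days / basis
    return spot * (1 + i_dom * tau) / (1 + i_for * tau)

# Hypothetical figures: domestic deposit rate 5%, foreign rate 3%, spot 1.50.
f = forward_rate(1.50, 0.05, 0.03, days=90)
# f > spot: the high-interest (domestic) currency trades at a forward discount,
# i.e. it is expected to depreciate, consistent with the text above.
```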

Leung et al.25 use the MUIP relationship39 as the theoretical basis for multivariate specification. The MUIP relationship can be modeled and written as follows:

e_t = α_0 + α_1(r*_t − r_t) + α_2(π*_t − π_t) + α_3(p*_t − p_t) + α_4(ca_t/ny_t) + α_5(ca*_t/ny*_t) + µ_t

where e is the natural logarithm of the exchange rate, defined as the foreign currency price of domestic currency. r, π, p and (ca/ny) represent the logarithm of the nominal short-term interest rate, the expected price inflation rate, the logarithm of the price level, and the ratio of the current account to nominal GDP for the domestic economy, respectively. Asterisks denote the corresponding foreign variables. µ is the error term. The variables (ca/ny) and (ca*/ny*) are proxies for the risk premium.

Tenti48 uses inputs including the compound returns of the last n periods (where n = 1, 2, 3, 5, 8), the running standard deviation of the last k periods (where k = 13, 21, 34), and technical indicators such as the average directional movement index (ADX), trend movement index (TMI), rate of change (ROC), and Ehlers leading indicator (ELI). Lisi and Schiavo26 use both the past observations of the series itself and those of an "auxiliary" variable chosen among the remaining series. For example, the lagged FRF/USD and GBP/USD are used to predict the future FRF/USD.

According to Walczak and Cerpa's51 suggestions, multivariate inputs can be determined through the following steps. First, perform standard knowledge acquisition: obtain as many explanatory variables for foreign exchange rates as possible from economics and finance theory. The primary purpose of the knowledge acquisition phase is to guarantee that the input variable set is not under-specified, providing all relevant domain criteria to the ANNs. Once a base set of input variables is defined through knowledge acquisition, the set can be pruned to eliminate variables that contribute noise to the ANNs and consequently reduce the ANNs' generalization performance. Smith43 claims that ANN input variables need to be predictive, but should not be correlated. Correlated variables degrade ANN performance by interacting with each other as well as with other elements to produce a biased effect. A first-pass filter to help identify noise variables is to calculate the correlation of pairs of variables (the Pearson correlation matrix). Alternatively, a chi-square test may be used for categorical variables. If two variables have a high correlation, then one of them may be removed from the set of variables without adversely affecting ANN performance. Additional statistical techniques may be applied, depending on the distribution properties of the data set. Stepwise regression (multiple or logistic) and factor analysis provide viable tools for evaluating the predictive value of input variables and may serve as a secondary filter to the Pearson correlation matrix.
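The first-pass correlation filter can be sketched as follows (illustrative Python; the 0.9 threshold, the variable names, and the rule of dropping the later variable of each pair are our assumptions, not Walczak and Cerpa's):

```python
import numpy as np

def prune_correlated(X, names, threshold=0.9):
    """First-pass filter: compute the Pearson correlation matrix and
    drop one variable from each highly correlated pair."""
    corr = np.corrcoef(X, rowvar=False)
    keep = list(range(X.shape[1]))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if i in keep and j in keep and abs(corr[i, j]) > threshold:
                keep.remove(j)          # drop the later variable of the pair
    return [names[k] for k in keep]

rng = np.random.default_rng(1)
rate = rng.normal(size=200)
X = np.column_stack([
    rate,                                    # an interest-rate series
    2 * rate + 0.01 * rng.normal(size=200),  # near-duplicate of the rate
    rng.normal(size=200),                    # an unrelated variable
])
kept = prune_correlated(X, ["int_rate", "int_rate_2x", "cpi"])
```

Stepwise regression or factor analysis would then act as the secondary filter on the surviving variables.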

Multivariate inputs have advantages in long-term forecasting, unveiling the movement trend of foreign exchange rates. But they need more data and time, and some explanatory variables are not available when required. Univariate inputs do not have such problems; practitioners can benefit from a net reduction in development costs, since less data is required. However, univariate input lacks economic explanation, which weakens forecasting credibility.

3. Preparing Data

Because only relatively little preliminary knowledge is required to train artificial neural networks, and on account of their black-box character, data is often presented to the networks without any further processing steps being taken. However, the degree of care invested in preparing the data is of decisive importance to the network's learning speed and the quality of approximation it can attain. Every hour invested in preparing the data may save days in training the networks.

The first questions to be considered here are of a very general nature:

(1) Is sufficient data available, and does this data contain the correct information?
(2) Does the available data cover the range of the variables concerned as completely as possible?
(3) Are there borderline cases that are not covered by the data?
(4) Does the data contain irrelevant information?
(5) Are there transformations or combinations of variables (e.g. ratios) that describe the problem more effectively than the individual variables themselves?

Once all these points have been clarified, the data needs to be transformed into an appropriate form for the networks. Various normalization methods are generally employed to this end. Tenti48 normalizes inputs to zero mean and two standard deviations. In Hu and Tsoukalas's17 study, all inputs to the ANNs are linearly normalized to [0, 1]. El Shazly et al.9 suggest that data should be manipulated and converted to the required format for further processing. In Lisi and Schiavo's26 study, the log-differenced data are scaled linearly into the range 0.2–0.8 in order to adapt them to the output range of the sigmoid activation function. Qi and Zhang34 apply a natural logarithm transformation to the raw data to stabilize the series. An ADF test shows that the transformed time series contains a unit root, so the first-order difference is applied.
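The scaling used by Lisi and Schiavo amounts to a linear map of the log-differenced data into [0.2, 0.8], which can be sketched as follows (illustrative; the toy price series is ours):

```python
import numpy as np

def scale_to_range(x, lo=0.2, hi=0.8):
    """Linearly map a series into [lo, hi], e.g. to match the effective
    output range of a sigmoid activation function."""
    x = np.asarray(x, dtype=float)
    return lo + (hi - lo) * (x - x.min()) / (x.max() - x.min())

prices = np.array([1.10, 1.12, 1.11, 1.15, 1.14])
log_diff = np.diff(np.log(prices))       # log-differenced series
scaled = scale_to_range(log_diff)        # now within [0.2, 0.8]
```

Note that the same `min`/`max` fitted on the training data must be reused when scaling test data, or the test inputs leak information.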

There is no consensus on whether data normalization should be used. For example, it is still unclear whether there is a need to normalize the inputs, because the arc weights can undo the scaling. Shanker et al.40 investigate the effectiveness of linear and statistical normalization methods for classification problems. They find that data normalization methods do not necessarily lead to better performance, particularly when the networks and sample size are large. El Shazly et al.8 apply normalization and transformation to initial runs and then discard them. They find that although the differenced data set speeds up training by reducing the noise during training, the networks yield poor forecasts when tested. Based on the objective of improving testing performance rather than speeding up training, they decide to use raw data during training. Zhang and Hu59 find no significant difference between using normalized and original data, based on their experience with the exchange rate data. Hence, raw data are used in that study.

Although normalization of the data is not compulsory, it is sometimes unavoidable. If, for example, a function is valid only for a limited range, e.g. the sigmoid function (0.0–1.0) or the tanh function (−1.0–1.0), the network will be unable to generate any output values outside of this range. The target output data for the training and test phases must therefore be normalized.

In principle, it is not absolutely necessary to normalize the input data, as the network's input layer is assigned a linear function. However, it is nevertheless inadvisable to skip normalization when using multivariate inputs. As a result of normalization, all variables acquire the same significance for the learning process. If normalization is not carried out, variables with greater values will be given preference.

Generally, a data set is divided into two parts: the training set and the test set. The training set is used for ANN model development and the test set is used to evaluate forecasting ability. Sometimes a third set, called the validation set, is used to avoid the overfitting problem or to determine the stopping point in the training process.

There is no general solution to splitting the training set and test set. The Brainmaker software randomly selects 10% of the facts from the data set and uses them for testing. Yao et al.57 suggest that the historical data be divided into three portions: training, validation and testing sets. The training set contains 70% of the collected data, while the validation and testing sets contain 20% and 10% respectively. The division is based on a rule of thumb derived from the authors' experience. Other researchers simply state the division directly, without touching on the reasons for it.
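Yao et al.'s 70/20/10 rule of thumb amounts to a chronological split along the series (illustrative sketch; for a time series the split should not be shuffled, or future values leak into training):

```python
def split_series(data, train=0.7, valid=0.2):
    """Chronological train/validation/test split: the earliest
    observations train the network, the most recent ones test it."""
    n = len(data)
    i = round(n * train)
    j = round(n * (train + valid))
    return data[:i], data[i:j], data[j:]

obs = list(range(100))                       # stand-in for 100 observations
train_set, valid_set, test_set = split_series(obs)
```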

Sample size is another factor that can affect the forecasting ability of artificial neural networks. Neural network researchers have used various sizes of training sets, ranging from one year to sixteen years.21,36,46,48,52,58 Large samples are often claimed to be optimal in training neural networks due to the large set of parameters involved in the network. To test whether there is a significant difference between large and small training samples in modeling and forecasting exchange rates, Zhang and Hu58 use two training sample sizes. The large sample consists of 887 observations from 1976 to 1992, and the small one includes 261 data points from 1988 to 1992. Their result is that the large sample outperforms the smaller sample. Most researchers typically use all of the data in building a neural network forecasting model once they have obtained their training data, with no attempt to compare the effect of data quantity on the quality of the produced forecasting models.

However, Kang,22 in a comprehensive study of neural network time series forecasting, finds that neural network forecasting models do not necessarily require a large data set to perform well. Walczak50 examines the effect of different sizes of training sample sets on forecasting exchange rates. His results indicate that for financial time series, two years of training data is frequently all that is required to produce optimal forecasting accuracy. He claims that given an appropriate amount of historical knowledge, neural networks can forecast future exchange rates with 60% accuracy, while neural networks trained on a larger training set have worse forecasting performance. In addition to high-quality forecasts, the reduced training set sizes reduce development cost and time.

Huang et al.19 propose to determine the optimal quantity of training data by using change-point detection. The behavior of exchange rates evolves over time. Therefore, we can conjecture that the movement of exchange rates has a series of change points, which divide the data into several homogeneous groups whose characteristics differ from one group to the next.
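Huang et al.'s procedure is detailed in their paper; as a simplified illustration of the change-point idea, a single shift in the mean of a series can be located by least squares, after which only the most recent homogeneous segment would be used for training (the detector below is our sketch, not their method):

```python
import numpy as np

def best_split(x):
    """Single change-point in the mean: choose the split index that
    minimizes the pooled sum of squared deviations of the two segments."""
    x = np.asarray(x, dtype=float)
    best_k, best_cost = None, np.inf
    for k in range(2, len(x) - 2):
        left, right = x[:k], x[k:]
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

# Two regimes with different means: the detector should cut near index 60.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 0.1, 60), rng.normal(1.0, 0.1, 40)])
k = best_split(x)
# Training would then use only the recent homogeneous segment x[k:].
```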

4. Architectures of ANNs

Three classes of ANN architectures have been employed for forecasting foreign exchange rates. In this section, we give a brief presentation and conduct some comparisons.

4.1. Feedforward

In feedforward ANNs, the connections between units do not form cycles. Feedforward ANNs usually produce a response to an input quickly.

4.1.1. Multi-layer perceptrons (MLP)

MLP38 is perhaps the most popular network architecture in use, and it is relatively easy to implement. An MLP is typically composed of several layers of nodes (see Fig. 1).

Fig. 1. Examples of multi-layer perceptron neural network architectures.

The network thus has a simple interpretation as a form of input-output model, with the weights and thresholds (biases) as the free parameters of the model. Although it has been shown theoretically that the MLP has a universal functional approximating capability and can approximate any nonlinear function with arbitrary accuracy, no universal guideline exists for choosing the appropriate model structure for practical applications. Thus, a trial-and-error approach or cross-validation experiment is often adopted to help find the best model. Typically a large number of neural network architectures are considered; the one with the best performance on the validation set is chosen as the winner, and the others are discarded.
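The trial-and-error selection loop can be sketched as follows (a minimal illustration with a tiny hand-rolled one-hidden-layer MLP and toy data, not any surveyed author's implementation; the candidate sizes and learning rate are our assumptions):

```python
import numpy as np

def train_mlp(X, y, hidden, epochs=2000, lr=0.1, seed=0):
    """Minimal one-hidden-layer MLP (tanh hidden units, linear output),
    trained with full-batch gradient descent on mean squared error."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1));          b2 = np.zeros(1)
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                   # hidden activations
        err = (H @ W2 + b2).ravel() - y            # output error
        dH = (err[:, None] @ W2.T) * (1.0 - H**2)  # back-propagate through tanh
        W2 -= lr * (H.T @ err[:, None]) / n; b2 -= lr * err.mean(keepdims=True)
        W1 -= lr * (X.T @ dH) / n;           b1 -= lr * dH.mean(axis=0)
    return lambda Z: (np.tanh(Z @ W1 + b1) @ W2 + b2).ravel()

# Noisy periodic toy series, four lagged observations per input pattern.
rng = np.random.default_rng(3)
series = np.sin(0.3 * np.arange(220)) + 0.05 * rng.normal(size=220)
X = np.array([series[t - 4:t] for t in range(4, len(series))])
y = series[4:]
X_tr, y_tr = X[:150], y[:150]          # training set
X_va, y_va = X[150:], y[150:]          # validation set

# Trial-and-error: train several candidate architectures, keep the one
# with the lowest validation error, discard the rest.
best_h, best_val = None, np.inf
for h in (2, 4, 8):
    f = train_mlp(X_tr, y_tr, hidden=h)
    val = float(((f(X_va) - y_va) ** 2).mean())
    if val < best_val:
        best_h, best_val = h, val
```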

4.1.2. Radial basis function networks (RBFNs)

RBFNs30 have a static Gaussian function as the nonlinearity for the hidden layer processing elements. The Gaussian function responds only to a small region of the input space where the Gaussian is centered. The key to successful implementation of these networks is to find suitable centers for the Gaussian functions. This can be done with supervised learning, but an unsupervised approach usually produces better results.

The advantage of the radial basis function network is that it finds the input-to-output map using local approximators. Usually the supervised segment is simply a linear combination of the approximators. Since linear combiners have few weights, these networks train extremely fast and require fewer training samples.
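A minimal sketch of this two-stage construction (centers picked directly from the data stand in for the unsupervised step; the linear combiner is then a least-squares fit; all names and data are ours):

```python
import numpy as np

def fit_rbfn(X, y, centers, width):
    """RBF network sketch: fixed Gaussian centers, then the output
    weights come from a linear least-squares fit (the fast combiner)."""
    def design(Z):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * width ** 2))      # Gaussian activations
    G = design(X)
    w, *_ = np.linalg.lstsq(G, y, rcond=None)      # linear combiner weights
    return lambda Z: design(Z) @ w

X = np.linspace(0, 1, 40)[:, None]
y = np.sin(2 * np.pi * X[:, 0])                    # toy target function
centers = X[::5]                                   # every 5th sample as a center
model = fit_rbfn(X, y, centers, width=0.15)
fit_err = float(((model(X) - y) ** 2).mean())
```

In a fuller treatment the centers would come from an unsupervised step such as clustering, which the text notes usually works better than supervised placement.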


4.1.3. Learning vector quantization (LVQ)

LVQ12,23 is a precursor of the well-known self-organizing maps (also called Kohonen feature maps) and, like them, can be seen as a special kind of artificial neural network. A neural network for learning vector quantization consists of two layers: an input layer and an output layer. It represents a set of reference vectors, the coordinates of which are the weights of the connections leading from the input neurons to an output neuron. Hence, one may also say that each output neuron corresponds to one reference vector. This kind of ANN architecture can only be used for classification; hence, we cannot employ it to forecast foreign exchange rate values.

4.1.4. General regression neural networks (GRNNs)

GRNNs45 are memory-based feed-forward networks based on the estimation of probability density functions. GRNNs feature fast training times, can model nonlinear functions, and have been shown to perform well in noisy environments given enough data. The GRNN topology consists of four layers: the input layer, pattern layer, summation layer and output layer. Each layer of processing units is assigned a specific computational function when nonlinear regression is performed. The only adjustable parameter in a GRNN is the smoothing factor for the kernel function. The optimization of the smoothing factor is critical to the GRNN's performance and is usually found through iterative adjustment and a cross-validation procedure.

The advantages of GRNNs include:

(1) Fast training times.
(2) They can handle both linear and non-linear data.
(3) Adding new samples to the training set does not require re-calibrating the model.
(4) Only one adjustable parameter, thereby making overtraining less likely.

The disadvantages include:

(1) Trouble with irrelevant inputs (i.e. they suffer from the curse of dimensionality).
(2) No intuitive method for selecting the optimal smoothing parameter.
(3) Many training samples are required to adequately span the variation in the data.
(4) All the training samples must be stored for future use (i.e. prediction).
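The core GRNN computation reduces to a kernel-weighted average of the stored training targets, which also makes the memory-based character (disadvantage 4) and the role of the single smoothing factor concrete (an illustrative sketch; the toy data and smoothing value are ours):

```python
import numpy as np

def grnn_predict(X_train, y_train, x, sigma):
    """GRNN estimate at query point x: the prediction is the Gaussian
    kernel-weighted average of all stored training targets; sigma is
    the network's only adjustable (smoothing) parameter."""
    d2 = ((X_train - x) ** 2).sum(axis=1)      # squared distances to patterns
    w = np.exp(-d2 / (2 * sigma ** 2))         # pattern-layer activations
    return float((w * y_train).sum() / w.sum())

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])             # y = x^2 on the sample points
pred = grnn_predict(X, y, np.array([2.0]), sigma=0.3)
# pred is close to 4, dominated by the nearest stored pattern
```

Note that every training sample is touched at prediction time, and sigma would in practice be tuned by cross-validation as the text describes.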

4.2. Feedback

In feedback ANNs, there are cycles in the connections. Each time an input is presented, the ANN must iterate for a potentially long time before it produces a response. Feedback ANNs are usually more difficult to train than feedforward ANNs.


4.2.1. Recurrent neural networks (RNNs)

Recurrent neural networks (RNNs), in which the input layer's activity patterns pass through the network more than once before generating a new output pattern, can learn extremely complex temporal patterns. The recurrent architecture has been shown to be superior to the windowing technique of overlapping snapshots of data, which is used with standard back-propagation. In fact, by introducing time-lagged model components, RNNs may respond to the same input pattern in a different way at different times, depending on the input sequence. The main disadvantage of RNNs is that they require substantially more connections, and more memory in simulation, than standard back-propagation networks. RNNs can yield good results because of the rough repetition of similar patterns present in exchange rate time series. These regular but subtle sequences can provide a beneficial forecasting ability.
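The sequence dependence described above can be illustrated with a minimal Elman-style forward pass (a sketch with random, untrained weights; the point is only that the same current input produces different responses given different histories):

```python
import numpy as np

def elman_forward(inputs, W_in, W_rec, W_out):
    """Forward pass of a minimal Elman-style recurrent network: the
    hidden state is fed back, so the response to an input depends on
    the whole sequence seen so far, not just the current value."""
    h = np.zeros(W_rec.shape[0])
    outputs = []
    for x in inputs:
        h = np.tanh(W_in @ np.atleast_1d(x) + W_rec @ h)  # recurrent update
        outputs.append(float(W_out @ h))
    return outputs

rng = np.random.default_rng(4)
W_in = rng.normal(size=(3, 1))
W_rec = 0.5 * rng.normal(size=(3, 3))
W_out = rng.normal(size=(3,))

# Same final input (0.5), different histories -> different responses:
y1 = elman_forward([0.1, 0.2, 0.5], W_in, W_rec, W_out)
y2 = elman_forward([0.9, -0.3, 0.5], W_in, W_rec, W_out)
```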

4.3. Competitive

4.3.1. Fuzzy ARTMAP network

A fuzzy ARTMAP network3 is a fuzzy ART2 network that adds a single output layer to generate an error signal to the fuzzy ART network, which is made up of the input, complement and category layers. The addition of the output layer for the error signal transforms the network from an unsupervised network to a supervised network, in which the network learns from examples for which the real category is known.

4.3.2. Modular

Modular ANNs20 essentially make use of multiple individual back-propagation net-

works (BPNs) that compete to learn diﬀerent aspects of the problem. The networks

use an expert gating mechanism to choose which of the BPNs (called a local expert)

does best on a particular input observation, essentially assigning diﬀerent regions

of the data space to different local experts. The general idea is that the error at

each local expert is weighted by the posterior probability (obtained as training takes

place) that it was responsible for the current output vector. The gating network

learns by trying to match its prior probabilities to the posterior probabilities found

at each local expert.
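The gating computation can be sketched as follows, assuming Gaussian expert noise for the posterior responsibilities. The weights are random placeholders (in the mixture-of-experts scheme the experts and gate would be trained jointly), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Two linear "local experts" and a gating network, with random weights.
W_experts = rng.normal(size=(2, 3))   # 2 experts, 3 inputs each
W_gate = rng.normal(size=(2, 3))

def moe_forecast(x, y=None, sigma=1.0):
    mu = W_experts @ x                     # each expert's prediction
    prior = softmax(W_gate @ x)            # gating priors over experts
    y_hat = prior @ mu                     # gated combined forecast
    if y is None:
        return y_hat
    # Posterior responsibility of each expert for the observed output y,
    # assuming Gaussian noise with scale sigma around each expert's mu:
    lik = np.exp(-0.5 * ((y - mu) / sigma) ** 2)
    post = prior * lik / (prior * lik).sum()
    return y_hat, post

x = np.array([0.5, -0.2, 1.0])
y_hat, post = moe_forecast(x, y=0.3)
```

Training would push the gating priors toward these posteriors, so each expert gradually specializes in the region of input space it predicts best.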

MLP is used most frequently for exchange rate prediction, because it has an

inherent capability of arbitrary input-output mapping. However, other types of

ANNs are also used.

Tenti48 performs tests with three variations of RNNs. The first architecture used

(RNN1) has one hidden and one recurrent layer. The output layer is fed back into

the hidden layer, by means of the recurrent layer, showing the resulting output of

the previous pattern. In the second version (RNN2), similar to that of Fransconi

et al.,13 the hidden layer is fed back into itself through an extra layer of recurrent

nodes. In the third version (RNN3), patterns are processed from the input layer


through a recurrent layer of nodes, which holds the input layer’s contents as they

existed when previous patterns were trained, and then are fed back into the input layer.

Leung et al.25 examine the forecasting ability of GRNNs and compare their performance

with a variety of forecasting techniques, including the multi-layered feed-forward

network.

Davis et al.6 present a variety of neural network forecasting models applied to

Canadian–US exchange rate data. Networks such as back-propagation, modular,

radial basis functions, linear vector quantization, fuzzy ARTMAP and genetic rein-

forcement learning are examined. It is important to note that they predict direction

shifts on Canadian–US exchange rate data rather than absolute price. Diﬀerent

types of classiﬁcation networks have characteristics that may prove eﬀective for

speciﬁc classiﬁcation data.

The selection of ANN architecture is an open problem. ANN designers must weigh

the constraints of the training data set against development cost. We suggest that

practitioners employ MLPs, which are relatively easy and inexpensive to

implement.

5. The Integration of ANNs with Other Methods

The desire to further enhance the performance of neural network prediction has

led to the development of hybrid systems that combine neural networks with other

methods. The integration of ANNs with other technologies, such as wavelet analysis,

genetic algorithm, or fuzzy logic can improve the applications of ANNs. Although

each technology has its own strengths and weaknesses, these technologies are complementary.

Weaknesses of one technology can be overcome by the strengths of another,

achieving a synergistic effect. Such an effect can create results that are more

efficient, productive, and effective than the sum of their parts.

Genetic algorithm (GA) is a class of probabilistic search techniques based on

biological evolution. Each point in the solution space is coded as a binary string

called a chromosome. For instance, the co-ordinate (10, 5, 3) is encoded as

    1010 0101 0011
     10    5    3

where each four-bit group is the binary encoding of one component.

When a new generation exists, each member is ranked according to its ﬁtness.

From this, a new population must be created. Essentially this is a “survival of the

ﬁttest solution”, and the members used for mating are chosen with a probability

proportional to their ﬁtness.

A technique called crossover is employed to maximize retention of the good

points of the previous generation. This is analogous to biological mating in which a

child may be superior to both parents if it inherits good genes from both parents. In

the computing process, this is achieved by swapping corresponding bits in pairs of

chromosomes according to a given crossover rate; for instance, the last three bits of

one chromosome may be swapped with the last three bits of another chromosome.


If the population does not contain all of the traits needed to solve a problem, no

amount of crossover will work. As a remedy, a single bit is flipped very infrequently.

This is called mutation, and it addresses a problem familiar from neural network

training: getting stuck in local minima. Mutation provides a way out by preventing

a bit from converging on a single value throughout the entire population. Mutation

must be kept to a minimum to prevent the loss of good chromosomes.
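The encoding, fitness-proportional selection, crossover and mutation steps above can be sketched as a toy GA that searches for the coordinate (10, 5, 3) from the earlier encoding example. All function names and parameter values are illustrative choices, not a prescribed implementation.

```python
import random

random.seed(42)

# Chromosomes are 12-bit strings encoding three 4-bit coordinates,
# as in the (10, 5, 3) -> 1010 0101 0011 example.
def decode(chrom):
    return tuple(int(chrom[i:i + 4], 2) for i in (0, 4, 8))

def fitness(chrom):
    x, y, z = decode(chrom)
    return -(x - 10) ** 2 - (y - 5) ** 2 - (z - 3) ** 2   # peak of 0 at (10, 5, 3)

def select(pop):
    # Fitness-proportional ("roulette wheel") selection of two parents.
    f = [fitness(c) for c in pop]
    shift = min(f)
    weights = [v - shift + 1e-9 for v in f]   # shift so all weights are positive
    return random.choices(pop, weights=weights, k=2)

def crossover(a, b):
    # Swap corresponding tails at a random cut point.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(chrom, rate=0.02):
    # Infrequent bit flips keep the population from losing diversity.
    return "".join(bit if random.random() > rate else "10"[int(bit)]
                   for bit in chrom)

pop = ["".join(random.choice("01") for _ in range(12)) for _ in range(30)]
for _ in range(60):                       # 60 generations
    nxt = []
    while len(nxt) < len(pop):
        a, b = select(pop)
        c, d = crossover(a, b)
        nxt += [mutate(c), mutate(d)]
    pop = nxt
best = max(pop, key=fitness)
```

Because selection, crossover and mutation act on a whole population at once, the search explores many regions of the space in parallel rather than following a single gradient path.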

The inclusion of GA search techniques was undertaken for two reasons. The first

relates to the potential GAs offer in terms of adaptiveness; their flexibility, robustness

and simplicity render them very attractive in that respect. The

second stems from the difficulty of optimizing neural network applications.

By operating on entire populations of candidate solutions in parallel, GAs are much less

likely to get stuck at a local optimum.

Wavelet analysis is used to process information eﬀectively at diﬀerent scales.

It is very useful for feature detection from complex and chaotic time series. In

particular, the speciﬁc local properties of wavelets can be useful in describing the

signals with discontinuous or fractal structures in the financial market. It also allows

the removal of noise-dependent high frequencies, while conserving the signal-bearing

high-frequency terms. However, one of the most critical issues in applying

wavelet analysis is choosing the correct wavelet thresholding parameters.
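A minimal illustration of wavelet thresholding: a one-level Haar transform splits the series into smooth (low-frequency) and detail (high-frequency) parts, the details are soft-thresholded to drop noise, and the series is reconstructed. The surveyed studies use more elaborate transforms and data-driven thresholds; the fixed `thresh` here is just an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(7)

def haar_denoise(x, thresh=0.1):
    """One-level Haar decomposition, soft-threshold the details, reconstruct."""
    x = np.asarray(x, dtype=float)            # length must be even
    a = (x[0::2] + x[1::2]) / np.sqrt(2)      # approximation (low-frequency) coeffs
    d = (x[0::2] - x[1::2]) / np.sqrt(2)      # detail (high-frequency) coeffs
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)   # soft thresholding
    out = np.empty_like(x)
    out[0::2] = (a + d) / np.sqrt(2)          # inverse Haar transform
    out[1::2] = (a - d) / np.sqrt(2)
    return out

t = np.linspace(0, 4 * np.pi, 256)
clean = np.sin(t)                             # stand-in "signal"
noisy = clean + rng.normal(scale=0.2, size=t.size)
denoised = haar_denoise(noisy, thresh=0.3)
```

With `thresh=0` the transform reconstructs the input exactly; the whole design question, as the text notes, is how large the threshold should be.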

El Shazly and El Shazly9 design a hybrid system combining neural networks with genetic

training to forecast the three-month spot exchange rate. Once the network is

trained, tested and identiﬁed as being “good”, a GA is applied to it in order to

optimize its performance. The process of genetic evolution works on the neuron con-

nection of a trained network by applying two procedures: mutation and crossover.

The application of hybrid systems seems to be well suited for the forecasting of

ﬁnancial data.

Shin et al.42 propose an integrated thresholding design of the optimal or near-

optimal wavelet transformation (WT) by GA to represent a signiﬁcant signal most

suitable in ANN models. The model is applied to forecast the Korean won/USD

returns one day ahead of time. In this study, the multi-scale signal representation of

ANNs is supported by a wavelet transform as the multi-signal decomposition tech-

nique to detect the features of signiﬁcant patterns. A strategy is devised using WT

to construct a ﬁlter that is signiﬁcantly matched to the frequency of the time series

within the combined model. The experimental results show the enhanced ﬁltering

or signal multi-resolution power of wavelet analysis by GA in the performance of

the ANNs. This study also finds that the hybrid system of wavelet transformations

and ANNs by GA is much better than ANNs that use three other wavelet

thresholding algorithms (cross-validation, best level, and best basis) in

forecasting performance.


6. Performance Comparison with Other Forecasting Methods

There are inconsistent reports on the performance of ANNs for forecasting exchange

rates when compared with other forecasting methods. Table 1 summarizes the lit-

erature on the relative performance of ANNs.

Weigend et al.53 find that neural networks are better than random walk

models (RW) in predicting the DEM/USD exchange rate. Wei and Jiang54 claim

that ANNs' forecasting performance is better than those of AR(p), ARMA(p, q) and

ARIMA(p, d, q) models. Lisi and Schiavo26 make a comparison between ANNs and chaotic models

for exchange rate prediction. ANNs perform slightly better than chaotic

models in terms of NMSE; nevertheless, the two models are statistically equivalent.

Yao and Tan57 show that, irrespective of NMSE, gradient or profit, ANNs are

much better than the traditional ARIMA model when forecasting the exchange rates

between the USD and five other major currencies: AUD, CHF, DEM, GBP and JPY.

Leung et al.25 point out that GRNNs generally outperform parametric multivariate

transfer functions and the random walk models.

Episcopos and Davis10 suggest that neural networks are similar to EGARCH,

but superior to random walk models in terms of in-sample ﬁt and out-of-sample

prediction performance. Hann and Steurer15 compare neural network models with

linear monetary models in forecasting USD/DEM. Out-of-sample results show that,

for weekly data, neural networks are much better than linear models and naïve

predictions of a random walk model with regard to Theil's U measure, the hit rate,

the annualized returns and the Sharpe ratio. However, if monthly data are used,

neural networks do not show much improvement over linear models. Monthly data

usually contain more irregularities (seasonality, cyclicity, nonlinearity, noise).

Zhang and Hutchinson62 ﬁnd mixed results for neural networks in compari-

son with those from random walk models using diﬀerent sections of the data set.

Kuan and Liu24 examine the out-of-sample forecasting ability of neural networks on

ﬁve exchange rates against the USD, including GBP, CAD, DEM, JPY and CHF.

For the GBP and JPY, they demonstrate that neural networks have signiﬁcant

market timing ability and/or achieve signiﬁcantly lower out-of-sample RMSE than

the random walk model across three testing periods. For the other three exchange

rates, neural networks are not shown to be superior in forecasting performance.

Their results also show that diﬀerent network models perform quite diﬀerently

in out-of-sample forecasting. Hu and Tsoukalas17 compare the performance of

combining ANN forecasts with those of various forecasting methods. Using different performance

measurements and different data stages, they get different results. ANNs are not always

better than other forecasting tools. Zhang and Hu58 find that neural networks predict

much better than the random walk model when large training samples are used. Small

training samples make ANNs fail to outperform the random walks at longer

forecast horizons. They suggest possible structural changes in exchange rate data.

Therefore, as more observations are available, they should be used to revise the fore-

casting neural networks models to better reﬂect change in the underlying pattern.


Table 1. The relative performance of ANNs with traditional forecasting methods.

| Researchers | Data | ANN Type | Traditional Forecasting Method | Performance Measure | Conclusions |
|---|---|---|---|---|---|
| Episcopos and Davis10 | USD, DEM, FRF, JPY, GBP against CAD | MLP | EGARCH, RW | RMSE | Similar to EGARCH; better than RW |
| Hann and Steurer15 | DEM/USD | MLP | Linear model, RW | Theil's U measure, hit rate, annualized returns, Sharpe ratio | Better on weekly data; similar on monthly data |
| Hu and Tsoukalas17 | BEF/LUF, GBP, DKK, NLG, FRF, GRD, IEP, ITL, PTE, ESP, USD against DEM | MLP | MAV, GARCH, EGARCH, IGARCH, OLS, AVE | RMSE, MAE | Mixed results |
| Kuan and Liu24 | GBP, CAD, DEM, JPY, CHF against USD | MLP, RNNs | RW | RMSE | Mixed results |
| Leung et al.25 | GBP, JPY, CAD against USD | GRNNs | Multivariate transfer function, RW | MAE, RMSE | Better |
| Lisi and Schiavo26 | FRF, DEM, ITL, GBP against USD | MLP | Chaotic model, RW | NMSE | Better |
| Wei and Jiang54 | GBP/USD | MLP | AR, ARMA, ARIMA | RMSE | Better |
| Weigend et al.53 | DEM/USD | MLP | RW | ARV | Better |
| Yao and Tan57 | AUD, CHF, DEM, GBP, JPY against USD | MLP | ARIMA | NMSE, correctness of gradient prediction | Better |
| Zhang and Hu58 | GBP/USD | MLP | RW | RMSE, MAE, MAPE | Mixed results |
| Zhang and Hutchinson62 | CHF/USD | MLP | RW | RMSE | Mixed results |

MAE: mean absolute error.

RMSE: root mean square error.

NMSE: normalized mean square error.

MAPE: mean absolute percentage error.

ARV: average relative variance.
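The error measures cited in Table 1 can be sketched as below for paired arrays of actual and predicted exchange rates. This is illustrative only: the NMSE is normalized here by the variance of the actual series, which is one common convention; the surveyed papers differ in such details, and the function name and sample values are hypothetical.

```python
import numpy as np

def forecast_errors(actual, pred):
    """MAE, RMSE, MAPE (in percent) and NMSE for a forecast series."""
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)
    e = actual - pred
    return {
        "MAE": np.mean(np.abs(e)),
        "RMSE": np.sqrt(np.mean(e ** 2)),
        "MAPE": np.mean(np.abs(e / actual)) * 100,
        "NMSE": np.mean(e ** 2) / np.var(actual),   # normalized by variance of actuals
    }

rates = [1.52, 1.55, 1.53, 1.58, 1.60]   # hypothetical actual exchange rates
preds = [1.50, 1.56, 1.54, 1.57, 1.62]   # hypothetical forecasts
errs = forecast_errors(rates, preds)
```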


It is important to note that most studies use a single neural network model

in modeling and predicting exchange rates. As data-dependent neural networks

tend to be more unstable than traditional parametric models, performance of the

keep-the-best (KTB) approach can vary dramatically with diﬀerent models and

data. Random variations resulting from data partitioning or subtle shifts in the

parameters of the time series generating process can have a large eﬀect on the

learning and generalization capability of a single neural network model. These may

be the reasons why neural networks perform diﬀerently for diﬀerent exchange rate

series and diﬀerent time frequencies with diﬀerent data partitions.

7. Conclusions

In this paper, we present a survey of forecasting exchange rates using artiﬁcial neu-

ral networks. ANNs have been shown to be a promising tool for forecasting ﬁnancial

time series. Several design factors significantly impact the accuracy of neural network

forecasts. These factors include the selection of input variables, data preparation,

and ANN architecture. There is no consensus on these factors; in different cases,

various decisions have their own effectiveness. There is no formal systematic model

building approach. The integration of neural networks with other technologies is

reported. We also discuss the relative performance of ANNs compared with other

forecasting methods, ﬁnding mixed results.

Model uncertainty comes from three main sources: model structure, param-

eter estimation and data. The nonlinear nonparametric nature of ANNs may cause

more uncertainties in model building. This learning and generalization dilemma, or

tradeoff, has been studied extensively and is still an active research topic in the field. To

improve generalization performance of neural network models, we may need to go

beyond the model selection methods. Eﬀorts can be made along the lines of hint,14

Bayesian regularization,27 Vapnik-Chervonenkis dimension analysis,5and support

vector machine.35

Neural network ensembles seem promising for improving predictions over the

KTB approach because they do not solely rely on the performance of a single

neural network model. Zhang and Berardi60 examine three ensemble approaches. The

ﬁrst approach is to combine neural networks trained with diﬀerent initial random

weights but with the same data. The second approach is to combine diﬀerent neu-

ral network architectures within an ensemble. The third approach is to combine

networks trained with diﬀerent sample data. Many other ensemble methods can be

considered. For example, one potential method is based on bootstrapping samples

randomly generated from the original whole training time series. While computa-

tionally expensive, ensemble models based on bootstrapping samples may provide

further insights and evidence on the eﬀectiveness of the ensemble method for out-

of-sample forecasting. Research eﬀorts should also be devoted to the methods that


can further reduce the correlation eﬀect in combining neural networks and to quan-

tifying the impact that shifts in the data generation parameters have on the various

approaches.
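The bootstrap-ensemble idea can be sketched as follows, using simple one-lag linear predictors in place of neural networks to keep the example short. Instead of keeping the single best model (KTB), many members are fit on resampled pairs of consecutive observations and their forecasts are averaged; every name and parameter here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_ar1(x, y):
    """Least-squares fit of y_t = a * y_{t-1} + b (a stand-in for a trained model)."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef                                   # (slope, intercept)

# Hypothetical training series and its lagged pairs (y_{t-1}, y_t).
series = 0.8 ** np.arange(60) + rng.normal(scale=0.02, size=60)
x, y = series[:-1], series[1:]

members = []
for _ in range(25):                               # 25 bootstrap ensemble members
    idx = rng.integers(0, len(x), size=len(x))    # resample pairs with replacement
    members.append(fit_ar1(x[idx], y[idx]))

last = series[-1]
forecasts = [a * last + b for a, b in members]
ensemble_forecast = float(np.mean(forecasts))     # combined one-step forecast
```

The averaged forecast is less sensitive to any one data partition than a single kept-best model, which is the motivation given above.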

The exchange rates forecasted include Australian Dollar (AUD), Belgian/

Luxembourg Franc (BEF/LUF), British Pound (GBP), Canadian Dollar (CAD),

Danish Krone (DKK), Deutsche Mark (DEM), Dutch Guilder (NLG), French

Franc (FRF), Greek Drachma (GRD), Irish Punt (IEP), Italian Lira (ITL),

Japanese Yen (JPY), Korean won, Portuguese Escudo (PTE), Russian rouble,

Spanish Peseta (ESP), Swiss Franc (CHF) and US Dollar (USD). Among them,

USD, GBP, JPY and DEM are forecast most frequently. Contrary to Diebold and

Nason's7 opinion that there is little variation in results from one exchange

rate to another when nonparametric methods are used, results in one exchange

rate cannot be generalized to another. At least, caution is still needed when

generalizing.

While some studies have found encouraging results using this artiﬁcial intel-

ligence technique to predict the movements of established ﬁnancial markets, it is

interesting to verify the persistence of this performance in the emerging markets.

These rapidly growing ﬁnancial markets are usually characterized by high volatility,

relatively smaller capitalization, and less price eﬃciency, features which may hinder

the eﬀectiveness of those forecasting models developed for established markets. So

future research should extend to other exchange rates.

There are only two studies9,42 on the integration of ANNs and other technolo-

gies. Their research results show improvement in forecasting performance. Since the

neural network is considered to have great potential as a powerful forecasting tool,

its integration with other technologies should improve its overall performance.

For practitioners, trading driven by a certain forecast with a small forecast error

may not be as proﬁtable as trading guided by an accurate prediction of the direction

of movement. Therefore, predicting the direction of change of foreign exchange rates

and its return is also signiﬁcant in the development of eﬀective market trading

strategies.
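The direction-of-change criterion can be sketched as a hit rate: the fraction of periods in which the forecast and the actual rate move the same way, which can matter more for trading profitability than a small RMSE. This is a minimal illustration with hypothetical numbers.

```python
import numpy as np

def hit_rate(actual, pred):
    """Fraction of periods where forecast and actual move in the same direction."""
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)
    d_actual = np.sign(np.diff(actual))   # actual up/down/flat moves
    d_pred = np.sign(np.diff(pred))       # forecast up/down/flat moves
    return float(np.mean(d_actual == d_pred))

actual = [1.50, 1.52, 1.51, 1.54, 1.53]   # hypothetical actual rates
pred = [1.50, 1.53, 1.52, 1.52, 1.52]     # hypothetical forecasts
rate = hit_rate(actual, pred)             # here 2 of 4 moves match: 0.5
```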

Future research should attempt to formulate a hybrid neural network model

for forecasting as follows: the model integrates ANNs with more complementary

technologies to enhance its self-adaptation to different situations. More statistical

analysis should be provided for determining parameters such as sample size and

frequency. We need to find out which kinds of data segments best capture the

underlying behavior of market changes. Appropriate sample frequencies should be

investigated to provide enough information on the underlying relationships in exchange

rates while limiting the incorporation of noise.

Acknowledgement

This project is supported by NSFC, CAS and the City University of Hong Kong.


References

1. T. Bollerslev, Modeling the coherence in short-run nominal exchange rates: A multivariate generalized ARCH model, Review of Economics and Statistics 72 (1990) 498–505.

2. G. A. Carpenter, S. Grossberg and J. H. Reynolds, ART-MAP: Supervised real time learning and classification of nonstationary data by a self-organizing neural network, Neural Networks 4 (1991) 565–588.

3. G. A. Carpenter, S. Grossberg, N. Markuzon, J. H. Reynolds and D. B. Rosen, Fuzzy ARTMAP: An adaptive resonance architecture for incremental learning of analog maps, IJCNN, June 1992, Baltimore, Vol. III, pp. 309–314.

4. D. Chappel, J. Padmore, P. Mistry and C. Ellis, A threshold model for French franc/Deutsch mark exchange rate, Journal of Forecasting 15 (1996) 155–164.

5. V. Cherkassky, F. M. Mulier and X. Shao, Model complexity control for regression using VC generalization bounds, IEEE Transactions on Neural Networks 10 (1999) 1075–1089.

6. J. T. Davis, A. Episcopos and S. Wettimuny, Predicting direction shifts on Canadian–US exchange rates with artificial neural networks, International Journal of Intelligent Systems in Accounting, Finance and Management 10 (2001) 83–96.

7. F. X. Diebold and J. A. Nason, Nonparametric exchange rate prediction, Journal of International Economics 28 (1990) 315–322.

8. M. R. El Shazly and H. E. El Shazly, Comparing the forecasting performance of neural networks and forward exchange rates, Journal of Multinational Financial Management 7 (1997) 345–356.

9. M. R. El Shazly and H. E. El Shazly, Forecasting currency prices using a genetically evolved neural network architecture, International Review of Financial Analysis 8(1) (1999) 67–82.

10. A. Episcopos and J. Davis, Predicting returns on Canadian exchange rates with artificial neural networks and EGARCH-M model, Neural Computing and Applications 4 (1996) 168–174.

11. H. Fang, K. S. Lai and M. Lai, Fractal structure in currency futures price dynamics, Journal of Futures Markets 14 (1994) 169–181.

12. L. Fausett, Fundamentals of Neural Networks (Prentice Hall, Englewood Cliffs, NJ, 1994).

13. P. Fransconi, M. Gori and G. Soda, Local feedback multilayered networks, Neural Computation 4 (1992) 120–130.

14. R. Garcia and R. Gencay, Predicting and hedging derivative securities with neural networks and a homogeneity hint, Journal of Econometrics 94 (2000) 93–115.

15. T. H. Hann and E. Steurer, Much ado about nothing? Exchange rate forecasting: Neural networks versus linear models using monthly and weekly data, Neurocomputing 10 (1996) 323–339.

16. D. A. Hsieh, Modeling heteroscedasticity in daily foreign-exchange rates, Journal of Business and Economic Statistics 7 (1989) 307–317.

17. M. Y. Hu and C. Tsoukalas, Combining conditional volatility forecasts using neural networks: An application to the EMS exchange rates, Journal of International Financial Markets, Institutions and Money 9 (1999) 407–422.

18. W. Huang, Y. Nakamori and S. Y. Wang, A general approach based on autocorrelation to determine input variables of neural networks for time series forecasting, submitted to IEEE Transactions on Systems, Man and Cybernetics, 2002a.

19. W. Huang, Y. Nakamori and S. Y. Wang, Using change-point detection to seek optimal training set for neural networks in foreign exchange rates forecasting, submitted to International Journal of Computational Intelligence and Organization, 2002b.

20. R. A. Jacobs, M. I. Jordan, S. J. Nowlan and G. E. Hinton, Adaptive mixtures of local experts, Neural Computation 3 (1991) 79–87.

21. A. M. Jamal and C. Sundar, Modeling exchange rates with neural networks, Journal of Applied Business Research 14(1) (1998) 1–5.

22. S. Kang, An investigation of the use of feedforward neural networks for forecasting, PhD Thesis, Kent State University, Kent, OH, 1991.

23. T. Kohonen, Learning vector quantization, Neural Networks 1 (suppl. 1) (1988) p. 303.

24. C. M. Kuan and T. Liu, Forecasting exchange rates using feedforward and recurrent neural networks, Journal of Applied Econometrics 10 (1995) 347–364.

25. M. T. Leung, A. S. Chen and H. Daouk, Forecasting exchange rates using general regression neural networks, Computers and Operations Research 27 (2000) 1093–1110.

26. F. Lisi and R. A. Schiavo, A comparison between neural networks and chaotic models for exchange rate prediction, Computational Statistics and Data Analysis 30 (1999) 87–102.

27. D. J. C. MacKay, Bayesian interpolation, Neural Computation 4 (1992) 415–447.

28. R. A. Meese and A. Rose, An empirical assessment of non-linearities in models of exchange rate determination, Review of Economic Studies 58 (1991) 601–619.

29. B. Mizrach, Multivariate nearest-neighbor forecasts of EMS exchange rates, Journal of Applied Econometrics 7 (1992) S151–S164.

30. J. Moody and C. J. Darken, Fast learning in networks of locally tuned processing units, Neural Computation 1 (1989) 281–294.

31. D. A. Peel and P. Yadav, The time series behavior of spot exchange rates in the German hyper-inflation period: Was the process chaotic? Empirical Economics 20 (1995) 455–463.

32. H. Pi, Dependence analysis and neural network modeling of currency exchange rates, in Proceedings of the First International Workshop on Neural Networks in Capital Markets, London, 1993.

33. A. Podding, Short term forecasting of the USD/DEM exchange rate, in Proceedings of the First International Workshop on Neural Networks in Capital Markets, London, 1993.

34. M. Qi and G. Zhang, An investigation of model selection criteria for neural network time series forecasting, European Journal of Operational Research 132 (2001) 666–680.

35. S. Raudys, How good are support vector machines? Neural Networks 13 (2000) 17–19.

36. A. N. Refenes, Constructive learning and its application to currency exchange rate forecasting, in Neural Networks in Finance and Investing: Using Artificial Intelligence to Improve Real-World Performance, eds. R. R. Trippi and E. Turban (Probus Publishing Company, Chicago, 1993), pp. 777–805.

37. A. N. Refenes, M. Azema-Barac, L. Chen and S. A. Karoussos, Currency exchange rate prediction and neural network design strategies, Neural Computing and Applications 1 (1993) 46–58.

38. D. E. Rumelhart, G. E. Hinton and R. J. Williams, Learning internal representations by error propagation, in Parallel Distributed Processing, eds. D. E. Rumelhart and J. L. McClelland, Vol. 1 (MIT Press, Cambridge, MA, 1986), pp. 318–362.

39. N. Sarantis and C. Stewart, Monetary and asset market models for sterling exchange rates: A cointegration approach, Journal of Economic Integration 10 (1995) 335–371.

40. M. Shanker, M. Y. Hu and M. S. Hung, Effect of data standardization on neural network training, Omega 24(4) (1996) 385–397.

41. S. H. Shin, Essays on asset return predictability, PhD Dissertation, Massachusetts Institute of Technology, 1993.

42. T. Shin and I. Han, Optimal signal multi-resolution by genetic algorithm to support artificial neural networks for exchange-rate forecasting, Expert Systems with Applications 18 (2000) 257–269.

43. M. Smith, Neural Networks for Statistical Modeling (Van Nostrand Reinhold, New York, 1993).

44. M. K. P. So, K. Lam and W. K. Li, Forecasting exchange rate volatility using autoregressive random variance model, Applied Financial Economics 9 (1999) 583–591.

45. D. Specht, A general regression neural network, IEEE Transactions on Neural Networks 2 (1991) 568–576.

46. E. Steurer, Nonlinear modelling of the DEM/USD exchange rate, in Neural Networks in the Capital Markets, ed. A. Refenes (Wiley, New York), pp. 199–211.

47. A. Tahai, S. Walczak and J. T. Rigsby, Improving artificial neural network performance through input variable selection, in Applications of Fuzzy Sets and the Theory of Evidence to Accounting II, eds. P. Siegel, K. Omer, A. de Korvin and A. Zebda (JAI Press, Stamford, CT, 1998), pp. 277–292.

48. P. Tenti, Forecasting foreign exchange rates using recurrent neural networks, Applied Artificial Intelligence 10 (1996) 567–581.

49. P. Theodossiou, The stochastic properties of major Canadian exchange rates, The Financial Review 29(2) (1994) 193–221.

50. S. Walczak, An empirical analysis of data requirements for financial forecasting with neural networks, Journal of Management Information Systems 17(4) (2001) 203–222.

51. S. Walczak and N. Cerpa, Heuristic principles for the design of artificial neural networks, Information and Software Technology 41(2) (1999) 107–117.

52. S. Walczak, A. Tahai and K. Karim, Improved cash flows using neural network models for forecasting foreign exchange rates, in Applications of Fuzzy Sets and the Theory of Evidence to Accounting II, eds. P. Siegel, K. Omer, A. de Korvin and A. Zebda (JAI Press, Stamford, CT, 1998), pp. 293–310.

53. A. S. Weigend, B. A. Huberman and D. E. Rumelhart, Predicting sunspots and exchange rates with connectionist networks, in Nonlinear Modeling and Forecasting, eds. M. Casdagli and S. Eubank (Addison-Wesley, Redwood City, CA, 1992), pp. 395–432.

54. W. X. Wei and Z. H. Jiang, Artificial neural network forecasting model for exchange rate and empirical analysis, Forecasting 2 (1995) 67–69.

55. B. Wu, Model-free forecasting for nonlinear time series (with application to exchange rates), Computational Statistics and Data Analysis 19 (1995) 433–459.

56. J. T. Yao, H. L. Poh and T. Jasic, Foreign exchange rates forecasting with neural networks, in Proceedings of the International Conference on Neural Information Processing, Hong Kong, 1996, pp. 754–759.

57. J. T. Yao and C. L. Tan, A case study on using neural networks to perform technical forecasting of forex, Neurocomputing 34 (2000) 79–98.

58. G. Zhang and M. Y. Hu, Neural network forecasting of the British Pound/US Dollar exchange rate, Journal of Management Science 26(4) (1998) 495–506.

59. G. Zhang, B. E. Patuwo and M. Y. Hu, Forecasting with artificial neural networks: The state of the art, International Journal of Forecasting 14 (1998) 35–62.

60. G. P. Zhang and V. L. Berardi, Time series forecasting with neural network ensembles: An application for exchange rate prediction, Journal of the Operational Research Society 52 (2001) 652–664.

61. X. R. Zhang, Non-linear predictive models for intra-day foreign exchange trading, International Journal of Intelligent Systems in Accounting, Finance and Management 3 (1994) 293–302.

62. X. Zhang and J. Hutchinson, Simple architectures on fast machines: Practical issues in nonlinear time series prediction, in Time Series Prediction: Forecasting the Future and Understanding the Past, eds. A. S. Weigend and N. A. Gershenfeld (Addison-Wesley, Reading, MA, 1994), pp. 219–241.
