
Finding the Optimal Pre-Set Boundaries for Pairs Trading Strategy Based on Cointegration Technique

Abstract

Pairs trading is one of the arbitrage strategies that can be used in trading stocks on the stock market. This paper combines pairs trading with the cointegration technique to exploit stocks that are temporarily out of equilibrium. In determining which two stocks can be a pair, Banerjee, Dolado, Galbraith and Hendry (1993) and Vidyamurthy (2004) showed that the cointegration technique is more effective than the correlation criterion for extracting profit potential in temporary pricing anomalies between two stock prices driven by common underlying factors. By using stationary properties of cointegration errors following an AR(1) process, this paper explores the ways in which the pre-set boundaries chosen to open a trade can influence the minimum total profit over a specified trading horizon. The minimum total profit relates to the pre-set minimum profit per trade and the number of trades during the trading horizon. The higher the pre-set boundaries for opening trades, the higher the profit per trade but the lower the trade numbers. The number of trades over a specified trading horizon is estimated by using the average trade duration and the average inter-trade interval. For any pre-set boundaries, both of these values are estimated by making an analogy to the mean first-passage time. The aims of this paper are to develop a numerical algorithm to estimate the average trade duration, the average inter-trade interval, and the average number of trades, and to use these to find optimal pre-set boundaries that maximize the minimum total profit.
University of Wollongong
Research Online
Centre for Statistical & Survey Methodology
Working Paper Series Faculty of Informatics
2009
H. Puspaningrum
University of Wollongong, hp261@uow.edu.au
Y. X. Lin
University of Wollongong, yanxia@uow.edu.au
C. Gulati
University of Wollongong, cmg@uow.edu.au
Recommended Citation
Puspaningrum, H.; Lin, Y. X.; and Gulati, C., Finding the Optimal Pre-set Boundaries for Pairs Trading Strategy Based on
Cointegration Technique, Centre for Statistical and Survey Methodology, University of Wollongong, Working Paper 21-09, 2009,
25p.
http://ro.uow.edu.au/cssmwp/41
Copyright © 2008 by the Centre for Statistical & Survey Methodology, UOW. Work in progress,
no part of this paper may be reproduced without permission from the Centre.
Centre for Statistical & Survey Methodology, University of Wollongong, Wollongong NSW
2522. Phone +61 2 4221 5435, Fax +61 2 4221 4845. Email: anica@uow.edu.au
Heni Puspaningrum, Yan-Xia Lin, Chandra Gulati
School of Mathematics and Applied Statistics
University of Wollongong, Australia
Abstract
Pairs trading is one of the arbitrage strategies that can be used in trading stocks
on the stock market. It incorporates the use of a standard statistical model to
exploit stocks that are temporarily out of equilibrium. In determining
which two stocks can be a pair, Banerjee et al. (1993) show that the cointegration
technique is more effective than the correlation criterion for extracting profit potential in
temporary pricing anomalies for share prices driven by common underlying factors.
This paper explores the ways in which the pre-set boundaries chosen to open a
trade can influence the minimum total profit over a specified trading horizon. The
minimum total profit relates to the pre-set minimum profit per trade and the number
of trades during the trading horizon. The higher the pre-set boundaries for opening
trades, the higher the profit per trade but the lower the trade numbers. The opposite
applies for lowering the boundary values. The number of trades over a specified
trading horizon is determined jointly by the average trade duration and the average
inter-trade interval. For any pre-set boundaries, both of these values are estimated
by making an analogy to the mean first-passage time. The aims of this paper are
to develop a numerical algorithm to estimate the average trade duration, the average
inter-trade interval, and the average number of trades, and then to use them to find
the optimal pre-set boundaries that maximize the minimum total profit for
a cointegration error following an AR(1) process.
Keywords: pairs trading, cointegration, integral equation, the mean first-passage
time.
1 Introduction
Pairs trading was first discovered in the early 1980s by the quantitative analyst
Nunzio Tartaglia and a team of physicists, computer scientists and mathematicians,
who did not have a background in finance. Their idea was to develop statistical
rules to find ways to perform arbitrage trades, and take the ‘skill’ out of trading
(Gatev et al. 1999, 2006).
Pairs trading works by taking the arbitrage opportunity of temporary anomalies
between related stocks which have long-run equilibrium. When such an event occurs,
one stock will be overvalued relative to the other stock. We can then invest in a
two-stock portfolio (a pair) where the overvalued stock is sold (short position) and
the undervalued stock is bought (long position). The trade is closed out by taking
the opposite position of these stocks after the stocks have settled back into their
long-run relationship. The profit is captured from these short-term discrepancies in
the two stock prices. Since the profit does not depend on the movement of the market,
pairs trading is a market-neutral investment strategy.
According to Gatev et al. (1999, 2006), the growing popularity
of the pairs trading strategy may also pose a problem because the opportunities to
trade become much smaller, as many other arbitrageurs are aware of the strategy
and may choose to enter at an earlier point of deviation from the equilibrium. The
profit from the pairs trading strategy in recent times is less than it was before the
strategy became widely known. However, Gillespie and Ulph (2001), Habak (2002),
and Hong and Susmel (2003) show that significant returns could still be made in
more recent times with the strategy. An extensive discussion of pairs trading can be
found in Gatev et al. (1999, 2006), Vidyamurthy (2004), Whistler (2004) and Ehrman
(2006).
In determining which two stocks can be a pair, people commonly choose two
stocks that are highly correlated (see Stone (http://www.investopedia.com), Avery-Wright
(http://compareshares.com.au), Goodboy (http://biz.yahoo.com) and Ehrman (2006)).
However, Banerjee et al. (1993) show that the cointegration technique
is more effective than correlation for extracting profit potential, as the cointegration
relationship guarantees that the two stocks have a long-run stationary relationship.
Gillespie and Ulph (2001), Hong and Susmel (2003), Vidyamurthy (2004) and Herlemont
(www.yats.com) also suggest this technique. However, no one has developed
a pairs trading strategy based on cointegration by quantitatively estimating the average
trade duration, the average inter-trade interval, the average number of trades and
the minimum total profit, and then finding the optimal pre-set boundaries (thresholds)
to open the pair trades. The following paragraphs briefly explain
these terms and pairs trading based on cointegration.
Substantial literature (see, for example, Fama and French, 1988; Liu et al., 1997;
Narayan, 2005; and references cited therein) confirms that stock prices are characterized
by a unit root, which means that stock prices are I(1) non-stationary time series.
Sometimes an appropriate linear combination of two I(1) non-stationary time series
forms a stationary time series. If this happens, we say these two I(1) series are
cointegrated (see footnote 4).
In order to determine whether cointegration exists between two time series, there
are two techniques that are generally used: the Engle-Granger two-step approach,
developed by Engle and Granger (1987), and the technique developed by Johansen
(1988). The Engle-Granger approach uses OLS (Ordinary Least Squares) to estimate
the long-run steady-state relationship between the variables in the model, and then
tests whether the residual from the equation is stationary or not. Even though it is
quite easy to use, there are some criticisms of this approach, e.g.: (1) this test for
cointegration is likely to have lower power than the alternative tests; (2) its finite
sample estimates of long-run relationships are potentially biased; and (3) inferences
cannot be drawn using standard t-statistics about the significance of the parameters
of the static long-run model (Harris, 1995). To overcome the problems found in
the Engle-Granger approach, Johansen's approach uses a vector error-correction
model (VECM) so that all variables can be endogenous. More discussion about
these two methods can be found in Harris (1995). One more advantage of Johansen's
(1988) technique is that it has become available in user-friendly software, namely
PcFiml (version 8), which has been used for running the cointegration analysis in
this paper.
The pairs trading strategy, using a cointegration technique, is briefly introduced
below.
Consider two shares S1 and S2 whose prices are I(1). If the share prices $P_{S1,t}$ and
$P_{S2,t}$ are cointegrated, there exist cointegration coefficients 1 and $\beta$ corresponding to
$P_{S1,t}$ and $P_{S2,t}$ respectively, such that a cointegration relationship can be constructed
as follows:
$$P_{S1,t} - \beta P_{S2,t} = \epsilon^*_t, \tag{1}$$
where $\epsilon^*_t$ (the actual cointegration error) is a stationary time series.
Define $\epsilon_t$ (the adjusted cointegration error) as follows:
$$\epsilon_t = \epsilon^*_t - E(\epsilon^*_t), \tag{2}$$
where $\epsilon_t$ is also a stationary time series and $E(\cdot)$ denotes the expectation. The actual
cointegration error $\epsilon^*_t$ is adjusted so that the mean of the adjusted cointegration
error, $E(\epsilon_t)$, is zero in order to simplify subsequent analysis.
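As a small illustration of Eqs.(1) and (2), the sketch below computes the actual and adjusted cointegration errors from two price series. It is only a sketch under stated assumptions: the function name is hypothetical, $\beta$ is taken as given (in the paper it comes from the cointegration analysis), and the sample mean is used as a stand-in for $E(\epsilon^*_t)$.

```python
def adjusted_cointegration_error(p1, p2, beta):
    """Return (eps_star, eps) for price lists p1 (share S1) and p2 (share S2)."""
    # Eq. (1): actual cointegration error eps*_t = P_S1,t - beta * P_S2,t
    eps_star = [a - beta * b for a, b in zip(p1, p2)]
    # Assumption: the sample mean estimates E(eps*_t)
    mean_err = sum(eps_star) / len(eps_star)
    # Eq. (2): adjusted cointegration error, which has sample mean zero
    eps = [e - mean_err for e in eps_star]
    return eps_star, eps


# Made-up prices, purely for illustration
eps_star, eps = adjusted_cointegration_error([10.0, 11.0, 12.0], [4.0, 5.0, 4.0], beta=2.0)
print(eps_star, eps)
```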
We have to set an upper-bound $U$ ($U > 0$) and a lower-bound $L$ ($L < 0$) before
we apply the pairs trading strategy. These boundaries act as thresholds to
open a trade. Let $N_{S1}$ and $N_{S2}$ denote the number of shares S1 and S2 respectively.
Two types of trades, U-trades and L-trades, are considered. For a U-trade, a trade is
opened when the adjusted cointegration error is higher than or equal to the pre-set
upper-bound $U$ by selling $N_{S1}$ of S1 shares and buying $N_{S2}$ of S2 shares, and
is closed when the adjusted cointegration error is less than or equal to zero.
This is done by buying $N_{S1}$ of S1 shares and selling $N_{S2}$ of S2 shares. The opposite
happens for the L-trade, where a trade is opened when the adjusted cointegration
error is less than or equal to the pre-set lower-bound $L$ by buying $N_{S1}$ of S1 shares
and selling $N_{S2}$ of S2 shares. The trade is closed when the adjusted cointegration
error is higher than or equal to zero by selling $N_{S1}$ of S1 shares and buying back
$N_{S2}$ of S2 shares. It is assumed that the actual cointegration error ($\epsilon^*_t$) as well as
the adjusted cointegration error ($\epsilon_t$) are stationary processes and have symmetric
distributions, so the lengths from the upper-bound $U$ to the mean and from the
lower-bound $L$ to the mean are the same. As a result, the expected numbers of
U-trades and L-trades are the same. For details, see Lin et al. (2003, 2006).
Footnote 4: I(1) means the time series is non-stationary but its first difference is stationary.
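The opening and closing rules just described can be sketched as a single pass over an adjusted cointegration error series. This is a minimal illustration, not the authors' implementation: the function name and the sample series are hypothetical, and it assumes that no new trade is opened while another trade is still open.

```python
def find_trades(eps, U, L):
    """Return a list of (kind, open_index, close_index) for U- and L-trades."""
    trades = []
    open_kind = None   # "U", "L", or None when no trade is open
    open_t = None
    for t, e in enumerate(eps):
        if open_kind is None:
            if e >= U:            # open a U-trade: sell S1, buy S2
                open_kind, open_t = "U", t
            elif e <= L:          # open an L-trade: buy S1, sell S2
                open_kind, open_t = "L", t
        elif open_kind == "U" and e <= 0:   # U-trade closes at or below the mean
            trades.append(("U", open_t, t))
            open_kind = None
        elif open_kind == "L" and e >= 0:   # L-trade closes at or above the mean
            trades.append(("L", open_t, t))
            open_kind = None
    return trades


# Made-up adjusted errors, purely for illustration
series = [0.2, 1.6, 1.0, -0.1, -1.8, -0.5, 0.3, 1.5, 0.0]
trades = find_trades(series, U=1.5, L=-1.5)
print(trades)
```

In this sketch the trade duration of each entry is `close_index - open_index`, and the inter-trade interval is the gap between one close and the next open of the same kind.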
In our discussion, the following terms will be required.
Trade duration is the time between opening and closing a U-trade (an L-trade).
Inter-trade interval is the time between two consecutive U-trades (L-trades),
that is, the time between closing a U-trade (an L-trade) and opening the next
U-trade (L-trade). We assume that no new trade (neither U-trade nor
L-trade) is opened while the previous trade has not yet been closed.
Period is the sum of the trade duration and the inter-trade interval for U-trades
(L-trades).
To simplify the discussion in this paper, we subsequently focus mainly on the U-
trade case unless stated otherwise. The expected trade durations and the expected
inter-trade intervals are estimated to determine the expected number of U-trades
over a specified trading horizon. As the expected numbers of U-trades and L-trades
are the same, the expected number of U-trades can be doubled to obtain the expected
number of trades.
Figure 1 shows two cointegrated shares, i.e. Transonic Travel Ltd (TNS) and
Travel.com.au (TVL), and their adjusted cointegration error, denoted by eps. Both
are travel companies listed on the Australian Stock Exchange. In this case, TNS is
S1 and TVL is S2. Further description of the cointegration relationship of these two
shares can be found in Section 5. At time t = 5, the adjusted cointegration error
of the two stocks (eps) is higher than the upper-bound U, so a trade is opened by
selling TNS and buying TVL. At t = 14, eps is less than its mean of 0, so the
trade is closed by taking the opposite position. Figure 1 also illustrates an example
of trade duration, inter-trade interval and period.
[Figure 1: Example of two cointegrated shares (TNS and TVL) with E(eps) = 0. Horizontal axis: time (day), from 0 to 50; vertical axis: stock prices and cointegration error (eps). The plot shows the TNS, TVL and eps series, the bounds U and L, the open and close points of the trades with the corresponding buy and sell actions, and an example of trade duration, inter-trade interval and period.]
Lin et al. (2003, 2006) develop a pairs trading strategy based on a cointegration
technique called the cointegration coefficients weighted (CCW) rule. The CCW rule
works by trading the numbers of S1 and S2 shares in proportion to the cointegration
coefficients to achieve a pre-set minimum profit per trade. The pre-set minimum
profit per trade corresponds to the pre-set boundaries $U$ and $L$ chosen to open
trades. However, they did not discuss the optimality of the pre-set boundaries.
Developing a numerical algorithm to calculate the optimal pre-set boundary values
is the main target of this paper.
We determine the optimality of the pre-set boundary values by maximizing the
minimum total profit (MTP) over a specified trading horizon. The MTP corre-
sponds to the pre-set minimum profit per trade and the number of trades during
the trading horizon. As the derivation of the pre-set minimum profit per trade is
already provided in Lin et al. (2003, 2006), this paper will provide the estimated
number of trades. The number of trades is also influenced by the distance of the
pre-set boundaries from the long-run cointegration equilibrium. The higher the pre-
set boundaries for opening trades, the higher the minimum profit per trade but the
lower the trade numbers. The opposite applies for lowering the boundary values.
The number of trades over a specified trading horizon is determined jointly by
the average trade duration and the average inter-trade interval. For any pre-set
boundaries, both of those values are estimated by making an analogy to the mean
first-passage times for an AR(1) process. This paper applies an integral equation
approach to evaluate the mean first-passage times from Basak and Ho (2004).
The paper is organized as follows. Section 2 gives a brief summary of the trading
rules to obtain the pre-set minimum profit per trade. In Section 3, we give a brief
description of the mean first-passage time of an AR(1) process using an integral
equation approach and apply the concepts to estimate the average trade duration,
the average inter-trade interval and then the number of trades in the pairs trading
strategy. In Section 4, a numerical algorithm is developed to calculate the optimal
pre-set upper-bound, denoted $U_o$, that maximizes the minimum total profit.
Section 5 provides two empirical examples, i.e. BHP-RIO and TNS-TVL, and the
last section contains a discussion and conclusion.
2 Minimum Profit Per Trade
This section will explain how to determine the number of shares of S1 and S2
needed to get the pre-set minimum profit per trade. Using Eqs.(1) and (2), this
paper follows the derivation of the minimum profit per trade as in Lin et al. (2003,
2006). Consider the following assumptions.
1. The two share price series are cointegrated over the relevant time period.
2. Long (buy) and short (sell) positions always apply to the same shares in the
share-pair.
3. Short sales are permitted or possible through a broker and there is no interest
charged for the short sales and no cost for trading.
4. β > 0
Assumptions 1 and 2 are fairly non-controversial. The other assumptions are
applied to simplify the analysis. To support the fourth assumption, we have examined
seven share pairs (ANZ-ADB, ABC-HAN, ABC-BLD, CCL-CHB, HAN-RIN,
BHP-RIO, and TNS-TVL; see footnote 5) from the Australian Stock Exchange using daily data
for 2004 (www.finance.yahoo.com.au) and found that the $\beta$'s for those cointegrated
shares were positive.
2.1 U-trades
Consider two cointegrated shares, S1 and S2, as in Eq.(1). By using Assumption 1,
we can conclude the following.
When $\epsilon_t \ge U$, the price of one unit share S1 is higher than or equal to the
price of $\beta$ unit shares S2, relative to their equilibrium relationship. In other
words, S1 is overvalued while S2 is undervalued, relative to their equilibrium
relationship. A trade is opened at this time. Let $t_o$ represent the time of
opening a trade position.
If $\epsilon_t \le 0$, the price of one unit share S1 is less than or equal to the price of $\beta$
unit shares S2, relative to their equilibrium relationship. In other words, S1 is
undervalued while S2 is overvalued, relative to their equilibrium relationship.
The trade is closed at this time. Let $t_c$ represent the time of closing out
a trade position.
Footnote 5: ANZ Banking Group Ltd (ANZ), Adelaide Bank (ADB), Adelaide Brighton (ABC), Boral Ltd
(BLD), BHP Billiton Ltd (BHP), Coca-cola Amatil (CCL), Coca-cola Hellenic (CHB), Hanson Plc
(HAN), Rinker Group Ltd (RIN), Rio Tinto Ltd (RIO), Transonic Travel (TNS), Travel.com.au
(TVL).
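When the adjusted cointegration error is higher than or equal to the pre-set
upper-bound $U$ at time $t_o$, a trade is opened by selling $N_{S1}$ of S1 shares at time $t_o$
for $N_{S1} P_{S1,t_o}$ dollars and buying $N_{S2}$ of S2 shares at time $t_o$ for $N_{S2} P_{S2,t_o}$ dollars.
When the adjusted cointegration error has settled back to its mean at time $t_c$,
the positions are closed out by simultaneously selling the long position shares for
$N_{S2} P_{S2,t_c}$ dollars and buying back the $N_{S1}$ of S1 shares for $N_{S1} P_{S1,t_c}$ dollars.
Profit per trade will be
$$P = N_{S2}(P_{S2,t_c} - P_{S2,t_o}) + N_{S1}(P_{S1,t_o} - P_{S1,t_c}). \tag{3}$$
According to the CCW rule as in Lin et al. (2003, 2006), if the weights $N_{S2}$
and $N_{S1}$ are chosen in proportion to the cointegration coefficients, i.e. $N_{S1} = 1$
and $N_{S2} = \beta$, the minimum profit per trade can be determined as follows (see footnote 6):
$$\begin{aligned}
P &= N_{S2}(P_{S2,t_c} - P_{S2,t_o}) + N_{S1}(P_{S1,t_o} - P_{S1,t_c}) \\
  &= \beta[P_{S2,t_c} - P_{S2,t_o}] + [P_{S1,t_o} - P_{S1,t_c}] \\
  &= \beta[P_{S2,t_c} - P_{S2,t_o}] + [(\epsilon_{t_o} + E(\epsilon^*_t) + \beta P_{S2,t_o}) - (\epsilon_{t_c} + E(\epsilon^*_t) + \beta P_{S2,t_c})] \\
  &= (\epsilon_{t_o} - \epsilon_{t_c}) \ge U.
\end{aligned} \tag{4}$$
The third line substitutes $P_{S1,t} = \epsilon_t + E(\epsilon^*_t) + \beta P_{S2,t}$ from Eqs.(1) and (2), and the
inequality holds because $\epsilon_{t_o} \ge U$ and $\epsilon_{t_c} \le 0$.
Thus, by trading the shares with weights in proportion to the cointegration
coefficients, the profit per trade is at least $U$ dollars.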
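The identity in Eq.(4) can be checked numerically. The sketch below uses made-up prices and a hypothetical helper name; it builds S1 prices from chosen adjusted errors via Eqs.(1) and (2) and confirms that the U-trade profit with $N_{S1} = 1$ and $N_{S2} = \beta$ equals $\epsilon_{t_o} - \epsilon_{t_c}$, whatever S2 does in between.

```python
def u_trade_profit(p1_open, p2_open, p1_close, p2_close, beta):
    """Profit of one U-trade under the CCW rule, Eq. (3) with N_S1 = 1, N_S2 = beta."""
    n_s1, n_s2 = 1.0, beta
    return n_s2 * (p2_close - p2_open) + n_s1 * (p1_open - p1_close)


# Hypothetical values, chosen only to illustrate Eq. (4)
beta, mean_err, U = 0.5, 2.0, 1.5      # mean_err plays the role of E(eps*_t)
eps_open, eps_close = 1.6, -0.1        # eps_open >= U and eps_close <= 0
p2_open, p2_close = 10.0, 11.0         # S2 price at opening and at closing

# Eqs. (1) and (2) rearranged: P_S1 = eps + E(eps*) + beta * P_S2
p1_open = eps_open + mean_err + beta * p2_open
p1_close = eps_close + mean_err + beta * p2_close

profit = u_trade_profit(p1_open, p2_open, p1_close, p2_close, beta)
print(profit, eps_open - eps_close)
```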
2.2 L-trades
For an L-trade, the pre-set lower-bound $L$ can be set to be $-U$. So, a trade is
opened when $\epsilon_t \le -U$ by selling S2 and buying S1.
Profit per trade will be:
$$P = N_{S2}(P_{S2,t_o} - P_{S2,t_c}) + N_{S1}(P_{S1,t_c} - P_{S1,t_o}). \tag{5}$$
Analogous to the derivation of the minimum profit per trade for a U-trade, let
$N_{S2} = \beta$ and $N_{S1} = 1$. Thus,
$$\begin{aligned}
P &= \beta[P_{S2,t_o} - P_{S2,t_c}] + [P_{S1,t_c} - P_{S1,t_o}] \\
  &= \beta[P_{S2,t_o} - P_{S2,t_c}] + [(\epsilon_{t_c} + E(\epsilon^*_t) + \beta P_{S2,t_c}) - (\epsilon_{t_o} + E(\epsilon^*_t) + \beta P_{S2,t_o})] \\
  &= (\epsilon_{t_c} - \epsilon_{t_o}) \ge U.
\end{aligned} \tag{6}$$
So, trading 1 unit share S1 and $\beta$ unit shares S2, either in U-trades or L-trades,
would make a minimum profit per trade of $U$. However, for L-trades we
need to borrow some money because a negative $\epsilon_t$ at the opening time means that
the income from the short sales (selling $\beta$ unit shares S2) is insufficient to buy 1
unit share S1.
Footnote 6: For simplicity, fractional share holdings are permitted.
3 Mean First-passage Time of an AR(1) Process
and Pairs Trading
As a stationary process, the actual cointegration error ($\epsilon^*_t$) as well as the adjusted
cointegration error ($\epsilon_t$) may follow linear stationary processes (e.g. white noise,
autoregressive, moving average, and autoregressive-moving average processes),
non-linear stationary processes or other stationary processes. We have examined
seven share pairs (ANZ-ADB, ABC-HAN, ABC-BLD, CCL-CHB, HAN-RIN, BHP-RIO,
and TNS-TVL) from the Australian Stock Exchange using daily data for 2004
(www.finance.yahoo.com.au). All of these share pairs produce cointegration errors
that follow AR(1) processes. Elliott (2005) and Herlemont (www.yats.com) also suggested
AR(1) processes for modeling pairs trading, but they used the Ornstein-Uhlenbeck
process, which is the continuous-time counterpart of an AR(1) process, to estimate
the optimal boundaries. However, due to the complexity of stochastic analysis in
the Ornstein-Uhlenbeck process, their results are difficult to apply in practice.
Therefore, in this paper we focus on an AR(1) process and use an integral
equation approach from Basak and Ho (2004), which is more practicable than the
Ornstein-Uhlenbeck process.
This section provides steps to obtain an estimate of the number of trades
over a specified trading horizon. Firstly, we give a brief summary of the mean
first-passage time of an AR(1) process using an integral equation approach from Basak
and Ho (2004). Secondly, a numerical scheme is provided to calculate the mean
first-passage time of an AR(1) process using an integral equation approach. Thirdly, the
average trade duration and the average inter-trade interval are estimated using
an analogy to the mean first-passage time. Fourthly, the number of trades over a
specified trading horizon is approximated using the average trade duration and the
average inter-trade interval.
3.1 The mean first-passage time of an AR(1) process using
an integral equation approach
Consider an AR(1) process:
$$Y_t = \phi Y_{t-1} + \xi_t, \tag{7}$$
where $-1 < \phi < 1$ and $\xi_t \sim \text{i.i.d. } N(0, \sigma_\xi^2)$.
The first-passage time $T_{a,b}(y_0)$ is defined as
$$T_{a,b}(y_0) = \inf\{t : Y_t > b \text{ or } Y_t < a \mid a \le Y_0 = y_0 \le b\}. \tag{8}$$
Particularly,
$$T_a(y_0) = T_{a,\infty}(y_0) = \inf\{t : Y_t < a \mid Y_0 = y_0 \ge a\} \tag{9}$$
and
$$T_b(y_0) = T_{-\infty,b}(y_0) = \inf\{t : Y_t > b \mid b \ge Y_0 = y_0\}. \tag{10}$$
$E(T_{a,b}(y_0))$, $E(T_a(y_0))$, and $E(T_b(y_0))$ denote the mean first-passage times of
$T_{a,b}(y_0)$, $T_a(y_0)$, and $T_b(y_0)$ respectively. Basak and Ho (2004) derive the mean
first-passage time of an AR(1) process using an integral equation approach.
We define a discrete-time real-valued Markov process $\{Y_t\}$ on a probability space
$\{\Omega, \mathcal{F}, P\}$ with stationary continuous transition density $f(y|x)$, continuous in both
$x$ and $y$. The term $f(y|x)$ denotes the transition density of reaching $y$ at the next
step given that the present state is $x$. Suppose that $Y_0 = y_0 \in [a, b]$. The mean
first-passage time over the interval $[a, b]$ of an AR(1) process, starting at initial state
$y_0 \in [a, b]$, is given by
$$E(T_{a,b}(y_0)) = \int_a^b E(T_{a,b}(u)) f(u|y_0)\,du + 1. \tag{11}$$
For an AR(1) process as in Eq.(7), $f(u|y_0)$ is a normal density with mean
$\phi y_0$ and variance $\sigma_\xi^2$. Thus,
$$E(T_{a,b}(y_0)) = \frac{1}{\sqrt{2\pi}\sigma_\xi} \int_a^b E(T_{a,b}(u)) \exp\left(-\frac{(u - \phi y_0)^2}{2\sigma_\xi^2}\right) du + 1. \tag{12}$$
Details of the derivation can be found in Basak and Ho (2004). The integral
equation in Eq.(12) is a Fredholm equation of the second kind and can be solved
numerically using the Nystrom method (Atkinson, 1997), as in the next subsection.
3.2 Numerical scheme
If we want to calculate $E(T_b(y_0))$, that is, the mean first-passage time over a given
level $b$ of an AR(1) process starting at initial state $y_0$, it can be computed by adding
a lower boundary $a$ first. Since $E(T_{a,b}(y_0))$ converges monotonically to $E(T_b(y_0))$ as
$a \to -\infty$, the approximation of $E(T_b(y_0))$ can be obtained by evaluating $E(T_{a,b}(y_0))$
as $a \to -\infty$ instead.
Consider $E(T_{a,b}(y_0))$ as in Eq.(12). Now, define $h = (b - a)/n$, where $n$ is the
number of partitions in $[a, b]$ and $h$ is the length of each partition.
Using the trapezoid integration rule (Atkinson, 1997):
$$\int_a^b f(u)\,du \approx \frac{h}{2}\left[w_0 f(u_0) + w_1 f(u_1) + \cdots + w_{n-1} f(u_{n-1}) + w_n f(u_n)\right], \tag{13}$$
where $u_0 = a$, $u_i = a + ih$, $u_n = b$, $i = 1, \ldots, n$, and the weights $w_i$ for the
corresponding nodes are
$$w_i = \begin{cases} 1, & \text{for } i = 0 \text{ and } i = n, \\ 2, & \text{for others.} \end{cases}$$
Thus, the integral term in Eq.(12) can be approximated by
$$\int_a^b E(T_{a,b}(u)) \exp\left(-\frac{(u - \phi y_0)^2}{2\sigma_\xi^2}\right) du \approx \frac{h}{2} \sum_{j=0}^{n} w_j E(T_{a,b}(u_j)) \exp\left(-\frac{(u_j - \phi y_0)^2}{2\sigma_\xi^2}\right). \tag{14}$$
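Let $E_n(T_{a,b}(y_0))$ denote the approximation of $E(T_{a,b}(y_0))$ using $n$ partitions.
Thus, the expectation in Eq.(12) using $n$ partitions can be estimated by
$$E_n(T_{a,b}(y_0)) \approx \frac{h}{2\sqrt{2\pi}\sigma_\xi} \sum_{j=0}^{n} w_j E_n(T_{a,b}(u_j)) \exp\left(-\frac{(u_j - \phi y_0)^2}{2\sigma_\xi^2}\right) + 1. \tag{15}$$
Set $y_0$ as $u_i$ for $i = 0, 1, \ldots, n$ and reformulate Eq.(15) as follows:
$$E_n(T_{a,b}(u_i)) - \sum_{j=0}^{n} \frac{h}{2\sqrt{2\pi}\sigma_\xi} w_j E_n(T_{a,b}(u_j)) \exp\left(-\frac{(u_j - \phi u_i)^2}{2\sigma_\xi^2}\right) = 1, \tag{16}$$
and then solve the following linear equations in (17) to obtain an approximation of
$E_n(T_{a,b}(u_j))$:
$$\begin{pmatrix}
1 - K(u_0, u_0) & -K(u_0, u_1) & \cdots & -K(u_0, u_n) \\
-K(u_1, u_0) & 1 - K(u_1, u_1) & \cdots & -K(u_1, u_n) \\
\vdots & \vdots & \ddots & \vdots \\
-K(u_n, u_0) & -K(u_n, u_1) & \cdots & 1 - K(u_n, u_n)
\end{pmatrix}
\begin{pmatrix}
E_n(T_{a,b}(u_0)) \\ E_n(T_{a,b}(u_1)) \\ \vdots \\ E_n(T_{a,b}(u_n))
\end{pmatrix}
=
\begin{pmatrix}
1 \\ 1 \\ \vdots \\ 1
\end{pmatrix}. \tag{17}$$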
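In Eq.(17), the kernel entries are
$$K(u_i, u_j) = \frac{h}{2\sqrt{2\pi}\sigma_\xi} w_j \exp\left(-\frac{(u_j - \phi u_i)^2}{2\sigma_\xi^2}\right).$$

Table 1: Mean first-passage time of level 0, given $y_0 = 1.5$, for $Y_t$ in (7)

| $\sigma_\xi^2$ | $\phi$ | Integral equation | Simulation |
|---|---|---|---|
| 0.49 | 0.5 | 3.9181 (b = 5, n = 50) | 3.9419 |
| 0.49 | 0 | 2 (b = 5, n = 50) | 1.9918 |
| 0.49 | -0.5 | 1.2329 (b = 5, n = 50) | 1.2341 |
| 1 | 0.5 | 3.5401 (b = 7, n = 70) | 3.5571 |
| 1 | 0 | 2 (b = 7, n = 70) | 2.0055 |
| 1 | -0.5 | 1.3666 (b = 7, n = 70) | 1.3636 |
| 4 | 0.5 | 3.0467 (b = 14, n = 140) | 3.0512 |
| 4 | 0 | 2 (b = 14, n = 140) | 1.9959 |
| 4 | -0.5 | 1.5626 (b = 14, n = 140) | 1.5725 |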
Examples of numerical results for some AR(1) processes are provided in Table
1. We compare the results using an integral equation approach and simulation.
For given $\sigma_\xi^2$, $\phi$, and $y_0 = 1.5$, the mean first-passage time of level zero is
calculated using the integral equation approach. We use different $b$ and $n$ in order to make
the partition length $h$ the same for each case. The results show that $h = 0.1$
is enough to get results similar to the simulation. For the simulation, we generate an
AR(1) process as in Eq.(7) for given $\sigma_\xi^2$ and $\phi$. Using the initial state $y_0 = 1.5$,
the time needed for the process to cross zero for the first time is recorded. The
simulation is repeated 1000 times and then we calculate the average. The table
shows that the simulation results confirm the results from the integral equation
approach.
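The scheme in Eqs.(13) to (17) can be sketched in a few lines. The following is a minimal illustration and not the authors' code: the function names are hypothetical, the linear system is solved with a plain Gaussian elimination so that only the standard library is needed (a numerical library's linear solver would be the usual choice), and the value at the start point $y_0$ is obtained by plugging the node solutions back into Eq.(15).

```python
import math


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            if f != 0.0:
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                       # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mean_first_passage_time(y0, a, b, phi, sigma, n):
    """Approximate E(T_{a,b}(y0)) for Y_t = phi*Y_{t-1} + xi_t, xi_t ~ N(0, sigma^2)."""
    h = (b - a) / n
    u = [a + i * h for i in range(n + 1)]                     # nodes u_0..u_n
    w = [1.0 if j in (0, n) else 2.0 for j in range(n + 1)]   # trapezoid weights
    c = h / (2.0 * math.sqrt(2.0 * math.pi) * sigma)

    def kernel(x, j):
        # K(x, u_j) as defined for Eq. (17), with x the current state
        return c * w[j] * math.exp(-(u[j] - phi * x) ** 2 / (2.0 * sigma ** 2))

    # Build (I - K) and solve (I - K) E = 1, Eq. (17)
    A = [[(1.0 if i == j else 0.0) - kernel(u[i], j) for j in range(n + 1)]
         for i in range(n + 1)]
    E = solve_linear(A, [1.0] * (n + 1))
    # Eq. (15) evaluated at y0 with the node solutions
    return 1.0 + sum(kernel(y0, j) * E[j] for j in range(n + 1))


# Settings taken from the sigma^2 = 1 rows of Table 1 (a = 0, b = 7, n = 70)
mfpt_white = mean_first_passage_time(1.5, 0.0, 7.0, 0.0, 1.0, 70)
mfpt_ar = mean_first_passage_time(1.5, 0.0, 7.0, 0.5, 1.0, 70)
print(mfpt_white, mfpt_ar)
```

For $\phi = 0$ the exact answer is 2, since each step falls below zero with probability 1/2, which gives a quick check of the discretization. The upper limit $b$ here stands in for infinity, so it has to be large relative to $\sigma_\xi$.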
3.3 Trade durations and inter-trade intervals
Consider the adjusted cointegration error and assume that $\epsilon_t$ in Eq.(2) follows an
AR(1) process, i.e.:
$$\epsilon_t = \phi \epsilon_{t-1} + a_t, \quad \text{where } a_t \sim \text{i.i.d. } N(0, \sigma_a^2). \tag{18}$$
As explained in Section 1, the trade duration is the time between opening and
closing a trade. For a U-trade, a trade is opened when $\epsilon_t$ is higher than or equal to
the pre-set upper-bound $U$ and it is closed when $\epsilon_t$ is less than or equal to 0, which
is the mean of the adjusted cointegration error. Suppose $\epsilon_t$ is at $U$, so a U-trade is
opened. To calculate the expected trade duration, we would like to know the time
needed on average for $\epsilon_t$ to pass 0 for the first time. Thus, calculating the expected
trade duration is the same as calculating the mean first-passage time for $\epsilon_t$ to pass
0 for the first time, given the initial value is $U$. Let $TD_U$ denote the expected
trade duration corresponding to the pre-set upper-bound $U$. Using Eq.(12), $TD_U$ is
defined as follows:
$$TD_U := E(T_{0,\infty}(U)) = \lim_{b \to \infty} \frac{1}{\sqrt{2\pi}\sigma_a} \int_0^b E(T_{0,b}(s)) \exp\left(-\frac{(s - \phi U)^2}{2\sigma_a^2}\right) ds + 1. \tag{19}$$
Similarly, the inter-trade interval is the waiting time needed to open
a trade after the previous trade is closed. For a U-trade, an open U-trade
has to be closed when $\epsilon_t$ is at 0. To calculate the expected
inter-trade interval, we would like to know the time needed on average for $\epsilon_t$ to pass
the pre-set upper-bound $U$ for the first time, so that we can open a U-trade again. Thus,
calculating the expected inter-trade interval is the same as calculating the mean
first-passage time for $\epsilon_t$ to pass $U$, given the initial value is 0. Let $I_U$ denote the
expected inter-trade interval for the pre-set upper-bound $U$:
$$I_U := E(T_{-\infty,U}(0)) = \lim_{b \to -\infty} \frac{1}{\sqrt{2\pi}\sigma_a} \int_b^U E(T_{b,U}(s)) \exp\left(-\frac{s^2}{2\sigma_a^2}\right) ds + 1. \tag{20}$$
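As a sanity check on Eqs.(19) and (20), the white-noise special case $\phi = 0$ can be solved by hand. This worked case is an added illustration, not part of the original derivation; $\Phi$ denotes the standard normal distribution function.

```latex
% Special case \phi = 0: the transition density f(s | y_0) does not depend on y_0,
% so the mean first-passage time is the same constant c for every start point
% and the integral in Eq. (11) reduces to c times the probability of staying inside.
\begin{align*}
TD_U:&\quad c = c\,P(a_t \ge 0) + 1 = \tfrac{c}{2} + 1
      \;\Longrightarrow\; TD_U = 2, \\
I_U:&\quad c = c\,\Phi\!\left(\tfrac{U}{\sigma_a}\right) + 1
      \;\Longrightarrow\; I_U = \frac{1}{1 - \Phi(U/\sigma_a)} .
\end{align*}
% Example: U = 1.5, \sigma_a = 1 gives 1 - \Phi(1.5) \approx 0.0668, so I_U \approx 15.
```

The value $TD_U = 2$ agrees with the $\phi = 0$ rows of Table 1. For $\phi \ne 0$ no such shortcut is available and the numerical scheme of Section 3.2 is needed.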
3.4 Number of trades over a trading horizon
The expected number of U-trades $E(N_{UT})$ and the expected number of periods
corresponding to U-trades $E(N_{UP})$ over a time horizon $[0, T]$ are defined as follows:
$$E(N_{UT}) = \sum_{k=1}^{\infty} k P(N_{UT} = k)$$
and
$$E(N_{UP}) = \sum_{k=1}^{\infty} k P(N_{UP} = k).$$
In this subsection, we want to derive the expected number of U-trades $E(N_{UT})$
over a specified trading horizon. However, it is difficult to evaluate the exact value
of $E(N_{UT})$. Thus, a possible range of values of $E(N_{UT})$ is provided.
As explained in Section 1, $\text{period}_U$ is defined as the sum of the trade duration
and the inter-trade interval for U-trades. Thus, the expected $\text{period}_U$ is given by
$$E(\text{period}_U) = TD_U + I_U.$$
First, we evaluate the expected number of $\text{period}_U$'s, $E(N_{UP})$, in the time
horizon $[0, T]$, as it has a direct connection to the trade duration and the inter-trade
interval. Then, the relationship between $N_{UT}$ and $N_{UP}$ is used to obtain a possible
range of values of $E(N_{UT})$.
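Let $\text{period}_{U_i}$ denote the length of the period corresponding to the $i$th U-trade.
Thus,
$$T \ge E\left[\sum_{i=1}^{N_{UP}} \text{period}_{U_i}\right] = \sum_{k=1}^{\infty} \left[\sum_{i=1}^{k} E(\text{period}_{U_i})\right] P(N_{UP} = k). \tag{21}$$
Since the period depends on the distribution of $\epsilon_t$, which is a stationary time
series, $E(\text{period}_{U_i})$ will be the same for all $i$. Thus, $E(\text{period}_{U_i}) = E(\text{period}_U)$ and
$$T \ge E(\text{period}_U) \sum_{k=1}^{\infty} k P(N_{UP} = k) = E(\text{period}_U) E(N_{UP}). \tag{22}$$
Thus,
$$E(N_{UP}) \le \frac{T}{E(\text{period}_U)} = \frac{T}{TD_U + I_U}. \tag{23}$$
Analogously to the derivation that leads to (23),
$$T < E\left[\sum_{i=1}^{N_{UP}+1} \text{period}_{U_i}\right] = E(\text{period}_U) E(N_{UP} + 1), \tag{24}$$
giving
$$E(N_{UP}) > \frac{T}{E(\text{period}_U)} - 1 = \frac{T}{TD_U + I_U} - 1. \tag{25}$$
Thus,
$$\frac{T}{TD_U + I_U} \ge E(N_{UP}) > \frac{T}{TD_U + I_U} - 1. \tag{26}$$
However, the relationship between the number of U-trades ($N_{UT}$) and the number of
$\text{period}_U$'s ($N_{UP}$) is $N_{UT} = N_{UP}$ or $N_{UT} = N_{UP} + 1$.
Thus,
$$\frac{T}{TD_U + I_U} + 1 \ge E(N_{UP}) + 1 \ge E(N_{UT}) \ge E(N_{UP}) > \frac{T}{TD_U + I_U} - 1. \tag{27}$$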
Table 2 shows the estimation of the number of U-trades over $T = 1000$ observations
for some AR(1) processes using the theory presented above. We use
$\hat{N}_{UT} = \frac{1000}{TD_U + I_U} - 1$ to estimate the expected number of U-trades within $[0, T]$. The
average trade duration for U-trades $TD_U$ and the average inter-trade interval for
U-trades $I_U$ are calculated using the integral equation approach.
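The bounds in Eq.(27) and the estimator used in Table 2 are simple enough to wrap in a helper. The sketch below uses a hypothetical function name and takes $TD_U$ and $I_U$ as inputs; the numbers in the usage line are the $\phi = 0.5$, $\sigma_a^2 = 1$ entries of Table 2.

```python
def u_trade_count_bounds(T, td_u, i_u):
    """Return (lower, upper) such that lower < E(N_UT) <= upper, following Eq. (27)."""
    periods = T / (td_u + i_u)        # T / (TD_U + I_U)
    return periods - 1.0, periods + 1.0


# The lower bound is the estimator N_hat_UT = T / (TD_U + I_U) - 1 used in Table 2
lower, upper = u_trade_count_bounds(1000, 3.5401, 14.6006)
print(round(lower, 3), round(upper, 3))
```

Using the lower bound as the estimate is the conservative choice, which fits the aim of bounding the total profit from below.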
Table 2: Estimation of the number of U-trades using an integral equation approach with $U = 1.5$

| $\phi$ | $\sigma_a^2$ | $TD_U$ | $I_U$ | $\hat{N}_{UT} = \frac{1000}{TD_U + I_U} - 1$ |
|---|---|---|---|---|
| 0.5 | 0.49 | 3.9181 | 40.6074 | 21.459 |
| 0.5 | 1 | 3.5401 | 14.6006 | 54.125 |
| 0.5 | 4 | 3.0469 | 5.5679 | 115.079 |
| -0.5 | 0.49 | 1.2329 | 32.6253 | 28.535 |
| -0.5 | 1 | 1.3666 | 10.523 | 83.1071 |
| -0.5 | 4 | 1.5626 | 3.6220 | 191.879 |

Table 3 shows the simulation results for the number of U-trades as a comparison
to the theoretical results in Table 2. For each simulation, 1000 observations are
generated from the model described in Eq.(18), and each simulation is independently
repeated 40 times. The values in parentheses are the standard deviations. In
calculating the trade duration for each simulation, a trade is opened when $\epsilon_t$
exceeds $U$. In calculating the inter-trade interval, the trade is closed when $\epsilon_t$ goes
below zero. This is done because, in the simulation, $\epsilon_t$ is a discrete-time process,
so it is hard to obtain the exact times at which $\epsilon_t$ is at $U$ and 0. We calculate the average
trade duration $TD_U$, the average inter-trade interval $I_U$ and the number of U-trades
$N_{UT}$ for each simulation. At the end of all 40 repeated simulations, we calculate the
mean of $TD_U$, $I_U$ and $N_{UT}$ from all simulations as well as the standard deviations.
Furthermore, the last column shows the number of trades using $\hat{N}_{UT} = \frac{1000}{TD_U + I_U} - 1$.
From Table 3, we can conclude that if we can estimate the average trade
duration and the average inter-trade interval correctly, the formula $\hat{N}_{UT} = \frac{1000}{TD_U + I_U} - 1$
can be used to estimate the number of U-trades.

Table 3: Simulated number of trades for an AR(1) process using $T = 1000$ observations and $U = 1.5$

| $\phi$ | $\sigma_a^2$ | $TD_U$ | $I_U$ | $N_{UT}$ simulation | $\hat{N}_{UT} = \frac{1000}{TD_U + I_U} - 1$ |
|---|---|---|---|---|---|
| 0.5 | 0.49 | 4.054 (0.585) | 42.801 (12.340) | 21.725 (5.277) | 20.342 |
| 0.5 | 1 | 3.780 (0.308) | 15.153 (1.838) | 52.650 (5.226) | 51.817 |
| 0.5 | 4 | 3.407 (0.255) | 6.254 (0.466) | 103.000 (6.421) | 102.508 |
| -0.5 | 0.49 | 1.170 (0.084) | 28.958 (4.760) | 32.025 (5.091) | 32.191 |
| -0.5 | 1 | 1.242 (0.072) | 9.206 (0.952) | 95.175 (8.311) | 94.706 |
| -0.5 | 4 | 1.385 (0.051) | 3.030 (0.151) | 225.500 (8.741) | 225.500 |
Comparing the numbers of U-trades in Tables 2 and 3, we see that for \(\phi = 0.5\), the estimates of the number of U-trades using the integral equation are higher than those given by the simulation results. The opposite happens for \(\phi = -0.5\). The difference is due to a slight difference between the framework underpinning the theory of integral equations and that for simulation from real data.
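The simulation experiment described above can be sketched as follows. This is a minimal illustration, not the authors' code: Eq.(18) is taken to be a zero-mean Gaussian AR(1) process, the series is assumed to start from its stationary distribution, and the helper names (`simulate_ar1`, `u_trade_times`, `summarize`, `run_simulations`) are hypothetical. A U-trade opens at the first observation above \(U\) and closes at the first later observation below zero.

```python
import numpy as np


def simulate_ar1(phi, sigma_a, n, rng):
    """Simulate eps_t = phi * eps_{t-1} + a_t with a_t ~ N(0, sigma_a^2)."""
    eps = np.empty(n)
    # Assumption: the first value is drawn from the stationary distribution.
    eps[0] = rng.normal(0.0, sigma_a / np.sqrt(1.0 - phi**2))
    shocks = rng.normal(0.0, sigma_a, size=n)
    for t in range(1, n):
        eps[t] = phi * eps[t - 1] + shocks[t]
    return eps


def u_trade_times(eps, U):
    """Opening and closing indices of U-trades (the last trade may stay open)."""
    opens, closes = [], []
    in_trade = False
    for t, value in enumerate(eps):
        if not in_trade and value > U:
            opens.append(t)
            in_trade = True
        elif in_trade and value < 0.0:
            closes.append(t)
            in_trade = False
    return opens, closes


def summarize(eps, U):
    """Average trade duration, average inter-trade interval, completed trades."""
    opens, closes = u_trade_times(eps, U)
    durations = [c - o for o, c in zip(opens, closes)]
    # Interval from the close of one trade to the opening of the next one.
    intervals = [opens[k + 1] - closes[k] for k in range(len(opens) - 1)]
    mean = lambda xs: float(np.mean(xs)) if xs else float("nan")
    return mean(durations), mean(intervals), len(closes)


def run_simulations(phi, sigma_a, U, n=1000, reps=40, seed=0):
    """Mean of (TD_U, I_U, N_UT) over independent replications."""
    rng = np.random.default_rng(seed)
    stats = [summarize(simulate_ar1(phi, sigma_a, n, rng), U) for _ in range(reps)]
    return tuple(np.nanmean(np.array(stats), axis=0))


td_u, i_u, n_ut = run_simulations(phi=0.5, sigma_a=1.0, U=1.5)
print(f"TD_U={td_u:.3f}  I_U={i_u:.3f}  N_UT={n_ut:.3f}  "
      f"N_hat={1000 / (td_u + i_u) - 1:.3f}")
```

The printed averages depend on the random seed, so they will only be comparable in magnitude to the \(\phi = 0.5\), \(\sigma_a^2 = 1\) row of Table 3.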
4 Minimum Total Profit and the Optimal Pre-set Upper-bound
This section combines the pre-set minimum profit per trade from Section 2 and the number of U-trades from Section 3 to define the minimum total profit (MTP) over the time horizon \([0, T]\). The optimal pre-set upper-bound, denoted by \(U_o\), is determined by maximizing the MTP.
Let \(TP_U\) denote the total profit from U-trades within the time horizon \([0, T]\) for a pre-set upper-bound \(U\). Thus,
\[ TP_U = \sum_{i=1}^{N_{UT}} (\text{Profit from the } i\text{th U-trade}). \]
Using Eqs.(4) and (27),
\[ \text{Profit per trade} \ge U \]
and
\[ E(N_{UT}) \ge \frac{T}{TD_U + I_U} - 1. \]
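The two bounds can be chained to show why the quantity defined in Eq.(28) acts as a lower bound on the expected total profit. The short derivation below is a sketch that assumes every U-trade earns at least \(U\) and takes the bound on \(E(N_{UT})\) from Eq.(27) as given:

```latex
\[
TP_U = \sum_{i=1}^{N_{UT}} (\text{Profit from the } i\text{th U-trade}) \;\ge\; U\, N_{UT}
\quad\Longrightarrow\quad
E(TP_U) \;\ge\; U\, E(N_{UT}) \;\ge\; \left(\frac{T}{TD_U + I_U} - 1\right) U .
\]
```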
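Table 4: Numerical results in determining the optimal \(U\)

                 \(\phi = -0.8\)          \(\phi = -0.5\)          \(\phi = -0.2\)
\(\sigma_a^2\)   \(MTP(U_o)\)  \(U_o\)    \(MTP(U_o)\)  \(U_o\)    \(MTP(U_o)\)  \(U_o\)
0.25             91.7097       0.59       77.0414       0.5        66.4935       0.47
0.49             128.3609      0.83       107.8254      0.7        93.0673       0.65
1                183.3448      1.19       154.0117      1          132.9287      0.93
2.25             274.9967      1.78       230.9996      1.49       199.3710      1.4
4                366.6515      2.37       307.9922      1.99       265.8216      1.86
\(U_o \approx\)  \(1.2\sigma_a\)          \(\sigma_a\)             \(0.93\sigma_a\)
\(U_o \approx\)  \(0.72\sigma_\epsilon\)  \(0.87\sigma_\epsilon\)  \(0.91\sigma_\epsilon\)

                 \(\phi = 0.2\)           \(\phi = 0.5\)           \(\phi = 0.8\)
\(\sigma_a^2\)   \(MTP(U_o)\)  \(U_o\)    \(MTP(U_o)\)  \(U_o\)    \(MTP(U_o)\)  \(U_o\)
0.25             55.1798       0.47       46.7138       0.53       34.7004       0.7
0.49             77.219        0.66       65.3655       0.74       48.5545       0.97
1                110.2877      0.95       93.3549       1.05       69.3438       1.39
2.25             165.4104      1.42       140.0095      1.58       103.9991      2.09
4                220.5361      1.89       186.6704      2.1        138.6582      2.78
\(U_o \approx\)  \(0.95\sigma_a\)         \(1.05\sigma_a\)         \(1.4\sigma_a\)
\(U_o \approx\)  \(0.93\sigma_\epsilon\)  \(0.91\sigma_\epsilon\)  \(0.84\sigma_\epsilon\)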
Define the minimum total profit⁷ within the time horizon \([0, T]\) by
\[ MTP(U) := \left(\frac{T}{TD_U + I_U} - 1\right) U. \qquad (28) \]
Then, considering all \(U \in [0, b]\), the optimal pre-set upper-bound \(U_o\) is chosen such that \(MTP(U)\) attains its maximum at \(U_o\). In practice, the value of \(b\) is set to \(5\sigma_\epsilon\) because \(\epsilon_t\) is a stationary process and the probability that \(|\epsilon_t|\) is greater than \(5\sigma_\epsilon\) is close to zero.⁸
The numerical algorithm to calculate the optimal pre-set upper-bound \(U_o\) is as follows:
1. Set the value of \(b\) to \(5\sigma_\epsilon\).
2. Decide a sequence of pre-set upper-bounds \(U_i\), where \(U_i = i \times 0.01\) and \(i = 0, \ldots, b/0.01\).
3. For each \(U_i\),
(a) calculate \(E(T_{0,b}(U_i))\) as the trade duration \(TD_{U_i}\) using Eq.(19).
(b) calculate \(E(T_{-b,U_i}(0))\) as the inter-trade interval \(I_{U_i}\) using Eq.(20).
(c) calculate \(MTP(U_i) = \left(\frac{T}{TD_{U_i} + I_{U_i}} - 1\right) U_i\).
4. Find \(U_o \in \{U_i\}\) such that \(MTP(U_o)\) is the maximum.
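The steps above can be sketched in Python. Eqs.(19) and (20) are not reproduced in this section, so the sketch assumes the usual Fredholm equation of the second kind for the mean first-passage time of a Gaussian AR(1) process, \(m(x) = 1 + \int_{lo}^{hi} m(u) f(u \mid x)\,du\) with \(f(\cdot \mid x)\) the \(N(\phi x, \sigma_a^2)\) density, and solves it by a Nystrom discretization with trapezoidal weights. The function names, the node count `n` and the quadrature rule are illustrative choices, not the authors' implementation.

```python
import numpy as np


def mean_first_passage(x0, lo, hi, phi, sigma_a, n=101):
    """Approximate mean number of steps before an AR(1) path from x0 leaves [lo, hi]."""
    u = np.linspace(lo, hi, n)              # quadrature nodes on [lo, hi]
    w = np.full(n, (hi - lo) / (n - 1))     # trapezoidal weights
    w[0] *= 0.5
    w[-1] *= 0.5

    def density(targets, start):
        # N(phi * start, sigma_a^2) density evaluated at targets.
        z = (targets - phi * start) / sigma_a
        return np.exp(-0.5 * z**2) / (sigma_a * np.sqrt(2.0 * np.pi))

    K = density(u[None, :], u[:, None])     # K[i, j] = f(u_j | u_i)
    m_nodes = np.linalg.solve(np.eye(n) - K * w, np.ones(n))
    # Nystrom interpolation gives the value at the actual starting point x0.
    return 1.0 + float(np.sum(density(u, x0) * w * m_nodes))


def optimal_upper_bound(phi, sigma_a, T, step=0.01, n=101):
    """Grid search for U_o maximizing MTP(U) = (T / (TD_U + I_U) - 1) * U."""
    sigma_eps = sigma_a / np.sqrt(1.0 - phi**2)
    b = 5.0 * sigma_eps                                       # step 1
    best_U, best_mtp = 0.0, -np.inf
    for i in range(int(b / step) + 1):                        # step 2
        U = i * step
        td = mean_first_passage(U, 0.0, b, phi, sigma_a, n)   # step 3(a)
        iu = mean_first_passage(0.0, -b, U, phi, sigma_a, n)  # step 3(b)
        mtp = (T / (td + iu) - 1.0) * U                       # step 3(c)
        if mtp > best_mtp:                                    # step 4
            best_U, best_mtp = U, mtp
    return best_U, best_mtp


U_o, mtp = optimal_upper_bound(phi=0.5, sigma_a=1.0, T=1000)
print(f"U_o = {U_o:.2f}, MTP(U_o) = {mtp:.4f}")
```

For comparison, Table 4 reports \(U_o = 1.05\) and \(MTP(U_o) = 93.3549\) for \(\phi = 0.5\), \(\sigma_a^2 = 1\); the sketch need not reproduce these digits exactly, since the discretization may differ from the one used in the paper.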
Examples of numerical results for some AR(1) processes are shown in Table 4. We use the AR(1) model described in Eq.(18) and \(T = 1000\). The table shows that for a given \(\phi\), \(U_o\) increases as \(\sigma_a\) increases. The last two rows for each set of \(\phi\) values show the approximation of \(U_o\) as a proportion of \(\sigma_a\) and \(\sigma_\epsilon\). Those approximations can be used as a general rule in choosing \(U_o\). For example, if the adjusted cointegration error \(\epsilon_t\) follows an AR(1) process with \(\phi\) equal to -0.5 or 0.5, we can quickly choose \(U_o \approx \sigma_a\).
The MTP can be used as a criterion to determine whether a stock pair is worth trading. If we have limited funding to trade stocks in the market and have identified several stock pairs, we can choose the stock pair that gives the maximum MTP.
⁷ We adopt the notation \(MTP(U)\) since the minimum total profit is a function of \(U\).
⁸ \(\sigma_\epsilon\) is the standard deviation of \(\epsilon_t\).

5 Empirical examples
This section investigates the application of the above pairs trading strategy. Since we do not carry out real pairs trading in the stock market, we use empirical data available on the internet (www.finance.yahoo.com.au). The empirical data is divided into two parts, namely in-sample data and out-sample data. The in-sample data is assigned as the training period, in which we analyse the cointegration relationship and then determine the optimal pre-set boundaries \(U_o\) and \(L_o\). The out-sample data is assigned as the trading period. The out-sample data is assumed to retain the same cointegration relationship as the in-sample data, so the pairs trading strategy can be applied to the out-sample data using the optimal pre-set boundaries \(U_o\) and \(L_o\) obtained from the in-sample data.
There is no standard rule for choosing the lengths of the training period (in-sample data) and the trading period (out-sample data). However, the training period needs to be long enough that we can determine whether a cointegration relationship actually exists, but not so long that it is obsolete for the trading period. The trading period needs to be long enough to offer opportunities to open and close trades and to test the strategy, but it cannot be too long because the cointegration relationship between the two stocks may change. We use a 12-month training period and a 6-month trading period with daily data, as these periods correspond with those used in the studies by Gatev et al. (1999, 2006), Gillespie and Ulph (2001) and Habak (2002).
This paper gives two specific illustrations, BHP-RIO and TNS-TVL, on the Australian Stock Exchange (ASX). The stocks in each of the pairs BHP-RIO and TNS-TVL are cointegrated, and the cointegration error can be fitted with an AR(1) model. We use the PcFiml (Doornik and Hendry, 1997) and PcGive (Hendry and Doornik, 1996) software packages to analyze the cointegration relationship of the data.
From the in-sample data, knowing that BHP-RIO and TNS-TVL are cointegrated and that the cointegration errors are AR(1) processes, the values of \(\phi\) and \(\sigma_a\) can be estimated. The algorithm in Section 4 is applied to obtain the estimates of the optimal pre-set upper-bound \(U_o\), the number of U-trades, the expected trade duration and the minimum total profit from U-trades for each pair of shares. As explained before, the number of trades and the minimum total profit produced by the algorithm in Section 4 are for U-trades during the time horizon \([0, T]\) only. As the \(\epsilon_t\) from those share pairs are stationary processes and have symmetric distributions, in considering the L-trades we can simply take the total number of trades and the total profit to be double the results from the algorithm above, and the estimate of the optimal pre-set lower-bound is \(L_o = -U_o\).
After obtaining the estimates of the optimal pre-set boundaries \(U_o\) and \(L_o\), we apply the pairs trading strategy to the in-sample data to obtain the actual number of trades, the total profit and the average trade duration. If the adjusted cointegration error \(\epsilon_t\) is at or above \(U_o\), a U-trade is opened by selling one unit of share S1 and buying \(\beta\) units of share S2; the trade is closed by taking the opposite position when \(\epsilon_t\) is at or below zero. We can also open an L-trade when \(\epsilon_t\) is below \(L_o = -U_o\) by buying one unit of share S1 and selling \(\beta\) units of share S2; it is closed by taking the opposite position when \(\epsilon_t\) is at or above zero. In the case of BHP-RIO, BHP is assigned as S1 and RIO as S2, while in the case of TNS-TVL, TNS is S1 and TVL is S2. No new trade is opened while the previous trade has not been closed. Comparing the theoretical results with the actual results from the in-sample data indicates whether the share pair is worth trading.
Using the optimal pre-set boundaries \(U_o\) and \(L_o\) as well as the cointegration relationship from the in-sample data, we apply the pairs trading strategy to the out-sample data. We calculate the profit and the trade duration for each trade (U-trades as well as L-trades), and at the end we also calculate the total number of trades, the total profit and the average trade duration. The results from the out-sample data show whether the pairs trading strategy still works.
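The trading rule described in this section can be sketched as a small backtest on a series of adjusted cointegration errors. This is an illustrative sketch only: the function names are hypothetical, trading costs are ignored, and the profit of a trade on one unit of S1 against \(\beta\) units of S2 is assumed to equal the change in \(\epsilon_t\) between opening and closing, which follows from the positions described above when the cointegration relationship holds.

```python
def backtest_pairs(eps, U_o):
    """Apply the U-trade / L-trade rule with L_o = -U_o.

    Returns a list of (kind, open_index, close_index, profit) tuples.
    """
    L_o = -U_o
    trades = []
    position = None   # None (no open trade), "U" or "L"
    open_t = None
    for t, e in enumerate(eps):
        if position is None:
            # A new trade is opened only when no trade is currently open.
            if e >= U_o:
                position, open_t = "U", t
            elif e <= L_o:
                position, open_t = "L", t
        elif position == "U" and e <= 0.0:
            trades.append(("U", open_t, t, eps[open_t] - e))
            position = None
        elif position == "L" and e >= 0.0:
            trades.append(("L", open_t, t, e - eps[open_t]))
            position = None
    return trades


def summarize_trades(trades):
    """Number of trades, total profit and average trade duration."""
    n = len(trades)
    total_profit = sum(profit for *_, profit in trades)
    avg_duration = sum(c - o for _, o, c, _ in trades) / n if n else float("nan")
    return n, total_profit, avg_duration


# Hypothetical adjusted cointegration errors, for illustration only.
example = [0.1, 0.9, 0.4, -0.2, -1.0, -0.3, 0.05, 0.5]
print(summarize_trades(backtest_pairs(example, U_o=0.81)))
```

By construction, every completed trade in this sketch earns at least \(U_o\), which mirrors the minimum profit per trade that the strategy is designed to guarantee.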
5.1 BHP-RIO
BHP Billiton and Rio Tinto are major operators in the mining sector. Both have diversified mining resources in Australia, as well as in other countries, that define them as blue-chip stocks on the ASX.
This paper uses the daily closing prices of the two stocks from 2 January 2004 to 30 December 2004 as in-sample data and from 3 January 2005 to 30 June 2005 as out-sample data. From the in-sample data, the cointegration relationship of the two stocks is obtained as follows:
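\[ BHP_t - 0.61248\, RIO_t = \epsilon_t^{*}, \qquad (29) \]
and the adjusted cointegration error is
\[ \epsilon_t = \epsilon_t^{*} - 7.3884, \qquad (30) \]
which is then fitted as an AR(1) process as follows:
\[ \epsilon_t = 0.8994\, \epsilon_{t-1} + a_t, \qquad (31) \]
where \(\sigma_\epsilon = 0.6479\).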
Using the in-sample data, i.e. \(T = 251\) observations from 2 January 2004 to 30 December 2004, and letting \(\phi = 0.8994\) and \(\sigma_a = \sqrt{1 - \phi^2}\,\sigma_\epsilon = 0.2055\), we obtain the following estimates from the numerical algorithm in Section 4:
1. optimal pre-set upper-bound \(U_o = 0.81\) and lower-bound \(L_o = -0.81\),
2. total number of trades (U-trades + L-trades) = 16.00,
3. minimum total profit (U-trades + L-trades) = 0.81 × 16 = $12.96,
4. expected trade duration = 13.74 days,
5. 1 trade (either a U-trade or an L-trade) per 15.70 days.
The above results are the estimates from the theory using the in-sample data. We also want to know the results when the pairs trading strategy explained in Section 2 is applied to the in-sample data using \(U_o = 0.81\) and \(L_o = -0.81\). We obtain the following actual results:
1. number of trades (U-trades + L-trades) = 9,
2. total profit (U-trades + L-trades) = $10.23,
3. average profit per trade = $1.14,
4. average trade duration = 11.91 days,
5. on average, 1 trade (either a U-trade or an L-trade) per 27.9 days.
Comparing the estimates from the theory with the actual results from the in-sample data shows that the actual number of trades and the actual total profit are less than the estimates. However, we still make a profit, and the profit per trade is always higher than 0.81 dollars, which is the optimal pre-set upper-bound \(U_o\) (the average profit per trade is $1.14).
We also want to investigate whether the pairs trading strategy produces a profit on the out-sample data. Assume that the out-sample data (\(T = 124\) observations) still follows the models in Eqs.(29) and (31), so that we can apply the same pairs trading strategy with \(U_o = 0.81\) and \(L_o = -0.81\) as for the in-sample data. From the out-sample data, we obtain
1. number of trades (U-trades + L-trades) = 4,
2. total profit (U-trades + L-trades) = $4.77,
3. average profit per trade = $1.19,
4. average trade duration = 22.75 days,
5. on average, 1 trade (either a U-trade or an L-trade) per 31 days.
Comparing the trading results from the in-sample data and the out-sample data, the results are not too different (notice that the number of observations in the out-sample data is half that in the in-sample data). The significant difference between the two sets of results is the average trade duration. However, from the out-sample data we still always obtain a profit per trade higher than 0.81 dollars (the average profit per trade is $1.19).
5.2 TNS-TVL
Transonic Travel Ltd (TNS) and Travel.com.au (TVL) are travel companies listed
on the ASX. In this study we consider the daily closing price of the two stocks
from 2 January 2004 to 30 December 2004 as in-sample data and from 3 January
2005 to 30 June 2005 as out-sample data. From the in-sample data, we obtain the
cointegration relationship of the two stocks to be:
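\[ TNS_t - 0.26659\, TVL_t = \epsilon_t^{*}, \qquad (32) \]
with the adjusted cointegration error
\[ \epsilon_t = \epsilon_t^{*} - 15.43, \qquad (33) \]
and we fit \(\epsilon_t\) as an AR(1) process as follows:
\[ \epsilon_t = 0.9465\, \epsilon_{t-1} + a_t, \qquad (34) \]
where \(\sigma_\epsilon = 1.256258\).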
With the indicated 251 days of in-sample data, \(\phi = 0.9465\) and \(\sigma_a = \sqrt{1 - \phi^2}\,\sigma_\epsilon\), we obtain the following estimates from the numerical algorithm in Section 4:
1. optimal upper-bound \(U_o = 1.00\) and lower-bound \(L_o = -1.00\),
2. number of trades (U-trades + L-trades) = 10.91,
3. minimum total profit (U-trades + L-trades) = 1 × 10.91 = $10.91,
4. expected trade duration = 18.00 days,
5. 1 trade (either a U-trade or an L-trade) per 18.42 days.
Analogous to the BHP-RIO case, we apply the pairs trading strategy explained in Section 2 to the in-sample data with \(U_o = 1.00\) and \(L_o = -1.00\), and we obtain the actual results:
1. number of trades (U-trades + L-trades) = 7,
2. total profit (U-trades + L-trades) = $13.49,
3. average profit per trade = $1.93,
4. average trade duration = 19.71 days,
5. on average, 1 trade (either a U-trade or an L-trade) per 35.85 days,
and from the out-sample data (\(T = 124\) observations), we obtain:
1. number of trades (U-trades + L-trades) = 2,
2. total profit (U-trades + L-trades) = $3.68,
3. average profit per trade = $1.84,
4. average trade duration = 32.5 days,
5. on average, 1 trade (either a U-trade or an L-trade) per 62 days.
From the TNS-TVL case, we see that the actual results for total profit and trade duration from the in-sample data are not too different from the estimates given by the theory; indeed, the actual total profit is significantly higher than the estimate. Furthermore, we always get a profit higher than 1 dollar from each trade (the average profit per trade is $1.93 and $1.84 for the in-sample data and the out-sample data respectively). However, the results from the out-sample data are not very good, as we have only 2 trades and the average trade duration is quite long (about one month). Perhaps this result reflects that the out-sample data does not quite follow the model in Eq.(32).
6 Conclusion
In this paper we have given a methodology for choosing the optimal pre-set boundaries for a pairs trading strategy based on the cointegration technique, together with a quantitative method for evaluating the average trade duration, the average inter-trade interval and the average number of trades. Optimality, in terms of maximizing the minimum total profit over the specified trading horizon, is obtained by combining the cointegration technique, the cointegration coefficient weighted rule, and the mean first-passage time computed using an integral equation approach.
The pairs trading strategy is applied to empirical data from two sample pairs: BHP-RIO and TNS-TVL. Even though in the BHP-RIO case we cannot obtain results as high as projected, we always obtain a profit per trade higher than the optimal pre-set upper-bound \(U_o\). The actual total profits in both cases are quite similar to the estimates. For the TNS-TVL case, the results from the out-sample data are not very good. Perhaps these results are due to the out-sample data not quite following the models developed from the in-sample data. Adjustment to the model may need to be made when using out-sample data.
The above strategy can be extended if we set the minimum profit per trade to be the minimum profit required, \(P_r\), for example to meet the trading cost. We can trade \(N_{S1} = \lfloor P_r / U \rfloor\) S1 shares and \(N_{S2} = \lfloor \beta P_r / U \rfloor\) S2 shares to obtain a minimum profit per trade of at least \(P_r\). If we want to restrict the money invested in the trade to an amount \(I\), we trade \(N_{S1} = \lfloor I / (P_{S1,t_o} + \beta P_{S2,t_o}) \rfloor\) shares of S1 and \(N_{S2} = \lfloor \beta I / (P_{S1,t_o} + \beta P_{S2,t_o}) \rfloor\) shares of S2 when we open a trade, and we then obtain a minimum profit per trade of \(U N_{S1}\).⁹
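The two sizing rules can be written out directly. The sketch below uses hypothetical function names and made-up example inputs; it follows the floor as printed in the text, so `shares_for_required_profit` can leave the product of the number of S1 shares and \(U\) slightly below \(P_r\) when \(P_r / U\) is not an integer, and rounding up may be preferable in practice.

```python
import math


def shares_for_required_profit(P_r, U, beta):
    """Share counts (N_S1, N_S2) for a required minimum profit P_r per trade."""
    n_s1 = math.floor(P_r / U)
    n_s2 = math.floor(beta * P_r / U)
    return n_s1, n_s2


def shares_for_investment(I, price_s1, price_s2, beta):
    """Share counts (N_S1, N_S2) when the money invested is limited to I.

    price_s1 and price_s2 are the prices of S1 and S2 when the trade opens.
    """
    cost_per_unit = price_s1 + beta * price_s2
    n_s1 = math.floor(I / cost_per_unit)
    n_s2 = math.floor(beta * I / cost_per_unit)
    return n_s1, n_s2


# Hypothetical inputs, for illustration only.
print(shares_for_required_profit(P_r=10.0, U=0.81, beta=0.61248))
n_s1, n_s2 = shares_for_investment(I=10000.0, price_s1=30.0, price_s2=40.0, beta=0.5)
print(n_s1, n_s2, "minimum profit per trade:", 0.81 * n_s1)
```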
We are aware that, in large groups of stocks, the cointegrated stocks may not follow the assumptions given in this paper. For example, the cointegration relationship may disappear in the future, or the cointegration error may not be symmetric or may not be an AR(1) process. Whether the technique presented in this paper works relies on only two conditions: (1) within the in-sample data, there is a linear combination of the stocks that forms an AR(1) series, and (2) this relationship does not change significantly in the trading period (out-sample data). Further investigations are warranted to explore different assumptions. In this paper we have established a framework that may be applied to a cointegrated stock pair with an AR(1) cointegration error.
⁹ \(\lfloor a \rfloor\) denotes the greatest integer less than or equal to \(a\).

Acknowledgements
The authors would like to thank John Rayner, Martin Bunder, and Bernie Wilkes for their insightful and useful comments.
References:
Alexander, C. (1999). Optimal Hedging Using Cointegration. Philosophical Transactions of the Royal Society, A357, 2039-2058.
Atkinson, K. (1997). The Numerical Solution of Integral Equations of the Second Kind. Cambridge University Press.
Avery-Wright, R. Market Neutral (Pairs Trading) Explained. http:// compare-
shares.com.au/expert CFPs11.php.
Banerjee, A., Dolado, J., Galbraith, J. and Hendry, D. (1993). Cointegration, Error Correction, and the Econometric Analysis of Non-Stationary Data. New York, Oxford University Press.
Basak, G.K. and Ho, K.W.R. (2004). Level-crossing Probabilities and First-passage Times for Linear Processes. Advances in Applied Probability, 36, 643-666.
Box, G.E.P. and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day Inc.
Doornik, J.A. and Hendry, D.F. (1997). Modeling Dynamic Systems Using PcFiml 9.0 for Windows. International Thomson Business Press.
Elliott, R.J., van der Hoek, J. and Malcolm, W.P. (2005). Pairs Trading. Quantitative Finance, 5, 271-276.
Engle, R.F. and Granger, C.W.J. (1987). Co-integration and Error Correction: Representation, Estimation and Testing. Econometrica, 55, 251-276.
Ehrman, D.S. (2006). The Handbook of Pairs Trading: Strategies Using Equities, Options, and Futures. John Wiley & Sons Inc.
Fama, E.F. and French, K.R. (1988). Dividend yields and expected stock returns. Journal of Political Economy, 96, 246-273.
Gatev, E.G., Goetzmann, W.N. and Rouwenhorst, K.G. (1999). Pairs Trading: Performance of a Relative Value Arbitrage Rule. Working Paper 7-32, National Bureau of Economic Research.
Gatev, E.G., Goetzmann, W.N. and Rouwenhorst, K.G. (2006). Pairs Trading: Performance of a Relative Value Arbitrage Rule. The Review of Financial Studies, 19, 797-827.
Gillespie, T. and Ulph, C. (2001). Pair Trades Methodology: A Question of Mean Reversion. Derivative Analysis Department, Salomon Smith Barney.
Goodboy, D. A Basic Introduction to Pairs Trading. http://biz.yahoo.com
Granger, C.W.J. (1986). Developments in the Study of Cointegrated Economic Variables. Oxford Bulletin of Economics and Statistics, 48, 213-228.
Grant, A. (2003). Simulation and Application of a New Pairs Trading Strategy Involving Cointegration. Bachelor of Mathematics and Finance Honours Thesis, University of Wollongong.
Habak, C. (2002). Pairs Trading: Applying Cointegration to a Pairs Trading Strategy. Bachelor of Mathematics and Finance Honours Thesis, University of Wollongong.
Harris, R.I.D. (1995). Using Cointegration Analysis in Econometric Modeling. London, Prentice Hall.
Hendry, D.F. and Doornik, J.A. (1996). Empirical Econometric Modeling Using PcGive 9.0 for Windows. International Thomson Business Press.
Hendry, D.F. and Juselius, K. (2000). Explaining Cointegration Analysis: Part II. Energy Journal, 22, 75-120.
Herlemont, D. (2003). Pairs Trading, Convergence Trading, Cointegration. YATS Finances and Technology. www.yats.com.
Hong, G. and Susmel, R. (2003). Pairs-Trading in the Asian ADR Market. http://www.bauer.uh.edu/rsusmel/Academic/ptadr.pdf.
Johansen, S. (1988). Statistical Analysis of Cointegrating Vectors. Journal of Economic Dynamics and Control, 12, 231-254.
Lin, Y.X., McCrae, M. and Gulati, C. (2003). Necessary Conditions for a Minimum Profit Level in Pairs Trading: A Cointegration-based Procedure. Workshop on Mathematical Finance, University of Wollongong.
Lin, Y.X., McCrae, M. and Gulati, C. (2006). Loss Protection in Pairs Trading Through Minimum Profit Bounds: A Cointegration Approach. Journal of Applied Mathematics and Decision Sciences, (doi: 10.1155/JAMDS/2006/73803).
Liu, X., Song, H. and Romilly, P. (1997). Are Chinese stock markets efficient? A cointegration and causality analysis. Applied Economics Letters, 4, 511-515.
Narayan, P.K. (2005). Are the Australian and New Zealand stocks prices nonlinear with a unit root? Applied Economics, 37, 2161-2166.
Stone, C. Finding Profit in Pairs. http://www.investopedia.com.
Vidyamurthy, G. (2004). Pairs Trading: Quantitative Methods and Analysis. John Wiley.
Whistler, M. (2004). Trading Pairs - Capturing Profits and Hedging Risk with Statistical Arbitrage Strategies. John Wiley.
25
... Previous methods for pair trading generally decompose the task as two separate steps: pair selection and trading. For pair selection, they generally adopt predefined statistical tests or fundamental distance measurements to select two assets according to their historical price feature [5,7,8,10,11,15,16,19,27,35,36,38,39,46]. For example, a number of previous researches apply the cointegration test [46] to verify if the historical price spread of two assets is stationary. ...
... However, an ideal asset pair in these modelfree methods were expected to be two assets with exactly the same price movement in historical time, which have zero trading opportunities for no fluctuations of price spread. There were also methods [5,7,12,15,27,38,39,46] that directly model the tradability of a candidate pair based on the Engle-Granger cointegration test, which performs linear regression using the price series of two assets and expects the residual to be stationary. However, the mean-reversion properties of the spread of an asset pair in the future can be irrelevant to their mean-reversion strength in history, which limits the trading performance of the selected pair from these parameter-free methods. ...
Preprint
Pair trading is one of the most effective statistical arbitrage strategies which seeks a neutral profit by hedging a pair of selected assets. Existing methods generally decompose the task into two separate steps: pair selection and trading. However, the decoupling of two closely related subtasks can block information propagation and lead to limited overall performance. For pair selection, ignoring the trading performance results in the wrong assets being selected with irrelevant price movements, while the agent trained for trading can overfit to the selected assets without any historical information of other assets. To address it, in this paper, we propose a paradigm for automatic pair trading as a unified task rather than a two-step pipeline. We design a hierarchical reinforcement learning framework to jointly learn and optimize two subtasks. A high-level policy would select two assets from all possible combinations and a low-level policy would then perform a series of trading actions. Experimental results on real-world stock data demonstrate the effectiveness of our method on pair trading compared with both existing pair selection and trading methods.
... Further, risk arbitrage has been relatively less discussed while transaction costs have been rarely considered in the existing literature (Chan, 2008(Chan, , 2013). Analyses of risk-arbitrage, particularly pairs trading, was first introduced by Gatev et al. (1999), followed by many others (Vidyamurthy, 2004;Clegg & Krauss, 2018;Liew & Wu, 2013;Puspaningrum et al., 2009;Rad et al., 2016). However, the focus of prior studies has been mostly restricted to the stock market. ...
... Considering transaction costs, Do and Faff (2012) show that the algorithm developed by Gatev et al. (1999) is largely unprofitable and therefore inapplicable after 2002. Another method that can be applied to a pairs trading framework is cointegration (Puspaningrum et al., 2009;Vidyamurthy, 2004). Vidyamurthy (2004) emphasizes on the fact that security pairing is a critical step to achieve significant trading performances. ...
Article
Full-text available
Among the various statistical trading strategies, pairs trading has been widely employed as a market neutral strategy owing to its simple approach and ease of application. In this context, we develop a cointegration-based pairs trading framework with a set of pre-conditions for pair eligibility and apply it to different asset classes. The performance analysis of a portfolio of 45 pairs is considered for the period of January 2007 to January 2021, which covers the period of a full market cycle of adjacent bull and bear periods; it is studied and benchmarked against the S&P500 index, which is considered as a proxy for the general market. We find an average annual return of 15% with an average Sharpe ratio of 1.43 after considering the transaction costs; we observe that this performance does not vary significantly with a change in the transaction cost levels and does not pass below the risk-free return levels with changing market conditions. Further, the strategy is observed to perform better during bear market conditions. Considering the highly liquid trading environment of the strategy, our findings raise a call for a discussion on the semi-strong form market efficiency.
... The authors noted that at reasonable minimum profit levels, the protocol does not significantly reduce trade numbers or absolute profits compared to an unprotected trading strategy. The cointegration method was further applied by [Puspaningrum et al., 2010, Dunis et al., 2010, Gutierrez and Tse, 2011, Rad et al., 2016. The state space model performed by Elliott et al. [2005] is in the field of the time series approach. ...
Preprint
Full-text available
A high frequency pairs trading (HFPT) algorithm is built by the integration of pairs trading and threshold rebalancing algorithm. The determination of optimal threshold (OT) for the HFPT is crucial to maximize its profitability. Thus, selection of the OT for the pairs trading is an important problem, and this study suggests a procedure to classify OT ranges by supervised machine learning (ML) techniques. In this regard, a sample dataset is created for ML applications. In this dataset, the target variables (OT values) are computed by the application of HFPT algorithm to real price data of 50 crypto-assets, and input variables (features) are calculated as portfolio mean, variance, skewness, kurtosis, value at risk, and correlation coefficient of the pairs. Before classification process, the pairs (or portfolios) are divided into three subgroups (as positively, weakly and negatively correlated), and then OT values are classified by 6 ML methods. Comparing the accuracy of ML methods, it is observed that the best accuracy is obtained by the Random Forest (RF) classifier for all portfolio groups in two-class, three-class and four-class classification. Also, it is seen that the right classification performance of ML methods on positively and negatively correlated pairs are better than weakly correlated pairs. Furthermore, the success of RF classifier is verified with a validation dataset that contains price series of 50 crypto-assets in January and February 2024. The applicability of OT range classification procedure in practical exchange markets is also demonstrated, and it is shown that prediction of the OT range is possible with RF method and the HFPT algorithm can yield reasonable profits when threshold selected in predicted range.
... The co-integration approach is used in these experiments with respect to certain technical aspects (Vidyamurthy, 2004;Puspaningrum et al., 2010;Krauss, 2017). Comparative¯ndings were addressed by introducing di®erent mythologies in some experiments like (Caldeira and Moura, 2013;Lin et al., 2006;Liew and Wu, 2013). ...
Article
Full-text available
Pair trading strategy is a well-known profitable strategy in stock, forex, and commodity markets. As most of the world stock markets declined during COVID-19 period, therefore this study is going to observe whether this strategy is still profitable after COVID-19 pandemic. One of the powerful algorithms of DBSCAN under the umbrella of unsupervised machine learning is applied and three clusters were formed by using market and accounting data. The formation of these three clusters was based on book value per share, earning per share, classification of sector, market capitalisation and with other factors formed from PCA on the returns of daily data of six months of the 80 sample firms for year 2019–2020. An average of −0.32% average excess monthly return with Sharpe ratio of −0.0012 and Treynor ratio of −0.0231 is to be observed in COVID-19 pandemic period. However, the result of risk-adjusted performance under Jensen’s alpha is observed to be insignificant. The policy implication of this study, for different portfolios and fund managers is suggested to use machine learning approach to get positive and higher returns for their clients.
... For example, pairs trading and its generalizations rely on the construction of mean-reverting spreads, but the mean-reverting spreads must have a certain degree of predictability. So, researchers have developed loss protection methods for pairs trading [11] or used algorithms to estimate trade duration and find optimal preset boundaries [12]. Pairs trading is a simple concept. ...
Article
Full-text available
In this paper, we make use of the replicating asset for statistical arbitrage trading, where the replicating asset is constructed by a portfolio that mimics the returns from a factor model. Using the replicating asset in the context of statistical arbitrage has never been done before in the literature. A novel optimal statistical arbitrage trading model is applied, and we derive the average transaction length and return for the Berkshire A stock and its replicating asset. The results show that the statistical arbitrage method proposed by Bertram (2010) is profitable by using the replicating asset. We also compute the average returns under different transaction costs. For the statistical arbitrage using the replicating asset of the factor model, average annual returns were at least 33%. Robustness is examined with the S&P500. Our results can provide hedge fund managers with a new technique for conducting statistical arbitrage.
Article
Full-text available
The pair trading strategy under portfolio construction is one of the profitable strategies. This is first ever study to observe the pair trading strategy using Islamic Index KSE firms. Distance approach is applied in this study by taking firms under KMI-30 Islamic index by using daily data from year January 1, 2012 to year December 31, 2019. These firms are divided into three subcategories, high-cap, mid-cap and low-cap. After formation of pairs, trading algorithm under various parameterizations is used to observe the profitability. The study concluded with positive and significant returns of top 3, 5, 7 and 10 pairs for each category. In addition, this study also witnessed with positive returns after risk adjustment of market factor for the top 3, 5, 7, and 10 pairs of Islamic index’s firms. These results are accordance with theories of mean revision and market neutrality. The study is contributing to the literature regarding the profitability of pair trading in Islamic Indices firms in Pakistan. Policy Implication for Islamic fund managers and investors is recommended.
Article
Full-text available
This study intends to approach on the various aspects of frozen food and buying behaviour. The food that can be easily and quickly prepared has existed contemplated and considered as affected form of cuisine but yes in the increasing and energy concerning times of day it has noticeably enhance a part of our lives and it has affected the consumer’s buying conclusion when it meets to grocery buying. The research was attended with 50 respondents utilizing useful and inspecting sampling at market in Uttar Pradesh. Likert-scale located enquiry was developed and data was assembled using availability sampling It was found that the quickly made food has a significant impact on purchasing decision and consumer behaviour.
Article
Full-text available
Based on recent works on stocks comovement, Pairs Trading’s strategy is enhanced by reducing the stock universe to the stocks with the lower volatility on a given date. From this universe of low volatility stocks, pairs are selected by looking for pairs whose series present a high degree of antipersistence. Finally, a “reversion to the mean” strategy is applied to these pairs. It is shown that, with this approach to Pairs Trading, positive results can be obtained for stock from the Nasdaq stock exchange, mainly during bull markets and low volatility periods.
Article
This study aims to analyze the relationship between the business cycles of the ASEAN +3 countries. In addition, the effects of the spillover value on the coincident indicators are determined. This study employs secondary data and uses multivariate time series of five ASEAN countries, namely Indonesia, Malaysia, Singapore, Thailand, and the Philippines. The proxy was real gross domestic product (GDP), collected annually from the CEIC, the IMF, and the World Bank for the period from 1964 to 2016. The data were split into two time periods: 1964-1998 as the pre-crisis period and 1999-2016 as the post-crisis period. The index data were rebased to the base year 2010. The data were subsequently separated into trend and cyclic components. The cyclic components were obtained using the Hodrick-Prescott filter and were then analyzed further. The analytical methods used were contemporaneous correlation and cross-correlation. The results showed that, before and after the crisis, the business cycle correlation between ASEAN +3 countries was stronger and the cycles moved together at the same lag. The implication of this research is an initial finding on the ASEAN +3 countries' prerequisites for the formation of a common currency.
Article
This work examines a deep learning approach to complement investors’ practices for the identification of pairs-trading opportunities among cointegrated stocks. We refer to the reversal effect, whereby temporary market deviations are likely to correct and eventually converge again, to generate valuable pairs-trading signals based on the application of Long Short-Term Memory (LSTM) networks. Specifically, we propose to use the LSTM to estimate the probability that a stock will exhibit increasing market returns in the near future compared to its peers, and we compare and combine these predictions with trading practices based on sorting stocks according to either price or returns gaps. In so doing, we investigate the ability of our proposed approach to provide valuable signals under different perspectives, including variations in investment horizons, transaction costs and weighting schemes. Our analysis shows that strategies including such predictions can help improve portfolio performance, providing predictive signals whose information content goes above and beyond that embedded in both price and returns gaps.
Article
Although models of cointegrated financial time series are now relatively commonplace in the literature, their importance has, until very recently, been mainly theoretical. This is because the traditional starting point for portfolio risk management in practice is a correlation analysis of returns, whereas cointegration is based on the raw price, rate or yield data. In standard risk-return models these price data are differenced before the analysis is even begun, and differencing removes a priori any long-term trends in the data. Of course these trends are implicit in the returns data, but any decision based on long-term common trends in the price data is excluded in standard risk-return modelling. Cointegration and correlation are related, but different concepts. High correlation of returns does not necessarily imply high cointegration in prices. An example is given in figure 1, with a 10-year daily series on US dollar spot exchange rates of the German Mark (DEM) and the Dutch Guilder (NLG) from 1975 to 1985. Their returns are very highly correlated: the correlation coefficient is approximately 0.98 (figure 1(a)). The rates also move together over long periods of time, and they appear to be cointegrated (figure 1(b)). Now suppose that we add a very small daily incremental return of, say, 0.0002 to NLG. The
Article
This book provides an extensive introduction to the numerical solution of a large class of integral equations. The initial chapters provide a general framework for the numerical analysis of Fredholm integral equations of the second kind, covering degenerate kernel, projection and Nyström methods. Additional discussions of multivariable integral equations and iteration methods update the reader on the present state of the art in this area. The final chapters focus on the numerical solution of boundary integral equation (BIE) reformulations of Laplace's equation, in both two and three dimensions. Two chapters are devoted to planar BIE problems, which include both existing methods and remaining questions. Practical problems for BIE, such as the set-up and solution of the discretised BIE, are also discussed. Each chapter concludes with a discussion of the literature, and a large bibliography serves as an extended resource for students and researchers needing more information on solving particular integral equations.
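As a rough illustration of the Nyström method named in this abstract, the sketch below discretises a Fredholm equation of the second kind, x(s) - ∫ K(s,t) x(t) dt = f(s) on [a, b], by replacing the integral with a quadrature rule and solving the resulting linear system at the quadrature nodes. The function names (`simpson_nodes`, `solve_linear`, `nystrom`), the choice of composite Simpson quadrature, and the test kernel are assumptions made for this example; they are not taken from the book.

```python
def simpson_nodes(a, b, n):
    """Composite Simpson nodes and weights on [a, b]; n must be even."""
    if n < 2 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    nodes = [a + i * h for i in range(n + 1)]
    # Simpson weight pattern: 1, 4, 2, 4, ..., 2, 4, 1 (times h/3)
    weights = [h / 3 * (1 if i in (0, n) else 4 if i % 2 else 2)
               for i in range(n + 1)]
    return nodes, weights


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def nystrom(kernel, f, a, b, n=20):
    """Approximate x in x(s) - int_a^b kernel(s, t) x(t) dt = f(s) at the nodes."""
    nodes, weights = simpson_nodes(a, b, n)
    # Row i encodes: x_i - sum_j w_j K(t_i, t_j) x_j = f(t_i)
    A = [[(1.0 if i == j else 0.0) - weights[j] * kernel(s, t)
          for j, t in enumerate(nodes)]
         for i, s in enumerate(nodes)]
    rhs = [f(s) for s in nodes]
    return nodes, solve_linear(A, rhs)


# Hypothetical check problem: K(s, t) = s*t and f(s) = 2s/3 on [0, 1],
# whose exact solution is x(s) = s.
nodes, x = nystrom(lambda s, t: s * t, lambda s: 2 * s / 3, 0.0, 1.0, n=10)
max_error = max(abs(xi - ti) for xi, ti in zip(x, nodes))
```

Simpson's rule is used here only because it is easy to write down; in practice a higher-order rule such as Gauss-Legendre is a common choice, and a library linear solver would replace the hand-written elimination.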
Article
The numerical solution of Fredholm integral equations of the second kind - Volume 24 Issue 1 - Ivan G. Graham
Article
The power of dividend yields to forecast stock returns, measured by regression R², increases with the return horizon. We offer a two-part explanation. (1) High autocorrelation causes the variance of expected returns to grow faster than the return horizon. (2) The growth of the variance of unexpected returns with the return horizon is attenuated by a discount-rate effect: shocks to expected returns generate opposite shocks to current prices. We estimate that, on average, the future price increases implied by higher expected returns are just offset by the decline in the current price. Thus, time-varying expected returns generate ‘temporary’ components of prices.
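Point (1) of this abstract can be illustrated numerically. The sketch below assumes, purely for illustration, that one-period expected returns follow a stationary AR(1) process with variance `sigma2` and lag-k autocorrelation `phi**k`; under that assumption the variance of the T-period sum is sigma2 * [T + 2 * Σ_{k=1}^{T-1} (T - k) * phi^k], which grows faster than T when phi is positive. The function name and the parameter value 0.9 are hypothetical and are not taken from the paper.

```python
def variance_of_sum(horizon, phi, sigma2=1.0):
    """Variance of x_1 + ... + x_T for a stationary AR(1) series.

    Assumes Var(x_t) = sigma2 and Corr(x_t, x_{t+k}) = phi**k.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    # Each lag k contributes (horizon - k) pairs of observations, counted twice.
    cross_terms = sum((horizon - k) * phi ** k for k in range(1, horizon))
    return sigma2 * (horizon + 2 * cross_terms)


# Variance per period at several horizons: flat when phi = 0, rising when phi > 0.
for T in (1, 12, 48):
    print(T, variance_of_sum(T, 0.0) / T, variance_of_sum(T, 0.9) / T)
```

With no autocorrelation the variance per period stays constant as the horizon lengthens, while with high positive autocorrelation it rises, which is the sense in which the variance grows faster than the horizon.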
Article
In this paper, we study pairs-trading strategies for 64 Asian shares listed in their local markets and listed in the U.S. as ADRs. Given that all pairs are cointegrated, they are a logical choice for pairs-trading. We find that pairs-trading in this market delivers significant profits. The results are robust to different profit measures and different holding periods. For example, for a conservative investor willing to wait for a one-year period before closing the portfolio's pairs-trading positions, pairs-trading delivers annualized profits over 33%.