arXiv:0810.1922v1 [q-fin.PM] 10 Oct 2008
Look-Ahead Benchmark Bias
in Portfolio Performance Evaluation∗
Gilles Daniel1, Didier Sornette1,2,† and Peter Wohrmann3
1ETH Zürich, Department of Management, Technology and Economics
Kreuzplatz 5, CH-8032 Zürich, Switzerland
2Swiss Finance Institute, c/o University of Geneva
40 blvd. Du Pont d’Arve, CH 1211 Geneva 4, Switzerland
3Swiss Banking Institute, University of Zürich
email@example.com, firstname.lastname@example.org and email@example.com
†Contact author: Prof. Sornette, tel: +41 (0) 44 63 28917 fax: +41 (0) 44 63 21914
December 2, 2008
Abstract: The performance of investment managers is evaluated in comparison with benchmarks,
such as financial indices. Due to the operational constraint that most professional databases
do not track the change of constitution of benchmark portfolios, standard tests of performance
suffer from the “look-ahead benchmark bias,” when they use the assets constituting the bench-
marks of reference at the end of the testing period, rather than at the beginning of the period.
Here, we report that the “look-ahead benchmark bias” can exhibit a surprisingly large amplitude
for portfolios of common stocks (up to 8% per annum for the S&P500 taken as the benchmark) –
while most studies have emphasized related survival biases in performance of mutual and hedge
funds for which the biases can be expected to be even larger. We use the CRSP database from
1926 to 2006 and analyze the running top 500 US capitalizations to demonstrate that this bias
can account for a gross overestimation of performance metrics such as the Sharpe ratio as well as
an underestimation of risk, as measured for instance by peak-to-valley drawdowns. We demon-
strate the presence of a significant bias in the estimation of the survival and look-ahead biases
studied in the literature. A general methodology to test the properties of investment strategies is
advanced in terms of random strategies with similar investment constraints.
JEL codes: G11 (Portfolio Choice; Investment Decisions), C52 (Model Evaluation and Selec-
Keywords: survival bias, look-ahead bias, portfolio optimization, benchmark, investment strate-
∗We are grateful to Y. Malevergne for useful discussions.
Market professionals and financial economists alike strive to estimate the performance of mu-
tual funds, of hedge-funds and more generally of any financial investment, and to quantify the
return/risk characteristics of investment strategies. Having selected the funds and/or strategies of
interest, a time-honored approach consists in quantifying their past performance over some time
period. A large literature has followed this route, motivated by the eternal question of whether
some managers/strategies systematically outperform others, with its implications for market ef-
ficiency and investment opportunities.
Backtesting investment performance may appear straightforward and natural at first sight.
However, a significant literature has unearthed, studied and tried to correct for ex-post condi-
tioning biases, which include the survival bias, the look-ahead bias and data-snooping, which
continue to pollute even the most careful assessments. Here, we present a dramatic illustration
of a variant of the look-ahead bias, that we refer to as the “look-ahead benchmark bias,” which
surprised us by the large amplitude of the overestimation of expected returns of up to 8% per
annum. This overestimation is comparable to the largest amplitudes of the survival biases and
look-ahead biases found for mutual funds or hedge-funds. We demonstrate the generic nature
of the “look-ahead benchmark bias” by studying the performance of portfolios investing solely
in regular stocks using very simple strategies, such as buy-and-hold, Markowitz optimization or
random stock picking.
The look-ahead benchmark bias that we document is strongly related to the look-ahead bias
proper and to the survival bias, but has no particular relation to data-snooping (Lo and MacKinlay
1990, White 2000, Sullivan et al. 1999), which we therefore do not discuss further.
The standard survivorship bias refers to the fact that many estimates of persistence in investment
performance are based on data sets that only contain funds that are in existence at the end of
the sample period; see, e.g., Brown et al. (1992) and Grinblatt and Titman (1992). The corresponding
survivorship bias is caused by the fact that poorly performing funds are less likely to be observed in
data sets that only contain the surviving funds, because the survival probabilities depend on past
performance. Perhaps less appreciated is the fact that stocks themselves also have a large exit rate
and hence also suffer from the survival bias. For instance, Knaup (2005) examines the business
survival characteristics of all establishments that started in the United States in the late 1990s
when the boom of much of that decade was not yet showing signs of weakness, and finds that, while
85% of firms survive more than one year, only 45% survive more than four years. Bartelsman et
al. (2003) confirm that a large number of firms enter and exit most markets every year in a group
of ten OECD countries: data covering the first part of the 1990s show the firm turnover rate (entry
plus exit rates) to be between 15 and 20 per cent in the business sector of most countries, i.e. a
fifth of firms are either recent entrants or will close down within the year. And this phenomenon
of firm exits is not confined to small firms. Indeed, in the exhaustive CRSP database of about
26’900 listed US firms, covering the period from Jan. 1926 to Dec. 2006 (Center for Research
in Security Prices, http://www.crsp.com/), we find that on average 25% of names disap-
peared after 3.3 years, 75% of names disappeared after 14 years and 95% of names disappeared
after 34 years.
The standard look-ahead bias refers to the use of information in a simulation that would not
be available during the time period being simulated, usually resulting in an upward shift of the
results. An example is the false assumption that earnings data become available immediately at
the end of a financial period. Another example is observed in performance persistence studies,
in which it is common to form portfolios of funds/stocks based upon a ranking performed at the
end of a first period, together with the implicit or explicit condition that the funds/stocks are still
in the selected ranks at the end of the second testing period. In other words, funds/stocks that
are considered for evaluation are those which survive a minimum period of time after a ranking
period (Brown et al. 1992). This bias is not remedied even if a survivorship-free database is
used, because it reflects additional constraints on ranking.
More generally, the fact that a data set is survivorship-free does not imply that standard
methods of analysis do not suffer from ex-post conditioning biases, which in one way or another
may use (often implicit or hidden) present information which would not have been available in a
Previous works have investigated both survivorship and look-ahead biases. Brown et al.
(1992) have shown that survivorship in mutual funds can introduce a bias strong enough to ac-
count for the strength of the evidence favoring return predictability previously reported. Carpen-
ter and Lynch (1999) find, among other results, that look-ahead biased methodologies (which
require funds to survive a minimum period of time after a ranking period) materially bias statis-
tics. ter Horst et al. (2001) introduce a weighting procedure based upon probit regressions
which models how survival probabilities depend upon historical returns, fund age and aggre-
gate economy-wide shocks, and which provides look-ahead bias-corrected estimates of mutual
fund performances. Baquero et al. (2005) apply the methodology of ter Horst et al. (2001) to
hedge-fund performance, which requires a well-specified model that explains survival of hedge
funds and how it depends upon historical performance. ter Horst and Verbeek (2007) extend the
look-ahead bias correction method of Baquero et al. (2005) to hedge-funds by correcting sepa-
rately for additional self-selection biases that plague hedge-fund databases (underperformers do
not wish to make their performance known while funds that performed well have less incentive
to report to data vendors to attract potential investors).
The major part of the literature is devoted to assessing the look-ahead bias in actively managed
investment funds. Here, we study how the back-testing of investment strategies on biased
stock price databases is affected. We add to the literature by focusing on the look-ahead bias
that appears when the assets used to test the portfolio performance are selected on the basis
of their relationship with the benchmark to which the performance is compared. In
Section 2, we provide a specific straightforward implementation using the S&P500 index as the
benchmark over the period from January 2001 to December 2006. Section 3 is devoted to a more
systematic illustration of the look-ahead benchmark bias over different periods, from 1926 to
2007. The substantial difference in the performances of up to 8% between portfolios with and
without look-ahead bias provides an indication of the bias in the performance of the back-test
of an active investment strategy as it is commonly carried out. Sections 2 and 3 document that
passive strategies are higher in after cleaning the database with respect to the look-ahead bias.
Under quite general assumptions, we give in Section 4 an analytical prediction for the look-ahead
bias affecting the mean-variance investment rule which might be applied in a mutual fund. In
particular, we discuss under what conditions naive diversification would be favorable. The
same methodology can be applied to give decision support to the hedge fund manager whether
on the empirical evidence presented in sections 2 and 3 by using random strategies. Random
strategies are proposed as a simple and efficient test of the value added by a given strategy,
which take into account all possible biases, including those too difficult to address or that are
even unknown to the analyst.
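As a rough illustration of how such a random-strategy test might look, the Python sketch below draws many equally weighted portfolios of k names at random from the same universe as the strategy under test, and reports the fraction of random draws that the strategy beats. The function names, the dictionary of per-stock total returns and the choice of equal weights are assumptions made for this illustration; they are not the procedure used in the paper.

```python
import random


def random_portfolio_returns(total_returns, k, n_draws, seed=0):
    """Returns of n_draws equally weighted portfolios of k randomly picked names.

    total_returns: dict mapping a (hypothetical) stock name to its total
    return over the test period, e.g. 0.10 for +10%.
    """
    rng = random.Random(seed)
    names = list(total_returns)
    draws = []
    for _ in range(n_draws):
        picks = rng.sample(names, k)  # sampling without replacement
        draws.append(sum(total_returns[name] for name in picks) / k)
    return draws


def fraction_beaten(strategy_return, draws):
    """Fraction of random portfolios with a return below the strategy's."""
    return sum(d < strategy_return for d in draws) / len(draws)


# Usage with made-up numbers: 50 hypothetical stocks, portfolios of 5 names.
universe = {f"S{i}": i / 100 for i in range(50)}
draws = random_portfolio_returns(universe, k=5, n_draws=1000, seed=42)
print(fraction_beaten(0.30, draws))
```

Because the random portfolios are drawn from the same universe as the tested strategy, any bias in the construction of that universe affects both sides of the comparison equally.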
to December 2006 using the S&P500 as the benchmark
Let us consider a manager who wants to back-test a given trading strategy on past data, namely
on a pool of stocks such as the constituents of the S&P500 index on a given period, say January
2001 to December 2006. To do so, the natural approach would be the following:
1. Obtain the list of constituents of the S&P500 index at the present time (end of December
2. For each name (stock), retrieve the closing price time series for the given period from
January 2001 to December 2006;
3. Backtest the strategy on that data set, for instance by comparing it with the S&P500 bench-
However, doing so introduces a formidable bias, and can easily lead to erroneous conclusions.
Figure 1 dramatically illustrates the effect by comparing the performance of two investments.
In the first investment, which is subject to the look-ahead bias, we build $1 of an equally
weighted portfolio invested in the 500 stocks constituting the S&P500 index at the end of the
period (29 December 2006), and hold it from 1st January 2001 until 29th December 2006.
Meanwhile, the second investment simply consists in buying $1 of an equally weighted portfolio
invested in the 500 stocks constituting the S&P500 index at the beginning of the period (1st Jan-
uary 2001), and in holding it from 1st January 2001 until 29th December 2006. Both investments
are buy-and-hold strategies and should have yielded similar performances if the constitution of
the S&P500 index had not changed over this period.1 However, the list of constituents of the
S&P500 index is updated, usually on a monthly basis, in order to include only the largest capital-
izations of the US stock market at the current time. Consequently, this list of constituents cannot,
almost perforce, contain a stock that, for instance, crashed in the recent past. Indeed, in such a
case, the stock has a high probability of being overtaken in terms of capitalization by another stock of
the same industry branch and thus of leaving the index and being replaced by that other stock. The only
difference between the two portfolios is that the first investment uses look-ahead information, namely
it knows on 1st January 2001 what will be the list of constituents of the S&P500 index at the
end of the period (29th December 2006). This apparently innocuous look-ahead bias leads to a
huge difference in performance, as can be seen in Figure 1 and from simple statistics: the first
(respectively second) investment has an annual average compounded return of 6.4% (resp. 2.3%)
and a Sharpe ratio (non-adjusted for risk-free rate) of 0.4 (resp. 0.1). The first investment has
a significantly better return and, more importantly, it exhibits larger risk-adjusted returns.
1Since the S&P500 index is *not* equally weighted, we should expect a slight discrepancy between the evolution
of portfolio 2 and the actual index.
For reference, we also plot the historical value of the actual S&P500 index, normalised to 1 on
1st January 2001. Note that its performance is slightly worse than that of the second investment
discussed above. This could be due to the different weighting and also to the effect reported by
Cai and Houge (2007)2.
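The comparison between the two buy-and-hold investments can be sketched in Python on synthetic data. The sketch rests on several assumptions that are not part of the paper: capitalizations follow independent random walks, the capitalization path doubles as the price path (no dividends, share issuance or delistings), the "index" is simply the top k names by capitalization, and the Sharpe ratio is the mean of monthly returns over their standard deviation, annualized by the square root of 12 and not adjusted for the risk-free rate. All function names are hypothetical.

```python
import math
import random
import statistics


def simulate_caps(n_stocks, n_months, seed=0):
    """Synthetic capitalization paths, one list per hypothetical stock."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_stocks):
        cap = rng.lognormvariate(0.0, 1.0)  # initial capitalization
        path = [cap]
        for _ in range(n_months):
            cap *= math.exp(rng.gauss(0.0, 0.08))  # monthly log-return
            path.append(cap)
        paths.append(path)
    return paths


def top_k(paths, t, k):
    """Indices of the k largest capitalizations at time step t."""
    order = sorted(range(len(paths)), key=lambda i: paths[i][t], reverse=True)
    return order[:k]


def buy_and_hold(paths, members):
    """Value of $1 split equally across members at t=0, never rebalanced."""
    n_steps = len(paths[0])
    return [
        sum(paths[i][t] / paths[i][0] for i in members) / len(members)
        for t in range(n_steps)
    ]


def annual_compounded_return(values, periods_per_year=12):
    years = (len(values) - 1) / periods_per_year
    return (values[-1] / values[0]) ** (1.0 / years) - 1.0


def sharpe_ratio(values, periods_per_year=12):
    rets = [values[t + 1] / values[t] - 1.0 for t in range(len(values) - 1)]
    return statistics.mean(rets) / statistics.stdev(rets) * math.sqrt(periods_per_year)


paths = simulate_caps(n_stocks=1000, n_months=72, seed=1)
ex_ante = buy_and_hold(paths, top_k(paths, 0, 100))   # members known at the start
ex_post = buy_and_hold(paths, top_k(paths, 72, 100))  # members known only at the end
for label, values in (("ex-ante", ex_ante), ("ex-post", ex_post)):
    print(label, annual_compounded_return(values), sharpe_ratio(values))
```

Even though every synthetic stock has the same expected return, the portfolio selected at the end of the period is conditioned on having grown, so its measured return exceeds that of the portfolio selected at the start.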
Many managers would have been happier to report Sharpe ratios in the range obtained for
the first investment, especially over this turbulent time period. Investment strategies exhibiting
this kind of performance would fuel interpretations that this is evidence of a departure from the
efficient market hypothesis and/or of the existence of arbitrage opportunities. On the other hand,
other pundits would observe that this look-ahead bias is really obvious, so that no one would
fall into such a trivial trap. This quite reasonable assessment actually collides with one simple
but often overlooked operational limitation of back-tests3: changes in the constitution of financial
indices are not recorded in most standard professional databases, such as Bloomberg, Reuters,
Datastream or Yahoo! Finance. As the standard goal for investment managers is at least to
emulate or better to beat some index of reference, back-tests on comparative investments should
use a defined set of assets on which to invest, which is defined at the beginning of the period.
However, because the list of constituents of the indices is unexpectedly challenging to retrieve4,
it is common practice to use the set of assets constituting the benchmarks of reference at the
present time, rather than at the beginning of the period. Then, necessarily, the kind of look-ahead
bias that we report here will automatically pollute the conclusions, with sometimes dramatic
consequences, as illustrated in Figure 1. We refer to this as the “look-ahead benchmark bias.”
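Avoiding the bias requires a point-in-time record of index membership. A minimal Python sketch of such a record follows, assuming the data are available as (name, date added, date removed) tuples. The tickers, dates and function name are made up for illustration and do not correspond to any real index or vendor format.

```python
from datetime import date

# Hypothetical membership records:
# (name, date added, date removed or None if still a member).
MEMBERSHIP = [
    ("AAA", date(1995, 3, 1), None),
    ("BBB", date(1998, 6, 1), date(2003, 9, 1)),
    ("CCC", date(2003, 9, 1), None),
]


def constituents_at(membership, as_of):
    """Names that belong to the index on the date as_of."""
    return sorted(
        name
        for name, added, removed in membership
        if added <= as_of and (removed is None or as_of < removed)
    )


# The back-test universe should be fixed at the start of the period ...
print(constituents_at(MEMBERSHIP, date(2001, 1, 1)))
# ... and not taken from the list observed at the end of the period.
print(constituents_at(MEMBERSHIP, date(2006, 12, 29)))
```

In this toy record, a back-test built on the end-of-period list would silently drop BBB, the name that left the index, and include CCC, the name that replaced it.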
A part of the over-performance of the ex-post portfolio over the S&P500 index can be attributed
to the fact that the former is equally-weighted while the latter is value-weighted. But
this does not explain away the look-ahead effect as shown by the large difference between the
equally-weighted ex-post and ex-ante portfolios. For instance, consider the DJIA. While the
mean return of the price-weighted DJIA index from January 2001 to September 2007 was
slightly below (3.2% p.a.) that of the price-weighted ex-post portfolio (3.8% p.a.), the difference
is much larger for the period from February 1973 to September 2007: 5.7% p.a. versus 7.8% p.a.
2The look-ahead benchmark bias documented here is related to the work of Cai and Houge (2007) who study
how additions and deletions affect benchmark performance. Studying changes to the small-cap Russell 2000 index
from 1979 to 2004, Cai and Houge (2007) find that a buy-and-hold portfolio significantly outperforms the annually
rebalanced index by an average of 2.2% over one year and by 17.3% over five years. These excess returns result
from strong positive momentum of index deletions and poor long-run returns of new issue additions.
3Since the benchmark is observed continuously, real-time assessment of performance does not of course suffer
from this problem. We only refer to back-testing which uses a recorded time series of the benchmark and present
knowledge of its constituents.
4Standard & Poor’s themselves provide the list of constituents of the S&P500 index only since January 2000,
while scripting Reuters, Bloomberg and Datastream returned only incomplete results. In fact, it appears that both
the CRSP and Compustat databases are necessary to retrieve the list of constituents of the S&P500 index at any
given point in time, and these databases are usually not accessible to practitioners.