All content in this area was uploaded by Vitor G. Azevedo on Sep 06, 2018

Earnings Forecasts: The Case for Combining Analysts’ Estimates with a Mechanical Model*

Vitor G. Azevedoᵃ, Patrick Bielsteinᵇ, Manuel Gerhartᵃ

ᵃ Department of Financial Management and Capital Markets, TUM School of Management, Technical University of Munich, Arcisstr. 21, 80333 Munich, Germany

ᵇ EDHEC-Risk Institute, Scientific Beta, 10 Fleet Place, London EC4M 7RB, United Kingdom

Abstract

We propose a novel method to forecast corporate earnings, which combines the accuracy of analysts’ forecasts with the unbiasedness of a mechanical model. We build on recent insights from the earnings forecasts literature to select variables that have predictive power with respect to earnings. Our model outperforms the most popular methods from the literature in terms of forecast accuracy, bias, and earnings response coefficient. Furthermore, using our estimates in the implied cost of capital calculation leads to a substantially stronger correlation with realized returns compared to extant mechanical earnings estimates.

Keywords: Earnings forecasts; analysts’ forecasts; forecast evaluation; implied cost of capital; expected returns.

JEL classifications: G12, G29, M41.

* Working Paper; this version: September 6, 2018. We thank Tobias Berg, Daniel Bias, Jürgen Ernstberger, Robert Heigermoser, Christoph Kaserer, Lisa Knauer, Jake Thomas, and participants at the Paris Financial Management Conference 2017, European Accounting Association 2018, and TUM School of Management, Finance Department, summer workshop 2016 for insightful discussions and helpful comments. Part of this research was undertaken while Patrick was visiting INSEAD. The authors also thank the National Council for Scientific and Technological Development (CNPq) and the Science without Borders Program for financial support, which included a research scholarship. Disclosures: Patrick works for EDHEC-Risk Institute (ERI) Scientific Beta, a smart beta index provider. The views expressed in this paper are those of the author and do not necessarily reflect or represent those of ERI Scientific Beta. Contacts: vitor.azevedo@tum.de (Vitor G. Azevedo). Phone: +49 (0)89 289 25179.

1. Introduction

Earnings forecasts are a critical input in many academic studies in finance and accounting as well as in practical applications. They are central to firm valuation, are widely used in asset allocation decisions, and are the basis for accounting-based cost of capital calculations such as the Implied Cost of Capital (ICC). It is, therefore, crucial to have precise and unbiased estimates.

The most popular source of earnings forecasts is financial analysts. Their forecasts are aggregated by data providers, such as the Institutional Brokers’ Estimate System (I/B/E/S), which make them available to academics and practitioners. Although analysts’ forecasts are fairly accurate (O’Brien, 1988; Hou et al., 2012), researchers have found a significant optimism bias (Francis and Philbrick, 1993; McNichols and O’Brien, 1997; Easton and Sommers, 2007).

The alternative to analysts’ earnings forecasts is a mechanical model, which can be based either solely on past realizations of earnings (time-series models) or on a combination of past earnings and other financial variables. The literature first developed time-series models. These models use past realizations of earnings in a linear or an exponential smoothing framework (Ball and Brown, 1968; Brown et al., 1987). The results are underwhelming; these forecasts are neither accurate nor unbiased. In addition, they suffer from survivorship bias as only firms with a long history of earnings can be included in the model. Fried and Givoly (1982) conclude that time-series models are worse than analysts’ forecasts for predicting future earnings. This result was later confirmed by O’Brien (1988).

Recently, cross-sectional models to forecast earnings have proliferated. Fama and French (2006) create one of the first cross-sectional models that predict future profitability and show that earnings as an independent variable are highly persistent in forecasting profitability. Hou et al. (2012) develop a cross-sectional model (henceforth HVZ model) based on assets, earnings, and dividends, which outperforms analysts’ forecasts in terms of coverage, Earnings Response Coefficients (ERC),¹ and forecast bias² but still trails analysts’ forecasts with respect to forecast accuracy.³ Gerakos and Gramacy (2013) find that a simple Random Walk (RW) model, in which the previous period’s value is used as a forecast, performs as well as other, more sophisticated, earnings forecast models. Finally, Li and Mohanram (2014) implement an Earnings Persistence (EP) and a Residual Income (RI) model to forecast earnings. They show that these models are superior to the HVZ and RW models in terms of bias, accuracy, and ERC.

¹ The ERC estimates the relationship between earnings surprises and stock returns.

² Bias is defined as the difference between actual earnings and the earnings forecast.

³ Accuracy is defined as the absolute value of the forecast error.
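The bias and accuracy measures defined in the footnotes can be written out directly. The following is a minimal sketch with hypothetical numbers; the function name `forecast_errors` is an illustrative assumption, and any scaling of the errors (for example by firm size) is left out because it is not described in this section.

```python
from statistics import mean


def forecast_errors(actual, forecast):
    """Return per-observation (bias, accuracy) lists.

    bias     = actual earnings - earnings forecast
    accuracy = absolute value of the forecast error
    """
    bias = [a - f for a, f in zip(actual, forecast)]
    accuracy = [abs(b) for b in bias]
    return bias, accuracy


# Hypothetical realized earnings and forecasts (millions of dollars).
actual = [100.0, 50.0, -20.0]
forecast = [110.0, 45.0, -10.0]

bias, accuracy = forecast_errors(actual, forecast)
mean_bias = mean(bias)          # a negative mean indicates optimistic forecasts
mean_accuracy = mean(accuracy)  # a smaller value indicates more accurate forecasts
```

Under this sign convention, an optimistic forecaster produces a negative average bias, while accuracy ignores the sign of the error.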

More recently, Ball and Ghysels (2017) develop a model based on mixed data sampling regression methods (MIDAS), which combines various high-frequency time-series data to forecast earnings. Their model outperforms raw analysts’ forecasts in some cases and can also be combined with analysts’ forecasts to improve forecast accuracy. The findings from Ball and Ghysels (2017) tie in with ours as they show that mechanical models can be used to improve earnings forecasts. One significant difference from our study is that the model from Ball and Ghysels (2017) is not suited to estimating the ICC, as its focus is on short-term forecast horizons (next quarter), whereas the ICC also requires medium- and long-term forecasts (up to five years in the future). We provide empirical evidence on the advantages of combining analysts’ forecasts with a regression-based model for longer-term earnings forecasts.

In summary, existing studies show that analysts’ earnings forecasts are more accurate than mechanical earnings forecasts but do less well in terms of bias and ERC. In addition, analysts’ forecasts have two important shortcomings: sluggishness⁴ and poor long-term estimates.⁵

This study proposes a parsimonious cross-sectional regression model consisting of analysts’ earnings forecasts, gross profits, and past stock performance. The inclusion of analysts’ forecasts aims to improve forecast accuracy, in particular of short-term forecasts, as analysts have a timing and information advantage over forecasts based solely on accounting data (Ball and Ghysels, 2017). Including gross profits is motivated by findings from Novy-Marx (2013), which suggest that gross profits predict future earnings. Furthermore, Novy-Marx (2013) shows that gross profits explain many earnings-related asset pricing anomalies, such as return on assets, earnings-to-price, asset turnover, gross margins, and standardized unexpected earnings. It is intuitive that stock returns also contain information regarding future earnings. Indeed, Richardson et al. (2010) and Ashton and Wang (2012) find that changes in stock prices predict future earnings, and Abarbanell (1991) shows that stock returns are related to future earnings forecast revisions. Including past stock returns in our combined model has a further advantage as this variable mitigates the effect of sluggish analysts’ forecasts (Guay et al., 2011). We term our method the combined model (CM), as it combines analysts’ forecasts with a cross-sectional method.

⁴ Guay et al. (2011) show that analysts, on average, are slower than the stock market in processing new, earnings-related information. They suggest using short-term stock returns to mitigate the effect of sluggish analysts’ forecasts.

⁵ For instance, Bradshaw (2012) report that even when analysts have timing and information advantages, analysts’ forecasts of future earnings are not consistently more accurate than mechanical models for longer forecast horizons.

We compare our combined model to the most popular methods in the literature, namely raw analysts’ forecasts and the RW, EP, RI, and HVZ models. To isolate the value of analysts’ forecasts within the CM, we also estimate a cross-sectional analysts’ forecasts (CSAF) model.⁶ We show that the combined model delivers earnings forecasts that are slightly more accurate than analysts’ forecasts and markedly more accurate than the mechanical models, while beating all other tested methods in terms of bias and ERC. The CSAF model underperforms not only the combined model but also the raw analysts’ forecasts in terms of bias and accuracy. This suggests that using analysts’ forecasts in a mechanical model is sufficient neither to improve the accuracy of the forecasts nor to decrease bias. However, the fact that the combined model outperforms all of the analyzed models, including the CSAF, shows that the variables gross profits and past stock performance substantially improve earnings forecasts.

One important application of earnings forecasts is to estimate a firm’s cost of capital, in particular, the ICC. To further evaluate the earnings forecasts from our tested models, we use them as inputs in computing the ICC. We find that many of our benchmark models produce ICC estimates that have a negative and significant relation to gross profits. This evidence conflicts with Novy-Marx (2013) and Fama and French (2015), who derive theoretically and show empirically that firms with high gross profitability should have higher expected returns. In contrast, the ICC based on our combined model shows a positive and significant relation, in line with the theoretical derivation. In addition, the ICC based on the combined model displays a stronger association with ex-post realized returns in both dimensions (cross-sectional and time-series) than the ICC based on the other benchmark models. A long-short strategy of buying the highest ICC decile and short-selling the lowest ICC decile, based on the ICC estimated with the combined model, yields a significant average annual return of up to 6.65%.

⁶ This model is based on a cross-sectional regression using only analysts’ earnings forecasts as an input.

This study contributes to the finance and accounting literature in several ways. First, we document that combining analysts’ earnings forecasts with a regression-based model leads to more accurate and less biased estimates than each of the components alone. It takes advantage of each method’s favorable characteristics while mitigating their shortcomings. The CM also outperforms the most popular models from the literature in all three dimensions analyzed: bias, accuracy, and ERC. Second, we analyze one application of earnings forecasts, the estimation of the ICC, and show that using earnings forecasts from the CM leads to an increase in the relation between ICC and future returns cross-sectionally. Thus, one major criticism of the ICC, namely the weak correlation between ICC estimates and realized returns,⁷ is attenuated by using more accurate earnings forecasts. The improvement in earnings forecast quality is also economically meaningful as long/short portfolios constructed using the ICC based on earnings forecasts from the CM have significant excess portfolio returns.

Third, we provide evidence that using analysts’ forecasts in a cross-sectional forecast model is sufficient neither to remove the optimistic bias nor to improve the accuracy of forecasts. The cross-sectional models use in-sample coefficients to predict earnings out-of-sample, and this approach seems to introduce a large amount of noise in the out-of-sample estimates, in particular for long-term estimates. The results show that the longer the time horizon is, the worse the CSAF model performs in comparison with the raw analysts’ forecasts.

The paper is organized as follows. In Section 2, we describe our sample selection and the cross-sectional models and provide details on the ICC estimation. In Section 3, we compare the performance of earnings forecast proxies in terms of bias, accuracy, and ERC. In Section 4, we evaluate the performance of ICC estimates calculated using different methods to forecast earnings. We conclude in Section 6.

⁷ Easton and Monahan (2005) analyze the cross-sectional correlation between returns and different ICC approaches and find that none of the ICC estimates has a positive association with returns. The authors conclude that the ICC estimates are unreliable for the entire cross-section of firms.

2. Data and methodology

2.1. Sample selection

We select firms at the intersection of the Center for Research in Security Prices (CRSP), Compustat fundamentals annual, and I/B/E/S summary files. We filter for firms listed on NYSE, AMEX, and NASDAQ with share codes 10 and 11. Our sample starts in June 1977, as this is the first year for which I/B/E/S provides analysts’ forecasts, and ends in June 2015. We require at least five years of data for the 10-year pooled regressions of the cross-sectional forecast models. To evaluate the earnings forecasts, we use data from the year after the forecast was made. Therefore, our forecasts cover the period from 1982 to 2014. We require non-missing one- and two-year-ahead earnings forecasts, price, and shares outstanding from I/B/E/S and book values, earnings, and dividends from Compustat to include a firm-year in the sample. Our proxy for the risk-free rate is the yield on the U.S. 10-year government bond, which we obtain from Thomson Reuters Datastream. We use the following variables from Compustat: income before extraordinary items (Compustat IB), gross profits (Compustat items: REVT − COGS), total assets (Compustat AT), dividends (Compustat DVC), book value (Compustat CEQ), book value of debt (Compustat items: DLC + DLTT), and capital expenditures (Compustat CAPX).

2.2. Earnings Forecasts

We develop a model that combines analysts’ earnings forecasts with a cross-sectional regression model to forecast earnings. We benchmark this approach against popular methods from the literature, namely raw analysts’ forecasts, the RW model,⁸ and four cross-sectional models: the CSAF, Hou et al. (2012) (HVZ),⁹ EP, and RI models.¹⁰ Although one of the benefits of cross-sectional models is wider coverage, as accounting variables are usually more widely available than analysts’ forecasts, Li and Mohanram (2014) show that cross-sectional earnings forecasts in the sample without I/B/E/S coverage are substantially more inaccurate and biased than in the sample with I/B/E/S coverage. This is intuitive as firms without analyst coverage tend to be smaller firms with a poorer information environment (Hou et al., 2012), which makes it more difficult to forecast earnings mechanically. In addition, the sample covered by I/B/E/S “represents 90 percent or more of the total market capitalization”¹¹ of all firms on NYSE, AMEX, and Nasdaq.

⁸ We include the RW based on evidence that at a one-year horizon, the RW model performs as well as more sophisticated estimation methods (Gerakos and Gramacy, 2013).

⁹ According to Hou et al. (2012), their cross-sectional model is superior to analysts’ forecasts in terms of forecast bias and ERC.

¹⁰ We include the Earnings Persistence and Residual Income models as benchmarks due to evidence from Li and Mohanram (2014) that these models outperform the HVZ model in terms of forecast bias, accuracy, earnings response coefficient, and correlation of ICCs with future earnings and risk factors.

We obtain analysts’ forecasts and share prices from I/B/E/S as of June for each year in the sample period. To compare analysts’ forecasts to the above-mentioned models, we transform analysts’ estimates from a per-share level to a dollar level by multiplying the per-share figures by the number of shares outstanding provided by I/B/E/S. For the RW model, following Gerakos and Gramacy (2013), we use income before extraordinary items from year $t$ as the earnings forecast for year $t+\tau$ (with $\tau = 1$ to 3).

We follow the approach of Hou et al. (2012) when estimating the cross-sectional regressions. First, we run a rolling-window pooled regression (in-sample) using the previous ten years of data (see Equation 1). We regress the dependent variable, earnings $E_{(i,t)}$ for firm $i$ in year $t$, on the independent variables $x_1, x_2, \dots, x_n$ for firm $i$ in the relevant year $t-\tau$ (with $\tau = 1$ to 3). $\epsilon_{(i,t)}$ is the error term for period $t$. We perform the regression at the dollar level with unscaled data.¹²

$$E_{(i,t)} = \alpha_0 + \alpha_1 x_{1(i,t-\tau)} + \alpha_2 x_{2(i,t-\tau)} + \cdots + \alpha_n x_{n(i,t-\tau)} + \epsilon_{(i,t)}. \tag{1}$$

Second, we forecast earnings $\tilde{E}_{(i,t+\tau)}$ (out-of-sample) for year $t+\tau$ (see Equation 2). We obtain the forecast by multiplying the independent variables for each firm $i$ in year $t$ by the coefficients $\alpha_0, \alpha_1, \alpha_2, \dots, \alpha_n$ from the pooled regression in Equation 1. The advantage of this approach is that there are no strict survivorship requirements, as we require firms only to have sufficient accounting data for year $t$ to forecast earnings.

$$\tilde{E}_{(i,t+\tau)} = \alpha_0 + \alpha_1 x_{1(i,t)} + \alpha_2 x_{2(i,t)} + \cdots + \alpha_n x_{n(i,t)}. \tag{2}$$

Consider the following example. Assume that 2010 is year $t$ and we want to forecast the earnings for 2011 ($t+\tau$ with $\tau = 1$). First, we run a pooled regression of the dependent variable for the period 2001–2010 (from year $t-9$ to year $t$) on the independent variables for the period 2000–2009 (from year $t-9-\tau$ to year $t-\tau$, with $\tau = 1$) and store the regression coefficients. Then, we multiply these coefficients $\alpha_0, \alpha_1, \alpha_2, \dots, \alpha_n$ by the independent variables $x_1, x_2, \dots, x_n$ from 2010 (year $t$) to estimate the earnings for 2011 (year $t+\tau$ with $\tau = 1$).

¹¹ Claus and Thomas (2001, p. 1638).

¹² We estimate our results at the dollar level instead of the per-share level because we apply the earnings forecasts to ICC approaches that rely on the clean surplus relation, and this assumption is even more critical at the per-share level. “Per share clean surplus relation does not hold if a firm issues/buys shares insofar the transaction changes bvps [book value per share]” (Ohlson, 2005, p. 327).
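The two-step procedure in the example above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the dictionary-based data layout, the function names `fit_pooled_ols` and `forecast_earnings`, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_pooled_ols(X, y):
    """Pooled OLS with an intercept; returns [alpha_0, alpha_1, ..., alpha_n]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def forecast_earnings(x, e, t, tau=1, window=10):
    """Forecast earnings for year t + tau.

    x: dict mapping (firm, year) -> 1-D array of regressors (hypothetical layout)
    e: dict mapping (firm, year) -> realized earnings
    """
    rows, targets = [], []
    for (firm, year), earnings in e.items():
        # In-sample step: earnings of years t-window+1 .. t on regressors lagged by tau.
        if t - window + 1 <= year <= t and (firm, year - tau) in x:
            rows.append(x[(firm, year - tau)])
            targets.append(earnings)
    alpha = fit_pooled_ols(np.array(rows), np.array(targets))
    # Out-of-sample step: apply the stored coefficients to the year-t regressors.
    return {
        firm: float(alpha[0] + alpha[1:] @ xv)
        for (firm, year), xv in x.items()
        if year == t
    }


# Synthetic panel: 5 firms, one regressor, earnings = 2 + 0.5 * lagged regressor.
rng = np.random.default_rng(0)
x = {(i, yr): rng.normal(size=1) for i in range(5) for yr in range(2000, 2011)}
e = {(i, yr): 2.0 + 0.5 * x[(i, yr - 1)][0]
     for i in range(5) for yr in range(2001, 2011)}

# Regress 2001-2010 earnings on 2000-2009 regressors, then forecast 2011 from 2010 data.
forecasts_2011 = forecast_earnings(x, e, t=2010, tau=1)
```

Because a firm only needs regressors for year $t$ to receive a forecast, the sketch also reflects the weak survivorship requirement of the approach.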

We forecast earnings in June of each year $t$. We take care to avoid the use of data that was not publicly available at the estimation dates. To this end, we collect accounting data only for companies with a fiscal year-end between April of year $t-1$ and March of year $t$. To mitigate the influence of outliers, we winsorize earnings and other level variables each year at the first and last percentile as in Hou et al. (2012) and Li and Mohanram (2014).
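Yearly winsorization at the first and last percentile can be sketched as follows. The function name `winsorize_by_year` and the array-based inputs are assumptions for illustration; the paper does not specify the percentile interpolation, so NumPy's default (linear) is used here.

```python
import numpy as np


def winsorize_by_year(values, years, lower=1.0, upper=99.0):
    """Clip values to the [lower, upper] percentile range within each year."""
    values = np.asarray(values, dtype=float)
    years = np.asarray(years)
    out = values.copy()
    for yr in np.unique(years):
        mask = years == yr
        lo, hi = np.percentile(values[mask], [lower, upper])
        out[mask] = np.clip(values[mask], lo, hi)
    return out


# Hypothetical earnings for two years with very different levels.
earnings = list(range(101)) + list(range(1000, 1101))
fiscal_years = [2010] * 101 + [2011] * 101
winsorized = winsorize_by_year(earnings, fiscal_years)
```

Computing the cut-offs separately for each year keeps a year with unusually high or low earnings levels from determining the bounds applied to other years.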

Note that when evaluating forecast bias, accuracy, and ERC, the researcher has to ensure that the definitions of earnings forecasts and realized earnings are aligned. More specifically, analysts typically forecast street earnings, which differ from earnings according to the Generally Accepted Accounting Principles (GAAP) in significant respects (Bradshaw and Sloan, 2002). To account for this difference, we compare analysts’ forecasts and the combined model forecasts to realized street earnings. For the other models (HVZ, RI, EP, RW), we make the comparison based on realized income before extraordinary items, which is based on GAAP. This distinction is also made in other papers (e.g., Hou et al. (2012)). Furthermore, to ensure a fair comparison among the models, the sample of earnings forecast models is restricted to firm-year observations for which analysts’ forecasts are available.

2.2.1. Combined model

The combined model aims to take advantage of the high accuracy of analysts’ forecasts, while incorporating the low bias of the cross-sectional models. To include analysts’ forecasts, we use the last available forecast from I/B/E/S. Our cross-sectional model is a parsimonious approach that includes gross profits and two variables related to past stock returns. Our use of gross profits is motivated by findings from Novy-Marx (2013), who shows that this variable explains most earnings-related anomalies and a wide range of seemingly unrelated profitable trading strategies. We include two variables related to past stock returns because Ashton and Wang (2012) and Richardson et al. (2010) show that price changes drive earnings. The model is presented in Equation 3:

$$E_{(i,t)} = \alpha_0 + \alpha_1 \mathit{eIBES1}_{(i,t-\tau)} + \alpha_2 \mathit{GP}_{(i,t-\tau)} + \alpha_3 \mathit{r10}_{(i,t-\tau)} + \alpha_4 \mathit{r122}_{(i,t-\tau)} + \epsilon_{(i,t)}, \tag{3}$$

where $E_{(i,t)}$ represents the street earnings of firm $i$ in year $t$, $\mathit{eIBES1}_{(i,t-\tau)}$ (with $\tau = 1$ to 3) is the I/B/E/S one-year-ahead earnings forecast, $\mathit{GP}_{(i,t-\tau)}$ is gross profits, $\mathit{r10}_{(i,t-\tau)}$¹³ is the change in market capitalization over the preceding month, and $\mathit{r122}_{(i,t-\tau)}$¹⁴ is the change in market capitalization from $t-12$ to $t-2$ months. As the regression is carried out at the dollar level, the I/B/E/S one-year-ahead earnings per share forecast, as well as the realized street earnings per share, are multiplied by the number of shares provided by I/B/E/S.
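The two return variables are dollar changes in market capitalization, which the paper's footnotes describe as market equity at the start of the window multiplied by the total return over the window. A minimal sketch of that construction, assuming month-end series ordered oldest to newest with the last element being the forecast month (the function name `dollar_return` and the input numbers are hypothetical):

```python
def dollar_return(market_equity_start, monthly_total_returns):
    """Start-of-window market equity times the compounded total return."""
    gross = 1.0
    for r in monthly_total_returns:
        gross *= 1.0 + r
    return market_equity_start * (gross - 1.0)


# Hypothetical inputs: 13 month-end market equity values and the 12 monthly
# total returns (including dividends) between them.
market_equity = [100.0] * 13   # market_equity[-1] is the forecast month
total_returns = [0.01] * 12    # total_returns[-1] is the return over the last month

# r10: market equity one month ago times the return over the last month.
r10 = dollar_return(market_equity[-2], total_returns[-1:])

# r122: market equity twelve months ago times the return from month -12 to month -2.
r122 = dollar_return(market_equity[-13], total_returns[-12:-2])
```

Skipping the most recent month in `r122` keeps the two variables from overlapping, so each coefficient in Equation 3 picks up a distinct return window.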

2.2.2. Cross-Sectional Analysts’ Forecasts

To show that the combined model benefits from the combination of analysts’ forecasts with a mechanical model (and that neither of its components alone drives the strong forecast performance), we include a model that uses analysts’ forecasts in a cross-sectional regression. We estimate the cross-sectional analysts’ forecasts (CSAF) model with Equation 4:

$$E_{(i,t)} = \alpha_0 + \alpha_1 \mathit{eIBES1}_{(i,t-\tau)} + \epsilon_{(i,t)}, \tag{4}$$

where $E_{(i,t)}$ represents the street earnings of firm $i$ in year $t$ and $\mathit{eIBES1}_{(i,t-\tau)}$ (with $\tau = 1$ to 3) are the I/B/E/S one-year-ahead earnings forecasts. This regression is carried out at the dollar level.¹⁵

¹³ We estimate $\mathit{r10}_{(i,t-\tau)}$ by multiplying market equity of month $t-1-\tau$ by the total return (including dividends) from month $t-1-\tau$ to $t-\tau$.

¹⁴ We estimate $\mathit{r122}_{(i,t-\tau)}$ by multiplying market equity of month $t-12-\tau$ by the total return (including dividends) from month $t-12-\tau$ to $t-2-\tau$.

¹⁵ In unreported tests we confirmed that the main results still hold if we perform the regression at the per-share level.

2.2.3. The HVZ Model

We estimate the Hou et al. (2012) model with Equation 5:

$$E_{(i,t)} = \alpha_0 + \alpha_1 A_{(i,t-\tau)} + \alpha_2 D_{(i,t-\tau)} + \alpha_3 \mathit{DD}_{(i,t-\tau)} + \alpha_4 E_{(i,t-\tau)} + \alpha_5 \mathit{NegE}_{(i,t-\tau)} + \alpha_6 \mathit{Ac}_{(i,t-\tau)} + \epsilon_{(i,t)}, \tag{5}$$

where $E_{(i,t)}$ represents income before extraordinary items of firm $i$ in year $t$, $A_{(i,t-\tau)}$ represents total assets in year $t-\tau$ (with $\tau = 1$ to 3), $D_{(i,t-\tau)}$ denotes dividends paid by firm $i$ in year $t-\tau$, $\mathit{DD}_{(i,t-\tau)}$ is a dummy variable that equals 1 if firm $i$ paid a dividend in year $t-\tau$ and 0 otherwise, $\mathit{NegE}_{(i,t-\tau)}$ is a dummy variable that is set to 1 if company $i$ reported negative earnings and 0 otherwise, and $\mathit{Ac}_{(i,t-\tau)}$ is accruals for firm $i$ in year $t-\tau$. Until 1987, we estimate accruals as the change in non-cash current assets less the change in current liabilities (excluding the change in short-term debt and the change in taxes payable), minus depreciation and amortization expenses (Compustat items: (ACT − CHE) − (LCT − DLC − TXP) − DP). Starting in 1988, we estimate accruals as the difference between earnings and cash flows from operations (Compustat items: IB − (OANCF − XIDOC)).

2.2.4. The Earnings Persistence Model

The Earnings Persistence (EP) model according to Li and Mohanram (2014) is specified as:

$$E_{(i,t)} = \alpha_0 + \alpha_1 \mathit{NegE}_{(i,t-\tau)} + \alpha_2 E_{(i,t-\tau)} + \alpha_3 \mathit{NegE} \ast E_{(i,t-\tau)} + \epsilon_{(i,t)}, \tag{6}$$

where $E_{(i,t)}$ represents income before extraordinary items for firm $i$ in year $t$,¹⁶ $\mathit{NegE}_{(i,t-\tau)}$ is a dummy variable that is set to 1 if company $i$ reported negative earnings and 0 otherwise, and $\mathit{NegE} \ast E_{(i,t-\tau)}$ is the interaction term between the negative earnings dummy variable and earnings.

¹⁶ Like Hou et al. (2012), we use income before extraordinary items as a proxy for earnings forecasts. We use the same proxy for the benchmark models in order to make the comparison consistent. The results are robust to using income before special and extraordinary items as proposed by Li and Mohanram (2014).

2.2.5. The Residual Income Model

The Residual Income (RI) model was introduced by Edwards and Bell (1961) and Feltham and Ohlson (1996). The model was subsequently adjusted by Li and Mohanram (2014) to forecast earnings. We estimate the model according to Equation 7:

$$E_{(i,t)} = \alpha_0 + \alpha_1 \mathit{NegE}_{(i,t-\tau)} + \alpha_2 E_{(i,t-\tau)} + \alpha_3 \mathit{NegE} \ast E_{(i,t-\tau)} + \alpha_4 B_{(i,t-\tau)} + \alpha_5 \mathit{TACC}_{(i,t-\tau)} + \epsilon_{(i,t)}, \tag{7}$$

where $E_{(i,t)}$ represents income before extraordinary items for firm $i$ in year $t$, $\mathit{NegE}_{(i,t-\tau)}$ is a dummy variable that is set to 1 if company $i$ reported negative earnings and 0 otherwise, $\mathit{NegE} \ast E_{(i,t-\tau)}$ is the interaction term between the negative earnings dummy variable and earnings, $B_{(i,t-\tau)}$ denotes book value for firm $i$ in year $t-\tau$ (with $\tau = 1$ to 3), and $\mathit{TACC}_{(i,t-\tau)}$ is total accruals for firm $i$ in year $t-\tau$. Total accruals are based on Richardson et al. (2005), calculated as the sum of the change in net working capital (Compustat items: (ACT − CHE) − (LCT − DLC)), the change in net non-current operating assets (Compustat items: (AT − ACT − IVAO) − (LT − LCT − DLTT)), and the change in net financial assets (Compustat items: (IVST + IVAO) − (DLTT + DLC + PSTK)).

2.3. Estimating the ICC

The ICC is defined as the discount rate that equates a stock’s current price to the present value of its expected future free cash flows to equity. The cash flows are estimated using earnings forecasts and expected growth in earnings. There are many different approaches to estimating the ICC in the literature, so for the purpose of our tests, we choose four common methods. We implement two methods that are based on a residual income model, namely Gebhardt et al. (2001) (GLS) and Claus and Thomas (2001) (CT).¹⁷ In addition, we employ two methods that are based on an abnormal earnings growth model, namely Ohlson and Juettner-Nauroth (2005) (OJ) and Easton (2004) (modified price-earnings growth or MPEG). Last, we estimate a composite ICC, which is the average of the four above-mentioned approaches. To maximize the coverage of the composite ICC, we only require a firm to have at least one non-missing individual ICC estimate (as in Hou et al. (2012)).

¹⁷ Although the CT and GLS approaches are both based on a residual income valuation model, the methods have an important difference. While the CT model is designed to compute the market-level cost of capital, the GLS model is made to compute the firm-level cost of capital.

For the ICC calculation, we require each firm to have a one-year-ahead, a two-year-ahead, and a three-year-ahead earnings forecast. If the three-year-ahead forecast is not available, we estimate it by multiplying the two-year-ahead mean earnings forecast by one plus the consensus long-term growth rate. If neither the three-year-ahead earnings forecast nor the long-term growth rate is available, we compute the growth rate between the one-year-ahead and two-year-ahead earnings forecasts and use this to estimate the three-year-ahead earnings forecast. Following Hou et al. (2012), we assume that the annual report becomes publicly available at the latest 90 days after the fiscal year-end. Like Gebhardt et al. (2001), we create a synthetic book value when this information is not yet public. Specifically, we estimate the synthetic book value using book value data for year $t-1$ plus earnings minus dividends ($B_t = B_{t-1} + \mathit{EPS}_t - D_t$). Regarding the payout ratio, we use the current payout ratio for firms with positive earnings. Like Gebhardt et al. (2001), for firms with negative earnings, we compute the payout ratio as the ratio between dividends and 6% of total assets. For the residual income models, we estimate the book value in year¹⁸ $t+\tau$ using the clean surplus relation $B_{(t+\tau)} = B_{(t+\tau-1)} + \mathit{EPS}_{(t+\tau)} \ast (1 - \mathit{PayoutR})$. We set the payout ratio to zero when $\mathit{EPS}_{(t+\tau)}$ is negative to avoid economically questionable negative dividends. Furthermore, we exclude all observations with negative book value per share. Following Pástor et al. (2008), we winsorize growth rates below 2% and above 100%. See Appendix A for a detailed description of the ICC methodologies.
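The input rules described in this subsection can be sketched as small helper functions. All function names and numbers below are hypothetical, and returning `None` when the one-year-ahead forecast is non-positive is an assumption of this sketch (the text does not say how that case is handled). The actual ICC solvers are described in Appendix A and are not reproduced here.

```python
def payout_ratio(dividends, earnings, total_assets):
    """Current payout ratio; for negative earnings, dividends over 6% of total assets."""
    if earnings > 0:
        return dividends / earnings
    return dividends / (0.06 * total_assets)


def three_year_ahead(e1, e2, e3=None, ltg=None):
    """Three-year-ahead forecast with the fallbacks described in the text."""
    if e3 is not None:
        return e3
    if ltg is not None:
        return e2 * (1.0 + ltg)      # two-year-ahead forecast times (1 + long-term growth)
    if e1 <= 0:
        return None                  # assumption: implied growth is undefined here
    growth = e2 / e1 - 1.0           # growth between the one- and two-year-ahead forecasts
    return e2 * (1.0 + growth)


def project_book_value(b0, eps_path, payout):
    """Clean surplus: B_(t+tau) = B_(t+tau-1) + EPS_(t+tau) * (1 - payout)."""
    book, path = b0, []
    for eps in eps_path:
        k = payout if eps >= 0 else 0.0   # payout set to zero when EPS is negative
        book = book + eps * (1.0 - k)
        path.append(book)
    return path


def composite_icc(estimates):
    """Average of the available individual ICC estimates (at least one required)."""
    available = [v for v in estimates if v is not None]
    return sum(available) / len(available) if available else None


# Hypothetical per-share inputs.
e3 = three_year_ahead(1.0, 1.1)
book_path = project_book_value(10.0, [2.0, -1.0], payout=0.5)
icc = composite_icc([0.08, None, 0.10, None])
```

Averaging only the non-missing estimates mirrors the requirement that a firm needs just one individual ICC estimate to receive a composite value.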

3. Empirical results of earnings forecasts methods

3.1. Coeﬃcient estimates of cross-sectional regressions

In this section, we present the first step of our procedure to forecast earnings, i.e., the pooled (in-sample) regression using the previous ten years of data. We report the average coefficients, the respective t-statistics with Newey and West (1987) adjustment, and the adjusted R-squared. The earnings are estimated yearly from 1983 to 2015 for one-year-ahead forecasts, from 1985 to 2015 for two-year-ahead forecasts, and from 1987 to 2015 for three-year-ahead forecasts. We regress earnings at time $t$ on lagged independent variables. $\tau = 1$, $\tau = 2$, and $\tau = 3$ indicate that the independent variables are lagged by one, two, and three years, respectively.

Panel A of Table 1 reports the results for the combined model. First, we can see that lagged analysts’ earnings forecasts ($\mathit{eIBES1}_{(i,t-\tau)}$ with $\tau = 1$ to 3) are highly significant in explaining earnings even when controlling for other variables from the earnings forecasts literature. Various studies have documented the accuracy of analysts’ earnings forecasts (e.g., Fried and Givoly (1982); O’Brien (1988); Hou et al. (2012)), and this finding corroborates our choice of including analysts’ forecasts in our combined model. In terms of magnitude, the average coefficient for analysts’ earnings forecasts is less than 1 (0.957 in the one-year-lagged regression, 0.872 in the two-year-lagged regression, and 0.774 in the three-year-lagged regression), which confirms the result from the literature that analysts’ forecasts tend to be too optimistic.

¹⁸ The CT (GLS) ICC approach requires the calculation of book value from year $t$ to year $t+\tau$ with $\tau = 4$ ($\tau = 11$).

Table 1: Coefficient estimates from the pooled (in-sample) regressions

Panel A: Combined Model

          Intercept    eIBES1(t−τ)   GP(t−τ)      r10(t−τ)     r122(t−τ)    Adj. R-squared
τ = 1     -1.550       0.957         -0.006       0.057        0.013        0.94
          [4.33]**     [33.31]**     [2.62]*      [4.74]**     [2.02]
τ = 2     0.914        0.872         0.026        0.086        0.014        0.86
          [0.53]       [21.37]**     [7.32]**     [4.24]**     [2.25]*
τ = 3     3.947        0.774         0.057        0.054        0.037        0.81
          [1.08]       [13.08]**     [12.53]**    [3.13]**     [1.18]

Panel B: Cross-Sectional Analysts’ Forecasts (CSAF)

          Intercept    eIBES1(t−τ)   Adj. R-squared
τ = 1     -1.675       0.953         0.94
          [3.85]**     [35.87]**
τ = 2     5.941        0.971         0.85
          [1.74]       [22.08]**
τ = 3     15.343       0.992         0.78
          [2.6]*       [20.38]**

Panel C: Hou et al. (2012) Model

          Intercept    E(t−τ)        A(t−τ)       D(t−τ)       Acc(t−τ)     DD(t−τ)      NegE(t−τ)    Adj. R-squared
τ = 1     -2.202       0.733         0.002        0.339        -0.086       5.572        4.258        0.77
          [1.85]       [42.69]**     [3.80]**     [9.49]**     [-0.86]      [8.56]**     [1.15]
τ = 2     -1.675       0.641         0.004        0.460        -0.135       7.848        6.972        0.69
          [-1.09]      [27.86]**     [3.4]**      [8.01]**     [-0.49]      [6.63]**     [1.40]
τ = 3     1.249        0.668         0.004        0.411        -0.168       6.905        16.315       0.66
          [1.16]       [10.85]**     [5.05]**     [8.04]**     [2.03]       [6.63]**     [2.37]*

Panel D: Earnings Persistence

          Intercept    E(t−τ)        NegE∗E(t−τ)  NegE(t−τ)    Adj. R-squared
τ = 1     2.380        0.968         -0.980       -9.728       0.77
          [6.05]**     [77.46]**     [6.41]**     [3.26]**
τ = 2     6.046        0.993         -1.394       -11.644      0.68
          [4.64]**     [34.54]**     [7.57]**     [2.46]*
τ = 3     10.411       1.038         -1.990       -19.516      0.63
          [2.63]*      [23.05]**     [7.37]**     [3.47]**

Panel E: Residual Income

          Intercept    E(t−τ)        NegE∗E(t−τ)  B(t−τ)       TACC(t−τ)    NegE(t−τ)    Adj. R-squared
τ = 1     -0.362       0.767         -0.502       0.035        -0.049       -8.476       0.78
          [-0.30]      [18.80]**     [6.17]**     [8.81]**     [-1.14]      [2.70]*
τ = 2     1.346        0.688         -0.661       0.053        -0.072       -10.737      0.70
          [0.91]       [25.59]**     [5.62]**     [9.68]**     [-1.35]      [2.23]*
τ = 3     3.128        0.679         -1.108       0.062        -0.055       -15.394      0.66
          [1.42]       [12.72]**     [7.21]**     [7.40]**     [1.81]       [2.23]*

This table shows the average coefficients, the respective t-statistics with Newey and West (1987) adjustment (in brackets), and the adjusted R-squared from pooled regressions using 10 years of data. We regress earnings from year t on lagged independent variables from year t−τ (with τ = 1 to 3 years). The regressions are performed from 1982 to 2014 for τ = 1, from 1983 to 2013 for τ = 2, and from 1984 to 2012 for τ = 3. ** and * denote significance at the 0.01 and 0.05 level, respectively. Panel A reports the coefficients from the Combined Model, Panel B from the Cross-Sectional Analysts’ Forecasts, Panel C from the HVZ model, Panel D from Earnings Persistence, and Panel E from Residual Income. The columns display the variables used in the regression (see Section 2.2 for details on the construction of these variables).

Although the coefficient on one-year-lagged gross profits (GPi,t−1) is negative and only weakly significant in explaining earnings, the two- and three-year-lagged coefficients of gross profits are positive and significant, with t-statistics of 7.32 and 12.53 and coefficients of 0.026 and 0.057, respectively. The low significance and the negative coefficient in the one-year-lagged (τ = 1) regression are likely due to the large explanatory power of analysts’ one-year-ahead earnings forecasts, which leaves one-year-lagged gross profits redundant. The positive and significant coefficients of gross profits in the two- and three-year-lagged regressions confirm the result of Novy-Marx (2013) that this variable is a good proxy for future earnings.

The coeﬃcients of the one-month past stock return (r10(t−τ)) are all positive (0.057,

0.086, and 0.054, for τ= 1 to 3, respectively) and signiﬁcant at the 1% level in all analyzed

periods. Finally, the past stock return from month −12 to month −2 (r122(t−τ)) has a positive

coefficient in all three regressions but is significant at the 5% level only for the two-year-lagged

regression (coefficient of 0.014, t-statistic of 2.25). These results confirm the findings from Ashton and Wang

(2012) and Richardson et al. (2010) that stock price changes have a positive correlation

with forward earnings and they tie in with the evidence from Abarbanell (1991) that

analysts’ forecasts do not fully reﬂect the information in prior stock price changes. Our

results are also in line with Guay et al. (2011) who ﬁnd that analysts tend to react slowly

to information contained in recent stock price changes.

In Panel B of Table 1, we see the results regarding the CSAF model. The coefficient of analysts’ earnings forecasts in the one-year-lagged regression is quite close to that of the CM (coefficient of 0.953 with a t-statistic of 35.87 for the CSAF, compared to a coefficient of 0.957 with a t-statistic of 33.31 for the CM). However, the two- and three-year-lagged regressions show a different picture. While the coefficients of analysts’ earnings forecasts in the CSAF regressions are closer to one (0.971 in the two-year-lagged regression and 0.992 in the three-year-lagged regression), the coefficients in the CM are lower (0.872 in the two-year-lagged regression and 0.774 in the three-year-lagged regression). This indicates that the additional variables in the CM, gross profits and lagged returns, become more important for the two- and three-year-ahead earnings forecasts than for the one-year-ahead ones. These results are in line with Bradshaw et al. (2012), who show that analysts’ forecasts are accurate for one-year-ahead horizons, but the two- and three-year-ahead forecasts can underperform even a random walk model.
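As an illustration of the CSAF idea, the following minimal sketch fits a pooled univariate least-squares regression of realized earnings on the lagged analyst forecast and applies the fitted line to a new forecast. The function names, the toy numbers, and the single estimation window are assumptions made for this example; they are not the authors' code or data.

```python
from statistics import mean

def fit_pooled_ols(x, y):
    """Return (intercept, slope) of a univariate least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def csaf_forecast(history, latest_analyst_forecast):
    """history: (lagged analyst forecast, realized earnings) pairs, pooled
    across firms and across the estimation window (assumed: the 10 years
    mentioned in the note to Table 1).
    Returns the regression-adjusted forecast for a new analyst forecast."""
    x = [pair[0] for pair in history]
    y = [pair[1] for pair in history]
    intercept, slope = fit_pooled_ols(x, y)
    return intercept + slope * latest_analyst_forecast

# Hypothetical observations in which analysts systematically overshoot:
# realized earnings = 0.95 * forecast - 2 (constructed, not sample data).
toy_history = [(100.0, 93.0), (50.0, 45.5), (20.0, 17.0), (200.0, 188.0)]
intercept, slope = fit_pooled_ols(*zip(*toy_history))
adjusted = csaf_forecast(toy_history, 120.0)
```

In this toy window the fitted slope is below one and the intercept is negative, so the adjusted forecast is lower than the raw analyst forecast of 120. The Combined Model would add gross profits and the two past-return variables as further regressors, which requires a multivariate fit not shown here.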

In Panel C of Table 1, we see the results regarding the HVZ model. The model proposed by Hou et al. (2012) shows a positive and significant relation between earnings (E(t)) and one-, two-, and three-year-lagged (τ = 1 to 3) earnings (E(t−τ)), lagged assets (A(t−τ)), lagged dividends (D(t−τ)), and the dummy of lagged dividends (DD(t−τ)). The coefficient of the dummy variable indicating lagged negative earnings (NegE(t−τ)) is positive but statistically significant only in the three-year-lagged regression, and the accruals variable (Acc(t−τ)) is not significant in any of the regressions. The magnitude and the sign of the coefficients are similar to Hou et al. (2012) and Li and Mohanram (2014), even though the sample period is different.19

For the EP model (see Panel D of Table 1), the lagged dummy variable of negative

earnings (NegE(t−τ)) is negative and signiﬁcant, lagged earnings (E(t−τ)) is positive and

significant, and the interaction term (NegE * E(t−τ)) is negative and significant in all

analyzed regressions (τ= 1 to 3).

For the RI model (see Panel E of Table 1), the lagged dummy of negative earnings

(NegE(t−τ)) is negative and signiﬁcant, lagged earnings (E(t−τ)) is positive and signiﬁcant,

the interaction term (NegE * E(t−τ)) is negative and significant, and lagged book value (B(t−τ)) is

positive and significant. All these results are similar to Li and Mohanram (2014) with the

only difference being that TACC(t−τ) is negative but not significant in our regression.

This diﬀerence is probably due to the diﬀerent estimation period and a possibly diﬀerent

calculation method of standard errors for the t-statistics.

When we compare the adjusted R-squared ﬁgures of the tested models, we see that the

combined model and the CSAF present the highest values for all analyzed periods. For

the one-year-lagged regression, the adjusted R-squared of the combined model is 0.94,

compared to 0.94 (CSAF), 0.77 (HVZ model), 0.77 (EP model), and 0.78 (RI model).

For the two-year-lagged regression, the combined model has an adjusted R-squared of

0.86, which is higher than the CSAF (0.85), HVZ (0.69), EP (0.68), and RI (0.70) models.

For the three-year-lagged regression, the adjusted R-squared values are 0.81 (combined

model), 0.78 (CSAF), 0.66 (HVZ model), 0.63 (EP model), and 0.66 (RI model). Our

adjusted R-squared values for the EP and RI models are higher than in Li and Mohanram

(2014) as we estimate these models at the dollar level so that the heteroskedasticity of the

dollar level data inﬂates the adjusted R-squared. Although a high in-sample R-squared

value is not a suﬃcient condition for high out-of-sample performance, it is a necessary

one (Welch and Goyal, 2008). These in-sample results bode well for the combined model.

We will analyze the forecast bias in the next section.

19Hou et al. (2012) perform the regression yearly from 1968 to 2008 using ten years of lagged data,

while Li and Mohanram (2014) use the period from 1968 to 2012.

3.2. Bias comparison

There is ample evidence that analysts’ forecasts tend to be too optimistic (e.g., Lin

and McNichols (1998); Hong and Kubik (2003); Merkley et al. (2017)) with one of the

reasons being that they face a conﬂict of interest. In a survey of 365 analysts, Brown

et al. (2015) ﬁnd that 44% of respondents say their success in generating underwriting

business or trading commissions is very important for their compensation. There is also

empirical evidence for the conﬂict of interest hypothesis. Hong and Kubik (2003) ﬁnd

that controlling for accuracy, analysts who are optimistic compared to the consensus

are more likely to have favorable job separations. In particular, for analysts who cover

stocks underwritten by their houses, optimism becomes more relevant than accuracy for

favorable job separations. This optimism bias carries over into many applications that use

these forecasts as an input. Easton and Sommers (2007) estimate that overly-optimistic

analysts’ earnings forecasts lead to an upward bias in the ICC of 2.84%. Given the

importance of bias, we now compare the mean and median biases of all tested earnings

forecast models. We deﬁne bias as the diﬀerence between actual earnings and earnings

forecasts, scaled by the ﬁrm’s end-of-June market equity. We estimate bias out-of-sample

for one-, two-, and three-year-ahead forecasts (τ= 1 to 3).

\[
\mathrm{Bias}_{i,t+\tau} = \frac{\text{Actual Earnings}_{i,t+\tau} - \text{Earnings Forecast}_{i,t+\tau}}{\text{Market Equity}_{i,t}} \tag{8}
\]

As we can see in Equation 8, a negative (positive) bias means overly-optimistic (pes-

simistic) earnings forecasts. A bias of zero means unbiased forecasts. We estimate bias

at the end of June of each year20 for each ﬁrm. Then, we estimate the yearly mean and

median forecast biases. In Panel A of Table 2, we report the average of the yearly mean

and median biases and the respective t-statistics with the Newey-West adjustment for all

tested models.
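The bias calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the firm-year numbers are hypothetical, and the final step simply averages the yearly cross-sectional means and medians, without the Newey-West t-statistics.

```python
from collections import defaultdict
from statistics import mean, median

def forecast_bias(actual, forecast, market_equity):
    """Equation 8: negative values indicate an overly optimistic forecast."""
    return (actual - forecast) / market_equity

def yearly_bias(rows):
    """rows: (year, actual earnings, earnings forecast, end-of-June market equity).
    Returns {year: (mean bias, median bias)} across firms."""
    by_year = defaultdict(list)
    for year, actual, forecast, market_equity in rows:
        by_year[year].append(forecast_bias(actual, forecast, market_equity))
    return {year: (mean(v), median(v)) for year, v in sorted(by_year.items())}

# Hypothetical firm-year data, not taken from the paper's sample.
toy_rows = [
    (2001, 10.0, 12.0, 100.0),   # forecast too high -> bias -0.02
    (2001, 5.0, 5.0, 50.0),      # exact forecast    -> bias  0.00
    (2001, 8.0, 9.0, 100.0),     # forecast too high -> bias -0.01
    (2002, 20.0, 18.0, 200.0),   # forecast too low  -> bias +0.01
    (2002, 3.0, 4.0, 50.0),      # forecast too high -> bias -0.02
]
per_year = yearly_bias(toy_rows)

# Time-series averages of the yearly statistics, as reported in Panel A of Table 2.
average_mean_bias = mean(m for m, _ in per_year.values())
average_median_bias = mean(md for _, md in per_year.values())
```

Averaging yearly statistics rather than pooling all firm-years gives each year equal weight, regardless of how many firms it contains.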

In Panel A of Table 2, we see that the combined model is the only model that has no

statistically signiﬁcant bias at the 0.05 signiﬁcance level. We emphasize that this result

also holds when analyzing the mean and median biases and when testing one-, two- or

20We estimate one, two, three-year-ahead forecast bias for the periods 1985–2015, 1987–2015, and

1989-2015, respectively.

three-year-ahead forecasts. Our results confirm the optimism bias of analysts’ forecasts,

as the mean and median biases are negative and statistically significant for one-, two-,

and three-year-ahead forecasts. The one-year-ahead median bias is small in magnitude

(−0.002), i.e., analysts overestimate earnings by 0.2% of market equity. However,

the median bias grows in magnitude for two- and three-year-ahead forecasts, to −0.009 and −0.013,

respectively. Our results are diﬀerent from those in Abarbanell and Lehavy (2003), who

show that the median bias is zero for analysts’ forecasts. This is possibly due to the

diﬀerent sample period (Abarbanell and Lehavy (2003) analyze the period from 1985 to

1998) and the diﬀerent forecast periodicity (the authors use quarterly forecasts while we

use yearly forecasts).

Moving to the benchmark models, the HVZ and RI models present an optimistic

mean bias in the one-, two-, and three-year-ahead forecasts. The EP model displays an

optimistic bias in the mean forecasts at all horizons as well as in the median two- and

three-year-ahead forecasts. The forecasts based on the RW model show a positive bias

which means that they are overly pessimistic. This is intuitive as this model does not

take growth in earnings into account. Finally, the CSAF model performs well in that it

only has a signiﬁcant bias at the two-year ahead forecast horizon. However, it does show

a greater bias in terms of magnitude for the three-year ahead forecasts compared to the

raw analysts’ forecasts. This indicates that simply incorporating analysts’ forecasts into

a cross-sectional regression does not remove the overly-optimistic bias.

Table 2: Earnings Forecasts Bias

Panel A: Bias of earnings forecasts

Bias Et+1 Bias Et+2 Bias Et+3

Mean Median Mean Median Mean Median

CM -0.006 0.005 -0.009 0.000 -0.005 -0.001

[-1.08] [1.99] [-1.06] [0.09] [-0.38] [-0.33]

AF -0.033 -0.002 -0.036 -0.009 -0.028 -0.013

[2.79]** [2.62]* [4.77]** [5.29]** [3.23]** [3.84]**

CSAF -0.008 0.005 -0.035 -0.010 -0.047 -0.018

[-1.21] [2.03] [3.10]** [2.58]* [1.88] [2.04]

HVZ -0.038 0.002 -0.040 -0.004 -0.053 -0.008

[3.49]** [0.92] [2.23]* [-0.84] [2.29]* [-1.15]

EP -0.048 -0.003 -0.073 -0.013 -0.078 -0.016

[3.44]** [-1.16] [4.69]** [3.08]** [7.41]** [2.42]*

RI -0.026 0.003 -0.040 -0.005 -0.051 -0.010

[2.06]* [1.10] [2.61]* [-1.46] [3.41]** [1.72]

RW 0.004 0.006 0.029 0.010 0.036 0.014

[0.49] [5.32]** [1.65] [3.13]** [1.53] [2.73]*

Panel B : Diﬀerence of bias of earnings forecasts

Bias Et+1 Bias Et+2 Bias Et+3

Mean Median Mean Median Mean Median

CM-AF 0.027 0.007 0.027 0.009 0.023 0.012

[2.50]* [2.31]* [3.33]** [2.83]** [6.60]** [5.13]**

CM-CSAF 0.003 0.000 0.026 0.011 0.042 0.016

[1.00] [0.48] [4.73]** [3.96]** [2.99]** [2.67]*

CM-HVZ 0.032 0.003 0.032 0.004 0.049 0.006

[2.42]* [0.86] [1.51] [0.80] [2.19]* [1.05]

CM-EP 0.042 0.008 0.065 0.013 0.074 0.015

[2.80]** [2.24]* [4.10]** [3.16]** [4.48]** [3.44]**

CM-RI 0.02 0.00 0.03 0.01 0.05 0.01

[1.44] [0.78] [1.88] [1.51] [2.55]* [2.06]*

CM-RW -0.010 -0.001 -0.037 -0.009 -0.041 -0.015

[0.96] [0.45] [2.34]* [3.26]** [1.91] [7.50]**

This table summarizes the mean and median bias for the US market. Bias is deﬁned as the

difference between actual earnings and earnings forecasts, scaled by the firm’s end-of-June

market equity. The rows in Panel A show the diﬀerent models: Combined Model (CM),

raw analysts’ forecasts (AF), Cross-Sectional Analysts’ Forecasts (CSAF), Hou, van Dijk

and Zhang (HVZ, 2012), Residual Income (RI), Earnings Persistence (EP), and Random

Walk (RW). Panel B displays the diﬀerence in bias between the CM and each of the other

forecast methods. The Newey-West t-statistics are presented in brackets. Results are shown

for one-, two-, and three-year ahead earnings forecasts. We estimate one, two, three-year

ahead forecast bias for the periods 1985–2015, 1987–2015, and 1989–2015, respectively. **

and * denote signiﬁcance at 0.01 and 0.05 levels, respectively.

Figure 1: One-year-ahead Mean Bias Figure 2: One-year-ahead Median Bias

Figure 3: Two-year-ahead Mean Bias Figure 4: Two-year-ahead Median Bias

Figure 5: Three-year-ahead Mean Bias Figure 6: Three-year-ahead Median Bias

Panel B of Table 2 shows whether the bias of the combined model is statistically

diﬀerent in comparison to other models. The ﬁrst row presents the diﬀerence between the

combined model and analysts’ forecasts, and we see that in all periods, for the mean and

the median, the biases are statistically diﬀerent. Thus, we document that the combined

model is not as overly-optimistic as raw analysts’ forecasts. In the second row, we can

compare the CM to the CSAF, and we can see that the bias is statistically diﬀerent

for two- and three-year-ahead mean and median forecasts. These results show that the

additional variables of the CM (compared to the CSAF) are important to obtain unbiased

forecasts, in particular for long-term earnings. When we compare the combined model to

the RI model, we see diﬀerences only for the three-year-ahead forecast. Furthermore, the

CM is statistically less optimistic than the HVZ for one- and three-year-ahead forecasts

and less pessimistic than the RW for two- and three-year-ahead forecasts. Last, we show

that the combined model is not as overly-optimistic as the EP model at a statistically

significant margin for all analyzed periods. In short, the combined model is the only tested

model whose mean and median biases are statistically indistinguishable from zero at every forecast horizon.
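A pairwise comparison of this kind can be sketched as a paired test on the yearly bias series of two models, with a Newey-West (Bartlett-kernel) variance for the mean difference. The lag length, the function names, and the yearly numbers below are assumptions for illustration; the lag used for Table 2 is not stated in this section.

```python
import math
from statistics import mean

def newey_west_tstat(series, lags):
    """t-statistic for H0: mean(series) == 0, using a Bartlett-kernel
    (Newey-West) long-run variance with the given number of lags."""
    n = len(series)
    m = mean(series)
    dev = [x - m for x in series]
    long_run_var = sum(d * d for d in dev) / n
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)
        autocov = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        long_run_var += 2.0 * weight * autocov
    return m / math.sqrt(long_run_var / n)

def bias_difference_tstat(bias_a, bias_b, lags):
    """Paired test on the yearly bias series of two models (same years, same order)."""
    diff = [a - b for a, b in zip(bias_a, bias_b)]
    return mean(diff), newey_west_tstat(diff, lags)

# Hypothetical yearly mean biases for two models (not the paper's data).
cm = [-0.004, 0.002, -0.010, -0.006, 0.001]
af = [-0.030, -0.025, -0.041, -0.036, -0.020]
mean_diff, t_stat = bias_difference_tstat(cm, af, lags=1)
```

With `lags=0` the statistic reduces to the ordinary t-statistic based on the population variance. A series with no variation would make the denominator zero, so real data would need a guard for that case.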

In order to analyze forecast bias over time, Figures 1 to 6 show the mean and median

forecast bias for one-, two-, and three-year-ahead earnings forecasts. For the sake of clarity,

we only include the raw analysts’ forecasts, the combined model, and the benchmark

model with forecast bias closest to zero in the figures. The optimism bias of the raw

analysts’ forecasts is immediately apparent. The corresponding graph is almost always

below zero for diﬀerent forecast horizons and aggregation methods (mean and median).

We also see spikes in the bias for the RW model that correspond to economic shocks. For

example, the burst of the Internet bubble in 2001 results in an overly-optimistic estimate

as the previous (high) level of earnings is used as a forecast.

3.3. Accuracy comparison

There is substantial evidence that analysts’ forecasts are more accurate than mechani-

cal models (e.g., Fried and Givoly (1982); O’Brien (1988); Hou et al. (2012)). Researchers

argue that the higher accuracy of analysts’ forecasts is due to their “innate ability and

task-speciﬁc experience”21 (e.g., Clement et al. (2007)), industry related experience ob-

tained before becoming an analyst (e.g., Bradley et al. (2017)), and the number of analysts

covering each industry (e.g., Merkley et al. (2017)).

In this section, we compare the forecast accuracy of all tested models. We use absolute

error as a proxy for accuracy. Following Bradley et al. (2017), we estimate the absolute

error as the absolute diﬀerence between actual earnings and earnings forecasts, scaled by

the ﬁrm’s end-of-June market equity. The lower the value of the absolute error, the more

21According to Clement et al. (2007), task-speciﬁc experience is deﬁned as the analyst’s experience in

forecasting around a particular type of situation or event, such as forecasting earnings when restructurings

occur or forecasting earnings around an acquisition.

accurate the forecast.

\[
\text{Absolute Error}_{i,t+\tau} = \left| \frac{\text{Earnings Forecast}_{i,t+\tau} - \text{Actual Earnings}_{i,t+\tau}}{\text{Market Equity}_{i,t}} \right| \tag{9}
\]

We estimate the out-of-sample absolute error at the end of June of each year,22 based

on Equation 9, for one-, two-, and three-year-ahead time horizons (τ= 1 to 3) for each

ﬁrm. In Panel A of Table 3, we report the yearly average of the mean and median absolute

errors (accuracy) and the respective t-statistics with the Newey-West adjustment for all

tested models.
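Because both mean and median absolute errors are reported, it helps to see how the two can rank models differently. The sketch below applies Equation 9 to two hypothetical models for a single year; all names and numbers are assumptions chosen for illustration and do not come from the paper's sample.

```python
from statistics import mean, median

def absolute_error(forecast, actual, market_equity):
    """Equation 9: scaled absolute forecast error; lower means more accurate."""
    return abs((forecast - actual) / market_equity)

# Hypothetical one-year cross-section: (forecast, actual, market equity).
# Model A is close for most firms but has one large miss;
# model B has moderate errors throughout.
model_a = [(10.0, 9.0, 100.0), (5.0, 4.5, 50.0), (8.0, 7.0, 100.0), (6.0, 1.0, 50.0)]
model_b = [(9.5, 9.0, 100.0), (5.5, 4.5, 50.0), (7.5, 7.0, 100.0), (2.0, 1.0, 50.0)]

errors_a = [absolute_error(*row) for row in model_a]
errors_b = [absolute_error(*row) for row in model_b]

mean_a, median_a = mean(errors_a), median(errors_a)
mean_b, median_b = mean(errors_b), median(errors_b)
```

In this constructed example, model A has the lower median error while model B has the lower mean error, because a single large miss raises the mean but leaves the median unchanged.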

As we see in Panel A of Table 3, the combined model is slightly superior to the raw

analysts’ forecasts and the CSAF model and markedly superior to the benchmark models

in terms of mean accuracy. Among the three most accurate models at the one-year horizon, the CM has

the lowest mean absolute error (0.046), followed by the CSAF (0.050) and AF (0.057). The mean absolute

error of the benchmark models is roughly twice as high as that of the CSAF model,

raw analysts’ forecasts or the combined model for the one-year-ahead forecast. For two-

and three-year ahead mean absolute error, the combined model again is more accurate

than the other models but we note that the diﬀerence to analysts’ forecasts is smaller

(the combined model has a mean absolute error of 0.063 and 0.070 for two- and three-

year-ahead forecasts, in comparison, the mean absolute error of analysts’ forecasts is

0.070 and 0.076). Regarding the CSAF model, the diﬀerence in terms of accuracy to the

CM becomes higher for long-term forecasts since the absolute error for the CSAF model

is 0.076 for two-year-ahead and 0.099 for three-year-ahead forecasts. The CSAF model

outperforms the raw analysts’ forecasts for the one-year-ahead horizon (mean error), but

it is less accurate for two- and three-year-ahead forecasts. Finally, the mean absolute

error of the other benchmark models is on average ﬁve percentage points higher than the

combined model.

22We estimate one-, two-, and three-year-ahead forecast accuracy for the periods 1985–2015, 1987–

2015, and 1989–2015, respectively.

Table 3: Earnings Forecasts Accuracy

Panel A: Accuracy of earnings forecasts

Accuracy Et+1 Accuracy Et+2 Accuracy Et+3

Mean Median Mean Median Mean Median

CM 0.046 0.015 0.063 0.026 0.070 0.033

AF 0.057 0.011 0.070 0.024 0.076 0.033

CSAF 0.050 0.016 0.076 0.030 0.099 0.042

HVZ 0.109 0.033 0.119 0.045 0.128 0.048

EP 0.112 0.029 0.135 0.045 0.135 0.050

RI 0.104 0.028 0.117 0.041 0.120 0.046

RW 0.114 0.025 0.124 0.037 0.127 0.044

Panel B : Diﬀerence of accuracy of earnings forecasts

Accuracy Et+1 Accuracy Et+2 Accuracy Et+3

Mean Median Mean Median Mean Median

CM-AF -0.010 0.005 -0.007 0.002 -0.006 -0.001

[1.48] [3.68]** [2.07]* [1.67] [3.61]** [0.84]

CM-CSAF -0.004 -0.001 -0.013 -0.004 -0.029 -0.009

[3.50]** [1.10] [2.46]* [1.89] [2.38]* [3.22]**

CM-HVZ -0.062 -0.018 -0.056 -0.019 -0.058 -0.015

[5.41]** [8.30]** [6.94]** [16.60]** [6.71]** [13.65]**

CM-EP -0.065 -0.013 -0.072 -0.019 -0.064 -0.017

[5.90]** [4.29]** [7.85]** [13.00]** [7.64]** [12.76]**

CM-RI -0.058 -0.013 -0.054 -0.014 -0.050 -0.014

[4.49]** [3.05]** [8.19]** [11.54]** [10.06]** [16.02]**

CM-RW -0.067 -0.010 -0.061 -0.011 -0.056 -0.011

[4.02]** [2.73]* [4.21]** [6.97]** [4.52]** [6.40]**

This table summarizes the mean and median forecast accuracy for the US market. We deﬁne

accuracy as the absolute diﬀerence between actual earnings and earnings forecasts, scaled by the

ﬁrm’s end-of-June market equity. The rows of Panel A show the diﬀerent models: Combined

Model (CM), raw analysts’ forecasts (AF), Cross-Sectional Analysts’ Forecasts (CSAF), Hou,

van Dijk and Zhang (HVZ, 2012), Residual Income (RI), Earnings Persistence (EP), and Random

Walk (RW). Panel B displays the diﬀerence in forecast accuracy between the CM and each of the

other forecast methods. We show Newey-West t-statistics in brackets. We estimate one-, two-,

and three-year ahead forecast accuracy for the periods 1985–2015, 1987–2015, and 1989–2015,

respectively. ** and * denote signiﬁcance at 0.01 and 0.05 levels, respectively.

With regard to median absolute error, the results of analysts’ forecasts are slightly

superior to the combined model for one- and two-year-ahead horizons (0.011 and 0.024

for raw analysts’ forecasts and 0.015 and 0.026 for the combined model for one-year and

two-year forecasts, respectively). For three-year-ahead forecasts, the median absolute

error is 0.033 for both models. The third best model in terms of median accuracy is

the CSAF, with absolute errors of 0.016, 0.030, and 0.042 for one-, two-, and three-year-

ahead forecasts. Concerning the other benchmark models, the median absolute error is

substantially higher (more inaccurate) compared to the raw analysts’ forecasts, the CSAF

and the CM. We also highlight that the analysts’ forecasts are more accurate than the

ones estimated with the CSAF model in one-, two-, and three-year-ahead forecasts. This

is evidence that only including the analysts’ forecasts in a cross-sectional model is not

suﬃcient to improve the forecasts.

In Panel B of Table 3, we test whether the diﬀerences are statistically signiﬁcant. The

CM shows superior accuracy compared to all cross-sectional models and the RW model.

Like Gerakos and Gramacy (2013), we find that the RW model is as accurate as the cross-sectional

models. Comparing the combined model to analysts’ forecasts, the combined

model outperforms the analysts in terms of mean accuracy in the medium and long term (two- and three-year-ahead)

forecasts. However, the results for one-year-ahead are mixed since the analysts’ forecasts

have a better median accuracy, while the mean accuracy is not statistically diﬀerent

between both models.

In Figures 7 to 12, we plot the forecast accuracy over time for the tested methods.

The raw analysts’ forecasts are superior to the combined model in terms of one-year-

ahead median accuracy, in particular for the ﬁrst years of the sample period. When we

split the analyzed period into two sub-periods of roughly equal length, we see that the difference

in median accuracy during the period 1985–2000 is 0.0073, while in the period 2001–

2015 it decreases to 0.0022. We observe the same pattern for two-year-ahead median

accuracy; here the diﬀerence falls from 0.0032 (earlier period) to 0.0000 (later period),

which indicates that the combined model has improved the accuracy compared to the

raw analysts’ forecasts over the years. Last, note that the raw analysts’ forecasts and the

combined model outperform the benchmark models in all periods.

Figure 7: One-year-ahead Mean Accuracy Figure 8: One-year-ahead Median Accuracy

Figure 9: Two-year-ahead Mean Accuracy Figure 10: Two-year-ahead Median Accuracy

Figure 11: Three-year-ahead Mean Accuracy Figure 12: Three-year-ahead Median Accuracy

3.4. Earnings Response Coeﬃcient

The ERC is the coeﬃcient that measures the response of stock prices to surprises (new

information) in accounting earnings announcements (Easton and Zmijewski (1989)). Li

and Mohanram (2014) explain that a higher ERC suggests that the market reacts more

strongly to the unexpected earnings from a model that represents a better approximation

of market expectations. According to Brown (1993), assuming an informationally eﬃcient

market, the accuracy and market association could be considered “two sides of the same

coin.”23 However, it is important to clarify that while bias and accuracy are ex-post

assessments of forecasts, the ERC examines the extent to which earnings forecasts provide

the best ex-ante estimates of market expectations. This analysis also helps to rule out

the possibility that our results are only driven by diﬀerent deﬁnitions of earnings (street

versus GAAP).

We estimate the ERC by regressing the sum of the quarterly earnings announcement returns

(market-adjusted, from day −1 to day +1) on one-, two-, and three-year-ahead ﬁrm-

speciﬁc unexpected earnings (i.e., the forecast bias) measured over the same horizon. The

unexpected earnings, as well as the returns, are standardized to make the ERC comparable

among all models. Panel A of Table 4 shows the time-series average of the ERCs, the

respective t-statistics, and the time-series average of adjusted R-squared for all tested

models. Panel B of Table 4 shows the pairwise comparison between the combined model

and the other models.
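The ERC estimation described above can be sketched as a yearly cross-sectional regression on standardized variables, followed by a time-series average. The sketch assumes that standardization is done within each year and uses hypothetical function names and toy data; neither detail is confirmed by the text.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ols_slope(x, y):
    """Slope of a univariate least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def erc_for_year(unexpected_earnings, announcement_returns):
    """ERC for one year: returns regressed on unexpected earnings, both standardized."""
    return ols_slope(standardize(unexpected_earnings),
                     standardize(announcement_returns))

def average_erc(panel):
    """panel: {year: (unexpected earnings, summed announcement returns)}."""
    return mean(erc_for_year(ue, ret) for ue, ret in panel.values())

# Hypothetical data: returns line up perfectly with surprises in 2001,
# and only partly in 2002.
toy_panel = {
    2001: ([-0.02, 0.00, 0.01, 0.03], [-0.03, 0.01, 0.03, 0.07]),
    2002: ([-0.01, 0.00, 0.01], [0.00, 0.02, 0.01]),
}
erc_2001 = erc_for_year(*toy_panel[2001])
erc_2002 = erc_for_year(*toy_panel[2002])
erc_avg = average_erc(toy_panel)
```

When both variables are standardized, the univariate slope equals the correlation coefficient, which is what makes the ERC comparable across forecast models whose unexpected earnings differ in scale.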

As we see in Panel B of Table 4, for one-year-ahead forecasts, the combined model outperforms raw analysts’ forecasts in terms of both ERC and adjusted R-squared. The difference in the ERC is highly statistically significant (t-statistic of 3.76). For the same forecast horizon, the combined model does not significantly outperform the other benchmark models. When analyzing two-year-ahead forecasts, we note that the combined model shows a higher ERC than the CSAF, HVZ, EP, and RI models and a higher adjusted R-squared than the CSAF, HVZ, and RI models at a statistically significant margin. Finally, for three-year-ahead forecasts, the results are statistically different when comparing the combined model to the RW or the CSAF models.

In summary, we find that the combined model is not only less biased and more

accurate but also represents market expectations most consistently among all tested

models.

23Brown (1993, p. 296).

Table 4: Earnings Response Coeﬃcient

Panel A: Earnings Response Coeﬃcient (ERC)

Et+1 Et+2 Et+3

ERC Adj. R-squared ERC Adj. R-squared ERC Adj. R-squared

CM 0.132 0.016 0.130 0.017 0.098 0.009

[13.12]** [5.72]** [6.75]**

AF 0.104 0.011 0.097 0.011 0.087 0.008

[9.25]** [5.15]** [6.60]**

CSAF 0.129 0.016 0.109 0.013 0.061 0.005

[12.66]** [4.96]** [4.09]**

HVZ 0.120 0.015 0.081 0.010 0.057 0.006

[10.75]** [5.30]** [3.25]**

EP 0.114 0.015 0.069 0.007 0.068 0.006

[9.03]** [4.53]** [5.85]**

RI 0.124 0.017 0.082 0.008 0.072 0.006

[11.03]** [7.83]** [5.94]**

RW 0.120 0.015 0.088 0.009 0.061 0.005

[7.37]** [6.80]** [4.38]**

Panel B: Comparison of the diﬀerence

Et+1 Et+2 Et+3

ERC Adj. R-squared ERC Adj. R-squared ERC Adj. R-squared

CM-AF 0.028 0.005 0.033 0.007 0.011 0.002

[3.76]** [4.27]** [1.21] [1.47] [1.32] [1.51]

CM-CSAF 0.003 0.000 0.021 0.004 0.037 0.004

[0.59] [0.23] [3.16]** [3.08]** [2.45]* [2.88]**

CM-HVZ 0.011 0.000 0.049 0.007 0.041 0.003

[0.95] [0.15] [2.12]* [2.12]* [1.98] [1.64]

CM-EP 0.018 0.000 0.060 0.010 0.030 0.003

[1.72] [0.14] [2.42]* [2.02] [1.51] [1.83]

CM-RI 0.007 -0.001 0.047 0.010 0.027 0.003

[0.70] [0.35] [2.63]* [2.21]* [1.34] [1.82]

CM-RW 0.012 0.001 0.042 0.008 0.037 0.004

[0.68] [0.26] [1.88] [2.02] [3.69]** [2.53]*

This table reports the time-series averages of the earnings response coeﬃcients (ERC) for forecasts from the

Combined Model (CM), raw analysts’ forecasts (AF), Cross-Sectional Analysts’ Forecasts (CSAF), Hou,

van Dijk and Zhang (HVZ, 2012), Residual Income (RI), Earnings Persistence (EP), and Random Walk

(RW) models, as well as their pairwise comparisons. The Newey-West t-statistics are reported in brackets.

The ERC is estimated by regressing the sum of the quarterly earnings announcement returns (market-adjusted,

from day −1 to day +1) over the next one, two, and three years on firm-specific unexpected

earnings (i.e., the forecast bias) measured over the same horizon. We standardize the unexpected earnings

and the returns to make the ERC comparable among all models. ** and * denote significance at the 0.01 and

0.05 levels, respectively.

4. Implied Cost of Capital

The ICC is a popular proxy for expected returns (see e.g., Pástor et al. (2008); Frank

and Shen (2016)) as its estimates contain less noise than estimates based on realized

returns (e.g., Lee et al. (2009)). Better earnings estimates should improve the correlation

between the ICC and subsequent realized returns, leading to better ICC estimates. In this

section, we analyze the performance of ICC estimates using proxies for earnings forecasts

based on the combined model, analysts’ forecasts, and the benchmark models. First, we

compute the ICC on an aggregate level and evaluate its ability to predict realized returns

over time. Then, we analyze the cross-sectional correlation between ICC and ex-post

forward returns.

4.1. Relation between ICC and returns on an aggregate level

There is evidence that the ICC at an aggregate level is a good proxy for time-varying expected returns (e.g., Pástor et al. (2008); Li et al. (2013)). Because earnings forecasts are one of the main inputs for the ICC estimation, we believe that this input can strongly influence the ICC’s performance as a proxy for expected returns. In this section, we test whether the slope coefficients from regressing future market returns on the ICC are greater than zero for ICCs calculated using different proxies for earnings forecasts.24 We regress ex-post, one-year-forward value-weighted (VW) excess market returns on VW ICC equity premia. For each earnings forecast method, we estimate five different ICC models (GLS, CT, OJ, MPEG, and a Composite, which is the average of the four previous models). We employ the following proxies for earnings forecasts: the combined model, analysts’ forecasts, the CSAF model, the HVZ model, the EP model, and the RI model.25 To compute the ICC premia and excess returns, we use the yield on the U.S. 10-year government bond. Panel A of Table 5 presents the results.
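The aggregate-level test can be sketched in two steps: form the value-weighted ICC premium for each year, then regress the one-year-forward excess market return on it. The function names and all numbers below are hypothetical; the sketch omits the Newey-West standard errors and the one-sided test on the slope.

```python
from statistics import mean

def value_weighted(values, market_caps):
    """Market-capitalization-weighted average of firm-level values."""
    total = sum(market_caps)
    return sum(v * w for v, w in zip(values, market_caps)) / total

def icc_premium(firm_iccs, market_caps, bond_yield):
    """Aggregate ICC equity premium: VW ICC minus the 10-year government bond yield."""
    return value_weighted(firm_iccs, market_caps) - bond_yield

def predictive_regression(premia, next_year_excess_returns):
    """OLS of one-year-forward excess market returns on the ICC premium.
    Returns (intercept, slope)."""
    mx, my = mean(premia), mean(next_year_excess_returns)
    sxx = sum((p - mx) ** 2 for p in premia)
    sxy = sum((p - mx) * (r - my) for p, r in zip(premia, next_year_excess_returns))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical single-year cross-section: two firms with ICCs of 8% and 12%.
premium_example = icc_premium([0.08, 0.12], [300.0, 100.0], bond_yield=0.05)

# Hypothetical yearly series, constructed so that return = 2 * premium - 0.03.
toy_premia = [0.02, 0.03, 0.04, 0.05]
toy_returns = [0.01, 0.03, 0.05, 0.07]
intercept, slope = predictive_regression(toy_premia, toy_returns)
```

A positive slope means that years with a high aggregate ICC premium tend to be followed by high excess market returns, which is the property tested in Panel A of Table 5.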

For the one-year-forward return predictive regressions, we document that the ICC

estimated with earnings from the combined model oﬀers the largest number of signiﬁcant

regression slopes. For three ICC methods (CT, OJ, and MPEG) the coeﬃcients are

signiﬁcant at the 0.05 level. In contrast, the HVZ and CSAF models, as well as raw

analysts’ forecasts, only produce two signiﬁcant coeﬃcients. By comparing the t-statistics,

the ICC estimated with the combined model reports the highest t-statistics in three out

of the ﬁve ICC approaches.

24Following Li et al. (2013), we use a one-sided test of the null hypothesis to test whether the slopes are

greater than zero. We use one-sided tests to analyze the correlation between ICC and returns over time

as well as cross-sectionally.

25We do not include the RW model because this method does not allow for earnings growth and is,

therefore, not suitable for estimating the ICC.

Table 5: Regressions of ICC and ex-post realized returns

Panel A: Forecasting at the aggregate level

Model CM AF CSAF HVZ EP RI

Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient

GLS 3.381 1.553 1.571 1.470 3.244 1.615 4.414 1.376 4.261 1.252 4.358 1.240

[0.795] [1.156] [0.279] [1.132] [0.741] [1.152] [1.170] [0.900] [1.086] [0.816] [1.122] [0.800]

CT -1.087 4.056 -6.656 3.257 -2.095 4.506 3.549 2.942 3.674 1.945 3.217 2.158

[-0.209] [1.833]* [-0.832] [1.913]* [-0.367] [1.841]* [0.857] [1.263] [0.753] [0.622] [0.625] [0.779]

OJ -1.994 2.713 -10.315 2.899 -2.145 2.697 -2.016 4.066 5.234 -0.020 0.535 3.274

[-0.352] [1.988]* [-0.893] [1.626] [-0.345] [1.712]* [-0.419] [2.707]** [1.133] [-0.007] [0.100] [1.426]

MPEG 1.889 1.668 -6.416 2.294 3.011 1.025 4.062 2.189 5.319 0.626 4.705 1.798

[0.468] [2.697]** [-0.705] [1.712]* [0.773] [1.429] [1.158] [1.892]* [1.717]* [0.323] [1.388] [1.139]

Composite 1.181 2.473 -4.899 2.483 0.908 2.516 3.947 2.323 4.741 1.126 4.073 1.745

[0.249] [1.679] [-0.578] [1.595] [0.180] [1.570] [1.007] [1.276] [1.301] [0.476] [0.996] [0.846]

Panel B: Fama-Macbeth regression

Model CM AF CSAF HVZ EP RI

Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient Intercept Coeﬃcient

GLS 3.796 0.827 3.440 0.576 4.342 0.653 4.151 0.473 4.118 0.417 4.219 0.474

[1.623] [2.354]* [1.303] [1.380] [1.859]* [1.885]* [1.722]* [1.866]* [1.675] [2.132]* [1.736]* [2.097]*

CT 4.086 0.681 4.742 0.300 4.307 0.615 5.233 0.172 4.948 0.158 4.751 0.193

[1.719]* [2.976]** [1.817]* [1.188] [1.846]* [2.778]** [2.155]* [1.529] [2.066]* [1.666] [1.992]* [1.475]

OJ 4.350 0.411 4.852 0.167 5.267 0.262 5.482 0.092 5.478 0.097 4.775 0.215

[2.095]* [1.785]* [1.750]* [0.679] [2.516]** [1.415] [2.411]* [0.882] [2.417]* [0.830] [2.057]* [1.380]

MPEG 5.603 0.223 5.018 0.123 6.150 0.138 5.364 0.091 5.624 0.091 5.004 0.200

[2.484]** [1.368] [1.842]* [0.677] [2.657]** [1.014] [2.423]* [0.965] [2.459]* [0.832] [2.134]* [1.332]

Composite 4.108 0.659 4.344 0.311 4.616 0.553 4.881 0.256 4.867 0.218 4.637 0.289

[1.840]* [2.439]* [1.558] [0.999] [2.055]* [2.218]* [2.017]* [1.688] [2.006]* [1.789]* [1.945]* [1.993]*

Panel A presents univariate OLS regressions of ex-post excess realized returns on the ICC premium based on six proxies of earnings forecasts: Combined Model (CM), Analysts' Forecasts (AF), Cross-Sectional Analysts' Forecasts (CSAF), Hou et al. (2012) (HVZ), Earnings Persistence (EP), and Residual Income (RI). We show the results based on the following ICC approaches: GLS, CT, OJ, MPEG, and the composite of these approaches. The dependent variable is the value-weighted market risk premium. Panel B presents the average coefficients of Fama-Macbeth regressions of realized returns on the ICC premium. To compute the ICC premiums and excess returns, we use the yield on the U.S. 10-year government bond. Newey-West t-statistics with three lags are presented in brackets. ** and * denote significance based on one-tailed tests at the 0.01 and 0.05 levels, respectively. Our sample is from June 1986 to June 2012.


4.2. Relation between ICC and returns cross-sectionally

In the previous section, we compared the predictive power of the ICC over time. Now, we analyze whether the ICC is positively correlated with the cross-section of stock returns. To this end, we perform univariate Fama and Macbeth (1973) (FM) cross-sectional regressions of the ex-post forward return premium on four individual ICC premium estimates (from the GLS, CT, OJ, and MPEG approaches) and on the Composite ICC premium at the firm level. To obtain the earnings forecasts for the ICC computation, we use the following proxies: the combined model, analysts' forecasts, the CSAF model, the HVZ model, the EP model, and the RI model. The results are reported in Panel B of Table 5.

When we regress cross-sectional monthly returns on the ICC, the ICC estimated with the combined model has the strongest correlation with the cross-section of returns: its coefficients are statistically significant in four (GLS, CT, OJ, and Composite) of the five ICC approaches, and it has the highest t-statistic in every ICC approach analyzed. Interestingly, the model with the second-highest number of significant coefficients is the CSAF model. This result shows that although the CSAF model is less accurate and more biased than the raw analysts' forecasts, the resulting ICC estimates have a higher correlation with the cross-section of expected returns.
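As an illustration only (this is not the authors' code), the FM procedure described in this section can be sketched in Python: one univariate cross-sectional regression per period, followed by a t-statistic on the time-series mean of the slopes using a Newey-West variance. The function names, the data layout, and the 1/n normalization of the autocovariances are assumptions made for this sketch.

```python
import math
from statistics import mean


def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def newey_west_tstat(series, lags=3):
    """t-statistic of the sample mean with a Newey-West (Bartlett) variance."""
    n = len(series)
    m = mean(series)
    dev = [v - m for v in series]
    # Long-run variance: gamma_0 plus Bartlett-weighted autocovariances.
    lrv = sum(d * d for d in dev) / n
    for k in range(1, lags + 1):
        gamma_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        lrv += 2.0 * (1.0 - k / (lags + 1)) * gamma_k
    return m / math.sqrt(lrv / n)


def fama_macbeth(cross_sections, lags=3):
    """cross_sections: list of (icc_premiums, realized_premiums), one per period.

    Returns the average slope and its Newey-West t-statistic.
    """
    slopes = [ols_slope(x, y) for x, y in cross_sections]
    return mean(slopes), newey_west_tstat(slopes, lags)
```

With `lags=0` the t-statistic reduces to the mean divided by its conventional standard error, which is a convenient sanity check.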

4.3. Portfolio strategies

As shown in Table 5, the ICC exhibits weak explanatory power in FM regressions. However, this finding might be driven by small and micro-cap stocks, as FM regressions weight observations equally (Novy-Marx, 2013). An additional shortcoming of FM regressions is that they are sensitive to outliers. To address these potential issues, we analyze the performance of value-weighted portfolios sorted on their ICC.

Table 6 presents annualized returns in excess of the risk-free rate. The stocks are sorted into quintiles and deciles based on their respective ICC at the end of June each year from 1986 to 2012. We report the performance of the long-short strategies 5–1 (fifth quintile minus first quintile) and 10–1 (tenth decile minus first decile). We estimate ICCs based on earnings from the following models: the CM, AF, CSAF, HVZ, RI, EP, and RW. We sort portfolios based on the following ICC approaches: CT, GLS, OJ, and MPEG. In addition, we include a Composite ICC, which is the average of the above-mentioned approaches. To compute the excess returns, we use the one-month Treasury bill rate.
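The sorting procedure can be sketched as follows. This is a minimal illustration under stated assumptions: stocks are plain dictionaries with hypothetical keys `icc`, `ret`, and `mktcap`, and groups contain an equal number of stocks. The text does not specify the breakpoints used, so equal-count groups are an assumption.

```python
def value_weighted_return(stocks):
    """Market-cap-weighted average return of a list of stock records."""
    total_cap = sum(s["mktcap"] for s in stocks)
    return sum(s["mktcap"] * s["ret"] for s in stocks) / total_cap


def long_short_return(stocks, n_groups=5):
    """Top-minus-bottom ICC group return (5-1 for quintiles, 10-1 for deciles)."""
    ranked = sorted(stocks, key=lambda s: s["icc"])
    size = len(ranked) // n_groups  # equal-count groups (an assumption)
    if size == 0:
        raise ValueError("not enough stocks for the requested number of groups")
    high = value_weighted_return(ranked[-size:])
    low = value_weighted_return(ranked[:size])
    return high - low


def annualize(monthly_excess_return_pct):
    """Annualize a monthly excess return by multiplying by 12, as in Table 6."""
    return 12.0 * monthly_excess_return_pct
```

Calling `long_short_return(stocks, n_groups=10)` gives the 10–1 spread for one formation date; averaging the spreads over time and dividing by the standard error would give the type of t-statistic reported in Table 6.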

The results of the long-short strategies show that only the ICCs estimated with the combined model and the CSAF model report significant excess returns. The ICC estimated with the combined model has significant excess returns with the GLS approach for the 5–1 (4.45%) and 10–1 (4.98%) long-short strategies and with the CT approach for the 10–1 strategy, with annualized excess returns of 6.65%. The ICC estimated with the CSAF model has significant excess returns for the 10–1 strategy with the CT and Composite ICC. Some of our results may differ from the corresponding results in the original papers for the HVZ, EP, and RI models. This may be due to a different return frequency used to compute t-statistics, different sample periods, and different stock universes.

In summary, the ICC estimated with the combined model shows a stronger correlation with returns than the other models. The results hold in both dimensions, over time and cross-sectionally. The ICC estimated with the CSAF model has predictive power similar to that of the raw analysts' forecasts but a stronger correlation with returns cross-sectionally.

5. Firm characteristics and expected returns

We evaluate whether firm characteristics that have been used to explain the cross-sectional variation of expected returns, proxied by average realized returns, show the same relation when the ICC is used as the proxy for expected returns. We perform Fama and Macbeth (1973) (FM) cross-sectional regressions with ex-post excess realized returns from July of year t to June of year t+1 and the excess ICC estimated with different proxies for earnings forecasts as dependent variables. The independent variables are firm characteristics available prior to the end of June of year t. We estimate the ICC (see footnote 26) based on different proxies of earnings forecasts at the end of June of each year.

We use the following firm characteristics. We estimate market β at the end of June for each stock and for each year using the stock's previous 60 monthly excess returns

26For the sake of brevity, following Hou et al. (2012), we provide the results based only on the Com-

posite ICC, which is the average of the CT, GLS, OJ, and MPEG approaches.


Table 6: Returns of portfolios formed on ICC

Combined Model Analysts’ Forecasts

GLS CT OJ MPEG Composite GLS CT OJ MPEG Composite

5-1 4.45 2.34 2.38 1.97 2.33 2.52 0.18 -0.79 0.17 0.50

[1.969]* [0.863] [0.978] [0.861] [0.862] [1.064] [0.065] [-0.279] [0.060] [0.177]

10-1 4.98 6.65 3.44 3.22 3.66 4.42 0.46 1.09 -0.24 -0.42

[1.656]* [2.064]* [1.129] [1.145] [1.156] [1.313] [0.117] [0.298] [-0.066] [-0.104]

CSAF Model HVZ Model

GLS CT OJ MPEG Composite GLS CT OJ MPEG Composite

5-1 3.72 2.39 0.64 1.61 2.89 2.28 1.34 2.04 2.02 0.64

[1.555] [0.916] [0.254] [0.673] [1.087] [0.840] [0.417] [0.790] [0.749] [0.217]

10-1 4.22 6.22 2.84 1.94 5.68 1.02 1.84 0.72 1.42 1.13

[1.326] [1.832]* [0.945] [0.656] [1.694]* [0.290] [0.519] [0.225] [0.421] [0.304]

EP Model RI Model

GLS CT OJ MPEG Composite GLS CT OJ MPEG Composite

5-1 2.86 1.60 -0.80 -1.74 0.07 2.21 -1.04 3.48 2.59 0.83

[0.951] [0.471] [-0.313] [-0.644] [0.020] [0.909] [-0.385] [1.556] [1.156] [0.312]

10-1 0.18 2.18 0.48 -1.18 1.42 1.26 1.39 4.26 3.01 1.92

[0.048] [0.557] [0.154] [-0.364] [0.352] [0.369] [0.378] [1.457] [1.023] [0.534]

This table reports the value-weighted excess returns of portfolios sorted on ICC. We sort stocks at the end of June each year from 1985 to 2012 into quintiles and deciles based on ICC. We report the results for the long-short strategies 5–1 (fifth quintile minus first quintile) and 10–1 (tenth decile minus first decile). We sort the portfolios on ICC computed with earnings estimated by the Combined Model (CM), raw analysts' forecasts (AF), Cross-Sectional Analysts' Forecasts (CSAF), Hou, van Dijk and Zhang (HVZ, 2012), Residual Income (RI), Earnings Persistence (EP), and Random Walk (RW) models. We estimate ICC based on Claus and Thomas (2001) (CT), Easton (2004) (MPEG), Gebhardt et al. (2001) (GLS), and Ohlson and Juettner-Nauroth (2005) (OJ). In addition, we include a Composite ICC, which is the average of all of the above-mentioned approaches. To compute the excess returns, we use the one-month Treasury bill rate, downloaded from Kenneth French's data library. OLS t-statistics are presented in brackets. ** and * denote significance based on one-tailed tests at the 0.01 and 0.05 levels, respectively. The excess returns are annualized by multiplying by 12 and are expressed in percentages. Our sample covers the period from July 1986 to June 2013.


(we require a minimum of 24 months, and excess returns are in excess of the one-month

Treasury bill rate taken from Kenneth French’s data library). Idiosyncratic volatility is

the standard deviation of the residuals from regressing the stock’s returns in excess of the

one-month Treasury bill rate on the three Fama and French (1993) factors (see footnote 27), estimated

yearly at the end of June using the previous 60 monthly returns (we require a minimum of

24 months) (e.g., Ang et al. (2006); Hou et al. (2015)). Asset growth is the change in total

assets from the ﬁscal year ending in year (t−1) to the ﬁscal year ending in (t), divided

by (t−1) total assets (e.g., Fama and French (2015)). Size is the natural logarithm of

market equity at the end of June in year (t). Gross proﬁtability is the ratio of gross proﬁts

to total assets (e.g., Novy-Marx (2013)). Leverage is book value of debt divided by book

equity. CapEx is capital expenditures divided by total assets from year (t−1). ln(beme)

is the natural logarithm of the ratio of book equity to market equity at the previous ﬁscal

year-end. In Table 7, we provide the average of the FM regression coeﬃcients estimated

yearly for the period from June 1986 to June 2012 and the respective t-statistics with

Newey-West adjustment.
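The characteristic definitions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: all function and argument names are hypothetical, market β is taken here to be the slope of a single-factor regression on the market excess return, and gross profitability is scaled by current-year total assets because the text does not state which year's assets are used.

```python
import math


def market_beta(stock_excess, market_excess, window=60, min_obs=24):
    """OLS slope of stock excess returns on market excess returns.

    Uses the most recent `window` months and returns None when fewer
    than `min_obs` observations are available.
    """
    pairs = list(zip(stock_excess, market_excess))[-window:]
    if len(pairs) < min_obs:
        return None
    n = len(pairs)
    mean_s = sum(s for s, _ in pairs) / n
    mean_m = sum(m for _, m in pairs) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in pairs)
    var = sum((m - mean_m) ** 2 for _, m in pairs)
    return cov / var


def firm_characteristics(assets_t, assets_prev, market_equity_june,
                         gross_profit, debt, book_equity, capex,
                         market_equity_fye):
    """Characteristics as defined in the text (argument names are hypothetical)."""
    return {
        "asset_growth": (assets_t - assets_prev) / assets_prev,
        "ln_size": math.log(market_equity_june),
        "gross_profitability": gross_profit / assets_t,  # scaling year assumed
        "leverage": debt / book_equity,
        "capex": capex / assets_prev,
        "ln_beme": math.log(book_equity / market_equity_fye),
    }
```

Idiosyncratic volatility would additionally require a three-factor regression and the standard deviation of its residuals, which is omitted here to keep the sketch short.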

For market β, the results are mixed. While we see negative and significant coefficients for the ICC with earnings forecasts from the combined model, as well as from the cross-sectional (CSAF, HVZ, EP, and RI) models, the ICC using analysts' earnings forecasts has a positive relation with market β. The relation between market β and forward returns is not statistically significant. These results are similar to those of Hou et al. (2012), as their ICC model has a negative and significant relation to market β and the relation to realized returns is not statistically significant. The ICC based on the combined model, analysts' forecasts, EP, and HVZ earnings forecasts has a positive and significant relation with leverage, but forward returns and the ICC with CSAF and RI earnings forecasts have no significant coefficients for leverage.

All ICC-based proxies of expected returns have positive coefficients for idiosyncratic volatility. However, the coefficients are statistically significant only for the ICC with earnings forecasts derived from the combined model (t-statistic of 2.514), analysts' forecasts (t-statistic of 4.446), the CSAF model (t-statistic of 2.518), and the EP model (t-statistic of 3.218).

The results for asset growth are interesting since we are able to conﬁrm the negative

cross-sectional relation of asset growth and returns, also shown in Aharoni et al. (2013).

27We download the three Fama-French factor returns from Kenneth French’s website.


Although the ICC estimated with most proxies of earnings forecasts shows a negative and significant relation with asset growth (the ICC with the combined model earnings forecasts has a coefficient of −0.497 and a t-statistic of −5.386, the ICC with the HVZ model has a coefficient of −1.637 and a t-statistic of −5.687, the ICC with the EP model has a coefficient of −0.417 and a t-statistic of −3.521, and the ICC with the RI model has a coefficient of −0.769 and a t-statistic of −4.986), the ICC with analysts' forecasts has a positive and significant relation, with a coefficient of 0.181 and a t-statistic of 3.076. These findings caution against using the ICC based on analysts' earnings forecasts as a proxy for expected returns.

Table 7: Implied Cost of Capital and Risk Factors

Realized Returns CM AF CSAF HVZ EP RI

Market β 0.180 -0.467 0.503 -0.703 -1.428 -0.455 -0.495

[0.106] [-2.669]* [2.170]* [-2.119]* [-3.605]** [-2.306]* [-1.888]

Idiosyncratic Volatility -0.126 0.161 0.184 0.636 0.262 0.533 0.417

[-0.792] [2.514]* [4.446]** [2.518]* [1.496] [3.218]** [1.793]

Asset Growth -4.871 -0.497 0.181 0.143 -1.637 -0.417 -0.769

[-5.563]** [-5.386]** [3.076]** [0.906] [-5.687]** [-3.521]** [-4.986]**

Ln(Size) 0.057 -1.331 -0.781 -3.736 -2.411 -3.236 -2.465

[0.084] [-5.863]** [-10.407]** [-5.168]** [-4.326]** [-6.518]** [-3.515]**

Gross Proﬁtability 5.627 3.013 -0.461 1.439 -5.151 -0.209 -3.012

[3.181]** [8.492]** [-1.518] [1.881] [-6.657]** [-0.315] [-3.280]**

Leverage -0.065 0.107 0.064 0.032 0.132 0.059 0.057

[-0.507] [4.738]** [5.936]** [1.136] [4.599]** [2.078]* [1.472]

CapEx -5.660 -3.527 0.105 -2.412 -4.845 -2.680 -3.659

[-0.731] [-5.016]** [0.151] [-1.799] [-4.032]** [-2.384]* [-3.458]**

Ln(BEME) 2.086 1.951 1.212 2.585 4.263 3.262 3.961

[1.925] [16.141]** [10.602]** [11.665]** [10.974]** [13.213]** [37.946]**

This table presents the time-series average of slope coefficients from cross-sectional FM regressions of the annual Composite ICC premium and the ex-post realized return premium on the following risk factors: market β, idiosyncratic volatility, asset growth, size, gross profitability, leverage, CapEx, and ln(beme) (book-to-market). We estimate market β at the end of June for each stock and for each year using the stock's previous 60 monthly excess returns (we require a minimum of 24 months, and excess returns are in excess of the one-month Treasury bill rate taken from Kenneth French's data library). Idiosyncratic volatility is the standard deviation of the residuals from regressing the stock's returns in excess of the one-month Treasury bill rate on the three Fama and French (1993) factors, estimated yearly at the end of June using the previous 60 monthly returns (we require a minimum of 24 months). Asset growth is the change in total assets from the fiscal year ending in year (t−1) to the fiscal year ending in (t), divided by (t−1) total assets. Size is the natural logarithm of market equity at the end of June in year (t). Gross profitability is the ratio of gross profits to total assets. Leverage is the book value of debt divided by book equity. CapEx is capital expenditures divided by total assets from year (t−1). ln(beme) is the natural logarithm of the ratio of book equity to market equity at the previous fiscal year-end. We estimate the ICC with earnings forecasts from the Combined Model (CM), raw analysts' forecasts (AF), Cross-Sectional Analysts' Forecasts (CSAF), Hou, van Dijk and Zhang (HVZ, 2012), Residual Income (RI), Earnings Persistence (EP), and Random Walk (RW) models. To compute the ICC premiums and the excess returns, we use the yield on the U.S. 10-year government bond. The Newey-West t-statistics are presented in brackets. ** and * denote significance at the 0.01 and 0.05 levels, respectively. Our sample covers the period from June 1986 to June 2012.

The size eﬀect is stronger when we use the ICC as a proxy for expected returns than

when realized returns are used. The ICCs based on any of the tested earnings forecasts

methods show signiﬁcant coeﬃcients at the 0.01 level. When we analyze the relation of


size and forward realized returns, the coeﬃcient is not statistically signiﬁcant. Concerning

the value eﬀect, the coeﬃcients of ln(BEME) are positive and statistically signiﬁcant for

all proxies of expected returns, but the t-statistics are higher when the ICC is used as

a proxy for expected returns than when the ex-post realized returns are used. This is

not surprising as the ICC is a more sophisticated value measure and is therefore highly

correlated with the value factor (e.g., Li et al. (2013)).

According to Novy-Marx (2013), gross profitability has a positive and significant relation to returns. We confirm this result: the coefficient for realized returns is 5.627, with a t-statistic of 3.181. The results for the ICC based on the combined model (a positive coefficient of 3.013 and a t-statistic of 8.492) are similar to those for returns. However, for the ICCs with earnings forecasts from the HVZ model and the RI model, the results show a negative and significant relation, with t-statistics of −6.657 and −3.280, respectively. Finally, CapEx has a negative and significant relation with the ICC based on the combined, HVZ, EP, and RI models, and an insignificant relation with the other proxies of expected returns analyzed in this study.

6. Conclusion

In this study, we develop a new method to forecast corporate earnings. We build upon

analysts’ earnings forecasts, which are known to be accurate, yet upwardly biased. To

improve these analysts’ forecasts we combine them with variables that have proven to be

good predictors of earnings. First, we include gross proﬁts, as Novy-Marx (2013) ﬁnds a

strong association with earnings. Second, we follow Ashton and Wang (2012), who show

that stock price changes drive earnings, by including recent stock market performance.

We compare our new approach to several methods from the literature, namely raw analyst forecasts, the model by Hou et al. (2012), the earnings persistence model (Li and Mohanram, 2014), and the residual income model (Li and Mohanram, 2014). In addition, we add an alternative benchmark, the CSAF model, which is based on a cross-sectional regression including only analysts' earnings forecasts as an input. We find that our combined model has the lowest bias and highest accuracy among all the tested models. Regarding market expectations, we show that the combined model also performs better than the other benchmark models. Furthermore, we compute the ICC based on the different earnings forecast models and find that the combined model leads to ICC estimates that have the strongest association with subsequent realized returns.

This new method makes a strong case for combining two different approaches to forecast earnings, that is, human forecasts made by financial analysts and mechanical forecasts based purely on financial data. These two approaches have distinct advantages and disadvantages: analysts' forecasts are known to be accurate, yet upwardly biased, whereas mechanical forecasts are unbiased, but not as accurate. Combining them into one model mitigates both disadvantages while preserving the advantages.

Our ﬁndings are relevant for practitioners working with earnings forecasts, as well as

academics employing earnings forecasts as inputs for valuation models, such as the ICC.

We recommend the use of our combined model to improve the accuracy and unbiasedness

of earnings forecasts, which beneﬁts methods that build on these forecasts and applications

thereof.


Appendix A

Model Formulas and Implementation Details Source

GLS

\[
M_t = B_t + \sum_{\tau=1}^{11} \frac{E_t\left[(ROE_{t+\tau} - ICC) \times B_{t+\tau-1}\right]}{(1 + ICC)^{\tau}} + \frac{E_t\left[(ROE_{t+12} - ICC) \times B_{t+11}\right]}{ICC \times (1 + ICC)^{11}} \tag{A.1}
\]

where $M_t$ is the market equity in year $t$, $ICC$ is the implied cost of capital, and $B_t$ is the book equity. $E_t[\cdot]$ represents market expectations based on information available in year $t$, and $(ROE_{t+\tau} - ICC) \times B_{t+\tau-1}$ denotes the residual income in year $t+\tau$, i.e., the difference between the return on book equity and the ICC multiplied by the book equity in the previous year. We compute the ROE from years $t+1$ to $t+3$ as $FEPS_t / B_{t-1}$, where $FEPS_t$ is the consensus mean I/B/E/S analysts' earnings per share forecast for period $t$. After year $t+3$, we linearly fade the ROE over the next nine years to a target industry median. We calculate this proxy as a rolling industry median over 5 years, considering only firms that have a positive ROE. Our industry definition is based on Fama and French (1997). Finally, after the period $t+12$, the terminal value is a simple perpetuity of the residual incomes. We estimate the book value based on clean surplus accounting and a constant payout ratio $PO$, i.e., $B_t = B_{t-1} + FEPS_t \times (1 - PO)$.

Source: Gebhardt et al. (2001)
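Because the ICC enters the GLS equation nonlinearly, it has to be found numerically. The following is a minimal sketch, not the authors' implementation: it evaluates the right-hand side of the GLS equation for a trial ICC and searches for the value that matches the price by bisection. It assumes per-share inputs, clean surplus with the payout applied multiplicatively to earnings, and a search bracket of (0.0001, 1.0). All names are hypothetical.

```python
def gls_implied_value(icc, b0, feps, target_roe, payout):
    """Right-hand side of the GLS equation for a trial ICC (per-share inputs).

    feps: forecasts for years t+1..t+3; target_roe: industry median ROE.
    """
    book_prev = b0  # book value at the start of year t+k
    value = b0
    roe3 = None
    for k in range(1, 13):
        if k <= 3:
            roe = feps[k - 1] / book_prev
            if k == 3:
                roe3 = roe
        else:
            # Linear fade from ROE_{t+3} to the target, reached at t+12.
            roe = roe3 + (target_roe - roe3) * (k - 3) / 9.0
        residual_income = (roe - icc) * book_prev
        if k <= 11:
            value += residual_income / (1.0 + icc) ** k
        else:
            # Terminal value: perpetuity of the year t+12 residual income.
            value += residual_income / (icc * (1.0 + icc) ** 11)
        # Clean surplus with constant payout: B_k = B_{k-1} + EPS_k * (1 - PO).
        book_prev += roe * book_prev * (1.0 - payout)
    return value


def solve_icc(price, value_fn, lo=1e-4, hi=1.0, tol=1e-10):
    """Bisection for the ICC that equates value_fn(icc) with the price."""
    f_lo = value_fn(lo) - price
    f_hi = value_fn(hi) - price
    if f_lo * f_hi > 0:
        return None  # no sign change inside the bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = value_fn(mid) - price
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

A simple round-trip check is to price a firm at a known ICC with `gls_implied_value` and confirm that `solve_icc` recovers that rate.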

CT

\[
M_t = B_t + \sum_{\tau=1}^{5} \frac{E_t\left[(ROE_{t+\tau} - ICC) \times B_{t+\tau-1}\right]}{(1 + ICC)^{\tau}} + \frac{E_t\left[(ROE_{t+5} - ICC) \times B_{t+4}\right](1 + g)}{(ICC - g) \times (1 + ICC)^{5}} \tag{A.2}
\]

where $M_t$ is the market equity in year $t$, $ICC$ is the implied cost of capital, and $B_t$ is the book equity. $E_t[\cdot]$ represents market expectations based on information available in year $t$, and $(ROE_{t+\tau} - ICC) \times B_{t+\tau-1}$ denotes the residual income in year $t+\tau$, i.e., the difference between the return on book equity and the ICC multiplied by the book equity in the previous year. We compute the ROE from years $t+1$ to $t+5$ as $FEPS_t / B_{t-1}$, where $FEPS_t$ is the consensus mean I/B/E/S analysts' earnings per share forecast for period $t$. We estimate the forecasts in the years $t+4$ and $t+5$ using a long-term growth forecast, $g$, and the three-year-ahead forecast. We estimate $g$ as the 10-year government bond yield minus an assumed real risk-free rate of three percent. Finally, after the period $t+5$, the terminal value is a growing perpetuity of the residual incomes. We estimate the book value based on clean surplus accounting and a constant payout ratio $PO$, i.e., $B_t = B_{t-1} + FEPS_t \times (1 - PO)$.

Source: Claus and Thomas (2001)
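The CT approach can be sketched in the same spirit. The code below is illustrative only and rests on several assumptions: per-share inputs, a separate hypothetical parameter `ltg` for the growth rate used to extend the three-year forecast to years t+4 and t+5, terminal growth `g` equal to the bond yield minus three percent, and a bisection bracket that starts just above `g` because the terminal value requires ICC > g.

```python
def terminal_growth(bond_yield, real_rate=0.03):
    """Terminal growth g: 10-year bond yield minus an assumed real rate."""
    return bond_yield - real_rate


def ct_implied_value(icc, b0, feps, ltg, g, payout):
    """Right-hand side of the CT equation for a trial ICC (requires icc > g)."""
    eps = list(feps[:3])
    eps.append(eps[-1] * (1.0 + ltg))  # year t+4
    eps.append(eps[-1] * (1.0 + ltg))  # year t+5
    book_prev = b0
    value = b0
    for k in range(1, 6):
        # (ROE - ICC) * B_{k-1} equals EPS_k - ICC * B_{k-1}.
        residual_income = eps[k - 1] - icc * book_prev
        value += residual_income / (1.0 + icc) ** k
        if k == 5:
            # Growing perpetuity of the year t+5 residual income.
            value += residual_income * (1.0 + g) / ((icc - g) * (1.0 + icc) ** 5)
        # Clean surplus with constant payout.
        book_prev += eps[k - 1] * (1.0 - payout)
    return value


def solve_ct_icc(price, b0, feps, ltg, g, payout, hi=1.0, tol=1e-10):
    """Bisection on (g, hi) for the ICC that matches the price."""
    lo = g + 1e-6

    def gap(r):
        return ct_implied_value(r, b0, feps, ltg, g, payout) - price

    if gap(lo) * gap(hi) > 0:
        return None  # no sign change inside the bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```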

MPEG

\[
M_t = \frac{E_t[E_{t+2}] + ICC \times E_t[D_{t+1}] - E_t[E_{t+1}]}{ICC^{2}} \tag{A.3}
\]

where $M_t$ is the market equity in year $t$ and $ICC$ is the implied cost of capital. $E_t[\cdot]$ represents market expectations based on information available in year $t$. $E_{t+1}$ and $E_{t+2}$ are the earnings forecasts in years $t+1$ and $t+2$, respectively. $D_{t+1}$ is the dividend in year $t+1$.

Source: Easton (2004)
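The MPEG equation is quadratic in the ICC, so it can be solved in closed form: multiplying through by ICC² gives M·ICC² − D·ICC − (E₂ − E₁) = 0, and the positive root is the economically meaningful one. The sketch below illustrates this; the function and argument names are hypothetical.

```python
import math


def mpeg_icc(price, e1, e2, d1):
    """Positive root of price * r**2 - d1 * r - (e2 - e1) = 0.

    price: market equity (or price per share); e1, e2: earnings forecasts
    for years t+1 and t+2; d1: expected dividend in year t+1.
    """
    disc = d1 ** 2 + 4.0 * price * (e2 - e1)
    if disc < 0:
        return None  # no real solution for these inputs
    return (d1 + math.sqrt(disc)) / (2.0 * price)
```

Substituting the result back into the MPEG equation should reproduce the price. When forecast earnings fall sharply from t+1 to t+2, the discriminant can be negative and no ICC is defined.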

OJ

\[
ICC = A + \sqrt{A^{2} + \frac{E_t[E_{t+1}]}{M_t} \times \left(g - (\gamma - 1)\right)} \tag{A.4}
\]

where $A = 0.5\left((\gamma - 1) + \frac{E_t[D_{t+1}]}{M_t}\right)$, $M_t$ is the market equity in year $t$, and $ICC$ is the implied cost of capital. $E_t[\cdot]$ represents market expectations based on information available in year $t$, $E_{t+1}$ is the earnings forecast in year $t+1$, and $D_{t+1}$ is the dividend in year $t+1$. $g$ is the short-term growth, computed as the growth rate between $EPS_{t+1}$ and $EPS_{t+2}$. $\gamma$ is the perpetual growth rate in abnormal earnings beyond the forecast horizon, calculated as the 10-year government bond yield minus an assumed real risk-free rate of three percent.

Source: Ohlson and Juettner-Nauroth (2005)
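The OJ approach yields the ICC directly, without a numerical search. The sketch below is illustrative and assumes the standard form of the OJ solution, in which the forward earnings yield multiplies the term (g − (γ − 1)) under the square root. It also assumes that `gamma` is passed as one plus the perpetual growth rate, so that γ − 1 equals the bond yield minus three percent. All names are hypothetical.

```python
import math


def oj_icc(price, e1, e2, d1, gamma):
    """Closed-form OJ implied cost of capital.

    price: market equity (or price per share); e1, e2: earnings forecasts
    for years t+1 and t+2; d1: expected dividend in year t+1;
    gamma: one plus the perpetual growth rate of abnormal earnings.
    """
    g = (e2 - e1) / e1                      # short-term earnings growth
    a = 0.5 * ((gamma - 1.0) + d1 / price)
    inside = a * a + (e1 / price) * (g - (gamma - 1.0))
    if inside < 0:
        return None  # no real solution for these inputs
    return a + math.sqrt(inside)
```

For example, with a price of 20, forecasts of 1.0 and 1.2, a dividend of 0.5, and γ − 1 = 0.02, the term under the root is 0.0225² + 0.05 × 0.18 = 0.00950625, whose square root is 0.0975, giving an ICC of 0.12.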


References

Abarbanell, J., Lehavy, R., 2003. Biased forecasts or biased earnings? The role of reported earnings in explaining apparent bias and over/underreaction in analysts' earnings forecasts. Journal of Accounting and Economics 36 (1-3), 105–146.

Abarbanell, J. S., 1991. Do analysts’ earnings forecasts incorporate information in prior

stock price changes? Journal of Accounting and Economics 14 (2), 147–165.

Aharoni, G., Grundy, B., Zeng, Q., 2013. Stock returns and the Miller Modigliani val-

uation formula: Revisiting the Fama French analysis. Journal of Financial Economics

110 (2), 347–357.

Ang, A., Hodrick, R. J., Xing, Y., Zhang, X., 2006. The cross-section of volatility and

expected returns. The Journal of Finance 61 (1), 259–299.

Ashton, D., Wang, P., 2012. Terminal valuations, growth rates and the implied cost of

capital. Review of Accounting Studies 18 (1), 261–290.

Ball, R., Brown, P., 1968. An empirical evaluation of accounting income numbers. Journal

of Accounting Research 6 (2), 159–178.

Ball, R. T., Ghysels, E., 2017. Automated earnings forecasts: Beat analysts or combine

and conquer? Management Science, Forthcoming.

Bradley, D., Gokkaya, S., Liu, X. I., 2017. Before an analyst becomes an analyst: Does

industry experience matter? The Journal of Finance 72 (2), 751–792.

Bradshaw, M. T., 2012. Analysts’ Forecasts: What Do We Know After Decades of Work?

Bradshaw, M. T., Drake, M. S., Myers, J. N., Myers, L. A., 2012. A re-examination of

analysts’ superiority over time-series forecasts of annual earnings. Review of Accounting

Studies 17 (4), 944–968.

Bradshaw, M. T., Sloan, R. G., 2002. GAAP versus the Street: An empirical assessment of two alternative definitions of earnings. Journal of Accounting Research 40 (1), 41–66.

Brown, L., Richardson, G., Schwager, S., 1987. An information interpretation of ﬁnancial

analyst superiority in forecasting earnings. Journal of Accounting and Economics 25 (1),

49–67.


Brown, L. D., 1993. Earnings forecasting research: Its implications for capital markets

research. International Journal of Forecasting 9, 295–320.

Brown, L. D., Call, A. C., Clement, M. B., 2015. Inside the “black box” of sell-side

ﬁnancial analysts. Journal of Accounting Research 53 (1), 1–47.

Claus, J., Thomas, J., 2001. Equity premia as low as three percent? Evidence from

analysts’ earnings forecasts for domestic and international stock markets. The Journal

of Finance 56 (5), 1629–1666.

Clement, M. B., Koonce, L., Lopez, T. J., 2007. The roles of task-speciﬁc forecasting

experience and innate ability in understanding analyst forecasting performance. Journal

of Accounting and Economics 44 (3), 378–398.

Easton, P. D., 2004. PE ratios, PEG ratios, and estimating the implied expected rate of return on equity capital. The Accounting Review 79 (1), 73–95.

Easton, P., Monahan, S., 2005. An evaluation of accounting-based measures of expected returns. The Accounting Review 80 (2), 501–538.

Easton, P. D., Sommers, G. A., 2007. Eﬀect of analysts’ optimism on estimates of the

expected rate of return implied by earnings forecasts. Journal of Accounting Research

45 (5), 983–1015.

Easton, P. D., Zmijewski, M. E., 1989. Cross-sectional variation in the stock market

response to accounting earnings announcements. Journal of Accounting and Economics

11 (2-3), 117–141.

Edwards, E. O., Bell, P. W., 1961. The theory and measurement of business income.

University of California Press, Berkeley, CA.

Fama, E., French, K. R., 2015. A ﬁve-factor asset pricing model. Journal of Financial

Economics 116 (1), 1–22.

Fama, E. F., French, K. R., 1993. Common risk factors in the returns on stocks and

bonds. Journal of Financial Economics 33 (1), 3–56.

Fama, E. F., French, K. R., 1997. Industry costs of equity. Journal of Financial Economics

43 (2), 153–193.


Fama, E. F., French, K. R., 2006. Proﬁtability, investment and average returns. Journal

of Financial Economics 82 (3), 491–518.

Fama, E. F., Macbeth, J. D., 1973. Risk, return, and equilibrium: Empirical tests. The Journal of Political Economy 81 (3), 607–636.

Feltham, G. A., Ohlson, J. A., 1996. Uncertainty resolution and the theory of depreciation

measurement. Journal of Accounting Research 34 (2), 209–234.

Francis, J., Philbrick, D., 1993. Analysts’ decisions as products of a multi-task environ-

ment. Journal of Accounting Research 31 (2), 216.

Frank, M. Z., Shen, T., 2016. Investment and the weighted average cost of capital. Journal

of Financial Economics 119 (2), 300–315.

Fried, D., Givoly, D., 1982. Financial analysts’ forecasts of earnings: A better surrogate

for market expectations. Journal of Accounting and Economics 4 (2), 85–107.

Gebhardt, W. R., Lee, C. M. C., Swaminathan, B., 2001. Toward an implied cost of

capital. Journal of Accounting Research 39 (1), 135–176.

Gerakos, J., Gramacy, R. B., 2013. Regression-based earnings forecasts. Working Paper,

Chicago Booth Research Paper No. 12-26.

Guay, W., Kothari, S. P., Shu, S., 2011. Properties of implied cost of capital using analysts’

forecasts. Australian Journal of Management 36 (2), 125–149.

Hong, H., Kubik, J. D., 2003. Analyzing the analysts: career concerns and biased earnings

forecasts. The Journal of Finance 58 (1), 313–351.

Hou, K., van Dijk, M. A., Zhang, Y., 2012. The implied cost of capital: A new approach.

Journal of Accounting and Economics 53 (3), 504–526.

Hou, K., Xue, C., Zhang, L., 2015. Digesting anomalies: An investment approach. Review

of Financial Studies 28 (3), 650–705.

Lee, C., Ng, D., Swaminathan, B., 2009. Testing International Asset Pricing Models Using

Implied Costs of Capital. Journal of Financial and Quantitative Analysis 44 (02), 307.


Li, K. K., Mohanram, P., 2014. Evaluating cross-sectional forecasting models for implied

cost of capital. Review of Accounting Studies 19 (3), 1152–1185.

Li, Y., Ng, D. T., Swaminathan, B., 2013. Predicting market returns using aggregate

implied cost of capital. Journal of Financial Economics 110 (2), 419–436.

Lin, H.-W., McNichols, M. F., 1998. Underwriting relationships, analysts’ earnings fore-

casts and investment recommendations. Journal of Accounting and Economics 25 (1),

101–127.

McNichols, M. F., O'Brien, P. C., 1997. Self-selection and analyst coverage. Journal of Accounting Research 35 (Supplement), 167–199.

Merkley, K., Michaely, R., Pacelli, J., 2017. Does the scope of the sell-side analyst industry

matter? An examination of bias, accuracy, and information content of analyst reports.

The Journal of Finance 72 (3), 1285–1334.

Newey, W., West, K., 1987. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55 (3), 703–708.

Novy-Marx, R., 2013. The other side of value: The gross proﬁtability premium. Journal

of Financial Economics 108 (1), 1–28.

O’Brien, P. C., 1988. Analysts’ forecasts as earnings expectations. Journal of Accounting

and Economics 10 (1), 53–83.

Ohlson, J. A., 2005. On accounting-based valuation formulae. Review of Accounting Stud-

ies 10 (2-3), 323–347.

Ohlson, J. A., Juettner-Nauroth, B. E., 2005. Expected EPS and EPS growth as deter-

minants of value. Review of Accounting Studies 10 (2-3), 349–365.

Pástor, L., Sinha, M., Swaminathan, B., 2008. Estimating the intertemporal risk-return

tradeoﬀ using the implied cost of capital. The Journal of Finance 63 (6), 2859–2897.

Richardson, S., Tuna, I., Wysocki, P., 2010. Accounting anomalies and fundamental analy-

sis: A review of recent research advances. Journal of Accounting and Economics 50 (2-3),

410–454.


Richardson, S. A., Sloan, R. G., Soliman, M. T., Tuna, I., 2005. Accrual reliability,

earnings persistence and stock prices. Journal of Accounting and Economics 39 (3),

437–485.

Welch, I., Goyal, A., 2008. A comprehensive look at the empirical performance of equity

premium prediction. Review of Financial Studies 21 (4), 1455–1508.
