Article

A Comparison of Some Out-of-Sample Tests of Predictability in Iterated Multi-Step-Ahead Forecasts

Authors:
  • Adolfo Ibañez University
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

We consider tests of equal population forecasting ability when mean squared prediction error is the metric for forecasting ability, the two competing models are nested, and the iterated method is used to obtain multistep forecasts. We use Monte Carlo simulations to explore the size and power of the MSPE-adjusted test of Clark and West (2006, 2007) (CW) and the Diebold-Mariano-West (DMW) test. The empirical size of the CW test is almost always tolerable: across a set of 252 simulation results that span 5 DGPs, 9 horizons, and various sample sizes, the median size of nominal 10% tests is 8.8%. The comparable figure for the DMW test, which is generally undersized, is 2.2%. An exception for DMW occurs for long horizon forecasts and processes that quickly revert to the mean, in which case CW and DMW perform comparably. We argue that this is to be expected, because at long horizons the two competing models are both forecasting the process to have reverted to its mean. An exception for CW occurs with a nonlinear DGP, in which CW is usually oversized. CW has greater power and greater size adjusted power than does DMW in virtually all DGPs, horizons and sample sizes. For both CW and DMW, power tends to fall with the horizon, reflecting the fact that forecasts from the two competing models both converge towards the mean as the horizon grows. Consistent with these results, in an empirical exercise comparing models for inflation, CW yields many more rejections of equal forecasting ability than does DMW, with most of the rejections occurring at short horizons.
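As an illustration of the two statistics compared in the abstract, the following is a minimal sketch, not the authors' code. It assumes the standard textbook forms: the DMW statistic is a t-statistic on the difference in squared forecast errors, and the CW statistic adds the squared difference between the two forecasts to each term. The function names, and the use of a plain sample variance instead of a serial-correlation-robust (HAC) variance, are assumptions of this sketch; at horizons beyond one step a HAC estimate would normally be needed.

```python
import math


def t_stat(series):
    """t-statistic for the mean of a series, using the plain sample variance.

    Assumption of this sketch: no HAC correction, so it is only
    appropriate when the terms are serially uncorrelated.
    """
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / (n - 1)
    return mean / math.sqrt(var / n)


def dmw_and_cw(y, f_nested, f_nesting):
    """Return (DMW, CW) t-statistics for nested vs. nesting forecasts."""
    e1 = [a - b for a, b in zip(y, f_nested)]    # errors of the small model
    e2 = [a - b for a, b in zip(y, f_nesting)]   # errors of the large model
    # DMW term: difference in squared errors.
    dmw_terms = [u * u - v * v for u, v in zip(e1, e2)]
    # CW term: DMW term plus the squared gap between the two forecasts.
    cw_terms = [d + (a - b) ** 2
                for d, a, b in zip(dmw_terms, f_nested, f_nesting)]
    return t_stat(dmw_terms), t_stat(cw_terms)


# Toy data (made up for illustration only).
y = [1.0, -0.5, 0.3, 0.8]
f_nested = [0.0, 0.0, 0.0, 0.0]
f_nesting = [0.5, -0.2, 0.1, 0.3]
dmw, cw = dmw_and_cw(y, f_nested, f_nesting)
```

Both statistics would be compared with one-sided standard normal critical values, for example 1.282 for a nominal 10% test.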


... Therefore, our approach is recommendable when used with slightly undersized tests. Simulations completed by Pincheira and West (2016) show that Clark and West is indeed undersized at short horizons of h=1, h=2 and h=3 steps ahead. Consequently, our approach is expected to be adequate at these horizons as well. is bounded in probability. ...
... They show that the cost of approximating the correct critical values by standard normal ones is in general low: it produces a little undersized test. Furthermore, simulations completed by Clark and McCracken (2013) and Pincheira and West (2016) are consistent with the view that the CW statistic can reasonably be thought of as approximately normal. We will see via simulations in the following sections that our approach also seems to work well with standard normal critical values in a variety of settings. ...
... DGP 1: Here we focus on the case where the null is a martingale model. DGP 1 is fairly similar to the first DGP in Pincheira and West (2016) and to those used in Clark and West (2006), Mankiw and Shapiro (1986), Nelson and Kim (1993), Stambaugh (1999), Campbell (2001), Tauchen (2001) and Pincheira (2013). This DGP is designed to match exchange rate series for which the martingale difference is a plausible null hypothesis and a model based on uncovered interest parity is a plausible alternative. ...
Article
Full-text available
In this paper we introduce a “power booster factor” for out-of-sample tests of predictability. The relevant econometric environment is one in which the econometrician wants to compare the population Mean Squared Prediction Errors (MSPE) of two models: one big nesting model, and another smaller nested model. Although our factor can be used to improve finite sample properties of several out-of-sample tests of predictability, in this paper we focus on the widely used test developed by Clark and West (2006, 2007). Our new test multiplies the Clark and West t-statistic by a factor that should be close to one under the null hypothesis that the short nested model is the true model, but that should be greater than one under the alternative hypothesis that the big nesting model is more adequate. We use Monte Carlo simulations to explore the size and power of our approach. Our simulations reveal that the new test is well sized and powerful. In particular, it tends to be less undersized and more powerful than the test by Clark and West (2006, 2007). Although most of the gains in power are associated with size improvements, we also obtain gains in size-adjusted power. Finally, we illustrate the use of our approach when evaluating the ability that an international core inflation factor has to predict core inflation in a sample of 30 OECD economies. With our “power booster factor” more rejections of the null hypothesis are obtained, indicating a strong influence of global inflation in a selected group of these OECD countries.
... Most of the asymptotic theory for the CW test and other statistics developed in McCracken (2001, 2005) [9,10] and McCracken (2007) [11] focused almost exclusively on direct multi-step-ahead forecasts. However, with some exceptions (e.g., Clark and McCracken (2013b) [15] and Pincheira and West (2016) [16]), iterated multi-step-ahead forecasts have received much less attention. In part for this reason, we evaluated the performance of our test (relative to CW), focusing on iterated multi-step-ahead forecasts. ...
... We considered a case like this given its relevance in finance and macroeconomics. Our setup is very similar to simulation experiments in Pincheira and West (2016) [16], Stambaugh (1999) [29], Nelson and Kim (1993) [30], and Mankiw and Shapiro (1986) [31]. ...
Article
Full-text available
In this paper, we present a new asymptotically normal test for out-of-sample evaluation in nested models. Our approach is a simple modification of a traditional encompassing test that is commonly known as Clark and West test (CW). The key point of our strategy is to introduce an independent random variable that prevents the traditional CW test from becoming degenerate under the null hypothesis of equal predictive ability. Using the approach developed by West (1996), we show that in our test, the impact of parameter estimation uncertainty vanishes asymptotically. Using a variety of Monte Carlo simulations in iterated multi-step-ahead forecasts, we evaluated our test and CW in terms of size and power. These simulations reveal that our approach is reasonably well-sized, even at long horizons when CW may present severe size distortions. In terms of power, results were mixed but CW has an edge over our approach. Finally, we illustrate the use of our test with an empirical application in the context of the commodity currencies literature.
... Most of the asymptotic theory for the CW test and other statistics developed in McCracken (2001, 2005) and McCracken (2007) focus almost exclusively on direct multistep-ahead forecasts. However, with some exceptions (e.g., Clark and McCracken (2013b) and Pincheira and West (2016)), iterated multi-step-ahead forecasts have received much less attention. In part for this reason, we evaluate the performance of our test (relative to CW), focusing on iterated multi-step-ahead forecasts. ...
... Additionally, the alternative forecast for multi-step-ahead horizons is constructed iteratively through an AR(p) on +1 . This is the same parametrization considered in Pincheira and West (2016), and it is based on a monthly exchange rate application in Clark and West (2006). Therefore, +1 represents the monthly return of a U.S dollar bilateral exchange rate and is the corresponding interest rate differential. ...
... Our second DGP is mainly inspired in macroeconomic data, and it is also considered in Pincheira and West (2016) and Clark and West (2007). This DGP is based on models exploring the relationship between U.S GDP growth and the Federal Reserve Bank of Chicago's factor index of economic activity. ...
Preprint
Full-text available
In this paper we present a new asymptotically normal test for out-of-sample evaluation in nested models. Our approach is a simple modification of a traditional encompassing test that is commonly known as Clark and West test (CW). The key point of our strategy is to introduce an independent random variable that prevents the traditional CW test from becoming degenerate under the null hypothesis of equal predictive ability. Using the approach developed by West (1996), we show that in our test the impact of parameter estimation uncertainty vanishes asymptotically. Using a variety of Monte Carlo simulations in iterated multi-step-ahead forecasts we evaluate our test and CW in terms of size and power. These simulations reveal that our approach is reasonably well-sized even at long horizons when CW may present severe size distortions. In terms of power, results are mixed but CW has an edge over our approach. Finally, we illustrate the use of our test with an empirical application in the context of the commodity currencies literature.
... The Chilean peso is particularly attractive as a commodity-currency due to the relevance of copper in the country's economy. According to the Central Bank of Chile, copper represented 48.02% of total Chilean exports in 2019. Besides, Chile is a net oil-importer country with about 10% of total imports focused on purchases of crude oil and its derivatives. ...
... Given that the CW test evaluates differences in MSPE at the population level, it still might reject the null when the ratios shown in Tables 7 and 8 are greater than 1. These paradoxical results are now well known in the forecasting literature and are explained in detail in the same papers by West (2006, 2007) and in many others, see for instance, West (2006) and Pincheira and West (2016). ...
Article
In this paper we show that the Chilean exchange rate has the ability to predict the returns of oil and of three additional oil-related products: gasoline, propane and heating oil. We show this using both in-sample and out-of-sample exercises at multiple horizons. Natural explanations for our findings rely on the well-known “dollar effect” and on the present-value theory for exchange rate determination in combination with the strong co-movement displayed by fuel and metal prices. Given that the Chilean economy is heavily influenced by copper, which represents nearly 50% of total national exports, the floating Chilean peso is importantly affected by price fluctuations in this metal. As oil-related products display an important co-movement with base metal prices, it is reasonable to expect evidence of Granger causality from the Chilean peso to these oil-related products. Interestingly, we provide sound evidence indicating that the predictive ability of the Chilean peso goes beyond these natural explanations. In particular, we show another plausible predictive channel: volatility in combination with a negative contemporaneous leverage effect in fuel returns. Finally, we compare the Chilean peso with other commodity-currencies in their ability to predict fuel returns. The Chilean peso fares extremely well in this competition, especially at short horizons of one, three and six months.
... They show that the cost of approximating the correct critical values with standard normal critical values is in general low: it produces a little undersized test. Further work by Clark & McCracken (2013) and Pincheira & West (2016) shows that normal critical values tend to work well when multistep ahead forecasts are constructed using the iterative method, at least when the data generating process is not very persistent. This is very important because in this paper we rely on the iterative method for the construction of multistep ahead forecasts. ...
... This is very important because in this paper we rely on the iterative method for the construction of multistep ahead forecasts. We rely then on the vast simulations provided by CW, Clark & McCracken (2013) and Pincheira & West (2016) to use standard normal critical values in our out-of-sample exercises. ...
Article
Full-text available
We propose a useful way to predict building permits in the US, exploiting rich real-time data from web search queries. The time series on building permits is usually considered as a leading indicator of economic activity in the construction sector. Nevertheless, new data on building permits are released with a lag close to two months. Therefore, an accurate now-cast of this leading indicator is desirable. We show that models including Google search queries now-cast and forecast better than our good, not naïve, univariate benchmarks both in-sample and out-of-sample. We also show that our results are robust to different specifications, the use of rolling or recursive windows and, in some cases, to the forecasting horizon. Since Google queries information is free, our approach is a simple and inexpensive way to predict building permits in the United States.
... Only relatively recent research has explored the behavior of the Clark and West (2007) test in iterated multistep ahead forecasts. See Clark and McCracken (2013) and Pincheira and West (2016). Via simulations, these papers show that the CW test performs well when the models under evaluation are linear. ...
... Consistent with Pincheira and West (2016), figures in tables C2 and C3 are greater than the corresponding asymptotically normal critical value at the 10%, 5% and 1% significance levels. Let us recall that these asymptotical critical values are 1.282, 1.645 and 2.32 respectively. ...
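The one-sided standard normal critical values quoted in this excerpt can be checked with a short sketch using Python's standard library. The helper name `one_sided_critical_value` is an assumption introduced for illustration. To three decimals the 1% value is 2.326, which the excerpt reports as 2.32.

```python
from statistics import NormalDist


def one_sided_critical_value(alpha):
    """Upper-tail critical value of the standard normal at level alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


# Nominal 10%, 5% and 1% one-sided tests.
critical_values = {
    alpha: round(one_sided_critical_value(alpha), 3)
    for alpha in (0.10, 0.05, 0.01)
}
# critical_values == {0.1: 1.282, 0.05: 1.645, 0.01: 2.326}
```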
Article
We explore the ability of traditional core inflation –consumer prices excluding food and energy– to predict headline CPI annual inflation. We analyze a sample of OECD and non-OECD economies using monthly data from January 1994 to March 2015. Our results indicate that sizable predictability emerges for a small subset of countries. For the rest of our economies predictability is either subtle or undetectable. These results hold true even when implementing an out-of-sample test of Granger causality especially designed to compare forecasts from nested models. Our findings partially challenge the common wisdom about the ability of core inflation to forecast headline inflation, and suggest a careful weighting of the traditional exclusion of food and energy prices when assessing the size of the monetary stimulus.
... normal, but in DGP1 we also experiment with shocks displaying fat tails. In all simulations we consider both rolling and recursive samples, several values for the parameter λ in (2.17), a single value of the initial regression sample size R and four values of the number of one step ahead predictions P. 4.1 Experimental design. DGP 1: For the case where the null is a martingale model, we consider a DGP fairly similar to DGP 1 in Pincheira and West (2016). This DGP is such as the ones used in Clark and West (2006), Mankiw and Shapiro (1986), Nelson and Kim (1993), Stambaugh (1999), Campbell (2001), Tauchen (2001) and Pincheira (2013). ...
... Our second DGP corresponds to the very same DGP 3 in Pincheira and West (2016). This DGP is motivated by the literature on commodity currencies. ...
... Forecast precision in economic models has long been critical in financial decision making, with significant advances in methodologies and tools over time (Brandl et al., 2006; Pincheira & West, 2016). The seminal work of Meese and Rogoff (1983) catalyzed a shift in focus toward prediction evaluation in economic models, particularly in the context of exchange rates (Engel et al., 2007). ...
Article
Full-text available
This study proposes a novel method for forecasting the returns of assets comprising the Ibovespa from January 1, 2016, to December 30, 2020, by integrating machine learning algorithms-Gradient Boosting Machine, k-Nearest Neighbor, and Bayesian Regularized Neural Networks. Employing an ensemble strategy with diverse data modeling approaches, the method includes a pre-processing stage for variable selection, ranking their importance using statistical techniques such as OneR, Information Gain, and Chi-Square. This approach aims to overcome common challenges such as overfitting, high dimensionality, and computational efficiency, thus enhancing the robustness of the machine learning model and reducing susceptibility to biases and fluctuations. Empirical results demonstrate that, compared to the ARIMA model, the machine learning algorithm shows superior performance in forecast error and forecast hit rate and precision (R2, Willmott, and Kurtosis). Furthermore, the results suggest that the proposed algorithm can significantly improve predictive precision when applied to the ARIMA model and generalized to various datasets that include various markets and assets.
... In the specific case of nested models, a rejection of the null hypothesis of no encompassing means that a combination of the forecasts from the nested and nonnested models is better (in terms of MSPE) than either individual forecast. See [12,67,68] for more insights into the interpretation of the ENC-t. ...
Article
Full-text available
This paper tests the random walk hypothesis in the cryptocurrency market. Based on the well-known Meese–Rogoff puzzle, we evaluate whether cryptocurrency returns are predictable or not. For this purpose, we conduct in-sample and out-of-sample analyses to examine the forecasting power of our model built with autoregressive components and lagged returns of Bitcoin, compared with the random walk benchmark. To this end, we considered the 13 major cryptocurrencies between 2018 and 2022. Our results indicate that our models significantly outperform the random walk benchmark. In particular, cryptocurrencies tend to be far more persistent than regular exchange rates, and Bitcoin (BTC) seems to improve the predictive accuracy of our models for some cryptocurrencies. Furthermore, while the predictive performance is time varying, we find predictive ability in different regimes before and during the pandemic crisis. We think that these results are helpful to policymakers and investors because they open a new perspective on cryptocurrency investing strategies and regulations to improve financial stability.
... We construct multistep ahead forecasts using the iterated approach, explained for instance, in Pincheira and West (2016). According to this approach, at time the ℎ-step ahead forecast for ln( ) is built in terms of the (ℎ − 1)-step ahead forecast as follows ...
Article
Full-text available
Recently, the Generalized Growth Model (GGM) has played a prominent role as an effective tool to predict the spread of pandemics exhibiting subexponential growth. A key feature of this model is a damping parameter p that is bounded to the [0,1] interval. By allowing this parameter to take negative values, we show that the GGM can also be useful to predict the spread of COVID-19 in countries that are at middle stages of the pandemic. Using both in-sample and out-of-sample evaluations, we show that a semi-unrestricted version of the model outperforms the traditional GGM in a number of countries when predicting the number of infected people at short horizons. Reductions in Root Mean Squared Prediction Errors (RMSPE) are shown to be substantial. Our results indicate that our semi-unrestricted version of the GGM should be added to the traditional set of phenomenological models used to generate forecasts during early to middle stages of epidemic outbreaks.
... The out-of-sample analyses presented in subsection 3. Table 1 with the strategy that simply predicts base metals returns with a constant estimated in recursive windows. We follow this approach because, according to the work in Pincheira and West (2016), with some convex combinations between the nesting and nested models we should be able to outperform the nested benchmark at the sample level whenever the core statistic of the ENCNEW test is positive 3 . In our notation represents the out-of-sample MSPE of the RW with drift. ...
... Simulation evidence carried out by Clark and McCracken (2013) and Pincheira and West (2016) shows that normal critical values tend to work well when multistep-ahead forecasts are constructed using the iterative method, at least when the data generating process is not very persistent. This is very important because in this paper we use the iterative method for the construction of multistep-ahead forecasts. ...
Article
Full-text available
We propose a useful way to predict building permits in the USA, exploiting rich data from web search queries. The relevance of our work relies on the fact that the time series on building permits is used as a leading indicator of economic activity in the construction sector. Nevertheless, new data on building permits are released with a lag of a few weeks. Therefore, an accurate nowcast of this leading indicator is desirable. In this paper, we show that models including Google search queries nowcast and forecast better than many of our good, not naïve benchmarks. We show this with both in-sample and out-of-sample exercises. In addition, we show that the results of these predictions are robust to different specifications, the use of rolling or expanding windows and, in some cases, to the forecasting horizon. Since Google queries information is free, our approach is a simple and inexpensive way to predict building permits in the USA.
Article
Full-text available
We show that a straightforward modification of a trading-based test for predictability displays interesting advantages over the Excess Profitability (EP) test proposed by Anatolyev and Gerko when testing the Driftless Random Walk Hypothesis. Our statistic is called the Straightforward Excess Profitability (SEP) test, and it avoids the calculation of a term that under the null of no predictability should be zero but in practice may be sizable. In addition, our test does not require the strong assumption of independence used to derive the EP test. We claim that dependence is the rule and not the exception. We show via Monte Carlo simulations that the SEP test outperforms the EP test in terms of size and power. Finally, we illustrate the use of our test in an empirical application within the context of the commodity-currencies literature.
Article
We explore the ability of core inflation to predict headline CPI annual inflation for a sample of eight developing economies in Latin America over the period January 1995–May 2017. Our in-sample and out-of-sample results are roughly consistent in providing robust evidence of predictability in four of the countries in our sample. Mixed evidence is found for the other four countries. The bulk of the out-of-sample evidence of predictability concentrates on the short horizons of one and six months. In contrast, at the longest horizon of 24 months, we only find out-of-sample evidence of predictability for two countries: Chile and Colombia, with robust results only for the latter. This is both important and challenging, given that the monetary authorities in our sample of developing countries are currently implementing or are taking steps toward the future implementation of inflation targeting regimes, which are based heavily on long-run inflation forecasts.
Article
Full-text available
In this paper we build forecasts for Chilean year-on-year inflation using both multivariate and univariate time series models augmented with different measures of international inflation. We consider two versions of international inflation factors. The first version is built using year-on-year inflation of 18 Latin American countries (excluding Chile). The second version is built using year-on-year inflation of 30 OECD countries (excluding Chile). We show sound in-sample and pseudo out-of-sample evidence indicating that these international factors do help forecast Chilean inflation at several horizons by reducing the root-mean squared prediction error of our benchmark models. Our results are robust to a number of sensitivity analyses. Several transmission channels from international to domestic inflation are also discussed. Finally, we provide some comments about the implications of our findings for the conduct of domestic monetary policy.
Article
Full-text available
This paper examines the asymptotic and finite-sample properties of tests of equal forecast accuracy and encompassing applied to direct, multistep predictions from nested regression models. We first derive asymptotic distributions; these nonstandard distributions depend on the parameters of the data-generating process. We then use Monte Carlo simulations to examine finite-sample size and power. Our asymptotic approximation yields good size and power properties for some, but not all, of the tests; a bootstrap works reasonably well for all tests. The paper concludes with a reexamination of the predictive content of capacity utilization for inflation.
Article
Full-text available
Purpose: The purpose of this paper is to propose and test empirically an inflation model containing permanent and transitory heteroskedastic components for the G7 countries. More specifically, recent evidence from the literature is gathered to construct a model with a heteroskedastic global component capturing comovements amongst G7 economies. Moreover, evidence of asymmetric generalized autoregressive conditionally heteroskedastic effects both in the transitory and in the permanent components is taken into account, and the time-varying variance of each component allows their influence over the observable inflation to change over time. Out-of-sample forecasting exercises are used to test the model validity.
Design/methodology/approach: The model is written in state-space form and estimation is carried out in one step via quasi-maximum likelihood using the augmented Kalman filter, which allows us to compute smoothed estimates of permanent and of transitory components of inflation rates. Out-of-sample forecasts are compared against a random walk (RW) and an autoregressive (AR) model of order one. The significance of the differences in forecast accuracy is tested using the Diebold-Mariano test, the forecast encompassing test, and the Pesaran and Timmermann test.
Findings: The proposed model fits the data quite well and has good forecasting capabilities when compared to RW and to AR models of order one. The volatility of the global inflation trend extracted from the model captures the international effects of the “Great Moderation” and of the “Great Recession”. An increase in correlation of inflation for certain country pairs since the start of the “Great Recession” is observed. Moreover, there is evidence of asymmetry in inflation volatility, which is consistent with the idea that higher inflation levels lead to greater uncertainty about future inflation.
Originality/value: This article introduces a new global inflation model with permanent and transitory heteroskedastic components incorporating many recent findings of the literature, and proposes a one-step estimation procedure for it. The model fits the data very well and produces good out-of-sample forecasts.
Article
Full-text available
In this paper, two competing types of multistep predictors, i.e., plug-in and direct predictors, are considered in autoregressive (AR) processes. When a working model AR(k) is used for the h-step prediction with h 1, the plug-in predictor is obtained from repeatedly using the fitted (by least squares) AR(k) model with unknown future values replaced by their own forecasts, and the direct predictor is obtained by estimating the h-step prediction model's coefficients directly by linear least squares. Under rather mild conditions, asymptotic expressions for the mean-squared prediction errors (MSPEs) of these two predictors are obtained in stationary cases. In addition, we also extend these results to models with deterministic time trends. Based on these expressions, performances of the plug-in and direct predictors are compared. Finally, two examples are given to illustrate that some stationary case results on these MSPEs cannot be generalized to the nonstationary case. The author is deeply grateful to the co-editor Pentti Saikkonen and two referees for their helpful suggestions and comments on a previous version of this paper.
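The difference between the plug-in (iterated) and direct predictors described in this abstract can be sketched for the simplest case. The code below is an illustrative assumption, not the paper's procedure: it uses a zero-mean AR(1) working model without an intercept, and the function names `ols_slope`, `plug_in_forecast` and `direct_forecast` are hypothetical.

```python
def ols_slope(x, z):
    """Least-squares slope of z on x, with no intercept."""
    return sum(a * b for a, b in zip(x, z)) / sum(a * a for a in x)


def plug_in_forecast(y, h):
    """Iterated h-step forecast: fit a one-step AR(1), then apply it h times."""
    phi = ols_slope(y[:-1], y[1:])   # regress y_t on y_{t-1}
    forecast = y[-1]
    for _ in range(h):
        forecast = phi * forecast    # each step feeds on the previous forecast
    return forecast


def direct_forecast(y, h):
    """Direct h-step forecast: regress y_{t+h} on y_t and apply once."""
    beta_h = ols_slope(y[:-h], y[h:])
    return beta_h * y[-1]


# Noise-free toy series y_t = 0.5**t, for which the two predictors coincide.
y = [0.5 ** t for t in range(10)]
plug_in = plug_in_forecast(y, 3)
direct = direct_forecast(y, 3)
```

On noisy data the two forecasts generally differ, because the plug-in predictor raises the estimated one-step coefficient to the power h while the direct predictor estimates the h-step coefficient separately.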
Article
Full-text available
Standard models of exchange rates, based on macroeconomic variables such as prices, interest rates, output, etc., are thought by many researchers to have failed empirically. We present evidence to the contrary. First, we emphasize the point that "beating a random walk" in forecasting is too strong a criterion for accepting an exchange rate model. Typically models should have low forecasting power of this type. We then propose a number of alternative ways to evaluate models. We examine in-sample fit, but emphasize the importance of the monetary policy rule, and its effects on expectations, in determining exchange rates. Next we present evidence that exchange rates incorporate news about future macroeconomic fundamentals, as the models imply. We demonstrate that the models might well be able to account for observed exchange-rate volatility. We discuss studies that examine the response of exchange rates to announcements of economic data. Then we present estimates of exchange-rate models in which expected present values of fundamentals are calculated from survey forecasts. Finally, we show that out-of-sample forecasting power of models can be increased by focusing on panel estimation and long-horizon forecasts.
Article
Full-text available
This paper shows that inflation in industrialized countries is largely a global phenomenon. First, the inflation rates of 22 OECD countries have a common factor that alone accounts for nearly 70 percent of their variance. This large variance share that is associated with Global Inflation is not only due to the trend components of inflation (up from 1960 to 1980 and down thereafter) but also to fluctuations at business cycle frequencies. Second, we show that, in conformity with the prediction of New Keynesian open economy models, there is little spillover of inflationary shocks across countries. The comovement of inflation comes largely from common shocks. Global Inflation is a function of real developments at short horizons and monetary developments at longer horizons. Third, there is a robust "error correction mechanism" that brings national inflation rates back to Global Inflation. A simple model that accounts for this feature consistently beats the previous benchmarks used to forecast inflation 4 to 8 quarters ahead across samples and countries.
Article
When a rate of return is regressed on a lagged stochastic regressor, such as a dividend yield, the regression disturbance is correlated with the regressor's innovation. The OLS estimator's finite-sample properties, derived here, can depart substantially from the standard regression setting. Bayesian posterior distributions for the regression parameters are obtained under specifications that differ with respect to (i) prior beliefs about the autocorrelation of the regressor and (ii) whether the initial observation of the regressor is specified as fixed or stochastic. The posteriors differ across such specifications, and asset allocations in the presence of estimation risk exhibit sensitivity to those differences.
Article
We propose and evaluate explicit tests of the null hypothesis of no difference in the accuracy of two competing forecasts. In contrast to previously developed tests, a wide variety of accuracy measures can be used (in particular, the loss function need not be quadratic and need not even be symmetric), and forecast errors can be non-Gaussian, nonzero mean, serially correlated, and contemporaneously correlated. Asymptotic and exact finite-sample tests are proposed, evaluated, and illustrated.
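As a rough illustration of the kind of test this abstract describes, the sketch below computes a t-type statistic on the loss differential between two forecasts under an arbitrary loss function. It is a minimal sketch, not the authors' procedure: it assumes one-step-ahead forecasts with serially uncorrelated loss differentials, so a plain sample variance is used instead of an autocorrelation-robust one. The function name `loss_differential_stat` and the example numbers are hypothetical.

```python
import math


def loss_differential_stat(errors_a, errors_b, loss=abs):
    """t-type statistic for the mean loss differential loss(e_a) - loss(e_b).

    Assumption: loss differentials are serially uncorrelated (one-step
    forecasts), so the ordinary sample variance is used.
    """
    d = [loss(ea) - loss(eb) for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((v - d_bar) ** 2 for v in d) / (n - 1)
    return d_bar / math.sqrt(var_d / n)


# Hypothetical forecast errors, compared under absolute-error loss.
errors_a = [1.0, -2.0, 1.0, -2.0]
errors_b = [0.5, -1.0, 0.5, -0.5]
stat = loss_differential_stat(errors_a, errors_b, loss=abs)
```

A positive value indicates that the first forecast incurred the larger average loss in this sample. Passing `loss=lambda e: e * e` gives the squared-error version.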
Article
We explore the ability of traditional core inflation (consumer prices excluding food and energy) to predict headline CPI annual inflation. We analyze a sample of OECD and non-OECD economies using monthly data from January 1994 to March 2015. Our results indicate that sizable predictability emerges for a small subset of countries. For the rest of our economies predictability is either subtle or undetectable. These results hold true even when implementing an out-of-sample test of Granger causality especially designed to compare forecasts from nested models. Our findings partially challenge the common wisdom about the ability of core inflation to forecast headline inflation, and suggest a careful weighting of the traditional exclusion of food and energy prices when assessing the size of the monetary stimulus.
Article
In this paper we analyse the utility of international measures of inflation in predicting local ones. To that end, we consider a set of 31 OECD economies for which monthly inflation data are available. Three main conclusions emerge. First, there is an important share of countries for which relatively robust evidence of predictability is found for both core and headline inflation. Second, the share of countries for which there is evidence of robust predictability is about the same for core and headline inflation, although gains in root-mean-squared prediction error are higher for headline inflation. Third, while the evidence indicates that an international inflation factor may be a useful predictor for several countries, it also indicates that, for many countries as well, predictability is either questionable, undetectable, non-robust or simply non-existent.
Article
This chapter discusses what the asset-pricing literature concludes about the forecastability of interest rates. It outlines forecasting methodologies implied by this literature, including dynamic, no-arbitrage term structure models and their macro-finance extensions. It also reviews the empirical evidence concerning the predictability of future yields on Treasury bonds and future excess returns to holding these bonds. In particular, it critically evaluates theory and evidence that variables other than current bond yields are useful in forecasting.
Article
This chapter discusses recent developments in inflation forecasting. We perform a horse-race among a large set of traditional and recently developed forecasting methods, and discuss a number of principles that emerge from this exercise. We find that judgmental survey forecasts outperform model-based ones, often by a wide margin. A very simple forecast that is just a glide path between the survey assessment of inflation in the current-quarter and the long-run survey forecast value turns out to be competitive with the actual survey forecast and thereby does about as well or better than model-based forecasts. We explore the strengths and weaknesses of some specific prediction methods, including forecasts based on the Phillips curve and based on dynamic stochastic general equilibrium models, in greater detail. We also consider measures of inflation expectations taken from financial markets and the tradeoff between forecasting aggregates and disaggregates.
Article
This paper surveys recent developments in the evaluation of point and density forecasts in the context of forecasts made by Vector Autoregressions. Specific emphasis is placed on highlighting those parts of the existing literature that are applicable to direct multi-step forecasts and those parts that are applicable to iterated multi-step forecasts. This literature includes advancements in the evaluation of forecasts in population (based on true, unknown model coefficients) and the evaluation of forecasts in the finite sample (based on estimated model coefficients). The paper then examines in Monte Carlo experiments the finite-sample properties of some tests of equal forecast accuracy, focusing on the comparison of VAR forecasts to AR forecasts. These experiments show the tests to behave as should be expected given the theory. For example, using critical values obtained by bootstrap methods, tests of equal accuracy in population have empirical size about equal to nominal size.
Article
Dynamic stochastic general equilibrium (DSGE) models use modern macroeconomic theory to explain and predict comovements of aggregate time series over the business cycle and to perform policy analysis. We explain how to use DSGE models for all three purposes (forecasting, story telling, and policy experiments) and review their forecasting record. We also provide our own real-time assessment of the forecasting performance of the Smets and Wouters (2007) model using data up to 2011, compare it with Blue Chip and Greenbook forecasts, and show how it changes as we augment the standard set of observables with external information from surveys (nowcasts, interest rate forecasts, and expectations for long-run inflation and output growth). We explore methods of generating forecasts in the presence of a zero-lower-bound constraint on nominal interest rates and conditional on counterfactual interest rate paths. Finally, we perform a postmortem of DSGE model forecasts of the Great Recession and show that forecasts from a version of the Smets-Wouters model augmented by financial frictions, and using spreads as an observable, compare well with Blue Chip forecasts.
Article
This paper surveys recent developments in the evaluation of point forecasts. Taking West’s (2006) survey as a starting point, we briefly cover the state of the literature as of the time of West’s writing. We then focus on recent developments, including advancements in the evaluation of forecasts at the population level (based on true, unknown model coefficients), the evaluation of forecasts in the finite sample (based on estimated model coefficients), and the evaluation of conditional versus unconditional forecasts. We present original results in a few subject areas: the optimization of power in determining the split of a sample into in-sample and out-of-sample portions; whether the accuracy of inference in evaluation of multistep forecasts can be improved with the judicious choice of HAC estimator (it can); and the extension of West’s (1996) theory results for population-level, unconditional forecast evaluation to the case of conditional forecast evaluation.
Article
The size and power properties of several tests of equal Mean Square Prediction Error (MSPE) and of Forecast Encompassing (FE) are evaluated, using Monte Carlo simulations, in the context of dynamic regressions. For nested models, the F-type test of forecast encompassing proposed by Clark and McCracken (2001) displays overall the best properties. However its power advantage tends to become smaller as the prediction sample increases and for multi-step ahead predictions; in these cases a standard FE test based on Gaussian critical values becomes relatively more attractive. The ranking among the tests remains broadly unaltered for one-step and multi-step ahead predictions, for partially misspecified models and for highly persistent data. A similar setup is then used to analyze the case of non-nested models. Again it is found that FE tests have a significantly better performance than tests of equal MSPE for discriminating between correct and misspecified models. An empirical application evaluates the predictive ability of nested and non-nested models for GDP in Italy and the euro-area.
Article
This paper presents analytical, Monte Carlo, and empirical evidence on the effects of structural breaks on tests for equal forecast accuracy and encompassing. We show that out-of-sample predictive content can be hard to find because out-of-sample tests are highly dependent on the timing of the predictive ability. Moreover, predictive content is harder to find with some tests than others: in power, F-type tests of equal forecast accuracy and encompassing often dominate t-type alternatives. Based on these results and evidence from an empirical application, we conclude that structural breaks under the alternative may explain why researchers often find evidence of in-sample, but not out-of-sample, predictive content.
Article
We consider using out-of-sample mean squared prediction errors (MSPEs) to evaluate the null that a given series follows a zero mean martingale difference against the alternative that it is linearly predictable. Under the null of no predictability, the population MSPE of the null “no change” model equals that of the linear alternative. We show analytically and via simulations that despite this equality, the alternative model's sample MSPE is expected to be greater than the null's. For rolling regression estimators of the alternative model's parameters, we propose and evaluate an asymptotically normal test that properly accounts for the upward shift of the sample MSPE of the alternative model. Our simulations indicate that our proposed procedure works well.
Article
Forecast evaluation often compares a parsimonious null model to a larger model that nests the null model. Under the null that the parsimonious model generates the data, the larger model introduces noise into its forecasts by estimating parameters whose population values are zero. We observe that the mean squared prediction error (MSPE) from the parsimonious model is therefore expected to be smaller than that of the larger model. We describe how to adjust MSPEs to account for this noise. We propose applying standard methods [West, K.D., 1996. Asymptotic inference about predictive ability. Econometrica 64, 1067–1084] to test whether the adjusted mean squared error difference is zero. We refer to nonstandard limiting distributions derived in Clark and McCracken [2001. Tests of equal forecast accuracy and encompassing for nested models. Journal of Econometrics 105, 85–110; 2005a. Evaluating direct multistep forecasts. Econometric Reviews 24, 369–404] to argue that use of standard normal critical values will yield actual sizes close to, but a little less than, nominal size. Simulation evidence supports our recommended procedure.
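The adjustment described in this abstract can be sketched as follows. The larger model's squared error is reduced by the squared difference between the two forecasts, and the mean of the resulting adjusted differential is tested against zero. This is a minimal sketch under stated assumptions: one-step-ahead forecasts (so a plain sample variance is used; multistep forecasts would call for an autocorrelation-robust variance), and a function name `mspe_adjusted_stat` and example numbers that are hypothetical.

```python
import math


def mspe_adjusted_stat(actual, f_small, f_large):
    """t-type statistic for the MSPE-adjusted difference of nested models.

    f_small: forecasts from the parsimonious (null) model.
    f_large: forecasts from the larger model that nests it.
    Assumption: one-step forecasts, so no autocorrelation correction.
    """
    adj = []
    for y, f1, f2 in zip(actual, f_small, f_large):
        e1 = y - f1                      # error of the parsimonious model
        e2 = y - f2                      # error of the larger model
        noise = (f1 - f2) ** 2           # adjustment for estimation noise
        adj.append(e1 ** 2 - (e2 ** 2 - noise))
    n = len(adj)
    mean_adj = sum(adj) / n
    var_adj = sum((v - mean_adj) ** 2 for v in adj) / (n - 1)
    return mean_adj / math.sqrt(var_adj / n)


# Hypothetical data: the null model always forecasts zero ("no change").
actual = [1.0, 2.0, 0.5, 1.5]
f_small = [0.0, 0.0, 0.0, 0.0]
f_large = [0.5, 1.0, 0.5, 1.0]
stat = mspe_adjusted_stat(actual, f_small, f_large)
```

The statistic is compared with one-sided standard normal critical values (about 1.282 for a 10% test), rejecting equal accuracy in favor of the larger model when the statistic is large.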
Article
“Iterated” multiperiod-ahead time series forecasts are made using a one-period ahead model, iterated forward for the desired number of periods, whereas “direct” forecasts are made using a horizon-specific estimated model, where the dependent variable is the multiperiod ahead value being forecasted. Which approach is better is an empirical matter: in theory, iterated forecasts are more efficient if the one-period ahead model is correctly specified, but direct forecasts are more robust to model misspecification. This paper compares empirical iterated and direct forecasts from linear univariate and bivariate models by applying simulated out-of-sample methods to 170 U.S. monthly macroeconomic time series spanning 1959–2002. The iterated forecasts typically outperform the direct forecasts, particularly, if the models can select long-lag specifications. The relative performance of the iterated forecasts improves with the forecast horizon.
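The distinction between the two approaches can be illustrated for a first-order autoregression. This is a minimal sketch, not the specification used in the paper: it assumes a single lag with an intercept, fitted by ordinary least squares, and the helper names `ols_line`, `iterated_forecast`, and `direct_forecast` are hypothetical.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def iterated_forecast(series, h):
    """Fit a one-step AR(1) and iterate it forward h periods."""
    a, b = ols_line(series[:-1], series[1:])
    forecast = series[-1]
    for _ in range(h):
        forecast = a + b * forecast
    return forecast


def direct_forecast(series, h):
    """Regress the h-period-ahead value on the current value directly."""
    a, b = ols_line(series[:-h], series[h:])
    return a + b * series[-1]


# Hypothetical noiseless AR(1) path: y_t = 1 + 0.5 * y_{t-1}, y_0 = 10.
path = [10.0]
for _ in range(19):
    path.append(1.0 + 0.5 * path[-1])
history = path[:15]
```

On this noiseless path the one-period model is correctly specified, so `iterated_forecast(history, 2)` and `direct_forecast(history, 2)` agree up to rounding. With noisy data or a misspecified one-period model the two generally differ.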
Article
We examine the asymptotic and finite-sample properties of tests for equal forecast accuracy and encompassing applied to 1-step ahead forecasts from nested linear models. We first derive the asymptotic distributions of two standard tests and one new test of encompassing and provide tables of asymptotically valid critical values. Monte Carlo methods are then used to evaluate the size and power of tests of equal forecast accuracy and encompassing. The simulations indicate that post-sample tests can be reasonably well sized. Of the post-sample tests considered, the encompassing test proposed in this paper is the most powerful. We conclude with an empirical application regarding the predictive content of unemployment for inflation.
Article
This paper studies tests of predictability in regressions with a given AR(1) regressor and an asset return dependent variable measured over a short or long horizon. The paper shows that when there is a persistent predictable component in the return, an increase in the horizon may increase the R² statistic of the regression and the approximate slope of a predictability test. Monte Carlo experiments show that long-horizon regression tests have serious size distortions when asymptotic critical values are used, but some versions of such tests have power advantages remaining after size is corrected.
Article
This study compares the out-of-sample forecasting accuracy of various structural and time series exchange rate models. We find that a random walk model performs as well as any estimated model at one to twelve month horizons for the dollar/pound, dollar/mark, dollar/yen and trade-weighted dollar exchange rates. The candidate structural models include the flexible-price (Frenkel-Bilson) and sticky-price (Dornbusch-Frankel) monetary models, and a sticky-price model which incorporates the current account (Hooper-Morton). The structural models perform poorly despite the fact that we base their forecasts on actual realized values of future explanatory variables.
Article
The paper considers multi-step forecasting of a stationary vector process under a quadratic loss function with a collection of finite-order vector autoregressions (VAR). Under severe misspecification it is preferable to use the multi-step loss function also for parameter estimation. We propose a modification to Shibata's (Ann. Statist. 8 (1980) 147) final prediction error criterion to jointly choose the VAR lag order and one of two predictors: the maximum likelihood estimator plug-in predictor or the loss function estimator plug-in predictor. A Monte Carlo experiment illustrates the theoretical results and documents the empirical performance of the selection criterion.
Article
This paper examines the dynamics of various measures of national, regional, and global inflation. The paper calculates the first two common factors for four measures of industrial country inflation rates: total CPI, core CPI, cyclical total CPI, and cyclical core CPI. The paper then demonstrates that the first common factor is sometimes helpful in forecasting national inflation rates. It also shows that the second common factor and the first common factor for cyclical inflation are sometimes helpful in forecasting national CPI inflation rates. Finally, the paper suggests that the commonality of industrial inflation rates reflects the commonality of the determinants of inflation.
Article
We propose a nonparametric method for automatically selecting the number of autocovariances to use in computing a heteroskedasticity and autocorrelation consistent covariance matrix. For a given kernel for weighting the autocovariances, we prove that our procedure is asymptotically equivalent to one that is optimal under a mean-squared error loss function. Monte Carlo simulations suggest that our procedure performs tolerably well, although it does result in size distortions.
Article
This paper describes a simple method of calculating a heteroskedasticity and autocorrelation consistent covariance matrix that is positive semi-definite by construction. It also establishes consistency of the estimated covariance matrix under fairly general conditions.
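For a scalar series, an estimator of this kind can be sketched as a weighted sum of sample autocovariances with linearly declining (Bartlett) weights, which keeps the estimate non-negative. This is a minimal sketch for the univariate case only, with the truncation lag chosen by the user. The function name `bartlett_long_run_variance` and the example series are hypothetical.

```python
def bartlett_long_run_variance(x, lags):
    """Long-run variance of a scalar series using Bartlett weights.

    Weight on autocovariance j is 1 - j / (lags + 1), which guarantees a
    non-negative estimate. lags = 0 returns the plain sample variance
    (with divisor n).
    """
    n = len(x)
    mean_x = sum(x) / n
    dev = [v - mean_x for v in x]

    def autocov(j):
        # Sample autocovariance at lag j, divided by n (not n - j).
        return sum(dev[t] * dev[t - j] for t in range(j, n)) / n

    lrv = autocov(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1)
        lrv += 2.0 * weight * autocov(j)
    return lrv


# Hypothetical, strongly negatively autocorrelated series.
series = [1.0, -1.0, 1.0, -1.0]
lrv_0 = bartlett_long_run_variance(series, 0)
lrv_1 = bartlett_long_run_variance(series, 1)
```

In this example the negative first autocovariance pulls the estimate below the plain variance, yet it stays non-negative. In a multistep forecast comparison, an estimate like this would replace the plain variance in the denominator of the test statistic.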
Article
We propose a framework for out-of-sample predictive ability testing and forecast selection designed for use in the realistic situation in which the forecasting model is possibly misspecified, due to unmodeled dynamics, unmodeled heterogeneity, incorrect functional form, or any combination of these. Relative to the existing literature (Diebold and Mariano (1995) and West (1996)), we introduce two main innovations: (i) We derive our tests in an environment where the finite sample properties of the estimators on which the forecasts may depend are preserved asymptotically. (ii) We accommodate conditional evaluation objectives (can we predict which forecast will be more accurate at a future date?), which nest unconditional objectives (which forecast was more accurate on average?), that have been the sole focus of previous literature. As a result of (i), our tests have several advantages: they capture the effect of estimation uncertainty on relative forecast performance, they can handle forecasts based on both nested and nonnested models, they allow the forecasts to be produced by general estimation methods, and they are easy to compute. Although both unconditional and conditional approaches are informative, conditioning can help fine-tune the forecast selection to current economic conditions. To this end, we propose a two-step decision rule that uses current information to select the best forecast for the future date of interest. We illustrate the usefulness of our approach by comparing forecasts from leading parameter-reduction methods for macroeconomic forecasting using a large number of predictors. Copyright The Econometric Society 2006.
Article
This paper develops procedures for inference about the moments of smooth functions of out-of-sample predictions and prediction errors when there is a long time series of predictions and realizations. The aim is to provide tools for analysis of predictive accuracy and efficiency and, more generally, of predictive ability. The paper allows for nonnested and nonlinear models as well as for possible dependence of predictions and prediction errors on estimated regression parameters. Simulations indicate that the procedures can work well in samples of size typically available. Copyright 1996 by The Econometric Society.
Article
We study the functioning of secured and unsecured interbank markets in the presence of credit risk. The model generates empirical predictions that are in line with developments during the 2007–09 financial crisis. Interest rates decouple across secured and unsecured markets following an adverse shock to credit risk. The scarcity of underlying collateral may amplify the volatility of interest rates in secured markets. We use the model to discuss various policy responses to the crisis.
Article
The pure expectations theory of unbiased forward exchange rates predicts that the slope coefficient in a regression of the change in the spot rate on the difference between the current forward and spot rates should equal unity. In the recent empirical work by Fama, the estimates of this coefficient turn out to be negative in all regressions for nine major industrialized nations. This paper demonstrates that under the expectations theory, the sampling distribution of the regression estimator of this coefficient is upward-biased relative to unity and strongly skewed to the right. The likelihood of negative values is essentially zero. Thus, the estimator is biased in a direction opposite to what is observed. Since the observed estimates lie far out in the thin left-hand tail of the estimator's sampling distribution, the evidence against the hypothesis of unbiased forward rates is much stronger than previously believed.
Article
Predictive regressions are subject to two small sample biases: the coefficient estimate is biased if the predictor is endogenous and asymptotic standard errors in the case of overlapping periods are biased downward. Both biases work in the direction of making t-ratios too large so that standard inference may indicate predictability even if none is present. Using annual returns since 1872 and monthly returns since 1927, the authors estimate empirical distributions by randomizing residuals in the vector autoregression representation of the variables. The estimated biases are large enough to affect inference in practice and should be accounted for when studying predictability. Copyright 1993 by American Finance Association.
Article
The authors consider the situation in which two forecasts of the same variable are available. The possibility exists of forming a combined forecast as a weighted average of the individual ones and estimating the weights that should be optimally attached to each forecast. If the entire weight should optimally be associated with one forecast, that forecast is said to encompass the other. A natural test for forecast encompassing is based on least squares regression. The authors find, however, that the null distribution of this test statistic is not robust to nonnormality in the forecast errors. They discuss several alternative tests that are robust.
Comparing forecast accuracy: a Monte Carlo investigation
  • Busetti
Tests for forecast encompassing
  • Harvey
Can exchange rates forecast commodity prices?
  • Chen