Article

# Parallel Sequential Monte Carlo for Efficient Density Combination: The DeCo MATLAB Toolbox


## Abstract

This paper presents the MATLAB package DeCo (Density Combination), which is based on the paper by Billio et al. (2013), where a constructive Bayesian approach is presented for combining predictive densities originating from different models or other sources of information. The combination weights are time-varying and may depend on past predictive forecasting performance and other learning mechanisms. The core algorithm is the function DeCo, which applies banks of parallel Sequential Monte Carlo algorithms to filter the time-varying combination weights. The DeCo procedure has been implemented both for standard CPU computing and for Graphics Processing Unit (GPU) parallel computing. For the GPU implementation we use the MATLAB Parallel Computing Toolbox and show how to use general-purpose GPU computing almost effortlessly. This GPU implementation speeds up execution by up to seventy times compared to a standard CPU MATLAB implementation on a multicore CPU. We show the use of the package and the computational gain of the GPU version through some simulation experiments and empirical applications.
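To make the idea of filtering a time-varying combination weight concrete, the following is a minimal sketch in Python, not the DeCo function itself. It assumes two Gaussian predictive densities with a shared standard deviation, a latent random-walk state mapped to a weight through the logistic function, and a plain bootstrap particle filter with multinomial resampling. The function name `filter_combination_weight` and all of its parameters are hypothetical.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def logistic(x):
    """Numerically stable logistic function mapping the real line to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def filter_combination_weight(ys, means_a, means_b, sd=1.0,
                              n_particles=500, step_sd=0.3, seed=0):
    """Bootstrap particle filter for the weight on model A (illustrative only).

    Assumed model: latent x_t follows a random walk, the weight is
    logistic(x_t), and y_t is drawn from the linear pool of two Gaussians.
    Returns the filtered mean of the weight at each time step.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    filtered = []
    for y, m_a, m_b in zip(ys, means_a, means_b):
        # Propagate each particle through the random-walk transition.
        particles = [x + rng.gauss(0.0, step_sd) for x in particles]
        ws = [logistic(x) for x in particles]
        # Weight each particle by the likelihood of y under the linear pool.
        lik = [w * normal_pdf(y, m_a, sd) + (1.0 - w) * normal_pdf(y, m_b, sd)
               for w in ws]
        total = sum(lik)
        if total > 0.0:
            probs = [v / total for v in lik]
        else:
            # Fall back to uniform weights if the likelihood underflows.
            probs = [1.0 / n_particles] * n_particles
        filtered.append(sum(p * w for p, w in zip(probs, ws)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=probs, k=n_particles)
    return filtered
```

When the observations sit close to model A's predictive mean and far from model B's, the filtered weight on model A should drift toward one. Each particle is updated independently before resampling, which is what makes this kind of filter a natural candidate for parallel hardware.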

## No full-text available

... A parallel thought is applied to a bunch of SMCs on each equal size sub-dataset [4], and the final combination is a simple average. In contrast, we use the entire dataset for SMC but parallelize the sub-resampling within each cluster. ...
Article
For regular particle filter algorithms or Sequential Monte Carlo (SMC) methods, the initial weights are traditionally dependent on the proposed distribution, the posterior distribution at the current timestamp in the sampled sequence, and the target is the posterior distribution of the previous timestamp. This is technically correct, but leads to algorithms which usually have practical issues with degeneracy, where all particles eventually collapse onto a single particle. In this paper, we propose and evaluate using $k$-means clustering to attack and even take advantage of this degeneracy. Specifically, we propose a Stochastic SMC algorithm which initializes the set of $k$ means, providing the initial centers chosen from the collapsed particles. To fight against degeneracy, we adjust the regular SMC weights, mediated by cluster proportions, and then correct them to retain the same expectation as before. We experimentally demonstrate that our approach has better performance than vanilla algorithms.
... In linear pooling, the combination weights based on previous forecast performance and maximization of likelihood have been found to provide forecast gains; see [14,15]. Starting from these pooling schemes, the traditional pools are generalized by [16][17][18]. ...
Article
Decision-makers often consult different experts to build reliable forecasts on variables of interest. Combining more opinions and calibrating them to maximize the forecast accuracy is consequently a crucial issue in several economic problems. This paper applies a Bayesian beta mixture model to derive a combined and calibrated density function using random calibration functionals and random combination weights. In particular, it compares the application of linear, harmonic and logarithmic pooling in the Bayesian combination approach. The three combination schemes, i.e., linear, harmonic and logarithmic, are studied in simulation examples with multimodal densities and an empirical application with a large database of stock data. All of the experiments show that in a beta mixture calibration framework, the three combination schemes are substantially equivalent, achieving calibration, and no clear preference for one of them appears. The financial application shows that the linear pooling together with beta mixture calibration achieves the best results in terms of calibrated forecast.
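The three pooling schemes named above can be illustrated on discrete distributions. This is a minimal sketch under the assumption that pooling is applied pointwise to strictly positive probabilities over a common set of outcomes and then renormalised; the paper's own construction with beta mixture calibration is more general and may operate on distribution functions instead. All function names are hypothetical.

```python
import math


def _normalise(values):
    """Rescale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def linear_pool(dists, weights):
    """Weighted arithmetic mean of the probabilities, outcome by outcome."""
    n = len(dists[0])
    return _normalise([sum(w * d[j] for w, d in zip(weights, dists))
                       for j in range(n)])


def harmonic_pool(dists, weights):
    """Weighted harmonic mean of the probabilities, then renormalised."""
    n = len(dists[0])
    return _normalise([1.0 / sum(w / d[j] for w, d in zip(weights, dists))
                       for j in range(n)])


def logarithmic_pool(dists, weights):
    """Weighted geometric mean of the probabilities, then renormalised."""
    n = len(dists[0])
    return _normalise([math.exp(sum(w * math.log(d[j])
                                    for w, d in zip(weights, dists)))
                       for j in range(n)])
```

With weights that sum to one, all three pools return the common distribution when every expert reports the same probabilities; they differ only when the experts disagree.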
Article
Proper scoring rules are used to assess the out-of-sample accuracy of probabilistic forecasts, with different scoring rules rewarding distinct aspects of forecast performance. Herein, we re-investigate the practice of using proper scoring rules to produce probabilistic forecasts that are ‘optimal’ according to a given score and assess when their out-of-sample accuracy is superior to alternative forecasts, according to that score. Particular attention is paid to relative predictive performance under misspecification of the predictive model. Using numerical illustrations, we document several novel findings within this paradigm that highlight the important interplay between the true data generating process, the assumed predictive model and the scoring rule. Notably, we show that only when a predictive model is sufficiently compatible with the true process to allow a particular score criterion to reward what it is designed to reward, will this approach to forecasting reap benefits. Subject to this compatibility, however, the superiority of the optimal forecast will be greater, the greater is the degree of misspecification. We explore these issues under a range of different scenarios and using both artificially simulated and empirical data.
Article
We propose a new method for conducting Bayesian prediction that delivers accurate predictions without correctly specifying the unknown true data generating process. A prior is defined over a class of plausible predictive models. After observing data, we update the prior to a posterior over these models, via a criterion that captures a user‐specified measure of predictive accuracy. Under regularity, this update yields posterior concentration onto the element of the predictive class that maximizes the expectation of the accuracy measure. In a series of simulation experiments and empirical examples we find notable gains in predictive accuracy relative to conventional likelihood-based prediction.
Article
We develop a sequential Monte Carlo (SMC) algorithm for Bayesian inference in vector autoregressions with stochastic volatility (VAR-SV). The algorithm builds particle approximations to the sequence of the model’s posteriors, adapting the particles from one approximation to the next as the window of available data expands. The parallelizability of the algorithm’s computations allows the adaptations to occur rapidly. Our particular algorithm exploits the ability to marginalize many parameters from the posterior analytically and embeds a known Markov chain Monte Carlo (MCMC) algorithm for the model as an effective mutation kernel for fighting particle degeneracy. We show that, relative to using MCMC alone, our algorithm increases the precision of inference while reducing computing time by an order of magnitude when estimating a medium-scale VAR-SV model.
Preprint
We propose a new method for conducting Bayesian prediction that delivers accurate predictions without correctly specifying the unknown true data generating process. A prior is defined over a class of plausible predictive models. After observing data, we update the prior to a posterior over these models, via a criterion that captures a user-specified measure of predictive accuracy. Under regularity, this update yields posterior concentration onto the element of the predictive class that maximizes the expectation of the accuracy measure. In a series of simulation experiments and empirical examples we find notable gains in predictive accuracy relative to conventional likelihood-based prediction.
Article
We introduce a Combined Density Nowcasting (CDN) approach to Dynamic Factor Models (DFM) that in a coherent way accounts for time-varying uncertainty of several model and data features in order to provide more accurate and complete density nowcasts. The combination weights are latent random variables that depend on past nowcasting performance and other learning mechanisms. The combined density scheme is incorporated in a Bayesian Sequential Monte Carlo method which re-balances the set of nowcasted densities in each period using updated information on the time-varying weights. Experiments with simulated data show that CDN works particularly well in a situation of early data releases with relatively large data uncertainty and model incompleteness. Empirical results, based on US real-time data of 120 monthly variables, indicate that CDN gives more accurate density nowcasts of US GDP growth than a model selection strategy and other combination strategies throughout the quarter with relatively large gains for the two first months of the quarter. CDN also provides informative signals on model incompleteness during recent recessions. Focusing on the tails, CDN delivers probabilities of negative growth, that provide good signals for calling recessions and ending economic slumps in real time.
Article
This paper starts with a brief description of the introduction of the likelihood approach in econometrics as presented in Cowles Foundation Monographs 10 and 14. A sketch is given of the criticisms of this approach, mainly from the first group of Bayesian econometricians. Publication and citation patterns of Bayesian econometric papers are analyzed in ten major econometric journals from the late 1970s until the first few months of 2014. Results indicate a cluster of journals with theoretical and applied papers, mainly consisting of Journal of Econometrics, Journal of Business and Economic Statistics and Journal of Applied Econometrics, which contains the large majority of high quality Bayesian econometric papers. A second cluster of theoretical journals, mainly consisting of Econometrica and Review of Economic Studies, contains few Bayesian econometric papers. The scientific impact, however, of these few papers on Bayesian econometric research is substantial. Special issues from the journals Econometric Reviews, Journal of Econometrics and Econometric Theory received wide attention. Marketing Science shows an ever increasing number of Bayesian papers since the middle nineties. The International Economic Review and the Review of Economics and Statistics show a moderate time varying increase. An upward movement in publication patterns in most journals occurs in the early 1990s due to the effect of the 'Computational Revolution'.
Article
The approximation of the Feynman-Kac semigroups by systems of interacting particles is a very active research field, with applications in many different areas. In this paper, we study the parallelization of such approximations. The total population of particles is divided into sub-populations, referred to as *islands*. The particles within each island follow the usual selection / mutation dynamics. We show that the evolution of each island is also driven by a Feynman-Kac semigroup, whose transition and potential can be explicitly related to those of the original problem. Therefore, the same genetic type approximation of the Feynman-Kac semigroup may be used at the island level; each island might undergo a selection / mutation algorithm. We investigate the impact of the population size within each island and the number of islands, and study different types of interactions. We find conditions under which introducing interactions between islands is beneficial. The theoretical results are supported by some Monte Carlo experiments.
Article
Over the last decade, there has been a growing interest in the use of graphics processing units (GPUs) for non-graphics applications. From early academic proof-of-concept papers around the year 2000, the use of GPUs has now matured to a point where there are countless industrial applications. Together with the expanding use of GPUs, we have also seen a tremendous development in the programming languages and tools, and getting started programming GPUs has never been easier. However, whilst getting started with GPU programming can be simple, being able to fully utilize GPU hardware is an art that can take months or years to master. The aim of this article is to simplify this process, by giving an overview of current GPU programming strategies, profile-driven development, and an outlook to future trends.
Article
It is well known that the Basel II Accord requires banks and other Authorized Deposit-taking Institutions (ADIs) to communicate their daily risk forecasts to the appropriate monetary authorities at the beginning of each trading day, using one or more risk models, whether individually or as combinations, to measure Value-at-Risk (VaR). The risk estimates of these models are used to determine capital requirements and associated capital costs of ADIs, depending in part on the number of previous violations, whereby realised losses exceed the estimated VaR. McAleer et al. (2009) proposed a new approach to model selection for predicting VaR, consisting of combining alternative risk models, and comparing conservative and aggressive strategies for choosing between VaR models. This paper addresses the question of risk management of risk, namely VaR of VIX futures prices, and extends the approaches given in McAleer et al. (2009) and Chang et al. (2011) to examine how different risk management strategies performed during the 2008-09 global financial crisis (GFC). The empirical results suggest that an aggressive strategy of choosing the Supremum of single model forecasts, as compared with Bayesian and non-Bayesian combinations of models, is preferred to other alternatives, and is robust during the GFC. However, this strategy implies relatively high numbers of violations and accumulated losses, which are admissible under the Basel II Accord.
Article
Recursive-weight forecast combination is often found to be an ineffective method of improving point forecast accuracy in the presence of uncertain instabilities. We examine the effectiveness of this strategy for forecast densities using (many) vector autoregressive (VAR) and autoregressive (AR) models of output growth, inflation and interest rates. Our proposed recursive-weight density combination strategy, based on the recursive logarithmic score of the forecast densities, produces well-calibrated predictive densities for US real-time data by giving substantial weight to models that allow for structural breaks. In contrast, equal-weight combinations produce poorly calibrated forecast densities for Great Moderation data. Copyright © 2010 John Wiley & Sons, Ltd.
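A recursive log-score weighting scheme of the kind described above can be sketched as follows. The sketch assumes that each model's weight at time t is proportional to the exponential of its cumulative log predictive score over earlier periods, which is one common form of the idea and not necessarily the exact specification used in the paper. The function name and input layout are hypothetical.

```python
import math


def recursive_log_score_weights(log_scores):
    """Weights from cumulative past log scores (illustrative assumption).

    log_scores[t][i] is the log predictive density of model i evaluated at
    the realised observation in period t. The weights returned for period t
    use only the scores from periods before t, so the first period gets
    equal weights.
    """
    n_models = len(log_scores[0])
    cumulative = [0.0] * n_models
    weights = []
    for scores_t in log_scores:
        # Subtract the maximum before exponentiating to avoid underflow.
        peak = max(cumulative)
        unnormalised = [math.exp(c - peak) for c in cumulative]
        total = sum(unnormalised)
        weights.append([u / total for u in unnormalised])
        # Only now absorb the current period's scores for use next period.
        cumulative = [c + s for c, s in zip(cumulative, scores_t)]
    return weights
```

Because the weights for period t are formed before the period's score is added, the scheme only uses information that would have been available in real time.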
Article
This paper brings together two important but hitherto largely unrelated areas of the forecasting literature, density forecasting and forecast combination. It proposes a practical data-driven approach to the direct combination of density forecasts by taking a weighted linear combination of the competing density forecasts. The combination weights are chosen to minimize the ‘distance’, as measured by the Kullback–Leibler information criterion, between the forecasted and true but unknown density. We explain how this minimization both can and should be achieved but leave theoretical analysis to future research. Comparisons with the optimal combination of point forecasts are made. An application to simple time-series density forecasts and two widely used published density forecasts for U.K. inflation, namely the Bank of England and NIESR “fan” charts, illustrates that combination can but need not always help.
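Because the true density is unknown, minimising the Kullback–Leibler distance to it is in practice equivalent to maximising the average log score of the combined density at the realised outcomes. The sketch below illustrates this for two competing density forecasts using a simple grid search over the weight. The grid search and the function name `best_linear_pool_weight` are illustrative choices, not the paper's procedure.

```python
import math


def best_linear_pool_weight(dens_a, dens_b, grid_size=101):
    """Grid search for the weight on model A that maximises the log score.

    dens_a[t] and dens_b[t] are the two models' predictive densities
    evaluated at the realised observation in period t. Returns the best
    weight and the average log score it achieves.
    """
    best_w, best_score = 0.0, -math.inf
    for k in range(grid_size):
        w = k / (grid_size - 1)
        score = 0.0
        for pa, pb in zip(dens_a, dens_b):
            mix = w * pa + (1.0 - w) * pb
            if mix <= 0.0:
                # A zero combined density gives a log score of minus infinity.
                score = -math.inf
                break
            score += math.log(mix)
        if score > -math.inf:
            score /= len(dens_a)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score
```

If one model assigns higher density to every realised outcome, the search puts all weight on it; if each model does well in different periods, an interior weight wins, which matches the remark that combination can but need not always help.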
Article
We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35 to 500 fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design.
Article
Clark and McCracken (2008) argue that combining real-time point forecasts from VARs of output, prices and interest rates improves point forecast accuracy in the presence of uncertain model instabilities. In this paper, we generalize their approach to consider forecast density combinations and evaluations. Whereas Clark and McCracken (2008) show that the point forecast errors from particular equal-weight pairwise averages are typically comparable or better than benchmark univariate time series models, we show that neither approach produces accurate real-time forecast densities for recent US data. If greater weight is given to models that allow for the shifts in volatilities associated with the Great Moderation, predictive density accuracy improves substantially.
Article
Over the last decade, simulation has become an essential tool for the statistical treatment of complex models and for implementing advanced statistical techniques such as the bootstrap or simulated inference methods. This book presents the basic elements of simulating probability distributions (generation of uniform variables and of standard distributions) and their use in statistics (Monte Carlo integration, stochastic optimization). After a brief review of Markov chains, the more specific Markov chain Monte Carlo (MCMC) techniques are presented in detail, both from the theoretical point of view (validity and convergence) and from the point of view of their implementation (acceleration, choice of parameters, limitations). Gibbs sampling algorithms are thus distinguished from general Hastings-Metropolis methods by their greater theoretical richness. The final chapters contain a critical account of the state of the art in convergence monitoring for these algorithms and a unified presentation of the various applications of MCMC methods to missing-data models. Numerous statistical examples illustrate the methods presented in this book, which is intended for master's and doctoral students in applied mathematics as well as researchers and practitioners wishing to use MCMC methods. Monte Carlo statistical methods, particularly those based on Markov chains, are now an essential component of the standard set of techniques used by statisticians. This new edition has been revised towards a coherent and flowing coverage of these simulation techniques, with incorporation of the most recent developments in the field.
In particular, the introductory coverage of random variable generation has been totally revised, with many concepts being unified through a fundamental theorem of simulation. There are five completely new chapters that cover Monte Carlo control, reversible jump, slice sampling, sequential Monte Carlo, and perfect sampling. There is a more in-depth coverage of Gibbs sampling, which is now contained in three consecutive chapters. The development of Gibbs sampling starts with slice sampling and its connection with the fundamental theorem of simulation, and builds up to two-stage Gibbs sampling and its theoretical properties. A third chapter covers the multi-stage Gibbs sampler and its variety of applications. Lastly, chapters from the previous edition have been revised towards easier access, with the examples getting more detailed coverage. This textbook is intended for a second year graduate course, but will also be useful to someone who either wants to apply simulation techniques for the resolution of practical problems or wishes to grasp the fundamental principles behind those methods. The authors do not assume familiarity with Monte Carlo techniques (such as random variable generation), with computer programming, or with any Markov chain theory (the necessary concepts are developed in Chapter 6). A solutions manual, which covers approximately 40% of the problems, is available for instructors who require the book for a course.
Article
The aim of this paper is to compare three regularized particle filters in an online data processing context. We carry out the comparison in terms of hidden states filtering and parameter estimation, considering a Bayesian paradigm and a univariate stochastic volatility model. We discuss the use of an improper prior distribution in the initialization of the filtering procedure and show that the Regularized Auxiliary Particle Filter (R-APF) outperforms the Regularized Sequential Importance Sampling (R-SIS) and the Regularized Sampling Importance Resampling (R-SIR).
Article
Macro-economic models are generally designed to achieve a multiplicity of objectives and correspondingly, they have been evaluated using a vast range of statistical, econometric, economic, political and even aesthetic criteria. However, in so far as they claim to represent economic behaviour, empirical macro-economic systems are certainly open to direct evaluation and testing against data information. The last few years have witnessed a substantial growth in the literature on econometric evaluation techniques, but despite important improvements in formalising evaluation procedures and their increased scope, formidable problems confront any investigation of a high dimensional, non-linear, stochastic, dynamic structure. Since system characteristics are the prime concern of economy-wide models, it might be the case that the validity of every individual component is not essential to adequate overall performance. While this viewpoint is debatable it does draw attention to the need for system evaluation procedures, at which point data limitations pose serious constraints on formal tests. Thus a new “limited information” test of forecast encompassing is proposed, based only on forecasts and requiring no other data from a model's proprietors. The derivation, merits and drawbacks of such a test are presented together with some suggestions for testing entailed relationships and inter-equation feedbacks.
Conference Paper
Although Java was not specifically designed for the computationally intensive numeric applications that are the typical fodder of highly parallel machines, its widespread popularity and portability make it an interesting candidate vehicle for massively parallel programming. With the advent of high-performance optimizing Java compilers, the open question is: How can Java programs best exploit massive parallelism? The authors have been contemplating this question via libraries of Java routines for specifying and coordinating parallel codes. It would be most desirable to have these routines written in 100%-Pure Java; however, a more expedient solution is to provide Java wrappers (stubs) to existing parallel coordination libraries, such as MPI. MPI is an attractive alternative, as like Java, it is portable. We discuss both approaches here. In undertaking this study, we have also identified some minor modifications of the current language specification that would make 100%-Pure Java parallel programming more natural.
Article
A general framework for using Monte Carlo methods in dynamic systems is provided and its wide applications indicated. Under this framework, several currently available techniques are studied and generalized to accommodate more complex features. All of these methods are partial combinations of three ingredients: importance sampling and resampling, rejection sampling, and Markov chain iterations. We deliver a guideline on how they should be used and under what circumstance each method is most suitable. Through the analysis of differences and connections, we consolidate these methods into a generic algorithm by combining desirable features. In addition, we propose a general use of Rao-Blackwellization to improve performances. Examples from econometrics and engineering are presented to demonstrate the importance of Rao-Blackwellization and to compare different Monte Carlo procedures. Keywords: Blind deconvolution; Bootstrap filter; Gibbs sampling; Hidden Markov model; Kalman filter; Markov...
Article
The term "sequential Monte Carlo methods" or, equivalently, "particle filters," refers to a general class of iterative algorithms that perform Monte Carlo approximations of a given sequence of distributions of interest $(\pi_t)$. We establish in this paper a central limit theorem for the Monte Carlo estimates produced by these computational methods. This result holds under minimal assumptions on the distributions $\pi_t$, and applies in a general framework which encompasses most of the sequential Monte Carlo methods that have been considered in the literature, including the resample-move algorithm of Gilks and Berzuini [J. R. Stat. Soc. Ser. B Stat. Methodol. 63 (2001) 127-146] and the residual resampling scheme. The corresponding asymptotic variances provide a convenient measurement of the precision of a given particle filter. We study, in particular, in some typical examples of Bayesian applications, whether and at which rate these asymptotic variances diverge in time, in order to assess the long term reliability of the considered algorithm.
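The residual resampling scheme mentioned above can be sketched briefly. The version below follows the standard textbook description: each particle first receives a deterministic number of copies equal to the integer part of N times its weight, and the remaining slots are filled by multinomial draws on the leftover fractional parts. The function name is hypothetical and the weights are assumed to be normalised.

```python
import math
import random


def residual_resample(weights, rng):
    """Return resampled particle indices using residual resampling.

    weights must be non-negative and sum to one; rng is a random.Random
    instance used only for the multinomial step on the residuals.
    """
    n = len(weights)
    # Deterministic part: floor(n * w_i) copies of particle i.
    counts = [math.floor(n * w) for w in weights]
    indices = [i for i, c in enumerate(counts) for _ in range(c)]
    remaining = n - len(indices)
    if remaining > 0:
        # Random part: multinomial draws proportional to the residuals.
        residuals = [n * w - c for w, c in zip(weights, counts)]
        indices.extend(rng.choices(range(n), weights=residuals, k=remaining))
    return indices
```

Only the residual step is random, so this scheme typically adds less Monte Carlo noise than plain multinomial resampling on all N slots.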
Article
A Bayesian semi-parametric dynamic model combination is proposed in order to deal with a large set of predictive densities. It extends the mixture of experts and the smoothly mixing regression models by allowing combination weight dependence between models as well as over time. It introduces an information reduction step by using a clustering mechanism that allocates the large set of predictive densities into a smaller number of mutually exclusive subsets. The complexity of the approach is further reduced by making use of the class-preserving property of the logistic-normal distribution that is specified in the compositional dynamic factor model for the weight dynamics with latent factors defined on a reduced dimension simplex. The whole model is represented as a nonlinear state space model that allows groups of predictive models with corresponding combination weights to be updated with parallel clustering and sequential Monte Carlo filters. The approach is applied to predict Standard & Poor’s 500 index using more than 7000 predictive densities based on US individual stocks and finds substantial forecast and economic gains. Similar forecast gains are obtained in point and density forecasting of US real GDP, Inflation, Treasury Bill yield and employment using a large data set.
Article
Two separate sets of forecasts of airline passenger data have been combined to form a composite set of forecasts. The main conclusion is that the composite set of forecasts can yield lower mean-square error than either of the original forecasts. Past errors of each of the original forecasts are used to determine the weights to attach to these two original forecasts in forming the combined forecasts, and different methods of deriving these weights are examined.
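One simple way to turn past errors into combination weights is to make each weight inversely proportional to the corresponding forecast's past mean-square error. The sketch below shows that single variant only; it is an illustration of the idea, not a reproduction of the several methods the paper compares, and the function names are hypothetical.

```python
def inverse_mse_weights(errors_by_model):
    """Weights inversely proportional to each model's past mean-square error.

    errors_by_model[i] is the list of past forecast errors of model i.
    Every model is assumed to have at least one non-zero error.
    """
    mses = [sum(e * e for e in errs) / len(errs) for errs in errors_by_model]
    inverse = [1.0 / m for m in mses]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine_forecasts(forecasts, weights):
    """Weighted sum of the individual point forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))
```

For example, a forecast with a quarter of the other's mean-square error receives four times its weight.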
Article
This paper compares alternative models of time-varying volatility on the basis of the accuracy of real-time point and density forecasts of key macroeconomic time series for the USA. We consider Bayesian autoregressive and vector autoregressive models that incorporate some form of time-varying volatility, precisely random walk stochastic volatility, stochastic volatility following a stationary AR process, stochastic volatility coupled with fat tails, GARCH and mixture of innovation models. The results show that the AR and VAR specifications with conventional stochastic volatility dominate other volatility specifications, in terms of point forecasting to some degree and density forecasting to a greater degree. Copyright © 2014 John Wiley & Sons, Ltd.
Article
By making use of martingale representations, we derive the asymptotic normality of particle filters in hidden Markov models and a relatively simple formula for their asymptotic variances. Although repeated resamplings result in complicated dependence among the sample paths, the asymptotic variance formula and martingale representations lead to consistent estimates of the standard errors of the particle filter estimates of the hidden states.
Article
We propose a Bayesian combination approach for multivariate predictive densities which relies upon a distributional state space representation of the combination weights. Several specifications of multivariate time-varying weights are introduced with a particular focus on weight dynamics driven by the past performance of the predictive densities and the use of learning mechanisms. In the proposed approach the model set can be incomplete, meaning that all models can be individually misspecified. A Sequential Monte Carlo method is proposed to approximate the filtering and predictive densities. The combination approach is assessed using statistical and utility-based performance measures for evaluating density forecasts. Simulation results indicate that, for a set of linear autoregressive models, the combination strategy is successful in selecting, with probability close to one, the true model when the model set is complete and it is able to detect parameter instability when the model set includes the true model that has generated subsamples of data. For the macro series we find that incompleteness of the models is relatively large in the 70's, the beginning of the 80's and during the recent financial crisis, and lower during the Great Moderation. With respect to returns of the S&P 500 series, we find that an investment strategy using a combination of predictions from professional forecasters and from a white noise model puts more weight on the white noise model in the beginning of the 90's and switches to giving more weight to the professional forecasts over time.
Article
This paper compares alternative models of time-varying macroeconomic volatility on the basis of the accuracy of point and density forecasts of macroeconomic variables. In this analysis, we consider both Bayesian autoregressive and Bayesian vector autoregressive models that incorporate some form of time-varying volatility, precisely stochastic volatility (both with constant and time-varying autoregressive coefficients), stochastic volatility following a stationary AR process, stochastic volatility coupled with fat tails, GARCH, and mixture-of-innovation models. The comparison is based on the accuracy of forecasts of key macroeconomic time series for real-time post-World War II data both for the United States and United Kingdom. The results show that the AR and VAR specifications with widely used stochastic volatility dominate models with alternative volatility specifications, in terms of point forecasting to some degree and density forecasting to a greater degree.
Article
This paper serves as an introduction and survey for economists to the field of sequential Monte Carlo methods which are also known as particle filters. Sequential Monte Carlo methods are simulation based algorithms used to compute the high-dimensional and/or complex integrals that arise regularly in applied work. These methods are becoming increasingly popular in economics and finance; from dynamic stochastic general equilibrium models in macro-economics to option pricing. The objective of this paper is to explain the basics of the methodology, provide references to the literature, and cover some of the theoretical results that justify the methods in practice.
Article
This paper shows the potential of heterogeneous computing in solving dynamic equilibrium models in economics. We illustrate the power and simplicity of C++ Accelerated Massive Parallelism (C++ AMP) recently introduced by Microsoft. Starting from the same exercise as Aldrich et al. (J Econ Dyn Control 35:386–393, 2011) we document a speed gain together with a simplified programming style that naturally enables parallelization.
Article
It is well known that a linear combination of forecasts can outperform individual forecasts. The common practice, however, is to obtain a weighted average of forecasts, with the weights adding up to unity. This paper considers three alternative approaches to obtaining linear combinations. It is shown that the best method is to add a constant term and not to constrain the weights to add to unity. These methods are tested with data on forecasts of quarterly hog prices, both within and out of sample. It is demonstrated that the optimum method proposed here is superior to the common practice of letting the weights add up to one.
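The three combination schemes compared in this abstract can be sketched with ordinary least squares. The data below are simulated and the two "forecasts" are hypothetical; the sketch only illustrates why the scheme with a constant and unconstrained weights cannot fit worse in sample, since the three regressions are nested. It does not reproduce the paper's hog price results, and out-of-sample performance remains an empirical question.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
target = rng.normal(size=T)

# Two hypothetical forecasts, each biased and noisy (assumed for illustration).
F = np.column_stack([
    0.5 + 0.8 * target + rng.normal(scale=0.5, size=T),
    -0.3 + 1.1 * target + rng.normal(scale=0.7, size=T),
])


def fit_mse(X, y):
    """Least-squares coefficients and in-sample mean squared error."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, np.mean((y - X @ coef) ** 2)


# (a) No constant, weights constrained to sum to one:
#     substitute w2 = 1 - w1 and regress (y - f2) on (f1 - f2).
w1, mse_a = fit_mse(F[:, [0]] - F[:, [1]], target - F[:, 1])
weights_a = np.array([w1[0], 1.0 - w1[0]])

# (b) No constant, unconstrained weights.
weights_b, mse_b = fit_mse(F, target)

# (c) Constant term plus unconstrained weights.
coef_c, mse_c = fit_mse(np.column_stack([np.ones(T), F]), target)

print("constrained, no constant   :", weights_a, mse_a)
print("unconstrained, no constant :", weights_b, mse_b)
print("unconstrained with constant:", coef_c, mse_c)
```

Because the forecasts here are biased by construction, the constant term in scheme (c) absorbs the bias that the other two schemes have to spread across the weights.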
Article
The rapid growth in the performance of graphics hardware, coupled with recent improvements in its programmability has led to its adoption in many non-graphics applications, including a wide variety of scientific computing fields. At the same time, a number of important dynamic optimal policy problems in economics are athirst of computing power to help overcome dual curses of complexity and dimensionality. We investigate if computational economics may benefit from new tools on a case study of imperfect information dynamic programming problem with learning and experimentation trade-off, that is, a choice between controlling the policy target and learning system parameters. Specifically, we use a model of active learning and control of a linear autoregression with the unknown slope that appeared in a variety of macroeconomic policy and other contexts. The endogeneity of posterior beliefs makes the problem difficult in that the value function need not be convex and the policy function need not be continuous. This complication makes the problem a suitable target for massively-parallel computation using graphics processors (GPUs). Our findings are cautiously optimistic in that the new tools let us easily achieve a factor of 15 performance gain relative to an implementation targeting single-core processors. Further gains up to a factor of 26 are also achievable but lie behind a learning and experimentation barrier of their own. Drawing upon experience with CUDA programming architecture and GPUs provides general lessons on how to best exploit future trends in parallel computation in economics.
Article
This paper reconsiders sequential Monte Carlo approaches to Bayesian inference in the light of massively parallel desktop computing capabilities now well within the reach of individual academics. It first develops an algorithm that is well suited to parallel computing in general and for which convergence results have been established in the sequential Monte Carlo literature but that tends to require manual tuning in practical application. It then introduces endogenous adaptations in the algorithm that obviate the need for tuning, using a new approach based on the structure of parallel computing to show that convergence properties are preserved and to provide reliable assessment of simulation error in the approximation of posterior moments. The algorithm is generic, requiring only code for simulation from the prior distribution and evaluation of the prior and data densities, thereby shortening development cycles for new models. Through its use of data point tempering it is robust to irregular posteriors, including multimodal distributions. The sequential structure of the algorithm leads to reliable and generic computation of marginal likelihood as a by-product. The paper includes three detailed examples taken from state-of-the-art substantive research applications. These examples illustrate the many desirable properties of the algorithm and demonstrate that it can reduce computing time by several orders of magnitude.
Article
The aim of this paper is to assess whether modeling structural change can help improving the accuracy of macroeconomic forecasts. We conduct a simulated real-time out-of-sample exercise using a time-varying coefficients vector autoregression (VAR) with stochastic volatility to predict the inflation rate, unemployment rate and interest rate in the USA. The model generates accurate predictions for the three variables. In particular, the forecasts of inflation are much more accurate than those obtained with any other competing model, including fixed coefficients VARs, time-varying autoregressions and the naïve random walk model. The results hold true also after the mid 1980s, a period in which forecasting inflation was particularly hard. Copyright © 2011 John Wiley & Sons, Ltd.
Article
The stability properties of a class of interacting measure valued processes arising in nonlinear filtering and genetic algorithm theory is discussed. Simple sufficient conditions are given for exponential decays. These criteria are applied to study the asymptotic stability of the nonlinear filtering equation and infinite population models as those arising in Biology and evolutionary computing literature. On the basis of these stability properties we also propose a uniform convergence theorem for the interacting particle numerical scheme of the nonlinear filtering equation introduced in a previous work. In the last part of this study we propose a refinement genetic type particle method with periodic selection dates and we improve the previous uniform convergence results. We finally discuss the uniform convergence of particle approximations including branching and random population size systems. (C) 2001 Editions scientifiques et medicales Elsevier SAS.
Article
Forecast evaluation often compares a parsimonious null model to a larger model that nests the null model. Under the null that the parsimonious model generates the data, the larger model introduces noise into its forecasts by estimating parameters whose population values are zero. We observe that the mean squared prediction error (MSPE) from the parsimonious model is therefore expected to be smaller than that of the larger model. We describe how to adjust MSPEs to account for this noise. We propose applying standard methods [West, K.D., 1996. Asymptotic inference about predictive ability. Econometrica 64, 1067–1084] to test whether the adjusted mean squared error difference is zero. We refer to nonstandard limiting distributions derived in Clark and McCracken [2001. Tests of equal forecast accuracy and encompassing for nested models. Journal of Econometrics 105, 85–110; 2005a. Evaluating direct multistep forecasts. Econometric Reviews 24, 369–404] to argue that use of standard normal critical values will yield actual sizes close to, but a little less than, nominal size. Simulation evidence supports our recommended procedure.
Article
Combined forecasts from a linear and a nonlinear model are investigated for time series with possibly nonlinear characteristics. The forecasts are combined by a constant coefficient regression method as well as a time varying method. The time varying method allows for a locally (non)linear modeling. The methods are applied to three data sets: Canadian lynx and sunspot series, US annual macro-economic time series — used by Nelson and Plosser (J. Monetary Econ., 10 (1982) 139) — and US monthly unemployment rate and production indices. It is shown that the combined forecasts perform well, especially with time varying coefficients. This result holds for out of sample performance for the sunspot series, the Canadian lynx number series and the monthly series, but it does not uniformly hold for the Nelson and Plosser economic time series.
Article
This paper shows how to build algorithms that use graphics processing units (GPUs) installed in most modern computers to solve dynamic equilibrium models in economics. In particular, we rely on the compute unified device architecture (CUDA) of NVIDIA GPUs. We illustrate the power of the approach by solving a simple real business cycle model with value function iteration. We document improvements in speed of around 200 times and suggest that even further gains are likely.
Article
Given two sources of forecasts of the same quantity, it is possible to compare prediction records. In particular, it can be useful to test the hypothesis of equal accuracy in forecast performance. We analyse the behaviour of two possible tests, and of modifications of these tests designed to circumvent shortcomings in the original formulations. As a result of this analysis, a recommendation for one particular testing approach is made for practical applications.
Book
Monte Carlo methods are revolutionizing the on-line analysis of data in fields as diverse as financial modeling, target tracking and computer vision. These methods, appearing under the names of bootstrap filters, condensation, optimal Monte Carlo filters, particle filters and survival of the fittest, have made it possible to solve numerically many complex, non-standard problems that were previously intractable. This book presents the first comprehensive treatment of these techniques, including convergence results and applications to tracking, guidance, automated target recognition, aircraft navigation, robot navigation, econometrics, financial modeling, neural networks, optimal control, optimal filtering, communications, reinforcement learning, signal enhancement, model averaging and selection, computer vision, semiconductor design, population biology, dynamic Bayesian networks, and time series analysis. This will be of great value to students, researchers and practitioners, who have some basic knowledge of probability. Arnaud Doucet received the Ph. D. degree from the University of Paris-XI Orsay in 1997. From 1998 to 2000, he conducted research at the Signal Processing Group of Cambridge University, UK. He is currently an assistant professor at the Department of Electrical Engineering of Melbourne University, Australia. His research interests include Bayesian statistics, dynamic models and Monte Carlo methods. Nando de Freitas obtained a Ph.D. degree in information engineering from Cambridge University in 1999. He is presently a research associate with the artificial intelligence group of the University of California at Berkeley. His main research interests are in Bayesian statistics and the application of on-line and batch Monte Carlo methods to machine learning. Neil Gordon obtained a Ph.D. in Statistics from Imperial College, University of London in 1993. 
He is with the Pattern and Information Processing group at the Defence Evaluation and Research Agency in the United Kingdom. His research interests are in time series, statistical data analysis, and pattern recognition with a particular emphasis on target tracking and missile guidance.
Article
Combined forecasts from a linear and a nonlinear model are investigated for time series with possibly nonlinear characteristics. The forecasts are combined by a constant coefficient regression method as well as a time varying method. The time varying method allows for a locally (non)linear model. The methods are applied to data from two kinds of disciplines: the Canadian lynx and sunspot series from the natural sciences, and Nelson-Plosser's U.S. series from economics. It is shown that the combined forecasts perform well, especially with time varying coefficients. This result holds for out of sample performance for the sunspot and Canadian lynx number series, but it does not uniformly hold for economic time series.
Article
This paper develops methods for automatic selection of variables in forecasting Bayesian vector autoregressions (VARs) using the Gibbs sampler. In particular, I provide computationally efficient algorithms for stochastic variable selection in generic (linear and nonlinear) VARs. The performance of the proposed variable selection method is assessed in a small Monte Carlo experiment, and in forecasting 4 macroeconomic series of the UK using time-varying parameters vector autoregressions (TVP-VARs). Restricted models consistently improve upon their unrestricted counterparts in forecasting, showing the merits of variable selection in selecting parsimonious models.
Article
We study the stability of the optimal filter w.r.t. its initial condition and w.r.t. the model for the hidden state and the observations in a general hidden Markov model, using the Hilbert projective metric. These stability results are then used to prove, under some mixing assumption, the uniform convergence to the optimal filter of several particle filters, such as the interacting particle filter and some other original particle filters.
Article
A prediction model is any statement of a probability distribution for an outcome not yet observed. This study considers the properties of weighted linear combinations of n prediction models, or linear pools, evaluated using the conventional log predictive scoring rule. The log score is a concave function of the weights and, in general, an optimal linear combination will include several models with positive weights despite the fact that exactly one model has limiting posterior probability one. The paper derives several interesting formal results: for example, a prediction model with positive weight in a pool may have zero weight if some other models are deleted from that pool. The results are illustrated using S&P 500 returns with prediction models from the ARCH, stochastic volatility and Markov mixture families. In this example models that are clearly inferior by the usual scoring criteria have positive weights in optimal linear pools, and these pools substantially outperform their best components.
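The concavity of the log score in the pool weights can be checked numerically. The sketch below uses simulated data and two hypothetical Gaussian prediction models, neither of which generates the data on its own; these choices are assumptions for illustration and are unrelated to the S&P 500 example in the paper.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


rng = np.random.default_rng(0)
n = 500
# Hypothetical outcomes drawn from a two-regime mixture.
regime = rng.random(n) < 0.5
y = np.where(regime, rng.normal(-1.0, 1.0, n), rng.normal(1.5, 0.7, n))

# Predictive densities of the two assumed models, evaluated at the outcomes.
p1 = normal_pdf(y, -1.0, 1.0)
p2 = normal_pdf(y, 1.5, 0.7)

# Log predictive score of the pool w*p1 + (1 - w)*p2 on a grid of weights.
weights = np.linspace(0.0, 1.0, 101)
log_scores = np.array([np.sum(np.log(w * p1 + (1.0 - w) * p2)) for w in weights])

best_w = weights[np.argmax(log_scores)]
# For a concave function on an even grid, second differences are non-positive.
second_diff = np.diff(log_scores, 2)

print("best weight on model 1:", best_w)
print("pool log score:", log_scores.max())
print("single-model log scores:", log_scores[-1], log_scores[0])
```

In this setup the maximizing weight lies strictly inside the unit interval, so the pool scores higher than either model alone, which is the pattern the abstract describes for optimal linear pools.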
Article
The nature of computing is changing and it poses both challenges and opportunities for economists. Instead of increasing clock speed, future microprocessors will have “multi-cores” with separate execution units. “Threads” or other multi-processing techniques that are rarely used today are required to take full advantage of them. Beyond one machine, it has become easy to harness multiple computers to work in clusters. Besides dedicated clusters, they can be made up of unused lab computers or even your colleagues’ machines. Finally, grids of computers spanning the Internet are now becoming a reality.
Article
This paper shows how a high-level matrix programming language may be used to perform Monte Carlo simulation, bootstrapping, estimation by maximum likelihood and GMM, and kernel regression in parallel on symmetric multiprocessor computers or clusters of workstations. The implementation of parallelization is done in a way such that an investigator may use the programs without any knowledge of parallel programming. A bootable CD that allows rapid creation of a cluster for parallel computing is introduced. Examples show that parallelization can lead to important reductions in computational time. Detailed discussion of how the Monte Carlo problem was parallelized is included as an example for learning to write parallel programs for Octave. Copyright Springer Science + Business Media, Inc. 2005
Article
The computational difficulty of econometric problems has increased dramatically in recent years as econometricians examine more complicated models and utilize more sophisticated estimation techniques. Many problems in econometrics are "embarrassingly parallel" and can take advantage of parallel computing to reduce the wall clock time it takes to solve a problem. In this paper I demonstrate a method that can be used to solve a maximum likelihood problem using the MPI message passing library. The econometric problem is a simple multinomial logit model that does not require parallel computing but illustrates many of the problems one would confront when estimating more complicated models. Copyright 2002 by Kluwer Academic Publishers
Article
This toolbox of MATLAB econometrics functions includes a collection of regression functions for least-squares, simultaneous systems (2SLS, 3SLS, SUR), limited dependent variable (logit, probit, tobit), time-series (VAR, BVAR, ECM) estimation and forecasting functions, ridge, Theil-Goldberger, switching regimes, robust regression, regression diagnostics functions, cointegration testing, statistical distributions (CDF, PDF and random deviate generation), Bayesian Gibbs sampling estimation and MCMC convergence diagnostics, maximum likelihood and Bayesian spatial econometrics functions. Demonstrations are provided for all functions and a 200 page manual is available in postscript and PDF form with examples and data sets (also available online). All functions provide printed and graphical output similar to that found in RATS, SAS or TSP. The toolbox, manual and examples are free; no attribution or blame need be assigned to the authors. User contributed functions are welcome and many such functions are included in the Econometrics Toolbox, as well as other MATLAB functions that have been placed in the public domain.
Article
Using Bayesian Model Averaging, we examine whether inflation's effects on economic growth are robust to model uncertainty across numerous specifications. Cross-sectional data provide little evidence of a robust inflation-growth relationship, even after allowing for non-linear effects. Panel data with fixed effects suggest inflation is one of the more robust variables affecting growth, and non-linear results suggest that high inflation observations drive the results. However, this robustness is lost when estimation is carried out with instrumental variables.
Article
In this article, we use the approach based on the Hilbert metric to study the asymptotic behavior of the optimal filter, and to prove as in [9] the uniform convergence of several particle filters, such as the interacting particle filter (IPF) and some other original particle filters. A common assumption to prove stability results, see e.g. in [9, Theorem 2.4], is that the Markov transition kernels are mixing, which implies that the hidden state sequence is ergodic. Our results are obtained under the assumption that the nonnegative kernels describing the evolution of the unnormalized optimal filter, and incorporating simultaneously the Markov transition kernels and the likelihood functions, are mixing. This is a weaker assumption, see Proposition 3.9, which allows us to consider some cases, similar to the case studied in [6], where the hidden state sequence is not ergodic, see Example 3.10. This point of view is further developed by Le Gland and Oudjane in [22] and by Oudjane and Rubenthaler in [28]. Our main contribution is to study also the stability of the optimal filter w.r.t. the model, when the local error is propagated by mixing kernels, and can be estimated in the Hilbert metric, in the total variation norm, or in a weaker distance suitable for random probability distributions. AMS 1991 subject classifications. Primary 93E11, 93E15, 62E25; secondary 60B10, 60J27, 62G07, 62G09, 62L10
Accelerated Massive Parallelism with Microsoft Visual C++
• K Gregory
• A Miller
Gregory K, Miller A (2012). Accelerated Massive Parallelism with Microsoft Visual C++. Microsoft Press, USA.
Combined Parameter and State Estimation in Simulation Based Filtering
• J S Liu
• M West
Liu JS, West M (2001). "Combined Parameter and State Estimation in Simulation Based Filtering." In A Doucet, N de Freitas, N Gordon (eds.), Sequential Monte Carlo Methods in Practice. Springer-Verlag.