
James G. MacKinnon
Queen's University | QueensU · Department of Economics
James G. MacKinnon
PhD, Princeton University
About
157 Publications · 71,991 Reads
27,787 Citations
Publications (157)
For linear regression models with cross-section or panel data, it is natural to assume that the disturbances are clustered in two dimensions. However, the finite-sample properties of two-way cluster-robust tests and confidence intervals are often poor. We discuss several ways to improve inference with two-way clustering. Two of these are existing m...
We study cluster-robust inference for binary response models. Inference based on the most commonly-used cluster-robust variance matrix estimator (CRVE) can be very unreliable. We study several alternatives. Conceptually the simplest of these, but also the most computationally demanding, involves jackknifing at the cluster level. We also propose a l...
We introduce a new command, summclust, that summarizes the cluster structure of the dataset for linear regression models with clustered disturbances. The key unit of observation for such a model is the cluster. We therefore propose cluster-level measures of leverage, partial leverage, and influence and show how to compute them quickly in most cases...
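The cluster-level leverage measure described above can be sketched in a few lines (illustrative Python, not the summclust implementation; all names are hypothetical):

```python
import numpy as np

def cluster_leverage(X, cluster_ids):
    """Leverage of cluster g: L_g = trace(X_g (X'X)^{-1} X_g'), i.e. the
    sum of the hat-matrix diagonal over the observations in cluster g."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return {g: np.trace(X[cluster_ids == g] @ XtX_inv @ X[cluster_ids == g].T)
            for g in np.unique(cluster_ids)}

# Toy example: 6 observations in 2 clusters, 2 regressors
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(6), rng.standard_normal(6)])
ids = np.array([0, 0, 0, 1, 1, 1])
lev = cluster_leverage(X, ids)
# Cluster leverages sum to k, the number of regressors (here 2)
```

Comparing L_g across clusters flags clusters that dominate the fit; the leverages always sum to the number of regressors.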
We provide new and computationally attractive methods, based on jackknifing, to obtain cluster-robust variance matrix estimators (CRVEs) for linear regression models estimated by least squares. These estimators have previously been computationally infeasible except for small samples. We also propose several new variants of the wild cluster bootstra...
The overwhelming majority of empirical research that uses cluster-robust inference assumes that the clustering structure is known, even though there are often several possible ways in which a dataset could be clustered. We propose two tests for the correct level of clustering in regression models. One test focuses on inference about a single coeffi...
We provide computationally attractive methods to obtain jackknife‐based cluster‐robust variance matrix estimators (CRVEs) for linear regression models estimated by least squares. We also propose several new variants of the wild cluster bootstrap, which involve these CRVEs, jackknife‐based bootstrap data‐generating processes, or both. Extensive simu...
As I demonstrate using evidence from a journal data repository that I manage, the datasets used in empirical work are getting larger. When we use very large datasets, it can be dangerous to rely on standard methods for statistical inference. In addition, we need to worry about computational issues. We must be careful in our choice of statistical me...
Cluster-robust inference is widely used in modern empirical work in economics and many other disciplines. When data are clustered, the key unit of observation is the cluster. We propose measures of "high-leverage" clusters and "influential" clusters for linear regression models. The measures of leverage and partial leverage, and functions of them,...
Methods for cluster-robust inference are routinely used in economics and many other disciplines. However, it is only recently that theoretical foundations for the use of these methods in many empirically relevant situations have been developed. In this paper, we use these theoretical results to provide a guide to empirical practice. We do not attem...
Efficient computational algorithms for bootstrapping linear regression models with clustered data are discussed. For ordinary least squares (OLS) regression, a new algorithm is provided for the pairs cluster bootstrap, along with two algorithms for the wild cluster bootstrap. One of these is a new way to express an existing method. For instrumental...
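The wild cluster bootstrap that these algorithms accelerate can be sketched in a naive, unoptimized form (a toy illustration using Rademacher weights and the plain CV1 variance; it is not the paper's efficient algorithm, and all names are hypothetical):

```python
import numpy as np

def wcb_p_value(y, X, cluster_ids, j=1, B=199, seed=7):
    """Symmetric P value for H0: beta_j = 0 from a restricted wild
    cluster bootstrap with Rademacher weights (illustrative only)."""
    rng = np.random.default_rng(seed)
    clusters = np.unique(cluster_ids)

    def t_stat(yy):
        beta = np.linalg.lstsq(X, yy, rcond=None)[0]
        u = yy - X @ beta
        XtX_inv = np.linalg.inv(X.T @ X)
        # "Meat" of the CV1 sandwich: sum over clusters of s_g s_g'
        meat = sum(np.outer(X[cluster_ids == g].T @ u[cluster_ids == g],
                            X[cluster_ids == g].T @ u[cluster_ids == g])
                   for g in clusters)
        V = XtX_inv @ meat @ XtX_inv
        return beta[j] / np.sqrt(V[j, j])

    # Impose the null: regress y on X without column j
    Xr = np.delete(X, j, axis=1)
    br = np.linalg.lstsq(Xr, y, rcond=None)[0]
    fitted, resid = Xr @ br, y - Xr @ br

    t_hat = t_stat(y)
    hits = 0
    for _ in range(B):
        v = rng.choice([-1.0, 1.0], size=len(clusters))  # cluster-level signs
        w = v[np.searchsorted(clusters, cluster_ids)]    # map to observations
        if abs(t_stat(fitted + w * resid)) >= abs(t_hat):
            hits += 1
    return hits / B, t_hat

# Toy data: 5 clusters of 4 observations, H0 true by construction
rng = np.random.default_rng(3)
n = 20
ids = np.repeat(np.arange(5), 4)
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = rng.standard_normal(n)
p, t_hat = wcb_p_value(y, X, ids)
```

The key design point, which the efficient algorithms exploit, is that only cluster-level quantities change across bootstrap replications.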
Inference using difference-in-differences with clustered data requires care. Previous research has shown that, when there are few treated clusters, t-tests based on cluster-robust variance estimators (CRVEs) severely overreject, and different variants of the wild cluster bootstrap can either overreject or underreject dramatically. We study two rand...
We study two cluster-robust variance estimators (CRVEs) for regression models with clustering in two dimensions and give conditions under which t-statistics based on each of them yield asymptotically valid inferences. In particular, one of the CRVEs requires stronger assumptions about the nature of the intra-cluster correlations. We then propose se...
In many fields of economics, and also in other disciplines, it is hard to justify the assumption that the random error terms in regression models are uncorrelated. It seems more plausible to assume that they are correlated within clusters, such as geographical areas or time periods, but uncorrelated across clusters. It has therefore become very pop...
When there are few treated clusters in a pure treatment or difference-in-differences setting, t tests based on a cluster-robust variance estimator can severely over-reject. Although procedures based on the wild cluster bootstrap often work well when the number of treated clusters is not too small, they can either over-reject or under-reject serious...
We study inference based on cluster-robust variance estimators for regression models with clustered errors, focusing on the wild cluster bootstrap. We state conditions under which asymptotic and bootstrap tests and confidence intervals are asymptotically valid. These conditions put limits on the rates at which the cluster sizes can increase as the...
The wild bootstrap was originally developed for regression models with heteroskedasticity of unknown form. Over the past 30 years, it has been extended to models estimated by instrumental variables and maximum likelihood and to ones where the error terms are (perhaps multiway) clustered. Like bootstrap methods in general, the wild bootstrap is espe...
Inference based on cluster-robust standard errors in linear regression models, using either the Student's t distribution or the wild cluster bootstrap, is known to fail when the number of treated clusters is very small. We propose a family of new procedures called the subcluster wild bootstrap, which includes the ordinary wild bootstrap as a limit...
We study asymptotic inference based on cluster-robust variance estimators for regression models with clustered errors, focusing on the wild cluster bootstrap and the ordinary wild bootstrap. We state conditions under which both asymptotic and bootstrap tests and confidence intervals will be asymptotically valid. These conditions put limits on the r...
Confidence intervals based on cluster-robust covariance matrices can be constructed in many ways. In addition to conventional intervals obtained by inverting Wald (t) tests, the paper studies intervals obtained by inverting LM tests, studentized bootstrap intervals based on the wild cluster bootstrap, and restricted bootstrap intervals obtained b...
The cluster robust variance estimator (CRVE) relies on the number of clusters being sufficiently large. Monte Carlo evidence suggests that the ‘rule of 42’ is not true for unbalanced clusters. Rejection frequencies are higher for datasets with 50 clusters proportional to US state populations than with 50 balanced clusters. Using critical values bas...
Inference using large datasets is not nearly as straightforward as conventional econometric theory suggests when the disturbances are clustered, even with very small intra-cluster correlations. The information contained in such a dataset grows much more slowly with the sample size than it would if the observations were independent. Moreover, infere...
We study the finite-sample properties of tests for overidentifying restrictions in linear regression models with a single endogenous regressor and weak instruments. Under the assumption of Gaussian disturbances, we derive expressions for a variety of test statistics as functions of eight mutually independent random variables and two nuisance parame...
We calculate numerically the asymptotic distribution functions of likelihood ratio tests for fractional unit roots and cointegration rank. Because these distributions depend on a real-valued parameter, b, which must be estimated, simple tabulation is not feasible. Partly due to the presence of this parameter, the choice of model specification for t...
Economists are often interested in the coefficient of a single endogenous explanatory variable in a linear simultaneous equations model. One way to obtain a confidence set for this coefficient is to invert the Anderson-Rubin test. The "AR confidence sets" that result have correct coverage under classical assumptions. However, AR confidence sets als...
Teaching graduate econometrics means covering three different kinds of subject matter: a grounding in the theory of econometrics, a long laundry list of available econometric techniques, and an introduction to the fact that the practice of linking models and data is every bit as untidy as mathematical statistics is neat. I assign Econometric Theory...
We study several methods of constructing confidence sets for the coefficient of the single right-hand-side endogenous variable in a linear equation with weak instruments. Two of these are based on conditional likelihood ratio (CLR) tests, and the others are based on inverting t statistics or the bootstrap P values associated with them. We propose a...
White (1980) marked the beginning of a new era for inference in econometrics. It introduced the revolutionary idea of inference that is robust to heteroskedasticity of unknown form, an idea that was very soon extended to other forms of robust inference and also led to many new estimation methods. This paper discusses the development of heteroskedas...
A Bayesian method for estimation of a hazard term structure is presented in a functional data analysis framework. The hazard term structure is designed to include the effects of changes in economic conditions, as well as trends in stock prices and accounting ...
We study several tests for the coefficient of the single right-hand-side endogenous variable in a linear equation estimated by instrumental variables. We show that writing all the test statistics--Student's t, Anderson--Rubin, the LM statistic of Kleibergen and Moreira (K), and likelihood ratio (LR)--as functions of six random quantities leads to a...
An artificial regression is a linear regression that is associated with some other econometric model, which is usually nonlinear. It can be used for a variety of purposes, in particular computing covariance matrices and calculating test statistics. The best-known artificial regression is the Gauss–Newton regression, whose key properties are shared...
The well-known Durbin–Watson, or DW, statistic, which was proposed by Durbin and Watson (1950, 1951), is used for testing the null hypothesis that the error terms of a linear regression model are serially independent.
Associated with every popular nonlinear estimation method is at least one "artificial" linear regression. We define an artificial regression in terms of three conditions that it must satisfy. Then we show how artificial regressions can be useful for numerical optimization, testing hypotheses, and computing parameter estimates. Several existing arti...
Summary We develop a method based on the use of polar coordinates to investigate the existence of moments for instrumental variables and related estimators in the linear regression model. For generalized IV estimators, we obtain familiar results. For JIVE, we obtain the new result that this estimator has no moments at all. Simulation results illus...
We propose a wild bootstrap procedure for linear regression models estimated by instrumental variables. Like other bootstrap procedures that we have proposed elsewhere, it uses efficient estimates of the reduced-form equation(s). Unlike them, it takes account of possible heteroskedasticity of unknown form. We apply this procedure to t tests, includ...
This paper surveys bootstrap and Monte Carlo methods for testing hypotheses in econometrics. Several different ways of computing bootstrap P values are discussed, including the double bootstrap and the fast double bootstrap. It is emphasized that there are many different procedures for generating bootstrap samples for regression models and other ty...
Two procedures are proposed for estimating the rejection probabilities (RPs) of bootstrap tests in Monte Carlo experiments without actually computing a bootstrap test for each replication. These procedures are only about twice as expensive (per replication) as estimating RPs for asymptotic tests. Then a new procedure is proposed for computing boots...
We perform an extensive series of Monte Carlo experiments to compare the performance of two variants of the ‘jackknife instrumental variables estimator’, or JIVE, with that of the more familiar 2SLS and LIML estimators. We find no evidence to suggest that JIVE should ever be used. It is always more dispersed than 2SLS, often very much so, and it is...
Resampling methods such as the bootstrap are routinely used to estimate the finite-sample null distributions of a range of test statistics. We present a simple and tractable way to perform classical hypothesis tests based upon a kernel estimate of the CDF of the bootstrap statistics. This approach has a number of appealing features: (i) it can perf...
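The kernel-based approach can be sketched as follows (a Gaussian kernel and an arbitrary bandwidth, purely for illustration; this is not the authors' procedure):

```python
import math

def kernel_cdf_p(tau_hat, tau_star, h=0.2):
    """Upper-tail P value from a Gaussian-kernel estimate of the CDF of
    the bootstrap statistics: p = 1 - (1/B) * sum_b Phi((tau_hat - tau*_b)/h)."""
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - sum(Phi((tau_hat - t) / h) for t in tau_star) / len(tau_star)

# Bootstrap draws symmetric around 0 give p = 0.5 at tau_hat = 0
p = kernel_cdf_p(0.0, [-1.5, -0.5, 0.5, 1.5])
```

Smoothing the empirical CDF in this way yields a P value that varies continuously in tau_hat, at the cost of a bandwidth choice.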
We introduce the concept of the bootstrap discrepancy, which measures the difference in rejection probabilities between a bootstrap test based on a given test statistic and that of a (usually infeasible) test based on the true distribution of the statistic. We show that the bootstrap discrepancy is of the same order of magnitude under the null hypot...
There are many bootstrap methods that can be used for econometric analysis. In certain circumstances, such as regression models with independent and identically distributed error terms, appropriately chosen bootstrap methods generally work very well. However, there are many other cases, such as regression models with dependent errors, in which boot...
We provide a joint treatment of three major issues that surround testing for a unit root in practice: uncertainty as to whether or not a linear deterministic trend is present in the data, uncertainty as to whether the initial condition of the process is (asymptotically) negligible or not, and the possible presence of nonstationary volatility in the...
Conventional procedures for Monte Carlo and bootstrap tests require that B, the number of simulations, satisfy a specific relationship with the level of the test. Otherwise, a test that would instead be exact will either overreject or underreject for finite B. We present expressions for the rejection frequencies associated with existing procedures...
The astonishing increase in computer performance over the past two decades has made it possible for economists to base many statistical inferences on simulated, or bootstrap, distributions rather than on distributions obtained from asymptotic theory. In this paper, I review some of the basic ideas of bootstrap inference. I discuss Monte Carlo tests...
This paper provides densities and finite sample critical values for the single-equation error correction statistic for testing cointegration. Graphs and response surfaces summarize extensive Monte Carlo simulations and highlight simple dependencies of the statistic's quantiles on the number of variables in the error correction model, the choice of d...
It has been shown in previous work that bootstrapping the J test for nonnested linear regression models dramatically improves its finite-sample performance. We provide evidence that a more sophisticated bootstrap procedure, which we call the fast double bootstrap, produces a very substantial further improvement in cases where the ordinary bootstrap d...
We show that the power of a bootstrap test will generally be very close to the level-adjusted power of the asymptotic test on which it is based, provided the latter is calculated properly. Our result, when combined with previous results on approximating the rejection frequency of bootstrap tests, provides a way to simulate the power of both asympto...
We first propose procedures for estimating the rejection probabilities for bootstrap tests in Monte Carlo experiments without actually computing a bootstrap test for each replication. These procedures are only about twice as expensive (per replication) as estimating rejection probabilities for asymptotic tests. We then propose procedures for comput...
This paper investigates the relation between hypothesis testing and the construction of confidence intervals, with particular regard to bootstrap tests. In practice, confidence intervals are almost always based on Wald tests, and consequently are not invariant under nonlinear reparametrisations. Bootstrap percentile-t confidence intervals are an i...
This paper discusses how to choose the number of bootstrap samples when performing bootstrap tests. There are two important issues that arise when the number of bootstraps is finite. One is bias in the estimation of bootstrap P values or critical values, and the second is loss of power. We discuss an easy way to avoid bias and thus obtain exact tes...
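The bias issue arises because a Monte Carlo or bootstrap test with B bootstrap samples can only be exact at level alpha when alpha(B+1) is an integer. A quick check (a sketch, not code from the paper):

```python
def exactness_ok(alpha, B):
    """A Monte Carlo/bootstrap test can be exact at level alpha only
    if alpha * (B + 1) is an integer."""
    x = alpha * (B + 1)
    return abs(x - round(x)) < 1e-9

print(exactness_ok(0.05, 399))  # 0.05 * 400 = 20    -> True
print(exactness_ok(0.05, 400))  # 0.05 * 401 = 20.05 -> False
```

This is why conventional choices such as B = 99, 199, 399, or 999 recur in the bootstrap literature: they satisfy the condition at the usual 5% and 1% levels.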
This paper employs systems-based cointegration techniques developed by Johansen (1988, Journal of Economic Dynamics and Control 12, 231-254; 1995, Likelihood-based Inference in Cointegrated Vector Autoregressive Models, Oxford University Press) to determine which European Union countries would form a successful Economic and Monetary Union (EMU), ba...
In practice, bootstrap tests must use a finite number of bootstrap samples. This means that the outcome of the test will depend on the sequence of random numbers used to generate the bootstrap samples, and it necessarily results in some loss of power. We examine the extent of this power loss and propose a simple pretest procedure for choosing the n...
The primary aim of the paper is to place current methodological discussions in macroeconometric modeling contrasting the ‘theory first’ versus the ‘data first’ perspectives in the context of a broader methodological framework with a view to constructively appraise them. In particular, the paper focuses on Colander’s argument in his paper “Economist...
This paper provides cumulative distribution functions, densities, and finite sample critical values for the single-equation error correction statistic for testing cointegration. Graphs and response surfaces summarize extensive Monte Carlo simulations and highlight simple dependencies of the statistic's quantiles on the number of variables in the er...
This paper employs response surface regressions based on simulation experiments to calculate asymptotic distribution functions for the Johansen-type likelihood ratio tests for cointegration. These are carried out in the context of the models recently proposed by Pesaran, Shin, and Smith (1997) that allow for the possibility of exogenous variables i...
Many test statistics in econometrics have asymptotic distributions that cannot be evaluated analytically. In order to conduct asymptotic inference, it is therefore necessary to resort to simulation. The techniques that have commonly been used yield only a small number of critical values, which can be seriously inaccurate because of both experimen...
Economists routinely compute test statistics of which the finite-sample distributions are unknown and use them to reject, or not reject, whatever hypotheses are being tested. Let τ̂ denote the realized value of such a test statistic. For simplicity, and without loss of generality, assume that we wish to reject if τ̂ is sufficiently large. In principl...
We provide a theoretical framework in which to study the accuracy of bootstrap P values, which may be based on a parametric or nonparametric bootstrap. In the parametric case, the accuracy of a bootstrap test will depend on the shape of what we call the critical value function. We show that, in many circumstances, the error in rejection probability...
Bootstrap testing of nonlinear models normally requires at least one nonlinear estimation for every bootstrap sample. We show how to reduce computational costs by performing only a fixed, small number of Newton or quasi-Newton steps for each bootstrap sample. The number of steps is smaller for likelihood ratio tests than for other types of classica...
Simple techniques for the graphical display of simulation evidence concerning the size and power of hypothesis tests are developed and illustrated. Three types of figures--called P value plots, P value discrepancy plots, and size-power curves--are discussed. Some Monte Carlo experiments on the properties of alternative forms of the information matr...
Bootstrap tests are tests for which the significance level is calculated using some variant of the bootstrap, which may be parametric or nonparametric. We show that the power of a bootstrap test will generally be very close to the power of the asymptotic test on which it is based, provided that both tests are properly adjusted to have the correct s...
This paper employs response surface regressions based on simulation experiments to calculate distribution functions for some well-known unit root and cointegration test statistics. The principal contributions of the paper are a set of data files that contain estimated response surface coefficients and a computer program for utilizing them. This pro...
Bootstrap tests are tests for which the significance level is calculated by some sort of bootstrap procedure, which may be parametric or nonparametric. We show that, in many circumstances, the size distortion of a bootstrap P value for a test will be one whole order of magnitude smaller than that of the corresponding asymptotic P value. We also sho...
Bootstrap tests are tests for which the significance level is calculated by some sort of bootstrap procedure, which may be parametric or nonparametric. We provide a theoretical framework in which to study the size distortions of bootstrap P values. We show that, in many circumstances, the size distortion of a bootstrap test will be one whole order...
This paper discusses methods for reducing the bias of consistent estimators that are biased in finite samples. These methods are available whenever the bias function, which relates the bias of the parameter estimates to the values of the parameters, can be estimated by computer simulation or by some other method. If so, bias can be reduced by one f...
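The simulation-based bias-correction idea can be sketched on a toy problem, the normal-variance MLE, which is biased downward by the factor (n-1)/n. Names and settings here are illustrative, not from the paper:

```python
import numpy as np

def simulate_bias(estimator, dgp, theta, R=5000, seed=1):
    """Estimate the bias function b(theta) = E[theta_hat] - theta by
    simulating R samples from the DGP evaluated at theta."""
    rng = np.random.default_rng(seed)
    return np.mean([estimator(dgp(theta, rng)) for _ in range(R)]) - theta

# Toy problem: the MLE of a normal variance (divide by n) has bias
# -sigma^2/n, so the corrected estimate should move up by about 10%.
n = 10
dgp = lambda sigma2, rng: rng.normal(0.0, np.sqrt(sigma2), n)
mle = lambda x: np.mean((x - np.mean(x)) ** 2)

theta_hat = 1.0  # pretend this came from real data
# One-step correction: subtract the simulated bias at theta_hat
corrected = theta_hat - simulate_bias(mle, dgp, theta_hat)
```

Evaluating the bias function at the uncorrected estimate and subtracting is the simplest version; the paper's methods refine this idea.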
Monte Carlo experiments and response surface regressions are used to calculate approximate asymptotic distribution functions for a number of well-known unit root and cointegration test statistics. These allow empirical workers to calculate approximate P values for these tests. The results of the paper are based on an extensive set of Monte Carlo ex...
Offering a unifying theoretical perspective not readily available in any other text, this innovative guide to econometrics uses simple geometrical arguments to develop students' intuitive understanding of basic and advanced topics, emphasizing throughout the practical applications of modern theory and nonlinear techniques of estimation. One theme o...
A new form of the information matrix test is developed for a wide variety of statistical models. The test is constructed against an explicit alternative with random parameter variation. It is computed using a double-length artificial regression instead of the more conventional outer-product-of-the-gradient regression, which is known to have very po...
This is a survey of recent developments in the field of cointegration, which links long run components of a pair or of a group of series. It can then be used to discuss some types of equilibrium and to introduce them into time-series models in a fairly uncontroversial way. The idea was introduced in the early 1980s and has generated much interest s...
Any artificial regression that can be used to compute Lagrange Multiplier tests can just as easily be used to compute C(α) tests. This also makes it possible to compute Wald-like tests by means of artificial regressions.
Methods based on linear regression provide an easy way to use the information in control variates to improve the efficiency with which certain features of the distributions of estimators and test statistics are estimated in Monte Carlo experiments. We propose a new technique that allows these methods to be used when the quantities of interest are q...
This paper provides tables of critical values for some popular tests of cointegration and unit roots. Although these tables are necessarily based on computer simulations, they are much more accurate than those previously available. The results of the simulation experiments are summarized by means of response surface regressions in which critical va...
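Response surfaces of this kind typically express a finite-sample critical value as a polynomial in 1/T. A toy evaluator (the coefficients below are hypothetical, not values from the paper's tables):

```python
def response_surface_cv(T, b_inf, b1, b2=0.0):
    """Approximate finite-sample critical value from a response surface:
    cv(T) = b_inf + b1/T + b2/T**2, where b_inf is the asymptotic value."""
    return b_inf + b1 / T + b2 / T**2

# Hypothetical coefficients for illustration only
cv_100 = response_surface_cv(100, -3.43, -6.0, -30.0)
cv_big = response_surface_cv(10**9, -3.43, -6.0, -30.0)  # near b_inf
```

Fitting such regressions to simulated quantiles at many sample sizes is what lets a short table of coefficients replace pages of critical values.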
A scale-invariant family of transformations is proposed which, unlike the Box-Cox transformation, can be applied to variables that are equal to zero or of either sign. Two Lagrange multiplier tests are derived for testing the null hypothesis of no dependent variable transformation against the alternative of a transformation from this family. These...
We develop a new form of the information matrix test for a wide variety of statistical models, and present full details for the special case of univariate nonlinear regression models. Chesher (1984) showed that the implicit alternative of the information matrix test is a model with random parameter variation. We exploit this fact by constructing th...
It is often argued that the dependent variable in money demand functions is really the price level, the money stock itself being exogenous. A recent approach which stresses this theme is the "buffer stock" hypothesis, in which money supply shocks explicitly appear in the demand for money function, because prices and interest rates do not adjust rapi...
It is remarkably easy to test for structural change, of the type that the classic F or "Chow" test is designed to detect, in a manner that is robust to heteroskedasticity of possibly unknown form. This paper first discusses how to test for structural change in nonlinear regression models by using a variant of the Gauss-Newton regression. It then sh...
Artificial linear regressions often provide a convenient way to calculate test statistics and estimated covariance matrices. This paper discusses one family of these regressions called double-length because the number of observations in the artificial regression is twice the actual number of observations. These double-length regressions can be...
We consider several issues related to what Hausman (1978) called "specification tests", namely tests designed to verify the consistency of parameter estimates. We first review a number of results about these tests in linear regression models, and present some new material on their distribution when the model being tested is false, and on a simple w...
The local power of test statistics is analyzed by considering sequences of data-generating processes (DGPs) that approach the null hypothesis without necessarily satisfying the alternative. The three classical test statistics (LR, Wald, and LM) are shown to tend asymptotically to the same random variable under all such sequences. The power of these...
We develop a new specification test which can easily be applied to regression models that have been estimated by generalized least squares. The test is a variant of the F-test. It is derived for the general case, and also, in more detail, for the commonly encountered case of models with AR(1) errors. Two empirical examples are presented, one of the...
The asymptotic power of a statistical test depends on the model being tested, the (implicit) alternative against which the test is constructed, and the process which actually generated the data. The exact way in which it does so is examined for several classes of models and tests. First, we analyze the power of tests of nonlinear regression models...