
Michael Wolf, University of Zurich
83 Publications · 10,589 Reads · 11,086 Citations
Publications (83)
Markowitz portfolio selection is a cornerstone in finance, both in academia and in the industry. Most academic studies either ignore transaction costs or account for them in a way that is both unrealistic and suboptimal by (i) assuming transaction costs to be constant across stocks and (ii) ignoring them at the portfolio-selection stage and simply...
Multivariate GARCH models do not perform well in large dimensions due to the so-called curse of dimensionality. The recent DCC-NL model of Engle et al. (2019) is able to overcome this curse via nonlinear shrinkage estimation of the unconditional correlation matrix. In this paper, we show how performance can be increased further by using open/high/...
Under rotation-equivariant decision theory, sample covariance matrix eigenvalues can be optimally shrunk by recombining sample eigenvectors with a (potentially nonlinear) function of the unobservable population covariance matrix. The optimal shape of this function reflects the loss/risk that is to be minimized. We solve the problem of optimal covar...
Modeling and forecasting dynamic (or time-varying) covariance matrices has many important applications in finance, such as Markowitz portfolio selection. A popular tool to this end are multivariate GARCH models. Historically, such models did not perform well in large dimensions due to the so-called curse of dimensionality. The recent DCC-NL model o...
This paper constructs a new estimator for large covariance matrices by drawing a bridge between the classic Stein (1975) estimator in finite samples and recent progress under large-dimensional asymptotics. Our formula is quadratic: it has two shrinkage targets weighted by quadratic functions of the concentration ratio (matrix dimension divided by s...
Under rotation-equivariant decision theory, sample covariance matrix eigenvalues can be optimally shrunk by recombining sample eigenvectors with a (potentially nonlinear) function of the unobservable population covariance matrix. The optimal shape of this function reflects the loss/risk that is to be minimized. We introduce a broad family of covari...
Many econometric and data-science applications require a reliable estimate of the covariance matrix, such as Markowitz portfolio selection. When the number of variables is of the same magnitude as the number of observations, this constitutes a difficult estimation problem; the sample covariance matrix certainly will not do. In this paper, we review...
This paper injects factor structure into the estimation of time-varying, large-dimensional covariance matrices of stock returns. Existing factor models struggle to model the covariance matrix of residuals in the presence of time-varying conditional heteroskedasticity in large universes. Conversely, rotation-equivariant estimators of large-dimension...
Applied researchers often want to make inference for the difference of a given performance measure for two investment strategies. In this paper, we consider the class of performance measures that are smooth functions of population means of the underlying returns; this class is very rich and contains many performance measures of practical interest (...
Many researchers seek factors that predict the cross-section of stock returns. The standard methodology sorts stocks according to their factor scores into quantiles and forms a corresponding long-short portfolio. Such a course of action ignores any information on the covariance matrix of stock returns. Historically, it has been difficult to estimat...
This paper injects factor structure into the estimation of time-varying, large-dimensional covariance matrices of stock returns. Existing factor models struggle to model the covariance matrix of residuals in the presence of conditional heteroskedasticity in large universes. Conversely, rotation-equivariant estimators of large-dimensional time-varyi...
Certain estimation problems involving the covariance matrix in large dimensions are considered. Due to the breakdown of finite-dimensional asymptotic theory when the dimension is not negligible with respect to the sample size, it is necessary to resort to an alternative framework known as large-dimensional asymptotics. Recently, an estimator of the...
Second moments of asset returns are important for risk management and portfolio selection. The problem of estimating second moments can be approached from two angles: time series and the cross-section. In time series, the key is to account for conditional heteroskedasticity; a favored model is Dynamic Conditional Correlation (DCC), derived from the...
In many multiple testing problems, the individual null hypotheses (i) concern univariate parameters and (ii) are one-sided. In such problems, power gains can be obtained for bootstrap multiple testing procedures in scenarios where some of the parameters are 'deep in the null' by making certain adjustments to the null distribution under which to res...
Many researchers seek factors that predict the cross-section of stock returns. The standard methodology sorts stocks according to their factor scores into quantiles and forms a corresponding long-short portfolio. Such a course of action ignores any information on the covariance matrix of stock returns. Historically, it has been difficult to estimat...
This paper shows how asymptotically valid inference in regression models based on the weighted least squares (WLS) estimator can be obtained even when the model for reweighting the data is misspecified. Like the ordinary least squares estimator, the WLS estimator can be accompanied by heteroskedasticity-consistent (HC) standard errors without knowled...
The mispricing of marketing performance indicators (e.g., brand equity, churn, and customer satisfaction) is an important element of arguments in favor of the financial value of marketing investments. Evidence for mispricing can be assessed by examining whether or not portfolios composed of firms that load highly on marketing performance indicators...
Markowitz (1952) portfolio selection requires estimates of (i) the vector of expected returns and (ii) the covariance matrix of returns. Many proposals to address the first question exist already. This paper addresses the second question. We promote a new nonlinear shrinkage estimator of the covariance matrix that is more flexible than previous lin...
Linear regression models form the cornerstone of applied research in economics and other scientific disciplines. When conditional heteroskedasticity is present, or at least suspected, the practice of reweighting the data has long been abandoned in favor of estimating model parameters by ordinary least squares (OLS), in conjunction with using hetero...
This paper considers the problem of testing a finite number of moment inequalities. We propose a two-step approach. In the first step, a confidence region for the moments is constructed. In the second step, this set is used to provide information about which moments are “negative.” A Bonferroni-type correction is used to account for the fact that,...
A key stated objective of the Australian Plain Packaging Act 2011 is to influence smoking prevalence, in particular of minors. We use the Roy Morgan Single Source (Australia) data set on minors (that is, Australians aged 14 to 17 years) over the time period January 2001 to December 2013 to analyze whether there is evidence that this goal has been...
Many postulated relations in finance imply that expected asset returns strictly increase in an underlying characteristic. To examine the validity of such a claim, one needs to take the entire range of the characteristic into account, as is done in the recent proposal of Patton and Timmermann (2010). But their test is only a test for the direction o...
We introduce a general testing procedure in models with possible identification failure that has exact asymptotic rejection probability under the null hypothesis. The procedure is widely applicable and in this paper we apply it to tests of arbitrary linear parameter hypotheses as well as to tests of overidentification in time series models given by un...
Many economic and financial applications require the forecast of a random variable of interest over several periods into the future. The sequence of individual forecasts, one period at a time, is called a path-forecast, where the term path refers to the sequence of individual future realizations of the random variable. The problem of constructing a...
This paper revisits the methodology of Stein (1975, 1986) for estimating a covariance matrix in the setting where the number of variables can be of the same magnitude as the sample size. Stein proposed to keep the eigenvectors of the sample covariance matrix but to shrink the eigenvalues. By minimizing an unbiased estimator of risk, Stein derived a...
Covariance matrix estimation and principal component analysis (PCA) are two cornerstones of multivariate analysis. Classic textbook solutions perform poorly when the dimension of the data is of a magnitude similar to the sample size, or even larger. In such settings, there is a common remedy for both statistical problems: nonlinear shrinkage of the...
This paper considers the problem of testing a finite number of moment inequalities. We propose a two-step approach. In the first step, a confidence region for the moments is constructed. In the second step, this set is used to provide information about which moments are “negative.” A Bonferroni-type correction is used to account for the fact that w...
There has been a recent debate in the marketing literature concerning the possible mispricing of customer satisfaction. While earlier studies claim that portfolios with attractive out-of-sample properties can be formed by loading on stocks whose firms enjoy high customer satisfaction, later studies challenge this finding. A large part of the disagr...
Many statistical applications require an estimate of a covariance matrix and/or its inverse. When the matrix dimension is large compared to the sample size, which happens frequently, the sample covariance matrix is known to perform poorly and may suffer from ill-conditioning. There already exists an extensive literature concerning improved estimato...
Many postulated relations in finance imply that expected asset returns should monotonically increase in a certain characteristic. To examine the validity of such a claim, one typically considers a finite number of return categories, ordered according to the underlying characteristic. A standard approach is to simply test for a difference in expecte...
Risk management and Markowitz (1952) portfolio selection both require an estimate of the covariance matrix of asset returns and/or its inverse. When the matrix dimension is large compared to the sample size, which happens frequently, the sample covariance matrix is known to perform poorly and may suffer from ill-conditioning. There already exists a...
Applied researchers often test for the difference of the variance of two investment strategies; in particular, when the investment strategies under consideration aim to implement the global minimum variance portfolio. A popular tool to this end is the F-test for the equality of variances. Unfortunately, this test is not valid when the returns are c...
Multiple testing refers to any instance that involves the simultaneous testing of more than one hypothesis. If decisions about the individual hypotheses are based on the unadjusted marginal p-values, then there is typically a large probability that some of the true null hypotheses will be rejected. Unfortunately, such a course of action is still co...
This article reviews important concepts and methods that are useful for hypothesis testing. First, we discuss the Neyman-Pearson framework. Various approaches to optimality are presented, including finite-sample and large-sample optimality. Then, we summarize some of the most important methods, as well as resampling methodology, which is useful to...
Fund-of-funds (FoF) managers face the task of selecting a (relatively) small number of hedge funds from a large universe of candidate funds. We analyse whether such a selection can be successfully achieved by looking at the track records of the available funds alone, using advanced statistical techniques. In particular, at a given point in time, we...
Consider the problem of testing s hypotheses simultaneously. In order to deal with the multiplicity problem, the classical approach is to restrict attention to procedures that control the familywise error rate (FWE). Typically, it is known how to construct tests of the individual hypotheses, and the problem is how to combine them into a multiple te...
We present a theoretical basis for testing related endpoints. Typically, it is known how to construct tests of the individual hypotheses, and the problem is how to combine them into a multiple test procedure that controls the familywise error rate. Using the closure method, we emphasize the role of consonant procedures, from an interpretive as well...
Transportation costs and monopoly location in presence of regional disparities. This article aims at analysing the impact of the level of transportation costs on the location choice of a monopolist. We consider two asymmetric regions. The heterogeneity of space lies in both regional incomes and population sizes: the first region is endowed...
This paper considers the problem of testing s null hypotheses simultaneously while controlling the false discovery rate (FDR). Benjamini and Hochberg (J. R. Stat. Soc. Ser. B 57(1):289–300, 1995) provide a method for controlling the FDR based on p-values for each of the null hypotheses under the assumption that the p-values are independent. Subsequen...
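As a point of reference for the abstract above, the Benjamini and Hochberg step-up rule it cites can be sketched in a few lines. This is a minimal illustration of the standard 1995 procedure only, not of the method developed in the paper; the function name `benjamini_hochberg` and the default level `alpha=0.05` are assumptions made for this example.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean array that is True where the BH step-up rule rejects.

    Hypothetical helper for illustration; assumes independent p-values,
    as in the original Benjamini-Hochberg result.
    """
    p = np.asarray(p_values, dtype=float)
    s = p.size
    order = np.argsort(p)                       # indices of p-values, ascending
    thresholds = alpha * np.arange(1, s + 1) / s
    below = p[order] <= thresholds              # compare i-th smallest with i*alpha/s
    reject = np.zeros(s, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])        # largest rank meeting its threshold
        reject[order[: k + 1]] = True           # reject everything up to that rank
    return reject
```

The rule is "step-up" because a hypothesis is rejected whenever any larger ordered p-value meets its threshold, even if its own comparison fails.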
Consider the problem of testing s hypotheses simultaneously. In this paper, we derive methods which control the generalized familywise error rate given by the probability of k or more false rejections, abbreviated k-FWER. We derive both single-step and stepdown procedures that control the k-FWER in finite samples or asymptotically, depending on the...
It is common in econometric applications that several hypothesis tests are carried out simultaneously. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. The classical approach is to control the familywise error rate (FWE), which is the probability of one or more false rejections. But when the...
Multilevel or mixed effects models are commonly applied to hierarchical data. The level 2 residuals, which are otherwise known as random effects, are often of both substantive and diagnostic interest. Substantively, they are frequently used for institutional comparisons or rankings. Diagnostically, they are used to assess the model assumptions at t...
Consider the problem of testing s hypotheses simultaneously. The usual approach to dealing with the multiplicity problem is to restrict attention to procedures that control the probability of even one false rejection, the familiar familywise error rate (FWER). In many applications, particularly if s is large, one might be willing to tolerate more t...
Applied researchers often test for the difference of the Sharpe ratios of two investment strategies. A very popular tool to this end is the test of Jobson and Korkie (1981), which has been corrected by Memmel (2003). Unfortunately, this test is not valid when returns have tails heavier than the normal distribution or are of time series nature. Inst...
A well-known pitfall of Markowitz (1952) portfolio optimization is that the sample covariance matrix, which is a critical input, is very erroneous when there are many assets to choose from. If unchecked, this phenomenon skews the optimizer towards extreme weights that tend to perform poorly in the real world. One solution that has been proposed is...
In econometric applications, often several hypothesis tests are carried out at once. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. This paper suggests a stepwise multiple testing procedure that asymptotically controls the familywise error rate. Compared to related single-step methods, the...
Consider the problem of testing k hypotheses simultaneously. In this paper, we discuss finite and large sample theory of stepdown methods that provide control of the familywise error rate (FWE). In order to improve upon the Bonferroni method or Holm's (1979) stepdown method, Westfall and Young (1993) make effective use of resampling to construct step...
We establish the validity of subsampling confidence intervals for the mean of a dependent series with heavy-tailed marginal distributions. Using point process theory, we focus on GARCH-like time series models. We propose a data-dependent method for the optimal block size selection and investigate its performance by means of a simulation study.
We consider the problem of making inference for the autocorrelations of a time series in the possible presence of a unit root. Even when the underlying series is assumed to be strictly stationary, the robustness against a unit root is a desirable property to ensure good finite-sample coverage in case the series has a near unit root. In addition to...
This paper offers a new approach to estimating time-varying covariance matrices in the framework of the diagonal-vech version of the multivariate GARCH(1,1) model. Our method is numerically feasible for large-scale problems, produces positive semidefinite conditional covariance matrices, and does not impose unrealistic a priori restrictions. We pro...
The central message of this paper is that nobody should be using the sample covariance matrix for the purpose of portfolio optimization. It contains estimation error of the kind most likely to perturb a mean-variance optimizer. In its place, we suggest using the matrix obtained from the sample covariance matrix through a transformation called shrin...
This paper offers a new approach to estimating time-varying covariance matrices in the framework of the diagonal-vech version of the multivariate GARCH(1,1) model. Our method is numerically feasible for large-scale problems, produces positive semidefinite conditional covariance matrices, and does not impose unrealistic a priori restrictions. We pro...
Kim and Pollard (Annals of Statistics, 18 (1990) 191–219) showed that a general class of M-estimators converge at rate \( n^{1/3} \) rather than at the standard rate \( n^{1/2} \). Many times, this situation arises when the objective function is non-smooth. The limiting distribution is the (almost surely unique) random vector that maximizes a certain Gaussian proces...
This paper discusses inference in self-exciting threshold autoregressive (SETAR) models. Of main interest is inference for the threshold parameter. It is well-known that the asymptotics of the corresponding estimator depend upon whether the SETAR model is continuous or not. In the continuous case, the limiting distribution is normal and standard in...
A general approach to constructing confidence intervals by subsampling was presented in Politis and Romano (1994). The crux of the method is recomputing a statistic over subsamples of the data, and these recomputed values are used to build up an estimated sampling distribution. The method works under extremely weak conditions, it applies to indepen...
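The recomputation idea described above can be sketched for the simplest case, the mean of a stationary series. This is only an illustration under stated assumptions: the statistic is the sample mean, the convergence rate is taken to be the square root of the sample size (which presumes a finite variance), subsamples are contiguous blocks of length `b`, and the function name `subsampling_ci_mean` is hypothetical. The choice of `b` is left to the caller.

```python
import numpy as np


def subsampling_ci_mean(x, b, alpha=0.05):
    """Two-sided subsampling confidence interval for the mean (sketch).

    Assumes a sqrt(n) convergence rate; x is a 1-D series, b the block size.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= b < n:
        raise ValueError("block size b must satisfy 1 <= b < n")
    theta_hat = x.mean()
    # Means of all n - b + 1 contiguous blocks, computed via cumulative sums.
    csum = np.concatenate(([0.0], np.cumsum(x)))
    block_means = (csum[b:] - csum[:-b]) / b
    # Subsample analogue of the root sqrt(n) * (theta_hat - theta).
    roots = np.sqrt(b) * (block_means - theta_hat)
    q_lo, q_hi = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    # Invert the estimated sampling distribution of the root.
    return theta_hat - q_hi / np.sqrt(n), theta_hat - q_lo / np.sqrt(n)
```

The empirical distribution of the recomputed values stands in for the unknown sampling distribution, so no parametric form is assumed.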
Confidence intervals in time series regressions suffer from notorious coverage problems. This is especially true when the dependence in the data is noticeable and sample sizes are small to moderate, as is often the case in empirical studies. This paper proposes a method that combines prewhitening and the studentized bootstrap. While both prewhiteni...
A new method is proposed for constructing confidence intervals in autoregressive models with linear time trend. Interest focuses on the sum of the autoregressive coefficients because this parameter provides a useful scalar measure of the long-run persistence properties of an economic time series. Since the type of the limiting distribution of the c...
This paper proposes to estimate the covariance matrix of stock returns by an optimally weighted average of two existing estimators: the sample covariance matrix and single-index covariance matrix. This method is generally known as shrinkage, and it is standard in decision theory and in empirical Bayesian statistics. Our shrinkage estimator can be s...
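The weighted average described above can be sketched as follows. This is an illustration, not the paper's estimator: the shrinkage intensity `delta` is passed in as a fixed number rather than estimated optimally, and the function names, the use of an observed market return series, and the degrees-of-freedom convention (`ddof=1`) are all assumptions made for this example.

```python
import numpy as np


def single_index_covariance(returns, market):
    """Covariance matrix implied by a one-factor (market) model (sketch).

    returns: T x N array of asset returns; market: length-T factor returns.
    """
    X = np.asarray(returns, dtype=float)
    m = np.asarray(market, dtype=float)
    Xc = X - X.mean(axis=0)
    mc = m - m.mean()
    var_m = m.var(ddof=1)
    # OLS slope of each asset on the market factor.
    betas = Xc.T @ mc / ((len(m) - 1) * var_m)
    resid = Xc - np.outer(mc, betas)
    resid_var = resid.var(axis=0, ddof=1)
    return var_m * np.outer(betas, betas) + np.diag(resid_var)


def shrink(sample_cov, target, delta):
    """Convex combination of the target and the sample covariance matrix."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return delta * target + (1.0 - delta) * sample_cov
```

With this construction the target shares its diagonal with the sample covariance matrix, so shrinkage only pulls the off-diagonal entries toward the values implied by the single-index model.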
This paper analyzes whether standard covariance matrix tests work when dimensionality is large, and in particular larger than sample size. In the latter case, the singularity of the sample covariance matrix makes likelihood ratio tests degenerate, but other tests based on quadratic forms of sample covariance matrix eigenvalues remain well-defined....
In this paper, we provide a method for constructing confidence intervals for the variance that exhibit guaranteed coverage probability for any sample size, uniformly over a wide class of probability distributions. In contrast, standard methods achieve guaranteed coverage only in the limit for a fixed distribution or for any sample size over a very...
Given a sample X_1, …, X_n from a distribution F, the problem of constructing nonparametric confidence intervals for the mean μ(F) is considered. Unlike bootstrap procedures or those based on normal approximations, we insist on any procedure being truly nonparametric in the sense that the probability that the confidence interval contains μ(F) base...
In this article, a general central limit theorem for a triangular array of m-dependent random variables is presented. Here, m may tend to infinity with the row index at a certain rate. Our theorem is a generalization of previous results. Some examples are given that show that the generalization is useful. In particular, we consider the limiting beh...
The recently developed subsampling methodology has been shown to be valid for the construction of large-sample confidence regions for a general unknown parameter θ under very minimal conditions. Nevertheless, in some specific cases (e.g., in the case of the sample mean of i.i.d. data) it has been noted that the subsampling distribution estimators und...
In this article, asymptotic inference for the mean of i.i.d. observations in the context of heavy-tailed distributions is discussed. While both the standard asymptotic method based on the normal approximation and Efron's bootstrap are inconsistent when the underlying distribution does not possess a second moment, we propose two approaches based on...
In this chapter, we consider \( \underline{X}_{n} = (X_{1}, \ldots, X_{n}) \) to be an observed stretch of a stationary, strong mixing sequence of real-valued random variables \( \{X_{t}, t \in \mathbb{Z}\} \). The probability measure generating the observations is again denoted by P. As mentioned in Appendix A, the...
Stationary time series are very convenient to work with from a mathematical point of view, but the assumption of stationarity is often violated when modeling real-life data. To mention only two examples, many economic time series exhibit seasonal fluctuations, while stock return data typically show time-dependent variability. The goal of this chapt...
The bootstrap was discovered by Efron (1979), who coined the name. In this chapter, the bootstrap is developed as a general method to approximate the sampling distribution of a statistic, a pivot, or a root (defined below), in order to construct confidence regions for a parameter of interest. The use of the bootstrap to approximate a null distribut...
The main practical problem in applying the subsampling method lies in choosing the block size b. This problem is shared by all blocking methods, such as, for example, the moving blocks bootstrap or Carlstein’s (1986) variance estimator (see Sections 3.8 and 3.9). The asymptotic conditions, at least for first-order theory, are usually b → ∞ and b/n...
It has been two decades since Efron (1979) introduced the bootstrap procedure for estimating sampling distributions of statistics based on independent and identically distributed (i.i.d.) observations. While the bootstrap has enjoyed tremendous success and has led to something like a revolution in the field of statistics, it is known to fail for a...
There has been considerable debate in the recent finance literature over whether stock returns are predictable. A number of studies appear to provide empirical support for the use of the current dividend-price ratio, or dividend yield, as a measure of expected stock returns (see, for example, Rozeff, 1984; Campbell and Shiller, 1988b; Fama and Fren...
Suppose {X(t), t ∈ G^d} is a random field in d dimensions, with d ∈ ℤ^+; that is, {X(t), t ∈ G^d} is a collection of random variables X(t) taking values in a state space S, defined on a probability space (Ω, A, P), and indexed by the variable t ∈ G^d. Throughout this chapter, G will stand for either the set of real numbers ℝ, or the set of integers...
In this chapter, a general theory for the construction of confidence intervals or regions is presented. Much of what is presented is extracted from Politis and Romano (1992c, 1994b). The basic idea is to approximate the sampling distribution of a statistic based on the values of the statistic computed over smaller subsets of the data. For example,...
Let X_1, …, X_n denote a realization of a stationary time series. Suppose the infinite dimensional distribution of the infinite sequence is denoted P. The problem we consider is inference for a parameter θ(P). The focus of the present chapter is the case when the parameter space Θ is a metric space. The reason for considering such generality is to b...
It is well known that inference methods for i.i.d. data or, more generally, independent data are simply not consistent when the underlying sequence is dependent. Therefore, the resampling and subsampling methods discussed in the previous chapters need to be modified to be applicable with time series data.
Let X_1, …, X_n be an observed stretch of a (strictly) stationary, strong mixing sequence of random variables {X_t, t ∈ ℤ} taking values in an arbitrary sample space S; the probability measure generating the observations is denoted by P. The strong mixing condition means that the sequence α_X(k) = sup_{A,B} |P(A ∩ B) − P(A)P(B)| tends to zero as k tends to infinity, where A and B are events...
This chapter is concerned with making inference for ρ in the simple AR(1) model
$$ X_{t} = \mu + \rho X_{t-1} + \epsilon_{t}, $$ (12.1) where \( \{\epsilon_{t}\} \) is a strictly stationary white noise innovation sequence and ρ ∈ (−1, 1]. It is well known that if |ρ|...
Many applied problems require a covariance matrix estimator that is not only invertible, but also well-conditioned (that is, inverting it does not amplify estimation error). For large-dimensional covariance matrices, the usual estimator—the sample covariance matrix—is typically not well-conditioned and may not even be invertible. This paper introdu...
In this paper, we propose a new method for constructing confidence intervals for the autoregressive parameter of an AR(1) model. Our method works when the parameter equals one, is close to one, or is far away from one and is therefore more general than previous procedures. The crux of the method is to recompute the OLS t-statistics for the AR(1) pa...