Hedibert Freitas Lopes, University of Chicago | UC · Chicago Booth School of Business
PhD in Statistics and Decision Sciences
About
92
Publications
15,460
Reads
4,010
Citations
Introduction
I conduct research in Markov Chain and Sequential Monte Carlo techniques applied to multivariate econometrics and time-series models; modeling time-varying covariance of multivariate time series through latent factor analysis; Choleski decomposition and other factorizations; dynamic models and Bayesian inference; and computation. I am mainly interested in the implementation of the Bayesian paradigm to solve real large-scale problems in Econometrics and other fields of Economics.
Publications (92)
Factor Analysis is a popular method for modeling dependence in multivariate data. However, determining the number of factors and obtaining a sparse orientation of the loadings are still major challenges. In this paper, we propose a decision-theoretic approach that brings to light the relation between a sparse representation of the loadings and fact...
We introduce a new and general set of identifiability conditions for factor models which handles the ordering problem associated with current common practice. In addition, the new class of parsimonious Bayesian factor analysis leads to a factor loading matrix representation which is an intuitive and easy to implement factor selection scheme. We ar...
In this paper we investigate whether or not the volatility per period of stocks is lower over longer horizons. Taking the perspective of an investor, we evaluate the predictive variance of k-period returns under different model and prior specifications. We adopt the state space framework of Pástor and Stambaugh (2012 L. Pástor and R. F. Stambaugh....
In this work, we investigate sequential Bayesian estimation for inference of stochastic volatility with variance-gamma (SVVG) jumps in returns. We develop an estimation algorithm that combines the sequential learning auxiliary particle filter with the particle learning filter. Simulation evidence and empirical estimation results indicate that this...
It is well-known that parameter estimates and forecasts are sensitive to assumptions about the tail behavior of the error distribution. In this paper we develop an approach to sequential inference that also simultaneously estimates the tail of the accompanying error distribution. Our simulation-based approach models errors with a $t_\nu$-distribution...
Instrumental variable (IV) regression provides a number of statistical challenges due to the shape of the likelihood. We review the main Bayesian literature on instrumental variables and highlight these pathologies. We discuss Jeffreys priors, the connection to the errors-in-the-variables problems and more general error distributions. We propose, a...
In this paper we use Google Flu Trends data together with a sequential surveillance model based on the state-space methodology, to track the evolution of an epidemic process over time. We embed a classical mathematical epidemiology model (a susceptible-exposed-infected-recovered (SEIR) model) within the state-space framework, thereby extending the...
In this paper we develop a simulation-based approach to sequential inference in Bayesian statistics. Our resampling–sampling perspective provides draws from posterior distributions of interest by exploiting the sequential nature of Bayes theorem. Predictive inferences are a direct byproduct of our analysis as are marginal likelihoods for model asse...
Background
Several studies in Drosophila have shown excessive movement of retrogenes from the X chromosome to autosomes, and that these genes are frequently expressed in the testis. This phenomenon has led to several hypotheses invoking natural selection as the process driving male-biased genes to the autosomes. Metta and Schlötterer (BMC Evol Biol...
Meiotic sex chromosome inactivation (MSCI) during spermatogenesis has been proposed as one of the evolutionary driving forces behind both the under-representation of male-biased genes on, and the gene movement out of, the X chromosome in Drosophila. However, the relevance of MSCI in shaping sex chromosome evolution is controversial. Here we examine...
Normalized testis developmental expression. Normalized expression (log2 based) from each testis developmental stage [32] for each Drosophila transcript with corresponding chromosomal location in xls format ('CG identification: transcript number').
Tissue-specific gene dataset. List of tissue-specific genes obtained through Flyatlas [33] expression (Methods and [32]). Each Excel sheet corresponds to one analyzed adult tissue: midgut, Malpighian tubules, accessory glands, salivary gland, head, ovary, and testis. Minimal fold between one tissue against all other tissues analyzed is shown for < 2...
Figure S1. Correlation between two expression datasets from Drosophila spermatogenesis [1,2]. X-axis represents the fold differences between bam mutant and wild type testis from [1]. Y-axis represents the fold differences between mitotic and meiotic expression of spermatogenesis from [2]. Forty-seven D. melanogaster genes analyzed by Metta and Schl...
We propose a model-based vulnerability index of the population from Uruguay
to vector-borne diseases. We have available measurements of a set of variables
in the census tract level of the 19 Departmental capitals of Uruguay. In
particular, we propose an index that combines different sources of information
via a set of micro-environmental indicators...
Particle learning provides a simulation-based approach to sequential Bayesian computation. To sample from a posterior distribution of interest we use an essential state vector together with a predictive distribution and propagation rule to build a resampling-sampling framework. Predictive inference and sequential Bayes factors are a direct by-produ...
Many situations in practice require appropriate specification of operating characteristics under extreme conditions. Typical
examples include environmental sciences where studies include extreme temperature, rainfall and river flow to name a few.
In these cases, the effect of geographic and climatological inputs are likely to play a relevant role....
In this review we explore issues of the sensitivity of Bayes estimates to the prior and form of the likelihood. With respect to the prior, we argue that non-Bayesian analyses also incorporate prior information, illustrate that the Bayes posterior mean and the frequentist maximum likelihood estimator are often asymptotically equivalent, review a sim...
This paper introduces a new class of spatio-temporal models for measurements belonging to the exponential family of distributions. In this new class, the spatial and temporal components are conditionally independently modeled via a latent factor analysis structure for the (canonical) transformation of the measurements mean function. The factor load...
This paper is concerned with extreme value density estimation. The generalized Pareto distribution (GPD) beyond a given threshold
is combined with a nonparametric estimation approach below the threshold. This semiparametric setup is shown to generalize
a few existing approaches and enables density estimation over the complete sample space. Estimati...
Transaction costs limit the supply of credit to small and medium-sized firms (SMEs). From a sample of 65,535 SME credit proposals submitted to a large Brazilian bank between January 2004 and September 2006, this research analyzes credit granting decisions. Results suggest that small firms face credit rationing and that low risk credit contracts wit...
We present particle-based algorithms for sequential filtering and parameter learning in state-space autoregressive (AR) models
with structured priors. Non-conjugate priors are specified on the AR coefficients at the system level by imposing uniform
or truncated normal priors on the moduli and wavelengths of the reciprocal roots of the AR characteri...
In this paper we review sequential Monte Carlo (SMC) methods, or particle filters (PF), with special emphasis on their potential applications in financial time series analysis and econometrics. We start with the well-known normal dynamic linear model, also known as the normal linear state space model, for which sequential state learning is available...
We propose a novel framework for estimating the time-varying covariation among stocks. Our work is inspired by asset pricing theory and associated developments in Financial Index Models. We work with a family of highly structured dynamic factor models that seek the extraction of the latent structure responsible for the cross-sectional covariation...
This paper develops efficient sequential learning methods for the estimation of general mixture models. The approach is distinguished from alternative particle filtering methods in two major ways. First, each iteration begins by resampling particles according to posterior predictive probability, leading to a more efficient set for propagation. Seco...
Particle learning (PL) provides state filtering, sequential parameter learning and smoothing in a general class of state space models. Our approach extends existing particle methods by incorporating the estimation of static parameters via a fully-adapted filter that utilizes conditional sufficient statistics for parameters and/or states as particle...
The analysis of temporal dependence in multivariate time series is considered. The dependence structure between the marginal series is modelled through the use of copulas which, unlike the correlation matrix, give a complete description of the joint distribution. The parameters of the copula function vary through time, following certain evolution e...
Extensive gene expression during meiosis is a hallmark of spermatogenesis. Although it was generally accepted that RNA transcription ceases during meiosis, recent observations suggest that some transcription occurs in postmeiosis. To further resolve this issue, we provide direct evidence for the de novo transcription of RNA during the postmeiotic p...
This paper presents novel Bayesian econometric methods for reducing high-dimensional data into low-dimensional aggregates using factor models to examine the effect of early-life conditions and education on health. These methods are applied to the 1970 British Cohort Study within a life course framework to analyze the effect of childhood cognitive a...
The modified mixture model with Markov switching volatility specification is introduced to analyze the relationship between stock return volatility and trading volume. We propose to construct an algorithm based on Markov chain Monte Carlo simulation methods to estimate all the parameters in the model using a Bayesian approach. The series of returns...
In the second half of 2009 the world experienced intense influenza activity. The new 2009 H1N1 virus, formerly known as the swine flu, found its way in only five months from Mexico to a majority of the countries on the planet. The fears of a large second-wave pandemic and its potential impact on health and economic outcomes have underlined t...
Pairwise plot for spermatogenic phase expression. Pairwise plots of gene product intensities (lower panel) and correlations (upper panel). Mit, Mei, and Pos correspond to the spermatogenic phases, with three replicates within each phase.
(0.91 MB TIF)
Gene product intensities during mitosis and meiosis for 2,599 testis biased gene products and their respective classification as over-, under-, or equally expressed in meiosis.
(0.63 MB XLS)
Bayesian estimation model for differential expression distributions. (A) Differential expression between meiosis and mitosis modeled through a mixture of two normal distributions (red and black lines). The first normal distribution (red) has a small variance, whereas the second (black) has a significantly larger variance. (B) Regions for differential e...
Spermatogenic gene expression for X-linked and autosomal-linked genes in meiosis versus post-meiosis comparisons. Proportions of genes and their respective Bayesian 95% Confidence Intervals in each of the following classes: (A) Genes over-expressed in meiosis (expression in meiosis greater than expression in post-meiosis); (B) Genes under-expressed...
Spermatogenic gene expression analysis for Bayesian Model A and for twofold change method. Scatter plots of intensities (log2) of X-linked (A) and autosomal-linked genes (B) in meiosis versus mitosis comparison. The twofold and Bayesian cutoffs are indicated by blue and pink lines, respectively. (C) and (D) Proportions of genes classified as over-,...
Distribution of fold expression differences. Boxplot of fold expression (mitotic/meiotic) for genes under expressed in meiosis. Note that the range of expression-fold differences is large.
(0.50 MB TIF)
Gene product intensities during mitosis and meiosis for 91 parental-retrogene pairs and their respective posterior probability of having complementary expression.
(0.12 MB XLS)
Expression intensities (log2) for all 18801 D. melanogaster gene products and their respective classification as over-, under-, or equally expressed in meiosis.
(5.51 MB XLS)
Supplementary methods, list of supplementary tables, references for supplementary methods.
(0.10 MB DOC)
In Drosophila, genes expressed in males tend to accumulate on autosomes and are underrepresented on the X chromosome. In particular, genes expressed in testis have been observed to frequently relocate from the X chromosome to the autosomes. The inactivation of X-linked genes during male meiosis (i.e., meiotic sex chromosome inactivation-MSCI) was f...
This paper develops particle learning methods for Generalized Dynamic Conditionally Linear Models (GDCLMs). These models are natural extensions of dynamic linear state space models (DLMs) that also allow for flexible mixture error distributions. Particle learning methods provide sequential state filtering and parameter estimates from the joint post...
A new class of space-time models derived from standard dynamic factor models is proposed. The temporal dependence is modeled by latent factors while the spatial dependence is modeled by the factor loadings. Factor analytic arguments are used to help identify temporal components that summarize most of the spatial variation of a given region. The te...
Copula functions and marginal distributions are combined to produce multivariate distributions. We show advantages of estimating
all parameters of these models using the Bayesian approach, which can be done with standard Markov chain Monte Carlo algorithms.
Deviance-based model selection criteria are also discussed when applied to copula models sin...
This paper describes a Bayesian approach to make inference for risk reserve processes with an unknown claim-size distribution. A flexible model based on mixtures of Erlang distributions is proposed to approximate the special features frequently observed in insurance claim sizes, such as long tails and heterogeneity. A Bayesian density estimation ap...
We generalize the factor stochastic volatility (FSV) model of Pitt and Shephard [1999. Time varying covariances: a factor stochastic volatility approach (with discussion). In: Bernardo, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M. (Eds.), Bayesian Statistics, vol. 6, Oxford University Press, London, pp. 547–570.] and Aguilar and West [2000. Bayes...
We propose a simulation-based algorithm for inference in stochastic volatility models with possible regime switching in which the regime state is governed by a first-order Markov process. Using auxiliary particle filters we developed a strategy to sequentially learn about states and parameters of the model. The methodology is tested against a synth...
We present a comprehensive review of the literature related to Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC) methods to estimate the parameters and the unobserved latent volatility in the class of stochastic volatility models using a Bayesian approach. Stochastic volatility models provide an alternative to the GARCH-type family f...
In this paper, we propose a Bayesian approach to model the level and the variance of (financial) time series by the special class of nonlinear time series models known as the logistic smooth transition autoregressive models, or simply the LSTAR models. We first propose a Markov Chain Monte Carlo (MCMC) algorithm for the levels of the time series an...
In this paper, we propose a fully Bayesian approach to the special class of nonlinear time-series models called the logistic smooth transition autoregressive (LSTAR) model. Initially, a Gibbs sampler is proposed for the LSTAR where the lag length, k, is kept fixed. Then, uncertainty about k is taken into account and a novel reversible jump Markov C...
We examine the class of extended generalized inverse Gaussian (EGIG) distributions. This class of distributions, which appeared briefly in a monograph by Jørgensen (1982), is more deeply and broadly studied in this paper. We start by deriving its probabilistic properties. Furthermore, we use the EGIG family in two popular and important statistical...
This chapter presents an overview of dynamic Bayesian models. Dynamic Bayesian modelling and forecasting of time series is one of the most important areas to have emerged in Statistics at the end of the last century. The chapter describes the class of DLM both to set the notation and to introduce important arguments of Bayesian dynamic models, such as mode...
Our main aims in this article are: (i) to model the means by which rainfall affects malaria incidence in the state of Pará, one of Brazil's largest states; and (ii) to check for similarities along the counties in the state. We use state of the art spatial–temporal models which can, we believe, anticipate various kinds of interactions and relations...
Data with asymmetric heavy tails can arise from mixture of data from multiple populations or processes. We propose a computer intensive procedure to fit by quasi-maximum likelihood a mixture model to a robustly standardized data set. The robust standardization of the data set results in well-defined tails which are modeled using extreme value theor...
Factor analysis has been one of the most powerful and flexible tools for assessment of multivariate dependence and codependence. Loosely speaking, it could be argued that the origin of its success rests in its very exploratory nature, where various kinds of data-relationships amongst the variables at study can be iteratively verified and/or refuted...
The aim of this paper is to analyze extremal events using Generalized Pareto Distributions (GPD), considering explicitly the uncertainty about the threshold. Current practice empirically determines such parameter and proceeds by estimating the GPD parameters based on data beyond it, discarding all the information available below such threshold. We...
We propose a class of longitudinal data models with random effects that generalizes currently used models in two important ways. First, the random-effects model is a flexible mixture of multivariate normals, accommodating population heterogeneity, outliers, and nonlinearity in the regression on subject-specific covariates. Second, the model include...
Bayesian inference in factor analytic models has received renewed attention in recent years, partly due to computational advances but also partly to applied focuses generating factor structures as exemplified by recent work in financial time series modeling. The focus of our current work is to investigate the commonly overlooked problem of prior s...
In this paper we adapt recently developed simulation-based sequential algorithms to two important stochastic volatility models. Firstly, we present a simulation-based algorithm for inference in stochastic volatility models with possible regime switching in which the regime state is governed by a first-order Markov process, the Markov switching stoc...
A simulation-based filter algorithm for sequential inference in the Markov switching stochastic volatility (MSSV) model is proposed. Our algorithm is based on both Pitt and Shephard's (1999) auxiliary particle filter (APF) and Liu and West's (2001) sequential density estimation ideas. The algorithm is tested against a synthetic time series that mim...
In this article we use factor models to describe a certain class of covariance structure for financial time series models. More specifically, we concentrate on situations where the factor variances are modeled by a multivariate stochastic volatility structure. We build on previous work by allowing the factor loadings, in the factor model structure,...
this paper. Let $y_{ij}$ denote the j-th measurement on the i-th patient, let $\theta_i$ denote a random effects vector for patient i, and let $x_i$ denote patient-specific covariates, including treatment dose. The usual structure of population PK/PD models is $p(y_{ij} \mid \theta_i)$, $p(\theta_i \mid x_i, \phi)$, $p(\phi)$ (1). Here $p(y_{ij} \mid \theta_i)$ is typically a parametric non-linear regression for...
The past decade has witnessed a series of (well accepted and defined) financial crisis periods in the world economy. Most of these events are country specific and eventually spread out across neighbor countries, with the concept of vicinity extrapolating the geographic maps and entering the contagion maps. Unfortunately, what contagion represent...
This paper analyzes the Brazilian industrial production index using a Bayesian methodology based on a new class of prior distributions for AR models that is extended to allow for heavy-tailed errors. The analysis shows how a unified approach is able to deal with model uncertainty, inference on latent structure, inference on unitary roots, forecasti...
Bayesian inference in factor analytic models has received renewed attention in recent years, partly due to computational advances but also partly to applied focuses generating factor structures as exemplified by recent work in financial time series modeling. The focus of our current work is on exploring questions of uncertainty about the number of late...
This paper considers the analysis of the Brazilian GNP and industrial production index using statistical tools recently developed for time series. The main purpose is the short-term forecast and structural decomposition of both series through an autoregressive model that allows, but does not impose, nonstationary behavior. A very strong point in this p...
A large number of nonlinear time series models can be more easily analyzed using traditional linear methods by considering explicitly the difference between parameters of interest, or just parameters, and hyperparameters. One example is the class of conditionally Gaussian dynamic linear models (DLM). Bayesian vector autoregressive (BVAR) models and...
Vector autoregressions (VAR) are extensively used to model economic time series. The large number of parameters is the main difficulty with VAR models, however. To overcome this, Litterman (1986) suggests using a Bayesian strategy to estimate the VAR, equation by equation, where, a priori, the lags have decreasing importance (known as Litterman Pri...
Forecasting the levels of vector autoregressive (VAR) log-transformed time series has been shown to be awkward by Ariño and Franses (1996), who realised that just exponentiating the forecasts was a naive procedure due to the occurrence of bias. They proposed a new manner to forecast untransformed VAR through correcting the log-transformed forecasts, and...
l; 6. $\mathrm{tr}(A) = \sum_{i=1}^{n} \lambda_i$, where the $\lambda_i$'s are the eigenvalues of A; 7. $\mathrm{tr}(\sum_{i=1}^{k} A_i) = \sum_{i=1}^{k} \mathrm{tr}(A_i)$; 8. $x'Ax = \mathrm{tr}(x'Ax) = \mathrm{tr}(Axx')$, for any vector x. [UFPR-97/Hedibert Freitas Lopes, Page 5] Vectorization: Let A be any $(m \times n)$ matrix, i.e. $A = (a_1, \cdots, a_n)$, where the $a_i$'s are m-dimensional vectors. Then, the vectorizatio...
We consider the analysis of the Brazilian industrial production index using recently developed tools for time series. The main purpose is the short-term forecast and structural decomposition of the series through an autoregressive model that allows, but does not impose, nonstationary behaviour. A very strong point is that we incorporate all...
Particle learning provides a simulation-based approach to sequential Bayesian computation. To sample from a posterior distribution of interest we use an essential state vector together with a predictive and propagation rule to build a resampling-sampling framework. Predictive inference and sequential Bayes factors are a direct by-product. Our appro...
In this paper we propose a fully Bayesian approach to the special class of nonlinear time series models called the logistic smooth transition autoregressive (LSTAR) model. Initially, a Gibbs sampler is proposed for the LSTAR where the lag length, k, is kept fixed. Then, uncertainty about k is taken into account and a novel reversible jump...
In this paper we describe the challenges of Bayesian computation in Finance. We show that empirical asset pricing leads to a nonlinear non-Gaussian state space model for the evolutions of asset returns and derivative prices. Bayesian methods extract latent state variables and estimate parameters by calculating the posterior distributions of interes...
We show that the large time-series variation of put option prices on the S&P 500 Futures Index can be understood through traditional asset-pricing theory. This phenomenon is explained through a general equilibrium model based on recursive preferences, linear production technology and time-varying production growth rates. We directly take our gener...