Article

Predicting Winner and Loser Stocks: A Classification Approach
Article
Modern investors face a high-dimensional prediction problem: thousands of observable variables are potentially relevant for forecasting. We reassess the conventional wisdom on market efficiency in light of this fact. In our equilibrium model, N assets have cash flows that are linear in J characteristics, with unknown coefficients. Risk-neutral Bayesian investors learn these coefficients and determine market prices. If J and N are comparable in size, returns are cross-sectionally predictable ex post. In-sample tests of market efficiency reject the no-predictability null with high probability, even though investors use information optimally in real time. In contrast, out-of-sample tests retain their economic meaning.
Article
We perform a comparative analysis of machine learning methods for the canonical problem of empirical asset pricing: measuring asset risk premiums. We demonstrate large economic gains to investors using machine learning forecasts, in some cases doubling the performance of leading regression-based strategies from the literature. We identify the best-performing methods (trees and neural networks) and trace their predictive gains to allowing nonlinear predictor interactions missed by other methods. All methods agree on the same set of dominant predictive signals, a set that includes variations on momentum, liquidity, and volatility. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.
Article
We evaluate the robustness of momentum returns in the US stock market over the period 1965–2012. We find that momentum profits have become insignificant since the late 1990s. Investigations of momentum profits in high- and low-volatility months address the concern that unprecedented levels of market volatility in this period rendered the momentum strategy unprofitable. Momentum profits remain insignificant in tests designed to control for seasonality, up or down market conditions, firm size, and liquidity. Past returns can no longer explain the cross-sectional variation in stock returns, even following up markets. Investigation of post-holding-period returns of momentum portfolios and risk-adjusted buy-and-hold returns of stocks in momentum portfolios suggests that investors possibly recognize that the momentum strategy is profitable and trade in ways that arbitrage away such profits. These findings are partially consistent with Schwert (2003, Handbook of the Economics of Finance, Elsevier, Amsterdam), which documents two primary reasons for the disappearance of an anomaly in the behavior of asset prices: first, sample selection bias, and second, uncovering of the anomaly by investors who trade in the assets to arbitrage it away. In further analyses we find evidence suggesting two other possible explanations for the declining momentum profits, besides uncovering of the anomaly by investors: a decline in the risk premium on a macroeconomic factor (the growth rate in industrial production in particular) and a relative improvement in market efficiency.
Article
This paper studies the properties and predictive ability of return forecasts from Fama-MacBeth cross-sectional regressions. These forecasts mimic how an investor could, in real time, combine many firm characteristics to get a composite estimate of a stock's expected return. Empirically, the forecasts exhibit significant cross-sectional variation and have strong predictive power for subsequent stock returns. For example, using ten-year rolling estimates of Fama-MacBeth slopes and a cross-sectional model with 15 firm characteristics (all based on low-frequency data), the expected-return estimates have a cross-sectional standard deviation of 0.90% monthly and a predictive slope for future monthly returns of 0.77, with a t-statistic of 10.17.
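The two-step procedure this abstract describes can be illustrated with a minimal pure-Python sketch using a single characteristic and hypothetical numbers; the actual paper combines 15 characteristics with ten-year rolling slope estimates.

```python
# Minimal Fama-MacBeth sketch (hypothetical data, one characteristic).
# Step 1: cross-sectional OLS of returns on the characteristic each month.
# Step 2: average the monthly slopes and compute a t-statistic.
from math import sqrt
from statistics import mean, stdev

def cross_sectional_slope(chars, rets):
    """OLS slope of one month's returns on a single characteristic."""
    cbar, rbar = mean(chars), mean(rets)
    cov = sum((c - cbar) * (r - rbar) for c, r in zip(chars, rets))
    var = sum((c - cbar) ** 2 for c in chars)
    return cov / var

def fama_macbeth(panel):
    """panel: list of (chars, rets) pairs, one per month.
    Returns the time-series average slope and its t-statistic."""
    slopes = [cross_sectional_slope(c, r) for c, r in panel]
    avg = mean(slopes)
    t_stat = avg / (stdev(slopes) / sqrt(len(slopes)))
    return avg, t_stat

# Three hypothetical months in which the characteristic's payoff varies:
panel = [
    ([1, 2, 3, 4], [0.10, 0.20, 0.30, 0.40]),  # monthly slope 0.1
    ([1, 2, 3, 4], [0.20, 0.40, 0.60, 0.80]),  # monthly slope 0.2
    ([1, 2, 3, 4], [0.30, 0.60, 0.90, 1.20]),  # monthly slope 0.3
]
avg_slope, t_stat = fama_macbeth(panel)
```

The averaged slope is the composite "characteristic payoff" an investor could estimate in real time and apply to current characteristics to forecast next month's cross-section.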
Article
We find that the determinants of the cross-section of expected stock returns are stable in their identity and influence from period to period and from country to country. Out-of-sample predictions of expected return are strongly and consistently accurate. Two findings distinguish this paper from others in the contemporary literature: First, stocks with higher expected and realized rates of return are unambiguously lower in risk than stocks with lower returns. Second, the important determinants of expected stock returns are strikingly common to the major equity markets of the world. Overall, the results seem to reveal a major failure in the Efficient Markets Hypothesis.
Article
Predictable variation in equity returns might reflect either (1) predictable changes in expected returns or (2) market inefficiency and stock price “overreaction.” These explanations can be distinguished by examining returns over short time intervals since systematic changes in fundamental valuation over intervals like a week should not occur in efficient markets. The evidence suggests that the “winners” and “losers” one week experience sizeable return reversals the next week in a way that reflects apparent arbitrage profits which persist after corrections for bid-ask spreads and plausible transactions costs. This probably reflects inefficiency in the market for liquidity around large price changes.
Article
Tests of financial asset pricing models may yield misleading inferences when properties of the data are used to construct the test statistics. In particular, such tests are often based on returns to portfolios of common stock, where portfolios are constructed by sorting on some empirically motivated characteristic of the securities such as market value of equity. Analytical calculations, Monte Carlo simulations, and two empirical examples show that the effects of this type of data snooping can be substantial.
Article
This paper documents that strategies that buy stocks that have performed well in the past and sell stocks that have performed poorly in the past generate significant positive returns over three- to twelve-month holding periods. The authors find that the profitability of these strategies is not due to their systematic risk or to delayed stock price reactions to common factors. However, part of the abnormal returns generated in the first year after portfolio formation dissipates in the following two years. A similar pattern of returns around the earnings announcements of past winners and losers is also documented. Copyright 1993 by American Finance Association.
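The winner/loser sort underlying this strategy can be sketched in a few lines of Python; the tickers and formation-period returns below are made up for illustration, and real implementations use decile or quintile breakpoints over a large cross-section.

```python
# Stylized momentum sort: rank stocks on past formation-period returns,
# go long the top group ("winners") and short the bottom group ("losers").

def momentum_portfolio(past_returns, n_groups=5):
    """past_returns: dict of ticker -> formation-period return.
    Returns (winners, losers) ticker lists from the extreme groups."""
    ranked = sorted(past_returns, key=past_returns.get)  # worst -> best
    group_size = max(1, len(ranked) // n_groups)
    losers = ranked[:group_size]
    winners = ranked[-group_size:]
    return winners, losers

# Hypothetical 6-month formation returns for ten stocks:
past = {"A": 0.30, "B": -0.12, "C": 0.05, "D": 0.18, "E": -0.25,
        "F": 0.10, "G": -0.02, "H": 0.22, "I": 0.01, "J": -0.08}
winners, losers = momentum_portfolio(past, n_groups=5)
```

The zero-cost strategy return over the holding period is then the average return of `winners` minus the average return of `losers`.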
Article
This paper presents new empirical evidence of predictability of individual stock returns. The negative first-order serial correlation in monthly stock returns is highly significant. Furthermore, significant positive serial correlation is found at longer lags, and the twelve-month serial correlation is particularly strong. Using the observed systematic behavior of stock returns, one-step-ahead return forecasts are made and ten portfolios are formed from the forecasts. The difference between the abnormal returns on the extreme decile portfolios over the period 1934-87 is 2.49 percent per month. Copyright 1990 by American Finance Association.
Article
We extend the Fama–MacBeth regression framework for cross-sectional return prediction to incorporate big data and machine learning. Our extension involves a three-step procedure for generating return forecasts based on Fama–MacBeth regressions with regularization and predictor selection as well as forecast combination and encompassing. As a by-product, it provides estimates of characteristic payoffs. We also develop three performance measures for assessing cross-sectional return forecasts, including a generalization of the popular time-series out-of-sample R2 statistic to the cross section. Applying our extension to over 200 firm characteristics, our cross-sectional return forecasts significantly improve out-of-sample predictive accuracy and provide substantial economic value to investors. Overall, our results suggest that a relatively large number of characteristics matter for determining cross-sectional expected returns. Our new method is straightforward to implement and interpret, and it performs well in our application.
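The abstract does not spell out its cross-sectional out-of-sample R² statistic; a common benchmark-relative form from the forecasting literature, sketched here under that assumption with hypothetical numbers, is one minus the ratio of the model's squared errors to a naive benchmark's.

```python
def oos_r2(actual, forecast, benchmark):
    """Out-of-sample R^2 of a forecast relative to a benchmark forecast:
    1 - SSE(model) / SSE(benchmark); positive means the model beats
    the benchmark in squared-error terms."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1.0 - sse_model / sse_bench

# Hypothetical realized returns, model forecasts, and a naive
# benchmark that predicts the same value for every stock:
r2 = oos_r2([1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [2.0, 2.0, 2.0])
```

In a cross-sectional application, the sums run over stocks within each month before aggregating across months; the time-series version instead runs over periods for a single return series.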
Article
Much of the extant literature predicts market returns with “simple” models that use only a few parameters. Contrary to conventional wisdom, we theoretically prove that simple models severely understate return predictability compared to “complex” models in which the number of parameters exceeds the number of observations. We empirically document the virtue of complexity in U.S. equity market return prediction. Our findings establish the rationale for modeling expected returns through machine learning.
Article
We comprehensively investigate the robustness of well-known factor models to altered factor formation breakpoints. Deviating from the standard 30th and 70th percentile selection, we use an extensive set of anomaly test portfolios to uncover two main findings: First, there is a trade-off between specification and diversification. More centered breakpoints tend to result in less (idiosyncratic) risk. More extreme sorts lead to greater exposure to the underlying anomalies and thus to higher average returns. Second, the models are robust to varying degrees. Hou et al.’s model [2015, Digesting Anomalies: An Investment Approach, Review of Financial Studies 28, 650–705] is much more sensitive to changes in breakpoints than the Fama–French models.
Article
Several papers argue that financial economics faces a replication crisis because the majority of studies cannot be replicated or are the result of multiple testing of too many factors. We develop and estimate a Bayesian model of factor replication that leads to different conclusions. The majority of asset pricing factors (i) can be replicated; (ii) can be clustered into 13 themes, the majority of which are significant parts of the tangency portfolio; (iii) work out‐of‐sample in a new large data set covering 93 countries; and (iv) have evidence that is strengthened (not weakened) by the large number of observed factors.
Article
Factors display strong cross-sectional momentum that subsumes momentum in industries and other portfolio characteristics. The profits of all these momentum strategies—based on factors, industries, and other characteristics—significantly correlate with each other and therefore likely emanate from the same source. If factors display momentum, so will any set of portfolios with cross-sectional variation in factor loadings. Consistent with factors being at the root of momentum, we find that momentum in industry-neutral factors explains industry momentum, but industry momentum explains none of the factor momentum. Cross-sectional factor momentum concentrates in the first few highest-eigenvalue factors and is distinct from time-series factor momentum. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.
Article
Momentum in individual stock returns relates to momentum in factor returns. Most factors are positively autocorrelated: the average factor earns a monthly return of six basis points following a year of losses and 51 basis points following a positive year. We find that factor momentum concentrates in factors that explain more of the cross section of returns and that it is not incidental to individual stock momentum: momentum‐neutral factors display more momentum. Momentum found in high‐eigenvalue PC factors subsumes most forms of individual stock momentum. Our results suggest that momentum is not a distinct risk factor—it times other factors.
Article
We propose a nonparametric method to study which characteristics provide incremental information for the cross-section of expected returns. We use the adaptive group LASSO to select characteristics and to estimate how selected characteristics affect expected returns nonparametrically. Our method can handle a large number of characteristics and allows for a flexible functional form. Our implementation is insensitive to outliers. Many of the previously identified return predictors don’t provide incremental information for expected returns, and nonlinearities are important. We study our method’s properties in simulations and find large improvements in both model selection and prediction compared to alternative selection methods. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.
Article
A common practice in the finance literature is to create characteristic portfolios by sorting on characteristics associated with average returns. We show that the resultant portfolios are likely to capture not only the priced risk associated with the characteristic but also unpriced risk. We develop a procedure to remove this unpriced risk using covariance information estimated from past returns. We apply our methodology to the five Fama-French characteristic portfolios. The squared Sharpe ratio of the optimal combination of the resultant characteristic-efficient portfolios is 2.13, compared with 1.17 for the original characteristic portfolios.
Article
We construct a robust stochastic discount factor (SDF) summarizing the joint explanatory power of a large number of cross-sectional stock return predictors. Our method achieves robust out-of-sample performance in this high-dimensional setting by imposing an economically motivated prior on SDF coefficients that shrinks contributions of low-variance principal components of the candidate characteristics-based factors. We find that characteristics-sparse SDFs formed from a few such factors—e.g., the four- or five-factor models in the recent literature—cannot adequately summarize the cross-section of expected stock returns. However, an SDF formed from a small number of principal components performs well.
Book
This book provides a general and comprehensible overview of imbalanced learning. It contains a formal description of the problem and focuses on its main features and the most relevant proposed solutions. Additionally, it considers the different scenarios in data science for which imbalanced classification can create a real challenge. The book stresses the gap with standard classification tasks by reviewing the case studies and ad hoc performance metrics that are applied in this area. It also covers the different approaches that have traditionally been applied to address the binary skewed class distribution. Specifically, it reviews cost-sensitive learning, data-level preprocessing methods, and algorithm-level solutions, also taking into account ensemble-learning solutions that embed any of the former alternatives. Furthermore, it focuses on the extension of the problem to multi-class settings, where the former classical methods can no longer be applied in a straightforward way. The book also examines the intrinsic characteristics of the data that, added to the uneven class distribution, truly hinder the performance of classification algorithms in this scenario. Some notes on data reduction are then provided in order to understand the advantages of this type of approach. Finally, the book introduces some novel areas of study that are attracting deeper attention on the imbalanced data issue, specifically the classification of data streams, non-classical classification problems, and the scalability challenges of Big Data. Examples of software libraries and modules to address imbalanced classification are provided. This book is highly suitable for technical professionals, senior undergraduate and graduate students in the areas of data science, computer science and engineering. It will also be useful for scientists and researchers seeking insight into current developments in this area of study, as well as future research directions.
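A concrete instance of the data-level preprocessing methods the book reviews is random oversampling: duplicating minority-class observations until the class distribution is balanced. A minimal sketch on hypothetical toy data (real work would use a dedicated library and guard against the overfitting that duplication can induce):

```python
import random

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class
    matches the size of the largest class."""
    rng = random.Random(seed)  # seeded for reproducibility
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        resampled = rows + [rng.choice(rows) for _ in range(target - len(rows))]
        for xi in resampled:
            X_out.append(xi)
            y_out.append(label)
    return X_out, y_out

# Toy 4-vs-1 imbalanced data set:
X = [[1], [2], [3], [4], [5]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
```

This matters for winner/loser classification because extreme-return classes are typically much smaller than the middle of the cross-section.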
Article
We take up Cochrane's (2011) challenge to identify the firm characteristics that provide independent information about average U.S. monthly stock returns by simultaneously including 94 characteristics in Fama-MacBeth regressions that avoid overweighting microcaps and adjust for data-snooping bias. We find that while 12 characteristics are reliably independent determinants in non-microcap stocks from 1980 to 2014 as a whole, return predictability sharply fell in 2003 such that just two characteristics have been independent determinants since then. Outside of microcaps, the hedge returns to exploiting characteristics-based predictability also have been insignificantly different from zero since 2003. (JEL G12, G14)
Article
Given the competition for top journal space, there is an incentive to produce “significant” results. With the combination of unreported tests, lack of adjustment for multiple tests, and direct and indirect p-hacking, many of the results being published will fail to hold up in the future. In addition, there are basic issues with the interpretation of statistical significance. Increasing thresholds may be necessary, but still may not be sufficient: if the effect being studied is rare, even t > 3 will produce a large number of false positives. Here I explore the meaning and limitations of a p-value. I offer a simple alternative (the minimum Bayes factor). I present guidelines for a robust, transparent research culture in financial economics. Finally, I offer some thoughts on the importance of risk-taking (from the perspective of authors and editors) to advance our field. Summary: Empirical research in financial economics relies too much on p-values, which are poorly understood in the first place. Journals want to publish papers with positive results, and this incentivizes researchers to engage in data mining and “p-hacking.” The outcome will likely be an embarrassing number of false positives: effects that will not be repeated in the future. The minimum Bayes factor (which is a function of the p-value) combined with prior odds provides a simple solution that can be reported alongside the usual p-value. The Bayesianized p-value answers the question: what is the probability that the null is true? The same technique can be used to answer: what threshold of t-statistic do I need so that there is only a 5% chance that the null is true? The threshold depends on the economic plausibility of the hypothesis.
Article
Hundreds of papers and factors attempt to explain the cross-section of expected returns. Given this extensive data mining, it does not make sense to use the usual criteria for establishing significance. Which hurdle should be used for current research? Our paper introduces a new multiple testing framework and provides historical cutoffs from the first empirical tests in 1967 to today. A new factor needs to clear a much higher hurdle, with a t-statistic greater than 3.0. We argue that most claimed research findings in financial economics are likely false.
Article
We consider three sets of phenomena that feature prominently in the financial economics literature: (1) conditional mean dependence (or lack thereof) in asset returns, (2) dependence (and hence forecastability) in asset return signs, and (3) dependence (and hence forecastability) in asset return volatilities. We show that they are very much interrelated and explore the relationships in detail. Among other things, we show that (1) volatility dependence produces sign dependence, so long as expected returns are nonzero, so that one should expect sign dependence, given the overwhelming evidence of volatility dependence; (2) it is statistically possible to have sign dependence without conditional mean dependence; (3) sign dependence is not likely to be found via analysis of sign autocorrelations, runs tests, or traditional market timing tests because of the special nonlinear nature of sign dependence, so that traditional market timing tests are best viewed as tests for sign dependence arising from variation in expected returns rather than from variation in volatility or higher moments; (4) sign dependence is not likely to be found in very high-frequency (e.g., daily) or very low-frequency (e.g., annual) returns; instead, it is more likely to be found at intermediate return horizons; and (5) the link between volatility dependence and sign dependence remains intact in conditionally non-Gaussian environments, for example, with time-varying conditional skewness and/or kurtosis.
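The first point, that volatility dependence produces sign dependence when expected returns are nonzero, is easy to see under a Gaussian assumption: if r ~ N(mu, sigma^2), then P(r > 0) = Phi(mu/sigma), so a fixed nonzero mean combined with time-varying volatility makes the sign probability time-varying. A small sketch with illustrative numbers only:

```python
from math import erf, sqrt

def prob_positive(mu, sigma):
    """P(r > 0) for r ~ N(mu, sigma^2), i.e. Phi(mu / sigma),
    using the standard normal CDF written via erf."""
    return 0.5 * (1.0 + erf((mu / sigma) / sqrt(2.0)))

# Same small positive mean, two volatility regimes (hypothetical values):
p_calm = prob_positive(0.005, 0.03)       # low-volatility month
p_turbulent = prob_positive(0.005, 0.08)  # high-volatility month
```

With mu fixed, lower volatility pushes the up-probability further from one half, which is exactly the forecastable sign variation the abstract describes; at mu = 0 the probability is one half regardless of sigma, so sign dependence vanishes.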
Article
Because the state of the equity market is latent, several methods have been proposed to identify past and current states of the market and forecast future ones. These methods encompass semi-parametric rule-based methods and parametric Markov switching models. We compare the mean-variance utilities that result when a risk-averse agent uses the predictions of the different methods in an investment decision. Our application of this framework to the S&P 500 shows that rule-based methods are preferable for (in-sample) identification of the state of the market, but Markov switching models for (out-of-sample) forecasting. In-sample, only the mean return of the market index matters, which rule-based methods exactly capture. Because Markov switching models use both the mean and the variance to infer the state, they produce superior forecasts and lead to significantly better out-of-sample performance than rule-based methods. We conclude that the variance is a crucial ingredient for forecasting the market state.
Article
Variables with strong marginal explanatory power in cross-section asset pricing regressions typically show less power to produce increments to average portfolio returns, for two reasons. (1) Adding an explanatory variable can attenuate the slopes in a regression. (2) Adding a variable with marginal explanatory power always attenuates the values of other explanatory variables in the extremes of a regression’s fitted values. Without a restriction on portfolio weights, the maximum Sharpe ratios in the GRS statistic of Gibbons et al. (1989) provide little information about an incremental variable’s impact on the portfolio opportunity set.
Article
In this article we introduce a decomposition of the joint distribution of price changes of assets recorded trade-by-trade. Our decomposition means that we can model the dynamics of price changes using quite simple and interpretable models which are easily extended in a great number of directions, including using durations and volume as explanatory variables. Thus we provide an econometric basis for empirical work on market microstructure using time series of transaction data. We use maximum likelihood estimation and testing methods to assess the fit of the model to one year of IBM stock price data taken from the New York Stock Exchange.
Article
A five-factor model directed at capturing the size, value, profitability, and investment patterns in average stock returns performs better than the three-factor model of Fama and French (FF, 1993). The five-factor model's main problem is its failure to capture the low average returns on small stocks whose returns behave like those of firms that invest a lot despite low profitability. The model's performance is not sensitive to the way its factors are defined. With the addition of profitability and investment factors, the value factor of the FF three-factor model becomes redundant for describing average returns in the sample we examine.
Article
We examine whether the recent regime of increased liquidity and trading activity is associated with attenuation of prominent equity return anomalies due to increased arbitrage. We find that the majority of the anomalies have attenuated, and the average returns from a portfolio strategy based on prominent anomalies have approximately halved after decimalization. We provide evidence that hedge fund assets under management, short interest and aggregate share turnover have led to the decline in anomaly-based trading strategy profits in recent years. Overall, our work indicates that policies to stimulate liquidity and ameliorate trading costs improve capital market efficiency.
Article
Using recent US financial market data, this study tests whether the relative strength trading strategy was profitable in two different sample periods (1990 to 2012 and 1965 to 2012). In contrast to previous findings, our study finds no clear evidence of a profitable zero-cost buy-and-hold strategy for 3- to 12-month periods for 1990 to 2012. However, we find a few profitable zero-cost strategies for the period 1965 to 2012, but the returns are much smaller than previously reported. These findings may imply a gain in market efficiency in the US financial markets in the recent period.
Article
Despite the voluminous empirical research on the potential predictability of stock returns, much less attention has been paid to the predictability of bear and bull stock markets. In this study, the aim is to predict U.S. bear and bull stock markets with dynamic binary time series models. Based on the analysis of the monthly U.S. data set, bear and bull markets are predictable in and out of sample. In particular, substantial additional predictive power can be obtained by allowing for a dynamic structure in the binary response model. Probability forecasts of the state of the stock market can also be utilized to obtain optimal asset allocation decisions between stocks and bonds. It turns out that the dynamic probit models yield much higher portfolio returns than the buy-and-hold trading strategy in a small-scale market timing experiment.
Article
Discount rate variation is the central organizing question of current asset pricing research. I survey facts, theories and applications. We thought returns were uncorrelated over time, so variation in price-dividend ratios was due to variation in expected cashflows. Now it seems all price-dividend variation corresponds to discount-rate variation. We thought that the cross-section of expected returns came from the CAPM. Now we have a zoo of new factors. I categorize discount-rate theories based on central ingredients and data sources. Discount-rate variation continues to change finance applications, including portfolio theory, accounting, cost of capital, capital structure, compensation, and macroeconomics.
Article
Several empirical studies have documented that the signs of excess stock returns are, to some extent, predictable. In this paper, we consider the predictive ability of the binary dependent dynamic probit model in predicting the direction of monthly excess stock returns. The recession forecast obtained from the model for a binary recession indicator appears to be the most useful predictive variable, and once it is employed, the sign of the excess return is predictable in-sample. The new dynamic “error correction” probit model proposed in the paper yields better out-of-sample sign forecasts, with the resulting average trading returns being higher than those of either the buy-and-hold strategy or trading rules based on ARMAX models.
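The trading comparison in this abstract can be illustrated with a minimal market-timing rule driven by sign-probability forecasts. All numbers below are hypothetical and the rule is deliberately simple; it is not the paper's dynamic probit model.

```python
def timing_return(rets, prob_up, rf=0.0):
    """Cumulative return of a timing rule that holds the index when
    the forecast probability of an up month exceeds 0.5, and earns
    the riskless rate rf otherwise."""
    wealth = 1.0
    for r, p in zip(rets, prob_up):
        wealth *= 1.0 + (r if p > 0.5 else rf)
    return wealth - 1.0

# Hypothetical monthly returns and sign-probability forecasts:
rets = [0.10, -0.20, 0.10]
prob_up = [0.6, 0.3, 0.7]
timed = timing_return(rets, prob_up)           # steps aside in month 2
buy_and_hold = timing_return(rets, [1.0] * 3)  # always invested
```

A correct sign forecast lets the rule sidestep the losing month, which is the mechanism behind the higher average trading returns the paper reports for its probit forecasts.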
Article
It has become standard practice in the cross-sectional asset pricing literature to evaluate models based on how well they explain average returns on size-B/M portfolios, something many models seem to do remarkably well. In this paper, we review and critique the empirical methods used in the literature. We argue that asset pricing tests are often highly misleading, in the sense that apparently strong explanatory power (high cross-sectional R2s and small pricing errors) can provide quite weak support for a model. We offer a number of suggestions for improving empirical tests and evidence that several proposed models do not work as well as originally advertised.
Article
This paper presents a new pattern in the cross-section of expected stock returns. Stocks tend to have relatively high (or low) returns every year in the same calendar month. We recognize the annual cross-sectional autocorrelation pattern documented in Jegadeesh [1990. Evidence of predictable behavior of security returns. Journal of Finance 45, 881–898] at lags of 12, 24, and 36 months as part of a general pattern that lasts up to 20 annual lags, superimposed on the general momentum/reversal patterns. This pattern explains an economically and statistically significant magnitude of the cross-sectional variation in average stock returns. Volume and volatility exhibit similar seasonal patterns but they do not explain the seasonality in returns. The pattern is independent of size, industry, earnings announcements, dividends, and fiscal year. The results are consistent with the existence of a persistent seasonal effect in stock returns.
Article
Valuation theory says that expected stock returns are related to three variables: the book-to-market equity ratio (Bt/Mt), expected profitability, and expected investment. Given Bt/Mt and expected profitability, higher expected rates of investment imply lower expected returns. But controlling for the other two variables, more profitable firms have higher expected returns, as do firms with higher Bt/Mt. These predictions are confirmed in our tests. (c) 2006 Elsevier B.V. All rights reserved.
Article
Bull and bear markets are a common way of describing cycles in equity prices. To fully describe such cycles one would need to know the data generating process (DGP) for equity prices. We begin with a definition of bull and bear markets and use an algorithm based on it to sort a given time series of equity prices into periods that can be designated as bull and bear markets. The rule to do this is then studied analytically and it is shown that bull and bear market characteristics depend upon the DGP for capital gains. By simulation methods we examine a number of DGPs that are known to fit the data quite well: random walks, GARCH models, and models with duration dependence. We find that a pure random walk provides as good an explanation of bull and bear markets as the more complex statistical models. In the final section of the paper we look at some asset pricing models that appear in the literature from the viewpoint of their success in producing bull and bear markets which resemble those in the data. Copyright © 2002 John Wiley & Sons, Ltd.
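Dating rules of the kind the abstract mentions can be sketched with a simple peak/trough threshold rule. The 20% threshold and the bull starting state below are common conventions, not necessarily the exact algorithm the paper studies.

```python
def label_bull_bear(prices, threshold=0.20):
    """Label each period 'bull' or 'bear': switch to bear after a drop
    of `threshold` from the running peak, and back to bull after a rise
    of `threshold` from the running trough. Assumes a bull start."""
    state = "bull"
    peak = trough = prices[0]
    labels = []
    for p in prices:
        if state == "bull":
            peak = max(peak, p)
            if p <= peak * (1 - threshold):
                state, trough = "bear", p
        else:
            trough = min(trough, p)
            if p >= trough * (1 + threshold):
                state, peak = "bull", p
        labels.append(state)
    return labels

# Hypothetical index levels: rally, >20% slide, recovery.
prices = [100, 110, 85, 80, 100, 120]
labels = label_bull_bear(prices)
```

Note that such rules identify regime switches only after the threshold move has occurred, which is why the comparison paper above finds them better suited to in-sample identification than to forecasting.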
Article
Empirical evidence suggests that many macroeconomic and financial time series are subject to occasional structural breaks. In this paper we present analytical results quantifying the effects of such breaks on the correlation between the forecast and the realization and on the ability to forecast the sign or direction of a time-series that is subject to breaks. Our results suggest that it can be very costly to ignore breaks. Forecasting approaches that condition on the most recent break are likely to perform better over unconditional approaches that use expanding or rolling estimation windows provided that the break is reasonably large.
Article
In a previous paper, we found systematic price reversals for stocks that experience extreme long‐term gains or losses: Past losers significantly outperform past winners. We interpreted this finding as consistent with the behavioral hypothesis of investor overreaction. In this follow‐up paper, additional evidence is reported that supports the overreaction hypothesis and that is inconsistent with two alternative hypotheses based on firm size and differences in risk, as measured by CAPM‐betas. The seasonal pattern of returns is also examined. Excess returns in January are related to both short‐term and long‐term past performance, as well as to the previous year market return.
Article
The primary aim of the paper is to place current methodological discussions in macroeconometric modeling contrasting the ‘theory first’ versus the ‘data first’ perspectives in the context of a broader methodological framework with a view to constructively appraise them. In particular, the paper focuses on Colander’s argument in his paper “Economists, Incentives, Judgement, and the European CVAR Approach to Macroeconometrics” contrasting two different perspectives in Europe and the US that are currently dominating empirical macroeconometric modeling and delves deeper into their methodological/philosophical underpinnings. It is argued that the key to establishing a constructive dialogue between them is provided by a better understanding of the role of data in modern statistical inference, and how that relates to the centuries old issue of the realisticness of economic theories.
How stable is the predictive power of the yield curve? Evidence from Germany and the United States
  • A Estrella
Does academic research destroy stock return predictability?
  • R D McLean
Evidence of predictable behavior of security returns
  • N Jegadeesh
Forecasting stock indices: a comparison of classification and level estimation models
  • T Leung
Regime changes and financial markets
  • A Ang