Mostly Harmless Econometrics: An Empiricist's Companion

Authors: Joshua Angrist and Jörn-Steffen Pischke

Abstract

The core methods in today's econometric toolkit are linear regression for statistical control, instrumental variables methods for the analysis of natural experiments, and differences-in-differences methods that exploit policy changes. In the modern experimentalist paradigm, these techniques address clear causal questions such as: Do smaller classes increase learning? Should wife batterers be arrested? How much does education raise wages? Mostly Harmless Econometrics shows how the basic tools of applied econometrics allow the data to speak. In addition to econometric essentials, Mostly Harmless Econometrics covers important new extensions--regression-discontinuity designs and quantile regression--as well as how to get standard errors right. Joshua Angrist and Jörn-Steffen Pischke explain why fancier econometric techniques are typically unnecessary and even dangerous. The applied econometric methods emphasized in this book are easy to use and relevant for many areas of contemporary social science.

An irreverent review of econometric essentials
A focus on tools that applied researchers use most
Chapters on regression-discontinuity designs, quantile regression, and standard errors
Many empirical examples
A clear and concise resource with wide applications
... Thus, we conclude that the subpopulation identified by the IVs consists of the compliers. To explain why we require monotonicity, we can rewrite Eq. 1 as [24] ...
... On the other hand, when unaccounted for treatment effect heterogeneity exists, OLS will generate a marginal treatment effect estimate that is implicitly weighted by the covariance structure in the observed data sample as opposed to explicit weighting in IPTW. [20,24] One advantage of the propensity score is that data-driven selection of the confounders to model treatment assignment is done separately from the fitting of Eq. 3. In contrast, adding and removing confounders in OLS in a data-driven fashion will also affect the estimand, estimate, and corresponding inference for the treatment effect. Therefore, IPTW is able to control inflation in Type I error from repeated testing of the treatment effect coefficient as a result of fitting several models. ...
Preprint
To estimate causal effects, analysts performing observational studies in health settings utilize several strategies to mitigate bias due to confounding by indication. There are two broad classes of approaches for these purposes: use of confounders and instrumental variables (IVs). Because such approaches are largely characterized by untestable assumptions, analysts must operate under an indefinite paradigm that these methods will work imperfectly. In this tutorial, we formalize a set of general principles and heuristics for estimating causal effects in the two approaches when the assumptions are potentially violated. This crucially requires reframing the process of observational studies as hypothesizing potential scenarios where the estimates from one approach are less inconsistent than the other. While most of our discussion of methodology centers around the linear setting, we touch upon complexities in non-linear settings and flexible procedures such as targeted minimum loss-based estimation (TMLE) and double machine learning (DML). To demonstrate the application of our principles, we investigate the use of donepezil off-label for mild cognitive impairment (MCI). We compare and contrast results from confounder and IV methods, traditional and flexible, within our analysis and to a similar observational study and clinical trial.
... OLS is analytically and interpretively straightforward, robust to misspecification in large samples, and in most situations, tends to provide substantively similar results to models that are more appropriate for the distributional characteristics of the data (ie, count models) [26,27]. However, for some of the analyses conducted in this study, we find that results are sensitive to model choice. As such, we present our findings using both OLS as our pre-specified, benchmark approach and the statistical model that better fits the distributional characteristics of the data (ie, negative binomial or logistic regression), based on diagnostics and log-likelihood statistics. ...
... To increase the precision of our estimates and account for blocking procedures, we statistically adjusted for covariates pre-specified in our registered analysis plan [26]. We included the following individual-level covariates (measured at baseline): age, gender, race, Hispanic/Latino, and the following blocking variable: school. Baseline imputation indicators and the baseline measure of the outcome variable were also included. ...
Article
Background: Although positive youth development (PYD) programs have demonstrated effectiveness in improving adolescent reproductive health outcomes, there is a lack of evidence on effective school-based interventions designed especially for high school settings. This study examined the efficacy of Peer Group Connection (PGC-HS), a school-based PYD program, in improving sexual health outcomes for high school participants. Methods: A total of 1523 ninth-grade students at 18 schools were randomly assigned to be offered PGC-HS or a classes-as-usual control condition during the 2016 to 2017 and 2017 to 2018 school years. Impacts were assessed on 3 confirmatory and 6 exploratory outcomes via self-reported participant questionnaire data collected at the beginning of 10th grade. Results: Although the offer of PGC-HS had no statistically detectable effect on confirmatory behavioral outcomes (sexual initiation, frequency of sex, and number of sexual partners) at 10th grade follow-up, causal impact estimates indicate that PGC-HS participants were less likely than control participants to ever have had vaginal sex. PGC-HS participants also scored higher on decision-making skills and perceived peer connectedness. Conclusions: Results suggest that by building social and emotional skills and helping students form supportive peer relationships, PGC-HS may encourage students to make healthier choices and avoid risky behaviors during a critical period in high school, thus reducing the risk of pregnancy.
... To investigate this difference more profoundly, we explore the levels at which the gains of education were realised through FPE for our full sample, and for the poor and wealthy subgroups, separately. Given that our IV-estimates resemble local average treatment effects (LATE), this exercise informs us about the sub-population of "compliers" driving our results (see Angrist and Pischke 2009). Figure 3 depicts these shifts visually, plotting the difference in the conditional probability (on the y-axis) of having completed at least a given school grade (on the x-axis) for women that were just able to benefit from the policy change (13 years and younger), compared to women who were too old to benefit from the removal of fees (14 years and older). ...
... Note that these changes are conditional CDF changes, including the usual covariates and weighting with the sample weights. Further, the changes of the three different samples are normalised by their respective first-stages as outlined in Angrist and Pischke (2009), giving us the contribution (weight) of the respective schooling level change towards the average causal response over all educational levels. ...
Conference Paper
This article investigates women's returns to schooling by exploiting Burundi's free primary education policy (FPE) of 2005 as a natural experiment. Credibly exogenous variation in education is identified through a fuzzy regression discontinuity design (RDD). Our results show that while educational attainment was positively influenced by Burundi's FPE for women situated at all wealth levels, the relevant downstream effects of schooling - measured by fertility, literacy and employment - reveal heterogeneous treatment effects by wealth. Poorer women profit in terms of higher literacy, employment as well as reduced fertility through policy induced education, while there are almost no effects of additional education for non-poor women. Our findings help in evaluating the generalisability of the nexus between women's education and fertility as well as associated factors.
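The excerpts above interpret the IV estimates as local average treatment effects (LATE) for compliers. As a minimal sketch of that idea, assuming a binary instrument and no covariates, the Wald estimator divides the reduced-form difference in mean outcomes by the first-stage difference in treatment take-up. The function name `wald_estimate` and the toy data are hypothetical illustrations, not the paper's fuzzy regression discontinuity specification.

```python
from statistics import mean


def wald_estimate(z, d, y):
    """Wald (ratio) IV estimate for a binary instrument.

    z: instrument values (0 or 1)
    d: treatment actually received
    y: observed outcome
    """
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]

    reduced_form = mean(y1) - mean(y0)  # effect of instrument on outcome
    first_stage = mean(d1) - mean(d0)   # effect of instrument on take-up
    if first_stage == 0:
        raise ValueError("zero first stage: instrument does not shift treatment")
    return reduced_form / first_stage


# Hypothetical toy data: half of the units with z = 1 take the treatment
# (compliers), and each treated unit's outcome is 2 higher.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 0, 1, 1, 0, 0]
y = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 1.0, 1.0]

late = wald_estimate(z, d, y)
print(late)
```

In this toy sample the instrument raises the mean outcome by 1 and take-up by 0.5, so the ratio recovers the effect of 2 among the units whose treatment status the instrument changes.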
... Several of our variables are highly skewed as we saw in Figures 2, 3, & 5 which is problematic for regression models since they assume the dependent variable to be distributed according to a Gaussian distribution [Angrist and Pischke, 2008]. One common solution for this problem, and one that we also adopt in this paper, is to log-transform the skewed variables x as log(x + 1) which makes them look more Gaussian. ...
Preprint
The Internet has made it easier for social scientists to study human behavior by analyzing their interactions on social media platforms. Many of these platforms characterize conversations among users via threads, which induce a tree-like structure. The structural properties of these discussion trees, such as their width, depth, and size, can be used to make inferences regarding user discussion patterns and conversation dynamics. In this paper, we seek to understand the structure of these online discussions on Reddit. We characterize the structure of these discussions via a set of global and local discussion-tree properties. The global features constitute information regarding the community/subreddit of a given post, whereas the local features are comprised of the properties of the post itself. We perform various statistical analyses on a year's worth of Reddit data containing a quarter of a million posts and several million comments. These analyses allow us to tease apart the relative contribution of a discussion post's global and local properties and characterize the importance of specific individual features in determining the discussions' structural patterns. Our results indicate that both local and global features explain a significant amount of structural variation. Local features are collectively more important as they explain significantly more variation in the discussion trees' structural properties than global features. However, there is significant heterogeneity in the impact of the various features. Several global features, e.g., the topic, age, popularity, and the redundancy of content in a subreddit, also play a crucial role in understanding the specific properties of discussion trees.
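The citing excerpt above describes transforming skewed variables as log(x + 1). A minimal sketch of that transform, using the standard-library `math.log1p`; the helper name `log1p_transform` and the sample counts are hypothetical.

```python
import math


def log1p_transform(values):
    """Return log(x + 1) for each value; requires x > -1."""
    if any(v <= -1 for v in values):
        raise ValueError("log(x + 1) is undefined for x <= -1")
    return [math.log1p(v) for v in values]


# Hypothetical right-skewed counts, e.g. comments per post.
counts = [0, 9, 99, 999]
transformed = log1p_transform(counts)
print(transformed)
```

Adding 1 before taking the log keeps zero counts defined (they map to 0), and the transform preserves the ordering of the values while compressing the long right tail.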
... "Difference in differences (DID) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a 'treatment group' versus a 'control group' in a natural experiment. It calculates the effect of a treatment (i.e., an explanatory variable or an independent variable) on an outcome (i.e., a response variable or dependent variable) by comparing the average change over time in the outcome variable for the treatment group to the average change over time for the control group" (Angrist and Pischke 2008). ...
Article
The analysis of what human capital (HC) is has a long history and culminates in the acknowledgment that HC and its growth are very important for both cognitive education (cognitive skills (CSs)) and personal life (noncognitive skills (NCSs)) and that CSs and NCSs have a strong reciprocal relationship, as studies by Heckman demonstrated. The present contribution (following Heckman’s approach) analyzed the relationship between CSs and NCSs in a sample of middle school students in the Autonomous Province of Trento. The second goal of the research was to verify whether educational teaching behaviors improved students’ personalities. Aside from the use of administrative data (INVALSI data, 2015 and 2018), one survey was administered in the 2018–2019 schooling year to verify the relationship between NCSs and CSs. Moreover, we sought to determine whether education teaching behavior improved the students’ personalities (1522 students in 25 schools) and whether programs could enhance NCSs. Methodological tools for the analysis involved the generalized least squares approach to answer the first question and a difference-in-differences model for the second. The main results showed that the levels of NCSs affected the ability to learn and improve CSs; a challenging teaching approach, especially if accompanied by programs improving its quality, had positive results. Finally, the research suggested that a wider, national-based survey following students from primary to secondary school would allow for a greater understanding of the dynamics of CSs and NCSs.
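The difference-in-differences definition quoted above compares the average change over time in the treatment group with the average change in the control group. A minimal sketch of that two-group, two-period calculation follows; the function name `did_estimate` and the numbers are hypothetical, and a real analysis would typically use a regression with standard errors.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Two-group, two-period difference-in-differences on group means."""
    treat_change = mean(treat_post) - mean(treat_pre)
    ctrl_change = mean(ctrl_post) - mean(ctrl_pre)
    return treat_change - ctrl_change


# Hypothetical outcomes before and after a program.
treat_pre = [10.0, 12.0]
treat_post = [15.0, 17.0]
ctrl_pre = [8.0, 10.0]
ctrl_post = [9.0, 11.0]

effect = did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(effect)
```

Here the treatment group's mean rises by 5 and the control group's by 1, so the estimate attributes the remaining 4 to the treatment, under the assumption that both groups would otherwise have followed parallel trends.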
Article
The correlation between screen size and visualisations of wind turbines in an internet survey with 2,359 respondents is analysed. Respondents answering the survey on a screen smaller than or equal to an A4 sheet find the visualised wind turbines significantly less visible than respondents answering on a larger screen. These results fuel the debate on using visualisations in internet surveys.
Article
Political trolling on social media has in recent years become a new technology of digital politics. Research on trolling in political science, however, focuses on the problem of detecting trolls and describing the nature of their activity and strategies, largely ignoring the question of how users perceive trolling. The result of this shift in academic focus is an acute shortage of work on the consequences and outcomes of trolls' online political activity. Methodologically, the situation is aggravated by the fact that existing studies rely on identifying trolls through manual labeling of social media users. Ignoring questions of how trolling is perceived can, in this situation, lead to systematic biases in empirical results. The authors of this paper seek to fill this gap in the literature by examining the perception of political trolling on the social network VKontakte. Drawing on the literature on selective perception, the article hypothesizes that supporters and opponents of the incumbent government will more often label as trolling those messages that express the opposite political position. This hypothesis is tested on original empirical data using regression analysis, which shows that only one of the groups of respondents considered, opposition-minded respondents, tends to label messages with the opposite political position as trolling more often. Supporters of the incumbent government, by contrast, show no systematic differences in their perception of pro-government and opposition trolling. The results, on the one hand, point to methodological limitations of empirical studies that rely on labeled data; on the other, they point to significant differences in how supporters and opponents of the incumbent government perceive political information online, and bring to the fore the issue of basic political beliefs in political communication research.
Article
Objective: To establish that there are differences in the density trends of surgeons in some European countries and 16 OECD member countries, and to compare those trends over 2005-2018. Materials and methods: The study is based on data of the Centre for Medical Statistics of the Ministry of Health of Ukraine obtained during 2005-2020 and OECD data obtained during 2005-2018. The difference-in-differences method was used to determine differences in density trends, and regression analysis was used to predict the number of surgeons in 2020. Results: In 2020, there were 28,559 surgeons (0.687 per 1,000) in Ukraine, 17.7% fewer than in 2005. From 2005 to 2018, the density of surgeons per 1,000 decreased in Ukraine and the United States (-7.45% and -2.5%, respectively). In Korea (+78.38%), Greece (+65.52%), Lithuania (+58.57%), Slovenia (+45.65%) and 11 other countries, surgeon density increased. In 2030, Ukraine is predicted to have significantly fewer surgeons, general surgeons, ophthalmologists and urologists, and more cardiovascular surgeons. The number of proctologists, oncologist-surgeons, neurosurgeons, thoracic surgeons, orthopaedist-traumatologists and anaesthesiologists will not change significantly. Conclusions: Judged by the density of surgeons, the availability of surgical care in Ukraine is similar to the level of OECD countries. In 2030, the number of surgeons is projected to decrease, with the exception of cardiovascular surgeons.
Article
This paper estimates effects of long-term care (LTC) benefits on utilization of primary and secondary healthcare in Catalonia (Spain). Identification comes from plausibly exogenous variation in the leniency of LTC needs assessment. We estimate that receiving LTC benefits worth 365 euros per month, on average, reduces the probability of avoidable hospital admissions by 66%, and has no significant effect on planned hospitalisations nor on hospitalisation for any reason. Receiving LTC benefits is estimated to reduce unscheduled primary care visits by 44% and has no significant effect on scheduled visits. These findings have important policy implications suggesting that allocating resources to LTC may not only increase the welfare of LTC beneficiaries but also reduce avoidable and unscheduled utilisation of healthcare.
Article
Background and aims: Existing research on mental health comorbidities of youth e-cigarette use is subject to confounding bias and reverse causality. This study aimed to measure the effects of e-cigarette use on youth mental health, using e-cigarette minimum legal age (MLA) law in Canada as a natural experiment. Design: We used difference-in-differences (DD), difference-in-differences-in-differences (DDD) and two-sample instrumental variables (TSIV) methods. Setting: Data were from nationally representative Canadian Community Health Surveys 2008-2019 and Canadian Student Tobacco Alcohol and Drugs Surveys 2008-2019. Participants: The study sample comprised respondents aged 15 to 18 (in DD analysis; n = 33 858) and aged 15 to 24 (in DDD analysis; n = 78 689). Measurements: Primary outcomes were self-reported mood disorders and anxiety disorders. Secondary outcomes were cannabis use, illicit drug use, cigarette use and strength of peer relationships at schools. Findings: After the e-cigarette MLA laws, risks of mood disorders declined by 1.9 percentage points (95% CI, 0.0-3.8; P = 0.05) in the DD analysis and by 2.6 percentage points (95% CI, 0.2-5.0; P = 0.03) in the DDD analysis. For anxiety disorders, while the DD estimate was negative but imprecisely estimated, the MLA law reduced risks of anxiety disorder by 3.6 percentage points (95% CI, 0.9-6.2; P = 0.01) in the DDD analysis. Youths in provinces with MLA laws were also less likely to report cannabis use and illicit drug use and more likely to feel being part of schools. TSIV analysis indicates that youth e-cigarette use increased the likelihood of mood and anxiety disorders by 44% and 37%, respectively. Conclusion and relevance: In Canada, the e-cigarette minimum legal age law appears to have reduced risks of mood and anxiety disorders, lowered substance use and improved peer relationships at schools. Combined with previous evidence of lower e-cigarette use following the minimum legal age law, our findings indicate that youth e-cigarette use increases risks of mood and anxiety disorders.
Article
We evaluate Angrist and Krueger (1991) and Bound, Jaeger, and Baker (1995) by constructing reliable confidence regions around the 2SLS and LIML estimators for returns-to-schooling regardless of the quality of the instruments. The results indicate that the returns-to-schooling were between 8 and 25 percent in 1970 and between 4 and 14 percent in 1980. Although the estimates are less accurate than previously thought, most specifications by Angrist and Krueger (1991) are informative for returns-to-schooling. In particular, concern about the reliability of the model with 178 instruments is unfounded despite the low first-stage F-statistic. Finally, we briefly discuss bias-adjustment of estimators and pretesting procedures as solutions to the weak-instrument problem.
Article
For a quarter century, American education researchers have tended to favour qualitative and descriptive analyses over quantitative studies using random assignment or featuring credible quasi-experimental research designs. This has now changed. In 2002 and 2003, the US Department of Education funded a dozen randomized trials to evaluate the efficacy of pre-school programmes, up from one in 2000. In this essay, I explore the intellectual and legislative roots of this change, beginning with the story of how contemporary education research fell out of step with other social sciences. I then use a study in which low-achieving high-school students were randomly offered incentives to learn to show how recent developments in research methods answer ethical and practical objections to the use of random assignment for research on schools. Finally, I offer a few cautionary notes based on results from the recent effort to cut class size in California.
Article
Bounds for matrix weighted averages of pairs of vectors are presented. The weight matrices are constrained to certain classes suggested by the Bayesian analysis of the linear regression model and the multivariate normal model. The bounds identify the region within which the posterior location vector must lie if the prior comes from a certain class of priors.
Article
The paper examines two approaches to the omitted variable problem. Both of them try to correct for the omitted variable bias by specifying several equations in which the unobservable appears. The first approach assumes that the common left-out variable is the only thing connecting the residuals from these equations, making it possible to extract this common factor and control for it. The second approach relies on building a model of the unobservable, by specifying observable variables which are causally related to it. A combination of these two methods is applied to the 1964 CPS-NORC veterans sample in order to evaluate the bias in income-schooling regressions caused by the omission of an unobservable initial ‘ability’ variable.
Article
When several candidate tests are available for a given testing problem, and each has nice properties with respect to different criteria such as efficiency and robustness, it is desirable to combine them. We discuss various combined tests based on asymptotically normal tests. When the means of two standardized tests under contiguous alternatives are close, we show that the maximum of the two tests appears to have an overall best performance compared with other forms of combined tests considered, and that it retains most power compared with the better one of the two tests combined. As an application, for testing zero location shift between two groups, we studied the normal, Wilcoxon, median tests and their combined tests. Because of their structural differences, the joint convergence and the asymptotic correlation of the tests are not easily derived from the usual asymptotic arguments of the tests. We developed a novel application of martingale theory to obtain the asymptotic correlations and their estimators. Simulation studies were also performed to examine the small sample properties of these combined tests. Finally we illustrate the methods by a real data example.