Cannabis use worsens the gender pay gap: evidence from randomly assigned interviewers

Preprints and early-stage research may not have been peer reviewed yet.

We leverage interviewer assignment in an interviewer-mediated survey as a randomization device to measure the wage penalty of cannabis use. Using the Australian household survey, we show that the female wage penalty for cannabis use is 8.3%. For males, we estimate a noisy zero. This confirms findings from the medical literature that women are biologically more sensitive to cannabis and its health‐related harm. As the legalization of cannabis is typically found to increase its usage, it may also exacerbate the gender wage gap.
Sergey Alexeev*1, Don Weatherburn2, and Mark Wooden3
1University of Sydney
2University of New South Wales
3University of Melbourne
May 10, 2023
Keywords: Health Behavior; Illegal Behavior; Wage Differentials; Non-labor Discrimination;
Government Policy
JEL Codes: I12; P37; J31; J16; I18
*Author to whom correspondence should be addressed:
We thank Peter Hull, Manudeep Bhuller and seminar participants at the University of Sydney and the
University of Melbourne for helpful feedback.
1 Introduction
Both the general public and many professionals welcome the creation of a legal cannabis market
(e.g., Weatherburn, Alexeev, and Livingston 2022; Shover and Humphreys 2019). Worries about
cannabis often relate to the connection between cannabis use and crime. However, cannabis-induced
crime by users appears to be limited (e.g., Simpson 2003; Pedersen and Skardhamar 2010).
Instead, the argument in favour of legalization is that organized crime is heavily involved in
supplying cannabis (Xiong 2021); thus, legalization and regulation of cannabis would reduce
criminal activity.
Another potential risk is that cannabis use induces the use of hard drugs, but this ‘gateway-drug’
effect appears to be absent or small (e.g., van Ours 2003; Melberg, Jones, and Bretteville-Jensen
2010; van Ours 2006b). People also worry about the health effects of cannabis use. A
strong association between cannabis use and poor health has been observed for decades (Degenhardt,
Hall, and Lynskey 2003; Arseneault et al. 2004). Although some doubts have been expressed
about the causality of this relationship (Hall and Degenhardt 2009; Werb, Fischer, and Wood
2010; Pudney 2014), recent rigorous studies have established this link (Hall 2015; van Ours and
Williams 2011, 2012; van Ours et al. 2013).
The medical reasons for the negative effects of cannabis use on health are well understood. They
include a suppressed immune system, impaired executive functions (e.g., learning), cardiovascular
disease, obstructive lung disease and paranoid thoughts (Alshaarawy 2019; Crean, Crane, and
Mason 2011; Filbey et al. 2014; Khanji et al. 2020; Kempker, Honig, and Martin 2015; Freeman
et al. 2014). Importantly, these physiological effects are highly gender-dependent, with females
much more susceptible to cannabis harm (Struik, Sanna, and Fattore 2018; Fattore and Fratta
2010; Craft 2005). The physiological effects suggest that cannabis users are less productive workers.
However, to date, studies of the wage effects of cannabis use often show counterintuitive results.
van Ours and Williams (2015) separate the literature on the wage effects of cannabis into two waves.
The first wave of literature should be treated with caution as it tends to use questionable identification
strategies. Kaestner (1991) estimates that males who have tried cannabis earn 18% more
than otherwise similar males who have not tried cannabis. Register and Williams (1992) estimate
that using cannabis on one more occasion per month increases hourly wages by 5%, and Gill
and Michaels (1992) find that drug users earn about 4% more per hour than nonusers. Kaestner
(1994a, 1994b) concludes that drug use does not have a systematic impact on wages. Second-wave
literature, from 1998 onwards, improves on the earlier work. Some studies tend to find evidence
that nonproblematic use of cannabis has no impact on wages (French, Roebuck, and Alexandre
2001; van Ours 2006a), while others show noticeable wage penalties (Burgess and Propper 1998;
DeSimone 2002; Zarkin et al. 1998).
Identifying the causal effect of drug use on wages has long been acknowledged as an unusually
challenging empirical question (Kaestner 1998). The model has to be robust to threats to internal
validity that originate from unobserved confounders (e.g., risk preference affects wages and
correlates with cannabis use) and reverse causality (e.g., a high wage may initiate drug use).
Consider Figure 1, which encodes the relationship between earnings and cannabis use in a directed
acyclic graph. The dotted arrow refers to the correlation captured by the OLS estimate of the
wage equation. Arrows emanating from ‘unobserved confounders’ refer to omitted variable bias;
the bi-directed arrow between use and wage refers to simultaneity.
Figure 1: Endogeneity and identification of the effects of cannabis on wages
[Diagram nodes: Instrumental variable, Cannabis use, Earnings, Unobserved confounders]
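
The identification problem summarized in Figure 1 can be illustrated with a small simulation. The sketch below is not the estimator used in this article: it assumes a hypothetical continuous treatment, a single instrument and made-up coefficients, purely to show how an unobserved confounder biases the OLS slope while the IV ratio recovers the assumed effect.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, beta=-0.5, seed=1):
    """Draw (z, x, y); u confounds x and y, z shifts x only (all values hypothetical)."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                   # unobserved confounder
        zi = rng.gauss(0, 1)                  # instrument, independent of u
        xi = zi + u + rng.gauss(0, 1)         # treatment depends on z and u
        yi = beta * xi + u + rng.gauss(0, 1)  # outcome depends on x and u
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

z, x, y = simulate()
ols = cov(x, y) / cov(x, x)  # slope of y on x, contaminated by u
iv = cov(z, y) / cov(z, x)   # IV (Wald) ratio, uses only variation induced by z
print(f"assumed effect: -0.5, OLS: {ols:.3f}, IV: {iv:.3f}")
```

In this toy design the OLS slope is pulled towards zero because the confounder raises both the treatment and the outcome, whereas the IV ratio stays close to the assumed value.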
In this article, we estimate the wage penalties of cannabis use by gender utilizing a form of
Bartik instrumental variable (IV) (Goldsmith-Pinkham, Sorkin, and Swift 2020; Breuer 2021;
Bartik 1991). Specifically, we leverage the allocation of interviewers in an interviewer-mediated
survey. We show that respondents’ assignment to interviewers is exogenous in the sense of
being conditionally random and orthogonal to current respondents’ characteristics (see the arrows
emanating from ‘instrumental variable’ in Figure 1).
We then construct a simulated rather than realized treatment using the plausibly exogenous
subdimension of the exposure variation originating from the interviewer assignment. This simulated
treatment is highly predictive of the realized treatment but does not have an idiosyncratic
link with a respondent, drastically reducing concerns about reverse causality and unit-specific
omitted factors.
In practice, our IV is a residualized leave-out-mean value for an interviewer. It is residualized
because we partial out the statistical area to account for interviewer allocation. We do this be-
cause, by construction, the respondents might be assigned to the same pool of interviewers within
an area. After residualizing, we average the residual for a given respondent using the information
on other respondents with the same interviewer.
Using averages (or lagged values), such as the industry average for the firm-level data, is a
popular approach to reduce concerns about reverse causality and unit-specific omitted factors.
However, they are often criticized for not being plausibly exogenous (Reiss and Wolak 2007;
Larcker and Rusticus 2010). In our case, they are exogenous by the survey design.
Our tests demonstrate that the constructed IV is relevant, random and satisfies monotonicity.
The exclusion restriction, that our IV does not affect current wages, is also satisfied by the
survey design. We use a multi-purpose panel survey where the interviewers’ assignment is not
conditioned on any household or family members’ characteristics. In addition, most interviewers
collect data for multiple waves and follow up with the same individuals. That is, the allocation
(and, thus, the IV we construct) is predetermined and, thus, orthogonal to current wages.
The idea of using interviewer identity as an exogenous non-experimental source of identification
has previously been used by Bärnighausen et al. (2011). They noted that a positive
answer on HIV status correlated with the interviewer’s identity and used this correlation to correct
for the non-random disclosure of HIV status.
Because we use cross-sectional data for our Bartik-like IV, our approach is also similar to
simulated instruments used in Currie and Gruber (1996). Their paper measures the effect of
medical care utilization on children’s health outcomes. To address endogeneity, they use observed
characteristics to estimate a probability of eligibility for Medicaid given the rules in each state
in that year and use that probability as an IV. In our case, we also estimate the probability of
treatment using the aspects of data collection that are exogenous to respondents by design. We
now explain this data collection process.
2 Data
For our work, we use wave 17 of the Household, Income and Labour Dynamics in Australia
(HILDA) Survey. It is the main Australian nationally representative household-based panel, and
it has the information required for our study on the respondents’ interviewers, labour market
dynamics, and cannabis use.
In the first subsection we argue that the respondents are allocated to interviewers in a random
fashion. We argue that cannabis use is not subject to interviewer effects because there
is no direct
2.1 HILDA Data Collection Processes
For this wave, the data were collected by a total of 175 interviewers (Summerfield et al. 2021, Table
8.3), the majority of whom (140) had worked on at least one previous wave of the study. Thirty
of these worked from a centralised office where they staffed a dedicated 1800 telephone line and
conducted interviews by telephone (the T1800 workforce). The remaining 145 were located in
different locations around the country and conducted interviews mostly in person in respondents’
homes (the face-to-face interviewer workforce).
The T1800 team was primarily responsible for sample members who had previously indicated
a preference for an interview by telephone or were determined to live in an area where it was not
cost-effective to send a face-to-face interviewer. Interviewer assignment to these sample members
was effectively random. Responsibility for allocation falls to the coordinators, who develop a
roster and inform interviewers about the list of respondents to call. The coordinator allocates
work in a way that shares the workload as evenly as possible, and interviewers have no prior
knowledge of allocation. There is no scope for respondents to influence the choice of interviewers
and no scope for interviewers to pick respondents. This group, however, was only responsible for
6.6% of completed interviews.
Face-to-face interviewers were assigned area-based workloads determined primarily by loca-
tion to minimise cost. Interviewers are typically assigned workloads relatively close to where the
interviewer resides. The main exception is workloads clustered around a rural location where no
interviewer resides. In these cases, an interviewer based in the nearest major city will travel to
and stay at the required location for an extended period.
Once contact with a sample household is established and appointment times agreed, the data
collection process involves three distinct steps. First, an interview is conducted with at least
one household member where the household composition is established, and questions about the
household (e.g., on housing) are asked. Second, interviews are then sought with every member
of the household aged 15 years or older. Third, a self-completion questionnaire (SCQ) is left with
all interview respondents to complete in private.
The SCQ minimizes measurement error due to interviewer effects. During in-person
interviews, interviewer-respondent socio-economic proximity produces biases in answers (Davis
et al. 2009; Johnson et al. 2000; Flores-Macias and Lawson 2008; West and Blom 2016).
Once completed, the SCQs can either be collected by the interviewer or returned via the mail in
reply-paid envelopes provided for this purpose. Where possible, the completed SCQs were col-
lected by the interviewers at the time of the interview. Where this was not possible, interviewers
were required to make at least one additional trip to the household at a later date for the express
purpose of collecting the completed SCQs. This ensured that return rates among face-to-face in-
terviews were extremely high (95.1% in wave 17). Among the much smaller group of telephone
respondents, who were all requested to return the completed form in the mail, however, the SCQ
return was much lower (59.1%).
The SCQ consists mainly of questions that are difficult to answer in a time-effective manner in
a personal interview (e.g., time use, expenditure) or which respondents may feel more comfortable
completing on their own without the aid of an interviewer. In wave 17, this included new
questions on the use of illicit drugs.
2.2 Basic patterns of cannabis use
Wave 17 has a total of 17,570 individuals. We estimate the effects of cannabis use on wages on
a sample of employed workers, excluding self-employed individuals. The latter are excluded
because they run either unincorporated businesses with business income but no wages, or
incorporated businesses with wages but with non-market discretion over what to count as own wage and
business income. Some fraction of individuals is reported as being family workers. They are also
excluded.
Table 1 tabulates the frequency of cannabis use in our estimation sample. We define our
treatment as the indicator for any cannabis use (codes 1 to 6 in the leftmost column of Table 1).
Thus, in our sample, 10.4% of females and 16.7% of males belong to the treatment group.¹
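
As a quick arithmetic check, the treatment shares quoted above can be recomputed from the counts reported in Table 1. The snippet is only an illustration; the function and variable names are ours and not part of the original analysis.

```python
# Counts by frequency code (1 = every day ... 6 = once or twice a year), copied from Table 1.
female_users = [38, 47, 32, 21, 57, 194]
male_users = [75, 92, 61, 44, 91, 221]
female_nonusers, male_nonusers = 3352, 2916  # code 7, "Not at all"

def treated_share(user_counts, nonusers):
    """Share of respondents reporting any cannabis use (codes 1 to 6)."""
    users = sum(user_counts)
    return users / (users + nonusers)

female_share = treated_share(female_users, female_nonusers)
male_share = treated_share(male_users, male_nonusers)
print(f"female: {female_share:.3f}, male: {male_share:.3f}")  # female: 0.104, male: 0.167
```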
1. The bottom half of Table 1 shows that approximately 8% of our primary sample has missing SCQs. We note
that SCQs are not missing at random (Watson and Wooden 2015), which poses an inherent limitation of the data.
Table 1: Frequency of cannabis use among employed Australians
                            Female           Male             Total
                            N      Share     N      Share     N      Share
1 Every day                 38     0.010     75     0.021     113    0.016
2 Once a week or more       47     0.013     92     0.026     139    0.019
3 2 or 3 times a month      32     0.009     61     0.017     93     0.013
4 About once a month        21     0.006     44     0.013     65     0.009
5 Every few months          57     0.015     91     0.026     148    0.020
6 Once or twice a year      194    0.052     221    0.063     415    0.057
7 Not at all                3352   0.896     2916   0.833     6268   0.866
Total                       3741   0.772     3500   0.757     7241   0.765
Notes: Tabulation of cannabis use. Employee-only sample excluding observations with
missing log earning rate values and top and bottom 1% of the IV value. More on the IV
in Section 3.1.
Source: HILDA 2017.
We now perform some basic data visualisations to get a better idea of our treatment variable.
Figure 2a shows the age histogram of employed cannabis users by gender. The male distribution
is shifted to the right relative to the female one. Respondents in their 20s are the most frequent
users of cannabis, particularly females. For example, at age 23-25, 13% of female workers use
cannabis. For males, this fraction is 7%. The same pattern also holds in a statistical sense, which
we verify by regressing the cannabis use dummy on a female indicator interacted with the age groups.
The above-mentioned gender-specific distinction is of note. Some argue that the female wage
gap is partially due to relatively lower actual labour market experience, a variable that is often
not available in nationally representative data sets and thus often seen as a confounder (Blau
and Kahn 1997, 2006, 2017). Within the earnings lifecycle, lack of experience is most damaging
in the early 20s, especially at higher education levels. This concavity of log earnings age and
experience profiles, with steeper profiles for persons with more years of education, is a widely
accepted motivation to include age and age squared in the wage equation (Heckman, Lochner, and
Todd 2003). The tendency of females to use cannabis at the time when it is most damaging for
current and future earnings already implies that cannabis may be a source of wage disparities.
Wave 17 also contains information on the age of first use of cannabis. This allows measuring
how many years cannabis has been used. Figure 2b shows a histogram where
we replaced the value 1 of our treatment variable with ‘current age - age of first use + 1’. The
value 1 corresponds to users who started using cannabis this year, 2 corresponds to those who
began a year ago, and so on. The distribution for male users is again shifted to the right relative
to females. Accordingly, the female distribution is concentrated at the left tail relative to the male
distribution. The histogram suggests that females are less likely to be long-term users.
Figure 2: Basic patterns of cannabis use
(a) Age of cannabis users by gender
(b) Years of cannabis use by gender
Notes: (a) Histogram of age among employed active users by gender; (b) Histogram of years of use among active
employed users by gender.
Source: HILDA 2017.
A potential limitation of this histogram is that the measure may be correlated with age
and therefore may reflect gender differences in age structure. One way to control for these
differences is to divide the variable by the respondent’s age. The variable then becomes the share
of life over which the respondent has been using cannabis. Under this transformation, the same pattern as in
Figure 2b still holds.
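
The two derived measures described above are simple to state in code. The following is a minimal sketch with hypothetical function names and example ages; it is not taken from the authors' replication files.

```python
def years_of_use(current_age: int, age_first_use: int) -> int:
    """Years of use, counting the starting year as 1 (current age - age of first use + 1)."""
    if age_first_use > current_age:
        raise ValueError("age of first use cannot exceed current age")
    return current_age - age_first_use + 1

def share_of_life(current_age: int, age_first_use: int) -> float:
    """Years of use divided by current age, which nets out differences in age structure."""
    return years_of_use(current_age, age_first_use) / current_age

print(years_of_use(24, 24))   # 1: started using this year
print(years_of_use(24, 23))   # 2: began a year ago
print(share_of_life(40, 21))  # 0.5: half of life as a user
```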
The patterns shown in Figure 2 may be due to self-selection into employment or cannabis use
and thus may be of no significance for wages. Our main model, introduced in the next
section, corrects for these and other sources of endogeneity.
2.3 Descriptive statistics
All of the variables used in this study are summarized in Table 2. The univariate gender differences
on the right of the table are estimated by regressing each variable on a constant and a
dummy for females with robust standard errors. On average, employed females’ earnings rate is
about 11% lower and they are 6 percentage points less likely to be cannabis users.
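
Regressing a variable on a constant and a female dummy returns the male mean as the intercept and the female-minus-male difference in means as the slope, which is why the F-M column can be read as a simple gap. The sketch below demonstrates this on hypothetical toy numbers (not HILDA values) and leaves out the robust standard errors.

```python
def ols_on_dummy(y, d):
    """OLS of y on a constant and a 0/1 dummy d; returns (intercept, slope)."""
    n = len(y)
    mean_y, mean_d = sum(y) / n, sum(d) / n
    sxy = sum((di - mean_d) * (yi - mean_y) for di, yi in zip(d, y))
    sxx = sum((di - mean_d) ** 2 for di in d)
    slope = sxy / sxx
    return mean_y - slope * mean_d, slope

# Hypothetical log earnings and female indicator.
y = [3.1, 3.4, 3.6, 3.0, 3.3, 3.5, 3.9]
female = [1, 1, 1, 0, 0, 0, 0]

intercept, slope = ols_on_dummy(y, female)
mean_f = sum(v for v, f in zip(y, female) if f == 1) / female.count(1)
mean_m = sum(v for v, f in zip(y, female) if f == 0) / female.count(0)
print(f"slope: {slope:.4f}, difference in means: {mean_f - mean_m:.4f}")
```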
Further, women are more educated and have shorter tenure, possibly because less educated
women choose not to work (Heckman 1979; Mulligan and Rubinstein 2008). The variables at
the bottom of the table support this view by showing that employed women are also less likely
to be married and have fewer children younger than 18 years old (referred to as Minors in the table),
suggesting that married women with dependent children are missing from the workforce.
Shorter tenure may also be due to older women with longer tenure missing from the work-
force. Another known explanation is that anticipating shorter and more discontinuous work,
women generally have lower incentives to invest in on-the-job training (Becker 1985). Women
are also known to choose occupations for which on-the-job training is less important because the
returns to such investments are reaped only as long as one remains with a particular employer
(Blau and Kahn 2017). Alternatively, it may reflect the fact that due to the birth of a child, a
woman withdraws from the labour force, breaking her tie to her employer and forgoing the re-
turns to any firm-specific training (Waldfogel 1998).
Female occupation is heavily shifted away from construction or manufacturing towards services.
The expansion of the service sector and computerization are known factors that equalized
gender pay by reducing the demand for blue-collar production labour, where men are
disproportionately represented (Weinberg 2000; Autor, Levy, and Murnane 2003; Goldin 1990).
Table 2: Descriptive statistics
                      Female (F)                          Male (M)                            F-M
                      Mean     SD       Min      Max      Mean     SD       Min      Max
Outcome and focal variables
Earnings rate, ln     3.328    0.631    -2.479   7.657    3.445    0.676    -3.937   8.204    -0.110***
Cannabis use          0.104    0.305    0        1        0.167    0.373    0        1        -0.0607***
Human capital measures
Schooling             13.395   2.382    0        19       12.953   2.255    4        19       0.443***
Age                   39.561   14.078   15       79       39.107   13.843   15       80       0.392
Tenure                6.746    7.744    0        47       7.060    8.313    0        57       -0.384*
Industry occupation
Agriculture           0.008    0.088    0        1        0.024    0.154    0        1        -0.0156***
Energy                0.002    0.049    0        1        0.010    0.101    0        1        -0.00777***
Mining                0.005    0.071    0        1        0.028    0.166    0        1        -0.0237***
Manufacturing         0.036    0.186    0        1        0.131    0.337    0        1        -0.0944***
Construction          0.013    0.115    0        1        0.115    0.319    0        1        -0.101***
Trade                 0.130    0.336    0        1        0.148    0.355    0        1        -0.0164*
Transport             0.024    0.152    0        1        0.067    0.250    0        1        -0.0411***
Bank/Insurance        0.038    0.192    0        1        0.030    0.171    0        1        0.00726+
Services              0.744    0.437    0        1        0.446    0.497    0        1        0.293***
Selection variables
Married               0.440    0.496    0        1        0.489    0.500    0        1        -0.0480***
Cohabiting            0.160    0.367    0        1        0.171    0.377    0        1        -0.00982
Single                0.249    0.433    0        1        0.244    0.430    0        1        0.00225
Minors                0.708    1.005    0        6        0.808    1.107    0        6        -0.0980***
Other income, IHSF    10.148   3.907    0        14.433   8.578    4.830    0        14.426   1.540***
Observations          3741                                3500
Notes: Tabulation of cannabis use. Employee-only sample excluding observations with missing log earning rate values and
top and bottom 1% of the IV value. More on the IV in Section 3.1.
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001.
Source: HILDA 2017.
The bottom of the table shows the variables that we use to model workforce participation
(although, for this, we use a larger sample that includes both employed and unemployed individuals).
The last variable, Other income, is the inverse hyperbolic sine function (IHSF)² of
Cross National Equivalent File (CNEF) total household income in excess of individual earnings.
We will discuss this variable in more detail in Section 3.2.
2. Defined as $\sinh^{-1}(x) := \ln\left(x + \sqrt{1 + x^2}\right)$. Its derivative is $1/\sqrt{1 + x^2}$, which, if $x$ is not
too small, approximates $1/x$, the derivative of $\ln(x)$.
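
The properties of the IHSF stated in the footnote are easy to verify numerically. The sketch below uses only the Python standard library; the function names and the input values are ours and are chosen for illustration.

```python
import math

def ihsf(x: float) -> float:
    """Inverse hyperbolic sine, ln(x + sqrt(1 + x^2)); unlike ln(x), it is defined at x = 0."""
    return math.log(x + math.sqrt(1.0 + x * x))

def ihsf_derivative(x: float) -> float:
    """Derivative of the IHSF, 1 / sqrt(1 + x^2)."""
    return 1.0 / math.sqrt(1.0 + x * x)

for x in (0.0, 1.0, 100.0, 50000.0):
    log_slope = 1.0 / x if x > 0 else float("inf")  # derivative of ln(x)
    print(f"x={x:>8}: ihsf={ihsf(x):.4f}, d/dx={ihsf_derivative(x):.6f}, 1/x={log_slope:.6f}")
```

For large incomes the transformation behaves like a logarithm (it approaches ln(2x)), while zero incomes are retained instead of being dropped.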
3 Methods
3.1 Instrument construction
We reduce endogeneity concerns by focusing on a subset of the exposure variation that is plausibly
exogenous. To start, we must first identify a subdimension $k$ which allows decomposing the
treatment variation $X_i$. In our setting, the following identity holds:
where $k$ denotes an interviewer and $\mathbb{1}(\gamma_k)$ is its indicator function. The identity does not require
any assumption. It states that each observed treatment status is associated with an interviewer. The
purpose of this decomposition is to show that the individual components need to be observable.
In our data, we see both the treatment status and the unique number of the interviewer who
conducted the interview.
We use this $k$ dimension to construct a potential rather than realized treatment. To ensure that
the potential treatment variable is exogenous, we leverage the conditional randomness of the HILDA
interviewer allocation. As mentioned in Section 2.1, some fraction of interviewer
assignment is unconditionally random, while most interviewers collect data within a particular
geographic area. Thus, sampled households within an area might be served by the same pool
of interviewers. The randomisation of interviewers is plausible between specific areas, whereas
assignments within areas may not be random. We account for this by including statistical
area fixed effects.
For area, we choose the Australian Bureau of Statistics Statistical Areas Level 3 (SA3), which
are the functional areas of regional towns and cities and include one or more Local Government
Areas (LGAs) (Australian Bureau of Statistics 2016). In our employee-only sample, there are 318
SA3s. This level is optimal because the interviewers tend to cover at least a few LGAs.³
Another useful feature of the HILDA interviewer allocation is that the survey is multi-purpose;
the interviewers’ assignment is not conditioned on any household or family members’ characteristics.
Most interviewers also collect data for multiple waves; thus, the allocation is both random
and predetermined. This favours the IV exclusion restriction, which, in our case, is that the initial
assignment of the interviewer is orthogonal to current wages.
3. In practice, choosing lower area levels creates the incidental parameters problem (e.g., we are unable to
estimate the area parameters for, say, 3530 SA1s using a female sample of size 4481). In contrast, using higher
area levels may not achieve conditional randomisation of interviewers.
In practice, the following steps are taken to construct the IV. We first take the treatment variable
$X_i$ and residualize it. Let the residual after removing the interacted interviewer-area fixed
effects $\psi_{ka}$ (with $a$ indexing areas) and controls $W$ be:
$$X^{*}_{k(i)} = X_i - \psi_{ka} - W \qquad (2)$$
Intuitively, this step ‘imprints’ the interviewer’s conditionally random positive-response rate into the
cannabis use variable.
The inclusion of controls $W$ in Equation (2) further adjusts the IV for the characteristics of
respondent $i$. The same controls are
included in both stages of our 2SLS model and the Heckman selection equation (introduced in
the next subsection) for the same reasons. Including controls improves efficiency and consistency
because conditioning allows better isolating the exogenous variation originating from the interviewer
assignment. If the underlying IV varies with $W$, not including $W$ here would result in
a loss of information since the resulting IV would not reflect all the exogenous movement in $X$
(Wooldridge 2010, Ch. 5.1.2).
The residual $X^{*}_{k(i)}$ includes the required instrument $Z_{k(i)}$, as well as idiosyncratic
individual-level variation $\rho_{k(i)}$. These residuals are then used to construct the leave-out IV:
$$Z^{-i}_{k(i)} = \frac{1}{n_k - 1} \sum_{j \neq i} X^{*}_{k(j)} \qquad (3)$$
where $n_k$ is the total number of HILDA respondents interviewed by interviewer $k$ and $j$ indexes
other respondents interviewed by $k$. Effectively, Equation (3) removes the residual $\rho_{k(i)}$ from
Equation (2) by aggregating the residualised measure to the $k$ cell level and excluding the
respondent whose cannabis use is being instrumented. This leave-out measure is preferred because
omitting this step would introduce the same estimation errors on both the left- and right-hand
sides of the regression and may produce biased estimates of the effect of use on wages.
Table 3 illustrates the steps involved in the IV construction using a simplified example, where
area fixed effects and the controls are excluded, and only one interviewer is shown. The interviewer
fixed effect $\psi_k$ shows the fraction of positive answers, which are, in turn, coded as ones in
column $X_i$. The residual is then the difference between the treatment status and $\psi_k$. The final IV
is the mean for an interviewer excluding $i$’s $X^{*}_{k(i)}$. Because cannabis use is a relatively rare event,
our IV negatively correlates with the treatment.⁴
Table 3: Simple example of IV construction
$i$        $k$      $X_i$   $X^{*}_{k(i)}$   $\psi_k$   $n_k$   $Z^{-i}_{k(i)}$
100061     12039    1       0.6              0.4        5       -0.15
1000907    12039    0       -0.4             0.4        5       0.1
1000908    12039    1       0.6              0.4        5       -0.15
800166     12039    0       -0.4             0.4        5       0.1
118842     12039    0       -0.4             0.4        5       0.1
Notes: The demonstration of the steps (2) and (3) for
a fixed interviewer excluding the control variables
and area fixed effects.
Source: HILDA 2017.
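
The simplified construction in Table 3 can be reproduced in a few lines. The sketch below makes the same simplifications as the table (one interviewer, no area fixed effects, no controls), so the interviewer fixed effect reduces to the interviewer's mean treatment; the function name is ours, and the full procedure would instead take residuals from the regression in Equation (2).

```python
def leave_out_iv(treatment):
    """Leave-one-out mean of demeaned treatment among respondents of one interviewer."""
    n_k = len(treatment)
    if n_k < 2:
        raise ValueError("need at least two respondents per interviewer")
    psi_k = sum(treatment) / n_k                # share of positive answers
    residuals = [x - psi_k for x in treatment]  # simplified Equation (2)
    total = sum(residuals)
    # Equation (3): average the residuals of the other respondents only.
    return [(total - r) / (n_k - 1) for r in residuals]

x = [1, 0, 1, 0, 0]  # treatment column of Table 3
z = leave_out_iv(x)
print([round(v, 2) for v in z])  # [-0.15, 0.1, -0.15, 0.1, 0.1]
```

Users receive a negative instrument value and non-users a positive one, which is the mechanical negative correlation between the leave-out IV and own treatment noted in the text.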
Table 4: Estimates of the control variables parameters in Equation (2)
Dependent variable: Cannabis use
                    Female                        Male
                    Estimate      SE              Estimate      SE
Age                 -0.00706***   (0.00212)       -0.00187      (0.00240)
Age²                0.0000376     (0.0000254)     -0.0000266    (0.0000280)
Schooling           -0.00649**    (0.00228)       -0.0127***    (0.00299)
Tenure              -0.00115      (0.000711)      -0.00159*     (0.000746)
Industry (Agriculture omitted)
Energy              -0.0109       (0.113)         -0.0730       (0.0685)
Mining              -0.0752       (0.0841)        -0.0374       (0.0488)
Manufacturing       -0.0185       (0.0459)        0.0159        (0.0333)
Construction        0.0753        (0.0542)        0.0951**      (0.0329)
Trade               0.0114        (0.0414)        0.0220        (0.0331)
Transport           -0.0154       (0.0502)        0.00148       (0.0375)
Bank/Insurance      -0.0137       (0.0472)        -0.0159       (0.0460)
Services            -0.00327      (0.0395)        0.0359        (0.0309)
N of FE             564                           568
N                   4381                          4482
R²                  0.073                         0.089
Notes: The table reports the estimates of Equation (2). The estimates for the interacted
interviewer-area fixed effects are visualised in Figure 3.
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001.
Source: HILDA 2017.
For the actual IV construction, we perform the residualization on the full sample (separately
for males and females) without restricting the population to employed workers only. A bigger
sample ensures that our IV is more predictive of cannabis use. Table 4 reports the estimates of
Equation (2), with the point estimates of the fixed effects shown in Figure 3. All but a few fixed
effects are relatively small in value, as shown by the box plots below the dot plots in Figure 3.
4. One could also note that if, instead of interviewer fixed effects, we only had area fixed effects, our steps
would be similar to the ones taken in the examiner design, with an interviewer treated as an examiner (e.g.,
Alexeev and Weatherburn 2022). If we follow these steps in our setting, the constructed IV fails the relevance
assumption because there is no economic reason for the interviewer’s identity to influence the treatment. The
only reason for this correlation would be interviewer effects, which would not be a reliable source of identification.
Figure 3: The fixed effects estimates in Equation (2)
(a) Female sample
(b) Male sample
Notes: The figure shows the point estimates of the interacted interviewer-area fixed effects corresponding to
Equation (2). See Table 4 for further regression output. Boxes show 5, 25, 50, 75, 95 percentiles.
Source: HILDA 2017.
Figure 4: The number of individuals interviewed by one interviewer
(a) Female sample
(b) Male sample
Notes: The figure shows the number of respondents interviewed by an interviewer. Boxes show 5, 25, 50, 75, 95
percentiles.
Source: HILDA 2017.
After leaving one out, we exclude the top and bottom 1% of IV values for two reasons.
First, it limits the influence of outliers, since IV estimates are susceptible
to bias generated by extreme values (Young 2022). Second, it reduces the number of observations
with a low $n_k$. To reliably remove the idiosyncratic component, observations with low $n_k$
should generally be avoided. Figure 4 shows the distribution of $n_k$ in our final sample. The
median number for males is 37; for females, 39.
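
A trimming rule of this kind could be implemented as follows. This is a sketch under assumptions: the text does not state which percentile definition is used, so the example relies on the default (exclusive) method of Python's `statistics.quantiles`, and the input values are hypothetical.

```python
import statistics

def trim_extremes(values):
    """Keep values between the 1st and 99th percentile cut points (inclusive)."""
    cuts = statistics.quantiles(values, n=100)  # 99 cut points, default 'exclusive' method
    low, high = cuts[0], cuts[-1]
    return [v for v in values if low <= v <= high]

iv_values = [float(v) for v in range(1000)]  # hypothetical IV values 0, 1, ..., 999
kept = trim_extremes(iv_values)
print(len(kept), min(kept), max(kept))  # 980 10.0 989.0
```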
Lastly, because $Z^{-i}_{k(i)}$ is used within the 2SLS estimator, our resulting approach is a jackknife
instrumental variables estimator, which is recommended for models in which the number of IVs (the
interviewers) increases with sample size (Stock, Wright, and Yogo 2002; Evdokimov and Kolesár
2018; Kolesár et al. 2015).
3.2 Econometric model
To evaluate the causal effect of cannabis use on wages, we employ a linear model with an endogenous
explanatory variable and sample selection:
$S_i = \mathbb{1}[V_i \ldots]$
$\eta_i \sim \mathcal{N}(0, 1), \quad (7)$
where Equation (4) is the structural equation of interest, Equation (5) is a linear projection for the
endogenous variable, and Equation (6) is the Heckman selection equation.
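
One way to write a system matching this description is sketched below. Only $Y_i$, $X_i$, $W_i$, $V_i$, $S_i$, $Z^{-i}_{k(i)}$, $\beta$ and $\eta_i$ are taken from the text; the remaining coefficient and error symbols ($\delta$, $\pi$, $\theta$, $\alpha_V$, $\alpha_W$, $\alpha_Z$, $\varepsilon_i$, $\nu_i$) are assumed names, intercepts are taken to be absorbed in $W_i$, and the regressors in the selection equation follow the probit step described later in this subsection.

```latex
\begin{align}
Y_i &= \beta X_i + W_i'\delta + \varepsilon_i, \tag{4}\\
X_i &= \pi Z^{-i}_{k(i)} + W_i'\theta + \nu_i, \tag{5}\\
S_i &= \mathbb{1}\left[ V_i'\alpha_V + W_i'\alpha_W + \alpha_Z Z^{-i}_{k(i)} + \eta_i > 0 \right], \tag{6}\\
\eta_i &\sim \mathcal{N}(0, 1). \tag{7}
\end{align}
```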
The variable $Y_i$ is the wage of individual $i$, which is observed only when $S_i = 1$. The variable
$X_i$ is an indicator for being an active cannabis user. The array of controls $W_i$ includes standard
human capital measures (schooling, age, tenure) and industry of occupation. The variable $S_i$
takes the value 1 if we observe $Y_i$ and 0 otherwise. The variables $V_i$ are used to model the selection
process. They are an array of dummies for marital status (single, married, cohabiting), the number
of minors (dependents less than 18 years old) in the household, and other household income
(defined as CNEF total household income minus own labour market earnings, set to 0 if the result
is negative).
The assumption is that these variables causally affect wages on the extensive margin and only
influence the intensive margin through the extensive one. We note, however, that our estimates,
reported in Section 5, and the conclusions we draw from them are largely independent of biases
that may be caused by sample selection. Thus, this assumption is not of critical importance.
To leverage more variation from the selection variables and to avoid collinearity between the Mills ratios and the control variables, we add other income and the number of minors in quadratic form. This allows, for example, subsequent children to have a distinct effect on the decision to work (cf. Mogstad and Wiswall 2016). Our estimates, reported in Section 5, show that this parametrization is well justified. In addition, we apply the IHSF to other income to improve the fit of this heavily skewed variable and to prevent the loss of sample size (taking logs would drop zeros).
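The point about zeros can be made concrete with a short sketch. It assumes that IHSF denotes the inverse hyperbolic sine function, $\operatorname{asinh}(x) = \log(x + \sqrt{x^2 + 1})$; the income values are hypothetical.

```python
import math

def ihs(x):
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); defined at zero."""
    return math.asinh(x)

# Hypothetical values of other household income, including a zero.
incomes = [0.0, 1_000.0, 50_000.0]
transformed = [ihs(v) for v in incomes]

# For large x, asinh(x) is close to log(2x), so the scale resembles a log
# transform while observations with zero income are retained.
gap = ihs(50_000.0) - math.log(2 * 50_000.0)
```

A log transform would be undefined for the first observation, whereas the inverse hyperbolic sine maps it to zero and keeps it in the sample.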
The parameter $\beta$ is of interest. Because $X_i$ is a dummy variable and workers who do not use cannabis are the omitted category, the coefficient shows the percentage difference in wages between workers that is attributable to cannabis use. The estimates are produced on the male and female samples separately. To estimate Equations (4), (5), and (6), we follow the procedure described in Wooldridge (2010, Ch. 19.6.2). We first estimate a probit of positive labor market earnings on $W$, $V$, and $Z^{i}_{k(i)}$. We then estimate Equations (4) and (5) with 2SLS and include the Mills ratios constructed from the probit as a control variable. We recover standard errors using a bootstrap with 1000 replications, because analytical standard errors are generally incorrect with this estimation procedure. Incidentally, the bootstrap also helps to establish the robustness of our IV estimates, as argued by Young (2022).
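The estimation steps described above (probit, Mills ratio, 2SLS on the selected sample, bootstrap) can be sketched on simulated data. This is an illustration under stated assumptions, not the authors' STATA implementation: the data-generating process, all function names, and the use of 30 rather than 1000 bootstrap replications are hypothetical choices made to keep the example short.

```python
import math
import numpy as np

def norm_pdf(t):
    return np.exp(-0.5 * t**2) / math.sqrt(2.0 * math.pi)

def norm_cdf(t):
    return 0.5 * (1.0 + np.array([math.erf(a / math.sqrt(2.0)) for a in t]))

def probit_fit(A, s, iters=15):
    """Probit coefficients by Fisher scoring (A contains a constant)."""
    b = np.zeros(A.shape[1])
    for _ in range(iters):
        idx = A @ b
        p = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
        wgt = norm_pdf(idx) / (p * (1.0 - p))
        score = A.T @ (wgt * (s - p))
        info = (A * (wgt * norm_pdf(idx))[:, None]).T @ A
        b = b + np.linalg.solve(info, score)
    return b

def heckman_2sls(y, x, w, v, z, s):
    """Selection-corrected 2SLS estimate of the coefficient on x."""
    ones = np.ones_like(z)
    A = np.column_stack([ones, w, v, z])             # probit on W, V and Z
    idx = A @ probit_fit(A, s)
    mills = norm_pdf(idx) / norm_cdf(idx)            # inverse Mills ratio
    k = s == 1                                       # wage is used only if observed
    X = np.column_stack([x, ones, w, mills])[k]      # regressors
    Z = np.column_stack([z, ones, w, mills])[k]      # instruments
    Xhat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]  # first-stage fitted values
    return np.linalg.solve(Xhat.T @ X, Xhat.T @ y[k])[0]

def bootstrap_se(data, reps, rng):
    """Resample respondents and repeat both steps in every replication."""
    n = len(data[0])
    draws = []
    for _ in range(reps):
        i = rng.integers(0, n, size=n)
        draws.append(heckman_2sls(*(d[i] for d in data)))
    return float(np.std(draws, ddof=1))

# Simulated data with a true effect of -0.08 (all values hypothetical).
rng = np.random.default_rng(0)
n = 8000
w, v, z, u = (rng.normal(size=n) for _ in range(4))
x = (2.0 * z + u + rng.normal(size=n) > 0).astype(float)   # endogenous dummy
eta = 0.6 * u + 0.8 * rng.normal(size=n)                   # selection error, variance 1
s = (0.5 + v + 0.3 * w + 0.2 * z + eta > 0).astype(float)  # selection indicator
y = -0.08 * x + 0.5 * w + 0.5 * u + 0.5 * rng.normal(size=n)
beta_hat = heckman_2sls(y, x, w, v, z, s)
se_hat = bootstrap_se((y, x, w, v, z, s), reps=30, rng=rng)
```

The bootstrap re-estimates the probit in every replication, so the sampling error of the generated Mills ratio is carried into the standard error of the second step.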
4 Verifying identifying assumptions
Throughout this section, we test the IV assumptions with the full vector of control variables included but the Mills ratios excluded, because wages (the incidentally truncated variable requiring adjustment) do not enter these tests.
We first establish the randomness of the IV, confirming the key aspect of our estimation design. We then turn to the other IV assumptions, keeping in mind that they are satisfied either by the HILDA data collection process or by the construction of the IV. We also note, however, that the IV assumptions are ultimately untestable, as they are defined in terms of unobservables, whereas we perform our tests using observables.
4.1 Instrument independence
The first assumption of our 2SLS framework that we test is that the IV must be uncorrelated with observed and unobserved characteristics of workers. To verify this assumption, we need to establish that the assignment of interviewers is random using the observable characteristics. We regress cannabis use and then the IV on various variables and compare the estimates. Table 5 reports the results. The left side of the table reports results for females, and the right side for males.
Cannabis use is highly non-random, suggesting that the OLS estimates of the wage equation would be biased. For females, the observables explain 4.5% of the variation in cannabis use. The hypothesis of joint insignificance is rejected with a $p$-value < 0.001. In contrast, no statistically significant relationship between the IV and the demographic variables is seen. The estimates are all close to zero. The model is not jointly statistically significant, with a $p$-value of 0.986. This provides evidence that interviewers are randomly assigned to respondents. It suggests that the IV is also independent of the unobserved characteristics of workers. The same conclusion can be made about the male sample.
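The logic of this balance test can be sketched as follows. The example regresses a covariate-dependent use indicator and an independently drawn instrument on the same covariates and compares the joint F-statistics. It uses the homoskedastic F-statistic rather than the robust version reported in Table 5, and the simulated variables and the function name are hypothetical.

```python
import numpy as np

def joint_f_stat(dep, covariates):
    """F-statistic for H0: all slope coefficients are zero (homoskedastic form)."""
    n, k = covariates.shape
    X = np.column_stack([np.ones(n), covariates])
    resid = dep - X @ np.linalg.lstsq(X, dep, rcond=None)[0]
    ssr = resid @ resid
    sst = ((dep - dep.mean()) ** 2).sum()
    r2 = 1.0 - ssr / sst
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

rng = np.random.default_rng(1)
n = 3743                                   # female sample size in Table 5
covs = rng.normal(size=(n, 3))             # stand-ins for age, schooling, tenure
# Use depends on the first covariate; the instrument is drawn independently.
use = (0.5 * covs[:, 0] + rng.normal(size=n) > 1.0).astype(float)
iv = rng.normal(size=n)
f_use, f_iv = joint_f_stat(use, covs), joint_f_stat(iv, covs)
```

A large statistic for use together with a small one for the instrument is the pattern that Table 5 reports through its joint $p$-values.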
4.2 Instrument relevance
We now turn to the second assumption of our 2SLS framework, IV relevance: the IV has to correlate with cannabis use. Figure 5 presents the IV's histogram. Superimposed over the histogram is the nonparametric regression of cannabis use on the IV (a flexible analog to Equation (5)).
Table 6 reports the linear version of this regression, that is, the actual Equation (5). We first report the Cragg and Donald (1993) Wald statistic, the multivariate analog of the F-test advocated by Stock and Yogo (2005). The Cragg-Donald Wald statistic is referred to as F in the bottom half of Table 6.
Olea and Pflueger (2013) showed that the F-statistic, even when adjusted for heteroskedasticity, autocorrelation, and clustering, can be misleadingly high. They developed the effective F-statistic. When the data are conditionally homoskedastic and serially uncorrelated, the effective F-statistic is identical to the Cragg and Donald (1993) statistic. It should be higher than 12.28 (this cutoff corresponds to the conventional rule-of-thumb cutoff of 10). Table 6 refers to this statistic as EF. The difference between F and EF indicates that more than half of the IV strength in our sample indeed comes from a misspecified variance-covariance matrix (e.g., Andrews, Stock, and Sun 2019).

Table 5: Test of randomization

| | Female: Cannabis use, Estimate (SE) | Female: IV, Estimate (SE) | Male: Cannabis use, Estimate (SE) | Male: IV, Estimate (SE) |
|---|---|---|---|---|
| Age | -0.00859*** (0.00212) | -0.0000468 (0.0000615) | -0.00456+ (0.00256) | -0.0000122 (0.0000755) |
| Age² | 0.0000585* (0.0000239) | 0.000000539 (0.000000696) | 0.00000702 (0.0000287) | -4.71e-08 (0.000000857) |
| Schooling | -0.00335+ (0.00190) | -0.0000130 (0.0000562) | -0.00796** (0.00283) | 0.0000204 (0.0000797) |
| Tenure | -0.00198*** (0.000478) | 0.00000278 (0.0000159) | -0.00298*** (0.000647) | 0.0000143 (0.0000192) |
| Industry (Agriculture omitted) | | | | |
| Energy | -0.0670 (0.0573) | -0.0000258 (0.00208) | -0.0541 (0.0607) | 0.00221 (0.00182) |
| Mining | -0.0653 (0.0760) | -0.00177 (0.00292) | -0.0703 (0.0489) | -0.0000690 (0.00156) |
| Manufacturing | -0.0287 (0.0598) | -0.000589 (0.00183) | -0.0149 (0.0437) | 0.000471 (0.00126) |
| Construction | 0.0543 (0.0770) | -0.00162 (0.00231) | 0.0883+ (0.0461) | -0.0000466 (0.00132) |
| Trade | 0.0133 (0.0573) | -0.0000396 (0.00171) | 0.00974 (0.0441) | -0.000119 (0.00127) |
| Transport | -0.0339 (0.0617) | 0.000713 (0.00185) | -0.0375 (0.0451) | 0.000220 (0.00132) |
| Bank/Insurance | 0.0216 (0.0610) | -0.000242 (0.00176) | -0.0369 (0.0506) | 0.000673 (0.00150) |
| Services | 0.000244 (0.0556) | -0.000333 (0.00167) | 0.0227 (0.0424) | 0.000197 (0.00122) |
| Constant | 0.397*** (0.0718) | 0.00158 (0.00210) | 0.443*** (0.0677) | 0.000357 (0.00195) |
| N | 3743 | 3743 | 3500 | 3500 |
| $R^2$ | 0.045 | -0.002 | 0.051 | -0.003 |
| $p$-value | <0.001 | 0.986 | <0.001 | 0.703 |

Notes: The table reports estimates after regressing an indicator for cannabis use and the IV on demographic characteristics. Robust standard errors are reported in parentheses. The $p$-value is for the null hypothesis that all coefficients are equal to zero. See Section 3.1 for the IV construction.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.

Figure 5: The IV and first stage. Panels: (a) Female sample; (b) Male sample.
Notes: This figure reports the distribution of the IV (see Section 3.1 for the IV construction) and the nonparametric regression of cannabis use on the IV (the first stage of the model introduced in Section 3.2). Dashed lines represent a 95% confidence interval.
Source: HILDA 2017.

Table 6: First stage
Dependent variable: Cannabis use

| | Female, Estimate (SE) | Male, Estimate (SE) |
|---|---|---|
| IV | -28.15*** (0.323) | -28.83*** (0.491) |
| N | 3,743 | 3,500 |
| F | 7,592.5 | 7,530.6 |
| EF | 2,177.2 | 3,456.0 |
| EF-MS | 180.8 | 286.0 |
| Int | 145 | 146 |

Notes: The table reports the first stage estimates. F refers to the Cragg-Donald Wald statistic; EF to the effective F-statistic; EF-MS to the effective F-statistic divided by the square root of the number of interviewers (referred to as Int). See Section 3.1 for the IV construction.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.
The limitation of using the F-test at the first stage in the context of many IVs is that the residualized leave-out IV is treated as a single variable, which inflates the F-statistic because it understates the true dimensionality of the underlying IVs (Hull 2017). Mikusheva and Sun (2021) propose a test in which the F-statistic is divided by the square root of the number of underlying IVs. Our preferred F-statistic, referred to in the table as EF-MS, is the effective F-statistic adjusted by the number of underlying interviewers. According to it, the IV is strong, with a preferred first-stage F-statistic of 180.8 for females and 286.0 for males, well above a recently advocated cutoff of 104.7 (Lee et al. 2021).⁵
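The adjustment can be checked by hand from the entries of Table 6. The short sketch below divides the effective F-statistic by the square root of the number of interviewers; the function name is hypothetical.

```python
import math

def ef_ms(effective_f, n_interviewers):
    """Effective F divided by the square root of the number of underlying IVs."""
    return effective_f / math.sqrt(n_interviewers)

female = ef_ms(2177.2, 145)  # EF and Int from Table 6, female sample
male = ef_ms(3456.0, 146)    # EF and Int from Table 6, male sample
```

Rounded to one decimal, the two values reproduce the EF-MS row of Table 6.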
4.3 Instrument monotonicity
The third assumption of the 2SLS framework is monotonicity (Imbens and Angrist 1994; Angrist 2016). The IV identifies a local average treatment effect (LATE), which compares a mix of IV always-takers and compliers with a mix of IV never-takers and compliers. Understanding this logic is particularly important for the conclusion that cannabis use is more damaging to female than to male workers. Thus, it is important to ensure that gender, rather than the complier subpopulation, drives the difference in effects. Table 6 already suggests that the complier populations are approximately identical for both genders, because the strength of the coefficients at the first stage is the same.

5. We use the STATA packages IVREG2 (Baum, Schaffer, and Stillman 2002) and WEAKIVTEST (Pflueger and Wang 2013) to assist our calculations.
Table 7: Examining monotonicity
Dependent variable: Cannabis use

| Subsample | Female, Estimate (SE) | Female, N | Male, Estimate (SE) | Male, N |
|---|---|---|---|---|
| Earnings below median | -29.46*** (0.810) | 2011 | -30.04*** (0.565) | 1956 |
| Earnings above median | -25.48*** (0.905) | 1949 | -26.71*** (0.793) | 1948 |
| Schooling below median | -27.89*** (0.692) | 2640 | -28.69*** (0.502) | 2745 |
| Schooling above median | -27.52*** (0.924) | 1657 | -28.38*** (0.778) | 1649 |
| Age below median | -30.44*** (0.673) | 2177 | -30.32*** (0.527) | 2213 |
| Age above median | -22.16*** (1.061) | 2120 | -25.59*** (0.746) | 2181 |
| Tenure below median | -29.43*** (0.732) | 2268 | -29.57*** (0.520) | 2221 |
| Tenure above median | -24.72*** (0.884) | 2029 | -27.14*** (0.722) | 2173 |
| Agriculture (=1) | -22.59*** (5.703) | 73 | -25.64*** (2.134) | 213 |
| Mining (=1) | -20.37*** (2.564) | 19 | -21.51*** (2.344) | 101 |
| Manufacturing (=1) | -24.29*** (2.330) | 169 | -30.51*** (1.141) | 545 |
| Construction (=1) | -29.09*** (3.616) | 71 | -29.14*** (1.013) | 631 |
| Trade (=1) | -29.63*** (1.993) | 555 | -29.54*** (0.987) | 593 |
| Transport (=1) | -24.04*** (3.015) | 104 | -25.56*** (1.660) | 268 |
| Bank/Insurance (=1) | -35.03*** (3.497) | 153 | -24.97*** (3.062) | 125 |
| Services (=1) | -27.59*** (0.610) | 3144 | -28.91*** (0.700) | 1880 |

Notes: The table reports the estimate of the first stage on various subsamples. Robust standard errors are reported in parentheses. See Section 3.1 for the IV construction.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.
To better understand our LATE, we compare the first stage across subpopulations. The coefficient at the first stage is larger in absolute value for the subgroups with more compliers. If, in addition, the coefficients for some subpopulations have opposite signs, then the sample has IV non-compliers, which invalidates a LATE interpretation of our results.
Table 7 reports the first stage estimates by gender and by subpopulation. For example, the first line reports two coefficients. The left coefficient restricts the sample to females and then further restricts it to those with earnings below the median in this female-only sample. The right coefficient repeats the same for males.
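The comparison across subsamples amounts to two simple checks: that no subsample coefficient flips sign, and that the coefficients stay close to the full-sample first stage. The sketch below applies both checks to the female estimates for the median splits in Table 7; the helper names are hypothetical, and passing such checks is supportive evidence rather than a proof of monotonicity.

```python
# First-stage coefficients by subsample, copied from the female column of Table 7.
female_first_stage = {
    "Earnings below median": -29.46,
    "Earnings above median": -25.48,
    "Schooling below median": -27.89,
    "Schooling above median": -27.52,
    "Age below median": -30.44,
    "Age above median": -22.16,
    "Tenure below median": -29.43,
    "Tenure above median": -24.72,
}
full_sample = -28.15  # female first stage from Table 6

def same_sign(coefs, reference):
    """True if every subsample coefficient has the sign of the reference."""
    return all(c * reference > 0 for c in coefs.values())

def max_relative_gap(coefs, reference):
    """Largest absolute deviation from the reference, as a share of it."""
    return max(abs(c - reference) / abs(reference) for c in coefs.values())

no_sign_flips = same_sign(female_first_stage, full_sample)
largest_gap = max_relative_gap(female_first_stage, full_sample)
```

For these rows, the largest deviation from the full-sample coefficient is about 21% (the age-above-median subsample), and no coefficient changes sign.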
The table supports two conclusions. First, the first stage estimate on the subsamples is nearly
the same as reported in Table 6, and they are almost the same across genders. This suggests
that our LATE is close to ATE for both genders and, thus, gender-specific estimates are directly
4.4 Instrument exclusion restriction
The exclusion restriction, in this case, is that the allocation of interviewers is orthogonal to the current wage. This follows immediately from the HILDA design, where the allocation of interviewers is not only random conditional on the area but also predetermined many years before the current wages.
To increase the assurance that a violation of the exclusion restriction is not affecting our estimates, we also conducted additional calculations using the imperfect instrument estimator developed by Nevo and Rosen (2012) and the plausibly exogenous estimator developed by Conley, Hansen, and Rossi (2012). The Nevo and Rosen (2012) estimator is suitable when the direction of the correlation of an IV with the unobserved error term is understood, but not necessarily its magnitude.
Their estimator relaxes the assumption of exogeneity and only requires that the IV demonstrates a weaker correlation with the equation error term than the endogenous variable for which it is used. This approach gives bounds that are consistent with our baseline estimates. The Conley, Hansen, and Rossi (2012) approach is well suited to circumstances in which the direction of the correlation is not known but its magnitude is known. Again, the bounds we recover give no reason to believe that the exclusion restriction is violated.⁶
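To see why a bound on the direct effect of the IV translates into a bound on the estimate, consider a stylized version of the plausibly exogenous setup. The sketch assumes a single instrument $Z$, controls already partialled out, and a hypothetical direct effect $\gamma$ of $Z$ on wages; the symbols $\gamma$, $\pi$, and $\delta$ are introduced here for illustration and are not the authors' notation.

```latex
\begin{aligned}
Y &= \beta X + \gamma Z + \varepsilon, \qquad \operatorname{Cov}(Z,\varepsilon) = 0, \\
\operatorname{plim}\hat\beta_{IV}
  &= \frac{\operatorname{Cov}(Z,Y)}{\operatorname{Cov}(Z,X)}
   = \beta + \gamma\,\frac{\operatorname{Var}(Z)}{\operatorname{Cov}(Z,X)}
   = \beta + \frac{\gamma}{\pi},
   \qquad \pi = \frac{\operatorname{Cov}(Z,X)}{\operatorname{Var}(Z)}, \\
\gamma \in [-\delta, \delta]
  \;&\Longrightarrow\;
  \beta \in \left[\operatorname{plim}\hat\beta_{IV} - \frac{\delta}{|\pi|},\;
                  \operatorname{plim}\hat\beta_{IV} + \frac{\delta}{|\pi|}\right].
\end{aligned}
```

Under these assumptions, a stronger first stage (a larger $|\pi|$) makes the estimate less sensitive to a given violation of the exclusion restriction.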
5 Results
Table 8 reports the estimates of the parameters in Equation (6) corresponding to the selection variables. These are not marginal effects; the parameters are informative only about the direction of the change in probability. The data are not restricted to employees. The variables fit the data well, and the fit is gender-dependent. Single males are predicted to work less frequently than married ones. For females, marital arrangements appear to have no relevance. The number of minors explains no variation in male employment, but for females, the first child predicts a larger reduction in employment than subsequent ones. Other household income predicts more frequent labour market participation for males, but to a lesser extent at higher values. For females, lower values of other household income do not reduce labour market participation, whereas higher values do.
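The role of the quadratic term can be read off the female coefficients on minors in Table 8. The sketch below computes the contribution of the number of minors to the probit index and its increment per additional child. It is a mechanical reading of the point estimates on the index scale (not marginal effects on the probability), and the function name is hypothetical.

```python
def index_shift(minors, b1=-0.404, b2=0.109):
    """Contribution of the number of minors to the probit index,
    using the female coefficients on Minors and Minors^2 from Table 8."""
    return b1 * minors + b2 * minors**2

shifts = [index_shift(m) for m in range(4)]
# Change in the index when moving from m - 1 to m minors, for m = 1, 2, 3.
marginal = [round(shifts[m] - shifts[m - 1], 3) for m in range(1, 4)]
```

At the point estimates, the first child lowers the index by 0.295 and the second by a further 0.077, which is the sense in which the first child predicts the larger reduction.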
6. In practice, we use the community-written STATA commands IMPERFECTIV (Matta and Clarke 2019) and PLAUSEXOG (Clarke 2020).

Table 8: Selection equation
Dependent variable: Positive earnings

| | Female, Estimate (SE) | Male, Estimate (SE) |
|---|---|---|
| Minors | -0.404*** (0.0901) | -0.0649 (0.0701) |
| Minors² | 0.109*** (0.0311) | 0.0116 (0.0203) |
| Other income | 0.0749 (0.0650) | 0.265*** (0.0457) |
| Other income² | -0.0163*** (0.00416) | -0.0322*** (0.00345) |
| Marriage status (married is omitted) | | |
| Cohabiting | -0.0758 (0.102) | -0.0760 (0.0972) |
| Single | 0.0250 (0.114) | -0.302** (0.108) |
| Controls | X | X |
| Pseudo $R^2$ | 0.1043 | 0.2182 |
| N | 4297 | 4394 |

Notes: The table shows the estimated parameters of the selection variables.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.

Table 9 and Table 10 present the estimates of the effects of cannabis use on wages for females and males, respectively. The top half of each table reports the OLS estimates; the bottom half reports the comparable 2SLS estimates. Robust standard errors are reported throughout. When the Heckman correction is applied, robustness is achieved by bootstrapping. For females, the full 2SLS model with the selection correction estimates a penalty of 8.32%. For males, the comparable estimate is imprecise and centred around a penalty of 0.22%. The null hypothesis of equality of these coefficients across models, tested with the generalized Hausman test (Greene 2011, p. 233) and the bootstrap, is rejected with a $p$-value < 0.001.
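Because the dependent variable is the log of the earnings rate, the coefficient on a dummy only approximates the percentage difference. A minimal sketch of the exact conversion, $100\,(e^{\beta} - 1)$, applied to the full 2SLS coefficients on cannabis use from Table 9 and Table 10 (the function name is hypothetical):

```python
import math

def pct_effect(beta):
    """Exact percentage change implied by a dummy coefficient in a log-wage model."""
    return 100.0 * (math.exp(beta) - 1.0)

female = pct_effect(-0.0832)   # full 2SLS estimate, Table 9
male = pct_effect(-0.00223)    # full 2SLS estimate, Table 10
```

For coefficients of this size the two readings are close: the exact conversion gives roughly 8.0% for females, against 8.3% when the coefficient is read directly as a percentage.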
The OLS estimates change substantially when additional variables are added. For both samples, the uncontrolled OLS estimates show an unrealistic penalty: 21% for females and 19% for males. Adding controls makes the estimated cannabis penalty imprecise, with the point estimate approximately the same across genders. The Heckman correction has an opposing effect on the penalty depending on gender. For females, the penalty marginally increases; for males, it drops. This suggests that the females who choose to work may be more resilient to cannabis harm. Either way, the OLS results are not informative about the penalty, as they are either imprecise or implausibly high. We also confirm endogeneity issues by applying the Durbin-Wu-Hausman test. For both genders, the full models estimated with OLS show endogeneity, with $p$-values less than 0.001.
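A common regression-based form of the Durbin-Wu-Hausman test adds the first-stage residual to the outcome equation and tests its coefficient. The sketch below illustrates that form on simulated data; it uses conventional rather than robust or bootstrapped standard errors, and the data-generating process and function names are hypothetical rather than the authors' implementation.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients and conventional (homoskedastic) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return b, np.sqrt(np.diag(sigma2 * XtX_inv))

def dwh_t_stat(y, x, z, w):
    """t-statistic on the first-stage residual added to the outcome equation."""
    ones = np.ones_like(y)
    first = np.column_stack([ones, z, w])
    v_hat = x - first @ ols(x, first)[0]          # first-stage residual
    b, se = ols(y, np.column_stack([ones, x, w, v_hat]))
    return b[3] / se[3]

rng = np.random.default_rng(2)
n = 5000
z, w, u = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
x = (z + u + rng.normal(size=n) > 0).astype(float)
# In y_endog the unobservable u shifts both use and wages; in y_exog it does not.
y_endog = -0.08 * x + 0.5 * w + u + rng.normal(size=n)
y_exog = -0.08 * x + 0.5 * w + rng.normal(size=n)
t_endog, t_exog = dwh_t_stat(y_endog, x, z, w), dwh_t_stat(y_exog, x, z, w)
```

A large t-statistic on the residual indicates that OLS and IV estimates differ systematically, which is the sense in which the test signals endogeneity.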
The 2SLS estimates exploit plausibly exogenous variation in the interviewers' allocation. The estimates are stable. The inclusion of additional variables in the model reduces standard errors with only a limited effect on the point estimates. It is known that gender wage differences (Blau and Kahn 2017) or the cannabis use penalty (Kaestner 1998) may be misrepresented if some of the factors controlled for themselves reflect the impact of discrimination or use. Our IV bypasses this potential collider bias, as we construct an IV that is orthogonal to the control variables. The control variables only improve the efficiency of the estimates by better isolating the exogenous identifying variation.

Table 9: The estimated effect of cannabis use on wages: female sample
Dependent variable (DV): Log of earnings rate

| | Estimate (SE) | Estimate (SE) | Estimate (SE) |
|---|---|---|---|
| OLS | | | |
| Cannabis use | -0.211*** (0.0324) | -0.0340 (0.0296) | -0.0390 (0.0296) |
| Age | | 0.0605*** (0.00478) | 0.0600*** (0.00479) |
| Age² | | -0.000620*** (0.0000555) | -0.000605*** (0.0000555) |
| Schooling | | 0.0651*** (0.00436) | 0.0635*** (0.00438) |
| Tenure | | 0.0109*** (0.00122) | 0.0107*** (0.00122) |
| Industry (Agriculture omitted) | | | |
| Energy | | 0.219 (0.162) | -0.00682 (0.164) |
| Mining | | 0.233 (0.176) | 0.0137 (0.178) |
| Mfg. | | -0.0414 (0.143) | -0.127 (0.143) |
| Const. | | 0.0809 (0.165) | 0.00148 (0.164) |
| Trade | | -0.0229 (0.141) | -0.124 (0.141) |
| Transport | | 0.0293 (0.148) | -0.0727 (0.147) |
| Bank/Ins. | | 0.188 (0.143) | 0.0582 (0.144) |
| Services | | 0.0225 (0.139) | -0.0840 (0.139) |
| Mills ratio | | | -0.137*** (0.0279) |
| Constant | 3.350*** (0.0109) | 1.065*** (0.179) | 1.300*** (0.180) |
| $R^2$ | 0.010 | 0.229 | 0.234 |
| 2SLS | | | |
| Cannabis use | -0.0696 (0.0449) | -0.0815* (0.0373) | -0.0832* (0.0373) |
| Age | | 0.0600*** (0.00478) | 0.0596*** (0.00479) |
| Age² | | -0.000617*** (0.0000554) | -0.000601*** (0.0000555) |
| Schooling | | 0.0649*** (0.00435) | 0.0633*** (0.00437) |
| Tenure | | 0.0108*** (0.00122) | 0.0106*** (0.00122) |
| Industry (Agriculture omitted) | | | |
| Energy | | 0.216 (0.162) | -0.0122 (0.163) |
| Mining | | 0.230 (0.176) | 0.00835 (0.178) |
| Mfg. | | -0.0425 (0.143) | -0.129 (0.143) |
| Const. | | 0.0837 (0.165) | 0.00315 (0.164) |
| Trade | | -0.0221 (0.141) | -0.124 (0.141) |
| Transport | | 0.0279 (0.148) | -0.0752 (0.147) |
| Bank/Ins. | | 0.189 (0.144) | 0.0580 (0.144) |
| Services | | 0.0225 (0.139) | -0.0851 (0.139) |
| Mills ratio | | | -0.139*** (0.0279) |
| Constant | 3.335*** (0.0115) | 1.084*** (0.179) | 1.321*** (0.180) |
| N | 3743 | 3743 | 3743 |
| $R^2$ | 0.005 | 0.229 | 0.233 |
| DV mean | 3.328 | 3.328 | 3.328 |
| DV min | -2.479 | -2.479 | -2.479 |
| DV max | 7.657 | 7.657 | 7.657 |

Notes: This table reports the effect of cannabis use on wages for females. No use is the omitted category (control group). See Section 3.1 for the IV construction. See Table 8 for the selection equation estimates. See Table 6 for the 2SLS first stage estimates. Robust standard errors are reported for estimates without the Heckman correction. For estimates with the Heckman correction, bootstrapped errors with 1000 replications are reported. Mfg. stands for manufacturing; Ins. for insurance.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.

Table 10: The estimated effect of cannabis use on wages: male sample
Dependent variable (DV): Log of earnings rate

| | Estimate (SE) | Estimate (SE) | Estimate (SE) |
|---|---|---|---|
| OLS | | | |
| Cannabis use | -0.189*** (0.0297) | -0.0214 (0.0263) | 0.00455 (0.0259) |
| Age | | 0.0828*** (0.00497) | 0.0734*** (0.00488) |
| Age² | | -0.000859*** (0.0000587) | -0.000711*** (0.0000579) |
| Schooling | | 0.0681*** (0.00464) | 0.0602*** (0.00453) |
| Tenure | | 0.00870*** (0.00129) | 0.00869*** (0.00125) |
| Industry (Agriculture omitted) | | | |
| Energy | | 0.686*** (0.0864) | 0.0839 (0.0976) |
| Mining | | 0.529*** (0.0916) | -0.0845 (0.102) |
| Mfg. | | 0.149* (0.0625) | -0.0985 (0.0629) |
| Const. | | 0.207** (0.0639) | 0.0792 (0.0626) |
| Trade | | 0.0629 (0.0627) | -0.221*** (0.0645) |
| Transport | | 0.131+ (0.0792) | -0.171* (0.0800) |
| Bank/Ins. | | 0.417*** (0.0758) | 0.0685 (0.0782) |
| Services | | 0.195** (0.0609) | -0.0341 (0.0616) |
| Mills ratio | | | -0.272*** (0.0218) |
| Constant | 3.477*** (0.0125) | 0.564*** (0.113) | 1.390*** (0.122) |
| $R^2$ | 0.010 | 0.229 | 0.234 |
| 2SLS | | | |
| Cannabis use | -0.0501 (0.0366) | -0.0452 (0.0321) | -0.00223 (0.0315) |
| Age | | 0.0827*** (0.00496) | 0.0734*** (0.00486) |
| Age² | | -0.000859*** (0.0000586) | -0.000712*** (0.0000578) |
| Schooling | | 0.0679*** (0.00464) | 0.0602*** (0.00452) |
| Tenure | | 0.00862*** (0.00129) | 0.00867*** (0.00125) |
| Industry (Agriculture omitted) | | | |
| Energy | | 0.685*** (0.0860) | 0.0846 (0.0975) |
| Mining | | 0.527*** (0.0915) | -0.0838 (0.102) |
| Mfg. | | 0.150* (0.0625) | -0.0981 (0.0628) |
| Const. | | 0.209** (0.0638) | 0.0801 (0.0625) |
| Trade | | 0.0635 (0.0626) | -0.220*** (0.0644) |
| Transport | | 0.130+ (0.0791) | -0.170* (0.0798) |
| Bank/Ins. | | 0.416*** (0.0758) | 0.0689 (0.0781) |
| Services | | 0.196** (0.0608) | -0.0335 (0.0615) |
| Mills ratio | | | -0.272*** (0.0218) |
| Constant | 3.453*** (0.0134) | 0.574*** (0.113) | 1.392*** (0.122) |
| N | 3500 | 3500 | 3500 |
| $R^2$ | 0.005 | 0.294 | 0.323 |
| DV mean | 3.445 | 3.445 | 3.445 |
| DV min | -3.937 | -3.937 | -3.937 |
| DV max | 8.204 | 8.204 | 8.204 |

Notes: This table reports the effect of cannabis use on wages for males. No use is the omitted category (control group). See Section 3.1 for the IV construction. See Table 8 for the selection equation estimates. See Table 6 for the 2SLS first stage estimates. Robust standard errors are reported for estimates without the Heckman correction. For estimates with the Heckman correction, bootstrapped errors with 1000 replications are reported. Mfg. stands for manufacturing; Ins. for insurance.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.
To understand the economic significance of the estimated penalty, note that the observational gender wage gap shown in Table 2 is 11%. Thus, the penalty can almost fully account for the gender gap. One limitation is that the female penalty is estimated relative to the wages of female non-users, not relative to non-users of both genders. Because female wages are lower, the penalty expressed as a per cent change is also lower. To understand the size of the female penalty relative to all non-using workers, we pool the female and male samples and estimate our model with gender fixed effects and with the control variables and cannabis use interacted with gender. Now workers of both genders who do not use cannabis are the control group, which accounts for the preexisting gender gap. Panel D of Table 11 shows that in this specification, the female penalty is equal to
5.1 Results robustness
Table 11 shows that our results are robust to alternative data modelling decisions. All reported results are from our full model, including the controls, the selection equation, and bootstrapped standard errors.
In Section 3.1, we mentioned that, because cannabis use is a relatively rare event, removing the idiosyncratic component with a leave-one-out mean creates a negative correlation at the first stage of our 2SLS estimator. Panel A of Table 11 shows that the results stay largely the same if we skip the leave-one-out step completely and use the residual as the IV. Under this specification, the IV correlates positively with cannabis use, while the estimates for both genders shift in the positive direction. The female penalty is reduced to 5.7%, while for males, we estimate an imprecise cannabis use premium.
We also mentioned that removing the idiosyncratic component with a leave-one-out mean requires a reasonably large number of individuals interviewed by each interviewer. Our preferred choice is to exclude the extreme IV values. As explained earlier, this automatically excludes almost all interviewers with few respondents and avoids the sensitivity of the IV estimates to outliers. Panel B of Table 11 shows that the results are the same if we drop the observations for interviewers with fewer than 20 respondents instead of excluding the top and bottom 1% of the IV.
The standard wage equation does not include industry dummies. One might consider industry an endogenous choice that partially stems from the gender or cannabis wage penalties. Panel C of Table 11 shows that omitting the industry variables leaves the estimates statistically the same.

Table 11: The estimated effect of cannabis use on wages: main results robustness

| | Estimate (SE) | $\bar{R}^2$ | N | First stage (SE) | EF-MS |
|---|---|---|---|---|---|
| A: Residualizing without leaving one out | | | | | |
| Female | -0.0574+ (0.0330) | 0.237 | 3741 | 1.039*** (0.00829) | 378.2 |
| Male | 0.0178 (0.0296) | 0.322 | 3508 | 1.040*** (0.0100) | 431.9 |
| B: At least 20 respondents per interviewer keeping top and bottom 1% of IV | | | | | |
| Female | -0.0874* (0.0384) | 0.231 | 3497 | -27.15*** (0.311) | 220.1 |
| Male | 0.000389 (0.0369) | 0.317 | 3340 | -29.83*** (0.411) | 310.1 |
| C: No industry of occupation | | | | | |
| Female | -0.0739* (0.0371) | 0.225 | 3849 | -28.81*** (0.581) | 187.2 |
| Male | 0.0267 (0.0313) | 0.310 | 3624 | -29.86*** (0.474) | 291.1 |
| D: Penalty relative to non-users of both genders | | | | | |
| Female | -0.0995** (0.0343) | 0.202 | 3,743 | -28.15*** (0.323) | 180.8 |
| Male | -0.0438 (0.0329) | 0.228 | 3,500 | -28.83*** (0.491) | 286.0 |

Notes: Panel A shows estimates when Equation (3) is skipped. Panel B shows estimates when Equation (3) is included, but interviewers with too few respondents are excluded. Panel C excludes industry indicators. Panel D reports results when penalties for both genders are estimated jointly, and no use for both genders is the omitted category (control group). See Section 3.1 for the IV construction. See Table 8 for the selection equation estimates. Robust standard errors are bootstrapped with 1000 replications. EF-MS is the effective F-statistic divided by the square root of the number of interviewers.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.
5.2 Additional results
The top of Table 12 reports estimates when the dummy variable for cannabis use is replaced with an array of dummies for at-least-monthly and less-than-monthly use. The omitted category is still no cannabis use, which serves as the control group. The OLS estimates again demonstrate instability when covariates are added. The 2SLS estimates also show instability when control variables are included.
We showed that the IV is independent of selection into cannabis use, but it is not independent of self-selection into high or low use. This selection correlates with age (younger users are more likely to be heavy smokers); therefore, including age, which is part of the control vector, influences the point estimate of the IV model more than in the main results tables. The full 2SLS specification shows that the female wage penalty is concentrated among heavy users. For males, the results are both statistically and economically insignificant. Peculiarly, infrequent male users show a wage premium.

Table 12: The estimated effect of cannabis use on wages: by use frequency and initiation
Dependent variable: Log of earnings rate

| Model | Ctrl | HC | Usage | Female, Estimate (SE) | Female, N | Female, $\bar{R}^2$ | Male, Estimate (SE) | Male, N | Male, $\bar{R}^2$ |
|---|---|---|---|---|---|---|---|---|---|
| OLS | | | ≥Monthly | -0.226*** (0.0492) | 3743 | 0.011 | -0.247*** (0.0379) | 3500 | 0.011 |
| | | | <Monthly | -0.165*** (0.0346) | | | -0.138*** (0.0412) | | |
| OLS | X | | ≥Monthly | -0.0352 (0.0475) | 3743 | 0.229 | -0.0508 (0.0331) | 3500 | 0.294 |
| | | | <Monthly | -0.0218 (0.0300) | | | 0.00409 (0.0371) | | |
| OLS | X | X | ≥Monthly | -0.0722 (0.0520) | 3743 | 0.234 | -0.0277 (0.0326) | 3500 | 0.323 |
| | | | <Monthly | -0.0209 (0.0329) | | | 0.0325 (0.0364) | | |
| 2SLS | | | ≥Monthly | -0.338*** (0.0804) | 3743 | 0.010 | -0.212*** (0.0386) | 3500 | 0.011 |
| | | | <Monthly | -0.141*** (0.0380) | | | -0.0903* (0.0444) | | |
| 2SLS | X | | ≥Monthly | -0.151* (0.0743) | 3743 | 0.229 | -0.0604+ (0.0362) | 3500 | 0.294 |
| | | | <Monthly | -0.0172 (0.0318) | | | 0.0000323 (0.0410) | | |
| 2SLS | X | X | ≥Monthly | -0.156* (0.0741) | 3743 | 0.233 | -0.0276 (0.0354) | 3500 | 0.323 |
| | | | <Monthly | -0.0209 (0.0319) | | | 0.0417 (0.0402) | | |

| Model | Ctrl | HC | Usage | Female, Estimate (SE) | Female, N | Female, $\bar{R}^2$ | Male, Estimate (SE) | Male, N | Male, $\bar{R}^2$ |
|---|---|---|---|---|---|---|---|---|---|
| OLS | | | ≤3 years | -0.500*** (0.0667) | 3733 | 0.015 | -0.736*** (0.109) | 3482 | 0.025 |
| | | | >3 years | -0.153*** (0.0358) | | | -0.118*** (0.0284) | | |
| OLS | X | | ≤3 years | -0.0760 (0.0670) | 3733 | 0.229 | -0.151 (0.106) | 3482 | 0.295 |
| | | | >3 years | -0.0270 (0.0322) | | | -0.0114 (0.0261) | | |
| OLS | X | X | ≤3 years | -0.0740 (0.0672) | 3733 | 0.233 | -0.104 (0.105) | 3482 | 0.324 |
| | | | >3 years | -0.0332 (0.0318) | | | 0.0111 (0.0257) | | |
| 2SLS | | | ≤3 years | -0.500*** (0.0894) | 3733 | 0.015 | -0.647*** (0.103) | 3482 | 0.025 |
| | | | >3 years | -0.157*** (0.0431) | | | -0.0833** (0.0314) | | |
| 2SLS | X | | ≤3 years | -0.102 (0.0898) | 3733 | 0.229 | -0.0981 (0.101) | 3482 | 0.295 |
| | | | >3 years | -0.0604 (0.0381) | | | -0.0294 (0.0297) | | |
| 2SLS | X | X | ≤3 years | -0.0989 (0.0894) | 3733 | 0.233 | -0.0380 (0.100) | 3482 | 0.323 |
| | | | >3 years | -0.0657+ (0.0381) | | | 0.00469 (0.0292) | | |

Notes: This table reports the results from our models when the drug use dummy is interacted with use intensity and use initiation. No use is the omitted category (control group). See Section 3.1 for the IV construction. See Table 8 for the selection equation estimates. See Table 6 for the 2SLS first stage estimates. Robust standard errors are reported for estimates without the Heckman correction. For estimates with the Heckman correction, bootstrapped errors with 1000 replications are reported. Ctrl stands for controls, HC for Heckman correction.
+ $p$ < 0.1, * $p$ < 0.05, ** $p$ < 0.01, *** $p$ < 0.001.
Source: HILDA 2017.
The bottom of Table 12 reports estimates where the array of dummies for high- and low-intensity use is replaced with an array of dummies for recent (initiation within the last three years) and long-term (more than three years) use. This model again assumes that selection into recent or long-term use is exogenous conditional on the selected control vector. That is why the 2SLS estimates change when controls are included. For females, both recent and long-term use are associated with a wage penalty. However, only long-term use is significant, and only at the 10% level.
6 Discussion
As noted in Loewenstein and Rick (2010), economists tend to play catch-up with researchers in other disciplines when it comes to their understanding of addiction or its influence on users and society. The same is true for the effects of cannabis use on productivity.
Numerous medical studies have shown reduced cognition and poorer mental and physical health in response to cannabis use, particularly for females. However, substantial uncertainty has remained in empirical economics as to whether using cannabis has adverse labour market effects (van Ours and Williams 2015). Therefore, our study estimates the wage penalty of cannabis use with an explicit focus on gender differences.
To overcome the endogeneity issues, we generate an IV that simulates the treatment status. To ensure the validity of our IV, we leverage interviewer assignment: interviewers are, by design, assigned to respondents randomly and independently of respondents' current outcomes. Our monotonicity test shows that our LATE is likely very close to the ATE. Thus, our results are relevant to the underlying population.
Our findings highlight that cannabis use can have substantial and potentially long-lasting effects on productivity in general and, most notably, on the gender wage gap. For female workers, the cannabis use wage penalty is about 8.3%, whereas for males, the penalty is not statistically distinguishable from zero.
In the 2016 wave of the National Drug Strategy Household Survey, about 8% of respondents replied that they would initiate cannabis use or increase their current consumption if cannabis were legalized. Of that 8%, approximately 45% were female. As the wage penalty is much larger for females, an immediate consequence of legalization would be a reduction in gender wage equality. A reduction in gender wage equality would, in turn, increase, for example, domestic violence (Aizer 2010; Anderberg et al. 2015).
This does not mean that cannabis legalization should be avoided, as the ultimate policy choice depends on a host of factors that our paper does not address. Optimal public policy on addiction requires expertise in addiction, crime, and psychology, while our study is informative about only one relatively narrow aspect of drug use.
7 Declarations
Declarations of interest: none.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
The funding sources had no involvement in the conduct of the research; the preparation of the article; the collection, analysis, and interpretation of data; the writing of the report; or the decision to submit the article for publication.
This paper uses unit record data from the Household, Income and Labour Dynamics in Australia (HILDA) Survey conducted by the Australian Government Department of Social Services (DSS). The findings and views reported in this paper, however, are those of the authors and should not be attributed to the Australian Government, DSS, or any of DSS' contractors or partners.
Aizer, Anna. 2010. “The Gender Wage Gap and Domestic Violence.” American Economic Review
100, no. 4 (September): 1847–59. doi:10.1257/aer.100.4.1847.
articles?id=10.1257/aer.100.4.1847. (Cited on page 31).
Alexeev, Sergey, and Don Weatherburn. 2022. “Fines for illicit drug use do not prevent future
crime: evidence from randomly assigned judges. Journal of Economic Behavior & Organization
200:555–575. issn: 0167-2681. doi:https: / / doi . org /10 . 1016 / j . jebo . 2022. 06 . 015.https :
// (Cited on page 13).
Alshaarawy, Omayma. 2019. “Total and differential white blood cell count in cannabis users:
results from the cross-sectional National Health and Nutrition Examination Survey,
2005–2016.” Journal of Cannabis Research 1, no. 1 (July): 6. issn: 2522-5782.
doi:10.1186/s42238-019-0007-8. (Cited on page 2).
Anderberg, Dan, Helmut Rainer, Jonathan Wadsworth, and Tanya Wilson. 2015. “Unemployment
and Domestic Violence: Theory and Evidence.” The Economic Journal 126, no. 597 (October):
1947–1979. issn: 0013-0133. doi:10.1111/ecoj.12246. (Cited on page 31).
Andrews, Isaiah, James H. Stock, and Liyang Sun. 2019. “Weak Instruments in Instrumental
Variables Regression: Theory and Practice.” Annual Review of Economics 11 (1): 727–753.
doi:10.1146/annurev-economics-080218-025643. (Cited on page 21).
Angrist, Joshua D. 2016. “Treatment Effect.” In The New Palgrave Dictionary of Economics, 1–8.
London: Palgrave Macmillan UK. isbn: 978-1-349-95121-5. doi:10.1057/978-1-349-95121-5_2533-1.
(Cited on page 21).
Arseneault, Louise, Mary Cannon, John Witton, and Robin M. Murray. 2004. “Causal association
between cannabis and psychosis: examination of the evidence.” British Journal of Psychiatry
184 (2): 110–117. doi:10.1192/bjp.184.2.110. (Cited on page 2).
Australian Bureau of Statistics. 2016. 1270.0.55.001—Australian Statistical Geography Standard
(ASGS): Volume 1—Main Structure and Greater Capital City Statistical Areas July 2016.
https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by%20Subject/1270.0.55.001July%202016Main%20FeaturesStatistical%20Area%20Level%203%20(SA3)10015.
(Cited on page 11).
Autor, David H., Frank Levy, and Richard J. Murnane. 2003. “The Skill Content of Recent
Technological Change: An Empirical Exploration.” The Quarterly Journal of Economics 118, no. 4
(November): 1279–1333. issn: 0033-5533. doi:10.1162/003355303322552801. eprint:
https://academic.oup.com/qje/article-pdf/118/4/1279/5427313/118-4-1279.pdf. (Cited on page 10).
Barnighausen, Till, Jacob Bor, Speciosa Wandira-Kazibwe, and David Canning. 2011. “Correcting
HIV Prevalence Estimates for Survey Nonparticipation Using Heckman-type Selection Mod-
els.” Epidemiology 22 (1). issn: 1044-3983.
01000/Correcting HIV Prevalence Estimates for Survey.5.aspx. (Cited on page 4).
Bartik, Timothy J. 1991. Who Benefits from State and Local Economic Development Policies? W.E.
Upjohn Institute. isbn: 9780880991131, accessed July 10, 2022.
http://www.jstor.org/stable/j.ctvh4zh1q. (Cited on page 3).
Baum, Christopher F, Mark E Schaffer, and Steven Stillman. 2002. IVREG2: Stata module for
extended instrumental variables/2SLS and GMM estimation. Statistical Software Components,
Boston College Department of Economics, April. https://ideas.repec.org/c/boc/bocode/s425401.html.
(Cited on page 21).
Becker, Gary S. 1985. “Human Capital, Effort, and the Sexual Division of Labor.” Journal of Labor
Economics 3 (1, Part 2): S33–S58. doi:10.1086/298075. (Cited on page 9).
Blau, Francine D., and Lawrence M. Kahn. 1997. “Swimming Upstream: Trends in the Gender
Wage Differential in the 1980s.” Journal of Labor Economics 15 (1): 1–42. issn: 0734306X,
15375307. (Cited on page 7).
Blau, Francine D., and Lawrence M. Kahn. 2006. “The U.S. Gender Pay Gap in the 1990s: Slowing
Convergence.” ILR Review 60 (1): 45–66. doi:10.1177/001979390606000103. (Cited on page 7).
Blau, Francine D., and Lawrence M. Kahn. 2017. “Women’s Work and Wages.” In The New Palgrave
Dictionary of Economics, 1–14. London: Palgrave Macmillan UK. isbn: 978-1-349-95121-5.
doi:10.1057/978-1-349-95121-5_2207-1. (Cited on pages 7, 9, 27).
Breuer, Matthias. 2021. “Bartik instruments: An applied introduction.” Journal of Financial
Reporting, forthcoming. (Cited on page 3).
Burgess, Simon M., and Carol Propper. 1998. “Early health-related behaviours and their impact
on later life chances: evidence from the US.” Health Economics 7 (5): 381–399. doi:https://;2-B. eprint: https:
//onlinelibrary.wiley. com/doi/pdf/10.1002/%28SICI%291099-1050%28199808%297%
HEC359%3E3.0.CO%3B2-B. (Cited on page 3).
Clarke, Damian. 2020. PLAUSEXOG: Stata module to implement Conley et al’s plausibly exogenous
bounds. (Cited on page 23).
Conley, Timothy, Christian Hansen, and Peter Rossi. 2012. “Plausibly Exogenous.” The Review
of Economics and Statistics 94 (1): 260–272. doi:10.1162/REST_a_00139. (Cited on page 23).
Craft, Rebecca M. 2005. “Sex differences in behavioral effects of cannabinoids.” Life Sciences 77
(20): 2471–2478. issn: 0024-3205. doi:10.1016/j.lfs.2005.04.019. (Cited on page 2).
Cragg, John G., and Stephen G. Donald. 1993. “Testing Identifiability and Specification in In-
strumental Variable Models.” Econometric Theory 9 (2): 222–240. issn: 02664666, 14694360. (Cited on page 18).
Crean, Rebecca D., Natania A. Crane, and Barbara J. Mason. 2011. “An Evidence-Based Review
of Acute and Long-Term Effects of Cannabis Use on Executive Cognitive Functions.” Journal
of Addiction Medicine 5 (1). issn: 1932-0620.
dicine/Fulltext/2011/03000/An Evidence Based Review of Acute and Long Term.1.aspx.
(Cited on page 2).
Currie, Janet, and Jonathan Gruber. 1996. “Health Insurance Eligibility, Utilization of Medical
Care, and Child Health*.” The Quarterly Journal of Economics 111, no. 2 (May): 431–466. issn:
0033-5533. doi:10.2307/2946684. (Cited on page 4).
Davis, R. E., M. P. Couper, N. K. Janz, C. H. Caldwell, and K. Resnicow. 2009. “Interviewer effects
in public health surveys.” Health Education Research 25, no. 1 (September): 14–26. issn: 0268-1153.
doi:10.1093/her/cyp046. (Cited on page 6).
Degenhardt, Louisa, Wayne Hall, and Michael Lynskey. 2003. “Exploring the association between
cannabis use and depression.” Addiction 98 (11): 1493–1504. doi:
1360-0443.2003.00437.x. eprint:
00437.x. (Cited on page 2).
DeSimone, Jeff. 2002. “Illegal Drug Use and Employment.” Journal of Labor Economics 20 (4):
952–977. doi:10.1086/342893. (Cited on page 3).
Evdokimov, Kirill S, and Michal Kolesár. 2018. “Inference in Instrumental Variables Analysis
with Heterogeneous Treatment Effects.” Working paper.
het iv inf.pdf. (Cited on page 16).
Fattore, Liana, and Walter Fratta. 2010. “How important are sex differences in cannabinoid
action?” British Journal of Pharmacology 160 (3): 544–548. doi:10.1111/j.1476-5381.2010.00776.x.
eprint: https://bpspubs.onlinelibrary.wiley.com/doi/pdf/10.1111/j.1476-5381.2010.00776.x.
https://bpspubs.onlinelibrary.wiley.com/doi/abs/10.1111/j.1476-5381.2010.00776.x. (Cited on page 2).
Filbey, Francesca M., Sina Aslan, Vince D. Calhoun, Jeffrey S. Spence, Eswar Damaraju, Arvind
Caprihan, and Judith Segall. 2014. “Long-term effects of marijuana use on the brain.”
Proceedings of the National Academy of Sciences. issn: 0027-8424. doi:10.1073/pnas.1415297111.
(Cited on page 2).
Flores-Macias, Francisco, and Chappell Lawson. 2008. “Effects of Interviewer Gender on
Survey Responses: Findings from a Household Survey in Mexico.” International Journal of Public
Opinion Research 20, no. 1 (February): 100–110. issn: 0954-2892. doi:10.1093/ijpor/edn007.
eprint: https://academic.oup.com/ijpor/article-pdf/20/1/100/2208706/edn007.pdf. (Cited on page 6).
Freeman, Daniel, Graham Dunn, Robin M. Murray, Nicole Evans, Rachel Lister, Angus Antley,
Mel Slater, et al. 2014. “How Cannabis Causes Paranoia: Using the Intravenous
Administration of THC to Identify Key Cognitive Mechanisms Leading to Paranoia.” Schizophrenia
Bulletin 41, no. 2 (July): 391–399. issn: 0586-7614. doi:10.1093/schbul/sbu098. (Cited on page 2).
French, Michael T., M. Christopher Roebuck, and Pierre Kébreau Alexandre. 2001. “Illicit Drug
Use, Employment, and Labor Force Participation.” Southern Economic Journal 68 (2): 349–368.
issn: 00384038. (Cited on page 3).
Gill, Andrew M., and Robert J. Michaels. 1992. “Does Drug Use Lower Wages?” ILR Review 45
(3): 419–434. doi:10.1177/001979399204500301. (Cited on page 2).
Goldin, C.D., P.E.C. Goldin, and National Bureau of Economic Research. 1990. Understanding the
Gender Gap: An Economic History of American Women. American studies collection. Oxford
University Press. isbn: 9780195050776.
tAAAAMAAJ. (Cited on page 10).
Goldsmith-Pinkham, Paul, Isaac Sorkin, and Henry Swift. 2020. “Bartik Instruments: What, When,
Why, and How.” American Economic Review 110, no. 8 (August): 2586–2624.
doi:10.1257/aer.20181047. (Cited on page 3).
Greene, W.H. 2011. Econometric Analysis. Pearson Education. isbn: 9780132997898. https://book (Cited on page 24).
Hall, Wayne. 2015. “What has research over the past two decades revealed about the adverse
health effects of recreational cannabis use?” Addiction 110 (1): 19–35. doi:10.1111/add.12703.
(Cited on page 2).
Hall, Wayne, and Louisa Degenhardt. 2009. “Adverse health effects of non-medical cannabis use.”
The Lancet 374 (9698): 1383–1391. issn: 0140-6736. doi:https://doi.org/10.1016/S0140-
(Cited on page 2).
Heckman, James J. 1979. “Sample Selection Bias as a Specification Error.” Econometrica 47 (1):
153–161. issn: 00129682, 14680262. http://www.jstor.org/stable/1912352. (Cited on page 9).
Heckman, James J, Lance J Lochner, and Petra E Todd. 2003. Fifty Years of Mincer Earnings Regres-
sions. Working Paper, Working Paper Series 9732. National Bureau of Economic Research,
May. doi:10.3386/w9732. (Cited on page 7).
Hull, Peter. 2017. “Examiner Designs and First-Stage F Statistics: A Caution.” Unpublished
Working Paper. (Cited on page 21).
Imbens, Guido W., and Joshua D. Angrist. 1994. “Identification and Estimation of Local Average
Treatment Effects.” Econometrica 62 (2): 467–475. issn: 00129682, 14680262. http: / /www. (Cited on page 21).
Johnson, Timothy P., Michael Fendrich, Chitra Shaligram, Anthony Garcy, and Samuel Gillespie.
2000. “An Evaluation of the Effects of Interviewer Characteristics in an RDD Telephone
Survey of Drug Use.” Journal of Drug Issues 30 (1): 77–101. doi:10.1177/002204260003000105.
(Cited on page 6).
Kaestner, Robert. 1991. “The Effect of Illicit Drug Use on the Wages of Young Adults.” Journal
of Labor Economics 9 (4): 381–412. doi:10.1086/298274. (Cited on page 2).
Kaestner, Robert. 1994a. “New Estimates of the Effect of Marijuana and Cocaine Use on Wages.”
Industrial and Labor Relations Review 47 (3): 454–470. issn: 00197939, 2162271X. (Cited on page 2).
Kaestner, Robert. 1994b. “The Effect of Illicit Drug Use on the Labor Supply of Young Adults.”
The Journal of Human Resources 29 (1): 126–155. issn: 0022166X. (Cited on page 2).
Kaestner, Robert. 1998. “Illicit Drug Use and Labor Market Outcomes: A Review of Economic Theory
and its Empirical Implications.” Journal of Drug Issues 28 (3): 663–680.
doi:10.1177/002204269802800306. (Cited on pages 3, 27).
Kempker, Jordan A., Eric G. Honig, and Greg S. Martin. 2015. “The Effects of Marijuana Exposure
on Expiratory Airflow. A Study of Adults who Participated in the U.S. National Health and
Nutrition Examination Study.” PMID: 25521349, Annals of the American Thoracic Society 12
(2): 135–141. doi:10.1513/AnnalsATS.201407-333OC. (Cited on page 2).
Khanji, Mohammed Y., Magnus T. Jensen, Asmaa A. Kenawy, Zahra Raisi-Estabragh, Jose M.
Paiva, Nay Aung, Kenneth Fung, et al. 2020. “Association Between Recreational Cannabis
Use and Cardiac Structure and Function.” JACC: Cardiovascular Imaging 13 (3): 886–888. issn:
1936-878X. doi:
com/science/article/pii/S1936878X19310095. (Cited on page 2).
Kolesár, Michal, Raj Chetty, John Friedman, Edward Glaeser, and Guido W. Imbens. 2015.
“Identification and Inference With Many Invalid Instruments.” Journal of Business & Economic
Statistics 33 (4): 474–484. doi:10.1080/07350015.2014.978175. (Cited on page 16).
Larcker, David F., and Tjomme O. Rusticus. 2010. “On the use of instrumental variables in
accounting research.” Journal of Accounting and Economics 49 (3): 186–205. issn: 0165-4101.
doi:10.1016/j.jacceco.2009.11.004. https://www.sciencedirect.com/science/article/pii/S0165410109000718.
(Cited on page 4).
Lee, David S, Justin McCrary, Marcelo J Moreira, and Jack R Porter. 2021. “Valid t-ratio Inference
for IV.” American Economic Review (forthcoming).
1257/aer.20211063. (Cited on page 21).
Loewenstein, George, and Scott Rick. 2010. “Addiction.” In Behavioural and Experimental
Economics, edited by Steven N. Durlauf and Lawrence E. Blume, 1–5. London: Palgrave
Macmillan UK. isbn: 978-0-230-28078-6. doi:10.1057/9780230280786_1. (Cited on page 30).
Matta, Benjamín, and Damian Clarke. 2019. IMPERFECTIV: Stata module to estimate bounds with
“Imperfect Instrumental Variables” (Nevo and Rosen, 2012).
https://EconPapers.repec.org/RePEc:boc:bocode:s458363. (Cited on page 23).
Melberg, Hans Olav, Andrew M. Jones, and Anne Line Bretteville-Jensen. 2010. “Is cannabis a
gateway to hard drugs?” Empirical Economics 38, no. 3 (June): 583–603. issn: 1435-8921.
doi:10.1007/s00181-009-0280-z. (Cited on page 2).
Mikusheva, Anna, and Liyang Sun. 2021. “Inference with Many Weak Instruments.” Rdab097,
The Review of Economic Studies (December). issn: 0034-6527. doi:10.1093/restud/rdab097.
eprint: https://academic.oup.com/restud/advance-article-pdf/doi/10.1093/restud/rdab097/42562989/rdab097.pdf.
(Cited on page 21).
Mogstad, Magne, and Matthew Wiswall. 2016. “Testing the quantity–quality model of fertility:
Estimation using unrestricted family size models.” Quantitative Economics 7 (1): 157–192.
doi: eprint:
3982/QE322. (Cited on page 17).
Mulligan, Casey B., and Yona Rubinstein. 2008. “Selection, Investment, and Women’s Relative
Wages over Time.” The Quarterly Journal of Economics 123 (3): 1061–1110. issn: 00335533,
15314650. (Cited on page 9).
Nevo, Aviv, and Adam M. Rosen. 2012. “Identification With Imperfect Instruments.” The Review
of Economics and Statistics 94 (3): 659–671. doi:10.1162/REST_a_00171. (Cited on page 23).
Olea, José Luis Montiel, and Carolin Pflueger. 2013. “A Robust Test for Weak Instruments.”
Journal of Business & Economic Statistics 31 (3): 358–369. doi:10.1080/00401706.2013.806694.
(Cited on page 18).
Pedersen, Willy, and Torbjørn Skardhamar. 2010. “Cannabis and crime: findings from a longitu-
dinal study.” Addiction 105 (1): 109–118. doi:
02719.x. eprint:
x. (Cited on
page 2).
Pflueger, Carolin, and Su Wang. 2013. WEAKIVTEST: Stata module to perform weak instrument test
for a single endogenous regressor in TSLS and LIML.
s457732.html. (Cited on page 21).
Pudney, Stephen. 2014. “Drugs policy: what should we do about cannabis?” Economic Policy 25,
no. 61 (August): 165–211. issn: 0266-4658. doi:10.1111/j.1468-0327.2009.00236.x.
(Cited on page 2).
Register, Charles A., and Donald R. Williams. 1992. “Labor Market Effects of Marijuana and
Cocaine Use among Young Men.” ILR Review 45 (3): 435–448. doi:10.1177/001979399204500302.
(Cited on page 2).
Reiss, Peter C., and Frank A. Wolak. 2007. “Chapter 64 Structural Econometric Modeling:
Rationales and Examples from Industrial Organization,” edited by James J. Heckman and
Edward E. Leamer, 6:4277–4415. Handbook of Econometrics. Elsevier. doi:10.1016/S1573-4412(07)06064-3.
https://www.sciencedirect.com/science/article/pii/S1573441207060643. (Cited on page 4).
Shover, Chelsea L., and Keith Humphreys. 2019. “Six policy lessons relevant to cannabis
legalization.” PMID: 30870053, The American Journal of Drug and Alcohol Abuse 45 (6): 698–706.
doi:10.1080/00952990.2019.1569669. (Cited on page 2).
Simpson, Mark. 2003. “The relationship between drug use and crime: a puzzle inside an enigma.”
International Journal of Drug Policy 14 (4): 307–319. issn: 0955-3959. doi:
1016/S0955-3959(03)00081-1. https://www.sciencedirect.com/science/article/pii/S0955395903000811.
(Cited on page 2).
Stock, James H., Jonathan H. Wright, and Motohiro Yogo. 2002. “A Survey of Weak Instruments
and Weak Identification in Generalized Method of Moments.” Journal of Business & Economic
Statistics 20 (4): 518–529. issn: 07350015, accessed July 19, 2022.
http://www.jstor.org/stable/1392421. (Cited on page 16).
Stock, James, and Motohiro Yogo. 2005. “Asymptotic Distributions of Instrumental Variables
Statistics with Many Instruments.” In Identification and Inference for Econometric Models, edited
by Donald W.K. Andrews, 109–120. New York: Cambridge University Press.
http://www.economics.harvard.edu/faculty/stock/files/AsymptoticDistrib Stock%5C%2BYogo.pdf.
(Cited on page 18).
Struik, Dicky, Fabrizio Sanna, and Liana Fattore. 2018. “The Modulating Role of Sex and Anabolic-
Androgenic Steroid Hormones in Cannabinoid Sensitivity.” Frontiers in Behavioral
Neuroscience 12:249. issn: 1662-5153. doi:10.3389/fnbeh.2018.00249.
https://www.frontiersin.org/article/10.3389/fnbeh.2018.00249. (Cited on page 2).
Summerfield, Michelle, Brooke Garrard, Markus Hahn, Yihua Jin, Roopah Kamath, Ninette Macalalad,
Nicole Watson, Roger Wilkins, and Mark Wooden. 2021. “HILDA user manual release 20.”
Melbourne Institute of Applied Economic and Social Research, University of Melbourne.
https://melbourneinstitute.unimelb.edu.au/hilda/for-data-users/user-manuals. (Cited on page 5).
van Ours, Jan C. 2003. “Is cannabis a stepping-stone for cocaine?” Journal of Health Economics
22 (4): 539–554. issn: 0167-6296. doi:10.1016/S0167-6296(03)00005-5. (Cited on page 2).
van Ours, Jan C. 2006a. “Cannabis, cocaine and jobs.” Journal of Applied Econometrics 21 (7): 897–917.
doi: eprint:
1002/jae.868. (Cited on page 3).
van Ours, Jan C. 2006b. “Dynamics in the use of drugs.” Health Economics 15 (12): 1283–1294.
doi:10.1002/hec.1128. (Cited on page 2).
van Ours, Jan C., and Jenny Williams. 2011. “Cannabis use and mental health problems.” Journal
of Applied Econometrics 26 (7): 1137–1156. doi:10.1002/jae.1182. (Cited on page 2).
van Ours, Jan C., and Jenny Williams. 2012. “The effects of cannabis use on physical and mental
health.” Journal of Health Economics 31 (4): 564–577. issn: 0167-6296. (Cited on page 2).
van Ours, Jan C., and Jenny Williams. 2015. “Cannabis use and its effects on health, education
and labor market success.” Journal of Economic Surveys 29 (5): 993–1010.
doi:10.1111/joes.12070. (Cited on pages 2, 30).
van Ours, Jan C., Jenny Williams, David Fergusson, and L. John Horwood. 2013. “Cannabis use
and suicidal ideation.” Journal of Health Economics 32 (3): 524–537. issn: 0167-6296. doi:https:
pii/S016762961300009X. (Cited on page 2).
Waldfogel, Jane. 1998. “The Family Gap for Young Women in the United States and Britain: Can
Maternity Leave Make a Difference?” Journal of Labor Economics 16 (3): 505–545.
doi:10.1086/209897. (Cited on page 9).
Watson, N, and M Wooden. 2015. “Factors affecting response to the HILDA survey self-completion
questionnaire.” Melbourne: Melbourne Institute of Applied Economic and Social Research, Uni-
versity of Melbourne.
bibliography/hilda-discussion-papers/hdps115.pdf. (Cited on page 7).