
Field Experimental Evidence on Gender Discrimination in Hiring: Biased as Heckman and Siegelman Predicted?



Correspondence studies are nowadays viewed as the most compelling avenue to test for hiring discrimination. However, these studies suffer from one fundamental methodological problem, as formulated by Heckman and Siegelman (The Urban Institute audit studies: Their methods and findings. In M. Fix, and R. Struyk (Eds.), Clear and convincing evidence: Measurement of discrimination in America, 1993), namely the bias in their results in case of group differences in the variance of unobserved determinants of hiring outcomes. In this study, the authors empirically investigate this bias in the context of gender discrimination. The authors do not find significant evidence for the feared bias.
Received May 18, 2015 Published as Economics Discussion Paper June 16, 2015
Revised August 12, 2015 Accepted August 13, 2015 Published August 20, 2015
© Author(s) 2015. Licensed under the Creative Commons License - Attribution 3.0
Vol. 9, 2015-25 | August 20, 2015 |
Stijn Baert
JEL J16 J71 M51 J41 C93
Keywords Correspondence experiments; Gender discrimination; Unobserved
Stijn Baert, Ghent University – Department of Social Economics,
Citation Stijn Baert (2015). Field Experimental Evidence on Gender Discrimination in Hiring: Biased as
Heckman and Siegelman Predicted? Economics: The Open-Access, Open-Assessment E-Journal, 9 (2015-25): 1–11.
1 Introduction
During the last decade, economists have attempted to estimate hiring
discrimination against women in the labour market by means of correspondence
experiments.1 Within these experiments, pairs of fictitious job applications, only
differing by the gender of the candidate, are sent to real job openings. By means of
standard probit regressions of the subsequent call-back from the employer on the
gender of the candidate, discrimination is identified. The correspondence testing
methodology is the gold standard for estimating hiring discrimination in the labour
market. It allows researchers to disentangle employer discrimination from supply-side
determinants of labour market outcomes. Selection on gender differences in (the
average level of) unobservable characteristics is not an issue, as all the candidates’
individual characteristics are under control of the researcher (Riach and Rich
However, a major critique of this methodology can be formulated based on
Heckman and Siegelman (1993). They show that not controlling for group
differences in the variance of unobservable productivity determinants (and ipso
facto of unobservable determinants of positive call-back) can lead to spurious
evidence of discrimination. The robustness of ethnic discrimination to the
Heckman and Siegelman critique (henceforth “HS critique”) is tested by three
former contributions to the empirical discrimination literature (Baert et al. 2015,
on Belgian data; Carlsson et al. 2014, on Swedish data; Neumark 2012, on US
data). These studies show that the HS critique is relevant. The bottom line of their
results is that a higher (perceived) variance in unobservable determinants of
positive call-back among ethnic minorities (compared to the ethnic majority) leads
to an underestimation of the level of discrimination against them when not
controlling for ethnic group differentials in this variance.2
1 See, e.g., Albert et al. 2011, for Spain; Petit 2007, for France; Riach and Rich 2006, for the UK. Besides its
application in studies identifying gender discrimination in hiring, economists have used the correspondence
testing framework to test for unequal treatment in the labour market on grounds such as ethnicity, sexual
orientation, former unemployment and former employment in the army (see, e.g., Baert 2014; Baert and Balcaen
2013; Baert et al. 2015; Bertrand and Mullainathan 2004; Drydakis 2009; Eriksson and Rooth 2014; Kroft et al.
2 The results presented by Carlsson et al. (2014) deviate to some extent from this empirical pattern.
At the same time, as argued by Azmat and Petrongolo (2014) in their overview
of experimental advances in the study of gender differences in the labour market
“it should be stressed that existing [...] correspondence evidence on gender
discrimination is [...] still open to this criticism.” The only attempt to fill this gap
that we are aware of is Carlsson et al. (2014), who apply Neumark’s (2012)
econometric framework to a number of already published correspondence studies,
one of which targeted gender discrimination. In the present study, we
complement their evidence with an empirical investigation of the HS critique in the
context of gender discrimination, using the same framework but a different (and, in
our opinion, theoretically more convincing) identifying assumption.
2 Methods
2.1 Heckman and Siegelman’s Critique
As argued above, correspondence studies adequately address concerns of
individual differences in unobservable determinants of productivity. Heckman and
Siegelman (1993) show, however, that group differences in the variance of these
unobservable determinants may still lead to spurious evidence of discrimination.
To see this more clearly for the case of gender discrimination in hiring, assume
that both the average observed and the average unobserved determinants of
productivity are the same for male and female candidates for an unfilled vacancy,
but that the variance of unobservable job-relevant characteristics is, at least in the
perception of the employer, higher for females than for males. In addition, suppose
that the employer considers the observed determinants of productivity, inferred
from the CV and the motivation letter, as relatively low compared to the job
requirement. In that case it is rational for the employer to invite the female and not
the male candidate, since the sum of observed and unobserved productivity is more
likely to meet the job requirement for the female candidate. A correspondence test
that detects discrimination against females could therefore underestimate the extent
of discrimination against females.3
3 With other assumptions the bias may be in the opposite direction.
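The argument above can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes normally distributed unobserved productivity with mean zero for both genders, and the function name, the job requirement and the standard deviations are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def prob_meets_requirement(observed_quality, requirement, sd_unobserved):
    """Probability that observed + unobserved productivity reaches the requirement.

    Unobserved productivity is assumed normal with mean 0 (equal across groups)
    and a group-specific standard deviation sd_unobserved.
    """
    total_productivity = NormalDist(mu=observed_quality, sigma=sd_unobserved)
    return 1.0 - total_productivity.cdf(requirement)


# Hypothetical values, chosen only for illustration.
REQUIREMENT = 0.0
SD_MALE, SD_FEMALE = 1.0, 2.0  # employer perceives more dispersion among females

for label, observed in [("low-quality CV", -1.0), ("high-quality CV", 1.0)]:
    p_male = prob_meets_requirement(observed, REQUIREMENT, SD_MALE)
    p_female = prob_meets_requirement(observed, REQUIREMENT, SD_FEMALE)
    print(f"{label}: male {p_male:.3f}, female {p_female:.3f}")
```

With a CV below the requirement, the higher-variance group is more likely to clear the bar; with a CV above the requirement, the ordering reverses, which is the opposite-direction bias mentioned in footnote 3.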
2.2 Neumark’s Empirical Framework
Neumark (2012) explicitly addresses this critique and provides a statistical
procedure to recover unbiased estimates of discrimination. In what follows, we
succinctly describe Neumark’s approach applied to gender discrimination.
It is well known that in a standard probit model only the ratio of the coefficients to
the standard deviation of the unobserved residual is identified. Usually, this
standard deviation is arbitrarily set to 1. In our case this means that the variance of
unobservable job-relevant characteristics is implicitly assumed to be equal for both
males and females, which, for reasons stated above, may bias the measures of discrimination.
Neumark (2012) shows, however, that if the researcher observes job-relevant
characteristics that affect the male and female populations’ propensities of call-
back in the same way, one can identify the ratio of the standard deviation of the
unobserved productivity components of these groups.4 Implementing Neumark’s
(2012) idea in the context of gender discrimination boils down to the estimation of
a heteroskedastic probit model in which the variance of the error term is allowed to
vary with gender.
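The model described in this subsection can be written compactly as follows. This is a sketch in our own notation, not a formula taken from the text: $y_i^*$ is the latent propensity of a positive call-back, $x_i$ collects the observed characteristics (including a constant), $F_i$ equals 1 for female candidates, and the normalisation $\sigma_M = 1$ is an assumption made here for illustration.

```latex
\begin{align*}
y_i^* &= x_i'\beta + \gamma F_i + \varepsilon_i, \qquad
  \varepsilon_i \mid F_i \sim N\!\left(0,\ \sigma(F_i)^2\right), \\
\sigma(F_i) &= \exp(\omega F_i), \qquad \sigma_M = 1, \quad \sigma_F = e^{\omega}, \\
\Pr(\text{call-back}_i = 1 \mid x_i, F_i)
  &= \Phi\!\left(\frac{x_i'\beta + \gamma F_i}{\exp(\omega F_i)}\right).
\end{align*}
```

Under this parametrisation $\omega = \ln(\sigma_F / \sigma_M)$, and the standard probit is the special case $\omega = 0$.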
2.3 Identification Strategy
As mentioned in the previous subsection, identification of the group-specific
variance in unobservable determinants of positive call-back within the
heteroskedastic probit framework requires experimental data with variation in
observable job-relevant characteristics that affect the (in our case gender) groups’
propensities of call-back in the same way. Variables used by Baert et al. (2015),
Carlsson et al. (2014) and Neumark (2012) in their application of the Neumark
framework in the context of ethnic discrimination were education level,
personality traits, work experience, type of neighbourhood, sport activities and
application quality. In the context of gender discrimination, Carlsson et al. (2014)
assumed equal returns for both genders from variation in educational degree,
4 The intuition is that if in a standard probit model the estimated coefficients of these job-relevant characteristics
differ by gender, then this must be a consequence of a differential standard deviation, since by assumption the
coefficient of these characteristics should be the same across groups (and since, as mentioned before, in a probit
model only the ratio of the coefficients to the standard deviation is identified).
international mobility, work experience, employment status and job tenure. Their
choice can be criticised on theoretical grounds.5 All the aforementioned variables
used for identification of the Neumark procedure result from variation in choices
and outcomes at the employee side. Therefore, they may be correlated with
ethnicity or gender in reality.6
The alternative variable we assume to have the same return across groups is
the distance between the candidate’s living place and the workplace. On the one
hand, it is clear that this variable has the potential to affect the hiring decisions of
employers. This is the case as employers may prefer workers with a social network
in the neighbourhood of the firm. In addition, they may expect a higher
commitment from workers living close to the firm (who therefore do not waste too
much time commuting). On the other hand, by using this variable we actually
exploit employer variation instead of employee variation as the living place of the
employee is constant. As a result, there is no reason why this variable would be
more rewarded for members of a particular sex.7 Both considerations are
confirmed empirically (see Section 4).
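The role of a variable with the same return for both groups can be sketched numerically. The example below is illustrative only: it assumes a heteroskedastic probit with the male standard deviation normalised to 1, and all function names and parameter values are hypothetical. It shows that when distance has the same latent coefficient for men and women, the ratio of the group-specific probit slopes in distance recovers the ratio of the standard deviations.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def callback_probability(distance, female, intercept, beta_distance,
                         gamma_female, sigma_female):
    """Call-back probability in a heteroskedastic probit (sigma_male = 1)."""
    latent_mean = intercept + beta_distance * distance + gamma_female * female
    sigma = sigma_female if female else 1.0
    return STD_NORMAL.cdf(latent_mean / sigma)


def probit_slope_in_distance(female, params, d_near=10.0, d_far=40.0):
    """Slope of the probit index in distance, as seen within one gender group."""
    z_near = STD_NORMAL.inv_cdf(callback_probability(d_near, female, **params))
    z_far = STD_NORMAL.inv_cdf(callback_probability(d_far, female, **params))
    return (z_far - z_near) / (d_far - d_near)


# Hypothetical parameters; distance is measured in minutes of driving.
PARAMS = dict(intercept=-0.5, beta_distance=-0.01,
              gamma_female=-0.1, sigma_female=1.5)

slope_male = probit_slope_in_distance(0, PARAMS)
slope_female = probit_slope_in_distance(1, PARAMS)
sigma_ratio = slope_male / slope_female  # recovers sigma_female / sigma_male
print(f"recovered sigma_female / sigma_male = {sigma_ratio:.3f}")
```

The recovery works only because the latent coefficient on distance is assumed to be common to both groups, which is exactly the identifying assumption discussed above.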
3 Data
We use data from Baert et al. (Forthcoming), a correspondence study investigating
the importance of employer preferences in explaining Sticky Floors. Sticky Floors
are defined as the pattern that women are, compared to men, less likely to start to
5 It should be noted, however, that based on the empirical tests the aforementioned authors present one cannot
reject that the chosen variables affect call-back probabilities with the same magnitude for the groups they study.
6 For instance, members of ethnic minorities may have a higher probability of living in more disadvantaged
neighbourhoods (Bertrand and Mullainathan 2004). As particular values for the aforementioned variables may
(not) square (and therefore enforce or disprove) prejudices about ethnic minorities or women, variation in these
variables may be expected to be valued differently for these groups.
7 One could argue that applications to employers located very far away from the residence of the applicant reflect a
willingness to be mobile, which may be correlated with female sex. Women of child-bearing and child-rearing age
might be perceived as being less flexible when it comes to the distance between their workplace and residence.
However, the fictitious job candidates in the experimental data we use mentioned that they were quite young (26 or
27), unmarried and without children (see Section 3). Therefore, we assume that the return to living close to the
workplace is the same for the candidates of both genders within our data. Moreover, if we redo our estimations
using only observations with distances of less than 30 minutes by car, the results are very comparable to
the ones presented in the main text.
climb the job ladder. To this end, these authors sent fictitious job applications to
real job openings in the labour market of Flanders between October 2013 and
March 2014. During this period, they randomly selected 288 vacancies for jobs
targeting Bachelors in business administration and 288 vacancies for jobs targeting
Masters in business economics in the private sector. They restricted themselves to
vacancies requiring at most five years of work experience. Two job applications of
individuals with five years of work experience (in a first and current job), identical
in terms of productivity-relevant characteristics, were sent to the selected
vacancies. These applicants were single individuals born, studying and living in
comparable suburbs of Ghent, the second largest city of Flanders. Within each pair
of applicants, a typically male-sounding name was randomly assigned to one of
the two applications and a typically female-sounding name to the other. Call-backs
were received via telephone voicemail and email.
Baert et al. (Forthcoming) sent applications both to vacancies implying a
promotion in terms of occupational level and/or job authority and to vacancies at
the same level. Thereby, they were able to test whether unequal treatment of young
men and women in hiring was heterogeneous by whether or not jobs implied a
promotion in comparison with employees’ current position. They found significant
evidence of hiring discrimination against females when they applied for jobs at a
higher occupational level. For these jobs, females got, compared to males, about
33% fewer invitations for a job interview and 19% fewer positive reactions in a broad
sense. On the other hand, they found no significant heterogeneity in hiring
discrimination by the job authority level of the posted jobs.
In the present study, we will test whether the discrimination measures
presented by Baert et al. (Forthcoming) are biased by gender differences in the
variance of unobserved determinants of hiring outcomes. To this end, the data from
Baert et al. (Forthcoming) are, in view of the identification strategy described above,
extended with the distance between the workplace announced in the vacancy and
the candidate’s residence.8
8 This distance, expressed in minutes of driving by car, is calculated using the online routing tool of Google
Maps.
4 Results
Table 1 presents the results of our empirical analysis. In Panel A we report the
degree of gender discrimination that comes out of a standard analysis of the data of
Baert et al. (Forthcoming). We replicate their main findings by conducting basic
probit estimations with positive call-back as an outcome variable. Positive call-
back is defined as getting an invitation for an interview concerning the announced
job in models (1) and (2) and defined as getting any positive reaction from the
employer side in models (3) and (4).
On the one hand (in models (1) and (3)), we regress positive call-back on a
dummy indicating female sex of the candidate and the distance between the
workplace and the residence of the applicant. On the other hand, for models (2)
and (4), the effect of female sex is broken down by whether the vacancy indicated
a job implying a promotion in occupational level compared with the current job of
the candidate. This is done by replacing the dummy indicating female sex of the
candidate by two dummies: one indicating female candidates who applied for a job
not implying a promotion in occupational level and one indicating female
candidates who applied for a job implying a promotion in occupational level.9
By doing that, we get results that are very similar to those presented in Table 4
and Table 5 of Baert et al. (Forthcoming). More concretely, the regression results
indicate that, overall, the tested employers did not discriminate based on sex.
However, if the effect of revealing female gender is broken down by the
occupational level of the posted job, we find that a female name lowers the
probability of positive call-back by four to five percentage points when one applies
for jobs implying a promotion in this respect.
Interestingly, the estimation results for the variable “distance between the
workplace and the candidate’s residence” (not presented in Table 1 but available
on request) are, for all of the mentioned models, highly significantly different
from zero (p < 0.01) and have the expected (negative) sign. Moreover, based on a
Wald test applied to the estimation results of an alternative probit model with an
additional interaction variable between female sex of the candidate and the
distance between workplace and residence, we cannot reject that this distance
9 This operation also implies the introduction of a dummy indicating promotion jobs in terms of occupational
level and a dummy indicating promotion jobs in terms of job authority without an interaction with female sex.
variable is rewarded equally for males and females. The test results are
summarised in Table 1.
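For a single restriction such as a zero coefficient on the interaction between female sex and distance, a Wald test reduces to a squared z-score. The sketch below is a generic illustration under that assumption; the function name, the coefficient and the standard error are hypothetical and are not estimates from this study.

```python
from statistics import NormalDist


def wald_test_single_restriction(estimate, std_error, null_value=0.0):
    """Wald statistic and p-value for H0: coefficient == null_value.

    With one restriction the statistic is chi-squared with 1 degree of freedom,
    which equals the square of a standard normal z-score.
    """
    z = (estimate - null_value) / std_error
    statistic = z ** 2
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return statistic, p_value


# Hypothetical interaction coefficient (female x distance) and clustered standard error.
stat, p = wald_test_single_restriction(estimate=0.002, std_error=0.010)
print(f"Wald statistic = {stat:.3f}, p-value = {p:.3f}")
```

A large p-value means the hypothesis of equal returns to distance for men and women cannot be rejected, which is the pattern the text reports.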
Panel B reports the results based on a re-estimation of models (1) to (4) by
means of a heteroskedastic probit model in the spirit of Neumark (2012) allowing
the variance of the error term to vary with the gender of the candidate. By doing that,
we get unbiased results that are very comparable to those in Panel A. In other
words: we find no evidence for a bias in the sense of the HS critique. This finding
is related to the fact that the estimated male and female standard deviations
of the error term (σMale and σFemale) are very comparable. Therefore, our
results seem to indicate that the tested employers do not perceive a (gender) group
difference in the variance of unobserved determinants of productivity. These
results, therefore, corroborate those of Carlsson et al. (2014) based on
correspondence testing data gathered in Sweden.
Last, we decompose, in the spirit of Neumark (2012), the unbiased estimates
into an effect through level (keeping group differences in the variance of the error
term constant) and an effect through variance (keeping differences in unbiased
parameters constant). Interestingly, but differing from the findings of Carlsson et
al. (2014), we find that the effects through level are, although not significantly
different from zero, more or less of the same magnitude as the total unbiased
effect. In addition, the effects through variance are rather close to zero.
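One way to write this decomposition, offered as a sketch in our own notation rather than as the exact expression used for Table 1, is to differentiate the heteroskedastic probit call-back probability $\Phi\big((x'\beta + \gamma F)/\exp(\omega F)\big)$ with respect to the female indicator $F$, treating it as continuous. Here $\gamma$ is the level coefficient on female sex and $\omega = \ln(\sigma_F/\sigma_M)$; both symbols are assumptions introduced for illustration.

```latex
\begin{align*}
\frac{\partial \Pr(\text{call-back} = 1)}{\partial F}
  &= \underbrace{\phi(z)\,\frac{\gamma}{\exp(\omega F)}}_{\text{effect through level}}
   \;+\; \underbrace{\phi(z)\,\frac{-(x'\beta + \gamma F)\,\omega}{\exp(\omega F)}}_{\text{effect through variance}},
\qquad z = \frac{x'\beta + \gamma F}{\exp(\omega F)}.
\end{align*}
```

When the two standard deviations are equal, $\omega = 0$, the second term vanishes and the first term coincides with the usual probit marginal effect, which is consistent with variance effects close to zero.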
Our result of no important perceived gender group difference in the variance of
unobserved variables deviates from the finding of the more substantial ethnic
group difference in this respect outlined in Baert et al. (2015), Carlsson et al.
(2014) and Neumark (2012). One explanation for this finding is that perceived
group differences in the variance of unobserved variables can be thought of as a
sort of statistical discrimination. Following Altonji and Blank (1999), employers
may believe that the same observable signal is more precise for one group
compared to another. This theory seems to be more applicable to ethnic groups
than to gender groups.
5 Conclusion
In this study, we investigated the research gap indicated by Azmat and Petrongolo
(2014). This gap boils down to the fact that standard analyses of correspondence
testing data aimed at investigating hiring discrimination do not control for group
differences in the variance of unobservable productivity determinants and, as a
consequence, may be biased. While the robustness of ethnic discrimination
to this critique, formulated first by Heckman and Siegelman (1993), has been tested
by three earlier studies, Azmat and Petrongolo (2014) stress that correspondence
studies on gender discrimination are still open to this critique. Estimating the bias
feared by Heckman and Siegelman (1993) in the context of gender discrimination
was the aim (and the contribution) of this study.
We used Belgian correspondence data aimed at measuring hiring
discrimination against young females. We employed the empirical framework
introduced by Neumark (2012) and proposed an original identifying assumption.
By doing that, we found no significant evidence for the bias feared by Heckman
and Siegelman (1993). This finding is related to the fact that the estimated (perceived)
variance of unobservables is very comparable for male and female job candidates.
The issue of gender differences in heterogeneity with respect to productivity is
an important puzzle piece in the study of gender convergence in the labour market.
We contribute modestly to this literature by showing that, at least in the perception
of Belgian employers, there is no evidence for the hypothesis that women are
(perceived as) more heterogeneous than men in productivity-related variables
unobservable to researchers.
References
Albert, R., L. Escot, and J. Fernández-Cornejo (2011). A field experiment to study sex and age discrimination in the Madrid labour market. International Journal of Human Resource Management 22 (2): 351–375. DOI: 10.1080/09585192.2011.540160.
Altonji, J., and R. Blank (1999). Race and Gender in the Labor Market. In O. Ashenfelter, and D. Card (Eds.), Handbook of Labor Economics. Amsterdam: Elsevier.
Azmat, G., and B. Petrongolo (2014). Gender and the Labor Market: What Have We Learned from Field and Lab Experiments? Labour Economics 30 (Special Issue on “What determined the dynamics of labour economics research in the past 25 years?”): 32–40.
Baert, S. (2014). Career lesbians. Getting hired for not having kids. Industrial Relations 45 (6): 543–561.
Baert, S., and P. Balcaen (2013). The Impact of Military Work Experience on Later Hiring Chances in the Civilian Labour Market. Evidence from a Field Experiment. Economics: The Open-Access, Open-Assessment E-Journal 7 (2013-37): 1–17.
Baert, S., B. Cockx, N. Gheyle, and C. Vandamme (2015). Is There Less Discrimination in Occupations Where Recruitment Is Difficult? Industrial and Labor Relations Review 68 (3): 467–500.
Baert, S., A. De Pauw, and N. Deschacht (Forthcoming). Do Employer Preferences Contribute to Sticky Floors? Industrial and Labor Relations Review.
Bertrand, M., and S. Mullainathan (2004). Are Emily and Greg More Employable than Lakisha and Jamal? A field experiment on labor market discrimination. American Economic Review 94 (4): 991–1013.
Carlsson, M., L. Fumarco, and D.-O. Rooth (2014). Does the design of correspondence studies influence the measurement of discrimination? IZA Journal of Migration 2014 (3): 11.
Drydakis, N. (2009). Sexual orientation discrimination in the labour market. Labour Economics 16 (4): 364–372.
Eriksson, S., and D.-O. Rooth (2014). Do Employers Use Unemployment as a Sorting Criterion When Hiring? Evidence from a Field Experiment. American Economic Review 104 (3): 1014–1039.
Heckman, J.J., and P. Siegelman (1993). The Urban Institute audit studies: Their methods and findings. In M. Fix, and R. Struyk (Eds.), Clear and convincing evidence: Measurement of discrimination in America. Washington DC: Urban Institute Press.
Kroft, K., F. Lange, and M.J. Notowidigdo (2013). Duration Dependence and Labor Market Conditions: Evidence from a Field Experiment. Quarterly Journal of Economics 128 (3): 1123–1167.
Neumark, D. (2012). Detecting discrimination with audit and correspondence studies. Journal of Human Resources 47 (4): 1128–1157.
Petit, P. (2007). The effects of age and family constraints on gender hiring discrimination: A field experiment in the French financial sector. Labour Economics 14 (3): 371–391.
Riach, P.A., and J. Rich (2002). Field experiments of discrimination in the market place. Economic Journal 112 (November): F480–F518.
Riach, P.A., and J. Rich (2006). An Experimental Investigation of Sexual Discrimination in Hiring in the English Labor Market. B.E. Journal of Economic Analysis & Policy (Advances) 5 (2): 1.
Table 1: Estimation results
                                                                        Model (1)         Model (2)         Model (3)         Model (4)
A. Estimates from basic probit model
Female candidate                                                        -0.010 (0.013)                      -0.010 (0.017)
Female candidate x No promotion in occupational level                                     0.022 (0.018)                       0.042 (0.026)
Female candidate x Promotion in occupational level                                        -0.040** (0.019)                    -0.050** (0.022)
B. Estimates from heteroskedastic probit model
Female candidate                                                        -0.012 (0.020)                      -0.011 (0.017)
Female candidate x No promotion in occupational level                                     0.022 (0.021)                       0.041 (0.029)
Female candidate x Promotion in occupational level                                        -0.037** (0.017)                    -0.050** (0.020)
C. Effect through level
Female candidate                                                        -0.003 (0.049)                      0.010 (0.051)
Female candidate x No promotion in occupational level                                     0.033 (0.053)                       0.063 (0.059)
D. Effect through variance
Female candidate x No promotion in occupational level                                     -0.011 (0.046)                      -0.022 (0.036)
Female candidate x Promotion in occupational level                                        -0.010 (0.041)                      -0.023 (0.045)
Log (σFemale/σMale)
Wald test statistic, null hypothesis that σFemale/σMale = 1 (p-value)  0.852             0.811             0.612             0.596
Wald test statistic, null hypothesis that ratio of coefficients
for distance between workplace and residence = 1 (p-value)             0.910             0.885             0.732             0.721
Dependent variable, models (1) and (2): invitation to a job interview
Dependent variable, models (3) and (4): any positive reaction
Notes: Additional controls included in the basic probit and heteroskedastic probit models are: distance between the workplace and the candidate’s residence and, for models (2) and (4), a dummy that indicates jobs implying a promotion. The presented statistics are marginal effects and standard errors, corrected for clustering at the vacancy level, in parentheses. *** (**) ((*)) indicates significance at the 1% (5%) ((10%)) level.
Please note:
You are most sincerely encouraged to participate in the open assessment of this article. You
can do so by either recommending the article or by posting your comments.
Please go to:
The Editor
© Author(s) 2015. Licensed under the Creative Commons Attribution 3.0.
... Although it provides evidence of unexplained inequality, it cannot provide reliable evidence of discrimination as such, since it depends on a rather unrealistic assumption that all other relevant characteristics are observed in the data. These studies are likely to suffer from endogeneity bias, as individuals who appear similar to researchers might in fact exhibit substantial heterogeneity and appear very different to employers (see Baert 2018). However, the residual gap approach nonetheless constitutes the implicit approach taken to the conceptualization of discrimination in comparative family policy studies (see Mandel 2012;Mandel and Semyonov 2006;Mandel and Shalev 2009;Shalev 2008). ...
... Our results are also in line with the most common finding reported in the rapidly expanding literature that is using correspondence audits to test for gender hiring discrimination in different institutional contexts (see Baert 2018). The rule seems to be an absence of statistically significant gender differences in employer callbacks (Baert 2015;Baert, Pauw, and Deschacht 2016;Capéau et al. 2012 [Belgium]; Zhou, Zhang, and Song 2013 [China]; Albert, Escot, and Fernández-Cornejo 2011 [Spain]). ...
... Our results are also in line with the most common finding reported in the rapidly expanding literature that is using correspondence audits to test for gender hiring discrimination in different institutional contexts (see Baert 2018). The rule seems to be an absence of statistically significant gender differences in employer callbacks (Baert 2015;Baert, Pauw, and Deschacht 2016;Capéau et al. 2012 [Belgium]; Zhou, Zhang, and Song 2013 [China]; Albert, Escot, and Fernández-Cornejo 2011 [Spain]). In some studies, female applicants have been found to be subject to positive discrimination (Booth and Two facts are notable here: (i) Sweden seems to be a typical rather than deviant case with regard to the degree of gender hiring discrimination practiced by employers. ...
Full-text available
A common assumption in comparative family policy studies is that employers statistically discriminate against women in countries with dual-earner family policy models. The empirical evidence cited in support of this assumption has exclusively been observational data, which should not be relied on to identify employer discrimination. In contrast, we investigate whether employers discriminate against women in Sweden—frequently viewed as epitomizing the dual-earner family policy model—using field experiment data. We find no evidence supporting the notion that Swedish employers statistically discriminate against women.
... This ranged from 0.050 in the district of Maaseik to 0.361 in the district of Roeselare. 19 As in Baert et al. (2015), we hypothesised that employers would be less selective (in terms of degree class and extra-curricular activities) when filling temporary and part-time jobs and in times of high labour market tightness. ...
... Second, candidates from a caring or technical programme received more invitations than candidates from a business programme. This might be explained by the relatively high numbers of bottleneck vacancies (with a high labour market tightness) in these occupations (Baert et al., 2015). Third, and not surprisingly, given the small differences between the CV template types, invitation rates do not substantially vary across these types. ...
... In particular, we did not know to what extent other candidates with a high degree class or extra-curricular activities also had candidated. The treatment effects measured for our candidates may depend on this (Baert, 2015;Heckman, 1998). However, to the extent that the vacancies tested were representative, which we believe was the case based on the random selection, our experiment shows that degree class and extra-curricular activities increase the chance of being invited to a job interview. ...
This study investigates the impact on first hiring outcomes of two main curriculum vitae (CV) characteristics by which graduates with a tertiary education degree distinguish themselves from their peers: degree class and extra‐curricular activities. These characteristics were randomly assigned to 2,800 fictitious job applications that were sent to real vacancies in Belgium. Academic performance and extra‐curricular engagement enhance job interview rates by 7.0% (CI 95% [0.3%, 13.7%]) and 6.5% (CI 95% [−0.5%, 13.4%]), respectively. We did not find evidence for these CV characteristics to reinforce or reduce their effect.
... Such beliefs, we argue, are likely to correspond with the politicians' (dis)taste for various sociodemographic groups and, hence, to affect their assessment of otherwise identical candidates in recruitment situations. Third, we employ insights from labor economics on discrimination in hiring processes (e.g., Baert 2015; to understand underrepresentation in public organizations and discuss the relevance of discriminatory perspectives explicitly in relation to top administrative positions. ...
... In comparison, evidence of gender discrimination is mixed and heterogeneous across different occupations, with effects indicating both discrimination of men and women as well as no discrimination at all (Baert 2017). Among studies in the Flemish context, two show no differences between callbacks for men and women (Baert 2015;Baert, De Pauw, and Deschacht 2016), while a third reveals a negative impact of pregnancy among women (Capeau et al. 2012). This may suggest that discrimination against women is primarily relevant among young women. ...
... While a study of labor market ethnic discrimination in Sweden (Carlsson and Rooth 2007) finds similar levels of discrimination in the public and private sectors, a Norwegian study documents significant discrimination in the private sector but not in the public sector (Midtbøen 2016). Furthermore, some studies have documented how the impact of minority group status on callbacks disappears when ethnic minorities mention volunteer work for organizations in their application or when they have had extensive work experience (Baert andVujic 2016, Baert et al. 2017). This suggests that discrimination is often based on statistical conclusions, where certain attributes (here, lack of appropriate experience) are erroneously ascribed to individuals from minority groups (Altonji and Blank 1999). ...
While a voluminous literature on representative bureaucracy and minority discrimination suggests that characteristics other than qualifications influence hiring decisions, little is known about whether this also pertains to the top positions in political-administrative organizations. To shed light on this question, we ask how candidate ethnicity, gender, and age affect the recruitment preferences among politicians regarding the candidates for top administrative positions. Our study uses a survey experiment with random assignment of 1,688 Flemish local politicians to one of eight different descriptions of applicants to the leading managerial position of their local authority. We find that ethnic minorities, women, and younger candidates are generally considered more qualified for the job. Moreover, the impact of ethnicity and gender on recruitment preferences is conditional on politicians’ ideological predispositions: Left-wing politicians consider ethnic minority candidates more competent, whereas right-wing politicians consider them less representative and are less inclined to invite them for job interviews than candidates from the ethnic majority. Furthermore, politicians furthest to the left are more inclined than right-wing politicians to recognize women as representative of the public at large and support inviting them for job interviews.
... Endnotes 1 For information on 'Growth, Equal Opportunities, Migration and Markets' (GEMM) project, financed by Horizon2020, see 2 If employers act upon a perceived group difference in the variance of unobserved expected productivity, field experimental evidence of discrimination may not be very informative (Heckman and Siegelman, 1993). Using the method proposed by Neumark (2012), Baert (2015) found no evidence of this bias related to gender heterogeneity. 3 Several concepts have been introduced to differentiate so-called error discrimination (England, 1994) and stereotype-based discrimination (Bobbitt-Zeher, 2011) from the economic-rational model, but the theory of statistical discrimination (albeit with bounded rationality) can easily accommodate the notion of stereotypes affecting employers' hiring decisions. 4 See Di Stasio and Larsen (2020) for a study of the combined effects of ethnicity and gender on employers' callbacks, based on the GEMM occupations. ...
Gender discrimination is often regarded as an important driver of women’s disadvantage in the labour market, yet earlier studies show mixed results. However, because different studies employ different research designs, the estimates of discrimination cannot be compared across countries. By utilizing data from the first harmonized comparative field experiment on gender discrimination in hiring in six countries, we can directly compare employers’ callbacks to fictitious male and female applicants. The countries included vary in a number of key institutional, economic, and cultural dimensions, yet we found no sign of discrimination against women. This cross-national finding constitutes an important and robust piece of evidence. Second, we found discrimination against men in Germany, the Netherlands, Spain, and the UK, and no discrimination against men in Norway and the United States. However, in the pooled data the gender gradient hardly differs across countries. Our findings suggest that although employers operate in quite different institutional contexts, they regard female applicants as more suitable for jobs in female-dominated occupations, ceteris paribus, while we find no evidence that they regard male applicants as more suitable anywhere.
... CTs have characteristics that address these weaknesses (Bertrand and Duflo 2016) and, therefore, are now considered to be the gold standard for studying discrimination (Baert 2015). Most importantly, since CTs do not match actors, but instead written messages, they directly circumvent problems related to the presence of unobservable characteristics and to the experimenter effect (Bertrand and Duflo 2016; Pager 2007). ...
I test discrimination against blind tenants assisted by guide dogs in the Italian rental housing market by using fake application letters. I compare three fictitious household tenants: married couples, married couples where the wife is blind and owns a guide dog, and married couples where the normal-sighted wife owns a normal dog. I find that the households with a blind wife are invited less often to visit apartments they applied for, because of the presence of their guide dog; using the language of Italian and E.U. laws, this behavior is called indirect discrimination against disabled people. This result is robust. © 2017 by the Board of Regents of the University of Wisconsin System.
This paper addresses barriers faced by immigrant women workers during their international recruitment process by exploring their experiences. Women’s studies have accelerated in the last three decades, with almost all the management literature agreeing on the challenges for women of adapting to business life. The most popular issue in this literature regarding the women workforce is the Glass Ceiling syndrome, which hinders women's advancement to high-level positions. This study aims to develop an understanding of a new issue, called the “Glass Door,” which represents gender discrimination in the recruitment process. This concept has been defined and the theory built by Hassink and Russo (2010). This study further reveals additional barriers encountered by immigrant women workers during the recruitment process in their international applications. The study is an ongoing process, with continuous outreach to women immigrants from developing countries. Specifically, the development paper will be expanded by outreach to new international partnerships from other countries, including developed and developing ones. In this way, the paper has the potential to make a theoretical contribution by evaluating cross-cultural differences in the recruitment process of immigrant women from different nations.
Since 2000, more than 80 field experiments across 23 countries consider the traditional dimensions of discrimination in labor and housing markets—such as discrimination based on race. These studies nearly always find evidence of discrimination against minorities. The estimates of discrimination in these studies can be biased, however, if there is differential variation in the unobservable determinants of productivity or in the quality of majority and minority groups. It is possible that this experimental literature as a whole overstates the evidence of discrimination. The authors re-assess the evidence from the 10 existing studies of discrimination that have sufficient information to correct for this bias. For the housing market studies, the estimated effect of discrimination is robust to this correction. For the labor market studies, by contrast, the evidence is less robust, as just over half of the estimates of discrimination fall to near zero, become statistically insignificant, or change sign.
An audit study is a specific type of field experiment primarily used to test for discriminatory behavior when survey and interview questions induce social desirability bias. In this chapter, I first review the language and definitions related to audit studies and encourage adoption of a common language. I then discuss why researchers use the audit method as well as when researchers can and should use this method. Next, I give an overview of the history of audit studies, focusing on major developments and changes in the overall body of work. Finally, I discuss the limitations of correspondence audits and provide some thoughts on future directions.
This chapter aims to provide an exhaustive list of all (i.e. 90) correspondence studies on hiring discrimination that were conducted between 2005 and 2016 (and could be found through a systematic search). For all these studies, the direction of the estimated treatment effects is tabulated. In addition, a discussion of the findings by discrimination ground is provided.
Purpose The glass ceiling is a metaphor used to characterize the gender inequality of women at the top in most large western organizations. This situation has prompted many business organizations, NGOs and governments to encourage large organizations to promote more women into the executive suite and onto boards of directors. While there is little controversy about this initiative, this paper argues that there should be because it directly challenges the principle that merit should outweigh diversity. The paper aims to discuss these issues. Design/methodology/approach This paper reviews research that purports to show that women are unfairly under-represented in the most senior positions in large western organizations. It also reviews the arguments that more senior women would improve the performance of these organizations. This research is then used to develop a model of why there are markedly fewer women than men at the top of large organizations. Findings This study finds that most of the research studies purporting to show that there is a bias against promoting women to the top of large western organizations are unsound because they are poorly designed and/or fail to accommodate alternative explanations for this effect. Thus, the current number of women who run these organizations may be a good reflection of their contribution to the management of these organizations. These findings suggest that many of the policies that are promoted to help women break through the glass ceiling are misguided. Practical implications Large organizations should think carefully about following the advice of special interest groups who vigorously promote this social cause. Social implications Social policy advocates need better research from which to advance their cause that there are currently too few women in senior management positions of large organizations. 
Originality/value This is one of only a handful of papers that challenges the current orthodoxy that artificial glass ceilings are restricting the potential contribution of women to the better management of large organizations.
This study directly assesses the impact of military work experience compared with civilian work experience in similar jobs on the subsequent chances of being hired in the civilian labour market. It does so through a field experiment in the Belgian labour market. A statistical examination of our experimental dataset shows that in general we cannot reject that employers are indifferent to whether job candidates gained their experience in a civilian or a military environment.
The authors empirically test the cross-sectional relationship between hiring discrimination and labor market tightness at the level of the occupation. To this end, they conduct a correspondence test in the youth labor market. In line with theoretical expectations, results show that, compared to natives, candidates with a foreign-sounding name are equally often invited to a job interview if they apply for occupations for which vacancies are difficult to fill; but, they have to send out twice as many applications for occupations for which labor market tightness is low. Findings are robust to various sensitivity checks.
Correspondence studies can identify the extent of discrimination in hiring as typically defined by the law, which includes discrimination against ethnic minorities and females. However, as Heckman and Siegelman (1993) show, if employers act upon a group difference in the variance of unobserved variables, this measure of discrimination may not be very informative. This issue has essentially been ignored in the empirical literature until the recent methodological development by Neumark (2012). We apply Neumark’s method to a number of already published correspondence studies. We find the Heckman and Siegelman critique relevant for empirical work and give suggestions on how future correspondence studies may address this critique. JEL classification J71
Using a field experiment, we investigate whether discrimination based on women's sexual orientation differs by age and family constraints. We find weakly significant evidence of discrimination against young heterosexual women. This effect is driven by age (and fertility) rather than by motherhood. We do not find any unequal treatment at older ages. This age effect is consistent with our theoretical expectation that, relative to lesbian women, young heterosexual women are penalised for having children more frequently and taking on, on average, more at-home-caring tasks.
This article presents the findings of a field experiment carried out in Madrid which aims to analyse gender and age discrimination in hiring in the city's labour market. A set of five pairs of fictitious man–woman curricula was sent in response to 1062 job offers in six occupations which were advertised on the Internet over an eight-month period. The extent to which the firms contacted candidates of different sex, age and marital status was subsequently quantified. No discrimination is detected against women in terms of access to job interviews; however, discriminatory conduct is seen regarding the phenomenon of occupational gender segregation, in the sense that employers continue to hold stereotyped views on the greater suitability of women for certain tasks. No evidence is found that firms show relative discrimination against married women with children in the first phase of the hiring process. Clear evidence of discrimination is obtained on the basis of age: firms show a substantial fall in interest in interviewing 38-year-old candidates (compared to those aged 24 or 28). This would imply that the tendency to discriminate against older workers may be high and, what is more, that it may start at a surprisingly young age.
Audit studies testing for discrimination have been criticized because applicants from different groups may not appear identical to employers. Correspondence studies address this criticism by using fictitious paper applicants whose qualifications can be made identical across groups. However, Heckman and Siegelman (1993) show that group differences in the variance of unobservable determinants of productivity can still generate spurious evidence of discrimination in either direction. This paper shows how to recover an unbiased estimate of discrimination when the correspondence study includes variation in applicant characteristics that affect hiring. The method is applied to actual data and assessed using Monte Carlo methods.
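The Heckman and Siegelman argument summarized above can be illustrated with a stylized calculation. The sketch below is not Neumark's estimator and is not taken from any of the listed studies; it assumes a simple threshold rule in which an employer calls back an applicant when observed quality plus a normally distributed unobserved component exceeds a fixed threshold. All function names and parameter values are hypothetical. Both groups have the same observed quality and the same mean of the unobserved component, so there is no discrimination by construction; only the standard deviation of the unobserved component differs.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def callback_prob(observed_quality: float, threshold: float, sd_unobserved: float) -> float:
    """P(observed_quality + e > threshold) with e ~ N(0, sd_unobserved**2).

    Hypothetical hiring rule used only for illustration.
    """
    return 1.0 - normal_cdf((threshold - observed_quality) / sd_unobserved)


def callback_gap(observed_quality: float, threshold: float, sd_a: float, sd_b: float) -> float:
    """Callback rate of group B minus group A when only the variances differ."""
    return (callback_prob(observed_quality, threshold, sd_b)
            - callback_prob(observed_quality, threshold, sd_a))


# Illustrative values (assumptions): group B has the larger unobserved variance.
# Applications standardized at a quality level below the hiring threshold:
gap_low_quality = callback_gap(observed_quality=0.0, threshold=1.0, sd_a=1.0, sd_b=2.0)
# Applications standardized at a quality level above the hiring threshold:
gap_high_quality = callback_gap(observed_quality=2.0, threshold=1.0, sd_a=1.0, sd_b=2.0)

print(f"gap at low standardized quality:  {gap_low_quality:+.3f}")
print(f"gap at high standardized quality: {gap_high_quality:+.3f}")
```

Under these assumptions the high-variance group appears favoured when the fictitious applications are standardized below the threshold and appears disfavoured when they are standardized above it, even though the employer treats both groups identically. This is the sense in which the measured callback gap can be spurious in either direction.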
We discuss the contribution of the experimental literature to the understanding of both traditional and previously unexplored dimensions of gender differences and discuss their bearings on labor market outcomes. Experiments have offered new findings on gender discrimination, and while they have identified a bias against hiring women in some labor market segments, the discrimination detected in field experiments is less pervasive than that implied by the regression approach. Experiments have also offered new insights into gender differences in preferences: women appear to gain less from negotiation, have lower preferences than men for risk and competition, and may be more sensitive to social cues. These gender differences in preferences also have implications in group settings, whereby the gender composition of a group affects team decisions and performance. Most of the evidence on gender traits comes from the lab, and key open questions remain as to the source of gender preferences—nature versus nurture, or their interaction—and their role, if any, in the workplace.
This article studies the role of employer behavior in generating “negative duration dependence”—the adverse effect of a longer unemployment spell—by sending fictitious résumés to real job postings in 100 U.S. cities. Our results indicate that the likelihood of receiving a callback for an interview significantly decreases with the length of a worker’s unemployment spell, with the majority of this decline occurring during the first eight months. We explore how this effect varies with local labor market conditions and find that duration dependence is stronger when the local labor market is tighter. This result is consistent with the prediction of a broad class of screening models in which employers use the unemployment spell length as a signal of unobserved productivity and recognize that this signal is less informative in weak labor markets. JEL Code: J64.
In this paper, we use unique data from a field experiment in the Swedish labor market to investigate how past and contemporary unemployment affect a young worker's probability of being invited to a job interview. In contrast to studies using registry/survey data, we have complete control over the information available to the employers and there is no scope for unobserved heterogeneity. We find no evidence that recruiting employers use information about past unemployment to sort workers, but some evidence that they use contemporary unemployment to sort workers. The fact that employers do not seem to use past unemployment as a sorting criterion suggests that the scarring effects of unemployment may not be as severe as has been indicated by previous studies.