Women in STEM: Ability, Preference, and Value *†
Xuan Jiang‡
The Ohio State University
March 25, 2021
Abstract
Women are underrepresented in both STEM college majors and STEM
jobs. Even with a STEM college degree, women are significantly less likely
to work in STEM occupations than their male counterparts. This paper stud-
ies the determinants of the gender gap in college major choice and job choice
between STEM and non-STEM fields and quantifies how much the gender
wage gap can be explained by these choices using an extended Roy Model. I
find that men’s ability sorting behavior is statistically stronger than women’s
in major choice, yet gender differences in ability and ability sorting together
explain only a small portion of the gender gap in STEM majors. The gen-
der gap in STEM occupations cannot be explained by the gender differences
in ability or ability sorting. Instead, a part of the gender gap in STEM oc-
cupations can be explained by the fact that women are more represented
in less Math-intensive STEM majors and graduates from those majors are
more likely to be well-matched to and to take jobs in non-STEM occupa-
tions. The other part of the gender gap in STEM occupations can be ex-
plained by women’s preference over work-life balance and women’s home
location. The counterfactual analysis shows that about 13.7% of the gender
wage gap among college graduates can be explained by the returns to STEM
careers among the non-STEM women in the top 6.7% of the ability distribu-
tion.
Keywords: Gender Differences in STEM, Ability Sorting, College Major
Choice, STEM Attrition, STEM Job
JEL Classification: I20, I23, J16, J24, J31
*I am grateful to Miguel Sarzosa for his invaluable support on this research project. I thank
Kevin Mumford, Victoria Prowse, Jillian Carr, Bruce Weinberg, Audrey Light, Cynthia Bansak,
Guanyi Yang, Xiaoxiao Li, and Weiyang Tham for their insightful comments. I also thank seminar
participants at Purdue University, The Ohio State University, Villanova University, St. Lawrence
University, and conference attendees at SEA 2017 and SOLE 2018. All errors are my own.
†See https://sites.google.com/site/gabixuanjiang/research for the latest version.
This paper was previously circulated under the title “Planting the Seeds for Success: Why
Women in STEM Do Not Stick in the Field”.
‡Department of Economics, Ohio State University. Email: jiang.445@osu.edu. I do not have
access to any information leading to the identification of individuals in the data. All data analysis
was carried out on a secure server.
1 Introduction
Women are underrepresented in science, technology, engineering, and mathematics
(STEM) college majors and occupations.1 Even with a STEM degree,
young women’s participation decreases at each stage of the STEM pipeline (Xu,
2008). Why is the lack of women in the STEM fields a concern? First, there are
signs of a substantial shortage in STEM labor supply, while demand has been
increasing steadily for decades.2
Second, as most of the STEM fields pay higher salaries than non-STEM fields,
the lack of women in STEM may contribute to the gender wage gap (Kahn and
Ginther, 2017; Noonan, 2017; Xue and Larson, 2015; Beede et al., 2011). Third,
when young women lack role models to motivate them and help them envi-
sion themselves in those positions, they are deterred by the idea that STEM is
a “man’s field” where girls do not belong (Shapiro and Williams, 2012). Last,
when women are not involved in STEM, products, services, and solutions are
designed by men and according to their user experiences. The needs that are
unique to women may be overlooked (Clayton et al., 2014; Hong and Page, 2004;
Fisher and Margolis, 2002).
The aim of this paper is threefold. First, I ask how much of the gender gap
in STEM college majors can be explained by gender differences in ability sort-
ing. This requires me to revisit the long-lasting debate on whether women are
less prepared in STEM-related subjects and avoid STEM college majors because
of that. Second, I ask why women with STEM degrees are less likely to
work in STEM occupations than their male counterparts. In particular, I assess
how much ability, ability sorting, and preferences can explain the gender gap in
STEM attrition after college graduation.3 Third, by simulating the counterfactual
choices, I estimate the pecuniary returns to a STEM career for women who chose
a non-STEM path by skill level and discuss how much these forgone returns
contribute to the gender wage gap.
I analyze the questions above using unique administrative data from Pur-
due University, one of the nation’s leading public STEM institutions. It contains
academic records of undergraduate students who graduated between 2007 and 2014
1 While nearly as many women hold college degrees as men overall, they make up only about
a third of all STEM degree holders. Although women fill close to half of all jobs in the U.S.
economy, they hold less than a quarter of STEM jobs. Women with STEM college degrees are less
likely than their male counterparts to work in STEM occupations. About 40 percent of men with
STEM college degrees work in STEM jobs, while only 26 percent of women with STEM degrees
work in STEM jobs (Noonan,2017).
2 BLS projects that STEM employment will increase by 8.8% from 2018 to 2028, compared to a 5%
increase in non-STEM employment: https://www.bls.gov/emp/tables/stem-employment.htm.
3 In this paper, I use “STEM attrition” to mean “working in a non-STEM occupation after
getting a STEM degree”.
and is linked to the First Destination Survey conducted by the Purdue Center for
Career Opportunities. The data provide rich information on individuals’ high
school records, standardized test scores, college transcripts, and information on
the first job since college graduation. To explore the endogenous choices of major
and job between STEM and non-STEM and, more importantly, the gender differ-
ences in these choices, I apply an extended Roy model where individuals maxi-
mize their expected earnings by making the field choices conditional on the two
latent skills, General Academic Skill and STEM-specific Skill, as well as preferences
over the field of study and workplace. This paper does not revisit the dynamic
problem of college major choice, in which college performance affects students’
choices and updates their beliefs; that problem has been well documented in the
higher education literature (e.g., Arcidiacono (2004);
Rask and Tiefenthaler (2008); Arcidiacono et al. (2012); Ahn et al. (2019)). The
main focus of this paper, by adopting the Roy model, is to assess to what extent
the gender differences in ability sorting explain the gender gap in STEM fields
and quantify the pecuniary returns to STEM degrees and careers.
I first show that, on average, skills are statistically weaker determinants of
college major choice for women than men; however, gender differences in abil-
ity and ability sorting together explain only a small portion of the gender gap
in STEM majors. This finding complements the large literature on the gender
gap in college STEM degrees. While there is abundant literature that covers ability
sorting in choice of college major4 and gender differences in college major
choice,5 the debate on whether and to what extent the gender gap in STEM can
be explained by gender differences in ability and gender differences in ability sorting
has not been settled. Some studies find that academic preparation in math
and science is crucial in choosing a quantitative college major, and women’s low
entry to STEM is mostly due to a lack of innate ability and less readiness in related
subjects (Speer, 2017; Card and Payne, 2017; Aucejo et al., 2016; Eccles, 2007;
Trusty, 2002; Ethington and Woffle, 1988; Hanson et al., 1996). Others argue that
the small gender differences in preparation do not explain the large gender gap
in STEM majors (Saltiel, 2019; Zafar, 2013; Kimmel et al., 2012; Dickson, 2010;
Turner and Bowen, 1999; Xie et al., 2003), which aligns with the finding in my
paper.
Second, I find that the gender gap in major choices is more pronounced among
high-ability women. High-ability women are less likely to choose a STEM major
while more rewarded in a STEM occupation than their high-ability male counter-
4 For example, Humphries et al. (2017); Altonji et al. (2016); Wiswall and Zafar (2015a,b); Altonji
et al. (2012); Arcidiacono et al. (2012); Arcidiacono (2004), and many more.
5 See Ahn et al. (2019); Dickson (2010); Turner and Bowen (1999); Blakemore and Low (1984);
Daymont and Andrisani (1984); Polachek (1981, 1978), and many more.
parts. The counterfactual analysis shows that the return to a STEM career for a
high-ability woman is about $13,000–$20,000 in annual salary. About 13.66% of
the gender wage gap among college graduates can be explained by the return to
STEM careers among the non-STEM women in the top 6.67% of the ability distribution.6
To the best of my knowledge, this paper is the first to document the
disproportionate and considerable number of high-ability women who choose
a non-STEM path, quantify the total pecuniary return to STEM careers for this
group, and then link the return to the gender wage gap. These findings add to
the literature that treats the gender gap in college major as a substantial contributor
to the gender wage gap (Daymont and Andrisani, 1984; Eide, 1994; Brown
and Corcoran, 1997; Blau and Kahn, 2017).
Third, I contribute to the limited work on understanding the link between
field choice and job choice. Young women’s participation decreases at each stage
of the STEM pipeline with greater gender stratification in STEM occupations than
STEM education, suggesting that factors other than ability or training generate
the gender gap in STEM occupations. Some studies find that the demands of family and
children are major nonacademic barriers for women on the pathway to a STEM
profession (Xie et al., 2003; Kimmel et al., 2012; Hanson et al., 1996). Hunt (2016)
finds that the major factor driving the high exit rate of women from STEM fields
is a lack of promotion opportunities among female engineers, while family-related
constraints and dissatisfaction with working conditions are only secondary
factors. Taking one step back, I focus on STEM attrition regarding the first job
choice of STEM college degree holders and investigate the gender differences in
opting-out of STEM fields right after college graduation, which is one of my chief
contributions in this paper.
I find no evidence showing that ability or ability sorting matters for STEM
occupation choice or the gender gap in STEM occupations. In fact, part of the
gender gap in STEM attrition could be attributed to the gender segregation in dif-
ferent STEM majors and how we define STEM jobs. Specifically, women are more
represented in less Math-intensive STEM majors. Moreover, graduates from less
Math-intensive STEM majors are more likely to be well-matched to and to take
jobs in non-STEM occupations. Further, by fully decomposing the job decision
equation, I show that both work-life balance preference and home region fixed
effects largely explain the gender gap in STEM attrition. Female STEM degree
holders who return to their home state are more likely to opt out of STEM fields,
potentially because of women’s preferences over job location, family-related reasons,
and stereotyping in their home state.
6 The gender wage gap ($8,198) is calculated by subtracting Purdue’s female graduates’
mean annual salary from Purdue’s male graduates’ mean annual salary.
2 Data
I analyze a unique dataset from Purdue University. The dataset includes academic
records of undergraduate students who graduated between 2007 and 2014.
The data provide rich information on individuals’ high school records, standard-
ized test scores, and college transcripts. I link the administrative dataset to the
voluntary First Destination Survey conducted by the Purdue Center for Career
Opportunities, which asks about graduates’ first jobs, including job title,
employer information, annual salary, etc.
Purdue University has one of the top engineering schools in the United States.
It is both important and interesting to study the gender difference in STEM in
Purdue’s undergraduate population. Although this population may differ from
the typical US undergraduate population, I believe it provides an appropriate
setting to answer the questions at hand. First, given Purdue’s renowned
STEM programs, one would expect Purdue students to be more STEM inclined
than the national average. Thus, findings on Purdue women’s underrepresentation
in STEM may indicate even greater gender gaps at the national level.
Second, Purdue’s data are better suited for this study than any publicly available
dataset that inquires about skills, education, and career outcomes. Well-known
publicly available datasets which are not designed for study on higher education
or underrepresentation in STEM may not have enough observations of 4-year-
college degree holders, let alone enough observations of STEM degree holders.7
Third, this dataset provides detailed and administrative test scores which facili-
tate better measurement of individual abilities. Last, compared to the broad clas-
sification of fields of study and occupation in most publicly available datasets,
Purdue’s data provide the official 6-digit Classification of Instructional Programs
codes (henceforth, CIP codes) and fine-coded occupations in 6-digit Standard
Occupational Classification codes (henceforth, SOC codes). The detailed classi-
fication of majors and occupations is critical for understanding the job outcomes
and STEM attrition.
The administrative dataset available to this research contains pre-college academic
records and college transcripts from 39,538 domestic undergraduate students
who graduated between 2007 and 2014.8 After excluding observations
7 Sample size of STEM degree holders matters not only for studying the link between major
choice and job choice but also for the computational purpose of the estimation. For example, the
well-known NLSY97 includes about 1600 individuals (women and men) who graduated from a
4-year-college. About 1400 college degree holders ever reported a full-time job, among which
20%-30% graduated with a STEM degree (broad classification). In addition to the small fraction
of graduates placed in STEM occupations, the sample size is too small to feed the estimation.
8 I exclude international students from the analysis for two reasons. First, international students
come from very different educational backgrounds. Both the academic performance and
with incomplete pre-college records, primarily due to transfer students, I have
a sample with 35,371 individuals. I further restrict the sample based on the
variable requirement of the skill measurement. I drop the records with non-traditional
high school GPA9 and obtain a sample with 28,877 individuals. Conditional
on having valid high school academic records, 13,104 students have valid
records of the ACT scores, which are essential to the measurement of skills.10 Section
3.2 goes into more detail about the reason for using the ACT scores in
the estimation. Another score I use for the skill measurement is the grade point
in the course Fundamentals of Speech Communication (henceforth COM114),11 a required
course for all freshmen at Purdue. Table A1 shows the details of sample
reduction.
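The sample reduction described above can be tabulated with a short script. The sketch below is only an illustration: the counts are those reported in the text, while the step labels and the function name `retention_table` are hypothetical shorthand and are not part of the original analysis.

```python
# Sample counts as reported in the text; the labels are shorthand for each step.
SAMPLE_STEPS = [
    ("domestic graduates, 2007-2014", 39_538),
    ("complete pre-college records", 35_371),
    ("traditional high school GPA", 28_877),
    ("valid ACT scores", 13_104),
]


def retention_table(steps):
    """Return rows of (label, count, share of previous step, share of first step)."""
    rows = []
    first = steps[0][1]
    prev = first
    for label, n in steps:
        rows.append((label, n, n / prev, n / first))
        prev = n
    return rows


if __name__ == "__main__":
    for label, n, step_share, total_share in retention_table(SAMPLE_STEPS):
        print(f"{label:<32} {n:>7,} {step_share:>7.1%} {total_share:>7.1%}")
```

Each row reports the share retained relative to the previous step and relative to the initial sample, which shows that the ACT requirement is the most restrictive of these steps.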
In the restricted sample, with all valid test scores, 3,055 students (1,145 women
and 1,910 men) responded to the First Destination Survey with a meaningful job
title and annual salary. Table 1 shows the summary statistics for the restricted
sample by gender. The same set of summary statistics for the full sample is
listed in Table A2. Overall, women have slightly higher ACT English scores,
COM114 grade points, and high school GPAs, while men have slightly higher
scores in ACT Reading and ACT Science. Gender differences in average ACT
Math scores are the largest among all test scores, mainly driven by a substantial
share of male students scoring in the higher percentiles on ACT Math. The high
male-female ratio in the higher percentiles of math performance has been documented
in previous studies (e.g., Pope and Sydnor, 2010). Compared to the
restricted sample statistics, the average test scores in the full sample are generally
lower. I present a comparison of the test score distributions between the two
samples in Figures A2 and A3. Both figures show that the restricted and the full
educational choices are not comparable to the domestic students. Second, I observe only job des-
tinations within the U.S., but most international students left the U.S. after graduation and are
absent from the survey.
9 I drop individuals with a missing high school GPA, a GPA of zero, or a GPA below
2.0. The middle 50% of admitted Purdue first-year students have a 3.5-3.9 high school GPA. In
theory, there should not be any admission with a high school GPA less than 2.0. However, there
are two potential reasons why we have records with a high school GPA below 2.0. 1) Students
from private high schools might have an unusual high school GPA which is not on a 4.0 scale. I
do not have the information on each high school GPA’s scale and whether it could be converted.
2) Transfer students might have a mishandled GPA.
10 In the full sample, 41% of students had taken the ACT (including those who
also took the SAT) when they applied to Purdue. The rest of them took only the SAT. I find no
selection on taking the ACT over the SAT in terms of ability. More importantly, there is no gender
difference in selection on taking the ACT over the SAT.
11 COM114 covers the study of communication theories as applied to speech and involves practical
communicative experiences ranging from interpersonal communication and small group processes
to informative and persuasive speaking in standard speaker-audience situations.
https://www.cla.purdue.edu/communication/undergraduate/com_114.html. There are
some missing values in this variable due to credit transfers or course waivers.
samples differ in their means but not their shapes.12 The state-year AFGRs are
not statistically different between the full and restricted samples, suggesting no
difference in the student composition in terms of high school background for the
restricted and the full sample. To further address the concern of potential selec-
tion into the final sample in terms of the unobservables, I estimate a model with
an indicator of reporting the first job as a dependent variable and two latent skills
as independent variables, controlling for relevant individual and cohort charac-
teristics. I further estimate the skill distributions and the major choice model for
the full sample regardless of survey response. I discuss this robustness check
fully in Appendix A.
The share of STEM degree holders is 29.38% among women and 54.72% among
men in the full sample. 73.11% of female STEM degree holders work in STEM
occupations, while 81.17% of male STEM degree holders work in STEM occu-
pations. The 2015 American Community Survey shows that 13.6% of college-
educated female employees are STEM degree holders, and 31.6% of college-
educated male employees are STEM degree holders. About 40% of men and
26% of women with STEM college degrees work in STEM jobs (Noonan, 2017).
Compared to the national averages, Purdue’s STEM degree share is higher due
to the composition of undergraduate programs at Purdue. Additionally, Purdue
has lower post-graduation STEM attrition rates (the share of STEM graduates
working in non-STEM occupations), which is likely due to two reasons. First, Purdue’s
graduates have better employment outcomes, on average, than the
national average. Second, the restricted sample further enlarges the gap to the
national averages due to the lower First Destination Survey response rate among
non-STEM graduates.
In the restricted sample, the share of STEM degree holders is 37.03% among
women and 63.40% among men, suggesting that both female and male STEM de-
gree holders are over-responding to the survey relative to their non-STEM coun-
terparts. The gender difference in the fraction of STEM degree holders of the full
sample (25.34%) is similar to that of the restricted sample (26.37%), but propor-
tionally, STEM women are slightly more over-responding relative to STEM men.
It is worth noting that men’s overall survey response rate is higher than that of
women (34.14% vs. 25.46%). Thus, it is arguable that I oversample high-earning
women, and the gender wage gap in this paper might be understated.
The average self-reported annual salary of women is 18.25% ($8,198) lower
than men over the restricted sample. Gender wage gaps are different across sub-
groups of career paths: there is a small and insignificant wage gap between fe-
12 Shifts in means would not affect the main results because the model estimates only the
linear effects of ability sorting.
male STEM workers and male STEM workers. Female non-STEM workers with
STEM degrees earn 12.8% ($6,178) less than their male counterparts. Female non-
STEM workers with non-STEM degrees earn 16.7% ($6,519) less than their male
counterparts. The STEM job premium, by annual salary, is 45% for women and
23% for men.13
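The descriptive statistics above can be cross-checked with simple arithmetic. The sketch below assumes that each reported percentage gap is measured relative to the male mean (an assumption, since the text does not state the base explicitly); the function names `implied_means` and `share_gap` are hypothetical.

```python
def implied_means(gap_dollars, gap_share):
    """Back out group means from a dollar gap and a proportional gap.

    Assumes gap_share = (male_mean - female_mean) / male_mean.
    """
    male_mean = gap_dollars / gap_share
    female_mean = male_mean - gap_dollars
    return male_mean, female_mean


def share_gap(male_share, female_share):
    """Percentage-point difference between two shares, rounded to two decimals."""
    return round(male_share - female_share, 2)


if __name__ == "__main__":
    male, female = implied_means(8198, 0.1825)
    print(f"implied male mean salary:   ${male:,.0f}")
    print(f"implied female mean salary: ${female:,.0f}")
    print("STEM share gap, full sample:      ", share_gap(54.72, 29.38))
    print("STEM share gap, restricted sample:", share_gap(63.40, 37.03))
```

Under this assumption, the overall gap implies mean annual salaries of roughly $44,900 for men and $36,700 for women. These are back-of-the-envelope values rather than figures taken from Table 1.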
STEM Major/Job Definition
Let me describe how I classify college majors and occupations as binary choice
variables (STEM/non-STEM). I use the “first graduation major” as a student’s
major, regardless of what major a student applied to or started with.14 Every
major is officially coded as a 6-digit Classification of Instructional Programs (CIP)
code, which is also shown on every international student’s Form I-20 for visa
issuance.15
The STEM major indicator in this study is defined by the “STEM Designated
Degree Program List Effective May 10, 2016” published by U.S. Immigration and
Customs Enforcement (ICE, 2016). It is a complete list of fields of study that
the Department of Homeland Security (DHS) considers to be science, technology,
engineering, or mathematics (STEM) fields of study for purposes of the 24-month
STEM optional practical training extension described at 8 CFR 214.2(f).16 I
code the STEM major indicator as 1 for Purdue undergraduate degree programs
showing up on this list and 0 otherwise, with some exceptions.17
The First Destination Survey provides information on graduates’ first jobs,
13 Statistics from the 2015 American Community Survey (ACS) show that, in terms of hourly
wages, the STEM premium is 35% for women and 30% for men. Statistics from the 2009 ACS show
that the STEM premium is 33% for women and 25% for men (Noonan, 2017).
14 About 2.76% of students graduated with a second major, and 0.087% graduated with
a third major. I exclude graduates with more than one major in this paper. Note that engineering
majors cannot be listed as a second major unless the first major is also in engineering. A student
cannot transfer into an engineering major if he or she did not start as an engineering student.
15 The Form I-20 (also known as the Certificate of Eligibility for Nonimmigrant (F-1) Student
Status-For Academic and Language Students) is an ICE document issued by SEVP-certified
schools (colleges, universities, and vocational schools) that provides supporting information on
a student’s F or M status.
16 Under 8 CFR 214.2(f)(10)(ii)(C)(2), a STEM field of study is a field of study “included in the
Department of Education’s Classification of Instructional Programs taxonomy within the two-
digit series containing engineering, biological sciences, mathematics, and physical sciences, or a
related field. In general, related fields will include research, innovation, or development of new
technologies using engineering, mathematics, computer science, or natural sciences (including
physical, biological, and agricultural sciences).”
17 Some customization has been made according to Purdue’s particular programs. “Nursing” is
defined as a non-STEM degree program by the DHS because there are many nursing degrees, and
most of them do not focus on medical training. The nursing major at Purdue offers only Bachelor
of Science in Nursing degrees, and the placement of undergraduates is basically as registered
nurses (RNs). Additionally, a Registered Nurse is defined as a STEM occupation according to
BLS. Two Purdue majors have not been documented in the DHS’s list: “Radiological Health
Sciences” and “Health Sciences General.” I treat both as STEM majors based on the degrees both
programs offer and the program requirements.
such as job title, employer (company name), job location (city and state), and
annual salary. I match each self-reported job title to a Standard Occupational
Classification (SOC) title associated with a 6-digit SOC code by using O*NET
search.18 I define a self-reported job as a STEM occupation according to the “At-
tachment C: Detailed 2010 SOC occupations included in STEM” published by the
Bureau of Labor Statistics (BLS, 2012).19
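The two binary indicators defined in this subsection amount to set-membership checks on CIP and SOC codes. The following is a minimal sketch under stated assumptions: the code sets are tiny illustrative placeholders, not the actual DHS or BLS lists, and the function names are hypothetical rather than taken from the original analysis.

```python
# Illustrative placeholders only: NOT the actual DHS or BLS lists.
STEM_CIP_CODES = {"14.0801", "27.0101"}   # stand-in for the DHS STEM designated degree list
PURDUE_OVERRIDES = {"51.3801"}            # stand-in for majors added by hand (e.g., nursing)
STEM_SOC_CODES = {"17-2051", "15-2041"}   # stand-in for the BLS Attachment C list


def is_stem_major(cip_code):
    """STEM major indicator: on the (placeholder) DHS list or a manual override."""
    return cip_code in STEM_CIP_CODES or cip_code in PURDUE_OVERRIDES


def is_stem_job(soc_code):
    """STEM job indicator: on the (placeholder) BLS list."""
    return soc_code in STEM_SOC_CODES


def is_stem_attrition(cip_code, soc_code):
    """STEM attrition: a STEM degree followed by a non-STEM first job."""
    return is_stem_major(cip_code) and not is_stem_job(soc_code)


if __name__ == "__main__":
    # Hypothetical graduates as (major CIP code, first-job SOC code) pairs.
    graduates = [("14.0801", "17-2051"), ("14.0801", "41-3099"), ("52.0201", "41-3099")]
    for cip, soc in graduates:
        print(cip, soc, is_stem_major(cip), is_stem_job(soc), is_stem_attrition(cip, soc))
```

Keeping the manual overrides in a separate set makes the departures from the official list explicit and easy to audit.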
3 Model
3.1 The General Setup
This general framework is inspired by the Roy model (Roy, 1951; Borjas, 1987),
in which individuals make choices to maximize their expected wage by selecting
working sectors based on their comparative advantages. I assume that students
choose between a STEM degree and a non-STEM degree and choose between a
STEM occupation and a non-STEM occupation to maximize the expected income
based upon their skills and preferences. Previous studies have documented that
major choice is a process rather than a one-time decision and have explored college
students’ major choice in a dynamic fashion, incorporating updated beliefs
on expected outcomes and college performance (Astorne-Figari and Speer, 2019;
Altonji et al., 2016; Arcidiacono, 2004). In this paper, I have no intention to revisit
the dynamic problem of college major choices incorporating college performance
at each stage but rather focus on the gender differences among the “survivors”
with a static setting. Additionally, capturing the dynamic choice is difficult for
two empirical reasons: 1) The actual time of major switching is uncertain because
students are not required to declare a major until they are about to declare
candidacy; 2) Dropping out and transferring are alternatives to switching
majors, but I do not observe dropouts or transfers in the dataset I have access to.
The core of the empirical strategy follows Carneiro et al. (2003); Hansen et al.
(2004); Heckman et al. (2006); Sarzosa and Urzúa (2015); Sarzosa (2017); Prada
et al. (2017). The extended Roy model I implement here can be described as a set
of outcome equations (1) and choice equations (3 and 5) linked by two underlying
factors: $\theta_A$, the General Academic Skill, and $\theta_B$, the STEM-specific Skill.20 The
two latent skills represent an individual’s pre-college ability, which reflects
innate ability, schooling environment effects, and peer effects.
Most of the literature directly uses test scores as direct measures of ability; how-
18 https://www.onetonline.org/find/
19 https://www.bls.gov/soc/attachment_c_stem.pdf The list includes 184 STEM occupations
out of 840 occupations, in 6-digit SOC codes.
20 I use “factor” and “latent skill” interchangeably in this paper.
ever, those test scores should be considered only as functions of ability (Carneiro
et al., 2003). The measurement system in Section 3.2 allows us to proxy unobservable ability
while accounting for measurement errors in observed test scores.
For each individual, the main outcome variable, annual salary, is given in the
following form:
$$ Y_{D_M,D_J} = X_Y \beta_Y + \alpha_{Y,A}\,\theta_A + \alpha_{Y,B}\,\theta_B + e_Y \tag{1} $$
where $Y_{D_M,D_J}$ is the salary conditional on the major choice $D_M$ and the job choice
$D_J$. $X_Y$ is a vector of all observable characteristics affecting salary, including the
state-year level unemployment rate at the job state, STEM employment fraction, employment
in STEM and non-STEM sectors, number of college graduates, STEM
graduates, female graduates, and female STEM graduates. $\beta_Y$ is the vector of
returns associated with $X_Y$. $\alpha_{Y,A}$ and $\alpha_{Y,B}$ are the factor loadings of the underlying
factors $\theta_A$ and $\theta_B$. $e_Y$ is the error term. I assume that $e_Y$ is independent of
the observable controls and the unobserved factors, i.e., $e_Y \perp\!\!\!\perp (\theta_A, \theta_B, X_Y)$.
Choice of Major
Individuals choose between a STEM and a non-STEM degree based on their
latent skills and observable preferences. Let $D_M^*$ denote the utility for individuals
who graduate with a STEM degree.
$$ D_M^* = X_M \beta_M + \alpha_{M,A}\,\theta_A + \alpha_{M,B}\,\theta_B + e_M \tag{2} $$
where $X_M$ is a vector of observable characteristics affecting major choice, capturing
pre-college STEM environment, female peer effect during college, and
other cohort-specific characteristics. Variables in $X_M$ include Census division-level
STEM share of pre-college teachers, Census division-level female share of
pre-college STEM teachers,21 Purdue’s female share index of each major (base
year is 2006), major cohort size by degree year, and admission year fixed effects.
The literature has documented that role models (instructors, peers, and teaching
assistants) are effective in recruiting more women in STEM fields (Neumark
and Gardecki, 1998; Herrmann et al., 2016; Kahn and Ginther, 2017; Canaan and
Mouganie, 2019; Mansour et al., 2020) and role models of both genders could be
equally effective (Cheryan et al., 2011). The reason for including the STEM share of
pre-college teachers and female share of pre-college STEM teachers is to capture
21 These two variables are constructed by using public-use data from NSF’s National Survey of
College Graduates, which provide teaching subjects of pre-college teachers. The mean and standard
deviation of the first variable, “STEM Share of Pre-college Teachers”, are 0.5596 and 0.0170.
The mean and standard deviation of the second variable, “Female Share of STEM Pre-college
Teachers”, are 0.6575 and 0.0404.
the role modeling effect on students’, especially female students’, interest
in STEM majors. Peer effect is another critical element affecting students’ field
choice and persistence. Studies find that the gender composition of a major is critical
to female students’ major choice and women are likely to switch out of male-dominated
fields (Kugler et al., 2017; Bostwick and Weinberg, 2018; Astorne-Figari
and Speer, 2019). I include the female share index of each major to capture
this female peer effect. I also control for major cohort size for any potential
competition or collaboration effects, and admission year fixed effects for the rest
of unobserved variations in the entering cohort. The factor loadings, $\alpha_{M,A}$ and
$\alpha_{M,B}$, indicate the importance of the two latent skills, $\theta_A$ and $\theta_B$, in selection into
STEM majors. I assume independence of the error term, i.e., $e_M \perp\!\!\!\perp (\theta_A, \theta_B, X_M)$.
$D_M$ is a binary variable that equals one if the individual chooses
a STEM major ($D_M^* > 0$) and zero otherwise. The major choice can be written as
$$ D_M = \mathbb{1}[D_M^* > 0] \tag{3} $$
Choice of Job
I assume students with STEM degrees choose between a STEM and a non-STEM
job while students with non-STEM degrees are placed in non-STEM jobs only. Here
is the rationale for this assumption: a STEM job requires certain knowledge and
techniques usually obtained from STEM training. According to Purdue’s dataset,
only about 3% of the students placed in a STEM occupation with a non-STEM
degree. Because this fraction is so small, it is computationally infeasible to estimate
the model with the category ($D_M = 0$, $D_J = 1$) included. Thus, the job choice is
straightforward:
$$ D_J^* = X_J \beta_J + \alpha_{J,A}\,\theta_A + \alpha_{J,B}\,\theta_B + e_J, \quad \text{if } D_M = 1 \tag{4} $$
where $D_J^*$ is the utility for individuals who work in STEM after getting STEM
degrees, conditional on graduating with STEM degrees ($D_M = 1$). $X_J$ is a vector
of observable characteristics that affect job choice, capturing preferences in
the workplace. Specifically, $X_J$ includes the state-year level share of young women
(ages ≤30) in STEM occupations, state-year level average work hours per week
of STEM occupations,22 cohort size of major, degree year fixed effects, and home
Census-region fixed effects. The share of young women in STEM occupations
is a proxy for female-friendliness at the workplace regarding providing sup-
22State-year level share of young women in STEM occupations and average work hours per
week of STEM occupations are constructed by using data from the CPS of the same time window.
The mean and standard deviation of the first variable, “Female Share of STEM Workers” is 0.4349
and 0.1931. The mean and standard deviation of the second variable, “STEM Average Work
Hours/Week” is 40.5823 and 1.4010.
11
port on work and promotion and reducing gender stereotyping, which is found
to be a key determinant for women’s retention in STEM (Kunze and Miller,
2017). Gender-specific preference on work-life balance could affect gender gaps
in STEM jobs as well. Literature finds that employers of male-dominated fields
such as engineering and technology expect employees to work long hours, and
women are more likely to leave their employer as well as exit the field entirely
in a situation in which they must choose between work and family (Kahn and
Ginther,2017;Cha,2013;Corbett and Hill,2015). I include major cohort size
to control for any potential competition or collaborating effects and degree year
fixed effects for the rest of the unobserved variation by degree cohort. The factor
loadings, αJ,A and αJ,B, indicate the importance of the two latent skills, θAand
θB, in selection into STEM occupations. DJis a binary variable, which equals 1
if the individual chooses a STEM job and 0 otherwise, conditional on graduating
with a STEM degree (DM= 1).
DJ= [D∗
J>0],if DM= 1 (5)
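The sequential structure of equations (3)–(5) can be illustrated with a small
simulation. This is only a sketch: the index values `xb_m` and `xb_j`, the
loadings, and the normal skill distributions are hypothetical placeholders, not
the paper's estimates, and the major index is assumed to have the same linear
form as (4) with a standard normal error.

```python
import numpy as np

def simulate_choices(theta_a, theta_b, xb_m, xb_j, load_m, load_j, rng):
    """Simulate D_M and D_J from linear latent indices with N(0, 1) errors."""
    n = theta_a.shape[0]
    # Latent utility of a STEM major, then the indicator rule in (3).
    d_star_m = xb_m + load_m[0] * theta_a + load_m[1] * theta_b + rng.standard_normal(n)
    d_m = (d_star_m > 0).astype(int)
    # Latent utility of a STEM job as in (4); it only matters when D_M = 1.
    d_star_j = xb_j + load_j[0] * theta_a + load_j[1] * theta_b + rng.standard_normal(n)
    d_j = ((d_star_j > 0) & (d_m == 1)).astype(int)
    return d_m, d_j

rng = np.random.default_rng(0)
theta_a = rng.normal(0.0, 3.5, size=10_000)   # hypothetical General Academic Skill
theta_b = rng.normal(0.0, 2.8, size=10_000)   # hypothetical STEM-specific Skill
d_m, d_j = simulate_choices(theta_a, theta_b, xb_m=-0.5, xb_j=0.2,
                            load_m=(0.07, 0.05), load_j=(0.02, 0.01), rng=rng)
print("share STEM major:", d_m.mean())
print("share STEM job among STEM majors:", d_j[d_m == 1].mean())
```

By construction, the excluded category ($D_M = 0$, $D_J = 1$) never occurs in
the simulated data, mirroring the assumption above.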
3.2 The Measurement System of The Two Latent Skills
To implement the model described above, I first estimate the distributions of the
two latent skills, F(θA)and F(θB), by a measurement system specified based
on empirical evidence on the relationship of college major choice and test scores
and the nature of the data available. The key point of adopting this latent skill
measurement system is to overcome the measurement errors when using the
observed test scores. Previous studies that use test scores as direct measures
of ability reach conflicting conclusions about how much ability explains the
gender gap in college majors. Using SAT math and verbal scores or SAT composite
and ACT
composite scores as direct measures of ability or readiness, Turner and Bowen
(1999) and Dickson (2010) find that the scores account for little of the gender
gap in majors or in differential switching out of engineering majors by gender;
while Ware et al. (1985) and Paglin and Rufolo (1990) find that the scores are
significant predictors of gender differences in major choice. Speer (2017) uses a
full set of ASVAB test scores from NLSY and finds those scores explain more than
half of the gender gap in college major choice. He also finds that having a wide
array of skill measures, particularly the mechanical scores in the ASVAB,
substantially improves the prediction of college major choice. Similarly, I use
all four ACT subject scores here (English, Reading, Science, and Math). Using ACT
subject scores allows me to gather enough test scores to identify two factors.23
Moreover, the unique Science subject in ACT provides an additional dimension
of skill, especially STEM-related, that other standard tests do not provide.
The six scores I use to estimate the latent skills are listed in (6):
$ACT_{English}$, $COM114$, $ACT_{Reading}$, $ACT_{Science}$, $HSGPA$, and
$ACT_{Math}$. I do not use any college transcript grades other than COM114, for
two reasons. First, college performance will be highly, if not perfectly,
correlated with pre-college ability. Thus, including college grades would cause
collinearity problems and bias the factor loadings. Whether students gain extra
ability during college remains an empirical question. Second, ability sorting
and belief updating based on performance are two separate channels affecting
college major choice. The latter is the main channel for major switching and
dropping out. As mentioned earlier, exploring major switching and dropping out
is not the focus of this paper.
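$$T = \begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \\ T_5 \\ T_6 \end{bmatrix} = \begin{bmatrix} ACT_{English} \\ COM114 \\ ACT_{Reading} \\ ACT_{Science} \\ HSGPA \\ ACT_{Math} \end{bmatrix} \qquad (6)$$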
The measurement system takes the following form, where each score can be viewed
as a linear production function of the underlying skills:
$$T = X_T B_T + A_{T,A}\theta_A + A_{T,B}\theta_B + e_T \qquad (7)$$
$T$ is an $L \times 1$ vector that contains the $L$ test scores associated with
the latent skills $\theta_A$ and $\theta_B$. $X_T$ is a matrix of observable
controls associated with the scores, including the Averaged Freshman Graduation
Rate (AFGR) at the home-state (the state where one completed high school) by
high-school-graduation-year level, home Census-region fixed effects, and college
entrance semester fixed effects. $B_T$ is a vector of coefficients. $A_{T,A}$
($A_{T,B}$) is a vector of loadings on the latent skill $\theta_A$
($\theta_B$). I assume independence of the error terms,
$e_T \perp\!\!\!\perp (\theta_A, \theta_B, X_T)$, and that all elements of $e_T$
are mutually independent.
Following the identification strategy of Carneiro et al. (2003), I estimate the
distributions of the two latent skills, $F(\theta_A)$ and $F(\theta_B)$, and the
two vectors of loadings, $A_{T,A}$ and $A_{T,B}$, from the covariance matrix of
equation system (7). I assume orthogonality of the factors (i.e.,
$\theta_A \perp\!\!\!\perp \theta_B$). The structure of the factor loadings,
$A_{T,A}$ and $A_{T,B}$, takes the form in (8): the first factor is allowed to
affect all six scores, while the second factor is allowed to affect only
$ACT_{Science}$, $HSGPA$, and $ACT_{Math}$. That is, a change in the first
latent skill moves all six scores, whereas a change in the second latent skill
moves only $ACT_{Science}$, $HSGPA$, and $ACT_{Math}$. This is the so-called
“triangular” setting for the loading system, where the first factor is
identified from the covariances of all six scores, and the second factor is
identified from the residual covariances of the second set of scores
($ACT_{Science}$, $HSGPA$, and $ACT_{Math}$) after partialling out the first
factor.
$$[A_{T,A}, A_{T,B}] = \begin{bmatrix} \alpha_{T_1,A} & \alpha_{T_1,B} \\ \alpha_{T_2,A} & \alpha_{T_2,B} \\ \alpha_{T_3,A} & \alpha_{T_3,B} \\ \alpha_{T_4,A} & \alpha_{T_4,B} \\ \alpha_{T_5,A} & \alpha_{T_5,B} \\ \alpha_{T_6,A} & \alpha_{T_6,B} \end{bmatrix} = \begin{bmatrix} \alpha_{T_1,A} & 0 \\ \alpha_{T_2,A} & 0 \\ 1 & 0 \\ \alpha_{T_4,A} & \alpha_{T_4,B} \\ \alpha_{T_5,A} & \alpha_{T_5,B} \\ \alpha_{T_6,A} & 1 \end{bmatrix} \qquad (8)$$
23 A necessary condition for identification is $L \geq 2k + 1$, where $L$ is the
number of scores and $k$ is the number of factors (Carneiro et al., 2003).
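To make the role of the covariances concrete, the sketch below simulates six
scores under the triangular layout in (8) and recovers one loading from sample
covariances alone. The loading values, the normal factor distributions, and the
unit-variance errors are hypothetical choices made for illustration (the
controls $X_T$ are omitted); the paper's estimator is the MLE described below,
not this ratio.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical loadings in the triangular layout of (8); rows are T1..T6,
# columns are [loading on theta_A, loading on theta_B].
loadings = np.array([
    [0.8, 0.0],   # ACT English
    [0.3, 0.0],   # COM114
    [1.0, 0.0],   # ACT Reading (normalized to 1 on theta_A)
    [0.9, 0.6],   # ACT Science
    [0.2, 0.1],   # HSGPA
    [0.7, 1.0],   # ACT Math (normalized to 1 on theta_B)
])
num_scores, num_factors = loadings.shape

# Necessary condition for identification: L >= 2k + 1.
condition_holds = num_scores >= 2 * num_factors + 1

# Orthogonal factors and mutually independent errors, as assumed in the text.
theta = rng.normal(0.0, [3.5, 2.8], size=(n, num_factors))
errors = rng.normal(0.0, 1.0, size=(n, num_scores))
scores = theta @ loadings.T + errors

cov = np.cov(scores, rowvar=False)
# With T3's loading fixed at 1: cov(T1, T2) = a1 * a2 * var(theta_A) and
# cov(T2, T3) = a2 * var(theta_A), so their ratio recovers a1.
alpha_t1_a_hat = cov[0, 1] / cov[1, 2]
print(condition_holds, round(alpha_t1_a_hat, 3))
```

The first three scores load only on the first factor, which is why their
covariances pin down that factor's loadings before the second factor is
considered.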
As the first factor is identified from the covariances of all six scores, I
label it “General Academic Skill”. I consider this skill to be a comprehensive
skill that students use to read, write, communicate, and memorize. The
literature has documented that math and verbal skills are highly correlated
(Aucejo et al., 2016; Saltiel, 2019). Here, General Academic Skill captures the
correlation between verbal-related and math-related skills, as it is identified
from the covariances of all six scores (ACT English, COM114, ACT Reading, ACT
Science, HSGPA, and ACT Math) and is allowed to affect performance on all
subjects. The second factor, labeled “STEM-specific Skill”, is identified from
the covariances of ACT Science, HSGPA, and ACT Math after partialling out their
covariances with General Academic Skill. In this sense, ACT Science, HSGPA, and
ACT Math are influenced both by General Academic Skill, which affects
verbal-related and math-related performance alike, and by STEM-specific Skill, a
STEM-related skill on an orthogonal dimension.
I also explore an alternative setting of the factors in Appendix E, which takes
the “non-triangular” structure of the loading system (i.e., each factor is identified
only by a different set of scores). Compared to the preferred specification here,
the alternative setting does not assume orthogonality of the two factors but sac-
rifices part of the covariances between the scores by not using the covariances of
all six scores.
I estimate the measurement system to obtain $\beta_T$, $\alpha_{T,A}$,
$\alpha_{T,B}$, $F(\theta_A)$, and $F(\theta_B)$ using maximum likelihood
estimation (MLE):24
24I use a modified version of the relative developed STATA command, heterofactor, by Sarzosa
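$$\mathcal{L} = \prod_{i=1}^{N} \iint \Big[ f_{e_{T_1}}(X_{Ti}, T_{1i}, \gamma_A, \gamma_B) \times \cdots \times f_{e_{T_6}}(X_{Ti}, T_{6i}, \gamma_A, \gamma_B) \Big] \, dF(\theta_A)\, dF(\theta_B) \qquad (9)$$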
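The double integral in (9) can be approximated numerically. The sketch below
computes one individual's likelihood contribution by Gauss-Hermite quadrature
under simplifying assumptions that are mine, not the paper's: both factors are
standard normal, the score errors are normal, and `residuals` stands for the
scores net of the controls. The paper estimates the factor distributions
themselves, so this is only meant to show the mechanics of integrating a
product of score densities over two latent factors.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x."""
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def individual_likelihood(residuals, load_a, load_b, err_sd, nodes=40):
    """Integrate the product of score densities over two independent N(0, 1) factors."""
    x, w = hermegauss(nodes)              # nodes and weights for weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)          # rescale so the weights sum to one
    ta, tb = np.meshgrid(x, x, indexing="ij")
    wa, wb = np.meshgrid(w, w, indexing="ij")
    density = np.ones_like(ta)
    for r, a, b, s in zip(residuals, load_a, load_b, err_sd):
        # Density of the score error implied by each pair of factor values.
        density *= normal_pdf(r - a * ta - b * tb, s)
    return float(np.sum(wa * wb * density))

# Single-score check: with unit loadings and error SD 2, the score residual is
# N(0, 1 + 1 + 4), so the integral has a closed form to compare against.
lik = individual_likelihood([0.5], [1.0], [1.0], [2.0])
closed_form = float(normal_pdf(0.5, np.sqrt(6.0)))
print(lik, closed_form)
```

The sample log-likelihood would then be the sum of the logs of these
contributions across individuals.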
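Having identified the distributions of the two latent skills, $F(\theta_A)$ and
$F(\theta_B)$, from (9), I estimate the Roy model by integrating the likelihood
function below over the distributions of the two factors.25
$$\begin{aligned} \mathcal{L} = \prod_{i=1}^{N} \iint \; & f_{e_{y_0}}(X_{Yi}, Y_{0i}, \theta_A, \theta_B) \times \Phi(-M)^{(1 - D_{Mi})} \\ & \times f_{e_{y_{10}}}(X_{Yi}, Y_{10i}, \theta_A, \theta_B) \times \Phi(M, J)^{D_{Mi}(1 - D_{Ji})} \\ & \times f_{e_{y_{11}}}(X_{Yi}, Y_{11i}, \theta_A, \theta_B) \times \Phi(M, J)^{D_{Mi} D_{Ji}} \; dF(\theta_A)\, dF(\theta_B) \end{aligned} \qquad (10)$$
where $M = (X_{Mi}\beta_M + \alpha_{M,A}\theta_A + \alpha_{M,B}\theta_B)$ and
$J = (X_{Ji}\beta_J + \alpha_{J,A}\theta_A + \alpha_{J,B}\theta_B)$. The two
latent skills are the key factors linking major choice, job choice, and salary
outcomes. By the Roy model's assumption, it is the maximum potential sectoral
wage ($Y_0$, $Y_{10}$, and $Y_{11}$) that drives the choices of major and job
($D_M$ and $D_J$), conditional on individuals' latent skills ($\theta_A$,
$\theta_B$) and observables ($X_{Mi}$, $X_{Ji}$, and $X_{Yi}$). This MLE yields
the vectors of coefficients of interest, $\beta_Y$, $\beta_M$, and $\beta_J$,
and the factor loadings, $\alpha_{Y,A}$, $\alpha_{Y,B}$, $\alpha_{M,A}$,
$\alpha_{M,B}$, $\alpha_{J,A}$, and $\alpha_{J,B}$.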
It is worth noting that I estimate the same model for the female sample and the
male sample separately and compare the factor loadings from each estimation. As
skills are unobserved, a hypothetical interaction term between skill and gender
is not identified. More specifically, each factor (female or male $\theta_A$,
female or male $\theta_B$) is identified as a distribution that the female
(male) sample follows, not as a variable with a specific value that maps to each
individual.26
4 Estimation Results
4.1 Latent Skills
Tables B1 and B2 present the estimates ($\beta_T$, $\alpha_{T,A}$, and
$\alpha_{T,B}$) of the latent skills measurement system (7) for women and men,
respectively. The set of controls $X_T$ includes the annual state-level Averaged
Freshman Graduation Rate (AFGR) for high school graduates, home region27 fixed
effects, and first enrollment semester fixed
and Urzúa (2016) to estimate the model.
25 Note that $\alpha_{T_3,A}$ and $\alpha_{T_6,B}$ (i.e., the loading of
$ACT_{Reading}$ on $\theta_A$ and the loading of $ACT_{Math}$ on $\theta_B$) are
normalized to 1 to facilitate identification. Thus, General Academic Skill,
$\theta_A$, takes the metric of $ACT_{Reading}$, and STEM-specific Skill,
$\theta_B$, takes the metric of $ACT_{Math}$.
26 Put another way, had I combined the female and male samples to estimate a
single $\theta_A$ (or $\theta_B$), I would not be able to identify gender or the
gender differences in ability sorting.
27 The five home regions defined in this paper are the four Census regions
(Northeast, South, West, and Midwest) plus the state of Indiana. I define
Indiana as a separate region because of the large body of in-state students at
Purdue.
effects. The loadings of General Academic Skill on all six scores and the
loadings of STEM-specific Skill on $ACT_{Math}$, $ACT_{Science}$, and $HSGPA$
are all significantly positive for both genders, meaning that both latent skills
are positively associated with the scores, as expected. The magnitudes of the
loadings are quite similar across genders, with some slight differences: e.g., a
one-standard-deviation (one-unit) increase in General Academic Skill is
associated with a 2.975 (0.832) point increase in an average woman's
$ACT_{Math}$ score, and with a 2.580 (0.729) point increase in an average man's
$ACT_{Math}$ score.28 The de-meaned latent skill distributions, $F(\theta_A)$
and $F(\theta_B)$, for women and men are shown in Figure B1.29 Both figures show
that the latent skill distributions are far from normal. In particular, both the
female and the male General Academic Skill distributions have a fat right tail;
for women especially, the pronounced hump on the right tail suggests that the
fraction of high-ability women is relatively large in the female sample.
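The per-standard-deviation and per-unit figures quoted above are linked by the
standard deviations of General Academic Skill reported in footnote 28. The short
check below reproduces that conversion using only numbers stated in the text.

```python
# Per-unit loadings of General Academic Skill on ACT Math (from the text).
loading_per_unit = {"women": 0.832, "men": 0.729}
# Standard deviations of General Academic Skill (from footnote 28).
sd_general_skill = {"women": 3.576, "men": 3.539}

# Effect of a one-standard-deviation increase = per-unit loading x skill SD.
effect_per_sd = {
    group: round(loading_per_unit[group] * sd_general_skill[group], 3)
    for group in loading_per_unit
}
print(effect_per_sd)
```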
4.2 The Roy Model
4.2.1 Choice of Major
Table 2 shows the effect of latent skills and observable characteristics on
selection between a STEM and a non-STEM major. Columns (1) and (2) show the
marginal effects of the Probit at the means for women and men, respectively.
Both General Academic Skill and STEM-specific Skill are significant determinants
of the likelihood of graduating with a STEM degree. Specifically, a
one-standard-deviation increase in General Academic Skill increases the
probability of graduating with a STEM degree by 16.70 percentage points
(henceforth, p.p.) for an average woman and by 23.22 p.p. for an average man.
The marginal effect of General Academic Skill is larger for men than for women,
and the difference is statistically significant at the 5% level. A
one-standard-deviation increase in STEM-specific Skill raises the likelihood of
graduating with a STEM degree by 13.94 p.p. for an average man and by 10.76 p.p.
for an average woman. The gender gap in the marginal effects of STEM-specific
Skill is smaller than that in the marginal effects of General Academic Skill and
is not statistically significant.
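For readers less familiar with Probit marginal effects, the sketch below shows
the textbook calculation of a marginal effect at the means, scaled to a
one-standard-deviation change in a skill. The index value, loading, and standard
deviation are hypothetical inputs chosen for illustration; they are not taken
from Table 2, and the paper's own computation may differ in detail.

```python
import math

def probit_marginal_effect_per_sd(index_at_means, loading, skill_sd):
    """Return phi(index) * loading * sd, the approximate change in P(D_M = 1)
    for a one-standard-deviation increase in the skill."""
    phi = math.exp(-0.5 * index_at_means ** 2) / math.sqrt(2.0 * math.pi)
    return phi * loading * skill_sd

# Hypothetical inputs: index of 0 at the means, loading of 0.1, skill SD of 3.5.
marginal_effect = probit_marginal_effect_per_sd(0.0, 0.1, 3.5)
print(round(100 * marginal_effect, 2), "p.p.")
```

Because the normal density peaks at an index of zero, the same loading implies
a smaller marginal effect for groups whose predicted probability is far from
one half.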
28 General Academic Skill takes the metric of ACT Reading, and STEM-specific
Skill takes the metric of ACT Math. The standard deviation of female (male)
General Academic Skill is 3.576 (3.539); the standard deviation of female (male)
STEM-specific Skill is 2.801 (2.862).
29The mean of female (male) “General Academic Skill” distribution is the constant term shown
in column (3) of Table B1 (Table B2): 15.706 (17.235). The mean of female (male) STEM-specific
Skill distribution is the constant term shown in column (6) of Table B1 (Table B2): 14.54 (13.383).
I incorporate the means of all distributions in simulations/counterfactual analysis.
Rows 3–6 of Table 2 show the marginal effects of preference measures on STEM
selection. The STEM share of local pre-college teachers, a proxy for the local
STEM environment and role modeling, has a significantly positive effect on
selecting STEM majors for both genders. The female share of local STEM
pre-college teachers has a positive effect for both genders as well, but the
effect is significant only for male students. These findings align with the
literature on role models' effects on women in STEM (Neumark and Gardecki, 1998;
Herrmann et al., 2016; Kahn and Ginther, 2017; Canaan and Mouganie, 2019;
Mansour et al., 2020; Cheryan et al., 2011). Having a larger female cohort has a
positive and significant effect for both genders, and the effect for women is
twice the effect for men. This is consistent with the finding of positive female
peer effects in the literature (Bostwick and Weinberg, 2018). A larger cohort
size has the same positive effect on men and women.
In general, women sort less on General Academic Skill than their male
counterparts. This finding is similar to those of Saltiel (2019) and Aucejo et
al. (2016), who find that men are more sensitive than women to differences in
skill levels when deciding to enroll in STEM fields. I consider two potential
explanations: 1) Women are more critical of, or hold a higher standard for,
their own ability, which aligns with a broad literature on gender differences in
self-assessment and confidence in competitive environments (Barber and Odean,
2001; Furnham, 2001; Gneezy et al., 2003; Niederle and Vesterlund, 2010; Hardies
et al., 2013). 2) Preference conditional on ability plays a more dominant role
in women's major choice, as suggested by the literature on gender-specific
preferences over college majors (Turner and Bowen, 1999; Xie et al., 2003;
Dickson, 2010; Kimmel et al., 2012). I further discuss how much ability explains
the gender gap in STEM major choice in Section 5.4 through counterfactual
analysis.
4.2.2 Choice of Job
Students who graduated with STEM degrees choose between STEM and non-STEM jobs.
By assumption, non-STEM graduates are placed in non-STEM jobs only. Table 3
shows the marginal effects of latent skills and observable preferences over the
workplace on the likelihood of working in STEM occupations. Compared to major
choices, latent skills are much weaker determinants of job choices.
Specifically, a one-standard-deviation increase in General Academic Skill
increases the likelihood of working in a STEM occupation by 6.50 (3.78) p.p. for
an average female (male) STEM graduate. Sorting on STEM-specific Skill is even
weaker and not statistically significant: a one-standard-deviation increase in
STEM-specific Skill increases the likelihood of working in STEM by 3.95 (3.04)
p.p. for an average female (male) STEM graduate. More importantly, there are no
significant gender differences in sorting on either skill when choosing between
STEM and non-STEM jobs. The weak estimates imply that skills are not key drivers
of the choice between a STEM and a non-STEM job for either female or male STEM
graduates. This is not surprising: STEM degree holders should be similarly
qualified for a STEM job; thus, their decisions are not sensitive to skills.
Determinants other than skills tell us more about the gender difference in STEM
attrition. Long work hours deter women from staying in STEM. Specifically, one
more work hour per week significantly decreases the likelihood of working in
STEM for women by 1 p.p. This effect is large, considering the long work hours
in STEM fields.30 In contrast, men's decision to work in STEM is not strongly
associated with work hours. Estimates in rows 3 and 5 show that both the female
peer effect in the workplace and the cohort size effect account for some gender
differences, albeit not statistically significantly. Section 5 provides more
insight into why female STEM degree holders are more likely than their male
counterparts to opt out of STEM fields.
4.2.3 Salary
Tables B3 and B4 show the salary returns to skills for men and women,
respectively, who endogenously sort into different categories of majors and
jobs. Columns (1)–(3) in each table present the estimates for three categories:
STEM degree holders working in STEM occupations, STEM degree holders working in
non-STEM occupations, and non-STEM degree holders working in non-STEM
occupations. For simplicity, I denote these three types of men as Male11,
Male10, and Male00; the same notation applies to women. I control for the
state-level annual unemployment rate, job location fixed effects,31 the annual
national number of graduates, the annual national number of graduates in STEM,
the annual national number of female graduates, the annual national number of
female STEM graduates, the annual national fraction of STEM employment in total
employment, and annual national employment in STEM and non-STEM fields.
In general, both General Academic Skill and STEM-specific Skill have positive
salary returns for all three categories of women and men. The return to
STEM-specific Skill is higher than the return to General Academic Skill for all
categories. Male11 have a larger salary return to General Academic Skill than
Male10 and Male00; the same holds for women. All categories of women (Female11,
Female10, and Female00) are rewarded for their STEM-specific Skill. In contrast,
Male00 have no significant salary return to STEM-specific Skill. Further, women
are rewarded slightly more than men for both skills, although only the return to
STEM-specific Skill shows a statistically significant gender difference.
Specifically, a one-standard-deviation increase in an average Female00's
STEM-specific Skill increases her annual salary by $2,513. This may help explain
why women are less likely to enroll in STEM majors: women with high
STEM-specific Skill are also rewarded outside STEM fields. Another point worth
highlighting is that both male and female STEM graduates working in non-STEM
fields, Male10 and Female10, are significantly rewarded for STEM-specific Skill.
This finding suggests that non-STEM occupations held by STEM graduates value
STEM knowledge and training.
30 Statistics from the CPS show that an average non-STEM worker works 38.28
hours per week, while an average STEM worker works 42.89 hours per week.
31 I define 10 job locations according to the Census divisions: “New England”,
“Mid-Atlantic”, “East North Central”, “West North Central”, “South Atlantic”,
“East South Central”, “West South Central”, “Mountain”, “Pacific”, and
“Indiana”. Again, it is important to treat Indiana as a separate division
because of the large body of in-state students, a large fraction of whom hold
in-state jobs after graduation.
4.2.4 Goodness of Fit
Appendix C provides details on the model fit. Table C1 shows that the predicted
scores from the measurement system fit the actual data well: both the first and
second moments are very close to the data. Figures C1 and C2 show the cumulative
distributions of the actual and predicted scores for men and women,
respectively. Overall, both genders' predicted scores fit the actual data very
well. The data for high school GPA and Communication 114 grade points are lumpy
because these two variables are discrete. Table C2 presents the model's goodness
of fit on the first and second moments of major choice ($D_M$), job choice
($D_J$), and salary (Salary11, Salary10, and Salary00) for both genders. The
predictions are the product of 1,000,000 simulations, based on 1,000 bootstrap
draws from the estimates and 1,000 random draws from the factor distributions
within each bootstrap draw. This table shows that the model accurately predicts
the first and second moments of each outcome by gender, which provides
confidence that the counterfactuals predicted by the model are appropriate.
4.3 Ability Distributions of Students in the Three Career Paths
Using the model estimates, I show the predicted probability of majoring in STEM
and the predicted probability of working in STEM fields by decile of each skill
distribution in Figures B2 and B3. Women at every level of ability are less
likely to major in STEM or work in STEM than their male counterparts.
To better present the pattern of sorting on latent skills, I plot the
distributions of the two latent skills by gender and career path in Figure 1.
Figure 1b shows the distributions of General Academic Skill for Male00, Male10,
and Male11. The distribution for Male00 differs from those of the other two
groups: men with non-STEM degrees have significantly lower General Academic
Skill than men with STEM degrees. More importantly, the distributions for both
Male10 and Male11 have a hump on the right tail, indicating that men with high
General Academic Skill are disproportionately more likely to graduate with STEM
degrees. Figure 1d also shows that the distribution for Male00 lies well apart
from the distributions for Male10 and Male11, indicating that men with low
STEM-specific Skill are less likely to major in STEM.
Figure 1a shows that women's sorting behavior is observably different from
men's. The pronounced hump on the right tail of the distribution for Female00
indicates that a mass of women with high General Academic Skill choose non-STEM
majors, which is quite different from the distribution for Male00. Evidently,
high-ability women are more likely to choose non-STEM majors than their male
counterparts. This pattern might be related to the literature on women being
overly critical of their skills and less confident than men (Roberts, 1991;
Johnson and Helgeson, 2002; Ahn et al., 2019). Also, the fact that the skill
distributions of groups 10 and 11 are close to each other for both genders again
suggests that neither men nor women sort strongly on skills when making job
decisions.
5 Counterfactual Analysis
In this section, I first conduct counterfactual analyses to estimate the return
to a STEM degree and the return to a STEM job by skill level and then compare
the gender differences. Second, I link the return to a STEM career to the
non-negligible share of high-ability women who take the non-STEM path and show
how much of the gender wage gap would be explained if this group of women had
chosen a STEM path. Third, I decompose the major choice equation and the job
choice equation to estimate how much of the gender difference in STEM majors can
be accounted for by each determinant. Finally, I further explore the link
between major choice and job choice and the gender gap in STEM attrition.
5.1 The Return to a STEM Degree
To understand the return to a STEM degree, I calculate the average treatment
effect (ATE) of having a STEM degree across skill levels for women and men, as
follows:
$$ATE^{M} = E[Y_{10} - Y_{00} \mid \theta, x]$$
Figure 2 shows the ATE of having a STEM degree for women and men at each decile
of the two skills. Both curves are upward sloping, indicating a positive return
to skills. There are no significant gender differences here. Women's ATEs over
both skill distributions have slightly larger standard deviations, implying
that, among individuals with the same skill, the return to a STEM degree varies
more for women than for men. One number to take away: a female (male) student
with an average skill level could have earned $6,968 ($7,309) more per year if
she (he) had earned a STEM degree.
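The decile-by-decile ATE curve can be sketched as follows. The salary equations
below are hypothetical linear functions of a single skill with made-up
coefficients and a normal skill distribution; they only illustrate how
averaging the conditional expectation within skill deciles produces a curve
like the one in Figure 2.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(0.0, 3.5, size=100_000)   # one latent skill, hypothetical scale

# Hypothetical salary equations: Y_d = const_d + slope_d * theta + mean-zero error.
const_10, slope_10 = 52_000.0, 900.0         # STEM degree, non-STEM job
const_00, slope_00 = 45_000.0, 700.0         # non-STEM degree, non-STEM job

# E[Y10 - Y00 | theta]: mean-zero errors drop out of the conditional expectation.
ate_given_theta = (const_10 - const_00) + (slope_10 - slope_00) * theta

# Assign each draw to a skill decile (0..9) and average within deciles.
edges = np.quantile(theta, np.linspace(0.1, 0.9, 9))
decile = np.searchsorted(edges, theta)
ate_by_decile = np.array([ate_given_theta[decile == d].mean() for d in range(10)])
print(np.round(ate_by_decile))
```

The curve slopes upward here only because the assumed skill return is higher
with a STEM degree than without one.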
It is also interesting to examine the return to a STEM degree for individuals on
the margin of the choice. I calculate the marginal treatment effect (MTE) of
having a STEM degree for both genders as follows.
$$MTE^{M}_{i} = E[Y_{10} - Y_{00} \mid Pr(X_{M,i}\beta_M + \alpha_{M,A}\theta^{A}_{i} + \alpha_{M,B}\theta^{B}_{i} = e^{M}_{i}) = 1]$$
Figure B4 presents the MTEs of having a STEM degree for each gender across the
deciles of the skill distributions. In general, the MTEs are upward sloping,
except that men's MTE across General Academic Skill is insignificantly downward
sloping. Comparing with the ATEs (Figure 2), I find that the ATEs and MTEs of
having a STEM degree are very similar, except that the MTEs have significantly
larger standard deviations. This is mainly for two reasons: 1) we are comparing
fewer individuals on the margin within the same skill deciles; 2) the observable
characteristics of an individual on the margin vary much more than those of an
average individual.
5.2 The Return to a STEM Job
I carry out the same exercises to assess the return to a STEM job for STEM
graduates, as follows.
$$ATE^{J} = E[Y_{11} - Y_{10} \mid \theta, x, D_M = 1]$$
$$MTE^{J}_{i} = E[Y_{11} - Y_{10} \mid Pr(X_{J,i}\beta_J + \alpha_{J,A}\theta^{A}_{i} + \alpha_{J,B}\theta^{B}_{i} = e^{J}_{i}) = 1, D_M = 1]$$
Figure 3 shows that women's ATE of having a STEM job is larger than men's at
each level of both skills. One may notice that the ATE is downward sloping
across deciles of STEM-specific Skill. This is because the salary return to
STEM-specific Skill for group 10 (STEM degree holders with a non-STEM job) is
higher than that for group 11 (STEM degree holders with a STEM job). That is,
the return to having a STEM job is positive across the entire distribution of
STEM-specific Skill but diminishes as the skill increases. In other words, STEM
degree holders with low STEM-specific Skill gain more from a STEM job than their
counterparts with high STEM-specific Skill. For a female (male) STEM graduate
with a mean skill level, working in a STEM job pays $6,629 ($2,614) more than
working in a non-STEM job. Thus, women are rewarded significantly more for a
STEM job than men.
Figure B5 displays the marginal treatment effects of working in STEM for
each gender over the deciles of each latent skill. The trends look similar to the
ATEs above. However, a woman’s MTE of working in STEM at each skill decile
is slightly larger than her ATE of working in a STEM job. Yet this pattern is not
true for men. Again, we see the gender differences in the MTEs: men’s MTE of
working in STEM is significantly lower than women’s, suggesting that the return
to a STEM job for women on the margin is significantly larger than that for their
male counterparts.
5.3 High-Ability Women and the Gender Wage Gap
If a non-STEM degree holder had chosen a STEM path and persisted, the average
treatment effect would be even larger. Figure B6 shows the ATEs of a STEM path
by gender and skill level.32 Overall, women are rewarded remarkably more for a
STEM path than men at every level of both skills. For individuals with the mean
skill level, the return to a STEM path is $13,598 for women and $9,923 for men.
It is worth noting that there is no gender gap in the return to a STEM degree.
Comparing the numbers in the three panels, the gender gap in the return to a
STEM path is driven entirely by the gender gap in the return to a STEM job.
Given that women have a higher return to a STEM path than men and that the
return is strictly increasing in skill, the total potential returns to
high-ability women who choose a non-STEM path are not trivial. Recall the hump
on the right tail of the General Academic Skill distribution for the Female00
group in Figure 1a. Compared with the same distribution for the Male00 group in
Figure 1b, the pronounced hump in Female00's distribution implies that
high-ability women are remarkably less likely to choose a STEM path than
high-ability men. When we overlay these two distributions in Figure 4, we can
see a non-negligible mass on the right tail, indicating that high-ability women
make different choices from high-ability men. The ATE curve in Figure B6
indicates that each woman in this group would have increased her annual income
by $13,000–$20,000 had she chosen a STEM path like her high-ability male
counterparts.
It is difficult to identify the mechanism behind the heterogeneous gender gap in
major choice across skill levels. Still, I can quantify the total return to a
STEM path for high-ability women who chose not to enter STEM. In particular, I
integrate the ATE of a STEM path over the mass of the shaded area shown in
Figure 4. The mass of high-ability non-STEM women in the shaded area accounts
for 6.67% of women. The total returns from their choosing a STEM path are
equivalent to a $958 annual salary increase per woman. This increase explains
13.66% of the gender wage gap in my sample.33 This 13.66% should not be taken
lightly: it is contributed by a small group of women, 6.67% of the Purdue female
sample. Moreover, the result should not be interpreted as every woman gaining
only $958 per year by majoring in STEM, which would clearly be minuscule.
Instead, 13.66% of the gender wage gap would be closed if a small group of women
who are most likely to be capable of studying STEM had chosen STEM.
32 $$ATE^{MJ} = E[Y_{11} - Y_{00} \mid \theta, x]$$
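A quick back-of-the-envelope check relates the per-woman figure to the gains of
the high-ability group. It assumes one particular reading of the text, namely
that the $958 is the group's total gain averaged over all women in the sample;
under that reading, the implied average gain within the group can be compared
with the $13,000–$20,000 range read off Figure B6.

```python
# Figures quoted in the text.
share_high_ability_non_stem = 0.0667   # 6.67% of women
gain_per_woman = 958.0                 # dollars per year, averaged over all women

# Implied average gain within the high-ability non-STEM group (assumed reading).
implied_group_gain = gain_per_woman / share_high_ability_non_stem
print(round(implied_group_gain))
```

The implied figure is roughly $14,400 per year, which falls inside the quoted
range.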
5.4 Decomposition of Major Choice
In this section, I return to the gender difference in major choice by
decomposing the choice function. The estimates in Table 2 imply that women and
men sort on skills differently. What if women had sorted the same way as men?
What if women and men had the same skill distributions? Table 4 presents
counterfactual likelihoods of majoring in STEM, following the approach in Urzua
(2008). The first row is the model-predicted (i.e., factual) likelihood of
graduating with a STEM degree for women and men, respectively. For clarity, I
write out the likelihood function of graduating in STEM for each gender as
follows:
$$D^{f}_{M}(\beta_{M,f}, X^{f}_{M}, \alpha_{M,A,f}, \alpha_{M,B,f}, \theta_{A,f}, \theta_{B,f})$$
$$D^{m}_{M}(\beta_{M,m}, X^{m}_{M}, \alpha_{M,A,m}, \alpha_{M,B,m}, \theta_{A,m}, \theta_{B,m})$$
where the superscripts $f$ and $m$ denote gender. The second row shows that
37.22% of women would have graduated in STEM if women had the same factor
loadings as men.34 The third row shows that 38.78% of women would have graduated
in STEM if women had the same skill distributions as men.35 Furthermore,
assuming that women had both the same ability and the same ability sorting
behavior as men, the number increases to 40.07%.36 That is, eliminating the
gender differences in ability and ability sorting would have shrunk the gender
difference in STEM degrees by 11.5%; however, the majority of the gender gap in
STEM degrees remains unexplained.
33 The gender wage gap, $8,198, is calculated by subtracting the Purdue female
graduates' average annual salary from the Purdue male graduates' average annual
salary.
34 I.e., $D^{f}_{M}(\beta_{M,f}, X^{f}_{M}, \alpha_{M,A,m}, \alpha_{M,B,m}, \theta_{A,f}, \theta_{B,f})$
35 I.e., $D^{f}_{M}(\beta_{M,f}, X^{f}_{M}, \alpha_{M,A,f}, \alpha_{M,B,f}, \theta_{A,m}, \theta_{B,m})$
36 I.e., $D^{f}_{M}(\beta_{M,f}, X^{f}_{M}, \alpha_{M,A,m}, \alpha_{M,B,m}, \theta_{A,m}, \theta_{B,m})$
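The logic of these counterfactuals (swap one ingredient of the women's choice
function for the men's and recompute the predicted STEM share) can be sketched
as below. All parameter values are hypothetical, the skills are drawn from
normal distributions purely for convenience, and the function `stem_share` is an
illustrative helper rather than the paper's code.

```python
import numpy as np

def stem_share(xb, loadings, skill_mean, skill_sd, seed=0, n=200_000):
    """Monte Carlo P(D_M = 1) for index xb + aA*thetaA + aB*thetaB + e, e ~ N(0, 1)."""
    rng = np.random.default_rng(seed)   # fixed seed: common random numbers across scenarios
    z = rng.standard_normal((n, 3))
    theta_a = skill_mean[0] + skill_sd[0] * z[:, 0]
    theta_b = skill_mean[1] + skill_sd[1] * z[:, 1]
    index = xb + loadings[0] * theta_a + loadings[1] * theta_b + z[:, 2]
    return float(np.mean(index > 0))

# Hypothetical parameters for women (f) and men (m); not the paper's estimates.
f = dict(xb=-1.2, loadings=(0.05, 0.04), skill_mean=(0.0, 0.0), skill_sd=(3.6, 2.8))
m = dict(xb=-0.6, loadings=(0.07, 0.05), skill_mean=(0.5, -0.3), skill_sd=(3.5, 2.9))
m_skills = {"skill_mean": m["skill_mean"], "skill_sd": m["skill_sd"]}

factual_f = stem_share(**f)
cf_loadings = stem_share(**{**f, "loadings": m["loadings"]})            # men's loadings
cf_skills = stem_share(**{**f, **m_skills})                             # men's skill distributions
cf_both = stem_share(**{**f, "loadings": m["loadings"], **m_skills})    # both swapped
print(factual_f, cf_loadings, cf_skills, cf_both)
```

Holding the random draws fixed across scenarios means that differences between
the shares reflect only the swapped parameters, not simulation noise.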
Regarding the gender difference in the observable characteristics, I find that
the share of women in STEM would have increased significantly, to 42.53%, if
women's coefficients on the observables were replaced with men's.37 Further, if
women had the same observable characteristics as men, the proportion of women
graduating in STEM would be 57.63%.38 When both the coefficients and the
observables are replaced with men's, the counterfactual likelihood increases
even more.39
Generally speaking, the counterfactuals in rows 5–7 of Table 4 suggest that
gender differences in choosing majors can be primarily attributed to observable
characteristics, including macroeconomic conditions, labor demand for STEM
workers, and cohort effects. Besides these, however, there is still an unexplained
gender gap in major choice, which could be due to unobservable personal
preferences. As shown in the literature, unobservable gender-specific personal
preferences dominate when women choose their college majors. For example,
some studies find that young female students with higher expected fertility tend
to choose majors that are progressively less subject to atrophy and obsolescence
(e.g., history and English), considering their expected time out of the labor force
(Polachek, 1981; Blakemore and Low, 1984). The literature also finds that men care
more about pecuniary outcomes and leadership in the workplace, while women
are more likely to value opportunities to help others, to contribute to society, and
to interact with people (Zafar, 2013; Daymont and Andrisani, 1984).
5.5 Gender Differences in STEM Attrition
In this section, I further explore the gender differences in STEM attrition from
degrees to jobs. The weak marginal effects in the job choice (see Table 3) imply that
ability sorting cannot explain the gender difference in STEM job choice, which is
not surprising given that STEM degree holders should be equally qualified for a
STEM job. A counterfactual analysis shown in Table 5, following the same
exercise as in the section above, reaffirms that gender differences in ability or
ability sorting explain very little of the gender gap in STEM attrition. In light of this
implication, I investigate potential mechanisms of the gender gap in STEM attrition
in two ways: 1) the interaction between gender segregation in each STEM
major and the STEM composition of jobs valuing the knowledge or training of
each STEM major; 2) gender differences in preferences related to job search or
the workplace.
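37 I.e., $D_M^f(\beta^{M,m}, X_M^f, \alpha^{M,A,f}, \alpha^{M,B,f}, \theta^{A,f}, \theta^{B,f})$
38 I.e., $D_M^f(\beta^{M,f}, X_M^m, \alpha^{M,A,f}, \alpha^{M,B,f}, \theta^{A,f}, \theta^{B,f})$
39 I.e., $D_M^f(\beta^{M,m}, X_M^m, \alpha^{M,A,f}, \alpha^{M,B,f}, \theta^{A,f}, \theta^{B,f})$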
5.5.1 STEM Majors and Well-Matched Jobs
When talking about STEM attrition, we essentially care about whether STEM
degree holders continue to apply their knowledge or training in their jobs. Light
and Rama (2019) find that majors and occupations apply and require STEM
knowledge at different intensities. Therefore, whether and to what extent
non-STEM jobs, as defined by the BLS or any other source, hire and value STEM
majors, especially Less Math-intensive STEM majors, could change the meaning
of “STEM attrition”. Women are underrepresented in STEM but proportionally
represented in Less Math-intensive STEM, which I discuss in depth in Appendix
D. Not surprisingly, graduates of Less Math-intensive STEM majors, including all
life sciences, psychology, and social sciences, are more likely to be placed in
non-STEM jobs (see Figure D1).40
To quantify how graduates are valued by (well-matched to) jobs, I create a
measure of match quality for each major-occupation pair based on data from the
National Survey of College Graduates (henceforth NSCG) (NSF).41 I exploit
the answers to the survey question “To what extent was your work on your
principal job related to your highest degree?” to generate this match quality
measure.42
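The construction of the match quality measure can be sketched in a few lines. This is an illustrative sketch rather than the paper's code: it uses the scoring and the 0.5 threshold stated in footnote 42, assumes an unweighted mean (whether survey weights are applied is not stated), and runs on made-up responses. The function names and the "Retail Salespersons" occupation are hypothetical.

```python
from collections import defaultdict

# Scores assigned to each survey answer, as described in footnote 42.
ANSWER_SCORES = {"highly related": 1.0, "somewhat related": 0.5, "not related": 0.0}


def match_quality(responses):
    """Mean answer score for each (major, occupation) pair.

    `responses` is an iterable of (major, occupation, answer) tuples.
    Assumption: an unweighted mean; survey weights are ignored here.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for major, occupation, answer in responses:
        pair = (major, occupation)
        totals[pair] += ANSWER_SCORES[answer]
        counts[pair] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


def well_matched_pairs(quality, threshold=0.5):
    """Pairs whose match quality is at least the threshold."""
    return {pair for pair, q in quality.items() if q >= threshold}


# Hypothetical responses, for illustration only.
sample = [
    ("Animal Sciences", "First-Line Supervisors of Farming", "highly related"),
    ("Animal Sciences", "First-Line Supervisors of Farming", "somewhat related"),
    ("Biochemistry", "Retail Salespersons", "not related"),
    ("Biochemistry", "Retail Salespersons", "somewhat related"),
]
quality = match_quality(sample)
matched = well_matched_pairs(quality)
print(quality)
print(matched)
```

In this toy sample the first pair averages 0.75 and counts as well-matched, while the second averages 0.25 and does not.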
Applying the major-occupation match quality to my data, I find that the
share of STEM jobs is 86.51% (79.69%) for male (female) STEM degree holders
whose major-job pairs are identified as “well-matched”. These numbers are based
on the main definitions of STEM degrees (the ICE definition) and STEM
occupations (the BLS definition) used in the paper. This gender gap in STEM jobs
(86.51% versus 79.69%) is similar to the gender gap in STEM jobs among all
STEM degree holders (81.17% versus 73.11%), ignoring major-job match quality.
When changing the definition of STEM to Math-intensive STEM, I find that the
share of Math-intensive STEM jobs is 89.96% (86.79%) among men (women) with
Math-intensive STEM degrees, so the gender gap in STEM jobs shrinks
substantially. This implies that, conditional on high match quality between major
and job, part of the gender gap in STEM attrition could be explained by the fact
that women are proportionally represented in Less Math-intensive STEM majors.
Moreover, Less Math-intensive STEM majors are more likely to be well matched to
a non-STEM job based on the main definition of STEM.
40 Following the definitions in Ceci et al. (2014) and Kahn and Ginther (2017).
41 The NSCG, sponsored by the National Center for Science and Engineering Statistics
(NCSES) within the National Science Foundation, provides data on the characteristics of the
nation’s college graduates, with particular focus on those in the science and engineering
workforce. The advantage of the NSCG is that it provides data useful in understanding the
relationship between college education and career opportunities, as well as the relationship
between degree field and occupation.
42 I quantify each individual answer by assigning 1 to “highly related”, 0.5 to “somewhat
related”, and 0 to “not related” and define “match quality” as the mean score within each
major-occupation pair. I define a major-occupation pair as “well-matched” if the match quality
is ≥ 0.5, which is equivalent to “somewhat related” and above.
The Less Math-intensive STEM majors that could be well-matched to a non-STEM
job include: “Management Sciences and Quantitative Methods”, “Animal
Sciences”, “Food Sciences”, “Wildlife and Wildlands Science and Management”,
“Biochemistry”, “Plant Sciences”, and “Environmental Studies”. Graduates of
these majors are likely to be applying their knowledge in non-STEM jobs and yet
are considered to be “leaving STEM”. Two examples fix ideas: 1) graduates
from “Management Sciences and Quantitative Methods” work as “General and
Operations Managers”; 2) graduates from “Animal Sciences” work as “First-Line
Supervisors of Farming”. It is not this paper’s intention to discuss what the best
STEM definition is. Even using NSF’s or O*Net’s definition of STEM, we would
still see these two examples. Yet the implication here is that how we define the
boundaries of STEM and non-STEM is critical to studying the gender gap in STEM
and, more generally, to assessing STEM graduates’ job placement.
5.5.2 Preferences over Job Conditions
Even conditional on the match quality between major and job, there is still a
gender gap in STEM attrition under either definition of STEM. Row 5 in Table 5
indicates that women would have been more likely to work in STEM if
they had the same returns to the observable characteristics as
men. Specifically, 75.4% of female STEM graduates would have stayed
in STEM fields instead of 71.59%. This 3.8 p.p. increase explains 44.65% of the
gender gap in choosing a STEM job.
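The arithmetic behind this share can be written out as a worked check. It assumes that the share is computed as the counterfactual increase divided by the model-predicted gender gap. The label $D_J^{f,\mathrm{cf}}$ for the counterfactual likelihood is notation introduced here for illustration, and the male likelihood $D_J^{m}$ is not reported in this passage, so the implied gap below is back-calculated rather than quoted.

```latex
\[
\text{share explained}
  = \frac{D_J^{f,\mathrm{cf}} - D_J^{f}}{D_J^{m} - D_J^{f}}
  = \frac{75.4 - 71.59}{D_J^{m} - 71.59}
  = 44.65\%
\quad\Longrightarrow\quad
D_J^{m} - D_J^{f} \approx \frac{3.81}{0.4465} \approx 8.5 \text{ p.p.}
\]
```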
Next, I fully decompose the observable characteristics in the job function to
further investigate other determinants of job choice. Results are shown in Table 6.
I do the decomposition in two ways: using the main definition of STEM
majors and jobs and using the Math-intensive STEM majors and jobs. Columns
(1) and (3) show the counterfactual likelihood of holding a STEM job when
eliminating each set of covariate(s) in women’s job choice function; Columns (2)
and (4) show the counterfactuals of replacing women’s covariate(s) with men’s.
The first row is the factual fraction of STEM workers among female STEM graduates
(i.e., $D_J^f(\beta^{J,f}, X_J^f, \alpha^{J,A,f}, \alpha^{J,B,f}, \theta^{A,f}, \theta^{B,f})$).
Rows 2–6 show the counterfactuals when making changes to a variable’s coefficient
in $\beta^{J,f}$. Among all factors, work
hours/week and the “home region” fixed effects (the Census region where students
completed high school) are the two major determinants that explain the gender
difference in job decisions. The similarities between results under different
STEM definitions suggest that gender differences in preferences are still key
to the gender gap in STEM attrition, even after accounting for different gender
segregation in different fields.
Specifically, columns (1)–(4) in row 2 consistently show that eliminating work
hours as a determinant of job decisions or replacing women’s preference over work
hours with men’s would substantially increase the probability of working in STEM
for women. This is intuitive: STEM occupations are known for long work hours,
and studies have shown that women are less likely to work in STEM for that reason
(e.g., Schlenker, 2015). Row 6 shows that either excluding the home region fixed
effects or replacing women’s home region fixed effects with men’s would fully close
the gender gap in job choice (80.78% or 80.59%). This suggests that home location
is a key factor in the job decision for women but not for men. I run a simple
regression where an indicator of working in STEM is the dependent variable and
an indicator of working in the home state is the key independent variable,
controlling for home state fixed effects and degree calendar year fixed effects. Table
F1 shows that there is a significantly negative correlation between having a job
in the home state and working in STEM fields for female STEM graduates, but there
is no such correlation for male STEM graduates.
5.5.3 Home State
To take a closer look at the jobs held by the group of STEM graduates who work
in their home state, I show the shares and average earnings of STEM graduates
in this two-by-two choice of jobs and locations by gender in Table F2. About 81%
of male STEM graduates work in STEM regardless of whether their job is located in
their home state. Among female STEM graduates who hold jobs in their
home state, 32.3% opt out of STEM, while among those who hold jobs outside
of their home state, only 24.4% opt out of STEM. Moreover, there is no gender
wage gap among STEM workers, but there is an obvious gender wage gap among
non-STEM workers, especially those in their home state.
The average salary for men and women who hold non-STEM jobs in the home
state is $50,445 versus $41,111. As seen in Table C2, there is no
gender wage gap in STEM jobs, but there is one in non-STEM jobs, even among
STEM graduates.
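The two-by-two comparison of job type and job location described above amounts to a grouped share. The sketch below shows one way to tabulate it. The record layout, the function name, and the micro-data are hypothetical and are not the Table F2 sample.

```python
from collections import defaultdict


def opt_out_shares(records):
    """Share of STEM graduates not working in STEM, by (gender, home-state job).

    `records` is an iterable of (gender, works_in_home_state, has_stem_job).
    """
    n = defaultdict(int)
    out = defaultdict(int)
    for gender, home, stem_job in records:
        cell = (gender, home)
        n[cell] += 1
        if not stem_job:
            out[cell] += 1
    return {cell: out[cell] / n[cell] for cell in n}


# Hypothetical micro-data, for illustration only.
records = (
    [("F", True, True)] * 3 + [("F", True, False)] * 1
    + [("F", False, True)] * 4 + [("F", False, False)] * 1
    + [("M", True, True)] * 4 + [("M", True, False)] * 1
    + [("M", False, True)] * 4 + [("M", False, False)] * 1
)
shares = opt_out_shares(records)
for cell in sorted(shares):
    print(cell, round(shares[cell], 3))
```

In the toy data the opt-out share differs by location for women but not for men, which is the qualitative pattern the text reports.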
Without eliminating the possibility that STEM and non-STEM occupations
discriminate against women differently (i.e., STEM jobs do not discriminate against
women, but non-STEM jobs do), I find a more convincing explanation: there is more
heterogeneity in non-STEM occupations, so there is larger gender segregation
within non-STEM. There are 184 STEM occupations versus 656 non-STEM
occupations (out of 840 occupations) based on the BLS definition. Based on the limited
observations of home-non-STEM workers in Table F2, I find that the most common
non-STEM occupations held by those STEM graduates are “Management
Occupations” and “Business and Financial Operations Occupations”.43
41% (30%) of men (women) are in “Management Occupations” and 23% (25%)
of men (women) are in “Business and Financial Operations Occupations”. These
two occupations are also likely to be the best-paying among all non-STEM
jobs. Thus, the higher fraction of men in these two non-STEM occupations may
explain the home-non-STEM gender wage gap discussed above.
Returning to why women who work in their home state are more likely to
opt out of STEM, it goes without saying that occupation and job location are
simultaneous choices, and it is difficult, if not impossible, to identify the causality.
The literature has documented that women are spatially less flexible than men
in the job search. Women tend to work closer to home and sacrifice job match
quality in a joint search with the spouse (Bielby and Bielby, 1992; Fanning Madden,
1981; Robst, 2007; van Ham et al., 2001). Thus, one potential mechanism
here is that women’s stronger geographic preference creates a trade-off between
a mismatched job (non-STEM job) at the desired location (or home state) and a
matched job (STEM job) outside of the desired location.
My data contain no information on job satisfaction, job
search, or family background that could provide direct evidence. I thus seek evidence
from the NSCG, where I explore the series of questions on the “Reason for working
outside field of highest degree” in a sample that covers the same BA degree
year window as my main sample (2007–2014). Table F3 shows the shares of the
“Most important reason for working outside field of highest degree” by gender.44
Among all college graduates who work outside their field of study, 8.71% of women
consider “job location” the main reason, while only 5.59% of men do.
8.33% of women consider “family-related reason” the main reason, while 5.2%
of men do. Among STEM graduates who work in non-STEM jobs, the
share of women who consider these two reasons the main reason for working
outside of the field of study is somewhat higher than in the full sample.45 This
pattern also holds when we look at the “Second most important reason”. Further,
the gender difference in choosing “pay, promotion opportunities” as the
most important reason is consistent with the finding in the literature that men care
more about pecuniary outcomes and leadership in the workplace (Zafar, 2013;
Daymont and Andrisani, 1984). Moreover, the fact that women tend to be more
concerned about “working condition (hours, equip., working environment)” echoes
my findings in the previous sections: 1) longer working hours in STEM jobs deter
women from committing to them; 2) eliminating working hours as a determinant of
job choice would substantially boost women’s likelihood of working in STEM.
43 2-digit level SOC code “11-0000”, “Management Occupations” and 2-digit level SOC code
“13-0000”, “Business and Financial Operations Occupations”.
44 Those who answered “Not related” in question A21: “To what extent was your work on your
principal job related to your highest degree?” mark “Yes” or “No” for the 7 provided reasons (see
Table F3) for the question “Did these factors influence your decision to work in an area outside
the field of your highest degree?”. Then the respondent picks two reasons out of the 7 as the
“Most important reason” and the “Second most important reason”.
45 Here, I classify STEM majors and STEM jobs following the NSF classification, which is quite
straightforward based on the NSF’s codebook.
Overall, the descriptive statistics from the NSCG provide supportive evidence on
why women are more likely to work in their home state. Meanwhile, given home
location as a constraint, it could be other characteristics of the home state, rather
than job supply or search friction, that deter women from working in STEM.
Pope and Sydnor (2010) find that stereotypical gender norms on standardized
tests vary systematically at the state level, and these strong social forces create
gender differences in performance on test scores. Areas with lower gender
inequality have smaller gender disparities in stereotypically male-dominated tests
of math and science and smaller disparities in stereotypically female-dominated
tests of reading, and vice versa. They suggest that potential mechanisms include
opportunities in the workforce and the psychological effect of stereotypes. If
state-level stereotyping affects gender differences in test performance, it could
also affect gender disparities in job choices.
Inspired by their paper, I investigate the correlation between the home state’s
stereotyping level and the likelihood of STEM graduates having a STEM job by
adopting their key variable, the “stereotype adherence index”, as a measure of
state-level stereotyping towards STEM jobs and regressing the indicator of having a
STEM job on this index.46 Table F4 shows the regression estimates in the upper
panel and the summary statistics of the index in the lower panel. Female
STEM graduates from a state with a high stereotype adherence index are less
likely to work in STEM. Specifically, a one-standard-deviation increase in the home
state’s “stereotype adherence index” is associated with a 6.72 p.p. decrease in
female STEM graduates’ likelihood of working in STEM. In contrast, there is no
significant correlation between male STEM graduates’ job choice and the index.
Although these results only show stylized facts, they put forward a potential
mechanism for why female STEM graduates who go back to their home state are
less likely to work in STEM.
46 “Stereotype adherence index” measures the geographic variation in gender disparities on
standardized test scores. It is generated by averaging a state’s male–female ratio in math and
science with the state’s female–male ratio in reading using the top-5-percent cutoff.
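The index construction in footnote 46 can be sketched as follows. This is an illustration under assumptions, not Pope and Sydnor's code: it assumes the math and science ratios are first averaged into one male–female ratio, which is then averaged with the female–male reading ratio, and it standardizes the index across states so that a one-unit change corresponds to one standard deviation. The function names and all ratios are hypothetical.

```python
from statistics import mean, stdev


def stereotype_adherence_index(math_mf, science_mf, reading_fm):
    """Average the male-female ratio in math and science with the
    female-male ratio in reading.

    Assumption: the math and science ratios are first averaged into one
    male-female ratio, which is then averaged with the reading ratio.
    """
    return mean([mean([math_mf, science_mf]), reading_fm])


def standardize(values):
    """Convert raw index values to z-scores across states."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical top-5-percent ratios for three made-up states:
# (male-female math, male-female science, female-male reading).
states = {
    "State A": (2.0, 2.0, 1.5),
    "State B": (1.5, 1.5, 1.5),
    "State C": (3.0, 2.0, 1.5),
}
raw = {name: stereotype_adherence_index(*ratios) for name, ratios in states.items()}
z = dict(zip(raw, standardize(list(raw.values()))))
print(raw)
print(z)
```

With a standardized index as the regressor, the coefficient in a linear probability model reads directly as the change in the likelihood of working in STEM per standard deviation of the index.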
6 Conclusion and Discussion
I study gender differences in college STEM majors and STEM jobs by assessing
gender differences in ability, ability sorting, and preferences over the field
of study and job characteristics. Revisiting the long-lasting debate on whether
women avoid STEM due to a disadvantage in STEM-related ability, I find that,
on average, there is no significant gender difference in ability, men’s ability
sorting is statistically stronger than women’s in major choices, and that there is no
gender difference in ability sorting into jobs. Overall, gender differences in
ability and ability sorting together explain only a small portion of the gender gap in
STEM. This finding aligns with some existing studies that argue that preferences,
rather than ability, are the main driver of the gender gap in STEM fields.
The nuance in my finding is that the gender gap in major choices is more
pronounced among high-ability women. About 13.66% of the gender wage gap
among college graduates can be explained by the return to STEM careers among
the women in the top 6.67% of the ability distribution. Without assessing the
optimality of decisions, a positive correlation between STEM major choice and
the pre-college STEM environment suggests a possible direction to shrink the
gender gap in STEM majors, as well as the gender wage gap.
I investigate why women with STEM degrees are more likely to opt out of
STEM jobs than men, contributing to the limited literature in this space. I find
that part of the gender gap in STEM attrition can be attributed to the gender
segregation in different STEM majors and how we define STEM jobs. Women
are more represented in less Math-intensive STEM majors. Moreover, graduates
from less Math-intensive STEM majors are more likely to be well-matched to and
take jobs in non-STEM occupations. These findings suggest that how we define
the boundaries of STEM and non-STEM is essential to studying the gender gap in
STEM and, more generally, to assessing STEM graduates’ job placement.
I also find suggestive evidence that job location, family-related reasons,
and stereotyping at the home location (where one completed high school)
are potential reasons why women with STEM degrees opt out of STEM jobs more
often than men. This finding links to the literature on geographic constraints on
women’s job searches (Sorenson and Dahl, 2016; Benson, 2014). Future research
should explore the relationship between marriage, family, and female STEM
graduates’ job choices and, in particular, whether female STEM graduates return to
their home states because of social networks, family, or child care. My finding of a
negative correlation between home location stereotyping and women’s STEM jobs
also makes future research on the mobility of female STEM talent important.
Specifically, how can young women be attracted to locations with low STEM
employment to study STEM, and how can a virtuous circle of training and retaining
female STEM talent be built?
References
Ahn, T., P. Arcidiacono, A. Hopson, and J. R. Thomas (2019). Equilibrium grade
inflation with implications for female interest in stem majors. Technical report,
National Bureau of Economic Research.
Altonji, J. G., P. Arcidiacono, and A. Maurel (2016). The analysis of field choice
in college and graduate school: Determinants and wage effects. In Handbook of
the Economics of Education, Volume 5, pp. 305–396. Elsevier.
Altonji, J. G., E. Blom, and C. Meghir (2012). Heterogeneity in human capital
investments: High school curriculum, college major, and careers. Annu. Rev.
Econ. 4(1), 185–223.
Arcidiacono, P. (2004). Ability sorting and the returns to college major. Journal of
Econometrics 121(1), 343–375.
Arcidiacono, P., V. J. Hotz, and S. Kang (2012). Modeling college major choices
using elicited measures of expectations and counterfactuals. Journal of
Econometrics 166(1), 3–16.
Astorne-Figari, C. and J. D. Speer (2019). Are changes of major major changes?
the roles of grades, gender, and preferences in college major switching.
Economics of Education Review 70, 75–93.
Aucejo, E. M., J. James, et al. (2016). The path to college education: Are verbal
skills more important than math skills? Technical report.
Barber, B. M. and T. Odean (2001). Boys will be boys: Gender, overconfidence,
and common stock investment. The Quarterly Journal of Economics 116(1),
261–292.
Beede, D. N., T. A. Julian, D. Langdon, G. McKittrick, B. Khan, and M. E. Doms
(2011). Women in stem: A gender gap to innovation. ESA Issue Brief# 04-11.
US Department of Commerce.
Bielby, W. T. and D. D. Bielby (1992). I will follow him: Family ties, gender-role
beliefs, and reluctance to relocate for a better job. American Journal of
Sociology 97(5), 1241–1267.
Blakemore, A. E. and S. A. Low (1984). Sex differences in occupational selection:
The case of college majors. The Review of Economics and Statistics, 157–163.
Blau, F. D. and L. M. Kahn (2017). The gender wage gap: Extent, trends, and
explanations. Journal of Economic Literature 55(3), 789–865.
BLS (2012). Attachment c: Detailed 2010 soc occupations included in stem.
https://www.bls.gov/soc/attachment_c_stem.pdf. [Online; accessed
2020-07-15].
Borjas, G. J. (1987). Self-selection and the earnings of immigrants. Technical
report, National Bureau of Economic Research.
Bostwick, V. K. and B. A. Weinberg (2018). Nevertheless she persisted? gender
peer effects in doctoral stem programs. NBER Working Paper.
Brown, C. and M. Corcoran (1997). Sex-based differences in school content and
the male-female wage gap. Journal of Labor Economics 15(3), 431–465.
Canaan, S. and P. Mouganie (2019). Female science advisors and the stem gender
gap. Available at SSRN 3396119.
Card, D. and A. A. Payne (2017). High school choices and the gender gap in
stem. NBER Working Paper.
Carneiro, P., K. T. Hansen, and J. J. Heckman (2003). Estimating distributions
of treatment effects with an application to the returns to schooling and
measurement of the effects of uncertainty on college choice. International Economic
Review 44(2), 361–422.
Ceci, S. J., D. K. Ginther, S. Kahn, and W. M. Williams (2014). Women in academic
science: A changing landscape. Psychological Science in the Public Interest 15(3),
75–141.
Cha, Y. (2013). Overwork and the persistence of gender segregation in occupa-
tions. Gender & society 27(2), 158–184.
Cheryan, S., J. O. Siy, M. Vichayapai, B. J. Drury, and S. Kim (2011). Do female and
male role models who embody stem stereotypes hinder women’s anticipated
success in stem? Social Psychological and Personality Science 2(6), 656–664.
Clayton, J. A., F. S. Collins, et al. (2014). Nih to balance sex in cell and animal
studies. Nature 509(7500), 282–3.
Corbett, C. and C. Hill (2015). Solving the Equation: The Variables for Women’s
Success in Engineering and Computing. ERIC.
Daymont, T. N. and P. J. Andrisani (1984). Job preferences, college major, and the
gender gap in earnings. Journal of Human Resources, 408–428.
Dickson, L. (2010). Race and gender differences in college major choice. The
Annals of the American Academy of Political and Social Science 627(1), 108–124.
Eccles, J. S. (2007). Where Are All the Women? Gender Differences in Participation in
Physical Science and Engineering. American Psychological Association.
Eide, E. (1994). College major choice and changes in the gender wage gap.
Contemporary Economic Policy 12(2), 55–64.
Ethington, C. A. and L. M. Wolfle (1988). Women’s selection of quantitative
undergraduate fields of study: Direct and indirect influences. American
Educational Research Journal 25(2), 157–175.
Fanning Madden, J. (1981). Why women work closer to home. Urban
Studies 18(2), 181–194.
Fisher, A. and J. Margolis (2002). Unlocking the clubhouse: the carnegie mellon
experience. ACM SIGCSE Bulletin 34(2), 79–83.
Furnham, A. (2001). Self-estimates of intelligence: Culture and gender difference
in self and other estimates of both general (g) and multiple intelligences.
Personality and Individual Differences 31(8), 1381–1405.
Gneezy, U., M. Niederle, and A. Rustichini (2003). Performance in competitive
environments: Gender differences. The Quarterly Journal of Economics 118(3),
1049–1074.
Hansen, K. T., J. J. Heckman, and K. J. Mullen (2004). The effect of schooling and
ability on achievement test scores. Journal of Econometrics 121(1-2), 39–98.
Hanson, S. L., M. Schaub, and D. P. Baker (1996). Gender stratification in the
science pipeline: A comparative analysis of seven countries. Gender & Society 10(3),
271–290.
Hardies, K., D. Breesch, and J. Branson (2013). Gender differences in
overconfidence and risk taking: Do self-selection and socialization matter? Economics
Letters 118(3), 442–444.
Heckman, J. J., J. Stixrud, and S. Urzua (2006). The effects of cognitive and
noncognitive abilities on labor market outcomes and social behavior. Journal
of Labor Economics 24(3), 411–482.
Herrmann, S. D., R. M. Adelman, J. E. Bodford, O. Graudejus, M. A. Okun, and
V. S. Kwan (2016). The effects of a female role model on academic performance
and persistence of women in stem courses. Basic and Applied Social
Psychology 38(5), 258–268.
Hong, L. and S. E. Page (2004). Groups of diverse problem solvers can
outperform groups of high-ability problem solvers. Proceedings of the National
Academy of Sciences 101(46), 16385–16389.
Humphries, J. E., J. S. Joensen, G. Veramendi, et al. (2017). College major choice:
Sorting and differential returns to skills. Unpublished Paper.
Hunt, J. (2016). Why do women leave science and engineering? ILR Review 69(1),
199–226.
ICE (2016). Stem designated degree program list.
https://www.ice.gov/sites/default/files/documents/Document/2016/stem-list.pdf.
[Online; accessed 2020-07-15].
Johnson, M. and V. S. Helgeson (2002). Sex differences in response to evaluative
feedback: A field study. Psychology of Women Quarterly 26(3), 242–251.
Kahn, S. and D. Ginther (2017). Women and stem. Technical report, National
Bureau of Economic Research.
Kimmel, L. G., J. D. Miller, and J. S. Eccles (2012). Do the paths to stemm
professions differ by gender? Peabody Journal of Education 87(1), 92–113.
Kugler, A. D., C. H. Tinsley, and O. Ukhaneva (2017). Choice of majors: Are
women really different from men? Technical report, National Bureau of
Economic Research.
Kunze, A. and A. R. Miller (2017). Women helping women? evidence from
private sector data on workplace hierarchies. Review of Economics and
Statistics 99(5), 769–775.
Light, A. and A. Rama (2019). Moving beyond the stem/non-stem dichotomy:
wage benefits to increasing the stem-intensities of college coursework and
occupational requirements. Education Economics 27(4), 358–382.
Mansour, H., D. I. Rees, B. M. Rintala, and N. N. Wozny (2020). The effects of
professor gender on the post-graduation outcomes of female students. Technical
report, National Bureau of Economic Research.
Neumark, D. and R. Gardecki (1998). Women helping women? role model and
mentoring effects on female Ph.D. students in economics. Journal of Human
Resources 33(1).
Niederle, M. and L. Vesterlund (2010). Explaining the gender gap in math test
scores: The role of competition. Journal of Economic Perspectives 24(2), 129–44.
Noonan, R. (2017). Women in stem: 2017 update. Office of the Chief Economist,
Economics and Statistics Administration, U.S. Department of Commerce (ESA
Issue Brief #06-17). Retrieved from
https://www.esa.gov/reports/women-stem-2017-update.
NSF. National survey of college graduates.
https://www.nsf.gov/statistics/srvygrads/#tabs-2&tools. Accessed: 2020-07-01.
Paglin, M. and A. M. Rufolo (1990). Heterogeneous human capital, occupational
choice, and male-female earnings differences. Journal of Labor Economics 8(1,
Part 1), 123–144.
Polachek, S. W. (1978). Sex differences in college major. ILR Review 31(4), 498–508.
Polachek, S. W. (1981). Occupational self-selection: A human capital approach to
sex differences in occupational structure. The review of Economics and Statistics,
60–69.
Pope, D. G. and J. R. Sydnor (2010). Geographic variation in the gender
differences in test scores. Journal of Economic Perspectives 24(2), 95–108.
Prada, M. F., S. Urzúa, et al. (2017). One size does not fit all: Multiple dimensions
of ability, college attendance, and earnings. Journal of Labor Economics 35(4),
953–991.
Rask, K. and J. Tiefenthaler (2008). The role of grade sensitivity in explaining
the gender imbalance in undergraduate economics. Economics of Education
Review 27(6), 676–687.
Roberts, T.-A. (1991). Gender and the influence of evaluations on
self-assessments in achievement settings. Psychological bulletin 109(2), 297.
Robst, J. (2007). Education, college major, and job match: Gender differences in
reasons for mismatch. Education Economics 15(2), 159–175.
Roy, A. D. (1951). Some thoughts on the distribution of earnings. Oxford Economic
Papers 3(2), 135–146.
Saltiel, F. (2019). What’s math got to do with it? multidimensional ability and the
gender gap in stem. In 2019 Meeting Papers, Volume 1201. Society for Economic
Dynamics.
Sarzosa, M. (2017). Negative social interactions and skill accumulation: The case
of school bullying.
Sarzosa, M. and S. Urzúa (2015). Bullying in teenagers, the role of cognitive and
non-cognitive skills. NBER Working Paper, w21631.
Sarzosa, M. and S. Urzúa (2016). Implementing factor models for unobserved
heterogeneity in stata. The Stata Journal 16(1), 197–228.
Schlenker, E. (2015). The labour supply of women in stem. IZA Journal of European
Labor Studies 4(1), 12.
Shapiro, J. R. and A. M. Williams (2012). The role of stereotype threats in
undermining girls’ and women’s performance and interest in stem fields. Sex
Roles 66(3-4), 175–183.
Speer, J. D. (2017). The gender gap in college major: Revisiting the role of
pre-college factors. Labour Economics 44, 69–88.
Trusty, J. (2002). Effects of high school course-taking and other variables on
choice of science and mathematics college majors. Journal of Counseling &
Development 80(4), 464–474.
Turner, S. E. and W. G. Bowen (1999). Choice of major: The changing
(unchanging) gender gap. ILR Review 52(2), 289–313.
Urzua, S. (2008). Racial labor market gaps: The role of abilities and schooling
choices. Journal of Human Resources 43(4), 919–971.
van Ham, M., C. H. Mulder, and P. Hooimeijer (2001). Spatial flexibility in job
mobility: macrolevel opportunities and microlevel restrictions. Environment
and Planning A: Economy and Space 33(5), 921–940.
Ware, N. C., N. A. Steckler, and J. Leserman (1985). Undergraduate women: Who
chooses a science major? The Journal of Higher Education 56(1), 73–84.
Wiswall, M. and B. Zafar (2015a). Determinants of college major choice: Identi-
fication using an information experiment. The Review of Economic Studies 82(2),
791–824.
Wiswall, M. and B. Zafar (2015b). How do college students respond to public
information about earnings? Journal of Human Capital 9(2), 117–169.
Xie, Y. and K. A. Shauman (2003). Women in science: Career processes and outcomes,
Volume 26. Harvard University Press, Cambridge, MA.
Xu, Y. J. (2008). Gender disparity in stem disciplines: A study of faculty attrition
and turnover intentions. Research in Higher Education 49(7), 607–624.
Xue, Y. and R. C. Larson (2015). Stem crisis or stem surplus: Yes and yes. Monthly
Lab. Rev. 138, 1.
Zafar, B. (2013). College major choice and the gender gap. Journal of Human
Resources 48(3), 545–595.
Tables and Figures
Table 1: Summary Statistics: the Restricted Sample
Female Sample Male Sample
Mean SD Min Max Mean SD Min Max
ACT English 25.645 4.606 11 36 25.497 4.634 11 36
ACT Reading 25.930 4.942 12 36 26.264 4.947 8 36
ACT Science 24.657 3.954 12 36 26.715 4.389 11 36
ACT Math 25.631 4.509 15 36 28.229 4.187 15 36
HS GPA 3.531 0.426 2 4 3.482 0.427 2 4
COM114 3.526 0.570 2 4 3.339 0.630 1 4
AFGR 76.671 4.192 57.537 91.085 76.709 4.136 57.837 91.085
Annual Salary 45158 14369 8000 101000 53356 13097 5250 107000
STEM Major 37.03% 63.40%
STEM Job 73.11% 81.17%
Indiana 47.69% 43.19%
Midwest 41.83% 43.46%
Northeast 2.18% 2.57%
South 4.80% 7.33%
West 3.49% 3.46%
N 1145 1910
Note: The analysis sample includes undergraduate students who graduated between 2007 and 2014.
The standardized ACT English, ACT Reading, ACT Science, and ACT Math tests are scored on a scale
of 1–36. COM114 grade points range from 1 to 4. AFGR is measured at the level of the student’s
high school state and high school graduation year. Self-reported Annual Salary is nominal and in USD.
Table 2: Likelihood of Getting A STEM Degree
(1) (2)
Female Male
General Academic Skill 0.0467*** 0.0656***
(0.0058) (0.0056)
STEM-specific Skill 0.0384*** 0.0487***
(0.0086) (0.0062)
STEM Share of Pre-college Teachers 4.8935* 5.1891**
(2.8286) (2.0797)
Female Share of STEM Pre-college Teachers 0.7078 1.3684*
(1.0409) (0.7600)
Female Peer Effect 0.7181*** 0.3631**
(0.1490) (0.1642)
Cohort Size Effect 0.0030*** 0.0030***
(0.0003) (0.0003)
N 1145 1910
Note: Columns (1) and (2) show the probit marginal effects at the means for the female and
male samples, respectively. Estimates in the top two rows reflect the change in the probability
of graduating in STEM from a one-unit increase in the corresponding skill. The standard deviation
of female (male) General Academic Skill is 3.576 (3.539); the standard deviation of female (male)
STEM-specific Skill is 2.801 (2.862). Thus a one-standard-deviation increase in General Academic
Skill increases an average woman’s (man’s) likelihood of graduating with a STEM degree by
16.70 (23.22) percentage points, and a one-standard-deviation increase in STEM-specific Skill
increases an average woman’s (man’s) likelihood of graduating with a STEM degree by 10.76
(13.94) percentage points. The dependent variable in both columns is an indicator of
graduating with a STEM degree. First enrollment year fixed effects are included but not shown
for brevity. Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.001.
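The one-standard-deviation effects quoted in the note to Table 2 can be reproduced, to a first-order approximation, by multiplying each marginal effect by the corresponding skill standard deviation. The Python sketch below only redoes this arithmetic with the values reported in Table 2; the function name `one_sd_effect_pp` and the dictionary layout are illustrative assumptions, not the paper's estimation code.

```python
# Marginal effects (per unit of skill) and skill standard deviations,
# copied from Table 2 and its note. The layout is an illustrative assumption.
TABLE2 = {
    ("General Academic Skill", "Female"): (0.0467, 3.576),
    ("General Academic Skill", "Male"): (0.0656, 3.539),
    ("STEM-specific Skill", "Female"): (0.0384, 2.801),
    ("STEM-specific Skill", "Male"): (0.0487, 2.862),
}


def one_sd_effect_pp(marginal_effect, sd):
    """Linear approximation: percentage-point change in P(STEM degree)
    for a one-standard-deviation increase in the skill."""
    return 100.0 * marginal_effect * sd


for (skill, group), (me, sd) in TABLE2.items():
    print(f"{skill:24s} {group:6s} {one_sd_effect_pp(me, sd):6.2f} p.p.")
```

Because the probit is nonlinear, this product is only a local approximation around the sample means, which is also how the note states the result.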
Table 3: Likelihood of STEM Graduates Working in STEM Occupations
(1) (2)
Female Male
General Academic Skill 0.0186* 0.0113**
(0.0106) (0.0058)
STEM-specific Skill 0.0145 0.0112
(0.0154) (0.0070)
Female Share of STEM Workers 0.0635 0.0423
(0.1541) (0.0757)
STEM Average Work Hours/Week -0.0100* -0.0020
(0.0159) (0.0070)
Cohort Size Effect 0.0190*** 0.0116***
(0.0159) (0.0070)
N 424 1211
Note: Columns (1) and (2) show the probit marginal effects at the means for female and male
STEM graduates, respectively. Estimates in the top two rows reflect the change in the
probability of working in STEM from a one-unit increase in the corresponding skill of STEM
graduates. The standard deviation of General Academic Skill of female (male) STEM degree
holders is 3.496 (3.349); the standard deviation of STEM-specific Skill of female (male) STEM
degree holders is 2.723 (2.711). Thus a one-standard-deviation increase in General Academic Skill
increases an average woman’s (man’s) likelihood of working in a STEM occupation by 6.50
(3.78) p.p., and a one-standard-deviation increase in STEM-specific Skill increases an average
woman’s (man’s) likelihood of working in a STEM occupation by 3.95 (3.04) p.p. The dependent
variable in both columns is an indicator of working in STEM. Degree year fixed effects and home
Census region fixed effects are included but not shown for brevity. Standard errors
in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.001.
Table 4: Counterfactuals of Majoring in STEM
(1)
Female
Female Factual: 0.3690
(0.0120)
Counterfactual: replace women’s αM,A, αM,B with men’s 0.3722
(0.0121)
Counterfactual: replace women’s θA, θB with men’s 0.3878
(0.0121)
Counterfactual: replace women’s αM,A, αM,B, θA, θB with men’s 0.4007
(0.0124)
Counterfactual: replace women’s βM with men’s 0.4299
(0.0119)
Counterfactual: replace women’s XM with men’s 0.5658
(0.0097)
Counterfactual: replace women’s βM and XM with men’s 0.6442
(0.0092)
Male Factual: 0.6345
(0.0233)
Note: The table shows the model’s predicted fraction of female STEM graduates, factually and
counterfactually. The predicted values come from 1,000,000 replications: 1,000 bootstraps, each
with 1,000 replications. Rows 2–7 show the probability of majoring in STEM when female
parameters are replaced with the corresponding male parameters. Standard errors are in parentheses.
Bold numbers are statistically different from the factual.
Table 5: Counterfactuals of Working in STEM
(1)
Female Factual: 0.7159
(0.0308)
Counterfactual: replace women’s αJ,A, αJ,B with men’s 0.7177
(0.0215)
Counterfactual: replace women’s θA, θB with men’s 0.7281
(0.0214)
Counterfactual: replace women’s αJ,A, αJ,B, θA, θB with men’s 0.7176
(0.0218)
Counterfactual: replace women’s βJ with men’s 0.7540
(0.0206)
Male Factual: 0.8010
(0.0161)
Note: This table shows the model’s predicted fraction of female STEM graduates working in
STEM jobs, factually and counterfactually. The predicted values come from 1,000,000 replications:
1,000 bootstraps, each with 1,000 replications. Rows 2–5 show the probability of working in STEM
when female parameters are replaced with the corresponding male parameters. Standard errors in
parentheses. Bold numbers are statistically different from the factual.
Table 6: Decomposition of Job Decision
STEM Math-Int. STEM
(1) (2) (3) (4)
Exclude Replace Exclude Replace
Female Factual: 0.7159 0.7881
(0.0308) (0.0217)
Counterfactual: β on Female STEM Worker Size Effect 0.7036 0.7190 0.7874 0.7993
(0.0219) (0.0215) (0.0225) (0.0214)
Counterfactual: β on Work Hours/Week 0.9426 0.9103 0.9344 0.9098
(0.0110) (0.0137) (0.0131) (0.0156)
Counterfactual: β on Cohort Size Effect 0.6839 0.6636 0.7891 0.7714
(0.0223) (0.0225) (0.0217) (0.0233)
Counterfactual: Year Fixed Effects 0.7531 0.7500 0.8142 0.8292
(0.0209) (0.0208) (0.0215) (0.0201)
Counterfactual: Home Region Fixed Effects 0.8078 0.8059 0.8795 0.8686
(0.0187) (0.0189) (0.0172) (0.0184)
N 424 424 351 351
Note: The factual STEM retention rate for women (men) is 0.7159 (0.8010). The factual
Math-intensive STEM retention rate for women (men) is 0.7881 (0.8478). Columns (1) and (3) show
the counterfactual fraction of female STEM graduates working in STEM when the corresponding
predictor is excluded. Columns (2) and (4) show the counterfactuals from replacing the female
coefficient of interest with the male one. Standard errors in parentheses. Bold numbers are
statistically different from the factuals by the test (H0 = factual; H1 = counterfactual).
(a) Female General Academic Skill (b) Male General Academic Skill
(c) Female STEM-specific Skill (d) Male STEM-specific Skill
Figure 1: Distribution of Two Skills by Career Group by Gender
(a) Female General Academic Skill (b) Male General Academic Skill
(c) Female STEM-specific Skill (d) Male STEM-specific Skill
Figure 2: ATE of Majoring in STEM, over the Deciles of Skills
(a) Female General Academic Skill (b) Male General Academic Skill
(c) Female STEM-specific Skill (d) Male STEM-specific Skill
Figure 3: ATE of Working in STEM, over the Deciles of Skills
Figure 4: Overlapping Skill Distributions of non-STEM Women and Men
Appendix A Data Details and Sample Selection
I address the concern of potential sample selection in three ways. First, I estimate
a model with an indicator of reporting the first job as the dependent variable
and the two latent skills as independent variables, controlling for relevant
characteristics at both the individual level and the cohort level. Table A3 shows
essentially no correlation between ability and survey response. Although the male
sample shows a positive correlation between STEM-specific Skill and the likelihood
of responding to the survey, the magnitude is too small to be economically
meaningful: a one-unit increase in STEM-specific Skill increases an average man’s
probability of reporting his first job by 0.548 percentage points. This analysis
largely rules out the concern about ability sorting into survey response.
Second, I estimate the latent skill distributions of the full sample and compare
them with those from the restricted sample. Figures A1a–A1d show the de-meaned
skill distributions of the restricted sample versus the full sample by gender.
The skill distributions from the two samples almost overlap. To take a closer
look at the differences in the distributions of the test scores used for the skill
measurement, I show the kernel density of each test score by gender in Figures A2
and A3. As the summary statistics show (Tables 1 and A2), the average scores in the
restricted sample are generally higher than those in the full sample. The score
distributions reaffirm that the relationship between the full and the restricted
sample is mainly a shift in means, though the restricted sample shows a fatter
right tail in a few scores. It is worth noting that a shift in means would not
affect the main results in the paper, as the model estimates only the linear
effect of ability sorting.
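A brief sketch of why a shift in means is innocuous, under the simplifying assumption that the choice probability is a probit whose index is linear in the two skills. The symbols $\beta_0$, $\alpha^{A}$, $\alpha^{B}$, $\mu^{A}$, and $\mu^{B}$ are generic notation introduced here for illustration only. Write each skill as a de-meaned component plus a sample-specific mean, $\theta^{k} = \tilde{\theta}^{k} + \mu^{k}$ for $k \in \{A, B\}$. Then

\[
\Pr(D = 1 \mid \theta^{A}, \theta^{B})
= \Phi\left(\beta_0 + \alpha^{A}\theta^{A} + \alpha^{B}\theta^{B}\right)
= \Phi\left(\left(\beta_0 + \alpha^{A}\mu^{A} + \alpha^{B}\mu^{B}\right) + \alpha^{A}\tilde{\theta}^{A} + \alpha^{B}\tilde{\theta}^{B}\right).
\]

Under this simplification the shift in means is absorbed into the constant, while the coefficients on the skills in the index, which capture ability sorting, are unchanged.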
Finally, I estimate the major choice model for the full sample to compare ability
sorting in college major choice in the full sample with the main results in this
paper. The results are shown in Table A4. The significant gender gap in ability
sorting still holds: both latent skills are much weaker determinants of majoring
in STEM for women than for men. Compared with the results in Table 2, the
estimates of STEM-specific Skill are similar between the restricted sample and the
full sample for both women and men. In contrast, the estimates of General Academic
Skill are smaller in the full sample. One explanation for this difference is that
the full sample includes more non-STEM majors with low test scores who are not on
the margin of selecting into a STEM major, as suggested by the summary statistics
in Table A2.
Table A1: Sample Reduction
Sample Total Female Male
All Domestic 39,538 17,496 22,042
Complete Admission Records 35,371 15,293 19,448
Valid GPA 28,877 13,065 15,812
Valid ACT 13,104 6,181 6,923
Valid COM 114 Grade Points 10,089 4,495 5,594
Valid First Destination Survey Response 3,055 1,145 1,910
(30.28%) (25.47%) (34.14%)
STEM Degrees 43.46% 29.38% 54.72%
STEM Degrees among Survey Respondents 53.53% 37.03% 63.40%
Note: The sample includes Purdue University’s domestic undergraduate students who grad-
uated between 2007–2014. There are six scores required for the ability measurement system:
ACT English, ACT Reading, ACT Science, ACT Math, grade points of COM114 (required for all
Purdue freshmen) and high school GPA. A valid First Destination Survey response means the
graduate responded to the survey with a positive annual salary, a meaningful occupation and
job location.
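The percentages in parentheses in Table A1 appear to be survey response rates, that is, valid First Destination Survey responses divided by the number of students with valid COM114 grade points. The Python sketch below checks this reading with the counts in the table; the dictionary names are illustrative assumptions.

```python
# Counts copied from Table A1; the variable names are illustrative.
valid_com114 = {"Total": 10089, "Female": 4495, "Male": 5594}
valid_survey = {"Total": 3055, "Female": 1145, "Male": 1910}

# Response rate in percent: survey respondents over students with all six scores.
response_rate = {
    group: 100.0 * valid_survey[group] / valid_com114[group]
    for group in valid_com114
}

for group, rate in response_rate.items():
    print(f"{group:6s} {rate:5.2f}%")
```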
Table A2: Summary Statistics: the Full Sample
Female Sample Male Sample
Mean SD Min Max Mean SD Min Max
ACT English 24.872 4.649 8 36 24.996 4.665 10 36
ACT Reading 25.275 5.007 4 36 25.801 5.095 7 36
ACT Science 23.945 3.883 9 36 26.127 4.318 9 36
ACT Math 24.684 4.387 13 36 27.463 4.372 13 36
HS GPA 3.464 0.440 2 4 3.404 0.453 2 4
COM114 3.430 0.614 1 4 3.243 0.651 1 4
AFGR 76.258 4.040 56.450 91.085 76.604 4.230 53.973 91.458
STEM Major 29.38% 54.72%
Indiana 54.11% 45.44%
Midwest 35.64% 42.06%
Northeast 2.19% 2.36%
South 4.47% 6.49%
West 3.59% 3.65%
N 4565 5640
Note: The full sample includes undergraduate students who graduated between 2007 and 2014.
The standardized ACT English, ACT Reading, ACT Science, and ACT Math tests are scored on a scale
of 1–36. COM114 grade points range from 1 to 4. AFGR is measured at the level of the student’s
high school state and high school graduation year. I do not observe dropout students.
Self-reported Annual Salary is nominal and in USD.
Table A3: Selection: Self-reporting of First Job
(1) (2)
Female Male
General Academic Skill 0.002117 -0.0020419
(0.00251018) (0.00276165)
STEM-specific Skill 0.0002244 0.0054804**
(0.00216602) (0.00229411)
N 4565 5640
Note: Columns (1) and (2) show the marginal effects of the two latent skills in the probit model of
responding to the First Destination Survey. The dependent variable in both columns is
a dummy for self-reporting the first job. Number of Purdue graduates in the same major, number of
Purdue female graduates in the same major, first enrollment year fixed effects, first enrollment
semester fixed effects, degree year fixed effects, degree semester fixed effects, and home region
fixed effects are included but not shown for brevity. The estimates show that neither of women’s
latent skills is correlated with survey response. Although there is a positive correlation between
men’s STEM-specific Skill and survey response, the magnitude is too small to be economically
meaningful: a one-unit increase in STEM-specific Skill increases the probability that an average
man reports his first job by 0.548 percentage points. Standard errors in parentheses. * p < 0.10,
** p < 0.05.
Table A4: Likelihood of Graduating with A STEM Major, the Full Sample
(1) (2)
Female Male
General Academic Skill 0.035*** 0.052***
(0.0022) (0.0030)
STEM-specific Skill 0.029*** 0.050***
(0.0030) (0.0039)
N 4565 5640
Note: This table shows the marginal effects of the major choice model for the full sample, for
comparison with the results of the restricted sample. Columns (1) and (2) show the marginal
effects at the means for the female and male samples, respectively. All estimates in the table
reflect the change in the probability of graduating in STEM from a one-unit increase in the
corresponding ability. The dependent variable in both columns is a dummy for majoring in STEM.
Number of Purdue graduates in the same major, number of Purdue female graduates in the same
major, first enrollment year, first enrollment semester, and degree year fixed effects are
included but not shown for brevity. Standard errors in parentheses. * p < 0.10, ** p < 0.05, ***
p < 0.01.
(a) Female General Academic Skill (b) Male General Academic Skill
(c) Female STEM-specific Skill (d) Male STEM-specific Skill
Figure A1: Skill Distributions, Restricted versus Full Sample
Figure A2: Distributions of the 6 Scores, Female Restricted versus Full Sample
Figure A3: Distributions of the 6 Scores, Male Restricted versus Full Sample
Appendix B Estimated Model Parameters and Additional Results
Table B1: Identification of Latent Skills at College Entrance, Female
(1) (2) (3) (4) (5) (6)
Dependent Var → ACT E COM114 ACT R ACT S HSGPA ACT M
General Academic Skill 1.127*** 0.045*** 1 0.771*** 1.780*** 0.832***
(0.020) (0.005) X (0.025) (0.097) (0.029)
STEM-specific Skill [columns (4)–(6)] 0.361*** 1.199*** 1
(0.043) (0.161) X
Home Region: Indiana -0.569 -0.128 -0.660 -1.209*** 1.889 -0.801
(0.773) (0.094) (0.827) (0.449) (1.832) (0.510)
Home Region: Midwest 1.044 -0.171* 0.210 -0.201 -3.313* 0.335
(0.783) (0.099) (0.853) (0.477) (1.946) (0.547)
Home Region: Northeast -1.389 -0.260* -0.893 -0.897 -1.779 0.0322
(1.158) (0.147) (1.264) (0.709) (2.892) (0.797)
Home Region: South 2.594** -0.073 1.918* 1.141** 2.550 1.839***
(1.066) (0.120) (1.108) (0.573) (2.334) (0.656)
AFGR 0.122*** 0.013** 0.103** 0.113*** 0.566*** 0.111***
(0.039) (0.005) (0.043) (0.0255) (0.103) (0.030)
First Term Semester: Fall 2.042* -0.112 2.557* 1.550* 8.124** 2.827**
(1.084) (0.178) (1.327) (0.942) (3.727) (1.306)
First Term Semester: Spring -1.536 -0.050 0.597 -1.167 -4.794 -1.524
(1.552) (0.258) (1.905) (1.301) (5.257) (1.648)
Constant 14.043*** 2.754*** 15.706*** 15.13*** -13.83 14.54***
(3.088) (0.427) (3.486) (2.128) (8.585) (2.620)
N 1,145 1,145 1,145 1,145 1,145 1,145
Note: Each column is a separate regression specified in Equation 7. All columns have the same number of observations: 1,145.
The loading of General Academic Skill is normalized to one in the regression of ACT Reading, so that General Academic
Skill takes the metric of ACT Reading. The loading of STEM-specific Skill is normalized to one in the regression of
ACT Math, so that STEM-specific Skill takes the metric of ACT Math. I control for the annual state-averaged freshman
graduation rate (AFGR) in the year each student graduated from high school, home census region fixed effects,
and first enrollment semester fixed effects.
Table B2: Identification of Latent Skill at College Entrance, Male
(1) (2) (3) (4) (5) (6)
Dependent Var → ACT E COM114 ACT R ACT S HSGPA ACT M
General Academic Skill 1.151*** 0.045*** 1 0.831*** 1.557*** 0.729***
(0.017) (0.004) X (0.022) (0.078) (0.021)
STEM-specific Skill [columns (4)–(6)] 0.455*** 1.107*** 1
(0.029) (0.103) X
Home Region: Indiana -2.216*** -0.071 -1.981*** -1.831*** -0.180 -1.388***
(0.687) (0.080) (0.703) (0.397) (1.437) (0.394)
Home Region: Midwest -0.995 -0.206** -1.111 -0.427 -5.342*** -0.267
(0.736) (0.085) (0.748) (0.421) (1.519) (0.421)
Home Region: Northeast -1.441 -0.204* -1.138 -0.290 -3.640* -0.415
(0.978) (0.119) (1.013) (0.577) (2.120) (0.536)
Home Region: South 0.0362 -0.013 -0.068 0.188 -0.141 0.704
(0.742) (0.093) (0.777) (0.479) (1.699) (0.518)
AFGR 0.169*** 0.019*** 0.089** 0.108*** 0.654*** 0.124***
(0.031) (0.004) (0.034) (0.0224) (0.0810) (0.0226)
First Term Semester: Fall 4.941*** 0.210 3.610** 5.844*** 13.67*** 6.089***
(1.011) (0.179) (1.237) (0.853) (3.228) (0.670)
First Term Semester: Spring 2.794** -0.232 1.270 4.297*** 10.26** 4.402***
(1.315) (0.225) (1.578) (1.110) (4.100) (1.019)
Constant 9.045*** 2.204*** 17.235*** 13.607*** -25.932*** 13.383***
(2.582) (0.379) (2.880) (1.888) (6.891) (1.810)
N 1,910 1,910 1,910 1,910 1,910 1,910
Note: Each column is a separate regression specified in Equation 7. All columns have the same number of observations: 1,910.
The loading of General Academic Skill is normalized to one in the regression of ACT Reading, so that General Academic
Skill takes the metric of ACT Reading. The loading of STEM-specific Skill is normalized to one in the regression of
ACT Math, so that STEM-specific Skill takes the metric of ACT Math. I control for the annual state-averaged freshman
graduation rate (AFGR) in the year each student graduated from high school, home census region fixed effects,
and first enrollment semester fixed effects.
Table B3: Salary Estimates, Male
(1) (2) (3)
VARIABLES Male11 Male10 Male00
General Academic Skill 423.7268*** 169.6262 178.3931
(129.0390) (343.4151) (176.6399)
STEM-specific Skill 715.5332*** 1092.4550*** 283.0023
(160.5208) (375.5270) (194.4961)
Unemployment Rate at Job State -838.4608** -1058.6730 -30.3880
(359.0134) (897.7244) (591.5366)
STEM Employment Fraction -173206 -2578968 -131179.6
(1726145) (4537130) (2340473)
# Employment in STEM -0.0002 0.0257 0.0026
(0.0141) (0.0382) (0.0192)
# Employment in nonSTEM -0.0000 -0.0010 -0.0001
(0.0006) (0.0016) (0.0008)
# Graduates 1.2083* 2.8769 -0.0705
(0.7375) (2.34445) (1.2439)
# STEM Graduates -1.2022 -3.621232 -1.0216
(1.4600) (3.9220) (2.3663)
# Female Graduates -2.1254 -6.1693 -0.3924
(1.7149) (5.1670) (2.8282)
# Female STEM Graduates 2.5221 10.0240 3.3854
(4.1249) (10.6760) (6.5282)
Constant 58327 454158 163678
(126261) (390848) (195894)
N 1,910 1,910 1,910
Note: Columns (1)–(3) separately show the estimates for men who graduate in STEM and work
in STEM (Male11), men who graduate in STEM and work in non-STEM (Male10), and men who
graduate in non-STEM and work in non-STEM (Male00). Estimates in the top two rows reflect
the change in salary from a one-unit increase in the corresponding skill. The dependent variable
in all columns is annual salary in USD. Job location (Census region) fixed effects are included
but not shown. Standard errors in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.
Table B4: Salary Estimates, Female
(1) (2) (3)
VARIABLES Female11 Female10 Female00
General Academic Skill 786.4390*** 299.9881 143.2026
(216.4821) (427.2216) (157.6091)
STEM-specific Skill 928.8647*** 1537.7310** 897.1492***
(321.3625) (631.0745) (216.3343)
Unemployment Rate at Job State -131.1335 360.5059 -1065.638*
(605.7405) (1678.2650) (584.1682)
STEM Employment Fraction -3030426 -2939595 -1916998
(2961158) (7638009) (2226907)
# Employment in STEM 0.0178 0.0260 0.0129
(0.0241) (0.0627) (0.0184)
# Employment in nonSTEM -0.0009 -0.0010 -0.0006
(0.0010) (0.0026) (0.0007)
# Graduates 1.1033 0.1516 0.9771
(1.1959) (1.9286) (0.7096)
# STEM Graduates 0.4551 1.0689 -0.7856
(2.3621) (2.6065) (1.2198)
# Female Graduates -1.4926 -0.1680 -1.5739
(2.7849) (3.8246) (1.5081)
# Female STEM Graduates -1.7052 -2.8384 1.3533
(6.7334) (6.4461) (3.2041)
Constant 19282.46 205459.1 69147.86
(202287.5) (407797.3) (132414)
N 1,145 1,145 1,145
Note: Columns (1)–(3) separately show the estimates for women who graduate in STEM and
work in STEM (Female11), women who graduate in STEM and work in non-STEM (Female10),
and women who graduate in non-STEM and work in non-STEM (Female00). Estimates in the
top two rows reflect the change in salary from a one-unit increase in the corresponding skill.
The dependent variable in all columns is annual salary in USD. Job location (Census region)
fixed effects are included but not shown. Standard errors in parentheses. *** p < 0.01,
** p < 0.05, * p < 0.1.
(a) Female (b) Male
Figure B1: Distributions of the Two Latent Skills
Notes: Distributions are centered at mean zero. Female sample: sd(f1) = 3.576; sd(f2) = 2.801.
Male sample: sd(f1) = 3.539; sd(f2) = 2.862.
Figure B2: Predicted Probability of Majoring in STEM, over the Deciles of the
Two Skills
Figure B3: Predicted Probability of Working in STEM, over the Deciles of the Two
Skills
(a) Female General Academic Skill (b) Male General Academic Skill
(c) Female STEM-specific Skill (d) Male STEM-specific Skill
Figure B4: MTE of Majoring in STEM, over the Deciles of the Two Skills
(a) Female General Academic Skill (b) Male General Academic Skill
(c) Female STEM-specific Skill (d) Male STEM-specific Skill
Figure B5: MTE of Working in STEM, over the Deciles of the Two Skills
(a) Female General Academic Skill (b) Male General Academic Skill
(c) Female STEM-specific Skill (d) Male STEM-specific Skill
Figure B6: ATE of Majoring and Working in STEM, over the Deciles of Skills
Appendix C Assessing the Fit of the Model
Table C1: Assessing the Fit of the Model, Test Scores
Female Male
Panel A. ACT English
Data 25.645 (4.606) 25.497 (4.634)
Model Prediction 25.683 (4.619) 25.508 (4.634)
Panel B. Communication 114 Grade Points
Data 3.526 (0.570) 3.339 (0.630)
Model Prediction 3.523 (0.574) 3.339 (0.633)
Panel C. ACT Reading
Data 25.930 (4.942) 26.264 (4.947)
Model Prediction 25.973 (4.941) 26.277 (4.967)
Panel D. ACT Science
Data 24.657 (3.954) 26.715 (4.389)
Model Prediction 24.668 (4.080) 26.734 (4.353)
Panel E. exp(High School GPA)
Data 36.941 (13.034) 35.263 (12.868)
Model Prediction 37.107 (13.578) 35.323 (12.849)
Panel F. ACT Math
Data 25.631 (4.509) 28.229 (4.187)
Model Prediction 25.666 (5.368) 28.234 (4.160)
N 1145 1910
Note: The predicted values come from 1,000,000 simulations based on 1000 bootstraps of the
estimated parameters of the model and 1000 random draws from the two ability distributions
within each bootstrap.
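The nested structure described in the note (bootstrap draws of the parameters in an outer loop, random draws of the latent skills in an inner loop) can be sketched in Python as follows. This is a toy illustration for a single score with a single skill, not the paper's procedure: the normal distributions for the skill and the measurement error, the noise standard deviation of 3.4, and the simulated bootstrap intercepts are all assumptions made here for the example.

```python
import random
import statistics


def simulate_scores(intercept_draws, n_skill_draws, skill_sd, noise_sd, seed=0):
    """Pool simulated scores over bootstrap parameter draws and skill draws.

    Returns the (mean, standard deviation) of the pooled simulated scores.
    """
    rng = random.Random(seed)
    simulated = []
    for intercept in intercept_draws:          # outer loop: one bootstrap draw each
        for _ in range(n_skill_draws):         # inner loop: draws of the latent skill
            theta = rng.gauss(0.0, skill_sd)   # ASSUMPTION: normal, mean-zero skill
            noise = rng.gauss(0.0, noise_sd)   # ASSUMPTION: normal measurement error
            simulated.append(intercept + 1.0 * theta + noise)  # loading fixed at one
    return statistics.fmean(simulated), statistics.pstdev(simulated)


# HYPOTHETICAL bootstrap draws of the intercept, centered on the female
# ACT Reading mean from Table C1; real draws would come from re-estimation.
_draw_rng = random.Random(1)
intercepts = [25.930 + _draw_rng.gauss(0.0, 0.1) for _ in range(100)]

mean, sd = simulate_scores(intercepts, n_skill_draws=100, skill_sd=3.576, noise_sd=3.4)
print(f"simulated mean = {mean:.2f}, simulated sd = {sd:.2f}")
```

Setting both loop sizes to 1,000 would give the 1,000,000 simulations mentioned in the note; the smaller sizes here only keep the example fast.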
Table C2: Assessing the Fit of the Model, Choices and Salary
Female Male
Panel A. Prob(STEM Major)
Data 0.3703 (0.4831) 0.6340 (0.4818)
Model Prediction 0.3690 (0.4824) 0.6345 (0.4815)
N 1145 1910
Panel B. Prob(STEM Job)
Data 0.7311 (0.4439) 0.8117 (0.3911)
Model Prediction 0.7159 (0.4502) 0.8010 (0.3990)
N 424 1211
Panel C. Salary11
Data 58280 (11299) 58669 (11072)
Model Prediction 56404 (12251) 57914 (11047)
N 310 983
Panel D. Salary10
Data 48180 (14032) 54358 (13286)
Model Prediction 48040 (14990) 54349 (13829)
N 114 228
Panel E. Salary00
Data 39039 (11370) 45558 (11759)
Model Prediction 39493 (11569) 46234 (11863)
N 721 699
Note: Predicted means and standard deviations (in parentheses) are not statistically different
from the actual means and standard deviations. The predicted values come from 1,000,000
simulations based on 1,000 bootstraps of the estimated parameters of the model and 1,000 random
draws from the two ability distributions within each bootstrap.
Figure C1: Fit of the Model, Male Test Scores
Notes: Actual (red, dash) and predicted (blue, line) cumulative distributions
of the following scores: (a) ACT English, (b) Communication 114 grade points, (c) ACT
Reading, (d) ACT Science, (e) exponential high school GPA, and (f) ACT Math. The
predicted values come from simulations (10,000 reps) based on the estimated parameters of
the model.
Figure C2: Fit of the Model, Female Test Scores
Notes: Actual (red, dash) and predicted (blue, line) cumulative distributions
of the following scores: (a) ACT English, (b) Communication 114 grade points, (c) ACT
Reading, (d) ACT Science, (e) exponential high school GPA, and (f) ACT Math. The
predicted values come from simulations (10,000 reps) based on the estimated parameters of
the model.
Appendix D Heterogeneity of Gender Gap in STEM
The literature has documented that women are underrepresented in math-intensive
STEM fields (geosciences, engineering, economics, math/computer science, and
physical science), while less math-intensive STEM fields (life sciences, psychology,
and other social sciences) tend to have more women (Ceci et al., 2014; Kahn and
Ginther, 2017). To understand the differential gender gaps across STEM fields,
I classify the STEM majors into two subcategories, Math-intensive STEM
Majors and Less Math-intensive STEM Majors, following Ceci et al. (2014) and Kahn and
Ginther (2017). There are 38 Math-intensive STEM majors/programs and 14 less
Math-intensive STEM majors/programs among all 108 undergraduate programs
offered by Purdue during this time period. I then categorize the STEM jobs into
two categories, Math-intensive STEM Jobs and Less Math-intensive STEM Jobs.
Table D1 shows the shares of students in each category by gender. Indeed,
women are relatively more represented than men in Less Math-intensive STEM
Majors. Specifically, 5.07% of women and 2.15% of men graduated from Less
Math-intensive STEM Majors. Female STEM graduates are also more likely to
work in Less Math-intensive STEM Jobs than male STEM graduates. Although this
pattern is similar to the one shown in Kahn and Ginther (2017), the group size of
Less Math-intensive STEM Majors is quite small, because Purdue is an
engineering-intensive institution and does not have a medical school. Figure D1
depicts the shares of students in each combination of major-job categories. It is
worth noting that two groups are tiny: Math-intensive STEM majors working in
Less Math-intensive STEM Jobs, and Less Math-intensive STEM majors working in
Math-intensive STEM Jobs. The main destination for a Math-intensive STEM graduate
is a Math-intensive STEM job or a non-STEM job, and the main destination for a
Less Math-intensive STEM graduate is a Less Math-intensive STEM job or a non-STEM job.
Overall, the retention rate of Math-intensive STEM is 79.49% for women and
84.78% for men.
To understand ability sorting under this stricter definition of STEM fields,
I re-estimate the choice models, replacing the outcome variable with an indicator
of majoring (working) in Math-intensive STEM Majors (Math-intensive STEM Jobs).
Table D2 shows the marginal effects of the two latent skills on selecting into
Math-intensive STEM Majors. The estimates are not significantly different from
those in Table 2. Similarly, the marginal effects of the two latent skills on
selecting into Math-intensive STEM Jobs, shown in Table D3, do not differ from
those in Table 3 in the main text.
Figure D1: Major to Job Flows
Table D1: Math-intensive and Less Math-intensive STEM Majors
Female Sample Male Sample
STEM Major 37.03% 63.40%
Math-intensive STEM 30.66% 59.16%
Less Math-intensive STEM 6.37% 4.24%
#Observations 1145 1910
STEM Job 73.11% 81.17%
Math-intensive STEM 65.80% 79.19%
Less Math-intensive STEM 7.31% 1.98%
#Observations 424 1211
Note: This table breaks the fraction of STEM degrees into two categories, math-intensive STEM
majors and less math-intensive STEM majors, and breaks the fraction of STEM jobs held by
STEM graduates into two categories: math-intensive STEM jobs and less math-intensive STEM
jobs. The left panel is the restricted sample and the right panel is the full sample. There are
32 Math-intensive STEM majors/programs and 16 less Math-intensive STEM majors/programs
among all 104 undergraduate programs offered by Purdue during this time period.
Table D2: Likelihood of Graduating with A Math-intensive STEM Major
(1) (2)
Female Male
General Academic Skill 0.0444*** 0.0630***
(0.0064) (0.0058)
STEM-specific Skill 0.0394*** 0.0560***
(0.0079) (0.0067)
N 1145 1910
Note: Columns (1) and (2) show the probit marginal effects at the means for
the female and male samples, respectively. The dependent variable is an indicator of
graduating in a Math-intensive STEM major. All estimates in the table reflect the change in the
probability of graduating in Math-intensive STEM from a one-unit increase in the corresponding
ability. The standard deviation of female (male) General Academic Skill is 3.576 (3.539); the
standard deviation of female (male) STEM-specific Skill is 2.801 (2.862). Thus a
one-standard-deviation increase in General Academic Skill will increase an average woman’s
(man’s) likelihood of graduating with a math-intensive STEM degree by 12.15 (23.00) p.p.; and a
one-standard-deviation increase in STEM-specific Skill will increase an average woman’s (man’s)
likelihood of graduating with a math-intensive STEM degree by 9.52 (15.74) p.p. The same set of
controls as in Table 2 is included here. Standard errors in parentheses. * p < 0.10, ** p < 0.05,
*** p < 0.01.
Table D3: Likelihood of Working in A Math-intensive STEM Job
(1) (2)
Female Male
General Academic Skill 0.0212* 0.0100*
(0.0049) (0.0057)
STEM-specific Skill 0.0085 0.0077
(0.0071) (0.0066)
N 424 1211
Note: Columns (1) and (2) show the probit marginal effects at the means for
the female and male samples, respectively. The dependent variable is an indicator of
working in a Math-intensive STEM job. All estimates in the table reflect the change in the
probability of working in a Math-intensive STEM job from a one-unit increase in the corresponding
ability. The standard deviation of female (male) General Academic Skill is 3.576 (3.539); the
standard deviation of female (male) STEM-specific Skill is 2.801 (2.862). Thus a
one-standard-deviation increase in General Academic Skill will increase an average woman’s
(man’s) likelihood of graduating with a math-intensive STEM degree by 12.15 (23.00) p.p.; and a
one-standard-deviation increase in STEM-specific Skill will increase an average woman’s (man’s)
likelihood of graduating with a STEM degree by 9.52 (15.74) p.p. The same set of controls as in
Table 3 is included here. Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
Appendix E Factors with Non-triangular Loadings
An alternative setting for the factor loadings is non-triangular, as follows.
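\[
\Lambda^{T} =
\begin{bmatrix}
\alpha^{T_1,A} & \alpha^{T_1,B} \\
\alpha^{T_2,A} & \alpha^{T_2,B} \\
\alpha^{T_3,A} & \alpha^{T_3,B} \\
\alpha^{T_4,A} & \alpha^{T_4,B} \\
\alpha^{T_5,A} & \alpha^{T_5,B} \\
\alpha^{T_6,A} & \alpha^{T_6,B}
\end{bmatrix}
=
\begin{bmatrix}
\alpha^{T_1,A} & 0 \\
\alpha^{T_2,A} & 0 \\
1 & 0 \\
0 & \alpha^{T_4,B} \\
0 & \alpha^{T_5,B} \\
0 & 1
\end{bmatrix}
\]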
where the first factor is identified only from the covariances of ACT English,
COM114, and ACT Reading. The second factor is identified from the covariances
of ACT Science, HSGPA, and ACT Math. As each factor is identified from a different
set of test scores, no orthogonality assumption is imposed here. Intuitively,
based on the nature of the scores used, Factor 1 plausibly represents verbal skills
while Factor 2 represents math skills. Compared to the main specification of the
factors in Section 3.2, this alternative setting sacrifices part of the covariances of
the scores by not using all six scores to identify the first factor. It might, however,
make the two factors easier to interpret or label. It also allows more variation
in the second factor, which is arguably more “STEM-related”. It is worth noting
that, because they are identified from a different set of scores, the non-triangular
factors are essentially different from the triangular factors and should not be
interpreted in the same way.
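The identification argument can be illustrated with a small simulation. The sketch below is a moment-based illustration only, not necessarily the estimation procedure behind Tables E1 and E2. It borrows the female loadings from Table E1 and the factor standard deviations from footnote 47, while the factor correlation, the measurement-error standard deviations, and the assumption that measurement errors are mutually uncorrelated are hypothetical choices made for the example. Because each score loads on a single factor, ratios of within-block covariances recover the free loadings even though the two factors are correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # large simulated sample so that sample covariances are precise

# Rows: ACT English, COM114, ACT Reading, ACT Science, HSGPA, ACT Math.
# Columns: Factor 1, Factor 2. Loadings taken from Table E1 (female sample).
loadings = np.array([
    [1.138, 0.0],
    [0.042, 0.0],
    [1.0,   0.0],   # normalized: Factor 1 takes the metric of ACT Reading
    [0.0, 0.697],
    [0.0, 1.811],
    [0.0, 1.0],     # normalized: Factor 2 takes the metric of ACT Math
])

# Factor standard deviations from footnote 47; the correlation is hypothetical.
sd_f1, sd_f2, corr = 3.448, 3.770, 0.5
factor_cov = np.array([
    [sd_f1 ** 2, corr * sd_f1 * sd_f2],
    [corr * sd_f1 * sd_f2, sd_f2 ** 2],
])
factors = rng.multivariate_normal(np.zeros(2), factor_cov, size=n)

# Hypothetical measurement-error standard deviations, independent across scores.
error_sd = np.array([2.0, 0.3, 2.0, 2.0, 4.0, 2.0])
errors = rng.normal(0.0, error_sd, size=(n, 6))

scores = factors @ loadings.T + errors
S = np.cov(scores, rowvar=False)

# Within-block covariance ratios: the factor variance cancels out.
alpha_english = S[0, 1] / S[1, 2]   # Cov(English, COM114) / Cov(COM114, Reading)
alpha_com114 = S[0, 1] / S[0, 2]    # Cov(English, COM114) / Cov(English, Reading)
alpha_science = S[3, 4] / S[4, 5]   # Cov(Science, HSGPA) / Cov(HSGPA, Math)
alpha_hsgpa = S[3, 4] / S[3, 5]     # Cov(Science, HSGPA) / Cov(Science, Math)

print(round(alpha_english, 3), round(alpha_com114, 3))
print(round(alpha_science, 3), round(alpha_hsgpa, 3))
```

No cross-block covariance enters these ratios, which is why the sketch does not need the two factors to be orthogonal.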
Tables E1 and E2 show the estimates of this alternative measurement system.
The coefficients on the controls are not much different from those in the main
specification. The loadings of Factor 1 on the first set of scores are significantly
positive, indicating that an increase in Factor 1 will significantly increase ACT English,
COM114, and ACT Reading, as expected. Similarly, an increase in Factor 2 will
significantly increase ACT Science, HSGPA, and ACT Math. For example,
a one-standard-deviation⁴⁷ increase in an average woman’s Factor 1 will increase
her ACT English by 3.92 points. A one-standard-deviation increase in an average
woman’s Factor 2 will increase her ACT Math by 3.77 points. Compared to the
main specification of the factors, the loadings of the new second factor have larger
magnitudes because it captures more of the variation in the scores.
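As a check on these magnitudes, the effect of a one-standard-deviation increase in a factor on a score is the loading multiplied by the factor's standard deviation. The symbols \sigma_A and \sigma_B (the standard deviations of women's Factor 1 and Factor 2 from footnote 47) are notation introduced here for illustration; the loadings are the female estimates in Table E1.

\[
\Delta ACT_{English} = \alpha^{T_1,A}\,\sigma_A = 1.138 \times 3.448 \approx 3.92,
\qquad
\Delta ACT_{Math} = 1 \times \sigma_B = 1 \times 3.770 = 3.77 .
\]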
I then estimate the same model to analyze the sorting effects in major choice
and job choice. The purpose of this estimation is to show the robustness of the
main results. Table E3 shows the estimates for major choice given the alternative
factors. Individuals sort positively on both skills. Specifically, a one-standard-deviation
increase in an average woman’s Factor 1 will increase her likelihood of
graduating in STEM by 5.23 percentage points; the corresponding number for an average
man is 6.42 percentage points. A one-standard-deviation increase in an average woman’s
Factor 2 will increase her likelihood of graduating in STEM by 15.83 percentage points,
and the corresponding number for an average man is 25.98 percentage points.
47 The standard deviation of Factor 1 is 3.448 for women and 3.572 for men; the standard
deviation of Factor 2 is 3.770 for women and 3.937 for men.
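The per-standard-deviation effects quoted for major choice can be reproduced, up to rounding, from the marginal effects and standard deviations reported in Table E3 and its note. The sketch below assumes the conversion is the simple linear one (marginal effect at the means times one standard deviation); the function and variable names are illustrative. Because the table reports rounded coefficients, the last digit can differ slightly from the figures in the text.

```python
# Marginal effects at the means (per one-unit increase), from Table E3.
marginal_effects = {
    ("Female", "Factor 1"): 0.015,
    ("Male", "Factor 1"): 0.018,
    ("Female", "Factor 2"): 0.042,
    ("Male", "Factor 2"): 0.066,
}

# Factor standard deviations, from the note to Table E3.
factor_sd = {
    ("Female", "Factor 1"): 3.488,
    ("Male", "Factor 1"): 3.572,
    ("Female", "Factor 2"): 3.770,
    ("Male", "Factor 2"): 3.937,
}


def per_sd_effect_pp(marginal_effect, sd):
    """Linear approximation of a one-SD effect, in percentage points."""
    return 100.0 * marginal_effect * sd


effects_pp = {
    key: per_sd_effect_pp(me, factor_sd[key])
    for key, me in marginal_effects.items()
}

for (gender, factor), value in effects_pp.items():
    print(f"{gender:6s} {factor}: {value:.2f} p.p.")
```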
Both genders sort more on Factor 2 than on Factor 1. This is not surprising:
the second factor now takes all common variation from ACT Science, HSGPA, and
ACT Math, in contrast to the residual variation of these scores after the first factor
has been identified. Additionally, it is intuitive that Factor 2 is more essential to
the choice between STEM and non-STEM fields. Similar to the estimates in
the main specification, men sort more on both skills here as well: men’s
coefficients are statistically larger than women’s. In job choice, Table E4 shows
no sorting on Factor 1 for either gender. Although there is positive sorting on
Factor 2, the gender difference is not significantly different from zero. Overall,
the estimates under both specifications of the latent skill structure are
qualitatively consistent: men sort more on both latent skills in major
choice, and there is no gender difference in ability sorting in job choice.
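One rough way to gauge the gender difference in Factor 2 sorting from the published tables is a two-sample z-test on the reported marginal effects. This is only a back-of-the-envelope sketch: it assumes the female and male estimates are independent and approximately normal, it uses the rounded values from Tables E3 and E4, and it is not necessarily the test used in the paper. The function name is illustrative.

```python
import math


def difference_z_test(b_male, se_male, b_female, se_female):
    """z-statistic and two-sided p-value for b_male - b_female,
    assuming independent, approximately normal estimates."""
    diff = b_male - b_female
    se_diff = math.sqrt(se_male ** 2 + se_female ** 2)
    z = diff / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Factor 2, major choice (Table E3): male 0.066 (0.0052), female 0.042 (0.0059).
z_major, p_major = difference_z_test(0.066, 0.0052, 0.042, 0.0059)

# Factor 2, job choice (Table E4): male 0.013 (0.0050), female 0.023 (0.0108).
z_job, p_job = difference_z_test(0.013, 0.0050, 0.023, 0.0108)

print(f"major choice: z = {z_major:.2f}, p = {p_major:.3f}")
print(f"job choice:   z = {z_job:.2f}, p = {p_job:.3f}")
```

Under these assumptions the Factor 2 gap is large relative to its standard error in major choice and small in job choice, in line with the qualitative conclusion stated above.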
Table E1: Non-triangular Latent Skills at College Entrance, Female
Dependent Var →              ACT E       COM114      ACT R       ACT S       HSGPA       ACT M
Factor 1                     1.138***    0.042***    1
                             (0.049)     (0.005)     X
Factor 2                                                         0.697***    1.811***    1
                                                                 (0.027)     (0.095)     X
Home Region: Indiana         -0.754      -0.103      -0.911      -1.710***   0.270       -1.519**
                             (0.749)     (0.092)     (0.804)     (0.585)     (1.995)     (0.653)
Home Region: Midwest         0.981       -0.128*     0.107       -0.322      -4.285**    -0.044
                             (0.752)     (0.097)     (0.824)     (0.628)     (2.129)     (0.705)
Home Region: Northeast       -1.273      -0.194*     -0.793      -1.453*     -3.334      -0.814
                             (1.218)     (0.146)     (1.298)     (0.883)     (3.071)     (0.949)
Home Region: South           2.323**     -0.049      1.659       0.336       0.472       0.992
                             (1.129)     (0.119)     (1.149)     (0.773)     (2.594)     (0.883)
AFGR                         0.117***    0.012**     0.097**     0.103***    0.579***    0.109**
                             (0.039)     (0.005)     (0.043)     (0.034)     (0.113)     (0.038)
First Term Semester: Fall    1.862*      -0.034      2.574*      1.526       8.516**     2.423
                             (1.052)     (0.170)     (1.275)     (1.315)     (4.201)     (1.599)
First Term Semester: Spring  -1.521      0.037       0.752       -0.251      -2.37       -0.100
                             (1.607)     (0.254)     (1.915)     (1.966)     (6.270)     (2.398)
Constant                     14.669***   2.770***    16.279***   16.168***   -14.099     15.597***
                             (3.054)     (0.418)     (3.425)     (2.957)     (9.701)     (3.484)
Observations                 1,145
Note: Each column is a separate regression specified in Equation 7. All columns have the same number of observations: 1,145.
The loading of Factor 1 is normalized to one in the regression of ACT Reading, so that Factor 1 takes the metric of
ACT Reading. The loading of Factor 2 is normalized to one in the regression of ACT Math, so that Factor 2 takes the metric
of ACT Math. I control for the annual state-averaged freshman graduation rate (AFGR) in the year each student
graduated from high school, home census region fixed effects, and first enrollment semester fixed effects. Standard errors
in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
Table E2: Non-triangular Latent Skills at College Entrance, Male
                             ACT E       COM114      ACT R       ACT S       HSGPA       ACT M
Factor 1                     1.180***    0.043***    1
                             (0.049)     (0.004)     X
Factor 2                                                         0.808***    1.684***    1
                                                                 (0.027)     (0.079)     X
Home Region: Indiana         -1.788***   -0.066      -1.707***   -1.726***   0.335       -1.206***
                             (0.692)     (0.079)     (0.698)     (0.548)     (1.595)     (0.544)
Home Region: Midwest         -0.590      -0.199**    -0.799      -0.276      -4.919**    -0.160
                             (0.714)     (0.083)     (0.726)     (0.571)     (1.673)     (0.563)
Home Region: Northeast       -1.039      -0.195*     -0.941      -0.335      -3.523      -0.406
                             (1.019)     (0.118)     (1.035)     (0.758)     (2.300)     (0.712)
Home Region: South           0.476       -0.007      0.296       0.244       0.064       0.776
                             (0.757)     (0.091)     (0.7799)    (0.626)     (1.844)     (0.613)
AFGR                         0.168***    0.018***    0.090**     0.115***    0.664***    0.131***
                             (0.029)     (0.004)     (0.033)     (0.029)     (0.086)     (0.027)
First Term Semester: Fall    4.837***    -0.212      3.515**     5.558***    13.106***   5.817***
                             (1.076)     (0.180)     (1.276)     (1.152)     (3.545)     (1.063)
First Term Semester: Spring  2.705***    -0.236      1.210       4.589***    10.850**    4.744***
                             (1.354)     (0.225)     (1.601)     (1.417)     (4.403)     (1.289)
Constant                     8.7869***   2.2636***   16.952***   13.233***   -26.697***  12.888***
                             (2.5743)    (0.379)     (2.8526)    (2.437)     (7.402)     (2.296)
Observations                 1,910
Note: Each column is a separate regression specified in Equation 7. All columns have the same number of observations: 1,910.
The loading of Factor 1 is normalized to one in the regression of ACT Reading, so that Factor 1 takes the metric of
ACT Reading. The loading of Factor 2 is normalized to one in the regression of ACT Math, so that Factor 2 takes the metric
of ACT Math. I control for the annual state-averaged freshman graduation rate (AFGR) in the year each student
graduated from high school, home census region fixed effects, and first enrollment semester fixed effects. Standard errors
in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
Table E3: Likelihood of Graduating with A STEM Major, Non-triangular
(1) (2)
Female Male
Factor 1 0.015** 0.018***
(0.0071) (0.0064)
Factor 2 0.042*** 0.066***
(0.0059) (0.0052)
N 1145 1910
Note: This table differs from Table 2 in the loading structure of the two factors. Columns
(1) and (2) show the probit marginal effects at the means for the female and male samples,
respectively. All marginal effects reflect changes in the probability of graduating in STEM with a
one-unit increase in the corresponding ability. The standard deviation of Factor 1 is 3.488 for
women and 3.572 for men; the standard deviation of Factor 2 is 3.770 for women and 3.937 for men.
The dependent variable in both columns is a dummy for majoring in STEM. The number of
Purdue graduates in the same major, the number of Purdue female graduates in the same major, first
enrollment year, first enrollment semester, and degree year fixed effects are controlled for but not
shown for brevity. Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
Table E4: Likelihood of STEM Graduates Working in STEM Occupations, Non-triangular
(1) (2)
Female Male
Factor 1 0.001 0.004
(0.0119) (0.0062)
Factor 2 0.023** 0.013***
(0.0108) (0.0050)
N 1145 1910
Note: This table differs from Table 3 in the loading structure of the two factors. Columns
(1) and (2) show the probit marginal effects at the means for the female and male samples,
respectively. All marginal effects reflect changes in the probability of working in STEM with a
one-unit increase in the corresponding ability of STEM graduates. The standard deviation of
Factor 1 is 3.488 for women and 3.572 for men; the standard deviation of Factor 2 is 3.770 for
women and 3.937 for men. The dependent variable in both columns is a dummy for working in
STEM. The number of Purdue graduates in the same major, the number of Purdue female graduates in
the same major, home state STEM demand, degree year fixed effects, and home region fixed effects
are controlled for but not shown for brevity. Standard errors in parentheses. * p < 0.10, **
p < 0.05, *** p < 0.01.
Appendix F Supporting Tables and Figures
Table F1: Correlation between Working in STEM Fields and Working in the
Home State
                                     (1)               (2)
                                     Male STEM Grads   Female STEM Grads
Indicator of Working in Home State   0.00003           -0.0855*
                                     (0.0237)          (0.0485)
N                                    1211              424
Note: Columns (1) and (2) show the correlation between working in a STEM job and working in the
home state for male and female STEM graduates, respectively. The outcome variable in both columns
is an indicator of working in STEM, and the key variable of interest is an indicator of
working in the home state. Home state fixed effects and degree calendar year are controlled for
but not shown for brevity. Standard errors in parentheses. * p < 0.10.
Table F2: STEM Graduates being Home or Away and Their Earnings
                   non-STEM Jobs               STEM Jobs
                   Salary     #Observations    Salary     #Observations
Panel A. Males
Away               57153      133              61117      587
                   (12595)    (18.44%)         (11534)    (81.56%)
Home               50445      95               55040      396
                   (13303)    (19.35%)         (9237)     (80.65%)
Panel B. Females
Away               52461      71               59560      220
                   (11885)    (24.4%)          (11127)    (75.6%)
Home               41111      43               55152      90
                   (14565)    (32.3%)          (11163)    (67.7%)
Note: Panels A and B show summary statistics for male and female STEM graduates, respectively.
Numbers in parentheses under salary are standard deviations, and numbers in parentheses under
#Observations are shares. “Home” means working in one’s home state (reported at college
entrance); otherwise “Away”.
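For readers who want to work with these cells directly, the sketch below recomputes the share of STEM graduates in non-STEM jobs and the raw difference in mean salary between STEM and non-STEM jobs for each group, using only the means and counts in Table F2. The variable and function names are illustrative, and the gaps are purely descriptive differences in means, with no controls.

```python
# (mean salary, number of observations) for each cell of Table F2.
table_f2 = {
    ("Male", "Away"): {"non_stem": (57153, 133), "stem": (61117, 587)},
    ("Male", "Home"): {"non_stem": (50445, 95), "stem": (55040, 396)},
    ("Female", "Away"): {"non_stem": (52461, 71), "stem": (59560, 220)},
    ("Female", "Home"): {"non_stem": (41111, 43), "stem": (55152, 90)},
}


def summarize(cell):
    """Return (share in non-STEM jobs in percent, STEM minus non-STEM mean salary)."""
    salary_non, n_non = cell["non_stem"]
    salary_stem, n_stem = cell["stem"]
    share_non_stem = 100.0 * n_non / (n_non + n_stem)
    salary_gap = salary_stem - salary_non
    return share_non_stem, salary_gap


summary = {group: summarize(cell) for group, cell in table_f2.items()}

for (gender, location), (share, gap) in summary.items():
    print(f"{gender:6s} {location:4s}: {share:5.2f}% in non-STEM jobs, salary gap {gap}")
```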
Table F3: Reason for Working outside the Field of the Highest Degree
                                                        All               STEM Major, non-STEM Job
                                                        Female   Male     Female   Male
1: Pay, promotion opportunities                         18.06    24.58    23.04    28.99
2: Working conditions (hours, equip., working envir.)   11.87    8.06     10.75    8.86
3: Job location                                         8.71     5.59     7.15     5.49
4: Change in career or professional interests           16.04    15.73    16.25    16.64
5: Family-related reasons                               8.33     5.20     7.49     4.33
6: Job in highest degree field not available            28.66    30.17    28.14    26.70
7: Other reason for not working                         8.33     10.66    7.19     8.98
Total                                                   100      100      100      100
Note: Data source: NSCG. This table shows the most important reason for working outside the
field of study for each gender among all bachelor degree holders (degree year 2007–2014) and
STEM bachelor degree holders who have non-STEM jobs.
Table F4: Correlation between Stereotype Adherence and Opting-out of STEM
                               (1)               (2)
Dep. Var.: STEM Job = 1        Male STEM Grads   Female STEM Grads
Stereotype Adherence Index     -0.077            -0.210*
                               (0.065)           (0.125)
N                              1211              424
Summary Stats                  Mean    SD      Min    Max
Stereotype Adherence Index     2.02    0.32    1.4    3.1
Note: The upper panel shows the regression coefficients and the lower panel shows summary
statistics of the state-level stereotype adherence index. Columns (1) and (2) show the correlation
between the index and working in STEM for males and females separately. The dependent variable
is a dummy for working in STEM. The variable of interest is the Stereotype Adherence Index, which is
the average of the male–female ratios in math and science and the female–male ratio in reading
for the top 5 percent of students (Pope and Sydnor, 2010). Degree calendar year fixed effects are
controlled for but not shown for brevity. Standard errors in parentheses. * p < 0.10.