
Gender differences in individual variation in academic grades fail to fit expected patterns for STEM


Abstract

Fewer women than men pursue careers in science, technology, engineering and mathematics (STEM), despite girls outperforming boys at school in the relevant subjects. According to the 'variability hypothesis', this over-representation of males is driven by gender differences in variance; greater male variability leads to greater numbers of men who exceed the performance threshold. Here, we use recent meta-analytic advances to compare gender differences in academic grades from over 1.6 million students. In line with previous studies we find strong evidence for lower variation among girls than boys, and of higher average grades for girls. However, the gender differences in both mean and variance of grades are smaller in STEM than non-STEM subjects, suggesting that greater variability is insufficient to explain male over-representation in STEM. Simulations of these differences suggest the top 10% of a class contains equal numbers of girls and boys in STEM, but more girls in non-STEM subjects.
R.E. O'Dea 1,2, M. Lagisz 1, M.D. Jennions 2 & S. Nakagawa 1
DOI: 10.1038/s41467-018-06292-0 OPEN
1 Evolution and Ecology Research Centre, School of Biological and Environmental Sciences, University of New South Wales, Sydney 2052 NSW, Australia.
2 Research School of Biology, Australian National University, Canberra 2601 ACT, Australia. These authors contributed equally: R.E. O'Dea, M. Lagisz.
Correspondence and requests for materials should be addressed to R.E.O'D. or to S.N.
NATURE COMMUNICATIONS | (2018) 9:3777 | DOI: 10.1038/s41467-018-06292-0 | 1
Content courtesy of Springer Nature, terms of use apply. Rights reserved
A child entering school has endless answers to the question
'what do you want to be when you grow up?' By the end
of school, these have narrowed to a set of career aspirations
that are consistent with his or her self-concept (the way an
individual perceives themselves, and believes they are perceived
by others [1]). If the child is a girl, then she is likely to graduate with
career aspirations with lower earning potential than a male
classmate [2]. This phenomenon contributes to occupational segregation,
and there are numerous incentives to reduce its prevalence.
Schooling has a strong influence on the career aspirations
of students [3], so addressing gender differences in the workforce
requires that we understand how gender affects school
Self-concept is heavily influenced by school achievement [1,4],
and high-performing students are more likely to pursue well-paid
careers, such as science, technology, engineering and mathematics
(STEM)-based jobs [5]. Girls tend to earn higher school grades than
boys, including in STEM subjects [6], so why does this advantage
not transfer into the workforce? The 'variability hypothesis', also
called the 'greater male variability hypothesis', has been used to
explain this apparent contradiction [7]. It is based on the tendency
for males to show greater variability than females for psychological
traits [8] (and for other traits across multiple species [9]), leading
to relatively fewer females with exceptional ability [10]. However,
the gender gap in employment within many highly paid occupations
exceeds gender differences in variability (e.g. some math-intensive
occupations employ far fewer women than the proportion
of girls who score in the top 1% of maths tests [11]).
Therefore, occupational segregation cannot be simply caused by
fewer women having the requisite ability for high-status jobs.
Girls are susceptible to conforming to stereotypes ('stereotype
threat' [12]) in the traditionally male-dominated fields of STEM, and
girls who try to succeed in these fields are hindered by backlash
effects [13]. STEM fields are high-paying, employ fewer women
than men [14,15], and require a high level of mathematical
ability [16]. Evidence from standardised tests administered to children
and adolescents indicates a greater gender difference in variation
in performance in STEM subjects than other subjects [17–19], and
an excess of males amongst the top-achieving students [20–22].
Therefore, a girl who performs well at school may notice that a
greater proportion of the students who do better than her in
mathematics and science classes are male, when compared to the
proportion in other subjects. This, when combined with stereotype
threat and the risk of backlash for behaving against gender
stereotypes [13], could deter girls from pursuing a STEM-related career.
Based on this hypothesis, and assuming equivalency of gender
differences for standardised tests and class grades, we present an
illustration of the predicted grade distributions for female and male
students in Fig. 1.
Gender differences in variability have been tested using scores
on standardised tests [19,23], but we are unaware of any study
describing gender differences in the variability of teacher-assigned
grades. While there are moderate-to-strong correlations (sensu [24])
between grades and test scores [25–28], there is also a stark gender
difference. Girls tend to receive lower test scores relative to their
school grades, whereas boys receive higher test scores relative to
their school grades. There are multiple conjectures to explain this
discrepancy in mean gender differences between tests and grades
(e.g. on average, girls behave better, which gives them an
advantage in grades, but they fare worse when tested on novel
material that was not covered in class) [29]. Regardless of the source
of these differences, teacher-assigned grades are likely to affect
students' lives, and it is a reasonable conjecture that they have a
greater impact on students' academic self-concept than standardised
test scores [1]. Furthermore, grades are at least as good a
predictor of success at university (measured by grade point
average and graduation rate) as standardised test scores [30,31].
Therefore, if gender differences
in variability were impacting girls' decisions to pursue STEM, we
would expect to see these differences reflected in school grades.
Here, we present a systematic meta-analysis on the effect of
gender on variance in academic achievement using teacher-assigned
grades. While grades are a more subjective measurement
than test scores, we also include data from university students,
whose grades are less affected by teachers' assessment of behaviour.
While earlier meta-analyses have examined how mean
academic achievement differs between the sexes [6,32,33], mean and
variance differences should be examined together, as their magnitudes
can be correlated (mean–variance relationship [34]).
Fortunately, a recently published method allows for a meta-analytic
comparison of variances that takes into account any
mean–variance relationship [35].
Based on the variability hypothesis, we expected female grades
to be less variable than those of males. To test this hypothesis, we
extended a previous meta-analysis by Voyer and Voyer [6] on differences
in the mean grades of students from ages 6 through to
university. We used a more appropriate effect size to compare
means, and another effect size to compare variances (Methods).
We found that grades for female students were less variable than
male grades. Then, focusing on school students (a relatively
unbiased sample compared to university students), we found that:
(1) the gender difference in variability has not changed noticeably
over the last 80 years (1931–2013); (2) gender differences in grade
variability are already present in childhood, and do not increase
during adolescence; (3) finally, gender differences in grade variance
were smaller for STEM than non-STEM subjects, contrary to
our expectations shown in Fig. 1.
Description of dataset. Our dataset contained 346 effect sizes
extracted from 227 studies (Supplementary Data 1), representing
820,158 female and 826,629 male students. Fifty-two percent of
the effect sizes were for 'global' grades (i.e. GPA), 26% were for
STEM (mathematics and science), 19% for non-STEM (language,
humanities, social science) and 3% for miscellaneous subjects.
North American data dominated the dataset, with 70% of the
effect sizes. Within the North American sample, 24% of studies
were on a racially diverse cohort of students, 23% were on
majority White/Caucasian students, 9% were on majority Black/
African American students, 1% were on majority Hispanic/Latino
students, and 43% of studies did not provide information on the
racial composition of students. In total, 62% of the effect sizes
came from school students (247,582 girls and 253,073 boys), and
the remainder from university students. The original grades were
awarded on a few different grading scales (Supplementary Figs. 1
and 2).
Gender differences in variability. Overall, girls had significantly
higher grades than boys by 6.3% (natural logarithm of response
ratio, lnRR_{mean}: 0.061, 95% confidence interval, CI:
0.052 to 0.070), with 10.8% less variation among girls than
among boys (natural logarithm coefficient of variation ratio,
lnCVR_{variance}: −0.114, CI: −0.133 to −0.095) (Supplementary
Table 2; Fig. 2). The gender differences in mean grades
were significantly larger at school than at university by 2.7%
(lnRR_{school–uni diff}: −0.028, CI: −0.044 to −0.011; Supplementary
Table 3). The gender differences in variation were also larger at
school than at university, but the difference of 4.2% was
non-significant (lnCVR_{school–uni diff}: 0.041, CI: −0.002 to 0.080;
Supplementary Table 3). To test for moderating factors, we only used
the school data in subsequent analyses. We excluded university
students because there is self-selection among students in terms
of who applies for (and is then accepted at) a university. This
selection process makes undergraduates and postgraduates
unrepresentative of the general population. The results from
analyses for the whole dataset and for the university subset are
provided in Supplementary Tables 2–10, 12, and 15–25 (the
university subset also had small sample sizes for STEM and non-
STEM subjects, making results from moderator analyses sensitive
to outlier studies).
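
The percentages quoted in this section follow from back-transforming the log ratios, i.e. exp(estimate) − 1. A minimal sketch of that conversion is given below; the helper name `log_ratio_to_percent` is introduced here for illustration only and is not taken from the study's code.

```python
import math


def log_ratio_to_percent(log_ratio: float) -> float:
    """Back-transform a log ratio (lnRR or lnCVR) into a percentage difference.

    A positive result means the numerator group (girls) is higher;
    a negative result means it is lower.
    """
    return (math.exp(log_ratio) - 1.0) * 100.0


# Overall estimates reported in the text (girls relative to boys).
print(round(log_ratio_to_percent(0.061), 1))   # lnRR  -> 6.3 (% higher mean grades)
print(round(log_ratio_to_percent(-0.114), 1))  # lnCVR -> -10.8 (% less variation)
```

The same transformation applies to the moderator estimates reported below, so a reader can check any quoted percentage against its log-scale estimate.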
Moderating effects of study year and student age. The higher
mean and lower variability of girls' than boys' grades have not
changed significantly over the past eight decades (Supplementary
Table 4, Supplementary Fig. 8A: lnRR_{study year scaled (slope)}: 0.019,
CI: −0.017 to 0.055; Supplementary Table 4, Supplementary Fig. 8D:
lnCVR_{study year scaled (slope)}: −0.029, CI: −0.083 to 0.025). Within
genders, variability in grades showed a non-significant trend towards
decreasing over time, but significantly more so for girls than boys
(Supplementary Table 5, Supplementary Fig. 8G: natural logarithms
of the coefficient of variation, lnCV_{study year boys–girls (slope diff)}: 0.032,
CI: 0.004 to 0.060). Student age did not affect the gap between girls'
and boys' mean grades or the gender difference in grade variability
(Supplementary Fig. 9, Supplementary Table 6). Within genders,
variability in grades showed a non-significant tendency to decrease as
students aged (Supplementary Table 7, Supplementary Fig. 9G:
lnCV_{student age boys–girls (slope)}: 0.010, CI: −0.067 to 0.087), and to
decrease faster for boys than girls (Supplementary Table 7,
Supplementary Fig. 9G: lnCV_{student age boys–girls (slope diff)}: −0.035,
CI: −0.062 to −0.007).
Moderating effects of subject type: STEM versus non-STEM.
Girls' significant advantage of 7.8% in mean grades in non-STEM
was more than double their 3.1% advantage in STEM (Fig. 2a,
Supplementary Table 8: non-STEM: lnRR: 0.075, CI:
0.049 to 0.102; STEM: lnRR: 0.031, CI: 0.011 to 0.051; the
difference: lnRR_{non-STEM–STEM diff}: −0.044, CI: −0.065 to −0.024).
Variation in grades among girls was significantly lower than that
among boys in every subject type, but the sexes were more similar
in STEM than non-STEM subjects (Fig. 2b, Supplementary
Table 9; STEM: 7.6% less variable grades; lnCVR: −0.079,
CI: −0.115 to −0.043; non-STEM: 13.8% less variable grades;
lnCVR: −0.149, CI: −0.199 to −0.099; the difference:
lnCVR_{non-STEM–STEM diff}: 0.070, CI: 0.028 to 0.111). The greater
gender similarity in variability in STEM was due to girls' grades
being significantly more variable in STEM than non-STEM
subjects (Fig. 2c, Supplementary Table 10, lnCV_{girls STEM–non-STEM}:
−0.101, CI: −0.170 to −0.033). In contrast, the variability of
boys' grades did not differ significantly between STEM and
non-STEM subjects (Fig. 2c, Supplementary Table 10,
lnCV_{STEM–non-STEM diff}: −0.030, CI: −0.102 to 0.042).
The small values of all meta-analytic estimates of
gender differences in means and variances imply a large
overlap in the grade distributions between the two sexes.
The simulated distributions of girls' and boys' grades in Fig. 3
show the distributions of grades overlap more in STEM
(94.2%) than non-STEM (88.2%) subjects. For example, within
the top 10% of the distribution the gender ratio is even for
STEM, and slightly female-skewed for non-STEM. Results of
additional analyses are presented in Supplementary
Tables 13–25.
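
To illustrate how small differences in mean and variability translate into distribution overlap and tail ratios, the following is a minimal sketch under stated assumptions; it is not the simulation used for Fig. 3. It assumes normally distributed grades, equal numbers of girls and boys, an arbitrary boys' mean of 100 and a hypothetical boys' coefficient of variation of 0.2. The function names are ours, and because the outputs depend on the assumed coefficient of variation they are not expected to reproduce the published overlap or f:m values.

```python
import math
from statistics import NormalDist


def build_distributions(ln_rr, ln_cvr, boys_mean=100.0, boys_cv=0.2):
    """Return (girls, boys) normal distributions implied by lnRR and lnCVR.

    boys_mean and boys_cv are hypothetical baseline values chosen for illustration.
    """
    boys = NormalDist(boys_mean, boys_cv * boys_mean)
    girls_mean = boys_mean * math.exp(ln_rr)   # lnRR  = ln(mean_f / mean_m)
    girls_cv = boys_cv * math.exp(ln_cvr)      # lnCVR = ln(CV_f / CV_m)
    girls = NormalDist(girls_mean, girls_cv * girls_mean)
    return girls, boys


def girls_to_boys_ratio_in_top(girls, boys, top_fraction):
    """f:m ratio above the cutoff that selects the top `top_fraction` of a 50:50 class."""
    spread = 10 * max(girls.stdev, boys.stdev)
    lo = min(girls.mean, boys.mean) - spread
    hi = max(girls.mean, boys.mean) + spread
    for _ in range(200):  # bisection on the pooled upper-tail probability
        mid = (lo + hi) / 2
        pooled_tail = 0.5 * ((1 - girls.cdf(mid)) + (1 - boys.cdf(mid)))
        if pooled_tail > top_fraction:
            lo = mid  # cutoff too low: too many pupils above it
        else:
            hi = mid
    cutoff = (lo + hi) / 2
    return (1 - girls.cdf(cutoff)) / (1 - boys.cdf(cutoff))


# School-subset estimates from the text; STEM lnCVR taken as ln(1 - 0.076), about -0.079.
for label, ln_rr, ln_cvr in [("STEM", 0.031, -0.079), ("non-STEM", 0.075, -0.149)]:
    girls, boys = build_distributions(ln_rr, ln_cvr)
    overlap = girls.overlap(boys)  # shared area under the two density curves
    ratio = girls_to_boys_ratio_in_top(girls, boys, 0.10)
    print(f"{label}: overlap = {overlap:.3f}, f:m in top 10% = {ratio:.2f}")
```

Changing `boys_cv` shifts both the overlap and the tail ratio, which is why the baseline variability has to be stated alongside any such simulation.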
Our overall result was consistent with elements of the variability
hypothesis: female students' grades were less variable than those
of male students, but in contrast to expectations, the greatest
difference in variability occurred in non-STEM subjects. Average
female grades were also higher than males', corroborating the
findings of Voyer and Voyer [6] (Fig. 2). Gender differences in grade
variability of school pupils were unaffected by their age, weakly
affected by the year of study, and most strongly affected by
whether or not the subject was STEM.
From grade one onward, we found that girls' grades were less
variable than those of boys. Across the last 80 years, the variability
in school grades has slightly decreased for both boys and girls
(albeit slightly faster for girls). This decline might reflect
increased student performance [36], or greater reluctance to fail
students, i.e. grade inflation [37]. These scenarios assume that there
is a ceiling effect on grades, whereby variance is reduced because
weaker students are shifted upwards, whereas the highest performing
students are bumped up against the 'ceiling' of the
highest possible grade awarded on the grading scale. Although we
do not see strong evidence for a ceiling effect in our dataset
(Supplementary Fig. 5), below we discuss how the ceiling effect
could underestimate the magnitude of gender differences in
[Figure 1: three panels of predicted grade distributions for girls and boys (x-axis: Grades). Panel annotations: a (all grades) on average girls' grades better, girls' grades more consistent, fewer top-scoring girls; b (non-STEM) on average girls' grades much better, girls' grades similarly variable, more top-scoring girls; c (STEM) on average girls' grades slightly better, girls' grades much more consistent, many fewer top-scoring girls.]
Fig. 1 Predicted distributions of school grades of girls (red) and boys (blue). a The grade distribution overlaps represent the prediction that, when all grades
are considered, girls on average earn higher grades and are less variable than boys, although there are more highly performing boys than girls at the upper
end of the achievement distribution. b In non-STEM subjects, the difference in mean grades between girls and boys may be even more pronounced in
favour of girls, which, coupled with similar variability, should result in many more highly performing girls than boys at the upper end of the achievement
distribution. c In contrast, for STEM grades, we expected less difference between boys' and girls' mean grades and more grade variability for boys, resulting
in boys dominating at both the top and bottom of the achievement distribution
Contrary to our expectations (Fig. 1), and those of many
others [10], the gender difference in variability was smaller for
STEM than non-STEM subjects (Fig. 2). When the small gender
gap in grade variability is combined with the small gender difference
in mean grades, it indicates that in STEM subjects, the
distributions of girls' and boys' grades are more similar than in
non-STEM subjects (Fig. 3). One possible explanation is that
boys are more affected by the ceiling effect in STEM than
non-STEM. For example, if a grading scale cannot distinguish between
students in the top 1% or top 0.1%, and if there exists a male skew
in the top 0.1% only in STEM but not in non-STEM, then gender
differences in variance would be underestimated in STEM. Wai
et al. [22] tried to get around this ceiling effect by analysing
seventh-grade test scores explicitly designed to differentiate between
exceptional students. They found a female:male ratio of 0.25 in
the top 1% of students in STEM subjects, which is more imbalanced
than our data suggest (Fig. 3c). While this finding is
intriguing, it should be noted that STEM careers are not restricted
to the exceptionally talented (although fields that subscribe to the
belief that talent is important for success tend to employ fewer
women [38]). Therefore, while our data do not preclude a gender
gap among the exceptionally talented, they nevertheless indicate a
practical similarity in girls' and boys' academic achievements,
which are likely to provide an imperfect but valid measure of the
ability to pursue STEM (Fig. 3).
Because students' grades impact their academic self-concept
and predict their future educational attainment (e.g. refs. 1,5), we
might therefore predict roughly equal participation of men and
women in STEM careers. However, the equivalence of girls' and
boys' performance in STEM subjects in school does not translate
into equivalent participation in STEM later in life. Is this because
grades are not measuring the abilities required to succeed in
STEM? Or does the relative advantage girls have over boys in
non-STEM subjects at school lead them to rationally favour
career choices with fewer competitors? We consider each of these
questions in turn.
We analysed school grades, where girls show a well-established
advantage over boys [25], whereas most previous tests of gender
differences in variability have focussed on test scores [18,19,23]. To
explore whether the smaller variability difference in STEM
compared to non-STEM is confined to school grades, we performed
a supplementary analysis of a large international dataset
of standardised test scores of 15-year-olds (see Supplementary
Note 2 for details). This supplementary analysis found gender
differences in variance that were consistent across subjects; girls'
test scores were more consistent than boys', with equivalent
gender differences in non-STEM and STEM subjects
(Supplementary Fig. 11). However, girls only showed a mean advantage
in non-STEM. Therefore, it appears that the mean differences
between test scores and grades are caused by shifts in the position
of girls' and boys' distributions, rather than changes in the shape
of distributions in STEM compared to non-STEM (girls'
distributions of both grades and test scores are narrower than boys'
distributions, but the difference is not more pronounced in
STEM). If girls perceive they have fewer competitors in
non-STEM subjects because, on average, fewer boys perform better
than girls, this might lead to a preference for non-STEM over
STEM careers [39,40].
Gender differences in expectations of success can arise due to
backlash effects against individuals who defy the stereotype of
their gender, and/or due to gender differences in 'abilities tilt'
(having comparatively high ability in one discipline compared to
another). Women in male-dominated pursuits, including STEM,
face a paradox: if they conform to gender stereotypes, they might
be perceived as less competent, but if they defy gender stereotypes
and perform 'like a man', then their progress can be halted by
'backlash' from both men and women [13,41]. Furthermore, analyses
of test scores have revealed that girls are more likely than boys to
show an abilities tilt in the direction favouring non-STEM subjects
(i.e. receive higher scores in non-STEM compared to
STEM) [42]. Our data are consistent with girls showing an ability tilt
in the direction of non-STEM subjects, although we cannot
compare individual student grades (Supplementary Table 11).
Intriguingly, there is evidence that balanced high-achieving
[Figure 2: three panels of meta-analytic estimates. a x-axis: girls:boys ratio of the mean grades (lnRR); annotation: girls higher grades than boys. b x-axis: girls:boys ratio of the variability of grades (lnCVR); annotation: girls more consistent grades than boys. c x-axis: grade variabilities (lnCV) for each gender; annotation: more consistent grades.]
Fig. 2 Main meta-analytic results. Results of analyses on a ratios of the
grade means, b ratios of grade variabilities, and c coefficients of variation
for girls (red) and boys (blue). Diamonds and circles represent
meta-analytic estimates of mean effect sizes, and their 95% confidence intervals
are drawn as whiskers. In a, natural logarithm of response ratio (lnRR)
represents the average difference between girls' and boys' mean grades;
positive values of lnRR indicate lower boys' mean grades. In b, natural
logarithm coefficient of variation ratio (lnCVR) represents the average
difference in grade variation between boys and girls; negative values of
lnCVR indicate greater male variance. In c, natural logarithms of the
coefficient of variation (lnCVs) are shown for boys and girls to illustrate
grade variation by gender; more negative values of lnCV indicate less
variation. Data and code for reproducing this figure are available at
refs. 52,53
students (who possess the potential to succeed in disparate
fields) prefer non-STEM careers [43], and that girls are more likely
to be balanced than boys, at least among high achievers [44]. A
female skew towards balanced abilities could be a manifestation
of them showing lower levels of between-discipline variability (i.e.
greater consistency across disciplines). Gender differences in
between-discipline variability, rather than within-discipline
variability, are an interesting avenue for future research.
A girl's answer to the question of 'what do you want to be when
you grow up?' will be shaped by her own beliefs about gender, and
results support the variability hypotheses, we have shown that the
magnitude of the gender gap in STEM grades is small, and only
becomes male-skewed at the very top of the distribution (Fig. 3).
to have earned high enough grades to pursue a career in STEM.
When she evaluates her options, however, the STEM path is trod by
more male competitors than non-STEM, and presents additional
internal and external threats due to her and society's gendered
beliefs (stereotype threat and backlash effects). To increase
recruitment of girls into STEM, this path should be made more
attractive for them. A future study could estimate how male-skewed
we would expect STEM careers to be based solely on gender dif-
ferences in academic achievement, by quantifying the academic
grades of current STEM employees. Our study focussed on gender
differences in academic achievement, but understanding gender
differences in any trait would be improved by simultaneously
comparing gender differences in mean and in variability.
Literature search and study selection. We performed a systematic literature
search following guidelines from PRISMA (Preferred Reporting Items for
Systematic Reviews and Meta-Analyses [46]). The PRISMA flow diagram depicting our
search and screening process is shown in Fig. 4. We broadly followed the search
protocol used by Voyer and Voyer [6]. We searched three databases for articles
published between August 2011 and May 2015: ERIC, SCOPUS and ISI Web of
Science. We did not use the PsycINFO or PsycARTICLES databases used by
Voyer and Voyer [6], as they were malfunctioning at the time of our search. We
searched for articles containing the term 'school grade/s', 'school achievement/s',
'school mark/s' or 'grade point average/s'. The exact search strings used for each
database and additional details of the literature search are provided in
Supplementary Methods. While there was no clear signal of publication bias in the school
subset (Supplementary Tables 12, 25), a limitation of our literature search is that we
did not actively search for unpublished studies or theses.
[Figure 3: panels a and b show simulated distributions of school grades (x-axis: School grades), annotated with f:m ratios above the top 25%, 5% and 1% cutoffs (one panel: 1.3, 1.1 and 0.9; the other: 1.1, 0.9 and 0.8). Panel c, 'Gender ratio across school grades distributions', plots the f:m ratio (from more boys to more girls) against percentile (from 100% to 0.01%), with markers at all students, top 75%, top 50%, top 25%, top 10% (STEM equal) and top 2% (non-STEM equal).]
Fig. 3 Inferred relative distributions of academic abilities of girls (red) and boys (blue). a STEM and b non-STEM school subjects, c the proportion of girls in
each percentile. The relative mean and variance for each gender are based on the results of the meta-regression of school grades for school pupils, with
subject as a moderator. In a and b, dashed vertical lines indicate cutoff points, above which 25%, 5% and 1% of top-scoring pupils can be found. The
proportion of girls to boys across the distribution is shown in c, where values to the right on the x-axis correspond to the right tail of the achievement
distributions. f:m values represent ratios of top-scoring girls to boys above each cutoff point (i.e. f:m > 1 = more females; f:m < 1 = more males). Data and
code for reproducing this figure are available at refs. 52,53
Eligibility criteria. To be included for data extraction at the full-text screening
phase, studies needed to present teacher-assigned grades or global GPA (grade
point average, i.e. grades averaged across many subjects) for a cohort containing
both male and female students. The students could be from grade one and above.
These criteria excluded kindergarten and single-sex studies, and self-reported
grades or test data. Because of socio-cultural effects on gender differences, we
required samples of students that took classes together; we therefore excluded
online courses. We also excluded retrospective studies comparing adults that were
not in the same study cohort. Where longitudinal data was reported, we included
only the first year of data that met the inclusion criteria. In the case of studies that
reported high school GPA for an undergraduate sample, we only included the
university grades, if reported, and we deemed the high school grades ineligible. This
is because the high school grades of groups of undergraduates do not come from
the same cohort; they represent a subsample of students from disparate high
schools, and only those students who performed well enough to attend university.
When we identified studies that reported data from the same large database, we
only included the study with the largest sample size, and excluded the rest to avoid
pseudo-replication. The list of excluded studies, with reasons for exclusion, is
presented in Supplementary Data 2.
Data extraction and coding. From the original papers, we extracted the sample
sizes, means, and standard deviations for male and female academic grades. For the
studies used by Voyer and Voyer [6], we attempted to contact authors if any of these
data were missing. All contacted authors were also asked to provide any additional
data (published or unpublished) they might have available. If we received no
response after 1 month, we sent a follow-up email. Only unstandardised grade data
was collected. When presented data was standardised, we contacted authors to
request the corresponding unstandardised values. For the studies published after
August 2011, we only contacted authors if variance data was missing. In total, data
from authors was acquired for 15 studies, including two unpublished studies.
[Figure 4: PRISMA flow diagram with two input streams, papers up until August 2011 (Voyer & Voyer 2014) and papers published 2011 to May 2015 (database searches), plus additional data acquired from authors. Stages: records identified through database screening; records after duplicates removed; records screened (title + abstract + keywords); records screened for grades; full-text articles assessed for eligibility; eligible studies; final full dataset. Search: "school grade*" OR "school achievement*" OR "school mark*" OR "grade point average". Exclusion reasons shown in one stream: PDF unavailable (13), study not in English (5), standardised grades (6), not the same cohort (1), test scores (1), missing descriptive statistics. Exclusion reasons shown in the other stream: missing descriptive statistics (134), self-reported grades (19), replicate data (7), not the same cohort (6), study not in English (2), test scores (2).]
Fig. 4 PRISMA diagram showing the process of locating studies included in this meta-analysis. The full list of included studies, and the list of studies
excluded at the stage of full-text screening, are available in Supplementary Data 1 and 2
Moderator variables. In addition to the descriptive statistics for grades of males
and females, we extracted a number of moderator variables, all of which are
presented in Supplementary Table 1. We generally followed the variables used by
Voyer and Voyer [6] (e.g. racial composition), as well as recording additional
information (e.g. age of students). An analysis of the moderating effect of racial
composition on the gender gap in school grades is presented in the Supplementary
Note 1 and Supplementary Tables 1, 3. Continuous moderators were scaled and
centred (resulting in mean of 0, and standard deviation of 1) prior to the analyses.
We used multiple imputations to fill in missing values of study year and students'
mean age (details in Supplementary Methods).
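
As an illustration of the scaling and centring step, a minimal sketch is given below. The study years are hypothetical values, the function name `scale_and_centre` is ours, and the use of the sample standard deviation (n − 1 denominator) is an assumption rather than a detail stated in the text.

```python
from statistics import mean, stdev


def scale_and_centre(values):
    """Centre values on their mean and divide by the sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # sample SD (n - 1 denominator); an assumption here
    return [(value - centre) / spread for value in values]


# Hypothetical study years, for illustration only.
study_years = [1931, 1975, 1990, 2004, 2013]
scaled_years = scale_and_centre(study_years)
print([round(z, 2) for z in scaled_years])
```

After this transformation a regression slope is expressed per standard deviation of the moderator, which is how the 'study year scaled' slopes in the Results can be read.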
Effect sizes. Using standardised effect sizes allowed us to combine original data
collected on different scales (grades were recorded on different scales among
included studies). To test for differences in mean grades between genders, we used
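the natural logarithm response ratio (hereafter referred to as lnRR), and its
corresponding sampling error variance, $s^2_{\mathrm{lnRR}}$:

$$\mathrm{lnRR} = \ln\left(\frac{\bar{x}_f}{\bar{x}_m}\right),$$

$$s^2_{\mathrm{lnRR}} = \frac{s_f^2}{n_f \bar{x}_f^2} + \frac{s_m^2}{n_m \bar{x}_m^2},$$

where $\bar{x}_f$ and $\bar{x}_m$ = the mean grade of female and male students, respectively,
$s_f^2$ and $s_m^2$ = the variance in grades of female and male students, respectively,
and $n_f$ and $n_m$ = the number of female and male students in each sample, respectively.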
Positive values of lnRR imply greater mean grades for girls.
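The lnRR calculation can be illustrated with a minimal Python sketch. It assumes the standard response-ratio form: the log of the female-to-male ratio of mean grades, with a sampling variance given by each group's variance divided by the product of its sample size and squared mean. The function name `ln_rr` and the summary statistics are hypothetical and are not taken from the dataset.

```python
import math


def ln_rr(mean_f, sd_f, n_f, mean_m, sd_m, n_m):
    """Return (lnRR, sampling variance) for female versus male mean grades."""
    if mean_f <= 0 or mean_m <= 0:
        raise ValueError("lnRR requires positive means")
    effect = math.log(mean_f / mean_m)
    variance = sd_f ** 2 / (n_f * mean_f ** 2) + sd_m ** 2 / (n_m * mean_m ** 2)
    return effect, variance


# Hypothetical summary statistics on a 100-point grading scale.
effect_100, var_100 = ln_rr(75.0, 10.0, 100, 72.0, 12.0, 100)

# The same hypothetical grades expressed on a 10-point scale.
effect_10, var_10 = ln_rr(7.5, 1.0, 100, 7.2, 1.2, 100)
```

Because lnRR depends only on ratios, rescaling both groups' grades leaves the effect size and its variance unchanged, which is what allows grades recorded on different scales to be combined.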
We extended the literature search in Voyer and Voyer (ref. 6) by 5 years, and our analysis of mean grades differed from theirs in two ways: (1) we included only studies where we could compare variances; and (2) we used lnRR instead of the standardised mean difference in performance (SMD or Hedges' g; ref. 24; see Supplementary Equations 1–4). We chose to use lnRR because, unlike SMD, it is unaffected by differences in variance (standard deviation) between groups. However, for comparison with Voyer and Voyer's (ref. 6) results, we repeated the lnRR analyses using SMD as the effect size. The results of the lnRR and SMD analyses, which are very similar to each other, are presented in Supplementary Figure 4 and Supplementary Tables 2–4, 6, 8, 12, 13, 16, 19, 22, 25.
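To assess differences in variance of grades of boys and girls, we used the natural logarithm of the coefficient of variation ratio (lnCVR) and its associated sampling error variance $s^2_{\mathrm{lnCVR}}$:

$$\mathrm{lnCVR} = \ln\left(\frac{CV_f}{CV_m}\right) + \frac{1}{2(n_f - 1)} - \frac{1}{2(n_m - 1)},$$

$$s^2_{\mathrm{lnCVR}} = \frac{s_m^2}{n_m \bar{x}_m^2} + \frac{1}{2(n_m - 1)} - 2\rho_{\ln \bar{x}_m, \ln s_m}\sqrt{\frac{s_m^2}{n_m \bar{x}_m^2}\,\frac{1}{2(n_m - 1)}} + \frac{s_f^2}{n_f \bar{x}_f^2} + \frac{1}{2(n_f - 1)} - 2\rho_{\ln \bar{x}_f, \ln s_f}\sqrt{\frac{s_f^2}{n_f \bar{x}_f^2}\,\frac{1}{2(n_f - 1)}},$$

where $CV_m$ and $CV_f$ are the coefficients of variation ($s/\bar{x}$) for males and females, respectively, and $\rho_{\ln \bar{x}_m, \ln s_m}$ and $\rho_{\ln \bar{x}_f, \ln s_f}$ are the correlations between the logged means and standard deviations of the male and female students, respectively.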
All other notation is described above. Positive values of lnCVR imply greater variance in girls' grades relative to boys' grades. By dividing the female and male standard deviations by their respective means, we controlled for the effect of a proportional relationship (the mean–variance relationship) between the standard deviation and the mean. To test how the variance in grades has changed over time, we also computed the natural logarithm of the coefficient of variation (lnCV) for boys and girls separately, and its associated sampling error variance (ref. 35):

$$\mathrm{lnCV} = \ln\left(\frac{s}{\bar{x}}\right) + \frac{1}{2(n - 1)},$$

$$s^2_{\mathrm{lnCV}} = \frac{s^2}{n \bar{x}^2} + \frac{1}{2(n - 1)} - 2\rho_{\ln \bar{x}, \ln s}\sqrt{\frac{s^2}{n \bar{x}^2}\,\frac{1}{2(n - 1)}}.$$
All notation as described above. For the same mean, a more negative value of
lnCV implies a smaller variance.
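The variability effect sizes can be sketched in the same way. The example below assumes the general form given for lnCV and lnCVR in ref. 35, including a small-sample correction of 1/(2(n − 1)), and treats lnCVR as the difference between the female and male lnCV values. The function names, the default correlation `rho = 0` between logged means and standard deviations, and the summary statistics are all hypothetical; in practice the correlations would be estimated from the data.

```python
import math


def ln_cv(mean, sd, n, rho=0.0):
    """Return (lnCV, sampling variance) for one group.

    rho is the assumed correlation between logged means and logged SDs.
    """
    if mean <= 0 or sd <= 0 or n < 2:
        raise ValueError("need positive mean and SD, and n >= 2")
    correction = 1.0 / (2.0 * (n - 1))
    effect = math.log(sd / mean) + correction
    var_mean_part = sd ** 2 / (n * mean ** 2)
    variance = (var_mean_part + correction
                - 2.0 * rho * math.sqrt(var_mean_part * correction))
    return effect, variance


def ln_cvr(mean_f, sd_f, n_f, mean_m, sd_m, n_m, rho_f=0.0, rho_m=0.0):
    """Return (lnCVR, sampling variance); positive means girls vary more."""
    cv_f, var_f = ln_cv(mean_f, sd_f, n_f, rho_f)
    cv_m, var_m = ln_cv(mean_m, sd_m, n_m, rho_m)
    return cv_f - cv_m, var_f + var_m


# Hypothetical summary statistics (illustration only).
cvr_effect, cvr_variance = ln_cvr(75.0, 10.0, 100, 72.0, 12.0, 100)
```

With these made-up inputs the girls' coefficient of variation is smaller than the boys', so lnCVR is negative, matching the sign convention stated above.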
Statistical analyses. We performed our main analyses on lnCVR and lnRR, and their associated error terms, using the R (v.3.4.2) package metafor v.2.0-0 (ref. 48). One-third of effect sizes were not independent, because they came from the same study and/or the same cohort of students. We therefore included cohort ID and comparison ID as random effects in each model (the levels of study ID overlapped too much with cohort ID to model both levels simultaneously; e.g. the school data comprised 120 studies and 141 cohorts). We also modelled covariance between effect sizes, because the sampling error variances of effect sizes based on the same cohort are likely to be correlated; we assumed that grades in different subjects from the same cohort had a correlation of 0.5 (as recommended in ref. 49). We supplied this covariance matrix as the sampling error variance matrix (the V argument of the model-fitting function). In addition, to account for the two main types of non-independence in our data (hierarchical/nested and correlation/covariance structures), we used the robust function within the metafor package to generate fixed effects estimates and confidence intervals, based on robust variance estimation, from each model. To test for the overall effect of gender on mean and variance in school grades, we constructed meta-analytic models with no fixed effects (i.e. intercept-only models). We tested whether the results differed significantly between school and university by including a categorical 'school or university' moderator in a meta-regression model on the whole dataset. We then ran separate meta-analytic models on the school and university data subsets to quantify their respective heterogeneities (Supplementary Methods). To test whether the gender gap in school grades varied between subjects, we included subject type (STEM, non-STEM, Global, Other/NR) as a fixed effect in meta-regression analyses. To test whether the gender difference in school grades has changed over historical time, or with student age, we included either study year or average student age as a fixed effect. To test whether the variance of either males or females has changed over historical time, or with student age, we used lnCV as the response variable, with sex and study year, or sex and age, and their interactions as fixed effects. Point estimates from all statistical models were considered statistically significant when their confidence intervals (CI) did not span zero.
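The sampling covariance structure described in this section can be pictured as a block-structured matrix. The sketch below assumes that the covariance between two effect sizes from the same cohort is 0.5 × sqrt(v_i × v_j), where v_i and v_j are their sampling variances, and that effect sizes from different cohorts are uncorrelated. The function name `build_v_matrix`, the cohort labels and the variances are hypothetical, and the code is an illustration rather than the authors' R implementation.

```python
import math


def build_v_matrix(variances, cohort_ids, rho=0.5):
    """Build a sampling variance-covariance matrix as a list of lists.

    Diagonal entries are the sampling variances; off-diagonal entries are
    rho * sqrt(v_i * v_j) within a cohort and 0 between cohorts.
    """
    if len(variances) != len(cohort_ids):
        raise ValueError("variances and cohort_ids must have equal length")
    n = len(variances)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = variances[i]
            elif cohort_ids[i] == cohort_ids[j]:
                matrix[i][j] = rho * math.sqrt(variances[i] * variances[j])
    return matrix


# Hypothetical example: two subjects graded in cohort "A", one in cohort "B".
sampling_variances = [0.04, 0.09, 0.01]
cohorts = ["A", "A", "B"]
v_matrix = build_v_matrix(sampling_variances, cohorts)
```

A matrix of this shape is the kind of object that would be supplied as the sampling error variance matrix of a multilevel model, in place of a plain vector of variances.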
Robustness of results. There is a possibility of bias in our results due to over-reporting of positive findings in published studies, so we tested our data for publication bias using multilevel-model versions of funnel plots and Egger's regression (refs. 50, 51). We also performed alternative analyses of key components of our study to test whether our conclusions are robust. Overlaps of grade distributions were inferred using simulation methods. Details and results of these analyses are presented in Supplementary Methods and Supplementary Tables 15–18, 20–23.
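The idea of inferring distribution overlap by simulation can be illustrated with a rough sketch. It is not the authors' procedure, which is described in the Supplementary Methods. It assumes normally distributed grades, equal numbers of girls and boys, and arbitrary baseline values for the boys' mean and coefficient of variation; the function name, its parameters and all input values are hypothetical.

```python
import math
import random


def share_of_girls_in_top(ln_rr, ln_cvr, top_fraction=0.10,
                          n_per_gender=50_000, boy_mean=100.0,
                          boy_cv=0.15, seed=1):
    """Simulate a mixed class and return the share of girls in the top tail."""
    rng = random.Random(seed)
    # Convert the log-scale effect sizes back into a mean and CV for girls.
    girl_mean = boy_mean * math.exp(ln_rr)
    girl_cv = boy_cv * math.exp(ln_cvr)
    grades = [(rng.gauss(boy_mean, boy_cv * boy_mean), "boy")
              for _ in range(n_per_gender)]
    grades += [(rng.gauss(girl_mean, girl_cv * girl_mean), "girl")
               for _ in range(n_per_gender)]
    grades.sort(key=lambda pair: pair[0], reverse=True)
    top_count = int(len(grades) * top_fraction)
    top = grades[:top_count]
    return sum(1 for _, gender in top if gender == "girl") / top_count


# Hypothetical scenario: equal means, girls less variable than boys.
share = share_of_girls_in_top(ln_rr=0.0, ln_cvr=-0.2)
```

Varying `ln_rr` and `ln_cvr` in such a simulation shows how a higher mean and a lower variance pull the composition of the top tail in opposite directions.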
Data availability
All data, code, and models that were used to generate results text, figures, and tables in the main text and supplementary information are available to download from dedicated repositories on the Open Science Framework (refs. 52, 53).
Received: 11 January 2018 Accepted: 9 August 2018
References
1. Möller, J., Pohlmann, B., Köller, O. & Marsh, H. W. A meta-analytic path analysis of the internal/external frame of reference model of academic achievement and academic self-concept. Rev. Educ. Res. 79, 1129–1167
2. Mandel, H. The role of occupational attributes in gender earnings inequality, 1970–2010. Soc. Sci. Res. 55, 122–138 (2016).
3. Holmes, K., Gore, J., Smith, M. & Lloyd, A. An integrated analysis of school students' aspirations for STEM careers: which student and school factors are most predictive? Int. J. Sci. Math. Educ. 29, 1–21 (2017).
4. Marsh, H. W., Trautwein, U., Lüdtke, O., Köller, O. & Baumert, J. Academic self-concept, interest, grades, and standardized test scores: reciprocal effects models of causal ordering. Child Dev. 76, 397–416 (2005).
5. French, M. T., Homer, J. F., Popovici, I. & Robins, P. K. What you do in high school matters: high school GPA, educational attainment, and labor market earnings as a young adult. East. Econ. J. 41, 370–386 (2015).
6. Voyer, D. & Voyer, S. D. Gender differences in scholastic achievement: a meta-analysis. Psychol. Bull. 140, 1174–1204 (2014).
7. Shields, S. A. The variability hypothesis: the history of a biological model of sex. Signs 7, 1–30 (1982).
8. Johnson, W., Carothers, A. & Deary, I. J. Sex differences in variability in general intelligence: a new look at the old question. Perspect. Psychol. Sci. 3, 518–531 (2008).
9. Reinhold, K. & Engqvist, L. The variability is in the sex chromosomes. Evolution 67, 3662–3668 (2013).
10. Halpern, D. F. et al. The science of sex differences in science and mathematics. Psychol. Sci. Public Interest 8, 1–51 (2007).
11. Wang, M. T. & Degol, J. L. Gender gap in science, technology, engineering, and mathematics (STEM): current knowledge, implications for practice, policy, and future directions. Educ. Psychol. Rev. 29, 119–140 (2017).
12. Spencer, S. J., Logel, C. & Davies, P. G. Stereotype threat. Annu. Rev. Psychol. 67, 415–437 (2016).
13. Rudman, L. A. & Phelan, J. E. Backlash effects for disconfirming gender stereotypes in organizations. Res. Organ. Behav. 28, 61–79 (2008).
14. OECD. STEM workers receive a significant earnings premium over other workers with the same level of education: private wage and salary, workers aged 25 and over.
15. Holman, L., Stuart-Fox, D. & Hauser, C. E. The gender gap in science: how long until women are equally represented? PLoS Biol. 16, e2004956 (2018).
16. Penner, A. M. Gender differences in extreme mathematical achievement: an international perspective on biological and social factors. Am. J. Sociol. 114, S138–S170 (2008).
17. Feingold, A. Gender differences in variability in intellectual abilities: a cross-cultural perspective. Sex Roles 30, 81–92 (1994).
18. Hedges, L. V. & Nowell, A. Sex differences in mental test scores, variability, and numbers of high-scoring individuals. Science 269, 41–45 (1995).
19. Reilly, D., Neumann, D. L. & Andrews, G. Sex differences in mathematics and science achievement: a meta-analysis of National Assessment of Educational Progress assessments. J. Educ. Psychol. 107, 645–662 (2015).
20. Cimpian, J. R., Lubienski, S. T., Timmer, J. D., Makowski, M. B. & Miller, E. K. Have gender gaps in math closed? Achievement, teacher perceptions, and learning behaviors across two ECLS-K cohorts. AERA Open 2, 1–19 (2016).
21. Lakin, J. M. Sex differences in reasoning abilities: surprising evidence that male–female ratios in the tails of the quantitative reasoning distribution have increased. Intelligence 41, 263–274 (2013).
22. Wai, J., Cacchio, M., Putallaz, M. & Makel, M. C. Sex differences in the right tail of cognitive abilities: a 30 year examination. Intelligence 38, 412–423
23. Baye, A. & Monseur, C. Gender differences in variability and extreme scores in an international context. Large-scale Assess. Educ. 4, 541 (2016).
24. Cohen, J. Statistical Power Analysis for the Behavioral Sciences (Revised Edition) (Academic Press, New York, 1977).
25. Duckworth, A. L. & Seligman, M. E. P. Self-discipline gives girls the edge: gender in self-discipline, grades, and achievement test scores. J. Educ. Psychol. 98, 198–208 (2006).
26. McCandless, B. R., Roberts, A. & Starnes, T. Teachers' marks, achievement test scores, and aptitude relations with respect to social class, race, and sex. J. Educ. Psychol. 63, 153–159 (1972).
27. Borghans, L., Golsteyn, B. H. H., Heckman, J. J. & Humphries, J. E. What grades and achievement tests measure. Proc. Natl Acad. Sci. USA 113, 13354–13359 (2016).
28. Zwick, R. & Green, J. G. New perspectives on the correlation of scholastic assessment test scores, high school grades, and socioeconomic factors. J. Educ. Meas. 44, 1–23 (2007).
29. Cornwell, C., Mustard, D. B. & Van Parys, J. Noncognitive skills and the gender disparities in test scores and teacher assessments: evidence from primary school. J. Hum. Resour. 48, 236–264 (2013).
30. Betts, J. R. & Morell, D. The determinants of undergraduate grade point average: the relative importance of family background, high school resources, and peer group effects. J. Hum. Resour. 34, 268–293 (1999).
31. Zhang, G., Anderson, T. J., Ohland, M. W. & Thorndyke, B. R. Identifying factors influencing engineering student graduation: a longitudinal and cross-institutional study. J. Eng. Educ. 93, 313–320 (2004).
32. Else-Quest, N. M., Hyde, J. S. & Linn, M. C. Cross-national patterns of gender differences in mathematics: a meta-analysis. Psychol. Bull. 136, 103–127 (2010).
33. Lindberg, S. M., Hyde, J. S., Petersen, J. L. & Linn, M. C. New trends in gender and mathematics performance: a meta-analysis. Psychol. Bull. 136, 1123–1135
34. Taylor, L. R. Aggregation, variance and the mean. Nature 189, 732–735 (1961).
35. Nakagawa, S. et al. Meta-analysis of variation: ecological and evolutionary applications and beyond. Methods Ecol. Evol. 6, 143–152 (2015).
36. Trahan, L. H., Stuebing, K. K., Fletcher, J. M. & Hiscock, M. The Flynn effect: a meta-analysis. Psychol. Bull. 140, 1332–1360 (2014).
37. Lackey, L. W. & Lackey, W. J. Grade inflation: potential causes and solutions. Int. J. Eng. Educ. 22, 130–139 (2006).
38. Leslie, S.-J., Cimpian, A., Meyer, M. & Freeland, E. Expectations of brilliance underlie gender distributions across academic disciplines. Science 347, 262–265 (2015).
39. Niederle, M. & Vesterlund, L. Explaining the gender gap in math test scores: the role of competition. J. Econ. Perspect. 24, 129–144 (2010).
40. Gneezy, U. & Rustichini, A. Gender and competition at a young age. Am. Econ. Rev. 94, 377–381 (2004).
41. Rudman, L. A. & Fairchild, K. Reactions to counterstereotypic behavior: the role of backlash in cultural stereotype maintenance. J. Pers. Soc. Psychol. 87, 157–176 (2004).
42. Coyle, T. R., Snyder, A. C. & Richmond, M. C. Sex differences in ability tilt: support for investment theory. Intelligence 50, 209–220 (2015).
43. Wang, M. T., Eccles, J. S. & Kenny, S. Not lack of ability but more choice: individual and gender differences in choice of careers in science, technology, engineering, and mathematics. Psychol. Sci. 24, 770–775 (2013).
44. Valla, J. M. & Ceci, S. J. Breadth-based models of women's underrepresentation in STEM fields: an integrative commentary on Schmidt (2011) and Nye et al. (2012). Perspect. Psychol. Sci. 9, 219–224 (2014).
45. Riegle-Crumb, C., King, B. & Moore, C. Do they stay or do they go? The switching decisions of individuals who enter gender atypical college majors. Sex Roles 74, 436–449 (2016).
46. Moher, D., Liberati, A., Tetzlaff, J. & Altman, D. G. PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. J. Clin. Epidemiol. 62, 1006–1012 (2009).
47. Hedges, L. V., Gurevitch, J. & Curtis, P. S. The meta-analysis of response ratios in experimental ecology. Ecology 80, 1150–1156 (1999).
48. Viechtbauer, W. Conducting meta-analyses in R with the metafor package. J. Stat. Softw. 36, 1–48 (2010).
49. Noble, D. W., Lagisz, M., O'Dea, R. E. & Nakagawa, S. Nonindependence and sensitivity analyses in ecological and evolutionary meta-analyses. Mol. Ecol. 26, 2410–2425 (2017).
50. Egger, M., Smith, G. D., Schneider, M. & Minder, C. Bias in meta-analysis detected by a simple, graphical test. BMJ 315, 629–634 (1997).
51. Nakagawa, S. & Santos, E. S. A. Methodological issues and advances in biological meta-analysis. Evol. Ecol. 26, 1253–1274 (2012).
52. O'Dea, R. E., Lagisz, M., Jennions, M. D. & Nakagawa, S. Data for "Gender differences in individual variation in academic grades fail to fit expected patterns for STEM". Open Science Framework (2018).
53. O'Dea, R. E., Lagisz, M., Jennions, M. D. & Nakagawa, S. Code for "Gender differences in individual variation in academic grades fail to fit expected patterns for STEM". Open Science Framework (2018).
Acknowledgements
We thank the following authors for kindly providing data used in analysis: Dr. Stephen Borde, Dr. Christy Byrd, Dr. Christina Davies, Professor Rollande Deslandes, Professor Jean-Marc Dewaele, Professor Noor Azina Binti Ismail, Dr. Marianne Johnson, Dr. Amy Lutz, Dr. Amy Sibulkin, Dr. Helena Smrtnik Vitulić and Dr. Daniel Taylor. Sincere thanks to Dr. Khandis Blake, Dr. Daniel Noble, Dr. Joel Pick and Professor Cordelia Fine for providing constructive comments that greatly improved the manuscript.
Author contributions
S.N. and M.D.J. conceived the study, R.E.O. and M.L. collected data, and R.E.O., M.L. and S.N. conducted analyses. All authors contributed to interpretation of the results and writing the manuscript.
Additional information
Supplementary Information accompanies this paper at
Competing interests: The authors declare no competing interests.
Reprints and permission information is available online at
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article's Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this license, visit
© The Author(s) 2018
... We tested whether resource limitation (Garibaldi et al., 2018) may be driving changes in yield stability: We hypothesised that when crop yield is increased by animal pollination, yield becomes increasingly limited by other factors, causing clustering around the maximum potential yield in that context, which results in increased stability (a ceiling effect, see O'Dea et al., 2018). These limiting factors could include agronomic constraints (498) CVR4 lnCVR z-score standardised mean yield of animalpollinated plants (497) Publication (46), experimental comparison (214), effect size (497) RR1 lnRR z-score standardised mean yield of auto-pollinated control plants (498) Publication (47) such as nutrient availability, and/or physiological constraints, such as the number of ovules per flower. ...
... In our study crops, greater yield benefits of animal pollination were associated with greater yield stability. Our results indicate that animal pollination increases spatial yield stability due to a ceiling effect (O'Dea et al., 2018) which results from resource limitation. According to the multiple limitation hypothesis (Garibaldi et al., 2018), crop yield is concurrently limited F I G U R E 2 Effect of spatial scale at which mean and variance have been calculated on the yield stability ratio (lnCVR) between animalpollinated and auto-pollinated control plants. ...
Full-text available
The benefits of animal pollination to crop yield are well known. In contrast, the effects of animal pollination on the spatial or temporal stability (the opposite of variability) of crop yield remain poorly understood. We use meta‐analysis to combine variability information from 215 experimental comparisons between animal‐pollinated and wind‐ or self‐pollinated control plants in apple, oilseed rape and faba bean. Animal pollination increased yield stability (by an average of 32% per unit of yield) at between‐flower, ‐plant, ‐plot and ‐field scales. Evidence suggests this occurs because yield benefits of animal pollination become progressively constrained closer to the maximum potential yield in a given context, causing clustering. The increase in yield stability with animal pollination is greatest when yield benefits of animal pollination are greatest, indicating that managing crop pollination to increase yield also increases yield stability. These additional pollination benefits have not yet been included in economic assessments but provide further justification for policies to protect pollinators.
... For example, multiple studies have demonstrated that males exhibit more variability in brain size and structure [116,117,123,149]. In some cases, these results have been misused to justify current gender disparities in STEM fields (due to a higher frequency of males at the "top-end" of neuroanatomical distributions) [150]. However, in other cases, researchers have proactively preempted such biased interpretations by thoughtfully considering the full space of theoretical possibilities and engaging with this space through scientific data-based inquiry. ...
... However, in other cases, researchers have proactively preempted such biased interpretations by thoughtfully considering the full space of theoretical possibilities and engaging with this space through scientific data-based inquiry. For example: (i) while O'Dea and colleagues [150] confirm that girls exhibit higher average, but less variable grades, they note that by the time girls graduates, they are just as likely as boys to have earned high enough grades to pursue a career in STEM; and (ii) while Wierenga and colleagues [116] confirm greater neuroanatomical variability among men, they note that extreme brain structure (in either direction) may be costly due to energy-induced tradeoffs between volume and conduction time [151]. ...
Full-text available
Highlights Recent large-scale studies have reached different conclusions regarding the presence of sex differences in human neuroanatomy. We show that these contradictory findings are explained by different methodological choices. While multiple large direct analyses highlight small, highly reproducible sex differences, reviews do not account for methodological heterogeneity across studies (e.g., statistical power/sample size, brain size-correction methods, segmentation, region selection, participant age). This explains many of the apparent inconsistencies reported in recent reviews. We also summarize observations that motivate research on sex differences in human neuroanatomy (including potential causes and effects), review methodological and empirical support for using structural MRI to investigate such patterns, and outline best practices for analyzing and describing neuroanatomical sex differences. Finally, we argue that broader historical and societal contexts make it important to reinforce the scientific method by adopting an actively "anti-sexist" viewpoint when conducting research on sex differences in the human brain.
... The second aim of this study was to investigate the presence of any difference in the cognitive pro le in the three groups. Current evidence supports that gender differences in cognitive performance are quite small, particularly in developed countries (O'Dea et al., 2018). However, cis men generally outperform cis women in visuospatial abilities (Palmiero et al., 2016), while cis women outperform cis men in verbal and memory tasks (Karalexi et al., 2020). ...
Full-text available
Several studies investigated the specific neural correlates of trans people, highlighting mixed results. This study aimed to investigate the presence of specific functional connectivity in trans men, compared to a homogeneous group of cisgender men and cisgender women. 42 participants (19 trans men, 11 cisgender men, and 12 cisgender women) underwent a resting state fMRI; a blood sample was collected in order to evaluate the hormonal status of testosterone, estradiol, and progesterone. Screening measures were administered for evaluating the intellectual ability and manual preference. Moreover, all participants underwent a neuropsychological evaluation of executive functions, attention, visual-perceptual ability, and verbal fluency. Trans men showed a weaker functional connectivity in the precentral gyrus, subcallosal cortex, paracingulate gyrus, temporal pole, and cingulate gyrus in contrast to cisgender men. Furthermore, trans men showed a worse performance than cisgender men and similar to that of cis women in verbal and visuospatial working-memory. In trans men, functional connectivity of precentral gyrus was positively correlated with blood testosterone and negatively correlated with estradiol and progesterone; the cluster involving the subcallosal cortex showed a positive correlation with testosterone and negative with estradiol, and the functional connectivity from a cluster involving the paracingulate gyrus showed a positive correlation with testosterone. This study sheds light to the importance of overpassing the binary-model, by highlighting the presence of neural pathways that could represent the peculiarity of the neural profile of people with gender dysphoria.
... The findings of the studies Adragna (2009) 2016) reported that there is a significant difference in the of mean score of career aspiration and also reported that female students had higher career aspiration. The studies of Aunola et al. (2018) andO'Dea et al. (2018)contradict the present study's findings, revealing that male students had higher career aspirations.The studies of Cervoni and Ivinson (2011), DiPrete and Buchmann (2013), Howard et al. (2011) and Wairimu (2012) shows the significant difference in the mean scores of career aspiration between the boys and girls. These studies reported that male students had higher career aspiration. ...
Full-text available
Individuals start to aspire toward their careers when they are at school. The decision to pursue a career can start as early as adolescence. At the secondary school level, students form their career goals. This study assessed the career aspiration of XI standard students. The objectives of the study are, to assess the career aspiration of XI standard students and to identify the significant difference if any in career aspiration of XI standard students concerning the sub-variables of population variables namely, locality, gender, type of management, and academic stream. A descriptive survey method with cluster sampling is used to choose a sample size of 265 XI standard students. A Likert scale is used as a research tool. The study employed differential statistics. The study found that career aspiration is high among the XI standard students. Self-confidence and preparation dimensions of career aspiration differ significantly with respect to gender, locality, and the type of management. Self-confidence, preparation, and motivation dimensions of career aspiration also differ significantly with respect to the stream of study of XI standard students.
... The first arises from candidates: different groups of candidates may exhibit different variability when their quality is estimated through a given test. For instance, students of different genders have been observed to show different variability on certain test scores (Baye and Monseur, 2016;O'Dea et al., 2018). The second arises from the decision makers: decision makers might have different levels of experience (or different amounts of data in case of algorithmic decision making) judging candidates from different groups and consequently, their ability to precisely assess the quality of candidates belonging to different groups might be different. ...
Data-driven decision-making algorithms are increasingly applied in many domains with high social impact, such as hiring, lending, or criminal justice. Recently, it was shown that such algorithms could lead to discrimination against certain demographic groups (e.g., they can discriminate by race, gender, or age). This led to a recent active line of research, called algorithmic fairness, which studies how to develop efficient algorithms with fairness guarantees. Most of the decision problems with high social impact mentioned above are essentially selection problems. In selection problems, the decision-maker must select a fixed fraction of the best candidates given their characteristics. The notion of a selection budget contrasts selection problems with classification problems typically studied in the algorithmic fairness literature. In this thesis, we study the causes of discrimination in selection problems and the impact of fairness mechanisms on the utility of selection. Our first contribution considers a selection problem with candidates whose quality is measured with a group-dependent noise, a phenomenon called differential variance. We study the impact of differential variance on group representations and how standard group fairness mechanisms affect the selection utility in the presence of differential variance. Our second contribution proposes a game-theoretic model of a selection problem with differential variance. We assume strategic candidates who maximize their individual utility by making a costly effort. The effort induces their quality, measured by a (group-fair) decision-maker with group-dependent noise. We characterize the equilibrium of such a game. In our third contribution, we consider a multistage selection problem. We extend classical group fairness notions to a multistage setting and propose the notions of local (per stage) and global (final stage) fairness. We then introduce and study the price of local fairness, which is the ratio of utilities induced by the globally fair algorithm to that of the locally fair algorithm.
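The differential-variance setting described in this abstract can be illustrated with a small simulation. The sketch below is not the thesis's model: the function name `simulate_selection`, the Gaussian quality and noise distributions, the noise levels, and the plain top-k selection rule are all assumptions chosen for illustration. Both groups share the same true-quality distribution, and only the measurement noise differs.

```python
import random


def simulate_selection(n_per_group=20000, budget=0.1, noise_sd=(0.5, 1.5), seed=0):
    """Select the top `budget` fraction by noisy estimate.

    Returns (shares, mean_quality): the share of selected candidates
    from each group and the mean true quality of the selected set.
    All distributions and parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    candidates = []
    for group, sd in enumerate(noise_sd):
        for _ in range(n_per_group):
            quality = rng.gauss(0.0, 1.0)            # true quality: same law in both groups
            estimate = quality + rng.gauss(0.0, sd)  # group-dependent measurement noise
            candidates.append((estimate, quality, group))

    # Group-oblivious selection: rank everyone by the noisy estimate.
    candidates.sort(reverse=True)
    k = int(budget * len(candidates))
    selected = candidates[:k]

    shares = [sum(1 for c in selected if c[2] == g) / k for g in range(len(noise_sd))]
    mean_quality = sum(c[1] for c in selected) / k
    return shares, mean_quality


if __name__ == "__main__":
    shares, mean_quality = simulate_selection()
    print("share of selected from low-noise group: ", round(shares[0], 3))
    print("share of selected from high-noise group:", round(shares[1], 3))
    print("mean true quality of selected:", round(mean_quality, 3))
```

Under these assumptions, with a small selection budget the group whose estimates are noisier tends to be over-represented among the selected candidates, even though the two groups have identical true quality.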
... Female students' higher tendency to suffer anxiety and stress effects from contact with patients [25,26] may explain these differences, as well as facing more obstacles during their clinical rotations when compared to males, such as poor mentoring and less support from hostile nurses [27]. However, some other studies report no significant differences between male and female students on their average grades [28], and others even report higher average grades among female students [29]. It is not clear why male students showed higher knowledge rates than females in our study, since the current literature regarding differences in knowledge according to gender is controversial. ...
During clinical rotations, medical students experience situations in which the patients’ right to privacy may be violated. The aim of this study is to analyze medical students’ perception of clinical situations that affect patients’ right to privacy, and to look for the influential factors that may contribute to the infringement on their rights, such as the students’ age, sex, academic year or parents’ educational level. A cross-sectional study was conducted with a survey via “Google Drive”. It consisted of 16 questions about personal information, 24 questions about their experience when rotating and 21 questions about their opinion concerning several situations related to the right to privacy. A total of 129 medical students from various Spanish medical schools participated. Only 31% of 3rd–6th year students declared having signed a confidentiality agreement when starting their clinical practice, and most students (52%) reported that doctors “sometimes”, “rarely” or “never” introduce themselves and the students when entering the patients’ rooms. Additionally, about 50% of all students reported that they would take a picture of a patient’s hospitalization report without his/her consent if it were useful for an assignment. Important mistakes during medical students’ rotations have been observed, as well as a general lack of knowledge regarding patients’ right to privacy among Spanish medical students. Men and older students showed better knowledge of current legislation, as did those whose parents were both university-educated and those in higher academic years.
... After receiving that evidence and interpreting the results, students tend to develop beliefs about their performance, impacting students' self-efficacy. Although middle and high school girls tend to have higher grades in STEM subjects (Science, Technology, Engineering, and Math; Voyer & Voyer, 2014), stereotypes and self-confidence may prevent them from pursuing STEM careers (O'Dea et al., 2018). In this context, Gibbons & Borders (2010) suggest that students make their career decisions between sixth and eighth grade, and intervention programs targeting girls' STEM self-efficacy have been shown to be more effective before 8th grade (Cho et al., 2009). ...
... They concluded that males outnumbered females in the upper tail of the score distribution in 22 out of 28 ability scales. Several meta-analytic studies on gender differences in variability have ensued (Baye & Monseur, 2016; Gray et al., 2019; Irwing & Lynn, 2005; O'Dea et al., 2018). ...
The current study examined gender differences in divergent thinking (DT) using meta-analyses of mean difference and variation. The main objective of the meta-analysis of mean difference was to resolve contradictory findings in the creativity literature regarding the prevalence of creativity among males or females in creative potential. The meta-analysis of variation aimed to test the greater male variability hypothesis (GMVH) in DT. To test gender differences in means (i.e., Hedges’ g), results from 213 studies (k = 1,251; N = 115,289) were analyzed using a three-level approach. Females slightly outperformed males in DT, g = -.065, 95% CI [-.095, -.034], p < .001. Three-level multiple regression analyses showed that the mean effect size significantly varied by (a) country, (b) DT subscale, (c) type of task, and (d) ability (gifted vs. non-gifted). In the second meta-analysis, the GMVH in creative potential was tested by synthesizing the results of 1,152 effect sizes from 187 studies (k = 1,152; N = 101,328). The results confirmed the existence of greater male variability (GMV) in DT, (lnVR) = 1.216, 95% CI [1.14, 1.29], p < .001, indicating 21.6% GMV in DT. Multiple regression analyses explained 29.82% of variability in the mean effect (lnVR) at Level-2 (within-studies variance), and 5% of the variability in the mean effect at Level-3 (between-studies variance). The mean difference findings support the gender similarity hypothesis, while variation results tend to support the gender differences hypothesis. Limitations and recommendations for future studies are discussed.
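For readers unfamiliar with the two effect sizes named in this abstract, the following sketch computes Hedges' g and the log variability ratio (lnVR) for a single study from group summary statistics. The function names, the male-minus-female sign convention, and the example numbers are assumptions made for illustration; they are not taken from the study, and its three-level model is not reproduced here. The formulas are the standard small-sample-corrected versions.

```python
import math


def hedges_g(mean_m, sd_m, n_m, mean_f, sd_f, n_f):
    """Standardized mean difference (male minus female) with small-sample correction."""
    df = n_m + n_f - 2
    pooled_sd = math.sqrt(((n_m - 1) * sd_m**2 + (n_f - 1) * sd_f**2) / df)
    d = (mean_m - mean_f) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # Hedges' correction factor J
    return correction * d


def ln_vr(sd_m, n_m, sd_f, n_f):
    """Log ratio of standard deviations (male over female) and its sampling variance."""
    # The last two terms are the usual small-sample bias correction.
    estimate = math.log(sd_m / sd_f) + 1 / (2 * (n_m - 1)) - 1 / (2 * (n_f - 1))
    variance = 1 / (2 * (n_m - 1)) + 1 / (2 * (n_f - 1))
    return estimate, variance


if __name__ == "__main__":
    # Hypothetical summary statistics for one study.
    g = hedges_g(mean_m=10.0, sd_m=2.2, n_m=50, mean_f=10.3, sd_f=2.0, n_f=50)
    est, var = ln_vr(sd_m=2.2, n_m=50, sd_f=2.0, n_f=50)
    print("Hedges' g:", g)
    print("lnVR:", est, "sampling variance:", var)
    print("VR:", math.exp(est))
```

With this sign convention, a negative g means the female mean is higher and a positive lnVR means male scores are more variable. Exponentiating lnVR gives the variability ratio on its original scale.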
In this paper, we analyze recently collected data from a unique assessment of high school student performance covering over two thousand students from five Chinese provinces. Across three domains of scientific intelligence tested, we document heterogeneous gender gaps in academic performance. These differences generally arise due to differential productivity of inputs to the education production process and not differential levels of inputs. At many quantiles of the achievement distribution, girls perform better than boys when identifying scientific issues, whereas the converse holds on the portion of the assessment that measures whether one can apply scientific evidence. These differences may partially explain the subsequent gap in the decision to major in specific STEM disciplines in college. Further, our results caution against using a single summative gender achievement gap measure when gender gaps in subject knowledge are not constant across each domain of intelligence examined within the test.
Math self-concept is strongly associated with a range of academic and career outcomes in math. The current research sought to identify factors that distinguish between undergraduates with particularly low or high math self-concept. A sample of 754 college students were asked to recall a low point they had with math as well as respond to questionnaires measuring math self-concept, value, and anxiety. Focal analyses were conducted on a subsample of participants who reported either high (n = 90) or low (n = 94) math self-concept. Relative to participants who were high in math self-concept, those who were low tended to be women, were higher in math anxiety, and valued math less. Thematic analysis also revealed similarities and differences in how undergraduates from these two groups appraised challenges, or low points, that they encountered in their history with math. Although there were similarities in the types of low points described by members of these two groups, these experiences were often appraised in distinct ways. Unique themes also emerged for each group, indicating that narrative interpretations of math experiences vary with current levels of math self-concept. Implications for future research and math education are discussed.
Author summary: In most fields of science, medicine, and technology research, men comprise more than half of the workforce, particularly at senior levels. Most previous work has concluded that the gender gap is smaller today than it was in the past, giving the impression that there will soon be equal numbers of men and women researchers and that current initiatives to recruit and retain more women are working adequately. Here, we used computational methods to determine the numbers of men and women authors listed on >10 million academic papers published since 2002, allowing us to precisely estimate the gender gap among researchers, as well as its rate of change, for most disciplines of science and medicine. We conclude that many research specialties (e.g., surgery, computer science, physics, and maths) will not reach gender parity this century, given present-day rates of increase in the number of women authors. Additionally, the gender gap varies greatly across countries, with Japan, Germany, and Switzerland having strikingly few women authors. Women were less often commissioned to write ‘invited’ papers, consistent with gender bias by journal editors, and were less often found in authorship positions usually associated with seniority (i.e., the last-listed or sole author). Our results support a need for further reforms to close the gender gap.
Intelligence quotient (IQ), grades, and scores on achievement tests are widely used as measures of cognition, but the correlations among them are far from perfect. This paper uses a variety of datasets to show that personality and IQ predict grades and scores on achievement tests. Personality is relatively more important in predicting grades than scores on achievement tests. IQ is relatively more important in predicting scores on achievement tests. Personality is generally more predictive than IQ on a variety of important life outcomes. Both grades and achievement tests are substantially better predictors of important life outcomes than IQ. The reason is that both capture personality traits that have independent predictive power beyond that of IQ.
Studies using data from the Early Childhood Longitudinal Study–Kindergarten Class of 1998–1999 (ECLS-K:1999) revealed gender gaps in mathematics achievement and teacher perceptions. However, recent evidence suggests that gender gaps have closed on state tests, raising the question of whether such gaps are absent in the ECLS-K:2011 cohort. Extending earlier analyses, this study compares the two ECLS-K cohorts, exploring gaps throughout the achievement distribution and examining whether learning behaviors might differentially explain gaps more at the bottom than the top of the distribution. Overall, this study reveals remarkable consistency across both ECLS-K cohorts, with the gender gap developing early among high achievers and spreading quickly throughout the distribution. Teachers consistently rate girls’ mathematical proficiency lower than that of boys with similar achievement and learning behaviors. Gender differences in learning approaches appear to be fairly consistent across the achievement distribution, but girls’ more studious approaches appear to have more payoff at the bottom of the distribution than at the top. Questions remain regarding why boys outperform girls at the top of the distribution, and several hypotheses are discussed. Overall, the persistent ECLS-K patterns make clear that girls’ early mathematics learning experiences merit further attention.
Drawing on prior theoretical and empirical research on gender segregation within educational fields as well as occupations, we examine the pathways of college students who at least initially embark on a gender-atypical path. Specifically, we explore whether women who enter fields that are male-dominated are more likely to switch fields than their female peers who have chosen other fields, as well as whether men who enter female-dominated majors are more likely to subsequently switch fields than their male peers who have chosen a more normative field. We utilize a sample of 3702 students from a nationally representative dataset on U.S. undergraduates, the Beginning Postsecondary Students Longitudinal Study (BPS 2004/09). Logistic regression models examine the likelihood that students switch majors, controlling for students’ social and academic background. Results reveal different patterns for men and women. Men who enter a female-dominated major are significantly more likely to switch majors than their male peers in other majors. By contrast, women in male-dominated fields are not more likely to switch fields compared to their female peers in other fields. The results are robust to supplementary analyses that include alternative specifications of the independent and dependent variables. The implications of our findings for the maintenance of gendered occupational segregation are discussed.
This study examines gender differences in the variability of student performance in reading, mathematics and science. Twelve databases from IEA and PISA were used to analyze gender differences within an international perspective from 1995 to 2015. Effect sizes and variance ratios were computed. The main results are as follows. (1) Gender differences vary by content area, students' educational levels, and students’ proficiency levels. The gender differences at the extreme tails of the distribution are often more substantial than the gender differences at the mean. (2) Exploring the extreme tails of the distributions shows that the situation of the weakest males in reading is a real matter of concern. In mathematics and science, males are more frequently among the highest performing students. (3) The “greater male variability hypothesis” is confirmed.
Meta-analysis is an important tool for synthesizing research on a variety of topics in ecology and evolution, including molecular ecology, but can be susceptible to non-independence. Non-independence can affect two major interrelated components of a meta-analysis: 1) the calculation of effect size statistics and 2) the estimation of overall meta-analytic estimates and their uncertainty. While some solutions to non-independence exist at the statistical analysis stages, there is little advice on what to do when complex analyses are not possible, or when studies with non-independent experimental designs exist in the data. Here we argue that using sensitivity analyses to explore the effects of procedural decisions in a meta-analysis (e.g., inclusion of different quality data, choice of effect size) and of statistical assumptions (e.g., assuming no phylogenetic covariance) is extremely important in assessing the impact of non-independence. Sensitivity analyses can provide greater confidence in results and highlight important limitations of empirical work (e.g., impact of study design on overall effects). Despite their importance, sensitivity analyses are seldom applied to problems of non-independence. To encourage better practice for dealing with non-independence in meta-analytic studies, we present accessible examples demonstrating the impact that ignoring non-independence can have on meta-analytic estimates. We also provide pragmatic solutions for dealing with non-independent study designs, and for analyzing dependent effect sizes. Additionally, we offer reporting guidelines that will facilitate disclosure of the sources of non-independence in meta-analyses, leading to greater transparency and more robust conclusions.
Declining enrolments in science, technology, engineering and mathematics (STEM) disciplines and a lack of interest in STEM careers are concerning at a time when society is becoming more reliant on complex technologies. We examine student aspirations for STEM careers by drawing on surveys conducted annually from 2012 to 2015. School students in years 3 to 12 (n = 6492) were asked to indicate their occupational choices. A logistic regression analysis showed that being in the older cohorts, possessing high cultural capital, being male, having a parent in a STEM occupation and high prior achievement in reading and numeracy, were significant. This analysis provides a strong empirical basis for school-based initiatives to improve STEM participation. In particular, strategies should target the following: the persistent lack of interest by females in some careers, improving student academic achievement in both literacy and numeracy and expanding knowledge of STEM careers, especially for students without familial STEM connections.