
Teacher Mindsets Help Explain Where a Growth-Mindset Intervention Does and Doesn’t Work

In press at Psychological Science
Authors:
David S. Yeager1*, Jamie M. Carroll1, Jenny Buontempo1, Andrei Cimpian2, Spencer Woody1,
Robert Crosnoe1, Chandra Muller1, Jared Murray1, Pratik Mhatre1, Nicole Kersting3, Christopher
Hulleman4, Molly Kudym1, Mary Murphy5, Angela Duckworth6, Gregory M. Walton7, & Carol
S. Dweck7
Affiliations:
1 University of Texas at Austin, Austin, TX, 2 New York University, New York, NY,
3 University of Arizona, Tucson, AZ, 4 University of Virginia, Charlottesville, VA,
5 Indiana University, Bloomington, IN, 6 University of Pennsylvania, Philadelphia, PA,
7 Stanford University, Stanford, CA
* Address correspondence to David S. Yeager (dyeager@utexas.edu).
Acknowledgments:
This paper uses data from the National Study of Learning Mindsets (PI: D. Yeager; Co-Is: R.
Crosnoe, C. Dweck, C. Muller, B. Schneider, & G. Walton; doi.org/10.3886/ICPSR37353.v1),
which was made possible through methods and data systems created by the Project for Education
Research That Scales (PERTS, PI: Dave Paunesku), data collection carried out by ICF (Project
directors: Kate Flint and Alice Roberts), meetings hosted by the Mindset Scholars Network at the
Center for Advanced Study in the Behavioral Sciences, assistance from M. Levi, M. Shankar, T.
Brock, C. Romero, C. Macrander, T. Wilson, E. Konar, E. Horng, H. Bloom, and M. Weiss, and
funding from the Raikes Foundation, the William T. Grant Foundation, the Spencer Foundation,
the Bezos Family Foundation, the Character Lab, the Houston Endowment, and the President and
Dean of Humanities and Social Sciences at Stanford University. Writing of this paper was
supported by the National Institutes of Health under award number R01HD084772, the National
Science Foundation under grant numbers 1761179 and 2004831, the William T. Grant
Foundation under grant 189706, the Bill and Melinda Gates Foundation under grant
numbers OPP1197627 and INV-004519, the UBS Optimus Foundation under grant number
47515, and an Advanced Research Fellowship from the Jacobs Foundation to David Yeager.
This research was also supported by grant P2CHD042849, Population Research Center, awarded
to the Population Research Center at The University of Texas at Austin by the National Institutes
of Health. The content is solely the responsibility of the authors and does not necessarily
represent the official views of the National Institutes of Health, the National Science Foundation,
and other funders.
Teacher and Student Mindsets
2
Abstract
A growth mindset intervention teaches the belief that intellectual abilities can be
developed. Where does the intervention work best? A prior paper examined school-level
moderators using data from the National Study of Learning Mindsets (NSLM), which delivered a
short growth mindset intervention during the first year of high school. This paper uses the NSLM
to examine moderation by teachers’ mindsets and answers a new question: Can students
independently implement their growth mindsets in virtually any classroom culture, or must
students’ growth mindsets be supported by their teacher’s own growth mindsets (i.e., the mindset
+ supportive context hypothesis)? The present analysis (N = 9,167 student records matched with
N = 223 math teachers) supported the latter hypothesis. This result stood up to potentially
confounding teacher factors and to a conservative Bayesian analysis. Thus, sustaining growth
mindset effects may require contextual supports that allow the proffered beliefs to take root and
flourish.
Keywords: Wise interventions, Growth mindset, Motivation, Adolescence, Affordances,
Implicit theories.
Psychological interventions change the ways that people make sense of their experiences,
and have led to improvement in a wide variety of domains of importance to society and to public
policy (Harackiewicz & Priniski, 2018; Walton & Wilson, 2018). These interventions offer
people new beliefs that encourage them to tackle rather than avoid a challenge or to persist rather
than give up. To the extent that people put these beliefs into practice, the interventions can
improve outcomes months or even years later (see Brady et al., 2020).
For instance, a growth-mindset of intelligence intervention conveys to students the
malleability of intellectual abilities in response to hard work, effective strategies, and help from
others. Short (<50-minute), online growth mindset interventions evaluated in randomized
controlled trials (including two pre-registered replications) have improved the academic
outcomes of lower-achieving high school students and first-year college students (e.g., Yeager et
al., 2019; see Dweck & Yeager, 2019). These interventions seek to dispel a fixed mindset, the
idea that intellectual abilities cannot be changed, which has been associated with more “helpless”
responses to setbacks and lower achievement around the world (OECD, 2021).
Is successfully teaching students a growth mindset enough? A fundamental tension
centers on the role of the educational context. Should psychological interventions be thought of
as “giving” people adaptive beliefs that they can apply and reap the benefits from in almost any
context, even ones that do not directly support its use? Or do interventions simply offer beliefs
that must later be supported by the context if they are to bear fruit?
In a previous paper (Yeager et al., 2019), we examined the role of a school factor,
namely, the peer norms in a school, and found that the student growth mindset intervention could
not overcome the obstacle of a peer culture that did not share or support growth mindset
behaviors, such as challenge seeking. Here we ask how the growth mindset intervention might
fare in classrooms led by teachers who endorse more of a fixed mindset (a less supportive
context for students’ growth mindsets) versus classrooms led by teachers who endorse more of a
growth mindset (a more supportive context).
Why Might a Growth Mindset Intervention Depend on Teacher Beliefs?
The present paper tests the viability of the mindset + supportive context hypothesis. In
this hypothesis, a teacher’s growth mindset acts as an “affordance” (Walton & Yeager, 2020;
also see Gibson, 1977) that can draw out a student’s nascent growth mindset and make it tenable
and actionable in the classroom.[i] This hypothesis grows out of the recognition that as people try
to implement a belief or behavior in a given context, they become aware of whether it is
beneficial and legitimate in that context by attending to cues in their environments.
According to the mindset + supportive context hypothesis, teachers with a growth
mindset may convey how, in their class, mistakes are learning opportunities, not signs of low
ability, and back up this view with assignments and evaluations that reward continual
improvement (Canning, Muenks, Green, & Murphy, 2019; Muenks et al., 2020). This could
encourage a student to continue acting on their growth mindsets. By contrast, teachers with more
of a fixed mindset may implement practices that make a budding growth mindset inapplicable
and locally invalid. For instance, they may convey that only some students have the talent to get
an A, or say that not everyone is a math person (Rattan, Good, & Dweck, 2012; also see
Muenks et al., 2020). These messages could make students think that their intelligence would be
evaluated negatively if they had to work hard or if they asked a question that revealed their
confusion, discouraging students from acting out key growth mindset behaviors. According to
this hypothesis, the intervention is like planting a seed, but one that will not take root and
flourish unless the “soil” is fertile (a classroom with growth mindset affordances) (see Walton &
Yeager, 2020).
Despite its intuitive appeal, the mindset + supportive context hypothesis was not a
foregone conclusion. Perhaps students are more like independent agents who can achieve in any
classroom context so long as they bring adaptive beliefs to the context and put forth effective
effort. Therefore, teachers’ mindsets could be irrelevant to the effectiveness of the intervention.
Research could even find stronger effects in a classroom led by teachers espousing more of a
fixed mindset. This would imply that the intervention fortifies students to find ways to achieve
(for example, by being less daunted by difficult tasks, working harder, persisting longer) even in
contexts that are not directly encouraging these behaviors (Canning et al., 2019; Leslie, Cimpian,
Meyer, & Freeland, 2015; Muenks et al., 2020). In this view, a student’s growth mindset could
be like an asset that can compensate for something lacking in the environment. Because no study
has examined classroom context moderators of the growth mindset intervention, a direct test of
the mindset + supportive context hypothesis was needed.
The Importance of Studying Treatment Effect Heterogeneity
Our attention to teachers’ mindsets as a moderating agent continues an important
development in psychological intervention research: a focus on treatment effect heterogeneity
(Tipton, Yeager, Iachan, & Schneider, 2019). Psychologists have often viewed heterogeneous
effects as a limitation, as meaning that the effects are unreliable, small, or applicable in too
limited a way, and therefore not important (for a discussion see Miller, 2019).[ii] But this view is
shifting. First, heterogeneity is now seen as the way things in the world actually are (Bryan,
Tipton, & Yeager, in press; Gelman & Loken, 2014). Nothing, and particularly no psychological
phenomenon, works the same way for all people in all contexts. This fact has been pointed
out for generations (Bronfenbrenner, 1977; Cronbach, 1957; Lewin, 1952), but it has only
recently begun to be appreciated sufficiently. Second, systematically probing where an
intervention does and does not work provides a unique opportunity to develop better theories and
interventions (Bryan et al., in press; McShane, Tackett, Böckenholt, & Gelman, 2019), including
by revealing mechanisms through which the intervention operates.
The Present Research
This study analyzed data from the National Study of Learning Mindsets (NSLM, Yeager,
2019), which was an intervention experiment conducted with a nationally representative U.S. sample of 9th
grade students (registration: osf.io/tn6g4). The NSLM focused on the start of high school
because this is when academic standards often rise and when students establish a trajectory of
higher or lower academic achievement with lifelong consequences (Easton, Johnson, & Sartain,
2017). The NSLM was designed primarily to study treatment effect heterogeneity. The first
paper, as mentioned, focused on a school’s peer norms as a moderator (Yeager et al., 2019). The
second planned analysis, presented here, focuses on teacher factors. Teachers are important to
students directly because they lead the classroom and establish its culture. For example, teachers
create the norms for instruction, set the parameters for student participation, and control grading
and assessments, and thereby influence student motivation and engagement (Jackson, 2018;
Kraft, 2019).
The present focus on math grades (rather than overall GPA as in Yeager et al., 2019) is
motivated by the fact that students tend to find math challenging and anxiety-inducing (Hembree,
1990) and therefore a growth mindset might help students confront those challenges
productively. Further, our focus on math is relevant to policy. Success in 9th grade math is a
gateway to a lifetime of advanced education, profitable careers, and even longevity (Carroll,
Muller, Grodsky, & Warren, 2017).
In this study of heterogeneous effects, what kinds of effect sizes should be expected?
Brief online growth mindset interventions have tended to improve the grades of lower-achieving
high school students by about .10 grade points (or .11 SD) (Yeager et al., 2019, 2016). This may
seem small relative to benchmarks from laboratory research, but that is not an appropriate
comparison for understanding intervention effects obtained in field settings (Kraft, 2020). An
entire year of learning in 9th grade math is worth .22 SD as assessed by achievement tests (Hill,
Bloom, Black, & Lipsey, 2008), and having a high-quality math teacher for a year during
adolescence, as compared to an average one, is worth .16 SD (Chetty, Friedman, & Rockoff,
2014). Expensive and comprehensive education reforms for adolescents show a median effect of
.03 SD. The largest effects top out at around 0.20 SD, with effects this large representing striking
outliers (Boulay et al., 2018). Thus, Kraft (2020) concluded that “effects of .15 or even .10 SD
should be considered large and impressive” (pg. 248) especially if the intervention is scalable,
rigorously evaluated, and assessed in terms of consequential, official outcomes (e.g. grades).
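To make these comparisons concrete, the short sketch below divides the typical growth mindset effect (.11 SD) by each benchmark quoted in the paragraph above. All numbers are taken from the text; the variable and function names are illustrative only. The ratios are descriptive at best, because the benchmarks were estimated on different outcomes (e.g., achievement tests rather than grades) and in different samples.

```python
# Effect sizes in SD units, exactly as quoted in the paragraph above.
# The dictionary keys are illustrative labels, not terms from the cited studies.
benchmarks_sd = {
    "one year of 9th grade math learning": 0.22,
    "high-quality vs. average math teacher, one year": 0.16,
    "median comprehensive education reform": 0.03,
    "largest comprehensive reforms (outliers)": 0.20,
}
growth_mindset_sd = 0.11  # typical effect among lower-achieving students


def relative_size(effect_sd, benchmark_sd):
    """Return the effect expressed as a multiple of the benchmark."""
    return effect_sd / benchmark_sd


ratios = {
    name: relative_size(growth_mindset_sd, sd)
    for name, sd in benchmarks_sd.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

Read this way, the typical effect is about half the size of a full year of 9th grade math learning and several times the median effect of comprehensive reforms, which is the sense in which Kraft (2020) treats effects of this size as large.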
Method
Data
Data come from the NSLM, which, as noted, was a randomized trial conducted with a
nationally representative sample of 9th grade students during the 2015-2016 school year (Yeager,
2019). The NSLM was approved by the IRBs at Stanford University, the University of Texas at
Austin, and ICF International. The current analysis, which focuses on math teachers, was central
to the original design of the study, appeared in our grant proposals, and was referenced as the next
analysis in our previous pre-analysis plan (osf.io/afmb6/). The present study followed the Yeager
et al. (2019) pre-registered analysis plan for every step that could be repeated from the first paper
(e.g., data processing, stopping rule, covariates, and statistical model). Analysis steps that are
unique to the present paper are outlined in detail in the SOM-R and previewed below. There was
no additional pre-registration for the present paper. Instead, we used a combination of a
conservative Bayesian analysis and a series of robustness tests to guard against false positives
and portray statistical uncertainty more accurately for analysis steps not specified in the pre-
registration. The two planned analyses (i.e., for the present paper and Yeager et al., 2019) were
conducted sequentially. The present study’s math teacher variables were not merged with the
student data until after the Yeager et al. (2019) analyses were completed.
The analytic sample included students with a valid condition variable, a math grade, and
their math teacher’s self-reported mindset (see online supplement Table S6). This sample
included 9,167 records (8,775 unique students, as some students had more than one math
teacher) nested within 223 unique teachers. It comprises 76% of the overall NSLM sample of
students with a math grade. Those who are missing data either could not be matched to a math
teacher or their math teacher did not answer the mindset questions. Missing data did not differ by
condition (see online supplement Table S7). We retained students who took two math courses
with different teachers, each of whom completed the survey. Listwise deletion of these students produced
the same results (see Table 2). In terms of math level, 7% of records were from a math class at a
level below Algebra 1, 70% were in Algebra 1, 19% were in Geometry, and 3% were in Algebra
II or above. Students were 50% female and racially diverse: 14% reported being Black/African-
American, 21% Latinx, 6% Asian, 4% Native American or Middle Eastern, and 55% white, and
37% reported mothers with a bachelor’s degree. Teachers’ characteristics were similar to
population estimates: 58% were female, 86% were white, non-Latinx, and 51% reported having
earned a master’s degree; they had been teaching an average of 13.83 years (SD = 9.95).
The previous, between-school analysis (Yeager et al., 2019) examined grades in all
subjects (math, science, English, and social studies). That analysis focused on the pre-registered
group of lower-achieving students (whose pre-treatment grades were below the school median)
because it would be harder to detect improvement among the already higher-achieving students
and because a previous pre-registered study had shown the effects to be concentrated among the
lower achievers (Yeager et al., 2016), which replicated prior work (Paunesku et al., 2015). The
current focus on math teachers and math grades, however, required us to include students at all
achievement levels, a decision we made before seeing the results. This is because classrooms are
smaller units than schools, so excluding half the sample would have left us with too few students
in many teachers’ classrooms and could have made estimates too imprecise. In addition, math
grades are on average substantially lower than in other subjects, probably because students in the
U.S. are tracked into advanced math classes earlier than in other subject areas, which suggests
that students overall tend to be in math classes that they find challenging. This means that fewer
students were already earning As, and more students’ grades could improve in response to an
intervention, particularly one focused on helping students engage with and learn from challenges.
(See Table 2 for supplementary analyses among low-achievers).
Procedure
The NSLM implemented a number of procedures that allowed it to be informative with
respect to contextual sources of intervention effect heterogeneity (Tipton et al., 2019). First,
students were randomly assigned on an individual basis (i.e., within classroom and school) to a
growth mindset intervention or a control group, while math teachers (who were unaware of
condition and study procedures) were surveyed to measure their mindsets. Thus, each teacher in
the analytic sample had some students in the control group and some students in the treatment
group. Consequently, we could estimate a treatment effect for each teacher and examine
variation in effects across teachers. The study procedures appear in Figure 2 and are described in
more detail next. Additional information is reported in the technical documentation available
from ICPSR (Yeager, 2019) and in the supplemental material in Yeager et al. (2019).
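The logic of this design can be sketched in a few lines. Because every teacher has students in both conditions, a treatment-minus-control difference in mean grades can be computed within each teacher and then related to that teacher's mindset score. The sketch below is a deliberately simplified illustration on made-up data: it is not the pre-registered multilevel model or the Bayesian analysis used in this paper, and all names (`within_teacher_effects`, `ols_slope`, the toy records) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def within_teacher_effects(records):
    """Return {teacher_id: mean treated grade - mean control grade}.

    `records` is an iterable of (teacher_id, treated, grade) tuples,
    where treated is 1 for the intervention and 0 for the control group.
    Teachers lacking students in either arm are skipped.
    """
    grades = defaultdict(lambda: {0: [], 1: []})
    for teacher_id, treated, grade in records:
        grades[teacher_id][treated].append(grade)
    effects = {}
    for teacher_id, by_arm in grades.items():
        if by_arm[0] and by_arm[1]:  # both arms are required
            effects[teacher_id] = mean(by_arm[1]) - mean(by_arm[0])
    return effects


def ols_slope(xs, ys):
    """Unweighted least-squares slope of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Toy data (hypothetical): one treated and one control student per teacher.
records = [
    ("t1", 1, 2.0), ("t1", 0, 2.0),
    ("t2", 1, 2.6), ("t2", 0, 2.4),
    ("t3", 1, 3.0), ("t3", 0, 2.6),
    ("t4", 1, 3.0),  # no control student, so no within-teacher estimate
]
teacher_mindset = {"t1": 4.0, "t2": 5.0, "t3": 6.0, "t4": 5.0}

effects = within_teacher_effects(records)
teachers = sorted(effects)
slope = ols_slope(
    [teacher_mindset[t] for t in teachers],
    [effects[t] for t in teachers],
)
print(effects, slope)
```

A positive slope in this toy setup corresponds to larger treatment effects in classrooms whose teachers report more of a growth mindset. A real analysis would also need to account for covariates, the precision of each teacher's estimate, and the nesting of students within teachers and schools.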
Figure 2. The student and teacher data collection procedure in the National Study of Learning
Mindsets. Note: The non-seeing eye icon represents masking of condition assignments from
teachers, students, and researchers. The coin flip icon represents the moment of random
assignment. +About 85% of students took session 1 during the fall semester before the
Thanksgiving break, as planned; the rest took it in January or early February, to accommodate
school schedules. The pre-analysis plan specifying data processing rules can be found here:
osf.io/afmb6/. The data processing was carried out by MDRC, an independent research firm,
while unaware of condition assignment or results.
Data collection and processing. To reduce bias in the research process, three
professional research firms were contracted to form the sample, administer the intervention, and
collect all the data. ICF International selected and recruited a nationally-representative sample of
public schools in the U.S. during the 2015-2016 academic year. Students within those schools
completed online surveys hosted by the firm PERTS, during which they were randomly assigned
to a growth mindset intervention or a control group. The final student response rates were high
(median student response rate across schools: 98%), and the recruited sample of schools closely
matched the population of interest (Gopalan & Tipton, 2018).
Random assignment to condition was conducted by the survey software at the student
level, with 50/50 probability, when students logged on to the survey for the first time. To prevent
expectancy effects, condition information was masked from involved parties, in that students did
not know there were two conditions (i.e. a “treatment” and a “control”) while teachers in the
school were not allowed to “take” the treatment, were not told the hypotheses of the study, and
were not told that students were randomly assigned to alternative conditions. The treatment and
control conditions looked remarkably similar, to reduce the likelihood that teachers saw a
difference. The intervention sessions generally occurred during electives (like health or PE), and
schools were discouraged from conducting sessions in math classes. Math teachers were not used
as proctors (usually, non-teaching staff coordinated data collection) so as to keep math teachers
as unaware of the study as possible. The intervention involved two ~25-minute sessions,
generally 1 to 4 weeks apart, and under 50 minutes in total for nearly all students. Immediately
after the second intervention session, students completed self-reports of mindsets (which served
as a manipulation check).
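As an illustration of student-level assignment at first login, the sketch below draws a condition with equal probability the first time a student identifier is seen and returns the stored condition on every later login. It is a hypothetical sketch of the general technique, not the survey software's actual implementation; the function name, the identifier format, and the seed are assumptions made for the example.

```python
import random

CONDITIONS = ("treatment", "control")


def assign_on_first_login(student_id, assignments, rng):
    """Return the student's condition, drawing it with 50/50 odds only once.

    `assignments` is a dict that persists conditions across logins, so a
    returning student always sees the condition drawn at first login.
    """
    if student_id not in assignments:
        assignments[student_id] = rng.choice(CONDITIONS)
    return assignments[student_id]


rng = random.Random(2015)  # arbitrary seed so the toy example is reproducible
assignments = {}

first = assign_on_first_login("student_001", assignments, rng)
second = assign_on_first_login("student_001", assignments, rng)  # session 2
print(first, second)
```

Storing the draw rather than re-drawing at each login is what keeps a student in the same condition across the two sessions.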
Prior to data collection, schools provided the research firm with a list of all instructors
who taught a math class that academic year with more than two 9th grade students, which is the
definition of a “9th grade math teacher” used here. This sample restriction was necessary because
each teacher would need both treated and control students to provide a within-teacher treatment
effect estimate. All such teachers were invited to complete an approximately one-hour online
survey in return for a $40 incentive, and a large majority of teachers (86.8%) did so. This high
response rate reduced the likelihood that biased non-response could have affected the distribution
or validity of the teacher mindset measure.
The independent research firm ICF International obtained student survey data from the
technology vendor PERTS and administrative data (e.g., grades) from the schools and readied
both for final processing. MDRC, another independent research firm, then processed these data
following a registered pre-analysis plan. They were all unaware of students’ condition
assignments. Only then did our research team access the data and execute the planned analyses.
(In parallel, MDRC developed an independent evaluation report that reproduced the overall
intervention impacts and between-school heterogeneity results, Zhu, Garcia, & Alonzo, 2019).
Growth mindset intervention. The growth mindset intervention presented students with
information about how the brain learns and develops using this metaphor: The brain is like a
muscle that grows stronger (and smarter) when it learns from difficult challenges (Aronson et
al., 2002). Then, the intervention unpacked the meaning of this metaphor for experiences in
school, namely that struggles in school are not signs that one lacks ability but instead that one is
on the path to developing one’s abilities. Trusted sources (scientists, slightly older students, and
prominent individuals in society) provided and supported these ideas. Students were then asked
to generate their own suggestions for putting a growth mindset into practice; for example, by
persisting in the face of difficulty, seeking out more challenging work, asking teachers for
appropriate help, and revising one’s learning strategies when needed, among others.
The intervention involved a number of other exercises designed to help students articulate
the growth mindset, how they could use it in their lives, and how other students like them might
use it. It was deliberately not a lecture or an “exhortation,” so as to avoid the impression that the
intervention was telling young people what to think, since we know that for adolescents an
autonomy-threatening framing could be ineffective or even backfire. Instead, the intervention
treated young people as collaborators in the improvement of the intervention, sharing their own
unique expertise on what it is like to be a high school student. Additional detail on the
intervention (and control) groups appears in the supplement to Yeager et al. (2019) (also see the
SOM-R).
Control group. The control group was provided with interesting information about brain
functioning and its relation to memory and learning, but the program did not mention the
malleability of the brain or intellectual abilities. As in the growth mindset condition, trusted
sources (scientists, older peers, and prominent individuals in society) provided this information,
and students were asked for their opinions and treated as having their own unique expertise. The
graphic art, headlines, and overall visual layout were very similar to the treatment, to help
students and teachers remain masked and to discourage comparison of materials. Because most
students were taking biology at the time, the neuroscience taught in the control group would have
added content above and beyond what students were learning in class and could even have
increased interest in science and in school. Indeed, students have sometimes found the control
material if anything more interesting than the treatment material (Yeager, Romero, et al., 2016).
In sum, the active control condition was designed to provide a rather rigorous test of the
effectiveness of the growth mindset intervention.
Measures
Primary outcome: Math grades. The primary dependent variable was students’ post-
treatment grades in their math course, which were generally recorded 7 or 8 months after the
intervention. All math grades were obtained from schools’ official records. Grades ranged from 0
(an F) to 4.3 (an A+). The mean math GPA was 2.44, leaving considerable room for
improvement for many students.
Grades are the dependent variable of interest, not test scores, for three reasons. First,
grades are typically better predictors of college enrollment and lifespan outcomes than test
scores, and the signaling power of grades is apparent even though schools and teachers could
potentially inflate their grading scales (Pattison, Grodsky, & Muller, 2013). Thus, grades are
relevant for policy and for understanding trajectories of development. Second, grades represent
the accumulation of many different assignments (homework, quizzes, tests) and therefore signal
the kind of dedicated persistence that a growth mindset is designed to instill. Third, test scores
were not an option in this study because 9th grade is not always a grade in which state
achievement tests are administered, and most students did not have a math test score.
Primary moderator: Teacher mindset. Math teachers rated two fixed mindset
statements: “People have a certain amount of intelligence and they really can't do much to
change it” and “Being a top math student requires a special talent that just can’t be taught”
(1=Strongly agree, 6=Strongly disagree, M = 4.74, SD = 0.76). The first is a general fixed
mindset item intended to capture beliefs that might lead to mindset practices that are not specific
to math, such as not allowing students to revise and resubmit their work or discouraging low-
achievers’ questions. The second item captures a belief that could lead to more math-specific
mindset practices (see Leslie, Cimpian, Meyer, & Freeland, 2015). The two items were
correlated (r = .48, p < .001) and were averaged. We scored them so that higher values
corresponded to more growth mindset beliefs. We note that respondent time on this national
math teacher survey was limited to encourage participation and survey completion, so every
construct, even teacher mindset, was limited to a small number of items.
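The scoring of the teacher mindset composite can be sketched as follows. Both items are fixed mindset statements rated from 1 (Strongly agree) to 6 (Strongly disagree), so stronger disagreement already yields a higher number; the sketch therefore assumes that no reverse-coding is needed and simply averages the two responses. That assumption is inferred from the response anchors reported above, and the function name is hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 6  # 1 = Strongly agree, 6 = Strongly disagree


def teacher_growth_mindset(general_item, math_item):
    """Average two fixed-mindset items; higher scores mean more growth mindset.

    Assumes both responses use the 1-6 agreement scale described in the text,
    so that disagreeing with a fixed-mindset statement scores high.
    """
    for response in (general_item, math_item):
        if not SCALE_MIN <= response <= SCALE_MAX:
            raise ValueError(f"response {response} is outside the 1-6 scale")
    return mean([general_item, math_item])


# A teacher who disagrees with the general item (5) and strongly disagrees
# with the math-specific item (6) receives a composite of 5.5.
example_score = teacher_growth_mindset(5, 6)
print(example_score)
```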
The two mindset items used for the composite had not been administered to large samples
of high school math teachers before, so we assessed their concurrent validity by administering
them to a large, pilot sample of high school teachers along with items that assessed teacher
practices (N = 368 teachers). (The details of the sample and the exact item wordings are reported
in the SOM-R.) In the pilot, we found that teachers’ mindsets in fact predicted their endorsement
of practices expected to follow from teachers’ mindsets, based on theory and past research
(Canning et al., 2019; Haimovitz & Dweck, 2017; Leslie et al., 2015; Muenks et al., 2020).
Specifically, teachers’ endorsement of a growth mindset was positively associated with learning-
focused practices, r = .30, p<.001 (e.g., saying to a hypothetical struggling student, “Let’s see
what you don’t understand and I’ll explain it differently,” and not agreeing that, “It slows my
class down to encourage lower achievers to ask questions”). Further, teacher mindsets were
negatively associated with ability-focused practices (emphasizing raw ability and implying that
high effort was a negative sign about ability), r = −.28, p < .001 (e.g., comforting a hypothetical
struggling student with “Don’t worry, it’s okay to not be a math person,” a la Rattan, Good, &
Dweck, 2012, and praising a succeeding student with “You’re lucky that you’re a math person”
or “It’s great that it’s so easy for you”). This is by no means an exhaustive list of potential
mindset teacher practices, and this is certainly not the only way to measure teacher practices. But
this validation study suggests that the teacher mindset measure captures differences in teachers
that extend to classroom practices, practices that the student growth mindset treatment could
either overcome or that could afford the opportunity for it to work.
Potential confounds for teacher mindsets. Because only the student mindset
intervention was randomly assigned, and not teachers’ mindsets, other characteristics of teachers
could be correlated with their mindsets and with the magnitude of the intervention effect. For
instance, perhaps teachers’ growth mindsets are simply a proxy for competent and fair
instructional practices in general. To account for this possibility, we measured several potential
confounds for teacher mindsets: a video-based assessment of pedagogical content knowledge, a
fluid intelligence test for teachers, teachers’ masters-level preparation in education or math, and
an assessment of implicit racial bias. We call these “potential” confounds because, during the
design of the study, these were raised by at least one advisor to the study as something that could
interfere with the interpretation of teacher mindsets (although, in the end, these factors showed
rather weak associations with teacher mindsets; see Table S10). To this list of a priori,
theoretically-motivated teacher confounds, we added teacher race, gender, years teaching, and
whether they had heard of growth mindset before. We describe the potential confounds in the
supplement because their inclusion or exclusion does not change the sign, significance, or
magnitude of the key moderation results. To these potential teacher-level moderators we can also
add the pre-registered school-level moderators (challenge-seeking norms among students/peers,
school achievement level, and school percent racial/ethnic minority; see Yeager et al. 2019).
Adding these school factors in interaction with the treatment did not change the teacher mindset
interaction (see Table 2), suggesting that these factors examined previously (Yeager et al., 2019)
and the classroom-level factors examined here account for independent sources of moderation.
Last, in a post-hoc analysis we examined three student perceptions of the classroom climate that
could be confounded with teacher mindset: the level of cognitive challenge in the course, how
interesting the course was, and how much students thought the teacher was “good at teaching.”
None of these factors were moderators and none altered the teacher mindset interaction (see
Table S11 in the SOM-R).
Manipulation check and moderator: Students’ mindset beliefs. At pre-test and again
at immediate post-test participants indicated their level of agreement with the three fixed-mindset
statements used as a manipulation check by Yeager et al. (2019) (e.g., “You have a certain
amount of intelligence, and you really can’t do much to change it”; 1 = Strongly agree, 6 =
Strongly disagree). We averaged responses (Pre-test, M = 2.95, SD = 1.14, α = .72; Post-test, M
= 2.70, SD = 1.19, α = .78), and higher values corresponded to more of a fixed mindset. An
extensive discussion of the validity of this three-item mindset measure and its relation to the
growth mindset “meaning system” appears in Yeager and Dweck (2020). The scale at pre-test
was used in exploratory moderation analyses. The scale at post-test was used as a planned
manipulation check.
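As an illustration of the scoring step, the sketch below averages three item responses into a scale score and computes an internal-consistency coefficient. It assumes that the reliabilities reported above (.72 and .78) are Cronbach's alpha and that items are reverse-scored so that higher scores mean more of a fixed mindset; neither step is spelled out in this section. The function names are hypothetical and the responses are invented for demonstration.

```python
from statistics import mean, pvariance

SCALE_MAX = 6  # items are answered on a 1-6 scale


def reverse_score(response):
    """Flip a 1-6 response so that higher values mean more of a fixed mindset.

    Assumption: raw coding is 1 = strongly agree with a fixed-mindset statement.
    """
    return SCALE_MAX + 1 - response


def scale_score(responses):
    """Average one student's reverse-scored item responses."""
    return mean(reverse_score(r) for r in responses)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-student response lists."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of responses per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var / total_var)


# Invented responses from five hypothetical students to three items.
demo = [
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
    [4, 5, 5],
]
scores = [scale_score(r) for r in demo]
alpha = cronbach_alpha(demo)
```

Reverse-scoring every item changes the scale scores but leaves alpha unchanged, because alpha depends only on the item variances and the variance of the total.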
Student-level covariates. Student-level control variables related to achievement
included: the pre-treatment measure of low-achieving student status specified in the overall
NSLM pre-analysis plan (osf.io/afmb6/), which indicates that the student received an 8th grade
GPA below the median of other incoming 9th graders in the school; students’ expectations of
how well they would perform in math class (“Thinking about your skills and the difficulty of
your classes, how well do you think you’ll do in math in high school?”; 1=Extremely poorly to
7=Extremely well); students’ racial minority status, gender, and whether their mother had earned
a bachelor’s or above. These covariates were specified in the NSLM pre-analysis plan because
each could be related to achievement, and so a chance imbalance with respect to any of these
within a teacher’s classroom could bias treatment effect estimates. Controlling for these factors
reduces the influence of chance imbalances. Covariates were school-mean-centered.
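School-mean centering can be sketched as follows. This is a minimal illustration, not the study's code: the function name, school labels, and values are hypothetical.

```python
from collections import defaultdict


def center_within_school(values, school_ids):
    """Subtract each school's mean from the values observed in that school."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, school in zip(values, school_ids):
        sums[school] += value
        counts[school] += 1
    return [
        value - sums[school] / counts[school]
        for value, school in zip(values, school_ids)
    ]


# Invented math-expectation ratings (1-7 scale) for six students in two schools.
expectations = [6, 4, 5, 2, 3, 7]
schools = ["A", "A", "A", "B", "B", "B"]
centered = center_within_school(expectations, schools)
```

After centering, each covariate expresses how a student compares with schoolmates, so between-school differences in the covariate no longer enter the model through it.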
Analysis Plan
Estimands. The primary analysis focused on the sign and significance of the student
growth mindset intervention × teacher growth mindset interaction. A positive and significant
interaction would be more consistent with the mindset + supportive context hypothesis.
The primary estimands of interest (i.e., the values we wished to estimate) were the simple
effects listed in Table 1. Row 1 assumes that teacher mindsets are unassociated with other
teacher factors; because this assumption is not sufficiently conservative, row 1 is not our
primary estimand. Row 2 accounts for potential confounding in the interpretation of teachers’
mindsets by fixing the levels of potentially confounding moderators at their population averages
(denoted by c in Table 1) and then examining the moderated effects of teacher mindsets. The key
estimates presented later in Table 2 correspond to the estimands in row 2 of Table 1.
Table 1. Estimands of Interest: Conditional Average Treatment Effects (CATEs).

| | Teachers reporting growth mindsets (i.e., mindset + supportive context) |
|---|---|
| Assuming no confounding of the moderator | $\text{CATE}_{S=\text{Growth}} = E[Y_{ij} \mid T_{ij}=1, S_j=\text{Growth}] - E[Y_{ij} \mid T_{ij}=0, S_j=\text{Growth}]$ |
| Adjusting for potential confounding (primary estimand of interest) | $\text{CATE}_{S=\text{Growth},\,C=c} = E[Y_{ij} \mid T_{ij}=1, S_j=\text{Growth}, C_j=c] - E[Y_{ij} \mid T_{ij}=0, S_j=\text{Growth}, C_j=c]$ |

Note: CATE = Conditional average treatment effect, or the treatment effect within a subgroup. i
indexes students, j indexes teachers, Y = math grades, T (for treatment) = treatment status, S =
teacher mindset, C (for confounds) = vector of teacher mindset confounds, c = population
average for potential teacher or school confounds. See proofs and justifications in Yamamoto
and Yeager (2019).
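The unadjusted estimand in row 1 can be illustrated as a difference in mean outcomes between treated and control students within a teacher-mindset subgroup. The sketch below uses invented records and a hypothetical helper, `cate`; it does not implement the confound adjustment in row 2, which requires a model.

```python
from statistics import mean


def cate(records, in_subgroup):
    """Treated-minus-control difference in mean outcome within a subgroup."""
    treated = [r["y"] for r in records if in_subgroup(r) and r["t"] == 1]
    control = [r["y"] for r in records if in_subgroup(r) and r["t"] == 0]
    return mean(treated) - mean(control)


# Invented student records: y = math grade, t = treatment status,
# s = teacher mindset subgroup.
records = [
    {"y": 3.0, "t": 1, "s": "growth"},
    {"y": 3.2, "t": 1, "s": "growth"},
    {"y": 2.8, "t": 0, "s": "growth"},
    {"y": 3.0, "t": 0, "s": "growth"},
    {"y": 2.5, "t": 1, "s": "fixed"},
    {"y": 2.7, "t": 1, "s": "fixed"},
    {"y": 2.6, "t": 0, "s": "fixed"},
    {"y": 2.6, "t": 0, "s": "fixed"},
]
cate_growth = cate(records, lambda r: r["s"] == "growth")
cate_fixed = cate(records, lambda r: r["s"] == "fixed")
```

Because treatment was randomized within classrooms, this within-subgroup contrast is an unbiased treatment-effect estimate for the subgroup. Whether the subgroup difference is attributable to teacher mindsets is the separate question that row 2 addresses.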
Primary statistical model: Linear mixed effects analysis. The primary analysis
examined the cross-level interaction using a typical multilevel, linear mixed effects model, with
a random treatment effect that varied across teachers and was predicted by teacher-level factors,
but with one twist: fixed teacher intercepts. Such a model has become the standard approach for
multi-site trial heterogeneity analyses (Bloom, Raudenbush, Weiss, & Porter, 2017) because the
fixed intercept for each group prevents biases from chance imbalances in the random assignment
to treatment within small groups. This hybrid (fixed intercept, random slope) approach can
make a big difference in the present analysis, since some teachers may have small numbers of
students and, due to random sampling error, be more likely to have chance imbalances.[iii] This is
why the fixed intercept, random slope model was specified in the NSLM pre-analysis plan
(Yeager et al., 2019). As in all standard multilevel models, the random slope allows different
teachers’ students to have different treatment effects, but uses corrections to avoid overstating
the heterogeneity (called an empirical Bayesian shrinkage estimator). Specifically, the model we
estimate appears in Eq. 1,
$$y_{ij} = \alpha_j + \mathbf{X}_{ij}'\boldsymbol{\beta} + \left[ \gamma_0 T_{ij} + \gamma_1 T_{ij} S_j + T_{ij}\,\mathbf{C}_j'\boldsymbol{\gamma}_2 + u_j T_{ij} \right] + \varepsilon_{ij} \qquad (1)$$
where $y_{ij}$ is the math grade for student i in teacher j’s classroom. At the student level, $\mathbf{X}_{ij}$ is a
vector of k-2 student-level covariates (prior achievement, prior expectations for success,
race/ethnicity, gender, and parents’ education, all school-centered). At the teacher level, $\alpha_j$ is a
fixed intercept for each teacher. The large section in brackets represents the multi-level
moderation portion of the model, our main interest. The student-level treatment status, $T_{ij}$, is
interacted with the continuous measure of teachers’ mindset beliefs ($S_j$) with controls for
potential confounds of teacher mindset beliefs ($\mathbf{C}_j$, a vector that includes implicit bias,
pedagogical content knowledge, fluid intelligence, and teacher master’s certification). The
teacher-level random error is $u_j$ and the student-level error term is $\varepsilon_{ij}$.
The primary hypothesis test concerns the regression coefficient $\gamma_1$, which is the
cross-level student treatment × teacher mindset interaction. When $\gamma_1$ is positive and significant, it
means that treatment effects are higher when teachers’ growth mindset scores are higher. The
case for a stronger interpretation of $\gamma_1$ is bolstered if the coefficient’s sign and significance
persist even when accounting for the potential confounds indexed by $\mathbf{C}_j$. The model in Eq. 1
allows $S_j$ (teachers’ mindsets, the primary moderator) to remain a continuous variable. We
estimated the CATEs in Table 1 by implementing a standard approach in psychology: calculating
the treatment simple effect at -1 SD (teachers reporting relatively more of a fixed mindset) and
+1 SD (teachers reporting relatively more of a growth mindset), while holding confounding
moderators constant. We used the margins post-estimation command in Stata SE to do so. We
call the former teachers “relatively” more fixed mindset because their position on the scale
suggests they are in an intermediate group, not clearly growth mindset, but, on the whole, not
extremely fixed.
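The simple-effects calculation at -1 SD and +1 SD can be sketched as below. This is a minimal illustration of the arithmetic behind a linear interaction, not the Stata margins output: the function name and all numeric inputs are hypothetical, and confounds are assumed to be held at their means (so they drop out of the expression).

```python
def simple_effect(effect_at_mean, interaction, moderator_value, moderator_mean):
    """Treatment effect implied by a linear interaction at a moderator value."""
    return effect_at_mean + interaction * (moderator_value - moderator_mean)


# Hypothetical inputs, chosen only to show the arithmetic.
effect_at_mean = 0.05   # treatment effect for a teacher at the mean mindset score
interaction = 0.10      # change in treatment effect per scale point of teacher mindset
mindset_mean = 4.0
mindset_sd = 0.5

effect_low = simple_effect(
    effect_at_mean, interaction, mindset_mean - mindset_sd, mindset_mean
)
effect_high = simple_effect(
    effect_at_mean, interaction, mindset_mean + mindset_sd, mindset_mean
)
```

Under a linear interaction, the gap between the two simple effects is always twice the interaction coefficient times the moderator's SD, so the two CATEs and the interaction coefficient are three views of the same fitted line.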
Secondary statistical model: Bayesian analysis. The primary model had at least one
major limitation: it presumed that all student and teacher-level variables had linear effects and
did not interact. The pre-analysis plan for the NSLM therefore stated that we would follow up
the primary analysis by using a multi-level application of a flexible but conservative approach
called Bayesian Causal Forest (BCF), which relaxes the assumptions of linearity and of no
higher-order interactions. BCF has been found, in multiple open competitions and simulation
studies, to detect true sources of complex treatment effect heterogeneity while not lending much
credence to noise (Hahn, Murray, & Carvalho, 2020). See Eq. 2:
$$y_{ij} = \alpha_j + \mu(\mathbf{X}_{ij}) + \left[ \tau(S_j, \mathbf{C}_j) + u_j \right] T_{ij} + \varepsilon_{ij} \qquad (2)$$
The BCF model in Eq. 2 retained the key features of the primary statistical model in Eq. 1:
teacher-specific intercepts, student-level covariates, random variation in the treatment effect
across teachers (unexplained by covariates), and potential confounds for teacher mindset beliefs
(collected in the vector $\mathbf{C}_j$). The most notable change is that BCF replaces the additive linear
functions from the primary model with the nonlinear functions $\mu(\cdot)$ and $\tau(\cdot)$. These nonlinear
functions have “sum-of-trees” representations that can flexibly represent interactions and other
non-linearities (thus avoiding the researcher degree of freedom of specifying a functional form),
and that can allow the data to determine how and whether a given covariate contributes to the
model predictions (thus avoiding the researcher degree of freedom of covariate selection). The
nonlinear functions are estimated using machine-learning techniques, with Bayesian Additive
Regression Trees (BART) prior distributions that shrink the functions toward simpler structures
(like additive or nearly additive functions) while allowing the data to speak. See the SOM-R for
more detail about the priors used for BCF.
From the BCF output, there is no single regression coefficient to interpret, as there would
be in a typical linear regression model, because the output of the BCF model is a richer posterior
distribution of treatment effect estimates for each of the 9,167 teacher mindset/student grade
records in the sample. This means that we do not have to set the moderator to + or -1 SD.
Instead, we can summarize the subgroup treatment effects for each level of teacher mindsets,
while holding all of the potential confounds constant at their population means (see Figure 2 for
the plot). We note that conducting subgroup comparisons or hypothesis tests does not entail
changes to the model fit or prior specifications. The data were used exactly one time, to move
from the prior distribution over treatment effects to the posterior distribution. This facilitates
honest Bayesian inference concerning subgroup effects and subgroup differences, and eliminates
concerns with multiple hypothesis testing that can threaten the validity of a frequentist p-value
(Woody, Carvalho, & Murray, 2020).
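The idea of summarizing subgroups from a single set of posterior draws, without refitting the model, can be sketched as follows. The draws here are simulated with Python's random module purely for illustration; they are not BCF output, and the helper names are hypothetical.

```python
import random
from statistics import mean, quantiles


def subgroup_draws(draws_by_record, moderator, level):
    """Average record-level effect draws within one moderator level, draw by draw."""
    idx = [i for i, m in enumerate(moderator) if m == level]
    n_draws = len(draws_by_record[0])
    return [mean(draws_by_record[i][d] for i in idx) for d in range(n_draws)]


def summarize(draws):
    """Posterior mean, central 90% interval, and probability the effect is positive."""
    cuts = quantiles(draws, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    return {
        "mean": mean(draws),
        "interval_90": (cuts[0], cuts[-1]),
        "p_positive": sum(d > 0 for d in draws) / len(draws),
    }


# Simulated stand-in for posterior draws: four records, 500 draws each.
rng = random.Random(0)
teacher_mindset = [4, 4, 5, 5]
assumed_effect = {4: 0.02, 5: 0.10}  # invented values, not estimates
draws_by_record = [
    [rng.gauss(assumed_effect[m], 0.03) for _ in range(500)]
    for m in teacher_mindset
]

low = summarize(subgroup_draws(draws_by_record, teacher_mindset, 4))
high = summarize(subgroup_draws(draws_by_record, teacher_mindset, 5))
```

Every subgroup summary is a function of the same stored draws, which is why additional comparisons do not require touching the data or the prior again.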
The BCF analysis had another advantage: it could accommodate the fact that there were
researcher degrees of freedom about which aspect of math classrooms might moderate the
treatment effect: teacher mindsets, the other teacher variables, or qualities of the schools in
which teachers were embedded. BCF allowed all of these teacher and school factors to have the
same possibility of moderating the treatment effect, and gave them equal likelihood in the prior
distribution. In other words, BCF built uncertainty into the model output, which helped to guard
against spurious findings (see the SOM-R).
Results
Preliminary Analyses
Effectiveness of random assignment. The intervention and control groups did not differ
in terms of pre-random-assignment characteristics (see Table S5 and see Yeager et al. 2019).
Average effect on the manipulation check. The manipulation check was successful on
average. The growth mindset intervention led students to report lower fixed mindset beliefs
relative to the control group (Control M = 2.91, SD = 1.17; Growth mindset M = 2.48, SD =
1.16), t = 16.82, p < .001, d = .37.
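The reported effect size can be checked from the group means and standard deviations. The sketch below assumes a pooled SD computed as the root mean square of the two group SDs, which is appropriate when group sizes are roughly equal; the exact pooling used in the original analysis is not stated here.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference using the root-mean-square of two SDs."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Control vs. growth mindset condition, values as reported in the text.
d = cohens_d(2.91, 1.17, 2.48, 1.16)
print(round(d, 2))  # 0.37
```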
Homogeneity of the manipulation check. The immediate treatment effect on student
mindsets (the beliefs students reported on the post-treatment manipulation check) was not
significantly moderated by teachers’ mindsets, B = .04 [95% CI: -.031, .102], t = 1.04, p = .297.
Further, there was very little cross-teacher variability in effects on the manipulation checks to
explain. According to the BCF model’s posterior distribution, the standard deviation of the
intervention effect across teachers was just 5% of the average intervention effect, which means
that the posterior prediction interval ranged from 90% to 110% of the average intervention
effect, a very narrow range. Here is what this means: treated students, regardless of their math
teachers’ mindsets, ended the intervention session with similarly strong growth mindsets that could
be tried out. If we later found heterogeneous effects on math grades, measured months into the
future, it could reflect differences in the affordances that allowed students to act on their
mindsets in class.
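The step from a standard deviation of 5% to a range of 90% to 110% can be reproduced with simple arithmetic. The sketch assumes the range spans two standard deviations on either side of the average (roughly a 95% interval under approximate normality); the multiplier is an assumption, since it is not stated in the text.

```python
def effect_range(sd_share, n_sd=2.0):
    """Range of teacher-specific effects as a share of the average effect."""
    return (1.0 - n_sd * sd_share, 1.0 + n_sd * sd_share)


# SD of the effect across teachers = 5% of the average effect.
low, high = effect_range(0.05)
print(round(low, 2), round(high, 2))  # 0.9 1.1
```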
Preliminary analyses of effect on math grades. A previous paper (Yeager et al., 2019,
Extended Data Table 1) and an independent impact evaluation (Zhu et al., 2019) reported the
significant main effect of the growth mindset treatment on math grades for the sample overall (p
= .001). Next, the present study’s BCF model found that there was about as much heterogeneity
in treatment effects across teachers (47% of the variation) as there was across schools (49%, with
the remaining 4% of variation coming from covariation between the two). Combined, these
analyses mean that the present paper was justified in focusing on heterogeneity in the treatment
effect on math grades independently from the school factors reported by Yeager et al. (2019).
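One way percentages like these can arise is from decomposing the variance of a sum: if a classroom's treatment effect is the sum of a teacher-related part and a school-related part, the total variance equals the two variances plus twice their covariance. The sketch below illustrates that identity with invented numbers; it is not the BCF model's own decomposition, whose details are not given in this section.

```python
from statistics import mean


def pcov(x, y):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def variance_shares(teacher_part, school_part):
    """Shares of Var(teacher + school) due to each part and to their covariation."""
    var_t = pcov(teacher_part, teacher_part)
    var_s = pcov(school_part, school_part)
    cov2 = 2 * pcov(teacher_part, school_part)
    total = var_t + var_s + cov2
    return var_t / total, var_s / total, cov2 / total


# Invented, uncorrelated components for four hypothetical classrooms.
teacher_part = [0.02, -0.02, 0.02, -0.02]
school_part = [0.02, 0.02, -0.02, -0.02]
shares = variance_shares(teacher_part, school_part)
```

With these invented inputs the two parts are uncorrelated, so each accounts for half of the variance and the covariation share is zero.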
Primary Analyses: Moderation by Teachers’ Mindsets
Linear mixed effects model. Teachers’ mindsets positively interacted with the
intervention effect on math grades: Student intervention × Teacher mindset interaction B = .09
[95% CI: .026, .150], t = 2.79, p = .005 (see Eq. 1). This result was robust to changes to the
model, including consideration of the school-level moderators previously reported by Yeager et
al. (2019), and changes in the sub-sample of participating students (see Table 2).
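As a consistency check on the reported statistics, the standard error and test statistic can be approximately recovered from the confidence interval. The sketch below assumes a symmetric interval and a normal approximation with a critical value of 1.96, so it will differ slightly from the model-based t test; the function name is hypothetical.

```python
from math import erfc, sqrt


def wald_from_ci(lower, upper, z_crit=1.96):
    """Recover estimate, SE, z statistic, and two-sided p from a symmetric 95% CI."""
    estimate = (lower + upper) / 2
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value under the normal approximation
    return estimate, se, z, p


# Interval reported for the intervention-by-teacher-mindset interaction.
estimate, se, z, p = wald_from_ci(0.026, 0.150)
```

The interval midpoint is .088, which rounds to the reported B = .09, and the implied z statistic and p-value are close to the reported t = 2.79 and p = .005.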
Thus the data were consistent with the mindset + supportive context hypothesis: the
intervention could alter students’ mindsets, but a growth-affording context was necessary for
students’ grades to be improved. Students whose teachers did not clearly endorse growth mindset
beliefs showed a significant manipulation check effect immediately after the treatment, but their
math grades did not improve.
Effect sizes. The CATEs (conditional average treatment effects) for students with more
fixed versus more growth mindset teachers are presented in Table 2. The effect for students in
classrooms with growth mindset teachers was 0.11 grade points and was significant at p<.001,
and there was no significant effect in classrooms of teachers reporting more of a fixed mindset
(compare columns 2 and 3). Notably, our primary analyses did not exclude students whose
grades could not have been lifted any further. If we limit our sample to the three-fourths of
students who were not already making straight As across all of their core classes before the
study, and who therefore had room to improve their grades, the estimated effect among students
in classrooms with growth mindset teachers becomes slightly larger, .14 grade points (see row 5,
Table 2).
The present analysis included a representative sample and used “intent-to-treat” analyses.
This means that we included students who could not speak or read English, who had visual or
physical impairments, who had attentional problems, whose computers malfunctioned, and more.
Thus, there were many students in the data who could not possibly have shown treatment effects.
This study therefore estimates effects that could be anticipated under naturalistic circumstances.
Table 2. Effect of Growth Mindset Intervention on Math Grades in 9th Grade Among
Students with Fixed Versus Growth Mindset Math Teachers, Estimated in Linear Mixed
Effects Models.

| Model specification | Teachers reporting more of a fixed mindset | Teachers reporting more of a growth mindset | Student intervention × Teacher mindset (continuous) interaction |
|---|---|---|---|
| Primary Model Specification | | | |
| Teacher mindset as moderator + potential teacher confounds (N = 9,167) | CATE = -.02 [-.074, .038], t = -0.63, p = .531 | CATE = .11 [.046, .167], t = 3.46, p < .001 | B = .09 [.026, .150], t = 2.79, p = .005 |
| Robustness Test: Accounting for School-Level Moderators from Yeager et al. (2019) | | | |
| Plus school-level moderators (N = 9,167) | CATE = -.02 [-.075, .039], t = -0.61, p = .542 | CATE = .11 [.045, .168], t = 3.37, p < .001 | B = .09 [.025, .151], t = 2.76, p = .006 |
| Robustness Tests: Alternative Sub-samples# | | | |
| Only students with only one math teacher (N = 8,383) | CATE = -.04 [-.108, .026], t = -1.20, p = .230 | CATE = .11 [.040, .170], t = 3.18, p = .001 | B = .09 [.028, .159], t = 2.81, p = .005 |
| Only previously lower-achieving (i.e., below-median pre-intervention GPA) students (N = 4,811) | CATE = .02 [-.050, .097], t = 0.63, p = .527 | CATE = .13 [.067, .196], t = 4.01, p < .001 | B = .09 [.008, .165], t = 2.17, p = .030 |
| Only students previously without straight As (N = 6,958) | CATE = -.01 [-.062, .041], t = -0.39, p = .696 | CATE = .14 [.071, .203], t = 4.07, p < .001 | B = .11 [.040, .180], t = 3.10, p = .002 |

Note: CATE = Conditional average treatment effect, in GPA units (0 to 4.3 scale) estimated with
the margins postestimation command in Stata SE, holding potentially-confounding moderators
constant at their population means. All CATEs estimated using teacher survey weights provided
by ICF International to make the estimates generalizable to the nation as a whole. Teachers with
more of a growth mindset in this analysis are those reporting mindset at +1 SD for the continuous
teacher mindset measure, while teachers with more of a fixed mindset are at -1 SD. Numbers in
brackets represent 95% confidence intervals. Regression model specified in Eq. 1. B =
unstandardized regression coefficient (i.e., expected treatment effects on GPA). The previously
lower-achieving subgroup was the pre-registered subgroup in Yeager et al. (2019). # Models
included all teacher-level moderators.
Bayesian machine-learning analysis. The BCF analyses yielded conclusions consistent
with the primary linear mixed effects model. First, there was a positive zero-order correlation of
r(223) = .55 between teachers’ mindsets and the estimated magnitude of the classroom’s
treatment effect (i.e., the posterior mean for the CATE for each teacher), which mirrors the
moderation results of the primary linear model. Figure 2, which depicts the posterior distribution
for each level of teacher mindset, holding all other moderators constant at the population mean,
shows no overlap between the interquartile range (IQR) for teachers with more of a growth
mindset (5 or 5.5) and the IQR for teachers with more of a fixed mindset (4 or lower). This
supports the conclusion of a positive interaction, again consistent with the mindset + supportive
context hypothesis.
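The interquartile-range comparison can be expressed as a simple interval check on posterior draws. The draws below are invented and evenly spaced for clarity; they are not the model's output, and the helper names are hypothetical.

```python
from statistics import quantiles


def iqr(draws):
    """Return the 25th and 75th percentiles of a set of posterior draws."""
    q1, _, q3 = quantiles(draws, n=4)
    return q1, q3


def intervals_overlap(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented draws of the treatment effect for two teacher-mindset groups.
growth_draws = [0.08 + 0.001 * i for i in range(41)]   # spans 0.08 to 0.12
fixed_draws = [-0.02 + 0.001 * i for i in range(41)]   # spans -0.02 to 0.02

overlap = intervals_overlap(iqr(growth_draws), iqr(fixed_draws))
```

Non-overlapping IQRs are a descriptive summary of separation between two posterior distributions rather than a formal test; the posterior distribution of the difference between groups gives a more direct comparison.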
The model also shows that teachers who strongly endorse growth mindset beliefs show a
positive average intervention effect greater than zero with approximately 100% certainty (see
Figure 2), confirming the results of the simple effects analysis from the linear model. We note
again that the BCF model is relatively conservative. It utilizes a prior distribution centered at a
homogeneous treatment effect of zero. This should be taken as strong evidence of moderation
and strong evidence that the intervention was effective for students of growth mindset teachers.
The BCF analysis also yielded new evidence that extended the primary linear model’s results.
Figure 2 shows that teachers’ growth mindsets were related to higher treatment effect sizes in a
linear fashion for most of the distribution, but there was no marginal increase in treatment effects
when teachers endorsed a growth mindset to an even greater extent once they were already high
on the scale (see the rightmost groups of teachers in Figure 2). The non-linearity, discovered by
the BCF analysis, should invite further investigation into whether teachers already endorsing a
very high growth mindset are using practices that encourage all of their students (even those in
the control group) to engage in growth mindset behaviors, potentially narrowing the contrast
between treatment and control group students.
Figure 2. Evidence for the mindset + supportive context hypothesis regarding teacher mindsets
and a student mindset intervention, up to a point, in a flexible Bayesian Causal Forest
model. Note: Posterior distributions are of the conditional average treatment effect (CATE), as a
proportion of the average treatment effect (ATE). Thus, 100% means the CATE is equal to the
population ATE. Red dots represent the estimated intervention effect (posterior means) at each
level of teacher mindset. The widths of the bars, from wide to narrow, represent the middle 50%
(i.e., IQR), 80% and 90% of the posterior distribution, respectively. The teacher mindset measure
ranges from 1 to 6, but the x-axis stops at 3 because only five teachers had a mindset score below
this and the model cannot make precise predictions with so few teachers. The dashed vertical line
represents the population mean for teacher mindsets.
Exploratory Analyses of Baseline Student Mindsets
The brief, direct-to-student growth mindset intervention did not appear to overcome local
contextual factors that can suppress achievement (e.g., a teacher with a fixed mindset). Could it
address individual risk factors suppressing achievement, such as the student’s own fixed
mindset? A slight suggestion of this possibility appeared in one of the original growth-mindset
intervention experiments (Blackwell, Trzesniewski, & Dweck, 2007); a student’s prior growth
mindset negatively interacted with the intervention effect, but the result was imprecise (p = .07).
To revisit this question, we added students’ baseline mindsets as a moderator in the present
study’s primary linear mixed effects model. We found a significant negative interaction with
student baseline growth mindsets, B = -.06 [95% CI: -.098, -.018], t = -2.85, p = .004, suggesting
stronger effects for students with more of a fixed mindset. Thus the (marginal) Blackwell et al.
(2007) moderation finding was borne out. This interaction was additive with, but not interactive
with, the teacher mindset interaction, which did not change in magnitude or significance by
including the student mindset interaction (two-way still p = .005; three-way interaction p > .20).
Exploring the CATEs, students reporting more fixed mindsets at baseline (-1 SD), in classrooms
with a teacher reporting more of a growth mindset (+1 SD), showed an intervention effect on
their math grades of 0.16 grade points [0.079, 0.234], t = 3.957, p < .001. By contrast, there
was no significant effect among students who already reported a strong growth mindset in
growth mindset classes and, as noted, no effect overall in more fixed-mindset classes.
Discussion
In this nationally-representative, double-blind clinical trial, successfully teaching a
growth mindset to students lifted math grades overall, but this was not enough for all students to
reap the benefits of a growth mindset intervention. Supportive classroom contexts also mattered.
Students who were in classrooms with teachers who espoused more of a fixed mindset did not
show gains in their math grades over 9th grade compared to the control group, whereas students
in classrooms with more growth mindset teachers showed meaningful gains. This finding
suggests that students cannot simply carry their newly enhanced growth mindset to any
environment and implement it there. Rather, the classroom environment needs to support, or at
least permit, the mindset, by providing necessary affordances (see Walton & Yeager, 2020).
In addition, we discovered that students who formerly reported more of a fixed mindset
and who went back into a classroom with a teacher who had more of a growth mindset showed
larger gains in achievement than did students who began the study with more of a growth
mindset. This finding supports the Walton and Yeager (2020) hypothesis that individuals at the
intersection of vulnerability (prior fixed mindset) and opportunity (high affordances) are the
most likely to benefit from psychological interventions.
The national sampling, and the use of an independent firm to administer the intervention,
permits strong claims of generalizability to U.S. public high school math classrooms. Future
studies could use or adapt a similar methodology to assess generalizability to other age groups,
content areas, or cultural contexts. In general, materials may need to be adapted, sometimes
extensively (see Yeager et al., 2016), to be appropriate to new settings.
A main limitation in our study is that teachers’ mindsets were measured, not manipulated.
The fact that teacher mindsets were moderators above and beyond other teacher confounders
lends support to our hypotheses about the importance of classroom affordances. But more
research is needed to determine whether teachers’ mindset beliefs, or the practices that follow
from them, play a direct, causal role. Thus, the mindset × context approach opens the window to
a new, experimental program of research.
If a future experimental intervention targeted both students and teachers, what kinds of
moderation patterns might be expected? There, we actually might see the largest effects for
formerly fixed mindset teachers. That is, the benefits of planting a seed and fertilizing the soil
should be greatest where soil was formerly inhospitable, and smaller where the soil was already
adequate.
In general, we view the testing and understanding of the causal effect of teacher mindsets
as the next step for mindset science, followed, if successful, by the creation of programs to
promote more growth-mindset-sustaining classroom practices. Such research will be challenging
to carry out, however. For example, we do not think it will be enough to simply copy or adapt the
student intervention and provide it to teachers. A new intervention for teachers will need to be
carefully developed and tested. We do not yet know which teacher beliefs or practices (or
combinations thereof) may be most important in which learning environments. Even if we did,
there is much to be learned about how to best encourage and support key beliefs and practices in
teachers. The current findings, along with other recent findings about the importance of
instructors’ mindsets in promoting achievement for all groups and reducing inequalities between
groups (Canning et al., 2019; Leslie et al., 2015; Muenks et al., 2020), point to the urgency and
value of this research.
References
Bailey, D. H., Duncan, G. J., Cunha, F., Foorman, B. R., & Yeager, D. S. (in press). Persistence
and fadeout of educational intervention effects: Mechanisms and potential solutions.
Psychological Science in the Public Interest.
Blackwell, L. S., Trzesniewski, K. H., & Dweck, C. S. (2007). Implicit theories of intelligence
predict achievement across an adolescent transition: A longitudinal study and an
intervention. Child Development, 78(1), 246–263. doi: 10.1111/j.1467-8624.2007.00995.x
Bloom, H. S., Raudenbush, S. W., Weiss, M. J., & Porter, K. (2017). Using multisite
experiments to study cross-site variation in treatment effects: A hybrid approach with
fixed intercepts and a random treatment coefficient. Journal of Research on Educational
Effectiveness, 10(4), 817–842. doi: 10.1080/19345747.2016.1264518
Boulay, B., Goodson, B., Olsen, R., McCormick, R., Darrow, C., Frye, M., … Sarna, M. (2018).
The investing in innovation fund: Summary of 67 evaluations (No. NCEE 2018-4013).
Washington, DC: National Center for Education Evaluation and Regional Assistance,
Institute of Education Sciences, U.S. Department of Education.
Brady, S. T., Cohen, G. L., Jarvis, S. N., & Walton, G. M. (2020). A brief social-belonging
intervention in college improves adult outcomes for black Americans. Science Advances,
6(18), eaay3689. doi: 10.1126/sciadv.aay3689
Bronfenbrenner, U. (1977). Toward an experimental ecology of human development. American
Psychologist, 32(7), 513–531. doi: 10.1037/0003-066X.32.7.513
Bryan, C. J., Tipton, E., & Yeager, D. S. (in press). Behavioural science is unlikely to change the
world without a heterogeneity revolution. Nature Human Behaviour.
Canning, E. A., Muenks, K., Green, D. J., & Murphy, M. C. (2019). STEM faculty who believe
ability is fixed have larger racial achievement gaps and inspire less student motivation in
their classes. Science Advances, 5(2), eaau4734. doi: 10.1126/sciadv.aau4734
Carroll, J. M., Muller, C., Grodsky, E., & Warren, J. R. (2017). Tracking health inequalities from
high school to midlife. Social Forces, 96(2), 591–628. doi: 10.1093/sf/sox065
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014). Measuring the impacts of teachers I:
Evaluating bias in teacher value-added estimates. American Economic Review, 104(9),
2593–2632. doi: 10.1257/aer.104.9.2593
Cronbach, L. J. (1957). The two disciplines of scientific psychology. American Psychologist,
12(11), 671.
Dweck, C. S., & Yeager, D. S. (2019). Mindsets: A view from two eras. Perspectives on
Psychological Science. doi: 10.1177/1745691618804166
Easton, J. Q., Johnson, E., & Sartain, L. (2017). The predictive power of ninth-grade GPA.
Chicago, IL: University of Chicago Consortium on School Research. Retrieved from
https://consortium.uchicago.edu/sites/default/files/publications/Predictive%20Power%20of%20Ninth-Grade-Sept%202017-Consortium.pdf
Gelman, A., & Loken, E. (2014). The statistical crisis in science. American Scientist, 460–465.
doi: 10.1511/2014.111.460
Gibson, J. J. (1977). The theory of affordances. In R. Shaw & J. Bransford (Eds.), Perceiving,
Acting, and Knowing (pp. 67–82). Hillsdale, NJ: Lawrence Erlbaum.
Gopalan, M., & Tipton, E. (2018). Is the National Study of Learning Mindsets
nationally-representative? Retrieved from https://psyarxiv.com/dvmr7/
Hahn, P. R., Murray, J. S., & Carvalho, C. M. (2020). Bayesian regression tree models for causal
inference: Regularization, confounding, and heterogeneous effects. Bayesian Analysis.
doi: 10.1214/19-BA1195
Haimovitz, K., & Dweck, C. S. (2017). The origins of children’s growth and fixed mindsets:
New research and a new proposal. Child Development, 88(6), 1849–1859. doi:
10.1111/cdev.12955
Harackiewicz, J. M., & Priniski, S. J. (2018). Improving student outcomes in higher education:
The science of targeted intervention. Annual Review of Psychology, 69, 409–435. doi:
10.1146/annurev-psych-122216-011725
Hembree, R. (1990). The nature, effects, and relief of mathematics anxiety. Journal for Research
in Mathematics Education, 33–46. doi: 10.2307/749455
Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for
interpreting effect sizes in research. Child Development Perspectives, 2(3), 172–177. doi:
10.1111/j.1750-8606.2008.00061.x
Kraft, M. A. (2020). Interpreting Effect Sizes of Education Interventions. Educational
Researcher, 49(4), 241253. doi: 10.3102/0013189X20912798
Lazarus, R. S. (1993). From psychological stress to the emotions: A history of changing
outlooks. Annual Review of Psychology, 44(1), 122. doi:
10.1146/annurev.ps.44.020193.000245
Teacher and Student Mindsets
34
Leslie, S.-J., Cimpian, A., Meyer, M., & Freeland, E. (2015). Expectations of brilliance underlie
gender distributions across academic disciplines. Science, 347(6219), 262–265. doi:
10.1126/science.1261375
Lewin, K. (1952). Field theory in social science: Selected theoretical papers (D. Cartwright,
Ed.). London, England: Tavistock. Retrieved from
http://trove.nla.gov.au/version/21157377
McShane, B. B., Tackett, J. L., Böckenholt, U., & Gelman, A. (2019). Large-scale replication
projects in contemporary psychological research. The American Statistician, 73(sup1),
99–105. doi: 10.1080/00031305.2018.1505655
Miller, D. I. (2019). When Do Growth Mindset Interventions Work? Trends in Cognitive
Sciences, 23(11), 910–912. doi: 10.1016/j.tics.2019.08.005
Muenks, K., Canning, E. A., LaCosse, J., Green, D. J., Zirkel, S., Garcia, J. A., & Murphy, M. C.
(2020). Does my professor think my ability can change? Students’ perceptions of their
STEM professors’ mindset beliefs predict their psychological vulnerability, engagement,
and performance in class. Journal of Experimental Psychology: General. doi:
10.1037/xge0000763
OECD. (2021). Sky’s the limit: Growth mindset, students, and schools in PISA. Paris: PISA,
OECD Publishing. Retrieved from PISA, OECD Publishing website:
https://www.oecd.org/pisa/growth-mindset.pdf
Pattison, E., Grodsky, E., & Muller, C. (2013). Is the sky falling? Grade inflation and the
signaling power of grades. Educational Researcher, 42(5), 259–265. doi:
10.3102/0013189X13481382
Rattan, A., Good, C., & Dweck, C. S. (2012). “It’s ok — not everyone can be good at math”:
Instructors with an entity theory comfort (and demotivate) students. Journal of
Experimental Social Psychology, 48(3), 731–737. doi: 10.1016/j.jesp.2011.12.012
Tipton, E., Yeager, D. S., Iachan, R., & Schneider, B. (2019). Designing probability samples to
study treatment effect heterogeneity. In P. J. Lavrakas (Ed.), Experimental Methods in
Survey Research: Techniques That Combine Random Sampling with Random Assignment
(pp. 435–456). New York, NY: Wiley. Retrieved from
https://onlinelibrary.wiley.com/doi/abs/10.1002/9781119083771.ch22
Walton, G. M., & Wilson, T. D. (2018). Wise interventions: Psychological remedies for social
and personal problems. Psychological Review, 125(5), 617–655. doi:
10.1037/rev0000115
Walton, G. M., & Yeager, D. S. (2020). Seed and soil: Psychological affordances in contexts
help to explain where wise interventions succeed or fail. Current Directions in
Psychological Science, 29(3), 219–226. doi: 10.1177/0963721420904453
Woody, S., Carvalho, C. M., & Murray, J. S. (2020). Model interpretation through lower-
dimensional posterior summarization. ArXiv:1905.07103 [Stat]. Retrieved from
http://arxiv.org/abs/1905.07103
Yamamoto, T., & Yeager, D. S. (2019). Causal mediation and effect modification: A unified
framework. Working Paper, MIT.
Yeager, D. S. (2019). The National Study of Learning Mindsets, [United States], 2015-2016.
Inter-university Consortium for Political and Social Research [distributor]. doi:
10.3886/ICPSR37353.v1
Yeager, D. S., Hanselman, P., Walton, G. M., Murray, J. S., Crosnoe, R., Muller, C., … Dweck,
C. S. (2019). A national experiment reveals where a growth mindset improves
achievement. Nature, 573(7774), 364–369. doi: 10.1038/s41586-019-1466-y
Yeager, D. S., Romero, C., Paunesku, D., Hulleman, C. S., Schneider, B., Hinojosa, C., …
Dweck, C. S. (2016). Using design thinking to improve psychological interventions: The
case of the growth mindset during the transition to high school. Journal of Educational
Psychology, 108(3), 374–391. doi: 10.1037/edu0000098
Zhu, P., Garcia, I., & Alonzo, E. (2019). An independent evaluation of growth mindset
intervention. New York, NY: MDRC. Retrieved from MDRC website:
https://files.eric.ed.gov/fulltext/ED594493.pdf
i. The mindset + supportive context, or “affordances,” hypothesis is akin to what Bailey and colleagues (2020) call
the “sustaining environments” hypothesis, which is the idea that intervention effects will fade out when people enter
post-intervention environments that lack adequate resources for an intervention to continue paying dividends.
ii. Lazarus (1993) summarized well the field’s pejorative view of treatment effect heterogeneity: “psychology has
long been ambivalent … opting for the view that its scientific task is to note invariances and develop general laws.
Variations around such laws are apt to be considered errors of measurement” (p. 3).
iii. An exploratory analysis allowed the intercept and slope to vary randomly. It showed the same sign and
significance of results and supported the same conclusions as the pre-registered fixed intercept, random slope model.
Teacher Mindsets Help Explain Where a
Growth Mindset Intervention Does and Doesn’t Work:
Supplemental Online Materials Reviewed
This online supplement contains the following information:
- An explanation of how the present paper aligns with the NSLM pre-registration.
- Details of the Bayesian Causal Forest analysis.
- Representativeness of the analytic sample (Tables S1 to S3)
- Screenshots of the treatment and control conditions (reproduced from Yeager et al., 2019)
- Descriptive statistics for student-level, teacher-level, and school-level variables (Table S4)
- Student-level descriptive statistics by treatment condition (Table S5)
- Description of exclusions to the sample due to merging student and teacher data (Table S6)
- Student-level descriptive statistics by sample selection (Table S7)
- Student-level measures in the National Study of Learning Mindsets
- Teacher-level measures in the National Study of Learning Mindsets
- Cross-level interaction effect coefficients of the measured teacher-level confounders (Table S8)
- Correlations between teacher growth mindset beliefs and teacher practices from an
independent concurrent validity pilot sample (Table S9)
- Correlations between teacher growth mindset beliefs and other teacher- and school-level
moderators (Table S10)
- Post-hoc analysis of student perceptions of the classroom climate that could confound the
teacher mindset analysis (Table S11).
Alignment with the Pre-Analysis Plan
Here we explain how our analyses followed the analysis plan (https://osf.io/afmb6/) and how we
made decisions about analysis steps that were not described in the pre-analysis plan.[1] We
summarize the alignment with the pre-registration in the table below.
The pre-registered analysis plan’s primary question (RQ4) focused on school-level moderators.
At the end of the plan we stated that “we plan to study whether variance in the treatment impact
across math classes varies due to characteristics of teachers and classrooms” (pg. 15). In the
present paper, we did this. We performed analyses following the methods of the school-level pre-
registered plan, replacing school-level factors with teacher-level factors.
No changes were made to the underlying dataset except for the exclusions listed in the paper
(only including students whose data could be matched to teachers’ mindset reports), which
means that this analysis follows and builds on the extensive pre-registered steps for processing
the grades dataset. We further note that all decisions about data processing were made by third
parties (ICF International, which obtained the grades data, and MDRC, which processed them
and identified math courses). Therefore in the present document we do not review in detail the
many data processing steps that were pre-registered and followed.
| Pre-registration component | Pre-registration | Present paper |
|---|---|---|
| Public record of the study and all measures on OSF/ICPSR | Yes | Pre-registration applies |
| Disclose all manipulations | Yes | Pre-registration applies |
| Data processing rules | Yes | Pre-registration applies |
| Sample size and stopping rule | Yes | Pre-registration applies |
| Covariates | Yes | Pre-registration applies |
| Primary statistical model (fixed intercept, random slope) | Yes | Pre-registration applies |
| Secondary statistical model (Bayesian Causal Forest) | Yes | Pre-registration applies |
| Student subgroup operationalization (i.e., low-achieving student) | Yes | Pre-registration applies to the definition of the low-achiever group. This group was tested in a supplemental analysis. A non-pre-registered group (all students) is primary here. |
| Outcome variable | Overall GPA | The pre-registration said that we would focus on math classes next. This refers to the use of math GPA, as in the current study. |
| School-level moderators | Yes | Pre-registration applies |
| Teacher-level moderators | No | The overall focus on teachers was mentioned in the pre-registration, but the specific variables were not. |

[1] The pre-analysis plan stated that we would pre-register a new plan for the between-teacher analyses. However we
were already unblinded to the data, and so this would not have been useful. Therefore, in this document we have
been transparent about how we made each choice, and then we reported robustness analyses for aspects of the
analyses that allowed for degrees of freedom. Furthermore, we note that the Bayesian machine-learning analysis
(which was mentioned in the analysis plan) builds in penalties for uncertainties about model specifications. This
helps our results to be even more conservative, and robust to researcher degrees of freedom.
Research Question: Do teacher-level factors explain the variability in the size of the CATE
of the GM on math GPA in U.S. public high schools? (compare to RQ4 in the plan)
- We hypothesized that the intervention effect would vary with respect to what the analysis
plan called “mindset saturation level”, which we define here as math teachers’ growth
mindsets. In a previous paper (Yeager et al., 2019), mindset saturation was examined at
the school level and was defined by peers’ challenge-seeking behavioral norms. We
showed that the intervention effect varied by peer challenge-seeking norms at the school
level. Teachers’ mindsets, the focus of the current paper, are uniquely predictive above and
beyond the school-level peer norms, as shown in Table 2.
- We had two competing directional hypotheses about contexts interacting with student
growth mindset interventions. Here is what we stated in the analysis plan:[2]
i. Larger effects on GPA in higher mindset saturation [contexts]. The
reason why is that the environment reinforces the message over time.
Giving the intervention in a high mindset saturation [context] is like
“planting a seed in tilled soil”. We now call it “fertile” soil. We have
labeled this hypothesis the “mindset + context” hypothesis in the present
paper. The label is different but it is the same hypothesis that we pre-registered.
ii. Larger effects on GPA in lower mindset saturation [contexts]. The reason
why is that in high mindset saturation [contexts] students are already
receiving growth mindset from their teachers and peers (because the
control group is getting “treated”) – the intervention is a drop in the
bucket. Meanwhile, in lower mindset saturation [contexts], students are
most in need of a growth mindset; the intervention is like “water on
parched soil.” In the present paper we call this the “mindset only”
hypothesis, because implied in this second alternative is the hypothesis
that students can overcome a fixed mindset classroom. Although the label
is different, it is the same hypothesis that we pre-registered.
[2] In the pre-analysis plan, it said “schools” in this predictions section. We have replaced it with [contexts] to
facilitate the application to teachers, which is the level we extended these predictions to in the present paper.
- Here we find support for the “seed in fertile soil” hypothesis (called the mindset +
context hypothesis in the paper), just like the previous between-school analysis (Yeager
et al., 2019). In this case, we find that there are larger effects of the treatment on math
GPA in classrooms with a teacher who had a higher growth mindset. The Bayesian
analysis finds hints of support for the “drop in the bucket” hypothesis at the very high
end, and we discuss this in the paper in two places, following our pre-registered analysis
plan.
- The analysis plan stated that we would test a “hybrid” mixed effects model (teacher fixed
intercepts and random slope), using survey weights, and that is what we did.
- The analysis plan said that we would turn to math GPA next, and this is what we did.
Math GPA is the outcome because math teachers participated in the survey, thus only
math grades were relevant to math teachers’ mindsets.
- Our main results do not focus on previously low-performing students because math
grades overall were lower than grades in other subjects, giving more students opportunities for
growth. As a supplementary analysis, we present in Table 2 the results among previously low-
performing students and among students whose pre-treatment GPA was below 4.00.
Table 2 shows that the primary results we report are conservative. Excluding
higher-achieving students yielded directionally larger effect sizes.
- The analysis plan stated that “with consultation from statisticians, we will evaluate
potential non-parametric models to examine the independent and interactive impact of
school-level moderators on between-school variability in the treatment (e.g. likely a
variation on Bayesian Additive Regression Trees).” We have done this, focusing on
teacher-level moderators, by including the Bayesian Causal Forest (BCF) analysis, which
builds on the BART methodology, as described by Hahn et al. (2020).
- The analysis plan stated that we would include three school-level factors in the models:
mindset norms, school achievement level, and school minority percentage. These school-
level variables were of interest in Yeager et al. (2019). Here, when we included these
school-level factors in our primary model, the results were unchanged (see Table 2 in the
paper).
- The analysis plan stated that we would test for a significant reduction in variability in the
treatment effect once accounting for the context-level factor. This could not be
estimated in the lmer models, and so instead we used the BCF analyses to understand
how much of the variability in treatment effects was explained by school versus teacher
factors. The answer was that they were equally important, explaining about half of the
variability, as reported in the paper. Thus our paper includes variability statistics that are
more advanced and reliable than what we pre-registered.
- The analysis plan stated that we would conduct continuous and categorical analyses of
the school-level moderators. Here, when focusing on teacher mindsets, we only
conducted continuous analyses because they are more conservative than cutting teacher
mindsets into sub-groups. Further, the BCF analysis makes this point moot because the
BART algorithm can cut the results into subgroups using machine learning rather than
researcher-selected cutpoints, which can be arbitrary. Rather than choose cutpoints, in
Figure 2 in the paper we simply present data for all of the points of the scale.
- The analysis plan listed several exploratory analyses that were conducted for the NSLM
(see the online supplement for Yeager et al., 2019). We did not re-conduct them here
because they primarily focused on school-level factors that are not of interest here. These
exploratory analyses would also have inflated Type-I error rates due to multiple
hypothesis testing. However the analytic dataset for the present paper has been posted on
ICPSR so future analysts may conduct them.
- The analysis plan stated that we plan to “conduct correlational analyses of the math
classroom data.” We still plan to do this, but these correlational analyses will be the
subject of future research papers.
Details of the Bayesian Causal Forest (BCF) Analysis
As noted in the paper, BCF has been found, in multiple open competitions and simulation
studies, to detect true sources of treatment effect heterogeneity while not lending much credence
to noise (Hahn et al., 2020; McConnell & Lindner, 2019; Wendling et al., 2018). BCF builds on
and has in several cases surpassed the popular Bayesian Additive Regression Trees (BART,
Chipman, George, & McCulloch, 2010) approach. Both Bayesian regression tree models and
BCF in particular are consistently top performers in empirical evaluations of methods for causal
inference (Dorie et al., 2019; Hahn et al., 2019; McConnell & Lindner, 2019; Wendling et al.,
2018). Here we provide more details about how the BCF model was estimated.
In the BCF analysis, the math GPA for student $i$ in the classroom of teacher $j$ is denoted by $y_{ij}$,
and is modeled by
$$ y_{ij} = \alpha_j + \mu(x_{ij}) + \left[\tau(w_j, s_j) + \gamma_j\right] z_{ij} + \varepsilon_{ij}. $$
As before, $x_{ij}$ is the vector of student-level covariates (prior achievement, prior expectations for
success, race/ethnicity, gender, parents’ education). There are treatment effect moderators for the
school and teacher, denoted by $s_j$ and $w_j$ respectively, which interact with the student treatment
indicator $z_{ij}$. For the teacher confounds $w_j$, we use teacher mindsets, a measure of racial prejudice,
and a measure of pedagogical ability. For the school confounds $s_j$, we use school achievement
level, peer challenge-seeking norms, and percentage of students who are a minority, as in Yeager
et al. (2019). We also allow for teacher-level intercept random effects, $\alpha_j$, and random effects on
the treatment, $\gamma_j$. The student-level error term is $\varepsilon_{ij}$ and is assumed to be normally distributed
with variance $\sigma^2$.
Here, $\mu$ and $\tau$ are nonparametric functions which allow for nonlinearities and interactions
between covariates in affecting the expected outcome. Furthermore, the model allows for the level of the
intervention effect to vary across teachers and schools, via the interaction of $\tau(w_j, s_j)$ and $z_{ij}$. This
model is meant to mimic that of the linear analysis in terms of the specification of the control and
treatment modifier variables, but to relax the strict assumption of linearity and additivity between
the covariates and the expected value of the outcome.
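To make the structure of this model concrete, the following is a minimal simulation sketch, not the authors' estimation code. It generates data from a model with a teacher random intercept, a nonlinear control function, a moderated treatment effect, a teacher random slope, and student-level noise. Every functional form and numeric value here (the shapes of `mu` and `tau`, the 1 to 6 mindset scale, the variances) is a hypothetical choice made only for illustration.

```python
import random


def mu(x):
    """Hypothetical control function: nonlinear in one student covariate."""
    return 2.5 + 0.6 * x - 0.1 * x * x


def tau(w, s):
    """Hypothetical moderated effect: rises with teacher mindset w, flat after 5."""
    # The school moderator s is given no effect in this toy example.
    return 0.03 * (min(w, 5.0) - 1.0) + 0.0 * s


def expected_gpa(x, w, s, z, alpha_j=0.0, gamma_j=0.0):
    """Model mean: alpha_j + mu(x) + [tau(w, s) + gamma_j] * z."""
    return alpha_j + mu(x) + (tau(w, s) + gamma_j) * z


def simulate(n_teachers=50, n_students=30, seed=0):
    """Draw (teacher, mindset, treatment, GPA) records from the assumed model."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_teachers):
        w = rng.uniform(1.0, 6.0)        # teacher growth mindset (scale assumed)
        s = rng.gauss(0.0, 1.0)          # one standardized school moderator
        alpha_j = rng.gauss(0.0, 0.3)    # teacher random intercept
        gamma_j = rng.gauss(0.0, 0.02)   # teacher random slope on treatment
        for _ in range(n_students):
            x = rng.gauss(0.0, 1.0)      # student covariate, e.g. prior achievement
            z = rng.randint(0, 1)        # randomized treatment indicator
            eps = rng.gauss(0.0, 0.5)    # student-level error
            y = expected_gpa(x, w, s, z, alpha_j, gamma_j) + eps
            rows.append((j, w, z, y))
    return rows


rows = simulate()
# Noise-free treatment effect at two levels of teacher mindset.
effect_at_2 = expected_gpa(0.0, 2.0, 0.0, 1) - expected_gpa(0.0, 2.0, 0.0, 0)
effect_at_5 = expected_gpa(0.0, 5.0, 0.0, 1) - expected_gpa(0.0, 5.0, 0.0, 0)
print(len(rows), round(effect_at_2, 3), round(effect_at_5, 3))
```

In the actual analysis the two functions are not specified by hand; they are learned from the data as sums of regression trees, as described in the next section.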
Using a nonparametric Bayesian approach in this manner has several advantages. First, it allows
the data to speak and better inform us about the relationships between the covariates and the
outcome. It allows us to uncover (possibly unanticipated) sources of heterogeneity in the
intervention effect while requiring few prior assumptions ahead of time. The prior we use results
in posterior estimates that are inherently conservative, making it unlikely that we will
dramatically over- or under-estimate the effect of the intervention.
Prior specification
To complete our Bayesian model, we must specify prior distributions for the unknown values in
the equation above. These include the nonparametric functions $\mu$ and $\tau$, the random teacher
effects $\alpha_j$ and $\gamma_j$, and the error variance $\sigma^2$.
The prior for the functions $\mu$ and $\tau$ is taken from the Bayesian causal forests model (BCF;
Hahn et al., 2020). Under this model, both functions have a sum-of-trees representation, as first
defined for Bayesian methods in Chipman, George, and McCulloch (2010). Each tree consists of
a set of internal decision nodes which partitions the covariate space, and a set of terminal
nodes, or leaves, corresponding to each element of the partition. The prior for each of $\mu$ and
$\tau$ is comprised of three parts: the number of trees, two parameters controlling the depth of
each tree, and a prior on the leaf parameters. Use of this sum-of-trees term allows for detection
of nonlinearity and interactions between covariates.
The key feature of the BCF model is that the prior for $\tau$, which explains heterogeneity in the
intervention effect, is regularized more heavily compared to the control function $\mu$ in order to
shrink toward homogeneous effects, i.e. the unlikely case that the intervention effect is constant
across all values of the moderators. The prior for $\tau$ uses fewer trees, with each tree being
regularized to be shallower (that is, contain fewer partitions). Details on prior specification
are given in Hahn, Murray, and Carvalho (2020) and Chipman, George, and McCulloch (2010).
The random effects $\alpha_j$ and $\gamma_j$ are given a Gaussian prior whose standard deviation has a
half-$t$ prior with 3 degrees of freedom, as recommended by Gelman (2006).
Finally, the error variance $\sigma^2$ is given an inverse chi-squared prior with 3 degrees of freedom and
scale parameter informed by the data.
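As a rough illustration of the priors on the random effects and the error variance, the sketch below draws once from each using only the Python standard library. It assumes a separate half-t standard deviation for the intercepts and the slopes, unit scale for the half-t priors, and a hypothetical `resid_var_guess` standing in for the data-informed scale; none of these details are stated in the text, and the tree priors are not covered here.

```python
import math
import random


def chi2_draw(rng, df):
    """Chi-squared draw as a sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


def half_t_draw(rng, df=3, scale=1.0):
    """Half-t draw: absolute value of a Student-t variate times a scale."""
    t = rng.gauss(0.0, 1.0) / math.sqrt(chi2_draw(rng, df) / df)
    return scale * abs(t)


def scaled_inv_chi2_draw(rng, df=3, scale_sq=1.0):
    """Scaled inverse chi-squared draw: df * scale_sq / chi2(df)."""
    return df * scale_sq / chi2_draw(rng, df)


def draw_prior(rng, n_teachers, resid_var_guess):
    """One joint prior draw for the random effects and the error variance."""
    sd_alpha = half_t_draw(rng)   # SD of teacher intercepts (separate SD assumed)
    sd_gamma = half_t_draw(rng)   # SD of teacher treatment slopes
    return {
        "sd_alpha": sd_alpha,
        "sd_gamma": sd_gamma,
        "alpha": [rng.gauss(0.0, sd_alpha) for _ in range(n_teachers)],
        "gamma": [rng.gauss(0.0, sd_gamma) for _ in range(n_teachers)],
        # resid_var_guess is a hypothetical stand-in for the data-informed scale.
        "sigma_sq": scaled_inv_chi2_draw(rng, 3, resid_var_guess),
    }


prior = draw_prior(random.Random(0), n_teachers=5, resid_var_guess=0.5)
print(sorted(prior))
```

The heavy tails of the half-t allow large teacher-to-teacher variation when the data support it, while still placing most prior mass near small standard deviations.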
Posterior Inference
After conditioning on observed data, we can update the prior distribution to obtain a posterior
distribution. To calculate the posterior distribution for quantities of interest, we implement a
Markov chain Monte Carlo (MCMC) sampling scheme. Since the primary component of the
model is the sum-of-trees functions $\mu$ and $\tau$, the MCMC scheme relies on a Bayesian
backfitting algorithm (Hastie & Tibshirani, 2000). The rest of the parameters in the model are
conditionally conjugate, making their posterior sampling relatively efficient.
The model as specified has a causal interpretation under the Rubin causal model (Imbens &
Rubin, 2015) if we meet several identifying assumptions. First, it would require no unmeasured
confounders, i.e. there are no unmeasured covariates which affect both the treatment and the
outcome under either control or treatment. Second, it must have positive probability of
assignment to treatment for each student. Both these assumptions are true by the randomization
of the study design.
Having met these assumptions, we may estimate relevant causal estimands. For instance, we can
estimate the individual treatment effect for any particular student. That is, we can estimate the
difference between their observed GPA (whether they received the treatment or the control), and
what their GPA would have been had their treatment assignment been the opposite. In
mathematical terms, for one student $i$ in classroom $j$ we can estimate the quantity
$$ y_{ij}(z_{ij} = 1) - y_{ij}(z_{ij} = 0) $$
even though we only actually observe one of these potential outcomes, in the terminology of the
Rubin causal model. We define the intervention effect for one teacher to be the average of the
estimated intervention effects across the students in their classroom. Moreover, we can estimate
the conditional average treatment effect (CATE) function which describes how the intervention
effect fluctuates as a function of the moderators
$$ \mathrm{CATE}(w, s) = \mathrm{E}\left[\, y_{ij}(z_{ij} = 1) - y_{ij}(z_{ij} = 0) \mid w_j = w,\ s_j = s \,\right]. $$
That is, the CATE function is identical to $\tau(w, s)$, and therefore the prior and posterior for the
CATE function are identical to those of $\tau(w, s)$.
Furthermore, we can obtain valid posterior distributions for functions of $\tau(w, s)$. In particular, we
can analyze the shift in the treatment effect when changing teacher growth mindset and fixing
the other moderators at their population means. This function can be written mathematically as
$$ \text{mindset} \mapsto \tau\left(\text{mindset},\ \bar{w}_{-\text{mindset}},\ \bar{s}\right) $$
where $\bar{w}_{-\text{mindset}}$ is the population average of teacher confounds except for mindset, and $\bar{s}$ is the
population average of the school confounds. This is the function produced in Figure 2 in the main text.
The BCF estimate of $\tau$ was approximately additive. Therefore, to give an interpretable estimate
of this conditional intervention effect, we created an additive summary of the fitted $\tau$ function
using splines, and looked at the partial effect of changing teacher mindset while fixing the other
confounds as specified above. This additive summary captures approximately 98% of the
predictive variance in the posterior of $\tau$, so this additive summary is a faithful recapitulation of
the fitted CATE function.
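The partial-effect calculation described above can be sketched as follows, under loudly stated assumptions. The posterior draws here are not output from a fitted BCF model: `make_tau_draw` is a hypothetical stand-in that produces simple closed-form functions, one per "posterior draw". The sketch evaluates each draw on a grid of teacher mindset values, holds the other moderators at their means, and reports the posterior mean with a 95% interval. It does not implement the spline-based additive summary itself.

```python
import random
import statistics


def make_tau_draw(rng):
    """One hypothetical posterior draw of tau(mindset, other_teacher, school)."""
    slope = rng.gauss(0.02, 0.005)
    other_coef = rng.gauss(0.0, 0.002)
    school_coef = rng.gauss(0.0, 0.002)

    def tau(mindset, other_teacher, school):
        return (slope * min(mindset, 5.0)
                + other_coef * other_teacher
                + school_coef * school)

    return tau


def partial_effect_curve(tau_draws, grid, other_mean, school_mean):
    """Posterior mean and 95% interval of tau along the mindset grid."""
    curve = []
    n = len(tau_draws)
    for m in grid:
        # Evaluate every draw with the other moderators fixed at their means.
        values = sorted(t(m, other_mean, school_mean) for t in tau_draws)
        lower = values[int(0.025 * (n - 1))]
        upper = values[int(0.975 * (n - 1))]
        curve.append((m, statistics.fmean(values), lower, upper))
    return curve


rng = random.Random(1)
tau_draws = [make_tau_draw(rng) for _ in range(500)]
grid = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
curve = partial_effect_curve(tau_draws, grid, other_mean=0.0, school_mean=0.0)
for m, mean, lower, upper in curve:
    print(m, round(mean, 4), round(lower, 4), round(upper, 4))
```

Because the interval is computed pointwise across draws, it reflects posterior uncertainty about the curve at each mindset value rather than about the curve as a whole.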
Plotting the Results of the BCF Analyses
The additive summaries (i.e. generalized additive models) plotting the BCF results are presented
below. Each panel plots the partial effect of a moderator on the posterior predicted treatment
effect, which means that it is the effect of each moderator controlling for the alternative
moderators. Each dot represents a teacher’s treatment effect. Panel A is the focal moderator
(teacher mindset) and it shows an increasing treatment effect up to 5, and no increase after 5.
Panels B, C and D show no meaningful differences in the treatment effect across the spectrum of
the alternative moderators. Note that the y-axis in the top row of panels is not the average
treatment effect, but rather the “offset” – that is, by how much would you expect the average
treatment effect to go up or down depending on the level of the moderators. So negative numbers
do not mean the treatment was harmful in fixed mindset classrooms; it means that the treatment
effect was just smaller (i.e. subtracting from the average) in those classrooms.
Panel E in the figure above is a heat map of the posterior probability of a difference in treatment
effects at each percentile level of the moderator. It shows that teachers from approximately the
60th to the 100th percentile on teacher mindset (i.e. growth mindset teachers) show a rather high
likelihood of a difference in treatment effects relative to teachers at 40th percentile or lower (i.e.
fixed mindset teachers). Meanwhile, Panels F, G, and H show that there are no strong differences
between any point for the alternative moderators (i.e. the heat maps are very pale and close to
white, or a 50/50 probability of a directional difference).
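The quantity behind a heat map of this kind can be sketched in a few lines. Given posterior draws of the treatment effect at several percentile levels of a moderator, each cell is the share of draws in which the effect at one level exceeds the effect at another. The draws below are made-up numbers for illustration, and setting the diagonal to 0.5 is a convention assumed here, not something stated in the text.

```python
def pairwise_probability(draws):
    """matrix[a][b] = share of posterior draws with effect at level a > level b."""
    n_draws = len(draws)
    n_levels = len(draws[0])
    # Diagonal cells compare a level with itself; 0.5 is used by convention.
    matrix = [[0.5] * n_levels for _ in range(n_levels)]
    for a in range(n_levels):
        for b in range(n_levels):
            if a != b:
                wins = sum(1 for d in draws if d[a] > d[b])
                matrix[a][b] = wins / n_draws
    return matrix


# Hypothetical draws: rows are posterior draws, columns are moderator levels.
draws = [
    [0.00, 0.10, 0.20],
    [0.00, 0.20, 0.10],
    [0.10, 0.00, 0.30],
    [0.00, 0.10, 0.30],
]
heat = pairwise_probability(draws)
for row in heat:
    print(row)
```

Cells near 0.5 correspond to the pale regions of the heat map, and cells near 0 or 1 correspond to strong evidence of a directional difference.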
Overall, this is rather strong evidence for moderation by teacher mindset and not by alternative
moderators that could be confounded with teacher mindset. BCF found meaningful moderation for the focal
moderator (teacher mindset) but not for alternative, confounding moderators (teacher implicit
bias, pedagogical content knowledge, and fluid intelligence).
Representativeness of the Analytic Sample
School-Level:
Because not all schools in the National Study of Learning Mindsets (NSLM) provided student
grades attached to teachers’ names, we evaluated whether the schools in our analytic sample are
representative of our sampling frame (all regular U.S. public high schools with at least 25
students in 9th grade and in which 9th is the lowest grade). We performed this analysis in two
steps. First, we compared the schools in the analytic sample to schools in the sampling frame
using publicly available data such as the Common Core of Data (CCD), the Office of Civil
Rights (OCR), and a district-level tabulation of American Community Survey (ACS) data (see
Gopalan & Tipton, 2018, for more details). Table S1 below shows that there are few differences
between characteristics of schools in the analytic sample as compared to the sampling frame.
Table S1: Comparisons of Benchmarks Across the Analytic Sample Schools and Schools in
the National Sampling Frame

| Benchmarks | Sampling Frame Mean | Analytic Sample Mean | SMD | p |
|---|---|---|---|---|
| Proportion of 9th Grade Male Students | 0.52 | 0.52 | 0.00 | 0.989 |
| Proportion of 9th Grade Black Students | 0.14 | 0.15 | -0.01 | 0.712 |
| Proportion of 9th Grade Hispanic Students | 0.21 | 0.17 | 0.04 | 0.115 |
| Proportion of 9th Grade White Students | 0.57 | 0.60 | -0.03 | 0.421 |
| Proportion of 9th Grade Other Race Students | 0.05 | 0.06 | -0.01 | 0.408 |
| Total 9th Grade Enrollment | 285.00 | 301.00 | -0.06 | 0.564 |
| Proportion of High School Students Enrolled in Algebra 1 | 0.22 | 0.23 | -0.01 | 0.512 |
| Proportion of High School Students Enrolled in Algebra 2 | 0.20 | 0.19 | 0.01 | 0.137 |
| Proportion of High School Students Enrolled in at least one AP Course | 0.19 | 0.18 | 0.01 | 0.726 |
| Proportion of High School Students Who Took at least one AP Exam | 0.70 | 0.70 | 0.00 | 0.925 |
| Proportion of Students Who are Chronically Absent | 0.21 | 0.20 | 0.01 | 0.799 |

Note: SMD is the standardized mean difference. For proportions we report the absolute
differences. A small number of schools do not have information available from the CCD and/or
CRDC. Those schools are excluded from the mean calculations for missing benchmarks, as
appropriate. P-values are shown from one-sample t-tests comparing mean differences.
Second, we calculate the generalizability index (Tipton, 2014), a summary measure that provides
the degree of distributional similarity between the schools in the analytic sample and the
inference population, conditional on a set of covariates. The index is calculated using propensity
scores from a sampling propensity score model, which predicts membership in the analytic
sample, given a set of observed school-level characteristics, using logistic regression. The
generalizability index takes on values between 0 and 1, where a value of 0 means that the
analytic sample and inference population are completely different and a value of 1 means that the
analytic sample is an exact miniature of the inference population on the selected covariates (see
Tipton 2014 for more details). Table S2 below shows that the generalizability index is .98,
suggesting that the analytic sample is as good as a random sample from the population of
interest, conditional on the covariates included in the propensity score model. In all, we find that
site-level non-response does not compromise the generalizability of the results from the analytic
sample.
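For readers unfamiliar with the index, the sketch below shows one common way it is computed: as the Bhattacharyya coefficient between the binned distributions of sampling propensity scores in the population and in the sample. This is an illustration under assumptions, not the code used for Table S2. The propensity scores are taken as given (the logistic regression step is omitted), and the equal-width binning and the default of 10 bins are choices made here.

```python
import math


def bin_proportions(scores, low, width, n_bins):
    """Share of scores falling in each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for score in scores:
        # Clamp so that the maximum score lands in the last bin.
        index = min(int((score - low) / width), n_bins - 1)
        counts[index] += 1
    return [count / len(scores) for count in counts]


def generalizability_index(population_scores, sample_scores, n_bins=10):
    """Sum over bins of sqrt(population share * sample share); ranges 0 to 1."""
    low = min(min(population_scores), min(sample_scores))
    high = max(max(population_scores), max(sample_scores))
    if high == low:
        return 1.0  # all scores identical, so the two distributions coincide
    width = (high - low) / n_bins
    p = bin_proportions(population_scores, low, width, n_bins)
    q = bin_proportions(sample_scores, low, width, n_bins)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Hypothetical propensity scores, for illustration only.
same = generalizability_index([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4], n_bins=4)
disjoint = generalizability_index([0.1, 0.2], [0.8, 0.9], n_bins=4)
print(same, disjoint)
```

Identical score distributions give a value of 1 and non-overlapping distributions give 0, which matches the interpretation of the two endpoints given above.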
Table S2. Generalizability Index for the Present Sample

| Inference Population | Generalizability Index | Inference Population N | Analytic Sample N |
|---|---|---|---|
| All Public High Schools (9th grade lowest level) | .988 | 9,522 | 52 |

Note: The analytic sample includes 58 schools, but some schools had to be omitted from the calculations
due to missing values on one or more benchmarks. N refers to the number of schools with non-
missing benchmarks used in the sampling propensity score model: racial composition (% African
American, % Hispanic, % White), socioeconomic composition (% Free/Reduced-Price Lunch
Recipients), gender composition (% Male), the number of students in the school, the proportion
of students in the district who are English Language Learners, and the proportion of
Special Education students in the district.
Teacher-Level:
The above analysis shows that the schools in the analytic sample are representative of the
inference population, but we also want to ensure that the teachers in the analytic sample are
representative of 9th grade math teachers within these schools. For this analysis, we used student
characteristics from the survey and administrative records to examine whether the characteristics
of students within the classrooms we examine are representative of the characteristics of students
in all of the classrooms in the NSLM. We aggregate student characteristics at the teacher-level
for three groups of teachers: those included in the analytic sample, those asked to participate in
the survey, and those who were matched to any students in our sample. We compare these
groups to ensure that nonresponse on the math teacher survey is not limiting the generalizability
of our sample. As Table S3 below shows, there are few differences between the teachers
represented in our sample compared to the full sample of teachers invited to participate in the
survey and all of the 9th-grade math teachers matched to students in the survey.
Table S3. Generalizability of the Teacher Sample

| | Teachers in Analytic Sample: Mean | Teachers Asked to Participate in Survey: Mean | SMD | p | Teachers Matched to Students in Sample: Mean | SMD | p |
|---|---|---|---|---|---|---|---|
| N | 223 | 255 | | | 439 | | |
| Aggregate Student Characteristics | | | | | | | |
| Proportion Underrepresented Minority Students | 0.39 | 0.41 | 0.02 | 0.109 | 0.38 | -0.01 | 0.687 |
| Proportion Students with College Educated Parent | 0.36 | 0.35 | -0.01 | 0.612 | 0.37 | 0.01 | 0.583 |
| Proportion Female Students | 0.50 | 0.49 | -0.01 | 0.502 | 0.47 | -0.03 | 0.002 |
| Proportion Low-Achieving Students | 0.51 | 0.51 | 0.00 | 0.776 | 0.50 | -0.01 | 0.540 |
| Proportion Students in Treatment Group | 0.50 | 0.50 | 0.00 | 0.880 | 0.50 | 0.00 | 0.891 |
| Average Student Math Grade | 2.49 | 2.48 | -0.01 | 0.842 | 2.60 | 0.11 | 0.020 |
| Average Student Expectations | 5.22 | 5.18 | -0.04 | 0.237 | 5.21 | -0.01 | 0.670 |
| Teacher Course Levels | | | | | | | |
| Proportion teach Algebra 1 or below | 0.80 | 0.82 | 0.02 | 0.443 | 0.73 | -0.07 | 0.006 |
| Proportion teach Geometry or above | 0.45 | 0.43 | -0.02 | 0.604 | 0.45 | 0.00 | 0.923 |
Note: SMD is the standardized mean difference. For proportions we report the absolute
differences. A small number of schools do not have information available from the CCD and/or
CRDC. Those schools are excluded from the mean calculations for missing benchmarks, as
appropriate. P-values are shown from one-sample t-tests comparing mean differences.
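The comparison reported in Table S3 can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the toy values, and the choice to scale by the comparison group's standard deviation are all assumptions, since the source does not state which standard deviation was used as the denominator.

```python
from statistics import mean, stdev

def standardized_mean_difference(analytic, comparison):
    """SMD = (mean(comparison) - mean(analytic)) / sd(comparison).

    Assumption: the difference is scaled by the comparison group's
    standard deviation; the source does not specify the denominator.
    """
    return (mean(comparison) - mean(analytic)) / stdev(comparison)

def absolute_difference(p_analytic, p_comparison):
    """For proportions, report the raw difference instead of an SMD."""
    return p_comparison - p_analytic

# Hypothetical teacher-level average math grades (not NSLM data).
analytic_grades = [2.0, 2.5, 3.0]
comparison_grades = [2.0, 3.0, 4.0]
smd = standardized_mean_difference(analytic_grades, comparison_grades)

# Proportions taken from the first row of Table S3 (0.39 vs. 0.41).
diff = absolute_difference(0.39, 0.41)
```

The sign convention (comparison group minus analytic sample) is chosen to match the direction of the differences shown in the table.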
Screenshots of the Growth Mindset Intervention and Control Materials
Illustrative Screenshots from the Treatment Condition
Treated students are presented with information about the malleability of the brain.
Treated students are presented with relevant scientific information.
Treated students are asked for their help in communicating these ideas to others as a means of
bringing them into the story and including them in the narrative.
Treated students receive the message that effort is not enough: you also need strategies to
overcome challenges and develop your skills.
Materials convey that teenage years are a special time for brain growth that students can leverage
to their advantage.
Relating growth mindset to their own lives helps students internalize the message by customizing
it, and reduces defensive reactions that might emanate from the perception that adults are telling
the students what to believe.
Treated students are encouraged to see the value of applying a growth mindset to their own lives.
Student testimonials, obtained from prior study participants, help communicate that holding a
growth mindset puts them in line with what other students think, and that they’re not alone in
their concerns about school.
Treatment materials summarize evidence showing that holding a “learning mindset” (the term
used for growth mindset here) is helpful, using national data from Chile.
Illustrative Screenshots from the Control Condition
Control students are asked to help improve a lesson about the brain.
Control students are presented with information about the physiological features of the brain that
does not include the mindset content.
The control content includes examples of evidence about the brain.
Control students were asked to engage with the material by responding to short answer prompts.
Control students saw student testimonials about the value of the content.
Table S4. Descriptive statistics for student-, teacher-, and school-level variables

| Variable | Mean/% | SD |
|---|---|---|
| Student-Level Variables (N = 8,775) | | |
| In treatment group | 49.55% | |
| Grade in math class | 2.44 | 1.24 |
| Expectations of success in math | 5.20 | 1.11 |
| Lower achieving student | 51.79% | |
| Female | 49.99% | |
| Black, Latinx, or Native-American | 40.19% | |
| Mother earned a bachelor's or above | 36.80% | |
| Teacher-Level Variables (N = 223) | | |
| Growth Mindset | 4.74 | 0.76 |
| Pro-white Implicit Racial Bias | -0.03 | 0.16 |
| Math Pedagogical Content Knowledge | 0.39 | 0.28 |
| Years Teaching | 13.82 | 9.94 |
| Number of Raven’s problems correct (out of 4) | 3.39 | 1.38 |
| Female | 58.74% | |
| White, non-Hispanic | 86.04% | |
| Earned a Masters Degree or higher | 51.12% | |
| Heard About Growth Mindset | 44.91% | |
| School Sample (N = 58) | | |
| Challenge Seeking Norms (# of hard problems chosen) | 2.57 | 0.58 |
| School % Minority (Black, Latinx, or Native-American) | 0.34 | 0.27 |
| School Achievement Level (z-score) | 0.08 | 0.95 |
Table S5. Student descriptive statistics by experimental group

| | Growth mindset intervention (N = 4,348): Mean or % | SD | Control (N = 4,427): Mean or % | SD | p-value for difference |
|---|---|---|---|---|---|
| Grade in Math Class | 2.45 | 1.24 | 2.42 | 1.25 | .256 |
| Expectations | 5.18 | 1.12 | 5.21 | 1.10 | .341 |
| Lower achieving student | 52.02% | | 51.57% | | .670 |
| Background | | | | | |
| Female | 50.13% | | 49.85% | | .798 |
| Black, Latinx, or Native-American | 40.18% | | 40.21% | | .9783 |
| Mother earned a Bachelor's or Above | 37.05% | | 36.55% | | .625 |
Table S6. Description of sample exclusions

| Sample | Sample Size: Number of Records (Students) | Explanation |
|---|---|---|
| Starting Sample | 12,381 (11,508) | Students in the intent-to-treat (ITT) sample who were assigned a math grade (for information on this starting sample, see Yeager et al. 2019) |
| Sample matched to teachers | 11,689 (10,816) | Excludes students in schools that did not provide us with any math teacher IDs (n=438) and student records that could not be matched to any math teacher (n=254) |
| Sample matched to at least one teacher who took the survey | 9,622 (9,180) | Drops students linked to teachers who did not take the survey (n=172 teachers) |
| Sample matched to teachers’ mindsets | 9,167 (8,775) | Drops students linked to teachers with item nonresponse on the teacher mindset questions (n=11 teachers) |
Table S7. Student descriptive statistics by whether students were matched to a teacher who
completed the survey.

| | Matched (in analytic sample) (N = 8,775): Mean or % | SD | Not Matched (not in analytic sample) (N = 2,733): Mean or % | SD |
|---|---|---|---|---|
| Treatment | 49.55% | | 50.68% | |
| Grade in Math Class | 2.44 | 1.24 | 2.52 | 1.25 |
| Expectations | 5.20 | 1.11 | 5.16 | 1.24 |
| Lower achieving student | 51.79% | | 48.66% | |
| Background | | | | |
| Female | 49.99% | | 46.81% | |
| Black, Latinx, or Native-American | 40.19% | | 41.53% | |
| Mother earned a Bachelor's or Above | 36.80% | | 38.57% | |
Student Survey Measures in the National Study of Learning Mindsets
Students’ Expectations in Math
Scale
7=Extremely well
1=Extremely Poorly
Thinking about your skills and the difficulty of your classes, how well do you think you’ll do
in math in high school?
Students’ Racial Minority Status
Options
1=Black/African American
2=Hispanic/Latino
3=Native American/American Indian
4=White, not Hispanic
5=Asian/Asian-American
6=Middle Eastern
7=Pacific Islander/Native Hawaiian
8=Other
How would you classify your racial or ethnic group? Please check all that apply.
Coding Note: Students who checked Black/African American, Hispanic/Latino, Native
American/American Indian, Middle Eastern, and Pacific Islander/Native Hawaiian are
included in our indicator of racial minority status.
Mother’s Highest Education
Options
1=Did not finish high school
2=Finished high school, no college degree
3=Took some college courses, no college
4=AA or AS: Associate’s degree (i.e.,
community college or junior college)
5=BA or BS: Bachelor’s degree (four-year
college or university)
6=MA, MS, or MBA: Master’s degree
7=Doctorate: Lawyer, Doctor or PhD
8=Do not know
To the best of your knowledge, what is the HIGHEST level of education earned by your
mother?
If your mother was not the person who raised you, then answer this question thinking about
the adult who you spent the most time with growing up (such as your grandmother, father, or
legal guardian).
My mother or guardian’s highest level of education is:
Coding Note: Students who selected Bachelor’s degree, Master’s degree, or Doctorate are
included in our indicator of whether their mother earned a bachelor’s degree or above.
Gender
Options
1=Male
2=Female
Are you:
Student Fixed Mindset
α=0.787
Scale
6=Strongly Agree
1=Strongly Disagree
1. Your intelligence is something about you that you can’t change very much.
2. You have a certain amount of intelligence, and you can’t do much to change it.
3. Being a math person or not is something you really can’t change. Some people are
good at math and other people aren’t.
Teacher Survey Measures in the National Study of Learning Mindsets
Teacher Mindset
α=0.646
Scale
6=Strongly Disagree
1=Strongly Agree
1. People have a certain amount of intelligence, and they really can’t do much to change
it.
2. Being a top math student requires a special talent that just can’t be taught.
Coding Note: Answers to these questions were averaged and the scale was reverse-coded so
higher values indicate a stronger growth mindset. In the primary analysis we bottom-coded the
scale at 3.5 because very few teachers reported a very fixed mindset, and bottom-coding
reduced the influence of these extreme cases (N=15). In the BCF analysis we left the teacher
mindset values as-is because the nonlinear model is less likely to be misled by such outliers.
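The scoring steps in the coding note can be sketched as below. This is an illustrative sketch only: the function name, its parameters, and the example responses are hypothetical. Whether a reversal step is needed depends on how the raw responses are stored, which the source does not spell out, so it is exposed as a flag.

```python
def teacher_growth_mindset(item_responses, floor=3.5, reverse=False, scale_max=6):
    """Average the two mindset items and optionally bottom-code the result.

    item_responses: responses on a 1..scale_max scale.
    reverse: set True if higher raw values mean MORE agreement with the
             fixed-mindset statements (an assumption about raw storage).
    floor: scores below this value are raised to it (bottom-coding);
           pass None to leave scores as-is, as in the BCF analysis.
    """
    if reverse:
        item_responses = [scale_max + 1 - r for r in item_responses]
    score = sum(item_responses) / len(item_responses)
    if floor is None:
        return score
    return max(score, floor)

# Hypothetical teachers (not NSLM data).
strong_growth = teacher_growth_mindset([5, 6])          # stays at its average
very_fixed = teacher_growth_mindset([2, 3])             # raised to the floor
very_fixed_raw = teacher_growth_mindset([2, 3], floor=None)
```

Bottom-coding changes only the scores below the floor, so the ordering of all other teachers is unaffected.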
Teacher Covariates
Teacher Education: What degree or degrees have you earned? (Mark all that apply)
Response Options
Associate's
-Discipline
-College(s)/university(ies)
Bachelor's
-Discipline
-College(s)/university(ies)
Master's
-Discipline
-College(s)/university(ies)
PhD / JD / MD / EdD
-Discipline
-College(s)/university(ies)
Coding Note: We create a dichotomous indicator of whether the teacher received a master’s
degree or not.
Teacher Race: Which of these best matches how you would identify yourself? Please check all
that apply.
Response Options
Black / African-American
Hispanic / Latino
Native American / American Indian
White, not Hispanic
Asian / Asian-American - If so: how would
you describe your Asian descent? (e.g.
Chinese, Korean, Indian, etc.)
Middle Eastern
Pacific Islander / Native Hawaiian
Coding Note: We create a dichotomous indicator of whether the teacher is White (not Hispanic)
or Asian.
Teacher Gender: How would you identify yourself?
Response Options
Male
Female
Other
Years Teaching: In what year did you first start a paid position as a teacher at any level in any
subject?
Open Response
Coding Note: We code the number of years teaching as 2016 minus their response.
Previous Knowledge of Growth Mindset: Have you ever heard of the concept of a “Growth
Mindset?” If so, what have you heard about it? It’s okay if not, we’re just curious what
different people have heard. Write your answer in the text box below.
Response Options
Yes I have heard of a growth mindset
No I have not heard of a growth mindset
Potential Confounds for Teachers’ Mindsets
Three measures assessed potential confounds for teachers’ mindsets in the moderation analysis.
To illustrate why this can be informative, see Figure S1 below. It shows that, when moderators
are measured, there could be confounds in the moderation analysis.
Figure S1. Schematic showing why accounting for potential correlates of teacher mindsets
could clarify the role of teacher mindsets in the recursive processes that sustain the effects of a
short, online growth mindset intervention on students’ math grades.
Prior to conducting the study, we did not hypothesize that these variables would, necessarily, be
confounds for teachers’ mindsets. However, each of them was raised by at least one advisor or
reviewer of a grant proposal as a plausible counter-explanation for our results. Therefore, we
included three task-based measures to assess and account for these confounds.
Teacher fluid intelligence. The investigative team thought it would be unlikely that growth
mindset teachers simply had higher levels of fluid intelligence and therefore could create
classrooms that afforded more opportunities for learning. Nevertheless, to control for this
possibility, teachers’ fluid intelligence scores were assessed with the Raven’s Progressive
Matrices task (RPM). The RPM was chosen because its correlation with the fluid intelligence
factor is high and because it is straightforward for respondents (and therefore easy to
mass-administer). The NSLM administered a subset of 5 items. Item selection was informed by Item
Response Theory (IRT) analyses with large, validated samples of RPM respondents; these
analyses of past datasets led us to select the 5 items that best captured variance across the range
of ability. RPM scores ranged from 0 to 5 correct (for the sample overall, not just the analytic
sample, M = 3.4, SD = 1.38; chance = .83 correct). See Figure S2.
Figure S2. One of the easier puzzles from the measure of fluid intelligence.
Teacher pedagogical content knowledge. To control for the possibility that growth mindset
teachers simply are more skilled in the formal content of math pedagogy, teachers’ pedagogical
content knowledge (PCK) was measured using a new, scalable method: the
Classroom-Video-Analysis assessment (Kersting et al., 2014). See the screen capture below in
Figure S3.
During this assessment, math teachers view three clips of math classroom instruction, lasting
roughly 3 minutes. Teachers then answer a single open-ended prompt: “Using your professional
judgment, in the box below please write the question or questions you might ask the students in
this situation. Then explain how your question(s) would improve the students’ mathematical
understanding.” These responses were coded by Kersting, the developer of the measure, and a
second reliable coder, on each of three dimensions of PCK (Kersting et al., 2014). Teachers’
open-ended responses were coded as having given a high-knowledge answer (1) or not (0) for
each dimension. Codes were averaged for each respondent (M = .39, SD = .28, Range: 0 to 1);
higher values corresponded to greater pedagogical content knowledge.
Figure S3. The math pedagogical content knowledge assessment (see Kersting et al. 2014).
Teacher implicit racial bias. To rule out the possibility that fixed mindset teachers were simply
racially biased against minority groups, we administered a leading method for assessing implicit
racial bias: the Affect Misattribution Procedure (AMP) (Payne & Lundberg, 2014). The AMP is
uniquely suited for the present purposes because of its record of predicting consequential
behaviors in real-world settings more than some other implicit bias measures. For instance, in the
2008 American National Election Study, white survey respondents who identified as Democrats
but scored high on the AMP’s measure of implicit anti-black bias were more likely to abstain
from voting than to vote for Barack Obama, the black Democratic nominee for U.S. President
(Payne et al., 2010).
Figure S4. The Affect Misattribution Procedure (AMP).
The AMP follows the pattern shown in Figure S4. Participants view a face (of varying
race or ethnicity), followed by a pictogram. People are asked to guess whether the pictogram
probably refers to something pleasant or unpleasant. Then they see a “noise” screen and make
their judgment. The overall measure used here is pro-white bias: the proportion of times that
participants guessed that the pictogram was “pleasant” following a white face prime, minus the
proportion of times the participants guessed that the pictogram was “pleasant” following a black
face prime.
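The pro-white bias score described above is a difference between two proportions, which can be sketched as follows. The function name, the trial representation, and the example trials are hypothetical and serve only to make the definition concrete.

```python
def pro_white_bias(trials):
    """Compute the AMP pro-white bias score for one participant.

    trials: list of (prime_race, judged_pleasant) pairs, where prime_race
    is "white" or "black" and judged_pleasant is True or False.
    Returns P(pleasant | white prime) - P(pleasant | black prime).
    """
    def pleasant_rate(race):
        judgments = [pleasant for prime, pleasant in trials if prime == race]
        return sum(judgments) / len(judgments)

    return pleasant_rate("white") - pleasant_rate("black")

# Hypothetical participant: 3 of 4 "pleasant" after white primes,
# 2 of 4 "pleasant" after black primes.
example_trials = [
    ("white", True), ("white", True), ("white", False), ("white", True),
    ("black", True), ("black", False), ("black", True), ("black", False),
]
score = pro_white_bias(example_trials)
```

Positive scores indicate more "pleasant" judgments after white face primes, negative scores the reverse, and zero indicates no difference.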
Table S8. Coefficients of the other teacher-level potential moderators, from a single
multilevel regression model predicting student math grades.

| | B [CI] | SE | t | p |
|---|---|---|---|---|
| Treatment x Implicit Racial Bias | .207 [-.143, .556] | .179 | 1.16 | .248 |
| Treatment x Fluid Intelligence Score | -.009 [-.044, .026] | .018 | -0.50 | .614 |
| Treatment x Pedagogical Content Knowledge | -.064 [-.235, .108] | .088 | -0.73 | .466 |
| Treatment x White/Asian Teacher | -.096 [-.239, .047] | .073 | -1.32 | .187 |
| Treatment x Male Teacher | -.061 [-.157, .035] | .049 | -1.24 | .216 |
| Treatment x Years Teaching | .003 [-.002, .008] | .002 | 1.10 | .271 |
| Treatment x Heard About Growth Mindset | -.017 [-.136, .069] | .050 | -0.34 | .736 |
| Treatment x Master's Degree | -.087 [-.186, .013] | .051 | -1.71 | .087 |
Note: Unstandardized regression coefficients from the teacher-level x treatment interactions,
estimated in a model including all potential confounders (Table 2 in the main text).
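As a reading aid for the columns in Table S8, the sketch below shows how a t statistic and an approximate interval relate to a coefficient and its standard error. It assumes a normal-approximation 95% interval (z = 1.96); the source does not state the interval level or the reference distribution, so the bounds only roughly match the bracketed values in the table. The function name is hypothetical.

```python
def wald_summary(b, se, z=1.96):
    """Return (t, lower, upper) for a coefficient b with standard error se.

    Assumption: a normal-approximation interval with critical value z;
    the authors' interval may use a different reference distribution.
    """
    return b / se, b - z * se, b + z * se

# Treatment x Implicit Racial Bias row: B = .207, SE = .179.
t_stat, lower, upper = wald_summary(0.207, 0.179)
```

For this row the ratio B / SE rounds to the reported t of 1.16, and the approximate bounds fall within a few thousandths of the reported interval.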
Concurrent Validity Analysis
To assess the concurrent validity of teachers’ mindset beliefs with respect to teachers’ mindset-relevant beliefs/practices, we collected
survey data from N = 368 teachers in the OnRamps professional development network in Texas. These teachers taught math and
science courses (e.g., pre-calculus, college algebra, computer science) and came from a set of schools that was racially and
ethnically diverse and matched the demographics of Texas. Teachers answered the two teacher mindset items administered in the
NSLM (and described in the paper). Then they completed measures that assessed teacher beliefs/practices which could afford a
growth mindset (or not). Correlations of teacher mindset with teacher beliefs/practices appear in Table S9.
Table S9: In the concurrent validity sample, correlations between teacher growth mindset beliefs and teacher practices.

| Teacher belief/practice | Correlations with Teachers’ Growth Mindset Beliefs |
|---|---|
| Learning-focused practices composite | r = .30 |
| It slows my class down to encourage lower achievers to ask questions. (reverse-coded) | r = .17 |
| There is usually only one correct way to solve a math problem. (reverse-coded) | r = .21 |
| Mathematics mostly involves learning facts and procedures. (reverse-coded) | r = .15 |
| Prompt: Imagine a student was feeling discouraged in math class in the way just described on the previous page. How likely would you be to say each of the following statements? | |
| Let’s see what you don’t understand and I’ll explain it differently. | r = .13 |
| Ability-focused practices composite | r = -.28 |
| Prompt: Imagine a student was feeling discouraged in math class in the way just described on the previous page. How likely would you be to say each of the following statements? | |
| Don't worry - it's okay to not be a math/science/computer science person | r = -.20 |
| Prompt: Imagine a student was doing well in your math class in the way just described on the previous page. How likely would you be to say each of the following statements? | |
| You're lucky that you're a math/science/computer science person | r = -.27 |
| It's great that it's so easy for you | r = -.18 |

Note: All correlations significant at p < .01.
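Table S10: In the NSLM, correlations between teacher growth mindset beliefs and other teacher- and school-level moderators

| | Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Teacher Growth Mindset | 1.000 | | | | | | | | | | | |
| | Teacher-level | | | | | | | | | | | | |
| 2 | Implicit Racial Bias | -.094 | 1.000 | | | | | | | | | | |
| 3 | Fluid Intelligence Score | -.038 | -.022 | 1.000 | | | | | | | | | |
| 4 | Pedagogical Content Knowledge | -.087 | -.009 | .195 | 1.000 | | | | | | | | |
| 5 | White/Asian Teacher | -.096 | -.084 | .065 | .098 | 1.000 | | | | | | | |
| 6 | Female Teacher | .058 | -.045 | -.102 | .008 | -.047 | 1.000 | | | | | | |
| 7 | Years Teaching | -.129 | -.050 | -.090 | -.065 | .032 | -.066 | 1.000 | | | | | |
| 8 | Heard About Growth Mindset | .148 | -.002 | .142 | .027 | -.040 | -.055 | .009 | 1.000 | | | | |
| 9 | Master’s Degree | .046 | -.033 | .007 | -.050 | -.030 | -.091 | .196 | .170 | 1.000 | | | |
| | School-level | | | | | | | | | | | | |
| 10 | Challenge seeking Norms | -.051 | .004 | -.010 | .076 | .081 | -.075 | .085 | .110 | .017 | 1.000 | | |
| 11 | Percent Minority | .043 | -.066 | .005 | -.073 | -.411 | .026 | .012 | .181 | .005 | .032 | 1.000 | |
| 12 | Achievement Level | -.096 | -.056 | .032 | .115 | .292 | -.060 | -.028 | -.014 | -.020 | .358 | -.443 | 1.000 |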
Post-hoc Analysis of Student Perceptions of the Classroom Climate that Could Confound
the Teacher Mindset Analysis
Student Perceptions measures were created by taking the average of student responses within
each teacher. We dropped from the analysis any teachers without student survey responses to
these items. On average, about 28 students per teacher were used to create these indicators,
described below.
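The aggregation step can be sketched as a group-wise mean, with teachers who have no answered items dropped. This is a minimal illustration under assumed data structures; the function name, the (teacher_id, rating) representation, and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def teacher_level_means(responses):
    """Average student ratings within each teacher.

    responses: list of (teacher_id, rating) pairs; rating is None when
    the student did not answer the item.
    Returns {teacher_id: mean rating}. Teachers with no answered items
    do not appear in the result, mirroring the exclusion rule above.
    """
    by_teacher = defaultdict(list)
    for teacher_id, rating in responses:
        if rating is not None:
            by_teacher[teacher_id].append(rating)
    return {teacher: mean(ratings) for teacher, ratings in by_teacher.items()}

# Hypothetical responses (not NSLM data): teacher "t2" has no usable ratings.
example = [("t1", 4), ("t1", 5), ("t2", None), ("t3", 2)]
indicators = teacher_level_means(example)
```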
Student Perceptions: Good Teacher
Options
5=Extremely True
1=Not at all True
How true or not true is the statement below?
My math teacher is good at teaching math.
N= 4,128 students
Student Perceptions: Interesting
Combined Questions
5=Extremely True
1=Not at all True
How true or not true is the statement below?
1. My math teacher makes lessons interesting.
2. My math class does not keep my attention - I get bored. (Reverse Coded)
Coding Note: Students were asked either question 1 (N=2,091) or question 2 (N=2,039) above.
We reverse coded question 2 and combined these responses to make one indicator of whether
students think their teacher makes class interesting for them.
N=4,130
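The combination described in the coding note can be sketched as follows, assuming the usual reversal for a 1 to 5 scale (minimum plus maximum minus the response). The function name and argument layout are hypothetical.

```python
def interesting_score(question, response, scale_min=1, scale_max=5):
    """Put both question versions on a common 'class is interesting' scale.

    question: 1 ("makes lessons interesting") or 2 ("I get bored").
    response: rating from scale_min to scale_max.
    Question 2 is reverse-coded so that higher always means more interesting.
    """
    if question == 2:
        return scale_min + scale_max - response
    return response

# A student who says "I get bored" is "Not at all True" (1) is scored 5.
bored_not_true = interesting_score(2, 1)
```

Because each student answered only one of the two versions, the combined indicator has one value per student regardless of which version was shown.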
Student Perceptions: Academic Press
α=0.726
Scale
5=Extremely True
1=Not at all True
How true or not true is the statement below?
1. My math teacher accepts nothing less than our full effort.
2. In my math class, we learn a lot almost every day.
3. My math teacher asks questions to be sure we are following along when s/he is
teaching.
N=8,250
Table S11. Coefficients for student perceptions of teachers as potential confounding
moderators, from separate multilevel regression models predicting student math grades.

| | B [CI] | SE | t | p |
|---|---|---|---|---|
| Treatment x Growth Mindset Beliefs | 0.089** [.027, .151] | 0.032 | 2.815 | 0.005 |
| Treatment x Student Perceptions: Good Teacher | -0.077 [-.209, .055] | 0.067 | -1.138 | 0.255 |
| N=9,151 records, 219 teachers | | | | |
| Treatment x Growth Mindset Beliefs | 0.087** [.025, .150] | 0.032 | 2.759 | 0.006 |
| Treatment x Student Perceptions: Interesting | -0.048 [-.156, .060] | 0.055 | -0.873 | 0.383 |
| N=9,151 records, 219 teachers | | | | |
| Treatment x Growth Mindset Beliefs | 0.088** [.026, .150] | 0.032 | 2.772 | 0.006 |
| Treatment x Student Perceptions: Academic Press | 0.051 [-.129, .230] | 0.092 | 0.551 | 0.581 |
| N=9,160 records, 221 teachers | | | | |
Note: Unstandardized regression coefficients from the teacher-level x treatment interactions,
estimated in a model including all potential confounders (Table 2 in the main text).
References
Chipman, H. A., George, E. I., & McCulloch, R. E. (2010). BART: Bayesian additive regression
trees. The Annals of Applied Statistics, 4(1), 266-298. https://doi.org/10.1214/09-AOAS285
Dorie, V., Hill, J., Shalit, U., Scott, M., & Cervone, D. (2019). Automated versus do-it-yourself
methods for causal inference: Lessons learned from a data analysis competition.
Statistical Science, 34(1), 43-68. https://doi.org/10.1214/18-STS667
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models (comment
on article by Browne and Draper). Bayesian Analysis, 1(3), 515-534.
https://doi.org/10.1214/06-BA117A
Hahn, P. R., Dorie, V., & Murray, J. S. (2019). Atlantic Causal Inference Conference (ACIC)
Data Analysis Challenge 2017. ArXiv Preprint ArXiv:1905.09515.
Hahn, P. R., Murray, J. S., & Carvalho, C. M. (2020). Bayesian regression tree models for causal
inference: Regularization, confounding, and heterogeneous effects. Bayesian Analysis.
https://doi.org/10.1214/19-BA1195
Hastie, T., & Tibshirani, R. (2000). Bayesian backfitting (with comments and a rejoinder by the
authors). Statistical Science, 15(3), 196-223. https://doi.org/10.1214/ss/1009212815
Imbens, G. W., & Rubin, D. B. (2015). Causal inference for statistics, social, and biomedical
sciences: An introduction. https://doi.org/10.1017/CBO9781139025751
Kersting, N. B., Sherin, B. L., & Stigler, J. W. (2014). Automated scoring of teachers’
open-ended responses to video prompts: Bringing the classroom-video-analysis assessment to
scale. Educational and Psychological Measurement, 74(6), 950-974.
https://doi.org/10.1177/0013164414521634
McConnell, K. J., & Lindner, S. (2019). Estimating treatment effects with machine learning.
Health Services Research, 54(6), 1273-1282. https://doi.org/10.1111/1475-6773.13212
Payne, B. K., Krosnick, J. A., Pasek, J., Lelkes, Y., Akhtar, O., & Tompson, T. (2010). Implicit
and explicit prejudice in the 2008 American presidential election. Journal of
Experimental Social Psychology, 46(2), 367-374.
https://doi.org/10.1016/j.jesp.2009.11.001
Payne, B. K., & Lundberg, K. (2014). The affect misattribution procedure: Ten years of evidence
on reliability, validity, and mechanisms: Affect misattribution procedure. Social and
Personality Psychology Compass, 8(12), 672-686. https://doi.org/10.1111/spc3.12148
Tipton, E. (2014). How generalizable is your experiment? An index for comparing experimental
samples and populations. Journal of Educational and Behavioral Statistics, 39(6), 478-501.
https://doi.org/10.3102/1076998614558486
Wendling, T., Jung, K., Callahan, A., Schuler, A., Shah, N., & Gallego, B. (2018). Comparing
methods for estimation of heterogeneous treatment effects using observational data from
health care databases. Statistics in Medicine, 37(23), 3309-3324.
https://doi.org/10.1002/sim.7820
Yeager, D. S., Hanselman, P., Walton, G. M., Murray, J. S., Crosnoe, R., Muller, C., Tipton, E.,
Schneider, B., Hulleman, C. S., Hinojosa, C. P., Paunesku, D., Romero, C., Flint, K.,
Roberts, A., Trott, J., Iachan, R., Buontempo, J., Yang, S. M., Carvalho, C. M., …
Dweck, C. S. (2019). A national experiment reveals where a growth mindset improves
achievement. Nature, 573(7774), 364-369. https://doi.org/10.1038/s41586-019-1466-y
... Moreover, feedback-givers' beliefs, and how they communicate those beliefs implicitly or explicitly, can affect feedback-recipients' views of themselves and their own ability to change. For example, when teachers communicate a belief that students cannot change (i.e., a fixed rather than growth mindset), students become demotivated and report lower expectations for improvement following poor performance [30,31]. Conversely, when teachers provide critical feedback in a way that emphasizes the teacher's high standards and belief that the student is capable of meeting those standards (i.e., "wise feedback"), students are more likely to use the feedback and improve [26]. ...
Article
Feedback is information provided to recipients about their behavior, performance, or understanding, the goal of which is to foster recipients’ self-awareness, and behavioral reinforcement or change. Yet, feedback often fails to achieve this goal. For feedback to be effective, recipients must be receptive and accurately understand the meaning and veracity of the feedback (i.e., discern the truth in feedback). Honesty is critically important for both receptivity and discerning the truth in the feedback. In this article, we identify barriers to receptivity and discerning the truth in feedback and illustrate how these barriers hinder recipients’ learning and improvement. Barriers can arise from the feedback itself, the feedback-giver, and the feedback-recipient, and both parties share responsibility for removing them.
Article
Doubts about belonging in the classroom are often shouldered disproportionately by students from historically marginalized groups, which can lead to underperformance. Ecological-belonging interventions use a classroom-based activity to instill norms that adversity is normal, temporary, and surmountable. Building on prior studies, we sought to identify the conditions under which such interventions are effective. In a chemistry course (study 1), students from underrepresented ethnic backgrounds underperformed relative to their peers in the absence of the intervention. This performance gap was eliminated by the intervention. In an introductory biology course (study 2), there were no large performance gaps in the absence of the intervention, and the intervention had no effect. Study 2 also explored the role of the instructor that delivers the intervention. The intervention boosted scores in the classrooms of instructors with a fixed (versus growth-oriented) intelligence mindset. Our results suggest that ecological-belonging interventions are more effective in more threatening classroom contexts.
Article
Over the past two decades, educational policymakers in many countries have favored evidence-based educational programs and interventions. However, evidence-based education (EBE) has met with growing resistance from educational researchers. This article analyzes the objections against EBE and its preference for randomized controlled trials (RCTs). We conclude that the objections call for adjustments but do not justify abandoning EBE. Three future directions could make education more evidence-based whilst taking the objections against EBE into account: (1) study local factors, mechanisms, and implementation fidelity in RCTs, (2) utilize and improve the available longitudinal performance data, and (3) use integrated interventions and outcome measures.
Chapter
Educator dispositions are a perpetual topic of interest and an ever-evolving construct but can be difficult to define. How a set of desired dispositions manifests within various teaching contexts and learning communities will continuously evolve with changing historical, social, and societal issues. This guiding conceptual framework will help teacher educators engaging in dispositional development and assessment. Drawing on social emotional learning (SEL), this chapter unpacks desired educator dispositions. Three guiding forces underscore the framework: dispositions toward one's inner world, dispositions toward learning, and dispositions toward human differences. It is imperative to address ongoing dispositional development meaningfully and thoroughly to nurture educator dispositions in teacher preparation programs and ongoing professional development. This chapter utilizes the three main guiding forces identified above to conceptualize a framework on the formative development of educator dispositions and to guide future research and practice.
Article
With adequate support for the learner, errors can have high learning potential. This study investigates rather unsuitable action patterns of teachers in dealing with errors. Teachers rarely investigate the causes of individual students’ errors, but instead often change addressees immediately after an error occurs. Such behavior is frequent in the classroom, leaving important potential to learn from errors unexploited. It has remained unexplained why teachers act the way they do in error situations. Using video-stimulated recalls, I investigate the reasons for teachers’ behavior in students’ error situations by confronting them with recorded episodes from their own teaching. Error situations are analyzed (within-case) and teachers’ beliefs are classified in an explanatory model (cross-case) to illustrate patterns across teachers. Results show that teachers refer to an interaction of student attributes, their own attributes, and error attributes when explaining their own behavior. I find that reference to specific attributes varies depending on the situation, and so do the described reasons that led to a particular behavior as a spontaneous or more reflective decision.
Article
This study aims to determine the characteristics of gifted students' perceptions of intelligence and the factors that shape these perceptions. The research is based on the explanatory sequential mixed-methods design. The research group consists of gifted students studying general ability in the fifth and seventh grades at Erzincan Science and Art Center. According to the study's quantitative data, the arithmetic means of the incremental theory of intelligence were higher than those of the fixed theory of intelligence on the basis of gender and grade level variables. However, qualitative data indicated that 9 (75%) of 12 students perceived intelligence as fixed.
Article
In the past decade, behavioural science has gained influence in policymaking but suffered a crisis of confidence in the replicability of its findings. Here, we describe a nascent heterogeneity revolution that we believe these twin historical trends have triggered. This revolution will be defined by the recognition that most treatment effects are heterogeneous, so the variation in effect estimates across studies that defines the replication crisis is to be expected as long as heterogeneous effects are studied without a systematic approach to sampling and moderation. When studied systematically, heterogeneity can be leveraged to build more complete theories of causal mechanism that could inform nuanced and dependable guidance to policymakers. We recommend investment in shared research infrastructure to make it feasible to study behavioural interventions in heterogeneous and generalizable samples, and suggest low-cost steps researchers can take immediately to avoid being misled by heterogeneity and begin to learn from it instead.
Article
The growth mindset is the belief that intellectual ability can be developed. This article seeks to answer recent questions about growth mindset, such as: Does a growth mindset predict student outcomes? Do growth mindset interventions work, and work reliably? Are the effect sizes meaningful enough to merit attention? And can teachers successfully instill a growth mindset in students? After exploring the important lessons learned from these questions, the article concludes that large-scale studies, including preregistered replications and studies conducted by third parties (such as international governmental agencies), justify confidence in growth mindset research. Mindset effects, however, are meaningfully heterogeneous across individuals and contexts. The article describes three recent advances that have helped the field to learn from this heterogeneity: standardized measures and interventions, studies designed specifically to identify where growth mindset interventions do not work (and why), and a conceptual framework for anticipating and interpreting moderation effects. The next generation of mindset research can build on these advances, for example by beginning to understand and perhaps change classroom contexts in ways that can make interventions more effective. Throughout, the authors reflect on lessons that can enrich metascientific perspectives on replication and generalization.
Article
Two experiments and 2 field studies examine how college students' perceptions of their science, technology, engineering, and mathematics (STEM) professors' mindset beliefs about the fixedness or malleability of intelligence predict students' anticipated and actual psychological experiences and performance in their STEM classes, as well as their engagement and interest in STEM more broadly. In Studies 1 (N = 252) and 2 (N = 224), faculty mindset beliefs were experimentally manipulated and students were exposed to STEM professors who endorsed either fixed or growth mindset beliefs. In Studies 3 (N = 291) and 4 (N = 902), we examined students' perceptions of their actual STEM professors' mindset beliefs and used experience sampling methodology (ESM) to capture their in-the-moment psychological experiences in those professors' classes. Across all studies, we find that students who perceive that their professor endorses more fixed mindset beliefs anticipate (Studies 1 and 2) and actually experience (Studies 3 and 4) more psychological vulnerability in those professors' classes: specifically, they report less belonging in class, greater evaluative concerns, greater imposter feelings, and greater negative affect. We also find that in-the-moment experiences of psychological vulnerability have downstream consequences. Students who perceive that their STEM professors endorse more fixed mindset beliefs experience greater psychological vulnerability in those professors' classes, which in turn predicts greater dropout intentions, lower class attendance, less class engagement, less end-of-semester interest in STEM, and lower grades. These findings contribute to our understanding of how students' perceptions of professors' mindsets can serve as a situational cue that affects students' motivation, engagement, and performance in STEM.
Article
Could mitigating persistent worries about belonging in the transition to college improve adult life for black Americans? To examine this question, we conducted a long-term follow-up of a randomized social-belonging intervention delivered in the first year of college. This 1-hour exercise represented social and academic adversity early in college as common and temporary. As previously reported in Science, the exercise improved black students’ grades and well-being in college. The present study assessed the adult outcomes of these same participants. Examining adult life at an average age of 27, black adults who had received the treatment (versus control) exercise 7 to 11 years earlier reported significantly greater career satisfaction and success, psychological well-being, and community involvement and leadership. Gains were statistically mediated by greater college mentorship. The results suggest that addressing persistent social-psychological concerns via psychological intervention can shape the life course, partly by changing people’s social realities.
Article
Some environmental influences, including intentional interventions, have shown persistent effects on psychological characteristics and other socially important outcomes years and even decades later. At the same time, it is common to find that the effects of life events or interventions diminish and even disappear completely, a phenomenon known as fadeout. We review the evidence for persistence and fadeout, drawing primarily on evidence from educational interventions. We conclude that 1) fadeout is widespread, and often co-exists with persistence; 2) fadeout is a substantive phenomenon, not merely a measurement artefact; and 3) persistence depends on the types of skills targeted, the institutional constraints and opportunities within the social context, and complementarities between interventions and subsequent environmental affordances. We discuss the implications of these conclusions for research and policy.
Article
Researchers commonly interpret effect sizes by applying benchmarks proposed by Cohen over a half century ago. However, effects that are small by Cohen's standards are large relative to the impacts of most field-based interventions. These benchmarks also fail to consider important differences in study features, program costs, and scalability. In this paper, I present five broad guidelines for interpreting effect sizes that are applicable across the social sciences. I then propose a more structured schema with new empirical benchmarks for interpreting a specific class of studies: causal research on education interventions with standardized achievement outcomes. Together, these tools provide a practical approach for incorporating study features, cost, and scalability into the process of interpreting the policy importance of effect sizes.
Article
Psychologically “wise” interventions can cause lasting improvement in key aspects of people’s lives, but where will they work and where will they not? We consider the psychological affordance of the social context: Does the context in which the intervention is delivered afford the way of thinking offered by the intervention? If not, treatment effects are unlikely to persist. Change requires planting good seeds (a more adaptive perspective) in fertile soil in which that seed can grow (a context with appropriate affordances). We illustrate the role of psychological affordances in diverse problem spaces, including recent large-scale trials of growth-mindset and social-belonging interventions designed specifically to understand heterogeneity across contexts. We highlight how the study of psychological affordances can advance theory about social contexts and inform debates about replicability.
Article
Results of 151 studies were integrated by meta-analysis to scrutinize the construct mathematics anxiety. Mathematics anxiety is related to poor performance on mathematics achievement tests. It relates inversely to positive attitudes toward mathematics and is bound directly to avoidance of the subject. Variables that exhibit differential mathematics anxiety levels include ability, school grade level, and undergraduate fields of study, with preservice arithmetic teachers especially prone to mathematics anxiety. Females display higher levels than males. However, mathematics anxiety appears more strongly linked with poor performance and avoidance of mathematics in precollege males than females. A variety of treatments are effective in reducing mathematics anxiety. Improved mathematics performance consistently accompanies valid treatment.
Article
Nonparametric regression models have recently surged in their power and popularity, accompanying the trend of increasing dataset size and complexity. While these models have proven their predictive ability in empirical settings, they are often difficult to interpret and do not address the underlying inferential goals of the analyst or decision maker. In this paper, we propose a modular two-stage approach for creating parsimonious, interpretable summaries of complex models which allow freedom in the choice of modeling technique and the inferential target. In the first stage a flexible model is fit which is believed to be as accurate as possible. In the second stage, lower-dimensional summaries are constructed by projecting draws from the distribution onto simpler structures. These summaries naturally come with valid Bayesian uncertainty estimates. Further, since we use the data only once to move from prior to posterior, these uncertainty estimates remain valid across multiple summaries and after iteratively refining a summary. We apply our method and demonstrate its strengths across a range of simulated and real datasets. Code to reproduce the examples shown is available at github.com/spencerwoody/ghost.